On the Wondrous Behaviour of Rats and Researchers

This paper deals with the connection between effort and rewards and with the optimal behaviour of the agents involved. The context is one in which a subject is required to perform a costly task and is rewarded with a prize when the task is performed properly. We provide a simple model that explains why reducing the reward associated with a costly task may induce a higher effort. This behaviour has been observed experimentally in lab rats, but it also appears in other mammals (e.g. researchers).


The Wonder
After a stormy meeting with the politicians who decide on research funding in my Region, I was complaining to some colleagues about the erratic nature of the research policy. I mentioned that it was really surprising to observe that researchers kept complying with the rules and making substantive efforts in spite of the random nature of some funding policies, even when research funds were curtailed. An experimental psychologist told me that the same wondrous behaviour could be observed in the lab when rats were subjected to the manipulation of their rewards. Reducing the frequency of the rewards associated with the performance of some task yielded the unexpected result that some experimental subjects worked harder, exhibiting compulsive and aggressive behaviour coupled with unhappiness. "Just like us", he concluded. So there seemed to be some common driving force between rats and researchers in the presence of random rewards.
Back home I kept thinking about this comparison, trying to figure out how to uncover that common behavioural pattern. The model below is the answer I propose: a simple analytic framework that is able to reproduce the bizarre pattern of behaviour of rats and researchers. The bottom line is actually one of the most elementary psychological principles: the repetition of the stimulus reduces the intensity of the response. Or, put more formally, the concavity of the objective function (actually the degree of concavity) is what drives this behaviour of the experimental subject. Let us see how.

The Explanation
Consider a situation in which an experimental subject (a rat, say) is required to exert an effort or perform a given task that is costly. He is granted a reward proportional to the number of times he does it properly. (We do not consider here any punishment, electroshocks say, even though they can be easily accommodated within the model. One can interpret that the subject has been deprived of food for a relevant while, so that no food is already a punishment.) The experiment is repeated continuously for a given time span, which we normalize to one for the sake of simplicity. The controller decides on the frequency $\pi$ of correct actions that will be rewarded with a fixed prize of size 1 (which can be interpreted as one unit of food). The behaviour of the subject responds to an objective function that involves the effort, $e$, and the expected reward, $e\pi$. In its simplest form,

$$U(e, \pi) = f(e\pi) - e.$$

We assume that function $f$ is increasing in the expected reward, $e\pi$, but that satisfaction grows at a decreasing rate. That is, $f'(e\pi) > 0$ and $f''(e\pi) < 0$ (in other words, $f$ is increasing and concave in the expected reward).
Note that the effort variable has two different effects. On the one hand, it is a source of dissatisfaction. On the other hand, it is positively related to the expected reward. The relationship between effort and satisfaction is described by the derivative of $U(\cdot)$ with respect to $e$. That is,

$$\frac{\partial U}{\partial e} = \pi f'(e\pi) - 1.$$

Consequently, satisfaction is positively related to effort whenever $\pi f'(e\pi) - 1 > 0$, and negatively related otherwise. So the sign depends on the specific shape of function $f$ and on the particular point we consider. Note that the concavity of $f$ in the expected reward suggests that this relationship will be positive for low values of the effort and negative for high ones. Be that as it may, the agent will choose the optimal effort, $e^*$, which is the value that satisfies the following equation:

$$\pi f'(e^*\pi) = 1,$$

that is, the value of the effort such that the subject's incremental satisfaction due to the prize obtained equals the dissatisfaction derived from the effort. This equalization of marginal effects is the standard requirement for optimal actions.
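The optimality condition can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes the objective to be satisfaction from the expected reward minus effort, and it picks a hypothetical satisfaction function f(x) = 2·sqrt(x), which is increasing and concave. The names `utility`, `optimal_effort`, `f` and `f_prime`, and the search bounds, are ours.

```python
import math


def utility(e, pi, f):
    """Assumed objective: satisfaction from the expected reward e*pi, minus effort e."""
    return f(e * pi) - e


def optimal_effort(pi, f, lo=1e-9, hi=10.0, iters=200):
    """Ternary search for the maximiser of a function that is concave in e on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if utility(m1, pi, f) < utility(m2, pi, f):
            lo = m1  # the maximum lies to the right of m1
        else:
            hi = m2  # the maximum lies to the left of m2
    return (lo + hi) / 2


def f(x):
    # Hypothetical satisfaction function: increasing and concave.
    return 2.0 * math.sqrt(x)


def f_prime(x):
    # Derivative of the hypothetical f above.
    return 1.0 / math.sqrt(x)


pi = 0.5
e_star = optimal_effort(pi, f)
# Marginal satisfaction from one more unit of effort, evaluated at the optimum.
marginal_gain = pi * f_prime(e_star * pi)
print(f"optimal effort: {e_star:.4f}, marginal gain at optimum: {marginal_gain:.4f}")
```

For this particular f the condition can also be solved by hand: sqrt(pi / e) = 1, so the optimal effort equals pi, and the marginal gain at the optimum equals the unit marginal cost of effort.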
The relationship between optimal effort and the frequency of the prizes is a subtle one, because more frequent rewards induce two opposite effects. On the one hand, there is a tendency to increase effort because each unit of effort becomes more rewarding. On the other hand, there is a tendency to reduce effort because the subject can now achieve the same prize with less effort.
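The two effects can be separated in one line. The expression below is a sketch that assumes the marginal return to effort takes the form $\pi f'(e\pi)$, as it does when satisfaction is $f(e\pi)$ net of effort, with $f' > 0$ and $f'' < 0$. It shows how that marginal return responds to the frequency at a given effort level.

```latex
% Assumption (illustrative): the marginal return to effort is \pi f'(e\pi).
\frac{\partial}{\partial \pi}\bigl[\pi f'(e\pi)\bigr]
  = \underbrace{f'(e\pi)}_{\text{each unit of effort pays more } (>0)}
  + \underbrace{e\pi\, f''(e\pi)}_{\text{same prize with less effort } (<0)}
```

The first term pushes effort up and the second pushes it down. Which one is larger in absolute value depends on how fast $f'$ falls, that is, on the curvature of $f$.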
Which effect eventually dominates depends on the shape of $f$, in particular on its curvature (the degree of concavity of the function). Recall on this point that the curvature of a function is controlled by its second derivative and that it can be expressed in terms of the elasticity of its first derivative. In our case that elasticity measures the relative change in the marginal satisfaction of the subject due to a change in the expected reward.
Letting $\rho$ denote that elasticity (in absolute value), we have:

$$\rho = -\frac{e\pi\, f''(e\pi)}{f'(e\pi)} > 0, \qquad \frac{d e^*}{d\pi} < 0 \iff \rho > 1.$$

This result explains why we may observe that experimental subjects work harder when the prize is given with a smaller frequency. The more responsive the subject (the larger $\rho$), the larger the effort increase associated with a reduction in the frequency of the prize. That may also explain compulsive behaviour in some subjects.
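For readers who want the intermediate steps, the following is a sketch of the comparative statics. It assumes the objective $U(e,\pi) = f(e\pi) - e$, so that the marginal cost of effort is one, and it treats $\rho$ as evaluated at the optimum. The shorthand $x = e^*\pi$ is ours.

```latex
% Assumed objective (illustrative): U(e,\pi) = f(e\pi) - e, with f' > 0 > f''.
% First-order condition at the optimum, writing x = e^*\pi:
\pi f'(x) = 1
% Differentiate both sides with respect to \pi, with e^* = e^*(\pi):
f'(x) + \pi f''(x)\left(e^* + \pi\,\frac{de^*}{d\pi}\right) = 0
% Substitute f''(x) = -\rho\, f'(x)/x and solve for the derivative:
\frac{de^*}{d\pi} = \frac{e^*}{\pi}\cdot\frac{1-\rho}{\rho}
```

Under these assumptions the elasticity of optimal effort with respect to the frequency is $(1-\rho)/\rho$: negative when $\rho > 1$, zero when $\rho = 1$, and positive when $\rho < 1$. Its absolute value grows with $\rho$ in the region $\rho > 1$, which is the sense in which more responsive subjects react more strongly.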
Note that the reduction of the frequency of the prize makes those subjects with $\rho > 1$ unhappy in a twofold way. On the one hand, they get on average smaller rewards. On the other hand, they exert a higher effort. Yet working harder is their best response!
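The simplest case of a utility function that permits discussing the role of this coefficient $\rho$ is that in which it is constant. The family of functions with Constant Elasticity of Substitution, CES, yields the following formula in our case:

$$U(e, \pi) = \frac{(e\pi)^{1-\rho}}{1-\rho} - e, \qquad \rho > 0,\ \rho \neq 1,$$

with $U(e, \pi) = \ln(e\pi) - e$ in the limit case $\rho = 1$. The derivative of $U$ with respect to $e$ in this case is given by:

$$\frac{\partial U}{\partial e} = e^{-\rho}\left(\pi^{1-\rho} - e^{\rho}\right).$$

The sign of this expression depends on the sign of the second factor on the right-hand side. The optimal level of effort is obtained when that factor is zero, that is, $e^* = \pi^{(1-\rho)/\rho}$. Optimal effort therefore decreases with the frequency of the rewards when $\rho > 1$ and increases with it when $\rho < 1$.
When $\rho = 1$ we get $e^* = 1$. That is, a value that is independent of the frequency of the rewards. In this case those two opposite effects derived from a change in the frequency are exactly of the same magnitude, so that one cancels the other.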
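A short numerical illustration of the constant-elasticity case may help. It is a sketch under assumptions: satisfaction is taken to be (eπ)^(1−ρ)/(1−ρ) − e (logarithmic when ρ = 1), for which the optimal effort has the closed form π^((1−ρ)/ρ). The function names and the particular values of ρ and π are ours.

```python
import math


def optimal_effort(pi, rho):
    """Closed-form optimum e* = pi**((1 - rho) / rho) under the assumed CES form."""
    return pi ** ((1.0 - rho) / rho)


def satisfaction(e, pi, rho):
    """Assumed objective: CES satisfaction from the expected reward, minus effort."""
    x = e * pi  # expected reward
    if rho == 1.0:
        return math.log(x) - e  # limit case of the CES family
    return x ** (1.0 - rho) / (1.0 - rho) - e


# Compare a high and a low reward frequency for three degrees of concavity.
for rho in (0.5, 1.0, 2.0):
    for pi in (0.8, 0.4):
        e = optimal_effort(pi, rho)
        print(
            f"rho={rho:.1f} pi={pi:.1f} effort={e:.3f} "
            f"expected_reward={e * pi:.3f} U={satisfaction(e, pi, rho):.3f}"
        )
```

When the frequency is halved, effort rises for ρ = 2, stays at one for ρ = 1, and falls for ρ = 0.5. In all three cases the expected reward and the attained satisfaction fall, which for ρ > 1 is the twofold unhappiness described above: smaller rewards and harder work.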

Discussion: What to Expect?
It seems to follow logically from the result above that starvation maximizes the subject's willingness to cooperate when $\rho > 1$. Therefore a path of reductions in the research funds would induce researchers to achieve the highest possible production levels. That conclusion, however, presumes that the subject still perceives the reward as linked to his effort. If rewards become so erratic that the subject no longer perceives that link, the expected reward enters the objective function as given, so that $U = f(K) - e$ for some constant $K$. In this case the optimal decision is clearly $e^* = 0$. This outcome is reminiscent of Seligman's [2] theory of learned helplessness. On the other hand, the behaviour will be different when there is recall, as shown in the Morris [3] water maze experiment. Putting past rewards in the objective function opens a new set of possibilities, and the degree of convexity of the function relative to that variable will again play a role in the determination of the behaviour (e.g. past rewards taken as a benchmark may induce frustration when the frequency is reduced and hence reduce effort or, alternatively, still more effort may be exerted in order to try to achieve the previous outcome).

Final Remarks
We have analysed here a behavioural pattern observed both in the lab and in some humans (rats and researchers, in our reference model) that seems rather counter-intuitive. It refers to the response derived from reducing the expected reward associated with performing a costly task. In some cases, reducing the rewards results in a higher effort exerted by the subjects. We have shown that there are particular circumstances in which the bizarre experimental behaviour of rats and researchers can be rationally explained in terms of agents that try to maximise their achievements (nourishment, satisfaction, welfare …). The key element for that behaviour is the sensitivity of the marginal response to changes in the reward, which is reflected in the degree of concavity of the objective function. Yet increasing effort when the frequency of prizes is reduced is not a universal law for those sensitive subjects, as their behaviour may also be affected by other aspects, such as pattern recognition and recall.

Conflicts of Interest
The author declares no conflicts of interest regarding the publication of this paper.