Volume 8: pp. 60-77

Animals Prefer Reinforcement that Follows Greater Effort: Justification of Effort or Within-Trial Contrast?

Thomas R. Zentall
University of Kentucky


Abstract:

Justification of effort by humans is a form of reducing cognitive dissonance by enhancing the value of rewards when they are more difficult to obtain. Presumably, assigning greater value to rewards provides justification for the greater effort needed to obtain them. We have found such effects in adult humans and children with a highly controlled laboratory task. More importantly, under various conditions we have found similar effects in pigeons, animals not typically thought to need to justify their behavior to themselves or others. To account for these results, we have proposed a mechanism based on within-trial contrast between the end of the effort and the reinforcement (or the signal for reinforcement) that follows. This model predicts that any relatively aversive event can serve to enhance the value of the reward that follows it, simply through the contrast between those two events. In support of this general model, we have found this effect in pigeons when the prior event consists of (a) more rather than less effort (pecking), (b) a long rather than a short delay, and (c) the absence of food rather than food. We also show that within-trial contrast can occur in the absence of differential delay reduction. Contrast of this kind may also play a role in other social psychological phenomena that have been interpreted in terms of cognitive dissonance.

Keywords: cognitive dissonance, justification of effort, contrast, delay reduction


When humans behave in a way that is inconsistent with the way they think they should behave, they will often try to justify their behavior by altering their beliefs. The theory on which this behavior is based is known as cognitive dissonance theory (Festinger, 1957). Evidence for the attempt to reduce cognitive dissonance comes from the classic study by Festinger and Carlsmith (1959), who found that subjects who were given a small reward for agreeing to tell a prospective subject that a boring task was interesting later rated the task as more interesting than subjects who were given a large reward. Presumably, the small reward was insufficient to justify their behavior, so those subjects came to remember the task as more interesting. On the other hand, those given the large reward did not have to justify their behavior because the large reward was sufficient.

But the theory that such decisions are cognitively influenced has been challenged by evidence that humans with anterograde amnesia show cognitive-dissonance-like effects without having any memory for the presumed dissonant event (Lieberman, Ochsner, Gilbert, & Schacter, 2001). Lieberman et al. asked amnesics to choose between pictures that they had originally judged to be similarly preferred. When the subjects were then asked to rate the pictures again, the amnesics, much like control subjects, rated the chosen pictures higher than the unchosen pictures. What is surprising is that the amnesics had no memory of ever having seen the chosen pictures before. This result implies that cognitive dissonance is an implicit, automatic process that requires little cognitive processing.

The same conclusion was reached by Egan, Santos, and Bloom (2007), who examined a similar effect in 4-year-old children and monkeys. When subjects were required to choose between two equally preferred alternatives, they later avoided the unchosen alternative in favor of a novel alternative, but they did so only if they had been required to make the original choice.

Festinger himself believed that his theory also applied to the behavior of nonhuman animals (Lawrence & Festinger, 1962), but the examples that Lawrence and Festinger provided were only remotely related to the cognitive dissonance research that had been conducted with humans, and the results they obtained were easily accounted for by simpler behavioral mechanisms (e.g., the partial reinforcement extinction effect, which was attributed by others to a generalization decrement [Capaldi, 1967] or to an acquired response in the presence of frustration [Amsel, 1958]). Thus, the purpose of the research described in this article is to examine an analog design that could be used with nonhuman animals to determine whether they, too, would show a similar cognitive dissonance effect.

One form of cognitive dissonance reduction is the justification of effort effect (Aronson & Mills, 1959). When a goal is difficult to obtain, Aronson and Mills found, it is often judged to be of more value than the same goal when it is easy to obtain. Specifically, Aronson and Mills reported that a group that required a difficult initiation to join was perceived as more attractive than a group that was easy to join. This effect appears to be inconsistent with the Law of Effect or the Law of Least Effort (Thorndike, 1932), because goals that require less effort to obtain should have more value than goals that require more effort. To account for these results, Aronson and Mills proposed that the difficulty of the initiation could only be justified by increasing the perceived value of joining the group.

Alternatively, it could be argued that there may be a correlation between the difficulty in joining a group and the value of group membership. That is, although there is not always sufficient information on which to determine the value of a group, a reasonable heuristic may be that the difficulty of being admitted to the group is a functional (but perhaps imperfect) source of information about the value of group membership. Put more simply, more valuable groups are often harder to join.

The problem with studying justification of effort in humans is that humans often have had experience with functional heuristics or rules of thumb, and what may appear to be justification of effort may actually reflect no more than the generalized use of such a heuristic. On the other hand, if cognitive dissonance actually involves implicit, automatic processes, cognitive processes may not be involved, and one should be able to demonstrate justification of effort effects in nonhuman animals under conditions that control for prior experience with the ability of effort to predict reward value.

The beauty of the Aronson and Mills (1959) design is that it easily can be adapted for use with animals because one can train an animal that a large effort is required to obtain one reinforcer whereas a small effort is required to obtain a different reinforcer. If the two reinforcers are objectively of equal value, one can then ask whether the reinforcer that requires greater effort is preferred over the reinforcer that requires less effort. Finding two reinforcers that have the same initial value and, more important, reinforcers that will not change in value with experience (unrelated to the effort involved in obtaining them) is quite a challenge (but see Johnson & Gallagher, 2011). Alternatively, one could use a salient discriminative stimulus that signals the presentation of the reinforcer following effort of one magnitude and a different discriminative stimulus that signals the same reinforcer following effort of another magnitude. One can then ask if the animal has a preference for either conditioned reinforcer, each having served equally often as a signal for the common reinforcer.

In this review, I will first present the results of an experiment in which we have found evidence for justification of effort in pigeons and then will describe a noncognitive model based on contrast to account for this effect. I will then demonstrate the generality of the effect by showing that a variety of relatively aversive events can be used to produce a preference for the outcome that follows. We have interpreted the results of these experiments in terms of within-trial contrast and have proposed that it is unlike the various contrast effects that have been described in the literature (incentive contrast, anticipatory contrast, and behavioral contrast). Although an alternative theory, delay reduction, can make predictions similar to within-trial contrast, in several experiments we have found that within-trial contrast can be found in the absence of differential delay reduction. Although several studies have reported a failure to find evidence for within-trial contrast, the procedures and results of these studies have proven useful in identifying some of the boundary conditions that appear to constrain the appearance of this effect. Finally, I will suggest that contrast effects of this kind may be involved in several psychological phenomena that have been studied in humans (e.g., general cognitive dissonance effects, the distinction between intrinsic and extrinsic reinforcement, and learned industriousness).

Justification of Effort in Animals

To determine the effect of prior effort on the preference for the conditioned reinforcer that followed, Clement, Feltus, Kaiser, and Zentall (2000) trained pigeons with a procedure analogous to that used by Aronson and Mills (1959). All training trials began with the presentation of a white stimulus on the center response key. On half of the training trials, a single peck to the white key turned it off and turned on two different colored side keys, for example red and yellow, and choice of the red stimulus (S+) was reinforced but not the yellow stimulus (S-) (sides were counterbalanced over trials and colors were counterbalanced over subjects). On the remaining training trials, 20 pecks to the white key turned it off and turned on two different colored side keys, for example green and blue, and choice of the green stimulus was reinforced (see design of this experiment in Figure 1). Following extensive training, a small number of probe trials was introduced (among the training trials) involving the two conditioned reinforcers (i.e., red and green) as well as the two conditioned inhibitors (i.e., yellow and blue) to determine if the training had resulted in a preference for one over the other.

Figure 1. Design of the experiment by Clement et al. (2000), in which one pair of discriminative stimuli followed 20 pecks and the other pair of discriminative stimuli followed a single peck. Following extensive training, when pigeons were given a choice between the two positive stimuli, they preferred the one that followed the greater number of pecks.

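To make the contingencies concrete, the design can be summarized as data. The following Python sketch is my own schematic rendering: the colors, peck counts, and probe types come from the text, but the layout and names are not from the original article.

```python
# Schematic of the Clement et al. (2000) design (colors, peck counts,
# and probe types from the text; the data layout itself is my own).

TRAINING_TRIALS = [
    # One peck to the white key produces the red/yellow discrimination.
    {"pecks_to_white": 1,  "S+": "red",   "S-": "yellow"},   # low effort
    # Twenty pecks to the white key produce the green/blue discrimination.
    {"pecks_to_white": 20, "S+": "green", "S-": "blue"},     # high effort
]

# Probe trials pit the two S+ colors (or, separately, the two S- colors)
# against each other, initiated by 1 peck, 20 pecks, or no peck at all.
PROBE_TRIALS = [{"initial_pecks": n, "choice": ("red", "green")}
                for n in (1, 20, 0)]
```
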
Interestingly, traditional learning theory (Hull, 1943; Thorndike, 1932) would predict that this sort of training should not result in a differential preference because each of the conditioned stimuli would have been associated with the same reinforcer, obtained following the same delay from the onset of the conditioned reinforcer, and following the same effort in the presence of the conditioned reinforcer. That is, the antecedent events on training trials (the number of previous pecks experienced prior to the conditioned reinforcer during training) should not affect stimulus preference on probe trials.

Alternatively, one could imagine that stimuli that had been presented in the context of the single peck requirement would be associated with the easier trials and stimuli that had been presented in the context of the 20-peck requirement would be associated with the harder trials. If that was the case, it might be that the conditioned stimulus that was presented on the single-peck trials would be preferred over the conditioned stimulus that was presented on the 20-peck trials.

If, however, cognitive dissonance theory is correct, it could be that in order to “justify” the 20-peck requirement (because on other trials only a single peck was required) the pigeons would give added value to the reinforcer that followed the 20-peck requirement. If this was the case, the added value might transfer to the conditioned reinforcer that signaled its occurrence and one might find a preference for the stimulus that followed the greater effort.

Finally, it is possible that the peck requirement could serve as an occasion setter (or conditional stimulus) that the pigeons could use to anticipate which color would be presented. For example, if a pigeon was in the process of pecking 20 times it might anticipate the appearance of the green conditioned stimulus that it would choose. That is, the pecking requirement could bias the pigeon to choose the color that was most associated with that requirement. Clement et al. (2000) reasoned that on probe trials, which had no initial peck requirement, the pigeons might be biased to choose the conditioned reinforcer that in training had required a single peck to produce, because no required pecking would be more similar to a single peck than to 20 pecks. To allow for this possibility, Clement et al. presented three kinds of conditioned reinforcer probe trials: trials initiated by a single peck to a white key, trials initiated by 20 pecks to a white key, and trials that started with a choice between the two conditioned reinforcers, with no white key.

The results of this experiment were clear. Regardless of the pecking requirement on test trials (20, 1, or no pecks), the pigeons showed a significant preference (69.3%) for the conditioned stimulus that in training had required 20 pecks to produce. Thus, they showed a justification of effort effect. Furthermore, the two simultaneous discriminations were not acquired at different rates. That is, neither the number of trials required to acquire the two simultaneous discriminations nor the number of reinforcements associated with the two S+ stimuli differed significantly.

A similar result was obtained by Kacelnik and Marsh (2002) with starlings. With their procedure, on some trials the starlings were required to fly back and forth four times from one end of their cage to the other in order to light a colored key and peck the key to obtain a reinforcer. On other trials, the starlings had to fly back and forth 16 times to light a different colored key and peck the key to obtain the same reinforcer. On test trials, when the starlings were given a choice between the two colored lights without a flight requirement, 83% of them preferred the color that had required the greater number of flights to produce.

Clement et al. (2000) and Kacelnik and Marsh (2002) used colors as the conditioned reinforcers to be able to use a common reinforcer as the outcome for both the easy and hard training trials. But in the natural ecology of animals, it is more likely that less arbitrary cues would be associated with the different alternatives. For example, one could ask if an animal might value reinforcement more from a particular location if it had to work harder to get the reinforcer from that location. In nature, one could require that the animal travel farther to obtain food from one location than from another, but it would be difficult to allow the animal to choose between the two locations without incurring the added cost of the additional travel time. However, such an experiment could be conducted in an operant chamber by manipulating the response requirement during training. Thus, we conducted an experiment in which we used two feeders, one that provided food on trials in which 30 pecks were required to the center response key, the other that provided the same food, but at a different location, on trials in which a single peck was required to the center response key (Friedrich & Zentall, 2004). Prior to the start of training, we obtained a baseline feeder preference score for each pigeon. On each forced trial, the left or right key was illuminated (white) and pecks to the left key raised the left feeder, whereas pecks to the right key raised the right feeder. On interspersed choice trials, both the right and left keys were lit and the pigeons had a choice of which feeder would be raised (see Figure 2 top).

Figure 2. Design of the experiment by Friedrich & Zentall (2004), in which pigeons had to make 30 pecks to receive reinforcement from their less preferred feeder and only one peck to receive reinforcement from their more preferred feeder.

On training trials, the center key was illuminated (yellow) and either 1 peck or 30 pecks were required to turn off the center key and raise one of the two feeders. For each pigeon, the high-effort response raised the less preferred feeder and the low-effort response raised the more preferred feeder. Forced and free choice feeder trials continued through training to monitor changes in feeder preference (see Figure 2 bottom). Over the course of training, we found a significant (20.5%) increase in preference for the originally non-preferred feeder (the feeder associated with the high-effort response; see Figure 3). To ensure that the increased preference for the originally non-preferred feeder was not due to the extended period of training, a control group was included. For the control group, over trials, each of the two response requirements was followed by each feeder equally often. Relative to the initial baseline preference, this group showed only a 0.5% increase in preference for their non-preferred feeder as a function of training. Thus, it appears that the value of the location of food can be enhanced by being preceded by a high-effort response, as compared with a low-effort response.

The ecological validity of the effect of prior effort on preference for the outcome that follows was further tested in a recent experiment by Johnson and Gallagher (2011) in which mice were trained to press one lever for glucose and a different lever for polycose. When initially tested, the mice showed a preference for the glucose; however, when the response requirement for the polycose was increased from one to 15 lever presses and the mice were offered both reinforcers, they showed a preference for the polycose over the glucose. Thus, increasing the effort required to obtain the less preferred food resulted in a reversal in preference. The once less-preferred food was now more preferred. Furthermore, neutral cues that had been paired with the reinforcers during training (a tone for one, white noise for the other) then became conditioned reinforcers that the mice worked to obtain in extinction, and they responded preferentially to produce the high-effort cue.

Figure 3. When pigeons were trained to make 30 pecks to receive reinforcement from their less preferred feeder and only one peck to receive reinforcement from their more preferred feeder and were then given a choice of feeders, they showed a shift in preference to the one they had had to work harder for in training (green circles, after Friedrich & Zentall, 2004). For the control group (red circles), both feeders were equally often associated with the 30-peck response. The dotted line represents the baseline preference for the originally non-preferred feeder.

Had the experiments described above been conducted with human subjects, the results likely would have been attributed to cognitive dissonance. It is unlikely, however, that cognitive dissonance is responsible for the added value given to outcomes that follow greater effort in pigeons and mice. Instead, this phenomenon can be described more parsimoniously as a form of positive contrast.

A Model of Justification of Effort for Animals

To model the contrast account, one first sets the value of the trial at its start to zero. Next, it is assumed that key pecking (or the time needed to make those pecks) is a relatively aversive event and results in a negative change in the value of the trial. It is also assumed that obtaining the reinforcer causes a shift to a more positive value (relative to the value at the start of the trial). The final assumption is that the value of the reinforcer depends on the relative change in value; that is, the change in value from the end of the response requirement to the appearance of the reinforcer (or the appearance of the conditioned reinforcer that signals reinforcement; see Figure 4). In the case of the second experiment (Friedrich & Zentall, 2004), it would be the change in value from the end of the response requirement to the location of the raised feeder. Thus, because the positive change in value following the high-effort response would be larger than the change in value following the low-effort response, the relative value of the reinforcer following a high-effort response should be greater than that of the reinforcer following a low-effort response.
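
The arithmetic of the model can be illustrated with a short sketch. The numerical values below are arbitrary placeholders chosen for illustration; the model itself specifies only the ordinal relations.

```python
# A minimal sketch of the within-trial contrast model in Figure 4; the
# numbers are arbitrary assumptions, not parameters from the article.
# Trial value starts at 0, the initial event drives it negative in
# proportion to its aversiveness, and the reinforcer (or its signal)
# has a fixed positive value. What matters is the size of the upward shift.

def contrast_value(aversiveness, reinforcer_value=1.0):
    """Change in value from the end of the initial event to the onset
    of the reinforcer (or conditioned reinforcer)."""
    value_at_end_of_event = 0.0 - aversiveness
    return reinforcer_value - value_at_end_of_event

print(contrast_value(0.2))  # after 1 peck:   shift of 1.2
print(contrast_value(0.8))  # after 20 pecks: shift of 2.0 -> larger value
```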

Figure 4. A model of the justification of effort effect based on contrast (i.e., the change in relative value following the less aversive initial event and following the more aversive initial event).

A similar model of suboptimal choice has been proposed by Aw, Vasconcelos, and Kacelnik (2011). They indicate “that animals may attribute value to their options as a function of the experienced fitness or hedonic change at the time of acting” (p. 1118). That is, the value of a reinforcer may depend on the state of the animal at the time of reinforcement. The poorer the state of the animal, the more valued the reinforcer will be. They have referred to this implied contrast as state-dependent valuation learning.

Relative Aversiveness of the Prior Event

Delay to Reinforcement as an Aversive Event

If the interpretation of these experiments presented in Figure 4 is correct, then other relatively aversive prior events (as compared with the comparable event on alternative trials) should result in a similar enhanced preference for the stimuli that follow. For example, given that pigeons should prefer a shorter delay to reinforcement over a longer delay to reinforcement, they should also prefer discriminative stimuli that follow a delay over those that follow no delay.

To test this hypothesis, we trained pigeons to peck the center response key (20 times on all trials) to produce a pair of discriminative stimuli (as in Clement et al., 2000). On some trials, pecking the response key was followed immediately by one pair of discriminative stimuli (no delay), whereas on the remaining trials, pecking the response key was followed by a different pair of discriminative stimuli but only after a 6-s delay. On test trials, the pigeons were given a choice between the two conditioned reinforcers, but in this experiment they showed no preference (DiGian, Friedrich, & Zentall, 2004, Group Unsignaled Delay).

One difference between the effort manipulation used in the first two experiments and the delay manipulation used here concerns what the pigeon could anticipate. With the effort manipulation, once the pigeon had pecked once and the discriminative stimuli failed to appear, it could anticipate that 19 additional pecks would be required. Thus, the additional effort could be anticipated following the first response, and the pigeon had to make 19 more responses in the presence of that anticipation. With the delay manipulation, however, the pigeon could not anticipate whether a delay would occur, and at the time the delay occurred, no further responding was required. Thus, with the delay manipulation, the pigeon did not have to peck in the context of an anticipated delay. Would the results be different if the pigeon could anticipate the delay at a time when responding was required? To test this hypothesis, the delay-to-reinforcement manipulation was repeated, but this time the initial stimulus was predictive of the delay (DiGian et al., 2004, Group Signaled Delay). On half of the trials, a vertical line appeared on the response key and 20 pecks resulted in the immediate appearance of a pair of discriminative stimuli (e.g., red and yellow). On the remaining trials, a horizontal line appeared on the response key and 20 pecks resulted in the appearance of the other pair of discriminative stimuli (e.g., green and blue) but only after a 6-s delay (see Figure 5). For this group, the pigeons could anticipate whether 20 pecks would result in a delay or not, so they had to peck in the context of the anticipated delay. When pigeons in this group were tested, as in the effort-manipulation experiments, they showed a significant (65.4%) preference for the conditioned reinforcer that in training had followed the delay. Once again, the experience of a relatively aversive event produced an increase in the value of the conditioned reinforcer that followed. Furthermore, the results of this experiment suggested that it may be necessary for the subject to anticipate the aversive event for positive contrast to be found.

The Absence of Reinforcement as an Aversive Event

A related form of relatively aversive event is the absence of reinforcement in the context of reinforcement on other trials. Could the absence of reinforcement result in a preference for the conditioned reinforcer that follows it? To test this hypothesis, pigeons were trained to peck a response key five times on all trials to produce a pair of discriminative stimuli. On some trials, pecking the response key was followed immediately by 2-s access to food from the central feeder and then immediately by the presentation of one pair of discriminative stimuli, whereas on the remaining trials, pecking the response key was followed by the absence of food (for 2 s) and then by the presentation of a different pair of discriminative stimuli. On test trials, the pigeons were given a choice between the two S+ stimuli, but once again they showed no preference (Friedrich, Clement, & Zentall, 2005, Group Unsignaled Reinforcement).

Figure 5. Design of experiment by DiGian et al. (2004, Group Signaled Delay) in which one stimulus signaled the appearance of discriminative stimuli without a delay and the other stimulus signaled the appearance of a different pair of discriminative stimuli with a 6-s delay. Following extensive training, when pigeons were given a choice between the two positive stimuli, they preferred the one that followed the 6-s delay.

As with the unsignaled delay condition, for this group the aversive event, the absence of reinforcement, could not be anticipated prior to its occurrence. To test the hypothesis that this contrast effect depends on the anticipation of the aversive event, the absence-of-reinforcement manipulation was repeated, but this time the initial stimulus was predictive of the absence of reinforcement (Friedrich et al., 2005, Group Signaled Reinforcement). Once again, on half of the trials, a vertical line appeared on the response key and 5 pecks resulted in the presentation of food followed by the appearance of one pair of discriminative stimuli. On the remaining trials, a horizontal line appeared on the response key and 5 pecks resulted in the absence of food followed by the appearance of the other pair of discriminative stimuli (see Figure 6). For this group, the pigeons could anticipate whether 5 pecks would result in reinforcement or not. When pigeons in this group were tested, they showed a significant (66.7%) preference for the conditioned reinforcer that in training had followed the absence of reinforcement. Once again, the experience of a relatively aversive event produced an increase in the value of the conditioned reinforcer that followed.

Figure 6. Design of experiment by Friedrich et al. (2005, Group Signaled Reinforcement) in which one stimulus signaled that food would be presented prior to the appearance of discriminative stimuli and the other stimulus signaled that food would not be presented prior to the appearance of a different pair of discriminative stimuli. Following extensive training, when pigeons were given a choice between the two positive stimuli, they preferred the one that followed the absence of food.

The Anticipation of Effort as the Aversive Event

Can anticipated effort, rather than actual effort, serve as the aversive event that increases the value of stimuli signaling the reinforcement that follows? This question addresses the issue of whether the positive contrast between the initial aversive event and the conditioned reinforcer depends on actually experiencing the aversive event. One account of the added value that accrues to stimuli that follow greater effort is that during training, the greater effort produces a heightened state of arousal, and in that heightened state the pigeons learn more about the discriminative stimuli that follow than they learn about the discriminative stimuli that follow the lower state of arousal produced by lesser effort. Examination of the acquisition functions for the two simultaneous discriminations offers no support for this hypothesis. Over the various experiments that we have conducted, there has been no tendency for the simultaneous discrimination that followed greater effort, longer delays, or the absence of reinforcement to be acquired faster than the discrimination that followed less effort, shorter delays, or reinforcement. However, those discriminations were acquired very rapidly and there might have been a ceiling effect. That is, a difference in the rate of discrimination acquisition small enough to be missed might still have been sufficient to produce a preference for the conditioned reinforcer that follows the more aversive event.

Thus, the purpose of the anticipation experiments was to ask if we could obtain a preference for the discriminative stimuli that followed a signal that more effort might be required but actually was not required on that trial. More specifically, at the start of half of the training trials, pigeons were presented with, for example, a vertical line on the center response key. On half of these trials, pecking the vertical line replaced it with a white key and a single peck (low effort) to the white key resulted in reinforcement. On the remaining vertical-line trials, pecking the vertical line replaced it with a simultaneous discrimination S+L S-L on the left and right response keys and choice of the S+ was reinforced. A schematic presentation of the design of this experiment appears in Figure 7.

Figure 7. Design of experiment by Clement & Zentall (2002, Exp. 1) to determine the effect of the anticipation of effort (1 vs. 30 pecks). On some trials pigeons were presented with a vertical-line stimulus and 10 pecks would produce either a white stimulus (one peck to the white stimulus would produce reinforcement) or a choice between two colors (choice of the correct stimulus would be reinforced). On other trials pigeons were presented with a horizontal-line stimulus and 10 pecks would produce either a white stimulus (30 pecks to the white stimulus would produce reinforcement) or a choice between two other colors (choice of the correct stimulus would be reinforced). On probe trials, when given a choice between the two correct colors, the pigeons preferred the color associated with the horizontal-line stimulus (the correct stimulus that on other horizontal-line trials would have required 30 pecks to receive reinforcement).

On the remaining training trials, the pigeons were presented with a horizontal line on the center response key. On half of these trials, pecking the horizontal line replaced it with a white key and 30 pecks (high effort) to the white key resulted in reinforcement. On the remaining horizontal-line trials, pecking the horizontal line replaced it with a different simultaneous discrimination S+H S-H on the left and right response keys and again choice of the S+ was reinforced. On test trials, when the pigeons were given a choice between S+H and S+L, they once again showed a significant (66.5%) preference for S+H.

It is important to note that in this experiment the events that occurred in training on trials involving the two pairs of discriminative stimuli were essentially the same. It was only on the other half of the trials, those on which the discriminative stimuli did not appear, that differential responding was required. Thus, the expectation of differential effort, rather than actual differential effort, appears to be sufficient to produce a differential preference for the conditioned reinforcers that follow. These results extend the findings of the earlier research to include anticipated effort.

The Anticipation of the Absence of Reinforcement as the Aversive Event

If anticipated effort can function as a relative conditioned aversive event, can the anticipated absence of reinforcement serve the same function? Using a design similar to that used to examine differential anticipated effort, we evaluated the effect of differential anticipated reinforcement (Clement & Zentall, 2002, Exp. 2). On half of the training trials, pigeons were presented with a vertical line on the center response key. On half of these trials, pecking the vertical line was followed immediately by reinforcement (high probability of reinforcement). On the remaining vertical-line trials, pecking the vertical line replaced it with a simultaneous discrimination S+HP S-HP and choice of the S+ was reinforced, but only on a random 50% of the trials. A schematic presentation of the design of this experiment appears in Figure 8. On the remaining training trials, the pigeons were presented with a horizontal line on the center response key. On half of these trials, pecking the horizontal line was followed immediately by the absence of reinforcement (low probability of reinforcement). On the remaining horizontal-line trials, pecking the horizontal line replaced it with a different simultaneous discrimination S+LP S-LP and again choice of the S+ was reinforced, but again only on a random 50% of the trials. On test trials, when the pigeons were given a choice between S+HP and S+LP, they showed a significant (66.9%) preference for S+LP. Thus, the anticipation of an aversive, absence-of-food event appears to produce a preference for the S+ that follows the initial stimulus, a preference similar to that produced by the anticipation of a high-effort response.

Figure 8. Design of experiment by Clement & Zentall (2002, Exp. 2) to determine the effect of the anticipation of the absence of reinforcement. On some trials pigeons were presented with a vertical-line stimulus and 10 pecks would produce either reinforcement or a choice between two colors (choice of the correct stimulus would be reinforced 50% of the time). On other trials pigeons were presented with a horizontal-line stimulus and 10 pecks would produce either the absence of reinforcement or a choice between two other colors (choice of the correct stimulus would be reinforced 50% of the time). On probe trials, when given a choice between the two correct colors, the pigeons preferred the color associated with the horizontal-line stimulus (the correct stimulus that on other horizontal-line trials would have produced the absence of reinforcement).

In a follow-up experiment (Clement & Zentall, 2002, Exp. 3), we tried to determine whether preference for the discriminative stimuli associated with the anticipation of the absence of food was produced by the anticipation of positive contrast between the certain absence of food and a 50% chance of food (on discriminative stimulus trials) or negative contrast between the certain anticipation of food and a 50% chance of food (on the other set of discriminative stimulus trials). A schematic presentation of the design of this experiment appears at the top of Figure 9.

For Group Positive, the conditions of reinforcement on vertical-line trials were essentially nondifferential (i.e., reinforcement always followed vertical-line trials whether the discriminative stimuli S+HP S-HP were presented or not). Thus, on half of the vertical-line trials, reinforcement was presented immediately for responding to the vertical line. On the remaining vertical-line trials, pecking the vertical line replaced it with the simultaneous discrimination S+HP S-HP and reinforcement was presented for responding to the S+. Thus, there should have been little contrast established between these two kinds of trial.

On half of the horizontal-line trials, however, responses to the horizontal line were never followed by reinforcement. On the remaining horizontal-line trials, involving S+LP S-LP, reinforcement was presented for responding to the S+. Thus, for this group, on horizontal-line trials there was the opportunity for positive contrast to develop on discriminative stimulus trials (i.e., the pigeons should expect that reinforcement might not occur on those trials and might experience positive contrast when it does occur).

For Group Negative, on all horizontal-line trials the conditions of reinforcement were essentially nondifferential (i.e., the probability of reinforcement on horizontal-line trials was always 50% whether the trials involved discriminative stimuli or not). Thus, there should have been little contrast established between these two kinds of trial (see the bottom of Figure 9). That is, on half of the horizontal-line trials, reinforcement was provided immediately with a probability of .50 for responding to the horizontal line. On the remaining horizontal-line trials, the discriminative stimuli S+LP S-LP were presented and reinforcement was obtained for choices of the S+ but only on 50% of the trials.

On half of the vertical-line trials, however, reinforcement was presented immediately for responding to the vertical line (with a probability of 1.00). On the remaining vertical-line trials, the discriminative stimuli S+HP S-HP were presented and reinforcement was provided for choice of the S+ with a probability of .50. Thus, for this group, on vertical-line trials there was the opportunity for negative contrast to develop on discriminative stimulus trials (i.e., the pigeons should expect that reinforcement is quite likely and might experience negative contrast when it does not occur).
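
Because the two groups' contingencies are easy to confuse, it may help to lay them out side by side. The summary below is my reading of the text and Figure 9; the structure and names are mine, for illustration only.

```python
# Reinforcement probabilities in Clement & Zentall (2002, Exp. 3), as I
# read the text and Figure 9 (schematic only). Each entry gives p(food)
# on trials with the line stimulus alone vs. on trials with the
# discriminative stimuli (given choice of the S+).
EXP3_DESIGN = {
    "Group Positive": {
        "vertical":   {"line_alone": 1.00, "after_S+": 1.00},  # no contrast
        "horizontal": {"line_alone": 0.00, "after_S+": 1.00},  # positive contrast
    },
    "Group Negative": {
        "vertical":   {"line_alone": 1.00, "after_S+": 0.50},  # negative contrast
        "horizontal": {"line_alone": 0.50, "after_S+": 0.50},  # no contrast
    },
}
```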

Figure 9. Design of experiment by Clement & Zentall (2002, Exp. 3) to determine if the effect of the anticipation of the absence of reinforcement was due to positive or negative contrast. For group positive (top panel), on some trials pigeons were presented with a vertical-line stimulus and 10 pecks would produce either reinforcement or a choice between two colors (choice of the correct stimulus S+HP would be reinforced 100% of the time, thus, no contrast). On other trials pigeons were presented with a horizontal-line stimulus and 10 pecks would produce either the absence of reinforcement or a choice between two other colors (choice of the correct stimulus S+LP would be reinforced 100% of the time, thus, positive contrast). On probe trials, when given a choice between the two correct colors, S+HP and S+LP the pigeons preferred the color associated with the horizontal-line stimulus (the correct stimulus that on other horizontal-line trials would have produced the absence of reinforcement), thus providing evidence for positive contrast (on the horizontal-line trials).

On test trials, when pigeons in Group Positive were given a choice between the two S+ stimuli, they showed a significant (60.1%) preference for the positive discriminative stimulus that in training was preceded by a horizontal line (the initial stimulus that on other trials was followed by the absence of reinforcement). Thus, Group Positive showed evidence of positive contrast.

When pigeons in Group Negative were given a choice between the two S+ stimuli, they showed a 58.1% preference for the positive discriminative stimulus that in training was preceded by a horizontal line (the initial stimulus that on other trials was followed by a lower probability of reinforcement than on comparable trials involving the vertical line). Thus, Group Negative showed evidence of negative contrast. In this case, the effect is better described as a reduced preference for the positive discriminative stimulus preceded by the vertical line, which on other trials was associated with a higher probability of reinforcement (100%). Considering the results from both Group Positive and Group Negative, it appears that both positive and negative contrast contributed to the preferences found by Clement and Zentall (2002, Exp. 2).

Hunger as the Aversive Event

According to the contrast model, if pigeons are trained to respond to one conditioned reinforcer when hungry and to respond to a different conditioned reinforcer when less hungry, when they are given a choice between the two conditioned reinforcers, they should prefer the conditioned reinforcer to which they learned to respond when hungrier. That is, they should prefer the stimulus that they experienced when they were in a relatively more aversive state. Vasconcelos and Urcuioli (2008b) tested this prediction by training pigeons to peck one colored stimulus on days when they were hungry and to peck a different colored stimulus on days when they were less hungry. On test days, when the pigeons were given a choice between the two colored stimuli, they showed a preference for the stimulus that they pecked when they were hungrier. Furthermore, this effect was not state-dependent because the pigeons preferred the color that they had learned to peck when hungrier, whether they were tested more or less hungry. Similar results were reported by Marsh, Schuck-Paim, and Kacelnik (2004) with starlings (see also Pompilio & Kacelnik, 2005). Furthermore, the effect appears to have considerable generality because Pompilio, Kacelnik, and Behmer (2006) were able to show similar effects in grasshoppers.

Within-Trial Contrast in Humans

It can be argued that if within-trial contrast is analogous to justification of effort, one should be able to show similar effects with humans. In fact, when humans were given a modified version of the task used by Clement et al. (2000), a similar effect was found (Klein, Bhatt, & Zentall, 2005). The humans were told that they would have to “click on a mouse” to receive a pair of abstract shapes and by clicking on the shapes they could learn which shape was correct. On some trials, a single click was required to present one of two pairs of shapes and one shape from each pair was designated as correct. On the remaining trials, 20 clicks were required to present one of two different pairs of shapes and again one shape from each pair was designated as correct. Thus, there was a total of four pairs of shapes. On test trials, the subjects were asked to choose between pairs of correct shapes, one shape that had followed a single mouse click, the other that had followed 20 mouse clicks. Consistent with the contrast hypothesis, subjects showed a significant (65.2%) preference for the shapes that followed 20 clicks. Furthermore, after their choice, when the subjects were asked why they had chosen those shapes, typically they did not know, and most of them were not even aware of which shapes had followed the large and small response requirements. When a similar procedure was used with 8-year-old children, they showed a similar 66.7% preference for the shapes that they had to work harder to obtain (Alessandri, Darcheville, & Zentall, 2008).

Contrast or Relative Delay Reduction?

We have described the preferences we have found for conditioned reinforcers (and feeder location) as a contrast effect. However, one could also interpret these effects in terms of relative delay reduction (Fantino & Abarca, 1985). According to the delay reduction hypothesis, any stimulus that predicts reinforcement sooner in its presence than in its absence will become a conditioned reinforcer. In the present experiments, the temporal relation between the conditioned reinforcers and the reinforcers was held constant, so one could argue that neither conditioned reinforcer should have served to reduce the delay to reinforcement more than the other. But the delay reduction hypothesis is meant to be applied to stimuli in a relative sense. That is, one can consider the predictive value of the discriminative stimuli relative to the time in their absence or, in the present case, to the total duration of the trial. If one considers delay reduction in terms of its duration relative to the duration of the entire trial, then the delay reduction hypothesis can account for the results of the present experiments. For example, in the case of the differential effort manipulation, as it takes longer to produce 20 responses (pecks or clicks) than to produce 1 response, 20-response trials would be longer in duration than 1-response trials. Thus, the appearance of the discriminative stimuli would occur relatively later in a 20-response trial than in a 1-response trial. The later in a trial that the discriminative stimuli appear, the closer their onset is to reinforcement relative to the start of the trial, and thus the greater the relative reduction in delay that they represent.
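
To make the relative delay-reduction computation concrete, suppose, purely for illustration, that 20 responses take about 10 s, a single response about 1 s, and that the discriminative stimuli precede food by 2 s. These durations are my assumptions, not measured values from the experiments.

```python
# Relative delay reduction: the proportion of the trial-to-food interval
# that the onset of the discriminative stimuli "cuts off." The durations
# below are assumed for illustration only.

def relative_delay_reduction(time_to_stimuli, stimuli_to_food):
    total_delay = time_to_stimuli + stimuli_to_food  # delay to food at trial onset
    return (total_delay - stimuli_to_food) / total_delay

print(relative_delay_reduction(10.0, 2.0))  # 20-response trial: ~0.83
print(relative_delay_reduction(1.0, 2.0))   # 1-response trial:  ~0.33
```

On this account, the discriminative stimuli on 20-response trials signal the larger proportional reduction in delay and should therefore be preferred, which matches the observed preference.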

The delay reduction hypothesis can also account for the effect seen with a delay versus the absence of a delay. But what about trials with reinforcement versus trials without reinforcement? In this case, the duration of the trial is the same with and without reinforcement, prior to the appearance of the discriminative stimuli; however, delay reduction theory considers the critical time to be the interval between reinforcements. Thus, on trials in which the discriminative stimuli are preceded by reinforcement, the time between reinforcements is short, so the discriminative stimuli are associated with little delay reduction. On trials in which the discriminative stimuli are preceded by the absence of reinforcement, however, the time between reinforcements is relatively long (i.e., the time between reinforcement on the preceding trial and reinforcement on the current trial), so the discriminative stimuli on the current trial would be associated with a relatively large reduction in delay.

Delay reduction theory has a more difficult time accounting for the effects of differential anticipated effort because trials with the two sets of discriminative stimuli were not differentiated by number of responses, delay, or reinforcement. Thus, all trials with discriminative stimuli should be of comparable duration. The same is true for the effects of differential anticipated reinforcement because that manipulation occurred on trials independent of the trials with the discriminative stimuli. Thus, taken as a whole, based on what has been presented to this point, the contrast interpretation appears to offer the more parsimonious account of the data.

On the other hand, it should be possible to distinguish between the delay reduction and contrast accounts with the use of a design similar to that used in the first experiment, with one important change. Instead of requiring that the pigeons peck many times on half of the trials and a few times on the remaining trials, one could use two schedules that accomplish the same thing while holding the duration of the trial event constant. This could be accomplished by using a fixed interval schedule (FI, the first response after a fixed duration would present one pair of discriminative stimuli) on half of the trials and a differential reinforcement of other behavior schedule (DRO, the absence of key pecking for the same fixed duration would present the other pair of discriminative stimuli) on the remaining trials. Assuming that the pigeons prefer the DRO schedule (but it is not certain that they would), then according to the contrast account the pigeons should prefer the discriminative stimuli that follow the FI schedule over the discriminative stimuli that follow the DRO schedule. According to the delay reduction hypothesis, if trial duration is held constant and the two pairs of discriminative stimuli occupy the same relative proportion of the two kinds of trial, the pigeons should not differentially prefer either pair of discriminative stimuli, regardless of which schedule is preferred.
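
The two accounts thus come apart in this design. The following toy formalization of the diverging predictions is my own, under the assumptions just stated:

```python
# Toy predictions for the FI vs. DRO design with equated trial durations
# (my formalization, not the authors'). pref_fi is the proportion of
# schedule-preference choices of the FI alternative (0 to 1).

def predicted_pref_for_s_plus_after_fi(pref_fi, account):
    if account == "delay_reduction":
        return 0.5               # equal trial durations -> indifference
    if account == "contrast":
        return 1.0 - pref_fi     # prefer the S+ after the less-preferred schedule
    raise ValueError(f"unknown account: {account}")

print(predicted_pref_for_s_plus_after_fi(0.3, "contrast"))         # 0.7
print(predicted_pref_for_s_plus_after_fi(0.3, "delay_reduction"))  # 0.5
```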

We tested the prediction of delay reduction theory by equating the trial duration on high-effort and low-effort trials, first training the pigeons to respond on an FI schedule to one stimulus on half of the trials and on a DRO schedule to a different stimulus on the remaining trials (Singer, Berry, & Zentall, 2007). Before introducing the discriminative stimuli, we tested the pigeons for their schedule preference. We then followed the two schedules with discriminative stimuli as in the earlier research and finally tested the pigeons for their conditioned reinforcer preference (see Figure 10). Consistent with contrast theory, we found that the pigeons reliably preferred (by 63.2%) the discriminative stimuli that followed their least preferred schedule (Figure 11; see also Singer & Zentall, 2011, Exp. 1). Furthermore, consistent with a contrast account, because the schedule preference varied in direction and degree among the pigeons, we examined the correlation between schedule preference and preference for the conditioned reinforcer that followed and found a significant negative correlation (r = -.78). The greater the schedule preference, the less the pigeons preferred the conditioned reinforcer that followed that schedule.
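
The correlational analysis can be illustrated as follows; the per-pigeon values here are invented for the sketch (the study's actual result was r = -.78):

```python
# Illustration of the schedule-preference vs. conditioned-reinforcer
# analysis with made-up per-pigeon data (the real result was r = -.78).
# Requires scipy (pip install scipy).
from scipy.stats import pearsonr

schedule_pref = [0.80, 0.65, 0.55, 0.45, 0.35, 0.60]  # preference for one schedule
s_plus_pref   = [0.35, 0.50, 0.60, 0.68, 0.75, 0.52]  # preference for the S+ after it

r, p = pearsonr(schedule_pref, s_plus_pref)
print(f"r = {r:.2f}, p = {p:.3f}")  # strongly negative, as contrast predicts
```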

Figure 10. Design of experiment that controlled for the duration of a trial. Choice of the left key resulted in presentation of a horizontal line, for example, on the center key, and if the pigeon refrained from pecking the horizontal line (DRO 20 s), it could choose between a red (S+) and yellow (S-) stimulus on the side keys. Choice of the right key resulted in presentation of a vertical line on the center key, and if the pigeon pecked the vertical line (FI 20 s), it could choose between a green (S+) and blue (S-) stimulus on the side keys. Pigeons’ schedule preference was used to predict their preference for the S+ stimulus that followed the schedule on probe trials (after Singer, Berry, & Zentall, 2007).

Further support for the contrast account came from an experiment in which there were 30-peck trials and single-peck trials, but trial duration was extended on single-peck trials to equal the duration of 30-peck trials by inserting a delay following the single peck equal to the time each pigeon had taken to complete the immediately preceding 30-peck requirement (Singer & Zentall, 2011, Exp. 2). Once again, following a test to determine which schedule was preferred, discriminative stimuli were inserted following completion of the schedule and the pigeons’ preference for the conditioned reinforcers was assessed. Again, the pigeons preferred the conditioned reinforcer that followed the least preferred schedule, 60.4% of the time (but see Vasconcelos, Lionello-DeNolf, & Urcuioli, 2007).

Figure 11. For each pigeon, probe trial preference for the S+ stimulus that followed the least preferred schedule in training (after Singer, Berry, & Zentall, 2007).

A different approach to equating trial duration was demonstrated with human subjects by Alessandri, Darcheville, Delevoye-Turrell, and Zentall (2008). Instead of using number of mouse clicks as the differential initial event, we used pressure on a transducer. On some trials, signaled by a discriminative stimulus, the subjects had to press the transducer lightly to produce a pair of shapes. On other trials, signaled by a different discriminative stimulus, the subjects had to press the transducer with greater force (50% of their maximum force assessed during pretraining). Following training, when subjects were given a choice between pairs of the conditioned reinforcers, they showed a significant 66.7% preference for those stimuli that had required the greater force to produce in training (and the effect was independent of the force required on test trials). Thus, further support for the contrast account was obtained under conditions in which it would be difficult to account for the effect by delay reduction theory.
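The force criterion is easy to express in code. A minimal sketch follows; the 50%-of-maximum criterion for high-force trials is from the description above, whereas the light-press criterion assumed for low-force trials is hypothetical.

def press_meets_criterion(force_n, max_force_n, high_force_trial):
    """Sketch of trial classification in Alessandri et al. (2008).

    High-force trials require at least 50% of the participant's maximum
    force, measured during pretraining; the 5% criterion assumed here
    for a light press is purely illustrative.
    """
    threshold = (0.5 if high_force_trial else 0.05) * max_force_n
    return force_n >= threshold

# For a participant whose pretraining maximum was, say, 40 N:
press_meets_criterion(22.0, 40.0, high_force_trial=True)   # True (needs >= 20 N)
press_meets_criterion(1.5, 40.0, high_force_trial=False)   # False (needs >= 2 N)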

Failures to Replicate the Within-Trial Contrast Effect

Several studies have reported a failure to obtain a contrast effect of the kind reported by Clement et al. (2000). Such reports are instructive because they can help to identify the boundary conditions for observing the effect. The first of these was reported by Vasconcelos, Urcuioli, and Lionello-DeNolf (2007), who attempted to replicate the original Clement et al. finding with 20 sessions of training beyond acquisition of the simple simultaneous discriminations, which were acquired very quickly. It should be noted, however, that more recent research in our laboratory has found that the amount of training required to establish the within-trial contrast effect is often greater than that used by Clement et al. Although Clement et al. found a contrast effect with 20 sessions of additional training, later research suggests that up to 60 sessions of training is often required to obtain the effect (see, e.g., Friedrich & Zentall, 2004).

Arantes and Grace (2007) also failed to replicate the contrast effect. In their first experiment they tested their pigeons without overtraining, and in their second experiment they tested their pigeons at various points up to 27 sessions of overtraining. Thus, once again, it may be that insufficient training was provided. In their second experiment, however, a subgroup of four pigeons was given more than twice that number of training sessions; although those pigeons did show a preference for the conditioned reinforcer that followed the greater effort in training, the preference was not statistically reliable. The smaller contrast effect reported by Arantes and Grace may be attributable to the extensive experience (in a previous experiment) that these pigeons had had with lean variable-interval schedules. It is possible that this prior experience with lean schedules reduced the aversiveness of the 20-peck requirement enough to diminish the magnitude of the contrast effect. Another factor that may have contributed to the reduced magnitude of their effect was the use of a 6-s delay between choice of the conditioned reinforcer and reinforcement. Although Clement et al. (2000) also included a 6-s delay, later research suggests that contrast effects at least as large can be obtained when reinforcement immediately follows choice of the conditioned reinforcer.

Finally, Vasconcelos and Urcuioli (2008a) noted that they too failed to find a significant contrast effect following extensive overtraining. However, the effect that they did find (about 62% choice of the conditioned reinforcer that followed the greater pecking requirement) was quite comparable in magnitude to the effect reported by Clement et al. (2000). Their failure to find a significant effect may be attributable to the fact that there were only four pigeons in their experiment; that is, their study may have lacked sufficient power to detect within-trial contrast. Thus, the several failures to find a contrast effect with procedures similar to those used by Clement et al. suggest that observing the effect may require considerable overtraining, the absence of prior training with lean schedules of reinforcement, and a sample large enough to deal with individual differences in the magnitude of the effect.
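The power problem can be illustrated with a back-of-the-envelope computation. The sign test below is a simplified stand-in for whatever analysis the original authors used, but it makes the point: with four subjects, even a perfectly consistent direction of preference cannot reach conventional significance.

from math import comb

def sign_test_p(n_preferring, n):
    """Two-tailed exact binomial (sign) test against chance (p = .5)."""
    upper_tail = sum(comb(n, k) for k in range(n_preferring, n + 1)) / 2 ** n
    return min(1.0, 2 * upper_tail)

print(sign_test_p(4, 4))  # 0.125 -> not significant even when all 4 pigeons prefer
print(sign_test_p(7, 8))  # ~0.07 -> still marginal with twice the sample size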

The Nature of the Contrast

The contrast effects found in the present research appear to be somewhat different from the various forms of contrast that have been reported in the literature (see Flaherty, 1996). Flaherty distinguishes among three kinds of contrast.

Incentive Contrast

In incentive contrast, the magnitude of reward that has been experienced for many trials suddenly changes, and the change in behavior that follows is compared with the behavior of a comparison group that has experienced the final magnitude of reinforcement from the start. Early examples of incentive contrast were reported by Tinklepaugh (1928), who found that monkeys trained for a number of trials with a preferred reward (e.g., fruit) often refused to eat a less preferred reward (e.g., lettuce, a reward for which they would normally readily work) when they then encountered it.

Incentive contrast was more systematically studied by Crespi (1942; see also Mellgren, 1972). Rats trained to run down an alley for a large amount of food and then shifted to a small amount typically run slower than rats trained to run for the smaller amount from the start (negative incentive contrast). Conversely, rats trained to run for a small amount of food and then shifted to a large amount may run faster than rats trained to run for the larger amount from the start (positive incentive contrast). By its nature, incentive contrast must be assessed following the shift in reward magnitude rather than in anticipation of the change because, generally, only a single shift is experienced.

Capaldi (1972) has argued that negative successive incentive contrast of the kind studied by Crespi (1942) can be accounted for as a form of generalization decrement (the downward shift in incentive value represents not only a shift in reinforcement value but also a change in context). However, generalization decrement cannot account for positive successive incentive contrast effects (also found by Crespi and in the present research), in which the magnitude of reinforcement increases.

Incentive contrast would seem to be an adaptive mechanism by which animals can increase their sensitivity to changes in reinforcement density. Just as animals use lateral inhibition in vision to help them discriminate spatial changes in light intensity, resulting in enhanced detection of edges (or better figure-ground detection), so too may incentive contrast help the animal detect changes in reinforcement magnitude important to its survival. Thus, incentive contrast may be a perceptually mediated detection process.

Anticipatory Contrast

In a second form of contrast, anticipatory contrast, there are repeated (typically once-a-day) experiences with the shift in reward magnitude, and the measure of contrast involves behavior that occurs prior to the anticipated change in reward value. Furthermore, the behavior assessed is typically consummatory behavior rather than running speed. For example, rats often drink less of a weak saccharin solution if they have learned that it will be followed by a strong sucrose solution, relative to a control group for which saccharin is followed by more saccharin (Flaherty, 1982). This form of contrast differs from the others in that the measure of contrast involves differential rates of consumption of a reward (rather than an independent behavior such as running speed).

Behavioral Contrast

A third form of contrast involves the random alternation of two signaled outcomes. When used in a discrete-trials procedure with rats, the procedure has been referred to as simultaneous incentive contrast. Bower (1961), for example, reported that rats trained to run down an alley to both large and small signaled magnitudes of reward ran slower to the small magnitude of reward than rats that ran only to the small magnitude of reward.

The more-often-studied, free-operant analog of this task is called behavioral contrast. To observe behavioral contrast, pigeons are trained on an operant task involving a multiple schedule of reinforcement. In a multiple schedule, two (or more) schedules, each signaled by a distinctive stimulus, are randomly alternated. Positive behavioral contrast can be demonstrated by training pigeons initially with schedules offering equal probabilities of reinforcement (e.g., two variable-interval 60-s schedules) and then reducing the probability of reinforcement in one schedule (e.g., from variable-interval 60-s to extinction) and noting an increase in the response rate in the other, unaltered schedule (Halliday & Boakes, 1971; Reynolds, 1961). Similar results can be demonstrated in a between-groups design (Mackintosh, Little, & Lord, 1972) in which pigeons are trained on the multiple variable-interval 60-s and extinction schedules from the start, and their rate of pecking during the variable-interval 60-s schedule is compared with that of other pigeons trained on two variable-interval 60-s schedules.
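For readers unfamiliar with the procedure, a schematic sketch of a multiple schedule follows. It simulates only the arrangement of components and reinforcement setups, not pigeon behavior, and every parameter value is illustrative.

import random

def multiple_schedule(n_components=40, component_s=60.0, mean_vi_s=60.0):
    """Sketch of a multiple VI 60-s / EXT schedule.

    Two schedules, each signaled by its own key color, alternate
    unpredictably in fixed-length components. In a VI component,
    reinforcement is set up at irregular times averaging mean_vi_s
    seconds; in an extinction (EXT) component, it never is.
    """
    reinforcers = {"VI60": 0, "EXT": 0}  # EXT necessarily stays at 0
    for _ in range(n_components):
        component = random.choice(("VI60", "EXT"))
        if component == "VI60":
            t = random.expovariate(1.0 / mean_vi_s)  # first setup time
            while t <= component_s:
                reinforcers["VI60"] += 1  # assumes a peck collects each setup
                t += random.expovariate(1.0 / mean_vi_s)
    return reinforcers

Behavioral contrast itself is the change in response rate during the unaltered VI component, which this purely procedural sketch deliberately leaves out.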

The problem with classifying behavioral contrast according to whether it involves a response to entering the richer schedule (as with incentive contrast) or the anticipation of entering the poorer schedule (as with anticipatory contrast) is that, during each session, there are multiple transitions from the richer to the poorer schedule and vice versa. Thus, when one observes an increase in responding in the richer schedule resulting from the presence of the poorer schedule at other times, it is not clear whether the pigeons are reacting to the preceding poorer schedule or anticipating the next one.

Williams (1981) attempted to distinguish between these two mechanisms by presenting pigeons with triplets of trials in an ABA design (with the richer schedule designated as A) and comparing their behavior to that of pigeons trained with an AAA design. Williams found very different kinds of contrast in the two A components of the ABA schedule. In the first A component, Williams found a generally higher level of responding that was maintained over training sessions (see also Williams, 1983). In the second A component, however, he found a higher level of responding primarily at the start of the component, an effect known as local contrast, and that level of responding was not maintained over training sessions (see also Cleary, 1992). Thus, there is evidence that behavioral contrast may be attributable primarily to the higher rate of responding by pigeons in anticipation of the poorer schedule rather than in response to the appearance of the richer schedule (Williams, 1981; see also Williams & Wixted, 1986).

It is generally accepted that the higher rate of responding to the stimulus associated with the richer schedule of reinforcement occurs because, in the context of the poorer schedule, that stimulus is relatively better at predicting reinforcement (Keller, 1974). Or in more cognitive terms, the richer schedule seems even better in the context of a poorer schedule.

There is evidence, however, that the elevated responding reflects not that the richer schedule appears better, but that the richer schedule signals that conditions will soon get worse. In support of this distinction, although pigeons peck at a higher rate at stimuli that predict a worsening in the probability of reinforcement, when given a choice, pigeons prefer stimuli that they respond to less but that predict no worsening in the probability of reinforcement (Williams, 1992). Thus, curiously, under these conditions, response rate has been found to be negatively correlated with choice.

The implication of this finding is that the increased responding associated with the richer schedule does not reflect its greater value to the pigeon, but rather its function as a signal that conditions will soon get worse because the opportunity to obtain reinforcement will soon diminish. This analysis suggests that the mechanism responsible for anticipatory contrast (Flaherty, 1982) and, in the case of behavioral contrast, responding in anticipation of a worsening schedule (Williams, 1981), is likely to be a compensatory or learned response. In this sense, these two forms of contrast are probably quite different from the perceptual-like detection process involved in incentive contrast.

The Present Within-Trial Contrast Effect

What all contrast effects have in common is the presence, at other times, of a second condition that is either better or worse than the target condition. The effect of the second condition often is to exaggerate the difference between the two conditions. Although there have been attempts to account for these various contrast effects, Mackintosh (1974) concluded that no single principle will suffice (see also Flaherty, 1996). Thus, even before the contrast effect reported by Clement et al. (2000) and presented here was added to the list, contrast effects resisted a comprehensive explanation.

Procedurally, the positive contrast effect reported by Clement et al. (2000) appears to be most similar to that involved in anticipatory contrast (Flaherty, 1982) because in each case there is a series of paired events, the second of which is better than the first. High effort is followed by discriminative stimuli in the case of the Clement et al. procedure, and a low concentration of saccharin is followed by a higher concentration of sucrose in the case of anticipatory contrast. However, the effect reported by Clement et al. is seen in a choice response made in the presence of the second event (i.e., preference for one conditioned reinforcer over the other) rather than the first (i.e., differential consumption of the saccharin solution).

Alternatively, although successive incentive contrast and the contrast effect reported by Clement et al. (2000) both involve a change in behavior during the second component of the task, the mechanisms responsible for these effects must be quite different. In the case of the Clement et al. procedure, the pigeons experienced the two-event sequences many hundreds of times prior to test and thus, they could certainly learn to anticipate the appearance of the discriminative stimuli and the reinforcers that followed, whereas in the case of successive incentive contrast, the second component of the task could not be anticipated.

The temporal relations involved in the within-trial contrast effect reported by Clement et al. (2000) would seem more closely related to those that have been referred to as local contrast (Terrace, 1966). As already noted, local contrast refers to the temporary change in response rate that occurs following a stimulus change that signals a change in schedule. But local contrast effects tend to occur early in training and generally disappear with extended training. Furthermore, if local contrast were responsible for the contrast effect reported by Clement et al., they should have found a higher response rate to the positive stimulus that followed the higher-effort response than to the positive stimulus that followed the lower-effort response. But differences in response rate have not been found, only differences in choice.

Thus, the form of contrast characteristic of the research described in this review appears to be different from the various contrast effects described in the literature. First, the present contrast effect is a within-subject effect that is measured by a preference score. Second, in a conceptual sense, it is the reverse of what one might expect based on more typical contrast effects. Typically, a relatively aversive event (e.g., a delay to reinforcement) is judged to be more aversive (as measured by increased response latency or decreased choice) when it occurs in the context of a less aversive event that occurs on alternative trials (i.e., it is a between-trials effect). The contrast effect described here is assumed to occur within trials, and its effect is to make the events that follow the relatively aversive event more preferred than similar events that follow less aversive events. Thus, referring to this effect as a contrast effect is descriptive, but it is really quite different from the other contrast effects described by Flaherty (1996). For all of the above reasons, we consider the contrast effect presented here to be different from other contrast effects that have been studied in the literature, and we propose to refer to it as within-trial contrast.

Possibly Related Psychological Phenomena

The within-trial contrast effect described here may be related to other psychological phenomena that have been described in the literature.

Contrafreeloading.

A form of contrast similar to that found in the present experiments may be operating in the case of the classic contrafreeloading effect (e.g., Carder & Berkowitz, 1970; Jensen, 1963; Neuringer, 1969). For example, pigeons trained to peck a lit response key for food will often obtain food by pecking the key even when they are presented with a dish of free food. Other factors may contribute to the contrafreeloading effect (e.g., reduced familiarity with the free food in the context of the operant chamber, Taylor, 1975, or perhaps a preference for small portions of food spaced over time). But it is also possible that the pigeons value the food obtained following the effort of key pecking more than the free food, and if the effort required is relatively small, the added value of food for which they have to work may at times actually be greater than the cost of the effort required to obtain it (one way to make this trade-off concrete is sketched below).
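The following toy model is one way to cash out that informal argument. The functional forms and parameter values are entirely invented: contrast is assumed to add value that grows with effort but with diminishing returns, while the cost of effort grows linearly, so earned food beats free food only when the effort is small.

from math import sqrt

def net_gain_from_working(effort, contrast_gain=0.6, effort_cost=0.2):
    """Toy model of the contrast account of contrafreeloading.

    Returns the net value of earned food relative to free food:
    positive values favor working for food, negative values favor
    freeloading. All numbers are hypothetical.
    """
    return contrast_gain * sqrt(effort) - effort_cost * effort

print(net_gain_from_working(1))   #  0.40 -> small effort: earned food wins
print(net_gain_from_working(16))  # -0.80 -> large effort: free food wins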

Justification of Effort.

As mentioned earlier, justification of effort in humans has been attributed to the discrepancy between one’s beliefs and one’s behavior (Aronson & Mills, 1959). The present research suggests that contrast may be a more parsimonious interpretation of this effect not only in pigeons but also in humans. In fact, the present results may have implications for a number of related phenomena that have been studied in humans.

The term work ethic has often been used in the human literature to describe a value or a trait that varies among members of a population as an individual difference (e.g., Greenberg, 1977). But it also can be thought of as a typically human characteristic that appears to be in conflict with traditional learning theory (Hull, 1943). Work (effort) is generally viewed as at least somewhat aversive and as behavior to be reduced, especially if less-effortful alternatives to obtaining a reward are available. Other things being equal, less work should be preferred over more work (and in general it is). Yet it is also the case that work, per se, is often valued in our culture. Students are often praised for their effort independent of their success. Furthermore, the judged value of a reward may depend on the effort that preceded it. For example, students generally value a high grade that they have received in a course not only for its absolute value but also in proportion to the effort required to obtain it. Consider the greater pride that a student might feel about an A grade in a difficult course (say, Organic Chemistry) as compared with a similar A grade in an easier course (say, Introduction to Golf).

Although, in the case of such human examples, cultural factors, including social rewards, may be implicated, a more fundamental, nonsocial mechanism may also be present. In the absence of social factors, it may generally be the case (as in the present experiments) that the contrast between the value of the task prior to reward and at the time of reward may be greater following greater effort than following less effort.

Cognitive Dissonance.

As described earlier, when humans experience a tedious task, their evaluation of the aversiveness of the task is sometimes negatively correlated with the size of the reward provided for agreeing to describe the task to others as pleasurable, a cognitive dissonance effect (Festinger & Carlsmith, 1959). The explanation that has been given for this effect is that the conflict between attitude (the task was tedious) and behavior (participants had agreed to describe the task to another person as enjoyable) was more easily resolved when a large reward was given (“I did it for the large reward”) and thus a more honest evaluation of the task could be provided. However, one could also view the contrast between effort and reward as greater in the large-reward condition than in the small-reward condition. Thus, looking back from the context of the large reward, the subjective aversiveness of the prior task might be judged greater than it would be from the context of the small reward.

Intrinsic versus Extrinsic Reinforcement.

Contrast effects of the kind reported here may also be responsible for the classic finding that extrinsic reinforcement may reduce intrinsic motivation (Deci, 1975; but see also Eisenberger & Cameron, 1996). If rewards are given for activities that may be intrinsically rewarding (e.g., puzzle solving), providing extrinsic rewards for such an activity may lead to a subsequent reduction in that behavior when extrinsic rewards are no longer provided. This effect has been interpreted as a shift in self-determination or locus of control (Deci & Ryan, 1985; Lepper, 1981). But such effects can also be viewed as examples of contrast. In this case, it may be the contrast between extrinsic reinforcement and its sudden removal that is at least partly responsible for the decline in performance (Flora, 1990). Such contrast effects are likely to be quite different from those responsible for the results of the present experiments, however, because the removal of extrinsic reinforcement results in a change in actual reward value relative to the reward value expected (i.e., the shift from a combination of both extrinsic and intrinsic reward to intrinsic reward alone). Thus, the effect of extrinsic reinforcement on intrinsic motivation is probably more similar to traditional reward-shift effects of the kind reported by Crespi (1942); that is, rats run slower after they have been shifted from a large to a small magnitude of reward than rats that have always experienced the small magnitude of reward.

Learned Industriousness.

Finally, contrast effects may also be involved in a somewhat different phenomenon that Eisenberger (1992) has called learned industriousness. Eisenberger has found that being rewarded for putting a large amount of effort into a task (compared with a small amount of effort) may increase one's general readiness to expend effort in other goal-directed tasks. Eisenberger has attributed this effect to the conditioned reward value of effort, a reasonable explanation for the phenomenon, but contrast may also be involved.

Depending on the relative effort required in the first and second tasks, two kinds of relative contrast are possible. First, if the target (second) task is relatively effortful, negative contrast between the previous low-effort task and the target task may make persistence on the second task more aversive for the low-effort group (and the absence of negative contrast less aversive for the high-effort group). Second, for the high-effort group, if the target task requires relatively little effort, positive contrast between the previous high-effort task and the target task may make persistence less aversive. In either case, contrast provides a reasonable alternative account of these data.

Conclusions

From the previous discussion it should be clear that positive contrast effects of the kind reported in the present research may contribute to a number of experimental findings that have been reported using humans (and sometimes animals) but that traditionally have been explained using more complex cognitive and social accounts. Further examination of these phenomena from the perspective of simpler contrast effects may lead to more parsimonious explanations of what have previously been interpreted to be uniquely human phenomena.

But even if contrast is involved in these phenomena, more cognitive factors of the type originally proposed may also play a role in complex social contexts. It would be informative, however, to determine the extent to which contrast effects alone contribute to these phenomena.

Finally, the description of the various effects as examples of contrast may give the mistaken impression that such effects are simple and well understood. As prevalent as contrast effects appear to be, the mechanisms that account for them remain quite speculative. Consider the prevalence of the opposite effect, generalization, in which experience with one value on a continuum spreads to other values in direct proportion to their similarity to the experienced value (Hull, 1943). According to a generalization account, generalization between two values of reinforcement should tend to make the two values more similar to each other, rather than more different. An important goal of future research should be to identify the conditions that produce contrast and those that produce generalization.

At the very least, the presence of contrast implies some form of relational learning that cannot be accounted for by means of traditional behavioral theories. Thus, although contrast may provide an alternative, more parsimonious account of several complex social psychological phenomena, contrast should not be considered a simple mechanism. Instead it can be viewed as a set of relational phenomena that must be explained in their own right.


References

Alessandri, J., Darcheville, J.-C., & Zentall, T. R. (2008). Cognitive dissonance in children: Justification or contrast? Psychonomic Bulletin & Review, 15, 673-677. doi.org/10.3758/PBR.15.3.673 PMid:18567273

Alessandri, J., Darcheville, J.-C., Delevoye-Turrell, Y., & Zentall, T. R. (2008). Preference for rewards that follow greater effort and greater delay. Learning & Behavior, 36, 352–358. doi.org/10.3758/LB.36.4.352 PMid:18927058

Amsel, A. (1958). The role of frustrative nonreward in noncontinuous reward situations. Psychological Bulletin, 55, 102–119. doi.org/10.1037/h0043125 PMid:13527595

Arantes, J., & Grace, R. C. (2007). Failure to obtain value enhancement by within-trial contrast in simultaneous and successive discriminations. Learning & Behavior, 36, 1–11. doi.org/10.3758/LB.36.1.1

Aronson, E., & Mills, J. (1959). The effect of severity of initiation on liking for a group. Journal of Abnormal and Social Psychology, 59, 177–181. doi.org/10.1037/h0047195

Aw, J., Vasconcelos, M., & Kacelnik, A. (2011). How costs affect preferences: Experiments on state-dependence, hedonic state and within-trial contrast in starlings. Animal Behaviour, 81, 1117–1128. doi.org/10.1016/j.anbehav.2011.02.015

Bower, G. H. (1961). A contrast effect in differential conditioning. Journal of Experimental Psychology, 62, 196–199. doi.org/10.1037/h0048109

Capaldi, E. J. (1967). A sequential hypothesis of instrumental learning. In K. W. Spence & J. T. Spence (Eds.), The psychology of learning and motivation (Vol. 1, pp. 67–156). New York: Academic Press.

Capaldi, E. J. (1972). Successive negative contrast effect: Intertrial interval, type of shift, and four sources of generalization decrement. Journal of Experimental Psychology, 96, 433-438. doi.org/10.1037/h0033695

Carder, B., & Berkowitz, K. (1970). Rats' preference for earned in comparison with free food. Science, 167, 1273–1274. doi.org/10.1126/science.167.3922.1273 PMid:5411917

Cleary, T. L. (1992). The relationship of local to overall behavioral contrast. Bulletin of the Psychonomic Society, 30, 58–60.

Clement, T. S., Feltus, J., Kaiser, D. H., & Zentall, T. R. (2000). “Work ethic” in pigeons: Reward value is directly related to the effort or time required to obtain the reward. Psychonomic Bulletin & Review, 7, 100–106. doi.org/10.3758/BF03210727 PMid:10780022

Clement, T. S., & Zentall, T. R. (2002). Second-order contrast based on the expectation of effort and reinforcement. Journal of Experimental Psychology: Animal Behavior Processes, 28, 64–74. doi.org/10.1037/0097-7403.28.1.64

Crespi, L. P. (1942). Quantitative variation in incentive and performance in the white rat. American Journal of Psychology, 55, 467–517. doi.org/10.2307/1417120

Deci, E. (1975). Intrinsic motivation. New York: Plenum. doi.org/10.1007/978-1-4613-4446-9

Deci, E., & Ryan, R. M. (1985). Intrinsic motivation and self-determination in human behavior. New York: Plenum Press.

DiGian, K. A., Friedrich, A. M., & Zentall, T. R. (2004). Reinforcers that follow a delay have added value for pigeons. Psychonomic Bulletin & Review, 11, 889–895. doi.org/10.3758/BF03196717

Egan, L. C., Santos, L. R., & Bloom, P. (2007). The origins of cognitive dissonance: Evidence from children and monkeys. Psychological Science, 18, 978–983. doi.org/10.1111/j.1467-9280.2007.02012.x PMid:17958712

Eisenberger, R. (1992). Learned industriousness. Psychological Review, 99, 248–267. doi.org/10.1037/0033-295X.99.2.248 PMid:1594725

Eisenberger, R., & Cameron, J. (1996). Detrimental effects of reward. American Psychologist, 51, 1153–1166. doi.org/10.1037/0003-066X.51.11.1153 PMid:8937264

Fantino, E., & Abarca, N. (1985). Choice, optimal foraging, and the delay-reduction hypothesis. Behavioral and Brain Sciences, 8, 315–330. doi.org/10.1017/S0140525X00020847

Festinger, L. (1957). A theory of cognitive dissonance. Stanford, CA: Stanford University Press.

Festinger L., & Carlsmith, J. M. (1959). Cognitive consequences of forced compliance. Journal of Abnormal and Social Psychology. 58, 203–210. doi.org/10.1037/h0041593

Flaherty, C. F. (1982). Incentive contrast: A review of behavioral changes following shifts in reward. Animal Learning & Behavior, 10, 409–440. doi.org/10.3758/BF03212282

Flaherty, C. F. (1996). Incentive relativity. New York: Cambridge University Press.

Flora, S. R. (1990). Undermining intrinsic interest from the standpoint of a behaviorist. Psychological Record, 40, 323–346.

Friedrich, A. M., & Zentall, T. R. (2004). Pigeons shift their preference toward locations of food that take more effort to obtain. Behavioural Processes, 67, 405–415. doi.org/10.1016/j.beproc.2004.07.001 PMid:15518990

Friedrich, A. M., Clement, T. S., & Zentall, T. R. (2005). Discriminative stimuli that follow the absence of reinforcement are preferred by pigeons over those that follow reinforcement. Learning & Behavior, 33, 337-342. doi.org/10.3758/BF03192862

Greenberg, J. (1977). The Protestant work ethic and reactions to negative performance evaluations on a laboratory task. Journal of Applied Psychology, 62, 682–690. doi.org/10.1037/0021-9010.62.6.682

Halliday, M. S., & Boakes, R. A. (1971). Behavioral contrast and response independent reinforcement. Journal of the Experimental Analysis of Behavior, 16, 429–434. doi.org/10.1901/jeab.1971.16-429 PMid:16811560 PMCid:1333947

Hull, C. L. (1943). Principles of behavior. New York: Appleton-Century-Crofts.

Jensen, G. D. (1963). Preference of bar pressing over “freeloading” as a function of number of rewarded presses. Journal of Experimental Psychology, 65, 451–454. doi.org/10.1037/h0049174 PMid:13957621

Johnson, A. W., & Gallagher, M. (2011). Greater effort boosts the affective taste properties of food. Proceedings of the Royal Society B: Biological Sciences, 278, 1450-1456. doi.org/10.1098/rspb.2010.1581 PMid:21047860 PMCid:3081738

Kacelnik, A., & Marsh, B. (2002). Cost can increase preference in starlings. Animal Behaviour, 63, 245-250. doi.org/10.1006/anbe.2001.1900

Keller, K. (1974). The role of elicited responding in behavioral contrast. Journal of the Experimental Analysis of Behavior, 21, 249–257. doi.org/10.1901/jeab.1974.21-249 PMid:16811742 PMCid:1333192

Klein, E. D., Bhatt, R. S., & Zentall, T. R. (2005). Contrast and the justification of effort. Psychonomic Bulletin & Review, 12, 335-339. doi.org/10.3758/BF03196381

Lawrence, D. H., & Festinger, L. (1962). Deterrents and reinforcement: The psychology of insufficient reward. Stanford, CA: Stanford University Press.

Lepper, M. R. (1981). Intrinsic and extrinsic motivation in children: Detrimental effects of superfluous social controls. In W. A. Collins (Ed.), Aspects of the development of competence: The Minnesota Symposium on Child Psychology (Vol. 14, pp. 155–214). Hillsdale, NJ: Lawrence Erlbaum.

Lieberman, M. D., Ochsner, K. N., Gilbert, D. T., & Schacter, D. L. (2001). Do amnesics exhibit cognitive dissonance reduction? The role of explicit memory and attention in attitude change. Psychological Science, 12, 135–140. doi.org/10.1111/1467-9280.00323 PMid:11340922

Mackintosh, N. J. (1974). The psychology of animal learning. London: Academic Press.

Mackintosh, N. J., Little, L., & Lord, J. (1972). Some determinants of behavioral contrast in pigeons and rats. Learning and Motivation, 3, 148–161. doi.org/10.1016/0023-9690(72)90035-5

Marsh, B., Schuck-Paim, C., & Kacelnik, A. (2004). State-dependent learning affects foraging choices in starlings. Behavioral Ecology, 15, 396–399. doi.org/10.1093/beheco/arh034

Mellgren, R. L. (1972). Positive and negative contrast effects using delayed reinforcement. Learning and Motivation, 3, 185–193. doi.org/10.1016/0023-9690(72)90038-0

Neuringer, A. J. (1969). Animals respond for food in the presence of free food. Science, 166, 399–401. doi.org/10.1126/science.166.3903.399 PMid:5812041

Pompilio, L., & Kacelnik, A. (2005). State-dependent learning and suboptimal choice: When starlings prefer long over short delays to food. Animal Behaviour, 70, 571–578. doi.org/10.1016/j.anbehav.2004.12.009

Pompilio, L., Kacelnik, A., & Behmer, S. (2006). State-dependent learned valuation drives choice in an invertebrate. Science, 311, 1613–1615. doi.org/10.1126/science.1123924 PMid:16543461

Reynolds, G. S. (1961). Behavioral contrast. Journal of the Experimental Analysis of Behavior, 4, 57–71. doi.org/10.1901/jeab.1961.4-57 PMid:13741096 PMCid:1403981

Singer, R. A., Berry, L. M., & Zentall, T. R. (2007). Preference for a stimulus that follows an aversive event: Contrast or delay reduction? Journal of the Experimental Analysis of Behavior, 87, 275-285. doi.org/10.1901/jeab.2007.39-06 PMid:17465316 PMCid:1832171

Singer, R. A., & Zentall, T. R. (2011). Preference for the outcome that follows a relatively aversive event: Contrast or delay reduction? Learning and Motivation, 42, 255–271. doi.org/10.1016/j.lmot.2011.06.001 PMid:22993453 PMCid:3444245

Taylor, G. T. (1975). Discriminability and the contrafreeloading phenomenon. Journal of Comparative and Physiological Psychology, 88, 104–109. doi.org/10.1037/h0076222 PMid:1120788

Terrace, H. S. (1966). Stimulus control. In W. K. Honig (Ed.), Operant behavior: Areas of research and application. New York: Appleton-Century-Crofts.

Thorndike, E. L. (1932). The fundamentals of learning. New York: Teachers College. doi.org/10.1037/10976-000

Tinklepaugh, O. L. (1928). An experimental study of representative factors in monkeys. Journal of Comparative Psychology, 8, 197–236. doi.org/10.1037/h0075798

Vasconcelos, M., Urcuioli, P. J., & Lionello-DeNolf, K. M. (2007). Failure to replicate the ‘‘work ethic’’ effect in pigeons. Journal of the Experimental Analysis of Behavior, 87, 383–399. doi.org/10.1901/jeab.2007.68-06 PMid:17575903 PMCid:1868581

Vasconcelos, M., & Urcuioli, P. J. (2008a). Certainties and mysteries in the within-trial contrast literature: A reply to Zentall (2008). Learning & Behavior, 36, 23–25. doi.org/10.3758/LB.36.1.23

Vasconcelos, M., & Urcuioli, P. J. (2008b). Deprivation level and choice in pigeons: A test of within-trial contrast. Learning & Behavior, 36, 12–18. doi.org/10.3758/LB.36.1.12 PMid:18318422

Williams, B. A. (1981). The following schedule of reinforcement as a fundamental determinant of steady state contrast in multiple schedules. Journal of the Experimental Analysis of Behavior, 35, 293–310. doi.org/10.1901/jeab.1981.35-293 PMid:16812218 PMCid:1333085

Williams, B. A. (1983). Another look at contrast in multiple schedules. Journal of the Experimental Analysis of Behavior, 39, 345–384. doi.org/10.1901/jeab.1983.39-345 PMid:16812325 PMCid:1347926

Williams, B. A. (1992). Inverse relations between preference and contrast. Journal of the Experimental Analysis of Behavior, 58, 303–312. doi.org/10.1901/jeab.1992.58-303 PMid:16812667 PMCid:1322062

Williams, B. A., & Wixted, J. T. (1986). An equation for behavioral contrast. Journal of the Experimental Analysis of Behavior, 45, 47–62. doi.org/10.1901/jeab.1986.45-47 PMid:3950534 PMCid:1348210