Can Rewards Obviate Stereotype Threat Effects on Mental Rotation Tasks?

We examined whether sex-linked performance differences in Mental Rotation (MR) were obviated by rewards for performing the tasks. MR is typically seen as the domain of men, and therefore women completing the MR tasks likely worked under conditions of stereotype threat, which meant that their performance could vary according to situational variables. Men and women (n = 33 each) performed rotations and provided several self-reflective reports on their performances and background information about their experiences. Half of the participants (within sex) were rewarded for their participation with a gift card. Women’s MR performance was lower than men’s when no reward was given, but equaled it when they were rewarded. The finding was not a function of skill and self-reported effort, and emerged even when a stringent scoring technique was employed. The results suggest that rewards, even if they are not large, may nullify stereotype threat effects on women’s MR.


Can Rewards Obviate Stereotype Threat Effects on Rotation Tasks?
"In the end, it is impossible not to become what others believe you are." Gabriel Garcia Marquez, Memories of My Melancholy Whores, 2005.
Garcia Marquez's words illustrate the basic idea of stereotype threat: the fear that members of clearly defined groups have that they may confirm negative stereotypes about their group (Steele, 1997). Such fears often lead people to perform well below their level of competence (Steele & Aronson, 1995). If people claim group membership, understand the stereotype of the group, and are worried about what others will think of them personally (or of their group as a whole), they can perform below their abilities on a variety of cognitive tasks (Shapiro, 2011; Shapiro & Neuberg, 2007).
Such thoughts may be at the root of the underperformance of Black men in comparison to their White peers on standardized tests (Aronson et al., 1998; Croizet & Claire, 1998), or of White men in similar circumstances when compared to Asian men (who may be seen as gifted in mathematics; Smith & White, 2002). While the earliest studies of how stereotype threat exerts its influence focused on the dimensions of race and ethnicity, threat can also hinder the performance of people based on other dimensions, including academic interest (Seibt & Förster, 2004), spatial relations (Brownlow, Valentine, & Owusu, 2008), social sensitivity (Koenig & Eagly, 2005), socioeconomic status (Croizet & Claire, 1998), and athleticism (Stone, Lynch, Sjomeling, & Darley, 1999).
Sex-linked stereotype threat effects have been well documented in the domain of mathematics performance, an academic area where most believe, quite erroneously, that men are superior to women (Else-Quest, Hyde, & Linn, 2010). When experimental manipulations make clear that mathematics ability is to be tested, women do poorly in relation to men (Carr & Steele, 2009; Schmader, 2002; Spencer, Steele, & Quinn, 1999). The threat-induced decrease in performance on math and other related cognitive tasks can be lessened or eliminated simply by reminding women of the positive, achievement-oriented aspects of their sex role (Schmader, Johns, & Barquissau, 2004), by highlighting another part of a social identity that is not deficient in math (Gresky, Ten Eyck, Lord, & McIntyre, 2005; Schmader et al., 2008), by presenting "peer testimonials" about the ease of the task (Brownlow, Janas, Blake, Rebadow, & Mellon, 2011), by claiming a test is being used simply to gather baseline information (Gonzalez, Blanton, & Williams, 2002), through the presentation of a high-achieving role model (Lesko & Corpus, 2006), and even by noting that women make better students and research subjects than do men (McIntyre et al., 2003). Thus, nullifying a stereotype threat and preventing underperformance is possible through a variety of means.
Stereotype threat affects self-efficacy, which in turn may influence actual task ability. If people perceive that they may fail at a given task, they may avoid the task (Spencer et al., 1999; Steele & Aronson, 1995). If they do engage and perform poorly, lack of efficacy is confirmed, perpetuating doubt about ability in the future (Schmader et al., 2004) and ultimately leading to a lack of interest in that area (Keller & Dauenheimer, 2003). Worse, stereotype threat can undermine actual ability by preventing target persons from encoding and learning necessary information in the first place (R. Rydell, M. Rydell, & Boucher, 2010; Taylor & Walton, 2011).
Areas where women have little confidence and are underrepresented (such as STEM fields including science, engineering, and mathematics; see Shapiro & Williams, 2012) often employ mental rotation. Mental rotation (MR) is the mental transformation of three-dimensional blocks or objects. Men outperform women by being quicker and more accurate at rotations (Bodner & Guay, 1997; Newcombe, 2007; Voyer, Voyer, & Bryden, 1995). Sex-linked differences in MR can be reduced or eliminated via practice, emphasis on accuracy over speed, or by shifting the focus of the tasks from rotations per se to generalized cognitive abilities (Alington, Leaf, & Monaghan, 2001; Sharps, Price, & Williams, 1994; Scali, Brownlow, & Hicks, 2000). Practice on the task (Kass, Ahlers, & Dugger, 1998), practice on spatial games (Cherney, 1998), and classes in mathematics and physical sciences are linked to better MR performance (Brownlow, McPherson, & Acks, 2003), as is athletic activity that employs spatial behavior (Ozel, Larue, & Molinaro, 2004; but only for men, Balentine & Brownlow, 2006).
In sum, stereotype threat provides one possible explanation for women's lack of MR ability in contrast to men. Although stereotype threat can be nullified in many ways, focusing on changing the cognitions or attributions for performance may be key to changing subsequent performance. Rewards may not only change behavior, but may also shift the attribution for behavior from internal to external (Freedman, Cunningham, & Krismer, 1992; Greenberg, Pyszczynski, & Paisley, 1984). Moreover, rewards for research participation imply that the task is difficult, unpleasant, and tedious, thus making attributions easier to externalize (Freedman et al., 1992), and perhaps alleviating concern that individual performance will reflect on an entire group. Thus, the purpose of this experiment was to examine how rewards for participation would influence performance on MR. We hypothesized that women's performance would improve if they were unconcerned or less concerned about stereotype threat, and therefore predicted that reward would improve women's performance in this domain.

Method

Participants
A total of 66 college students (n = 33 men; n = 33 women), aged 17-22, participated for course credit in psychology or sociology courses. Participants were assigned randomly within sex to complete rotation tasks for research credit only, or for credit and a gift card reward, resulting in a 2 × 2 (Sex × Reward) between-participant design. Because all participants earned research credit (a requirement), the gift card to the College bookstore, which was given at the start of the experiment and was part of the recruitment into the experiment, served as the reward.

Dependent Measures-MR Performance
The men and women completed the Purdue Visualization of Rotations Test (PVRT; Bodner & Guay, 1997), a test composed of 20 multi-dimensional block rotations. In this test, each block is paired with an identical shape that has been rotated along two dimensions (such as tilted forward and turned right), and under that pair is another shape by itself. The test then shows five different rotated options for the unpaired shape, with only one correct rotation that matches the rotation pattern of the original paired shapes. The participants could score from 0 to 20 depending on the number of correct responses (raw score). The adjusted score (the raw score minus the number of incorrect responses) was recorded to accommodate guessing (Goldstein, Haldane, & Mitchell, 1990), and ranged from -20 to 20. Time on task (in seconds) was also recorded.
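The two scoring rules described above can be sketched as follows. This is a minimal illustration rather than the authors' scoring procedure: the function name `score_pvrt`, the answer-key format, and the treatment of unanswered items (counted as neither correct nor incorrect) are assumptions, because the text does not say how omissions were handled.

```python
def score_pvrt(responses, answer_key):
    """Return (raw_score, adjusted_score) for one participant.

    responses  : list of chosen options, with None for an unanswered item
                 (hypothetical format; omissions are an assumption here).
    answer_key : list of correct options, same length as responses.
    """
    if len(responses) != len(answer_key):
        raise ValueError("responses and answer_key must have the same length")

    # Raw score: number of correct responses (0 to 20 on a 20-item test).
    raw = sum(1 for r, k in zip(responses, answer_key) if r == k)

    # Incorrect responses: answered items that do not match the key.
    incorrect = sum(
        1 for r, k in zip(responses, answer_key) if r is not None and r != k
    )

    # Adjusted score: raw score minus the number of incorrect responses.
    return raw, raw - incorrect


# Hypothetical participant: 12 correct, 5 incorrect, 3 unanswered.
key = ["A"] * 20
answers = ["A"] * 12 + ["B"] * 5 + [None] * 3
print(score_pvrt(answers, key))  # (12, 7)
```

A participant who answers every item incorrectly receives an adjusted score of -20, and one who answers every item correctly receives 20, which matches the stated range.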

Dependent Measures-Self Reports of Skill, Efficacy, Effort, and Handicapping
Self-efficacy and performance expectations may positively influence performance under stereotype threat (Smith, 2006), but lack of efficacy and a tendency to self-handicap may increase stereotype-threat-based underperformance (Stone, 2002). Thus, participants completed several measures of their skill, efficacy, and effort, and of their judgments of the task and beliefs about it, using separate 7-point bipolar scales, each anchored by opposite-meaning endpoints. One such question asked how hard participants tried, anchored by 1 (I didn't try very hard) and 7 (I tried very hard). The specific questions are described below in the section titled "Data Reduction".

Dependent Measures-Background
Background measures were taken because certain activities give people practice with MR tasks (Cadinu et al., 2003; Voyer & Isaacs, 1993). These measures included the number of science courses, particularly organic chemistry (Bodner & Guay, 1997), and experience with mathematics, dance, organized sports, and art classes. Self-reports about abilities in these areas were assessed on 7-point scales (endpoints labeled 1 not good at all to 7 very good). Participants reported the number of hours they played video/interactive games (0-2, 3-6, 7-10, 11-14, 15+).

Procedure
After obtaining consent, we told participants that "you are about to complete several problem solving tasks that involve rotating multi-dimensional blocks". These instructions were also provided on the cover page of the dependent measures booklet. Those in the reward group were told, "for your time and efforts, we are giving you a gift card to the College Bookstore; it's yours to keep for participation." The gift card was then given to the participant (or not), sample rotations were provided, questions were answered, and the participants were left alone in the cubicle with instructions to ring the bell once they started the first MR and to ring the bell again when they finished (timed, in seconds, using a stopwatch). The PVRT booklet was then removed and the men and women were given a packet of background assessment and self-efficacy scales. Reports of self-efficacy and self-handicapping were taken in two orders to reduce order effects. Debriefing occurred at a later time, when all the data had been collected.

Data Analysis
Each major performance measure was entered separately into 2 × 2 (Sex × Reward) ANOVAs. Then, self-report measures of performance and background were subjected to separate factor analyses; resultant factors related to performance were then employed as covariates in analyses to examine whether efficacy and/or background mitigated the joint influence of sex and reward on performance.
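For readers unfamiliar with the 2 × 2 between-participants ANOVA named above, the sketch below shows how the three F ratios (two main effects and the interaction) are formed. It is an illustration under stated assumptions, not the analysis that was run: the function `two_way_anova` and the scores are hypothetical, and the sketch requires equal cell sizes. With 33 participants per sex the actual cells cannot all have been equal in size, so the real analysis would need software that handles unbalanced designs.

```python
from statistics import mean


def two_way_anova(cells):
    """Balanced two-way between-participants ANOVA (illustrative sketch).

    cells maps (level_a, level_b) -> list of scores, with every combination
    present and the same number of scores in each cell.
    """
    a_levels = sorted({a for a, _ in cells})
    b_levels = sorted({b for _, b in cells})
    n = len(next(iter(cells.values())))
    if any(len(scores) != n for scores in cells.values()):
        raise ValueError("this sketch only handles equal cell sizes")

    grand = mean(x for scores in cells.values() for x in scores)
    cell_mean = {key: mean(scores) for key, scores in cells.items()}
    a_mean = {a: mean(cell_mean[(a, b)] for b in b_levels) for a in a_levels}
    b_mean = {b: mean(cell_mean[(a, b)] for a in a_levels) for b in b_levels}

    # Sums of squares for each main effect, the interaction, and error.
    ss_a = n * len(b_levels) * sum((a_mean[a] - grand) ** 2 for a in a_levels)
    ss_b = n * len(a_levels) * sum((b_mean[b] - grand) ** 2 for b in b_levels)
    ss_ab = n * sum(
        (cell_mean[(a, b)] - a_mean[a] - b_mean[b] + grand) ** 2
        for a in a_levels
        for b in b_levels
    )
    ss_error = sum(
        (x - cell_mean[key]) ** 2 for key, scores in cells.items() for x in scores
    )

    df_a, df_b = len(a_levels) - 1, len(b_levels) - 1
    df_error = len(a_levels) * len(b_levels) * (n - 1)
    mse = ss_error / df_error  # mean square error, the denominator of each F
    return {
        "F_a": (ss_a / df_a) / mse,
        "F_b": (ss_b / df_b) / mse,
        "F_ab": (ss_ab / (df_a * df_b)) / mse,
        "df_error": df_error,
        "MSE": mse,
    }


# Hypothetical scores, two per cell, invented purely for illustration.
demo_scores = {
    ("women", "no reward"): [1.0, 3.0],
    ("women", "reward"): [5.0, 7.0],
    ("men", "no reward"): [5.0, 7.0],
    ("men", "reward"): [5.0, 7.0],
}
result = two_way_anova(demo_scores)
print(result)
```

In this toy data set only one cell differs from the rest, so the two main effects and the interaction happen to receive the same F ratio; real data would generally separate them.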

Effect of Sex and Incentives on MR Performances
To examine how rewards influenced the MR of women and men, we entered the raw scores, adjusted scores, and time on task (in seconds) separately into 2 × 2 (Sex × Incentive) ANOVAs. The means and standard deviations from these analyses can be seen in Table 1.
Adjusted scores showed the same pattern, with no main effect of incentive, F(1, 62) < 1.00, MSE = 48.32, ns, or of sex, F(1, 62) = 2.99, p = .09. However, the Sex × Incentive interaction produced a marginally significant effect, F(1, 62) = 3.66, p = .06, $\eta_p^2$ = .06. The adjusted scores had more variability than did the raw scores, thus the effect was only marginal, but the means followed the same pattern as the raw score means. Women who were rewarded (M = 5.94, SD = 6.78) scored on par with men who were rewarded (M = 5.62, SD = 8.33). In contrast, when stereotype threat was in the air and no reward was offered, men (M = 9.18, SD = 6.40) outperformed women (M = 2.94, SD = 6.16). There were no main effects of sex or incentive, nor was there an interaction between sex and incentive, for time to complete the task, all Fs(1, 62) ≤ 1.24, MSE = 53739.22, all ps ≥ .24.
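The interaction pattern in the adjusted-score means can be made concrete with a short calculation. The sketch below uses only the four cell means reported in the text; the helper names `sex_gap` and `interaction_contrast` are ours, and the calculation is descriptive arithmetic on the means, not a significance test.

```python
# Adjusted-score cell means as reported in the text.
adjusted_means = {
    ("men", "no reward"): 9.18,
    ("women", "no reward"): 2.94,
    ("men", "reward"): 5.62,
    ("women", "reward"): 5.94,
}


def sex_gap(means, reward_level):
    """Men's mean minus women's mean within one reward condition."""
    return means[("men", reward_level)] - means[("women", reward_level)]


def interaction_contrast(means):
    """Difference between the sex gaps in the two reward conditions."""
    return sex_gap(means, "no reward") - sex_gap(means, "reward")


print(round(sex_gap(adjusted_means, "no reward"), 2))  # 6.24
print(round(sex_gap(adjusted_means, "reward"), 2))     # -0.32
print(round(interaction_contrast(adjusted_means), 2))  # 6.56
```

The gap of roughly six points favoring men without a reward shrinks to less than half a point, slightly favoring women, when a reward is given.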

Data Reduction
Reduction of the data was necessary because multiple measures of the same constructs may have been taken. A factor analysis with varimax rotation was calculated on judgments of task liking, efficacy, self-handicapping, and perception of performance. The analysis produced six factors that accounted for 70.04% of the variance. The factors and their loadings are reported in Table 2. These factors were named Skill (including ability, enjoyment, positive performance evaluation, lack of frustration, understanding of the difficulty of the task, and perception that the task was not "tricky"), Self-Handicapping (affirmative judgments of recent life pressure, school stress, and feeling rushed during the day of the experiment), Effort (belief the test is valid, amount of effort put forth), Evaluation Apprehension (nervousness, concern for evaluation), Task (Dis)Liking (lack of enjoyment of task), and Reward Pressure (pressure due to the presence of reward).
A second factor analysis with varimax rotation was performed on measures of self-reported background experiences in the task. The factor analysis was performed to reduce and combine overlapping measures of sports experience, perceived math and science skill, and artistic abilities. The analysis produced

Relationship of Self-Reports and Background Measures to MR Performance
Factor means, after reverse scoring as needed, were calculated. These means were then correlated with the performance measures (MR time, raw score, and adjusted score). The relationships among self-reported efficacy/enjoyment factors, academic and sports background, and performance can be seen in Table 4. In sum, self-reported skill and effort were positively related to raw score, and sports background was negatively related to raw score. A similar pattern appeared with adjusted score, with math/science skill also showing a positive correlation with that measure. Finally, only self-reported effort was related (positively) to time.
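Reverse scoring and factor means on 1-to-7 scales can be sketched as below. The item names and the choice of which items are reverse-keyed are hypothetical, chosen only to resemble the Skill factor described earlier; the text does not list which items were reversed.

```python
from statistics import mean

SCALE_MIN, SCALE_MAX = 1, 7  # endpoints of the bipolar rating scales


def reverse_score(rating):
    """Flip a rating so that 1 becomes 7, 2 becomes 6, and so on."""
    if not SCALE_MIN <= rating <= SCALE_MAX:
        raise ValueError("rating must lie between 1 and 7")
    return SCALE_MIN + SCALE_MAX - rating


def factor_mean(ratings, reversed_items=()):
    """Mean of the items on one factor after reverse scoring where needed.

    ratings        : dict mapping item name -> rating (1 to 7).
    reversed_items : names of items keyed in the opposite direction
                     (an assumption; not specified in the source).
    """
    return mean(
        reverse_score(value) if name in reversed_items else value
        for name, value in ratings.items()
    )


# Hypothetical ratings from one participant on four Skill-type items.
skill_ratings = {"ability": 6, "enjoyment": 5, "frustration": 2, "tricky": 3}
print(factor_mean(skill_ratings, reversed_items={"frustration", "tricky"}))  # 5.5
```

Reversing the negatively worded items first ensures that a higher factor mean always indicates more of the construct, so the sign of each correlation with performance can be read directly.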

Influence of Self-Reports of Efficacy and Background as Mediators of Sex and Nullification Effects on MR Performance
Factors related to each performance measure were employed as covariates in 2 × 2 (Sex × Reward) ANCOVAs on time, raw score, and adjusted score in order to examine whether any would change or eliminate the patterns described previously. For raw score, there were three covariates: sports skill, effort, and self-reported skill. Of these, self-reported skill was significant, F(1, 58) = 10.04, MSE = 9.54, p = .002, $\eta_p^2$ = .15, but the others were not. The ANCOVA results, in parallel with the ANOVA findings, showed no main effects of sex or reward; however, the interaction was again significant, F(1, 58) = 4.29, p = .04, $\eta_p^2$ = .07. Three covariates were significant in the analysis of adjusted scores (all dfs = 1, 57; MSE = 35.33): sports skill (F = 4.07, p = .048, $\eta_p^2$ = .07), self-reported skill (F = 5.63, p = .021, $\eta_p^2$ = .09), and math/science ability (F = 5.15, p = .027, $\eta_p^2$ = .08). Despite the significant covariates, the pattern of results for adjusted scores remained as before, with no main effects of sex or reward. However, there was a significant interaction, F(1, 57) = 5.35, p = .024, $\eta_p^2$ = .09.
Finally, effort was a significant covariate of time to complete the task, F(1, 61) = 5.53, MSE = 50082.92, p = .02, $\eta_p^2$ = .08, but neither the main effects nor the interaction in the ANCOVA were significant.
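The partial eta squared values that accompany the ANCOVA results can be recovered from each F ratio and its degrees of freedom using the standard identity partial eta squared = (F × df_effect) / (F × df_effect + df_error). The sketch below applies that identity to F values reported in the text; the function name `partial_eta_squared` is ours.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared computed from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (label, F, df_effect, df_error) taken from the reported ANCOVA results.
reported = [
    ("raw score: self-reported skill covariate", 10.04, 1, 58),
    ("raw score: Sex x Reward interaction", 4.29, 1, 58),
    ("adjusted score: Sex x Reward interaction", 5.35, 1, 57),
    ("time: effort covariate", 5.53, 1, 61),
]

for label, f_value, df_effect, df_error in reported:
    eta = partial_eta_squared(f_value, df_effect, df_error)
    print(f"{label}: {eta:.2f}")
```

Rounded to two decimals, these come out to .15, .07, .09, and .08, in agreement with the effect sizes given for the corresponding tests.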

Discussion
The results revealed that reward can obviate MR performance differences between men and women. The influence of the reward as a means to equalize women's MR performance with men's remained when various background skills and self-reported effort were held constant, and was not a function of self-handicapping, evaluation apprehension, incentive, task liking, math/science skill, reward pressure, or certain background experiences. As per previous research, men outperformed women when there was no reward and the stereotype threat was in the air (Newcombe, 2007; Scali & Brownlow, 2001; Sharps et al., 1994).
That women and men performed on par when rewarded shows that rewards worked differently for each, essentially increasing women's performance and decreasing men's, and suggests that rewards might nullify the deleterious effects of stereotype threat on women's MR performance, assuming that women but not men were working under threat. Rewards may have affected women and men differently because incentives for research participation connote that a task might be hard (Freedman et al., 1992), and perhaps women (who may have already thought the task was going to be unpleasant) were not further negatively influenced, but men (for whom the task should not have been seen as onerous) did come to see the task that way. Like stereotype threat mediators, reward can positively and negatively affect performance. How an individual views the incentive can influence how well he or she performs, much like the way self-efficacy can affect performance. If a person is offered money, then performance expectations may increase (Ostrove, 1978). More importantly, research enticements may make attributions for performance easier to externalize and decrease internal attributions for behavior (Freedman et al., 1992), a situation which may have benefited women by reducing fear of low performance.
The results also suggest that the reward may have nullified the stereotype threat for women by decreasing the intrusive thoughts that often occur for the targets of stereotype threat. That spatial tasks are generally thought of as "male" may have allowed the men to avoid the evaluation apprehension and intrusive thoughts that women may have experienced due to stereotype threat (Ostrove, 1978; Raty & Kasanen, 2007; Schmader, 2002). The women who received a reward may have been prevented from thinking about the task by being given the reward upfront, or perhaps women simply perceived that they needed to work harder to justify their gift. These women could make external attributions but did not do so because they experienced no evaluation apprehension (which decreases performance; Brodish & Devine, 2009). As a result, it is possible that no intrusive thoughts about performance occurred and the stereotype threat toward their sex was ignored, resulting in no sex differences on the MR tasks when reward was given.
That reward can at least temporarily reduce sex-linked differences in MR has implications for understanding how women may come to change their evaluations of their personal abilities. If persons working under stereotype threat have doubt about their abilities on a task, those doubts can have a negative influence on self-perceived future abilities on related tasks (Schmader et al., 2004). Such self-doubt contributes to a lack of interest in pursuing a given field even if ability is present (Keller & Dauenheimer, 2003), because others continue to hold the stereotype (Cheryan, 2012). Reward may be a mechanism to keep a level of engagement that might otherwise be lost. Although paying people to complete cognitive and technical tasks in an institution is not a practical solution to the problems associated with stereotype threat, there are other forms of reward, including explicit encouragement, that may function the same way. Such explicit acknowledgements can have consequences for women's participation in STEM fields where their presence is lacking (Shapiro, 2012). Whether rewards work to obviate group-related differences under stereotype threat on other cognitive tasks is still an open question.

Table 1.
Means, SDs, and Fs for MR performance measures as a function of sex and reward.

Table 2.
Results of factor analyses on self-reported efficacy measures.
Note: N = 66. Factor analysis is the result of principal component analyses with varimax rotation.

Table 3.
Results of factor analyses on background measures.