Multi-Informant Test Anxiety Assessment of Adolescents

A total of 263 junior and senior high school students (grades 7 through 12; ages 12 to 19), with relatively more informants identifying as female (57.4%) than male (42.6%) and more junior high school students (68.3%) than senior high school students (31.7%), along with 267 parents and 167 teachers, responded to student, parent, and teacher versions of the German Test Anxiety Inventory (TAI-G; Hodapp & Benson, 1997). All reliabilities for all TAI-G scales for all three samples were above .70. The resulting data were fitted to two-, three-, and four-factor models of test anxiety based on theoretical and empirical evidence. The four-factor model (worry, emotion, distraction, lack of confidence) of the reduced (17-item) version of the TAI-G (Hodapp & Benson, 1997) yielded the best-fitting model for students (comparative fit index = .97; residual mean square = .042), parents (comparative fit index = .95; residual mean square = .073), and teachers (comparative fit index = .96; residual mean square = .080), thus providing very strong support for the proposed model. Sex, age, grade, and informant differences are presented and discussed. In conclusion, this study supports further research on, and use of, a multi-informant assessment system of test anxiety.


Introduction
The construct of Test Anxiety (TA) has undergone considerable evolution since Sarason and Mandler's (1952) early research demonstrating a link between anxiety and poor test performance. This foundational study was followed by the development of the Test Anxiety Scale for Children (TASC; Sarason et al., 1960), which measured TA among children as a unitary construct. Follow-up research suggested that TA was a multidimensional construct that could be divided into two fundamental components: Worry and Emotionality (Liebert & Morris, 1967). Worry represented the cognitive concerns relating to failure and the consequences of failure, whereas Emotionality represented the physiological symptomatology associated with anxiety (e.g., heart racing). Later, several studies supported the inclusion of Cognitive Obstruction or Cognitive Interference (McKeachie, 1984; Swanson & Howell, 1996; Tryon, 1980; Wine, 1971). Sarason (1984) agreed, claiming that both worry (i.e., preoccupation with failure, negative self-talk) and cognitive interference (i.e., disruptive/blocking thoughts) could more accurately describe the cognitive domain of TA. As a result, this factor was represented in Sarason's (1984) Reactions to Tests (RTT) scales, developed through factor analysis on a sample of undergraduate students. In an effort to further develop the construct of TA, Carver and Scheier (1984) proposed that Lack of Confidence should be included in the TA framework. Eventually, these contributions led to the development of a commonly utilized and accepted measure of TA in recent research: the German Test Anxiety Inventory (TAI-G; Hodapp, 1991, 1995).

Assessment of Test Anxiety
Assessment of TA using scales designed to measure factors known to comprise the construct of TA has relied exclusively on self-reports. This practice continues despite the broad use of multi-informant procedures in the field of anxiety assessment at large. According to Zeidner (1998), there is some evidence that children report more internalizing symptoms than parents (Angold et al., 1987; Edelbrock et al., 1986). Reliance on self-reporting, however, is not the norm in the broad scheme of anxiety measurement. Typically, anxious symptomatology is assessed within a multi-informant assessment framework, whereby self-reports are compared to observations made by parents, teachers, and other sources (Jensen, Rubio-Stipec, Canino, Bird, Dulcan, Schwab-Stone et al., 1999; Kazdin, 1986; Kendall & Flannery-Schroeder, 1998; Ollendick, 1986; Grills & Ollendick, 2003; Comer & Kendall, 2004). As outlined by Brown-Jacobsen, Wallace, and Whiteside (2011), the majority of researchers and clinicians support the utility of multi-informant approaches to general anxiety assessment, as they are held to enhance diagnostic accuracy and direct more informed treatment choices compared to self-evaluations alone. Despite this endorsement of multi-informant assessment for anxiety, TA assessment continues to rely upon self-evaluation procedures.

Aim of the Study
The current study was primarily undertaken to investigate the possibility of establishing a multi-informant assessment framework for TA. Hence, the aim was to examine the construct validity of TA across multiple raters. A secondary aim of this study was to examine sex, age, grade, and informant differences with respect to TA. The major questions to be addressed in this study were: (1) Is the factor structure of the English version of the TAI-G child self-report maintained within a student sample from grades 7 through 12, as well as across parent and teacher ratings of grade 7 through 12 students' TA? (2) Do TAI-G subscale and Total scores differ as a function of demographic variables (i.e., age, sex, grade)? and (3) Do TAI-G subscale and Total scores differ as a function of type of Informant (i.e., student, parent, teacher)?

Participants
The sample for the study consisted of students in grades 7 through 12 from one school district. Participants were randomly selected from a volunteer pool. When possible, the study also included one of each student's legal guardians and one of their teachers. The final analysis was conducted with the participation of 263 students (37.7%), 267 parents (38.3%), and 167 teachers (23.9%). Demographic characteristics of the student sample were determined for sex, age, and grade. This analysis revealed that relatively more females (approximately 57.4%) than males (approximately 42.6%) took part in the study. Student ages ranged from 12 (7.2%) to 19 (1.1%). Student grades ranged from 7 (20.6%) to 12 (11.1%). Participating schools included 10 junior high schools (i.e., grades 7 to 9) and 5 high schools (i.e., grades 10 to 12). As such, younger/junior high students (approximately 68.3%) were represented slightly more than twice as often as older/senior high students (approximately 31.7%). Demographic characteristics of age and sex were not determined for parents and teachers.

Procedure
Once permission from parents, students, and teachers was obtained, one student from each class was randomly selected for participation. The homes of participating students and their parents were contacted by phone by the first researcher and two research assistants. The student, teacher, and parent scales were administered over the telephone after a session of practice trials during which all research assistants and the researcher agreed upon a specific framework within which to make introductions and administer the scales via telephone. Telephone administration was a condition required by the school board. This requirement ensured that any disruption of student time during school hours was eliminated.
Student TA was assessed using the English version of the German Test Anxiety Inventory (TAI-G; Hodapp & Benson, 1997; see Table 1). Studies have suggested that this instrument is psychometrically sound. Confirmatory factor analysis (Hodapp & Benson, 1997) supported the Liebert and Morris (1967) dimensions of TA (i.e., Worry and Emotionality), as well as Sarason's (1984) Interference and Carver and Scheier's (1984) Lack of Confidence, among a sample of university students. The TAI-G is purported to have strong psychometric properties among college-aged students, as well as mixed samples consisting of college-aged and adolescent students, with each of the four factors (i.e., Worry, Emotionality, Interference, and Lack of Confidence) demonstrating reliability and validity among German and American populations (Hodapp, 1991, 1995; Hodapp & Benson, 1997; Keith et al., 2003; Musch & Broder, 1999; Stober, 2004). Total scores and subscales demonstrate alpha coefficients ranging from .79 to .94, providing adequate evidence of internal consistency (Hodapp, 1991). For this study, the wording of each item of the TAI-G self-report was slightly altered by the researcher to develop the parent and teacher versions. For example, instead of "I worry," the item states "your child worries" or "this student worries" (permission provided by V. Hodapp through email correspondence).
Table 1 (TAI-G items)
10. I ask myself whether my performance will be good enough.
11. I am preoccupied by other thoughts which distract me.
13. I know that I can rely on myself.
14. I think about how important it is for me to receive a good result.
21. I am concerned about my grades.
22. I tremble with fear.
23. I worry that something might go wrong.
24. My concentration is interrupted by interfering thoughts.
26. I think that I will succeed.
27. I think about what will happen if I don't do well.
29. I am convinced that I will do well.
30. I have the feeling everything is so difficult for me.

Descriptive Analyses
For the current study, Table 2 presents descriptive statistics (i.e., number of participants, raw score means, standard deviations, ranges) for each of the TAI-G subscales (i.e., Worry, Emotionality, Lack of Confidence, and Interference) as well as the TAI-G total score for each of the three samples (i.e., students, parents, and teachers). To assess the normality of the scales, skewness and kurtosis values were computed. Skewness and kurtosis values between -2 and +2 are considered acceptable (Bachman, 2004). All of the skewness and kurtosis values were well within the acceptable range for all TAI-G scales for all samples. Cronbach's alpha internal consistency reliabilities for the TAI-G scales are presented in Table 3. Reliabilities should be above .70 to be considered acceptable (Cronbach, 1951). All reliabilities for all TAI-G scales for all three samples were above .70.
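The screening steps described above can be illustrated with a minimal sketch. It assumes the standard formula for Cronbach's alpha and moment-based (population) estimates of skewness and excess kurtosis; statistical packages often apply small-sample corrections, so values may differ slightly from those in Tables 2 and 3. The function names and the `demo_ratings` data are hypothetical and are not taken from the study.

```python
from statistics import mean, variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha; rows are respondents, columns are scale items."""
    k = len(item_scores[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in item_scores]) for j in range(k)]
    # Sample variance of the summed scale score.
    total_var = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def skewness(values):
    """Moment-based skewness (no small-sample correction)."""
    m = mean(values)
    n = len(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def excess_kurtosis(values):
    """Moment-based excess kurtosis (a normal distribution gives 0)."""
    m = mean(values)
    n = len(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m4 = sum((v - m) ** 4 for v in values) / n
    return m4 / m2 ** 2 - 3


def within_acceptable_range(value, bound=2.0):
    """Apply the -2 to +2 screening rule cited in the text."""
    return -bound <= value <= bound


# Hypothetical ratings: five respondents on a four-item subscale.
demo_ratings = [[3, 4, 3, 4], [2, 2, 3, 2], [4, 4, 4, 3], [1, 2, 1, 2], [3, 3, 4, 4]]
demo_totals = [sum(row) for row in demo_ratings]
print(cronbach_alpha(demo_ratings) > 0.70)
print(within_acceptable_range(skewness(demo_totals)))
print(within_acceptable_range(excess_kurtosis(demo_totals)))
```

In practice each subscale would be screened separately for each informant sample, which is the structure implied by Tables 2 and 3.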

Confirmatory Analyses
Confirmatory factor analyses (CFAs) using LISREL 8.8 were applied to the student sample to determine whether the 30-item four-factor model could be replicated among a younger, school-age sample. The CFA procedure specified a model with four latent factors, with each survey item loading on its respective factor. This procedure was repeated across parent and teacher TAI-G ratings of student TA in order to test the consistency of the four-factor structure within a multi-informant assessment framework. Table 4 presents the standardized factor loadings for the 30-item four-factor solutions for each sample. As also indicated by the model-fit statistics in Table 5, the student sample provided the best fit, followed by the parent sample, and finally the teacher sample. The slightly poorer fit in the teacher sample was also evidenced by less agreement in the factor loadings for this sample. Nevertheless, the four-factor structure was reasonable in all three samples. Table 5 depicts the results of CFAs applied to examine the four-factor structure of the 30-item TAI-G. CFAs were also applied to examine alternative models of TA, including a four-factor 17-item version of the TAI-G (Hodapp & Benson, 1997) and other reduced-factor models (e.g., Worry and Emotion; Worry, Emotion, and Distraction). Good model fit was determined when the RMSEA was smaller than .08 and the CFI was larger than .95, although CFI values of at least .90 can be considered acceptable (Browne & Cudeck, 1993; Hu & Bentler, 1999; Wen, Hau, & Marsh, 2004). Although not considered one of the more commonly used fit indices, GFIs were also included and considered acceptable when values of at least .90 were obtained (Byrne, 2001; Shevlin & Miles, 1999). As depicted in Table 5, the RMSEA criterion was met for the student sample when CFA tested the four-factor model on the 30-item TAI-G results (RMSEA = .068); however, this criterion was not met for the parent and teacher samples (parents: RMSEA = .093; teachers: RMSEA = .110). The CFIs for the TAI-G for each sample ranged from .91 to .92, failing to meet the recommended criterion of .95 for a good fit, but still within the acceptable range. CFAs were also applied to alternative (i.e., reduced-item and reduced-factor) versions of the TAI-G in order to test model fit. Fit indices for these CFAs are also presented in Table 5. This analysis revealed that a 17-item four-factor TAI-G model, also developed by Hodapp and Benson (1997), yielded the best-fitting model overall, meeting the suggested RMSEA (≤.08) and CFI (≥.95) criteria across all three samples. Since the 30-item four-factor version of the TAI-G is the focus of this study and had CFIs for all participants within an acceptable range, subsequent analyses were based primarily upon this version of the TAI-G. However, post hoc analyses for the 17-item model resulted in virtually identical findings.
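The fit criteria above can be expressed as a small decision rule. The sketch below is illustrative only: it assumes the common point-estimate formula for RMSEA based on the model chi-square, degrees of freedom, and sample size (software packages differ on whether N or N - 1 is used), and the function names are hypothetical. The RMSEA values passed to the rule are those reported in the text for the 30-item model; the chi-square input is invented for illustration.

```python
import math


def rmsea_from_chi_square(chi_square, df, n):
    """RMSEA point estimate, assuming the sqrt(max(chi2 - df, 0) / (df * (N - 1))) form."""
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


def meets_rmsea_criterion(rmsea, cutoff=0.08):
    """RMSEA at or below the cutoff counts as meeting the criterion."""
    return rmsea <= cutoff


def cfi_band(cfi, good=0.95, acceptable=0.90):
    """Classify a CFI value using the thresholds described in the text."""
    if cfi >= good:
        return "good"
    if cfi >= acceptable:
        return "acceptable"
    return "below acceptable"


# RMSEA values reported in the text for the 30-item four-factor model.
reported_rmsea = {"students": 0.068, "parents": 0.093, "teachers": 0.110}
for sample, value in reported_rmsea.items():
    print(sample, meets_rmsea_criterion(value))

# The reported CFIs ranged from .91 to .92, which falls in the acceptable band.
print(cfi_band(0.91), cfi_band(0.92))

# Hypothetical chi-square, df, and N, shown only to illustrate the formula.
print(round(rmsea_from_chi_square(chi_square=200.0, df=100, n=101), 3))
```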
One-way ANOVAs comparing 12-, 15-, and 18-year-olds were conducted on the TAI-G scales as rated by students, parents, and teachers. These groupings represent participant age comparisons between the youngest, those in the middle, and the oldest. One-way ANOVAs were first conducted for the student self-rated TAI-G subscales and Total scores. The overall ANOVAs yielded a significant difference only for the Emotionality subscale. Post-hoc comparisons between specific groups were then conducted for Emotionality. The analysis yielded significantly higher Emotionality scores for the 12-year-old students (M = 16.89, SD = 4.24) as compared to the 15-year-old students (M = 13.90, SD = 3.94; p < .05, Bonferroni-adjusted). No other comparisons showed significant differences between any age groups for the student ratings. One-way ANOVAs were then conducted for the parent-rated and teacher-rated TAI-G subscales and Total scores. The comparisons yielded no significant differences between any age categories for any of the TAI-G subscales or the Total TAI-G scale for either the parent or teacher ratings.
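As an illustration of the one-way ANOVA used for the age comparisons, the sketch below computes the F statistic and its degrees of freedom from between-group and within-group sums of squares. It is a generic textbook computation rather than the study's analysis; the function name and the scores are hypothetical, and obtaining a p-value would additionally require the F distribution from a statistics package.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of score lists."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Between-group variability: group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group variability: scores around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical Emotionality scores for three age groups (not study data).
age_12 = [18, 15, 17, 20, 16]
age_15 = [13, 14, 12, 16, 15]
age_18 = [15, 17, 14, 16, 18]
f_stat, df_b, df_w = one_way_anova_f([age_12, age_15, age_18])
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```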
A repeated measures analysis of variance (ANOVA) was conducted to compare TAI-G scores across students, parents, and teachers. A repeated measures analysis was necessary because the different informants each rated the same student; hence, each student had a student (self) rating, a teacher rating, and a parent rating. As mentioned earlier, all TAI-G scales were assessed to be sufficiently normally distributed according to their skewness and kurtosis values; hence, the variables were appropriate for use in the ANOVA. Mauchly's test of sphericity, which needs to be assessed for the within-subjects ANOVA, was also conducted for each of the ANOVAs. Sphericity was not violated for Worry, Interference, or Total Score. It was violated for Emotionality and Lack of Confidence. When sphericity is violated, the degrees of freedom need to be modified by using a correction factor such as the Greenhouse-Geisser epsilon. This correction was applied to the results for Emotionality and Lack of Confidence. However, it should also be noted that the Greenhouse-Geisser results were exactly the same as the results when sphericity is assumed.
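To make the correction concrete, the following sketch computes the Greenhouse-Geisser epsilon from the covariance matrix of the repeated measures (here, the three informant ratings) and scales the degrees of freedom accordingly. It assumes the standard Box/Greenhouse-Geisser formula; the function names and the example covariance matrices are hypothetical and are not derived from the study data.

```python
def covariance_matrix(scores):
    """Sample covariance matrix; rows are students, columns are informants."""
    n = len(scores)
    k = len(scores[0])
    means = [sum(row[j] for row in scores) / n for j in range(k)]
    return [
        [
            sum((row[i] - means[i]) * (row[j] - means[j]) for row in scores) / (n - 1)
            for j in range(k)
        ]
        for i in range(k)
    ]


def greenhouse_geisser_epsilon(cov):
    """Epsilon ranges from 1 / (k - 1) (maximal violation) to 1 (sphericity holds)."""
    k = len(cov)
    diag_mean = sum(cov[i][i] for i in range(k)) / k
    grand_mean = sum(sum(row) for row in cov) / (k * k)
    row_means = [sum(row) / k for row in cov]
    sum_sq = sum(v * v for row in cov for v in row)
    numerator = (k * (diag_mean - grand_mean)) ** 2
    denominator = (k - 1) * (
        sum_sq - 2 * k * sum(r * r for r in row_means) + (k * grand_mean) ** 2
    )
    return numerator / denominator


def corrected_df(epsilon, k, n):
    """Scale the within-subjects effect and error df by epsilon."""
    return epsilon * (k - 1), epsilon * (k - 1) * (n - 1)


# Hypothetical covariance matrices for student, parent, and teacher ratings.
spherical = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
unequal = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 4.0]]
print(greenhouse_geisser_epsilon(spherical))
print(greenhouse_geisser_epsilon(unequal))
print(corrected_df(greenhouse_geisser_epsilon(unequal), k=3, n=50))
```

When epsilon is close to 1, the corrected degrees of freedom are nearly unchanged, which is one way corrected and uncorrected results can end up leading to the same conclusions.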
The repeated measures ANOVA yielded significant differences between informants on the Emotionality, Lack of Confidence, and Interference subscales (all ps < .001). Post-hoc comparisons were conducted to determine the direction of effects among informants. All p-values for post-hoc comparisons were corrected in SPSS using the Bonferroni adjustment for multiple comparisons.
Post-hoc analysis for student Emotionality revealed that student and parent ratings were not significantly different from one another (p > .05), but both were significantly higher than the ratings of teachers (ps < .01). This pattern of results was replicated for the Lack of Confidence scale, with higher parent and student ratings compared to teachers (ps < .001), but no significant differences between parents and students themselves (p > .05). This pattern was, again, replicated for the Interference scale, such that the student and parent ratings yielded significantly higher scores than teacher ratings (ps < .01), but student and parent ratings were not significantly discrepant (p > .05).
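The logic of the Bonferroni-adjusted pairwise comparisons can be sketched as follows. This is an illustration under assumptions rather than a reproduction of the SPSS procedure: it computes a paired t statistic for each informant pair and applies the simple Bonferroni rule of multiplying each p-value by the number of comparisons (capped at 1). The function names, ratings, and unadjusted p-values are hypothetical.

```python
import math
from statistics import mean, stdev


def paired_t(ratings_a, ratings_b):
    """Paired t statistic and df for two informants rating the same students."""
    diffs = [a - b for a, b in zip(ratings_a, ratings_b)]
    n = len(diffs)
    t_stat = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t_stat, n - 1


def bonferroni(p_values):
    """Multiply each p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical Emotionality ratings of the same six students (not study data).
student = [16, 14, 18, 12, 15, 17]
parent = [15, 15, 17, 13, 14, 17]
teacher = [12, 11, 14, 10, 12, 13]

for label, (a, b) in {
    "student vs parent": (student, parent),
    "student vs teacher": (student, teacher),
    "parent vs teacher": (parent, teacher),
}.items():
    t_stat, df = paired_t(a, b)
    print(f"{label}: t({df}) = {t_stat:.2f}")

# Hypothetical unadjusted p-values for the three pairwise comparisons.
print(bonferroni([0.40, 0.002, 0.003]))
```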

Discussion
From this study, it was determined that the four-factor model of TA is best applied to the sample within a multi-informant system of assessment, using a reduced 17-item version of the TAI-G. Future research should aim to corroborate these findings and develop normative data for student TA across multiple raters. In order to determine internal consistency, Cronbach's alpha (Cronbach, 1951) was calculated for the items of each subscale and Total scores across all three informant samples. All TAI-G subscales across all informant samples exceeded the criterion for acceptable reliability of .70 (Cronbach, 1951), remaining consistent with the range of alpha coefficients (.79 to .94) reported by the author for Total scores and subscales (Hodapp, 1991).
Moreover, the results of the self-rated TAI-G in this study provide support for female susceptibility to TA with regard to two of the four factors (i.e., Worry and Emotionality), consistent with findings from research that has utilized the traditional two-factor model of TA (Liebert & Morris, 1967). Analyses of sex effects across all informants revealed concordance between students and parents with respect to their identification of test anxiety symptoms for both males and females. This student-parent concordance suggests that parents are able to accurately gauge differences between males and females with regard to TA symptoms. Therefore, clinical decisions and insights regarding gender that are drawn from concordant parent and student data would likely be well founded. Equally important, however, are discordant reports. For example, this study contributes the unique and unexpected finding of teacher endorsement of male susceptibility to symptoms of Cognitive Interference. This is interesting because teachers provide the only analysis of TA symptoms that is based on first-hand observation, as well as a perspective that has never been studied in the field of TA. The possibility that males are more prone than females to developing Cognitive Interference represents a major shift from the traditional association between females and anxiety in general. Such discordant information is also very important in clinical practice, as it can be used as an indicator of possible informant biases, such as self-preservation, avoidance, and resistance, in informants' ratings.
Results associated with age and grade level provided substantiation of the four-factor model of TA across a sample of English-speaking adolescents. Previous research substantiating the four-factor model was conducted on mixed age groups of American and German samples in different educational environments. The results of the student self-rated TAI-G analysis revealed that the youngest students demonstrated higher Emotionality compared to those in their mid-teens. The oldest teens, however, demonstrated higher Lack of Confidence compared to students in their mid-teens. For the Emotionality and Lack of Confidence factors, 12-year-old and 18-year-old students demonstrated no significant differences. The results suggest that early adolescence, as well as late adolescence, represents a period that may render students particularly susceptible to developing Emotionality and Lack of Confidence (e.g., onset of adolescence, higher academic demands, and career decisions).
This study also examined performance variation as a function of educational level (junior high vs. senior high) on the TAI-G across student, parent, and teacher samples. Main effects were only noted within the student sample, with significantly higher TAI-G Worry scores for junior high students compared to senior high students. These results suggest that test-related Worry, compared to the other factors, should be given particular attention, and that it should likely be attended to from an early age. Studies do suggest that TA increases slowly in the early school years, then levels off and eventually decreases in later school years. Studies vary, however, with regard to exactly when this occurs. Hembree (1998) suggested that a sharp increase occurs at grades 3 to 5, stabilizes in secondary school, and decreases in college. The data in the current study suggest that junior high students experience failure-focused thoughts (i.e., Worry) to a greater degree than high school students when it comes to testing. This finding appears to corroborate a study by Manly and Rosemire (1972), which suggested that TA prevalence is higher at the junior high level than at the senior high level.
TAI-G factor scores also varied as a function of Informant. With Emotionality, Lack of Confidence, and Interference, student and parent ratings were not significantly different from one another; however, the student and parent ratings were significantly higher than the teacher ratings. Since parents and students demonstrated more concordance across TA factors compared to teacher reports, it appears likely that students and parents are better reporters of TA symptomatology in three of the four factor categories. However, teachers, students, and parents demonstrated concordance with regard to reporting student Worry. Given that Worry is considered the most robust of the four factors, it is clinically significant that all informants gauged this factor concordantly. Knowing that all informants, on average, recognize and endorse Worry in a consistent manner can enhance clinical judgment relative to discordant reports.

Conclusion
This study contributes to the theory, extant empirical literature, and practices related to TA. From a theoretical perspective, a valuable contribution is extended toward the substantiation of the four-factor model within a multi-informant framework of TA assessment. Empirically, this study substantiates and extends claims made with regard to TA ratings as a function of demographic variables of gender, age, and grade level. Ultimately, this study supports further investigations and use of a multi-informant assessment system of TA.
Table 1 (TAI-G items, continued)
4. I think about my abilities.
5. Distracting thoughts keep "popping" into my head.
6. I worry about whether I can cope with being examined.
7. I am "up-tight".
8. I have faith in my own performance.
9. I am thinking about the consequences of failing.
15. I easily lose my train of thoughts.
16. My heart pounds.
17. I worry about my results.
18. I feel anxious.
19. I forget things because I am too preoccupied with my personal problems.
20. I am satisfied with myself.

Table 2. Descriptive statistics for TAI-G.

Table 4. Standardized factor loadings for 30-item 4-factor solutions for student, parent, and teacher samples.

Table 5. Overall model fit indices for test anxiety models across student, parent, and teacher ratings.