Experimental Models for Research in Stress and Behavior

Stress research has gained popularity due to increased acknowledgement of the effects of chronic stress on personal health. With this increased interest, researchers need to ensure that the public receives quality, evidence-based solutions. Improvements following a stress reduction intervention are generally assessed by a pre-post self-report survey rather than by objective biomarkers of stress. There is a need in the literature for a research paradigm utilizing two different stressors to prevent any alteration in post-intervention results due to habituation to the stressor. The Trier Social Stress Test (TSST) and the Beilock Stress Test (BST) are two different stress protocols published in the literature. The present study has three objectives: 1) to compare the efficacy of two different previously documented psychological stressors, the TSST and the BST; 2) to compare an invasive measure, serum cortisol, to a non-invasive measure, the galvanic skin response (GSR); and 3) to examine the effects of sex on the response. Fifty-seven college-age males (n = 31) and females (n = 26) completed both protocols. Blood samples were collected every 10 min for 110 min. Values for baseline, stressor, and recovery 1, 2, and 3 were each averaged over a 20 min period. A 2 (test: BST or TSST) by 2 (sex: male or female) by 5 (trial: baseline, stressor, and recovery 1, 2, and 3) mixed ANCOVA with repeated measures on test and trial was used to analyze the data. There was no significant main effect of test or sex for cortisol or the GSR. There was a significant main effect of trial for both biomarkers: cortisol, F(4,208) = 39.41; and GSR, F(4,216) = 15.18. For cortisol, there were also significant interactions for sex × trial × test, F(4,208) = 4.51, and for test × trial, F(4,208) = 14.31.
The conclusion is that the TSST and the BST can be used as pretest posttest stressors in translational studies assessing the effectiveness of a stress reduction technique if slight modifications are made in the statistical design. Corresponding author. J. J. Robert-McComb et al.


Introduction
Novel experimental models designed to examine changes in stress-related behavior are of interest to the scientific community in neuropsychobiology. There is a current need for a research paradigm utilizing two different stressors to prevent any alteration in post-intervention results caused by a dampened response due to familiarity with the stressor, also known as habituation [1]. Using varying stressors that elicit similar physiological responses should curtail or prevent a dampened stress response due to prior acclimation to one given stressor. An experimental model with two effective stressors that elicit a similar stress response would further affirm results indicating effectiveness of stress reduction techniques.
In the present study, we compared two different published stressors, the Trier Social Stress Test (TSST) and the Beilock Stress Test (BST). Both stressors have been published and shown to elicit a stress response [2] [3]. The TSST utilizes mental arithmetic, public speaking, an audience, and an anticipatory period to create a physiological response. The original purpose of the research by Kirschbaum et al. in 1993 was to show that their designed stress-related test would elicit the needed cortisol response for an effective stressor [3]. The stressor used in the Beilock and Carr study (2005) has not been subjected to the same level of scrutiny as the TSST for the purposes of demonstrating a significant stress response [2]. Beilock and Carr (2005) were interested in examining how stress affected motor skill performance rather than examining potential biomarkers to a stressor. In the BST, participants answered math questions on a computer, which immediately indicated "correct" or "incorrect" and scored them at the end [2]. Serum cortisol was not measured as it was in the Kirschbaum study. The Beilock and Carr study measured performance under high and low pressure with varying levels of working memory. It should be noted that varying phases of the stress protocols may affect individuals differently, yet both stress protocols have been published and are widely known. Yet, there are no published studies that have compared biomarkers of stress between protocols.
Additionally, the researchers in the present study examined a noninvasive biomarker of stress, the galvanic skin response (GSR), and compared the results to an invasive biomarker of stress, serum cortisol. If a noninvasive measure can capture changes in stress-related behavior as well as an invasive measure, our published results would provide a more practical and accessible research paradigm for the assessment of changes in stress-related behavior for future translational studies. The elicitation of the response would be different because the biomarkers represent different physiological systems. Yet we may see the same trend in a much shorter time period and answer our research questions using more practical assessment tools.
Cortisol, a glucocorticoid, is a product of the hypothalamic-pituitary-adrenal (HPA) pathway [4]. HPA activation occurs at a much slower rate than the initial activity by the sympathetic nervous system activating fight or flight hormones (epinephrine and norepinephrine released at neural synapses directly into the bloodstream). Cortisol also remains in the bloodstream significantly longer than catecholamine biomarkers from the sympathetic-adrenal-medullary axis [4] [5]. Due to this slower activation rate, serum cortisol should be collected throughout an extended recovery period (70-90 min) following the stressor [3] [6]. Cortisol has served as a reliable biomarker of stress because its half-life is approximately 70 min. However, the collection of cortisol requires a specially trained team of personnel and researchers who have access to BSL-II labs. Extensive funding is required for the assays, blood collection equipment, and trained personnel.
GSR, also known as electrodermal activity, has been found to increase during psychological and physical stressors [7]-[9]. Utilizing the GSR is noninvasive and inexpensive [10]. The resulting sympathetic nervous system activation after exposure to a stressor results in many physical manifestations including dilation of pupils and bronchi, acceleration of heart rate, decreased digestive activity, and increased perspiration [9]. The increased sweat on the surface of the skin increases the skin's electrical conductance and can be easily measured by relatively inexpensive skin conductance equipment [1]. The GSR can be utilized for research examining any state involving alteration to autonomic function and activity [11]-[13]. Researchers can measure skin conductance in many areas including hands, palms, feet, underarms, groin, and head. The weakness of assessing stress based solely on change in skin conductance lies in the difficulty differentiating simultaneously between tonic and phasic activity [1].
Furthermore, research is inconsistent regarding the difference in the stress response between sexes and/or genders. Studies have shown mixed results regarding sex and gender responses to psychological stressors [14]-[17]. If sex alters the stress response, then future studies assessing the effectiveness of a stress reduction technique would need to control for sex in the experimental design.
In summary, our study had one primary objective and two secondary objectives. The primary objective was to examine the physiological stress response between the TSST and BST, utilizing serum cortisol as a biomarker. Secondly, we compared the results from an invasive measure, serum cortisol, to the results from a noninvasive measure, the GSR. Thirdly, we examined the effect of sex on these two different biomarkers and the published stressors (TSST and BST).

Recruitment
Participants were recruited through flyers, messages in Tech Announce, and announcements made in college classes. The experiment ran from 2013 to 2014. The recruited number of participants was based on previously published studies [3] [18]. Meeting times were set up for screening following recruitment; the time of day or day of the week was irrelevant for the first visit.

Screening: Visit One
During visit one, the potential participant was briefed and screened for study eligibility. The investigators have found that a briefing prior to the signing of the consent form decreases reported dropout rates since all who sign the consent form are considered study participants. If willing to complete the study after the informal briefing, participants signed the informed consent as approved by the Institutional Review Board at Texas Tech University Health Science Center. After signing the consent form, they completed the following screening assessments to determine study eligibility: a) the Par-Q Canadian Society [19]; b) the Health History Questionnaire [20]; c) weight; and d) hematocrit levels. To be in the study, participants had to answer NO to all questions on the Par-Q, have a hematocrit level > 38%, weigh at least 50 kg, and not practice meditation on a regular basis or any mind-body exercise such as yoga, as indicated on the Health History Questionnaire. If they passed the screening criteria, they completed the following questionnaires: a) Spielberger's State-Trait Anxiety Inventory [21]; b) the Stress Vulnerability Scale [22]; and c) the Perceived Stress Scale [23]. Weight relative to height was also calculated by dividing body weight in kilograms by height in meters squared (Body Mass Index [BMI, kg·m⁻²]). Fifty-seven participants signed the consent form and passed the screening criteria.
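The eligibility rules and the BMI formula described above can be written as a short check. This is only a minimal sketch, not the investigators' actual procedure: the `Screening` record, its field names, and the function names are hypothetical, and only the thresholds (hematocrit > 38%, weight of at least 50 kg) come from the text.

```python
from dataclasses import dataclass


@dataclass
class Screening:
    """Hypothetical record of one participant's visit-one screening data."""
    parq_all_no: bool          # answered NO to every Par-Q question
    hematocrit_pct: float      # hematocrit, in percent
    weight_kg: float           # body weight, in kilograms
    height_m: float            # height, in meters
    practices_mind_body: bool  # regular meditation, yoga, or similar


def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight in kilograms divided by height in meters squared."""
    return weight_kg / height_m ** 2


def is_eligible(s: Screening) -> bool:
    """Apply the inclusion criteria as stated in the text."""
    return (
        s.parq_all_no
        and s.hematocrit_pct > 38.0      # strictly greater than 38%
        and s.weight_kg >= 50.0          # at least 50 kg
        and not s.practices_mind_body
    )
```

A call such as `is_eligible(Screening(True, 42.0, 70.0, 1.75, False))` would pass all four criteria, while a hematocrit of exactly 38% would not.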

Procedures for Laboratory Stressors: Visit Two and Three
During the second and third visit, the participants were exposed to either the TSST (Stressor A) or the BST (Stressor B). Stressors A and B were counterbalanced across participants. One week was scheduled between stressors. All experimental sessions were run between 9:00 a.m. and 1:00 p.m. Every attempt was made to test on the same day and time of day for each participant. Participants were asked to refrain from all food, alcohol, caffeine, gum chewing, and the use of tobacco products for 3 hours before their scheduled appointment. A questionnaire was used to address food intake compliance.
Participants assumed a seated position in a comfortable hospital chair for rest and recovery periods. For the stressor, they were in the chair as dictated by Stressor A or B. ProComp Infiniti w/Biograph software (Thought Technology; Quebec, Canada) was used to monitor physiological changes prior to and following both laboratory stressors. The galvanic skin response (GSR) was measured by placing a finger sensor on the non-dominant hand or the same arm as the indwelling catheter. Biofeedback measurements were recorded at the same time intervals as cortisol.
A trained technician obtained circulating cortisol levels by inserting an indwelling IV catheter (12.7 cm). After insertion of the IV, blood samples (5 cc) were taken at 10 min intervals throughout the entire protocol, each followed by a saline flush to keep the line patent.
The protocol used for testing was identical for both tests with the exception of the stressor (A or B). Figure 1 illustrates the protocol and the 20 min time periods that denote baseline, stressor, recovery 1, recovery 2, and recovery 3.

Beilock Stress Protocol
The stressor used for the BST was modified slightly from the Beilock & Carr study, When High-Powered People Fail [2]. Dr. Sian Beilock, from the University of Chicago, shared the modular math problems used in her published study with the researchers at TTU.
The modular math problems were installed and displayed on a desktop computer using E-Prime 2 (Psychology Software Tools; Sharpsburg, PA), software designed specifically for experimental research. The complete protocol for the stressor using the modular math problems follows.
After insertion of the intravenous catheter, participants rested in a comfortable hospital chair for 30 min in room A. At time 0 min they were taken to a second room (room B) and introduced to the task at hand. Two investigators in white coats were in the room with the tester during the 20 min session. Participants were instructed to judge modular arithmetic (MA) problems as quickly as possible without sacrificing accuracy, pressing the "T" or "F" key to indicate whether each problem was true or false, respectively. The participant was told that his or her performance was videotaped so that local math teachers and professors could examine his or her performance on this new task. The video camera was placed 0.61 m to the right of the participant to record the participant and the computer screen. In order to adhere to a 20 min session, not all of the MA tasks in a specific part of the session may have been completed, depending on the speed of the participant. Each trial began with a 500 ms fixation point at the center of the screen. The fixation was immediately replaced by an MA problem that remained on the screen until the participant responded. After the response, the word "Correct" or "Incorrect" appeared for 1000 ms, providing feedback. The screen then went blank for a 1000 ms inter-trial interval. Participants performed low-demand [e.g., 7 ≡ 2 (mod 5)], medium-demand, and high-demand [e.g., 44 ≡ 28 (mod 7)] practice problems, presented in a different random order to each participant.
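For readers unfamiliar with the task, judging an MA problem amounts to testing a congruence: a statement of the form a ≡ b (mod n) is true when n divides a − b without remainder. The sketch below applies that standard definition to the two example problems in the text; the function name is hypothetical, and the code is illustrative rather than the E-Prime implementation used in the study.

```python
def is_congruent(a: int, b: int, n: int) -> bool:
    """Return True when a ≡ b (mod n), i.e. when n divides (a - b) evenly."""
    return (a - b) % n == 0


# The two example problems quoted in the text.
low_demand = is_congruent(7, 2, 5)     # (7 - 2) = 5, and 5 is divisible by 5
high_demand = is_congruent(44, 28, 7)  # (44 - 28) = 16, and 16 is not divisible by 7
```

Under this definition the low-demand example is a "true" item and the high-demand example is a "false" item, consistent with a task that asks participants to press "T" or "F".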
Participants then completed a 70-problem low-, medium-, and high-pressure test. The problems in each test were presented in a different random order to each participant. Each problem appeared only once for each participant. Within each test, there were low-demand and high-demand problems. Following this test, participants were given a scenario designed to create a high-pressure environment by involving sources of pressure commonly seen in the real world (peer pressure and social evaluation).
Participants were informed that the computer used reaction time (RT) and accuracy equally to compute an MA score. They were told that if they could improve their MA score by 20% relative to the preceding practice trials, they would win a prize. Participants were informed that obtaining the award required "team effort". Each participant had been randomly paired with another individual, and for either person to receive a prize, both members of the pair had to improve. Next, the participant was told that their partner had already completed the experiment and had improved by 20%. If the participant improved by 20%, both the participant and the partner would receive a prize. However, if the participant did not improve by the required amount, neither individual would receive a prize. The participant then completed the block of 24 MA problems. The entire stressor took 20 min. Participants were taken back into room A, where they rested in the hospital chair for 60 min while blood was sampled every 10 min.

Trier Social Stress Protocol
The stress protocol as described by Kirschbaum et al. (1993) was followed as closely as possible [3]. After insertion of the intravenous catheter, participants rested in a comfortable hospital chair for 30 min in room A. At time 0 min, the participant was taken to a second room (room B) by an investigator dressed in a white lab coat and introduced to the task at hand. In room B, three investigators in white lab coats were sitting at a table with a tape recorder. The investigators served as the selection committee (composed of both males and females). A video camera was placed in the corner of the room and a microphone was accessible for the research participant. When the participant entered the room, he or she was asked to stand at the microphone in front of the selection committee. Next, one of the investigators asked the participant to take over the role of a job applicant who was invited for a personal interview with the company's staff managers (selection committee). The participant was told that after a preparation period he or she should introduce himself or herself to the managers in a free speech of 5 min duration and convince the managers that he or she was the perfect applicant for the vacant position. The managers were introduced as being specially trained to monitor nonverbal behavior. Furthermore, it was announced that a voice frequency analysis of nonverbal behavior would be performed on the tape-recorded talk and that a video analysis of the participant's performance would also be conducted.
Following these instructions, the participant returned to room A and was given 10 min to prepare the talk while sitting in the hospital chair. Participants were provided with paper and pencils for outlining their talks; however, they were not allowed to use the written outline for their speech. At time +10 min the research participant was guided back to room B. One of the investigators (managers) welcomed the job applicant and asked him or her to deliver the talk during the next 5 min. Whenever the participants finished their speeches in less than 5 min, the managers responded in a standardized way. First they told the research participant, "You still have some time left. Please continue!" Should the participant finish a second time before the 5 min was over, the managers were quiet for 20 s and then asked prepared questions. At time +15 min, the selection committee of managers asked the participant to serially subtract the number 13 from 1022 as fast and as accurately as possible. On every failure, the participants had to restart at 1022, with one member of the committee interjecting, "Stop. 1022." At time +20 min the task was stopped and the participant was taken back to room A by the investigator. Thereafter, participants rested for 60 min, which comprised three 20 min recovery periods.
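The serial subtraction task can be illustrated with a short sketch that generates the expected answers and locates a participant's first error, the point at which the committee would say "Stop. 1022." The function names and the idea of scoring a list of spoken responses are hypothetical and are included only to make the rule concrete; the study itself scored the task live.

```python
from typing import Iterator, Optional, Sequence


def serial_subtraction(start: int = 1022, step: int = 13) -> Iterator[int]:
    """Yield the expected answers: start - step, start - 2*step, ... while non-negative."""
    value = start - step
    while value >= 0:
        yield value
        value -= step


def first_error(responses: Sequence[int], start: int = 1022, step: int = 13) -> Optional[int]:
    """Return the index of the first wrong response, or None if all are correct."""
    for i, (given, expected) in enumerate(zip(responses, serial_subtraction(start, step))):
        if given != expected:
            return i
    return None
```

For example, the first correct answers are 1009, 996, 983, and 970, so a response list of `[1009, 996, 984]` has its first error at index 2.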

Debriefing of the Participants
Debriefing of the participants occurred after all testing had been completed. In the debriefing, the researchers relayed to the participants that the investigators in lab coats were not really trained to observe nonverbal behavior, nor were they critically judging the participants' behavior. They were also told that they were not videotaped or voice recorded.

Data Analysis
A linear mixed model (2 × 2 × 5 Mixed ANCOVA) was used to examine the changes in outcome variables (galvanic skin response or cortisol) over five repeated time points (trial: baseline, stressor, recovery 1, recovery 2, and recovery 3) for two different stressors (test: Beilock and TSST) between sexes (sex: male and female). Therefore, the model estimated one main effect for the between-subject factor (sex), two main effects for the within-subject factors (test and trial), and the associated interaction effects between factors for each outcome variable. The parameters were estimated using the restricted maximum likelihood method with the unstructured covariance structure for multivariate repeated measures after controlling for the participant's baseline characteristics. The least-square adjusted means for the outcome variables at each time point for the two different stressors were estimated for multiple pairwise comparisons. Finally, partial correlation coefficients between the two outcome variables were calculated at each repeated time point after controlling for the participant's baseline characteristics. Alpha level was set at 0.05 unless otherwise specified, and the PROC MIXED procedure was used to examine the linear mixed models using SAS version 9.4 (SAS Institute; Cary, NC).
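To make the idea of a partial correlation concrete, the sketch below computes the correlation between two outcomes after removing the linear influence of a single covariate, using the standard first-order formula. It is a simplified illustration only: the study adjusted for several covariates at once in SAS, whereas this example assumes one covariate, and the function names and any data passed to them are hypothetical.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x: Sequence[float], y: Sequence[float], z: Sequence[float]) -> float:
    """First-order partial correlation of x and y, controlling for one covariate z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
```

Here `x` and `y` would stand for GSR and cortisol at one experimental phase and `z` for a baseline characteristic such as BMI. Extending the adjustment to several covariates is usually done by regressing each outcome on all covariates and correlating the residuals.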

Descriptive Statistics
The descriptive statistics of covariates at baseline (N = 57) can be found in Table 1. The sexes did not differ on any of the covariates, with the exception of BMI (kg·m⁻²).

Results of the Mixed Model Analysis
Although 57 participants signed the consent form and participated in the study protocols, 2 of the 57 had missing data and their results could not be included in the final statistical analysis. The results of the Mixed Model Analysis can be found in Table 2. For the GSR, the only significant difference was between trials. Least-square adjusted means for the pairwise comparisons showed the differences to be between: trial [...]. For cortisol, we also saw a significant main effect for trial: trial 1 and 4, t(208) = −4.85, p < 0.0001; trial 1 and 5, t(208) = −7.44, p < 0.0001; trial 2 and 4, t(208) = −8.57, p < 0.0001; trial 2 and 5, t(208) = −10.50, p < 0.0001; trial 3 and 4, t(208) = −11.02, p < 0.0001; trial 3 and 5, t(208) = −10.84, p < 0.0001; and trial 4 and 5, t(208) = −6.21, p < 0.0001. There was also a significant two-way interaction for test × trial and a significant three-way interaction for test × trial × sex. There were no other significant main effects or interactions.
The outcome measures for GSR and cortisol are presented in Table 3 for baseline, stressor, recovery 1, recovery 2, and recovery 3 for both sexes for the TSST and the BST. The measures have been adjusted for covariates.

Correlation between Outcome Variables over the Experimental Phases
Correlation coefficients between GSR and cortisol, adjusted for covariates, are presented for each phase of the stress protocol in Table 4. None of the correlations was significant and all were very low, suggesting no linear relationship between GSR and cortisol.

Discussion
A limitation of the study is that only one noninvasive measure was collected. Other suitable noninvasive measures could be: resistance (the reciprocal of conductance); heart rate and heart rate variability; temperature; respiration rate; and electroencephalography/neurofeedback. Other invasive measures that may be of interest to the community of scholars are: adrenocorticotropic hormone, vasopressin, epinephrine, norepinephrine, growth hormone, and possibly some cytokines, depending on the release time and half-life of the cytokine.

Primary Study Objective
Our primary objective in the study was to experimentally demonstrate that two different stressors, the TSST and the BST [2] [3], would elicit a similar stress response (cortisol response or GSR) and that these stressors could be used in future translational research as pretest posttest stressors. We will discuss the GSR and the cortisol response to these stressors separately.

Cortisol
The main effect for test was not significant for cortisol, nor was the main effect for sex; significance was only found for trial. This would suggest that the TSST and the BST could be used in translational research to document change in behavior and control for habituation. However, in order to be able to use cortisol as an outcome variable, trial would have to be collapsed across time or trial could be embedded in the statistical design. In Epel's study [24], cortisol reactivity referred to total cortisol output on the stress day, calculated as area under the curve (AUC, in μg/dl•minutes). In our study design, since there was a significant interaction term for test × trial and a significant interaction term for test × trial × sex between the TSST and the BST, future researchers using these same stressors could not be sure if an observed effect was due to the intervention or to the effect of the differing stressors. See Figure 2.
If you examine Figure 2, you notice that men peaked during the first recovery period following the stressor for the TSST. Yet for the BST, the recovery 1 average of cortisol for a 20 min period was lower than the 20 min baseline period. For the women, the average cortisol response never really peaked after baseline. Our findings can be partly explained by the high level of cortisol at baseline. Our baseline levels are at the upper end of normal.
Normal plasma levels at 8:00 a.m. range from 5.5 to 26.3 μg/dL; at 4:00 p.m. the range is 2.0 to 18.0 μg/dL [9]. So it seems that the anticipatory state of the participant at baseline may mask the acute cortisol response to a stressor, specifically if an IV is being inserted. Even though Epel et al. [24] were using salivary cortisol, the researchers in that study noted a decline in cortisol taken at min 15, 30, and 45 (beginning of stressor) and then a dramatic increase at min 90 after the completion of the stressor. Epel et al. [24] also demonstrated that cortisol levels were lower at baseline on a rest day than on the day of the test.
Noteworthy is the wide range for normal resting cortisol levels (5.5 to 26.3 μg/dL). There is a high degree of interindividual variability of response in psychoendocrine studies [25]. Epel et al. [24] divided the participants in their study into high reactors and low reactors, and a very different pattern of cortisol reactivity occurred following a stressor. When both high and low responders were in the same pool, their results were similar to ours in that the stress response was masked by the initial high level at the first baseline sample. Researchers often normalize their data, as we have done in our previous studies, yet it is very difficult to get a significant stress response effect unless sample sizes are very large.
Regarding the interaction for test × trial × sex, it has been shown that women with disordered eating have a blunted cortisol response to a stressor [26]. While the women in our study were not screened for disordered eating, levels of disordered eating are high among women on a college campus. This could possibly contribute to the significant interaction effect for test × sex × trial. Other contributing factors are discussed in reference to the effect of sex on test and trial.
In summary, if the biomarker of stress used in translational research to document behavioral change was serum cortisol, the BST and the TSST stressors could not be used in the same study as pretest posttest stressors using the same statistical design we used (2 × 2 × 5 Mixed ANCOVA). These two stressors could be used, counterbalanced pretest posttest, to document change in behavior in translational research if the 5 trials were averaged and represented one measure of cortisol reactivity, or if the total cortisol output on the day of the stressor was calculated as area under the curve (AUC, in μg/dl•minutes) as in Epel's study [24].
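The two summary measures proposed above, a single average across trials and an area under the curve, can be sketched as follows. The trapezoidal rule used here is a common way to compute AUC from serial samples, but it is an assumption on our part; the text does not state which AUC formula was used in [24]. The function names and any values passed in are hypothetical.

```python
from typing import Sequence


def mean_cortisol(cortisol_ug_dl: Sequence[float]) -> float:
    """Average cortisol across all samples or trial means (μg/dl)."""
    return sum(cortisol_ug_dl) / len(cortisol_ug_dl)


def auc_trapezoid(times_min: Sequence[float], cortisol_ug_dl: Sequence[float]) -> float:
    """Area under the cortisol curve by the trapezoidal rule (μg/dl × minutes)."""
    if len(times_min) != len(cortisol_ug_dl) or len(times_min) < 2:
        raise ValueError("need at least two samples and matching lengths")
    total = 0.0
    for i in range(1, len(times_min)):
        dt = times_min[i] - times_min[i - 1]  # minutes between consecutive samples
        total += dt * (cortisol_ug_dl[i] + cortisol_ug_dl[i - 1]) / 2.0
    return total
```

With samples drawn every 10 min, `times_min` would be `[0, 10, 20, ...]` and each participant would contribute one AUC value per stressor day, which removes trial from the statistical design.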

Galvanic Skin Response
There were no surprises for the galvanic skin response. There was a rise from baseline during the stressor, and during recovery the response continued to decline for both the TSST and the BST. This was true for men and women alike. See Figure 3. We can confidently state that if the biomarker used to describe changes in stress-related behavior or reactivity to a stressor in translational research was the GSR, then the TSST and the BST could be used as pretest posttest stressors using the same statistical design (2 × 2 × 5 Mixed ANCOVA) that we used in our study. We recommend that the order of the stressors be counterbalanced from pretest to posttest. With that design, any change that occurred across trial could be attributed to the intervention and not to the stressor or the effect of sex. Using two effective stressors would control for habituation, and the researchers could be confident that any change seen between trial, sex, or test would be a result of the intervention.

Secondary Study Objectives
The invasive (cortisol) and noninvasive (GSR) measures are not correlated. Table 4 shows that there is a very low correlation between the GSR and cortisol. The GSR represents an immediate response, whereas peaks in cortisol may occur late in an extended stressor or during recovery from the stressor.
Our other secondary objective was to examine the effect of sex on the stress response. Studies have consistently demonstrated that women report more distress in response to fear-producing and stressful experiences than men [27] [28]. Yet, Kirschbaum [29] did not find a significant difference between sexes in serum cortisol secretion in response to a psychological stressor. Our study results are somewhat different: even though we did not see a significant main effect for sex for serum cortisol, there was a significant three-way interaction for sex × trial × test for serum cortisol. As can be seen in Figure 2, serum cortisol levels were higher for men than women during recovery 1 for the TSST but lower than women for the BST throughout the whole stress protocol. Males had greater cortisol reactivity to the public speaking task in the TSST, and females had higher cortisol levels on the day they were completing the modular math problems in the BST. Although not significant, this same trend can be seen with the GSR.

Conclusion
Our study results support the use of the TSST and the BST counterbalanced as pretest posttest in translational research if noninvasive biomarkers of stress, such as the GSR, are used to document change using the same study design that was used in our study. If serum cortisol is the biomarker used to document the effectiveness of an intervention, the TSST and the BST should be counterbalanced from pretest to posttest, and either the average cortisol on the day of the stressor (rather than each trial examined separately) or the AUC (in μg/dl × minutes) on the day of the stressor should be used. More research is needed assessing other potential stress protocols to be able to quantify changes in behavior that result from the intervention and not from habituation to the stressor.

Figure 2. Means for cortisol during the TSST (straight line) and the BST (broken line) for males and females. Note: Means are 20 min averages.

Figure 3. Means for the GSR during the TSST (straight line) and the BST (broken line) for males and females. Note: Means are 20 min averages.

Table 2. Results of the Mixed Model Analysis for each outcome variable. The restricted maximum likelihood estimation method was used for parameter estimations after adjusting for covariates (age, BMI, Spielberger's State/Trait Anxiety Inventory scores, stress vulnerability scores, perceived stress scores, and experimental order (TSST-BST and BST-TSST)). *Note:

Table 3. Least-square adjusted means (SE) for the galvanic skin response (GSR) and cortisol over the experimental phases.
Note: Values are adjusted for covariates; TSST = Trier Social Stress Test; BST = Beilock Stress Test.

Table 4. Partial correlation between outcome variables over the experimental phases. Values indicate the partial correlation coefficients adjusted for covariates (age, BMI, Spielberger's State/Trait Anxiety Inventory scores, stress vulnerability scores, perceived stress scores, and experimental order (Trier-Beilock and Beilock-Trier)). Note: