Deliberative versus Intuitive Psychodiagnostic Decision

Several studies have demonstrated that in the mental health domain, experience does not always lead to better diagnostic decisions, suggesting that in clinical psychology experience-based intuition might not actually improve performance. The aim of the current study was to investigate differences in preferred reasoning styles of novice and experienced clinical psychologists as a possible explanation of this surprising phenomenon. We investigated clinical and control decisions of novice (n = 20) and experienced (n = 20) clinical psychologists as well as age-matched controls (n = 20 and n = 20, respectively) by using vignettes and MouselabWeb matrices. We assessed their reasoning style preferences by using the Rational-Experiential Inventory (Pacini & Epstein, 1999). Results showed that experienced and novice clinical psychologists did not differ in diagnostic accuracy and that experienced psychologists had a higher preference for rational thinking than novices. We also found that in experienced psychologists a stronger preference for deliberation was associated with greater accuracy, and in novice psychologists a stronger preference for intuitive reasoning was associated with less accurate decisions. It might be that it is not a question of more experience but of deliberation about the task that could help clinicians perform more accurately.

Two possible explanations for this phenomenon have been suggested. One is a task effect (Shanteau, 1992; Shanteau & Weiss, 2014; Tracey et al., 2014). In a "wicked" learning environment (Hogarth, 2001) such as the clinical domain, where decisions are based on uncertain, incomplete knowledge and without feedback, it is hard to learn from experience and improve performance. The other explanation is that the reasoning style on which experienced clinical psychologists rely does not fit the task (Tracey et al., 2014). In our paper we focus on the second explanation.
As experience increases, professionals tend to move from the deliberative, detail-oriented processing of the beginner to faster, more automated information processing (Betsch & Haberstroh, 2005; Elstein & Schwartz, 2002; Evans, 2008; Kahneman, 2011). Epstein (e.g., 2010) uses the terms "experiential/intuitive" versus "rational/analytical" to refer to these different ways of processing information. Experienced clinical psychologists may thus diagnose more intuitively, quickly matching client presentations to prototypes (Westen, 2012). On this account, the benefit of greater experience, demonstrated in many other fields of expertise (Ericsson, 2009; Quiñones et al., 1995), would be offset in the mental health domain by the use of a less suitable reasoning style, which would explain why more experienced psychologists are not more diagnostically accurate than novices.
Preferred reasoning styles have been found to be associated with differences in diagnostic accuracy, but the link to experience has not been established. In one study (Aarts et al., 2012), clinical psychologists were grouped in terms of diagnostic accuracy, with lower accuracy related to stronger preferences for rational thinking.
In our study we employed two formats for making clinical decisions: (i) brief text vignettes and (ii) MouselabWeb matrices (Schulte-Mecklenbeck et al., 2011), a process-tracing tool providing information about both how long and how often cues are inspected. We also included control vignettes and matrices, and recruited age-matched participants from other fields of expertise to control for age and the domain-specificity of the effects.

Hypotheses
Outcome hypothesis. We expected to find no differences in psychodiagnostic accuracy between novices and experienced clinical psychologists on the clinical tasks (cf. Spengler et al., 2009), and we expected clinical psychologists to perform better than control participants on the clinical tasks, as these required domain-specific knowledge.
Style hypothesis. We expected experienced clinical psychologists to report a stronger preference for experiential processing than novice psychologists (cf. Betsch & Haberstroh, 2005). Additionally, because the environment does not allow learning from experience (Tracey et al., 2014), more experience would not be associated with higher accuracy, and a stronger preference for rational processing would be related to higher accuracy.
Processing hypothesis. We expected experienced clinical psychologists to be more intuitive and quicker in their decision-making (Westen, 2012), especially on the clinical tasks, than both novice psychologists and control participants.

Participants
Twenty novice (18 females) and 20 experienced clinical psychologists (14 females) participated in this study. Novices were Master's students in clinical psychology or young professionals, with a mean of 3 months of experience (SD = 3.2 months) and an average age of 25.3 years (SD = 3.83 years). Experienced clinical psychologists had a mean of 15.6 years of experience (SD = 11.4 years) and an average age of 42.9 years (SD = 13.1 years).
Novice clinical participants were recruited at two universities, with similar clinical psychology curricula. Experienced clinical participants were recruited through the membership list of their professional organization.
Additionally, we recruited forty age-matched control participants. Twenty (14 females) were Master's students or young professionals in a field other than mental health, with an average age of 23.6 years (SD = 3.21 years). The other twenty controls (12 females), matched to the experienced group, had an average age of 43.3 years (SD = 11.8 years).
All participated in the study from home using their computers. Eight gift certificates (each worth €25) were raffled among the 80 participants.

Procedure
This study was part of a larger project addressing the effects of experience on clinical decision-making, which also included a memory and a triad task (cf. Bowers et al., 1990). In the first session participants were asked to read a twenty-line case description, and then completed the vignette task and the MouselabWeb matrices (described below). In the next part participants completed the triad task, where they had to judge whether triples of words were coherent or incoherent. In the second session, two weeks later, participants were asked to write down everything they remembered about the case description, and they completed the Rational-Experiential Inventory (REI; see below).
Here we focus on the relationship between thinking style preferences and diagnostic accuracy and therefore describe the results of the vignettes, the MouselabWeb matrices, and the REI.
The clinical tasks asked participants to indicate which of two DSM-IV-TR diagnoses best fit the case information. There were always eight pieces of information. One to four pieces of information were diagnostic (i.e., defining) for one but not for the other diagnosis. The remaining pieces of information were non-specific for either of the two diagnoses. For instance, in a task with the diagnostic choices major depressive disorder and dysthymic disorder, information that symptoms have been present only for the last month is defining for major depressive disorder, while low energy can typically occur in both disorders. Different pieces of information and diagnoses were used in each task, for a total of 90 unique clinical tasks (45 in vignette format, 45 in MouselabWeb format).
The control tasks concerned general knowledge about countries, food, and animals and were constructed in the same way as the clinical tasks (i.e., eight pieces of information and two possible answers). The control tasks were piloted and difficulty matched using a different group of young and older volunteers.
Vignettes. The vignettes were short text descriptions of the eight pieces of information, followed by the two possible diagnostic labels. Control vignettes were in the same format.
The MouselabWeb software allowed us to measure accuracy (correct or incorrect), the number of opened boxes (number of acquisitions), and the time spent on each task.
The design of the study was as presented in Figure 2.

Rational Experiential Inventory
To assess differences between the four groups in their thinking style preferences, two 2 (experience) × 2 (profession) analyses of variance (ANOVAs) were performed, one for rational style and one for experiential style. Rational style preference did not differ significantly between novices and experienced participants (F (1, 76) = 1.65, p = 0.20) or between clinicians and controls (F (1, 76) = 0.44, p = 0.84). However, the profession × experience interaction was significant (F (1, 76) = 5.25, p < 0.05). Tukey's HSD comparisons indicated a significant difference only between the novice clinical psychologists and the other three groups (p < 0.05): Novice psychologists had a lower preference than the other groups for the rational thinking style. Experiential style preference did not differ between novices and experienced participants (F (1, 76) = 1.67, p = 0.21) or between clinicians and controls (F (1, 76) = 0.401, p = 0.53) (Table 1). The interaction between profession and experience level was also not significant (F (1, 76) = 0.209, p = 0.65). REI scale scores range from 1 to 5.
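For readers less familiar with the 2 × 2 between-subjects ANOVA, the following minimal sketch shows how the three F ratios and their degrees of freedom arise in a balanced design. It is not the analysis code used in the study: the function name `anova_2x2` and the toy scores are hypothetical, and p-values are omitted because the Python standard library has no F distribution. With four groups of 20, the error degrees of freedom are 4 × (20 − 1) = 76, which matches the F (1, 76) reported above.

```python
from statistics import mean


def anova_2x2(cells):
    """Balanced 2 x 2 between-subjects ANOVA (illustrative sketch).

    `cells` maps (experience, profession) -> list of scores; all four
    cells must hold the same number of participants.
    Returns the three F ratios and their shared degrees of freedom.
    """
    assert len(cells) == 4, "expects exactly four cells"
    n = len(next(iter(cells.values())))
    assert all(len(v) == n for v in cells.values()), "design must be balanced"

    exp_levels = sorted({e for e, _ in cells})
    prof_levels = sorted({p for _, p in cells})

    cell_mean = {k: mean(v) for k, v in cells.items()}
    grand = mean(cell_mean.values())  # equals the grand mean when balanced
    exp_mean = {e: mean(cell_mean[(e, p)] for p in prof_levels) for e in exp_levels}
    prof_mean = {p: mean(cell_mean[(e, p)] for e in exp_levels) for p in prof_levels}

    # Each effect has 1 df in a 2 x 2 design, so its mean square equals its SS.
    ss_exp = 2 * n * sum((exp_mean[e] - grand) ** 2 for e in exp_levels)
    ss_prof = 2 * n * sum((prof_mean[p] - grand) ** 2 for p in prof_levels)
    ss_inter = n * sum(
        (cell_mean[(e, p)] - exp_mean[e] - prof_mean[p] + grand) ** 2
        for e in exp_levels
        for p in prof_levels
    )
    # Within-cell (error) sum of squares.
    ss_error = sum((x - cell_mean[k]) ** 2 for k, v in cells.items() for x in v)

    df_error = 4 * (n - 1)
    ms_error = ss_error / df_error
    return {
        "F_experience": ss_exp / ms_error,
        "F_profession": ss_prof / ms_error,
        "F_interaction": ss_inter / ms_error,
        "df": (1, df_error),
    }


# Hypothetical toy scores (two per cell), not data from the study.
toy = {
    ("novice", "clinical"): [1.0, 3.0],
    ("novice", "control"): [3.0, 5.0],
    ("experienced", "clinical"): [3.0, 5.0],
    ("experienced", "control"): [3.0, 5.0],
}
result = anova_2x2(toy)
```

Because every effect in a 2 × 2 design has a single numerator degree of freedom, the main effects and the interaction all share the same (1, error) degrees of freedom.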

Analysis Strategy
To investigate the role of experience level (novice/experienced), profession (clinical psychologist/control), question type (clinical/non-clinical), and their potential interactions on diagnostic decisions, we used a (generalized) linear mixed-effects models approach (sometimes also referred to as multilevel models or hierarchical linear models) that can account for non-independence in the data (for example, because each participant contributed more than one data point). This approach has several advantages compared to more traditional analysis approaches, as it allows analysis of data at the trial level (thus making it unnecessary to aggregate across items or participants) while safeguarding against inflated Type I errors by modeling all relevant potential sources of variation and taking into account the non-independence. We used the lme4 package (Bates, Maechler, Bolker, & Walker, 2014) in R (R Core Team, 2013) for the mixed-models analysis. To determine p-values for the effects of interest based on likelihood ratio tests (comparing the model with the effect of interest to the same model without the effect of interest), we used the mixed function from the package afex version 0.15-2 (Singmann, Bolker, & Westfall, 2014). As a general modeling strategy, we always first ran an omnibus model containing all predictors and interaction terms of interest (experience level, profession, vignette type, and their interactions), and then ran follow-up models to further investigate significant interactions and/or main effects of interest.
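The logic of a likelihood ratio test for a single effect can be illustrated with a short sketch. This does not reproduce lme4 or afex. It only shows how a χ2 statistic with 1 degree of freedom is obtained from two log-likelihoods and converted to a p-value. The function names and the example log-likelihoods are hypothetical, and the closed-form tail probability used here holds only for 1 degree of freedom.

```python
import math


def chi2_sf_df1(chi2):
    """Upper-tail probability of a chi-square variable with 1 df.

    For 1 df, P(X > x) = erfc(sqrt(x / 2)), so no statistics library is needed.
    """
    return math.erfc(math.sqrt(max(chi2, 0.0) / 2.0))


def likelihood_ratio_test(loglik_full, loglik_reduced):
    """Compare a model with the effect of interest to the same model without it.

    Assumes the two models differ by exactly one parameter (df = 1).
    Returns the chi-square statistic and its p-value.
    """
    chi2 = 2.0 * (loglik_full - loglik_reduced)
    return chi2, chi2_sf_df1(chi2)


# Hypothetical log-likelihoods, chosen only to illustrate the calculation.
chi2, p = likelihood_ratio_test(loglik_full=-100.0, loglik_reduced=-103.5)
```

Applied to a statistic of the size reported in the Results, `chi2_sf_df1(13.05)` gives a probability below 0.001.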
For the vignettes analysis (as for the MouselabWeb analysis), experience level, profession, vignette type, and their interactions were modeled as fixed effects, and participants and items were modeled as random intercepts. Vignette type was added as a random slope varying over participants, and experience level and profession were added as random slopes varying over items; in addition, the model contained all possible random correlation terms among the random effects. This represents a "maximal" random effects structure that both accounts for the repeated-measures nature of the data and avoids inflated Type I errors (Barr, Levy, Scheepers, & Tily, 2013).
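Written out, the accuracy model described above can be sketched as follows. The notation is an assumption introduced here for illustration and does not appear in the original analysis: p indexes participants, i indexes items, E_p and C_p code experience level and profession, T_i codes vignette type, u terms vary over participants, and w terms vary over items.

```latex
\begin{aligned}
\operatorname{logit} \Pr(y_{pi} = 1) &= \beta_0 + \beta_1 E_p + \beta_2 C_p + \beta_3 T_i \\
&\quad + \beta_4 E_p C_p + \beta_5 E_p T_i + \beta_6 C_p T_i + \beta_7 E_p C_p T_i \\
&\quad + u_{0p} + u_{1p} T_i + w_{0i} + w_{1i} E_p + w_{2i} C_p, \\
(u_{0p}, u_{1p})^\top &\sim \mathcal{N}(0, \Sigma_u), \qquad
(w_{0i}, w_{1i}, w_{2i})^\top \sim \mathcal{N}(0, \Sigma_w).
\end{aligned}
```

The unstructured covariance matrices Σ_u and Σ_w correspond to "all possible random correlation terms". For the response-time models, the logit link would be replaced by an identity link with a Gaussian residual term.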
For the vignettes data, we first present the analysis of accuracy (correct or incorrect response), using a generalized mixed-effects models approach appropriate for binary data, followed by the analysis of response times, which used a Gaussian model. The same analysis was done for the MouselabWeb data. An additional model used the number of acquisitions as the dependent variable. Finally, within each such analysis, we first present the results of the models without REI scores (representing tests of our outcome hypotheses), followed by the same models with REI scores added (representing tests of our style hypotheses).

Vignettes-Outcome and Style
There were no significant main effects of experience level (χ2 (1) [...]
For descriptive reasons (to present the relationships in a measure more familiar to most readers than the coefficients in the mixed-effects models), Pearson correlations were computed between REI scores and accuracy of both novice and experienced clinicians. As in the mixed-models analysis, higher experiential scores were negatively correlated with the mean accuracy of novice (r = −0.626; p < 0.01) but not experienced clinicians (r = −0.096; p = 0.69); higher rational scores were positively correlated with the mean accuracy of experienced (r = 0.457; p < 0.05) but not novice clinicians (r = −0.278; p = 0.24).
Figure 3. Association between experientiality (EXP) and accuracy (left) and rationality (RAT) and accuracy (right) for novice and experienced psychologists in vignettes. Accuracy score is the percentage of correct answers. Experientiality and rationality scores were analyzed as continuous variables but are, for illustrative purposes, presented as binary variables.
There were no significant interactions between EXP and RAT and other variables in the control group or in control tasks (all p > 0.08).
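The descriptive Pearson correlations above can be related to their significance levels with a short sketch. It is an illustration under stated assumptions, not the original analysis: the function names are hypothetical, each clinical group is taken to have n = 20 (as reported under Participants), and the two-sided critical t values for 18 degrees of freedom (2.101 for p = 0.05, 2.878 for p = 0.01) are standard table values.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def t_from_r(r, n):
    """t statistic (df = n - 2) for testing a correlation against zero."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Two-sided critical values for df = 18 (standard t table).
T_CRIT_05 = 2.101
T_CRIT_01 = 2.878

# Correlations reported for the vignettes, assuming n = 20 per group.
t_novice_exp = t_from_r(-0.626, 20)       # novices, experiential style
t_experienced_rat = t_from_r(0.457, 20)   # experienced, rational style
t_experienced_exp = t_from_r(-0.096, 20)  # experienced, experiential style
```

Under these assumptions, the t statistic for r = −0.626 exceeds the 0.01 critical value in absolute size, the one for r = 0.457 falls between the 0.05 and 0.01 critical values, and the one for r = −0.096 stays below both, in line with the significance levels reported above.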

Vignettes-Processing Time
To investigate whether the four groups differed in task completion time, we used a modelling approach similar to that for accuracy, with the duration of each task (in seconds, log-transformed) as the dependent variable, and we used the lmer function instead of the glmer function.
As for the vignettes, we present correlations for purely descriptive purposes. Higher experiential scores were negatively correlated with the accuracy of novice (r = −0.472; p < 0.05) but not of experienced clinicians (r = 0.121; p = 0.61); higher rational scores were positively correlated with the mean accuracy of experienced (r = 0.457; p < 0.05) but not novice clinicians (r = −0.331; p = 0.15).

MouselabWeb-Processing Time and Acquisitions
Only one main effect was significant in the omnibus model with time: Response times differed significantly between task types (χ2 (1) = 13.05, coeff = 0.09, p < 0.001, 95% CI [0.04, 0.14]). All participants took longer to complete the clinical than the control tasks.
Only one significant main effect was found in the model with acquisitions: Task type was associated with number of acquisitions. All participants had more acquisitions in the clinical than in the control tasks (χ2 (1) = 41.67, coeff = 1.19, p < 0.001, 95% CI [0.86, 1.55]). All other main effects and interactions were non-significant (all p's > 0.21) (see Table 3). The number of acquisitions indicates how many boxes were opened on average per task.

Discussion
This study investigated the association between thinking style preferences and accuracy [...] control participants on the clinical tasks, and performed equally well on the control tasks.
Secondly, and contrary to our hypothesis, novice and experienced clinical psychologists did not differ in self-reported preference for an experiential thinking style, while novice psychologists had a lower self-reported preference for a rational thinking style than the other groups.
Thirdly, we demonstrated that preferred thinking style was associated with diagnostic accuracy in different ways across groups. In novice psychologists, stronger preferences for experiential thinking were associated with lower accuracy, while in experienced psychologists, a stronger preference for rational thinking was related to greater accuracy. This effect was found in both task formats. We thus did not replicate the finding that a preferred rational thinking style is negatively associated with accuracy (Aarts et al., 2012). Instead, we found a more complex pattern, such that in experienced, but not novice, clinical psychologists, the association between rational style and accuracy was positive rather than negative. Importantly, the current study employed a more extensive design than Aarts et al. (2012) (2 vs. 45 vignettes plus 45 MouselabWeb matrices), and hence our findings are likely more reliable. It should be noted, though, that here we define "experienced" psychologists as those with four or more years of experience, while Aarts et al.'s (2012) criterion was a minimum of ten years of practice. Since our sample contained too few participants with this experience level, and because of the large difference between the novices and the experienced psychologists in years of experience, we could not use experience as a continuous measure.
Our results do not support the explanation that experience does not affect psychodiagnostic accuracy because experienced clinical psychologists prefer to use intuition more than novices. On the contrary: Experienced clinical psychologists did not report a stronger preference for experiential reasoning than novices. They may realize that the clinical environment is not predictable, but is, in Hogarth's terms, "wicked" (Hogarth, 2001), and that they have not had an opportunity to learn (cf. Kahneman & Klein, 2009). No educated intuition seems achievable in this task; clinicians do not engage in deliberate practice and they lack accurate feedback (Tracey et al., 2014). A novice's intuition is uninformed and therefore not conducive to accuracy, which can explain why preferring to use intuition does not help novices be more accurate.
Finally, we found that experienced clinical psychologists were not faster than novices; they were either equally fast (in the vignettes) or slower (in the MouselabWeb matrices). This ties in with the conclusions regarding thinking style preferences: Experienced psychologists do not prefer to use experience-based reasoning more than novices. Not being faster may be an age effect, since experienced participants, both clinical and control, took longer to complete the tasks than the younger participants. One conclusion is that there are no processing differences between novice and experienced participants, at least not with the measures used here. In our sample, more experience in psychodiagnosis did not lead to more automatic processing of diagnostic information.
There are a few limitations that have to be addressed. First, there is a questionable relationship between self-report of thinking strategy and actual strategy use (Nisbett & Wilson, 1977; Wilson, 2002). Higher REI scores indicate a stronger preference for, but not necessarily actual use of, the respective thinking style. However, previous studies demonstrated that REI scores do correlate with performance on tasks that have heuristic-intuitive or reasoned-rational solutions (Witteman et al., 2009).
Another limitation of this study was that the tasks employed were quite easy (average correct responses over 80%). Future research might profit from using only the more difficult tasks. One can argue that the tasks used are somewhat artificial, more so than asking clinicians to interact with an actor-client (Groenier, Beerthuis, Pieters, Witteman, & Swinkels, 2011), and do not mimic actual psychodiagnosis. While in practice diagnostic decision-making indeed does not involve a binary choice, diagnostic classification is a sub-task that needs to be performed before treatment can start and entails clustering the presented symptoms into a disorder label. We used forced-choice tasks to allow us to see which symptom(s) were judged as diagnostic of the presented disorders.
As done previously (e.g., Witteman & Van den Bercken, 2007), vignettes were used to optimize methodological rigor (cf. Bachmann et al., 2008); this greater methodological rigor, however, comes at the cost of the ability to generalize our results to more realistic diagnostic situations.
Finally, though our sample size is typical for this kind of study, increasing the number of participants would increase the confidence that our results generalize to the larger population of clinical psychologists.

Conclusion
The results of the current study indicate that a preference for deliberative thinking is associated with better clinical decision-making, but only for experienced clinical psychologists. For novice psychologists, preferring experiential or intuitive processing is associated with poorer clinical decision-making. We conclude that deliberating about a psychodiagnostic classification serves even the more experienced clinical psychologists, while novices should not trust their intuition. Our results might be used to inform the training of clinical psychologists. Prospective clinical psychologists should be aware of the impact of their thinking style on their diagnostic accuracy, and be encouraged to deliberate and to question their intuition.