Pseudodiagnosticity: The Role of the Rarity Factor in the Perception of the Informativeness of Data

This paper presents the results of a study designed to investigate the pseudodiagnosticity bias, understood as a failure to identify and select diagnostically relevant information. The reported experiment (N = 240) aims to deepen understanding of the role played by the rarity of evidential features in a classical pseudodiagnosticity task. Six experimental versions of the task were constructed: they differed in the rarity of the features proposed and in the percentages (high or low) associated with them. The results show that people's responses appear to be influenced by the percentage values associated with explicit information more than by a rarity factor. When an initial piece of evidence is associated with a low percentage, the percentage of normatively diagnostic answers is greater than when this percentage is high. Furthermore, rarity is not, in itself, a crucial factor in the occurrence of pseudodiagnosticity bias. Rather, the perception of the difference between two evidential features in terms of informative value influences people's responses when orienting a diagnostic evaluation. When people perceive an initial piece of evidence as having greater informative value than a second piece of evidence, they tend to (correctly) move their attention from the focal hypothesis to the alternative one.


Introduction
The term pseudodiagnosticity, first used by Doherty, Mynatt, Tweney and Schiavo (1979), refers to the "failure to identify and select diagnostically relevant information". In a simple case where participants are asked to choose between two hypotheses (e.g., H and not-H), they tend to select and consider the data that refer to only one hypothesis, without considering the information on the alternative hypothesis. In addition to pointing out a sort of incapacity in diagnostic behavior, pseudodiagnosticity is important because of its possible consequences. For example, a physician who does not adequately analyze the risks associated with a patient's symptoms may not recognize the illness from which the patient is suffering, with easily imaginable consequences.
The definition of pseudodiagnosticity proposed by Doherty et al. (1979) can be understood by framing it within the normative model provided by Bayes' Theorem. This theorem allows the probability of an event (e.g., the presence of an illness) to be evaluated in relation to another event (e.g., the symptoms of the illness) whose diffusion within a certain population is known.
Consider the following equation:

$$P(H_1/D_i) = \frac{P(H_1)\,P(D_i/H_1)}{P(H_1)\,P(D_i/H_1) + P(H_2)\,P(D_i/H_2)}$$
H and D stand for (respectively) the hypothesis and data, subscripts 1 and 2 label two mutually exclusive and exhaustive hypotheses, and the subscript i indexes a set of data. The posterior probability P(H1/Di) can be calculated only if both the probability of a piece of data given the hypothesis being tested, P(D1/H1), and the probability of the same piece of data given the alternative hypothesis, P(D1/H2), are known. The information is diagnostically relevant only if it permits the completion of the likelihood ratio P(Di/H1)/P(Di/H2). The likelihood ratio is independent of the base rate. In some cases, it is possible that a datum provides strong evidence (high probability) in favor of a less probable hypothesis (low base rate). However, given that the numerator and the denominator of the likelihood ratio are independent, "observing a datum that is a necessary concomitant of H1, that is, P(D/H1) = 1, may be uninformative if it is also a concomitant of H2" (Beyth-Marom & Fischhoff, 1983).
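The dependence of the posterior on both likelihoods can be illustrated with a minimal numerical sketch. The function names, the equal priors and the percentage values below are hypothetical choices made only for illustration; they are not taken from any of the studies discussed.

```python
def posterior_h1(prior_h1, p_d_given_h1, p_d_given_h2):
    """Posterior P(H1/D) for two mutually exclusive and exhaustive hypotheses."""
    prior_h2 = 1.0 - prior_h1
    numerator = prior_h1 * p_d_given_h1
    return numerator / (numerator + prior_h2 * p_d_given_h2)


def likelihood_ratio(p_d_given_h1, p_d_given_h2):
    """Ratio P(D/H1) / P(D/H2); values above 1 favor H1."""
    return p_d_given_h1 / p_d_given_h2


# Hypothetical scenario: P(D/H1) is fixed at 80%, P(D/H2) varies.
for p_d_h2 in (0.80, 0.20):
    lr = likelihood_ratio(0.80, p_d_h2)
    post = posterior_h1(0.5, 0.80, p_d_h2)
    print(f"P(D/H2) = {p_d_h2:.2f}  LR = {lr:.2f}  P(H1/D) = {post:.2f}")
```

With equal priors, a value of 80% for P(D/H1) leaves the posterior at the prior when P(D/H2) is also 80%, and raises it to about .80 when P(D/H2) is 20%. The same datum is therefore informative or uninformative depending only on the likelihood under the alternative hypothesis.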
A pseudodiagnosticity bias means not selecting the data that are necessary to complete the likelihood ratio and instead choosing the information that refers to a single hypothesis. In this sense, there is a strong connection with confirmation bias (Evans, 1989; Klayman, 1995; Wason, 1960), the tendency to select data favoring one's hypothesis even in the presence of information that does not support it. Doherty et al. (1979) demonstrated the strong consistency of the phenomenon of pseudodiagnosticity with an experimental paradigm that, even with various modifications, has been adopted by almost all researchers engaged in the analysis of pseudodiagnosticity. The experimental paradigm for the study of pseudodiagnosticity (pd task) is substantially an information selection problem, which can be conceptualized as a 2 × 2 table (see Table 1).
In the tasks used to study pseudodiagnosticity bias, participants are given information on cell A in such a way as to concentrate the participant's attention on hypothesis H1, also defined as the focal hypothesis. To decide which of the two hypotheses is true, participants are asked to select information [...]
A theoretical interpretation of this phenomenon is suggested by some authors who claim that "the reluctance of participants to select information about both decision alternatives on inference problems is due to the limitation on the number of hypotheses that can be maintained and operated upon in working memory" (Doherty & Mynatt, 1987; Mynatt, Doherty, & Dragan, 1993). According to this perspective, participants would only be able to operate on a single hypothesis at a time because the pd task requires conscious reflection in working memory, which can necessitate serial rather than parallel processing. The working memory system, because of its limited capacity, would supply "insufficient attentional resources to update two hypotheses at once" (Mynatt, Doherty, & Dragan, 1993). The theoretical interpretation by Doherty et al. (1979) seems consistent with the Dual Process Theory by Evans (2006, 2009; Evans & Over, 1996). According to this theory, the pseudodiagnosticity bias depends on the involvement of the implicit system (pragmatic and automatic), which replaces the action of the explicit system (rational and sequential) because of the limited capacity of the latter. This limit depends on the limits of working memory, to which the explicit system is tightly connected (Evans, Venn, & Feeney, 2002).
Some recent studies have analyzed the conditions that favor or limit the occurrence of this bias. In particular, a series of studies investigated how rare information can affect the pseudodiagnosticity task. These studies showed that, in a classical pd task, when an initial piece of information concerned a rare feature (that provided some supporting evidence for the focal hypothesis), participants tended to select a further piece of information on that rare feature that could provide evidence for the alternative hypothesis, leading to the avoidance of the habitual pseudodiagnosticity bias (Feeney, Evans, & Clibbens, 1997; Feeney, Evans, & Venn, 2000a, 2008). The results obtained were replicated in many experimental studies with different versions of the classical pd task, showing that "people are significantly more confident in a hypothesis supported by rare rather than common evidence" (Feeney et al., 1997).
In one experiment (Feeney, Evans, & Venn, 2000a), the authors asked the participants to identify the model of car that was bought by the participant's sister (Model X or Model Y), knowing that she was interested in two features: the presence of a car radio (very common feature) and a maximum speed higher than 165 mph (rare feature). Participants were given information about the percentage of cars X able to reach a speed higher than 165 mph (80%) and were asked to select a second piece of information to establish which car was bought by their sister. The results showed that rarity significantly affects the pseudodiagnosticity task: 43% of the participants chose to know Cell B (the percentage of cars Y able to reach a speed higher than 165 mph, the normatively correct choice), whereas 44% chose Cell C (the percentage of cars X that have a radio, the pseudodiagnostic choice). Even if C choices were a very small majority, they were significantly lower than the choices obtained in the classical tasks (with two common items). The authors gave another group of participants the same test, providing 10% as the percentage value for the focal hypothesis (Cell A). The results obtained in this version of the test were substantially equal to the ones previously described, demonstrating that the "rarity" factor affects participants' choices independently of the percentage (80% or 10%) associated with it. These results led the authors to conclude that "the effect of feature rarity is mediated via a hard-wired heuristic rather than any sophisticated on-line processing of probabilities" (Feeney, Evans & Venn, 2000a). They claimed that the heuristic principle is sensitive to the rare object features but insensitive to the statistical changes in the information. In a further study, the authors pointed out that, when faced with incomplete information, participants use their "past" knowledge to make inferences. From this perspective, even the perception of rarity seems to depend on their own knowledge. Every participant is able to consider whether a feature is common or rare and to use such an estimate to solve the task (Feeney, Evans, & Venn, 2000b). This result seems coherent with other studies (McKenzie & Mikkelsen, 2000; Oaksford & Chater, 1994) that emphasize the importance of rarity in lay hypothesis testing.
Our study investigates the pseudodiagnosticity bias, especially analyzing the role of the rarity factor in perceptions of the informativeness of data. Given that, as suggested by some authors (Feeney, Evans, & Venn, 2008; McKenzie & Mikkelsen, 2000), people seem to be sensitive to rarity in judging whether the available evidence supports a given hypothesis, our intention was to analyze how and under what conditions rarity can change habitual pseudodiagnostic behavior. In fact, the relationship between the perception of a datum as "common" or "rare" and the perception of its level of informativeness seems less predictable. For example, as suggested by Maggi et al. (1998), in some experimental tasks participants had to evaluate features that were so common that they could assign them a low informative value, at the risk of perceiving these features as completely useless for the task. In this sense, the use of very common features, such as the presence of a car radio in Feeney, Evans & Venn (2000a), could be problematic. Only by adopting scenarios with evidential features that are perceived by participants as sufficiently informative (both with common features and with rare ones) is it possible to understand the relationship between rarity and informativeness.
Our intention was to analyze the case in which the two evidential features chosen for the pd task were both rare. Our hypothesis was that the effect of rarity should be attenuated given that a rare datum turns out to be more informative if it is compared with a common datum. From this point of view, rarity is not informative in itself, but it is the comparison between the rarity of each datum that turns out to be fundamental for the task solution.
Our study aimed to investigate the rarity effect in the classical pd task. In particular, we investigated the following:
• the role of the rarity effect in the perception of the informativeness of data;
• the generality of this factor by analyzing the role played by the percentages associated with the explicit information (rare or common) shown to participants, given that there is little agreement in the literature about its role. For example, Feeney, Evans & Venn (2000a) found that the perception of the rarity of the data led to the reduction of errors with both high and low percentages, whereas Mynatt, Doherty, & Dragan (1993) showed an increase in the number of cell-B choices (resulting in a decrease of the pseudodiagnosticity bias) when P(D1/H1) was less than .5.
Our hypothesis was that the rarity effect (i.e., the perception of the rarity of the evidential features) played a crucial role in the perception of the informativeness of the data shown and that this effect was mediated by the rarity of the second evidential feature and the percentage associated with the evidential feature shown (i.e., D1). In the first case, we thought that rarity could reduce "pseudodiagnosticity" only when the evidential feature was perceived as rare in comparison with D2 (with D1 and D2 both being rare, we hypothesized a minor effect). In the second case, we thought that when P(D1/H1) was less than .5, participants could evaluate H1 as less plausible, thus considering the importance of the information associated with cell B (the correct choice) or cell D, both referring to H2.

Method

Participants
Two hundred forty students at the University of Milano-Bicocca (who were not experts in statistics) were randomly selected for inclusion in the study.

Materials
The problem used for the experiment was structurally identical to the problems used by Mynatt et al. (1993; the car problem) and subsequently used (in revised forms) in the literature. Six experimental versions of the car problem were used. These versions differed in the type of features proposed (in terms of rarity) and in the percentages (high or low) associated with them (see Table 2).
A pre-test was carried out so that only features perceived by people as sufficiently informative were selected for our study (see Table 2). In our pre-test, the feature used by Feeney, Evans and Venn (the presence of a car radio) was perceived as "not informative".

Procedure
Each participant was given only one version of the car problem. A between-subjects design was used (40 participants for each version). Given P(D1/H1), that is, the percentage of model X cars with a top speed higher than 220 km/h (Versions 1, 2, 3 and 4) or the percentage of model X cars equipped with manual air conditioning (Versions 5 and 6), participants were asked whether they wanted to discover cell B, cell C or cell D (i.e., P(D1/H2), P(D2/H1) or P(D2/H2)). Participants performed the task on their own. The instructions were given verbally, and there were no time limits.

Results
A chi-square analysis was conducted to determine the influence of the rarity of the features and of the percentages associated with them on the pseudodiagnosticity bias: normatively correct choices (B cell choices) were compared with incorrect choices (C cell and D cell choices were aggregated).
The results (see Table 3 for the overall results) showed a minor effect of rarity on pseudodiagnosticity that partially disconfirmed the results obtained by Feeney, Evans and Venn (2000a).
The rarity of the evidential feature was not sufficient to orient participants' choices to reduce the pseudodiagnosticity bias. The percentage of correct choices did not significantly differ in relation to the presence (30% of cell B choices; data from Versions 1, 2, 3 and 4 were aggregated) or absence of a rare first feature (31.3% of cell B choices; data from Versions 5 and 6 were aggregated): χ²(1, N = 240) = .039, p > .05.
In contrast, as we hypothesized, the percentage of correct choices differed significantly in relation to the difference in rarity between the two features. When there was a difference (data from Versions 1 and 2 were aggregated), the percentage of correct choices was greater (40%) than when there was no difference (25.6%; data from Versions 3, 4, 5 and 6 were aggregated): χ²(1, N = 240) = 5.207, p < .05. Data supporting the importance of this element were obtained by a chi-square analysis that showed a marginally significant difference in correct choices between Version 1 and Version 3 (χ²(1, N = 80) = 3.117, p < .10) and a significant difference between Version 2 and Version 4 (χ²(1, N = 80) = 5.115, p < .05).
The results also supported our second prediction by revealing a significant role of the percentages associated with the evidential feature. A chi-square analysis showed that when the percentage associated with the evidential feature was low (data from Versions 2, 4 and 6 were aggregated), the percentage of correct choices was greater (40.8%) than when this percentage was high (20%; data from Versions 1, 3 and 5 were aggregated): χ²(1, N = 240) = 12.304, p < .001. It was noteworthy that the only condition in which percentages did not seem to be influential was when both of the features were common (χ²(1, N = 80) = 1.455, p > .05).
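The aggregated comparisons reported above can be checked with a short sketch. The cell counts used here are not reported in the text: they are an assumption, reconstructed by multiplying each reported percentage by its group size (40 participants per version) and rounding to the nearest integer. The statistic is assumed to be a Pearson chi-square without continuity correction, and the function name is hypothetical.

```python
def pearson_chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# Each entry: (cell B choices, other choices) for group 1, then for group 2.
# Counts are reconstructed from the reported percentages (an assumption).
comparisons = {
    "rare vs. common first feature": (48, 112, 25, 55),    # 30% of 160, 31.3% of 80
    "difference in rarity vs. none": (32, 48, 41, 119),    # 40% of 80, 25.6% of 160
    "low vs. high percentage": (49, 71, 24, 96),           # 40.8% of 120, 20% of 120
}

for label, cells in comparisons.items():
    print(f"{label}: chi-square = {pearson_chi_square_2x2(*cells):.3f}")
```

Under these assumptions the three statistics come out at .039, 5.207 and 12.304, in agreement with the reported values, which suggests that the reconstructed counts are plausible.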
Spontaneous justifications by participants suggested that they tended to combine the information from the rarity of the evidential feature and from the percentage associated with it.When the evidential feature was rare and the percentage was high (in Versions 1 and 3), participants tended to mentally fill in the empty B cell with a low percentage (e.g., "...given that this feature is rare, the probability that model Y cars also have this feature should be low...otherwise the feature would be too common"), therefore opting for the common pseudodiagnostic error (cell C choice).

Discussion
The role attributed to the rarity of the evidential feature in the classical pseudodiagnosticity task should be reduced. First, our results showed that the rarity factor did not act like a heuristic, independently of the associated percentage values, as hypothesized by Feeney, Evans and Venn (2000a). Participants' answers appeared to be influenced by the percentage values associated with the explicit information more than by the rarity factor. When high percentages (80%) were provided, participants tended to focus on a single (focal) hypothesis, therefore exhibiting the pseudodiagnosticity bias. In contrast, when low percentages (10%) were provided, participants seemed to move their attention to the alternative hypothesis, answering in a diagnostically correct way. These results were in line with the ones reported by Mynatt et al. (1993).
Furthermore, rarity is not, in itself, a crucial factor in the occurrence of the pseudodiagnosticity bias; rather, the crucial factor is people's perceptions of the difference between the two features (D1 and D2) in terms of their informative value (see similar conclusions drawn from the analysis of the Wason selection task by Oaksford and Chater, 1994, 1997). In this direction, our results appeared similar to those obtained by Vallée-Tourangeau and Villejoubert (2010; Villejoubert & Vallée-Tourangeau, 2012), who underlined the importance of information relevance in pseudodiagnostic reasoning.
Our study, though showing the consistency of the pseudodiagnosticity bias, helps to highlight some limitations of the standard pseudodiagnosticity paradigm, first introduced by Doherty et al. (1979) and subsequently adopted by many authors, given that the rigidity and standardized form of the paradigm could limit its practical applicability. This paradigm (in which two hypotheses and two pieces of data are shown and then, after providing information on one hypothesis, participants are asked to select a further piece of information to identify the correct hypothesis) has the advantage of being clear and intelligible. However, it appears to be not very flexible and, perhaps, barely applicable. Starting from this perspective, some recent studies have attempted to overcome the limits imposed by this paradigm. For example, Feeney et al. (1997) proposed a task that, while maintaining the standard pseudodiagnosticity task structure, introduces rating scales for the participants' confidence in the hypotheses. This additional task should permit an analysis of how the participants change their opinions as a consequence of the obtained information, allowing a more precise and qualitative investigation of pseudodiagnosticity.
The use of a qualitative methodology is particularly appropriate to deepen understanding of the reasons behind the pseudodiagnosticity bias. For example, answers that seem discordant may depend on similar cognitive strategies (and motivations). The selection of the pseudodiagnostic option may be guided not only by a confirmation strategy but also by a falsificatory one. Given the dichotomous structure of the scenarios usually adopted, selecting the pseudodiagnostic option does not necessarily involve verification of the focal hypothesis because of the possibility of testing whether the alternative hypothesis is false. If one chooses the normatively wrong option C and finds no evidence, one could use this datum (a low percentage) to support the alternative hypothesis.
Future research on pseudodiagnosticity should attempt to identify the precise role played by the different factors involved in a pseudodiagnosticity task, under more realistic experimental conditions, if possible.

Table 1. Standard pd task structure.

Table 3. Experiment results: percentage of choices.