Options for Evaluating Treatment Benefit in MCI and Prodromal Alzheimer's Disease: Content Validity of the Perceived Deficits Questionnaire (PDQ) in Patients with Early Symptoms of Cognitive Decline

Background: Many instruments used to assess outcomes of treatment for Alzheimer's disease (AD) have no published evidence of their relevance and content validity in earlier stages of the disease, i.e., mild cognitive impairment (MCI) or prodromal AD (pAD). The objective of this project was to evaluate the applicability and usefulness of the Perceived Deficits Questionnaire (PDQ) as an outcome measure in this population using qualitative methodology to support content validity. Method: Two waves of qualitative interviews were conducted in patients with MCI and pAD. Results: Evidence for content validity and usefulness of the instrument was demonstrated in the patient interviews. Minor modifications to the wording of several items were suggested for the PDQ and the recall period was changed. Conclusion: With these modifications, the PDQ has improved content validity and relevance. It is therefore a potentially useful outcome measure to evaluate therapeutic benefit in interventional studies of patients in the early stages of AD.


Introduction
The value of patient-reported outcomes (PRO) in measuring the treatment effects in patients with mild cognitive impairment (MCI) and prodromal Alzheimer's disease (pAD) is increasingly recognized. However, the availability of PRO instruments with established reliability and validity for this early part of the Alzheimer's disease continuum is so far limited [1]. Furthermore, there is ongoing debate about how to diagnose those patients with cognitive impairment who are most likely to progress to AD, and various terms, such as MCI due to AD [2], amnestic MCI [3], and prodromal AD [4], have all been proposed, all with subtle differences in their definition. For example, Albert et al. [2] propose both clinical criteria and research criteria for MCI due to AD, the latter of which incorporate biomarkers to increase the predictive validity of the diagnosis. Prodromal AD is described as an amnestic syndrome of the hippocampal type and can be identified clinically by 1) a very poor free recall; and 2) a decreased total recall due to an insufficient effect of cueing, with a high sensitivity of 79.7% and a specificity of 89.9% [4]. Clearly, the various MCI diagnostic criteria are closely related, and there is a need for clinically relevant instruments that are applicable across categories. To underscore this point, Ganguli and colleagues recently conducted a one-year study looking at various approaches to diagnosing MCI and their ability to predict disease progression, and found few differences in predictive utility, proposing that research criteria be validated at the community level before being incorporated into clinical practice [5].
PROs represent the voice of the patient in treatment development [6]; therefore, as more studies focus on treatments for MCI and pAD, there is a growing need for validated instruments for this part of the AD disease spectrum. One PRO that has previously been used in this population for evaluating medical treatments is the Perceived Deficits Questionnaire (PDQ) [7]. Originally designed to capture the decline in cognitive function among multiple sclerosis and traumatic brain injury patients [7], the PDQ provides a way to measure the patient's own perception of his or her cognition in specific domains: attention and concentration, retrospective memory, prospective memory, and planning and organizing [7]. One major study of MCI found that the PDQ was able to distinguish differences in treatment better than other well-established measures used in this population [8]. Assessing the relevance and content validity of a self-reported outcome instrument has recently been mandated by the United States (US) Food and Drug Administration (FDA) to obtain labeling claims on the same instrument [9] [10]. The FDA guidance recommends the use of qualitative research methods, typically cognitive interviews with patients, in ensuring PRO content validity [9]. Interviews are usually conducted among patients with the condition to ensure that the instrument is comprehensible and relevant to the patient population of interest. However, the relevance, and hence the content validity, of the PDQ was never formally established with MCI or pAD patients, which limits our understanding of whether the instrument can accurately measure patient-relevant concepts of interest in this population.
Therefore, the primary objective of this study was to assess the relevance and content validity of the PDQ in patients using qualitative interviewing methodology (Wave 1). A secondary objective of this study was to adapt the measure to improve content validity based on the patients' evaluations and, subsequently, to evaluate the revised instrument using the same methodological approach (Wave 2). The comprehensibility of the items, appropriateness of the recall period, and the response options of the modified measure were also evaluated. These steps were undertaken to ensure that the PDQ was fit for use in clinical trials with MCI and/or pAD populations, so that patients would not encounter questions that they clearly did not understand or could not answer. The project was designed to enable confident use of the PDQ in populations with early symptoms of cognitive decline in clinical practice and clinical trials in order to provide a better evaluation of treatment benefit in patients with these symptoms.

Study Design
This was a cross-sectional, qualitative research study conducted at two clinical sites in the continental United States (California and Vermont). As discussed above, the study was conducted in two waves in order to test the measure modifications made after the first wave of interviews. Wave 1 took place in October and November 2011; Wave 2 took place in June 2012. In accordance with the FDA guidance on qualitative research, one-on-one in-person cognitive interviews were conducted during which participants completed the PDQ while providing feedback on the content of each item. The sample size of each wave was determined by the saturation of concepts, as previously defined. Institutional review board (IRB) approval was obtained prior to the start of the study and after instrument modification, and the study was conducted in accordance with the Helsinki Declaration of 1975, as revised in 1983. All participants provided written informed consent prior to the start of the interview.

Participants
Potential subjects were identified from patient databases or medical records at each site. Eligible patients were at least 50 years of age, with a diagnosis of MCI or pAD and a Mini-Mental State Examination (MMSE) score between 24 and 30 within the past three months. For Wave 1, participants with an MCI diagnosis were specifically recruited and had to meet criteria based on neuropsychological testing (cognitive testing one and a half standard deviations below normal for age and education) as outlined by Albert and colleagues [2]. During Wave 2, participants were screened for pAD with a Clinical Dementia Rating (CDR) of 0.5 within the last six months of study baseline and a subjective complaint of worsening memory over the previous 12 months. We identified this population in order to specifically target individuals more likely to progress to AD. Although the diagnoses of MCI due to AD and pAD differ in some respects, from a symptomatic perspective there is considerable overlap, and we consider the results of these interviews generalizable to both MCI and pAD. Subjects were excluded from participating in Waves 1 and 2 if they were clinically depressed, had a history of alcohol or substance abuse, had experienced a negative life event, had taken drugs that affected cognition, or were mentally incapable of participating in the interview.
We have listed all patient inclusion and exclusion criteria in Appendix A1 (supplementary material). Overall, 19 pAD or MCI patients participated in the two waves of interviews. Wave 1 included 11 subjects diagnosed with MCI, and Wave 2 included eight subjects diagnosed with pAD. Table 1 provides a full list of the sociodemographic characteristics of all subjects for both waves.

Instruments
The PDQ is a 20-item questionnaire that covers four domains of cognitive function: attention and concentration, retrospective memory, prospective memory, and planning and organizing. The items were rationally derived or selected from a similar questionnaire developed for head-injured individuals [11]. Items are answered on a five-point Likert scale, with 0 = never, 1 = rarely, 2 = sometimes, 3 = often, and 4 = almost always. A recall period of four weeks is traditionally used for each item. Subscale scores can be calculated by summing raw scores for the relevant five items (subscale range is 0-20), and the total score is calculated by summing raw scores for all of the PDQ items (scale range is 0-80). A higher score indicates greater perceived cognitive impairment.
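The scoring rule described above can be illustrated with a minimal Python sketch. The function name `score_pdq` and the item-to-subscale assignment in `SUBSCALES` are assumptions made for illustration only: the text states that each subscale sums five items but does not list which items belong to which domain, so the interleaved mapping below should be checked against the PDQ scoring instructions before any real use.

```python
# Hypothetical item-to-subscale mapping (1-based item numbers).
# ASSUMPTION: this interleaved assignment is not given in the text.
SUBSCALES = {
    "attention_concentration": [1, 5, 9, 13, 17],
    "retrospective_memory": [2, 6, 10, 14, 18],
    "prospective_memory": [3, 7, 11, 15, 19],
    "planning_organizing": [4, 8, 12, 16, 20],
}


def score_pdq(responses):
    """Sum raw item scores (0 = never ... 4 = almost always).

    Returns a dict with one score per subscale (range 0-20)
    and a "total" score (range 0-80).
    """
    if len(responses) != 20:
        raise ValueError("the PDQ has 20 items")
    if any(r not in (0, 1, 2, 3, 4) for r in responses):
        raise ValueError("each response must be an integer from 0 to 4")
    scores = {
        name: sum(responses[i - 1] for i in items)
        for name, items in SUBSCALES.items()
    }
    scores["total"] = sum(responses)
    return scores


# Example: a respondent who answers "sometimes" (2) to every item.
example = score_pdq([2] * 20)
print(example["total"])  # 40
```

Under this sketch a respondent answering "sometimes" to every item scores 10 on each subscale and 40 in total, half of each maximum.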

Data Collection Procedure
Prior to participating in the interviews, each participant completed an eligibility form, the Patient Health Questionnaire-9 (PHQ-9), and the MMSE, administered by the clinical site or an Evidera staff member. During the interviews, subjects completed a version of the PDQ (slight modifications were made between waves) followed by a cognitive debrief of the instrument, in which patients reported their thoughts and feelings with regard to each item and their understanding of each item was assessed. While participants completed the PDQ, pauses were taken between questions so that interviewers could ask questions and debrief each item.
The interview began with open-ended questions about symptoms of memory impairment, then moved to questions about the instructions of the questionnaire, for example, "Were there any words or phrases in the instructions that were difficult to understand?" The interview then transitioned to the participant's understanding and interpretation of each item, including whether: 1) the questions were clear and easy to understand; 2) the response options were clear and appropriate; 3) the recall period was appropriate for the concepts being addressed; and 4) individual concepts could be expressed in alternative ways.
Interviews were conducted using a semi-structured interview guide. All interviews were conducted in English and took approximately 90-120 minutes to complete. Upon completion of the interview, each participant completed a basic sociodemographic information form. Clinical sites also completed a clinical form about the subjects' MCI/pAD and recorded patient medications.

Data Analysis
Descriptive summary statistics were used to characterize the sociodemographic and clinical characteristics of the subjects. Content analysis was used to evaluate the information gathered during the qualitative interviews. A qualitative analysis software program, ATLAS.ti [12], was used as a tool to systematically organize and categorize the text in the interview transcripts. The research team developed a coding dictionary to organize feedback provided by the participants about specific items or sections in the measures, and words and phrases were coded into groupings of feedback for items and sections of the measures. The output created from the ATLAS.ti coding included participant quotes linked to key concepts from the cognitive interviews. The conceptual quotes from each wave were analyzed and used to summarize the results.

Results
The demographic characteristics of participants in Waves 1 and 2 were similar. There were more men than women, the most commonly reported employment status was "retired," and the majority reported more than one comorbid condition. Table 1 includes the full description of subject and informant sociodemographic information.
Table 2 provides the information obtained from the medical chart review. On average, subjects in Wave 1 reported symptoms of cognitive impairment and a diagnosis of MCI more recently than participants in Wave 2 reported a diagnosis of pAD; however, the difference was not statistically significant. The mean MMSE score for both samples was 28.2 out of 30, indicating very mild impairment. During the cognitive interviews, each patient completed a version of the PDQ, either in its original form (Wave 1) or the amended version (Wave 2).

Wave 1 PDQ Results
We assessed participants' understanding of each item's intent and the timeframe used in the questionnaire. All participants understood 15 of the 20 items. The five items that were partially misunderstood or misinterpreted by one or more participants were Items 6, 10, 12, 13, and 15. Of these, three items (6, 10, and 13) were misunderstood by only one participant, and two items (12 and 15) were misunderstood by two participants. For Item 12, "have trouble getting started, even if you had a lot of things to do?", participants suggested that they generally did not have a lot to do.

"It, it-to me 'getting started,' and 'lots of things to do' didn't relate and that's why I reread it. Uh, I have trouble getting started, uh, started doing what, you know? Even if I have a lot of things to do, uh, that-the different parts of that question was confusing to me. Is there a different way that we could word it that you can think of? I, I, I think I would say 'having trouble getting started when you have lots of things to do'. [Participant 01-02-109, Item 12]"
Regarding Item 15, participants said that forgetting to turn on the alarm clock and forgetting to turn off the stove were not the same in terms of their consequences.

"No, I don't. Yeah, I don't. But in my case it was okay because, um, you know, I don't rely on that alarm clock. So, I don't have to answer yes or no to that. To me that's an immediate no and the stove is an immediate no. But maybe had it been stove and oven, you know, or a water faucet, you know, then to me those are danger things. The alarm to me is not danger. That's just, you'll be late to work if you don't do that. [Participant 01-02-107]"
The PDQ asks patients to consider the previous four weeks when answering the questions. We asked what actual timeframe participants were considering while answering each question; their responses varied between and within items. For only three of the 20 items, one or more participants indicated that they were using a four-week timeframe, and four participants indicated that Item 4 did not require a timeframe at all, since they had always been organized or disorganized. All other item responses ranged from days to weeks to a lifetime. In other words, rather than simply considering the four-week recall period, participants seemed to think back to the last example of the behavior or symptom described in the item. We discussed this issue with the author of the PDQ, and a decision was made to narrow the recall period to provide a more reasonable timeframe for subjects to consider when answering the questions. The new timeframe would also provide ample recall to evaluate treatments.
Overall, Wave 1 results showed the PDQ was a relevant, well-understood questionnaire for MCI patients. Based on the cognitive interviews and a discussion with the PDQ's author, two changes were made to the instrument between Wave 1 and Wave 2: 1) changing Item 15 to "forget to do things like turn off the stove, or lock the door?"; and 2) changing the recall period from four weeks to one week.

Wave 2 PDQ Results
The Wave 2 interview guide included questions similar to those used in the prior cognitive interviews, as well as questions on item relevance and item response options. Figures 1 and 2 show, for each PDQ item, the percentage of participants who understood the item and found it relevant (Figure 1) and who understood the response options and used the intended recall period (Figure 2).

Understanding
Of the 20 items debriefed during the interviews, 17 items were interpreted as the author intended by at least 70% of the patients. The three items which caused the greatest difficulty for participants were Items 5, 8, and 12. Item 8, "have difficulty planning what to do in the day?", was the most difficult to interpret, with only 38% (n = 3) of participants demonstrating an interpretation of its meaning that matched the intent of the item. The reasons why participants found this item difficult related to variations in scheduling or reliance on a calendar to dictate their plans. Two participants said it was not relevant because they did not have many plans to make. It was decided to keep this item in the instrument as some people did experience this difficulty, and hence we felt that it was best to keep measuring this effect.
Close to half of the participants (n = 3) had problems with Item 5, "have trouble concentrating on what people are saying during a conversation?" The main reason was misinterpretation of the question as relating to the importance of a particular conversation, instead of the act of forgetting.
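As an illustration of how such per-item percentages relate to participant counts, the following Python sketch converts counts to whole-number percentages and flags items falling below a chosen threshold. The helper names and the "Item A" count are hypothetical; only the Item 8 count (3 of 8) and the 70% threshold come from the text. Rounding halves upward is an assumption that happens to reproduce the reported figures (3 of 8 as 38%); Python's built-in `round` uses round-half-to-even and would give different results for some counts, such as 1 of 8.

```python
import math


def percent_half_up(count, n):
    """Percentage of n participants, rounded to a whole number (halves round up)."""
    return math.floor(100 * count / n + 0.5)


def items_below_threshold(understood_counts, n, threshold=70):
    """Return {item: percent} for items understood by fewer than `threshold` percent."""
    flagged = {}
    for item, count in understood_counts.items():
        pct = percent_half_up(count, n)
        if pct < threshold:
            flagged[item] = pct
    return flagged


# Item 8 count is taken from the text; "Item A" is a made-up placeholder.
counts = {"Item 8": 3, "Item A": 7}
print(items_below_threshold(counts, n=8))  # {'Item 8': 38}
```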

"Well, there's two variables, one, depending on who I'm talking to and two, whether I'm interested in what they are having to say or what the topic is about I should say. [Participant 001-104]"
We found similar difficulties as in Wave 1 for Item 12, "have trouble getting started, even if you had a lot of things to do?" Item 12 was equated to laziness or procrastination by the participants who did not understand the intent of the question.
"It means that, um, I know I have these things that are uh, have to be completed, but I'm procrastinating and, uh, making, rationals of why I shouldn't do it, shouldn't do them. And, uh, being lazy, you know, and no-no account. [Participant 001-104]"
As suggested during the first wave of interviews, we amended the item to say "have trouble getting started, when you had a lot of things to do?" We also noted that Item 11, "forget the date unless you looked it up?", was answered correctly by most participants (n = 7); however, the responses indicated a misinterpretation of the question. At least three participants referred to using a calendar or phone to help them keep track of the date, not understanding the relationship to thinking and memory associated with the item. We again decided to keep the item in the revised item list as it is useful to keep an account of this effect for some patients.

Relevance
We asked participants whether an item was relevant to their memory issues or thinking problems; many participants indicated that an item was not relevant if they personally did not face the issue. For example, Item 12, "have trouble getting started, even if you had a lot of things to do?", was rated as the least relevant, with no participants indicating relevance to their memory problems. However, the reasons included statements such as "Because I have zero occurrences" [Participant 001-107] and "Because it would only-if-if there are things that I have to do, um, they're going to be put on my, uh, calendar" [Participant 001-102]. Item 15, "forget to do things like turn off the stove, or lock the door?", and Item 17, "have trouble holding phone numbers in your head, even for a few seconds?", were identified as relevant to the participant by less than 30% of the sample. Item 3, "forget what you came into the room for?", was the most relevant item based on participant responses, with six of the eight participants indicating it was relevant.

Recall Period
On average, five of the eight participants indicated that they were thinking about the last week on any given item. This was a higher rate of adherence than was observed for the four-week recall period in Wave 1. For Item 4, "have trouble getting things organized?", only one participant indicated that he was thinking about the last week while answering the question, whereas for Item 10, "forget what you did the night before?", almost all participants (n = 7) indicated that they were thinking of "last week." Other timeframes commonly reported by participants included the past in general; longer than a week; specific timeframes like a month, a year, last year; a couple weeks; and not applicable or never happened.

Response Options
The response options were generally well understood by participants. When asked to explain why a certain response was selected compared to adjacent options, patients were able to provide clear reasoning for their choice 92.5% of the time. On average, seven of the eight participants were able to differentiate between the response options on any given question. For seven of the 20 questions, all the participants were able to select a response option and identify why they had selected it. Item 9, "have trouble concentrating on things like watching a television program or reading a book?", gave the most participants trouble in choosing the response options, with three participants unable to explain why they had selected their choice over another.
"And, uh, when I'm reading a book because I like to read so much I don't usually having concentrating on telling-on-on what is-the content of the book is. So, I-I really don't know how to answer that. Uh, I would say-I wouldn't say never, but I would-I wouldn't say-uh, so-uh, so I probably would have to say sometimes."

Discussion
The PDQ was cognitively debriefed in MCI and pAD samples to establish the relevance and content validity of the instrument where there was limited prior evidence. Overall, the items in the PDQ were largely well understood by most patients with MCI and/or pAD during Wave 1 and Wave 2, and although only a small sample of participants was included in each wave of the study, there was overall consensus across the concepts and questions reviewed. Only one item (Item 15) was modified between waves, in order to equate the severity of memory failure consequences between the examples used in the item. In other words, the example of setting the alarm was changed to locking the door, as setting an alarm was not relevant to this mostly retired population and the consequences of oversleeping were less severe than leaving the stove on. The timeframe was also modified from four weeks to one week as a consequence of the first wave of debriefings, which enhanced participants' ability to justify and explain their response choices. These modifications were well accepted during Wave 2 interviews. During the interviews, the majority of the items (12 out of 20) were seen as relevant by the majority of patients with pAD. Eight items were identified by participants as not relevant to their current experience; these items were mostly part of the planning and organizing domain of the PDQ (three of five items). After further examination of the transcripts, this seemed largely due to participants reporting the relevance of a particular item to their own lives, rather than considering the relevance of that item to all patients with early signs of AD. However, when the PDQ is used as an outcome measure, such an item remains relevant for a person who has experienced problems in that area, and an improvement in score could identify a beneficial treatment.
While the recall period for the PDQ was shortened from four weeks to one week to improve validity, there is still debate about what constitutes the most appropriate recall period for this patient population. When the recall period was set at four weeks, there was considerable variation in the actual recall period used. To some extent, that pattern was less observed when the recall period was only one week, as subjects could conform to the one-week recall period much more easily. Nevertheless, when participants did not have an experience or a memory difficulty during the past week, they would simply go back to their last experience of the event and answer accordingly.
Our study also raised the issue of denial and potential lack of insight by participants struggling with early symptoms of a debilitating illness. For example, some respondents indicated that items were not relevant because they did not want to acknowledge that they were experiencing such symptoms, or they lacked some insight when they had found alternative strategies such as looking at their phone to remember the date.
In addition, we want to raise awareness that many pharmaceutical companies are currently contributing to the development of another MCI-specific patient-reported outcome measure to evaluate the therapeutic benefit of their drugs in the MCI population via the PRO Consortium of the Critical Path Institute (C-Path). The PDQ tested here measures a different concept (cognitive impairment) than the other instrument, which is focused on the measurement of complex activities of daily living (CADL) and interpersonal functioning (IF).
In conclusion, we found that the patient insights were valuable in improving the tool, and given that the PDQ has demonstrated content validity, we would advocate its further use to evaluate therapeutic benefit in clinical trials of patients with MCI or pAD. Our results suggest that items in the planning and organization sub-domain may be less of an issue (low relevance) for some of these patients, but improvements or lack of deterioration in the scores of this domain would be valuable for these patients. The modifications made to the PDQ improved the instrument's content validity and relevance in this population, therefore making the instrument more suitable for administration in clinical trials.

* One participant selected both "White" and "American Indian or Alaska Native."
** Participants checked all that applied.

Figure 1. Percent of participants who understood an item or found the item relevant in the PDQ (Wave 2, n = 8).

Figure 2. Percent of participants who understood the response options and recall period for the PDQ (Wave 2, n = 8).
Item 18, "forget what you did last weekend?", had one (13%) participant indicate the item was relevant. All others gave reasons such as "I remember what I did the whole weekend" [Participant 001-101]. One participant said it wasn't applicable to him, but he could see how it would be relevant for someone else: "If the weekend wasn't somewhat special, it might be a little bit more difficult to remember." [Participant 001-102]