Mapping the EQ-5D-5L Utility Scores: A Pilot Study in Orthopaedic Patients

Objective: The purpose of this study is to investigate whether the LE CAT, PROMIS PF CAT, Depression CAT, or Pain CAT can be used as a proxy for the EQ-5D-5L. Background: Patient-reported outcome measures have become vital tools for physicians to understand the effectiveness and value of treatment and care. Methods: This study was conducted in 2012 with 116 patients who completed the EQ-5D-5L and several patient-reported outcome instruments in a university orthopaedic clinic. Regression analyses were conducted to predict EQ-5D-5L index scores from the LE CAT, PROMIS PF CAT, Depression CAT, and Pain CAT. Results: All predictors, separately or combined, significantly predicted the EQ-5D-5L index scores (p < 0.0001). The LE CAT was the best predictor; it alone accounted for 37% of the variability in the EQ-5D-5L. When combining patient-reported outcome measures, the best predicting model was the one consisting of the LE CAT, Depression CAT, and Pain CAT; together they explained 43.9% of the variance in the EQ-5D-5L. Conclusions: The findings provide encouraging evidence that the LE CAT, PF CAT, Depression CAT, and Pain CAT can be used alone or in combination as a proxy for the EQ-5D-5L. Researchers have the option of using these patient-reported outcome measures for economic evaluations and medical intervention studies.


Introduction
Assessment of health outcomes has progressed greatly in the past decade with the introduction of new tools that rely on advanced statistical methodologies. Patient-reported outcome measures have become vital tools for physicians to understand the effectiveness and value of treatment and care. Physicians traditionally relied mainly on clinical measures to assess and compare the effectiveness of treatment. Even though clinical measures provide a great deal of information to the physician and patient, they do not necessarily reflect how a patient feels in their everyday life [1]. There is some evidence that clinical and patient-reported outcome measures are not strongly correlated, and therefore the two kinds of measures should be used in conjunction with each other when assessing the condition of patients [2]-[4]. As a result, physicians are increasingly relying on the integration of clinical measures and patient-reported outcome measures to assess treatment outcomes and interventions.
There is no universal patient-reported outcome measure that is used by all physicians. Instrument selection varies depending on the physician's background, familiarity with the tools, and area of specialization. The Medical Outcomes General Health Survey and the European Quality of Life-5 Dimensions (EQ-5D) are very popular among physicians for measuring health outcomes and are used to derive quality-adjusted life-years in cost-utility analyses. Other instruments, such as the National Institutes of Health sponsored Patient-Reported Outcomes Measurement Information System (PROMIS) Physical Function (PF) Computerized Adaptive Test (CAT) and the Lower Extremity (LE) CAT, have demonstrated desirable psychometric properties and are increasingly applied in clinical studies. The PROMIS PF item bank was developed using item response theory and includes 124 items measuring upper extremity, lower extremity, central, axial, and instrumental activities of daily living [5]. The PF CAT draws items from the PF item bank for test administration. It has been demonstrated to exhibit validity and reliability in the orthopaedic population and for patients undergoing foot and ankle surgery [6]. A recent study suggested that less instrument bias could be accomplished by developing a lower extremity instrument from the PF item bank. As a result, the 79-item LE CAT item bank was developed in an effort to specifically target patients with lower extremity conditions [5] [7]. The PROMIS also includes item banks that address depression and pain.
When considering cost-utility analyses, the EQ-5D is often adopted and its preference-based index scores are used to estimate quality-adjusted life-years for economic evaluation. Its index scores are derived from measures across five health dimensions: mobility, self-care, usual activities, pain/discomfort, and anxiety/depression. The EQ-5D can be a very useful tool for clinicians and researchers. The EQ-5D-3L (three response levels) has been shown to be a valid and reliable instrument in many contexts, but it has also received scrutiny for lacking sensitivity and failing to capture disease-specific conditions [8]-[12]. In response, the European Quality of Life Group developed a five response level version of the EQ-5D, the European Quality of Life-5 Dimensions 5 Response Levels (EQ-5D-5L). Compared to the previous version with three response levels, the EQ-5D-5L has demonstrated a lower ceiling effect [11], good convergent validity [12], and content and face validity [8].
Researchers in the past have conducted studies to map preference-based index scores from general health outcome profile measures [13]-[17]. Several studies have been published detailing the mapping of the EQ-5D-3L index scores from the Medical Outcomes General Health Survey using large national samples [18]-[21]. Recently, scores from the PROMIS global item bank and selected domain item banks have also been mapped to the EQ-5D-3L in the general population and a few clinical samples [22]. Given the increasing utilization of patient-reported outcome measures and the application of PROMIS instruments in clinical settings and in research studies, mapping health preference scores from patient-reported outcome measures can be very useful: the mapped scores can be used in cost-effectiveness studies when health preference measures are unavailable. To our knowledge, no published study to date has reported the mapping of the EQ-5D-5L from any patient-reported outcome measures or health profiles.
The purpose of this study is to investigate whether EQ-5D-5L index scores can be mapped from the LE CAT, PF CAT, PROMIS Depression, and/or PROMIS Pain measures. In doing so, we are ultimately interested in whether the scores from any of these patient-reported outcome instruments can be used as a proxy for health preference scores when health preference assessments such as the EQ-5D are absent.

Data Collection
In 2012, data were collected at a large university orthopaedic center from patients aged 18 years or older who visited for lower extremity orthopaedic problems. This university orthopaedic center had over 90,000 total patient visits and over 3,000 total surgeries per year. At the end of their clinic visit, patients were asked to voluntarily complete the EQ-5D-5L, the LE CAT, the PF CAT, the PROMIS Depression CAT, and the PROMIS Pain CAT instruments (see Appendices 1-5) along with a few demographic questions. These questionnaires were administered electronically using tablet computers. The order of questionnaire administration was randomized to minimize potential bias from test administration. Institutional Review Board approval was obtained prior to the study. Medical and research staff obtained informed consent from the patients, and the rights of the subjects were protected. The final sample consisted of 116 patients.

Analytic Approach
Descriptive statistics were computed to summarize patient demographic and instrument characteristics. All CAT scores used in the analyses were T scores, which have a mean of 50 and a standard deviation of 10. The EQ-5D-5L utility index scores were derived from the responses to the five dimensions of the EQ-5D questionnaire. To examine the association between the patient-reported outcome measures (LE CAT, PF CAT, Depression CAT, and Pain CAT) and the EQ-5D-5L index value, we examined the Pearson product-moment correlations among these instruments. To investigate whether the EQ-5D-5L can be mapped from the patient-reported outcome measures, we applied ordinary least squares linear regression analyses to predict the EQ-5D-5L from the patient-reported outcome measures. Six regression models were fitted to assess the individual contribution of each patient-reported outcome measure and the combined contribution of the measures. Model 1 included only the LE CAT as the predictor of the EQ-5D-5L; model 2 included only the PF CAT; model 3 included only the Depression CAT; model 4 included only the Pain CAT; model 5 included the LE CAT, Depression CAT, and Pain CAT; and model 6 included the PF CAT, Depression CAT, and Pain CAT. Intraclass correlation coefficients were computed to compare the actual EQ-5D-5L index scores with the EQ-5D-5L index scores estimated from the predictors in the regression models. To further understand the sensitivity of these patient-reported outcome measures and of the EQ-5D-5L across health conditions, we conducted independent samples t-tests to see whether there were significant score differences between patients who (1) have versus do not have numb feet, and (2) have versus do not have diabetes.
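The single-predictor models (models 1 to 4) can be illustrated with a minimal sketch of ordinary least squares in plain Python. The function name `fit_simple_ols` and the data values are hypothetical and for illustration only; they are not the study data, and the study's actual statistical software is not stated.

```python
from statistics import mean

def fit_simple_ols(x, y):
    """Fit y = intercept + slope * x by ordinary least squares.

    Returns (intercept, slope, r_squared).
    """
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # R^2 = 1 - (residual sum of squares) / (total sum of squares)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return intercept, slope, 1 - ss_res / ss_tot

# Hypothetical LE CAT T scores and EQ-5D-5L index values (not study data).
le_cat_t = [30.0, 35.0, 40.0, 45.0, 50.0, 55.0]
eq5d_index = [0.42, 0.55, 0.60, 0.71, 0.78, 0.90]

intercept, slope, r_squared = fit_simple_ols(le_cat_t, eq5d_index)
predicted = [intercept + slope * t for t in le_cat_t]
```

Because the model includes an intercept, the mean of `predicted` equals the mean of the observed values by construction, which is a general property of ordinary least squares rather than evidence of predictive accuracy.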
A sample size of 116 patients would achieve 99.7% power to detect an R² of 0.200 attributed to 3 independent variables using an F-test with a significance level of 0.05.
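The inputs to this power statement can be laid out as a short worked derivation. This is a sketch under assumptions: the source does not name the power software, and the noncentrality convention used here (λ = f²N) is one of several in use, so the exact power value may differ slightly between programs.

```latex
\begin{align*}
f^2 &= \frac{R^2}{1 - R^2} = \frac{0.200}{1 - 0.200} = 0.25, \\
u &= 3, \qquad v = N - u - 1 = 116 - 3 - 1 = 112, \\
\lambda &= f^2 N = 0.25 \times 116 = 29 \quad \text{(assumed convention)}, \\
\text{power} &= \Pr\!\left[ F'_{u,\,v}(\lambda) > F_{0.95;\,u,\,v} \right],
\end{align*}
```

where F' denotes the noncentral F distribution with u and v degrees of freedom and F with subscript 0.95 is the critical value of the central F distribution at the 0.05 significance level.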

Demographic Characteristics
The study sample was 55% female and 93.4% white, with a mean age of 44 years (range 18 to 80). About 9% of the participants identified as Hispanic or Latino. About 66% of the participants had completed some college courses. Approximately 64% of the participants said they did not drink alcohol, 86% did not smoke, and 94% reported that they did not have diabetes. Fifty-one percent reported that another area besides their foot and ankle problem limited their current activities. About 23% of the participants stated that their feet were numb and 8% reported that they had rheumatoid arthritis.

Instrument Characteristics
A descriptive summary (range, minimum, maximum, mean, standard deviation) of the scores from all of the instruments is presented in Table 1. All of the instruments were moderately to highly correlated with each other (see Table 2). The highest correlation between the health preference measure and the patient-reported outcome measures was between the EQ-5D-5L and the LE CAT, r(116) = 0.612, p < 0.0001. As expected, the PRO instruments were also moderately to highly correlated with one another. The highest correlation among the instruments was between the LE CAT and the PF CAT, r(116) = 0.801, p < 0.0001.
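As a rough check on the reported significance, a correlation can be converted to a t statistic with n - 2 degrees of freedom. The sketch below assumes that the 116 in r(116) is the sample size, which the text does not state explicitly; the function names are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

def r_to_t(r, n):
    """t statistic for testing r = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

# Reported EQ-5D-5L vs. LE CAT correlation; n = 116 is an assumption.
t_le_cat = r_to_t(0.612, 116)
```

Under that assumption, `t_le_cat` is roughly 8.3 on 114 degrees of freedom, which is consistent with the reported p < 0.0001.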

Regression Analyses
To predict the EQ-5D-5L index scores, six linear regression models were run; their results are summarized in Table 3. The PF CAT and the LE CAT were not included in the same model because they were highly correlated. All the predictors, separately or combined, significantly predicted the EQ-5D-5L scores (p < 0.0001). The LE CAT was the best individual predictor; it alone accounted for 37% of the variability in the EQ-5D-5L. When combining patient-reported outcome measures, the best predicting model was the one consisting of the LE CAT, Depression CAT, and Pain CAT; together they explained 43.9% of the variance in the EQ-5D-5L. Table 4 presents the actual EQ-5D-5L index scores versus the predicted EQ-5D-5L index scores from the six regression models. The mean predicted scores were essentially identical to the mean actual scores (p > 0.99).
To assess the degree of agreement between the actual EQ-5D-5L scores and the predicted EQ-5D-5L scores from the regression models, we computed the intraclass correlations (see Table 5). The intraclass correlation for the LE CAT was 0.708, the highest among the individual predictors. The intraclass correlation for the LE CAT, Depression CAT, and Pain CAT combined was 0.770. The intraclass correlation for the PF CAT, Depression CAT, and Pain CAT combined was 0.740.
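The text does not state which form of the intraclass correlation was used. As an illustration only, the sketch below computes the one-way random-effects, single-measure form (often written ICC(1,1)) for paired actual and predicted scores; the function name and data are hypothetical, and a different ICC form would give different values.

```python
def icc_oneway(pairs):
    """One-way random-effects, single-measure ICC for paired scores.

    `pairs` is a list of (actual, predicted) tuples, one per patient.
    """
    n = len(pairs)
    k = 2  # two scores per patient: actual and predicted
    grand_mean = sum(a + b for a, b in pairs) / (n * k)
    subject_means = [(a + b) / k for a, b in pairs]
    # Between-subject and within-subject mean squares from one-way ANOVA.
    ms_between = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    ms_within = sum(
        (a - m) ** 2 + (b - m) ** 2 for (a, b), m in zip(pairs, subject_means)
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

# Hypothetical (actual, predicted) EQ-5D-5L index pairs, not study data.
example_pairs = [(0.50, 0.58), (0.72, 0.69), (0.85, 0.80), (0.40, 0.52), (0.93, 0.84)]
example_icc = icc_oneway(example_pairs)
```

This form penalizes both random disagreement and systematic offsets between actual and predicted scores, which is why it is a stricter summary than the Pearson correlation.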

Instrument Scores by Health Conditions
Between participants who did and did not have numb feet, only the LE CAT showed a significant difference in mean scores (p = 0.002) (see Table 6). Between participants who did and did not have diabetes, both the LE CAT (p = 0.01) and the PF CAT (p = 0.02) showed a significant difference in scores.
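The group comparisons above rest on an independent samples t statistic. A minimal sketch follows, assuming the pooled-variance (Student) form; the text does not say whether equal variances were assumed, and the scores shown are hypothetical, not study data.

```python
import math
from statistics import mean, variance

def pooled_t(group_a, group_b):
    """Student's independent samples t statistic with pooled variance.

    Returns (t, degrees_of_freedom).
    """
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / std_err, df

# Hypothetical LE CAT T scores for patients with and without numb feet.
numb = [38.0, 41.5, 36.0, 40.0]
not_numb = [45.0, 48.5, 44.0, 50.0, 47.5]
t_stat, df = pooled_t(numb, not_numb)
```

The p-value would then be read from a t distribution with `df` degrees of freedom, which is left to a statistics package here.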

Discussion
In this study, we investigated whether the EQ-5D-5L preference scores could be mapped from the LE CAT, PF CAT, PROMIS Depression CAT, and/or PROMIS Pain CAT measures using regression estimation methods. We examined the correlations between the instruments, the differences in scores by health conditions, the variance accounted for when predicting EQ-5D-5L scores, and intraclass correlations between the actual and predicted EQ-5D-5L scores. All of the instruments were significantly correlated with each other, with the LE CAT and the PF CAT having the highest correlation, which was expected. Given that the LE CAT had the highest correlation with the EQ-5D-5L index scores (r = 0.612), this suggested that the LE CAT could be best mapped to the EQ-5D-5L. The correlations of the EQ-5D-5L with the PF CAT, the Depression CAT, and the Pain CAT were all relatively high, at r = 0.538, −0.446, and −0.502 respectively.
We conducted regression analyses to assess the effects of the patient-reported outcome measures on the EQ-5D-5L index scores. We found that the LE CAT scores alone accounted for 37% of the variance in the EQ-5D-5L scores, which was much more than we expected. The PF CAT, Depression CAT, and Pain CAT each alone predicted 28.3%, 19.2%, and 24.5% of the variance in the EQ-5D-5L respectively. When the Depression CAT and the Pain CAT scores were included with the LE CAT and PF CAT measures in separate models, the variance explained increased to over 40% for the PF CAT and 45% for the LE CAT. These findings demonstrated that if we were going to pick a single instrument as a proxy for the EQ-5D-5L preference scores in this patient population, it would be the LE CAT. When including more than one patient-reported outcome measure to predict the EQ-5D-5L, the LE CAT, the Depression CAT, and the Pain CAT accounted for the most variance among the models. While adopting a combination of patient-reported outcome measures did not minimize the number of items needed, it provided physicians more options for instruments that can be used when health preference assessments are unavailable.
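The single-predictor percentages can be cross-checked against the correlations reported in this Discussion. The sketch below computes adjusted R² from r, assuming n = 116 and one predictor; the paper does not state that its percentages are adjusted R² values, so this is an observation about the arithmetic, not a claim about the authors' method. The function name is hypothetical.

```python
def adjusted_r2_from_r(r, n, k=1):
    """Adjusted R^2 for a model with k predictors, given R^2 = r**2."""
    r2 = r * r
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Correlations with the EQ-5D-5L index as reported in the Discussion.
reported_r = {"LE CAT": 0.612, "PF CAT": 0.538, "Depression CAT": -0.446, "Pain CAT": -0.502}

adjusted_pct = {
    name: round(100 * adjusted_r2_from_r(r, 116), 1) for name, r in reported_r.items()
}
```

Under these assumptions the results are 36.9%, 28.3%, 19.2%, and 24.5%, which line up with the reported 37%, 28.3%, 19.2%, and 24.5%. The same adjustment may account for the gap between the 43.9% and 45% figures quoted for the three-predictor LE CAT model.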
We also examined the intraclass correlations to assess whether the EQ-5D-5L index scores can be accurately estimated by the patient-reported outcome measures. The findings indicate that the patient-reported outcome measures can predict the EQ-5D-5L index scores with moderate to good agreement. Alone, the PF CAT had an intraclass correlation of 0.621 and the LE CAT had an intraclass correlation of 0.708. This again shows that the LE CAT is a better proxy for the EQ-5D-5L, although the PF CAT also performs well. When the PF CAT and the LE CAT are each combined with the Depression CAT and Pain CAT, the intraclass correlations indicate good agreement. The LE CAT with the Depression CAT and Pain CAT had a slightly better intraclass correlation (0.770) than the PF CAT with the Depression CAT and Pain CAT (0.740). As a result, we would assert that the LE CAT, Depression CAT, and Pain CAT together are a good approximation of the EQ-5D-5L preference index. If clinicians or researchers were only interested in using a single patient-reported outcome measure, the LE CAT would be the best proxy choice for the EQ-5D-5L.
Finally, we tested for score differences on each instrument (LE CAT, PF CAT, Depression CAT, Pain CAT, EQ-5D-5L) between patients with and without two health conditions. Most of the instruments were not sensitive enough to show significant mean score differences for numb feet or diabetes. Only the LE CAT was able to significantly discriminate patients with numb feet from patients without numb feet. Both the LE CAT and the PF CAT were able to significantly discriminate patients with diabetes from patients without diabetes. This finding suggests that the LE CAT or the PF CAT could be a better tool for conducting cost-effectiveness research or economic analyses.

Conclusion
Taken together, our results demonstrate that the LE CAT or a combination of the LE CAT with the Depression CAT and Pain CAT can be a useful proxy for the EQ-5D-5L preference index. Physicians or researchers interested in conducting cost-utility analyses have the choice of using either the EQ-5D-5L, the LE CAT, the PF CAT, the Depression CAT, or the Pain CAT individually or in combination. Further research is needed to confirm the results of this study in a larger patient population.

Table 1. Descriptive statistics of the instruments.

Table 2. Pearson product-moment correlations among the instruments.

Table 4. Mean actual and predicted EQ-5D-5L index scores from various PRO measures.

Table 5. Intraclass correlation coefficients (ICC) of the actual EQ-5D-5L index scores with the predicted EQ-5D-5L index scores from various PRO measures.

Table 6. Mean score differences by health condition.