Matching Clients to Alcohol Treatments Using 16 Client Characteristics Simultaneously: A Cluster Analytic Approach with Research and Clinical Implications

Project MATCH sought to identify client characteristics that could be used to select treatments for specific clients to increase the effectiveness of psychosocial treatments. Results based on examining matching variables one at a time were deemed disappointing by the Project MATCH investigators. In this secondary data analysis, we present analyses examining 16 matching variables simultaneously through cluster analysis in the outpatient arm. While null results were found for the first cluster (n = 380), there was a longer time to first drink for members of the second cluster (n = 275) receiving Twelve Step Facilitation (TSF) compared to clients receiving cognitive behavioral therapy or Motivational Enhancement Therapy (MET). For the third cluster (n = 297) clients had a longer time to first drink if receiving cognitive behavioral therapy or TSF compared to MET. Additional analyses show that these cluster assignments can be adequately approximated by using nine of the 16 matching variables. It is hoped that these results rekindle interest in the benefits of client-treatment matching.


Introduction
Excessive alcohol consumption is devastating at both the societal and the individual level. The social harm and healthcare costs of alcohol are eclipsed only by those of heroin and cocaine (Nutt, King, Saulsbury, & Blakemore, 2007). At the individual level, several alcohol-related negative consequences have prolonged and disastrous effects. These include Fetal Alcohol Syndrome, motor vehicle accident fatalities, other alcohol-related homicides and suicides, and alcohol-related traumatic events and their PTSD sequelae.
Project MATCH (Matching Alcoholism Treatments to Client Heterogeneity) sought to improve the effectiveness of alcohol treatment by determining which treatment is best for whom. Project MATCH was one of the largest randomized clinical trials (RCTs) of psychotherapy ever completed. Aside from the client characteristic × treatment interactions detected, there were a number of other accomplishments of this study, which spanned sites from Seattle to Providence to Houston to Albuquerque. These accomplishments include the development of three distinct and theory-driven psychosocial treatment approaches: Cognitive-Behavioral Coping Skills Therapy (CBT, Kadden et al., 1992), Motivational Enhancement Therapy (MET, Miller, Zweben, DiClemente, & Rychtarik, 1992), and Twelve Step Facilitation (TSF, Nowinski, Baker, & Carroll, 1992). This RCT emphasized statistical power. Project MATCH exemplified a quantum leap in research methods and was a paragon of multisite research collaboration (Drummond, 1999).
In the original analyses, treatment matching effects were found for psychiatric severity, support for drinking, and anger in the outpatient arm. A treatment matching effect was found for alcohol dependence severity in the aftercare arm (Project MATCH Research Group, 1997a, 1997b, 1998a). In the original analyses, following the principle of Occam's Razor, matching effects were assessed one matching variable at a time. That is, matching effects for two or more matching variables at a time crossed with treatment were apparently not examined. Given that alcohol dependence and relapse have been theorized to be multidetermined (Witkiewitz & Marlatt, 2004), we examined matching effects involving two or more matching variables at a time.

Sample Description
This is a secondary data analysis. The matching variables were assessed at baseline. Pertinent to the analyses presented here, the Form-90 (Miller, 1996), an interview, was completed every 3 months, five consecutive times, to determine treatment outcome. In this report we focus on the outpatient arm (n = 952). This sample was predominantly male (72%) and White (81%), and about half (48%) had been in treatment prior to the index treatment episode. The average age was 38.9 (SD = 10.7). Most (64%) were single and about half (51%) were employed. The average years of education was 13.4 years. Clients were ineligible if they were dependent on a drug other than alcohol (except marijuana). The most common reason (45%) for not volunteering for the RCT was the inconvenient location of the study or transportation barriers. Additional sample characteristics are described in Project MATCH Research Group (1997a).

Statistical Approach
We focused on the outpatient arm of Project MATCH for two reasons. First, the outpatient arm was a cleaner experimental design. In the aftercare arm, clients were randomized to one of the three treatments following inpatient stays of varying length and, more importantly, of various theoretical orientations. These various theoretical orientations were not recorded in a systematic way. We hypothesize that while the aftercare arm potentially increased statistical power by nearly doubling the sample size, the implementation of the aftercare arm added error variance to the experimental procedure. Second, while inpatient stays have been shown to be beneficial in some cases (e.g., Rychtarik et al., 2000), inpatient treatment has become less common in general in the years since Project MATCH because of its cost.
Initial exploratory data analysis looked at pairs of matching variables and treatment contrasts with continuous abstinence 2 months posttreatment as the dependent variable. (Two months was chosen to balance clinical relevance with statistical power.) Matching variables were dichotomized using median splits, and the 3-way interaction, three 2-way interactions, and three main effects were tested using logistic regressions. The 3-way interaction (matching variable A × matching variable B × treatment contrast) was the effect of interest. These analyses produced significant results at a frequency consistent with the Type I error rate, that is, no more often than would be expected by chance.
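The predictor set described above (three main effects, three 2-way interactions, and one 3-way interaction) can be sketched as follows. This is a minimal illustration and not the original analysis code: the function names, the made-up scores, the handling of ties at the median, and the 0/1 coding of the treatment contrast are all assumptions, and the logistic regression fit itself is not shown.

```python
from statistics import median

def median_split(values):
    """Dichotomize a continuous variable: 1 if above the median, else 0.
    (Assumption: scores equal to the median fall in the low group.)"""
    cut = median(values)
    return [1 if v > cut else 0 for v in values]

def design_row(a, b, t):
    """Predictors for one client: 3 main effects, three 2-way terms, one 3-way term."""
    return {
        "A": a, "B": b, "T": t,
        "AxB": a * b, "AxT": a * t, "BxT": b * t,
        "AxBxT": a * b * t,  # the effect of interest
    }

# Made-up scores for six hypothetical clients (not Project MATCH data)
variable_a = [10, 25, 31, 18, 40, 22]
variable_b = [3, 9, 4, 8, 7, 2]
treatment = [0, 1, 0, 1, 1, 0]  # one pairwise treatment contrast, coded 0/1

rows = [
    design_row(a, b, t)
    for a, b, t in zip(median_split(variable_a), median_split(variable_b), treatment)
]
```

Each row would then be paired with the abstinence outcome (0/1) and passed to a logistic regression routine.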
We therefore changed strategy, focusing on the continuous matching variables and using cluster analysis to accommodate multiple matching variables simultaneously. This latter approach is in line with Rowntree's (2004) emphasis that recoding continuous variables into categorical or dichotomous variables results in a loss of information.
In the cluster analysis, only continuous matching variables were used. Dichotomous matching variables were not used since these would influence cluster solutions based on the de facto prevalence of the values of the dichotomous variables. For example, since there were more men than women in Project MATCH, the largest cluster would likely be male. Focusing on continuous variables left 17 matching variables to be considered. A correlation matrix revealed that the two self-efficacy measures (the confidence variable and the temptation minus confidence variable) were highly correlated (r = −.88). We dropped the latter, leaving 16 input variables, since difference measures are notoriously unstable (e.g., they deviate from a normal distribution), and so that the construct of self-efficacy would not drive the cluster solution by being represented twice in the input variables. For each of the continuous matching variables, a z-score was calculated to put the various input variables on a level playing field. Next, a mean substitution of missing values was conducted for the matching variables (3.2% of the data had missing values). We used SPSS 26 two-step cluster analysis to determine the optimal number of clusters, and then used K Means Cluster Analysis to get the cluster solutions. The cluster analysis procedure converged in 11 iterations (the default maximum in SPSS is 10 iterations).
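The preprocessing and clustering steps described above can be sketched in plain Python. This is an illustration under stated assumptions, not a reproduction of the SPSS procedures: the sample standard deviation is assumed for the z-scores, the k-means loop is a generic Lloyd iteration started from hand-picked centers, and the variable names and values are made up.

```python
from statistics import mean, stdev

def z_scores(column):
    """Standardize one variable; missing values (None) get the mean, i.e. z = 0."""
    observed = [v for v in column if v is not None]
    m, s = mean(observed), stdev(observed)  # assumption: sample SD (n - 1)
    return [0.0 if v is None else (v - m) / s for v in column]

def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def k_means(points, centers, max_iter=20):
    """Generic Lloyd iteration from the given starting centers."""
    for _ in range(max_iter):
        # assign every case to its nearest center
        labels = [
            min(range(len(centers)), key=lambda k: squared_distance(p, centers[k]))
            for p in points
        ]
        # recompute each center as the mean of its members
        new_centers = []
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            # keep the old center if a cluster ends up empty
            new_centers.append(
                [mean(dim) for dim in zip(*members)] if members else centers[k]
            )
        if new_centers == centers:  # converged: nothing moved
            break
        centers = new_centers
    return labels, centers

# Made-up data for six hypothetical clients on two hypothetical variables
raw = {
    "var_a": [1.0, 2.0, None, 9.0, 10.0, 11.0],
    "var_b": [5.0, 6.0, 5.5, 1.0, None, 2.0],
}
columns = [z_scores(col) for col in raw.values()]
points = [list(row) for row in zip(*columns)]
labels, centers = k_means(points, centers=[points[0], points[3]])
```

Because each variable is standardized before missing values are filled, substituting the mean is the same as substituting z = 0, which is what the sketch does.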

Cluster Analysis and Survival Analysis Results
The first cluster consisted of 380 (39.9%), the second cluster consisted of 275 (28.9%), and the third cluster consisted of 297 (31.2%) of the 952 outpatients. The cluster solution on the z-transformed matching variables is reported in Table 1. We also flagged the cluster centers that were at or below the 25th percentile (z ≤ −.66) and at or above the 75th percentile (z ≥ .66). We identify the raw variables for extreme values in Table 1. Cluster 1 did not have any centers that were extreme. Cluster 2 was typified by high values on alcohol involvement (measured by the Alcohol Use Inventory, Wanberg et al., 1977), psychopathology (measured by the legal section of the ASI, McLellan et al., 1992), sociopathy (measured by the socialization scale of the CPI, Gough, 1975), meaning seeking (PIL, Crumbaugh & Maholik, 1976; SONG, Crumbaugh, 1977), alcohol dependence (measured by the Alcohol Dependence Scale, Skinner & Allen, 1982), and anger (Spielberger trait anger scale, 1988), and a low value on social functioning (assessed by the Psychosocial Functioning Inventory, Feragne, Longabaugh, & Stevenson, 1983, and the DrInC, Miller, Tonigan, & Longabaugh, 1995). Cluster 3 [...] Prochaska & DiClemente, 1992) and a high value on social functioning. We crossed cluster with treatment contrasts using Kaplan-Meier survival analyses, and p-values are reported in Table 2. Treatment contrasts were conducted as pairwise comparisons to reduce error variance, that is, to avoid pooling dissimilar treatments. Mean survival times are listed in Table 3. Results indicate no treatment differences for clients in Cluster 1. For Cluster 2, TSF (Tx 3) was better than CBT (Tx 1, see Figure 1) and TSF was better than MET. For Cluster 3, CBT (Tx 1) was better than MET (Tx 2, see Figure 2) and TSF was better than MET. Note that of the nine treatment contrasts, four (44%) were statistically significant.
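For readers less familiar with the Kaplan-Meier method, the product-limit estimate of the proportion of clients who have not yet had a first drink can be sketched as below. This is a minimal illustration, not the SPSS output: the function name and the six made-up observations are assumptions, and the significance tests that compare treatment groups are not reproduced here.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with an observed event.

    times  : days until first drink, or until censoring
    events : 1 if the first drink was observed, 0 if the case was censored
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        drinks = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)  # events plus censored at t
        if drinks:
            survival *= 1 - drinks / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve

# Made-up follow-up data for six hypothetical clients in one treatment group
times = [5, 8, 8, 12, 20, 20]
events = [1, 1, 0, 1, 0, 0]
curve = kaplan_meier(times, events)
```

Running the function separately for each treatment within a cluster gives the curves that a pairwise contrast compares.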

Determining Proxies for Cluster Membership
We were able to recreate exactly the algorithm SPSS used to assign each case to a specific cluster. This algorithm minimized the squared distance between the z-values of the case and the cluster centers. In other words, for cluster 1, we calculated the difference between zee01 and .01 (the cluster 1 center on the zee01 variable) and squared this. We did this for zee01 through zee16 and summed these squared differences. We ran the same analyses for the centers of cluster 2 and the centers of cluster 3. The case was assigned to the cluster whose sum had the lowest value. This resulted in all 952 cases matching the cluster membership variable generated by SPSS.
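The assignment rule just described amounts to nearest-center classification on summed squared differences. A minimal sketch follows, assuming three variables instead of 16 for brevity. Apart from the .01 cluster 1 center on the first variable mentioned above, every number is made up for illustration and does not come from Table 1.

```python
def assign_cluster(z_values, centers):
    """Assign a case to the cluster with the smallest sum of squared differences."""
    sums = {
        cluster: sum((z - c) ** 2 for z, c in zip(z_values, center))
        for cluster, center in centers.items()
    }
    return min(sums, key=sums.get)

# Hypothetical cluster centers on three z-scored variables (illustrative only)
centers = {
    1: [0.01, 0.10, -0.05],
    2: [0.80, 0.95, -0.70],
    3: [-0.60, -0.40, 0.75],
}

client = [0.9, 1.1, -0.5]  # made-up z-scores for one hypothetical client
cluster = assign_cluster(client, centers)
```

With the full set of 16 z-scores and the centers from Table 1, the same function would carry out the scoring procedure described in the text.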
While this scoring procedure could be automated, and treatment assignments made on this basis, conducting all 16 assessments would be burdensome, at least for the client (these assessments hypothetically could be automated). Our first, more basic approximation consisted of using these three rules, which was 71.8% accurate:
2) if zee08 is greater than or equal to .95, then cluster equals 2.
3) else cluster equals 1.
Our second approximation used squared differences based on the nine variables labeled in Table 1 instead of all 16 variables. This cluster assignment procedure resulted in 91% of the clients being assigned to the correct cluster. All of the survival analyses were replicated with this proxy membership variable, except that the treatment difference in the new cluster 2 for MET versus TSF had now become a trend (Log rank p = .062, Breslow p = .054, and Tarone-Ware p = .054). The treatment benefits for TSF over CBT for the new cluster 2, and the treatment benefits for CBT and TSF over MET for cluster 3, remained statistically significant. Also, when we compared the new cluster 2 clients receiving TSF versus those not receiving TSF (CBT and MET groups combined), the difference was statistically significant (Log rank p = .006, Breslow p = .009, and Tarone-Ware p = .006). So that clients could be prospectively matched to the relevant treatment by interested clinic directors, raw means and raw standard deviations for these nine variables are reported in Table 4.
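A clinic wishing to apply the reduced-variable proxy prospectively would convert raw scale scores to z-scores with published means and standard deviations, then assign the nearest cluster center using only the retained variables. The sketch below illustrates this under stated assumptions: the scale names, norms, centers, and scores are all hypothetical placeholders (two variables instead of nine), not values from Table 1 or Table 4.

```python
def to_z(raw, mean, sd):
    """Convert a raw scale score to a z-score using published norms."""
    return (raw - mean) / sd

def assign_from_subset(z_by_name, centers, variables):
    """Nearest cluster center, using only the listed variables."""
    def distance(center):
        return sum((z_by_name[v] - center[v]) ** 2 for v in variables)
    return min(centers, key=lambda cluster: distance(centers[cluster]))

def agreement(reference, proxy):
    """Share of cases where the proxy matches the reference assignment."""
    return sum(r == p for r, p in zip(reference, proxy)) / len(reference)

# Hypothetical norms (mean, SD) and centers for two placeholder scales
norms = {"scale_a": (20.0, 5.0), "scale_b": (50.0, 10.0)}
centers = {
    1: {"scale_a": 0.0, "scale_b": 0.0},
    2: {"scale_a": 1.0, "scale_b": -0.8},
    3: {"scale_a": -0.7, "scale_b": 0.9},
}

raw_scores = {"scale_a": 26.0, "scale_b": 41.0}  # one hypothetical client
z = {name: to_z(raw_scores[name], *norms[name]) for name in raw_scores}
cluster = assign_from_subset(z, centers, ["scale_a", "scale_b"])
```

The `agreement` helper shows how a proxy can be checked against the full 16-variable assignment: it returns the proportion of cases on which the two membership variables coincide.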

Research Implications
Cluster analysis can be used to test for matching effects by capitalizing on natural separations in client profiles.

Clinical Implications
Our results suggest that clients in cluster 2 should receive TSF and that clients in cluster 3 should receive CBT or TSF. TSF fits with the extra-treatment resource of Alcoholics Anonymous. CBT is more easily adapted than TSF to address substances other than alcohol. For example, there are only a few Nicotine Anonymous meetings in the U.S. compared to the ubiquity of AA meetings. Therefore, it does not make sense to adapt TSF for nicotine use. Hypothetically, TSF could be tailored in turn to address the profile of cluster 2 and the profile of cluster 3, while CBT might be tailored to more specifically address the profile of cluster 3. For cluster 1, there is no clear winner, but MET is conducted in four sessions compared to the twelve sessions of CBT or TSF, so MET is the more cost-effective choice for cluster 1. Sophisticated alcohol treatment centers could use computerized surveys at intake to determine whether CBT, MET, or TSF would be the most appropriate for a client presenting with Alcohol Use Disorder. CBT and TSF could be administered in a group format to increase cost-effectiveness.

Conclusion
Previously published results and conclusions indicated that the evidence for treatment matching was disappointing (Edwards, 1999; Miller, 2005; Project MATCH Research Group, 1998b). In contrast, using a cluster analytic approach, which capitalizes on natural separations in the distributions of the data, we found treatment matching for 2 of the 3 clusters, or 60% of the outpatients. Our results suggest that our approach could be used to test for patient-treatment matching effects in other RCTs. We hope these findings renew interest in the possible benefits (Donovan & Mattson, 1994) of client-treatment matching.