Using Novel Statistical Techniques to Accurately Determine the Predictive Dose Range in a Study of Overall Survival after Definitive Radiotherapy for Stage III Non-Small Cell Lung Cancer in Association with Heart Dose ()
1. Introduction
Late cardiac effects years to decades after thoracic radiotherapy (RT) have been well-described [1]. However, recent prospective studies in lung cancer have identified radiation-related cardiac events occurring on an earlier timeframe of months to years. Heart dose volume histogram (DVH) variables associated with cardiac toxicity in these prospective studies include: mean heart dose (MHD) [2] [3] [4], heart V5 (volume of the heart receiving at least 5 Gy) [2] [3] [4] [5], heart V30 [2] [3], heart V35 [6], heart V55 [4], heart V60 [5], mean left ventricle dose [3] [5], left ventricle V5 and V30 [3] [5], mean left atrium dose [5], left atrium V30 [5], and right atrium V60 [5].
Heart dose has not only been associated with cardiac events [2] [3] [4] [5] [6] in prospectively evaluated patients but also overall survival (OS) [7] [8] [9]. In contrast to expectations, the Radiation Therapy Oncology Group (RTOG) 0617 trial demonstrated decreased OS in patients randomized to higher dose radiotherapy for locally advanced non-small-cell lung cancer (NSCLC) [7]. Subsequent analysis of RTOG 0617 associated increased heart V40 [8] and heart V50 [10] with decreased OS.
Prior published studies determined a wide range of associations between dosimetric variables and OS which made their consistent clinical application difficult and even raised doubts as to the veracity of the findings [11]. The inconsistencies among studies are most likely attributable to correlations among Dose Volume Histogram (DVH) variables which were not accounted for in a conventional statistical analysis. To address these shortcomings, we used advanced statistical techniques to systematically search for heart DVH variables which were most predictive for the OS in a cohort of 119 patients, treated for stage III NSCLC with definitive RT at Mayo Clinic Arizona (MCA). We selected well established statistical techniques which are applicable to highly correlated covariates but also enhanced them with novel constraints which reflect radiobiological knowledge specific to RT. These additional constraints improve generalizability of the model and make the results easier to interpret.
The increasing use of intensity modulated RT (IMRT) and proton beam therapy (PBT) may allow for more precise cardiac sparing radiation plans, provided that evidence based treatment planning constraints on heart dosimetry can be reliably established.
2. Methods and Materials
2.1. Patient Characteristics
From 2006 to 2013 at Mayo Clinic Arizona, 119 stage III NSCLC patients were treated with definitive RT. RT was delivered using involved-field technique and either 3-dimensional conformal RT (3D-CRT) or IMRT. Patient characteristics are shown in Table 1.
Table 1. Baseline patient and treatment characteristics. Treatments were conventionally fractionated at 1.8 Gy - 2.0 Gy per fraction.
2.2. Treatment Planning Heart Constraints
Typical dose constraints on the whole heart structure during treatment planning were: Maximum dose < 62 Gy, Mean dose < 26 Gy, V30Gy < 46% ; V40Gy < 33%.
2.3. Heart Dose Extraction
The whole heart was contoured for each patient on the radiation planning computed tomography (CT) image using Eclipse treatment planning system (Varian, Inc). Dosimetric information was extracted from Eclipse using the Eclipse Application Programmer Interface (ESAPI) software and reprocessed for statistical analysis using proprietary institutional software. The typical planning image was acquired with axial spacing of 2 mm and 1 - 2 mm voxels in the anterior-posterior and medial-lateral dimensions. Dose extraction was performed using a rectangular grid with the dimensions of the planning image.
2.4. Dosimetric Analysis
The dose to the heart was quantified using dosimetric index VD which was defined as the percentage of the volume of the heart (or heart segment) receiving dose ≥ D in Gy. For each dose-volume histogram (DVH), a range of VD indices was extracted, with dose D varying in 1 Gy steps between 5 Gy and 60 Gy. The dose was not converted to biologically equivalent dose as all treatments were conventionally fractionated photons at 1.8 - 2.0 Gy per fraction.
2.5. Statistical Analysis
Possible association between heart dosimetry and OS is most commonly investigated using the multivariate Cox model with heart dosimetry represented by preselected heart DVH variables (e.g., a VD which is the percentage of heart volume receiving dose D, or greater). The drawback of this approach is that p-values associated with DVH variables need to be scaled by False Discovery Rate (FDR) correction. Since the dosimetric variables are highly correlated, the FDR correction can be overly strict and therefore may find none of the dosimetric variables being significant, particularly in studies with limited patient numbers. Also, this approach cannot assess the joint effect of dosimetric variables on OS. To address these limitations a multivariate Cox model can be adopted in which heart dosimetry is represented by the linear combination of DVH features, i.e.,:
,
which links the hazard function
with dosimetric variables
. For example,
can be
. The conventional multivariate Cox model may suffer from multicollinearity due to the high correlation between the dosimetric variables. Another challenge is the curse of dimensionality as the number of dosimetric variables included in the multivariate model is typically quite large compared with the limited sample size. To address these issues, we took advantage of modern developments in statistical analysis by adding constraints on the coefficient estimates, which are known as the “variable selection techniques” [12]. The well-known Lasso model [13] adds an L1 penalty on the coefficient estimates, i.e.,
, which has the effect of suppressing the small-effect coefficients to be zero and thus selecting the subset of important dosimetric variables simultaneously. To further account for the high correlation among the dosimetric variables, the fused Lasso model [14] includes an additional L1 penalty on dosimetric variables with adjacent dose levels i.e.,
. The upper bounds in these constraints, s and γ, are selected using a grid search to optimize a commonly used model selection criterion. Based on the work of Dai and Breheny [15] we use leaveoneout cross validation of linear predictors during the grid search to find parameters s and γ associated with the lowest cross validation error (Supplement S2.2).
In this paper, we adopted the fused Lasso as our base formulation but added two additional constraints inspired by biomedical domain knowledge. The resulting model is called knowledge-constrained Lasso (KC-Lasso). The two constraints are non-negativity and monotonicity of coefficients for dosimetric variables. The non-negativity means that we require
,
. This constraint reflects a biologically motivated hypothesis that increasing VD poses either a higher hazard risk or no significant risk but cannot lower the risk. The monotonicity constraint specifies that
where
to
are coefficients corresponding to VD’s with D from the lowest to the highest dose levels and is motivated by radiobiology. If the same volume is irradiated to a higher dose, the hazard ratio associated with the irradiation cannot decrease since higher doses are always associated with lower cell survival fractions. Lower cell survival fraction may keep clinical toxicity the same or make it worse, but it cannot make it better (Supplement S2.1). Integrating the non-negativity and monotonicity constraints into the fused Lasso formulation, the resulting KC-Lasso model can be estimated by the “penalized” function in the R package.
The analysis was performed in two distinct steps: In the first step we selected patient-specific covariates which were predictive for the OS, using Cox model with Akaike Information Criterion (AIC). Following covariates were considered: stage, chemotherapy, age, prescription dose, mean lung dose, lung V20, tumor site, and laterality. Tumor volume was not included as a candidate covariate because of strong correlation between volume and stage (Supplement S2.5). In the second step we retained only predictive patient specific covariates and additionally included dosimetric covariates, without penalizing patient specific covariates.
We applied the KC-Lasso model with two different resolutions, one with 1 Gy spacing between indices, i.e., 5 Gy, 6 Gy, ···, 59 Gy, 60 Gy included in the model, which we called the “dense fit”; and the other with 5 Gy spacing, i.e., 5 Gy, 10 Gy, ···, 55 Gy, 60 Gy, which we called the “sparse fit”. The selected patient-specific covariates were included in each model. The specific purpose of varying the dose step was to test the self-consistency of the approach as the step size selection is arbitrary and the final evaluation of the Hazard Ratios should not depend strongly on the step size selection. Greater resolution provides a better estimate of the effective dose threshold and the final Hazard Ratio, at a cost of greater complexity of the model in its final applications.
The KC-Lasso model can identify the dose threshold (or thresholds) beyond which the dose volume variables,
become predictive for OS. However, the model itself cannot be used directly in commercially available optimization packages which set constraints on individual indices only. For this reason we also fit a family of multivariate Cox models, each with only one VD index representing heart dosimetry, to obtain a (slightly less precise) model which is directly usable as a dose constraint in commercially available treatment planning systems. Since KC-Lasso identifies the range of doses which are predictive for OS, we did not scale p-values in the simplified approach by the FDR correction if they fell within the range indicated by KC-Lasso. In clinical applications the single index constraint can be used to set the optimization constraint based on limiting the Hazard Ratio, while the full formulation of KC-Lasso can be used for final evaluation of the Hazard Ratio in the plan.
KC-Lasso is not the only statistical methodology that could be applied to DVH analysis. The alternatives could include Lasso and Fused Lasso methods without knowledge constraints or entirely different statistical approaches which account for correlations among indices. To explore potential alternatives, we applied Lasso, Fused Lasso, and Elastic Net [16] models to our data set and compared the results to KC-Lasso.
3. Results
Median follow-up for all patients was 18 months (range 1.1 to 81.2 months). At last follow-up, 47 patients (39.5%) were alive (Table 1). Median follow-up of the patients alive at last follow up evaluation was 30.4 months (range 2.5 to 81.2 months). Three patient-specific variables were associated with OS in all models: age before RT, disease stage, and receipt of chemotherapy. Prescription dose, mean lung dose, lung V20, radiation technique (3D-CRT vs IMRT), timing of chemotherapy (concurrent vs sequential), tumor laterality, tumor site and volume were not significant.
3.1. Patient Specific Covariates
Overall Survival was worse for older patients (HR = 1.04 per year, P = 0.01), worse for stage IIIB vs IIIA (HR = 1.78, P = 0.02), and better for patients who received chemotherapy (HR = 0.46, P = 0.04).
3.2. Whole Heart Dosimetry Using KC-Lasso Multivariate Analysis
KC-lasso identified a dose threshold, D*, above which the dose volume variables,
, become predictive for OS. In the dense fit, D* = 51 Gy, i.e., all the coefficients corresponding to dosimetric variables with D < 51 Gy are zero while the first non-zero coefficient occurs for V51 and all the coefficients with D ≥ 51 Gy are non-zero. In the sparse fit, D* = 55 Gy. Both results are summarized in Table 2.
3.3. Whole Heart Dosimetry Using Single VD Index
Table 3 shows p-values associated with VD indices obtained by fitting a family of Cox models, each using a single VD index to represent heart dosimetry, spaced in 5 Gy increments from V5 to V60. Heart V55 predicted OS in a statistically significant manner, while V50 and V60 were nearly statistically significant. The hazard ratios (HRs) for OS are worse for increasing heart V55 (HR 1.044 per 1% increase in heart volume exposed to at least 55 Gy, P = 0.03) (Table 4).
3.4. KC-Lasso Consistency Check
The two variants of KC-Lasso (Table 2) and a conventional model (Table 4) provide a consistency check on the estimates of Hazard Ratio (HR) with both approaches. The ratio between coefficients in “dense” and “sparse” fit of KC-Lasso (Table 2) is approximately 1:5, which is the same as the step size ratio, hence both models will evaluate to a similar HR value. Similarly, the value of the coefficient in a conventional model is 0.043 (Table 4 and Table 1S), which is
Table 2. Summary of coefficients for VD features which are predictive for OS in the KC-Lasso model. Results are shown for two models, each using an array of VD indices as input. In the “dense” model indices are spaced by 1 Gy, whereas in the “sparse” model indices are spaced by 5 Gy. For both models, coefficients are zero in the V1 - V50 range, and increase with dose thereafter. The mathematical formula needed to calculate the hazard ratio is shown at the bottom of the table. The p-value associated with dosimetric variables is shown in the last column. Note that weights in the “dense” model are approximately 5 times lower than the weights in the “sparse” model, which means that weights scale in proportion to the dose step. Both models will evaluate to a similar Hazard Ratio, showing the consistency between the two models. Summary of KC-Lasso DVH feature.
Table 3. A summary of P-values associated with DVH indices for a family of Cox models that represent heart dosimetry as a single, whole heart DVH. Index VD indicates percentage of heart volume receiving a dose greater or equal to D[Gy]. P-values are lowest in the same range of VD as non-zero indices of the KC-Lasso model. Since the lowest p-value is associated with V55, which is also located in the middle of the KC-Lasso range, we choose V55 as the best approximation which can be used in treatment planning. The Hazard Ratio associated with V55 is HR = 1.044 for each 1% of the heart volume exposed to at least 55 Gy.
Table 4. Multivariate Cox model for OS using whole heart V55 as a single DVH index representing heart dosimetry. V55 represents percentage of whole heart volume receiving dose 55 Gy, or greater. The Hazard Ratio for cardiac toxicity can be calculated as
. Equations and examples needed to calculate an individualized HR for OS using a specific patient’s variables are provided in the Supplement.
comparable to coefficients in the “sparse” KC-lasso fit (Table 2), and will thus yield a similar estimate of HR. All models are approximations and one does not expect an exact agreement among them, but one does expect a reasonable consistency of HR estimates.
3.5. Alternative Statistical Approaches
The three alternative statistical approaches (Lasso and Fused Lasso without knowledge constraints and Elastic Net without knowledge constraints) generated fits to data which were of comparable statistical significance to KC-Lasso but each of the three approaches had features which were difficult to interpret, like negative correlation coefficients or isolated correlation coefficients at a single dose. Hence knowledge based constraints, similar to constraints in KC-Lasso, are likely needed to create models which are both intuitively understandable and more likely to be generalizable to other data sets. A detailed discussion of the comparisons among competing statistical techniques can be found in the supplemental section (Supplement S2.4).
4. Discussion
We used advanced statistical techniques to overcome limitations of conventional statistical methods which are often used to search for associations between heart dosimetry and OS in lung cancer patients. Conventional analyses use the Cox model with preselected DVH variables (in a univariate or multivariate setting) and seek to establish statistically significant associations between DVH variables and OS. These approaches raise False Discovery (FDR) concerns and ignore strong correlations between DVH variables, which can lead to variable results when studies are compared (Table 5) [11]. Additional discussion of the limitations of the univariate approach is provided in the supplemental section (Supplement S2.3).
The model introduced in our work (KC-Lasso) treats the entire DVH as input and finds a contiguous range of DVH variables predictive for OS (the sensitivity
Table 5. Studies published since 2008 reporting an association between survival outcomes and heart dose.
Abbreviations: 2D-RT, 2-dimensional RT; 3D-CRT, 3-dimensional conformal RT; ACM, all-cause mortality; AE, adverse event; CHD, coronary heart disease; D(xx), minimum dose to xx% of the volume; fx, fractions; GI, gastrointestinal; Gy, Gray; HR, hazard ratio; IMRT, intensity-modulate RT; LA, left atrium; LV, left ventricle; mos, months; NR, not reported; NSCLC, non-small cell lung cancer; OS, overall survival; PCI, percutaneous coronary intervention; RR, relative risk; RT, radiotherapy; RTOG, Radiation Therapy Oncology Group; SBRT, stereotactic body RT; SVC, superior vena cava; univ., university; V(x), volume receiving at least x Gy; yrs, years. ^indicates median *indicates mean ªindicates prospective rindicates retrospective.
range). However, the model itself cannot be used directly in commercially available optimization packages which set thresholds on individual indices only. Hence, we supplemented our analysis with a conventional approach, which used a single DVH variable to represent heart dosimetry in a multivariate Cox model and searched for the model in which this variable had the greatest statistical significance (Table 3 and Table 4). We argue that the variable with greatest statistical significance can be selected without concerns for FDR, as long as it belongs to the “sensitivity range” selected by the KC-Lasso model. Clinically, the conventional approach would be used to establish optimization constraints, by setting a limit on the Hazard Ratio, while the more complete KC-Lasso model could be used to evaluate the Hazard Ratio in the treatment plan. A more detailed discussion of the limitations of the conventional model can be found in the supplemental section (Supplement S2.3).
Our findings build upon prior studies that also show an association between high doses to relatively small volumes of the heart and decreased survival in NSCLC [7] [8] [10] [17] - [24]. Consistent with the findings of other investigators, our model predicts worse OS for older age [25], more advanced stage [26], and lack of chemotherapy [27].
Hazard ratios for OS associated with heart irradiation in our study are of comparable magnitude to the HRs for older age. Using the conventional model approximation as an illustration, each additional 1% of heart receiving ≥55 Gy carries similar OS impact to an additional year of older age (Table 4).
Table 5 provides a summary of the existing literature associating heart dose with OS [4] [7] [8] [10] [17] - [24] [28] [29] [30] [31] [32]. While our study found a range of doses for which DVH variables were associated with OS (V51 - V60), other studies have identified V30 [7] [17], V40 [8] [18], V50 [10] [19] [22], V55 [19], maximum heart dose [18], or mean heart dose [32]. The variability among studies may be attributable to strong correlations between DVH parameters which are caused by the physical properties of radiation beams (Supplement S2.3). The strength and pattern of such correlations may depend on treatment delivery techniques, which change over time and may thus affect each study differently. Advanced statistical techniques, such as the techniques employed in our study, confer an advantage of systematically examining the entire DVH, while accounting for correlations between DVH variables. Results in Table 2 show that all VD indices in the “sensitivity range” contribute to the Hazard Ratio. Additional discussion of potential reasons for discrepancies among studies is provided in the supplemental section (Supplement S2.3).
The etiology bridging the gap between heart dose and survival has yet to be confirmed. A recent systematic review details the existing literature on dosimetry and cardiac endpoints across pediatric, breast, lung, esophageal, and hematologic malignancies, emphasizing risk for coronary disease, valvular disease, arrhythmia, and pericardial disease after thoracic RT [24]. The apparent importance of upper heart substructures and specifically the superior right heart [9] [20] [30] [33] suggest radiation damage to the cardiac conduction system may impact survival in NSCLC patients. Several recent findings support this hypothesis. Among NSCLC patients treated on prospective dose-escalation trials at University of North Carolina, 11% had documented arrhythmia at 26 months after RT [3]. However, if radiation caused transient fatal arrhythmias, they would not likely be identified and may simply be recorded as deaths due to lung cancer. At 6 months after thoracic RT for locally advanced NSCLC, Vivekanandan et al. [9] found ECG changes in 38% of patients and ECG changes were associated with worse OS on multivariate analysis. Adding to this evidence, we have recently published [33] an expanded analysis of 3-dimensional dose distributions in the heart, for the same patient cohort as the present study, which found that the dose to the right-superior portion of the heart was most responsible for the decreased OS. More detailed cardiac evaluation of NSCLC patients receiving thoracic RT is necessary to evaluate this hypothesis.
We acknowledge the limitations of our study. Although our findings were statistically significant, our sample size is limited. Of the 16 studies in the past 10 years that have found an association between heart dose and survival, 12 included more patients than the present study (Table 5) [4] [7] [8] [10] [17] - [23] [28] [29] [30] [31]. Moreover, our data only included the clinical outcome of OS without any other clinical outcomes like cardiac events. Multivariate analysis mitigates the possibility of confounding by other disease and treatment-related variables but cannot exclude confounding by unaccounted for variables. Some studies have suggested confounding by or interactions with immunosuppression [22], pre-existing coronary heart disease [32], lung dose [34], or extent of mediastinal lymph node involvement [35]. Because of limited sample size we only performed Leave One Out Cross Validation and were not able to perform sample subdivision into model fitting and validation parts. Additional validation must be left to future work with an expanded data sample.
Based on our findings and the existing literature, high dose to the heart should be avoided whenever possible. For patients with NSCLC treated with conventionally fractionated RT, heart doses ≥51 Gy may decrease OS. The superior right heart may be the most at-risk for radiation induced toxicity. Data shown in Table 2 and Table 4 (heart DVH, age, stage, receipt of chemotherapy) can be used to calculate an individualized HR for OS for every stage III NSCLC patient undergoing thoracic RT (Supplement S1).
5. Conclusion
Among stage III NSCLC patients undergoing thoracic RT, worse OS is associated with higher heart dose, older age, more advanced stage, and lack of chemotherapy. Doses higher than 51 Gy were predictive for reduced OS, while heart V55 appeared to provide the best estimate of OS for setting treatment planning constraints, with HR 1.044 per 1% increase of heart volume exposed to at least 55 Gy.
Acknowledgements
Mayo Clinic Arizona and the Arizona State University Rising Star Initiative provided funding to support the collaboration between the Department of Radiation Oncology at Mayo Clinic Arizona and the School of Computing, Informatics, and Decision Systems Engineering at Arizona State University. Varian, Inc. provided a grant to Arizona State University to fund this work. SES received support for research time from the North Central Cancer Treatment Group and Mayo Clinic with funding from the Public Health Service (CA-25224, CA-37404, CA-35267, CA-35431, CA-35195, CA-63848, CA-63849, CA-35113, CA-35103, CA-35415, CA-35101, CA-35119, CA-35090).
Statistical Analysis
Prof. Jing Li supervised the statistical analysis; Jiuyun Hu carried out the analysis as part of his PhD dissertation.
Data Sharing
Research data not available at this time.
Supplemental
S1. Clinical Applications Supplement
The Cox proportional hazards regression model can be written as follows:
where
represent potential predictors and
a baseline risk when all predictors are set to zero.
S1.1. Whole Heart DVH Analysis
TableS1 shows fit coefficients, hazard ratios, and p-values which were associated with significant predictors of overall survival in the model which included V55Gy for the whole heart DVH.
S1.2. Example Applications Using Whole Heart V55Gy
1) With the “reference patient” defined as stage IIIA, no chemotherapy, 70.5 years old, and heart V55Gy = 0%, the individualized hazard ratio for an actual patient with stage IIIB disease, who received chemotherapy, and whose heart V55Gy was N%, can be computed as (from Table 4):
The interpretation of HR = X is that the likelihood of dying per unit of time for the actual patient is X-fold greater than the likelihood of dying per unit of time for the reference patient.
2) Holding the other variables constant, for a patient whose V55Gy to the whole heart is N%, the associated hazard ratio for cardiac radiation exposure can be calculated as
or
.
Consider two patients who are the same age, have the same cancer stage, and both received chemotherapy, but differ only by extent of heart irradiation. Patient A received heart V55Gy = 0%. Patient B received heart V55Gy = 10%. Using the coefficient from the second column of TableS1, the individualized hazard ratio for patient A is
. The individualized hazard ratio for patient B is
. The likelihood of dying per unit of time is 1.537 times higher for patient B than for patient A.
Table S1. Model coefficients and associated hazard ratios in the model with whole heart DVH.
Another way to obtain the estimate of hazard ratio associated with cardiac irradiation is to use the hazard ratio listed in Table 4. Using the same example patients as before, the hazard ratio for patient B is 1.04410 = 1.538.
3) If two groups of patients have hazard ratios equal to HR1 and HR2, the ratio of the mean survival times between the two groups can be approximated by
.
S2. Statistical Analysis Supplement
S2.1. Radiobiological and Clinical Motivation for “Knowledge Constraints” in KC-Lasso
1) Introduction
Dose-volume analysis is one of the primary tools used in phenomenological modelling of clinical toxicity in radiation therapy. Dose volume analysis reflects the basic clinical and radiobiological insight that the likelihood of clinical toxicity depends on both the dose level and the volume to which the dose is applied. In general, larger doses and larger volumes to which dose is applied lead to greater likelihood of toxicity. The dominant effect of depositing dose in a volume of tissue is cell kill (cell depletion), usually described as cell Survival Fraction (SF), which is the proportion of cells that survive an irradiation. The SF is related to dose by a Linear Quadratic model equation
where BED is Biologically Equivalent Dose (BED) related to the physical dose through a
linear quadratic equation
. In the discussion that follows we
will equate the physical dose D with BED and refer to both as “dose”.
Irradiating a volume of tissue with a dose level “D” can lead to one of the three categories of outcomes:
a) The SF may be high enough that the tissue can compensate for lost cells and there is no clinical toxicity observed.
b) The SF is low enough that the tissue may not be able to fully compensate for lost cells which can lead to transient or permanent toxicity. The toxicity is transient if the tissue can rebuild itself over time and permanent if the tissue can no longer rebuild itself. In this regime, the onset of toxicity is probabilistic and may additionally depend on patient specific characteristics, like age, state of health or genetics.
c) The SF is so low that toxicity will inevitably occur, for all patients.
The (1)-(3) states are typically modelled by the sigmoid (logistic) curve. The Region (1) corresponds to the beginning part of the curve, Region (2) corresponds to the rising part of the curve, and Region (3) corresponds to the saturation region of the logistic curve.
2) The Linear Predictor in KC-Lasso
The linear predictor in KC-Lasso assumes the following form
where
is the percentage of organ volume with dose
, or greater.
3) Positivity Condition in KC-Lasso
The positivity condition in KC-Lasso says that all coefficients
have to be greater or equal to zero. (
)
The motivation for the positivity condition is that any dose, applied to any volume, kills cells (SF < 1). Such irradiation can produce no risk (Region (1)), some risk (Regions (2)-(3)), but it cannot reduce risk, which would be implied by negative coefficients.
4) Monotonicity Condition in KC-Lasso
Consider one of the elements in the linear predictor:
Let us fix the value of
at an arbitrarily chosen value of 0.05 (5% of the volume irradiated to the dose
or higher). What should the contribution of this term to risk be when
increases? An increase in dose means that the surviving fraction (SF) decreases. A decrease in SF means that the contribution of this term to risk has to increase (Region (2), increasing risk) or stay the same (Region (3), saturated risk).
Returning to the full linear predictor feature, one thus observes that, for a fixed value of
(e.g. 0.05) the contribution of consecutive terms to the feature has to increase, or stay the same, as the dose increases. This argument leads us to the monotonicity condition, which says that consecutive coefficients must be greater or the same as their predecessors:
For any two dose levels
, with
, let
and
be the coefficients for
and
, respectively. Then,
.
S2.2. Internal Validation
Since the KC lasso method is based on fused lasso, there are two tuning parameters we need to determine, i.e., the L1 penalty for the coefficients and the L1 penalty for the steps of the coefficients. We use the work by Dai and Breheny [15] who concluded that a method of leave-one-out cross validation with linear predictors works the best for a Cox model. To be more specific, suppose that there are n patients in the data. In the i-th iteration, we leave the i-th patient out and use the remaining n-1 patients to fit the Cox model with KC lasso. Then the ‘linear predictor’ of the i-th patient can be defined as
, where
is the feature vector for i-th patient and
is the estimated coefficient vector of the model without i-th patient. After we get “linear predictors” for all patients, we can define the predictive accuracy based on the “linear predictors” by
,
where
is the event time or last follow up time of i-th patient.
is the set of patients at risk at time
. The cross validation error for linear predictors is then defined by
. We find the set of tuning parameters with the lowest CVE (cross validation error).
S2.3. Limitations of the Univariate Analysis
In FigureS1, we show p-values associated with the dosimetric variable VD ina Cox model which contains patient specific covariates (age, chemotherapy and stage) and one VD index at a time. We effectively scanned the VD space and recorded the p-value associated with each VD. One observes a monotonic decline of p-values, starting between doses of 35 Gy - 40 Gy and continuing towards the minimum near V55Gy. The monotonic decline in such a broad range of doses is caused primarily by correlations among indices for this population of patients, convolved with the threshold dose. The magnitude of correlations in the present data is shown in FigureS2. Correlations among indices are determined by the combination of three factors: 1) anatomic variability among patients, 2) variability in tumor location and volume, 3) dose distributions which are characteristic for the delivery methods being used. The magnitude of p-value at the minimum also depends on the sample size. If one sets an arbitrary threshold of p-value at p = 0.05, this threshold will be crossed at a dose level which depends both on the sample size and on the pattern of correlations, which can vary in different studies due to differences in treatment delivery methods being used. Hence, the method of searching for the dose threshold with univariate analysis tends to produce threshold doses which are too low and can vary among studies. A more advanced statistical technique, which explicitly “corrects” for correlations, needs to be used to detect a threshold dose which will remain consistent for all studies and is generalizable to future clinical applications.
If one considers multiple VD indices as uncorrelated, independent variables, p-value obtained for each of these variables should be subjected to multiple comparisons correction. The correction will depend on the size of the scanning
Figure S1. P-values associated with VD covariate in a Cox model with patient specific variables (age, stage and chemotherapy) and one dosimetric covariate at a time. Doses are in Gy and a 1 Gy step was used in the analysis. Data in Table 3 were sampled from this figure.
Figure S2. Correlations among VD indices, the correlation matrix on the right side and a single section through the matrix at the level of V40 on the left side.
step and may prevent the p-values from reaching the threshold of statistical significance for patient cohorts of a realistic size. Adjusting the step size to reach the threshold of statistical significance is not well justified and reduces the precision of searching for a dose threshold which is associated with the clinical outcome. The magnitude of this problem is illustrated in FigureS3, using the VD step size of 1 Gy. Red bars show p-values obtained without multiple comparison correction, while blue bars show p-values adjusted for the correction. The threshold of statistical significance would not have been reached in the present patient cohort if p-values were corrected for multiple comparisons.
S2.4. Alternative Statistical Methods
KC-Lasso has been designed for the present study to address the problem of correlations among dosimetric variables. One can reasonably ask whether other statistical techniques could perform equally as well as KC-Lasso model. To address this question we compared KC-Lasso (Knowledge Constrained, Fused Lasso) to Elastic Net [16], Lasso [13] and Fused Lasso [14] models. We make a comparison by first deriving the coefficients for the VD variables in each model which creates a linear predictor (feature) for each. We then compare the p-value associated with the feature in an unpenalized Cox model containing patient specific covariates. All four models were associated with very similar p-values, as summarized in TableS2.
All four models produced very similar survival curves for surviving patients. KC-Lasso showed some differences in survival curves for deceased patients. Two examples are shown in FigureS4 and FigureS5.
Table S2. P-values associated with the linear predictor in four models in an unpenalized cox model with patient specific covariates.
Figure S3. An illustration of the effect that multiple comparisons correction would have on p-values in the univariate analysis for the present study.
Figure S4. An example of survival curves for a surviving patient.
All four models perform similarly on the same data set, though one could argue that KC-Lasso is showing slightly better prediction of survival probability. The primary difference between the models is in the choice of coefficients for the linear predictors that each model makes. The coefficients chosen by the Elastic
Figure S5. An example of survival curves for a deceased patient.
Net (FigureS6), Lasso (FigureS7) and Fused Lasso (FigureS8) models are shown below.
One observes that both the Elastic Net, Lasso and Fused Lasso models selected negative coefficients. Negative coefficients are biologically implausible because they imply a possibility that the irradiation of tissue at risk improves the odds of survival. A second biologically implausible feature is the selection of a single VD index (or a small group of indices) without a simultaneous selection of indices at higher doses. Higher doses are always associated with lower cell survival fraction in the volume. Radiobiology suggests that lower cell survival fraction in a fixed volume of tissue should always be associated with higher likelihood of clinical complications, or at least the same likelihood of complications if cell survival fractions are so low that the adverse clinical outcome is virtually assured. Consequently, once the first VDthr (threshold dose) is selected, all VD>Dthr should also be selected and their coefficients should be higher (more risk) or equal (saturated risk). The constraint on the maximum dose in the analysis must be imposed by the maximum dose available in the data.
In summary, KC-Lasso is not the only statistical model that can be successfully fit to the present data set. However, KC-Lasso has been designed to satisfy “common sense” boundary conditions (positivity and monotonicity conditions imposed on coefficients) as well as to account for correlations between VD indices. The purpose of this design has been to make the results of the model easier to interpret intuitively and to be more generalizable.
S2.5. Tumor Volume as a Patient Specific Covariate
Tumor volume can influence the likelihood of OS and can also influence
Figure S6. Coefficients selected by the elastic net model.
Figure S7. Coefficients selected by the lasso model.
the irradiation of the heart. Tumor volume was difficult to assess in this retrospective study because physicians frequently broke the target hierarchy (GTV-CTV-PTV-ITV) during contouring and thus forced estimates of the tumor volume with assumptions. We estimated the tumor volume (under assumptions) but decided to substitute tumor volume with a patient specific covariate which is strongly correlated with volume, namely clinical stage (3A and 3B). The clinical stage has been reliably recorded prior to treatment and reflects added clinical risk associated with tumor progression. The correlation between our estimated tumor volume and the clinical stage is summarized in TableS3.
When volume alone is used in the Cox model it is not a predictor for the OS (p = 0.35). When volume alone is used in the Cox model with chemotherapy and age it is predictive for OS with p = 0.048. When stage is used instead of volume,
Figure S8. Coefficients selected by the fused lasso model.
Table S3. Summary of correlations between the disease stage and the estimated CTV volume.
the stage (3A/3B) is a predictor for OS with p = 0.024. When volume and stage are included simultaneously, neither one is the predictor with p-values p = 0.18 for volume and p = 0.07 for stage. Given that stage and estimated volume are strongly correlated, we chose to include stage alone in the analysis.