
Selecting which explanatory variables to include in a given score is a common difficulty, as a balance must be found between statistical fit and practical application. This article presents a methodology for constructing parsimonious event risk scores combining a stepwise selection of variables with ensemble scores obtained by aggregation of several scores, using several classifiers, bootstrap samples and various modalities of random selection of variables. Selection methods based on a probabilistic model can be used to achieve a stepwise selection for a given classifier such as logistic regression, but not directly for an ensemble classifier constructed by aggregation of several classifiers. Three selection methods are proposed in this framework, two involving a backward selection of the variables based on their coefficients in an ensemble score and the third involving a forward selection of the variables maximizing the AUC. The stepwise selection yields a succession of scores, from which the practitioner can choose the one that best fits their needs. These three methods are compared in an application to construct parsimonious short-term event risk scores in chronic heart failure (HF) patients, using as event the composite endpoint of death or hospitalization for worsening HF within 180 days of a visit. Focusing on the fastest method, four scores are constructed, yielding out-of-bag AUCs ranging from 0.81 (26 variables) to 0.76 (2 variables).

In [

The more variables a model contains, the more complicated its use, particularly in clinical practice. Therefore, a balance must be found between increasing the number of variables to allow for a better statistical fit and keeping this number sufficiently small to facilitate practical application. With the increased number of potential predictors in the medical field (through the use of "big data" from both electronic medical records and the increasing number of available biomarkers), the need for the statistical selection of variables also increases, particularly if the goal is to continue building parsimonious and effective models. For HF, variables can be selected using a literature review in order to assess which variables are the most clinically relevant [

Since the primary goal in the present study is to construct a score using an already-defined ensemble method, some of the above selection methods are not applicable in this setting. For example, the likelihood ratio test based on a probabilistic model can be used to achieve a stepwise selection for a given classifier such as logistic regression, but not directly for a classifier constructed by aggregation of several classifiers. Other selection criteria must therefore be defined in this framework. Given this context, this article presents in Section 2 a methodology for constructing parsimonious event scores combining a stepwise selection of variables and the use of ensemble scores. In particular, we define herein three methods, two of which involve a backward selection based on the variables’ coefficients in an ensemble score, and the third involving the combination of a forward selection using the area under the ROC curve (AUC) as criterion and an ensemble score. Due to the stepwise selection, a succession of scores is constructed which allows the user to choose which of the latter yields the best balance between performance and the number of variables.

As a concrete illustration, these three methods of construction of parsimonious scores are compared according to AUC and processing time in an application aimed at constructing short-term event risk scores in chronic heart failure (CHF) patients. Heart failure is a global and major cause of mortality and morbidity [

In [

In this section, a methodology for constructing parsimonious event risk scores combining a stepwise selection of variables with ensemble scores is presented. Each method consists of two phases, first a preselection of variables per classifier, second a stepwise construction of ensemble scores.

Univariate tests (Wilcoxon test for continuous variables and Fisher’s exact test for categorical variables) are first used to test the association between the response variable and each explanatory variable. Variables with a p-value greater than 0.2 are excluded.
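As an illustration of this screening step for binary variables, the following minimal sketch computes a two-sided Fisher exact p-value for a 2×2 table and applies the 0.2 threshold. The function names, the example tables and the variable names `var_a` and `var_b` are hypothetical; the Wilcoxon test for continuous variables is not covered here.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of observing k in the top-left cell.
        return comb(row1, k) * comb(row2, col1 - k) / total

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(p for p in (prob(k) for k in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))

def screen(p_values, threshold=0.2):
    """Keep the variables whose univariate p-value does not exceed the threshold."""
    return [name for name, p in p_values.items() if p <= threshold]

# Hypothetical contingency tables (variable present/absent versus event/no event).
p_values = {
    "var_a": fisher_exact_two_sided(3, 1, 1, 3),
    "var_b": fisher_exact_two_sided(8, 2, 1, 5),
}
kept = screen(p_values)
```

In this toy example, `var_a` has a p-value of about 0.49 and is excluded, whereas `var_b` has a p-value of about 0.035 and is retained.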

The methodology detailed in Duarte et al. [

1) n_{1} classifiers are chosen.

2) n_{2} bootstrap samples are drawn from the working sample. Each bootstrap sample is used n_{1} times (each sample is used by each classifier).

3) n_{3} modalities of random selection of variables are chosen, “modality” representing a means to select the variables.

4) n_{1}n_{2}n_{3} models are built, each using a different combination of classifiers, bootstrap samples and modalities of selection of variables.

5) A first aggregation by classifiers is performed. The coefficients of the models are averaged to yield n_{1} intermediate scores.

6) The coefficients of the intermediate scores are normalized such that the scores themselves are between 0 and 100, using the same method as in Duarte et al. ( [

7) The final score is constructed by taking a convex combination of the intermediate scores maximizing the AUC OOB (AUC on out-of-bag samples).
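Steps 1 to 5 above can be sketched as follows. This is only an outline under stated assumptions: `fit` is a hypothetical user-supplied function returning the coefficients of one model, and a variable left out by the random selection is counted as having a coefficient of 0 in the average, which may differ from the convention of the original methodology. Normalization (step 6) and the convex combination (step 7) are not shown.

```python
from collections import defaultdict
from itertools import product

def intermediate_scores(classifiers, samples, modalities, fit):
    """Average model coefficients per classifier over all (sample, modality) pairs.

    fit(classifier, sample, modality) -> {variable: coefficient} (hypothetical).
    Variables absent from a model contribute 0 to the average (assumption).
    """
    sums = {c: defaultdict(float) for c in classifiers}
    n_models = len(samples) * len(modalities)
    # One model per combination: n1 * n2 * n3 models in total.
    for c, sample, modality in product(classifiers, samples, modalities):
        for variable, coef in fit(c, sample, modality).items():
            sums[c][variable] += coef
    return {c: {v: total / n_models for v, total in sums[c].items()}
            for c in classifiers}

def toy_fit(classifier, sample, modality):
    # Toy stand-in for a fitted model: x2 is only selected under modality "all".
    coefs = {"x1": 1.0 if classifier == "LDA" else 2.0}
    if modality == "all":
        coefs["x2"] = 4.0
    return coefs

scores = intermediate_scores(["LDA", "LR"], [0, 1], ["all", "subset"], toy_fit)
```

With two classifiers, two samples and two modalities, eight models are fitted and two intermediate coefficient vectors are returned, one per classifier.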

The AUC OOB (AUC on out-of-bag samples) is computed as follows: for a given statistical unit, the scores obtained from bootstrap samples that do not include this statistical unit are aggregated to obtain an OOB prediction. By applying this method for all statistical units, the OOB predictions for the entire sample are used to compute the AUC OOB.
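The computation of the AUC OOB described above can be sketched as follows, assuming that the aggregation of the out-of-bag scores of a statistical unit is a simple mean (the text only states that they are aggregated). All names and the toy data are hypothetical.

```python
def auc(scores, labels):
    """AUC as the probability that a case outranks a control (ties count 1/2)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))

def oob_predictions(per_sample_scores, in_bag):
    """For each unit, average the scores of the models fitted without it.

    per_sample_scores[b][i]: score of unit i under the model of bootstrap sample b.
    in_bag[b]: set of unit indices drawn in bootstrap sample b.
    A unit present in every bootstrap sample gets None.
    """
    n_units = len(per_sample_scores[0])
    preds = []
    for i in range(n_units):
        oob = [scores[i] for scores, bag in zip(per_sample_scores, in_bag)
               if i not in bag]
        preds.append(sum(oob) / len(oob) if oob else None)
    return preds

# Toy data: 4 units, 3 bootstrap samples.
labels = [1, 0, 1, 0]
per_sample_scores = [[0.9, 0.2, 0.6, 0.4],
                     [0.8, 0.3, 0.7, 0.1],
                     [0.7, 0.5, 0.4, 0.3]]
in_bag = [{0, 1}, {2, 3}, {0, 3}]
preds = oob_predictions(per_sample_scores, in_bag)
auc_oob = auc(preds, labels)
```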

The search for an optimal set of coefficients of the convex combination of the intermediate scores may be carried out over a discrete subset of the set $A = \{(\alpha_{1}, \dots, \alpha_{n_{1}}) : \alpha_{1} + \dots + \alpha_{n_{1}} = 1\}$. We used this method in the application. This search may be too time-consuming when the number of candidate elements is large. Otherwise, the simplest way is to use $A_{1} = \{(1, 0, \dots, 0), \dots, (0, 0, \dots, 1)\}$, i.e. to choose, among the n_{1} classifiers, the one that maximizes the AUC OOB. Note that in Super Learner ( [

Compared to the methodology presented in Duarte et al. [

As the number p of explanatory variables after the first exclusion of variables still remains too large to create a parsimonious score, a second phase is added in order to preselect fewer variables. Three different methods with an additional preselection are proposed and their results compared. In Method 1, any adapted preselection of variables can be performed for each of the n_{1} classifiers and the sets of preselected variables are united in one set; then, a backward construction of scores is performed. In Method 2, a backward construction of scores is performed with a random selection of variables at each step. In Method 3, a forward preselection of variables, using the AUC in resubstitution as criterion, is performed for one of the classifiers or for each of the classifiers, followed by a forward construction of scores using the AUC OOB as criterion.

Preselection of variables: For each of the n_{1} classifiers, any adapted preselection of variables can be performed. Thus, n_{1} sets of preselected variables are created. The union of these n_{1} sets is used as initial preselection. Let s be the number of preselected variables.

Backward construction of scores: For i = 1, 2, ..., s, at step i, an ensemble score is constructed from j = s − i + 1 variables (i.e., j = s for i = 1 and j = 1 for i = s), using the method described in 2.2 with n_{1} classifiers, n_{2} bootstrap samples and n_{3} modalities of random selection of variables. The variable with the lowest normalized and standardized coefficient in absolute value in this score is excluded for step i + 1 (backward selection).

This allows the evolution of the AUC OOB according to the number of selected variables, as well as the order of removal of the variables, to be determined. Parsimonious scores with few variables can be chosen among this sequence of s scores.
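The backward construction can be outlined as below. This is a sketch only: `fit_score` and `evaluate` are hypothetical placeholders for, respectively, the construction of the ensemble score (returning the normalized and standardized coefficients) and the computation of the AUC OOB.

```python
def backward_sequence(variables, fit_score, evaluate):
    """Return [(variables_used, criterion, weakest_variable), ...] from s down to 1.

    fit_score(vars) -> {variable: standardized coefficient} of the ensemble score.
    evaluate(vars)  -> performance criterion of that score (e.g. AUC OOB).
    Both callables are hypothetical and supplied by the user.
    """
    current = list(variables)
    history = []
    while current:
        coefs = fit_score(current)
        # The variable with the smallest absolute coefficient is dropped next.
        weakest = min(current, key=lambda v: abs(coefs[v]))
        history.append((tuple(current), evaluate(current), weakest))
        current.remove(weakest)
    return history

# Toy stand-ins: fixed coefficients and an artificial criterion.
toy_coefs = {"x1": 2.0, "x2": -0.3, "x3": 1.1}
history = backward_sequence(
    ["x1", "x2", "x3"],
    fit_score=lambda vs: {v: toy_coefs[v] for v in vs},
    evaluate=lambda vs: 0.5 + 0.1 * len(vs),
)
removal_order = [step[2] for step in history]
```

The returned history gives one entry per score in the sequence, so the criterion can be plotted against the number of variables; the last entry corresponds to the score with a single variable.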

Preselection of variables: No initial preselection of variables is performed; all of the p explanatory variables are included.

Backward construction of scores: For i = 1, 2, ..., p, at step i, an ensemble score is constructed from j = p − i + 1 variables (i.e., j = p for i = 1 and j = 1 for i = p), using the method described in 2.2 with n_{1} classifiers, n_{2} bootstrap samples and n_{3} modalities of random selection of variables. The variable with the lowest normalized and standardized coefficient in absolute value in this score is excluded for step i + 1.

Again, this process allows determining the evolution of the AUC OOB according to the number of selected variables, as well as the order of removal of the variables, and parsimonious scores with few variables can be chosen among this sequence of p scores.

Forward preselection of variables: A forward preselection using the AUC as criterion is performed for one of the classifiers or for each of the classifiers. For a given classifier, let t denote a stopping time. For i = 1, 2, ..., t, at step i, the i − 1 variables V_{1}, ..., V_{i−1} are available from step i − 1. For every set of variables {V_{1}, ..., V_{i−1}, V_{j}}, where V_{j} is a variable not yet selected, a classification is performed on the entire sample without bootstrapping. The variable yielding the maximal AUC in resubstitution, denoted V_{i}, is included provided that the AUC increases significantly according to DeLong’s test; otherwise, the inclusion of variables is stopped.

Note that the AUC can be computed as long as there is a prediction for each statistical unit, without assumption on the manner with which this prediction was obtained.
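The forward preselection can be outlined as follows. In this sketch, `auc_of` and `is_significant` are hypothetical user-supplied functions; in particular, the toy example replaces DeLong's test by a simple minimum-gain rule, which is a simplification and not the test used in this work.

```python
def forward_preselection(candidates, auc_of, is_significant):
    """Greedy forward selection, stopped when the best addition is not significant.

    auc_of(vars) -> resubstitution AUC of the classifier fitted on vars.
    is_significant(old_vars, new_vars) -> True if the AUC gain is significant
    (DeLong's test in the text; a user-supplied placeholder here).
    """
    selected, remaining = [], list(candidates)
    while remaining:
        # Try each remaining variable and keep the one giving the largest AUC.
        best = max(remaining, key=lambda v: auc_of(selected + [v]))
        if not is_significant(selected, selected + [best]):
            break
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy stand-ins: additive AUC gains and a minimum-gain rule instead of DeLong's test.
gain = {"a": 0.2, "b": 0.05, "c": 0.001}
toy_auc = lambda vs: 0.5 + sum(gain[v] for v in vs)
selected = forward_preselection(
    ["c", "a", "b"],
    auc_of=toy_auc,
    is_significant=lambda old, new: toy_auc(new) - toy_auc(old) > 0.01,
)
```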

Forward construction of scores: For each classifier, for i = 1, 2, ..., t, at step i, an intermediate score using the i preselected variables for this classifier is constructed, using n_{2} bootstrap samples (the same for all of the classifiers) and n_{3} modalities of random selection of variables. The n_{1} intermediate scores using the same number of preselected variables are aggregated in a final score by combining their predictions for each statistical unit as described in 2.2.

The area under the ROC curve (AUC) for the out-of-bag (OOB) estimations is used as internal validation and as the main criterion to compare the different scores. Several AUC OOB are studied: the AUC OOB for the intermediate scores and, mainly, the AUC OOB for the global score. Sensitivity (Se) and specificity (Sp) corresponding to the highest Youden index (Se + Sp − 1), as well as the number of selected variables and processing time, are also taken into account.
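For illustration, the sensitivity and specificity at the maximum Youden index can be obtained by scanning the observed score values as candidate thresholds. The sketch below assumes that a statistical unit is classified as positive when its score is greater than or equal to the threshold; the function name and the data are hypothetical.

```python
def best_youden(scores, labels):
    """Return (threshold, sensitivity, specificity, youden) maximizing Se + Sp - 1.

    A unit is classified as positive when its score is >= threshold (assumption).
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < t and y == 0)
        se, sp = tp / n_pos, tn / n_neg
        # Keep the first threshold reaching the highest Youden index.
        if best is None or se + sp - 1 > best[3]:
            best = (t, se, sp, se + sp - 1)
    return best

scores = [10, 20, 30, 40, 50, 60, 70]
labels = [0, 0, 0, 1, 0, 1, 1]
threshold, se, sp, youden = best_youden(scores, labels)
```

In this toy example, the threshold 40 is selected, with a sensitivity of 1, a specificity of 0.75 and a Youden index of 0.75.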

Herein, two classifiers (n_{1} = 2) were chosen to construct the ensemble scores: linear discriminant analysis (LDA), which is equivalent to linear regression on binary outcomes, and logistic regression (LR). The number of bootstrap samples was n_{2} = 1000. Two modalities of random selection of variables were chosen (n_{3} = 2): one modality consisted in randomly drawing a defined number of variables; the other in randomly drawing a defined number of groups of related variables (correlated or linked by construction) and, for each selected group, randomly drawing one variable. The groups of related variables used in the application are shown in the Supplementary Material A.1.
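The two modalities of random selection can be sketched as follows. The groups shown are purely illustrative and are not those of the Supplementary Material; the function names are hypothetical.

```python
import random

def draw_variables(variables, k, rng):
    """Modality 1: draw k variables at random, without replacement."""
    return rng.sample(variables, k)

def draw_from_groups(groups, k, rng):
    """Modality 2: draw k groups of related variables, then one variable per group."""
    chosen_groups = rng.sample(sorted(groups), k)
    return [rng.choice(groups[g]) for g in chosen_groups]

# Illustrative groups only (hypothetical grouping).
groups = {
    "blood pressure": ["systolic BP", "diastolic BP", "mean BP"],
    "renal function": ["serum creatinine", "eGFR"],
    "NYHA": ["NYHA >= II", "NYHA >= III", "NYHA >= IV"],
}
all_variables = [v for members in groups.values() for v in members]

rng = random.Random(0)  # fixed seed for reproducibility
by_variable = draw_variables(all_variables, 3, rng)
by_group = draw_from_groups(groups, 2, rng)
```

With the second modality, two variables of the same group can never be drawn together in one model, which limits the inclusion of strongly related predictors.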

The score constructed for linear discriminant analysis is denoted $S_{LDA}$ and the one for logistic regression $S_{LR}$. The two normalized scores are denoted $\bar{S}_{LDA}$ and $\bar{S}_{LR}$, and the final score is $\bar{S} = \lambda \bar{S}_{LDA} + (1 - \lambda) \bar{S}_{LR}$ ($0 \le \lambda \le 1$) (see

For Method 1, a stepwise preselection using the Akaike Information Criterion (AIC) was performed on the working sample, without bootstrapping, both for LDA and for LR. Note that herein, the AIC can be used as criterion since both LDA and LR are probabilistic models.

For Method 3, the results presented used a forward preselection with LR (Method 3a). Results obtained using a forward preselection using both LR and LDA (Method 3b) or using only LDA (Method 3c) are available as Supplementary Material (Part B).

The data used in this study are derived from the GISSI-HF trial: a multicenter, randomized, double-blind, placebo-controlled trial designed to assess the effect of n-3 polyunsaturated fatty acids in patients with CHF. The detailed protocol and main results of this trial have already been described elsewhere [

Eligible patients were adult men and women with clinical evidence of HF of any cause, with a New York Heart Association (NYHA) class II-IV, and having had a left ventricular ejection fraction (LVEF) measured within 3 months prior to enrolment. Patients with an LVEF greater than 40% had to have been admitted at least once to hospital for HF in the preceding year to meet the inclusion criteria. In addition to contraindications linked to the studied treatment, exclusion criteria included acute coronary syndrome or a revascularization procedure within the preceding month, and planned cardiac surgery expected to be performed within 3 months after randomization.

After randomization and the baseline visit, patients underwent scheduled visits at 1, 3, 6, 12 months and every 6 months thereafter until the end of the trial. Data collected at baseline included patient description, medical history, etiology of HF, LVEF measurements, electrocardiogram data, clinical and cardiovascular examination, blood chemistry tests, pharmacological treatments and dietary habits. During the follow-up visits, collected data consisted of patient description, clinical and cardiovascular examination, LVEF measurement, electrocardiogram data, blood chemistry tests (only at 1, 3, 6, 12, 24, 36 and 48 months), pharmacological treatment (including the study treatment) and dietary habits. Events of interest were also recorded. The entire GISSI-HF trial included 7046 eligible and randomized patients, with the final sample analyzed in [

The present study used a subsample of the GISSI-HF data containing 1231 patients with N-terminal prohormone brain natriuretic peptide (NT-proBNP) measurements. The dataset included baseline and follow-up visits for these patients, as well as their associated health events.

(Patient, visit) couples were used herein as statistical units, i.e. each observation was associated to a patient for a given visit. We assumed that the short-term future of a patient was only dependent on the most recent measurements. Thus, the links between several couples pertaining to the same patient were not taken into account, as in [

Several variables were derived from the available data, either for the follow-up visits (when values were available at baseline but not for the follow-up) or for all visits: mean blood pressure (BP) (1/3 * systolic BP + 2/3 * diastolic BP); estimated plasma volume (ePVS) ((100-hematocrit)/hemoglobin as defined in [

Categorical variables were recoded as binary dummy variables. In particular, in the case of ordinal variables (i.e. NYHA class and peripheral edema), an ordinal encoding was used, namely constructing the binary variables NYHA ≥ II, NYHA ≥ III and NYHA ≥ IV and, similarly, peripheral edema ≥ ankles, peripheral edema ≥ knee, peripheral edema ≥ above.
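The ordinal encoding described above can be illustrated by the following sketch. The lowest level of each variable serves as the reference; the label "none" for the absence of peripheral edema is an assumption, as is the function name.

```python
def ordinal_dummies(value, levels, name):
    """Cumulative ('>=') binary coding of an ordinal variable.

    levels are ordered from lowest to highest; the lowest level is the reference.
    """
    rank = levels.index(value)
    return {f"{name} >= {level}": int(rank >= i)
            for i, level in enumerate(levels) if i > 0}

NYHA_LEVELS = ["I", "II", "III", "IV"]
EDEMA_LEVELS = ["none", "ankles", "knee", "above"]  # "none" is an assumed label

nyha_iii = ordinal_dummies("III", NYHA_LEVELS, "NYHA")
edema_ankles = ordinal_dummies("ankles", EDEMA_LEVELS, "peripheral edema")
```

A patient in NYHA class III is thus coded 1 for NYHA ≥ II and NYHA ≥ III, and 0 for NYHA ≥ IV.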

Since some variables were only available at baseline but were unlikely to change over time (e.g. sex), their values were copied for follow-up visits. Similarly, certain medical history variables available at baseline (such as previous acute myocardial infarction (AMI), previous stroke, angina pectoris, coronary artery bypass graft (CABG), previous hospitalization for worsening HF) were copied for follow-up visits and, when possible, updated using the information from the events.

NT-proBNP values were only measured at baseline and at the 3-month follow-up. Due to the importance of this variable in the literature [

Lastly, the response variable was defined as the occurrence of a composite event (death for worsening HF or hospitalization for worsening HF) within 180 days of a visit.

Since the laboratory tests for measuring blood parameters were performed only at baseline, 1, 3, 6, 12, 24, 36 and 48 months, only the observations corresponding to these visits were retained. Incomplete observations (with missing values) were also excluded.

Several variables not relevant to this study were excluded (e.g. “technical variables”, such as identification numbers or dates, or “intermediary variables” used to build other variables, such as the cause of death or drug doses), as well as variables with more than 1000 missing values. The remaining variables and the groups of related variables are shown in the Supplementary Material A.1.

Six binary variables with univariate p-value greater than 0.2 (Fisher’s exact test) were excluded: gender being “female”, main cause of HF being “hypertension” or “other”, history of coronary angioplasty, left ventricular hypertrophy, pathological Q waves.

In order to eliminate outliers without excluding the associated observations, all continuous variables were winsorized: all values lower than the 1^{st} percentile (respectively greater than the 99^{th} percentile) were set to the value of the 1^{st} percentile (resp. the 99^{th} percentile). This method was used to avoid excluding further observations, since the number of cases was already small compared to the number of controls, and thus to avoid reducing the number of patients with an event.
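A minimal sketch of this winsorization is given below. The percentile definition (linear interpolation between closest ranks) is an assumption, since the exact definition used is not specified in the text.

```python
def percentile(sorted_values, q):
    """Percentile with linear interpolation between closest ranks (q in [0, 100])."""
    pos = (len(sorted_values) - 1) * q / 100
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] * (1 - frac) + sorted_values[upper] * frac

def winsorize(values, low=1, high=99):
    """Clamp values below the low percentile or above the high percentile."""
    ordered = sorted(values)
    lo, hi = percentile(ordered, low), percentile(ordered, high)
    return [min(max(v, lo), hi) for v in values]

# Toy data: the integers 1..100 plus one extreme value.
values = list(range(1, 101)) + [1000]
clipped = winsorize(values)
```

In this example the extreme value 1000 is set to 100 and the smallest value 1 is set to 2, while the number of observations is unchanged.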

Continuous variables were then transformed to satisfy the linearity assumption of logistic regression. For each continuous variable, a similar method to that described in Duarte et al. [^{−}^{2} for optimal transformation, while the optimal transformation for eGFR and NT-proBNP was 1/x and ln(x) respectively. The remaining five variables (BMI, systolic blood pressure, hematocrit, uricemia and LVEF) had a quadratic relationship with the logit and were transformed accordingly. After the transformation, the coefficient associated with the cubic component of the spline was non-significantly different from 0 for each of the transformed variables.

This transformed dataset was used for Methods 1, 2 and the LR intermediate score of Method 3a. A similar technique was used on a duplicate dataset for the LDA intermediate score of Method 3a, but with transformation of the variables in order to satisfy the linearity assumption for linear regression; fifteen variables were transformed: ten were transformed using a quadratic (x – k)^{2} transformation (BMI, systolic blood pressure, diastolic blood pressure, mean blood pressure, hematocrit, hemoglobin, ePVS, serum sodium, uricemia, total cholesterol and LVEF); three using an inverse square x^{−2} transformation (eGFR, triglycerides, cholesterol HDL); one using a square transformation (serum creatinine); and one using a square root transformation (NT-proBNP).

The p-values of the tests, before and after transformation, as well as the transformation functions applied to the variables both for the LR and for the LDA are available as Supplementary Material (Part C).

Given the large imbalance between cases and controls, the sample was balanced by duplicating each case 15 times. This is equivalent to giving each case fifteen times more weight than a control. Preliminary analyses (not shown) indicated that using a sample rebalanced in this manner resulted in better performance than using the unbalanced sample.

After the exclusions, the working sample consisted of 11,411 observations of 62 explanatory variables, with 5595 (duplicated) events and 5816 non-events.
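The balancing by duplication can be sketched as follows; the function name and the toy rows are hypothetical. Since each case appears 15 times, the 5595 duplicated events would correspond to 5595 / 15 = 373 distinct event observations, and 5595 + 5816 = 11,411.

```python
def balance_by_duplication(rows, is_case, factor=15):
    """Repeat each case `factor` times; controls are kept once."""
    balanced = []
    for row in rows:
        balanced.extend([row] * (factor if is_case(row) else 1))
    return balanced

# Toy sample: one case and two controls.
rows = [{"id": 1, "event": 1}, {"id": 2, "event": 0}, {"id": 3, "event": 0}]
balanced = balance_by_duplication(rows, is_case=lambda r: r["event"] == 1)
n_events = sum(r["event"] for r in balanced)

# Consistency check on the counts reported in the text.
distinct_events = 5595 // 15
```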

Summary statistics of the sample prior to data management (winsorization, transformation of the variables and sample balancing) are available in Supplementary Material A.2. Summary statistics of the sample after winsorization and sample balancing, but before the transformation of the variables, are provided in Supplementary Material A.1.

The detailed preselections with their corresponding AUC are given in

For Method 1, 50 variables were preselected during the stepwise selection phase, after which the maximum AUC OOB was obtained for the score using 49 variables. The total runtime for the first method was approximately 1h30 (5 min for the two stepwise preselections and 1h25 for the backward selection using scores).

Comparatively, for Method 2, the maximum AUC OOB corresponded to the score using 58 variables. The total runtime of the second method was approximately 1h35 (exclusively for the backward selection using scores).

For Method 3a, the logistic forward preselection yielded 26 variables, mostly clinical or biological, after which the AUC no longer increased significantly. The total runtime of the third method was approximately 1h05 if all the scores were constructed (less than 5 min for the preselection and 30 min for each of the successions of scores). However, unlike the other two methods, it is not mandatory to construct all of the scores with Method 3a and one could construct only one score after the preselection of variables. In this case, the total runtime would be reduced to less than 10 min (less than 5 min for the preselection and 2 - 5 min to construct one score).

Preselected variables were extremely similar between all three methods. For Methods 1 and 2, three variables were needed to obtain an AUC OOB greater than 0.75 (for Method 3a, only two were needed). Among these variables, two were common to all methods: NT-proBNP and NYHA ≥ III. In order to obtain an AUC OOB above 0.78, all methods necessitated eight variables, seven of which were common to the three methods: NT-proBNP, NYHA ≥ III, glycemia, systolic blood pressure, beta-blockers, peripheral edema ≥ “above” and NYHA ≥ II. Lastly, for an AUC OOB threshold of 0.80, Methods 1 and 2 necessitated 17 variables, while Method 3a necessitated 15. In this case, 13 variables were common to the three methods: added to the seven aforementioned variables were cholesterol HDL, heart rate, uricemia, third heart sound, bilirubin and paroxystic atrial fibrillation. Globally, the three selections were very similar.

For a fixed number of variables, the three methods yielded extremely similar AUC OOB, even when the selections of variables themselves were different. Since Method 3a generally yielded the best AUC OOB for a given number of selected variables and with a faster runtime, only the results for parsimonious scores constructed by this method are given at the end of this section.

No. | Method 1: variable | Method 1: AUC OOB* | Method 2: variable | Method 2: AUC OOB* | Method 3a: variable | Method 3a: AUC OOB** (LR part) | Method 3a: AUC OOB** (LDA part) | Method 3a: AUC OOB*** (all) |
---|---|---|---|---|---|---|---|---|
1 | NT-proBNP | 0.7246 | NT-proBNP | 0.7246 | NT-proBNP | 0.7246 | 0.7246 | 0.7246 |
2 | NYHA ≥ III | 0.7482 | NYHA ≥ III | 0.7482 | NYHA ≥ III | 0.7482 | 0.7523 | 0.7523 |
3 | Periph. edema ≥ “above” | 0.7547 | Heart rate | 0.7529 | NYHA ≥ II | 0.7550 | 0.7579 | 0.7579 |
4 | Glycemia | 0.7620 | Systolic BP | 0.7591 | Glycemia | 0.7621 | 0.7642 | 0.7642 |
5 | Systolic BP | 0.7671 | NYHA ≥ II | 0.7647 | Periph. edema ≥ “above” | 0.7687 | 0.7688 | 0.7694 |
6 | Beta-blockers | 0.7730 | Beta-blockers | 0.7696 | Beta-blockers | 0.7731 | 0.7714 | 0.7736 |
7 | NYHA ≥ II | 0.7787 | Glycemia | 0.7764 | Systolic BP | 0.7791 | 0.7761 | 0.7792 |
8 | Cholesterol HDL | 0.7829 | Periph. edema ≥ “above” | 0.7810 | Cholesterol HDL | 0.7835 | 0.7796 | 0.7835 |
9 | Mean BP | 0.7827 | Cholesterol HDL | 0.7852 | Paroxystic AF | 0.7864 | 0.7831 | 0.7867 |
10 | Diastolic BP | 0.7840 | Uricemia | 0.7885 | Uricemia | 0.7902 | 0.7866 | 0.7904 |
11 | Heart rate | 0.7861 | Bilirubin | 0.7912 | Bilirubin | 0.7925 | 0.7876 | 0.7926 |
12 | Uricemia | 0.7897 | Diuretics | 0.7913 | Implantable defibrillator | 0.7948 | 0.7908 | 0.7950 |
13 | Third heart sound | 0.7922 | Previous AMI | 0.7932 | Neoplasia | 0.7966 | 0.7924 | 0.7968 |
14 | Bilirubin | 0.7950 | Paroxystic AF | 0.7953 | Third heart sound | 0.7984 | 0.7947 | 0.7985 |
15 | Previous AMI | 0.7967 | Third heart sound | 0.7982 | Heart rate | 0.8001 | 0.7963 | 0.8002 |
16 | Paroxystic AF | 0.7988 | LVEF | 0.7990 | Previous AMI | 0.8020 | 0.7977 | 0.8020 |
17 | Implantable defibrillator | 0.8010 | Triglycerides | 0.8006 | Triglycerides | 0.8038 | 0.7993 | 0.8038 |
18 | Neoplasia | 0.8027 | Neoplasia | 0.8028 | LVEF | 0.8052 | 0.8010 | 0.8052 |
19 | LVEF | 0.8045 | Ascitis | 0.8038 | Hypertension | 0.8067 | 0.8021 | 0.8067 |
20 | Triglycerides | 0.8064 | Implantable defibrillator | 0.8060 | Mitral insufficiency | 0.8080 | 0.8040 | 0.8080 |
21 | Diuretics | 0.8070 | Hemoglobin | 0.8058 | Smoker or ex-smoker | 0.8091 | 0.8053 | 0.8091 |
22 | Ascitis | 0.8085 | ePVS | 0.8061 | Ascitis | 0.8104 | 0.8060 | 0.8104 |
23 | Mid-apical pulmonary rales | 0.8091 | Hematocrit | 0.8070 | Periph. edema ≥ “ankles” | 0.8116 | 0.8069 | 0.8116 |
24 | Smoker or ex-smoker | 0.8099 | Smoker or ex-smoker | 0.8080 | NYHA ≥ IV | 0.8119 | 0.8071 | 0.8119 |
25 | Mitral insufficiency | 0.8108 | Mitral insufficiency | 0.8086 | BMI | 0.8130 | 0.8084 | 0.8130 |
26 | Hypertension | 0.8121 | BMI | 0.8103 | Mid-apical pulmonary rales | 0.8137 | 0.8084 | 0.8137 |
27 | BMI | 0.8131 | Hypertension | 0.8119 | | | | |
28 | Periph. edema ≥ “ankles” | 0.8144 | Previous hosp. for worsening HF | 0.8119 | | | | |
29 | Periph. edema ≥ “knee” | 0.8151 | Mid-apical pulmonary rales | 0.8127 | | | | |
30 | CABG | 0.8157 | Diabetes | 0.8127 | | | | |
31 | Calcium antagonists | 0.8161 | CABG | 0.8133 | | | | |
32 | Previous hosp. for worsening HF | 0.8166 | Periph. edema ≥ “knee” | 0.8136 | | | | |
33 | Bundle branch block | 0.8170 | Diastolic BP | 0.8137 | | | | |
34 | NYHA ≥ IV | 0.8176 | NYHA ≥ IV | 0.8145 | | | | |
35 | Serum sodium | 0.8178 | Bundle branch block | 0.8147 | | | | |
36 | Diabetes | 0.8180 | Calcium antagonists | 0.8153 | | | | |
37 | COPD | 0.8181 | Total cholesterol | 0.8149 | | | | |
38 | Previous stroke | 0.8184 | Mean BP | 0.8151 | | | | |
39 | Years of school education | 0.8187 | COPD | 0.8150 | | | | |
40 | Age | 0.8186 | Periph. edema ≥ “ankles” | 0.8163 | | | | |
41 | Weight | 0.8186 | Atrial fibrillation | 0.8166 | | | | |
42 | Serum creatinine | 0.8185 | Cause of HF = “not known” | 0.8165 | | | | |
43 | eGFR | 0.8186 | Previous stroke | 0.8167 | | | | |
44 | Total cholesterol | 0.8184 | Aortic stenosis | 0.8167 | | | | |
45 | Aortic stenosis | 0.8185 | Age | 0.8164 | | | | |
46 | Cause of HF = “not known” | 0.8186 | Angina pectoris | 0.8164 | | | | |
47 | Atrial fibrillation | 0.8187 | Years of school education | 0.8166 | | | | |
48 | Pulmonary rales | 0.8186 | Waiting for cardiac transplantation | 0.8168 | | | | |
49 | Basal pulmonary rales | 0.8188 | Serum sodium | 0.8170 | | | | |
50 | Transient ischemic attack | 0.8187 | Definitive pace maker | 0.8171 | | | | |
51 | | | Basal pulmonary rales | 0.8168 | | | | |
52 | | | Weight | 0.8169 | | | | |
53 | | | eGFR | 0.8169 | | | | |
54 | | | Transient ischemic attack | 0.8170 | | | | |
55 | | | Hepatomegaly | 0.8165 | | | | |
56 | | | Pulmonary rales | 0.8168 | | | | |
57 | | | ECG evaluation | 0.8167 | | | | |
58 | | | ACE-inhibitors | 0.8172 | | | | |
59 | | | Serum creatinine | 0.8168 | | | | |
60 | | | Serum potassium | 0.8170 | | | | |
61 | | | CVP > 6 cm H20 | 0.8170 | | | | |
62 | | | Cause of HF = “cardiomyopathy” | 0.8167 | | | | |

*AUC OOB obtained for the score including the variable in the row as well as all previous variables. **The AUC OOB values in these columns were obtained by building an intermediate score using only LDA (respectively LR) for the linear part (resp. logistic part) from the selected variables. ***The AUC OOB values in this column were obtained by constructing a full ensemble score with the same number of variables for both LDA and LR, using the optimal λ for each score. ACE: angiotensin-converting enzyme; AF: atrial fibrillation; AMI: acute myocardial infarction; AUC OOB: area under the ROC curve out-of-bag; BMI: body mass index; BP: blood pressure; CABG: coronary artery bypass graft; COPD: chronic obstructive pulmonary disease; CVP: central venous pressure; eGFR: estimated glomerular filtration rate; ePVS: estimated plasma volume; HDL: high-density lipoprotein; HF: heart failure; LVEF: left ventricular ejection fraction; NT-proBNP: N-terminal prohormone brain natriuretic peptide; NYHA: New York Heart Association.

Four scores constructed by Method 3a were particularly studied: the score including all variables selected by the forward preselection, denoted S3.26 (the number of the method and the number of variables used), and three “parsimonious” scores, denoted S3.15, S3.8 and S3.2, which yielded an AUC OOB above certain thresholds (0.80, 0.78 and 0.75). To attain these thresholds, 15, 8 and 2 variables were respectively needed. The AUC OOB with λ = 0.5 and the optimal λ, as well as the optimal sensitivity and specificity according to the maximum Youden index of these four scores are given in

Score S3.2 had an AUC OOB of 0.7523 with an optimal λ = 1 (i.e. only LDA was used). Score S3.8 had an AUC OOB of 0.7835 with an optimal λ = 0.06. Score S3.15 had an AUC OOB of 0.8002 with an optimal λ = 0.09. Finally, the full score S3.26, including all preselected variables, had an AUC OOB of 0.8137 with an optimal λ = 0 (i.e. only LR was used). It is interesting to note that only LDA was used for score S3.2, whereas only LR was used for score S3.26. Thus, both classifiers are useful.

AUC OOB | Method 1 | Method 2 | Method 3a | Number of variables common to all methods |
---|---|---|---|---|
≥0.750 | 3 | 3 | 2 | 2 |
≥0.760 | 4 | 5 | 4 | 2 |
≥0.770 | 6 | 7 | 6 | 4 |
≥0.780 | 8 | 8 | 8 | 7 |
≥0.790 | 13 | 11 | 10 | 9 |
≥0.800 | 17 | 17 | 15 | 13 |
≥0.810 | 25 | 26 | 22 | 21 |

Note: even if the methods necessitated the same number of variables to obtain a given AUC, the variables themselves may not be the same.

Score designation | S3.26 | S3.26 | S3.15 | S3.15 | S3.8 | S3.8 | S3.2 | S3.2 |
---|---|---|---|---|---|---|---|---|
λ value | λ = 0.5 | λ = 0 (optimal) | λ = 0.5 | λ = 0.09 (optimal) | λ = 0.5 | λ = 0.06 (optimal) | λ = 0.5 | λ = 1 (optimal) |
Number of variables used | 26 | 26 | 15 | 15 | 8 | 8 | 2 | 2 |
Number of bootstrap samples | 1000 | 1000 | 1000 | 1000 | 1000 | 1000 | 1000 | 1000 |
Number of modalities | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 |
AUC OOB of the LDA | 0.8084 | 0.8084 | 0.7963 | 0.7963 | 0.7796 | 0.7796 | 0.7523 | 0.7523 |
AUC OOB of the LR | 0.8137 | 0.8137 | 0.8001 | 0.8001 | 0.7835 | 0.7835 | 0.7482 | 0.7482 |
AUC OOB of the final score | 0.8121 | 0.8137 | 0.7996 | 0.8002 | 0.7830 | 0.7835 | 0.7502 | 0.7523 |
Sensitivity* | 0.861 | 0.823 | 0.759 | 0.724 | 0.713 | 0.748 | 0.810 | 0.826 |
Specificity* | 0.611 | 0.651 | 0.689 | 0.719 | 0.707 | 0.675 | 0.551 | 0.547 |
Maximum Youden index | 0.472 | 0.474 | 0.448 | 0.443 | 0.420 | 0.423 | 0.361 | 0.373 |

Data: working sample defined in Section 3.3, with variables transformed differently for the linear intermediate score and the logistic intermediate score. *Sensitivity and specificity associated with the maximum value of the Youden index.
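With two classifiers, the optimal λ of the final score λ·S̄_LDA + (1 − λ)·S̄_LR can be searched on a grid over [0, 1]. The following sketch assumes a grid step of 0.01, which is an assumption and not a documented setting, and uses toy scores rather than the study data; in practice the AUC would be computed on out-of-bag predictions.

```python
def auc(scores, labels):
    """AUC as the probability that a case outranks a control (ties count 1/2)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))

def best_lambda(s_lda, s_lr, labels, step=0.01):
    """Grid search of lambda in [0, 1] for S = lambda * S_LDA + (1 - lambda) * S_LR."""
    n_steps = round(1 / step)
    best_lam, best_auc = None, -1.0
    for k in range(n_steps + 1):
        lam = k / n_steps
        combined = [lam * a + (1 - lam) * b for a, b in zip(s_lda, s_lr)]
        value = auc(combined, labels)
        # Keep the smallest lambda reaching the highest AUC.
        if value > best_auc:
            best_lam, best_auc = lam, value
    return best_lam, best_auc

# Toy scores: the first score ranks the units perfectly, the second in reverse order.
labels = [0, 0, 1, 1]
s_lda = [10, 20, 30, 40]
s_lr = [40, 30, 20, 10]
lam, value = best_lambda(s_lda, s_lr, labels)
```

In this artificial example, the combined score separates cases from controls as soon as λ exceeds 0.5, so the search returns λ = 0.51 with an AUC of 1.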

In this article, we presented and compared different methods of construction of parsimonious ensemble scores, with the construction of short-term event scores for CHF as a concrete illustration. Parsimonious scores were obtained by combining stepwise selections of variables and the use of an ensemble score. Since classic criteria of stepwise selection based on probabilistic models cannot be used in the case of an ensemble score, we proposed using a criterion based on the absolute values of the coefficients of variables in an ensemble score and a second criterion based on the AUC.

An advantage of a stepwise selection of predictors is that it allows automatically building a succession of scores and therefore choosing which of the latter has the best balance between performance and the number of variables, according to the desired quality objectives. Once this choice is made, the selected score can be used as a “classic” score. The use of an ensemble method to construct this score also provides confidence in the stability and performance of the results. Indeed, ensemble methods generally yield better results than a single predictor, provided that the predictors constituting the ensemble perform sufficiently well individually and are sufficiently different from each other [

Other selection methods could have been tested, for example a forward selection in which, at each step, every ensemble score with one more variable than in the previous step is built and only the variable yielding the largest increase in AUC OOB is kept. However, this would have entailed a lengthy processing time due to the large number of ensemble scores to construct, and preliminary results (not shown) suggest that it would not have performed better than the presented methods. In the application, variants of Method 3 could also be used, e.g. preselecting variables using LDA instead of LR. Summarized results for these alternative methods are presented in the Supplementary Material.

Regarding the variables used, when applying our method to the construction of a short-term score in patients with CHF, the most predictive variable was systematically NT-proBNP, which is a well-known predictor of HF [

All variables included in the parsimonious scores S3.15, S3.8 and S3.2 are easily available from either the patient’s medical history (paroxystic atrial fibrillation, previous AMI, implantable defibrillator, neoplasia), the patient’s drug consumption (beta-blockers), a clinical examination (NYHA class, peripheral edema, heart rate, blood pressure, third heart sound), or laboratory blood tests (NT-proBNP, glycemia, cholesterol HDL, bilirubin, uricemia, triglycerides).

To our knowledge, no study has presented a score for short-term (180 days) events in CHF. Therefore, comparing the performance of our scores with others in the literature is difficult. Recent existing scores were generally constructed to predict long-term events for CHF patients, often at 1 or 2 years [

· In Voors et al. [

· The AUC of score S3.8 is similar to that of the score proposed by Spinar et al. [

· The MAGGIC risk score [

The main limitation of our application study is that only one dataset was used in our tests. However, the present work is mostly a “proof of concept” of the usefulness of the presented methods of construction of parsimonious ensemble scores.

Variable selection methods based on a probabilistic model can be used to achieve a stepwise selection for a given classifier such as logistic regression, but not directly for a classifier constructed by aggregation of several classifiers. In this article, we proposed stepwise variable selection methods suited to this setting and used them to construct parsimonious ensemble scores combining sample balancing, several classifiers and bootstrap samples. As a concrete application, we constructed a short-term event score (death or hospitalization for HF within 180 days) for CHF patients, yielding AUC values that compare satisfactorily with those of other scores developed in other HF populations. The methods proposed and tested in this article can be applied to any prediction horizon, any set of variables and any other setting (other types of HF or other diseases), provided that the number of cases is sufficient, i.e. that the training dataset is sufficiently large. Applications to other datasets and comparisons with other methods should be conducted to confirm the value of the proposed methods.

The authors thank Mr. Pierre Pothier for editing this manuscript. Results incorporated in this article received funding from the investments for the Future program, France under grant agreement No ANR-15-RHU-0004.

The authors declare no conflicts of interest regarding the publication of this paper.

Lalloué, B., Monnez, J.-M., Lucci, D. and Albuisson, E. (2021) Construction of Parsimonious Event Risk Scores by an Ensemble Method. An Illustration for Short-Term Predictions in Chronic Heart Failure Patients from the GISSI-HF Trial. Applied Mathematics, 12, 627-653. https://doi.org/10.4236/am.2021.127045

Variables | Groups of related variables | Mean (SD) or N (%) | |
---|---|---|---|

Female^{b,d} | - | 2227 (19.5%) | |

Age^{a,g} | - | 68.10 (10.20) | |

Years of school education^{d,g} | - | 6.92 (3.65) | |

Weight^{g} | Obesity | 75.87 (14.33) | |

BMI^{a,g} | 26.96 (4.48) | ||

Smoker or ex-smoker^{b,d} | - | 6645 (58.2%) | |

Heart Rate^{g} | - | 72.49 (13.38) | |

Diastolic blood pressure^{g} | Blood pressure | 76.28 (10.17) | |

Systolic blood pressure^{g} | 125.21 (19.41) | ||

Mean blood pressure^{a,g} | 92.58 (12.17) | ||

NYHA class^{c} (ref: “NYHA I”) | ≥II | NYHA | 10837 (95.0%) |

≥III | 3061 (26.8%) | ||

≥IV | 242 (2.1%) | ||

Peripheral edema^{c,d} (ref: “No”) | ≥Ankles | Peripheral edema | 1768 (15.5%) |

≥Knee | 316 (2.8%) | ||

≥Above | 159 (1.4%) | ||

Main cause of HF ^{b} (ref: “Ischemic”) | Cardiomyopathy | - | 3126 (27.4%) |

Hypertension | - | 1726 (15.3%) | |

Other | - | 346 (3.0%) | |

Not known | - | 175 (1.5%) | |

Ascites^{b,d} | - | 147 (1.3%) | |

Hepatomegaly^{b,d} | - | 2188 (19.2%) | |

Mitral insufficiency^{b,d} | - | 5461 (47.9%) | |

CVP > 6 cm H2O^{b,d} | - | 1139 (10.0%) | |

Basal pulmonary rales^{b,d} | - | 1732 (15.2%) | |

Mid-apical pulmonary rales^{b,d} | - | 79 (0.7%) | |

Pulmonary rales^{b,d} | - | 599 (5.2%) | |

Aortic stenosis^{b,d} | - | 315 (2.8%) | |

Third heart sound (S_{3})^{b,d} | - | 2177 (19.1%) | |

Hematocrit^{g} | Hematology | 40.16 (4.53) | |

Hemoglobin^{g} | 13.40 (1.60) | ||

ePVS^{a,g} | 4.57 (0.92) | ||

Serum creatinine^{g} | Renal function | 1.27 (0.44) | |

eGFR^{a,g,h} | 64.08 (22.63) | ||

Serum potassium^{g} | - | 4.48 (0.50) | |

Serum sodium^{g} | - | 139.49 (3.33) | |

Uricemia^{g} | - | 6.43 (1.94) | |

Triglycerides^{g} | - | 137.92 (84.01) | |

Cholesterol HDL^{g} | Cholesterol | 47.58 (13.19) | |

Total Cholesterol^{g} | 175.10 (44.48) | ||

Bilirubin^{g} | - | 0.84 (0.42) | |

Glycemia^{g} | - | 122.98 (46.60) | |

NT-proBNP^{f,g} | - | 1856.60 (2194.91) | |

Diabetes mellitus^{b,d} | - | 3481 (30.5%) | |

Hypertension^{b,d} | - | 6470 (56.7%) | |

Previous AMI^{b,e} | - | 5421 (47.5%) | |

Previous stroke^{b,e} | - | 643 (5.6%) | |

Previous hosp. for worsening HF^{b,e} | - | 6526 (57.2%) | |

Angina pectoris^{b,e} | - | 2060 (18.1%) | |

Coronary angioplasty^{b,d} | - | 1478 (13.0%) | |

Transient ischemic attack (TIA)^{b,d} | - | 1228 (10.8%) | |

COPD^{b,d} | - | 2348 (20.6%) | |

CABG^{b,e} | - | 2847 (24.9%) | |

Implantable defibrillator^{b,d} | - | 1020 (8.9%) | |

Paroxystic AF^{b,d} | - | 2756 (24.2%) | |

Neoplasia^{b,d} | - | 592 (5.2%) | |

Definitive pace maker^{b,d} | - | 1944 (17.0%) | |

Waiting for cardiac transplantation^{b,d} | - | 122 (1.1%) | |

LVEF^{d,g} | - | 32.58 (10.05) | |

Bundle branch block^{b} | - | 3883 (34.0%) | |

Atrial fibrillation^{b} | - | 2087 (18.3%) | |

Left ventricular hypertrophy^{b} | - | 1885 (16.5%) | |

Pathological Q waves^{b} | - | 2236 (19.6%) | |

Normal ECG evaluation^{b} | - | 415 (3.6%) | |

ACE-inhibitors^{a,b} | - | 8782 (77.0%) | |

Beta-blockers^{a,b} | - | 7430 (65.1%) | |

Calcium antagonists^{a,b} | - | 803 (7.0%) | |

Diuretics^{a,b} | - | 10813 (94.8%) |

^{a}derived variable; ^{b}binary variable encoding; ^{c}ordinal encoding; ^{d}baseline value copied to follow-up visits; ^{e}baseline value copied to follow-up visits and updated when possible; ^{f}interpolated values; ^{g}winsorized variable. SD: standard deviation; BMI: body mass index; NYHA: New York Heart Association; HF: heart failure; CVP: central venous pressure; ePVS: estimated plasma volume; eGFR: estimated glomerular filtration rate; HDL: high-density lipoprotein; AMI: acute myocardial infarction; COPD: chronic obstructive pulmonary disease; CABG: coronary artery bypass graft; AF: atrial fibrillation; LVEF: left ventricular ejection fraction; ACE: angiotensin-converting enzyme.
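The ordinal encoding used in the table for NYHA class and peripheral edema (nested indicators such as ≥II, ≥III, ≥IV relative to a reference level) can be illustrated as follows. This is a minimal sketch; the function name and the dictionary output format are illustrative choices, not the authors' code.

```python
NYHA_LEVELS = ["I", "II", "III", "IV"]  # first level is the reference


def threshold_encode(value, levels):
    """Return one 0/1 indicator per non-reference level:
    the indicator for a level is 1 if value is at or above that level."""
    rank = levels.index(value)
    return {f">={lvl}": int(rank >= k) for k, lvl in enumerate(levels) if k > 0}


example = threshold_encode("III", NYHA_LEVELS)
```

With this encoding a patient in NYHA class III is counted in both the ≥II and ≥III rows, which is why the percentages reported for these rows are nested (95.0% ≥ 26.8% ≥ 2.1%).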

Variables | Groups of related variables | Mean (SD) or N (%) | |
---|---|---|---|

Female^{b,d} | - | 1219 (19.7%) | |

Age^{a} | - | 66.94 (10.76) | |

Years of school education^{d} | - | 7.00 (3.77) | |

Weight | Obesity | 76.25 (14.76) | |

BMI^{a} | 26.96 (4.46) | ||

Smoker or ex-smoker^{b,d} | - | 3425 (55.3%) | |

Heart rate | - | 70.22 (13.25) | |

Diastolic blood pressure | Blood pressure | 77.37 (10.29) | |

Systolic blood pressure | 127.16 (18.75) | ||

Mean blood pressure^{a} | 93.96 (12.07) | ||

NYHA^{c} (ref: “NYHA I”) | ≥II | NYHA | 5671 (91.6%) |

≥III | 1017 (16.4%) | ||

≥IV | 46 (0.74%) | ||

Peripheral edema^{c,d} (ref: “No”) | ≥Ankles | Peripheral edema | 732 (11.8%) |

≥Knee | 106 (1.7%) | ||

≥Above | 33 (0.5%) | ||

Main cause of HF^{b} (ref: “Ischemic”) | Cardiomyopathy | - | 1866 (30.2%) |

Hypertension | - | 956 (15.4%) | |

Other | - | 178 (2.9%) | |

Not known | - | 133 (2.1%) | |

Ascites^{b,d} | - | 21 (0.3%) | |

Hepatomegaly^{b,d} | - | 914 (14.8%) | |

Mitral insufficiency^{b,d} | - | 2647 (42.8%) | |

CVP > 6 cm H2O^{b,d} | - | 467 (7.5%) | |

Basal pulmonary rales^{b,d} | - | 738 (11.9%) | |

Mid-apical pulmonary rales^{b,d} | - | 51 (0.8%) | |

Pulmonary rales^{b,d} | - | 263 (4.3%) | |

Aortic stenosis^{b,d} | - | 105 (1.7%) | |

Third heart sound (S_{3})^{b,d} | - | 945 (15.3%) | |

Hematocrit | Hematology | 40.58 (4.35) | |

Hemoglobin | 13.60 (1.56) | ||

ePVS^{a} | 4.47 (0.89) | ||

Serum creatinine | Renal function | 1.21 (0.42) | |

eGFR^{a} | 67.59 (22.79) | ||

Serum potassium | - | 4.47 (0.50) | |

Serum sodium | - | 139.65 (3.45) | |

Uricemia | - | 6.39 (1.84) | |

Triglycerides | - | 147.31 (110.90) | |

Cholesterol HDL | Cholesterol | 49.15 (13.62) | |

Total cholesterol | 180.67 (44.55) | ||

Bilirubin | - | 0.81 (0.56) | |

Glycemia | - | 119.61 (46.53) | |

NT-proBNP^{f} | - | 1312.57 (1978.60) | |

Diabetes mellitus^{b,d} | - | 1535 (24.8%) | |

Hypertension^{b,d} | - | 3390 (54.8%) | |

Previous AMI^{b,e} | - | 2649 (42.8%) | |

Previous stroke^{b,e} | - | 293 (4.7%) | |

Previous hosp. for worsening HF^{b,e} | - | 3068 (49.6%) | |

Angina pectoris^{b,e} | - | 968 (15.6%) | |

Coronary angioplasty^{b,d} | - | 792 (12.8%) | |

Transient ischemic attack (TIA)^{b,d} | - | 500 (8.1%) | |

COPD^{b,d} | - | 1074 (17.4%) | |

CABG^{b,e} | - | 1321 (21.3%) | |

Implantable defibrillator^{b,d} | - | 488 (7.9%) | |

Paroxystic AF^{b,d} | - | 1174 (19.0%) | |

Neoplasia^{b,d} | - | 242 (3.9%) | |

Definitive pace maker^{b,d} | - | 824 (13.3%) | |

Waiting for cardiac transplantation^{b,d} | - | 38 (0.6%) | |

LVEF^{d} | - | 33.56 (9.74) | |

Bundle branch block^{b} | - | 2007 (32.4%) | |

Atrial fibrillation^{b} | - | 911 (14.7%) | |

Left ventricular hypertrophy^{b} | - | 1031 (16.7%) | |

Pathological Q waves^{b} | - | 1200 (19.4%) | |

Normal ECG evaluation^{b} | - | 289 (4.7%) | |

ACE-inhibitors^{a,b} | - | 4848 (78.3%) | |

Beta-blockers^{a,b} | - | 4434 (71.6%) | |

Calcium antagonists^{a,b} | - | 509 (8.2%) | |

Diuretics^{a,b} | - | 5689 (91.9%) |

^{a}derived variable; ^{b}binary variable encoding; ^{c}ordinal encoding; ^{d}baseline value copied to follow-up visits; ^{e}baseline value copied to follow-up visits and updated when possible; ^{f}interpolated values. SD: standard deviation; BMI: body mass index; NYHA: New York Heart Association; HF: heart failure; CVP: central venous pressure; ePVS: estimated plasma volume; eGFR: estimated glomerular filtration rate; HDL: high-density lipoprotein; AMI: acute myocardial infarction; COPD: chronic obstructive pulmonary disease; CABG: coronary artery bypass graft; AF: atrial fibrillation; LVEF: left ventricular ejection fraction; ACE: angiotensin-converting enzyme.

Preselection of variables: Two forward preselections using AUC as criterion were performed, one for logistic regression (LR) and the other for linear discriminant analysis (LDA). Let t denote the stopping step. At step i (i = 1, 2, ..., t), the i − 1 variables V_{1}, ..., V_{i−1} retained in the previous steps were available. For every set of variables {V_{1}, ..., V_{i−1}, V_{j}}, where V_{j} is a variable not yet selected, a logistic regression (respectively a linear regression) was fitted on the entire sample without bootstrap. The variable yielding the maximal resubstitution AUC, denoted V_{i}, was retained for step i + 1, provided that the increase in AUC was significant according to DeLong’s test; otherwise, the inclusion of variables was stopped (t = i). The preselection using logistic regression (respectively LDA) was used to build an intermediate LR score (respectively an intermediate LDA score). Note that the number of preselected variables and the preselected variables themselves may differ between the two preselections.
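The forward preselection described above can be sketched as follows, under loudly stated simplifications: the classifier is a least-squares linear score solved by hand, DeLong’s test is replaced by a plain minimum-gain threshold (`min_gain`, a hypothetical parameter), and the data are a made-up toy example. It is an illustration of the greedy loop, not the authors' implementation.

```python
def auc(scores, labels):
    """Resubstitution AUC: share of (event, non-event) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def linear_fit_predict(rows, labels):
    """Least-squares fit of labels on [1, x1, ..., xk]; returns the fitted values."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    # Normal equations a @ beta = c, with a tiny ridge term for numerical safety.
    a = [[sum(d[i] * d[j] for d in design) + (1e-9 if i == j else 0.0)
          for j in range(k)] for i in range(k)]
    c = [sum(d[i] * y for d, y in zip(design, labels)) for i in range(k)]
    for col in range(k):  # Gauss-Jordan elimination with partial pivoting
        piv = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        c[col], c[piv] = c[piv], c[col]
        for r in range(k):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [x - f * y for x, y in zip(a[r], a[col])]
                c[r] -= f * c[col]
    beta = [c[i] / a[i][i] for i in range(k)]
    return [sum(b * x for b, x in zip(beta, d)) for d in design]


def forward_select(data, labels, min_gain=0.005):
    """data: dict name -> list of values. Greedy forward selection on AUC."""
    selected, history, current = [], [], 0.5
    remaining = list(data)
    while remaining:
        trials = []
        for name in remaining:
            rows = list(zip(*(data[v] for v in selected + [name])))
            trials.append((auc(linear_fit_predict(rows, labels), labels), name))
        best_auc, best_name = max(trials)
        if best_auc - current < min_gain:  # stand-in for DeLong's test
            break
        selected.append(best_name)
        remaining.remove(best_name)
        history.append((best_name, best_auc))
        current = best_auc
    return history


# Made-up toy data: x1 is informative, x2 is noise, x3 fixes one misranked unit.
y = [0, 0, 0, 0, 1, 1, 1, 1]
toy = {"x1": [1, 2, 3, 5, 4, 6, 7, 8],
       "x2": [0, 1, 1, 0, 0, 1, 1, 0],
       "x3": [0, 0, 0, 0, 1, 0, 0, 0]}
history = forward_select(toy, y)
```

The returned `history` lists the retained variables in order of inclusion with the AUC reached at each step, which is the same layout as the preselection tables of this appendix.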

Note that, contrary to the preselection phase of Method 1 with AIC, there is no need in this instance for a probabilistic model. Indeed, the AUC can be computed as long as there is a prediction, without assumption on the manner with which this prediction was obtained.

Construction of intermediate scores: For each classifier, intermediate scores using only the associated selected variables were constructed, using 1000 bootstrap samples (the same for both classifiers) and two modalities of selection of variables (all variables or all groups of related variables). Since the preselection was performed separately for both classifiers, intermediate scores may not use the same variables.
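The bootstrap step and the out-of-bag (OOB) predictions used for the AUC OOB can be sketched as below. The sketch makes several assumptions: `fit` and `predict` are hypothetical callbacks, the toy classifier is a one-variable score based on class means, the data are made up, and the random selection of variables is not reproduced. Fixing the seed is one simple way of reusing the same bootstrap samples for both classifiers.

```python
import random


def bootstrap_oob_scores(n_units, fit, predict, n_boot=1000, seed=0):
    """For each unit, average the predictions of the models whose bootstrap
    sample did not contain that unit (out-of-bag predictions)."""
    rng = random.Random(seed)  # fixed seed: same samples for every classifier
    totals = [0.0] * n_units
    counts = [0] * n_units
    for _ in range(n_boot):
        sample = [rng.randrange(n_units) for _ in range(n_units)]  # with replacement
        model = fit(sample)
        if model is None:  # degenerate sample (a single class): skip it
            continue
        in_bag = set(sample)
        for i in range(n_units):
            if i not in in_bag:
                totals[i] += predict(model, i)
                counts[i] += 1
    return [t / c if c else None for t, c in zip(totals, counts)]


# Made-up toy data and a toy one-variable classifier based on class means.
x = [1.0, 2.0, 3.0, 5.0, 4.0, 6.0, 7.0, 8.0]
y = [0, 0, 0, 0, 1, 1, 1, 1]


def fit(sample):
    pos = [x[i] for i in sample if y[i] == 1]
    neg = [x[i] for i in sample if y[i] == 0]
    if not pos or not neg:
        return None
    mean_pos, mean_neg = sum(pos) / len(pos), sum(neg) / len(neg)
    return (mean_pos - mean_neg, (mean_pos + mean_neg) / 2)  # direction, midpoint


def predict(model, i):
    direction, midpoint = model
    return direction * (x[i] - midpoint)


oob = bootstrap_oob_scores(len(x), fit, predict, n_boot=200)
```

Because each unit is scored only by models that did not see it, an AUC computed on `oob` is not a resubstitution AUC.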

Construction of final scores: The two intermediate scores were aggregated in a final score by averaging their prediction for each statistical unit. Since the intermediate scores in this method were constructed independently from each other on two different sets of variables, there were multiple ways to combine the latter. In this instance, intermediate scores using the same number of preselected variables by classifier were aggregated in a final score.

A2.2. Method 3c

Preselection of variables: A forward preselection using AUC as criterion was performed using LDA. Let t denote the stopping step. At step i (i = 1, 2, ..., t), the i − 1 variables V_{1}, ..., V_{i−1} retained in the previous steps were available. For every set of variables {V_{1}, ..., V_{i−1}, V_{j}}, where V_{j} is a variable not yet selected, a linear regression was fitted on the entire sample without bootstrap. The variable yielding the maximal resubstitution AUC, denoted V_{i}, was retained for step i + 1, provided that the increase in AUC was significant according to DeLong’s test; otherwise, the inclusion of variables was stopped (t = i).

Note that, contrary to the preselection phase of Method 1 with AIC, there is no need in this instance for a probabilistic model. Indeed, the AUC can be computed as long as there is a prediction for each statistical unit, without assumption on the manner with which this prediction was obtained.

Construction of intermediate scores: For each classifier, intermediate scores using only the preselected variables, with the transformations corresponding to the classifier (see Subsection 3.2.4), were built, using 1000 bootstrap samples (the same for both classifiers) and two modalities of selection of variables (all variables or all groups of related variables).

Construction of final scores: The two intermediate scores using the same number of preselected variables were aggregated in a final score by averaging their prediction for each statistical unit.

Method 3b: Variables (LR part) | Method 3b: AUC OOB** (LR part) | Method 3b: Variables (LDA part) | Method 3b: AUC OOB** (LDA part) | Method 3b: AUC OOB*** (all) | Method 3c: Variables | Method 3c: AUC OOB** (LR part) | Method 3c: AUC OOB** (LDA part) | Method 3c: AUC OOB*** (all) |
---|---|---|---|---|---|---|---|---|

NT-proBNP | 0.7246 | NT-proBNP | 0.7246 | 0.7246 | NT-proBNP | 0.7246 | 0.7246 | 0.7246 |

NYHA ≥ III | 0.7482 | NYHA ≥ III | 0.7523 | 0.7523 | NYHA ≥ III | 0.7482 | 0.7523 | 0.7523 |

NYHA ≥ II | 0.7550 | Glycemia | 0.7591 | 0.7625 | Glycemia | 0.7559 | 0.7591 | 0.7592 |

Glycemia | 0.7621 | NYHA ≥ II | 0.7642 | 0.7642 | NYHA ≥ II | 0.7624 | 0.7642 | 0.7642 |

Periph. edema ≥ “above” | 0.7687 | Paroxystic AF | 0.7683 | 0.7721 | Paroxystic AF | 0.7669 | 0.7683 | 0.7685 |

Beta-blockers | 0.7731 | Systolic BP | 0.7725 | 0.7801 | Systolic BP | 0.7721 | 0.7725 | 0.7733 |

Systolic BP | 0.7791 | Beta-blockers | 0.7770 | 0.7818 | Beta-blockers | 0.7781 | 0.7770 | 0.7789 |

Cholesterol HDL | 0.7835 | Cholesterol HDL | 0.7807 | 0.7856 | Cholesterol HDL | 0.7823 | 0.7807 | 0.7829 |

Paroxystic AF | 0.7864 | Uricemia | 0.7839 | 0.7886 | Uricemia | 0.7860 | 0.7839 | 0.7865 |

Uricemia | 0.7902 | Third heart sound | 0.7866 | 0.7916 | Third heart sound | 0.7883 | 0.7866 | 0.7888 |

Bilirubin | 0.7925 | Periph. edema ≥ “above” | 0.7891 | 0.7935 | Periph. edema ≥ “above” | 0.7923 | 0.7891 | 0.7926 |

Implantable defibrillator | 0.7948 | Implantable defibrillator | 0.7912 | 0.7956 | Implantable defibrillator | 0.7941 | 0.7912 | 0.7945 |

Neoplasia | 0.7966 | Neoplasia | 0.7930 | 0.7975 | Neoplasia | 0.7958 | 0.7930 | 0.7961 |

Third heart sound | 0.7984 | Triglycerides | 0.7949 | 0.7990 | Triglycerides | 0.7979 | 0.7949 | 0.7981 |

Heart rate | 0.8001 | Heart rate | 0.7965 | 0.8009 | Heart rate | 0.7998 | 0.7965 | 0.7999 |

Previous AMI | 0.8020 | Bilirubin | 0.7980 | 0.8026 | Bilirubin | 0.8019 | 0.7980 | 0.8019 |

Triglycerides | 0.8038 | Previous AMI | 0.7995 | 0.8038 | Previous AMI | 0.8036 | 0.7995 | 0.8036 |

LVEF (baseline) | 0.8052 | LVEF | 0.8011 | 0.8052 | LVEF | 0.8052 | 0.8011 | 0.8052 |

Hypertension | 0.8067 | Mitral insufficiency | 0.8028 | 0.8069 | Mitral insufficiency | 0.8064 | 0.8028 | 0.8064 |

Mitral insufficiency | 0.8080 | Diuretics | 0.8039 | 0.8082 | Diuretics | 0.8070 | 0.8039 | 0.8070 |

Smoker or ex-smoker | 0.8091 | Hypertension | 0.8051 | 0.8094 | Hypertension | 0.8084 | 0.8051 | 0.8084 |

Ascites | 0.8104 | Smoker or ex-smoker | 0.8060 | 0.8105 | Smoker or ex-smoker | 0.8093 | 0.8060 | 0.8093 |

Periph. edema ≥ “ankles” | 0.8116 | Periph. edema ≥ “ankles” | 0.8069 | 0.8117 | Periph. edema ≥ “ankles” | 0.8101 | 0.8069 | 0.8101 |

NYHA ≥ IV | 0.8119 | Ascites | 0.8081 | 0.8121 | Ascites | 0.8119 | 0.8081 | 0.8119 |

BMI | 0.8130 | BMI | 0.8093 | 0.8131 | BMI | 0.8131 | 0.8093 | 0.8131 |

Mid-apical pulm. rales | 0.8137 |

*AUC OOB obtained for the score including the variable in the row as well as all previous variables. **The AUC OOB of these columns was obtained by building an intermediate score using only LDA (respectively LR) for the linear part (respectively the logistic part) from the selected variables. ***The AUC OOB of these columns was obtained by building a full ensemble score with the same number of variables for both LDA and LR, using the optimal λ for each score. BMI: body mass index; NYHA: New York Heart Association; LVEF: left ventricular ejection fraction.

A3. Transformation of the Variables for the Logistic Regression and the Linear Discriminant Analysis

Variable | LR: p-value before | LR: transformation | LR: p-value after | LDA: p-value before | LDA: transformation | LDA: p-value after |
---|---|---|---|---|---|---|
Age | 0.364 | | | 0.853 | | |
Years of school education | 0.449 | | | 0.462 | | |
Weight | 0.280 | | | 0.267 | | |
BMI | 0.006 | (x − 27.8)^{2} | 0.051 | 0.004 | (x − 28.0)^{2} | 0.059 |
Heart rate | 0.149 | | | 0.806 | | |
Diastolic blood pressure | 0.704 | | | 0.291 | | |
Systolic blood pressure | <0.001 | (x − 142.0)^{2} | 0.756 | <0.001 | (x − 142.0)^{2} | 0.133 |
Mean blood pressure | 0.028 | x^{−2} | 0.516 | 0.003 | (x − 108.7)^{2} | 0.707 |
Hematocrit | 0.001 | (x − 43.4)^{2} | 0.051 | <0.001 | (x − 43.4)^{2} | 0.220 |
Hemoglobin | 0.068 | | | 0.005 | (x − 15.3)^{2} | 0.222 |
ePVS | 0.242 | | | 0.034 | (x − 3.3)^{2} | 0.300 |
Serum creatinine | 0.376 | | | 0.007 | x^{2} | 0.352 |
eGFR | 0.004 | x^{−1} | 0.648 | <0.001 | x^{−2} | 0.487 |
Serum potassium | 0.067 | | | 0.056 | | |
Serum sodium | 0.055 | | | 0.031 | (x − 142.3)^{2} | 0.455 |
Uricemia | <0.001 | (x − 6.7)^{2} | 0.383 | <0.001 | (x − 6.7)^{2} | 0.975 |
Triglycerides | 0.023 | x^{−2} | 0.672 | 0.009 | x^{−2} | 0.220 |
Cholesterol HDL | 0.009 | x^{−2} | 0.819 | 0.001 | x^{−2} | 0.404 |
Total cholesterol | 0.011 | x^{−2} | 0.230 | 0.001 | (x − 192.5)^{2} | 0.051 |
Bilirubin | 0.210 | | | 0.800 | | |
Glycemia | 0.924 | | | 0.609 | | |
NT-proBNP | <0.001 | Ln(x) | 0.407 | <0.001 | x^{0.5} | |
LVEF | <0.001 | (x − 42.8)^{2} | 0.228 | <0.001 | (x − 42.7)^{2} | 0.959 |

See next tables for abbreviations.
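Applying the transformations of the table above before fitting a classifier can be sketched as follows. Only four logistic-regression entries of the table are reproduced (BMI, systolic blood pressure, eGFR, NT-proBNP); the dictionary layout, the function name and the example patient values are illustrative assumptions.

```python
import math

# Logistic-regression transformations copied from four rows of the table.
LR_TRANSFORMS = {
    "BMI": lambda x: (x - 27.8) ** 2,
    "Systolic blood pressure": lambda x: (x - 142.0) ** 2,
    "eGFR": lambda x: x ** -1,
    "NT-proBNP": lambda x: math.log(x),  # Ln(x): natural logarithm
}


def transform_row(row, transforms):
    """Apply the listed transformation to each variable; leave the others unchanged."""
    return {name: transforms.get(name, lambda v: v)(value)
            for name, value in row.items()}


# Made-up patient values, for illustration only.
patient = {"BMI": 29.8, "eGFR": 50.0, "NT-proBNP": 1.0, "Age": 70}
transformed = transform_row(patient, LR_TRANSFORMS)
```

Variables without an entry in the dictionary, such as age here, are passed through unchanged, matching the rows of the table that list no transformation.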