Risk Factors and Prediction of Stroke in a Population with High Prevalence of Diabetes: The Strong Heart Study
Wenyu Wang1, Ying Zhang1, Elisa T. Lee1, Barbara V. Howard2, Richard B. Devereux3, Shelley A. Cole4, Lyle G. Best5, Thomas K. Welty6, Everett Rhoades7, Jeunliang Yeh1, Tauqeer Ali1, Jorge R. Kizer8, Hooman Kamel3, Nawar Shara2, David O. Wiebers9, Julie A. Stoner1
1College of Public Health, University of Oklahoma Health Sciences Center, Oklahoma City, OK, USA.
2MedStar Health Research Institute, Hyattsville, MD, USA.
3Weill Cornell Medical College, New York, NY, USA.
4Texas Biomedical Research Institute, San Antonio, TX, USA.
5Missouri Breaks Industries Research Inc., Eagle Butte, SD, USA.
6Aberdeen Area Tribal Chairmen’s Health Board, Rapid City, SD, USA.
7College of Medicine, University of Oklahoma Health Sciences Center, Oklahoma City, OK, USA.
8Albert Einstein College of Medicine, Bronx, NY, USA.
9Mayo Clinic and Mayo Foundation, Rochester, MN, USA.
DOI: 10.4236/wjcd.2017.75014


Background and Objective: American Indians have a high prevalence of diabetes and a higher incidence of stroke than whites and blacks in the U.S. Stroke risk prediction models based on data from American Indians would be of clinical and public health value. Methods: A total of 3483 (2043 women) Strong Heart Study participants free of stroke at baseline were followed from 1989 to 2010 for incident stroke. Overall, 297 stroke cases (179 women) were identified. Cox models with stroke-free time and risk factors recorded at baseline were used to develop stroke risk prediction models. Discrimination and calibration of the models were assessed by an analogous C-statistic (C) and a version of the Hosmer-Lemeshow statistic (HL), respectively, and validated internally with bootstrapping methods. Results: Age, smoking status, alcohol consumption, waist circumference, hypertension status, antihypertensive therapy, fasting plasma glucose, diabetes medications, high/low density lipoproteins, urinary albumin/creatinine ratio, history of coronary heart disease/heart failure, atrial fibrillation, or left ventricular hypertrophy, and parental history of stroke were identified as the significant optimal risk factors for incident stroke. The models produced C = 0.761 and HL = 4.668 (p = 0.792) for women, and C = 0.765 and HL = 9.171 (p = 0.328) for men, showing good discrimination and calibration. Conclusions: Our stroke risk prediction models provide a mechanism for stroke risk assessment designed for American Indians. The models may also be useful to other populations with a high prevalence of obesity and/or diabetes for screening individuals for risk of incident stroke and designing prevention programs.

Share and Cite:

Wang, W. , Zhang, Y. , Lee, E. , Howard, B. , Devereux, R. , Cole, S. , Best, L. , Welty, T. , Rhoades, E. , Yeh, J. , Ali, T. , Kizer, J. , Kamel, H. , Shara, N. , Wiebers, D. and Stoner, J. (2017) Risk Factors and Prediction of Stroke in a Population with High Prevalence of Diabetes: The Strong Heart Study. World Journal of Cardiovascular Diseases, 7, 145-162. doi: 10.4236/wjcd.2017.75014.

1. Introduction

Stroke is a major health care challenge in American Indians (AIs). Recent data indicate that AIs have a higher incidence of stroke than that of whites and blacks in the US [1] . Stroke is one of the leading causes of death as well as disability among AIs [2] [3] . Cigarette smoking, diabetes mellitus (DM), and high blood pressure are well documented modifiable risk factors for stroke [4] . We previously reported that risk factors for stroke among the AI population included age, high blood pressure, smoking, albuminuria, and diabetes [1] . Among them, DM (48.8%) and albuminuria (29.6%) were the prominent factors related to future stroke [1] [5] as well as coronary heart disease (CHD) [6] [7] in AIs.

A stroke prediction model utilizing routinely collected variables will assist providers who care for AIs in evaluating the risk of stroke in their patients and assist communities in designing more effective and targeted interventions. Several stroke risk-assessment tools have been developed, including the widely used Framingham Risk Profile [8] [9] [10]. However, the contributions of certain common risk factors for incident stroke vary across populations [11]. Further, some risk factors/correlates have not previously been included; for example, albuminuria has been found to be significantly and independently associated with almost all chronic diseases, such as DM [12], hypertension (HTN) [13], and CHD [6] [7], in AIs. It is important to include these risk factors in the stroke prediction models for AIs.

This article presents gender-specific stroke risk prediction equations based on longitudinal data from the Strong Heart Study (SHS) during 1989-2010. A “risk calculator” based on the equations will be developed so that individuals can input their risk factor values and instantly obtain the probability (risk) of developing stroke in 10 years; it will be available on the SHS Web site (http://strongheart.ouhsc.edu).

2. Methods

2.1. Study Population

The SHS is a population-based cohort study of cardiovascular disease (CVD) and its risk factors in AI tribes/communities in southwestern Oklahoma, central Arizona, and North and South Dakota. Participants (n = 3516; 2056 women) aged 45 to 74 years underwent baseline examination from 1989 to 1992. The design, inclusion and exclusion criteria of participants, survey methods, and laboratory techniques of the SHS have been described in detail [14] [15] along with methods of definition and identification of first stroke [1] [16] . Participants in the present analysis (3483; 2043 women) had no history of stroke or stroke-like events at the baseline examination. Among them, 297 (179 women) suffered an incident stroke during an average follow-up of 15.04 years (inter-quartile range 9.7 - 20.2 years) through the end of 2010. The study was approved by Institutional Review Boards of the participating institutions and tribes as well as the Indian Health Service. Informed consent was obtained from all participants.

2.2. Baseline Characteristics

Information on demographic factors, medical history, medication use, and personal health habits was collected by interview. A physical examination was conducted and fasting blood samples were collected for laboratory tests including lipids and lipoproteins. Anthropometric measurements were taken and sitting blood pressure (1st and 5th Korotkoff sounds) was measured three times consecutively using mercury sphygmomanometers (WA Baum Co) after five minutes of rest [17]. The average of the 2nd and 3rd systolic and diastolic blood pressure measurements was used in the analyses. HTN status was defined by the Seventh Joint National Committee on Hypertension criteria [18]: HTN if systolic blood pressure (SBP) ≥ 140 mmHg or diastolic blood pressure (DBP) ≥ 90 mmHg or on antihypertensive therapy, normal if SBP < 120 mmHg and DBP < 80 mmHg, and pre-hypertension (Pre-HTN) otherwise. DM status was defined by the American Diabetes Association diagnosis and classification guidelines [19]: DM if fasting plasma glucose (FPG) ≥ 7.0 mmol/L (126 mg/dL) or on diabetes medications, impaired fasting glucose (IFG) (or prediabetes) if 5.6 mmol/L (100 mg/dL) ≤ FPG < 7.0 mmol/L, and normal fasting plasma glucose (NFG) if FPG < 5.6 mmol/L. Micro- and macro-albuminuria were defined as urinary albumin/creatinine ratios of 30 - 299 mg/g and ≥ 300 mg/g, respectively. Current smoking status was defined as smoking currently, smoking regularly, and having smoked at least 100 cigarettes in one’s entire life until the date of interview. Estimated glomerular filtration rate (eGFR) was derived from serum creatinine that was recalibrated to an isotope dilution mass spectrometry (IDMS)-traceable serum creatinine assay [20], using the CKD-EPI (Chronic Kidney Disease Epidemiology Collaboration) formula [21].
Participants who had CHD or congestive heart failure (HF), atrial fibrillation (AFIB), or left ventricular hypertrophy (LVH) by electrocardiography before or at the baseline examination were considered as having a history of CHD/HF, AFIB, or LVH, respectively.
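The category definitions above reduce to simple threshold checks. The following Python sketch is illustrative only: the function names are ours (not part of the SHS protocol), FPG is assumed to be supplied in mg/dL, and UACR in mg/g.

```python
def htn_status(sbp, dbp, on_htn_therapy):
    """Hypertension category from SBP/DBP (mmHg) and treatment status."""
    if sbp >= 140 or dbp >= 90 or on_htn_therapy:
        return "HTN"
    if sbp < 120 and dbp < 80:
        return "Normal"
    return "Pre-HTN"


def dm_status(fpg_mg_dl, on_dm_medication):
    """Diabetes category from fasting plasma glucose (mg/dL) and medication use."""
    if fpg_mg_dl >= 126 or on_dm_medication:
        return "DM"
    if fpg_mg_dl >= 100:
        return "IFG"
    return "NFG"


def albuminuria_status(uacr_mg_g):
    """Albuminuria category from urinary albumin/creatinine ratio (mg/g)."""
    if uacr_mg_g >= 300:
        return "Macroalbuminuria"
    if uacr_mg_g >= 30:
        return "Microalbuminuria"
    return "Normal"
```

For instance, a participant with SBP/DBP = 130/70 mmHg who is not on antihypertensive therapy falls into the Pre-HTN category under these rules.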

2.3. Outcome Variables

All study participants without a prior history of stroke at the baseline examination were under follow-up surveillance for incident stroke events occurring between the date of the baseline examination and December 31, 2010. Mortality and morbidity follow-up data were available in 99.8% and 99.2% of participants, respectively.

2.3.1. Fatal Stroke

Fatal events included deaths judged to be due to definite and possible stroke. Deaths occurring during the follow-up were confirmed through Indian Health Service or private hospital records and through direct contact by study personnel with participants’ families or other informants [14] [15] [22]. The process of ascertaining stroke deaths has been reported previously [1] [16] [22]. All possible stroke-related deaths were reviewed by physician members of the Strong Heart Study Mortality Review Committee and then reviewed by neurologists (D.O.W., J.P.W.) or, since 2004, by a cardiologist focused on stroke (J.R.K.) for confirmation using previously described criteria [23] that differentiated eight subtypes of stroke-related events [cardioembolic infarction, subarachnoid hemorrhage, intraparenchymal hemorrhage, lacunar infarction, other unknown infarction, transient ischemic attack (TIA), unknown type of stroke, atherothrombotic infarction].

2.3.2. Nonfatal Stroke

The process to confirm nonfatal stroke was similar to that for fatal stroke. Neurologists (D.O.W., J.P.W.) and later the cardiologist (J.R.K.) made up the adjudication review committee and provided the final diagnosis for non-fatal events (definite and possible non-fatal strokes) that occurred from the date of the baseline examination to Dec. 31, 2010 [14] [16] [22] [23] . Stroke event sub-types used are the same as described for fatal stroke. If more than one event happened in the same individual, the date of the earliest was considered to be the first stroke date.

2.4. Statistical Methods

Overall incidence rates (per 1000 person-years) of stroke and their 95% confidence intervals, and incidence rates by stroke type, gender, age group (45 - 54, 55 - 64, and 65 - 74 years old) and center (South/North Dakota, Oklahoma, and Arizona), were estimated by dividing the total number of observed stroke events by the total stroke-free follow-up time (person-years) in the respective group. Stroke incidence rates by gender among sub-categories of each potential baseline risk factor were also estimated. Cox proportional-hazards models were used to assess univariate associations of individual risk factors with incident stroke after adjusting for age. A Cox model with competing risks [24] was used in sensitivity analyses. A p-value of <0.05 was considered to be statistically significant.
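As an illustration of the rate calculation, the Python sketch below divides events by person-years and attaches a 95% confidence interval. Two assumptions are ours and not taken from the study: the interval uses a Poisson approximation on the log scale (the text does not state which interval method was used), and total person-years are approximated as the number of participants times the mean follow-up, which will not exactly match the stroke-free person-years underlying Table 1.

```python
import math


def incidence_rate_per_1000(events, person_years, z=1.96):
    """Crude incidence rate per 1000 person-years with a log-scale Wald CI."""
    rate = events / person_years
    se_log = 1.0 / math.sqrt(events)  # SE of log(rate) under a Poisson assumption
    lower = rate * math.exp(-z * se_log)
    upper = rate * math.exp(z * se_log)
    return 1000.0 * rate, 1000.0 * lower, 1000.0 * upper


# Rough approximation only: 3483 participants x 15.04 mean years of follow-up
approx_person_years = 3483 * 15.04
rate, lower, upper = incidence_rate_per_1000(297, approx_person_years)
print(f"{rate:.2f} per 1000 person-years (95% CI {lower:.2f} - {upper:.2f})")
```

With these approximate inputs the overall rate is roughly 5.7 strokes per 1000 person-years.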

2.5. Development of Prediction Equations

Cox proportional-hazard models were also used to assess the simultaneous association of multiple risk factors with incident stroke and to develop gender-specific stroke prediction models. Backward variable selection methods [24] with a significance level of 0.05 were used to select optimal sets of baseline risk factors for incident stroke. The potential risk factors included were age, body mass index (BMI), waist circumference (WAIST), SBP, DBP, antihypertensive therapy (denote its indicator function as HTNRX, HTNRX = 1 if on antihypertensive therapy and = 0 if not), smoking status, physical activity, alcohol consumption, FPG, diabetes medications (denote its indicator function as DMRX, DMRX = 1 if on diabetes medications and = 0 if not), urinary albumin/creatinine ratio (UACR), eGFR, low-density lipoprotein cholesterol (LDL-C), high-density lipoprotein cholesterol (HDL-C), triglyceride (TG), history of CHD/HF, parental history of CVD, stroke, DM or HTN, history of or electrocardiogram-evident atrial fibrillation (AFIB) and left ventricular hypertrophy (LVH), as well as categorizations of these variables such as DM status (Yes/No; or DM, IFG and NFG), HTN status (HTN, pre-HTN, normal), and albuminuria status (macroalbuminuria, microalbuminuria, normal). Logarithmic transformation of skewed variables was applied if needed. For the significant risk factors selected for the models, their interactions were also considered and further selected for their possible additional contributions.

2.6. Discrimination, Calibration, and Validation of the Prediction Equations

An analogous C-statistic [7] [25] was calculated to evaluate the discrimination ability of the stroke prediction models in separating those who developed stroke from those who did not. This C-statistic is analogous to the area under the receiver operating characteristic curve (ROC curve) based on a logistic regression. A C-statistic value of ≥0.7 indicates good discrimination ability, and the closer the C value is to 1.0, the better the discrimination ability. A version of the Hosmer-Lemeshow χ2 statistic (HL-statistic) [7] [25] was computed to assess model calibration ability (or how closely the predicted probabilities reflected actual risk). Participants were divided into deciles according to their predicted probabilities of stroke in 10 years using the proposed prediction model, and the HL-statistic was calculated to compare the differences between the predicted and actual proportions of stroke events. HL-statistic values of <20 are considered to indicate good calibration.
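To make the idea of discrimination concrete, the Python sketch below computes a Harrell-type pairwise concordance for censored follow-up data. This is an assumption on our part: it is close in spirit to the analogous C-statistic of [7] [25], but it is not necessarily the exact estimator used in the analyses, and the function name and toy data are hypothetical.

```python
def concordance(times, events, risk_scores):
    """Harrell-type C: share of usable pairs in which the participant
    with the earlier observed stroke has the higher predicted risk."""
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair is usable only if the earlier time is an event
        for j in range(n):
            if times[j] > times[i]:  # j was still stroke-free when i had the event
                usable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # ties in predicted risk count as half
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable


# Toy data (hypothetical): follow-up years, event indicator, predicted risk
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.3, 0.2, 0.4]
toy_c = concordance(toy_times, toy_events, toy_risk)
```

In the toy data four of the five usable pairs are ordered correctly, so the concordance is 0.8; a value of 0.5 would correspond to no discrimination.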

In addition, the stroke prediction models were validated internally with the use of bootstrapping methods [7] [25]. Samples of the same size (n = 3483) as the original cohort were drawn 1000 times from the original cohort with replacement. Then the “optimism” [7] [25] for the C-statistic or the p-value for the HL-statistic was calculated based on these 1000 bootstrap samples. A “bootstrap-corrected statistic” was then evaluated as “the statistic from the model” − “the optimism for the statistic”. A bootstrap-corrected statistic from a model is a nearly unbiased estimate of the expected value of the statistic from external validations of the model, with smaller “optimism” values indicating better validity of the statistic [25]. All analyses were conducted with SAS 9.4 (SAS Institute Inc., Cary, NC, USA).
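The correction can be sketched as follows. The Python code below implements the standard optimism bootstrap (refit on each bootstrap sample, then compare the refitted model's statistic on that sample with its statistic on the original data); we assume this is the procedure referred to in [25], and the `fit` and `score` callables are hypothetical stand-ins for refitting the Cox model and computing the C-statistic or HL p-value.

```python
import random


def optimism_corrected(data, fit, score, n_boot=1000, seed=0):
    """Return (bootstrap-corrected statistic, estimated optimism)."""
    rng = random.Random(seed)
    apparent = score(fit(data), data)  # statistic of the model on its own data
    total = 0.0
    for _ in range(n_boot):
        # Same size as the original data, drawn with replacement
        sample = [rng.choice(data) for _ in data]
        model = fit(sample)
        # How much better the refitted model looks on its own sample
        total += score(model, sample) - score(model, data)
    optimism = total / n_boot
    return apparent - optimism, optimism


# Toy stand-ins (hypothetical): the "model" is a mean, the "statistic" is -MSE
def fit_mean(rows):
    return sum(rows) / len(rows)


def neg_mse(model, rows):
    return -sum((r - model) ** 2 for r in rows) / len(rows)


toy = [float(v) for v in range(1, 11)]
corrected, optimism = optimism_corrected(toy, fit_mean, neg_mse, n_boot=500)
```

The toy example only demonstrates the mechanics: the corrected value is the apparent statistic minus the average optimism, exactly as in the definition above.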

3. Results

Table 1 shows estimated incidence rates (per 1000 person-years) of stroke for all SHS participants without prior stroke. There was no significant gender difference, but there were significant differences among centers, with the Dakotas the highest, followed by Oklahoma and Arizona. The incidence rate increased significantly with age. Among identified stroke types, incidence rates were highest for cardioembolic infarction, followed by other unknown infarction, lacunar infarction and intraparenchymal hemorrhage.

Gender-specific stroke incidence rates by sub-categories of each potential baseline risk factor and its univariate association with incident stroke after adjusting for age are shown in Table 2. Age, and after adjusting for age, smoking, HTN, DM, albuminuria, history of CHD/HF, and AFIB were univariately significantly associated with incident stroke for both women and men. Alcohol consumption, HDL-C, history of LVH, and parental history of stroke were significant risk factors for women only. There was no significant univariate association of incident stroke with BMI, WAIST, physical activity, LDL-C, TG, eGFR, or parental history of CVD/DM/HTN in either women or men (data not shown).

Table 1. Incidence rate of stroke (per 1000 person-years) during an average of 15.04 years of follow-up for SHS stroke-free baseline participants aged 45 - 74 years.

Table 2. Incidence rates (per 1000 person-years) of stroke by gender and sub-categories of potential baseline risk factors, and their age-adjusted individual association with incident stroke.

AFIB, atrial fibrillation; Albuminuria, normal if urinary albumin-to-creatinine ratio (UACR) < 30 mg/g, microalbuminuria if 30 ≤ UACR < 300, and macroalbuminuria if UACR ≥ 300; CHD, coronary heart disease; DBP, diastolic blood pressure; DM, diabetes; DM status, DM if fasting plasma glucose (FPG) ≥ 7.0 mmol/L (126 mg/dL) or on diabetes medications, impaired fasting glucose (IFG) (or prediabetes) if 5.6 mmol/L (100 mg/dL) ≤ FPG < 7.0, and normal fasting plasma glucose (NFG) if FPG < 5.6; HDL-C, high-density lipoprotein cholesterol; HF, heart failure; HTN, hypertension; HTN status, HTN if systolic blood pressure (SBP) ≥ 140 mmHg or DBP ≥ 90 or on antihypertensive therapy, normal if SBP < 120 and DBP < 80, and prehypertension (Pre-HTN) otherwise; LVH, left ventricular hypertrophy; P, P-value; Ref., reference level. *Hazard ratios and P-values are from the Cox models for assessing age-adjusted association of incident stroke with the individual risk factors. The P-values are from testing whether the hazard ratio of the respective level vs. its reference level is significantly different from one (e.g. P = 0.0012 for the hazard ratio of 1.891 for HTN vs. Normal (reference level) in women).

Among these associations, after adjusting for age, for example, those with DM had a 2.25-fold higher risk than those without DM in women, and a 1.65-fold higher risk in men; and those with macroalbuminuria or microalbuminuria had a 3.39- or 1.66-fold higher risk, respectively, than those with normal UACR in women, and a 3.29- or 1.70-fold higher risk in men.

Gender-specific stroke prediction models are shown in Table 3. Age, current smoking, alcohol consumption, DBP and SBP as well as their interaction with HTN treatments, UACR, interaction of FPG and diabetes medications, HDL-C, history of CHD/HF, LVH and AFIB, and parental history of stroke were significantly associated with incident stroke in women, while age, WAIST, current smoking, DBP and SBP as well as their interaction with HTN treatments, Pre-HTN, UACR, diabetes medications, LDL-C, and history of CHD/HF were significantly associated with incident stroke in men.

An illustration of using the models in Table 3 to predict the 10-year risk of incident stroke for a stroke-free individual with measured risk factors or covariates is given in the Appendix.

In women, assuming the other measures in the model are the same, for example, those with low to moderate alcohol consumption (1 - 14 drinks per week) had 50% lower risk compared with the others, and risk was 2.5% higher per 10 mg/dl higher FPG among participants on diabetes medications. All terms related to blood pressure in Equation (3) (Appendix) can be rearranged as 0.02441*DBP*HTNRX + 0.00224*DBP*(1 − HTNRX) + 0.01424*SBP*(1 − HTNRX). Therefore, the associations of blood pressure with incident stroke differ between those with and without antihypertensive therapy. In women, the median UACR values in the three sub-categories, normal, microalbuminuria, and macroalbuminuria, were 7, 66, and 1492 mg/g, respectively. If we use these medians as the respective reference levels of UACR in the three sub-categories, then based on the relationship between coefficient and hazard ratio in a Cox proportional-hazard model [24], the hazard ratios of macroalbuminuria vs. microalbuminuria, macroalbuminuria vs. normal, and microalbuminuria vs. normal will be 1.174 [= Exp((Log(1492) − Log(66)) × 0.11852), where 0.11852 is the estimated coefficient for Log(UACR) in the model for women, Table 3], 1.315, and 1.120, respectively.

Table 3. Cox proportional hazards models for stroke-free time.

AFIB, atrial fibrillation; CHD, coronary heart disease; Coeff, coefficient; DBP (SBP), diastolic (systolic) blood pressure; DBP*HTNRX, DBP*(1 − HTNRX), SBP*(1 − HTNRX), the interaction of DBP/SBP and antihypertensive therapy, where HTNRX = 1 if on antihypertensive therapy and = 0 if not; FPG, fasting plasma glucose; FPG*DMRX, the interaction of FPG and diabetes medications, where DMRX = 1 if on diabetes medications and = 0 if not; HDL-C, high-density lipoprotein cholesterol; HF, heart failure; HTN, hypertension; HTN status, HTN if SBP ≥ 140 mmHg or DBP ≥ 90 or on antihypertensive therapy, normal if SBP < 120 and DBP < 80, and prehypertension (Pre-HTN) otherwise; LDL-C, low-density lipoprotein cholesterol; LVH, left ventricular hypertrophy; P, p-value; S0(10), the baseline stroke-free time function evaluated at t = 10 years; SE, standard error; UACR, urinary albumin creatinine ratio; WAIST, waist circumference. *The unit used to calculate the hazard ratio is 5 years for age, and 10 mg/dl for FPG and HDL-C.

For men, assuming the other measures in the model are the same, the estimated hazard ratios of different levels vs. their respective reference level, or the hazard ratio per unit change for each variable, can be interpreted similarly. In addition, the age-related terms in Equation (4) (Appendix) are 0.10268 × age − 0.91966 × I(age ≥ 65). Assuming the other measures in the model are the same, based on the relationship between coefficient and hazard ratio in a Cox proportional-hazard model [24], this means that for every 5 years of higher age, stroke risk is 67% [= Exp(5 × 0.10268) − 1] higher. The association of age with incident stroke risk depends upon the term 0.10268*age for those aged <65, and the term 0.10268*age − 0.91966 for those aged 65 or older. Similarly, the hazard ratios of macroalbuminuria vs. microalbuminuria, macroalbuminuria vs. normal, and microalbuminuria vs. normal, based on the three medians (5, 70, and 873 mg/g for the normal, microalbuminuria, and macroalbuminuria sub-categories in men, respectively), are 1.159, 1.354, and 1.168, respectively.
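The hazard-ratio arithmetic in the preceding paragraphs can be checked with a few lines of Python. One assumption is ours: the base of Log is not stated for these ratios, and the printed value of 1.174 is reproduced when Log is read as the base-10 logarithm. With the medians as printed, the remaining ratios agree with the reported values to within about 0.003, presumably because the medians are rounded. The function name is hypothetical.

```python
import math


def hr_between_levels(coef, high, low, log=math.log10):
    """Hazard ratio between two levels of a log-transformed covariate.

    `log` defaults to base 10, which is an assumption (see lead-in).
    """
    return math.exp(coef * (log(high) - log(low)))


COEF_WOMEN = 0.11852  # coefficient of Log(UACR) for women (Table 3)
COEF_MEN = 0.13446    # coefficient of Log(UACR) for men (Appendix example)

women_macro_vs_micro = hr_between_levels(COEF_WOMEN, 1492, 66)
women_macro_vs_normal = hr_between_levels(COEF_WOMEN, 1492, 7)
women_micro_vs_normal = hr_between_levels(COEF_WOMEN, 66, 7)

men_macro_vs_micro = hr_between_levels(COEF_MEN, 873, 70)
men_macro_vs_normal = hr_between_levels(COEF_MEN, 873, 5)
men_micro_vs_normal = hr_between_levels(COEF_MEN, 70, 5)

# Relative increase in hazard per 5 years of age in men (same side of age 65)
age_increase_per_5y = math.exp(5 * 0.10268) - 1
```

The last line gives about 0.67, the 67% increase per 5 years of age quoted for men.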

The C-statistics from the models for women and men are 0.761 and 0.765, respectively, indicating good discrimination ability. The respective HL-statistics 4.668 (p = 0.792) and 9.171 (p = 0.328) show good calibration ability of the models.

Figure 1 displays the calibration plots comparing actual observed risk and predicted decile-specific means of risk in men and women. The internal validation results based on the bootstrapping method show a bootstrap-corrected C-statistic of 0.7456 (after subtraction of optimism of 0.01489) for women, and 0.7458 (after subtraction of optimism of 0.01949) for men. The respective bootstrap-corrected p-value for the HL-statistic was 0.9998 (optimism = −0.2075) for women and 0.4768 (optimism = −0.14875) for men. These C-statistics and p-values for the HL-statistic with their bootstrap corrections, and the small optimism values, indicate good calibration and discrimination ability as well as stability of the prediction models.

We also applied the Framingham 2008 [9] and American College of Cardiology (ACC)/American Heart Association (AHA) 2013 [10] prediction models (with published estimated coefficients for risk factors and values of the baseline function at t = 10) to predict stroke risk in AIs. The Framingham 2008 prediction models produced a C-statistic = 0.701 and an HL-statistic = 109.73 (p < 0.0001) for women, and C = 0.706 and HL-statistic = 281.9 (p < 0.0001) for men; the ACC/AHA 2013 prediction models for White produced a C-statistic = 0.705 and an HL-statistic = 29.82 (p < 0.00023) for women, and C = 0.709 and HL-statistic = 82.3 (p < 0.0001) for men, while those for Black produced a C-statistic = 0.705 and an HL-statistic = 91.2 (p < 0.0001) for women, and C = 0.711 and HL-statistic = 80.6 (p < 0.0001) for men. The predicted decile-specific means of risk in men and women from the Framingham 2008 or ACC/AHA 2013 (for White) models are also shown in Figure 1.

Figure 1. Calibration by deciles of model-based predicted probabilities of a stroke event in 10 years. “KM” denotes observed risk (by the Kaplan-Meier method). “Model” denotes the decile-specific mean risk predicted by the models in Table 3, “FS2008” that predicted by the Framingham 2008 models, and “ACC2013” that predicted by the ACC/AHA 2013 models (for White).

To explore the performance of the generated models in predicting the risk of non-hemorrhagic incident strokes only, a sensitivity analysis was conducted by treating incident hemorrhagic stroke as a competing risk (and hence as a censoring event competing with non-hemorrhagic incident stroke) [24]. The generated models produced a C = 0.763 and an HL-statistic = 4.877 (p = 0.7706) for women, and C = 0.771 and HL-statistic = 5.558 (p = 0.6966) for men. The discrimination and calibration scores of the generated models were therefore better for non-hemorrhagic incident strokes than the respective Cs and HLs for all incident strokes shown in Table 3.

4. Discussion

The new prediction models for incident stroke based on data routinely acquired in a clinical setting should prove to be helpful for care providers to evaluate stroke risk of their patients. Of perhaps equal importance, they will allow providers to further reinforce preventive measures such as smoking cessation, preventing or managing diabetes, and controlling blood pressure and LDL levels.

Some of these risk factors, such as age, smoking status, SBP, DBP, HTN status, DM status, history of CHD/HF, AFIB, and LVH, have also been reported as stroke risk factors [1] [8] [10], and LDL-C, alcohol consumption and albuminuria have been reported in other studies [1] [5] [26] [27] [28]. Among them, albuminuria is especially strongly associated with incident stroke in AIs. We found that SHS participants who had macroalbuminuria or microalbuminuria had, respectively, 3.39 or 1.66 times higher risk of incident stroke than those with normal UACR in women, and 3.29 or 1.70 times in men, from the age-adjusted univariate analyses (Table 2). These hazard ratios remained at 1.315 and 1.120 in women and 1.354 and 1.168 in men after adjusting for the other risk factors in the models (Table 3), as explained in the Results section. The hazard ratios of macroalbuminuria vs. normal UACR were almost equal to those of AFIB vs. no AFIB. Given that AFIB is a well-known, significant and crucial risk factor for incident stroke [8], the considerable association of albuminuria with stroke in this population cannot be ignored. The significant term for diabetes medications in men and the interaction of FPG with diabetes medications in women remained in the final models, which shows that DM is significantly associated with incident stroke risk and suggests that controlling FPG, especially in those with DM and on diabetes medications, is very important in preventing incident stroke. Our models identified significant independent contributions and combined effects of these risk factors in predicting the risk of incident stroke after adjusting for the other risk factors in the respective models. Our models were also somewhat better at predicting non-hemorrhagic strokes than total strokes. This is likely because the majority of stroke cases were non-hemorrhagic strokes.

This study shows some interesting gender differences. Table 2 and Table 3 show that low-moderate alcohol consumption (1 - 14 drinks per week) may be protective against incident stroke in women only. The beneficial effect of low-moderate alcohol consumption in women is consistent with previous findings, but the lack of a significant association for men contradicts findings reported in the literature [27]. HDL-C was associated univariately and multivariately with incident stroke only in women, while LDL-C was associated only in men. The reasons for these gender differences are unclear and require further investigation.

Our models had improved predictive value compared to either the Framingham 2008 [9] or ACC/AHA 2013 [10] models when examined in AIs. The lower performance of the Framingham or ACC/AHA models may be due to their miscalibration [29] (that is, the average predicted risks from these models are not close to the stroke event rates in AIs). The 10-year stroke event rates in AIs were 0.043 for women and 0.050 for men, while the average predicted risks from the Framingham models were 0.128 and 0.139, and from the ACC/AHA models (for White) 0.078 and 0.142, respectively. The miscalibration can also be seen in Figure 1 and from their large HL-statistics and respective significant p-values mentioned in the Results section.

We did not use a reclassification statistic such as net reclassification improvement (NRI) [30] to compare our models with those reported in the literature, such as the Framingham or ACC/AHA models. The reasons are the reported issues related to miscalibration in the clinical use of a risk equation in different populations (as discussed above), the difficulty of comparing models that used different outcomes or population groups, and uncertainty about how to draw proper 10-year stroke risk cutoff points [29]. In addition, the NRI is the difference of Youden indexes from two models for a binary classification with a cutoff probability. The problems associated with NRI include concerns about statistical invalidity in real and simulated data, inadequately accounting for clinically important differences in shifts among risk categories if there are three or more risk categories, and other controversies [29] [30] [31] [32].

5. Conclusion

Our stroke prediction models, based on data from the SHS, provide a stroke risk appraisal specific to a population with a high prevalence of obesity, diabetes, and renal disease. With the increasing incidence and prevalence of obesity and diabetes in the US, we believe that our prediction models would provide an additional helpful assessment tool for other similar populations. Although our stroke prediction models are internally validated, they should be tested and validated in other populations.


Acknowledgements

This study was supported by cooperative agreement grants U01-HL41642, U01-HL41652, U01-HL41654, U01-HL65520, and U01-HL65521 and research grants R01-HL109315, R01HL109301, R01HL109284, R01HL109282 and R01HL109319 from the National Heart, Lung, and Blood Institute, Bethesda, MD. The authors express their deep appreciation to the participating American Indian tribes/communities, the Indian Health Service, and the participants for their support and assistance, and also express great thanks to the SHS field center coordinators and the SHS staff for collecting the data. The authors also gratefully acknowledge the contributions of Dr. J. P. Whisnant, Department of Neurology, Mayo Clinic, Rochester, MN, who provided cerebrovascular expertise and reviewed stroke events that occurred in SHS participants from 1995-2004. The authors also thank Dr. Dedra Buchwald, Dr. Stephen Schwartz, Dr. Clemma Muller, Dr. Paul Jensen, and Mr. Adam Omidpanah for valuable discussions, comments and suggestions.


Appendix

The stroke-free time distribution function based on the Cox proportional-hazard model from the final selected model was as follows [24]:

$$S(t \mid x) = \left[S_0(t)\right]^{\exp(b_1 x_1 + b_2 x_2 + \cdots + b_p x_p)} \qquad (1)$$

where S0(t) is the estimated baseline stroke-free time function, x = (x1, x2, ..., xp) is the final selected optimal set of risk factors, and b1, b2, ..., bp are their respective estimated coefficients. S0(t) was estimated according to a method proposed by Breslow [24].

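Based on Equation (1), the probability that an individual will develop stroke in t years is estimated by the following equation:

$$P(t \mid x) = 1 - S(t \mid x) = 1 - \left[S_0(t)\right]^{\exp(b_1 x_1 + b_2 x_2 + \cdots + b_p x_p)} \qquad (2)$$

with the given set of risk factors x of the individual.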

From Table 3 the summation terms in Equation (1) for women and men are:





where I(.) is the indicator function, which equals 1 if the condition in the parentheses is met and 0 otherwise.

To illustrate the use of the models in Table 3 and Equations (1) to (4) to predict the 10-year risk of incident stroke, consider a stroke-free man who is 60.5 years old, is a smoker, has a waist circumference of 100 cm and SBP/DBP = 183/114 mmHg, takes hypertension medications, does not use DM medications, and has UACR = 160.6 mg/g, LDL-C = 162 mg/dl, and a history of CHD/HF. Applying Equation (4), the summation term in Equation (1) for this man equals 0.10268 × 60.5 − 0.91966 × 0 − 0.01881 × 100 + 0.65459 × 1 + 0.02661 × 114 × 1 + 0.01566 × 183 × (1 − 1) + 0.53292 × 0 + 0.13446 × Log(160.6) + 0.79694 × 0 − 0.49104 × 0 + 0.98687 × 1 = 9.687187. From Equations (2) and (1), his probability (risk) of developing stroke in 10 years equals

$$1 - \left[S_0(10)\right]^{\exp(9.687187)} = 1 - 0.999942632^{\exp(9.687187)} \approx 0.603$$

where S0(10) (= 0.999942632, Table 3) is the baseline stroke-free time function evaluated at t = 10 for the model. The predicted probability of 60.3% is about 5 times the average 10-year risk of incident stroke, 12.7% (Table A1), for a man of this age. This calculation can easily be carried out in an MS Excel worksheet or directly with the stroke risk calculator that will be created on the SHS web site.

Table A1. Average predicted probability (risk) of developing incident stroke in 10 years.
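The worked example can also be checked numerically. The following is a minimal Python sketch, not the authors' calculator: the function names `linear_predictor` and `stroke_risk` are hypothetical, the (coefficient, value) pairs are copied in order from the example without labels, and Log is assumed to be the natural logarithm. Small differences from the printed summation value in the last digits are to be expected if the published coefficients are rounded.

```python
import math

# Baseline 10-year stroke-free probability for the men's model,
# as quoted in the worked example (attributed there to Table 3).
S0_10 = 0.999942632

# (coefficient, covariate value) pairs copied in order from the worked example.
# They are deliberately left unlabeled; only the numbers come from the text.
terms = [
    (0.10268, 60.5),
    (-0.91966, 0),
    (-0.01881, 100),
    (0.65459, 1),
    (0.02661, 114 * 1),          # DBP term, multiplied by 1 in the example
    (0.01566, 183 * (1 - 1)),    # SBP term, multiplied by (1 - 1) in the example
    (0.53292, 0),
    (0.13446, math.log(160.6)),  # assumption: Log denotes the natural logarithm
    (0.79694, 0),
    (-0.49104, 0),
    (0.98687, 1),
]


def linear_predictor(pairs):
    """Summation term of Equation (1): sum of coefficient * covariate value."""
    return sum(b * x for b, x in pairs)


def stroke_risk(s0_t, lp):
    """Equation (2): risk = 1 - S0(t) ** exp(linear predictor)."""
    return 1.0 - s0_t ** math.exp(lp)


lp = linear_predictor(terms)
risk = stroke_risk(S0_10, lp)
print(f"summation term = {lp:.4f}")
print(f"10-year risk   = {risk:.1%}")
```

The same two functions could be reused for the women's model by substituting its coefficients and baseline value from Table 3.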

Conflicts of Interest

The authors declare no conflicts of interest.


[1] Zhang, Y., Galloway, J.M., Welty, T.K., Wiebers, D.O., Whisnant, J.P., Devereux, R.B., et al. (2008) Incidence and Risk Factors for Stroke in American Indians: The Strong Heart Study. Circulation, 118, 1577-1584.
[2] American Indian and Alaska Native Heart Disease and Stroke Fact Sheet: Division for Heart Disease and Stroke Prevention, Centers for Disease Control and Prevention, 2012.
[3] Heron, M. (2011) Deaths: Leading Causes for 2007. National Vital Statistics Reports, 59, 1-95.
[4] Goldstein, L.B., Bushnell, C.D., Adams, R.J., Appel, L.J., Braun, L.T., Chaturvedi, S., et al. (2011) Guidelines for the Primary Prevention of Stroke: A Guideline for Healthcare Professionals from the American Heart Association/American Stroke Association. Stroke, 42, 517-584.
[5] Karas, M.G., Devereux, R.B., Wiebers, D.O., Whisnant, J.P., Best, L.G., Lee, E.T., et al. (2012) Incremental Value of Biochemical and Echocardiographic Measures in Prediction of Ischemic Stroke: The Strong Heart Study. Stroke, 43, 720-726.
[6] Howard, B.V., Lee, E.T., Cowan, L.D., Devereux, R.B., Galloway, J.M., Go, O.T., et al. (1999) Rising Tide of Cardiovascular Disease in American Indians. The Strong Heart Study. Circulation, 99, 2389-2395.
[7] Lee, E.T., Howard, B.V., Wang, W., Welty, T.K., Galloway, J.M., Best, L.G., et al. (2006) Prediction of Coronary Heart Disease in a Population with High Prevalence of Diabetes and Albuminuria: The Strong Heart Study. Circulation, 113, 2897-2905.
[8] Wolf, P.A., D’Agostino, R.B., Belanger, A.J. and Kannel, W.B. (1991) Probability of Stroke: A Risk Profile from the Framingham Study. Stroke, 22, 312-318.
[9] D’Agostino, R.B., Vasan, R.S., Pencina, M.J., Wolf, P.A., Cobain, M., Massaro, J.M., et al. (2008) General Cardiovascular Risk Profile for Use in Primary Care: The Framingham Heart Study. Circulation, 117, 743-753.
[10] Goff Jr., D.C., Lloyd-Jones, D.M., Bennett, G., Coady, S., D’Agostino, R.B., Gibbons, R., et al. (2014) 2013 ACC/AHA Guideline on the Assessment of Cardiovascular Risk: A Report of the American College of Cardiology/American Heart Association Task Force on Practice Guidelines. Circulation, 129, S49-S73.
[11] Cappuccio, F.P., Oakeshott, P., Strazzullo, P. and Kerry, S.M. (2002) Application of Framingham Risk Estimates to Ethnic Minorities in United Kingdom and Implications for Primary Prevention of Heart Disease in General Practice: Cross Sectional Population Based Study. BMJ, 325, 1271.
[12] Wang, W., Lee, E.T., Howard, B.V., Fabsitz, R.R., Devereux, R.B. and Welty, T.K. (2011) Fasting Plasma Glucose and Hemoglobin A1c in Identifying and Predicting Diabetes: The Strong Heart Study. Diabetes Care, 34, 363-368.
[13] Wang, W., Lee, E.T., Fabsitz, R.R., Devereux, R., Best, L., Welty, T.K., et al. (2006) A Longitudinal Study of Hypertension Risk Factors and Their Relation to Cardiovascular Disease: The Strong Heart Study. Hypertension, 47, 403-409.
[14] Lee, E.T., Welty, T.K., Fabsitz, R., Cowan, L.D., Le, N.A., Oopik, A.J., et al. (1990) The Strong Heart Study. A Study of Cardiovascular Disease in American Indians: Design and Methods. American Journal of Epidemiology, 132, 1141-1155.
[15] Howard, B.V., Welty, T.K., Fabsitz, R.R., Cowan, L.D., Oopik, A.J., Le, N.A., et al. (1992) Risk Factors for Coronary Heart Disease in Diabetic and Nondiabetic Native Americans. The Strong Heart Study. Diabetes, 41, 4-11.
[16] Kizer, J.R., Wiebers, D.O., Whisnant, J.P., Galloway, J.M., Welty, T.K., Lee, E.T., et al. (2005) Mitral Annular Calcification, Aortic Valve Sclerosis, and Incident Stroke in Adults Free of Clinical Cardiovascular Disease: The Strong Heart Study. Stroke, 36, 2533-2537.
[17] Howard, B.V., Lee, E.T., Yeh, J.L., Go, O., Fabsitz, R.R., Devereux, R.B., et al. (1996) Hypertension in Adult American Indians. The Strong Heart Study. Hypertension, 28, 256-264.
[18] Chobanian, A.V., Bakris, G.L., Black, H.R., Cushman, W.C., Green, L.A., Izzo Jr., J.L., et al. (2003) The Seventh Report of the Joint National Committee on Prevention, Detection, Evaluation, and Treatment of High Blood Pressure: The JNC 7 Report. JAMA, 289, 2560-2571.
[19] American Diabetes Association (2009) Diagnosis and Classification of Diabetes Mellitus. Diabetes Care, 32, S62-S67.
[20] Shara, N.M., Wang, H., Mete, M., Al-Balha, Y.R., Azalddin, N., Lee, E.T., et al. (2012) Estimated GFR and Incident Cardiovascular Disease Events in American Indians: The Strong Heart Study. American Journal of Kidney Diseases, 60, 795-803.
[21] Levey, A.S., Stevens, L.A., Schmid, C.H., Zhang, Y.L., Castro 3rd, A.F., Feldman, H.I., et al. (2009) A New Equation to Estimate Glomerular Filtration Rate. Annals of Internal Medicine, 150, 604-612.
[22] Lee, E.T., Cowan, L.D., Welty, T.K., Sievers, M., Howard, W.J., Oopik, A., et al. (1998) All-Cause Mortality and Cardiovascular Disease Mortality in Three American Indian Populations, Aged 45-74 Years, 1984-1988. The Strong Heart Study. American Journal of Epidemiology, 147, 995-1008.
[23] Howard, B.V., Robbins, D.C., Sievers, M.L., Lee, E.T., Rhoades, D., Devereux, R.B., et al. (2000) LDL Cholesterol as a Strong Predictor of Coronary Heart Disease in Diabetic Individuals with Insulin Resistance and Low LDL—The Strong Heart Study. Arteriosclerosis, Thrombosis, and Vascular Biology, 20, 830-835.
[24] Lee, E.T. and Wang, W.Y. (2013) Statistical Methods for Survival Data Analysis. 4th Edition, John Wiley & Sons, Inc., Hoboken, NJ.
[25] Harrell Jr., F.E., Lee, K.L. and Mark, D.B. (1996) Multivariable Prognostic Models: Issues in Developing Models, Evaluating Assumptions and Adequacy, and Measuring and Reducing Errors. Statistics in Medicine, 15, 361-387.
[26] Park, J.H. and Ovbiagele, B. (2015) Stroke: LDL and Stroke Risk-Clinical Practice or Target Practice? Nature Reviews Neurology, 11, 8-9.
[27] Patra, J., Taylor, B., Irving, H., Roerecke, M., Baliunas, D., Mohapatra, S., et al. (2010) Alcohol Consumption and the Risk of Morbidity and Mortality for Different Stroke Types—A Systematic Review and Meta-Analysis. BMC Public Health, 10, 258.
[28] Lee, M., Saver, J.L., Chang, K.H. and Ovbiagele, B. (2010) Level of Albuminuria and Risk of Stroke: Systematic Review and Meta-Analysis. Cerebrovascular Diseases, 30, 464-469.
[29] Leening, M.J., Vedder, M.M., Witteman, J.C., Pencina, M.J. and Steyerberg, E.W. (2014) Net Reclassification Improvement: Computation, Interpretation, and Controversies: A Literature Review and Clinician’s Guide. Annals of Internal Medicine, 160, 122-131.
[30] Pencina, M.J., D’Agostino Sr., R.B. and Steyerberg, E.W. (2011) Extensions of Net Reclassification Improvement Calculations to Measure Usefulness of New Biomarkers. Statistics in Medicine, 30, 11-21.
[31] Hilden, J. and Gerds, T.A. (2014) A Note on the Evaluation of Novel Biomarkers: Do Not Rely on Integrated Discrimination Improvement and Net Reclassification Index. Statistics in Medicine, 33, 3405-3414.
[32] Pepe, M.S., Fan, J., Feng, Z., Gerds, T. and Hilden, J. (2015) The Net Reclassification Index (NRI): A Misleading Measure of Prediction Improvement Even with Independent Test Data Sets. Statistics in BioSciences, 7, 282-295.

Copyright © 2024 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.