Competing Risks Analysis of African American Breast Cancer Patients

Purpose: Recent studies have shown that African American (AA) breast cancer patients experience lower survival than patients of any other race. Knowledge of the cause-specific survival of these patients is necessary to investigate the factors associated with the disease and to support clinical practice. Methods: The parametric competing risk method is applied to build the survival models, and the parametric mixture model is used to study the overall survival of these patients. The Kaplan-Meier survival estimate is also computed for comparison. Results: The overall death rate decreases sharply immediately after diagnosis and increases thereafter. The risk of death from breast cancer itself is highest in the first five years; other causes, however, pose a greater threat to patients after this period. Patients who received only surgery have a higher survival rate in the long run. The use of radiation only does not have a significant effect on patients' survival. Conclusion: Our study shows that parametric competing risk models are promising for estimating the cause-specific survival of AA breast cancer patients and can be used in clinical practice. We also observed that heart and other diseases pose a greater threat to breast cancer patients in the long run.


Introduction
Breast cancer is a disastrous burden for women all over the world. Around 1.7 million women worldwide (12% of all new cancer cases) were diagnosed with breast cancer in 2012 [1]. According to the National Cancer Institute (NCI), breast cancer is common among women in the United States, with an estimated 231,840 new cases in 2015, representing 29% of all new cancer cases in women. African American (AA) breast cancer patients experience lower survival rates in the United States than other groups [2] [3] [4]. According to NCI statistics, 121 of every 100,000 African American women are diagnosed with breast cancer. Compared to Caucasian women, AA women have a 10% lower incidence rate but a 37% higher death rate [5]. African Americans' socio-economic status and knowledge also make them one of the most vulnerable patient groups [6] [7].
In studying patient survival, parametric models are useful alternatives to the Kaplan-Meier and Cox proportional hazards (Cox-PH) models [8]. Because a parametric model provides the time ratio, its results are easier to interpret and are more informative and relevant to clinicians. Previous investigations have pointed out the significance of causes of death other than breast cancer itself for breast cancer patients and have used the competing risk method to study them [9] [10] [11] [12]. Moreover, in studying competing risks, parametric analysis takes into account all possible risks simultaneously and may provide better findings [13].
The aims of this study are: 1) to perform a parametric competing risk analysis of AA breast cancer patients in the USA from 1973 to 2012; 2) to apply the parametric mixture model method to observe the overall survival; and 3) to compare the parametric and non-parametric survival models for a specified group of patients.

Material
The data set for this study was provided by the Surveillance, Epidemiology, and End Results (SEER) Program (SEER, 2012) [14]. This data set consists of 57,181 African American women diagnosed with breast cancer using histology, cytology, or microscopic confirmation during the period 1973-2012. Our study includes African American breast cancer patients aged above 20 at diagnosis (cancer site labeled breast by ICD-O-3 codes [C500-506 and C508-509]). After removing patients with insufficient information, 47,016 subjects remain: 39,446 with malignant tumors and 7570 with carcinoma tumors. Out of the 7570 patients with carcinoma tumors, 1513 died (20%). Out of the 39,446 patients with malignant tumors, 19,950 died (50.6%).
Patients' actual ages in years were stored as a numerical variable with a range from 20 to 106. Disease stage was categorized based on the SEER simplified version of stage [15]. There are four stage categories: in situ, localized, regional, and distant (coded as 0, 1, 2, and 4 respectively by SEER Historic Stage A [15]). Information on radiation and surgical therapies is available in the SEER data sets. Information about chemotherapy and hormone therapy can be obtained by linking the SEER and the Medicare claims datasets, but the quality of the data is questionable [16] [17]. Therefore, only radiotherapy and surgical therapy are included in this study. Tumor histologic grades were classified as well differentiated (grade 1), moderately differentiated (grade 2), poorly differentiated (grade 3), undifferentiated (grade 4), and unknown grade. In the data, all breast cancer patients with in situ stage have carcinoma tumor behavior, and all patients with other stages are classified as having malignant tumors. Therefore, tumor behavior is not included in this study because this information is already captured by cancer stage. The patients' marital status is categorized as single, married, separated, divorced, and widowed. Socioeconomic factors are also of interest. Insurance classification is the only socio-economic factor in the SEER database. However, more than 80% of the insurance classification data is unknown. Therefore, we decided not to include this factor in our analysis.
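The cohort counts quoted above can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers stated in this section; the variable and function names are illustrative.

```python
# Counts quoted in the text (SEER cohort of African American women).
initial_cohort = 57_181
malignant, malignant_deaths = 39_446, 19_950
carcinoma, carcinoma_deaths = 7_570, 1_513

analyzed = malignant + carcinoma        # subjects kept for analysis
excluded = initial_cohort - analyzed    # removed for insufficient information


def pct(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


print("analyzed:", analyzed, "excluded:", excluded)
print("carcinoma deaths (%):", pct(carcinoma_deaths, carcinoma))
print("malignant deaths (%):", pct(malignant_deaths, malignant))
```

The two subgroup sizes add up to the 47,016 analyzed subjects, and the death percentages round to the 20% and 50.6% reported.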
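As an illustration of how a categorical covariate such as stage might enter a regression model, the sketch below turns the SEER Historic Stage A codes listed above into 0/1 indicator variables. Using in situ as the reference category and the function name `stage_dummies` are assumptions made for this example; the text does not state how the covariates were coded.

```python
# SEER Historic Stage A codes as listed in the text.
STAGE_LABELS = {0: "in situ", 1: "localized", 2: "regional", 4: "distant"}


def stage_dummies(code, reference=0):
    """Return 0/1 indicators for every stage except the reference stage.

    The reference category (in situ by default) is an assumption.
    """
    if code not in STAGE_LABELS:
        raise ValueError(f"unknown stage code: {code}")
    return {
        label: int(code == c)
        for c, label in STAGE_LABELS.items()
        if c != reference
    }


print(stage_dummies(2))  # a patient with regional disease
```

A patient in the reference category gets all zeros, so each stage coefficient is read relative to in situ disease.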

Methodology
First, the SEER Cause-Specific Death Classification and Other Cause of Death Classification are used to obtain information about the vital status and the cause of death of the studied patients. However, these two items do not categorize the causes of death other than breast cancer. In order to investigate other significant causes of death, we used the item Cause of Death to SEER Site Recode (ICD_5DIG). We observed that 54.35% of the patients are still alive, 23.67% have died of breast cancer, and 6.98% have died of heart disease. Therefore, in this study, we analyze three competing causes of death: breast cancer, heart disease, and other causes. We observed that the distributions of survival times in years for all of these causes of death follow the Weibull distribution.
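The share of deaths from other causes is not stated explicitly but follows from the percentages above. The sketch below derives it, assuming the four categories (alive, breast cancer, heart disease, other causes) are exhaustive; the variable names are illustrative.

```python
# Shares reported in the text, in percent of the analyzed cohort.
alive, breast_cancer, heart_disease = 54.35, 23.67, 6.98

# Assumption: everyone not alive died of one of the three causes.
other_causes = round(100 - alive - breast_cancer - heart_disease, 2)
all_deaths = round(100 - alive, 2)

# Each cause as a share of all deaths rather than of the whole cohort.
share_of_deaths = {
    "breast cancer": 100 * breast_cancer / all_deaths,
    "heart disease": 100 * heart_disease / all_deaths,
    "other causes": 100 * other_causes / all_deaths,
}

print("other causes (% of cohort):", other_causes)
for cause, share in share_of_deaths.items():
    print(f"{cause}: {share:.1f}% of deaths")
```

Under that assumption, other causes account for 15% of the cohort, more than twice the share attributed to heart disease.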
Suppose that the data consist of observations $t_1, t_2, \ldots, t_n$ on the survival times of $n$ patients. Associate with each individual $i$ a random variable $B_i$ that classifies the cause of death. Following the cause-specific distributions approach in Prentice et al. (1978) [18] and Kalbfleisch and Prentice (1980) [19], the competing risk model starts with the cause-specific hazards and the conditional cumulative distribution function (CDF) of $t_i$, given that the death is of type $j$. These give the overall CDF $F(t)$, which in turn produces the hazard function associated with $F(t)$ as developed by Maller and Zhu [20]. Since all three cause-specific survival functions are accelerated failure time models with the Weibull distribution, the overall hazard and survival functions of the competing risk model, with the adjustment of the applicable covariates given in formula (1), can be written in terms of $X$ and $B$, the covariate and parameter matrices.
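As a reference point, a standard mixture formulation consistent with the description above can be sketched as follows. The symbols $p_j$, $F_j$, $f_j$, $\beta_j$, and $\sigma_j$, the numbering of the causes, and the assumption that every patient eventually dies of one of the three causes are notation chosen for this sketch; they may differ from formulas (1) and (2) referenced in the text.

```latex
\begin{align}
  B_i &\in \{1, 2, 3\}
      && \text{(1 = breast cancer, 2 = heart disease, 3 = other causes)} \\
  p_j &= P(B_i = j), \qquad \sum_{j=1}^{3} p_j = 1 \\
  F_j(t) &= P(t_i \le t \mid B_i = j)
      && \text{(conditional CDF for cause } j\text{)} \\
  F(t) &= \sum_{j=1}^{3} p_j F_j(t)
      && \text{(overall CDF)} \\
  h(t) &= \frac{f(t)}{1 - F(t)}
        = \frac{\sum_{j=1}^{3} p_j f_j(t)}{1 - \sum_{j=1}^{3} p_j F_j(t)}
      && \text{(overall hazard)} \\
  F_j(t \mid x) &= 1 - \exp\!\left\{ -\left( t\, e^{-x^{\top}\beta_j} \right)^{1/\sigma_j} \right\}
      && \text{(Weibull AFT component)}
\end{align}
```

Here $f_j$ is the density of $F_j$, $x$ is a patient's covariate vector, and $\sigma_j$ is the scale parameter of the accelerated failure time model for cause $j$, so that the Weibull shape is $1/\sigma_j$.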

Results
The competing risk models and the overall survival model described above were fitted. The cause-specific hazard rates are plotted in Figure 1. Cancer patients generally focus on cancer more than on any other disease they have. However, this result indicates that the rate of death due to other causes is higher than that due to breast cancer after the first 5 years. This suggests that, in the long run, breast cancer patients should also pay attention to other diseases.
The overall hazard function is plotted in Figure 2. The hazard plot for all causes of death estimates that the rate of death of African American breast cancer patients declines slightly from 0.0385% to 0.0375% in the first year after diagnosis. After that, the death rate increases gradually. The overall survival rate of patients decreases gradually by approximately 3.5% per year. The 10-year survival probability is about 65%.
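The two survival figures above can be reconciled with simple arithmetic. The sketch below compares two possible readings of "3.5% per year"; which reading the authors intended is an assumption, and the variable names are illustrative.

```python
annual_decline = 0.035  # "approximately 3.5% per year" from the text
years = 10

# Reading 1: survival falls by 3.5 percentage points each year.
linear = 1 - annual_decline * years

# Reading 2: survival falls by 3.5% of its current value each year.
compound = (1 - annual_decline) ** years

print(f"linear reading:   {linear:.3f}")
print(f"compound reading: {compound:.3f}")
```

The percentage-point reading gives about 65% at ten years, matching the reported 10-year survival probability, while the compounding reading gives roughly 70%.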

Comparison between Competing Risk Models and Kaplan-Meier Estimation
We also compare the competing risk models with the Kaplan-Meier curves for the specified sample with the fixed covariate values described in section 2.2. Estimates for both methods were computed and plotted on the same graphs. Results are presented in Figure 3.
Figure 3 shows that, in comparison to the Kaplan-Meier estimate, the Accelerated Failure Time Model overestimates the survival probability from year 2 to 18. The largest difference is about 4%. The estimate from the Accelerated Failure Time Model for the risk of heart disease is slightly lower than the Kaplan-Meier estimate from year 5 to 15. The parametric model for other risks shows strong agreement with the Kaplan-Meier estimate. However, the parametric model provides a continuous function and thus a smooth survival curve. In addition, it incorporates the effect of covariates and thus makes the prediction more flexible. The small deviation after year 18 can be explained by the lack of data points.
The hazard estimates from the competing-risk and non-parametric models are compared in Figure 4. The competing risk model's hazard rate estimate was presented in Equation (2). The parametric model shows an upward trend in the rate of death over time, while the Kaplan-Meier estimate gives a hazard rate that fluctuates over time. This characteristic of the non-parametric model makes inference about the hazard rate unrealistic: the rate of death from a particular disease does not usually change so often within a population. The consistency of the competing risk model makes inference about the hazard rate more reasonable. However, the wavy pattern of the Kaplan-Meier curve may reflect some important factor that is not included in the parametric model, such as drug resistance.
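The Kaplan-Meier curves used for comparison follow the product-limit rule: at each observed death time, the running survival probability is multiplied by one minus the number of deaths divided by the number of patients still at risk. A minimal pure-Python sketch follows. The function name and the toy data are illustrative, and treating deaths from competing causes as censored (event = 0) is an assumption, since the text does not say how the cause-specific curves were obtained.

```python
def kaplan_meier(times, events):
    """Product-limit estimate for right-censored data.

    times  : follow-up time for each patient
    events : 1 if the death of interest was observed, 0 if censored
    Returns (event_times, survival_probabilities).
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    event_times, survival = [], []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = ties = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            ties += 1
            i += 1
        if deaths:
            surv *= 1 - deaths / n_at_risk
            event_times.append(t)
            survival.append(surv)
        # Deaths and censored patients both leave the risk set.
        n_at_risk -= ties
    return event_times, survival


# Toy data: five patients, two of them censored.
t, s = kaplan_meier([1, 2, 2, 3, 5], [1, 1, 0, 1, 0])
print(t, [round(x, 3) for x in s])
```

The estimate changes only at observed death times and is flat in between, which is why it produces a step curve rather than the smooth curve of the parametric model.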

Discussion
In this study, we found several important characteristics of the rate of death of the patients and of the effects of some factors on the survival time of AA breast cancer patients. The overall death rate of patients decreases during a short initial period and increases thereafter (Figure 2). However, we discovered that only the death rate due to breast cancer decreases in the initial period, as opposed to those due to heart disease and other causes (Figure 1). This clearly indicates the importance of competing risk models in the study of African American breast cancer patients.
Our study shows the significant risk that heart disease and other causes pose to breast cancer patients (Figure 1). The risk of dying of breast cancer is highest in the first five years but gradually decreases over time, while the risk of dying of other causes increases. Therefore, our model implies that other diseases should not be underestimated when treating African American breast cancer patients. This result, however, may be confounded by other factors such as age. It is possible that deaths after 5 years are primarily caused by aging.
In Figure 2, the risk of dying goes down in the first month after diagnosis and goes up significantly after that. This curve includes all patients in this study, regardless of treatment, age, or tumor behavior. The explanation for this phenomenon can be found in Figure 1. Our model shows that the risk from breast cancer decreases sharply in the first few months while the risks from other threats increase. The short decline in the aggregated hazard rate is attributed to the sharp decline in breast cancer risk, and the later rise is attributed to the increase in other risks.
The comparison between the two methods for a specified group of patients is shown in Figure 3. Because it includes covariates and yields a consistent hazard rate, the parametric competing risk model can provide more reasonable results than the Kaplan-Meier estimate when dealing with different populations. Furthermore, in cases where the size of the population is insufficient, the Kaplan-Meier estimate will give a constant survival rate over time. The survival of a new group of patients can be estimated easily from the parametric model by simply changing the values of the covariates. These findings suggest that the parametric model may stimulate further study of breast cancer in African Americans.
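The dip-then-rise shape can be reproduced with a small numerical sketch. It relies on the identity that the all-cause hazard equals the sum of the cause-specific hazards, which is a simplification and not the mixture model fitted in this study. The scale values are those reported for the three fitted models; converting them to Weibull shapes as 1/scale assumes the usual accelerated failure time parameterization, and the `LAMBDA` values are hypothetical, chosen only for illustration.

```python
# Scale estimates reported in the text; shape = 1 / scale is assumed.
SCALE = {"breast cancer": 1.1471, "heart disease": 0.7301, "other causes": 0.7117}
# HYPOTHETICAL characteristic lifetimes in years (not from the study).
LAMBDA = {"breast cancer": 30.0, "heart disease": 40.0, "other causes": 25.0}


def weibull_hazard(t, scale, lam):
    """Weibull hazard h(t) = (k / lam) * (t / lam)**(k - 1) with k = 1 / scale."""
    k = 1.0 / scale
    return (k / lam) * (t / lam) ** (k - 1.0)


def overall_hazard(t):
    """All-cause hazard as the sum of the three cause-specific hazards."""
    return sum(weibull_hazard(t, SCALE[c], LAMBDA[c]) for c in SCALE)


grid = [0.01 + 0.05 * i for i in range(400)]  # 0.01 to about 20 years
total = [overall_hazard(t) for t in grid]
t_min = grid[total.index(min(total))]
print(f"overall hazard is lowest near t = {t_min:.2f} years")
```

Because the breast cancer component has a shape below 1, its hazard falls over time, while the other two components have shapes above 1 and rise. Their sum therefore falls at first and then climbs, with the location of the minimum depending on the hypothetical `LAMBDA` values.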
The contribution of tumor differentiation is usually overlooked in the clinical setting. Some researchers believe that the degree of differentiation is not always an indication of the level of tumor invasiveness. The study of Jogi et al. (2012) supported the idea that the differentiation grade is associated with tumor behavior [21]. Our model in this study supports this idea. In the model, the parameters for more differentiated tumors are more negative than those of the less differentiated ones. This indicates that the degree of differentiation has a significant contribution to patients' survival time. The less differentiated the tumor is, the higher the patient's risk.
We observed two notable results about the effects of age and treatment. First, our observation that survival decreases as age at diagnosis increases contradicts the conclusion of Keegan et al. (2012), in which adolescents and young adults had a 44 percent higher risk of dying from breast cancer than patients from 40 to 64 years old [22]. Colzani et al. (2011) also concluded that women aged less than 45 have a 95% probability of death, whereas this percentage in patients aged from 65 to 74 is only 44.5% [22] [23]. We further investigated this phenomenon using the smooth hazard curves of two age groups: younger and older than 45 years [24].
The graph in Figure 5 shows that the younger patients have a higher risk in the first 10 years, which is similar to the two findings mentioned above. The largest difference in the rate of death is about 1.4%. More than 15 years after diagnosis, the rate of death of patients older than 45 becomes higher. This supports the finding from our models.
The second notable result is that radiation treatment has no significant effect on survival time under the risk of breast cancer. This can also be verified by the smooth hazard rate estimates presented in Figure 6.
Patients who received both radiation and surgery and those who received surgery only had roughly the same death rate in the first 15 years. After that, patients who received only surgery had a slightly lower death rate. Patients who received only radiation had the highest mortality rate. The rate of death of these patients is about 20% in the early years but declines rapidly over time. Patients who received neither radiation nor surgery also have a high mortality rate in the early years, but their rate of death also declines over time. It cannot be said for certain that using only radiation has a negative effect on breast cancer patients' survival. Treatment options are chosen based on many different factors such as cancer stage, tumor size, and the patient's preference [25] [26] [27] [28] [29]. Patients who chose radiation therapy only in this dataset may already have had a high risk of death. Choice of treatment may also reflect socio-economic status; it is possible that patients who choose radiation only have limited financial means to treat other diseases. The rapid decline in the hazard rate of the radiation-only group is not attributed to radiation therapy, since patients who did not receive it show the same pattern of decline. The reason for this drop is the decrease in the risk of breast cancer shown in Figure 1. This finding contradicts Clark et al. [30], who suggested that breast irradiation does not affect survival, and Whelan et al. [31], who suggested that radiation reduces risk. Steward et al.'s study [32] showed that adjuvant radiation improved the survival of patients undergoing breast-conserving therapy. This finding is similar to our case, where the combination of surgery and radiation shows a significant improvement in survival.
However, Steward et al. did not present any result for radiation-only cases that can be compared with our finding. In addition, our study focuses on African American patients, while Steward et al.'s does not differentiate by race.
Further studies on the effectiveness of radiation only to confirm or disprove this finding will be helpful for physicians.

Conclusions
Beyond its statistical methodology, our study presents the following notable findings that may be useful to clinicians.
• Patients have the highest risk of dying from breast cancer in the first five years after diagnosis. After that, other diseases pose bigger threats.
• Our findings support the idea from previous studies that the differentiation grade is associated with tumor behavior. The less differentiated the tumor is, the more dangerous it is.
• Younger patients have a higher risk than older ones in the first 10 years after diagnosis. The difference diminishes after this period. Previous studies presented mixed results about this phenomenon.
• Our study shows that patients who received only radiation have a higher risk of dying than patients who received other types of treatment, and a risk similar to that of patients who received no treatment. This finding contradicts several previous studies and needs further investigation to confirm or disprove.
For the fitted models, the parameter estimates with their p-values and 95% CIs are presented in the Appendix. For the risk of dying from breast cancer, the estimate of the scale parameter is 1.1471, which means that this risk decreases with time. For the risk of dying from heart disease, older age decreases the expected survival time. Patients at every stage other than in situ die sooner, in order of stage. Our study shows that using only radiation is likely to decrease patients' survival time. The estimate of the scale parameter is 0.7301, so the risk of dying from heart disease for breast cancer patients increases at a decreasing rate. Patients at risk of other causes show a parameter pattern similar to that of patients at risk of heart disease. The estimate of the scale parameter is 0.7117, which means that the risk of dying from other causes for breast cancer patients increases at a decreasing rate.
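The link between a scale estimate and the direction of the hazard can be made explicit. In the usual accelerated failure time parameterization of the Weibull model, the shape is k = 1/scale and the hazard is proportional to t^(k-1). The sketch below assumes that parameterization, which matches the interpretations given above but is not stated in the text; the function name is illustrative.

```python
def weibull_hazard_trend(scale):
    """Describe how a Weibull hazard, proportional to t**(k - 1), changes over time.

    Assumes shape k = 1 / scale (usual AFT parameterization).
    """
    k = 1.0 / scale
    if k < 1:
        return "decreasing"
    if k == 1:
        return "constant"
    if k < 2:
        return "increasing at a decreasing rate"
    if k == 2:
        return "increasing linearly"
    return "increasing at an increasing rate"


reported = [
    ("breast cancer", 1.1471),
    ("heart disease", 0.7301),
    ("other causes", 0.7117),
]
for cause, scale in reported:
    print(f"{cause}: shape = {1 / scale:.3f}, hazard {weibull_hazard_trend(scale)}")
```

A scale above 1 gives a shape below 1 and a falling hazard, while a scale between 0.5 and 1 gives a shape between 1 and 2, so the hazard rises but flattens over time.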

Figure 1. Hazard rate of each cause of death.

Figure 2. Overall rate of death of African American breast cancer patients.

Figure 3. Comparison of the cause-specific and overall survival curves computed by the two methods.

Figure 4. Hazard rate estimation by competing risk model and Kaplan-Meier method.

Figure 5. Hazard rate of the risk of breast cancer for the two age groups.

Figure 6. Hazard rate of the risk of breast cancer for the four types of treatments.