A Non-Mixture Cure Model with a Change Point in a Co-Variate for Right Censored Data ()
1. Introduction
The cure fraction models are broadly used for analyzing survival data where a significant portion of the population may be cured and thus not subject to the event of interest. In the existing literature, there are two major approaches to model survival data with cure fraction. The first is mixture cure rate model, introduced by Boag [1] in 1949 and further developed by Berkson and Gage [2] in 1952 and later studied extensively by several authors. The second type is the non-mixture cure rate model, also referred to as the bounded cumulative hazard model or the promotion time cure model. In oncology research, the latter model was developed under the assumption that the number of cancer cells remaining active after treatment, which can grow slowly and eventually lead to detectable recurrence, follows a Poisson distribution. This assumption was first proposed by Yakovlev [3] in 1993 and was further discussed by Chen [4] in 1999, Ibrahim and Chen [5] in 2002 and Tsodikov [6] in 2002. Tsodikov [7] in 2003 provided a review of existing methodology of statistical inference based on the non-mixture model. Kutal and Qian [8] in 2018 studied a non-mixture cure model for right censored data with Fréchet distribution.
Change-point models for detection and estimation have been widely studied in the field of survival analysis for decades. Matthews and Farewell [9] in 1982 first introduced the change point in constant hazard rate when analyzing the failure times of non-lyphoblastic leukemia patients. Muller and Wang [10] in 1990 proposed a non-parametric approach for the change in hazard rates for censored survival data. They used the kernel method for the non-parametric estimation of derivative of hazard rate. Pons [11] in 2003 considered a Cox regression model with a change-point according to a threshold in a co-variate. Duppy [12] in 2006 studied estimation in a change-point hazard regression model with co-variate. Li and Qian [13] in 2013 estimated a change point hazard regression model with long term survivors. Zhao [14] in 2009 analyzed the change-point model for survival data including long term survivors. Othus [15] in 2012 studied the change-point cure models based on co-variate threshold. Taweab [16] in 2015 investigated the Bounded Cumulative Hazard (BCH) model with a change-point at an unknown threshold in a co-variate for right censored data. Kutal [17] in 2018 examined parameter estimation for a non-mixture cure model incorporating a change-point in a covariate under right-censored data. The current study builds upon and extends that earlier work by refining the estimation framework and broadening its application. However, in the literature, there is not much work in the field of cure models with change-point and only few studies have investigated this type of model. In this article, we propose an approach to analyze the non-mixture cure model in the presence of right censored data with a change point according to a threshold in a co-variate and we considered exponential distribution for susceptible subjects.
2. Non-Mixture Cure Model and Susceptible Distribution
2.1. Model Motivation and Formulation
We introduced a non-mixture cure model developed by Yakovlev [3] in 1993 as an alternative to the mixture cure model. The motivation for this model is as follows. Let
denote the number of potential malignant cancer cells that may eventually grow and produce a detectable tumor at a time in the future. We assumed
follow a Poisson distribution with mean
. Let
represent the random time until that the jth cancer cell to produce a detectable cancer mass. We assume the times
are independent and identically distributed random variables. The time to relapse in cancer for an individual is defined as
. Let an individual cumulative distribution function (CDF)
and a survival function
, where
is a relevant scalar covariate. The population survival function of the
, which represents the probability of no cancer relapse by time
, is derived as follows:
(1)
The cure fraction,
, is the probability of long-term survival (having no malignant cells (N = 0)), which can be defined as
Notice that as
, we obtain
where as
, we obtain
. Since
, then the model (1) is an improper survival function. Moreover, the survival function of
also can be written as
. From this, the population’s cumulative distribution function, probability density function, and hazard function are
,
, and
respectively.
For right censored survival time, the observed time for the
individual is
, where
is the censoring time. A censoring indicator,
, is defined as 1 if the event (failure) was observed and 0 if the data was right-censored.
The likelihood function for a sample of
individual is given by
(2)
The corresponding log-likelihood function is
(3)
2.2. Exponential Distribution for Uncured Individuals
In this paper, we assume that the failure times of uncured individuals follow an exponential distribution, a common choice for lifetime data analysis, with rate parameter (
). The exponential distribution of the probability density function, the cumulative distribution function, the survival function, and the hazard function are
;
,
;
,
;
, and
;
, respectively.
For simulation study, to generate random numbers from a given distribution, we apply the Inverse Transform Method.
Let
and
, be a cumulative distribution function, then
,
. The random variable
is a continuous random variable with cumulative distribution function
. Since
, since
is a uniform distribution.
3. The Log-Likelihood of Non-Mixture Model with a Change Point in a Covariate
We extend the non-mixture cure model to incorporate a change point, denoted by
, in a continuous covariate
. The model divide the population into two distinct regimes: (i) individuals with
, the cure probability
, the failure rate
and
(ii) individuals with
, the cure probability
, the failure rate
and
. The likelihood contribution of the
individual is
The complete observed data are
and the unknown parameters are defined by
. The likelihood function under change point
can be written as
(4)
Taking the logarithm, the corresponding log-likelihood function can be written as
(5)
where,
,
, and
. The log likelihood function (5) is not differentiable with respect to the change point parameter
. This lack of smoothness complicates the use of standard gradient-based optimization methods, motivating the introduction of a smoothed likelihood approach, which is discussed in the next section.
3.1. Smoothed Likelihood Method
To address the non-differentiability of the log-likelihood function in the presence of a change point, we employ a smoothed likelihood approach. This method replaces the discontinuous indicator functions
and
with a continuous and differentiable approximation by Othus et al. [15] in 2012. For this, we use a continuous and differential function
satisfying
and
. For a given sample size
, define the smoothed version
, where
is a bandwidth parameter that controls the degree of smoothing and may depend on
. In our simulation studies, we set
,
, where
is chosen so that it balances finite sample performance and ensures asymptotic properties of the estimator. For the asymptotic results, we assume the bandwidth satisfies
and
as
. This condition ensures that the smoothed objective is asymptotically equivalent to the original change-point likelihood while retaining differentiability, and it yields root-
consistency and asymptotic normality for the full parameter vector, including
. For sufficiently small
, we have
when
and
when
. Thus, the smooth model preserves the essential structure of the change point while providing differentiability. A commonly used choice for this class of function,
, is the logistic function, defined as
Using this formulation, the smoothed likelihood function associated for the observed data
is given by
(6)
Taking logarithms, the smoothed log-likelihood function becomes
(7)
We proposed to estimate the parameters, including change point
, by maximizing the smoothed log-likelihood Equation (7) using Newton-Raphson algorithm. The smoothing parameter
plays critical role in this procedure. A smaller choice of
leads to estimates that are nearly unbiased but may suffer from higher variability, whereas larger values of
reduce the variance at the expense of increased bias. Thus, an appropriate balance of
is required to achieve stable and reliable estimation. The detailed estimation framework, including re-parameterization, score function, observed information, and algorithmic considerations, is presented in the Estimation Procedure section to provide a systematic approach for implementation.
3.2. Estimation Procedure
The parameters
of the proposed non-mixture cure model with a change point are estimated by maximizing smoothed log-likelihood function
introduced earlier section. The smoothing process ensures that the function is differentiable with respect to the parameters, including the change point
, making standard optimization feasible. The complete estimation process is detailed below.
Re-parameterization for stability: To perform unconstrained optimization and ensure numerical stability, the model’s parameters are re-parameterized. The original parameter vector
is subject to the constraints
and
. The cure probabilities
are transformed using the logit function
and the failure rate parameters
are transformed using the log function
This creates an unconstrained working parameter vector
, where each element can take any value in the real line. The original parameters can be recovered via the inverse transformations.
Score function (first derivatives): The smoothed log-likelihood is
where
and
denote the log-likelihood contributions for individuals with
and
, respectively. The score vector,
, is the gradient of the smoothed log-likelihood with respect to the unconstrained parameter vector
. It is defined as
and the components of the score vector are:
Observed information (second derivatives): The observed information matrix,
, is defined as the negative of the Hessian matrix of the smoothed log-likelihood:
where
with
is the log-likelihood contribution for the
subject in regime
:
This matrix plays an essential role in the Newton-Raphson estimation procedure and is used to obtain the asymptotic variance and standard errors of the parameter estimates. As a result of the reparameterization
, the Hessian naturally partitions into blocks. The second derivative with respect to
and
are available in closed form analytic expressions because the corresponding terms in the smoothed log-likelihood depend directly on the exponential density and the logit link, whose derivatives are tractable. In contrast, the derivatives involving the change-point parameter
includes the smoothing function
, whose first and second derivatives,
and
, enters into the Hessian. These cross-derivative terms are algebraically more complex. Therefore, in practice, it is common to evaluate the
related Hessian elements numerically for enhanced numerical stability, while retaining analytic forms for the remaining blocks. A complete set of analytic derivatives, including the score vector and Hessian components, is provided in the Supplement for reproducibility and implementation.
Newton-Raphson algorithm: The parameter vector
that maximizes the smoothed log-likelihood function
is found using the Newton-Raphson algorithm. Starting from an initial value
, the estimate is refined through successive iterations. For the
-th iteration, the standard update is calculated as:
To ensure robust convergence and prevent overshooting the maximum, a step-size parameter
is often introduced, modifying the update to:
This iterative process continues until convergence is achieved, which is determined when the norm of the score vector
and the magnitude of the change between successive parameter estimates both fall below a pre-specified tolerance.
Because the Newton-Raphson algorithm can be sensitive to starting values, we employed a multi-start initialization strategy. Initial values for the change point
were selected from a coarse grid over empirical quantiles of the threshold covariate, and for each candidate
the remaining parameters were initialized using simple regime-specific summaries, including empirical late-time survival proportions for the cure components and exponential rate estimates for the failure components. To mitigate convergence to local maxima, the algorithm was run from multiple starting values, and the solution yielding the largest smoothed log-likelihood was retained. Convergence was assessed using both the norm of the score vector and the change in parameter estimates, with step-size damping applied as needed to stabilize the updates.
Asymptotic Properties: The estimator
is defined as the maximizer of the smoothed log-likelihood
in (7), where
Although our model is a non-mixture (promotion time) cure model with parametric exponential latency, and thus differs in formulation from the change-point cure models in Othus et al. (2012) and the BCH change-point model in Taweab et al. (2015), the asymptotic argument follows the same smoothed M-estimation framework: consistency is obtained from uniform convergence and identifiability, and asymptotic normality follows from a Taylor expansion of the smoothed score and a central limit theorem for the i.i.d. score contributions.
Assume standard regularity conditions for M-estimators hold (e.g., identifiability, interior true parameter, finite second moments, and nonsingularity of the information matrix), and that the smoothing bandwidth satisfies
Then
is consistent and
where
is the true parameter and
is the expected (smoothed) Fisher information. In practice,
is consistently estimated by the observed information
yielding the usual large-sample covariance approximation
Because the optimization is performed on the unconstrained scale
with
and
, standard errors for
are obtained by applying the delta method to the inverse observed-information matrix on the
-scale.
4. Simulation Study
In this section, we conducted a simulation study to evaluate the finite-sample performance of the proposed non-mixture cure model with a change point in covariate based on right censored data. The sample data were generated from a non-mixture cure model with a change point according to the following procedure:
Step-1: We considered two distinct scenarios for the covariate distribution and the true change point
. (i) The covariate
was drawn from a uniform distribution U(0, 2) with a true change point at
. (ii) The covariate
was drawn from a truncated distribution on the interval (0, 3) with a mean of 1 and a standard deviation of 1, with a true change point at
.
Step-2: For each simulated subject, a random variable
was drawn from the from uniform distribution U(0, 1). If
, the subject is assigned to the cure group, otherwise the subject is assigned to the uncured group.
Step 3: (Inverse-CDF draw of
)
Draw
. We use the population survival
with exponential latency
. Set
and solve:
This has a finite solution for
only when
, which correspond to the susceptible population, yielding
(8)
If
, classify the subject as cured and set
.
For the Change-point implementation, with regimes
by
(regime 1) or
(regime 2), use the regime specific parameters
in the above formula.
Step-4: Apply independent right-censoring by generating censoring times
from an exponential distribution with rate parameter
, where
is chosen to achieve a desired censoring level.
Step-5: Estimate the parameter vector
by maximizing the smoothed log-likelihood function using standard optimization procedures.
For each scenario, we simulated 1000 replications with sample sizes of 100, 200, and 400. The performance of the estimators was assessed using the mean, bias, standard error (SE), and root mean square error (RMSE). The simulation results demonstrate that the proposed estimation method performs well across all settings. The average parameter estimates are close to their true values, with small biases. Estimation of the parameter of the change point
is generally accurate, even under moderate censoring. As expected, SE and RMSE decrease as the sample size increases from 100 to 200 and 200 to 400, confirming the consistency of the estimators. In addition, the method performs better under lower levels of censoring, with reduced variability and bias compared to higher censoring rates. Overall, the results indicate that the proposed smoothed likelihood approach provides reliable parameter estimates in finite samples.
The detailed numerical summaries of the simulation results are reported in Table 1 and Table 2. Table 1 presents the finite-sample performance of the proposed estimator under a Uniform U(0, 2) covariate distribution with a true change point at
, while Table 2 summarizes results under a truncated normal covariate distribution with
. In both scenarios, the estimators exhibit small bias and decreasing standard errors and RMSE as the sample size increases, with improved performance under lower censoring rates.
Table 1. Simulation summaries under U(0, 2) with
at two censoring levels.
(a) Censoring ≈ 27%,
|
(b) Censoring ≈ 41%,
|
Parameter |
|
Mean |
Bias |
SE |
RMSE |
Parameter |
|
Mean |
Bias |
SE |
RMSE |
N = 100 |
N = 100 |
|
0.02 |
0.0215 |
0.0015 |
0.0128 |
0.0129 |
|
0.02 |
0.0221 |
0.0021 |
0.0145 |
0.0147 |
|
0.30 |
0.3041 |
0.0041 |
0.0952 |
0.0953 |
|
0.30 |
0.3069 |
0.0069 |
0.1158 |
0.1160 |
|
0.01 |
0.0109 |
0.0009 |
0.0076 |
0.0077 |
|
0.01 |
0.0113 |
0.0013 |
0.0088 |
0.0089 |
|
0.40 |
0.4077 |
0.0077 |
0.1103 |
0.1106 |
|
0.40 |
0.4102 |
0.0102 |
0.1341 |
0.1345 |
|
1.00 |
1.0189 |
0.0189 |
0.1854 |
0.1864 |
|
1.00 |
1.0253 |
0.0253 |
0.2011 |
0.2027 |
N = 200 |
N = 200 |
|
0.02 |
0.0208 |
0.0008 |
0.0089 |
0.0089 |
|
0.02 |
0.0211 |
0.0011 |
0.0099 |
0.0100 |
|
0.30 |
0.3015 |
0.0015 |
0.0651 |
0.0651 |
|
0.30 |
0.3031 |
0.0031 |
0.0789 |
0.0789 |
|
0.01 |
0.0104 |
0.0004 |
0.0051 |
0.0051 |
|
0.01 |
0.0106 |
0.0006 |
0.0060 |
0.0060 |
|
0.40 |
0.4036 |
0.0036 |
0.0765 |
0.0766 |
|
0.40 |
0.4049 |
0.0049 |
0.0922 |
0.0923 |
|
1.00 |
1.0095 |
0.0095 |
0.1288 |
0.1292 |
|
1.00 |
1.0136 |
0.0136 |
0.1452 |
0.1458 |
N = 400 |
N = 400 |
|
0.02 |
0.0203 |
0.0003 |
0.0061 |
0.0061 |
|
0.02 |
0.0205 |
0.0005 |
0.0068 |
0.0068 |
|
0.30 |
0.3009 |
0.0009 |
0.0453 |
0.0453 |
|
0.30 |
0.3017 |
0.0017 |
0.0546 |
0.0546 |
|
0.01 |
0.0102 |
0.0002 |
0.0035 |
0.0035 |
|
0.01 |
0.0103 |
0.0003 |
0.0041 |
0.0041 |
|
0.40 |
0.4018 |
0.0018 |
0.0532 |
0.0532 |
|
0.40 |
0.4025 |
0.0025 |
0.0645 |
0.0646 |
|
1.00 |
1.0041 |
0.0041 |
0.0901 |
0.0902 |
|
1.00 |
1.0067 |
0.0067 |
0.1018 |
0.1020 |
Note:
is true parameter value; SE = standard error; RMSE = root mean square error.
Table 2. Simulation summaries under Truncated Normal(1, 1) on [0,3] with
at two censoring levels.
(a) Censoring ≈ 27%,
|
(b) Censoring ≈ 42%,
|
Parameter |
|
Mean |
Bias |
SE |
RMSE |
Parameter |
|
Mean |
Bias |
SE |
RMSE |
N = 100 |
N = 100 |
|
0.02 |
0.0218 |
0.0018 |
0.0139 |
0.0140 |
|
0.02 |
0.0225 |
0.0025 |
0.0158 |
0.0160 |
|
0.30 |
0.3051 |
0.0051 |
0.1021 |
0.1022 |
|
0.30 |
0.3081 |
0.0081 |
0.1255 |
0.1258 |
|
0.01 |
0.0110 |
0.0010 |
0.0081 |
0.0082 |
|
0.01 |
0.0115 |
0.0015 |
0.0096 |
0.0097 |
|
0.40 |
0.4088 |
0.0088 |
0.1197 |
0.1200 |
|
0.40 |
0.4119 |
0.0119 |
0.1480 |
0.1485 |
|
1.50 |
1.5201 |
0.0201 |
0.2235 |
0.2244 |
|
1.50 |
1.5287 |
0.0287 |
0.2451 |
0.2468 |
N = 200 |
N = 200 |
|
0.02 |
0.0209 |
0.0009 |
0.0095 |
0.0095 |
|
0.02 |
0.0213 |
0.0013 |
0.0108 |
0.0109 |
|
0.30 |
0.3023 |
0.0023 |
0.0703 |
0.0703 |
|
0.30 |
0.3040 |
0.0040 |
0.0863 |
0.0864 |
|
0.01 |
0.0105 |
0.0005 |
0.0055 |
0.0055 |
|
0.01 |
0.0107 |
0.0007 |
0.0066 |
0.0066 |
|
0.40 |
0.4041 |
0.0041 |
0.0831 |
0.0832 |
|
0.40 |
0.4059 |
0.0059 |
0.1011 |
0.1013 |
|
1.50 |
1.5108 |
0.0108 |
0.1569 |
0.1573 |
|
1.50 |
1.5151 |
0.0151 |
0.1705 |
0.1712 |
N = 400 |
N = 400 |
|
0.02 |
0.0204 |
0.0004 |
0.0065 |
0.0065 |
|
0.02 |
0.0206 |
0.0006 |
0.0075 |
0.0075 |
|
0.30 |
0.3011 |
0.0011 |
0.0490 |
0.0490 |
|
0.30 |
0.3021 |
0.0021 |
0.0602 |
0.0602 |
|
0.01 |
0.0102 |
0.0002 |
0.0038 |
0.0038 |
|
0.01 |
0.0103 |
0.0003 |
0.0046 |
0.0046 |
|
0.40 |
0.4020 |
0.0020 |
0.0583 |
0.0583 |
|
0.40 |
0.4029 |
0.0029 |
0.0708 |
0.0709 |
|
1.50 |
1.5049 |
0.0049 |
0.1095 |
0.1096 |
|
1.50 |
1.5078 |
0.0078 |
0.1199 |
0.1202 |
Note:
is true parameter value; SE = standard error; RMSE = root mean square error.
Figure 1 provides a graphical summary of the simulation results by illustrating the behavior of the RMSE as a function of sample size for all model parameters across the considered scenarios. As shown in Figure 1, the RMSE decreases monotonically as the sample size increases for each parameter, confirming the consistency of the proposed smoothed maximum likelihood estimators under both covariate distributions and censoring levels.
Figure 1. RMSE decreases with increases sample sizes for all parameters.
Sensitivity and Misspecification Analysis
A crucial test of the model’s reliability is its robustness when the core assumption of an exponential latency distribution for susceptible individuals is violated. We performed a comprehensive sensitivity analysis where the true survival times were generated using a Weibull distribution (with shape parameter
, implying an increasing hazard) while the estimation was carried out using the exponential-based smoothed maximum likelihood estimator (SMLE). To further examine robustness with respect to the covariate design, this analysis was conducted under both the Uniform (U(0, 2)) and Truncated Normal (TN(1, 1) on [0,3]) covariate distributions to ensure generalizability.
As summarized in Table 3, the fitted model consistently exhibits significant structural bias in the failure rate estimates (
), with typical biases ranging from approximately −0.17 to −0.27 across all scenarios. This outcome is expected, as the constant-hazard exponential model cannot accurately capture the time-varying hazard of the true Weibull process; consequently, the rate estimator converges to pseudo-true values under misspecification. Crucially, the key parameters of clinical interest of the cure fractions (
) remain robust and approximately unbiased across all sample sizes and covariate distributions. Most notably, the change-point estimator
demonstrates strong robustness, with negligible bias and steadily decreasing RMSE as
grows (for example, from 0.2409 to 0.1136 in the U(0, 2) scenario). These results indicate that, even under substantial latency misspecification, the proposed smoothed change-point non-mixture cure model reliably identifies threshold effects and provides robust inference for parameters of primary scientific interest.
Table 3. Sensitivity analysis under latency misspecification. Susceptible failure times are generated from a Weibull distribution, while the model is fitted assuming exponential latency.
(a) Uniform(0, 2),
|
(b) TruncNorm[0, 3],
|
Parameter |
|
Mean |
Bias |
SE |
RMSE |
Parameter |
|
Mean |
Bias |
SE |
RMSE |
N = 100 |
N = 100 |
|
0.02 |
0.0161 |
−0.0039 |
0.0180 |
0.0185 |
|
0.02 |
0.0125 |
−0.0075 |
0.0151 |
0.0168 |
|
0.01 |
0.0080 |
−0.0020 |
0.0100 |
0.0102 |
|
0.01 |
0.0101 |
0.0001 |
0.0142 |
0.0142 |
|
0.30 |
0.1265 |
−0.1735 |
0.0427 |
0.1787 |
|
0.30 |
0.1127 |
−0.1873 |
0.0390 |
0.1913 |
|
0.40 |
0.1344 |
−0.2656 |
0.0419 |
0.2688 |
|
0.40 |
0.1378 |
−0.2622 |
0.0455 |
0.2661 |
|
1.00 |
1.0130 |
0.0130 |
0.2406 |
0.2409 |
|
1.50 |
1.5213 |
0.0213 |
0.2948 |
0.2956 |
N = 200 |
N = 200 |
|
0.02 |
0.0077 |
−0.0123 |
0.0098 |
0.0157 |
|
0.02 |
0.0083 |
−0.0117 |
0.0106 |
0.0158 |
|
0.01 |
0.0049 |
−0.0051 |
0.0059 |
0.0078 |
|
0.01 |
0.0057 |
−0.0043 |
0.0071 |
0.0083 |
|
0.30 |
0.1059 |
−0.1941 |
0.0302 |
0.1965 |
|
0.30 |
0.1034 |
−0.1966 |
0.0304 |
0.1990 |
|
0.40 |
0.1252 |
−0.2748 |
0.0326 |
0.2767 |
|
0.40 |
0.1270 |
−0.2730 |
0.0342 |
0.2751 |
|
1.00 |
1.0083 |
0.0083 |
0.1679 |
0.1681 |
|
1.50 |
1.5033 |
0.0033 |
0.2418 |
0.2418 |
N = 400 |
N = 400 |
|
0.02 |
0.0072 |
−0.0128 |
0.0079 |
0.0150 |
|
0.02 |
0.0066 |
−0.0134 |
0.0084 |
0.0158 |
|
0.01 |
0.0033 |
−0.0067 |
0.0039 |
0.0078 |
|
0.01 |
0.0047 |
−0.0053 |
0.0058 |
0.0078 |
|
0.30 |
0.1064 |
−0.1936 |
0.0254 |
0.1953 |
|
0.30 |
0.1006 |
−0.1994 |
0.0260 |
0.2011 |
|
0.40 |
0.1213 |
−0.2787 |
0.0256 |
0.2799 |
|
0.40 |
0.1275 |
−0.2725 |
0.0333 |
0.2745 |
|
1.00 |
1.0052 |
0.0052 |
0.1134 |
0.1136 |
|
1.50 |
1.5376 |
0.0376 |
0.1466 |
0.1513 |
Note:
denotes the reference parameter value; SE = standard error; RMSE = root mean square error.
5. Real Data Application
5.1. The Colon Cancer Dataset
To illustrate the proposed methodology, we analyzed the colon cancer dataset from the survival package. The dataset records recurrence-free survival times, censoring indicators, and multiple clinical covariates. After basic cleaning (finite positive times, finite node counts), the analysis set contained
individuals with 456 recurrences and 455 censored observations. In our analysis, the number of positive lymph nodes (nodes) served as the threshold covariate for the change-point model.
5.1.1. Stratified Descriptive Analysis
Patients were further stratified into two groups according to the estimated change point. Table 4 presents the descriptive summaries. The low-node group (
) had a median time of 2023 days with an event rate of 41.3%, compared to 501.5 days and 66.5% in the high-node group.
Table 4. Descriptive summary of colon cancer patients by change-point stratification.
Group |
|
Events |
Event Rate (%) |
Median Time (days) |
nodes |
595 |
246 |
41.3 |
2023.0 |
nodes |
316 |
210 |
66.5 |
501.5 |
5.1.2. Parameter Estimates
Table 5 reports the estimated cure fractions (
), failure rates (
), and the change point
. The estimated change point was located at approximately
positive nodes, partitioning the cohort into two prognostic subgroups. Patients with
were governed by parameters
, while those with
followed
. Standard errors (SE) were obtained via the observed information matrix using the delta method.
Table 5. Parameter estimates from the non-mixture cure model with change point (colon dataset).
Parameter |
Estimate |
SE |
Interpretation |
|
0.542340 |
0.024631 |
Cure fraction, low-node group (
) |
|
0.304869 |
0.034681 |
Cure fraction, high-node group (
) |
|
0.000996 |
0.000112 |
Failure rate, low-node group |
|
0.001444 |
0.000157 |
Failure rate, high-node group |
|
3.816639 |
0.378346 |
Estimated change point (nodes) |
5.1.3. Profile Likelihood of the Change Point
Figure 2 displays the profile negative log-likelihood as a function of
. The curve shows a clear minimum at
, providing evidence for a change point in the number of positive nodes.
Figure 2. Profile negative log-likelihood for
(colon dataset).
5.1.4. Interpretation
These findings confirm the critical prognostic role of lymph node involvement in colon cancer. The estimated cure fraction is substantially higher (
) and the susceptible failure rate is lower (
) than above the change point (
,
). The profile likelihood supports a well-defined change point near 3.8 nodes, and the stratified descriptive statistics reinforced the model-based conclusions. Overall, the analysis demonstrates the practical utility of the proposed smoothed non-mixture cure model with a change point for quantifying clinically interpretable thresholds in prognosis.
5.2. The Melanoma Dataset
To further validate the proposed methodology, we analyzed the melanoma dataset from the boot package (
). The dataset records survival time in days, event status (1 = death due to melanoma, 0 = censored), and several clinical covariates including tumor thickness, ulceration, and age. In this application, tumor thickness (mm) was used as the change-point covariate.
5.2.1. Stratified Descriptive Analysis
Patients were stratified by the estimated change point. Table 6 provides summary statistics. Those with thin lesions (
) showed higher cure probability, lower event rates, and longer median survival compared to the thick-lesion group.
Table 6. Descriptive summary of melanoma patients by change-point stratification.
Group |
n |
Events |
Event Rate (%) |
Median Time (days) |
Thin lesions (≤2.26 mm) |
118 |
18 |
15.3 |
2103.5 |
Thick lesions (>2.26 mm) |
87 |
39 |
44.8 |
1812.0 |
5.2.2. Parameter Estimates
Patients were stratified using the estimated change point
mm. As summarized in Table 6, the thin-lesion group (
) exhibited a substantially lower event rate (15.3%) than the thick-lesion group (44.8%) and a longer median survival time (2103.5 vs. 1812.0 days). Although these descriptive measures indicate a more favorable prognosis for patients with thin lesions, the model-based results in Section 5.2.1 suggest that this advantage is driven primarily by a markedly lower failure rate rather than a large cure fraction, implying substantially slower failure dynamics in this subgroup. Table 7 reports the estimated cure fractions (
), failure rates (
), and the change point
. The estimated change point was
mm. Patients with tumors of thickness
were described by parameters
, whereas those with thicker tumors followed
. Standard errors (SE) were computed using the observed information matrix with the delta method.
Table 7. Parameter estimates from the non-mixture cure model with change point (Melanoma dataset).
Parameter |
Estimate |
SE |
Interpretation |
|
0.000045 |
0.0020 |
Cure fraction, thin lesions (≤2.26 mm) |
|
0.2965 |
0.1852 |
Cure fraction, thick lesions (>2.26 mm) |
|
0.000006 |
0.00003 |
Failure rate, thin lesions |
|
0.000255 |
0.00018 |
Failure rate, thick lesions |
|
2.26 |
0.50 |
Estimated change point (mm) |
5.2.3. Profile Likelihood of the Change Point
Figure 3 shows the profile negative log-likelihood for
. The curve attains a clear minimum at
mm, supporting the presence of a clinically meaningful threshold in tumor thickness.
Figure 3. Profile negative log-likelihood for
(Melanoma dataset).
5.2.4. Interpretation
These results confirm tumor thickness as a significant prognostic threshold in melanoma. While clinical intuition often associates thin lesions with a higher likelihood of being cured, the model-based estimates for the thinn lesion group (
mm) yield a cure fraction
near zero. This indicates that the superior survival outcomes observed in this group characterized by a substantially lower event rate of 15.3% and a notably longer median survival time of 2103.5 days are statistically better captured by an exceptionally low failure rate (
) rather than a definitive cured plateau. In contrast, patients with thick lesions (
mm) experienced a higher failure rate (
), yet the model identifies a more pronounced cure fraction (
) for this subgroup. The proposed smoothed likelihood non-mixture cure model successfully identifies this threshold effect, demonstrating that when a distinct plateau is not yet visible in the data, the model relies on the latency distribution to characterize long-term survivors through slow failure dynamics. This highlights the necessity of interpreting cure fractions as model based constructs whose values are sensitive to follow-up patterns and parametric assumptions.
Interpretation of the change-point cure model
A critical consideration in the application of cure models is the reliable separation of long-term survivors from individuals with very long but finite survival times. In the melanoma application, the estimated cure fraction should therefore be interpreted as a model-based quantity rather than as definitive biological evidence of cure, particularly in the absence of a visible survival plateau or when follow-up is limited. Under such conditions, cure fraction estimates can be sensitive to the assumed parametric form of the latency distribution. For subgroups in which the estimated cure fraction is close to zero, the observed survival advantage is more plausibly explained by markedly slow failure dynamics among susceptible individuals rather than by the presence of a clearly identifiable cured subpopulation. Consequently, cure fraction estimates should be evaluated jointly with the estimated failure rates and the available follow-up duration to avoid overinterpretation of the cure component.
Limitation of univariate threshold modeling
While the proposed model successfully identifies clinically interpretable thresholds in both the colon cancer and melanoma datasets, the real-data analyses are conducted in a univariate setting. In clinical practice, survival outcomes are typically influenced by multiple prognostic factors; for example, melanoma prognosis depends not only on tumor thickness but also on ulceration status, patient age, and anatomical site. Excluding such covariates may introduce residual confounding, particularly when they are correlated with the threshold covariate, which may in turn affect the estimated change point
. Moreover, the regime-specific cure fractions and failure rates
should be interpreted as marginal effects of the threshold covariate, as they may reflect the combined influence of unmodeled prognostic factors. Extending the proposed smoothed likelihood framework to a multivariable setting to accommodate additional covariates within regimes represents an important direction for future research.
While the proposed model effectively identifies threshold effects, it assumes that covariates influence survival only through regime membership defined by the change point
. In complex clinical settings, additional prognostic effects may persist within these regimes, and ignoring such effects may limit model flexibility. Acknowledging this limitation, future research should extend the proposed smoothed likelihood framework to accommodate within-regime covariate effects, potentially through semiparametric or partially parametric modeling approaches.
6. Conclusion
In this article, we developed and validated a non-mixture cure model with a covariate dependent change point for right censored survival data. Assuming exponential distributions for susceptible subjects, we employed a smoothed likelihood approach to address the non-smoothness in the likelihood due to the change point in covariate. The performance and reliability of our proposed estimation method were first confirmed through extensive simulation studies. The results demonstrated that the estimators for the cure fractions, failure rates, and the change-point parameter are consistent and exhibit low bias across a variety of realistic scenarios, including different sample sizes and censoring proportions. Additional sensitivity analyses under latency misspecification indicate that inference on the change-point and cure fractions remains robust even when the exponential assumption for susceptible survival times is violated. In both real-data applications, the profile likelihood analysis provided strong visual evidence for a unique and well-defined change point, validating the stability of the estimates. Application to the colon and melanoma datasets demonstrated the model’s clinical utility by identifying significant thresholds in lymph node count and tumor thickness that sharply distinguish patient prognosis. This approach provides a valuable tool for refining risk stratification in clinical settings where such threshold effects are common.
Acknowledgements
The authors acknowledge the use of AI tools (Gemini, developed by Google) to support language editing and refinement during the preparation of this manuscript. All content and interpretations are the responsibility of the authors. The authors sincerely thank the anonymous reviewers for their careful reading of the manuscript and for their constructive comments and insightful suggestions, which have substantially improved the quality and clarity of this work.
Supplement A: Derivatives of
with respect to
,
, and
For subject
and regime
, define
where the uncured-time distribution is exponential with
and
is the censoring indicator. We use unconstrained parameters
A.1: Derivatives with respect to
(via
)
Now
depends on
only through
and
(weighted by
). The derivatives with respect to
are
Therefore,
Chain rule to
gives
Thus,
A.2: Derivatives with Respect to
(via
)
For the exponential model,
Hence
So
For the second derivative,
Chain to
with
,
, we obtain
A.3: Derivatives with Respect to τ
The derivative of the smoothed log-likelihood with respect to the change point
is
These expressions provide the analytic score contributions and observed-information terms needed for Newton–Raphson estimation of
in the smoothed-likelihood framework.
A.4: Compact Score (Gradient) for the Smoothed Log-Likelihood
Smoothing weights.
For each subject
, define the logistic weight
Let
and
for
, and recall
The per-subject regime contributions are
Score (gradient).
The smoothed log-likelihood is
Its gradient
with
is
A.5: Observed Information/Hessian
Building-block derivatives (per subject i, regime j).
For compactness, let
These follow directly from the scalar derivations via chain rule.
(i) Second derivative in
. With
and
and
we have
(ii) Second derivative in
. Using
,
(You may leave
in this equivalent unsimplified form; it expands directly from the chain rule with
,
.)
(iii) Mixed derivative in
. Only the term
couples
and
. From either order,
Hessian entries (sum over subjects).
All cross-regime blocks are zero because
depends only on
and
only on
.
Hessian entries involving τ.
Since
and
do not depend on
,
and for
,
and for
,
Implementation notes.
1) The blocks are observed second derivatives; the negative of the Hessian is the observed information. 2) The mixed blocks across regimes vanish, simplifying Newton-Raphson updates. 3) The formulas are numerically stable provided
and
; guard divisions by
.