Composite quantile regression is expected to provide an estimation efficiency gain over a single quantile regression. In this paper, we extend composite quantile regression to nonparametric models with randomly censored data. The asymptotic normality of the proposed estimator is established. The proposed method is applied to lung cancer data. Extensive simulations are reported, showing that the proposed method works well in practical settings.

Consider the following nonparametric regression model with randomly censored data:

$$T = m(U) + \sigma(U)\varepsilon, \qquad (1)$$

where $m(\cdot)$ is an unknown smooth function, $\sigma(\cdot)$ is a positive function representing the standard deviation, and $\varepsilon$ is the random error with mean 0 and variance 1. Let C denote the censoring variable, whose distribution may depend on U, where U is a vector of observed covariates. In this paper, we focus on random right censoring, so we only observe the triples $(Y_i, U_i, \delta_i)$, $i = 1, \ldots, n$, where $Y_i = \min(T_i, C_i)$ and $\delta_i = I(T_i \le C_i)$ are the observed response variable and the censoring indicator, respectively, and $T_i$ is the survival time.
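The observation scheme under random right censoring can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the function names `censor` and `censoring_rate`, and the convention that the indicator equals 1 for an uncensored observation, are assumptions.

```python
def censor(survival_times, censoring_times):
    """Return observed responses and censoring indicators.

    Assumed convention: y = min(t, c), and delta = 1 when the survival
    time is observed (t <= c), 0 when it is right censored.
    """
    pairs = list(zip(survival_times, censoring_times))
    y = [min(t, c) for t, c in pairs]
    delta = [1 if t <= c else 0 for t, c in pairs]
    return y, delta


def censoring_rate(delta):
    """Proportion of right-censored observations in the sample."""
    return 1.0 - sum(delta) / len(delta)
```

For instance, with survival times `[1.0, 3.0]` and censoring times `[2.0, 2.5]`, the second observation is censored at 2.5 while the first is fully observed.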

Censored quantile regression was first studied by [

Intuitively, the composite quantile regression (CQR) should provide estimation efficiency gain over a single quantile regression; see [

The paper is organized as follows. In Section 2, local composite quantile regression for nonparametric models with censored data is introduced, and the main theoretical results are given. Simulation examples and a real data application are presented in Section 3 to illustrate the proposed procedures. Final remarks are given in Section 4. The technical proofs are deferred to the Appendix.

We first consider an ideal situation in which the conditional cumulative distribution function of the survival time given the covariate is assumed to be known. In this case, we define the following weight function:

In reality, this conditional distribution function is unknown and has to be estimated. We propose to estimate it nonparametrically using the local Kaplan-Meier estimator, whose weights are built from a smooth kernel function and a bandwidth that converges to zero as the sample size tends to infinity. Plugging this estimator into (2) yields the estimated local weights.

Consider estimating the value of the regression function at a given point. The LCQRC procedure obtains the estimate by minimizing the locally weighted objective function

where the losses are q check loss functions at q quantile positions, and the censored mass is reassigned to any value sufficiently large to exceed all fitted values.
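The structure of such a locally weighted composite objective can be sketched as follows. This is a hedged illustration under assumptions not stated in the text: equally spaced quantile levels k/(q+1), an Epanechnikov kernel, a scalar covariate, one intercept per quantile level with a common local slope `b`, and censoring weights `W[k][i]` supplied by the caller. All function and argument names are hypothetical.

```python
def check_loss(r, tau):
    """Check loss rho_tau(r) = r * (tau - 1{r < 0})."""
    return r * (tau - (1.0 if r < 0 else 0.0))


def epanechnikov(t):
    """Epanechnikov kernel, supported on [-1, 1]."""
    return 0.75 * (1.0 - t * t) if abs(t) <= 1.0 else 0.0


def quantile_levels(q):
    """Assumed equally spaced quantile positions k / (q + 1)."""
    return [k / (q + 1) for k in range(1, q + 1)]


def lcqrc_objective(a, b, u0, U, Y, W, taus, h, y_inf):
    """Weighted local linear composite quantile objective at the point u0.

    a[k]    : local intercept for quantile level taus[k]
    b       : common local slope
    W[k][i] : censoring weight of observation i at level taus[k]
    y_inf   : large value that receives the remaining mass 1 - W[k][i]
    """
    total = 0.0
    for k, tau in enumerate(taus):
        for i in range(len(Y)):
            kern = epanechnikov((U[i] - u0) / h) / h
            if kern == 0.0:
                continue  # observation lies outside the local window
            fit = a[k] + b * (U[i] - u0)
            w = W[k][i]
            total += kern * (w * check_loss(Y[i] - fit, tau)
                             + (1.0 - w) * check_loss(y_inf - fit, tau))
    return total
```

In practice this function would be passed to a numerical optimizer over `(a, b)`; the sketch only evaluates the criterion.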

Remark 1. A detailed explanation of the weighting scheme can be found in Remark 1 of [
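To make the weighting scheme concrete, the sketch below computes a kernel-weighted (local) Kaplan-Meier estimate of the conditional distribution function and the resulting weight for one observation. The product-limit form, the Epanechnikov kernel, the scalar covariate, and the rule "weight 1 for uncensored points, (tau - F) / (1 - F) for points censored below level tau" follow the usual construction in censored quantile regression and are assumptions here; all names are hypothetical.

```python
def epanechnikov(t):
    """Epanechnikov kernel, supported on [-1, 1]."""
    return 0.75 * (1.0 - t * t) if abs(t) <= 1.0 else 0.0


def local_km_cdf(t, x, U, Y, delta, h):
    """Local Kaplan-Meier estimate of P(T <= t | U = x)."""
    k = [epanechnikov((x - u) / h) for u in U]
    s = sum(k)
    if s == 0.0:
        raise ValueError("no observations inside the kernel window")
    B = [v / s for v in k]  # normalized kernel weights
    surv = 1.0
    for j in range(len(Y)):
        # Only uncensored event times up to t contribute a factor.
        if delta[j] == 1 and Y[j] <= t and B[j] > 0.0:
            at_risk = sum(B[m] for m in range(len(Y)) if Y[m] >= Y[j])
            surv *= 1.0 - B[j] / at_risk
    return 1.0 - surv


def censoring_weight(i, tau, U, Y, delta, h):
    """Weight of observation i at quantile level tau (0 < tau < 1)."""
    if delta[i] == 1:
        return 1.0  # uncensored observations keep full weight
    F = local_km_cdf(Y[i], U[i], U, Y, delta, h)
    if F >= tau:
        return 1.0  # censored beyond the tau-th conditional quantile
    return (tau - F) / (1.0 - F)
```

When all covariate values coincide, the kernel weights are equal and the estimate reduces to the ordinary Kaplan-Meier estimator, which gives a simple sanity check.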

Denote by the marginal density function of the covariate, and

. To prove the main results of this paper, the following technical conditions are imposed.

A1. The functions and have first derivatives with respect to, denoted as and, which are uniformly bounded away from infinity. In addition, and have bounded second order partial derivatives with respect to U.

A2. is a positive definite matrix.

A3. has a continuous second derivative in the neighborhood of.

A4. is differentiable and positive in the neighborhood of.

A5. The conditional variance is continuous in the neighborhood of.

A6. Assume that the error has a symmetric distribution with a positive density.
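A short numerical check may help motivate the symmetry condition A6. In composite quantile regression the estimate of the regression function is commonly formed by averaging q local quantile fits; when the error is symmetric and the quantile levels are placed symmetrically around 1/2, the error quantiles cancel in that average. The sketch below illustrates this with a standard normal error and levels k/(q+1). Both choices, and the function name, are assumptions made for the illustration.

```python
from statistics import NormalDist


def quantile_offsets(q):
    """Standard normal quantiles at the assumed levels k / (q + 1)."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [std_normal.inv_cdf(k / (q + 1)) for k in range(1, q + 1)]


# The offsets come in pairs of opposite sign, so their sum is
# numerically zero for a symmetric error distribution.
offsets = quantile_offsets(9)
total = sum(offsets)
```

For an asymmetric error the same sum is generally nonzero, which would introduce an extra bias term in the averaged estimator.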

Remark 2. Assumption A1 is needed for the local Kaplan-Meier estimator. It allows us to obtain the local expansions of and in the neighborhood of, and to obtain the uniform consistency and the linear representation of, which are needed for deriving the asymptotic normality result. Assumption A2 ensures that the expectation of the estimating function has a unique zero, and it is needed to establish the asymptotic distribution. Assumptions A3-A6 are the same conditions used to establish the asymptotic normality of local composite quantile regression ([

We state the asymptotic normality of the proposed estimator in the following theorem.

Theorem 1. Assume that the triples constitute an i.i.d. multivariate random sample, and that the censoring variable is independent of the survival time conditional on the covariate. Suppose that is an interior point of the support of. Under the regularity conditions A1-A6, if and, then

where stands for convergence in distribution and

, where

and.

In this section, we conduct simulation studies to assess the finite sample performance of the proposed procedures and illustrate the proposed methodology on a lung cancer data set. Moreover, we compare the performance of the newly proposed method with LCQR ([

In implementing the proposed procedure, we take

and

. The bandwidth h^{*} can be obtained by the 10-fold cross-validation method (see [

The data are generated from the following model

where the covariate is uniformly distributed and the errors are i.i.d. standard normal random variables. The censoring variable and. The value of the constant c in the model determines the censoring proportion. In our simulations, we consider three censoring rates (CR): 20%, 30% and 40%. For each censoring rate, the sample sizes are taken to be 100 and 200. To evaluate the finite sample performance of our estimator, two distance measures are computed: the first, the mean absolute deviation error (MADE), is given by, and the second, the mean squared error (MSE), is defined as

. Furthermore, we define the ratios of MADE and MSE (RMADE and RMSE), which are

and

.
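The two error measures can be computed as sketched below. The sketch assumes that both are averages, over a grid of evaluation points, of the absolute and squared differences between the estimated and the true regression function, and that each ratio divides the error of a baseline method by the error of the proposed method. The grid, the direction of the ratio, and all names are assumptions.

```python
def made(estimates, truth):
    """Mean absolute deviation error over the evaluation grid."""
    return sum(abs(e - t) for e, t in zip(estimates, truth)) / len(truth)


def mse(estimates, truth):
    """Mean squared error over the evaluation grid."""
    return sum((e - t) ** 2 for e, t in zip(estimates, truth)) / len(truth)


def error_ratio(baseline_error, proposed_error):
    """Assumed ratio: values above 1 favor the proposed method."""
    return baseline_error / proposed_error
```

Under this convention, a ratio larger than 1 would indicate that the proposed estimator has the smaller error on that replication.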

For right-censored data, quantile functions at levels close to 1 may not be identifiable due to censoring. In our simulations, we consider for LCQR and LCQRC estimators. The means and standard deviations of MADE, MSE, RMADE and RMSE are reported in

It is also necessary to investigate the effect of heteroscedastic errors. The observations are generated from the following model

where and are generated in the same way as in Example 1. The means and standard deviations of MADE, MSE, RMADE and RMSE are reported in

performance of LCQRC is presented in

As an illustration, we now apply the proposed LCQRC to the lung cancer data. The data contain 228 observations on ten variables. The censoring percentage is 27%, so the estimators are expected to perform well. More details about the study can be found in [

given by, and the second one the mean squared error defined as

, where. Furthermore, we define the rate of and which are

and

. Next, we report and compare results with LCQR and NQRC for estimating the survival time. The simulation results for the LCQR, LCQRC and NQRC are given in

shows that the proposal is valid.

In this work, we have studied LCQR for nonparametric models with censored data and established its theoretical properties. The proposed approach is illustrated by simulation examples and a real data application. In addition, we believe the method can be extended to varying coefficient models (see [

Lemma 1. Assume that assumption A1 holds. Then

where.

Proof. This follows directly from Theorem 2.1 of [

Proof of Theorem 1. Let

,

, ,

,

Then is the minimizer of the following criterion:

where and. To apply the identity ([

we have

Since is any value sufficiently large to exceed all, and, then.

Denote, where

.

By the conditional independence of and given, we have

Therefore,

By Lemma 1, we have

Then, we can obtain

So, we can obtain, then

where

.

Note that the error is symmetric, thus, then it follows that

Since, then

So, we can obtain

This completes the proof.