Statistical Diagnosis for General Transformation Model with Right Censored Data Based on Empirical Likelihood

In this work, we consider statistical diagnostics for general transformation models with right censored data based on empirical likelihood. These models form a flexible class of semiparametric survival models and include many popular survival models as special cases. Based on the empirical likelihood methodology, we define several diagnostic statistics. Simulation studies show that our proposed procedure works fairly well.


Introduction
Statistical diagnosis, which developed in the mid-1970s, is a relatively new branch of statistics. Over the past 40 years, diagnostics and influence analysis for linear regression models have been fully developed (R. D. Cook and S. Weisberg [1], Bocheng Wei, Guobin Lu & Jianqing Shi [2]). Influence diagnostics have also been developed for the proportional hazards model (L. A. Weissfeld [3]) and for other survival models, for example the proportional odds model, the heteroscedastic linear transformation model, the generalized linear transformation model and the generalized transformation model.
The empirical likelihood method originates from Thomas & Grunkemeier [4]. Owen [5] first gave a formal definition of empirical likelihood and developed its theory systematically. The empirical CDF of $X_1, X_2, \ldots, X_n$ is defined as $F_n(x) = \frac{1}{n} \sum_{i=1}^{n} I(X_i \le x)$. Zhu and Ibrahim [6] used this method for statistical diagnostics: they developed diagnostic measures for assessing the influence of individual observations when using empirical likelihood with general estimating equations, and used these measures to construct goodness-of-fit statistics for testing possible misspecification in the estimating equations. Liugen Xue and Lixing Zhu [7] summarized applications of the empirical likelihood method. Many authors have successfully applied empirical likelihood to the analysis of survival data. For example, Qin and Jing [8] investigated empirical likelihood confidence intervals for Cox's regression models with right censored data; He [9] studied the goodness-of-fit of Cox's regression models with various types of censored data; Gu et al. [10] considered inference for Cox's regression models with time-dependent coefficients; Zhou [11], Zheng and Yu [12] and Zhou et al. [13] studied empirical likelihood for accelerated failure time models, multivariate accelerated failure time models and heteroscedastic accelerated failure time models, respectively. Li et al. [14] reviewed some applications of empirical likelihood in survival analysis; Lu and Liang [15] discussed an empirical likelihood procedure based on estimating equations for a class of flexible survival models, the linear transformation models, which include the popular proportional hazards and proportional odds regression models as special cases. Jianbo Li et al. [16] studied empirical likelihood inference for general transformation models with right censored data.
In this paper, we consider statistical diagnostics for a very general class of survival models, the general transformation models with right censored data, of the form

$$S_Z(t) = \Phi(S_0(t), Z, \beta), \qquad (1)$$

where $S_Z(t)$ is the conditional survival function of the failure time variable $T$ given the covariate vector $Z$; $S_0(t)$ is the baseline survival function; $\Phi(u, v, w)$ is a known function that is monotonically increasing with respect to $u$ and satisfies $\Phi(0, v, w) = 0$ and $\Phi(1, v, w) = 1$; and $\beta$ is a parameter vector including the regression coefficients and possible model transformation parameters in $\Phi$. Model (1) includes many popular survival models, for example heteroscedastic linear transformation models, as special cases. Note that when $\Phi(u, v, w) = h^{-1}(h(u) + v^T w)$, where $h^{-1}$ is a survival function, Model (1) reduces to the popular linear transformation models (Clayton and Cuzick [17]; Dabrowska and Doksum [18]; Bickel [19]; Cheng et al. [20]; Fine et al. [21]).
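As a worked check of the special cases mentioned above, the following sketch assumes the linear transformation form $\Phi(u, v, w) = h^{-1}(h(u) + v^T w)$ and writes $S_0$ for a baseline survival function (both are notational assumptions of this sketch). Under these assumptions, two standard choices of $h$ recover the proportional hazards and proportional odds models:

```latex
\begin{align*}
h(u) = \log(-\log u):\quad
  & h^{-1}(x) = \exp(-e^{x}), \\
  & S_Z(t) = \exp\!\left\{-e^{\log(-\log S_0(t)) + Z^T\beta}\right\}
           = S_0(t)^{\exp(Z^T\beta)}, \\[4pt]
h(u) = \log\frac{1-u}{u}:\quad
  & h^{-1}(x) = \frac{1}{1+e^{x}}, \\
  & \frac{1 - S_Z(t)}{S_Z(t)} = \frac{1 - S_0(t)}{S_0(t)}\, e^{Z^T\beta}.
\end{align*}
```

In both cases $h^{-1}$ decreases from 1 to 0, so it is a survival function, and the composition $h^{-1}(h(u) + c)$ is increasing in $u$, as required of $\Phi$.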
So far, diagnostics for the general transformation model with random right censoring based on the empirical likelihood method have not appeared in the literature, and this paper attempts to fill that gap. One advantage of the procedure is that it is free of the baseline survival function and the censoring distribution. The class of models we investigate is also more general than those in previous studies of survival models.
The rest of the paper is organized as follows. Empirical likelihood and the estimating equation are presented in Section 2. The main results are given in Section 3 and Section 4. Section 5 contains some simulation studies as well as applications. Conclusions and discussion are given in Section 6.

Empirical Likelihood and Estimation Equation
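Let $C$ be the censoring variable, $\tilde T = \min(T, C)$ be the censored event time variable and $\delta = I(T \le C)$ be the censoring indicator, so that the observed data are $(\tilde T_i, \delta_i, Z_i)$, $i = 1, \ldots, n$. Based on estimating functions $W_i(\beta)$, Li et al. [16] proposed an empirical log-likelihood ratio function for $\beta$, which can be defined by

$$l(\beta) = -2 \max\left\{ \sum_{i=1}^{n} \log(n p_i) : p_i \ge 0,\ \sum_{i=1}^{n} p_i = 1,\ \sum_{i=1}^{n} p_i W_i(\beta) = 0 \right\}.$$

By Qin and Lawless [22] and Owen [5], when 0 lies inside the convex hull of $W_1(\beta), \ldots, W_n(\beta)$, the empirical log-likelihood ratio statistic is equal to the maximum

$$l(\beta) = 2 \max_{\lambda} \sum_{i=1}^{n} \log\{1 + \lambda^T W_i(\beta)\}.$$

Regard $\lambda$ and $\beta$ as independent variables and define

$$l(\beta, \lambda) = \sum_{i=1}^{n} \log\{1 + \lambda^T W_i(\beta)\}.$$

Obviously, the maximum empirical likelihood estimates $\hat\beta$ and $\hat\lambda$ are the solutions of the following equations:

$$\frac{\partial l(\beta, \lambda)}{\partial \beta} = \sum_{i=1}^{n} \frac{\{\partial W_i(\beta)/\partial \beta^T\}^T \lambda}{1 + \lambda^T W_i(\beta)} = 0, \qquad \frac{\partial l(\beta, \lambda)}{\partial \lambda} = \sum_{i=1}^{n} \frac{W_i(\beta)}{1 + \lambda^T W_i(\beta)} = 0.$$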
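The inner maximization over $\lambda$ for a fixed $\beta$ can be sketched numerically as follows. This is a minimal illustration, not the authors' implementation: the function name `el_log_ratio` is hypothetical, the rows of `W` stand for the values $W_i(\beta)$ at a given $\beta$, and a plain Newton iteration is assumed, damped only to keep every $1 + \lambda^T W_i$ positive.

```python
import numpy as np

def el_log_ratio(W, max_iter=100, tol=1e-10):
    """Return (2 * max over lam of sum_i log(1 + lam' W_i), maximizing lam).

    W is an (n, q) array whose i-th row is the estimating function
    W_i(beta) evaluated at a fixed beta (hypothetical input).
    """
    W = np.asarray(W, dtype=float)
    if W.ndim == 1:
        W = W[:, None]                  # n scalar estimating functions
    n, q = W.shape
    lam = np.zeros(q)
    for _ in range(max_iter):
        arg = 1.0 + W @ lam
        scaled = W / arg[:, None]
        grad = scaled.sum(axis=0)       # sum_i W_i / (1 + lam' W_i)
        hess = -scaled.T @ scaled       # Hessian of the concave objective
        step = np.linalg.solve(hess, -grad)   # Newton step
        t = 1.0
        # Halve the step until all implied weights stay positive.
        while np.any(1.0 + W @ (lam + t * step) <= 0.0):
            t *= 0.5
        lam = lam + t * step
        if np.linalg.norm(t * step) < tol:
            break
    stat = 2.0 * np.sum(np.log1p(W @ lam))
    return stat, lam

# Toy usage with the scalar estimating function W_i(mu) = x_i - mu.
x = np.array([1.0, 2.0, 3.0, 4.0, 10.0])
stat, lam = el_log_ratio(x - 3.0)
```

At the returned `lam`, the second estimating equation $\sum_i W_i / (1 + \lambda^T W_i) = 0$ holds up to numerical tolerance. When the estimating functions already sum to zero, the maximizer is $\lambda = 0$ and the statistic is zero.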

Case-Deletion Influence Measures
Consider Model (1) with the $j$-th case $(t_j, Z_j)$ deleted; we refer to it as Model (2). This model is called the case-deletion model. Let $\hat\beta_{(j)}$ be the maximum empirical likelihood estimate of $\beta$ in Model (2). To study the influence of the $j$-th case $(t_j, Z_j)$, we compare $\hat\beta$ with $\hat\beta_{(j)}$. The important result is stated in the following theorem.

Empirical Cook Distance
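Zhu et al. [6] proposed the empirical Cook distance. Let $M$ be a nonnegative definite matrix. The empirical Cook distance is defined as

$$ECD_j(M) = (\hat\beta_{(j)} - \hat\beta)^T M (\hat\beta_{(j)} - \hat\beta),$$

where $\hat\beta$ and $\hat\beta_{(j)}$ are the maximum empirical likelihood estimates of $\beta$ based on the full data and on the data with the $j$-th case deleted, respectively.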
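The case-deletion computation behind this distance can be sketched as below. The sketch is illustrative only: `empirical_cook_distances` and the `estimator` callable are hypothetical names, and the sample mean is used as a stand-in estimator purely to keep the example self-contained. In practice `estimator` would return the maximum empirical likelihood estimate for the model at hand.

```python
import numpy as np

def empirical_cook_distances(data, estimator, M):
    """Quadratic-form distance between full-data and leave-one-out estimates.

    data      : array with one row (or entry) per case
    estimator : callable mapping a data array to an estimate (hypothetical)
    M         : nonnegative definite weighting matrix
    """
    data = np.asarray(data, dtype=float)
    M = np.atleast_2d(np.asarray(M, dtype=float))
    beta_hat = np.atleast_1d(estimator(data))
    n = data.shape[0]
    ecd = np.empty(n)
    for j in range(n):
        # Refit with the j-th case removed.
        beta_j = np.atleast_1d(estimator(np.delete(data, j, axis=0)))
        diff = beta_j - beta_hat
        ecd[j] = diff @ M @ diff
    return ecd

# Toy usage: the sample mean stands in for the estimator (an assumption).
x = np.array([1.0, 2.0, 3.0, 4.0, 50.0])
ecd = empirical_cook_distances(x, lambda d: d.mean(axis=0), np.eye(1))
most_influential = int(np.argmax(ecd))
```

Here the last observation is far from the rest, so deleting it moves the estimate the most and it receives the largest distance.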

Empirical Likelihood Distance
The empirical likelihood distance is motivated by the viewpoint of data fitting and measures the influence of deleting the $j$-th case on the fit. To eliminate the influence of scale, it is also necessary to divide by the variance of the estimator. Because the key point is to assess the influence of deleting the $j$-th case, the W-K statistic can then be expressed as follows

Local Influence Analysis of Model
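We consider the local influence method for a case-weight perturbation $\omega \in R^n$, for which the empirical log-likelihood function is $l(\beta \mid \omega) = 2 \sum_{i=1}^{n} \omega_i \log\{1 + \lambda^T W_i(\beta)\}$. In this case, $\omega = \omega_0$, defined to be an $n \times 1$ vector with all elements equal to 1, represents no perturbation to the empirical likelihood, because $l(\beta \mid \omega_0) = l(\beta)$. Thus, the empirical likelihood displacement is defined as $ELD(\omega) = l(\hat\beta(\omega)) - l(\hat\beta)$, where $\hat\beta(\omega)$ is the maximum empirical likelihood estimator of $\beta$ based on $l(\beta \mid \omega)$. Local influence is assessed through the normal curvature $C_h$ of the influence graph in a direction $h \in R^n$, in particular through $C_{e_j}$, where $e_j$ is an $n \times 1$ vector with $j$-th component 1 and 0 otherwise. The direction $v_1$ represents the most influential perturbation to the empirical likelihood function, whereas an observation $(t_j, Z_j)$ with a large $C_{e_j}$ can be regarded as influential.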
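Once a curvature matrix is available, the two summaries described above can be extracted as in the following sketch. It assumes the normal curvature is a quadratic form $C_h = h^T H h$ for unit-length $h$, with $H$ a symmetric $n \times n$ matrix that has already been computed. The names `H` and `local_influence_summary` are hypothetical, and the sketch does not show how $H$ is derived for the model.

```python
import numpy as np

def local_influence_summary(H):
    """Return (C_e, v1, top_eigenvalue) for a symmetric curvature matrix H.

    C_e[j] = e_j' H e_j is the curvature in the j-th coordinate direction;
    v1 is the unit eigenvector with the largest absolute eigenvalue.
    """
    H = np.asarray(H, dtype=float)
    H = 0.5 * (H + H.T)                  # guard against slight asymmetry
    C_e = np.diag(H).copy()              # e_j' H e_j is the j-th diagonal entry
    eigvals, eigvecs = np.linalg.eigh(H)
    k = int(np.argmax(np.abs(eigvals)))
    v1 = eigvecs[:, k]
    # Fix the sign so that the largest component of v1 is positive.
    if v1[np.argmax(np.abs(v1))] < 0:
        v1 = -v1
    return C_e, v1, float(eigvals[k])

# Toy usage with a made-up curvature matrix.
H = np.diag([1.0, 5.0, 2.0])
C_e, v1, top = local_influence_summary(H)
```

In this toy case the second observation has the largest $C_{e_j}$, and $v_1$ points entirely in its direction.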
Following the discussion of Zhu et al. [6], the corresponding expressions can be deduced for the general transformation regression model with random right censoring, for all the cases (Qian Jun et al. [23]). The survival data were simulated with the SAS software and are shown in Table 1.
To check the validity of our proposed methodology, we changed the value of the response variable for the third, 20th, 54th, 80th and 99th observations.
For every case, it is easy to obtain $W_i(\beta)$. For the parameters $\beta$ and $\lambda$, we used the samples to evaluate their maximum empirical likelihood estimators for the two models. Consequently, it is easy to calculate the values of $S_{11}, S_{12}, S_{21}, S_{22}$ and $C_{e_i}$. From all the figures, we can see from the values of $C_{e_i}$ that, in most cases, the third, 20th, 54th, 80th and 99th observations are strong influence points. This illustrates our proposed approach.
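A simulation design of this kind can be sketched in Python as follows. The paper's data were generated in SAS, and several choices here are assumptions made only for illustration: the sample size `n = 100`, the coefficient vector `beta`, the unit exponential baseline for the Cox model, and the factor used to perturb the five responses. Censoring times are drawn from $U(0, C)$, with $C$ calibrated so that the expected censoring proportion matches the 10%, 20% and 30% targets.

```python
import numpy as np

def expected_censoring(T, c):
    """Expected censoring proportion for U(0, c) censoring, given failure times T."""
    # P(censoring time < T_i) = min(1, T_i / c) for a U(0, c) censoring time.
    return float(np.mean(np.minimum(1.0, T / c)))

def calibrate_c(T, target, lo=1e-8, hi=1e8, n_iter=200):
    """Bisection on the log scale for c with expected_censoring(T, c) = target."""
    for _ in range(n_iter):
        mid = np.sqrt(lo * hi)
        if expected_censoring(T, mid) > target:
            lo = mid                     # too much censoring: increase c
        else:
            hi = mid
    return float(np.sqrt(lo * hi))

rng = np.random.default_rng(0)
n = 100                                  # assumed sample size
beta = np.array([1.0, -0.5])             # assumed regression coefficients
Z = rng.normal(size=(n, beta.size))
# Cox model with unit exponential baseline: T | Z ~ Exp(rate = exp(Z'beta)).
T = rng.exponential(scale=np.exp(-(Z @ beta)))

# 0-based indices of the third, 20th, 54th, 80th and 99th cases.
perturbed = np.array([3, 20, 54, 80, 99]) - 1

datasets = {}
for target in (0.10, 0.20, 0.30):
    c = calibrate_c(T, target)
    cens = rng.uniform(0.0, c, size=n)
    t_obs = np.minimum(T, cens)
    delta = (T <= cens).astype(int)      # 1 = failure observed, 0 = censored
    t_obs[perturbed] *= 10.0             # assumed form of the perturbation
    datasets[target] = (t_obs, delta, Z, c)
```

Calibrating on the expected proportion rather than on a single random draw keeps the choice of $C$ deterministic for a given set of failure times. The realized proportion of censored cases in any one data set will vary around the target.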

Discussion
In this paper, we considered statistical diagnostics for general transformation models with right censored data based on empirical likelihood. We also studied in detail the method of simulating survival data under three different censoring proportions. Through simulation studies, we illustrated that our proposed method works fairly well.
Zhensheng Huang [24] analyzed empirical likelihood for the varying-coefficient single-index model with right censored data. In addition, Zhensheng Huang [25] studied profile empirical likelihood inference for the single-index-coefficient regression model. All of these will be topics for our further research.

Table 1. Survival data for the proportional hazard Cox regression model and the proportional odds regression model (Note: a star in the top right corner marks a censored observation).

Denote by $\Re$ the partial ranking among the $k_n$ uncensored failure times and the censored observations between each neighboring pair of uncensored observations.

Here $\hat\beta(\omega)$ is the maximum empirical likelihood estimator of $\beta$ under the perturbation $\omega$, and $h$ is a direction in $R^n$; the influence of the perturbation is measured by the normal curvature of the influence graph.

We apply the local influence measures based on the normal curvature to the proportional hazard Cox regression model and the proportional odds regression model. For both models, we generate censoring times from $U(0, C)$. By properly choosing the value of $C$, we consider three censoring proportions, $C_r = 10\%, 20\%, 30\%$.

The values of $C_{e_i}$ are reasonably close to one fixed value. Following the definition and properties of $C_{e_i}$, we can diagnose the strong influence points, whose values deviate seriously from the average. From Figures 1-3, we can see from the values of $C_{e_i}$ that the third, 20th, 54th and 80th observations are strong influence points. From Figures 4-6, we can see from the value of $C_{e_i}$