Efficient Shrinkage Estimation about the Partially Linear Varying Coefficient Model with Random Effect for Longitudinal Data

In this paper, an efficient shrinkage estimation procedure for the partially linear varying coefficient model (PLVC) with random effect is considered. By selecting the significant variable and estimating the nonzero coefficient, the model structure specification is accomplished by introducing a novel penalized estimating equation. Under some mild conditions, the asymptotic properties for the proposed model selection and estimation results, such as the sparsity and oracle property, are established. Some numerical simulation studies and a real data analysis are presented to examine the finite sample performance of the procedure.


Introduction
With the effort to reduce the risk of model misspecification, more flexible nonlinear and non/semiparametric models have been proposed for independent and subjectdependent data.See, for example, [1] introduced the varying coefficient model.Reference [2] and [3] studied the partially linear varying coefficient (PLVC) model and single index model, respectively, for longitudinal data.For longitudinal data, see [4] for an intensive review.
As a natural extension of [5], which used marginal model for longitudinal data analysis, a random effect method is developed when considering within-subject correlation and further shrinkage estimation.In the literatures in longitudinal data study, random effect method has received relatively enough attention.See, for example, [6]- [9] and so W. B. Li on.Some advantages of random effect method were mentioned in [3] including its computation efficiency.
For the special case of PLVC model with random effect, an important problem is to choose the significant covariant variable.Shrinkage estimation based on regularization has attracted lots of interest.See, for example, [10]- [13].Also, some extensions of the variable selection under the regularization framework to varying coefficient models include, [5] [14]- [16].In this article, the variable selection problem for PLVC model with random effect is investigated.Because of the obvious simplicity and wide usage, a penalized estimating equation based shrinkage estimation procedure is introduced, following the idea of [17].
The rest of this article is organized as follows.In Section 2, the model, estimation procedure and statistical properties of the estimators are introduced.In Section 3, the practical computational issues are discussed and some numeric simulations and a real data analysis for the finite sample performance are illustrated in Section 4.

Let ( )
, , , , X Z W U Y be the observed data associated with the ith subject in a longitudinal study.We consider the partially linear varying coefficient mixed effect model (PLVCMeM) ( ) where β is a 1 q × coefficient vector, i ν is a 1 l × vector of random effect with mean 0 and covariance matrix D, ( ) θ ⋅ is an unknown smoothing 1 p × function vector and ij ε is a random variable with mean 0 and variance 2 0 σ > .
Assume that ( ) ( ) ( ) ( ) is a set of B-spline basis functions of order θ can be expressed with a linear combination of normalized B-spline basis function is the spline coefficient vector.Therefore, with the given spline basis ( ) b u , model (1) can be approximated as ( ) , where ( ) Hence, the estimators for β and γ can be derived with the following weighted least square equation where is the variance-covariance matrix.
A primary goal for ( 1) is to explore useful information for Z and X, it is important to select and estimate the nonzero coefficients in β and nonzero functions in ( ) enhance model prediction and interpretability.To achieve the simultaneous estimation and variable selection, a shrinkage penalty function is developed based on estimating equation with a regularization based shrinkage estimation being defined with the following penalized estimating equation, which is similarly introduced in [17] as where ( ) . Alternatively, when denoting ( ) where q λ is a penalty function used to penalize the coefficients of nonsignificant va- riables, λ is a tuning parameter.
Among many choices of the penalty function q λ , the smoothly clipped and devia- tions (SCAD) penalty function is adopted as where , which is suggested by [10].Since the SCAD penalty is a spline function on an interval near zero and constant outside, it can shrink small value of an estimation to zero while having no impact on the large one.

Estimators of the Variance Components
An efficient estimation for the parameters of interest in model ( 5) depends on estimators for the variance component, therefore a consistent estimators for them is required.Suppose that the variance covariance matrix for model ( 1) is where 1 m represents a 1 m × vector of ones and m I is the m m × identity matrix.
Model ( 7) is called a variance component model and is very popular in longitudinal analysis.See, for example, [3] [6] [9] and so on.Furthermore, let ( ) Hence, it follows that a moment estimator for variance component ( ) where by the estimator β  and ( ) Therefore, the estimator for i Ω is defined as

Asymptotic Properties about the Estimators
In this section, we investigate the asymptotic behavior of the estimators for the para-metric, nonparametric and variance component as well.Throughout the article, the following assumptions are needed to facilitate the technical details, although they may not be the weakest conditions.Let 0 (C4) For an The eigenvalues of ( ) G u are bounded away from infinity and zero for sufficiently large n.
Firstly, we present that the estimators given by ( 8) are asymptotic normal.Theorem 1 Suppose that conditions (C1)-(C5) hold, then ˆ0, , and 0, , To obtain the consistency and oracle property about the estimators, additional conditions are required as follows, which are similar to that used in [10] and [15].
Theorem 2 ensures the convergence rate of the weighted estimators for not only the parametric component, but also the nonparametric component.Furthermore, the following two theorems provide us with the oracle property of the consistent estimators.
Theorem 3 Under the conditions (C1)-(C7) and the number of knots ( ) ( ) as n → ∞ .Then, with probability tending to 1, β and ( ) According to Remark 1 in [10], the variable selection procedure can specify the model correctly and efficiently.Next theorem further shows us that under some conditions, the nonzero coefficient estimators of the parametric component have the same asymptotic distribution as that based on true model.Let ( ) ( ) , , , , , tion of the estimator β * is established.Theorem 4 Under the conditions (C1)-(C7) and the number of knots ( ) ( ) where * Σ is defined by (A.10) in Appendix.

Practical Computational Issues
Denote that i Ω and ˆi Ω are the true and estimated covariance matrix within the ith subject.With a given initial value ( ) , we can refine the regularized estimators α according to the following iterative algorithms.
Step 1. Calculate the estimator i Ω by the formula (10) with the given value 0 α .
Step 2. Solve the penalized estimators α for the parametric and nonparametric component by the Equation ( 5) with the within-subject covariance matrix interpolated by the covariance matrix estimated in Step 1.
Step 3. Replace the estimator 0 α in Step 1 with the one α in Step 2 and iterate between Step 1 and Step 2 until the convergence criterion is met.Consequently, the required estimator can be solved with this iterative procedure.Remark 1.This modified penalized estimation procedure inherits the computational efficiency and sparsity of Lasso type solutions.And the computational details can be referred to [13].
Although our theoretical results give technical conditions on n k and λ , in prac- tice, these parameters is always chosen based on data to enhance the adaptive property.It follows that the cross validation method is used to choose the optimal n k and λ .
However, in the real application, the parameter n k and 1 2 , k l λ λ is chosen by mini- mizing the following cross-validation score ( ) , , , , , , where ( ) ˆi α − is the solution to the function (3) with the ith subject being deleted.Con- sidering the character of the turning parameter, in practice, the turning parameter λ is used as

Empirical Study
We now use two examples to illustrate the superiority of the proposed weighted shrinkage estimation to that one without considering within-subject correlation.
Example 1.Consider a partially linear varying coefficient mixed effect model ( ) where To illustrate the estimation accuracy of the proposed method, we define generalized mean squared errors (GMSE) and the square root of average square error (RASE) to be ( ) ( )( ) And for the purpose of a intensive comparison, in addition to the proposed method in this article, two other estimation methods are also required, that are the "naive" approach based on the working independence method, and the "ideal" one based on the true within-subject covariance.And the estimator, obtained by the "naive" approach, the proposed method in this article and the "ideal" approach, are denoted to be β , ˆW β and W β  , respectively.During the simulations, it is assumed that "C" and "I" means the average number of the zero coefficients being correctly set to zero and the average number of the nonzero coefficients being incorrectly set to zero, respectively.The results about variable selection, based on 100 replications, are included in Table 1.Table 1 shows that the proposed method can select the true model quite well and leads to smaller GMSE and RASE values.Table 2 reports a satisfactory estimation for the variance component.As the sample increase, the performance becomes better.
Secondly, the variance, bias and mean square error of the estimators for the nonzero parameters, denoted to be "V", "Bias" and "MSE", are listed in Table 3. From Table 3, all the three methods can obtain consistent estimators with small bias.Moreover, the values of "V" and "MSE" also argue that the newly proposed method and the "ideal" method can derive a more efficient estimator than the "naive" method does.What's more, the asymptotic normality of the estimators for the parametric component is shown with Quantile-Quantile plot in Figure 1.
Finally, in Figure 2, all of the curves, estimated by all the three methods, fit the true nonparametric curve well.However, their 95% confidence intervals have different interval length, with the proposed method in this article and the "ideal" method showing a similar and smaller length.All these scenarios indicate a great improvement of the estimation with the proposed method in this article.Example 2. To illustrate the effectiveness of the proposed estimation procedure, we shall apply it to the analysis of a longitudinal AIDS data set, which is reported by [18] and comprises HIV status of 283 homosexual males who were infected with HIV during a follow-up period between 1984 and 1991.The focus in this application is to probe into the trend of the mean CD4 percentage depletion over time and evaluate the effects of cigarette smoking, preHIV infection CD4 percentage and age at infection on mean CD4 percentage after infection.
For the jth measurement of the ith subject, let ij Y be the mean CD4 percentage, ij t be the time in years after HIV infection, ,1 i X be the centered preCD4 percentage and , θ θ ⋅ ⋅ and their 95% pointwise confidence interval with three coincident curves being the estimated ones for the nonparametric component, and the red dotted, the coincident solid and dotted-dashed ones being the estimated confidence intervals by the "naive", the "ideal" and the proposed method, respectively.( ) , where the baseline of CD4 percentage ( ) θ is used to represent the mean CD4 percentage of t years after the infection.The random effect i ν , representing the with- in-subject correlation, is also included in the assumed model.By the analysis, there are two variables ( ) θ and ,1 i X with nonzero estimators, which indicate the significant effect on the response variable ij Y .Other variables' coef- ficients are zero and have no significant effect.Moreover, the estimated curve in Figure 3 shows the trend of the mean CD4 percentage depletion over time.

Conclusion and Discussion
This article considered an efficient shrinkage estimation for the partially linear varying coefficient models with random effect.Variance component model was employed to take within subject correlation into consideration.Some asymptotic properties, such as convergence rate, consistency and oracle property, were established.Moreover, the effectiveness was further illustrated by a real data analysis.As a more ambitious goal, we would try to investigate the performance of variable selection issue for mixed effect model under a more general within-subject covariance matrix.
value of the corresponding parameters β and ( ) differentiable.(C2) The density function f(u), which genernates the sequence of design points ij U , is bounded away from 0 and infinity on [0, 1].And assume that f(u) is continuously differentiable on (0, 1).(C3)The number of measurements m is bounded.
and nonparametric component coefficients are zero.Moreover, it is further assumed that be (2,0.5)and (1,0.5).It is assumed that the number of observed subject and repeated measurement within each subject are n = 50 and m = 3, 4, respectively.

Figure 2 .
Figure 2. The estimated average curve for the nonparametric component

Table 1 .
Simulation results for the variable selection.

Table 2 .
Simulation results for the estimators of the variance components2 2, Wβ 