This paper considers an efficient shrinkage estimation procedure for the partially linear varying coefficient (PLVC) model with random effect. A novel penalized estimating equation selects the significant variables and estimates the nonzero coefficients simultaneously, and thereby specifies the model structure. Under some mild conditions, asymptotic properties of the resulting selection and estimation procedure, such as sparsity and the oracle property, are established. Numerical simulation studies and a real data analysis are presented to examine the finite sample performance of the procedure.

In an effort to reduce the risk of model misspecification, more flexible nonlinear and non/semiparametric models have been proposed for independent and subject-dependent data. See, for example, [

As a natural extension of [

For the special case of the PLVC model with random effect, an important problem is to choose the significant covariates. Shrinkage estimation based on regularization has attracted considerable interest. See, for example, [

The rest of this article is organized as follows. Section 2 introduces the model, the estimation procedure and the statistical properties of the estimators. Section 3 discusses practical computational issues, and Section 4 illustrates the finite sample performance through numerical simulations and a real data analysis.

Let

where

Assume that

where

where

A primary goal for model (1) is to extract useful information about Z and X, so it is important to select and estimate the nonzero coefficients in

where

where

Among many choices of the penalty function

where

Efficient estimation of the parameters of interest in model (5) depends on estimators of the variance components, so consistent estimators of them are required. Suppose that the variance-covariance matrix for model (1) is

where

where by the estimator

Therefore, the estimator for

In this section, we investigate the asymptotic behavior of the estimators of the parametric, nonparametric and variance components. Throughout the article, the following assumptions are needed to facilitate the technical details, although they may not be the weakest possible conditions. Let

(C1) For some

(C2) The density function f(u), which generates the sequence of design points

(C3) The number of measurements m is bounded.

(C4) For an

(C5) Let

First, we show that the estimators given by (8) are asymptotically normal.

Theorem 1 Suppose that conditions (C1)-(C5) hold. Then

where

To obtain the consistency and oracle property of the estimators, the following additional conditions are required, which are similar to those used in [

(C6) Let

(C7)

Theorem 2 Under the conditions (C1)-(C7) and the number of knots

i)

ii)

where

Theorem 2 ensures the convergence rate of the weighted estimators for not only the parametric component, but also the nonparametric component. Furthermore, the following two theorems provide us with the oracle property of the consistent estimators.

Theorem 3 Under the conditions (C1)-(C7) and the number of knots

Let

as

i)

ii)

According to Remark 1 in [

Let

Theorem 4 Under the conditions (C1)-(C7) and the number of knots

where

Denote that

Step 1. Calculate the estimator

Step 2. Solve the penalized estimators

Step 3. Replace the estimator

Remark 1. This modified penalized estimation procedure inherits the computational efficiency and sparsity of Lasso-type solutions. For the computational details, see [
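The iteration in Steps 1 to 3 can be sketched in Python under strong simplifying assumptions that are not taken from the paper: a random intercept covariance with components `sigma_b2` and `sigma_e2`, moment estimators for these components, and an adaptive L1 penalty solved by coordinate descent as a stand-in for the paper's penalty. The design matrix `X` is generic here (it could hold linear covariates together with spline-basis columns), and all function names are hypothetical.

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator used by L1-penalized updates."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def lasso_cd(X, y, lam, weights, n_iter=200):
    """Coordinate descent for 0.5*||y - X b||^2 + lam * sum_k weights[k]*|b_k|."""
    p = X.shape[1]
    b = np.zeros(p)
    col_ss = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        for k in range(p):
            partial = y - X @ b + X[:, k] * b[k]  # residual without column k
            b[k] = soft_threshold(X[:, k] @ partial, lam * weights[k]) / col_ss[k]
    return b

def variance_components(resid):
    """Moment estimators (sigma_b2, sigma_e2) for an assumed random intercept
    model; resid has shape (n subjects, m measurements)."""
    n, m = resid.shape
    total = (resid ** 2).mean()  # estimates sigma_b2 + sigma_e2
    row_sum = resid.sum(axis=1)
    # Average of r_ij * r_ik over j != k within each subject.
    cross = (row_sum ** 2 - (resid ** 2).sum(axis=1)) / (m * (m - 1))
    sigma_b2 = max(cross.mean(), 0.0)
    sigma_e2 = max(total - sigma_b2, 1e-8)
    return sigma_b2, sigma_e2

def three_step_fit(X, y, n, m, lam, n_outer=3):
    """X is (n*m, p) and y is (n*m,), both stacked subject by subject."""
    p = X.shape[1]
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # working-independence start
    weights = 1.0 / (np.abs(beta) + 1e-8)        # adaptive weights (assumption)
    for _ in range(n_outer):
        # Step 1: variance components from the current residuals.
        resid = (y - X @ beta).reshape(n, m)
        sb2, se2 = variance_components(resid)
        V = sb2 * np.ones((m, m)) + se2 * np.eye(m)
        # Step 2: penalized fit on data whitened by L', where V^{-1} = L L'.
        L = np.linalg.cholesky(np.linalg.inv(V))
        Xw = np.einsum("jk,ijp->ikp", L, X.reshape(n, m, p)).reshape(n * m, p)
        yw = np.einsum("jk,ij->ik", L, y.reshape(n, m)).reshape(n * m)
        beta = lasso_cd(Xw, yw, lam, weights)
        # Step 3: the loop feeds the updated beta back into Step 1.
    return beta, (sb2, se2)

# Illustrative data only; the values below are not the paper's design.
rng = np.random.default_rng(1)
n, m = 50, 3
beta_true = np.array([3.0, 1.5, 0.0, 0.0, 2.0])
X = rng.normal(size=(n * m, beta_true.size))
b = np.repeat(rng.normal(scale=np.sqrt(2.0), size=n), m)  # random intercepts
y = X @ beta_true + b + rng.normal(scale=np.sqrt(0.5), size=n * m)
beta_hat, (sb2_hat, se2_hat) = three_step_fit(X, y, n, m, lam=1.0)
print(np.round(beta_hat, 3), round(sb2_hat, 3), round(se2_hat, 3))
```

Whitening by a Cholesky factor of the inverse covariance turns the weighted problem into an ordinary penalized least squares problem, so any Lasso-type solver can be reused without change.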

Although our theoretical results give technical conditions on

where

We now use two examples to illustrate the advantage of the proposed weighted shrinkage estimation over estimation that ignores within-subject correlation.
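The source of this advantage can be seen in a much simpler setting than the one studied in this article. The Python sketch below assumes an unpenalized linear model with a random intercept, so that each subject has a compound symmetry covariance, and compares the exact covariance of the unweighted least squares estimator with that of the estimator weighted by the inverse covariance. The function names and the numerical values are illustrative assumptions.

```python
import numpy as np

def compound_symmetry(m, sigma_b2, sigma_e2):
    """Covariance of m repeated measurements under a random intercept."""
    return sigma_b2 * np.ones((m, m)) + sigma_e2 * np.eye(m)

def estimator_covariances(X_list, V):
    """Exact covariances of the unweighted (naive) and V^{-1}-weighted
    least squares estimators when every subject has covariance V."""
    V_inv = np.linalg.inv(V)
    p = X_list[0].shape[1]
    A = np.zeros((p, p))  # sum of X_i' X_i
    B = np.zeros((p, p))  # sum of X_i' V X_i
    C = np.zeros((p, p))  # sum of X_i' V^{-1} X_i
    for X in X_list:
        A += X.T @ X
        B += X.T @ V @ X
        C += X.T @ V_inv @ X
    A_inv = np.linalg.inv(A)
    cov_naive = A_inv @ B @ A_inv      # sandwich form under working independence
    cov_weighted = np.linalg.inv(C)    # generalized least squares covariance
    return cov_naive, cov_weighted

rng = np.random.default_rng(0)
n, m, p = 50, 3, 4
X_list = [rng.normal(size=(m, p)) for _ in range(n)]
V = compound_symmetry(m, sigma_b2=2.0, sigma_e2=0.5)
cov_naive, cov_weighted = estimator_covariances(X_list, V)
print("naive variances:   ", np.diag(cov_naive))
print("weighted variances:", np.diag(cov_weighted))
```

The weighted variances are never larger than the naive ones, because the difference of the two covariance matrices is positive semidefinite whenever the weighting uses the true covariance.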

Example 1. Consider a partially linear varying coefficient mixed effect model

where

To assess the estimation accuracy of the proposed method, we define the generalized mean squared error (GMSE) and the square root of the average squared error (RASE) to be
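As a hedged illustration, the Python sketch below computes both criteria under forms that are common in this literature and are assumed here rather than taken from the displayed definitions: the GMSE as a quadratic form in the estimation error weighted by the empirical second moment of the linear covariates, and the RASE as the root of the average, over grid points, of the summed squared errors of the coefficient functions. The function and argument names are hypothetical.

```python
import numpy as np

def gmse(beta_hat, beta_true, X):
    """(beta_hat - beta)' [X'X / N] (beta_hat - beta), with X of shape (N, p)."""
    diff = np.asarray(beta_hat, dtype=float) - np.asarray(beta_true, dtype=float)
    second_moment = X.T @ X / X.shape[0]
    return float(diff @ second_moment @ diff)

def rase(alpha_hat, alpha_true):
    """Root average squared error over grid points.
    Both inputs have shape (number of grid points, number of functions)."""
    sq_err = (np.asarray(alpha_hat) - np.asarray(alpha_true)) ** 2
    return float(np.sqrt(sq_err.sum(axis=1).mean()))

# Small hand-checkable inputs (illustrative only).
X_demo = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
gmse_demo = gmse([1.1, 0.0], [1.0, 0.0], X_demo)
alpha_true_demo = np.zeros((5, 2))
alpha_hat_demo = np.tile([0.3, 0.4], (5, 1))
rase_demo = rase(alpha_hat_demo, alpha_true_demo)
print(gmse_demo, rase_demo)
```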

For a thorough comparison, two other estimation methods are considered in addition to the proposed method: the “naive” approach based on working independence, and the “ideal” approach based on the true within-subject covariance. The estimators obtained by the “naive” approach, the proposed method and the “ideal” approach are denoted by

The variable selection results, based on 100 replications, are reported in

Second, the variance, bias and mean squared error of the estimators of the nonzero parameters, denoted by “V”, “Bias” and “MSE”, are listed in

Finally, in

| n = 50 | C (m = 3) | I (m = 3) | GMSE (m = 3) | C (m = 4) | I (m = 4) | GMSE (m = 4) |
|---|---|---|---|---|---|---|
| (2, 0.5) | 7 | 0 | 0.0428 | 7 | 0 | 0.0313 |
| | 7 | 0 | 0.0138 | 7 | 0 | 0.0085 |
| | 7 | 0 | 0.0138 | 7 | 0 | 0.0083 |
| (1, 0.5) | 7 | 0 | 0.0266 | 7 | 0 | 0.0188 |
| | 7 | 0 | 0.1863 | 7 | 0 | 0.0085 |
| | 7 | 0 | 0.1289 | 7 | 0 | 0.008 |

| n = 50 | C (m = 3) | I (m = 3) | RASE (m = 3) | C (m = 4) | I (m = 4) | RASE (m = 4) |
|---|---|---|---|---|---|---|
| (2, 0.5) | 7.92 | 0 | 0.4471 | 8 | 0 | 0.3601 |
| | 8 | 0 | 0.2466 | 8 | 0 | 0.1916 |
| | 8 | 0 | 0.2419 | 8 | 0 | 0.1888 |
| (1, 0.5) | 7.92 | 0 | 0.3461 | 8 | 0 | 0.2792 |
| | 7.95 | 0 | 0.2477 | 8 | 0 | 0.1916 |
| | 7.98 | 0 | 0.2402 | 8 | 0 | 0.1867 |

| n = 50 | M (m = 3) | V (m = 3) | MSE (m = 3) | M (m = 4) | V (m = 4) | MSE (m = 4) |
|---|---|---|---|---|---|---|
| (2, 0.5) | 1.6 | 0.27 | 0.418 | 1.64 | 0.42 | 0.54 |
| (1, 0.5) | 0.7 | 0.12 | 0.206 | 0.64 | 0.13 | 0.25 |

| n = 50 | M (m = 3) | V (m = 3) | MSE (m = 3) | M (m = 4) | V (m = 4) | MSE (m = 4) |
|---|---|---|---|---|---|---|
| (2, 0.5) | 0.5 | 0.02 | 0.025 | 0.53 | 0.02 | 0.022 |
| (1, 0.5) | 0.4 | 0.08 | 0.008 | 0.50 | 0.01 | 0.01 |

| | Var | Bias | MSE | Var | Bias | MSE | Var | Bias | MSE |
|---|---|---|---|---|---|---|---|---|---|
| (2, 0.5) | 0.498 | 7.063 | 0.997 | 0.494 | 6.456 | 0.911 | 0.484 | 6.140 | 0.861 |
| | 0.159 | 5.191 | 0.428 | 0.168 | 6.519 | 0.593 | 0.149 | 0.502 | 0.152 |
| | 0.150 | 5.090 | 0.409 | 0.157 | 6.457 | 0.576 | 0.138 | 0.231 | 0.139 |
| (1, 0.5) | 0.286 | −4.931 | 0.529 | 0.304 | −11.76 | 1.688 | 0.291 | −5.826 | 0.630 |
| | 0.153 | −4.128 | 0.323 | 0.152 | −4.408 | 0.346 | 0.161 | −5.328 | 0.445 |
| | 0.148 | −4.739 | 0.373 | 0.138 | −2.798 | 0.216 | 0.145 | −5.386 | 0.435 |

Example 2. To illustrate the effectiveness of the proposed estimation procedure, we apply it to a longitudinal AIDS data set reported in [

For the jth measurement of the ith subject, let

where the baseline of CD4 percentage

By the analysis, there are two variables

This article considered an efficient shrinkage estimation procedure for partially linear varying coefficient models with random effect. A variance component model was employed to take within-subject correlation into account. Asymptotic properties, such as the convergence rate, consistency and the oracle property, were established, and the effectiveness of the procedure was further illustrated by a real data analysis. As a more ambitious goal, we plan to investigate variable selection for the mixed effect model under a more general within-subject covariance matrix.

This work was partially supported by the National Statistical Science Research Project of China [Grant No. 2014LZ14 and 2015LZ27] and the Yancheng Teachers’ Professor and Doctors’ Research Project [Grant No. 14YSYJB0108].

Li, W.B. (2016) Efficient Shrinkage Estimation about the Partially Linear Varying Coefficient Model with Random Effect for Longitudinal Data. Open Journal of Statistics, 6, 862-872. http://dx.doi.org/10.4236/ojs.2016.65071

Supplementary material related to this article is available on request by email.