Shrinkage Estimation of Semiparametric Model with Missing Responses for Cluster Data


This paper simultaneously investigates variable selection and imputation estimation for the semiparametric partially linear varying-coefficient model when responses are missing in cluster data. A commonly used approach to missing data is complete-case analysis. Here the complete-case idea is combined with shrinkage estimation carried out separately on each cluster. To avoid biased results and to improve estimation efficiency, the article introduces the Group Least Absolute Shrinkage and Selection Operator (Group Lasso) into the semiparametric model; that is, the method combines local polynomial smoothing with the Least Absolute Shrinkage and Selection Operator. It can therefore perform nonparametric estimation and variable selection in a computationally efficient manner. The parametric estimators are obtained under the same criterion. The nonparametric and parametric estimators are derived for each cluster, and the per-cluster estimators are then combined by weighted averaging to give the final estimators. The large-sample properties of the estimators are also derived.
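For orientation, the setting described in the abstract can be written out in the form usually used for partially linear varying-coefficient models. This is only a sketch: the notation (the indices k and j, the covariates X, Z and U, the indicator δ, and the weights w_k) is assumed here for illustration and is not taken from the paper, and the weights in particular are left unspecified.

```latex
% Partially linear varying-coefficient model, observation j in cluster k
% (notation assumed for illustration).
\[
  Y_{kj} = X_{kj}^{\top}\beta + Z_{kj}^{\top}\alpha(U_{kj}) + \varepsilon_{kj},
  \qquad k = 1,\dots,K,\quad j = 1,\dots,n_k .
\]

% Missing-response indicator; the covariates are taken to be fully observed.
\[
  \delta_{kj} =
  \begin{cases}
    1, & Y_{kj}\ \text{is observed},\\
    0, & Y_{kj}\ \text{is missing}.
  \end{cases}
\]

% Per-cluster estimators combined by generic weights summing to one
% (the specific choice of weights is an assumption left open here).
\[
  \hat{\beta} = \sum_{k=1}^{K} w_k\,\hat{\beta}^{(k)},
  \qquad
  \hat{\alpha}(u) = \sum_{k=1}^{K} w_k\,\hat{\alpha}^{(k)}(u),
  \qquad
  \sum_{k=1}^{K} w_k = 1 .
\]
```

In a complete-case fit, only the observations with δ equal to 1 would enter the estimation for each cluster; the Group Lasso penalty would then act on the coefficient estimates within that fit.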

Share and Cite:

Zhang, M., Qiao, J., Yang, H. and Liu, Z. (2015) Shrinkage Estimation of Semiparametric Model with Missing Responses for Cluster Data. Open Journal of Statistics, 5, 768-776. doi: 10.4236/ojs.2015.57076.

Conflicts of Interest

The authors declare no conflicts of interest.



Copyright © 2021 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.