Subsampling Method for Robust Estimation of Regression Models

Abstract

We propose a subsampling method for robust estimation of regression models that is built on classical methods such as least squares. It exploits the non-robust nature of the underlying classical method to find a good sample within regression data contaminated with outliers, and then applies the classical method to that good sample to produce robust estimates of the regression model parameters. The subsampling method is a computational method rooted in the bootstrap methodology, which trades analytical treatment for intensive computation: it finds the good sample by repeatedly fitting the regression model to many random subsamples of the contaminated data rather than through an analytical treatment of the outliers. The method can be applied to any regression model for which a non-robust classical method is available. In the present paper, we focus on the basic formulation and the robustness property of the subsampling method, both of which are valid for all regression models. We also discuss variations of the method and apply it to three examples involving three different regression models.
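
The idea described above can be illustrated with a minimal Python sketch for a linear model fitted by least squares. The abstract does not specify how the good sample is assembled, so the rule used here is an assumption made for illustration only: fit the model to many random subsamples, keep the few subsamples with the smallest mean squared residual, and take their union as the good sample. The function name `subsample_fit`, its parameters and their defaults, and the simulated data are all hypothetical and are not taken from the paper.

```python
import numpy as np


def subsample_fit(X, y, n_subsamples=500, subsample_size=15, n_keep=5, seed=0):
    """Illustrative subsampling estimator (assumed selection rule, not the paper's)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    fits = []
    for _ in range(n_subsamples):
        # Draw a random subsample without replacement and fit by least squares.
        idx = rng.choice(n, size=subsample_size, replace=False)
        beta, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
        # Least squares is non-robust, so a subsample containing outliers
        # tends to show a much larger mean squared residual than a clean one.
        mse = np.mean((y[idx] - X[idx] @ beta) ** 2)
        fits.append((mse, idx))
    # Assumption: the best-fitting subsamples are outlier-free; pool them.
    fits.sort(key=lambda t: t[0])
    good = np.unique(np.concatenate([idx for _, idx in fits[:n_keep]]))
    # Apply the classical method once more, now to the good sample only.
    beta, *_ = np.linalg.lstsq(X[good], y[good], rcond=None)
    return beta, good


# Simulated data: y = 1 + 2x + noise, with the first 10 responses shifted upward.
data_rng = np.random.default_rng(1)
n = 60
x = data_rng.uniform(0, 10, n)
X = np.column_stack([np.ones(n), x])
y = 1.0 + 2.0 * x + data_rng.normal(0, 0.5, n)
outliers = np.arange(10)
y[outliers] += 50.0

beta_ls, *_ = np.linalg.lstsq(X, y, rcond=None)  # classical fit on all data
beta_sub, good = subsample_fit(X, y)             # fit on the selected good sample
print("least squares on all data:", beta_ls)
print("subsampling estimate:     ", beta_sub)
```

In this sketch the subsample size and the number of subsamples control the chance of drawing at least a few outlier-free subsamples, which is where the trade of analytical treatment for computation shows up. The specific values are arbitrary choices for the simulated data.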

Share and Cite:

M. Tsao and X. Ling, "Subsampling Method for Robust Estimation of Regression Models," Open Journal of Statistics, Vol. 2 No. 3, 2012, pp. 281-296. doi: 10.4236/ojs.2012.23034.

Conflicts of Interest

The authors declare no conflicts of interest.

Copyright © 2024 by authors and Scientific Research Publishing Inc.

Creative Commons License

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.