Flexible Model Selection Criterion for Multiple Regression

Abstract

The predictors of a multiple linear regression equation selected by GCV (Generalized Cross Validation) may include undesirable predictors that have no linear functional relationship with the target variable and are chosen only by accident. This is because GCV estimates prediction error but does not control the probability of selecting predictors that are irrelevant to the target variable. To take this possibility into account, a new statistic, “GCVf” (“f” stands for “flexible”), is suggested. The strictness with which GCVf accepts predictors is adjustable, and GCVf is a natural generalization of GCV. For example, GCVf can be designed so that the probability of erroneously identifying a linear relationship is 5 percent when none of the predictors has a linear relationship with the target variable. Predictors selected for the multiple linear regression equation by this method are highly likely to have linear relationships with the target variable.
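
The accidental selection described above can be illustrated with a small simulation. The following is a minimal sketch, not the paper's method: it uses the standard GCV for ordinary least squares, RSS / (n (1 - p/n)^2) with p the number of fitted coefficients including the intercept, and it does not implement GCVf, whose definition is not given in this abstract. The function names, sample size, number of predictors, and trial count are all assumptions chosen for illustration.

```python
import itertools
import numpy as np


def gcv(X, y, subset):
    """Standard GCV for an OLS fit with intercept on the given predictor columns.

    GCV = RSS / (n * (1 - p/n)^2), where p counts the intercept plus the
    selected predictors. This is ordinary GCV, not the paper's GCVf.
    """
    n = len(y)
    # Design matrix: intercept column plus the chosen predictor columns.
    A = np.column_stack([np.ones(n)] + [X[:, j] for j in subset])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = float(np.sum((y - A @ coef) ** 2))
    p = A.shape[1]
    return rss / (n * (1.0 - p / n) ** 2)


def select_by_gcv(X, y):
    """Exhaustive search: return the predictor subset (a tuple) minimizing GCV."""
    k = X.shape[1]
    best_subset, best_score = (), gcv(X, y, ())  # start from the intercept-only model
    for r in range(1, k + 1):
        for subset in itertools.combinations(range(k), r):
            score = gcv(X, y, subset)
            if score < best_score:
                best_subset, best_score = subset, score
    return best_subset


def false_selection_rate(n=50, k=4, trials=200, seed=0):
    """Fraction of trials in which GCV keeps at least one predictor even though
    the target is pure noise, unrelated to every predictor (illustrative setup)."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(trials):
        X = rng.normal(size=(n, k))
        y = rng.normal(size=n)  # no linear relationship with any column of X
        if len(select_by_gcv(X, y)) > 0:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print("share of trials with a spurious predictor:", false_selection_rate())
```

In this setup every selected predictor is spurious by construction, so the returned fraction is an empirical estimate of the kind of error rate that, per the abstract, GCV does not control and GCVf is meant to make adjustable.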

Share and Cite:

K. Takezawa, "Flexible Model Selection Criterion for Multiple Regression," Open Journal of Statistics, Vol. 2 No. 4, 2012, pp. 401-407. doi: 10.4236/ojs.2012.24048.

Conflicts of Interest

The author declares no conflicts of interest.

Copyright © 2024 by authors and Scientific Research Publishing Inc.

Creative Commons License

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.