In this article, the restricted almost unbiased ridge logistic estimator (RAURLE) is proposed to estimate the parameter in a logistic regression model with exact linear restrictions when multicollinearity exists among the explanatory variables. The performance of the proposed estimator relative to the maximum likelihood estimator (MLE), ridge logistic estimator (RLE), almost unbiased ridge logistic estimator (AURLE), and restricted maximum likelihood estimator (RMLE) is investigated for different ridge parameters through a simulation study, in terms of scalar mean square error.

Keywords: Multicollinearity; Ridge Estimator; Almost Unbiased Ridge Logistic Estimator; Linear Restrictions; Scalar Mean Square Error

1. Introduction

Multicollinearity inflates the variance of the maximum likelihood estimator (MLE) in logistic regression. As a result, one may not obtain an efficient estimate of the parameter in the logistic regression model. To combat multicollinearity in logistic regression, several alternative techniques have been proposed in the literature. One of the best-known techniques is to use a suitable biased estimator in place of the MLE. The biased estimators proposed in the literature include the Ridge Logistic Estimator (RLE) (Schaefer et al., 1984), the Liu Logistic Estimator (LLE) (Liu, 1993; Urgan and Tez, 2008; Mansson et al., 2012), the Principal Component Logistic Estimator (PCLE) (Aguilera et al., 2006), the Modified Logistic Ridge Estimator (MLRE) (Nja et al., 2013), the Liu-type estimator (Inan and Erdogan, 2013), and the Almost Unbiased Liu Logistic Estimator (AULLE) (Xinfeng, 2015). Moreover, Asar (2015) proposed new methods of estimating the shrinkage parameter in Liu-type estimators to address multicollinearity in logistic regression. All of the above estimation procedures use only the sample information. An alternative technique for the multicollinearity problem is to estimate the parameters under linear restrictions on the unknown parameters; such restrictions are generally based on prior information and may be exact or stochastic. By combining linear restrictions with the sample information, different types of biased estimators have been introduced in the literature, and some researchers have combined these estimators with the logistic regression estimator to improve its performance.
In the presence of exact linear restrictions in addition to the sample logistic regression model, Duffy and Santner (1989) introduced the restricted maximum likelihood estimator (RMLE) by carrying the restricted least squares estimator based on exact linear restrictions over to logistic regression. Later, the Restricted Logistic Ridge Estimator (Asar et al., 2016), the Restricted Logistic Liu Estimator (RLLE) (Şiray et al., 2015), the Modified Restricted Liu Estimator (Wu, 2016), and the Restricted Two-Parameter Liu-Type Estimator (Asar et al., 2016) were introduced for logistic regression with exact linear restrictions. In the presence of stochastic linear restrictions in addition to the sample logistic regression model, Nagarajah and Wijekoon (2015) introduced the Stochastic Restricted Maximum Likelihood Estimator (SRMLE). Following Nagarajah and Wijekoon (2015), the Stochastic Restricted Ridge Maximum Likelihood Estimator (SRRMLE) was proposed by Varathan and Wijekoon (2016) by combining the RLE with the SRMLE.

Wu and Asar (2016) proposed a new biased estimator called the Almost Unbiased Ridge Logistic Estimator (AURLE) and showed that it can outperform other available estimators. In this article, we further improve the logistic regression estimator by combining the AURLE with the RMLE, and name the result the Restricted Almost Unbiased Ridge Logistic Estimator (RAURLE). Further, we examine the performance of the RAURLE when the ridge parameter is estimated by different methods given in the literature, and compare each of these cases with the MLE, RLE, AURLE and RMLE. The remaining sections of the article are organized as follows. The model specification and estimation are discussed in Section 2. The proposed estimator and its asymptotic properties are given in Section 3. Section 4 describes existing methods for estimating the ridge parameter. In Section 5, the proposed estimator is compared with the MLE, RLE, AURLE and RMLE under the different ridge parameters, in terms of scalar mean square error (SMSE), by means of a Monte Carlo simulation study. Finally, conclusions of the study are presented in Section 6.

2. Model Specification and Estimation

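Consider the following logistic regression model

$$y_i = \pi_i + \varepsilon_i, \quad i = 1, \ldots, n, \tag{1}$$

in which $y_i$ follows a Bernoulli distribution with parameter $\pi_i$ given by

$$\pi_i = \frac{\exp(x_i'\beta)}{1 + \exp(x_i'\beta)},$$

where $x_i$ is the ith row of $X$, which is an $n \times (p+1)$ data matrix with p predictor variables, $\beta$ is a $(p+1) \times 1$ vector of coefficients, and the $\varepsilon_i$ are independent with mean zero and variance $\pi_i(1-\pi_i)$. The maximum likelihood estimator (MLE) of $\beta$ can be obtained as follows:

$$\hat{\beta}_{MLE} = C^{-1} X' \hat{W} Z,$$

where $C = X'\hat{W}X$; $Z$ is the column vector whose ith element equals

$$\operatorname{logit}(\hat{\pi}_i) + \frac{y_i - \hat{\pi}_i}{\hat{\pi}_i(1-\hat{\pi}_i)},$$

and $\hat{W} = \operatorname{diag}[\hat{\pi}_i(1-\hat{\pi}_i)]$. The MLE is an asymptotically unbiased estimate of $\beta$. The asymptotic covariance matrix of $\hat{\beta}_{MLE}$ is

$$\operatorname{Cov}(\hat{\beta}_{MLE}) = (X'\hat{W}X)^{-1}.$$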
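The MLE in this section is usually computed by iteratively reweighted least squares: the weights and the working response are refreshed from the current fit, and the weighted normal equations are solved again. The following is a minimal sketch of that iteration, not the authors' code; the function name `logistic_mle`, the zero starting value and the stopping rule are assumptions made for illustration.

```python
import numpy as np

def logistic_mle(X, y, n_iter=50, tol=1e-10):
    """Logistic MLE by iteratively reweighted least squares (illustrative sketch).

    Returns the estimate and C = X' W X evaluated at the estimate.
    """
    beta = np.zeros(X.shape[1])                 # assumed starting value
    for _ in range(n_iter):
        eta = X @ beta                          # linear predictor, logit(pi_hat)
        pi = 1.0 / (1.0 + np.exp(-eta))
        w = pi * (1.0 - pi)                     # diagonal of W_hat
        z = eta + (y - pi) / w                  # working response Z
        C = X.T @ (w[:, None] * X)              # C = X' W_hat X
        beta_new = np.linalg.solve(C, X.T @ (w * z))
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    pi = 1.0 / (1.0 + np.exp(-(X @ beta)))
    w = pi * (1.0 - pi)
    return beta, X.T @ (w[:, None] * X)
```

The returned matrix is the quantity whose inverse approximates the covariance of the MLE; when the columns of X are nearly collinear it becomes ill-conditioned, which is the situation the biased estimators below are meant to address.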

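In the presence of multicollinearity, Schaefer et al. (1984) proposed to use the Ridge Logistic Estimator (RLE) in place of the MLE in the logistic regression model (1):

$$\hat{\beta}_{RLE} = (X'\hat{W}X + kI)^{-1} X'\hat{W}X \, \hat{\beta}_{MLE} = Z_k \hat{\beta}_{MLE},$$

where $Z_k = (I + kC^{-1})^{-1}$, $C = X'\hat{W}X$, and k is the ridge parameter, $k \ge 0$.

The asymptotic properties of the RLE are

$$E(\hat{\beta}_{RLE}) = Z_k \beta, \qquad \operatorname{Var}(\hat{\beta}_{RLE}) = Z_k C^{-1} Z_k'.$$

However, the RLE is a biased estimator, which produces inconsistent estimates of the parameter (Wu and Asar, 2016). Consequently, the Almost Unbiased Ridge Logistic Estimator (AURLE) was introduced by Wu and Asar (2016); it is defined as

$$\hat{\beta}_{AURLE} = W_k \hat{\beta}_{MLE},$$

where $W_k = I - k^2 (C + kI)^{-2}$.

The asymptotic properties of the AURLE are

$$E(\hat{\beta}_{AURLE}) = W_k \beta, \qquad \operatorname{Var}(\hat{\beta}_{AURLE}) = W_k C^{-1} W_k'.$$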
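To make the two shrinkage rules concrete, the sketch below applies them to a given MLE. It assumes the usual forms, namely $(C + kI)^{-1} C \hat{\beta}_{MLE}$ for the ridge logistic estimator and $[I - k^2 (C + kI)^{-2}] \hat{\beta}_{MLE}$ for the AURLE, with $C = X'\hat{W}X$; the function names are hypothetical.

```python
import numpy as np

def ridge_logistic(beta_mle, C, k):
    """Ridge logistic estimator: (C + kI)^{-1} C beta_mle."""
    d = C.shape[0]
    return np.linalg.solve(C + k * np.eye(d), C @ beta_mle)

def almost_unbiased_ridge_logistic(beta_mle, C, k):
    """AURLE: [I - k^2 (C + kI)^{-2}] beta_mle."""
    d = C.shape[0]
    R_inv = np.linalg.inv(C + k * np.eye(d))    # (C + kI)^{-1}
    W_k = np.eye(d) - k**2 * (R_inv @ R_inv)
    return W_k @ beta_mle

# Small ill-conditioned example (values chosen only for illustration).
C = np.array([[2.0, 1.9], [1.9, 2.0]])
beta_mle = np.array([1.2, 0.7])
print(ridge_logistic(beta_mle, C, 0.5))
print(almost_unbiased_ridge_logistic(beta_mle, C, 0.5))
```

Both rules return the MLE at k = 0. For k > 0 the AURLE stays closer to the MLE than the ridge estimator does, because its shrinkage factor along each eigen-direction of C is $1 - k^2/(\lambda + k)^2$ rather than $\lambda/(\lambda + k)$.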

As another remedy for multicollinearity, one may use exact linear restrictions in addition to the sample logistic regression model (1). The resulting estimator is called a restricted estimator.

Suppose that the following exact restriction is given in addition to the general logistic regression model (1):

$$H\beta = h, \tag{11}$$

where H is a $q \times (p+1)$ known matrix and h is a $q \times 1$ vector of known constants.

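In the presence of the restriction (11) in addition to the logistic regression model (1), Duffy and Santner (1989) proposed the following Restricted Maximum Likelihood Estimator (RMLE):

$$\hat{\beta}_{RMLE} = \hat{\beta}_{MLE} + C^{-1}H'(HC^{-1}H')^{-1}(h - H\hat{\beta}_{MLE}).$$

The asymptotic mean and variance of $\hat{\beta}_{RMLE}$ are

$$E(\hat{\beta}_{RMLE}) = \beta + C^{-1}H'(HC^{-1}H')^{-1}(h - H\beta)$$

and

$$\operatorname{Var}(\hat{\beta}_{RMLE}) = C^{-1} - C^{-1}H'(HC^{-1}H')^{-1}HC^{-1}.$$

Consequently, the bias of $\hat{\beta}_{RMLE}$ is

$$\operatorname{Bias}(\hat{\beta}_{RMLE}) = C^{-1}H'(HC^{-1}H')^{-1}(h - H\beta).$$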
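The restricted estimator can be read as the MLE plus a correction that moves it onto the restriction set. The sketch below assumes the form $\hat{\beta}_{MLE} + C^{-1}H'(HC^{-1}H')^{-1}(h - H\hat{\beta}_{MLE})$ with $C = X'\hat{W}X$; the function name and the example restriction are hypothetical.

```python
import numpy as np

def restricted_mle(beta_mle, C, H, h):
    """RMLE: beta_mle + C^{-1} H' (H C^{-1} H')^{-1} (h - H beta_mle)."""
    C_inv_Ht = np.linalg.solve(C, H.T)          # C^{-1} H'
    S = H @ C_inv_Ht                            # H C^{-1} H'
    return beta_mle + C_inv_Ht @ np.linalg.solve(S, h - H @ beta_mle)

# Illustrative restriction beta_1 - beta_2 = 0 on a two-parameter model.
C = np.array([[2.0, 1.9], [1.9, 2.0]])
beta_mle = np.array([1.2, 0.7])
H = np.array([[1.0, -1.0]])
h = np.array([0.0])
print(restricted_mle(beta_mle, C, H, h))
```

Whatever the MLE is, the result satisfies the restriction exactly, which can be checked by multiplying it by H.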

3. The Proposed Estimator

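To further improve the performance of the estimators, in this section we combine the AURLE and the RMLE to propose a new estimator, called the Restricted Almost Unbiased Ridge Logistic Estimator (RAURLE) and defined as

$$\hat{\beta}_{RAURLE} = W_k \hat{\beta}_{RMLE},$$

where $W_k = I - k^2 (C + kI)^{-2}$. Note that this estimator depends on the ridge parameter k, and its performance depends on the choice of k.

Writing $\delta = C^{-1}H'(HC^{-1}H')^{-1}(h - H\beta)$ and $A = C^{-1} - C^{-1}H'(HC^{-1}H')^{-1}HC^{-1}$, the asymptotic properties of $\hat{\beta}_{RAURLE}$ are

$$E(\hat{\beta}_{RAURLE}) = W_k(\beta + \delta)$$

and

$$\operatorname{Var}(\hat{\beta}_{RAURLE}) = W_k A W_k'.$$

Consequently, the mean square error can be obtained as

$$\operatorname{MSE}(\hat{\beta}_{RAURLE}) = W_k A W_k' + \left[(W_k - I)\beta + W_k\delta\right]\left[(W_k - I)\beta + W_k\delta\right]'.$$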
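As a rough illustration of how the proposed estimator and its scalar MSE could be evaluated numerically, the sketch below assumes that the RAURLE is obtained by applying $W_k = I - k^2 (C + kI)^{-2}$ to the RMLE, and that the scalar MSE is the trace of the asymptotic variance plus the squared length of the bias. The function names are hypothetical, and the true $\beta$ passed to the second function is of course only available in a simulation.

```python
import numpy as np

def _w_k(C, k):
    """W_k = I - k^2 (C + kI)^{-2}."""
    d = C.shape[0]
    R_inv = np.linalg.inv(C + k * np.eye(d))
    return np.eye(d) - k**2 * (R_inv @ R_inv)

def raurle(beta_mle, C, H, h, k):
    """Assumed form of the RAURLE: W_k applied to the RMLE."""
    C_inv_Ht = np.linalg.solve(C, H.T)
    S = H @ C_inv_Ht
    beta_rmle = beta_mle + C_inv_Ht @ np.linalg.solve(S, h - H @ beta_mle)
    return _w_k(C, k) @ beta_rmle

def raurle_asymptotic_smse(beta, C, H, h, k):
    """Scalar MSE: trace(W_k A W_k') + squared norm of the bias."""
    d = C.shape[0]
    C_inv = np.linalg.inv(C)
    C_inv_Ht = C_inv @ H.T
    S_inv = np.linalg.inv(H @ C_inv_Ht)
    delta = C_inv_Ht @ S_inv @ (h - H @ beta)    # bias of the RMLE
    A = C_inv - C_inv_Ht @ S_inv @ C_inv_Ht.T    # variance of the RMLE
    W = _w_k(C, k)
    bias = (W - np.eye(d)) @ beta + W @ delta
    return np.trace(W @ A @ W.T) + bias @ bias
```

At k = 0 the first function reduces to the RMLE, and when the restriction holds exactly the second reduces to the trace of the RMLE variance, which gives two quick sanity checks on an implementation.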

4. Some Ridge Estimators

Since the RAURLE depends on the ridge parameter k, we now consider existing methods for estimating k. Many researchers have suggested methods of estimating the ridge parameter in ridge regression, and these methods have more recently been applied to logistic regression. In this study, we consider the following existing ridge parameter estimation methods when comparing the performance of the proposed estimator with that of existing estimators in logistic regression.

1) Hoerl and Kennard (1970)  ;

where is the maximum element of, is the eigen vector of.

2) Hoerl et al. (1975)  ;

where p is the number of predictor variables in the model (1).

3) Lawless and Wang (1976)  ;

4) Lindley and Smith (1972)  ;

5) Schaefer et al. (1984)  ;
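For orientation, the sketch below computes three of these ridge parameter rules in their textbook linear-regression forms: $\sigma^2/\hat{\alpha}^2_{max}$ (Hoerl and Kennard), $p\sigma^2/\hat{\alpha}'\hat{\alpha}$ (Hoerl et al.) and $p\sigma^2/\sum_j \lambda_j \hat{\alpha}_j^2$ (Lawless and Wang), where $\hat{\alpha}$ holds the coefficients in the eigenvector basis of the information matrix and $\lambda_j$ are its eigenvalues. These are assumptions for illustration: the logistic adaptations used in this study may differ, for instance in how $\sigma^2$ and p are handled, so `sigma2` is left as an argument and p is taken as the number of coefficients.

```python
import numpy as np

def ridge_parameter_candidates(beta_hat, C, sigma2=1.0):
    """Textbook linear-regression forms of three ridge parameter rules.

    beta_hat : coefficient estimate (for example the MLE)
    C        : symmetric positive definite matrix such as X' W X
    sigma2   : variance term; the default of 1.0 is an assumption
    """
    lam, gamma = np.linalg.eigh(C)          # eigenvalues, eigenvectors in columns
    alpha = gamma.T @ beta_hat              # coefficients in the eigenvector basis
    p = beta_hat.size                       # assumed: p counts all coefficients
    return {
        "hoerl_kennard_1970": sigma2 / np.max(alpha**2),
        "hoerl_kennard_baldwin_1975": p * sigma2 / np.sum(alpha**2),
        "lawless_wang_1976": p * sigma2 / np.sum(lam * alpha**2),
    }
```

Because all three rules depend on the data only through the estimate and the eigen-structure of C, they can be evaluated once per simulated data set and passed as k to any of the ridge-type estimators.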

5. Simulation Study

It is difficult to compare the mean square errors of the estimators theoretically, since none of the estimators MLE, RLE, AURLE, RMLE and RAURLE is always superior. We therefore use Monte Carlo simulation to examine the performance of the proposed estimator relative to the existing estimators under different levels of multicollinearity. Following McDonald and Galarneau (1975) and Kibria (2003), the explanatory variables are generated using the following equation:

$$x_{ij} = (1 - \rho^2)^{1/2} z_{ij} + \rho z_{i,p+1}, \quad i = 1, \ldots, n, \; j = 1, \ldots, p, \tag{26}$$

where the $z_{ij}$ are independent pseudo standard normal random numbers and $\rho$ is specified so that the correlation between any two explanatory variables is $\rho^2$. The n observations of the response variable are obtained from the Bernoulli($\pi_i$) distribution in (1). Four explanatory variables are generated using (26), and four values of $\rho$, namely 0.80, 0.90, 0.95 and 0.99, are considered. For the sample size n, three values, 25, 60 and 100, are considered. The parameter values of $\beta$ are chosen so that $\beta'\beta = 1$ and $\beta_1 = \beta_2 = \beta_3 = \beta_4$, which are common restrictions in many simulation studies. For the ridge parameter k, five choices are used, as defined in Equations (21)-(25). The simulation is repeated 2000 times by generating new pseudo-random numbers, and the simulated SMSE values of the estimators are obtained using the following equation:

$$\widehat{\operatorname{SMSE}}(\hat{\beta}^{*}) = \frac{1}{2000} \sum_{r=1}^{2000} (\hat{\beta}_r - \beta)'(\hat{\beta}_r - \beta),$$

where $\hat{\beta}_r$ is any estimator considered in the rth simulation. The simulation results are given in Tables 1-3. Tables 1-3 show that the scalar mean square error of the proposed estimator RAURLE is never larger, to four decimal places, than that of the MLE, RLE, AURLE and RMLE, and is smaller in almost all cases, for all the selected values of n, $\rho$, and k considered in this study. Further, the new estimator RAURLE has better performance when is used.
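The simulation design can be sketched in a few lines. The code below assumes the McDonald and Galarneau scheme $x_{ij} = (1-\rho^2)^{1/2} z_{ij} + \rho z_{i,p+1}$ for the regressors, a model without intercept for the response, and the average squared distance to the true $\beta$ as the simulated SMSE; the function names and the seed are illustrative, and the estimators themselves are left to the caller.

```python
import numpy as np

def generate_design(n, p, rho, rng):
    """Collinear regressors: x_ij = sqrt(1 - rho^2) z_ij + rho z_{i,p+1}."""
    z = rng.standard_normal((n, p + 1))
    return np.sqrt(1.0 - rho**2) * z[:, :p] + rho * z[:, [p]]

def generate_response(X, beta, rng):
    """Bernoulli(pi_i) responses with logit(pi_i) = x_i' beta (no intercept assumed)."""
    pi = 1.0 / (1.0 + np.exp(-(X @ beta)))
    return rng.binomial(1, pi)

def simulated_smse(estimates, beta):
    """Average of (beta_r - beta)'(beta_r - beta) over the replications."""
    diffs = np.asarray(estimates, dtype=float) - beta
    return float(np.mean(np.sum(diffs**2, axis=1)))

rng = np.random.default_rng(2016)           # seed chosen arbitrarily
beta = np.full(4, 0.5)                      # beta'beta = 1 with equal components
X = generate_design(n=100, p=4, rho=0.95, rng=rng)
y = generate_response(X, beta, rng)
```

In a full study, each replication would draw a new data set, compute every estimator on it, and append the results to per-estimator lists that are finally passed to `simulated_smse`.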

Table 1. Simulated SMSE values of the estimators; the five value columns correspond to the five choices of the ridge parameter k.

ρ      Estimator   SMSE
0.80   MLE          2.7913    2.7913    2.7913    2.7913    2.7913
       RLE          2.1156    1.7907    2.5850    2.3325    2.5182
       AURLE        2.6754    2.5265    2.7811    2.7393    2.7733
       RMLE         0.7946    0.7946    0.7946    0.7946    0.7946
       RAURLE       0.7727    0.7420    0.7919    0.7850    0.7911
0.90   MLE          5.3804    5.3804    5.3804    5.3804    5.3804
       RLE          3.2110    1.1707    3.2801    3.7876    2.8573
       AURLE        4.7335    2.5165    4.7767    5.0440    4.4847
       RMLE         1.3230    1.3230    1.3230    1.3230    1.3230
       RAURLE       1.2127    0.7413    1.2202    1.2680    1.1662
0.95   MLE         10.5921   10.5921   10.5921   10.5921   10.5921
       RLE          3.3890    1.0535    3.1522    5.6636    2.4171
       AURLE        6.6049    2.6589    6.2946    8.8763    5.2217
       RMLE         2.0985    2.0985    2.0985    2.0985    2.0985
       RAURLE       1.4868    0.7045    1.4316    1.8598    1.2326
0.99   MLE         52.3691   52.3691   52.3691   52.3691   52.3691
       RLE          3.2128    0.5469   11.0650   12.1737    8.0410
       AURLE        9.1283    1.5910   24.7708   26.5211   19.5062
       RMLE         4.1985    4.1985    4.1985    4.1985    4.1985
       RAURLE       1.0587    0.2943    2.3991    2.5345    1.9761

Table 2. Simulated SMSE values of the estimators; the five value columns correspond to the five choices of the ridge parameter k.

ρ      Estimator   SMSE
0.80   MLE          1.0027    1.0027    1.0027    1.0027    1.0027
       RLE          0.9559    0.7576    0.8381    0.9900    0.7688
       AURLE        1.0014    0.9640    0.9860    1.0026    0.9677
       RMLE         0.3818    0.3818    0.3818    0.3818    0.3818
       RAURLE       0.3814    0.3698    0.3767    0.3818    0.3710
0.90   MLE          1.9144    1.9144    1.9144    1.9144    1.9144
       RLE          1.5371    1.1484    1.3535    1.8586    1.1580
       AURLE        1.8685    1.7054    1.8081    1.9134    1.7111
       RMLE         0.6081    0.6081    0.6081    0.6081    0.6081
       RAURLE       0.5961    0.5518    0.5799    0.6079    0.5534
0.95   MLE          3.7477    3.7477    3.7477    3.7477    3.7477
       RLE          3.1236    1.9762    2.1164    3.4727    1.6703
       AURLE        3.6853    3.1647    3.2627    3.7361    2.9106
       RMLE         0.9656    0.9656    0.9656    0.9656    0.9656
       RAURLE       0.9522    0.8360    0.8585    0.9631    0.7775
0.99   MLE         18.4345   18.4345   18.4345   18.4345   18.4345
       RLE          8.3450    2.8305    5.5908   12.9410    3.7151
       AURLE       14.4989    7.0897   11.4944   17.4029    8.6919
       RMLE         2.0647    2.0647    2.0647    2.0647    2.0647
       RAURLE       1.6809    0.8901    1.3698    1.9668    1.0678

Table 3. Simulated SMSE values of the estimators; the five value columns correspond to the five choices of the ridge parameter k.

ρ      Estimator   SMSE
0.80   MLE          0.5813    0.5813    0.5813    0.5813    0.5813
       RLE          0.5721    0.5497    0.5668    0.5784    0.5230
       AURLE        0.5812    0.5803    0.5811    0.5813    0.5779
       RMLE         0.2734    0.2734    0.2734    0.2734    0.2734
       RAURLE       0.2732    0.2730    0.2731    0.2733    0.2720
0.90   MLE          1.1084    1.1084    1.1084    1.1084    1.1084
       RLE          0.9929    0.6402    0.9460    1.0985    0.9585
       AURLE        1.1015    0.9746    1.0945    1.1084    1.0966
       RMLE         0.4193    0.4193    0.4193    0.4193    0.4193
       RAURLE       0.4170    0.3744    0.4147    0.4193    0.4154
0.95   MLE          2.1685    2.1685    2.1685    2.1685    2.1685
       RLE          1.8938    1.1041    1.6649    2.1269    1.8481
       AURLE        2.1486    1.8086    2.0982    2.1681    2.1412
       RMLE         0.6627    0.6627    0.6627    0.6627    0.6627
       RAURLE       0.6574    0.5631    0.6437    0.6626    0.6553
0.99   MLE         10.6602   10.6602   10.6602   10.6602   10.6602
       RLE          5.4949    0.9743    4.8691    9.6580    5.3288
       AURLE        8.9707    2.7172    8.4656   10.6080    8.8449
       RMLE         1.4734    1.4734    1.4734    1.4734    1.4734
       RAURLE       1.2608    0.4198    1.1957    1.4670    1.2446

6. Concluding Remarks

In this paper, we proposed a restricted almost unbiased ridge logistic estimator (RAURLE) for logistic regression with exact linear restrictions when the explanatory variables are highly correlated. Through a Monte Carlo simulation study, we examined the performance of the proposed estimator relative to the existing estimators MLE, RLE, AURLE and RMLE in terms of scalar mean square error. Five existing ridge parameter estimates were used to compare the estimators. The results show that, in terms of SMSE, the proposed estimator performs at least as well as, and in nearly all cases better than, all the other estimators considered in this study under the selected values of n, ρ, and k.

Acknowledgements

We thank the Editor and the referee for their comments and suggestions, and the Postgraduate Institute of Science, University of Peradeniya, Sri Lanka for providing necessary facilities to complete this research.

Cite this paper

Varathan, N. and Wijekoon, P. (2016) On the Restricted Almost Unbiased Ridge Estimator in Logistic Regression. Open Journal of Statistics, 6, 1076-1084. http://dx.doi.org/10.4236/ojs.2016.66087

References

Schaefer, R.L., Roi, L.D. and Wolfe, R.A. (1984) A Ridge Logistic Estimator. Communications in Statistics - Theory and Methods, 13, 99-113. https://doi.org/10.1080/03610928408828664
Liu, K. (1993) A New Class of Biased Estimate in Linear Regression. Communications in Statistics - Theory and Methods, 22, 393-402. https://doi.org/10.1080/03610929308831027
Urgan, N.N. and Tez, M. (2008) Liu Estimator in Logistic Regression When the Data Are Collinear. International Conference "Continuous Optimization and Knowledge-Based Technologies", 323-327.
Mansson, G., Kibria, B.M.G. and Shukur, G. (2012) On Liu Estimators for the Logit Regression Model. The Royal Institute of Technology, Centre of Excellence for Science and Innovation Studies (CESIS), Sweden, Paper No. 259.
Aguilera, A.M., Escabias, M. and Valderrama, M.J. (2006) Using Principal Components for Estimating Logistic Regression with High-Dimensional Multicollinear Data. Computational Statistics & Data Analysis, 50, 1905-1924. https://doi.org/10.1016/j.csda.2005.03.011
Nja, M.E., Ogoke, U.P. and Nduka, E.C. (2013) The Logistic Regression Model with a Modified Weight Function. Journal of Statistical and Econometric Methods, 2, 161-171.
Inan, D. and Erdogan, B.E. (2013) Liu-Type Logistic Estimator. Communications in Statistics - Simulation and Computation, 42, 1578-1586. https://doi.org/10.1080/03610918.2012.667480
Kibria, B.M.G. (2003) Performance of Some New Ridge Regression Estimators. Communications in Statistics - Simulation and Computation, 32, 419-435. https://doi.org/10.1081/sac-120017499
McDonald, G.C. and Galarneau, D.I. (1975) A Monte Carlo Evaluation of Some Ridge-Type Estimators. Journal of the American Statistical Association, 70, 407-416. https://doi.org/10.1080/01621459.1975.10479882
Lindley, D.V. and Smith, A.F.M. (1972) Bayes Estimate for the Linear Model (with Discussion) Part 1. Journal of the Royal Statistical Society, Ser B, 34, 1-41.
Lawless, J.F. and Wang, P. (1976) A Simulation Study of Ridge and Other Regression Estimators. Communications in Statistics - Theory and Methods, 14, 307-323. https://doi.org/10.1080/03610927608827353
Hoerl, A.E., Kennard, R.W. and Baldwin, K.F. (1975) Ridge Regression: Some Simulations. Communications in Statistics, 4, 105-123. https://doi.org/10.1080/03610927508827232
Hoerl, A.E. and Kennard, R.W. (1970) Ridge Regression: Biased Estimation for Nonorthogonal Problems. Technometrics, 12, 55-67. https://doi.org/10.1080/00401706.1970.10488634
Wu, J. and Asar, Y. (2016) On Almost Unbiased Ridge Logistic Estimator for the Logistic Regression Model. Hacettepe Journal of Mathematics and Statistics, 45, 989-998.
Varathan, N. and Wijekoon, P. (2016) Ridge Estimator in Logistic Regression under Stochastic Linear Restriction. British Journal of Mathematics & Computer Science, 15, 1. https://doi.org/10.9734/BJMCS/2016/24585
Nagarajah, V. and Wijekoon, P. (2015) Stochastic Restricted Maximum Likelihood Estimator in Logistic Regression Model. Open Journal of Statistics, 5, 837-851. https://doi.org/10.4236/ojs.2015.57082
Asar, Y., Erisoglu, M. and Arashi, M. (2016) Developing a Restricted Two Parameter Liu-Type Estimator: A Comparison of Restricted Estimators in the Binary Logistic Regression Model. Communications in Statistics - Theory and Methods, Online. https://doi.org/10.1080/03610926.2015.1137597
Wu, J. (2016) Modified Restricted Liu Estimator in Logistic Regression Model. Computational Statistics, 31, 1557. https://doi.org/10.1007/s00180-015-0609-3
Siray, G.U., Toker, S. and Kaciranlar, S. (2015) On the Restricted Liu Estimator in Logistic Regression Model. Communications in Statistics - Simulation and Computation, 44, 217-232. https://doi.org/10.1080/03610918.2013.771742
Asar, Y., Arashi, M. and Wu, J. (2016) Restricted Ridge Estimator in the Logistic Regression Model. Communications in Statistics - Simulation and Computation, Online. https://doi.org/10.1080/03610918.2016.1206932
Duffy, D.E. and Santner, T.J. (1989) On the Small Sample Properties of Norm-Restricted Maximum Likelihood Estimators for Logistic Regression Models. Communications in Statistics - Theory and Methods, 18, 959-980. https://doi.org/10.1080/03610928908829944
Asar, Y. (2015) Some New Methods to Solve Multicollinearity in Logistic Regression. Communications in Statistics - Simulation and Computation, Online. https://doi.org/10.1080/03610918.2015.1053925
Xinfeng, C. (2015) On the Almost Unbiased Ridge and Liu Estimator in the Logistic Regression Model. International Conference on Social Science, Education Management and Sports Education, Atlantis Press, Amsterdam, 1663-1665.