On the Restricted Almost Unbiased Ridge Estimator in Logistic Regression

In this article, the restricted almost unbiased ridge logistic estimator (RAURLE) is proposed to estimate the parameter in a logistic regression model with exact linear restrictions when there exists multicollinearity among explanatory variables. The performance of the proposed estimator over the maximum likelihood estimator (MLE), ridge logistic estimator (RLE), almost unbiased ridge logistic estimator (AURLE), and restricted maximum likelihood estimator (RMLE) with respect to different ridge parameters is investigated through a simulation study in terms of scalar mean square error.


Introduction
Multicollinearity inflates the variance of the maximum likelihood estimator (MLE) in logistic regression. As a result, one may not obtain an efficient estimate of the parameter β in the logistic regression model. To combat multicollinearity in logistic regression, several alternative techniques have been proposed in the literature. One of the best-known techniques is to use suitable biased estimators in place of the maximum likelihood estimator. The biased estimators proposed in the literature include the Ridge Logistic Estimator (RLE) (Schaefer et al., 1984 [1]), Liu Logistic Estimator (LLE) (Liu, 1993 [2], Urgan and Tez, 2008 [3], and Mansson et al., 2012 [4]), Principal Component Logistic Estimator (PCLE) (Aguilera et al., 2006 [5]), Modified Logistic Ridge Estimator (MLRE) (Nja et al., 2013 [6]), Liu-type estimator (Inan and Erdogan, 2013 [7]), and Almost Unbiased Liu Logistic Estimator (AULLE) (Xinfeng, 2015 [8]). Moreover, Asar (2015) [9] proposed new methods of estimating the shrinkage parameter in Liu-type estimators to address multicollinearity in logistic regression. Only the sample information was used in all the above estimation procedures.
An alternative technique suggested to solve the multicollinearity problem is to estimate the parameters subject to linear restrictions on the unknown parameters. Such restrictions are generally based on prior information, and they may be in exact or stochastic form. By incorporating linear restrictions into the sample information, different types of biased estimators were introduced in the literature, and some researchers have combined these estimators with the logistic regression estimator to improve its performance. In the presence of exact linear restrictions in addition to the sample logistic regression model, Duffy and Santner (1989) [10] introduced the restricted maximum likelihood estimator (RMLE) by applying the restricted least squares estimator based on exact linear restrictions to logistic regression. Later, the Restricted Logistic Ridge Estimator (Asar et al., 2016 [11]), Restricted Logistic Liu Estimator (RLLE) (Şiray et al., 2015 [12]), Modified Restricted Liu Estimator (Wu, 2016 [13]), and Restricted two parameter Liu type estimator (Asar et al., 2016 [14]) were introduced for logistic regression with exact linear restrictions.
In the presence of stochastic linear restrictions in addition to the sample logistic regression model, Nagarajah and Wijekoon (2015) [15] introduced the Stochastic Restricted Maximum Likelihood Estimator (SRMLE). Following Nagarajah and Wijekoon (2015) [15], the Stochastic Restricted Ridge Maximum Likelihood Estimator (SRRMLE) was proposed by Varathan and Wijekoon (2016) [16] by combining the Ridge Logistic Estimator (RLE) with the SRMLE. Wu and Asar (2016) [17] proposed a new biased estimator called the Almost Unbiased Ridge Logistic Estimator (AURLE), and showed its performance over the other available estimators. In this article, we further improve the logistic regression estimator by combining AURLE with RMLE, and name it the Restricted Almost Unbiased Ridge Logistic Estimator (RAURLE). Further, we examine the performance of RAURLE under ridge parameters estimated by different methods given in the literature, and compare each of these cases with MLE, RLE, AURLE and RMLE.
The remaining sections of the article are organized as follows. The model specification and estimation are discussed in Section 2. The proposed estimator and its asymptotic properties are given in Section 3. Section 4 describes the existing methods for estimating the ridge parameter. In Section 5, the performance of the proposed estimator under different ridge parameters is compared with MLE, RLE, AURLE and RMLE in terms of the scalar mean squared error (SMSE) through a Monte Carlo simulation study. Finally, conclusions of the study are presented in Section 6.

Model Specification and Estimation
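Consider the following logistic regression model

$$y_i = \pi_i + \varepsilon_i, \quad i = 1, \ldots, n, \tag{1}$$

which follows the Bernoulli distribution with parameter $\pi_i$ given by

$$\pi_i = \frac{\exp(x_i'\beta)}{1 + \exp(x_i'\beta)},$$

where $x_i$ is the $i$th row of $X$, which is an $n \times (p+1)$ data matrix with $p$ predictor variables, and $\beta$ is a $(p+1) \times 1$ vector of coefficients. The errors $\varepsilon_i$ are independent with mean zero and variance $\pi_i(1 - \pi_i)$.
The maximum likelihood estimator (MLE) of $\beta$ can be obtained as follows:

$$\hat{\beta}_{MLE} = C^{-1} X' \hat{W} Z,$$

where $C = X' \hat{W} X$; $Z$ is the column vector with $i$th element equal to $\log\frac{\hat{\pi}_i}{1 - \hat{\pi}_i} + \frac{y_i - \hat{\pi}_i}{\hat{\pi}_i (1 - \hat{\pi}_i)}$, and $\hat{W} = \mathrm{diag}[\hat{\pi}_i (1 - \hat{\pi}_i)]$. Asymptotically, $\hat{\beta}_{MLE}$ is unbiased for $\beta$ with covariance matrix $C^{-1}$.
In the presence of multicollinearity, Schaefer et al. (1984) [1] proposed using the Ridge Logistic Estimator (RLE) in place of the MLE in the logistic regression model:

$$\hat{\beta}_{RLE} = (X' \hat{W} X + kI)^{-1} X' \hat{W} X \hat{\beta}_{MLE} = Z_k \hat{\beta}_{MLE},$$

where $Z_k = (C + kI)^{-1} C$ and $k$ is the ridge parameter, $k \geq 0$. However, the RLE is a biased estimator which produces inconsistent estimates for the parameter (Wu and Asar, 2016 [17]). Consequently, the Almost Unbiased Ridge Logistic Estimator (AURLE) was introduced by Wu and Asar (2016) [17], and it is defined as

$$\hat{\beta}_{AURLE} = W_k \hat{\beta}_{MLE},$$

where $W_k = I - k^2 (C + kI)^{-2}$. The asymptotic properties of AURLE are

$$E(\hat{\beta}_{AURLE}) = W_k \beta, \qquad \mathrm{Cov}(\hat{\beta}_{AURLE}) = W_k C^{-1} W_k'.$$

As another remedy for multicollinearity, one may use exact linear restrictions in addition to the sample logistic regression model (1). The resulting estimator is called a restricted estimator.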
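The estimators in this section can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: it assumes the MLE is computed by iteratively reweighted least squares, that the RLE has the form $(C + kI)^{-1} C \hat{\beta}_{MLE}$, and that the AURLE has the form $[I - k^2 (C + kI)^{-2}] \hat{\beta}_{MLE}$. The function names, the toy data and the value k = 0.5 are hypothetical choices made only for illustration.

```python
import numpy as np


def logistic_mle(X, y, n_iter=100, tol=1e-10):
    """Logistic MLE via iteratively reweighted least squares (IRLS).

    Returns (beta_hat, C) with C = X' W X evaluated at beta_hat.
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = X @ beta
        pi = np.clip(1.0 / (1.0 + np.exp(-eta)), 1e-10, 1 - 1e-10)
        w = pi * (1.0 - pi)                  # diagonal of W-hat
        z = eta + (y - pi) / w               # working response Z
        C = X.T @ (w[:, None] * X)           # C = X' W X
        beta_new = np.linalg.solve(C, X.T @ (w * z))
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    # Re-evaluate C at the final estimate.
    pi = np.clip(1.0 / (1.0 + np.exp(-(X @ beta))), 1e-10, 1 - 1e-10)
    w = pi * (1.0 - pi)
    return beta, X.T @ (w[:, None] * X)


def ridge_logistic(beta_mle, C, k):
    """RLE (assumed form): (C + kI)^{-1} C beta_MLE."""
    p = C.shape[0]
    return np.linalg.solve(C + k * np.eye(p), C @ beta_mle)


def aurle(beta_mle, C, k):
    """AURLE (assumed form): [I - k^2 (C + kI)^{-2}] beta_MLE."""
    p = C.shape[0]
    A_inv = np.linalg.inv(C + k * np.eye(p))
    W_k = np.eye(p) - k**2 * (A_inv @ A_inv)
    return W_k @ beta_mle


# Toy data with correlated columns (illustrative values only).
rng = np.random.default_rng(0)
n, p = 200, 3
shared = rng.standard_normal((n, 1))
X = 0.5 * rng.standard_normal((n, p)) + 0.85 * shared
beta_true = np.full(p, 1 / np.sqrt(p))
y = rng.binomial(1, 1 / (1 + np.exp(-(X @ beta_true))))

beta_mle, C = logistic_mle(X, y)
beta_rle = ridge_logistic(beta_mle, C, k=0.5)
beta_aurle = aurle(beta_mle, C, k=0.5)
```

Under these assumed forms, both estimators reduce to the MLE at k = 0, and for k > 0 the AURLE moves away from the MLE by less than the RLE does, which is the sense in which it is "almost unbiased".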
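Suppose that the following exact restriction is given in addition to the general logistic regression model (1):

$$H\beta = h, \tag{11}$$

where $H$ is a $q \times (p+1)$ known matrix and $h$ is a $q \times 1$ vector of known constants.
In the presence of the restriction (11) in addition to the logistic regression model (1), Duffy and Santner (1989) [10] proposed the following Restricted Maximum Likelihood Estimator (RMLE):

$$\hat{\beta}_{RMLE} = \hat{\beta}_{MLE} + C^{-1} H' (H C^{-1} H')^{-1} (h - H \hat{\beta}_{MLE}),$$

where $C$ and $\hat{\beta}_{MLE}$ are as defined above.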
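As a small numerical check of how the restriction (11) enters the estimator, the sketch below assumes the usual restricted form $\hat{\beta}_{MLE} + C^{-1}H'(HC^{-1}H')^{-1}(h - H\hat{\beta}_{MLE})$. The matrices C, H, h and the vector standing in for the MLE are made-up illustrative values, not quantities from this study.

```python
import numpy as np


def rmle(beta_mle, C, H, h):
    """Restricted estimator (assumed form):
    beta_MLE + C^{-1} H' (H C^{-1} H')^{-1} (h - H beta_MLE).
    """
    C_inv_Ht = np.linalg.solve(C, H.T)       # C^{-1} H'
    middle = H @ C_inv_Ht                    # H C^{-1} H'
    return beta_mle + C_inv_Ht @ np.linalg.solve(middle, h - H @ beta_mle)


# Hypothetical inputs: a positive definite C, an unrestricted estimate,
# and the single restriction beta_1 - beta_2 = 0.
C = np.array([[4.0, 1.0, 0.5],
              [1.0, 3.0, 0.2],
              [0.5, 0.2, 2.0]])
beta_mle = np.array([0.9, 0.4, -0.3])
H = np.array([[1.0, -1.0, 0.0]])
h = np.array([0.0])

beta_rmle = rmle(beta_mle, C, H, h)
```

By construction the result satisfies $H\hat{\beta}_{RMLE} = h$ exactly, and an input that already satisfies the restriction is returned unchanged.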

The Proposed Estimator
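To improve the performance of the estimators further, in this section we combine AURLE and RMLE to propose a new estimator, called the Restricted Almost Unbiased Ridge Logistic Estimator (RAURLE) and defined as

$$\hat{\beta}_{RAURLE} = W_k \hat{\beta}_{RMLE},$$

where $W_k = I - k^2 (C + kI)^{-2}$. Note that this estimator depends on the ridge parameter $k$, and its performance depends on the choice of $k$.
The asymptotic properties of $\hat{\beta}_{RAURLE}$ are

$$E(\hat{\beta}_{RAURLE}) = W_k E(\hat{\beta}_{RMLE}), \qquad \mathrm{Cov}(\hat{\beta}_{RAURLE}) = W_k \mathrm{Cov}(\hat{\beta}_{RMLE}) W_k'.$$

Consequently, the mean square error can be obtained as

$$\mathrm{MSE}(\hat{\beta}_{RAURLE}) = \mathrm{Cov}(\hat{\beta}_{RAURLE}) + \mathrm{Bias}(\hat{\beta}_{RAURLE}) \mathrm{Bias}(\hat{\beta}_{RAURLE})', \qquad \mathrm{Bias}(\hat{\beta}_{RAURLE}) = E(\hat{\beta}_{RAURLE}) - \beta,$$

and the scalar mean square error is the trace of this matrix.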
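The construction of the proposed estimator can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' code: RAURLE is taken to be $W_k \hat{\beta}_{RMLE}$ with $W_k = I - k^2 (C + kI)^{-2}$, the scalar mean square error is taken as trace(Cov) plus the squared norm of the bias, and all numerical inputs (C, the vector standing in for the RMLE, and the covariance matrix `cov_in`) are hypothetical.

```python
import numpy as np


def w_k(C, k):
    """W_k = I - k^2 (C + kI)^{-2} (assumed form)."""
    p = C.shape[0]
    A_inv = np.linalg.inv(C + k * np.eye(p))
    return np.eye(p) - k**2 * (A_inv @ A_inv)


def raurle(beta_rmle, C, k):
    """RAURLE (assumed form): W_k applied to the restricted estimator."""
    return w_k(C, k) @ beta_rmle


def smse(cov, bias):
    """Scalar MSE: trace of the covariance plus squared norm of the bias."""
    return np.trace(cov) + bias @ bias


# Hypothetical inputs for illustration only.
C = np.array([[4.0, 1.0, 0.5],
              [1.0, 3.0, 0.2],
              [0.5, 0.2, 2.0]])
beta_rmle = np.array([0.6, 0.6, -0.3])
cov_in = np.diag([0.3, 0.2, 0.4])            # stand-in for Cov(beta_RMLE)

k = 0.5
W = w_k(C, k)
beta_raurle = raurle(beta_rmle, C, k)
cov_out = W @ cov_in @ W.T                   # Cov(W_k b) = W_k Cov(b) W_k'
```

Because the eigenvalues of $W_k$ lie between 0 and 1 for positive definite C and k > 0, the trace of `cov_out` cannot exceed that of `cov_in`. The variance reduction is paid for by the bias term, which is why the choice of k matters.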

Some Ridge Estimators
Since RAURLE depends on the ridge parameter k, we now consider the existing methods for obtaining an estimated value of k. Many researchers have suggested methods of estimating the ridge parameter in ridge regression, and these methods have recently been extended to logistic regression. In this research, we consider the following existing ridge parameter estimation methods to compare the performance of the proposed estimator with some existing estimators in logistic regression.

Simulation Study
It is difficult to compare the mean square error of the estimators theoretically, since none of the estimators MLE, RLE, AURLE, RMLE and RAURLE is always superior. We therefore use a Monte Carlo simulation to examine the performance of the proposed estimator over the existing estimators under different levels of multicollinearity. Following McDonald and Galarneau (1975) [22] and Kibria (2003) [23], the explanatory variables are generated using the following equation:

$$x_{ij} = (1 - \rho^2)^{1/2} z_{ij} + \rho z_{i,p+1}, \quad i = 1, \ldots, n, \quad j = 1, \ldots, p, \tag{26}$$

where $z_{ij}$ are independent pseudo standard normal random numbers and $\rho^2$ represents the correlation between any two explanatory variables. The n observations for the response variable are obtained from the Bernoulli($\pi_i$) distribution in (1). Four explanatory variables are generated using (26), and four different values of ρ, namely 0.80, 0.90, 0.95 and 0.99, are considered. For the sample size n, three different values, 25, 60 and 100, are considered. The parameter values of $\beta_1, \beta_2, \ldots, \beta_p$ are chosen so that $\sum_{j=1}^{p} \beta_j^2 = 1$ and $\beta_1 = \beta_2 = \cdots = \beta_p$, which are common restrictions in many simulation studies. Further, for the ridge parameter k, five different choices are used, as defined in Equations (21)-(25). The simulation is repeated 2000 times by generating new pseudo-random numbers, and the simulated SMSE values of the estimators are obtained using the following equation:

$$\widehat{\mathrm{SMSE}}(\hat{\beta}) = \frac{1}{2000} \sum_{r=1}^{2000} (\hat{\beta}_r - \beta)' (\hat{\beta}_r - \beta),$$

where $\hat{\beta}_r$ is any estimator considered in the $r$th simulation. The simulation results are given in Tables 1-3. It can be seen from Tables 1-3 that the scalar mean square error of the proposed estimator RAURLE is smaller than that of MLE, RLE, AURLE and RMLE for all the selected values of n, ρ and k considered in this research. Further, the new estimator RAURLE performs better when $k_{SRW}$ is used.

Concluding Remarks
In this paper, we proposed a restricted almost unbiased ridge logistic estimator (RAURLE) for logistic regression with exact linear restrictions when the explanatory variables are highly correlated. Through a Monte Carlo simulation study, we examined the performance of the proposed estimator over the existing estimators MLE, RLE, AURLE and RMLE in terms of scalar mean square error. Also, five different choices of existing ridge parameter estimates were used to compare the estimators. The results show that the newly proposed estimator outperforms all the other estimators considered in this study, in terms of SMSE, under the selected values of n, ρ and k.

Table 1. The estimated SMSE values for different k, when n = 25.

Table 2. The estimated SMSE values for different k, when n = 60.

Table 3. The estimated SMSE values for different k, when n = 100.
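The data-generation step (26) and the simulated SMSE used in the Simulation Study can be sketched as follows. This is an illustrative outline under assumptions, not the code behind Tables 1-3: the generator is taken to be $x_{ij} = (1 - \rho^2)^{1/2} z_{ij} + \rho z_{i,p+1}$, the coefficients are assumed equal with unit norm, and the function names, seed and the large demonstration sample size are hypothetical.

```python
import numpy as np


def generate_X(n, p, rho, rng):
    """Explanatory variables from the assumed form of (26):
    x_ij = sqrt(1 - rho^2) * z_ij + rho * z_{i,p+1}.
    """
    z = rng.standard_normal((n, p + 1))
    return np.sqrt(1.0 - rho**2) * z[:, :p] + rho * z[:, [p]]


def simulated_smse(estimates, beta):
    """Average of (b_r - beta)'(b_r - beta) over the replications."""
    diffs = np.asarray(estimates) - beta
    return np.mean(np.sum(diffs**2, axis=1))


rng = np.random.default_rng(2016)
p, rho = 4, 0.90

# Assumed coefficient choice: equal entries with beta'beta = 1.
beta = np.full(p, 1.0 / np.sqrt(p))

# A large n is used here only to make the implied correlation visible;
# the study itself uses n = 25, 60 and 100.
X_demo = generate_X(5000, p, rho, rng)
corr = np.corrcoef(X_demo, rowvar=False)

# One replication of the response for a study-sized sample.
X = generate_X(100, p, rho, rng)
pi = 1.0 / (1.0 + np.exp(-(X @ beta)))
y = rng.binomial(1, pi)
```

With ρ = 0.90 the off-diagonal entries of `corr` should be close to ρ² = 0.81, matching the statement that ρ² is the correlation between any two explanatory variables. A full replication loop would fit each estimator to (X, y), collect the estimates, and pass them to `simulated_smse`.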