A New Stochastic Restricted Liu Estimator for the Logistic Regression Model

To overcome the well-known multicollinearity problem, we propose a new Stochastic Restricted Liu Estimator (SRLE) in the logistic regression model. In the mean square error matrix (MSEM) sense, the new estimator is compared with the Maximum Likelihood Estimator (MLE), the Logistic Liu Estimator (LLE), the Stochastic Restricted Maximum Likelihood Estimator (SRMLE) and the Stochastic Restricted Liu Maximum Likelihood Estimator (SRLMLE). Finally, a numerical example and a Monte Carlo simulation are given to illustrate some of the theoretical results.


Introduction
Consider the following multiple logistic regression model

$y_i = \pi_i + \varepsilon_i, \quad \pi_i = \frac{\exp(x_i'\beta)}{1 + \exp(x_i'\beta)}, \quad i = 1, \ldots, n,$  (1.1)

where $\beta$ is a $(p+1) \times 1$ vector of coefficients, $x_i$ is the $i$-th row of $X$, an $n \times (p+1)$ data matrix with $p$ explanatory variables, and the $\varepsilon_i$ are independent with mean zero and variance $\pi_i(1 - \pi_i)$. The maximum likelihood method is the most commonly used method of estimating the parameters, and the Maximum Likelihood Estimator (MLE) is defined as

$\hat{\beta}_{MLE} = \hat{C}^{-1} X' \hat{W} Z,$

where $\hat{C} = X'\hat{W}X$, $\hat{W} = \mathrm{diag}\big(\hat{\pi}_i(1 - \hat{\pi}_i)\big)$, and $Z$ is the column vector whose $i$-th element equals $\log\big(\hat{\pi}_i/(1 - \hat{\pi}_i)\big) + (y_i - \hat{\pi}_i)/\big(\hat{\pi}_i(1 - \hat{\pi}_i)\big)$. The MLE is an asymptotically unbiased estimator of $\beta$, with asymptotic covariance matrix $\mathrm{Cov}(\hat{\beta}_{MLE}) = \hat{C}^{-1} = (X'\hat{W}X)^{-1}$. Multicollinearity inflates the variance of the MLE in logistic regression, so under multicollinearity the MLE is no longer a satisfactory estimator of the parameters of the logistic regression model.
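The MLE above is computed by iteratively reweighted least squares (IRLS), repeatedly forming $\hat{C}^{-1}X'\hat{W}Z$ at the current estimate. A minimal NumPy sketch is given below; the function name and defaults are ours, not from the paper.

```python
import numpy as np

def logistic_mle(X, y, tol=1e-10, max_iter=100):
    """IRLS for the logistic MLE: beta = C^{-1} X' W Z with
    C = X' W X, W = diag(pi_i (1 - pi_i)) and Z the adjusted response."""
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        eta = X @ beta
        pi = 1.0 / (1.0 + np.exp(-eta))
        w = pi * (1.0 - pi)
        # adjusted response: Z_i = logit(pi_i) + (y_i - pi_i) / w_i
        Z = eta + (y - pi) / w
        C = X.T @ (w[:, None] * X)
        beta_new = np.linalg.solve(C, X.T @ (w * Z))
        if np.max(np.abs(beta_new - beta)) < tol:
            beta = beta_new
            break
        beta = beta_new
    return beta, np.linalg.inv(C)  # covariance estimate C^{-1}
```

At convergence the score equations $X'(y - \hat{\pi}) = 0$ hold, which is a convenient numerical check.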
To overcome the multicollinearity problem in logistic regression, many scholars have conducted extensive research. Schaefer et al. (1984) [1] proposed Ridge Logistic Regression (RLR), and Aguilera et al. (2006) [2] proposed the Principal Component Logistic Estimator (PCLE).

The Proposed Estimators
For the unrestricted model given in Equation (1.1), suppose that, in addition to the sample model (1.1), we are given some prior information about $\beta$ in the form of a set of $q$ independent linear stochastic restrictions

$h = H\beta + v,$  (2.4)

where $H$ is a $q \times (p+1)$ matrix of known elements with rank $q$, $h$ is a $q \times 1$ stochastic known vector, and $v$ is a $q \times 1$ random vector of disturbances with mean $0$ and dispersion matrix $\Psi$, where $\Psi$ is assumed to be a known $q \times q$ positive definite matrix. Further, it is assumed that $v$ is stochastically independent of $\varepsilon$. For the restricted model specified by Equations (1.1) and (2.4), the SRMLE proposed by Nagarajah and Wijekoon (2015) [8] and the SRLMLE proposed by Varathan and Wijekoon (2016) [9] are

$\hat{\beta}_{SRMLE} = \hat{\beta}_{MLE} + \hat{C}^{-1} H' (\Psi + H \hat{C}^{-1} H')^{-1} (h - H \hat{\beta}_{MLE}),$
$\hat{\beta}_{SRLMLE} = Z_d \hat{\beta}_{SRMLE}, \quad Z_d = (\hat{C} + I)^{-1} (\hat{C} + dI), \quad 0 < d < 1,$

respectively. The SRMLE is asymptotically unbiased, while the SRLMLE has bias vector $(Z_d - I)\beta$; their covariance matrices are $\mathrm{Cov}(\hat{\beta}_{SRMLE})$ and $Z_d\, \mathrm{Cov}(\hat{\beta}_{SRMLE})\, Z_d'$, respectively.
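The SRMLE correction can be sketched in a few lines. This is a hedged illustration of the mixed-estimation update form above; the function name and inputs are ours. When the restriction is exactly satisfied at the MLE, the correction vanishes.

```python
import numpy as np

def srmle(beta_mle, C, H, h, Psi):
    """SRMLE: pull the MLE toward the stochastic restrictions h = H beta + v.

    beta_mle : MLE of beta; C : X' W X; H, h, Psi : restriction matrices.
    """
    C_inv = np.linalg.inv(C)
    S = Psi + H @ C_inv @ H.T                    # q x q, positive definite
    return beta_mle + C_inv @ H.T @ np.linalg.solve(S, h - H @ beta_mle)
```

The update moves $H\hat{\beta}$ part of the way toward $h$, with the step size controlled by how precise the prior information is (small $\Psi$ means a larger step).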
By analogy with the Ordinary Mixed Estimator (OME) [12] in the linear model, we propose the Mixed Maximum Likelihood Estimator (MME) [11] in the logistic regression model, defined as

$\hat{\beta}_{MME} = (\hat{C} + H'\Psi^{-1}H)^{-1} (X'\hat{W}Z + H'\Psi^{-1}h),$

which is asymptotically unbiased with covariance matrix $\mathrm{Cov}(\hat{\beta}_{MME}) = (\hat{C} + H'\Psi^{-1}H)^{-1}$. In this paper, we propose a new estimator, named the Stochastic Restricted Liu Estimator (SRLE), defined as

$\hat{\beta}_{SRLE} = Z_d \hat{\beta}_{MME}, \quad Z_d = (\hat{C} + I)^{-1}(\hat{C} + dI), \quad 0 < d < 1,$

with bias vector $b_d = (Z_d - I)\beta$ and covariance matrix $\mathrm{Cov}(\hat{\beta}_{SRLE}) = Z_d (\hat{C} + H'\Psi^{-1}H)^{-1} Z_d'$.
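A sketch of the MME and the proposed SRLE under the definitions above; the helper names and argument layout are illustrative. Setting $d = 1$ gives $Z_d = I$, so the SRLE reduces to the MME, while smaller $d$ shrinks harder.

```python
import numpy as np

def mme(C, XWZ, H, h, Psi):
    """Mixed estimator: (C + H' Psi^{-1} H)^{-1} (X'WZ + H' Psi^{-1} h)."""
    HtPinv = H.T @ np.linalg.inv(Psi)
    return np.linalg.solve(C + HtPinv @ H, XWZ + HtPinv @ h)

def srle(C, XWZ, H, h, Psi, d):
    """Proposed SRLE: Liu shrinkage Z_d = (C + I)^{-1}(C + d I) applied to the MME."""
    p = C.shape[0]
    Zd = np.linalg.solve(C + np.eye(p), C + d * np.eye(p))
    return Zd @ mme(C, XWZ, H, h, Psi)
```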

Mean Square Error Matrix (MSEM) Comparisons of the Estimators
In this section, we compare the SRLE with the MLE, LLE, SRMLE and SRLMLE under the MSEM criterion.
First, the MSEM of an estimator $\tilde{\beta}$ of $\beta$ is

$\mathrm{MSEM}(\tilde{\beta}) = \mathrm{Cov}(\tilde{\beta}) + \mathrm{Bias}(\tilde{\beta})\,\mathrm{Bias}(\tilde{\beta})',$

where $\mathrm{Bias}(\tilde{\beta})$ is the bias vector and $\mathrm{Cov}(\tilde{\beta})$ is the dispersion matrix. For two given estimators $\tilde{\beta}_1$ and $\tilde{\beta}_2$, the estimator $\tilde{\beta}_2$ is considered better than $\tilde{\beta}_1$ in the MSEM criterion if and only if $\mathrm{MSEM}(\tilde{\beta}_1) - \mathrm{MSEM}(\tilde{\beta}_2) \ge 0$, i.e., the difference is non-negative definite. The scalar mean square error (MSE) is defined as $\mathrm{MSE}(\tilde{\beta}) = \mathrm{tr}\big[\mathrm{MSEM}(\tilde{\beta})\big]$. Since the MSEM criterion is always stronger than the scalar MSE criterion, we only consider MSEM comparisons among the estimators.
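The MSEM and the non-negative-definiteness test it induces are easy to code. A small sketch (helper names are ours), checking the eigenvalues of the MSEM difference:

```python
import numpy as np

def msem(cov, bias):
    """MSEM(beta) = Cov(beta) + Bias(beta) Bias(beta)'."""
    b = np.asarray(bias, dtype=float).reshape(-1, 1)
    return cov + b @ b.T

def superior_in_msem(msem_1, msem_2, tol=1e-12):
    """True if estimator 2 beats estimator 1 in the MSEM criterion,
    i.e. MSEM_1 - MSEM_2 is non-negative definite (all eigenvalues >= 0)."""
    return bool(np.all(np.linalg.eigvalsh(msem_1 - msem_2) >= -tol))
```

Note that the scalar MSE is simply the trace of the MSEM, which is why MSEM superiority implies scalar-MSE superiority but not conversely.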

MSEM Comparisons of the MLE and SRLE
In this section, we make the MSEM comparison between the MLE and SRLE.
First, the MSEM of the MLE and the SRLE are

$\mathrm{MSEM}(\hat{\beta}_{MLE}) = \hat{C}^{-1},$
$\mathrm{MSEM}(\hat{\beta}_{SRLE}) = Z_d (\hat{C} + H'\Psi^{-1}H)^{-1} Z_d' + b_d b_d',$

respectively, where $b_d = (Z_d - I)\beta$.
We now compare these two estimators under the MSEM criterion:

$\mathrm{MSEM}(\hat{\beta}_{MLE}) - \mathrm{MSEM}(\hat{\beta}_{SRLE}) = M_1 - N_1 - b_d b_d',$

where $M_1 = \hat{C}^{-1}$, $N_1 = Z_d (\hat{C} + H'\Psi^{-1}H)^{-1} Z_d'$, and $M_1 - N_1$ is a positive definite matrix. Based on the above discussion, the following theorem can be proved.
Theorem 3.1. For the restricted model specified by Equations (1.1) and (2.4), the SRLE is superior to the MLE in the MSEM sense if and only if $b_d' (M_1 - N_1)^{-1} b_d \le 1$.
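A quadratic-form condition of this type can be checked numerically. A minimal sketch, assuming $D = M_1 - N_1$ is positive definite as stated above (the function name is ours):

```python
import numpy as np

def srle_beats_mle(M1, N1, b_d):
    """Theorem 3.1-style check: with D = M1 - N1 positive definite,
    the biased SRLE dominates the unbiased MLE iff b_d' D^{-1} b_d <= 1."""
    D = M1 - N1
    return float(b_d @ np.linalg.solve(D, b_d)) <= 1.0
```

Intuitively, the SRLE wins when its squared bias (measured in the metric of the variance reduction $D$) does not exceed the variance it saves.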

MSEM Comparisons of the LLE and SRLE
First, the MSEM of the LLE, $\hat{\beta}_{LLE} = Z_d \hat{\beta}_{MLE}$, is

$\mathrm{MSEM}(\hat{\beta}_{LLE}) = Z_d \hat{C}^{-1} Z_d' + b_d b_d'.$

We now compare these two estimators under the MSEM criterion:

$\mathrm{MSEM}(\hat{\beta}_{LLE}) - \mathrm{MSEM}(\hat{\beta}_{SRLE}) = Z_d \big[\hat{C}^{-1} - (\hat{C} + H'\Psi^{-1}H)^{-1}\big] Z_d',$

where the bias terms cancel because the LLE and the SRLE share the same bias vector $b_d$, and $\hat{C}^{-1} - (\hat{C} + H'\Psi^{-1}H)^{-1} \ge 0$, so the difference is non-negative definite. Based on the above discussion, the following theorem can be proved.
Theorem 3.2. For the restricted model specified by Equations (1.1) and (2.4), the SRLE is always superior to the LLE in the MSEM sense.
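This always-superior claim can be sanity-checked numerically. A sketch, assuming the Liu matrix $Z_d = (\hat{C}+I)^{-1}(\hat{C}+dI)$ and the mixed-estimation covariance $(\hat{C} + H'\Psi^{-1}H)^{-1}$ as above (function name ours):

```python
import numpy as np

def lle_minus_srle_gap(C, H, Psi, d):
    """MSEM(LLE) - MSEM(SRLE) = Z_d (C^{-1} - (C + H' Psi^{-1} H)^{-1}) Z_d'
    (the two estimators share the bias b_d, so the bias terms cancel)."""
    p = C.shape[0]
    Zd = np.linalg.solve(C + np.eye(p), C + d * np.eye(p))
    A_inv = np.linalg.inv(C + H.T @ np.linalg.solve(Psi, H))
    return Zd @ (np.linalg.inv(C) - A_inv) @ Zd.T
```

With $q < p+1$ restrictions the gap has rank $q$, so it is non-negative definite rather than strictly positive definite, which is exactly what the MSEM criterion requires.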

MSEM Comparisons of the SRMLE and SRLE
First, the MSEM of the SRMLE is

$\mathrm{MSEM}(\hat{\beta}_{SRMLE}) = \mathrm{Cov}(\hat{\beta}_{SRMLE}) = M_3,$

since the SRMLE is asymptotically unbiased. We now compare these two estimators under the MSEM criterion:

$\mathrm{MSEM}(\hat{\beta}_{SRMLE}) - \mathrm{MSEM}(\hat{\beta}_{SRLE}) = M_3 - N_3 - b_d b_d',$

where $N_3 = Z_d (\hat{C} + H'\Psi^{-1}H)^{-1} Z_d'$, $b_d b_d'$ is a non-negative definite matrix, and $M_3$ and $N_3$ are positive definite.
Using Theorem 2.1, it is clear that $N_3 + b_d b_d'$ is a positive definite matrix. By Lemma 2.1, $M_3 - N_3 - b_d b_d' > 0$ if and only if $\lambda_{\max}\big((N_3 + b_d b_d') M_3^{-1}\big) < 1$, where $\lambda_{\max}(\cdot)$ denotes the largest eigenvalue. Based on the above discussion, the following theorem can be proved.

Theorem 3.3. For the restricted model specified by Equations (1.1) and (2.4), the SRLE is superior to the SRMLE in the MSEM sense if and only if $\lambda_{\max}\big((N_3 + b_d b_d') M_3^{-1}\big) < 1$.

MSEM Comparisons of the SRLMLE and SRLE
First, the MSEM of the SRLMLE is

$\mathrm{MSEM}(\hat{\beta}_{SRLMLE}) = Z_d\, \mathrm{Cov}(\hat{\beta}_{SRMLE})\, Z_d' + b_d b_d' = M_4 + b_d b_d'.$

We now compare these two estimators under the MSEM criterion:

$\mathrm{MSEM}(\hat{\beta}_{SRLMLE}) - \mathrm{MSEM}(\hat{\beta}_{SRLE}) = M_4 - N_4,$

where $N_4 = Z_d (\hat{C} + H'\Psi^{-1}H)^{-1} Z_d'$, the common bias terms cancel, and $M_4$ and $N_4$ are positive definite matrices. By Lemma 2.1, $M_4 - N_4$ is a positive definite matrix if and only if $\lambda_{\max}(N_4 M_4^{-1}) < 1$.
Based on the above discussions, the following theorem can be proved.
Theorem 3.4. For the restricted model specified by Equations (1.1) and (2.4), the SRLE is superior to the SRLMLE in the MSEM sense if and only if $\lambda_{\max}(N_4 M_4^{-1}) < 1$.

Numerical Example
In this section, we consider the iris data set from the UCI Machine Learning Repository to illustrate our theoretical results.
A binary logistic regression model is fitted in which the dependent variable is coded as follows: if the plant is Iris-setosa, it is indicated with 0, and if the plant is Iris-versicolor, it is 1. The explanatory variables are $x_1$: Sepal.Length; $x_2$: Petal.Length; and $x_3$: Petal.Width.
The sample consists of the first 80 observations. The correlation matrix can be seen in Table A1 (Appendix A), from which it can be seen that the correlations among the regressors are all greater than 0.80, some of them close to 0.98, and the condition number is 55.4984, showing that there is a severe multicollinearity problem in this data set.
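A condition-number diagnostic of this kind can be computed as follows. This sketch uses one common definition, the square root of the ratio of extreme eigenvalues of $X'X$; the exact convention used for the 55.4984 figure above is not stated in the text, so treat this as illustrative.

```python
import numpy as np

def condition_number(X):
    """kappa = sqrt(lambda_max / lambda_min) of X'X.
    Values above roughly 30 are usually read as severe multicollinearity."""
    lam = np.linalg.eigvalsh(X.T @ X)
    return float(np.sqrt(lam.max() / lam.min()))
```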
From Table A2 (Appendix A) we can conclude that: 1) with the increase of d, the MSE values of the LRE, SRRMLE, SRLRE, SRLMLE and SRLE decrease; 2) with the increase of d, the MSE values of the MLE, SRMLE and MME remain unchanged; 3) the new estimator is always superior to the other estimators.

Monte Carlo Simulation
To illustrate the above theoretical results, a Monte Carlo simulation is used. Following McDonald and Galarneau (1975) [15] and Kibria (2003) [16], the explanatory variables are generated using the following equation:
$x_{ij} = (1 - \rho^2)^{1/2} z_{ij} + \rho z_{i,p+1}, \quad i = 1, \ldots, n, \quad j = 1, \ldots, p,$

where the $z_{ij}$ are pseudo-random numbers from the standardized normal distribution and $\rho^2$ represents the correlation between any two explanatory variables. In this section, we set $\rho$ to 0.70, 0.80, 0.99 and $n$ to 20, 100, 200, for models with two and four explanatory variables. The dependent variable $y_i$ in (1.1) is obtained from the Bernoulli($\pi_i$) distribution, where $\pi_i = \exp(x_i'\beta)/(1 + \exp(x_i'\beta))$. The parameter values of $\beta_1, \ldots, \beta_p$ are chosen so that $\beta'\beta = 1$. Further, for the Liu parameter $d$, some selected values are chosen so that $0 \le d \le 1$. Moreover, for the restriction (2.4), the matrices $H$, $h$ and $\Psi$ are specified in advance. The simulation is repeated 2000 times by generating new pseudo-random numbers, and the simulated MSE values of the estimators are obtained using

$\mathrm{MSE}(\tilde{\beta}) = \frac{1}{2000} \sum_{r=1}^{2000} (\tilde{\beta}_r - \beta)'(\tilde{\beta}_r - \beta).$

The results of the simulation are reported in Tables A3-A9 (Appendix A) and also displayed in Figures A1-A3 (Appendix B).
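The simulation design above can be sketched as follows. The `estimator` callable and function names are illustrative stand-ins for whichever estimator (MLE, SRLE, etc.) is being evaluated; the loop structure follows the repeated-sampling MSE formula in the text.

```python
import numpy as np

def make_design(n, p, rho, rng):
    """McDonald-Galarneau regressors:
    x_ij = sqrt(1 - rho^2) z_ij + rho z_{i,p+1},
    so any two columns have correlation rho^2."""
    z = rng.standard_normal((n, p + 1))
    return np.sqrt(1.0 - rho ** 2) * z[:, :p] + rho * z[:, [p]]

def simulate_mse(estimator, beta, n, rho, reps=2000, seed=0):
    """Simulated MSE = (1/reps) * sum_r (beta_hat_r - beta)'(beta_hat_r - beta),
    with y_i ~ Bernoulli(pi_i), pi_i = 1 / (1 + exp(-x_i' beta))."""
    rng = np.random.default_rng(seed)
    p = len(beta)
    total = 0.0
    for _ in range(reps):
        X = make_design(n, p, rho, rng)
        pi = 1.0 / (1.0 + np.exp(-X @ beta))
        y = rng.binomial(1, pi)
        bh = np.asarray(estimator(X, y), dtype=float)
        total += float((bh - beta) @ (bh - beta))
    return total / reps
```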
From Tables A3-A9 and Figures A1-A3, we can conclude that: 1) the MSE values of all the estimators increase as $\rho$ increases; 2) the MSE values of all the estimators decrease as $n$ increases; 3) the SRLE is always superior to the MLE, LLE, SRMLE and SRLMLE for all $d$, $n$ and $\rho$.

Concluding Remarks
In this paper, we proposed the Stochastic Restricted Liu Estimator (SRLE) for the logistic regression model when linear stochastic restrictions are available. In the MSEM sense, we obtained necessary and sufficient conditions, or sufficient conditions, under which the SRLE is superior to the MLE, LLE, SRMLE and SRLMLE, and we verified its superiority by a Monte Carlo simulation. How to reduce the new estimator's bias while guaranteeing that its mean square error does not increase is the focus of our next work.

We now give a theorem and a lemma that will be used in the following sections.

Theorem 2.1. [13] (Rao and Toutenburg, 1995) Let $A$ be an $n \times n$ matrix with $A > 0$, and let $B \ge 0$. Then $A + B > 0$.

Lemma 2.1. [14] (Rao et al., 2008) Let $M$ and $N$ be two $n \times n$ positive definite matrices. Then $M - N > 0$ if and only if $\lambda_{\max}(N M^{-1}) < 1$, where $\lambda_{\max}(N M^{-1})$ is the largest eigenvalue of the matrix $N M^{-1}$.



Nja et al. (2013) [3] proposed the Modified Logistic Ridge Regression Estimator (MLRE), and Inan and Erdogan (2013) [4] proposed the Liu-type Logistic Estimator (LLE). Some scholars also improve estimation by restricting the unknown parameters of the model, where the restrictions may be exact or stochastic. Where an additional exact linear restriction on the parameter vector is assumed to hold, Duffy and Santner (1989) [5] proposed the Restricted Maximum Likelihood Estimator (RMLE), Siray et al. (2014) [6] proposed the Restricted Liu Estimator (RLE), and Asar et al. (2016) [7] proposed the Restricted Ridge Estimator. Where an additional stochastic linear restriction on the parameter vector is assumed to hold, Nagarajah and Wijekoon (2015) [8] proposed the Stochastic Restricted Maximum Likelihood Estimator (SRMLE), Varathan and Wijekoon (2016) [9] proposed the Stochastic Restricted Liu Maximum Likelihood Estimator (SRLMLE), and Varathan and Wijekoon (2016) [10] proposed the Stochastic Restricted Ridge Maximum Likelihood Estimator (SRRMLE). In this article, we propose a new estimator, called the Stochastic Restricted Liu Estimator (SRLE), for the case in which linear stochastic restrictions are available in addition to the logistic regression model. The article is structured as follows. Model specifications and the new estimator are presented in Section 2. Section 3 compares the mean square error matrices (MSEM) of the SRLE, MLE and related estimators. Section 4 gives a numerical example, and Section 5 a Monte Carlo simulation that verifies the theoretical results.
