Improvement of the Preliminary Test Estimator When Stochastic Restrictions are Available in Linear Regression Model
1. Introduction
A common problem in the multiple linear regression model is multicollinearity. Several biased estimators have been proposed to address this problem, such as the Ordinary Ridge Estimator (ORE) by Hoerl and Kennard [1], the Restricted Ridge Estimator (RRE) by Sarkar [2], the Liu Estimator (LE) by Liu [3], the Restricted Liu Estimator (RLE) by Kaçiranlar et al. [4] and the Stochastic Restricted Liu Estimator (SRLE) by Hubert and Wijekoon [5]. When different estimators are available, the preliminary test estimation procedure is adopted to select a suitable estimator. The preliminary test approach was first proposed by Bancroft [6] and has since been studied by many researchers, such as Judge and Bock [7], Wijekoon and Trenkler [8] and Saleh and Kibria [9]. Later, Kibria and Saleh [10] discussed the performance of preliminary test ridge estimators based on the Wald (WA) [11], Likelihood Ratio (LR) [12] and Lagrangian Multiplier (LM) [13] tests. Then Yang and Xu [14] introduced preliminary test Liu estimators based on these three tests by combining the Restricted Liu Estimator (RLE) and the Liu Estimator.
In this paper, two Liu-type estimators, the Stochastic Restricted Liu Estimator (SRLE) and the Liu Estimator (LE), are combined to define a new preliminary test estimator. The new Preliminary Test Stochastic Restricted Liu Estimator (PTSRLE) is introduced and its stochastic properties are derived in Section 2. The mean square error matrix and scalar mean square error comparisons between the PTSRLE and the SRLE are carried out in Section 3. In Section 4, the SMSE of the PTSRLE based on the WA, LR and LM tests is derived, and the performance of the PTSRLE under the three tests is compared as a function of the shrinkage parameter d with respect to the Scalar Mean Square Error (SMSE). Finally, in Section 5, we illustrate these comparisons with a numerical example.
2. Model Specification and Stochastic Properties of the Proposed Estimator
First we consider the multiple linear regression model
$y = X\beta + \varepsilon$, (1)
where $y$ is an $n \times 1$ observable random vector, $X$ is an $n \times p$ known design matrix of rank $p$, $\beta$ is a $p \times 1$ vector of unknown parameters and $\varepsilon$ is an $n \times 1$ vector of disturbances, assumed to be distributed as $N(0, \sigma^{2} I_{n})$.
In addition to the sample Model (1), let us be given some prior information about $\beta$ in the form of a set of $m$ independent stochastic linear restrictions as follows:
$r = R\beta + \delta + v$, (2)
where $r$ is an $m \times 1$ stochastic known vector, $R$ is an $m \times p$ matrix of full row rank with known elements, $\delta$ is a non-zero $m \times 1$ unknown vector, and $v$ is an $m \times 1$ random vector of disturbances with $E(v) = 0$ and $D(v) = \sigma^{2} W$, where $W$ is assumed to be known and positive definite. Further, it is assumed that $v$ is stochastically independent of $\varepsilon$, i.e.,
$E(\varepsilon v') = 0$.
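As a brief aside on where the mixed estimator used below comes from: when the restrictions hold exactly in the mean ($\delta = 0$), the sample information (1) and the stochastic prior information (2) can be stacked into a single augmented linear model,
$\begin{pmatrix} y \\ r \end{pmatrix} = \begin{pmatrix} X \\ R \end{pmatrix}\beta + \begin{pmatrix} \varepsilon \\ v \end{pmatrix}$, $\quad \operatorname{Cov}\begin{pmatrix} \varepsilon \\ v \end{pmatrix} = \sigma^{2}\begin{pmatrix} I_{n} & 0 \\ 0 & W \end{pmatrix}$,
and applying generalized least squares to this stacked system gives $\hat{\beta}_{m} = (X'X + R'W^{-1}R)^{-1}(X'y + R'W^{-1}r)$. This Theil-Goldberger representation is recorded here only as a reminder of the origin of the mixed estimator; it is algebraically equivalent to the update form given in (4) below.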
Let us now turn to the question of the statistical evaluation of the compatibility of the sample information and the stochastic prior information. The classical procedure is to test the hypothesis
$H_{0}: \delta = 0$ against $H_{1}: \delta \neq 0$ (3)
under the linear Model (1) and the stochastic prior information (2).
The Ordinary Least Squares Estimator (OLSE) for the Model (1) and the mixed estimator [15] due to the stochastic prior restriction (2) are given by
$\hat{\beta} = S^{-1}X'y$ and $\hat{\beta}_{m} = \hat{\beta} + S^{-1}R'(W + RS^{-1}R')^{-1}(r - R\hat{\beta})$, (4)
respectively, where $S = X'X$.
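To make the notation concrete, the following minimal Python sketch computes the OLSE and the mixed estimator of (4) on simulated data. The data, the restriction matrix R, the vector r and the choice W = I are purely illustrative assumptions of this sketch and are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative simulated data standing in for Model (1) and restriction (2).
n, p, m = 50, 4, 2
X = rng.normal(size=(n, p))                      # n x p design matrix of rank p
beta_true = np.array([1.0, 0.5, -0.3, 2.0])      # p x 1 parameter vector
y = X @ beta_true + rng.normal(size=n)           # disturbances with sigma^2 = 1

R = np.array([[1.0, -1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 1.0]])             # m x p matrix of full row rank
W = np.eye(m)                                    # known positive definite matrix
r = R @ beta_true + rng.normal(size=m)           # stochastic prior information (delta = 0)

S = X.T @ X
S_inv = np.linalg.inv(S)
beta_ols = np.linalg.solve(S, X.T @ y)           # OLSE

# Mixed estimator: the OLSE updated towards the stochastic restriction.
A = W + R @ S_inv @ R.T
beta_mixed = beta_ols + S_inv @ R.T @ np.linalg.solve(A, r - R @ beta_ols)

print("OLSE:           ", beta_ols)
print("Mixed estimator:", beta_mixed)
```

The same objects (S, A, beta_ols, beta_mixed) are reused in the later sketches.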
The Ordinary Stochastic Pre-Test Estimator (OSPE) of [8] is defined as
$\hat{\beta}_{SP} = \hat{\beta}_{m} I_{[0, F_{\alpha}]}(F) + \hat{\beta} I_{(F_{\alpha}, \infty)}(F)$. (5)
Further, we can write (5) as follows:
$\hat{\beta}_{SP} = \hat{\beta} - (\hat{\beta} - \hat{\beta}_{m}) I_{[0, F_{\alpha}]}(F)$, (6)
where
$F = \dfrac{(r - R\hat{\beta})'(W + RS^{-1}R')^{-1}(r - R\hat{\beta})}{m\hat{\sigma}^{2}}$, $\quad \hat{\sigma}^{2} = \dfrac{(y - X\hat{\beta})'(y - X\hat{\beta})}{n - p}$, (7)
which has a non-central $F$ distribution with $(m, n-p)$ degrees of freedom under the alternative hypothesis $H_{1}$, with non-centrality parameter
$\lambda = \dfrac{\delta'(W + RS^{-1}R')^{-1}\delta}{2\sigma^{2}}$, (8)
and
$I_{[0, F_{\alpha}]}(F)$ and $I_{(F_{\alpha}, \infty)}(F)$ are indicator functions which take the value one if $F$ falls in the subscripted interval and zero otherwise. Here $F_{\alpha} = F_{m, n-p}(\alpha)$ is the upper α-level critical value from the central $F$ distribution with $(m, n-p)$ degrees of freedom.
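Continuing the sketch above, the preliminary test selection rule behind the OSPE can be coded as follows. The F statistic used here follows the compatibility form reconstructed in (7) and should be read as an illustration of the selection mechanism; SciPy is used only for the F quantile.

```python
from scipy.stats import f as f_dist

# Residual variance estimate from the unrestricted (OLS) fit.
resid = y - X @ beta_ols
sigma2_hat = resid @ resid / (n - p)

# Compatibility statistic for H0: delta = 0; F ~ F(m, n - p) under H0.
diff = r - R @ beta_ols
F_stat = diff @ np.linalg.solve(A, diff) / (m * sigma2_hat)

alpha = 0.05
F_crit = f_dist.ppf(1.0 - alpha, m, n - p)   # upper alpha-level critical value

# OSPE: keep the mixed estimator if H0 is not rejected, otherwise fall back to the OLSE.
beta_ospe = beta_mixed if F_stat <= F_crit else beta_ols
print("F statistic:", F_stat, " critical value:", F_crit)
print("OSPE:", beta_ospe)
```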
When different estimators are available for the same parameter vector $\beta$ in the linear regression model, one must solve the problem of their comparison. Usually, as a simultaneous measure of covariance and bias, the mean square error matrix is used, defined by
$M(\tilde{\beta}, \beta) = E[(\tilde{\beta} - \beta)(\tilde{\beta} - \beta)'] = D(\tilde{\beta}) + B(\tilde{\beta})B(\tilde{\beta})'$, (9)
where $D(\tilde{\beta})$ is the dispersion matrix and $B(\tilde{\beta}) = E(\tilde{\beta}) - \beta$
denotes the bias vector of an estimator $\tilde{\beta}$. We recall that the Scalar Mean Square Error is the trace of the mean square error matrix, $\operatorname{SMSE}(\tilde{\beta}, \beta) = \operatorname{tr}\{M(\tilde{\beta}, \beta)\}$.
Now the Liu estimator
$\hat{\beta}_{d} = (S + I)^{-1}(S + dI)\hat{\beta} = F_{d}\hat{\beta}$ (10)
and the stochastic restricted Liu estimator
$\hat{\beta}_{srd} = F_{d}\hat{\beta}_{m}$ (11)
are combined to define the new preliminary test estimator, the Preliminary Test Stochastic Restricted Liu Estimator (PTSRLE), as
$\hat{\beta}_{PT}(d) = \hat{\beta}_{srd} I_{[0, F_{\alpha}]}(F) + \hat{\beta}_{d} I_{(F_{\alpha}, \infty)}(F)$, (12)
where
$F_{d} = (S + I)^{-1}(S + dI)$,
with $0 < d < 1$, and $d$ is the shrinkage parameter.
Then we can write (12) as follows:
$\hat{\beta}_{PT}(d) = \hat{\beta}_{d} - (\hat{\beta}_{d} - \hat{\beta}_{srd}) I_{[0, F_{\alpha}]}(F) = F_{d}\hat{\beta}_{SP}$. (13)
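The same simulated example can be pushed one step further to show how the LE, the SRLE and the PTSRLE fit together under the forms written above. The helper liu_matrix and the value d = 0.5 are illustrative conveniences of this sketch, not part of the paper.

```python
def liu_matrix(S, d):
    """Liu shrinkage matrix F_d = (S + I)^{-1}(S + d I) for 0 < d < 1."""
    p = S.shape[0]
    return np.linalg.solve(S + np.eye(p), S + d * np.eye(p))

d = 0.5                                   # illustrative shrinkage parameter
F_d = liu_matrix(S, d)

beta_liu = F_d @ beta_ols                 # Liu estimator
beta_srle = F_d @ beta_mixed              # stochastic restricted Liu estimator

# PTSRLE: the SRLE if the compatibility test accepts H0, the LE otherwise.
beta_ptsrle = beta_srle if F_stat <= F_crit else beta_liu

# Sanity check of the identity PTSRLE = F_d * OSPE used in (13).
assert np.allclose(beta_ptsrle, F_d @ beta_ospe)
print("PTSRLE:", beta_ptsrle)
```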
Wijekoon and Trenkler [8] derived the stochastic properties of the OSPE. Since the PTSRLE is obtained from the OSPE through the nonstochastic matrix $F_{d}$ (see (13)), those results yield the expectation vector, bias vector, dispersion matrix, MSEM and SMSE of the PTSRLE as follows:
(14)
(15)
(16)
(17)
and
(18)
respectively.
Hubert and Wijekoon [5] have given the MSE matrix and the SMSE of the SRLE as
(19)
(20)
Now we will see some properties of the PTSRLE:
• Note that the PTSRLE reduces to the OSPE when $d = 1$.
• If $\alpha = 1$, so that the critical value $F_{\alpha} = 0$ and the Liu estimator is always selected, then the MSE matrix of the PTSRLE reduces to the MSE matrix of the Liu estimator.
• If $\alpha = 0$, so that the critical value $F_{\alpha} = \infty$ and the SRLE is always selected, then the MSE matrix of the PTSRLE reduces to the MSE matrix of the SRLE.
• If $\lambda \to \infty$, then the probability that $F$ falls below $F_{\alpha}$ tends to zero, and hence, from (17), the MSE matrix of the PTSRLE tends towards that of the LE.
3. Performance of the Proposed Estimator
In this section, we will compare the PTSRLE with the SRLE in the sense of the mean square error matrix and the scalar mean square error when the stochastic restrictions are correct and when they are not.
Definition: (MSEM Superiority of Estimators)
Let two alternative estimators $\hat{\beta}_{1}$ and $\hat{\beta}_{2}$ of $\beta$ be given. Then $\hat{\beta}_{2}$ is said to be superior to $\hat{\beta}_{1}$ with respect to the MSEM criterion if and only if
$M(\hat{\beta}_{1}, \beta) - M(\hat{\beta}_{2}, \beta) \geq 0$, i.e., the difference is nonnegative definite. (21)
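It is worth recording, for the comparison in Section 3.2, that MSEM superiority always implies SMSE superiority, since the trace of a nonnegative definite matrix is nonnegative:
$M(\hat{\beta}_{1}, \beta) - M(\hat{\beta}_{2}, \beta) \geq 0 \;\Longrightarrow\; \operatorname{SMSE}(\hat{\beta}_{1}, \beta) - \operatorname{SMSE}(\hat{\beta}_{2}, \beta) = \operatorname{tr}\{ M(\hat{\beta}_{1}, \beta) - M(\hat{\beta}_{2}, \beta) \} \geq 0$,
while the converse implication does not hold in general.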
3.1. Comparison between the PTSRLE and SRLE under MSE Criterion
In this subsection, we will compare the PTSRLE with the SRLE under the MSE matrix criterion when the stochastic restrictions are correct and when they are not.
Consider the MSE matrix difference between the PTSRLE and the SRLE,
(22)
3.1.1. Theorem 3.1:
1) If the stochastic restrictions are true (i.e., $\delta = 0$), the SRLE is always superior to the PTSRLE in the mean squared error matrix sense.
2) If the stochastic restrictions are not true (i.e., $\delta \neq 0$), then, under the column space assumption used in the lemma (see Appendix), the SRLE is not worse than the PTSRLE in the mean square error matrix sense if and only if a certain scalar condition, derived in Section 3.1.2, holds. Here $\mathcal{R}(\cdot)$ denotes the column space of the corresponding matrix.
3.1.2. Proof:
If the stochastic restrictions are correct, then $\delta = 0$, the bias terms involving $\delta$ vanish, and Equation (22) reduces to
(23)
The matrix appearing in (23) is clearly nonnegative definite, and therefore the mean square error difference in (23) is a nonnegative definite matrix. Hence the SRLE is always superior to the PTSRLE in the mean square error matrix sense when $\delta = 0$.
If the stochastic restrictions are not correct, then $\delta \neq 0$, and with respect to the MSE matrix criterion the SRLE is not worse than the PTSRLE if and only if the difference in (22) is nonnegative definite. Since the matrix involved is nonnegative definite, we can apply the lemma of [16] (see Appendix) to analyze the MSE matrix superiority of the SRLE over the PTSRLE.
According to [17] (Theorem A.76, p. 514) we can derive a generalized inverse of the matrix concerned as
(24)
After some straightforward calculation we can show that
(25)
Using (24) and (25), the column space requirements of the lemma are easily verified. To establish condition (1) in the lemma (see Appendix), the required quantities are evaluated, and hence, according to the lemma, the mean square error matrix difference in (22) is nonnegative definite if and only if the resulting scalar inequality holds; this is the condition of Theorem 3.1(2). This completes the proof of the theorem.
3.2. Comparison between the PTSRLE and SRLE under SMSE Criterion
In this subsection, we will compare the PTSRLE with the SRLE under the SMSE criterion when the stochastic restrictions are correct and when they are not.
If the stochastic restrictions are correct, then $\delta = 0$, and consequently the SMSE difference between the PTSRLE and the SRLE reduces to a quantity which is clearly nonnegative. Hence the SRLE is always superior to the PTSRLE in the SMSE sense when $\delta = 0$.
If the stochastic restrictions are not correct, then $\delta \neq 0$, and consequently, since the relevant matrix is positive definite, there exist an orthogonal matrix and a positive definite diagonal matrix which diagonalize it. Then the SMSE difference between the SRLE and the PTSRLE can be written as
(26)
where the terms depend on the diagonal elements of this matrix. Therefore, the SMSE difference in (26) is nonnegative if and only if the condition determined by the quantity defined in (27) holds, where
(27)
Now we summarize our findings:
Theorem 3.2:
1) If the stochastic restrictions are true (i.e., $\delta = 0$), the SRLE is always superior to the PTSRLE in the scalar mean squared error sense.
2) If the stochastic restrictions are not true (i.e., $\delta \neq 0$), the Preliminary Test Stochastic Restricted Liu Estimator has smaller SMSE than the Stochastic Restricted Liu Estimator if and only if the condition determined by the quantity given in (27) is satisfied.
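The qualitative content of Theorems 3.1 and 3.2 can be checked numerically. Continuing the simulated example from Section 2, the sketch below approximates the SMSE of the SRLE and of the PTSRLE by Monte Carlo over increasing violations of the restriction; it is an illustrative simulation under the assumptions of the earlier code (W = I, σ² = 1), not an evaluation of the closed-form expressions (18) and (20), and the helper smse_by_simulation is specific to this sketch.

```python
def smse_by_simulation(delta_scale, d=0.5, n_rep=2000, alpha=0.05):
    """Monte Carlo approximation of the SMSE of the SRLE and the PTSRLE."""
    F_d = liu_matrix(S, d)
    F_crit = f_dist.ppf(1.0 - alpha, m, n - p)
    delta = delta_scale * np.ones(m)          # delta = 0 means the restriction is true
    se_srle = se_pt = 0.0
    for _ in range(n_rep):
        y_sim = X @ beta_true + rng.normal(size=n)
        r_sim = R @ beta_true + delta + rng.normal(size=m)     # W = I, sigma^2 = 1
        b_ols = np.linalg.solve(S, X.T @ y_sim)
        b_mix = b_ols + S_inv @ R.T @ np.linalg.solve(A, r_sim - R @ b_ols)
        res = y_sim - X @ b_ols
        F_sim = ((r_sim - R @ b_ols) @ np.linalg.solve(A, r_sim - R @ b_ols)
                 / (m * (res @ res) / (n - p)))
        b_srle = F_d @ b_mix
        b_pt = b_srle if F_sim <= F_crit else F_d @ b_ols
        se_srle += np.sum((b_srle - beta_true) ** 2)
        se_pt += np.sum((b_pt - beta_true) ** 2)
    return se_srle / n_rep, se_pt / n_rep

for scale in (0.0, 0.5, 1.0, 2.0):
    print("delta scale", scale, "-> (SMSE SRLE, SMSE PTSRLE):", smse_by_simulation(scale))
```

Under this setup one should see the SRLE winning when the restriction is true and the PTSRLE becoming competitive as the violation grows, in line with Theorem 3.2.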
4. PTSRLE Based on WA, LR and LM Tests
In general, finite sample tests such as the t or F test are used to define the preliminary test estimator. Since these finite sample tests are not always available, it is very useful to consider preliminary test estimators based on the three large-sample tests WA, LR and LM. The WA test offers the advantage of requiring only estimates of the unrestricted model, whereas the LR test requires estimates of both the unrestricted and the restricted models. The LM test requires only estimates of the restricted model. In different situations, one or the other of these tests may be easier to compute. Judge and Bock [7] have rewritten the model given in (1) and (2) to obtain the F statistic for testing the hypothesis in (3). Using the same model, we can derive the test statistics for the WA, the LR and the LM tests, which are employed for testing the Hypothesis (3) and are given by
(28)
respectively [18].
It is known that, under the null hypothesis, the three test statistics have the same asymptotic chi-square distribution with $m$ degrees of freedom [18]. When the exact distribution is approximated by the asymptotic chi-square distribution, the critical value for an α-level test of $H_{0}$ is approximated by the central chi-square critical value $\chi^{2}_{m}(\alpha)$ for large sample tests. This asymptotic chi-square distribution has wide application in the field of Econometrics. Based on the above tests, the PTSRLE takes the form [10] as
$\hat{\beta}^{(*)}_{PT}(d) = \hat{\beta}_{srd} I_{[0, \chi^{2}_{m}(\alpha)]}(\mathcal{L}_{*}) + \hat{\beta}_{d} I_{(\chi^{2}_{m}(\alpha), \infty)}(\mathcal{L}_{*})$, (29)
where (*) stands for either the WA, LR or LM test, $\mathcal{L}_{*}$ denotes the corresponding test statistic in (28), and $\chi^{2}_{m}(\alpha)$ is the upper α-percentile of the central chi-square distribution with $m$ degrees of freedom.
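The three large-sample statistics can be illustrated on the same simulated data. The expressions below are the familiar textbook forms based on the restricted and unrestricted residual sums of squares (they satisfy WA ≥ LR ≥ LM); they are shown as stand-ins for the expressions in (28), which are not reproduced here, so treat them and the helper ptsrle_large_sample as assumptions of this sketch.

```python
from scipy.stats import chi2

# Textbook large-sample forms of the three statistics for H0: delta = 0,
# computed from the unrestricted (OLS) and restricted (mixed-estimator) fits.
sse_u = np.sum((y - X @ beta_ols) ** 2)
sse_r = np.sum((y - X @ beta_mixed) ** 2)

WA = n * (sse_r - sse_u) / sse_u
LR = n * np.log(sse_r / sse_u)
LM = n * (sse_r - sse_u) / sse_r                 # WA >= LR >= LM always holds

chi2_crit = chi2.ppf(1.0 - alpha, df=m)          # asymptotic chi-square critical value

def ptsrle_large_sample(stat):
    """PTSRLE based on a large-sample test: SRLE if H0 is accepted, LE otherwise."""
    return beta_srle if stat <= chi2_crit else beta_liu

print("WA, LR, LM:", WA, LR, LM)
print("PTSRLE (WA-based):", ptsrle_large_sample(WA))
```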
By using the equation in (18), we can now obtain the SMSE of the PTSRLE based on the WA, LR and LM tests as
(30)
where the quantities involved take different values for the WA, LR and LM tests, respectively.
We consider the SMSE difference between the PTSRLE based on the WA test and the PTSRLE based on the LR test,
(31)
and the SMSE difference between the PTSRLE based on the LR test and the PTSRLE based on the LM test,
(32)
Case I: If the stochastic restrictions are true, then δ = 0.
In this case the SMSE difference in (31) reduces to a quantity which is clearly nonnegative, and similarly the SMSE difference in (32) reduces to a nonnegative quantity.
Case II: If the stochastic restrictions are not true, then δ ≠ 0.
We can rewrite the SMSE difference in (31) as follows:
(33)
Therefore the SMSE difference in (33) is nonnegative if the condition determined by the bound in (34) holds, where
(34)
Hence the PTSRLE based on the WA test and the PTSRLE based on the LR test dominate one another according to the bounds given in (34) and (35), where
(35)
We can rewrite the SMSE difference in (32) as follows:
(36)
Therefore, the SMSE difference in (36) is nonnegative if the condition determined by the bound in (37) holds, where
(37)
Hence the PTSRLE based on the LR test and the PTSRLE based on the LM test dominate one another according to the bounds given in (37) and (38), where
(38)
Now the performance of the PTSRLE based on the WA, LR and LM tests is compared with respect to the SMSE, and the following theorem can be stated.
Theorem 4.1:
1) If the stochastic restrictions are true (i.e., δ = 0), then the SMSE ordering of the PTSRLE based on the WA, LR and LM tests established in Case I holds.
2) If the stochastic restrictions are not true (i.e., δ ≠ 0), then
a) under one set of conditions on the bounds, the PTSRLE based on the LM test has the smallest SMSE;
b) under the complementary set of conditions, the PTSRLE based on the WA test has the smallest SMSE;
where the bounds are given in Equations (34), (35), (37) and (38), respectively.
From Theorem 4.1(2b) and according to [14], we can say that when $d$ is small, the PTSRLE based on the WA test has the smallest SMSE among the three tests. Similarly, according to the results stated in (2a), the PTSRLE based on the LM test has the smallest SMSE among the three tests when $d$ becomes large.
5. Numerical Example
To illustrate our theoretical results, we consider the following data set on Portland cement originally due to Woods, Steinour and Starke [19]. This data set came from an experimental investigation of the heat evolved during the setting and hardening of Portland cements of varied composition, and of the dependence of this heat on the percentages of four compounds in the clinkers from which the cement was produced. The four compounds considered by Woods, Steinour and Starke [19] are tricalcium aluminate (3CaO·Al2O3), tricalcium silicate (3CaO·SiO2), tetracalcium aluminoferrite (4CaO·Al2O3·Fe2O3), and beta-dicalcium silicate (2CaO·SiO2), which we will denote by X1, X2, X3 and X4, respectively. The dependent variable y is the heat evolved in calories per gram of cement after 180 days of curing. This data set has since been widely used by many researchers (e.g. [4,20]).
The X = (X1, X2, X3, X4) matrix contains 13 observations on the four predictor variables. Since the regressor matrix X does not include a column of ones, a homogeneous multiple linear regression, i.e., Model (1) without intercept, is fitted to the data.
The ordinary least squares estimate of the regression coefficient vector and the corresponding estimate of σ² are computed from the data.
Consider the stochastic prior restriction (2) with r, R and W specified as in [20,21].
Figures 1 and 2 are drawn by using the SMSE expressions given in Equations (18) and (20) for different values of d selected from (0, 1).
According to Figures 1 and 2, we can conclude that when d is small the PTSRLE has a smaller SMSE value than the SRLE, OSPE and OLSE.
Figures 3 and 4 are drawn by using the SMSE given in Equation (30) for different values of d selected from (0, 1).
From Figures 3 and 4, we can notice that when d is small, the PTSRLE based on the WA test has the smallest SMSE among the three tests, and when d becomes large, the PTSRLE based on the LM test has the smallest SMSE. Hence the data analysis supports the findings of this paper.
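Since neither the cement data matrix nor the closed-form SMSE expressions are reproduced above, the shape of curves such as those in Figures 3 and 4 can still be illustrated on the simulated example from the earlier sketches: the fragment below traces Monte Carlo SMSE estimates of the test-based PTSRLE over a grid of d values. It is a qualitative illustration only, not a reproduction of the figures, and the helper smse_test_based is specific to this sketch.

```python
import matplotlib.pyplot as plt

def smse_test_based(d, test, delta_scale=1.0, n_rep=1000, alpha=0.05):
    """Monte Carlo SMSE of the PTSRLE based on the WA, LR or LM statistic."""
    F_d = liu_matrix(S, d)
    crit = chi2.ppf(1.0 - alpha, df=m)
    delta = delta_scale * np.ones(m)
    total = 0.0
    for _ in range(n_rep):
        y_sim = X @ beta_true + rng.normal(size=n)
        r_sim = R @ beta_true + delta + rng.normal(size=m)
        b_ols = np.linalg.solve(S, X.T @ y_sim)
        b_mix = b_ols + S_inv @ R.T @ np.linalg.solve(A, r_sim - R @ b_ols)
        sse_u = np.sum((y_sim - X @ b_ols) ** 2)
        sse_r = np.sum((y_sim - X @ b_mix) ** 2)
        stat = {"WA": n * (sse_r - sse_u) / sse_u,
                "LR": n * np.log(sse_r / sse_u),
                "LM": n * (sse_r - sse_u) / sse_r}[test]
        b_pt = F_d @ (b_mix if stat <= crit else b_ols)
        total += np.sum((b_pt - beta_true) ** 2)
    return total / n_rep

d_grid = np.linspace(0.05, 0.95, 10)
for test in ("WA", "LR", "LM"):
    plt.plot(d_grid, [smse_test_based(d, test) for d in d_grid], label=test)
plt.xlabel("d"); plt.ylabel("estimated SMSE"); plt.legend(); plt.show()
```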
6. Conclusions
In this paper, we have introduced a new preliminary test estimator for the multiple linear regression model. When d is small, the PTSRLE based on the WA test has the smallest SMSE among the three tests; when d becomes large, the PTSRLE based on the LM test has the smallest SMSE. Moreover, for certain cases (Figures 1 and 2) the proposed estimator has the smallest SMSE among the estimators considered. The results of this paper have a potential for future developments in both theoretical and practical aspects.
Figure 1. Estimated SMSE values for the SRLE, PTSRLE, OSPE and OLSE.
Figure 2. Estimated SMSE values for the SRLE, PTSRLE, OSPE and OLSE.
Figure 3. The SMSE of the PTSRLE based on the WA, LR and LM tests.
Figure 4. The SMSE of the PTSRLE based on the WA, LR and LM tests.
7. Acknowledgements
We thank the Postgraduate Institute of Science, University of Peradeniya, Sri Lanka, for providing all facilities to carry out this research.
Appendix
Lemma: (Baksalary and Trenkler, [16])
Let A be a nonnegative definite matrix and let a and b be linearly independent vectors. Furthermore, for some generalized inverse A⁻ of A, define the quantities used in conditions (1) and (2) below, where R(·) denotes the column space of the corresponding matrix. Then the matrix difference considered in Section 3.1.2 is nonnegative definite
if and only if 1) the first set of conditions on these quantities holds,
or 2) the second set of conditions holds,
and all expressions in (1) and (2) are independent of the choice of A⁻.