The development of many estimators of the parameters of the linear regression model is traceable to the non-validity of the assumptions under which the model is formulated, especially when it is applied to real-life situations. This notwithstanding, regression analysis may aim at prediction. Consequently, this paper examines the performance of the Ordinary Least Squares (OLS) estimator, the Cochrane-Orcutt (COR) estimator, the Maximum Likelihood (ML) estimator and the estimators based on Principal Component (PC) analysis in prediction with the linear regression model under joint violation of the assumptions of non-stochastic regressors, independent regressors and independent error terms. With correlated stochastic normal variables as regressors and autocorrelated error terms, Monte-Carlo experiments were conducted, and the study further identifies the best estimator for prediction by adopting the goodness-of-fit statistics of the estimators. The results show that the performances of COR at each level of correlation (multicollinearity) and of ML, especially when the sample size is large, have a convex-like pattern over the levels of autocorrelation, while those of OLS and PC are concave-like. Also, as the level of multicollinearity increases, the estimators, except the PC estimators when multicollinearity is negative, rapidly perform better over the levels of autocorrelation. The COR and ML estimators are generally best for prediction in the presence of multicollinearity and autocorrelated error terms. However, at low levels of autocorrelation, the OLS estimator is either best or competes consistently with the best estimator, while the PC estimator is either best or competes with the best when the multicollinearity level is high (λ > 0.8 or λ < -0.49).

Linear regression is probably the most widely used statistical technique for solving functional relationship problems among variables. It helps to explain observations of a dependent variable, y, with observed values of one or more independent variables, X_{1}, X_{2}, ..., X_{p}. In explaining the dependent variable, prediction of its values often becomes essential. The linear regression model is formulated under some basic assumptions. Among these, the regressors are assumed to be non-stochastic (fixed in repeated sampling) and independent. The error terms are also assumed to be independent, to have constant variance and to be independent of the regressors. When all these assumptions of the classical linear regression model are satisfied, the Ordinary Least Squares (OLS) estimator given as:

$$\hat{\beta} = (X'X)^{-1}X'y$$

is known to possess some ideal or optimum properties of an estimator, which include linearity, unbiasedness and efficiency [
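As an illustration of the OLS estimator, the following minimal sketch computes the coefficients by solving the normal equations with NumPy. The function name `ols` and the noise-free data are hypothetical and chosen only so that the true coefficients are recovered exactly; they are not taken from the study.

```python
import numpy as np

def ols(X, y):
    """Return the OLS coefficients (X'X)^{-1} X'y for a design matrix X."""
    # Solving the normal equations avoids forming the inverse explicitly.
    return np.linalg.solve(X.T @ X, X.T @ y)

# Hypothetical noise-free data, so OLS reproduces the chosen coefficients.
rng = np.random.default_rng(0)
x = rng.normal(size=(20, 2))
X = np.column_stack([np.ones(20), x])   # first column is the intercept
beta_true = np.array([4.0, 2.5, 1.8])
y = X @ beta_true
beta_hat = ols(X, y)
```

With an error term added to `y`, `beta_hat` would only approximate `beta_true`, and its sampling variability is what the assumptions discussed in this section govern.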

The assumption of non-stochastic regressors is not always satisfied, especially in business, economics and the social sciences, where the regressors are often generated by stochastic processes beyond the researcher's control. Many authors, including Neter and Wasserman [

The violation of the assumption of independent regressors leads to multicollinearity. With strongly interrelated regressors, the interpretation given to the regression coefficients may no longer be valid because the assumption under which the regression model is built has been violated. Although the estimates of the regression coefficients provided by the OLS estimator are still unbiased as long as multicollinearity is not perfect, the regression coefficients may have large sampling errors, which affect both the inference and the forecasting resulting from the model [

The methodology of the biased estimator of regression coefficients due to principal component regression involves two stages. This two-stage procedure first reduces the predictor variables using principal component analysis and then uses the reduced variables in an OLS regression fit. While it often works well in practice, there is no general theoretical reason that the most informative linear function of the predictor variables should lie among the dominant principal components of the multivariate distribution of the predictor variables.

Consider the linear regression model,

$$y = X\beta + u.$$

Let T′X′XT = Λ, where Λ = diag(λ_{1}, λ_{2}, ..., λ_{p}) is a p × p diagonal matrix of the eigenvalues of X′X and T is a p × p orthogonal matrix whose columns are the eigenvectors associated with λ_{1}, λ_{2}, ..., λ_{p}. Then the above model can be written as:

$$y = XTT'\beta + u = Z\alpha + u,$$

where

$$Z = XT, \qquad \alpha = T'\beta, \qquad Z'Z = T'X'XT = \Lambda.$$

The columns of Z, which define a new set of orthogonal regressors, Z = [Z_{1}, Z_{2}, ..., Z_{p}], are referred to as principal components. The principal components regression approach combats multicollinearity by using fewer than the full set of principal components in the model. Using all of them gives back the OLS estimator. To obtain the principal component estimator, assume that the regressors are arranged in order of descending eigenvalues and that the last of these eigenvalues are approximately equal to zero. In principal components regression, the principal components corresponding to near-zero eigenvalues are removed from the analysis and least squares is applied to the remaining components.
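The two-stage procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the study: the function name `pcr`, the choice of `k` and the simulated data are hypothetical, and the regressors are assumed to be centred beforehand.

```python
import numpy as np

def pcr(X, y, k):
    """Principal components regression keeping the k leading components.

    X is assumed to be centred; the returned coefficient vector is
    expressed in terms of the original regressors.
    """
    eigvals, T = np.linalg.eigh(X.T @ X)      # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]         # rearrange into descending order
    T = T[:, order]
    Z = X @ T                                 # principal components, Z'Z is diagonal
    Zk = Z[:, :k]                             # drop the small-eigenvalue components
    alpha_k = np.linalg.solve(Zk.T @ Zk, Zk.T @ y)
    return T[:, :k] @ alpha_k                 # back-transform to the original scale

# Hypothetical data with two nearly collinear regressors.
rng = np.random.default_rng(1)
n = 50
z1 = rng.normal(size=n)
X = np.column_stack([z1, z1 + 0.05 * rng.normal(size=n), rng.normal(size=n)])
X = X - X.mean(axis=0)
y = X @ np.array([2.5, 1.8, 0.6]) + rng.normal(size=n)
y = y - y.mean()

beta_ols = np.linalg.solve(X.T @ X, X.T @ y)
beta_pc_all = pcr(X, y, 3)   # all components retained: coincides with OLS
beta_pc_two = pcr(X, y, 2)   # smallest-eigenvalue component removed
```

Keeping all three components reproduces the OLS coefficients, while dropping the last one trades some bias for a reduction in variance.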

When all the assumptions of the Classical Linear Regression Model hold except that the error terms are not homoscedastic but are heteroscedastic, the resulting model is the Generalized Least Squares (GLS) Model. Aitken [

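The GLS estimator

$$\hat{\beta}_{GLS} = (X'\Omega^{-1}X)^{-1}X'\Omega^{-1}y$$

is efficient among the class of linear unbiased estimators of β, with variance-covariance matrix given as

$$V(\hat{\beta}_{GLS}) = \sigma^{2}(X'\Omega^{-1}X)^{-1},$$

where E(uu′) = σ^{2}Ω and Ω is assumed to be known. The GLS estimator described requires Ω, and in particular ρ, to be known before the parameters can be estimated. Thus, in the linear model with autocorrelated error terms having an AR(1) structure:

$$u_t = \rho u_{t-1} + \varepsilon_t, \qquad |\rho| < 1,$$

and

$$E(uu') = \sigma_u^{2}\,\Omega,$$

where

$$\Omega = \begin{bmatrix} 1 & \rho & \rho^{2} & \cdots & \rho^{n-1} \\ \rho & 1 & \rho & \cdots & \rho^{n-2} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \rho^{n-1} & \rho^{n-2} & \rho^{n-3} & \cdots & 1 \end{bmatrix}$$

and σ_{u}^{2} = σ_{ε}^{2}/(1 − ρ^{2}), and the inverse of Ω is

$$\Omega^{-1} = \frac{1}{1-\rho^{2}} \begin{bmatrix} 1 & -\rho & 0 & \cdots & 0 & 0 \\ -\rho & 1+\rho^{2} & -\rho & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & 1+\rho^{2} & -\rho \\ 0 & 0 & 0 & \cdots & -\rho & 1 \end{bmatrix}.$$

Now, with a suitable (n − 1) × n transformation matrix P^{*} defined by

$$P^{*} = \begin{bmatrix} -\rho & 1 & 0 & \cdots & 0 & 0 \\ 0 & -\rho & 1 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & -\rho & 1 \end{bmatrix},$$

multiplying shows that P^{*}′P^{*} gives an n × n matrix which, apart from a proportionality constant, is identical with Ω^{-1} except for the first element in the leading diagonal, which is ρ^{2} rather than unity. With another n × n transformation matrix P obtained from P^{*} by adding a new row with √(1 − ρ^{2}) in the first position and zeros elsewhere, that is

$$P = \begin{bmatrix} \sqrt{1-\rho^{2}} & 0 & 0 & \cdots & 0 & 0 \\ -\rho & 1 & 0 & \cdots & 0 & 0 \\ 0 & -\rho & 1 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & -\rho & 1 \end{bmatrix},$$

multiplying shows that P′P = (1 − ρ^{2})Ω^{-1}. The difference between P^{*} and P lies only in the treatment of the first sample observation. When n is large, the difference is negligible, but in small samples it can be major. If Ω, or more precisely ρ, is known, GLS estimation can be achieved by applying OLS to the data transformed by P^{*} or P above. However, this is not often the case, so we resort to estimating Ω to obtain a Feasible Generalized Least Squares estimator. This estimator becomes feasible when ρ is replaced by a consistent estimator [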
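The relationship between the two transformation matrices and the inverse of Ω can be checked numerically. The sketch below is illustrative only: the names `ar1_omega`, `transform_matrix` and `gls_ar1` are hypothetical, ρ is treated as known, and the values n = 6 and ρ = 0.7 are arbitrary.

```python
import numpy as np

def ar1_omega(n, rho):
    """Omega with (i, j) element rho**|i - j| for AR(1) errors."""
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])

def transform_matrix(n, rho, keep_first=True):
    """Return the n x n matrix P if keep_first, else the (n-1) x n matrix P*."""
    P_star = np.zeros((n - 1, n))
    rows = np.arange(n - 1)
    P_star[rows, rows] = -rho        # -rho on the diagonal
    P_star[rows, rows + 1] = 1.0     # 1 on the superdiagonal
    if not keep_first:
        return P_star
    first = np.zeros((1, n))
    first[0, 0] = np.sqrt(1.0 - rho ** 2)   # rescaled first observation
    return np.vstack([first, P_star])

def gls_ar1(X, y, rho, keep_first=True):
    """GLS for a known rho: OLS applied to the transformed data."""
    P = transform_matrix(len(y), rho, keep_first)
    Xs, ys = P @ X, P @ y
    return np.linalg.solve(Xs.T @ Xs, Xs.T @ ys)

n, rho = 6, 0.7
Omega_inv = np.linalg.inv(ar1_omega(n, rho))
P = transform_matrix(n, rho)
P_star = transform_matrix(n, rho, keep_first=False)
# Difference between (1 - rho^2) * Omega^{-1} and P*'P*: only element (0, 0) is non-zero.
diff = (1 - rho ** 2) * Omega_inv - P_star.T @ P_star
```

Calling `gls_ar1` with `keep_first=False` drops the first observation, which is the treatment that separates the two transformations in small samples.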

Several authors have worked on this violation, especially in terms of parameter estimation in the linear regression model with autoregressive errors of order one. The OLS estimator is inefficient even though it is unbiased. Its predicted values are also inefficient, and the sampling variances of the autocorrelated error terms are known to be underestimated, causing the t and F tests to be invalid [3-5] and [

In spite of these several works on these estimators, none has actually been done on prediction, especially as it relates to the multicollinearity problem. Therefore, this paper not only examines the predictive ability of some of these estimators but also does so under some violations of the assumptions of the regression model, making the model much closer to reality.

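Consider the linear regression model of the form:

$$y_t = \beta_0 + \beta_1 X_{1t} + \beta_2 X_{2t} + \beta_3 X_{3t} + u_t, \qquad t = 1, 2, \ldots, n \qquad (1)$$

where the error terms u_{t} are autocorrelated of order one, and the regressors X_{1t}, X_{2t} and X_{3t} are stochastic and correlated.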

For Monte-Carlo simulation study, the parameters of equation (1) were specified and fixed as β_{0} = 4, β_{1} = 2.5, β_{2} = 1.8 and β_{3} = 0.6. The levels of intercorrelation (multicollinearity) among the independent variables were sixteen (16) and specified as:

The levels of autocorrelation were twenty-one (21) in number. Furthermore, the experiment was replicated 1000 times under six (6) levels of sample size. The correlated stochastic normal regressors were generated by using the equations provided by Ayinde [

where , and ; and

By these equations, the inter-correlation matrix has to be positive definite; hence, the correlations among the independent variables were taken as prescribed earlier. In the study, we assumed
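One common way to generate correlated standard normal regressors is through the Cholesky factor of the target correlation matrix. The sketch below uses that generic technique and is not the set of equations cited above; the function name `correlated_normals`, the common correlation value `lam = 0.8` and the sample size are assumptions made for illustration.

```python
import numpy as np

def correlated_normals(n, corr, rng):
    """Draw n rows of N(0, 1) variables with correlation matrix `corr`."""
    # Cholesky factorisation fails if `corr` is not positive definite.
    L = np.linalg.cholesky(corr)
    Z = rng.standard_normal((n, corr.shape[0]))   # independent N(0, 1) draws
    return Z @ L.T                                # rows now have correlation `corr`

lam = 0.8                       # hypothetical common correlation level
corr = np.full((3, 3), lam)
np.fill_diagonal(corr, 1.0)
X = correlated_normals(100_000, corr, np.random.default_rng(42))
sample_corr = np.corrcoef(X, rowvar=False)
```

For three regressors sharing a common correlation, the correlation matrix is positive definite only when that correlation exceeds -0.5, which is why the factorisation raises an error for values at or below that bound.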

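The error terms were generated using one of the distributional properties of the autocorrelated error terms,

$$u_t \sim N\!\left(0, \frac{\sigma_{\varepsilon}^{2}}{1-\rho^{2}}\right),$$

and the AR(1) equation as follows:

$$u_1 = \frac{\varepsilon_1}{\sqrt{1-\rho^{2}}}, \qquad u_t = \rho u_{t-1} + \varepsilon_t, \quad t = 2, 3, \ldots, n.$$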
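A minimal sketch of generating AR(1) error terms is given below. It assumes normally distributed innovations and draws the first error from the stationary distribution; the function name `ar1_errors`, the seed and the value ρ = 0.8 are hypothetical choices for illustration.

```python
import math
import random

def ar1_errors(n, rho, sigma=1.0, seed=None):
    """Generate n errors u_t = rho * u_{t-1} + e_t with e_t ~ N(0, sigma^2)."""
    if not -1.0 < rho < 1.0:
        raise ValueError("stationarity requires |rho| < 1")
    rng = random.Random(seed)
    # First error drawn so that Var(u_1) = sigma^2 / (1 - rho^2).
    u = [rng.gauss(0.0, sigma) / math.sqrt(1.0 - rho ** 2)]
    for _ in range(1, n):
        u.append(rho * u[-1] + rng.gauss(0.0, sigma))
    return u

u = ar1_errors(50_000, rho=0.8, seed=7)
mean = sum(u) / len(u)
var = sum((v - mean) ** 2 for v in u) / len(u)
# Sample lag-one autocorrelation, which should be close to rho.
lag1 = sum((a - mean) * (b - mean) for a, b in zip(u[:-1], u[1:])) / (len(u) * var)
```

Starting from the stationary distribution avoids a burn-in period, which matters for the small sample sizes considered in a study of this kind.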

Since some of these estimators have now been incorporated into the Time Series Processor (TSP 5.0) [

An estimator is the best if its Adjusted Coefficient of Determination is the closest to unity.
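The selection rule can be sketched as follows, assuming the usual definition of the adjusted coefficient of determination for a model with an intercept and p regressors. The function names and the scores are hypothetical and purely illustrative; the scores are not results from the study.

```python
def adjusted_r2(y, y_hat, p):
    """Adjusted coefficient of determination for p regressors plus an intercept."""
    n = len(y)
    y_bar = sum(y) / n
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))   # residual sum of squares
    ss_tot = sum((a - y_bar) ** 2 for a in y)              # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

def best_estimator(scores):
    """Pick the estimator whose adjusted R^2 is closest to unity."""
    return min(scores, key=lambda name: abs(1.0 - scores[name]))

# Hypothetical averaged scores, for illustration only.
scores = {"OLS": 0.81, "COR": 0.93, "ML": 0.92, "PC1": 0.64, "PC2": 0.78}
winner = best_estimator(scores)
```

In a Monte-Carlo setting, the score of each estimator would be averaged over the replications before the comparison is made.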

The full summary of the simulated results of each estimator at different levels of sample size, multicollinearity and autocorrelation is contained in the work of Apata [

From these figures, it is observed that the performances of COR at each level of multicollinearity and those of ML, especially when the sample size is large, have a convex-like pattern over the levels of autocorrelation, while those of OLS, PC1 and PC2 are generally concave-like. Also, as the level of multicollinearity increases, the estimators, except the PC estimators when multicollinearity is negative, rapidly perform better, as their averaged adjusted coefficient of determination increases over the levels of autocorrelation. The PC estimators perform better as the multicollinearity level increases in absolute value. The COR and ML estimators are generally good for prediction in the presence of multicollinearity and autocorrelated error terms. However, at low levels of autocorrelation, the OLS estimator is either best or competes consistently with the best estimator, while the PC2 estimator is also either best or competes with the best when multicollinearity is high.

Specifically, according to

From

When n = 15,

According to

best except when. At these instances, the PC2 estimator is best when and at other instances, the best estimator is frequently ML or COR.

When n = 20, 30, 50 and 100, the results according to Figures 3, 4, 5 and 6 are not too different. However, from

When n = 30 from

ally best except when. At these instances, the PC2 estimator is best when. At other instances, the best estimator is frequently ML and sparsely COR.

From

instances, the PC2 estimator is best. When n = 100 from

The performances of the COR, ML, OLS and PC estimators in prediction have been critically examined under violation of the assumptions of fixed regressors, independent regressors and independent error terms. The paper has not only revealed in general how the performances of these estimators are affected by multicollinearity, autocorrelation and sample size but has also specifically identified the best estimator for prediction. The COR and ML estimators are generally best for prediction. At low levels of autocorrelation, the OLS estimator is either best or competes consistently with the best estimator, while the PC2 estimator is either best or competes with the best when the multicollinearity level is high.