
The paper introduces a new Frequentist model averaging estimation procedure, based on a stacked OLS estimator across models, implementable on cross-sectional, panel, and time series data. The proposed estimator shares the optimal properties of the OLS estimator under the usual set of assumptions concerning the population regression model. Relative to available alternative approaches, it has the advantage of performing model averaging ex-ante in a single step, optimally selecting the model weights according to the MSE metric, i.e. by minimizing the squared Euclidean distance between the vectors of actual and predicted values. Moreover, it is straightforward to implement, requiring only the estimation of a single augmented OLS regression. By exploiting a broader information set ex-ante and benefiting from more degrees of freedom, the proposed approach yields more accurate and (relatively) more efficient estimation than available ex-post methods.

The Classical Linear Regression Model (CLRM) is grounded in a basic set of assumptions concerning its specification and the distributional properties of the control variables and the error term. In this respect, under what is usually held as Assumption 1, the population regression model is required to be linear in the parameters, and the control variables are all known and included in the model. However, the latter correct specification assumption may not always be appropriate in Economics; for instance, there may be more than a single set of variables, i.e. more than a single candidate model, which can be employed in estimation, even when economic theory has clear-cut implications for the causal linkage of interest.

Consider the relationship linking y to x, when both variables can be measured in different ways, i.e. when there exist ^{1}

Two solutions have so far been proposed in the literature to the above model selection problem. On the one hand, by maintaining the assumption of correct specification, a single model out of the

Several model averaging procedures have been proposed in the literature, making use of either Bayesian or Frequentist procedures (see [

The rest of the paper is organized as follows. In Section 2, the proposed approach is illustrated by means of a simple example. The econometric methodology is then outlined in full in Section 3, while Section 4 deals with its statistical properties. Finally, Section 5 concludes.

For the sake of clarity, consider the following bivariate example

where the dependent variable y is a linear function of the independent variable x. The endogenous variable y can then be alternatively measured by

In what follows we assume that the other usual properties of the CLRM hold, i.e.


Four consistent estimates of the parameter of interest

Ex-post model averaging then yields a robust consistent estimate

For instance, within a Frequentist model averaging approach [

where the weights

where

On the other hand, the proposed model averaging strategy is single-step and implemented by means of an augmented regression model using all the available data jointly. It then requires the construction of the auxiliary dependent (

With reference to the set of models in (2), consider the stacked model obtained from their union, i.e.

where

Alternatively, the regression model can be written as

The stacked OLS problem is then stated as

yielding, after some algebra

or

where

with

The ex-ante model averaging or stacked OLS estimator of

Moreover, consistent OLS estimation of

while the stacked estimator is

Hence, the stacked OLS estimator of

Issues related to the (relative) efficiency of the stacked OLS estimator and the gain in terms of higher degrees of freedom are discussed below.
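The link between the stacked estimator and the disjoint estimators in the bivariate example can be checked numerically. The following is a minimal sketch, not the paper's own computation: it assumes a no-intercept simplification, two hypothetical measures of y and two of x (hence four candidate models), and simulated data. Under these assumptions the stacked OLS slope equals a weighted average of the four disjoint slopes, with weights proportional to the sum of squares of the regressor used in each model.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 200
beta = 1.5  # hypothetical true slope, used only to simulate data

# Two hypothetical measures of x and two of y (assumed, for illustration).
x = rng.normal(size=T)
x_measures = [x + 0.1 * rng.normal(size=T) for _ in range(2)]
y_measures = [beta * x + rng.normal(size=T) for _ in range(2)]

def ols_slope(x, y):
    """No-intercept OLS slope: (x'x)^{-1} x'y."""
    return float(x @ y) / float(x @ x)

# Disjoint (model-by-model) estimates, one per (y measure, x measure) pair.
disjoint = {
    (p, r): ols_slope(xr, yp)
    for p, yp in enumerate(y_measures)
    for r, xr in enumerate(x_measures)
}

# Stacked model: pile up the four models in the same (p, r) order.
y_star = np.concatenate([yp for yp in y_measures for _ in x_measures])
x_star = np.concatenate([xr for _ in y_measures for xr in x_measures])
stacked = ols_slope(x_star, y_star)

# Implied weights: each model's share of the total regressor variation.
total = len(y_measures) * sum(float(xr @ xr) for xr in x_measures)
weights = {k: float(x_measures[k[1]] @ x_measures[k[1]]) / total for k in disjoint}
weighted_avg = sum(weights[k] * disjoint[k] for k in disjoint)

print(stacked, weighted_avg)
```

The two printed values coincide up to floating-point error, and the weights sum to one. With an intercept the algebra is less direct, so the sketch should be read only as an illustration of the weighting logic.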

Consider the regression function

and suppose that P candidate dependent variables are available, i.e.

For simplicity, three cases for the specification of the design matrix are considered:

1) The case of a single

2) The case of R candidates for one of the K regressors in the model, ordered first for simplicity, i.e.,

3) The case of R candidates for each of the K regressors in the model, yielding up to

In case 1, up to P models could be estimated, i.e.

Their union yields the stacked model

where

Disjoint OLS estimation of the pth generic model in (15) yields (see [

while for the variance, in large samples

Ex-ante model averaging is obtained by OLS estimation of the stacked model in (16), yielding

The linkage between ex-ante and ex-post model averaging can then be gauged by noting that (19) can be stated as

where

Hence, in this case, ex-ante OLS model averaging is equivalent to ex-post arithmetic model averaging across the P disjoint OLS estimators

Similarly for

which also is the arithmetic average, across the P available models, of the disjoint estimators
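The equivalence stated for case 1 can be verified with a short simulation. The sketch below assumes simulated data, a common design matrix `X`, and P hypothetical candidate dependent variables; all names and parameter values are illustrative. It stacks the P models and compares the stacked OLS coefficients with the arithmetic average of the P disjoint OLS estimates.

```python
import numpy as np

rng = np.random.default_rng(1)
T, K, P = 150, 3, 4  # sample size, regressors, candidate dependent variables

# Common design matrix (with intercept) and hypothetical true coefficients.
X = np.column_stack([np.ones(T), rng.normal(size=(T, K - 1))])
beta = np.array([0.5, 1.0, -2.0])

# P candidate measures of the dependent variable (simulated).
ys = [X @ beta + rng.normal(size=T) for _ in range(P)]

def ols(X, y):
    """OLS coefficients via least squares."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Disjoint estimation: one OLS regression per candidate dependent variable.
disjoint = np.array([ols(X, y) for y in ys])
average = disjoint.mean(axis=0)

# Stacked estimation: a single OLS regression on the union of the P models.
X_star = np.vstack([X] * P)
y_star = np.concatenate(ys)
stacked = ols(X_star, y_star)

print(np.max(np.abs(stacked - average)))
```

The printed discrepancy is at the level of floating-point error, since the stacked cross-product matrix is P times that of a single model and the factor P cancels against the sum of the P cross-moments with the dependent variables.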

In the case of multiple design matrices, up to G regression models can be computed, with

The disjoint OLS estimator for the generic

is

while for the variance, in large samples

On the other hand, the union of the above disjoint models yields the stacked model

where

tor collecting the P ^{3}

By denoting

The stacked OLS estimator is then computed as

For the sake of simplicity, consider first the case where

The stacked OLS estimator in (28) can then be stated

where

Denote

Using matrix inversion rules^{4}, one has

where

By substitution in (31), it follows

where

Optimal ex-ante weights, contained in the

and therefore

Moreover, given

Hence,

Consider now the case in which more than a single candidate dependent variable is available, i.e.

where again

Moreover, denote

By recalling that

where, as for the previous case,

The optimal ex-ante weights, contained in the

Moreover,

Then, ex-ante model averaging estimation of the variance
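For the case of multiple design matrices, the weighting implied by stacking can be illustrated with a general algebraic identity. The sketch below is an assumption-laden illustration, not the paper's partitioned derivation: it uses a single dependent variable, G hypothetical design matrices built from noisy measures of one latent regressor, and simulated data. It checks that the stacked OLS estimator equals a matrix-weighted average of the disjoint estimators, with weights built from each model's cross-product matrix.

```python
import numpy as np

rng = np.random.default_rng(2)
T, K, G = 120, 2, 3  # sample size, regressors per model, candidate designs

# Latent regressor and G hypothetical noisy measures of it (assumed).
z = rng.normal(size=T)
Xs = [np.column_stack([np.ones(T), z + 0.2 * rng.normal(size=T)])
      for _ in range(G)]
y = 1.0 + 2.0 * z + rng.normal(size=T)

# Disjoint OLS estimators, one per design matrix.
betas = [np.linalg.solve(Xg.T @ Xg, Xg.T @ y) for Xg in Xs]

# Matrix weights: W_g = (sum_h X_h'X_h)^{-1} X_g'X_g, which sum to I_K.
S = sum(Xg.T @ Xg for Xg in Xs)
weights = [np.linalg.solve(S, Xg.T @ Xg) for Xg in Xs]
weighted = sum(W @ b for W, b in zip(weights, betas))

# Stacked OLS on the union of the G models.
X_star = np.vstack(Xs)
y_star = np.tile(y, G)
stacked = np.linalg.lstsq(X_star, y_star, rcond=None)[0]

print(stacked, weighted)
```

The two printed vectors agree up to floating-point error. Models whose regressors show more variation receive larger weights, which matches the interpretation of the weights as proportional to the relative variation of the regressors.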

Assume that the properties of the classical linear regression model hold, i.e.:

1) The population regression function is linear in the K parameters, i.e.

2)

3) The regressors

4) Any of the

5) The conditional variance covariance matrix of the residuals

Under the above assumptions (even relaxing the conditional homoskedasticity property), the disjoint OLS estimator

In so far as

since by ergodic stationarity

Moreover, in so far as

Under properties 1. to 5., by means of a CLT (see [

leading to

The asymptotic distribution of

as well as its feasible form

In the case of conditional heteroskedasticity

and

with feasible form

where

The relative efficiency of the stacked over the disjoint OLS estimator can be established by comparing their asymptotic variances, i.e.

which is a finite, symmetric, positive semidefinite

Finally, the gain in terms of degrees of freedom yielded by the stacked over the disjoint OLS estimator is equal to

which is of rank equal to

and

which is of rank

The increase in degrees of freedom yielded by stacked over disjoint OLS estimation is then

If the stronger assumption of strict exogeneity is made in 3. above, i.e.

unbiased and efficient (within the class of linear estimators) (see [^{5} Moreover, if the assumption of conditional Normality of the error term is added, i.e.

where

The above properties can also be established for the stacked OLS estimator, in the same way as for the disjoint OLS estimator (see [

with

where

Then, by comparing the conditional variances of

as for the asymptotic case. Moreover,

which similarly is a finite, symmetric and positive semidefinite

Finally, the gain in terms of degrees of freedom yielded by stacked over disjoint OLS estimation is again

The paper introduces an ex-ante model averaging approach, requiring the estimation of a single augmented model obtained from the union of all the possible candidate models, rather than their disjoint estimation. In this framework, optimal weights are implicitly computed according to the MSE metric, i.e. by minimizing the squared Euclidean distance between the vectors of actual and predicted values, and are proportional to the relative variation of the regressors. By exploiting ex-ante all the available information on the various candidate sets of variables, and relying on more degrees of freedom, it leads to more accurate and (relatively) more efficient estimation than available ex-post methods. Moreover, the proposed estimator shares the optimal properties of the disjoint OLS estimator, under the usual set of assumptions concerning the population regression model. While the method is proposed for use within the OLS framework, its extension to GIVE and GMM is straightforward. We point to [

The author is grateful to the referees for their comments. This project has received funding from the European Union’s Seventh Framework Programme for research, technological development and demonstration under grant agreement no. 320278, 2013-2015. The flowers are supported by the branches/The trunk supports the branches/The roots support the trunk/But we do not see the roots (Mitsuo Aida).

Claudio Morana (2015) Model Averaging by Stacking. Open Journal of Statistics, 5, 797-807. doi: 10.4236/ojs.2015.57079