Unbalanced Regressions and Spurious Inference

Spurious regression has been extensively studied in time series econometrics since Granger and Newbold’s [1] seminal paper. Recently, it has been argued that this phenomenon is due to a mistreatment of short-range autocorrelation in the residuals of the regression when at least one of the variables in a bivariate regression is stationary. HAC errors, feasible GLS and Cochrane-Orcutt-type procedures are then proposed to draw correct inference. Such a proposal should be considered with caution, since nonsense inference might also be due to deterministic trend mechanisms, structural breaks, and long-range dependence. In these cases, standard autocorrelation correction procedures would not solve the problem of spurious regression. We aim to make the latter argument clear.

Spurious regression has been extensively studied since Granger and Newbold’s [1] seminal paper, in which independent nonstationary variables are simulated and then used to estimate a simple bivariate regression. Phillips [2] provided the theoretical framework to understand the phenomenon in the simplest case (independent driftless unit root processes). Since then, the spurious regression phenomenon has been identified for many data-generating processes (DGPs), such as unit roots with drift, (broken-)trend stationary processes and long-range dependent processes, for example.¹
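The experiment described above can be sketched in a few lines. The following is a minimal illustration, not the original design of Granger and Newbold [1]: the sample size, the number of replications, the Gaussian innovations and the helper names (`ols_t_ratio`, `random_walk`, `white_noise`, `rejection_rate`) are all assumptions made for this example. It regresses one driftless random walk on another, independent one and records how often the usual t-test rejects at the nominal 5% level.

```python
import math
import random


def ols_t_ratio(y, x):
    """t-ratio of the slope in the OLS regression of y on a constant and x."""
    n = len(y)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    return slope / se


def random_walk(n, rng):
    """Driftless unit root process with standard normal innovations."""
    level, path = 0.0, []
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path


def white_noise(n, rng):
    """iid standard normal series, used as a benchmark."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def rejection_rate(make_series, n, reps, seed=0, crit=1.96):
    """Share of replications in which |t| exceeds the nominal 5% critical value."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        y = make_series(n, rng)
        x = make_series(n, rng)  # drawn independently of y
        if abs(ols_t_ratio(y, x)) > crit:
            hits += 1
    return hits / reps


if __name__ == "__main__":
    print("independent random walks:", rejection_rate(random_walk, 100, 500))
    print("independent white noises:", rejection_rate(white_noise, 100, 500))
```

Under these assumptions, the rejection frequency for the random walks should lie far above 5%, while the white-noise benchmark should stay close to the nominal level.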
Here, we are concerned with the results presented in Noriega and Ventosa-Santaulària [4] and Stewart [5] pertaining to the spurious regression phenomenon under the following conditions: 1) both variables, $y_t$ and $x_t$ (see Equation (1)), are stationary (i.e., integrated of order 0, I(0)), and 2) at least one of the variables (the regressor or the regressand) is integrated of order 1, I(1). The latter combinations result in an unbalanced regression, and Noriega and Ventosa-Santaulària [4] found that, in a simple regression specification,
$$y_t = \alpha + \beta x_t + u_t, \qquad (1)$$
where either $y_t$ or $x_t$ (or both) is I(0),² the t-ratio associated with $\beta$, $t_\beta$, does not diverge as the sample size grows; i.e., $t_\beta = O_p(1)$.
The results in Noriega and Ventosa-Santaulària [4] imply that the asymptotic spurious regression phenomenon does not occur. Nevertheless, nonsense inference cannot be fully discarded. In a recent paper, Stewart [5] argues that, although the t-ratio does not diverge, it does not necessarily converge to a standard normal distribution. Furthermore, in the absence of autocorrelation in the DGP's innovations, the t-ratio behaves asymptotically as a standard normal only when both variables are iid I(0) processes. Other DGP combinations, such as $y_t \sim I(0)$ with $x_t \sim I(1)$ and vice versa, do have asymptotically nonstandard t-ratios. Nevertheless, the size distortions are better explained by the presence of autocorrelation in the DGP innovations. Stewart [5] illustrates this point through a number of finite-sample experiments. The problem comes as no surprise, since the estimated residuals behave as an autocorrelated process and size distortions should be expected in that case. Moreover, the use of heteroskedasticity and autocorrelation consistent (HAC) errors considerably reduces size distortions in some cases, as Stewart [5] argues. Table 1 summarizes the relevant DGPs for both the dependent and the explanatory variables, similar to those used by Noriega and Ventosa-Santaulària [4] and Stewart [5], to estimate a simple linear specification.
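The role of autocorrelated innovations and of HAC errors can be illustrated with a small Monte Carlo sketch. It does not replicate the experiments of Stewart [5]; the DGPs (a stationary AR(1) regressand and an independent random walk regressor), the sample size, the Bartlett-kernel Newey-West estimator and the rule-of-thumb bandwidth are assumptions chosen for illustration, as are all function names.

```python
import math
import random


def slope_t_ratios(y, x, lags):
    """Return (OLS t-ratio, Newey-West t-ratio) for the slope of y on (1, x)."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    xd = [xi - mx for xi in x]  # demeaned regressor
    sxx = sum(v * v for v in xd)
    slope = sum(v * (yi - my) for v, yi in zip(xd, y)) / sxx
    resid = [(yi - my) - slope * v for v, yi in zip(xd, y)]
    se_ols = math.sqrt(sum(u * u for u in resid) / (n - 2) / sxx)
    # Newey-West estimate for the score x_t * u_t, with Bartlett weights.
    score = [v * u for v, u in zip(xd, resid)]
    lrv = sum(s * s for s in score)
    for lag in range(1, lags + 1):
        weight = 1.0 - lag / (lags + 1.0)
        lrv += 2.0 * weight * sum(score[t] * score[t - lag] for t in range(lag, n))
    se_hac = math.sqrt(lrv) / sxx
    return slope / se_ols, slope / se_hac


def ar1(n, rho, rng):
    """Stationary AR(1) series, I(0); rho = 0 gives iid noise. Requires |rho| < 1."""
    value = rng.gauss(0.0, 1.0 / math.sqrt(1.0 - rho * rho))  # stationary start
    path = []
    for _ in range(n):
        value = rho * value + rng.gauss(0.0, 1.0)
        path.append(value)
    return path


def random_walk(n, rng):
    """Driftless I(1) series with iid standard normal increments."""
    level, path = 0.0, []
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path


def rejection_rates(rho, n=200, reps=300, seed=0, crit=1.96):
    """Rejection frequencies (OLS, HAC) of the nominal 5% t-test on the slope."""
    rng = random.Random(seed)
    lags = int(4 * (n / 100.0) ** (2.0 / 9.0))  # rule-of-thumb bandwidth (assumption)
    ols_hits = hac_hits = 0
    for _ in range(reps):
        y = ar1(n, rho, rng)     # I(0) regressand
        x = random_walk(n, rng)  # independent I(1) regressor
        t_ols, t_hac = slope_t_ratios(y, x, lags)
        ols_hits += int(abs(t_ols) > crit)
        hac_hits += int(abs(t_hac) > crit)
    return ols_hits / reps, hac_hits / reps


if __name__ == "__main__":
    print("iid innovations (OLS, HAC):  ", rejection_rates(0.0))
    print("AR(1) innovations (OLS, HAC):", rejection_rates(0.8))
```

With iid Gaussian innovations the OLS rejection frequency should stay near 5%. With `rho = 0.8` it should rise sharply, and in this particular design the HAC t-ratio should reject less often than the OLS one without necessarily restoring the nominal size.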
For simplicity, we assume that the innovations, $\epsilon_{zt}$ for $z = x, y$, are iid white noise. Following Noriega and Ventosa-Santaulària [4] and using the aforementioned DGPs, we present the following corollary:
Corollary. Let $y_t$ and $x_t$ be generated by DGPs $i$ and $j$ of Table 1. Denote by $i$-$j$ the DGP combination that generated $y$ and $x$, respectively, and use them to [...]
The results in the corollary reveal that the asymptotic distribution of the t-ratio is nonstandard when the regression is unbalanced. However, a simple simulation of the asymptotic distribution shows a striking resemblance between this distribution and a standard normal (insets (a) and (b) in Figure 1). Such resemblance fades in the presence of autocorrelation (insets (c) and (d) in Figure 1). We therefore confirm that the size distortions pointed out by Stewart [5] are due to autocorrelation; the latter happens to be an important source of spurious regression when at least one of the variables is I(0), which confirms the results of Granger, Hyung and Jeon [6] and Mikosch and Vries [7]. Nevertheless, short-range autocorrelation should not be considered the sole source of spurious inference. It is well documented that deterministic trends, structural breaks, and long-range dependence also generate nonsense inference (see Perron [8] and Tsay and Chung [9]). It is important to note that the latter cannot be prevented by using Cochrane-Orcutt or Feasible GLS.
¹ For a recent survey, see Ventosa-Santaulària [3].
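The claim that structural breaks generate nonsense inference among stationary series can be illustrated with a short sketch. The design is an assumption made for this example only: two independent iid Gaussian series, each with a single level shift of the same size at a different, fixed break date. It shows only the OLS t-test and does not evaluate any correction procedure.

```python
import math
import random


def ols_t_ratio(y, x):
    """t-ratio of the slope in the OLS regression of y on a constant and x."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return slope / math.sqrt(rss / (n - 2) / sxx)


def shifted_noise(n, shift, break_fraction, rng):
    """iid N(0, 1) noise whose mean jumps by `shift` from the break date on."""
    break_date = int(break_fraction * n)
    return [rng.gauss(0.0, 1.0) + (shift if t >= break_date else 0.0)
            for t in range(n)]


def rejection_rate(shift, n=200, reps=300, seed=0, crit=1.96):
    """Share of replications in which the nominal 5% t-test rejects."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        # The noise terms are independent; only the level shifts line up in time.
        y = shifted_noise(n, shift, 0.4, rng)
        x = shifted_noise(n, shift, 0.6, rng)
        hits += int(abs(ols_t_ratio(y, x)) > crit)
    return hits / reps


if __name__ == "__main__":
    print("no level shift:       ", rejection_rate(0.0))
    print("level shift of size 2:", rejection_rate(2.0))
```

Without a shift the test should reject at roughly the nominal rate; with a shift of two standard deviations in both series it should reject in most replications, although the innovations themselves are neither autocorrelated nor related to each other.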
Using standard correction procedures to deal with the spurious regression phenomenon is tempting, even if such procedures cannot always provide correct inference (see Stewart [5] and McCallum [10], for example). Sun [11] proposed a convergent t-statistic using modified HAC errors with a bandwidth proportional to the sample size when the variables are highly persistent. The author acknowledges, however, that such a procedure cannot be used in empirical applications, since the limit distribution of the test depends on the memory parameter under the null hypothesis and critical values cannot, therefore, be tabulated. McCallum [10] and Kolev [12] also advocate classical correction procedures to deal with spurious regressions, such as the Cochrane-Orcutt procedure and Feasible Generalized Least Squares. They argue that using them reduces the size distortions of the t-test. However, Martínez-Rivera and Ventosa-Santaulària [13] proved that such methods are not always effective and remain highly dependent on the DGP of the series.⁴
³ $\xrightarrow{d}$ denotes convergence in distribution.
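For concreteness, the Cochrane-Orcutt procedure mentioned above can be sketched as follows. This is a textbook-style iterated version for AR(1) errors, written for illustration rather than taken from any of the cited papers; the tolerance, the iteration cap, the simulated DGPs and all function names are assumptions. The simulation uses a favourable case (AR(1) innovations, which the procedure is designed for), so it says nothing about breaks or long memory.

```python
import math
import random


def ols(y, x):
    """OLS of y on a constant and x; returns (intercept, slope, slope t-ratio)."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return intercept, slope, slope / math.sqrt(rss / (n - 2) / sxx)


def cochrane_orcutt(y, x, tol=1e-4, max_iter=25):
    """Iterated Cochrane-Orcutt for AR(1) errors; returns (slope, t-ratio, rho)."""
    n = len(y)
    intercept, slope, t_ratio = ols(y, x)
    rho = 0.0
    for _ in range(max_iter):
        resid = [yi - intercept - slope * xi for yi, xi in zip(y, x)]
        new_rho = (sum(resid[t] * resid[t - 1] for t in range(1, n))
                   / sum(resid[t - 1] ** 2 for t in range(1, n)))
        # Quasi-difference both series (the first observation is dropped).
        ys = [y[t] - new_rho * y[t - 1] for t in range(1, n)]
        xs = [x[t] - new_rho * x[t - 1] for t in range(1, n)]
        a_star, slope, t_ratio = ols(ys, xs)
        intercept = a_star / (1.0 - new_rho)  # undo the (1 - rho) scaling
        converged = abs(new_rho - rho) < tol
        rho = new_rho
        if converged:
            break
    return slope, t_ratio, rho


def ar1(n, rho, rng):
    """Stationary AR(1) series with standard normal innovations (|rho| < 1)."""
    value = rng.gauss(0.0, 1.0 / math.sqrt(1.0 - rho * rho))
    path = []
    for _ in range(n):
        value = rho * value + rng.gauss(0.0, 1.0)
        path.append(value)
    return path


def random_walk(n, rng):
    """Driftless I(1) series with iid standard normal increments."""
    level, path = 0.0, []
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path


def rejection_rates(rho, n=200, reps=200, seed=0, crit=1.96):
    """Rejection frequencies (OLS, Cochrane-Orcutt) for independent y and x."""
    rng = random.Random(seed)
    ols_hits = co_hits = 0
    for _ in range(reps):
        y = ar1(n, rho, rng)     # I(0) regressand with AR(1) dynamics
        x = random_walk(n, rng)  # independent I(1) regressor
        ols_hits += int(abs(ols(y, x)[2]) > crit)
        co_hits += int(abs(cochrane_orcutt(y, x)[1]) > crit)
    return ols_hits / reps, co_hits / reps


if __name__ == "__main__":
    print("rejection rates (OLS, Cochrane-Orcutt):", rejection_rates(0.8))
```

In this design the corrected t-ratio should reject much less often than the OLS one. As the text stresses, that outcome depends on the DGP and should not be expected to carry over to other sources of spurious inference.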

Concluding Remarks
There is finite-sample evidence showing that spurious inference in unbalanced regressions mostly occurs when the innovations of the DGPs are not iid. In that sense, standard autocorrelation-correction procedures, such as HAC errors, Feasible GLS and Cochrane-Orcutt estimates, have been advanced to eliminate or reduce the size distortions and, thus, spurious inference. This approach should, nevertheless, be reconsidered. First, there is evidence that spurious regression using stationary series cannot always be interpreted as a short-range autocorrelation phenomenon: long-range dependence and structural breaks (level shifts, for example) also cause spurious inference; spurious regression cannot, therefore, always be corrected using classical procedures. Second, an unbalanced regression (in which the orders of integration of the series involved are not the same) is an empirical situation whose relevance remains to be proved. The estimation of an unbalanced regression is not intuitive, although there are cases, such as the predictive equation in the finance literature, in which market returns (usually found to be stationary) are regressed on the dividend yield (stationary but highly persistent). Spurious regression cannot simply be considered a short-memory autocorrelation phenomenon and cannot, therefore, be treated using standard procedures. The main conclusion is, therefore, twofold: 1) practitioners should interpret their results cautiously whenever they find evidence of autocorrelation, since the inference could be spurious; 2) they should, however, be aware that spurious regressions arise for many diverse reasons, autocorrelation being only one of them; standard autocorrelation correction procedures are not to be considered the sole solution to prevent spurious inference; on the contrary, parameter stability, long memory and cointegration tests should always be considered as well.