A Hausman Type Test for Differences between Least Squares and Robust Time Series Factor Model Betas

Robust regression is playing an increasingly important role in fitting time series and cross-section factor models for stock returns. We introduce and study the properties of a Hausman type test for comparing factor model regression coefficients computed with LS, which is fully efficient under idealized normal data distributions, and a robust MM-estimate, which is highly efficient for normally distributed data but also controls variance inflation and bias for outlier-generating non-normal data distributions. The test is based on the asymptotic distribution of the difference between the two estimators, one of which is fully efficient. The test can detect a significant difference between the LS and Robust estimates due to the inefficiency of the LS estimator under outlier-generating non-normal error distributions, and due to bias of the LS estimator relative to the Robust estimator caused by bias-inducing distributions. The efficacy of the new test in applications is demonstrated for comparisons of LS and Robust estimates of both CAPM betas and Fama-French three-factor model betas. Monte Carlo studies of the finite sample level and power of the test reveal good performance for sample sizes of at least 100 to 200, which are typical for weekly and daily returns in such models.

Factor models for asset returns play a central role in empirical asset pricing and quantitative portfolio management research, for which a very large literature exists. Examples in empirical asset pricing include papers by Fama and French [1] [2] [3], Hou et al. [4] [5], Feng et al. [6], and the overview book by Bali et al. [7]. Examples in quantitative portfolio management include significant coverage in books such as Grinold et al. [8] and Qian et al. [9], and papers such as Menchero and Mitra [10], Menchero and Davis [11], Ding and Martin [12], and Ding et al. [13]. The main types of factor models appearing in the literature are cross-section factor models and time series factor models, both of which are specific forms of linear regression models.
Linear regression models in quantitative finance are universally fit using ordinary least squares (LS) estimates of the coefficients or weighted least squares (WLS) estimates. Both LS and WLS estimates are relatively simple, widely available in software packages, and blessed by being the best linear unbiased estimates (BLUE) under standard assumptions. In addition, LS estimates are the best among both linear and nonlinear estimates when the errors are normally distributed. However, asset returns and factors often have quite non-normal distributions, and LS coefficient estimates are quite non-robust toward outliers in that they can be very adversely distorted by even one or a few outliers. In statistical terms, LS estimates can suffer from a substantial loss of efficiency when the errors have a fat-tailed non-normal distribution, in that they can have much larger variances than maximum-likelihood estimates (MLEs) for such non-normal distributions. Furthermore, under some types of deviations from normality LS estimates will be biased, even asymptotically as the sample size goes to infinity.
Fortunately, several robust factor model fitting alternatives to LS estimates exist that suffer relatively little from severe inefficiency and bias. See for example the books by Huber [14], Huber and Ronchetti [15], Hampel et al. [16], Rousseeuw et al. [17], and Maronna et al. [18], and the references therein. See also the papers on robust time-series estimation of CAPM betas by Martin and Simin [19], Bailer et al. [20], and the paper on robust cross-section factor models by Martin and Xia [21]. Various types of outlier-robust regression methods are implemented in commercial statistical software programs such as SAS and STATA, and in the open-source R packages robust, robustbase, and RobStatTM that are available on CRAN (https://cran.r-project.org/). Regression M-estimates of one form or another are the most widely used robust regression methods.
Statistical inference methods for robust regression coefficients such as robust t-tests, F-tests, robust R-squared, and robust model selection criteria have been available in the literature for many years; these are described in Maronna et al. [18] and are available in the companion R package RobStatTM. On the other hand, the literature on statistical tests for evaluating the difference between LS and robust regression fits is minimal. In this regard, we recall Tukey [22], who stated: "It is perfectly proper to use both classical and robust/resistant methods routinely, and only worry when they differ enough to matter. But when they differ, you should think hard." This is good advice that leaves open the question of how much is "enough" in "when they differ enough", and it is highly desirable to have a reliable test statistic whose rejection region defines "enough".
If such a test statistic has a reliable level and adequate power, then acceptance of an appropriately defined null hypothesis would lead a user who routinely computes both LS and robust regressions to be confident in the LS results. On the other hand, rejection of the null hypothesis would support reliance on the robust regression estimate and associated robust inferences. Unfortunately, there does not at present exist a well-accepted statistical test for determining whether LS and robust regression estimates differ significantly from one another. We propose and study the properties of a viable test in which the robust regression estimator is an MM-estimator that is well known in the robust statistics literature.
Our test statistic is focused on differences between LS and Robust MM-estimator factor model slope coefficients, based on a key idea in the specification tests paper by Hausman [23]. We consider composite null and alternative hypotheses where the null hypothesis is that of a linear regression factor model with errors that are normally distributed. The alternative hypothesis consists of outlier generating non-normal error distributions as well as more general types of bias-inducing joint distributions for the returns and factor variables. Rejection of the null hypothesis can occur due to any of the following LS estimator behaviors: inefficiency only, bias only, or both inefficiency and bias.
The novelty of our results is that for the first time there is a reliable significance test for differences between LS and Robust estimates of time series and cross-section factor model coefficients, for selected subsets of coefficients as well as the set of all coefficients. In particular, rejection of the null hypothesis that the data is normally distributed will lead the analyst or risk manager to favor the use of the Robust estimator model fit for risk and performance analysis, and to carry out further analysis to determine the extent and type of non-normality that gives rise to the rejection of the null hypothesis.

Robust Regression MM-Estimates
We consider estimation in a linear regression time-series factor model of the form

y_t = α + x_t′β + ε_t, t = 1, 2, ..., T,  (1)

with the assumption that the observed data (y_t, x_t), t = 1, 2, ..., T, consist of independent and identically distributed random variables. Here, y_t is a return of a specific asset at time t, typically in excess of a risk-free rate, x_t is a vector of K factor returns at time t, α is an unknown intercept, β is a K-dimensional vector of unknown regression slope coefficients, and the ε_t are the regression errors.
Major applications of such time series factor models in finance include:
• The CAPM model with K = 1, where y_t = r_t^e is an asset return in excess of a risk-free rate at time t, and x_{1,t} is a market return in excess of a risk-free rate at time t.
• The Fama-French [2] three-factor model (FF3) with K = 3, where x_{1,t} is a market excess return, x_{2,t} is a small-minus-big (SMB) factor return, and x_{3,t} is a high-minus-low (HML) factor return.
• The Fama-French-Carhart [24] 4-factor model (FFC4), which adds a momentum factor to the FF3 model.
We focus on the important class of robust regression MM-estimators introduced and studied by Yohai [25], which have both the highest possible breakdown point (BP) of 0.5 and high efficiency at normal distributions. Efficiency here is defined as the ratio of the variance of the LS estimator to that of the robust estimator when the errors are normally distributed. Since LS has the minimum possible variance at a normal distribution, the efficiency of an MM-estimator, expressed as a percent, is less than 100%. Typically, an efficiency of 85% to 95% is considered high. A regression MM-estimate of θ = (α, β′)′ is obtained by first computing a high-breakdown-point but relatively inefficient initial estimate θ̂₀, and then computing a final estimate θ̂ as the nearest local minimum of

Σ_{t=1}^T ρ_c((y_t − α − x_t′β)/σ̂)  (2)

with respect to θ, where σ̂ is a highly robust scale estimate of the residuals. The parameter c is a tuning parameter used to control the trade-off between a high normal distribution efficiency of the estimate and robustness toward outliers, which we discuss subsequently. With ψ_c = ρ_c′, the resulting θ̂ satisfies the stationary local minimum condition¹

Σ_{t=1}^T ψ_c((y_t − α̂ − x_t′β̂)/σ̂) x̃_t = 0,  (3)

where x̃_t = (1, x_t′)′.
A well-established method of computing σ̂ and solving the minimization problem (2) was developed by Yohai et al. [26], and is briefly described in Appendix B for the interested reader. See also Section 5.5 in Maronna et al. [18].
Martin et al. [27] demonstrated that to obtain bias-robustness toward outliers, one needs to use a bounded loss function ρ_c. The most popular choice of a bounded loss function is the well-known Tukey bisquare function, and the analytic expressions for the bisquare ρ and ψ functions are given in Appendix A.
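For concreteness, here is a minimal Python sketch of the bisquare ρ and ψ functions, using the common unnormalized form in which ψ_c = ρ_c′ exactly (the function names and this particular normalization are ours; the exact expressions used in the paper are those of Appendix A):

```python
import numpy as np

def rho_bisquare(u, c=4.685):
    """Tukey bisquare loss: (c^2/6)(1 - (1 - (u/c)^2)^3) for |u| <= c,
    constant c^2/6 for |u| > c, so each large outlier adds only a bounded loss."""
    t = np.clip(np.asarray(u, dtype=float) / c, -1.0, 1.0)
    return (c**2 / 6.0) * (1.0 - (1.0 - t**2) ** 3)

def psi_bisquare(u, c=4.685):
    """psi = rho', i.e. u(1 - (u/c)^2)^2 inside [-c, c] and 0 outside,
    so extreme outliers receive zero weight in the estimating equation."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= c, u * (1.0 - (u / c) ** 2) ** 2, 0.0)
```

The default c = 4.685 is the standard tuning constant giving 95% normal distribution efficiency; smaller c trades efficiency for robustness.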
Versions of the bisquare loss functions for normal distribution efficiencies of 85%, 90%, 95%, and 99% are shown in the left plot of Figure 1, and the corresponding ψ functions in the right plot. The values of the tuning parameter c that yield the four efficiencies of 85%, 90%, 95%, and 99% are given in Table 1.¹
¹Throughout this paper, a prime on a scalar-valued function, e.g., ρ_c′, denotes its derivative; otherwise a prime denotes the transpose of a vector or matrix.
where x_t has a finite positive definite covariance matrix C_x. See also Chapter 10 in [18]. In what follows we focus on the LS and MM-estimators of the slopes vector β in (1).
Under model (4) the asymptotic covariance matrix of the LS estimator β̂_LS is σ²C_x⁻¹, and the asymptotic covariance matrix of the MM-estimator β̂_MM is τσ²C_x⁻¹, where

τ = E ψ_c²(ε/σ) / [E ψ_c′(ε/σ)]².

Equivalently, the MM covariance matrix is (σ²/EFF) C_x⁻¹, where EFF = 1/τ is the large-sample normal distribution efficiency of the MM-estimator, computed with ε normally distributed with mean 0 and standard deviation σ.
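The normal distribution efficiency of a bisquare MM-estimate for a given tuning constant c can be checked numerically. The sketch below (our own helper, not from the paper) evaluates EFF = [Eψ_c′(ε)]²/Eψ_c²(ε) for standard normal ε by simple Riemann summation on a fine grid:

```python
import numpy as np

def bisquare_normal_efficiency(c, n=400001, lim=10.0):
    """EFF = (E psi')^2 / E psi^2 under a standard normal error,
    approximated by a Riemann sum on a fine grid over [-lim, lim]."""
    u = np.linspace(-lim, lim, n)
    du = u[1] - u[0]
    phi = np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)  # normal density
    t = u / c
    inside = np.abs(u) <= c
    psi = np.where(inside, u * (1.0 - t**2) ** 2, 0.0)
    dpsi = np.where(inside, (1.0 - t**2) * (1.0 - 5.0 * t**2), 0.0)
    e_psi2 = np.sum(psi**2 * phi) * du
    e_dpsi = np.sum(dpsi * phi) * du
    return e_dpsi**2 / e_psi2
```

For example, bisquare_normal_efficiency(4.685) evaluates to approximately 0.95, consistent with the standard tuning for 95% efficiency, and larger c pushes the efficiency toward 1.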
A finite-sample approximation to the covariance matrix of β̂_MM is obtained by computing estimates of τ, σ and C_x. We use a method of doing so proposed by Yohai et al. [26] that is described in Sections 5.5 and 5.6 of Maronna et al. [18], and implemented in the function lmRob in the R robust library. A brief summary of the method is given in Appendix B.
Here we discuss the behavior of the ordinary least squares (LS) and robust MM estimators under several distinct situations with respect to the joint distribution of the data. First, when the error distribution F_ε in model (4) is normal, LS is consistent and fully efficient, and MM is consistent with high efficiency that can be set by the user, e.g., 90% or 95% normal distribution efficiency is common. Second, when F_ε in (4) is a non-normal distribution with fat tails but finite variance, the LS estimator is consistent but can have an efficiency arbitrarily close to zero, while MM-estimators are consistent and can have high efficiencies.
A common approach in robustness studies, which allows for more general types of (y_t, x_t) outliers than those generated by model (4) with a fat-tailed error distribution, is to use a broad family of mixture distributions

F = (1 − γ)F₀ + γH,  (10)

where F₀ is given by (4), the mixing parameter γ is positive and often small, e.g., in the range 0.01 to 0.1, and H is unrestricted. This family of models is motivated by the empirical evidence that most of the time the data are generated by the nominal distribution F₀, but with small probability γ the data come from another distribution H that can generate a wide variety of outlier types. In the context of the distribution model (10), the goal is to obtain good estimates of the parameters of F₀. Unfortunately, the LS estimator of θ can be not only highly inefficient but also highly biased for some outlier-generating distributions H. Modern robust regression MM-estimators have been designed to minimize the maximum bias due to unrestricted distributions H in the data distribution model (10), while also obtaining high efficiency when γ = 0 in (10).
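As an illustration of the contamination model (10), the following sketch draws regression data from a two-component mixture in which a fraction γ of the observations come from an outlier-generating component H; this particular choice of H (joint location shifts in x_t and ε_t) and all parameter values are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_contaminated(n, gamma=0.05, mu_x=5.0, mu_e=5.0,
                        alpha=0.0, beta=1.0):
    """Draw (y_t, x_t) from F = (1 - gamma) F0 + gamma H, where F0 is the
    nominal normal-error regression model and H shifts both the factor and
    the error, producing bias-inducing outliers."""
    is_outlier = rng.random(n) < gamma
    x = rng.standard_normal(n) + np.where(is_outlier, mu_x, 0.0)
    eps = rng.standard_normal(n) + np.where(is_outlier, mu_e, 0.0)
    y = alpha + beta * x + eps
    return y, x, is_outlier
```

With γ = 0.05 roughly 5% of the generated pairs come from H; a robust fit should be nearly unaffected by them, while LS need not be.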

Test Statistic
The test is designed to test a null hypothesis of a regression model (4) with normally distributed errors. The test is expected to reject when the difference between the LS and MM estimates is sufficiently large. It follows that under normality the variances of the two estimators of a single slope coefficient β_i satisfy

Var(β̂_LS,i) = EFF · Var(β̂_MM,i).  (11)

In view of (11), the variance of the difference of the two estimates may be written in the following alternative form:

Var(β̂_MM,i − β̂_LS,i) = Var(β̂_MM,i) − Var(β̂_LS,i) = (1 − EFF) · Var(β̂_MM,i).  (12)

A multi-parameter large-sample version of the above result was obtained by Hausman [23] in his classic paper on specification tests.² Hausman's Corollary 2.6 to Lemma 2.1 states that the asymptotic covariance matrix of the difference between two consistent and asymptotically normal estimators, one of which is asymptotically fully efficient and the other of which is inefficient, is equal to the covariance matrix of the inefficient estimator minus the covariance matrix of the efficient estimator. Thus in our case under normality, we have the asymptotic covariance matrix relationship

Cov(β̂_MM − β̂_LS) = Cov(β̂_MM) − Cov(β̂_LS),

and in view of the asymptotic result (8) we have

Cov(β̂_MM − β̂_LS) = (1/EFF − 1) · σ²C_x⁻¹.  (13)

Note that (13) holds only under normality, because only in that case is LS fully efficient and the MM-estimate inefficient. A result analogous to (12) will hold whenever the MM-estimator plays the role of the efficient maximum likelihood estimator. For a subset of k slope coefficients with estimated difference d̂ = β̂_MM − β̂_LS and estimated covariance matrix V̂ of that difference, (12) and (13) suggest the statistics

T_k = d̂′ V̂⁻¹ d̂  (14)

and, for an individual coefficient,

T_i = (β̂_MM,i − β̂_LS,i) / ((1 − EFF)^{1/2} · ŝe(β̂_MM,i)).  (15)

Under the null hypothesis of normally distributed errors, the statistic T_k will have approximately a chi-squared distribution with k degrees of freedom, and the statistic T_i will have approximately a standard normal distribution. The extent to which such approximations are valid is explored in Section 5.
²We thank Professor Eric Zivot for pointing out this reference.
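For a single slope coefficient, the scalar version of the test can be computed directly from the two estimates and their standard errors via the Hausman variance-difference relationship. The sketch below uses our own variable names, and the two-sided p-value follows the standard normal approximation discussed above:

```python
import math

def hausman_t(beta_ls, beta_mm, se_ls, se_mm):
    """Scalar Hausman-type statistic for one slope coefficient.

    Under the normal null, Var(MM - LS) = Var(MM) - Var(LS), so the
    standard error of the difference is sqrt(se_mm^2 - se_ls^2).
    Returns the t-like statistic and a two-sided normal p-value."""
    se_diff = math.sqrt(se_mm**2 - se_ls**2)
    t = (beta_mm - beta_ls) / se_diff
    p = math.erfc(abs(t) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|t|))
    return t, p
```

A large |t| (equivalently, a small p) signals that the LS and robust fits "differ enough to matter" in Tukey's sense.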

Two Time Series Factor Model Examples
In this section, we present two pairs of empirical examples of using the proposed test statistic T for determining significant differences between the classical LS estimator and the robust bisquare MM-estimator with 95% normal distribution efficiency.³
³In principle one might also use the estimate of the standard error of the difference based on the LS standard error. While this estimate should result in decent accuracy of level in finite sample sizes, we conjecture that it will result in lower power under non-normal alternatives due to LS estimates having higher variance than MM estimates.

Single Factor CAPM Time Series Model
The single factor model Beta of a set of asset returns is the slope coefficient in a regression of the asset returns on market returns, where both returns are in excess of a risk-free rate. Beta plays a central role in the capital asset pricing model (CAPM) [29] and is one of the most widely known and widely used measures of the expected excess return and market risk of an asset. Figure 2 shows a scatter plot of the asset returns versus the market returns,⁴ along with the LS and Robust straight-line fits, for a stock whose returns contain 6 outliers.⁵ The standard error (SE) and p-value of the test T for the difference in the two betas are reported in Table 2. Recall from Equation (15) that the SE of the difference is just a fixed multiple of the SE of the Robust beta estimate. It is interesting to see how the test statistic behaves on a data set that is identical to that of Figure 2, except for deleting the 6 outliers of Figure 2. The resulting scatter plot shown in Figure 3 reveals that the LS and Robust estimator coefficients and straight-line fits are now virtually identical. This result illustrates the important characteristic of a good robust fitting method that it gives almost the same results as LS when the data contain no influential outliers, which is also reflected in the high normal distribution efficiency of 95% for the Robust estimator. Not surprisingly then, the Table 3 p-value of 0.387 indicates no significant difference between the LS and Robust estimates.
⁴The stock returns data used in this paper are from the "Center for Research in Security Prices, LLC".
⁵Outliers here are defined as asset and market return pairs for which the absolute value of the robust bisquare estimator residual exceeds 3 times a robust residual scale estimate.
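To illustrate how a robust fit resists the kind of outliers seen in Figure 2, the following self-contained sketch contrasts LS with a simplified Huber M-estimate computed by iteratively reweighted least squares on synthetic data. This is an illustrative stand-in for the bisquare MM-estimator, not the lmRob procedure used in the paper, and the data are simulated, not CRSP returns:

```python
import numpy as np

def ls_fit(y, x):
    """Ordinary LS intercept and slope."""
    X = np.column_stack([np.ones_like(x), x])
    return np.linalg.lstsq(X, y, rcond=None)[0]

def huber_irls_fit(y, x, k=1.345, iters=100):
    """Huber M-estimate via iteratively reweighted LS with a MAD residual
    scale; residuals beyond k scale units get weight k*s/|r| < 1."""
    X = np.column_stack([np.ones_like(x), x])
    theta = np.linalg.lstsq(X, y, rcond=None)[0]
    for _ in range(iters):
        r = y - X @ theta
        s = 1.4826 * np.median(np.abs(r - np.median(r)))  # MAD scale
        if s <= 0:
            break
        u = np.abs(r / s)
        w = np.minimum(1.0, k / np.maximum(u, 1e-12))
        sw = np.sqrt(w)
        theta = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
    return theta

rng = np.random.default_rng(42)
x = rng.standard_normal(100)
y = 1.0 * x + 0.1 * rng.standard_normal(100)  # true beta = 1
x[0], y[0] = 10.0, -20.0                      # one high-leverage outlier
b_ls = ls_fit(y, x)[1]
b_rob = huber_irls_fit(y, x)[1]               # stays near the true slope
```

With this single outlier the LS slope is pulled far from 1, while the robust slope remains close to it, mirroring the Figure 2 versus Figure 3 comparison.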
Differences between LS and Robust betas are very common, as is revealed in [20]. We highly recommend routine use of robust regression betas, along with their standard errors and the test statistic T p-values, as a complement to the LS beta estimates provided by many financial data service providers.

Multifactor Time Series Model
Here we apply our test T to the LS and Robust fits of the Fama-French 3-factor model (FF3) to the weekly returns of the stock with ticker ADL for the year 2008. The FF3 time series factor model has the form

r_t^e = α + β_MKT·MKT_t + β_SMB·SMB_t + β_HML·HML_t + ε_t,

where r_t^e is a time series of the asset excess returns relative to a risk-free rate, and MKT_t, SMB_t and HML_t are the market, size and value factor returns. The overall test statistic T based on the chi-squared distribution approximation has the value 1101 and a p-value that is zero to 3 digits. Table 5 reports the same quantities as in Table 4, except that the ADL stock and FF3 factor returns are deleted at the times of the five residual outliers in the right-hand panel of Figure 6. Not surprisingly, neither the overall test nor the individual coefficient tests reject the null hypothesis of normally distributed linear factor model errors. This is consistent with the high 95% normal distribution efficiency of the Robust estimator, along with the fact that after deleting the 5 most extreme outliers in the right-hand panel of Figure 7, the distribution of the residuals is close to a normal distribution. Note also that the only coefficients that changed substantially after removing outliers were the LS estimates for the MKT and HML factors, the two factors for which the original test T in Table 4 indicated a statistically significant difference with the Robust estimates. As expected, the removal of outliers did not affect the Robust estimates much.

Monte Carlo Simulations
In order to evaluate the finite sample behavior of the level and power of our test T, we carried out the Monte Carlo simulation experiments described in this section.

Distribution Models
We assume independent and identically distributed (i.i.d.) random x_t that are independent of i.i.d. errors ε_t for the first two models below. We generate samples from the following distributions for the errors ε_t:
Model 1: Standard normal, which is included in the null hypothesis.
Model 2: Skewed t-distribution with five degrees of freedom, which is included in the composite alternative.
Model 3: Asymmetric two-term conditional joint normal mixture for x_t and ε_t that is included in the composite alternative, where we condition on the number of "outliers" generated from the second mixture component.
For all 3 models we set α₀ = 0, β₀ = 1. For Models 1 and 2 we generated 10,000 replicates. Model 3 includes many combinations of the parameters μ and γ, and for each such combination we generated 1000 replicates.⁷ We used sample sizes N ranging from 50 to 500.
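Skew-t errors like those of Model 2 can be generated without special libraries. The sketch below uses the standard Azzalini-type construction (a skew-normal variate divided by an independent sqrt(χ²_df/df)); the skewness parameter delta and its value are our notational choices, not necessarily the paper's parameterization:

```python
import numpy as np

rng = np.random.default_rng(1)

def skew_t(n, df=5, delta=0.8):
    """Azzalini-type skew-t sample: a skew-normal variate divided by
    sqrt(chi2_df / df). delta in (-1, 1) controls skewness; delta = 0
    recovers an ordinary Student-t with df degrees of freedom."""
    z1 = np.abs(rng.standard_normal(n))
    z2 = rng.standard_normal(n)
    sn = delta * z1 + np.sqrt(1.0 - delta**2) * z2   # skew-normal draw
    w = rng.chisquare(df, n) / df
    return sn / np.sqrt(w)
```

Plugging such draws into y_t = α₀ + β₀ x_t + ε_t gives replicates for estimating the power of the test under a fat-tailed, asymmetric error distribution.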

Results
Model 1 (normal distribution errors). Empirical levels of the test for normally distributed errors are displayed in Figure 8.
Model 2 (skew-t distribution errors). Results for a skewed t-distribution with five degrees of freedom are displayed in Figure 9.
The skewed t-distribution is in the alternative hypothesis for the test, and thus one would hope for high power. The power indeed increases with increasing sample size and with normal distribution efficiency. It can be shown that the power of the test T for sample size 500 is close to the estimated asymptotic value for each of the four efficiencies. Since both the LS and robust estimates are consistent estimators that converge to β₀ at the same rate, the asymptotic power of the test T will be less than one, and it is not surprising that the power of T is less than one for the largest sample sizes in Figure 9.
Model 3 (normal mixture for x_t and ε_t). A surprising aspect of the results, evident in Figure 10 as well as in Figure 8 and Figure 9, is that more accurate levels and higher power are obtained using the robust MM-estimator with a higher normal distribution efficiency of 99% instead of the more traditional 95%. The reason this is surprising is that using a lower normal distribution efficiency of an MM-estimator generally results in lower bias due to bias-inducing outliers. Note however that for sample size 50 in Figure 10, higher efficiency yields higher power only for the fraction 0.02 of outliers, and lower efficiency yields higher power for fractions 0.04 and 0.06. This represents a curious interaction between the fraction of outliers and MM-estimator efficiency, and this behavior needs further study.
We remark that the increasing empirical levels of the test T as the sample size decreases below 150 in Figure 8 are likely due to a small-sample bias in the estimate ŝe(β̂_MM,i) that appears in the denominator of T in (15). It will be worthwhile to consider possible bias-correction methods to improve the small sample size accuracy of the level of the test.

Summary and Discussion
This paper uses the important Hausman [23] result to construct a new test statistic T for detecting differences between LS and Robust estimators of the slope coefficients of time series factor models. The test as formulated applies to the slope coefficients but not to the model intercept. Fortunately, there is a simple method to take care of this by centering the factor model response and factor exposures with sample medians and using the MM-estimate of regression through the origin for these transformed variables. The robust slope coefficients obtained in this manner can be used to compute regression residuals whose median will be a robust estimate of the intercept. We plan to study the statistical properties of the resulting robust intercept estimate in a separate follow-on study.
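The intercept-recovery device described above can be sketched in a few lines. Here slope_fit stands in for any robust through-origin slope estimator; for brevity the demonstration plugs in plain LS through the origin on clean data, not an MM-estimate, so it only illustrates the mechanics of the centering construction:

```python
import numpy as np

def fit_with_robust_intercept(y, x, slope_fit):
    """Center response and factor by sample medians, fit the slope through
    the origin on the centered data, then recover the intercept as the
    median of the residuals from the fitted slope."""
    yc = y - np.median(y)
    xc = x - np.median(x)
    beta = slope_fit(yc, xc)            # through-origin slope estimate
    alpha = np.median(y - beta * x)     # robust intercept estimate
    return alpha, beta

# Demonstration on clean data with through-origin LS as the slope_fit:
rng = np.random.default_rng(7)
x = rng.standard_normal(500)
y = 2.0 + 3.0 * x + 0.01 * rng.standard_normal(500)
alpha, beta = fit_with_robust_intercept(
    y, x, lambda yy, xx: xx @ yy / (xx @ xx))
```

On clean data the construction recovers the intercept and slope used to generate y; with a robust slope_fit it inherits the robustness of that slope estimator.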