Investigating Performances of Some Statistical Tests for Heteroscedasticity Assumption in Generalized Linear Model: A Monte Carlo Simulations Study

In a linear regression model, testing for uniformity of the variance of the residuals is an integral part of statistical analysis. This is a crucial assumption that requires statistical confirmation via some statistical tests, usually before carrying out the Analysis of Variance (ANOVA) technique. Many researchers have published series of papers on tests for detecting the variance heterogeneity assumption in multiple linear regression models, and many comparisons of these tests have been made using criteria such as bias, error rates and power. Aside from comparisons, modifications of some of these statistical tests for detecting variance heterogeneity have been reported in the literature in recent years. However, in the multiple linear regression setting, little work has been done on comparing selected statistical tests for the homoscedasticity assumption when linear, quadratic, square-root, and exponential forms of heteroscedasticity are injected into the residuals. The present study therefore works extensively on these areas of interest with a view to filling the gap. The paper aims at providing a comprehensive comparative analysis of the asymptotic behaviour of some selected statistical tests for the homoscedasticity assumption, in order to identify the best statistical test for detecting heteroscedasticity in a multiple linear regression scenario with varying variances and levels of significance. Several tests for homoscedasticity are available in the literature, but only nine were considered for this study: the Breusch-Godfrey test, studentized Breusch-Pagan test, White’s test, Nonconstant Variance Score test, Park test, Spearman Rank test, Glejser test, Goldfeld-Quandt test, and Harrison-McCabe test; their asymptotic behaviours are examined by Monte Carlo simulations.
Four different forms of heteroscedastic structure, exponential (EHS) and linear (LHS, which generalizes the square-root and quadratic structures), were injected into the residual part of the multiple linear regression models at sample sizes of 30, 50, 100, 200, 500 and 1000. Performance evaluations were carried out in the R environment. Among other findings, our investigations revealed that the Glejser and Park tests are the best tests for detecting heteroscedasticity under EHS and LHS respectively, while the White and Harrison-McCabe tests are the best for confirming homoscedasticity under EHS and LHS respectively for sample sizes below 50.


Onifade, O.C. and Olanrewaju, S.O. (2020) Investigating Performances of Some Statistical Tests for Heteroscedasticity Assumption in Generalized Linear Model: A Monte Carlo Simulations Study. Open Journal of Statistics, 10, 453-493. https://doi.org/10.4236/ojs.2020.103029

1. Introduction

One of the crucial assumptions in the multiple linear regression model is that the variance of the errors should be constant. The Ordinary Least Squares (OLS) method is very popular with statistics practitioners because it provides efficient and unbiased estimates of the parameters when the assumptions, especially the assumption of homoscedastic error variances, are met. But in many real-life applications, the variances of the errors vary across observations. Since homoscedasticity is often an unrealistic assumption, researchers should consider how their results are affected by heteroscedasticity. Even though the OLS estimates remain unbiased in the presence of heteroscedasticity, they become inefficient.

However, heteroscedasticity yields hypothesis tests that fail to keep false rejections at the nominal level, and estimated standard errors and confidence intervals that are either too narrow or too wide. Every statistical procedure carries with it certain assumptions that must be at least approximately true before the procedure can produce reliable and accurate results. Researchers often apply a statistical procedure to their data without checking the validity of the procedure’s assumptions. If one or more of the assumptions of a given statistical procedure are violated, most especially in multiple linear regression analysis, the procedure will produce misleading results.

In short, a number of assumptions are associated with the analysis of data using OLS in multiple linear regression, but the current study deals with only one of them: the homogeneity of variance assumption. Literally, assumptions refer to basic principles that are accepted on faith, or assumed to be true, without proof or verification. It is common for a researcher to apply a statistical method to a set of data without thoroughly checking that the assumptions of the method are valid. This may be especially true in multiple linear modelling.

Notwithstanding, all statistical procedures have underlying assumptions; some are more stringent than others. In some cases, violation of these assumptions will not change substantive research conclusions; in others, it will undermine meaningful research. Establishing that one’s data meet the assumptions of the procedure one is using is an expected component of all quantitatively based theses, journal articles, and dissertations. This assumption arises in both regression and experimental design, but this research discusses its role in regression analysis.

The homogeneity of variance assumption is one of the critical assumptions underlying most parametric statistical procedures, such as the analysis of variance, and it is very important for a researcher to be able to test this assumption before applying the ANOVA technique. Simply, “homo” means “the same” while “hetero” means “different”; the variance homogeneity assumption, equivalently called “homoscedasticity”, therefore means that the variance of each residual should be the same throughout the experiment. If the errors (residuals) fail to possess equal (but possibly unknown) variances, the reliability of the analysis of variance technique may be badly affected. The direct opposite of the homogeneity assumption is heterogeneity of error variances, which refers to a situation where the variance of the residuals is affected by at least one predictor variable, leading to unequal spread. Thus, the heterogeneity problem may arise in much econometric, experimental and agricultural modelling where the analysis of variance technique is applied. Homogeneity of variance is therefore a major assumption underlying the validity of many parametric tests. More importantly, it serves as the null hypothesis in substantive studies that focus on cross- or within-group dispersion.

In addition, showing that several samples do not come from populations with the same variance is sometimes of importance per se. The statistical validity of many commonly used tests, such as the t-test and ANOVA, depends on the extent to which the data conform to the assumption of homogeneity of variance. When a research design involves groups that have very different variances, however, the p-value accompanying the test statistic, such as t or F, may be too lenient or too harsh. Thus, substantive research often requires investigation of cross- or within-group fluctuation in dispersion. For example, in quality control research, homogeneity of variance tests are often “a useful endpoint in an analysis” . In human performance studies, an increase or decrease in the dispersion of performance scores within the same group of subjects may shed light on how changing conditions affect human behaviour. Recent studies on gender-related differences in the dispersion of academic performance have provoked substantive as well as methodological interest in homogeneity of variance.

It has been reported in the literature that the assumption on the error term is that its probability distribution remains the same over all observations of the explanatory variables, and in particular that the variance of each ${e}_{i}$ is the same for all values of the predictor variables. This assumption is also known as the Assumption of Homogeneity of Variances or the Assumption of Constant Variance of the error term. If it is not satisfied in a particular case, we say that the error term ( ${e}_{i}$ ) is heteroscedastic. The assumption of homoscedasticity means that the variation of each error ( ${e}_{i}$ ) around its zero mean does not depend on the values of the predictor variables; the variance of each ${e}_{i}$ remains the same irrespective of small or large values of the explanatory variables.

Apparently, the present research intends to investigate the best statistical test through computation of the number of times (frequency) each test commits a type I error (rejecting when sigma = 0) or a type II error (failing to reject when sigma ≠ 0) for confirming the homoscedasticity assumption when different levels of heteroscedasticity are injected into the multiple linear regression models at sample sizes of 30, 50, 100, 200, 500 and 1000.

2. Aim and Objectives of the Study

This study aims at providing a comprehensive comparative analysis of the asymptotic behaviour of some selected statistical tests for the homoscedasticity assumption, in order to identify the best statistical test for detecting heteroscedasticity in a multiple linear regression scenario with varying variances. To achieve this aim, the following objectives are pursued:

1) To compare nine different statistical tests under different heteroscedastic conditions, namely the Exponential and Linear (a generalized structure covering the Quadratic and Square-root) forms;

2) To evaluate the performance of each test through computation of the number of times (frequency) it commits a type I error (when sigma = 0) or a type II error (when sigma ≠ 0), as the case may be;

3) To investigate the asymptotic behaviour of the selected tests when the variances are varied across all simulations.

3. Theoretical Framework and Literature Review

The analysis of variance (ANOVA) is one of the most important and useful techniques in a variety of fields, such as economics, agriculture and biology, for comparing different groups or treatments with respect to their means. Let us consider testing equality of means of k populations given samples $\left\{{x}_{ij}:i=1,2,\cdots ,k;j=1,2,\cdots ,{n}_{i}\right\}$ from the ith population with mean ${\mu }_{i}$, variance ${\sigma }_{i}^{2}$ and distribution function $F\left\{{\sigma }_{i}^{-1}\left(x-{\mu }_{i}\right)\right\}$. While doing ANOVA, the null hypothesis to be tested is

${H}_{01}:{\mu }_{1}={\mu }_{2}=\cdots ={\mu }_{k}$ versus ${H}_{11}:{\mu }_{i}\ne {\mu }_{j}$ for some $i\ne j$ (2.1)

Hence, a set of assumptions such as normality, homogeneity and independence of observations has to be made in order to employ an F test for (2.1). As  has pointed out, in practice the assumption of homogeneity of variances is the one most often violated in ANOVA. In the absence of homogeneity of variances, the sample means will not necessarily have equal expected standard errors, and an exact comparison among the means requires reference to a compound F distribution with unknown parameters if the true population variances are unknown. It is now well established that violation of the assumption of homogeneity of variances can have severe effects on inference about the population means, especially in the case of unequal sample sizes.

Furthermore,  has demonstrated that the ANOVA F is not robust to all degrees of variance heterogeneity even when sample sizes are equal. In fact, the conventional ANOVA F provides generally poor control over both Type I and Type II error rates under a wide variety of variance heterogeneity conditions. Therefore, the problem of homogeneity of variances has to be settled before performing an ANOVA   .

Obviously, a good number of methods are available to test for homogeneity of variances in different situations. The most common and popular tests in the case of one-way ANOVA, however, are the Bartlett and Levene tests. The hypothesis to be tested for homogeneity of variances is

${H}_{02}:{\sigma }_{1}^{2}={\sigma }_{2}^{2}=\cdots ={\sigma }_{k}^{2}$ against ${H}_{12}:{\sigma }_{i}^{2}\ne {\sigma }_{j}^{2}$ for some $i\ne j$ (2.2)

Unfortunately, these tests (Bartlett and Levene) are sensitive to the assumption of normality. Specifically, the probability of a Type I error ( $\alpha$ ) depends on the kurtosis of the distribution, and alternative tests are likewise affected adversely by non-normality. As  pointed out, none of the procedures directly handles the problem of skewness bias.  has also argued that failure to consider the impact of combined violations of variance equality and distribution normality is an important omission. Under these circumstances, the development of new alternatives, namely trimming, transformed statistics and bootstrapping , to deal with unequal variances and non-normality is worthwhile.

 employed four homogeneity tests, namely SNHT, the Buishand Range test, the Pettitt test and the Von Neumann Ratio (VNR) test, on European climate data. The results are categorized into three classes, useful, doubtful and suspect, according to the number of tests rejecting the null hypothesis. Three testing variables were used, each consisting of annual values. For temperature, the two testing variables are the annual mean of the diurnal temperature range and the annual mean of the absolute day-to-day differences, while for precipitation the annual number of wet days (threshold 1 mm) is employed.

Tests for equality of variances are of interest in many situations, such as analysis of variance or quality control. The classical approach to hypothesis testing begins with the likelihood ratio test under the assumption of normal distributions given by . However, as this test is very sensitive to departures from normality, many alternative tests have been proposed since then. Some of these are modifications of the likelihood ratio test; others are adaptations of the F-test to test variances rather than means. Contrary to what one would hope, recent comparative studies reveal that some of these tests lack robustness and have poor power. This problem was also studied by , who carried out an extensive study of many of the existing parametric and non-parametric tests; in his last paper, Monte Carlo simulations over several distributions and sample sizes identified a few tests that are robust and have good power.

4. Methodology

4.1. Forms of Heteroscedasticity

The study considers four different heteroscedastic structures derived from the  additive and multiplicative heteroscedastic models; in our model, however, we assume that the variance of the error varies with the mean of the response. The two general forms are:

1) $Var\left({e}_{i}\right)={\sigma }^{2}{\text{e}}^{E\left({y}_{i}\right)}$ ;

2) $Var\left({e}_{i}\right)={\sigma }^{2}E{\left({y}_{i}\right)}^{g}$, where $g\ge 0$.

Emanating from the two above, four heteroscedastic structures were formulated as follows:

1) Exponential Form: ${h}_{i1}={\sigma }^{2}{\text{e}}^{E\left({y}_{i}\right)}={\sigma }^{2}{\text{e}}^{\left({\beta }_{1}{x}_{1}+{\beta }_{2}{x}_{2}\right)}$ ;

2) Linear Form: ${h}_{i2}={\sigma }^{2}E{\left({y}_{i}\right)}^{g}={\sigma }^{2}{\left({\beta }_{1}{x}_{1}+{\beta }_{2}{x}_{2}\right)}^{1}$ ;

3) Square-rooted Form: ${h}_{i3}={\sigma }^{2}E{\left({y}_{i}\right)}^{g}={\sigma }^{2}{\left({\beta }_{1}{x}_{1}+{\beta }_{2}{x}_{2}\right)}^{0.5}$ ;

4) Quadratic Form: ${h}_{i4}={\sigma }^{2}E{\left({y}_{i}\right)}^{g}={\sigma }^{2}{\left({\beta }_{1}{x}_{1}+{\beta }_{2}{x}_{2}\right)}^{2}$.
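For concreteness, the four structures can be generated directly. The sketch below is in Python rather than the paper's R, and the regressor values, ${\beta }_{1}={\beta }_{2}=0.5$ and ${\sigma }^{2}=1$ are illustrative assumptions only:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
x1 = rng.uniform(1.0, 10.0, n)
x2 = rng.uniform(1.0, 10.0, n)
b1 = b2 = 0.5
sigma2 = 1.0
mu = b1 * x1 + b2 * x2          # E(y_i) as used in the structures above

h_exp  = sigma2 * np.exp(mu)    # 1) exponential form
h_lin  = sigma2 * mu ** 1.0     # 2) linear form (g = 1)
h_sqrt = sigma2 * mu ** 0.5     # 3) square-root form (g = 0.5)
h_quad = sigma2 * mu ** 2.0     # 4) quadratic form (g = 2)

# heteroscedastic residuals are then drawn as e_i ~ N(0, h_i)
e_lin = rng.normal(0.0, np.sqrt(h_lin))
```

Because every ${h}_{i}$ is a positive function of the mean response, residuals drawn this way have variances that grow with $E\left({y}_{i}\right)$, which is exactly the heterogeneity the tests below are meant to detect.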

4.2. Procedure for the Monte Carlo Simulation Experiment

To investigate the finite-sample properties of the test statistics in the presence of heteroscedasticity, we use a Monte Carlo experiment. We simulate a multiple linear regression model with three explanatory variables of the form:

${y}_{i}={\beta }_{0}+{\beta }_{1}{x}_{1}+{\beta }_{2}{x}_{2}+{\beta }_{3}{x}_{3}+e$ (3.1)

where e is a normal error variable.

Following , we then simulate the independent variables Xi as follows:

Xi’s are a set of independent variables that are fixed following ${X}_{1}=1,\cdots ,N$, ${X}_{2}=1,\cdots ,\sqrt{N}$, ${X}_{3}=\text{rnorm}\left(10,1\right)$ and $Z=\sqrt{i}$, for $i=1,\cdots ,N$.

We also generate the two error terms as follows: $v~N\left(0,{\sigma }^{2}\right)$ and $u~|N\left(0,{\sigma }^{2}\right)|$.

Then the model for the heteroscedasticity function was formed as follows:

${\sigma }_{v}=\mathrm{exp}\left({\alpha }_{0}+{\alpha }_{1}\mathrm{ln}{X}_{1i}+{\alpha }_{2}\mathrm{ln}{X}_{2i}\right)$ (3.2)

The parameters were then set at:

${\beta }_{0}={\alpha }_{0}={\gamma }_{0}={\alpha }_{1}={\alpha }_{2}={\gamma }_{1}=1$

${\beta }_{1}={\beta }_{2}={\beta }_{3}=0.5\text{\hspace{0.17em}}\text{\hspace{0.17em}}\left(\text{constant return to scale}\right)$

The parameter $\sigma$ measures the degree of heteroscedasticity. We use several degrees of heteroscedasticity by letting $\sigma$ vary over 0, 0.1, 0.3, 0.7 and 0.9. When $\sigma =0$, we obtain the homoscedastic case. We also considered different sample sizes: 30, 50, 100, 200, 500 and 1000 observations (Hadri and Garry, 1998). To analyze the performance of each heteroscedasticity test, we estimated the OLS model and then applied the following tests (Breusch-Godfrey test, studentized Breusch-Pagan test, White’s test, Non-constant Variance Score test, Park test, Spearman Rank test, Glejser test, Goldfeld-Quandt test, Harrison-McCabe test) for the presence of the problem, using the frequency of significance at the 1%, 5% and 10% levels respectively. We set the number of replications to 1000. Refer to Appendix 1 for the R simulation code.
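The replication loop can be sketched generically. The sketch below is in Python rather than the paper's R; `rejection_frequency` and `pvalue_fn` are hypothetical names, and the injected structure shown is a single illustrative (linear) choice rather than the study's full design:

```python
import numpy as np

def rejection_frequency(pvalue_fn, sigma, n, reps=200, alpha=0.05, seed=0):
    """Fraction of Monte Carlo replications in which pvalue_fn rejects
    the null of homoscedasticity at level alpha.

    pvalue_fn(y, X) is any heteroscedasticity test returning a p-value.
    sigma = 0 gives the homoscedastic case; sigma > 0 injects a linear
    heteroscedastic structure into the error scale.
    """
    rng = np.random.default_rng(seed)
    x1 = np.arange(1.0, n + 1.0)              # X1 = 1, ..., N
    x2 = np.linspace(1.0, np.sqrt(n), n)      # X2 spans 1, ..., sqrt(N)
    x3 = rng.normal(10.0, 1.0, n)             # X3 ~ rnorm(10, 1)
    X = np.column_stack([x1, x2, x3])
    rejections = 0
    for _ in range(reps):
        scale = np.sqrt(1.0 + sigma * (0.5 * x1 + 0.5 * x2))
        e = rng.normal(0.0, scale)            # heteroscedastic when sigma > 0
        y = 1.0 + 0.5 * x1 + 0.5 * x2 + 0.5 * x3 + e
        rejections += int(pvalue_fn(y, X) < alpha)
    return rejections / reps
```

When $\sigma =0$ the returned frequency estimates the test's type I error rate; when $\sigma >0$ it estimates the test's power.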

4.3. Tests for Detecting Heteroscedasticity

This study considers nine (9) popularly used heteroscedasticity tests, as evident in past literature. These tests are: the Breusch-Godfrey test, studentized Breusch-Pagan test, White’s test, Non-constant Variance Score test, Park test, Spearman Rank test, Glejser test, Goldfeld-Quandt test and Harrison-McCabe test.

4.4. Park Test

Park (1966) formalizes the graphical method by suggesting that ${\sigma }_{i}^{2}$ is some function of the explanatory variable ${X}_{i}$. The functional form he suggested was:

${\sigma }_{i}^{2}={\sigma }^{2}{X}_{i}^{\beta }{\text{e}}^{{v}_{i}}$ (3.3)

or $\mathrm{ln}{\sigma }_{i}^{2}=\mathrm{ln}{\sigma }^{2}+\beta \mathrm{ln}{X}_{i}+{v}_{i}$ (3.4)

where ${v}_{i}$ is the stochastic disturbance term. Since ${\sigma }_{i}^{2}$ is generally not known, Park suggests using ${\stackrel{^}{u}}_{i}^{2}$ as a proxy and running the following regression:

$\mathrm{ln}{\stackrel{^}{u}}_{i}^{2}=\mathrm{ln}{\sigma }^{2}+\beta \mathrm{ln}{X}_{i}+{v}_{i}=\alpha +\beta \mathrm{ln}{X}_{i}+{v}_{i}$ (3.5)

If $\beta$ turns out to be statistically significant, it would suggest that heteroscedasticity is present in the data. If it turns out to be insignificant, we may accept the assumption of variance homogeneity. The Park test is thus a two-stage procedure. In the first stage, we run the OLS regression disregarding the heteroscedasticity question. We obtain ${\stackrel{^}{u}}_{i}$ from this regression, and then in the second stage, we run the regression (3.5).

Although empirically appealing, the Park test has some problems.  has argued that the error term ${v}_{i}$ entering into (3.5) may not satisfy the OLS assumptions and may itself be heteroscedastic. Nonetheless, as a strictly exploratory method, one may still use the Park test.
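The two-stage procedure can be sketched as follows. This is a Python sketch (the study itself uses R), `park_test` is a hypothetical helper name, and the single-regressor setting is assumed for simplicity:

```python
import numpy as np
from scipy import stats

def park_test(y, x):
    """Two-stage Park test: stage one fits OLS, stage two regresses
    ln(u_hat^2) on ln(x) as in (3.5); a significant slope beta suggests
    heteroscedasticity."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    u = y - X @ beta                                   # stage-one residuals
    fit = stats.linregress(np.log(x), np.log(u ** 2))  # stage two, eq. (3.5)
    return fit.slope, fit.pvalue
```

With errors whose standard deviation grows proportionally to x, the fitted slope should be close to 2 (since the log-variance then grows as $2\mathrm{ln}{X}_{i}$) and clearly significant.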

4.5. Glejser Test

The Glejser test  is similar in spirit to the Park test. After obtaining the residuals ${\stackrel{^}{u}}_{i}$ from the OLS regression, Glejser suggests regressing the absolute values of ${\stackrel{^}{u}}_{i}$ on the X variable that is thought to be closely associated with ${\sigma }_{i}^{2}$. In his experiments, Glejser used the following functional forms:

$|{\stackrel{^}{u}}_{i}|={\beta }_{0}+{\beta }_{1}{X}_{i}+{v}_{i}$ (3.6)

$|{\stackrel{^}{u}}_{i}|={\beta }_{0}+{\beta }_{1}\sqrt{{X}_{i}}+{v}_{i}$ (3.7)

$|{\stackrel{^}{u}}_{i}|={\beta }_{0}+{\beta }_{1}\frac{1}{{X}_{i}}+{v}_{i}$ (3.8)

$|{\stackrel{^}{u}}_{i}|={\beta }_{0}+{\beta }_{1}\frac{1}{\sqrt{{X}_{i}}}+{v}_{i}$ (3.9)

$|{\stackrel{^}{u}}_{i}|=\sqrt{{\beta }_{0}+{\beta }_{1}{X}_{i}}+{v}_{i}$ (3.10)

$|{\stackrel{^}{u}}_{i}|=\sqrt{{\beta }_{0}+{\beta }_{1}{X}_{i}^{2}}+{v}_{i}$ (3.11)

where ${v}_{i}$ is the error term.

Again, as an empirical or practical matter, one may use the Glejser approach. But Goldfeld and Quandt pointed out that the error term ${v}_{i}$ has some problems: its expected value is nonzero and it is serially correlated. In practice, we choose the form of regression that gives the best fit in terms of the correlation coefficient and the standard errors of the coefficients. It is reported that if ${\beta }_{0}=0$ and ${\beta }_{1}\ne 0$, pure heteroscedasticity is suggested, while if both ${\beta }_{0}$ and ${\beta }_{1}$ differ from zero, mixed heteroscedasticity is indicated. This is assessed by testing the statistical significance of both ${\beta }_{0}$ and ${\beta }_{1}$.
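The approach can be sketched using the linear form (3.6); the other functional forms (3.7)-(3.11) differ only in the transformation applied to ${X}_{i}$. This is a Python sketch with a hypothetical helper name:

```python
import numpy as np
from scipy import stats

def glejser_test(y, x):
    """Glejser test using functional form (3.6): regress |u_hat| on x and
    test the significance of the slope beta_1."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    u = y - X @ beta                          # OLS residuals
    fit = stats.linregress(x, np.abs(u))      # |u_hat| = b0 + b1 * x + v
    return fit.slope, fit.pvalue
```

A significantly positive slope indicates that the residual spread grows with x, i.e. heteroscedasticity of the linear form.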

4.6. Goldfeld-Quandt Test

This is a simple and intuitive test. One orders the observations according to ${X}_{i}$ and omits c central observations. Next, two regressions are run on the two separated sets of observations with $\left(n-c\right)/2$ observations in each. The c omitted observations separate the low-value X’s from the high-value X’s, and if heteroscedasticity exists and is related to ${X}_{i}$, the estimates of ${\sigma }^{2}$ reported from the two regressions should be different. Hence, the test statistic is ${s}_{2}^{2}/{s}_{1}^{2}$, where ${s}_{1}^{2}$ and ${s}_{2}^{2}$ are the Mean Square Error of the two regressions, respectively. Their ratio would be the same as that of the two residual sums of squares because the degrees of freedom of the two regressions are the same. This statistic is F-distributed with $\left[\left(n-c\right)/2-k\right]$ degrees of freedom in the numerator as well as the denominator.

The only remaining question in performing this test is the magnitude of c. Obviously, the larger c is, the more central observations are omitted and the more confident we feel that the two samples are distant from each other; however, the loss of c observations leads to a loss of power. Conversely, separating the two samples gives us more confidence that the two variances are in fact the same if we do not reject homoscedasticity. This trade-off in power was studied by  using Monte Carlo experiments; their results recommend the use of c = 8 for n = 30 and c = 16 for n = 60. This is a popular test, but it assumes that we know which variable to order the observations by. In this case we use ${X}_{i}$; if there is more than one regressor on the right-hand side, one can order the observations using ${\stackrel{^}{Y}}_{i}$.
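The ordering, splitting and F-ratio steps can be sketched as follows (Python rather than the paper's R; `goldfeld_quandt` is a hypothetical helper, and k = 2 reflects the intercept-plus-slope sub-regressions assumed here):

```python
import numpy as np
from scipy import stats

def goldfeld_quandt(y, x, c):
    """Order observations by x, omit the c central ones, and compare the
    residual variances of the two sub-regressions via F = s2^2 / s1^2."""
    order = np.argsort(x)
    y, x = y[order], x[order]
    m = (len(y) - c) // 2                      # (n - c)/2 observations per half
    k = 2                                      # parameters per sub-regression

    def mse(ys, xs):
        X = np.column_stack([np.ones_like(xs), xs])
        b, *_ = np.linalg.lstsq(X, ys, rcond=None)
        r = ys - X @ b
        return (r @ r) / (len(ys) - k)         # mean square error

    F = mse(y[-m:], x[-m:]) / mse(y[:m], x[:m])   # high-x over low-x variance
    p = stats.f.sf(F, m - k, m - k)               # upper-tail F p-value
    return F, p
```

When the error variance grows with x, the high-x sub-sample has the larger mean square error and F is well above 1.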

4.7. Breusch-Pagan Test

The success of the Goldfeld-Quandt test depends not only on the value of c (the number of central observations to be omitted) but also on identifying the correct X variable with which to order the observations. This limitation of this test can be avoided if we consider the Breusch-Pagan (BP) test (1979). Consider the k-variable linear regression model:

${y}_{i}={\beta }_{0}+{\beta }_{1}{x}_{1i}+\cdots +{\beta }_{k}{x}_{ki}+{e}_{i}$ (3.12)

Assuming that the error variance ${\sigma }_{i}^{2}$ is described as follows:

${\sigma }_{i}^{2}=f\left({\alpha }_{0}+{\alpha }_{1}{z}_{1i}+\cdots +{\alpha }_{m}{z}_{mi}\right)$ (3.13)

that is, ${\sigma }_{i}^{2}$ is some function of the non-stochastic variables z’s; some or all of the x’s can serve as z’s. Specifically, assume that:

${\sigma }_{i}^{2}={\alpha }_{0}+{\alpha }_{1}{z}_{1i}+\cdots +{\alpha }_{m}{z}_{mi}$ (3.14)

that is, ${\sigma }_{i}^{2}$ is a linear function of the z’s. If ${\alpha }_{1}={\alpha }_{2}=\cdots ={\alpha }_{m}=0$, ${\sigma }_{i}^{2}={\alpha }_{0}$, which is a constant. Therefore, to test whether ${\sigma }_{i}^{2}$ is homoscedastic, one can test the hypothesis that ${\alpha }_{1}={\alpha }_{2}=\cdots ={\alpha }_{m}=0$. This is the basic idea behind the Breusch-Pagan test. The actual procedure is tailored as follows:

1) Obtain ${y}_{i}={\beta }_{0}+{\beta }_{1}{x}_{1i}+\cdots +{\beta }_{k}{x}_{ki}+{e}_{i}$ by OLS and compute the residuals;

2) Obtain ${\stackrel{^}{\sigma }}^{2}=\frac{1}{n}\sum {e}_{i}^{2}$, which would be MLE of ${\sigma }^{2}$ under homoscedasticity;

3) Obtain another variable such that ${p}_{i}=\frac{{e}_{i}^{2}}{{\stackrel{^}{\sigma }}^{2}}$ ;

4) Obtain the regression such that ${p}_{i}={\alpha }_{0}+{\alpha }_{1}{z}_{1i}+\cdots +{\alpha }_{m}{z}_{mi}+{v}_{i}$, where ${v}_{i}$ is the residual term of this regression;

5) Obtain the statistic: $BP=\frac{1}{2}\left(SSR\right)$, where SSR is the regression (explained) sum of squares from the auxiliary regression in step 4;

6) Assuming ${e}_{i}$ are normally distributed, one can show that if there is homoscedasticity and if the sample size n increases indefinitely, then:

$BP~{\chi }_{m}^{2}$. (3.15)
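The six steps above can be sketched directly. This is a Python sketch (the study itself works in R); `breusch_pagan` is a hypothetical helper, the z's are taken to be the original regressors, and the degrees of freedom are set to m, the number of z's:

```python
import numpy as np
from scipy import stats

def breusch_pagan(y, Z):
    """Breusch-Pagan statistic BP = 0.5 * (regression sum of squares) from
    regressing p_i = e_i^2 / sigma_hat^2 on the z's (Z excludes the constant)."""
    n = len(y)
    Zc = np.column_stack([np.ones(n), Z])
    b, *_ = np.linalg.lstsq(Zc, y, rcond=None)      # step 1: OLS fit
    e = y - Zc @ b                                   # step 1: residuals
    sigma2 = (e @ e) / n                             # step 2: MLE of sigma^2
    p = e ** 2 / sigma2                              # step 3
    g, *_ = np.linalg.lstsq(Zc, p, rcond=None)       # step 4: auxiliary regression
    ess = np.sum((Zc @ g - p.mean()) ** 2)           # regression (explained) SS
    bp = 0.5 * ess                                   # step 5
    m = Z.shape[1]                                   # number of z's
    return bp, stats.chi2.sf(bp, m)                  # step 6: chi-squared(m)
```

Under homoscedasticity the auxiliary regression explains almost nothing and BP stays small; under variance that trends with a z, the explained sum of squares (and hence BP) grows with n.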

4.8. White Test

Another general test for homoscedasticity, where nothing is known about the form of the heteroscedasticity, is suggested by . This test is based on the difference between the variance of the OLS estimates under homoscedasticity and that under heteroscedasticity. It does not rely on the normality assumption, which makes it easy to implement. Consider the following three-variable regression model

${y}_{i}={\beta }_{0}+{\beta }_{1}{x}_{1}+{\beta }_{2}{x}_{2}+{e}_{i}$ (3.16)

The White test is tailored as follows:

1) Given a set of data, obtain the residual, ${e}_{i}$ from (3.16);

2) Run the following auxiliary regression:

${e}_{i}^{2}={\alpha }_{0}+{\alpha }_{1}{x}_{1}+{\alpha }_{2}{x}_{2}+{\alpha }_{3}{x}_{1}^{2}+{\alpha }_{4}{x}_{2}^{2}+{\alpha }_{5}{x}_{1}{x}_{2}+{v}_{i}$ (3.17)

3) Obtain ${R}^{2}$ from this auxiliary regression;

4) Under ${H}_{0}$ that there is no heteroscedasticity, it can be shown that the sample size (n) times the ${R}^{2}$ obtained from the auxiliary regression asymptotically follows the chi-squared distribution with degrees of freedom equal to the number of regressors (excluding the constant term) in the auxiliary regression. Mathematically, we have:

$n{R}^{2}~{\chi }_{k}^{2}$ (3.18)

5) The null hypothesis is rejected when $n{R}^{2}$ exceeds the critical value obtained from the chi-square table at the given level of significance.

Observe that if a model has several regressors, then introducing all the regressors, their squared (or higher-powered) terms, and their cross-products can quickly consume degrees of freedom. Therefore, one must be cautious in using the test; this is one of its demerits.
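For the two-regressor model (3.16), the procedure can be sketched as follows (Python rather than the paper's R; `white_test` is a hypothetical helper name):

```python
import numpy as np
from scipy import stats

def white_test(y, x1, x2):
    """White's test for model (3.16): auxiliary regression (3.17) of e^2 on
    levels, squares and the cross-product, then n * R^2 ~ chi-squared(k)."""
    n = len(y)
    X = np.column_stack([np.ones(n), x1, x2])
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    e2 = (y - X @ b) ** 2                     # squared OLS residuals
    # auxiliary regression (3.17): levels, squares, cross-product
    Z = np.column_stack([np.ones(n), x1, x2, x1**2, x2**2, x1 * x2])
    g, *_ = np.linalg.lstsq(Z, e2, rcond=None)
    resid = e2 - Z @ g
    r2 = 1.0 - (resid @ resid) / np.sum((e2 - e2.mean()) ** 2)
    stat = n * r2
    k = Z.shape[1] - 1                        # regressors excluding the constant
    return stat, stats.chi2.sf(stat, k)
```

Here k = 5, so the statistic is compared with the chi-squared distribution on 5 degrees of freedom.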

4.9. Spearman’s Rank Correlation Test

This test ranks the ${x}_{i}$ ’s and the absolute value of OLS residuals, the ${e}_{i}$ ’s. Then it computes the difference between these rankings, that is, ${d}_{i}=\text{rank}\left(|{e}_{i}|\right)-\text{rank}\left({x}_{i}\right)$. For this simple linear regression model, we obtain the Spearman’s Rank Correlation Coefficient as follows:

${r}_{|{e}_{i}|.{x}_{i}}=1-\left[\frac{6{\sum }_{i=1}^{n}{d}_{i}^{2}}{n\left({n}^{2}-1\right)}\right]$ (3.19)

Having obtained (3.19), the next step is to test for the significance of the coefficient using t-test as follows:

$T=\frac{{r}_{|{e}_{i}|.{x}_{i}}\sqrt{n-2}}{\sqrt{1-{r}_{|{e}_{i}|.{x}_{i}}^{2}}}$ (3.20)

The statistic follows a t distribution with (n − 2) degrees of freedom.

In a situation where there is more than one regressor (the multiple linear regression case), that is, ${y}_{i}={\beta }_{0}+{\beta }_{1}{x}_{1i}+\cdots +{\beta }_{k}{x}_{ki}+{e}_{i}$, it is suggested that ${r}_{|{e}_{i}|.{x}_{1}},{r}_{|{e}_{i}|.{x}_{2}},\cdots ,{r}_{|{e}_{i}|.{x}_{k}}$ be computed separately and the same t-test be used to test the significance of each correlation coefficient. This is the test we intend to modify in the present research.
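Equations (3.19) and (3.20) translate directly into code. This is a Python sketch with a hypothetical helper name; the rank formula assumes no ties, which holds for continuous data:

```python
import numpy as np
from scipy import stats

def spearman_het(e, x):
    """Spearman test of (3.19)-(3.20): rank-correlate |e_i| with x_i and
    assess significance with a t statistic on n - 2 degrees of freedom."""
    n = len(e)
    d = stats.rankdata(np.abs(e)) - stats.rankdata(x)   # rank differences d_i
    r = 1.0 - 6.0 * np.sum(d ** 2) / (n * (n ** 2 - 1)) # eq. (3.19)
    t = r * np.sqrt(n - 2) / np.sqrt(1.0 - r ** 2)      # eq. (3.20)
    p = 2.0 * stats.t.sf(abs(t), n - 2)                 # two-sided p-value
    return r, p
```

A large positive rank correlation between the absolute residuals and the regressor signals that the residual spread increases with x.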

The choice of statistical tests for this study is informed by the existing literature, most especially the paper entitled “Heteroscedasticity as Basis of Direction Dependence in Reversible Linear Regression Models” by , which reports that the Bartlett, Goldfeld-Quandt, Breusch-Pagan, White and Koenker-Bassett tests are the tests predominantly used in multiple linear regression models. In addition, we decided to include the Spearman Rank test, making a total of six statistical tests for detecting the homoscedasticity assumption.

4.10. Breusch-Godfrey Serial Correlation

The Breusch-Godfrey serial correlation LM test is a test for autocorrelation (serial correlation) in the errors of a regression model. It makes use of the residuals from the model being considered in an auxiliary regression, from which a test statistic is derived.

This test is valid with lagged dependent variables and can be used to test for higher-order serial correlation.

Procedure

Step 1. Estimate the model

${Y}_{t}={\beta }_{1}+{\beta }_{2}{X}_{2t}+{\beta }_{3}{X}_{3t}+{\beta }_{4}{Y}_{t-1}+{U}_{t}$ (3.21)

and obtain the residuals ${e}_{t}$.

Step 2. Estimate the following auxiliary regression model:

${e}_{t}={b}_{1}+{b}_{2}{X}_{2}+{b}_{3}{X}_{3}+{b}_{4}{Y}_{t-1}+{c}_{1}{e}_{t-1}+{c}_{2}{e}_{t-2}+{c}_{3}{e}_{t-3}+{w}_{t}$ (3.22)

Step 3. For large sample sizes, the test statistic is:

$\left(n-p\right){R}^{2}~{\chi }_{p}^{2}$ (3.23)

Step 4. If the test statistic exceeds the critical chi-square value, we reject the null hypothesis of no serial correlation up to order p.
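Steps 1-4 can be sketched as follows (Python rather than the paper's R; `breusch_godfrey` is a hypothetical helper, and zero-padding the lagged residuals at the start of the sample is one common convention):

```python
import numpy as np
from scipy import stats

def breusch_godfrey(y, X, p=3):
    """Regress the OLS residuals on the original regressors plus p lagged
    residuals (eq. 3.22) and form (n - p) * R^2 ~ chi-squared(p) (eq. 3.23)."""
    n = len(y)
    Xc = np.column_stack([np.ones(n), X])
    b, *_ = np.linalg.lstsq(Xc, y, rcond=None)      # step 1: OLS fit
    e = y - Xc @ b                                   # step 1: residuals e_t
    lags = np.column_stack(                          # e_{t-1}, ..., e_{t-p}
        [np.concatenate([np.zeros(k), e[:-k]]) for k in range(1, p + 1)]
    )
    Z = np.column_stack([Xc, lags])
    g, *_ = np.linalg.lstsq(Z, e, rcond=None)        # step 2: auxiliary fit
    resid = e - Z @ g
    r2 = 1.0 - (resid @ resid) / (e @ e)             # e has zero mean after OLS
    stat = (n - p) * r2                              # step 3
    return stat, stats.chi2.sf(stat, p)              # step 4
```

With strongly autocorrelated errors (for example, an AR(1) process), the lagged residuals explain a large share of the variation and the statistic far exceeds the chi-square critical value.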

The other tests considered are the Non-constant Variance Score test and the Harrison-McCabe test.

4.11. Heteroscedasticity Correction

From the section above, a general linear regression model with the assumption of heteroscedasticity can be expressed as follows:

${y}_{i}={\beta }_{0}+{\beta }_{1}{x}_{1}+{\beta }_{2}{x}_{2}+\cdots +e$ (3.24)

Letting $e={\mu }_{t}$

${y}_{i}={\beta }_{0}+{\beta }_{1}{x}_{1}+{\beta }_{2}{x}_{2}+\cdots +{\mu }_{t}$ (3.25)

$Var\left({\mu }_{t}\right)=E\left({\mu }_{t}^{2}\right)={\sigma }_{t}^{2}$ for $t=1,2,\cdots ,n$

where the t subscript on ${\sigma }^{2}$ indicates that the disturbance for each of the n units is drawn from a probability distribution with a (possibly) different variance.

Given such a non-constant variance function

$Var\left({e}_{i}\right)={\sigma }_{i}^{2}={\sigma }_{0}^{2}{x}_{i}^{\alpha }$ (3.26)

where $\alpha$ is the unknown parameter in the model.

Taking the natural logarithm

$\mathrm{ln}\left({\sigma }_{i}^{2}\right)=\mathrm{ln}\left({\sigma }_{0}^{2}\right)+\alpha \mathrm{ln}\left({x}_{i}\right)$ (3.27)

Then taking the exponential of the equation gives

${\sigma }_{i}^{2}=\mathrm{exp}\left[\mathrm{ln}\left({\sigma }_{0}^{2}\right)+\alpha \mathrm{ln}\left({x}_{i}\right)\right]$ (3.28)

Letting ${\beta }_{1}=\mathrm{ln}\left({\sigma }_{0}^{2}\right)$, ${\beta }_{2}=\alpha$ and ${Z}_{i}=\mathrm{ln}\left({x}_{i}\right)$, we have

${\sigma }_{i}^{2}=\mathrm{exp}\left[{\beta }_{1}+{\beta }_{2}{Z}_{i}\right]$ (3.29)

${\sigma }_{i}^{2}=\mathrm{exp}\left[{\beta }_{1}+{\beta }_{2}{Z}_{i2}+\cdots +{\beta }_{s}{Z}_{is}\right]$ (3.29*)

where (3.29*) applies if the variance depends on more than one explanatory variable (the multiple regression case). The exponential function is preferred because it guarantees a non-negative variance ${\sigma }_{i}^{2}$.

From Equation (3.27), with ${\beta }_{1}=\mathrm{ln}\left({\sigma }_{0}^{2}\right)$, ${\beta }_{2}=\alpha$ and ${Z}_{i}=\mathrm{ln}\left({x}_{i}\right)$, we use the OLS technique to estimate the coefficients ${\beta }_{1},{\beta }_{2},\cdots ,{\beta }_{s}$ of the variance function

$\mathrm{ln}\left({\sigma }_{i}^{2}\right)={\beta }_{1}+{\beta }_{2}{Z}_{i2}+\cdots +{\beta }_{s}{Z}_{is}$ (3.30)

where ${Z}_{i2}=\mathrm{ln}\left({x}_{2}\right)$, ${Z}_{i3}=\mathrm{ln}\left({x}_{3}\right)$ and so on.

We then take the square root of the exponential of the fitted values,

${\stackrel{^}{\sigma }}_{i}=\sqrt{\mathrm{exp}\left({\stackrel{^}{\beta }}_{1}+{\stackrel{^}{\beta }}_{2}{Z}_{i2}+\cdots +{\stackrel{^}{\beta }}_{s}{Z}_{is}\right)}$ (3.31)

Then ${\stackrel{^}{\sigma }}_{i}$ is the weight required to transform the data set by dividing each observation through by it.

Note that

$Var\left(\frac{{e}_{i}}{{\sigma }_{i}}\right)=\frac{1}{{\sigma }_{i}^{2}}Var\left({e}_{i}\right)=\frac{1}{{\sigma }_{i}^{2}}×{\sigma }_{i}^{2}=1$ (3.32)

Using the estimate ${\stackrel{^}{\sigma }}_{i}^{2}$ of our variance function in place of ${\sigma }_{i}^{2}$, we obtain the generalized least squares estimators of the regression coefficients ${\beta }_{1},{\beta }_{2},\cdots ,{\beta }_{s}$.

We then defined the transformed variable as

${y}_{i}^{\ast }=\frac{{y}_{i}}{{\stackrel{^}{\sigma }}_{i}},{x}_{i1}^{\ast }=\frac{1}{{\stackrel{^}{\sigma }}_{i}},{x}_{i2}^{\ast }=\frac{{x}_{i2}}{{\stackrel{^}{\sigma }}_{i}},\cdots ,{x}_{is}^{\ast }=\frac{{x}_{is}}{{\stackrel{^}{\sigma }}_{i}}$ (3.33)

Therefore;

${y}_{i}^{\ast }={\beta }_{1}{x}_{i1}^{\ast }+{\beta }_{2}{x}_{i2}^{\ast }+\cdots +{\beta }_{s}{x}_{is}^{\ast }+{e}_{i}^{\ast }$ (3.34)

which is the Weighted Least Squares model with homoscedasticity.

5. Analysis and Results

5.1. Comparative Analyses of Some Statistical Tests for Homoscedasticity Assumption

As earlier mentioned, nine statistical tests are compared in this study using the number of times (frequency) each test rejects the null hypothesis over the replications. The null hypothesis is that the homoscedasticity assumption is upheld. When sigma = 0 (no heteroscedasticity), the rejection frequency is the empirical type I error rate, and the test with the lowest frequency is considered the best; when sigma ≠ 0, the rejection frequency is the empirical power, and the test with the highest frequency is considered the best.

5.2. Performance of the Tests When Error Follows Exponential Heteroscedastic Structure (EHS)

Table 1 presents the frequency of significant tests at the 1% level after 1000 replications for errors that follow EHS.

Table 1. Performance of the tests when error follows exponential heteroscedastic structure (EHS) at 1%.

*Frequency of test significance after 1000 replications.

As observed from the simulation results in Table 1 and Figure 1, the OLS model was first left uncontaminated by heteroscedasticity, yet tested with the nine heteroscedasticity tests to examine the behaviour of the test statistics at sample size n = 30. With no heteroscedasticity present (sigma = 0), the tests returned the following empirical type I error rates: Goldfeld-Quandt test 0.6%, Studentized Breusch-Pagan test 0.3%, White test 0%, and so on, making the White test the best at correctly detecting the absence of heteroscedasticity. The model was then infused with heteroscedasticity (sigma = 0.1, 0.3, 0.5, 0.7 and 0.9) at sample size 30. The Glejser test returned the highest number of correct rejections, with 417, 727, 59 and 413 out of every 1000 replications at sigma levels 0.1, 0.3, 0.5 and 0.7 respectively, implying that it has the highest empirical power of 41.7%, 72.7%, 5.9% and 41.3%, which makes Glejser the best test when sigma is 0.1, 0.3, 0.5 or 0.7 at sample size 30. However, the Non-constant Variance Score test, with 31.4%, outperformed the celebrated Glejser test when sigma = 0.9.

Moreover, considering sample size 50, with no heteroscedasticity present (sigma = 0), the tests returned the following empirical type I error rates: Studentized Breusch-Pagan test 0.6%, Non-constant Variance Score test 0.6%, White test 0.3%, and so on (see Table 1 and Figure 1). Hence, the White test is the best at correctly detecting the absence of heteroscedasticity. The Glejser test returned the highest number of correct rejections, with 656, 950, 68 and 707 out of every 1000 replications at sigma levels 0.1, 0.3, 0.5 and 0.7 respectively, implying that it has the highest empirical power of 65.6%, 95.0%, 6.8% and 70.7%, which makes Glejser the best test when sigma is 0.1, 0.3, 0.5 or 0.7 at sample size 50. However, the Non-constant Variance Score test, with 61.1%, outperformed the celebrated Glejser test when sigma = 0.9.

Figure 1. Sample size 50 results when error follows EHS at 1%.

In addition, considering sample size 100, with no heteroscedasticity present (sigma = 0), the tests returned the following empirical type I error rates: Breusch-Godfrey test 0.9%, Studentized Breusch-Pagan test 0.9%, Park test 0.9%, White test 1.2%, Glejser test 0.6% and so on (see Table 1 and Figure 2). Hence, the celebrated White test at sigma = 0 was outperformed by the Glejser test as the best at correctly detecting the absence of heteroscedasticity when the sample size is 100. The Glejser test also returned the highest number of correct rejections, with 924, 1000, 73 and 984 out of every 1000 replications at sigma levels 0.1, 0.3, 0.5 and 0.7 respectively, implying that it has the highest empirical power of 92.4%, 100%, 7.3% and 98.4%, which makes Glejser the best test when sigma is 0.1, 0.3, 0.5 or 0.7 at sample size 100. However, the Non-constant Variance Score test, with 92.9%, outperformed the celebrated Glejser test when sigma = 0.9.

Figure 2. Sample size 100 results when error follows EHS at 1%.

Furthermore, considering sample size 200, with no heteroscedasticity present (sigma = 0), the tests returned the following empirical type I error rates: Breusch-Godfrey test 1.2%, Studentized Breusch-Pagan test 0.8%, Non-constant Variance Score test 1.1%, White test 1.9% and so on (see Table 1). Hence, the Studentized Breusch-Pagan test is the best at correctly detecting the absence of heteroscedasticity. The Glejser test returned the highest number of correct rejections, with 999, 1000, 98 and 998 out of every 1000 replications at sigma levels 0.1, 0.3, 0.5 and 0.7 respectively, implying that it has the highest empirical power of 99.9%, 100%, 9.8% and 99.8%, which makes Glejser the best test when sigma is 0.1, 0.3, 0.5 or 0.7 at sample size 200. However, the Non-constant Variance Score test, with 99.9%, outperformed the celebrated Glejser test (99.8%) when sigma = 0.9.

Additionally, considering sample size 500, with no heteroscedasticity present (sigma = 0), the tests returned the following empirical type I error rates: Breusch-Godfrey test 1.2%, Studentized Breusch-Pagan test 0.8%, Non-constant Variance Score test 0.8%, White test 1.6% and so on (see Table 1 and Figure 3). Hence, the Studentized Breusch-Pagan and Non-constant Variance Score tests are the best at correctly detecting the absence of heteroscedasticity. The Glejser test returned the highest number of correct rejections, with 1000, 1000, 91, 1000 and 1000 out of every 1000 replications at sigma levels 0.1, 0.3, 0.5, 0.7 and 0.9 respectively, implying that it has the highest empirical power of 100%, 100%, 9.1%, 100% and 100%, which makes Glejser the best test when sigma is 0.1, 0.3, 0.5, 0.7 or 0.9 at sample size 500. Interestingly, the Studentized Breusch-Pagan test also recorded 100% power at sigma = 0.9, and the Non-constant Variance Score test recorded 100% power at sigma = 0.7 and 0.9. Hence, the Non-constant Variance Score test is also best at sigma = 0.7 and 0.9, and the Studentized Breusch-Pagan test is also best at sigma = 0.9.

Figure 3. Sample size 500 results when error follows EHS at 1%.

Lastly, considering sample size 1000, with no heteroscedasticity present (sigma = 0), the tests returned the following empirical type I error rates: Breusch-Godfrey test 1.0%, Studentized Breusch-Pagan test 1.5%, Non-constant Variance Score test 1.1%, White test 1.2%, Goldfeld-Quandt test 0.7% and so on (see Table 1 and Figure 4). Hence, the Goldfeld-Quandt test is the best at correctly detecting the absence of heteroscedasticity. The Glejser test returned the highest number of correct rejections, with 1000, 1000, 88, 1000 and 1000 out of every 1000 replications at sigma levels 0.1, 0.3, 0.5, 0.7 and 0.9 respectively, implying that it has the highest empirical power of 100%, 100%, 8.8%, 100% and 100%, which makes Glejser the best test when sigma is 0.1, 0.3, 0.5, 0.7 or 0.9 at sample size 1000. Interestingly, the Studentized Breusch-Pagan and Non-constant Variance Score tests also recorded 100% power at sigma = 0.7 and 0.9, and the Park test recorded 100% power at sigma = 0.7. Hence, the Studentized Breusch-Pagan and Non-constant Variance Score tests are also best at sigma = 0.7 and 0.9, while the Park test is also best at sigma = 0.7.

Table 2 presents the frequency of significant tests at the 5% level after 1000 replications for errors that follow EHS.

Figure 4. Sample size 1000 results when error follows EHS at 1%.

Table 2. Performance of the tests when error follows exponential heteroscedastic structure at 5%.

*Frequency of test significance after 1000 replications.

As observed from the simulation results in Table 2 and Figure 5, the OLS model was first left uncontaminated by heteroscedasticity, yet tested with the nine heteroscedasticity tests at sample size n = 30. With no heteroscedasticity present (sigma = 0), the tests returned the following empirical type I error rates: Goldfeld-Quandt test 4.1%, Studentized Breusch-Pagan test 3.4%, White test 0.7%, and so on, making the White test the best at correctly detecting the absence of heteroscedasticity. The model was then infused with heteroscedasticity (sigma = 0.1, 0.3, 0.5, 0.7 and 0.9) at sample size 30. The Glejser test returned the highest number of correct rejections, with 614, 884, 241, 783 and 649 out of every 1000 replications at sigma levels 0.1, 0.3, 0.5, 0.7 and 0.9 respectively, implying that it has the highest empirical power of 61.4%, 88.4%, 24.1%, 78.3% and 64.9%, which makes Glejser the best test when sigma is 0.1, 0.3, 0.5, 0.7 or 0.9 at sample size 30.

Figure 5. Sample size 30 results when error follows EHS at 5%.

Moreover, considering sample size 50, with no heteroscedasticity present (sigma = 0), the tests returned the following empirical type I error rates: Studentized Breusch-Pagan test 4.7%, Non-constant Variance Score test 4.3%, White test 2.8% and so on (see Table 2). Hence, the White test is the best at correctly detecting the absence of heteroscedasticity. The Glejser test returned the highest number of correct rejections, with 819, 988, 257, 964 and 877 out of every 1000 replications at sigma levels 0.1, 0.3, 0.5, 0.7 and 0.9 respectively, implying that it has the highest empirical power of 81.9%, 98.8%, 25.7%, 96.4% and 87.7%, which makes Glejser the best test when sigma is 0.1, 0.3, 0.5, 0.7 or 0.9 at sample size 50.

In addition, considering sample size 100, with no heteroscedasticity present (sigma = 0), the tests returned the following empirical type I error rates: Breusch-Godfrey test 5.7%, Studentized Breusch-Pagan test 4.8%, Park test 4.8%, White test 5.5%, Glejser test 4.0% and so on (see Table 2 and Figure 6). Hence, the celebrated White test at sigma = 0 was outperformed by the Glejser test as the best at correctly detecting the absence of heteroscedasticity when the sample size is 100. The Glejser test also returned the highest number of correct rejections, with 973, 1000, 273, 1000 and 998 out of every 1000 replications at sigma levels 0.1, 0.3, 0.5, 0.7 and 0.9 respectively, implying that it has the highest empirical power of 97.3%, 100%, 27.3%, 100% and 99.8%, which makes Glejser the best test when sigma is 0.1, 0.3, 0.5, 0.7 or 0.9 at sample size 100.

Figure 6. Sample size 100 results when error follows EHS at 5%.

Furthermore, considering sample size 200, with no heteroscedasticity present (sigma = 0), the tests returned the following empirical type I error rates: Breusch-Godfrey test 5.5%, Studentized Breusch-Pagan test 6.3%, Non-constant Variance Score test 6.3%, White test 6.8%, Harrison-McCabe test 5.5% and so on (see Table 2). Hence, the Harrison-McCabe and Breusch-Godfrey tests are the best at correctly detecting the absence of heteroscedasticity. The Glejser test returned the highest number of correct rejections, with 1000, 1000, 301, 1000 and 1000 out of every 1000 replications at sigma levels 0.1, 0.3, 0.5, 0.7 and 0.9 respectively, implying that it has the highest empirical power of 100%, 100%, 30.1%, 100% and 100%, which makes Glejser the best test when sigma is 0.1, 0.3, 0.5, 0.7 or 0.9 at sample size 200. Interestingly, the Non-constant Variance Score test also recorded 100% power at sigma = 0.9; hence, it is also best at sigma = 0.9.

Additionally, considering sample size 500, with no heteroscedasticity present (sigma = 0), the tests returned the following empirical type I error rates: Breusch-Godfrey test 5.0%, Studentized Breusch-Pagan test 5.5%, Non-constant Variance Score test 5.8%, White test 5.7%, Harrison-McCabe test 4.4% and so on (see Table 2 and Figure 7). Hence, the Harrison-McCabe test is the best at correctly detecting the absence of heteroscedasticity. The Glejser test returned the highest number of correct rejections, with 1000, 1000, 293, 1000 and 1000 out of every 1000 replications at sigma levels 0.1, 0.3, 0.5, 0.7 and 0.9 respectively, implying that it has the highest empirical power of 100%, 100%, 29.3%, 100% and 100%, which makes Glejser the best test when sigma is 0.1, 0.3, 0.5, 0.7 or 0.9 at sample size 500. Interestingly, the Studentized Breusch-Pagan test also recorded 100% power at sigma = 0.9, and the Non-constant Variance Score test recorded 100% power at sigma = 0.7 and 0.9. Hence, the Non-constant Variance Score test is also best at sigma = 0.7 and 0.9, and the Studentized Breusch-Pagan test is also best at sigma = 0.9.

Lastly, considering sample size 1000, with no heteroscedasticity present (sigma = 0), the tests returned the following empirical type I error rates: Breusch-Godfrey test 3.5%, Studentized Breusch-Pagan test 6.7%, White test 4.7% and so on (see Table 2 and Figure 8). Hence, the Breusch-Godfrey test is the best at correctly detecting the absence of heteroscedasticity. The Glejser test returned the highest number of correct rejections, with 1000, 1000, 283, 1000 and 1000 out of every 1000 replications at sigma levels 0.1, 0.3, 0.5, 0.7 and 0.9 respectively, implying that it has the highest empirical power of 100%, 100%, 28.3%, 100% and 100%, which makes Glejser the best test when sigma is 0.1, 0.3, 0.5, 0.7 or 0.9 at sample size 1000. Interestingly, the Studentized Breusch-Pagan and Non-constant Variance Score tests recorded 100% power at sigma = 0.7 and 0.9, and the Park test recorded 100% power at sigma = 0.1, 0.3 and 0.7. Hence, the Studentized Breusch-Pagan and Non-constant Variance Score tests are also best at sigma = 0.7 and 0.9, while the Park test is also best at sigma = 0.1, 0.3 and 0.7.

Figure 7. Sample size 500 results when error follows EHS at 5%.

Figure 8. Sample size 1000 results when error follows EHS at 5%.

5.3. Performance of the Tests When Error Follows Linear Heteroscedastic Structure (LHS)

Table 3 and Table 4 present the frequency of significant tests at the 1% and 5% levels respectively after 1000 replications for errors that follow LHS.

Table 3. Performance of the tests when error follows LHS at 1%.

*Frequency of test significance after 1000 replications.

Table 4. Performance of the tests when error follows LHS at 5%.

*Frequency of test significance after 1000 replications.

As observed from the simulation results in Table 3 and Figure 9, the OLS model was first left uncontaminated by heteroscedasticity, yet tested with the nine heteroscedasticity tests at sample size n = 30. With no heteroscedasticity present (sigma = 0), the tests returned the following empirical type I error rates: Breusch-Godfrey test 0.3%, Spearman Rank test 0.6%, White test 2.2%, Harrison-McCabe test 0% and so on, making the Harrison-McCabe test the best at correctly detecting the absence of heteroscedasticity. The model was then infused with heteroscedasticity (sigma = 0.1, 0.3, 0.5, 0.7 and 0.9) at sample size 30. The Park test returned the highest number of correct rejections, with 295, 353, 353, 353 and 354 out of every 1000 replications at sigma levels 0.1, 0.3, 0.5, 0.7 and 0.9 respectively, implying that it has the highest empirical power of 29.5%, 35.3%, 35.3%, 35.3% and 35.4%, which makes Park the best test when sigma is 0.1, 0.3, 0.5, 0.7 or 0.9 at sample size 30.

Figure 9. Sample size 30 results when error follows LHS at 1%.

Moreover, considering sample size 50, with no heteroscedasticity present (sigma = 0), the tests returned the following empirical type I error rates: Breusch-Godfrey test 0.6%, Spearman Rank test 1.3%, White test 4.0%, Harrison-McCabe test 0.3% and so on (see Table 3 and Figure 10). Hence, the Harrison-McCabe test is the best at correctly detecting the absence of heteroscedasticity. The Park test returned the highest number of correct rejections, with 352, 389, 391, 392 and 393 out of every 1000 replications at sigma levels 0.1, 0.3, 0.5, 0.7 and 0.9 respectively, implying that it has the highest empirical power of 35.2%, 38.9%, 39.1%, 39.2% and 39.3%, which makes Park the best test when sigma is 0.1, 0.3, 0.5, 0.7 or 0.9 at sample size 50.

In addition, considering sample size 100, with no heteroscedasticity present (sigma = 0), the tests returned the following empirical type I error rates: Breusch-Godfrey test 0.9%, Non-constant Variance Score test 0.6%, Spearman Rank test 1.3%, White test 2.5%, Harrison-McCabe test 1.2% and so on (see Table 3 and Figure 11). Hence, the celebrated Harrison-McCabe test at sigma = 0 was outperformed by the Non-constant Variance Score test as the best at correctly detecting the absence of heteroscedasticity when the sample size is 100. The Park test returned the highest number of correct rejections, with 474, 522, 522, 520 and 520 out of every 1000 replications at sigma levels 0.1, 0.3, 0.5, 0.7 and 0.9 respectively, implying that it has the highest empirical power of 47.4%, 52.2%, 52.2%, 52.0% and 52.0%, which makes Park the best test when sigma is 0.1, 0.3, 0.5, 0.7 or 0.9 at sample size 100.

Figure 10. Sample size 50 results when error follows LHS at 1%.

Figure 11. Sample size 100 results when error follows LHS at 1%.

Furthermore, considering sample size 200, with no heteroscedasticity present (sigma = 0), the tests returned the following empirical type I error rates: Breusch-Godfrey test 0.8%, Non-constant Variance Score test 1.1%, Spearman Rank test 1.7%, White test 2.9%, Harrison-McCabe test 1.9% and so on (see Table 3 and Figure 12). Hence, the earlier celebrated Harrison-McCabe and Non-constant Variance Score tests at sigma = 0 were outperformed by the Breusch-Godfrey test as the best at correctly detecting the absence of heteroscedasticity when the sample size is 200. The Park test returned the highest number of correct rejections, with 643, 687, 690, 690 and 689 out of every 1000 replications at sigma levels 0.1, 0.3, 0.5, 0.7 and 0.9 respectively, implying that it has the highest empirical power of 64.3%, 68.7%, 69.0%, 69.0% and 68.9%, which makes Park the best test when sigma is 0.1, 0.3, 0.5, 0.7 or 0.9 at sample size 200.

Additionally, considering sample size 500, with no heteroscedasticity present (sigma = 0), the tests returned the following empirical type I error rates: Breusch-Godfrey test 0.8%, Non-constant Variance Score test 0.8%, Spearman Rank test 1.1%, White test 2.9%, Harrison-McCabe test 1.6% and so on (see Table 3 and Figure 13). Hence, the Non-constant Variance Score and Breusch-Godfrey tests are the best at correctly detecting the absence of heteroscedasticity when the sample size is 500. The Park test returned the highest number of correct rejections, with 887, 914, 915, 914 and 915 out of every 1000 replications at sigma levels 0.1, 0.3, 0.5, 0.7 and 0.9 respectively, implying that it has the highest empirical power of 88.7%, 91.4%, 91.5%, 91.4% and 91.5%, which makes Park the best test when sigma is 0.1, 0.3, 0.5, 0.7 or 0.9 at sample size 500.

Figure 12. Sample size 200 results when error follows LHS at 1%.

Figure 13. Sample size 500 results when error follows LHS at 1%.

Lastly, considering sample size 1000, with no heteroscedasticity present (sigma = 0), the tests returned the following empirical type I error rates: Breusch-Godfrey test 1.5%, Non-constant Variance Score test 1.1%, Spearman Rank test 0.7%, White test 3.1%, Harrison-McCabe test 1.2% and so on (see Table 3 and Figure 14). Hence, the Spearman Rank test is the best at correctly detecting the absence of heteroscedasticity when the sample size is 1000. The Park test returned the highest number of correct rejections, with 977, 984, 984, 983 and 983 out of every 1000 replications at sigma levels 0.1, 0.3, 0.5, 0.7 and 0.9 respectively, implying that it has the highest empirical power of 97.7%, 98.4%, 98.4%, 98.3% and 98.3%, which makes Park the best test when sigma is 0.1, 0.3, 0.5, 0.7 or 0.9 at sample size 1000.

Figure 14. Sample size 1000 results when error follows LHS at 1%.

Table 4 presents the frequency of significant tests at the 5% level after 1000 replications for errors that follow LHS. As observed from the simulation results in Table 4 and Figure 15, the OLS model was first left uncontaminated by heteroscedasticity, yet tested with the nine heteroscedasticity tests at sample size n = 30. With no heteroscedasticity present (sigma = 0), the tests returned the following empirical type I error rates: Breusch-Godfrey test 3.4%, Spearman Rank test 4.1%, White test 12.8%, Harrison-McCabe test 0.7% and so on, making the Harrison-McCabe test the best at correctly detecting the absence of heteroscedasticity. The model was then infused with heteroscedasticity (sigma = 0.1, 0.3, 0.5, 0.7 and 0.9) at sample size 30. The Park test returned the highest number of correct rejections, with 516, 602, 605, 608 and 608 out of every 1000 replications at sigma levels 0.1, 0.3, 0.5, 0.7 and 0.9 respectively, implying that it has the highest empirical power of 51.6%, 60.2%, 60.5%, 60.8% and 60.8%, which makes Park the best test when sigma is 0.1, 0.3, 0.5, 0.7 or 0.9 at sample size 30.

Figure 15. Sample size 30 results when error follows LHS at 5%.

Moreover, considering sample size 50, with no heteroscedasticity present (sigma = 0), the tests returned the following empirical type I error rates: Breusch-Godfrey test 4.0%, Spearman Rank test 6.1%, White test 13.9%, Harrison-McCabe test 2.8% and so on (see Table 4). Hence, the Harrison-McCabe test is the best at correctly detecting the absence of heteroscedasticity. The Park test returned the highest number of correct rejections, with 593, 662, 659, 656 and 655 out of every 1000 replications at sigma levels 0.1, 0.3, 0.5, 0.7 and 0.9 respectively, implying that it has the highest empirical power of 59.3%, 66.2%, 65.9%, 65.6% and 65.5%, which makes Park the best test when sigma is 0.1, 0.3, 0.5, 0.7 or 0.9 at sample size 50.

In addition, considering sample size 100, with no heteroscedasticity present (sigma = 0), the tests returned the following empirical type I error rates: Breusch-Godfrey test 4.8%, Non-constant Variance Score test 4.0%, Spearman Rank test 5.7%, White test 13.9%, Harrison-McCabe test 5.5% and so on (see Table 4 and Figure 16). Hence, the celebrated Harrison-McCabe test at sigma = 0 was outperformed by the Non-constant Variance Score test as the best at correctly detecting the absence of heteroscedasticity when the sample size is 100. The Park test returned the highest number of correct rejections, with 706, 747, 748, 750 and 749 out of every 1000 replications at sigma levels 0.1, 0.3, 0.5, 0.7 and 0.9 respectively, implying that it has the highest empirical power of 70.6%, 74.7%, 74.8%, 75.0% and 74.9%, which makes Park the best test when sigma is 0.1, 0.3, 0.5, 0.7 or 0.9 at sample size 100.

Figure 16. Sample size 100 results when error follows LHS at 5%.

Additionally, considering sample size 500, with no heteroscedasticity present (sigma = 0), the tests returned the following empirical type I error rates: Breusch-Godfrey test 5.5%, Non-constant Variance Score test 5.5%, Spearman Rank test 4.5%, White test 13.6%, Harrison-McCabe test 5.7% and so on (see Table 4 and Figure 17). Hence, the Spearman Rank test is the best at correctly detecting the absence of heteroscedasticity when the sample size is 500. The Park test returned the highest number of correct rejections, with 969, 980, 980, 980 and 980 out of every 1000 replications at sigma levels 0.1, 0.3, 0.5, 0.7 and 0.9 respectively, implying that it has the highest empirical power of 96.9%, 98.0%, 98.0%, 98.0% and 98.0%, which makes Park the best test when sigma is 0.1, 0.3, 0.5, 0.7 or 0.9 at sample size 500.

Lastly, considering sample size 1000, with no heteroscedasticity present (sigma = 0), the tests returned the following empirical type I error rates: Breusch-Godfrey test 6.7%, Non-constant Variance Score test 5.0%, Spearman Rank test 4.9%, White test 11.6%, Harrison-McCabe test 4.7% and so on (see Table 4 and Figure 18). Hence, the Spearman Rank test is the best at correctly detecting the absence of heteroscedasticity when the sample size is 1000. The Park test returned the highest number of correct rejections, with 997, 999, 999, 999 and 999 out of every 1000 replications at sigma levels 0.1, 0.3, 0.5, 0.7 and 0.9 respectively, implying that it has the highest empirical power of 99.7%, 99.9%, 99.9%, 99.9% and 99.9%, which makes Park the best test when sigma is 0.1, 0.3, 0.5, 0.7 or 0.9 at sample size 1000.

Figure 17. Sample size 500 results when error follows LHS at 5%.

Figure 18. Sample size 1000 results when error follows LHS at 5%.

6. Summary, Conclusions and Recommendations

6.1. Summary

This study focuses on comparative analysis that determines the asymptotic behaviour of some selected statistical tests for homoscedasticity assumption by Monte Carlo simulations, and seeks to recommend the best statistical test for detecting heteroscedasticity in a multiple linear regression scenario with varying variances.

6.2. Conclusions

The aim and objectives of the study have been principally accomplished. The analyses show the comparative results under the null hypothesis that the homoscedasticity assumption is upheld, under four error structures, for sample sizes 30, 50, 100, 200, 500 and 1000, with ${\sigma }^{2}$ varied as 0, 0.1, 0.3, 0.5, 0.7 and 0.9.

In the analyses, the homoscedasticity assumption was tested under four different error structures, namely the Exponential Heteroscedastic Structure (EHS), Linear Heteroscedastic Structure (LHS), Square-Root Heteroscedastic Structure (SHS) and Quadratic Heteroscedastic Structure (QHS), for different sample sizes as ${\sigma }^{2}$ was varied. However, results for two major error structures, EHS and LHS, were reported in the course of the study. The LHS results were adopted as they also explain the results under SHS and QHS. Following our findings, Table 5 and Table 6 present the summary of the best tests across the board.

As observed from Table 5 and Table 6:

- when the OLS model was not contaminated with heteroscedasticity (i.e. sigma = 0), the White test was best at sample sizes 30 and 50 for errors following EHS, while;

- the Harrison-McCabe test was best for errors following LHS. Still at sample sizes 30 and 50;

- when the model was infused with heteroscedasticity (i.e. sigma = 0.1, 0.3, 0.5, 0.7 & 0.9), the Glejser test and the Park test were best for EHS and LHS respectively at sigma = 0.1, 0.3, 0.5, 0.7 & 0.9, except for EHS at sigma = 0.9, where the Non-constant Variance Score test was best (0.01 level only);

Table 5. Summary of the best tests (Sample Size 30, 50 and 100).

Table 6. Summary of the best tests (Sample Size 200, 500 and 1000).

- Furthermore, at sample size 100 Glejser test returned the best test when sigma is 0, 0.1, 0.3, 0.5 & 0.7 and Non-constant Variance Score test returned best when sigma is 0.9 for EHS. While;

- Park test returned the best test when sigma is 0.1, 0.3, 0.5, 0.7 & 0.9 and Non-constant Variance Score test returned best when sigma is 0 for LHS;

- In addition, at sample size 200 the Glejser test and Park test returned the best test for EHS and LHS respectively when sigma = 0.1, 0.3, 0.5, 0.7 & 0.9 except for EHS at sigma = 0.9 for 0.01 level Non-constant Variance Score test returned best;

- However, when sigma = 0 (no heteroscedasticity) the following tests returned best: Studentized Breusch-Pagan (EHS at 0.01 level); Harrison-McCabe/ Breusch-Godfrey (EHS at 0.05 level); Breusch-Godfrey (EHS at 0.1 level); Breusch-Godfrey (LHS at 0.01); and Spearman rank (LHS at 0.05 & 0.1);

- Moreover, at sample size 500 the Glejser test and Park test returned the best test for EHS and LHS respectively when sigma = 0.1, 0.3, 0.5, 0.7 & 0.9 also Non-constant Variance Score at sigma = 0.7 or 0.9 and Studentized Breusch-Pagan returned best when sigma = 0.9 for EHS;

- However, when sigma = 0 (no heteroscedasticity) the following tests returned best: Studentized Breusch-Pagan/Non-constant Variance Score (EHS at 0.01 level); Harrison-McCabe (EHS at 0.05 level); Harrison-McCabe (EHS at 0.1 level); Non-constant Variance Score/Breusch-Godfrey (LHS at 0.01); and Spearman rank (LHS at 0.05 & 0.1);

- Lastly, at sample size 1000 the Glejser test and the Park test performed best for EHS and LHS respectively when sigma = 0.1, 0.3, 0.5, 0.7 and 0.9;

- the following tests also performed best for EHS: Park (sigma = 0.3 at the 0.01 level); Non-constant Variance Score/studentized Breusch-Pagan (sigma = 0.7 and 0.9 at the 0.01 level); Park (sigma = 0.3 and 0.9 at the 0.05 and 0.1 levels); and Non-constant Variance Score/studentized Breusch-Pagan/Park (sigma = 0.7 and 0.9 at the 0.05 and 0.1 levels);

- However, when sigma = 0 (no heteroscedasticity) the following tests performed best: Goldfeld-Quandt (EHS at the 0.01 level); Breusch-Godfrey (EHS at the 0.05 and 0.1 levels); and Spearman rank (LHS at the 0.01, 0.05 and 0.1 levels).

6.3. Recommendations

From the aforementioned, the following are recommended:

1) The White and Harrison-McCabe tests should be employed to check for homoscedasticity in EHS and LHS respectively for sample sizes 30 and 50 (small samples).

2) The other tests can be employed as follows: Glejser (EHS at n = 100); Non-constant Variance Score (LHS at n = 100); studentized Breusch-Pagan (EHS at n = 200, 0.01 level); Harrison-McCabe/Breusch-Godfrey (EHS at n = 200, 0.05 level); Breusch-Godfrey (EHS at n = 200, 0.1 level); Breusch-Godfrey (LHS at n = 200, 0.01 level); Spearman rank (LHS at n = 200, 0.05 and 0.1 levels); and studentized Breusch-Pagan/Non-constant Variance Score (EHS at n = 500, 0.01 level) (moderate samples).

3) Harrison-McCabe (EHS at n = 500, 0.05 and 0.1 levels); Non-constant Variance Score/Breusch-Godfrey (LHS at n = 500, 0.01 level); Spearman rank (LHS at n = 500, 0.05 and 0.1 levels); Goldfeld-Quandt (EHS at n = 1000, 0.01 level); Breusch-Godfrey (EHS at n = 1000, 0.05 and 0.1 levels); and Spearman rank (LHS at n = 1000, 0.01, 0.05 and 0.1 levels) (large samples).

4) The Glejser and Park tests should be employed to check for heteroscedasticity in EHS and LHS respectively.

Appendix 1

Performance of the Tests when Error follows Quadratic Structure (QHS) at 1%

*Frequency of test significance after 1000 replications.

Performance of the Tests when Error follows Quadratic Structure (QHS) at 5%

*Frequency of test significance after 1000 replications.

Conflicts of Interest

The authors declare no conflicts of interest regarding the publication of this paper.

References

Bartlett, M.S. and Kendall, D.G. (1946) The Statistical Analysis of Variance Heterogeneity and the Logarithmic Transformation. Supplement to the Journal of the Royal Statistical Society, 8, 128-138. https://doi.org/10.2307/2983618
Weerahandi, S. (1995) ANOVA under Unequal Error Variances. Biometrics, 51, 589-599. https://doi.org/10.2307/2532947
Ogunleye, T.A., Olaleye, M.O. and Solomon, A.Z. (2014) Econometric Modelling of Commercial Banks’ Expenditure on the Sources of Profit Maximization in Nigeria. Scholars Journal of Economics, Business and Management, 1, 276-290.
Lim, T.S. and Loh, W.Y. (1996) A Comparison of Tests of Equality of Variances. Computational Statistics and Data Analysis, 22, 287-301. https://doi.org/10.1016/0167-9473(95)00054-2
White, H. (1980) A Heteroscedasticity Consistent Covariance Matrix and Direct Test for Heteroscedasticity. Econometrica, 48, 817-838. https://doi.org/10.2307/1912934
Graybill, F.A. (1976) The Theory and Applications of the Linear Model. Duxbury Press, London.
Conover, W.J., Johnson, M.E. and Johnson, M.M. (1981) A Comparative Study of Tests for Homogeneity of Variances with Applications to the Outer Continental Shelf Bidding Data. Technometrics, 23, 351-361. https://doi.org/10.1080/00401706.1981.10487680
Feingold, A. (1992) Sex Differences in Variability in Intellectual Abilities: A New Look at an Old Controversy. Review of Educational Research, 62, 61-84. https://doi.org/10.3102/00346543062001061
Hedges, L.V. and Friedman, L. (1993) Gender Differences in Variability in Intellectual Abilities: A Re-Analysis of Feingold’s Results. Review of Educational Research, 63, 94-105. https://doi.org/10.3102/00346543063001094
Carroll, R.J. (1982) Tests for Regression Parameters in Power Transformation Models. Scandinavian Journal of Statistics, 9, 217-222.
Mendes, M. (2003) The Comparison of Levene, Bartlett, Neyman-Pearson and Bartlett 2 Tests in Terms of Actual Type I Error Rates. Journal of Agriculture Sciences, 9, 143-146. https://doi.org/10.1501/Tarimbil_0000000782
Camdeviren, H. and Mendes, M. (2005) A Simulation Study for Type III Error Rates of Some Variance Homogeneity Tests. Pakistan Journal of Statistics, 21, 223-234.
Rogan, J.C. and Keselman, H.J. (1977) Is the ANOVA F-Test Robust to Variance Heterogeneity When Sample Sizes Are Equal? An Investigation via a Coefficient of Variation. American Educational Research Journal, 14, 493-498. https://doi.org/10.3102/00028312014004493
Cochran, W.G. and Cox, G.M. (1957) Experimental Design. John Wiley and Sons Inc., New York.
Brown, M.B. and Forsythe, A.B. (1974) The Small Sample Behavior of Some Statistics Which Test the Equality of Several Means. Technometrics, 16, 129-132. https://doi.org/10.1080/00401706.1974.10489158
Wilcox, R.R., Charlin, V.L. and Thomson, K.L. (1986) New Monte Carlo Results on the Robustness of the ANOVA F, W and F* Statistics. Communications in Statistics—Simulation and Computation, 15, 933-943. https://doi.org/10.1080/03610918608812553
Nelson, L.S. (2000) Comparing Two Variances from Normal Populations. Journal of Quality Technology, 32, 79-80. https://doi.org/10.1080/00224065.2000.11979974
Wilcox, R.R. (2002) Comparing Variances of Two Independent Groups. British Journal of Mathematical and Statistical Psychology, 55, 169-175. https://doi.org/10.1348/000711002159635
Gupta, A.K., Harrar, S. and Pardo, L. (2004) On Testing Homogeneity of Variances for Non-Normal Models Using Entropy. Department of Mathematics and Statistics, Bowling Green State University, Technical Report No. 04-11.
Zar, J.H. (1999) Biostatistical Analysis. Prentice-Hall Inc., New Jersey.
Hsiung, T.C. and Olejnik, S. (1996) Type I Error Rates and Statistical Power for the James Second-Order Test and the Univariate F Test in Two-Way ANOVA Models under Heteroscedasticity and/or Non-Normality. Journal of Experimental Education, 65, 57-71. https://doi.org/10.1080/00220973.1996.9943463
Wilcox, R.R. (1995) ANOVA: A Paradigm for Low Power and Misleading Measures of Effect Size. Review of Educational Research, 65, 51-77. https://doi.org/10.3102/00346543065001051
Oshima, T.C. and Algina, J. (1992) A SAS Program for Testing the Hypothesis of Equal Means under Heteroscedasticity: James’s Second-Order Test. Educational and Psychological Measurement, 52, 117-118. https://doi.org/10.1177/001316449205200116
Keselman, H.J., Wilcox, R.R., Othman, A.R. and Fradette, K. (2002) Trimming, Transforming Statistics and Bootstrapping: Circumventing the Biasing Effects of Heteroscedasticity and Non-Normality. Journal of Modern Applied Statistical Methods, 1, 288-309. https://doi.org/10.22237/jmasm/1036109820
Wijngaard, J.B., Klein Tank, A.M.G. and Können, G.P. (2003) Homogeneity of 20th Century European Daily Temperature and Precipitation Series. International Journal of Climatology, 23, 679-692. https://doi.org/10.1002/joc.906
Pardo, J.A., Pardo, M.C., Vicente, M.L. and Esteban, M.D. (1997) A Statistical Information Theory Approach to Compare the Homogeneity of Several Variances. Journal of Computational Statistics and Data Analysis, 24, 411-416. https://doi.org/10.1016/S0167-9473(96)00080-1
Neyman, J. and Pearson, E.S. (1931) On the Problem of k Samples. Bulletin de l’Académie Polonaise des Sciences, Ser. A, 460-481.
Layard, M.W.J. (1973) Robust Large Sample Tests for Homogeneity of Variances. Journal of the American Statistical Association, 68, 195-198. https://doi.org/10.1080/01621459.1973.10481363
Harvey, A.C. (1976) Estimating Regression Models with Multiplicative Heteroscedasticity. Econometrica, 44, 461-465. https://doi.org/10.2307/1913974
Goldfeld, S.M. and Quandt, R.E. (1972) Nonlinear Methods in Econometrics. North-Holland Publishing Company, Amsterdam.
Goldfeld, S.M. and Quandt, R.E. (1965) Some Tests for Heteroscedasticity. Journal of the American Statistical Association, 60, 539-547. https://doi.org/10.1080/01621459.1965.10480811
Glejser, H. (1969) A New Test for Heteroscedasticity. Journal of the American Statistical Association, 64, 316-323. https://doi.org/10.1080/01621459.1969.10500976
Wiedermann, W., Artner, R. and von Eye, A. (2017) Heteroscedasticity as a Basis of Direction Dependence in Reversible Linear Models. Multivariate Behavioural Research, 52, 222-241. https://doi.org/10.1080/00273171.2016.1275498