Parker Test for Heteroskedasticity Based on Sample Fitted Values

To address the drawbacks of the traditional Parker test in multivariate linear models, namely that the procedure is cumbersome and computationally intensive, we propose a new heteroskedasticity test. The test uses the sample fitted values as the new explanatory variable to reconstruct the regression model, and heteroskedasticity is then judged from a significance test of the regression coefficient. The new test is also compared with the existing Parker test improved by the principal component idea. Numerical simulations and empirical analyses show that the improved Parker test based on the sample fitted values proposed in this paper is superior.


Introduction
A basic assumption of classical linear regression analysis is that the random error terms μ_i of the model are homoskedastic, i.e. they have the same variance σ². However, studies have shown that heteroskedasticity is an almost universal phenomenon when regression analysis is performed with cross-sectional or time-series data. Therefore, the study of heteroskedasticity in econometric modelling has become a hot issue for many scholars. There are many different tests for heteroskedasticity, for example the graphical test, the Parker test, Spearman's rank correlation test, the Glejser test, the White test, the G-Q test and so on [1]-[7]. Bai Xuemei [8] surveyed various methods for testing heteroskedasticity, including the Parker test model and its existing shortcomings, and Liu Ming and Huang Hengjun [9] proposed using the sample fitted values ŷ_i as the new standard, improving efficiency.
Among them, the traditional Parker test can not only test for the existence of heteroskedasticity in the one-dimensional linear regression model, but can also write out a specific expression for the heteroskedasticity. However, for multiple linear regression models, the traditional Parker test has no single equation to test; it can only test each explanatory variable one by one, which is a tedious and computationally intensive process and can lead to multiple heteroskedasticity models.

Heteroskedasticity Model
The linear regression model is

y = β₀ + β₁x₁ + β₂x₂ + ⋯ + β_k x_k + μ, (1)

which is a univariate regression model when k = 1 and a multiple regression model when k ≥ 2, where k is the number of explanatory variables, y is called the explained variable (dependent variable) and x₁, x₂, …, x_k are called the explanatory variables (independent variables).

The Traditional Parker Test
The Parker test assumes that the error variance is a power function of an explanatory variable, σ_i² = σ² x_i^β e^{v_i}, which after taking logarithms (with e_i² as a proxy for σ_i²) gives the regression

ln e_i² = ln σ² + β ln x_i + v_i.

If β in the above equation is statistically significant, the data are heteroskedastic; if β is not statistically significant, there is no heteroskedasticity.
Specific steps of the traditional Parker test:
Step 1: Estimate the original regression by ordinary least squares and obtain the squared sample residuals e_i².
Step 2: Regress ln e_i² on the logarithm of the explanatory variable suspected to be associated with the heteroskedasticity.
Step 3: Perform a significance test on the above regression: reject the null hypothesis of homoskedasticity if β is statistically significant, or accept the null hypothesis of homoskedasticity if β is not statistically significant.
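As an illustration (not the paper's code), the three steps can be sketched in Python on simulated data; the data-generating process, seed, and variable names below are assumptions for the example.

```python
# Illustrative sketch of the traditional Parker (Park) test.
# The simulated data and all names are assumptions for this example.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 400
x = rng.uniform(1, 10, n)              # positive regressor so ln(x) exists
mu = rng.normal(0, 1, n) * x           # error variance grows with x
y = 2.0 + 3.0 * x + mu

def ols(X, y):
    """OLS with intercept; returns coefficients, residuals, p-values."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    df = len(y) - X.shape[1]
    sigma2 = resid @ resid / df
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    pvals = 2 * stats.t.sf(np.abs(beta / se), df)
    return beta, resid, pvals

# Step 1: estimate the original regression, keep the squared residuals e_i^2
_, e, _ = ols(x, y)
# Steps 2-3: regress ln(e_i^2) on ln(x_i) and test the slope beta
_, _, p = ols(np.log(x), np.log(e ** 2))
print("p-value of beta:", p[1])        # small value -> heteroskedasticity
```

A small p-value for the slope leads to rejecting homoskedasticity, mirroring Step 3 above.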

Parker's Test Improved by Principal Components Thinking
A principal component analysis is performed on the explanatory variables x₁, x₂, x₃; all the principal components obtained are then combined with ln e_i², and the following heteroskedasticity model is established from the new data:

ln e_i² = b₀ + b₁z₁ + b₂z₂ + b₃z₃ + v_i, (2)

where z₁, z₂, z₃ denote the principal components generated from the explanatory variables x₁, x₂, x₃. The least squares method is used to estimate the coefficients of model (2).
The significance of the coefficients b₁, b₂, b₃ in the model is tested using p-values. Comparing the p-values obtained with α = 0.05, the presence of any significant coefficient among b₁, b₂, b₃ indicates heteroskedasticity; conversely, the assumption of homoskedasticity is satisfied.
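The principal-component version of the test can be sketched as follows; the simulated data and seed are illustrative assumptions, and the principal components are computed here by SVD of the centered regressor matrix.

```python
# Illustrative sketch of the Parker test improved with principal components.
# Simulated data; all names and settings are assumptions for this example.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 400
X = rng.uniform(1, 10, (n, 3))         # x1, x2, x3
mu = rng.normal(0, 1, n) * X[:, 0]     # heteroskedasticity tied to x1
y = 1.0 + X @ np.array([2.0, 1.0, 0.5]) + mu

def ols(Z, y):
    """OLS with intercept; returns coefficients, residuals, p-values."""
    Z = np.column_stack([np.ones(len(y)), Z])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ beta
    df = len(y) - Z.shape[1]
    se = np.sqrt(np.diag(resid @ resid / df * np.linalg.inv(Z.T @ Z)))
    return beta, resid, 2 * stats.t.sf(np.abs(beta / se), df)

_, e, _ = ols(X, y)                    # squared residuals e_i^2
# Principal components z1, z2, z3 of the centered explanatory variables
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
Z = Xc @ Vt.T
# Model (2): regress ln(e_i^2) on the principal components
_, _, p = ols(Z, np.log(e ** 2))
print("p-values of b1, b2, b3:", p[1:])
# Any p-value below 0.05 indicates heteroskedasticity
```

Note that a significant b_j points only to a principal component, not to a specific original variable x₁, x₂ or x₃.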

An Improved Parker Test Based on Sample Fitted Values
The above-mentioned traditional Parker test is a complex and cumbersome procedure. Specific steps of the improved Parker test:
Step 1: Estimate Equation (1) by OLS to obtain the sample fitted values ŷ_i and the residuals.
Step 2: Regress ln e_i² on ln ŷ_i:

ln e_i² = α + β ln ŷ_i + v_i. (3)

Step 3: Test the significance of β: if β is statistically significant, heteroskedasticity is present; otherwise the homoskedasticity assumption holds.
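A minimal sketch of the improved test with sample fitted values follows; the simulated data are illustrative assumptions, and the intercept is chosen large so the fitted values stay positive and their logarithm exists.

```python
# Illustrative sketch of the improved Parker test with sample fitted values.
# Simulated data; the large intercept keeps fitted values positive so that
# ln(y_hat) exists -- an assumption of this example.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n = 400
X = rng.uniform(1, 10, (n, 3))
mu = rng.normal(0, 1, n) * X[:, 0]     # heteroskedasticity tied to x1
y = 10.0 + X @ np.array([2.0, 1.0, 0.5]) + mu

def ols(Z, y):
    """OLS with intercept; returns coefficients, residuals, p-values."""
    Z = np.column_stack([np.ones(len(y)), Z])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ beta
    df = len(y) - Z.shape[1]
    se = np.sqrt(np.diag(resid @ resid / df * np.linalg.inv(Z.T @ Z)))
    return beta, resid, 2 * stats.t.sf(np.abs(beta / se), df)

# Step 1: OLS on the original model -> fitted values y_hat and residuals
_, e, _ = ols(X, y)
yhat = y - e
# Steps 2-3: regress ln(e_i^2) on ln(y_hat_i) and test the single beta
_, _, p = ols(np.log(yhat), np.log(e ** 2))
print("p-value of beta:", p[1])        # only one coefficient to test
```

Only a single coefficient needs testing here, which is what makes this variant simpler than the variable-by-variable or principal-component procedures.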

Random Simulation
Studying the heteroskedasticity tests through random simulation can demonstrate the usefulness and validity of both the Parker test with the sample principal components as explanatory variables and the Parker test with the sample fitted values as the explanatory variable.

1) Generation of simulation data
To generate the random simulation data, three sets of sample variables are constructed, where μ_i are 400 normal random terms with mean 0, set in the form of:

where ξ are mutually independent random variables that follow the standard normal distribution.
Obviously, a random term of this form is highly susceptible to heteroskedasticity, and the heteroskedasticity is related to the explanatory variable x₁.
p₁ denotes the p-value of the test on coefficient b₁, p₂ denotes the p-value of the test on b₂, and p₃ denotes the p-value of the test on b₃.
Since the Parker test with the fitted value as the new explanatory variable has only one p-value, its results are written centered, as shown in Table 1.
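In the spirit of the simulation study, a small Monte Carlo can repeat the data generation, apply both improved tests, and record how often each detects the heteroskedasticity at α = 0.05. All settings below (seed, 100 replications, coefficients) are assumptions for the example, not the paper's actual design.

```python
# Illustrative Monte Carlo: repeated data generation, both improved tests
# applied each time, detection rates recorded. Settings are assumptions.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n, reps, alpha = 400, 100, 0.05

def ols(Z, y):
    """OLS with intercept; returns coefficients, residuals, p-values."""
    Z = np.column_stack([np.ones(len(y)), Z])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ beta
    df = len(y) - Z.shape[1]
    se = np.sqrt(np.diag(resid @ resid / df * np.linalg.inv(Z.T @ Z)))
    return beta, resid, 2 * stats.t.sf(np.abs(beta / se), df)

hits_pc = 0   # detections by the principal-component test
hits_fit = 0  # detections by the fitted-value test
for _ in range(reps):
    X = rng.uniform(1, 10, (n, 3))
    mu = rng.normal(0, 1, n) * X[:, 0]          # variance tied to x1
    y = 10.0 + X @ np.array([2.0, 1.0, 0.5]) + mu
    _, e, _ = ols(X, y)
    # Principal-component test: heteroskedastic if any b_j is significant
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    _, _, p_pc = ols(Xc @ Vt.T, np.log(e ** 2))
    hits_pc += bool((p_pc[1:] < alpha).any())
    # Fitted-value test: a single coefficient to check
    _, _, p_f = ols(np.log(y - e), np.log(e ** 2))
    hits_fit += bool(p_f[1] < alpha)

print("detection rate, PC test:", hits_pc / reps)
print("detection rate, fitted-value test:", hits_fit / reps)
```

With heteroskedasticity this strong, both tests should detect it in nearly every replication; the difference lies in the amount of work per replication.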
The following conclusions were drawn from Table 1.
The following conclusions were drawn from Table 2.
Regardless of the magnitude of the correlation coefficients between the explanatory variables, both improved tests detect the heteroskedasticity in the simulated model.

Analysis of Practical Examples
1) Data sources
The regional gross domestic product (y), per capita consumption expenditure (x₁), per capita regional general budget expenditure (x₂), and price index of fixed asset investment (x₃) were collected from the statistical yearbook for 31 provinces in 2018, in RMB. First, the following multiple regression model is established:

y = β₀ + β₁x₁ + β₂x₂ + β₃x₃ + μ, (4)

and the OLS regression of (4) is performed to obtain a set of fitted values ŷ and the squared residuals e_i². The estimated equation is

ŷ = 9.574e09 + 1.493e08 x₁ + 8.809e03 x₂ + 9.255e07 x₃. (5)

a) Methodological steps using the sample principal components as new variables:
Step 1: OLS regression of Equation (5) to obtain the squared residuals e_i².
Step 2: Calculate the sample principal components to obtain the first, second and third principal components z₁, z₂, z₃; the new variables contain all the information of the explanatory variables x₁, x₂, x₃.
The results in Table 3 show that the p-value of b₂ is less than 0.05, indicating that this regression coefficient is significantly non-zero, i.e. the logarithm of the squared residuals is related to the second principal component; however, it is still not possible to tell whether the heteroskedasticity is related to x₁, x₂ or x₃.
At the significance level α = 0.05, Table 4 shows that the p-value of the regression coefficient is less than 0.05, which means that β is significantly non-zero, i.e. the logarithm of the squared residuals is correlated with ln ŷ_i.
The same conclusion is thus obtained with the improved Parker test: there is heteroskedasticity in the multiple regression model.
In summary, the Parker test modified with the principal components and the Parker test modified with the sample fitted values reach the same conclusion. Although both methods require only one heteroskedasticity model to establish the existence of heteroskedasticity in the original model, the Parker test modified with the sample fitted values is simpler and faster to implement, omitting the step of calculating the sample principal components, and its results are more valid. Moreover, the sample fitted values also contain all the information on the explanatory variables, and the fit better reflects the variance trends in the overall data. Therefore, the method using the sample fitted values as the new variable is more effective than the method using the sample principal components, and the improved method can replace the Parker test with the principal components as the explanatory variables.

Conclusions
There are many different methods of testing for heteroskedasticity in regression models, and scholars at home and abroad have proposed many tests that are more effective than the traditional methods. In this paper, a new test