A Chi-Square Approximation for the F Distribution

The F distribution is one of the most frequently used distributions in statistics. For example, it is used for testing equality of variances of two independent normal distributions, equality of means in the one-way ANOVA setting, overall significance of a normal linear regression model, and so on. In this paper, a simple chi-square approximation for the cumulative distribution function of the F distribution is obtained via an adjusted log-likelihood ratio statistic. This new approximation exhibits remarkable accuracy even when the degrees of freedom of the F distribution are small.


Introduction
The F distribution is one of the most frequently used distributions in statistics. It arises in many practical situations. For example, the test statistic for testing equality of variances of two independent normal distributions follows an F distribution. Another example is the test statistic for testing equality of means of k independent normal distributions with homogeneous variance, which also follows an F distribution.
Johnson and Kotz [1] give a comprehensive review of the approximations to the cumulative distribution function (cdf) of the F distribution. Li and Martin [2] propose a shrinking factor approximation method and approximate the cdf of the F distribution by the cdf of the $\chi^2$ distribution. On the other hand, considering the problem of testing equality of variances of two independent normal distributions, Wong [3] derives the modified signed log-likelihood ratio statistic. As a result, a normal approximation for the cdf of the F distribution is obtained. The approximation by Wong [3] has a theoretical order of convergence $O(n^{-3/2})$. In this paper, we consider the problem of testing equality of means of k independent normal distributions with homogeneous variance. Rather than the standard one-way ANOVA approach, we derive an adjusted log-likelihood ratio statistic, which is asymptotically distributed as a $\chi^2$ distribution and whose mean is exactly the same as the mean of that $\chi^2$ distribution. As a result, a very accurate new $\chi^2$ approximation for the cdf of the F distribution is obtained.

Bartlett Corrected Log-Likelihood Ratio Statistic
Let $(Y_1, \ldots, Y_n)$ be independent and identically distributed random variables with joint log-likelihood function $\ell(\theta)$, where $\theta$ is a $p$-dimensional vector parameter. A frequently used asymptotic method for testing the hypothesis
$$H_0: \psi(\theta) = \psi_0 \quad \text{versus} \quad H_a: \psi(\theta) \neq \psi_0 \qquad (1)$$
is based on the asymptotic distribution of the log-likelihood ratio statistic. In particular, the log-likelihood ratio statistic is defined as
$$W = 2\{\ell(\hat{\theta}) - \ell(\tilde{\theta})\},$$
where $\hat{\theta}$ is the unconstrained maximum likelihood estimator of $\theta$ and $\tilde{\theta}$ is the constrained maximum likelihood estimator of $\theta$ subject to $\psi(\theta) = \psi_0$. Generally, this constrained maximum likelihood estimator of $\theta$ can be obtained by the Lagrange multiplier method. Under the regularity conditions stated in Cox and Hinkley [4], it is well known that $W$ is asymptotically distributed as a $\chi^2_r$ distribution, where $r$, the degrees of freedom, is the difference between the number of parameters estimated without the constraint and the number estimated under the constraint. Hence, the observed level of significance for testing the hypothesis in (1) is $P(\chi^2_r \geq w)$, where $w$ is the observed value of the log-likelihood ratio statistic $W$. Note that Cox and Hinkley [4] show that this method of obtaining the observed level of significance has order of convergence of only [...].
There exist many different ways of improving the accuracy of the log-likelihood ratio statistic. Barndorff-Nielsen and Cox [5] and Brazzale et al. [6] give detailed reviews of some higher order asymptotic methods and their applications. Recently, Davison et al. [7] derived a directional test for a vector parameter of interest for the linear exponential families. The method is quite complicated, both in theory and in computation.
In this paper, we propose a statistic that is very similar to the Bartlett corrected log-likelihood ratio statistic. Bartlett [8] [9] shows that the expected value of $W$ can be expressed as $E(W) = r\{1 + b/n + O(n^{-2})\}$, where $b$ is known as the Bartlett factor. Since $E(W)$ does not equal the mean of the $\chi^2_r$ distribution, Bartlett [8] [9] proposes to adjust the log-likelihood ratio statistic by $W^* = W/(1 + b/n)$. Lawley [10] shows that in fact all cumulants of $W^*$ agree with those of a $\chi^2_r$ distribution to the same order. Lawley's proof is very complicated. Barndorff-Nielsen and Cox [11] discuss a much simpler derivation based on the saddlepoint approximation.
However, the Bartlett factor, $b$, is in general very difficult to obtain. This has limited the use of the Bartlett corrected log-likelihood ratio statistic in applied statistics.
In this paper, we propose to adjust the log-likelihood ratio statistic $W$ such that the adjusted log-likelihood ratio statistic has exactly the same mean as the $\chi^2_r$ distribution. In other words, let $W^\dagger = rW/E(W)$. Then $W^\dagger$ is asymptotically distributed as a $\chi^2_r$ distribution. Thus, the observed level of significance for testing the hypothesis in (1) is $P(\chi^2_r \geq w^\dagger)$, where $w^\dagger$ is the observed value of $W^\dagger$. Note that this adjusted log-likelihood ratio statistic is just a modified version of the Bartlett corrected log-likelihood ratio statistic.
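The link to the Bartlett correction can be made explicit with a short worked derivation. It is a sketch under two assumptions that are not verbatim from the text: that the adjustment is the rescaling $W^\dagger = rW/E(W)$, and that $E(W)$ admits the usual Bartlett expansion $E(W) = r\{1 + b/n + O(n^{-2})\}$ with $W^* = W/(1 + b/n)$.

```latex
\begin{align*}
E(W^\dagger) &= \frac{r}{E(W)}\,E(W) = r, \\
W^\dagger &= \frac{W}{E(W)/r}
           = \frac{W}{1 + b/n + O(n^{-2})}
           = W^{*}\,\bigl\{1 + O(n^{-2})\bigr\}.
\end{align*}
```

Under these assumptions the two statistics differ only in a factor of order $n^{-2}$, while $W^\dagger$ avoids deriving the Bartlett factor $b$ analytically.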
In the next section, the proposed adjusted log-likelihood ratio statistic for testing the equality of means of k homoscedastic normally distributed populations is derived. By comparing it with the standard F-test in the one-way ANOVA approach, an approximation to the cdf of the F distribution is obtained.

Main Result
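From the one-way ANOVA approach, we have the following sums of squares:
$$SSTr = \sum_{i=1}^{k} n_i (\bar{X}_i - \bar{X})^2, \qquad SSE = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (X_{ij} - \bar{X}_i)^2,$$
and the degrees of freedom are $dfTr = k - 1$ and $dfE = n - k$. Here $X_{ij}$ is the $j$th observation from the $i$th population, $n_i$ is the $i$th sample size, $n = \sum_{i=1}^{k} n_i$, $\bar{X}_i$ is the $i$th sample mean, and $\bar{X}$ is the overall sample mean. It can be shown that the unconstrained maximum likelihood estimator is $(\hat{\mu}_1, \ldots, \hat{\mu}_k, \hat{\sigma}^2)$, where $\hat{\mu}_i = \bar{X}_i$ and $\hat{\sigma}^2 = SSE/n$. When the null hypothesis in (3) is true, the log-likelihood function can be written, up to an additive constant, as
$$\ell(\mu, \sigma^2) = -\frac{n}{2} \log \sigma^2 - \frac{1}{2\sigma^2} \sum_{i=1}^{k} \sum_{j=1}^{n_i} (X_{ij} - \mu)^2,$$
and the constrained maximum likelihood estimator is $\tilde{\mu} = \bar{X}$, $\tilde{\sigma}^2 = (SSTr + SSE)/n$. Hence, the log-likelihood ratio statistic is
$$W = n \log\left(1 + \frac{SSTr}{SSE}\right),$$
and $W$ is asymptotically distributed as a $\chi^2$ distribution with $dfTr$ degrees of freedom.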
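The quantities above can be computed directly, as in the following minimal sketch. The data set and the function name `anova_lr` are hypothetical, and the form $W = n\log(1 + SSTr/SSE)$ is an assumption that follows from the normal likelihood with a common variance; it is not the authors' code.

```python
import math

def anova_lr(groups):
    """Return (SSTr, SSE, dfTr, dfE, F, W) for a list of samples, one per group."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    group_means = [sum(g) / len(g) for g in groups]
    # Treatment and error sums of squares from one-way ANOVA.
    ss_tr = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    ss_e = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    df_tr, df_e = k - 1, n - k
    f_stat = (ss_tr / df_tr) / (ss_e / df_e)
    # W = 2 * (unconstrained max log-likelihood - constrained max log-likelihood)
    #   = n * log(1 + SSTr / SSE) for normal groups with a common variance.
    w = n * math.log1p(ss_tr / ss_e)
    return ss_tr, ss_e, df_tr, df_e, f_stat, w

# Hypothetical data: three groups of three observations each.
data = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [4.0, 5.0, 6.0]]
ss_tr, ss_e, df_tr, df_e, f_stat, w = anova_lr(data)
print(ss_tr, ss_e, df_tr, df_e, f_stat, w)
```

Because $SSTr/SSE = (dfTr/dfE)\,F$, the statistic $W$ is an increasing function of the usual F statistic, which is what allows a statement about $W$ to be turned into a statement about the F distribution.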
Our proposed method requires
$$E(W) = \int_0^\infty n \log\left(1 + \frac{dfTr}{dfE}\, y\right) g(y; dfTr, dfE)\, dy, \qquad (5)$$
where $g(y; dfTr, dfE)$ is the probability density function of the F distribution with degrees of freedom $(dfTr, dfE)$. Therefore, the observed level of significance for testing the hypothesis in (3) based on the proposed adjusted log-likelihood ratio statistic is
$$P\left(\chi^2_{dfTr} \geq \frac{dfTr \cdot n \log\left(1 + dfTr\, f^*/dfE\right)}{E(W)}\right),$$
where $E(W)$ is defined in (5) and $f^*$ is the observed value of the test statistic given in (4).
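The calculation can be sketched numerically as follows. The sketch assumes the adjusted statistic is $W^\dagger = dfTr \cdot W / E(W)$ with $W = n\log(1 + dfTr \cdot F/dfE)$ and $n = dfTr + dfE + 1$. The function names are hypothetical, and the fixed-step midpoint rule is a stand-in chosen for brevity, not the authors' numerical method. For very small degrees of freedom the integrand has endpoint singularities, and an adaptive quadrature routine would be preferable.

```python
import math

def f_pdf(y, u, v):
    """Density g(y; u, v) of the F distribution with (u, v) degrees of freedom, y > 0."""
    log_const = (math.lgamma((u + v) / 2) - math.lgamma(u / 2) - math.lgamma(v / 2)
                 + (u / 2) * math.log(u / v))
    return math.exp(log_const + (u / 2 - 1) * math.log(y)
                    - ((u + v) / 2) * math.log1p(u * y / v))

def expected_w(u, v, steps=20000):
    """E(W): integral of n*log(1 + u*y/v) * g(y; u, v) over (0, inf), with n = u + v + 1.

    Midpoint rule after the substitution y = t / (1 - t), which maps (0, inf) to (0, 1).
    """
    n = u + v + 1
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        y = t / (1.0 - t)
        total += n * math.log1p(u * y / v) * f_pdf(y, u, v) / (1.0 - t) ** 2
    return total * h

def chi2_sf(x, r):
    """P(chi-square_r >= x), using the series for the regularized incomplete gamma."""
    if x <= 0:
        return 1.0
    a, z = r / 2.0, x / 2.0
    term = total = 1.0 / a
    k = 1
    while term > 1e-16 * total:
        term *= z / (a + k)
        total += term
        k += 1
    return 1.0 - total * math.exp(a * math.log(z) - z - math.lgamma(a))

def adjusted_p_value(f_obs, df_tr, df_e):
    """Observed level of significance based on W_dagger = dfTr * W / E(W)."""
    n = df_tr + df_e + 1
    w = n * math.log1p(df_tr * f_obs / df_e)
    w_dagger = df_tr * w / expected_w(df_tr, df_e)
    return chi2_sf(w_dagger, df_tr)

# 3.3258 is the tabulated upper 5% point of F(5, 10), so the exact p-value is about 0.05.
print(adjusted_p_value(3.3258, 5, 10))
```

Note that the factor $n$ cancels in $W/E(W)$, so the adjusted statistic depends on the data only through the observed F value and the two degrees of freedom. The printed value should be close to the exact level of 0.05.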
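By re-indexing the above approximation with $u = dfTr$ and $v = dfE$, so that $n = u + v + 1$, let $X$ be distributed as the $F_{u,v}$ distribution and let $x > 0$. Hence, the log-likelihood ratio statistic is $w = (u + v + 1) \log(1 + ux/v)$, and $P(X \leq x)$ can be approximated by $P(\chi^2_u \leq w)$. However, this approximation has order of convergence [...] only.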
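The proposed approach gives $P(X \leq x) \approx P(\chi^2_u \leq u\,w/E(W))$. The proposed approximation will be problematic when $u$ is large. Nevertheless, since $1/X$ is distributed as $F_{v,u}$, the $F_{u,v}$ distribution has the inverse property $P(F_{u,v} \leq x) = 1 - P(F_{v,u} \leq 1/x)$, which can be applied to obviate this problem. Thus, the proposed approximation is: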

Numerical Comparisons
Wong [3] gives a simple and accurate normal approximation to the cdf of the F distribution. It is of interest to compare the proposed method with the approximation by Wong [3].

Conclusion
In this paper, a simple chi-square approximation to the cumulative distribution function of the F distribution is obtained.

Figures 1(a)-8(a) are the plots of the cumulative distribution functions of the $F_{u,v}$ distribution for various $u$ and $v$ obtained by the exact method, the approximation by Wong [3], and the proposed method. The differences between the two approximated cumulative distribution functions and the exact cumulative distribution function are barely noticeable. To explore the accuracy of the two approximations, we examine the relative error, which is defined as
have a closed form solution but it can be obtained numerically by software like R, Maple and Matlab. Table 1 records some values