Asymptotic Efficiency of the Maximum Likelihood Estimator for the Box-Cox Transformation Model with Heteroscedastic Disturbances

This paper considers the asymptotic efficiency of the maximum likelihood estimator (MLE) for the Box-Cox transformation model with heteroscedastic disturbances. The MLE under the normality assumption (BC MLE) is a consistent and asymptotically efficient estimator if the "small σ" condition is satisfied and the number of parameters is finite. However, the BC MLE cannot be asymptotically efficient, and its rate of convergence is slower than the ordinary order $1/\sqrt{n}$, when the number of parameters goes to infinity. A new consistent estimator of order $1/\sqrt{n}$ is proposed. One important implication of this study is that estimation methods should be chosen carefully when the model contains many parameters in actual empirical studies.


Introduction
The Box-Cox transformation model (BC model) [1] is widely used in empirical studies. For details on this model, see Hossain [2] and Sakia [3]. The maximum likelihood estimator (MLE), which maximizes the likelihood function under the normality assumption (BC MLE), can be asymptotically efficient if the "small σ" condition described by Bickel and Doksum [4] is satisfied. On the other hand, the model with heteroscedastic disturbances, in which variances differ among groups, is also widely used in the analysis of various datasets such as panel data [5]. It is sometimes necessary to consider a model combining these two models. Nawata and Kawabuchi [6]-[11] analyzed length of stay (LOS) in Japanese hospitals using the BC model. They found that the variances often differed considerably among hospitals even after the transformation. Their studies are examples of such cases.
It is well known that the MLE is usually an asymptotically efficient estimator when the number of parameters is finite. However, this may not be true when the number of parameters goes to infinity. It is often necessary to consider cases in which the number of groups goes to infinity. For example, the new medical payment system known as the Diagnostic Procedure Combination/Per Diem Payment System (DPC/PDPS) was introduced in 2003 in Japan, and as of April 2014, 1863 hospitals had either already joined or were preparing to join; this number has been increasing [12]. The hospitals joining the DPC/PDPS are required to computerize their medical information. This means that it has become possible to analyze a large-scale dataset that contains information from many hospitals. In other words, it is necessary to consider the asymptotic properties of estimators when the number of groups (hospitals) goes to infinity.
This paper considers the estimation of the Box-Cox transformation model with heteroscedastic disturbances when the number of groups increases to infinity. In such cases, the conventional maximum likelihood method yields only an estimator whose rate of convergence is slower than the ordinary order of $1/\sqrt{n}$, even if the "small σ" condition is satisfied in all groups. A new estimation method that can handle these problems is then proposed.

BC Model with Heteroscedastic Disturbances
Suppose that $t_{ij}$ is the dependent variable of observation $j$ in group $i$ (for example, LOS of patient $j$ in hospital $i$ in Nawata and Kawabuchi [6]-[11]). I consider the BC model:

$$z_{ij} = x_{ij}'\beta + u_{ij}, \qquad z_{ij} = \begin{cases} (t_{ij}^{\lambda} - 1)/\lambda, & \lambda \neq 0, \\ \log t_{ij}, & \lambda = 0, \end{cases} \qquad i = 1, 2, \ldots, k, \quad j = 1, 2, \ldots, n_i, \tag{1}$$

with heteroscedastic disturbances and variances given by

$$V(u_{ij}) = \sigma_i^2, \tag{2}$$

where $\lambda$ is the transformation parameter, $x_{ij}$ and $\beta$ are the vectors of the explanatory variables and coefficients, $k$ is the number of groups, and $n_i$ is the number of people in group $i$. We assume that the "small σ" condition described by Bickel and Doksum [4] ([...] under the normality assumption is small enough for all $i$ and $j$) is satisfied.¹ Under this condition, we can assume that $u_{ij}$ follows the normal distribution with mean 0 and variance $\sigma_i^2$. The likelihood function is given by [13]

$$\log L = \sum_{i=1}^{k} \sum_{j=1}^{n_i} \left[ \log \phi\!\left( \frac{z_{ij} - x_{ij}'\beta}{\sigma_i} \right) - \log \sigma_i \right] + (\lambda - 1) \sum_{i=1}^{k} \sum_{j=1}^{n_i} \log t_{ij}, \tag{3}$$

where $\phi$ is the density function of the standard normal distribution. Although the likelihood function is a function of $\lambda, \beta, \sigma_1, \sigma_2, \ldots, \sigma_k$, we write it simply as $\log L$, as in (3). Note that Showalter [14] reports a large bias of the BC MLE when heteroscedasticity is ignored.

¹ If the "small σ" condition is not satisfied, we can use the estimator proposed by Nawata [15] instead of the BC MLE. Even in this case, however, we reach the same conclusion as that presented here; that is, when heteroscedasticity is considered and the number of parameters goes to infinity, the estimator is only a consistent estimator of order $1/\sqrt{m}$, and there exists a consistent estimator of order $1/\sqrt{n}$ obtained by a modification of the homoscedastic case.
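As an illustration of the model and likelihood described above, the following is a minimal Python sketch. It assumes the usual form of the BC log-likelihood under normality, with a group-specific $\sigma_i$ and the Jacobian term $(\lambda - 1)\sum\sum \log t_{ij}$. The function names `box_cox` and `hetero_bc_loglik` and the `group` index array are hypothetical and are introduced here only for illustration.

```python
import numpy as np

def box_cox(t, lam):
    """Box-Cox transform z = (t**lam - 1)/lam, with the log limit at lam = 0."""
    t = np.asarray(t, dtype=float)
    if abs(lam) < 1e-12:
        return np.log(t)
    return (t**lam - 1.0) / lam

def hetero_bc_loglik(lam, beta, sigma, t, X, group):
    """Log-likelihood under normality with one standard deviation per group.

    t     : positive outcomes, shape (n,)
    X     : explanatory variables, shape (n, p)
    group : integer group label (0..k-1) of each observation, shape (n,)
    sigma : group standard deviations, shape (k,)
    """
    z = box_cox(t, lam)
    resid = z - X @ beta
    s = np.asarray(sigma, dtype=float)[group]   # sigma_i matched to each row
    log_phi = -0.5 * np.log(2.0 * np.pi) - 0.5 * (resid / s) ** 2
    jacobian = (lam - 1.0) * np.sum(np.log(t))  # from the change of variable
    return np.sum(log_phi - np.log(s)) + jacobian
```

Passing `sigma` as an array indexed by `group` keeps the number of variance parameters equal to the number of groups, which is the feature that matters when $k$ grows.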

Estimation of the Model When the Number of Groups Goes to Infinity
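Let the total number of observations be $n = \sum_{i=1}^{k} n_i$. $k$ is assumed to increase at a slower rate than $n_i$; that is, [...]. The estimators of $\lambda, \beta, \sigma_1, \sigma_2, \ldots, \sigma_k$ are obtained by maximizing (3). Let $\lambda_0, \beta_0, \sigma_{i0}$ be the true parameter values of $\lambda, \beta, \sigma_i$, and let $\hat{\sigma}_i$ be the MLE of $\sigma_i$.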
We do not assume any specific form for the variances. With $n_i = O(m)$, $\hat{\sigma}_i$ can only be a consistent estimator of order $1/\sqrt{m}$ under any estimation method (Baltagi and Griffin [5] considered different variance estimators). When we substitute $\hat{\sigma}_i$, the conditions under which the estimators obtained by maximizing (3) are of order $1/\sqrt{n}$ are

[...]

As before, although the derivatives are evaluated at $\hat{\sigma}_1, \hat{\sigma}_2, \ldots, \hat{\sigma}_k$, we write them simply as in (4). Then (5) becomes [...]. Here, [...]. We get [...]. Therefore, if [...], the condition is satisfied. This means that we can use the standard method of dealing with heteroscedasticity if and only if [...]. However, for the transformation parameter $\lambda$, since we get [15] [16] [...] under the "small σ" condition, where $y_{ij}^{*}$ is the value of $y_{ij}$ when [...], we get [...], and (4) is not satisfied. This means that the MLE is a consistent estimator only of order $1/\sqrt{m}$; that is, its rate of convergence is slower than the ordinary order $1/\sqrt{n}$ when $k \to \infty$, and the estimator of $\lambda$ cannot be a consistent estimator of order $1/\sqrt{n}$. Here, [...]. Therefore, the estimator of $\beta$ cannot be an estimator of order $1/\sqrt{n}$ either.
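The different behavior of $\beta$ and $\lambda$ can be made concrete with a short calculation of the cross derivatives with respect to $\sigma_i$. The following is only a sketch, assuming the usual normal log-likelihood contribution of observation $(i, j)$ and nonrandom $x_{ij}$; the symbol $\ell_{ij}$ is introduced here for illustration and is not part of the original notation.

```latex
\begin{align*}
\ell_{ij} &= -\log\sigma_i - \frac{u_{ij}^2}{2\sigma_i^2}
             + (\lambda - 1)\log t_{ij} + \text{const},
  \qquad u_{ij} = z_{ij} - x_{ij}'\beta, \\[4pt]
\frac{\partial^2 \ell_{ij}}{\partial\beta\,\partial\sigma_i}
  &= -\frac{2\,x_{ij}\,u_{ij}}{\sigma_i^3},
  \qquad
  E\!\left[\frac{\partial^2 \ell_{ij}}{\partial\beta\,\partial\sigma_i}\right] = 0, \\[4pt]
\frac{\partial^2 \ell_{ij}}{\partial\lambda\,\partial\sigma_i}
  &= \frac{2\,u_{ij}}{\sigma_i^3}\,\frac{\partial z_{ij}}{\partial\lambda},
  \qquad
  \frac{\partial z_{ij}}{\partial\lambda}
  = \frac{\lambda\,t_{ij}^{\lambda}\log t_{ij} - \left(t_{ij}^{\lambda} - 1\right)}{\lambda^2}
  \quad (\lambda \neq 0).
\end{align*}
```

Because $\partial z_{ij}/\partial\lambda$ is a function of $t_{ij}$, and hence of $u_{ij}$, the expectation of the last cross derivative is in general nonzero. Under this sketch, the estimation error in $\hat{\sigma}_i$ therefore carries over to the score for $\lambda$, while for a fixed $\lambda$ it does not affect the score for $\beta$ to first order.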

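A Consistent Estimator of Order $1/\sqrt{n}$
Here, an alternative estimator is proposed by an essential modification of the likelihood function. Suppose that the disturbances are homoscedastic, so that $\sigma_i^2 = \sigma^2$ for all $i$. Then the likelihood becomes

$$\log L = \sum_{i=1}^{k} \sum_{j=1}^{n_i} \log \phi\!\left( \frac{z_{ij} - x_{ij}'\beta}{\sigma} \right) - n \log \sigma + (\lambda - 1) \sum_{i=1}^{k} \sum_{j=1}^{n_i} \log t_{ij}, \tag{15}$$

where $z_{ij}$ is the Box-Cox transformation of $t_{ij}$. Instead of maximizing (15), we consider the roots of the equations (16)-(18).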
For the standard maximum likelihood method, the variance is estimated by the simple average.However, in this case, the variance is estimated by the weighted average of least squares residuals.
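For reference, the standard homoscedastic maximum likelihood step, in which the variance is the simple average of squared residuals, can be sketched as a profile likelihood over $\lambda$. This is only an illustration of the standard method, not the proposed estimator, whose weights are not reproduced here. The function names, the grid search, and the grid range are assumptions made for the sketch.

```python
import numpy as np

def box_cox(t, lam):
    """Box-Cox transform with the log limit at lam = 0."""
    t = np.asarray(t, dtype=float)
    return np.log(t) if abs(lam) < 1e-12 else (t**lam - 1.0) / lam

def profile_loglik(lam, t, X):
    """Profile log-likelihood at a given lam (additive constants dropped)."""
    z = box_cox(t, lam)
    beta, *_ = np.linalg.lstsq(X, z, rcond=None)  # least squares given lam
    resid = z - X @ beta
    sigma2 = np.mean(resid**2)                    # simple average of squared residuals
    n = len(z)
    value = -0.5 * n * np.log(sigma2) + (lam - 1.0) * np.sum(np.log(t))
    return value, beta, sigma2

def fit_homoscedastic_bc(t, X, grid=None):
    """Grid-search maximizer of the profile log-likelihood over lam."""
    if grid is None:
        grid = np.linspace(0.0, 1.0, 101)         # hypothetical search range
    values = [profile_loglik(lam, t, X)[0] for lam in grid]
    lam_hat = float(grid[int(np.argmax(values))])
    _, beta_hat, sigma2_hat = profile_loglik(lam_hat, t, X)
    return lam_hat, beta_hat, sigma2_hat
```

A grid search is used only to keep the sketch dependency-free; a numerical optimizer would normally replace it.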
We assume [...], where [...]. From (21), we get [...]; there exist consistent estimators of $\lambda$ and $\beta$ among the roots of (16)-(18). Let [...], where [...]. This means that $\hat{\lambda}_N$ and $\hat{\beta}_N$ are consistent estimators of order $1/\sqrt{n}$ and are asymptotically more efficient than the BC MLE.

Conclusion
This paper considers the estimation of the BC model with heteroscedastic disturbances; that is, variances differ by group. The BC MLE is a consistent and asymptotically efficient estimator if the "small σ" condition described by Bickel and Doksum [4] is satisfied and the number of parameters is finite. However, its rate of convergence is slower than the ordinary order of $1/\sqrt{n}$, and the BC MLE cannot be efficient, when the heteroscedasticity of disturbances is considered and the number of groups goes to infinity. An alternative consistent estimator based on a modification of the likelihood function is considered. It is a consistent estimator of order $1/\sqrt{n}$. One important result of this study is that the MLE might not be a good estimator and that estimation methods should be chosen carefully when the model contains many parameters in actual empirical studies.
Based on the same argument as that presented by Nawata [15] [17], the distribution of this estimator is given by [...]