Inference for the Normal Mean with Known Coefficient of Variation

Inference for the mean of a normal distribution with known coefficient of variation is of special theoretical interest because the model belongs to the curved exponential family, with a scalar parameter of interest and a two-dimensional minimal sufficient statistic. Therefore, standard inferential methods cannot be directly applied to this problem. It is also of practical interest because the problem arises naturally in many environmental and agricultural studies. In this paper we propose a modified signed log likelihood ratio method to obtain inference for the normal mean with known coefficient of variation. Simulation studies show the remarkable accuracy of the proposed method even for a sample size as small as 2. Moreover, a new point estimator for the mean can be derived from the proposed method. Simulation studies show that the new point estimator is more efficient than most of the existing estimators.


Introduction
The normal distribution is one of the most widely known and commonly used distributions in statistics. Even introductory statistics courses discuss inference about the mean of a normal distribution. Usually we assume that the population mean and the population standard deviation are unrelated parameters. However, in many physical and biological applications the population standard deviation is found to be proportional to the mean; that is, the mean and standard deviation are related. The ratio of the standard deviation to the mean is defined as the coefficient of variation (CV). The focus of this paper is inference on the normal mean using the extra information on the CV.
In practice, this problem arises more frequently than we might anticipate. For example, in environmental studies, inference about the mean of a pollutant is of special interest, and in those studies the standard deviation of a pollutant is often assumed to be directly related to its mean (Niwitpong [1]). In agricultural studies, it is customary to conduct multi-location trials. From the results of a few locations, the CV can be calculated and subsequently used as a known value for studying the mean of an experiment conducted in a new location (Bhat and Rao [2]). Brazauskas and Ghorai [3] also give examples of this problem emerging from biological and medical experiments. From the theoretical point of view, estimating a normal mean with known CV is also an interesting problem because it has a scalar parameter but a two-dimensional minimal sufficient statistic. In other words, we have a curved exponential family model, and standard inferential methods cannot be directly applied (see Efron [4]).
In the literature, many authors have studied point estimation of a normal mean with known CV. For example, a consistent estimator was obtained by Searls [5] based on truncation of extreme observations. Khan [6] derived the best unbiased estimator with minimum variance. Gleser and Healy [7] obtained the uniformly minimum risk estimator when the loss function is the squared error. Sen [8] proposed a simple and consistent, but biased, estimator. Guo and Pal [9] worked out an estimator based on the scaled quadratic loss function. Chaturvedi and Tomer [10] extended the method in Singh [11] and proposed a three-stage procedure and an accelerated sequential procedure to estimate the normal mean. By various ways of combining the minimal sufficient statistic, Anis [12] proposed three simple but biased estimators. Most recently, Srisodaphol and Tongmol [13] suggested that the estimator based on the jackknife technique is preferred, as it has the smallest mean square error.
Despite the large literature devoted to point estimation, very little literature is available on interval estimation and hypothesis testing for the normal mean with known CV. Hinkley [14] derived two locally most powerful tests for right alternatives based on an ancillary statistic. Bhat and Rao [2] examined the likelihood ratio test and the Wald test. Niwitpong [1] proposed two confidence intervals for the normal mean based on the work of Searls [5].
In this paper, we extend the approach of Bhat and Rao [2] and propose the modified signed log-likelihood ratio test for the normal mean with known CV. The proposed method is known to have third-order accuracy. Moreover, a new estimator is obtained from the modified signed log-likelihood ratio statistic.
The rest of the paper is organized as follows. In Section 2, the modified signed log-likelihood ratio method is reviewed. Application of the method to the normal mean with known CV problem is presented in Section 3. Simulation results to illustrate the accuracy of the proposed method are given in Section 4. The overall conclusions are summarized in Section 5.

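Review of the Modified Signed Log Likelihood Ratio Method
Consider an exponential family model whose canonical parameter $\varphi = \varphi(\theta)$ is a one-to-one transformation of $\theta = (\psi, \lambda')'$, where $\psi$ is the scalar parameter of interest and $\lambda$ is a vector of nuisance parameters. Hence, the log-likelihood function is
$$\ell(\theta) = \varphi'(\theta)\,t - k(\theta),$$
where $t$ is the minimal sufficient statistic. When $\psi$ is a component of the canonical parameter, so that $\theta$ itself can be taken as canonical, Fraser, Reid and Wu [15] approximated the p-value function of $\psi$ with third order accuracy by
$$p(\psi) = \Phi(r^*), \qquad r^* = r - \frac{1}{r}\log\frac{r}{Q},$$
where $\Phi(\cdot)$ is the cumulative distribution function of the standard normal distribution, and
$$r = \mathrm{sgn}(\hat\psi - \psi)\,\{2[\ell(\hat\theta) - \ell(\hat\theta_\psi)]\}^{1/2}, \tag{1.3}$$
$$Q = (\hat\psi - \psi)\left\{\frac{|j_{\theta\theta'}(\hat\theta)|}{|j_{\lambda\lambda'}(\hat\theta_\psi)|}\right\}^{1/2}$$
are the signed log-likelihood ratio statistic and a standardized maximum likelihood departure calculated in the canonical parameter scale, respectively. Here $j_{\theta\theta'}(\hat\theta)$ is the observed information matrix evaluated at the overall maximum likelihood estimate $\hat\theta$, $j_{\lambda\lambda'}(\hat\theta_\psi)$ is the nuisance observed information matrix evaluated at the constrained maximum likelihood estimate $\hat\theta_\psi$, and $r^*$ is the modified signed log-likelihood ratio statistic as defined in Barndorff-Nielsen [16,17]. It is important to note that $r$ is invariant to reparameterization, whereas $Q$ is not and has to be calculated in the canonical parameter scale.
Fraser, Reid and Wong [18] considered the gamma mean problem, where the parameter of interest $\psi$ is not a component of the canonical parameter. In this case, the modified signed log-likelihood ratio method can still be applied with $r$ given in (1.3), because it is invariant to reparameterization, while $Q$ has to be re-calculated in the canonical parameter scale and takes the form
$$Q = \mathrm{sgn}(\hat\psi - \psi)\,|\chi(\hat\theta) - \chi(\hat\theta_\psi)|\left\{\frac{|j_{\varphi\varphi'}(\hat\theta)|}{|j_{(\lambda\lambda')}(\hat\theta_\psi)|}\right\}^{1/2},$$
where
$$\chi(\theta) = \frac{\psi_{\varphi'}(\hat\theta_\psi)}{\|\psi_{\varphi'}(\hat\theta_\psi)\|}\,\varphi(\theta), \qquad \psi_{\varphi'}(\theta) = \frac{\partial\psi(\theta)}{\partial\varphi'} = \psi_{\theta'}(\theta)\,\varphi_{\theta'}^{-1}(\theta). \tag{1.6}$$
Then, by change of variable from $\theta$ to $\varphi$, we have
$$|j_{\varphi\varphi'}(\hat\theta)| = |j_{\theta\theta'}(\hat\theta)|\,|\varphi_{\theta'}(\hat\theta)|^{-2}, \qquad |j_{(\lambda\lambda')}(\hat\theta_\psi)| = |j_{\lambda\lambda'}(\hat\theta_\psi)|\,|\varphi_{\lambda'}'(\hat\theta_\psi)\,\varphi_{\lambda'}(\hat\theta_\psi)|^{-1}. \tag{1.7}$$
For a model that does not belong to an exponential family, Fraser and Reid [19] proposed a systematic method to obtain the locally defined canonical parameter $\varphi(\theta)$. Their method is to first obtain the ancillary direction by
$$V = \left.\frac{\partial x}{\partial\theta'}\right|_{\hat\theta} = -\left.\left(\frac{\partial z}{\partial x'}\right)^{-1}\left(\frac{\partial z}{\partial\theta'}\right)\right|_{\hat\theta},$$
where $z = z(x, \theta)$ is an $n$-dimensional pivotal quantity. Then the locally defined canonical parameter is defined as
$$\varphi'(\theta) = \frac{\partial\ell(\theta; x)}{\partial x'}\,V.$$
Thus, the modified signed log-likelihood ratio method can be applied to obtain the p-value function of $\psi$ and a confidence interval for $\psi$. Fraser and Reid [19] showed that the method maintains third order accuracy.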
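To make the third-order claim concrete, the following minimal Python sketch evaluates $\Phi(r^*)$ in a toy model that is not part of this paper: a single exponential observation with rate $\theta$, chosen here only because the exact tail probability is available in closed form. It assumes the standard Barndorff-Nielsen form $r^* = r - r^{-1}\log(r/Q)$, with $Q$ computed in the canonical (rate) scale; the function names are hypothetical.

```python
import math


def norm_cdf(z):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def p_value_exponential_rate(theta, s, n):
    """Phi(r*) for the rate theta of an exponential model (illustrative toy model).

    s is the sum of n observations. The log-likelihood is
    l(theta) = n*log(theta) - theta*s, with MLE theta_hat = n/s and
    observed information n/theta_hat**2. The formula is undefined at
    theta == theta_hat, where r = Q = 0.
    """
    theta_hat = n / s
    sign = 1.0 if theta_hat > theta else -1.0
    # signed log-likelihood ratio statistic
    r = sign * math.sqrt(2.0 * (n * math.log(theta_hat / theta) - n + theta * s))
    # standardized maximum likelihood departure in the canonical scale
    q = (theta_hat - theta) * math.sqrt(n) / theta_hat
    r_star = r - math.log(r / q) / r
    return norm_cdf(r_star)


# One observation x = 2 from Exp(rate = 1): the exact value is P(X >= 2) = exp(-2).
approx = p_value_exponential_rate(theta=1.0, s=2.0, n=1)
exact = math.exp(-2.0)
print(approx, exact)
```

Even with a single observation the two printed values agree to about two decimal places (roughly 0.137 against 0.135), whereas the first-order approximation $\Phi(r)$ is much further from the exact value.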

Main Results
We apply the modified signed log-likelihood ratio method to the normal mean with known CV problem. The main results are as follows.
Let $(x_1, \ldots, x_n)$ be a random sample from a normal distribution with mean $\mu$ and variance $\sigma^2$. Without loss of generality, we follow the set up in Srisodaphol and Tongmol [13] that the coefficient of variation $c = \sigma/\mu$ is known and positive. The log-likelihood function is
$$\ell(\mu) = -n\log\mu - \frac{1}{2c^2\mu^2}\sum_{i=1}^n (x_i - \mu)^2 = -n\log\mu - \frac{\sum x_i^2}{2c^2\mu^2} + \frac{\sum x_i}{c^2\mu} - \frac{n}{2c^2},$$
and $(\sum x_i, \sum x_i^2)$ is a minimal sufficient statistic. This belongs to the curved exponential family as defined in Efron [4], with a two-dimensional minimal sufficient statistic but only a scalar parameter. Classical statistical methods cannot be directly applied to obtain the p-value function of $\mu$.
Since $c > 0$ and $\sigma = c\mu > 0$, $\mu$ has to be positive, and the maximum likelihood estimate of $\mu$ is
$$\hat\mu = \frac{-\sum x_i + \sqrt{(\sum x_i)^2 + 4nc^2\sum x_i^2}}{2nc^2}.$$
The signed log likelihood ratio statistic is
$$r = r(\mu) = \mathrm{sgn}(\hat\mu - \mu)\,\{2[\ell(\hat\mu) - \ell(\mu)]\}^{1/2}.$$
To calculate $Q(\mu)$, we need to first obtain the locally defined canonical parameter $\varphi(\mu)$, which depends on the pivotal quantity $z(x, \mu)$. In this case, the pivotal quantity for the $i$th observation is
$$z_i = \frac{x_i - \mu}{c\mu},$$
and we have
$$\frac{\partial z_i}{\partial x_i} = \frac{1}{c\mu}, \qquad \frac{\partial z_i}{\partial\mu} = -\frac{x_i}{c\mu^2}.$$
The $i$th component of the ancillary direction is
$$V_i = \frac{x_i}{\hat\mu},$$
so that the locally defined canonical parameter is
$$\varphi(\mu) = \sum_{i=1}^n \frac{\partial\ell(\mu)}{\partial x_i}\,V_i = \frac{\mu\sum x_i - \sum x_i^2}{c^2\mu^2\hat\mu}.$$
Since there is no nuisance parameter involved in this problem, simplifying (1.7) and (1.6), we have
$$|j_{\varphi\varphi}(\hat\mu)| = j_{\mu\mu}(\hat\mu)\,\varphi_\mu(\hat\mu)^{-2}, \qquad \chi(\mu) = \varphi(\mu) \text{ up to sign},$$
where
$$j_{\mu\mu}(\mu) = -\frac{n}{\mu^2} + \frac{3\sum x_i^2}{c^2\mu^4} - \frac{2\sum x_i}{c^2\mu^3}, \qquad \varphi_\mu(\mu) = \frac{2\sum x_i^2 - \mu\sum x_i}{c^2\mu^3\hat\mu}.$$
Finally, the maximum likelihood departure in the $\varphi(\mu)$ scale is
$$Q = Q(\mu) = \mathrm{sgn}(\hat\mu - \mu)\,|\varphi(\hat\mu) - \varphi(\mu)|\,\frac{\{j_{\mu\mu}(\hat\mu)\}^{1/2}}{|\varphi_\mu(\hat\mu)|},$$
and thus the p-value function of $\mu$, $p(\mu)$, can be obtained by the modified signed log likelihood ratio method.
In addition, we propose a new estimator of $\mu$, which is a by-product of the modified signed log likelihood ratio statistic $r^*(\mu)$. We denote our new estimator as $\tilde\mu$, which satisfies
$$r^*(\tilde\mu) = 0, \quad \text{or equivalently} \quad p(\tilde\mu) = 0.5.$$
Although the explicit form of $\tilde\mu$ is not available, it can be obtained easily by simple numerical methods.

Numerical Studies
Our first simulation study compares the accuracy of the confidence intervals obtained from the Wald method (Wald) and the likelihood ratio method (LR), as discussed in Bhat and Rao [2], with those obtained by the proposed method ($r^*$). We consider the extreme case of $n = 2$. For each combination of $c = 1, 10, 20$ and $\mu = 2, 5, 10$, ten thousand Monte Carlo replications are performed. For each generated sample, the 95% confidence interval for $\mu$ is calculated. The performance of a method is judged using the following criteria:
• The coverage probability (CP): the proportion of samples in which the true $\mu$ falls within the 95% confidence interval;
• The lower tail error rate (LE): the proportion of samples in which the true $\mu$ falls below the lower limit of the 95% confidence interval;
• The upper tail error rate (UE): the proportion of samples in which the true $\mu$ falls above the upper limit of the 95% confidence interval;
• The average bias (AB): $\mathrm{AB} = \dfrac{|\mathrm{LE} - 0.025| + |\mathrm{UE} - 0.025|}{2}$.
The desired values are 0.95, 0.025, 0.025 and 0, respectively. These values reflect the desired accuracy and symmetry of the interval estimates of $\mu$. Results are recorded in Table 1. The Wald method gives unsatisfactory coverage probability, while LR gives decent coverage probability. Both the Wald method and the likelihood ratio method give asymmetric intervals. However, the proposed modified signed log likelihood ratio method gives excellent results in all four criteria even for this extreme sample size case. Table 2 records results for a second case. In this case, the Wald method still gives decent coverage probability but also gives asymmetric intervals. Both LR and $r^*$ give similar coverage probability, with $r^*$ having a smaller average bias. Simulation results for other parameter combinations are available upon request from the authors.
Anis [12] compared the relative efficiency of ten point estimators ($T_1, T_2, \ldots, T_{10}$) with respect to the "standard" estimator $\bar{X}$ and concluded that $T_6$, which is the maximum likelihood estimator, performs best. Moreover, $T_3$, which is easy to compute, is reported there to be comparable to some of the other estimators.
We mimic the simulation study discussed in Anis [12] to compare our proposed estimator, $\tilde\mu$, with the ten estimators discussed in Anis [12]. As in Anis [12], we chose $\mu = 100$, and for each combination of the values of $c$ considered there and $n = 2, 3, 15, 100$, ten thousand Monte Carlo replications were performed. For each generated sample, we calculated the relative efficiency of each estimator with respect to the "standard" estimator $\bar{X}$. Results are reported in Table 3. From Table 3, we can observe that $T_6$ performs best and our proposed estimator $\tilde\mu$ ranks second. However, as shown in our first simulation study, inference based on the maximum likelihood estimate (the Wald method) gives unsatisfactory results. In other words, although $T_6$ is the most efficient among the estimators discussed in this paper, it does not give satisfactory coverage properties. On the other hand, the point estimate based on the modified signed log likelihood ratio statistic is, in general, the second most efficient estimator among the estimators discussed in this paper, and the corresponding interval estimate has the best coverage properties. Thus, the proposed method is the recommended method.

Discussion
In this paper, we proposed a modified signed log-likelihood ratio method to obtain inference for the mean parameter of a normal distribution when the coefficient of variation is known. A by-product of the proposed method is the availability of an efficient point estimator of the mean. Theoretically, the proposed method has rate of convergence $O(n^{-3/2})$, and simulation results show the extreme numerical accuracy of the proposed method even when the sample size is small. The proposed method can be applied to any model to obtain inference for a scalar parameter of interest.
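As an illustration of how the proposed inference could be computed in practice, the following Python sketch evaluates $r^*(\mu)$ for the model $N(\mu, c^2\mu^2)$ and solves for a confidence interval and a point estimate by bisection. It is a sketch under stated assumptions, not the authors' code: the closed forms for the maximum likelihood estimate, the locally defined canonical parameter $\varphi(\mu)$ and $Q$ are derived here from the pivotal quantity $z_i = (x_i - \mu)/(c\mu)$; the point estimate is taken to be the root of $r^*(\mu) = 0$; the removable singularity of $r^*$ at the maximum likelihood estimate is handled by linear interpolation; and $r^*$ is assumed to decrease in $\mu$. All function names are hypothetical.

```python
import math
import random


def norm_cdf(z):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def sums(x):
    """Sample size and the minimal sufficient statistic (sum x, sum x^2)."""
    return len(x), sum(x), sum(v * v for v in x)


def mle(x, c):
    """Positive root of n c^2 mu^2 + mu sum(x) - sum(x^2) = 0."""
    n, s1, s2 = sums(x)
    return (-s1 + math.sqrt(s1 * s1 + 4.0 * n * c * c * s2)) / (2.0 * n * c * c)


def loglik(mu, x, c):
    """Log-likelihood of N(mu, (c mu)^2), additive constants dropped."""
    ss = sum((v - mu) ** 2 for v in x)
    return -len(x) * math.log(mu) - ss / (2.0 * c * c * mu * mu)


def _r_star_raw(mu, x, c, mu_hat):
    n, s1, s2 = sums(x)
    sign = 1.0 if mu_hat > mu else -1.0
    r = sign * math.sqrt(2.0 * (loglik(mu_hat, x, c) - loglik(mu, x, c)))

    def phi(m):  # locally defined canonical parameter (derived, see lead-in)
        return (m * s1 - s2) / (c * c * m * m * mu_hat)

    dphi_hat = (2.0 * s2 - mu_hat * s1) / (c * c * mu_hat ** 4)  # d phi / d mu at mu_hat
    j_hat = (-n / mu_hat ** 2 + 3.0 * s2 / (c * c * mu_hat ** 4)
             - 2.0 * s1 / (c * c * mu_hat ** 3))                 # observed information
    q = sign * abs(phi(mu_hat) - phi(mu)) * math.sqrt(j_hat) / abs(dphi_hat)
    return r - math.log(r / q) / r


def r_star(mu, x, c):
    """Modified signed log-likelihood ratio statistic r*(mu), for mu > 0."""
    mu_hat = mle(x, c)
    h = 1e-3 * mu_hat
    if abs(mu - mu_hat) < h:  # r = Q = 0 at mu_hat: interpolate across the gap
        a = _r_star_raw(mu_hat - h, x, c, mu_hat)
        b = _r_star_raw(mu_hat + h, x, c, mu_hat)
        return a + (b - a) * (mu - mu_hat + h) / (2.0 * h)
    return _r_star_raw(mu, x, c, mu_hat)


def p_value(mu, x, c):
    """p-value function p(mu) = Phi(r*(mu))."""
    return norm_cdf(r_star(mu, x, c))


def solve(target, x, c, iters=100):
    """Bisection for r*(mu) = target, assuming r* decreases in mu."""
    lo = hi = mle(x, c)
    while r_star(lo, x, c) < target:
        lo /= 2.0
    while r_star(hi, x, c) > target:
        hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if r_star(mid, x, c) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def interval_and_estimate(x, c, z=1.959964):
    """(lower limit, point estimate, upper limit) at the level implied by z."""
    return solve(z, x, c), solve(0.0, x, c), solve(-z, x, c)


def coverage(mu, c, n, reps=2000, z=1.959964, seed=1):
    """Monte Carlo share of samples whose interval contains the true mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        x = [rng.gauss(mu, c * mu) for _ in range(n)]
        hits += abs(r_star(mu, x, c)) <= z
    return hits / reps
```

For example, `interval_and_estimate([1.0, 3.0], 1.0)` returns the 95% limits and the point estimate for a sample of size 2 with $c = 1$, and `coverage(2.0, 1.0, 2)` gives a rough Monte Carlo check of the coverage probability in that setting. The `coverage` function tests $|r^*(\mu)| \le z$ directly rather than solving for the limits in every replication, which is equivalent only under the monotonicity assumption above.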

Table 3. (a) Relative efficiency of different estimators with respect to $\bar{X}$ for $n = 2$; (b) Relative efficiency of different estimators with respect to $\bar{X}$ for $n = 3$; (c) Relative efficiency of different estimators with respect to $\bar{X}$ for $n = 15$; (d) Relative efficiency of different estimators with respect to $\bar{X}$ for $n = 100$.