Revisit the Two Sample t-Test with a Known Ratio of Variances

Inference for the difference of two independent normal means has been widely studied in the statistical literature. In this paper, we consider the case in which the variances are unknown but the relationship between them is known. This situation arises frequently in practice. For example, when two instruments report averaged responses of the same object based on different numbers of replicates, the ratio of the variances of the responses is known and is given by the ratio of the numbers of replicates going into each response. A likelihood based method is proposed. Simulation results show that the proposed method is very accurate even when the sample sizes are small. Moreover, the proposed method can be extended to the case in which the ratio of the variances is unknown.


Introduction
Inference for the difference of two independent normal means is omnipresent in statistical practice and is introduced in most introductory statistics texts. Typically, the variances are assumed to be unknown and must be estimated. When equal variances are assumed, a pooled estimate of the common variance is used and the test statistic follows a t-distribution exactly. Without the equal variances assumption, the problem becomes the well-known Behrens-Fisher problem, for which no exact distribution of the test statistic is available. Although there exist many approximate solutions to this problem, most statistical software packages use the Satterthwaite solution, in which the test statistic is approximately distributed as a t-distribution. Maity & Sherman [1] considered the Behrens-Fisher problem with the additional assumption that one of the variances is known, and obtained a Satterthwaite type solution. Wong & Wu [2] examined the problem considered by Maity & Sherman [1] and derived a likelihood based asymptotic solution, which has excellent coverage properties.
Schechtman & Sherman [3] also considered the Behrens-Fisher problem, but with the assumption that the ratio of the two variances is known. This problem arises in many practical situations. For example, when two instruments report averaged responses of the same object based on different numbers of replicates, the ratio of the variances of the responses is known and is given by the ratio of the numbers of replicates going into each response. Schechtman & Sherman [3] showed that their proposed solution is equivalent to the one suggested by Sprott & Farewell [4].
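The replicate example can be made concrete with a short calculation. As an illustration only, suppose a single reading has variance $\sigma^2$ on both instruments, and the two instruments average $a$ and $b$ independent readings respectively. The symbols $a$, $b$, $\sigma^2$, $\bar{X}$ and $\bar{Y}$ are assumptions introduced here and are not notation from the paper.

```latex
\mathrm{Var}(\bar{X}) = \frac{\sigma^2}{a}, \qquad
\mathrm{Var}(\bar{Y}) = \frac{\sigma^2}{b}, \qquad
\frac{\mathrm{Var}(\bar{X})}{\mathrm{Var}(\bar{Y})} = \frac{b}{a}.
```

Under these assumptions the unknown $\sigma^2$ cancels, so the ratio of the variances depends only on the two replicate counts.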
In this paper, we follow the approach of Wong & Wu [2] and obtain a likelihood based asymptotic solution for the problem considered in Schechtman & Sherman [3]. The underlying theory of the proposed method is discussed in Wong & Wu [2]. Simulation results show that the proposed solution has excellent coverage properties even for small sample sizes. The proposed method is then applied to the Behrens-Fisher problem. Again, simulation results show the excellent coverage properties of the proposed method.
The structure of the paper is as follows. Likelihood based inference for a scalar parameter of interest in an exponential family model is presented as a step-by-step algorithm in Section 2. The proposed method is applied to obtain inference for the difference of two independent normal means with a known ratio of variances in Section 3. Simulation results are also recorded in Section 3 to illustrate the coverage properties of the proposed method.
The proposed method is then applied to the Behrens-Fisher problem in Section 4. Simulation results recorded in Section 4 show that the proposed method and the Satterthwaite method have similar coverage properties. Some concluding remarks are given in Section 5.

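An Algorithm to Obtain a Confidence Interval for a Scalar Parameter of Interest
Let $y = (y_1, \ldots, y_n)$ be a sample from an exponential family model with log-likelihood function $\ell(\theta) = \ell(\theta; y)$, defined up to an additive constant that does not depend on $\theta$, and with canonical parameter $\varphi(\theta)$. Let $\psi = \psi(\theta)$ be the scalar parameter of interest, with $\theta = (\psi, \lambda')'$ and $\lambda$ the nuisance parameter. With $\hat\theta$ denoting the overall maximum likelihood estimate and $\hat\psi = \psi(\hat\theta)$, the standardized maximum likelihood estimate $(\hat\psi - \psi)/\sqrt{\widehat{\mathrm{var}}(\hat\psi)}$ is asymptotically distributed as $N(0,1)$, where $\widehat{\mathrm{var}}(\hat\psi)$ is the estimated asymptotic variance of $\hat\psi$, which can be derived from the asymptotic variance of $\hat\theta$ using the Delta method. Alternatively, the signed log-likelihood ratio statistic $r(\psi) = \mathrm{sgn}(\hat\psi - \psi)\{2[\ell(\hat\theta) - \ell(\tilde\theta)]\}^{1/2}$ is also asymptotically distributed as $N(0,1)$, with $\tilde\theta$ being the constrained maximum likelihood estimate of $\theta$ for a given $\psi$. Therefore a $(1-\alpha)100\%$ confidence interval for $\psi$ based on the signed log-likelihood ratio statistic is $\{\psi : |r(\psi)| < z_{\alpha/2}\}$, where $z_{\alpha/2}$ is the $(1-\alpha/2)100$th percentile of $N(0,1)$.
In this paper, we consider the method discussed in Wong & Wu [2], which can be summarized in the following algorithm.
Given: a sample $y = (y_1, \ldots, y_n)$ and the log-likelihood function $\ell(\theta)$ with $\theta = (\psi, \lambda')'$.
Step 1:
a) Obtain the overall maximum likelihood estimate $\hat\theta = (\hat\psi, \hat\lambda')'$.
b) Obtain $|j_{\theta\theta'}(\hat\theta)|$, the determinant of the observed information matrix evaluated at $\hat\theta$.
c) Obtain the constrained maximum likelihood estimate $\tilde\theta = (\psi, \tilde\lambda')'$.
d) Obtain $|j_{\lambda\lambda'}(\tilde\theta)|$, the determinant of the observed nuisance information matrix evaluated at $\tilde\theta$.
Step 2: Calculate the signed log-likelihood ratio statistic $r = r(\psi)$. Under regularity conditions as given in DiCiccio et al [5], $r$ is asymptotically distributed as $N(0,1)$ with order of convergence $O(n^{-1/2})$. Hence a $(1-\alpha)100\%$ confidence interval for $\psi$ is $\{\psi : |r(\psi)| < z_{\alpha/2}\}$.
Step 3: Express the model in terms of the canonical parameter $\varphi(\theta)$.
Step 4: Obtain the parameter of interest, and its variance, in the $\varphi(\theta)$ scale.
Step 5: Obtain $Q = Q(\psi)$, the standardized maximum likelihood departure in the $\varphi(\theta)$ scale.
Step 6: The modified signed log-likelihood ratio statistic is $r^* = r^*(\psi) = r - r^{-1}\log(r/Q)$, which is shown in Barndorff-Nielsen [6,7] and Wong & Wu [2] to be distributed as $N(0,1)$ with order of convergence $O(n^{-3/2})$. Hence a $(1-\alpha)100\%$ confidence interval for $\psi$ is $\{\psi : |r^*(\psi)| < z_{\alpha/2}\}$.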
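The last stage of the algorithm, going from the signed log-likelihood ratio statistic to its modified version and then to an interval decision, can be sketched in a few lines. This is a minimal illustration and not the authors' program: it assumes the standard form r* = r - log(r/Q)/r, and it takes the log-likelihood values and the departure Q as inputs that the caller has already computed. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def signed_root(loglik_hat, loglik_tilde, psi_hat, psi):
    """Signed log-likelihood ratio statistic r(psi)."""
    # 2 * (l(theta_hat) - l(theta_tilde)) is non-negative up to rounding error.
    w = max(2.0 * (loglik_hat - loglik_tilde), 0.0)
    return math.copysign(math.sqrt(w), psi_hat - psi)


def modified_signed_root(r, q):
    """Modified statistic r* = r - log(r / q) / r (assumed standard form).

    r and q must be non-zero and share the same sign. The formula is
    numerically unstable near r = 0, which this sketch does not handle.
    """
    return r - math.log(r / q) / r


def in_interval(stat, alpha=0.05):
    """True when psi belongs to the (1 - alpha)100% interval {|stat| < z}."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return abs(stat) < z
```

A confidence interval is then the set of values of psi for which `in_interval` returns True, which in practice is found by evaluating the statistic over a grid of psi values or by root finding.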

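Inference for the Difference of Two Independent Normal Means with a Known Ratio of Variances
Schechtman & Sherman [3] showed that a $(1-\alpha)100\%$ confidence interval for the difference of the two means $\psi$ can be constructed from a statistic that follows a t-distribution, with $t_{\alpha/2}$ denoting the $(1-\alpha/2)100$th percentile of that t-distribution.
The log-likelihood function for this model can be written in closed form, and the algorithm given in Section 2 can be applied to it.
Step 2: The signed log-likelihood ratio statistic $r = r(\psi)$ can be obtained.
Step 3: The canonical parameter $\varphi(\theta)$ for this problem is identified. The rest of the steps can be obtained from the above information. Hence a $(1-\alpha)100\%$ confidence interval for $\psi$ can be obtained from the modified signed log-likelihood ratio statistic.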
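To make the known-ratio setting concrete, the sketch below computes a pooled t statistic, the corresponding interval, and the signed log-likelihood ratio statistic. It rests on an assumed parameterization that is not taken from the paper: x has variance sigma^2, y has variance k * sigma^2 with k known, and psi is the mean of x minus the mean of y. Under that assumption the pooled statistic has n + m - 2 degrees of freedom, and profiling out the nuisance parameters gives r in closed form as a function of it. The function names, and the choice to pass the t percentile `t_crit` in from a table, are illustrative.

```python
import math


def pooled_t_statistic(x, y, k, psi=0.0):
    """t statistic for psi = mu_x - mu_y when Var(y) = k * Var(x), k known."""
    n, m = len(x), len(y)
    xbar, ybar = sum(x) / n, sum(y) / m
    ss_x = sum((v - xbar) ** 2 for v in x)
    ss_y = sum((v - ybar) ** 2 for v in y)
    # Pooled estimate of Var(x): the sum of squares of y is rescaled by k.
    s2 = (ss_x + ss_y / k) / (n + m - 2)
    se = math.sqrt(s2 * (1.0 / n + k / m))
    return (xbar - ybar - psi) / se, xbar - ybar, se


def pooled_t_interval(x, y, k, t_crit):
    """Interval diff -/+ t_crit * se, with t_crit taken on n + m - 2 df."""
    _, diff, se = pooled_t_statistic(x, y, k)
    return diff - t_crit * se, diff + t_crit * se


def signed_root(x, y, k, psi):
    """Signed log-likelihood ratio statistic r(psi) under the same model."""
    t, _, _ = pooled_t_statistic(x, y, k, psi)
    total = len(x) + len(y)
    # Profiling out the means and sigma^2 gives
    # 2 * (l_hat - l_tilde) = N * log(1 + t^2 / (N - 2)), N = n + m.
    return math.copysign(math.sqrt(total * math.log1p(t * t / (total - 2))), t)
```

Because r is a monotone function of the pooled t statistic in this sketch, the two intervals differ only in the reference distribution used for calibration, which is one way to see why the normal approximation for r can be poor when n + m is small.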

Simulation Study
To compare the accuracy of the proposed method with that of the signed log-likelihood ratio method and the Schechtman & Sherman [3] method, Monte Carlo simulation studies were conducted. We generated 10,000 simulated samples for each of several combinations of the parameters. For each simulated sample, we calculated the 95% confidence intervals for the difference of the means $\psi$ obtained by the proposed method (Proposed), the signed log-likelihood ratio method (r), and the Schechtman & Sherman [3] method (SS). For each simulated setting, we report the proportion of samples in which the true $\psi$ falls below the lower bound of the confidence interval (lower error), the proportion in which it falls above the upper bound of the confidence interval (upper error), and the proportion in which it falls within the confidence interval (central coverage). The nominal values for the central coverage, and the lower and upper errors are 0.95, 0.025, and 0.025 respectively. The simulation standard errors for these three quantities are 0.0022, 0.0016 and 0.0016 respectively. Results are recorded in Tables 1-3. It is clear that the results from the signed log-likelihood ratio method are not satisfactory, especially when the sample sizes are small. Results from the Schechtman & Sherman [3] method and the proposed method are almost indistinguishable even for small sample sizes (they are all within 3 simulation standard errors of the nominal values). The major difference between the two methods is that the Schechtman & Sherman [3] method is tailor-made for this problem and cannot be applied when the ratio of variances is unknown, whereas the proposed method can be applied in that case.
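The quoted simulation standard errors follow from the binomial formula sqrt(p(1 - p)/N) with N = 10,000, and the error tallies can be sketched in the same way. The code below is an illustrative sketch, not the authors' simulation program. It assumes a model in which x has variance 1 and y has variance k (k known), and it evaluates only the signed log-likelihood ratio interval, using a closed form for r that holds under that assumed model. All names and default settings are hypothetical.

```python
import math
import random
from statistics import NormalDist


def simulation_se(p, n_sim):
    """Binomial standard error of a simulated proportion with true value p."""
    return math.sqrt(p * (1.0 - p) / n_sim)


def signed_root(x, y, k, psi):
    """r(psi) when Var(y) = k * Var(x) with k known (assumed model)."""
    n, m = len(x), len(y)
    xbar, ybar = sum(x) / n, sum(y) / m
    ss = sum((v - xbar) ** 2 for v in x) + sum((v - ybar) ** 2 for v in y) / k
    t2 = (xbar - ybar - psi) ** 2 / (ss / (n + m - 2) * (1.0 / n + k / m))
    return math.copysign(math.sqrt((n + m) * math.log1p(t2 / (n + m - 2))),
                         xbar - ybar - psi)


def coverage_of_r(n, m, k, psi=0.0, n_sim=2000, alpha=0.05, seed=1):
    """Return (lower error, upper error, central coverage) for the r interval."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    lower = upper = 0
    for _ in range(n_sim):
        x = [rng.gauss(psi, 1.0) for _ in range(n)]
        y = [rng.gauss(0.0, math.sqrt(k)) for _ in range(m)]
        r = signed_root(x, y, k, psi)
        if r > z:       # true psi lies below the lower confidence bound
            lower += 1
        elif r < -z:    # true psi lies above the upper confidence bound
            upper += 1
    return lower / n_sim, upper / n_sim, 1.0 - (lower + upper) / n_sim
```

Rounded to four decimals, `simulation_se(0.95, 10000)` and `simulation_se(0.025, 10000)` give 0.0022 and 0.0016, matching the values quoted above. The default of 2,000 replications is chosen only to keep the sketch fast.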

Proposed Likelihood Based Inference
In this section, we consider the same model setup as in Section 3, but with the ratio of variances unknown. This is the Behrens-Fisher problem, for which no exact distribution of the test statistic is available. The most common approximate solution is the Satterthwaite solution, which is discussed in most introductory statistics texts and is implemented in most statistical software packages.
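For reference, the Satterthwaite solution uses the unpooled t statistic together with approximate degrees of freedom computed from the two sample variances. The sketch below uses the textbook formula. The function name is hypothetical and the code is an illustration rather than part of the proposed method.

```python
import math
from statistics import mean, variance


def satterthwaite(x, y):
    """Unpooled t statistic and Satterthwaite approximate degrees of freedom."""
    n, m = len(x), len(y)
    # Estimated variances of the two sample means.
    vx, vy = variance(x) / n, variance(y) / m
    t = (mean(x) - mean(y)) / math.sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (n - 1) + vy ** 2 / (m - 1))
    return t, df
```

The degrees of freedom are generally not an integer, so the reference t percentile has to come from software that accepts fractional degrees of freedom.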
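For this problem, the log-likelihood function is written in terms of $\theta = (\psi, \lambda')'$, where $\psi$ is the difference of the two means and the nuisance parameter $\lambda$ consists of a mean and the two variances. The algorithm given in Section 2 is then followed.
Step 1 and Step 2: The constrained maximum likelihood estimate $\tilde\theta$ of $\theta$ for a given $\psi$ does not have a closed form. However, it can be obtained by an iterative procedure in which the estimates are updated in turn from starting values. The iteration stops when the absolute value of the difference between two consecutive iterates is less than some pre-set tolerance level.
Step 3: The canonical parameter $\varphi(\theta)$ for this problem is identified. The rest of the steps can be obtained from the above information. Hence a $(1-\alpha)100\%$ confidence interval for $\psi$ can be obtained from the modified signed log-likelihood ratio statistic.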
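One natural way to carry out such an iteration is to alternate between updating the two variances and updating the common location until the change falls below a tolerance. The sketch below illustrates this idea; it is not a reconstruction of the paper's own steps, which are not reproduced here. The parameterization (E[x] = mu + psi, E[y] = mu), the starting value and the function names are all assumptions, and degenerate data with zero variance are not handled.

```python
import math


def constrained_mle(x, y, psi, tol=1e-10, max_iter=1000):
    """Constrained MLE of (mu, var_x, var_y) given psi, by alternating updates."""
    n, m = len(x), len(y)
    xbar, ybar = sum(x) / n, sum(y) / m

    def variances(mu):
        # Variances that maximize the likelihood for a fixed mu.
        return (sum((v - mu - psi) ** 2 for v in x) / n,
                sum((v - mu) ** 2 for v in y) / m)

    mu = ybar  # assumed starting value
    for _ in range(max_iter):
        var_x, var_y = variances(mu)
        # Precision-weighted mean that maximizes the likelihood for fixed variances.
        wx, wy = n / var_x, m / var_y
        new_mu = (wx * (xbar - psi) + wy * ybar) / (wx + wy)
        done = abs(new_mu - mu) < tol
        mu = new_mu
        if done:
            break
    return (mu, *variances(mu))


def log_lik(x, y, mu_x, mu_y, var_x, var_y):
    """Normal log-likelihood, omitting the additive constant."""
    return (-0.5 * len(x) * math.log(var_x)
            - sum((v - mu_x) ** 2 for v in x) / (2.0 * var_x)
            - 0.5 * len(y) * math.log(var_y)
            - sum((v - mu_y) ** 2 for v in y) / (2.0 * var_y))


def signed_root(x, y, psi):
    """Signed log-likelihood ratio statistic r(psi) with both variances unknown."""
    n, m = len(x), len(y)
    xbar, ybar = sum(x) / n, sum(y) / m
    vx_hat = sum((v - xbar) ** 2 for v in x) / n
    vy_hat = sum((v - ybar) ** 2 for v in y) / m
    l_hat = log_lik(x, y, xbar, ybar, vx_hat, vy_hat)
    mu, vx, vy = constrained_mle(x, y, psi)
    l_tilde = log_lik(x, y, mu + psi, mu, vx, vy)
    w = max(2.0 * (l_hat - l_tilde), 0.0)
    return math.copysign(math.sqrt(w), xbar - ybar - psi)
```

Each update in this scheme cannot decrease the constrained likelihood, which is why a simple stopping rule on consecutive iterates is usually adequate, although convergence to the global maximum is not guaranteed in general.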

Simulation
Monte Carlo simulation studies, with the same settings as those considered in Section 3, were conducted to compare the coverage properties of the proposed method with those of the signed log-likelihood ratio method. Results are recorded in Tables 4-6 and they are similar to what we observed in Section 3: the signed log-likelihood ratio method does not have good coverage properties, whereas the proposed method has coverage very close to the nominal levels.

Discussion
A likelihood based method to obtain inference for the difference of two independent normal means with a known ratio of variances is proposed. Monte Carlo simulation results showed that the proposed method and the Schechtman & Sherman [3] method are almost indistinguishable. However, the Schechtman & Sherman [3] method is tailor-made for this particular problem and cannot be applied to the case where the ratio of variances is unknown. On the other hand, the proposed method can still be applied to the unknown ratio of variances case. Simulation studies for other combinations of the parameters have also been conducted and the results are consistent with those reported in this paper. A simple program to perform the calculations is available upon request. As a final note, the theoretical accuracy of the modified signed log-likelihood ratio method is shown in Barndorff-Nielsen [6,7] and Wong & Wu [2].