A Comparison of Two Test Statistics for Poisson Overdispersion / Underdispersion

Within the family of zero-inflated Poisson distributions, the data have a Poisson distribution if and only if the mean equals the variance. In this paper we compare two closely related test statistics constructed from this idea. Our results show that although these two tests are asymptotically equivalent under the null hypothesis and have the same Pitman efficiency, one test is always more efficient than the other for small and medium sample sizes.


Introduction
The Poisson distribution is the standard model for count data, for example, the number of telephone calls within a specific time period [1]. One stringent condition for the Poisson distribution is that the mean equals the variance. However, in practice, many count data sets show overdispersion, i.e., the variance is greater than the mean. The zero-inflated Poisson (ZIP) distribution [2] and the negative binomial distribution [3,4] have been proposed to capture this overdispersion in practical data. The ZIP distribution and its related regression methods have been developed and used in many different areas, such as substance use [5], microbiology [6,7], psychology [8], health information management [9], dentistry [10], transportation engineering [11], and manufacturing [2].
Many tests have been proposed to test for overdispersion in count data [1,6,12-19]. El-Shaarawi [6] compares the properties of the likelihood ratio test, the Cochran test [13], and the Rao test [17]. His simulation results indicate that the likelihood ratio test is always the best at maintaining the significance level for small or medium sample sizes. However, the Cochran and Rao tests are much more powerful than the likelihood ratio test in those cases.
In the zero-inflated Poisson (ZIP) distribution, an extra proportion of zeros is added to the probability of zero in the Poisson distribution. Suppose that $X \overset{d}{=} ZV$, where $Z$ and $V$ are independent variables with $Z \sim \mathrm{Bernoulli}(p)$ and $V \sim \mathrm{Poisson}(\lambda)$, so that

$$P(X = 0) = 1 - p + p e^{-\lambda}, \qquad P(X = k) = p\,\frac{e^{-\lambda}\lambda^{k}}{k!}, \quad k = 1, 2, \ldots \tag{1}$$

The mean and variance of $X$ are

$$\mu = E(X) = p\lambda, \qquad \sigma^2 = \mathrm{Var}(X) = p\lambda\,[1 + (1 - p)\lambda].$$

Remark 1. Although in the definition of the ZIP distribution the parameter $p$ of the Bernoulli distribution is required to be in $[0, 1]$, formula (1) always defines a valid probability distribution as long as $0 \le p \le 1/(1 - e^{-\lambda})$. This means that $p$ can be greater than 1.

Since the Poisson distribution is a special case of the distribution defined in (1), the likelihood ratio test (LRT) is a natural choice for testing whether the data come from a Poisson distribution against the alternative that they come from the distribution in (1), which is equivalent to

$$H_0: p = 1 \quad \text{vs} \quad H_1: p \ne 1. \tag{4}$$

The LRT has two difficulties. First, to construct it we need to estimate the two parameters $\lambda$ and $p$; there are no closed-form solutions of the score equations, so an iterative method is needed. Second, under the null hypothesis, the parameter $p$ is on the boundary of the parameter region. Many other methods have been proposed to test the hypotheses in (4). See, for example, [12] and [19]. Brown and Zhao [1] studied a related set of hypotheses and compared the behavior of their newly developed test with the likelihood ratio test and several other tests. Feng et al. [14] derived the asymptotic distribution of the likelihood ratio test defined in [1] and corrected an error in that paper.
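The ZIP model described above can be sketched numerically. The following is a minimal illustration, assuming the standard construction in which $X = ZV$ with $Z \sim \mathrm{Bernoulli}(p)$ and $V \sim \mathrm{Poisson}(\lambda)$; the function names `zip_pmf`, `zip_mean_var`, and `max_valid_p` are hypothetical and chosen only for this example.

```python
import math


def zip_pmf(k, lam, p):
    """P(X = k) for X = Z * V, Z ~ Bernoulli(p), V ~ Poisson(lam) (assumed form)."""
    poisson_k = math.exp(-lam) * lam**k / math.factorial(k)
    if k == 0:
        # Extra mass (1 - p) is placed at zero.
        return (1.0 - p) + p * poisson_k
    return p * poisson_k


def zip_mean_var(lam, p):
    """Mean and variance implied by the assumed ZIP form."""
    mean = p * lam
    var = p * lam * (1.0 + (1.0 - p) * lam)
    return mean, var


def max_valid_p(lam):
    """Largest p keeping P(X = 0) = 1 - p + p * exp(-lam) nonnegative."""
    return 1.0 / (1.0 - math.exp(-lam))


if __name__ == "__main__":
    lam = 2.0
    for p in (0.9, 1.0, 1.1):
        mean, var = zip_mean_var(lam, p)
        print(f"p={p}: mean={mean:.3f}, variance={var:.3f}")
    print("upper bound on p:", max_valid_p(lam))
```

Under these assumptions, $p < 1$ gives a variance above the mean, $p = 1$ recovers the Poisson case, and values of $p$ slightly above 1 (within the bound) give a variance below the mean.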
In this paper we construct two nonparametric test statistics and compare the efficiency, the empirical size, and power of two closely related tests, especially for the cases of small and medium sample sizes.

Test Statistics
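From (1) we know that the data are from a Poisson distribution if and only if $\mu = \sigma^2$. We can therefore construct a test based on the difference between the sample mean and the sample variance. Suppose $X_i$, $i = 1, \ldots, n$, is a random sample. Let $\bar X = \frac{1}{n}\sum_{i=1}^{n} X_i$ and $S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar X)^2$ be the sample mean and sample variance, respectively. Simple algebra, together with the central limit theorem, shows that under the null hypothesis

$$\sqrt{\frac{n}{2}}\,\frac{S^2 - \bar X}{\lambda} \xrightarrow{d} N(0, 1). \tag{5}$$

The asymptotic result in (5) still holds with $\lambda$ replaced by a consistent estimator of $\lambda$.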
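Under the null hypothesis, both $\bar X$ and $S^2$ are consistent estimators of $\lambda$. Based on this idea, we define two test statistics

$$T_1 = \sqrt{\frac{n}{2}}\,\frac{S^2 - \bar X}{\bar X}, \qquad T_2 = \sqrt{\frac{n}{2}}\,\frac{S^2 - \bar X}{S^2}.$$

Since the exact variance of $S^2 - \bar X$ under the null hypothesis is $2\lambda^2/(n-1)$, one can also use the statistic $\sqrt{(n-1)/2}\,(S^2 - \bar X)/\bar X$, which is asymptotically equivalent to $T_1$. This test is also called the Neyman-Scott test in [1].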
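As a sketch of how the two statistics might be computed from a sample, the code below assumes the forms $T_1 = \sqrt{n/2}\,(S^2 - \bar X)/\bar X$ and $T_2 = \sqrt{n/2}\,(S^2 - \bar X)/S^2$, with $S^2$ using the $n - 1$ divisor and two-sided p-values taken from the standard normal limit. The function name `dispersion_tests` and the sample `counts` are hypothetical.

```python
import math
from statistics import NormalDist, mean, variance


def dispersion_tests(x):
    """Return T1, T2 and two-sided normal p-values (assumed forms, see lead-in)."""
    n = len(x)
    xbar = mean(x)
    s2 = variance(x)  # sample variance with n - 1 divisor
    if xbar == 0 or s2 == 0:
        raise ValueError("statistics are undefined for a constant sample")
    scale = math.sqrt(n / 2.0)
    t1 = scale * (s2 - xbar) / xbar  # standardized by the sample mean
    t2 = scale * (s2 - xbar) / s2    # standardized by the sample variance
    normal = NormalDist()
    return {
        "mean": xbar,
        "variance": s2,
        "T1": t1,
        "T2": t2,
        "p1": 2.0 * (1.0 - normal.cdf(abs(t1))),
        "p2": 2.0 * (1.0 - normal.cdf(abs(t2))),
    }


if __name__ == "__main__":
    counts = [0, 3, 1, 0, 5, 2, 0, 0, 4, 1]  # made-up illustrative sample
    for name, value in dispersion_tests(counts).items():
        print(f"{name}: {value:.4f}")
```

Both statistics share the numerator $S^2 - \bar X$, so they always have the same sign; they differ only in which estimator of $\lambda$ appears in the denominator.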

In the next section we study the relative efficiency of $T_1$ and $T_2$ and compare their empirical size and power for small and medium sample sizes.

Comparison of $T_1$ and $T_2$
Note that algebraically these two tests satisfy the relation $T_2 = (\bar X / S^2)\,T_1$. In this section we study the Pitman asymptotic relative efficiency (ARE) of $T_1$ with respect to $T_2$. The ARE is a large-sample property of a test statistic. We also compare the empirical sizes and powers of these two test statistics by simulation for small and medium sample sizes, when the asymptotic distributions are used in those cases.

Relative Efficiency
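In this subsection we study the Pitman efficiency of $T_1$ with respect to $T_2$. Note that, as $n \to \infty$,

$$\sqrt{\frac{2}{n}}\,T_1 \to M_1(p) = \frac{\sigma^2 - \mu}{\mu} = (1 - p)\lambda, \qquad \sqrt{\frac{2}{n}}\,T_2 \to M_2(p) = \frac{\sigma^2 - \mu}{\sigma^2} = \frac{(1 - p)\lambda}{1 + (1 - p)\lambda}$$

in probability, so that $M_1'(1) = M_2'(1) = -\lambda$. Under the null hypothesis, the asymptotic variances of $T_1$ and $T_2$ are both equal to 1. The Pitman efficiency of $T_1$ with respect to $T_2$ is therefore

$$\mathrm{ARE}(T_1, T_2) = \left[\frac{M_1'(1)}{M_2'(1)}\right]^2 = 1.$$

This means that these two test statistics have the same efficiency in the large sample case.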
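The efficiency claim can be checked numerically under stated assumptions. The sketch below assumes that the normalized statistics converge to the mean functions $M_1(p) = (1 - p)\lambda$ and $M_2(p) = (1 - p)\lambda / (1 + (1 - p)\lambda)$, that both statistics have unit asymptotic variance at $p = 1$, and that the Pitman efficiency is the squared ratio of the slopes of the mean functions at the null. The function names are hypothetical.

```python
def m1(p, lam):
    """Assumed limiting mean of T1 / sqrt(n / 2) under the ZIP model."""
    return (1.0 - p) * lam


def m2(p, lam):
    """Assumed limiting mean of T2 / sqrt(n / 2) under the ZIP model."""
    u = (1.0 - p) * lam
    return u / (1.0 + u)


def slope_at_null(mean_fn, lam, h=1e-6):
    """Central finite-difference slope of a mean function at p = 1."""
    return (mean_fn(1.0 + h, lam) - mean_fn(1.0 - h, lam)) / (2.0 * h)


def pitman_are(lam):
    """Squared slope ratio; equal null variances are assumed, so they cancel."""
    return (slope_at_null(m1, lam) / slope_at_null(m2, lam)) ** 2


if __name__ == "__main__":
    for lam in (0.5, 2.0, 5.0):
        print(f"lambda={lam}: ARE approx {pitman_are(lam):.6f}")
```

The two mean functions share the same slope at $p = 1$ but diverge away from it, which is consistent with equal large-sample efficiency and different finite-sample behavior.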

Empirical Sizes and Powers
In this part we compare the empirical sizes and powers by simulation when the asymptotic distributions of $T_1$ and $T_2$ are used for small and medium sample sizes. The theoretical significance level is set at 0.05. We compare $T_1$ and $T_2$ for different values of $p$ and $\lambda$. Table 1 shows the simulation results (from 100,000 Monte Carlo repetitions).
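A simulation of this kind could be set up along the following lines. This is only a sketch, not the authors' code: it assumes the forms $T_1 = \sqrt{n/2}\,(S^2 - \bar X)/\bar X$ and $T_2 = \sqrt{n/2}\,(S^2 - \bar X)/S^2$, a two-sided normal critical value at level 0.05, and $p \in [0, 1]$ so that the data can be generated as a Bernoulli variable times a Poisson variable. Samples with zero variance are counted as non-rejections, which is a simplification made here. The function name `simulate_rejection_rates` and its defaults are hypothetical.

```python
import numpy as np


def simulate_rejection_rates(n, lam, p, reps=20000, seed=0):
    """Monte Carlo rejection rates of T1 and T2 at nominal level 0.05."""
    rng = np.random.default_rng(seed)
    # ZIP data as X = Z * V with Z ~ Bernoulli(p), V ~ Poisson(lam).
    z = rng.binomial(1, p, size=(reps, n))
    v = rng.poisson(lam, size=(reps, n))
    x = z * v

    xbar = x.mean(axis=1)
    s2 = x.var(axis=1, ddof=1)
    valid = (xbar > 0) & (s2 > 0)  # skip constant samples (simplification)

    scale = np.sqrt(n / 2.0)
    diff = s2 - xbar
    t1 = np.zeros(reps)
    t2 = np.zeros(reps)
    t1[valid] = scale * diff[valid] / xbar[valid]
    t2[valid] = scale * diff[valid] / s2[valid]

    crit = 1.959963984540054  # two-sided 5% standard normal critical value
    return float((np.abs(t1) > crit).mean()), float((np.abs(t2) > crit).mean())


if __name__ == "__main__":
    # p = 1 gives the Poisson null (empirical size); p < 1 gives power.
    for p in (1.0, 0.9, 0.8):
        r1, r2 = simulate_rejection_rates(n=50, lam=2.0, p=p)
        print(f"p={p}: T1 rejects {r1:.3f}, T2 rejects {r2:.3f}")
```

Setting `reps=100_000` would match the number of repetitions stated in the text, at the cost of more memory and time.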
Empirical size: Except for very small $\lambda$ (for example, $\lambda = 0.1$), the empirical size of $T_1$ is very close to the theoretical significance level, even for small sample sizes. On the other hand, the empirical size of $T_2$ is far from the theoretical value. For example, in one setting the empirical size of $T_2$ is 0.03 even when $n = 500$. When $\lambda = 1$, 2 and 5, the empirical sizes of $T_2$ are well above the theoretical value at some of the sample sizes considered.
Empirical power: As $p$ decreases, the distribution moves farther and farther away from the Poisson distribution, and the powers of both test statistics increase. However, the power of $T_1$ increases much faster than that of $T_2$. For example, in one setting of Table 1 the powers of $T_1$ and $T_2$ are 0.519 and 0.297, respectively. In another, with $p = 0.9$ and $\lambda = 2.0$, the powers are 0.498 and 0.337, respectively.
The simulation results show that although these two test statistics are asymptotically equivalent under $H_0$ and have the same efficiency in the large sample case, $T_1$ is more efficient than $T_2$ in the small and medium sample cases.
We also compared the behavior of additional test statistics (not reported here). We find that $T_4$ is very similar to $T_1$. This means that the remainder term in (9) plays a very significant role for small and medium sample sizes.

Real Data Study
In this section, we apply these two test statistics to four real data sets. The sample sizes range from relatively small ($n = 44$) to relatively large ($n = 539$). The results are summarized in Table 2.
Example 1: This data set is used in [1]. It contains the number of daily calls for standard services between 4:30 pm and 4:45 pm at an Israeli call center on 44 consecutive days. More information about this data set can be found in Section 2 of [1]. We want to test whether the data have a Poisson distribution. Here we assume that the numbers of calls on different days are independent. The sample mean and sample variance are 18.66 and 25.95, respectively. The p-value of $T_1$ is 0.07, which shows marginally significant overdispersion of the data. This is consistent with our impression. See Figure 2 in [1] for the histogram of the data. However, the p-value of $T_2$ is 0.19.
Example 2: This is another data set used in [1]. It contains the number of daily calls for internet services between 4:30 pm and 4:45 pm at an Israeli call center on 107 consecutive days. Here we also assume that the numbers of calls on different days are independent. The sample mean and sample variance are 2.18 and 2.47, respectively. The p-values of $T_1$ and $T_2$ are 0.33 and 0.39, which show that the Poisson distribution is a good approximation for the data. This is consistent with our impression. See Figure 1 in [1] for the histogram of the data.

Discussion
In this paper we compare two test statistics which can be easily used to test the Poisson distribution against the zero-inflated Poisson distributions. The two test statistics are asymptotically equivalent under the null hypothesis, and their relative Pitman efficiency is 1. However, they behave very differently for small and medium sample sizes. While $T_1$ always has reasonable empirical size (under the null hypothesis) and power (under the alternative hypothesis) for small and medium sample sizes, $T_2$ shows erratic behavior even for medium sample sizes and may lead to a wrong conclusion in practice (Example 1). Therefore, $T_2$ should not be used in practice.

$H_0$: Data from Poisson distribution vs $H_1$: Data from distribution in (1).
(In fact, $\bar X$ is the MLE of $\lambda$ under $H_0$.)

Example 4: These data are from an HIV prevention study completed at the University of Rochester School of Nursing. The study participants were 621 sexually active girls aged 15 to 19 years. For more details about the study, please refer to [7]. One of the primary outcomes is the number of episodes of unprotected vaginal sex over the past 3 months. After 3 months of intervention, this number was available for 539 girls. The sample mean and sample variance are 12.03 and 184.52, respectively. Although both tests show very significant overdispersion for the data set (p-values < 0.0001), the values of the test statistics are very different, with $T_1 = 235.44$ and $T_2 = 15.35$. This phenomenon can also be seen from the normalized mean functions of the two statistics: $M_1(p)$, for $T_1$, can be arbitrarily large, while $M_2(p)$, for $T_2$, is bounded above by 1.
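The gap between the two values in Example 4 can be illustrated from the reported summary statistics. This is a sketch under the assumption that $T_1 = \sqrt{n/2}\,(S^2/\bar X - 1)$ and $T_2 = \sqrt{n/2}\,(1 - \bar X/S^2)$; with these forms, $T_2$ can never exceed $\sqrt{n/2}$ for positive $\bar X$ and $S^2$, whereas $T_1$ grows without bound as $S^2/\bar X$ grows. Inputs are rounded, so agreement with the text is only approximate.

```python
import math

# Summary values reported for Example 4.
n, xbar, s2 = 539, 12.03, 184.52

scale = math.sqrt(n / 2.0)
t1 = scale * (s2 / xbar - 1.0)  # unbounded as s2 / xbar grows
t2 = scale * (1.0 - xbar / s2)  # strictly below scale because xbar / s2 > 0

print(f"sqrt(n/2) = {scale:.3f}")
print(f"T1 = {t1:.2f}")
print(f"T2 = {t2:.2f}")
```

Under these assumptions, $T_2$ is already close to its ceiling of $\sqrt{n/2}$, so it carries little information about how severe the overdispersion is.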

Table 2. Results from real data.
Example 3: This data set is reported in [12]. It contains the number of daily deaths of women with brain vessel disease during the year 1989 in West Germany. The sample mean and sample variance are 6.36 and 6.82, respectively.