The Permutation Test as an Ancillary Procedure for Comparing Zero-Inflated Continuous Distributions

Empirical estimates of power and Type I error can be misleading if a statistical test does not perform at the stated rejection level under the null hypothesis. We employed the permutation test to control the empirical Type I errors for zero-inflated exponential distributions. Simulation results based on four statistical tests indicated that the permutation test can effectively hold the Type I errors near the nominal level even when the sample sizes are small. Our results attest to the permutation test being a valuable adjunct to current statistical methods for comparing distributions with underlying zero-inflated data structures.


Introduction
Statistical analysts sometimes encounter data that have an excessive number of zeros, and these data often present analytical difficulties because traditional methods rely on assumptions that may be unrealistic and plausible transformations may not be found. Many studies have reported on statistical methods for analyzing count data with excessive zeros [1][2][3][4][5][6]. Some zero-inflated data may be viewed as having a mixed distribution in which zeros have a point distribution and the distribution of non-zero observations is positive and continuous. This distribution has not been investigated adequately, and statistical methods with favorable Type I and Type II errors for comparing these non-traditional distributions are desired.
Testing equivalence of zero-inflated populations in the context of underlying mixed distributions is equivalent to simultaneously testing equality of the probabilities of zeros and equality of the parameters of the non-zero observations [7]. The likelihood ratio (LR) [8] and Wald [9] tests are two widely used methods. These two methods typically perform well if the probability density function that applies under the null hypothesis is known. Recently, Monte Carlo simulations were employed to compare several approaches, including the LR, Wald, central limit theorem (CLT), and modified central limit theorem (MCLT) tests, with respect to their empirical Type I errors and testing powers for three zero-inflated continuous distributions [7]. The LR, Wald, and MCLT tests were found to be preferable to the test based on central limit theory.
There are two important issues when several populations with zero-inflated data structures are compared. First, the underlying distribution is usually unknown and, therefore, the distributional assumptions of assumption-constrained methods can easily be violated. Second, empirical Type I errors and testing powers are difficult to determine because the relevant parameters are almost always unknown even if the assumed distribution is correct. Moreover, a small sample size may contribute to higher Type I and Type II errors. Thus, a test that controls the empirical Type I errors and yields valid estimates of testing powers is helpful.
Permutation tests are advocated for data analysis when assumptions required to validate parametric procedures are violated [10][11][12][13]. Unlike parametric tests, permutation tests can generate probabilities by repeatedly "resampling" the data and evaluating the obtained results with reference to an empirically derived distribution [14,15]. Permutation tests have two major advantages: 1) they can be used to adjust the empirical Type I errors and the testing powers and 2) they can be used when some assumptions required to justify parametric tests are violated. Hence, their use may lead to more appropriate statistical conclusions.
The purpose of this study was to investigate the issues raised above pertaining to the use of ancillary permutation tests to compare several populations when the random variable of interest has either a known or unknown zero-inflated continuous distribution. Four statistical tests were compared with respect to both their empirical Type I errors and testing powers. First, we assumed the data followed a zero-inflated exponential distribution as reported by Zhang et al. [7]. Empirical Type I errors and testing powers for these tests, with and without adjunct permutation tests, were compared using estimates obtained from Monte Carlo simulations. Section 2 describes a general permutation test that generates an empirical probability for each test. Simulated results for four carefully selected parameter configurations are presented in Section 3. Finally, Section 4 demonstrates the results with the permutation test for a data set reported by Koopmans [16].

Four Testing Methods
The performance of four tests was evaluated: the likelihood ratio (LR) [8], Wald [9], central limit theorem (CLT), and modified central limit theorem (MCLT) tests [7]. The CLT test considers only the population means calculated over all zero and non-zero observations, while the MCLT test considers the probability of zeros and the mean of non-zero observations simultaneously. The first two tests are distribution-based while the other two are distribution-free. Maximum likelihood (ML) estimators [17] are required for both the LR and Wald tests. For the CLT and MCLT tests, the Wald test was incorporated to derive the probability for each test [7]. These methods were detailed in one of our previous papers [7] and are not repeated in this study.
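To make the distinction between the CLT and MCLT tests concrete, the zero-inflated exponential model can be sketched as below. The symbols δ_j (zero probability) and β_j (mean of the exponential part) follow the footnote to Table 1; the explicit density and the numerical illustration are assumptions added here for orientation and are not reproduced from [7].

```latex
P(X_j = 0) = \delta_j, \qquad
f_j(x) = (1-\delta_j)\,\frac{1}{\beta_j}\,e^{-x/\beta_j} \quad (x > 0), \qquad
E(X_j) = (1-\delta_j)\,\beta_j .
```

Under this sketch, the CLT test compares only E(X_j), whereas the MCLT test compares δ_j and β_j jointly. As a hypothetical illustration (values not taken from Table 1), δ = 0.2 with β = 1.0 and δ = 0.5 with β = 1.6 both give an overall mean of 0.8, although the two distributions clearly differ.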

Permutation Test
The procedure for using the permutation test with zero-inflated data is as follows:
Step 1: Calculate the p-value for the original data using each of the four tests described above (e.g., LR).
Step 2: Reshuffle the original data and randomly assign the observations to the populations without replacement.
Step 3: Calculate the p-value for the reshuffled data obtained in Step 2 by the same method used in Step 1.
Step 4: Repeat Steps 2 and 3 N times.
Step 5: Construct the sampling distribution of the p-values obtained in Steps 2 through 4.
Step 6: Locate the p-value calculated in Step 1 within this distribution.
If the p-value from the original data is in the main body of the distribution (from α/2 to 1 − α/2), then there is no significant difference among populations at probability level α. Otherwise, there is evidence that the difference between (among) populations is significant.
The above procedures from Steps 1 to 6 were applied to all four tests in this study.
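Steps 1 to 6 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' C++ program: the helper names (`lr_p_value`, `permutation_test`) are hypothetical, the LR test shown is a standard construction for a zero-inflated exponential model (zero probability plus exponential mean per group, chi-square reference distribution with 2(k − 1) degrees of freedom) rather than the exact formulation in [7], and the decision rule takes "α/2 to 1 − α/2" to mean the relative position of the observed p-value among the permutation p-values.

```python
import math
import random


def _max_loglik(values):
    """Maximised zero-inflated exponential log-likelihood for one sample."""
    n = len(values)
    positives = [x for x in values if x > 0]
    n1, n0 = len(positives), n - len(positives)
    ll = 0.0
    if n0:
        ll += n0 * math.log(n0 / n)            # zeros: delta_hat = n0 / n
    if n1:
        beta_hat = sum(positives) / n1         # mean of the non-zero values
        ll += n1 * math.log(n1 / n) - n1 * math.log(beta_hat) - n1
    return ll


def lr_p_value(groups):
    """Assumed LR test of equal (delta, beta) across groups."""
    pooled = [x for g in groups for x in g]
    stat = 2.0 * (sum(_max_loglik(g) for g in groups) - _max_loglik(pooled))
    half = max(stat, 0.0) / 2.0
    m = len(groups) - 1                        # degrees of freedom = 2 * m
    # Closed-form chi-square survival function for an even number of df.
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(m))


def permutation_test(groups, p_value_fn, n_perm=500, alpha=0.05, seed=None):
    """Locate the observed p-value within its permutation distribution."""
    rng = random.Random(seed)
    sizes = [len(g) for g in groups]
    pooled = [x for g in groups for x in g]
    p_obs = p_value_fn(groups)                              # Step 1
    perm_p = []
    for _ in range(n_perm):                                 # Step 4
        rng.shuffle(pooled)                                 # Step 2
        it = iter(pooled)
        shuffled = [[next(it) for _ in range(s)] for s in sizes]
        perm_p.append(p_value_fn(shuffled))                 # Step 3
    # Steps 5 and 6: relative position of p_obs among the permutation p-values.
    position = sum(p <= p_obs for p in perm_p) / n_perm
    reject = position < alpha / 2 or position > 1 - alpha / 2
    return {"p_observed": p_obs, "position": position, "reject": reject}
```

Because `permutation_test` only needs a function that maps a list of groups to a p-value, the same wrapper could in principle be reused for the Wald, CLT, and MCLT tests.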

Simulation Procedure
In our empirical investigation, we assumed interest was in testing the hypothesis that three zero-inflated distributions had identical means. We simulated data from three zero-inflated distributions with sample sizes ranging from 25 to 300 and performed each of the four tests repeatedly on the replicate samples to test the null hypothesis. We tabulated the number of rejections of the hypothesis under each known scenario to estimate Type I errors and powers. Twelve sample sizes (n = 25 × s, where s = 1, 2, ..., 12) were considered, and the nominal probability level was set at 0.05 throughout. Although other configurations were considered, only one configuration for the null distributions and three for the alternative distributions are listed in Table 1. The first configuration in Table 1 was designed to estimate the empirical Type I errors, and the remaining three configurations were designed to estimate the empirical testing powers. Each set of simulated data was analyzed by the four tests with and without employment of the permutation test. Each case was replicated with 1000 simulated samples. All simulations were conducted with a C++ program written by the authors of this paper.
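The simulation loop described above can be sketched as follows. This is an illustrative Python outline, not the authors' C++ program. The function names are hypothetical, the parameter values in the usage note are made up (Table 1 is not reproduced here), and `mean_wald_p_value` is an assumed stand-in for a CLT-type test of equal overall means: a precision-weighted Wald statistic referred to a chi-square distribution with 2 degrees of freedom, which is not necessarily the form used in [7].

```python
import math
import random
import statistics


def simulate_group(n, delta, beta, rng):
    """n draws: 0 with probability delta, otherwise exponential with mean beta."""
    return [0.0 if rng.random() < delta else rng.expovariate(1.0 / beta)
            for _ in range(n)]


def mean_wald_p_value(groups):
    """Stand-in CLT-type test of equal overall means for exactly three groups."""
    assert len(groups) == 3, "closed-form p-value below assumes 2 df"
    means = [sum(g) / len(g) for g in groups]
    # Weight = 1 / estimated variance of each group mean.
    weights = [len(g) / statistics.variance(g) for g in groups]
    pooled_mean = sum(w * m for w, m in zip(weights, means)) / sum(weights)
    stat = sum(w * (m - pooled_mean) ** 2 for w, m in zip(weights, means))
    return math.exp(-stat / 2.0)               # chi-square survival, 2 df


def rejection_rate(configs, n, test_fn, n_sim=1000, alpha=0.05, seed=0):
    """Share of n_sim simulated data sets with test_fn(groups) < alpha."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sim):
        groups = [simulate_group(n, d, b, rng) for d, b in configs]
        hits += test_fn(groups) < alpha
    return hits / n_sim
```

With identical `(delta, beta)` pairs for all three groups, for example `rejection_rate([(0.3, 1.0)] * 3, 100, mean_wald_p_value)`, the returned value estimates the empirical Type I error; with differing pairs it estimates the empirical power.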

Simulation Results
First, the number of permutations sufficient for statistical tests at a given probability level was determined. The Type I errors and testing powers from 100 to 2000 permutations for configurations 1 and 2 with sample size 200 are summarized in Figures 1 and 2, respectively. These figures show that both the empirical Type I errors and the testing powers became reasonably stable once the number of permutations exceeded 100. Results from additional simulations for other sample sizes and configurations showed similar trends. Thus, 500 permutations were used for all the remaining simulations.
The empirical Type I errors of the four tests with and without permutation tests are summarized in Table 2 for the case of a zero-inflated exponential distribution. Without permutation tests, the differences between the observed Type I errors and the nominal 0.05 level tend to become smaller as the sample size increases for all four tests, indicating that all these tests perform better with larger samples. With the permutation tests, however, the empirical Type I errors are close to the nominal 0.05 level for all methods and sample sizes, including small sample sizes (Table 2). The results indicate that the permutation tests can reduce the high Type I errors that are prevalent with small sample sizes. When the sample sizes are large, i.e., at least 100, the empirical Type I errors for the four statistical methods are almost identical irrespective of whether the permutation tests are used.
Tables 3-5 present the empirical powers of the four tests for the three alternative parameter configurations defined in Table 1. As expected, the testing power increased for all four tests as the sample size increased. The testing powers obtained without permutation tests were typically lower than those obtained with permutation tests for all methods when the sample size was small (100 and below). However, as the sample size increased, the testing powers were similar irrespective of whether permutation tests were used. For parameter configuration 2 in Table 1, the CLT test and the other three tests have similar testing powers because only the means of the non-zero observations contributed to the differences (Table 3). For configurations 3 and 4, the CLT test has an extremely low testing power compared with the other three tests (Tables 4 and 5). The increase or decrease of both the zero probability level and the non-zero mean made the differences among populations hard to detect with the CLT method, while the other three tests remain sensitive and maintain desirable testing powers. This indicates that the LR, Wald, and MCLT tests are better than the CLT test in general. When the zero probability levels among populations are similar, the CLT test is still a good option.
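As a rough yardstick for judging how close an empirical Type I error is to the nominal level, the Monte Carlo standard error of a rejection rate can be worked out as below. The calculation assumes the 1000 simulated samples are independent; the symbol R for the number of simulated samples is introduced here only for this illustration.

```latex
\mathrm{SE}(\hat{\alpha}) = \sqrt{\frac{\alpha(1-\alpha)}{R}}
= \sqrt{\frac{0.05 \times 0.95}{1000}} \approx 0.0069
```

Under this assumption, empirical Type I errors within roughly 0.05 ± 0.014 (two standard errors) are compatible with the nominal 0.05 level.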
In many situations, the distribution underlying a given zero-inflated data set is unknown. It is therefore of interest to examine the empirical Type I errors and testing powers obtained with these methods when a particular distribution is assumed. In this study, we generated 1000 simulated data sets from the zero-inflated exponential distribution based on the parameter configurations described in Table 1. The LR and Wald methods were then applied to test the differences among the three populations by assuming that the data followed zero-inflated gamma and lognormal distributions. Although simulations were conducted for various sample sizes, only the results for configurations 1 and 2 with a sample size of 200 are reported (Table 6) because similar patterns were observed for the other configurations and sample sizes (data not shown). Given zero-inflated exponential data, both the LR and Wald tests resulted in unfavorably high Type I errors if no permutation tests were applied; however, these Type I errors were adjusted substantially, to near the nominal level, when the permutation test was used. On the other hand, the testing powers obtained by the LR and Wald tests were lower when the lognormal distribution was assumed. For the gamma distribution, both the LR and Wald tests have similar and desirable testing powers when the permutation tests are applied (Table 6). The results suggest that the tests can yield either higher Type I errors or lower testing powers when an inappropriate distribution is assumed. However, with the permutation tests, the chance of making Type I errors can be greatly decreased, while the testing powers remain desirable in many cases.

Application
Koopmans [16] reported results of a study of seasonal activity patterns of field mice. The data consisted of the average distances traveled between captures by field mice captured at least twice in a given month. The distances were rounded to the nearest meter. A large number of zero distances were observed in addition to non-zero distances, resulting in data with a zero-inflated distribution. The exact distribution of the non-zero observations is unknown. Various LR tests were used to identify which parameter(s) were associated with the seasonal differences by assuming the data followed a mixture of zero-inflated lognormal distributions [18]. To illustrate our approach, we analyzed the data by the four tests alone and by the permutation test with 1000 repetitions, assuming the underlying distribution was a zero-inflated exponential distribution (Table 7). The results for all four tests, with and without employment of the permutation tests, indicated that the mice distances differed significantly among the three seasons.

Discussion
It is desirable that a statistical method sustain a preset nominal Type I error and a high testing power. Many methods are based on specific statistical assumptions and require a large sample size. In some situations, the sample size may be very small and test statistics may yield unfavorable Type I errors and testing powers. In addition, the real distribution is often unknown, so desirable testing properties cannot be expected when distribution-based tests are employed. In this study, we investigated the statistical properties of the permutation test integrated with four statistical tests to compare populations with zero-inflated data structures. Based on the results from the simulated zero-inflated exponential data, several conclusions can be drawn on the use of the permutation test: 1) high Type I errors produced for small sample sizes by appropriate statistical tests used without the permutation test can be adjusted to the preset nominal level when the permutation test is used; 2) high Type I errors caused by inappropriate assumptions can be adjusted to the preset nominal level; and 3) for a large data set, both the Type I errors and the testing powers are similar regardless of the use of the permutation test when the distributional assumptions are appropriate. The same conclusions applied to the other two types of zero-inflated continuous distributions, the gamma and lognormal distributions (results not shown).
As reported by Zhang et al. [7] and in the results of this study, the LR and Wald tests have similar Type I errors and testing powers, but they are distribution dependent. If an inappropriate distribution is assumed, both inflated Type I errors and low testing powers can occur (Table 6). The CLT test is data structure dependent because it considers only the population mean including zeros. When the population means are similar (because the populations have similar probabilities of observations equal to zero) and their non-zero observations have similar distributions, the CLT test may have statistical properties similar to those of the other three tests. The MCLT test considers two parameters, the zero probability and the non-zero mean, and thus is better than the CLT test and robust in most cases. In addition, high Type I errors produced by the MCLT test for small sample sizes can be adjusted by the permutation test. Therefore, the MCLT test can be recommended for general use regardless of whether the data distribution is known or unknown. Numerical investigation of other types of distributions should help gain more information regarding the MCLT method.
Even though the permutation test showed several major advantages, the LR and Wald tests still sustain desirable Type I errors and testing powers, and are less computationally intensive, when the distribution for a large data set is known or the assumed distribution is appropriate. Nevertheless, the permutation test could be a valuable addition to the current statistical tests, especially when a data set is small or the distribution is unknown.
for null hypothesis and 2 to 4 are alternative hypotheses. †: δ_j and β_j are the zero probability level and the mean of the exponential distribution for the jth population.

Figure 1. Empirical Type I errors obtained by 20 different numbers of permutations (LR = likelihood ratio, CLT = central limit theorem, and MCLT = modified central limit theorem).

Table 2. Empirical Type I errors for zero-inflated exponential distribution based on 1000 simulations.
† : LR = likelihood ratio, CLT = central limit theorem, and MCLT = modified central limit theorem; ‡ : 500 permutations were used.

Table 3. Empirical testing power for zero-inflated exponential distribution based on 1000 simulations for configuration 2.
‡ : 500 permutations were used.

Table 6. Empirical Type I errors and testing powers estimated by the LR and Wald tests by assuming three different distributions (exponential, Exp; gamma, Gamma; and lognormal, LogN) for zero-inflated exponential data, with and without permutation tests, based on 1000 simulations for sample size 200.
‡ : Based on design 1 and design 2 in Table 1, respectively; ∆ : 500 permutations were used.