This paper empirically studies the power of the test based on the index of dissimilarity for comparing two samples drawn from two populations that differ only in the location parameter. We call such a test a test of homogeneity. In practice, the power of this bidirectional test is studied with reference to the absolute value of the shift δ and to the same probability models considered by Fried and Dehling.

Recently Fried and Dehling (2011) [

An index of comparison between two sets of observations was suggested by Corrado Gini (see details in [

In the 1960s and 1970s many authors, such as Bertino [

The aim of this paper is to study empirically the power of the test based on the index of dissimilarity to compare two samples drawn from two populations differing only in the location parameter, given the alternative hypothesis

Let

be two random samples where F and G are continuous cumulative distribution functions. F and G have the same shape but are shifted by δ, that is

where δ is real.

Let

and

be the order statistics in the two samples.

Gini’s sample dissimilarity index [

is a random variable measuring the symmetrical divergence between the two samples. A test of the homogeneity of the populations from which the two samples were drawn can be constructed on the basis of the sampling distribution of D. This distribution is known only for some models (discrete equispaced and equidistributed, continuous exponential and uniform) and only in the case of identical sampled populations, i.e. when F = G. The sampling distribution of D is unknown when F ≠ G.
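As an illustration, the statistic can be computed in a few lines. The sketch below assumes the usual form of Gini's sample dissimilarity index for two samples of equal size n, namely the mean absolute difference between corresponding order statistics. The function name `dissimilarity_index` is hypothetical, and the formula is an assumption, since the defining equation is not reproduced in the text above.

```python
import numpy as np

def dissimilarity_index(x, y):
    """Mean absolute difference between corresponding order statistics.

    Assumed form for two samples of equal size n:
        D = (1/n) * sum_i |x_(i) - y_(i)|
    """
    x = np.sort(np.asarray(x, dtype=float))  # order statistics of sample 1
    y = np.sort(np.asarray(y, dtype=float))  # order statistics of sample 2
    if x.shape != y.shape:
        raise ValueError("both samples must have the same size")
    return float(np.mean(np.abs(x - y)))
```

Under this form D is symmetric in the two samples and equals zero only when the two sets of order statistics coincide.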

Let us suppose that the two distribution functions F and G differ only in the location parameters and are shifted by

pair of simulated samples,

2n observations of the pooled sample into two groups of n observations. For each of these pairs of groups the dissimilarity index was also computed. Once the significance level α is fixed, the power of the test of the null hypothesis of homogeneity of the two sampled populations is estimated by the fraction of replications of the simulated samples for which the dissimilarity index exceeds the 1 − α quantile of the dissimilarity indexes of the corresponding permuted samples. The significance level α = 0.05 has been considered in the simulations.
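The procedure just described can be sketched as a small Monte Carlo routine. This is a minimal illustration, not the authors' code. The index is assumed to be the mean absolute difference between corresponding order statistics, and the names `permutation_rejects`, `estimate_power` and `normal_sampler`, as well as the numbers of replications (`n_rep`) and permutations (`n_perm`), are hypothetical choices made for the example.

```python
import numpy as np

def dissimilarity_index(x, y):
    # Assumed form: mean absolute difference between order statistics.
    return float(np.mean(np.abs(np.sort(x) - np.sort(y))))

def permutation_rejects(x, y, rng, n_perm=200, alpha=0.05):
    """True if D(x, y) exceeds the (1 - alpha) quantile of the permuted indexes."""
    n = len(x)
    pooled = np.concatenate([x, y])
    observed = dissimilarity_index(x, y)
    permuted = np.empty(n_perm)
    for b in range(n_perm):
        # Randomly split the 2n pooled observations into two groups of n.
        shuffled = rng.permutation(pooled)
        permuted[b] = dissimilarity_index(shuffled[:n], shuffled[n:])
    return bool(observed > np.quantile(permuted, 1.0 - alpha))

def estimate_power(sampler, delta, n, n_rep=500, n_perm=200, alpha=0.05, seed=0):
    """Fraction of simulated sample pairs for which homogeneity is rejected."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_rep):
        x = sampler(rng, n)
        y = sampler(rng, n) + delta  # second population shifted by delta
        rejections += permutation_rejects(x, y, rng, n_perm, alpha)
    return rejections / n_rep

def normal_sampler(rng, n):
    # Standard normal model; other models can be plugged in the same way.
    return rng.standard_normal(n)
```

For instance, `estimate_power(normal_sampler, delta=1.5, n=10)` gives an estimate of the power for the normal model at shift 1.5. Random splits are used here instead of a full enumeration of the partitions of the pooled sample, which is a simplification.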

Figures 1-4 show, for each of the considered models, the power of the test of homogeneity based on the index of dissimilarity as a function of the shift δ, for each sample size n.

Power of the bidirectional test of homogeneity based on dissimilarity index when sampled populations are normal

Power of the bidirectional test of homogeneity based on dissimilarity index when sampled populations are Student’s t with 1 degree of freedom

Power of the bidirectional test of homogeneity based on dissimilarity index when sampled populations are Student’s t with 3 degrees of freedom

Power of the bidirectional test of homogeneity based on dissimilarity index when sampled populations are chi-squared with 3 degrees of freedom

When the sampled populations are normal, the power of the test based on the dissimilarity index increases, as expected, with both the sample size and the value of the shift δ. As the sample size increases, the largest gain in power is observed for a shift δ = 1.5, and good gains are also seen for shift values between 1 and 2. For δ values outside this range, the gain in power with increasing sample size n is negligible. Furthermore, for very high values of δ (≥ 2.5) even small samples provide very good power (ϒ > 0.9).

Also when the sampled populations are Student’s t with 1 degree of freedom, the power of the test based on the dissimilarity index increases with both the sample size and the value of the shift δ. As the sample size increases, the gain in power is practically nil for δ = 0.5, is approximately 0.05 for δ between 1 and 1.5, and is approximately 0.1 for δ = 2 and 2.5. Even for high values of δ and sample size n = 10, the power of the test does not exceed ϒ = 0.5. Evidently, the fat tails of the Student’s t distribution with 1 degree of freedom have a negative impact on the power of the test based on the dissimilarity index, as happens with many other tests.

When the sampled populations are Student’s t with 3 degrees of freedom, the power of the test based on dissimilarity also increases with both the sample size and the value of the shift δ. As the sample size increases, the gain in power is negligible for δ = 0.5, is approximately 0.15 for δ = 1, and rises to about 0.25 for δ = 1.5, 2 and 2.5. The power of the test for δ = 2.5 is quite good (ϒ ≥ 0.8) even for small sample sizes (n = 6, 7). The results are much better than those for the Student’s t with 1 degree of freedom, but worse than those for the two remaining models.

Power of the bidirectional test of homogeneity based on dissimilarity index for two samples of size n = 5, in relation to the considered probability models

Also when the sampled populations are chi-squared with 3 degrees of freedom, the power of the test based on dissimilarity increases with both the sample size and the value of the shift δ. As the sample size increases, the largest gains in power are observed for values of the shift δ between 1.5 and 2. For δ values below 1 or greater than 2, the gain in power with increasing n is quite modest. Furthermore, for high values of δ (δ ≥ 2) even small samples provide good power (ϒ > 0.8), and for higher values of δ (δ > 2.5) the power of the test is excellent (ϒ > 0.9).

Figures 5-10 report, for each sample size n, the powers of the test based on the dissimilarity index as functions of the shift δ, for each of the considered models.

When the sample size is n = 5, the power of the test of homogeneity of the sampled populations is highest for the chi-squared model with 3 degrees of freedom. The power is still high for the normal model, though it is lower than for the previous model for values of δ < 2.5. For δ = 2.5 the power of the test is the same for both models. The power of the test is lower for the Student’s t with 3 degrees of freedom, and lower still for the Student’s t with 1 degree of freedom. Unlike the first two models, for the last two models the divergence between the powers of the test seems to increase as the shift δ increases.

What has been stated for sample size n = 5 also holds for size n = 6, for which slight improvements in power are observed for all the considered models.

The same considerations apply to sample size n = 7: as the size increases, slight improvements in power are observed for all the considered models.

When n = 8 the gain in power is more pronounced for the Student’s t with 3 degrees of freedom. For the chi-squared with 3 degrees of freedom and the normal models, the powers of the test are quite close.

What has been stated for sample size n = 8 also holds for sizes n = 9 and 10. For these sizes, the limited power of the test under the Student’s t with 1 degree of freedom becomes more evident as the sample size increases.

Power of the bidirectional test of homogeneity based on dissimilarity index for two samples of size n = 6, in relation to the considered probability models

Power of the bidirectional test of homogeneity based on dissimilarity index for two samples of size n = 7, in relation to the considered probability models

Figures 11-14 report, for sample size n = 10 and for the different models, the powers of the tests based on dissimilarity index together with the most powerful tests considered by Fried and Dehling [

Let us consider the normal model first. In this case the most powerful test among those suggested by Fried and Dehling [

Power of the bidirectional test of homogeneity based on dissimilarity index for two samples of size n = 8, in relation to the considered probability models

Power of the bidirectional test of homogeneity based on dissimilarity index for two samples of size n = 9, in relation to the considered probability models

Let us now turn to the Student’s t with 1 degree of freedom which, as we know, is a model characterised by symmetry and by strong kurtosis. As far as this model is concerned, both the test based on the dissimilarity index and the best test among those suggested by Fried and Dehling [

Let us consider now the Student’s t with 3 degrees of freedom. This model, as we know, is symmetric and characterised by a small kurtosis. From

Power of the bidirectional test of homogeneity based on dissimilarity index for two samples of size n = 10, in relation to the considered probability models

Comparison between the powers of the test of homogeneity based on dissimilarity index and the test based on the difference of the averages: case of the normal model and sample size n = 10

comparable. A more careful look at it discloses the superiority of the best test among those suggested by Fried and Dehling [

Finally, let us look at the chi-squared with 3 degrees of freedom, a model characterised by positive skewness. As can be seen from

Comparison between the powers of the test of homogeneity based on dissimilarity index and the best of the tests considered by Fried-Dehling

Comparison between the powers of the test of homogeneity based on dissimilarity index and the best of the tests considered by Fried-Dehling

In this paper the power of a test based on the dissimilarity index has been empirically analysed for testing the hypothesis of homogeneity of two samples drawn from two populations identified by two probability models differing only by a location parameter. The analysed probability models were normal, Student’s t with 1 and 3 degrees of freedom, and chi-squared with 3 degrees of freedom.
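A pair of samples from any of these four models, differing only by a location shift, could be generated along the following lines. This is a sketch only: the dictionary `MODELS`, the function `draw_shifted_pair` and the use of the standard (unscaled) form of each distribution are assumptions made for illustration.

```python
import numpy as np

# Hypothetical registry of the four models, each in its standard form.
MODELS = {
    "normal": lambda rng, n: rng.standard_normal(n),
    "t1": lambda rng, n: rng.standard_t(1, size=n),    # Student's t, 1 d.f.
    "t3": lambda rng, n: rng.standard_t(3, size=n),    # Student's t, 3 d.f.
    "chi2_3": lambda rng, n: rng.chisquare(3, size=n), # chi-squared, 3 d.f.
}

def draw_shifted_pair(model, delta, n, rng):
    """Return two samples of size n from the same model, the second shifted by delta."""
    sampler = MODELS[model]
    x = sampler(rng, n)
    y = sampler(rng, n) + delta  # only the location parameter differs
    return x, y
```

For example, `draw_shifted_pair("t3", 1.5, 10, np.random.default_rng(0))` returns two samples of size 10 whose parent distributions have the same shape and differ by a shift of 1.5.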

For each of these models two samples were generated via simulation, with location parameters in the sampled populations differing by a shift

For each of the considered sample sizes n = 5, 6, 7, 8, 9, 10 and for each of the 6 values of the shift

The results coming from the simulation allow assessing the power of the test based on the dissimilarity index in relation to the size of the shift

As expected, the power increases with the shift size and the sample size. The power of the test based on the dissimilarity index seems to be very high both for the normal model and for the chi-squared model. It is still quite good for the Student’s t with 3 degrees of freedom. The test appears to be less powerful when the Student’s t with 1 degree of freedom is considered. In other words, the test based on dissimilarity does not seem to be affected by the skewness of the model; it seems instead to lose power when the model has high kurtosis.

Comparison between the powers of the test of homogeneity based on dissimilarity index and the best of the tests considered by Fried-Dehling

The power of the test based on the dissimilarity index has finally been compared with the power of the most powerful test among those considered by Fried and Dehling [