
A class of pseudo distances is used to derive test statistics, based on transformed data or spacings, for testing goodness-of-fit for parametric models. These statistics can be regarded as density-based statistics and are expressible as simple functions of spacings. It is known that when the null hypothesis is simple, the statistics follow asymptotic normal distributions without unknown parameters. In this paper we emphasize results for the composite null hypothesis: the parameters are first estimated by a generalized spacing (GSP) method, which is equivalent to minimizing a pseudo distance from the class considered; the estimated parameters then replace the parameters in the pseudo distance used for estimation; the resulting goodness-of-fit statistics for the composite hypothesis are shown to have, again, an asymptotic normal distribution without unknown parameters. Since these statistics are related to a discrepancy measure, the tests can be shown to be consistent in general. Furthermore, because these statistics are simple and come at no extra cost after fitting the model, they can be considered as alternatives to chi-square statistics, which require a choice of intervals, and to statistics based on the empirical distribution function (EDF) using the original data, whose null distribution is complicated and might depend on the parametric family being considered and on the vector of true parameters. EDF tests, however, might be more powerful against some specific models specified by the alternative hypothesis.

Let $X_1, \cdots, X_{n-1}$ be a sample of size $n-1$ from a continuous distribution $F \in \{F_\theta\}$, and let $X_{(1)} \le \cdots \le X_{(n-1)}$ be the order statistics. The transformed data are defined as $U_i(\theta) = F_\theta(X_i)$ and $U_{(i)}(\theta) = F_\theta(X_{(i)})$, $i = 1, \cdots, n-1$, with the conventions $F_\theta(X_{(n)}) = 1$ and $F_\theta(X_{(0)}) = 0$. The spacings are given by $D_i(\theta) = F_\theta(X_{(i)}) - F_\theta(X_{(i-1)})$, $i = 1, \cdots, n$.
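The construction of the spacings can be illustrated with a minimal Python sketch. The function name `spacings`, the callable `cdf` and the exponential example values are assumptions made for illustration only; they do not come from the paper.

```python
import math

def spacings(sample, cdf):
    """Return the n spacings D_1, ..., D_n for a sample of size n - 1.

    `cdf` is a user-supplied callable playing the role of F_theta
    (hypothetical name). By convention F(X_(0)) = 0 and F(X_(n)) = 1.
    """
    # A distribution function is non-decreasing, so sorting the transformed
    # values gives U_(1) <= ... <= U_(n-1).
    u = sorted(cdf(x) for x in sample)
    padded = [0.0] + u + [1.0]  # add the two boundary conventions
    return [padded[i] - padded[i - 1] for i in range(1, len(padded))]

# Illustrative example: exponential model with rate 2 and three observations.
rate = 2.0
exp_cdf = lambda x: 1.0 - math.exp(-rate * x)
d = spacings([0.1, 0.7, 0.3], exp_cdf)  # four spacings that sum to one
```

A sample of size $n-1$ always yields $n$ non-negative spacings that sum to one, which is why the scaled quantities $nD_i$ have average one.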

Ghosh and Jammalamadaka [

$$\sum_{i=1}^{n} h(nD_i) \quad \text{with} \quad h(x) = -x^{\alpha} \qquad (1)$$

Using this class of $h(x)$, it is shown that the asymptotic covariance matrix of $\hat{\theta}$ is given by

$$\frac{1}{n}\,\sigma_h^2(\alpha)\,[I(\theta)]^{-1} \quad \text{with} \quad \sigma_h^2(\alpha) \ge 0$$

where $\sigma_h^2(\alpha)$ depends on $\alpha$ but does not depend on the parametric family $\{F_\theta\}$, and $I(\theta)$ is the usual information matrix of maximum likelihood (ML) estimation.

Furthermore, $\sigma_h^2(\alpha) \to 1$ as $\alpha \to 0^+$. This result is interesting: if we set $\alpha = 0.01$, then $\sigma_h^2(\alpha) \approx 1.02$, and therefore the loss of efficiency compared to ML estimation or the maximum spacing (MSP) method is around two percent, no matter which parametric model is used. Luong [

In this paper, we focus on using this class of GSP methods to construct goodness-of-fit test statistics for testing the simple null hypothesis:

$H_0$: the data come from a distribution $F_0 = F_{\theta_0}$, where $\theta_0$ is specified, and for testing the composite null hypothesis:

$H_0$: the data come from the parametric family $F \in \{F_\theta\}$, where $\theta$ is unspecified. For the composite $H_0$, Cheng and Stephens [ $H_0$ using GSP methods has not received much attention in the literature.

We adopt an approach using pseudo distances by showing that the class $h(x) = -x^{\alpha}$, $0 < \alpha < 1$, induces a class of pseudo distances, which we denote by $d_\phi(f_1, f_2)$, where $\phi(x) = 1 - x^{\alpha} = 1 + h(x)$, $f_1$ and $f_2$ are densities, and $d_\phi(f_1, f_2)$ quantifies how close these densities are. Implicitly, methods using spacings work with transformed data, and if $\theta_0$ is the true parameter then the transformed data

$$U_i = F_0(X_i) = F_{\theta_0}(X_i), \quad i = 1, \cdots, n-1$$

will follow a standard uniform distribution with density $f_U(x) = 1$ for $0 \le x \le 1$ and $f_U(x) = 0$ elsewhere.

Using the transformed data, we can obtain an easily constructed elementary density estimate that does not require the kernel of the usual density estimate. This empirical density estimate is denoted by $\hat{f}_n$; see expression (6). For testing the simple null hypothesis, a test statistic can be constructed based on

$$n^k\, d_\phi(f_n, f_U) \qquad (2)$$

with the restriction $k > 0$. The quantity $n^k d_\phi(f_n, f_U)$ can be re-expressed equivalently as a simple function of spacings and is numerically simple to compute; the statistic follows an asymptotic normal distribution which does not depend on the parametric family. For the statistic to have good power in large samples, the scaling factor $n^k$ should be chosen so that an asymptotic distribution exists for the statistic given by expression (2) while $k > 0$, so that $n^k \to \infty$ as $n \to \infty$. If $d_\phi(f_n, f_U)$ can discriminate whether the sample is drawn from the assumed distribution, the test will be consistent, which is an advantage over chi-square tests, which in general do not have the consistency property.
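To make the normal approximation for the simple null hypothesis concrete, here is a minimal Python sketch of a standardized spacings statistic. It does not reproduce the paper's own centering and scaling constants. Instead it assumes the classical limit theory for sums of functions of uniform spacings: with $W$ standard exponential, $\sum_i (nD_i)^{\alpha}$ has approximate mean $n\,\Gamma(\alpha+1)$ and approximate variance $n\,[\Gamma(2\alpha+1) - (1+\alpha^2)\Gamma(\alpha+1)^2]$. The function names are hypothetical.

```python
import math

def asymptotic_variance(alpha):
    """Limit of Var(sum_i (n D_i)^alpha) / n for uniform spacings.

    Assumed classical form: Var(W^alpha) - Cov(W^alpha, W)^2, W ~ Exp(1).
    """
    g1 = math.gamma(alpha + 1.0)
    return math.gamma(2.0 * alpha + 1.0) - (1.0 + alpha ** 2) * g1 ** 2

def standardized_statistic(d, alpha):
    """Centre and scale sum_i (n D_i)^alpha so it is roughly N(0, 1) under H0."""
    n = len(d)
    total = sum((n * di) ** alpha for di in d)
    centre = n * math.gamma(alpha + 1.0)
    return (total - centre) / math.sqrt(n * asymptotic_variance(alpha))

# Illustration with four perfectly equal spacings (not a realistic sample).
z = standardized_statistic([0.25, 0.25, 0.25, 0.25], 0.5)
```

Because $x^{\alpha}$ is concave for $0 < \alpha < 1$, unequal spacings lower the sum, so under this sketch departures from the null would show up as large negative values of the standardized statistic.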

For the composite hypothesis, we first use a GSP method to obtain the vector of GSP estimators $\hat{\theta}$. We shall see that minimizing expression (1) is equivalent to minimizing a pseudo distance based on a function $\phi$, which, up to a positive multiplicative constant, is given by $n^k d_\phi(f_n^{\theta}, f_U)$, where $f_n^{\theta}$ is defined by expression (11) in Section 4.

Subsequently the statistic is based on

$$n^k\, d_\phi(f_n^{\hat{\theta}}, f_U) \qquad (3)$$

and, after simplification, it reduces to a simple function of spacings with estimated parameters. It will again be shown that the statistic equivalent to the one given by expression (3) follows an asymptotic normal distribution without unknown parameters; this property facilitates goodness-of-fit testing. Using this unified presentation, we would like to show that these statistics are density based and that they parallel traditional test statistics based on the empirical distribution function (EDF), such as the Anderson-Darling statistic; see Anderson and Darling [

The approach used in this paper will hopefully unify estimation and model testing and facilitate comparisons of these density-based statistics with traditional EDF statistics and chi-square statistics, which are used more often. We note that these statistics can be computed easily and that their asymptotic null distribution is normal without unknown parameters, which makes them easy to use. Compared to the related chi-square statistics, these statistics do not need a choice of intervals, and they come as by-products when fitting models using the corresponding GSP methods. This feature is not shared by maximum likelihood (ML) methods.

We also note that theoretical power analysis might not give a complete picture for these density-based statistics, as the analysis is often based on only one sequence of functions belonging to the alternative hypothesis and converging to the functions specified by the null distribution, whereas many sequences can approach the functions of the hypothesis in a functional space; see Sethuraman and Rao [

In this paper, we shall concentrate on asymptotic distributions of goodness-of-fit test statistics based on GSP methods, emphasizing a class of GSP methods, which completes the results on estimation and parameter testing given in a previous paper. Implicitly, GSP methods in this paper mean GSP methods restricted to the class considered here. Furthermore, we do not touch upon the question of power analysis, which might need extensive simulation studies with many models chosen for the alternative hypothesis, as we do not have enough computing facilities and resources for such large-scale simulation studies; see Cheng and Stephens [

The paper is organized as follows.

In Section 2, a class of pseudo distances which generate the related GSP methods for estimation and model testing is introduced and the inference methods are based on spacings or equivalently on transformed data. The elementary density estimate introduced by Kale [

We shall see that pseudo-distances can be created using a convex function $\phi(x)$, $x \ge 0$, with first and second derivatives $\phi'(x)$ and $\phi''(x)$, where $\phi''(x) \ge 0$. We focus on pseudo distances defined using $\phi(x) = 1 - x^{\delta}$, $x \ge 0$, $0 < \delta < 1$, and let $\alpha = 1 - \delta$. The GSP estimators given by the vector $\hat{\theta}$ can be seen to be based on this class, as they are obtained by choosing a value for $\alpha$ and minimizing the following objective function with respect to $\theta$,

$$T_n(\theta) = -\sum_{i=1}^{n} \left(nD_i(\theta)\right)^{\alpha}, \quad 0 < \alpha < 1,$$

i.e., specifying $h(x) = -x^{\alpha}$, $0 < \alpha < 1$, $x > 0$.
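The minimization of $T_n(\theta)$ can be sketched in Python for a one-parameter model. The exponential model with rate $\theta$, the grid-plus-golden-section optimizer and all function names and search bounds are assumptions chosen for illustration. The paper does not prescribe a numerical routine.

```python
import math

def gsp_objective(theta, sample, alpha):
    """T_n(theta) = -sum_i (n D_i(theta))^alpha, exponential(rate=theta) model."""
    u = sorted(1.0 - math.exp(-theta * x) for x in sample)
    padded = [0.0] + u + [1.0]
    n = len(padded) - 1
    return -sum((n * (padded[i] - padded[i - 1])) ** alpha
                for i in range(1, n + 1))

def gsp_estimate(sample, alpha=0.05, lo=0.05, hi=20.0, grid=400, iters=60):
    """Coarse grid search followed by golden-section refinement (illustrative)."""
    step = (hi - lo) / grid
    points = [lo + j * step for j in range(grid + 1)]
    best = min(points, key=lambda t: gsp_objective(t, sample, alpha))
    a, b = max(lo, best - step), min(hi, best + step)
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(iters):
        c, d = b - ratio * (b - a), a + ratio * (b - a)
        if gsp_objective(c, sample, alpha) < gsp_objective(d, sample, alpha):
            b = d  # the minimum lies in [a, d]
        else:
            a = c  # the minimum lies in [c, b]
    return 0.5 * (a + b)

# Deterministic illustration: exact quantiles of an exponential with rate 2,
# so that all spacings are equal to 1/n at the true rate.
n = 50
quantile_sample = [-math.log(1.0 - i / n) / 2.0 for i in range(1, n)]
theta_hat = gsp_estimate(quantile_sample)
```

With these artificial quantile data the spacings are all equal at the true rate, so by concavity of $x^{\alpha}$ the objective attains its smallest possible value $-n$ there, and the sketch should recover a value close to 2. With real data a multi-parameter optimizer would be needed.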

We shall see that using this class of $h(x)$ with spacings is equivalent to using a class of pseudo distances for densities defined using $\phi(x)$. It has been shown in our previous paper that GSP methods can attain high efficiency for estimation when $\alpha$ is positive and near 0.

Note that by letting $\alpha \to 0^+$ we obtain full efficiency, and with $\alpha = 0.05$ the asymptotic relative efficiency is around 0.98 for all parametric families, compared to fully efficient methods such as the MSP method, the ML method, or the Hellinger method based on a density estimate using the original data introduced by Beran [

This will also make the GSP methods parallel to EDF methods such as the Cramér-von Mises methods, or weighted Cramér-von Mises distance methods such as the Anderson-Darling distance method, which also make use of the original data. For the Anderson-Darling (AD) distance, see Anderson and Darling [

In general, the MAD estimators are robust and have high efficiencies but for some parametric families, the overall relative efficiency when compared to maximum likelihood (ML) estimators can fall below 0.80, see Boos [

$$AD(\bar{\theta}) = n \int_{-\infty}^{\infty} \left(F_n - F_{\bar{\theta}}\right)^2 \, dF_{\bar{\theta}}(x) \qquad (4)$$

to test the validity of the model specified by the composite $H_0: F \in \{F_\theta\}$, i.e., the data are drawn from a distribution $F$ which belongs to the family $\{F_\theta\}$, where $F_n$ is the usual empirical distribution function using the original data. Expression (4) can also be re-expressed in a form more suitable for calculation; see Boos [

Before introducing these goodness-of-fit statistics, we shall first define a $\phi$-discrepancy measure which induces a $\phi$-pseudo-distance. The definitions have been given by Ali and Silvey [

Definition ($\phi$-pseudo-distance)

The $\phi$-pseudo-distance or $\phi$-divergence measure between two densities $f_1$ and $f_2$ is defined by $d_\phi(f_1, f_2) = E_{f_2}\left(\phi\left(\frac{f_1}{f_2}\right)\right)$, where $E_{f_2}(\cdot)$ is the expectation under $f_2$ and $\phi$ is a convex function with $\phi(x)$ defined for $x \ge 0$, whose second derivative $\phi''(x)$ exists with $\phi''(x) \ge 0$, and $\phi(1) = 0$.

We have $d_\phi(f_1, f_2) \ge 0$, with $d_\phi(f_1, f_2) = 0$ if and only if $f_1 = f_2$ except on a set of measure 0. The discrepancy measure need not be symmetric, i.e., in general $d_\phi(f_1, f_2) \ne d_\phi(f_2, f_1)$, and it need not obey the triangle inequality. Unless otherwise stated, we focus on the class $\phi(x) = 1 - x^{\delta}$, $x \ge 0$, $0 < \delta < 1$, and let $\alpha = 1 - \delta$.

Using the above function, the pseudo distance can be expressed as

$$d_\phi(f_1, f_2) = 1 - \int_{-\infty}^{\infty} \left(\frac{f_1(x)}{f_2(x)}\right)^{\delta} f_2(x)\,dx = 1 - \int_{-\infty}^{\infty} f_1^{\delta}(x)\, f_2^{1-\delta}(x)\,dx$$

and we shall use these pseudo distances to construct goodness-of-fit test statistics using transformed data or, equivalently, spacings, and relate them to results on spacings which have already appeared in the literature. The advantage of this approach is that a unified treatment can be given to estimation and model testing, and that it can reveal which spacings-based test statistics might not be powerful in large samples when used for goodness-of-fit testing.
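The pseudo distance with $\phi(x) = 1 - x^{\delta}$ can be checked numerically on a simple pair of densities. The following Python sketch uses a midpoint rule on $[0, 1]$; the function name `phi_distance`, the triangular density $2x$ and the value $\delta = 0.9$ are illustrative assumptions, not choices made in the paper.

```python
def phi_distance(f1, f2, delta, lower=0.0, upper=1.0, steps=20000):
    """Midpoint-rule approximation of 1 - integral of f1^delta * f2^(1 - delta)."""
    width = (upper - lower) / steps
    total = 0.0
    for j in range(steps):
        x = lower + (j + 0.5) * width
        total += (f1(x) ** delta) * (f2(x) ** (1.0 - delta)) * width
    return 1.0 - total

uniform = lambda x: 1.0        # standard uniform density on [0, 1]
triangular = lambda x: 2.0 * x  # density 2x on [0, 1], illustrative choice

d_same = phi_distance(uniform, uniform, 0.9)     # zero for identical densities
d_12 = phi_distance(triangular, uniform, 0.9)    # equals 1 - 2^0.9 / 1.9 analytically
d_21 = phi_distance(uniform, triangular, 0.9)    # equals 1 - 2^0.1 / 1.1 analytically
```

The two orderings give different positive values, which illustrates that the discrepancy is non-negative, vanishes for identical densities and need not be symmetric.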

Note that Hellinger distance (HD) which is a true distance as used by Beran [

$$d_{HD}(f_1, f_2) = \int_{-\infty}^{\infty} \left(f_1^{\frac{1}{2}}(x) - f_2^{\frac{1}{2}}(x)\right)^2 dx = 2 - 2\int_{-\infty}^{\infty} f_1^{\frac{1}{2}}(x)\, f_2^{\frac{1}{2}}(x)\,dx.$$
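As a short worked step, the Hellinger expression can be compared with the pseudo distance for $\phi(x) = 1 - x^{\delta}$. Taking $\delta = \frac{1}{2}$ (a value chosen here only for the comparison) and using the fact that both densities integrate to one:

```latex
\begin{aligned}
d_{HD}(f_1, f_2)
  &= 2 - 2\int_{-\infty}^{\infty} f_1^{\frac{1}{2}}(x)\, f_2^{\frac{1}{2}}(x)\,dx \\
  &= 2\left(1 - \int_{-\infty}^{\infty} f_1^{\frac{1}{2}}(x)\, f_2^{1-\frac{1}{2}}(x)\,dx\right) \\
  &= 2\, d_\phi(f_1, f_2) \quad \text{with } \delta = \tfrac{1}{2}.
\end{aligned}
```

So, under this reading, the Hellinger case is the symmetric member of the class, up to the factor 2.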

In the next section, we shall present an elementary density estimate using transformed data. We aim to test the simple $H_0$ which specifies that the random sample of observations is drawn from a distribution function $F_0(x) = F_{\theta_0}(x)$, where $\theta_0$ is specified and $F_0(x)$ has a closed-form expression.

We assume a random sample of size $n-1$ consisting of $X_1, \cdots, X_{n-1}$; these observations are independent and identically distributed (iid) as $X$, which follows a distribution $F \in \{F_\theta\}$, where $\{F_\theta\}$ is the parametric model used. Let the order statistics be denoted by $X_{(1)} \le X_{(2)} \le \cdots \le X_{(n-1)}$. The vector of parameters is denoted by $\theta = (\theta_1, \cdots, \theta_m)'$, and $\theta_0$ is the true vector of parameters.

If we want to test the simple null hypothesis which specifies that the data come from $F = F_0(x) = F_{\theta_0}$, let $U_i = F_0(X_i)$ be the transformed data, let the order statistics based on the transformed data be $U_{(1)} \le U_{(2)} \le \cdots \le U_{(n-1)}$, and let the spacings be defined as $D_i = U_{(i)} - U_{(i-1)}$, $i = 1, \cdots, n$, with

The density function of

The procedure to smooth the empirical distribution using transformed data is similar to the procedure of constructing an ogive function when data have been grouped into intervals and we need to smooth the empirical distribution function, see Klugman et al. [

The smoothed empirical distribution function admits the following elementary density estimate as density,

and it can be obtained easily without requiring a kernel and specifying a window.
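The reduction of the pseudo distance to a function of spacings can be sketched as a worked derivation. It assumes that the elementary density estimate is the piecewise-constant estimate $f_n(u) = 1/(nD_i)$ for $U_{(i-1)} < u \le U_{(i)}$, with $U_{(0)} = 0$ and $U_{(n)} = 1$. This form is an assumption made for illustration, chosen to be consistent with the ogive-type smoothing described above. With $\phi(x) = 1 - x^{\delta}$ and $\alpha = 1 - \delta$:

```latex
\begin{aligned}
d_\phi(f_n, f_U)
  &= 1 - \int_0^1 f_n^{\delta}(u)\, f_U^{1-\delta}(u)\,du
   = 1 - \sum_{i=1}^{n} \int_{U_{(i-1)}}^{U_{(i)}} \left(\frac{1}{nD_i}\right)^{\delta} du \\
  &= 1 - \sum_{i=1}^{n} D_i\,(nD_i)^{-\delta}
   = 1 - \frac{1}{n}\sum_{i=1}^{n} (nD_i)^{1-\delta}
   = 1 - \frac{1}{n}\sum_{i=1}^{n} (nD_i)^{\alpha}.
\end{aligned}
```

Under this assumption, minimizing the pseudo distance is the same as minimizing $-\sum_{i=1}^{n}(nD_i)^{\alpha}$, which matches the objective function with $h(x) = -x^{\alpha}$.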

It is not difficult to see that under the simple null hypothesis the transformed data follow the uniform distribution with density function given by

since

Therefore, if we can find a real number

has an asymptotic distribution which no longer depends on the functional form of

In fact, we do not need to require

1) If the sample is drawn from a distribution F and

2) If the sample is drawn from a distribution G and

we shall use the notation

Then, we should have

Furthermore, if the expression of V can be simplified to an equivalent statistic that serves the same purpose and is simpler to compute, it is preferable to use the equivalent form. This turns out to be the case, as the statistic can be expressed as a simple function of spacings. However, by relating it to the discrepancy measure, the test based on such a statistic can be seen to be consistent. This statistic parallels the one proposed by Beran [

Now we shall examine the component

and it can be re-expressed as

see Kirmani and Alam [

Using results as given in section 2 by Luong [

Theorem can be applied to the expression. By letting the mean and variance of

we have

so that we have an asymptotic normal distribution for the test statistic P defined below and if we need to emphasize the dependence on

Therefore, if we look for the scaling factor k using expression (9) we should consider

and with

The asymptotic distribution of the statistic

which can also be represented as

follows a Normal distribution with mean _{0} if

and hence, _{0} if

with

or Hellinger distance with

given by expression (11) might make the test having low power for large samples when the null hypothesis is composite; see Kirmani and Alam [

[

efficient for estimation. Testing for the null hypothesis which is composite will be considered subsequently.

For testing the null composite hypothesis which specifies that data come from the parametric family

Parallel to the simple null hypothesis case, we transform the data and let _{i}’s depend on

with

Since the transformed data

using the notations

We estimate first

The estimators given by the vector

The goodness-of-fit test statistic can be based on

Arguing as in the case of the simple null hypothesis leads us to consider the equivalent statistic

We shall show subsequently that we have the equality in distribution

The statistic is similar to the one used for the simple null hypothesis case. All we need is to replace

Observe that we can expand the expression

around

with

by Luong [

which justifies the use of expression (15).

The same type of property has been shown to hold for the asymptotic distribution of Moran's statistic with maximum spacing estimators for testing goodness of fit for parametric models; see Cheng and Stephens [

The GSP methods with

In this section, we remark that for a data set which is not large and which contains many tied observations, it might be preferable to use a GSP method instead of the MSP method, as the MSP method is based on minimizing

Cheng and Stephens [

In this section, we touch upon the question of power analysis for these density based tests. Power analysis for null hypothesis which specifies functions is more complicated than Pitman efficiency analysis when parameters are scalars, see Lehmann [

Here, a function or functions are specified under the null hypothesis, which makes the study of power more complicated even in the simplest case, when the null hypothesis $H_0$ is simple and specifies that the data come from $F_0$ or, equivalently, that the transformed data come from a standard uniform distribution with density function

For power study, often a sequence of tests based on a sequence of functions which belongs to the alternative hypothesis

Rao [

In theoretical work on Pitman efficiencies, the focus is on best tests based on a chosen sequence of functions, but this might not provide a complete answer for applications, as an optimum statistic might no longer be optimum if another sequence of functions is chosen. In applications, the distributions belonging to the alternative hypothesis which are useful and commonly used might not have been included in the theoretical analysis. This makes the assessment of power by theoretical analysis difficult, especially when parameters are functions instead of scalars; see Lehmann [

Cheng and Stephens [

In a previous paper, we studied estimation, asymptotic properties, robustness and parameter hypothesis testing using GSP methods. In this paper we have adopted the view that GSP methods are minimum density-based distance methods using transformed data or, equivalently, spacings, so that estimation and model testing can be treated in a unified way. Model validation via goodness-of-fit tests and the construction of density-based tests are treated in this paper. We have shown that these test statistics come at no extra cost once a GSP method is used for fitting a parametric model, and that they might be useful for assessing the model in practice. These tests are simple to perform, and practitioners might want to use them concomitantly with GSP estimation, especially when sample sizes are relatively large. For some real-life data sets, GSP methods might be preferred over the MSP method for estimation and chosen for their robustness, efficiency and flexibility in handling tied observations; finally, test statistics for goodness-of-fit can be constructed at no extra cost. The last feature is not shared by the maximum likelihood (ML) method.

The helpful and constructive comments of a referee, which led to an improvement of the presentation of the paper, and the support of the editorial staff of Open Journal of Statistics in processing the paper are gratefully acknowledged.

The authors declare no conflicts of interest regarding the publication of this paper.

Luong, A. (2018) Asymptotic Results for Goodness-of-Fit Tests Using a Class of Generalized Spacing Methods with Estimated Parameters. Open Journal of Statistics, 8, 731-746. https://doi.org/10.4236/ojs.2018.84048