
Asymptotic results are obtained using an approach based on limit theorems for α-mixing sequences, applied to the class of general spacings (GSP) methods, which includes the maximum spacings (MSP) method. The MSP method has been shown to be very useful for estimating parameters of univariate continuous models with a shift at the origin, which are often encountered in the loss models of actuarial science and in extreme value models. The MSP estimators have also been shown to be as efficient as maximum likelihood (ML) estimators in general and can be used as an alternative when the ML method encounters numerical difficulties for some parametric models. Asymptotic properties are presented in a unified way. Robustness results for estimation and parameter testing results, which facilitate applications of the GSP methods, are also included and related to quasi-likelihood results.

Let $X_1, \ldots, X_n$ be a random sample from a continuous parametric family of distribution functions $\{F_\theta\}$, where $\theta$ is the vector of parameters. Instead of fitting the distribution using the maximum likelihood (ML) method, Cheng and Amin [

$$\sum_{i=1}^{n+1} -\log\big((n+1)D_i(\theta)\big)$$

where $D_i(\theta) = F_\theta(x_{(i)}) - F_\theta(x_{(i-1)})$, $i = 1, \ldots, n+1$ are the spacings, and we define $F_\theta(x_{(n+1)}) = 1$, $F_\theta(x_{(0)}) = 0$, with the order statistics of the sample given by

$$X_{(1)} < X_{(2)} < \cdots < X_{(n)}.$$
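As a concrete illustration of the criterion above, the following sketch computes the MSP objective for a hypothetical exponential model with rate parameter theta; the model choice and all names are illustrative assumptions, not from the paper.

```python
import numpy as np

def msp_objective(theta, x):
    """MSP criterion: sum over i of -log((n+1) * D_i(theta)), where the
    spacings D_i are differences of the fitted CDF at the order statistics,
    with F(x_(0)) = 0 and F(x_(n+1)) = 1 appended."""
    n = len(x)
    u = 1.0 - np.exp(-theta * np.sort(x))      # exponential(theta) CDF values
    u = np.concatenate(([0.0], u, [1.0]))
    d = np.diff(u)                              # spacings D_1, ..., D_{n+1}
    return -np.sum(np.log((n + 1) * d))

rng = np.random.default_rng(0)
x = rng.exponential(scale=1.0, size=200)        # true rate is 1
val_true, val_wrong = msp_objective(1.0, x), msp_objective(3.0, x)
```

Minimizing this objective over theta gives the MSP estimator; the criterion is smaller near the true parameter than at a badly misspecified value.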

It is quite clear that the GSP estimators are no more difficult to obtain than the ML estimators, and it has been proven that the MSP estimators are as efficient as the ML estimators in general and can be consistent when the ML estimators fail to be consistent. The MSP method can be used as an alternative to the ML method, since the ML method might encounter numerical difficulties when used to fit some models with a shifted origin, which are often encountered in loss models and extreme value models. We shall examine a few examples for illustration. Anatolyev and Kosenok [

Example 1 (Pareto)

The Pareto model considered by Anatolyev and Kosenok [

$$f(x;\alpha) = \frac{\alpha}{x^{\alpha+1}}$$

and distribution function given by

$$F(x;\alpha) = 1 - \left(\frac{1}{x}\right)^{\alpha}, \quad x > 1,\ \alpha > 0.$$

The model is a sub-model of the larger model with two parameters $\alpha$ and $\theta$, with density function

$$f(x;\alpha,\theta) = \frac{\alpha\,\theta^{\alpha}}{x^{\alpha+1}}, \quad x > \theta,\ \theta \ge 0,$$

where $\theta$ is a shift parameter and $\alpha > 0$. The distribution function is given by

$$F(x;\alpha,\theta) = 1 - \left(\frac{\theta}{x}\right)^{\alpha}, \quad x > \theta,\ \alpha > 0,$$

see Klugman et al. [
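As a sketch of how the spacings criterion handles this shifted-origin model, the following grid search minimizes the MSP objective for the two-parameter Pareto; the simulation settings, grids, and names are illustrative assumptions, not from the paper.

```python
import numpy as np

def pareto_cdf(x, alpha, theta):
    """Two-parameter Pareto CDF F(x) = 1 - (theta/x)^alpha for x > theta."""
    return np.where(x > theta, 1.0 - (theta / x) ** alpha, 0.0)

def msp_objective(alpha, theta, x):
    """MSP criterion for the shifted Pareto model."""
    n = len(x)
    u = np.concatenate(([0.0], pareto_cdf(np.sort(x), alpha, theta), [1.0]))
    d = np.diff(u)
    with np.errstate(divide="ignore"):
        return -np.sum(np.log((n + 1) * d))

# Simulate from Pareto(alpha=2, theta=1) via X = theta * U^(-1/alpha)
rng = np.random.default_rng(1)
x = 1.0 * rng.uniform(size=500) ** (-1.0 / 2.0)

# Grid search over (alpha, theta); theta must stay at or below min(x),
# otherwise the first spacing is zero and the criterion is +infinity
alphas = np.linspace(0.5, 4.0, 36)
thetas = np.linspace(0.5, x.min(), 26)
vals = [(msp_objective(a, t, x), a, t) for a in alphas for t in thetas]
best_val, best_alpha, best_theta = min(vals)
```

The shift parameter is handled without the boundary difficulties the ML method can face here, since the criterion automatically penalizes values of theta too close to the smallest observation.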

The following example gives the Fréchet model, which is an extreme value model and also a shifted-origin model; see more properties and details in the book by Castillo et al. [

Example 2 (Fréchet)

The Fréchet model has three parameters $\beta, \delta, \lambda$, with density function and distribution function given by

$$f(x;\beta,\delta,\lambda) = \frac{\beta\delta}{(x-\lambda)^{2}}\left(\frac{\delta}{x-\lambda}\right)^{\beta-1}\exp\left[-\left(\frac{\delta}{x-\lambda}\right)^{\beta}\right], \quad x > \lambda \ge 0,\ \beta > 1,\ \delta > 0$$

and

$$F(x;\beta,\delta,\lambda) = \exp\left[-\left(\frac{\delta}{x-\lambda}\right)^{\beta}\right], \quad x > \lambda \ge 0,$$

where $\lambda$ is a shift parameter, $\beta > 1$, $\delta > 0$.
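As a quick sanity check on these formulas, the sketch below verifies numerically that the stated density is the derivative of the stated distribution function; the parameter values are arbitrary illustrative choices.

```python
import numpy as np

def frechet_cdf(x, beta, delta, lam):
    """F(x) = exp(-(delta/(x - lam))**beta) for x > lam."""
    return np.exp(-((delta / (x - lam)) ** beta))

def frechet_pdf(x, beta, delta, lam):
    """Density matching the CDF above:
    f(x) = beta*delta/(x-lam)^2 * (delta/(x-lam))^(beta-1) * exp(-(delta/(x-lam))^beta)."""
    r = delta / (x - lam)
    return beta * delta / (x - lam) ** 2 * r ** (beta - 1) * np.exp(-(r ** beta))

# Central-difference check that the pdf is the derivative of the cdf
beta, delta, lam = 2.0, 1.5, 0.5
xs = np.array([1.0, 2.0, 5.0])
h = 1e-6
num_deriv = (frechet_cdf(xs + h, beta, delta, lam)
             - frechet_cdf(xs - h, beta, delta, lam)) / (2 * h)
```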

Ghosh and Jammalamadaka [

As we have seen, despite the GSP methods being powerful methods for univariate continuous models, they are not used as often as they should be. This might be because the asymptotic results are scattered in the literature and, in particular, because previous approaches to asymptotic normality have been based on the distribution of spacings and order statistics, which makes further results, such as the distributions of the counterparts of the Wald, score and likelihood ratio test statistics of likelihood theory, difficult to establish and hinders the use of these methods in applications. In this paper, a different approach is taken for establishing asymptotic normality. The approach is based on using a uniform weak law of large numbers (UWLLN) to establish consistency of the GSP estimators and a central limit theorem for α-mixing sequences as given by White and Domowitz [

The paper is organized as follows. Section 2 gives the preliminary results already established by Ranneby [

To study the class of generalized spacings (GSP) methods further, we shall present the GJ class considered by Ghosh and Jammalamadaka [

The GSP methods can be seen to be closely related to quasi-likelihood methods, and M-estimation theory can be used to study estimation, robustness and parameter testing via the Wald test, the Lagrange multiplier (score) test and the quasi-likelihood ratio test, with a GSP version of each of these tests forming the classical trinity. The results appear natural and parallel to maximum likelihood (ML) methods, given that the vector of MSP estimators, which belongs to the GSP class, is as efficient as the vector of ML estimators; it is therefore natural to establish inference methods based on this class which parallel ML methods. It is also worth noting that this class can be used for robust estimation, paralleling the Hellinger distance methods given by Beran [

Now we shall use the set up as given by Ghosh and Jammalamadaka [

The order statistics are denoted by $X_{(1)} < X_{(2)} < \cdots < X_{(n)}$, and the spacings $D_i(\theta)$, which can be viewed as transforms of the order statistics, are given by

$$D_i(\theta) = F_\theta(X_{(i)}) - F_\theta(X_{(i-1)}), \quad i = 1, \ldots, n+1$$

with $F_\theta(X_{(0)}) = 0$ and $F_\theta(X_{(n+1)}) = 1$ by definition.

Ghosh and Jammalamadaka [

$$T(\theta) = \sum_{i=1}^{n+1} h\big((n+1)D_i(\theta)\big) \quad (1)$$

or equivalently,

$$Q_n(\theta) = \frac{1}{n+1}\sum_{i=1}^{n+1} h\big((n+1)D_i(\theta)\big). \quad (2)$$

The class considered includes the following functional forms for $h(x)$, a convex function with domain $(0,\infty)$ and range the real line. With $h(x) = -\log(x)$ we have the MSP method; the function $h(x) = -\log(x)$ is optimal in terms of the efficiency of the statistical methods generated, but for robustness other choices for $h(x)$ might include $h(x) = x^{\alpha}$ for $\alpha > 1$, $h(x) = -x^{\alpha}$ for $0 < \alpha < 1$,

$$h(x) = x\log(x), \quad h(x) = x^{\alpha} \ \text{for} \ -\tfrac{1}{2} < \alpha < 0. \quad (3)$$

Note that for all these choices the first and second derivatives $h'(x)$, $h''(x)$ exist, and since $h(x)$ is a convex function, $h''(x) \ge 0$; see Lehmann and Casella [
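The convexity claim can be checked numerically for each member of the GJ class; the sketch below evaluates second differences of each h on a grid (the particular exponents are arbitrary choices within the stated ranges).

```python
import numpy as np

# Members of the GJ class; the MSP choice is h(x) = -log(x)
h_funcs = {
    "msp":     lambda x: -np.log(x),
    "pow_gt1": lambda x: x ** 1.5,        # h(x) = x^a with a > 1
    "neg_pow": lambda x: -(x ** 0.5),     # h(x) = -x^a with 0 < a < 1
    "xlogx":   lambda x: x * np.log(x),
    "inv_pow": lambda x: x ** (-0.25),    # h(x) = x^a with -1/2 < a < 0
}

# A convex function has nonnegative second differences on (0, infinity)
x = np.linspace(0.1, 10.0, 200)
eps = 1e-3
min_second_diff = {
    name: float((h(x + eps) - 2.0 * h(x) + h(x - eps)).min())
    for name, h in h_funcs.items()
}
```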

We shall call the class defined by using the functions given by expression (3), together with $h(x) = -\log(x)$, the GJ class, as it was introduced by Ghosh and Jammalamadaka [

For this class, we can see that for the sub-class with $h(x) = -x^{\alpha}$ for $0 < \alpha < 1$, up to an additive and a multiplicative constant, $h$ can be expressed equivalently as

$$h(x) = -\frac{x^{\alpha} - 1}{\alpha}, \quad \text{and as } \alpha \to 0^{+},$$

$$h(x) = -\frac{x^{\alpha} - 1}{\alpha} \to h^{*}(x) = -\log(x),$$

which is the optimal $h(x)$, since using $h^{*}(x)$ generates the MSP estimators, which are the most efficient within this class and asymptotically equivalent to the maximum likelihood (ML) estimators.
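A small numerical check of this limit, at a few arbitrary test points:

```python
import numpy as np

def h_alpha(x, alpha):
    """h(x) = -(x^alpha - 1)/alpha, tending to -log(x) as alpha -> 0+."""
    return -(x ** alpha - 1.0) / alpha

x = np.array([0.25, 0.5, 1.0, 2.0, 4.0])
# The gap to -log(x) shrinks as alpha decreases toward 0
gap_small = np.abs(h_alpha(x, 1e-6) - (-np.log(x))).max()
gap_large = np.abs(h_alpha(x, 1e-2) - (-np.log(x))).max()
```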

Ghosh and Jammalamadaka [

We shall also make use of the following results and notions introduced by Ranneby [

It is clear that we can use $(X_i, Y_i)$ to re-express $Q_n(\theta)$: an asymptotically equivalent expression for $Q_n(\theta)$ can be given using $(X_i, Y_i)$, $i = 1, \ldots, n$, instead of $D_i(\theta)$, $i = 1, \ldots, n$, in expression (2).

Now if we define

$$z_i(\theta, n) = (n+1)\left(F_\theta\left(x_i + \frac{y_i}{n+1}\right) - F_\theta(x_i)\right) \quad (4)$$

and

$$Q_n(\theta) = \frac{1}{n}\sum_{i=1}^{n} h\big(z_i(\theta, n)\big) \quad (5)$$

then $Q_n(\theta)$ defined as above is asymptotically equivalent to $Q_n(\theta)$ defined using expression (2), as only the first term of the summation in expression (2) is left out of the summation in expression (5), and we focus on asymptotic theory here. Also, expression (5) is asymptotically equivalent to the expression denoted by $S_n(\theta)$ given by Ranneby [
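The asymptotic equivalence of expressions (2) and (5) amounts to leaving out one boundary term; this can be seen numerically in a small sketch, using an exponential model as an arbitrary illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100
x = np.sort(rng.exponential(size=n))
F = lambda t: 1.0 - np.exp(-t)          # fitted CDF at the true parameter
h = lambda t: -np.log(t)                # MSP choice of h

u = np.concatenate(([0.0], F(x), [1.0]))
d = np.diff(u)                          # the n + 1 spacings

q_all = np.mean(h((n + 1) * d))         # expression (2): all n + 1 spacings
q_drop = np.mean(h((n + 1) * d[1:]))    # expression (5): first spacing left out

gap = abs(q_all - q_drop)
```

The two criteria differ by a single O(1/n) boundary term, so they lead to the same asymptotic theory.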

We shall see that most of the limit theorems, such as the uniform weak law of large numbers (UWLLN) or the central limit theorem (CLT), are stated using the form given by expression (5), and since we need to subject $Q_n(\theta)$ to these limit theorems, it is more convenient to use $Q_n(\theta)$ as given by expression (5), as it simplifies the notation.

Now we can note that if $\{z_i(\theta, n)\}$ were a sequence of independent identically distributed (iid) terms, there would be no problem applying the UWLLN and CLT. We will see that $\{z_i(\theta, n)\}$ is a dependent sequence, but with a weak enough form of dependency that we can apply a dependent version of the UWLLN and draw the same conclusions as if $\{z_i(\theta, n)\}$ were iid.

Clearly, we need to make use of the distribution of $V_i = (X_i, Y_i)'$ and the dependence of the sequence $\{V_i\}$ to study the sequence $\{z_i(\theta, n)\}$. Ranneby [

$$P_n(x, y) = \int_{-\infty}^{x}\left[1 - \left(1 - \left(F_{\theta_0}\left(u + \frac{y}{n+1}\right) - F_{\theta_0}(u)\right)\right)^{n-1}\right] f_{\theta_0}(u)\,du. \quad (6)$$

As n → ∞ ,

$$P_n(x, y) \to P_{\theta_0}(x, y) = \int_{-\infty}^{x}\left(1 - e^{-y f_{\theta_0}(u)}\right) f_{\theta_0}(u)\,du$$

with the bivariate density function given by

$$p_{\theta_0}(x, y) = f_{\theta_0}^{2}(x)\, e^{-y f_{\theta_0}(x)}, \quad y > 0, \quad (7)$$

see Ranneby [

Let $V_0 = (X_0, Y_0)'$ with bivariate density as given by expression (7). To derive asymptotic results subsequently we let $n \to \infty$; then we have the following equalities in distribution for $i = 1, 2, \ldots$:

$$V_i = (X_i, Y_i)' \stackrel{d}{=} V_0 = (X_0, Y_0)'. \quad (8)$$

Furthermore, we have pairwise asymptotic independence of $V_i$ and $V_{i'}$ for $i \ne i'$, in the sense that the joint distribution of $V_i$ and $V_{i'}$, denoted by $Q_n(x, y, x', y')$, satisfies

$$Q_n(x, y, x', y') \to P_{\theta_0}(x, y)\, P_{\theta_0}(x', y') \ \text{as} \ n \to \infty, \quad (9)$$

see Ranneby [

We shall define some more notation.

Let $V_i^0 = (X_i^0, Y_i^0)'$, $i = 1, 2, \ldots$. As they have a common distribution, let $V^0$ be one of them; its bivariate density function is given by

$$p_{\theta_0}(x, y) = f_{\theta_0}^{2}(x)\, e^{-y f_{\theta_0}(x)}. \quad (10)$$

From the mean value theorem, we have

$$z_i(\theta, n) = (n+1)\left(F_\theta\left(x_i + \frac{y_i}{n+1}\right) - F_\theta(x_i)\right) \to y_i f_\theta(x_i) \quad (11)$$

for each $i$ as $n \to \infty$, which is also given by Ranneby [
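The convergence in expression (11) is easy to visualize numerically; the sketch below uses a standard normal model, which is an assumption for illustration only.

```python
from math import erf, exp, pi, sqrt

# Standard normal model (an arbitrary illustrative choice)
F = lambda t: 0.5 * (1.0 + erf(t / sqrt(2.0)))
f = lambda t: exp(-t * t / 2.0) / sqrt(2.0 * pi)

x, y = 0.7, 1.3
z = lambda n: (n + 1) * (F(x + y / (n + 1)) - F(x))

# The approximation error shrinks as n grows, with limit y * f(x)
errors = [abs(z(n) - y * f(x)) for n in (10, 100, 10000)]
```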

for all i and the property given by expression (12) can be used to establish asymptotic results subsequently.

It is not difficult to see that the following covariance relationships hold,

as

Define two sets of random variables which are a distance m apart, as follows

and since

Now we can define the ρ-mixing coefficient for the sequence

Now

The coefficient

Now we shall define the coefficient

The coefficient

using

In establishing asymptotic results for the GSP methods, which include the MSP method, we do not aim to obtain the results under a minimal set of regularity conditions, as doing so would increase the technicalities and might discourage practitioners from using the methods. The regularity conditions used are comparable to the regularity conditions for maximum likelihood methods under usual circumstances, in order to put the GSP methods parallel to ML methods. The aims are to make the GSP methods as practical as ML methods for univariate continuous models and to show that they are no more difficult to use than ML methods. Furthermore, by relating this class of estimators to the class of M-estimators, it will be shown that this class can offer more flexible choices for robust estimators, should the MSP estimators, which are equivalent to ML estimators, not be robust; they share similarities with the class of estimators considered by Broniatowski et al. [

The objective function to be minimized to obtain the GSP estimators which is denoted by the vector

The following Theorems can be used to establish consistency for

Theorem 1(Consistency)

Assume that:

1) The parameter space θ is compact, the true vector of parameters is denoted by

2)

3)

Then

To apply this Theorem, condition 2) is a condition on uniform convergence; conditions which ensure the UWLLN can be applied will imply condition 2) for

Implicitly, we assume that

Theorem 2 (UWLLN)

Assume that:

1)

2) There exists a function

3) For

4)

Then we have:

1)

2)

Applying Theorem 1 and Theorem 2 will show consistency for the GSP estimators given by the vector

and since the

which is more general but similar to the expression given by Ranneby [

If we consider the inner integral and make a change of variables with

Therefore,

For consistency based on Theorem 1, we need to show that

1)

2) The vector

3)

The conditions (1)-(3) as given above hold in general; it has been shown that the conditions for the MSP estimators to be consistent are more relaxed than the condition

as given by Theorem 2.5 of Newey and McFadden [

Since

It is not difficult to see that:

1)

2) the marginal density of W is standard exponential, i.e., the density for W is

3) the marginal density for

We shall see subsequently, in the next sections, that these properties allow many asymptotic results to be obtained in a unified way, simplify proofs of some results which have already appeared in the literature, and allow asymptotic normality to be established for the GSP estimators in multi-parameter estimation, which has been established in the paper by Ghosh and Jammalamadaka [
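Several of these properties can be verified directly from density (7) by numerical integration. The sketch below takes f_{θ0} to be the standard normal density (an arbitrary choice) and checks that integrating p_{θ0}(x, y) over y recovers the marginal f_{θ0}(x), and that, writing W = Y f_{θ0}(X) as the notation above suggests, the conditional tail probability of W is standard exponential, free of x, so that W is independent of X.

```python
import numpy as np

f = lambda t: np.exp(-t * t / 2.0) / np.sqrt(2.0 * np.pi)   # f_{theta_0}: N(0, 1)
p = lambda x, y: f(x) ** 2 * np.exp(-y * f(x))              # density (7), y > 0

def trap(values, grid):
    """Simple trapezoidal rule."""
    return float(np.sum((values[1:] + values[:-1]) * np.diff(grid)) / 2.0)

# 1) Integrating p over y recovers the marginal density f(x)
ys = np.linspace(0.0, 400.0, 200001)
xs = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
marginals = np.array([trap(p(x0, ys), ys) for x0 in xs])

# 2) For W = Y f(X): P(W > w | X = x) integrates the conditional density
#    f(x) e^{-y f(x)} over y > w/f(x), giving e^{-w} for every x,
#    so W is standard exponential and independent of X
x0, w = 0.5, 1.0
f0 = float(f(x0))
ys2 = np.linspace(w / f0, w / f0 + 60.0 / f0, 40001)
cond_tail = trap(f0 * np.exp(-ys2 * f0), ys2)   # P(W > w | X = x0)
```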

In fact, for asymptotic properties we essentially work with

We shall see in the next sections by considering

For establishing

and using expression (16) with a change of order of integration,

The inner integral can be expressed as

and since

since

Therefore,

This completes the proof for the inequality.

Furthermore, by making a change of variable we can put

which is the expression used by Ghosh and Jammalamadaka [

For asymptotic normality, we often work with an expression with n finite and then pass to the limit to get the asymptotic results by letting

When passing to the limit by letting

The asymptotic normality results might continue to hold under less stringent conditions for some parametric families, but the proofs would be technical and similar to the proofs for maximum likelihood estimators under the nonstandard conditions of M-estimation theory, as given by Huber [

Theorem 3 (CLT)

Let

Assume that:

1)

2) There exists K finite and nonzero such that

3)

4) The mixing coefficient

Then

Often, we need to apply Theorem 3 in a multivariate context; the Cramér-Wold device can be used together with Theorem 3, see Davidson [

Let

be the first, second and third derivatives of the function

1) The above partial derivatives are continuous with respect to elements of

2) The expectations

3)

4) Interchanging order of integration and differentiation is allowed as in likelihood theory so that

and the Fisher information

5) The convergence of

These functions are with respect to

6) The vector

For condition (5), a sufficient condition for uniform convergence is that the sequence of functions with respect to

Now we can state the following Theorem, Theorem 4, which gives the asymptotic normality results for the GSP estimators in general, i.e., for the multi-parameter case; we also verify the result given by expression (9) obtained by Ghosh and Jammalamadaka [

Theorem 4 (Asymptotic Normality)

Under Assumptions (1)-(6) as given above, we have the following convergence in distribution for the vector of GSP estimators

1)

2)

3)

Proof.

Under differentiability assumptions made, the vector of GSP estimators

Using a Taylor expansion around the true vector of parameters

with

From expression (21), we have the following representation using equality in distribution

We can proceed by using

It is not difficult to see that:

1)

2) Similarly,

We shall show that

Now, applying Slutsky's Theorem if needed, as in likelihood theory, we have

Subsequently, we shall display the matrices

and with expectation taken will give

Note that

by letting

using the independence of

The elements of the matrix

Note that the second term of the RHS of the above equality can be expressed as

and upon taking expectation, it is reduced to 0 as

Let

which implies

using

For comparison with results given by Ghosh and Jammalamadaka [

expression

It is not difficult to see that the elements of

are

using the independence of

for

Therefore,

The expression for

The asymptotic covariance for

At this point, we would like to make some remarks which are given below.

Remark 1

It appears that a minor adjustment is needed for expression (9) given by Ghosh and Jammalamadaka [

It appears that the term

An interpretation of asymptotic relative efficiency of the GSP method versus the MSP method can be given to

For the moment generating function of the log-gamma distribution, see Chan [

Remark 2

By relating with M-estimators, we can study the efficiency and robustness for GSP estimators based on a function

is the vector of quasi-score functions.

From M-estimation theory, we already know that for efficiency

is optimal as in this case,

which shows that the MSP method is as efficient as the ML method. This finding has been reported by Ghosh and Jammalamadaka [

is not bounded as a function of x, the MSP estimators might not be robust despite being efficient.

For robustness we might want to choose

and clearly

Remark 3

It is only for the MSP case that we have

Now having the asymptotic results for

Often, we are interested in testing the null hypothesis which specifies that

These matrices are assumed to have rank q. With an application of the delta method, we can say that the asymptotic covariance matrix of

Applying Wald’s method to construct chi-square statistic using

Therefore, we have an asymptotic chi-square distribution with q degrees of freedom, as given below,
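As an illustration of this Wald-type construction, the sketch below computes a statistic of the form n g' (G V G')^{-1} g; all numerical values and names are hypothetical placeholders, not quantities from the paper.

```python
import numpy as np

# Hypothetical ingredients (placeholders for illustration):
theta_hat = np.array([1.08, 0.47])        # GSP estimates
V_hat = np.array([[0.9, 0.1],             # estimated asymptotic covariance of
                  [0.1, 0.6]])            # sqrt(n) * (theta_hat - theta_0)
n = 400

# Null hypothesis g(theta) = 0 with q = 1 constraint: theta_2 - 0.5 = 0
g = np.array([theta_hat[1] - 0.5])
G = np.array([[0.0, 1.0]])                # Jacobian dg/dtheta'

# Wald statistic: n * g' (G V G')^{-1} g, asymptotically chi-square(q) under H0
W = float(n * g @ np.linalg.inv(G @ V_hat @ G.T) @ g)
```

Here the statistic would be compared with a chi-square critical value with q = 1 degree of freedom.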

Replacing

The score test is also called the Lagrange multiplier (LM) test; it can be derived using Lagrange multipliers, but they do not need to be calculated explicitly, as they can be expressed using the quasi-score function of the GSP methods. We only need to fit the restricted model specified by the null hypothesis. The vector of restricted estimators is denoted by

For minimizing under q constraints, we introduce the vector of Lagrange multipliers

which can be viewed as the quasi-score of the GSP methods; this parallels the GSP methods with quasi-likelihood or likelihood methods. The first order conditions using

A Taylor series expansion on the system (29) and (30) around

as

Multiply the first system by

with

as in the proof of Theorem 4. Therefore, using expression (29), we then have

Let

then

and replacing

R has an asymptotic chi-square distribution with q degrees of freedom.

Equivalently, if we can assume that

using a result established by Wooldridge [

such that

The proof is involved and requires preliminary results from linear algebra; we shall not reproduce it here, see Wooldridge [

will give the expression (32) for R. Note that expression (31) holds in general without the additional assumption

Note that for the use of the score test, only the reduced model under

For the following test, which is the quasi-likelihood ratio test, we do not need the expressions for asymptotic covariance matrices in which partial derivatives are involved, but we need to fit both the reduced model and the full model to obtain both

The quasi-likelihood ratio test makes use of a statistic based on the change in the objective function obtained by fitting the full model and the reduced model; it can be expressed as

and we shall see that again we have a chi-square asymptotic distribution for the QLR statistic with

We justify the asymptotic distribution as given above for the QLR statistic by expanding

with

But with a Taylor expansion again around

with

This implies

and using the quasi-score functions,

which is equivalent to the score statistic,

We end this section by noting that GSP methods for multivariate models have been introduced by Kuljus and Ranneby [

Asymptotic results for the GSP methods are obtained and presented in a unified way with fewer technicalities, paralleling likelihood methods. The implementation of the methods is no more complicated than the implementation of likelihood or quasi-likelihood methods, and the GJ class is large enough to allow more choices for robustness if needed for some parametric models, while at the same time the MSP method within this class is as efficient as the likelihood method for continuous univariate models. With all these properties of the GSP methods and this simple presentation, we hope to show that these methods are indeed very powerful and useful for continuous univariate models, yet appear to be underused. Practitioners might want to implement these methods in various fields, including actuarial science, for their applied work, as they are no more complicated than quasi-likelihood methods.

The helpful and constructive comments of the reviewers, which led to an improvement in the presentation of the paper, and the support of the editorial staff of Open Journal of Statistics in processing the paper are all gratefully acknowledged.

Luong, A. (2018) Unified Asymptotic Results for Maximum Spacing and Generalized Spacing Methods for Continuous Models. Open Journal of Statistics, 8, 614-639. https://doi.org/10.4236/ojs.2018.83040