
Quadratic distance estimation making use of the sample quantile function over a continuous range is introduced. It extends previous methods which are based on only a few sample quantiles, and it parallels the continuous GMM method. Asymptotic properties are established for the continuous quadratic distance estimators (CQDE), and the implementation of the methods is discussed. The methods appear to be useful for balancing robustness and efficiency, and for fitting distributions whose model quantile function is simpler than the density function or distribution function.

For estimation in a classical setup, we often assume that we have $n$ independent, identically distributed observations $X_1, \dots, X_n$ from a continuous density $f_{\theta_0}(x)$ which belongs to a parametric family $\{f_\theta\}$, i.e., $f_{\theta_0}(x) \in \{f_\theta\}$, where $\theta = (\theta_1, \dots, \theta_m)'$, $\theta \in \Omega$, $\theta_0$ is the true vector of parameters and $\Omega$ is assumed to be compact. One of the main objectives of inference is to estimate $\theta_0$. In an actuarial context, the sample observations might represent losses from a certain type of contract, and an estimate of $\theta_0$ is necessary if we want to set rates or premiums for that type of contract.

Maximum likelihood (ML) estimation is density based, and one of the regularity conditions under which ML estimators attain the lower bound given by the information matrix is that the domain of the density function must not depend on the parameters. In many applications, this condition is not met. The following example, the Generalized Pareto distribution (GPD), draws attention to the model quantile function, which appears to have nicer properties than the density function. This motivates us to develop continuous quadratic distance (CQD) estimation using quantiles over a continuum range, which generalizes the quadratic distance (QD) methods based on a few quantiles as proposed by LaRiccia and Wehrly [

Example (GPD).

The GP family is a two-parameter family with the vector of parameters $\theta = (\lambda, \kappa)'$.

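The density, distribution function and quantile function are given as follows. The density function is

$$f(x; \lambda, \kappa) = \frac{1}{\lambda}\left(1 - \frac{\kappa x}{\lambda}\right)^{\frac{1}{\kappa} - 1}, \quad 1 - \frac{\kappa x}{\lambda} \ge 0, \ \kappa \ne 0, \ \lambda > 0,$$

and

$$f(x; \lambda) = \frac{1}{\lambda} e^{-x/\lambda}, \quad x \ge 0, \ \kappa = 0, \ \lambda > 0;$$

the distribution function is given by

$$F(x; \lambda, \kappa) = 1 - \left(1 - \frac{\kappa x}{\lambda}\right)^{\frac{1}{\kappa}}, \quad 1 - \frac{\kappa x}{\lambda} \ge 0, \ \kappa \ne 0, \ \lambda > 0,$$

and

$$F(x; \lambda) = 1 - e^{-x/\lambda}, \quad x \ge 0, \ \kappa = 0, \ \lambda > 0;$$

the quantile function is given by

$$F^{-1}(t; \lambda, \kappa) = \frac{\lambda\left(1 - (1 - t)^{\kappa}\right)}{\kappa}, \quad 0 < t < 1, \ \kappa \ne 0, \ \lambda > 0,$$

and

$$F^{-1}(t; \lambda) = -\lambda \log(1 - t), \quad 0 < t < 1, \ \kappa = 0, \ \lambda > 0.$$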
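As an illustration of how simple the GPD quantile function is to work with, the following minimal Python sketch codes the distribution and quantile functions in the parameterization of this example. The function names `gpd_cdf` and `gpd_quantile` are hypothetical, and the support is assumed to start at 0, which is the usual convention.

```python
import math


def gpd_cdf(x, lam, kappa):
    """F(x; lam, kappa) in the (lam, kappa) parameterization of the example.

    Assumption: the support starts at 0, so F(x) = 0 for x < 0.
    """
    if lam <= 0:
        raise ValueError("lam must be positive")
    if x < 0:
        return 0.0
    if kappa == 0:
        return 1.0 - math.exp(-x / lam)
    base = 1.0 - kappa * x / lam
    if base <= 0:
        # Only possible when kappa > 0: x is at or beyond the endpoint lam / kappa.
        return 1.0
    return 1.0 - base ** (1.0 / kappa)


def gpd_quantile(t, lam, kappa):
    """F^{-1}(t; lam, kappa) for 0 < t < 1, available in closed form."""
    if lam <= 0:
        raise ValueError("lam must be positive")
    if not 0.0 < t < 1.0:
        raise ValueError("t must lie strictly between 0 and 1")
    if kappa == 0:
        return -lam * math.log(1.0 - t)
    return lam * (1.0 - (1.0 - t) ** kappa) / kappa
```

Applying `gpd_cdf` to the output of `gpd_quantile` returns the original level `t` up to rounding, which is a quick check that the two formulas are consistent with each other.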

These functions can be found in Castillo et al. [

$F_n^{-1}(t) = \inf\{x \mid F_n(x) \ge t\}$ and

$F_n(x) = \frac{1}{n}\sum_{i=1}^{n} \delta_{x_i}$, with $\delta_{x_i}$ being the degenerate distribution at $x_i$, is the commonly used sample distribution. The counterpart of $F_n^{-1}(t)$ is the model quantile function $F_\theta^{-1}(t)$, see Serfling [
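The definition $F_n^{-1}(t) = \inf\{x \mid F_n(x) \ge t\}$ reduces to picking an order statistic: it is the $k$-th smallest observation, where $k$ is the smallest integer with $k/n \ge t$. A minimal Python sketch follows; the function name `sample_quantile` and the small rounding guard are illustrative assumptions.

```python
import math


def sample_quantile(data, t):
    """Sample quantile F_n^{-1}(t) = inf{x : F_n(x) >= t}, for 0 < t <= 1."""
    if not 0.0 < t <= 1.0:
        raise ValueError("t must satisfy 0 < t <= 1")
    xs = sorted(data)
    n = len(xs)
    if n == 0:
        raise ValueError("data must not be empty")
    # Smallest k with k / n >= t; the 1e-12 guard (an assumption of this
    # sketch) protects against n * t landing just above an integer in
    # floating point arithmetic.
    k = max(1, math.ceil(n * t - 1e-12))
    return xs[k - 1]
```

For the sample 3, 1, 4, 1, 5, the empirical distribution function first reaches 0.5 at the value 3, so `sample_quantile([3, 1, 4, 1, 5], 0.5)` returns 3.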

Due to the complexity of the density function for the GP model, alternative methods to ML have been developed in the literature, for example the probability weighted moments (PWM) method proposed by Hosking and Wallis [

For estimating parameters of the GPD, the percentiles matching (PM) method for fitting loss distributions as described by Klugman et al. [

$F_n^{-1}(t_1) = F_\theta^{-1}(t_1)$ or equivalently, $F_\theta(F_n^{-1}(t_1)) = t_1$

and

$F_n^{-1}(t_2) = F_\theta^{-1}(t_2)$ or equivalently, $F_\theta(F_n^{-1}(t_2)) = t_2$.

The method is robust but not very efficient, as only two points are used to obtain moment-type equations, and the choice of these two points is also arbitrary. Castillo and Hadi [
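To make the two-point percentile matching idea concrete for the GPD, here is a minimal Python sketch. It assumes two levels $t_1 < t_2$ with observed sample quantiles $0 < q_1 < q_2$. The ratio of the two matching equations depends on $\kappa$ only and is strictly decreasing in $\kappa$, so $\kappa$ can be found by bisection and $\lambda$ then follows from the first equation. The function name `gpd_percentile_matching` and the search bracket for $\kappa$ are assumptions of this sketch, not part of the source.

```python
import math


def _ratio(kappa, a1, a2):
    """(1 - (1-t2)^kappa) / (1 - (1-t1)^kappa), with a_i = -log(1 - t_i)."""
    if abs(kappa) < 1e-12:
        return a2 / a1  # limiting value as kappa -> 0 (exponential case)
    return math.expm1(-kappa * a2) / math.expm1(-kappa * a1)


def gpd_percentile_matching(q1, q2, t1, t2, bracket=(-20.0, 20.0)):
    """Solve F^{-1}(t_i; lam, kappa) = q_i, i = 1, 2, for (lam, kappa)."""
    if not (0.0 < t1 < t2 < 1.0 and 0.0 < q1 < q2):
        raise ValueError("need 0 < t1 < t2 < 1 and 0 < q1 < q2")
    a1, a2 = -math.log(1.0 - t1), -math.log(1.0 - t2)
    target = q2 / q1
    lo, hi = bracket  # assumed search range for kappa
    if not _ratio(hi, a1, a2) < target < _ratio(lo, a1, a2):
        raise ValueError("no solution for kappa inside the bracket")
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        # The ratio is decreasing in kappa: a value above the target
        # means kappa is still too small.
        if _ratio(mid, a1, a2) > target:
            lo = mid
        else:
            hi = mid
    kappa = 0.5 * (lo + hi)
    if abs(kappa) < 1e-12:
        lam = q1 / a1
    else:
        lam = q1 * kappa / (-math.expm1(-kappa * a1))
    return lam, kappa
```

When the inputs are exact model quantiles, for example those of $\lambda = 2$, $\kappa = 0.3$ at levels 0.25 and 0.75, the routine recovers the parameters up to numerical tolerance. With sample quantiles as inputs, the result depends on the chosen pair of levels, which is the arbitrariness noted above.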

Instead of solving moment-type equations, for parametric estimation in general, not necessarily for the GPD, with the vector of parameters

which is based on the sample and its model counterpart defined as

This leads to a class of quadratic distance of the form

and the quadratic distance (QD) estimators are found by minimizing the objective function given by expression (1),

By quadratic distance estimation, without further specifying that it is continuous, we mean estimation based on a quadratic form as given by expression (1); it also fits into classical minimum distance (CMD) estimation and is closely related to the generalized method of moments (GMM). Likewise, by GMM, without further specifying that it is continuous GMM, we mean GMM based on a finite number of moment conditions, see Newey and McFadden [

Using the asymptotic theory of QD estimation or CMD estimation, it is well known that by letting

In fact, it has been shown that it suffices to use a consistent estimate for

then we can construct a consistent estimate which is given

In practice, for QD estimation we let

For GMM estimation, it is quite straightforward to construct

Continuous GMM theory makes use of Hilbert space linear operator theory and has been developed in detail by Carrasco and Florens [

CQD estimators can be viewed as estimators based on minimizing a continuous quadratic form as given by

with:

1)

2) a and b are chosen values with a being close to 0 and b close to 1 and

In practice, we work with an asymptotically equivalent objective function

Since the kernel

As in the spectral decomposition of a symmetric positive definite matrix for the Euclidean space, spectral decomposition in Hilbert space allows the kernel

representation, we can express

which is similar to the expression used to obtain continuous GMM estimators as given by Carrasco and Florens [

Spectral decompositions in functional space have been used in the literature, see Feuerverger and McDunnough [

to obtain CQD estimators. Unless otherwise stated, by CQD estimators we mean estimators using the objective function of the form as defined by expression (5).
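The following Python sketch illustrates the general idea of minimizing a quadratic distance between the sample and model quantile functions over a range $[a, b]$. It is a deliberately simplified version: the weighting operator is replaced by the identity, so the objective is the unweighted integral $\int_a^b (F_n^{-1}(t) - F_\theta^{-1}(t))^2 \, dt$ rather than the objective of expression (5), and no efficiency claim is attached to it. The one-parameter exponential model (the $\kappa = 0$ case of the GPD), the midpoint quadrature rule, the golden-section search and all function names are assumptions made for illustration.

```python
import math
import random


def sample_quantile(xs_sorted, t):
    """F_n^{-1}(t) = inf{x : F_n(x) >= t} for an already sorted sample."""
    n = len(xs_sorted)
    k = max(1, math.ceil(n * t - 1e-12))
    return xs_sorted[k - 1]


def cqd_objective(lam, xs_sorted, a=0.05, b=0.95, m=200):
    """Midpoint-rule value of int_a^b (F_n^{-1}(t) - F_lam^{-1}(t))^2 dt
    for the exponential model quantile F_lam^{-1}(t) = -lam * log(1 - t)."""
    h = (b - a) / m
    total = 0.0
    for j in range(m):
        t = a + (j + 0.5) * h
        diff = sample_quantile(xs_sorted, t) + lam * math.log(1.0 - t)
        total += diff * diff
    return total * h


def golden_section_min(f, lo, hi, tol=1e-8):
    """Minimize a unimodal function f on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    while hi - lo > tol:
        if f(c) < f(d):
            hi = d
        else:
            lo = c
        c = hi - inv_phi * (hi - lo)
        d = lo + inv_phi * (hi - lo)
    return 0.5 * (lo + hi)


def fit_exponential_cqd(data, a=0.05, b=0.95):
    """Unweighted quadratic-distance estimate of the exponential scale."""
    xs = sorted(data)
    # The search interval is an assumption: the minimizer cannot exceed
    # a moderate multiple of the largest observation.
    return golden_section_min(
        lambda lam: cqd_objective(lam, xs, a, b), 1e-6, 10.0 * xs[-1]
    )
```

A possible usage is `random.seed(0)` followed by `fit_exponential_cqd([random.expovariate(0.5) for _ in range(2000)])`, which should return a value near the true scale of 2. Because only quantiles with levels in $[a, b]$ enter the objective, extreme observations in either tail have limited effect on the estimate.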

Carrasco and Florens [

The objective of the paper is to develop CQD estimation based on quantiles, with the aim of obtaining estimators which are robust in the sense of bounded influence functions and have good efficiency. For technicalities, we refer to the paper by Carrasco and Florens [

The paper is organized as follows. Section 2 gives some preliminary results, such as the notion of a statistical functional and its influence function, from which the sample quantiles can be viewed as robust statistics with bounded influence functions. CQD estimation using quantiles will inherit the same robustness property. Some of the standard notions for the study of kernel functions will also be reviewed. By linking a kernel to a linear operator in the Hilbert space of functions which are square integrable over the range

Finally, we shall mention that simulation studies are not discussed in this paper as numerical quadrature methods are involved for evaluating the integrals over the range

In this section we shall review the notion of statistical functional and its influence function and view a sample quantile as a statistical functional. Using its influence function, we can see that the sample quantile is a robust statistic and using the influence functions of two sample quantiles, we can also obtain the asymptotic covariance of the two sample quantiles.
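As a concrete companion to this review, the sketch below codes the standard textbook form of the influence function of the $s$-th quantile, $(s - 1\{x \le F^{-1}(s)\}) / f(F^{-1}(s))$, and the resulting asymptotic covariance of two $\sqrt{n}$-scaled sample quantiles, $(\min(s, t) - st) / (f(F^{-1}(s)) f(F^{-1}(t)))$. These are the well-known general formulas; the function names and argument conventions are hypothetical.

```python
def quantile_influence(x, s, q_s, f_q_s):
    """Influence function of the s-th quantile evaluated at the point x.

    q_s   : model quantile F^{-1}(s)
    f_q_s : model density evaluated at q_s (must be positive)
    """
    indicator = 1.0 if x <= q_s else 0.0
    return (s - indicator) / f_q_s


def quantile_cov_kernel(s, t, f_q_s, f_q_t):
    """Asymptotic covariance of the sqrt(n)-scaled sample quantiles at
    levels s and t, obtained as the expected product of their influence
    functions."""
    return (min(s, t) - s * t) / (f_q_s * f_q_t)
```

For the unit exponential distribution, $F^{-1}(s) = -\log(1 - s)$ and $f(F^{-1}(s)) = 1 - s$, so `quantile_cov_kernel(0.25, 0.75, 0.75, 0.25)` evaluates to 1/3 up to rounding. The influence function takes only two values as $x$ varies, which is the sense in which a sample quantile has a bounded influence function.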

Often, a statistic can be represented as a functional of the sample distribution

Furthermore, since a Taylor type of approximation in a functional space can be used, we then have the following approximation expressed with a remainder term

or equivalently using

If

Therefore, if we want to find the asymptotic variance of

The influence function of the sth-sample quantile

from which we can obtain the asymptotic variance of

See Serfling [

see LaRiccia and Wehrly [

If we define the covariance kernel as

then associated to this kernel there is a linear operator

We can see that for a suitable functional space, it is natural to consider the Hilbert space of functions which are square integrable so that a norm and linear operators can be defined in this space. This will facilitate the study of kernels which are functions of

The functional space that we are interested in is the space of integrable functions with the range

For a Euclidean space, the composition of two linear operators

Just as a matrix

More precisely, given

Furthermore,

if

In this paper we focus on positive definite symmetric kernel

Unless otherwise stated, we work with linear operators associated with positive definite symmetric kernels. For the Euclidean space if the covariance matrix

see Hogg et al. [

If

For our purpose, we shall focus on a linear operator

The methods used to construct an estimator for

1) We need a preliminary consistent estimate

2) Use

For our set-up, i.e., CQD estimation, we should use the influence function of the sample quantiles as given by expression (6) to specify

The notion of influence function was not mentioned in Carrasco and Florens [

3) Since

4) Use the spectral decomposition to express

The above expression is similar to the representation of a positive definite matrix

We can proceed as follows in order to find

It turns out that

The eigenfunction can be expressed as

can be computed as statistical packages often offer routines to compute eigenvalues and eigenvectors for a given matrix.
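To illustrate how a matrix eigenvalue routine can stand in for the spectral decomposition of a kernel operator, the sketch below discretizes a symmetric kernel on a midpoint grid over $[a, b]$ and calls `numpy.linalg.eigh`. This grid-based (Nyström-type) discretization is a generic numerical technique chosen for illustration and is not necessarily the matrix construction used in this paper. The function names are hypothetical, and the example kernel $\min(s, t) - st$ is the quantile covariance kernel for the uniform distribution, where the density factors equal 1.

```python
import numpy as np


def kernel_eigensystem(kernel, a, b, m=200):
    """Approximate eigenvalues and eigenfunctions of the integral operator
    (K g)(s) = int_a^b kernel(s, t) g(t) dt on a midpoint grid."""
    h = (b - a) / m
    t = a + (np.arange(m) + 0.5) * h           # midpoint nodes
    K = kernel(t[:, None], t[None, :])         # m x m symmetric kernel matrix
    eigvals, eigvecs = np.linalg.eigh(K * h)   # quadrature weight h absorbed
    order = np.argsort(eigvals)[::-1]          # sort in decreasing order
    eigvals = eigvals[order]
    # Rescale so each eigenfunction has unit L2 norm: sum(phi**2) * h = 1.
    eigfuncs = eigvecs[:, order] / np.sqrt(h)
    return t, eigvals, eigfuncs


def brownian_bridge_kernel(s, t):
    """Covariance kernel min(s, t) - s * t (uniform-distribution case)."""
    return np.minimum(s, t) - s * t
```

On the full interval $[0, 1]$ the eigenvalues of this particular kernel are known to be $1/(j\pi)^2$, $j = 1, 2, \dots$, so the largest value returned by `kernel_eigensystem(brownian_bridge_kernel, 0.0, 1.0)` can be compared with $1/\pi^2$ as a sanity check. The rapid decay of the eigenvalues is also what makes inverting such an operator numerically delicate.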

For numerical evaluations of

Now we turn our attention to constructing

It appears then the kernel of

Now we can define

For example, if we let

This also means that the kernel for

and again

This also means that

In Section 3 we shall turn our attention to asymptotic properties of CQD estimators using the objective function

For consistency, we shall make use of the basic consistency theorem, i.e., Theorem 2.1 as given by Newey and McFadden [

Assuming

Now if we assume that the integrand can be dominated by a function

The basic assumption used to establish asymptotic normality for the CQD estimators is that the model quantile function is twice differentiable, which allows a standard Taylor expansion of the estimating equations.

Assuming the first derivative vector

Before considering the Taylor expansion, we also need the following notation and the notion of a random element with zero mean and covariance given by the kernel of the associated linear operator K, i.e.,

as

Using expression (12), we then have

Now using

Note that

Let

And

so that

with the symbol

The matrix

using the spectral decomposition technique, the

The proposed method is similar to the continuous GMM method, where the estimators are based on the sample distribution function and are obtained by minimizing

being an optimum kernel but using a sample distribution function

The authors also showed that by letting

For the sake of robustness, in continuous GMM estimation we might want to let $T$ be finite and the lower bound be

The helpful and constructive comments of a referee, which led to an improvement of the presentation of the paper, and the support from the editorial staff of Open Journal of Statistics in processing the paper are gratefully acknowledged.

The author declares no conflicts of interest regarding the publication of this paper.

Luong, A. (2019) Robust Continuous Quadratic Distance Estimation Using Quantiles for Fitting Continuous Distributions. Open Journal of Statistics, 9, 421-435. https://doi.org/10.4236/ojs.2019.94028