Generalized Method of Moments and Generalized Estimating Functions Based on Probability Generating Function for Count Models

A generalized method of moments (GMM) approach based on the probability generating function is considered for count models. Estimation and model testing are unified under this approach, which also leads to distribution-free chi-square tests. The estimation methods developed are related to methods based on generalized estimating equations, but with the advantage of providing statistics for model testing. The proposed methods overcome the numerical problems often encountered when the probability mass function has no closed form, which prevents the use of maximum likelihood (ML) procedures; moreover, ML procedures do not in general lead to distribution-free model testing statistics.


Introduction
Count data are encountered in many fields of application, including the actuarial sciences, and fitting discrete count models is of interest. Classical methods such as maximum likelihood (ML) procedures often require the probability mass function of the model to have a closed form; furthermore, the inference techniques do not lead to distribution-free statistics when the Pearson statistics are used. In fact, if a model does not fit the data, better models can be created using a compounding procedure, a stopped-sum procedure or a mixing procedure; for discussions on these procedures see the books by Johnson et al. [1] and Klugman et al. [2]. These better models often do not have closed-form probability mass functions, but their probability generating functions often remain simple and have closed-form expressions.
For example, if count data display long-tailed behavior, so that the Poisson model with probability generating function $P(s;\theta) = e^{-\theta(1-s)}$, $\theta > 0$, does not provide a good fit, the discrete positive stable (DPS) distribution can be used as an alternative to the Poisson distribution. The DPS distribution does not have a closed or simple form for its probability mass function, but its probability generating function is simple; see [3] for this distribution. In their paper, expression (6) gives a series representation of the probability mass function of the DPS distribution. The probability mass function appears to be complicated, and for model validation there is a need for a statistic for model testing. These issues make maximum likelihood (ML) procedures difficult to implement.
GMM procedures based on the probability generating function appear to be a natural way to introduce alternatives to ML procedures, bypassing explicit use of the probability mass function and focusing uniquely on the probability generating function. In this vein, the procedures proposed in this paper make use of GMM and generalized estimating equation theory, and they are less simulation intensive than the inference techniques given in the paper by Luong et al. [4].
We shall use general GMM methodology but adapt it to situations where the moment conditions are based on the probability generating function, so that estimation and model testing can be carried out in a unified way for discrete count models. The choice of moments in the developed GMM procedures makes use of estimating function theory, which allows the number of points of the probability generating function used to tend to infinity as the sample size $n \to \infty$. Furthermore, we also relate GMM estimation to the approach using generalized estimating equations. The main features of the proposed procedures are as follows.
1) GMM procedures as proposed by Doray et al. [5] only make use of a finite number of points of the probability generating function. Our methods aim at achieving higher efficiency yet remain simple to implement; this is done by linking to the theory of estimating functions, which can accommodate a number of points from the probability generating function that, instead of being fixed, goes to infinity as $n \to \infty$.
2) The new GMM procedures remain simpler to implement than GMM procedures using a continuum of moment conditions in general, as proposed by Carrasco and Florens [6], or than adapting the GMM procedures using a continuum of moment conditions for the characteristic function, proposed by Carrasco and Kotchoni [7], to the probability generating function. Practitioners might find the sophisticated methods based on a continuum of moment conditions difficult to implement.
The paper is organized as follows. In Section 2, we review available results from general GMM theory; although these results are not new once the moment conditions are defined, they make the paper more self-contained, as they will be adapted subsequently with moment conditions extracted from the probability generating function when count models are considered. In Section 3, GMM estimation and related GEE estimation for count models are considered.
The chi-square statistics are also given in Section 3.

Generalized Method of Moments (GMM) Methodology
The inference techniques based on probability generating functions developed in this paper make use of results of Generalized Method of Moments (GMM) theory, which are well established once the moment conditions are specified. The estimating equations of GMM methods will also be linked to the theory of estimating equations and generalized estimating equations (GEE) as developed by Godambe and Thompson [10], Morton [11], and Liang and Zeger [12].

Generalized Estimating Equations (GEE) and GMM Estimation
For data, we shall assume that we have $n$ independent observations $y_1, \ldots, y_n$; these observations need not be identically distributed, but each $y_i$ follows a distribution which depends on the same vector of parameters $\theta = (\theta_1, \ldots, \theta_p)'$, with convergence in probability denoted by $\xrightarrow{p}$ and convergence in distribution denoted by $\xrightarrow{d}$. The asymptotic covariance matrix of $\hat{\theta}_{op}$ then follows, and an iterative algorithm can be used to obtain the estimator $\hat{\theta}_{op}$ numerically; the algorithm gives the $(j+1)$-th iteration based on the previous $j$-th iteration. Other numerical techniques to obtain $\hat{\theta}_{op}$ can also be used; for example, we can consider solving the system of equations given by expression (4). The matrix $\hat{S}$ is positive definite with probability one and clearly symmetric, so its inverse $\hat{S}^{-1}$ exists with probability one. Although the two expressions for $\hat{S}$ are asymptotically equivalent, for numerical implementation of the methods in finite samples one of them has a better chance of being invertible.
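As a concrete illustration of such an iterative scheme, the following minimal sketch (not from the paper; the Poisson model, the two moment conditions and the identity weight matrix are illustrative assumptions) applies a Gauss-Newton-type update to a simple GMM problem with $p = 1$ parameter and $k = 2$ sample moments:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.poisson(3.0, size=2000)  # simulated sample; true theta = 3

def gbar(theta):
    # stacked sample moment conditions for a Poisson(theta) model:
    # E[X] - theta = 0 and E[X^2] - (theta + theta^2) = 0
    return np.array([x.mean() - theta,
                     (x ** 2).mean() - (theta + theta ** 2)])

W = np.eye(2)       # identity weight matrix, for simplicity of the sketch
theta = x.mean()    # starting value from the first moment
for _ in range(20):
    # numerical Jacobian of gbar with respect to theta (a 2-vector since p = 1)
    d = (gbar(theta + 1e-6) - gbar(theta - 1e-6)) / 2e-6
    # Gauss-Newton update: theta_{j+1} = theta_j - (d'Wd)^{-1} d'W gbar(theta_j)
    theta = theta - (d @ W @ gbar(theta)) / (d @ W @ d)
```

In practice the identity matrix would be replaced by $\hat{S}^{-1}$ to obtain the efficient GMM estimator.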
Under suitable differentiability assumptions imposed on the vector function $g(\theta)$, the GMM estimator $\hat{\theta}$ is consistent and has an asymptotic multivariate normal distribution. Using $V$, the asymptotic covariance matrix of $\hat{\theta}$ can be estimated.
We also notice that we can recover the optimum estimating equations estimators using the GMM estimation set-up by letting $k = p$, i.e., the number of sample moments is equal to the number of parameters to be estimated. Minimizing the corresponding GMM objective function yields the vector of GMM estimators given by a system of equations which, since $\hat{S}^{-1}$ is positive definite with probability one, is the same system of equations as for obtaining the optimum estimating equations estimators as discussed. If the observations are not only independent but also identically distributed, then we have the equivalence of the two methods. We also notice that $\hat{S}$ used for GMM estimation plays a role similar to that of the working matrix in GEE, and it is often simpler to obtain $\hat{S}$; see Section 3 for more details on the choice of moment conditions for GMM methods with models based on probability generating functions.
One advantage of the GMM approach over the generalized estimating equations (GEE) approach is that with the GMM approach we have an objective function to be minimized, which leads to the construction of chi-square tests for moment restrictions, whereas there is no such equivalent test statistic if we use the generalized estimating equations approach. Furthermore, we shall see in Section 3 that when the approach is applied to discrete distributions with moment conditions extracted from the probability generating function, testing for moment restrictions can be viewed as testing goodness-of-fit for the count model being used. Consequently, estimation and model testing can be treated in a unified way using this approach.
As mentioned earlier, the GMM objective function evaluated at $\hat{\theta}$ can be used to construct a test statistic which follows an asymptotic chi-square distribution, but we need $k > p$, i.e., the number of sample moments must exceed the number of parameters to be estimated.

Testing the Validity of Moment Restrictions
We notice that when $\theta$ is specified, Hansen's statistic is given by $nQ(\theta)$ and the asymptotic distribution of the statistic is chi-square with $k$ degrees of freedom. When $\theta$ must be estimated, we need to obtain $\hat{\theta}$ first by minimizing $Q(\theta)$; Hansen's statistic is then given by $nQ(\hat{\theta})$ and its asymptotic distribution is chi-square with $k - p$ degrees of freedom. These statistics will be used subsequently with moment conditions extracted from the model probability generating function in Section 3. We shall show in the next sections that these statistics are consistent test statistics in general for model testing with the discrete model specified by its probability generating function. These statistics are also distribution free. The distribution-free property is not enjoyed by goodness-of-fit test statistics for model testing based on the empirical probability function [15], as the null distributions of those statistics depend on the unknown parameters.
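A minimal sketch of such a test of moment restrictions is given below. The Poisson model and the two simple moment conditions are illustrative assumptions (not the PGF-based conditions of Section 3), and the continuous-updating form of the objective is used for brevity:

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import chi2

rng = np.random.default_rng(1)
x = rng.poisson(2.0, size=1000)  # data generated under the null model
n = len(x)

def g_i(theta):
    # k = 2 moment conditions per observation for Poisson(theta):
    # E[X] - theta = 0 and E[X(X-1)] - theta^2 = 0
    return np.column_stack([x - theta, x * (x - 1) - theta ** 2])

def J(theta):
    # continuous-updating GMM objective n * gbar' S^{-1} gbar
    G = g_i(theta)
    gbar = G.mean(axis=0)
    S = np.cov(G.T)  # estimated covariance matrix of the moment functions
    return n * gbar @ np.linalg.solve(S, gbar)

res = minimize_scalar(J, bounds=(0.1, 10.0), method="bounded")
j_stat = res.fun                       # Hansen's statistic at the minimizer
p_value = chi2.sf(j_stat, df=2 - 1)    # chi-square with k - p = 1 df
```

Under the null model, the statistic should rarely exceed the chi-square critical value.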
In addition, the procedures proposed by Doray et al. [5] only make use of $k$ fixed points $s_1, \ldots, s_k$ to generate moment conditions, regardless of the sample size $n$. The procedures proposed in this paper are different, as the number of points selected from the probability generating function goes to infinity as $n \to \infty$.

GEE and GMM Methods with Moment Conditions from Probability Generating Function
In this section, we shall give attention to count models. We shall assume that we have a random sample of $n$ independent and identically distributed observations $X_1, \ldots, X_n$ which follow the same distribution as $X$, where $X$ follows a nonnegative integer discrete distribution with probability mass function $p(x;\theta)$. Optimum estimating functions can be used to obtain estimators, but we emphasize here the GMM approach, as distribution-free asymptotic chi-square tests for moment restrictions can also be obtained, and these can be interpreted as goodness-of-fit tests for the parametric family used. However, optimum estimating function theory is very useful for identifying sample moments for efficiency of the GMM procedures.

Generalized Estimating Functions (GEE)
First, we shall define the basic unbiased estimating functions. Since $E\big[t^{X_i}\big] = P(t;\theta)$, where $P(t;\theta)$ is the model probability generating function, the elementary unbiased estimating functions are of the form $h_i(t) = t^{X_i} - P(t;\theta)$, $i = 1, \ldots, n$. We select two points $t_1$ and $t_2$, for example $t_1 = 0.50$ and $t_2 = 0.75$, and therefore we can form two sets of elementary basic unbiased estimating functions using these two points.
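The unbiasedness of these elementary functions can be checked numerically; the sketch below (with a Poisson model assumed purely for illustration) verifies that the sample mean of $t^{X_i} - P(t;\theta)$ is close to zero at the two selected points:

```python
import numpy as np

rng = np.random.default_rng(2)
theta = 1.5
x = rng.poisson(theta, size=50000)

def pgf_poisson(t, theta):
    # Poisson probability generating function P(t; theta) = exp(-theta (1 - t))
    return np.exp(-theta * (1.0 - t))

# elementary unbiased estimating functions h_i(t) = t**X_i - P(t; theta)
means = {t: (t ** x - pgf_poisson(t, theta)).mean() for t in (0.50, 0.75)}
# each sample mean is close to 0 because E[t**X] = P(t; theta)
```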

GMM Methodology
Before defining the sample moment vector, the points at which the sample probability generating function is matched with its model counterpart must be chosen.

GMM Objective Function
Now we turn our attention to defining the sample moment vector; the point $t_3$ is chosen close to 1 but with $t_3 < 1$. The GMM objective function can then be constructed as $Q(\theta) = \bar{g}(\theta)' \hat{S}^{-1} \bar{g}(\theta)$, where $\bar{g}(\theta)$ is the vector of sample moments.
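As an illustrative sketch of the whole procedure (the Poisson model, the three points and the preliminary estimate used to build $\hat{S}$ are all assumptions for this example, not prescriptions from the paper), the objective can be formed and minimized as follows:

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(3)
true_theta = 2.0
x = rng.poisson(true_theta, size=2000)
n = len(x)
ts = np.array([0.50, 0.75, 0.90])  # illustrative points; the last is close to 1

def gbar(theta):
    # sample moment vector: (1/n) sum_i t^{X_i} - P(t; theta) at each point t
    return (ts[None, :] ** x[:, None]).mean(axis=0) - np.exp(-theta * (1 - ts))

# estimate S from the elementary functions evaluated at a preliminary estimate
h = ts[None, :] ** x[:, None] - np.exp(-x.mean() * (1 - ts))[None, :]
S = np.cov(h.T)

def Q(theta):
    # GMM objective n * gbar' S^{-1} gbar
    g = gbar(theta)
    return n * g @ np.linalg.solve(S, g)

res = minimize_scalar(Q, bounds=(0.1, 10.0), method="bounded")
theta_hat = res.x
```

The minimized value of the objective also supplies the chi-square statistic of the previous section, here with $k - p = 2$ degrees of freedom.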

Model Testing Using GMM Objective Function
Now we shall turn our attention to the problem of testing a model specified by its probability generating function. Let $X_1, \ldots, X_n$ be a random sample drawn from a nonnegative integer discrete distribution with probability generating function $P(t)$. If at some selected point $t$ we have $P(t) \neq P(t;\theta)$ for every $\theta$, the chi-square statistic will converge to infinity. In order not to have this property, the true and model probability generating functions must agree at the selected points; therefore, we cannot simultaneously have the convergence given by expression (19), and the chi-square test is consistent in general, as it can detect common departures from the model. These chi-square statistics are distribution free, as there is no unknown parameter in the chi-square distributions of the statistics used. These goodness-of-fit tests are simpler to implement than the ones based on matching the sample probability generating function with its model counterpart using a continuum of moment conditions, as given by Theorem 10 of Carrasco and Florens [6] (pp. 812-813). Note that if maximum likelihood estimators are used concomitantly with the common classical Pearson statistics, the statistics often have complicated distributions and are no longer distribution free; see Chernoff and Lehmann [17] and Luong and Thompson [18]. These classical Pearson test statistics are not consistent in general. Efficiency can be improved by using more components, but more components also tend to create numerical difficulties, because the matrix $\hat{S}$ will be nearly singular and the numerical inversion of such a matrix is often problematic.

Further Extensions: The Use of Orthogonal Estimating Functions
Finally, we note that although the GMM methods developed are primarily for discrete distributions, they can also accommodate nonnegative continuous distributions defined using Laplace transforms, as discussed in Luong [20], since Laplace transforms are related to probability generating functions.

An Example and Numerical Illustrations
We shall use an example to illustrate the procedures; we consider a random sample from the model under study. Often, by using a spectral decomposition of $\hat{S}$, we can obtain $\hat{S}^{-1}$ numerically, although directly asking for the inverse using R might just return the message that the matrix is nearly singular, without returning the inverse. Using the spectral representation of $\hat{S}$, we have $\hat{S} = P \Lambda P'$, with $P$ an orthonormal matrix, $P'P = I$, $P^{-1} = P'$, and $\Lambda$ a diagonal matrix whose diagonal elements are the eigenvalues of $\hat{S}$. These eigenvalues need to be computed with high accuracy and must be numerically positive, so in general more digits should be kept when computing the eigenvalues of $\hat{S}$. The estimates of the mean square errors of the GMM estimator, the NLS estimator and the ML estimator were obtained using simulated samples. The efficiency of the GMM estimator is practically identical to that of the ML estimator, but the efficiency of the NLS estimator is much lower and gets worse as $\theta$ increases in comparison with the ML estimator. The results are displayed in Table A1.
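The spectral-decomposition inversion described above can be sketched as follows (in Python with NumPy rather than R, as an assumption of this illustration; the tolerance value is also illustrative):

```python
import numpy as np

def spectral_inverse(S, tol=1e-10):
    # invert a symmetric positive definite matrix via its spectral decomposition
    # S = P diag(lam) P'  implies  S^{-1} = P diag(1/lam) P'
    lam, P = np.linalg.eigh(S)
    if np.any(lam <= tol):
        # a numerically non-positive eigenvalue signals near-singularity
        raise np.linalg.LinAlgError("eigenvalue numerically non-positive")
    return P @ np.diag(1.0 / lam) @ P.T

S = np.array([[2.0, 1.0],
              [1.0, 2.0]])
S_inv = spectral_inverse(S)
```

Checking the eigenvalues explicitly, rather than calling a generic inversion routine, makes the near-singularity of $\hat{S}$ visible and controllable.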
In order to test whether the chi-square test has power to detect departures from the model used here, we use the negative binomial distribution with mean equal to $\theta$ and variance exceeding $\theta$. The estimated powers obtained are also encouraging and show that the chi-square tests have considerable power to detect departures. The estimated power decreases, as expected, as the negative binomial distribution considered approaches the Poisson distribution.

Conclusion
At this point, we can conclude that the methods appear to be relatively simple to implement, have the potential to be efficient for some count models, and have the advantage of using only the probability generating function instead of the probability mass function, allowing inferences to be made for a much larger class of parametric families without relying on extensive use of simulations. The proposed GMM methodology also combines traditional GMM methodology with generalized estimating function methodology, and both of these methodologies are well-known alternatives to ML methodology. The lack of statistics for model testing when using generalized estimating function methodology is overcome by the proposed procedures.