Efficiency of Some Estimators for a Generalized Poisson Autoregressive Process of Order 1

Various models have been proposed in the literature to study non-negative integer-valued time series. In this paper, we study estimators for the generalized Poisson autoregressive process of order 1, a model developed by Alzaid and Al-Osh [1]. We compare three estimation methods, namely the method of moments, quasi-likelihood and conditional maximum likelihood, and we study their asymptotic properties. To compare the bias of the estimators in small samples, we perform a simulation study for various parameter values. Using the theory of estimating equations, we obtain expressions for the variance-covariance matrices of these three estimators, and we compare their asymptotic efficiency. Finally, we apply the methods derived in the paper to a real time series.


Introduction
Time series are used to model various phenomena measured over time. Successive observations are often correlated, since they may depend on common external factors that remain unknown to the analyst. In this case, autoregressive models are useful to model this dependence.
In some situations, we might be interested in the number of events which occur during a certain period of time. Such observations are necessarily non-negative and integer-valued. Models which have been used for sequences of dependent discrete random variables include the Poisson autoregressive process of order 1, denoted PAR(1), introduced by Al-Osh and Alzaid [2], and the generalized Poisson autoregressive process of order 1, denoted GPAR(1) (see Alzaid and Al-Osh [1]). The PAR(1) process, a stationary process with Poisson marginal distributions, is a special case of the GPAR(1).

The paper is organized as follows. In Section 2, for completeness, we review some properties of the generalized Poisson autoregressive process of order 1. In Section 3, we derive the expressions for the moments estimators, the quasi-likelihood estimators and the maximum likelihood estimators of the three parameters of the GPAR(1). These methods have appeared in the literature (see Al-Nachawati, Alwasel and Alzaid [3] for the quasi-likelihood and moments methods, and Brännäs [4] for likelihood methods). However, asymptotic properties such as the efficiencies of these methods are not discussed in those papers. In this paper (Sections 4 and 5), we study properties of these estimators such as bias and asymptotic efficiency. The last section reanalyzes a real-data example which can be modelled with a GPAR(1) process, and testing is discussed there. We hope that, with this study, practitioners will have more information to select one estimation method over another and to perform tests concerning the values of the parameters.

GPAR(1) Process
To define the GPAR(1) process, we first need to review the generalized Poisson and the quasi-binomial distributions.
A random variable X has a generalized Poisson distribution with parameters λ and θ, denoted GP(λ, θ), if

P(X = x) = λ (λ + θx)^(x−1) e^−(λ+θx) / x!, x = 0, 1, 2, ...,

where λ > 0 and 0 ≤ θ < 1. Its mean and variance are E[X] = λ/(1 − θ) and Var[X] = λ/(1 − θ)^3, so that, for positive values of θ, we have overdispersion (i.e. Var[X] > E[X]). The sum X + Y of two independent random variables X and Y with X ~ GP(λ1, θ) and Y ~ GP(λ2, θ) follows a GP(λ1 + λ2, θ) distribution. A non-negative integer-valued random variable X has a quasi-binomial distribution with parameters p, θ and n, denoted QB(p, θ, n), if

P(X = x) = C(n, x) p q (p + xθ)^(x−1) (q + (n − x)θ)^(n−x−1) / (1 + nθ)^(n−1), x = 0, 1, ..., n,

where q = 1 − p. Its mean, equal to np, is independent of the parameter θ.
The following proposition, proved in Alzaid and Al-Osh [1], shows the relation between the QB and GP distributions.

Proposition 1: Let N ~ GP(λ, θ) and, given N = n, let S(n) ~ QB(p, θ/λ, n). If X ~ GP(qλ, θ) is independent of (N, S(N)), where q = 1 − p, then S(N) ~ GP(pλ, θ) and S(N) + X ~ GP(λ, θ).

The GPAR(1) process generalizes the PAR(1) process introduced by Al-Osh and Alzaid [2]. The PAR(1) process has been used to model time series in various fields, for example in insurance, for the number of short-term workers' compensation claims due to work-related injuries (Freeland and McCabe [7]), and in medicine, for the incidence of infectious diseases (Cardinal, Roy and Lambert [8]).
In practice, many integer-valued series exhibit overdispersion (i.e. Var[X_t] > E[X_t]). The PAR(1) model, whose Poisson marginal distributions satisfy Var[X_t] = E[X_t], would therefore not be appropriate for those time series. In cases where the extra variation can be explained in a deterministic way, adding regressors would be adequate (see Freeland and McCabe [7]), but where the extra variation is of a stochastic nature, the GPAR(1) model can be used for modelling overdispersed time series. The GPAR(1) model, introduced by Alzaid and Al-Osh [1], is defined as

X_t = S_t(X_{t−1}) + ε_t, (1)

where, given X_{t−1} = x, S_t(x) ~ QB(p, θ/λ, x), and where the innovations ε_t are independent and identically distributed GP(qλ, θ) random variables, with q = 1 − p, independent of S_t(X_{t−1}). By Proposition 1, the process then has stationary GP(λ, θ) marginal distributions, as stated in Al-Nachawati et al. [3].
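To make the data-generating mechanism of model (1) concrete, here is a minimal C++ sketch of a GPAR(1) simulator. The paper's own C++ program is not reproduced here, so all names are ours; the QB pmf is taken in the Consul-Mittal form with mean np, and sampling is done by simple inversion with a truncation guard for the unbounded GP support.

```cpp
#include <cmath>
#include <random>
#include <vector>

// pmf of the generalized Poisson GP(lambda, theta) at x, on the log scale
// for numerical stability.
double gp_pmf(int x, double lambda, double theta) {
    double lg = std::log(lambda) + (x - 1.0) * std::log(lambda + theta * x)
                - (lambda + theta * x) - std::lgamma(x + 1.0);
    return std::exp(lg);
}

// pmf of the quasi-binomial QB(p, phi, n) at x (Consul-Mittal form, mean np).
double qb_pmf(int x, double p, double phi, int n) {
    if (n == 0) return x == 0 ? 1.0 : 0.0;
    double q = 1.0 - p;
    double lg = std::lgamma(n + 1.0) - std::lgamma(x + 1.0)
                - std::lgamma(n - x + 1.0) + std::log(p) + std::log(q)
                + (x - 1.0) * std::log(p + x * phi)
                + (n - x - 1.0) * std::log(q + (n - x) * phi)
                - (n - 1.0) * std::log(1.0 + n * phi);
    return std::exp(lg);
}

// Inversion sampling: accumulate the pmf until it exceeds a uniform draw.
template <class Pmf>
int sample_inv(std::mt19937& rng, Pmf pmf, int max_x) {
    std::uniform_real_distribution<double> unif(0.0, 1.0);
    double u = unif(rng), cum = 0.0;
    for (int x = 0; x < max_x; ++x) {
        cum += pmf(x);
        if (u <= cum) return x;
    }
    return max_x;  // truncation guard (the GP support is unbounded)
}

// Simulate x_0, ..., x_n from model (1): X_t = S_t(X_{t-1}) + eps_t, with
// S_t(x) ~ QB(p, theta/lambda, x) and eps_t ~ GP(q*lambda, theta).
std::vector<int> simulate_gpar1(int n, double p, double lambda, double theta,
                                unsigned seed) {
    std::mt19937 rng(seed);
    const double q = 1.0 - p, phi = theta / lambda;
    std::vector<int> x(n + 1);
    // start from the stationary marginal GP(lambda, theta)
    x[0] = sample_inv(rng, [&](int k) { return gp_pmf(k, lambda, theta); }, 10000);
    for (int t = 1; t <= n; ++t) {
        int s = sample_inv(rng,
                           [&](int k) { return qb_pmf(k, p, phi, x[t - 1]); },
                           x[t - 1]);
        int e = sample_inv(rng,
                           [&](int k) { return gp_pmf(k, q * lambda, theta); },
                           10000);
        x[t] = s + e;
    }
    return x;
}
```

Starting x_0 from the GP(λ, θ) marginal makes the whole simulated series stationary; a complete driver using this function appears in the bias-study sketch of Section 4.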
The autocorrelation function (acf) of the GPAR(1) process is ρ(k) = p^k, k = 0, 1, 2, .... The acf of this process is the same as that of an AR(1) process with autoregressive parameter p, except that it is always non-negative, since 0 < p < 1. The partial autocorrelation function (pacf) of the GPAR(1) process is zero after lag 1. The sample acf and pacf will be useful to identify the GPAR(1) model from an observed time series.
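For identification, the sample acf can be computed with the usual estimator (sums normalized by the sample size, a common convention that we assume here rather than the paper's stated one):

```cpp
#include <vector>

// Sample autocorrelation r(k); for a GPAR(1) series it should decay
// approximately as p^k and stay non-negative.
double sample_acf(const std::vector<int>& x, int k) {
    const int n = (int)x.size();
    double xbar = 0.0;
    for (int v : x) xbar += v;
    xbar /= n;
    double num = 0.0, den = 0.0;
    for (int t = 0; t + k < n; ++t) num += (x[t] - xbar) * (x[t + k] - xbar);
    for (int t = 0; t < n; ++t) den += (x[t] - xbar) * (x[t] - xbar);
    return num / den;
}
```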

Estimation of the Parameters
Estimating the parameters of a GPAR(1) process presents some challenges, since the conditional distribution of X_t, given X_{t−1}, is a convolution of a quasi-binomial and a generalized Poisson distribution and must be evaluated numerically. In this section, we review three estimation methods for the parameter vector Θ = (p, λ, θ) of the GPAR(1) process: the method of moments, quasi-likelihood and conditional maximum likelihood. These methods have been proposed in the literature; see, for example, Al-Nachawati et al. [3] or Brännäs [4]. However, less emphasis has been placed on their asymptotic properties, such as efficiency. In Section 4, we study the bias of these estimators, and in Section 5 their efficiency.

Method of Moments or Yule-Walker
The first autocovariance of the GPAR(1) process is γ(1) = p Var[X_t], so that ρ(1) = p. By taking the expected value of both sides of the equation given in (1) and using stationarity, we find

E[X_t] = λ/(1 − θ). (2)

We also know that

Var[X_t] = λ/(1 − θ)^3, (3)

ρ(1) = p. (4)

From the observations x_1, x_2, ..., x_n, we compute the sample mean x̄, the sample variance s^2 and the first sample autocorrelation r(1). Solving the system of Equations (2), (3), (4) with E[X_t], Var[X_t] and ρ(1) replaced by their sample values, we obtain the moments estimators

p̃ = r(1), λ̃ = x̄ √(x̄/s^2), θ̃ = 1 − √(x̄/s^2).

We have corrected here misprints in the formulas for the moment estimators of the parameters λ and θ given by Al-Nachawati et al. [3].
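These estimators are straightforward to compute; the following C++ sketch transcribes our reconstruction of the corrected formulas above (the names are ours):

```cpp
#include <cmath>
#include <vector>

// Moments estimators:
//   p~ = r(1),  lambda~ = xbar * sqrt(xbar / s^2),  theta~ = 1 - sqrt(xbar / s^2).
// Assumes the series is overdispersed (s^2 >= xbar), so that theta~ >= 0.
void moment_estimates(const std::vector<int>& x,
                      double& p, double& lambda, double& theta) {
    const int n = (int)x.size();
    double xbar = 0.0;
    for (int v : x) xbar += v;
    xbar /= n;
    double s2 = 0.0, g1 = 0.0;
    for (int t = 0; t < n; ++t) s2 += (x[t] - xbar) * (x[t] - xbar);
    s2 /= n;                                   // factor 1/n, as in Section 5
    for (int t = 1; t < n; ++t) g1 += (x[t] - xbar) * (x[t - 1] - xbar);
    g1 /= n;
    const double root = std::sqrt(xbar / s2);  // estimates 1 - theta
    p = g1 / s2;                               // first sample autocorrelation r(1)
    lambda = xbar * root;
    theta = 1.0 - root;
}
```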

Quasi-Likelihood Method
This method, proposed initially by Whittle [9], replaces the true likelihood by the one which assumes that the observations come from a normal distribution with the same conditional mean and variance. Al-Nachawati et al. [3] obtained the quasi-likelihood estimators of (p, λ, θ) in this way. We have used the expression in Shenton [10] for the formula of the variance of a quasi-binomial distribution, which is a bit different from the one given in Al-Nachawati et al. [3]. Since the GPAR(1) process is restricted to non-negative integers, and its conditional distribution is therefore not symmetrical, one might suspect that these estimators are less efficient than the maximum likelihood estimators, which is indeed the case (see Section 5 for numerical results).
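As an illustration, here is a sketch of the resulting Gaussian quasi-loglikelihood, reusing qb_pmf from the simulation sketch of Section 2. We do not transcribe Shenton's closed-form variance; instead, the QB variance is obtained by direct summation over its finite support, which gives the same value up to rounding:

```cpp
#include <cmath>
#include <vector>

double qb_pmf(int x, double p, double phi, int n);  // from the Section 2 sketch

// Variance of QB(p, phi, n) by direct summation over the support 0..n;
// a numerical stand-in for the closed-form expression of Shenton [10].
double qb_var(double p, double phi, int n) {
    double m1 = 0.0, m2 = 0.0;
    for (int k = 0; k <= n; ++k) {
        double f = qb_pmf(k, p, phi, n);
        m1 += k * f;
        m2 += (double)k * k * f;
    }
    return m2 - m1 * m1;
}

// Gaussian quasi-loglikelihood: normal density with conditional mean
// p*x + q*lambda/(1-theta) and conditional variance
// Var[S_t(x)] + q*lambda/(1-theta)^3.
double quasi_loglik(const std::vector<int>& x,
                    double p, double lambda, double theta) {
    const double q = 1.0 - p, phi = theta / lambda;
    const double eps_mean = q * lambda / (1.0 - theta);
    const double eps_var = q * lambda / std::pow(1.0 - theta, 3);
    double l = 0.0;
    for (std::size_t t = 1; t < x.size(); ++t) {
        double m = p * x[t - 1] + eps_mean;
        double v = qb_var(p, phi, x[t - 1]) + eps_var;
        l -= 0.5 * (std::log(v) + (x[t] - m) * (x[t] - m) / v);
    }
    return l;  // maximize over (p, lambda, theta)
}
```

Maximizing quasi_loglik with any derivative-free optimizer gives the quasi-likelihood estimates.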

Conditional Maximum Likelihood Method
To obtain the conditional maximum likelihood estimators (MLE's), we need the conditional distribution of X_t given X_{t−1} = x, which is the convolution of a QB(p, θ/λ, x) distribution and a GP(qλ, θ) distribution. Given the observations x_0, x_1, ..., x_n, we have to maximize the function L(p, λ, θ) = ∏_{t=1}^{n} P(X_t = x_t | X_{t−1} = x_{t−1}). We will work with the loglikelihood function l(p, λ, θ), equal to ln L(p, λ, θ), which has to be maximized numerically to obtain the MLE Θ̂.
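The conditional pmf and the loglikelihood can be evaluated as follows, again reusing the pmf helpers of the Section 2 sketch; the convolution reduces to a finite sum because S_t(x) takes values in 0, ..., x:

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

double gp_pmf(int x, double lambda, double theta);  // from the Section 2 sketch
double qb_pmf(int x, double p, double phi, int n);  // from the Section 2 sketch

// P(X_t = y | X_{t-1} = x): convolution of QB(p, theta/lambda, x) and
// GP(q*lambda, theta).
double cond_pmf(int y, int x, double p, double lambda, double theta) {
    const double phi = theta / lambda, q = 1.0 - p;
    double s = 0.0;
    for (int k = 0; k <= std::min(x, y); ++k)
        s += qb_pmf(k, p, phi, x) * gp_pmf(y - k, q * lambda, theta);
    return s;
}

// Conditional loglikelihood l(p, lambda, theta) given x_0, x_1, ..., x_n.
double cond_loglik(const std::vector<int>& x,
                   double p, double lambda, double theta) {
    double l = 0.0;
    for (std::size_t t = 1; t < x.size(); ++t)
        l += std::log(cond_pmf(x[t], x[t - 1], p, lambda, theta));
    return l;
}
```

This is the function that the downhill simplex method of Section 4 maximizes, starting from the moments estimates.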
Under the usual regularity conditions, √n (Θ̂ − Θ₀) →_L N(0, I(Θ₀)^{−1}), where →_L denotes convergence in law, 0 is the vector of zeros of dimension 3, and I(Θ₀) is Fisher's expected information matrix, of dimension 3 × 3.
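In practice, I(Θ₀) can be approximated by the observed information, minus the Hessian of the conditional loglikelihood at Θ̂. A sketch using central finite differences follows (the step size h is an arbitrary choice of ours); inverting the resulting 3 × 3 matrix estimates the variance-covariance matrix of the MLE's.

```cpp
#include <array>
#include <vector>

double cond_loglik(const std::vector<int>& x,
                   double p, double lambda, double theta);  // earlier sketch

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<Vec3, 3>;

// Observed information: minus the Hessian of the conditional loglikelihood,
// with second derivatives approximated by central finite differences.
Mat3 observed_info(const std::vector<int>& x, Vec3 th, double h = 1e-4) {
    auto l = [&](const Vec3& t) { return cond_loglik(x, t[0], t[1], t[2]); };
    Mat3 info{};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j) {
            Vec3 pp = th, pm = th, mp = th, mm = th;
            pp[i] += h; pp[j] += h;
            pm[i] += h; pm[j] -= h;
            mp[i] -= h; mp[j] += h;
            mm[i] -= h; mm[j] -= h;
            info[i][j] = -(l(pp) - l(pm) - l(mp) + l(mm)) / (4.0 * h * h);
        }
    return info;
}
```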

Bias of Estimators
With simulations, we study the bias of the moments estimators and the MLE's. Setting the values of the three parameters to those in Table 1, two series of 50 and of 200 observations were generated from model (1) in C++. This experiment was repeated 200 times.
For each series, the moments estimators were calculated, as well as their average and bias. The conditional MLE's were calculated using the iterative Downhill Simplex method (see Press, Teukolsky, Vetterling and Flannery [13]), which does not require the calculation of the derivatives of the function to be maximized. As initial values, we used the moments estimators. The results of the simulations appear in Figures 1-3.
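The simulation loop itself is short. The sketch below reproduces the design of this section (200 replications) for the moments estimators, with one illustrative parameter combination, since Table 1 is not reproduced here; it assumes simulate_gpar1() and moment_estimates() from the earlier sketches.

```cpp
#include <cstdio>
#include <vector>

// From the earlier sketches:
std::vector<int> simulate_gpar1(int n, double p, double lambda, double theta,
                                unsigned seed);
void moment_estimates(const std::vector<int>& x,
                      double& p, double& lambda, double& theta);

int main() {
    const int reps = 200, n = 200;                    // 200 replications, as in Section 4
    const double p = 0.5, lambda = 5.0, theta = 0.2;  // illustrative values only
    double bp = 0.0, bl = 0.0, bt = 0.0;
    for (int r = 0; r < reps; ++r) {
        std::vector<int> x = simulate_gpar1(n, p, lambda, theta, 1000u + r);
        double ph, lh, th;
        moment_estimates(x, ph, lh, th);
        bp += ph - p;
        bl += lh - lambda;
        bt += th - theta;
    }
    std::printf("bias: p %.4f  lambda %.4f  theta %.4f\n",
                bp / reps, bl / reps, bt / reps);
    return 0;
}
```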
From Figures 1-3, we see that the bias of the MLE's is smaller than that of the moments estimators, and that it decreases when the length of the series increases. Figure 1 shows that the bias of p̂ is much smaller than that of p̃, except when n = 200 and θ = 0.2, where both are almost equal to 0. The bias of the two estimators is negative. In Figure 2, we see that the bias of λ̂ and λ̃ is close to 0 when λ = 1; as λ increases, λ̂ and λ̃ become more biased. In all cases, the bias of the estimator of λ is positive. The bias of the estimator of θ behaves like that of p (Figure 3); for the two estimation methods, it is similar for λ = 5 or 10. Since the moments estimators and the conditional MLE's are almost unbiased for large n, we study their asymptotic efficiency in the next section.

Asymptotic Efficiency of Estimators
We will first discuss the techniques by which we can obtain the asymptotic variance-covariance matrices of the estimators under the three estimation methods. To study efficiencies, we calculate, in Subsection 5.4, the ratios of the variances of the estimators and the ratio of the determinants of their variance-covariance matrices, using observations simulated from a GPAR(1) process for various values of the parameters. The results are summarized in Table 2 and Table 3 of this section.

Method of Moments
By using an asymptotically equivalent factor of 1/n instead of 1/(n − 1) in Equation (3), the moments estimators Θ̃ = (p̃, λ̃, θ̃) are given as the solution of a system of estimating equations

g_n(Θ) = (1/n) Σ_{t=1}^{n} h(X_t, X_{t−1}; Θ) = 0,

whose expected values E[h(X_t, X_{t−1}; Θ₀)] are asymptotically equal to 0. Using a Taylor series expansion around Θ₀, the true parameter value, we obtain

0 = g_n(Θ̃) = g_n(Θ₀) + A_n (Θ̃ − Θ₀) + ε_n,

where n ε_n →_p 0, with →_p denoting convergence in probability, since Θ̃ is a solution of g_n(Θ) = 0. Using Slutsky's theorem, we find that

√n (Θ̃ − Θ₀) →_L N(0, A^{−1} B (A^{−1})ᵀ),

where, with probability 1, A = lim_{n→∞} A_n and B is the asymptotic variance-covariance matrix of √n g_n(Θ₀). Matrix A evaluated at Θ₀ can be estimated by Â, the matrix of derivatives of g_n evaluated at Θ̃. Let us consider the first element of this matrix: it involves expectations given by infinite series. In practice, we truncate these expressions, since their terms become negligible. Using the law of large numbers, we can estimate this last term by its sample analogue.
The other elements of the matrix can be estimated in the same way.
Quasi-Likelihood Method

From Hamilton [12], using quasi-likelihood theory, we conclude that √n (Θ̂_QL − Θ₀) →_L N(0, D^{−1} S D^{−1}), where D = −lim_{n→∞} (1/n) Σ_t E[∂²l_t(Θ₀)/∂Θ ∂Θᵀ] and S = lim_{n→∞} (1/n) Σ_t E[(∂l_t(Θ₀)/∂Θ)(∂l_t(Θ₀)/∂Θ)ᵀ], both evaluated at Θ₀, the true parameter, with l_t denoting the quasi-loglikelihood contribution of observation t. We can obtain estimates D̂ and Ŝ, where D̂ is the sample version of D and Ŝ is the finite version of S, both evaluated at Θ̂; the elements of D̂ are evaluated numerically.

With the estimated parameters and their estimated variances, we can test the hypothesis H₀: θ = 0 (no overdispersion, the PAR(1) model) against H₁: θ > 0. We reject H₀ at level α if the standardized statistic Z exceeds the 1 − α quantile of its asymptotic distribution; it is expected that the more efficient the estimator is, the more powerful the test will be. For the real time series analyzed in the last section, this leads us to reject H₀ and to conclude that the GPAR(1) model is more appropriate: there is overdispersion in the observations.


Figure 4. Acf and pacf.

Table 1. Values of parameters.

Table 2. Efficiency of moments estimators.

Table 4. MLE's of the parameters.