Estimating Equations for Estimation of McDonald Generalized Beta-Binomial Parameters

Considerable recent attention has been given to modeling overdispersed binomial data arising in toxicology, biology, clinical medicine, epidemiology and similar fields using binomial mixture distributions such as the Beta-Binomial (BB) distribution and the Kumaraswamy-Binomial (KB) distribution. A new three-parameter binomial mixture distribution, the McDonald Generalized Beta-Binomial (McGBB) distribution, has been developed. Studies have shown that it fits overdispersed binomial data better than the KB and BB distributions, both on real-life data sets and in an extended simulation study. Since our interest is in the parameters of the McGBB distribution, the dispersion parameter is treated as a nuisance parameter in the analysis of proportions. In this paper, we consider estimation of the parameters of the McGBB model using quasi-likelihood (QL) and quadratic estimating equations (QEEs) with dispersion. By varying the coefficients of the QEEs we obtain four sets of estimating equations, which in turn yield four sets of estimates. We compare the small-sample relative efficiencies of the estimates based on the QEEs and on quasi-likelihood with those of the maximum likelihood estimates. The comparison is performed using real-life data sets arising from alcohol consumption practices and simulated data. These comparisons show that estimates based on optimal QEEs and QL are highly efficient and are the best among all estimates investigated.


Introduction
Estimating functions have for some time been a key concept and subject of inquiry in research, and they are regarded as the most general method of estimation. The basis of this method is a set of simultaneous equations involving both the data and the unknown model parameters. To obtain an estimator, the estimating function is equated to zero and the resulting equation is solved with respect to the parameter. Estimating equations are less computationally intensive than maximum likelihood estimation. Moreover, maximum likelihood estimators (MLEs) rest on the assumption that the distribution is known, whereas an estimating equation is free of such an assumption.
The usual procedure is to take a parametric model, such as the McDonald Generalized Beta-Binomial model, which allows for over- as well as under-dispersion, and obtain maximum likelihood estimates of the parameters. The McDonald Generalized Beta-Binomial (McGBB) distribution is a three-parameter distribution which is superior to KB in handling overdispersed binomial data. This procedure may produce inefficient or biased estimates when the parametric model does not fit the data well. Alternatively, more robust estimates can be considered, such as moment estimates, quasi-likelihood estimates (Breslow, 1990 [1]; Moore and Tsiatis, 1991 [2]), extended quasi-likelihood estimates (Nelder and Pregibon, 1987 [3]), Gaussian likelihood estimates (Whittle, 1961 [4]; Crowder, 1985 [5]), estimates based on the pseudo-likelihood estimating equations of Davidian and Carroll (1987) [6] and estimates based on the quadratic estimating functions of Crowder (1987) [7] and Godambe and Thompson (1989) [8].
In this paper we estimate the parameters of the McDonald Generalized Beta-Binomial distribution by the quadratic estimating equations (QEEs) of Crowder (1987) [7] and Godambe and Thompson (1989) [8] and compare the small-sample efficiency and bias properties of these estimates with those of the maximum likelihood estimates. By varying the coefficients of the QEEs we obtain four sets of estimating equations. We compare the small-sample efficiency of the five sets of estimates obtained by the QEEs and the quasi-likelihood with the maximum likelihood estimates. We compare estimated relative efficiencies of the estimates for two sets of real-life data arising from alcohol consumption practices and for a simulation study. Estimation of the parameters by the six methods is discussed in Section 3. In Section 4 we compare small-sample relative efficiencies. This study shows that if interest is in point estimation of the parameters, then GL is the method of choice, followed by QL.

McDonald Generalized Beta Distribution of the First Kind
Let p be a random variable following McDonald's generalized beta distribution of the first kind (McDonald, 1984 [9]; McDonald and Xu, 1995 [10]) with three parameters α, β and γ. The probability density function of p is then given by
$$f(p) = \frac{\gamma}{B(\alpha, \beta)}\, p^{\alpha\gamma - 1} \left(1 - p^{\gamma}\right)^{\beta - 1}, \qquad 0 < p < 1,\quad \alpha, \beta, \gamma > 0,$$
where $B(\cdot, \cdot)$ is the beta function. The $r$th moment of this distribution is given by
$$E(p^{r}) = \frac{B(\alpha + \beta,\, r/\gamma)}{B(\alpha,\, r/\gamma)}.$$
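As a numerical illustration, the sketch below assumes the three-parameter density $f(p) = \gamma\, p^{\alpha\gamma-1}(1-p^{\gamma})^{\beta-1}/B(\alpha,\beta)$ on (0, 1), the form commonly used for McDonald's generalized beta distribution of the first kind, and compares the closed-form $r$th moment with a midpoint-rule integral. The function names and the parameter values are illustrative assumptions, not part of the original study.

```python
import math

def log_beta(a, b):
    """log B(a, b) computed through log-gamma functions."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def gb1_pdf(p, alpha, beta, gamma):
    """Assumed generalized beta (first kind) density on (0, 1)."""
    if not 0.0 < p < 1.0:
        return 0.0
    log_f = (math.log(gamma) - log_beta(alpha, beta)
             + (alpha * gamma - 1.0) * math.log(p)
             + (beta - 1.0) * math.log1p(-p ** gamma))
    return math.exp(log_f)

def gb1_moment(r, alpha, beta, gamma):
    """Closed form E[p^r] = B(alpha + beta, r/gamma) / B(alpha, r/gamma)."""
    return math.exp(log_beta(alpha + beta, r / gamma) - log_beta(alpha, r / gamma))

def midpoint_integral(f, n=20_000):
    """Midpoint-rule approximation of the integral of f over (0, 1)."""
    h = 1.0 / n
    return h * sum(f((k + 0.5) * h) for k in range(n))

# Illustrative parameter values (hypothetical, chosen so the density is smooth).
alpha, beta, gamma = 2.0, 3.0, 1.5
total_mass = midpoint_integral(lambda p: gb1_pdf(p, alpha, beta, gamma))
numeric_mean = midpoint_integral(lambda p: p * gb1_pdf(p, alpha, beta, gamma))
print(total_mass, numeric_mean, gb1_moment(1, alpha, beta, gamma))
```

With γ = 1 the moment formula reduces to the ordinary beta mean α/(α + β), which gives a quick sanity check on the parameterization.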

McDonald Generalized Beta Binomial Distribution
where $y = 0, 1, \ldots, n$ and Θ is the parameter space of the mixing distribution.
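The probability mass function that results from mixing a binomial over the distribution of p can be sketched numerically. The block below assumes the mixing density $f(p) = \gamma\, p^{\alpha\gamma-1}(1-p^{\gamma})^{\beta-1}/B(\alpha,\beta)$ and expands $(1-p)^{n-y}$ binomially, so that every term is a moment $E(p^{y+j})$. The names `mcgbb_pmf` and `beta_binomial_pmf` and the parameter values are illustrative choices, not the authors' implementation.

```python
import math

def log_beta(a, b):
    """log B(a, b) computed through log-gamma functions."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def mcgbb_pmf(y, n, alpha, beta, gamma):
    """P(Y = y) for Y | p ~ Bin(n, p) with p following the assumed GB1 density."""
    total = 0.0
    for j in range(n - y + 1):
        # E[p^(y+j)] = B(alpha + (y + j)/gamma, beta) / B(alpha, beta)
        moment = math.exp(log_beta(alpha + (y + j) / gamma, beta)
                          - log_beta(alpha, beta))
        total += (-1) ** j * math.comb(n - y, j) * moment
    return math.comb(n, y) * total

def beta_binomial_pmf(y, n, a, b):
    """Ordinary beta-binomial pmf, the gamma = 1 special case."""
    return math.comb(n, y) * math.exp(log_beta(y + a, n - y + b) - log_beta(a, b))

n = 7  # one week of observation, as in the alcohol consumption example
probs = [mcgbb_pmf(y, n, 0.7, 0.5, 2.0) for y in range(n + 1)]  # hypothetical parameters
print(sum(probs))
```

The alternating sum is adequate for small n such as n = 7; for large n it can lose precision through cancellation, and a numerically integrated form would be a safer choice.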

Maximum Likelihood Method
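The three unknown parameters of the McGBB distribution are estimated using the maximum likelihood technique. Let $y_1, \ldots, y_N$ be independent observations from the McGBB distribution with parameter vector $\Theta = (\alpha, \beta, \gamma)$; then the log-likelihood function for Θ can be defined as
$$\ell(\Theta) = \sum_{i=1}^{N} \log P(Y = y_i \mid \Theta),$$
where $P(Y = y \mid \Theta)$ is the McGBB probability mass function.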
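A minimal sketch of the likelihood step follows, assuming the alternating-sum form of the McGBB probability mass function and grouped data given as counts for y = 0, ..., n. The frequency table is made up for illustration (it is not Table 1), and the coarse grid search merely stands in for a proper numerical optimizer.

```python
import math

def log_beta(a, b):
    """log B(a, b) computed through log-gamma functions."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def mcgbb_pmf(y, n, alpha, beta, gamma):
    """Assumed McGBB pmf: binomial mixed over a GB1 density for p."""
    total = 0.0
    for j in range(n - y + 1):
        moment = math.exp(log_beta(alpha + (y + j) / gamma, beta)
                          - log_beta(alpha, beta))
        total += (-1) ** j * math.comb(n - y, j) * moment
    return math.comb(n, y) * total

def log_likelihood(params, freq, n):
    """Grouped-data log-likelihood: sum over y of freq[y] * log P(Y = y)."""
    alpha, beta, gamma = params
    if min(alpha, beta, gamma) <= 0.0:
        return -math.inf
    ll = 0.0
    for y, count in enumerate(freq):
        # Guard against round-off pushing a tiny probability to zero or below.
        prob = max(mcgbb_pmf(y, n, alpha, beta, gamma), 1e-300)
        ll += count * math.log(prob)
    return ll

# Hypothetical counts for y = 0..7 (made-up numbers, NOT the data of Table 1).
freq = [12, 9, 7, 6, 6, 7, 9, 14]
n = 7

# Coarse grid search over 0.25, 0.50, ..., 3.00 for each parameter.
grid = [0.25 * k for k in range(1, 13)]
candidates = [(a, b, g) for a in grid for b in grid for g in grid]
best = max(candidates, key=lambda t: log_likelihood(t, freq, n))
print(best, log_likelihood(best, freq, n))
```

In practice the grid maximizer would only serve as a starting value for a gradient-based or simplex optimizer.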

Quasi-Likelihood
The quasi-likelihood (Wedderburn, 1974 [11]) is based on knowledge of the form of the first two moments of the random variable $z_i$, $i = 1, \ldots, N$, that is, its mean $\mu_i$ and its variance $V(\mu_i)$. The quasi-likelihood with the above mean and variance is given by
$$Q(\mu_i; z_i) = \int_{z_i}^{\mu_i} \frac{z_i - t}{V(t)}\, dt.$$
By virtue of independence between samples, the quasi-likelihood for the full sample with the above means and variances is the sum of these contributions over $i = 1, \ldots, N$. We denote Equation (5) by $QL_\lambda$. In this case the derivative term $d_{ij}$ for β, and likewise the partial derivatives for the three parameters α, β and γ given ρ, are obtained by differentiation.
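The first two moments on which the quasi-likelihood relies can be derived from the stochastic representation $Y \mid p \sim \mathrm{Bin}(n, p)$. The derivation below is a sketch under the assumption that the mixing distribution of $p$ has moments $E(p^r) = B(\alpha+\beta, r/\gamma)/B(\alpha, r/\gamma)$; the symbols $\pi$, $\mu_i$ and $V_i$ are introduced here for illustration, with $\rho$ playing the role of the dispersion parameter.

```latex
\begin{align*}
\pi &= E(p) = \frac{B(\alpha+\beta,\, 1/\gamma)}{B(\alpha,\, 1/\gamma)},
\qquad
\rho = \frac{E(p^{2}) - \pi^{2}}{\pi(1-\pi)}, \\
E(Y) &= E\{E(Y \mid p)\} = n\pi, \\
\operatorname{Var}(Y) &= E\{\operatorname{Var}(Y \mid p)\} + \operatorname{Var}\{E(Y \mid p)\}
  = E\{np(1-p)\} + n^{2}\operatorname{Var}(p) \\
 &= n\pi(1-\pi) + n(n-1)\operatorname{Var}(p)
  = n\pi(1-\pi)\{1 + (n-1)\rho\}.
\end{align*}
```

With $\mu_i = E(Y_i)$ and $V_i = \operatorname{Var}(Y_i)$, the quasi-likelihood estimating equation for a parameter $\theta \in \{\alpha, \beta, \gamma\}$ then takes the standard form $\sum_i (\partial \mu_i / \partial \theta)(y_i - \mu_i)/V_i = 0$.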

Quadratic Estimating Equations
By considering estimating functions quadratic in $z_i$, the QEEs have the general form
$$g(\lambda) = \sum_{i} \left[ a_{i\lambda}(z_i - \mu_i) + b_{i\lambda}\left\{(z_i - \mu_i)^2 - \sigma_i^2\right\} \right]$$
(Crowder, 1987 [7]), where $\mu_i$ and $\sigma_i^2$ are the mean and variance of $z_i$, and $a_{i\lambda}$ and $b_{i\lambda}$ are specified nonstochastic functions of λ. Through derivation, the unbiased quadratic estimating equations for the parameters α, β and γ of the McGBB distribution are found as follows.
The unbiased quadratic estimating equations for α, β, γ and ρ are obtained by setting this function equal to zero for each parameter.
For one choice of $a_{i\lambda}$ and $b_{i\lambda}$ we obtain the Gaussian estimating equations, Equation (10), which we denote by $GL_\lambda$. For a second choice we obtain the unbiased estimating equations (QEEs) for the McDonald Generalized Beta-Binomial distribution, Equation (11). These equations were obtained by combining the quasi-likelihood estimating equations for the regression parameters and the optimal quadratic estimating equations of Crowder (1987) [7] for the dispersion parameter after setting $\Upsilon_1$ and $\Upsilon_2$ to zero. We denote the estimates so obtained from Equation (11) by $M1_\lambda$.
For a third choice we obtain the optimal quadratic estimating equations. The forms of the skewness $\Upsilon_{1\lambda}$ and the kurtosis $\Upsilon_{2\lambda}$ are not known, so we base them on the second, third and fourth moments of the McDonald Generalized Beta-Binomial distribution. We denote the estimates obtained by solving these optimal quadratic estimating equations by $M2_\lambda$. Further, we denote by $M3_\lambda$ the estimates obtained by solving the optimal quadratic estimating equations with a further choice of coefficients; these estimates are also obtained by using the pseudo-likelihood estimating equations of Davidian and Carroll (1987) [6].
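To illustrate why a quadratic estimating function of Crowder's type is unbiased, the sketch below evaluates $g = a(z-\mu) + b\{(z-\mu)^2 - \sigma^2\}$ for the proportion $z = y/n$ and takes its exact expectation under an assumed McGBB probability mass function. The pmf formula, the coefficient values `a` and `b`, the parameter values and all function names are assumptions made for illustration. The skewness and excess kurtosis are computed from the third and fourth central moments in the same pass.

```python
import math

def log_beta(a, b):
    """log B(a, b) computed through log-gamma functions."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def mcgbb_pmf(y, n, alpha, beta, gamma):
    """Assumed McGBB pmf: binomial mixed over a GB1 density for p."""
    total = 0.0
    for j in range(n - y + 1):
        moment = math.exp(log_beta(alpha + (y + j) / gamma, beta)
                          - log_beta(alpha, beta))
        total += (-1) ** j * math.comb(n - y, j) * moment
    return math.comb(n, y) * total

def central_summary(n, alpha, beta, gamma):
    """Return pmf, support, mean, variance, skewness, excess kurtosis of z = y/n."""
    probs = [mcgbb_pmf(y, n, alpha, beta, gamma) for y in range(n + 1)]
    zs = [y / n for y in range(n + 1)]
    mu = sum(p * z for p, z in zip(probs, zs))
    m2, m3, m4 = (sum(p * (z - mu) ** k for p, z in zip(probs, zs))
                  for k in (2, 3, 4))
    return probs, zs, mu, m2, m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0

def expected_quadratic_ef(a, b, n, alpha, beta, gamma):
    """Exact expectation of a*(z - mu) + b*((z - mu)^2 - sigma^2) under the model."""
    probs, zs, mu, var, _, _ = central_summary(n, alpha, beta, gamma)
    return sum(p * (a * (z - mu) + b * ((z - mu) ** 2 - var))
               for p, z in zip(probs, zs))

# Hypothetical coefficients and parameters.
bias = expected_quadratic_ef(1.3, -0.7, 7, 0.7, 0.5, 2.0)
_, _, mean_z, var_z, skew_z, kurt_z = central_summary(7, 0.7, 0.5, 2.0)
print(bias, mean_z, var_z, skew_z, kurt_z)
```

The expectation vanishes for any nonstochastic `a` and `b`, which is what leaves the coefficients free to be chosen for efficiency.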

Small-Sample Relative Efficiency
The asymptotic relative efficiency may not be very useful when comparing different estimators in small samples. We therefore conducted a simulation study using relatively small n alongside the real data. We compare the small-sample relative efficiency of the estimates obtained by the five estimation procedures QL, GL, M1, M2 and M3. When the relative efficiency is greater than one, the procedure whose efficiency appears in the denominator is preferred to the "gold standard" ML. The relative efficiency results for the McGBB parameters are summarized in Table 2 for the real data; those for the simulated data are summarized in Table 3 and Table 4 and plotted in Figure 1.
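One common way to estimate a small-sample relative efficiency from replicated estimates is the ratio of mean squared errors, with ML in the numerator so that values above one favour the competing procedure. The sketch below assumes that MSE-based definition, which the text does not spell out; the function names and the toy numbers are hypothetical.

```python
def mean_squared_error(estimates, true_value):
    """Average squared deviation of replicated estimates from the true value."""
    return sum((e - true_value) ** 2 for e in estimates) / len(estimates)

def relative_efficiency(ml_estimates, other_estimates, true_value):
    """MSE(ML) / MSE(other); a value above 1 means the other procedure beat ML."""
    return (mean_squared_error(ml_estimates, true_value)
            / mean_squared_error(other_estimates, true_value))

# Toy replicated estimates of a parameter whose true value is 2.0 (made-up numbers).
rel_eff = relative_efficiency([1.0, 3.0], [1.5, 2.5], true_value=2.0)
print(rel_eff)
```

If the estimators are nearly unbiased, the same ratio computed with sample variances in place of mean squared errors gives almost identical values.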

Estimation
Table 1 shows the data set used by Alanko and Lemmens (1996) [12], Rodríguez-Avi et al. (2007) [13], and Chandrabose et al. (2013) [14] in the study of handling overdispersion. It gives the number of days y on which an individual consumed alcohol out of n = 7 days, together with the frequency of each value of y among N = 399 individuals. We used the data in Table 1 to obtain the estimates of α, β and γ and their estimated relative efficiencies by the six different procedures, as given in Table 2.

Table 2.
The estimates of α, β and γ and their estimated relative efficiencies by the MLE, QL, M1, M2 and M3 methods for the real data (columns: parameter estimates; estimated relative efficiencies).

Simulation
We compare the relative efficiency of the estimates of α, β and γ obtained by the six estimation procedures ML, QL, GL, M1, M2 and M3. When the relative efficiency is greater than one, the procedure whose efficiency appears in the denominator is preferred to the "gold standard" ML. Using combinations of the α, β and γ parameters, we simulated 5000 samples from the McDonald Generalized Beta-Binomial distribution based on the weekly alcohol consumption data. During simulation, the parameters α, β and γ were estimated by all six procedures, including maximum likelihood, and their efficiencies and subsequently their relative efficiencies were computed.
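Samples from the McGBB distribution can be drawn through its stochastic representation. The sketch below assumes that if $U \sim \mathrm{Beta}(\alpha, \beta)$ then $p = U^{1/\gamma}$ follows the generalized beta distribution of the first kind, and then draws $Y \mid p \sim \mathrm{Bin}(n, p)$. The parameter values, the seed and the function names are illustrative and are not the settings used in the study.

```python
import math
import random

def log_beta(a, b):
    """log B(a, b) computed through log-gamma functions."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def sample_mcgbb(n, alpha, beta, gamma, rng):
    """Draw one Y: p = U**(1/gamma) with U ~ Beta(alpha, beta), then Y | p ~ Bin(n, p)."""
    u = rng.betavariate(alpha, beta)
    p = u ** (1.0 / gamma)
    # Binomial draw as a sum of n Bernoulli(p) trials (fine for small n).
    return sum(1 for _ in range(n) if rng.random() < p)

def mcgbb_mean(n, alpha, beta, gamma):
    """Theoretical mean n * E[p], with E[p] = B(alpha + beta, 1/gamma) / B(alpha, 1/gamma)."""
    return n * math.exp(log_beta(alpha + beta, 1.0 / gamma)
                        - log_beta(alpha, 1.0 / gamma))

rng = random.Random(2024)                  # fixed seed for reproducibility
n, alpha, beta, gamma = 7, 0.7, 0.5, 1.0   # hypothetical parameter combination
draws = [sample_mcgbb(n, alpha, beta, gamma, rng) for _ in range(5000)]
sample_mean = sum(draws) / len(draws)
print(sample_mean, mcgbb_mean(n, alpha, beta, gamma))
```

Each simulated sample would then be passed to the six estimation procedures, and the replicated estimates compared with the true parameter values to obtain the efficiencies.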

Discussion
From Table 2, Table 3 and Table 4 we see that the methods QL, GL and M2 all consistently provide high efficiency (never below 0.83). The efficiency of the parameter estimates obtained by the GL method is consistently the best. The good behaviour of the Gaussian likelihood estimator may be due to the fact that the Gaussian likelihood is a proper likelihood and that the distribution of the data is not assumed to follow a specific departure from the binomial distribution. Generally, the estimates of the parameters by all estimating-function methods have high efficiencies. In this paper we showed that the small-sample parameter estimates and efficiencies obtained during data analysis are consistently the best for GL, followed by QL and then the M2 method (estimates based on the optimal quadratic estimating equations with the third and fourth moments of the McGBB distribution). The next best, at the cost of some loss of efficiency, is M1, and M3 appears to be the least efficient method. Therefore, when data follow a McGBB distribution, these methods are expected to have high efficiency relative to MLEs.

Conclusion
The estimating functions are based on knowledge of the moments, and one advantage of this approach is that it is robust to model misspecification. The comparison results in this paper indicate that the estimating equations are superior to MLE. The small-sample relative efficiency results also show that estimates using the optimal quadratic estimating functions of Crowder (1987) are highly efficient and are the best among all estimates investigated, followed by quasi-likelihood. Thus, we propose quadratic estimating functions for point estimation of the parameters of any model, including the McDonald Generalized Beta-Binomial, instead of MLEs, since they are consistent and robust to variance misspecification.

Figure 1.
Plot of relative efficiencies for various estimators relative to that of the MLE under the McDonald Generalized Beta-Binomial model for simulated data: (a) α varied with β = 0.5 and γ = 1 fixed, for the GL and QL procedures; (b) α varied when β = 0.5, for all procedures; (c) β varied with α = 0.7 and γ = 1 fixed, for the GL and QL procedures; (d) β varied when β = 0.5, for all procedures.
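A random variable Y is said to have the McDonald Generalized Beta-Binomial (McGBB) distribution with parameters α, β and γ if and only if it satisfies the following stochastic representation: $Y \mid p \sim \mathrm{Bin}(n, p)$, where p follows McDonald's generalized beta distribution of the first kind with parameters α, β and γ.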

Table 1.
Number of alcohol consumption days and the frequency of consumption.

Table 3.
Relative efficiencies for various estimators for α varied when

Table 4.
Relative efficiencies for various estimators for β varied when