Estimating Equations for Estimation of McDonald Generalized Beta-Binomial Parameters
1. Introduction
Estimating functions have for some time been a key concept and subject of inquiry in statistical research, and they are regarded as among the most general methods of estimation. The basis of the method is a set of simultaneous equations involving both the data and the unknown model parameters. To obtain an estimator, the estimating function is equated to zero and the resulting equation is solved with respect to the parameter. Estimating equations are less computationally intensive than maximum likelihood estimation. Moreover, maximum likelihood estimators (MLEs) rest on the assumption that the distribution is known, whereas an estimating equation is free of such an assumption. The usual procedure is to take a parametric model, such as the McDonald Generalized Beta-Binomial (McGBB) model, which allows for over-dispersion as well as under-dispersion, and obtain maximum likelihood estimates of the parameters. The McGBB distribution is a three-parameter distribution that is superior to the Kumaraswamy binomial (KB) distribution in handling over-dispersed binomial data. This procedure may produce inefficient or biased estimates when the parametric model does not fit the data well. Alternatively, more robust estimates can be considered, such as moment estimates, quasi-likelihood estimates (Breslow, 1990 [1] ; Moore and Tsiatis, 1991 [2] ), extended quasi-likelihood estimates (Nelder and Pregibon, 1987 [3] ), Gaussian likelihood estimates (Whittle, 1961 [4] ; Crowder, 1985 [5] ), estimates based on the pseudo-likelihood estimating equations of Davidian and Carroll (1987) [6] and estimates based on the quadratic estimating functions of Crowder (1987) [7] and Godambe and Thompson (1989) [8] . In this paper we estimate the parameters of the McGBB distribution by the quadratic estimating equations (QEEs) of Crowder (1987) [7] and Godambe and Thompson (1989) [8] and compare the small-sample efficiency and bias properties of these estimates with those of the maximum likelihood estimates.
By varying the coefficients of the QEEs we obtain four sets of estimating equations. We compare the small-sample efficiency of the five sets of estimates obtained by the QEEs and the quasi-likelihood with that of the maximum likelihood estimates. We compare estimated relative efficiencies of the estimates for two sets of data: real-life data arising from alcohol consumption practices and data from a simulation study. Estimation of the parameters by the six methods is discussed in Section 3. In Section 4 we compare small-sample relative efficiencies. The study shows that if interest is in point estimation of the parameters, then the Gaussian likelihood (GL) method is the method of choice, followed by quasi-likelihood (QL).
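2. McDonald Generalized Beta Distribution of the First Kind
Let $P$ be a random variable following McDonald's generalized beta distribution of the first kind (GB1) (McDonald, 1984 [9] ; McDonald and Xu, 1995 [10] ) with three parameters $\alpha$, $\beta$ and $\gamma$. The probability density function of $P$ is then given by
$$f(p) = \frac{\gamma}{B(\alpha, \beta)}\, p^{\alpha\gamma - 1} \left(1 - p^{\gamma}\right)^{\beta - 1}, \quad 0 < p < 1,\ \alpha, \beta, \gamma > 0. \qquad (1)$$
The $h$-th moment of the GB1 distribution is given by
$$E(P^{h}) = \frac{B(\alpha + h/\gamma,\ \beta)}{B(\alpha, \beta)}. \qquad (2)$$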
McDonald Generalized Beta Binomial Distribution
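A random variable $Y$ is said to have a McDonald Generalized Beta-Binomial (McGBB) distribution with parameters $n$, $\alpha$, $\beta$ and $\gamma$ if and only if it satisfies the following stochastic representation: $Y \mid P \sim \mathrm{Bin}(n, P)$ and $P \sim \mathrm{GB1}(\alpha, \beta, \gamma)$, where $\alpha$, $\beta$ and $\gamma$ are positive real numbers. This distribution is denoted $Y \sim \mathrm{McGBB}(n, \alpha, \beta, \gamma)$.
In general, a binomial mixture is obtained through an integration approach. Suppose $Y \mid P = p$ follows a binomial distribution $\mathrm{Bin}(n, p)$ and $P$ has mixing density $g(p; \theta)$ on $(0, 1)$. The unconditional PMF of $Y$ can be obtained by evaluating the integral
$$P(Y = y) = \int_{0}^{1} \binom{n}{y} p^{y} (1 - p)^{n - y}\, g(p; \theta)\, dp, \qquad (3)$$
where $\theta \in \Theta$ and $\Theta$ is the parameter space of the mixing distribution.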
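Taking the mixing density to be GB1 with parameters $\alpha$, $\beta$, $\gamma$ (symbol names assumed here) and expanding $(1-p)^{n-y}$ binomially inside the mixture integral gives the alternating-sum form $P(Y=y) = \binom{n}{y} \frac{1}{B(\alpha,\beta)} \sum_{j=0}^{n-y} (-1)^{j} \binom{n-y}{j} B\left(\frac{y+j}{\gamma}+\alpha,\ \beta\right)$. A minimal Python sketch of that sum follows; the function name `mcgbb_pmf` and the parameter values are illustrative assumptions, not values from the paper.

```python
import numpy as np
from scipy.special import betaln, comb


def mcgbb_pmf(y, n, alpha, beta, gamma):
    """McGBB(n, alpha, beta, gamma) probability at integer y (alternating-sum form)."""
    j = np.arange(n - y + 1)
    # B((y + j)/gamma + alpha, beta) / B(alpha, beta), evaluated on the log scale
    log_ratio = betaln((y + j) / gamma + alpha, beta) - betaln(alpha, beta)
    terms = (-1.0) ** j * comb(n - y, j) * np.exp(log_ratio)
    return comb(n, y) * terms.sum()


n = 7  # number of trials, as in the weekly data
# Illustrative (hypothetical) parameter values
pmf = np.array([mcgbb_pmf(y, n, 0.5, 1.2, 2.0) for y in range(n + 1)])

# With gamma = 1 the GB1 density is the ordinary beta density,
# so the McGBB should reduce to the beta-binomial.
special_case = np.array([mcgbb_pmf(y, n, 0.5, 1.2, 1.0) for y in range(n + 1)])
ys = np.arange(n + 1)
beta_binomial = comb(n, ys) * np.exp(betaln(ys + 0.5, n - ys + 1.2) - betaln(0.5, 1.2))
```

The alternating sum is adequate for small $n$ such as $n = 7$; for large $n$ the cancellation between terms would call for a more careful numerical scheme.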
3. Estimation of Parameters of McDonald Generalized Beta-Binomial Distribution
3.1. Maximum Likelihood Method
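The three unknown parameters of the McGBB distribution are estimated by the maximum likelihood technique. Let $y_1, \dots, y_N$ be a random sample of size $N$ from a McGBB distribution with unknown parameter vector $\theta = (\alpha, \beta, \gamma)$; then the log-likelihood function for $\theta$ can be defined as
$$\ell(\theta) = \sum_{i=1}^{N} \log P(Y = y_i; \theta), \qquad (4)$$
where $P(Y = y; \theta)$ is the McGBB probability mass function.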
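The log-likelihood has no closed-form maximizer, so in practice it is maximized numerically. The sketch below is one way to do this, under stated assumptions: the data are simulated rather than taken from Table 1, the parameters are optimized on the log scale to keep them positive, and Nelder-Mead is an arbitrary choice of optimizer. All function names and the generating values (0.7, 1.0, 1.5) are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import betaln, comb


def mcgbb_logpmf(y, n, alpha, beta, gamma):
    """Log of the McGBB probability at y, using the alternating-sum form."""
    j = np.arange(n - y + 1)
    log_ratio = betaln((y + j) / gamma + alpha, beta) - betaln(alpha, beta)
    total = np.sum((-1.0) ** j * comb(n - y, j) * np.exp(log_ratio))
    # Guard against round-off pushing a tiny probability to zero or below
    return np.log(comb(n, y)) + np.log(max(total, 1e-300))


def neg_loglik(log_theta, counts, n):
    """Negative log-likelihood for grouped data; counts[y] = frequency of value y."""
    alpha, beta, gamma = np.exp(log_theta)
    return -sum(c * mcgbb_logpmf(y, n, alpha, beta, gamma)
                for y, c in enumerate(counts))


# Simulated (hypothetical) data: P = U**(1/gamma) with U ~ Beta(alpha, beta)
rng = np.random.default_rng(0)
n, N = 7, 400
p = rng.beta(0.7, 1.0, size=N) ** (1 / 1.5)
y = rng.binomial(n, p)
counts = np.bincount(y, minlength=n + 1)

start = np.zeros(3)  # alpha = beta = gamma = 1 on the log scale
fit = minimize(neg_loglik, start, args=(counts, n), method="Nelder-Mead")
alpha_hat, beta_hat, gamma_hat = np.exp(fit.x)
```

Because the three parameters can be weakly identified from a single value of $n$, the fitted values may be sensitive to the starting point; trying several starts is a reasonable precaution.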
3.2. Quasi-Likelihood
The quasi-likelihood (Wedderburn, 1974 [11] ) is based on the knowledge of the form of first two moments of the random variable. Where, and. While with
then, and.
The quasi-likelihood with the above mean and variance is given by
where,
By virtue of independence between samples, the quasi-likelihood with the above means and variance is given by:
(5)
We denote Equation (5) by. In this case is given as. Given we have:
where, and.
The partial derivatives with respect to the three parameters are then obtained as follows:
(6)
(7)
(8)
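Quasi-likelihood needs only the first two moments. As a hedged illustration, the sketch below computes the McGBB mean and variance from the mixture representation, assuming the GB1 mixing density with parameters `alpha`, `beta`, `gamma` and writing them in the mean-dispersion form $n\pi$ and $n\pi(1-\pi)\{1+(n-1)\rho\}$. The symbols $\pi$ and $\rho$ and all function names are assumptions made for this example. The result is checked against moments computed directly from the probability mass function.

```python
import numpy as np
from scipy.special import betaln, comb


def gb1_moment(h, alpha, beta, gamma):
    """E(P^h) for P ~ GB1(alpha, beta, gamma) on (0, 1)."""
    return np.exp(betaln(alpha + h / gamma, beta) - betaln(alpha, beta))


def mcgbb_mean_var(n, alpha, beta, gamma):
    """Mean and variance of McGBB in the (pi, rho) mean-dispersion form."""
    pi = gb1_moment(1, alpha, beta, gamma)
    m2 = gb1_moment(2, alpha, beta, gamma)
    rho = (m2 - pi ** 2) / (pi * (1 - pi))  # intra-class correlation
    mean = n * pi
    var = n * pi * (1 - pi) * (1 + (n - 1) * rho)
    return mean, var, pi, rho


def mcgbb_pmf(y, n, alpha, beta, gamma):
    """McGBB probability at y via the alternating-sum form (used as a check)."""
    j = np.arange(n - y + 1)
    log_ratio = betaln((y + j) / gamma + alpha, beta) - betaln(alpha, beta)
    return comb(n, y) * np.sum((-1.0) ** j * comb(n - y, j) * np.exp(log_ratio))


# Illustrative (hypothetical) parameter values
n, alpha, beta, gamma = 7, 0.5, 1.2, 2.0
mean, var, pi, rho = mcgbb_mean_var(n, alpha, beta, gamma)

# Cross-check against moments computed from the PMF
ys = np.arange(n + 1)
pmf = np.array([mcgbb_pmf(y, n, alpha, beta, gamma) for y in ys])
mean_pmf = np.sum(ys * pmf)
var_pmf = np.sum(ys ** 2 * pmf) - mean_pmf ** 2
```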
3.3. Quadratic Estimating Equations
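By considering estimating functions quadratic in the observations $y_i$, the QEEs have the general form
$$g(\theta) = \sum_{i} \left[ a_i(\theta)(y_i - \mu_i) + b_i(\theta)\left\{(y_i - \mu_i)^2 - \sigma_i^2\right\} \right] = 0,$$
Crowder (1987) [7] , where $a_i(\theta)$ and $b_i(\theta)$ are specified nonstochastic functions of $\theta$, and $\mu_i$ and $\sigma_i^2$ denote the mean and variance of $y_i$. The unbiased quadratic estimating equations for the parameters of the McGBB distribution are derived as follows.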
The unbiased quadratic estimating equations for, and and have the form
(9)
If we take
We obtain the Gaussian estimating equations. We denote this Equation (10) by
(10)
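Gaussian estimating equations are the score equations of a normal log-likelihood that uses only the model mean and variance. The sketch below illustrates the idea under assumptions that are not taken from the paper: the mean and variance are written as $n\pi$ and $n\pi(1-\pi)\{1+(n-1)\rho\}$, the data are simulated, and the equations are solved by minimizing the Gaussian negative log-likelihood over logit-transformed $(\pi, \rho)$. When every observation shares the same $n$ and the solution is interior, the estimates coincide with simple moment expressions, which gives a convenient check.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit, logit


def gaussian_nll(eta, y, n):
    """Gaussian negative log-likelihood (up to a constant) in logit(pi), logit(rho)."""
    pi, rho = expit(eta)
    mu = n * pi
    sigma2 = n * pi * (1 - pi) * (1 + (n - 1) * rho)
    return 0.5 * np.sum(np.log(sigma2) + (y - mu) ** 2 / sigma2)


# Simulated (hypothetical) over-dispersed binomial data
rng = np.random.default_rng(1)
n, N = 7, 400
p = rng.beta(0.7, 1.0, size=N) ** (1 / 1.5)
y = rng.binomial(n, p)

fit = minimize(gaussian_nll, x0=logit([0.5, 0.3]), args=(y, n),
               method="Nelder-Mead",
               options={"xatol": 1e-10, "fatol": 1e-12, "maxiter": 2000})
pi_gl, rho_gl = expit(fit.x)

# Closed form for a common n: match the sample mean and the (biased) sample variance
pi_cf = y.mean() / n
rho_cf = (y.var() / (n * pi_cf * (1 - pi_cf)) - 1) / (n - 1)
```

The logit transform restricts $\rho$ to $(0, 1)$, so this sketch covers over-dispersion only; a different transform would be needed to allow under-dispersion.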
If we take, and.
Then we obtain the unbiased estimating equations (QEE’s) for McDonald Generalized Binomial Distribution. These equations were obtained by combining the quasi-likelihood estimating equations for the regression parameters and the optimal quadratic estimating equations of Crowder (1987) [7] for the dispersion parameter after setting and to zero.
We denote the estimates so obtained from Equation (11) by
This simplifies to,
(11)
For
We obtain the optimal quadratic estimating equations. We note that the forms of the skewness and the kurtosis are not known. We then take these based on the second, third and fourth moments of the McDonald generalized beta-binomial distribution, which are:
and.
We denote the estimates obtained by solving these optimal quadratic estimating equations by Further we also denote the estimates obtained by solving the optimal quadratic estimating equations with by. Note the estimates are also obtained by using the pseudo-likelihood estimating equations of Davidian and Carrol (1987) [7] .
(12)
(13)
4. Small-Sample Relative Efficiency
The asymptotic relative efficiency may not be very useful when comparing different estimators in small samples. We therefore conducted a simulation study with relatively small samples alongside the analysis of the real data. We compare the small-sample relative efficiency of the estimates obtained by the five estimating-equation procedures (QL, GL, M1, M2 and M3) with that of the MLE. Relative efficiency is estimated for each procedure with the MLE as the reference. When the relative efficiency is greater than one, the procedure whose efficiency appears in the denominator is preferred to the “gold standard”. The relative efficiency results for the McGBB parameters are summarized in Table 2 for the real data and in Table 3 and Table 4 for the simulated data; the simulated results are also plotted in Figure 1.
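One common way to estimate relative efficiency from replicated estimates is as a ratio of mean squared errors, with the competing procedure in the denominator. The sketch below assumes that definition, which may differ in detail from the one used for the tables; the function name and the four replicate values are hypothetical.

```python
import numpy as np


def relative_efficiency(est_ml, est_method, true_value):
    """MSE of the ML estimates divided by MSE of the competing estimates."""
    mse_ml = np.mean((np.asarray(est_ml) - true_value) ** 2)
    mse_method = np.mean((np.asarray(est_method) - true_value) ** 2)
    return mse_ml / mse_method


# Hypothetical replicate estimates of a parameter whose true value is 0.5
est_ml = np.array([0.52, 0.47, 0.55, 0.49])
est_alt = np.array([0.51, 0.48, 0.53, 0.50])
re = relative_efficiency(est_ml, est_alt, 0.5)
```

With this convention a value above one means the competing procedure had the smaller mean squared error in the replicates.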
5. Estimation
Table 1 shows the data set used by Alanko and Lemmens (1996) [12] , Rodríguez-Avi et al. (2007) [13] , and Chandrabose et al. (2013) [14] in the study of handling over-dispersion. It gives the number of days y, out of n = 7 days, on which an individual consumed alcohol, together with the frequency of each value of y among N = 399 individuals. We used the data in Table 1 to obtain the parameter estimates and their estimated relative efficiencies by the six different procedures, as given in Table 2.
Table 1. Number of alcohol consumption days and the frequency of consumption.
Table 2. The estimates and their estimated relative efficiencies by the MLE, QL, M1, M2 and M3 methods for the real data.
6. Simulation
We compare the relative efficiency of the estimates obtained by the six estimation procedures using the weekly (7 days) alcohol consumption survey data and data simulated for the same weekly time frame, with the maximum likelihood estimates as the reference. Relative efficiency is estimated as described in Section 4: when it is greater than one, the procedure whose efficiency appears in the denominator is preferred to the “gold standard” ML. Using combinations of the parameter values, we simulated 5000 samples from the McDonald Generalized Beta-Binomial distribution based on the weekly alcohol consumption data. In each simulated sample the parameters were estimated by all six procedures, including maximum likelihood, and their efficiencies and subsequently their relative efficiencies were computed.
Figure 1. Relative efficiency comparison against the maximum likelihood procedure under simulated data: panels (a) and (c) for the GL and QL procedures, panels (b) and (d) for all procedures.
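The stochastic representation suggests a direct way to simulate McGBB data: draw $U \sim \mathrm{Beta}(\alpha, \beta)$, set $P = U^{1/\gamma}$ (which has the GB1 density on the unit interval), then draw $Y \sim \mathrm{Bin}(n, P)$. The sketch below follows that recipe for 5000 replicates. The sample size per replicate, the seed, the parameter values and the function name `rmcgbb` are assumptions for illustration, and a simple moment estimate of the mean proportion stands in for the six procedures.

```python
import numpy as np
from math import gamma as gamma_fn


def rmcgbb(size, n, alpha, beta, gamma, rng):
    """Draw McGBB variates via P = U**(1/gamma), U ~ Beta(alpha, beta), Y ~ Bin(n, P)."""
    u = rng.beta(alpha, beta, size=size)
    p = u ** (1.0 / gamma)
    return rng.binomial(n, p)


rng = np.random.default_rng(2024)
n, alpha, beta, gamma = 7, 0.7, 1.0, 1.5  # hypothetical parameter values
reps, N = 5000, 50                        # 5000 replicates of a small sample
samples = rmcgbb((reps, N), n, alpha, beta, gamma, rng)

# One estimate per replicate: the moment estimate of the mean proportion
pi_hat = samples.mean(axis=1) / n

# True mean proportion E(P) = B(alpha + 1/gamma, beta) / B(alpha, beta)
pi_true = (gamma_fn(alpha + beta) * gamma_fn(alpha + 1 / gamma)
           / (gamma_fn(alpha) * gamma_fn(alpha + beta + 1 / gamma)))
bias = pi_hat.mean() - pi_true
mse = np.mean((pi_hat - pi_true) ** 2)
```

In a full study each replicate would be passed to every estimation procedure, and the per-procedure mean squared errors would then be compared with that of maximum likelihood.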
7. Discussion
From Table 2, Table 3 and Table 4 we see that the methods QL, GL and all consistently provide high efficiency (never below 0.83). The GL method is consistently the most efficient. The good behaviour of the Gaussian likelihood estimator may be due to the fact that the Gaussian likelihood is a proper likelihood and the distribution of the data does not depend on a specific departure from the binomial distribution. In general, the estimates obtained by all the estimating-function methods have high efficiencies. The small-sample estimates and efficiencies obtained in the data analysis are best for GL, followed by QL and then by the method based on the optimal quadratic estimating equations with the third and fourth moments of the McGBB distribution. The remaining quadratic estimating equation methods follow, at the cost of some loss of efficiency. Therefore, when data follow a McGBB distribution, these methods are expected to have high efficiency relative to the MLE.
8. Conclusion
Estimating functions are based on knowledge of the moments, and one advantage of this approach is that it is robust to model misspecification. The comparison results in this paper indicate that the estimating equations are superior to the MLE. The small-sample relative efficiency results also show that estimates based on the optimal quadratic estimating functions of Crowder (1987) are highly efficient and are the best among all the estimates investigated, followed by quasi-likelihood. Thus, we propose quadratic estimating functions for point estimation of the parameters of any model, including the McDonald Generalized Beta-Binomial, instead of MLEs, since they are consistent and robust to variance misspecification.