Considerable recent attention has been given to modeling overdispersed binomial data arising in toxicology, biology, clinical medicine, epidemiology and similar fields using binomial mixture distributions such as the Beta-Binomial (BB) and Kumaraswamy-Binomial (KB) distributions. A new three-parameter binomial mixture distribution, the McDonald Generalized Beta-Binomial (McGBB) distribution, has been developed; studies have shown that it fits overdispersed binomial data better than the KB and BB distributions, both on real-life data sets and in an extended simulation study. The dispersion parameter is treated as a nuisance parameter in the analysis of proportions, since our interest is in the parameters of the McGBB distribution. In this paper, we consider estimation of the parameters of the McGBB model using quasi-likelihood (QL) and quadratic estimating equations (QEEs) with dispersion. By varying the coefficients of the QEEs we obtain four sets of estimating equations, which in turn yield four sets of estimates. We compare the small-sample relative efficiencies of the estimates based on QEEs and quasi-likelihood with those of the maximum likelihood estimates. The comparison uses real-life data on alcohol consumption practices and simulated data. These comparisons show that the estimates based on optimal QEEs and QL are highly efficient and are the best among all the estimates investigated.

Estimating functions have for some time been a key concept and subject of inquiry in statistical research, and the approach is regarded as the most general method of estimation. The basis of the method is a set of simultaneous equations involving both the data and the unknown model parameters. To obtain an estimator, the estimating function is equated to zero and the resulting equation is solved for the parameter. Estimating equations are generally less computationally intensive than maximum likelihood estimation. Moreover, maximum likelihood estimators (MLEs) rely on the assumption that the distribution is known, whereas an estimating equation is free of such an assumption. The usual procedure is to take a parametric model, such as the McDonald Generalized Beta-Binomial (McGBB) model, which allows for overdispersion as well as underdispersion, and to obtain maximum likelihood estimates of its parameters. The McGBB distribution is a three-parameter distribution that is superior to KB in handling overdispersed binomial data. This procedure may produce inefficient or biased estimates when the parametric model does not fit the data well. Alternatively, more robust estimates, such as moment estimates, quasi-likelihood estimates (Breslow, 1990 [

Let

The

A random variable

and

In general, a Binomial mixture is obtained through an integration approach. Suppose

where
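As an illustration of the integration approach, the following Python sketch evaluates a binomial mixture pmf when the success probability follows a McDonald generalized beta density of the first kind. The density form, the parameter names `a`, `b`, `c`, and the function names are assumptions made for this example. Expanding $(1-p)^{n-y}$ binomially turns the integral into a finite alternating sum of beta functions.

```python
from math import comb, exp, lgamma


def log_beta(x, y):
    """Logarithm of the beta function B(x, y)."""
    return lgamma(x) + lgamma(y) - lgamma(x + y)


def moment(k, a, b, c):
    """E[p^k] under the assumed mixing density, where p**c ~ Beta(a, b)."""
    return exp(log_beta(a + k / c, b) - log_beta(a, b))


def mcgbb_pmf(y, n, a, b, c):
    """P(Y = y) for the mixture, using the expansion
    (1 - p)^(n - y) = sum_j (-1)^j * C(n - y, j) * p^j."""
    total = 0.0
    for j in range(n - y + 1):
        total += (-1) ** j * comb(n - y, j) * moment(y + j, a, b, c)
    return comb(n, y) * total


# Hypothetical parameter values, chosen only to exercise the function.
probs = [mcgbb_pmf(y, 7, 2.0, 3.0, 1.5) for y in range(8)]
```

With `c = 1` the assumed mixing density is an ordinary beta density, so the same function should reproduce the beta-binomial pmf. The alternating sum is adequate for small `n` such as 7 but can lose accuracy through cancellation when `n` is large.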

The three unknown parameters of McGBB distribution have been estimated using the maximum likelihood estimation technique. Let

unknown parameter vector

The quasi-likelihood (Wedderburn, 1974 [

The quasi-likelihood with the above mean and variance is given by

where,

By virtue of independence between samples, the quasi-likelihood with the above means and variance is given by:

We denote Equation (5) by

where,

Then the partial derivatives for the three parameters

By considering estimating functions quadratic in

chastic functions of

The unbiased quadratic estimating equations for

If we take

We obtain the Gaussian estimating equations. We denote this Equation (10) by

If we take

Then we obtain the unbiased quadratic estimating equations (QEEs) for the McDonald Generalized Beta-Binomial distribution. These equations are obtained by combining the quasi-likelihood estimating equations for the regression parameters with the optimal quadratic estimating equations of Crowder (1987) [

We denote the estimates so obtained from Equation (11) by

This simplifies to,

For

We obtain the optimal quadratic estimating equations. We note that the forms of the skewness

and

We denote the estimates obtained by solving these optimal quadratic estimating equations by
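To make the idea of solving estimating equations concrete, here is a minimal Python sketch for a simpler two-parameter setting than the McGBB model. It assumes only $E(y_i) = n_i\pi$ and $\mathrm{Var}(y_i) = n_i\pi(1-\pi)\{1+(n_i-1)\phi\}$, and it solves the quasi-likelihood equation for $\pi$ together with one unbiased quadratic (moment-type) equation for $\phi$ by alternating updates. The mean-variance form, the data, and all names are illustrative assumptions, not the equations of this paper.

```python
def estimating_functions(pi, phi, y, n):
    """Return (g1, g2): the QL function for pi and a quadratic function for phi."""
    g1 = g2 = 0.0
    for yi, ni in zip(y, n):
        infl = 1.0 + (ni - 1) * phi           # assumed variance inflation factor
        var = ni * pi * (1.0 - pi) * infl     # assumed Var(y_i)
        g1 += (yi - ni * pi) * ni / var       # (y - mu) * (dmu/dpi) / V
        g2 += (yi - ni * pi) ** 2 - var       # unbiased because E[(y - mu)^2] = V
    return g1, g2


def solve(y, n, tol=1e-12, max_iter=500):
    """Alternate closed-form updates of pi (given phi) and phi (given pi).
    Assumes the estimate of pi stays strictly between 0 and 1."""
    pi, phi = sum(y) / sum(n), 0.0
    pairs = sum(ni * (ni - 1) for ni in n)
    for _ in range(max_iter):
        w = [1.0 / (1.0 + (ni - 1) * phi) for ni in n]
        new_pi = (sum(wi * yi for wi, yi in zip(w, y))
                  / sum(wi * ni for wi, ni in zip(w, n)))
        q = new_pi * (1.0 - new_pi)
        ss = sum((yi - ni * new_pi) ** 2 for yi, ni in zip(y, n))
        new_phi = (ss - q * sum(n)) / (q * pairs)
        if abs(new_pi - pi) + abs(new_phi - phi) < tol:
            return new_pi, new_phi
        pi, phi = new_pi, new_phi
    return pi, phi


# Hypothetical counts y out of n trials, for illustration only.
y = [1, 9, 2, 8, 3, 7]
n = [10, 12, 8, 10, 15, 9]
pi_hat, phi_hat = solve(y, n)
```

Both estimating functions are (numerically) zero at the returned pair, which is what "equating the estimating function to zero" means in practice. Other choices of the coefficients in the quadratic function would give different estimates of the dispersion parameter.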

The asymptotic relative efficiency may not be very useful when comparing different estimators in small samples. So we conducted a simulation study using relatively small
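A simulation of this kind needs overdispersed binomial samples. One way to draw them, sketched below in Python, is to sample the success probability from the mixing distribution and then draw a binomial count given that probability. The sketch assumes a McDonald generalized beta mixing density, for which $p = u^{1/c}$ with $u \sim \mathrm{Beta}(a, b)$; the function name, parameter values, and sample size are illustrative and are not the settings of the study.

```python
import random


def draw_mcgbb(m, n, a, b, c, rng):
    """Draw m counts, each out of n trials, from the assumed mixture."""
    sample = []
    for _ in range(m):
        u = rng.betavariate(a, b)       # u ~ Beta(a, b)
        p = u ** (1.0 / c)              # p follows the assumed mixing density
        y = sum(rng.random() < p for _ in range(n))   # Binomial(n, p) draw
        sample.append(y)
    return sample


rng = random.Random(2024)               # fixed seed so the draw is reproducible
counts = draw_mcgbb(m=20, n=7, a=0.5, b=0.5, c=2.0, rng=rng)
```

Passing the random generator in explicitly keeps each replicate reproducible, which helps when the same simulated samples must be fed to several estimators.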

We compare the relative efficiency of the estimates

| Days | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|---|
| Frequency | 47 | 54 | 43 | 40 | 40 | 41 | 39 | 95 |

| Parameters | Parameter estimates | | | | | | Estimated relative efficiencies | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | 0.0333 | 0.0281 | 0.0301 | 0.0287 | 0.0312 | 0.0282 | 1.000 | 1.0912 | 1.0424 | 0.6121 | 0.9352 | 0.3292 |
| | 0.1797 | 0.1502 | 0.1671 | 0.1655 | 0.1671 | 0.1611 | 1.000 | 0.9085 | 0.9831 | 0.5192 | 0.9811 | 0.3615 |
| | 26.7312 | 25.5541 | 25.8127 | 24.8421 | 25.4523 | 24.6746 | 1.000 | 0.9010 | 0.8742 | 0.5320 | 0.9381 | 0.3510 |

using weekly (7 days) alcohol consumption survey data and simulated data for the survey of weekly alcohol consumption for a small time frame (

From

| | Estimated relative efficiencies | | | | | |
|---|---|---|---|---|---|---|
| 0.0 | 1.000 | 0.980 | 0.950 | 0.591 | 0.933 | 0.454 |
| 0.1 | 1.000 | 0.990 | 0.967 | 0.638 | 0.938 | 0.519 |
| 0.2 | 1.000 | 0.998 | 0.985 | 0.490 | 0.859 | 0.586 |
| 0.3 | 1.000 | 1.014 | 1.000 | 0.592 | 0.928 | 0.609 |
| 0.4 | 1.000 | 1.052 | 1.053 | 0.617 | 1.247 | 0.425 |
| 0.5 | 1.000 | 1.095 | 1.025 | 0.706 | 0.990 | 0.438 |
| 0.6 | 1.000 | 1.148 | 1.135 | 0.669 | 1.552 | 0.411 |
| 0.7 | 1.000 | 1.131 | 1.021 | 0.655 | 0.839 | 0.415 |
| 0.8 | 1.000 | 1.035 | 1.010 | 0.592 | 0.982 | 0.298 |
| 0.9 | 1.000 | 1.001 | 0.989 | 0.529 | 0.952 | 0.216 |
| 1.0 | 1.000 | 0.998 | 0.993 | 0.389 | 0.941 | 0.201 |

| | Estimated relative efficiencies | | | | | |
|---|---|---|---|---|---|---|
| 0.0 | 1.000 | 1.080 | 0.950 | 0.431 | 0.946 | 0.589 |
| 0.1 | 1.000 | 1.109 | 0.997 | 0.488 | 0.908 | 0.429 |
| 0.2 | 1.000 | 1.210 | 0.989 | 0.549 | 0.885 | 0.348 |
| 0.3 | 1.000 | 1.214 | 1.130 | 0.627 | 0.941 | 0.411 |
| 0.4 | 1.000 | 1.252 | 1.289 | 0.717 | 1.493 | 0.495 |
| 0.5 | 1.000 | 1.35 | 1.325 | 0.796 | 0.898 | 0.517 |
| 0.6 | 1.000 | 1.348 | 1.305 | 0.739 | 1.541 | 0.524 |
| 0.7 | 1.000 | 1.343 | 1.216 | 0.715 | 0.953 | 0.459 |
| 0.8 | 1.000 | 1.235 | 1.101 | 0.69 | 0.894 | 0.398 |
| 0.9 | 1.000 | 1.191 | 0.995 | 0.652 | 0.958 | 0.306 |
| 1.0 | 1.000 | 0.908 | 0.985 | 0.479 | 0.902 | 0.297 |
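Entries in tables of this kind can be computed from replicated estimates. The Python sketch below assumes that relative efficiency is defined as the mean squared error of the reference estimator (here the MLE) divided by that of the competing estimator, so that the reference column equals 1 by construction. This definition, the function names, and the toy numbers are assumptions for illustration and are not taken from the study.

```python
def mse(estimates, true_value):
    """Mean squared error of replicated estimates around the true value."""
    return sum((e - true_value) ** 2 for e in estimates) / len(estimates)


def relative_efficiency(reference, competitor, true_value):
    """MSE of the reference estimator divided by MSE of the competitor."""
    return mse(reference, true_value) / mse(competitor, true_value)


# Toy replicated estimates of a parameter whose true value is 1.0.
mle_reps = [0.9, 1.1, 1.0, 0.8]
qee_reps = [0.8, 1.2, 1.1, 0.7]
re_qee = relative_efficiency(mle_reps, qee_reps, true_value=1.0)
```

Under this definition a value above 1 means the competing estimator has the smaller mean squared error, and a value below 1 means the reference estimator does.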

Estimating functions are based on knowledge of the moments, and one advantage of this approach is its robustness to model misspecification. The comparisons in this paper indicate that the estimating equations are superior to maximum likelihood estimation. The small-sample relative efficiency results also show that the estimates obtained from the optimal quadratic estimating functions of Crowder (1987) are highly efficient and are the best among all the estimates investigated, followed by the quasi-likelihood estimates. We therefore propose quadratic estimating functions for estimating the parameters of any model, including the McDonald Generalized Beta-Binomial, in place of MLEs, since the resulting estimates are consistent and robust to variance misspecification.