Maximum Entropy Empirical Likelihood Methods Based on Bivariate Laplace Transforms and Moment Generating Functions

Maximum Entropy Empirical Likelihood (MEEL) methods are extended to bivariate distributions with closed form expressions for their bivariate Laplace transforms (BLT) or bivariate moment generating functions (BMGF) but without closed form expressions for their bivariate density functions, which makes the implementation of likelihood methods difficult. These distributions are often encountered in joint modeling in actuarial science and finance. Moment conditions to implement MEEL methods are given, and a bivariate Laplace transform power mixture (BLTPM) operator is also introduced; the new operator generalizes the existing univariate one in the literature. Many new bivariate distributions, including infinitely divisible (ID) distributions with closed form expressions for their BLT, can be created using this operator, and MEEL methods can also be applied to these bivariate distributions.

1. Introduction

Bivariate distributions are useful for joint modelling, and fitting these distributions is naturally a necessity for pricing in insurance and finance. For example, in finance, fitting a bivariate jump diffusion model to joint returns data allows a basket option to be priced accordingly; see Ruijter and Oosterlee (p. B658) for a bivariate jump diffusion model. The authors advocate a two-dimensional cosine method for pricing basket options. In actuarial science, one might consider modeling two claim amounts from an accident, i.e., the amount from bodily damage and the amount from material damage, which are incurred simultaneously in a car accident; see Partrat (p. 225). Joint survival analysis is also useful for lifetime studies; see Crowder (pp. 121-136).

In order to find suitable bivariate parametric models, compound methods and copulas are often used. New bivariate distributions created using traditional distribution or survival copulas are often continuous with closed form distribution functions or density functions, so that methods based on likelihood functions can be applied for statistical inference. It is also well known that many useful bivariate infinitely divisible distributions have neither a closed form density function nor a closed form distribution function, yet they are very useful for applications.

For the univariate case, the LT power mixture (LTPM) operator introduced by Abate and Whitt (pp. 92-93) for creating new distributions using Laplace transforms (LT) can be used to generate many distributions with closed form LT. Furthermore, it can also be used to generate distributions which are infinitely divisible. In this paper we shall generalize the LTPM to a bivariate version, the BLTPM operator, and show that the BLTPM operator can be used to generate new bivariate distributions with closed form BLT. Furthermore, traditional survival copulas such as the Clayton copula, see Shih and Louis, can be used as LT copulas if the complete monotonicity of two specific related functions is satisfied. For another class of survival copulas, see Crowder (pp. 121-138). Consequently, some distribution or survival function copulas which are defined using a generator based on the LT of a nonnegative random variable can be used to generate new distributions with prescribed marginals specified by their marginal LTs. A similar bivariate PM operator based on distribution functions or survival functions, the BDSPM operator, has been introduced by Marshall and Olkin (pp. 834-836) to create new distribution functions; with frailty induced distributions, it is also related to a class of distribution or survival copula functions defined with a generator. The BLTPM operator can be used to generate bivariate infinitely divisible distributions with closed form BLTs. It appears useful to have bivariate infinitely divisible distributions for joint modeling, as they are related to corresponding bivariate Lévy processes with stationary and independent increments. These processes are useful as they often lead to elegant results in risk theory in actuarial science and in finance.

It appears natural to extend inferences based on maximum entropy empirical likelihood (MEEL) to bivariate distributions with closed form bivariate LT (BLT) or closed form bivariate moment generating function (BMGF), in a way similar to the univariate case as given by Luong. Since only the BLT or BMGF is needed, without requiring an explicit expression for the bivariate distribution function or density function, MEEL methods appear to be practical; the methods also simultaneously offer a goodness-of-fit test statistic which follows a chi-square distribution, is relatively simple to implement, and appears to be useful for practitioners. In the same vein as univariate MEEL methods, bivariate MEEL methods of estimation and testing are based on constraints specified by moment conditions which can be identified with elements of a basis onto which the true score functions are projected, as in the univariate setup, see Luong (pp. 463-465); these moment conditions defining the elements of the basis can be extracted from the model BLT or BMGF.

It is also worth mentioning that many bivariate distributions with a singular component in their domain, such as the bivariate exponential distribution introduced by Marshall and Olkin (p. 36) or the bivariate phase type distribution given by Assaf et al. (p. 692), do have a closed form bivariate LT, so that model testing for these distributions with singular components can be handled by bivariate MEEL methods. This motivates us to extend MEEL methods to bivariate distributions, and the paper is organized as follows.

In Section 2, some examples from actuarial science and finance are given to illustrate the usefulness of bivariate models without closed form density functions but with closed form BLTs. The BLTPM operator is introduced in Section 3, and it is shown that it can be used to create bivariate ID distributions with closed form BLT. MEEL methods are introduced in Section 4 with the proposal of two bases to generate moment conditions. The elements of the bases are based on the BLT or BMGF and do not need the density functions explicitly. These bases are proposed to balance efficiency and numerical tractability, as a basis with a large number of elements tends to create numerical difficulties in the implementation of MEEL methods. The asymptotic properties, which already appear in the literature, are restated emphasizing bases and projections of the score functions. We also discuss numerical implementation of the MEEL methods using the penalty function approach as given by Luong. Section 5 illustrates the use of the methods by conducting a limited simulation study with a bivariate compound Poisson model as proposed by Partrat (pp. 220-223). Over the range of parameters considered, we observe that the MEEL estimators are two to four times more efficient than MM estimators and at the same time provide a chi-square goodness-of-fit test statistic for model selection. A basis of nine elements is chosen in the study, and no major numerical difficulty is encountered in the implementation using the penalty function approach. The methods avoid the use of the covariance matrix of the elements of the basis, as required for the optimum generalized method of moments (GMM) discussed in Luong; the expressions for this optimum covariance matrix can be complicated.

In the next section, we shall give some examples to illustrate that in many situations we end up working with bivariate distributions with closed form BLT or closed form BMGF but without a closed form for their distribution functions or density functions.

2. Some Examples

Example 1

The following bivariate distribution introduced by Partrat (pp. 220-222) is nonnegative and continuous in general, but it is discrete at the origin, where a mass is assigned. It can be created from compounding using a bivariate Poisson distribution, has a closed form BLT, and is suitable for modelling two claim amounts from two types of claim in actuarial science. The vector of claim amounts has two components, and each component can be expressed as a random sum as in the univariate case, but the two components are correlated because the two numbers of claims given by the vector $N={\left({N}_{1},{N}_{2}\right)}^{\prime }$ are dependent and follow a bivariate Poisson distribution. More precisely, the claim amount is a vector $X={\left(Y,Z\right)}^{\prime }$ , and each of its components can be expressed as a random sum. Note that $Y={\sum }_{i=1}^{{N}_{1}}{U}_{i}$ , where the random variables $\left\{{U}_{i},i=1,2,\cdots \right\}$ are independent and identically distributed as $U\ge 0$ , a nonnegative continuous random variable with LT ${L}_{U}\left(s\right)$ . Similarly, $Z={\sum }_{i=1}^{{N}_{2}}{V}_{i}$ , where the random variables $\left\{{V}_{i},i=1,2,\cdots \right\}$ are independent and identically distributed as $V\ge 0$ , a nonnegative continuous random variable with LT ${L}_{V}\left(t\right)$ . The random variables $\left\{{U}_{i},i=1,2,\cdots \right\}$ , $\left\{{V}_{i},i=1,2,\cdots \right\}$ , $\left\{{N}_{1},{N}_{2}\right\}$ are assumed to be independent, and together with the other related independence assumptions given by H1-H7 in Partrat (pp. 220-221) this leads to the BLT of the vector X being expressible as

${L}_{X}\left(s,t\right)=g\left({L}_{U}\left(s\right),{L}_{V}\left(t\right)\right)$ , see Partrat  (p. 221).

The bivariate probability generating function (BPGF) for $N={\left({N}_{1},{N}_{2}\right)}^{\prime }$ is denoted by $g\left(s,t\right)$ and if $N={\left({N}_{1},{N}_{2}\right)}^{\prime }$ follows a bivariate Poisson distribution then its PGF is given by $g\left(s,t\right)={\text{e}}^{{\phi }_{1}\left(s-1\right)+{\phi }_{2}\left(t-1\right)+{\phi }_{12}\left(st-1\right)}$ , see Partrat (p. 222).

The parameters ${\phi }_{1},{\phi }_{2},{\phi }_{12}$ are all nonnegative. This is the only bivariate Poisson distribution which is infinitely divisible (ID), see Dwass and Teicher. Now if U and V are also ID, then using this property of the bivariate Poisson PGF it can be seen that
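As a hedged illustration (not part of the original text), the compound BLT ${L}_{X}\left(s,t\right)=g\left({L}_{U}\left(s\right),{L}_{V}\left(t\right)\right)$ can be evaluated directly once the severity LTs are chosen; the exponential severities and all parameter values below are arbitrary assumptions for the sketch:

```python
import math

def bivariate_poisson_pgf(s, t, phi1, phi2, phi12):
    """PGF of the bivariate Poisson vector N = (N1, N2)'."""
    return math.exp(phi1 * (s - 1.0) + phi2 * (t - 1.0) + phi12 * (s * t - 1.0))

def lt_exponential(s, rate):
    """LT of an exponential severity, an illustrative choice for L_U or L_V."""
    return rate / (rate + s)

def compound_blt(s, t, phi1=1.0, phi2=0.5, phi12=0.3, rate_u=2.0, rate_v=3.0):
    """BLT of X = (Y, Z)': plug the severity LTs into the counting PGF."""
    return bivariate_poisson_pgf(lt_exponential(s, rate_u),
                                 lt_exponential(t, rate_v),
                                 phi1, phi2, phi12)
```

As a sanity check, `compound_blt(0, 0)` equals 1, as any proper BLT must at the origin.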

${\left[g\left({L}_{U}\left(s\right),{L}_{V}\left(t\right)\right)\right]}^{\frac{1}{n}}$

is a BLT for each positive integer n, using the remarks given by Abate and Whitt based on results given by Feller (pp. 176, 449). Consequently, we can conclude that X is also ID. As the BLT is closely related to the BMGF, a similar conclusion can be made using the BMGF instead of the BLT. Also, in many circumstances, by inspecting the BLT or BMGF, which has a closed form expression, we can infer the infinite divisibility of the distribution without having to use more sophisticated methods such as the one based on the Lévy-Khintchine representation; see Dwass and Teicher for this representation, which is general but very technical, to characterize ID distributions and Lévy processes.

Clearly, we can differentiate ${L}_{X}\left(s,t\right)$ , or use a conditioning argument with the random sum representations, to obtain the first two moments of the vector X, and these first two moments will be included in the set of moment conditions for the MEEL methods developed subsequently in Section 4. The following example gives a model which is used in finance.
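The differentiation of the BLT can also be done numerically, which is convenient when only a closed form ${L}_{X}\left(s,t\right)$ is available. The sketch below (an illustration, not the paper's procedure) uses central differences and checks itself on an arbitrary BLT with known moments, that of two independent unit exponentials:

```python
# Hedged sketch: recover moments numerically from a closed form BLT,
# using E[Y] = -dL/ds at (0,0) and E[YZ] = d^2 L / (ds dt) at (0,0).

def mean_from_blt(blt, component, h=1e-5):
    """First moment of component 0 (Y) or 1 (Z) by a central difference."""
    if component == 0:
        return -(blt(h, 0.0) - blt(-h, 0.0)) / (2.0 * h)
    return -(blt(0.0, h) - blt(0.0, -h)) / (2.0 * h)

def cross_moment_from_blt(blt, h=1e-4):
    """E[YZ] by a central mixed second difference."""
    return (blt(h, h) - blt(h, -h) - blt(-h, h) + blt(-h, -h)) / (4.0 * h * h)

# Check on a BLT with known moments: independent unit exponentials,
# L(s,t) = 1/((1+s)(1+t)), so E[Y] = E[Z] = 1 and E[YZ] = 1.
blt_exp = lambda s, t: 1.0 / ((1.0 + s) * (1.0 + t))
```

The same helpers apply unchanged to any of the closed form BLTs appearing later in the paper.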

Example 2

Ruijter and Oosterlee (p. B658) consider joint modelling of the returns of two assets over a period of time. The return random vector is $X={\left(Y,Z\right)}^{\prime }$ and is representable as

$X=B+{\sum }_{i=1}^{N}{J}_{i}$ ,

with B following a bivariate normal distribution, $B\sim N\left(\mu ,\Sigma \right)$ .

The mean vector is $\mu ={\left({\mu }_{1},{\mu }_{2}\right)}^{\prime }$ and the covariance matrix is $\Sigma =\left(\begin{array}{cc}{\sigma }_{11}& {\sigma }_{12}\\ {\sigma }_{12}& {\sigma }_{22}\end{array}\right)$ ; the random vector B is independent of the random sum ${\sum }_{i=1}^{N}{J}_{i}$ , and the ${J}_{i}$ ’s are independent and identically distributed as J, which follows a bivariate normal distribution, $J\sim N\left({\mu }^{J},{\Sigma }^{J}\right)$ , with

${\mu }^{J}={\left({\mu }_{1}^{J},{\mu }_{2}^{J}\right)}^{\prime },\text{\hspace{0.17em}}\text{\hspace{0.17em}}{\Sigma }^{J}=\left(\begin{array}{cc}{\sigma }_{11}^{J}& {\sigma }_{12}^{J}\\ {\sigma }_{12}^{J}& {\sigma }_{22}^{J}\end{array}\right)$

This is a jump-diffusion model with the diffusion part given by B and the jump part by the random sum ${\sum }_{i=1}^{N}{J}_{i}$ . Also, the ${J}_{i}$ ’s, B and N are independent, with N following a Poisson distribution with parameter λ. Compared to a bivariate normal model, the bivariate jump diffusion model has an extra jump component. Ruijter and Oosterlee (p. B658) show that, letting $s={\left(s,t\right)}^{\prime }$ , the BMGF for X is given by

${M}_{X}\left(s,t\right)={\text{e}}^{{s}^{\prime }\mu +\frac{1}{2}{s}^{\prime }\Sigma s}\cdot \mathrm{exp}\left(\lambda \left({\text{e}}^{{s}^{\prime }{\mu }^{J}+\frac{1}{2}{s}^{\prime }{\Sigma }^{J}s}-1\right)\right),\text{\hspace{0.17em}}\text{\hspace{0.17em}}-\infty <s,t<\infty .$

The domain of ${M}_{X}\left(s,t\right)$ is the entire plane due to the use of the Poisson and normal distributions. Clearly, this is an ID BMGF, and the corresponding jump diffusion process is a bivariate Lévy process. Also, the first two moments of the vector X can be extracted from the BMGF. For this model, MEEL methods can be used for estimation and model testing. The class of bivariate normal mean variance mixtures described by McNeil et al. (pp. 77-78) is another example where bivariate MEEL methods might be suitable.
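A minimal simulation sketch of this jump-diffusion model can be used to check moments implied by the BMGF; by Wald's identity the mean of X is $\mu +\lambda {\mu }^{J}$ . All parameter values below are illustrative assumptions, not values from Ruijter and Oosterlee:

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_jump_diffusion(n, mu, cov, lam, mu_j, cov_j):
    """Draw n observations of X = B + sum_{i=1}^N J_i."""
    b = rng.multivariate_normal(mu, cov, size=n)   # diffusion part B ~ N(mu, cov)
    counts = rng.poisson(lam, size=n)              # N ~ Poisson(lam)
    jumps = np.zeros((n, 2))
    for i, k in enumerate(counts):
        if k > 0:                                  # sum of k iid normal jumps
            jumps[i] = rng.multivariate_normal(mu_j, cov_j, size=k).sum(axis=0)
    return b + jumps

# Illustrative parameter values (assumptions for the sketch).
x = simulate_jump_diffusion(
    n=20000,
    mu=[0.1, 0.2], cov=[[0.04, 0.01], [0.01, 0.09]],
    lam=2.0,
    mu_j=[0.05, -0.02], cov_j=[[0.01, 0.0], [0.0, 0.004]],
)
```

With these values the sample mean of `x` should be close to $\mu +\lambda {\mu }^{J}=\left(0.2,0.16\right)$ .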

Furthermore, MEEL methods can also be used for testing the composite null hypothesis that the parametric model fits the data. The test statistics based on MEEL methods can be constructed in a unified way and obtained simultaneously with the estimation procedures. This feature is useful for applied work, and the test statistics are less complicated than statistics based on the Mahalanobis distance; see McAssey and Mudholkar et al. for statistics based on such a distance. The unique asymptotic chi-square distribution that the statistics follow across the composite hypothesis makes them suitable replacements for the traditional chi-square test statistics proposed by Moore and Stubblebine, which require closed form bivariate density functions. A model testing procedure is easy to implement if it is based on a statistic with a unique asymptotic distribution for all ${\beta }_{0}\in \Omega$ , where the parameter space is denoted by Ω. The main requirement is that the model BLT has a closed form expression.

These properties of bivariate MEEL methods are similar to the properties of univariate MEEL methods as discussed by Luong and will be further discussed in Section 4. Note that the asymptotic theory for empirical likelihood methods developed by Qin and Lawless (pp. 302-308) is not restricted to univariate distributions, and the same can be said for the maximum entropy version. In fact, asymptotic theory for empirical likelihood is well established for models with multivariate observations which are independent and identically distributed, using a set of fixed moment conditions. Example 2 given by Qin and Lawless (pp. 311-312) illustrates the use of empirical likelihood methods for a bivariate model. In practice, we would like to choose the moment conditions so that high efficiency can be achieved and the procedures can be handled numerically. We focus on these points for a class of distributions with closed form BLT or BMGF and do not emphasize asymptotic theory in this paper.

3. Distributions Created Using the BLTPM Operator

3.1. The BLTPM Operator

The Laplace transform power mixture (LTPM) operator has been introduced in the literature for creating univariate nonnegative distributions; it can be viewed as a continuous type of compounding operator, making use of the LT of a distribution and the LT of a mixing random variable to create a new distribution specified by its LT. It is due to Abate and Whitt (pp. 92-93). They use the acronym PM, but we shall use the acronym LTPM to emphasize the explicit use of LTs with this operator and to distinguish it from a similar power mixture operator introduced by Marshall and Olkin (pp. 834-836) which makes explicit use of distributions instead of LTs. Abate and Whitt also point out that a new ID distribution can be obtained if we start with an ID distribution and the mixing distribution is nonnegative, continuous and ID. They also give an elegant and insightful interpretation when, instead of the ID distribution, we consider the corresponding underlying Lévy process. The new ID distribution is related to a new Lévy process constructed from a Lévy process subject to a time change process induced by the mixing random variable; since the mixing distribution is ID, the time change process is also a Lévy process. To summarize, starting with a Lévy process and subjecting it to another nonnegative Lévy process used as a time change process, a new Lévy process is created, or equivalently a new ID nonnegative distribution is created, which is the distribution of the increments of the newly created Lévy process. Observe that the LTPM operator can be used to generate new distributions in general, and with further conditions imposed on the procedure, the new distribution is ID. Consequently, it appears natural to generalize this operator to a bivariate version to create bivariate distributions using two univariate LTs and the LT of a mixing random variable; this is the bivariate Laplace transform power mixture operator (BLTPM). Clearly, it is a natural extension of the LTPM operator used to create univariate distributions. This operator is similar to another power mixture operator which makes use of distributions or survival functions instead of LTs, given by Marshall and Olkin (p. 835); we shall use the acronym BDSPM for this operator.

Some examples of new bivariate distributions specified by their BLT and created using the BLTPM operator will be given for illustration in Section 3.2. The BLTPM operator can also be used to create new distributions with one marginal distribution which is discrete and the other continuous. These types of bivariate distributions appear to be useful in actuarial science and possibly in other fields as well.

Often we need to know whether a function can be considered the LT of some univariate random variable. The notion of complete monotonicity of a function is useful for characterizing a function as a proper LT of a random variable; it can be found in Feller (pp. 439-440) but is reproduced here for completeness as the Definition and Theorem below.

Definition

A function $\phi \left(s\right)$ with the domain given by $s\ge 0$ is completely monotone (CM) if it possesses derivatives ${\phi }^{\left(n\right)}$ of all orders and ${\left(-1\right)}^{n}{\phi }^{\left(n\right)}\left(s\right)\ge 0$ .

Theorem

A function $\phi \left(s\right)$ with the domain given by $s\ge 0$ is the LT of a random variable if and only if it is CM.

Now assume that the LT of a random variable is $\stackrel{^}{f}\left(s\right)$ and that ${\stackrel{^}{f}}^{\alpha }\left(s\right)$ is a proper LT for each $\alpha >0$ ; similarly, suppose that the LT of another random variable is $\stackrel{^}{g}\left(t\right)$ and that ${\stackrel{^}{g}}^{\alpha }\left(t\right)$ is a proper LT, where α is a nonnegative random variable with distribution function H and LT $\stackrel{^}{h}\left(u\right)$ . Furthermore, we assume that, conditioning on a realized value α, the random variable ${X}^{\left(\alpha \right)}$ with LT ${\stackrel{^}{f}}^{\alpha }\left(s\right)$ and the random variable ${Y}^{\left(\alpha \right)}$ with LT ${\stackrel{^}{g}}^{\alpha }\left(t\right)$ are independent; integrating out with respect to the distribution of α then gives a new bivariate LT for a vector $X={\left(Y,Z\right)}^{\prime }$ ,

${L}_{X}\left(s,t\right)={\int }_{0}^{\infty }{\stackrel{^}{f}}^{\alpha }\left(s\right)\cdot {\stackrel{^}{g}}^{\alpha }\left(t\right)\text{d}H\left(\alpha \right)\text{ }\text{ }.$ (1)

Using the LT of α, it can be re-expressed as

${L}_{X}\left(s,t\right)=\stackrel{^}{h}\left(-\mathrm{log}\left(\stackrel{^}{f}\left(s\right)\right)-\mathrm{log}\left(\stackrel{^}{g}\left(t\right)\right)\right).$ (2)

Expression (2) gives a definition of the BLTPM operator; it generalizes expression (6.1) given by Abate and Whitt (p. 92) for the univariate LTPM operator. If MGFs are used instead, given respectively by $\stackrel{^}{{f}_{M}}\left(s\right)$ , $\stackrel{^}{{g}_{M}}\left(t\right)$ and $\stackrel{^}{{h}_{M}}\left(u\right)$ , a new bivariate distribution can be created similarly, with BMGF

${M}_{X}\left(s,t\right)=\stackrel{^}{{h}_{M}}\left(\mathrm{log}\left(\stackrel{^}{{f}_{M}}\left(s\right)\right)+\mathrm{log}\left(\stackrel{^}{{g}_{M}}\left(t\right)\right)\right).$ (3)

Observe that if ${X}^{\left(\alpha \right)}$ and ${Y}^{\left(\alpha \right)}$ are ID and $\alpha \ge 0$ is continuous and ID, then the new bivariate distribution created is also ID. We can interpret this property using Lévy processes: two independent Lévy processes subject to the same time change Lévy process form a bivariate Lévy process. This generalizes the argument used for the univariate case given by Abate and Whitt (p. 92).
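Expression (2) translates directly into code. The sketch below is an assumption-laden illustration (gamma mixing LT with τ = 2 and two exponential component LTs, all arbitrary choices); setting $t=0$ recovers the marginal LT $\stackrel{^}{h}\left(-\mathrm{log}\stackrel{^}{f}\left(s\right)\right)$ , which the check below verifies:

```python
import math

def bltpm(h_hat, f_hat, g_hat):
    """BLTPM operator of expression (2): L_X(s,t) = h_hat(-log f(s) - log g(t))."""
    def l_x(s, t):
        return h_hat(-math.log(f_hat(s)) - math.log(g_hat(t)))
    return l_x

# Illustrative choices: gamma mixing LT (tau = 2) and two exponential LTs.
h_hat = lambda u: (1.0 + u) ** -2.0
f_hat = lambda s: 1.0 / (1.0 + s)
g_hat = lambda t: 1.0 / (1.0 + 0.5 * t)

l_x = bltpm(h_hat, f_hat, g_hat)
```

Here `l_x(s, 0.0)` equals $\left(1+\mathrm{log}\left(1+s\right)\right)^{-2}$ , the gamma power mixture of the first exponential LT.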

A form of bivariate PM operator, the BDSPM, has been used in the literature with distribution functions or survival functions; see the seminal works of Marshall and Olkin, where a class of survival or distribution copula functions is also obtained. This class is defined using a generator based on the LT of some nonnegative random variable; it forms a subclass of the Archimedean class and will be further discussed in Section 3.2. Due to the analogy between the BLTPM and BDSPM operators, these distribution or survival copula functions, subject to some conditions, are also LT copulas. For the Archimedean class, see Klugman et al. (pp. 187-213).

In Section 3.2, we shall see that additional requirements are needed when working with LTs instead of distribution functions or survival functions if these copulas, traditionally used to create new bivariate distribution functions with prescribed marginal distributions, are to be used.

This is because the conditions that ${\text{e}}^{-\alpha {\phi }^{-1}\left({L}_{1}\left(s\right)\right)}$ and ${\text{e}}^{-\alpha {\phi }^{-1}\left({L}_{2}\left(t\right)\right)}$ , $\alpha >0$ , be proper LTs are more difficult to verify than the corresponding conditions that ${\text{e}}^{-\alpha {\phi }^{-1}\left({F}_{1}\left(x\right)\right)}$ and ${\text{e}}^{-\alpha {\phi }^{-1}\left({F}_{2}\left(y\right)\right)}$ be proper distribution functions, with ${F}_{1}\left(x\right)$ and ${F}_{2}\left(y\right)$ being distribution functions, when the BDSPM operator is used; see Marshall and Olkin (p. 834).

We shall give a few examples below to illustrate the use of the BLTPM operator to create bivariate ID distributions using the LT of a gamma mixing random variable. The same procedure can be used with other mixing random variables such as the inverse Gaussian random variable or the exponential mixture inverse Gaussian (EMIG) random variable. The EMIG distribution, studied by Abate and Whitt (pp. 95-96), is also ID, continuous and nonnegative. This distribution has not received much attention in the statistical literature despite being used frequently in queueing theory.

3.2. Some Examples

Example 3

For modeling two types of claims in insurance over one period of time, we might want to construct a joint continuous model using the BLTPM operator with the mixing random variable following a gamma distribution with LT $\stackrel{^}{h}\left(u\right)={\left(1+u\right)}^{-\tau },\tau >0$ , and with $\stackrel{^}{f}\left(s\right)$ and $\stackrel{^}{g}\left(t\right)$ the LTs of inverse Gaussian distributions given respectively by

$\stackrel{^}{f}\left(s\right)=\mathrm{exp}\left(\frac{{\theta }_{1}}{{\mu }_{1}}\left(1-\sqrt{1+\frac{2{\mu }_{1}^{2}s}{{\theta }_{1}}}\right)\right)$ and $\stackrel{^}{g}\left(t\right)=\mathrm{exp}\left(\frac{{\theta }_{2}}{{\mu }_{2}}\left(1-\sqrt{1+\frac{2{\mu }_{2}^{2}t}{{\theta }_{2}}}\right)\right)$ ,

${\theta }_{i},{\mu }_{i}>0,i=1,2$ ; see Klugman et al. (p. 472) for the LT of an inverse Gaussian distribution.

Applying the BLTPM operator using these LTs gives a new bivariate continuous distribution for a random vector $X={\left(Y,Z\right)}^{\prime }$ which is ID and nonnegative, with BLT given by

${L}_{X}\left(s,t\right)={\left(1-\frac{{\theta }_{1}}{{\mu }_{1}}\left(1-\sqrt{1+\frac{2{\mu }_{1}^{2}s}{{\theta }_{1}}}\right)-\frac{{\theta }_{2}}{{\mu }_{2}}\left(1-\sqrt{1+\frac{2{\mu }_{2}^{2}t}{{\theta }_{2}}}\right)\right)}^{-\tau }$ .

From ${L}_{X}\left(s,t\right)$ , the Pearson product moment correlation coefficient can be obtained, and this coefficient can be used to study the dependence between Y and Z. Other dependence measures can also be used to study association and dependence; see Chapter 4 of the book by Balakrishnan and Lai (pp. 141-173), and for total positivity dependence see Barlow and Proschan (pp. 142-150). We do not go into details of these dependence studies for distributions created using the BLTPM operator, as the main focus of the paper is on statistical inference methods using BLTs or BMGFs.
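As a hedged numerical sketch (parameter values are arbitrary), this correlation can be computed from the Example 3 BLT by finite differences. Under this mixture construction the analytic value works out to ${\mu }_{1}{\mu }_{2}/\sqrt{\left({v}_{1}+{\mu }_{1}^{2}\right)\left({v}_{2}+{\mu }_{2}^{2}\right)}$ with ${v}_{i}={\mu }_{i}^{3}/{\theta }_{i}$ the inverse Gaussian variances (a derivation by conditioning on the gamma mixing variable), which the code reproduces:

```python
import math

def ig_log_lt(s, theta, mu):
    """log of the inverse Gaussian LT: (theta/mu) * (1 - sqrt(1 + 2 mu^2 s / theta))."""
    return theta / mu * (1.0 - math.sqrt(1.0 + 2.0 * mu * mu * s / theta))

def blt(s, t, tau=2.0, th1=1.0, mu1=1.0, th2=2.0, mu2=0.5):
    """Example 3 BLT with illustrative parameter values."""
    return (1.0 - ig_log_lt(s, th1, mu1) - ig_log_lt(t, th2, mu2)) ** -tau

def pearson_corr_from_blt(l, h=1e-4):
    """Pearson correlation of (Y, Z) via finite-difference moments of the BLT."""
    ey = -(l(h, 0.0) - l(-h, 0.0)) / (2.0 * h)
    ez = -(l(0.0, h) - l(0.0, -h)) / (2.0 * h)
    ey2 = (l(h, 0.0) - 2.0 * l(0.0, 0.0) + l(-h, 0.0)) / (h * h)
    ez2 = (l(0.0, h) - 2.0 * l(0.0, 0.0) + l(0.0, -h)) / (h * h)
    eyz = (l(h, h) - l(h, -h) - l(-h, h) + l(-h, -h)) / (4.0 * h * h)
    return (eyz - ey * ez) / math.sqrt((ey2 - ey * ey) * (ez2 - ez * ez))
```

Note that the correlation does not depend on τ: both the covariance and the variances scale linearly in τ, so it cancels.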

Example 4

Bivariate mixed models with one marginal for counts and the other for claim amounts appear to be useful in actuarial science. These models can also be constructed using the BLTPM operator. For example, let the random vector be $X={\left(Y,Z\right)}^{\prime }$ with Y discrete and Z continuous and nonnegative. We might specify $\stackrel{^}{h}\left(u\right)={\left(1+u\right)}^{-\tau },\tau >0$ , $\stackrel{^}{f}\left(s\right)=\mathrm{exp}\left(\lambda \left({\text{e}}^{-s}-1\right)\right)$ , which is the LT of a Poisson random variable with $\lambda >0$ , and let

$\stackrel{^}{g}\left(t\right)=\mathrm{exp}\left(\frac{\theta }{\mu }\left(1-\sqrt{1+\frac{2{\mu }^{2}t}{\theta }}\right)\right)$ , $\theta$ and $\mu >0$ which is the LT of an inverse Gaussian distribution.

The BLT of the random vector $X={\left(Y,Z\right)}^{\prime }$ created using the BLTPM operator is

${L}_{X}\left(s,t\right)={\left(1-\lambda \left({\text{e}}^{-s}-1\right)-\frac{\theta }{\mu }\left(1-\sqrt{1+\frac{2{\mu }^{2}t}{\theta }}\right)\right)}^{-\tau }$ .

Setting $t=0$ , we obtain the marginal LT of Y, which is given by ${L}_{Y}\left(s\right)={\left(1-\lambda \left({\text{e}}^{-s}-1\right)\right)}^{-\tau }$ . One can recognize that this is the LT of a negative binomial distribution; for this distribution, the corresponding probability generating function (PGF) is ${P}_{Y}\left(s\right)={\left(1-\lambda \left(s-1\right)\right)}^{-\tau }$ , see Klugman et al. (p. 83) for the PGF of a negative binomial distribution.

The marginal ${L}_{Z}\left(t\right)={\left(1-\frac{\theta }{\mu }\left(1-\sqrt{1+\frac{2{\mu }^{2}t}{\theta }}\right)\right)}^{-\tau }$ is the LT of a continuous random variable.
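As a quick hedged consistency check (parameter values are arbitrary), the marginal LT of a count variable and its PGF are linked by ${L}_{Y}\left(s\right)={P}_{Y}\left({\text{e}}^{-s}\right)$ , which the two closed forms above satisfy:

```python
import math

def l_y(s, lam=1.5, tau=2.0):
    """Marginal LT of Y from Example 4."""
    return (1.0 - lam * (math.exp(-s) - 1.0)) ** -tau

def pgf_y(z, lam=1.5, tau=2.0):
    """Negative binomial PGF P_Y(z) = (1 - lam (z - 1))^(-tau)."""
    return (1.0 - lam * (z - 1.0)) ** -tau
```

The identity holds because the LT of a nonnegative integer-valued variable is its PGF evaluated at ${\text{e}}^{-s}$ .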

Example 5

In this example we shall obtain a bivariate negative binomial distribution using the BLTPM operator. Let $\stackrel{^}{h}\left(u\right)={\left(1+u\right)}^{-\tau },\tau >0$ , $\stackrel{^}{f}\left(s\right)=\mathrm{exp}\left({\lambda }_{1}\left({\text{e}}^{-s}-1\right)\right)$ and $\stackrel{^}{g}\left(t\right)=\mathrm{exp}\left({\lambda }_{2}\left({\text{e}}^{-t}-1\right)\right)$ , ${\lambda }_{1}$ and ${\lambda }_{2}>0$ . Applying the BLTPM operator gives a joint distribution for $X={\left(Y,Z\right)}^{\prime }$ with BLT given by

${L}_{X}\left(s,t\right)={\left(1-{\lambda }_{1}\left({\text{e}}^{-s}-1\right)-{\lambda }_{2}\left({\text{e}}^{-t}-1\right)\right)}^{-\tau }$ .

This is the BLT of a bivariate negative binomial distribution which is also ID. In the literature, this distribution has been constructed using various methods and by many authors; see Mardia (p. 84). We can see that the BLTPM operator is very useful for constructing bivariate distributions with the ID property. This property is useful in finance and actuarial science, as the corresponding bivariate process is a bivariate Lévy process with stationary and independent increments; this often leads to nice results in risk theory and pricing formulas in finance. Simulation of sample paths from these processes is also simplified by this property.

We briefly mention how to simulate from a bivariate distribution with a specified ${L}_{X}\left(s,t\right)$ created by the BLTPM operator. The procedures are similar to the procedures described in Section 5 and given by Marshall and Olkin (p. 840), but the inverse transform method might not be needed explicitly.

For the inverse transform method, see Ross  (pp. 69-70).

The main steps are:

1) Generate an observation for the mixing random variable α from the distribution H which has LT $\stackrel{^}{h}\left(u\right).$

2) Using the observed value α, generate an observation ${X}^{\left(\alpha \right)}$ from a distribution with LT ${\stackrel{^}{f}}^{\alpha }\left(s\right)$ . Often, from the expression of ${\stackrel{^}{f}}^{\alpha }\left(s\right)$ , we have a procedure to simulate from this distribution, and the inverse method might not be needed explicitly at this step.

3) Using the same observed value α, simulate another observation ${Y}^{\left(\alpha \right)}$ , independent of the ${X}^{\left(\alpha \right)}$ obtained in 2), from a distribution with LT ${\stackrel{^}{g}}^{\alpha }\left(t\right)$ .

The pair of observations ${\left({X}^{\left(\alpha \right)},{Y}^{\left(\alpha \right)}\right)}^{\prime }$ obtained will follow a bivariate distribution with BLT as specified by ${L}_{X}\left(s,t\right)$ .
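Applied to Example 5, the three steps take a concrete form: $\stackrel{^}{h}\left(u\right)={\left(1+u\right)}^{-\tau }$ is the LT of a gamma variable with shape τ and rate 1, and ${\stackrel{^}{f}}^{\alpha }\left(s\right)$ , ${\stackrel{^}{g}}^{\alpha }\left(t\right)$ are LTs of Poisson variables with means $\alpha {\lambda }_{1}$ and $\alpha {\lambda }_{2}$ . The sketch below uses illustrative parameter values:

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_bivariate_negbin(n, tau=2.0, lam1=1.0, lam2=0.5):
    """Three-step simulation from the BLT of Example 5."""
    alpha = rng.gamma(shape=tau, scale=1.0, size=n)  # step 1: mixing variable
    y = rng.poisson(alpha * lam1)                    # step 2: Y | alpha ~ Poisson(alpha * lam1)
    z = rng.poisson(alpha * lam2)                    # step 3: Z | alpha, independent of Y
    return y, z

y, z = simulate_bivariate_negbin(200000)
```

Sample moments can be checked against $E\left[Y\right]=\tau {\lambda }_{1}$ and $E\left[Z\right]=\tau {\lambda }_{2}$ , and the shared mixing variable induces a positive correlation between the two counts.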

3.3. LT Copulas

Observe that the BLTPM operator given by expression (2) is similar to the BDSPM operator given by expression 2.2 in Marshall and Olkin (p. 834), from which a class of distribution or survival copulas can be obtained; it is considered below. Without loss of generality, we consider distribution copulas related to the BDSPM operator. Using specified marginal distributions ${F}_{1}$ and ${F}_{2}$ , a new bivariate distribution $F\left(x,y\right)$ with these two marginals can be constructed; it is given by

$F\left(x,y\right)=\phi \left({\phi }^{-1}\left({F}_{1}\left(x\right)\right)+{\phi }^{-1}\left({F}_{2}\left(y\right)\right)\right)$ , (4)

where $\phi$ is a univariate LT with inverse ${\phi }^{-1}$ .

If we let $\phi ={\left(1+s\right)}^{-\tau }$ , which is the LT of a gamma distribution, its inverse is given by ${\phi }^{-1}={s}^{-\frac{1}{\tau }}-1$ , and we obtain the classical distribution or survival Clayton copula, as shown in Example 4.2 of Marshall and Olkin (p. 837). For other copulas of the form given by expression (4), see Shih and Louis.
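To make expression (4) concrete, a small sketch (with arbitrary values) substitutes the gamma LT generator $\phi \left(s\right)={\left(1+s\right)}^{-\tau }$ and its inverse ${\phi }^{-1}\left(u\right)={u}^{-1/\tau }-1$ , which reduces expression (4) to the familiar Clayton form ${\left({u}^{-1/\tau }+{v}^{-1/\tau }-1\right)}^{-\tau }$ :

```python
def phi(s, tau):
    """Gamma LT generator phi(s) = (1 + s)^(-tau)."""
    return (1.0 + s) ** -tau

def phi_inv(u, tau):
    """Inverse generator phi^{-1}(u) = u^(-1/tau) - 1."""
    return u ** (-1.0 / tau) - 1.0

def coupled_cdf(u, v, tau):
    """Expression (4) applied to marginal cdf values u = F1(x), v = F2(y)."""
    return phi(phi_inv(u, tau) + phi_inv(v, tau), tau)
```

Since ${\phi }^{-1}\left(1\right)=0$ , `coupled_cdf(u, 1.0, tau)` returns `u`, so the prescribed marginals are preserved.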

For new bivariate distributions constructed using the BLTPM, we restrict attention to nonnegative distributions. Note that if the marginal distributions are specified by their LTs ${L}_{1}\left(s\right)$ and ${L}_{2}\left(t\right)$ , and subject to the condition that the two functions ${\text{e}}^{-\alpha {\phi }^{-1}\left({L}_{1}\left(s\right)\right)}$ and ${\text{e}}^{-\alpha {\phi }^{-1}\left({L}_{2}\left(t\right)\right)}$ are proper LTs, a BLT of a new distribution with marginal LTs given respectively by ${L}_{1}\left(s\right)$ and ${L}_{2}\left(t\right)$ can also be obtained as

${L}_{X}\left(s,t\right)=\phi \left({\phi }^{-1}\left({L}_{1}\left(s\right)\right)+{\phi }^{-1}\left({L}_{2}\left(t\right)\right)\right)$ , (5)

which is analogous to the procedure given by expression (4) using distribution or survival functions. This also means that some of the distribution or survival copulas of the class given by expression (4) can also be used as LT copulas. A drawback of this LT copula approach, however, is that even when the specified marginals ${L}_{1}\left(s\right)$ and ${L}_{2}\left(t\right)$ are LTs of ID distributions, the ${L}_{X}\left(s,t\right)$ constructed using expression (5) is not necessarily the BLT of a bivariate ID distribution.

It is also more difficult to simulate from a BLT ${L}_{X}\left(s,t\right)$ obtained using the LT copula of expression (5). Although the algorithm for generating a pair of observations from a specified BLT was given above, applying it here requires simulating from the distributions with LTs ${K}_{1}\left(s\right)={\text{e}}^{-\alpha {φ}^{-1}\left({L}_{1}\left(s\right)\right)}$ and ${K}_{2}\left(t\right)={\text{e}}^{-\alpha {φ}^{-1}\left({L}_{2}\left(t\right)\right)}$ , which we might not be able to recognize. The inverse transform method can be applied using approximate quantile functions based on the moment generating functions ${M}_{1}\left(s\right)={\text{e}}^{-\alpha {φ}^{-1}\left({L}_{1}\left(-s\right)\right)}$ and ${M}_{2}\left(t\right)={\text{e}}^{-\alpha {φ}^{-1}\left({L}_{2}\left(-t\right)\right)}$ to obtain a simulated sample with some accuracy. Approximate quantile functions based on moment generating functions can be obtained explicitly using the saddlepoint technique; see Arevalillo  . If these univariate MGFs do not exist, the LTs can be converted to characteristic functions, and the numerical methods based on the inverse method and approximate quantile functions described by Glasserman and Liu  (pp. 1613-1615) can be used to generate a pair of observations with some degree of accuracy.

There is a vast literature on survival or distribution copulas; we mention only a few works here. For a good general review of distribution or survival copula models with emphasis on goodness-of-fit test statistics, see Genest et al.  , and for reviews of inference methods based on survival copulas used in actuarial science, see Klugman et al.  (pp. 187-213) and Frees and Valdez  .

As mentioned earlier, asymptotic theory for maximum entropy empirical likelihood (MEEL) and empirical likelihood (EL) methods is well established for models where ${X}_{1},\cdots ,{X}_{n}$ is a sample of multivariate observations, independent and identically distributed (iid) as $X$ , which follows a d-variate distribution ${F}_{\beta },\beta ={\left({\beta }_{1},\cdots ,{\beta }_{p}\right)}^{\prime }$ , assuming a set of moment conditions or constraints has been chosen; see Qin and Lawless  , Schennach  , Owen  , Mittelhammer et al.  ; also see the discussion in section 3.2 of Luong  (pp. 471-472).

For applications, the question of how to choose moment conditions, or equivalently bases, so that MEEL methods have high efficiency is a relevant one, and we address mainly this issue here for models with closed form BLTs or BMGFs as introduced in the previous sections. The moment conditions are viewed as constraints and can be identified with elements of a chosen basis used to project the true score functions, as in the univariate case; see Luong  (pp. 461-467). Besides estimation, empirical likelihood type methods also give chi-square test statistics for model testing, which is useful in practice. We shall emphasize moment conditions and focus on how to choose constraints for models with closed form BLTs or BMGFs in the next section, as the asymptotic theory is similar to the univariate case and is already discussed in section 3.2 of Luong  .

4. MEEL Methods

Let ${X}_{1},\cdots ,{X}_{n}$ be a sample of bivariate observations, independent and identically distributed as $X={\left(Y,Z\right)}^{\prime }$ , which follows a bivariate distribution ${F}_{\beta },\beta ={\left({\beta }_{1},\cdots ,{\beta }_{p}\right)}^{\prime }$ ; ${F}_{\beta }$ has no closed form, but its BLT or BMGF has a closed form expression, given respectively by ${L}_{\beta }\left(s,t\right)$ or ${M}_{\beta }\left(s,t\right)$ . The true vector of parameters is denoted by ${\beta }_{0}\in \Omega$ . The parameter space $\Omega$ is assumed to be compact.

Suppose that we can identify k constraints of the form ${g}_{1}\left(x;\beta \right),\cdots ,{g}_{k}\left(x;\beta \right)$ , where the functions ${g}_{i}\left(x;\beta \right),i=1,\cdots ,k$ are unbiased estimating functions with the property ${E}_{\beta }\left({g}_{i}\left(x;\beta \right)\right)=0,i=1,\cdots ,k$ . These functions also form a basis $B=\left\{{g}_{1}\left(x;\beta \right),\cdots ,{g}_{k}\left(x;\beta \right)\right\}$ used to project the true score functions, and MEEL estimators can be viewed as quasilikelihood estimators based on the projected score functions; see Luong  (pp. 466-468). Therefore, the constraints are linked to a basis, and the ability to obtain projected score functions close to the true score functions on some restricted but useful parameter space gives MEEL estimators high efficiency in practice.

4.1. Choice of Bases

In this section, we recommend a basis for use when a model BLT is available to extract moment conditions; it can serve as a guideline for forming other bases if needed when applying bivariate MEEL methods.

We define the first five gi’s as follows, making use of the first two moments of X, i.e.,

$\begin{array}{l}{g}_{1}\left(x;\beta \right)=y-{E}_{\beta }\left(y\right),\text{\hspace{0.17em}}\text{\hspace{0.17em}}{g}_{2}\left(x;\beta \right)=z-{E}_{\beta }\left(z\right)\\ {g}_{3}\left(x;\beta \right)={\left(y-{E}_{\beta }\left(y\right)\right)}^{2}-{V}_{\beta }\left(y\right),\text{\hspace{0.17em}}\text{\hspace{0.17em}}{g}_{4}\left(x;\beta \right)={\left(z-{E}_{\beta }\left(z\right)\right)}^{2}-{V}_{\beta }\left(z\right),\\ {g}_{5}\left(x;\beta \right)=\left(y-{E}_{\beta }\left(y\right)\right)\left(z-{E}_{\beta }\left(z\right)\right)-Co{v}_{\beta }\left(y,z\right).\end{array}$ (6)

The variances of y and z are given respectively by ${V}_{\beta }\left(y\right)$ and ${V}_{\beta }\left(z\right)$ , and the covariance between y and z is denoted by $Co{v}_{\beta }\left(y,z\right)$ . Subsequently, we pick the gi’s directly using the model BLT ${L}_{\beta }\left(s,t\right)$ . We first choose four points $\left({s}_{1},{t}_{1}\right),\left({s}_{2},{t}_{2}\right),\left({s}_{3},{t}_{3}\right),\left({s}_{4},{t}_{4}\right)$ on the arc of the circle of radius ${r}_{1}=0.01$ centered at the origin that lies in the nonnegative quadrant, i.e., let

$\left({s}_{1},{t}_{1}\right)={r}_{1}\left(\mathrm{cos}\left(0\right),\mathrm{sin}\left(0\right)\right),\text{\hspace{0.17em}}\text{\hspace{0.17em}}\left({s}_{2},{t}_{2}\right)={r}_{1}\left(\mathrm{cos}\left(\frac{\text{π}}{6}\right),\mathrm{sin}\left(\frac{\text{π}}{6}\right)\right)$

$\left({s}_{3},{t}_{3}\right)={r}_{1}\left(\mathrm{cos}\left(\frac{\text{2π}}{6}\right),\mathrm{sin}\left(\frac{\text{2π}}{6}\right)\right),\text{\hspace{0.17em}}\text{\hspace{0.17em}}\left({s}_{4},{t}_{4}\right)={r}_{1}\left(\mathrm{cos}\left(\frac{\text{3π}}{6}\right),\mathrm{sin}\left(\frac{\text{3π}}{6}\right)\right)$

$\begin{array}{l}{g}_{6}\left(x;\beta \right)={\text{e}}^{-{s}_{1}y-{t}_{1}z}-{L}_{\beta }\left({s}_{1},{t}_{1}\right),\text{\hspace{0.17em}}\text{\hspace{0.17em}}{g}_{7}\left(x;\beta \right)={\text{e}}^{-{s}_{2}y-{t}_{2}z}-{L}_{\beta }\left({s}_{2},{t}_{2}\right),\\ {g}_{8}\left(x;\beta \right)={\text{e}}^{-{s}_{3}y-{t}_{3}z}-{L}_{\beta }\left({s}_{3},{t}_{3}\right),\text{\hspace{0.17em}}\text{\hspace{0.17em}}{g}_{9}\left(x;\beta \right)={\text{e}}^{-{s}_{4}y-{t}_{4}z}-{L}_{\beta }\left({s}_{4},{t}_{4}\right).\end{array}$
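As a numerical sketch (assuming the convention ${L}_{\beta }\left(s,t\right)=E\left[{\text{e}}^{-sY-tZ}\right]$ and, purely for illustration, independent unit exponential marginals with $L\left(s,t\right)=1/\left(\left(1+s\right)\left(1+t\right)\right)$ , which is not one of the paper's models), the points on the arc and the LT-based estimating functions can be checked: under the true model, the sample means of these ${g}_{i}$ ’s should be close to zero:

```python
import numpy as np

rng = np.random.default_rng(0)

# Quarter-circle points of radius r1 = 0.01 in the nonnegative quadrant.
r1 = 0.01
angles = np.array([0.0, np.pi / 6, 2 * np.pi / 6, 3 * np.pi / 6])
points = [(r1 * np.cos(a), r1 * np.sin(a)) for a in angles]

def g_lt(y, z, s, t, L):
    """Estimating function e^{-sy-tz} - L(s,t); unbiased when L is the true BLT."""
    return np.exp(-s * y - t * z) - L(s, t)

# Illustrative model: independent unit exponentials, L(s,t) = 1/((1+s)(1+t)).
L = lambda s, t: 1.0 / ((1.0 + s) * (1.0 + t))
y = rng.exponential(size=100_000)
z = rng.exponential(size=100_000)
means = [g_lt(y, z, s, t, L).mean() for s, t in points]
```

Because the points are close to the origin, ${\text{e}}^{-sy-tz}$ has small variance there, so the sample means are tightly concentrated around zero.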

We might add another four points on the arc of the circle of radius ${r}_{2}=0.02$ centered at the origin in the nonnegative quadrant, letting

$\left({s}_{5},{t}_{5}\right)={r}_{2}\left(\mathrm{cos}\left(0\right),\mathrm{sin}\left(0\right)\right),\text{\hspace{0.17em}}\text{\hspace{0.17em}}\left({s}_{6},{t}_{6}\right)={r}_{2}\left(\mathrm{cos}\left(\frac{\text{π}}{6}\right),\mathrm{sin}\left(\frac{\text{π}}{6}\right)\right)$

$\left({s}_{7},{t}_{7}\right)={r}_{2}\left(\mathrm{cos}\left(\frac{\text{2π}}{6}\right),\mathrm{sin}\left(\frac{\text{2π}}{6}\right)\right),\text{\hspace{0.17em}}\text{\hspace{0.17em}}\left({s}_{8},{t}_{8}\right)={r}_{2}\left(\mathrm{cos}\left(\frac{\text{3π}}{6}\right),\mathrm{sin}\left(\frac{\text{3π}}{6}\right)\right)$

$\begin{array}{l}{g}_{10}\left(x;\beta \right)={\text{e}}^{-{s}_{5}y-{t}_{5}z}-{L}_{\beta }\left({s}_{5},{t}_{5}\right),\text{\hspace{0.17em}}\text{\hspace{0.17em}}{g}_{11}\left(x;\beta \right)={\text{e}}^{-{s}_{6}y-{t}_{6}z}-{L}_{\beta }\left({s}_{6},{t}_{6}\right),\\ {g}_{12}\left(x;\beta \right)={\text{e}}^{-{s}_{7}y-{t}_{7}z}-{L}_{\beta }\left({s}_{7},{t}_{7}\right),\text{\hspace{0.17em}}\text{\hspace{0.17em}}{g}_{13}\left(x;\beta \right)={\text{e}}^{-{s}_{8}y-{t}_{8}z}-{L}_{\beta }\left({s}_{8},{t}_{8}\right).\end{array}$

If needed, pick another four points on the arc of the circle of radius ${r}_{3}=0.03$ centered at the origin in the nonnegative quadrant, letting

$\left({s}_{9},{t}_{9}\right)={r}_{3}\left(\mathrm{cos}\left(0\right),\mathrm{sin}\left(0\right)\right),\text{\hspace{0.17em}}\text{\hspace{0.17em}}\left({s}_{10},{t}_{10}\right)={r}_{3}\left(\mathrm{cos}\left(\frac{\text{π}}{6}\right),\mathrm{sin}\left(\frac{\text{π}}{6}\right)\right)$

$\left({s}_{11},{t}_{11}\right)={r}_{3}\left(\mathrm{cos}\left(\frac{\text{2π}}{6}\right),\mathrm{sin}\left(\frac{\text{2π}}{6}\right)\right),\text{\hspace{0.17em}}\text{\hspace{0.17em}}\left({s}_{12},{t}_{12}\right)={r}_{3}\left(\mathrm{cos}\left(\frac{\text{3π}}{6}\right),\mathrm{sin}\left(\frac{\text{3π}}{6}\right)\right)$

$\begin{array}{l}{g}_{14}\left(x;\beta \right)={\text{e}}^{-{s}_{9}y-{t}_{9}z}-{L}_{\beta }\left({s}_{9},{t}_{9}\right),\text{\hspace{0.17em}}\text{\hspace{0.17em}}{g}_{15}\left(x;\beta \right)={\text{e}}^{-{s}_{10}y-{t}_{10}z}-{L}_{\beta }\left({s}_{10},{t}_{10}\right),\\ {g}_{16}\left(x;\beta \right)={\text{e}}^{-{s}_{11}y-{t}_{11}z}-{L}_{\beta }\left({s}_{11},{t}_{11}\right),\text{\hspace{0.17em}}\text{\hspace{0.17em}}{g}_{17}\left(x;\beta \right)={\text{e}}^{-{s}_{12}y-{t}_{12}z}-{L}_{\beta }\left({s}_{12},{t}_{12}\right).\end{array}$

With all these gi’s, the basis can be formed and given by

${B}_{LT}=\left\{{g}_{1}\left(x;\beta \right),\cdots ,{g}_{k}\left(x;\beta \right)\right\},k=17.$ (7)

Therefore, if the number of parameters m in the model satisfies $m<k$ , bivariate MEEL methods can be used for estimation and model testing using this basis to generate constraints. Note that the choices of ${g}_{i}\left(x;\beta \right),i=1,\cdots ,17$ do not put restrictions on the parameter space, as the model BLT is well defined at every point $s={\left(s,t\right)}^{\prime }$ of the nonnegative quadrant, whatever the value of $\beta$ . This might not be the case with the BMGF ${M}_{\beta }\left(s,t\right)$ , where the domain of s might depend on $\beta$ . We must ensure that, if restrictions are imposed, the restricted parameter space is all we need for applications. The choice of points is based on the intuitive ground that the BLT contains more information at points near the origin, and obviously there is some arbitrariness in the choice of points, or equivalently of moment conditions, when using MEEL methods.

If we use moment conditions based on the BMGF, such as for estimation of the jump-diffusion model given by example 2, in general we might want to keep the first five ${g}_{i}\left(x;\beta \right)$ ’s as given by expression (6) for the BLT and add 12 additional gi’s with points chosen on the circle of radius $r=0.01$ centered at the origin, without restriction to the first quadrant, and define

$\left({s}_{j},{t}_{j}\right)=r\left(\mathrm{cos}\left(\frac{j\text{π}}{6}\right),\mathrm{sin}\left(\frac{j\text{π}}{6}\right)\right),\text{\hspace{0.17em}}j=0,1,2,\cdots ,11$ , which leads to defining

$\begin{array}{l}{g}_{6}\left(x;\beta \right)={\text{e}}^{{s}_{0}{z}_{1}+{t}_{0}{z}_{2}}-{M}_{\beta }\left({s}_{0},{t}_{0}\right),\\ {g}_{7}\left(x;\beta \right)={\text{e}}^{{s}_{1}{z}_{1}+{t}_{1}{z}_{2}}-{M}_{\beta }\left({s}_{1},{t}_{1}\right),\\ \text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}⋮\\ {g}_{17}\left(x;\beta \right)={\text{e}}^{{s}_{11}{z}_{1}+{t}_{11}{z}_{2}}-{M}_{\beta }\left({s}_{11},{t}_{11}\right).\end{array}$

Consequently, the basis

${B}_{MGF}=\left\{{g}_{1}\left(x;\beta \right),\cdots ,{g}_{5}\left(x;\beta \right),{g}_{6}\left(x;\beta \right),\cdots ,{g}_{17}\left(x;\beta \right)\right\}$ (8)

which makes use of the model BMGF ${M}_{\beta }\left(s,t\right)$ can be used for generating constraints.

Note that the BMGF for the jump-diffusion model given by example 2 is well defined at all points $s={\left(s,t\right)}^{\prime }$ . Note also that a basis of $k=17$ elements allows the number of parameters $p<17$ . It appears that such a basis gives a good balance between efficiency and numerical tractability for the MEEL methods, and it can be used as a guideline to form other bases if necessary in practice.

4.2. Estimation and Model Testing

Asymptotic properties of multivariate MEEL methods and of EL methods for models with independent and identically distributed vectors of observations are well established; see Qin and Lawless  , Mittelhammer et al.  (p. 323), Schennach  . Now if we use the constraints defined by a basis $B=\left\{{g}_{1},\cdots ,{g}_{k}\right\}$ , the MEEL estimators are obtained as the vector $\stackrel{^}{\beta }$ which maximizes the entropy, or equivalently minimizes

${\sum }_{i=1}^{n}{\pi }_{i}^{*}\left(\beta \right)\mathrm{ln}{\pi }_{i}^{*}\left(\beta \right)$ (9)

with

${\pi }_{i}^{*}=\frac{\mathrm{exp}\left(-{\sum }_{j=1}^{k}{\lambda }_{j}\left(\beta \right){g}_{j}\left({x}_{i};\beta \right)\right)}{{\sum }_{l=1}^{n}\mathrm{exp}\left(-{\sum }_{j=1}^{k}{\lambda }_{j}\left(\beta \right){g}_{j}\left({x}_{l};\beta \right)\right)},\text{\hspace{0.17em}}i=1,\cdots ,n$ (10)

subject to

${\sum }_{i=1}^{n}{\pi }_{i}^{*}\left(\lambda ,\beta \right)\left({g}_{j}\left({x}_{i};\beta \right)\right)=0,\text{\hspace{0.17em}}j=1,\cdots ,k$ . (11)

The ${\lambda }_{j}$ ’s depend on $\beta$ and are only implicitly defined by expression (11).
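The weights of expression (10) and the constraint sums of expression (11) are direct to compute; a minimal sketch (function names are ours), where G is the $n\times k$ matrix with row i equal to $g\left({x}_{i};\beta \right)$ :

```python
import numpy as np

def meel_weights(lam, G):
    """MEEL weights of expression (10):
    pi_i* = exp(-lam' g(x_i; beta)) / sum_l exp(-lam' g(x_l; beta)),
    where G is the (n, k) matrix with rows g(x_i; beta) and lam has length k."""
    w = np.exp(-(G @ lam))
    return w / w.sum()

def constraint_values(lam, G):
    """Left-hand sides of the k constraints of expression (11)."""
    return meel_weights(lam, G) @ G
```

At $\lambda =0$ the weights reduce to the uniform weights $1/n$ , so the constraint values are simply the sample means of the ${g}_{j}$ ’s; solving expression (11) amounts to tilting the weights until these means vanish.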

Under standard regularity conditions as indicated in the references just listed (see also section 3.2 of Luong  (pp. 471-472)), we have consistency and asymptotic normality of the MEEL estimators $\stackrel{^}{\beta }$ , i.e., $\stackrel{^}{\beta }\stackrel{p}{\to }{\beta }_{0}$ , where ${\beta }_{0}$ is the vector of the true parameters, and

$\sqrt{n}\left(\stackrel{^}{\beta }-{\beta }_{0}\right)\stackrel{L}{\to }N\left(0,\Omega \right)$ ,

$\Omega ={\left[E\left[{\frac{\partial g\left(x,\beta \right)}{\partial \beta }|}_{\beta ={\beta }_{0}}\right]{\left({E\left[g\left(x,\beta \right)g{\left(x,\beta \right)}^{\prime }\right]|}_{\beta ={\beta }_{0}}\right)}^{-1}E\left[{\frac{\partial g\left(x,\beta \right)}{\partial {\beta }^{\prime }}|}_{\beta ={\beta }_{0}}\right]\right]}^{-1}$ , (12)

$g\left(x,\beta \right)={\left({g}_{1}\left(x,\beta \right),\cdots ,{g}_{k}\left(x,\beta \right)\right)}^{\prime },\text{\hspace{0.17em}}\Sigma \left({\beta }_{0}\right)={E\left[g\left(x,\beta \right)g{\left(x,\beta \right)}^{\prime }\right]|}_{\beta ={\beta }_{0}}.$

An estimator $\stackrel{^}{\Omega }$ for $\Omega$ is given below,

$\begin{array}{c}\stackrel{^}{\Omega }=\left[\left[{\sum }_{i=1}^{n}{\stackrel{^}{\pi }}_{i}{\frac{\partial g\left({x}_{i},\beta \right)}{\partial \beta }|}_{\beta =\stackrel{^}{\beta }}\right]{\left({\sum }_{i=1}^{n}{{\stackrel{^}{\pi }}_{i}g\left({x}_{i},\beta \right)g{\left({x}_{i},\beta \right)}^{\prime }|}_{\beta =\stackrel{^}{\beta }}\right)}^{-1}\\ \text{\hspace{0.17em}}\text{\hspace{0.17em}}{\cdot \left[{\sum }_{i=1}^{n}{\stackrel{^}{\pi }}_{i}{\frac{\partial g\left({x}_{i},\beta \right)}{\partial {\beta }^{\prime }}|}_{\beta =\stackrel{^}{\beta }}\right]\right]}^{-1}\end{array}$

with ${\stackrel{^}{\pi }}_{i}={\pi }_{i}^{*}\left(\stackrel{^}{\beta }\right),i=1,\cdots ,n$ . If we let ${\stackrel{^}{\pi }}_{i}=\frac{1}{n},i=1,\cdots ,n$ , we obtain another consistent estimator for $\Omega$ . The asymptotic results are very similar to those of the univariate case given by Luong  , but for completeness they are reproduced here to make the paper easier to follow.

Model testing is viewed as testing the validity of the moment conditions, i.e., testing the null hypothesis ${H}_{0}:{E}_{\beta }\left({g}_{j}\left(x;\beta \right)\right)=0,j=1,\cdots ,k$ , just as in the univariate case; the expectations are taken under the true parametric model.

The test statistic given below has an asymptotic chi-square distribution with $r=k-p$ degrees of freedom, i.e.,

$2nKL\left({\pi }^{*}\left(\stackrel{^}{\beta }\right),{p}_{n}\right)=2n\left({\sum }_{i=1}^{n}{\pi }_{i}^{*}\left(\stackrel{^}{\beta }\right)\left[\mathrm{log}\left({\pi }_{i}^{*}\left(\stackrel{^}{\beta }\right)\right)-\mathrm{log}\left(\frac{1}{n}\right)\right]\right)\stackrel{L}{\to }{\chi }^{2}\left(k-p\right)$ . (13)

This test might be useful for testing bivariate normality in empirical finance. It is based on the MEEL estimators $\stackrel{^}{\beta }$ , and since the basis used spans the true score functions of the bivariate normal model, $\stackrel{^}{\beta }$ is as efficient as the maximum likelihood estimator $\stackrel{^}{{\beta }_{ML}}$ , the projected score functions being identical to the true score functions. Furthermore, the asymptotic chi-square distributions make these test statistics practical to use: unlike other methods, we do not need intensive simulations to implement the model testing procedures. Also, MEEL methods can be applied to models with a singular part in the domain, provided that the model BLT has a closed form expression.
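The statistic of expression (13) is immediate to compute from the fitted weights; a minimal sketch (assuming the weights ${\pi }_{i}^{*}\left(\stackrel{^}{\beta }\right)$ have already been obtained):

```python
import numpy as np

def meel_test_statistic(pi_star):
    """2n KL(pi*, p_n) of expression (13): asymptotically chi-square(k - p)
    under H0, where pi_star are the fitted MEEL weights and p_n is uniform."""
    n = len(pi_star)
    return 2.0 * n * np.sum(pi_star * (np.log(pi_star) - np.log(1.0 / n)))
```

Uniform weights give a statistic of zero, while any tilting away from the uniform weights gives a positive value, since the Kullback-Leibler divergence is nonnegative.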

4.3. Numerical Implementations

As in the univariate case, a penalty approach can be used: we perform unconstrained minimization of the following objective function with respect to $\lambda ={\left({\lambda }_{1},\cdots ,{\lambda }_{k}\right)}^{\prime }$ and ${\beta }_{1},\cdots ,{\beta }_{p}$ ,

$\begin{array}{l}{\sum }_{i=1}^{n}{\pi }_{i}^{*}\left(\lambda ,\beta \right)\mathrm{ln}{\pi }_{i}^{*}\left(\lambda ,\beta \right)\\ +\frac{K}{2}\left[{\left({\sum }_{i=1}^{n}{\pi }_{i}^{*}\left(\lambda ,\beta \right)\left[{g}_{1}\left({x}_{i},\beta \right)\right]\right)}^{2}+\cdots +{\left({\sum }_{i=1}^{n}{\pi }_{i}^{*}\left(\lambda ,\beta \right)\left[{g}_{k}\left({x}_{i},\beta \right)\right]\right)}^{2}\right]\end{array}$ (14)

The penalty constant K is a large positive value. Direct search algorithms, which are derivative free, are recommended, as they are also more stable.
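As a toy illustration of the penalty approach (our own minimal example, not the paper's bivariate model: a single parameter $\mu$ with the single constraint ${g}_{1}\left(x;\mu \right)=x-\mu$ , so the global minimizer of expression (14) is $\lambda =0$ with $\mu$ equal to the sample mean), the penalized objective can be minimized with a derivative-free Nelder-Mead search:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
x = rng.exponential(scale=2.0, size=500)
K = 100.0  # penalty constant

def objective(theta):
    """Penalized MEEL objective of expression (14) for the toy model."""
    lam, mu = theta
    g = x - mu
    a = -lam * g
    a -= a.max()            # stabilize the exponentials; weights are unchanged
    w = np.exp(a)
    p = w / w.sum()
    return np.sum(p * np.log(p)) + 0.5 * K * np.dot(p, g) ** 2

# Derivative-free direct search, as recommended above.
res = minimize(objective, x0=[0.1, 1.0], method="Nelder-Mead")
lam_hat, mu_hat = res.x
```

The recovered `mu_hat` should be close to `x.mean()` with `lam_hat` near zero; in a real application, the vector θ would collect all the λ’s and β’s jointly.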

It is important to note that these algorithms find only a local minimizer each time, so some strategy is needed to identify the global minimizer. Often we need a starting vector close to the vector of estimators to initialize the algorithm; this is important when working with real data, as bivariate models might have many parameters. The following strategy can be adopted to find a good starting vector. Using the two submodels based on the two marginal distributions of the model separately, we fit univariate MEEL methods as described in Luong  with $\left\{{y}_{i},i=1,\cdots ,n\right\}$ and $\left\{{z}_{i},i=1,\cdots ,n\right\}$ to obtain respectively $\stackrel{^}{{\beta }_{1}^{\left(0\right)}}$ , the vector which estimates the parameter vector ${\beta }_{1}$ of the first marginal model, and $\stackrel{^}{{\beta }_{2}^{\left(0\right)}}$ , which estimates the parameter vector ${\beta }_{2}$ of the second marginal model. The parameter vector $\beta$ of the bivariate model can be partitioned into three components, i.e.,

$\beta ={\left({\beta }_{1},{\beta }_{2},{\beta }_{3}\right)}^{\prime }$ . Now if we apply MEEL methods to the bivariate model, fixing ${\beta }_{1}=\stackrel{^}{{\beta }_{1}^{\left(0\right)}},{\beta }_{2}=\stackrel{^}{{\beta }_{2}^{\left(0\right)}}$ and estimating only ${\beta }_{3}$ , the resulting estimate of ${\beta }_{3}$ is denoted by $\stackrel{^}{{\beta }_{3}^{\left(0\right)}}$ . We can then construct the starting vector

${\stackrel{^}{\beta }}^{\left(0\right)}={\left(\stackrel{^}{{\beta }_{1}^{\left(0\right)}},\stackrel{^}{{\beta }_{2}^{\left(0\right)}},\stackrel{^}{{\beta }_{3}^{\left(0\right)}}\right)}^{\prime }$ , ${\stackrel{^}{\lambda }}^{\left(0\right)}=\left(0,\cdots ,0\right)$ , and fit the bivariate model by the penalty method, minimizing the objective function given by expression (14) jointly with respect to the vectors $\lambda$ and $\beta$ .

5. A Simulation Study

In this section we perform a limited simulation study for the model given by example 1. We compare the performance of the method of moments (MM) estimators with the MEEL estimators using the basis ${B}_{MEEL}$ with $k=9$ elements, formed from the first 9 elements of the basis given by expression (7), as the model has only five parameters given by the vector

$\beta ={\left({φ}_{1},{φ}_{2},{φ}_{12},{\beta }_{1},{\beta }_{2}\right)}^{\prime }$ ; these parameters are introduced below. We use the model of example 1, specifying an exponential distribution for U with

density function $f\left(u;{\beta }_{1}\right)=\frac{1}{{\beta }_{1}}{\text{e}}^{-\text{\hspace{0.17em}}\frac{u}{{\beta }_{1}}},{\beta }_{1}>0$ and another exponential distribution for V with density function $f\left(v;{\beta }_{2}\right)=\frac{1}{{\beta }_{2}}{\text{e}}^{-\text{\hspace{0.17em}}\frac{v}{{\beta }_{2}}},{\beta }_{2}>0$ . The random

vector $N={\left({N}_{1},{N}_{2}\right)}^{\prime }$ follows a bivariate Poisson distribution with parameters ${φ}_{1},{φ}_{2},{φ}_{12}$ , all positive. We have n independent and identically distributed observations

${X}_{i}={\left({Y}_{i},{Z}_{i}\right)}^{\prime },i=1,2,\cdots ,n$ , with the same distribution as $X={\left(Y,Z\right)}^{\prime }$ . Using the model as specified, the following first two moments of X can be obtained; they are also given by Partrat  (p. 221),

$E\left(Y\right)={φ}_{1}{\beta }_{1},\text{\hspace{0.17em}}E\left(Z\right)={φ}_{2}{\beta }_{2},V\left(Y\right)=2{φ}_{1}{\beta }_{1}^{2},\text{\hspace{0.17em}}V\left(Z\right)=2{φ}_{2}{\beta }_{2}^{2},\text{\hspace{0.17em}}Cov\left(Y,Z\right)={φ}_{12}{\beta }_{1}{\beta }_{2}$ (15)

where $E\left(\cdot \right),V\left(\cdot \right)$ and $Cov\left(\cdot \right)$ denote respectively the mean, the variance and the covariance of the expression inside the parentheses. Now if we replace $E\left(Y\right)$ , $E\left(Z\right)$ , $V\left(Y\right)$ , $V\left(Z\right)$ and $Cov\left(Y,Z\right)$ by their sample counterparts $\stackrel{¯}{Y},\stackrel{¯}{Z},{s}_{Y}^{2},{s}_{Z}^{2}$ and ${s}_{YZ}$ in expression (15) and solve the resulting system of equations, we obtain the MM estimators given by the vector $\stackrel{˜}{\beta }={\left(\stackrel{˜}{{φ}_{1}},\stackrel{˜}{{φ}_{2}},\stackrel{˜}{{φ}_{12}},\stackrel{˜}{{\beta }_{1}},\stackrel{˜}{{\beta }_{2}}\right)}^{\prime }$ with

$\stackrel{˜}{{\beta }_{1}}=\frac{{s}_{Y}^{2}}{2\stackrel{¯}{Y}},\text{\hspace{0.17em}}\stackrel{˜}{{\beta }_{2}}=\frac{{s}_{Z}^{2}}{2\stackrel{¯}{Z}},\text{\hspace{0.17em}}\stackrel{˜}{{φ}_{1}}=\frac{\stackrel{¯}{Y}}{\stackrel{˜}{{\beta }_{1}}},\text{\hspace{0.17em}}\stackrel{˜}{{φ}_{2}}=\frac{\stackrel{¯}{Z}}{\stackrel{˜}{{\beta }_{2}}},\text{\hspace{0.17em}}\stackrel{˜}{{φ}_{12}}=\frac{{s}_{YZ}}{\stackrel{˜}{{\beta }_{1}}\stackrel{˜}{{\beta }_{2}}}$
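The closed form MM estimators above are immediate to compute from sample moments; a minimal sketch (the function name is ours):

```python
import numpy as np

def mm_estimates(y, z):
    """Method-of-moments estimators obtained by matching expression (15)
    with the sample moments ybar, zbar, s2_Y, s2_Z and s_YZ."""
    ybar, zbar = y.mean(), z.mean()
    s2y, s2z = y.var(ddof=1), z.var(ddof=1)
    syz = np.cov(y, z, ddof=1)[0, 1]
    beta1 = s2y / (2.0 * ybar)
    beta2 = s2z / (2.0 * zbar)
    phi1 = ybar / beta1
    phi2 = zbar / beta2
    phi12 = syz / (beta1 * beta2)
    return phi1, phi2, phi12, beta1, beta2
```

By construction, the returned values satisfy the five moment equations of expression (15) exactly with the sample moments in place of the population moments.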

For the MEEL methods we use the basis ${B}_{MEEL}=\left\{{g}_{1},\cdots ,{g}_{9}\right\}$ , keeping only the first 9 elements of the basis given by expression (7).

The model BLT is

${L}_{X}\left(s,t\right)=\mathrm{exp}\left({φ}_{1}\left(\frac{1}{1+{\beta }_{1}s}-1\right)+{φ}_{2}\left(\frac{1}{1+{\beta }_{2}t}-1\right)+{φ}_{12}\left(\frac{1}{1+{\beta }_{1}s}\frac{1}{1+{\beta }_{2}t}-1\right)\right)$ ,

see Partrat  (p. 221) and example 1 in section 2. Concerning efficiency, observe that the quasi-score functions of the MM estimators belong to the linear space spanned by ${g}_{1},\cdots ,{g}_{5}$ ; based on asymptotic theory, the MEEL estimators will already be more efficient than the MM estimators in large samples, as ${g}_{1},\cdots ,{g}_{5}$ belong to the basis used for the MEEL methods; see Luong  for quasi-score functions related to MEEL estimation. Another advantage of the MEEL methods over the method of moments (MM) is that a chi-square statistic with four degrees of freedom is available for model testing, as the methods provide a unified approach to estimation and model testing.

We conduct a limited simulation study using this model. To simulate an observation from the bivariate Poisson distribution we use a trivariate reduction technique based on the following representation in distribution. We first simulate an observation ${R}_{12}$ from a Poisson distribution with parameter ${\lambda }_{12}$ , then an independent observation ${R}_{1}$ from a Poisson distribution with parameter ${\lambda }_{1}$ , and another observation ${R}_{2}$ , independent of the first two, from a Poisson distribution with parameter ${\lambda }_{2}$ . Finally, we construct the simulated vector of observations as

$N={\left({N}_{1},{N}_{2}\right)}^{\prime }$ with components ${N}_{1}{=}^{d}{R}_{1}+{R}_{12}$ and ${N}_{2}{=}^{d}{R}_{2}+{R}_{12}$ , where the equalities hold in distribution. The vector $N={\left({N}_{1},{N}_{2}\right)}^{\prime }$ then follows a bivariate Poisson distribution with parameters ${φ}_{1}={\lambda }_{1}+{\lambda }_{12}$ , ${φ}_{2}={\lambda }_{2}+{\lambda }_{12}$ , ${φ}_{12}={\lambda }_{12}$ ; see Johnson et al.  (pp. 124-126) for this representation. To simulate an observation ${X}_{i}=\left({Y}_{i},{Z}_{i}\right),i=1,\cdots ,n$ , we use the random sum representations of ${Y}_{i}$ and ${Z}_{i}$ , so it suffices to simulate from exponential distributions and perform the appropriate summations.
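A sketch of the simulation scheme just described (the parameter values below are illustrative), combining the trivariate reduction for $N$ with the random sums of exponentials:

```python
import numpy as np

def simulate_pairs(n, lam1, lam2, lam12, beta1, beta2, rng):
    """Simulate n observations of X = (Y, Z)': trivariate reduction gives the
    bivariate Poisson counts, then Y and Z are random sums of exponentials."""
    r12 = rng.poisson(lam12, size=n)
    n1 = rng.poisson(lam1, size=n) + r12  # N1 = R1 + R12
    n2 = rng.poisson(lam2, size=n) + r12  # N2 = R2 + R12
    y = np.array([rng.exponential(beta1, size=k).sum() for k in n1])
    z = np.array([rng.exponential(beta2, size=k).sum() for k in n2])
    return y, z
```

With ${\lambda }_{1}={\lambda }_{2}=1$ , ${\lambda }_{12}=0.5$ and ${\beta }_{1}={\beta }_{2}=2$ , for instance, expression (15) gives $E\left(Y\right)={φ}_{1}{\beta }_{1}=3$ and $Cov\left(Y,Z\right)={φ}_{12}{\beta }_{1}{\beta }_{2}=2$ , which the simulated samples reproduce.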

We use M = 100 samples, each of size $n=1000$ . For the range of parameters, we fix ${\beta }_{1}={\beta }_{2}$ and let ${\beta }_{1}$ vary over $1,2,\cdots ,10$ ; for the parameters of the bivariate Poisson distribution we fix ${φ}_{12}=0.1$ , ${φ}_{1}={φ}_{2}$ and let ${φ}_{1}=1.1,3.1,4.1,5.1,6.1,8.1,10.1$ . Similarly, we fix ${\beta }_{1}={\beta }_{2}$ with ${\beta }_{1}$ varying over $1,2,\cdots ,10$ , fix ${φ}_{12}=0.2$ , ${φ}_{1}={φ}_{2}$ , and let ${φ}_{1}=1.2,3.2,4.2,5.2,6.2,8.2,10.2$ .

The mean square errors (MSE) are estimated using the simulated samples. The mean square error of an estimator $\stackrel{^}{\pi }$ of ${\pi }_{0}$ is defined as $MSE\left(\stackrel{^}{\pi }\right)=E{\left(\stackrel{^}{\pi }-{\pi }_{0}\right)}^{2}$ . We use the following criterion to compare the efficiencies of the MEEL estimators versus the MM estimators:

$ARE=\frac{MSE\left(\stackrel{^}{{φ}_{1}}\right)+MSE\left(\stackrel{^}{{φ}_{2}}\right)+MSE\left(\stackrel{^}{{φ}_{12}}\right)+MSE\left(\stackrel{^}{{\beta }_{1}}\right)+MSE\left(\stackrel{^}{{\beta }_{2}}\right)}{MSE\left(\stackrel{˜}{{φ}_{1}}\right)+MSE\left(\stackrel{˜}{{φ}_{2}}\right)+MSE\left(\stackrel{˜}{{φ}_{12}}\right)+MSE\left(\stackrel{˜}{{\beta }_{1}}\right)+MSE\left(\stackrel{˜}{{\beta }_{2}}\right)}$

The results are displayed in Table 1, where we observe that overall the MEEL estimators are two to four times more efficient than the MM estimators, with the ARE estimated using the simulated samples. We also observe that each individual MSE estimate is smaller for the MEEL estimators than for their MM counterparts. Furthermore, we also have a chi-square goodness-of-fit statistic for model selection for a model without closed form density function; this model testing statistic can be useful for applications in general.

Table 1. Overall relative efficiency of MEEL estimators versus MM estimators $\left({φ}_{12}=0.2,{φ}_{1}={φ}_{2},{\beta }_{1}={\beta }_{2}\right)$ .

Although the study is limited, as we do not have large computing resources, it does confirm the asymptotic results on the efficiency of the MEEL estimators. More work needs to be done to assess the efficiency of MEEL methods for various parametric models, especially in finite samples.

6. Conclusion

MEEL estimators using the proposed moment conditions or bases appear to have the potential for higher efficiency than MM estimators in general. The methods also provide chi-square goodness-of-fit test statistics for model testing. These features make the methods useful for inference on bivariate distributions with closed form BLTs or BMGFs but without closed form bivariate density functions, for which the implementation of likelihood methods might be difficult.

Acknowledgements

The helpful and constructive comments of a referee, which led to an improvement of the presentation of the paper, and the support from the editorial staff of Open Journal of Statistics in processing the paper are gratefully acknowledged.

Conflicts of Interest

The authors declare no conflicts of interest.

Cite this paper

Luong, A. (2018) Maximum Entropy Empirical Likelihood Methods Based on Bivariate Laplace Transforms and Moment Generating Functions. Open Journal of Statistics, 8, 264-283. doi: 10.4236/ojs.2018.82017.

Ruijter, M.J. and Oosterlee, C.W. (2012) Two-Dimensional Fourier Cosine Series Expansion Method for Pricing Financial Options. SIAM Journal on Scientific Computing, 34, B642-B671. https://doi.org/10.1137/120862053
Partrat, C. (1995) Compound Model for Two Dependent Kinds of Claims. Insurance: Mathematics and Economics, 15, 219-231. https://doi.org/10.1016/0167-6687(94)90796-X
Crowder, M. (2012) Multivariate Survival Analysis and Competing Risks. CRC Press, Boca Raton. https://doi.org/10.1201/b11893
Abate, J. and Whitt, W. (1996) An Operational Calculus for Probability Distributions. Advances in Applied Probability, 28, 75-113. https://doi.org/10.2307/1427914
Shih, J.H. and Louis, T.A. (1995) Inferences of the Association Parameter in Copula Models for Bivariate Survival Data. Biometrics, 51, 1384-1399. https://doi.org/10.2307/2533269
Marshall, A.W. and Olkin, I. (1988) Families of Multivariate Distributions. Journal of the American Statistical Association, 83, 834-841. https://doi.org/10.1080/01621459.1988.10478671
Luong, A. (2017) Maximum Entropy Empirical Likelihood Methods Based on Laplace Transforms for Nonnegative Continuous Distribution with Actuarial Applications. Open Journal of Statistics, 7, 459-482. https://doi.org/10.4236/ojs.2017.73033
Marshall, A.W. and Olkin, I. (1967) A Multivariate Exponential Distribution. Journal of the American Statistical Association, 62, 30-44. https://doi.org/10.1080/01621459.1967.10482885
Assaf, D., Langberg, N.A., Savits, T.H. and Shaked, M. (1984) Multivariate Phase Type Distributions. Operations Research, 32, 688-702. https://doi.org/10.1287/opre.32.3.688
Dwass, M. and Teicher, H. (1957) On Infinitely Divisible Random Vectors. Annals of Mathematical Statistics, 28, 461-470. https://doi.org/10.1214/aoms/1177706974
Feller, W. (1971) An Introduction to Probability Theory and Its Applications. Volume 2, Wiley, New York.
McNeil, A.J., Frey, R. and Embrechts, P. (2005) Quantitative Risk Management: Concepts, Techniques and Tools. Princeton University Press, Princeton.
McAssey, M.P. (2012) An Empirical Goodness-of-Fit Test for Multivariate Distributions. Journal of Applied Statistics, 40, 1120-1131. https://doi.org/10.1080/02664763.2013.780160
Mudholkar, G.S., McDermott, M. and Srivastava, D.K. (1992) A Test of p-Variate Normality. Biometrika, 79, 850-854. https://doi.org/10.1093/biomet/79.4.850
Moore, D.S. and Stubblebine, J.B. (1981) Chi-Square Tests for Multivariate Normality with Applications to Common Stock Prices. Communications in Statistics, Theory and Methods A, 10, 713-738. https://doi.org/10.1080/03610928108828070
Qin, J. and Lawless, J. (1994) Empirical Likelihood and General Estimating Equations. Annals of Statistics, 22, 300-325. https://doi.org/10.1214/aos/1176325370
Klugman, S.A., Panjer, H.H. and Willmot, G.E. (2013) Loss Models: Further Topics. Wiley, New York. https://doi.org/10.1002/9781118787106
Klugman, S.A., Panjer, H.H. and Willmot, G.E. (2012) Loss Models: From Data to Decisions. Fourth Edition, Wiley, New York.
Balakrishnan, N. and Lai, C.-D. (2009) Continuous Bivariate Distributions. Second Edition, Springer, New York, Chapter 4.
Barlow, R.E. and Proschan, F. (1981) Statistical Theory of Reliability and Life Testing. Holt, Rinehart and Winston, New York.
Mardia, K.V. (1970) Families of Bivariate Distributions. Griffin, London.
Ross, S. (2013) Simulation. Fifth Edition, Elsevier, New York.
Arevalillo, J.M. (2003) Inverting a Saddlepoint Approximation. Statistics and Probability Letters, 61, 421-428. https://doi.org/10.1016/S0167-7152(02)00402-9
Glasserman, P. and Liu, Z. (2010) Sensitivity Estimates from Characteristic Functions. Operations Research, 58, 1611-1623. https://doi.org/10.1287/opre.1100.0837
Genest, C., Rémillard, B. and Beaudoin, D. (2009) Goodness-of-Fit Tests for Copulas: A Review and Power Study. Insurance: Mathematics and Economics, 44, 199-213. https://doi.org/10.1016/j.insmatheco.2007.10.005
Frees, E.W. and Valdez, E.A. (1998) Understanding Relationships Using Copulas. North American Actuarial Journal, 2, 1-25. https://doi.org/10.1080/10920277.1998.10595667
Schennach, S. (2007) Point Estimation with Exponentially Tilted Empirical Likelihood. Annals of Statistics, 35, 634-672. https://doi.org/10.1214/009053606000001208
Owen, A. (2001) Empirical Likelihood. Chapman and Hall, New York. https://doi.org/10.1201/9781420036152
Mittelhammer, R.C., Judge, G.G. and Miller, D.J. (2000) Econometric Foundations. Cambridge University Press, Cambridge.
Johnson, N.L., Kotz, S. and Balakrishnan, N. (1997) Discrete Multivariate Distributions. Wiley, New York.