Combining the Nadarajah-Haghighi-G family of distributions with the Gompertz distribution yields a distribution (the Nadarajah-Haghighi Gompertz distribution) that is more flexible than either of them individually in terms of the characteristics estimated through their parameters. The combination was done using the Nadarajah-Haghighi (NH) generator. For the newly developed distribution we investigated some basic properties, including moments, the moment generating function, the survival function, the hazard rate function, asymptotic behaviour and estimation of parameters. The proposed model is more flexible and represents the data better than the Gompertz distribution and the other models considered. A real data set was used to illustrate the applicability of the new model.

The Gompertz (G) distribution is a flexible distribution which can be skewed to the right and to the left. This distribution is a generalization of the exponential (E) distribution and is commonly used in many applied problems, particularly in lifetime data analysis ( [

$$G(x) = 1 - e^{-\frac{\alpha}{\beta}\left(e^{\beta x}-1\right)} \tag{1}$$

and the probability density function is given as

$$g(x) = \alpha e^{\beta x} e^{-\frac{\alpha}{\beta}\left(e^{\beta x}-1\right)} \tag{2}$$
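As a quick numerical illustration of Equations (1) and (2), the following minimal sketch evaluates the Gompertz cdf and pdf and compares a finite-difference derivative of the cdf with the pdf. The function names and the parameter values are hypothetical choices made only for this example.

```python
import math

def gompertz_cdf(x, alpha, beta):
    """Gompertz cdf, Equation (1)."""
    # math.expm1(z) computes exp(z) - 1 accurately, also for small z
    return 1.0 - math.exp(-(alpha / beta) * math.expm1(beta * x))

def gompertz_pdf(x, alpha, beta):
    """Gompertz pdf, Equation (2)."""
    return alpha * math.exp(beta * x) * math.exp(-(alpha / beta) * math.expm1(beta * x))

# Illustrative (assumed) parameter values, not taken from the paper
alpha, beta, x, h = 0.5, 0.8, 1.0, 1e-5

# Central difference of the cdf should agree with the pdf
numeric_pdf = (gompertz_cdf(x + h, alpha, beta) - gompertz_cdf(x - h, alpha, beta)) / (2 * h)
print(numeric_pdf, gompertz_pdf(x, alpha, beta))
```

At x = 0 the cdf is 0 and the pdf equals α, which gives a simple sanity check on any implementation.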

A generalization based on the idea of [

In this paper, we introduce a new generalization of the G distribution, obtained by applying the G distribution to the Nadarajah and Haghighi (NH) family of distributions proposed by [

Consider a continuous distribution G ( x ) with density g ( x ) . The cdf of NH-family is defined as

$$F(x) = \int_{0}^{-\log[1-G(x)]} \delta\lambda(1+\lambda t)^{\delta-1}\exp\left[1-(1+\lambda t)^{\delta}\right]dt \tag{3}$$

This on simplification gives

$$F(x) = 1 - \exp\left\{1 - \left[1 - \lambda\log[1-G(x)]\right]^{\delta}\right\}, \quad x>0,\ \delta>0,\ \lambda>0 \tag{4}$$
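The simplification from (3) to (4) can be made explicit with a change of variable. The symbol $u$ is introduced only for this sketch:

$$u = (1+\lambda t)^{\delta}, \qquad du = \delta\lambda(1+\lambda t)^{\delta-1}\,dt,$$

so that $t = 0$ corresponds to $u = 1$ and $t = -\log[1-G(x)]$ corresponds to $u = \left[1-\lambda\log[1-G(x)]\right]^{\delta}$. Hence

$$F(x) = \int_{1}^{\left[1-\lambda\log[1-G(x)]\right]^{\delta}} e^{1-u}\,du = \Big[-e^{1-u}\Big]_{u=1}^{u=\left[1-\lambda\log[1-G(x)]\right]^{\delta}} = 1 - \exp\left\{1 - \left[1-\lambda\log[1-G(x)]\right]^{\delta}\right\}.$$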

Since $f(x) = dF(x)/dx$, we obtain the pdf as

$$f(x) = \frac{\delta\lambda\, g(x)\left\{1-\lambda\log[1-G(x)]\right\}^{\delta-1}\exp\left\{1-\left[1-\lambda\log[1-G(x)]\right]^{\delta}\right\}}{1-G(x)} \tag{5}$$

A random variable X with pdf (5) is denoted by X ~ NH-G ( δ , λ , ξ ) where ξ is the parameter vector of G ( x ) .
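The NH-G construction in Equations (4) and (5) works for any baseline cdf. The sketch below, with hypothetical function names, takes the baseline cdf and pdf as callables. A unit exponential baseline is assumed purely for illustration, because then $-\log[1-G(x)] = x$ and the result is available in closed form.

```python
import math

def nhg_cdf(x, base_cdf, delta, lam):
    """NH-G cdf, Equation (4), for a baseline cdf G supplied as a callable."""
    inner = 1.0 - lam * math.log1p(-base_cdf(x))  # 1 - lambda * log[1 - G(x)]
    return 1.0 - math.exp(1.0 - inner ** delta)

def nhg_pdf(x, base_cdf, base_pdf, delta, lam):
    """NH-G pdf, Equation (5)."""
    G = base_cdf(x)
    inner = 1.0 - lam * math.log1p(-G)
    return (delta * lam * base_pdf(x) * inner ** (delta - 1.0)
            * math.exp(1.0 - inner ** delta) / (1.0 - G))

# Baseline chosen only for illustration: unit exponential, G(x) = 1 - exp(-x)
def exp_cdf(x):
    return 1.0 - math.exp(-x)

def exp_pdf(x):
    return math.exp(-x)

delta, lam, x = 1.5, 2.0, 2.0  # assumed values
# With this baseline, (4) reduces to 1 - exp{1 - (1 + lam*x)^delta}
closed_form_cdf = 1.0 - math.exp(1.0 - (1.0 + lam * x) ** delta)
closed_form_pdf = delta * lam * (1.0 + lam * x) ** (delta - 1.0) * math.exp(1.0 - (1.0 + lam * x) ** delta)
print(nhg_cdf(x, exp_cdf, delta, lam), closed_form_cdf)
```

Passing the baseline as a callable keeps the generator separate from the choice of $G$, which mirrors how the family is defined in the text.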

By using the power series for the exponential function and the generalized binomial expansion we can express the NH-G function as an infinite linear combination of exponentiated-G density functions. Then, the pdf of X can be expressed as

$$f(x) = \sum_{m=0}^{\infty} z_m\, h_{m+1}(x, \xi) \tag{6}$$

where,

$$z_m = \sum_{i,j=0}^{\infty}\sum_{l=0}^{\infty} \frac{e\,\delta\,\lambda^{s+1}(-1)^{i+l+m}}{(m+1)} \binom{l-1}{m}\binom{\delta(i+1)-1}{j}\left[\binom{j}{l} + \sum_{k=0}^{\infty}\rho_k(j)\binom{k+j+1}{l}\right]$$

$$\rho_0(c) = \frac{c}{2},\quad \rho_1(c) = \frac{c(3c+5)}{24},\quad \rho_2(c) = \frac{c(c^2+5c+6)}{48},\quad \rho_3(c) = \frac{c(15c^3+150c^2+485c+502)}{5760},\ \text{etc.}$$

and $h_m(x) = m\,g(x)\,G(x)^{m-1}$ is the exponentiated-G (exp-G) density function with power parameter $m$.

Also, integrating the mixture (6) and using the monotone convergence theorem, the cdf of X can be expressed as

$$F(x) = \sum_{m=0}^{\infty} z_m\, H_{m+1}(x) \tag{7}$$

where,

$$H_{m+1}(x) = G(x)^{m+1}$$

The new proposed Nadarajah Haghighi Gompertz distribution

Suppose X ~ G ( α , β ) with cdf defined in (1). Inserting it in (4) gives the cdf of the Nadarajah Haghighi Gompertz distribution as

$$F(x) = 1 - \exp\left\{1 - \left[1 + \frac{\lambda\alpha\left(e^{\beta x}-1\right)}{\beta}\right]^{\delta}\right\} \tag{8}$$
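Because the cdf in (8) is available in closed form, it can be inverted algebraically. The following sketch is an illustration derived here by solving $F(x) = p$ for $x$; it is not a formula stated in the paper, and the function names and parameter values are assumptions.

```python
import math

def nh_gompertz_cdf(x, alpha, beta, delta, lam):
    """NH-Gompertz cdf, Equation (8)."""
    b = 1.0 + lam * (alpha / beta) * math.expm1(beta * x)
    return 1.0 - math.exp(1.0 - b ** delta)

def nh_gompertz_quantile(p, alpha, beta, delta, lam):
    """Solve F(x) = p for x by inverting Equation (8) step by step."""
    # 1 - p = exp(1 - b^delta)  =>  b = (1 - log(1 - p))^(1/delta)
    b = (1.0 - math.log1p(-p)) ** (1.0 / delta)
    # b = 1 + lam*(alpha/beta)*(exp(beta*x) - 1)  =>  solve for x
    return math.log1p(beta * (b - 1.0) / (lam * alpha)) / beta

# Illustrative (assumed) positive parameter values
alpha, beta, delta, lam = 0.5, 0.8, 1.5, 2.0
round_trip = [nh_gompertz_cdf(nh_gompertz_quantile(p, alpha, beta, delta, lam),
                              alpha, beta, delta, lam)
              for p in (0.1, 0.5, 0.9)]
print(round_trip)
```

Feeding uniform random numbers on (0, 1) into such an inverse is one common way to simulate from a distribution with an invertible cdf.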

Using the relation in (7) and the binomial expansion, we can express (8) as

$$F(x) = \sum_{j=0}^{\infty}\sum_{m=0}^{\infty} z_m (-1)^{j}\binom{m+1}{j} e^{-\frac{\alpha}{\beta}\left(e^{\beta x}-1\right) j} \tag{9}$$

The graph of the cdf for the values of the parameters is given in

The cdf graph drawn in

Also, substituting (1) and (2) into (5) gives the pdf of the NH-Gompertz distribution as

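$$f(x) = \delta\lambda\alpha e^{\beta x}\left[1 + \frac{\lambda\alpha\left(e^{\beta x}-1\right)}{\beta}\right]^{\delta-1}\exp\left\{1 - \left[1 + \frac{\lambda\alpha\left(e^{\beta x}-1\right)}{\beta}\right]^{\delta}\right\} \tag{10}$$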
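One way to check that the density in (10) is consistent with the cdf in (8) is to integrate the pdf numerically and compare with the cdf. The sketch below uses a plain composite Simpson rule; the helper names, the parameter values and the integration range are assumptions made for illustration.

```python
import math

def nh_gompertz_cdf(x, alpha, beta, delta, lam):
    """NH-Gompertz cdf, Equation (8)."""
    b = 1.0 + lam * (alpha / beta) * math.expm1(beta * x)
    return 1.0 - math.exp(1.0 - b ** delta)

def nh_gompertz_pdf(x, alpha, beta, delta, lam):
    """NH-Gompertz pdf, Equation (10)."""
    b = 1.0 + lam * (alpha / beta) * math.expm1(beta * x)
    return (delta * lam * alpha * math.exp(beta * x)
            * b ** (delta - 1.0) * math.exp(1.0 - b ** delta))

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0

# Illustrative (assumed) parameter values
alpha, beta, delta, lam = 0.5, 0.8, 1.5, 2.0
upper = 1.5
area = simpson(lambda t: nh_gompertz_pdf(t, alpha, beta, delta, lam), 0.0, upper)
print(area, nh_gompertz_cdf(upper, alpha, beta, delta, lam))
```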

Using the relation in (6) we can express (10) as

$$f(x) = \alpha\sum_{j=0}^{\infty}\sum_{m=0}^{\infty} (-1)^{j}\binom{m+1}{j} z_m (m+1)\, e^{\beta x} e^{\left[-\frac{\alpha}{\beta}\left(e^{\beta x}-1\right)\right](j+1)} \tag{11}$$

The graph of the pdf for various values of the parameters is drawn below in

We investigate the behaviour of the model in Equation (10) as x → 0 with δ = 1:

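$$\lim_{x\to 0}\ \delta\lambda\alpha e^{\beta x}\left[1 + \frac{\lambda\alpha\left(e^{\beta x}-1\right)}{\beta}\right]^{\delta-1}\exp\left\{1 - \left[1 + \frac{\lambda\alpha\left(e^{\beta x}-1\right)}{\beta}\right]^{\delta}\right\} = \lambda\alpha \quad (\delta = 1)$$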

The survival function is defined by,

$$S(x) = 1 - F(x) \tag{12}$$

Inserting (9) in (12), we have

$$S(x) = 1 - \sum_{j=0}^{\infty}\sum_{m=0}^{\infty} z_m (-1)^{j}\binom{m+1}{j} e^{-\frac{\alpha}{\beta}\left(e^{\beta x}-1\right) j} \tag{13}$$

The graph of the survival function is drawn below in

For any random variable x the hazard function is defined by

$$h(x) = \frac{f(x)}{S(x)} \tag{14}$$

Substituting (8) and (10) in (14) we have

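$$h(x) = \frac{\delta\lambda\alpha e^{\beta x}\left[1 + \frac{\lambda\alpha\left(e^{\beta x}-1\right)}{\beta}\right]^{\delta-1}\exp\left\{1 - \left[1 + \frac{\lambda\alpha\left(e^{\beta x}-1\right)}{\beta}\right]^{\delta}\right\}}{\exp\left\{1 - \left[1 + \frac{\lambda\alpha\left(e^{\beta x}-1\right)}{\beta}\right]^{\delta}\right\}}$$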

Then we have

$$h(x) = \delta\lambda\alpha e^{\beta x}\left[1 + \frac{\lambda\alpha\left(e^{\beta x}-1\right)}{\beta}\right]^{\delta-1} \tag{15}$$

If we let δ = λ = 1 , (15) will reduce to

$$h(x) = \alpha e^{\beta x}\left[1 + \frac{\lambda\alpha\left(e^{\beta x}-1\right)}{\beta}\right]^{0}$$

Finally,

$$h(x) = \alpha e^{\beta x} \tag{16}$$

The above equation is the hazard function of the Gompertz distribution, known as the Gompertz model.
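The reduction of the hazard function (15) to the Gompertz hazard (16) can also be checked numerically. In this minimal sketch the function names and the parameter values are hypothetical; the hazard is computed both from the closed form (15) and as the ratio $f(x)/S(x)$ in (14).

```python
import math

def nh_gompertz_hazard(x, alpha, beta, delta, lam):
    """Closed-form hazard, Equation (15)."""
    b = 1.0 + lam * (alpha / beta) * math.expm1(beta * x)
    return delta * lam * alpha * math.exp(beta * x) * b ** (delta - 1.0)

def hazard_from_ratio(x, alpha, beta, delta, lam):
    """Hazard as pdf (10) divided by survival 1 - F(x), with F from (8)."""
    b = 1.0 + lam * (alpha / beta) * math.expm1(beta * x)
    survival = math.exp(1.0 - b ** delta)
    pdf = (delta * lam * alpha * math.exp(beta * x)
           * b ** (delta - 1.0) * survival)
    return pdf / survival

alpha, beta, x = 0.5, 0.8, 1.2  # assumed values

# General case: the two routes agree
general = nh_gompertz_hazard(x, alpha, beta, 1.5, 2.0)
ratio = hazard_from_ratio(x, alpha, beta, 1.5, 2.0)

# Special case delta = lambda = 1: Gompertz hazard alpha * exp(beta * x)
special = nh_gompertz_hazard(x, alpha, beta, 1.0, 1.0)
gompertz = alpha * math.exp(beta * x)
print(general, ratio, special, gompertz)
```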


The r^{th} moment of a distribution can be obtained using the relation

$$E(X^{r}) = \int_{-\infty}^{\infty} x^{r} f(x)\,dx \tag{17}$$

Inserting (11) in (17), we have

$$E(X^{r}) = \int_{-\infty}^{\infty} x^{r}\,\alpha\sum_{j=0}^{\infty}\sum_{m=0}^{\infty} (-1)^{j}\binom{m+1}{j} z_m (m+1)\, e^{\beta x} e^{\left[-\frac{\alpha}{\beta}\left(e^{\beta x}-1\right)\right](j+1)}\,dx$$

On simplification we have

$$E(X^{r}) = \alpha\sum_{j=0}^{\infty}\sum_{m=0}^{\infty} (-1)^{j}\binom{m+1}{j} z_m (m+1)\int_{-\infty}^{\infty} x^{r} e^{\beta x} e^{\left[-\frac{\alpha}{\beta}\left(e^{\beta x}-1\right)\right](j+1)}\,dx \tag{18}$$

In (18), let $I_1$ denote the integral; then we have

$$I_1 = \int_{-\infty}^{\infty} x^{r} e^{\beta x} e^{\left[-\frac{\alpha}{\beta}\left(e^{\beta x}-1\right)\right](j+1)}\,dx \tag{19}$$

Let

$$e^{\beta x} = w,\quad \ln(w) = \beta x,\quad x = \frac{\ln(w)}{\beta},\quad dx = \frac{dw}{w\beta}.$$

Substituting these expressions into (19) gives

$$I_1 = \beta^{-(r+1)} e^{-\frac{\alpha}{\beta}(j+1)}\int_{-\infty}^{\infty} \left(\ln w\right)^{r} e^{-\frac{\alpha}{\beta}(j+1)w}\,dw \tag{20}$$

Integrating (20) by parts, we have

$$I_1 = \beta^{-(r+1)} e^{-\frac{\alpha}{\beta}(j+1)}\, E_{1}^{r-1}\!\left(\frac{\alpha}{\beta}(j+1)\right) \tag{21}$$

where

$$E_{s}^{k}(w) = \frac{1}{k!}\int_{1}^{\infty} (\ln x)^{k} x^{-s} e^{-wx}\,dx$$

is the generalized integro-exponential function; for further study of the integro-exponential function see [

Then combining the Equation (18) and Equation (21) we obtain the r^{th} moment of NH-Gompertz distribution function as

$$E(X^{r}) = \alpha\beta^{-(r+1)}\sum_{j=0}^{\infty}\sum_{m=0}^{\infty} (-1)^{j}\binom{m+1}{j} z_m (m+1)\, e^{-\frac{\alpha}{\beta}(j+1)}\, E_{1}^{r-1}\!\left(\frac{\alpha}{\beta}(j+1)\right) \tag{22}$$
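As a cross-check on any series evaluation of the moments, the r-th moment can also be approximated by direct numerical integration of $x^{r} f(x)$ over the support $x > 0$, using the closed-form pdf (10). The sketch below swaps in plain Simpson quadrature for the series representation; the function names, the tail tolerance and the parameter values are assumptions made for illustration.

```python
import math

def nh_gompertz_pdf(x, alpha, beta, delta, lam):
    """NH-Gompertz pdf, Equation (10)."""
    b = 1.0 + lam * (alpha / beta) * math.expm1(beta * x)
    return (delta * lam * alpha * math.exp(beta * x)
            * b ** (delta - 1.0) * math.exp(1.0 - b ** delta))

def nh_gompertz_survival(x, alpha, beta, delta, lam):
    """Survival function 1 - F(x), with F from Equation (8)."""
    b = 1.0 + lam * (alpha / beta) * math.expm1(beta * x)
    return math.exp(1.0 - b ** delta)

def raw_moment(r, alpha, beta, delta, lam, n=4000, tail=1e-12):
    """Approximate E(X^r) by Simpson's rule on [0, upper]."""
    # Grow the upper limit until the neglected tail mass is below `tail`
    upper = 1.0
    while nh_gompertz_survival(upper, alpha, beta, delta, lam) > tail:
        upper *= 2.0
    h = upper / n
    def integrand(x):
        return x ** r * nh_gompertz_pdf(x, alpha, beta, delta, lam)
    total = integrand(0.0) + integrand(upper)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * integrand(k * h)
    return total * h / 3.0

# Illustrative (assumed) parameter values
params = (0.5, 0.8, 1.5, 2.0)
m0 = raw_moment(0, *params)   # total probability, should be close to 1
m1 = raw_moment(1, *params)   # mean
m2 = raw_moment(2, *params)   # second raw moment
variance = m2 - m1 ** 2
print(m0, m1, variance)
```

The doubling search for the upper limit is a simple heuristic; for extreme parameter values a more careful choice of truncation point would be needed.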

Here we derive an expression for the moment generating function of the NH-Gompertz distribution, starting from

$$\mu_x(t) = E(e^{tx}) = \int_{0}^{\infty} e^{tx} f(x)\,dx \tag{23}$$

Substituting (11) in (23), we have

$$E(e^{tx}) = \alpha\sum_{j=0}^{\infty}\sum_{m=0}^{\infty} (-1)^{j}\binom{m+1}{j} z_m (m+1)\int_{0}^{\infty} e^{tx} e^{\beta x} e^{\left[-\frac{\alpha}{\beta}\left(e^{\beta x}-1\right)\right](j+1)}\,dx \tag{24}$$

Let $I_2$ denote the integral in (24); then we have

$$I_2 = \int_{0}^{\infty} e^{tx} e^{\beta x} e^{\left[-\frac{\alpha}{\beta}\left(e^{\beta x}-1\right)\right](j+1)}\,dx \tag{25}$$

Let $u = \left[\frac{\alpha}{\beta}\left(e^{\beta x}-1\right)\right](j+1)$, so that $dx = \frac{du}{\alpha(j+1)e^{\beta x}}$. Substituting for $u$ and $dx$ in Equation (25), we have

$$I_2 = \frac{1}{(j+1)}\int_{\left[\frac{\alpha}{\beta}\left(e^{\beta x}-1\right)\right](j+1)}^{\infty} \left[\frac{\beta u}{\alpha(j+1)} + 1\right]^{\frac{t}{\beta}} e^{-u}\,du \tag{26}$$

From the binomial series,

$$(1+y)^{r_m} = \sum_{k=0}^{\infty}\binom{r_m}{k} y^{k} \tag{27}$$

Applying Equation (27) in Equation (26) we have,

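$$I_2 = \frac{1}{(j+1)}\sum_{k=0}^{\infty}\binom{\frac{t}{\beta}}{k}\left(\frac{\beta}{\alpha(j+1)}\right)^{k}\int_{\left[\frac{\alpha}{\beta}\left(e^{\beta t_i}-1\right)\right](j+1)}^{\infty} u^{k} e^{-u}\,du \tag{28}$$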

Since,

$$\Gamma(z) = \int_{0}^{\infty} x^{z-1} e^{-x}\,dx \tag{29}$$

Applying the gamma function given in Equation (29) in Equation (28), we have

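$$I_2 = \frac{1}{(j+1)}\sum_{k=0}^{\infty}\binom{\frac{t}{\beta}}{k}\left(\frac{\beta}{\alpha(j+1)}\right)^{k}\,\Gamma\!\left[k+1,\ \left\{\frac{\alpha}{\beta}\left(e^{\beta t_i}-1\right)\right\}(j+1)\right] \tag{30}$$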

We then substitute Equation (30) into Equation (24) to obtain the moment generating function of the Nadarajah Haghighi Gompertz distribution.

I 2 = α ( j + 1 ) − ( t + β )

Then substituting I 2 in (24) we have

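$$\mu_x(t) = \alpha\sum_{j=0}^{\infty}\sum_{m=0}^{\infty} (-1)^{j}\left(\frac{1}{j+1}\right)\binom{m+1}{j} z_m (m+1) \times \left\{\sum_{k=0}^{\infty}\binom{\frac{t}{\beta}}{k}\left(\frac{\beta}{\alpha(j+1)}\right)^{k}\,\Gamma\!\left[k+1,\ \left\{\frac{\alpha}{\beta}\left(e^{\beta t_i}-1\right)\right\}(j+1)\right]\right\} \tag{31}$$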

Here we determine the maximum likelihood estimates (MLEs) of the parameters of the NH-Gompertz distribution from complete samples only. Let $x_1, x_2, \cdots, x_n$ be observed values from the NH-Gompertz distribution with parameters $\alpha, \beta, \lambda, \delta$. Let $\Theta = (\alpha, \beta, \lambda, \delta)^{T}$ be the $p \times 1$ parameter vector. The total log-likelihood function for $\Theta$ is given by

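$$
\begin{aligned}
L(\Theta) ={}& n + n\log(\delta) + n\log(\lambda) + \sum_{i=1}^{n}\log\alpha + \sum_{i=1}^{n}\left\{\beta x_i - \frac{\alpha}{\beta}\left(e^{\beta x_i}-1\right)\right\} + \sum_{i=1}^{n}\frac{\alpha}{\beta}\left(e^{\beta x_i}-1\right) \\
&+ (\delta-1)\sum_{i=1}^{n}\log\left[1 + \lambda\frac{\alpha}{\beta}\left(e^{\beta x_i}-1\right)\right] - \sum_{i=1}^{n}\left[1 + \lambda\frac{\alpha}{\beta}\left(e^{\beta x_i}-1\right)\right]^{\delta}
\end{aligned}
\tag{32}
$$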
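For numerical work, the log-likelihood can be coded directly as the sum of the logarithm of the density (10) over the sample. The sketch below follows that route; the function name, the sample values and the parameter values are hypothetical and are not the paper's data or estimates.

```python
import math

def nh_gompertz_loglik(params, data):
    """Log-likelihood obtained by summing log f(x_i), with f from Equation (10)."""
    alpha, beta, delta, lam = params
    if min(alpha, beta, delta, lam) <= 0.0:
        return float("-inf")  # outside the parameter space
    total = 0.0
    for x in data:
        b = 1.0 + lam * (alpha / beta) * math.expm1(beta * x)
        total += (math.log(delta) + math.log(lam) + math.log(alpha)
                  + beta * x + (delta - 1.0) * math.log(b)
                  + 1.0 - b ** delta)
    return total

# Hypothetical sample and parameter values, for illustration only
sample = [0.5, 1.2, 2.0, 2.7, 3.1]
theta = (0.5, 0.8, 1.5, 2.0)  # (alpha, beta, delta, lambda)
value = nh_gompertz_loglik(theta, sample)
print(value)
```

A function of this shape, negated, is what a general-purpose optimizer would be asked to minimize over the four parameters.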

The log-likelihood function can be maximized either directly, using the Ox program (subroutine MaxBFGS) (Doornik, 2007) or SAS (PROC NLMIXED), or by solving the nonlinear likelihood equations obtained by differentiating (32). The components of the score function are:

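$$U_{\delta} = \frac{n}{\delta} + \sum_{i=1}^{n}\log\left\{1 + \frac{\alpha\lambda}{\beta}\left(e^{\beta x_i}-1\right)\right\} - \sum_{i=1}^{n}\left\{1 + \frac{\alpha\lambda}{\beta}\left(e^{\beta x_i}-1\right)\right\}^{\delta}\log\left\{1 + \frac{\alpha\lambda}{\beta}\left(e^{\beta x_i}-1\right)\right\} \tag{33}$$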

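$$U_{\lambda} = \frac{n}{\lambda} + \frac{\alpha(\delta-1)}{\beta}\sum_{i=1}^{n}\frac{\left(e^{\beta x_i}-1\right)}{\left\{1 + \lambda\frac{\alpha}{\beta}\left(e^{\beta x_i}-1\right)\right\}} - \frac{\alpha\delta}{\beta}\sum_{i=1}^{n}\left\{1 + \lambda\frac{\alpha}{\beta}\left(e^{\beta x_i}-1\right)\right\}^{\delta-1}\left(e^{\beta x_i}-1\right) \tag{34}$$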

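$$
\begin{aligned}
U_{\alpha} ={}& \sum_{i=1}^{n}\frac{\left[1 - \frac{\alpha}{\beta}\left(e^{\beta x_i}-1\right)\right]e^{\beta x_i - \frac{\alpha}{\beta}\left(e^{\beta x_i}-1\right)}}{\alpha e^{\beta x_i} e^{-\frac{\alpha}{\beta}\left(e^{\beta x_i}-1\right)}} + \lambda(\delta-1)\sum_{i=1}^{n}\frac{\frac{1}{\beta}\left(e^{\beta x_i}-1\right)e^{-\frac{\alpha}{\beta}\left(e^{\beta x_i}-1\right)}\left\{e^{-\frac{\alpha}{\beta}\left(e^{\beta x_i}-1\right)}\right\}^{-1}}{\left\{1 + \lambda\frac{\alpha}{\beta}\left(e^{\beta x_i}-1\right)\right\}} \\
&- \delta\lambda\sum_{i=1}^{n}\frac{\frac{1}{\beta}\left\{1 + \lambda\frac{\alpha}{\beta}\left(e^{\beta x_i}-1\right)\right\}^{\delta-1}\left(e^{\beta x_i}-1\right)e^{-\frac{\alpha}{\beta}\left(e^{\beta x_i}-1\right)}}{e^{-\frac{\alpha}{\beta}\left(e^{\beta x_i}-1\right)}} + \sum_{i=1}^{n}\frac{\frac{1}{\beta}\left(e^{\beta x_i}-1\right)e^{-\frac{\alpha}{\beta}\left(e^{\beta x_i}-1\right)}}{e^{-\frac{\alpha}{\beta}\left(e^{\beta x_i}-1\right)}}
\end{aligned}
\tag{35}
$$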

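$$
\begin{aligned}
U_{\beta} ={}& \sum_{i=1}^{n}\frac{\left\{x_i + \frac{\alpha}{\beta^{2}}\left[e^{\beta x_i}(x_i\beta - 1) - 1\right]\right\}e^{\beta x_i - \frac{\alpha}{\beta}\left(e^{\beta x_i}-1\right)}}{\alpha e^{\beta x_i} e^{-\frac{\alpha}{\beta}\left(e^{\beta x_i}-1\right)}} + \lambda(\delta-1)\sum_{i=1}^{n}\frac{\left\{\frac{\alpha}{\beta^{2}}\left[e^{\beta x_i}(x_i\beta - 1) - 1\right]\right\}e^{-\frac{\alpha}{\beta}\left(e^{\beta x_i}-1\right)}\left\{e^{-\frac{\alpha}{\beta}\left(e^{\beta x_i}-1\right)}\right\}^{-1}}{\left\{1 + \lambda\frac{\alpha}{\beta}\left(e^{\beta x_i}-1\right)\right\}} \\
&- \delta\lambda\sum_{i=1}^{n}\frac{\frac{1}{\beta}\left\{1 + \lambda\frac{\alpha}{\beta}\left(e^{\beta x_i}-1\right)\right\}^{\delta-1}\left\{\frac{\alpha}{\beta^{2}}\left[e^{\beta x_i}(x_i\beta - 1) - 1\right]\right\}e^{-\frac{\alpha}{\beta}\left(e^{\beta x_i}-1\right)}}{e^{-\frac{\alpha}{\beta}\left(e^{\beta x_i}-1\right)}} + \sum_{i=1}^{n}\frac{\left\{\frac{\alpha}{\beta^{2}}\left[e^{\beta x_i}(x_i\beta - 1) - 1\right]\right\}e^{-\frac{\alpha}{\beta}\left(e^{\beta x_i}-1\right)}}{e^{-\frac{\alpha}{\beta}\left(e^{\beta x_i}-1\right)}}
\end{aligned}
\tag{36}
$$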
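Score components are easy to get wrong by hand, so a finite-difference check against the log-likelihood is a useful safeguard. The sketch below codes the derivatives with respect to α, δ and λ obtained by differentiating the sum of the logarithm of the density (10), written in simplified form, and compares them with central differences. All names, the sample and the parameter values are assumptions made for illustration.

```python
import math

def loglik(alpha, beta, delta, lam, data):
    """Sum of log f(x_i), with f from Equation (10)."""
    total = 0.0
    for x in data:
        b = 1.0 + lam * (alpha / beta) * math.expm1(beta * x)
        total += (math.log(delta * lam * alpha) + beta * x
                  + (delta - 1.0) * math.log(b) + 1.0 - b ** delta)
    return total

def scores(alpha, beta, delta, lam, data):
    """Analytic derivatives of loglik with respect to alpha, delta and lambda."""
    n = len(data)
    u_alpha, u_delta, u_lam = n / alpha, n / delta, n / lam
    for x in data:
        e = math.expm1(beta * x)                 # exp(beta * x) - 1
        b = 1.0 + lam * (alpha / beta) * e
        common = (delta - 1.0) / b - delta * b ** (delta - 1.0)
        u_delta += math.log(b) - b ** delta * math.log(b)
        u_lam += (alpha / beta) * e * common     # d b / d lambda = (alpha/beta) * e
        u_alpha += (lam / beta) * e * common     # d b / d alpha  = (lam/beta) * e
    return u_alpha, u_delta, u_lam

# Hypothetical sample and parameter values
data = [0.5, 1.2, 2.0]
alpha, beta, delta, lam = 0.5, 0.8, 1.5, 2.0
h = 1e-6
fd_alpha = (loglik(alpha + h, beta, delta, lam, data) - loglik(alpha - h, beta, delta, lam, data)) / (2 * h)
fd_delta = (loglik(alpha, beta, delta + h, lam, data) - loglik(alpha, beta, delta - h, lam, data)) / (2 * h)
fd_lam = (loglik(alpha, beta, delta, lam + h, data) - loglik(alpha, beta, delta, lam - h, data)) / (2 * h)
u_alpha, u_delta, u_lam = scores(alpha, beta, delta, lam, data)
print((u_alpha, fd_alpha), (u_delta, fd_delta), (u_lam, fd_lam))
```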

Order statistics are among the most fundamental tools in non-parametric statistics and inference. The pdf $f_{i:n}(x)$ of the $i$th order statistic for a random sample $x_1, x_2, \cdots, x_n$ from the NH-Gompertz distribution is given by

$$f_{i:n}(x) = k\, f(x)\, F^{i-1}(x)\left[1 - F(x)\right]^{n-i} \tag{37}$$

where

$$k = \frac{n!}{(i-1)!\,(n-i)!};$$

Then,

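$$f_{i:n}(x) = k\,\alpha\delta\lambda\sum_{j=0}^{n-i}(-1)^{j}\binom{n-i}{j} e^{\beta x}\left[1 + \frac{\lambda\alpha\left(e^{\beta x}-1\right)}{\beta}\right]^{\delta-1}\exp\left\{1 - \left[1 + \frac{\lambda\alpha\left(e^{\beta x}-1\right)}{\beta}\right]^{\delta}\right\} \times \left\{1 - \exp\left\{1 - \left[1 + \frac{\lambda\alpha\left(e^{\beta x}-1\right)}{\beta}\right]^{\delta}\right\}\right\}^{i+j-1}$$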
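Equation (37) can be evaluated directly from the closed-form cdf (8) and pdf (10). The sketch below uses hypothetical function names and parameter values, and includes two simple identities as sanity checks: for n = 1 the order-statistic density is the pdf itself, and for n = 2 the densities of the minimum and maximum add up to 2f(x).

```python
import math

def nh_gompertz_cdf(x, alpha, beta, delta, lam):
    """NH-Gompertz cdf, Equation (8)."""
    b = 1.0 + lam * (alpha / beta) * math.expm1(beta * x)
    return 1.0 - math.exp(1.0 - b ** delta)

def nh_gompertz_pdf(x, alpha, beta, delta, lam):
    """NH-Gompertz pdf, Equation (10)."""
    b = 1.0 + lam * (alpha / beta) * math.expm1(beta * x)
    return (delta * lam * alpha * math.exp(beta * x)
            * b ** (delta - 1.0) * math.exp(1.0 - b ** delta))

def order_stat_pdf(x, i, n, alpha, beta, delta, lam):
    """pdf of the i-th order statistic in a sample of size n, Equation (37)."""
    k = math.factorial(n) / (math.factorial(i - 1) * math.factorial(n - i))
    F = nh_gompertz_cdf(x, alpha, beta, delta, lam)
    f = nh_gompertz_pdf(x, alpha, beta, delta, lam)
    return k * f * F ** (i - 1) * (1.0 - F) ** (n - i)

# Illustrative (assumed) parameter values
params = (0.5, 0.8, 1.5, 2.0)
x = 1.0
f_x = nh_gompertz_pdf(x, *params)
single = order_stat_pdf(x, 1, 1, *params)          # equals f(x)
min_plus_max = (order_stat_pdf(x, 1, 2, *params)
                + order_stat_pdf(x, 2, 2, *params))  # equals 2 f(x)
print(f_x, single, min_plus_max)
```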

The pdf of x i : n can be expressed from (9) and (11) as

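$$f_{i:n}(x) = k\sum_{j=0}^{n-i}(-1)^{j}\binom{n-i}{j}\left[\alpha\sum_{r=0}^{\infty} z_r (r+1)\left\{1 - e^{-\frac{\alpha}{\beta}\left(e^{\beta x}-1\right)}\right\}^{r} e^{\beta x - \frac{\alpha}{\beta}\left(e^{\beta x}-1\right)}\right] \times \left[\sum_{m=0}^{\infty} z_m\left\{1 - e^{-\frac{\alpha}{\beta}\left(e^{\beta x}-1\right)}\right\}^{m+1}\right]^{i+j-1} \tag{38}$$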

Application to real data

To illustrate the new results presented in this paper, we fit the NH-Gompertz distribution to a real data set on the breaking stress of carbon fibers of 50 mm length (GPa) obtained from [

Min | Q_{1} | Median | Mean | Q_{3} | Max | Kurtosis | Skewness |
---|---|---|---|---|---|---|---|
0.390 | 1.840 | 2.700 | 2.640 | 3.220 | 5.560 | 0.17287 | 0.37378 |

Model | Estimates | | | | |
---|---|---|---|---|---|
NHGo (α, β, δ, λ) | −0.00222 (0.00049) | 0.17634 (0.013381) | 1.31946 (0.61066) | −0.00218 (0.00045) | - |
KGM (a, b, λ, α, β) | 3.25904 (1.8545) | 6.74224 (1.18545) | 10e^{−11} (17.4572) | 0.221480 (0.74510) | 0.130941 (0.71868) |
G (θ, β) | 0.769198 (0.01743) | 0.79109 (0.07760) | - | - | - |
EF (b, θ, β) | 52.0491 (31.954) | 26.1730 (14.666) | 0.6181 (0.0897) | - | - |

Model | $l(\hat{\theta})$ | AIC | BIC | HQIC | CAIC |
---|---|---|---|---|---|
NHGo | −56.112 | 120.224 | 124.207 | 121.001 | 122.891 |
KGM (a, b, λ, α, β) | −141.332 | 292.664 | 305.690 | 297.936 | 293.306 |
G (θ, β) | −149.125 | 302.250 | 307.460 | 304.359 | 307.460 |
EF (b, θ, β) | −145.087 | 296.174 | 303.989 | 294.755 | 296.414 |

Here $l(\hat{\theta})$ denotes the log-likelihood function evaluated at the maximum likelihood estimates, AIC the Akaike information criterion, BIC the Bayesian information criterion, and HQIC the Hannan-Quinn information criterion.

We also applied statistical tools for model comparison, namely the Bayesian information criterion (BIC), the Akaike information criterion (AIC), the Hannan-Quinn information criterion (HQIC) and the corrected Akaike information criterion (CAIC), to choose the best possible model for the data set among the competing models.
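As a small worked example, the AIC column of the comparison table can be recomputed from the reported log-likelihood values using the standard definition AIC = 2k − 2l, where k is the number of fitted parameters. The parameter counts below are read from the parameter lists in the estimates table; the variable names are hypothetical.

```python
def aic(loglik, k):
    """Akaike information criterion: 2k - 2 * log-likelihood."""
    return 2 * k - 2 * loglik

# (reported maximized log-likelihood, number of parameters) for each model
fits = {
    "NHGo": (-56.112, 4),
    "KGM": (-141.332, 5),
    "G": (-149.125, 2),
    "EF": (-145.087, 3),
}

aic_values = {name: aic(l, k) for name, (l, k) in fits.items()}
best = min(aic_values, key=aic_values.get)  # smallest AIC is preferred
for name, value in aic_values.items():
    print(f"{name}: AIC = {value:.3f}")
print("preferred by AIC:", best)
```

The other criteria also depend on the sample size, so they would need it as an additional input.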

The study of skew models is useful in modeling skewed data. This motivates the proposed distribution, which generalizes the Gompertz distribution and includes sub-models. We call the new model the Nadarajah Haghighi Gompertz distribution. It was studied mathematically and some of its properties were obtained, including its density and distribution functions, survival function, hazard function, asymptotic behaviour, moments and moment generating function. Graph 1 depicts the shape of the cdf and shows that it is a proper cdf; graph 2 shows the shape of the pdf for several parameter values; graphs 3 and 4 show the shapes of the survival and hazard functions, respectively. The parameters of the proposed distribution were estimated and the information criteria computed. Since the Nadarajah Haghighi Gompertz (NH-Gom) distribution has the largest $l(\hat{\theta})$ and the lowest AIC, BIC, CAIC and HQIC values among the models considered, it could be chosen as the best model for these data. Furthermore, the new model may be applied in many areas such as survival analysis, insurance, engineering and environmental pollution studies.

This work was self-funded by the authors.

The authors declare no conflicts of interest regarding the publication of this paper.

Ogunde, A.A., Ajao, I.O. and Olalude, G.A. (2020) On the Application of Nadarajah Haghighi Gompertz Distribution as a Life Time Distribution. Open Journal of Statistics, 10, 850-862. https://doi.org/10.4236/ojs.2020.105049