Generalized Kumaraswamy Generalized Power Gompertz Distribution: Statistical Properties, Application, and Validation Using a Modified Chi-Squared Goodness of Fit Test
1. Introduction
The Gompertz distribution is a continuous probability distribution often applied in lifetime data analysis, with applications in sciences such as biology [1] and gerontology [2], in the study of adult lifespans by demographers [3] and actuaries [4], and in marketing [5], network theory [6] and computer science [7]. The Gompertz distribution has a convex hazard function. It is a flexible distribution that can be skewed to the right or to the left, and it generalizes the exponential distribution.
To produce more flexible distributions for highly skewed datasets, new families of distributions are continually being proposed. Some of these families include the Generalized Kumaraswamy generalized family by Nofal et al. [8], the Marshall-Olkin generalized family by Yousof et al. [9], the odd Dagum generalized family by Afify and Alizadeh [10], a new generalized Weibull-G family by Cordeiro et al. [11], a new Weibull-G family by Tahir et al. [12], the Gompertz generalized family by Alizadeh et al. [13], the Type II Power Topp-Leone generated family by Bantan et al. [14], the generated odd Burr III family by Hag et al. [15], Exponentiated-G (EG) by Cordeiro et al. [16], Weibull-X by Alzaatreh et al. [17], Weibull-G by Bourguignon et al. [18], Logistic-G by Torabi and Montazari [19], Gamma-X by Alzaatreh et al. [20], a Lomax-G family by Cordeiro et al. [21], Exponentiated T-X by Alzaghal et al. [22], a Beta Marshall-Olkin family of distributions by Alizadeh et al. [23], Logistic-X by Tahir et al. [24], the beta generalized family (Beta-G) by Eugene et al. [25], a Lindley-G family by Cakmakyapan and Ozel [26], the Odd Lindley-G family by Gomes-Silva et al. [27], the Transmuted family of distributions by Shaw and Buckley [28], Gamma-G (type 1) by Zografos and Balakrishnan [29], the Kumaraswamy-G by Cordeiro and de Castro [30], McDonald-G by Alexander et al. [31], Gamma-G (type 2) by Ristic et al. [32], Gamma-G (type 3) by Torabi and Montazari [33], Log-gamma-G by Amini et al. [34], and so on.
In statistics, a power transformation is a family of functions used to create a monotonic transformation of data using power functions. Applied to a random variable, the technique is useful for stabilizing variance, making the data more nearly normally distributed, improving the validity of measures of association such as the Pearson correlation between variables, and providing a more flexible model by adding a new parameter called the power parameter. The works of Ieren et al. [35], Ghitany et al. [36], and Rady et al. [37] illustrate this. Ieren et al. [35] proposed the power Gompertz distribution and derived certain properties of the new distribution, with parameters estimated by maximum likelihood estimation (MLE). When the proposed model and other existing distributions were applied to a data set of remission times for a random sample of 128 patients with bladder cancer, the power Gompertz model performed better than the Gompertz model. Ghitany et al. [36] introduced the power Lindley distribution, which provides more flexibility than the Lindley distribution when applied to lifetime data. Rady et al. [37] proposed the Power Lomax distribution; when applied to bladder cancer data, it proved much more flexible than the Lomax distribution.
To produce a more flexible distribution for highly skewed datasets, our focus in this paper is to present an extension of the power Gompertz distribution using the generalized Kumaraswamy generalized family of distributions [8]. The resulting distribution is a six-parameter continuous distribution called the generalized Kumaraswamy generalized power Gompertz distribution, and we examine several of its statistical properties. The method of maximum likelihood is discussed for estimating the model parameters. We also construct and analyze the generalized Nikulin-Rao-Robson goodness-of-fit test statistic Y² (Bagdonavicius and Nikulin [38], Bagdonavicius and Nikulin [39]) for the generalized Kumaraswamy generalized power Gompertz distribution based on censored data.
The rest of this article is organized as follows. The formation of the new distribution is given in Section 2. In Section 3, we analyze plots of the probability density and cumulative distribution functions. Properties of the new distribution, such as asymptotic behavior, the quantile function and median, skewness and kurtosis, and reliability analysis, are derived in Section 4. The distribution of order statistics is given in Section 5, and estimation of the parameters from censored and uncensored random samples using maximum likelihood estimation (MLE) is given in Section 6. We evaluate the new goodness-of-fit test statistic Y² and investigate a criteria test for the generalized Kumaraswamy generalized power Gompertz distribution in Section 7. A simulation study is carried out in Section 8, and an application of the new model to a real dataset is illustrated in Section 9.
2. Formation of the Generalized Kumaraswamy Generalized Power Gompertz Distribution (GKGPG)
The Power Gompertz (PG) distribution [35] with positive parameter
and
has pdf and cdf given by:
(1)
and:
(2)
where
The cdf of the Generalized Kumaraswamy Generalized (GK-G) family is defined (for
) by:
(3)
The corresponding pdf of the GK-G family is given by:
(4)
where
,
and
are shape parameters.
The hazard rate function (hrf) of the GK-G family is given by:
(5)
Hence the pdf and cdf of the newly proposed Generalized Kumaraswamy Generalized Power Gompertz (GKGPG) distribution are given by:
(6)
And:
(7)
where
.
3. Graphical Description of the Generalized Kumaraswamy Generalized Power Gompertz Distribution (GKGPG)
Here, we graphically illustrate the probability density function and the cumulative distribution function of the generalized Kumaraswamy generalized power Gompertz distribution at different parameter values.
Remarks: Figure 1 shows how the density behaves under different parameter values. The probability density function of the generalized Kumaraswamy generalized power Gompertz distribution can be unimodal, decreasing, or right-skewed, depending on the indicated parameter values.
Remarks: Figure 2 shows the cdf plot. Clearly, the cdf approaches one as X tends to infinity and approaches zero as X tends to zero.
4. Statistical Properties of the Generalized Kumaraswamy Generalized Power Gompertz Distribution (GKGPG)
4.1. Asymptotic Behavior
This section examines the limiting behavior of the GKGPG distribution as
and as
.
Figure 1. PDF plot of the Generalized Kumaraswamy Generalized Power Gompertz Distribution (GKGPG) at different parameter values.
Figure 2. CDF plot of the Generalized Kumaraswamy Generalized Power Gompertz Distribution (GKGPG) at different parameter values.
For the pdf,
(8)
(9)
For the cdf,
(10)
(11)
4.2. Quantile Function
The quantile function (qf) of X, say
can be obtained by inverting Equation (3) numerically, and it is given by:
(12)
where
.
Ieren et al. [35] defined the quantile function of the power Gompertz distribution as:
(13)
By substituting Equation (12) into Equation (13), we obtain the quantile function of the GKGPG distribution as:
(14)
The quantile function derived above is used to obtain quantile-based measures such as skewness and kurtosis, to obtain the median of the distribution, and to generate random variates from the distribution.
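As an illustration of generating random variates from a quantile function, the following is a minimal sketch of inverse-transform sampling. The function name `sample_by_inversion` is hypothetical, and the unit-exponential quantile `exp_quantile` is only a stand-in: it is not the GKGPG quantile of Equation (14), which would have to be supplied in its place.

```python
import math
import random


def sample_by_inversion(Q, size, seed=0):
    """Draw `size` variates by applying the quantile function Q to uniforms."""
    rng = random.Random(seed)
    # rng.random() lies in [0, 1), so Q is never evaluated at u = 1.
    return [Q(rng.random()) for _ in range(size)]


def exp_quantile(u):
    """Stand-in quantile function (unit exponential), NOT the GKGPG quantile."""
    return -math.log(1.0 - u)


draws = sample_by_inversion(exp_quantile, size=5)
median = exp_quantile(0.5)  # the median is the quantile function at u = 1/2
print(draws)
print(median)
```

The same two lines of logic apply to any distribution whose quantile function is available in closed form or by numerical inversion of the cdf.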
4.3. Skewness and Kurtosis
How the skewness and kurtosis vary with the shape parameters can be examined using quantile-based measures, since the weaknesses of the conventional measure of kurtosis are well known. Kenney and Keeping [40] give the Bowley skewness based on quantiles as:
(15)
Moors et al. [41] gave the Moors quantile based Kurtosis as:
(16)
where the required quantile values are obtained from the quantile function given in Equation (14).
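The two quantile-based measures can be sketched in code as follows. The sketch assumes the standard textbook forms: the Bowley skewness (Q(3/4) − 2Q(1/2) + Q(1/4)) / (Q(3/4) − Q(1/4)) and the Moors kurtosis (Q(7/8) − Q(5/8) + Q(3/8) − Q(1/8)) / (Q(6/8) − Q(2/8)). The function names are hypothetical, and the unit-exponential quantile is a stand-in for the GKGPG quantile function of Equation (14).

```python
import math


def bowley_skewness(Q):
    """Bowley skewness computed from the quartiles of a quantile function Q."""
    q1, q2, q3 = Q(0.25), Q(0.5), Q(0.75)
    return (q3 - 2.0 * q2 + q1) / (q3 - q1)


def moors_kurtosis(Q):
    """Moors kurtosis computed from the octiles of a quantile function Q."""
    numerator = Q(7 / 8) - Q(5 / 8) + Q(3 / 8) - Q(1 / 8)
    return numerator / (Q(6 / 8) - Q(2 / 8))


def exp_quantile(u):
    """Stand-in quantile function (unit exponential), NOT the GKGPG quantile."""
    return -math.log(1.0 - u)


print(bowley_skewness(exp_quantile))
print(moors_kurtosis(exp_quantile))
```

Because both measures depend only on a handful of quantiles, they remain defined even when the moments of the distribution do not exist.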
4.4. Reliability Analysis of the GKGPG Distribution
The survival function of the generalized Kumaraswamy generalized power Gompertz distribution is given by (Figure 3):
(17)
where
.
The hazard function of the generalized Kumaraswamy generalized power Gompertz distribution is given by (Figure 4):
Figure 3. Survival plot of the Generalized Kumaraswamy generalized Power Gompertz Distribution (GKGPG) at different parameter values.
Figure 4. Hazard function plot of the Generalized Kumaraswamy generalized Power Gompertz Distribution (GKGPG) at different parameter values.
(18)
where
.
5. Order Statistics
For
from an independent and identically distributed random variables, let
denote a random sample from the Generalized Kumaraswamy generalized Power Gompertz Distribution with cdf
, and pdf given by Equations (3) and (4) respectively. Then the probability density function
of the ith order statistics of the GKGPG distribution is given by:
(19)
By substituting Equations (6) and (7) into the ith order statistics of the GKGPG distribution, we have that:
(20)
Hence the minimum order statistics
for the GKGPG distribution is given by:
(21)
Similarly, the maximum order statistics
for the GKGPG distribution is given by:
(22)
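The order-statistic densities of this section can be checked numerically with a short sketch. It assumes the standard form of the density of the ith order statistic, n! / ((i − 1)! (n − i)!) · f(x) · F(x)^(i−1) · (1 − F(x))^(n−i). The function `order_stat_pdf` is a hypothetical name, and the unit-exponential `exp_pdf` and `exp_cdf` are stand-ins for the GKGPG pdf and cdf.

```python
import math


def order_stat_pdf(x, i, n, pdf, cdf):
    """Density at x of the i-th order statistic of n i.i.d. draws."""
    coef = math.factorial(n) / (math.factorial(i - 1) * math.factorial(n - i))
    F = cdf(x)
    return coef * pdf(x) * F ** (i - 1) * (1.0 - F) ** (n - i)


def exp_pdf(x):
    """Stand-in pdf (unit exponential), NOT the GKGPG pdf."""
    return math.exp(-x)


def exp_cdf(x):
    """Stand-in cdf (unit exponential), NOT the GKGPG cdf."""
    return 1.0 - math.exp(-x)


n = 5
minimum_density = order_stat_pdf(0.3, 1, n, exp_pdf, exp_cdf)  # i = 1
maximum_density = order_stat_pdf(0.3, n, n, exp_pdf, exp_cdf)  # i = n
print(minimum_density, maximum_density)
```

Setting i = 1 and i = n gives the minimum and maximum order statistics, respectively; for the exponential stand-in the minimum is again exponential with rate n, which provides a simple sanity check.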
6. Parameter Estimation
6.1. Maximum Likelihood Estimation
Here, the parameters of the GKGPG distribution are estimated using the method of maximum likelihood. Let
be a random sample from the GKGPG distribution. The likelihood function is given by:
(23)
(24)
With
.
The maximum likelihood estimators
and
of the unknown parameters
and
are obtained by solving the following nonlinear score equations:
(25)
(26)
(27)
(28)
(29)
(30)
6.2. Estimation under Right-Censored Data
Hypothesis testing is discussed for both complete and censored data. However, the MPS is defined only for complete data, whereas the MLE is usually considered for right-censored data. Let us consider
a random right censored sample obtained from the GKGPG distribution with the parameter vector
. The censoring time
is fixed. So, the observation
is equal to
where:
(31)
In this case, the log-likelihood is obtained as follows:
(32)
(33)
The maximum likelihood estimators
and
of the unknown parameters
and
are obtained by solving the following nonlinear score equations:
(34)
(35)
(36)
(37)
(38)
(39)
Monte Carlo techniques or other iterative numerical methods can be used to determine the values of
and
.
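The structure of a right-censored log-likelihood can be sketched as follows, assuming the usual form in which an observed failure contributes log f(x) and a censored observation contributes log S(x). The function names and the small data set are hypothetical, and the exponential model is a stand-in chosen because its censored MLE has a closed form (number of failures divided by total observed time); for the GKGPG model the score equations must be solved numerically.

```python
import math


def censored_loglik(times, delta, log_pdf, log_surv):
    """Sum of delta*log f(x) + (1 - delta)*log S(x) over the sample."""
    return sum(d * log_pdf(x) + (1 - d) * log_surv(x)
               for x, d in zip(times, delta))


def exp_mle_censored(times, delta):
    """Closed-form MLE of the rate for the exponential stand-in model."""
    return sum(delta) / sum(times)


# Hypothetical sample with fixed censoring time tau = 2.0;
# delta = 1 marks an observed failure, delta = 0 a censored value.
times = [0.5, 1.2, 2.0, 2.0, 0.8]
delta = [1, 1, 0, 0, 1]


def loglik_at(lam):
    """Censored log-likelihood of the exponential stand-in at rate lam."""
    return censored_loglik(times, delta,
                           lambda x: math.log(lam) - lam * x,
                           lambda x: -lam * x)


lam_hat = exp_mle_censored(times, delta)
print(lam_hat, loglik_at(lam_hat))
```

Replacing `log_pdf` and `log_surv` with the logarithms of the GKGPG density and survival function, and maximizing with a numerical optimizer, would follow the same pattern.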
7. Test Statistic for Right Censored Data
Let
be n i.i.d. random variables grouped into r classes
. To assess the adequacy of a parametric model F₀:
(40)
When data are right censored and the parameter vector β is unknown, Bagdonavicius and Nikulin [38] proposed a test statistic Y² based on the vector:
(41)
This vector represents the differences between observed and expected numbers of failures (
and
) to fall into these grouping intervals
with
,
, where τ is a finite time. The authors considered
as random functions of the data such that the r intervals chosen have equal expected numbers of failures
.
The test statistic Y² is defined by:
(42)
where
and
is a generalized inverse of the covariance matrix
and:
is the maximum likelihood estimator of
on initial non-grouped data.
Under the null hypothesis H₀, the limit distribution of the statistic Y² is a chi-square with
degrees of freedom. The description and applications of modified chi-square tests are discussed in Voinov et al. [42].
The interval limits
for grouping data into j classes
are considered as data functions and defined by:
(43)
such that the expected numbers of failures
to fall into these intervals are
for any j, with
. The limiting distribution of this test statistic
is chi-square (see Voinov et al. [42]).
7.1. Criteria Test for GKGPG Distribution
To test the null hypothesis H₀ that the data belong to the GKGPG model, we construct a modified chi-squared type goodness-of-fit test based on the statistic Y². Suppose that τ is a finite time, and observed data are grouped into
sub-intervals
of
. The interval limits
are considered as random variables such that the expected numbers of failures in each interval
are the same, so the expected numbers of failures
are obtained as:
(44)
7.2. Estimated Matrix
and
The components of the estimated matrix
are derived from the estimated matrix
which is given by:
(45)
(46)
(47)
(48)
(49)
(50)
And:
7.3. Estimated Matrix
The estimated matrix
is defined by:
where:
Therefore the quadratic form of the test statistic can be obtained easily:
(51)
8. Simulations
8.1. Maximum Likelihood Estimation
We generated
right censored samples with different sizes (
) from the GKGPG model with parameters
,
,
,
,
and
. Using the R statistical software and the Barzilai-Borwein (BB) algorithm (Ravi [43]), we calculate the maximum likelihood estimators of the unknown parameters and their Mean Squared Errors (MSE). The results are given in Table 1.
The maximum likelihood estimated parameter values, presented in Table 1, agree closely with the true parameter values.
Table 1. Mean simulated values of MLEs
and their corresponding mean squared errors.
8.2. Criteria Test
To test the null hypothesis H₀ that right-censored data come from the GKGPG model, we compute the test statistic
as defined above for 10,000 simulated samples from the hypothesized distribution with different sizes (30, 50, 150, 350, 500). Then, we calculate empirical levels of significance, when
, corresponding to theoretical levels of significance (
,
,
). We choose
. The results are reported in Table 2.
The null hypothesis H₀ that the simulated samples follow the GKGPG distribution is well supported at the different levels of significance. Therefore, the test proposed in this work can be used to assess whether data follow this new distribution.
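The logic of comparing empirical and theoretical significance levels can be sketched with a much simpler stand-in. The code below does not compute the Y² statistic of Section 7; it assumes complete (uncensored) data, a fully specified uniform null, and the classical Pearson chi-square statistic with r equiprobable cells. All function names are hypothetical. The critical value 9.488 is the 0.95 quantile of the chi-square distribution with 4 degrees of freedom.

```python
import random


def pearson_chi2_uniform(sample, r):
    """Pearson statistic for r equiprobable cells on [0, 1)."""
    counts = [0] * r
    for u in sample:
        counts[min(int(u * r), r - 1)] += 1
    expected = len(sample) / r
    return sum((c - expected) ** 2 / expected for c in counts)


def empirical_level(n, r, critical_value, replications, seed=0):
    """Fraction of null samples whose statistic exceeds the critical value."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(replications):
        sample = [rng.random() for _ in range(n)]
        if pearson_chi2_uniform(sample, r) > critical_value:
            rejections += 1
    return rejections / replications


CRIT_5PCT_DF4 = 9.488  # chi-square 0.95 quantile with r - 1 = 4 df
level = empirical_level(n=200, r=5, critical_value=CRIT_5PCT_DF4,
                        replications=2000)
print(level)  # should be close to the theoretical level 0.05
```

In the setting of this paper, the inner step would instead simulate a right-censored GKGPG sample, re-estimate the parameters, and evaluate Y² against the corresponding chi-square critical value.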
9. Application
In this section, we apply the results obtained in this study to a real data set from reliability (Crowder et al. [44]), previously used by [45] [46] [47]. In an experiment to gain information on the strength of a certain type of braided cord after weathering, the strengths of 48 pieces of cord that had been weathered for a specified length of time were investigated. The observed right-censored strength values are given below:
26.8*, 29.6*, 33.4*, 35*, 36.3, 40*, 41.7, 41.9*, 42.5*, 43.9, 49.9, 50.1, 50.8, 51.9, 52.1, 52.3, 52.3, 52.4, 52.6, 52.7, 53.1, 53.6, 53.6, 53.9, 53.9, 54.1, 54.6, 54.8, 54.8, 55.1, 55.4, 55.9, 56, 56.1, 56.5, 56.9, 57.1, 57.1, 57.3, 57.7, 57.8, 58.1, 58.9, 59, 59.1, 59.6, 60.4, 60.7
We use the test statistic provided above to verify whether these data are modelled by the GKGPG distribution. To that end, we first calculate the maximum likelihood estimators of the unknown parameters:
(52)
Table 2. Simulated levels of significance for
test for GKGPG model against their theoretical values (
).
Table 3. Values of
.
Data are grouped into
intervals
. We give the necessary calculations in Table 3.
Then we obtain the value of the test statistic
:
(53)
For significance level
, the critical value
is greater than the value of
, so we can say that the proposed GKGPG model fits these data.
10. Conclusion
This research has introduced and studied a six-parameter continuous distribution called the generalized Kumaraswamy generalized power Gompertz distribution. The plots of the probability density and cumulative distribution functions have been analyzed. We have also derived some properties of the new distribution, such as asymptotic behavior, the quantile function and median, skewness and kurtosis, and reliability analysis. The distribution of order statistics has been derived, and estimation of the parameters from censored and uncensored random samples using maximum likelihood estimation (MLE) has been provided. We evaluated the new goodness-of-fit test statistic
and investigated a criteria test for the generalized Kumaraswamy generalized power Gompertz distribution. A simulation study was carried out, and the new model was applied to a real dataset. The newly proposed GKGPG model adequately fits the data.
The generalized Kumaraswamy generalized power Gompertz distribution defined in this paper has three shape parameters which control its skewness, kurtosis and tails. It can therefore be applied in more real-life situations. Maximum likelihood estimates are discussed, and modified chi-square goodness-of-fit tests for right-censored data are constructed. The statistical test provided in this article can be used to assess the fit of this model and its sub-models to censored data when the parameters are unknown. The results and efficacy of the proposed test are shown in a simulation study.