1. Introduction
The modeling of lifetime data is crucial in many applied sciences, including engineering, actuarial science, and medicine. Several lifetime distributions, for instance the exponential, gamma, Weibull, and log-normal distributions and their modifications, have been used to model lifetime data [1]. These distributions and their modifications have their own characteristics in terms of the shape of the failure rate function and the tail-heaviness, horizontal symmetry, and dispersion they can cover. The tail-heaviness of a data set can be measured by the excess kurtosis (EK), which is defined as $EK = K - 3$, where $K$ is the kurtosis of the data set. A distribution with $EK > 0$ is called fat-tailed (leptokurtic), and one with $EK < 0$ is called thin-tailed (platykurtic). Further, the symmetry and dispersion of a data set can be measured by the skewness (SK) and Fano factor (FF) values, respectively, where the Fano factor is the variance-to-mean ratio.
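As a small illustration of these three measures, the sketch below computes SK, EK, and FF for a sample. It assumes population (divide-by-n) central moments; the function name `shape_measures` and the toy sample are hypothetical and not taken from the paper.

```python
from statistics import fmean

def shape_measures(data):
    """Return (skewness, excess kurtosis, Fano factor) from population moments."""
    n = len(data)
    mean = fmean(data)
    # Central moments of order 2, 3, and 4 (divided by n, not n - 1).
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0  # EK = kurtosis - 3
    fano = m2 / mean                      # variance-to-mean ratio
    return skewness, excess_kurtosis, fano

# Toy sample: symmetric (SK = 0) and thin-tailed (EK < 0).
sk, ek, ff = shape_measures([1.0, 2.0, 3.0, 4.0, 5.0])
```

A negative `ek` here marks the toy sample as platykurtic; software packages that use bias-corrected estimators will return slightly different values.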
A lifetime distribution may be modified by using a finite mixture model to handle the complexity caused by heterogeneity. The Lindley distribution (LD) is one such finite mixture model under the Bayesian framework. It was introduced by Lindley (1958) [2] and has the density function:
$$f(y;\theta) = \frac{\theta^{2}}{1+\theta}\,(1+y)\,e^{-\theta y}, \quad y > 0,\ \theta > 0, \qquad (1)$$
where $\theta$ is the shape parameter that controls the shape of the distribution, and y is the respective random variable. The density function of this distribution is a two-component mixture of two different continuous distributions, namely the exponential($\theta$) and gamma($2, \theta$) distributions, with the mixing proportion $p = \theta/(\theta+1)$, where p is defined by using the shape parameter(s) of the latent variable distribution. The LD has an increasing failure rate function, while the exponential distribution has a constant failure rate function. Ghitany et al. (2008) [3] showed that the Lindley distribution is more flexible than the exponential distribution and provides a better fit for lifetime data, owing in particular to its flexible mathematical form and its failure rate behavior.
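The mixture representation of the LD can be checked numerically. The sketch below assumes the standard Lindley density $\theta^2(1+y)e^{-\theta y}/(1+\theta)$ and the mixing proportion $p = \theta/(\theta+1)$; the function names are illustrative only.

```python
import math

def lindley_pdf(y, theta):
    """Standard Lindley density (assumed form)."""
    return theta ** 2 / (1.0 + theta) * (1.0 + y) * math.exp(-theta * y)

def exp_gamma2_mixture_pdf(y, theta):
    """Mixture of exponential(theta) and gamma(shape 2, rate theta)."""
    p = theta / (theta + 1.0)                            # mixing proportion
    exp_part = theta * math.exp(-theta * y)              # exponential component
    gamma2_part = theta ** 2 * y * math.exp(-theta * y)  # gamma(2, theta) component
    return p * exp_part + (1.0 - p) * gamma2_part

# The two expressions agree at every test point.
checks = [(y, lindley_pdf(y, 1.5), exp_gamma2_mixture_pdf(y, 1.5))
          for y in (0.1, 1.0, 5.0)]
```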
Researchers have proposed several modifications of the LD to increase its flexibility further, especially with respect to the failure rate. These modifications introduce new parameter(s), which may be shape, scale, or location parameter(s). In general, a scale parameter stretches or shrinks the distribution, while a location parameter shifts its starting point. Dolati et al. (2009) [4] introduced a generalized Lindley distribution (GLD), Shanker et al. (2013) [5] obtained a two-parameter Lindley distribution (TwPLD), Abouammoh et al. (2015) [6] proposed a new generalized Lindley distribution (NGLD), and Monsef (2016) [7] introduced a Lindley distribution with location parameter (LwLD). Ekhosuehi et al. (2018) [8] obtained a new generalized two-parameter Lindley distribution (NGTwPLD). Tharshan and Wijekoon (2020) [9] proposed a location-based generalized Akash distribution (LGAD). Recently, Ramos et al. (2020) [10] introduced a two-parameter distribution with increasing and bathtub hazard rate (TwPD). Note that GLD, TwPLD, NGLD, LwLD, NGTwPLD, LGAD, and TwPD are two-component mixture models with two or three parameters. Table 1 summarizes the mixing proportions, mixing components, failure rates, and parameters of these distributions. In all distributions given in Table 1, the mixing proportion is defined in terms of the scale parameter of the mixing components. This may limit the flexibility to fit the scale parameter of the mixing components and the shape parameters of the latent variable distribution separately for a data set.
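Without incorporating the scale parameter into the mixing proportion, Shanker et al. (2013) [11] proposed the Quasi Lindley distribution (QLD) with the density function:
$$f(y;\alpha,\theta) = \frac{\theta\,(\alpha+\theta y)}{\alpha+1}\,e^{-\theta y}, \quad y > 0,\ \theta > 0,\ \alpha > -1, \qquad (2)$$
where $\alpha$ is the shape parameter introduced from the latent variable distribution and $\theta$ is the scale parameter introduced from the mixing components. Equation (2) is a two-component mixture of an exponential($\theta$) and a gamma($2, \theta$) distribution with the mixing proportion $p = \alpha/(\alpha+1)$. It has an increasing failure rate, and its skewness ($\sqrt{\beta_1}$), kurtosis ($\beta_2$), and Fano factor ($\gamma$) functions are:
$$\sqrt{\beta_1} = \frac{2(\alpha^{3}+6\alpha^{2}+6\alpha+2)}{(\alpha^{2}+4\alpha+2)^{3/2}}, \quad \beta_2 = \frac{3(3\alpha^{4}+24\alpha^{3}+44\alpha^{2}+32\alpha+8)}{(\alpha^{2}+4\alpha+2)^{2}}, \quad \text{and} \quad \gamma = \frac{\alpha^{2}+4\alpha+2}{\theta(\alpha+1)(\alpha+2)},$$
respectively. It therefore has more flexibility to cover tail-heaviness and dispersion than the distributions listed in Table 1.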
![]()
Table 1. Mixing proportions, mixing components, failure rate, and parameters of some notable existing Lindley family distributions.
Tharshan and Wijekoon (2020) [12] carried out a comparison study by introducing a new five-parameter generalized Lindley distribution (FPGLD). Using simulated and real-world data sets, they showed that the QLD can perform better than some other existing Lindley family distributions for higher SK, EK, and FF values. The FPGLD was introduced to ease the comparison. The density function of FPGLD (
) is given by:
(3)
where
, and
are the shape parameters introduced from the latent variable distribution, and
and
are scale and location parameters, respectively, introduced from the mixing components. Equation (3) is a two-component mixture of an exponential (
), and gamma (
) with the mixing proportion
. Although the QLD performs better than the other distributions when all three measures (skewness, excess kurtosis, and Fano factor) are high, its flexibility over the full ranges of these three measures is limited, since the shape parameter of the gamma mixing component (
) is fixed in QLD. That is,
, and
.
In this context, we modify the QLD by adding to the mixing components a shape parameter that is not fixed. The modified QLD is called the modified Quasi Lindley distribution (MQLD). The new distribution is a two-component mixture of an exponential and a gamma distribution. Since FPGLD (
) has the same mixing components as MQLD and accommodates several existing and new sub-models of the Lindley family through the parameters of its mixing proportion, we define the mixing proportion p for MQLD via a comparison study among the FPGLD (
) and its sub-models. Here, FPGLD (
) means FPGLD by setting its location parameter
. This comparison study will be helpful to define the mixing proportion of MQLD that provides a better fit without having additional shape parameter(s) in the new distribution.
The paper is outlined as follows: in Section 2, we introduce the MQLD with its density and distribution functions. We present the statistical properties of MQLD including moments and moment generating functions, and quantile function in Section 3. In Section 4, we derive the reliability properties of MQLD. The size-biased form of the MQLD is discussed in Section 5. Section 6 covers the unknown parameter estimation of MQLD. Finally, a simulation study is performed to verify the asymptotic property of unknown parameter estimation methods, and simulated and real-world data sets are used to illustrate its applicability over some other existing Lindley family distributions.
2. Formulation of the New Distribution
In this section, we introduce a finite mixture of two non-identical distributions called modified Quasi Lindley distribution with its probability density function (pdf) and cumulative distribution function (cdf).
2.1. Defining the Mixing Proportion p
For the comparison study, we simulated 50 random samples of size
from FPGLD (
) with various skewness (SK), excess kurtosis (EK), and Fano factor (FF) values by setting the parameter values. Then, FPGLD (
) and its sub-models for given
and
values in Table 2 are fitted to the simulated random samples. Table 2 shows sub-models and FPGLD (
), denoted
and highlights some of the sub-models that give minimum negative log-likelihood (
) values consistently with
for all simulated random samples. Based on the minimum number of parameters among the highlighted models, the sub-model denoted
is a simple distribution and can perform better than the others. We therefore use the mixing proportion of
,
to define the mixing proportion of MQLD. Detailed results of the study can be provided upon request.
2.2. Defining the pdf and cdf
Let Y be a non-negative random variable derived as a finite mixture of two non-identical distributions, exponential (
), and gamma (
) with the mixing proportion,
under the Bayesian framework, as follows:
where
and
are shape parameters, and
is a scale parameter and
and
.
Then, the pdf of the MQLD with parameters
, and
is defined as:
(4)
The first derivative of
for y is given by:
![]()
Table 2. Comparison study results of FPGLD (
) and its sub-models.
Then, the non-linear equation with respect to y,
gives the modes of the
, i.e. roots of the
. Clearly, more than one root may exist for
. Suppose
is a mode value of
, then
(local maximum), where
at
.
Figure 1 illustrates some of the possible shapes of the pdf of the MQLD.
The corresponding cdf of MQLD is given by:
(5)
where
is an incomplete gamma function defined as
.
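Because the cdf involves an incomplete gamma function, it is convenient to evaluate it numerically. The sketch below is written for a generic exponential-gamma mixture: the mixing proportion `p` is passed in directly rather than computed from the MQLD shape parameter, and the names `p`, `theta` (common rate of both components), and `delta` (gamma shape) are assumptions of this example. The regularized lower incomplete gamma function is evaluated with its standard power series so that only the Python standard library is needed.

```python
import math

def reg_lower_gamma(a, x, tol=1e-14, max_terms=10_000):
    """Regularized lower incomplete gamma P(a, x), a > 0, via its power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    for n in range(1, max_terms):
        term *= x / (a + n)          # x^n / (a (a+1) ... (a+n))
        total += term
        if term < tol * total:
            break
    # x^a e^{-x} * series / Gamma(a), assembled in log space for stability
    return total * math.exp(a * math.log(x) - x - math.lgamma(a))

def mixture_pdf(y, p, theta, delta):
    """pdf of p * exponential(theta) + (1 - p) * gamma(delta, theta), for y > 0."""
    exp_part = theta * math.exp(-theta * y)
    log_gamma_part = (delta * math.log(theta) + (delta - 1.0) * math.log(y)
                      - theta * y - math.lgamma(delta))
    return p * exp_part + (1.0 - p) * math.exp(log_gamma_part)

def mixture_cdf(y, p, theta, delta):
    """cdf of the same mixture."""
    exp_cdf = 1.0 - math.exp(-theta * y)
    gamma_cdf = reg_lower_gamma(delta, theta * y)
    return p * exp_cdf + (1.0 - p) * gamma_cdf

# Finite-difference slope of the cdf, to compare against the pdf.
h = 1e-6
slope = (mixture_cdf(1.3 + h, 0.4, 1.5, 2.5) - mixture_cdf(1.3 - h, 0.4, 1.5, 2.5)) / (2 * h)
```

The series converges for every positive argument but needs more terms as `theta * y` grows, so a continued-fraction evaluation or a library routine would be preferable in the far tail.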
3. Statistical Properties
In this section, we provide some important statistical properties of MQLD such as rth moments about the origin and about the mean, moment related measures, moment generating and characteristic functions, and quantile function.
3.1. Moments and Related Measures
We may utilize the moments to study the characteristics of a distribution such as horizontal symmetry, dispersion, and tail-heaviness. The following proposition gives the rth moment about the origin:
Proposition 1. The rth moment about the origin of the MQLD is given by:
(6)
![]()
Figure 1. The probability density of MQLD at different parameter values. (a) and (b):
and
are fixed, and
values are changed; (c) and (d):
and
are fixed, and
values are changed; (e) and (f):
and
are fixed, and
values are changed.
Proof.
Substituting
and 4 in Equation (6), the first four moments about the origin are derived as:
and
,
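respectively. Then, the rth-order moments about the mean can be obtained from the moments about the origin through the relationship $\mu_r = E[(Y-\mu)^r] = \sum_{j=0}^{r}\binom{r}{j}(-\mu)^{r-j}\mu'_j$, where $\mu = \mu'_1$ is the mean and $\mu'_j$ is the jth moment about the origin.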
Therefore, some rth-order moments about the mean are:
and
where,
respectively. Further, measures of skewness (
), measures of kurtosis (
), and the Index of dispersion/Fano factor (
) of the MQLD are derived as:
and
respectively. Figure 2 and Figure 3 show various patterns of the skewness, kurtosis, and Fano factor functions of MQLD at different parameter values. The patterns suggest that the MQLD is more flexible than the QLD in terms of covering various ranges of skewness, kurtosis, and Fano factor values.
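The chain from raw moments to central moments to the shape measures can be sketched for a generic exponential-gamma mixture. This example assumes the well-known raw moments of the two components, $r!/\theta^r$ for the exponential and $\Gamma(\delta+r)/(\Gamma(\delta)\theta^r)$ for the gamma, and takes the mixing proportion `p` as a free argument instead of the MQLD-specific expression; all function and parameter names are illustrative.

```python
import math

def raw_moment(r, p, theta, delta):
    """r-th moment about the origin of p*exponential(theta) + (1-p)*gamma(delta, theta)."""
    exp_part = math.factorial(r) / theta ** r
    gamma_part = math.gamma(delta + r) / (math.gamma(delta) * theta ** r)
    return p * exp_part + (1.0 - p) * gamma_part

def central_moment(r, p, theta, delta):
    """r-th moment about the mean via the binomial expansion of (Y - mu)^r."""
    mu = raw_moment(1, p, theta, delta)
    return sum(math.comb(r, j) * (-mu) ** (r - j) * raw_moment(j, p, theta, delta)
               for j in range(r + 1))

def shape_functions(p, theta, delta):
    """Return (skewness, kurtosis, Fano factor) of the mixture."""
    mu = raw_moment(1, p, theta, delta)
    m2 = central_moment(2, p, theta, delta)
    m3 = central_moment(3, p, theta, delta)
    m4 = central_moment(4, p, theta, delta)
    return m3 / m2 ** 1.5, m4 / m2 ** 2, m2 / mu

# Limiting cases: p = 1 is a pure exponential, p = 0 a pure gamma.
pure_exponential = shape_functions(1.0, 2.0, 3.0)
pure_gamma2 = shape_functions(0.0, 2.0, 2.0)
```

Evaluating `shape_functions` over a grid of `delta` values is one way to reproduce the kind of curves that Figures 2 and 3 describe, once the MQLD mixing proportion is substituted for `p`.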
3.2. Moment Generating and Characteristic Function
![]()
Figure 2. The skewness and kurtosis functions of MQLD at different parameter values.
![]()
Figure 3. The Fano factor function of MQLD at different parameter values.
The characteristics of a probability distribution are directly associated with its moment generating function (mgf) and characteristic function (cf). The following proposition provides the mgf of the MQLD:
Proposition 2. The mgf say
of the MQLD is given as follows:
(7)
Proof.
□
Similarly, the characteristic function, say
of the MQLD can be derived as follows:
(8)
where
is the imaginary unit.
3.3. Quantile Function
We may use the quantile function to estimate the quantiles and simulate the random samples for a probability distribution. The quantile function can be derived by solving
. The quantile function of MQLD is obtained as:
(9)
Since Equation (9) has no closed-form solution, we cannot compute the quantiles or simulate random variables from MQLD directly. However, both can be done by using numerical methods. Further, by substituting
and 0.75 in Equation (9), the first three quartiles can be derived by solving the following equations, respectively.
and
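One simple numerical method for such quantile equations is bisection on the cdf. The sketch below is generic: `quantile` accepts any continuous, increasing cdf on the positive half-line, so the MQLD cdf could be passed in once it is coded. The exponential cdf is used here only as a stand-in because its quantiles are known in closed form; the function name and arguments are hypothetical.

```python
import math

def quantile(cdf, q, upper=1.0, iterations=200):
    """Solve cdf(y) = q for y > 0 by bisection."""
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie strictly between 0 and 1")
    lo, hi = 0.0, upper
    while cdf(hi) < q:            # grow the bracket until it contains the quantile
        hi *= 2.0
    for _ in range(iterations):   # fixed number of halvings guarantees termination
        mid = 0.5 * (lo + hi)
        if cdf(mid) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

theta = 2.0  # illustrative rate of the stand-in exponential distribution

def exp_cdf(y):
    return 1.0 - math.exp(-theta * y)

# First, second, and third quartiles.
quartiles = [quantile(exp_cdf, q) for q in (0.25, 0.50, 0.75)]
```

Bisection is slower than Newton-type methods but needs only the cdf and cannot diverge, which makes it a reasonable default when the density can be unbounded near zero.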
3.4. Distribution of Order Statistics
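Linear combinations of order statistics are used to estimate the unknown parameters of a distribution. Let $Y_1, Y_2, \ldots, Y_n$ be n independent random variables from MQLD and $Y_{(1)} \le Y_{(2)} \le \cdots \le Y_{(n)}$ be the corresponding order statistics. Then, the pdf and cdf of the jth order statistic $Y_{(j)}$, $j = 1, 2, \ldots, n$, are given as:
$$f_{(j)}(y) = \frac{n!}{(j-1)!\,(n-j)!}\,f(y)\,[F(y)]^{j-1}\,[1-F(y)]^{n-j} \qquad (10)$$
and
$$F_{(j)}(y) = \sum_{l=j}^{n}\binom{n}{l}\,[F(y)]^{l}\,[1-F(y)]^{n-l}, \qquad (11)$$
respectively. By substituting the pdf $f(y)$ and cdf $F(y)$ of MQLD in Equations (10) and (11), the pdf and cdf of $Y_{(j)}$ for MQLD are obtained as: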
and
respectively.
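The general order-statistic formulas can be coded once and reused with any parent distribution. The sketch below assumes the standard expressions for the density and distribution function of the jth order statistic out of n; the function names are illustrative, and the uniform distribution is used only as a check because its order statistics follow a known beta law.

```python
import math

def order_stat_pdf(pdf, cdf, j, n, y):
    """Density of the j-th smallest of n i.i.d. draws with the given pdf and cdf."""
    coef = math.factorial(n) / (math.factorial(j - 1) * math.factorial(n - j))
    F = cdf(y)
    return coef * pdf(y) * F ** (j - 1) * (1.0 - F) ** (n - j)

def order_stat_cdf(cdf, j, n, y):
    """P(at least j of the n draws are <= y)."""
    F = cdf(y)
    return sum(math.comb(n, l) * F ** l * (1.0 - F) ** (n - l)
               for l in range(j, n + 1))

def uniform_pdf(y):
    return 1.0

def uniform_cdf(y):
    return y

# Median of three uniform draws, evaluated at y = 0.5.
median_pdf = order_stat_pdf(uniform_pdf, uniform_cdf, 2, 3, 0.5)
median_cdf = order_stat_cdf(uniform_cdf, 2, 3, 0.5)
```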
4. Reliability, Inequality and Entropy Measures
In this section, we derive and study some important reliability measures of MQLD, namely survival function/reliability function
, hazard rate function/failure rate function
, reversed hazard rate function
, cumulative hazard rate function
, mean residual life function
; inequality measures, namely Lorenz curve
, and Bonferroni curve
; and the Renyi entropy measure.
4.1. Survival and Hazard Rate Functions
The survival function and hazard rate function are crucial functions to specify a survival distribution. The survival function is the probability of surviving up to a point
. Then, the survival function of MQLD is defined as:
. (12)
Note that,
and
.
The instantaneous failure rate is described by the hazard rate function (hrf). The hrf of MQLD is given by:
(13)
Note that,
and
.
Figure 4 illustrates the possible patterns of the hrf of MQLD at different parameter values. The results indicate that MQLD has the capability to model the monotonic increasing and decreasing, constant, and bathtub failure rate shapes while QLD has only increasing failure rate shape.
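The shape of a hazard rate can be inspected numerically from the general relation hazard = pdf / (1 - cdf). The sketch below applies that relation to the one-parameter Lindley distribution, assuming its standard closed-form pdf and cdf, because the MQLD expressions are not restated here; it illustrates the increasing failure rate attributed to the LD in the Introduction. The helper `hazard` is generic and hypothetical, and could be reused with the MQLD pdf and cdf.

```python
import math

def hazard(pdf, cdf, y):
    """Hazard rate h(y) = f(y) / (1 - F(y))."""
    return pdf(y) / (1.0 - cdf(y))

theta = 1.5  # illustrative parameter value

def lindley_pdf(y):
    return theta ** 2 / (1.0 + theta) * (1.0 + y) * math.exp(-theta * y)

def lindley_cdf(y):
    return 1.0 - (theta + 1.0 + theta * y) / (theta + 1.0) * math.exp(-theta * y)

grid = [0.5 * k for k in range(21)]   # y = 0, 0.5, ..., 10
rates = [hazard(lindley_pdf, lindley_cdf, y) for y in grid]
is_increasing = all(a < b for a, b in zip(rates, rates[1:]))
```

Classifying a curve as increasing, decreasing, or bathtub-shaped then reduces to checking the signs of successive differences of `rates` on a sufficiently fine grid.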
The reversed hazard rate function of MQLD is defined as:
(14)
and the corresponding cumulative hazard rate function that represents the total number of failures over an interval of time is defined for MQLD as:
(15)
![]()
Figure 4. The hazard rate function of MQLD at different parameter values of
,
, and
.
This is a monotonic increasing function and satisfies
and
.
4.2. Mean Residual Life Function
The mean residual life function represents the expected additional lifetime of a component that has survived up to time
. It is an important characteristic in the reliability study. The mean residual life function say
, is defined as:
. The following proposition gives the
for the MQLD.
Proposition 3. The mean residual life function of MQLD is given by:
(16)
Proof.
and note that,
Therefore,
□
Then, Equation (16) satisfies:
,
, and
.
4.3. Lorenz and Bonferroni Curves
The Lorenz and Bonferroni curves are used to measure income inequality. They are widely used in reliability, insurance, economics, and medicine. The
Lorenz curve say
, is defined as:
, and the Bonferroni curve say
, is defined as
. By substituting the integral part,
value from the previous proposition’s proof, the
for MQLD can be obtained as:
(17)
Then, the
for the MQLD is given by:
(18)
4.4. Renyi Entropy
Entropy measures the variation of uncertainty in a distribution and is widely used in information theory. The Renyi entropy is a popular uncertainty measure, say
and it is an extension of Shannon entropy
[13]. The
is defined as:
. The following proposition derives the Renyi entropy for MQLD:
Proposition 4. The Renyi entropy for the MQLD is obtained as:
(19)
Proof.
□
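A closed-form entropy expression can be sanity-checked by direct numerical integration. The sketch below assumes the usual definition of the Renyi entropy of order $\xi$ ($\xi > 0$, $\xi \neq 1$), namely $\frac{1}{1-\xi}\log\int_0^\infty f(y)^{\xi}\,dy$, and approximates the integral with the trapezoidal rule on a truncated range. It further assumes the density is finite at zero. The exponential density is a stand-in with a known answer; the function name and the truncation point are choices made for this example.

```python
import math

def renyi_entropy(pdf, xi, upper, steps=200_000):
    """Renyi entropy of order xi by the trapezoidal rule on [0, upper]."""
    h = upper / steps
    total = 0.5 * (pdf(0.0) ** xi + pdf(upper) ** xi)
    for k in range(1, steps):
        total += pdf(k * h) ** xi
    return math.log(total * h) / (1.0 - xi)

theta = 2.0  # illustrative rate

def exp_pdf(y):
    return theta * math.exp(-theta * y)

numeric = renyi_entropy(exp_pdf, xi=0.5, upper=60.0)
# Known closed form for the exponential distribution: log(xi)/(xi - 1) - log(theta).
closed_form = math.log(0.5) / (0.5 - 1.0) - math.log(theta)
```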
5. The Size-Biased Form of MQLD
Weighted distributions are used when observations are recorded with unequal probabilities. Applications of weighted distributions in reliability, medical, and ecological sciences were studied by Patil et al. (1978) [14]. The weighted random variable
of MQLD is defined as:
(20)
where
.
When
, the resulting distribution is called size-biased version of MQLD with order
, and is defined as:
,
where
is the respective random variable. The following proposition gives the density function for the size-biased version of MQLD:
Proposition 5. The density function for the rth-order size-biased form of MQLD is derived as:
(21)
Proof.
.
Note that
Therefore,
□
The length-biased density function can be obtained by substituting
in Equation (21) for MQLD and is given as:
(22)
and corresponding cdf is given as:
(23)
where
is the lower incomplete gamma function.
The mean and variance of length-biased MQLD are:
and
, respectively.
6. Parameter Estimation
This section introduces three methods for estimating the parameters of MQLD: the method of moments, maximum likelihood estimation, and weighted least squares estimation.
6.1. Method of Moment Estimation (MME)
The method of moment estimators of
, and
, abbreviated as
, and
can be derived by equating the raw-moments, say
, to the sample moments, say
. Then, we need to solve the following system of equations:
and
.
Since the simultaneous equations have no closed-form solution, numerical methods such as Newton-Raphson can be employed to find their roots.
6.2. Maximum Likelihood Estimation (MLE)
The MLE method is the most commonly employed because of its favorable asymptotic properties. Let
be the observed values from MQLD with the parameters
, and
. The likelihood function of the ith sample value
can be written as:
and the log likelihood function is given by:
.
By solving the expressions
,
, and
, the maximum likelihood estimators of
, and
, abbreviated as
, and
can be obtained. The system of equations is:
and
, where
The asymptotic confidence intervals for the parameters
, and
are derived from asymptotic theory. The estimators are asymptotically trivariate normal with mean
and the observed information matrix:
at
, and
. That is,
. The elements of the observed information matrix are given in the Appendix.
Therefore, the
confidence interval for the parameters
, and
are given by
wherein, the
, and
are the variance of
, and
, respectively, and can be derived by diagonal elements of
and
is the critical value at a level of significance.
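Once an estimate and its asymptotic variance are available, the interval itself is a one-line computation. The sketch below builds a normal-theory (Wald) interval, estimate ± z × standard error; the function name `wald_interval` and the example numbers are hypothetical, and the variance is assumed to come from the corresponding diagonal element of the inverse observed information matrix.

```python
from statistics import NormalDist

def wald_interval(estimate, variance, level=0.95):
    """Two-sided normal-theory confidence interval for one parameter."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)   # upper critical value
    half_width = z * variance ** 0.5
    return estimate - half_width, estimate + half_width

# Hypothetical estimate 2.0 with asymptotic variance 0.04.
lower, upper = wald_interval(2.0, 0.04)
```

For parameters restricted to be positive, such an interval can extend below zero in small samples, so intervals built on a log scale are a common alternative.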
6.3. Weighted Least Square Estimation (WLE)
The weighted least square estimators of
, and
, abbreviated as
, and
can be obtained by minimizing:
with respect to
,
where
is the cdf of the order statistic defined in Section 3.4. Then, the estimators can be found by solving the non-linear equations:
where
7. Simulation Study
In this section, we examine the performance of the MME and MLE method in the unknown parameter estimation of MQLD with respect to the sample size n. Further, a comparison study is performed among the MQLD, QLD, and LD based on minimum negative log-likelihood (
) by using various simulated samples from MQLD. The following algorithm is used to generate the random samples from MQLD:
Algorithm
1) Generate
2) Solve the non-linear equation for
;
.
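As an alternative to inverting the cdf, a finite mixture can also be sampled by composition: pick a component with probability p, then draw from it. The sketch below uses that substitute technique, not the algorithm stated above, and it is valid only when the mixing proportion satisfies 0 ≤ p ≤ 1. The mixing proportion is passed in directly, and the names `p`, `theta` (common rate), and `delta` (gamma shape) are assumptions of this example.

```python
import random

def sample_exp_gamma_mixture(n, p, theta, delta, seed=None):
    """Draw n values from p*exponential(theta) + (1-p)*gamma(delta, rate theta)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("composition sampling requires 0 <= p <= 1")
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        if rng.random() < p:
            draws.append(rng.expovariate(theta))             # rate parameterization
        else:
            draws.append(rng.gammavariate(delta, 1.0 / theta))  # shape, scale
    return draws

sample = sample_exp_gamma_mixture(20_000, p=0.4, theta=2.0, delta=3.0, seed=1)
sample_mean = sum(sample) / len(sample)
theoretical_mean = 0.4 / 2.0 + 0.6 * 3.0 / 2.0   # p/theta + (1 - p)*delta/theta
```

Composition avoids root finding altogether, while the inverse-transform algorithm remains applicable even if the mixing weight falls outside the unit interval.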
7.1. Performance of MME and MLE Methods
The simulation study is designed to examine the performance of
and
with respect to the sample size n as follows:
1) Generate one thousand samples of size n.
2) Compute the average biases, and mean squared errors of
, and
of the parameters
, and
by using the equations:
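a) The average bias of an estimator $\hat{\omega}$ of a parameter $\omega$ is $\frac{1}{1000}\sum_{i=1}^{1000}(\hat{\omega}_i - \omega)$, where $\hat{\omega}_i$ is the estimate obtained from the ith sample.
b) The average MSE is $\frac{1}{1000}\sum_{i=1}^{1000}(\hat{\omega}_i - \omega)^2$.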
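The two steps above can be sketched as a small Monte Carlo loop. Because the MQLD estimators require a numerical solver, this example substitutes the exponential distribution, whose rate MLE is simply 1 / (sample mean); the function names, the true rate 2.0, and the seed are all illustrative assumptions.

```python
import random

def bias_and_mse(estimates, true_value):
    """Average bias and mean squared error over replicated estimates."""
    n = len(estimates)
    bias = sum(e - true_value for e in estimates) / n
    mse = sum((e - true_value) ** 2 for e in estimates) / n
    return bias, mse

def simulate(n, theta=2.0, replications=1000, seed=0):
    """Replicate the exponential-rate MLE on samples of size n."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(replications):
        sample = [rng.expovariate(theta) for _ in range(n)]
        estimates.append(len(sample) / sum(sample))   # MLE of the rate: 1 / mean
    return bias_and_mse(estimates, theta)

bias_small, mse_small = simulate(20)
bias_large, mse_large = simulate(180)
```

Comparing `mse_small` with `mse_large` shows the same qualitative pattern the study looks for: errors shrink as the sample size grows.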
Table 3 and Table 4 represent the performance of MME and MLE method for the combinations of parameter values
that represents the unimodal case and
that represents the monotonic decreasing case, respectively. They summarize average MMEs, MLEs, biases, and MSEs for different sample sizes and corresponding results of MLE method are given in parentheses. We consider sample sizes of
and 180.
From Table 3 and Table 4, the biases and MSEs decrease as n increases for both methods, so both methods exhibit the expected asymptotic behavior. However, comparing the MME and MLE methods for the given combinations of parameter values and different sample sizes, the MLE method is clearly better, since its estimates converge to the actual parameter values more strongly than those of the method of moments. Further, this advantage is most pronounced for large samples. Among the MLEs of unknown parameters,
and
are overestimated and
is underestimated for both combinations of parameters. Further,
has low biases and MSEs while
has high biases and MSEs.
![]()
Table 3. Performance of MME and MLE methods for MQLD (
).
![]()
Table 4. Performance of MME and MLE methods for MQLD (
).
7.2. Comparison Study among MQLD, QLD and LD
This comparison study is performed to show how the MQLD provides a better fit than QLD and LD for the various data sets that are simulated from MQLD. Since the ranges of skewness, and kurtosis of QLD are,
, and
, respectively, we define three ranges of SKs and EKs to simulate data sets as R1, R2 and R3, where R1:
and
, R2:
, and
, and R3:
and
. This study is designed as follows:
1) Generate 8 random samples of size,
from MQLD (
) for each range R1, R2, and R3.
2) Fit the MQLD, QLD, and LD to the 24 generated random samples.
3) Make the comparisons based on minimum
values.
Here, the estimates of the unknown parameters for the distributions are derived by the MLE method. Tables 5-7 summarize
values of MQLD, QLD, and LD for the generated random samples. Based on minimum
value, the MQLD performs better than QLD, and LD in all given ranges of SK, EK, and FF.
![]()
Table 5.
values of MQLD, QLD, and LD for the simulated random samples of R1.
![]()
Table 6.
values of MQLD, QLD, and LD for the simulated random samples of R2.
![]()
Table 7.
values of MQLD, QLD, and LD for the simulated random samples of R3.
8. Real-World Applications
In this section, we fit the MQLD to three published real data sets and compare its performance with some existing Lindley family distributions. The
, Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC) and Kolmogorov-Smirnov Statistics (K-S Statistics) are utilized to compare the performance of distributions. Based on the minimum value of these statistics the best model is chosen to fit the data. The unknown parameters of distributions are estimated by using the MLE method. The three real-data sets are:
Data set 1: Fuller et al. (1994) [15] discussed this data set that represents the strength of glass of the aircraft window. The data are:
18.83, 20.80, 21.657, 23.03, 23.23, 24.05, 24.321, 25.50, 25.52, 25.80, 26.69, 26.77, 26.78, 27.05, 27.67, 29.90, 31.11, 33.20, 33.73, 33.76, 33.89, 34.76, 35.75, 35.91, 36.98, 37.08, 37.09, 39.58, 44.045, 45.29, 45.381.
Data set 2: The following data set represents the tree circumferences in Marshall, Minnesota and reported by Shakil et al. (2010) [16].
1.8, 1.8, 1.9, 2.4, 3.1, 3.4, 3.7, 3.7, 3.8, 3.9, 4.0, 4.1, 4.9, 5.1, 5.1, 5.2, 5.3, 5.5, 8.3, 13.7.
Data set 3: This data set, used by Murthy et al. (2004) [17], represents the failure times, in weeks, of 50 items.
0.013, 0.065, 0.111, 0.111, 0.163, 0.309, 0.426, 0.535, 0.684, 0.747, 0.997, 1.284, 1.304, 1.647, 1.829, 2.336, 2.838, 3.269, 3.977, 3.981, 4.520, 4.789, 4.849, 5.202, 5.291, 5.349, 5.911, 6.018, 6.427, 6.456, 6.572, 7.023, 7.087, 7.291, 7.787, 8.596, 9.388, 10.261, 10.713, 11.658, 13.006, 13.388, 13.842, 17.152, 17.283, 19.418, 23.471, 24.777, 32.795, 48.105.
Some important statistical measures for data sets 1, 2, and 3, are summarized in Table 8.
The empirical histogram of the data sets and the fitted densities of MQLD, QLD, and LD are displayed in Figure 5. One can observe that the fitted density of MQLD gives a closer fit with the empirical distributions of the data sets. Table 9 lists the MLEs, SDs,
, AICs, BICs, and K-S statistics with critical values for the models fitted to data sets 1, 2, and 3. Table 9 shows that the MQLD provides the lowest values for the
, AIC, and BIC among all fitted models. Then, it is clear from Table 9 and Figure 5 results that the MQLD provides a better fit than the QLD and LD.
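The comparison criteria can be computed directly from a fitted model. The sketch below fits the one-parameter Lindley distribution to data set 2, assuming its standard density and the closed-form MLE that follows from setting the score to zero, then evaluates the negative log-likelihood, AIC = 2k + 2(-log L), BIC = k log n + 2(-log L), and the Kolmogorov-Smirnov distance. Function names are illustrative, and the resulting numbers are not asserted to match Table 9.

```python
import math

data = [1.8, 1.8, 1.9, 2.4, 3.1, 3.4, 3.7, 3.7, 3.8, 3.9,
        4.0, 4.1, 4.9, 5.1, 5.1, 5.2, 5.3, 5.5, 8.3, 13.7]  # data set 2

def lindley_mle(sample):
    """Closed-form MLE: positive root of m*t^2 + (m - 1)*t - 2 = 0, m = sample mean."""
    m = sum(sample) / len(sample)
    return (-(m - 1.0) + math.sqrt((m - 1.0) ** 2 + 8.0 * m)) / (2.0 * m)

def lindley_nll(theta, sample):
    """Negative log-likelihood of the Lindley distribution."""
    n = len(sample)
    loglik = (n * (2.0 * math.log(theta) - math.log(1.0 + theta))
              + sum(math.log(1.0 + y) for y in sample) - theta * sum(sample))
    return -loglik

def lindley_cdf(y, theta):
    return 1.0 - (theta + 1.0 + theta * y) / (theta + 1.0) * math.exp(-theta * y)

def ks_statistic(sample, cdf):
    """Largest gap between the empirical cdf and the fitted cdf."""
    ys = sorted(sample)
    n = len(ys)
    return max(max((i + 1) / n - cdf(y), cdf(y) - i / n) for i, y in enumerate(ys))

theta_hat = lindley_mle(data)
nll = lindley_nll(theta_hat, data)
k, n = 1, len(data)                  # one free parameter, n observations
aic = 2 * k + 2 * nll
bic = k * math.log(n) + 2 * nll
ks = ks_statistic(data, lambda y: lindley_cdf(y, theta_hat))
```

For a model without a closed-form MLE, such as a three-parameter mixture, `lindley_mle` would be replaced by a numerical minimizer of the negative log-likelihood while the criteria are computed in the same way.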
![]()
Table 8. Statistical measures for data set 1, 2, and 3
![]()
Table 9. MLEs, SDs, AICs, BICs, K-S statistics and its critical values of the fitted models.
![]()
Figure 5. Empirical histograms of the data sets with fitted densities of MQLD, QLD, TwPLD, and LD.
9. Conclusion
In this paper, we have introduced a new three-parameter Lindley family distribution, called the modified Quasi Lindley distribution (MQLD). We studied its fundamental structural properties, such as the density, moments and related measures, quantile function, order statistics, failure rate function, mean residual life function, inequality and entropy measures, and size-biased form. The new distribution is very flexible for lifetime data. Its density function covers various ranges of horizontal symmetry, tail-weight, and dispersion. Further, the failure rate function of the new distribution can be increasing, decreasing, constant, or bathtub-shaped. The maximum likelihood method was used to estimate the unknown model parameters, and a simulation study indicates that it offers better performance and accuracy than the method of moments. A simulation study and three real-world applications showed the superiority of MQLD over the Quasi Lindley distribution, the two-parameter Lindley distribution, and the Lindley distribution.
Appendix
The terms
, and
are defined as follows:
,
, and
.
Then, the second order partial derivatives of the log-likelihood function are as follows:
and
where
is the trigamma function, defined as the second derivative of the logarithm of the gamma function, $\frac{d^{2}}{dz^{2}}\ln\Gamma(z)$.