The Alpha-Beta-Gamma Skew Normal Distribution and Its Application
1. Introduction
Over the years, the family of skew-symmetric distributions has received considerable attention from many researchers. Azzalini [1] first proposed the skew normal distribution (SN) with the density
$$f(z; \lambda) = 2\varphi(z)\Phi(\lambda z), \quad z \in \mathbb{R},$$ (1)
where $\varphi$ and $\Phi$ are the probability density function (pdf) and cumulative distribution function (cdf) of the standard normal distribution, respectively. Balakrishnan [2] and Gupta [3] studied a generalization of the skew normal distribution (GSN) and further discussed its properties. Yadegaria et al. [4] introduced a generalization of the Balakrishnan skew normal distribution which includes GSN as a special case. An epsilon skew normal distribution and a new class of skew-Cauchy distributions are studied in [5] and [6], respectively. Another skew normal distribution, named the power normal distribution (PN), was proposed in [7]; it addresses the estimation of the skewness parameter of SN when the sample size is not very large.
Recently, Elal-Olivero [8] proposed a new form of skew distribution with bimodal behavior, named the alpha skew normal distribution (ASN), with the density
$$f(x; \alpha) = \frac{(1 - \alpha x)^2 + 1}{2 + \alpha^2}\varphi(x), \quad x \in \mathbb{R},$$ (2)
where $\alpha \in \mathbb{R}$ controls the skewness and kurtosis; Equation (2) has at most two modes. The alpha-skew-Laplace distribution (ASL) and a generalization of the alpha skew normal distribution (GASN) are studied in [9] and [10], respectively. Moreover, Chakraborty et al. [11] proposed the Balakrishnan alpha skew normal distribution and discussed a generalized bimodal normal distribution, denoted by BN(n), with the density
(3)
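where n is a positive even integer and M is the normalizing constant. Furthermore, Shafiei et al. [12] proposed a new class of skew normal distributions called the alpha-beta skew normal distribution (ABSN), which is more flexible than SN and ASN, with the density
$$f(x; \alpha, \beta) = \frac{(1 - \alpha x - \beta x^3)^2 + 1}{2 + \alpha^2 + 6\alpha\beta + 15\beta^2}\varphi(x), \quad x \in \mathbb{R},$$ (4)
where $\alpha, \beta \in \mathbb{R}$ control the skewness and kurtosis; Equation (4) has at most four modes.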
The motivations for considering this new family of distributions are as follows. Firstly, the new distribution family contains some classical distributions, such as the normal distribution, ASN and ABSN. Secondly, the admissible intervals for the skewness and the kurtosis parameters are wider than those of ASN and ABSN. Lastly, the analysis of a real data set indicates that the new distribution family is more flexible than some other known skew distributions. The remainder of the paper is organized as follows. In Section 2, we propose the new class of skew normal distributions and discuss its properties. In Section 3, we provide the maximum likelihood estimates (MLEs) of the parameters and verify the performance of the estimates by random simulation. A real application of the new distribution family is considered in Section 4. Section 5 presents some conclusions. Lastly, some proofs and the elements of the Fisher information (FI) matrix are given in the appendix.
2. Alpha-Beta-Gamma Skew Normal Distribution
In this section, we define a class of skew distributions that allow the fitting of multimodal data sets.
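Definition 1 A random variable X follows the alpha-beta-gamma skew normal distribution, denoted by $X \sim ABGSN(\alpha, \beta, \gamma)$, if it has the pdf
$$f(x; \alpha, \beta, \gamma) = \frac{(1 - \alpha x - \beta x^3 - \gamma x^5)^2 + 1}{2 + \alpha^2 + 6\alpha\beta + 15\beta^2 + 30\alpha\gamma + 210\beta\gamma + 945\gamma^2}\varphi(x), \quad x \in \mathbb{R},$$ (5)
where $\alpha, \beta, \gamma \in \mathbb{R}$.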
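As a quick numerical check of the definition, the following minimal sketch evaluates the density and integrates it with a composite Simpson rule. It assumes that pdf (5) is the natural extension of the ASN and ABSN densities, that is, $((1 - \alpha x - \beta x^3 - \gamma x^5)^2 + 1)\varphi(x)$ divided by a normalizing constant obtained from the even moments of the standard normal distribution. The function names `abgsn_pdf`, `norm_const` and `simpson` are hypothetical helpers, not part of any package.

```python
import math


def phi(x):
    """Standard normal pdf."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_const(alpha, beta, gamma):
    """Assumed normalizing constant: 2 + E[(alpha*Z + beta*Z^3 + gamma*Z^5)^2], Z ~ N(0, 1).

    Uses E[Z^2] = 1, E[Z^4] = 3, E[Z^6] = 15, E[Z^8] = 105, E[Z^10] = 945.
    """
    return (2.0 + alpha ** 2 + 6.0 * alpha * beta + 15.0 * beta ** 2
            + 30.0 * alpha * gamma + 210.0 * beta * gamma + 945.0 * gamma ** 2)


def abgsn_pdf(x, alpha, beta, gamma):
    """Assumed ABGSN density (the functional form is an assumption of this sketch)."""
    p = alpha * x + beta * x ** 3 + gamma * x ** 5
    return ((1.0 - p) ** 2 + 1.0) / norm_const(alpha, beta, gamma) * phi(x)


def simpson(f, lo, hi, n=4000):
    """Composite Simpson rule on [lo, hi] with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(lo + i * h)
    return total * h / 3.0


# The density should integrate to (numerically) one for any real parameters.
total_mass = simpson(lambda x: abgsn_pdf(x, 1.0, -0.5, 0.2), -12.0, 12.0)
print(total_mass)
```

Setting all three parameters to zero in this sketch recovers the standard normal density, in line with the special cases listed in Remark 2.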
Figure 1 and Figure 2 show the pdf of the $ABGSN(\alpha, \beta, \gamma)$ for different parameter values. The proposed density is very general: the parameters $\alpha$, $\beta$ and $\gamma$ have a substantial impact on the skewness and on the number of extreme points of the distribution.
Remark 1 Derivative of pdf (5) with respect to x is given by
(6)
Figure 1. Alpha-beta-gamma skew normal pdf.
has six modes (black solid line),
has two modes (red dashed line).
Figure 2. Alpha-beta-gamma skew normal pdf.
has three modes (black solid line),
has five modes (red dashed line).
Equation (6) has at most eleven zeros; therefore the pdf has at most six modes, see Figure 1 and Figure 2.
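The mode count can be explored numerically. The sketch below counts strict local maxima of the density on a fine grid; it assumes the same functional form for pdf (5) as the natural extension of ASN and ABSN, and `count_modes` is a hypothetical helper. A grid search is only a heuristic, since modes closer together than the grid spacing or outside the grid range would be missed.

```python
import math


def phi(x):
    """Standard normal pdf."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def abgsn_pdf(x, alpha, beta, gamma):
    """Assumed ABGSN density (the functional form is an assumption of this sketch)."""
    c = (2.0 + alpha ** 2 + 6.0 * alpha * beta + 15.0 * beta ** 2
         + 30.0 * alpha * gamma + 210.0 * beta * gamma + 945.0 * gamma ** 2)
    p = alpha * x + beta * x ** 3 + gamma * x ** 5
    return ((1.0 - p) ** 2 + 1.0) / c * phi(x)


def count_modes(alpha, beta, gamma, lo=-8.0, hi=8.0, n=16000):
    """Count grid points that are higher than the left neighbor and not lower than the right one."""
    ys = [abgsn_pdf(lo + (hi - lo) * i / n, alpha, beta, gamma) for i in range(n + 1)]
    return sum(1 for i in range(1, n) if ys[i - 1] < ys[i] >= ys[i + 1])


print(count_modes(0.0, 0.0, 0.0))   # standard normal case
print(count_modes(5.0, 0.0, 0.0))   # ASN case with a large alpha
print(count_modes(1.0, -0.5, 0.2))  # a general parameter choice
```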
Proposition 1 The cdf of
is
Remark 2 Some properties of the
are as follows
1)
,
and
degenerate to
,
and
, respectively.
2) If
, pdf (5) reduces to
(7)
In the sequel, we call Equation (7) the beta-gamma skew normal distribution, denoted by
.
3) If
, pdf (5) reduces to
(8)
In the sequel, we call Equation (8) the alpha-gamma skew normal distribution, denoted by
.
4) If
, pdf (5) reduces to
(9)
Equation (9) is known as the beta skew normal distribution and denoted by
, see Shafiei [12].
5) If
, pdf (5) reduces to
(10)
In the sequel, we call Equation (10) the gamma skew normal distribution, denoted by
.
6) If
, when
,
and
, then X follows BN(2), BN(6) and BN(10), respectively.
7) If
, then
.
Proposition 2 Let
, then for
, we have
where
,
.
Remark 3 In particular, we have
Using R software to optimize
and
with respect to
, we can get the bounds
Remark 4 Let
and
stand for its skewness and kurtosis coefficients, then we have
Calculating with the R package DEoptim, we get the bounds
Note that the admissible intervals for the skewness and the kurtosis of ABGSN are longer than the corresponding intervals of ASN, which are (−0.81, 0.81) and (−1.30, 0.75), and of ABSN, which are (−1.19, 1.19) and (−1.77, 3.30), respectively.
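The skewness bound quoted for ASN can be reproduced with a few lines of code. The sketch below is not the DEoptim computation used in this paper: it computes raw moments in closed form from the moments of the standard normal distribution and then runs a plain grid search over $\alpha$ with $\beta = \gamma = 0$. It assumes that pdf (5) is the polynomial weight $(1 - \alpha x - \beta x^3 - \gamma x^5)^2 + 1$ times $\varphi(x)$, suitably normalized, and it reports excess kurtosis, which is an assumption about the convention behind the quoted intervals. All function names are hypothetical.

```python
def gauss_moment(m):
    """E[Z^m] for Z ~ N(0, 1): zero for odd m, (m - 1)!! for even m."""
    if m % 2:
        return 0.0
    result = 1.0
    for j in range(m - 1, 0, -2):
        result *= j
    return result


def weight_poly(alpha, beta, gamma):
    """Coefficients c_0..c_10 of (1 - alpha*x - beta*x^3 - gamma*x^5)^2 + 1."""
    base = [1.0, -alpha, 0.0, -beta, 0.0, -gamma]
    coef = [0.0] * 11
    for i, u in enumerate(base):
        for j, v in enumerate(base):
            coef[i + j] += u * v
    coef[0] += 1.0
    return coef


def raw_moment(k, alpha, beta, gamma):
    """E[X^k] under the assumed density, as a ratio of Gaussian moment sums."""
    coef = weight_poly(alpha, beta, gamma)
    num = sum(c * gauss_moment(k + j) for j, c in enumerate(coef))
    den = sum(c * gauss_moment(j) for j, c in enumerate(coef))
    return num / den


def skew_kurt(alpha, beta, gamma):
    """Return (skewness, excess kurtosis) from the first four raw moments."""
    m1, m2, m3, m4 = (raw_moment(k, alpha, beta, gamma) for k in range(1, 5))
    var = m2 - m1 ** 2
    mu3 = m3 - 3.0 * m1 * m2 + 2.0 * m1 ** 3
    mu4 = m4 - 4.0 * m1 * m3 + 6.0 * m1 ** 2 * m2 - 3.0 * m1 ** 4
    return mu3 / var ** 1.5, mu4 / var ** 2 - 3.0


# Grid search over alpha in [-10, 10] for the ASN special case (beta = gamma = 0).
asn_max_skew = max(skew_kurt(a / 100.0, 0.0, 0.0)[0] for a in range(-1000, 1001))
print(asn_max_skew)
```

Extending the search to all three parameters is a three-dimensional global optimization problem, which is why a stochastic optimizer such as differential evolution is a reasonable choice there.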
Proposition 3 Let
, then the MGF of X is
where
Theorem 1 If
, then
1) The random variable
has the following density function
where
and
is the pdf of the chi-square distribution with 1 degree of freedom.
2) The density function of
is given by
where
is the density function of standard Half-Normal distribution.
3) Let
, then its density function is
Remark 5 The pdf (5) can be separated into symmetric and asymmetric parts as follows
where
is symmetric and the second part is the asymmetric one.
Definition 2 If a random variable V has density function
(11)
then we say V follows symmetric component alpha-beta-gamma skew normal distribution, denoted by
.
Figure 3 shows the pdf of the SCABGSN for different parameter values. We can see that they are all symmetrical.
Remark 6 Derivative of pdf (11) with respect to v is given by
Figure 3. Symmetric component alpha-beta-gamma skew normal pdf.
has six modes (black solid line),
has four modes (red dashed line),
has two modes (blue dot dash line).
(12)
Equation (12) has at most eleven zeros; therefore, pdf (11) has at most six modes.
Proposition 4 If
and
, then we have
where
Sampling algorithm
To generate a random number x from $ABGSN(\alpha, \beta, \gamma)$, we adopt the acceptance-rejection method with the following steps:
1) Generate random number v from
.
2) Generate random number u from the Uniform (0, 1).
3) If
accept x and deliver
, otherwise go back to step (1) and continue the process.
Where
,
and
are pdf of
and
, respectively.
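The acceptance-rejection idea can be sketched in code as follows. This sketch deliberately swaps the proposal: instead of the proposal distribution of step 1, it draws candidates from a normal distribution with an inflated scale, and it bounds the density ratio numerically on a grid with a small safety margin. The form of the target density is assumed to be the natural extension of the ASN and ABSN densities, and `sample_abgsn` with its `scale` and `seed` arguments is a hypothetical helper.

```python
import math
import random


def phi(x):
    """Standard normal pdf."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def abgsn_pdf(x, alpha, beta, gamma):
    """Assumed ABGSN density (the functional form is an assumption of this sketch)."""
    c = (2.0 + alpha ** 2 + 6.0 * alpha * beta + 15.0 * beta ** 2
         + 30.0 * alpha * gamma + 210.0 * beta * gamma + 945.0 * gamma ** 2)
    p = alpha * x + beta * x ** 3 + gamma * x ** 5
    return ((1.0 - p) ** 2 + 1.0) / c * phi(x)


def sample_abgsn(n, alpha, beta, gamma, scale=3.0, seed=0):
    """Draw n values by acceptance-rejection with a N(0, scale^2) proposal."""
    rng = random.Random(seed)

    def proposal_pdf(v):
        return phi(v / scale) / scale

    # Approximate bound on target/proposal from a grid on [-15, 15], inflated by 5%.
    bound = 1.05 * max(
        abgsn_pdf(-15.0 + 0.01 * i, alpha, beta, gamma) / proposal_pdf(-15.0 + 0.01 * i)
        for i in range(3001)
    )
    out = []
    while len(out) < n:
        v = rng.gauss(0.0, scale)   # candidate from the proposal
        u = rng.random()            # uniform draw on [0, 1)
        if u * bound * proposal_pdf(v) <= abgsn_pdf(v, alpha, beta, gamma):
            out.append(v)           # accept the candidate
    return out


draws = sample_abgsn(20000, 1.0, -0.5, 0.2)
sample_mean = sum(draws) / len(draws)
print(sample_mean)
```

Under the assumed density the theoretical mean is $-2(\alpha + 3\beta + 15\gamma)$ divided by the normalizing constant, which gives a simple sanity check on the draws.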
3. Parameter Estimation
In this section, maximum likelihood estimation and a random simulation are considered.
3.1. Maximum Likelihood Estimation
Definition 3 Let
, then
is the location and scale extension of X with the pdf
(13)
where
, we denote
.
Let
be a random sample of size n drawn from
, then the log-likelihood function (LLF) for
is given by
(14)
where
for
. Differentiating Equation (14) above partially with respect to the parameters
, the following likelihood equations are obtained
The solutions of the above system of likelihood equations give the MLEs for
. For interval estimation and hypothesis testing, the Fisher information matrix is required. Under the usual regularity conditions, the corresponding Fisher information matrix can be expressed as
,
for
, the detailed expression of FI is shown in the appendix.
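For numerical maximization, only the log-likelihood itself needs to be coded. The sketch below evaluates it for the location-scale model, assuming that pdf (13) equals $f((y - \mu)/\sigma)/\sigma$ with $f$ the assumed form of pdf (5). The name `abgsn_loglik` is hypothetical, and the optimizer itself is left out.

```python
import math


def abgsn_loglik(data, mu, sigma, alpha, beta, gamma):
    """Log-likelihood of the location-scale model under the assumed density."""
    if sigma <= 0.0:
        raise ValueError("sigma must be positive")
    # Assumed normalizing constant of the standardized density.
    c = (2.0 + alpha ** 2 + 6.0 * alpha * beta + 15.0 * beta ** 2
         + 30.0 * alpha * gamma + 210.0 * beta * gamma + 945.0 * gamma ** 2)
    total = 0.0
    for y in data:
        z = (y - mu) / sigma                      # standardized observation
        p = alpha * z + beta * z ** 3 + gamma * z ** 5
        total += math.log((1.0 - p) ** 2 + 1.0) - 0.5 * z * z
    n = len(data)
    return total - n * (math.log(c) + math.log(sigma) + 0.5 * math.log(2.0 * math.pi))


example = [-1.0, 0.0, 2.0]
print(abgsn_loglik(example, 0.5, 1.5, 0.3, -0.1, 0.05))
```

With all three shape parameters set to zero, the function reduces to the ordinary normal log-likelihood, which is a convenient correctness check before passing it to an optimizer.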
3.2. Simulation Study
In this subsection, a random simulation is conducted. The sample sizes and true values of the parameters considered are
,
,
and
, while the location and the scale parameters are set to
and
, the number of replications is 1000. The specific steps are as follows
1) Generate 1000 samples (simulation with N = 1000 repetitions) of size n from
.
2) Compute the maximum likelihood estimates for each of the 1000 samples.
3) Compute the mean and mean squared error (MSE) of
over the 1000 estimates.
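The three steps above share a generic structure that can be sketched as follows. To keep the example self-contained, the ABGSN sampler and the MLE routine are replaced by stand-ins: data come from a normal distribution with mean 1 and the estimator is the sample mean. The function `monte_carlo_summary` and both stand-ins are hypothetical; in an actual study they would be replaced by the sampler and the likelihood maximization of this section.

```python
import random


def monte_carlo_summary(simulate, estimate, true_value, n, reps=1000, seed=0):
    """Repeat simulate -> estimate `reps` times; return (mean, MSE) of the estimates."""
    rng = random.Random(seed)
    estimates = [estimate(simulate(n, rng)) for _ in range(reps)]
    mean = sum(estimates) / reps
    mse = sum((e - true_value) ** 2 for e in estimates) / reps
    return mean, mse


def simulate_normal(n, rng):
    """Stand-in for step 1: a sample of size n from N(1, 1)."""
    return [rng.gauss(1.0, 1.0) for _ in range(n)]


def sample_mean(xs):
    """Stand-in for step 2: the sample mean as the estimator of the location."""
    return sum(xs) / len(xs)


small = monte_carlo_summary(simulate_normal, sample_mean, 1.0, n=25)
large = monte_carlo_summary(simulate_normal, sample_mean, 1.0, n=400)
print(small, large)
```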
Table 1 shows that the MSE of the MLE of each parameter decreases as n increases and appears to approach zero for large n, while the mean of the MLEs of each parameter approaches the true value.
4. Real Life Application
In this section, we apply the alpha-beta-gamma skew normal distribution to a real data set. The R package DEoptim is used to maximize the corresponding LLF, which gives the MLEs of the parameters together with the log-likelihood values, AIC and BIC of each distribution. We then compare the new distribution family with the normal distribution, SN, PN, ASN, BSN and ABSN.
We consider the velocities of 82 galaxies, a data set first described by Roeder [13]. Table 2 shows the MLEs, log-likelihood values, AIC and BIC of each distribution. Figure 4 plots the fitted densities of the three best-fitting distributions only.
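For reference, the two information criteria are simple functions of the maximized log-likelihood, the number of free parameters k and the sample size n. The sketch below uses the standard definitions AIC = 2k − 2 log L and BIC = k log n − 2 log L; the log-likelihood value in the usage line is a made-up placeholder, not a result from Table 2.

```python
import math


def aic(loglik, k):
    """Akaike information criterion for k free parameters."""
    return 2.0 * k - 2.0 * loglik


def bic(loglik, k, n):
    """Bayesian information criterion for k free parameters and n observations."""
    return k * math.log(n) - 2.0 * loglik


# Placeholder log-likelihood; k = 5 assumes location, scale and three shape parameters.
hypothetical_loglik = -100.0
print(aic(hypothetical_loglik, 5), bic(hypothetical_loglik, 5, 82))
```

Smaller values of either criterion indicate a better trade-off between fit and model complexity, which is how the comparison in Table 2 is read.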
Based on the sample data, the Kolmogorov-Smirnov test is carried out for the seven distributions. The test statistic is $D_n = \sup_x |F_n(x) - F(x)|$, where $F_n(x)$ and $F(x)$ are the empirical distribution function of the sample data and the given distribution function, respectively. The test results are shown in Table 3.
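The Kolmogorov-Smirnov statistic, the largest absolute gap between the empirical and the hypothesized distribution functions, can be computed exactly from the sorted sample, since the supremum is attained at a jump of the empirical distribution function. The sketch below is a minimal version with hypothetical function names; it takes any cdf as a callable and does not compute the P value.

```python
import math


def ks_statistic(sample, cdf):
    """Kolmogorov-Smirnov statistic: largest gap between the empirical cdf and `cdf`."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The empirical cdf jumps from (i - 1)/n to i/n at x; check both sides.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def normal_cdf(x):
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


print(ks_statistic([-0.4, 0.1, 0.9, 1.7], normal_cdf))
```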
Figure 4. Plots of observed and expected densities of some distributions for the velocity of 82 galaxy samples.
Table 1. Mean and MSE of the MLEs of the unknown parameters for the
.
Where
,
.
Table 2. MLEs, log likelihood value, AIC and BIC comparison table.
Table 3. Kolmogorov-Smirnov test for seven distributions.
From Table 2 and Table 3, we conclude that the ABGSN has the largest log-likelihood value and the smallest AIC and BIC. Its Kolmogorov-Smirnov statistic is the smallest and its P value is the largest and exceeds 0.05, so we do not reject the null hypothesis that the data follow the ABGSN. The ABGSN is therefore more suitable for fitting this data set than the other distributions, which Figure 4 also confirms.
5. Conclusion
This paper introduced a new class of skew normal distributions which has at most six modes, and discussed some of its properties. The admissible intervals of the skewness and kurtosis coefficients for the alpha-beta-gamma skew normal distribution are wider than those of the skew normal distribution, the alpha skew normal distribution and the alpha-beta skew normal distribution. In the random simulation, the MSE of the MLE of each parameter decreases as n increases. The numerical results of fitting a real-life data set show that the ABGSN provides a better fit than the other known distributions considered. The distribution studied in this paper can be extended further. For example, combining ABGSN with the classical skew normal distribution yields a generalized alpha-beta-gamma skew normal distribution, which contains ABGSN as a special case.
Funding
This project is supported by the Natural Science Foundation of Chongqing (Grant No. cstc2020jcyj-msxmX0232 and cstc2019jcyj-msxmX0386).
Appendix
A1. Proof of Proposition 1
where
A2. Proof of Proposition 2
When
, we have
then
A3. Proof of Proposition 3
where
A4. Proof of Theorem 1
1) The density function of Y is obtained by
then
2) For
, we have
3) The density function of T is derived by
A5. FI Matrix Elements of
where