The Alpha-Beta-Gamma Skew Normal Distribution and Its Application

Zhengyuan Wei; Tiankui Peng; Xiaoya Zhou

doi:10.4236/ojs.2020.106060

Open Journal of Statistics > Vol.10 No.6, December 2020

Zhengyuan Wei, Tiankui Peng^*, Xiaoya Zhou
College of Science, Chongqing University of Technology, Chongqing, China.

Abstract

In this paper, a new class of skew multimodal distributions that is more flexible than the alpha skew normal distribution and the alpha-beta skew normal distribution is proposed; several important distributions arise as its special cases. The statistical properties of the new distribution are studied in detail, and its moment generating function, skewness coefficient, kurtosis coefficient, Fisher information matrix and maximum likelihood estimators are derived. Moreover, a random simulation study is carried out to test the performance of the estimators; the simulation results show that, as the sample size increases, the mean of the maximum likelihood estimators tends to the true value. In the analysis of a real data set, the new distribution family provides a better fit than other known skew distributions.

Keywords

Skew Distribution, Multimodal Distribution, Alpha-Beta-Gamma Skew Normal Distribution, Fisher Information Matrix

Share and Cite:

Wei, Z. , Peng, T. and Zhou, X. (2020) The Alpha-Beta-Gamma Skew Normal Distribution and Its Application. Open Journal of Statistics, 10, 1057-1071. doi: 10.4236/ojs.2020.106060.

1. Introduction

Over the years, the family of skew-symmetric distributions has received considerable attention from many researchers. Azzalini [1] first proposed the skew normal distribution (SN) with the density

$f (z) = 2 Φ (α z) ϕ (z), {(z, α)}^{T} \in ℝ^{2},$ (1)

where $ϕ (\cdot)$ and $Φ (\cdot)$ are the probability density function (pdf) and cumulative distribution function (cdf) of the standard normal distribution, respectively. Balakrishnan [2] and Gupta [3] studied a generalization of the skew normal distribution (GSN) and further discussed its properties. Yadegari et al. [4] introduced a generalization of the Balakrishnan skew normal distribution that includes GSN as a special case. An epsilon skew normal distribution and a new class of skew-Cauchy distributions are studied in [5] and [6], respectively. Another skew normal distribution, named the power normal distribution (PN), was proposed in [7]; it addresses the estimation of the skewness parameter of SN when the sample size is not very large.

Recently, Elal-Olivero [8] proposed a new form of skew distribution with bimodal behavior named alpha skew normal distribution (ASN) with the density

$f (z; α) = \frac{{(1 - α z)}^{2} + 1}{α^{2} + 2} ϕ (z), {(z, α)}^{T} \in ℝ^{2},$ (2)

where $α$ controls the skewness and kurtosis; the density in Equation (2) has at most two modes. The alpha-skew-Laplace distribution (ASL) and a generalization of the alpha skew normal distribution (GASN) are studied in [9] and [10], respectively. Moreover, Chakraborty et al. [11] proposed the Balakrishnan alpha skew normal distribution and discussed a generalized bimodal normal distribution, denoted by BN(n), with the density

$f (z) = \frac{z^{n}}{M} ϕ (z), z \in ℝ,$ (3)

where n is a positive even integer and M is the normalizing constant. Furthermore, Shafiei et al. [12] proposed a new class of skew normal distributions, called the alpha-beta skew normal distribution (ABSN), that is more flexible than SN and ASN, with the density

$f (z; α, β) = \frac{{(1 - α z - β z^{3})}^{2} + 1}{6 α β + 15 β^{2} + α^{2} + 2} ϕ (z), {(z, α, β)}^{T} \in ℝ^{3},$ (4)

where $α$ and $β$ control the skewness and kurtosis; the density in Equation (4) has at most four modes.

The motivations for considering this new family of distributions are as follows. Firstly, the new distribution family contains some classical distributions, such as the normal distribution, ASN and ABSN. Secondly, the admissible intervals for the skewness and kurtosis coefficients are wider than those of ASN and ABSN. Lastly, in the analysis of a real data set, the new distribution family fits better than some other known skew distributions. The remainder of the paper is organized as follows. In Section 2, we propose the new class of skew normal distributions and discuss its properties. In Section 3, we provide the maximum likelihood estimates (MLEs) of the parameters, and the performance of the estimates is verified by random simulation. A real application of the new distribution family is considered in Section 4. Section 5 presents some conclusions. Lastly, some proofs and the elements of the Fisher information (FI) matrix are given in the appendix.

2. Alpha-Beta-Gamma Skew Normal Distribution

In this section, we define a class of skew distributions that allow the fitting of multimodal data sets.

Definition 1 A random variable X follows the alpha-beta-gamma skew normal distribution, denoted by $ABGSN (α, β, γ)$ , if it has the pdf

$f (x; α, β, γ) = \frac{{(1 - α x - β x^{3} - γ x^{5})}^{2} + 1}{C} ϕ (x), {(x, α, β, γ)}^{T} \in ℝ^{4},$ (5)

where $C = 945 γ^{2} + 210 γ β + 30 α γ + 6 α β + 15 β^{2} + α^{2} + 2$ .
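As a quick numerical illustration of Definition 1, the following minimal sketch evaluates the pdf (5) and checks by a trapezoidal rule that it integrates to one. The function names `abgsn_const`, `abgsn_pdf` and `trapezoid`, the integration grid, and the parameter values are illustrative assumptions, not part of the original paper.

```python
import numpy as np

def abgsn_const(alpha, beta, gamma):
    """Normalizing constant C from Definition 1."""
    return (945 * gamma**2 + 210 * gamma * beta + 30 * alpha * gamma
            + 6 * alpha * beta + 15 * beta**2 + alpha**2 + 2)

def abgsn_pdf(x, alpha, beta, gamma):
    """Density (5): ((1 - a x - b x^3 - g x^5)^2 + 1) * phi(x) / C."""
    x = np.asarray(x, dtype=float)
    phi = np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi)
    poly = alpha * x + beta * x**3 + gamma * x**5
    return ((1.0 - poly)**2 + 1.0) * phi / abgsn_const(alpha, beta, gamma)

def trapezoid(y, x):
    """Plain trapezoidal rule on a grid (hypothetical helper)."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

# The grid [-12, 12] is an assumption: the tails beyond it are negligible
# for moderate parameter values such as those used here.
grid = np.linspace(-12.0, 12.0, 200001)
total = trapezoid(abgsn_pdf(grid, 0.4, -0.3, 0.1), grid)
print(total)
```

Setting all three parameters to zero gives C = 2 and recovers the standard normal density, in line with the special cases listed in Remark 2.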

Figure 1 and Figure 2 show the pdf of the $ABGSN (α, β, γ)$ for different parameter values. As can be seen, the proposed density is very general: the parameters $α$ , $β$ and $γ$ have a substantial impact on the skewness and on the number of extreme points of the distribution.

Remark 1 The derivative of pdf (5) with respect to x is given by

$\begin{array}{l} \frac{\partial f (x; α, β, γ)}{\partial x} \\ = \frac{ϕ (x)}{C} [- γ^{2} x^{11} + (10 γ^{2} - 2 β γ) x^{9} + (16 β γ - β^{2} - 2 α γ) x^{7} + 2 γ x^{6} \\ + (12 α γ + 6 β^{2} - 2 α β) x^{5} + (2 β - 10 γ) x^{4} + (8 α β - α^{2}) x^{3} \\ + (2 α - 6 β) x^{2} + (2 α^{2} - 2) x - 2 α], \end{array}$ (6)

Figure 1. Alpha-beta-gamma skew normal pdf. $ABGSN (20, - 10,0.8)$ has six modes (black solid line), $ABGSN (4.5, - 8, - 10)$ has two modes (red dashed line).

Figure 2. Alpha-beta-gamma skew normal pdf. $ABGSN (0.4, - 0.3,0.1)$ has three modes (black solid line), $ABGSN (- 0.4,2, - 0.2)$ has five modes (red dashed line).

The polynomial factor in Equation (6) has degree eleven, so Equation (6) has at most eleven real zeros; therefore the pdf has at most six modes, see Figure 1 and Figure 2.
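The mode count can be explored numerically. The sketch below codes the degree-eleven polynomial of Equation (6) and counts the points where it changes sign from positive to negative on a grid. The function names, the grid range [-8, 8] and the grid resolution are assumptions made for illustration; a grid-based count can miss modes that lie outside the range or are closer together than the grid step.

```python
import numpy as np

def norm_const(a, b, g):
    return 945*g**2 + 210*g*b + 30*a*g + 6*a*b + 15*b**2 + a**2 + 2

def pdf(x, a, b, g):
    x = np.asarray(x, dtype=float)
    phi = np.exp(-0.5 * x**2) / np.sqrt(2 * np.pi)
    return ((1 - a*x - b*x**3 - g*x**5)**2 + 1) * phi / norm_const(a, b, g)

def derivative_poly(a, b, g):
    """Coefficients of the bracket in Equation (6), highest power (x^11) first."""
    return np.array([
        -g**2, 0.0, 10*g**2 - 2*b*g, 0.0,
        16*b*g - b**2 - 2*a*g, 2*g,
        12*a*g + 6*b**2 - 2*a*b, 2*b - 10*g,
        8*a*b - a**2, 2*a - 6*b, 2*a**2 - 2, -2*a], dtype=float)

def pdf_derivative(x, a, b, g):
    """Equation (6): phi(x) / C times the polynomial above."""
    x = np.asarray(x, dtype=float)
    phi = np.exp(-0.5 * x**2) / np.sqrt(2 * np.pi)
    return phi / norm_const(a, b, g) * np.polyval(derivative_poly(a, b, g), x)

def count_modes(a, b, g, lo=-8.0, hi=8.0, num=160001):
    """Count + to - sign changes of the derivative on a grid (local maxima)."""
    x = np.linspace(lo, hi, num)
    s = np.sign(np.polyval(derivative_poly(a, b, g), x))
    s = s[s != 0]  # ignore grid points that hit a zero exactly
    return int(np.sum((s[:-1] > 0) & (s[1:] < 0)))

# Parameter sets taken from the captions of Figure 1 and Figure 2.
figure_params = [(20, -10, 0.8), (4.5, -8, -10), (0.4, -0.3, 0.1), (-0.4, 2, -0.2)]
mode_counts = [count_modes(*p) for p in figure_params]
print(mode_counts)
```

Only the sign of the polynomial is used for counting, since the factor $ϕ (x) / C$ is strictly positive; this avoids underflow of $ϕ (x)$ in the tails.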

Proposition 1 The cdf of $X ~ ABGSN (α, β, γ)$ is

$\begin{array}{l} F_{X} (x; α, β, γ) \\ = [- γ^{2} x^{9} - (2 β γ + 9 γ^{2}) x^{7} - (β^{2} + 2 α γ + 63 γ^{2} + 14 β γ) x^{5} + 2 γ x^{4} \\ - (5 β^{2} + 10 α γ + 315 γ^{2} + 2 α β + 70 β γ) x^{3} + (2 β + 8 γ) x^{2} \\ - (C - 2) x + 2 α + 4 β + 16 γ] ϕ (x) / C + Φ (x), (x, α, β, γ) \in ℝ^{4} . \end{array}$
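The closed form in Proposition 1 can be checked against direct numerical integration of the pdf (5). The following is a minimal sketch under stated assumptions: the names `abgsn_cdf` and `numeric_cdf`, the lower integration limit of -12, and the test points are illustrative choices.

```python
import math
import numpy as np

def norm_const(a, b, g):
    return 945*g**2 + 210*g*b + 30*a*g + 6*a*b + 15*b**2 + a**2 + 2

def abgsn_cdf(x, a, b, g):
    """Closed-form cdf of Proposition 1 for a scalar x."""
    c = norm_const(a, b, g)
    phi = math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)
    Phi = 0.5 * (1 + math.erf(x / math.sqrt(2)))
    poly = (-g**2 * x**9 - (2*b*g + 9*g**2) * x**7
            - (b**2 + 2*a*g + 63*g**2 + 14*b*g) * x**5 + 2*g * x**4
            - (5*b**2 + 10*a*g + 315*g**2 + 2*a*b + 70*b*g) * x**3
            + (2*b + 8*g) * x**2 - (c - 2) * x + 2*a + 4*b + 16*g)
    return poly * phi / c + Phi

def numeric_cdf(x, a, b, g, lo=-12.0, num=200001):
    """Trapezoidal integral of the pdf (5) from lo to x."""
    t = np.linspace(lo, x, num)
    dens = ((1 - a*t - b*t**3 - g*t**5)**2 + 1) \
        * np.exp(-0.5 * t**2) / np.sqrt(2 * np.pi) / norm_const(a, b, g)
    return float(np.sum(0.5 * (dens[1:] + dens[:-1]) * np.diff(t)))

points = [-1.5, 0.0, 0.7, 2.5]
closed = [abgsn_cdf(x, 0.4, -0.3, 0.1) for x in points]
numeric = [numeric_cdf(x, 0.4, -0.3, 0.1) for x in points]
print(list(zip(closed, numeric)))
```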

Remark 2 Some properties of the $ABGSN (α, β, γ)$ are as follows

1) $ABGSN (0,0,0)$ , $ABGSN (α, β,0)$ and $ABGSN (α,0,0)$ degenerate to $N (0,1)$ , $ABSN (α, β)$ and $ASN (α)$ , respectively.

2) If $α = 0$ , pdf (5) reduces to

$f (x; β, γ) = \frac{{(1 - β x^{3} - γ x^{5})}^{2} + 1}{945 γ^{2} + 210 γ β + 15 β^{2} + 2} ϕ (x), {(x, β, γ)}^{T} \in ℝ^{3},$ (7)

in the sequel, we name Equation (7) the beta-gamma skew normal distribution, denoted by $BGSN (β, γ)$ .

3) If $β = 0$ , pdf (5) reduces to

$f (x; α, γ) = \frac{{(1 - α x - γ x^{5})}^{2} + 1}{945 γ^{2} + 30 α γ + α^{2} + 2} ϕ (x), {(x, α, γ)}^{T} \in ℝ^{3},$ (8)

in the sequel, we name Equation (8) the alpha-gamma skew normal distribution, denoted by $AGSN (α, γ)$ .

4) If $α = γ = 0$ , pdf (5) reduces to

$f (x; β) = \frac{{(1 - β x^{3})}^{2} + 1}{15 β^{2} + 2} ϕ (x), {(x, β)}^{T} \in ℝ^{2},$ (9)

Equation (9) is known as the beta skew normal distribution and denoted by $BSN (β)$ , see Shafiei [12].

5) If $α = β = 0$ , pdf (5) reduces to

$f (x; γ) = \frac{{(1 - γ x^{5})}^{2} + 1}{945 γ^{2} + 2} ϕ (x), {(x, γ)}^{T} \in ℝ^{2},$ (10)

in the sequel, we name Equation (10) the gamma skew normal distribution, denoted by $GSN (γ)$ .

6) If $X ~ ABGSN (α, β, γ)$ , then as $α \to \pm \infty$ , $β \to \pm \infty$ or $γ \to \pm \infty$ (with the other two parameters held fixed), the distribution of X tends to BN(2), BN(6) and BN(10), respectively.

7) If $X ~ ABGSN (α, β, γ)$ , then $- X ~ ABGSN (- α, - β, - γ)$ .

Proposition 2 Let $X ~ ABGSN (α, β, γ)$ , then for $k \in ℕ$ , we have

$\begin{array}{l} E [X^{2 k}] = \frac{1}{C} [2 h_{2 k} + α^{2} h_{2 k + 2} + (2 α γ + β^{2}) h_{2 k + 6} \\ + γ^{2} h_{2 k + 10} + 2 α β h_{2 k + 4} + 2 β γ h_{2 k + 8}], \end{array}$

$E [X^{2 k - 1}] = - \frac{1}{C} (2 α h_{2 k} + 2 β h_{2 k + 2} + 2 γ h_{2 k + 4}),$

where $h_{2 k + i} = 1 \times 3 \times \dots \times (2 k + i - 1)$ , $i = 0, 2, 4, 6, \dots$ .
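The moment formulas of Proposition 2 translate directly into code. The sketch below implements them with $h_{m} = 1 \times 3 \times \dots \times (m - 1)$ and compares the first four raw moments with numerical integration. The function names, the grid and the parameter values are assumptions chosen for illustration.

```python
import numpy as np

def norm_const(a, b, g):
    return 945*g**2 + 210*g*b + 30*a*g + 6*a*b + 15*b**2 + a**2 + 2

def h(m):
    """1 * 3 * ... * (m - 1) for even m >= 0, i.e. E[Y^m] for Y ~ N(0,1)."""
    out = 1
    for j in range(1, m, 2):
        out *= j
    return out

def raw_moment(r, a, b, g):
    """E[X^r] from Proposition 2 (r = 2k is even, r = 2k - 1 is odd)."""
    c = norm_const(a, b, g)
    if r % 2 == 0:
        return (2*h(r) + a**2*h(r + 2) + (2*a*g + b**2)*h(r + 6)
                + g**2*h(r + 10) + 2*a*b*h(r + 4) + 2*b*g*h(r + 8)) / c
    return -(2*a*h(r + 1) + 2*b*h(r + 3) + 2*g*h(r + 5)) / c

def numeric_moment(r, a, b, g):
    """Trapezoidal approximation of the integral of x^r f(x)."""
    x = np.linspace(-14.0, 14.0, 280001)
    dens = ((1 - a*x - b*x**3 - g*x**5)**2 + 1) \
        * np.exp(-0.5 * x**2) / np.sqrt(2 * np.pi) / norm_const(a, b, g)
    y = x**r * dens
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

params = (0.4, -0.3, 0.1)
closed = [raw_moment(r, *params) for r in range(1, 5)]
numeric = [numeric_moment(r, *params) for r in range(1, 5)]
print(closed)
print(numeric)
```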

Remark 3 In particular, we have

$E [X] = - \frac{1}{C} (2 α + 6 β + 30 γ),$

$\begin{array}{l} V a r [X] = \frac{1}{C} [(2 + 3 α^{2} + 210 α γ + 105 β^{2} + 10395 γ^{2} + 30 α β + 1890 β γ) \\ - {(2 α + 6 β + 30 γ)}^{2} / C], \end{array}$

Optimizing $E [X]$ and $V a r [X]$ numerically with respect to $α, β, γ$ in R, we obtain the bounds

$E [X] \in (- 0.71,0.71), V a r [X] \in (0.81,14.07) .$

Remark 4 Let $ξ_{1}$ and $ξ_{2}$ stand for its skewness and kurtosis coefficients, then we have

$\begin{matrix} ξ_{1} = \frac{E [X^{3}] - 3 E [X] E [X^{2}] + 2 E^{3} [X]}{{(E [X^{2}] - E^{2} [X])}^{3 / 2}} \\ = [- 16 {(α + 3 β + 15 γ)}^{3} + 12 C (α^{3} + 120 β^{3} + 61425 γ^{3} + 95 α^{2} γ \\ + 14 α^{2} β + 75 α β^{2} + 2835 β^{2} γ + 5775 α γ^{2} + 23730 β γ^{2} \\ + 1200 α β γ - 2 β - 20 γ)] / {[(2 + 3 α^{2} + 210 α γ + 105 β^{2} \\ + 10395 γ^{2} + 30 α β + 1890 β γ) C - {(2 α + 6 β + 30 γ)}^{2}]}^{3 / 2}, \end{matrix}$

$\begin{matrix} ξ_{2} = \frac{E [X^{4}] - 4 E [X] E [X^{3}] + 6 E^{2} [X] E [X^{2}] - 3 E^{4} [X]}{{(E [X^{2}] - E^{2} [X])}^{2}} - 3 \\ = - 3 + [- 48 {(α + 3 β + 15 γ)}^{4} + C^{3} (15 α^{2} + 945 β^{2} + 135135 γ^{2} + 210 α β \\ + 1890 α γ + 20790 β γ + 6) + 24 (α + 3 β + 15 γ) C (α^{3} + 165 β^{3} \\ + 89775 γ^{3} + 125 α^{2} γ + 17 α^{2} β + 105 α β^{2} + 4095 β^{2} γ + 9555 α γ^{2} \\ + 35385 β γ^{2} + 1830 α β γ - 2 α - 14 β - 110 γ)] / {[(2 + 3 α^{2} + 210 α γ \\ + 105 β^{2} + 10395 γ^{2} + 30 α β + 1890 β γ) C - {(2 α + 6 β + 30 γ)}^{2}]}^{2}, \end{matrix}$

and numerical optimization in R with the package DEoptim gives the bounds

$ξ_{1} \in (- 1.29,1.29), ξ_{2} \in (- 1.88,5.60) .$

Note that the admissible intervals for the skewness and the kurtosis of ABGSN are wider than the corresponding intervals of ASN, which are (−0.81, 0.81) and (−1.30, 0.75), and of ABSN, which are (−1.19, 1.19) and (−1.77, 3.30).
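Rather than coding the long closed forms of $ξ_{1}$ and $ξ_{2}$ , a short sketch can compute both coefficients from the raw moments of Proposition 2 using the first lines of the two displays in Remark 4. The function names are illustrative assumptions; the paper's own bounds were obtained in R with DEoptim, which this sketch does not reproduce.

```python
def norm_const(a, b, g):
    return 945*g**2 + 210*g*b + 30*a*g + 6*a*b + 15*b**2 + a**2 + 2

def h(m):
    """1 * 3 * ... * (m - 1) for even m >= 0."""
    out = 1
    for j in range(1, m, 2):
        out *= j
    return out

def raw_moment(r, a, b, g):
    """E[X^r] from Proposition 2."""
    c = norm_const(a, b, g)
    if r % 2 == 0:
        return (2*h(r) + a**2*h(r + 2) + (2*a*g + b**2)*h(r + 6)
                + g**2*h(r + 10) + 2*a*b*h(r + 4) + 2*b*g*h(r + 8)) / c
    return -(2*a*h(r + 1) + 2*b*h(r + 3) + 2*g*h(r + 5)) / c

def skewness_kurtosis(a, b, g):
    """Skewness and excess kurtosis from the raw moments (Remark 4, first lines)."""
    m1, m2, m3, m4 = (raw_moment(r, a, b, g) for r in (1, 2, 3, 4))
    var = m2 - m1**2
    skew = (m3 - 3*m1*m2 + 2*m1**3) / var**1.5
    kurt = (m4 - 4*m1*m3 + 6*m1**2*m2 - 3*m1**4) / var**2 - 3
    return skew, kurt

xi1, xi2 = skewness_kurtosis(0.4, -0.3, 0.1)
print(xi1, xi2)
```

A search over $(α, β, γ)$ with any global optimizer applied to `skewness_kurtosis` would be one way to explore the admissible intervals quoted above.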

Proposition 3 Let $X ~ ABGSN (α, β, γ)$ , then the MGF of X is

$M_{X} (t; α, β, γ) = e^{\frac{t^{2}}{2}} [\frac{t^{2} S_{1} - t S_{2}}{C} + 1], {(t, α, β, γ)}^{T} \in ℝ^{4},$

where

$\begin{matrix} S_{1} = γ^{2} (4725 + 3150 t^{2} + 630 t^{4} + 45 t^{6} + t^{8}) + β^{2} (45 + 15 t^{2} + t^{4}) \\ + 2 α β (6 + t^{2}) + 2 α γ (45 + 15 t^{2} + t^{4}) + 2 β γ (420 + 210 t^{2} + 28 t^{4} + t^{6}), \end{matrix}$

$S_{2} = α (2 - α t) + 2 β (3 + t^{2}) + 2 γ (15 + 10 t^{2} + t^{4}) .$
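The MGF in Proposition 3 can be checked numerically. The sketch below codes $S_{1}$ and $S_{2}$ as stated and compares the result with a trapezoidal approximation of $E [e^{t X}]$ . The function names, the grid and the values of t are assumptions made for illustration.

```python
import math
import numpy as np

def norm_const(a, b, g):
    return 945*g**2 + 210*g*b + 30*a*g + 6*a*b + 15*b**2 + a**2 + 2

def mgf_closed(t, a, b, g):
    """Closed-form MGF of Proposition 3."""
    s1 = (g**2 * (4725 + 3150*t**2 + 630*t**4 + 45*t**6 + t**8)
          + b**2 * (45 + 15*t**2 + t**4)
          + 2*a*b * (6 + t**2)
          + 2*a*g * (45 + 15*t**2 + t**4)
          + 2*b*g * (420 + 210*t**2 + 28*t**4 + t**6))
    s2 = a*(2 - a*t) + 2*b*(3 + t**2) + 2*g*(15 + 10*t**2 + t**4)
    return math.exp(t**2 / 2) * ((t**2 * s1 - t * s2) / norm_const(a, b, g) + 1)

def mgf_numeric(t, a, b, g):
    """Trapezoidal approximation of the integral of exp(t x) f(x)."""
    x = np.linspace(-14.0, 14.0, 280001)
    dens = ((1 - a*x - b*x**3 - g*x**5)**2 + 1) \
        * np.exp(-0.5 * x**2) / np.sqrt(2 * np.pi) / norm_const(a, b, g)
    y = np.exp(t * x) * dens
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

ts = [-0.5, 0.3, 1.0]
closed = [mgf_closed(t, 0.4, -0.3, 0.1) for t in ts]
numeric = [mgf_numeric(t, 0.4, -0.3, 0.1) for t in ts]
print(list(zip(closed, numeric)))
```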

Theorem 1 If $X ~ ABGSN (α, β, γ)$ , then

1) The random variable $Y = X^{2}$ has the following density function

$f_{Y} (y; α, β, γ) = f_{χ^{2} (1)} (y) \frac{2 + y {(α + β y + γ y^{2})}^{2}}{C}, y > 0,$

where ${(α, β, γ)}^{T} \in ℝ^{3}$ and $f_{χ^{2} (1)}$ is the pdf of the chi-square distribution with 1 degree of freedom.

2) The density function of $Z = | X |$ is given by

$f_{Z} (z; α, β, γ) = f_{N H} (z) \frac{2 + {(α z + β z^{3} + γ z^{5})}^{2}}{C}, z > 0, {(α, β, γ)}^{T} \in ℝ^{3},$

where $f_{N H} (\cdot)$ is the density function of standard Half-Normal distribution.

3) Let $T = X | X > 0$ , then its density function is

$f_{T} (t; α, β, γ) = \frac{[2 + 2 {(1 - α t - β t^{3} - γ t^{5})}^{2}] e^{- \frac{t^{2}}{2}}}{\sqrt{2 π} C - 4 (α + 2 β + 8 γ)}, t > 0, {(α, β, γ)}^{T} \in ℝ^{3} .$
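As a sanity check on parts 2) and 3) of Theorem 1, the following sketch verifies numerically that the densities of $Z = | X |$ and $T = X | X > 0$ integrate to one over the positive half-line. The function names, the upper limit of 14 and the parameter values are illustrative assumptions.

```python
import numpy as np

def norm_const(a, b, g):
    return 945*g**2 + 210*g*b + 30*a*g + 6*a*b + 15*b**2 + a**2 + 2

def trapezoid(y, x):
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def pdf_abs(z, a, b, g):
    """Density of Z = |X| (Theorem 1, part 2), for z > 0."""
    half_normal = 2.0 * np.exp(-0.5 * z**2) / np.sqrt(2 * np.pi)
    p = a*z + b*z**3 + g*z**5
    return half_normal * (2 + p**2) / norm_const(a, b, g)

def pdf_truncated(t, a, b, g):
    """Density of T = X | X > 0 (Theorem 1, part 3), for t > 0."""
    q = 1 - a*t - b*t**3 - g*t**5
    denom = np.sqrt(2 * np.pi) * norm_const(a, b, g) - 4 * (a + 2*b + 8*g)
    return (2 + 2 * q**2) * np.exp(-0.5 * t**2) / denom

grid = np.linspace(0.0, 14.0, 140001)
mass_abs = trapezoid(pdf_abs(grid, 0.4, -0.3, 0.1), grid)
mass_truncated = trapezoid(pdf_truncated(grid, 0.4, -0.3, 0.1), grid)
print(mass_abs, mass_truncated)
```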

Remark 5 The pdf (5) can be separated into symmetric and asymmetric parts as follows

$\begin{matrix} f (x; α, β, γ) = \frac{2 + {(α x + β x^{3} + γ x^{5})}^{2}}{C} ϕ (x) - \frac{2 (α x + β x^{3} + γ x^{5})}{C} ϕ (x) \\ = f_{1} (x; α, β, γ) - f_{2} (x; α, β, γ), \end{matrix}$

where $f_{1} (x; α, β, γ) = \frac{2 + {(α x + β x^{3} + γ x^{5})}^{2}}{C} ϕ (x)$ , with $C = 945 γ^{2} + 210 γ β + 30 α γ + 6 α β + 15 β^{2} + α^{2} + 2$ , is the symmetric part and $f_{2} (x; α, β, γ)$ is the asymmetric part.

Definition 2 If a random variable V has density function

$f_{1} (v; α, β, γ) = \frac{2 + {(α v + β v^{3} + γ v^{5})}^{2}}{C} ϕ (v), {(v, α, β, γ)}^{T} \in ℝ^{4},$ (11)

then we say V follows symmetric component alpha-beta-gamma skew normal distribution, denoted by $SCABGSN (α, β, γ)$ .

Figure 3 shows the pdf of the SCABGSN for different parameter values. We can see that they are all symmetrical.

Figure 3. Symmetric component alpha-beta-gamma skew normal pdf. $SCABGSN (10, - 5,0.4)$ has six modes (black solid line), $SCABGSN (- 4,2, - 0.2)$ has four modes (red dashed line), $SCABGSN (5,0.5,0.5)$ has two modes (blue dot dash line).

Remark 6 The derivative of pdf (11) with respect to v is given by

$\begin{array}{l} \frac{\partial f_{1} (v; α, β, γ)}{\partial v} \\ = \frac{ϕ (v)}{C} [- γ^{2} v^{11} - (2 β γ - 10 γ^{2}) v^{9} + (16 β γ - β^{2} - 2 α γ) v^{7} \\ + (12 α γ + 6 β^{2} - 2 α β) v^{5} + (8 α β - α^{2}) v^{3} + (2 α^{2} - 2) v], \end{array}$ (12)

Since pdf (11) is an even function of v, its derivative is odd and the bracket in Equation (12) contains only odd powers of v. Equation (12) has at most eleven real zeros; therefore, pdf (11) has at most six modes.

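Proposition 4 If $X ~ ABGSN (α, β, γ)$ and $V ~ SCABGSN (α, β, γ)$ , then we have

$M_{V} (t; α, β, γ) = M_{X} (t; α, β, γ) + e^{\frac{t^{2}}{2}} \frac{t (S_{2} + α^{2} t)}{C}, {(t, α, β, γ)}^{T} \in ℝ^{4},$

where

$S_{2} = α (2 - α t) + 2 β (3 + t^{2}) + 2 γ (15 + 10 t^{2} + t^{4}) .$

The added term is the contribution $\int_{- \infty}^{\infty} e^{t x} f_{2} (x; α, β, γ) d x$ of the asymmetric part in Remark 5, which equals $e^{\frac{t^{2}}{2}} t [2 α + 2 β (3 + t^{2}) + 2 γ (15 + 10 t^{2} + t^{4})] / C$ ; the term $α^{2} t$ compensates for the $- α^{2} t$ contained in $S_{2}$ .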

Sampling algorithm

To generate a random number x from $ABGSN (α, β, γ)$ , we adopt the acceptance-rejection method with the following steps

1) Generate random number v from $SCABGSN (α, β, γ)$ .

2) Generate random number u from the Uniform (0, 1).

3) If $u < \frac{f (v; α, β, γ)}{Δ f_{1} (v; α, β, γ)}$ , accept v and deliver $x = v$ ; otherwise go back to step 1) and continue the process.

Here $Δ = \sup_{x} \frac{f (x; α, β, γ)}{f_{1} (x; α, β, γ)} = 1 + \frac{\sqrt{2}}{2}$ , and $f (\cdot; α, β, γ)$ and $f_{1} (\cdot; α, β, γ)$ are the pdfs of $ABGSN (α, β, γ)$ and $SCABGSN (α, β, γ)$ , respectively.
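The three steps above can be sketched as follows. The paper does not say how step 1) draws from $SCABGSN (α, β, γ)$ , so this sketch substitutes an approximate inverse-cdf draw on a fine grid; that choice, the grid range [-10, 10], the function names and the seed are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def norm_const(a, b, g):
    return 945*g**2 + 210*g*b + 30*a*g + 6*a*b + 15*b**2 + a**2 + 2

def sample_scabgsn(n, a, b, g, rng, lo=-10.0, hi=10.0, num=20001):
    """Step 1 (assumed method): approximate inverse-cdf sampling on a grid."""
    grid = np.linspace(lo, hi, num)
    p = a*grid + b*grid**3 + g*grid**5
    dens = (2 + p**2) * np.exp(-0.5 * grid**2)  # unnormalized pdf (11)
    steps = 0.5 * (dens[1:] + dens[:-1]) * np.diff(grid)
    cdf = np.concatenate(([0.0], np.cumsum(steps)))
    cdf /= cdf[-1]
    return np.interp(rng.random(n), cdf, grid)

def sample_abgsn(n, a, b, g, seed=0):
    """Acceptance-rejection sampler with envelope Delta * f_1."""
    rng = np.random.default_rng(seed)
    delta = 1 + np.sqrt(2) / 2
    out = np.empty(0)
    while out.size < n:
        v = sample_scabgsn(2 * n, a, b, g, rng)   # step 1
        u = rng.random(v.size)                    # step 2
        p = a*v + b*v**3 + g*v**5
        # f / (Delta * f_1); phi and C cancel in the ratio.
        ratio = ((1 - p)**2 + 1) / (delta * (2 + p**2))
        out = np.concatenate((out, v[u < ratio])) # step 3
    return out[:n]

alpha, beta, gamma = 0.4, -0.3, 0.1
draws = sample_abgsn(40000, alpha, beta, gamma, seed=1)
theoretical_mean = -(2*alpha + 6*beta + 30*gamma) / norm_const(alpha, beta, gamma)
print(draws.mean(), theoretical_mean)
```

Because $Δ = 1 + \sqrt{2} / 2$ , roughly $1 / Δ \approx 59\%$ of the proposals are accepted on average, which is why each pass proposes 2n candidates.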

3. Parameter Estimation

In this section, maximum likelihood estimation and a random simulation are considered.

3.1. Maximum Likelihood Estimation

Definition 3 Let $X ~ ABGSN (α, β, γ)$ , then $Y = μ + σ X$ is the location and scale extension of X with the pdf

$\begin{array}{l} f (y; μ, σ, α, β, γ) \\ = \frac{{[1 - α \frac{y - μ}{σ} - β {(\frac{y - μ}{σ})}^{3} - γ {(\frac{y - μ}{σ})}^{5}]}^{2} + 1}{σ (945 γ^{2} + 210 γ β + 30 α γ + 6 α β + 15 β^{2} + α^{2} + 2)} ϕ (\frac{y - μ}{σ}), σ > 0, \end{array}$ (13)

where ${(y, μ, α, β, γ)}^{T} \in ℝ^{5}$ ; we write $Y ~ ABGSN (μ, σ, α, β, γ)$ .

Let $y_{1}, y_{2}, \dots, y_{n}$ be a random sample of size n drawn from $ABGSN (μ, σ, α, β, γ)$ , then the log-likelihood function (LLF) for $(θ_{1}, θ_{2}, θ_{3}, θ_{4}, θ_{5}) = (μ, σ, α, β, γ)$ is given by

$\begin{matrix} l = \ln L (μ, σ, α, β, γ; y) \\ = \sum_{i = 1}^{n} \ln [{(1 - α m_{i} - β m_{i}^{3} - γ m_{i}^{5})}^{2} + 1] - n \ln σ - \frac{n}{2} \ln 2 π \\ - n \ln (945 γ^{2} + 210 γ β + 30 α γ + 6 α β + 15 β^{2} + α^{2} + 2) - \frac{1}{2} \sum_{i = 1}^{n} m_{i}^{2}, \end{matrix}$ (14)

where $m_{i} = (y_{i} - μ) / σ$ for $1 \leq i \leq n$ . Differentiating Equation (14) above partially with respect to the parameters $μ, σ, α, β, γ$ , the following likelihood equations are obtained

$\frac{\partial l}{\partial μ} = \frac{1}{σ} [\sum_{i = 1}^{n} \frac{2 (1 - α m_{i} - β m_{i}^{3} - γ m_{i}^{5}) (α + 3 β m_{i}^{2} + 5 γ m_{i}^{4})}{{(1 - α m_{i} - β m_{i}^{3} - γ m_{i}^{5})}^{2} + 1} + \sum_{i = 1}^{n} m_{i}] = 0,$

$\frac{\partial l}{\partial σ} = \frac{1}{σ} [\sum_{i = 1}^{n} \frac{2 (1 - α m_{i} - β m_{i}^{3} - γ m_{i}^{5}) (α m_{i} + 3 β m_{i}^{3} + 5 γ m_{i}^{5})}{{(1 - α m_{i} - β m_{i}^{3} - γ m_{i}^{5})}^{2} + 1} + \sum_{i = 1}^{n} m_{i}^{2} - n] = 0,$

$\frac{\partial l}{\partial α} = - \sum_{i = 1}^{n} \frac{2 m_{i} (1 - α m_{i} - β m_{i}^{3} - γ m_{i}^{5})}{{(1 - α m_{i} - β m_{i}^{3} - γ m_{i}^{5})}^{2} + 1} - n \frac{2 α + 6 β + 30 γ}{C} = 0,$

$\frac{\partial l}{\partial β} = - \sum_{i = 1}^{n} \frac{2 m_{i}^{3} (1 - α m_{i} - β m_{i}^{3} - γ m_{i}^{5})}{{(1 - α m_{i} - β m_{i}^{3} - γ m_{i}^{5})}^{2} + 1} - n \frac{210 γ + 30 β + 6 α}{C} = 0,$

$\frac{\partial l}{\partial γ} = - \sum_{i = 1}^{n} \frac{2 m_{i}^{5} (1 - α m_{i} - β m_{i}^{3} - γ m_{i}^{5})}{{(1 - α m_{i} - β m_{i}^{3} - γ m_{i}^{5})}^{2} + 1} - n \frac{1890 γ + 210 β + 30 α}{C} = 0,$

the solution of the above system of likelihood equations gives the MLEs of $(θ_{1}, θ_{2}, θ_{3}, θ_{4}, θ_{5})$ . Interval estimation and hypothesis testing require the Fisher information matrix. Under the usual regularity conditions, it can be expressed as $I = {(I_{i j})}_{5 \times 5}$ , with

$I_{i j} = - E [\frac{\partial^{2} l}{\partial θ_{i} \partial θ_{j}}]$ for $i, j = 1, \dots, 5$ ; the detailed expressions of the FI elements are given in the appendix.
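The log-likelihood (14) and the five likelihood equations can be coded directly and cross-checked against finite differences. The sketch below does this on a small deterministic data vector; the function names, the data and the evaluation point are illustrative assumptions.

```python
import numpy as np

def norm_const(a, b, g):
    return 945*g**2 + 210*g*b + 30*a*g + 6*a*b + 15*b**2 + a**2 + 2

def loglik(theta, y):
    """Log-likelihood (14) for theta = (mu, sigma, alpha, beta, gamma)."""
    mu, sigma, a, b, g = theta
    m = (y - mu) / sigma
    q = 1 - a*m - b*m**3 - g*m**5
    n = y.size
    return (np.sum(np.log(q**2 + 1)) - n*np.log(sigma) - 0.5*n*np.log(2*np.pi)
            - n*np.log(norm_const(a, b, g)) - 0.5*np.sum(m**2))

def score(theta, y):
    """Partial derivatives of (14), in the order mu, sigma, alpha, beta, gamma."""
    mu, sigma, a, b, g = theta
    m = (y - mu) / sigma
    q = 1 - a*m - b*m**3 - g*m**5
    r = 2*q / (q**2 + 1)
    dp = a + 3*b*m**2 + 5*g*m**4
    n, c = y.size, norm_const(a, b, g)
    return np.array([
        (np.sum(r*dp) + np.sum(m)) / sigma,
        (np.sum(r*dp*m) + np.sum(m**2) - n) / sigma,
        -np.sum(r*m) - n*(2*a + 6*b + 30*g) / c,
        -np.sum(r*m**3) - n*(210*g + 30*b + 6*a) / c,
        -np.sum(r*m**5) - n*(1890*g + 210*b + 30*a) / c])

def numeric_score(theta, y, h=1e-6):
    """Central finite differences of the log-likelihood."""
    grad = np.zeros(5)
    for j in range(5):
        e = np.zeros(5)
        e[j] = h
        grad[j] = (loglik(theta + e, y) - loglik(theta - e, y)) / (2*h)
    return grad

y = np.linspace(-2.0, 3.0, 25)                 # toy data, not from the paper
theta = np.array([0.1, 1.3, 0.4, -0.3, 0.1])
print(score(theta, y))
print(numeric_score(theta, y))
```

The MLEs are the roots of `score`; in practice one would maximize `loglik` numerically, since the system has no closed-form solution.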

3.2. Simulation Study

In this subsection, a random simulation is conducted. The sample sizes and true parameter values considered are $n = 50, 80, 100, 200$ , $α = - 2, 2$ , $β = - 1, 1$ and $γ = - 0.5, 0.5$ , while the location and scale parameters are set to $μ = 0$ and $σ = 1$ ; the number of replications is 1000. The specific steps are as follows

1) Generate 1000 samples (simulation with N= 1000 repetitions) of size n from $ABGSN (μ, σ, α, β, γ)$ .

2) Compute the maximum likelihood estimates for each of the 1000 samples.

3) Compute the mean and mean squared error (MSE) of $(\hat{μ}, \hat{σ}, \hat{α}, \hat{β}, \hat{γ})$ over the 1000 estimates.

From Table 1, the MSE of the MLEs of each parameter decreases as n increases and tends to zero as n tends to infinity, while the mean of the MLEs of each parameter tends to the true value as n increases.
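A scaled-down version of the three simulation steps might look like the sketch below. It is not the authors' code: the samples are drawn by an approximate inverse-cdf method on a grid, the likelihood is maximized with SciPy's Nelder-Mead routine started at the true values (a shortcut that a real study would avoid), and only 20 replications are used instead of 1000. All function names and settings are assumptions.

```python
import numpy as np
from scipy.optimize import minimize

def norm_const(a, b, g):
    return 945*g**2 + 210*g*b + 30*a*g + 6*a*b + 15*b**2 + a**2 + 2

def neg_loglik(theta, y):
    """Negative of the log-likelihood (14)."""
    mu, sigma, a, b, g = theta
    if sigma <= 0:
        return np.inf
    m = (y - mu) / sigma
    q = 1 - a*m - b*m**3 - g*m**5
    n = y.size
    return -(np.sum(np.log(q**2 + 1)) - n*np.log(sigma) - 0.5*n*np.log(2*np.pi)
             - n*np.log(norm_const(a, b, g)) - 0.5*np.sum(m**2))

def draw(n, a, b, g, rng):
    """Approximate inverse-cdf draw from ABGSN(a, b, g) with mu = 0, sigma = 1."""
    grid = np.linspace(-10.0, 10.0, 20001)
    dens = ((1 - a*grid - b*grid**3 - g*grid**5)**2 + 1) * np.exp(-0.5*grid**2)
    cdf = np.cumsum(dens)
    cdf /= cdf[-1]
    return np.interp(rng.random(n), cdf, grid)

def simulate(n, true, reps, seed=0):
    """Steps 1) to 3): draw samples, fit each, return mean and MSE of the MLEs."""
    rng = np.random.default_rng(seed)
    estimates = np.array([
        minimize(neg_loglik, true, args=(draw(n, *true[2:], rng),),
                 method="Nelder-Mead").x
        for _ in range(reps)])
    return estimates.mean(axis=0), ((estimates - true)**2).mean(axis=0)

true = np.array([0.0, 1.0, 2.0, 1.0, 0.5])   # (mu, sigma, alpha, beta, gamma)
mean_est, mse_est = simulate(n=50, true=true, reps=20, seed=1)
print(mean_est)
print(mse_est)
```

With so few replications the output is only indicative; reproducing Table 1 would require the full design of sample sizes, parameter combinations and 1000 replications described above.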

4. Real Life Application

In this section, we examine the application of the alpha-beta-gamma skew normal distribution to a real data set. The R package DEoptim is used to maximize the corresponding LLF, and the MLEs of the parameters, log-likelihood values, AIC and BIC of the distributions are obtained. We then compare the new distribution family with the normal distribution, SN, PN, ASN, BSN and ABSN.

We consider the velocities of 82 galaxies, a data set first described by Roeder [13]. Table 2 shows the MLEs, log-likelihood values, AIC and BIC of each distribution. Figure 4 shows the fitted densities of the three best-fitting distributions only.

Based on the sample data, the Kolmogorov-Smirnov test is carried out for the seven distributions. The test statistic is $D = \max | {\hat{F}}_{n} (x) - F (x) |$ , where ${\hat{F}}_{n} (x)$ and $F (x)$ are the empirical distribution function of the sample data and the given distribution function, respectively. The test results are shown in Table 3.
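For the ABGSN case, the statistic D can be evaluated with the closed-form cdf of Proposition 1 applied to the standardized observations. The sketch below is a minimal illustration; the function names and the toy data are assumptions, and the galaxy data and fitted parameters of the paper are not reproduced here.

```python
import math
import numpy as np

def norm_const(a, b, g):
    return 945*g**2 + 210*g*b + 30*a*g + 6*a*b + 15*b**2 + a**2 + 2

def abgsn_cdf(x, a, b, g):
    """Closed-form cdf of Proposition 1 for a scalar x."""
    c = norm_const(a, b, g)
    phi = math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)
    Phi = 0.5 * (1 + math.erf(x / math.sqrt(2)))
    poly = (-g**2 * x**9 - (2*b*g + 9*g**2) * x**7
            - (b**2 + 2*a*g + 63*g**2 + 14*b*g) * x**5 + 2*g * x**4
            - (5*b**2 + 10*a*g + 315*g**2 + 2*a*b + 70*b*g) * x**3
            + (2*b + 8*g) * x**2 - (c - 2) * x + 2*a + 4*b + 16*g)
    return poly * phi / c + Phi

def ks_statistic(y, mu, sigma, a, b, g):
    """D = max |F_n(x) - F(x)| for the location-scale ABGSN cdf."""
    y = np.sort(np.asarray(y, dtype=float))
    n = y.size
    F = np.array([abgsn_cdf((v - mu) / sigma, a, b, g) for v in y])
    i = np.arange(1, n + 1)
    # The supremum is attained just before or at one of the jump points.
    return float(max(np.max(i / n - F), np.max(F - (i - 1) / n)))

toy = [-1.2, 0.1, 0.4, 2.0]   # toy data, not from the paper
d_toy = ks_statistic(toy, 0.0, 1.0, 0.4, -0.3, 0.1)
print(d_toy)
```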

Figure 4. Plots of observed and expected densities of some distributions for the velocity of 82 galaxy samples.

Table 1. Mean and MSE of the MLEs of the unknown parameters for the $ABGSN (μ, σ, α, β, γ)$ .

Where $MSE ({\hat{θ}}_{i}) = E {[{\hat{θ}}_{i} - θ_{i}]}^{2}$ , $i = 1, \dots,5$ .

Table 2. MLEs, log likelihood value, AIC and BIC comparison table.

Table 3. Kolmogorov-Smirnov test for seven distributions.

From Table 2 and Table 3, we can conclude that the log-likelihood value of the ABGSN is the largest and its AIC and BIC are the smallest, while its Kolmogorov-Smirnov statistic is the smallest and its P value is the largest and greater than 0.05. Therefore, we do not reject the null hypothesis, and the ABGSN is more suitable for fitting this data set than the other distributions; Figure 4 also confirms our findings.

5. Conclusion

This paper introduced a new class of skew normal distributions with at most six modes and discussed some of its properties. The admissible intervals of the skewness and kurtosis coefficients for the alpha-beta-gamma skew normal distribution are wider than those of the skew normal distribution, the alpha skew normal distribution and the alpha-beta skew normal distribution. In the random simulation, the MSE of the MLEs of each parameter decreases as n increases. The numerical results of fitting a real-life data set show that the ABGSN provides a better fit than the other known distributions considered. The distribution studied in this paper can be extended further. For example, combining ABGSN with the classical skew normal distribution gives the generalized alpha-beta-gamma skew normal distribution, which contains ABGSN as a special case.

Funding

This project is supported by the Natural Science Foundation of Chongqing (Grant No. cstc2020jcyj-msxmX0232 and cstc2019jcyj-msxmX0386).

Appendix

A1. Proof of Proposition 1

$\begin{matrix} F (x) = \int_{- \infty}^{x} f (t) d t \\ = \frac{1}{C} \int_{- \infty}^{x} [γ^{2} t^{10} + 2 β γ t^{8} + (2 α γ + β^{2}) t^{6} - 2 γ t^{5} + 2 α β t^{4} \\ - 2 β t^{3} + α^{2} t^{2} - 2 α t + 2] ϕ (t) d t \\ = \frac{1}{C} [γ^{2} A_{10} + 2 β γ A_{8} + (2 α γ + β^{2}) A_{6} - 2 γ A_{5} + 2 α β A_{4} \\ - 2 β A_{3} + α^{2} A_{2} - 2 α A_{1} + 2 Φ (x)], \end{matrix}$

where

$A_{i} = \int_{- \infty}^{x} t^{i} ϕ (t) d t = - x^{i - 1} ϕ (x) + (i - 1) \int_{- \infty}^{x} t^{i - 2} ϕ (t) d t = - x^{i - 1} ϕ (x) + (i - 1) A_{i - 2}, i = 2, 3, \dots ,$

with $A_{0} = Φ (x)$ and $A_{1} = - ϕ (x)$ .
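For reference, writing $A_{i} = \int_{- \infty}^{x} t^{i} ϕ (t) d t$ and unrolling the recursion from $A_{0} = Φ (x)$ and $A_{1} = - ϕ (x)$ gives the terms used in the proof. This worked expansion is added for illustration and is not part of the original text:

$\begin{array}{l} A_{2} = - x ϕ (x) + Φ (x), \\ A_{3} = - (x^{2} + 2) ϕ (x), \\ A_{4} = - (x^{3} + 3 x) ϕ (x) + 3 Φ (x), \\ A_{5} = - (x^{4} + 4 x^{2} + 8) ϕ (x), \\ A_{6} = - (x^{5} + 5 x^{3} + 15 x) ϕ (x) + 15 Φ (x), \\ A_{8} = - (x^{7} + 7 x^{5} + 35 x^{3} + 105 x) ϕ (x) + 105 Φ (x), \\ A_{10} = - (x^{9} + 9 x^{7} + 63 x^{5} + 315 x^{3} + 945 x) ϕ (x) + 945 Φ (x) . \end{array}$

Substituting these into the last line of the proof, the coefficients of $Φ (x)$ add up to $945 γ^{2} + 210 β γ + 15 (2 α γ + β^{2}) + 6 α β + α^{2} + 2 = C$ , and the coefficients of $ϕ (x)$ collect into the polynomial stated in Proposition 1.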

A2. Proof of Proposition 2

When $Y ~ N (0,1)$ , we have

$E [Y^{2 k}] = \prod_{j = 1}^{k} (2 j - 1), E [Y^{2 k - 1}] = 0, k = 1, 2, 3, \dots .$

then

$\begin{matrix} E [X^{2 k}] = \frac{1}{C} (2 E [Y^{2 k}] - 2 α E [Y^{2 k + 1}] + α^{2} E [Y^{2 k + 2}] + β^{2} E [Y^{2 k + 6}] \\ + γ^{2} E [Y^{2 k + 10}] - 2 β E [Y^{2 k + 3}] + 2 α β E [Y^{2 k + 4}] \\ - 2 γ E [Y^{2 k + 5}] + 2 α γ E [Y^{2 k + 6}] + 2 β γ E [Y^{2 k + 8}]) \\ = \frac{1}{C} (2 E [Y^{2 k}] + α^{2} E [Y^{2 k + 2}] + β^{2} E [Y^{2 k + 6}] + γ^{2} E [Y^{2 k + 10}] \\ + 2 α β E [Y^{2 k + 4}] + 2 α γ E [Y^{2 k + 6}] + 2 β γ E [Y^{2 k + 8}]), \end{matrix}$

$\begin{matrix} E [X^{2 k - 1}] = \frac{1}{C} (2 E [Y^{2 k - 1}] - 2 α E [Y^{2 k}] + α^{2} E [Y^{2 k + 1}] + β^{2} E [Y^{2 k + 5}] \\ + γ^{2} E [Y^{2 k + 9}] - 2 β E [Y^{2 k + 2}] + 2 α β E [Y^{2 k + 3}] \\ - 2 γ E [Y^{2 k + 4}] + 2 α γ E [Y^{2 k + 5}] + 2 β γ E [Y^{2 k + 7}]) \\ = \frac{1}{C} (- 2 α E [Y^{2 k}] - 2 β E [Y^{2 k + 2}] - 2 γ E [Y^{2 k + 4}]) . \end{matrix}$

A3. Proof of Proposition 3

$\begin{matrix} M_{X} (t; α, β, γ) = E [e^{t X}] \\ = \frac{1}{C} \int_{- \infty}^{\infty} e^{t x} (2 - 2 α x - 2 β x^{3} - 2 γ x^{5} + α^{2} x^{2} + (2 α γ + β^{2}) x^{6} \\ + γ^{2} x^{10} + 2 α β x^{4} + 2 β γ x^{8}) ϕ (x) d x \\ = \frac{1}{C} (2 D_{0} - 2 α D_{1} - 2 β D_{3} - 2 γ D_{5} + α^{2} D_{2} + β^{2} D_{6} + γ^{2} D_{10} \\ + 2 α β D_{4} + 2 α γ D_{6} + 2 β γ D_{8}), \end{matrix}$

where

$D_{0} = \int_{- \infty}^{\infty} e^{t x} ϕ (x) d x = e^{\frac{t^{2}}{2}},$

$D_{i} = \int_{- \infty}^{\infty} e^{t x} x^{i} ϕ (x) d x = (i - 1) D_{i - 2} + t D_{i - 1}, i = 1, 2, 3, \dots .$
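For reference, unrolling this recursion from $D_{0} = e^{\frac{t^{2}}{2}}$ gives the terms used in the proof. This worked expansion is added for illustration and is not part of the original text:

$\begin{array}{l} D_{1} = t e^{\frac{t^{2}}{2}}, \\ D_{2} = (t^{2} + 1) e^{\frac{t^{2}}{2}}, \\ D_{3} = (t^{3} + 3 t) e^{\frac{t^{2}}{2}}, \\ D_{4} = (t^{4} + 6 t^{2} + 3) e^{\frac{t^{2}}{2}}, \\ D_{5} = (t^{5} + 10 t^{3} + 15 t) e^{\frac{t^{2}}{2}}, \\ D_{6} = (t^{6} + 15 t^{4} + 45 t^{2} + 15) e^{\frac{t^{2}}{2}}, \\ D_{8} = (t^{8} + 28 t^{6} + 210 t^{4} + 420 t^{2} + 105) e^{\frac{t^{2}}{2}}, \\ D_{10} = (t^{10} + 45 t^{8} + 630 t^{6} + 3150 t^{4} + 4725 t^{2} + 945) e^{\frac{t^{2}}{2}} . \end{array}$

The constant terms of the even-indexed $D_{i}$ , weighted by their coefficients, add up to C and produce the "+ 1" in Proposition 3; the remaining even powers of t give $t^{2} S_{1}$ together with the $α^{2} t^{2}$ term, and the odd-indexed $D_{i}$ give the rest of $- t S_{2}$ .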

A4. Proof of Theorem 1

1) The distribution function of Y is

$F_{Y} (y; α, β, γ) = ℙ (Y \leq y) = F_{X} (\sqrt{y}; α, β, γ) - F_{X} (- \sqrt{y}; α, β, γ),$

then

$\begin{matrix} f_{Y} (y; α, β, γ) = \frac{1}{2 \sqrt{y}} [f_{X} (\sqrt{y}; α, β, γ) + f_{X} (- \sqrt{y}; α, β, γ)] \\ = \frac{ϕ (\sqrt{y})}{\sqrt{y}} \frac{2 + y {(α + β y + γ y^{2})}^{2}}{C} . \end{matrix}$

2) For $z > 0$ , we have

$\begin{matrix} f_{Z} (z; α, β, γ) = f_{X} (z) + f_{X} (- z) \\ = \frac{2 + {(1 + α z + β z^{3} + γ z^{5})}^{2} + {(1 - α z - β z^{3} - γ z^{5})}^{2}}{C} ϕ (z) \\ = 2 ϕ (z) \frac{2 + {(α z + β z^{3} + γ z^{5})}^{2}}{C} . \end{matrix}$

3) The density function of T is derived by

$\begin{matrix} f_{T} (t; α, β, γ) = \frac{f_{X} (t)}{1 - F_{X} (0)} \\ = \frac{2 \sqrt{2 π} C f_{X} (t)}{\sqrt{2 π} C - 4 (α + 2 β + 8 γ)} \\ = \frac{[2 + 2 {(1 - α t - β t^{3} - γ t^{5})}^{2}] e^{\frac{- t^{2}}{2}}}{\sqrt{2 π} C - 4 (α + 2 β + 8 γ)} . \end{matrix}$

A5. FI Matrix Elements of $A B G S N (μ, σ, α, β, γ)$

$I_{11} = n \frac{2 - α^{2} - 75 β^{2} - 8505 γ^{2} - 18 α β - 150 α γ - 780 β γ + 4 G_{0}}{σ^{2} C} .$

$I_{12} = I_{21} = n \frac{- 2 α + 6 β + 90 γ + 4 G_{1}}{σ^{2} C} .$

$I_{13} = I_{31} = n \frac{- 2 - 4 E_{1}}{σ C} .$

$I_{14} = I_{41} = n \frac{- 6 - 4 E_{3}}{σ C} .$

$I_{15} = I_{51} = n \frac{- 30 - 4 E_{5}}{σ C} .$

$I_{22} = n \frac{2 α^{2} - 330 β^{2} - 73710 γ^{2} - 36 α β - 660 α γ - 9660 β γ + 4 + 4 G_{2}}{σ^{2} C} .$

$I_{23} = I_{32} = n \frac{4 α + 24 β + 180 γ - 4 E_{2}}{σ C} .$

$I_{24} = I_{42} = n \frac{24 α + 180 β + 1680 γ - 4 E_{4}}{σ C} .$

$I_{25} = I_{52} = n \frac{180 α + 1680 β + 18900 γ - 4 E_{6}}{σ C} .$

$I_{33} = n \frac{4 C F_{2} - {(2 α + 6 β + 30 γ)}^{2}}{C^{2}} .$

$I_{34} = I_{43} = n \frac{4 C F_{4} - (2 α + 6 β + 30 γ) (6 α + 30 β + 210 γ)}{C^{2}} .$

$I_{35} = I_{53} = n \frac{4 C F_{6} - (2 α + 6 β + 30 γ) (30 α + 210 β + 1890 γ)}{C^{2}} .$

$I_{44} = n \frac{4 C F_{6} - {(6 α + 30 β + 210 γ)}^{2}}{C^{2}} .$

$I_{45} = I_{54} = n \frac{4 C F_{8} - (6 α + 30 β + 210 γ) (30 α + 210 β + 1890 γ)}{C^{2}} .$

$I_{55} = n \frac{4 C F_{10} - {(30 α + 210 β + 1890 γ)}^{2}}{C^{2}} .$

where

$E_{i} = E [X^{i} (α + 3 β X^{2} + 5 γ X^{4}) \frac{{(1 - α X - β X^{3} - γ X^{5})}^{2}}{{(1 - α X - β X^{3} - γ X^{5})}^{2} + 1}], i = 1, 2, 3, 4, 5, 6,$

$F_{i} = E [X^{i} \frac{{(1 - α X - β X^{3} - γ X^{5})}^{2}}{{(1 - α X - β X^{3} - γ X^{5})}^{2} + 1}], i = 2, 4, 6, 8, 10,$

$G_{i} = E [X^{i} {(α + 3 β X^{2} + 5 γ X^{4})}^{2} \frac{{(1 - α X - β X^{3} - γ X^{5})}^{2}}{{(1 - α X - β X^{3} - γ X^{5})}^{2} + 1}], i = 0, 1, 2,$

and all expectations are taken with respect to $X ~ N (0,1)$ .

Conflicts of Interest

The authors declare no conflicts of interest regarding the publication of this paper.

References

[1]	Azzalini, A. (1985) A Class of Distributions Which Includes the Normal Ones. Scandinavian Journal of Statistics, 12, 171-178.
[2]	Arnold, B., Beaver, R., Azzalini, A., Balakrishnan, N., Bhaumik, A., Dey, D., Cuadras, C. and Sarabia, J. (2002) Skewed Multivariate Models Related to Hidden Truncation and/or Selective Reporting. Test, 11, 7-54. https://doi.org/10.1007/BF02595728
[3]	Gupta, R.C. and Gupta, R.D. (2004) Generalized Skew Normal Model. Test, 13, 501-524. https://doi.org/10.1007/BF02595784
[4]	Yadegari, I., Gerami, A. and Khaledi, M.J. (2008) A Generalization of the Balakrishnan Skew-Normal Distribution. Statistics & Probability Letters, 78, 1165-1167. https://doi.org/10.1016/j.spl.2007.12.001
[5]	Mudholkar, G.S. and Hutson, A.D. (2000) The Epsilon-Skew-Normal Distribution for Analyzing Near-Normal Data. Journal of Statistical Planning and Inference, 83, 291-309. https://doi.org/10.1016/S0378-3758(99)00096-8
[6]	Behboodian, J., Jamalizadeh, A. and Balakrishnan, N. (2006) A New Class of Skew-Cauchy Distributions. Statistics & Probability Letters, 76, 1488-1493. https://doi.org/10.1016/j.spl.2006.03.008
[7]	Gupta, R.D. and Gupta, R.C. (2008) Analyzing Skewed Data by Power Normal Model. Test, 17, 197-210. https://doi.org/10.1007/s11749-006-0030-x
[8]	Elal-Olivero, D. (2010) Alpha-Skew-Normal Distribution. Proyecciones (Antofagasta), 29, 224-240. https://doi.org/10.4067/S0716-09172010000300006
[9]	Harandi, S.S. and Alamatsaz, M.H. (2013) Alpha-Skew-Laplace Distribution. Statistics & Probability Letters, 83, 774-782. https://doi.org/10.1016/j.spl.2012.11.024
[10]	Sharafi, M., Sajjadnia, Z. and Behboodian, J. (2017) A New Generalization of Alpha-Skew-Normal Distribution. Communications in Statistics—Theory and Methods, 46, 6098-6111. https://doi.org/10.1080/03610926.2015.1117639
[11]	Chakraborty, S., Jyoti Hazarika, P. and Shah, S. (2020) The Balakrishnan Alpha Skew Normal Distribution: Properties and Applications. Malaysian Journal of Science, 39, 71-91. https://doi.org/10.22452/mjs.vol39no2.5
[12]	Shafiei, S., Doostparast, M. and Jamalizadeh, A. (2016) The Alpha-Beta Skew Normal Distribution: Properties and Applications. Statistics, 50, 338-349.
[13]	Roeder, K. (1990) Density Estimation with Confidence Sets Exemplified by Super Clusters and Voids in the Galaxies. Journal of the American Statistical Association, 85, 617-624. https://doi.org/10.1080/01621459.1990.10474918
