New Probability Distributions in Astrophysics: XI. Left Truncation for the Topp-Leone Distribution
1. Introduction
A family of univariate J-shaped probability distributions was introduced by Topp & Leone in 1955 [1]; in the following we refer to it as T-L. Fifty years later, the moments of the T-L distribution were derived in terms of the Gauss hypergeometric function [2], and its skewness was analyzed numerically [3]. At the time of writing, the generalization of the T-L distribution is an active field of research. Among other approaches we cite: the introduction of two sides and a generalization [4], a new family of distributions called the Marshall-Olkin Topp Leone-G family [5], and a new trigonometric family of distributions that combines the families known as sine-G and Topp-Leone generated distributions [6]. This paper introduces in Section 2 the scale for the T-L distribution, which is originally defined in the interval $[0,1]$. Section 3 introduces a left truncation of the T-L distribution and Section 4 applies the derived results to the mass distribution for stars.
2. Topp-Leone Distribution with Scale
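Let Y be a random variable taking values y in the interval $[0,1]$. The Topp-Leone probability density function (PDF), in the following T-L, is
$$f(y;\beta) = \beta\,(2-2y)\,\left(2y-y^{2}\right)^{\beta-1}, \tag{1}$$
where $\beta$ is the shape parameter [1]. We now introduce the scale, b, with the change of variable $y = x/b$: the T-L PDF with scale defined in $[0,b]$ is
$$f(x;b,\beta) = \frac{\beta}{b}\left(2-\frac{2x}{b}\right)\left(\frac{2x}{b}-\frac{x^{2}}{b^{2}}\right)^{\beta-1}, \tag{2}$$
where $b>0$. The distribution function (DF) of the T-L with scale is
$$F(x;b,\beta) = \left(\frac{x\,(2b-x)}{b^{2}}\right)^{\beta}, \tag{3}$$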
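its average value or mean, $\mu$, is
$$\mu = b\left(1-\frac{\sqrt{\pi}\,\Gamma(\beta+1)}{2\,\Gamma\!\left(\beta+\frac{3}{2}\right)}\right), \tag{4}$$
where
$$\Gamma(z) = \int_{0}^{\infty} e^{-t}\,t^{z-1}\,dt \tag{5}$$
is the gamma function. Its variance, $\sigma^{2}$, is
$$\sigma^{2} = b^{2}\left(\frac{1}{\beta+1}-\frac{\pi\,\Gamma(\beta+1)^{2}}{4\,\Gamma\!\left(\beta+\frac{3}{2}\right)^{2}}\right), \tag{6}$$
and its standard deviation, std, is
$$\mathrm{std} = b\,\sqrt{\frac{1}{\beta+1}-\frac{\pi\,\Gamma(\beta+1)^{2}}{4\,\Gamma\!\left(\beta+\frac{3}{2}\right)^{2}}}. \tag{7}$$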
Its rth moment about the origin,
, is
(8)
where
is a regularized hypergeometric function [7] [8] [9] [10] . Its skewness is
(9)
where
(10)
(11)
Figure 1 shows the behaviour of the skewness as a function of the parameter
; the transition from positive to negative values is at
and [3] quotes
.
The kurtosis of the T-L has a complicated expression and we limit ourselves to a numerical display, see Figure 2; the minimum value is at
when
.
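The median, $x_{\mathrm{med}}$, is at
$$x_{\mathrm{med}} = b\left(1-\sqrt{1-2^{-1/\beta}}\right), \tag{12}$$
and the mode is at
$$x_{\mathrm{mode}} = b\left(1-\frac{1}{\sqrt{2\beta-1}}\right), \qquad \beta>1. \tag{13}$$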
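The closed-form quantities of this section can be cross-checked numerically. The following Python sketch assumes the standard T-L form with scale, $F(x)=(x(2b-x)/b^{2})^{\beta}$ on $[0,b]$, and a closed-form mean that follows from that DF. The function names and the midpoint integrator are illustrative choices, not part of the original analysis.

```python
import math

def tl_pdf(x, b, beta):
    """Assumed T-L PDF with scale b and shape beta on the open interval (0, b)."""
    if x <= 0.0 or x >= b:
        return 0.0  # endpoints excluded: the density is singular at 0 when beta < 1
    y = x / b
    return (beta / b) * (2.0 - 2.0 * y) * (2.0 * y - y * y) ** (beta - 1.0)

def tl_cdf(x, b, beta):
    """Assumed T-L DF with scale: (x (2b - x) / b^2)^beta, clipped to [0, 1]."""
    if x <= 0.0:
        return 0.0
    if x >= b:
        return 1.0
    return (x * (2.0 * b - x) / (b * b)) ** beta

def tl_mean(b, beta):
    """Closed-form mean obtained from the assumed DF via E[X] = int (1 - F) dx."""
    ratio = math.gamma(beta + 1.0) / math.gamma(beta + 1.5)
    return b * (1.0 - 0.5 * math.sqrt(math.pi) * ratio)

def tl_median(b, beta):
    """Solve F(x) = 1/2 for x."""
    return b * (1.0 - math.sqrt(1.0 - 0.5 ** (1.0 / beta)))

def midpoint_integral(f, lo, hi, steps=20000):
    """Plain midpoint rule; adequate for the smooth case beta >= 1."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (k + 0.5) * h) for k in range(steps))

if __name__ == "__main__":
    b, beta = 2.0, 2.0  # illustrative values only
    norm = midpoint_integral(lambda x: tl_pdf(x, b, beta), 0.0, b)
    mean_num = midpoint_integral(lambda x: x * tl_pdf(x, b, beta), 0.0, b)
    print(norm, mean_num, tl_mean(b, beta), tl_cdf(tl_median(b, beta), b, beta))
```

For $\beta<1$ the density diverges at the origin, so a quadrature adapted to the endpoint singularity would be a better choice than the midpoint rule used here.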
Figure 1. Skewness of the T-L distribution with scale as a function of
when
.
Figure 2. Kurtosis of the T-L distribution with scale as a function of
when
.
The random generation of the T-L variate X is given by
$$X = b\left(1-\sqrt{1-R^{1/\beta}}\right), \tag{14}$$
where R is the unit rectangular variate. The two parameters b and $\beta$ can be derived by the numerical solution of the following two equations, which arise from the maximum likelihood estimator (MLE),
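$$\frac{n}{\beta}+\sum_{i=1}^{n}\ln\!\left(\frac{x_i\,(2b-x_i)}{b^{2}}\right) = 0, \tag{15a}$$
$$\sum_{i=1}^{n}\left[\frac{1}{b-x_i}+\frac{2(\beta-1)}{2b-x_i}\right]-\frac{2n\beta}{b} = 0, \tag{15b}$$
where $x_i$ are the elements of the experimental sample with i varying between 1 and n.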
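The generation and estimation steps above can be sketched in Python. The sketch assumes the standard T-L DF with scale, $F(x)=(x(2b-x)/b^{2})^{\beta}$, which is inverted for the sampler. For the fit it uses a profile likelihood: for fixed b, setting the $\beta$-derivative of the log-likelihood to zero gives $\beta=-n/\sum_i \ln(x_i(2b-x_i)/b^{2})$ under that assumed form. The coarse grid plus golden-section search over b is a stand-in for whichever numerical solver was actually used, and all function names are hypothetical.

```python
import math
import random

def tl_sample(n, b, beta, rng):
    """Inverse-transform sampling: solve R = (x (2b - x) / b^2)^beta for x."""
    return [b * (1.0 - math.sqrt(1.0 - rng.random() ** (1.0 / beta)))
            for _ in range(n)]

def tl_profile_loglik(xs, b):
    """Log-likelihood maximized over beta for a fixed scale b > max(xs)."""
    n = len(xs)
    s = sum(math.log(x * (2.0 * b - x) / (b * b)) for x in xs)  # always negative
    beta = -n / s  # closed-form solution of d(log L)/d(beta) = 0
    ll = (n * math.log(beta / b)
          + sum(math.log(2.0 - 2.0 * x / b) for x in xs)
          + (beta - 1.0) * s)
    return ll, beta

def tl_fit(xs, grid=200, span=5.0):
    """Return (b, beta) maximizing the likelihood; assumes one peak near the best grid point."""
    xmax = max(xs)
    lo, hi = xmax * (1.0 + 1e-9), xmax * span
    bs = [lo + (hi - lo) * k / grid for k in range(grid + 1)]
    k_best = max(range(grid + 1), key=lambda k: tl_profile_loglik(xs, bs[k])[0])
    a, c = bs[max(k_best - 1, 0)], bs[min(k_best + 1, grid)]
    g = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(60):  # golden-section refinement of the bracket [a, c]
        x1, x2 = c - g * (c - a), a + g * (c - a)
        if tl_profile_loglik(xs, x1)[0] < tl_profile_loglik(xs, x2)[0]:
            a = x1
        else:
            c = x2
    b = 0.5 * (a + c)
    return b, tl_profile_loglik(xs, b)[1]

if __name__ == "__main__":
    rng = random.Random(12345)
    sample = tl_sample(3000, 2.0, 2.0, rng)  # illustrative parameters
    print(tl_fit(sample))
```

Profiling out $\beta$ reduces the problem to a one-dimensional search over b, which must stay above the largest observed value because the support is $[0,b]$.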
3. Truncated Topp-Leone Distribution with Scale
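Let X be a random variable defined in $[x_l,b]$; the left truncated two-parameter T-L DF, $F_T(x)$, is
$$F_T(x;x_l,b,\beta) = \frac{\left(x\,(2b-x)\right)^{\beta}-\left(x_l\,(2b-x_l)\right)^{\beta}}{b^{2\beta}-\left(x_l\,(2b-x_l)\right)^{\beta}}, \tag{16}$$
and its PDF, $f_T(x)$, is
$$f_T(x;x_l,b,\beta) = \frac{2\beta\,(b-x)\,\left(x\,(2b-x)\right)^{\beta-1}}{b^{2\beta}-\left(x_l\,(2b-x_l)\right)^{\beta}}. \tag{17}$$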
Its average value or mean,
, is
(18)
Its rth moment about the origin,
, is
(19)
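Its variance can be evaluated with the usual formula:
$$\mathrm{Var}(X) = E[X^{2}]-\left(E[X]\right)^{2}, \tag{20}$$
where $E[X]$ is the mean of Equation (18) and $E[X^{2}]$ is the moment of Equation (19) with $r=2$.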
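The random generation of the truncated T-L variate X is obtained by solving the following nonlinear equation in x:
$$\frac{\left(x\,(2b-x)\right)^{\beta}-\left(x_l\,(2b-x_l)\right)^{\beta}}{b^{2\beta}-\left(x_l\,(2b-x_l)\right)^{\beta}} = R, \tag{21}$$
where R is the unit rectangular variate. The three parameters $x_l$, b and $\beta$ can be obtained in the following way. Consider a sample $\mathcal{X}=x_1,x_2,\dots,x_n$ and let $x_{(1)}\geq x_{(2)}\geq\dots\geq x_{(n)}$ denote their order statistics, so that $x_{(1)}=\max(x_1,x_2,\dots,x_n)$, $x_{(n)}=\min(x_1,x_2,\dots,x_n)$. The first parameter $x_l$ is
$$x_l = x_{(n)}. \tag{22}$$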
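As an illustration of the generation step, the sketch below assumes that the truncated DF has the form $(F(x)-F(x_l))/(1-F(x_l))$, with $F$ the T-L DF with scale $(x(2b-x)/b^{2})^{\beta}$. Under that assumption the equation for x can be inverted analytically, so the sketch uses the inversion instead of a numerical root finder. The function names are hypothetical.

```python
import math
import random

def tl_cdf(x, b, beta):
    """Assumed T-L DF with scale on [0, b]."""
    if x <= 0.0:
        return 0.0
    if x >= b:
        return 1.0
    return (x * (2.0 * b - x) / (b * b)) ** beta

def ttl_cdf(x, xl, b, beta):
    """Assumed left-truncated DF on [xl, b]."""
    if x <= xl:
        return 0.0
    if x >= b:
        return 1.0
    f_l = tl_cdf(xl, b, beta)
    return (tl_cdf(x, b, beta) - f_l) / (1.0 - f_l)

def ttl_sample(n, xl, b, beta, rng):
    """Draw n variates by mapping R to u = F(xl) + R (1 - F(xl)) and inverting F."""
    f_l = tl_cdf(xl, b, beta)
    out = []
    for _ in range(n):
        u = f_l + rng.random() * (1.0 - f_l)
        out.append(b * (1.0 - math.sqrt(1.0 - u ** (1.0 / beta))))
    return out

def estimate_xl(xs):
    """First parameter: the smallest element of the sample."""
    return min(xs)

if __name__ == "__main__":
    rng = random.Random(1)
    sample = ttl_sample(20000, 0.5, 2.0, 1.5, rng)  # illustrative parameters
    frac = sum(1 for x in sample if x <= 1.0) / len(sample)
    print(estimate_xl(sample), frac, ttl_cdf(1.0, 0.5, 2.0, 1.5))
```

The printed empirical fraction below $x=1$ should be close to the theoretical truncated DF at the same point, which gives a quick consistency check of the sampler.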
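One method, the MLE, allows us to derive the two remaining parameters by maximizing the log-likelihood:
$$\ln L(b,\beta) = \sum_{i=1}^{n}\ln f_T(x_i;x_l,b,\beta), \tag{23}$$
where $L$ is the likelihood function and $f_T$ is the truncated PDF of Equation (17). The two parameters b and β are derived by the numerical solution of the following two equations,
$$\frac{\partial \ln L}{\partial b} = 0, \tag{24a}$$
$$\frac{\partial \ln L}{\partial \beta} = 0, \tag{24b}$$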
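where $x_i$ are the elements of the experimental sample with i varying between 1 and n. Another method is the method of moments, which derives β and b from the following two non-linear equations:
$$\mu(x_l,b,\beta) = \bar{x}, \tag{25a}$$
$$\sigma^{2}(x_l,b,\beta) = \mathrm{Var}, \tag{25b}$$
where $\mu$ and $\sigma^{2}$ are the theoretical mean and variance of Equations (18) and (20), and $\bar{x}$ and Var are, respectively, the average value and the variance of the experimental sample [11].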
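A minimal sketch of the MLE route for the truncated case is given below. It assumes the truncated PDF $f_T(x)=f(x)/(1-F(x_l))$ built from the standard T-L form with scale, fixes $x_l$ at the sample minimum, and maximizes the log-likelihood by nested golden-section searches. That search is a swapped-in technique, chosen to avoid external libraries, and it assumes a single maximum in b on the search interval. All names, the search intervals, and the test parameters are hypothetical.

```python
import math
import random

def golden_max(f, a, c, iters=60):
    """Golden-section search for the maximum of a unimodal f on [a, c]."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(iters):
        x1, x2 = c - g * (c - a), a + g * (c - a)
        if f(x1) < f(x2):
            a = x1
        else:
            c = x2
    return 0.5 * (a + c)

def ttl_loglik(xs, xl, b, beta):
    """Log-likelihood of the assumed left-truncated T-L PDF with scale."""
    n = len(xs)
    s1 = sum(math.log(2.0 - 2.0 * x / b) for x in xs)
    s2 = sum(math.log(x * (2.0 * b - x) / (b * b)) for x in xs)
    g_l = xl * (2.0 * b - xl) / (b * b)
    return (n * math.log(beta / b) + s1 + (beta - 1.0) * s2
            - n * math.log(1.0 - g_l ** beta))

def ttl_fit(xs, span=3.0, beta_max=50.0):
    """Return (xl, b, beta) with xl = min(xs) and (b, beta) from the MLE."""
    n, xl, xmax = len(xs), min(xs), max(xs)

    def best_beta(b):
        s2 = sum(math.log(x * (2.0 * b - x) / (b * b)) for x in xs)
        g_l = xl * (2.0 * b - xl) / (b * b)
        # Only the beta-dependent part of the log-likelihood is needed here.
        part = lambda be: n * math.log(be) + be * s2 - n * math.log(1.0 - g_l ** be)
        return golden_max(part, 1e-3, beta_max)

    profile = lambda b: ttl_loglik(xs, xl, b, best_beta(b))
    b = golden_max(profile, xmax * (1.0 + 1e-9), xmax * span, iters=40)
    return xl, b, best_beta(b)

def ttl_sample(n, xl, b, beta, rng):
    """Synthetic data for testing, by analytic inversion of the assumed DF."""
    f_l = (xl * (2.0 * b - xl) / (b * b)) ** beta
    return [b * (1.0 - math.sqrt(1.0 - (f_l + rng.random() * (1.0 - f_l)) ** (1.0 / beta)))
            for _ in range(n)]

if __name__ == "__main__":
    rng = random.Random(7)
    print(ttl_fit(ttl_sample(2000, 0.5, 2.0, 1.5, rng)))
```

For fixed b the $\beta$-dependent part of the log-likelihood is concave, which justifies the inner one-dimensional search. The outer search over b relies on the unimodality assumption stated above.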
4. Astrophysical Applications
This section reviews the adopted statistics; the lognormal distribution is also used here for the sake of comparison. The new results are applied to the initial mass function (IMF) for stars.
4.1. Statistics
The merit function $\chi^{2}$ is computed according to the formula:
$$\chi^{2} = \sum_{i=1}^{n}\frac{(T_i-O_i)^{2}}{T_i}, \tag{26}$$
where n is the number of bins, $T_i$ is the theoretical value, and $O_i$ is the experimental value represented by the frequencies. The theoretical frequency distribution is given by
$$T_i = N\,\Delta x_i\,p(x), \tag{27}$$
where N is the number of elements of the sample, $\Delta x_i$ is the magnitude of the size interval, and $p(x)$ is the PDF under examination. A reduced merit function $\chi_{red}^{2}$ is given by
$$\chi_{red}^{2} = \chi^{2}/NF, \tag{28}$$
where $NF=n-k$ is the number of degrees of freedom, n is the number of bins, and k is the number of parameters. The goodness of the fit can be expressed by the probability Q, see equation 15.2.12 in [11], which involves the number of degrees of freedom and $\chi^{2}$. According to [11] p. 658, the fit “may be acceptable” if $Q>0.001$. The Akaike information criterion (AIC), see [12], is defined by
$$AIC = 2k-2\ln(L), \tag{29}$$
where L is the likelihood function and k the number of free parameters in the model. We assume a Gaussian distribution for the errors. The likelihood function $L$ can then be derived from the $\chi^{2}$ statistic
$$L \propto \exp\left(-\frac{\chi^{2}}{2}\right),$$
where $\chi^{2}$ has been computed by Equation (26), see [13] [14]. Now the AIC becomes:
$$AIC = 2k+\chi^{2}. \tag{30}$$
The Kolmogorov-Smirnov test (K-S), see [15] [16] [17], does not require the data to be binned. The K-S test, as implemented by the FORTRAN subroutine KSONE in [11], finds the maximum distance, D, between the theoretical and the astronomical DF, as well as the significance level $P_{KS}$; see formulas 14.3.5 and 14.3.9 in [11]. If $P_{KS}\geq 0.1$, then the goodness of the fit is believable.
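The statistics of this subsection can be sketched in a few Python functions. The sketch assumes $\chi^{2}=\sum (T_i-O_i)^{2}/T_i$, $NF=n-k$ and $AIC=2k+\chi^{2}$. For the K-S significance it uses the standard asymptotic alternating series with the effective argument $(\sqrt{N}+0.12+0.11/\sqrt{N})\,D$, which we take to correspond to the formulas cited from [11]. This is not the KSONE routine itself, and the function names are hypothetical.

```python
import math

def chi2_stats(observed, theoretical, k):
    """Return (chi2, reduced chi2, AIC) for binned frequencies; k = number of parameters."""
    chi2 = sum((t - o) ** 2 / t for o, t in zip(observed, theoretical))
    nf = len(observed) - k  # degrees of freedom
    return chi2, chi2 / nf, 2 * k + chi2

def ks_distance(xs, cdf):
    """Maximum distance D between the empirical DF of xs and a theoretical DF."""
    xs = sorted(xs)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical DF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, abs(f - i / n), abs((i + 1) / n - f))
    return d

def ks_significance(d, n, terms=100):
    """Asymptotic significance level of the one-sample K-S statistic."""
    lam = (math.sqrt(n) + 0.12 + 0.11 / math.sqrt(n)) * d
    if lam < 0.3:
        return 1.0  # the series converges slowly here; the value is 1 to within about 1e-5
    s = 0.0
    for j in range(1, terms + 1):
        s += 2.0 * (-1) ** (j - 1) * math.exp(-2.0 * j * j * lam * lam)
    return min(max(s, 0.0), 1.0)

if __name__ == "__main__":
    print(chi2_stats([10, 20, 30], [12.0, 18.0, 30.0], k=1))
    d = ks_distance([0.25, 0.5, 0.75], lambda x: x)  # uniform DF on [0, 1]
    print(d, ks_significance(d, 3))
```

In practice `cdf` would be one of the theoretical DFs under test and `theoretical` would hold the expected bin frequencies $T_i$.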
4.2. Lognormal Distribution
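Let X be a random variable defined in $(0,\infty)$; the lognormal PDF, following [18] or formula (14.2) in [19], is
$$PDF(x;m,\sigma) = \frac{1}{x\,\sigma\sqrt{2\pi}}\exp\left(-\frac{\left(\ln(x/m)\right)^{2}}{2\sigma^{2}}\right), \tag{31}$$
where m is the median and $\sigma$ the shape parameter. Its CDF is
$$CDF(x;m,\sigma) = \frac{1}{2}+\frac{1}{2}\,\mathrm{erf}\left(\frac{\ln(x/m)}{\sqrt{2}\,\sigma}\right), \tag{32}$$
where erf(x) is the error function, defined as
$$\mathrm{erf}(x) = \frac{2}{\sqrt{\pi}}\int_{0}^{x}e^{-t^{2}}\,dt, \tag{33}$$
see [10]. Its average value or mean, $E(X)$, is
$$E(X) = m\,e^{\sigma^{2}/2}, \tag{34}$$
its variance, $Var(X)$, is
$$Var(X) = m^{2}\,e^{\sigma^{2}}\left(e^{\sigma^{2}}-1\right), \tag{35}$$
and its second moment about the origin, $E(X^{2})$, is
$$E(X^{2}) = m^{2}\,e^{2\sigma^{2}}. \tag{36}$$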
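The lognormal quantities used for comparison are easy to evaluate with the Python standard library. The sketch below assumes the usual parameterization by the median m and the shape parameter $\sigma$; the function names are illustrative.

```python
import math

def lognormal_pdf(x, m, sigma):
    """Lognormal PDF with median m and shape sigma."""
    if x <= 0.0:
        return 0.0
    z = math.log(x / m) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))

def lognormal_cdf(x, m, sigma):
    """Lognormal CDF expressed through the error function."""
    if x <= 0.0:
        return 0.0
    return 0.5 + 0.5 * math.erf(math.log(x / m) / (math.sqrt(2.0) * sigma))

def lognormal_moments(m, sigma):
    """Return (mean, variance, second moment about the origin)."""
    mean = m * math.exp(0.5 * sigma ** 2)
    second = m * m * math.exp(2.0 * sigma ** 2)
    return mean, second - mean ** 2, second

if __name__ == "__main__":
    m, sigma = 1.0, 0.5  # illustrative values only
    print(lognormal_cdf(m, m, sigma), lognormal_moments(m, sigma))
```

Since m is the median, the CDF evaluated at $x=m$ equals 1/2, which is a convenient sanity check.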
4.3. The IMF for Stars
The first test is performed on NGC 2362, where the 271 stars have a range of
, see [20] and CDS catalog J/MNRAS/384/675/table1. According to [21] , the distance of NGC 2362 is 1480 pc.
The second test is performed on the low-mass IMF in the young cluster NGC 6611, see [22] and CDS catalog J/MNRAS/392/1034. This massive cluster has an age of 2 - 3 Myr and contains masses from
. Therefore, the brown dwarfs (BD) region,
, is covered. The third test is performed on the γ Velorum cluster, where the 237 stars have a range of
, see [23] and CDS catalog J/A+A/589/A70/table5. The fourth test is performed on the young cluster Berkeley 59, where the 420 stars have a range of
, see [24] and CDS catalog J/AJ/155/44/table3. The results are presented in Table 1 for the lognormal distribution, in Table 2 for the T-L distribution with scale, and in Table 3 for the truncated T-L distribution with scale. In Table 2 and Table 3 the last column shows whether the results of the K-S test are better (Y) or worse (N) than those for the lognormal distribution. As an example, the empirical DF, visualized as a histogram, and the theoretical T-L DF for NGC 6611 are presented in Figure 3.
Table 1. Numerical values of $\chi_{red}^{2}$, AIC, probability Q, D, the maximum distance between theoretical and observed CDF, and $P_{KS}$, the significance level, in the K-S test of the lognormal distribution, see Equation (31), for different mass distributions. The number of linear bins, n, is 10.
Table 2. Numerical values of $\chi_{red}^{2}$, AIC, probability Q, D, the maximum distance between theoretical and observed DF, and $P_{KS}$, the significance level, in the K-S test of the T-L distribution with scale, see Equation (2), for different astrophysical environments. The last column (F) indicates a $P_{KS}$ higher (Y) or lower (N) than that for the lognormal distribution. The number of linear bins, n, is 10.
Table 3. Numerical values of $\chi_{red}^{2}$, AIC, probability Q, D, the maximum distance between theoretical and observed DF, and $P_{KS}$, the significance level, in the K-S test of the truncated T-L distribution with scale, see Equation (17), for different astrophysical environments. The last column (F) indicates a $P_{KS}$ higher (Y) or lower (N) than that for the lognormal distribution. The number of linear bins, n, is 10.
Table 4. Numerical values of D, the maximum distance between theoretical and observed DF, and $P_{KS}$, the significance level, in the K-S test for different distributions in the case of the γ Velorum cluster.
Figure 3. Empirical DF of the mass distribution for NGC 6611 (blue histogram) with a superposition of the T-L DF (red dashed line). Theoretical parameters as in Table 2.
5. Conclusions
The Truncated Distribution
We derived the PDF, the DF, the average value, the rth moment, and the MLE for the left truncated T-L distribution with scale.
Astrophysical Applications
The application of the T-L distribution to the IMF for stars gives better results than the lognormal distribution for one out of four samples, see Table 2. The truncated T-L distribution gives better results than the T-L distribution for two out of four samples, see Table 2 and Table 3.
The results for the mass distribution of the γ Velorum cluster, compared with those of other distributions, are shown in Table 4, in which the truncated T-L distribution occupies the 7th position.