New Probability Distributions in Astrophysics: XI. Left Truncation for the Topp-Leone Distribution

Abstract

The Topp-Leone (T-L) distribution has aided the modeling of scientific data in many contexts. We demonstrate how it can be adapted to model astrophysical data. We analyse the left truncated version of the T-L distribution, deriving its probability density function (PDF), distribution function, average value, rth moment about the origin, median, the random generation of its values, and its maximum likelihood estimator, which allows us to derive the two unknown parameters. The T-L distribution, in its regular and truncated versions, is then applied to model the initial mass function (IMF) for stars. A comparison is made with specific clusters and between proposed functions for the IMF. The T-L distribution can provide an excellent fit in some cases.

Share and Cite:

Zaninetti, L. (2023) New Probability Distributions in Astrophysics: XI. Left Truncation for the Topp-Leone Distribution. International Journal of Astronomy and Astrophysics, 13, 154-165. doi: 10.4236/ijaa.2023.133009.

1. Introduction

A family of univariate J-shaped probability distributions was introduced by Topp & Leone in 1955 [1]; in the following it is denoted T-L. Roughly 50 years later, the moments of the T-L distribution were derived in terms of the Gauss hypergeometric function [2], and its skewness was analysed numerically [3]. At the time of writing, generalizations of the T-L distribution are an active field of research. Among other approaches we cite the introduction of two-sided and generalized versions [4], a new family of distributions called the Marshall-Olkin Topp Leone-G family [5], and a new trigonometric family of distributions that combines the sine-G and Topp-Leone generated families [6]. Section 2 of this paper introduces the scale for the T-L distribution, which is originally defined in the interval [0, 1]. Section 3 introduces a left truncation of the T-L distribution, and Section 4 applies the derived results to the mass distribution for stars.

2. Topp-Leone Distribution with Scale

Let Y be a random variable taking values y in the interval $[0, 1]$. The Topp-Leone (T-L) probability density function (PDF) is

$$ f(y) = \beta (2 - 2y) \left(-y^{2} + 2y\right)^{\beta - 1}, \tag{1} $$

where $\beta > 0$ is the shape parameter [1]. We now introduce the scale, $b$, with the change of variable $y = x/b$: the T-L PDF with scale, defined in $[0, b]$, is

$$ f(x; b, \beta) = \frac{\beta \left(2 - \frac{2x}{b}\right) \left(-\frac{x^{2}}{b^{2}} + \frac{2x}{b}\right)^{\beta - 1}}{b}, \tag{2} $$

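where $b > 0$, $\beta > 0$. The distribution function (DF) of the T-L with scale is

$$ F(x; b, \beta) = \left(\frac{x}{b}\right)^{\beta} \left(2 - \frac{x}{b}\right)^{\beta}, \tag{3} $$

its average value or mean, $\mu$, is

$$ \mu(b, \beta) = \frac{b \left(2\,\Gamma\left(\frac{3}{2} + \beta\right) - \sqrt{\pi}\,\Gamma(\beta + 1)\right)}{2\,\Gamma\left(\frac{3}{2} + \beta\right)}, \tag{4} $$

where

$$ \Gamma(z) = \int_{0}^{\infty} e^{-t} t^{z - 1}\, dt, \tag{5} $$

is the gamma function. Its variance, $\sigma^{2}$, is

$$ \sigma^{2}(b, \beta) = \frac{\left(4\,\Gamma\left(\frac{3}{2} + \beta\right)^{2} - \pi\,\Gamma(\beta + 1)^{2} (\beta + 1)\right) b^{2}}{4 (\beta + 1)\,\Gamma\left(\frac{3}{2} + \beta\right)^{2}}, \tag{6} $$

and its standard deviation, std, is

$$ \mathrm{std} = \sqrt{\sigma^{2}}. \tag{7} $$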
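As a numerical illustration of Equations (2), (3), (4) and (6), the following minimal Python sketch evaluates the PDF, the DF, the mean and the variance, and compares the closed forms with a midpoint-rule quadrature. The function names, the quadrature and the number of sub-intervals are our own choices for illustration and are not part of the original derivation.

```python
import math


def tl_pdf(x, b, beta):
    """PDF of the T-L distribution with scale, Eq. (2).

    Simplification of this sketch: the end points x = 0 and x = b return 0.
    """
    if x <= 0.0 or x >= b:
        return 0.0
    y = x / b
    return beta * (2.0 - 2.0 * y) * (2.0 * y - y * y) ** (beta - 1.0) / b


def tl_cdf(x, b, beta):
    """Distribution function, Eq. (3), clamped outside the support."""
    y = min(max(x / b, 0.0), 1.0)
    return (y * (2.0 - y)) ** beta


def tl_mean(b, beta):
    """Mean, Eq. (4)."""
    g = math.gamma(1.5 + beta)
    return b * (2.0 * g - math.sqrt(math.pi) * math.gamma(beta + 1.0)) / (2.0 * g)


def tl_variance(b, beta):
    """Variance, Eq. (6)."""
    g2 = math.gamma(1.5 + beta) ** 2
    num = 4.0 * g2 - math.pi * (beta + 1.0) * math.gamma(beta + 1.0) ** 2
    return b * b * num / (4.0 * (beta + 1.0) * g2)


def tl_moment_numeric(r, b, beta, n=20000):
    """r-th raw moment by the midpoint rule (cross-check only)."""
    h = b / n
    return h * sum(((i + 0.5) * h) ** r * tl_pdf((i + 0.5) * h, b, beta)
                   for i in range(n))
```

For $\beta = 1$ and $b = 1$ the PDF reduces to $2 - 2y$, for which the mean is 1/3 and the variance is 1/18; both closed forms reproduce these values.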

Its rth moment about the origin, $\mu_r$, is

$$ \mu_r(b, \beta) = -\frac{\beta \left(2^{\beta}\, {}_2F_1\left(-\beta + 1, \beta + r; 1 + \beta + r; \frac{1}{2}\right) r - 2\beta - 2r\right) b^{r}}{(\beta + r)(2\beta + r)}, \tag{8} $$

where ${}_2F_1(a, b; c; v)$ is the Gauss hypergeometric function [7] [8] [9] [10]. Its skewness is

$$ \mathrm{skewness} = \frac{N}{D}, \tag{9} $$

where

N = 768 ( 3 ( 4 ( β + 2 ) ( β + 1 ) 3 Γ ( 5 2 + β ) 3 3 2 ( β + 1 ) 2 π ( β + 2 ) × Γ ( β + 2 ) ( 3 2 + β ) Γ ( 5 2 + β ) 2 + ( β + 1 ) π Γ ( β + 2 ) 2 ( 3 2 + β ) 3 Γ ( 5 2 + β ) π 3 2 ( β + 5 4 ) Γ ( β + 2 ) 3 ( 3 2 + β ) 3 6 ) ( β + 1 ) ( β 1 ) 2 β Γ ( 5 2 + β ) ( β + 7 3 ) × F 2 1 ( β , β + 2 ; β + 1 ; 1 2 ) + 8 ( β + 1 ) 4 β ( β + 2 ) ( β + 7 3 ) Γ ( 5 2 + β ) 4

8 ( β + 1 ) 2 π ( β + 2 ) ( β 4 + 29 6 β 3 + 79 12 β 2 + 8 3 β 11 24 ) Γ ( β + 2 ) ( 3 2 + β ) Γ ( 5 2 + β ) 3 + 12 ( β 3 + 5 2 β 2 + 1 2 β 1 2 ) ( β + 1 ) 2 π Γ ( β + 2 ) 2 ( β + 7 3 ) ( 3 2 + β ) 2 Γ ( 5 2 + β ) 2 6 ( β + 1 ) π 3 2 ( β 4 + 38 9 β 3 + 301 72 β 2 19 36 β 23 24 ) Γ ( β + 2 ) 3 ( 3 2 + β ) 3 Γ ( 5 2 + β ) + π 2 ( β + 1 2 ) ( β + 5 4 ) ( β 1 2 ) Γ ( β + 2 ) 4 ( β + 7 3 ) ( 3 2 + β ) 4 ) b 3 (10)

D = s t d 3 ( β + 1 ) 3 512 ( 4 β 2 1 ) ( 3 + 2 β ) ( β + 1 ) Γ ( 5 2 + β ) 4 . (11)

Figure 1 shows the behaviour of the skewness as a function of the parameter β ; the transition from positive to negative values is at β = 2.563 and [3] quotes β = 2.56 .
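The transition value of $\beta$ can be cross-checked without the closed form of the skewness. The sketch below, in Python, evaluates the first three raw moments of Equation (1) by a midpoint rule, forms the skewness from them, and locates the change of sign by bisection. The skewness does not depend on $b$, so the unit scale is used. The bracketing interval [2, 3], the quadrature and the function names are assumptions made for this illustration only.

```python
def raw_moment(r, beta, n=4000):
    """E[Y^r] for the unit-scale T-L PDF of Eq. (1), midpoint rule on [0, 1]."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        y = (i + 0.5) * h
        total += y ** r * beta * (2.0 - 2.0 * y) * (2.0 * y - y * y) ** (beta - 1.0)
    return total * h


def skewness(beta):
    """Third central moment divided by the cube of the standard deviation."""
    m1, m2, m3 = (raw_moment(r, beta) for r in (1, 2, 3))
    variance = m2 - m1 * m1
    third_central = m3 - 3.0 * m1 * m2 + 2.0 * m1 ** 3
    return third_central / variance ** 1.5


def skewness_zero(lo=2.0, hi=3.0, tol=1e-6):
    """Bisection for the beta at which the skewness changes sign."""
    f_lo = skewness(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = skewness(mid)
        if (f_mid > 0.0) == (f_lo > 0.0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Calling `skewness_zero()` should return a value close to the $\beta = 2.563$ quoted above.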

The kurtosis of the T-L has a complicated expression and we limit ourselves to a numerical display, see Figure 2; the minimum value is at β = 1.843 when b = 1 .

The median, $q_{1/2}$, is at

$$ q_{1/2}(b, \beta) = \left(1 - \sqrt{1 - 2^{-1/\beta}}\right) b, \tag{12} $$

and the mode is at

$$ \mathrm{mode}(b, \beta) = \frac{\left(\sqrt{2\beta - 1} - 1\right) b}{\sqrt{2\beta - 1}}. \tag{13} $$
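Equations (12) and (13) can be checked numerically with a few lines of Python: the DF of Equation (3) evaluated at the median should give 1/2, and the PDF of Equation (2) should be larger at the mode than at neighbouring points. The restriction $\beta \geq 1$ in `tl_mode` is an assumption of this sketch; for $\beta < 1$ the PDF diverges at the origin and Equation (13) does not give a point of the support.

```python
import math


def tl_pdf(x, b, beta):
    """PDF of the T-L distribution with scale, Eq. (2), for 0 < x < b."""
    y = x / b
    return beta * (2.0 - 2.0 * y) * (2.0 * y - y * y) ** (beta - 1.0) / b


def tl_cdf(x, b, beta):
    """Distribution function, Eq. (3)."""
    y = x / b
    return (y * (2.0 - y)) ** beta


def tl_median(b, beta):
    """Median, Eq. (12)."""
    return (1.0 - math.sqrt(1.0 - 2.0 ** (-1.0 / beta))) * b


def tl_mode(b, beta):
    """Mode, Eq. (13); this sketch only accepts beta >= 1."""
    if beta < 1.0:
        raise ValueError("Eq. (13) is used here only for beta >= 1")
    root = math.sqrt(2.0 * beta - 1.0)
    return (root - 1.0) * b / root
```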

Figure 1. Skewness of the T-L distribution with scale as a function of β when b = 1 .

Figure 2. Kurtosis of the T-L distribution with scale as a function of β when b = 1 .

The random generation of the T-L variate X is given by

$$ X : b, \beta \approx \left(1 - \sqrt{1 - R^{1/\beta}}\right) b, \tag{14} $$

where R is the unit rectangular variate. The two parameters $b$ and $\beta$ can be derived by the numerical solution of the two following equations, which arise from the maximum likelihood estimator (MLE),

$$ \frac{1}{b} \left(-2n - \sum_{i=1}^{n} \frac{(2\beta - 2) x_i^{2} + (-4\beta + 5) b x_i + 2 b^{2} (\beta - 2)}{(b - x_i)(2b - x_i)}\right) = 0, \tag{15a} $$

$$ \frac{n}{\beta} + \sum_{i=1}^{n} \ln\left(\frac{x_i (2b - x_i)}{b^{2}}\right) = 0, \tag{15b} $$

where $x_i$ are the elements of the experimental sample with i varying between 1 and n.
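One possible way to solve Equations (15a) and (15b) numerically is sketched below in Python. Equation (15b) gives $\beta$ in closed form once $b$ is fixed, so the log-likelihood can be maximised over $b$ alone (a profile likelihood); a stationary point of this profile satisfies Equation (15a) as well. The coarse scan of $b$ up to five times the largest datum followed by a golden-section refinement, the function names and the default settings are assumptions of this sketch, not the procedure used for the results of this paper. The sample is generated with Equation (14).

```python
import math
import random


def tl_sample(n, b, beta, rng):
    """Draw n T-L variates by inverting the DF, Eq. (14)."""
    return [b * (1.0 - math.sqrt(1.0 - rng.random() ** (1.0 / beta)))
            for _ in range(n)]


def beta_given_b(xs, b):
    """Solution of Eq. (15b) for beta at fixed b."""
    s = sum(math.log(x * (2.0 * b - x) / (b * b)) for x in xs)
    return -len(xs) / s


def log_likelihood(xs, b, beta):
    """Log-likelihood of the PDF of Eq. (2); requires 0 < x_i < b."""
    s1 = sum(math.log(1.0 - x / b) for x in xs)
    s2 = sum(math.log(x * (2.0 * b - x) / (b * b)) for x in xs)
    return len(xs) * math.log(2.0 * beta / b) + s1 + (beta - 1.0) * s2


def tl_mle(xs, grid=100, refine=60):
    """Maximise the profile likelihood in b: coarse scan, then golden section."""
    x_max = max(xs)

    def profile(b):
        return log_likelihood(xs, b, beta_given_b(xs, b))

    # Coarse scan on (x_max, 5 x_max]; the upper limit is an arbitrary choice.
    bs = [x_max * (1.0 + 4.0 * (i + 1) / grid) for i in range(grid)]
    values = [profile(b) for b in bs]
    k = max(range(grid), key=values.__getitem__)
    lo = bs[k - 1] if k > 0 else x_max * (1.0 + 1e-9)
    hi = bs[k + 1] if k + 1 < grid else bs[k]

    # Golden-section search for the maximum inside [lo, hi].
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    fc, fd = profile(c), profile(d)
    for _ in range(refine):
        if fc > fd:
            hi, d, fd = d, c, fc
            c = hi - inv_phi * (hi - lo)
            fc = profile(c)
        else:
            lo, c, fc = c, d, fd
            d = lo + inv_phi * (hi - lo)
            fd = profile(d)
    b_hat = 0.5 * (lo + hi)
    return b_hat, beta_given_b(xs, b_hat)
```

A call such as `tl_mle(tl_sample(2000, 2.0, 1.5, random.Random(1)))` returns a pair of estimates for $b$ and $\beta$, with an estimated $b$ that is always larger than the largest element of the sample.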

3. Truncated Topp-Leone Distribution with Scale

Let X be a random variable defined in $[x_l, b]$; the left truncated two-parameter T-L DF, $F_T(x)$, is

$$ F_T(x; \beta, x_l, b) = \frac{b^{-2\beta} \left(x_l^{\beta} (2b - x_l)^{\beta} - x^{\beta} (2b - x)^{\beta}\right)}{x_l^{\beta} b^{-2\beta} (2b - x_l)^{\beta} - 1}, \tag{16} $$

and its PDF, $f_T(x)$, is

$$ f_T(x; \beta, x_l, b) = \frac{\beta \left(2 - \frac{2x}{b}\right) \left(-\frac{x^{2}}{b^{2}} + \frac{2x}{b}\right)^{\beta - 1}}{b \left(1 - x_l^{\beta} b^{-2\beta} (2b - x_l)^{\beta}\right)}. \tag{17} $$
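Equations (16) and (17) are the DF and PDF of Equations (3) and (2) renormalised to the interval $[x_l, b]$. A minimal Python sketch, with function names chosen here for illustration, is the following.

```python
def tl_cdf(x, b, beta):
    """Untruncated DF, Eq. (3)."""
    y = x / b
    return (y * (2.0 - y)) ** beta


def tlt_cdf(x, beta, xl, b):
    """Left truncated DF, Eq. (16), on [xl, b]."""
    if x <= xl:
        return 0.0
    if x >= b:
        return 1.0
    f_l = tl_cdf(xl, b, beta)
    return (tl_cdf(x, b, beta) - f_l) / (1.0 - f_l)


def tlt_pdf(x, beta, xl, b):
    """Left truncated PDF, Eq. (17); zero outside [xl, b]."""
    if x < xl or x > b:
        return 0.0
    y = x / b
    f = beta * (2.0 - 2.0 * y) * (2.0 * y - y * y) ** (beta - 1.0) / b
    return f / (1.0 - tl_cdf(xl, b, beta))
```

Two quick consistency checks are that the PDF integrates to one over $[x_l, b]$ and that a finite-difference derivative of the DF reproduces the PDF.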

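Its average value or mean, $\mu_T$, is

$$ \mu_T(\beta, x_l, b) = \frac{1}{2 (\beta + 1)(\beta + 2)\,\Gamma\left(\frac{3}{2} + \beta\right) \left(x_l^{\beta} (2b - x_l)^{\beta} - b^{2\beta}\right)} \Bigl( -x_l^{\beta + 2}\,\Gamma\left(\tfrac{3}{2} + \beta\right) \beta\, 2^{\beta + 1} b^{\beta - 1} (\beta + 1)\, {}_2F_1\left(-\beta + 1, \beta + 2; \beta + 3; \frac{x_l}{2b}\right) + x_l^{\beta + 1}\,\Gamma\left(\tfrac{3}{2} + \beta\right) \beta\, b^{\beta} \left(\beta\, 2^{\beta + 1} + 4 \cdot 2^{\beta}\right) {}_2F_1\left(-\beta + 1, \beta + 1; \beta + 2; \frac{x_l}{2b}\right) + b^{2\beta + 1} (\beta + 1)(\beta + 2) \left(\sqrt{\pi}\,\Gamma(\beta + 1) - 2\,\Gamma\left(\tfrac{3}{2} + \beta\right)\right) \Bigr). \tag{18} $$

Its rth moment about the origin, $\mu_{r,T}$, is

$$ \mu_{r,T}(\beta, x_l, b) = \frac{\beta}{\left(x_l^{\beta} (2b - x_l)^{\beta} - b^{2\beta}\right) (\beta + r)(2\beta + r)} \Bigl( 2\beta\, x_l^{\beta + r} b^{\beta} \left(\frac{2b - x_l}{b}\right)^{\beta} + 2r\, x_l^{\beta + r} b^{\beta} \left(\frac{2b - x_l}{b}\right)^{\beta} - {}_2F_1\left(\beta + r, -\beta + 1; 1 + \beta + r; \frac{x_l}{2b}\right) r\, x_l^{\beta + r}\, 2^{\beta} b^{\beta} + {}_2F_1\left(\beta + r, -\beta + 1; 1 + \beta + r; \frac{1}{2}\right) r\, b^{2\beta + r}\, 2^{\beta} - 2\beta\, b^{2\beta + r} - 2r\, b^{2\beta + r} \Bigr). \tag{19} $$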

Its variance can be evaluated with the usual formula:

$$ \sigma_T^{2}(\beta, x_l, b) = \mu_{2,T} - \left(\mu_{1,T}\right)^{2}. \tag{20} $$
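The moments of Equation (19) and the variance of Equation (20) can be evaluated with a short Python sketch. The hypergeometric function is summed here with its Gauss power series, which is adequate because its argument never exceeds 1/2; the number of terms, the midpoint-rule cross-check and the function names are assumptions made for this illustration. With $r = 1$ the function `tlt_moment` should also agree with the mean of Equation (18).

```python
def hyp2f1(a1, a2, c, z, terms=400):
    """Gauss series for 2F1(a1, a2; c; z); used here only for |z| <= 1/2."""
    total, term = 1.0, 1.0
    for k in range(terms):
        term *= (a1 + k) * (a2 + k) / ((c + k) * (k + 1.0)) * z
        total += term
    return total


def tlt_moment(r, beta, xl, b):
    """r-th raw moment of the left truncated T-L, Eq. (19)."""
    a = beta + r
    # x_l^(beta+r) b^beta ((2b - x_l)/b)^beta simplifies to the product below
    edge = xl ** a * (2.0 * b - xl) ** beta
    f_low = hyp2f1(a, 1.0 - beta, a + 1.0, xl / (2.0 * b))
    f_half = hyp2f1(a, 1.0 - beta, a + 1.0, 0.5)
    top = b ** (2.0 * beta + r)
    bracket = (2.0 * beta * edge + 2.0 * r * edge
               - f_low * r * xl ** a * 2.0 ** beta * b ** beta
               + f_half * r * top * 2.0 ** beta
               - 2.0 * beta * top - 2.0 * r * top)
    denom = (xl ** beta * (2.0 * b - xl) ** beta - b ** (2.0 * beta)) * a * (2.0 * beta + r)
    return beta * bracket / denom


def tlt_variance(beta, xl, b):
    """Variance, Eq. (20)."""
    return tlt_moment(2, beta, xl, b) - tlt_moment(1, beta, xl, b) ** 2


def tlt_moment_numeric(r, beta, xl, b, n=20000):
    """Midpoint-rule integral of x^r times the PDF of Eq. (17); cross-check only."""
    norm = 1.0 - (xl * (2.0 * b - xl) / (b * b)) ** beta
    h = (b - xl) / n
    total = 0.0
    for i in range(n):
        x = xl + (i + 0.5) * h
        y = x / b
        total += x ** r * beta * (2.0 - 2.0 * y) * (2.0 * y - y * y) ** (beta - 1.0) / b
    return total * h / norm
```

As a hand-checkable case, $\beta = 1$, $b = 1$ and $x_l = 1/2$ give the PDF $8(1 - x)$ on $[1/2, 1]$, whose mean is 2/3.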

The random generation of the truncated T-L variate X is obtained by solving the following nonlinear equation in x:

$$ F_T(x; \beta, x_l, b) = R, \tag{21} $$

where R is the unit rectangular variate. The three parameters $x_l$, $b$ and $\beta$ can be obtained in the following way. Consider a sample $\mathcal{X} = x_1, x_2, \dots, x_n$ and let $x_{(1)} \geq x_{(2)} \geq \dots \geq x_{(n)}$ denote their order statistics, so that $x_{(1)} = \max(x_1, x_2, \dots, x_n)$, $x_{(n)} = \min(x_1, x_2, \dots, x_n)$. The first parameter $x_l$ is

$$ x_l = x_{(n)}. \tag{22} $$
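Equation (21) can be solved by bisection, because $F_T$ increases monotonically from 0 at $x_l$ to 1 at $b$. The Python sketch below does this and, as a cross-check that is our own observation rather than part of the text above, also inverts the equation in closed form by first mapping R to the corresponding level of the untruncated DF of Equation (3). The tolerance and the function names are assumptions of this sketch.

```python
import math


def tl_cdf(x, b, beta):
    """Untruncated DF, Eq. (3)."""
    y = x / b
    return (y * (2.0 - y)) ** beta


def tlt_cdf(x, beta, xl, b):
    """Left truncated DF, Eq. (16)."""
    f_l = tl_cdf(xl, b, beta)
    return (tl_cdf(x, b, beta) - f_l) / (1.0 - f_l)


def tlt_variate_bisect(r_val, beta, xl, b, tol=1e-12):
    """Solve F_T(x) = R, Eq. (21), by bisection on [xl, b]."""
    lo, hi = xl, b
    while hi - lo > tol * b:
        mid = 0.5 * (lo + hi)
        if tlt_cdf(mid, beta, xl, b) < r_val:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def tlt_variate_closed(r_val, beta, xl, b):
    """Same variate in closed form: rescale R, then invert Eq. (3)."""
    f_l = tl_cdf(xl, b, beta)
    level = f_l + r_val * (1.0 - f_l)
    return b * (1.0 - math.sqrt(1.0 - level ** (1.0 / beta)))
```

For a generated sample `xs`, the estimate of Equation (22) is simply `min(xs)`.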

One method, the MLE, allows us to derive the two remaining parameters by maximizing the log-likelihood:

$$ \ln\left(L(x_i; \beta, x_l, b)\right) = n \ln(2) + n \ln(\beta) - 2n \ln(b) + \sum_{i=1}^{n} \ln\left(\frac{(b - x_i) \left(\frac{x_i (2b - x_i)}{b^{2}}\right)^{\beta - 1}}{1 - x_l^{\beta} b^{-2\beta} (2b - x_l)^{\beta}}\right), \tag{23} $$

where $L(x_i; \beta, x_l, b)$ is the likelihood function. The two parameters $b$ and $\beta$ are derived by the numerical solution of the two following equations,

$$ \frac{\partial \ln\left(L(x_i; \beta, x_l, b)\right)}{\partial \beta} = 0, \tag{24a} $$

$$ \frac{\partial \ln\left(L(x_i; \beta, x_l, b)\right)}{\partial b} = 0, \tag{24b} $$

where $x_i$ are the elements of the experimental sample with i varying between 1 and n. Another method is the method of moments, which derives $\beta$ and $b$ from the following two non-linear equations:

$$ \mu_T(\beta, x_l, b) = \bar{x}, \tag{25a} $$

$$ \sigma_T^{2}(\beta, x_l, b) = Var, \tag{25b} $$

where $\bar{x}$ and $Var$ are, respectively, the average value and the variance of the experimental sample [11].
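The log-likelihood of Equation (23) is straightforward to code. The Python sketch below evaluates it and, as a crude stand-in for the numerical solution of Equations (24a) and (24b), searches for its maximum on a user-supplied grid of trial values; $x_l$ is fixed by Equation (22). The grid search, the parametrisation of $b$ as a factor times the largest datum and the function names are assumptions of this sketch, and a proper optimiser would normally replace the grid.

```python
import math


def tlt_loglik(xs, beta, xl, b):
    """Log-likelihood of the left truncated T-L, Eq. (23); needs xl <= x_i < b."""
    n = len(xs)
    norm = 1.0 - xl ** beta * b ** (-2.0 * beta) * (2.0 * b - xl) ** beta
    total = n * math.log(2.0) + n * math.log(beta) - 2.0 * n * math.log(b)
    for x in xs:
        kernel = (b - x) * (x * (2.0 * b - x) / (b * b)) ** (beta - 1.0)
        total += math.log(kernel / norm)
    return total


def tlt_fit_grid(xs, b_factors, betas):
    """Exhaustive search of the maximum of Eq. (23) on a grid (illustrative only).

    b_factors are trial values of b / max(xs) and must all exceed 1.
    Returns (xl, b, beta, log-likelihood) for the best grid point.
    """
    xl = min(xs)  # Eq. (22)
    x_max = max(xs)
    best = None
    for factor in b_factors:
        b = factor * x_max
        for beta in betas:
            value = tlt_loglik(xs, beta, xl, b)
            if best is None or value > best[0]:
                best = (value, b, beta)
    return xl, best[1], best[2], best[0]
```

A typical call is `tlt_fit_grid(xs, [1.0 + 0.01 * k for k in range(1, 31)], [0.5 + 0.1 * k for k in range(30)])`, where the two grids are arbitrary examples.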

4. Astrophysical Applications

This section reviews the adopted statistics; the lognormal distribution is also used here for the sake of comparison. The new results are applied to the initial mass function (IMF) for stars.

4.1. Statistics

The merit function $\chi^{2}$ is computed according to the formula:

$$ \chi^{2} = \sum_{i=1}^{n} \frac{(T_i - O_i)^{2}}{T_i}, \tag{26} $$

where n is the number of bins, $T_i$ is the theoretical value, and $O_i$ is the experimental value represented by the frequencies. The theoretical frequency distribution is given by

$$ T_i = N \Delta x_i\, p(x), \tag{27} $$

where N is the number of elements of the sample, $\Delta x_i$ is the magnitude of the size interval, and $p(x)$ is the PDF under examination. A reduced merit function $\chi_{red}^{2}$ is given by

$$ \chi_{red}^{2} = \chi^{2} / NF, \tag{28} $$

where $NF = n - k$ is the number of degrees of freedom, n is the number of bins, and k is the number of parameters. The goodness of the fit can be expressed by the probability Q, see equation 15.2.12 in [11], which involves the number of degrees of freedom and $\chi^{2}$. According to [11] p. 658, the fit “may be acceptable” if $Q > 0.001$. The Akaike information criterion (AIC), see [12], is defined by

$$ AIC = 2k - 2 \ln(L), \tag{29} $$

where L is the likelihood function and k the number of free parameters in the model. We assume a Gaussian distribution for the errors. The likelihood function can then be derived from the $\chi^{2}$ statistic, $L \propto \exp\left(-\frac{\chi^{2}}{2}\right)$, where $\chi^{2}$ has been computed by Equation (26), see [13] [14]. Now the AIC becomes:

$$ AIC = 2k + \chi^{2}. \tag{30} $$
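The statistics of Equations (26), (27), (28) and (30) reduce to a few lines of Python. In the sketch below the PDF in Equation (27) is evaluated at the centre of each bin, which is an assumption of this illustration, and the numbers in the usage note are invented for the sake of the arithmetic rather than taken from the clusters analysed later.

```python
def theoretical_frequencies(pdf, edges, n_sample):
    """T_i = N * dx_i * p(x), Eq. (27), with p evaluated at the bin centre."""
    return [n_sample * (hi - lo) * pdf(0.5 * (lo + hi))
            for lo, hi in zip(edges[:-1], edges[1:])]


def chi_square(theoretical, observed):
    """Merit function, Eq. (26)."""
    return sum((t - o) ** 2 / t for t, o in zip(theoretical, observed))


def reduced_chi_square(chi2, n_bins, k):
    """Reduced merit function, Eq. (28), with NF = n - k degrees of freedom."""
    nf = n_bins - k
    if nf <= 0:
        raise ValueError("the number of bins must exceed the number of parameters")
    return chi2 / nf


def aic_from_chi_square(chi2, k):
    """AIC under Gaussian errors, Eq. (30)."""
    return 2.0 * k + chi2
```

With the invented values $T = (12, 18, 30)$ and $O = (10, 20, 30)$ one obtains $\chi^{2} = 4/12 + 4/18 \approx 0.556$; for $k = 2$ parameters this gives $NF = 1$, $\chi_{red}^{2} \approx 0.556$ and $AIC \approx 4.556$.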

The Kolmogorov-Smirnov test (K-S), see [15] [16] [17], does not require the data to be binned. The K-S test, as implemented by the FORTRAN subroutine KSONE in [11], finds the maximum distance, D, between the theoretical and the astronomical DF, as well as the significance level $P_{KS}$; see formulas 14.3.5 and 14.3.9 in [11]. If $P_{KS} \geq 0.1$, then the goodness of the fit is believable.
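For readers without access to the FORTRAN routine, the one-sample K-S test can be sketched in Python as follows. The sketch follows our reading of the formulas cited above: D is the largest distance between the empirical and the theoretical DF, and the significance level is the alternating series $2 \sum_{j \geq 1} (-1)^{j-1} e^{-2 j^{2} \lambda^{2}}$ evaluated at $\lambda = (\sqrt{N} + 0.12 + 0.11/\sqrt{N})\, D$. It is not a transcription of KSONE, and the stopping thresholds and function names are assumptions.

```python
import math


def ks_statistic(data, cdf):
    """Maximum distance D between the empirical DF of data and cdf."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # the empirical DF jumps from i/n to (i+1)/n at x
        d = max(d, abs(f - i / n), abs((i + 1) / n - f))
    return d


def q_ks(lam, eps1=1e-3, eps2=1e-8, max_terms=100):
    """Alternating series for the K-S significance; returns 1 if not converged."""
    a2 = -2.0 * lam * lam
    sign = 2.0
    total = 0.0
    previous = 0.0
    for j in range(1, max_terms + 1):
        term = sign * math.exp(a2 * j * j)
        total += term
        if abs(term) <= eps1 * previous or abs(term) <= eps2 * total:
            return total
        sign = -sign
        previous = abs(term)
    return 1.0


def ks_test(data, cdf):
    """Return (D, P_KS) for a sample and a theoretical DF."""
    n = len(data)
    d = ks_statistic(data, cdf)
    root = math.sqrt(n)
    return d, q_ks((root + 0.12 + 0.11 / root) * d)
```

Here `cdf` can be any of the distribution functions discussed in this paper, for example the DF of Equation (3) with fitted parameters passed through a `lambda`.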

4.2. Lognormal Distribution

Let X be a random variable defined in $[0, \infty]$; the lognormal PDF, following [18] or formula (14.2) in [19], is

$$ PDF(x; m, \sigma) = \frac{e^{-\frac{1}{2\sigma^{2}} \left(\ln\left(\frac{x}{m}\right)\right)^{2}}}{x \sigma \sqrt{2\pi}}, \tag{31} $$

where m is the median and $\sigma$ the shape parameter. Its CDF is

$$ CDF(x; m, \sigma) = \frac{1}{2} + \frac{1}{2}\, \mathrm{erf}\left(\frac{1}{2} \frac{\sqrt{2} \left(-\ln(m) + \ln(x)\right)}{\sigma}\right), \tag{32} $$

where erf(x) is the error function, defined as

$$ \mathrm{erf}(x) = \frac{2}{\sqrt{\pi}} \int_{0}^{x} e^{-t^{2}}\, dt, \tag{33} $$

see [10]. Its average value or mean, $E(X)$, is

$$ E(X; m, \sigma) = m\, e^{\frac{1}{2}\sigma^{2}}, \tag{34} $$

its variance, $Var(X)$, is

$$ Var = e^{\sigma^{2}} \left(e^{\sigma^{2}} - 1\right) m^{2}, \tag{35} $$

and its second moment about the origin, $E(X^{2})$, is

$$ E(X^{2}; m, \sigma) = m^{2} e^{2\sigma^{2}}. \tag{36} $$
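The lognormal formulas of Equations (31), (32), (34), (35) and (36) translate directly into Python, where the error function of Equation (33) is available as `math.erf`. The function names below are chosen for this illustration.

```python
import math


def lognormal_pdf(x, m, sigma):
    """Lognormal PDF, Eq. (31); m is the median, sigma the shape parameter."""
    if x <= 0.0:
        return 0.0
    z = math.log(x / m)
    return math.exp(-z * z / (2.0 * sigma * sigma)) / (x * sigma * math.sqrt(2.0 * math.pi))


def lognormal_cdf(x, m, sigma):
    """Lognormal CDF, Eq. (32)."""
    if x <= 0.0:
        return 0.0
    return 0.5 + 0.5 * math.erf(0.5 * math.sqrt(2.0) * (math.log(x) - math.log(m)) / sigma)


def lognormal_mean(m, sigma):
    """Mean, Eq. (34)."""
    return m * math.exp(0.5 * sigma * sigma)


def lognormal_variance(m, sigma):
    """Variance, Eq. (35)."""
    return math.exp(sigma * sigma) * (math.exp(sigma * sigma) - 1.0) * m * m


def lognormal_second_moment(m, sigma):
    """Second moment about the origin, Eq. (36)."""
    return m * m * math.exp(2.0 * sigma * sigma)
```

Two identities serve as a check: the CDF evaluated at the median m equals 1/2, and the second moment minus the square of the mean equals the variance.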

4.3. The IMF for Stars

The first test is performed on NGC 2362, where the 271 stars have a range of $1.47\,M_{\odot} \geq M \geq 0.11\,M_{\odot}$, see [20] and CDS catalog J/MNRAS/384/675/table1. According to [21], the distance of NGC 2362 is 1480 pc.

The second test is performed on the low-mass IMF in the young cluster NGC 6611, see [22] and CDS catalog J/MNRAS/392/1034. This massive cluster has an age of 2 - 3 Myr and contains masses in the range $1.5\,M_{\odot} \geq M \geq 0.02\,M_{\odot}$. Therefore, the brown dwarfs (BD) region, $\approx 0.2\,M_{\odot}$, is covered. The third test is performed on the γ Velorum cluster, where the 237 stars have a range of $1.31\,M_{\odot} \geq M \geq 0.15\,M_{\odot}$, see [23] and CDS catalog J/A+A/589/A70/table5. The fourth test is performed on the young cluster Berkeley 59, where the 420 stars have a range of $2.24\,M_{\odot} \geq M \geq 0.15\,M_{\odot}$, see [24] and CDS catalog J/AJ/155/44/table3. The results are presented in Table 1 for the lognormal distribution, in Table 2 for the T-L distribution with scale, and in Table 3 for the truncated T-L distribution with scale. In Table 2 and Table 3 the last column shows whether the results of the K-S test are better (Y) or worse (N) than those for the lognormal distribution. As an example, the empirical DF, visualized through a histogram, and the theoretical T-L DF for NGC 6611 are presented in Figure 3.

Table 1. Numerical values of $\chi_{red}^{2}$, AIC, probability Q, D, the maximum distance between theoretical and observed DF, and $P_{KS}$, the significance level, in the K-S test of the lognormal distribution, see Equation (31), for different mass distributions. The number of linear bins, n, is 10.

Table 2. Numerical values of $\chi_{red}^{2}$, AIC, probability Q, D, the maximum distance between theoretical and observed DF, and $P_{KS}$, the significance level, in the K-S test of the T-L distribution with scale, see Equation (2), for different astrophysical environments. The last column (F) indicates a $P_{KS}$ higher (Y) or lower (N) than that for the lognormal distribution. The number of linear bins, n, is 10.

Table 3. Numerical values of $\chi_{red}^{2}$, AIC, probability Q, D, the maximum distance between theoretical and observed DF, and $P_{KS}$, the significance level, in the K-S test of the truncated T-L distribution with scale, see Equation (17), for different astrophysical environments. The last column (F) indicates a $P_{KS}$ higher (Y) or lower (N) than that for the lognormal distribution. The number of linear bins, n, is 10.

Table 4. Numerical values of D, the maximum distance between theoretical and observed DF, and $P_{KS}$, the significance level, in the K-S test for different distributions in the case of the γ Velorum cluster.

Figure 3. Empirical DF of the mass distribution for NGC 6611 (blue histogram) with a superposition of the T-L DF (red dashed line). Theoretical parameters as in Table 2.

5. Conclusions

The Truncated Distribution

We derived the PDF, the DF, the average value, the rth moment, and the MLE for the left truncated T-L distribution with scale.

Astrophysical Applications

The application of the T-L distribution to the IMF for stars gives better results than the lognormal distribution for one out of four samples, see Table 2. The truncated T-L distribution gives better results than the T-L distribution for two out of four samples, see Table 2 and Table 3.

The results for the mass distribution of γ Velorum cluster compared with other distributions are shown in Table 4, in which the truncated T-L distribution occupies the 7th position.

Conflicts of Interest

The author declares no conflicts of interest regarding the publication of this paper.

References

[1] Topp, C.W. and Leone, F.C. (1955) A Family of J-Shaped Frequency Functions. Journal of the American Statistical Association, 50, 209-219.
https://doi.org/10.1080/01621459.1955.10501259
[2] Nadarajah, S. and Kotz, S. (2003) Moments of Some J-Shaped Distributions. Journal of Applied Statistics, 30, 311-317.
https://doi.org/10.1080/0266476022000030084
[3] Kotz, S. and Seier, E. (2007) Kurtosis of the Topp-Leone Distributions. Interstat, 1, 1-15.
[4] Vicari, D., Van Dorp, J.R. and Kotz, S. (2008) Two-Sided Generalized Topp and Leone (TS-GTL) Distributions. Journal of Applied Statistics, 35, 1115-1129.
https://doi.org/10.1080/02664760802230583
[5] Khaleel, M.A., Oguntunde, P.E., Abbasi, J.N.A., Ibrahim, N.A. and AbuJarad, M.H.A. (2020) The Marshall-Olkin Topp Leone-G Family of Distributions: A Family for Generalizing Probability Models. Scientific African, 8, e00470.
https://doi.org/10.1016/j.sciaf.2020.e00470
[6] Al-Babtain, A.A., Elbatal, I., Chesneau, C. and Elgarhy, M. (2020) Sine Topp-Leone-G Family of Distributions: Theory and Applications. Open Physics, 18, 574-593.
https://doi.org/10.1515/phys-2020-0180
[7] Abramowitz, M. and Stegun, I.A. (1965) Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables. Dover, New York.
[8] von Seggern, D. (1992) CRC Standard Curves and Surfaces. CRC, New York.
[9] Thompson, W.J. (1997) Atlas for Computing Mathematical Functions. Wiley-Inter-Science, New York.
[10] Olver, F.W.J., Lozier, D.W., Boisvert, R.F. and Clark, C.W. (2010) NIST Handbook of Mathematical Functions. Cambridge University Press, Cambridge.
[11] Press, W.H., Teukolsky, S.A., Vetterling, W.T. and Flannery, B.P. (1992) Numerical Recipes in FORTRAN. The Art of Scientific Computing. Cambridge University Press, Cambridge.
[12] Akaike, H. (1974) A New Look at the Statistical Model Identification. IEEE Transactions on Automatic Control, 19, 716-723.
https://doi.org/10.1109/TAC.1974.1100705
[13] Liddle, A.R. (2004) How Many Cosmological Parameters? MNRAS, 351, L49-L53.
https://doi.org/10.1111/j.1365-2966.2004.08033.x
[14] Godlowski, W. and Szydlowski, M. (2005) Constraints on Dark Energy Models from Supernovae. In: Turatto, M., Benetti, S., Zampieri, L. and Shea, W., Eds., 1604-2004: Supernovae as Cosmological Lighthouses, Astronomical Society of the Pacific, Vol. 342 of Astronomical Society of the Pacific Conference Series, ASP, San Francisco, 508-516.
[15] Kolmogoroff, A. (1941) Confidence Limits for an Unknown Distribution Function. The Annals of Mathematical Statistics, 12, 461-463.
https://doi.org/10.1214/aoms/1177731684
[16] Smirnov, N. (1948) Table for Estimating the Goodness of Fit of Empirical Distributions. The Annals of Mathematical Statistics, 19, 279-281.
https://doi.org/10.1214/aoms/1177730256
[17] Massey Jr., F.J. (1951) The Kolmogorov-Smirnov Test for Goodness of Fit. Journal of the American Statistical Association, 46, 68-78.
https://doi.org/10.1080/01621459.1951.10500769
[18] Evans, M., Hastings, N. and Peacock, B. (2000) Statistical Distributions. 3rd Edition, Wiley, New York.
[19] Johnson, N.L., Kotz, S. and Balakrishnan, N. (1994) Continuous Univariate Distributions. 2nd Edition, Vol. 1, Wiley, New York.
[20] Irwin, J., Hodgkin, S., Aigrain, S., Bouvier, J., Hebb, L., Irwin, M. and Moraux, E. (2008) The Monitor Project: Rotation of Low-Mass Stars in NGC 2362—Testing the Disc Regulation Paradigm at 5 Myr. Monthly Notices of the Royal Astronomical Society, 384, 675-686.
https://doi.org/10.1111/j.1365-2966.2007.12725.x
[21] Moitinho, A., Alves, J., Huélamo, N. and Lada, C.J. (2001) NGC 2362: A Template for Early Stellar Evolution. The Astrophysical Journal, 563, L73-L76.
https://doi.org/10.1086/338503
[22] Oliveira, J.M., Jeffries, R.D. and van Loon, J.T. (2009) The Low-Mass Initial Mass Function in the Young Cluster NGC 6611. Monthly Notices of the Royal Astronomical Society, 392, 1034-1050.
https://doi.org/10.1111/j.1365-2966.2008.14140.x
[23] Prisinzano, L., Damiani, F., et al. (2016) The Gaia-ESO Survey: Membership and Initial Mass Function of the γ Velorum Cluster. Astronomy & Astrophysics, 589, Article No. A70.
https://doi.org/10.1051/0004-6361/201527875
[24] Panwar, N., Pandey, A.K., Samal, M.R., et al. (2018) Young Cluster Berkeley 59: Properties, Evolution, and Star Formation. The Astronomical Journal, 155, Article No. 44.
https://doi.org/10.3847/1538-3881/aa9f1b
[25] Zaninetti, L. (2022) New Probability Distributions in Astrophysics: X. Truncation and Mass-Luminosity Relationship for the Frèchet Distribution. International Journal of Astronomy and Astrophysics, 12, 347-362.
https://doi.org/10.4236/ijaa.2022.124020
[26] Zaninetti, L. (2021) New Probability Distributions in Astrophysics: V. The Truncated Weibull Distribution. International Journal of Astronomy and Astrophysics, 11, 133-149.
https://doi.org/10.4236/ijaa.2021.111008
[27] Zaninetti, L. (2021) New Probability Distributions in Astrophysics: VI. The Truncated Sujatha Distribution. International Journal of Astronomy and Astrophysics, 11, 517-529.
https://doi.org/10.4236/ijaa.2021.114028
[28] Zaninetti, L. (2020) New Probability Distributions in Astrophysics: II. The Generalized and Double Truncated Lindley. International Journal of Astronomy and Astrophysics, 10, 39-55.
https://doi.org/10.4236/ijaa.2020.101004
[29] Zaninetti, L. (2019) New Probability Distributions in Astrophysics: I. The Truncated Generalized Gamma. International Journal of Astronomy and Astrophysics, 9, 393-410.
https://doi.org/10.4236/ijaa.2019.94027
[30] Zaninetti, L. (2017) A Left and Right Truncated Lognormal Distribution for the Stars. Advances in Astrophysics, 2, 197-213.
https://doi.org/10.22606/adap.2017.23005
[31] Zaninetti, L. (2013) A Right and Left Truncated Gamma Distribution with Application to the Stars. Advanced Studies in Theoretical Physics, 23, 1139-1147.
https://doi.org/10.12988/astp.2013.310125
[32] Zaninetti, L. (2013) The Initial Mass Function Modeled by a Left Truncated Beta Distribution. The Astrophysical Journal, 765, Article No. 128.
https://doi.org/10.1088/0004-637X/765/2/128

Copyright © 2024 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.