New Probability Distributions in Astrophysics: XII. Truncation for the Gompertz Distribution
1. Introduction
The Gompertz distribution [1] , taking values in the interval from zero to infinity, is a continuous probability distribution with an exponentially increasing failure rate. The death rate of adult humans increases exponentially with age, so the Gompertz distribution is widely used in actuarial science. We now outline some applications in physics. The oscillatory behavior of the counting statistics observed in high-energy experimental data is explained by the shifted Gompertz distribution [2] . A two-component model, in which the probability distribution function is the superposition of two shifted Gompertz distributions, explains the multiplicity distributions of charged particles produced in $e^+e^-$ collisions at the LEP, pp interactions at the SPS and pp collisions at the LHC at different center-of-mass energies, in full phase space as well as in restricted phase space [3] [4] . The shifted Gompertz distribution was also used to explain the modified multiplicity distributions in four different types of neutrino-induced interaction [5] . At the time of writing, the Gompertz distribution has not, to our knowledge, been applied in astrophysics, and therefore some questions arise.
- Can the Gompertz distribution model the initial mass function for stars?
- Can the Gompertz distribution model the luminosity function for galaxies?
- Is using the truncated Gompertz distribution better than using the untruncated one?
In order to answer the above questions, Section 2 treats the untruncated Gompertz distribution and Section 3 introduces its truncation. Section 4 applies the obtained results to the initial mass function for stars. Section 5 derives the untruncated and truncated Gompertz luminosity function for galaxies, parameterizes the photometric maximum for the number of galaxies as a function of the redshift and models the mean absolute magnitude of galaxies as a function of the redshift.
2. The Gompertz Distribution
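The Gompertz probability density function (PDF) is
$$f(x;a,b) = a\,e^{bx}\exp\left[-\frac{a}{b}\left(e^{bx}-1\right)\right] \tag{1}$$
for $x \geq 0$, $a > 0$ and $b > 0$, which at $x = 0$ takes the value a, see [6] [7] . The distribution function (DF) of the Gompertz distribution is
$$F(x;a,b) = 1 - \exp\left[-\frac{a}{b}\left(e^{bx}-1\right)\right] \tag{2}$$
and its average value or mean, μ, is
$$\mu = \frac{1}{b}\,e^{a/b}\,E_1\left(\frac{a}{b}\right) \tag{3}$$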
see formulae (A.1) for the definition of $E_1$ in the Appendix. The moment generating function $M(t)$ is given by
$$M(t) = e^{a/b}\left(\frac{b}{a}\right)^{t/b}\,\Gamma\left(1+\frac{t}{b},\frac{a}{b}\right) \tag{4}$$
where the incomplete Gamma function is defined by Equation (A.3). The moment generating function allows deriving the second moment about the origin
(5)
where $\gamma$ is the Euler-Mascheroni constant, see the definition in Equation (A.10); Equations (A.5) and (A.7) define the generalized hypergeometric function in the particular case used here. The third moment about the origin, $\mu'_3$, is
(6)
where
has a power law expansion as given by Equation (A.8) and the Riemann zeta function,
, is defined by Equation (A.11).
The fourth moment about the origin
is
(7)
where
is defined in Equation (A.9). The variance can be evaluated with the usual formula
$$\sigma^2 = \mu'_2 - \mu^2 \tag{8}$$
and is
(9)
The same result for the variance can be found in the formula after (15) in [6] or the formula after (16) in [7] . The skewness and the kurtosis have complicated expressions and we omit them. The random generation of the Gompertz variate X is given by
$$X = \frac{1}{b}\ln\left(1 - \frac{b}{a}\ln(1-R)\right) \tag{10}$$
where R is the unit rectangular variate. The median, $q_{1/2}$, is at
$$q_{1/2} = \frac{1}{b}\ln\left(1 + \frac{b}{a}\ln 2\right) \tag{11}$$
The mode is at
$$x = \frac{1}{b}\ln\left(\frac{b}{a}\right) \tag{12}$$
and is defined to be positive for $b > a$. The first method to find the two parameters a and b is the maximum likelihood estimator (MLE), which numerically solves the following two equations
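$$\frac{n}{a} - \frac{1}{b}\sum_{i=1}^{n}\left(e^{b x_i}-1\right) = 0 \tag{13}$$
$$\sum_{i=1}^{n} x_i + \frac{a}{b^2}\sum_{i=1}^{n}\left(e^{b x_i}-1\right) - \frac{a}{b}\sum_{i=1}^{n} x_i\,e^{b x_i} = 0 \tag{14}$$
where the $x_i$ are the elements of the experimental sample, with i varying between 1 and n.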
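As a hedged illustration of the MLE step, the Python sketch below assumes the standard Gompertz parametrization $f(x) = a\,e^{bx}\exp[-(a/b)(e^{bx}-1)]$, the form whose value at $x = 0$ is a. Setting the derivative of the log-likelihood with respect to a to zero gives a in closed form for fixed b, so only a one-dimensional search over b remains. The function names, the search interval and the golden-section search are illustrative assumptions, not the procedure used to produce the tables of this paper.

```python
import math


def gompertz_loglik(xs, a, b):
    """Log-likelihood for f(x) = a*exp(b*x)*exp(-(a/b)*(exp(b*x) - 1))."""
    n = len(xs)
    return (n * math.log(a) + b * sum(xs)
            - (a / b) * sum(math.expm1(b * x) for x in xs))


def a_given_b(xs, b):
    """Root of d(loglik)/da = 0 for a fixed b: a = n*b / sum(exp(b*x_i) - 1)."""
    return len(xs) * b / sum(math.expm1(b * x) for x in xs)


def gompertz_mle(xs, b_lo=1e-6, b_hi=50.0, iters=200):
    """Maximize the profile log-likelihood in b by golden-section search.

    b_lo, b_hi and iters are hypothetical defaults; b_hi must be small enough
    that exp(b_hi * max(xs)) does not overflow.
    """
    ratio = (math.sqrt(5.0) - 1.0) / 2.0

    def profile(b):
        return gompertz_loglik(xs, a_given_b(xs, b), b)

    lo, hi = b_lo, b_hi
    for _ in range(iters):
        c = hi - ratio * (hi - lo)
        d = lo + ratio * (hi - lo)
        if profile(c) > profile(d):
            hi = d  # the maximum lies in [lo, d]
        else:
            lo = c  # the maximum lies in [c, hi]
    b_hat = 0.5 * (lo + hi)
    return a_given_b(xs, b_hat), b_hat
```

Under this parametrization the profile log-likelihood is concave in b, which is why a bracketing search is sufficient here; a Newton solver applied to the two score equations would be an equally valid choice.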
The second method to determine the parameters is to introduce the moments of the experimental sample
(15)
As a consequence, the two parameters can be found by solving the following two non-linear equations, the method of moments (MOM)
(16)
(17)
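The quantities of this section can be checked numerically with a short Python sketch. It assumes the standard Gompertz parametrization in which the PDF equals a at $x = 0$, and it generates variates by inverting the DF with a unit rectangular variate R. All function names are hypothetical and chosen only for illustration.

```python
import math
import random


def gompertz_pdf(x, a, b):
    """PDF: a*exp(b*x)*exp(-(a/b)*(exp(b*x) - 1)) for x >= 0."""
    return a * math.exp(b * x) * math.exp(-(a / b) * math.expm1(b * x))


def gompertz_cdf(x, a, b):
    """DF: 1 - exp(-(a/b)*(exp(b*x) - 1))."""
    return -math.expm1(-(a / b) * math.expm1(b * x))


def gompertz_quantile(p, a, b):
    """Inverse of the DF; p = 0.5 gives the median."""
    return math.log1p(-(b / a) * math.log1p(-p)) / b


def gompertz_sample(n, a, b, rng=random):
    """Draw n variates by the inverse-transform method."""
    return [gompertz_quantile(rng.random(), a, b) for _ in range(n)]


def gompertz_mode(a, b):
    """Position of the PDF maximum: ln(b/a)/b when b > a, otherwise 0."""
    return math.log(b / a) / b if b > a else 0.0
```

For example, `gompertz_sample(1000, 0.3, 1.2)` returns a list of 1000 non-negative variates whose empirical median should lie close to `gompertz_quantile(0.5, 0.3, 1.2)`.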
3. Truncated Gompertz Distribution
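The DF of the truncated Gompertz distribution, $F_T(x;a,b,x_l,x_u)$, is
$$F_T(x;a,b,x_l,x_u) = \frac{e^{-\frac{a}{b}e^{b x_l}} - e^{-\frac{a}{b}e^{b x}}}{e^{-\frac{a}{b}e^{b x_l}} - e^{-\frac{a}{b}e^{b x_u}}} \tag{18}$$
and its PDF, $f_T(x;a,b,x_l,x_u)$, is
$$f_T(x;a,b,x_l,x_u) = \frac{a\,e^{bx}\,e^{-\frac{a}{b}e^{b x}}}{e^{-\frac{a}{b}e^{b x_l}} - e^{-\frac{a}{b}e^{b x_u}}} \tag{19}$$
Its variate X will be a random variable taking values in $[x_l, x_u]$. Its average value or mean, $\mu_T$, is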
(20)
The second moment about the origin
is
(21)
(22)
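The variance can be evaluated by
$$\sigma_T^2 = \mu'_{2,T} - \mu_T^2 \tag{23}$$
where $\mu'_{2,T}$ is the second moment about the origin. The random generation of the variate X of the truncated Gompertz is
$$X = \frac{1}{b}\ln\left[-\frac{b}{a}\ln\left(e^{-\frac{a}{b}e^{b x_l}} - R\left(e^{-\frac{a}{b}e^{b x_l}} - e^{-\frac{a}{b}e^{b x_u}}\right)\right)\right] \tag{24}$$
where R is the unit rectangular variate. The median is
$$q_{1/2} = \frac{1}{b}\ln\left[-\frac{b}{a}\ln\left(\frac{e^{-\frac{a}{b}e^{b x_l}} + e^{-\frac{a}{b}e^{b x_u}}}{2}\right)\right] \tag{25}$$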
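The four parameters $x_l$, $x_u$, a and b can be obtained in the following way. Consider a sample $\mathcal{X} = x_1, x_2, \ldots, x_n$ and let $x_{(1)} \geq x_{(2)} \geq \cdots \geq x_{(n)}$ denote their order statistics, so that $x_{(1)} = \max(x_1, x_2, \ldots, x_n)$, $x_{(n)} = \min(x_1, x_2, \ldots, x_n)$. The two parameters $x_l$ and $x_u$ are
$$x_l = x_{(n)}, \qquad x_u = x_{(1)} \tag{26}$$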
The two remaining parameters a and b are found by solving the two following equations which arise from the MLE
(27)
and
(28)
where
are the elements of the experimental sample with i varying between 1 and n. A second method to determine a and b is the MOM method, see Equations (16) and (17).
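A minimal Python sketch of the truncated case follows. It assumes that the truncated DF is the untruncated one rescaled between the lower and upper boundaries, $[F(x) - F(x_l)]/[F(x_u) - F(x_l)]$, with the same standard Gompertz parametrization as before. The names `xl` and `xu` stand for the lower and upper boundaries, and all function names are hypothetical.

```python
import math
import random


def _g(x, a, b):
    """Helper: exp(-(a/b)*exp(b*x)), the building block of the truncated DF."""
    return math.exp(-(a / b) * math.exp(b * x))


def trunc_gompertz_pdf(x, a, b, xl, xu):
    """PDF of the doubly truncated Gompertz; zero outside [xl, xu]."""
    if x < xl or x > xu:
        return 0.0
    return a * math.exp(b * x) * _g(x, a, b) / (_g(xl, a, b) - _g(xu, a, b))


def trunc_gompertz_cdf(x, a, b, xl, xu):
    """DF of the doubly truncated Gompertz."""
    if x <= xl:
        return 0.0
    if x >= xu:
        return 1.0
    gl, gu = _g(xl, a, b), _g(xu, a, b)
    return (gl - _g(x, a, b)) / (gl - gu)


def trunc_gompertz_quantile(p, a, b, xl, xu):
    """Inverse of the truncated DF; p = 0.5 gives the median."""
    gl, gu = _g(xl, a, b), _g(xu, a, b)
    x = math.log(-(b / a) * math.log(gl - p * (gl - gu))) / b
    return min(max(x, xl), xu)  # guard against rounding just outside the bounds


def trunc_gompertz_sample(n, a, b, xl, xu, rng=random):
    """Draw n variates by the inverse-transform method."""
    return [trunc_gompertz_quantile(rng.random(), a, b, xl, xu) for _ in range(n)]


def bounds_from_sample(xs):
    """Boundary estimates: the minimum and the maximum of the sample."""
    return min(xs), max(xs)
```

With the boundaries fixed by `bounds_from_sample`, only a and b remain to be estimated, which matches the two-equation structure described above.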
4. Application to the Stars
4.1. Statistics
The merit function $\chi^2$ is computed according to the formula
$$\chi^2 = \sum_{i=1}^{n}\frac{(T_i - O_i)^2}{T_i} \tag{29}$$
where n is the number of bins, $T_i$ is the theoretical value, and $O_i$ is the experimental value represented by the frequencies. The theoretical frequency distribution is given by
$$T_i = N\,\Delta x_i\,p(x) \tag{30}$$
where N is the number of elements of the sample, $\Delta x_i$ is the magnitude of the size interval, and $p(x)$ is the PDF under examination. A reduced merit function $\chi^2_{red}$ is given by
$$\chi^2_{red} = \chi^2/NF \tag{31}$$
where $NF = n - k$ is the number of degrees of freedom, n is the number of bins, and k is the number of parameters. The goodness of the fit can be expressed by the probability Q, see equation 15.2.12 in [8] , which involves the number of degrees of freedom and $\chi^2$. According to [8] p. 658, the fit “may be acceptable” if $Q > 0.001$. The Akaike information criterion (AIC), see [9] , is defined by
$$AIC = 2k - 2\ln(L) \tag{32}$$
where L is the likelihood function and k the number of free parameters in the model. We assume a Gaussian distribution for the errors. The likelihood function can then be derived from the $\chi^2$ statistic, $L \propto \exp(-\chi^2/2)$, where $\chi^2$ has been computed by Equation (29), see [10] [11] . Now the AIC becomes
$$AIC = 2k + \chi^2 \tag{33}$$
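The binned statistics described above can be sketched in a few lines of Python. The sketch assumes the Pearson form of the merit function (squared residuals divided by the theoretical frequencies), evaluates the PDF at the bin midpoint, and uses the Gaussian-error form in which the AIC is the sum of 2k and the merit function. The midpoint rule and all function names are assumptions made for illustration.

```python
def theoretical_frequencies(pdf, bin_edges, n_sample):
    """T_i = N * dx_i * p(x_i), with the PDF evaluated at each bin midpoint."""
    return [n_sample * (hi - lo) * pdf(0.5 * (lo + hi))
            for lo, hi in zip(bin_edges[:-1], bin_edges[1:])]


def chi_square(observed, theoretical):
    """Merit function: sum over the bins of (T_i - O_i)^2 / T_i."""
    return sum((t - o) ** 2 / t for o, t in zip(observed, theoretical))


def reduced_chi_square(chi2, n_bins, n_params):
    """Merit function divided by the degrees of freedom, n_bins - n_params."""
    return chi2 / (n_bins - n_params)


def aic_from_chi_square(chi2, n_params):
    """AIC under Gaussian errors: 2k + chi^2."""
    return 2 * n_params + chi2
```

As a usage example, a uniform PDF on [0, 1] with 100 objects in two equal bins gives theoretical frequencies of 50 each; observed frequencies of 40 and 60 then give a merit function of 4.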
The Kolmogorov-Smirnov test (K-S), see [12] [13] [14] , does not require the data to be binned. The K-S test, as implemented by the FORTRAN subroutine KSONE in [8] , finds the maximum distance, D, between the theoretical and the astronomical DFs, as well as the significance level $P_{KS}$; see formulas 14.3.5 and 14.3.9 in [8] . If $P_{KS} \geq 0.1$, then the goodness of the fit is believable.
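The unbinned test can be sketched in Python as follows. This is an independent illustration, not the KSONE routine itself: it computes D as the largest gap between the empirical and the theoretical DF, and approximates the significance level with the usual asymptotic alternating series evaluated at $(\sqrt{N} + 0.12 + 0.11/\sqrt{N})\,D$. The cutoff below which the significance is set to 1, the number of terms and the function names are assumptions.

```python
import math


def ks_distance(xs, cdf):
    """Maximum distance D between the empirical DF of xs and a theoretical DF."""
    ordered = sorted(xs)
    n = len(ordered)
    d = 0.0
    for i, x in enumerate(ordered):
        f = cdf(x)
        # The empirical DF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, abs(f - i / n), abs((i + 1) / n - f))
    return d


def ks_significance(d, n, terms=100):
    """Asymptotic significance level of an observed distance d for n data."""
    root = math.sqrt(n)
    lam = (root + 0.12 + 0.11 / root) * d
    if lam < 0.2:
        # The series converges slowly here and the level is 1 to high accuracy.
        return 1.0
    total = 0.0
    for j in range(1, terms + 1):
        total += 2.0 * (-1.0) ** (j - 1) * math.exp(-2.0 * j * j * lam * lam)
    return min(max(total, 0.0), 1.0)
```

A small D for a given sample size gives a significance level close to 1, while a large D drives it towards 0.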
4.2. The IMF for Stars
The first test is performed on NGC 2362, where the 271 stars have a range of
, see [15] and CDS catalog J/MNRAS/384/675/table1. According to [16] , the distance of NGC 2362 is 1480 pc. The second test is performed on the low-mass IMF in the young cluster NGC 6611, see [17] and CDS catalog J/MNRAS/392/1034. This massive cluster has an age of 2 - 3 Myr and contains masses from
. Therefore, the brown dwarf (BD) region,
, is covered. The third test is performed on the γ Velorum cluster where the 237 stars have a range of
, see [18] and CDS catalog J/A+A/589/A70/table5. The fourth test is performed on the young cluster Berkeley 59, where the 420 stars have a range of
, see [19] and CDS catalog J/AJ/155/44/table3. The results are presented in Table 1 for the Gompertz distribution, and in Table 2 for the truncated Gompertz distribution. In Table 1 and Table 2 the last column shows whether the results of the K-S test are better (Y) or worse (N) than those for the lognormal distribution.
As an example, the empirical DF visualized through histograms and the theoretical Gompertz DF for NGC 2362 is presented in Figure 1.
Another example is given by the PDF of the truncated Gompertz distribution, see Figure 2.
Table 1. Numerical values of $\chi^2_{red}$, AIC, probability Q, D, the maximum distance between theoretical and observed DFs, and $P_{KS}$, the significance level, in the K-S test of the Gompertz distribution, see Equation (1), for different astrophysical environments. The last column (F) indicates a $P_{KS}$ higher (Y) or lower (N) than that for the lognormal distribution. The number of linear bins, n, is 10.
Figure 1. Empirical DF of the mass distribution for NGC 2362 (blue histogram) with a superposition of the Gompertz DF (red dashed line). Theoretical parameters as in Table 1.
Table 2. Numerical values of $\chi^2_{red}$, AIC, probability Q, D, the maximum distance between theoretical and observed DFs, and $P_{KS}$, the significance level, in the K-S test of the truncated Gompertz distribution, see Equation (19), for different astrophysical environments. The last column (F) indicates a $P_{KS}$ higher (Y) or lower (N) than that for the lognormal distribution. The number of linear bins, n, is 10.
Figure 2. Empirical PDF of the mass distribution for NGC 6611 (blue histogram) with a superposition of the truncated Gompertz PDF (red dashed line). Theoretical parameters as in Table 2.
5. Luminosity Function for Galaxies
5.1. Processed Catalogs
The tests of the Gompertz luminosity function (LF) have been made on the
band of SDSS as in [20] with data available at https://cosmo.nyu.edu/blanton/lf.html. The tests on the photometric maximum and average magnitude as functions of the redshift have been made on the catalog GLADE+ that contains ≈ 22.5 million galaxies [21] .
5.2. Schechter Luminosity
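The Schechter function, introduced by [22] , provides a useful reference for the LF of galaxies
$$\Phi(L)\,dL = \left(\frac{\Phi^*}{L^*}\right)\left(\frac{L}{L^*}\right)^{\alpha}\exp\left(-\frac{L}{L^*}\right)dL \tag{34}$$
here $\alpha$ sets the slope for low values of L, $L^*$ is the characteristic luminosity and $\Phi^*$ is the normalization. The equivalent distribution in absolute magnitude is
$$\Phi(M)\,dM = 0.921\,\Phi^*\,10^{0.4(\alpha+1)(M^*-M)}\exp\left(-10^{0.4(M^*-M)}\right)dM \tag{35}$$
where $M^*$ is the characteristic magnitude as derived from the data. We now introduce the parameter h, which is $H_0/100$, where $H_0$ is the Hubble constant. The scaling with h is $M^* - 5\log_{10}h$ and $\Phi^*\,h^3~[\mathrm{Mpc}^{-3}]$.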
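The link between the luminosity and the magnitude forms of the Schechter function can be verified numerically. The Python sketch below assumes the usual forms, in which the magnitude version follows from the change of variable $L/L^* = 10^{0.4(M^*-M)}$ and the factor 0.921 is $0.4\ln 10$. The function names are hypothetical.

```python
import math


def schechter_luminosity(lum, phi_star, l_star, alpha):
    """Schechter LF per unit luminosity."""
    x = lum / l_star
    return (phi_star / l_star) * x ** alpha * math.exp(-x)


def schechter_magnitude(mag, phi_star, m_star, alpha):
    """Schechter LF per unit absolute magnitude; 0.4*ln(10) is about 0.921."""
    x = 10.0 ** (0.4 * (m_star - mag))
    return 0.4 * math.log(10.0) * phi_star * x ** (alpha + 1.0) * math.exp(-x)


def luminosity_ratio(mag, mag_ref):
    """L/L_ref for absolute magnitudes mag and mag_ref."""
    return 10.0 ** (0.4 * (mag_ref - mag))
```

Multiplying `schechter_luminosity` by $|dL/dM| = 0.4\ln(10)\,L$ reproduces `schechter_magnitude`, which is the consistency check that any other LF converted to magnitudes should also satisfy.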
5.3. Gompertz Luminosity Function
In order to derive the Gompertz LF, we start from the PDF as given by Equation (1) and we substitute b with
and x with L
(36)
where L is the luminosity defined for
,
is the characteristic luminosity and
is a normalization, i.e. the number of galaxies in a cubic Mpc. The mean luminosity,
, is
(37)
see formulae (A.1) for the definition of
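in the Appendix. We now introduce the following useful formulae relating the absolute magnitude and luminosity
$$\frac{L}{L_\odot} = 10^{0.4(M_\odot - M)}, \qquad \frac{L^*}{L_\odot} = 10^{0.4(M_\odot - M^*)} \tag{38}$$
where $L_\odot$ and $M_\odot$ are the luminosity and absolute magnitude of the sun in the considered band. The LF in absolute magnitude is therefore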
(39)
The Schechter function, the Gompertz LF represented by formula (39) and the data are presented in Figure 3, parameters as in Table 3.
5.4. Truncated Gompertz Luminosity Function
The truncated Gompertz LF for galaxies according to Equation (19) is
(40)
where the random variable L is defined for
,
is the lower boundary in luminosity,
is the upper boundary in luminosity,
is the characteristic luminosity and
is the normalization. The mean luminosity,
, is
Figure 3. The LF data of SDSS(
) are represented with error bars. The continuous line fit represents the Gompertz LF (39) and the dotted line represents the Schechter function.
Table 3. Numerical values and
of the LFs applied to SDSS Galaxies in the
band.
(41)
The magnitude version is
(42)
where M is the absolute magnitude,
is the characteristic magnitude,
is the lower boundary of the magnitudes and
is the upper boundary of the magnitudes. The two luminosities
and
are connected with the absolute magnitudes
and
through the following relation:
(43)
where the indices u and l are inverted in the transformation from luminosity to absolute magnitude. The mean theoretical absolute magnitude,
, can be evaluated as
(44)
The Schechter function, the truncated Gompertz LF represented by formula (42) and the data are presented in Figure 4 with parameters as in Table 4.
5.5. The Photometric Maximum
In the pseudo-Euclidean universe, we introduce
(45)
which allows defining the joint distribution in z (redshift) and f (flux) for the Gompertz LF as
(46)
Figure 4. The LF data of SDSS (
) are represented with error bars. The continuous line fit represents the truncated Gompertz LF (42) and the dotted line represents the Schechter function.
Table 4. Numerical values and
of the truncated Gompertz and Schechter LFs applied to SDSS Galaxies in the
band.
where $d\Omega$, $dz$ and $df$ represent the differentials of the solid angle, the redshift, and the flux, respectively, $L^*$ is the characteristic luminosity, $c$ is the speed of light, and $H_0$ is the Hubble constant; see [23] for more details. The solution of the following non-linear equation determines the position in z of the maximum
(47)
The above equation does not have an analytical solution and therefore the position in z of the maximum should be evaluated numerically. A practical formula for the number of galaxies is
(48)
and the equivalent for the Schechter LF is
(49)
A numerical result is presented in Figure 5, where we display the number of observed galaxies for the GLADE+ catalog at a given apparent magnitude, together with the Schechter and Gompertz models for the number of galaxies as functions of the redshift.
All the galaxies with
are obtained by integrating Equation (48) with respect to f:
(50)
with the equivalent formula for the Schechter LF
(51)
where the incomplete Gamma function is defined by Equation (A.3). All the galaxies of GLADE+ are shown in Figure 6 together with the two theoretical models.
Figure 5. The galaxies of GLADE+ with
or
are organized in frequencies versus heliocentric redshift, (empty circles); the error bar is given by the square root of the frequency. The maximum frequency of observed galaxies is at
. The full line is the theoretical curve generated by
as given by the application of the Schechter LF which is Equation (49) and the dashed line represents the Gompertz LF which is Equation (48). The parameters for the Gompertz LF are
,
,
for the Schechter LF and
for the Gompertz LF.
Figure 6. All the galaxies of GLADE+ are organized in frequencies versus heliocentric redshift, (empty circles); the error bar is given by the square root of the frequency. The maximum frequency of all observed galaxies is at
. The full line is the theoretical curve generated by
as given by the application of the Schechter LF, see Equation (51), and the dashed line represents the Gompertz LF which is Equation (50). The parameters of the Gompertz LF are the same as in Figure 5.
5.6. Mean Absolute Magnitude
We review the most important equations that allow modeling the mean absolute magnitude as a function of the redshift. The absolute magnitude is
(52)
where
for the GLADE+ catalog.
The theoretical average absolute magnitude of the truncated Gompertz LF, see Equation (44), can be compared with the observed average absolute magnitude of the GLADE+ catalog as a function of the redshift. To fit the data, we assumed the following empirical dependence on the redshift for the characteristic magnitude of the truncated Gompertz LF
(53)
where
and
are the minimum and the maximum value of the redshift in the considered catalog: in the case of the GLADE+ catalog
and
. The lower and upper bounds in absolute magnitude are given by the minimum and maximum magnitudes of the selected redshift bin, and the characteristic magnitude varies according to Equation (53). Figure 7 shows a comparison between the mean theoretical absolute magnitude and the observed mean absolute magnitude for the GLADE+ catalog.
Figure 7. Observed minimum absolute magnitude (red empty stars), observed average absolute magnitude (blue empty crosses), theoretical average absolute magnitude for the truncated Gompertz LF as given by Equation (44) (magenta full squares), lower theoretical curve as represented by Equation (52) (cyan empty stars of David) and observed maximum absolute magnitude (green full triangles).
6. Conclusions
The truncated distribution
We derived the PDF, the DF, the average value, the second moment, the median, the random number generator and the MLE for the Gompertz distribution truncated on the left and the right.
Application to the IMF
The application of the Gompertz distribution to the IMF for stars gives better results than the lognormal distribution for two out of four samples, see Table 1. The truncated Gompertz distribution gives better results than the untruncated one for one of the four samples, see Table 1 and Table 2.
The results for the mass distribution of the γ Velorum cluster compared with other distributions are shown in Table 5, in which the truncated Gompertz distribution occupies the last position.
Gompertz luminosity function
The Gompertz LF in the absolute magnitude version is derived using the standard and the truncated DFs, see formulae (39) and (42). The application to SDSS galaxies gives a larger reduced merit function than that obtained with the Schechter LF, see Table 3 and Table 4.
Cosmological applications
The maximum in the number of galaxies for a given solid angle as a function of the redshift, which is visible in the catalog GLADE+, can be modeled with the Gompertz LF both in the case of a selected flux or apparent magnitude, see Figure 5, and in the case of all galaxies, see Figure 6.
Table 5. Numerical values of D, the maximum distance between theoretical and observed DF, and $P_{KS}$, the significance level, in the K-S test for different distributions in the case of the γ Velorum cluster.
The average absolute magnitude of the GLADE+ galaxies as a function of the redshift can be theoretically modeled with the truncated Gompertz LF, see Figure 7.
Appendix A. Useful Power Series
The exponential integral [32] is defined for
.
(A.1)
and in the case
has the following power series
(A.2)
The incomplete Gamma function is defined by
$$\Gamma(a,z) = \int_z^{\infty} t^{a-1}e^{-t}\,dt \tag{A.3}$$
and has the following power series
(A.4)
see [32] for more details.
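The generalized hypergeometric function is defined by
$${}_pF_q(a_1,\ldots,a_p;b_1,\ldots,b_q;z) = \sum_{k=0}^{\infty}\frac{\prod_{i=1}^{p}(a_i)_k}{\prod_{j=1}^{q}(b_j)_k}\,\frac{z^k}{k!} \tag{A.5}$$
where the Pochhammer symbol $(z)_a$ is
$$(z)_a = \frac{\Gamma(z+a)}{\Gamma(z)} \tag{A.6}$$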
We now present the series expansion of some particular cases of the generalized hypergeometric function
(A.7)
(A.8)
(A.9)
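The Euler-Mascheroni constant is defined by
$$\gamma = \lim_{n\to\infty}\left(\sum_{k=1}^{n}\frac{1}{k} - \ln n\right) \tag{A.10}$$
The Riemann zeta function is
$$\zeta(s) = \sum_{n=1}^{\infty}\frac{1}{n^{s}}, \qquad \Re(s) > 1 \tag{A.11}$$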