Asymptotics and Well-Posedness of the Derived Distribution Density in a Study of Biovariability

In our recent work (Wang, Burgei, and Zhou, 2018) we studied the hearing 
loss injury among subjects in a crowd with a wide spectrum of heterogeneous 
individual injury susceptibility due to biovariability. The injury risk of a 
crowd is defined as the average fraction of injured. We examined mathematically 
the injury risk of a crowd vs the number of acoustic impulses the crowd 
is exposed to, under the assumption that all impulses act independently in 
causing injury regardless of whether one is preceded by another. We concluded 
that the observed dose-response relation can be explained solely on 
the basis of biovariability in the form of heterogeneous susceptibility. We derived 
an analytical solution for the distribution density of injury susceptibility, 
as a power series expansion in terms of scaled log individual non-injury 
probability. While theoretically the power series converges for all argument 
values, in practical computations with IEEE double precision, at large argument 
values, the numerical accuracy of the power series summation is completely 
wiped out by the accumulation of round-off errors. In this study, we 
derive a general asymptotic approximation at large argument values, for the 
distribution density. The combination of the power series and the asymptotics 
provides a practical numerical tool for computing the distribution density. 
We then use this tool to verify numerically that the distribution obtained 
in our previous theoretical study is indeed a proper density. In addition, we 
will also develop a very efficient and accurate Pade approximation for the 
distribution density.


Introduction
Sound is an indispensable part of our life and we experience sound every day. A common way to measure the amount of sound is the decibel (dB) [1]. Sounds of less than 75 dB are at safe levels that do not damage our hearing. However, any sound above 85 dB is potentially harmful and can cause hearing loss. Examples of harmless sounds are normal conversation (60 dB), the humming of a refrigerator (45 dB) whereas harmful sounds include noise from lawn mowers (90 dB) and gun shots or firecrackers (both 150 dB) [2]. The risk of hearing damage depends on the power of the sound as well as the length of exposure.
Hearing loss is a common health problem among veterans. In order to protect warfighters, starting 1960s the US Army conducted and funded research to assess the risk of hearing loss caused by intense impulse noise from explosive blasts and weapon firings [3]. Recently Dr. Chan and his collaborators [4] developed a dose-response model for the assessment of injury caused by impulse noise and a model for the possible recovery afterwards, based on chinchilla data.
Chinchillas share similar hearing capabilities as humans and thereby are commonly used for hearing-related experiments.
In [5], we interpreted the empirical dose-response relation from [4] for exposure to multiple sound impulses in the framework of immunity. In [6], we viewed the empirical dose-response relation from a completely different angle, in the framework of biovariability. Together in these two studies [5] [6], we showed that it is possible to interpret the empirical dose-response relation from either of the two extreme cases: immunity or biovariability. Here we would like to further our study in [6] to demonstrate that the derived distribution density of injury susceptibility in [6] is well-posed.

Mathematical Formulation of the Problem
In experiments [4], the injury risk of a crowd caused by a sound exposure event is described by the logistic dose-response relation: cut-off levels whereas the median injury dose D 50 rises with the PTS cut-off level [4].
In the framework of the logistic dose-response relation, the injury risk of the crowd caused by a sequence of N acoustic impulses is given by the expression where parameters a and η are defined as Parameter a is related to probability ( That is, parameter a is the injury odds of a hypothetical subject in the crowd with the average injury probability, , in responding to a single acoustic impulse. In [6] under the assumption that N acoustic impulses act independently from each other in causing injury, regardless of whether one is preceded by another (i.e., no immunity effect), we explored the possibility of interpreting the observed logistic dose-response relation for a crowd in the framework of biovariability. For mathematical convenience, we consider non-injury probability instead of injury probability. Let ( ) q ω denote the individual non-injury probability of a random subject in the crowd, in responding to one acoustic impulse.
Here ( ) q ω is a random variable, due to the presence of biovariability. Let ( ) q ρ be the distribution density of random variable ( ) q ω . Mathematically in the framework of biovariability, the average non-injury fraction for N acoustic impulses is expressed in terms of ( ) On the other hand, experimentally, the injury risk was observed to follow the logistic dose-response relation (4), which relates to the average non-injury frac-  (7) and (8), which we write out below: In [6] we solved Equation (9) analytically by constructing a power series For density ( ) In [6] we derived a power series solution for ( ) In [6] we rigorously proved these two properties for the special case of 1 2 η = and the special case of 1 3 η = . The analysis procedure differs quite significantly between these two special cases. It is highly unlikely that the particular analysis approach used in either of these two special cases can be directly extended to the general case of arbitrary η. In the current study, we aim at verifying semi-analytically the two properties for any given value of η. For that purpose, we need the numerical capability of calculating function ( ) g s for all values of s, from small to large. Power series (12) has the nice property that theoretically it converges for all values of s. In practical computations, however, at large values of s, the numerical accuracy of the power series summation is completely wiped out by the accumulation of round-off errors. In the power series summation, as s increases the net sum decreases while the magnitude of the largest term grows exponentially with s [6]. The combined effect of these two factors magnifies catastrophically, at large s, the influence of round-off errors on the numerical accuracy of the net sum. For practical computation of ( ) g s in finite precision arithmetic, we need a robust numerical formula of ( ) g s at large s. In the next section, we derive a general asymptotic approximation of ( ) g s at large s. The synthesis of the power series and the asymptotics will provide a practical numerical tool for computing the distribution density ( ) g s for all values of s at any given value of parameter η.

Asymptotics of g(s) at Large s
Now we derive a general asymptotic approximation of ( ) g s at large s when η is a rational number. We then reasonably conjecture that the same asymptotic approximation is also valid even when η is irrational. In practical computation of ( ) g s , the case of irrational η actually does not apply since all numerical calculations are carried out in finite precision arithmetic, using only rational numbers.
A rational number η takes the form where both m and n are positive integers. We rewrite the power series of ( ) g s in terms of the reciprocal gamma function as follows: where ( ) f z is the reciprocal gamma function defined as The advantage of working with the reciprocal gamma function ( ) f z is that it is well defined and is analytic everywhere. In comparison, the gamma function diverges at all non-positive integer values of z. The reciprocal gamma Using this convenient property when differentiating ( ) Differentiating ( ) g s repeatedly m times, we obtain a differential equation We derive an asymptotic approximation of ( ) g s at large s based on this differential equation. We proceed with the assumption that ( ) g s converges to 0 as s goes to +∞ . This assumption is reasonable although not directly derivable from the power series form of ( ) g s . Under this assumption, the asymptotics of ( ) g s at large s takes the form ( ) Differentiating the asymptotic form m times, we get In (21) the first term is of the order Equating the leading terms on both sides of (23) yields 1 1 m n β = + . Substituting this value of 1 β back into (23) gives us ( ) ( ) Notice that in (24), the smallest term in the summation is Thus, all terms in the summation are indeed asymptotically larger than In summary, expression (24) gives an asymptotic approximation of ( ) Since integer r can be made as large as we like, in (27) we simply use the infinite series as a symbolic general asymptotics, with the understanding that a particular asymptotic approximation will use only a partial sum of the infinite series. We need to point out that the infinite series in (27) actually diverges for all values of s. So the infinite summation does not have a well defined sum. Instead, the infinite series serves only as a symbolic general asymptotics. Summation of moderate number of terms, however, will provide an accurate asymptotic approximation of ( ) g s at moderately large s and beyond. We will examine the approximation errors in details later. In the analysis above, we did not assume that integers m and n are prime to each other. It turns out that asymptotics (27) can be written in terms of η only, without any reference to m or n. The expression in terms of η gives us the general asymptotics, which does not depend on a particular rational form of η: ( ) ( ) ( ) ( ) The asymptotic expansion (28) (28) is also valid for irrational η although it is derived in the case of rational η. This conjecture cannot be tested numerically since all computations use a finite precision number representation system, which is a subset of all rational numbers. In the next section, in preparation for the numerical verification of properties (13) and (14), we build the necessary numerical tools for computing function ( ) g s .

Accurate Evaluation of g(s) in Finite Precision
In this section, we develop a practical numerical method for computing ( ) g s in IEEE double precision, over the whole range of s and at any given value of parameter η. Function ( ) g s is defined straightforwardly by power series (12). Theoretically ( ) g s can be evaluated as accurately as we like by including sufficiently large number of terms in summation and carrying out the computation in arithmetic of sufficiently high numerical precision. Practically, with the IEEE double precision arithmetic, the numerical accuracy of the power series summation, at large s, is completely ruined by the round-off errors from terms of the largest magnitude. In [6] we showed that the largest term grows roughly exponentially with s and it has the behavior. The infinite series in the asymptotic expansion (28) diverges for all values of s.
As a result, it does not make sense to include in the asymptotic approximation a very large number of terms from (28). When a moderate number of terms are used, however, the partial sum of (28) provides an accurate approximation of ( ) g s at moderately large s and beyond. For a fixed number of terms, the larger the value of s is, the better the approximation. Therefore, at large s, function ( ) g s can be evaluated fairly accurately by employing an asymptotic approximation with a suitable number of terms. The contrast and complementary behaviors of the power series around 0 s = and the asymptotics at large s suggest that a viable numerical strategy is to use the power series summation for small s and switch to the asymptotic approximation when s is above a threshold s sw , which is yet to be specified. The success of this numerical strategy depends on that there is an overlapping region of intermediate s in which both the power series summation and the asymptotic ap- from the valid region of the asymptotic approximation, ( ) g s cannot be calculated accurately in the gap region. The existence of an overlapping region of intermediate s also provides us a numerical mechanism for identifying the overlapping region and selecting an optimal threshold s sw for switching from one numerical formula to the other.
In the overlapping region, both the power series summation and the asymptotic approximation are reasonably accurate. Accordingly, the difference between the two numerical formulas should be fairly small inside the overlapping region. Below or above the overlapping region, only one of the numerical formula is very accurate while the other is not. Consequently, outside the overlapping region, the difference between the two will be significantly more pronounced than inside the overlapping region. To identify the overlapping region, we examine the difference between the power series summation and the asymp-   on these numerical findings we adopt the numerical procedure below for where K f is the number of terms needed to make the truncation error well below the machine precision of IEEE double precision. In computations, we make the truncation error smaller than 10 −20 . Theoretically, K f has an a priori estimate expressed in terms of s when s is moderately large. In practice, K f is determined automatically in the numerical summation process by monitoring the magnitude of terms.

Pade Approximations
In the previous section, we verified ( ) 1 G +∞ = . We now impose this property as a constraint at s = +∞ and construct a Pade approximation [7] For k in the range of ( )  lapping region is much more pronounced than in the case of 5 n = . The pattern of increase strongly suggests that it is caused by the increased numerical error of ( ) G s near the overlapping region. Figure 4 indicates that the true numerical error in Pade approximation is very likely below 10 −12 throughout the full range of s, much more accurate than the unified numerical procedure (37) in IEEE double precision.
We carry out single precision computations to support the assertion we made above that in finite precision arithmetic, Pade approximations can be more accurate than the power series even though the power series is theoretically exact in infinite precision arithmetic. We use the double precision result of ( ) G s as the exact solution to compare with. We compute power series summation, asymptotics, and Pade approximations in single precision, and then examine the numerical errors in single precision results. Figure 5 shows the error behaviors of single precision results. As s increases, the power series summation starts losing accuracy due to the exponential growth of the largest term and the associated round-off error in summation. Meanwhile, as s increases, the approximation error in asymptotics decreases and its numerical accuracy improves. In contrast, the numerical errors in Pade approximations remain fairly steady with respect to s and decays very rapidly as n is increased. Pade approximation ( ) ;3 R s is already significantly more accurate than both the power series summation and the asymptotics in a large neighborhood of the overlapping region (for single precision arithmetic, the overlapping region is around 6 s = ). It is evident in Figure   5 that the numerical error of ( ) ;4 R s is primarily caused by round-off errors and its true approximation error is below the machine epsilon of single precision  accuracy of 10 −8 or better with IEEE double precision, over the full range of argument. Using this unified procedure, we verified numerically that for all parameter values, the derived distribution density, i) is non-negative everywhere and ii) integrates to one. These results establish numerically that the derived distribution is indeed a proper density for all values of parameter, and thus, is well-posed. Furthermore, we developed efficient and accurate Pade approximations for the distribution density and for the cumulative distribution function. In the computational environment of IEEE double precision, Pade approximations actually yield a much higher realized numerical accuracy than that of both the asymptotic approximation for large argument value and the power series for small argument value. The superior performance of Pade approximations is attributed to the fact that it attains high theoretical accuracy with only a few terms, which leads to less contamination with round-off errors and better realized numerical accuracy. In conclusion, we verified numerically that the observed logistic dose-response relation can be explained solely based on a valid distribution of injury susceptibility. Rigorous proof of the well-posedness of the derived distribution density, however, still remains open.