Dose-Injury Relation as a Model for Uncertainty Propagation from Input Dose to Target Dose

We study a general framework for assessing the injury probability corresponding to an input dose quantity. In many applications, the true value of the input dose may not be directly measurable. Instead, the input dose is estimated from measurable/controllable quantities via numerical simulations using assumed representative parameter values. We aim to develop a simple modeling framework that accommodates all uncertainties, including the discrepancy between the estimated input dose and the true input dose. We first interpret the widely used logistic dose-injury model as the result of dose propagation uncertainty from the input dose to the target dose at the active site for injury, where the binary outcome is completely determined by the target dose. We specify the symmetric logistic dose-injury function using two shape parameters: the median injury dose and the 10-90 percentile width. We relate these two shape parameters to the mean and standard deviation of the dose propagation uncertainty. We find that 1) a larger total uncertainty spreads out the dose-response function, increasing the 10-90 percentile width, and 2) a systematic over-estimate of the input dose shifts the injury probability toward the right along the estimated input dose. This framework provides a way of revising an injury model established for a particular test population to predict the injury model for a new population with different distributions of the parameters that affect dose propagation and dose estimation. In addition to modeling dose propagation uncertainty, we propose a new 3-parameter model to include the


Introduction
In many injury assessment situations, the injury status of a subject is characterized simply as a binary outcome. For example, in a study of skull fracture injury related to highway traffic safety [1] and in a study of rib fracture injury caused by blunt-impact non-lethal weapons [2], the subjects tested are classified as either fractured or not fractured. Mathematically, occurrences of binary injury outcomes are described statistically by the injury probability (also called injury risk). Let $(v_1, v_2, \ldots, v_k)$ be a list of input factors that affect the injury outcome, $I$ be the binary injury outcome (a random variable), and $p$ be the corresponding injury probability:
$$p = \Pr(I = \text{"injured"})$$
Here the binary injury outcome $I$ is a random variable even when all input factors $(v_1, v_2, \ldots, v_k)$ are given and fixed. One approach to building a simple and practical model for assessing the injury risk is to use a single metric $x$ to capture the overall effects of all input variables $(v_1, v_2, \ldots, v_k)$ [4]. Quantity $x$ is called the input dose; it serves as the best single-metric predictor of the injury probability. Input dose $x$ may be one of the input variables $(v_1, v_2, \ldots, v_k)$ or a combination of these input variables. Depending on the application, input dose $x$ is also called the determinant of injury, the risk factor, the exposure level, or the predictor variable [3] [4].
When the input dose $x$ is directly controllable and measurable, an experimental data set consists of $m$ entries, each containing a measured value of input dose and the corresponding binary injury outcome in an independent trial:
$$\{(x_j, I_j),\ j = 1, 2, \ldots, m\}$$
Injury models are constructed in the general form of injury probability vs input dose:
$$p(x) = \text{injury probability at input dose } x$$
In many application situations, however, the input dose is not directly measurable. For example, for bone fracture injuries, we may use the stress at the impact site as the input dose, but it is difficult to measure the stress at the impact site directly. In a study of behind-armor blunt trauma (BABT) [5] and a study of human body response to blunt impacts using the advanced total body model (ATBM) [6], an estimated value of the stress caused by the impact is calculated via computer simulations. The estimation is based on the measured mass and
The estimated input dose, in general, is different from the true input dose, and the discrepancy between the two is population dependent, since the actual material properties of individual subjects differ from the selected representative material properties and are themselves population dependent. In addition, the relation of injury probability vs true input dose is also population dependent because the material properties of subjects significantly affect the injury outcome even when the true input dose is fixed. For example, at a fixed impact force, the injury probability varies considerably among groups of different ages, body types, body sizes and body compositions. The experimentally established relation of injury probability vs estimated input dose is heavily influenced by the particular population tested. As a result, applying the injury model established for one population, without modification, to assess the injury risk of a different population will inevitably lead to large errors. In many applications, however, we face exactly this task: we are given an injury model established on a particular test population and we need to predict the injury risk of a different population. For example, a data set for human forearm fracture was assembled in [7] from drop test results conducted on PMHS forearms from cadaver donors of average age 55.
The purpose of assembling the data set, however, is to build an injury model for assessing the risk of forearm fracture among a population of live human subjects with an age distribution significantly different from that of cadaver donors. In this study, we develop a simple mathematical framework for this task. The key idea is based on interpreting the probabilistic injury model as the consequence of dose propagation uncertainty from input dose to target dose at the active site for injury where the binary outcome is uniquely determined by the target dose.
The framework of dose propagation uncertainty makes it mathematically convenient to accommodate different uncertainties associated with different populations. The formulation developed provides a mechanism of mapping injury function from one population to another by simply updating the model parameters.

Mathematical Formulation
We first review the logistic model for binary outcomes [8]. Note that the injury We write the linear function in (3) where $z^{(c)}$ is the critical threshold for the target dose in the transition from non-injury to injury. The transition is a discontinuous jump with respect to the target dose $Z$ at the active site. With respect to the input dose $x$, which is away from the active site, however, the injury probability generally varies smoothly and gradually with $x$.
• The target dose $Z$ is caused by the input dose $x$. While in most experiments the input dose $x$ can be controlled, at least to some extent, the target dose $Z$ is neither directly observable nor directly controllable.
• For a given input dose $x$, the corresponding target dose $Z$ is a random variable, reflecting the uncertainty in the propagation from input dose to target dose.
We use an example to illustrate the propagation from input dose to target dose.
Example: Passing an exam vs amount of study time
In this example, the input dose $x$ is the amount of study time. Note that although the target dose $Z$ is caused by the input dose $x$, quantities $Z$ and $x$ may have different physical dimensions. For passing an exam, the target dose $Z$ is the effective fraction of actual exam contents correctly completed in the exam by the student. We use a flow chart to show a possible propagation from input dose to target dose.
$x$ = the nominal amount of study time invested
→ $Z_1$ = effective amount of study time (affected by the student's attentiveness, efficiency, and overall load)
→ $Z_2$ = amount of course contents learned (affected by the student's prior preparation and ability to memorize key items)
→ $Z_3$ = fraction of actual exam contents learned (affected by the exam scope and the weighting of components in the exam)
→ $Z$ = effective fraction of actual exam contents correctly completed (affected by the student's general health condition on exam day, and ability to work under time pressure and in the presence of noise/disturbance) (8)
Mathematically, we write the target dose explicitly as $Z(x, \omega)$, emphasizing that $Z$ is a random variable depending on the input dose $x$ and on the random factor $\omega$ in the dose propagation. The probability that a given input dose $x$ leads to injury is the probability that the target dose exceeds the critical threshold $z^{(c)}$:
$$p(x) = \Pr\left(Z(x, \omega) > z^{(c)}\right)$$

Logistic Dose-Injury Relation Interpreted as Normally Distributed Target Dose
We model the target dose as proportional to the sum of the input dose and an additive Gaussian noise, with the noise expressed in terms of a standard normal random variable. We scale target dose $Z$ and the associated critical threshold $z^{(c)}$ to make the proportionality coefficient $r = 1$ by changing the physical unit for measuring z-values, or equivalently by changing the physical unit for measuring x-values.
In this section, we first examine the dose-response relation for normally distributed dose uncertainty, which is the probit model [11]. Then we discuss how to accommodate different uncertainties corresponding to different populations, including how to incorporate additional uncertainties into the dose-response relation.
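The threshold picture can be checked numerically. The sketch below assumes the scaled model has the explicit form Z = x + mu + sigma * eps, with eps standard normal and injury occurring when Z exceeds the critical threshold; this explicit form, the sign convention for the bias mu, and all numerical values are assumptions made for illustration. Under that assumption the injury probability is the normal CDF evaluated at (x + mu - z_c) / sigma, so the median injury dose sits at z_c - mu.

```python
import math
import random

def normal_cdf(s):
    """CDF of the standard normal, written with the error function."""
    return 0.5 * (1.0 + math.erf(s / math.sqrt(2.0)))

def injury_probability_mc(x, mu, sigma, z_c, n_trials=20000, rng=None):
    """Monte Carlo estimate of Pr(Z > z_c) for the assumed Z = x + mu + sigma*eps."""
    rng = rng or random.Random(0)
    hits = 0
    for _ in range(n_trials):
        z = x + mu + sigma * rng.gauss(0.0, 1.0)  # assumed propagation model, r = 1
        if z > z_c:  # the binary outcome is fixed once the target dose is known
            hits += 1
    return hits / n_trials

def injury_probability_exact(x, mu, sigma, z_c):
    """Closed form under the same assumption: Phi((x + mu - z_c) / sigma)."""
    return normal_cdf((x + mu - z_c) / sigma)

if __name__ == "__main__":
    mu, sigma, z_c = 0.5, 2.0, 10.0  # hypothetical values
    for x in (6.0, 9.5, 13.0):
        print(x, injury_probability_mc(x, mu, sigma, z_c),
              injury_probability_exact(x, mu, sigma, z_c))
```

Although each simulated outcome is a sharp yes/no decision in the target dose, the estimated probability varies smoothly with the input dose, which is the interpretation this section builds on.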

Dose-Response Relation
The binary injury outcome is governed by the sign of the random variable $Z - z^{(c)}$. The injury probability ($p$) corresponding to input dose $x$ is the probability that this random variable is positive. Recall that the cumulative distribution function (CDF) of the standard normal is given by the error function,
$$\Phi(s) = \frac{1}{2}\left[1 + \mathrm{erf}\left(\frac{s}{\sqrt{2}}\right)\right]$$
The dose-response relation (12) for a normally distributed target dose $Z$ is therefore a normal CDF in the input dose $x$. We approximate dose-response relation (12) using the logistic function form (4) with tunable parameters $D_{50}$ and $\alpha$. First, we match the two functions at the median injury dose and rescale $x$, where the scaled coefficient $\alpha'$ is related to $\alpha$ by $\alpha' = \sigma\alpha$. For conciseness, we denote the new $x$ simply as $x$. The task of approximating (12) with (4) is reduced to finding an optimal value of $\alpha'$ such that the distance between the two functions is minimized (Figure 1).
It is clear that the two functions are very good approximations of each other. The maximum difference is bounded by 0.01 (i.e., the difference in predicted injury probability is less than 1%). With that error tolerance, the logistic model and the normal distribution model can practically substitute for each other. In other words, the widely used logistic model can be viewed as a very good approximation of the normal distribution model, which was derived from normally distributed dose propagation uncertainty from input dose to target dose.
Models (13) and (14) are nevertheless mathematically different. When the data set of binary injury outcomes ($I$) is sufficiently large, the two models eventually become distinguishable. Let $m$ be the number of samples in the data set. We look into the question of how large $m$ needs to be in order to statistically distinguish the two models. We consider a collection of independent data sets, each consisting of $m$ pairs $(x_j, I_j)$ of input dose and binary outcome. Given a data set $D$, the log-likelihood for a general probability function $p(x)$ is
$$\ell(p \mid D) = \sum_{j=1}^{m} \left[ I_j \log p(x_j) + (1 - I_j) \log\left(1 - p(x_j)\right) \right] \quad (15)$$
where $I_j = 1$ denotes an injured outcome and $I_j = 0$ a non-injured one. We use log-likelihood (15) to compare models (13) and (14). When the data are generated from the normal distribution model, the difference in log-likelihood (normal distribution model minus logistic model) is expected to be positive. However, due to the randomness of data sets, the difference in log-likelihood between the two models fluctuates from one data set to another.
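The closeness of the two function families can be reproduced with a short numerical search. The sketch below works in the scaled variable (median at 0, unit standard deviation) and assumes the "distance" is the maximum absolute difference over a grid; the text does not state which distance was used, so that choice, the grid ranges, and the function names are illustrative assumptions.

```python
import math

def normal_cdf(s):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(s / math.sqrt(2.0)))

def logistic(s, a):
    """Logistic function with steepness a, centered at 0."""
    return 1.0 / (1.0 + math.exp(-a * s))

def max_gap(a, s_max=6.0, n=2001):
    """Largest |Phi(s) - logistic(a*s)| on a uniform grid over [-s_max, s_max]."""
    gap = 0.0
    for i in range(n):
        s = -s_max + 2.0 * s_max * i / (n - 1)
        gap = max(gap, abs(normal_cdf(s) - logistic(s, a)))
    return gap

def best_scaled_alpha(lo=1.5, hi=1.9, steps=401):
    """Grid search for the scaled steepness that minimizes the maximum gap."""
    candidates = [lo + (hi - lo) * k / (steps - 1) for k in range(steps)]
    return min(candidates, key=max_gap)

if __name__ == "__main__":
    a = best_scaled_alpha()
    print("scaled steepness:", a, "max difference:", max_gap(a))
```

With this distance the search lands near the familiar scaling constant of about 1.70, and the maximum difference stays below 0.01, in line with the bound quoted above.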
We examine the sample distribution of differences in log-likelihood based on $N = 100000$ independent data sets. Figure 2 plots the histograms of the difference in log-likelihood for various values of $m$, the size of each data set. To clarify, $N$ is the number of data sets used in each histogram and $m$ is the number of binary outcomes in each data set. Each sample of the difference in log-likelihood requires one data set, which is why we use $N = 100000$ independent data sets to plot each histogram. Suppose we use the sign of the difference in log-likelihood to classify data sets as the normal distribution model (positive sign) or as the logistic model (negative sign). All data sets examined in Figure 2 are generated based on the normal distribution model. Thus, data sets with a negative difference will be falsely identified as the logistic model (false negative). In the bottom right panel, the false negative rate drops down to 5.63%. Based on the simulation results, we see that to reduce the false negative rate to less than 20%, for example, we need to work with data sets each consisting of $m = 1000$ samples. This is above the typical sample size of data sets for injury models. Thus, in real applications, the normal distribution model (14) and logistic model (13) are practically the same unless we work with injury data sets of very large sample size.
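A reduced version of this experiment can be sketched as follows. The dose design (scaled doses drawn uniformly from a fixed interval), the logistic steepness 1.702, the number of data sets, and the function names are all assumptions chosen for illustration; the design behind Figure 2 is not specified here, so the rates produced by this sketch should not be expected to match the reported values.

```python
import math
import random

ALPHA_SCALED = 1.702  # assumed logistic steepness in the scaled variable

def p_normal(s):
    """Normal distribution model in the scaled variable."""
    return 0.5 * (1.0 + math.erf(s / math.sqrt(2.0)))

def p_logistic(s):
    """Logistic model in the scaled variable."""
    return 1.0 / (1.0 + math.exp(-ALPHA_SCALED * s))

def log_likelihood(p, doses, outcomes):
    """Bernoulli log-likelihood of binary outcomes under probability function p."""
    total = 0.0
    for s, injured in zip(doses, outcomes):
        q = p(s)
        total += math.log(q) if injured else math.log(1.0 - q)
    return total

def expected_gap(doses):
    """Expected log-likelihood difference (normal minus logistic) when the data
    come from the normal model: a sum of Bernoulli KL divergences, hence >= 0."""
    total = 0.0
    for s in doses:
        a, b = p_normal(s), p_logistic(s)
        total += a * math.log(a / b) + (1.0 - a) * math.log((1.0 - a) / (1.0 - b))
    return total

def false_negative_rate(m, n_sets=500, s_max=3.0, seed=0):
    """Fraction of normal-model data sets whose log-likelihood favors the logistic model."""
    rng = random.Random(seed)
    negatives = 0
    for _ in range(n_sets):
        doses = [rng.uniform(-s_max, s_max) for _ in range(m)]
        outcomes = [rng.random() < p_normal(s) for s in doses]
        diff = (log_likelihood(p_normal, doses, outcomes)
                - log_likelihood(p_logistic, doses, outcomes))
        if diff < 0.0:
            negatives += 1
    return negatives / n_sets

if __name__ == "__main__":
    for m in (100, 1000):
        print(m, false_negative_rate(m))
```

The helper `expected_gap` makes the statement "expected to be positive" concrete: the mean difference is a sum of Kullback-Leibler divergences, which is tiny per sample because the two probability functions differ by less than 0.01.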
We go back to the pre-transformation logistic model: function (4), specified by the steepness coefficient $\alpha$, and function (6), specified by the shape parameters. Relation (17) describes the best approximation to the normal distribution model (12) from the logistic model family (6). The best approximation is obtained numerically by minimizing the distance between the two functions (Figure 1). Alternatively, a straightforward approximation can be written out by simply matching the widths of the two injury functions. The 10-90 percentile width of the normal distribution model is given by the inverse error function:
$$W_{\text{normal}} = 2\sqrt{2}\,\mathrm{erf}^{-1}(0.8)\,\sigma \approx 2.563\,\sigma$$
Notice that the two widths, the width of the normal distribution model $W_{\text{normal}}$ and the width of its best logistic model approximation $W_{\text{opt}}$, are indeed very close to each other. We will use these two interchangeably.
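The width matching can be made concrete with a few lines of arithmetic. The sketch below assumes the logistic form p(x) = 1 / (1 + exp(-alpha (x - D50))), for which the 10-90 percentile width is 2 ln(9) / alpha, and a normal model with standard deviation sigma; the function names are hypothetical.

```python
import math
from statistics import NormalDist

def width_normal(sigma):
    """10-90 percentile width of the normal model p(x) = Phi((x - D50) / sigma)."""
    return 2.0 * NormalDist().inv_cdf(0.9) * sigma

def width_logistic(alpha):
    """10-90 percentile width of the assumed logistic form with steepness alpha."""
    return 2.0 * math.log(9.0) / alpha

def alpha_matching_width(sigma):
    """Steepness whose logistic width equals the normal-model width."""
    return 2.0 * math.log(9.0) / width_normal(sigma)

if __name__ == "__main__":
    print("normal width for sigma = 1:", width_normal(1.0))
    print("logistic width for alpha = 1.702:", width_logistic(1.702))
    print("width-matched alpha for sigma = 1:", alpha_matching_width(1.0))
```

For sigma = 1 the normal width is about 2.563, while a logistic function with scaled steepness 1.702 has width about 2.582, so the two differ by less than one percent.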
Similar to the situation of the logistic model, the normal distribution model is also completely specified by the shape parameters $(D_{50}, W)$. It has the form
$$p(x) = \frac{1}{2}\left[1 + \mathrm{erf}\left(2\,\mathrm{erf}^{-1}(0.8)\,\frac{x - D_{50}}{W}\right)\right] \quad (18)$$
• The binary injury outcome is completely determined by the condition that the target dose $Z$ exceeds $z^{(c)}$, where $z^{(c)}$ is the critical threshold for target dose $Z$.
• The probability of injury caused by the input dose $x$ is described by the CDF of the normal distribution. In practice, the injury probability is very well approximated by the widely used logistic dose-response relation.
• As given in (17), the median injury dose of the injury function is the critical threshold for the target dose, shifted by the bias in the dose propagation. The width is proportional to the standard deviation of the uncertainty: the larger the uncertainty, the more spread out the injury function is.
• In terms of shape parameters $(D_{50}, W)$, the logistic model is expressed in (6); the normal distribution model is given in (18).
American Journal of Operations Research
Next, we study how to incorporate additional uncertainties in the framework of the dose-response relation, and how to model a new population with different uncertainty.

Effects of Additional Uncertainties
In the previous subsection, we interpreted the dose-response relation as a consequence of dose propagation uncertainty. In this subsection we study how to incorporate additional uncertainties by changing the shape parameters $(D_{50}, W)$ in logistic model (6) or in normal distribution model (18).
We start by considering a homogeneous population consisting of statistically identical subjects, which means that quantities $z^{(c)}$, $\mu$ and $\sigma$ are fixed and stay the same for all subjects in the population. In a homogeneous population, the dose propagation uncertainty is statistically the same for all subjects. Its effect is already reflected in the dose-response relation specified by shape parameters (17). In particular, the width $W$ is proportional to the standard deviation of the uncertainty.
If there were no uncertainty in the dose propagation, the dose-response relation would be a sharp transition (a step function). Now we consider a more realistic situation: a heterogeneous population consisting of subjects with variable critical threshold $z^{(c)}$, denoted here in the new setting as $Z^{(c)}$, following the convention of using uppercase letters for random variables. In addition to the uncertainty in $Z^{(c)}$, the input dose $x$ may not be directly measurable. In some situations, the input dose $x$ is derived from a controllable/measurable variable $y$: its value is calculated via computer simulations from measurable quantities using idealized representative properties of subjects, such as the 50-percentile properties of the general population [5] [6]. We use the example below to illustrate the situation of controllable variable $y$ vs true input dose.
Consider the experiment in which we test the shatter resistance of a product by dropping it from a specified height. In this example, the various quantities in the model are described as follows:
• The height $y$ is the controllable/measurable variable.
Injury function (23) has the same form as (12
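One way to picture the effect of additional uncertainties on the shape parameters is sketched below. It assumes the normal distribution model, and it assumes the extra uncertainty (for example, threshold variability or input dose estimation error) is Gaussian and independent of the dose propagation noise, so that variances add. The parameter `extra_bias` is taken to mean a systematic over-estimate of the input dose (estimated minus true). These assumptions and all names are illustrative and are not taken from the derivation leading to (23).

```python
import math
from statistics import NormalDist

# For the normal model, the 10-90 percentile width is W = K * sigma.
K = 2.0 * NormalDist().inv_cdf(0.9)

def add_uncertainty(d50, width, extra_bias=0.0, extra_sigma=0.0):
    """Map shape parameters (D50, W) to those of a population with extra
    independent Gaussian uncertainty (assumed: variances add, bias shifts D50)."""
    sigma = width / K
    sigma_total = math.hypot(sigma, extra_sigma)  # sqrt(sigma^2 + extra_sigma^2)
    return d50 + extra_bias, K * sigma_total

def injury_probability(x, d50, width):
    """Normal distribution model written in terms of the shape parameters."""
    return NormalDist(mu=d50, sigma=width / K).cdf(x)

if __name__ == "__main__":
    d50, w = 10.0, 5.0  # hypothetical test-population parameters
    d50_new, w_new = add_uncertainty(d50, w, extra_bias=1.0, extra_sigma=2.0)
    print("test population:", d50, w)
    print("new population: ", d50_new, w_new)
    print("risk at x = 12 :", injury_probability(12.0, d50, w),
          injury_probability(12.0, d50_new, w_new))
```

Under these assumptions a larger total uncertainty widens the injury function, and a systematic over-estimate of the input dose moves the curve to the right, consistent with the two findings summarized in the abstract.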

Dose-Injury Function for Target Dose of Log-Normal Distribution
For the discussion below, we adopt the normal-distribution model as the base

A Dose-Injury Model with Skewness Based on a Normally Distributed Intermediate Variable
We construct a model that accommodates the median injury dose ($D_{50}$), the width ($W$) and the skewness ($\gamma$) as 3 independent parameters. In the previous section, we studied the formulation based on target dose of log-normal distribution, in which the skewness is always positive and the 3 shape parameters
Note that both
To accommodate negative skewness $\gamma < 0$, however, we need

Case 2:
This works for $\gamma < 0$, which indicates that the injury probability has a steeper rise above the median injury dose $D_{50}$ than below it. Next we combine the results of

A Unified Formulation for All Values of Skewness
In the previous sub-section, we studied models based on target dose of shifted log-normal distribution with the shift as a parameter. We now synthesize the results obtained to develop a unified formulation of the injury function in which the 3 shape parameters $(D_{50}, W, \gamma)$ can be specified independently. First, we show that at any fixed value of $\mu$, there is one-to-one correspondence between
(42) to write out the corresponding shape parameters $(D_{50}, W, \gamma)$.
In injury model (47)
highlights that as $\gamma$ increases from negative to zero to positive, the injury function becomes more concave down.

Effect of Input Dose Uncertainty on the Injury Function with Skewness
We study the effect of input dose estimation uncertainty on the dose-injury function with skewness. We use the term "composite injury function" to denote the injury model after the input dose uncertainty has been incorporated into the model. In general, the composite injury function will be somewhat different from the 3-parameter function form (47) derived in the previous section. We calculate the three shape parameters $(D_{50}, W, \gamma)$ of the composite injury function. Then we explore approximating the composite injury function using function form (47): we examine the difference between the composite injury function and model (47) with the same shape parameters $(D_{50}, W, \gamma)$. If the approximation error is small, then the 3-parameter function form (47) is approximately invariant with respect to input dose uncertainty, and it serves as an adequate framework for accommodating uncertainty in estimating the input dose. Furthermore, framework (47) provides a mechanism for mapping the injury function for one particular dose propagation uncertainty to that for a different uncertainty. Using this mechanism, we can construct an injury model for a target population in application, based on measured injury data for a test population in experiments.
We start with a function of injury probability vs true input dose that is exactly of form (47), specified by 3 shape parameters $(D_{50}, W, \gamma)$. The left panel of Figure 4 shows the injury probability vs the estimated input dose $x$ for $\sigma = 0, 1, 2, 3$, respectively. The most pronounced effect of input dose uncertainty is to spread out the injury function and increase the width. We examine the trend of the shape parameters as $\sigma$ increases. Both the forward and backward mappings need to be implemented numerically. The detailed numerical procedure will be discussed in a subsequent study.
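The forward mapping from a skewed injury function to its composite version can be sketched numerically. Because function form (47) is not reproduced here, the sketch uses a log-normal CDF as a stand-in skewed injury function, and it assumes the true input dose equals the estimated dose plus independent Gaussian noise of standard deviation sigma. The stand-in function, its parameter values, the noise model, and all names are assumptions for illustration only.

```python
import math

def normal_cdf(s):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(s / math.sqrt(2.0)))

def skewed_injury(x, median=10.0, s=0.3):
    """Stand-in skewed injury function: log-normal CDF in the true input dose."""
    if x <= 0.0:
        return 0.0
    return normal_cdf(math.log(x / median) / s)

def composite_injury(x_est, sigma, n=401):
    """E[skewed_injury(x_est + sigma*eps)], eps ~ N(0, 1), by the trapezoid rule."""
    if sigma == 0.0:
        return skewed_injury(x_est)
    h = 12.0 / (n - 1)  # integrate eps over [-6, 6]
    total = 0.0
    for i in range(n):
        e = -6.0 + i * h
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * skewed_injury(x_est + sigma * e) * math.exp(-0.5 * e * e)
    return total * h / math.sqrt(2.0 * math.pi)

def dose_at(prob, sigma, lo=-50.0, hi=100.0):
    """Bisection for the estimated dose where the composite function equals prob."""
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if composite_injury(mid, sigma) < prob:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def shape(sigma):
    """Median injury dose and 10-90 percentile width of the composite function."""
    d10, d50, d90 = (dose_at(q, sigma) for q in (0.1, 0.5, 0.9))
    return d50, d90 - d10

if __name__ == "__main__":
    for sigma in (0.0, 1.0, 2.0, 3.0):
        print(sigma, shape(sigma))
```

In this stand-in setting the width grows steadily with sigma, which mirrors the spreading effect described for Figure 4. The backward mapping (removing a known uncertainty) would require inverting this convolution and is not attempted in the sketch.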

Concluding Remarks
We considered injury models in the framework of dose propagation uncertainty.
The mathematical formulation rests on the premise that the binary injury outcome is completely determined by the target dose at the active site and the critical threshold. The randomness in the occurrence of injury at a given input dose is