An Exact Formula for Estimation of Age-Specific Sensitivity for Screening Tests

There has been a growing interest in screening programs designed to detect chronic progressive cancers in the asymptomatic stage, with the expectation that early detection will result in a better prognosis. One key element of early detection programs is a screening test. An accurate screening test is more effective in finding cases with early-stage diseases. Sensitivity, the conditional probability of getting a positive test result when one truly has a disease, represents one measure of accuracy for a screening test. Since the true disease status is unknown, it is not straightforward to estimate the sensitivity directly from observed data. Furthermore, the sensitivity is associated with other parameters related to the disease progression. This feature introduces additional numerical complexity and limitations, especially when the sensitivity depends on age. In this paper, we propose a new approach that, through combinatorial manipulation of probability statements, formulates the age-dependent sensitivity. This formulation has an exact and simple expression and can be estimated based on directly observable probabilities. This approach also helps evaluate other parameters associated with the natural history of disease more accurately. The proposed method was applied to estimate the mammography sensitivity for breast cancer using the data from the Health Insurance Plan trial.


Introduction
Screening of asymptomatic individuals for chronic diseases is a rapidly growing public health initiative. Early detection programs aim to detect a disease at the stage when it is present but has not yet produced symptoms.
For example, in breast cancer, eight randomized screening trials have demonstrated that mammography screening is beneficial in finding breast cancer at an earlier stage and consequently leads to a decrease in mortality among women 50-65 years of age ([1]-[7]).
One key element of these programs is a screening test. An important measure of the effectiveness of a screening test is sensitivity ($\beta$), the conditional probability of getting a positive screening test result given that one has the disease. Ideally it should be evaluated in the setting of a natural history of disease model ([8]-[11]). Often, parameters such as the sojourn time distribution in the early stage of disease (the pre-clinical state) and the transition probability from the disease-free state to the early-stage disease state are needed to estimate the sensitivity.
These parameters are not directly observable and add to the complexity of the problem formulation and of the numerical computation of the sensitivity. Furthermore, the sensitivity can be age-dependent. For example, in breast cancer, the mammogram sensitivity in younger women is lower than that in older women [12]. The formulation for estimating age-dependent sensitivity then becomes complex and does not always guarantee a numerical solution.
In this paper, we derive a formula that expresses age-dependent sensitivity in terms of probabilities that are directly observable in screening trials and programs, and we discuss the characteristics and generalization of this formula. We apply the formula to data from a breast cancer screening trial (the Health Insurance Plan trial) to estimate the mammography sensitivity [1].

Method: An Exact Expression for Age-Dependent Sensitivity $\beta_a$
Consider the following health states in the natural history of disease progression:
• $S_0$, the disease-free state, when the disease cannot be detected by any current early detection program;
• $S_p$, the pre-clinical state, when the disease is detectable by an early detection program, but no symptoms are shown; and
• $S_c$, the clinical state, when symptoms show.
A progressive disease model assumes that the disease progresses in the direction $S_0 \to S_p \to S_c$ unless it is interrupted by a medical intervention. In the case of chronic progressive diseases, the goal of a screening program is to detect the disease in $S_p$.
The age is denoted by $a$. Let
• $\beta_a$ be the sensitivity of the screening exam (i.e., the conditional probability that a disease is detected by the screening exam given that the disease is present) if the exam is taken at age $a$;
• $w_a$ be the transition rate from $S_0$ to $S_p$ at age $a$;
• $q_a(t)$ be the sojourn time distribution in $S_p$ if the transition $S_0 \to S_p$ happens at age $a$; and
• $P_a$ be the probability of being in the pre-clinical state $S_p$ at age $a$, i.e., the proportion of people in $S_p$ at age $a$ in the absence of screening examinations.
Note that $\beta$, $w$, $q$ and $P$ are functions of age $a$, but to simplify notation we write $a$ as a subscript. We also do this in part because our method gives estimates of $\beta$ for discrete values of age $a$.
We now introduce two important probabilities on which all the subsequent derivations are based. Suppose there are no screening examinations before age $a$, and a series of examinations are scheduled (but not necessarily taken) at times $0, t_1, t_2, \ldots$ [...]

The notion of $T$ is obvious, as it is a probability that can be estimated directly from screening programs by taking proportions. The notion of $X_a(t)$ is more subtle, but it will facilitate the derivations shown later, especially when determining $w$ and $q(t)$. Note that the definitions of $T$ and $X$ imply that by the time of the last examination, one would be in either $S_0$ or $S_p$, but not $S_c$. Thus, to use these definitions, we need to limit the cohort to those who will still be in $S_0$ or $S_p$ at the last examination taken. In fact, in all our derivations, whenever multiple $X$'s and $T$'s appear together in an equation, they share the same last examination time. Also note that [...]

We derive a formula for $\beta_a$ via a recursion based on the number of examinations. Consider a cohort of age $a$ at time 0, with examinations scheduled at times $0$, $t_1$ and $t_2$ [...]

Similarly, for $T_{0,1,2}$, we need three terms, involving the probability of entering $S_p$ during the following times and staying in $S_p$ until being detected at $t_2$:
• before age $a$, going undetected at times $0$ and $t_1$, which is [...]
• between ages $a$ and $a + t_1$, going undetected at age $a + t_1$ [...]

Observing (1) and (2), we arrive at our main result in this section: [...]

Therefore, by writing out a few recursions, we are able to eliminate all $X_a(t)$ terms and express the age-dependent $\beta$ as a simple expression of the $T$ terms, which are directly observable.

A special case of Theorem 2.1 arises when $t_1 = t_2 = 0$. This corresponds to a situation where repeated tests are performed on the same individual to measure the sensitivity. For example, assume up to three repeated tests with the same sensitivity are performed at the same time, at three testing centers with the same equipment.
We label the testing centers as Center 1, Center 2 and Center 3.
Let $T_0$ be the probability of being detected at Center 1, regardless of the results from Centers 2 and 3. Let $T_{0,0}$ be the probability of being detected by Center 2 but not Center 1, regardless of the result from Center 3. This is equivalent to the probability of being detected by Center 3 but not Center 2, regardless of the result from Center 1. Lastly, let $T_{0,0,0}$ be the probability of being detected by Center 3, but not Center 1 or 2. The subtlety here is that $T_{0,0,0}$ is not the probability of being detected by exactly one center out of the three, but just 1/3 of it, because we have ordered the centers beforehand. Then we have

Corollary 2.2. When at least three repeated tests with the same sensitivity $\beta$ can be performed at the same time, we have [...]

The result is achieved by simply taking the limits $t_1 \to 0$ and $t_2 \to 0$ in the derivation for Theorem 2.1.

[...] happens at age $x$. The interpretation of this expression is straightforward: to be detected at age 50, the transition $S_0 \to S_p$ needs to happen at some age $x$ before 50, hence the first integral; and $S_p$ needs to last from age $x$ to at least age 50, hence the second integral. Thus in just this one term, all the $w_x$'s and $q_x$'s for $x$ less than age 50 are involved, when we do not even know what type of distribution $q_x$ is, and it may vary for different $x$'s. This is just the simplest case; when there are multiple exams involved, the expressions for the probability of detection will involve several double integrals. One of the more rigorous approaches, by Shen and Zelen [10], uses such expressions to form a likelihood function to find estimates for $\beta$, $w$ and $q$. But due to the complexity of such expressions, even when age-dependency is ignored and $q$ is assumed to be exponential (so there are only three parameters to determine: $\beta$, $w$ and the parameter of the exponential $q$), there is no guarantee that a global maximum exists or can be determined [11].
However, Theorem 2.1 allows us to bypass $w$ and $q_x$ and obtain an exact expression for $\beta$ in terms of the $T$'s. This, of course, does not mean that the effect of $\beta$ on detection can be isolated from that of $w$ and $q_x$, but rather that the information on $w$ and $q_x$ is already incorporated in the $T$'s. We also want to point out that the $T$'s involved in Theorem 2.1 are naturally observable probabilities in the case of breast cancer screening (the main area of study for the authors): it is recommended that women be screened either once every year, which gives the terms $T_{1,2}$ and $T_{0,1,2}$, or once every other year, which gives the term $T_{0,2}$. Another important advantage of Theorem 2.1 is that once the (age-dependent) $\beta$ is determined, we will be able to determine the age-dependent $w$ and $q$ by various approaches in a more exact manner. We will explore this in another paper.
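A short worked derivation may help show why differences of $T$ terms can isolate the sensitivity while the natural-history parameters cancel. It is only a sketch under assumptions that are not taken verbatim from the text: exams at times $0$, $t_1$ and $t_2$; a $T$ term read as "detected at the last listed exam, undetected at the earlier listed exams"; and hypothetical symbols $A$, $B$, $C$ for the probabilities of entering $S_p$ before age $a$, between ages $a$ and $a+t_1$, and between ages $a+t_1$ and $a+t_2$, respectively, and still being in $S_p$ at age $a+t_2$. All dependence on $w$ and $q$ is bundled into $A$, $B$ and $C$.

```latex
\begin{align*}
T_{2}     &= \beta_{a+t_2}\,(A + B + C),\\
T_{0,2}   &= \beta_{a+t_2}\,\bigl[(1-\beta_a)A + B + C\bigr],\\
T_{1,2}   &= \beta_{a+t_2}\,\bigl[(1-\beta_{a+t_1})(A + B) + C\bigr],\\
T_{0,1,2} &= \beta_{a+t_2}\,\bigl[(1-\beta_a)(1-\beta_{a+t_1})A
             + (1-\beta_{a+t_1})B + C\bigr],\\[4pt]
T_{2} - T_{0,2}     &= \beta_{a+t_2}\,\beta_a\,A,\\
T_{1,2} - T_{0,1,2} &= \beta_{a+t_2}\,\beta_a\,(1-\beta_{a+t_1})\,A,\\[4pt]
\beta_{a+t_1} &= 1 - \frac{T_{1,2} - T_{0,1,2}}{T_{2} - T_{0,2}}.
\end{align*}
```

Under this reading the recovered quantity is the sensitivity at the middle exam, and $A$, $B$, $C$ never need to be evaluated. With a common sensitivity $\beta$ and prevalence $P$ in the repeated-test setting, the same steps applied to $T_0 = \beta P$, $T_{0,0} = \beta(1-\beta)P$ and $T_{0,0,0} = \beta(1-\beta)^2 P$ give $\beta = 1 - (T_{0,0} - T_{0,0,0})/(T_0 - T_{0,0})$.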
One limitation of Theorem 2.1 is that it does not provide a way to use the "interval cases", that is, the cases that enter the clinical stage between two scheduled screening exams. These cases, however, can be used later to determine $w$ and $q(t)$ once $\beta$ is determined.
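As a numerical sanity check on the idea that $\beta$ can be recovered from the $T$ terms alone, the sketch below builds a toy natural-history model and then ignores its parameters when estimating. Everything in it is an illustrative assumption rather than the paper's method: a constant transition rate `w`, an exponential sojourn time with rate `lam`, exams at times 0, `t1` and `t2`, the reading of $T_{0,1,2}$ as "detected at $t_2$, undetected at 0 and $t_1$", and the ratio-of-differences form $1 - (T_{1,2} - T_{0,1,2})/(T_2 - T_{0,2})$, which is one form consistent with the $T$ terms named in the text. Function names and parameter values are hypothetical.

```python
import math


def stay_prob(enter_lo, enter_hi, until, w, lam, steps=20000):
    """Probability of entering S_p at an age in (enter_lo, enter_hi) and still
    being in S_p at age `until`, assuming a constant rate w and an exponential
    sojourn time with rate lam. Midpoint rule for
    the integral of w * exp(-lam * (until - x)) dx."""
    h = (enter_hi - enter_lo) / steps
    total = 0.0
    for k in range(steps):
        x = enter_lo + (k + 0.5) * h
        total += w * math.exp(-lam * (until - x))
    return total * h


def detection_probs(a, t1, t2, beta0, beta1, beta2, w, lam):
    """Toy values of T_2, T_{0,2}, T_{1,2}, T_{0,1,2} for exams at ages
    a, a + t1, a + t2 with sensitivities beta0, beta1, beta2 (assumed model)."""
    A = stay_prob(0.0, a, a + t2, w, lam)          # entered before age a
    B = stay_prob(a, a + t1, a + t2, w, lam)       # entered between a and a + t1
    C = stay_prob(a + t1, a + t2, a + t2, w, lam)  # entered after a + t1
    T2 = beta2 * (A + B + C)
    T02 = beta2 * ((1 - beta0) * A + B + C)
    T12 = beta2 * ((1 - beta1) * (A + B) + C)
    T012 = beta2 * ((1 - beta0) * (1 - beta1) * A + (1 - beta1) * B + C)
    return T2, T02, T12, T012


def sensitivity_from_T(T2, T02, T12, T012):
    """Assumed ratio-of-differences estimator; uses only the T terms."""
    return 1.0 - (T12 - T012) / (T2 - T02)


# Hypothetical parameter values, chosen only for illustration.
T2, T02, T12, T012 = detection_probs(
    a=50.0, t1=1.0, t2=2.0, beta0=0.6, beta1=0.7, beta2=0.8, w=0.002, lam=0.5
)
print(sensitivity_from_T(T2, T02, T12, T012))
```

In this toy model the printed value agrees with `beta1`, the sensitivity at the middle exam, up to floating-point error, even though `w` and `lam` never enter `sensitivity_from_T`.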

Symmetry between X and T
An interesting mathematical result involves a certain symmetry between $X$ and $T$. We use the same setup as in the last section: suppose there are no examinations before age $a$, and a series of exams are scheduled (but not necessarily taken) at times [...] is the probability of such a patient being detected at examinations taken at times [...]. This is a hypothetical probability that will facilitate our derivation.
We also recall how we have defined $T$: let [...]

Therefore it is clear that there is a symmetric relationship between the $T$ terms and the corresponding $Y$ terms in the cases of 1, 2 and 3 examinations. The pattern of these expressions becomes more obvious in the case of 4 examinations (which will be proved along with the general case):

$$T_{0,1,2,3} = -Y_{0,1,2,3} + Y_{0,1,3} + Y_{0,2,3} + Y_{1,2,3} - Y_{0,3} - Y_{1,3} - Y_{2,3} + Y_{3},$$
$$Y_{0,1,2,3} = -T_{0,1,2,3} + T_{0,1,3} + T_{0,2,3} + T_{1,2,3} - T_{0,3} - T_{1,3} - T_{2,3} + T_{3}.$$

In the expression of $T_{0,1,2,3}$ in terms of the $Y$ terms (and vice versa), the subscripts of the $Y$ terms go through all eight subsets of $\{0,1,2\}$, each in union with $\{3\}$, and the sign in front of each $Y$ term alternates according to the size of the subset. These facts turn out to be universal. We have: [...], including the empty set, and $|S|$ is the size of $S$.

Proof. Because we will be applying the inclusion-exclusion principle later, we revise the definition of the $T$'s slightly to make the corresponding arguments. Assume a series of examinations are scheduled at times $t_1, t_2, \ldots, t_n$. Let
• [...] be the event of being detected at $t_n$ but undetected at $t_i$;
• $C_n$ be the event of being detected at $t_n$;
• $A$ be the event of being detected at all exams at $t_1, t_2, \ldots, t_n$.
Therefore we have $\Pr(A) = Y_{1,2,\ldots,n}$. By the inclusion-exclusion principle [14], [...] Each $Y$ can be expressed in terms of $T$'s. [...]

Given expressions like Theorem 3.1, the proof of Theorem 2.1 is not as straightforward. However, the purpose of that proof was to relate some of the traditional considerations and reasoning to our new notions. This will be useful when evaluating important parameters such as age-dependent disease transition probabilities and the sojourn time distribution.
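The alternating-sign pattern described in this section can be checked by brute force. The sketch below is an illustration under assumptions, not the authors' proof: it treats the detected/undetected outcomes at four exams as a hypothetical joint distribution with random probabilities, reads $Y$ as "detected at every listed exam" and $T$ as "detected at the last listed exam, undetected at the earlier listed ones", and compares both sides of the inclusion-exclusion identity. All function names are made up for this example.

```python
import itertools
import random


def random_joint(n_exams, seed=0):
    """Random probability for each detected (1) / undetected (0) pattern over
    the exams; a hypothetical distribution used only to test the identity."""
    rng = random.Random(seed)
    patterns = list(itertools.product([0, 1], repeat=n_exams))
    weights = [rng.random() for _ in patterns]
    total = sum(weights)
    return {p: wgt / total for p, wgt in zip(patterns, weights)}


def Y(joint, idx):
    """Probability of being detected at every exam listed in idx."""
    return sum(pr for pat, pr in joint.items() if all(pat[i] == 1 for i in idx))


def T(joint, idx):
    """Probability of being detected at the last exam in idx and undetected at
    all earlier exams in idx (exams not listed are ignored)."""
    *earlier, last = idx
    return sum(
        pr
        for pat, pr in joint.items()
        if pat[last] == 1 and all(pat[i] == 0 for i in earlier)
    )


def alternating_sum(f, joint, n):
    """Sum over all subsets S of {0, ..., n-1}, including the empty set,
    of (-1)^|S| * f(S union {n})."""
    total = 0.0
    for size in range(n + 1):
        for S in itertools.combinations(range(n), size):
            total += (-1) ** size * f(joint, S + (n,))
    return total


joint = random_joint(4)
# T in terms of Y, and Y in terms of T, for exams indexed 0, 1, 2, 3.
print(abs(T(joint, (0, 1, 2, 3)) - alternating_sum(Y, joint, 3)) < 1e-12)
print(abs(Y(joint, (0, 1, 2, 3)) - alternating_sum(T, joint, 3)) < 1e-12)
```

Because the check uses an arbitrary joint distribution, it does not rely on independence between exams; the identity is a property of the events themselves.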

An Example
We illustrate the proposed method using data from the Health Insurance Plan Project (HIP) [1]. HIP is a randomized trial of mammography screening vs. no screening for women who had no previous mammography. Even though HIP is a large-scale screening trial, because the breast cancer incidence rate is relatively low, we do not have a sufficient number of screen-detected and interval cases to readily estimate the age-specific $\beta$. Thus, to illustrate Theorem 2.1, we group all the women together as one age group and assume that all patients had the same age at the initial examination. Because the cohort is required to have annual check-ups, we set $t_1 = 1$ [...] Note that this estimate is essentially identical to Shen and Zelen's published estimate of 0.7 ([10], [11]) based on the maximum likelihood method, but is computationally trivial.