Incorporating Uncertain Costs within a Series of Sequential Probability Ratio Tests

We consider an extension to Sequential Probability Ratio Tests for when we have uncertain costs, but also opportunity to learn about these in an adaptive manner. In doing so we demonstrate the effects that allowing uncertainty has on observation cost, and the costs associated with Type I and Type II error. The value of information relating to modelled uncertainties is derived and the case of statistical dependence between the parameter affecting decision outcome and the parameter affecting unknown cost is also examined. Numerical examples of the derived theory are provided, along with a simulation comparing this adaptive learning framework to the classical one.


Introduction
Sequential Probability Ratio Tests (SPRTs) were introduced by Wald in 1945 [1] [2] as a sequential hypothesis test procedure for when data is considered in sequence rather than in its entirety. They have been used in many fields, for example nuclear physics [3], medicine [4] [5], standardised testing [6] and radar detection [7], and even though the classical theory has now been known for some seven decades, they are still the subject of research into extensions and generalisations [8]-[10].
Generally, the objective of a SPRT is to balance the consequence of an error with the cost of acquiring further data and/or making additional observations, e.g. clinical trials or stress tests. In this approach data is sought until the belief in the state of nature (namely the parameter controlling the decision outcome) is such that the expected cost of implementing the current optimal decision is less than that expected from seeking additional data, updating beliefs, and then implementing the (possibly different) optimal decision.
In its simplest form a SPRT consists of the following: a choice between two decisions or courses of action (here denoted $d_0$ and $d_A$), and a state of nature $w$ that can take one of two possible values ($w_0$ or $w_A$). Depending on the decision that is selected and the true value of the state of nature, one of three possible losses may occur. Without loss of generality we assume a loss of $c_1 > 0$ occurs if $d_A$ is selected when $w_0$ is true, a loss of $c_2 > 0$ occurs if $d_0$ is selected but $w_A$ is true, and a loss of 0 otherwise (Table 1).
From the above it can be seen that the objective of the Decision Maker (DM) is to choose between $d_0$ and $d_A$ on the basis of their beliefs over the state of nature, seeking to match the decision to what they hope is its correct value. In general we denote such belief by the probability $\varphi = P(w_0)$.

A graphic representation of this protocol is given in Figure 1, where the x-axis varies over the possible values of φ and the y-axis is the resulting expected loss incurred by a particular strategy. The solid line represents the expected cost of implementing a decision immediately, whilst the curved dashed line corresponds to the expected loss of implementing a decision only after taking some further data (at a cost) concerning the correct value of the state of nature. It is drawn as a curve because it can be shown that the expected loss of deciding after data collection is a concave function of φ (this is because we will be taking the infimum of two further choices, namely to act once the data is collected or to again choose to sample).

For values of $\varphi \in \{0, 1\}$ the DM assumes for sure that they know what the state of nature will be, and hence will make a decision in the belief that they will receive a cost of 0. As φ varies away from these extremes, however, the DM will not presume to be certain in their knowledge of the state of nature, and hence expects a risk of making either a Type I or Type II error and incurring the associated cost. This risk can be shown to increase and then decrease linearly between the extreme values of φ (the change from increase to decrease occurring at $\varphi = 0.5$ when $c_1 = c_2$). The dashed curve, corresponding to making a decision only after further data collection, does not have an expected loss of 0 at $\varphi \in \{0, 1\}$ because of the additional cost of collecting data. Depending on what this particular data collection cost is, the DM should either always collect further information (when the cost is 0), never collect additional information (when the cost of doing so is prohibitive compared to the cost of actually making a Type I or Type II error), or, as is the case in Figure 1, choose whether or not to collect additional data depending on the value of φ that they assign to $P(w_0)$. The vertical dashed lines of Figure 1 indicate, for the particular numerical example displayed, the range of values for φ within which the DM expects it is better to collect further data before making a decision.
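The solid line of Figure 1 can be sketched numerically. The following minimal Python sketch assumes the loss structure of Table 1 with $\varphi = P(w_0)$; the function names `immediate_loss` and `indifference_point` are illustrative only and do not appear in the original paper.

```python
def immediate_loss(phi, c1, c2):
    """Expected loss of deciding now, given phi = P(w_0).

    Choosing d_A risks the Type I loss c1 (incurred when w_0 is true);
    choosing d_0 risks the Type II loss c2 (incurred when w_A is true).
    The DM picks whichever decision has the smaller expected loss.
    """
    loss_dA = c1 * phi
    loss_d0 = c2 * (1.0 - phi)
    return min(loss_dA, loss_d0)


def indifference_point(c1, c2):
    """Value of phi at which both immediate decisions have equal expected loss."""
    return c2 / (c1 + c2)


if __name__ == "__main__":
    # Equal error costs: the risk rises linearly to phi = 0.5, then falls.
    for phi in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(phi, immediate_loss(phi, c1=2.0, c2=2.0))
```

With unequal costs the peak of the solid line moves to $c_2 / (c_1 + c_2)$, which is why the value 0.5 applies only when the two error costs coincide.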
Whilst the approach described above outlines the classical way of performing a SPRT, it fails to take into account that in practice many of the costs involved will not be known for certain. For example, in the case of an observation cost, the cost associated with undertaking clinical trials prior to deciding to market a drug may not be known for sure, or in the case of a Type I or Type II error, the reputational or financial effect of implementing a poor decision may be unknown, e.g., releasing poorly coded software when there was opportunity to have more testing to determine unknown bugs.
In such instances it is then natural for us to model our beliefs and uncertainties about relevant costs according to some parameter, say θ, for which we only specify a prior distribution. The question then arises as to the effect this has on how we perform a SPRT, given that we may now learn between successive SPRTs, or in the case of unknown observation cost, between successive observations. The concept of unknown utility (utility defined to be negative loss), but which may instead be learned through experience, is the topic of adaptive utility theory first considered by Cyert and De Groot [11]. Here not only do we have uncertainty concerning decision outcome (as modelled through the unknown state of nature), but also in the preferences over those outcomes (or equivalently attitudes to risk) [12]-[14].
In the case of only performing a solitary SPRT, and where the uncertainty relates only to the consequence of a Type I or Type II error, the appropriate procedure is equivalent to the classical one with the cost assigned to its expected value, as there is no possibility to learn about the relevant costs before implementing a decision. However, if the DM has opportunity to purchase information about such costs, e.g. by performing some market survey or enlisting the assistance of a knowledgeable expert, then the value of such information may be calculated as the expected difference between the expected loss without the information and the expected loss with it. Determining this value will be our primary interest [14] [15]. In what follows, $I$ is the set of information statements we could receive, $i$ is an actual information statement, $D$ represents the set of available decisions, $d$ a particular decision, and $L(d)$ the expected loss for implementing decision $d$.
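In symbols, the verbal definition above corresponds to the standard form of the expected value of information. The display below is a sketch under stated assumptions rather than the paper's own equation: it takes $I$ to be discrete, and the symbols $V$, $P(i)$ and $L(d \mid i)$ (the expected loss of $d$ after receiving statement $i$) are notation introduced here for illustration.

```latex
V \;=\; \min_{d \in D} L(d) \;-\; \sum_{i \in I} P(i)\, \min_{d \in D} L(d \mid i)
```

The first term is the expected loss of the best decision without the information; the second averages, over the statements that might be received, the expected loss of the best decision once each statement is known.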
The remainder of this paper is as follows: in Section 2 we consider SPRTs with uncertain Type I or Type II error cost, followed by uncertain observation cost in Section 3. In the former we consider the value of perfect information and that of noisy information, along with providing numerical examples. The details of a simulation carried out in the case of perfect information are also given. Finally we conclude in Section 4.

Unknown Consequence of Error
Suppose our uncertainty does not concern the cost of taking further observations, but rather the cost of a Type I or Type II error, or both (possibly with different distributions describing these). Without loss of generality assume the uncertainty is with respect to the cost of a Type I error only. In this case our loss table is as in Table 2, where θ represents the uncertain cost of a Type I error.
There are three steps to perform to generate the expected value of information relating to these uncertain costs. In the case of information being perfect, these are the following:
1) Consider a SPRT when no information is learned.
2) Obtain the expected loss following learning of the uncertain parameter(s).
3) Subtract to obtain the expected value of information, which can then be subtracted from the unknown loss consequence(s).
In performing step 1 we utilize the expected costs using the loss in Table 3. Hence the expected loss on making an immediate decision, as a function of φ, is the smaller of the expected losses of the two available decisions. If a Type I or Type II error is made, we learn the value of θ. The process of performing an SPRT is then repeated, but now with the exact value of θ rather than its prior expectation $E[\theta]$, resulting in a change in the expected risk profile. An expected risk profile concerning how this may look depending on what is learned, based on the prior distribution of θ, is now determined. This is then subtracted from the original risk profile (using $E[\theta]$) to obtain the expected value of the cost information.

Perfect Information Numerical Example
As a toy example illustrating this situation consider testing if a sequence of coins are fair ($w_0$) or biased ($w_A$), the latter meaning that $P(H) = 0.8$. A Type I error corresponds to throwing away a fair coin and we suppose this has known loss of 2 units, namely the coin's value. A Type II error would correspond to accepting a biased coin, of which we have little experience. This could be very bad, resulting in a loss of $\theta = 4$ units, or not so bad, resulting in a loss of $\theta = 1$ unit. Further suppose the prior on θ is such that $P(\theta = 1) = 2/3$ and let φ represent the probability that the coin is fair.
The expected loss table is given in Table 4. From the description we see $E[\theta] = 2$, so that if $\varphi = 0.5$ then a priori we are indifferent between saving the coin or throwing it away. The expected risk of an immediate decision is then:

$$\min\{2\varphi,\ 2(1 - \varphi)\}.$$

We may also consider sampling data by flipping a coin, which is assumed to cost 0.1 units. We now calculate the range of φ where it is beneficial to flip the coin. To do so we determine posteriors on φ after observing the possible results of a coin flip:

$$P(w_0 \mid H) = \frac{0.5\varphi}{0.8 - 0.3\varphi}, \qquad P(w_0 \mid T) = \frac{0.5\varphi}{0.2 + 0.3\varphi}.$$

The predictive probability of observing heads or tails, at any point, is:

$$P(H) = 0.8 - 0.3\varphi, \qquad P(T) = 0.2 + 0.3\varphi.$$

Letting φ′ denote our posterior probability of the coin being fair after being flipped, the risk profile of the decision following an observation is:

$$\min\{2\varphi',\ 2(1 - \varphi')\}.$$

Now we can relate the bounds on φ′ to bounds on φ: after heads, $\varphi' \le 0.5$ whenever $\varphi \le 8/13$, and after tails, $\varphi' \ge 0.5$ whenever $\varphi \ge 2/7$. Suppose $\varphi \in [2/7, 8/13]$, then the expected loss is:

$$(0.8 - 0.3\varphi) \cdot 2\,P(w_0 \mid H) + (0.2 + 0.3\varphi) \cdot 2\,\bigl(1 - P(w_0 \mid T)\bigr) = 0.4 + 0.6\varphi.$$

We also need to include the cost of flipping (0.1 units), resulting in an expected loss for observing once then deciding of $0.5 + 0.6\varphi$. To see when this risk is preferable to deciding immediately we solve the following inequalities for φ:

$$0.5 + 0.6\varphi \le 2\varphi, \qquad 0.5 + 0.6\varphi \le 2(1 - \varphi),$$

which give $\varphi \in [5/14, 15/26]$. Observations should continue to be taken until φ leaves this interval, at which point a decision should be made. Hence the expected risk profile, Equation (14), equals the immediate-decision risk outside this interval and the expected loss of continued sampling inside it.

With this risk profile we now compute the expected loss assuming we know the parameter θ. There are two cases: $\theta = 1$ and $\theta = 4$. In each case the expected loss as a function of φ is computed by a process identical to the above. Recall the prior on θ was such that $P(\theta = 1) = 2/3$. Hence the expected risk after learning θ, Equation (17), is the prior-weighted average of the risk profiles for the two cases.

Thus the expected value of perfect information is the difference between Equation (14) (without perfect information) and Equation (17) (with perfect information). This represents the maximum number of units we should be prepared to forsake in order to be informed of the true value of the cost parameter θ prior to commencing the SPRT. From this we obtain a new function $\theta(\varphi)$, given by $E[\theta]$ less the expected value of perfect information at φ, that represents the loss resulting from the occurrence of a Type II error (Equation (19)). Equation (19) represents the expected value of the loss of making a Type II error, but discounted by the fact that we obtain information which allows more informed decisions to be made in subsequent SPRTs. A plot of Equation (19) is given in Figure 2, where it can be observed that local minima in the expected loss occur at boundaries of indifference between choices in the initial SPRT, and that plateaus in the expected loss coincide with values of φ where it is never beneficial to take an observation for any value of θ.
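The three steps of this example can be sketched in a few lines of Python. The sketch below is a simplification under stated assumptions: it uses a one-step lookahead risk (flip once, then decide) in place of the full sequential risk profile, so its values need not coincide with the paper's Equations (14), (17) and (19). All function and constant names are illustrative, and exact fractions are used so that the interval $[5/14, 15/26]$ can be checked without rounding.

```python
from fractions import Fraction as F

P0, PA = F(1, 2), F(4, 5)                # P(heads | fair), P(heads | biased)
C1 = F(2)                                # Type I loss: discarding a fair coin
FLIP_COST = F(1, 10)
PRIOR = {F(1): F(2, 3), F(4): F(1, 3)}   # prior on theta (Type II loss)
E_THETA = sum(t * p for t, p in PRIOR.items())


def immediate(phi, c2):
    """Expected loss of deciding now, with phi = P(fair)."""
    return min(C1 * phi, c2 * (1 - phi))


def one_step(phi, c2):
    """Expected loss of flipping once and then taking the better decision.

    P(heads) * posterior(fair | heads) simplifies to phi * P0, and similarly
    for the other terms, so no explicit normalisation is needed.
    """
    heads = min(C1 * phi * P0, c2 * (1 - phi) * PA)
    tails = min(C1 * phi * (1 - P0), c2 * (1 - phi) * (1 - PA))
    return FLIP_COST + heads + tails


def risk(phi, c2):
    """One-step-lookahead risk profile (an assumption of this sketch)."""
    return min(immediate(phi, c2), one_step(phi, c2))


def evpi(phi):
    """Step 3: risk using E[theta] minus the prior-averaged informed risk."""
    informed = sum(p * risk(phi, t) for t, p in PRIOR.items())
    return risk(phi, E_THETA) - informed


def adjusted_theta(phi):
    """Type II loss discounted by the value of learning theta."""
    return E_THETA - evpi(phi)


if __name__ == "__main__":
    for phi in (F(1, 4), F(5, 14), F(1, 2), F(15, 26), F(3, 4)):
        print(phi, evpi(phi), adjusted_theta(phi))
```

At the endpoints $5/14$ and $15/26$ the immediate and one-step risks agree, which is the indifference condition used to derive the interval in the text.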

Noisy Information
Now assume we only receive noisy observations concerning θ, meaning that following observation we are not certain of its value. The procedure is similar to that for perfect information and again we first perform an SPRT without considering the value of the information.
Denoting the true value of θ as $\theta_T$ and our observation as $\theta_{ob}$, and setting $E[L_\theta]$ as the expected loss before observation (the loss from step 1, where future trials are not considered) and $E[L_\theta \mid \theta_{ob}]$ as the expected loss after observing $\theta_{ob}$, the expected value of the noisy observation is calculated, for each φ, as the difference between $E[L_\theta]$ and the expectation of $E[L_\theta \mid \theta_{ob}]$ over the possible observations. Once this has been generated the consequence of the error will be reduced in the risk table just as was the case with perfect information, allowing a classical SPRT to be performed.

Noisy Information Numerical Example
We return to the setting of Example 2.1, but now assume that the probability that the true value is observed is only 0.8, i.e., $P(\theta_{ob} = \theta_T) = 0.8$. This determines the probability of each possible observation and, after observing a value for θ, its updated expected value $E[\theta \mid \theta_{ob}]$. Note that each term in this update implicitly depends on the initial value assigned to φ. We then simply proceed as in Example 2.1 to obtain the final decision rule. The resulting loss tables correspond to the three quantities listed in Equation (23). As a result the expected value of noisy information $g(\varphi)$ is calculated, and the new expected cost of a Type II error $\theta(\varphi)$ for the noisy information example can be determined as in Equation (26). A plot of this function is given in Figure 3, which can be contrasted with the perfect information case given in Figure 2. Note that as before the minima occur at boundaries of indifference and that plateaus occur where we would always (or never) take an observation no matter the value of θ. Also note that in comparison to Figure 2, noisy information yields a larger expected cost of Type II error when the true value of θ does play a role in the decision making. This is to be expected, as noisy information is weaker and less useful than what we learn from perfect information.
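The belief update for a noisy report of θ can be sketched with Bayes' rule. The sketch below assumes a symmetric noise model (with only two candidate values, a wrong report must be the other value) and uses the prior on θ directly, ignoring the dependence on φ noted in the text; the names `likelihood`, `posterior` and `posterior_mean` are illustrative.

```python
from fractions import Fraction as F

PRIOR = {1: F(2, 3), 4: F(1, 3)}   # prior on the true value of theta
ACCURACY = F(4, 5)                 # P(theta_ob = theta_T)


def likelihood(observed, true):
    """P(theta_ob = observed | theta_T = true) under the assumed noise model."""
    return ACCURACY if observed == true else 1 - ACCURACY


def observation_probability(observed):
    """Marginal probability of receiving a given report."""
    return sum(likelihood(observed, t) * p for t, p in PRIOR.items())


def posterior(observed):
    """Posterior distribution of the true theta after one noisy report."""
    norm = observation_probability(observed)
    return {t: likelihood(observed, t) * p / norm for t, p in PRIOR.items()}


def posterior_mean(observed):
    """Updated expected Type II loss E[theta | theta_ob]."""
    return sum(t * p for t, p in posterior(observed).items())


if __name__ == "__main__":
    for ob in (1, 4):
        print(ob, observation_probability(ob), posterior_mean(ob))
```

Averaging the two posterior means over the report probabilities recovers the prior mean of 2, a useful sanity check on any such update.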

Numerical Simulation
Details of a numerical simulation are now provided. The scenario detailed in Example 2.1 was tested in R [16] by considering the outcome of 3 million trials of both the classical and adaptive frameworks.
Each classical trial consisted of:
1) A SPRT with consequence of Type I/II error of 2 and cost of observation 0.1, run repeatedly until a Type II error is made. The bounds used are those in Equation (14), namely $[5/14, 15/26]$, before value of information is considered.
2) Upon making a Type II error, the cost from that particular SPRT is stored. The value of θ is then learned and another SPRT is run using the true value for the consequence of Type II error. The two costs are added to provide the total value for that trial.
In accordance with our prior on θ, two-thirds of the trials were performed with the true value of $\theta = 1$, while the others had $\theta = 4$.
A further 3 million trials were then run using the adaptive framework under the same procedure but with different bounds in step 1. This is due to the different values used for the consequence of Type II error seen in Equation (19). The average costs are given in Table 5. As can be seen, this indicated a substantial improvement (21% with the numerical scenario here) from using the adaptive framework and formally taking such uncertainty into account.
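The classical trial described above can be sketched as follows. This is not the authors' R code: it is a small Python illustration in which the starting belief of 0.5, the equal chance of each coin being fair, the charging of the true θ when a Type II error occurs, and the second-stage bounds in `THETA_BOUNDS` (derived here by one-step lookahead) are all assumptions.

```python
import random

P_HEADS = {"fair": 0.5, "biased": 0.8}
FLIP_COST = 0.1
TYPE_I_LOSS = 2.0                      # discarding a fair coin
# Assumed one-step-lookahead bounds once theta is known (not from the paper).
THETA_BOUNDS = {1: (1 / 4, 7 / 18), 4: (1 / 2, 31 / 42)}


def update(phi, heads):
    """Posterior probability that the coin is fair after one flip."""
    like_fair = P_HEADS["fair"] if heads else 1 - P_HEADS["fair"]
    like_biased = P_HEADS["biased"] if heads else 1 - P_HEADS["biased"]
    return phi * like_fair / (phi * like_fair + (1 - phi) * like_biased)


def run_sprt(flips, phi, lower, upper, max_flips=10_000):
    """Flip while lower <= phi <= upper, then keep (phi high) or discard.

    max_flips is only a safety guard; if it is ever hit the coin is discarded.
    """
    n = 0
    while lower <= phi <= upper and n < max_flips:
        phi = update(phi, next(flips))
        n += 1
    return ("keep" if phi > upper else "discard"), n


def coin_flips(rng, is_fair):
    """Endless stream of flips (True = heads) for one coin."""
    p = P_HEADS["fair" if is_fair else "biased"]
    while True:
        yield rng.random() < p


def sprt_cost(rng, is_fair, theta, lower, upper, phi0=0.5):
    """Cost of one SPRT and whether it ended in a Type II error."""
    decision, n = run_sprt(coin_flips(rng, is_fair), phi0, lower, upper)
    type_ii = decision == "keep" and not is_fair
    cost = FLIP_COST * n
    if decision == "discard" and is_fair:
        cost += TYPE_I_LOSS
    if type_ii:
        cost += theta
    return cost, type_ii


def classical_trial(rng, theta, p_fair=0.5):
    """Step 1 until a Type II error, then step 2 with theta known."""
    while True:
        cost, type_ii = sprt_cost(rng, rng.random() < p_fair, theta, 5 / 14, 15 / 26)
        if type_ii:
            break
    lower, upper = THETA_BOUNDS[theta]
    cost2, _ = sprt_cost(rng, rng.random() < p_fair, theta, lower, upper)
    return cost + cost2


if __name__ == "__main__":
    rng = random.Random(2024)
    thetas = [1 if rng.random() < 2 / 3 else 4 for _ in range(5_000)]
    mean_cost = sum(classical_trial(rng, t) for t in thetas) / len(thetas)
    print(f"average classical cost over {len(thetas)} trials: {mean_cost:.3f}")
```

An adaptive trial would reuse the same functions with step 1 bounds computed from the discounted Type II loss of Equation (19) instead of $[5/14, 15/26]$.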

Statistical Dependence
To conclude we give a brief discussion of the effect of there being statistical dependence between the state of nature $w$ and cost parameter θ. Without loss of generality, consider a joint distribution taking on the values (and associated probabilities) given in Table 6. This implies conditional probabilities as given in Table 7. Note that this specification ensures that $w$ and θ are not independent. Now consider the implementation of the SPRT. The initial loss table when $w$ and θ were independent is given in Table 8. However, note that we can only incur losses governed by θ when the state of nature is $w_A$, so any loss that occurs in the joint distribution when $w_0$ is true should not be considered here. Also note that an equivalent scenario will occur if the uncertainties were in both Type I and Type II errors. Thus, Table 8 should be corrected to that given in Table 9, where, as can be seen, the expected value of θ remains constant independently of the value of φ. This means the SPRT will have constant losses that do not change between observations, and so we simply proceed as before.

Table 7. Implied conditional probabilities.
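The correction from Table 8 to Table 9 amounts to replacing the marginal expectation of θ with its expectation conditional on $w_A$. The sketch below illustrates this with a hypothetical joint distribution; the probabilities in `JOINT` are invented for illustration and are not the values of Table 6.

```python
from fractions import Fraction as F

# Hypothetical joint probabilities P(w, theta); NOT the values of Table 6.
JOINT = {
    ("w0", 1): F(3, 10), ("w0", 4): F(1, 5),
    ("wA", 1): F(2, 5), ("wA", 4): F(1, 10),
}


def marginal_w(w):
    """P(w), summing the joint over theta."""
    return sum(p for (state, _), p in JOINT.items() if state == w)


def marginal_theta(theta):
    """P(theta), summing the joint over w."""
    return sum(p for (_, t), p in JOINT.items() if t == theta)


def conditional_theta(w):
    """P(theta | w) for each value of theta."""
    norm = marginal_w(w)
    return {t: p / norm for (state, t), p in JOINT.items() if state == w}


def expected_theta_given(w):
    """E[theta | w]; for w = wA this is the loss entry used in the SPRT."""
    return sum(t * p for t, p in conditional_theta(w).items())


def is_independent():
    """True only if every joint entry equals the product of its marginals."""
    return all(p == marginal_w(w) * marginal_theta(t) for (w, t), p in JOINT.items())


if __name__ == "__main__":
    print("independent:", is_independent())
    print("E[theta | wA] =", expected_theta_given("wA"))
```

Because the conditional expectation depends only on the joint table and not on the current belief φ, the loss entry stays fixed as observations accumulate.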

Unknown Observation Cost
Now suppose that the costs of making a Type I ($c_1$) or Type II ($c_2$) error are known. This means that if we were to implement an immediate decision the expected loss will be unchanged from the classical setting. However, we assume the observation cost θ is uncertain but subject to some prior distribution and some specified data likelihood, in which case the expected loss of making a decision after observation will have to take into account not only the uncertainty concerning the information we may receive in relation to the true state of nature, but also the uncertainty in the additional cost of having taken a further observation.
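If we take the expected value of θ, $E[\theta]$, as the observation cost, then we can determine bounds $\varphi_L$, $\varphi_U$ on values of φ within which we should seek additional data before implementing a decision. The expected risk profile $\rho(\varphi)$ (expected loss), as a function of φ, would then be:

$$\rho(\varphi) = \begin{cases} c_1 \varphi, & \varphi < \varphi_L \\ c(\varphi), & \varphi_L \le \varphi \le \varphi_U \\ c_2 (1 - \varphi), & \varphi > \varphi_U \end{cases} \qquad (29)$$

where $c(\varphi)$ is a concave (or linear) function of φ determined by the data generating mechanism. Then, for each possible information statement $i$ we may receive (where here $i$ contains both the information concerning the true state of nature and any information we gain concerning θ, the cost of sampling), we can determine a posterior distribution on θ and updated expected value $E[\theta \mid i]$. With this we continue the SPRT, leading to updated bounds $\varphi_L'$, $\varphi_U'$ on φ for which we would take further samples; an updated belief φ′ that does not fall within them would result in our now taking an immediate decision. The updated risk profile would now have the form:

$$\rho'(\varphi') = \begin{cases} c_1 \varphi', & \varphi' < \varphi_L' \\ c'(\varphi'), & \varphi_L' \le \varphi' \le \varphi_U' \\ c_2 (1 - \varphi'), & \varphi' > \varphi_U' \end{cases} \qquad (30)$$

Here $c'$ is another concave (or linear) function in φ′.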
As the information $i$ we may receive is currently unknown, we take the expectation of Equation (30). Subtracting this from Equation (29) (the expected risk without learning information) we obtain the expected value of that information, which can be thought of as the most we would be willing to pay for it in advance of seeing it. This should now be subtracted from $E[\theta]$, the original expected observation cost, to obtain what we would use as the adaptive information cost for the adaptive SPRT. Note that this value will be a function of φ. A classical SPRT is then performed with this adaptive observation cost until the true cost has been learned, at which point the test continues with the cost uncertainty removed, i.e., in the classical way.

Numerical Example
As a toy example to aid in clarification of the above, suppose we are testing the efficacy of a drug and are certain of the costs incurred in making a Type I or Type II error (say 2 and 4 units respectively). Assume, however, that we have little experience in running clinical trials (our observation costs) and are not sure if it will be easy and cheap to organise ($\theta = 0.1$) or relatively expensive ($\theta = 0.25$). Prior beliefs are that it is more likely to be cheap, so that $P(\theta = 0.1) = 0.6$. Also suppose that the probability a bad drug passes the clinical trial is 0.5, whilst the probability that a drug that works passes is 0.8.
As we begin testing of the first drug we determine how to modify the SPRT procedure to take into account this uncertainty. Interest lies in the expected value of information of the observation cost, and we assume that the information will be of a perfect nature (namely, that it removes all uncertainty). Noting that $E[\theta] = 0.6 \times 0.1 + 0.4 \times 0.25 = 0.16$, the SPRT without information uses this value as its observation cost.
Recalling the prior on θ is such that $P(\theta = 0.1) = 0.6$, the expected value of perfect information is given by Equation (35), a plot of which is provided in Figure 4. Note that the areas where the expected value of information is zero are where the decision rule is the same regardless of the information concerning the cost of sampling, agreeing with our earlier remark, and that the expected value of sampling increases to be maximal where we are currently indifferent between making an immediate decision or taking further samples. With this to hand, we would continue by performing the SPRT as if we had an observation cost of $E[\theta]$ less the expected value of information at the current φ, and if we do take an observation we learn the true value of θ and continue the SPRT with this knowledge.
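The shape described for Figure 4 can be explored with a short Python sketch. It is a simplification under stated assumptions: it takes $w_0$ to be "the drug does not work" with $\varphi = P(w_0)$, and it uses a one-step lookahead risk (run one trial, then decide) in place of the full sequential risk, so its numbers need not match Equation (35). The function names are illustrative.

```python
C1, C2 = 2.0, 4.0                 # Type I and Type II losses
P0, PA = 0.5, 0.8                 # P(pass | w_0), P(pass | w_A): assumed mapping
PRIOR = {0.1: 0.6, 0.25: 0.4}     # prior on the observation cost theta
MEAN_COST = sum(t * p for t, p in PRIOR.items())


def risk(phi, cost):
    """One-step-lookahead risk for a given observation cost."""
    immediate = min(C1 * phi, C2 * (1 - phi))
    passed = min(C1 * phi * P0, C2 * (1 - phi) * PA)
    failed = min(C1 * phi * (1 - P0), C2 * (1 - phi) * (1 - PA))
    return min(immediate, cost + passed + failed)


def evi(phi):
    """Risk using E[theta] minus the prior-averaged risk with theta known."""
    informed = sum(p * risk(phi, t) for t, p in PRIOR.items())
    return risk(phi, MEAN_COST) - informed


def adaptive_cost(phi):
    """Observation cost used by the adaptive SPRT at belief phi."""
    return MEAN_COST - evi(phi)


if __name__ == "__main__":
    for k in range(0, 21):
        phi = k / 20
        print(f"{phi:.2f}  EVI={evi(phi):.4f}  adaptive cost={adaptive_cost(phi):.4f}")
```

In this sketch the value of information is zero wherever every candidate cost leads to the same choice between deciding and sampling, and it is positive only where the cheap and expensive costs would lead to different choices.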

Conclusions
In this paper we have considered the generalisation of SPRTs from a classical to an adaptive utility setting, where preferences or associated costs are not assumed fully known but are instead learned through experience or by funding additional information through a survey, trial, etc. Unknown cost of Type I/II error was examined first, before subsequently considering the effect of uncertain observation cost. Both perfect and noisy information were discussed, where we demonstrated methods of quantifying the value of such information, and numerical examples were provided to demonstrate the theory. Statistical dependence between the cost parameter and the state of nature was also considered and shown not to influence results. The numerical simulation indicated the enhanced performance obtained by formally treating uncertainties and opportunities to learn within a SPRT, in comparison to the somewhat easier modelling assumption of equating uncertainties in costs to their expected values.

The losses $c_1$ and $c_2$ can be associated with what is commonly described as making a Type I or Type II error, and hence the connection to sequential hypothesis testing originally considered by Wald.

Figure 1. An illustration of the losses involved within a SPRT. The x-axis varies over $\varphi = P(w_0)$, the y-axis is the associated expected loss, the solid line corresponds to making a decision immediately, while the curved dashed line corresponds to collecting further information before making a decision. Finally, the vertical dashed lines indicate the bounds on φ within which the DM should observe further data before making a decision.

Figure 2. A plot of the expected loss incurred from committing a Type II error for Example 2.1 generated from Equation (19). The x-axis varies over the prior probability for the state of nature w, whilst the y-axis indicates the resulting expected loss.


Figure 3. A plot of the expected loss incurred from committing a Type II error for Example 2.3 generated from Equation (28). The x-axis varies over the prior probability for the state of nature w, whilst the y-axis indicates the resulting expected loss.
, 0.56. The second step remains the same as the classical trial.
on φ for which we would take further samples) and also for any φ that is always contained in $[\varphi_L, \varphi_U]$.
Equation (34) (expected risk with knowledge of θ) from Equation (31) provides the expected value of perfect information for the observation cost:

Figure 4. A plot of the expected value of information in Example 3.1 given by Equation (35).

Table 1. Loss table applied within a SPRT.

Table 2. Loss table with uncertain cost of Type I error.

Table 3. Loss table assuming no learning about uncertain cost θ.

Table 4. Loss table for numerical example in Section 2.1.
$w_A$   $E[\theta]$   0

Table 5. Average costs from the simulation described in Section 2.4.

Table 6. Assumed joint distribution between w and θ.

Table 8. Initial loss table in the case of independence.

Table 9. Loss table in the case of statistical dependence.