A Bootstrapping Approach for Software Reliability Measurement Based on a Discretized NHPP Model

Discrete software reliability measurement is well suited to describing a software reliability growth process that depends on a unit of the software fault-detection period, such as the number of test runs or the number of executed test cases. This paper discusses discrete software reliability measurement based on a discretized nonhomogeneous Poisson process (NHPP) model. In particular, we use a bootstrapping method in our discrete software reliability measurement to discuss statistical inference on the parameters and software reliability assessment measures of our model. Finally, we show numerical examples of interval estimation based on our bootstrapping method for several software reliability assessment measures by using actual data.


Introduction
It is very important to measure the reliability of a software product accurately in the final stage of the software development process in order to ship a highly reliable software product. A software reliability growth model [1][2][3][4] is known as one of the useful mathematical tools for quantitative measurement or assessment of software reliability. Generally, in an actual testing phase, we observe a software reliability growth process, in which software faults are detected and removed and the number of faults remaining in the software system decreases with the test-execution time. The software reliability growth model describes this process and measures software reliability quantitatively by using software reliability assessment measures derived from the model. A large number of software reliability growth models have been proposed so far for accurate software reliability assessment. In particular, there are discretized nonhomogeneous Poisson process (discretized NHPP) models, which have good fitting and prediction performance in software reliability assessment [5,6] because they are consistent with fault counting data, which are obtained by collecting information on the frequency of software failure occurrences or the number of detected faults during each constant testing-period. The parameters of the discretized NHPP model are estimated from actual data by regression analysis based on a regression equation derived from a difference equation of the model. After the parameter estimation, software reliability assessment is performed based on the software reliability assessment measures derived from the discretized NHPP model. This approach is based on point estimation, which works well when we have a sufficient number of data.
In recent years, it has become very difficult to obtain a sufficient number of data for the point estimation method because of the quick delivery required in software development. In such a case, it is better to conduct interval estimation to account for the uncertainty of the estimators of the model parameters and software reliability assessment measures. We often use asymptotic approximation approaches [7] for interval estimation. However, even with the asymptotic approximation approach, the mathematical manipulation required for interval estimation is difficult. To overcome this problem, the bootstrap method [8] was proposed. The bootstrapping method is known as one of the useful Monte Carlo methods for obtaining probability distributions of estimators by resampling. Recently, the bootstrapping method has been applied not only to software reliability analysis [9][10][11] but also to optimal checkpoint replacement for a hardware system [12].
In this paper, we discuss an interval estimation method, based on the bootstrapping method, for the parameters and software reliability assessment measures of a discretized exponential software reliability growth model, which is one of the discretized NHPP models and has the simplest model structure. We also discuss several kinds of bootstrap confidence intervals for the interval estimations. Finally, we show numerical examples of our bootstrapping method for software reliability assessment based on the discretized exponential software reliability growth model, and of the bootstrap confidence intervals, by using actual data.

The Model
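We briefly discuss the discretized NHPP model by showing a discretized exponential software reliability growth model [5,6], which has the simplest mathematical structure. We define a discrete counting process $\{N_n, n \geq 0\}$ representing the cumulative number of faults detected up to the $n$-th testing-period. The discrete counting process $\{N_n, n \geq 0\}$ follows a discrete-time NHPP [13] if the process has the following property:

$$\Pr\{N_n = x \mid N_0 = 0\} = \frac{\{H_n\}^x}{x!} \exp[-H_n] \quad (n, x = 0, 1, 2, \ldots), \tag{1}$$

which is derived based on a continuous-time NHPP. In Equation (1), $\Pr\{A\}$ means the probability of event $A$, and $H_n$ is the mean value function of the discrete-time NHPP, that is, the expected cumulative number of faults detected up to the $n$-th testing-period. The discretized exponential software reliability growth model is derived from the difference equation

$$H_{n+1} - H_n = \delta\beta(\omega - H_n), \tag{2}$$

whose solution is

$$H_n = \omega[1 - (1 - \delta\beta)^n], \tag{3}$$

where $\omega$ is the expected initial fault content, $\beta$ is the fault-detection rate per fault, and $\delta$ is the constant time-interval. As $\delta \to 0$, Equation (3) converges to an exact solution of the original continuous-time exponential software reliability growth model, which is derived by the differential equation.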
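The convergence statement above can be checked numerically. The following is a minimal sketch, assuming that Equation (3) has the form $H_n = \omega[1 - (1 - \delta\beta)^n]$ and that the continuous-time exponential model has the mean value function $\omega(1 - e^{-\beta t})$ with $t = n\delta$. The function names and the parameter values are hypothetical and serve only as an illustration.

```python
import math


def discretized_mean(n, omega, beta, delta):
    """Assumed form of Equation (3): H_n = omega * (1 - (1 - delta*beta)**n)."""
    return omega * (1.0 - (1.0 - delta * beta) ** n)


def continuous_mean(t, omega, beta):
    """Mean value function of the continuous-time exponential model."""
    return omega * (1.0 - math.exp(-beta * t))


def discretization_error(t, omega, beta, delta):
    """Absolute gap between the two models at time t = n * delta."""
    n = round(t / delta)  # number of testing-periods that make up time t
    return abs(discretized_mean(n, omega, beta, delta)
               - continuous_mean(t, omega, beta))


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the paper's data.
    for delta in (1.0, 0.1, 0.01):
        print(delta, discretization_error(10.0, 100.0, 0.1, delta))
```

Under these assumptions the printed gap shrinks as $\delta$ becomes smaller, which is the behaviour described in the text.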
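The discretized exponential software reliability growth model in Equation (3) has two parameters, $\omega$ and $\beta$, which have to be estimated by using actual data. The parameter estimates of $\omega$ and $\beta$, $\hat{\omega}$ and $\hat{\beta}$, can be obtained by the following procedure using the method of least squares. First of all, if we have observed fault counting data $(n, y_n)$ $(n = 1, 2, \ldots, N)$, where $y_n$ represents the cumulative number of faults detected up to the $n$-th testing-period, we derive the following regression equation from Equation (2):

$$Y_n = \alpha_0 + \alpha_1 y_n, \tag{4}$$

where

$$Y_n = y_{n+1} - y_n, \quad \alpha_0 = \delta\omega\beta, \quad \alpha_1 = -\delta\beta. \tag{5}$$

Based on the regression analysis, we can estimate $\hat{\alpha}_0$ and $\hat{\alpha}_1$, which are the estimates of $\alpha_0$ and $\alpha_1$ in Equation (4). Then, the parameter estimates $\hat{\omega}$ and $\hat{\beta}$ can be obtained as

$$\hat{\omega} = -\hat{\alpha}_0 / \hat{\alpha}_1, \quad \hat{\beta} = -\hat{\alpha}_1 / \delta. \tag{6}$$

$Y_n$ in Equation (4) is independent of $\delta$ because $\delta$ is not used in calculating $Y_n$, as shown in Equation (5). Hence, we can obtain the same parameter estimates $\hat{\omega}$ and $\hat{\beta}$, respectively, when we choose any value of $\delta$ [5,6,15,16].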
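The least-squares procedure described above can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: it assumes that the regression is $Y_n = \alpha_0 + \alpha_1 y_n$ with $Y_n = y_{n+1} - y_n$, $\alpha_0 = \delta\omega\beta$, and $\alpha_1 = -\delta\beta$, and that only the observed points $y_1, \ldots, y_N$ enter the regression. The function names are hypothetical.

```python
def least_squares(x, y):
    """Return (intercept, slope) of the ordinary least-squares line y = a0 + a1*x."""
    m = len(x)
    mean_x = sum(x) / m
    mean_y = sum(y) / m
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fit_discretized_exponential(counts, delta=1.0):
    """Estimate (omega, beta) from cumulative fault counts y_1, ..., y_N."""
    regressor = counts[:-1]                                  # y_n
    response = [b - a for a, b in zip(counts, counts[1:])]   # Y_n = y_{n+1} - y_n
    a0, a1 = least_squares(regressor, response)
    omega_hat = -a0 / a1       # assumed relation: omega = -alpha_0 / alpha_1
    beta_hat = -a1 / delta     # assumed relation: beta = -alpha_1 / delta
    return omega_hat, beta_hat
```

For noise-free counts generated from $H_n$ itself the regression is exactly linear, so the sketch recovers the generating parameters up to rounding error.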

Software Reliability Assessment Measures
Software reliability assessment measures are useful in quantitative software reliability assessment. This paper discusses the expected number of remaining faults and the software reliability function, which are well-known software reliability assessment measures. The expected number of remaining faults, $M_n$, represents the expected number of undetected faults in the software system at an arbitrary testing-period $n$. Then, we have

$$M_n = \omega - H_n = \omega(1 - \delta\beta)^n, \tag{7}$$

if we assume that $\{N_n, n \geq 0\}$ follows a discrete-time NHPP with mean $H_n$ in Equation (3). The software reliability function, $R(h \mid n)$, is defined as the probability that a software failure does not occur in the time-interval $(n, n + h]$ $(h = 1, 2, \ldots)$ given that the testing has been going up to the $n$-th testing-period. Then, we have

$$R(h \mid n) = \exp[-\{H_{n+h} - H_n\}]. \tag{8}$$
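Both measures follow directly from the mean value function. The sketch below assumes $H_n = \omega[1 - (1 - \delta\beta)^n]$, $M_n = \omega - H_n$, and $R(h \mid n) = \exp[-(H_{n+h} - H_n)]$, which is the usual form for an NHPP. The function names are hypothetical.

```python
import math


def mean_value(n, omega, beta, delta=1.0):
    """Assumed mean value function H_n of the discretized exponential model."""
    return omega * (1.0 - (1.0 - delta * beta) ** n)


def expected_remaining_faults(n, omega, beta, delta=1.0):
    """M_n: expected number of faults still undetected after n testing-periods."""
    return omega - mean_value(n, omega, beta, delta)


def software_reliability(h, n, omega, beta, delta=1.0):
    """R(h | n): probability of no failure in (n, n + h] under the NHPP assumption."""
    increment = mean_value(n + h, omega, beta, delta) - mean_value(n, omega, beta, delta)
    return math.exp(-increment)
```

With $\hat{\omega}$ and $\hat{\beta}$ plugged in, the same functions give point estimates of the measures; applied to each bootstrap sample, they give the bootstrap distribution of the measures.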

Software Reliability Assessment Based on Bootstrapping Method
Ordinarily, the parameters of the discretized NHPP models are estimated by regression analysis based on the regression equation derived from the difference equation of the models. However, it is difficult to discuss statistical inference on software reliability assessment in this existing estimation approach because it is very difficult or complex to identify analytically the probability distribution function of the parameter estimators. To overcome this problem, Kimura and Fujiwara [9,10] applied non-parametric bootstrap software reliability assessment methods to an incomplete gamma function-based software reliability growth model. Kaneishi and Dohi [11] discussed a parametric bootstrap method for software reliability assessment based on continuous-time NHPP models. In this paper, we apply a non-parametric bootstrap method to the discretized NHPP model for estimating model parameters and for obtaining information for statistical inference on the parameters and software reliability assessment measures.
In particular, we discuss five types of bootstrap confidence intervals for interval estimation of the model parameters and software reliability assessment measures.

Our Bootstrapping Method
As an example for discussing our bootstrapping method based on the discretized NHPP model, we apply the discretized exponential software reliability growth model. Our bootstrap method for software reliability assessment follows the procedure below:
Step 1: Estimate the regression coefficients $\alpha_0$ and $\alpha_1$ in Equation (4) by linear regression using the fault counting data. We denote these estimates by $\hat{\alpha}_{0(0)}$ and $\hat{\alpha}_{1(0)}$, respectively.
Step 2: Calculate the residual errors at each observation point by $\hat{d}_i = Y_i - (\hat{\alpha}_{0(0)} + \hat{\alpha}_{1(0)} y_i)$.
Step 3: Construct an empirical distribution function $\hat{F}$ by assuming that the residual errors are independent and identically distributed and putting equal mass on each residual error.
Step 4: Set the total number of iterations $B$ and let $b$ $(b = 1, 2, \ldots, B)$ be the iteration count.
Step 5: Generate a bootstrap sample of the residual errors, $\hat{d}^*_{i(b)}$, by sampling with replacement from $\hat{F}$.
Step 6: Generate a bootstrap sample for $Y_i$ by $Y^*_{i(b)} = \hat{\alpha}_{0(0)} + \hat{\alpha}_{1(0)} y_i + \hat{d}^*_{i(b)}$.
Step 7: Estimate $\hat{\alpha}^*_{0(b)}$ and $\hat{\alpha}^*_{1(b)}$ by regressing $Y^*_{i(b)}$ on $y_i$.
Step 8: Calculate the parameters of the discretized exponential software reliability growth model by the following equation: $\hat{\omega}^*_{(b)} = -\hat{\alpha}^*_{0(b)} / \hat{\alpha}^*_{1(b)}$, $\hat{\beta}^*_{(b)} = -\hat{\alpha}^*_{1(b)} / \delta$.
Step 9: Calculate the software reliability assessment measures.
From the resulting bootstrap samples, we can calculate the mean and the standard deviation of the model parameters and software reliability assessment measures by the Monte Carlo approximation.
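The procedure above can be sketched as a residual bootstrap. This is a minimal illustration under stated assumptions and not the authors' code: the regression is taken to be $Y_n = \alpha_0 + \alpha_1 y_n$ with $Y_n = y_{n+1} - y_n$, the parameters are recovered as $\omega = -\alpha_0/\alpha_1$ and $\beta = -\alpha_1/\delta$, and each residual receives equal mass. All function and variable names are hypothetical.

```python
import random
from statistics import mean, stdev


def least_squares(x, y):
    """Return (intercept, slope) of the ordinary least-squares line y = a0 + a1*x."""
    m = len(x)
    mean_x = sum(x) / m
    mean_y = sum(y) / m
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def residual_bootstrap(counts, iterations=2000, delta=1.0, seed=0):
    """Return a list of bootstrap samples (omega_b, beta_b)."""
    rng = random.Random(seed)
    regressor = counts[:-1]
    response = [b - a for a, b in zip(counts, counts[1:])]
    a0, a1 = least_squares(regressor, response)               # Step 1
    fitted = [a0 + a1 * x for x in regressor]
    residuals = [y - f for y, f in zip(response, fitted)]     # Step 2
    samples = []
    for _ in range(iterations):                               # Step 4 and the loop
        # Steps 3 and 5: equal-mass resampling of residuals with replacement.
        drawn = rng.choices(residuals, k=len(residuals))
        response_star = [f + d for f, d in zip(fitted, drawn)]    # Step 6
        b0, b1 = least_squares(regressor, response_star)      # Step 7
        samples.append((-b0 / b1, -b1 / delta))               # Step 8
    return samples


def summarize(values):
    """Monte Carlo approximation of the mean and standard deviation."""
    return mean(values), stdev(values)
```

Step 9 is omitted here to keep the sketch short; the assessment measures can be computed from each pair `(omega_b, beta_b)` in the returned list.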

Bootstrap Confidence Intervals
We discuss the following three typical bootstrap confidence intervals [17]: the basic, standard normal, and percentile bootstrap confidence intervals. Further, we discuss the bootstrap-t and BCa methods [17,18] for deriving bootstrap confidence intervals that account for the asymmetry, the bias, and the skewness of the distribution of the parameter estimator. Let $\theta$ be the parameter of interest.
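The basic bootstrap confidence interval is derived by using the quantiles of the distribution of $\hat{\theta}^* - \hat{\theta}$, where $\hat{\theta}^*$ is the bootstrap statistic. We can approximate the $\alpha$ and $(1 - \alpha)$ quantiles, denoted by $v_\alpha$ and $v_{1-\alpha}$, respectively, of the distribution of $\hat{\theta} - \theta$ by

$$\hat{v}_\alpha = \hat{\theta}^*_\alpha - \hat{\theta}, \quad \hat{v}_{1-\alpha} = \hat{\theta}^*_{1-\alpha} - \hat{\theta},$$

where $\hat{\theta}^*_\alpha$ is the $\alpha$ quantile of the bootstrap statistic. Thus, the basic bootstrap confidence interval is given by

$$(2\hat{\theta} - \hat{\theta}^*_{1-\alpha},\; 2\hat{\theta} - \hat{\theta}^*_\alpha).$$

The standard normal bootstrap confidence interval is derived by assuming that the distribution of $\hat{\theta} - \theta$ can be approximated by the distribution of $\hat{\theta}^* - \hat{\theta}$ and that $(\hat{\theta}^* - \hat{\theta})/\hat{\sigma}$ follows the standard normal distribution. Thus, we have the standard normal bootstrap confidence interval as

$$(\hat{\theta} + z_\alpha \hat{\sigma},\; \hat{\theta} + z_{1-\alpha} \hat{\sigma}),$$

where $\hat{\sigma}$ is the standard deviation of the bootstrap iteration values and $z_\alpha = \Phi^{-1}(\alpha)$, which is the $\alpha$ quantile of the standard normal distribution. For example, $z_{0.025} = -1.96$.

The percentile bootstrap confidence interval is calculated from the empirical cumulative probability distribution function, which consists of the bootstrap iteration values $\hat{\theta}^*_{(1)}, \hat{\theta}^*_{(2)}, \ldots, \hat{\theta}^*_{(B)}$. Then, the percentile bootstrap confidence interval is calculated by

$$(\hat{\theta}^*_B[\alpha],\; \hat{\theta}^*_B[1 - \alpha]),$$

where $\hat{\theta}^*_B[\alpha]$ represents the $\alpha$ quantile of the empirical cumulative probability distribution function.

Further, we discuss a bootstrap-t method, which enables us to take into consideration the variance of $\hat{\theta}$ by deriving $T = (\hat{\theta} - \theta)/\hat{\sigma}$, where $\hat{\sigma}^2$ is the variance of $\hat{\theta}$. Letting $u_\alpha$ and $u_{1-\alpha}$ be the $\alpha$ and $(1 - \alpha)$ quantiles of the distribution of $T$, we have

$$\Pr\{\hat{\theta} - u_{1-\alpha}\hat{\sigma} \leq \theta \leq \hat{\theta} - u_\alpha\hat{\sigma}\} = 1 - 2\alpha.$$

In the above equation, we substitute for $u_\alpha$ and $u_{1-\alpha}$ the values $t^*_\alpha$ and $t^*_{1-\alpha}$, which are the $\alpha$ and $(1 - \alpha)$ quantiles of the bootstrap-t statistic $T^* = (\hat{\theta}^*_{(b)} - \hat{\theta})/\hat{\sigma}^*_{(b)}$. Then, the bootstrap-t confidence interval is derived as

$$(\hat{\theta} - t^*_{1-\alpha}\hat{\sigma},\; \hat{\theta} - t^*_\alpha\hat{\sigma}). \tag{12}$$

In Equation (12), $\hat{\sigma}$ is the standard deviation of $\hat{\theta}^*_{(b)}$, and $t^*_\alpha$ is estimated from the bootstrap-t statistics $T^*$.

We also discuss the BCa method for obtaining a better bootstrap confidence interval that accounts for the asymmetry, the bias, and the skewness of the probability distribution of the estimator. The BCa confidence interval is given as

$$(\hat{G}^{-1}(\Phi(z[\alpha])),\; \hat{G}^{-1}(\Phi(z[1 - \alpha]))),$$

where

$$z[\alpha] = \hat{z}_0 + \frac{\hat{z}_0 + z_\alpha}{1 - \hat{a}(\hat{z}_0 + z_\alpha)}, \quad \hat{z}_0 = \Phi^{-1}(\hat{G}(\hat{\theta})).$$

In the above equation, $\hat{G}$ is the bootstrap distribution of the estimator, and $\hat{a}$ is the acceleration constant derived as

$$\hat{a} = \frac{\sum_{i=1}^{N} (\bar{\theta}_{(\cdot)} - \hat{\theta}_{(i)})^3}{6\left\{\sum_{i=1}^{N} (\bar{\theta}_{(\cdot)} - \hat{\theta}_{(i)})^2\right\}^{3/2}},$$

where $\hat{\theta}_{(i)}$ is a jackknife iteration value, which is estimated by using the data with the $i$-th datum removed, and $\bar{\theta}_{(\cdot)}$ is the mean of the jackknife iteration values.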
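Four of the intervals discussed above can be computed from a list of bootstrap values with the Python standard library alone. The sketch below uses the textbook forms of the basic, standard normal, percentile, and BCa intervals, which may differ in detail from the authors' formulas. The quantile rule (linear interpolation between order statistics) and all function names are assumptions made for illustration. The bootstrap-t interval is left out because it needs a standard error for every bootstrap replicate.

```python
from statistics import NormalDist, mean, stdev


def quantile(sorted_values, p):
    """Empirical quantile with linear interpolation between order statistics."""
    position = p * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    weight = position - lower
    return sorted_values[lower] * (1.0 - weight) + sorted_values[upper] * weight


def simple_intervals(theta_hat, boot, alpha=0.025):
    """Basic, standard normal, and percentile intervals from bootstrap values."""
    ordered = sorted(boot)
    q_low = quantile(ordered, alpha)
    q_high = quantile(ordered, 1.0 - alpha)
    z = NormalDist().inv_cdf(1.0 - alpha)
    sigma = stdev(boot)
    return {
        "basic": (2.0 * theta_hat - q_high, 2.0 * theta_hat - q_low),
        "normal": (theta_hat - z * sigma, theta_hat + z * sigma),
        "percentile": (q_low, q_high),
    }


def bca_interval(theta_hat, boot, jackknife, alpha=0.025):
    """BCa interval; `jackknife` holds the leave-one-out estimates."""
    normal = NormalDist()
    ordered = sorted(boot)
    # Bias correction: requires that 0 < share of bootstrap values below theta_hat < 1.
    below = sum(1 for value in boot if value < theta_hat) / len(boot)
    z0 = normal.inv_cdf(below)
    # Acceleration constant: requires jackknife values that are not all equal.
    centre = mean(jackknife)
    numerator = sum((centre - v) ** 3 for v in jackknife)
    denominator = 6.0 * sum((centre - v) ** 2 for v in jackknife) ** 1.5
    accel = numerator / denominator

    def adjusted(p):
        z = normal.inv_cdf(p)
        return normal.cdf(z0 + (z0 + z) / (1.0 - accel * (z0 + z)))

    return quantile(ordered, adjusted(alpha)), quantile(ordered, adjusted(1.0 - alpha))
```

When the bootstrap distribution is symmetric about $\hat{\theta}$, the basic and percentile intervals coincide; they separate as the distribution becomes skewed.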

Numerical Examples
We show numerical examples for our bootstrap software reliability assessment method based on the discretized exponential software reliability growth model.
We apply fault counting data $(n, y_n)$ $(n = 1, 2, \ldots, 25)$ [1] and set the total number of iterations $B = 2000$. We first obtain the initial regression estimates by linear regression from the actual data. Following our bootstrapping method, we have 2000 bootstrap samples of the regression coefficients. Then, we obtain bootstrap samples for $\hat{\omega}$ and $\hat{\beta}$.
Copyright © 2013 SciRes. JSEA
The bootstrap samples for the software reliability assessment measures are obtained from Equations (7) and (8), respectively. Further, Table 2 shows the results of interval estimations based on the basic, standard normal, percentile, bootstrap-t, and BCa methods, respectively, with the 5% significance level ($\alpha = 0.025$). The confidence intervals for $\hat{M}_{25}$ are notably different from each other. These results are caused by assuming symmetric distributions in deriving these bootstrap confidence intervals. In general, however, the probability distribution of an estimator is asymmetric, and the accuracy of the approximation is influenced by the bias and the skewness of that distribution. As shown in Figure 3, the bootstrap distribution of $\hat{M}_{25}$ is asymmetric. On the other hand, the percentile bootstrap confidence interval gives us an appropriate interval estimate of $\hat{M}_{25}$ because the percentile method is based only on the bootstrap distribution and does not assume a symmetric distribution. The interval estimate based on the BCa method can be regarded as even more appropriate for $\hat{M}_{25}$ because the BCa method is an improvement of the percentile bootstrap confidence interval.

Conclusions
This paper discussed a bootstrap software reliability assessment method based on a discretized exponential software reliability growth model. We also discussed five types of bootstrap confidence intervals for interval estimation of the model parameters and several software reliability assessment measures.
In our numerical examples, we confirmed that our bootstrap approach gives the probability distributions of each parameter and software reliability assessment measure numerically even if we do not derive these probability distributions analytically, and that we can obtain useful information for software reliability assessment, such as the results of interval estimations of the model parameters, the number of remaining faults, and software reliability. This approach is very useful when we cannot collect a sufficient number of data and have to conduct interval estimation for complex estimators. However, regarding bootstrap confidence intervals, we encountered the problem that we could not get appropriate interval estimates from the basic, standard normal, and bootstrap-t confidence intervals for the number of remaining faults at the termination time of the testing. This problem was solved by using other bootstrap confidence intervals, namely the percentile and the BCa confidence intervals. In future studies, we are going to apply our bootstrap approach to estimating the optimal software release time and to other practical software project management issues.
Step 10: Go back to Step 5 if $b < B$.
Step 11: We have $B$ samples for $\hat{\omega}$, $\hat{\beta}$, and the software reliability assessment measures.

Figures 1 and 2 show histograms of the bootstrap samples $\hat{\omega}^*_{(b)}$ and $\hat{\beta}^*_{(b)}$ to illustrate the bootstrap distributions of $\hat{\omega}$ and $\hat{\beta}$, respectively. We also have bootstrap samples for the software reliability assessment measures, such as the expected number of remaining faults at the termination time of the testing, $\hat{M}_{25}$, and the software reliability.

Figures 3 and 4 show histograms of the bootstrap samples for $\hat{M}_{25}$ and the software reliability, respectively. Further, Table 1 indicates the means and the standard deviations of the estimators. Figure 5 shows the estimated discretized exponential software reliability growth model, $\hat{H}_n$, in which we use the means of the bootstrap samples $\hat{\omega}^*_{(b)}$ and $\hat{\beta}^*_{(b)}$ as the point estimates, respectively. The means of $\hat{\omega}^*_{(b)}$ and $\hat{\beta}^*_{(b)}$, which are denoted by $\bar{\omega}$ and $\bar{\beta}$, are calculated by

$$\bar{\omega} = \frac{1}{B}\sum_{b=1}^{B} \hat{\omega}^*_{(b)}, \quad \bar{\beta} = \frac{1}{B}\sum_{b=1}^{B} \hat{\beta}^*_{(b)}.$$

Figure 3. Bootstrap distribution of the expected number of remaining faults at n = 25, $\hat{M}_{25}$.

From Table 2, we can see that the inappropriate confidence intervals for $\hat{M}_{25}$ are notably different from each other.