Discrete software reliability measurement is well suited to describing a software reliability growth process that depends on a unit of the software fault-detection period, such as the number of test runs or the number of executed test cases. This paper discusses discrete software reliability measurement based on a discretized nonhomogeneous Poisson process (NHPP) model. In particular, we use a bootstrapping method to carry out statistical inference on the parameters and software reliability assessment measures of our model. Finally, we show numerical examples of interval estimation based on our bootstrapping method for several software reliability assessment measures by using actual data.

It is very important to measure the reliability of a software product accurately in the final stage of the software development process in order to ship a highly reliable software product. A software reliability growth model [1-4] is known as one of the useful mathematical tools for quantitative measurement or assessment of software reliability. Generally, in an actual testing phase, we observe a software reliability growth process, in which software faults are detected and removed and the number of faults remaining in the software system decreases with the test-execution time. The software reliability growth model describes this process and measures software reliability quantitatively by using software reliability assessment measures derived from the model. A large number of software reliability growth models have been proposed so far for accurate software reliability assessment. In particular, discretized nonhomogeneous Poisson process (discretized NHPP) models have good fitting and prediction performance in software reliability assessment [5,6] because they are consistent with fault counting data, which are obtained by collecting information on the frequency of software failure occurrences or the number of detected faults during each constant testing-period. The parameters of a discretized NHPP model are estimated from actual data by regression analysis based on a regression equation derived from the difference equation of the model. After the parameter estimation, software reliability assessment is performed based on the software reliability assessment measures derived from the discretized NHPP model. This approach is based on point estimation, which is best used when a sufficient amount of data is available.

In recent years, it has become very difficult to obtain a sufficient amount of data for the point estimation method because of the short delivery cycles of software development. In such a case, it is better to conduct interval estimation, which takes into account the uncertainty of the estimators of the model parameters and software reliability assessment measures. We often use asymptotic approximation approaches [

In this paper, we discuss a bootstrap-based interval estimation method for the parameters and software reliability assessment measures of a discretized exponential software reliability growth model, which is one of the discretized NHPP models and has the simplest model structure. We also discuss several kinds of bootstrap confidence intervals for the interval estimations. Finally, we show numerical examples of our bootstrapping method for software reliability assessment, based on the discretized exponential software reliability growth model and the bootstrap confidence intervals, by using actual data.

We briefly outline the discretized NHPP model by presenting a discretized exponential software reliability growth model [5,6], which has the simplest mathematical structure. We define a discrete counting process $\{N_n, n = 0, 1, 2, \ldots\}$ representing the cumulative number of faults detected up to the $n$th testing-period. This discrete counting process follows a discrete-time NHPP [

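which is derived based on a continuous-time NHPP. Equation (1) is the Poisson probability $\Pr\{N_n = x\} = \dfrac{\{\Lambda_n\}^x}{x!}\exp[-\Lambda_n]$ $(n, x = 0, 1, 2, \ldots)$, in which $\Pr\{A\}$ means the probability of event $A$ and $\Lambda_n$ is the mean value function of the discrete-time NHPP. The mean value function, $\Lambda_n$, represents the expected cumulative number of faults detected up to the $n$th testing-period.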

The discretized exponential software reliability growth model is a discrete analog of the original (continuous-time) exponential software reliability growth model. Let $D_n$ denote the mean value function of the discretized exponential software reliability growth model. The model is given as

$$D_{n+1} - D_n = \delta b\,(a - D_n) \tag{2}$$

from the basic assumptions of the original exponential software reliability growth model. In Equation (2), $\delta$ represents the constant time-interval, $a$ the expected total number of potential faults to be detected in an infinitely long duration (the expected initial fault content), and $b$ the fault detection rate per fault. Regarding the discretization method, we use Hirota's bilinearization methods [

As $\delta \to 0$, Equation (3), that is, the solution $D_n = a[1 - (1 - \delta b)^n]$ of Equation (2), converges to the exact solution of the original continuous-time exponential software reliability growth model, which is derived from the differential equation.
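The convergence claim can be checked numerically. The following is a minimal sketch, assuming the closed form $D_n = a[1 - (1 - \delta b)^n]$ for the discretized model and $a(1 - e^{-bt})$ for the continuous-time exponential model; the function names and the parameter values are illustrative assumptions and do not come from the paper.

```python
import math


def discretized_exponential_mvf(n, a, b, delta=1.0):
    """Assumed closed form D_n = a * (1 - (1 - delta*b)**n), with D_0 = 0."""
    return a * (1.0 - (1.0 - delta * b) ** n)


def continuous_exponential_mvf(t, a, b):
    """Continuous-time exponential model: a * (1 - exp(-b*t))."""
    return a * (1.0 - math.exp(-b * t))


if __name__ == "__main__":
    a, b, t = 100.0, 0.1, 10.0  # illustrative values, not taken from the paper
    for delta in (1.0, 0.1, 0.001):
        n = round(t / delta)  # number of periods of length delta up to time t
        print(delta,
              discretized_exponential_mvf(n, a, b, delta),
              continuous_exponential_mvf(t, a, b))
```

As $\delta$ shrinks with $t = n\delta$ held fixed, the printed discrete values approach the continuous one.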

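The discretized exponential software reliability growth model in Equation (3) has two parameters, $a$ and $b$, which have to be estimated by using actual data. The parameter estimates of $a$ and $b$, denoted by $\hat{a}$ and $\hat{b}$, can be obtained by the following procedure using the method of least squares. First, if we observe $N$ fault counting data pairs $(n, y_n)$ $(n = 1, 2, \ldots, N)$, where $y_n$ represents the cumulative number of faults detected up to the $n$th testing-period, we derive the following regression equation from Equation (2):

$$Y_n = A + B D_n, \tag{4}$$

where

$$Y_n = D_{n+1} - D_n, \qquad A = \delta a b, \qquad B = -\delta b. \tag{5}$$

Based on the regression analysis, with the observed $y_n$ substituted for $D_n$, we can obtain $\hat{A}$ and $\hat{B}$, which are the estimates of $A$ and $B$ in Equation (4). Then, the parameter estimates, $\hat{a}$ and $\hat{b}$, can be obtained as

$$\hat{a} = -\hat{A}/\hat{B}, \qquad \hat{b} = -\hat{B}/\delta. \tag{6}$$

$Y_n$ in Equation (4) is independent of $\delta$ because $\delta$ is not used in calculating $Y_n$, as shown in Equation (5). Hence, we obtain the same estimates $\hat{A}$ and $\hat{B}$, and therefore the same $\hat{a}$ and the same product $\delta\hat{b}$, whichever value of $\delta$ we choose [5,6,15,16].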
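The least-squares procedure can be sketched in a few lines. This is a minimal sketch under assumptions: the regression is taken to be $Y_n = A + B D_n$ with $Y_n = D_{n+1} - D_n$, $A = \delta a b$ and $B = -\delta b$, the observed cumulative counts are substituted for $D_n$, and only consecutive observed pairs are used. The function name `fit_discretized_exponential` is hypothetical.

```python
def fit_discretized_exponential(counts, delta=1.0):
    """Least-squares fit of Y_n = A + B * D_n from cumulative fault counts.

    counts: observed cumulative numbers of detected faults, in period order.
    Returns (A_hat, B_hat, a_hat, b_hat).
    """
    x = counts[:-1]  # observed counts standing in for D_n
    y = [counts[i + 1] - counts[i] for i in range(len(counts) - 1)]  # Y_n
    m = len(x)
    x_bar = sum(x) / m
    y_bar = sum(y) / m
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    B_hat = sxy / sxx
    A_hat = y_bar - B_hat * x_bar
    a_hat = -A_hat / B_hat      # assumed relation: A = delta*a*b, B = -delta*b
    b_hat = -B_hat / delta
    return A_hat, B_hat, a_hat, b_hat


if __name__ == "__main__":
    # Noise-free synthetic data generated with a = 100, b = 0.1, delta = 1.
    synthetic = [100.0 * (1 - 0.9 ** n) for n in range(1, 11)]
    print(fit_discretized_exponential(synthetic))
```

On noise-free data the fit recovers the generating parameters, and changing `delta` rescales only `b_hat`, leaving `a_hat` and `delta * b_hat` unchanged.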

Software reliability assessment measures are useful in quantitative software reliability assessment. This paper discusses the expected number of remaining faults and the software reliability function, which are well-known software reliability assessment measures. The expected number of remaining faults, $M_n$, represents the expected number of undetected faults in the software system at an arbitrary testing-period $n$. Then, we have

$$M_n = a - D_n = a(1 - \delta b)^n \tag{7}$$

if we assume that $N_n$ follows a discrete-time NHPP with mean $D_n$ in Equation (3). The software reliability function, $R(h \mid n)$, is defined as the probability that a software failure does not occur in the time-interval $(n, n+h]$ $(h = 1, 2, \ldots)$ given that the testing has been going on up to the $n$th testing-period. Then, we have

$$R(h \mid n) = \exp[-\{D_{n+h} - D_n\}] = \exp\left[-a(1 - \delta b)^n\{1 - (1 - \delta b)^h\}\right]. \tag{8}$$
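As a rough illustration of the two assessment measures, the sketch below assumes the closed forms $M_n = a(1 - \delta b)^n$ and $R(h \mid n) = \exp[-(D_{n+h} - D_n)]$, which follow from the NHPP assumption with $D_n = a[1 - (1 - \delta b)^n]$. The function names and the numbers in the demo are illustrative assumptions.

```python
import math


def mean_value(n, a, b, delta=1.0):
    """Assumed mean value function D_n = a * (1 - (1 - delta*b)**n)."""
    return a * (1.0 - (1.0 - delta * b) ** n)


def expected_remaining_faults(n, a, b, delta=1.0):
    """M_n = a - D_n = a * (1 - delta*b)**n."""
    return a * (1.0 - delta * b) ** n


def software_reliability(h, n, a, b, delta=1.0):
    """R(h | n): probability of no failure in (n, n+h] under the NHPP."""
    return math.exp(-(mean_value(n + h, a, b, delta) - mean_value(n, a, b, delta)))


if __name__ == "__main__":
    a, b = 100.0, 0.1  # illustrative values, not taken from the paper
    print(expected_remaining_faults(20, a, b))
    print(software_reliability(1, 20, a, b))
```

Both measures are plain functions of $(a, b)$, so they can be evaluated for point estimates or for each bootstrap replicate alike.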

Ordinarily, the parameters of the discretized NHPP models are estimated by regression analysis based on the regression equation derived from the difference equation of the models. However, it is difficult to discuss statistical inference on software reliability assessment in this estimation approach because the probability distribution of the parameter estimators is very difficult or complex to identify analytically. To overcome this problem, Kimura and Fujiwara [9,10] applied non-parametric bootstrap software reliability assessment methods to an incomplete gamma function-based software reliability growth model. Kaneishi and Dohi [

As an example for discussing our bootstrapping method based on the discretized NHPP model, we apply the discretized exponential software reliability growth model. Our bootstrap method for software reliability assessment proceeds as follows:

Step 1: Estimate $A$ and $B$ in Equation (4) based on the linear regression scheme by using the fault counting data. We denote the estimates by $\hat{A}_0$ and $\hat{B}_0$, respectively.

Step 2: Calculate the residual errors, $\hat{w}_n$, at each observation point by

$$\hat{w}_n = Y_n - (\hat{A}_0 + \hat{B}_0 D_n).$$

Step 3: Construct an empirical distribution function $\hat{F}$ by assuming that the residual errors are independent and identically distributed and putting equal mass at each ordered point $\hat{w}_{(1)} \le \hat{w}_{(2)} \le \cdots$.

Step 4: Set the total number of iterations $B$ and let $b$ $(b = 1, 2, \ldots, B)$ be the iteration count.

Step 5: Generate a bootstrap sample of the residual errors, $w^*_{b,n}$, by sampling with replacement from $\hat{F}$.

Step 6: Generate a bootstrap sample $Y^*_{b,n}$ by

$$Y^*_{b,n} = \hat{A}_0 + \hat{B}_0 D_n + w^*_{b,n}.$$

Step 7: Estimate $\hat{A}_b$ and $\hat{B}_b$ from the bootstrap sample.

Step 8: Calculate the parameters of the discretized exponential software reliability growth model by the following equation:

$$\hat{a}_b = -\hat{A}_b/\hat{B}_b, \qquad \hat{b}_b = -\hat{B}_b/\delta.$$

Step 9: Calculate the software reliability assessment measures.

Step 10: Let $b = b + 1$ and go back to Step 5 if $b < B$.

Step 11: We have $B$ samples for $\hat{a}$, $\hat{b}$, and the software reliability assessment measures.

From these bootstrap samples, we can calculate the mean and the standard deviation of the model parameters and of the software reliability assessment measures by Monte Carlo approximation.
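The procedure in Steps 1 to 11 can be sketched as a residual bootstrap around the fitted regression line. This is a minimal sketch under assumptions: the regression form $Y_n = A + B D_n$ with observed counts substituted for $D_n$, the relations $\hat{a} = -\hat{A}/\hat{B}$ and $\hat{b} = -\hat{B}/\delta$, and equal-mass resampling of residuals. The names `ols` and `residual_bootstrap` and the demo counts are hypothetical.

```python
import random
import statistics


def ols(x, y):
    """Ordinary least squares for y = A + B*x; returns (A, B)."""
    m = len(x)
    x_bar, y_bar = sum(x) / m, sum(y) / m
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def residual_bootstrap(counts, delta=1.0, n_boot=2000, seed=0):
    """Return n_boot bootstrap pairs (a_b, b_b) from cumulative fault counts."""
    rng = random.Random(seed)
    x = counts[:-1]
    y = [counts[i + 1] - counts[i] for i in range(len(counts) - 1)]
    A0, B0 = ols(x, y)                                    # Step 1
    fitted = [A0 + B0 * xi for xi in x]
    resid = [yi - fi for yi, fi in zip(y, fitted)]        # Step 2
    samples = []
    for _ in range(n_boot):                               # Steps 4 and 10
        w = rng.choices(resid, k=len(resid))              # Steps 3 and 5
        y_star = [fi + wi for fi, wi in zip(fitted, w)]   # Step 6
        A_b, B_b = ols(x, y_star)                         # Step 7
        samples.append((-A_b / B_b, -B_b / delta))        # Step 8
    return samples                                        # Step 11


if __name__ == "__main__":
    demo = [12, 22, 30, 37, 42, 47, 50, 53, 55, 57]  # made-up counts
    boot = residual_bootstrap(demo, n_boot=500)
    a_vals = [s[0] for s in boot]
    b_vals = [s[1] for s in boot]
    # Monte Carlo approximations of the mean and standard deviation.
    print(statistics.mean(a_vals), statistics.stdev(a_vals))
    print(statistics.mean(b_vals), statistics.stdev(b_vals))
```

Step 9 is omitted here: any assessment measure can be computed from each returned pair $(\hat{a}_b, \hat{b}_b)$ afterwards. Fixing the seed keeps the resampling reproducible.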

We discuss the following three typical bootstrap confidence intervals [

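The basic bootstrap confidence interval is derived by using the quantiles of the distribution of $\hat{\theta}^* - \hat{\theta}$, where $\hat{\theta}$ is the estimator of a quantity $\theta$ and $\hat{\theta}^*$ is the bootstrap statistic. We can approximate the $\alpha$ and $1-\alpha$ quantiles, denoted by $v_{\alpha}$ and $v_{1-\alpha}$, respectively, of the distribution of $\hat{\theta} - \theta$ by $\hat{\theta}^*_{[B\alpha]} - \hat{\theta}$ and $\hat{\theta}^*_{[B(1-\alpha)]} - \hat{\theta}$, where $\hat{\theta}^*_{[k]}$ denotes the $k$th smallest of the $B$ bootstrap values. Then,

$$\Pr\{v_{\alpha} \le \hat{\theta} - \theta \le v_{1-\alpha}\} = 1 - 2\alpha.$$

Thus, the basic bootstrap confidence interval is given by

$$\left[\,2\hat{\theta} - \hat{\theta}^*_{[B(1-\alpha)]},\ 2\hat{\theta} - \hat{\theta}^*_{[B\alpha]}\,\right].$$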

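The standard normal bootstrap confidence interval is derived by assuming that the distribution of $\hat{\theta} - \theta$ can be approximated by the distribution of $\hat{\theta}^* - \hat{\theta}$ and that this distribution is normal with mean zero and standard deviation $SD[\hat{\theta}^*]$. That is,

$$\Pr\left\{z_{\alpha} \le \frac{\hat{\theta} - \theta}{SD[\hat{\theta}^*]} \le z_{1-\alpha}\right\} \approx 1 - 2\alpha.$$

Thus, we have the standard normal bootstrap confidence interval as

$$\left[\,\hat{\theta} - z_{1-\alpha}\,SD[\hat{\theta}^*],\ \hat{\theta} - z_{\alpha}\,SD[\hat{\theta}^*]\,\right],$$

where $z_{\alpha}$ is $\Phi^{-1}(\alpha)$, which is the $\alpha$ quantile of the standard normal distribution. For example, $z_{0.025} \approx -1.96$.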

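The percentile bootstrap confidence interval is calculated from the empirical cumulative probability distribution function, which consists of the bootstrap iteration values $\hat{\theta}^*_1, \hat{\theta}^*_2, \ldots, \hat{\theta}^*_B$. Then, the percentile bootstrap confidence interval is calculated by

$$\left[\,\hat{\theta}^*_{[B\alpha]},\ \hat{\theta}^*_{[B(1-\alpha)]}\,\right],$$

where $\hat{\theta}^*_{[B\alpha]}$ represents the $\alpha$ quantile of the empirical cumulative probability distribution function.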
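The three intervals above can be computed directly from a list of bootstrap values. The sketch below uses the standard textbook forms: basic $[2\hat{\theta} - q_{1-\alpha},\ 2\hat{\theta} - q_{\alpha}]$, standard normal $\hat{\theta} \pm z_{1-\alpha}\,SD$, and percentile $[q_{\alpha},\ q_{1-\alpha}]$, where $q_p$ is an order statistic of the bootstrap values. The quantile convention (the $\lceil pB \rceil$th smallest value) and all function names are assumptions, not the paper's definitions.

```python
import math
from statistics import NormalDist, stdev


def order_quantile(sorted_vals, p):
    """p-quantile taken as the ceil(p*B)-th smallest of B sorted values."""
    k = min(len(sorted_vals), max(1, math.ceil(p * len(sorted_vals))))
    return sorted_vals[k - 1]


def basic_ci(theta_hat, boot, alpha=0.05):
    """Basic bootstrap interval: reflect the bootstrap quantiles around theta_hat."""
    s = sorted(boot)
    return (2 * theta_hat - order_quantile(s, 1 - alpha),
            2 * theta_hat - order_quantile(s, alpha))


def normal_ci(theta_hat, boot, alpha=0.05):
    """Standard normal interval using the bootstrap standard deviation."""
    z = NormalDist().inv_cdf(1 - alpha)
    sd = stdev(boot)
    return (theta_hat - z * sd, theta_hat + z * sd)


def percentile_ci(boot, alpha=0.05):
    """Percentile interval: quantiles of the bootstrap values themselves."""
    s = sorted(boot)
    return (order_quantile(s, alpha), order_quantile(s, 1 - alpha))


if __name__ == "__main__":
    boot = [float(i) for i in range(1, 201)]  # made-up bootstrap values
    print(basic_ci(100.0, boot), normal_ci(100.0, boot), percentile_ci(boot))
```

The basic and standard normal intervals are built around $\hat{\theta}$ and can extend below zero even for a nonnegative quantity, whereas the percentile interval always stays within the range of the bootstrap values.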

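Further, we discuss a bootstrap-t method, which enables us to take into consideration the variance of $\hat{\theta}$ by deriving $T = (\hat{\theta} - \theta)/\sqrt{\hat{V}}$, where $\hat{V}$ is the variance of $\hat{\theta}$. Letting $t_{\alpha}$ and $t_{1-\alpha}$ be the $\alpha$ and $1-\alpha$ quantiles of $T$, we have

$$\Pr\{t_{\alpha} \le T \le t_{1-\alpha}\} = 1 - 2\alpha.$$

In the above equation, we substitute $t^*_{\alpha}$ and $t^*_{1-\alpha}$, which are the $\alpha$ and $1-\alpha$ quantiles of $T^* = (\hat{\theta}^* - \hat{\theta})/\sqrt{\hat{V}^*}$, for $t_{\alpha}$ and $t_{1-\alpha}$, which are the $\alpha$ and $1-\alpha$ quantiles of $T$. Then, the bootstrap-t confidence interval is derived as

$$\left[\,\hat{\theta} - t^*_{1-\alpha}\,SD[\hat{\theta}],\ \hat{\theta} - t^*_{\alpha}\,SD[\hat{\theta}]\,\right]. \tag{12}$$

In Equation (12), $SD[\hat{\theta}]$ is the standard deviation of $\hat{\theta}$ and is estimated by the bootstrap-t statistics.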
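A bootstrap-t interval can be sketched as follows, using the standard studentized form $[\hat{\theta} - t^*_{1-\alpha}\,SD,\ \hat{\theta} - t^*_{\alpha}\,SD]$. How the per-replicate standard errors are obtained is not specified in the text, so this sketch takes them as inputs (an assumption; a nested bootstrap is one common way to get them). The function names are hypothetical.

```python
import math


def order_quantile(sorted_vals, p):
    """p-quantile taken as the ceil(p*B)-th smallest of B sorted values."""
    k = min(len(sorted_vals), max(1, math.ceil(p * len(sorted_vals))))
    return sorted_vals[k - 1]


def bootstrap_t_ci(theta_hat, se_hat, boot_thetas, boot_ses, alpha=0.05):
    """Bootstrap-t interval.

    theta_hat, se_hat: point estimate and its standard error.
    boot_thetas, boot_ses: estimate and standard error for each replicate
    (the standard errors are assumed to be supplied by the caller).
    """
    # Studentized bootstrap statistics T*_b = (theta*_b - theta_hat) / se*_b.
    t_star = sorted((tb - theta_hat) / sb
                    for tb, sb in zip(boot_thetas, boot_ses))
    t_lo = order_quantile(t_star, alpha)
    t_hi = order_quantile(t_star, 1 - alpha)
    # The upper T* quantile sets the lower limit, and vice versa.
    return (theta_hat - t_hi * se_hat, theta_hat - t_lo * se_hat)


if __name__ == "__main__":
    thetas = [float(i) for i in range(1, 201)]  # made-up replicates
    ses = [1.0] * len(thetas)                   # made-up standard errors
    print(bootstrap_t_ci(100.0, 1.0, thetas, ses))
```

With all standard errors equal, the result coincides with the basic bootstrap interval, which is a quick sanity check on an implementation.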

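We also discuss the BCa (bias-corrected and accelerated) method, which gives a better bootstrap confidence interval by accounting for the asymmetry, the bias, and the skewness of the probability distribution of the estimator. The BCa confidence interval can be given as

$$\left[\,\hat{\theta}^*_{[B\alpha_1]},\ \hat{\theta}^*_{[B\alpha_2]}\,\right],$$

where

$$\alpha_1 = \Phi\left(\hat{z}_0 + \frac{\hat{z}_0 + z_{\alpha}}{1 - \hat{c}\,(\hat{z}_0 + z_{\alpha})}\right), \qquad \alpha_2 = \Phi\left(\hat{z}_0 + \frac{\hat{z}_0 + z_{1-\alpha}}{1 - \hat{c}\,(\hat{z}_0 + z_{1-\alpha})}\right).$$

In the above equations, $\Phi$ is the standard normal distribution function, $z_{\alpha}$ is its $\alpha$ quantile, and $\hat{z}_0 = \Phi^{-1}(\hat{G}(\hat{\theta}))$, in which $\hat{G}$ is the bootstrap distribution for the estimator. $\hat{c}$ is the acceleration constant derived as

$$\hat{c} = \frac{\sum_{i=1}^{N} \left(\bar{\theta}_{(\cdot)} - \hat{\theta}_{(i)}\right)^3}{6\left\{\sum_{i=1}^{N} \left(\bar{\theta}_{(\cdot)} - \hat{\theta}_{(i)}\right)^2\right\}^{3/2}},$$

where $\hat{\theta}_{(i)}$ is a jackknife iteration value, which is estimated by using the data with the $i$th datum removed, and $\bar{\theta}_{(\cdot)} = \frac{1}{N}\sum_{i=1}^{N} \hat{\theta}_{(i)}$.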
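The BCa adjustment can be sketched with the standard textbook formulas: a bias correction $z_0 = \Phi^{-1}(\hat{G}(\hat{\theta}))$, a jackknife-based acceleration constant, and adjusted quantile levels $\alpha_1$ and $\alpha_2$. The function names and the quantile convention are assumptions; the jackknife values are taken as an input rather than recomputed here.

```python
import math
from statistics import NormalDist


def order_quantile(sorted_vals, p):
    """p-quantile taken as the ceil(p*B)-th smallest of B sorted values."""
    k = min(len(sorted_vals), max(1, math.ceil(p * len(sorted_vals))))
    return sorted_vals[k - 1]


def bca_levels(theta_hat, boot, jackknife, alpha=0.05):
    """Return the adjusted quantile levels (alpha_1, alpha_2)."""
    nd = NormalDist()
    # Bias correction: share of bootstrap values below the point estimate.
    # inv_cdf raises an error unless this share is strictly between 0 and 1.
    share = sum(1 for v in boot if v < theta_hat) / len(boot)
    z0 = nd.inv_cdf(share)
    # Acceleration constant from leave-one-out (jackknife) estimates.
    jk_mean = sum(jackknife) / len(jackknife)
    num = sum((jk_mean - v) ** 3 for v in jackknife)
    den = 6.0 * sum((jk_mean - v) ** 2 for v in jackknife) ** 1.5
    acc = num / den if den > 0 else 0.0

    def adjust(z):
        return nd.cdf(z0 + (z0 + z) / (1.0 - acc * (z0 + z)))

    return adjust(nd.inv_cdf(alpha)), adjust(nd.inv_cdf(1 - alpha))


def bca_ci(theta_hat, boot, jackknife, alpha=0.05):
    """BCa interval: bootstrap quantiles taken at the adjusted levels."""
    a1, a2 = bca_levels(theta_hat, boot, jackknife, alpha)
    s = sorted(boot)
    return order_quantile(s, a1), order_quantile(s, a2)


if __name__ == "__main__":
    boot = [float(i) for i in range(1, 101)]  # made-up bootstrap values
    jack = [49.5, 50.5, 51.5]                 # made-up jackknife values
    print(bca_ci(50.5, boot, jack))
```

When the bias correction and the acceleration constant are both zero, the adjusted levels reduce to $\alpha$ and $1-\alpha$, so the BCa interval falls back to the percentile interval. Because the limits are bootstrap values themselves, they cannot leave the range of the bootstrap sample.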

We show numerical examples for our bootstrap software reliability assessment method based on the discretized exponential software reliability growth model.

We apply fault counting data:

[

We first obtain

and

by the linear regression scheme from the actual data. Following our bootstrapping method, we generate 2000 bootstrap samples, from which we obtain bootstrap samples of $\hat{a}$ and $\hat{b}$. Figures 1 and 2 show histograms of these bootstrap samples, that is, the bootstrap distributions of $\hat{a}$ and $\hat{b}$, respectively. We also obtain bootstrap samples of the software reliability assessment measures, namely the expected number of remaining faults at the termination time of the testing and the software reliability.

The bootstrap samples of these two measures are calculated by

from Equations (7) and (8), respectively. Figures 3 and 4 show histograms of the bootstrap samples for the expected number of remaining faults and the software reliability, respectively. Further,

respectively.

Further,

The number of remaining faults never takes a negative value. Moreover, the interval estimates of the number of remaining faults differ notably depending on the type of bootstrap confidence interval. These results arise because these bootstrap confidence intervals are derived by assuming symmetric distributions. In general, however, the probability distribution of an estimator is asymmetric, and the accuracy of the approximation is influenced by the bias and the skewness of that distribution. As we show in

This paper discussed a bootstrap software reliability assessment method based on a discretized exponential software reliability growth model. We also discussed five types of bootstrap confidence intervals for interval estimation of the model parameters and several software reliability assessment measures.

In our numerical examples, we confirmed that our bootstrap approach gives the probability distributions of each parameter and software reliability assessment measure numerically even if we do not derive these distributions analytically, and that we can obtain useful information for software reliability assessment, such as interval estimates of the model parameters, the number of remaining faults, and the software reliability. This approach is very useful when we cannot collect a sufficient amount of data and have to conduct interval estimation for complex estimators. However, regarding the bootstrap confidence intervals, we encountered the problem that the basic, standard normal, and bootstrap-t confidence intervals did not give appropriate interval estimates for the number of remaining faults at the termination time of the testing. This problem was solved by using other bootstrap confidence intervals, namely the percentile and the BCa confidence intervals. In future studies, we are going to apply our bootstrap approach to estimating the optimal software release time and to other practical software project management issues.

The second author is supported in part by the Grant-in-Aid for Scientific Research (C), Grant No. 22510150, from the Ministry of Education, Culture.