**Journal of Software Engineering and Applications** Vol.6 No.4A(2013), Article ID:30069,7 pages DOI:10.4236/jsea.2013.64A001

A Bootstrapping Approach for Software Reliability Measurement Based on a Discretized NHPP Model

Shinji Inoue, Shigeru Yamada

Department of Social Management Engineering, Graduate School of Engineering, Tottori University, Tottori, Japan.

Email: ino@sse.tottori-u.ac.jp, yamada@sse.tottori-u.ac.jp

Copyright © 2013 Shinji Inoue, Shigeru Yamada. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Received December 14^{th}, 2012; revised January 17^{th}, 2013; accepted January 26^{th}, 2013

**Keywords:** Software Reliability Measurement; Discretized NHPP Model; Nonparametric Bootstrapping Method; Regression Analysis; Bootstrap Confidence Intervals

ABSTRACT

Discrete software reliability measurement is well suited to describing a software reliability growth process that depends on a discrete unit of the software fault-detection period, such as the number of test runs or the number of executed test cases. This paper discusses discrete software reliability measurement based on a discretized nonhomogeneous Poisson process (NHPP) model. In particular, we use a bootstrapping method to discuss statistical inference on the parameters and software reliability assessment measures of our model. Finally, we show numerical examples of interval estimation based on our bootstrapping method for several software reliability assessment measures by using actual data.

1. Introduction

To ship a highly reliable software product, it is very important to measure the reliability of the product accurately in the final stage of the software development process. A software reliability growth model [1-4] is known as one of the useful mathematical tools for quantitative measurement or assessment of software reliability. In an actual testing phase, we generally observe a software reliability growth process, in which software faults are detected and removed and the number of faults remaining in the software system decreases as the test-execution time elapses. A software reliability growth model describes this process and measures software reliability quantitatively by using software reliability assessment measures derived from the model. A large number of software reliability growth models have been proposed so far for accurate software reliability assessment. Among them, discretized nonhomogeneous Poisson process (discretized NHPP) models have good fitting and prediction performance in software reliability assessment [5,6] because they are consistent with fault counting data, which are obtained by collecting information on the frequency of software failure-occurrence or the number of detected faults during each constant testing-period. The parameters of a discretized NHPP model are estimated from actual data by regression analysis based on a regression equation derived from the difference equation of the model. After the parameter estimation, software reliability assessment is performed based on the software reliability assessment measures derived from the discretized NHPP model. This approach is based on point estimation, which is appropriate when a sufficient amount of data is available.

In recent years, however, it has become very difficult to obtain a sufficient amount of data for point estimation because of shortened software delivery schedules. In such a case, it is better to conduct interval estimation to take into account the uncertainty of the estimators of the model parameters and software reliability assessment measures. Asymptotic approximation approaches [7] are often used for interval estimation. However, the mathematical manipulation required for interval estimation remains difficult even with an asymptotic approximation approach. The bootstrap method [8] was proposed to overcome such problems. It is known as one of the useful Monte Carlo methods for obtaining probability distributions of estimators by resampling. Recently, the bootstrapping method has been applied not only to software reliability analysis [9-11] but also to optimal checkpoint replacement for hardware systems [12].

In this paper, we use the bootstrapping method to discuss interval estimation for the parameters and software reliability assessment measures of a discretized exponential software reliability growth model, which is one of the discretized NHPP models and has the simplest model structure. We also discuss several kinds of bootstrap confidence intervals for the interval estimation. Finally, we show numerical examples of our bootstrapping method for software reliability assessment based on the discretized exponential software reliability growth model and the bootstrap confidence intervals by using actual data.

2. Discretized Exponential NHPP Model

2.1. The Model

We briefly discuss the discretized NHPP model by showing a discretized exponential software reliability growth model [5,6], which has the simplest mathematical structure. We define a discrete counting process $\{N_n, n \ge 0\}$ representing the cumulative number of faults detected up to the $n$-th testing-period. The discrete counting process follows a discrete-time NHPP [13] if it has the following property:

$$\Pr\{N_n = x \mid N_0 = 0\} = \frac{\{\Lambda_n\}^{x}}{x!} \exp[-\Lambda_n] \quad (n, x = 0, 1, 2, \ldots), \tag{1}$$

which is derived based on a continuous-time NHPP. In Equation (1), $\Pr\{A\}$ means the probability of event $A$, and $\Lambda_n$ is the mean value function of the discrete-time NHPP. The mean value function, $\Lambda_n$, represents the expected cumulative number of faults detected up to the $n$-th testing-period.

The discretized exponential software reliability growth model is a discrete analog of the original (continuous-time) exponential software reliability growth model. Let $H_n$ denote the mean value function following the discretized exponential software reliability growth model. The discretized exponential software reliability growth model is given as

$$H_{n+1} - H_n = \delta b (a - H_n) \tag{2}$$

from the basic assumptions of the original exponential software reliability growth model. In Equation (2), $\delta$ represents the constant time-interval, $a$ the expected total number of potential faults to be detected in an infinitely long duration or the expected initial fault content, and $b$ the fault detection rate per fault. Regarding the discretization method, we use Hirota's bilinearization method [14] to conserve the property of the continuous-time NHPP model. Solving the integrable difference equation in Equation (2) with $H_0 = 0$, we obtain the exact solution

$$H_n = a\left[1 - (1 - \delta b)^n\right]. \tag{3}$$

As $\delta \to 0$, Equation (3) converges to the exact solution of the original continuous-time exponential software reliability growth model, which is derived from the differential equation.
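The relation between the difference equation, its exact solution, and the continuous-time limit can be checked numerically. The following is a minimal sketch, assuming the usual form of the discretized exponential model, $H_{n+1} - H_n = \delta b (a - H_n)$ with $H_0 = 0$, and the continuous-time mean value function $a(1 - e^{-bt})$; the function names and the numerical values are illustrative only.

```python
import math


def mean_value_recursive(a, b, delta, n):
    """Iterate the assumed difference equation H_{k+1} - H_k = delta*b*(a - H_k), H_0 = 0."""
    h = 0.0
    for _ in range(n):
        h += delta * b * (a - h)
    return h


def mean_value_closed(a, b, delta, n):
    """Closed-form solution H_n = a * (1 - (1 - delta*b)**n)."""
    return a * (1.0 - (1.0 - delta * b) ** n)


def mean_value_continuous(a, b, t):
    """Continuous-time exponential model, a * (1 - exp(-b*t))."""
    return a * (1.0 - math.exp(-b * t))


# Illustrative values (not taken from the paper).
a, b = 100.0, 0.1
gap_recursion = abs(mean_value_recursive(a, b, 1.0, 10) - mean_value_closed(a, b, 1.0, 10))

# Shrinking delta while holding t = n * delta fixed approaches the continuous model.
t, small_delta = 5.0, 1e-4
gap_limit = abs(mean_value_closed(a, b, small_delta, round(t / small_delta))
                - mean_value_continuous(a, b, t))
```

Both gaps should be small: the first reflects only floating-point rounding, and the second shrinks as `small_delta` decreases.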

The discretized exponential software reliability growth model in Equation (3) has two parameters, $a$ and $b$, which have to be estimated by using actual data. The parameter estimates of $a$ and $b$, $\hat{a}$ and $\hat{b}$, can be obtained by the following procedure using the method of least squares. First, if we have observed fault counting data $(n, y_n)$, where $y_n$ represents the cumulative number of faults detected up to the $n$-th testing-period, we derive the following regression equation from Equation (2):

$$Y_n = A + B H_n, \tag{4}$$

where

$$Y_n = H_{n+1} - H_n, \quad A = \delta a b, \quad B = -\delta b, \tag{5}$$

and the observed $y_n$ is used in place of $H_n$. Based on the regression analysis, we can estimate $\hat{A}$ and $\hat{B}$, which are the estimates of $A$ and $B$ in Equation (4). Then, the parameter estimates, $\hat{a}$ and $\hat{b}$, can be obtained as

$$\hat{a} = -\hat{A}/\hat{B}, \quad \hat{b} = -\hat{B}/\delta. \tag{6}$$

$Y_n$ in Equation (4) is independent of $\delta$ because $\delta$ is not used in calculating $Y_n$, as shown in Equation (5). Hence, we can obtain the same parameter estimates $\hat{a}$ and $\hat{b}$, respectively, when we choose any value of $\delta$ [5,6,15,16].
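The least-squares procedure can be sketched as follows. This is a minimal illustration, assuming the regression of the per-period increments $y_{n+1} - y_n$ on the cumulative counts $y_n$ with intercept $\delta a b$ and slope $-\delta b$; the function name `fit_discretized_exponential` and the synthetic data are hypothetical.

```python
def fit_discretized_exponential(counts, delta=1.0):
    """Fit the assumed regression (increment = A + B * cumulative count) by least squares.

    counts: cumulative fault counts at equally spaced testing periods.
    Returns (A_hat, B_hat, a_hat, b_hat).
    """
    x = counts[:-1]
    # Response: number of faults detected in each period.
    y = [later - earlier for earlier, later in zip(counts, counts[1:])]
    m = len(x)
    x_bar = sum(x) / m
    y_bar = sum(y) / m
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b_coef = sxy / sxx                 # slope, estimate of -delta*b
    a_coef = y_bar - b_coef * x_bar    # intercept, estimate of delta*a*b
    a_hat = -a_coef / b_coef
    b_hat = -b_coef / delta
    return a_coef, b_coef, a_hat, b_hat


# Noise-free synthetic data generated from a = 100, b = 0.2, delta = 1 (illustrative).
synthetic = [100.0 * (1.0 - 0.8 ** n) for n in range(11)]
A_hat, B_hat, a_hat, b_hat = fit_discretized_exponential(synthetic)
```

On noise-free data the regression is exact, so `a_hat` and `b_hat` recover the generating values up to rounding; with real data the increments scatter around the fitted line.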

2.2. Software Reliability Assessment Measures

Software reliability assessment measures are useful in quantitative software reliability assessment. This paper discusses the expected number of remaining faults and the software reliability function, which are well-known software reliability assessment measures. The expected number of remaining faults, $M_n$, represents the expected number of undetected faults in the software system at an arbitrary testing-period $n$. Then, we have

$$M_n = a - H_n = a(1 - \delta b)^n \tag{7}$$

if we assume that $\{N_n, n \ge 0\}$ follows a discrete-time NHPP with mean $H_n$ in Equation (3). The software reliability function, $R(h \mid n)$, is defined as the probability that a software failure does not occur in the time-interval $(n, n+h]$ $(h = 1, 2, \ldots)$ given that the testing has been going on up to the $n$-th testing-period. Then, we have

$$R(h \mid n) = \exp\left[-\{H_{n+h} - H_n\}\right]. \tag{8}$$
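Both measures are simple functions of the model parameters. The sketch below assumes the closed forms that follow from an exponential mean value function $H_n = a[1 - (1 - \delta b)^n]$, namely $a - H_n$ for the expected number of remaining faults and $\exp[-(H_{n+h} - H_n)]$ for the reliability; the function names and parameter values are illustrative.

```python
import math


def expected_remaining_faults(a, b, delta, n):
    """a - H_n under the assumed exponential mean value function."""
    return a * (1.0 - delta * b) ** n


def software_reliability(a, b, delta, n, h):
    """Probability of no failure in (n, n+h], i.e. exp(-(H_{n+h} - H_n))."""
    def mean_value(k):
        return a * (1.0 - (1.0 - delta * b) ** k)
    return math.exp(-(mean_value(n + h) - mean_value(n)))


# Illustrative parameter values (not taken from the paper).
m25 = expected_remaining_faults(100.0, 0.2, 1.0, 25)
r25 = software_reliability(100.0, 0.2, 1.0, 25, 1)
```

Because $H_{n+h} - H_n$ equals the drop in expected remaining faults over the next $h$ periods, the reliability can equivalently be computed from `expected_remaining_faults` at $n$ and $n + h$.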

3. Software Reliability Assessment Based on Bootstrapping Method

Ordinarily, the parameters of the discretized NHPP models are estimated by regression analysis based on the regression equation derived from the difference equation of the models. However, it is difficult to discuss statistical inference on software reliability assessment within this estimation approach because it is very difficult or complex to identify analytically the probability distribution of the parameter estimators. To overcome this problem, Kimura and Fujiwara [9,10] applied non-parametric bootstrap software reliability assessment methods to an incomplete gamma function-based software reliability growth model. Kaneishi and Dohi [11] discussed a parametric bootstrap method for software reliability assessment based on continuous-time NHPP models. In this paper, we apply a non-parametric bootstrap method to the discretized NHPP model for estimating the model parameters and for obtaining information for statistical inference on the parameters and software reliability assessment measures. In particular, we discuss five types of bootstrap confidence intervals for interval estimation of the model parameters and software reliability assessment measures.

3.1. Our Bootstrapping Method

As an example for discussing our bootstrapping method based on the discretized NHPP model, we apply the discretized exponential software reliability growth model. Our bootstrap method for software reliability assessment consists of the following procedure:

Step 1: Estimate $A$ and $B$ in Equation (4) based on the linear regression scheme by using the fault counting data $(n, y_n)$. We denote the estimates of $A$ and $B$ by $\hat{A}_0$ and $\hat{B}_0$, respectively.

Step 2: Calculate the residual errors, $\hat{w}_n$, at each observation point by

$$\hat{w}_n = Y_n - (\hat{A}_0 + \hat{B}_0 y_n), \quad Y_n = y_{n+1} - y_n.$$

Step 3: Construct an empirical distribution function $\hat{F}$ by assuming that the residual errors are independent and identically distributed and putting equal probability mass on each of the ordered residuals $\hat{w}_{(1)} \le \hat{w}_{(2)} \le \cdots$.

Step 4: Set the total number of iterations $K$ and let $k$ $(k = 1, 2, \ldots, K)$ be the iteration count.

Step 5: Generate a bootstrap sample of the residual errors, $\{w^*_{k,n}\}$, by sampling with replacement from $\hat{F}$.

Step 6: Generate a bootstrap sample of $Y_n$, $\{Y^*_{k,n}\}$, by

$$Y^*_{k,n} = \hat{A}_0 + \hat{B}_0 y_n + w^*_{k,n}.$$

Step 7: Estimate $\hat{A}^*_k$ and $\hat{B}^*_k$ from the bootstrap sample $\{(y_n, Y^*_{k,n})\}$.

Step 8: Calculate the parameters of the discretized exponential software reliability growth model by the following equation:

$$\hat{a}^*_k = -\hat{A}^*_k/\hat{B}^*_k, \quad \hat{b}^*_k = -\hat{B}^*_k/\delta.$$

Step 9: Calculate the software reliability assessment measures by using $\hat{a}^*_k$ and $\hat{b}^*_k$.

Step 10: Let $k = k + 1$ and go back to Step 5 if $k \le K$.

Step 11: We have $K$ bootstrap samples for $\hat{a}$, $\hat{b}$, and the software reliability assessment measures.

We can calculate the mean and the standard deviation for the model parameters and software reliability assessment measures by the Monte Carlo approximation, respectively.
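The procedure above is a residual-resampling bootstrap for a linear regression, and it can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the regression is assumed to relate the per-period increments to the cumulative counts, the parameter transformation is assumed to be $a = -A/B$ and $b = -B/\delta$, and the names `ols`, `residual_bootstrap`, and the fault counts are hypothetical.

```python
import random
import statistics


def ols(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def residual_bootstrap(counts, n_boot=2000, delta=1.0, seed=0):
    """Return n_boot bootstrap samples (a*, b*) by resampling regression residuals."""
    rng = random.Random(seed)
    x = counts[:-1]
    increments = [later - earlier for earlier, later in zip(counts, counts[1:])]
    # Steps 1-2: fit once on the observed data and compute residuals.
    intercept0, slope0 = ols(x, increments)
    fitted = [intercept0 + slope0 * xi for xi in x]
    residuals = [yi - fi for yi, fi in zip(increments, fitted)]
    samples = []
    for _ in range(n_boot):
        # Steps 5-6: resample residuals with replacement and rebuild the responses.
        w_star = rng.choices(residuals, k=len(residuals))
        y_star = [fi + wi for fi, wi in zip(fitted, w_star)]
        # Steps 7-8: refit and transform back to the model parameters.
        intercept_k, slope_k = ols(x, y_star)
        samples.append((-intercept_k / slope_k, -slope_k / delta))
    return samples


# Hypothetical cumulative fault counts (not the data set used in the paper).
counts = [0, 12, 22, 31, 38, 44, 49, 53, 56, 59, 61, 63, 64]
samples = residual_bootstrap(counts, n_boot=500)
a_samples = [a for a, _ in samples]
b_samples = [b for _, b in samples]
a_mean, a_sd = statistics.fmean(a_samples), statistics.stdev(a_samples)
```

The mean and standard deviation of `a_samples` and `b_samples` are the Monte Carlo approximations mentioned above; software reliability assessment measures can be computed per sample in the same loop.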

3.2. Bootstrap Confidence Intervals

We discuss the following three typical bootstrap confidence intervals [17]: the basic, standard normal, and percentile bootstrap confidence intervals. Further, we discuss the bootstrap-t and BCa methods [17,18] for deriving bootstrap confidence intervals that take into account the asymmetry, the bias, and the skewness of the distribution of the parameter estimator. Let $\theta$ be the parameter of interest.

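The basic bootstrap confidence interval is derived by using the quantiles of the distribution of $\hat{\theta}^* - \hat{\theta}$, where $\hat{\theta}$ is the estimator of $\theta$ and $\hat{\theta}^*$ is the bootstrap statistic. We can approximate the $\alpha$ and $1-\alpha$ quantiles of the distribution of $\hat{\theta} - \theta$, denoted by $v_{\alpha}$ and $v_{1-\alpha}$, respectively, by $\hat{\theta}^*_{\alpha} - \hat{\theta}$ and $\hat{\theta}^*_{1-\alpha} - \hat{\theta}$, where $\hat{\theta}^*_{q}$ is the $q$ quantile of the bootstrap samples. Then,

$$1 - 2\alpha = \Pr\{v_{\alpha} \le \hat{\theta} - \theta \le v_{1-\alpha}\} \approx \Pr\{2\hat{\theta} - \hat{\theta}^*_{1-\alpha} \le \theta \le 2\hat{\theta} - \hat{\theta}^*_{\alpha}\}.$$

Thus, the basic bootstrap confidence interval is given by

$$\left[2\hat{\theta} - \hat{\theta}^*_{1-\alpha},\; 2\hat{\theta} - \hat{\theta}^*_{\alpha}\right]. \tag{9}$$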

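The standard normal bootstrap confidence interval is derived by assuming that the distribution of $\hat{\theta}$ can be approximated by a normal distribution with mean $\theta$ and standard deviation $\mathrm{se}(\hat{\theta})$. That is,

$$\frac{\hat{\theta} - \theta}{\mathrm{se}(\hat{\theta})} \sim N(0, 1)$$

approximately. Thus, we have the standard normal bootstrap confidence interval as

$$\left[\hat{\theta} - z_{\alpha}\,\mathrm{se}(\hat{\theta}),\; \hat{\theta} + z_{\alpha}\,\mathrm{se}(\hat{\theta})\right], \tag{10}$$

where $z_{\alpha}$ is $\Phi^{-1}(1-\alpha)$, which is the $1-\alpha$ quantile of the standard normal distribution, and $\mathrm{se}(\hat{\theta})$ is estimated by the standard deviation of the bootstrap samples. For example, $z_{0.025} = 1.96$.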

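The percentile bootstrap confidence interval is calculated from the empirical cumulative probability distribution function, which consists of the bootstrap iteration values $\hat{\theta}^*_1, \hat{\theta}^*_2, \ldots, \hat{\theta}^*_K$. Then, the percentile bootstrap confidence interval is calculated by

$$\left[\hat{\theta}^*_{\alpha},\; \hat{\theta}^*_{1-\alpha}\right], \tag{11}$$

where $\hat{\theta}^*_{\alpha}$ represents the $\alpha$ quantile of the empirical cumulative probability distribution function.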
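The three intervals discussed so far can be computed directly from a list of bootstrap samples. The sketch below assumes the convention that the $\alpha$ and $1-\alpha$ quantiles are used, giving nominal coverage $1 - 2\alpha$ (so `alpha=0.025` corresponds to a 95% interval), and uses a simple order-statistic quantile without interpolation; the function names are illustrative.

```python
from statistics import NormalDist, stdev


def empirical_quantile(sorted_values, q):
    """Order-statistic quantile of pre-sorted values (no interpolation)."""
    index = min(len(sorted_values) - 1, max(0, int(q * len(sorted_values))))
    return sorted_values[index]


def bootstrap_intervals(theta_hat, boot, alpha=0.025):
    """Basic, standard normal, and percentile intervals from bootstrap samples."""
    ordered = sorted(boot)
    lower_q = empirical_quantile(ordered, alpha)
    upper_q = empirical_quantile(ordered, 1.0 - alpha)
    se = stdev(boot)                          # bootstrap standard error
    z = NormalDist().inv_cdf(1.0 - alpha)     # about 1.96 for alpha = 0.025
    return {
        "basic": (2 * theta_hat - upper_q, 2 * theta_hat - lower_q),
        "normal": (theta_hat - z * se, theta_hat + z * se),
        "percentile": (lower_q, upper_q),
    }


# Illustrative symmetric bootstrap sample centred on the point estimate.
intervals = bootstrap_intervals(50.5, [float(v) for v in range(1, 101)])
```

For a bootstrap distribution that is symmetric about the point estimate, the basic and percentile intervals coincide; they diverge when the distribution is skewed, because the basic interval reflects the quantiles around $\hat{\theta}$.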

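Further, we discuss a bootstrap-t method, which enables us to take into consideration the variance of $\hat{\theta}$ by deriving $t^* = (\hat{\theta}^* - \hat{\theta})/\hat{\sigma}^*$, where $\hat{\sigma}^{*2}$ is the variance of $\hat{\theta}^*$. Letting $t_{\alpha}$ and $t_{1-\alpha}$ be the $\alpha$ and $1-\alpha$ quantiles of $t = (\hat{\theta} - \theta)/\hat{\sigma}$, we have

$$\Pr\left\{t_{\alpha} \le \frac{\hat{\theta} - \theta}{\hat{\sigma}} \le t_{1-\alpha}\right\} = 1 - 2\alpha.$$

In the above equation, we substitute $t^*_{\alpha}$ and $t^*_{1-\alpha}$, which are the $\alpha$ and $1-\alpha$ quantiles of $t^*$, for $t_{\alpha}$ and $t_{1-\alpha}$, which are the $\alpha$ and $1-\alpha$ quantiles of $t$. Then, the bootstrap-t confidence interval is derived as

$$\left[\hat{\theta} - t^*_{1-\alpha}\,\hat{\sigma},\; \hat{\theta} - t^*_{\alpha}\,\hat{\sigma}\right]. \tag{12}$$

In Equation (12), $\hat{\sigma}$ is the standard deviation of $\hat{\theta}$ and is estimated by the bootstrap-t statistics.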

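We also discuss the BCa method, which gives a better bootstrap confidence interval by taking into account the asymmetry, the bias, and the skewness of the probability distribution of the estimator. The BCa confidence interval is given as

$$\left[\hat{\theta}^*_{\alpha_1},\; \hat{\theta}^*_{\alpha_2}\right], \tag{13}$$

where

$$\alpha_1 = \Phi\left(\hat{z}_0 + \frac{\hat{z}_0 + \Phi^{-1}(\alpha)}{1 - \hat{\gamma}\{\hat{z}_0 + \Phi^{-1}(\alpha)\}}\right), \quad \alpha_2 = \Phi\left(\hat{z}_0 + \frac{\hat{z}_0 + \Phi^{-1}(1-\alpha)}{1 - \hat{\gamma}\{\hat{z}_0 + \Phi^{-1}(1-\alpha)\}}\right).$$

In the above equation, $\hat{z}_0 = \Phi^{-1}(\hat{G}(\hat{\theta}))$, in which $\hat{G}$ is the bootstrap distribution for the estimator. $\hat{\gamma}$ is the acceleration constant derived as

$$\hat{\gamma} = \frac{\sum_{i} \left(\bar{\theta}_{(\cdot)} - \hat{\theta}_{(i)}\right)^3}{6\left\{\sum_{i} \left(\bar{\theta}_{(\cdot)} - \hat{\theta}_{(i)}\right)^2\right\}^{3/2}},$$

where $\hat{\theta}_{(i)}$ is a jackknife iteration value, which is estimated by using the data with the $i$-th observation removed, and $\bar{\theta}_{(\cdot)}$ is the mean of the jackknife iteration values.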
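A BCa interval can be sketched as follows. This is a minimal illustration assuming the textbook form of the method [17,18]: a bias-correction constant obtained from the proportion of bootstrap samples below the point estimate, an acceleration constant obtained from jackknife values, and adjusted quantile levels computed with the standard normal distribution function. The function name, the clamping of the proportion away from 0 and 1, and the order-statistic quantile are choices made for this sketch.

```python
from statistics import NormalDist


def bca_interval(theta_hat, boot, jackknife, alpha=0.025):
    """BCa interval with nominal coverage 1 - 2*alpha (textbook form, assumed)."""
    nd = NormalDist()
    n_boot = len(boot)
    # Bias correction: proportion of bootstrap values below the point estimate,
    # clamped so that the inverse normal CDF stays finite.
    proportion = sum(1 for value in boot if value < theta_hat) / n_boot
    proportion = min(max(proportion, 1.0 / (n_boot + 1)), n_boot / (n_boot + 1.0))
    z0 = nd.inv_cdf(proportion)
    # Acceleration constant from the jackknife (leave-one-out) estimates.
    jack_mean = sum(jackknife) / len(jackknife)
    numerator = sum((jack_mean - value) ** 3 for value in jackknife)
    denominator = 6.0 * sum((jack_mean - value) ** 2 for value in jackknife) ** 1.5
    acceleration = numerator / denominator if denominator > 0 else 0.0

    def adjusted_level(q):
        z = nd.inv_cdf(q)
        return nd.cdf(z0 + (z0 + z) / (1.0 - acceleration * (z0 + z)))

    ordered = sorted(boot)

    def pick(q):
        return ordered[min(n_boot - 1, max(0, int(q * n_boot)))]

    return pick(adjusted_level(alpha)), pick(adjusted_level(1.0 - alpha))


# Symmetric illustrative inputs: no bias and no acceleration.
interval = bca_interval(50.5, [float(v) for v in range(1, 101)], [1.0, 2.0, 3.0])
```

When the bias-correction and acceleration constants are both zero, the adjusted levels reduce to $\alpha$ and $1-\alpha$, so the BCa interval coincides with the percentile interval.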

4. Numerical Examples

We show numerical examples for our bootstrap software reliability assessment method based on the discretized exponential software reliability growth model.

We apply the fault counting data $(n, y_n)$ cited from [1], and we set the total number of iterations to 2000.

We first obtain $\hat{A}_0$ and $\hat{B}_0$ by the linear regression scheme from the actual data. Following our bootstrapping method, we have 2000 bootstrap samples. Then, we obtain bootstrap samples for $\hat{a}$ and $\hat{b}$. Figures 1 and 2 show histograms of the bootstrap samples $\hat{a}^*$ and $\hat{b}^*$ to see the bootstrap distributions of $\hat{a}$ and $\hat{b}$, respectively. We also have bootstrap samples for the software reliability assessment measures, such as the expected number of remaining faults at the termination time of the testing, $M_{25}$, and the software reliability, $R(h \mid 25)$.

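These bootstrap samples of $M_{25}$ and $R(h \mid 25)$ are calculated by

$$M^*_{25,k} = \hat{a}^*_k (1 - \delta \hat{b}^*_k)^{25}, \tag{14}$$

$$R^*_k(h \mid 25) = \exp\left[-\hat{a}^*_k (1 - \delta \hat{b}^*_k)^{25}\left\{1 - (1 - \delta \hat{b}^*_k)^{h}\right\}\right] \tag{15}$$

from Equations (7) and (8), respectively, where $k$ indexes the bootstrap samples. Figures 3 and 4 show histograms of the bootstrap samples for $M_{25}$ and $R(h \mid 25)$, respectively. Further, Table 1 indicates the means and the standard deviations of the estimators of $A$, $B$, $a$, $b$, $M_{25}$, and $R(h \mid 25)$, respectively. Figure 5 shows the estimated discretized exponential software reliability growth model, $\hat{H}_n$, in which we use the means of the bootstrap samples of $\hat{a}$ and $\hat{b}$ as the point estimates. The means of $\hat{a}^*$ and $\hat{b}^*$, which are denoted by $\bar{a}$ and $\bar{b}$, are calculated by

$$\bar{a} = \frac{1}{K} \sum_{k=1}^{K} \hat{a}^*_k, \tag{16}$$

$$\bar{b} = \frac{1}{K} \sum_{k=1}^{K} \hat{b}^*_k, \tag{17}$$

respectively, where $K = 2000$ is the number of bootstrap samples.

Figure 1. Bootstrap distribution of $\hat{a}$.

Figure 2. Bootstrap distribution of $\hat{b}$.

Figure 3. Bootstrap distribution of the expected number of remaining faults at n = 25.

Figure 4. Bootstrap distribution of software reliability, $R(h \mid 25)$.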

Further, Table 2 shows the results of interval estimation based on the basic, standard normal, percentile, bootstrap-t, and BCa methods, respectively, at the 5% significance level. From Table 2, we can see that inappropriate confidence intervals for $M_{25}$ are obtained with the basic, standard normal, and bootstrap-t methods because the number of remaining faults never takes a negative value. Moreover, depending on the type of bootstrap confidence interval, the results of interval estimation for $M_{25}$ differ notably from each other. These results are caused by assuming symmetric distributions in deriving these bootstrap confidence intervals. However, the probability distribution of an estimator is generally asymmetric, and the accuracy of the approximation is influenced by the bias and the skewness of that distribution. As shown in Figure 3, the bootstrap distribution for $M_{25}$ is asymmetric. On the other hand, the percentile bootstrap confidence interval gives an appropriate interval estimate for $M_{25}$ because the percentile method is based only on the bootstrap distribution and does not assume a symmetric distribution. The BCa method can be expected to give a more appropriate interval estimate for $M_{25}$ because it is an improvement on the percentile bootstrap confidence interval.

Figure 5. Estimated discretized exponential software reliability growth model.

Table 1. Quantities of the bootstrap distribution.

Table 2. Results of interval estimations based on bootstrap confidence intervals.

5. Conclusions

This paper discussed a bootstrap software reliability assessment method based on a discretized exponential software reliability growth model. We also discussed five types of bootstrap confidence intervals for interval estimation of the model parameters and several software reliability assessment measures.

In our numerical examples, we confirmed that our bootstrap approach gives the probability distributions of the parameters and software reliability assessment measures numerically even if we do not derive these probability distributions analytically, and that we can obtain useful information for software reliability assessment, such as the results of interval estimation for the model parameters, the number of remaining faults, and software reliability. This approach is very useful when we cannot collect a sufficient amount of data and have to conduct interval estimation for complex estimators. However, regarding bootstrap confidence intervals, we encountered the problem that the basic, standard normal, and bootstrap-t confidence intervals did not give appropriate interval estimates for the number of remaining faults at the termination time of the testing. This problem was solved by using other bootstrap confidence intervals, namely the percentile and the BCa confidence intervals. In future studies, we are going to apply our bootstrap approach to estimating the optimal software release time and to other practical software project management issues.

6. Acknowledgements

The second author is supported in part by the Grant-in-Aid for Scientific Research (C), Grant No. 22510150, from the Ministry of Education, Culture.

REFERENCES

- J. D. Musa, “A Theory of Software Reliability and Its Application,” IEEE Transactions on Software Engineering, Vol. SE-1, No. 3, 1975, pp. 312-327. doi:10.1109/TSE.1975.6312856
- A. L. Goel, “Software Reliability Models: Assumptions, Limitations, and Applicability,” IEEE Transactions on Software Engineering, Vol. SE-11, No. 12, 1985, pp. 1411-1423. doi:10.1109/TSE.1985.232177
- J. D. Musa, A. Iannino and K. Okumoto, “Software Reliability: Measurement, Prediction, Application,” McGraw-Hill, New York, 1987.
- H. Pham, “Software Reliability,” Springer-Verlag, Singapore, 2000.
- S. Inoue and S. Yamada, “Discrete Software Reliability Assessment with Discretized NHPP Models,” Computers & Mathematics with Applications: An International Journal, Vol. 51, No. 2, 2006, pp. 161-170. doi:10.1016/j.camwa.2005.11.022
- S. Inoue and S. Yamada, “Integrable Difference Equations for Software Reliability Assessment and Their Applications,” International Journal of Systems Assurance Engineering and Management, Vol. 1, No. 1, 2010, pp. 2-7. doi:10.1007/s13198-010-0005-x
- S. Yamada and S. Osaki, “Software Reliability Growth Modeling: Models and Applications,” IEEE Transactions on Software Engineering, Vol. SE-11, No. 12, 1985, pp. 1431-1437. doi:10.1109/TSE.1985.232179
- B. Efron, “Bootstrap Methods: Another Look at the Jackknife,” The Annals of Statistics, Vol. 7, No. 1, 1979, pp. 1-26. doi:10.1214/aos/1176344552
- M. Kimura, “A Study on Bootstrap Confidence Intervals of Software Reliability Measures Based on an Incomplete Gamma Function Model,” In: T. Dohi and W. Y. Yun, Eds., Advanced Reliability Modeling II, World Scientific, Singapore City, 2006, pp. 419-426.
- M. Kimura and T. Fujiwara, “A Bootstrap Software Reliability Assessment Method to Squeeze out Remaining Faults,” In: T. H. Kim and H. Adeli, Eds., Advances in Computer Science and Information Technology, Springer-Verlag, Berlin-Heidelberg, 2010, pp. 435-446. doi:10.1007/978-3-642-13577-4_39
- T. Kaneishi and T. Dohi, “Parametric Bootstrapping for Assessing Software Reliability Measures,” Proceedings of the 17th IEEE Pacific Rim International Symposium on Dependable Computing, 12-14 December 2010, pp. 1-9.
- S. Tokumoto, T. Dohi and W. Y. Yun, “Toward Development of Risk-Based Checkpointing Scheme via Parametric Bootstrapping,” Proceedings of the 2012 Workshop on Recent Advances in Software Dependability, 19 November 2012, pp. 50-55.
- S. Yamada and S. Osaki, “Discrete Software Reliability Growth Models,” Journal of Applied Stochastic Models and Data Analysis, Vol. 1, No. 1, 1985, pp. 65-77. doi:10.1002/asm.3150010108
- R. Hirota, “Nonlinear Partial Difference Equations. V. Nonlinear Equations Reducible to Linear Equations,” Journal of the Physical Society of Japan, Vol. 46, No. 1, 1979, pp. 312-319. doi:10.1143/JPSJ.46.312
- D. Satoh, “A Discrete Gompertz Equation and a Software Reliability Growth Model,” IEICE Transactions on Information and Systems, Vol. E83-D, No. 7, 2000, pp. 1508-1513.
- D. Satoh, “A Discrete Bass Model and Its Parameter Estimation,” Journal of the Operations Research Society of Japan, Vol. 44, No. 1, 2001, pp. 1-18.
- M. L. Rizzo, “Statistical Computing with R,” Chapman and Hall/CRC, Boca Raton, 2008.
- B. Efron, “Better Bootstrap Confidence Intervals,” Journal of the American Statistical Association, Vol. 82, No. 397, 1987, pp. 171-185. doi:10.1080/01621459.1987.10478410