
Reliability analysis is key to evaluating software quality. Since the early 1970s, the Power Law Process, among other models, has been used to assess the rate of change of software reliability as a time-varying function through its intensity function. The applicability of Bayesian analysis to the Power Law Process is justified using real software failure times. The choice of loss function is an important element of the Bayesian setting. Analytical likelihood-based Bayesian reliability estimates of the Power Law Process under the squared-error and Higgins-Tsokos loss functions were obtained for different prior knowledge of its key parameter. Based on a simulation analysis and on real data, the Bayesian reliability estimate under the Higgins-Tsokos loss function is not only as robust as the Bayesian reliability estimate under the squared-error loss function but also performs better, and both are superior to the maximum likelihood reliability estimate. A sensitivity analysis showed that the Bayesian estimate of the reliability function is sensitive to the prior, whether parametric or non-parametric, and to the loss function. An interactive user interface application was also developed in the Wolfram Language to compute and visualize the Bayesian and maximum likelihood estimates of the intensity and reliability functions of the Power Law Process for a given data set.

Reliability analysis of software under development is key to assessing whether a desired level of product quality has been achieved, especially when a software package is tested, corrected after each detected failure, and then run until a new failure is observed. Over the past few decades, the reliability analysis of software packages has been studied, and graphical and numerical metrics have been introduced. One of the earliest, Duane (1964) [

1) N ( t = 0 ) = 0 .

2) Independent increment (counts of disjoint time intervals are independent).

3) It has an intensity function

$$V(t) = \lim_{\Delta t \to 0} \frac{P(N(t, t + \Delta t) = 1)}{\Delta t}.$$

4) Simultaneous failures do not exist

$$\lim_{\Delta t \to 0} \frac{P(N(t, t + \Delta t) = 2)}{\Delta t} = 0.$$

The probability of random value N ( t ) = n is given by:

$$P(N(t) = n) = \frac{\exp\left\{-\int_0^t V(t)\,dt\right\} \left\{\int_0^t V(t)\,dt\right\}^{n}}{n!}, \quad t > 0. \tag{1}$$

Crow (1974) proposed a Non-Homogeneous Poisson Process (NHPP) , which is a Poisson Process with a time varying intensity function, given by:

$$V(t) = V(t; \beta, \theta) = \frac{\beta}{\theta}\left(\frac{t}{\theta}\right)^{\beta - 1}, \quad t > 0,\ \beta > 0,\ \theta > 0, \tag{2}$$

where β and θ are the shape and scale parameters, respectively. This Non-Homogeneous Poisson Process is also known as the Power Law Process (PLP).

The joint probability density function (PDF) of the ordered failure times T 1 , T 2 , ⋯ , T n from a NHPP with intensity function V ( t ; β , θ ) is given by:

$$f(t_1, \cdots, t_n) = \prod_{i=1}^{n} V(t_i; \beta, \theta) \exp\left\{-\int_0^{w} V(t; \beta, \theta)\,dt\right\}, \tag{3}$$

where w is the so-called stopping time; w = t n for the failure truncated case. Considering the failure truncation case, the conditional reliability function of the failure time T n given T 1 = t 1 , T 2 = t 2 , T 3 = t 3 , ⋯ , T n − 2 = t n − 2 , T n − 1 = t n − 1 is a function of V ( t ; β , θ ) .

As a numerical assessment, the estimate of the key parameter β in V(t; β, θ) plays an important role in evaluating the reliability of a software package. An estimate of β less than 1 indicates that the software reliability is improving, whereas an estimate larger than 1 indicates that it is deteriorating. The PLP reduces to a homogeneous Poisson process when the estimate of β equals 1.
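The role of β can be illustrated with a minimal Python sketch of the intensity function in Equation (2). The function names `plp_intensity` and `reliability_trend` are illustrative choices, not part of the original study, and the parameter values are used only as examples.

```python
import math


def plp_intensity(t, beta, theta):
    """Intensity V(t; beta, theta) = (beta/theta) * (t/theta)**(beta - 1), Equation (2)."""
    if t <= 0 or beta <= 0 or theta <= 0:
        raise ValueError("t, beta and theta must all be positive")
    return (beta / theta) * (t / theta) ** (beta - 1.0)


def reliability_trend(beta, tol=1e-12):
    """Describe the trend implied by the shape parameter beta."""
    if abs(beta - 1.0) <= tol:
        return "constant intensity (homogeneous Poisson process)"
    if beta < 1.0:
        return "decreasing intensity (reliability improving)"
    return "increasing intensity (reliability deteriorating)"


# Example values only; theta = 1.7441 is one of the scale values used in the simulations.
for beta in (0.7054, 1.0, 1.5):
    early = plp_intensity(10.0, beta, 1.7441)
    late = plp_intensity(100.0, beta, 1.7441)
    print(f"beta={beta}: V(10)={early:.4f}, V(100)={late:.4f}, {reliability_trend(beta)}")
```

For β < 1 the intensity at t = 100 is lower than at t = 10, which is the sense in which the failure rate is said to improve.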

The NHPP has been used for analyzing software’s failure times, and prediction of the next failure time. The subject model has been shown to be effective and useful not only in software reliability assessment [

The conditional reliability function of the PLP is a function of V(t; β, θ), which includes the key parameter β. Improving the estimation methods for the key parameter will therefore improve the estimates of V(t; β, θ) and of the software reliability, and so help in structuring maintenance strategies. The authors [

To perform a Bayesian analysis on a real-world problem, one needs to justify the applicability of such an analysis. The analysis then starts by identifying the probability distribution of the failure times of the software under development, the prior PDF of the key parameter β, and a loss function. Its analytical tractability has made the squared-error loss function the common choice; it places more weight on estimates that are far from the true value than on estimates close to the true value. Higgins and Tsokos [

In the present study, we investigate the effectiveness, in Bayesian analysis, of the commonly used squared-error (S-E) loss function versus the Higgins-Tsokos (H-T) loss function, which puts the loss at the end of the process, for modeling software failure times. To accomplish this, we take the underlying failure distribution to be the Power Law Process, with the Burr PDF as the prior of the key parameter β. In addition, we use both loss functions to perform a sensitivity analysis of the prior selection. We consider parametric and non-parametric priors, namely the Burr, inverted gamma, Jeffreys, and two kernel PDFs. The primary objective of the study is therefore to answer the following questions within a Bayesian framework:

1) How robust is the assumption of the squared-error loss function when it is challenged by the Higgins-Tsokos loss function in estimating the key parameter β of the PLP for modeling software failure times?

2) Is the Bayesian estimate of the intensity function, V ( t ; β , θ ) , of the PLP sensitive to the selections of the prior (parametric and non-parametric) and loss function (Higgins-Tsokos and S-E loss functions)?

The paper is organized as follows. Section 2 describes the theory and development of the Bayesian reliability model. Section 3 presents the results and discussion. Section 4 gives the conclusions.

The probability of achieving n failures of a given system in the time interval (0, t] can be written as

$$P(x = n; t) = \frac{\exp\left\{-\int_0^t V(x)\,dx\right\} \left\{\int_0^t V(x)\,dx\right\}^{n}}{n!}, \quad t > 0, \tag{4}$$

where V ( t ) is the intensity function given by (2). The reduced expression is given by:

$$P(x = n; t) = \frac{1}{n!} \exp\left\{-\left(\frac{t}{\theta}\right)^{\beta}\right\} \left(\frac{t}{\theta}\right)^{n\beta}. \tag{5}$$

This is the PLP, which is commonly known as the Weibull process or a Non-Homogeneous Poisson Process.
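As a quick check of Equation (5), the following sketch evaluates the count probabilities in log space and confirms numerically that they sum to one. The helper names are hypothetical, and the parameter values are used only for illustration.

```python
import math


def plp_mean_count(t, beta, theta):
    """Expected number of failures in (0, t]: the integral of V over (0, t] is (t/theta)**beta."""
    return (t / theta) ** beta


def plp_count_probability(n, t, beta, theta):
    """P(N(t) = n) from Equation (5), computed in log space to avoid overflow in n!."""
    m = plp_mean_count(t, beta, theta)
    return math.exp(-m + n * math.log(m) - math.lgamma(n + 1))


# Illustrative parameter values (assumptions for this example).
t, beta, theta = 50.0, 0.7054, 1.7441
probs = [plp_count_probability(n, t, beta, theta) for n in range(200)]
print(f"expected count = {plp_mean_count(t, beta, theta):.3f}")
print(f"sum of probabilities = {sum(probs):.6f}")
```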

When the PLP is the underlying failure model of the failure times t 1 , t 2 , t 3 , ⋯ , t n − 1 and t n , the conditional reliability function of t n given t 1 , t 2 , t 3 , ⋯ , t n − 1 can be written mathematically as a function of the intensity function, given by:

$$R(t_n \mid t_1, t_2, \cdots, t_{n-1}) = \exp\left\{-\int_{t_{n-1}}^{t_n} V(t; \beta, \theta)\,dt\right\}, \quad t_n > t_{n-1} > 0, \tag{6}$$

since it is independent of t 1 , t 2 , t 3 , ⋯ , t n − 2 .

Note that the improvement in estimating the key parameter β in the R ( t n | t 1 , t 2 , ⋯ , t n − 1 ) of the PLP, Equation (6), will improve the reliability estimation.

The maximum likelihood estimate (MLE) of β is a function of the largest failure time, and the MLE of θ is in turn a function of the MLE of β. Let $T_1, T_2, \cdots, T_n$ denote the first n failure times of the PLP, where $T_1 < T_2 < \cdots < T_n$ are measured in global time; that is, the times are recorded since the initial startup of the system. Thus, the truncated conditional probability density function, $f_i(t \mid t_1, \cdots, t_{i-1})$, in the Weibull process is given by

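$$f_i(t \mid t_1, \cdots, t_{i-1}) = \frac{\beta}{\theta}\left(\frac{t}{\theta}\right)^{\beta - 1} \exp\left\{-\left(\frac{t}{\theta}\right)^{\beta} + \left(\frac{t_{i-1}}{\theta}\right)^{\beta}\right\}, \quad t_{i-1} < t. \tag{7}$$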

With $t = (t_1, t_2, \cdots, t_n)$, the likelihood function for the first n failure times of the PLP, $T_1 = t_1, T_2 = t_2, \cdots, T_n = t_n$, can be written as

$$L(t, \beta) = \exp\left\{-\left(\frac{t_n}{\theta}\right)^{\beta}\right\} \left(\frac{\beta}{\theta}\right)^{n} \prod_{i=1}^{n} \left(\frac{t_i}{\theta}\right)^{\beta - 1}. \tag{8}$$

The MLE for the shape parameter is given by

$$\hat{\beta}_n = \frac{n}{\sum_{i=1}^{n} \log\left(\frac{t_n}{t_i}\right)}, \tag{9}$$

and for the scale parameter is

$$\hat{\theta}_n = \frac{t_n}{n^{1/\hat{\beta}_n}}. \tag{10}$$

Note that the MLE of θ depends on the MLE of β .
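Equations (9) and (10) are straightforward to evaluate. The sketch below applies them to the 40 failure times tabulated in this section; the function name `plp_mle` and the variable names are illustrative assumptions, not the authors' implementation.

```python
import math


def plp_mle(failure_times):
    """Failure-truncated MLEs of (beta, theta) from Equations (9) and (10)."""
    times = sorted(failure_times)
    n = len(times)
    t_n = times[-1]
    log_sum = sum(math.log(t_n / t) for t in times)  # the i = n term contributes 0
    if log_sum <= 0:
        raise ValueError("need at least two distinct positive failure times")
    beta_hat = n / log_sum
    theta_hat = t_n / n ** (1.0 / beta_hat)  # depends on beta_hat, as noted in the text
    return beta_hat, theta_hat


# The 40 failure times listed in the failure-times table of this section.
FAILURE_TIMES = [
    0.7, 3.7, 13.2, 17.6, 54.5, 99.2, 112.2, 120.9, 151, 163,
    174.5, 191.6, 282.8, 355.2, 486.3, 490.5, 513.3, 558.4, 678.1, 688,
    785.9, 887, 1010.7, 1029.1, 1034.4, 1136.1, 1178.9, 1259.7, 1297.9, 1419.7,
    1571.7, 1629.8, 1702.4, 1928.9, 2072.3, 2525.2, 2928.5, 3016.4, 3181, 3256.3,
]

beta_hat, theta_hat = plp_mle(FAILURE_TIMES)
print(f"beta_hat = {beta_hat:.4f}, theta_hat = {theta_hat:.4f}")
```

By construction, the fitted mean count at the last failure time, $(t_n/\hat{\theta}_n)^{\hat{\beta}_n}$, equals n, which gives a simple sanity check on any implementation.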

The authors [

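$$g_B(\beta) = g(\beta; \alpha, \gamma, \delta, \kappa) = \begin{cases} \dfrac{\alpha \kappa \left(\frac{\beta - \gamma}{\delta}\right)^{\alpha - 1}}{\delta \left(1 + \left(\frac{\beta - \gamma}{\delta}\right)^{\alpha}\right)^{\kappa + 1}}, & \gamma \le \beta < \infty, \\ 0, & \text{otherwise}, \end{cases} \tag{11}$$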

where the hyperparameters α, γ, δ and κ are estimated by maximum likelihood in the goodness-of-fit (GOF) test applied to the β estimates.

Failure times | | | | | | |
---|---|---|---|---|---|---|
0.7 | 3.7 | 13.2 | 17.6 | 54.5 | 99.2 | 112.2 |
120.9 | 151 | 163 | 174.5 | 191.6 | 282.8 | 355.2 |
486.3 | 490.5 | 513.3 | 558.4 | 678.1 | 688 | 785.9 |
887 | 1010.7 | 1029.1 | 1034.4 | 1136.1 | 1178.9 | 1259.7 |
1297.9 | 1419.7 | 1571.7 | 1629.8 | 1702.4 | 1928.9 | 2072.3 |
2525.2 | 2928.5 | 3016.4 | 3181 | 3256.3 | - | - |

The MLE of the key parameter β is always affected by the largest failure time, and it is therefore recommended not to treat β as an unknown constant. This recommendation provides the opportunity to study Bayesian analysis in the PLP with respect to various selections of loss functions and priors.

The Bayesian estimates of β will be derived using the squared-error and Higgins-Tsokos loss functions.

The S-E loss function is given by:

$$L(\hat{\xi}, \xi) = (\hat{\xi} - \xi)^2. \tag{12}$$

The risk under the S-E loss function, where $\hat{\xi} = \hat{\beta}$ represents the estimate of $\xi = \beta$, is given by:

$$E[L(\hat{\beta}, \beta)] = \int_{-\infty}^{\infty} (\hat{\beta} - \beta)^2\, h(\beta \mid t)\,d\beta. \tag{13}$$

By differentiating $E[L(\hat{\beta}, \beta)]$ with respect to $\hat{\beta}$ and setting the derivative equal to zero, we solve for $\hat{\beta}$, the Bayesian estimate of β with respect to the S-E loss function and the Burr probability distribution, Equation (11), given by:

$$\hat{\beta}_{B.SE} = \int_{-\infty}^{\infty} \beta \cdot h(\beta \mid t)\,d\beta, \tag{14}$$

where the posterior PDF of β given the data t, $h(\beta \mid t)$, using Bayes' theorem, is given by:

$$h(\beta \mid t) = \frac{L(t \mid \beta)\, g_B(\beta)}{\int_{-\infty}^{\infty} L(t \mid \beta)\, g_B(\beta)\,d\beta}. \tag{15}$$

Then, the Bayesian estimate of β , under the squared-error loss, is given by

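$$\hat{\beta}_{B.SE} = \frac{\displaystyle\int_{\gamma}^{\infty} \frac{\beta^{n+1}}{\theta^{n}} \exp\left\{-\left(\frac{t_n}{\theta}\right)^{\beta}\right\} \prod_{i=1}^{n} \left(\frac{t_i}{\theta}\right)^{\beta - 1} \frac{\left(\frac{\beta - \gamma}{\delta}\right)^{\alpha - 1}}{\left(1 + \left(\frac{\beta - \gamma}{\delta}\right)^{\alpha}\right)^{\kappa + 1}}\,d\beta}{\displaystyle\int_{\gamma}^{\infty} \frac{\beta^{n}}{\theta^{n}} \exp\left\{-\left(\frac{t_n}{\theta}\right)^{\beta}\right\} \prod_{i=1}^{n} \left(\frac{t_i}{\theta}\right)^{\beta - 1} \frac{\left(\frac{\beta - \gamma}{\delta}\right)^{\alpha - 1}}{\left(1 + \left(\frac{\beta - \gamma}{\delta}\right)^{\alpha}\right)^{\kappa + 1}}\,d\beta}. \tag{16}$$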

The H-T loss function (1976) is given by

$$L(\hat{\xi}, \xi) = \frac{f_1 \exp\{f_2(\hat{\xi} - \xi)\} + f_2 \exp\{-f_1(\hat{\xi} - \xi)\}}{f_1 + f_2} - 1, \quad f_1, f_2 > 0. \tag{17}$$

Higgins and Tsokos [

$$E[L(\hat{\beta}, \beta)] = \int_{-\infty}^{\infty} \left[\frac{f_1 \exp\{f_2(\hat{\beta} - \beta)\} + f_2 \exp\{-f_1(\hat{\beta} - \beta)\}}{f_1 + f_2} - 1\right] h(\beta \mid t)\,d\beta. \tag{18}$$

By differentiating $E[L(\hat{\beta}, \beta)]$ with respect to $\hat{\beta}$ and setting the derivative equal to zero, we solve for $\hat{\beta}$, the Bayesian estimate of β with respect to the H-T loss function, given by:

$$\hat{\beta}_{B.HT} = \frac{1}{f_1 + f_2} \ln\left[\frac{\int_{-\infty}^{\infty} \exp\{f_1 \beta\}\, h(\beta \mid t)\,d\beta}{\int_{-\infty}^{\infty} \exp\{-f_2 \beta\}\, h(\beta \mid t)\,d\beta}\right]. \tag{19}$$

The Bayesian estimate of β with respect to the Higgins-Tsokos loss function and Burr probability distribution, as the prior, has h ( β | t ) given by

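$$h(\beta \mid t) = \frac{\left(\frac{\beta}{\theta}\right)^{n} \exp\left\{-\left(\frac{t_n}{\theta}\right)^{\beta}\right\} \prod_{i=1}^{n} \left(\frac{t_i}{\theta}\right)^{\beta - 1} \frac{\left(\frac{\beta - \gamma}{\delta}\right)^{\alpha - 1}}{\left(1 + \left(\frac{\beta - \gamma}{\delta}\right)^{\alpha}\right)^{\kappa + 1}}}{\displaystyle\int_{\gamma}^{\infty} \left(\frac{\beta}{\theta}\right)^{n} \exp\left\{-\left(\frac{t_n}{\theta}\right)^{\beta}\right\} \prod_{i=1}^{n} \left(\frac{t_i}{\theta}\right)^{\beta - 1} \frac{\left(\frac{\beta - \gamma}{\delta}\right)^{\alpha - 1}}{\left(1 + \left(\frac{\beta - \gamma}{\delta}\right)^{\alpha}\right)^{\kappa + 1}}\,d\beta}. \tag{20}$$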
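Because the integrals in Equations (14), (19) and (20) have no closed form, they are evaluated numerically. The following is a minimal grid-based sketch, not the authors' implementation: it assumes θ is known (θ = 1.7441 is used as an example), uses the first ten failure times from the table as data, and takes hypothetical Burr hyperparameters (α = 2, γ = 0, δ = 0.7, κ = 2) chosen purely for illustration.

```python
import math


def log_likelihood(beta, times, theta):
    """Log of the PLP likelihood, Equation (8), for failure-truncated data."""
    n, t_n = len(times), max(times)
    return (-(t_n / theta) ** beta
            + n * math.log(beta / theta)
            + (beta - 1.0) * sum(math.log(t / theta) for t in times))


def log_burr_prior(beta, alpha, gamma, delta, kappa):
    """Log of the Burr prior, Equation (11); -inf outside its support."""
    z = (beta - gamma) / delta
    if z <= 0.0:
        return float("-inf")
    return (math.log(alpha * kappa / delta)
            + (alpha - 1.0) * math.log(z)
            - (kappa + 1.0) * math.log1p(z ** alpha))


def bayes_estimates(times, theta, log_prior, f1, f2, lo=1e-3, hi=5.0, steps=5000):
    """Grid approximation of the S-E estimate (14) and the H-T estimate (19)."""
    grid = [lo + (hi - lo) * j / steps for j in range(steps + 1)]
    log_post = [log_likelihood(b, times, theta) + log_prior(b) for b in grid]
    peak = max(log_post)
    weights = [math.exp(v - peak) for v in log_post]  # unnormalised posterior on the grid
    total = sum(weights)
    mean = sum(b * w for b, w in zip(grid, weights)) / total
    upper = sum(math.exp(f1 * b) * w for b, w in zip(grid, weights)) / total
    lower = sum(math.exp(-f2 * b) * w for b, w in zip(grid, weights)) / total
    return mean, math.log(upper / lower) / (f1 + f2)


# First ten failure times from the table; hyperparameters are hypothetical.
times = [0.7, 3.7, 13.2, 17.6, 54.5, 99.2, 112.2, 120.9, 151.0, 163.0]
prior = lambda b: log_burr_prior(b, alpha=2.0, gamma=0.0, delta=0.7, kappa=2.0)
beta_se, beta_ht = bayes_estimates(times, 1.7441, prior, f1=1.0, f2=1.0)
print(f"S-E estimate = {beta_se:.4f}, H-T estimate = {beta_ht:.4f}")
```

As f1 and f2 shrink toward zero, the H-T estimate in Equation (19) approaches the posterior mean, so the two estimates returned by this sketch should nearly coincide for very small f1 and f2.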

With the use of Equation (6), the conditional reliability of t i , the analytical structure of the conditional Bayesian reliability estimate for the PLP that is subject to the above information is given by:

$$\hat{R}_B(t_i \mid t_1, t_2, \cdots, t_{i-1}) = \exp\left\{-\int_{t_{i-1}}^{t_i} \hat{V}'_B(t; \beta, \theta)\,dt\right\}, \quad t_i > t_{i-1} > 0, \tag{21}$$

where

$$\hat{V}'_B(t; \hat{\beta}_B^{*}, \theta) = \frac{\hat{\beta}_B^{*}}{\theta}\left(\frac{t}{\theta}\right)^{\hat{\beta}_B^{*} - 1}, \quad \theta > 0,\ t > 0, \tag{22}$$

where $\hat{\beta}_B^{*}$ is the Bayesian estimate, $\hat{\beta}_{B.SE}$ or $\hat{\beta}_{B.HT}$, for the squared-error or Higgins-Tsokos loss function, respectively. We are also interested in comparing the Bayesian estimates of the subject parameter under the Higgins-Tsokos loss function for different parametric and non-parametric priors, and with respect to its MLE given by Equation (9), assuming that β behaves randomly and θ is known, as well as in comparing Equation (10) with an adjusted MLE considered as a function of β.

In this section, we seek the answer to the following question: Is the Bayesian MLE estimate of the intensity function, V(t; β, θ), of the PLP sensitive to the selection of the prior (parametric and non-parametric) and of the loss function (Higgins-Tsokos and S-E loss functions)? Assuming β is a random variable, and using simulated data, a sensitivity analysis was carried out for the following parametric and non-parametric priors ( [

1) Jeffreys’ prior ( [

$$g_J(\beta) \propto \sqrt{I(\beta)} = \sqrt{-E\left(\frac{\partial^2 \log L(t; \beta)}{\partial \beta^2}\right)} \propto \frac{1}{\beta}, \quad \beta > 0. \tag{23}$$

2) The inverted gamma: The PLP and inverted gamma probability distributions belong to the exponential family of probability distributions, which makes the latter a logical choice for an informative parametric prior for β . The inverted gamma probability distribution is given by:

$$g_{IG}(\beta) \propto \left(\frac{\mu}{\beta}\right)^{v + 1} \frac{1}{\mu\, \Gamma(v)} \exp\left\{-\frac{\mu}{\beta}\right\}, \quad \beta > 0,\ \mu > 0,\ v > 0, \tag{24}$$

where v and μ are the shape and scale parameters.

3) Kernel prior:

Kernel probability density estimation is a non-parametric method for approximating the PDF of β from a finite data set. It is given by:

$$g_K(\beta) = \frac{1}{nh} \sum_{i=1}^{n} K\left(\frac{\beta - \beta_i}{h}\right), \tag{25}$$

where K is the kernel function and h is a positive number called the bandwidth.
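Equation (25) can be sketched in a few lines of Python. The sample of β values and the bandwidth below are hypothetical placeholders; in the study the bandwidth is chosen by minimizing the AMISE, which this example does not attempt.

```python
import math


def epanechnikov(u):
    """Epanechnikov kernel, supported on |u| <= 1."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def gaussian(u):
    """Standard Gaussian kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kernel_prior(beta, beta_samples, h, kernel=epanechnikov):
    """Kernel density estimate g_K(beta) from Equation (25)."""
    n = len(beta_samples)
    return sum(kernel((beta - b) / h) for b in beta_samples) / (n * h)


# Hypothetical beta estimates and bandwidth, for illustration only.
BETA_SAMPLES = [0.62, 0.66, 0.69, 0.71, 0.74, 0.78]
BANDWIDTH = 0.05

for b in (0.60, 0.70, 0.80):
    print(f"g_K({b:.2f}) = {kernel_prior(b, BETA_SAMPLES, BANDWIDTH):.4f}")
```

Any kernel that integrates to one yields a prior that also integrates to one, so switching between the Epanechnikov and Gaussian kernels only requires passing a different `kernel` argument.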

Assuming Jeffreys’ PDF (23) as the prior of β and using the likelihood (8) and (15), the posterior density of β is:

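$$h_J(\bar{t} \mid \beta) = \frac{\exp\left\{-\left(\frac{t_n}{\theta}\right)^{\beta}\right\} \frac{\beta^{n-1}}{\theta^{n\beta}} \prod_{i=1}^{n} t_i^{\beta - 1}}{\displaystyle\int_0^{\infty} \exp\left\{-\left(\frac{t_n}{\theta}\right)^{\beta}\right\} \frac{\beta^{n-1}}{\theta^{n\beta}} \prod_{i=1}^{n} t_i^{\beta - 1}\,d\beta}. \tag{26}$$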

Thus, the Jeffreys’ Bayesian estimate of the key parameter β under the S-E and H-T loss functions, using (14) and (19), are given by:

$$\hat{\beta}_{B.SE}^{J} = \int_0^{\infty} \beta \cdot h_J(\bar{t} \mid \beta)\,d\beta, \tag{27}$$

and

$$\hat{\beta}_{B.HT}^{J} = \frac{1}{f_1 + f_2} \ln\left[\frac{\int_0^{\infty} \exp\{f_1 \beta\}\, h_J(\bar{t} \mid \beta)\,d\beta}{\int_0^{\infty} \exp\{-f_2 \beta\}\, h_J(\bar{t} \mid \beta)\,d\beta}\right]. \tag{28}$$

We must rely on numerical estimation because closed-form solutions cannot be obtained for either $\hat{\beta}_{B.SE}^{J}$ or $\hat{\beta}_{B.HT}^{J}$. Note also that these estimates depend on knowing, or being able to estimate, the scale parameter θ.

The following is an examination of the problem when the prior density of β is given by the inverted gamma (24). Using the likelihood (8), the posterior density of β is given by:

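$$h_{IG}(t \mid \beta) = \frac{\frac{\beta^{n - v - 1}}{\theta^{n\beta}} \exp\left\{-\left(\frac{t_n}{\theta}\right)^{\beta} - \frac{\mu}{\beta}\right\} \prod_{i=1}^{n} t_i^{\beta - 1}}{\displaystyle\int_0^{\infty} \frac{\beta^{n - v - 1}}{\theta^{n\beta}} \exp\left\{-\left(\frac{t_n}{\theta}\right)^{\beta} - \frac{\mu}{\beta}\right\} \prod_{i=1}^{n} t_i^{\beta - 1}\,d\beta}. \tag{29}$$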

Thus, the Bayesian estimates of β under the inverted gamma with respect to the S-E and H-T loss functions, using (14) and (19), are given by:

$$\hat{\beta}_{B.SE}^{IG} = \int_0^{\infty} \beta \cdot h_{IG}(\bar{t} \mid \beta)\,d\beta, \tag{30}$$

and

$$\hat{\beta}_{B.HT}^{IG} = \frac{1}{f_1 + f_2} \ln\left[\frac{\int_0^{\infty} \exp\{f_1 \beta\}\, h_{IG}(t \mid \beta)\,d\beta}{\int_0^{\infty} \exp\{-f_2 \beta\}\, h_{IG}(t \mid \beta)\,d\beta}\right]. \tag{31}$$

Here as well, we must rely on numerical estimation because closed-form solutions cannot be obtained for $\hat{\beta}_{B.SE}^{IG}$ and $\hat{\beta}_{B.HT}^{IG}$. Note also that these estimates depend on knowing, or being able to estimate, the scale parameter θ.

Assuming Kernel density (25) as the prior of β and using the likelihood (8), the posterior density of β is:

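$$h_K(\bar{t} \mid \beta) = \frac{\exp\left\{-\left(\frac{t_n}{\theta}\right)^{\beta}\right\} \frac{\beta^{n}}{\theta^{n\beta}} \prod_{i=1}^{n} t_i^{\beta - 1}\, \frac{1}{nh} \sum_{i=1}^{n} K\left(\frac{\beta - \beta_i}{h}\right)}{\displaystyle\int_0^{\infty} \exp\left\{-\left(\frac{t_n}{\theta}\right)^{\beta}\right\} \frac{\beta^{n}}{\theta^{n\beta}} \prod_{i=1}^{n} t_i^{\beta - 1}\, \frac{1}{nh} \sum_{i=1}^{n} K\left(\frac{\beta - \beta_i}{h}\right) d\beta}. \tag{32}$$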

Thus, the kernel Bayesian estimates of the key parameter β under the S-E and H-T loss functions, (14) and (19), are given by:

$$\hat{\beta}_{B.SE}^{K} = \int_0^{\infty} \beta \cdot h_K(\bar{t} \mid \beta)\,d\beta, \tag{33}$$

and

$$\hat{\beta}_{B.HT}^{K} = \frac{1}{f_1 + f_2} \ln\left[\frac{\int_{\gamma}^{\infty} \exp\{f_1 \beta\}\, h_K(\bar{t} \mid \beta)\,d\beta}{\int_{\gamma}^{\infty} \exp\{-f_2 \beta\}\, h_K(\bar{t} \mid \beta)\,d\beta}\right]. \tag{34}$$

We must rely on numerical estimation because closed-form solutions cannot be obtained for $\hat{\beta}_{B.SE}^{K}$ and $\hat{\beta}_{B.HT}^{K}$. Note also that these estimates depend on knowing, or being able to estimate, the scale parameter θ. In addition, the kernel function, K(u), and the bandwidth, h, will be chosen to minimize the asymptotic mean integrated squared error (AMISE), given by:

$$\text{AMISE}(\hat{f}(\beta)) = \int E\left[\left(\hat{f}(\beta) - f(\beta)\right)^2\right] d\beta, \tag{35}$$

where $\hat{f}(\beta)$ and $f(\beta)$ are the estimated and the true probability density of β, respectively.

A Monte Carlo simulation was used to compare the Bayesian approach, under the S-E and H-T loss functions, with the MLE approach. The parameter β of the intensity function of the PLP was estimated using numerical integration techniques in conjunction with a Monte Carlo simulation to obtain its Bayesian estimates. Substituting these estimates into the intensity function, we obtained the Bayesian intensity function estimates, from which the reliability function can be estimated.

For a given value of the parameter θ , a stochastic value for the parameter β was generated from a prior probability density. For a pair of values of θ and β , 400 samples of 40 failure times that follow a PLP were generated. This procedure was repeated 250 times and for three distinct values of θ . The procedure is based on the schematic diagram given by Algorithm 1.
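One way to generate PLP failure times, sketched below, uses the fact that the transformed times $(T_i/\theta)^{\beta}$ are the arrival times of a unit-rate homogeneous Poisson process. This is a simplified illustration rather than the authors' Algorithm 1: β is held fixed instead of being drawn from a prior, only the MLE of β is evaluated, and the function names and repetition count are assumptions.

```python
import math
import random


def simulate_plp(n, beta, theta, rng):
    """First n failure times of a PLP: T_i = theta * (E_1 + ... + E_i)**(1/beta), E_j ~ Exp(1)."""
    total, times = 0.0, []
    for _ in range(n):
        total += rng.expovariate(1.0)
        times.append(theta * total ** (1.0 / beta))
    return times


def mle_beta(times):
    """MLE of the shape parameter, Equation (9)."""
    t_n = times[-1]
    return len(times) / sum(math.log(t_n / t) for t in times)


def mse_of_mle(beta, theta, n=40, repetitions=2000, seed=0):
    """Monte Carlo mean squared error of the MLE of beta for a fixed (beta, theta)."""
    rng = random.Random(seed)
    errors = [(mle_beta(simulate_plp(n, beta, theta, rng)) - beta) ** 2
              for _ in range(repetitions)]
    return sum(errors) / repetitions


print(f"MSE of the MLE of beta: {mse_of_mle(0.7054, 1.7441):.5f}")
```

The same loop can be extended by replacing `mle_beta` with a Bayesian estimator to compare mean squared errors across estimation methods.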

Acronyms | |
---|---|
HPP | Homogeneous Poisson Process |
NHPP | Non-Homogeneous Poisson Process |
PLP | Power Law Process |
MLE | Maximum likelihood estimate |
PDF | Probability density function |
CDF | Cumulative distribution function |
Notations | |
β and θ | Shape and scale parameters of the PLP |
$T_1, T_2, \cdots, T_n$ | First n successive failure times of the PLP |
$V(t; \beta, \theta)$ | Intensity function of the PLP |
RE | Relative efficiency |
AMISE | Asymptotic mean integrated squared error |
H-T | Higgins-Tsokos |
S-E | Squared-Error |
$\hat{\beta}_{HT}$ | Bayesian estimate of β under the H-T loss function |
$\hat{\beta}_{SE}$ | Bayesian estimate of β under the S-E loss function |
$\hat{V}_{HT}$ | Bayesian MLE estimate of $V(t; \beta, \theta)$ under the H-T loss function |
$\hat{V}_{SE}$ | Bayesian MLE estimate of $V(t; \beta, \theta)$ under the S-E loss function |
B.HT | Bayesian estimate under the Burr PDF and the H-T loss function |
B.SE | Bayesian estimate under the Burr PDF and the S-E loss function |

Algorithm 1. Simulation to analyze Bayesian estimates of β for a given θ .

For each sample of size 40, the Bayesian estimates and MLEs of the parameter were calculated for θ ∈ {0.5, 1.7441, 4}. The comparison is based on the mean squared error (MSE) averaged over the 100,000 repetitions. The results are given in

For different sample sizes, the Bayesian estimates under S-E and H-T loss functions and the MLEs of the parameter β were calculated and averaged over 10,000 repetitions.

It can be observed that the Bayesian estimates of β are closer to the true value than the MLE of β, and that the Bayesian estimate under the H-T loss function performs slightly better, even for a sample size as small as n = 20. A graphical comparison of the true value of β with the Bayesian estimates (under both the S-E and H-T loss functions) and the MLE, as a function of sample size, is given in

θ | MSE of $\hat{\beta}$ | MSE of $\hat{\beta}_{B.SE}$ | MSE of $\hat{\beta}_{B.HT}$ |
---|---|---|---|
0.50000 | 0.01124360 | 0.0005077610 | 0.000507356 |
1.74410 | 0.01105730 | 0.0005163560 | 0.000516057 |
4.00000 | 0.01096100 | 0.0005190550 | 0.000518632 |

For the considered sample sizes, the MSEs of the Bayesian estimates of β are substantially smaller than the MSE of the MLE of β. The Bayesian estimate under the H-T loss function performed slightly better than the Bayesian estimate under the S-E loss function.

n | β (fixed) | $\hat{\beta}$ | $\hat{\beta}_{B.SE}$ | $\hat{\beta}_{B.HT}$ |
---|---|---|---|---|
20 | 0.7054 | 0.784026 | 0.673706 | 0.675263 |
30 | 0.7054 | 0.756617 | 0.689413 | 0.690189 |
40 | 0.7054 | 0.743982 | 0.695989 | 0.696467 |
50 | 0.7054 | 0.735310 | 0.698826 | 0.699158 |
60 | 0.7054 | 0.729563 | 0.700393 | 0.700642 |
70 | 0.7054 | 0.725977 | 0.701493 | 0.701690 |
80 | 0.7054 | 0.723338 | 0.702220 | 0.702382 |
100 | 0.7054 | 0.719117 | 0.703049 | 0.703165 |
120 | 0.7054 | 0.716315 | 0.703496 | 0.703585 |
140 | 0.7054 | 0.714821 | 0.703909 | 0.703980 |
160 | 0.7054 | 0.713641 | 0.704185 | 0.704244 |

Since the Bayesian estimates under both loss functions for β are superior to its MLE, Molinares and Tsokos [

These proposed adjusted estimates, $\hat{\theta}_{B.SE}$ and $\hat{\theta}_{B.HT}$, were averaged over the 10,000 repetitions. It can be seen that, owing to the Bayesian influence on β, $\hat{\theta}_{B.SE}$ and $\hat{\theta}_{B.HT}$ are better estimates than the MLE of θ ($\hat{\theta}$). This can also be seen in

The MSEs of the adjusted estimates of the scale parameter (θ) are significantly smaller than the MSE of the MLE. The MSEs of the adjusted estimates are then displayed alone in

It can be noticed that the adjusted estimate of θ under the influence of the Bayesian estimate with the H-T loss function, is better, particularly when considering small sample sizes.

We computed the adjusted estimate for the parameter θ and its MSE over 10000 repetitions for different values of θ and sample size n = 40 . The results are given in

The adjusted estimates of θ were more accurate for small true values of θ than for larger values.

n | θ | $\hat{\theta}_{MLE}$ | $\hat{\theta}_{B.SE}$ | $\hat{\theta}_{B.HT}$ |
---|---|---|---|---|
20 | 1.7441 | 3.17139 | 1.3507 | 1.36422 |
30 | 1.7441 | 2.908 | 1.50142 | 1.5097 |
40 | 1.7441 | 2.73107 | 1.57545 | 1.58115 |
50 | 1.7441 | 2.59245 | 1.61556 | 1.61985 |
60 | 1.7441 | 2.48865 | 1.64065 | 1.64406 |
70 | 1.7441 | 2.41782 | 1.65803 | 1.66084 |
80 | 1.7441 | 2.36522 | 1.67055 | 1.67294 |
100 | 1.7441 | 2.26774 | 1.68719 | 1.68902 |
120 | 1.7441 | 2.20117 | 1.69776 | 1.69923 |
140 | 1.7441 | 2.15539 | 1.70537 | 1.70659 |
160 | 1.7441 | 2.11872 | 1.71089 | 1.71193 |

θ | $\hat{\theta}_{B.SE}$ | $\hat{\theta}_{B.HT}$ | MSE of $\hat{\theta}_{B.SE}$ | MSE of $\hat{\theta}_{B.HT}$ |
---|---|---|---|---|
0.50 | 0.5020220 | 0.5033140 | 0.00692919 | 0.00691164 |
1.74410 | 1.746390 | 1.75090 | 0.0830342 | 0.0827802 |
4.0 | 3.999920 | 4.010250 | 0.440474 | 0.439035 |

The slight improvements in the estimation of the shape and scale parameters of the PLP are expected to jointly improve the estimate of the intensity function and therefore the reliability estimation of the software. For a fixed value of θ = 1.7441 and a sample size equal to that of the collected data, n = 40, the estimates of the intensity function $\hat{V}'_{MLE}(t)$, $\hat{V}'_{B.SE}(t)$, and $\hat{V}'_{B.HT}(t)$ were obtained by using $\hat{\beta}$, $\hat{\beta}_{B.SE}$, and $\hat{\beta}_{B.HT}$, respectively, in (2). That is,

$$\hat{V}'_{MLE}(t) = \frac{\hat{\beta}}{\theta}\left(\frac{t}{\theta}\right)^{\hat{\beta} - 1}, \quad \theta > 0,\ t > 0. \tag{36}$$

$$\hat{V}'_{B.SE}(t) = \frac{\hat{\beta}_{B.SE}}{\theta}\left(\frac{t}{\theta}\right)^{\hat{\beta}_{B.SE} - 1}, \quad \theta > 0,\ t > 0. \tag{37}$$

$$\hat{V}'_{B.HT}(t) = \frac{\hat{\beta}_{B.HT}}{\theta}\left(\frac{t}{\theta}\right)^{\hat{\beta}_{B.HT} - 1}, \quad \theta > 0,\ t > 0. \tag{38}$$

Their graphs (

In order to obtain Bayesian estimates of the intensity function, V ^ B . S E ∗ and V ^ B . H T ∗ , we substituted the Bayesian estimates of β and its corresponding θ MLE in (2):

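$$\hat{V}^{*}_{B.SE}(t) = \frac{\hat{\beta}_{B.SE}}{\hat{\theta}}\left(\frac{t}{\hat{\theta}}\right)^{\hat{\beta}_{B.SE} - 1}, \quad t > 0. \tag{39}$$

$$\hat{V}^{*}_{B.HT}(t) = \frac{\hat{\beta}_{B.HT}}{\hat{\theta}}\left(\frac{t}{\hat{\theta}}\right)^{\hat{\beta}_{B.HT} - 1}, \quad t > 0. \tag{40}$$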

The MLE of the intensity function, V ^ M L E , is obtained using the MLEs of β and θ . That is,

$$\hat{V}_{MLE}(t) = \frac{\hat{\beta}}{\hat{\theta}}\left(\frac{t}{\hat{\theta}}\right)^{\hat{\beta} - 1}, \quad t > 0. \tag{41}$$

The Bayesian MLE of the intensity function under the influence of the Bayesian estimates of β , denoted by V ^ B . S E and V ^ B . H T , are obtained by substituting β ^ B . H T and β ^ B . S E with θ ^ B . H T and θ ^ B . S E , respectively, in (2):

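$$\hat{V}_{B.SE}(t) = \frac{\hat{\beta}_{B.SE}}{\hat{\theta}_{B.SE}}\left(\frac{t}{\hat{\theta}_{B.SE}}\right)^{\hat{\beta}_{B.SE} - 1}, \quad t > 0, \tag{42}$$

and

$$\hat{V}_{B.HT}(t) = \frac{\hat{\beta}_{B.HT}}{\hat{\theta}_{B.HT}}\left(\frac{t}{\hat{\theta}_{B.HT}}\right)^{\hat{\beta}_{B.HT} - 1}, \quad t > 0. \tag{43}$$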

To measure the robustness of V ^ B . H T with respect to V ^ B . S E and V ^ M L E , we calculated the relative efficiency (RE) of the estimate V ^ B . H T compared to the estimate V ^ B . S E defined by:

$$RE(\hat{V}_{B.HT}, \hat{V}_{B.SE}) = \frac{\int_{-\infty}^{\infty} \left[\hat{V}_{B.HT}(t) - V(t)\right]^2 dt}{\int_{-\infty}^{\infty} \left[\hat{V}_{B.SE}(t) - V(t)\right]^2 dt}. \tag{44}$$

If R E = 1 , V ^ B . H T and V ^ B . S E will be interpreted as equally efficient. If R E < 1 , V ^ B . H T is more efficient than V ^ B . S E . To the contrary, if R E > 1 , V ^ B . H T is less efficient than V ^ B . S E . Similarly, we compared V ^ B . H T and V ^ M L E . Bayesian estimates and MLEs for the parameter β = 0.7054 and θ = 1.7441 (

For the comparison of V ^ B . H T and V ^ B . S E , the R E ( V ^ B . H T , V ^ B . S E ) is less than 1, which implies that the intensity function using β ^ B . H T and θ ^ B . H T is more efficient than the intensity function under β ^ B . S E and θ ^ B . S E . Comparing V ^ B . H T and V ^ B . S E to V ^ M L E , we obtained a similar result, establishing the superior relative efficiency of Bayesian estimates over MLE estimates. The corresponding graphs for the intensity functions are given in

In addition, $\hat{V}^{*}_{B.HT}$ and $\hat{V}^{*}_{B.SE}$, which are computed using the Bayesian estimates of β and the MLE of θ, were less efficient compared to $\hat{V}_{MLE}$, $\hat{V}_{B.SE}$, and $\hat{V}_{B.HT}$. Based on these results, the Bayesian estimates under the H-T loss function will be used to analyze the real data.

β | $\hat{\beta}$ | $\hat{\beta}_{B.SE}$ | $\hat{\beta}_{B.HT}$ | θ | $\hat{\theta}$ | $\hat{\theta}_{B.SE}$ | $\hat{\theta}_{B.HT}$ |
---|---|---|---|---|---|---|---|
0.7054 | 0.743982 | 0.695989 | 0.696467 | 1.7441 | 2.73107 | 1.57545 | 1.58115 |

$V(t)$ | $\hat{V}_{MLE}$ | $\hat{V}_{B.SE}$ | $\hat{V}_{B.HT}$ |
---|---|---|---|
$0.476465 \cdot t^{-0.2946}$ | $0.352321 \cdot t^{-0.256018}$ | $0.507238 \cdot t^{-0.304011}$ | $0.5062 \cdot t^{-0.303533}$ |

$RE(\hat{V}_{B.SE}, \hat{V}_{MLE})$ | $RE(\hat{V}_{B.HT}, \hat{V}_{MLE})$ | $RE(\hat{V}_{B.HT}, \hat{V}_{B.SE})$ |
---|---|---|
0.0877460 | 0.07619190 | 0.8683240 |
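The relative efficiency in Equation (44) is a ratio of integrated squared errors and can be approximated numerically. The sketch below uses the fitted power curves from the table above and integrates over the finite interval [1, 3000]. That interval is an assumption made here for illustration, since the integration range used for the reported values is not stated, so the printed ratios need not match the tabulated ones.

```python
def power_curve(coeff, exponent):
    """Return the function t -> coeff * t**exponent."""
    return lambda t: coeff * t ** exponent


def integrated_squared_error(v_hat, v_true, lo, hi, steps=20000):
    """Midpoint-rule approximation of the integral of (v_hat - v_true)**2 over [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.0
    for j in range(steps):
        t = lo + (j + 0.5) * h
        total += (v_hat(t) - v_true(t)) ** 2
    return total * h


def relative_efficiency(v_a, v_b, v_true, lo, hi):
    """RE(v_a, v_b): values below 1 mean v_a is closer to v_true than v_b on [lo, hi]."""
    return (integrated_squared_error(v_a, v_true, lo, hi)
            / integrated_squared_error(v_b, v_true, lo, hi))


# Fitted curves taken from the table; the interval [1, 3000] is an assumed range.
v_true = power_curve(0.476465, -0.2946)
v_mle = power_curve(0.352321, -0.256018)
v_se = power_curve(0.507238, -0.304011)
v_ht = power_curve(0.5062, -0.303533)

for name, a, b in (("RE(SE, MLE)", v_se, v_mle), ("RE(HT, MLE)", v_ht, v_mle), ("RE(HT, SE)", v_ht, v_se)):
    print(f"{name} = {relative_efficiency(a, b, v_true, 1.0, 3000.0):.4f}")
```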

Using the reliability growth data from

For the failure data of Crow, provided in

$$\hat{V}_{B.HT}(t) = 0.347933 \cdot t^{-0.498801}, \quad t > 0. \tag{45}$$

To obtain a Bayesian MLE for the reliability function under H-T loss function, we use this Bayesian estimate for the intensity function. The analytical form for the corresponding Bayesian reliability estimate, based on the data, is given by:

$$\hat{R}_{B.HT}(t_i \mid t_1, \cdots, t_{i-1}) = \exp\left\{-0.347933 \int_{t_{i-1}}^{t_i} x^{-0.498801}\,dx\right\}, \quad t_i > t_{i-1} > 0. \tag{46}$$

Thus, the conditional reliability of the software given that the last two failure times were t 39 = 3181 and t 40 = 3256.3 is approximately 63%.
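The 63% figure can be reproduced directly from Equation (46), since the integral of a power function has a closed form. The short sketch below does this; the function name is an illustrative choice.

```python
import math


def reliability_between(t_prev, t_next, coeff, exponent):
    """exp{-coeff * integral of x**exponent over [t_prev, t_next]}, valid for exponent != -1."""
    if not 0 < t_prev < t_next:
        raise ValueError("require 0 < t_prev < t_next")
    p = exponent + 1.0
    integral = (t_next ** p - t_prev ** p) / p
    return math.exp(-coeff * integral)


# Coefficient and exponent from Equation (45); failure times t_39 and t_40 from the data.
r = reliability_between(3181.0, 3256.3, 0.347933, -0.498801)
print(f"conditional reliability = {r:.3f}")  # about 0.63
```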

Algorithm 2. Estimate of the intensity function using Crow data in

To answer the second research question, “Is the Bayesian estimate of the intensity function, V ( t ; β , θ ) , of the PLP sensitive to the selections of the prior (both parametric and non-parametric priors) and loss function?”, we developed a simulation procedure, Algorithm 3, given below.

The algorithm compares the Bayesian and MLE estimates of the intensity function, V(t; β, θ), under different prior PDFs and for various sample sizes, with the H-T and S-E loss functions. The relative efficiency is used to compare these estimates of V(t; β, θ). A relative efficiency less than 1, larger than 1, or approximately equal to 1 indicates that the Bayesian estimate under the H-T loss function is more efficient than, less efficient than, or as efficient as the Bayesian estimate under the S-E loss function, respectively; the same interpretation applies when the comparison is with the MLE of V(t; β, θ). The algorithm starts by initializing the shape and scale parameters of the PLP, β and θ, respectively, and the number of iterations p.

Algorithm 3. Simulation to compare Bayesian and MLE estimates of the intensity function. Notations found in

For various sample sizes (n = 20, 40, 80, 140), random failure times distributed according to the PLP are simulated using the initialized values of the PLP parameters. The Bayesian estimates and the MLE of the key parameter β are then computed, and each is used to compute the corresponding estimate of θ. After a predetermined number of iterations, the average values of the Bayesian and MLE estimates of β and θ are used to obtain the analytical forms of V(t; β, θ) under the Bayesian approach, for both the H-T and S-E loss functions, and under the MLE approach, namely $\hat{V}_{HT}$, $\hat{V}_{SE}$, and $\hat{V}_{MLE}$, respectively. The inverted gamma and Burr PDFs were considered as informative parametric priors, whereas Jeffreys' prior was chosen as the non-informative prior. In addition, a kernel probability density function is selected as a non-parametric prior PDF. Kernel density estimation depends on the sample size, the bandwidth, and the choice of the kernel function K(u). In this study, the optimal bandwidth ($h^{*}$) and the kernel function were chosen to minimize the asymptotic mean integrated squared error (AMISE). The simplified form of the AMISE is:

$$\text{AMISE}(\hat{f}(\beta)) = \frac{C(K)}{n \cdot h} + \frac{1}{4} \cdot h^{4} \cdot k_2^{2} \cdot R\left(f^{(2)}(\beta)\right), \tag{47}$$

where:

$C(K) = \int (K(u))^2\,du$,

n is the sample size,

h is the bandwidth,

$k_2 = \int_{-\infty}^{+\infty} u^2 \cdot K(u)\,du$,

$f^{(2)}(\beta)$ is the second derivative of the Burr PDF,

$R(f^{(2)}(\beta)) = \int \left(f^{(2)}(\beta)\right)^2 d\beta$, and the optimal bandwidth is

$$h^{*} = \left[\frac{C(K)}{k_2^{2} \cdot R\left(f^{(2)}(\beta)\right)}\right]^{1/5} \cdot n^{-1/5}.$$
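The kernel constants C(K) and $k_2$ and the optimal bandwidth can be computed numerically, as in the sketch below. The roughness value $R(f^{(2)})$ depends on the fitted Burr hyperparameters, which are not reproduced here, so a hypothetical placeholder value is used; all function names are illustrative.

```python
import math


def integrate(f, lo, hi, steps=20000):
    """Midpoint rule on [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (j + 0.5) * h) for j in range(steps))


def epanechnikov(u):
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def gaussian(u):
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kernel_constants(kernel, lo, hi):
    """Return (C(K), k_2) for a kernel whose mass lies essentially within [lo, hi]."""
    c = integrate(lambda u: kernel(u) ** 2, lo, hi)
    k2 = integrate(lambda u: u * u * kernel(u), lo, hi)
    return c, k2


def amise(c, k2, roughness, n, h):
    """Simplified AMISE from Equation (47)."""
    return c / (n * h) + 0.25 * h ** 4 * k2 ** 2 * roughness


def optimal_bandwidth(c, k2, roughness, n):
    """Bandwidth h* that minimises the simplified AMISE."""
    return (c / (k2 ** 2 * roughness)) ** 0.2 * n ** -0.2


ROUGHNESS = 50.0  # hypothetical value of R(f''), for illustration only
for name, kernel, lo, hi in (("Epanechnikov", epanechnikov, -1.0, 1.0), ("Gaussian", gaussian, -10.0, 10.0)):
    c, k2 = kernel_constants(kernel, lo, hi)
    h_star = optimal_bandwidth(c, k2, ROUGHNESS, 100)
    print(f"{name}: C(K)={c:.4f}, k2={k2:.4f}, h*={h_star:.4f}, AMISE={amise(c, k2, ROUGHNESS, 100, h_star):.4f}")
```

Setting the derivative of Equation (47) with respect to h to zero gives $h^5 = C(K) / (n\, k_2^2\, R)$, which is the expression implemented in `optimal_bandwidth`.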

The AMISE was calculated numerically using the optimal bandwidth, for different sample sizes and for each kernel function considered in this study, namely the Epanechnikov, Cosine, Biweight, Triweight, Gaussian, Triangle, Uniform, Tricube, and Logistic kernel functions. The results are given in

The minimum AMISE corresponds to the Epanechnikov kernel function ($K(u) = \frac{3}{4}(1 - u^2)\, I_{|u| \le 1}$). In addition to the Epanechnikov kernel function, the Gaussian kernel function ($K(u) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{u^2}{2}\right) I_{\mathbb{R}}$) was also used in the calculation, since it is commonly used for its analytical tractability.

Kernel function | n = 50 | n = 100 | n = 150 | n = 200 | n = 500 |
---|---|---|---|---|---|
Epanechnikov | 0.362827 | 0.208389 | 0.150662 | 0.119688 | 0.0575042 |
Cosine | 0.362986 | 0.208481 | 0.150728 | 0.119741 | 0.0575295 |
Biweight | 0.364607 | 0.209412 | 0.151401 | 0.120275 | 0.0577863 |
Triweight | 0.366740 | 0.210637 | 0.152286 | 0.120979 | 0.0581243 |
Gaussian | 0.377644 | 0.216900 | 0.156814 | 0.124576 | 0.0598525 |
Triangle | 0.366972 | 0.210770 | 0.152383 | 0.121056 | 0.0581612 |
Uniform | 0.384675 | 0.220938 | 0.159734 | 0.126895 | 0.0609669 |
Tricube | 0.363433 | 0.208737 | 0.150913 | 0.119888 | 0.0576002 |
Logistic | 0.399132 | 0.229241 | 0.165737 | 0.131665 | 0.0632582 |

Numerical integration techniques were used to compute the Bayesian estimates of the parameters of the intensity function, V(t; β, θ), under both the H-T and S-E loss functions according to the equations defined in Section 2.3, for each of the parametric and non-parametric prior PDFs. Samples of size 20, 40, 80, and 140 were generated, with the parameters β and θ initialized to 0.7054 and 1.7441, respectively. In the analytical form (17), $f_1$ and $f_2$ are constrained to be positive and play a large role in assigning the weight of the loss depending on whether the estimator underestimates or overestimates. Therefore, the simulation procedure was repeated three times according to the following cases:

1) f 1 > f 2

2) f 1 < f 2

3) f 1 = f 2
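The effect of the three cases can be explored by evaluating the H-T loss in Equation (17) for errors of equal magnitude and opposite sign. The sketch below is illustrative only; the values chosen for $f_1$, $f_2$ and the error size are assumptions.

```python
import math


def ht_loss(estimate, truth, f1, f2):
    """Higgins-Tsokos loss, Equation (17), for f1, f2 > 0."""
    d = estimate - truth
    return (f1 * math.exp(f2 * d) + f2 * math.exp(-f1 * d)) / (f1 + f2) - 1.0


def se_loss(estimate, truth):
    """Squared-error loss, Equation (12)."""
    return (estimate - truth) ** 2


truth, error = 0.7054, 0.3  # illustrative true value and error magnitude
for label, f1, f2 in (("f1 > f2", 2.0, 1.0), ("f1 < f2", 1.0, 2.0), ("f1 = f2", 1.5, 1.5)):
    over = ht_loss(truth + error, truth, f1, f2)
    under = ht_loss(truth - error, truth, f1, f2)
    print(f"{label}: H-T loss when overestimating = {over:.4f}, when underestimating = {under:.4f}")
print(f"S-E loss for either sign of the error = {se_loss(truth + error, truth):.4f}")
```

The loss is zero when the estimate equals the true value, symmetric in the error when $f_1 = f_2$, and asymmetric otherwise, whereas the squared-error loss is symmetric in every case.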

The results for 1000 repetitions, f 1 > f 2 , and n = 20 , 40 , 80 , 140 , are shown in

It can be observed that the Bayesian estimates of V(t; β, θ) under the H-T loss function ($\hat{V}_{HT}$) and the S-E loss function ($\hat{V}_{SE}$) were markedly more efficient than the MLE of V(t; β, θ) ($\hat{V}_{MLE}$) for all sample sizes and prior PDFs, with the exception of sample sizes 20 and 40 when the inverted gamma PDF was the selected prior. $\hat{V}_{HT}$ was more efficient than $\hat{V}_{SE}$ (6% - 11% estimation improvement) when the Burr PDF was selected as the prior. $\hat{V}_{HT}$ had similar efficiency to $\hat{V}_{SE}$ when Jeffreys' prior was selected and the sample size was large, whereas, unsurprisingly, $\hat{V}_{SE}$ was more efficient for small sample sizes, since the Jeffreys Bayesian estimate of the key parameter β tends to overestimate and the H-T loss function puts more exponential weight on extreme overestimation than on extreme underestimation when $f_1 > f_2$. For the Bayesian Gaussian and Epanechnikov kernel estimates, $\hat{V}_{HT}$ was more efficient than $\hat{V}_{SE}$ for sample sizes n = 20, 40 and 80, with 11% - 13% estimation improvement, even though these estimates tend to underestimate and the H-T loss function puts more exponential weight on extreme underestimation; the two had similar efficiency for sample size n = 140.

Sample size | Prior PDF | R E ( V ^ H T , V ^ M L E ) | R E ( V ^ S E , V ^ M L E ) | R E ( V ^ H T , V ^ S E ) |
---|---|---|---|---|
n = 20 | Burr | 0.1356 | 0.1519 | 0.8923 |
n = 20 | Inverted gamma | 4.2461 | 4.1632 | 1.0199 |
n = 20 | Jeffrey | 0.0365 | 0.0289 | 1.2616 |
n = 20 | Gaussian kernel | 0.1187 | 0.1346 | 0.8818 |
n = 20 | Epanechnikov kernel | 0.1187 | 0.1346 | 0.8818 |
n = 40 | Burr | 0.3047 | 0.3345 | 0.9107 |
n = 40 | Inverted gamma | 6.3934 | 6.2832 | 1.0175 |
n = 40 | Jeffrey | 0.0166 | 0.0119 | 1.3947 |
n = 40 | Gaussian kernel | 0.1234 | 0.1424 | 0.8663 |
n = 40 | Epanechnikov kernel | 0.1221 | 0.1411 | 0.8659 |
n = 80 | Burr | 0.0136 | 0.0151 | 0.9007 |
n = 80 | Inverted gamma | 0.8058 | 0.7934 | 1.0156 |
n = 80 | Jeffrey | 0.0159 | 0.0144 | 1.1065 |
n = 80 | Gaussian kernel | 0.0105 | 0.0117 | 0.8988 |
n = 80 | Epanechnikov kernel | 0.0114 | 0.0127 | 0.8999 |
n = 140 | Burr | 0.0035 | 0.0037 | 0.9367 |
n = 140 | Inverted gamma | 0.1421 | 0.1399 | 1.0155 |
n = 140 | Jeffrey | 0.0040 | 0.0037 | 1.0680 |
n = 140 | Gaussian kernel | 0.0019 | 0.0018 | 1.0119 |
n = 140 | Epanechnikov kernel | 0.0021 | 0.0022 | 0.9670 |
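The percentages quoted in the discussion follow directly from the relative efficiencies in the table above. The short sketch below assumes that R E is the ratio of the error measure of the first estimator to that of the second, so that a value below 1 favours the first estimator. The helper name improvement_percent is hypothetical.

```python
# RE(V_HT, V_SE) values for the Burr prior, copied from the table above (f1 > f2 case).
re_ht_vs_se_burr = {20: 0.8923, 40: 0.9107, 80: 0.9007, 140: 0.9367}


def improvement_percent(re):
    """Percent reduction in error of the first estimator relative to the second,
    assuming RE is the ratio of their error measures (first over second)."""
    return (1.0 - re) * 100.0


for n, re in re_ht_vs_se_burr.items():
    print(f"n = {n:3d}: RE = {re:.4f}, improvement = {improvement_percent(re):.1f}%")
```

Read this way, the Burr values correspond to the 6% - 11% range quoted in the text, while values above 1, as for the Jeffrey prior in the same table, indicate that the second estimator is the more efficient one.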

The results for 1000 repetitions with f 1 < f 2 and n = 20 , 40 , 80 , 140 are shown in the table below.

Again, the Bayesian estimates of V ( t ; β , θ ) under the H-T loss function ( V ^ H T ) and the S-E loss function ( V ^ S E ) were markedly more efficient than the MLE of V ( t ; β , θ ) ( V ^ M L E ) for all sample sizes and prior PDFs, with the exception of sample sizes 20 and 40 when the inverted gamma PDF was the selected prior. When the inverted gamma was selected as the prior, V ^ H T was more efficient than V ^ S E for all sample sizes, with approximately 2% estimation improvement. As expected, V ^ H T was less efficient than V ^ S E for sample sizes 20 and 40 when the Burr PDF or the Gaussian and Epanechnikov kernel densities were selected as priors, since they tend to underestimate the V ( t ; β , θ ) parameters, and the H-T loss function tends to put more weight on extreme overestimation than on extreme underestimation when f 1 > f 2 . For these priors, V ^ H T and V ^ S E had approximately similar efficiency for sample size n = 80 , and V ^ H T tended to be slightly more efficient for the large sample size ( n = 140 ). V ^ H T was more efficient than V ^ S E (4% - 24% estimation improvement) when the Jeffrey prior was chosen.

Sample size | Prior PDF | R E ( V ^ H T , V ^ M L E ) | R E ( V ^ S E , V ^ M L E ) | R E ( V ^ H T , V ^ S E ) |
---|---|---|---|---|
n = 20 | Burr | 0.2068 | 0.1860 | 1.1116 |
n = 20 | Inverted gamma | 4.7351 | 4.8309 | 0.9802 |
n = 20 | Jeffrey | 0.0232 | 0.0306 | 0.7589 |
n = 20 | Gaussian kernel | 0.1948 | 0.1735 | 1.1226 |
n = 20 | Epanechnikov kernel | 0.1949 | 0.1736 | 1.1227 |
n = 40 | Burr | 0.1500 | 0.1327 | 1.1305 |
n = 40 | Inverted gamma | 5.9173 | 6.0152 | 0.9837 |
n = 40 | Jeffrey | 0.0673 | 0.0785 | 0.8581 |
n = 40 | Gaussian kernel | 0.0516 | 0.0431 | 1.1980 |
n = 40 | Epanechnikov kernel | 0.051 | 0.0425 | 1.1985 |
n = 80 | Burr | 0.0126 | 0.0121 | 1.0406 |
n = 80 | Inverted gamma | 0.8155 | 0.8274 | 0.9856 |
n = 80 | Jeffrey | 0.0326 | 0.0349 | 0.9365 |
n = 80 | Gaussian kernel | 0.0111 | 0.0108 | 1.0307 |
n = 80 | Epanechnikov kernel | 0.0116 | 0.0112 | 1.0356 |
n = 140 | Burr | 0.0180 | 0.0183 | 0.9814 |
n = 140 | Inverted gamma | 0.2545 | 0.2576 | 0.9880 |
n = 140 | Jeffrey | 0.0329 | 0.0338 | 0.9733 |
n = 140 | Gaussian kernel | 0.0222 | 0.0227 | 0.9762 |
n = 140 | Epanechnikov kernel | 0.0204 | 0.0209 | 0.9772 |

The results for 1000 repetitions with f 1 = f 2 and n = 20 , 40 , 80 , 140 are shown in the table below.

Sample size | Prior PDF | R E ( V ^ H T , V ^ M L E ) | R E ( V ^ S E , V ^ M L E ) | R E ( V ^ H T , V ^ S E ) |
---|---|---|---|---|
n = 20 | Burr | 0.0703 | 0.0702 | 1.0011 |
n = 20 | Inverted gamma | 3.7132 | 3.7135 | 0.9999 |
n = 20 | Jeffrey | 0.0612 | 0.0613 | 0.9981 |
n = 20 | Gaussian kernel | 0.0585 | 0.0583 | 1.0037 |
n = 20 | Epanechnikov kernel | 0.0585 | 0.0583 | 1.0037 |
n = 40 | Burr | 0.1195 | 0.1194 | 1.0008 |
n = 40 | Inverted gamma | 7.3018 | 7.3022 | 0.9999 |
n = 40 | Jeffrey | 0.1351 | 0.1352 | 0.9993 |
n = 40 | Gaussian kernel | 0.0384 | 0.0384 | 1.0008 |
n = 40 | Epanechnikov kernel | 0.0381 | 0.0381 | 1.0008 |
n = 80 | Burr | 0.0144 | 0.0144 | 1.0002 |
n = 80 | Inverted gamma | 0.8626 | 0.8734 | 0.9876 |
n = 80 | Jeffrey | 0.0250 | 0.0250 | 0.9998 |
n = 80 | Gaussian kernel | 0.0122 | 0.0122 | 1.0003 |
n = 80 | Epanechnikov kernel | 0.0131 | 0.0131 | 1.0002 |
n = 140 | Burr | 0.0065 | 0.0065 | 1.0000 |
n = 140 | Inverted gamma | 0.1863 | 0.1863 | 1.0000 |
n = 140 | Jeffrey | 0.0117 | 0.0117 | 0.9999 |
n = 140 | Gaussian kernel | 0.0070 | 0.0070 | 0.9999 |
n = 140 | Epanechnikov kernel | 0.0064 | 0.0064 | 0.9999 |

Again, the Bayesian estimates of V ( t ; β , θ ) under the H-T loss function ( V ^ H T ) and the S-E loss function ( V ^ S E ) were markedly more efficient than the MLE of V ( t ; β , θ ) ( V ^ M L E ) for all sample sizes and prior PDFs, with the exception of sample sizes 20 and 40 when the inverted gamma PDF was the selected prior. Both V ^ H T and V ^ S E had similar efficiency in estimating V ( t ; β , θ ) for all sample sizes and priors considered in this study.

The sensitivity analysis shows that the Bayesian estimates of the intensity function of the PLP are sensitive to the selection of the prior and of the loss function. Tables 11-13 indicate the efficiency of the Bayesian estimates under the H-T loss function compared to the Bayesian estimate under the S-E loss function and to the MLE, provided that the engineer chooses the values of f 1 and f 2 based on the behaviour of his/her estimator (underestimating or overestimating). Moreover, f 1 > f 2 is the recommended choice when the engineer selects the Burr or kernel PDFs as the prior knowledge of the behavior of the key parameter β . On the other hand, if the engineer does not have prior knowledge of the key parameter β , it is still recommended to use the H-T loss function in the Bayesian calculations, with f 1 < f 2 .

Thus far, we have shown that the Bayesian analysis under the H-T loss function estimates software reliability more accurately than the Bayesian analysis under the S-E loss function and the MLE. Because the extensive analysis performed requires efficient use of existing programming languages, and therefore some programming experience, we developed an interactive user interface application using the Wolfram language to compute and visualize the Bayesian and maximum likelihood estimates of the intensity and reliability functions of the Power Law Process for a given data set.

In the present study, we developed the analytical Bayesian estimates of the key parameter β , under the Higgins-Tsokos and squared-error loss functions, in the intensity function where the underlying failure distribution is the Power Law Process, which is used for software reliability assessment, among other applications. The reliability function of the subject model is written analytically as a function of the intensity function.

The behavior of the key parameter β is characterized by the Burr type XII probability distribution. Real data and numerical simulation were used to illustrate not only the robustness of the squared-error loss function being challenged by the assumption of the Higgins-Tsokos loss function, but also the efficiency improvement in the estimation of the intensity function of PLP under Higgins-Tsokos loss function ( V ^ B . H T ( t ) ). For 100,000 samples of software failure times, based on Monte Carlo simulations and sample size of 40, the Bayesian estimate of β under Higgins-Tsokos loss function ( β ^ B . H T ) performed slightly better than the Bayesian estimate of β under squared-error loss function ( β ^ B . S E ) with respect to three different values of θ (0.5, 1.7441, 4). Even for different sample sizes (20, 30, 40, 50, 60, 70, 80, 100, 120, 140, and 160), similar results were achieved using β = 0.7054 , θ = 1.7441 , and averaged over 10,000 samples of software failure times.

As the MLE of the second parameter in the intensity function ( θ ) depends on the estimate of β , the adjusted estimate of θ using β ^ B . H T provided better performance compared to the adjusted estimate of θ using β ^ B . S E . Moreover, the Relative Efficiency was used to compare three estimates of the intensity function: the estimate using the MLEs of both β and θ ( V ^ M L E ( t ) ), the estimate using the Bayesian estimate of β under the squared-error loss function and the Bayesian estimate of θ ( V ^ B . S E ( t ) ), and the estimate using the Bayesian estimate of β under the Higgins-Tsokos loss function and the Bayesian estimate of θ ( V ^ B . H T ( t ) ). The comparison showed that V ^ B . H T ( t ) is more efficient in estimating the intensity function V ( t ), with about 12% estimation improvement.

We also addressed the question of whether the Bayesian estimate of the intensity function, V ( t ; β , θ ) , of the PLP is sensitive to the selection of the prior, both parametric and non-parametric, and of the loss function. The parametric prior PDFs were the Burr, Jeffrey, and inverted gamma probability distributions, whereas the non-parametric priors were the Gaussian and Epanechnikov kernel densities. The priors’ parameters were estimated using the Crow failure times. Additionally, the optimal bandwidth and kernel functions were selected to minimize the asymptotic mean integrated squared error.

Using the developed algorithm, 1000 samples of software failure times for each of four sample sizes n (20, 40, 80, and 140) were generated from the PLP to compare the Bayesian estimates of V ( t ; β , θ ) under the subject priors and loss functions using the Relative Efficiency among them. The simulation procedure was repeated three times, for the cases f 1 > f 2 , f 1 < f 2 , and f 1 = f 2 . The results showed the efficacy of the Bayesian estimates under the H-T loss function, and that the choice of the f 1 and f 2 values depends on the prior knowledge of the key parameter β . It is recommended to choose values where f 1 > f 2 when the engineer thinks, with proper justification, that the prior knowledge of β is best characterized by the Burr or kernel-based probability distributions, whereas a choice of f 1 < f 2 with the Jeffrey prior is suggested when the engineer does not have prior knowledge of β .
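The data-generation step described above can be sketched as follows. This is a minimal illustration, not the authors' algorithm. It assumes the standard PLP parametrization with intensity V ( t ; β , θ ) = ( β / θ ) ( t / θ )^( β − 1 ) and the usual failure-truncated MLEs. The function names and the random seed are hypothetical.

```python
import math
import random


def simulate_plp(n, beta, theta, rng):
    """First n failure times of a PLP with mean function (t / theta) ** beta."""
    times, s = [], 0.0
    for _ in range(n):
        s += rng.expovariate(1.0)                 # arrival time of a unit-rate Poisson process
        times.append(theta * s ** (1.0 / beta))   # map through the inverse mean function
    return times


def mle_plp(times):
    """Failure-truncated maximum likelihood estimates of beta and theta."""
    n, t_n = len(times), times[-1]
    beta_hat = n / sum(math.log(t_n / t) for t in times[:-1])
    theta_hat = t_n / n ** (1.0 / beta_hat)
    return beta_hat, theta_hat


def intensity(t, beta, theta):
    """PLP intensity function under the assumed parametrization."""
    return (beta / theta) * (t / theta) ** (beta - 1.0)


rng = random.Random(2019)  # arbitrary seed, for reproducibility only
for n in (20, 40, 80, 140):
    sample = simulate_plp(n, beta=0.7054, theta=1.7441, rng=rng)
    b_hat, th_hat = mle_plp(sample)
    v_hat = intensity(sample[-1], b_hat, th_hat)
    print(f"n = {n:3d}: beta_hat = {b_hat:.4f}, theta_hat = {th_hat:.4f}, "
          f"V_hat(t_n) = {v_hat:.4f}")
```

Repeating the loop body 1000 times per sample size and replacing the MLE of β with a Bayesian estimate under a chosen prior and loss function would give the kind of comparison reported in the tables. That Bayesian step is omitted here because it depends on the closed forms derived earlier in the paper.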

Thus, based on this aspect of our analysis, we can conclude that the Bayesian analysis approach under the Higgins-Tsokos loss function is not only as robust as the Bayesian analysis approach under the squared-error loss function but also performs better, and that both are superior to the maximum likelihood approach in estimating the reliability function of the Power Law Process. The interactive user interface application can be used without any prior coding knowledge to compute and visualize the Bayesian and maximum likelihood estimates of the intensity and reliability functions of the Power Law Process for a given data set.

We thank Majmaah University for funding the research, along with the support provided by the University of South Florida.

The authors declare no conflicts of interest regarding the publication of this paper.

Alenezi, F.N. and Tsokos, C.P. (2019) The Effectiveness of the Squared Error and Higgins-Tsokos Loss Functions on the Bayesian Reliability Analysis of Software Failure Times under the Power Law Process. Engineering, 11, 272-299. https://doi.org/10.4236/eng.2019.115020