On Marginal Distributions under Progressive Type II Censoring: Similarity/Dissimilarity Properties

DOI: 10.4236/ojs.2017.74044

Abstract

Progressive censoring is under intensive investigation because it allows subjects to be removed from an experiment before its final termination point, which saves time and cost. A closed form of the marginal density of failure times under progressive type-II censoring is essential for studying the properties of statistical analyses under different censoring schemes. In this paper, we provide a different presentation of the marginal distribution under progressive type-II censoring and derive closed forms for several special cases. To study the similarity/dissimilarity of the marginal densities of order statistics for failure times, we use the overlap measure. We find that the overlap measure depends only on the effective size m. A numerical example based on real-life data on the failure times of aircraft windshields is provided to quantify, through the overlap measure, the amount of redundant information carried by the order statistics of the failure times under different progressive type-II schemes. Moreover, this data set is used as a pilot study to estimate the effective size m needed for future studies.

Share and Cite:

Helu, A. and Samawi, H. (2017) On Marginal Distributions under Progressive Type II Censoring: Similarity/Dissimilarity Properties. Open Journal of Statistics, 7, 633-644. doi: 10.4236/ojs.2017.74044.

1. Introduction

Customer satisfaction has long been a main concern for manufacturers seeking to produce reliable products. For their products to remain desirable and thus profitable, manufacturers are motivated to develop high-quality, long-life products. This requires knowledge of the products' failure-time distributions, which is obtained by performing life-testing experiments on products before they are released to the market.

As a result of some constraints such as the lack of funds and/or time limits, samples of life testing experiments are sometimes terminated before the failure of all items under consideration. Such samples are called censored samples.

Two common types of censoring schemes are type I and type II censoring. In type I, the test is terminated at a predetermined time, whereas in type II, the test is terminated at a predetermined number of failures. In both types however, the removal of active units during the experiment is prohibited.

In some cases it may be desirable to remove items from the test before their predetermined termination points, whether intentionally or unintentionally, in order to reduce the cost and duration of the experiment. One example is the study of the wear of units, where units are required to be completely worn or disintegrated at different stages of the experiment during their actual aging process, which is quite time consuming. Another example is the early removal of some surviving units so that they can be used in other tests, which minimizes the cost of the experiment.

This leads to the practice of Progressively Type II (PTII) censoring which is considered by many experimenters as an effective approach of minimizing the cost and the time consumed. Moreover, it contains the ordinary order statistics (OS) and type II censoring as special cases which makes it largely desired and used in experimental design.

Considerable attention has been directed towards the properties of progressive censoring. Part of it is due to the availability of the high-speed computing resources which makes it feasible for simulation studies as well as a practical method of gathering lifetime data for both researchers and practitioners (Viveros & Balakrishnan [1] ). There has been a vast number of discussions on progressive censoring and its applications; interested readers may refer to the books by Balakrishnan & Aggarwala [2] and Balakrishnan & Cramer [3] for recent reviews and discussions of the need for this type of censoring.

Under this type of censoring, $n$ independent items are placed on a life-testing experiment at the same time and only $m$ ($< n$) failures are completely observed. The censoring occurs progressively in $m$ stages as follows. When the first failure is observed, a random sample of size $R_1$ is immediately drawn and removed from the $(n-1)$ survivors, leaving $n-1-R_1$ surviving items. After the failure of the second item, $n-2-R_1$ items survive, from which another random sample of size $R_2$ is selected and removed. The process continues until $m$ failures are observed, at which point all the remaining $n-m-R_1-\cdots-R_{m-1}$ ($=R_m$) surviving units are removed from the experiment. It is assumed that the lifetimes of these $n$ units are independent and identically distributed with common distribution function $F$. Moreover, $n$, $m$ and the censoring scheme $R_1, R_2, \ldots, R_m$ are all pre-fixed. Note that if $R_1 = R_2 = \cdots = R_{m-1} = 0$, then $R_m = n-m$, which corresponds to type-II censoring. If $R_1 = R_2 = \cdots = R_m = 0$, then $m = n$, which represents the complete data set. For a comprehensive recent review of progressive censoring, readers may refer to Balakrishnan & Cramer [3].
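The sampling procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `progressive_type2_sample`, the use of exponential lifetimes and the particular scheme are hypothetical choices made here, not part of the paper.

```python
import random

def progressive_type2_sample(lifetimes, scheme, rng=None):
    """Return the m observed failure times under a progressive type-II scheme.

    lifetimes : the n latent lifetimes of the units placed on test
    scheme    : (R_1, ..., R_m), units withdrawn at each observed failure
    """
    rng = rng or random.Random()
    m = len(scheme)
    if len(lifetimes) != m + sum(scheme):
        raise ValueError("need n = m + R_1 + ... + R_m")
    survivors = list(lifetimes)
    observed = []
    for r_i in scheme:
        failure = min(survivors)          # next observed failure time
        survivors.remove(failure)
        observed.append(failure)
        # withdraw R_i of the remaining survivors uniformly at random
        for unit in rng.sample(survivors, r_i):
            survivors.remove(unit)
    return observed

# Illustrative run: n = 10 exponential lifetimes, m = 4, R = (2, 0, 1, 3)
rng = random.Random(1)
lifetimes = [rng.expovariate(1.0) for _ in range(10)]
sample = progressive_type2_sample(lifetimes, (2, 0, 1, 3), rng)
```

With the scheme `(0, 0, 0, 6)` the function returns the four smallest lifetimes (type-II censoring), and with ten zeros it returns the complete ordered sample.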

In general, the order statistics produced by PTII censoring provide more information about the underlying distribution than simple random samples (SRS), since their densities span the whole range of the underlying distribution (see Figure 1).

However, different censoring schemes may provide different amounts of information in PTII censoring, because different sets of $(R_1, R_2, \ldots, R_m)$ units are progressively removed at random. To study these properties of the order statistics, it is necessary to derive the marginal distribution of the $r$th failure time and to use a similarity/dissimilarity measure such as the overlapping measure.

The overlapping measure (OVL) is a powerful tool to find the similarities/ dissimilarities between any two densities. In terms of marginal densities of order statistics, the overlap can provide a good indication of the dissimilarities between two densities of two PTII censored failure times. This will enable us to check the amount of information that PTII censoring provides about the underlying distributions and its parameters.

Overlap measures are defined as the common area under two probability density functions. They have been used as measures of agreement between two income distributions and as the proportion of machines or electronic devices that have a similar range of failure times. The OVL has many useful applications, including clinical trials (see Mizuno et al. [4]) and the comparison of income distributions by race (Weitzman [5]).

Figure 1. Overlap among densities.

The OVL measure ($\Delta$) was originally introduced by Weitzman [5]. One application of $\Delta$ was given by Ichikawa [6], who used it to estimate the lowest upper bound of the failure probability in the stress-strength model in reliability analysis. Federer et al. [7] used $\Delta$ to estimate the proportion of genetic deviations in segregating populations. Moreover, Sneath [8] used $\Delta$ as a measure of the distinctness of clusters. Additional references to applications of this methodology in ecology and other fields can be found in Mulekar and Mishra ([9] and [10]).

In this paper, we provide another presentation of the general form of the marginal density of the $r$th failure time under PTII censoring. This form is more convenient for deriving special cases of PTII censoring schemes, such as the scheme where $(n-m)$ items are censored at the time of the first failure, the scheme where $(n-m)$ items are removed at the $m$th failure, and the equi-balanced scheme. In addition, the OVL coefficient is used to discriminate between two marginal densities based on PTII censoring. The rest of this paper is organized as follows. In Section 2, we investigate another form of the marginal density under PTII censoring and derive some special cases. Similarity properties of marginal densities based on PTII censoring using OVL measures are presented in Section 3. A numerical example and a real-life data example are presented in Section 4 for illustration. Final remarks and conclusions are provided in Section 5.

2. On the Marginal Distributions Based on PTII

Under PTII censoring for life testing, suppose that $X_{1:m:n} < X_{2:m:n} < \cdots < X_{m:m:n}$ are the lifetimes of the completely observed units, and that $R = (R_1, R_2, \ldots, R_m)$ gives the numbers of units withdrawn at these failure times. If the failure times come from an absolutely continuous distribution function $F$ with probability density function $f$, the joint probability density function of the progressively censored failure times $X_{1:m:n} < \cdots < X_{m:m:n}$ (see Balakrishnan & Aggarwala [2]) is given by:

$$ f_{X_{1:m:n},\ldots,X_{m:m:n}}(x_1,\ldots,x_m) = c\prod_{i=1}^{m} f(x_i)\,[1-F(x_i)]^{R_i}, \quad -\infty < x_1 < \cdots < x_m < \infty, \tag{1} $$

where $n = m + \sum_{i=1}^{m} R_i$, $m, n \in \mathbb{N}$, $R_i \ge 0$, $1 \le i \le m$, $R = (R_1, R_2, \ldots, R_m)$, and $c = n(n-1-R_1)(n-2-R_1-R_2)\cdots\left(n-\sum_{i=1}^{m-1}(R_i+1)\right)$.

From the representation of the joint density function, it is obvious that progressive censoring can be embedded in the models of generalized order statistics and of sequential order statistics (Kamps [11] [12] ). Moreover, Balakrishnan and Cramer [3] , showed that the marginal density for the rth progressive type II censored order statistics from an absolutely continuous cumulative distribution function (cdf) F with probability density function (pdf) f is given by:

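$$ f_{X_{r:m:n}}(x) = f(x)\left(\prod_{i=1}^{r}\gamma_i\right)\sum_{j=1}^{r} a_{j,r}\,[1-F(x)]^{\gamma_j-1}; \quad x \in \mathbb{R}, \tag{2} $$

where

$\gamma_k = \sum_{j=k}^{m}(R_j+1)$ and $a_{j,r} = \prod_{k=1,\,k\neq j}^{r}\dfrac{1}{\gamma_k-\gamma_j}$. Note that $\left(\prod_{i=1}^{r}\gamma_i\right)\sum_{j=1}^{r}\dfrac{a_{j,r}}{\gamma_j} = 1$.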
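The quantities $\gamma_k$ and $a_{j,r}$, and the normalising identity noted above, can be checked with exact rational arithmetic. The following is a minimal sketch; the helper names `gammas`, `a_coeffs` and `normalising_check` and the example scheme are assumptions introduced here for illustration.

```python
from fractions import Fraction
from math import prod

def gammas(scheme):
    """gamma_k = sum_{j=k}^{m} (R_j + 1), returned as [gamma_1, ..., gamma_m]."""
    return [sum(r + 1 for r in scheme[k:]) for k in range(len(scheme))]

def a_coeffs(g, r):
    """a_{j,r} = prod over k != j, k <= r of 1 / (gamma_k - gamma_j), j = 1..r."""
    return [prod((Fraction(1, g[k] - g[j]) for k in range(r) if k != j),
                 start=Fraction(1))
            for j in range(r)]

def normalising_check(scheme, r):
    """(prod_{i<=r} gamma_i) * sum_j a_{j,r} / gamma_j, which should equal 1."""
    g = gammas(scheme)
    a = a_coeffs(g, r)
    return prod(g[:r]) * sum(a_j / g_j for a_j, g_j in zip(a, g))

scheme = (2, 0, 1, 3)                      # hypothetical scheme with n = 10, m = 4
checks = [normalising_check(scheme, r) for r in range(1, 5)]
```

Because every $R_j + 1 \ge 1$, the $\gamma_k$ are strictly decreasing, so the differences $\gamma_k - \gamma_j$ in the denominators are never zero.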

In this paper, we introduce another representation of the marginal density in (2) that can be derived from the joint density in (1) using repetitive integrals as follows:

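$$ f_{X_{r:m:n}}(x_r) = c\int_{-\infty}^{x_r}\int_{-\infty}^{x_{r-1}}\cdots\int_{-\infty}^{x_2}\int_{x_r}^{\infty}\cdots\int_{x_{m-1}}^{\infty}\prod_{i=1}^{m} f(x_i)\,[1-F(x_i)]^{R_i}\,dx_m\,dx_{m-1}\cdots dx_{r+1}\,dx_1\cdots dx_{r-1}. $$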

Hence,

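$$ f_{X_{r:m:n}}(x) = \left(\prod_{i=1}^{r}\gamma_i\right) f(x)\,[1-F(x)]^{\gamma_r-1}\times\sum_{h=1}^{r-1}\left[\frac{(-1)^{h+1}\left(1-[1-F(x)]^{\gamma_{r-h}-\gamma_r}\right)}{\prod_{i=1}^{r-h-1}\left(\gamma_{r-i-h}-\gamma_{r-h}\right)\prod_{l=1}^{h}\left(\gamma_{r-h}-\gamma_{r-l+1}\right)}\right]; \tag{3} $$

where $\prod_{i=1}^{r}\gamma_i = c\big/\prod_{i=r+1}^{m}\gamma_i$, $\gamma_k = \sum_{j=k}^{m}(R_j+1)$; $r = 1, 2, \ldots, m$, and $n = m + \sum_{i=1}^{m} R_i$, $m, n \in \mathbb{N}$. For $r = 1$ the sum is empty and the density is $\gamma_1 f(x)[1-F(x)]^{\gamma_1-1}$, as given by (2).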

The closed form in (3) is easier to use in mathematical software when deriving closed forms for well-known special cases. In addition, these marginal densities are important for investigating the properties needed for statistical inference under different censoring schemes.
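As a numerical sanity check, representations (2) and (3) can be compared on the scale $u = 1 - F(x)$, after dividing both by $f(x)$. The sketch below assumes (3) in the form of a sum over $h = 1, \ldots, r-1$ of signed terms $(1 - u^{\gamma_{r-h}-\gamma_r})$ divided by products of differences of the $\gamma$'s, so it applies for $r \ge 2$. The function names and the example scheme are hypothetical.

```python
from math import prod, isclose

def gammas(scheme):
    """[gamma_1, ..., gamma_m] with gamma_k = sum_{j=k}^{m} (R_j + 1)."""
    return [sum(r + 1 for r in scheme[k:]) for k in range(len(scheme))]

def density_eq2(u, g, r):
    """Equation (2) divided by f(x), evaluated at u = 1 - F(x)."""
    total = 0.0
    for j in range(1, r + 1):
        a_jr = 1.0
        for k in range(1, r + 1):
            if k != j:
                a_jr /= g[k - 1] - g[j - 1]
        total += a_jr * u ** (g[j - 1] - 1)
    return prod(g[:r]) * total

def density_eq3(u, g, r):
    """Equation (3) divided by f(x), evaluated at u = 1 - F(x); needs r >= 2."""
    def gam(k):                            # 1-based access to gamma_k
        return g[k - 1]
    total = 0.0
    for h in range(1, r):
        denom = 1.0
        for i in range(1, r - h):          # i = 1, ..., r - h - 1
            denom *= gam(r - i - h) - gam(r - h)
        for l in range(1, h + 1):          # l = 1, ..., h
            denom *= gam(r - h) - gam(r - l + 1)
        total += (-1) ** (h + 1) * (1 - u ** (gam(r - h) - gam(r))) / denom
    return prod(g[:r]) * u ** (gam(r) - 1) * total

g = gammas((2, 0, 1, 3))                   # hypothetical scheme, n = 10, m = 4
agree = all(isclose(density_eq2(u, g, r), density_eq3(u, g, r), rel_tol=1e-8)
            for r in range(2, 5) for u in (0.1, 0.5, 0.9))
```

For the complete sample with $m = 3$ and $r = 3$, both functions reduce to $3F(x)^2$, the familiar density of the maximum of three observations divided by $f(x)$.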

Special Cases

Using the new representation in (3), it is convenient to derive the following special cases.

Case 1: Ordinary order statistics (OS).

When $m = n$ and $R_1 = R_2 = \cdots = R_m = 0$, which represents the complete data set of order statistics, Equation (3) can be written as:

$$ f_{X_{r:m:n}}(x) = m\binom{m-1}{r-1} f(x)\,F(x)^{r-1}\,[1-F(x)]^{m-r}; \quad x \in \mathbb{R}, \tag{4} $$

and hence,

$$ \frac{1}{m}\sum_{r=1}^{m} f_{X_{r:m:n}}(x) = f(x). $$

Case 2: Equi-balanced censoring scheme.

Suppose $R_1 = R_2 = \cdots = R_m = R$. The censoring plan with equal removal number $R$ is called the equi-balanced censoring scheme, and it can be shown that:

$$ f_{X_{r:m:n}}(x) = m(R+1)\binom{m-1}{r-1} f(x)\,[1-F(x)]^{(m-r+1)(R+1)-1}\left[1-[1-F(x)]^{R+1}\right]^{r-1}. \tag{5} $$

Note that Equation (5) is simply the marginal density of the $r$th order statistic of $m$ observations from the cdf $G(x) = 1-[1-F(x)]^{R+1}$. Moreover,

$$ \frac{1}{m}\sum_{r=1}^{m} f_{X_{r:m:n}}(x) = (R+1)\,f(x)\,(1-F(x))^{R} = f_{1:R+1}(x), \tag{6} $$

which is the pdf of the minimum order statistic from $(R+1)$ observations.
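Identity (6) is easy to confirm numerically on the scale $u = 1 - F(x)$, where (5) and (6) are both divided by $f(x)$. The following is a small sketch; the function names and the values $m = 5$, $R = 2$ are illustrative assumptions.

```python
from math import comb, isclose

def equibalanced_density(u, m, R, r):
    """Equation (5) divided by f(x), evaluated at u = 1 - F(x)."""
    return (m * (R + 1) * comb(m - 1, r - 1)
            * u ** ((m - r + 1) * (R + 1) - 1)
            * (1 - u ** (R + 1)) ** (r - 1))

def mean_density(u, m, R):
    """Left-hand side of (6) divided by f(x)."""
    return sum(equibalanced_density(u, m, R, r) for r in range(1, m + 1)) / m

# Compare with the right-hand side of (6), (R + 1) * u**R, at a few points
m, R = 5, 2
matches = all(isclose(mean_density(u, m, R), (R + 1) * u ** R, rel_tol=1e-9)
              for u in (0.1, 0.3, 0.5, 0.7, 0.9))
```

Setting `R = 0` recovers Case 1, where the average of the $m$ order-statistic densities is the parent density itself.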

Case 3: Type II censoring scheme.

When $R_1 = R_2 = \cdots = R_{m-1} = 0$ and $R_m = n-m$, which corresponds to type-II censoring, Equation (3) simplifies to:

$$ f_{X_{r:m:n}}(x) = n\binom{n-1}{r-1} f(x)\,F(x)^{r-1}\,[1-F(x)]^{n-r}; \quad x \in \mathbb{R}, \tag{7} $$

and hence

$$ \frac{1}{m}\sum_{r=1}^{m} f_{X_{r:m:n}}(x) = f(x)\left(\frac{\sum_{r=1}^{m} n\binom{n-1}{r-1}F(x)^{r-1}[1-F(x)]^{n-r}}{m}\right). \tag{8} $$

Note that Equation (7) is simply the pdf of the $r$th order statistic from a sample of size $n$.

To study the similarity/dissimilarity of the marginal distributions of the order statistics for failure times, the OVL ($\Delta$) measure is derived and evaluated numerically for different PTII schemes, in order to quantify the amount of information provided by the order statistics of the failure times under each scheme.

3. Similarity Properties of Marginal pdfs Based on PTII Censoring

Investigating the similarities/dissimilarities among the densities of order statistics based on PTII censoring is important because it helps investigators select the least costly censoring scheme that still provides a high amount of information about the underlying distribution and its parameters.

Suppose two samples of observations are drawn from two continuous distributions with densities $f_1(x)$ and $f_2(x)$. Then Weitzman's $\Delta$ is given by

$$ \Delta(f_1, f_2) = \int_{-\infty}^{\infty}\min\left(f_1(x), f_2(x)\right)dx. $$

The overlap measure $\Delta$ can also be applied to discrete distributions, by replacing the integral with a summation, and to multivariate distributions. Moreover, $\Delta$ is measured on a scale from 0 to 1: a value of $\Delta$ close to 0 indicates extreme dissimilarity between the two density functions, and $\Delta = 1$ indicates that they are identical.
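To make the definition concrete, the integral of the pointwise minimum can be approximated numerically. The sketch below uses a midpoint rule and, purely as an illustrative assumption, two unit-variance normal densities one unit apart, for which the overlap is known to be $2\Phi(-1/2)$. The function names and integration limits are hypothetical choices.

```python
from math import erf, exp, pi, sqrt

def overlap(f1, f2, lo, hi, steps=20000):
    """Midpoint-rule approximation of the integral of min(f1, f2) over [lo, hi]."""
    width = (hi - lo) / steps
    return sum(min(f1(lo + (i + 0.5) * width), f2(lo + (i + 0.5) * width))
               for i in range(steps)) * width

def normal_pdf(mu, sigma):
    """Return the N(mu, sigma^2) density as a function of x."""
    return lambda x: exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * sqrt(2 * pi))

# Two unit-variance normals with means 0 and 1 (illustrative choice)
delta = overlap(normal_pdf(0.0, 1.0), normal_pdf(1.0, 1.0), -10.0, 11.0)
exact = 1 + erf(-0.5 / sqrt(2))        # 2 * Phi(-1/2), written with erf
```

Passing the same density twice returns a value close to 1, which corresponds to the upper end of the scale described above.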

3.1. Similarity Structure between Two Consecutive Statistics from PTII Censoring

Using (2) or (3), the OVL between the densities $f_r$ and $f_{r+1}$ of two consecutive order statistics is given by:

$$ \Delta(f_r, f_{r+1}) = \int_{-\infty}^{\infty}\min\left(f_r(x), f_{r+1}(x)\right)dx, $$

where

$$ \min\left(f_r(x), f_{r+1}(x)\right) = \begin{cases} f_r(x) & \text{if } x > M_{r,r+1}, \\ f_{r+1}(x) & \text{if } x \le M_{r,r+1}. \end{cases} \tag{9} $$

Thus,

$$ \Delta\left(f_{X_{r:m:n}}, f_{X_{r+1:m:n}}\right) = \int_{-\infty}^{M_{r,r+1}} f_{X_{r+1:m:n}}(x)\,dx + \int_{M_{r,r+1}}^{\infty} f_{X_{r:m:n}}(x)\,dx. \tag{10} $$

With some algebraic manipulation using Equation (2) or (3), we obtain the following results:

$$ M_{r,r+1} = F^{-1}\left(1-p_{r,r+1}\right); \quad p_{r,r+1} = \sqrt[\gamma_r-\gamma_{r+1}]{\frac{\gamma_{r+1}}{\gamma_r}}. \tag{11} $$

Notice that Δ is free of the underlying distribution.
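Since $\Delta$ does not depend on the underlying distribution, it can be evaluated on the scale $t = F(x)$, where each marginal density becomes equation (2) with $f \equiv 1$. The sketch below integrates the pointwise minimum of two consecutive densities over $(0, 1)$ with a midpoint rule, so the crossing point never has to be located explicitly. The function names, the step count and the example schemes are assumptions made for illustration.

```python
from math import prod

def gammas(scheme):
    """[gamma_1, ..., gamma_m] with gamma_k = sum_{j=k}^{m} (R_j + 1)."""
    return [sum(r + 1 for r in scheme[k:]) for k in range(len(scheme))]

def density_on_unit_scale(t, g, r):
    """Density of F(X_{r:m:n}) at t in (0, 1): equation (2) with f = 1."""
    u = 1.0 - t
    total = 0.0
    for j in range(r):
        a_jr = 1.0
        for k in range(r):
            if k != j:
                a_jr /= g[k] - g[j]
        total += a_jr * u ** (g[j] - 1)
    return prod(g[:r]) * total

def consecutive_overlap(scheme, r, steps=5000):
    """Midpoint-rule value of Delta(f_r, f_{r+1}) for 1 <= r < m."""
    g = gammas(scheme)
    width = 1.0 / steps
    return sum(min(density_on_unit_scale((i + 0.5) * width, g, r),
                   density_on_unit_scale((i + 0.5) * width, g, r + 1))
               for i in range(steps)) * width

complete = [consecutive_overlap((0,) * 5, r) for r in range(1, 5)]
balanced = [consecutive_overlap((1,) * 5, r) for r in range(1, 5)]
```

In this sketch the complete sample and the equi-balanced scheme $R = (1, 1, 1, 1, 1)$ give the same overlaps up to integration error, which is expected because (5) is an ordinary order-statistic density for the cdf $1-[1-F(x)]^{R+1}$.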

3.2. Special Cases

Case 1: Ordinary order statistics (OS).

Using Al-Saleh [13], when $m = n$ and $R_1 = R_2 = \cdots = R_m = 0$, then for any $r < s$, the overlapping coefficient $\Delta$ between $f_{X_{r:m:n}}$ and $f_{X_{s:m:n}}$ is given by

$$ \Delta(f_r, f_s) = 1 - P\left(r \le Y_{r,s} \le s-1\right), \tag{12} $$

where $Y_{r,s} \sim \text{Binomial}(m, p_{r,s})$ with $p_{r,s} = \dfrac{\sqrt[s-r]{u_r}}{\sqrt[s-r]{u_r}+\sqrt[s-r]{u_s}}$ and $u_r = m\binom{m-1}{r-1}$, $u_s = m\binom{m-1}{s-1}$.
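Equation (12) only needs binomial probabilities, so it can be evaluated directly. The function name `os_overlap` below is a hypothetical choice, and the sketch covers the ordinary order statistics case only.

```python
from math import comb

def os_overlap(m, r, s):
    """Equation (12): Delta(f_r, f_s) for ordinary order statistics, r < s <= m."""
    u_r = m * comb(m - 1, r - 1)
    u_s = m * comb(m - 1, s - 1)
    root = 1.0 / (s - r)                   # exponent of the (s - r)th root
    p = u_r ** root / (u_r ** root + u_s ** root)
    inside = sum(comb(m, y) * p ** y * (1 - p) ** (m - y) for y in range(r, s))
    return 1.0 - inside

# Symmetric pairs (i, m - i + 1) for m = 6
pairs = [os_overlap(6, i, 6 - i + 1) for i in (1, 2, 3)]
```

For symmetric pairs $u_r = u_s$, so $p_{r,s} = 1/2$. For $m = 6$ the three values work out to $2/64$, $14/64$ and $44/64$, and the first of these equals $(1/2)^{m-1}$.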

Case 2: Equi-balanced censoring scheme.

Suppose $R_1 = R_2 = \cdots = R_m = R$. Using Ghahramani [14], the OVL measure is given by:

$$ \Delta(f_r, f_s) = 1 - P\left(m-s+1 \le Y_{r,s} \le m-r\right), \tag{13} $$

where $Y_{r,s} \sim \text{Binomial}(m, p_{r,s})$ with $p_{r,s} = \dfrac{\sqrt[s-r]{v_s}}{\sqrt[s-r]{v_s}+\sqrt[s-r]{v_r}}$ and $v_r = \binom{m-1}{r-1}$, $v_s = \binom{m-1}{s-1}$.

Case 3: Type II censoring scheme.

Similarly, when $R_1 = R_2 = \cdots = R_{m-1} = 0$ and $R_m = n-m$, applying the result of Ghahramani [14], the OVL measure is given by:

$$ \Delta(f_r, f_s) = 1 - P\left(r \le Y_{r,s} \le s-1\right), \tag{14} $$

where $Y_{r,s} \sim \text{Binomial}(n, p_{r,s})$ with $p_{r,s} = \dfrac{\sqrt[s-r]{w_r}}{\sqrt[s-r]{w_r}+\sqrt[s-r]{w_s}}$ and $w_r = \binom{m-1}{r-1}$, $w_s = \binom{m-1}{s-1}$.

4. Illustrations Based on Simulated and Real Life Data Examples

In this section, to quantify the amount of information provided by the OVL for the different PTII censoring schemes given in Sections 3.1 and 3.2, we provide a numerical example and a real-life data example based on the failure times of aircraft windshields.

Example 1: The OVL for consecutive order statistics for different schemes using the general definition in Section (3.1).

Table 1 shows that the discrimination, measured by $(1-\Delta)$, is higher in the schemes where $(n-m)$ items are removed at the time of the first failure, namely schemes 3, 7 and 11, than in the remaining schemes. Moreover, the discriminations based on schemes 4 and 12 are close in value to those of the ordinary order statistics (OS). In addition, when $R = (1, 1, 1, 1, 1)$, OS and scheme 8 have identical values.

The $\Delta$ for OS increases as the actual sample size, $n$, increases. Moreover, while increasing the ratio $m/n$ has no effect when censoring occurs at the time of the last failure (see schemes 2, 6 and 10), it has a large effect in the remaining cases (schemes 3, 4, 7, 8, 11 and 12), where $\Delta$ decreases as the ratio $m/n$ increases.

Table 1. Overlapping between two consecutive order statistics from PTII censoring.

Example 2: (Real life data)

The data set for this application is given by Blischke and Murthy [15] and was later used by Musleh and Helu [16]. The data represent the failure times of aircraft windshields. The windshields consist of several layers of material designed to withstand extreme temperatures and pressure. To maintain the regular performance of aircraft, data on windshields are routinely collected and analyzed. The unit of measurement is 1000 h.

The OVL coefficients for the three special cases in Section 3.2, computed from Equations (12)-(14) with $n = 70$ and $m = 6$, are presented in Table 2. The values of $\Delta(f_r, f_s)$ are identical for Cases 1-3, which means that the censoring scheme has no influence on the discrimination among the pdfs of the order statistics. Moreover, if $r = i$ and $s = m-i+1$, then as $i$ ($i \le [m/2]$) increases, $\Delta(f_i, f_{m-i+1})$ increases, and $\Delta(f_i, f_{m-i+1})$ approaches zero as $m \to \infty$. In addition, the minimum value of $\Delta(f_r, f_s)$ occurs when $r = 1$ and $s = m$. The similarity/dissimilarity between the two extremes can also be expressed as $\Delta(f_1, f_m) = \left(\frac{1}{2}\right)^{m-1}$.

Since the value of $\Delta$ is a function of $m$ only, we can estimate the effective size $m$ for any future study using a pilot study. For example, we can use the data in Table 3 as a pilot study to create two clusters based on their failure times: one for low quality windshields and one for high quality windshields. The new data sets are presented in Table 4.

Table 2. Overlapping coefficients for pairs of order statistics from PTII censoring based on the windshield data.

Table 3. The complete failure times of aircraft windshields.

Table 4. Progressive censored samples for the failure times of aircraft windshields.

The fit of a Weibull model to the two data sets is checked using the Kolmogorov-Smirnov (KS), Anderson-Darling (AD) and chi-square tests. When we fit the Weibull distribution to the "Low Quality" data set based on the maximum likelihood estimates $\alpha_1 = 1.7947$ and $\beta = 4.6$, we observe that $KS = 0.1292$ with a corresponding $p$-value of 0.59505, $AD = 1.40361$, and a chi-square distance of 0.69511 with a corresponding $p$-value of 0.87435. Similarly, when we fit the Weibull distribution to the "High Quality" data set based on the maximum likelihood estimates $\alpha_1 = 2.975$ and $\beta = 4.6$, we observe that $KS = 0.19536$ with a $p$-value of 0.10333, $AD = 1.3587$, and a chi-square distance of 0.69294 with a corresponding $p$-value of 0.87486. These results indicate that the Weibull model provides a good fit for the two data sets. The estimate $\hat{\Delta}$ is calculated and found to be 0.298774. Equating this value to $\left(\frac{1}{2}\right)^{\hat{m}-1}$, we obtain $\hat{m} \approx 3$ as an estimate of the effective size for our future study.
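The last step is a one-line inversion: solving $(1/2)^{m-1} = \hat{\Delta}$ gives $m = 1 - \log_2\hat{\Delta}$, which is then rounded to an integer. A minimal sketch follows; the function name `effective_size` is a hypothetical choice, and rounding to the nearest integer is an assumption about how the estimate of 3 was reached.

```python
from math import log2

def effective_size(delta_hat):
    """Solve (1/2)**(m - 1) = delta_hat for m (a real number, not yet rounded)."""
    if not 0.0 < delta_hat <= 1.0:
        raise ValueError("delta_hat must lie in (0, 1]")
    return 1.0 - log2(delta_hat)

m_real = effective_size(0.298774)   # about 2.74
m_hat = round(m_real)               # 3
```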

Moreover, Figure 2 shows the overlap among the densities of the order statistics $[f_r;\ r = 1, 2, \ldots, m;\ m = 6]$. It shows clearly that the smallest redundancy of information occurs between the densities of the extreme order statistics $(1, 6)$. In addition, it shows that the densities $f_1, \ldots, f_6$ span the whole range of the original density. In our example we chose the Weibull distribution, but any other distribution could be used since $\Delta(f_r, f_s)$ is free of parameters.

5. Final Remarks and Conclusions

In the past few years, progressive censoring has received a great deal of attention from many researchers. This is due to its advantages in reducing the cost and duration of life testing. Moreover, the availability of high-speed computing resources has strengthened the focus on progressive censoring. In this article, we introduced a new form of the marginal distributions of the order statistics under PTII censoring. In addition, we used this new form to derive three special cases, namely the ordinary order statistics, equi-balanced and type II censoring schemes. Using the presented marginal distributions, we derived in Section 3.2 a closed form of the OVL coefficient for any two order statistics based on PTII censoring.

Figure 2. Overlap among densities.

Moreover, we found that the OVL coefficient is independent of the parent distribution and depends only on the effective size $m$, which makes it possible to estimate the effective size $m$ for future studies instead of choosing $m$ arbitrarily.

Acknowledgements

The authors are grateful to the referee for his constructive comments and suggestions which led to the improvement of this paper. The first author would like to thank Mr. Majdi Mustafa for his continuous help.

Conflicts of Interest

The authors declare no conflicts of interest.

References

[1] Viveros, R. and Balakrishnan, N. (1994) Interval Estimation of Parameters of Life from Progressively Censored Data. Technometrics, 36, 84-91.
https://doi.org/10.1080/00401706.1994.10485403
[2] Balakrishnan, N. and Aggarwala, R. (2000) Progressive Censoring: Theory, Methods, and Applications. Springer Science & Business Media, Berlin.
https://doi.org/10.1007/978-1-4612-1334-5
[3] Balakrishnan, N. and Cramer, E. (2014) The Art of Progressive Censoring. Birkhäuser, New York.
https://doi.org/10.1007/978-0-8176-4807-7
[4] Mizuno, S., Yamaguchi, T., Fukushima, A., Matsuyama, Y. and Ohashi, Y. (2005) Overlap Coefficient for Assessing the Similarity of Pharmacokinetic Data between Ethnically Different Populations. Clinical Trials, 2, 174-181.
https://doi.org/10.1191/1740774505cn077oa
[5] Weitzman, M.S. (1970) Measures of Overlap of Income Distributions of White and Negro Families in the United States. Vol. 22, US Bureau of the Census.
[6] Ichikawa, M. (1993) A Meaning of the Overlapped Area under Probability Density Curves of Stress and Strength. Reliability Engineering & System Safety, 41, 203-204.
[7] Federer, W.T., Powers, L. and Payne, M.G. (1963) Studies on Statistical Procedures Applied to Chemical Genetic Data from Sugar Beets.
[8] Sneath, P.H. (1977) A Method for Testing the Distinctness of Clusters: A Test of the Disjunction of Two Clusters in Euclidean Space as Measured by Their Overlap. Mathematical Geology, 9, 123-143.
https://doi.org/10.1007/BF02312508
[9] Mulekar, M.S. and Mishra, S.N. (1994) Overlap Coefficients of Two Normal Densities: Equal Means Case. Journal of the Japan Statistical Society, Japanese Issue, 24, 169-180.
[10] Mulekar, M.S. and Mishra, S.N. (2000) Confidence Interval Estimation of Overlap: Equal Means Case. Computational Statistics & Data Analysis, 34, 121-137.
[11] Kamps, U. (1995) A Concept of Generalized Order Statistics. Journal of Statistical Planning and Inference, 48, 1-23.
[12] Kamps, U. (1999) Order Statistics, Generalized. Encyclopedia of Statistical Sciences.
[13] Al-Saleh, M.F. (2007) On the Similarity Structure of Order Statistics. Communications in Statistics—Theory and Methods, 36, 1433-1439.
https://doi.org/10.1080/03610920601077204
[14] Ghahramani, S. (2000) Fundamentals of Probability. Prentice Hall, Upper Saddle River.
[15] Blischke, W.R. and Murthy, D.P. (2011) Reliability: Modeling, Prediction, and Optimization. Vol. 767, John Wiley & Sons.
https://doi.org/10.1007/978-0-85729-647-4_3
[16] Musleh, R.M. and Helu, A. (2014) Estimation of the Inverse Weibull Distribution Based on Progressively Censored Data: Comparative Study. Reliability Engineering & System Safety, 131, 216-227.

  

Copyright © 2020 by authors and Scientific Research Publishing Inc.

Creative Commons License

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.