Estimation of Hazard Function for Censoring Random Variable by Using Wavelet Decomposition and Evaluation of MISE, AMSE with Simulation
The failure time of an individual is the time at which failure occurs. It is not always possible to observe the failure time of every individual, because censoring may occur.
The survival function gives the proportion of individuals who survive from the base time, the point at which they enter the experiment, up to time t. The hazard function for a continuous failure time is as follows:
(1)
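For reference, the standard definition of the hazard function of a continuous failure time can be written as below. The symbols T (failure time), S (survival function), and h (hazard function) are notation assumed here; f and F are the density and distribution function used later in the text.

```latex
\[
h(t) \;=\; \lim_{\Delta t \to 0^{+}}
      \frac{P\bigl(t \le T < t + \Delta t \,\bigm|\, T \ge t\bigr)}{\Delta t}
     \;=\; \frac{f(t)}{S(t)}
     \;=\; \frac{f(t)}{1 - F(t)},
\qquad S(t) = P(T > t) = 1 - F(t).
\]
```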
In this paper, we obtain an estimator of the hazard function for censored data using a wavelet method, and we evaluate the convergence rate of the estimator by simulation.
2. Estimation of Hazard Function by Using Wavelet Method
Wavelets can be used to analyze transient phenomena and functions that change rapidly. They are localized, with limited duration. Wavelet bases are orthogonal, and wavelet coefficients are closely related to certain function spaces. These properties simplify the computational algorithms. As a result, numerous articles on wavelets have been published in statistical science.
The mathematical theory of wavelets and its applications in statistics have been studied as a technique for density estimation by Haar [1], Doukhan [2], and Antoniadis [3], and for nonparametric curve estimation by Mallat [4], Meyer [5], Daubechies [6], Donoho [7], and Kerkyacharian and Picard [8]. Hall and Patil [9] derived a formula for the mean integrated squared error of nonlinear wavelet-based density estimators. Antoniadis et al. [10] obtained wavelet estimators of the density function and the hazard function for right-censored data. Daubechies [11] studied compactly supported wavelets that produce orthogonal bases. Afshari et al. [12-14] studied estimators of the density, the derivative of the density, and the regression function for mixing random variables.
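Let a nested sequence of closed subspaces be a multiresolution approximation of the underlying function space, and define the detail space at each level as the orthogonal complement of one approximation space in the next finer one. A wavelet basis is built from a scaling function and a mother wavelet such that the integer translates of the scaling function form an orthonormal basis for the base approximation space and the integer translates of the mother wavelet form an orthonormal basis for the corresponding detail space. The other wavelets in the basis are then generated by translations of the scaling function and dilations of the mother wavelet, using the relationships: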
(2)
Given the above wavelet basis, a function can be written as a formal expansion:
(3)
where
As for a general orthogonal series estimator (Daubechies [4]), the density estimator can be written as:
(4)
where the natural coefficient estimator can be written as:
(5)
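In the usual notation of multiresolution analysis, the construction above takes the following form. This is a sketch of the standard textbook formulas and not necessarily the exact notation of Equations (2) to (5); the symbols V_j, W_j, \phi, \psi, j_0, J, \alpha, \beta and the sample X_1, ..., X_n are assumed names.

```latex
\begin{align*}
&\cdots \subset V_{-1} \subset V_{0} \subset V_{1} \subset \cdots \subset L^{2}(\mathbb{R}),
  \qquad V_{j+1} = V_{j} \oplus W_{j}, \\
&\phi_{j_0,k}(x) = 2^{j_0/2}\,\phi\bigl(2^{j_0}x - k\bigr), \qquad
  \psi_{j,k}(x) = 2^{j/2}\,\psi\bigl(2^{j}x - k\bigr), \\
&f(x) = \sum_{k} \alpha_{j_0,k}\,\phi_{j_0,k}(x)
      + \sum_{j \ge j_0} \sum_{k} \beta_{j,k}\,\psi_{j,k}(x), \\
&\alpha_{j_0,k} = \int f(x)\,\phi_{j_0,k}(x)\,dx, \qquad
  \beta_{j,k} = \int f(x)\,\psi_{j,k}(x)\,dx, \\
&\hat f(x) = \sum_{k} \hat\alpha_{j_0,k}\,\phi_{j_0,k}(x)
      + \sum_{j = j_0}^{J} \sum_{k} \hat\beta_{j,k}\,\psi_{j,k}(x), \\
&\hat\alpha_{j_0,k} = \frac{1}{n} \sum_{i=1}^{n} \phi_{j_0,k}(X_i), \qquad
  \hat\beta_{j,k} = \frac{1}{n} \sum_{i=1}^{n} \psi_{j,k}(X_i).
\end{align*}
```

The coefficient estimators are sample means because each coefficient is an expectation of the basis function under the density f.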
In this article, we divide the time axis into small intervals and record the number of events in each interval. We determine the number of events and the hazard function from the observations. We then smooth them separately using linear wavelet density estimation over the whole time range, compute the hazard function estimator, and evaluate its asymptotic distribution.
Suppose that the failure times of the n units under study are non-negative, independent, and identically distributed, with density function f and distribution function F. Suppose also that the corresponding censoring times are non-negative, independent, and identically distributed, with their own density function and distribution function.
Assuming that the failure times and the censoring times are independent, the observed random variables and the hazard function are as follows:
Here the indicator function of a set A is used. For censored data, if,
We assume that,
Such that then we can write as follows:
(6)
To estimate we need the estimator of and.
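Under random right censoring, the setup described above is commonly written as follows. The symbols T_i, C_i, X_i, \delta_i, G, L, and f^{*} are assumed notation (the text fixes only f and F), and the last identity relies on the stated independence of failure and censoring times.

```latex
\begin{align*}
&X_i = \min(T_i, C_i), \qquad \delta_i = 1_{\{T_i \le C_i\}}, \qquad i = 1, \dots, n, \\
&1 - L(t) = P(X_i > t) = \bigl(1 - F(t)\bigr)\bigl(1 - G(t)\bigr), \\
&f^{*}(t) = f(t)\,\bigl(1 - G(t)\bigr), \\
&h(t) = \frac{f(t)}{1 - F(t)} = \frac{f^{*}(t)}{1 - L(t)}, \qquad \text{for } L(t) < 1.
\end{align*}
```

In this form the hazard is a ratio of two quantities that depend only on the observable pairs (X_i, \delta_i): the subdensity f^{*} of uncensored failures and the survival function 1 - L of the observed times.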
To estimate it, we divide the time axis into small intervals, record the number of events (0 or 1) in each interval, and then divide these counts by the interval length.
The estimation procedure can be summarized as follows:
Choose a bin width, collect the observed failures into intervals of that length, and apply wavelet estimation to the binned data to obtain an estimate of the subdensity. That is, we compute the wavelet coefficients of the binned data at a chosen scale by selecting the decomposition level, and then form the estimate. The following notation is needed to state the details:
We figure estimators on the finite interval in which. Note that if is the ordinal order statistic of the sequence then,
. In fact we suppose.
Suppose that N is an integer that may depend on n, and that the estimation points are as follows:
Suppose that and we divide the interval of the time axis into intervals of length
The k-th interval is denoted as follows: for,.
Now we define the following indicator function, which marks the uncensored failures in the time interval: We assume that is the proportion of observed failures in the interval; in other words:
We smooth the data with an appropriate wavelet smoother to obtain the estimate of.
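The binning step described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' R code: the function name `binned_failure_rate`, the exponential failure and censoring distributions, the sample size, the interval end point 2.0, and the 16 bins are all hypothetical choices.

```python
import random


def binned_failure_rate(times, events, t_max, n_bins):
    """Per bin: (number of uncensored failures in the bin) / (n * bin width).

    times  -- observed times, i.e. min(failure time, censoring time)
    events -- 1 if the failure was observed (uncensored), 0 otherwise
    """
    n = len(times)
    width = t_max / n_bins
    counts = [0] * n_bins
    for t, d in zip(times, events):
        if d == 1 and 0.0 <= t < t_max:
            k = min(int(t / width), n_bins - 1)  # guard against float rounding
            counts[k] += 1
    return [c / (n * width) for c in counts]


# Hypothetical example data: exponential failure and censoring times.
random.seed(0)
n = 500
failure = [random.expovariate(1.0) for _ in range(n)]
censor = [random.expovariate(0.5) for _ in range(n)]
observed = [min(t, c) for t, c in zip(failure, censor)]
uncensored = [1 if t <= c else 0 for t, c in zip(failure, censor)]

rates = binned_failure_rate(observed, uncensored, t_max=2.0, n_bins=16)
```

Each entry of `rates` is a raw, unsmoothed estimate of the subdensity at the corresponding bin; these are the values that would be passed to the wavelet smoother.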
We can write as the following:
(7)
where,.
The multiresolution analysis structure yields an efficient tree-structured algorithm for decomposing functions in V_N whose scaling coefficients at the finest scale are known. However, these coefficients are integrals that are not directly available, so the fast wavelet transform needs initial values. Antoniadis [4] suggested the following initial values:
As a result, a reasonable estimate of the projection of
at resolution level N is:
Assume that the binned values, which serve as estimators of the subdensity, belong to a Sobolev space and that the wavelet basis is regular of the required degree. To smooth the data at a better rate for the given sample size and sequence, we estimate the unknown function as follows:
(8)
That is, the estimator is the orthogonal projection onto the coarser (smoother) approximation space.
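A linear wavelet smoother of this kind can be illustrated with the Haar wavelet, for which the projection onto a coarser approximation space amounts to zeroing the finest detail coefficients. The sketch below swaps in Haar for simplicity; the simulations in this paper use a different wavelet family, and the function names are hypothetical.

```python
import math


def haar_step(x):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(x[2 * i] + x[2 * i + 1]) * s for i in range(len(x) // 2)]
    detail = [(x[2 * i] - x[2 * i + 1]) * s for i in range(len(x) // 2)]
    return approx, detail


def haar_inverse_step(approx, detail):
    """Invert one level of the orthonormal Haar transform."""
    s = 1.0 / math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) * s)
        out.append((a - d) * s)
    return out


def linear_haar_smooth(y, levels):
    """Project y onto the approximation space `levels` steps coarser.

    The detail coefficients of the `levels` finest scales are set to zero,
    which is a linear (non-thresholded) wavelet smoother.
    """
    if len(y) % (2 ** levels) != 0:
        raise ValueError("len(y) must be divisible by 2**levels")
    approx = list(y)
    for _ in range(levels):
        approx, _detail = haar_step(approx)
    for _ in range(levels):
        approx = haar_inverse_step(approx, [0.0] * len(approx))
    return approx


smoothed = linear_haar_smooth([1.0, 3.0, 5.0, 7.0], levels=1)
```

With Haar, the result is the block average of the input over groups of 2**levels neighbouring bins, so `smoothed` above is approximately [2, 2, 6, 6]. A smoother wavelet gives a smoother projection but follows the same pattern: transform, discard fine-scale details, invert.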
Now we consider an appropriate consistent estimator of, and finally we estimate the hazard function.
We assume that has distribution function and density function.
To estimate it, we use the empirical distribution function as follows:
Such that is Histogram estimator of. Suppose that, , we can write:
Suppose that as, then we define:
so we can write as the following:
(9)
By substituting Equation (9) in Equation (8), we obtain the estimator
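The ratio form of the resulting estimator, a subdensity estimate divided by an empirical survival estimate of the observed times, can be sketched as follows. This is an illustration under assumptions, not the paper's exact estimator: the wavelet smoothing step is omitted (raw binned values are used), the survival function is evaluated at bin midpoints, and the data-generating distributions (exponential failure times with rate 1, exponential censoring times with rate 0.5) are hypothetical. With these choices the true hazard is constant and equal to 1.

```python
import random


def empirical_survival(observed, t):
    """Proportion of observed times strictly greater than t (1 - empirical CDF)."""
    return sum(1 for x in observed if x > t) / len(observed)


def hazard_from_bins(observed, events, t_max, n_bins):
    """Binned subdensity estimate divided by empirical survival at bin midpoints."""
    n = len(observed)
    width = t_max / n_bins
    counts = [0] * n_bins
    for x, d in zip(observed, events):
        if d == 1 and 0.0 <= x < t_max:
            counts[min(int(x / width), n_bins - 1)] += 1
    hazards = []
    for k, c in enumerate(counts):
        mid = (k + 0.5) * width
        surv = empirical_survival(observed, mid)
        sub = c / (n * width)  # raw subdensity estimate on bin k
        hazards.append(sub / surv if surv > 0 else float("nan"))
    return hazards


# Hypothetical example data with constant true hazard equal to 1.
random.seed(0)
n = 4000
failure = [random.expovariate(1.0) for _ in range(n)]
censor = [random.expovariate(0.5) for _ in range(n)]
observed = [min(t, c) for t, c in zip(failure, censor)]
events = [1 if t <= c else 0 for t, c in zip(failure, censor)]

hazards = hazard_from_bins(observed, events, t_max=1.0, n_bins=16)
mean_hazard = sum(hazards) / len(hazards)
```

The individual entries of `hazards` are noisy, which is why the binned values are smoothed before the ratio is formed; their average should nevertheless lie close to the true value 1 for this sample size.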
Theorem: Suppose that the subdensity is a continuous function on and is times differentiable. If and
then,
,
Proof:
Using the Chung-Smirnov property and Taylor's theorem, we can write:
(10)
(11)
By using Equations (10) and (11), we can write:
This completes the proof.
3. Numerical Computation and Simulation
In this section, we simulate the estimators on generated data using the Symmlet wavelet. We assess the convergence rate of the estimator by computing its average mean squared error. The simulations use R and a wavelet package.
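One common way to compute an average mean squared error over simulation replications is sketched below. The paper does not spell out its exact definition, so the averaging scheme (mean squared error over grid points, then averaged over replications) and the function name `average_mse` are assumptions.

```python
def average_mse(estimates, truth):
    """Average over replications of the mean squared error on a common grid.

    estimates -- list of replications, each a list of estimated hazard values
    truth     -- true hazard values on the same grid
    """
    per_replication = []
    for est in estimates:
        if len(est) != len(truth):
            raise ValueError("each replication must match the grid length")
        mse = sum((e - h) ** 2 for e, h in zip(est, truth)) / len(truth)
        per_replication.append(mse)
    return sum(per_replication) / len(per_replication)


# Two toy replications on a two-point grid: MSEs are 0.5 and 2.5.
amse = average_mse([[1.0, 2.0], [3.0, 4.0]], [2.0, 2.0])
```

Here `amse` equals 1.5. In a simulation study, each inner list would hold the hazard estimate from one generated sample, evaluated on the estimation grid.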
Example 1: We generate and from samples of size and with K = 16, , and for the optimal level
The solid line in Figure 1 shows the wavelet estimate of the hazard function, and the dotted line shows the true hazard rate. Table 1 reports the average mean squared errors of the hazard function estimator for the sample sizes considered.
Example 2: Suppose, where and. We generate from sample size of n = 400 and n = 600 with K = 16, K = 32, K = 64 and.
The solid line in Figure 2 shows the wavelet estimate of the hazard function, and the dotted line shows the true hazard rate.
Table 2 reports the average mean squared errors of the hazard function estimator for sample sizes n = 400 and n = 600.
Figure 1. Wavelet estimate of the hazard function (solid line) and the true hazard rate (dotted line).
Table 1. Average mean squared errors of the hazard function estimator.
Figure 2. Wavelet estimate of the hazard function (solid line) and the true hazard rate (dotted line).
Table 2. Average mean squared errors of the hazard function estimator.
Acknowledgements
The support of the Research Committee of Persian Gulf University is gratefully acknowledged.