
The estimation of conditional extreme quantiles in incomplete-data frameworks is of growing interest. In particular, the estimation of the extreme value index under censoring has been the subject of many investigations when finite-dimensional covariate information is available. In this paper, the estimation of the conditional extreme quantile of a heavy-tailed distribution is discussed when some functional random covariate (i.e. valued in some infinite-dimensional space) information is available and the scalar response variable is right-censored. A Weissman-type estimator of conditional extreme quantiles is proposed and its asymptotic normality is established under mild assumptions. A simulation study is conducted to assess the finite-sample behavior of the proposed estimator, and a comparison with two simple estimation strategies is provided.

The estimation of extreme quantiles is a key tool in the study of rare events, which occur infrequently but have a large impact. Extreme Value Theory (EVT) provides the tools for modeling such events, notably the estimation of the tail index and of the associated extreme quantiles. The study of extreme events is receiving attention in numerous fields of applied statistics, for example in hydrology, where one may wish to estimate the maximum level reached by seawater along a coast over a given period, or a conditional quantile of rainfall for a given region, see [

When studying rare events, the main interest is not in the estimation of “central” parameters of the random variable, such as the mean, mode or median; rather, researchers seek to understand the behavior of its right tail. One of the best-known results in extreme value theory is the Fisher-Tippett-Gnedenko Theorem [

$$\lim_{n \to \infty} \mathbb{P}\left( \frac{\max(Y_1, \dots, Y_n) - b_n}{a_n} \le x \right) = \lim_{n \to \infty} F^n(a_n x + b_n) = H(x),$$

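then $H$ belongs to the type of one of the following three distribution functions:

$$\Phi_\alpha(x) = \begin{cases} 0 & \text{if } x \le 0 \\ \exp(-x^{-\alpha}) & \text{if } x > 0 \end{cases}, \quad \alpha > 0 \quad \text{(Fréchet)}$$

$$\Psi_\alpha(x) = \begin{cases} 1 & \text{if } x \ge 0 \\ \exp(-(-x)^{\alpha}) & \text{if } x < 0 \end{cases}, \quad \alpha > 0 \quad \text{(Weibull)}$$

$$\Lambda(x) = \exp(-\exp(-x)) \quad \text{for all } x \in \mathbb{R} \quad \text{(Gumbel)}$$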

The three above distribution functions Λ , Ψ α and Φ α are the only possible limit laws of the normalized maximum of a sample of independent and identically distributed random variables. They are referred to as the Extreme Value Distribution (EVD). A parametrization of these three distributions into a single formula called Generalized Extreme Value Distribution (GEV) is given by:

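$$H_\gamma(x) = \begin{cases} \exp\left( -(1 + \gamma x)^{-1/\gamma} \right) & \text{for all } x \text{ such that } 1 + \gamma x > 0, \text{ if } \gamma \ne 0, \\ \exp(-\exp(-x)) & \text{for all } x \in \mathbb{R}, \text{ if } \gamma = 0. \end{cases}$$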

The parameter $\gamma$, called the extreme-value index or tail index, completely characterizes the tail behaviour of the distribution $F$. Its sign also determines the domain of attraction.
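As an illustration of the GEV parametrization, the following minimal Python sketch evaluates $H_\gamma$ and checks numerically the link with the Fréchet law, $\Phi_\alpha(x) = H_{1/\alpha}(\alpha(x-1))$ for $x > 0$. The function name `gev_cdf` and the numerical values are illustrative assumptions, not part of the paper.

```python
import math

def gev_cdf(x: float, gamma: float) -> float:
    """CDF H_gamma(x) of the generalized extreme value distribution."""
    if gamma == 0.0:
        return math.exp(-math.exp(-x))  # Gumbel case
    t = 1.0 + gamma * x
    if t <= 0.0:
        # Outside the support: below the lower endpoint when gamma > 0,
        # above the upper endpoint when gamma < 0.
        return 0.0 if gamma > 0.0 else 1.0
    return math.exp(-t ** (-1.0 / gamma))

# Frechet link: Phi_alpha(x) = H_{1/alpha}(alpha * (x - 1)) for x > 0.
alpha, x = 2.0, 1.5  # illustrative values
frechet = math.exp(-x ** (-alpha))
gev_value = gev_cdf(alpha * (x - 1.0), 1.0 / alpha)
```

A positive `gamma` corresponds to the heavy-tailed (Fréchet) case considered in the rest of the paper.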

The estimation of $\gamma$ is a cornerstone of many problems in extreme value analysis, such as the estimation of conditional extreme quantiles of a random variable in the presence of covariates. When some covariate information $X$ is available and the distribution of $Y$ depends on $X$, the problem is to estimate the conditional extreme-value index and the conditional extreme quantiles.

In this paper, we therefore consider the situation where some covariate information $X$ is available to the investigator and the distribution of $Y$ depends on $X$. We focus on the problem of estimating a conditional extreme quantile of a heavy-tailed distribution when some functional covariate information $X \in E$ is available, where $E$ is an infinite-dimensional space equipped with a semi-metric $d(\cdot, \cdot)$.

In the literature, many studies have addressed the estimation of conditional extreme quantiles of a random variable $Y$. Daouia et al. [

In practice, the variable of interest may be only partially observed. In classical applications such as the analysis of lifetime data (survival analysis, reliability theory, insurance), a typical feature is censoring. For example, in medical follow-up, the response variable $Y$ represents the time elapsed from the entry of a patient in, say, a follow-up study until death. If, at the time of data collection, the patient is still alive or has withdrawn from the study for some reason, the variable of interest $Y$ will not be available. Many authors have addressed this issue, among them [

Recently, many authors have been interested in the estimation of the extreme value index and of extreme quantiles; we can mention a few of them, such as [

However, to the best of our knowledge, the estimation of the conditional extreme quantile of a heavy-tailed distribution under random right censoring with a functional covariate has not yet been addressed, which motivated us to tackle this issue. In our methodology, we consider the kernel conditional Kaplan-Meier estimator of the conditional survival function in the presence of a functional (infinite-dimensional) covariate. We then construct Weissman-type estimators of the conditional extreme quantile $F^{\leftarrow}(1 - \alpha \mid x)$ under censoring and establish their asymptotic normality. Finally, the finite-sample performance of these estimators is assessed via simulations and compared with several alternative estimators.

The remainder of this paper is organized as follows. Section 2 introduces the notation and describes the framework of the study. The construction of our estimator of the functional conditional extreme quantile and its asymptotic normality are presented in Section 3, and some proofs are given in Section 4. In Section 5, we assess via simulations the finite-sample behavior of our estimator. The conclusion and some perspectives are presented in Section 6.

In this section, we describe the nonparametric estimator of the conditional quantile based on the Kaplan-Meier estimator when the covariate is a functional (infinite-dimensional) random variable and the data are censored; for more details, see [

Let $(X_i, Y_i)$, $i = 1, \dots, n$, be independent copies of the random pair $(X, Y)$, where $Y$ is a positive real random variable and $X$ is a functional random variable taking values in $E$, an infinite-dimensional space equipped with a semi-metric $d(\cdot, \cdot)$. We assume that $Y$ may be randomly right-censored by a positive random variable $C$. Therefore, we observe the independent triples $(X_i, \delta_i, Z_i)$, where $Z_i = \min(Y_i, C_i)$ and $\delta_i = \mathbf{1}\{Y_i \le C_i\}$ for $i = 1, \dots, n$, and $\mathbf{1}\{A\}$ is the indicator function of the event $A$. The random variable $C$ is defined on the same probability space $(\Omega, \mathbb{C}, \mathbb{P})$ as $Y$. We assume that $Y$ and $C$ are independent given $X = x$, and that $C_1, \dots, C_n$ are mutually independent.
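The construction of the observed pair $(Z_i, \delta_i)$ can be sketched as follows. The Pareto draws and their tail indices are hypothetical choices made only to produce heavy-tailed values; they are not the simulation design of the paper.

```python
import random

def censor(y, c):
    """Return (z, delta) = (min(y, c), 1{y <= c}) for one observation."""
    return (min(y, c), 1 if y <= c else 0)

rng = random.Random(0)
sample = []
for _ in range(5):
    # Hypothetical heavy-tailed draws (assumption, for illustration only):
    y = rng.paretovariate(1 / 0.5)  # P(Y > t) = t^(-2), tail index 0.5
    c = rng.paretovariate(1 / 1.0)  # P(C > t) = t^(-1), tail index 1.0
    sample.append(censor(y, c))
```

Only `sample` (together with the covariates) is available to the statistician; the latent values `y` and `c` are not.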

Let $F(\cdot \mid x)$ and $G(\cdot \mid x)$ be the conditional cumulative distribution functions of $Y$ and $C$ given $X = x$, respectively.

Let $\bar F(\cdot \mid x) = 1 - F(\cdot \mid x)$ and $\bar G(\cdot \mid x) = 1 - G(\cdot \mid x)$ be the corresponding conditional survival functions. Since we are dealing with heavy tails, we assume that the following condition is satisfied:

(A1)

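$$\bar F(t \mid x) = r_1(x) \exp\left\{ -\int_1^t \left( \frac{1}{\gamma_1(x)} - \varepsilon_1(\mu \mid x) \right) \frac{d\mu}{\mu} \right\} \tag{1}$$

and

$$\bar G(t \mid x) = r_2(x) \exp\left\{ -\int_1^t \left( \frac{1}{\gamma_2(x)} - \varepsilon_2(\mu \mid x) \right) \frac{d\mu}{\mu} \right\} \tag{2}$$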

where $\gamma_1(x)$ and $\gamma_2(x)$ are positive unknown functions of the covariate $x$, $r_1$ and $r_2$ are positive functions, and $|\varepsilon_1(\mu \mid x)|$ and $|\varepsilon_2(\mu \mid x)|$ are continuous and ultimately decreasing to zero. Equations (1) and (2) imply that the conditional distributions of $Y$ and $C$ given $X = x$ are in the Fréchet maximum domain of attraction. Thus, $\gamma_1(x)$ and $\gamma_2(x)$ are the conditional tail-index functions, and $\bar F(\cdot \mid x)$ and $\bar G(\cdot \mid x)$ are regularly varying functions at infinity with index $-1/\gamma_1(x)$ and $-1/\gamma_2(x)$ respectively. Thus,

$$\bar F(u \mid x) = u^{-1/\gamma_1(x)} L_1(u \mid x) \quad \text{and} \quad \bar G(u \mid x) = u^{-1/\gamma_2(x)} L_2(u \mid x)$$

where, for fixed $x$, $L_1(\cdot \mid x)$ and $L_2(\cdot \mid x)$ are slowly varying functions at infinity, that is, for all $\lambda > 0$,

$$\lim_{u \to \infty} \frac{L_i(\lambda u \mid x)}{L_i(u \mid x)} = 1, \quad i = 1, 2.$$

By the conditional independence of $Y$ and $C$, the conditional survival function $\bar H(\cdot \mid x)$ of $Z$ given $X = x$ is also regularly varying at infinity, with index $-1/\gamma(x)$:

$$\bar H(z \mid x) = 1 - H(z \mid x) = \bar F(z \mid x)\, \bar G(z \mid x) = r(x) \exp\left\{ -\int_1^z \left( \frac{1}{\gamma(x)} - \varepsilon(\mu \mid x) \right) \frac{d\mu}{\mu} \right\}$$

with $\gamma(x) = \gamma_1(x)\, p(x)$, where $p(x) = \dfrac{\gamma_2(x)}{\gamma_1(x) + \gamma_2(x)}$ is the ultimate proportion of uncensored observations among $Z_i$, $i = 1, \dots, n$; the proof of this statement is beyond the scope of the present paper (see [

Formally, let $(X, Y)$ be an $E \times \mathbb{R}$-valued random element, where $E$ is a semi-metric space with semi-metric $d(\cdot, \cdot)$, and suppose that we observe a sequence $(X_i, Y_i)$, $i \ge 1$, of copies of $(X, Y)$.

In this paper, we are interested in estimating the conditional extreme quantile $q(\alpha_n \mid x)$ of order $1 - \alpha_n$ of $Y$ given $X = x$, defined through the conditional survival function $\bar F(\cdot \mid x)$ by

$$\bar F(q(\alpha_n \mid x) \mid x) = \alpha_n, \quad \alpha_n \to 0, \quad n \to \infty.$$

In the random right-censoring model $(Z, \delta)$, $Z$ is the observed variable and $\delta$ is the censoring indicator, equal to one if $Y \le C$ and zero otherwise; we say that $Y$ is right-censored by $C$. The conditional cumulative hazard function is given by:

$$\Lambda(y_n \mid x_i) = -\log(1 - F(y_n \mid x_i)) = \int_0^{y_n} \frac{dF(s \mid x_i)}{1 - F(s \mid x_i)} = \int_0^{y_n} \frac{dH^1(s \mid x_i)}{1 - H(s \mid x_i)}.$$

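Therefore, the estimator $\hat\Lambda_n(y_n \mid x_i)$ of $\Lambda(y_n \mid x_i)$ is given by

$$\hat\Lambda_n(y_n \mid x_i) = \sum_{i=1}^n \frac{B_{ni}(x, h)\, \mathbf{1}\{Z_i > y_n, \delta_i = 1\}}{1 - \sum_{j=1}^n B_{nj}(x, h)\, \mathbf{1}\{Z_j \le Z_i\}} \tag{3}$$

where $H_n(x) = B_{ni}(x, h)\, \mathbf{1}\{Z_i > y_n\}$ and $H_n^1(x) = B_{ni}(x, h)\, \mathbf{1}\{Z_i > y_n, \delta_i = 1\}$, $\delta_i$ is the censoring indicator associated with $Z_i$, and $B_{ni}(x, h)$ are the Nadaraya-Watson weights [

$$B_{ni}(x) = \frac{K(d(x, X_i)/h)}{\sum_{j=1}^n K(d(x, X_j)/h)},$$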

where K is a kernel density and h is a bandwidth parameter such that h → 0 as n → ∞ .
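The Nadaraya-Watson weights can be computed directly from the distances $d(x, X_i)$. The sketch below is a minimal illustration; the function names are assumptions, and the default kernel is the asymmetric linear kernel $K(u) = (1.9 - 1.8u)\mathbf{1}_{[0,1]}(u)$ used later in the simulation study.

```python
def linear_kernel(u):
    """K(u) = 1.9 - 1.8 u on [0, 1], zero elsewhere."""
    return 1.9 - 1.8 * u if 0.0 <= u <= 1.0 else 0.0

def nw_weights(dists, h, kernel=linear_kernel):
    """Weights B_ni(x) computed from distances d(x, X_i) and a bandwidth h."""
    k = [kernel(d / h) for d in dists]
    total = sum(k)
    if total == 0.0:
        raise ValueError("no covariate lies within distance h of x")
    return [v / total for v in k]

# Three covariates at distances 0, 0.5 and 2 from x, with bandwidth h = 1:
weights = nw_weights([0.0, 0.5, 2.0], h=1.0)
```

Observations whose covariate lies outside the ball $B(x, h)$ receive zero weight, so only the local sample contributes to the estimate at $x$.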

From Equation (3), the estimator of the survival function may be expressed as

$$\begin{aligned} \hat{\bar F}_n(y_n \mid x_i) &= 1 - \hat F_n(y_n \mid x_i) = \exp\left( -\hat\Lambda_n(y_n \mid x_i) \right) \\ &= \exp\left( -\sum_{i=1}^n \frac{B_{ni}(x, h)\, \mathbf{1}\{Z_i > y_n, \delta_i = 1\}}{1 - \sum_{j=1}^n B_{nj}(x, h)\, \mathbf{1}\{Z_j \le Z_i\}} \right) \\ &= \prod_{i=1}^n \exp\left( -\frac{B_{ni}(x, h)\, \mathbf{1}\{Z_i > y_n, \delta_i = 1\}}{1 - \sum_{j=1}^n B_{nj}(x, h)\, \mathbf{1}\{Z_j \le Z_i\}} \right). \end{aligned}$$

Applying the first-order Taylor expansion $\exp(-y) \approx 1 - y$ around $y = 0$, we obtain

$$\hat{\bar F}_n(y_n \mid x_i) = \prod_{i=1}^n \left( 1 - \frac{B_{ni}(x, h)\, \mathbf{1}\{Z_i > y_n, \delta_i = 1\}}{1 - \sum_{j=1}^n B_{nj}(x, h)\, \mathbf{1}\{Z_j \le Z_i\}} \right).$$
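A product-limit estimator of this kind can be sketched in a few lines. The code below implements the standard Beran form of the kernel conditional Kaplan-Meier estimator, in which the product runs over uncensored observations with $Z_i \le y$ and the risk set of $Z_i$ collects the weights of the observations with $Z_j \ge Z_i$; these indicator conventions, the absence of ties and the function name are assumptions of the sketch and may differ from the notation displayed above.

```python
def conditional_km_survival(y, z, delta, weights):
    """Beran-type estimate of P(Y > y | X = x).

    z, delta: observed times and censoring indicators (1 = uncensored).
    weights:  Nadaraya-Watson weights B_ni(x), assumed to sum to one.
    """
    order = sorted(range(len(z)), key=lambda i: z[i])
    surv = 1.0
    at_risk = 1.0  # weighted mass of observations with Z_j >= current Z_i
    for i in order:
        if z[i] > y:
            break
        if delta[i] == 1 and at_risk > 0.0:
            # min(...) guards against rounding pushing the ratio above 1
            surv *= 1.0 - min(weights[i] / at_risk, 1.0)
        at_risk -= weights[i]
    return max(surv, 0.0)

# With equal weights and no censoring this is the empirical survival function:
s = conditional_km_survival(2.5, [1.0, 2.0, 3.0, 4.0], [1, 1, 1, 1], [0.25] * 4)
```

Censored observations do not create a jump but still leave the risk set, which is how the estimator corrects for censoring.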

We denote the conditional quantile of $Y$ given $X = x$ at level $\alpha_n$, with $\alpha_n \to 0$ as $n \to \infty$, by

$$q(\alpha_n \mid x) = \bar F^{\leftarrow}(\alpha_n \mid x) = \inf\{ y_n : \bar F(y_n \mid x) \le \alpha_n \}.$$

Therefore, a natural estimator of $q(\alpha_n \mid x)$ is given by

$$\hat q_n(\alpha_n \mid x) = \hat{\bar F}_n^{\leftarrow}(\alpha_n \mid x) = \inf\{ y_n : \hat{\bar F}_n(y_n \mid x) \le \alpha_n \}.$$

Let $B(t, r)$ denote the ball centered at $t \in E$ with radius $r$, defined by

$$B(t, r) = \{ x \in E : d(x, t) \le r \},$$

and let $h$ be a positive sequence tending to zero as $n \to \infty$. The moving-window method adopted in [ uses the observations whose covariates $x_i$ belong to the ball $B(t, h)$.

For $x \in E$, we denote the conditional distribution function of $Y$ given $X = x$ by

$$\forall y \in \mathbb{R}, \quad F(y \mid x) = \mathbb{P}(Y \le y \mid X = x).$$

With $B(x, h)$ the ball of center $x$ and radius $h$, we write $\varphi_x(h) = \mathbb{P}(X \in B(x, h))$ for the small ball probability of $X$.

We now investigate the estimation of the large conditional quantile $q(\alpha_n \mid x)$ of order $1 - \alpha_n$ of $F(\cdot \mid x)$, defined by $1 - F(q(\alpha_n \mid x) \mid x) = \alpha_n$ with $\alpha_n \to 0$ as $n \to \infty$. To define our estimator, we first define $\hat q_n^c(\alpha_n \mid x)$, the functional estimator of a large conditional quantile $q(\alpha_n \mid x)$ within the sample.

Consider the kernel conditional Kaplan-Meier estimator of the conditional survival function $1 - F(\cdot \mid x)$, defined for all $x \in E$ and $y_n \in (0, \infty)$ by:

$$\hat{\bar F}_n(y_n \mid x) = \prod_{i=1}^n \left( 1 - \frac{B_{ni}(x, h)\, \mathbf{1}\{Z_i > y_n, \delta_i = 1\}}{1 - \sum_{j=1}^n B_{nj}(x, h)\, \mathbf{1}\{Z_j \le Z_i\}} \right).$$

This function may be rewritten as

and zero otherwise, where $Z_{(1)} \le \dots \le Z_{(n)}$ denote the order statistics of $Z_1, \dots, Z_n$.

Taking into account the estimator in Equation (4), we propose to estimate the conditional quantile $q(\alpha_n \mid x)$ within the sample of observations (i.e. for fixed $\alpha_n \in (0, 1)$) by the generalized inverse of $\hat{\bar F}_n(\cdot \mid x)$:

$$\hat q_n^c(\alpha_n \mid x) = \hat{\bar F}_n^{\leftarrow}(\alpha_n \mid x) = \inf\{ u : \hat{\bar F}_n(u \mid x) \le \alpha_n \}.$$

When $\alpha_n \to 0$ as $n \to \infty$, we propose to estimate the conditional extreme quantile $q(\alpha_n \mid x)$ by the Weissman-type estimator

$$\hat q_n^{c,W}(\alpha_n \mid x) = \hat q_n^c\left( \hat{\bar F}_n(Z_{(n-k)} \mid x) \mid x \right) \left( \frac{\hat{\bar F}_n(Z_{(n-k)} \mid x)}{\alpha_n} \right)^{\hat\gamma_n^{c,H}(x)}. \tag{5}$$

The term $\left( \hat{\bar F}_n(Z_{(n-k)} \mid x) / \alpha_n \right)^{\hat\gamma_n^{c,H}(x)}$ is an extrapolation factor that allows the estimation of arbitrarily large quantiles, and $\hat\gamma_n^{c,H}(x)$ is the estimator of the censored functional conditional extreme value index $\gamma_1(x)$.
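The extrapolation step itself is elementary and can be sketched as below. The function name and the numerical values are illustrative assumptions; the sketch takes the anchor quantile at level $\beta$ and the tail-index estimate as given, and assumes that the target level $\alpha$ is smaller than $\beta$. For an exact Pareto tail with survival function $t^{-1/\gamma}$, the quantile is $q(a) = a^{-\gamma}$ and the extrapolation is exact when the true $\gamma$ is plugged in, which gives a simple sanity check.

```python
def weissman_quantile(q_beta, beta, alpha, gamma_hat):
    """Extrapolate a quantile of order 1 - beta to the more extreme order 1 - alpha."""
    if not 0.0 < alpha <= beta < 1.0:
        raise ValueError("expected 0 < alpha <= beta < 1")
    return q_beta * (beta / alpha) ** gamma_hat

# Sanity check on an exact Pareto tail (illustrative values):
gamma = 0.5
beta, alpha = 0.05, 0.001
q_alpha = weissman_quantile(beta ** (-gamma), beta, alpha, gamma)
```

In practice `q_beta` would be the within-sample estimate at the level $\hat{\bar F}_n(Z_{(n-k)} \mid x)$ and `gamma_hat` the censored tail-index estimate, so the quality of the extrapolation depends on both.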

Some regularity conditions are needed for proving our results (these conditions are adapted from [

(A2) K is a function with support [ 0,1 ] and there exist 0 < c 1 < c 2 < ∞ such that c 1 ≤ K ( t ) ≤ c 2 for all t ∈ [ 0,1 ] .

(A3) For fixed $x \in E$, the conditional quantile function $\alpha \in (0, 1) \mapsto q(\alpha \mid x) \in (0, +\infty)$ is differentiable, and the function defined by $\alpha \in (0, 1) \mapsto \Delta(\alpha \mid x) = \gamma_1(x) + \alpha \dfrac{\partial \log q(\alpha \mid x)}{\partial \alpha}$ is continuous and such that $\lim_{\alpha \to 0} \Delta(\alpha \mid x) = 0$.

Hypothesis (A3) controls the behavior of the first derivative of the log-quantile function and is a necessary and sufficient condition to obtain the heavy-tail property.

The largest oscillation of the log-quantile function with respect to its second variable is defined for all $a \in (0, 1/2)$ as

$$\omega_n(a) = \sup\left\{ \left| \log \frac{q(\alpha_n \mid x)}{q(\alpha_n \mid x')} \right|, \ \alpha_n \in (a, 1 - a), \ (x, x') \in B(t, h)^2 \right\}.$$

Theorem 1. Assume that (A1)-(A3) hold. Let $x \in E$, let $\beta_n = \hat{\bar F}_n(Z_{(n-k)} \mid x)$ and let $\alpha_n$ be a sequence such that $\beta_n / \alpha_n \to 0$, $\sigma_n(x) \to 0$ and $\sigma_n^{-1}(x)\, \varepsilon(q(\beta_n \mid x) \mid x) \to 0$ as $n \to \infty$. Consider $\hat\gamma_n^{c,H}(x)$ such that $\sigma_n^{-1}(x) \left( \hat\gamma_n^{c,H}(x) - \gamma_1(x) \right) \xrightarrow{d} N(0, AV(x))$ with $AV(x) \ge 0$ and

$$\sigma_n(x) = \left( n \bar H(y_n \mid x) \frac{\left( \mu_x^{(1)}(h) \right)^2}{\mu_x^{(2)}(h)} \right)^{-1/2}.$$

Let $\zeta_n(x) = (n \varphi_x(h) \beta_n)^{1/2} \log(\beta_n / \alpha_n)$. If $\sigma_n^{-1}(x)\, \zeta_n^{-1}(x) \to 0$ as $n \to \infty$, then

$$\frac{\sigma_n^{-1}(x)}{\log(\beta_n / \alpha_n)} \log\left( \frac{\hat q_n^{c,W}(\alpha_n \mid x)}{q(\alpha_n \mid x)} \right) \xrightarrow{d} N\left( 0, \frac{\gamma_1^3(x)}{\gamma(x)} \right).$$

Lemma 2. Let $(T_{1n})$ and $(T_{2n})$ be two sequences of random variables. Suppose there exists an event $B_n$ such that $(T_{1n} \mid B_n) \overset{d}{=} (T_{2n} \mid B_n)$ with $\mathbb{P}(B_n) \to 1$. Then $T_{1n} \xrightarrow{d} T_1$ implies $T_{2n} \xrightarrow{d} T_1$.

Proof of Lemma 2: See [

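Lemma 3. Suppose (A1) and (A2) hold. Let $0 \le \beta_n \le \alpha_n$ be such that $\alpha_n \to 0$ as $n \to \infty$. Then

$$\left| \log \frac{q(\alpha_n \mid x)}{q(\beta_n \mid x)} + \gamma_1(x) \log\left( \frac{\alpha_n}{\beta_n} \right) \right| = O\left( \log\left( \frac{\alpha_n}{\beta_n} \right) \varepsilon(q(\alpha_n \mid x) \mid x) \right).$$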

Proof of Lemma 3: See [

Proposition 4. Let $m_x = n \varphi_x(h)$ be the nonrandom number of observations in the slice $(0, \infty) \times B(x, h)$. Let $\alpha_n$ be a sequence satisfying $\alpha_n \to 0$ and $m_x \alpha_n \to \infty$. If $(m_x \alpha_n)^2\, \omega_n\left( m_x^{-(1 + \varepsilon)} \right) \to 0$ as $n \to \infty$ for some $\varepsilon > 0$, then

$$(m_x \alpha_n)^{1/2} \left( \frac{\hat q^c(\alpha_n \mid x)}{q(\alpha_n \mid x)} - 1 \right) \xrightarrow{d} N(0, \gamma_1^2(x)).$$

Proof of Proposition 4

The proof is similar to proof of Theorem 1 in [

Proof of Theorem 1

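Let $\alpha_n > \beta_n$. The estimator of the conditional extreme quantile satisfies

$$\log\left( \hat q_n^{c,W}(\alpha_n \mid x) \right) = \log \hat q_n^c(\beta_n \mid x) + \hat\gamma_n^{c,H}(x) \log(\beta_n / \alpha_n),$$

so that

$$\begin{aligned} \log\left( \frac{\hat q_n^{c,W}(\alpha_n \mid x)}{q(\alpha_n \mid x)} \right) &= \log \hat q_n^c(\beta_n \mid x) + \hat\gamma_n^{c,H}(x) \log(\beta_n / \alpha_n) - \log q(\alpha_n \mid x) \\ &= \log \hat q_n^c(\beta_n \mid x) - \log q(\beta_n \mid x) + \left( \hat\gamma_n^{c,H}(x) - \gamma_1(x) \right) \log(\beta_n / \alpha_n) \\ &\quad - \log q(\alpha_n \mid x) + \log q(\beta_n \mid x) + \gamma_1(x) \log(\beta_n / \alpha_n) \\ &= \log\left( \frac{\hat q_n^c(\beta_n \mid x)}{q(\beta_n \mid x)} \right) + \left( \hat\gamma_n^{c,H}(x) - \gamma_1(x) \right) \log(\beta_n / \alpha_n) \\ &\quad + \log\left( \frac{q(\beta_n \mid x)}{q(\alpha_n \mid x)} \right) + \gamma_1(x) \log(\beta_n / \alpha_n). \end{aligned}$$

Therefore

$$\begin{aligned} \frac{\sigma_n^{-1}(x)}{\log(\beta_n / \alpha_n)} \log\left( \frac{\hat q_n^{c,W}(\alpha_n \mid x)}{q(\alpha_n \mid x)} \right) &= \frac{\sigma_n^{-1}(x)}{\log(\beta_n / \alpha_n)} \log\left( \frac{\hat q_n^c(\beta_n \mid x)}{q(\beta_n \mid x)} \right) + \sigma_n^{-1}(x) \left( \hat\gamma_n^{c,H}(x) - \gamma_1(x) \right) \\ &\quad + \frac{\sigma_n^{-1}(x)}{\log(\beta_n / \alpha_n)} \left[ \log\left( \frac{q(\beta_n \mid x)}{q(\alpha_n \mid x)} \right) + \gamma_1(x) \log(\beta_n / \alpha_n) \right] \\ &= A_{1,n} + A_{2,n} + A_{3,n} \end{aligned}$$

with

$$A_{1,n} = \frac{\sigma_n^{-1}(x)}{\log(\beta_n / \alpha_n)} \log\left( \frac{\hat q_n^c(\beta_n \mid x)}{q(\beta_n \mid x)} \right),$$

$$A_{2,n} = \sigma_n^{-1}(x) \left( \hat\gamma_n^{c,H}(x) - \gamma_1(x) \right),$$

$$A_{3,n} = \frac{\sigma_n^{-1}(x)}{\log(\beta_n / \alpha_n)} \left[ \log\left( \frac{q(\beta_n \mid x)}{q(\alpha_n \mid x)} \right) + \gamma_1(x) \log(\beta_n / \alpha_n) \right].$$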

Under the assumptions of Theorem 1, Proposition 4 gives

$$(n \varphi_x(h) \beta_n)^{1/2} \left[ \log\left( \frac{\hat q_n^c(\beta_n \mid x)}{q(\beta_n \mid x)} \right) \right] = O_p(1). \tag{6}$$

With the notation $\zeta_n(x)$ of Theorem 1, we see that

$$A_{1,n} = \sigma_n^{-1}(x)\, \zeta_n^{-1}(x)\, (n \varphi_x(h) \beta_n)^{1/2} \left[ \log\left( \frac{\hat q_n^c(\beta_n \mid x)}{q(\beta_n \mid x)} \right) \right].$$

Equation (6) and the hypotheses of Theorem 1 then give $A_{1,n} \to 0$ in probability as $n$ goes to infinity. By the assumptions of Theorem 1, $A_{2,n}$ converges in distribution to a centered Gaussian distribution with variance $AV(x)$ (see [

In this section, the main purpose is to assess the performance of the proposed estimators. We compare the results with two simple estimation approaches based on the tail index of a heavy-tailed distribution under random right censoring. We assume that the theoretical distributions of $Y$ given $X$ and of $C$ given $X$ are known. We simulate $N = 500$ replications of samples of size $n = 200$ and $n = 500$ of random triples $\{(X_i, Z_i, \delta_i), i = 1, \dots, n\}$ from $(X, Z, \delta)$ to construct the estimates.

The curve $X_i$ is a realization of the functional covariate $X \in E$ defined by

$$X(t) = \Omega\, (2 - \cos(\pi W t)) + (1 - \Omega) \cos(\pi W t)$$

for all $t \in [0, 1]$, where $W$ is normally distributed on $[0, 1]$ and $\Omega$ is a random variable following a Bernoulli distribution with success probability one half, as adapted in [

The conditional distribution of $Y$ given $X = x$ is a Burr distribution with parameters $\tau(x) = 2$ and $\lambda(x) = 2 / (8 \|X\|_2^2 - 3)$, which implies that $\gamma_1(x) = \dfrac{1}{\tau(x) \lambda(x)}$, with

$$\|X\|_2^2 = \int_0^1 X^2(t)\, dt = 4\Omega^2 - 4\Omega(2\Omega - 1) \frac{\sin(\pi W)}{\pi W} + (2\Omega - 1)^2 \left[ \frac{1}{2} + \frac{\sin(2\pi W)}{4\pi W} \right].$$
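The closed form of $\|X\|_2^2$ can be checked numerically against a direct quadrature of $X^2(t)$ on $[0, 1]$. In the sketch below, the function names, the midpoint rule and the chosen values of $W$ and $\Omega$ are illustrative assumptions; $W$ is taken nonzero so that the ratios are well defined.

```python
import math

def curve(t, w, omega):
    """X(t) = omega * (2 - cos(pi w t)) + (1 - omega) * cos(pi w t)."""
    c = math.cos(math.pi * w * t)
    return omega * (2.0 - c) + (1.0 - omega) * c

def squared_norm_closed_form(w, omega):
    """Closed-form value of the integral of X(t)^2 over [0, 1], for w != 0."""
    s1 = math.sin(math.pi * w) / (math.pi * w)
    s2 = math.sin(2.0 * math.pi * w) / (4.0 * math.pi * w)
    return (4.0 * omega ** 2
            - 4.0 * omega * (2.0 * omega - 1.0) * s1
            + (2.0 * omega - 1.0) ** 2 * (0.5 + s2))

def squared_norm_numeric(w, omega, m=20000):
    """Midpoint-rule approximation of the integral of X(t)^2 over [0, 1]."""
    return sum(curve((j + 0.5) / m, w, omega) ** 2 for j in range(m)) / m

# Largest discrepancy over a few illustrative (W, Omega) pairs:
gap = max(abs(squared_norm_closed_form(w, o) - squared_norm_numeric(w, o))
          for w, o in [(0.3, 0), (0.7, 1), (0.95, 0)])
```

The closed form avoids any numerical integration when computing $\lambda(x)$ for each simulated curve.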

The conditional distribution of $C$ given $X = x$ is also a Burr distribution, whose parameter $\gamma_2(x) = \dfrac{1}{\tau(x) \lambda_2(x)}$ is chosen to yield various values of the overall censoring percentage $c$ ($c = 10\%, 20\%, 30\%, 40\%$). Recall that $\gamma(x) = \gamma_1(x)\, p(x)$, where $p(x) = \dfrac{\gamma_2(x)}{\gamma_1(x) + \gamma_2(x)} = \dfrac{\lambda_1(x)}{\lambda_1(x) + \lambda_2(x)}$ is the ultimate proportion of uncensored observations among $Z_i$, $i = 1, \dots, n$. Once $\gamma_1(x)$ is fixed, we choose $\gamma_2(x)$ such that $1 - p(x)$ is approximately equal to the target censoring percentage ($10\%, 20\%, 30\%, 40\%$).
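The choice of the censoring parameter follows from $p(x) = \lambda_1(x) / (\lambda_1(x) + \lambda_2(x))$: setting $1 - p(x) = c$ gives $\lambda_2(x) = \lambda_1(x)\, c / (1 - c)$. The sketch below illustrates this together with inverse-transform sampling, assuming the Burr survival function $(1 + y^{\tau})^{-\lambda}$, whose tail index is $1/(\tau\lambda)$; this parametrization, the function names and the numerical values are assumptions of the sketch.

```python
import random

def burr_quantile(u, tau, lam):
    """Return y such that (1 + y^tau)^(-lam) = u, for u in (0, 1]."""
    return (u ** (-1.0 / lam) - 1.0) ** (1.0 / tau)

def lambda_for_censoring(lam1, c):
    """lam2 such that 1 - p = c, where p = lam1 / (lam1 + lam2)."""
    return lam1 * c / (1.0 - c)

def tail_indices(tau, lam1, lam2):
    """Return (gamma_1, gamma_2, p, gamma) with gamma = gamma_1 * p."""
    g1 = 1.0 / (tau * lam1)
    g2 = 1.0 / (tau * lam2)
    p = g2 / (g1 + g2)
    return g1, g2, p, g1 * p

rng = random.Random(1)
tau, lam1 = 2.0, 2.0                     # hypothetical values for one fixed x
lam2 = lambda_for_censoring(lam1, 0.20)  # target 1 - p(x) = 20%
y = burr_quantile(1.0 - rng.random(), tau, lam1)  # 1 - U lies in (0, 1]
c = burr_quantile(1.0 - rng.random(), tau, lam2)
z, delta = min(y, c), int(y <= c)
```

Note that $1 - p(x)$ is the limiting proportion of censored observations in the tail, so the censoring rate observed over a whole finite sample only approximates it.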

In practice, some parameters have to be fixed. The kernel $K$ is an asymmetric linear kernel defined by $K(u) = (1.9 - 1.8u)\, \mathbf{1}_{[0,1]}(u)$, and the estimator $\hat q_n^{c,W}(x)$ depends on the parameters $h$ and $k$. The bandwidth parameter $h$ is chosen using the cross-validation method implemented in [

$$h_{opt} = \arg\min_h \sum_{i=1}^n \sum_{j=1}^n \left( \mathbf{1}\{Z_i > Z_j\} - \hat{\bar F}_{n,-i}(Z_j \mid x_i) \right)^2,$$

where $\hat{\bar F}_{n,-i}$ is the kernel conditional Kaplan-Meier estimator presented in Equation (4), as adopted in [

Once the bandwidth has been selected, the next step is to determine the number of threshold excesses $k$. Different methods have been proposed in the literature; in this paper we adopt the method used by [

We start by splitting the estimates in Equation (5), computed with $y_k$ for $k = 1, \dots, n - 1$, into successive blocks of size $\lfloor k_{\max} \rfloor$. We then compute the standard deviation of the estimates in each block; the median of the estimates in the block with the smallest standard deviation is retained, and the corresponding $k$ is taken as the optimal one.
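The block-based selection can be sketched as follows. This is only one possible reading of the procedure: the block size is left as a free argument, the population standard deviation is used, and the returned $k$ is the index of the estimate closest to the block median; all three choices, as well as the function name, are assumptions of the sketch.

```python
import statistics

def select_by_stable_block(estimates, block_size):
    """estimates[k - 1] is the estimate computed with k excesses, k = 1, 2, ...

    Split the estimates into successive blocks, find the block with the
    smallest standard deviation, and return (k, value): value is the block
    median and k indexes the estimate closest to that median.
    """
    best = None
    for start in range(0, len(estimates) - block_size + 1, block_size):
        block = estimates[start:start + block_size]
        sd = statistics.pstdev(block)
        if best is None or sd < best[0]:
            best = (sd, start, block)
    if best is None:
        raise ValueError("need at least block_size estimates")
    _, start, block = best
    med = statistics.median(block)
    offset = min(range(len(block)), key=lambda j: abs(block[j] - med))
    return start + offset + 1, med

# Toy path of estimates: the middle block is the most stable one.
k_opt, q_opt = select_by_stable_block(
    [5.0, 1.0, 9.0, 3.0, 3.1, 2.9, 7.0, 0.0, 4.0], block_size=3)
```

The idea is that a good choice of $k$ lies in a region where the estimate, seen as a function of $k$, is stable.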

Another point to discuss is the choice of the semi-metric, which plays a key role in the behavior of nonparametric statistics for functional data; for more details, see [

$$d_{sm}(X_1, X_2) = \sqrt{ \int_T \left( X_1^{(q)}(t) - X_2^{(q)}(t) \right)^2 dt }, \tag{7}$$

where $q$ is the order of the derivative.
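A derivative-based semi-metric of this type can be approximated for curves observed on a regular grid. The sketch below assumes the square-root form of the integral, uses repeated forward differences for the derivative of order $q$ and a Riemann sum for the integral; the discretization and the function names are assumptions, and smoother approximations (for instance spline-based ones) may be preferable in practice.

```python
import math

def derivative(values, step, order):
    """Approximate the derivative of the given order by repeated forward differences."""
    out = list(values)
    for _ in range(order):
        out = [(b - a) / step for a, b in zip(out, out[1:])]
    return out

def semi_metric(x1, x2, step, order):
    """Riemann-sum approximation of sqrt( integral (x1^(q) - x2^(q))^2 dt )."""
    d1 = derivative(x1, step, order)
    d2 = derivative(x2, step, order)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(d1, d2)) * step)

# Two curves that differ by a vertical shift are at distance zero for q >= 1:
step = 0.01
grid = [j * step for j in range(101)]
x1 = [math.sin(t) for t in grid]
x2 = [v + 3.0 for v in x1]
d0 = semi_metric(x1, x2, step, 0)
d1 = semi_metric(x1, x2, step, 1)
```

The example shows why this is a semi-metric rather than a metric: for $q \ge 1$ distinct curves can be at distance zero, so the distance only compares the shapes of the curves.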

In order to check the finite-sample performance of the conditional extreme quantile estimator in Equation (5), we performed simulation experiments, which are described in Section 5.1. Furthermore, to evaluate the impact of the order of the derivative in the choice of the semi-metric as [

To assess the performance of our estimator, we make a comparison with two simple estimation strategies. The first one is a complete-case procedure (“CC” for short): we remove all censored observations from the simulated samples. Then, we compute the tail index estimator proposed in [

Now to illustrate the asymptotic normality result for our estimators, we use the Kolmogorov-Smirnov test to examine the asymptotic normality of the estimator as presented in

The P-values of the Kolmogorov-Smirnov test are greater than 0.05 as illustrated in

The performance of our estimator q ^ n c , W ( α n | x ) defined in (5) is evaluated using Mean Squared Error (MSE) and Mean Absolute Error (MAE). We also provide the averaged value (over the N samples) of the number of threshold excesses k * . The accuracy of our estimator depends on the censoring percentage and on the degree of derivative of the semi-metric d s m ( ⋅ , ⋅ ) defined in Equation (7).
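For a fixed covariate value and a known true quantile, the two criteria are computed over the $N$ replications as in the following minimal sketch; the function names and the toy values are illustrative assumptions.

```python
def mse(estimates, truth):
    """Mean squared error of the replicated estimates around the true value."""
    return sum((e - truth) ** 2 for e in estimates) / len(estimates)

def mae(estimates, truth):
    """Mean absolute error of the replicated estimates around the true value."""
    return sum(abs(e - truth) for e in estimates) / len(estimates)

# Toy example with three replications and true quantile 2.0:
errors = (mse([1.5, 2.0, 3.0], 2.0), mae([1.5, 2.0, 3.0], 2.0))
```

The MSE penalizes large deviations more heavily than the MAE, which is why both are reported.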

To demonstrate the accuracy of the proposed estimator, we compare it with the complete-case and ignored-case procedures described in Section 5.2.

The proposed estimator of conditional extreme quantiles performs well at low censoring rates when the sample size is large enough, as illustrated in

Regarding the choice of the semi-metric, our simulation results show that the order of the derivative plays a key role: since the functional curves are smooth, the semi-metric based on a high-order derivative performs better than the one based on a low-order derivative, as illustrated in

Case | Order | Censoring | MSE (n = 200) | MAE (n = 200) | k⋆ (n = 200) | MSE (n = 500) | MAE (n = 500) | k⋆ (n = 500)
---|---|---|---|---|---|---|---|---
Censored | 4 | 10% | 0.0672 | 0.2422 | 63.752 | 0.0190 | 0.1179 | 160.396
Censored | 4 | 20% | 0.0737 | 0.2574 | 71.088 | 0.0221 | 0.1310 | 184.112
Censored | 4 | 30% | 0.0801 | 0.2719 | 78.396 | 0.0263 | 0.1478 | 201.052
Censored | 4 | 40% | 0.0871 | 0.2859 | 79.068 | 0.0310 | 0.1654 | 207.828
Censored | 3 | 10% | 0.0676 | 0.2433 | 64.284 | 0.0201 | 0.1216 | 157.844
Censored | 3 | 20% | 0.0738 | 0.2580 | 70.696 | 0.0233 | 0.1341 | 184.728
Censored | 3 | 30% | 0.0803 | 0.2718 | 75.148 | 0.0272 | 0.1499 | 194.760
Censored | 3 | 40% | 0.0872 | 0.2860 | 81.056 | 0.0318 | 0.1670 | 202.592
Censored | 2 | 10% | 0.0680 | 0.2440 | 63.948 | 0.0206 | 0.1252 | 160.132
Censored | 2 | 20% | 0.0740 | 0.2681 | 70.948 | 0.0241 | 0.1387 | 179.536
Censored | 2 | 30% | 0.0806 | 0.2728 | 78.396 | 0.0282 | 0.1533 | 195.024
Censored | 2 | 40% | 0.0872 | 0.2871 | 80.692 | 0.0329 | 0.1704 | 209.060
Ignored | 4 | 10% | 0.0682 | 0.2455 | 65.488 | 0.0201 | 0.1226 | 161.276
Ignored | 4 | 20% | 0.0744 | 0.2591 | 67.420 | 0.0230 | 0.1345 | 173.640
Ignored | 4 | 30% | 0.0805 | 0.2726 | 72.292 | 0.0268 | 0.1497 | 183.672
Ignored | 4 | 40% | 0.0873 | 0.2863 | 73.076 | 0.0312 | 0.1660 | 188.908
Ignored | 3 | 10% | 0.0689 | 0.2467 | 66.328 | 0.0214 | 0.1269 | 160.836
Ignored | 3 | 20% | 0.0745 | 0.2596 | 68.792 | 0.0242 | 0.1375 | 166.248
Ignored | 3 | 30% | 0.0807 | 0.2724 | 70.080 | 0.0277 | 0.1516 | 171.396
Ignored | 3 | 40% | 0.0874 | 0.2863 | 76.016 | 0.0321 | 0.1679 | 189.216
Ignored | 2 | 10% | 0.0693 | 0.2472 | 65.180 | 0.0220 | 0.1307 | 162.640
Ignored | 2 | 20% | 0.0747 | 0.2598 | 67.644 | 0.0250 | 0.1417 | 166.996
Ignored | 2 | 30% | 0.0808 | 0.2736 | 70.276 | 0.0287 | 0.1552 | 179.536
Ignored | 2 | 40% | 0.0873 | 0.2874 | 73.804 | 0.0330 | 0.1711 | 191.240
Complete | 4 | 10% | 0.0682 | 0.2454 | 59.038 | 0.0201 | 0.1225 | 145.311
Complete | 4 | 20% | 0.0745 | 0.2592 | 54.869 | 0.0229 | 0.1345 | 135.659
Complete | 4 | 30% | 0.0808 | 0.2728 | 51.362 | 0.0268 | 0.1498 | 126.077
Complete | 4 | 40% | 0.0875 | 0.2865 | 47.629 | 0.0313 | 0.1664 | 117.145
Complete | 3 | 10% | 0.0689 | 0.2465 | 59.151 | 0.0215 | 0.1271 | 148.662
Complete | 3 | 20% | 0.0745 | 0.2598 | 54.992 | 0.0242 | 0.1378 | 138.925
Complete | 3 | 30% | 0.0808 | 0.2726 | 51.586 | 0.0278 | 0.1519 | 128.993
Complete | 3 | 40% | 0.0875 | 0.2865 | 47.854 | 0.0321 | 0.1681 | 120.135
Complete | 2 | 10% | 0.0693 | 0.2471 | 59.156 | 0.0219 | 0.1303 | 144.235
Complete | 2 | 20% | 0.0748 | 0.2598 | 54.899 | 0.0250 | 0.1419 | 134.688
Complete | 2 | 30% | 0.0810 | 0.2737 | 51.476 | 0.0287 | 0.1554 | 125.360
Complete | 2 | 40% | 0.0875 | 0.2876 | 47.926 | 0.0331 | 0.1714 | 116.549

MSE and MAE are those of $\hat q_n^{c,W}(\alpha_n \mid x)$; k⋆ is the average number of threshold excesses.

Case | Order | 10% (n = 200) | 20% (n = 200) | 30% (n = 200) | 40% (n = 200) | 10% (n = 500) | 20% (n = 500) | 30% (n = 500) | 40% (n = 500)
---|---|---|---|---|---|---|---|---|---
Censored | 2 | 0.7757 | 0.6767 | 0.5916 | 0.5735 | 0.9633 | 0.9596 | 0.8387 | 0.5863
Censored | 3 | 0.6501 | 0.3384 | 0.2027 | 0.1042 | 0.8564 | 0.8547 | 0.7942 | 0.5212
Censored | 4 | 0.5288 | 0.2688 | 0.1266 | 0.08872 | 0.8459 | 0.7568 | 0.6665 | 0.4551

Considering the results in

The Kolmogorov-Smirnov test has been performed to check the asymptotic normality of our proposed estimator, according to the results in

We considered a functional kernel Weissman-type estimator of the conditional extreme quantile when some functional random covariate (i.e. valued in some infinite-dimensional space) information is available and the scalar response variable is right-censored. Its asymptotic properties were established and its finite-sample performance was illustrated in a simulation study. A comparison with two simple estimation strategies was also provided.

Future work will focus on the estimation of the conditional extreme value of the Weibull distribution under random right censoring when the covariate is a functional random variable, and on establishing its asymptotic behavior.

The authors acknowledge an anonymous Associate Editor and an anonymous reviewer for their helpful comments that led to an improved version of this paper.

The authors declare no conflicts of interest regarding the publication of this paper.

Rutikanga, J.U. and Diop, A. (2021) Functional Kernel Estimation of the Conditional Extreme Quantile under Random Right Censoring. Open Journal of Statistics, 11, 162-177. https://doi.org/10.4236/ojs.2021.111009