Asymptotic confidence bands for copulas based on the local linear kernel estimator

In this paper we establish asymptotic simultaneous confidence bands for copulas based on the local linear kernel estimator proposed by Chen and Huang [1]. To this end, we prove, under smoothness conditions on the copula function, a uniform-in-bandwidth law of the iterated logarithm for the maximal deviation of this estimator from its expectation. We also show that the bias term converges uniformly to zero at a precise rate. The performance of these bands is illustrated in a simulation study. An application to dependence modeling with pseudo-panel data is also provided.


Introduction
Let us consider a random vector (X, Y) with joint cumulative distribution function H and marginal distribution functions F and G. Sklar's theorem (see [2]) says that there exists a bivariate distribution function C on $[0, 1]^2$ with uniform margins such that H(x, y) = C(F(x), G(y)). The function C is called a copula associated with the random vector (X, Y). If the marginal distribution functions F and G of H are continuous, then the copula C is unique and is defined as
$$C(u, v) = H\big(F^{\leftarrow}(u), G^{\leftarrow}(v)\big),$$
where $F^{\leftarrow}(u) = \inf\{x : F(x) \ge u\}$ and $G^{\leftarrow}(v) = \inf\{y : G(y) \ge v\}$, $u, v \in [0, 1]$, are the generalized inverses of F and G respectively.
From these facts, estimating a bivariate distribution function can be achieved in two steps: (i) estimating the margins F and G; (ii) estimating the copula C.
In this paper, we deal with nonparametric copula estimation. We consider a copula function C with uniform margins U and V defined on [0, 1]. Then, we can write C(u, v) = P(U ≤ u, V ≤ v), u, v ∈ [0, 1]. The aim of this paper is to construct asymptotic optimal confidence bands for the copula C from the local linear kernel estimator proposed by Chen and Huang [1]. Our approach, based on the modern functional theory of empirical processes, allows the use of data-driven bandwidths for this estimator, and is largely inspired by the works of Mason [3] and Deheuvels and Mason [14].
There are two main approaches to estimating copula functions: parametric and nonparametric. Maximum likelihood estimation (MLE) and the method of moments are popular parametric approaches. One may also estimate the copula parameter by a parametric method such as MLE while estimating the margins nonparametrically; such an approach is called semi-parametric estimation (see [4]). A popular nonparametric method is kernel smoothing. Scaillet and Fermanian [5] presented the kernel smoothing method to estimate bivariate copulas for time series. Genest and Rivest [6] gave a nonparametric empirical distribution method to estimate bivariate Archimedean copulas.
A purely nonparametric estimation of copulas treats both the copula and the margins in a parameter-free way and thus offers the greatest generality. Nonparametric estimation of copulas goes back to Deheuvels [7], who proposed an estimator based on a multivariate empirical distribution function and on its empirical marginals. Weak convergence studies of this estimator can be found in Fermanian et al. [8]. Gijbels and Mielniczuk [9] proposed a kernel estimator for a bivariate copula density. Another approach to kernel estimation is to estimate the copula function directly, as explored in [5]. Chen and Huang [1] proposed a new bivariate kernel copula estimator by using local linear kernels and a simple mathematical correction that removes the boundary bias. They also derived the bias and the variance of their estimator, which reveal that kernel smoothing produces a second-order reduction in both the variance and the mean square error as compared with the unsmoothed empirical estimator of Deheuvels [7].
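As an illustration of the unsmoothed empirical estimator attributed to Deheuvels [7] above, here is a minimal Python sketch. It assumes the usual rank-based form, namely the proportion of sample points whose empirical marginal ranks F_n(X_i) and G_n(Y_i) fall below u and v respectively. The function name `empirical_copula` is hypothetical and not taken from the paper.

```python
import bisect


def empirical_copula(u, v, sample):
    """Rank-based empirical copula at (u, v).

    `sample` is a list of (x, y) pairs. F_n and G_n are the empirical
    marginal distribution functions, evaluated through sorted copies
    of each coordinate (assumed form, not quoted from the paper).
    """
    n = len(sample)
    xs = sorted(x for x, _ in sample)
    ys = sorted(y for _, y in sample)
    count = 0
    for x, y in sample:
        fx = bisect.bisect_right(xs, x) / n  # F_n(X_i)
        gy = bisect.bisect_right(ys, y) / n  # G_n(Y_i)
        if fx <= u and gy <= v:
            count += 1
    return count / n
```

For a perfectly increasing sample such as `[(1, 10), (2, 20), (3, 30), (4, 40)]`, the value at (0.5, 0.5) is 0.5, which agrees with the comonotone copula min(u, v).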
Omelka, Gijbels and Veraverbeke [10] proposed improved "shrinked" versions of the estimators of Gijbels and Mielniczuk [9] and Chen and Huang [1]. They achieve this shrinkage by including a weight function that removes the corner bias problem. They also established weak convergence for all the newly proposed estimators.
In parallel, powerful tools have been developed for kernel estimation of density and distribution functions. We refer to Mason [3], Dony [11], Dony and Einmahl [12], Einmahl and Mason [13], and Deheuvels and Mason [14]. In this paper we apply these recent methods to kernel-type estimators of copulas. The existence of kernel-type function estimators should lead to nonparametric estimation by confidence bands, as shown in [14], where general asymptotic simultaneous confidence bands are established for the density and the regression function curves. Furthermore, to our knowledge, no results of this type are yet available for the nonparametric estimation of copulas. This motivated us to extend these tools to kernel estimation of copulas by providing asymptotic simultaneous optimal confidence bands.
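Let $(X_1, Y_1), \ldots, (X_n, Y_n)$ be an independent and identically distributed sample of the bivariate random vector (X, Y), with continuous marginal cumulative distribution functions F and G. To construct their estimator, Chen and Huang proceed in two steps. In the first step, they estimate the margins by
$$\hat F_n(x) = \frac{1}{n} \sum_{i=1}^{n} K\Big(\frac{x - X_i}{b_{n1}}\Big), \qquad \hat G_n(y) = \frac{1}{n} \sum_{i=1}^{n} K\Big(\frac{y - Y_i}{b_{n2}}\Big),$$
where $b_{n1}$ and $b_{n2}$ are some bandwidths and K is the integral of a symmetric bounded kernel function k supported on [−1, 1]. In the second step, the pseudo-observations $\hat U_i = \hat F_n(X_i)$ and $\hat V_i = \hat G_n(Y_i)$ are used to estimate the joint distribution function of the unobserved $F(X_i)$ and $G(Y_i)$, which gives the estimate of the unknown copula C. To prevent boundary bias, Chen and Huang suggested using a local linear version of the kernel k given by
$$k_{u,h}(x) = \frac{k(x)\,\{a_2(u,h) - a_1(u,h)\,x\}}{a_0(u,h)\,a_2(u,h) - a_1^2(u,h)}\; \mathbf{1}\Big\{\frac{u-1}{h} < x < \frac{u}{h}\Big\},$$
where $a_\ell(u,h) = \int_{(u-1)/h}^{u/h} t^\ell k(t)\,dt$, $\ell = 0, 1, 2$, and 0 < h < 1 is a bandwidth. Finally, the local linear kernel estimator of the copula C is defined as
$$\hat C^{(LL)}_{n,h}(u, v) = \frac{1}{n} \sum_{i=1}^{n} K_{u,h}\Big(\frac{u - \hat F_n(X_i)}{h}\Big)\, K_{v,h}\Big(\frac{v - \hat G_n(Y_i)}{h}\Big), \tag{1.1}$$
where $K_{u,h}(x) = \int_{-\infty}^{x} k_{u,h}(t)\,dt$ and h is a variable bandwidth which may depend either on the sample data or on the location (u, v).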
Our main contribution is the construction of asymptotic confidence bands from a uniform-in-bandwidth law of the iterated logarithm (LIL) for the maximal deviation of the local linear estimator (1.1), together with the uniform convergence of the bias to zero at the same rate.
The paper is organized as follows. In Section 2, we state our main results in Theorems 2.1, 2.2 and 2.3. Simulation studies and applications to real data sets are also presented in this section to illustrate these results. In Section 3, we give the proofs of our assertions. The paper ends with appendices containing some technical results and numerical computations.

Main results and applications
2.1. Results. Here, we state our theoretical results. Theorem 2.1 gives a uniform-in-bandwidth LIL for the maximal deviation of the estimator (1.1). Theorem 2.2 handles the bias, while Theorem 2.3 provides asymptotic optimal simultaneous confidence bands for the copula function C(u, v).

Theorem 2.1. Suppose that the copula function C(u, v) has bounded first-order partial derivatives on $[0, 1]^2$. Then for any sequence of positive constants $(b_n)_{n \ge 1}$ satisfying $0 < b_n < 1$, $b_n \to 0$, $b_n \ge (\log n)^{-1}$, and for some c > 0, we have almost surely
$$\limsup_{n \to \infty}\; R_n \sup_{c \log n / n \,\le\, h \,\le\, b_n}\; \sup_{(u,v) \in (0,1)^2} \big| \hat C^{(LL)}_{n,h}(u, v) - E\hat C^{(LL)}_{n,h}(u, v) \big| = A(c),$$
where A(c) is a positive constant such that 0 < A(c) ≤ 3 and $R_n = \big( n / (2 \log\log n) \big)^{1/2}$.

Remark 1. Theorem 2.1 represents a uniform-in-bandwidth law of the iterated logarithm for the maximal deviation of the estimator (1.1). As in [14], we may use it, in its probability version, to construct simultaneous asymptotic confidence bands from the estimator (1.1). For this purpose, we must first ensure that the bias term $B_{n,h}(u, v) = E\hat C^{(LL)}_{n,h}(u, v) - C(u, v)$ converges uniformly to 0, with the same rate $R_n$, as n → ∞. But this requires that the copula function C(u, v) admits bounded second-order partial derivatives on the unit square $[0, 1]^2$.
Theorem 2.2. Suppose that the copula function C(u, v) admits bounded second-order partial derivatives on $[0, 1]^2$. Then for any sequence of positive constants , and for some c > 0, we have almost surely,

Useful comment. Because a number of copula families do not possess bounded second-order partial derivatives, the application of these results is limited by a corner bias problem. To overcome this difficulty and apply these results to a wide family of copulas, we adopt the shrinkage method of Omelka et al. [10], by taking a local data-driven bandwidth $\hat h_n(u, v)$ satisfying the following condition:

where $h_n$ is a sequence of positive constants converging to 0, and b(u, v) is a real-valued function defined by

For such a bandwidth $\hat h_n(u, v)$, the local linear kernel estimator can be rewritten as

By condition $(H_1)$, (2.4) is equivalent for n large enough to

This latter estimator (2.5) is exactly the improved "shrinked" version proposed by Omelka et al. [10]. It enables us to keep the bias bounded on the borders of the unit square and thus to remove the problem of possible unboundedness of the second-order partial derivatives of the copula function C.

To set up asymptotic optimal simultaneous confidence bands for the copula C, we need the following additional condition:

If conditions $(H_1)$ and $(H_2)$ hold, then we can infer from Theorem 2.1 that

This is still equivalent to

To make use of (2.6) for forming confidence bands, we must ensure that the bandwidth $\hat h_n(u, v)$ is chosen in such a way that the bias of the estimator (2.4) may be neglected, in the sense that

This would be the case if condition $(H_2)$ holds and $\sqrt{n}\, b_n^2 / \sqrt{\log\log n} = o(1)$.
Theorem 2.3. Suppose that the assumptions of Theorem 2.1 and Theorem 2.2 hold. Then for any local data-driven bandwidth $\hat h_n(u, v)$ satisfying $(H_1)$ and $(H_2)$, and any ε > 0, one has, as n → ∞,

Remark 2. Whenever (2.8) and (2.9) hold jointly for each ε > 0, we will say that the intervals

provide asymptotic simultaneous optimal confidence bands (at an asymptotic confidence level of 100%) for the copula function C(u, v), 0 ≤ u, v ≤ 1. So, with a probability close to 100%, we can write for all $(u, v) \in [0, 1]^2$,
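To give a feel for the width of such bands, the following Python sketch computes the rate $R_n = (n / (2 \log\log n))^{1/2}$ of Theorem 2.1 and a symmetric interval around a point estimate. The half-width A(c)/R_n is an assumption read off from Theorem 2.1, not the paper's formula (2.10), and the names `lil_rate` and `confidence_band` are hypothetical.

```python
import math


def lil_rate(n):
    """R_n = sqrt(n / (2 log log n)); requires n >= 3 so that log log n > 0."""
    if n < 3:
        raise ValueError("n must be at least 3")
    return math.sqrt(n / (2.0 * math.log(math.log(n))))


def confidence_band(c_hat, n, a_c=3.0):
    """Interval c_hat -/+ A(c) / R_n (assumed form of the band).

    c_hat is the value of the copula estimator at a fixed point (u, v)
    and a_c plays the role of the constant A(c), bounded by 3 in Theorem 2.1.
    """
    half_width = a_c / lil_rate(n)
    return c_hat - half_width, c_hat + half_width
```

With the conservative choice A(c) = 3 the half-width decreases only like $\sqrt{\log\log n / n}$, so the interval is wide for moderate sample sizes.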

2.2. Simulation results and data-driven applications.

2.2.1. Simulation results.
We carry out some simulation studies to evaluate the performance of our asymptotic confidence bands. To this end, we compute the confidence bands given in (2.10) for some classical parametric copulas, and check whether the true copula lies within these bands. For simplicity, we consider two families of copulas, Frank and Clayton, defined respectively as follows:
$$C_\theta(u, v) = -\frac{1}{\theta} \log\Big( 1 + \frac{(e^{-\theta u} - 1)(e^{-\theta v} - 1)}{e^{-\theta} - 1} \Big), \quad \theta \neq 0,$$
$$C_\theta(u, v) = \big( u^{-\theta} + v^{-\theta} - 1 \big)^{-1/\theta}, \quad \theta > 0.$$
We fix values for the parameter θ, and generate n pairs of data $(u_i, v_i)$, i = 1, · · · , n, from each of the two copulas by using the conditional sampling method. The steps for drawing from a bivariate copula C are:
• step 1 : Generate two values u and v from U(0, 1),
• step 2 : Set $u_i = u$,
• step 3 : Compute $v_i = C_{2|1}^{-1}(v \mid u_i)$, where $C_{2|1}(v \mid u) = \partial C(u, v) / \partial u$.
Then $u_i$ and $v_i$ are random observations drawn from the copula C.
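The conditional sampling method can be sketched in Python as follows. The closed-form inverses of the conditional distribution $\partial C(u, v)/\partial u$ used here are the standard ones for the Clayton and Frank families; the function names and the seeding convention are hypothetical choices made for this sketch.

```python
import math
import random


def clayton_cond_inverse(w, u, theta):
    """Solve dC/du(u, v) = w for v in the Clayton family (theta > 0)."""
    base = (w ** (-theta / (1.0 + theta)) - 1.0) * u ** (-theta) + 1.0
    return base ** (-1.0 / theta)


def frank_cond_inverse(w, u, theta):
    """Solve dC/du(u, v) = w for v in the Frank family (theta != 0)."""
    a = math.expm1(-theta)  # e^{-theta} - 1
    ratio = w * a / (w + (1.0 - w) * math.exp(-theta * u))
    return -math.log1p(ratio) / theta


def sample_copula(n, theta, cond_inverse, seed=0):
    """Draw n pairs (u_i, v_i) by conditional sampling."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        u = 1.0 - rng.random()  # uniform on (0, 1], avoids u = 0
        w = 1.0 - rng.random()  # second independent uniform
        pairs.append((u, cond_inverse(w, u, theta)))
    return pairs
```

For example, `sample_copula(500, 2.0, clayton_cond_inverse)` returns 500 pairs from a Clayton copula with θ = 2.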
To compute the estimator, we take $h_n = 1/\log n$ and b(u, v) given by formula (2.3), with α = 0.5, so that conditions $(H_1)$ and $(H_2)$ are fulfilled. That is the case for $b_n = \{(\log\log n)/n\}^{1/4}$. The function $K_{w,h}$ is obtained by integrating the local linear kernel function $k_{w,h_n}$ defined in the introduction, where k is the Epanechnikov kernel density defined as $k(x) = \frac{3}{4}(1 - x^2)\,\mathbf{1}\{|x| \le 1\}$. Finally, for any $(u, v) \in [0, 1]^2$, we compute the confidence interval (2.10) by taking A(c) = 3.
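The computation of the kernel estimator can be sketched in Python as below. The sketch assumes the Chen and Huang form of the boundary-corrected kernel, $k_{u,h}(x) = k(x)\{a_2 - a_1 x\}/(a_0 a_2 - a_1^2)$ on $((u-1)/h, u/h)$ with $a_\ell = \int t^\ell k(t)\,dt$ over that interval, and the Epanechnikov kernel $k(x) = \frac{3}{4}(1 - x^2)$ on [−1, 1]. It uses a fixed bandwidth h rather than the data-driven bandwidth of the paper, and all function names are hypothetical.

```python
import math
import random


def _epan_moment(ell, lo, hi):
    """Integral of t**ell * 0.75 * (1 - t**2) over [lo, hi] cut to [-1, 1]."""
    lo, hi = max(lo, -1.0), min(hi, 1.0)
    if hi <= lo:
        return 0.0

    def prim(t):
        # Antiderivative of 0.75 * (t**ell - t**(ell + 2)).
        return 0.75 * (t ** (ell + 1) / (ell + 1) - t ** (ell + 3) / (ell + 3))

    return prim(hi) - prim(lo)


def local_linear_cdf(x, u, h):
    """K_{u,h}(x): integral of the boundary-corrected kernel up to x."""
    lo, hi = (u - 1.0) / h, u / h
    a0, a1, a2 = (_epan_moment(ell, lo, hi) for ell in range(3))
    x = min(x, hi)  # the kernel vanishes beyond u / h
    num = a2 * _epan_moment(0, lo, x) - a1 * _epan_moment(1, lo, x)
    return num / (a0 * a2 - a1 ** 2)


def local_linear_copula(u, v, pseudo_obs, h):
    """Average of K_{u,h}((u - U_i)/h) * K_{v,h}((v - V_i)/h) over the sample."""
    total = sum(
        local_linear_cdf((u - ui) / h, u, h) * local_linear_cdf((v - vi) / h, v, h)
        for ui, vi in pseudo_obs
    )
    return total / len(pseudo_obs)
```

Away from the boundary (h ≤ u ≤ 1 − h) the first moment $a_1$ vanishes by symmetry, so `local_linear_cdf` reduces to the ordinary Epanechnikov distribution function; the correction only acts near 0 and 1.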
In Figure 1, we represent the confidence bands and the Frank copula, while Figure 2 represents the confidence bands and the Clayton copula. One can see that the true curves of the two parametric copulas are well contained in the bands. As we cannot visualize all the information in the above figures, we provide in Appendix B some numerical computations to better assess the performance of our bands. To this end, we generate 10 pairs (u, v) of random numbers uniformly distributed in (0, 1) and compute, for each of them and for each of the considered copulas, the lower bound $\alpha_n(u, v)$, the upper bound $\beta_n(u, v)$ and the true value of the copula C(u, v) for different values of θ. These computations are given in Appendix B (see Tables 3 and 4). We can see there that all the values of C(u, v) lie in their respective confidence intervals.

2.2.2. Data-driven applications. In this subsection, we apply our theoretical results to select graphically, among various copula families, the one that best fits sample data. To this end, we represent in the same 2-dimensional graph the confidence bands established in Theorem 2.3 and the curves corresponding to the different copulas considered. To illustrate this, we use expenditure data of Senegalese households, available in databases managed by the National Agency in Senegal (2005-2006). For simplicity, we deal with the pseudo-panel data used in [21], which consist of two series of observations of size n = 116. Instead of smoothing these observations, denoted by $(X_i, Y_i)$, i = 1, · · · , n, we deal with the pseudo-observations $(F_n(X_i), G_n(Y_i))$, i = 1, · · · , n, to define the kernel estimator for the true copula. Here, $F_n$ and $G_n$ are the empirical cumulative distribution functions associated respectively with the samples $X_1, \cdots, X_n$ and $Y_1, \cdots, Y_n$.

This application is limited to Archimedean copulas. We consider three parametric families of copulas: Frank, Gumbel and Clayton. Our aim is to find graphically, using our confidence bands, the family that best fits these pseudo-panel data. The unknown parameter θ, for each family, is estimated by inversion of Kendall's tau. For this, we first calculate the empirical Kendall's tau (we find $\hat\tau = 0.408$), and then we deduce from it the values of the parameter θ for each family. $D_1$ is the Debye function of order 1, defined as $D_1(\theta) = \frac{1}{\theta} \int_0^\theta \frac{t}{e^t - 1}\,dt$.

Table 1. Expression of Kendall's tau and estimated values for θ

Figure 3 shows that the Clayton family seems to be the most adequate to fit our pseudo-panel data. That is, the dependence in these Senegalese household expenditure data is fitted more satisfactorily by the Clayton family than by the other two copulas. We now apply the maximum likelihood method for fitting copulas to confirm the above graphical results.
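For this, it suffices to compute (see Table 2), for each of the three copulas, the log-likelihood function defined as
$$L(\theta; \mathbf{u}, \mathbf{v}) = \sum_{i=1}^{n} \log C_{12}(u_i, v_i),$$
where $C_{12}(u, v) = \frac{\partial}{\partial u} \frac{\partial}{\partial v} C(u, v)$, $\mathbf{u} = (u_1, \cdots, u_n)$ and $\mathbf{v} = (v_1, \cdots, v_n)$.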
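The two fitting steps described above (inversion of Kendall's tau, then comparison of log-likelihoods) can be sketched in Python as follows. The sketch relies on the standard relations τ = θ/(θ + 2) for Clayton, τ = 1 − 1/θ for Gumbel and τ = 1 − (4/θ)(1 − D_1(θ)) for Frank, and on the standard Clayton and Frank copula densities. The function names, the midpoint-rule quadrature and the bisection bracket (which limits the Frank inversion to τ below roughly 0.96) are assumptions of this sketch, not the paper's procedure.

```python
import math


def debye1(theta, steps=2000):
    """D_1(theta) = (1/theta) * int_0^theta t / (e^t - 1) dt (midpoint rule)."""
    width = theta / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * width
        total += t / math.expm1(t)
    return total * width / theta


def frank_tau(theta):
    """Kendall's tau of the Frank copula, for theta > 0."""
    return 1.0 - 4.0 / theta * (1.0 - debye1(theta))


def theta_from_tau(tau):
    """Invert Kendall's tau for the three families (0 < tau < 0.96 assumed)."""
    lo, hi = 1e-6, 100.0
    for _ in range(80):  # bisection: frank_tau is increasing in theta
        mid = 0.5 * (lo + hi)
        if frank_tau(mid) < tau:
            lo = mid
        else:
            hi = mid
    return {
        "clayton": 2.0 * tau / (1.0 - tau),
        "gumbel": 1.0 / (1.0 - tau),
        "frank": 0.5 * (lo + hi),
    }


def clayton_density(u, v, theta):
    """Mixed partial derivative C_12 of the Clayton copula."""
    s = u ** (-theta) + v ** (-theta) - 1.0
    return (1.0 + theta) * (u * v) ** (-theta - 1.0) * s ** (-2.0 - 1.0 / theta)


def frank_density(u, v, theta):
    """Mixed partial derivative C_12 of the Frank copula."""
    a = -math.expm1(-theta)  # 1 - e^{-theta}
    den = a - math.expm1(-theta * u) * math.expm1(-theta * v)
    return theta * a * math.exp(-theta * (u + v)) / den ** 2


def log_likelihood(density, theta, pseudo_obs):
    """Sum of log C_12(u_i, v_i) over pseudo-observations in (0, 1)^2."""
    return sum(math.log(density(u, v, theta)) for u, v in pseudo_obs)
```

Calling `theta_from_tau(0.408)` gives the three parameter values implied by the empirical Kendall's tau reported above; the family with the largest value of `log_likelihood` on the pseudo-observations is then preferred.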
3. Concluding remarks. This paper presented a nonparametric method to estimate the copula function by providing asymptotic confidence bands based on the local linear kernel estimator. The results are applied to select graphically the copula function that best fits the dependence structure between pseudo-panel data.
As a perspective, similar results can be obtained with other kernel-type estimators of the copula function, such as the mirror-reflection and transformation estimators.

Proofs
In this section, we first present the technical details allowing us to use the methodology of Mason [3], described in Proposition 1 and Corollary 1, that are necessary to prove our results. In a second step, we give successively the proofs of the theorems stated in Section 2.
We begin by decomposing the difference
$$\hat C^{(LL)}_{n,h}(u, v) - C(u, v) = \big[ \hat C^{(LL)}_{n,h}(u, v) - E\hat C^{(LL)}_{n,h}(u, v) \big] + \big[ E\hat C^{(LL)}_{n,h}(u, v) - C(u, v) \big].$$
The probabilistic term $\widehat D_{n,h}(u, v) = \hat C^{(LL)}_{n,h}(u, v) - E\hat C^{(LL)}_{n,h}(u, v)$ is called the deviation of the estimator from its expectation. We will study its behavior by making use of the methodology described in [3]. The other term, which we denote $B_{n,h}(u, v) = E\hat C^{(LL)}_{n,h}(u, v) - C(u, v)$, is the so-called bias of the estimator. It is deterministic and its behavior depends upon the smoothness conditions on the copula C and the bandwidth h.

Recall the estimator proposed by Deheuvels in [7], which is defined as
$$\hat C_n(u, v) = \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}\{ F_n(X_i) \le u,\; G_n(Y_i) \le v \},$$
where $F_n$ and $G_n$ are the empirical cumulative distribution functions of the marginals F and G. This estimator is asymptotically equivalent (up to a term $O(n^{-1})$) to the estimator based directly on Sklar's theorem, given by $C_n(u, v) = H_n(F_n^{-1}(u), G_n^{-1}(v))$, with $H_n$ the empirical joint distribution function of (X, Y). Then the empirical copula process is defined as

To study the behavior of the deviation $\widehat D_{n,h}(u, v)$, we introduce the following notation. Let $\widetilde C_n$ be the uniform bivariate empirical distribution function based on a sample $(U_1, V_1), \cdots, (U_n, V_n)$ of i.i.d. random variables uniformly distributed on $[0, 1]^2$. Define the following empirical process

Then one can observe that

where g belongs to a class of measurable functions $\mathcal{G}$ defined as

Since $\widetilde C_n(u, v)$ is an unbiased estimator for C(u, v), one can observe that

To make use of Mason's theorem in [3], the class of functions $\mathcal{G}$ must satisfy the following four conditions:

$\mathcal{G}$ is a pointwise measurable class, i.e., there exists a countable subclass $\mathcal{G}_0$ of $\mathcal{G}$ such that for all $g \in \mathcal{G}$, there exists $(g_m)_m \subset \mathcal{G}_0$ such that $g_m \to g$.
The checking of these conditions constitutes the proof of the following proposition which will be done in Appendix A.
where A(c) is a positive constant.

Proof. (Corollary 1)
First, observe that the condition $b_n \ge (\log n)^{-1}$ yields

Next, by the monotonicity of the function $x \mapsto x|\log x|$ on [0, 1/e], one can write for n large enough, $h|\log h| \le b_n|\log b_n|$ and hence,

Combining this and Proposition 1, we obtain
$$\sup_{c \log n / n \,\le\, h \,\le\, b_n}\; \sup_{(u,v) \in (0,1)^2}$$
Thus Corollary 1 follows from (3.2).
Proof. (Theorem 2.1) By the work of Wichura on the law of the iterated logarithm (see [19]), one has

Since $C_n(u, v)$ and $\widetilde C_n(u, v)$ are asymptotically equivalent in view of (3.1), one obtains

The proof is then finished by applying Corollary 1, which yields

Thus, there exists a constant A(c), with 0 < A(c) ≤ 3, such that
$$\limsup_{n \to \infty} \Big( \frac{n}{2 \log\log n} \Big)^{1/2} \sup_{c \log n / n \,\le\, h \,\le\, b_n}\; \sup_{(u,v) \in (0,1)^2} \big| \hat C^{(LL)}_{n,h}(u, v) - E\hat C^{(LL)}_{n,h}(u, v) \big| = A(c). \tag{3.7}$$

Proof. (Theorem 2.2) By continuity of F and G, we have for n large enough,

By applying a second-order Taylor expansion and taking into account the symmetry of the kernels $k_{u,h}(\cdot)$ and $k_{v,h}(\cdot)$, i.e.,

Since the second-order partial derivatives are assumed to be bounded, we can infer that

and hence,

(3.8) $\Big( \frac{n}{2 \log\log n} \Big)^{1/2}$

Proof. (Theorem 2.3) It suffices to show (2.8) and (2.9). From (2.7), we can infer that for any given ε > 0 and δ > 0, there exists $N \in \mathbb{N}$ such that for all n > N,

On the other hand, we deduce from (2.6) that for all $(u, v) \in [0, 1]^2$,

then (3.9) becomes

Thus, for any given τ ∈ (0, 1) and all large n, we can write

then, analogously to Case 1, we can infer from (3.9) that, for any given τ ∈ (0, 1) and all large n,

Letting τ tend to 0, it follows from (3.10) and (3.11) that

(3.12)

we can write, for any ε > 0, with probability tending to 1,

That is, (2.8) and (2.9) hold.
This implies sup

Checking for (G.ii). We have to show that

where C is a constant. Recall that $\zeta_{1,n}^{-1}(u) = F \circ \hat F_n^{-1}(u)$ and $\zeta_{2,n}^{-1}(v - th) = G \circ \hat G_n^{-1}(v)$. Then, we can write

Now we express A and B as integrals of the copula function C(u, v).
Since $K_{u,h}(\cdot)$ takes its values in [0, 1] as a distribution function, we observe that $K_{u,h}^2(x) \le K_{u,h}(x)$. Then we can write

We can also notice that

Thus

For n large enough, we have by continuity of F and G,

By splitting the integrals, we obtain after simple calculus that

All these six terms can be bounded by applying a Taylor expansion. Precisely, we have

From this, we can conclude that

Checking for (F.i). We have to check that $\mathcal{G}$ satisfies the uniform entropy condition. Consider the following classes of functions:

It is clear, by applying Lemmas 2.6.15 and 2.6.18 in van der Vaart and Wellner (see [20], p. 146-147), that the sets $\mathcal{F}$, $\mathcal{K}_0$, $\mathcal{K}$, $\mathcal{H}$ are all VC-subgraph classes. Thus, by taking the function $(x, y) \mapsto G(x, y) = \|k\|^2 + 1$ as a measurable envelope function for $\mathcal{H}$ (indeed $G(x, y) \ge \sup_{g \in \mathcal{H}} |g(x, y)|$ for all (x, y)), we can infer from Theorem 2.6.7 in [20] that $\mathcal{H}$ satisfies the uniform entropy condition. Since $\mathcal{H}$ and $\mathcal{G}$ have the same structure, we can conclude that $\mathcal{G}$ satisfies this property too.

Checking for (F.ii).
Define the class of functions

It is clear that $\mathcal{G}_0$ is countable and $\mathcal{G}_0 \subset \mathcal{G}$. Let

and, for m ≥ 1,

Let $\alpha_m = u_m - u$, $\beta_m = v_m - v$ and define

Then, one can easily see that $0 < \alpha_m \le 1/m^2$ and $0 < \beta_m \le$

Table 3. Confidence bands for Clayton copula calculated for some random couples of values (u, v).

Table 4. Confidence bands for Frank copula calculated for some random couples of values (u, v).