From Nonparametric Density Estimation to Parametric Estimation of Multidimensional Diffusion Processes

This paper deals with the estimation of the parameters of discretely observed multidimensional diffusion processes. We construct an estimator of the parameters based on the minimum Hellinger distance method, which minimizes the Hellinger distance between the density of the invariant distribution of the diffusion process and a nonparametric estimator of this density. We give conditions that ensure the existence of an invariant measure admitting a density with respect to the Lebesgue measure, as well as the strong mixing property with exponential rate for the Markov process. Under these conditions, we define a kernel estimator of the density and study its properties (almost sure convergence and asymptotic normality). Then, using this density estimator, we construct the minimum Hellinger distance estimator of the parameters of the diffusion process and establish its almost sure convergence and asymptotic normality. To illustrate the properties of the estimator, we apply the method to two examples of multidimensional diffusion processes.

Many papers are devoted to the estimation of the parameters of the drift and diffusion coefficients of diffusion processes from discrete observations. Since a diffusion process is Markovian, maximum likelihood estimation is the natural choice for obtaining a consistent and asymptotically normal estimator when the transition probability density is known [1]. However, in the discrete case, for most diffusion processes the transition probability density is difficult to calculate explicitly, which prevents the use of this method. To solve this problem, several methods have been developed, such as approximation of the likelihood function [2] [3], approximation of the transition density [4], approximation schemes for the diffusion [5], and methods based on martingale estimating functions [6].
In this paper, we study the multidimensional diffusion model under the condition that $X_t$ is positive recurrent and exponentially strong mixing. We assume that the diffusion process is observed at regularly spaced times $t_k = k\Delta$, where $\Delta$ is a positive constant. Using the density of the invariant distribution of the diffusion, we construct an estimator of $\theta$ based on the minimum Hellinger distance method.
Let $f_\theta$ denote the density of the invariant distribution of the diffusion. The estimator of $\theta$ is the value (or values) $\hat\theta_n$ in the parameter space $\Theta$ that minimizes the Hellinger distance between $f_\theta$ and $\hat f_n$, where $\hat f_n$ is a nonparametric density estimator of $f_\theta$.
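For reference, the Hellinger distance between two densities and the resulting estimator are commonly written as below. The symbol $H_2$ and the normalization (no factor 1/2 in front of the integral) are assumptions made here for illustration, since the formula is not displayed in this section.

```latex
H_2(f, g) = \left( \int_{\mathbb{R}^d} \left| f^{1/2}(x) - g^{1/2}(x) \right|^2 \, dx \right)^{1/2},
\qquad
\hat\theta_n \in \arg\min_{\theta \in \Theta} H_2\big(\hat f_n, f_\theta\big).
```

With this convention, $H_2(f, g) = 0$ if and only if $f = g$ almost everywhere, and $H_2^2(f, g) = 2 - 2 \int f^{1/2} g^{1/2}$, so minimizing the distance amounts to maximizing the affinity $\int \hat f_n^{1/2} f_\theta^{1/2}$.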
The appeal of this method of parametric estimation is that minimum Hellinger distance estimation gives efficient and robust estimators [7]. Minimum Hellinger distance estimators have been used for parameter estimation with independent observations [7], for nonlinear time series models [8], and recently for univariate diffusion processes [9].
The paper is organized as follows. In Section 2, we present the statistical model and some conditions which imply that $X_t$ is positive recurrent and exponentially strong mixing. Consistency and asymptotic normality of the kernel estimator of the density of the invariant distribution are studied in the same section. Section 3 defines the minimum Hellinger distance estimator of $\theta$ and studies its properties (consistency and asymptotic normality). Section 4 is devoted to some examples and simulations. Proofs of some results are presented in the Appendix.

Nonparametric Density Estimation
We consider the d-dimensional diffusion process solution of the multivariate stochastic differential equation $dX_t = a(X_t, \theta)\,dt + b(X_t, \theta)\,dW_t$. We assume that the functions $a$ and $b$ are known up to the parameter $\theta$ and that $b$ is bounded. We denote by $\theta_0$ the unknown true value of the parameter. For a matrix $A$, the notation ${}^t A$ denotes the transpose of $A$. We use the notation $\|\cdot\|$ to denote a vector norm or a matrix norm.
The process $X_t$ is observed at discrete times $t_k = k\Delta$, where $\Delta$ is a positive constant. We make the following assumptions on the model:
In the sequel, we assume that the initial value $X_0$ follows the invariant law, which implies that the process $\{X_t\}$ is strictly stationary.
We consider the kernel estimator $\hat f_n$ of the invariant density, where $(b_n)$ is a sequence of bandwidths such that $b_n \to 0$ and $n b_n^d \to +\infty$ as $n \to +\infty$, and $K : \mathbb{R}^d \to \mathbb{R}$ is a nonnegative kernel function which satisfies the following assumptions:
The proof can be found in the Appendix.
... satisfies the conditions of Lemma 2.
This completes the proof of the theorem.
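As an illustration of the kernel density estimator studied in this section, the following Python sketch evaluates an estimate of the usual form $\hat f_n(x) = (n b_n^d)^{-1} \sum_k K((x - X_{t_k}) / b_n)$. The exact formula, the product Gaussian kernel, the function names, and the bandwidth rate $n^{-1/(d+4)}$ are assumptions made for illustration. They are not taken from the paper, whose simulations use R.

```python
import math


def gaussian_product_kernel(u):
    """Product of standard normal densities evaluated at the vector u."""
    d = len(u)
    squared_norm = sum(ui * ui for ui in u)
    return math.exp(-0.5 * squared_norm) / (2.0 * math.pi) ** (d / 2.0)


def kernel_density(x, sample, bandwidth):
    """Kernel estimate (1 / (n b^d)) * sum_k K((x - X_k) / b) at the point x."""
    n = len(sample)
    d = len(x)
    total = 0.0
    for obs in sample:
        u = [(xi - oi) / bandwidth for xi, oi in zip(x, obs)]
        total += gaussian_product_kernel(u)
    return total / (n * bandwidth ** d)


# Hypothetical two-dimensional observations, for illustration only.
observations = [(0.1, -0.3), (0.4, 0.2), (-0.5, 0.0), (0.0, 0.6)]
# Assumed bandwidth rate n^(-1/(d+4)) with d = 2; it satisfies b_n -> 0
# and n * b_n^d -> infinity.
b_n = len(observations) ** (-1.0 / 6.0)
estimate_at_origin = kernel_density((0.0, 0.0), observations, b_n)
```

The estimate is nonnegative and integrates to one whenever the kernel is a probability density, which is what makes it usable inside a Hellinger distance.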

Example 1
We consider the two-dimensional Ornstein-Uhlenbeck process solution of the stochastic differential equation . Therefore, $Z_t$ is exponentially strong mixing and the invariant distribution $\mu_\theta$ admits a density $f_\theta$ with respect to the Lebesgue measure.
Furthermore [15], the solution of Equation (3) is Therefore [16], the density of the invariant distribution is , we can write Equation (2) as follows: which gives the following system $(Y_t)_{t \geq 0}$ are two independent univariate Ornstein-Uhlenbeck processes with parameters $\beta$ and $\sigma$, respectively.
We now give simulations for different parameter values using the R language. For each process, we generate sample paths using the package "sde" [17], and to compute a value of the estimator we use the function "nlm" [18]. The kernel function $K_1$ is the density of the standard normal distribution. We use the bandwidth ( ) Simulation results are given in Table 1. In Table 1, $\theta_0$ denotes the true value of the parameter and $\hat\theta$ denotes an estimate of $\theta_0$ given by the minimum Hellinger distance estimator. The simulation results illustrate the good properties of the estimator: the means of the estimator are quite close to the true values of the parameter in all cases, and the standard errors are low.
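The simulation procedure described above can be sketched for one univariate Ornstein-Uhlenbeck component as follows. This is a minimal Python sketch under stated assumptions, not the authors' R code. It assumes the parametrization $dY_t = -\beta Y_t\,dt + \sigma\,dW_t$ with invariant law $N(0, \sigma^2 / (2\beta))$ and treats $\sigma$ as known. It also assumes a bandwidth $n^{-1/5}$ and replaces "nlm" by a plain search over a grid of candidate values. All function names and numerical values are hypothetical.

```python
import math
import random


def simulate_ou(beta, sigma, delta, n, rng):
    """Exact discretisation of dY = -beta*Y dt + sigma dW, started from the invariant law."""
    stat_sd = sigma / math.sqrt(2.0 * beta)
    rho = math.exp(-beta * delta)
    innov_sd = stat_sd * math.sqrt(1.0 - rho * rho)
    y = rng.gauss(0.0, stat_sd)
    path = [y]
    for _ in range(n - 1):
        y = rho * y + innov_sd * rng.gauss(0.0, 1.0)
        path.append(y)
    return path


def invariant_density(x, beta, sigma):
    """Density of N(0, sigma^2 / (2 beta)), the invariant law assumed here."""
    var = sigma * sigma / (2.0 * beta)
    return math.exp(-0.5 * x * x / var) / math.sqrt(2.0 * math.pi * var)


def kernel_density(x, sample, bandwidth):
    """Gaussian kernel estimate of the density at the point x."""
    s = sum(math.exp(-0.5 * ((x - obs) / bandwidth) ** 2) for obs in sample)
    return s / (len(sample) * bandwidth * math.sqrt(2.0 * math.pi))


def squared_hellinger(f_vals, g_vals, step):
    """Riemann-sum approximation of the integral of (sqrt(f) - sqrt(g))^2."""
    return step * sum((math.sqrt(f) - math.sqrt(g)) ** 2
                      for f, g in zip(f_vals, g_vals))


def mhd_estimate(f_hat_vals, grid, step, candidates, sigma):
    """Candidate beta minimising the squared Hellinger distance to f_hat."""
    def objective(beta):
        model = [invariant_density(x, beta, sigma) for x in grid]
        return squared_hellinger(f_hat_vals, model, step)
    return min(candidates, key=objective)


if __name__ == "__main__":
    rng = random.Random(1)
    sigma, beta0 = 1.0, 1.0                      # hypothetical true values
    sample = simulate_ou(beta0, sigma, delta=0.5, n=500, rng=rng)
    step = 0.05
    grid = [-4.0 + step * i for i in range(161)]
    bandwidth = len(sample) ** (-1.0 / 5.0)      # assumed bandwidth rate
    f_hat = [kernel_density(x, sample, bandwidth) for x in grid]
    candidates = [0.2 + 0.05 * i for i in range(57)]
    print(mhd_estimate(f_hat, grid, step, candidates, sigma))
```

Repeating the last block over many independent sample paths and averaging the estimates gives means and standard errors of the kind reported in a simulation table. A gradient-based optimizer could replace the grid search once a good starting value is available.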

Example 2
We consider the homogeneous Gaussian diffusion process [19], solution of the stochastic differential equation where $\sigma > 0$ is known, $W$ is a two-dimensional Brownian motion, $B$ is a $2 \times 2$ matrix whose eigenvalues have strictly negative real parts, and $A$ is a $2 \times 1$ matrix. By this condition on the matrix $B$, $X$ has an invariant probability $\mu$.
As in [19], we suppose that $\sigma = 2$. In the following, we suppose that 2 Γ is invertible and we have 11 12 1 12 22 . Hence, the invariant density of $\mu$ is
For simulation, we must write the stochastic differential Equation (4) in matrix form as follows:
As in [19], the true values of the parameter ( )
Now, we can simulate a sample path of the homogeneous Gaussian diffusion using the "yuima" package of the R language [20]. We use the function "nlm" to compute a value of the estimator.
We generate 500 sample paths of the process, each of size 500. The kernel function and the bandwidth are those of the previous example.
We compare the estimator obtained by the minimum Hellinger distance (MHD) method of this paper with the estimator obtained in [19] by estimating functions. Table 2 reports the simulated means and standard errors of the two estimators. Both estimators behave well: for the two methods, the means of the estimators are close to the true values of the parameter. However, the standard errors of the MHD estimator are lower than those of the estimating function estimator.
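The simulation step of this example can be sketched with an Euler-Maruyama scheme. The Python code below is a sketch under stated assumptions. It assumes the linear form $dX_t = (A + B X_t)\,dt + \sigma\,dW_t$ for the stochastic differential equation, which matches the description of $A$, $B$, $\sigma$ and $W$ above but is not displayed in this section. It also replaces the "yuima" simulation by a hand-written Euler scheme. The function names and the numerical values of $A$ and $B$ are hypothetical.

```python
import math
import random


def euler_gaussian_diffusion(A, B, sigma, x0, delta, n, rng):
    """Euler-Maruyama path of dX = (A + B X) dt + sigma dW in dimension 2."""
    x = list(x0)
    path = [tuple(x)]
    sqrt_dt = math.sqrt(delta)
    for _ in range(n):
        drift = [A[i] + B[i][0] * x[0] + B[i][1] * x[1] for i in range(2)]
        x = [x[i] + drift[i] * delta + sigma * sqrt_dt * rng.gauss(0.0, 1.0)
             for i in range(2)]
        path.append(tuple(x))
    return path


def invariant_mean(A, B):
    """Mean -B^{-1} A of the invariant law (B is assumed invertible)."""
    det = B[0][0] * B[1][1] - B[0][1] * B[1][0]
    inv = [[B[1][1] / det, -B[0][1] / det],
           [-B[1][0] / det, B[0][0] / det]]
    return [-(inv[i][0] * A[0] + inv[i][1] * A[1]) for i in range(2)]


# Hypothetical coefficients: the eigenvalues of B are -1 and -2.
A = [1.0, 2.0]
B = [[-1.0, 0.5], [0.0, -2.0]]
sample_path = euler_gaussian_diffusion(A, B, sigma=2.0, x0=invariant_mean(A, B),
                                       delta=0.01, n=500, rng=random.Random(0))
```

The Euler scheme introduces a discretisation bias of order `delta`, so in practice the path is simulated on a fine grid and then subsampled at the observation step $\Delta$.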

A3. Proof of Lemma 1
Proof. The proof of the lemma is done in two steps.
Step 1: we prove that
Using Davydov's inequality for mixing processes, we get