Convergence Rates of Density Estimation in Besov Spaces

doi:10.4236/am.2011.210175

Applied Mathematics, 2011, 2, 1258-1262

doi:10.4236/am.2011.210175 Published Online October 2011 (http://www.SciRP.org/journal/am)

Huiying Wang

Department of Applied Mathematics, Beijing University of Technology, Beijing, China

E-mail: b200806005@emails.bjut.edu.cn

Received July 29, 2011; revised August 23, 2011; accepted August 30, 2011

Abstract

The optimality of a density estimation on Besov spaces $B^s_{r,q}(\mathbb{R})$ for the $L^p$ risk was established by Donoho, Johnstone, Kerkyacharian and Picard (“Density estimation by wavelet thresholding,” The Annals of Statistics, Vol. 24, No. 2, 1996, pp. 508-539). To show the lower bound of optimal rates of convergence $R_n(B^s_{r,q}, p)$, they use Korostelev and Assouad lemmas. However, the conditions of those two lemmas are difficult to verify. This paper gives another proof of that bound by using Fano’s Lemma, which looks a little simpler. In addition, our method can be used in many other statistical models for lower bounds of estimations.

Keywords: Optimal Rate of Convergence, Density Estimation, Besov Spaces, Wavelets

1. Introduction

Wavelet analysis has many applications, one of which is to estimate an unknown density function based on independent and identically distributed (i.i.d.) random samples. Let $(\Omega, \mathcal{F}, P)$ be a probability measurable space and $X_1, \ldots, X_n$ be i.i.d. random variables with an unknown density function $f$. We use $EX$ to denote the expectation of a random variable $X$. The sequence

$$R_n(V, p) := \inf_{\hat f_n} \sup_{f \in V} E\|\hat f_n - f\|_p$$

is called the optimal rate of convergence on the functional class $V$ for the $L^p$ risk. Here, $\hat f_n$ is an arbitrary estimator of $f$ with $n$ i.i.d. random samples. Kerkyacharian and Picard [1] study $R_n(V, p)$ when $V$ is a Besov space in the matched case. Donoho, Johnstone, Kerkyacharian and Picard [2] consider unmatched cases. In fact, they show the optimal convergence rates for the Besov class $B^s_{r,q}$ and $L^p$ risk:

$$R_n(B^s_{r,q}, p) \sim \begin{cases} \left(\dfrac{\ln n}{n}\right)^{\frac{s - 1/r + 1/p}{2(s - 1/r) + 1}}, & r \le \dfrac{p}{2s+1}, \\[2ex] n^{-\frac{s}{2s+1}}, & r > \dfrac{p}{2s+1}. \end{cases} \tag{1.1}$$

To show the lower bound of (1.1), the authors of [2,3] use Korostelev and Assouad lemmas. However, the conditions of those two lemmas are difficult to verify. In this short paper, we give another proof for the lower bound of (1.1) by using Fano’s lemma [4]. It should be pointed out that Fano’s lemma can be applied to a variety of statistical models, see [5-7].
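As a small illustration of (1.1), the following sketch evaluates the exponent of the rate in each of the two cases. The function name `rate_exponent` and the regime labels `"log"` and `"poly"` are hypothetical names introduced here for illustration only; they do not come from [2].

```python
def rate_exponent(s, r, p):
    """Return (regime, alpha) for the rate in (1.1).

    "log":  rate (ln n / n)^alpha, alpha = (s - 1/r + 1/p) / (2(s - 1/r) + 1),
            used when r <= p / (2s + 1).
    "poly": rate n^(-alpha), alpha = s / (2s + 1), used when r > p / (2s + 1).
    The labels are illustrative names, not notation from the paper.
    """
    if s <= 1.0 / r:
        raise ValueError("this sketch assumes s > 1/r")
    if r <= p / (2.0 * s + 1.0):
        alpha = (s - 1.0 / r + 1.0 / p) / (2.0 * (s - 1.0 / r) + 1.0)
        return "log", alpha
    return "poly", s / (2.0 * s + 1.0)
```

At the boundary $r = p/(2s+1)$ the two exponents coincide, so only the logarithmic factor distinguishes the two cases there.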

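As usual, $L^p(\mathbb{R})$ denotes the classical Lebesgue space on the real line $\mathbb{R}$. In particular, $L^2(\mathbb{R})$ stands for the Hilbert space, which consists of all square integrable functions. As a subspace of $L^p(\mathbb{R})$, the Sobolev space with an integer exponent $k$ means

$$W_p^k(\mathbb{R}) := \{ f : f^{(m)} \in L^p(\mathbb{R}),\ m = 0, 1, \ldots, k \}.$$

The corresponding norm is $\|f\|_{W_p^k} := \|f\|_p + \|f^{(k)}\|_p$. Moreover, the Besov space $B^s_{p,q}(\mathbb{R})$ [3] ($1 \le p, q \le \infty$, $s = n + \alpha$ and $\alpha \in (0,1]$) can be defined by

$$B^s_{p,q}(\mathbb{R}) := \{ f \in W_p^n(\mathbb{R}) : \{2^{j\alpha}\omega_p^2(f^{(n)}, 2^{-j})\}_{j \in \mathbb{Z}} \in l_q(\mathbb{Z}) \}$$

with the associated norm

$$\|f\|_{B^s_{p,q}} := \|f\|_{W_p^n} + \|\{2^{j\alpha}\omega_p^2(f^{(n)}, 2^{-j})\}_{j \in \mathbb{Z}}\|_{l_q(\mathbb{Z})},$$

where

$$\omega_p^2(f, t) := \sup_{|h| \le t} \|f(\cdot + 2h) - 2f(\cdot + h) + f(\cdot)\|_p.$$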







In general, it can be shown that compactly supported and $n$ times differentiable functions belong to $B^s_{p,q}(\mathbb{R})$ for $s < n$ and $1 \le p, q \le \infty$.

The Besov space can be discretized by the sequence norm of wavelet coefficients. Many useful wavelets are generated by scaling functions. More precisely, if $\varphi$ is a scaling function with

$$\varphi(x) = \sqrt{2}\sum_k h_k\varphi(2x - k),$$

then

$$\psi(x) = \sqrt{2}\sum_k (-1)^{k+1}h_{1-k}\varphi(2x - k)$$

defines a wavelet [3]. Clearly, when $\varphi$ is compactly supported and continuous, the corresponding wavelet $\psi$ has the same properties. An orthonormal wavelet basis of $L^2(\mathbb{R})$ is generated from dilation and translation of a scaling function and its corresponding wavelet, i.e.

$$\varphi_{jk}(x) := 2^{j/2}\varphi(2^jx - k), \qquad \psi_{jk}(x) := 2^{j/2}\psi(2^jx - k), \qquad j, k \in \mathbb{Z}.$$

Although wavelet bases are constructed for $L^2(\mathbb{R})$, most of them constitute unconditional bases for $L^p(\mathbb{R})$. A scaling function $\varphi$ is called $t$ regular, if $\varphi$ has continuous derivatives of order $t$ and its corresponding wavelet $\psi$ has vanishing moments of order $t$, i.e.

$$\int x^k\psi(x)\,dx = 0, \qquad k = 0, 1, \ldots, t - 1.$$

The following lemma [3] plays an important role in this paper.

Lemma 1.1. Let $\varphi$ be a compactly supported, $t$ regular orthonormal scaling function with the corresponding wavelet $\psi$ and $0 < s < t$. If $f \in L^p(\mathbb{R})$, $s_{0k} := \langle f, \varphi_{0k}\rangle$, $d_{jk} := \langle f, \psi_{jk}\rangle$ and $1 \le p, q \le \infty$, then the following two conditions are equivalent:

(i) $f \in B^s_{p,q}(\mathbb{R})$;

(ii) $\{2^{j(s + 1/2 - 1/p)}\|d_j\|_p\}_{j \ge 0} \in l_q$.

Furthermore,

$$\|f\|_{B^s_{p,q}} \sim \|s_0\|_p + \|\{2^{j(s + 1/2 - 1/p)}\|d_j\|_p\}_{j \ge 0}\|_q.$$

Here $s_0 := (s_{0k})_k$, $d_j := (d_{jk})_k$, and $\|\cdot\|_p$, $\|\cdot\|_q$ applied to sequences denote the $l_p$ and $l_q$ norms.

Before introducing Fano’s Lemma, we need the notion of Kullback-Leibler distance [4]. Let $P$ and $Q$ be two probability measures with $P$ being absolutely continuous with respect to $Q$ (denoted by $P \ll Q$). Then the Kullback-Leibler distance is defined by

$$K(P, Q) := \int p(x)\ln\frac{p(x)}{q(x)}\,dx,$$

where $p$ and $q$ are the density functions of $P$ and $Q$ respectively.

Lemma 1.2. (Fano’s Lemma, [4]) Let $(\Omega, \mathcal{F}, P_k)$ be probability measurable spaces and $A_k \in \mathcal{F}$, $k = 0, 1, \ldots, m$. If $A_k \cap A_v = \emptyset$ for $k \ne v$, then with $A^c$ standing for the complement of $A$ and

$$\kappa_m := \inf_{0 \le v \le m}\frac{1}{m}\sum_{k \ne v}K(P_k, P_v),$$

$$\sup_{0 \le k \le m}P_k(A_k^c) \ge \min\Big\{\frac{1}{2},\ \sqrt{m}\exp(-3e^{-1} - \kappa_m)\Big\}.$$
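The quantities in Lemma 1.2 are easy to evaluate numerically. The sketch below is only an illustration under a simplifying assumption: it uses finite discrete distributions (lists of probabilities) instead of densities, so the integral in $K(P, Q)$ becomes a sum. The function names `kl_divergence`, `kappa_m` and `fano_lower_bound` are hypothetical.

```python
import math


def kl_divergence(p, q):
    """K(P, Q) = sum_i p_i ln(p_i / q_i) for discrete P << Q (assumed setting)."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0.0:
            if qi == 0.0:
                raise ValueError("P is not absolutely continuous w.r.t. Q")
            total += pi * math.log(pi / qi)
    return total


def kappa_m(dists):
    """inf over v of (1/m) * sum_{k != v} K(P_k, P_v) for m + 1 distributions."""
    m = len(dists) - 1
    return min(
        sum(kl_divergence(dists[k], dists[v]) for k in range(m + 1) if k != v) / m
        for v in range(m + 1)
    )


def fano_lower_bound(m, kappa):
    """Right-hand side of Lemma 1.2: min{1/2, sqrt(m) * exp(-3/e - kappa)}."""
    return min(0.5, math.sqrt(m) * math.exp(-3.0 / math.e - kappa))
```

The bound stays at $1/2$ as long as $\kappa_m$ grows more slowly than $\frac{1}{2}\ln m$, which is the mechanism used in the proofs below.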













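By Lemma 1.1 and 1.2, we can show the following result:

Theorem 1.1. Let $f \in B^s_{r,q}(\mathbb{R}, L)$ with $1 \le r, q \le \infty$ and $s > 1/r$. If $\hat f_n$ is an estimator of $f$ with $n$ i.i.d. random samples, then

$$\sup_{f \in B^s_{r,q}(\mathbb{R}, L)}E\|\hat f_n - f\|_p \gtrsim \max\Big\{\Big(\frac{\ln n}{n}\Big)^{\frac{s - 1/r + 1/p}{2(s - 1/r) + 1}},\ n^{-\frac{s}{2s+1}}\Big\},$$

where $B^s_{r,q}(\mathbb{R}, L) := \{f \in B^s_{r,q}(\mathbb{R}) : \|f\|_{B^s_{r,q}} \le L$ and $f$ has compact support$\}$. The notation $x \gtrsim y$ means $x \ge Cy$ with a constant $C > 0$.

Remark 1.1. Note that, for sufficiently large $n$,

$$\max\Big\{\Big(\frac{\ln n}{n}\Big)^{\frac{s - 1/r + 1/p}{2(s - 1/r) + 1}},\ n^{-\frac{s}{2s+1}}\Big\} = \Big(\frac{\ln n}{n}\Big)^{\frac{s - 1/r + 1/p}{2(s - 1/r) + 1}} \quad \text{for } r \le \frac{p}{2s+1},$$

and

$$\max\Big\{\Big(\frac{\ln n}{n}\Big)^{\frac{s - 1/r + 1/p}{2(s - 1/r) + 1}},\ n^{-\frac{s}{2s+1}}\Big\} = n^{-\frac{s}{2s+1}} \quad \text{for } r > \frac{p}{2s+1}.$$

Then Theorem 1.1 is a reformulation of the lower bound in (1.1). By using the idea of reference [5], we show this theorem in the next section.

2. Proof of Theorem 1.1

Firstly, we prove

$$\sup_{f \in B^s_{r,q}(\mathbb{R}, L)}E\|\hat f_n - f\|_p \gtrsim \Big(\frac{\ln n}{n}\Big)^{\frac{s - 1/r + 1/p}{2(s - 1/r) + 1}}.$$

One needs to construct $g_k$ such that $g_k \in B^s_{r,q}(\mathbb{R}, L)$ and

$$\sup_k E\|\hat f_n - g_k\|_p \gtrsim \Big(\frac{\ln n}{n}\Big)^{\frac{s - 1/r + 1/p}{2(s - 1/r) + 1}}.$$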







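Let $\varphi$ be a compactly supported, $t$ regular ($t > s$) and orthonormal scaling function, and let $\psi$ be the corresponding wavelet with $\operatorname{supp}\psi \subseteq [0, l]$, $l \in \mathbb{N}$. Here and after, $\mathbb{N}$ denotes the set of positive integers. Then there exists a compactly supported density function $g$ (i.e. $g(x) \ge 0$ and $\int g(x)\,dx = 1$) satisfying $g \in B^s_{r,q}(\mathbb{R})$ and $g(x)|_{[0,l]} = c_0 > 0$. Let $\Delta_j := \{0, l, 2l, \ldots, (2^j - 1)l\}$. Then the number of elements in $\Delta_j$, denoted by $\#\Delta_j$, is $2^j$. Motivated by [5], one defines $a_j := 2^{-j(s - 1/r + 1/2)}$ and

$$g_k(x) := g(x) + a_jI_k\psi_{jk}(x), \qquad k \in \Delta_j \cup \{2^jl\},$$

with $I_k := 1$ if $k \in \Delta_j$, else $I_{2^jl} := 0$. Obviously, $g_{2^jl} = g$ and $\int g_k(x)\,dx = 1$, because $\int\psi_{jk}(x)\,dx = 0$. Moreover, $s > 1/r$ implies that

$$g_k(x) \ge c_0 - a_j2^{j/2}\|\psi\|_\infty = c_0 - 2^{-j(s - 1/r)}\|\psi\|_\infty \ge 0$$

for large $j$, which shows that $g_k$ is a density function for each $k$.

By the assumptions on $\varphi$, the wavelet $\psi$ is compactly supported and $t$ times differentiable. Therefore, $\psi_{jk} \in B^s_{r,q}(\mathbb{R})$ ($t > s$) and $\|a_j\psi_{jk}\|_{B^s_{r,q}} \sim a_j2^{j(s + 1/2 - 1/r)} = 1$ due to Lemma 1.1 and $a_j = 2^{-j(s - 1/r + 1/2)}$. Because $g \in B^s_{r,q}(\mathbb{R})$, so is $g_k$. Hence, $g_k \in B^s_{r,q}(\mathbb{R}, L)$. Clearly, since the supports of $\psi_{jk}$, $k \in \Delta_j$, are mutually disjoint,

$$\|g_k - g_{k'}\|_p \ge a_j\|\psi_{jk}\|_p = a_j2^{j(1/2 - 1/p)}\|\psi\|_p = 2^{-j(s - 1/r + 1/p)}\|\psi\|_p =: \delta_j \tag{2.1}$$

for $k \ne k'$, due to $a_j = 2^{-j(s - 1/r + 1/2)}$. Furthermore, $A_k := \{\|\hat f_n - g_k\|_p < \delta_j/2\}$ satisfies $A_k \cap A_{k'} = \emptyset$ for $k \ne k'$. Recall that $\#\Delta_j = 2^j$. By Lemma 1.2,

$$\sup_k P^n_{g_k}(A_k^c) \ge \min\Big\{\frac{1}{2},\ \sqrt{2^j}\exp(-3e^{-1} - \kappa_{2^j})\Big\}.$$

Here and after, $P^n_f$ stands for the probability measure corresponding to the density function $f^n(x) := f(x_1)f(x_2)\cdots f(x_n)$.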

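It is easy to see that $P^n_{g_k} \ll P^n_{g_{k'}}$ from the constructions of $g_k$. Since $\hat f_n$ is an estimator of the density with $n$ i.i.d. random samples,

$$E\|\hat f_n - g_k\|_p \ge \frac{\delta_j}{2}P^n_{g_k}\Big(\|\hat f_n - g_k\|_p \ge \frac{\delta_j}{2}\Big) = \frac{\delta_j}{2}P^n_{g_k}(A_k^c).$$

Then,

$$\sup_k E\|\hat f_n - g_k\|_p \ge \frac{\delta_j}{2}\sup_k P^n_{g_k}(A_k^c) \ge \frac{\delta_j}{2}\min\Big\{\frac{1}{2},\ \sqrt{2^j}\exp(-3e^{-1} - \kappa_{2^j})\Big\}. \tag{2.2}$$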

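Next, one shows $\kappa_{2^j} \le c_0^{-1}na_j^2$. Recall that

$$K(P_1^n, P_2^n) := \int f_1^n(x)\ln\frac{f_1^n(x)}{f_2^n(x)}\,dx,$$

$f_1^n(x) = \prod_{i=1}^n f_1(x_i)$ and $f_2^n(x) = \prod_{i=1}^n f_2(x_i)$. Then

$$K(P_1^n, P_2^n) = \sum_{i=1}^n\int f_1(x_i)\ln\frac{f_1(x_i)}{f_2(x_i)}\,dx_i = nK(P_1, P_2).$$

Note that $K(P_1, P_2) := \int f_1(x)\ln\frac{f_1(x)}{f_2(x)}\,dx$ and $\ln u \le u - 1$ for $u > 0$. Then

$$K(P_1^n, P_2^n) = n\int f_1(x)\ln\frac{f_1(x)}{f_2(x)}\,dx \le n\int f_1(x)\Big(\frac{f_1(x)}{f_2(x)} - 1\Big)dx = n\int f_2^{-1}(x)|f_1(x) - f_2(x)|^2\,dx.$$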
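The inequality $K(P_1, P_2) \le \int f_2^{-1}|f_1 - f_2|^2\,dx$ can be checked numerically on a simple pair of densities. The sketch below is an illustration only: the densities `f1` and `f2` on $[0, 1]$ and the midpoint-rule integrator `kl_and_chi2` are assumptions chosen for this example, not objects from the paper.

```python
import math


def kl_and_chi2(f1, f2, a=0.0, b=1.0, steps=20000):
    """Midpoint-rule approximations on [a, b] of
    K(P1, P2) = int f1 ln(f1 / f2) dx  and  int (f1 - f2)^2 / f2 dx.
    Both densities are assumed strictly positive on [a, b]."""
    h = (b - a) / steps
    kl = 0.0
    chi2 = 0.0
    for i in range(steps):
        x = a + (i + 0.5) * h
        u, v = f1(x), f2(x)
        kl += u * math.log(u / v) * h
        chi2 += (u - v) ** 2 / v * h
    return kl, chi2


def f1(x):
    # Hypothetical perturbed density on [0, 1]; it integrates to 1.
    return 1.0 + 0.5 * math.sin(2.0 * math.pi * x)


def f2(x):
    # Uniform density on [0, 1].
    return 1.0
```

For $n$ i.i.d. samples both sides are multiplied by $n$, which is the form used in (2.3).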

















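Hence,

$$\kappa_{2^j} := \inf_v 2^{-j}\sum_{k \ne v}K(P^n_{g_k}, P^n_{g_v}) \le 2^{-j}\sum_{k \in \Delta_j}K(P^n_{g_k}, P^n_g).$$

Moreover,

$$K(P^n_{g_k}, P^n_g) \le n\int g^{-1}(x)|g_k(x) - g(x)|^2\,dx. \tag{2.3}$$

According to the definition of $\Delta_j$, $\operatorname{supp}\psi_{jk} \subseteq [0, l]$ and $g(x)|_{[0,l]} = c_0$. Thus,

$$\int g^{-1}(x)|g_k(x) - g(x)|^2\,dx = c_0^{-1}a_j^2\int|\psi_{jk}(x)|^2\,dx = c_0^{-1}a_j^2$$

by the orthonormality of $\psi_{jk}$. Then (2.3) reduces to

$$\kappa_{2^j} \le c_0^{-1}na_j^2. \tag{2.4}$$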

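Take $2^j \sim (n/\ln n)^{\frac{1}{2(s - 1/r) + 1}}$. Then $na_j^2 = n2^{-j(2(s - 1/r) + 1)} \sim \ln n$. Now, one can choose $j$ such that $na_j^2 \le C\ln n$ with a constant $C > 0$ satisfying $Cc_0^{-1} < \frac{1}{4(s - 1/r) + 2}$. Therefore,

$$\sqrt{2^j}\,e^{-\kappa_{2^j}} \ge \sqrt{2^j}\,e^{-c_0^{-1}na_j^2} \ge 2^{j/2}n^{-Cc_0^{-1}} \gtrsim \Big(\frac{n}{\ln n}\Big)^{\frac{1}{4(s - 1/r) + 2}}n^{-Cc_0^{-1}} \ge 1$$

for large $n$, and (2.2) reduces to $\sup_k E\|\hat f_n - g_k\|_p \ge C'\delta_j$ with a constant $C' > 0$. The desired conclusion follows from $\delta_j \sim 2^{-j(s - 1/r + 1/p)}$ by (2.1) and $2^j \sim (n/\ln n)^{\frac{1}{2(s - 1/r) + 1}}$.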



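Now, we prove

$$\sup_{f \in B^s_{r,q}(\mathbb{R}, L)}E\|\hat f_n - f\|_p \gtrsim n^{-\frac{s}{2s+1}}.$$

Our proof depends on another lemma [4].

Lemma 2.1. (Varshamov-Gilbert) Let $m \ge 8$ and $\Omega := \{\varepsilon = (\varepsilon_1, \ldots, \varepsilon_m) : \varepsilon_i \in \{0, 1\}\}$. Then there exists a subset $\{\varepsilon^0, \ldots, \varepsilon^M\}$ of $\Omega$ with $\varepsilon^0 = (0, \ldots, 0)$ such that $M \ge 2^{m/8}$ and

$$\sum_{k=1}^m|\varepsilon_k^i - \varepsilon_k^j| \ge \frac{m}{8}, \qquad 0 \le i \ne j \le M.$$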
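Lemma 2.1 is an existence statement, but for moderate $m$ such a subset can be found by a simple randomized search. The following sketch is an illustration only and is not the proof in [4]: it draws random binary vectors and keeps those whose Hamming distance to every vector kept so far is at least $m/8$. The function names `hamming` and `vg_subset` are hypothetical.

```python
import math
import random


def hamming(u, v):
    """Hamming distance: number of coordinates where u and v differ."""
    return sum(a != b for a, b in zip(u, v))


def vg_subset(m, seed=0):
    """Search for eps^0 = (0, ..., 0), eps^1, ..., eps^M in {0,1}^m (m >= 8)
    with pairwise Hamming distance >= m/8 and M = ceil(2^(m/8))."""
    rng = random.Random(seed)
    size = math.ceil(2 ** (m / 8)) + 1  # M + 1 points in total
    chosen = [(0,) * m]
    while len(chosen) < size:
        cand = tuple(rng.randint(0, 1) for _ in range(m))
        # Keep the candidate only if it is far from all points kept so far.
        if all(8 * hamming(cand, c) >= m for c in chosen):
            chosen.append(cand)
    return chosen
```

For example, `vg_subset(32)` returns 17 vectors of length 32, starting with the zero vector, with pairwise Hamming distance at least 4.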





 



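It is sufficient to construct $g_i$, $i \in \{0, 1, \ldots, M\}$, such that $g_i \in B^s_{r,q}(\mathbb{R}, L)$ and

$$\sup_i E\|\hat f_n - g_i\|_p \gtrsim n^{-\frac{s}{2s+1}}. \tag{2.5}$$

As above, let $\varphi$ be a compactly supported, $t$ regular ($t > s$) and orthonormal scaling function, and $\psi$ the corresponding wavelet with $\operatorname{supp}\psi \subseteq [0, l]$, $l \in \mathbb{N}$. Assume $\Delta_j := \{0, l, 2l, \ldots, (2^j - 1)l\}$, $g(x)|_{[0,l]} = c_0$ and $a_j := 2^{-j(s + 1/2)}$. Define

$$g_\varepsilon(x) := g(x) + a_j\sum_{k \in \Delta_j}\varepsilon_k\psi_{jk}(x)$$

with $\varepsilon = (\varepsilon_k)_{k \in \Delta_j} \in \{0, 1\}^{2^j}$ (note that $g_0 = g$). Since $\varepsilon_k \in \{0, 1\}$, one knows that $\|\{\varepsilon_k\}\|_{l_r} \le 2^{j/r}$ and

$$a_j2^{j(s + 1/2 - 1/r)}\|\{\varepsilon_k\}\|_{l_r} \le 1.$$

By Lemma 1.1, $a_j\sum_{k \in \Delta_j}\varepsilon_k\psi_{jk} \in B^s_{r,q}(\mathbb{R})$, and so is $g_\varepsilon$. Hence $g_\varepsilon \in B^s_{r,q}(\mathbb{R}, L)$.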



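Note that the supports of $\psi_{jk}$ for $k \in \Delta_j$ are mutually disjoint. Then

$$g_\varepsilon(x) \ge c_0 - a_j2^{j/2}\|\psi\|_\infty \ge 0$$

for big $j$. This with $\int g_\varepsilon(x)\,dx = \int g(x)\,dx = 1$ implies that $g_\varepsilon$ is a density function for each $\varepsilon \in \{0, 1\}^{2^j}$. According to Lemma 2.1, there exist $\varepsilon^0, \varepsilon^1, \ldots, \varepsilon^M \in \{0, 1\}^{2^j}$ such that $M \ge 2^{2^{j-3}}$ and

$$\sum_{k \in \Delta_j}|\varepsilon_k^l - \varepsilon_k^i| \ge 2^{j-3}, \qquad l \ne i. \tag{2.6}$$

Because $\operatorname{supp}\psi_{jk} \cap \operatorname{supp}\psi_{jk'} = \emptyset$ for $k \ne k'$, one knows that

$$\|g_{\varepsilon^l} - g_{\varepsilon^i}\|_p^p = a_j^p\sum_{k \in \Delta_j}|\varepsilon_k^l - \varepsilon_k^i|^p\|\psi_{jk}\|_p^p = 2^{-j(sp + 1)}\|\psi\|_p^p\sum_{k \in \Delta_j}|\varepsilon_k^l - \varepsilon_k^i|^p.$$

This with (2.6) and $\varepsilon_k \in \{0, 1\}$ leads to $\|g_{\varepsilon^l} - g_{\varepsilon^i}\|_p^p \ge 2^{-3}2^{-jsp}\|\psi\|_p^p$ and

$$\|g_{\varepsilon^l} - g_{\varepsilon^i}\|_p \ge 8^{-1/p}2^{-js}\|\psi\|_p =: \delta_j. \tag{2.7}$$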

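Clearly, the sets $A_i := \{\|\hat f_n - g_i\|_p < \delta_j/2\}$ with $g_i := g_{\varepsilon^i}$, $i = 0, 1, \ldots, M$, satisfy $A_l \cap A_i = \emptyset$ for $i \ne l$. Then Fano’s Lemma yields

$$\sup_{0 \le i \le M}P^n_{g_i}(A_i^c) \ge \min\Big\{\frac{1}{2},\ \sqrt{M}\exp(-3e^{-1} - \kappa_M)\Big\}. \tag{2.8}$$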

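On the other hand, it follows that $\kappa_M \le c_0^{-1}n2^ja_j^2$ from arguments similar to the proof of (2.4). Take $2^j \sim n^{\frac{1}{2s+1}}$. Then $n2^ja_j^2 = n2^{-2js} \sim 2^j$. Hence, one can choose $j$ such that $n2^ja_j^2 \le C2^j$ with a constant $C > 0$ satisfying $c_0^{-1}C \le \frac{\ln 2}{16}$, and then

$$\sqrt{M}\,e^{-\kappa_M} \ge 2^{2^{j-4}}e^{-c_0^{-1}n2^ja_j^2} \ge 2^{2^{j-4}}e^{-c_0^{-1}C2^j} \ge 1.$$

Therefore, (2.8) reduces to $\sup_{0 \le i \le M}P^n_{g_i}(A_i^c) \ge C_0 > 0$ and

$$\sup_{0 \le i \le M}E\|\hat f_n - g_i\|_p \ge \frac{\delta_j}{2}\sup_{0 \le i \le M}P^n_{g_i}(A_i^c) \ge \frac{C_0}{2}\delta_j.$$

This with (2.7) and $2^j \sim n^{\frac{1}{2s+1}}$ yields (2.5).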

3. Acknowledgements

The author Huiying Wang is grateful to the referees for their valuable comments and thanks her advisor, Professor Youming Liu, for his helpful guidance. This work is supported by the National Natural Science Foundation of China (No. 10871012) and Natural Science Foundation of Beijing (No. 1082003).

4. References

[1] G. Kerkyacharian and D. Picard, “Density Estimation in Besov Spaces,” Statistics & Probability Letters, Vol. 13, No. 1, 1992, pp. 15-24. doi:10.1016/0167-7152(92)90231-S

[2] D. L. Donoho, I. M. Johnstone, G. Kerkyacharian and D. Picard, “Density Estimation by Wavelet Thresholding,” The Annals of Statistics, Vol. 24, No. 2, 1996, pp. 508-539. doi:10.1214/aos/1032894451

[3] W. Härdle, G. Kerkyacharian, D. Picard and A. B. Tsybakov, “Wavelets, Approximation and Statistical Applications,” Springer-Verlag, New York, 1997.

[4] A. B. Tsybakov, “Introduction to Nonparametric Estimation,” (English) Revised and Extended from the 2004 French Original, Translated by Vladimir Zaiats, Springer Series in Statistics, Springer, New York, 2009.

[5] P. Baldi, G. Kerkyacharian, D. Marinucci and D. Picard, “Adaptive Density Estimation for Directional Data Using Needlets,” The Annals of Statistics, Vol. 37, No. 6A, 2009, pp. 3362-3395. doi:10.1214/09-AOS682

[6] C. Christophe, “Regression with Random Design: A Minimax Study,” Statistics & Probability Letters, Vol. 77, No. 1, 2007, pp. 40-53. doi:10.1016/j.spl.2006.05.010

[7] A. B. Tsybakov, “Optimal Rates of Aggregation,” COLT/Kernel 2003 Lecture Notes in Artificial Intelligence 2777, Springer, Heidelberg, 2003, pp. 303-313.