Paper Menu >>
Journal Menu >>
Applied Mathematics, 2011, 2, 1258-1262 doi:10.4236/am.2011.210175 Published Online October 2011 (http://www.SciRP.org/journal/am) Copyright © 2011 SciRes. AM Convergence Rates of Density Estimation in Besov Spaces Huiying Wang Department of Ap pl i e d M athematics, Beijing University of Technology, Beijing, Ch ina E-mail: b200806005@emails.bjut.edu.cn Received July 29, 201 1; revised August 23, 2011; accepted August 30, 2011 Abstract The optimality of a density estimation on Besov spaces , s rq BR for the risk was established by Donoho, Johnstone, Kerkyacharian and Picard (“Density estimation by wavelet thresholding,” The Annals of Statistics, Vol. 24, No. 2, 1996, pp. 508-539). To show the lower bound of optimal rates of conver- gence p L ,, s nrq R Bp, they use Korostelev and Assouad lemmas. However, the conditions of those two lemmas are difficult to be verified. This paper aims to give another proof for that bound by using Fano’s Lemma, which looks a little simpler. In addition, our method can be used in many other statistical models for lower bounds of estimations. Keywords: Optimal Rate of Convergence, Density Estimation, Besov Spaces, Wavelets 1. Introduction Wavelet analysis has many applications, one of which is to estimate an unknown density function based on inde- pendent and identically distributed (i.i.d.) random sam- ples. Let be a probability measurable space and 1 (,,)P ,,n X X be i.i.d. random variables with an un- known density function f. We use to denote the expectation of a random variable X. The sequence EX ,:inf sup n n n p ffV RVpE ff is called optimal rate of convergence on the functional class V for the p L risk. Here, n f is an arbitrary estimator of f with n i.i.d. ran- dom samples. Kerkyacharian and Picard [1] study n when V is a Besov space with matched case. Donoho, Johnstone, Kerkyacharian and Picard [2] con- sider unmatched cases. In fact, they show the optimal convergence rates for Besov class ,p RV r L, s rq B and p L risk 1/ 1/ 21/1 , 21 ln, , 21 , ,. 21 sr p sr s nrq s s p nnr s RB pp nr s (1.1) To show the lower bou nd of (1.1), author s of [2,3] use Korostelev and Assouad lemmas. However, the condi- tions of those two lemmas are difficult to be verified. In this small paper, we give another proof for the lower bound of (1.1) by using Fano’s lemma [4]. It should be pointed out that Fano’s lemma can be used to a variety of statistical models, see [5-7]. As usual, 1 p LRp denotes the classical Lebes- gue space on the real line R. In particular, 2 LR R stands for the Hilbert space, which consists of all square inte- grable functions. As a subspace of p, the Sobolev space with an integer exponent k means L :,,0,1,,1 m k pp WRffLRmkp . The corresponding norm :. k p k Wp p fff Moreover, the Besov space , [3] , s pq BR (1 pq , s n and (0,1]) can be defined by 2 ,,2 ,2 n snjj p qpp jZ BR fWRfl q with the associated norm , 2() :2,2 sn pq pq jnj p BW lZ ff f , where 2,:sup2 2. pht p ftfxhfxh fx In general, it can be shown that compactly supported and n times differentiable functions belong to , s pq BR for H. Y. WANG 1259 Where p and q are density functions of P Q respec- tively. Lemma 1.2. (Fano’s Lemma, [4]) Let be pr , 0 s n and 1, . pq The Besov space can be discretized by the sequence norm of wavelet coefficients. Many useful wavelets are generated by scaling functions. More precisely, if is a scaling function with 22 , k k x hxk then 122 k :1 k k x hx k defines a wave- let [3]. Clearly, when is compactly supported and continuous, the corresponding wavelet has the same properties. An orthonormal wavelet basis of 2 LR is generated from dilation and translation of a scaling func- tion and its corresponding wavelet, i.e. 0 0 0 2 2 , :2 2, :2 2. jj jj 0 jk jk j jkZ xxk xxk Although wavelet basis are constructed for 2 LR, most of them constitute unconditional bases for p LR. A scaling function is called t regular, if has con- tinuous derivatives of order t and its corresponding wavelet has vanishing moments of order t, i.e. k xx d0, 0,1,,1x kt . The following lemma [3] plays important roles in this paper. Lemma 1.1. Let be a compactly supported, t regu- lar orthonormal scaling function with the corresponding wavelet and0 s t. If p f LR, 00kk :,,sf :, j kjk and 1, df pq , then the following two conditions are equivalent: 1) , s pq f BR; 2) 11 2 0 0 2js pj pp jq sd . Furthermore, , 11 2 0 0 2 s pq js pj Bpp jq fsd . Before introducing Fano’s Lemma, we need the nota- tion of Kullback-Leilber distance [4]. Let P and Q with P being absolutely continuous with respect to Q (denoted by ). Then the Kullback-Leilber distance is de- fined by PQ 0 ,:ln d pq px (,, ) k P obability measurable spaces and k A, 0,1, ,km . If k A , K PQp xx qx v A kv , then with c A standing for for the complement of A and , kv v 0 1 :inf m vm k K PP , m 1 1,exp 3. 2m m e 0sup min c kk km PA By Lemma 1.1 and 1.2, we can show the following re- sult: Theorem 1.1. Let ,, s rq f BRL with, 1rq , 1p and 1sr . If n f is an estimator of f with n i.i.d. random samples, then , 1/ 1/ 21/1 21 , ln supmax, , s rq sr ps sr s np fB RL n Ef fn n where , ,, ,:, s rq ss rqrq B BRL fBRfL f and x y means has compact support}; The notation with a constant C. rk 1.1. Note that x Cy Rema 1/ 1/ 21/1 ln 1/ 1/ 21/1 21 ln max , s rp p s sr n n sr sr sn n n for 21 p r s and for 21 p r s 1/ 1/ 1/1 21 21 max , srp 2 ln s s sr s s nn n . Then theorem 1.1 is a reformulate of the loweund in (1.1). By using the idea of reference [5], we show this theorem in the next two sections. irstly, we prove n r bo 2. Proof of Theorem 1.1 F 1/ 1/ 21/1 ln . s ,, sup s rq rp sr np n f fn One need construct fB RL E such that , s kr k g ,q g BRL and 1/ 1/ /1 . 21 ln sup s rp r nk p kn Let s n Ef g ono be a compactly supported, regular and orthrmal scaling function, tt s be the corresponding wavlet with suppe 0,l , lN . Here and after, Copyright © 2011 SciRes. AM 1260 dgers. Th H. Y. WANG N enotes the set of positive inteen there ex- ists a compactly supported density function g (i.e. 0gx and g ) satisfying , s rq d1x x g xBR and 0 0, 0. l gx c ,2 j lll l. ThenLet elem ,2,,21 j :0, j ents in the number of j enote ed by [ is 2 j 5], one defi nes 1, dd by1. #2 j j 2 1/ Motivat 1/ :2 j sr and j a 2j kjjk kl gx gxxIk :, j a :1 if 2 with 2 I j kl j kl, else 2:0 j kl I. Obvi- ously, 2jl g g, gx x d1 and 1/s r k gx for large implies that k 020c jj, which g his a density function e assumptions of for each k. By t , the wavelet is com- pported and pactly su s rq Bt times differentiable. Therefore, Rts and ,. s krq , g BRBec ause 1/2 1/ 21 js r j a , , s rq jjk B aC and so is , s rq kB g Hence, due to Lemma 1.1. , s krq , g xBRL. Clearly, 2 1/ 1/ 2: j kk kjjk plp p jspr j p gggga (2.1) For 2j j kk l due to :2 1/2 1/ j s j a r. Fur- thermore, :2 j knk p Afg satisfies kk AA . Recall that 1. By Lemmfor kk #2 ja 1.2, j 2 13 supmin,2 expj k c j gk . Here and , 2 j ke after n PA n f P thstands for the peasure correspond-robability m oe density function ing t 2n1 : n f xfxf nn xfx. It is easy to see that 0k g g P P from the constructions of k g . Since n f is an estimator of density with n i.i.d. ra ndom samp le s, 2 ik jjj nnc nkg nkgk pp EfgPfgP A . 22 Then, 2 supsup 2 13 min,2 exp. 22 k j k j j jnc nk gk pk jj EfgP A e (2.2) Next, one shows12 0 2j j cna 12 1 12 1 02 ,:ln d nn n nn n n ff fx K PPf xx fx , 11 nn j fx 1 j f x and 212 nn j j f xf x. Then 111 12 112 12 ,lnd, i nn ii ii fx : Recall that . n K PPfxx nKPP fx Note that 1 11 12 12 ,:ln d fx PP fxx fx ln 1uu K and for en 0 Thu. 1 121 2 1 12 12 212 ,lnd 1d d. nn fx KP Pnfxx fx fx nfx x fx nfxfx fxx Hence, 2 2:inf 2,2,. jkvk j l jj jnn jnn gg gg vkv k KP PKPP Moreover, 12 22d j j jk k ngxgxgx .x (2.3) ng to the definition of Accordik g , supp 0, k g gl and 0 g xc on 0,l. Thus, 12 k g xgxg x 22 11212 00 2 d 0 j jkj jkj xxc axca by the orthonormality of dx ca . Then (2.3) reduces to j k 12 0 2. j j cna (2.4) ke Ta 1 21 2ln /1 11 2 22 2l js r j na nn . s r n . Then j n n Now, one can choose such that lnC n 0C2 j na and 0 41Csr c2 . Therefore, 41/ 2 1 sr c n 1 11 00 22 22 ln jcC jj jn eena n an ) reduces to d (2.2 jEf supknk j p g C n . The the desired follows from 1/ 2 1/ j sp p ) r j by (2.1 and 1 21/1 2ln sr jn n . , 21 , sup s rq s s np RL Ef fn Now, we prove fB . Our proof depends on another lemma [4]. Copyright © 2011 SciRes. AM H. Y. WANG 1261 ov-Gilbert) Let Lemma 2.1. (Varsham 1 :,, m , 0,1 i . Then there exists a sub- set 0,, M of /8m with 00, ,0 such that 2 M and 10. 8 mij m kk k ij M construct It is sufficient to 0,1, , i g i M such that ,, is rq g BRL and 21 . is npn (2.5) sup iEf g As proove, let s ved ab be a compactly supported, tt s regular on,and orthormal scaling functi on be uppthe corresponding wavelet with sN 0,l , l . s s Assume,, rq g BR 1/2 , : j : L and ne : j a 0,,2,ll [0, ]0 |0 l gc 1l and . Defi 2 js ,2 j j jkjk k g xgxa x with 0,1 2 j j kk (note that 0 g g). Since 0,1 k , one knows that 2 j r j k k nd a 1/ 1/2 1/ 21 j j r r rk k a By Lemma 1.1, . js , s jrq kjk B j k aC , and so is , s rq B g . Hence , s,rq g BRL. at the supports ofNote th j k for j k are mutu- ally disjoint. Then 00 20 js jjk gxc ac for big j. This with1 dd g xx gxx implies that g is a density 2 0,1function for each j 1, there exists . Ac- cording to Lemma 2. 01 ,,, M such that 3 2 2 j M and 3 2. li j k (2.6) j kk Because suppjk supp jk for j kk , one knows that 1 2 li j pp pl i jk k k p j p jk p p p sp jli kk k p gg a This with (2.6) and , l k 0,1 i k leads to 3 2 li pj p g and 2 pps p g 1/ 82: li psj j p p gg . (2.7) Clearly, the sets 0,1,, 2 ii j n A fgi M satisfy li AA for . Then Fano’s Lemma yi il elds 1 0 1 supmin,expM3. 2 i i nc gM iM PA e (2.8) On the other hand, it follows 12 2 0 j M c f of (2.4). Take j na from the similar arguments to the proo 1 21 2j s n . Then choose a c0 suhat 2(21) 21 sj j naa . Hence, one can onstant ch t C 12 1 44 00 22 22 2e2e1 jj jj j Mcna cC Me . Therefore, (2.8) reduces to 0sup i i nc g iM PA C 0 and 0supi np iM Efg 0sup . 22 j iM C i i jj n gn p Pf g This with (2.7) and 1 21 2j s n yield (2.5). 3. The author Huiying Wang is grateful to the referees for their valuable comments and thanks her advisoProfes- sor Youming Liu, for his helpful guidance. Th work is supported by the National Natural Scien ce Foundation of China (No. 10871012) and Natural Science Foundation of Beijing (No. 1082003) . ] G. Kerkyacharian and D. Picard, “Density Estimation in Acknowledgements r, is 4. References [1 Besov Spaces,” Statistics & Probability Letters, Vol. 13, No. 1, 1992, pp. 15-24. doi:10.1016/0167-7152(92)90231-S [2] D. L. Donoho, I. M. Johnstone, G. Kerkyacharian and D. Picard, “Density Estimation by Wavelet Thresholding,” The Annals of Statistics, Vol. 24, No. 2, 1996, pp. 508- 539. doi:10.1214/aos/1032894451 Kerkyacharian, D. Picard and A. B. Tsy- lets, Approximation and Statistical Appli- cations,” Springer-Verlag, New York, 1997. mir Zaiats, Springer rk, 2009. [3] W. Härdle, G. bakov, “Wave [4] A. B. Tsybakov, “Introduction to Nonparametric Estima- tion,” (English) Revised and Extended from the 2004 French Original, Translated by Vladi Series in Statistics, Springer, New Yo [5] P. Baldi, G. Kerkyacharian, D. Marinucci and D. Picard, “Adaptive Density Estimation for Directional Data Using Needlets,” The Annals of Statistics, Vol. 37, No. 6A, Copyright © 2011 SciRes. AM H. Y. WANG Copyright © 2011 SciRes. AM 1262 9-AOS6822009, pp. 3362-3395. doi:10.1214/0 05.010 [6] C. Christophe, “Regression with Random Design: A Minimax Study,” Statistics & Probability Letters, Vol. 77, No. 1, 2007, pp. 40-53. doi:10.1016/j.spl.2006. [7] A. B. Tsybakov, “Optimal Rates of Aggregation,” COLT/ Kernel 2003 Lecture Notes in Artificial Intelligence 2777, Springer, Heidelberg, 2003, pp. 303-313. |