A Note on Spline Estimator of Unknown Probability Density Function
1. Introduction
The construction of a confidence interval for an unknown probability density function (pdf) through a histogram was first suggested by Smirnov [1]. Bickel and Rosenblatt [2] and Rosenblatt [3] considered the analogous problem using the Parzen-Rosenblatt estimator. The problem of constructing a confidence interval for an unknown pdf through a spline function was studied by Muminov and Khashimov [4]. Recently, a kernel estimator of an unknown multidimensional density function was constructed and a similar problem was studied by Muminov [5,6].
Several authors have considered the rate of convergence of the distribution of the maximal deviation between the Parzen-Rosenblatt estimator and the unknown pdf; see, for example, Konakov and Piterbarg [7-9]. Nevertheless, no result of this kind is available for spline estimators. The results obtained in this work make it possible to approximate the deviation of the spline estimator of an unknown density by a Gaussian process.
It should be noted that asymptotic unbiasedness and strong consistency of the spline estimator were proved by Lii and Rosenblatt [10] and Muminov [11]. The importance of spline estimation and its applications in statistics are discussed in [5,12].
The paper is organized as follows. In Sec. 2 the spline estimator is constructed, some auxiliary results are stated, and the main theorem is given. The main theorem is proved in Sec. 3.
2. Results
Let be independent identically distributed random variables (r.v.) with pdf and let be the cubic spline function that interpolates at the points, where, is the empirical distribution function of the sample. The boundary conditions for are, ,.
Then the derivative of the spline function is as follows (see Lii [13]):
where, , , ,
,
, for, , for, , for, ,
, ,for, and for the other values of I, j,.
We take the statistic as an estimator of the pdf. We define the r.v. by the following equality
where.
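The construction described above (differentiate a cubic spline that interpolates the empirical distribution function at equally spaced knots) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the estimator exactly as defined in this paper: the knots are taken to be k/N on [0, 1], and natural boundary conditions (zero second derivative at 0 and 1) are assumed because the boundary conditions used here are not recoverable from the text. The names `empirical_cdf` and `spline_density_estimator` are hypothetical.

```python
import bisect
import random


def empirical_cdf(sample):
    """Return F_n as a function: the fraction of sample points <= x."""
    data = sorted(sample)
    n = len(data)
    return lambda x: bisect.bisect_right(data, x) / n


def spline_density_estimator(sample, num_intervals):
    """Derivative of a cubic spline interpolating F_n at the knots k/N on [0, 1].

    Assumption (not from the paper): natural boundary conditions, i.e. the
    second derivative of the spline vanishes at 0 and at 1.
    """
    N = num_intervals
    h = 1.0 / N
    F = empirical_cdf(sample)
    y = [F(k * h) for k in range(N + 1)]

    # The second derivatives M_0..M_N at the knots satisfy
    #   M_{k-1} + 4 M_k + M_{k+1} = 6 (y_{k+1} - 2 y_k + y_{k-1}) / h^2
    # for k = 1..N-1, with M_0 = M_N = 0. Solve with the Thomas algorithm.
    rhs = [6.0 * (y[k + 1] - 2.0 * y[k] + y[k - 1]) / h ** 2 for k in range(1, N)]
    m = len(rhs)
    c_prime = [0.0] * m
    d_prime = [0.0] * m
    for i in range(m):
        denom = 4.0 - (c_prime[i - 1] if i > 0 else 0.0)
        c_prime[i] = 1.0 / denom
        d_prime[i] = (rhs[i] - (d_prime[i - 1] if i > 0 else 0.0)) / denom
    inner = [0.0] * m
    for i in range(m - 1, -1, -1):
        inner[i] = d_prime[i] - (c_prime[i] * inner[i + 1] if i < m - 1 else 0.0)
    M = [0.0] + inner + [0.0]

    def density(x):
        # Index of the knot interval [k h, (k + 1) h] containing x.
        k = max(0, min(int(x / h), N - 1))
        left, right = k * h, (k + 1) * h
        return (-M[k] * (right - x) ** 2 / (2.0 * h)
                + M[k + 1] * (x - left) ** 2 / (2.0 * h)
                + (y[k + 1] - y[k]) / h
                - h * (M[k + 1] - M[k]) / 6.0)

    return density


# Usage: estimate the Uniform(0, 1) density from a simulated sample.
rng = random.Random(1)
sample = [rng.random() for _ in range(2000)]
f_n = spline_density_estimator(sample, num_intervals=10)
print(f_n(0.5))  # should be close to the true density value 1
```

Because the spline interpolates the empirical distribution function at the knots, its derivative integrates to F_n(1) - F_n(0) over [0, 1], whatever the boundary conditions.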
The r.v. is of interest for the solution of the following problems:
1) to find a confidence band for, at a given confidence level;
2) to construct a criterion for testing the null hypothesis at a given significance level .
Our main goal in the sequel is to solve problems 1) and 2). For this we have to find the limit distribution of the r.v.. The results obtained in this work allow one to approximate the distribution of the r.v. by the distribution of the maximum of a Gaussian process.
Let be the empirical distribution function of the sample and let be a sequence of Wiener processes. Set
,
,
,.
It is evident that
and the covariance structures of the Gaussian processes and coincide.
We assume that as and that the following conditions are fulfilled:
1),
2) The pdf is continuously differentiable on the interval [0, 1].
In what follows, C and c, with or without indices, denote universal positive constants.
Theorem. Suppose that conditions 1) and 2) are satisfied. Then for arbitrary one has
(1)
Also there is C such that
(2)
with probability equal to 1.
The following assertion was proved by Komlós et al. [14].
Lemma 1. There exists a probability space on which one can define a version of and a sequence of Brownian bridges such that, for all x,
.
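The role of the Brownian bridge in Lemma 1 can be illustrated numerically. The sketch below does not construct the coupling of Lemma 1 (which places both objects on one probability space); it only compares, by simulation, the distribution of the supremum of the normalized empirical process of a Uniform(0, 1) sample with that of the supremum of a Brownian bridge on a grid. The sample size, grid size, number of replications, and function names are illustrative assumptions.

```python
import math
import random
import statistics


def sup_empirical_process(n, rng):
    """sqrt(n) * sup_x |F_n(x) - x| for a Uniform(0, 1) sample of size n."""
    u = sorted(rng.random() for _ in range(n))
    # The supremum is attained at a sample point, at or just before the jump.
    d = max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(u))
    return math.sqrt(n) * d


def sup_brownian_bridge(grid_size, rng):
    """max_k |B(k / grid_size)| for a Brownian bridge built from a random walk."""
    dt = 1.0 / grid_size
    w = [0.0]
    for _ in range(grid_size):
        w.append(w[-1] + rng.gauss(0.0, math.sqrt(dt)))
    # B(t) = W(t) - t * W(1) is a Brownian bridge on [0, 1].
    return max(abs(w[k] - (k * dt) * w[-1]) for k in range(grid_size + 1))


rng = random.Random(0)
emp = [sup_empirical_process(500, rng) for _ in range(400)]
bridge = [sup_brownian_bridge(200, rng) for _ in range(400)]
median_emp = statistics.median(emp)
median_bridge = statistics.median(bridge)
print(median_emp, median_bridge)
```

The two medians should be close to each other. The grid maximum slightly underestimates the true supremum of the bridge, so small discrepancies are expected.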
Lemma 2. Let. For all l and J such that
where, one has
.
Also for any and the following holds
.
The following Lemma 3 is proved in the book of Lamperti [15].
Lemma 3. Let be a sequence of standard normally distributed r.v.s. Then
3. Proofs of the Main Results
The proof of Lemma 2 is simple and is therefore omitted.
Proof of the main theorem. We have
Hence
because. From Lemmas 1 and 3 it follows that as
and
with probability equal to 1. The relation (2) follows.
Let. Then
where
for we suppose. Set
where and denote summation over all and satisfying the inequalities
(3)
and
(4)
respectively. Integrating by parts, we find
. (5)
Put
,.
Since (3)
By conditions 1) and 2), Lagrange’s mean value theorem and the last inequality, we have
where. Here we also take into account.
From Lemma 2 we obtain. Combining the above, we obtain
. (6)
Similarly we have
.
By virtue of Lemma 2, when (4) is fulfilled the following holds:
,.
As a result we have
(7)
Reasoning as on p. 410 of Cramér [16], we find
(8)
where is a random variable with pdf
It is known (see (29.2) of Skorohod [17]) that for arbitrary
.
Using this, (5)-(8) and Chebyshev’s inequality, we get
(9)
In the same way we find that
(10)
Inequality (1) follows from (9) and (10). The proof of the Theorem is complete.
The theorem allows one to approximate the distribution of the r.v. by the distribution of the maximum of a Gaussian process.
[1] N. B. Smirnov, “On Construction of a Confidence Interval for the Probability Density Function,” Soviet Reports, Vol. 74, 1959, pp. 1189-1191.
[2] P. J. Bickel and M. Rosenblatt, “On Some Global Measures of the Deviations of Density Function Estimates,” The Annals of Statistics, Vol. 1, No. 6, 1973, pp. 1071-1095.
[3] M. Rosenblatt, “On the Maximal Deviation of k-Dimensional Density Estimates,” Annals of Probability, Vol. 4, No. 6, 1976, pp. 1009-1015. doi:10.1214/aop/1176995945
[4] M. S. Muminov and Sh. A. Khashimov, “On Limit Distribution of the Maximal Deviation of Spline Density Estimators,” FAN, Tashkent, 1986.
[5] M. S. Muminov, “On a Limit Distribution of the Maximal Level of Empirical Distribution Density and the Regression Function. I,” Theory of Probability and Its Applications, Vol. 55, No. 3, 2010, pp. 582-590.
[6] M. S. Muminov, “On a Limit Distribution of the Maximal Level of Empirical Distribution Density and the Regression Function. II,” Theory of Probability and Its Applications, Vol. 56, No. 1, 2011, pp. 162-173.
[7] V. D. Konakov and V. I. Piterbarg, “On the Convergence Rate of Maximal Deviations Distribution for Kernel Regression Estimates,” Journal of Multivariate Analysis, Vol. 15, No. 3, 1984, pp. 279-294. doi:10.1016/0047-259X(84)90053-8
[8] V. D. Konakov and V. I. Piterbarg, “High Level Excursions of Gaussian Fields and the Weakly Optimal Choice of the Smoothing Parameter. I,” Mathematical Methods of Statistics, Vol. 4, 1995, pp. 481-434.
[9] V. D. Konakov and V. I. Piterbarg, “High Level Excursions of Gaussian Fields and the Weakly Optimal Choice of the Smoothing Parameter. II,” Mathematical Methods of Statistics, Vol. 1, 1997, pp. 112-124.
[10] K. S. Lii and M. Rosenblatt, “Asymptotic Behavior of a Spline of a Density Function,” Computers & Mathematics with Applications, No. 1, 1975, pp. 223-235.
[11] M. S. Muminov, “On Statistical Estimation of the Probability Density Function by Line Functions,” Ph.D. Thesis, Tashkent, p. 110.
[12] M. S. Muminov, “On Approximating the Probability of a Large Excursion a Nonstationary Gaussian Process,” Siberian Mathematical Journal, Vol. 51, No. 1, 2010, pp. 175-195. doi:10.1007/s11202-010-0015-6
[13] K. S. Lii, “A Global Measure of a Spline Density Estimate,” The Annals of Statistics, Vol. 6, No. 5, 1978, pp. 1138-1148. doi:10.1214/aos/1176344316
[14] J. Komlós, P. Major and G. Tusnády, “An Approximation of Partial Sums of Independent RV’s and the Sample DF. I,” Probability Theory and Related Fields, Vol. 32, No. 1-2, 1975, pp. 111-131.
[15] J. Lamperti, “Probability,” Nauka, Moscow, 1973.
[16] H. Cramér, “The Mathematical Method in Statistics,” Mir, Moscow, 1976.
[17] A. V. Skorohod, “The Random Processes with Independent Increments,” Nauka, Moscow, 1964.