Bias Correction Technique for Estimating Quantiles of Finite Populations under Simple Random Sampling without Replacement

In this paper, the problem of nonparametric estimation of the finite population quantile function using a multiplicative bias correction technique is considered. A robust estimator of the finite population quantile function based on multiplicative bias correction is derived with the aid of a superpopulation model. Most studies have concentrated on kernel smoothers in the estimation of regression functions. This technique has also been applied to various methods of nonparametric estimation of the finite population quantile already under review. A major problem with the use of nonparametric kernel-based regression over a finite interval, such as the estimation of finite population quantities, is bias at boundary points. By correcting the boundary problems associated with previous model-based estimators, the multiplicative bias corrected estimator produced better results in estimating the finite population quantile function. Furthermore, the asymptotic behavior of the proposed estimator is presented. It is observed that the estimator is asymptotically unbiased and statistically consistent when certain conditions are satisfied. The simulation results show that the suggested estimator performs quite well in terms of relative bias, mean squared error, and relative root mean squared error. As a result, the multiplicative bias corrected estimator is strongly suggested for survey sampling estimation of the finite population quantile function.

Keywords: Multiplicative Technique, Simple Random Sampling without Replacement


Introduction
In recent years, the estimation of population distribution functions in the context of survey sampling has received considerable attention. A particular focus of this attention has been the median, which is often considered a more appropriate measure of location than the mean, especially when the variable of interest follows a skewed distribution. Estimators of the population mean or total can typically be improved substantially when appropriate auxiliary information is available. Accordingly, the use of auxiliary information in sample quantile estimators seems highly desirable. Using known auxiliary information at both the selection stage and the estimation stage leads to better estimation strategies in survey sampling. If such information is not fully known or is missing, and information on the auxiliary variable(s) is relatively cheap to obtain, one may consider taking a large preliminary sample to estimate the population mean(s) of the auxiliary variable(s).
In traditional kernel estimation it is generally held that the performance of kernel methods depends largely on the smoothing bandwidth and very little on the type of kernel. Most kernels used are symmetric and remain fixed once chosen. This may be adequate for estimating curves with unbounded support, but not for curves that have compact support and are discontinuous at boundary points. For curves of this kind, a fixed kernel shape leads to boundary bias. This boundary bias arises because the fixed symmetric kernel allocates weight outside the support of the distribution when smoothing close to the boundary. In addition, standard kernel methods yield wiggly estimates in the tail of the distribution, because reducing the boundary bias requires a small bandwidth that prevents the pooling of enough data. Moreover, as noted in [1] for the estimation of a probability density function, the standard kernel estimator "works well for densities not far from Gaussian in shape"; however, it can perform very poorly when the shape is far from Gaussian, particularly near the boundary.
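The boundary effect described above can be illustrated with a small numerical sketch. It is not taken from the paper: the Epanechnikov kernel, the bandwidth of 0.1 and the evenly spaced points used as a deterministic stand-in for draws from a uniform distribution on [0, 1] are all assumptions made for illustration. The true density equals 1 everywhere on [0, 1], so any departure from 1 is bias.

```python
import numpy as np

def epanechnikov(u):
    """Symmetric Epanechnikov kernel supported on [-1, 1]."""
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u ** 2), 0.0)

def kernel_density(t, data, h):
    """Standard fixed-kernel density estimate at the point t."""
    return np.sum(epanechnikov((t - data) / h)) / (len(data) * h)

# Assumption: evenly spaced midpoints stand in for Uniform(0, 1) observations,
# so the illustration is deterministic and free of sampling noise.
n = 1000
data = (np.arange(n) + 0.5) / n
h = 0.1  # illustrative bandwidth

density_interior = kernel_density(0.5, data, h)  # kernel lies fully inside [0, 1]
density_boundary = kernel_density(0.0, data, h)  # half the kernel lies outside [0, 1]

print("estimate at t = 0.5:", density_interior)
print("estimate at t = 0.0:", density_boundary)
```

In this toy setting the interior estimate is close to the true value 1, while the estimate at the boundary is close to one half, because half of the kernel weight falls outside the support.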
Boundary bias is a well-known problem, and several scholars have proposed ways to eliminate it. In the context of nonparametric regression, [2] [3] [4] proposed the use of boundary kernels, while [5] used Richardson's extrapolation to combine two kernel estimates with different bandwidths. In density estimation, [6] proposed data reflection, [7] considered empirical transformations, and [8] proposed a framework of jackknife methods for correcting boundary bias. More recently, [9] [10] showed that, in nonparametric regression, the local linear smoother is free of boundary bias and achieves the optimal rate of convergence for the mean integrated squared error. It is interesting to note that a local linear smoother uses a fixed kernel in its initial form, while the local least squares fit implicitly employs different kernels at different places. The transformation method is among the numerous methods suggested for dealing with data on $[0, +\infty)$. In order to minimize the boundary bias in the density estimation framework, [1] [11] [12], among others, studied general transformation methods. A transformation may work only under particular conditions, and it is important to select the appropriate transformation by analyzing the subject matter and related studies.
The estimation of population quantiles is of great interest when a parametric form for the underlying distribution is not available. Quantile estimation plays an important role in a broad range of statistical applications: the Q-Q plot, goodness-of-fit testing, and the computation of extreme quantiles and value at risk in insurance and financial risk management. Also, a large class of actuarial risk measures can be defined as functionals of quantiles (see [13]). Most contributions estimate the pth quantile with a kernel function under simple random sampling (SRS). The reader is referred to [14] [15] [16].
Quantile estimation has been used intensively in many fields. Most of the existing quantile estimators suffer from either bias or inefficiency at high probability levels. In order to correct the bias problems, [17] suggested several nonparametric quantile estimators based on the beta kernel and applied them to transformed data. A Monte Carlo study showed that those estimators improve on the efficiency of the traditional ones, not only for light-tailed distributions but also for heavy-tailed ones, when the probability level is close to 1. [18] used a transformed kernel estimate. In their study, they overcame this inconsistency by using a new approach based on the modified Champernowne distribution, which behaves like the Pareto distribution.
As a result, the aim of this paper is to develop a nonparametric estimator for the quantile function of finite populations using a bias corrected approach to address the shortcomings of previously studied estimation methods. There are two unique features about this approach. One is that it ensures an accurate estimate and the other is that it reduces the estimation bias with negligible increase in variance.
The Multiplicative Bias Correction (MBC) approach was first considered in [19], and the results obtained showed that the resulting estimator of the regression function had desirable properties compared to existing estimators, including solving the boundary problems. This form of correction is especially well suited to non-negative regression functions because it does not change the sign of the regression function; it also reduces the estimation bias with a negligible increase in variance. Although there is always a bias-variance trade-off for nonparametric smoothers in finite samples, smoothers can be constructed whose asymptotic bias converges to zero while the asymptotic variance remains the same. For a deeper discussion of the Multiplicative Bias Correction technique we refer the reader to [20] [21] [22] [23].
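To make the idea concrete, the following is a minimal sketch of one common form of multiplicative bias correction for a regression smoother. A pilot Nadaraya-Watson estimate is multiplied by a smoothed version of the ratios of the observations to the pilot fit. It is not claimed to be the exact estimator of [19] or [24]; the Gaussian kernel, the bandwidth, the noiseless linear mean function and the function names are assumptions chosen for illustration.

```python
import numpy as np

def gaussian_kernel(u):
    """Unnormalised Gaussian kernel (the constant cancels in the weights)."""
    return np.exp(-0.5 * u ** 2)

def nw_smooth(x_eval, x, y, h):
    """Nadaraya-Watson smoother evaluated at each point of x_eval."""
    x_eval = np.atleast_1d(np.asarray(x_eval, dtype=float))
    weights = gaussian_kernel((x_eval[:, None] - x[None, :]) / h)
    return (weights @ y) / weights.sum(axis=1)

def mbc_smooth(x_eval, x, y, h):
    """Pilot smoother times a smoothed multiplicative correction factor."""
    pilot_at_data = nw_smooth(x, x, y, h)       # pilot fit at the design points
    pilot_at_eval = nw_smooth(x_eval, x, y, h)  # pilot fit where we want estimates
    ratios = y / pilot_at_data                  # requires a strictly positive pilot fit
    correction = nw_smooth(x_eval, x, ratios, h)
    return pilot_at_eval * correction

# Assumed toy data: noiseless positive mean function m(x) = 2 + x on [0, 1].
x = np.linspace(0.0, 1.0, 201)
y = 2.0 + x
h = 0.1

nw_at_boundary = nw_smooth(0.0, x, y, h)[0]
mbc_at_boundary = mbc_smooth(0.0, x, y, h)[0]
print("true value at x = 0: 2.0")
print("pilot (Nadaraya-Watson) estimate:", nw_at_boundary)
print("multiplicative bias corrected estimate:", mbc_at_boundary)
```

In this toy case the corrected value at the boundary point x = 0 lies closer to the true value than the pilot estimate. Because the correction is a product of positive quantities, the sign of the estimate is preserved, which is the property noted above for non-negative regression functions.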
Outline of the paper. In Section 2, we propose an estimator for the finite population quantile function using a bias correction technique. Asymptotic properties of the proposed estimator are derived in Section 3. An empirical study is presented in Section 4, and conclusions are given in Section 5.

Proposed Estimator
In survey sampling, we are time and again interested in studying the distribution of a survey variable. By the $\alpha$-quantile of the distribution, we mean the value $Q(\alpha) = \inf\{t : F_y(t) \geq \alpha\}$. One way of designing quantile estimators is to invert an estimator of the distribution function. Let $\hat{F}_y(t)$ denote an estimator of $F_y(t)$; the corresponding quantile estimator is $\hat{Q}(\alpha) = \inf\{t : \hat{F}_y(t) \geq \alpha\}$. Since the estimator $\hat{F}_y$ is often a step function, the quantile estimator may not be smooth.
In this section we discuss a quantile estimator derived from a model-based multiplicative bias correction distribution function estimator that integrates auxiliary information. This distribution function estimator was introduced by [24]. The quantile estimator is based on inverting the [24] distribution function estimator. We derive a Bahadur representation for the quantile estimator.
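Let $F$ be a probability distribution function. The population quantile of order $\alpha$ is defined as

$$Q(\alpha) = \inf\{t : F(t) \geq \alpha\}$$

for $0 < \alpha < 1$. If $F$ is continuous and strictly increasing, then $Q(\alpha) = F^{-1}(\alpha)$.

Suppose that the $X_i$'s, for $i = 1, 2, \ldots, N$, are independent and identically distributed (i.i.d.) random variables with conforming survey values $Y_i$, and that $Y_1, Y_2, \ldots, Y_n$ are independent, identically distributed random variables, each with common distribution function $F$. For all real $t$, the empirical distribution function is

$$F_n(t) = \frac{1}{n} \sum_{i=1}^{n} I(Y_i \leq t).$$

The sample quantile of order $\alpha$ is defined as

$$Q_n(\alpha) = \inf\{t : F_n(t) \geq \alpha\}.$$

The sample quantile of order $\alpha$ is a strongly consistent estimator of $Q(\alpha)$.

The finite population distribution function, Equation (4), is $F_N(t) = \frac{1}{N} \sum_{i=1}^{N} I(y_i \leq t)$, and the survey values are assumed to follow the superpopulation model

$$y_i = m(x_i) + \sigma(x_i) e_i,$$

where $\sigma(x_i)$ is a known function of $x_i$ that accounts for heteroscedasticity and the $e_i$'s are independent and identically distributed (i.i.d.) random variables with mean 0 and variance $\sigma^2$. Under the model-based approach, Equation (4) can be expressed as

$$F_N(t) = \frac{1}{N} \left[ \sum_{i \in s} I(y_i \leq t) + \sum_{j \notin s} I(y_j \leq t) \right], \qquad (7)$$

where the first term represents the sampled part and is known, while the second term represents the non-sampled part and is unknown. The problem is estimating the second term of Equation (7). To estimate Equation (7), [24] proposed a multiplicative bias corrected estimator $\hat{F}_{MBC}(t)$ for the finite population distribution function, Equation (8). In this study, we propose a multiplicative bias corrected quantile estimator for the finite population, based on the finite population distribution function estimator in Equation (8), given by

$$\hat{Q}_{MBC}(\alpha) = \inf\{t : \hat{F}_{MBC}(t) \geq \alpha\}.$$

The problem is to estimate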
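The inversion step can be sketched in a few lines of code. The sketch below is illustrative only. It splits the distribution function into a sampled part and a non-sampled part, as in the decomposition above, but it replaces the multiplicative bias corrected estimator of [24] with a simple plug-in that uses hypothetical predicted values for the non-sampled units. The function names and the toy numbers are assumptions.

```python
import numpy as np

def distribution_estimate(t, y_sample, y_pred_nonsample):
    """Plug-in estimate of the finite population distribution function at t."""
    y_sample = np.asarray(y_sample, dtype=float)
    y_pred_nonsample = np.asarray(y_pred_nonsample, dtype=float)
    population_size = y_sample.size + y_pred_nonsample.size
    sampled_part = np.sum(y_sample <= t)              # known: observed survey values
    nonsampled_part = np.sum(y_pred_nonsample <= t)   # assumed: model predictions
    return (sampled_part + nonsampled_part) / population_size

def quantile_estimate(alpha, y_sample, y_pred_nonsample):
    """Smallest candidate t whose estimated distribution function reaches alpha."""
    candidates = np.sort(np.concatenate([
        np.asarray(y_sample, dtype=float),
        np.asarray(y_pred_nonsample, dtype=float),
    ]))
    # The estimate is a step function that only jumps at the candidate values,
    # so the infimum over all real t is attained at one of them.
    for t in candidates:
        if distribution_estimate(t, y_sample, y_pred_nonsample) >= alpha:
            return float(t)
    return float(candidates[-1])

# Hypothetical toy population of size 10: four sampled values, six predicted ones.
y_sample = [1.0, 2.0, 3.0, 4.0]
y_pred_nonsample = [5.0, 6.0, 7.0, 8.0, 9.0, 10.0]

median_estimate = quantile_estimate(0.5, y_sample, y_pred_nonsample)
lower_quartile_estimate = quantile_estimate(0.25, y_sample, y_pred_nonsample)
print(median_estimate, lower_quartile_estimate)  # prints 5.0 3.0
```

Any other estimator of the distribution function, including a bias corrected one, could be substituted for `distribution_estimate` without changing the inversion logic.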

Asymptotic Unbiasedness
In simple random sampling, the required limiting result holds when the sample size $n$ is sufficiently large. Proof: See [27].
We now study the properties of the estimator $\hat{Q}_{MBC}(\alpha)$. Substituting the results of Equation (16) into Equation (15) yields the required expression. The linear approximation previously used by [30] [31] helps to study the asymptotic properties of the estimator. On the other hand, the behavior of the distribution function estimator is established in [24]. In this way, and by using Equation (18), the bias of $\hat{Q}_{MBC}(\alpha)$ can be obtained. It converges to zero at a faster rate. Therefore, $\hat{Q}_{MBC}(\alpha)$ is asymptotically unbiased.

Asymptotic Variance
The asymptotic variance of $\hat{Q}_{MBC}(\alpha)$ is obtained as follows. Consider Bahadur's representation of the estimator, Equation (20). Applying the variance operator to both sides of Equation (20) gives the asymptotic variance of $\hat{Q}_{MBC}(\alpha)$.
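For orientation, the classical Bahadur representation for the sample quantile of an i.i.d. sample is sketched below, together with the variance approximation it implies. This is the standard textbook form, stated under the assumption that $F$ has a density $f$ that is positive at $Q(\alpha)$. The last line is only the analogous form one would expect for the proposed estimator and is not a reproduction of the paper's Equation (20).

```latex
% Classical Bahadur representation (i.i.d. sample, f(Q(\alpha)) > 0 assumed):
Q_n(\alpha) = Q(\alpha)
  + \frac{\alpha - F_n\bigl(Q(\alpha)\bigr)}{f\bigl(Q(\alpha)\bigr)} + R_n ,
\qquad R_n = o_p\bigl(n^{-1/2}\bigr).

% Taking variances and dropping the remainder term:
\operatorname{Var}\bigl(Q_n(\alpha)\bigr)
  \approx \frac{\operatorname{Var}\bigl(F_n(Q(\alpha))\bigr)}{f^{2}\bigl(Q(\alpha)\bigr)}
  = \frac{\alpha(1-\alpha)}{n\, f^{2}\bigl(Q(\alpha)\bigr)} .

% Analogous form (an assumption) for the proposed estimator:
\operatorname{Var}\bigl(\hat{Q}_{MBC}(\alpha)\bigr)
  \approx \frac{\operatorname{Var}\bigl(\hat{F}_{MBC}(Q(\alpha))\bigr)}{f^{2}\bigl(Q(\alpha)\bigr)} .
```

The representation shows why the variance of a quantile estimator is governed by the variance of the underlying distribution function estimator, scaled by the squared density at the quantile.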

Asymptotic Mean Squared Error
The asymptotic mean squared error of the estimator is given by Equation (23), which combines the asymptotic variance with the squared bias.
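The decomposition behind this statement is the usual one for any estimator. It is written here for $\hat{Q}_{MBC}(\alpha)$ as a reminder, not as a reconstruction of the paper's Equation (23).

```latex
\operatorname{MSE}\bigl(\hat{Q}_{MBC}(\alpha)\bigr)
  = E\Bigl[\bigl(\hat{Q}_{MBC}(\alpha) - Q(\alpha)\bigr)^{2}\Bigr]
  = \operatorname{Var}\bigl(\hat{Q}_{MBC}(\alpha)\bigr)
  + \Bigl[\operatorname{Bias}\bigl(\hat{Q}_{MBC}(\alpha)\bigr)\Bigr]^{2} .
```

If the bias vanishes asymptotically, the asymptotic mean squared error is driven by the variance term.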

Empirical Study
The main purpose of this section is to compare the performance of the proposed estimator with that of existing quantile estimators. The results of this simulation study are summarized in Table 2. Table 2 shows the unconditional biases, Relative Mean Error (RME) and Relative Root Mean Squared Error (RRMSE) for the estimators at various values of the quantile level $\alpha$ (i.e. 0.25, 0.5 and 0.75). Linear and cosine mean functions were used to obtain the tabulated results. Similar results and conclusions can be obtained using other mean functions such as quadratic, sine, bump, etc. To analyze the performance of the proposed estimator against some specified estimators, the unconditional relative measures were computed from the samples $X_{s_1}, X_{s_2}, \ldots, X_{s_n}$, where $N$ is the number of replications. The RME measures how close the estimator under consideration is to the actual value, while the RRMSE measures the accuracy of the estimator. For instance, an estimator such as MBCQE is said to be "better" than, or preferable to, the other estimators if its RRMSE is comparatively smaller.
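A minimal sketch of how such performance measures can be computed over simulation replications is given below. The paper's exact formulas are not reproduced here. The definitions used (mean error divided by the true value, and root mean squared error divided by the true value) are common conventions adopted as assumptions, and the replicate values are made up purely to show the arithmetic.

```python
import numpy as np

def relative_bias(estimates, true_value):
    """Average deviation from the true value, relative to the true value."""
    estimates = np.asarray(estimates, dtype=float)
    return np.mean(estimates - true_value) / true_value

def relative_rmse(estimates, true_value):
    """Root mean squared error, relative to the true value."""
    estimates = np.asarray(estimates, dtype=float)
    return np.sqrt(np.mean((estimates - true_value) ** 2)) / true_value

# Hypothetical quantile estimates from four replications; true quantile = 10.
replicate_estimates = [9.0, 11.0, 10.0, 10.0]
true_quantile = 10.0  # assumed non-zero so that the relative measures are defined

rb = relative_bias(replicate_estimates, true_quantile)
rrmse = relative_rmse(replicate_estimates, true_quantile)
print("relative bias:", rb)
print("relative RMSE:", rrmse)
```

In this example the errors cancel, so the relative bias is zero, while the relative RMSE is positive. This is why an estimator is usually judged on both measures rather than on bias alone.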
The bias of a quantile estimator is the deviation of the expected value of the estimator from the true quantile value. All of the quantile estimators considered here are biased, but MBCQE exhibits a comparatively smaller bias. MBCQE can be seen to be a very efficient estimator of the empirical quantile function at all levels of $\alpha$, followed closely by RKMQE and FAQE. CDQE proved to be a very inefficient estimator at all levels of $\alpha$.
Further, the estimators were compared with respect to the empirical quantile function, which affirmed the results tabulated above. Table 3 and Table 4 give a tabulation of all the estimators listed below. From the plots it can be seen that MBCQE and RKMQE performed equally well and better than all the other estimators of the true quantile function, and that sample balancing does not affect the performance of the estimators.

Conclusions and Suggestions
In conclusion, using the results from Tables 2-4 and Figures 1-6, MBCQE was found to be an efficient estimator of the quantile function for a finite population.
NWQE was found to be the least efficient of all the estimators, with large conditional bias, relative absolute bias and mean squared error compared to the other estimators. MBCQE can therefore be used to estimate quantile functions for various units in the population in various sectors of the economy. Finally, further work can be done on the construction of confidence intervals for the proposed estimator, and researchers can investigate other bias correction strategies, such as adaptive boosting and bootstrap bias reduction techniques, in quantile function estimation.