Estimation of Population Variance Using the Coefficient of Kurtosis and Median of an Auxiliary Variable under Simple Random Sampling

In this study we propose a modified ratio type estimator for the population variance of the study variable y under simple random sampling without replacement, making use of the coefficient of kurtosis and the median of an auxiliary variable x. The estimator’s properties are derived up to the first order of the Taylor series expansion. The efficiency conditions under which the proposed estimator performs better than existing estimators are derived theoretically. Empirical studies using real populations demonstrate the performance of the developed estimator in comparison with the existing estimators. As the empirical studies illustrate, the proposed estimator performs better than the existing estimators under the specified conditions, i.e. it has the smallest Mean Squared Error and the highest Percentage Relative Efficiency. The developed estimator is therefore suitable for situations in which the variable of interest is positively correlated with the auxiliary variable.

1. Introduction

It is well known that the appropriate use of auxiliary information in probability sampling designs yields a considerable reduction in the variance of the estimators of population parameters, namely the population mean, median, variance, regression coefficient and correlation coefficient.  was the first to show the contribution of known auxiliary information in improving the efficiency of the estimator of the population mean $\stackrel{¯}{Y}$ in survey sampling.

Sample surveys now touch almost every field of scientific study, including demography, education, energy, transportation, health care, economics, forestry, sociology and politics. It is no exaggeration to say that much of the data that are statistically analyzed are collected in surveys. As the demand for surveys increases, so does the need for more effective methods of analyzing and interpreting the resulting data. A measure of precision is a prime requirement of a good survey and now appears in most analyses, so it needs to be obtained for almost every estimate derived from the survey data.

We frequently encounter surveys in which an auxiliary variable x is cheaper to monitor (in time and money) than the study variable y. Use of auxiliary information can increase the precision of an estimator when the study variable y is highly correlated with the auxiliary variable x. Such situations do occur in practice, for example, the number of trees in an orchard and the yield of fruit.

The most common and widely used measure of precision is the variance of the survey estimator. In practice population variances are rarely known and must be estimated from the survey data themselves. In this study we are interested in the estimation of the population variance using known auxiliary information under the simple random sampling without replacement (SRSWOR) scheme. When the auxiliary variable is well correlated with the study variable, the ratio, product and regression estimators are generally more precise than the estimator based on the simple random sample alone.

Consider a finite population $V=\left\{{V}_{1},{V}_{2},{V}_{3},\cdots ,{V}_{N}\right\}$ of N distinct identifiable units. Let Y be our study variable and X be its corresponding auxiliary variable. Suppose we take a random sample of size n from this bivariate population $\left(Y,X\right)$ that is $\left({y}_{i},{x}_{i}\right)$ , for $i=1,2,3,\cdots ,n$ using a Simple Random Sampling Without Replacement (SRSWOR) method. Let $\stackrel{¯}{Y}$ and $\stackrel{¯}{X}$ be the population means of the study and auxiliary variable respectively and their corresponding sample means be $\stackrel{¯}{y}$ and $\stackrel{¯}{x}$ .

This study focuses on improving the efficiency in the estimation of

${S}_{y}^{2}=\frac{1}{N-1}{\sum }_{i=1}^{N}{\left({Y}_{i}-\stackrel{¯}{Y}\right)}^{2}$ (1)

using the coefficient of kurtosis and median.

We define the following notations that we will use throughout the article. For the population observations we have;

$\stackrel{¯}{Y}=\frac{1}{N}{\sum }_{i=1}^{N}{Y}_{i}$ , $\stackrel{¯}{X}=\frac{1}{N}{\sum }_{i=1}^{N}{X}_{i}$ , ${S}_{y}^{2}=\frac{1}{N-1}{\sum }_{i=1}^{N}{\left({Y}_{i}-\stackrel{¯}{Y}\right)}^{2}$ ,

${S}_{x}^{2}=\frac{1}{N-1}{\sum }_{i=1}^{N}{\left({X}_{i}-\stackrel{¯}{X}\right)}^{2}$ , ${S}_{xy}=\frac{1}{N-1}{\sum }_{i=1}^{N}\left({Y}_{i}-\stackrel{¯}{Y}\right)\left({X}_{i}-\stackrel{¯}{X}\right)$ .

Also we define the following from the sample observations:

$\stackrel{¯}{y}=\frac{1}{n}{\sum }_{i=1}^{n}{y}_{i}$ , $\stackrel{¯}{x}=\frac{1}{n}{\sum }_{i=1}^{n}{x}_{i}$ , ${s}_{y}^{2}=\frac{1}{n-1}{\sum }_{i=1}^{n}{\left({y}_{i}-\stackrel{¯}{y}\right)}^{2}$ ,

${s}_{x}^{2}=\frac{1}{n-1}{\sum }_{i=1}^{n}{\left({x}_{i}-\stackrel{¯}{x}\right)}^{2}$ , ${s}_{xy}=\frac{1}{n-1}{\sum }_{i=1}^{n}\left({y}_{i}-\stackrel{¯}{y}\right)\left({x}_{i}-\stackrel{¯}{x}\right)$ .

In general, we define the following parameters:

${\mu }_{rs}=\frac{1}{N-1}{\sum }_{i=1}^{N}{\left({Y}_{i}-\stackrel{¯}{Y}\right)}^{r}{\left({X}_{i}-\stackrel{¯}{X}\right)}^{s}$ (2)

${\lambda }_{rs}=\frac{{\mu }_{rs}}{{\mu }_{20}^{\frac{r}{2}}{\mu }_{02}^{\frac{s}{2}}}$ (3)

Thus we note the following;

${\mu }_{20}={S}_{y}^{2}$ , ${\mu }_{02}={S}_{x}^{2}$ and ${\mu }_{11}={S}_{xy}$ ; ${\lambda }_{22}=\frac{{\mu }_{22}}{{\mu }_{20}{\mu }_{02}}$ and ${\lambda }_{21}=\frac{{\mu }_{21}}{{\mu }_{20}{\mu }_{02}^{\frac{1}{2}}}$ . Further, ${C}_{y}=\frac{{S}_{y}}{\stackrel{¯}{Y}}=\frac{\sqrt{{\mu }_{20}}}{\stackrel{¯}{Y}}$ is the coefficient of variation of the study variable y, ${C}_{x}=\frac{{S}_{x}}{\stackrel{¯}{X}}=\frac{\sqrt{{\mu }_{02}}}{\stackrel{¯}{X}}$ is the coefficient of variation of the auxiliary variable x, ${\rho }_{xy}=\frac{{S}_{xy}}{{S}_{x}{S}_{y}}=\frac{{\mu }_{11}}{\sqrt{{\mu }_{20}}\sqrt{{\mu }_{02}}}$ is the coefficient of correlation between x and y, ${\kappa }_{y}={\lambda }_{40}=\frac{{\mu }_{40}}{{\mu }_{20}^{2}}$ is the coefficient of kurtosis of the study variable, ${\kappa }_{x}={\lambda }_{04}=\frac{{\mu }_{04}}{{\mu }_{02}^{2}}$ is the coefficient of kurtosis of the auxiliary variable and ${M}_{x}$ is the population median of the auxiliary variable.
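The population parameters defined above can be computed directly from a fully enumerated population. The following is a minimal sketch, assuming a small made-up population (the values of `Y` and `X` are hypothetical and not taken from this study); the helper names `mu` and `lam` are illustrative and simply mirror Equations (2) and (3).

```python
import statistics
from math import sqrt

def mu(y, x, r, s):
    """Central product moment mu_rs with the 1/(N-1) divisor of Equation (2)."""
    y_bar, x_bar = statistics.fmean(y), statistics.fmean(x)
    total = sum((yi - y_bar) ** r * (xi - x_bar) ** s for yi, xi in zip(y, x))
    return total / (len(y) - 1)

def lam(y, x, r, s):
    """Standardized moment lambda_rs of Equation (3)."""
    return mu(y, x, r, s) / (mu(y, x, 2, 0) ** (r / 2) * mu(y, x, 0, 2) ** (s / 2))

# Hypothetical toy population of N = 8 units (illustrative values only).
Y = [12.0, 15.0, 11.0, 19.0, 24.0, 17.0, 30.0, 22.0]
X = [3.0, 4.0, 2.5, 5.0, 7.0, 4.5, 9.0, 6.0]

S2_y, S2_x = mu(Y, X, 2, 0), mu(Y, X, 0, 2)
params = {
    "C_y": sqrt(S2_y) / statistics.fmean(Y),   # coefficient of variation of y
    "C_x": sqrt(S2_x) / statistics.fmean(X),   # coefficient of variation of x
    "rho_xy": lam(Y, X, 1, 1),                 # correlation coefficient
    "kappa_y": lam(Y, X, 4, 0),                # kurtosis of y
    "kappa_x": lam(Y, X, 0, 4),                # kurtosis of x
    "lambda_22": lam(Y, X, 2, 2),
    "M_x": statistics.median(X),               # population median of x
}
for name, value in params.items():
    print(f"{name:>9} = {value:.4f}")
```

Because the same divisor appears in numerator and denominator, the correlation coefficient obtained this way does not depend on whether N or N-1 is used, whereas the kurtosis values do.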

Many authors have obtained more precise estimators by employing prior knowledge of certain population parameter(s).  for example attempted to use the coefficient of variation of the study variable, but this proved inadequate because in practice this parameter is unknown. Motivated by  work,   and  used the known coefficient of variation of the auxiliary variable instead for estimating the population mean of the study variable. Reasoning along the same lines,  used the prior value of the coefficient of kurtosis of an auxiliary variable in estimating the population variance of the study variable y.

Kurtosis is rarely reported or used in research articles, in spite of the fact that virtually every statistical package provides a measure of kurtosis. This may be attributed to the likelihood that kurtosis is not well understood or that its importance in various aspects of statistical analysis has not been fully explored. Kurtosis can simply be expressed as

$\kappa =\frac{E{\left(x-\mu \right)}^{4}}{{\left(E{\left(x-\mu \right)}^{2}\right)}^{2}}=\frac{{\mu }_{4}}{{\sigma }^{4}}$ (4)

where $E$ is the expectation operator, $\mu$ is the mean, ${\mu }_{4}$ is the fourth moment about the mean and $\sigma$ is the standard deviation.

The median, being the middlemost value in a distribution (when the values are arranged in ascending or descending order), has the advantage of being less affected by outliers and skewed data, and is thus preferred to the mean especially when the distribution is not symmetrical. We can therefore utilize the median and the coefficient of kurtosis of the auxiliary variable to derive a more precise ratio type estimator of the population variance.

2. Existing Population Variance Estimators

In this section we review some finite population variance estimators existing in the literature which help in the construction and development of the proposed estimator. When auxiliary information is not available, the usual unbiased estimator of the population variance is

${t}_{1}={s}_{y}^{2}$ (5)

The bias and MSE of ${t}_{1}$ are

$Bias\left({t}_{1}\right)=\frac{1-f}{n}{S}_{y}^{2}\left\{\left({\kappa }_{x}-1\right){\Psi }_{1}\left({\Psi }_{1}-\frac{{\lambda }_{22}-1}{{\kappa }_{x}-1}\right)\right\}=0$ (6)

$\begin{array}{c}MSE\left({t}_{1}\right)=Var\left({t}_{1}\right)=\frac{1-f}{n}{S}_{y}^{4}\left\{\left({\kappa }_{y}-1\right)+\left({\kappa }_{x}-1\right){\Psi }_{1}\left({\Psi }_{1}-2\left(\frac{{\lambda }_{22}-1}{{\kappa }_{x}-1}\right)\right)\right\}\\ =\frac{1-f}{n}{S}_{y}^{4}\left({\kappa }_{y}-1\right)\end{array}$ (7)

where ${\Psi }_{1}=0$ and $f=\frac{n}{N}$ is the sampling fraction.

The estimation of the population variance ${S}_{y}^{2}$ using auxiliary information was considered by  , who proposed the ratio type population variance estimator given by

${t}_{2}={s}_{y}^{2}\frac{{S}_{x}^{2}}{{s}_{x}^{2}}$ (8)

The bias and Mean Squared Error of Isaki’s estimator are

$Bias\left({t}_{2}\right)=\frac{1-f}{n}{S}_{y}^{2}\left\{\left({\kappa }_{x}-1\right){\Psi }_{2}\left({\Psi }_{2}-\frac{{\lambda }_{22}-1}{{\kappa }_{x}-1}\right)\right\}=\frac{1-f}{n}{S}_{y}^{2}\left[\left({\kappa }_{x}-1\right)-\left({\lambda }_{22}-1\right)\right]$ (9)

$\begin{array}{c}MSE\left({t}_{2}\right)=\frac{1-f}{n}{S}_{y}^{4}\left\{\left({\kappa }_{y}-1\right)+\left({\kappa }_{x}-1\right){\Psi }_{2}\left({\Psi }_{2}-2\left(\frac{{\lambda }_{22}-1}{{\kappa }_{x}-1}\right)\right)\right\}\\ =\frac{1-f}{n}{S}_{y}^{4}\left[\left({\kappa }_{y}-1\right)+\left({\kappa }_{x}-1\right)-2\left({\lambda }_{22}-1\right)\right]\end{array}$ (10)

where ${\Psi }_{2}=1$

 initiated the use of the coefficient of kurtosis in estimating the population variance of a study variable y. Later, the coefficient of kurtosis was used by    in estimating the population mean.

 using the known information on both ${S}_{x}^{2}$ and ${\kappa }_{x}$ suggested a modified ratio type population variance estimator for ${S}_{y}^{2}$ as

${t}_{3}={s}_{y}^{2}\left[\frac{{S}_{x}^{2}+{\kappa }_{x}}{{s}_{x}^{2}+{\kappa }_{x}}\right]$ (11)

The bias and MSE of the estimator ${t}_{3}$ are obtained as

$Bias\left({t}_{3}\right)=\frac{1-f}{n}{S}_{y}^{2}\left[\left(\left\{{\kappa }_{x}-1\right\}\right){\Psi }_{3}\left({\Psi }_{3}-\frac{{\lambda }_{22}-1}{{\kappa }_{x}-1}\right)\right]$ (12)

$MSE\left({t}_{3}\right)=\frac{1-f}{n}{S}_{y}^{4}\left[\left\{{\kappa }_{y}-1\right\}+\left\{{\kappa }_{x}-1\right\}{\Psi }_{3}\left({\Psi }_{3}-2\left(\frac{{\lambda }_{22}-1}{{\kappa }_{x}-1}\right)\right)\right]$ (13)

where ${\Psi }_{3}=\frac{{S}_{x}^{2}}{{S}_{x}^{2}+{\kappa }_{x}}$

 suggested four modified ratio type variance estimators using known values of ${C}_{x}$ and ${\kappa }_{x}$ ,

${t}_{4}={s}_{y}^{2}\left\{\frac{{S}_{x}^{2}-{C}_{x}}{{s}_{x}^{2}-{C}_{x}}\right\}$ (14)

${t}_{5}={s}_{y}^{2}\left\{\frac{{S}_{x}^{2}-{\kappa }_{x}}{{s}_{x}^{2}-{\kappa }_{x}}\right\}$ (15)

${t}_{6}={s}_{y}^{2}\left\{\frac{{S}_{x}^{2}{\kappa }_{x}-{C}_{x}}{{s}_{x}^{2}{\kappa }_{x}-{C}_{x}}\right\}$ (16)

${t}_{7}={s}_{y}^{2}\left\{\frac{{S}_{x}^{2}{C}_{x}-{\kappa }_{x}}{{s}_{x}^{2}{C}_{x}-{\kappa }_{x}}\right\}$ (17)

The biases and MSEs of their estimators are

$Bias\left({t}_{4}\right)=\frac{1-f}{n}{S}_{y}^{2}\left({\kappa }_{x}-1\right)\left\{{\Psi }_{4}\left({\Psi }_{4}-\frac{{\lambda }_{22}-1}{{\kappa }_{x}-1}\right)\right\}$ (18)

$MSE\left({t}_{4}\right)=\frac{1-f}{n}{S}_{y}^{4}\left\{\left({\kappa }_{y}-1\right)+{\Psi }_{4}\left({\kappa }_{x}-1\right)\left({\Psi }_{4}-2\left(\frac{{\lambda }_{22}-1}{{\kappa }_{x}-1}\right)\right)\right\}$ (19)

$Bias\left({t}_{5}\right)=\frac{1-f}{n}{S}_{y}^{2}\left({\kappa }_{x}-1\right)\left\{{\Psi }_{5}\left({\Psi }_{5}-\left(\frac{{\lambda }_{22}-1}{{\kappa }_{x}-1}\right)\right)\right\}$ (20)

$MSE\left({t}_{5}\right)=\frac{1-f}{n}{S}_{y}^{4}\left\{\left({\kappa }_{y}-1\right)+{\Psi }_{5}\left({\kappa }_{x}-1\right)\left({\Psi }_{5}-2\left(\frac{{\lambda }_{22}-1}{{\kappa }_{x}-1}\right)\right)\right\}$ (21)

$Bias\left({t}_{6}\right)=\frac{1-f}{n}{S}_{y}^{2}\left({\kappa }_{x}-1\right)\left\{{\Psi }_{6}\left({\Psi }_{6}-\left(\frac{{\lambda }_{22}-1}{{\kappa }_{x}-1}\right)\right)\right\}$ (22)

$MSE\left({t}_{6}\right)=\frac{1-f}{n}{S}_{y}^{4}\left\{\left({\kappa }_{y}-1\right)+{\Psi }_{6}\left({\kappa }_{x}-1\right)\left({\Psi }_{6}-2\left(\frac{{\lambda }_{22}-1}{{\kappa }_{x}-1}\right)\right)\right\}$ (23)

$Bias\left({t}_{7}\right)=\frac{1-f}{n}{S}_{y}^{2}\left({\kappa }_{x}-1\right)\left\{{\Psi }_{7}\left({\Psi }_{7}-\left(\frac{{\lambda }_{22}-1}{{\kappa }_{x}-1}\right)\right)\right\}$ (24)

$MSE\left({t}_{7}\right)=\frac{1-f}{n}{S}_{y}^{4}\left\{\left({\kappa }_{y}-1\right)+{\Psi }_{7}\left({\kappa }_{x}-1\right)\left({\Psi }_{7}-2\left(\frac{{\lambda }_{22}-1}{{\kappa }_{x}-1}\right)\right)\right\}$ (25)

where;

${\Psi }_{4}=\frac{{S}_{x}^{2}}{{S}_{x}^{2}-{C}_{x}}$ ; ${\Psi }_{5}=\frac{{S}_{x}^{2}}{{S}_{x}^{2}-{\kappa }_{x}}$ ; ${\Psi }_{6}=\frac{{S}_{x}^{2}{\kappa }_{x}}{{S}_{x}^{2}{\kappa }_{x}-{C}_{x}}$ ; ${\Psi }_{7}=\frac{{S}_{x}^{2}{C}_{x}}{{S}_{x}^{2}{C}_{x}-{\kappa }_{x}}$ .

 utilizing the population median ${M}_{x}$ proposed a modified ratio type population variance estimator as

${t}_{8}={s}_{y}^{2}\left\{\frac{{S}_{x}^{2}+{M}_{x}}{{s}_{x}^{2}+{M}_{x}}\right\}$ (26)

The bias and MSE of their estimator ${t}_{8}$ are

$Bias\left({t}_{8}\right)=\frac{1-f}{n}{S}_{y}^{2}\left({\kappa }_{x}-1\right)\left\{{\Psi }_{8}\left({\Psi }_{8}-\left(\frac{{\lambda }_{22}-1}{{\kappa }_{x}-1}\right)\right)\right\}$ (27)

$MSE\left({t}_{8}\right)=\frac{1-f}{n}{S}_{y}^{4}\left\{\left({\kappa }_{y}-1\right)+{\Psi }_{8}\left({\kappa }_{x}-1\right)\left({\Psi }_{8}-2\left(\frac{{\lambda }_{22}-1}{{\kappa }_{x}-1}\right)\right)\right\}$ (28)

where, ${\Psi }_{8}=\frac{{S}_{x}^{2}}{{S}_{x}^{2}+{M}_{x}}$ .

 using the known quartiles (upper and lower quartile ${Q}_{3}$ and ${Q}_{1}$ respectively) of the auxiliary variable x suggested

${t}_{9}={s}_{y}^{2}\left\{\frac{{S}_{x}^{2}+{Q}_{1}}{{s}_{x}^{2}+{Q}_{1}}\right\}$ (29)

${t}_{10}={s}_{y}^{2}\left\{\frac{{S}_{x}^{2}+{Q}_{3}}{{s}_{x}^{2}+{Q}_{3}}\right\}$ (30)

The biases and MSEs of their estimators ${t}_{9}$ and ${t}_{10}$ are as follows

$Bias\left({t}_{9}\right)=\frac{1-f}{n}{S}_{y}^{2}\left({\kappa }_{x}-1\right)\left\{{\Psi }_{9}\left({\Psi }_{9}-\left(\frac{{\lambda }_{22}-1}{{\kappa }_{x}-1}\right)\right)\right\}$ (31)

$MSE\left({t}_{9}\right)=\frac{1-f}{n}{S}_{y}^{4}\left\{\left({\kappa }_{y}-1\right)+{\Psi }_{9}\left({\kappa }_{x}-1\right)\left({\Psi }_{9}-2\left(\frac{{\lambda }_{22}-1}{{\kappa }_{x}-1}\right)\right)\right\}$ (32)

$Bias\left({t}_{10}\right)=\frac{1-f}{n}{S}_{y}^{2}\left({\kappa }_{x}-1\right)\left\{{\Psi }_{10}\left({\Psi }_{10}-\left(\frac{{\lambda }_{22}-1}{{\kappa }_{x}-1}\right)\right)\right\}$ (33)

$MSE\left({t}_{10}\right)=\frac{1-f}{n}{S}_{y}^{4}\left\{\left({\kappa }_{y}-1\right)+{\Psi }_{10}\left({\kappa }_{x}-1\right)\left({\Psi }_{10}-2\left(\frac{{\lambda }_{22}-1}{{\kappa }_{x}-1}\right)\right)\right\}$ (34)

where ${\Psi }_{9}=\frac{{S}_{x}^{2}}{{S}_{x}^{2}+{Q}_{1}}$ and ${\Psi }_{10}=\frac{{S}_{x}^{2}}{{S}_{x}^{2}+{Q}_{3}}$ .

Motivated by  and  ,  considered the estimation of the finite population variance using the known coefficient of variation and median of an auxiliary variable and proposed the estimator

${t}_{11}={s}_{y}^{2}\left[\frac{{C}_{x}{S}_{x}^{2}+{M}_{x}}{{C}_{x}{s}_{x}^{2}+{M}_{x}}\right]$ (35)

The bias and MSE are obtained as

$Bias\left({t}_{11}\right)=\frac{1-f}{n}{S}_{y}^{2}\left({\kappa }_{x}-1\right)\left\{{\Psi }_{11}\left({\Psi }_{11}-\left(\frac{{\lambda }_{22}-1}{{\kappa }_{x}-1}\right)\right)\right\}$ (36)

$MSE\left({t}_{11}\right)=\frac{1-f}{n}{S}_{y}^{4}\left\{\left({\kappa }_{y}-1\right)+{\Psi }_{11}\left({\kappa }_{x}-1\right)\left({\Psi }_{11}-2\left(\frac{{\lambda }_{22}-1}{{\kappa }_{x}-1}\right)\right)\right\}$ (37)

where ${\Psi }_{11}=\frac{{C}_{x}{S}_{x}^{2}}{{C}_{x}{S}_{x}^{2}+{M}_{x}}$ .
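All the estimators reviewed in this section share one bias form and one MSE form and differ only in the constant ${\Psi }_{j}$. The sketch below evaluates these common forms for the Population I summary statistics listed in Section 5. The shorthand names `gamma` (for $\frac{1-f}{n}$) and `A` (for $\frac{{\lambda }_{22}-1}{{\kappa }_{x}-1}$) are introduced here for convenience and are not notation from the paper.

```python
# Summary statistics of Population I as listed in Section 5.
N, n = 80, 20
S2_x, S2_y = 71.504, 336.979
kappa_x, kappa_y, lambda_22 = 2.866, 2.267, 2.221
C_x, M_x, Q1, Q3 = 0.751, 10.300, 5.150, 16.975

f = n / N                                # sampling fraction
gamma = (1 - f) / n                      # shorthand for (1 - f) / n
A = (lambda_22 - 1) / (kappa_x - 1)      # shorthand for the recurring ratio

# Constants Psi_j of the estimators t_1, ..., t_11.
psi = {
    "t1": 0.0,
    "t2": 1.0,
    "t3": S2_x / (S2_x + kappa_x),
    "t4": S2_x / (S2_x - C_x),
    "t5": S2_x / (S2_x - kappa_x),
    "t6": S2_x * kappa_x / (S2_x * kappa_x - C_x),
    "t7": S2_x * C_x / (S2_x * C_x - kappa_x),
    "t8": S2_x / (S2_x + M_x),
    "t9": S2_x / (S2_x + Q1),
    "t10": S2_x / (S2_x + Q3),
    "t11": C_x * S2_x / (C_x * S2_x + M_x),
}

def bias(psi_j):
    """First-order bias in the common form of Equations (6), (9), ..., (36)."""
    return gamma * S2_y * (kappa_x - 1) * psi_j * (psi_j - A)

def mse(psi_j):
    """First-order MSE in the common form of Equations (7), (10), ..., (37)."""
    return gamma * S2_y ** 2 * ((kappa_y - 1) + (kappa_x - 1) * psi_j * (psi_j - 2 * A))

for name, psi_j in psi.items():
    print(f"{name:>4}  Psi = {psi_j:.4f}  Bias = {bias(psi_j):9.4f}  MSE = {mse(psi_j):10.3f}")
```

Writing the formulas once and passing ${\Psi }_{j}$ as an argument makes it easy to add a further estimator of the same family by adding one dictionary entry.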

3. Proposed Estimator

Motivated by the works of      and  on improving the performance of the population variance estimator of the study variable using known population parameters of an auxiliary variable, we propose the following modified ratio type population variance estimator using known values of the population coefficient of kurtosis ${\kappa }_{x}$ and median ${M}_{x}$ of an auxiliary variable:

${\stackrel{^}{S}}_{PM}^{2}={s}_{y}^{2}\left\{\frac{{S}_{x}^{2}{\kappa }_{x}+{M}_{x}^{2}}{{s}_{x}^{2}{\kappa }_{x}+{M}_{x}^{2}}\right\}$ (38)
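To illustrate how Equation (38) would be applied to survey data, the sketch below draws one SRSWOR sample from a simulated population and evaluates the estimator. The population, the seed and the function name `proposed_variance_estimate` are assumptions made for this illustration only; the auxiliary parameters are computed from the full auxiliary population, which is what "known" means in this setting.

```python
import random
import statistics

def proposed_variance_estimate(y_sample, x_sample, S2_x, kappa_x, M_x):
    """Evaluate Equation (38) from sample data and known auxiliary parameters."""
    s2_y = statistics.variance(y_sample)   # sample variance, divisor n - 1
    s2_x = statistics.variance(x_sample)
    return s2_y * (S2_x * kappa_x + M_x ** 2) / (s2_x * kappa_x + M_x ** 2)

# Hypothetical population of N = 80 units with y positively related to x.
rng = random.Random(0)
N, n = 80, 20
X_pop = [rng.uniform(1, 30) for _ in range(N)]
Y_pop = [4.0 * x + rng.gauss(0, 5) for x in X_pop]

# Known auxiliary-variable parameters, computed from the whole population.
S2_x = statistics.variance(X_pop)
M_x = statistics.median(X_pop)
x_bar = statistics.fmean(X_pop)
kappa_x = (sum((x - x_bar) ** 4 for x in X_pop) / (N - 1)) / S2_x ** 2

# Simple random sample without replacement: sample unit indices, then look up (y, x).
idx = rng.sample(range(N), n)
y_s = [Y_pop[i] for i in idx]
x_s = [X_pop[i] for i in idx]

estimate = proposed_variance_estimate(y_s, x_s, S2_x, kappa_x, M_x)
print("usual estimator s_y^2 :", statistics.variance(y_s))
print("proposed estimator    :", estimate)
print("true S_y^2            :", statistics.variance(Y_pop))
```

A single sample says nothing about efficiency; the comparison of estimators rests on the bias and MSE expressions derived in the remainder of this section.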

To calculate the bias and the MSE of ${\stackrel{^}{S}}_{PM}^{2}$ , we let ${s}_{y}^{2}={S}_{y}^{2}\left(1+{\xi }_{0}\right)$ and ${s}_{x}^{2}={S}_{x}^{2}\left(1+{\xi }_{1}\right)$ , that is ${\xi }_{0}=\frac{{s}_{y}^{2}}{{S}_{y}^{2}}-1$ and ${\xi }_{1}=\frac{{s}_{x}^{2}}{{S}_{x}^{2}}-1$ , so that $E\left({\xi }_{0}\right)=E\left({\xi }_{1}\right)=0$ and, to the first degree of approximation,

$E\left({\xi }_{0}^{2}\right)=\frac{1-f}{n}\left({\lambda }_{40}-1\right)$ (39)

$E\left({\xi }_{1}^{2}\right)=\frac{1-f}{n}\left({\lambda }_{04}-1\right)$ (40)

$E\left({\xi }_{0}{\xi }_{1}\right)=\frac{1-f}{n}\left({\lambda }_{22}-1\right)$ (41)

The expectations are obtained following the works of    and  .

Now expressing ${\stackrel{^}{S}}_{PM}^{2}$ in terms of ${\xi }^{\prime }s$ we have

$\begin{array}{c}{\stackrel{^}{S}}_{PM}^{2}={S}_{y}^{2}\left(1+{\xi }_{0}\right)\left\{\frac{{\kappa }_{x}{S}_{x}^{2}+{M}_{x}^{2}}{{\kappa }_{x}{S}_{x}^{2}\left(1+{\xi }_{1}\right)+{M}_{x}^{2}}\right\}\\ ={S}_{y}^{2}\left(1+{\xi }_{0}\right){\left(1+{\rho }^{*}{\xi }_{1}\right)}^{-1}\end{array}$ (42)

where ${\rho }^{*}={\kappa }_{x}{S}_{x}^{2}{\left({\kappa }_{x}{S}_{x}^{2}+{M}_{x}^{2}\right)}^{-1}$ , we assume that $|{\rho }^{*}{\xi }_{1}|<1$ so that ${\left(1+{\rho }^{*}{\xi }_{1}\right)}^{-1}$ is expandable.

Expanding the right hand side of (42) and multiplying out we have

$\begin{array}{c}{\stackrel{^}{S}}_{PM}^{2}={S}_{y}^{2}\left(1+{\xi }_{0}\right)\left(1-{\rho }^{*}{\xi }_{1}+{\rho }^{*2}{\xi }_{1}^{2}+\cdots \right)\\ ={S}_{y}^{2}\left(1+{\xi }_{0}-{\rho }^{*}{\xi }_{1}-{\rho }^{*}{\xi }_{0}{\xi }_{1}+{\rho }^{*2}{\xi }_{1}^{2}+{\rho }^{*2}{\xi }_{0}{\xi }_{1}^{2}-\cdots \right)\end{array}$ (43)

Neglecting terms of ${\xi }^{\prime }s$ having power greater than two we have

$\begin{array}{l}{\stackrel{^}{S}}_{PM}^{2}\cong {S}_{y}^{2}\left(1+{\xi }_{0}-{\rho }^{*}{\xi }_{1}-{\rho }^{*}{\xi }_{0}{\xi }_{1}+{\rho }^{*2}{\xi }_{1}^{2}\right)\\ \text{or}\text{\hspace{0.17em}}\text{\hspace{0.17em}}{\stackrel{^}{S}}_{PM}^{2}-{S}_{y}^{2}\cong {S}_{y}^{2}\left({\xi }_{0}-{\rho }^{*}{\xi }_{1}-{\rho }^{*}{\xi }_{0}{\xi }_{1}+{\rho }^{*2}{\xi }_{1}^{2}\right)\end{array}$ (44)
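The quality of the truncation in (44) can be checked numerically. The sketch below compares the exact ratio in (42) with its second-order approximation for small, hypothetical values of ${\xi }_{0}$ and ${\xi }_{1}$; the numbers chosen are illustrative assumptions, not quantities from the populations studied.

```python
def exact_ratio(xi0, xi1, rho_star):
    """Exact value of S_PM^2 / S_y^2 as written in Equation (42)."""
    return (1 + xi0) / (1 + rho_star * xi1)

def second_order(xi0, xi1, rho_star):
    """Approximation of the same ratio keeping terms up to power two, Equation (44)."""
    return 1 + xi0 - rho_star * xi1 - rho_star * xi0 * xi1 + rho_star ** 2 * xi1 ** 2

rho_star = 0.66            # illustrative value; requires |rho_star * xi1| < 1
xi0, xi1 = 0.01, -0.02     # hypothetical small relative errors

err = abs(exact_ratio(xi0, xi1, rho_star) - second_order(xi0, xi1, rho_star))
err_half = abs(exact_ratio(xi0 / 2, xi1 / 2, rho_star)
               - second_order(xi0 / 2, xi1 / 2, rho_star))

print("truncation error            :", err)
print("error with halved xi values :", err_half)
print("ratio of the two errors     :", err / err_half)
```

Since the neglected terms are of third order, halving both relative errors should shrink the truncation error by a factor close to eight.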

Taking the expectation on both sides of (44)

$E\left({\stackrel{^}{S}}_{PM}^{2}-{S}_{y}^{2}\right)\cong E\left({S}_{y}^{2}\left({\xi }_{0}-{\rho }^{*}{\xi }_{1}-{\rho }^{*}{\xi }_{0}{\xi }_{1}+{\rho }^{*2}{\xi }_{1}^{2}\right)\right)$ (45)

and using $E\left({\xi }_{0}\right)=E\left({\xi }_{1}\right)=0$ together with (40) and (41), we obtain the bias

$Bias\left({\stackrel{^}{S}}_{PM}^{2}\right)=\frac{1-f}{n}{S}_{y}^{2}\left({\kappa }_{x}-1\right){\rho }^{*}\left\{{\rho }^{*}-\frac{{\lambda }_{22}-1}{{\kappa }_{x}-1}\right\}$ (46)

Squaring both sides of (44) and neglecting terms of ${\xi }^{\prime }s$ having power greater than two we have

${\left({\stackrel{^}{S}}_{PM}^{2}-{S}_{y}^{2}\right)}^{2}\cong {S}_{y}^{4}\left({\xi }_{0}^{2}+{\rho }^{*2}{\xi }_{1}^{2}-2{\rho }^{*}{\xi }_{0}{\xi }_{1}\right)$ (47)

Taking the expectation on both sides of (47)

$E\left({\left({\stackrel{^}{S}}_{PM}^{2}-{S}_{y}^{2}\right)}^{2}\right)\cong E\left({S}_{y}^{4}\left({\xi }_{0}^{2}+{\rho }^{*2}{\xi }_{1}^{2}-2{\rho }^{*}{\xi }_{0}{\xi }_{1}\right)\right)$ (48)

Substituting (39), (40) and (41), we obtain the Mean Squared Error of ${\stackrel{^}{S}}_{PM}^{2}$ as

$MSE\left({\stackrel{^}{S}}_{PM}^{2}\right)=\frac{1-f}{n}{S}_{y}^{4}\left\{\left({\kappa }_{y}-1\right)+{\rho }^{*}\left({\kappa }_{x}-1\right)\left({\rho }^{*}-2\left(\frac{{\lambda }_{22}-1}{{\kappa }_{x}-1}\right)\right)\right\}$ (49)
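As a worked instance of Equations (46) and (49), the sketch below evaluates ${\rho }^{*}$, the bias and the MSE of the proposed estimator from the Population I summary statistics listed in Section 5. The quantity `mse_min` is an addition made here for orientation only: it is the smallest value the common MSE form can take over all constants, attained when the constant equals $\frac{{\lambda }_{22}-1}{{\kappa }_{x}-1}$ (denoted `A` in the code, a shorthand not used in the paper).

```python
# Summary statistics of Population I as listed in Section 5.
N, n = 80, 20
S2_x, S2_y = 71.504, 336.979
kappa_x, kappa_y, lambda_22 = 2.866, 2.267, 2.221
M_x = 10.300

f = n / N
gamma = (1 - f) / n                                        # (1 - f) / n
rho_star = kappa_x * S2_x / (kappa_x * S2_x + M_x ** 2)    # defined below Equation (42)
A = (lambda_22 - 1) / (kappa_x - 1)

# Equation (46): first-order bias of the proposed estimator.
bias_pm = gamma * S2_y * (kappa_x - 1) * rho_star * (rho_star - A)

# Equation (49): first-order MSE of the proposed estimator.
mse_pm = gamma * S2_y ** 2 * ((kappa_y - 1) + rho_star * (kappa_x - 1) * (rho_star - 2 * A))

# Lower bound of the common MSE form over all constants (not from the paper).
mse_min = gamma * S2_y ** 2 * ((kappa_y - 1) - (kappa_x - 1) * A ** 2)

print(f"rho*    = {rho_star:.4f}   (A = {A:.4f})")
print(f"Bias    = {bias_pm:.4f}")
print(f"MSE     = {mse_pm:.3f}")
print(f"MSE min = {mse_min:.3f}")
```

For this population ${\rho }^{*}$ lies very close to `A`, which is why the MSE of the proposed estimator is close to the lower bound.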

4. Theoretical Comparison

We now derive the theoretical conditions under which the proposed modified ratio type estimator ${\stackrel{^}{S}}_{PM}^{2}$ is more efficient than the existing estimators ${t}_{j},j=1,2,\cdots ,11$ . The MSE of ${t}_{j},j=1,2,\cdots ,11$ , to the first degree of approximation is given in general as

$MSE\left({t}_{j}\right)=\frac{1-f}{n}{S}_{y}^{4}\left[\left({\kappa }_{y}-1\right)+{\Psi }_{j}\left({\kappa }_{x}-1\right)\left({\Psi }_{j}-2\left(\frac{{\lambda }_{22}-1}{{\kappa }_{x}-1}\right)\right)\right]$ (50)

Using Equations (49) and (50) we have that $MSE\left({\stackrel{^}{S}}_{PM}^{2}\right)<MSE\left({t}_{j}\right)$ if

${\rho }^{*}\left({\rho }^{*}-2\left(\frac{{\lambda }_{22}-1}{{\kappa }_{x}-1}\right)\right)<{\Psi }_{j}\left({\Psi }_{j}-2\left(\frac{{\lambda }_{22}-1}{{\kappa }_{x}-1}\right)\right)$ (51)
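Condition (51) can be put in a form that is easier to interpret by completing the square. The following short derivation is a sketch in which the shorthand $A=\frac{{\lambda }_{22}-1}{{\kappa }_{x}-1}$ is introduced for convenience (it is not notation used elsewhere in this paper) and ${\kappa }_{x}>1$ is assumed. For any constant $\Psi$ ,

$\Psi \left(\Psi -2A\right)={\left(\Psi -A\right)}^{2}-{A}^{2}$

so that (51) is equivalent to

${\left({\rho }^{*}-A\right)}^{2}<{\left({\Psi }_{j}-A\right)}^{2}$ , that is, $|{\rho }^{*}-A|<|{\Psi }_{j}-A|$ .

In words, the proposed estimator has a smaller first-order MSE than ${t}_{j}$ exactly when ${\rho }^{*}$ lies closer to $A$ than ${\Psi }_{j}$ does.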

5. Empirical Studies

We use the data from Population I (Source: [  , 228]) and Population II (Source:  ) to compare the performance of the proposed estimator under the simple random sampling without replacement (SRSWOR) scheme with that of the sample variance and the existing estimators. The summary statistics of the two populations are given below:

Population I

$X$ = Fixed capital

$Y$ = output of 80 factories

$N=80$ , $n=20$ , $\stackrel{¯}{X}=11.265$ , $\stackrel{¯}{Y}=51.826$ ,

${S}_{x}^{2}=71.504$ , ${S}_{y}^{2}=336.979$ , ${S}_{xy}=146.068$ ,

${\lambda }_{04}={\kappa }_{x}=2.866$ , ${\lambda }_{40}={\kappa }_{y}=2.267$ , ${\lambda }_{22}=2.221$ ,

${\rho }_{xy}=0.941$ , ${C}_{y}=0.354$ , ${C}_{x}=0.751$

${M}_{x}=10.300$ , ${Q}_{1}=5.150$ , ${Q}_{3}=16.975$

Population II

$X$ = acreage under wheat crop in 1973

$Y$ = acreage under wheat crop in 1974

$N=70$ , $n=25$ , $\stackrel{¯}{X}=175.2671$ , $\stackrel{¯}{Y}=96.700$ ,

${S}_{x}^{2}=19840.7508$ , ${S}_{y}^{2}=3686.1898$ ,

${\lambda }_{04}={\kappa }_{x}=7.0952$ , ${\lambda }_{40}={\kappa }_{y}=4.7596$ , ${\lambda }_{22}=4.6038$ ,

${\rho }_{xy}=0.7293$ , ${C}_{y}=0.6254$ , ${C}_{x}=0.8037$

${M}_{x}=72.4375$ , ${Q}_{1}=80.1500$ , ${Q}_{3}=225.0250$ .

Using the above summary values we obtain the results in Table 1. From the Mean Squared Errors in the table it is clear that our proposed modified ratio type population variance estimator ${\stackrel{^}{S}}_{PM}^{2}$ has the least Mean Squared Error (MSE).

Table 1. Bias and Mean Squared Errors (MSE).

The efficiency of our proposed estimator ${\stackrel{^}{S}}_{PM}^{2}$ is examined numerically by its Percentage Relative Efficiency (PRE) in comparison with those of the existing estimators, using real populations from [  , p.228] and  .

We have computed the PRE(s) of the estimators ${t}_{j},j=1,2,\cdots ,11$ using the formulae

$PRE\left({t}_{j},{s}_{y}^{2}\right)=\frac{MSE\left({s}_{y}^{2}\right)}{MSE\left({t}_{j}\right)}×100$ (52)

$=\frac{\frac{1-f}{n}{S}_{y}^{4}\left({\kappa }_{y}-1\right)}{\frac{1-f}{n}{S}_{y}^{4}\left[\left\{{\kappa }_{y}-1\right\}+\left\{{\kappa }_{x}-1\right\}{\Psi }_{j}\left({\Psi }_{j}-2\left(\frac{{\lambda }_{22}-1}{{\kappa }_{x}-1}\right)\right)\right]}×100$ (53)

$=\frac{{\kappa }_{y}-1}{\left\{{\kappa }_{y}-1\right\}+\left\{{\kappa }_{x}-1\right\}{\Psi }_{j}\left({\Psi }_{j}-2\left(\frac{{\lambda }_{22}-1}{{\kappa }_{x}-1}\right)\right)}×100$ (54)

The PRE of our proposed estimator is then

$PRE\left({\stackrel{^}{S}}_{PM}^{2},{s}_{y}^{2}\right)=\frac{MSE\left({s}_{y}^{2}\right)}{MSE\left({\stackrel{^}{S}}_{PM}^{2}\right)}×100$ (55)

$=\frac{\frac{1-f}{n}{S}_{y}^{4}\left({\kappa }_{y}-1\right)}{\frac{1-f}{n}{S}_{y}^{4}\left\{\left({\kappa }_{y}-1\right)+{\rho }^{*}\left({\kappa }_{x}-1\right)\left({\rho }^{*}-2\left(\frac{{\lambda }_{22}-1}{{\kappa }_{x}-1}\right)\right)\right\}}×100$ (56)

$=\frac{{\kappa }_{y}-1}{\left({\kappa }_{y}-1\right)+{\rho }^{*}\left({\kappa }_{x}-1\right)\left({\rho }^{*}-2\left(\frac{{\lambda }_{22}-1}{{\kappa }_{x}-1}\right)\right)}×100$ (57)

Using formula (54) and (57) we compute the Percent Relative Efficiencies and tabulate the results in Table 2.
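The simplified forms (54) and (57) depend only on ${\kappa }_{y}$, ${\kappa }_{x}$, ${\lambda }_{22}$ and the constant of the estimator, so the PRE can be computed without ${S}_{y}^{2}$ or the sample size. The sketch below does this for Population I and, to keep it short, for a subset of the estimators only; the dictionary name `constants` and the choice of subset are assumptions of this illustration.

```python
# Summary statistics of Population I as listed above.
S2_x = 71.504
kappa_x, kappa_y, lambda_22 = 2.866, 2.267, 2.221
C_x, M_x = 0.751, 10.300

def pre(psi):
    """Equation (54); Equation (57) is the same expression with psi = rho_star."""
    ratio = (lambda_22 - 1) / (kappa_x - 1)
    denom = (kappa_y - 1) + (kappa_x - 1) * psi * (psi - 2 * ratio)
    return 100 * (kappa_y - 1) / denom

# Constants of a few estimators (illustrative subset) and of the proposed one.
constants = {
    "t1": 0.0,
    "t2": 1.0,
    "t8": S2_x / (S2_x + M_x),
    "t11": C_x * S2_x / (C_x * S2_x + M_x),
    "proposed": kappa_x * S2_x / (kappa_x * S2_x + M_x ** 2),
}

for name in sorted(constants, key=lambda k: pre(constants[k])):
    print(f"{name:>9}  PRE = {pre(constants[name]):7.2f}")

best = max(constants, key=lambda k: pre(constants[k]))
print("highest PRE:", best)
```

By construction the usual estimator ${t}_{1}$ has a PRE of 100, so any value above 100 indicates a gain over the sample variance.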

Table 2. Percent Relative Efficiencies (PRE).

Percentage Relative Efficiency is a statistical tool used to measure and ascertain the efficiency of one estimator over another. From the findings summarized in Table 2 it is clear that our proposed estimator ${\stackrel{^}{S}}_{PM}^{2}$ performed best, that is, it has the highest PRE among all the estimators considered. This implies that we can apply our proposed estimator to appropriate practical situations and obtain more efficient results than with the traditional and other existing population variance estimators.

6. Conclusions

In this study we have proposed a modified ratio type population variance estimator using two known population parameters, the coefficient of kurtosis and the median of the auxiliary variable x.

We have analyzed the performance of our proposed estimator against the usual unbiased variance estimator and existing estimators using two natural populations by comparing their PRE(s).

Based on the results of our studies, our proposed estimator works better than the other existing estimators, having the highest Percentage Relative Efficiency, and can hence be applied in practice where knowledge of the population parameters of the auxiliary variable is available. We also suggest that the proposed estimator could be further improved by retaining Taylor series terms of order higher than two.

Acknowledgements

We give much appreciation to the authors for the numerous and valuable contribution to this work.

Conflicts of Interest

The authors declare that there is no conflict of interest in the publication of this paper.


Cite this paper

Milton, T. , Odhiambo, R. and Orwa, G. (2017) Estimation of Population Variance Using the Coefficient of Kurtosis and Median of an Auxiliary Variable under Simple Random Sampling. Open Journal of Statistics, 7, 944-955. doi: 10.4236/ojs.2017.76066.

References

Cochran, W. (1940) The Estimation of the Yields of Cereal Experiments by Sampling for the Ratio of Grain to Total Produce. The Journal of Agricultural Science, 30, 262-275. https://doi.org/10.1017/S0021859600048012

Searls, D.T. (1964) The Utilization of a Known Coefficient of Variation in the Estimation Procedure. Journal of the American Statistical Association, 59, 1225-1226. https://doi.org/10.1080/01621459.1964.10480765

Sen, A.R. (1978) Estimation of the Population Mean When the Coefficient of Variation Is Known. Communications in Statistics - Theory and Methods, 7, 657-672. https://doi.org/10.1080/03610927808827656

Sisodia, B. and Dwivedi, V. (1981) Modified Ratio Estimator Using Coefficient of Variation of Auxiliary Variable. Journal-Indian Society of Agricultural Statistics, 33, 13-18.

Upadhyaya, L.N. and Singh, H.P. (1984) On the Estimation of the Population Mean with Known Coefficient of Variation. Biometrical Journal, 26, 915-922. https://doi.org/10.1002/bimj.4710260814

Hirano, K., Pandey, B.N. and Singh, J. (1973) On the Utilization of a Known Coefficient of Kurtosis in the Estimation Procedure of Variance. Annals of the Institute of Statistical Mathematics, 25, 51-55. https://doi.org/10.1007/BF02479358

Isaki, C.T. (1983) Variance Estimation Using Auxiliary Information. Journal of the American Statistical Association, 78, 117-123. https://doi.org/10.1080/01621459.1983.10477939

Searls, D.T. and Intarapanich, P. (1990) A Note on an Estimator for the Variance That Utilizes the Kurtosis. The American Statistician, 44, 295-296.

Upadhyaya, L. and Singh, H. (1999) An Estimator for Population Variance That Utilizes the Kurtosis of an Auxiliary Variable in Sample Surveys. Vikram Mathematical Journal, 19, 14-17.

Kadilar, C. and Cingi, H. (2006) Ratio Estimators for the Population Variance in Simple and Stratified Random Sampling. Applied Mathematics and Computation, 173, 1047-1059. https://doi.org/10.1016/j.amc.2005.04.032

Subramani, J. and Kumarapandiyan, G. (2012) Variance Estimation Using Median of the Auxiliary Variable. International Journal of Probability and Statistics, 1, 6-40. https://doi.org/10.5923/j.ijps.20120103.02

Subramani, J. and Kumarapandiyan, G. (2012) Variance Estimation Using Quartiles and Their Functions of an Auxiliary Variable. International Journal of Statistics and Applications, 2, 67-72. https://doi.org/10.5923/j.statistics.20120205.04

Subramani, J. and Kumarapandiyan, G. (2013) Estimation of Variance Using Known Coefficient of Variation and Median of an Auxiliary Variable. Journal of Modern Applied Statistical Methods, 12, 11. https://doi.org/10.22237/jmasm/1367381400

Khan, M. and Shabbir, J. (2013) A Ratio Type Estimator for the Estimation of Population Variance Using Quartiles of an Auxiliary Variable. Journal of Statistics Applications and Probability, 2, 157-162. https://doi.org/10.12785/jsap/020314

Singh, H.P., Tailor, R., Tailor, R. and Kakran, M. (2004) An Improved Estimator of Population Mean Using Power Transformation. Journal of the Indian Society of Agricultural Statistics, 58, 223-230.

Yadav, S.K., Misra, S. and Mishra, S. (2016) Efficient Estimator for Population Variance Using Auxiliary Variable. American Journal of Operational Research, 6, 9-15.

Sukhatme, P. (1944) Moments and Product Moments of Moment-Statistics for Samples of the Finite and Infinite Populations. Sankhya: The Indian Journal of Statistics, 6, 363-382.

Sukhatme, P. and Sukhatme, B. (1970) Sampling Theory of Surveys with Applications. Asia Publishing House, Bombay.

Srivastava, S.K. and Jhajj, H.S. (1981) A Class of Estimators of the Population Mean in Survey Sampling Using Auxiliary Information. Biometrika, 68, 341-343. https://doi.org/10.1093/biomet/68.1.341

Tracy, D.S. (1984) Moments of Sample Moments. Communications in Statistics - Theory and Methods, 13, 553-562. https://doi.org/10.1080/03610928408828700

Murthy, M.N. (1967) Sampling Theory and Methods. Statistical Publishing Society, Barrackpore.

Daroga, S. and Chaudhary, F. (1986) Theory and Analysis of Sample Survey Designs. Wiley Eastern Limited, New York.