**Open Journal of Statistics**

Vol.07 No.06(2017), Article ID:80849,12 pages

10.4236/ojs.2017.76066

Estimation of Population Variance Using the Coefficient of Kurtosis and Median of an Auxiliary Variable under Simple Random Sampling

Tonui Kiplangat Milton^{1*}, Romanus Otieno Odhiambo^{2}, George Otieno Orwa^{2}

^{1}Pan African University, Institute for Basic Sciences, Technology and Innovation (PAUISTI), Nairobi, Kenya

^{2}Department of Statistics and Actuarial Science, Jomo Kenyatta University of Agriculture and Technology, Nairobi, Kenya

Copyright © 2017 by authors and Scientific Research Publishing Inc.

This work is licensed under the Creative Commons Attribution International License (CC BY 4.0).

http://creativecommons.org/licenses/by/4.0/

Received: September 24, 2017; Accepted: November 29, 2017; Published: December 2, 2017

ABSTRACT

In this study we propose a modified ratio type estimator for the population variance of the study variable y under simple random sampling without replacement, making use of the coefficient of kurtosis and the median of an auxiliary variable x. The estimator’s properties are derived up to the first order of the Taylor series expansion. The efficiency conditions under which the proposed estimator performs better than existing estimators are derived theoretically. Empirical studies using real populations demonstrate the performance of the developed estimator in comparison with the existing estimators. As illustrated by the empirical studies, the proposed estimator performs better than the existing estimators under some specified conditions, i.e. it has the smallest Mean Squared Error and the highest Percentage Relative Efficiency. The developed estimator is therefore suitable for situations in which the variable of interest is positively correlated with the auxiliary variable.

**Keywords:**

Modified Ratio Type Variance Estimator, Study Variable, Auxiliary Variable, Kurtosis, Median, Bias, Mean Squared Error (MSE), Percentage Relative Efficiency (PRE), Simple Random Sampling

1. Introduction

It is well known that the appropriate use of auxiliary information in probability sampling designs yields a considerable reduction in the variance of estimators of population parameters such as the population mean, median, variance, regression coefficient and correlation coefficient. [1] was the first to show the contribution of known auxiliary information in improving the efficiency of the estimator of the population mean $\bar{Y}$ in survey sampling.

Survey sampling now touches almost every field of scientific study, including demography, education, energy, transportation, health care, economics, forestry, sociology and politics. It is no exaggeration to say that much of the data that are statistically analyzed are collected in surveys. As the demand for surveys increases, so does the need for more effective methods of analyzing and interpreting the resulting data. A measure of precision is a prime requirement of a good survey and now appears in most analyses, so it needs to be obtained for almost every estimate derived from the survey data.

We often encounter surveys in which an auxiliary variable x is cheaper (in time and money) to observe than the study variable y. Use of auxiliary information can increase the precision of an estimator when the study variable y is highly correlated with the auxiliary variable x. Such situations do occur in practice, for example with the number of trees in an orchard and the yield of fruit.

The most common and widely used measure of precision is the variance of the survey estimator. In practice population variances are rarely known and must be estimated from the survey data themselves. In this study we are interested in the estimation of the population variance using known auxiliary information under the simple random sampling without replacement (SRSWOR) scheme. When the auxiliary variable is highly correlated with the study variable, the ratio, product and regression estimators generally give more precise results than the estimator based on the simple random sample alone.

Consider a finite population $V=\left\{{V}_{1},{V}_{2},{V}_{3},\cdots ,{V}_{N}\right\}$ of N distinct, identifiable units. Let Y be the study variable and X the corresponding auxiliary variable. Suppose we draw a random sample of size n from this bivariate population $\left(Y,X\right)$ , that is $\left({y}_{i},{x}_{i}\right)$ for $i=1,2,3,\cdots ,n$ , using simple random sampling without replacement (SRSWOR). Let $\bar{Y}$ and $\bar{X}$ be the population means of the study and auxiliary variables respectively, and let $\bar{y}$ and $\bar{x}$ be the corresponding sample means.

This study focuses on improving the efficiency in the estimation of

${S}_{y}^{2}=\frac{1}{N-1}{\displaystyle {\sum}_{i=1}^{N}{\left({Y}_{i}-\bar{Y}\right)}^{2}}$ (1)

using the coefficient of kurtosis and median.

We define the following notation, which is used throughout the article. For the population observations we have

$\bar{Y}=\frac{1}{N}{\displaystyle {\sum}_{i=1}^{N}{Y}_{i}}$ , $\bar{X}=\frac{1}{N}{\displaystyle {\sum}_{i=1}^{N}{X}_{i}}$ , ${S}_{y}^{2}=\frac{1}{N-1}{\displaystyle {\sum}_{i=1}^{N}{\left({Y}_{i}-\bar{Y}\right)}^{2}}$ ,

${S}_{x}^{2}=\frac{1}{N-1}{\displaystyle {\sum}_{i=1}^{N}{\left({X}_{i}-\bar{X}\right)}^{2}}$ , ${S}_{xy}=\frac{1}{N-1}{\displaystyle {\sum}_{i=1}^{N}\left({Y}_{i}-\bar{Y}\right)\left({X}_{i}-\bar{X}\right)}$ .

We also define the following from the sample observations:

$\bar{y}=\frac{1}{n}{\displaystyle {\sum}_{i=1}^{n}{y}_{i}}$ , $\bar{x}=\frac{1}{n}{\displaystyle {\sum}_{i=1}^{n}{x}_{i}}$ , ${s}_{y}^{2}=\frac{1}{n-1}{\displaystyle {\sum}_{i=1}^{n}{\left({y}_{i}-\bar{y}\right)}^{2}}$ ,

${s}_{x}^{2}=\frac{1}{n-1}{\displaystyle {\sum}_{i=1}^{n}{\left({x}_{i}-\bar{x}\right)}^{2}}$ , ${s}_{xy}=\frac{1}{n-1}{\displaystyle {\sum}_{i=1}^{n}\left({y}_{i}-\bar{y}\right)\left({x}_{i}-\bar{x}\right)}$ .

In general, we define the following parameters:

${\mu}_{rs}=\frac{1}{N-1}{\displaystyle {\sum}_{i=1}^{N}{\left({Y}_{i}-\bar{Y}\right)}^{r}{\left({X}_{i}-\bar{X}\right)}^{s}}$ (2)

${\lambda}_{rs}=\frac{{\mu}_{rs}}{{\mu}_{20}^{\frac{r}{2}}{\mu}_{02}^{\frac{s}{2}}}$ (3)

Thus we note the following:

${\mu}_{20}={S}_{y}^{2}$ , ${\mu}_{02}={S}_{x}^{2}$ and ${\mu}_{11}={S}_{xy}$ ; ${\lambda}_{22}=\frac{{\mu}_{22}}{{\mu}_{20}{\mu}_{02}}$ and ${\lambda}_{21}=\frac{{\mu}_{21}}{{\mu}_{20}{\mu}_{02}^{\frac{1}{2}}}$ . Further, ${C}_{y}=\frac{{S}_{y}}{\bar{Y}}=\frac{\sqrt{{\mu}_{20}}}{\bar{Y}}$ is the coefficient of variation of the study variable y, ${C}_{x}=\frac{{S}_{x}}{\bar{X}}=\frac{\sqrt{{\mu}_{02}}}{\bar{X}}$ is the coefficient of variation of the auxiliary variable x, ${\rho}_{xy}=\frac{{S}_{xy}}{{S}_{x}{S}_{y}}=\frac{{\mu}_{11}}{\sqrt{{\mu}_{20}}\sqrt{{\mu}_{02}}}$ is the coefficient of correlation between x and y, ${\kappa}_{y}={\lambda}_{40}=\frac{{\mu}_{40}}{{\mu}_{20}^{2}}$ is the coefficient of kurtosis of the study variable, ${\kappa}_{x}={\lambda}_{04}=\frac{{\mu}_{04}}{{\mu}_{02}^{2}}$ is the coefficient of kurtosis of the auxiliary variable and ${M}_{x}$ is the population median of the auxiliary variable.
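The moments in Equations (2) and (3) can be computed directly when the whole population is enumerated. The following is a minimal Python sketch rather than the authors' code; the function names `mu`, `lam` and `summary` and the five-unit toy population are hypothetical and chosen only for illustration.

```python
from statistics import mean, median


def mu(y, x, r, s):
    """Cross moment mu_rs of Equation (2), using the 1/(N-1) divisor."""
    y_bar, x_bar = mean(y), mean(x)
    total = sum((yi - y_bar) ** r * (xi - x_bar) ** s for yi, xi in zip(y, x))
    return total / (len(y) - 1)


def lam(y, x, r, s):
    """Standardised moment lambda_rs of Equation (3)."""
    return mu(y, x, r, s) / (mu(y, x, 2, 0) ** (r / 2) * mu(y, x, 0, 2) ** (s / 2))


def summary(y, x):
    """Collect the population quantities used by the variance estimators."""
    return {
        "S2y": mu(y, x, 2, 0),
        "S2x": mu(y, x, 0, 2),
        "Sxy": mu(y, x, 1, 1),
        "rho_xy": lam(y, x, 1, 1),
        "kappa_y": lam(y, x, 4, 0),
        "kappa_x": lam(y, x, 0, 4),
        "lambda22": lam(y, x, 2, 2),
        "Mx": median(x),
    }


# Hypothetical toy population (not taken from the paper).
Y = [2, 4, 6, 8, 10]
X = [1, 2, 3, 4, 5]
print(summary(Y, X))
```

Because the toy Y is an exact multiple of X, the sketch returns a correlation of one and identical values for the two kurtosis coefficients and for the mixed moment ratio.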

Many authors have obtained more precise estimators by employing prior knowledge of certain population parameter(s). [2], for example, used the coefficient of variation of the study variable, but this proved inadequate because in practice that parameter is unknown. Motivated by the work of [2], [3] [4] and [5] instead used the known coefficient of variation of the auxiliary variable for estimating the population mean of the study variable. Reasoning along the same lines, [6] used the prior value of the coefficient of kurtosis of an auxiliary variable in estimating the population variance of the study variable y.

Kurtosis is seldom reported or used in research articles, even though virtually every statistical package provides a measure of it. This may be because kurtosis is not well understood, or because its importance in various aspects of statistical analysis has not been fully explored. Kurtosis can be expressed as

$\kappa =\frac{E{\left(x-\mu \right)}^{4}}{{\left(E{\left(x-\mu \right)}^{2}\right)}^{2}}=\frac{{\mu}_{4}}{{\sigma}^{4}}$ (4)

where $E$ is the expectation operator, $\mu $ is the mean, ${\mu}_{4}$ is the fourth moment about the mean and $\sigma $ is the standard deviation.
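As a small worked illustration of Equation (4), the sketch below computes the coefficient of kurtosis of a list of numbers by treating the list as the whole distribution, so the expectations become plain averages. The function name `kurtosis` and the inputs are hypothetical. Note that this uses the divisor n, whereas the moments in Equation (2) use N-1, so the two conventions give slightly different values on small data sets.

```python
def kurtosis(values):
    """Coefficient of kurtosis: fourth central moment over squared variance."""
    n = len(values)
    centre = sum(values) / n
    m2 = sum((v - centre) ** 2 for v in values) / n  # second central moment
    m4 = sum((v - centre) ** 4 for v in values) / n  # fourth central moment
    return m4 / m2 ** 2


# Evenly spread values have light tails.
print(kurtosis([1, 2, 3, 4, 5]))
# A single extreme value inflates the fourth moment far more than the second.
print(kurtosis([1, 2, 3, 4, 50]))
```

The second call returns a larger value than the first, which is the sense in which kurtosis reflects tail weight.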

The median, being the middle value of a distribution (when the values are arranged in ascending or descending order), is less affected by outliers and skewness than the mean, and is therefore preferred to the mean when the distribution is not symmetrical. We can therefore use the median and the coefficient of kurtosis of the auxiliary variable to derive a more precise ratio type estimator of the population variance.

2. Existing Population Variance Estimators

In this section we review some finite population variance estimators from the literature that help in the construction and development of the proposed estimator. When auxiliary information is not available, the usual unbiased estimator of the population variance is

${t}_{1}={s}_{y}^{2}$ (5)

The bias and MSE of ${t}_{1}$ are

$Bias\left({t}_{1}\right)=\frac{1-f}{n}{S}_{y}^{2}\left\{\left({\kappa}_{x}-1\right){\Psi}_{1}\left({\Psi}_{1}-\frac{{\lambda}_{22}-1}{{\kappa}_{x}-1}\right)\right\}=0$ (6)

$\begin{array}{c}MSE\left({t}_{1}\right)=Var\left({t}_{1}\right)=\frac{1-f}{n}{S}_{y}^{4}\left\{\left({\kappa}_{y}-1\right)+\left({\kappa}_{x}-1\right){\Psi}_{1}\left({\Psi}_{1}-2\left(\frac{{\lambda}_{22}-1}{{\kappa}_{x}-1}\right)\right)\right\}\\ =\frac{1-f}{n}{S}_{y}^{4}\left({\kappa}_{y}-1\right)\end{array}$ (7)

where ${\Psi}_{1}=0$ and $f=n/N$ is the sampling fraction.

Estimation of the population variance ${S}_{y}^{2}$ using auxiliary information was considered by [7] , who proposed the ratio type population variance estimator

${t}_{2}={s}_{y}^{2}\frac{{S}_{x}^{2}}{{s}_{x}^{2}}$ (8)

The bias and Mean Squared Error of Isaki’s estimator are

$Bias\left({t}_{2}\right)=\frac{1-f}{n}{S}_{y}^{2}\left\{\left({\kappa}_{x}-1\right){\Psi}_{2}\left({\Psi}_{2}-\frac{{\lambda}_{22}-1}{{\kappa}_{x}-1}\right)\right\}=\frac{1-f}{n}{S}_{y}^{2}\left[\left({\kappa}_{x}-1\right)-\left({\lambda}_{22}-1\right)\right]$ (9)

$\begin{array}{c}MSE\left({t}_{2}\right)=\frac{1-f}{n}{S}_{y}^{4}\left\{\left({\kappa}_{y}-1\right)+\left({\kappa}_{x}-1\right){\Psi}_{2}\left({\Psi}_{2}-2\left(\frac{{\lambda}_{22}-1}{{\kappa}_{x}-1}\right)\right)\right\}\\ =\frac{1-f}{n}{S}_{y}^{4}\left[\left({\kappa}_{y}-1\right)+\left({\kappa}_{x}-1\right)-2\left({\lambda}_{22}-1\right)\right]\end{array}$ (10)

where ${\Psi}_{2}=1$
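Equations (6), (7), (9) and (10) share one form that depends on the estimator only through the constant Ψ. The following Python sketch evaluates that form for the sample variance (Ψ = 0) and the ratio estimator (Ψ = 1), using the Population I summary values reported in Section 5. The function names are hypothetical, and f is read as the sampling fraction n/N, which is the usual convention.

```python
def bias_ratio_type(psi, N, n, S2y, kappa_x, lambda22):
    """First-order bias of a ratio type variance estimator with constant psi."""
    f = n / N  # sampling fraction (assumed meaning of f)
    a = (lambda22 - 1) / (kappa_x - 1)
    return (1 - f) / n * S2y * (kappa_x - 1) * psi * (psi - a)


def mse_ratio_type(psi, N, n, S2y, kappa_y, kappa_x, lambda22):
    """First-order MSE of a ratio type variance estimator with constant psi."""
    f = n / N
    a = (lambda22 - 1) / (kappa_x - 1)
    bracket = (kappa_y - 1) + (kappa_x - 1) * psi * (psi - 2 * a)
    return (1 - f) / n * S2y ** 2 * bracket


# Population I summary values from Section 5.
pop = dict(N=80, n=20, S2y=336.979, kappa_x=2.866, lambda22=2.221)

for name, psi in [("t1", 0.0), ("t2", 1.0)]:
    b = bias_ratio_type(psi, **pop)
    m = mse_ratio_type(psi, kappa_y=2.267, **pop)
    print(f"{name}: bias = {b:.3f}, MSE = {m:.2f}")
```

Setting Ψ = 0 removes every auxiliary term, which is why the sample variance has zero first-order bias and an MSE that depends only on the kurtosis of y.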

[6] initiated the use of the coefficient of kurtosis in estimating the population variance of a study variable y. Later, the coefficient of kurtosis was used by [3] [5] [8] in estimating the population mean.

[9] , using known information on both ${S}_{x}^{2}$ and ${\kappa}_{x}$ , suggested the modified ratio type population variance estimator for ${S}_{y}^{2}$

${t}_{3}={s}_{y}^{2}\left[\frac{{S}_{x}^{2}+{\kappa}_{x}}{{s}_{x}^{2}+{\kappa}_{x}}\right]$ (11)

The bias and MSE of the estimator ${t}_{3}$ are obtained as

$Bias\left({t}_{3}\right)=\frac{1-f}{n}{S}_{y}^{2}\left[\left(\left\{{\kappa}_{x}-1\right\}\right){\Psi}_{3}\left({\Psi}_{3}-\frac{{\lambda}_{22}-1}{{\kappa}_{x}-1}\right)\right]$ (12)

$MSE\left({t}_{3}\right)=\frac{1-f}{n}{S}_{y}^{4}\left[\left\{{\kappa}_{y}-1\right\}+\left\{{\kappa}_{x}-1\right\}{\Psi}_{3}\left({\Psi}_{3}-2\left(\frac{{\lambda}_{22}-1}{{\kappa}_{x}-1}\right)\right)\right]$ (13)

where ${\Psi}_{3}=\frac{{S}_{x}^{2}}{{S}_{x}^{2}+{\kappa}_{x}}$

[10] suggested four modified ratio type variance estimators using known values of ${C}_{x}$ and ${\kappa}_{x}$ ,

${t}_{4}={s}_{y}^{2}\left\{\frac{{S}_{x}^{2}-{C}_{x}}{{s}_{x}^{2}-{C}_{x}}\right\}$ (14)

${t}_{5}={s}_{y}^{2}\left\{\frac{{S}_{x}^{2}-{\kappa}_{x}}{{s}_{x}^{2}-{\kappa}_{x}}\right\}$ (15)

${t}_{6}={s}_{y}^{2}\left\{\frac{{S}_{x}^{2}{\kappa}_{x}-{C}_{x}}{{s}_{x}^{2}{\kappa}_{x}-{C}_{x}}\right\}$ (16)

${t}_{7}={s}_{y}^{2}\left\{\frac{{S}_{x}^{2}{C}_{x}-{\kappa}_{x}}{{s}_{x}^{2}{C}_{x}-{\kappa}_{x}}\right\}$ (17)

The biases and MSEs of their estimators are

$Bias\left({t}_{4}\right)=\frac{1-f}{n}{S}_{y}^{2}\left({\kappa}_{x}-1\right)\left\{{\Psi}_{4}\left({\Psi}_{4}-\frac{{\lambda}_{22}-1}{{\kappa}_{x}-1}\right)\right\}$ (18)

$MSE\left({t}_{4}\right)=\frac{1-f}{n}{S}_{y}^{4}\left\{\left({\kappa}_{y}-1\right)+{\Psi}_{4}\left({\kappa}_{x}-1\right)\left({\Psi}_{4}-2\left(\frac{{\lambda}_{22}-1}{{\kappa}_{x}-1}\right)\right)\right\}$ (19)

$Bias\left({t}_{5}\right)=\frac{1-f}{n}{S}_{y}^{2}\left({\kappa}_{x}-1\right)\left\{{\Psi}_{5}\left({\Psi}_{5}-\left(\frac{{\lambda}_{22}-1}{{\kappa}_{x}-1}\right)\right)\right\}$ (20)

$MSE\left({t}_{5}\right)=\frac{1-f}{n}{S}_{y}^{4}\left\{\left({\kappa}_{y}-1\right)+{\Psi}_{5}\left({\kappa}_{x}-1\right)\left({\Psi}_{5}-2\left(\frac{{\lambda}_{22}-1}{{\kappa}_{x}-1}\right)\right)\right\}$ (21)

$Bias\left({t}_{6}\right)=\frac{1-f}{n}{S}_{y}^{2}\left({\kappa}_{x}-1\right)\left\{{\Psi}_{6}\left({\Psi}_{6}-\left(\frac{{\lambda}_{22}-1}{{\kappa}_{x}-1}\right)\right)\right\}$ (22)

$MSE\left({t}_{6}\right)=\frac{1-f}{n}{S}_{y}^{4}\left\{\left({\kappa}_{y}-1\right)+{\Psi}_{6}\left({\kappa}_{x}-1\right)\left({\Psi}_{6}-2\left(\frac{{\lambda}_{22}-1}{{\kappa}_{x}-1}\right)\right)\right\}$ (23)

$Bias\left({t}_{7}\right)=\frac{1-f}{n}{S}_{y}^{2}\left({\kappa}_{x}-1\right)\left\{{\Psi}_{7}\left({\Psi}_{7}-\left(\frac{{\lambda}_{22}-1}{{\kappa}_{x}-1}\right)\right)\right\}$ (24)

$MSE\left({t}_{7}\right)=\frac{1-f}{n}{S}_{y}^{4}\left\{\left({\kappa}_{y}-1\right)+{\Psi}_{7}\left({\kappa}_{x}-1\right)\left({\Psi}_{7}-2\left(\frac{{\lambda}_{22}-1}{{\kappa}_{x}-1}\right)\right)\right\}$ (25)

where;

${\Psi}_{4}=\frac{{S}_{x}^{2}}{{S}_{x}^{2}-{C}_{x}}$ ; ${\Psi}_{5}=\frac{{S}_{x}^{2}}{{S}_{x}^{2}-{\kappa}_{x}}$ ; ${\Psi}_{6}=\frac{{S}_{x}^{2}{\kappa}_{x}}{{S}_{x}^{2}{\kappa}_{x}-{C}_{x}}$ ; ${\Psi}_{7}=\frac{{S}_{x}^{2}{C}_{x}}{{S}_{x}^{2}{C}_{x}-{\kappa}_{x}}$ .

[11] , utilizing the population median ${M}_{x}$ , proposed the modified ratio type population variance estimator

${t}_{8}={s}_{y}^{2}\left\{\frac{{S}_{x}^{2}+{M}_{x}}{{s}_{x}^{2}+{M}_{x}}\right\}$ (26)

The bias and MSE of their estimator ${t}_{8}$ are

$Bias\left({t}_{8}\right)=\frac{1-f}{n}{S}_{y}^{2}\left({\kappa}_{x}-1\right)\left\{{\Psi}_{8}\left({\Psi}_{8}-\left(\frac{{\lambda}_{22}-1}{{\kappa}_{x}-1}\right)\right)\right\}$ (27)

$MSE\left({t}_{8}\right)=\frac{1-f}{n}{S}_{y}^{4}\left\{\left({\kappa}_{y}-1\right)+{\Psi}_{8}\left({\kappa}_{x}-1\right)\left({\Psi}_{8}-2\left(\frac{{\lambda}_{22}-1}{{\kappa}_{x}-1}\right)\right)\right\}$ (28)

where, ${\Psi}_{8}=\frac{{S}_{x}^{2}}{{S}_{x}^{2}+{M}_{x}}$ .

[12] using the known quartiles (upper and lower quartile ${Q}_{3}$ and ${Q}_{1}$ respectively) of the auxiliary variable x suggested

${t}_{9}={s}_{y}^{2}\left\{\frac{{S}_{x}^{2}+{Q}_{1}}{{s}_{x}^{2}+{Q}_{1}}\right\}$ (29)

${t}_{10}={s}_{y}^{2}\left\{\frac{{S}_{x}^{2}+{Q}_{3}}{{s}_{x}^{2}+{Q}_{3}}\right\}$ (30)

The biases and MSEs of their estimators ${t}_{9}$ and ${t}_{10}$ are as follows

$Bias\left({t}_{9}\right)=\frac{1-f}{n}{S}_{y}^{2}\left({\kappa}_{x}-1\right)\left\{{\Psi}_{9}\left({\Psi}_{9}-\left(\frac{{\lambda}_{22}-1}{{\kappa}_{x}-1}\right)\right)\right\}$ (31)

$MSE\left({t}_{9}\right)=\frac{1-f}{n}{S}_{y}^{4}\left\{\left({\kappa}_{y}-1\right)+{\Psi}_{9}\left({\kappa}_{x}-1\right)\left({\Psi}_{9}-2\left(\frac{{\lambda}_{22}-1}{{\kappa}_{x}-1}\right)\right)\right\}$ (32)

$Bias\left({t}_{10}\right)=\frac{1-f}{n}{S}_{y}^{2}\left({\kappa}_{x}-1\right)\left\{{\Psi}_{10}\left({\Psi}_{10}-\left(\frac{{\lambda}_{22}-1}{{\kappa}_{x}-1}\right)\right)\right\}$ (33)

$MSE\left({t}_{10}\right)=\frac{1-f}{n}{S}_{y}^{4}\left\{\left({\kappa}_{y}-1\right)+{\Psi}_{10}\left({\kappa}_{x}-1\right)\left({\Psi}_{10}-2\left(\frac{{\lambda}_{22}-1}{{\kappa}_{x}-1}\right)\right)\right\}$ (34)

where ${\Psi}_{9}=\frac{{S}_{x}^{2}}{{S}_{x}^{2}+{Q}_{1}}$ and ${\Psi}_{10}=\frac{{S}_{x}^{2}}{{S}_{x}^{2}+{Q}_{3}}$ .

Motivated by [10] and [11] , [13] considered the estimation of the finite population variance using the known coefficient of variation and median of an auxiliary variable, and proposed the estimator

${t}_{11}={s}_{y}^{2}\left[\frac{{C}_{x}{S}_{x}^{2}+{M}_{x}}{{C}_{x}{s}_{x}^{2}+{M}_{x}}\right]$ (35)

The bias and MSE are obtained as

$Bias\left({t}_{11}\right)=\frac{1-f}{n}{S}_{y}^{2}\left({\kappa}_{x}-1\right)\left\{{\Psi}_{11}\left({\Psi}_{11}-\left(\frac{{\lambda}_{22}-1}{{\kappa}_{x}-1}\right)\right)\right\}$ (36)

$MSE\left({t}_{11}\right)=\frac{1-f}{n}{S}_{y}^{4}\left\{\left({\kappa}_{y}-1\right)+{\Psi}_{11}\left({\kappa}_{x}-1\right)\left({\Psi}_{11}-2\left(\frac{{\lambda}_{22}-1}{{\kappa}_{x}-1}\right)\right)\right\}$ (37)

where ${\Psi}_{11}=\frac{{C}_{x}{S}_{x}^{2}}{{C}_{x}{S}_{x}^{2}+{M}_{x}}$ .
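All eleven existing estimators differ only in the constant Ψ that enters the common bias and MSE expressions. As a convenience, the constants listed in this section can be gathered in one place. The sketch below is an illustrative Python helper with a hypothetical name, `psi_constants`, evaluated on the Population I summary values from Section 5.

```python
def psi_constants(S2x, Cx, kappa_x, Mx, Q1, Q3):
    """Return the constant Psi_j of each existing estimator t_1, ..., t_11."""
    return {
        "t1": 0.0,                                       # sample variance
        "t2": 1.0,                                       # ratio estimator of [7]
        "t3": S2x / (S2x + kappa_x),                     # estimator of [9]
        "t4": S2x / (S2x - Cx),                          # estimators of [10]
        "t5": S2x / (S2x - kappa_x),
        "t6": S2x * kappa_x / (S2x * kappa_x - Cx),
        "t7": S2x * Cx / (S2x * Cx - kappa_x),
        "t8": S2x / (S2x + Mx),                          # estimator of [11]
        "t9": S2x / (S2x + Q1),                          # estimators of [12]
        "t10": S2x / (S2x + Q3),
        "t11": Cx * S2x / (Cx * S2x + Mx),               # estimator of [13]
    }


# Population I summary values from Section 5.
psi = psi_constants(S2x=71.504, Cx=0.751, kappa_x=2.866,
                    Mx=10.300, Q1=5.150, Q3=16.975)
for name, value in psi.items():
    print(f"{name}: Psi = {value:.4f}")
```

Estimators that add a positive constant to the numerator and denominator have Ψ below one, while those that subtract a constant have Ψ above one.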

3. Proposed Estimator

Motivated by the works of [14] [9] [15] [13] [10] and [16] on improving the performance of the population variance estimator of the study variable using known population parameters of an auxiliary variable, we propose the following modified ratio type population variance estimator, which uses known values of the population coefficient of kurtosis ${\kappa}_{x}$ and median ${M}_{x}$ of an auxiliary variable.

${\hat{S}}_{PM}^{2}={s}_{y}^{2}\left\{\frac{{S}_{x}^{2}{\kappa}_{x}+{M}_{x}^{2}}{{s}_{x}^{2}{\kappa}_{x}+{M}_{x}^{2}}\right\}$ (38)
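To make Equation (38) concrete, the following Python sketch computes the proposed estimate from a sample, given the known population values of the auxiliary variable. The function name and the three-unit sample are hypothetical; `statistics.variance` uses the n-1 divisor, which matches the sample variances defined in Section 1.

```python
from statistics import variance


def proposed_variance_estimate(y_sample, x_sample, S2x, kappa_x, Mx):
    """Ratio type estimate of S_y^2 following Equation (38)."""
    s2y = variance(y_sample)  # sample variance of the study variable
    s2x = variance(x_sample)  # sample variance of the auxiliary variable
    return s2y * (S2x * kappa_x + Mx ** 2) / (s2x * kappa_x + Mx ** 2)


# Hypothetical sample and population values, for illustration only.
y = [2, 4, 9]
x = [1, 2, 3]
print(proposed_variance_estimate(y, x, S2x=1.0, kappa_x=2.0, Mx=2.0))
print(proposed_variance_estimate(y, x, S2x=0.5, kappa_x=2.0, Mx=2.0))
```

When the sample variance of x equals its known population value the ratio is one and the estimate reduces to the sample variance of y. When the sample overstates the spread of x, as in the second call, the estimate is scaled down.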

To calculate the bias and the MSE of ${\hat{S}}_{PM}^{2}$ , we let ${s}_{y}^{2}={S}_{y}^{2}\left(1+{\xi}_{0}\right)$ and ${s}_{x}^{2}={S}_{x}^{2}\left(1+{\xi}_{1}\right)$ , that is ${\xi}_{0}=\frac{{s}_{y}^{2}}{{S}_{y}^{2}}-1$ and ${\xi}_{1}=\frac{{s}_{x}^{2}}{{S}_{x}^{2}}-1$ , so that $E\left({\xi}_{0}\right)=E\left({\xi}_{1}\right)=0$ and, to the first degree of approximation,

$E\left({\xi}_{0}^{2}\right)=\frac{1-f}{n}\left({\lambda}_{40}-1\right)$ (39)

$E\left({\xi}_{1}^{2}\right)=\frac{1-f}{n}\left({\lambda}_{04}-1\right)$ (40)

$E\left({\xi}_{0}{\xi}_{1}\right)=\frac{1-f}{n}\left({\lambda}_{22}-1\right)$ (41)

The expectations are obtained following the works of [17] [18] [19] and [20] .

Now expressing ${\hat{S}}_{PM}^{2}$ in terms of the $\xi $'s we have

$\begin{array}{c}{\hat{S}}_{PM}^{2}={S}_{y}^{2}\left(1+{\xi}_{0}\right)\left\{\frac{{\kappa}_{x}{S}_{x}^{2}+{M}_{x}^{2}}{{\kappa}_{x}{S}_{x}^{2}\left(1+{\xi}_{1}\right)+{M}_{x}^{2}}\right\}\\ ={S}_{y}^{2}\left(1+{\xi}_{0}\right){\left(1+{\rho}^{*}{\xi}_{1}\right)}^{-1}\end{array}$ (42)

where ${\rho}^{*}={\kappa}_{x}{S}_{x}^{2}{\left({\kappa}_{x}{S}_{x}^{2}+{M}_{x}^{2}\right)}^{-1}$ , we assume that $\left|{\rho}^{*}{\xi}_{1}\right|<1$ so that ${\left(1+{\rho}^{\mathrm{*}}{\xi}_{1}\right)}^{-1}$ is expandable.

Expanding the right hand side of (42) and multiplying out we have

$\begin{array}{c}{\hat{S}}_{PM}^{2}={S}_{y}^{2}\left(1+{\xi}_{0}\right)\left(1-{\rho}^{*}{\xi}_{1}+{\rho}^{*2}{\xi}_{1}^{2}-\cdots \right)\\ ={S}_{y}^{2}\left(1+{\xi}_{0}-{\rho}^{*}{\xi}_{1}-{\rho}^{*}{\xi}_{0}{\xi}_{1}+{\rho}^{*2}{\xi}_{1}^{2}+{\rho}^{*2}{\xi}_{0}{\xi}_{1}^{2}-\cdots \right)\end{array}$ (43)

Neglecting terms in the $\xi $'s of power greater than two we have

$\begin{array}{l}{\hat{S}}_{PM}^{2}\cong {S}_{y}^{2}\left(1+{\xi}_{0}-{\rho}^{*}{\xi}_{1}-{\rho}^{*}{\xi}_{0}{\xi}_{1}+{\rho}^{*2}{\xi}_{1}^{2}\right)\\ \text{or}\quad {\hat{S}}_{PM}^{2}-{S}_{y}^{2}\cong {S}_{y}^{2}\left({\xi}_{0}-{\rho}^{*}{\xi}_{1}-{\rho}^{*}{\xi}_{0}{\xi}_{1}+{\rho}^{*2}{\xi}_{1}^{2}\right)\end{array}$ (44)

Taking the expectation on both sides of (44)

$E\left({\hat{S}}_{PM}^{2}-{S}_{y}^{2}\right)\cong E\left({S}_{y}^{2}\left({\xi}_{0}-{\rho}^{*}{\xi}_{1}-{\rho}^{*}{\xi}_{0}{\xi}_{1}+{\rho}^{*2}{\xi}_{1}^{2}\right)\right)$ (45)

and using (40), (41) and $E\left({\xi}_{0}\right)=E\left({\xi}_{1}\right)=0$ , with ${\lambda}_{04}={\kappa}_{x}$ , we obtain the bias

$Bias\left({\hat{S}}_{PM}^{2}\right)=\frac{1-f}{n}{S}_{y}^{2}\left({\kappa}_{x}-1\right){\rho}^{*}\left\{{\rho}^{*}-\frac{{\lambda}_{22}-1}{{\kappa}_{x}-1}\right\}$ (46)

Squaring both sides of (44) and neglecting terms of ${\xi}^{\prime}s$ having power greater than two we have

${\left({\hat{S}}_{PM}^{2}-{S}_{y}^{2}\right)}^{2}\cong {S}_{y}^{4}\left({\xi}_{0}^{2}+{\rho}^{*2}{\xi}_{1}^{2}-2{\rho}^{*}{\xi}_{0}{\xi}_{1}\right)$ (47)

Taking the expectation on both sides of (47)

$E\left({\left({\hat{S}}_{PM}^{2}-{S}_{y}^{2}\right)}^{2}\right)\cong E\left({S}_{y}^{4}\left({\xi}_{0}^{2}+{\rho}^{*2}{\xi}_{1}^{2}-2{\rho}^{*}{\xi}_{0}{\xi}_{1}\right)\right)$ (48)

and substituting (39), (40) and (41), with ${\lambda}_{40}={\kappa}_{y}$ and ${\lambda}_{04}={\kappa}_{x}$ , we get the Mean Squared Error of the estimator ${\hat{S}}_{PM}^{2}$ as

$MSE\left({\hat{S}}_{PM}^{2}\right)=\frac{1-f}{n}{S}_{y}^{4}\left\{\left({\kappa}_{y}-1\right)+{\rho}^{*}\left({\kappa}_{x}-1\right)\left({\rho}^{*}-2\left(\frac{{\lambda}_{22}-1}{{\kappa}_{x}-1}\right)\right)\right\}$ (49)
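The bias (46) and MSE (49) can be evaluated numerically once the population summary values are known. The sketch below does this in Python for the Population I values reported in Section 5. The function names are hypothetical, and f is read as the sampling fraction n/N.

```python
def rho_star(S2x, kappa_x, Mx):
    """Constant rho* of the proposed estimator, defined after Equation (42)."""
    return kappa_x * S2x / (kappa_x * S2x + Mx ** 2)


def bias_pm(N, n, S2y, S2x, kappa_x, lambda22, Mx):
    """First-order bias of the proposed estimator, Equation (46)."""
    f = n / N  # sampling fraction (assumed meaning of f)
    r = rho_star(S2x, kappa_x, Mx)
    a = (lambda22 - 1) / (kappa_x - 1)
    return (1 - f) / n * S2y * (kappa_x - 1) * r * (r - a)


def mse_pm(N, n, S2y, S2x, kappa_y, kappa_x, lambda22, Mx):
    """First-order MSE of the proposed estimator, Equation (49)."""
    f = n / N
    r = rho_star(S2x, kappa_x, Mx)
    a = (lambda22 - 1) / (kappa_x - 1)
    bracket = (kappa_y - 1) + r * (kappa_x - 1) * (r - 2 * a)
    return (1 - f) / n * S2y ** 2 * bracket


# Population I summary values from Section 5.
pop1 = dict(N=80, n=20, S2y=336.979, S2x=71.504,
            kappa_x=2.866, lambda22=2.221, Mx=10.300)

print(f"rho*  = {rho_star(71.504, 2.866, 10.300):.4f}")
print(f"bias  = {bias_pm(**pop1):.4f}")
print(f"MSE   = {mse_pm(kappa_y=2.267, **pop1):.2f}")
```

For these inputs ρ* is about 0.659, close to the ratio (λ22 - 1)/(κx - 1) of about 0.654 at which expression (49) is smallest as a function of ρ*, which is why the first-order bias is small here.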

4. Theoretical Comparison

We now derive the theoretical conditions under which the proposed modified ratio type estimator ${\hat{S}}_{PM}^{2}$ is more efficient than the existing estimators ${t}_{j},j=1,2,\cdots ,11$ . To the first degree of approximation, the MSE of ${t}_{j},j=1,2,\cdots ,11$ can be written in general as

$MSE\left({t}_{j}\right)=\frac{1-f}{n}{S}_{y}^{4}\left[\left({\kappa}_{y}-1\right)+{\Psi}_{j}\left({\kappa}_{x}-1\right)\left({\Psi}_{j}-2\left(\frac{{\lambda}_{22}-1}{{\kappa}_{x}-1}\right)\right)\right]$ (50)

Using Equations (49) and (50) we have that $MSE\left({\hat{S}}_{PM}^{2}\right)<MSE\left({t}_{j}\right)$ ,

if

${\rho}^{*}\left({\rho}^{*}-2\left(\frac{{\lambda}_{22}-1}{{\kappa}_{x}-1}\right)\right)<{\Psi}_{j}\left({\Psi}_{j}-2\left(\frac{{\lambda}_{22}-1}{{\kappa}_{x}-1}\right)\right)$ (51)
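Condition (51) has a simple reading. Writing A = (λ22 - 1)/(κx - 1) and completing the square gives Ψ(Ψ - 2A) = (Ψ - A)² - A², so (51) holds exactly when ρ* lies closer to A than Ψ_j does. The Python sketch below checks both forms side by side. The function names are hypothetical, and the numerical inputs are rounded values derived from the Population I summary statistics in Section 5.

```python
def condition_51(rho_star, psi_j, lambda22, kappa_x):
    """Direct evaluation of inequality (51)."""
    a = (lambda22 - 1) / (kappa_x - 1)
    return rho_star * (rho_star - 2 * a) < psi_j * (psi_j - 2 * a)


def closer_to_optimum(rho_star, psi_j, lambda22, kappa_x):
    """Equivalent form: rho* is nearer to A than Psi_j is."""
    a = (lambda22 - 1) / (kappa_x - 1)
    return abs(rho_star - a) < abs(psi_j - a)


# Rounded Population I values: rho* of the proposed estimator and the
# constants of t1, t2, t8 and t10 (illustrative inputs).
rho = 0.6589
for name, psi_j in [("t1", 0.0), ("t2", 1.0), ("t8", 0.8741), ("t10", 0.8081)]:
    direct = condition_51(rho, psi_j, lambda22=2.221, kappa_x=2.866)
    distance = closer_to_optimum(rho, psi_j, lambda22=2.221, kappa_x=2.866)
    print(name, direct, distance)
```

Both checks agree for every comparison, and the distance form makes it easy to see which competitors a given ρ* can beat.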

5. Empirical Studies

Using the data from Population I (source: [21] , p. 228) and Population II (source: [22] ), we compare the performance of the proposed estimator under simple random sampling without replacement (SRSWOR) with that of the sample variance and the existing estimators. We apply the proposed and existing estimators to these data sets; the summary statistics are given below:

Population I

$X$ = fixed capital

$Y$ = output of 80 factories

$N=80$ , $n=20$ , $\bar{X}=11.265$ , $\bar{Y}=51.826$ ,

${S}_{x}^{2}=71.504$ , ${S}_{y}^{2}=336.979$ , ${S}_{xy}=146.068$ ,

${\lambda}_{04}={\kappa}_{x}=2.866$ , ${\lambda}_{40}={\kappa}_{y}=2.267$ , ${\lambda}_{22}=2.221$ ,

${\rho}_{xy}=0.941$ , ${C}_{y}=0.354$ , ${C}_{x}=0.751$

${M}_{x}=10.300$ , ${Q}_{1}=5.150$ , ${Q}_{3}=16.975$

Population II

$X$ = acreage under wheat crop in 1973

$Y$ = acreage under wheat crop in 1974

$N=70$ , $n=25$ , $\bar{X}=175.2671$ , $\bar{Y}=96.700$ ,

${S}_{x}^{2}=19840.7508$ , ${S}_{y}^{2}=3686.1898$ ,

${\lambda}_{04}={\kappa}_{x}=7.0952$ , ${\lambda}_{40}={\kappa}_{y}=4.7596$ , ${\lambda}_{22}=4.6038$ ,

${\rho}_{xy}=0.7293$ , ${C}_{y}=0.6254$ , ${C}_{x}=0.8037$

${M}_{x}=72.4375$ , ${Q}_{1}=80.1500$ , ${Q}_{3}=225.0250$ .

Using the above summary values we obtain the results in Table 1. From the Mean Squared Errors in the table it is clear that our proposed modified ratio type population variance estimator ${\hat{S}}_{PM}^{2}$ has the least Mean Squared Error (MSE).

Table 1. Bias and Mean Squared Errors (MSE).

The efficiency of our proposed estimator ${\hat{S}}_{PM}^{2}$ is examined numerically through its Percentage Relative Efficiency (PRE) in comparison with those of the existing estimators, using the real populations from [21] , p. 228 and [22] .

We have computed the PREs of the estimators ${t}_{j},j=1,2,\cdots ,11$ using the formula

$PRE\left({t}_{j},{s}_{y}^{2}\right)=\frac{MSE\left({s}_{y}^{2}\right)}{MSE\left({t}_{j}\right)}\times 100$ (52)

$=\frac{\frac{1-f}{n}{S}_{y}^{4}\left({\kappa}_{y}-1\right)}{\frac{1-f}{n}{S}_{y}^{4}\left[\left\{{\kappa}_{y}-1\right\}+\left\{{\kappa}_{x}-1\right\}{\Psi}_{j}\left({\Psi}_{j}-2\left(\frac{{\lambda}_{22}-1}{{\kappa}_{x}-1}\right)\right)\right]}\times 100$ (53)

$=\frac{{\kappa}_{y}-1}{\left\{{\kappa}_{y}-1\right\}+\left\{{\kappa}_{x}-1\right\}{\Psi}_{j}\left({\Psi}_{j}-2\left(\frac{{\lambda}_{22}-1}{{\kappa}_{x}-1}\right)\right)}\times 100$ (54)

The PRE of our proposed estimator is then

$PRE\left({\hat{S}}_{PM}^{2},{s}_{y}^{2}\right)=\frac{MSE\left({s}_{y}^{2}\right)}{MSE\left({\hat{S}}_{PM}^{2}\right)}\times 100$ (55)

$=\frac{\frac{1-f}{n}{S}_{y}^{4}\left({\kappa}_{y}-1\right)}{\frac{1-f}{n}{S}_{y}^{4}\left\{\left({\kappa}_{y}-1\right)+{\rho}^{*}\left({\kappa}_{x}-1\right)\left({\rho}^{*}-2\left(\frac{{\lambda}_{22}-1}{{\kappa}_{x}-1}\right)\right)\right\}}\times 100$ (56)

$=\frac{{\kappa}_{y}-1}{\left({\kappa}_{y}-1\right)+{\rho}^{*}\left({\kappa}_{x}-1\right)\left({\rho}^{*}-2\left(\frac{{\lambda}_{22}-1}{{\kappa}_{x}-1}\right)\right)}\times 100$ (57)

Using formulae (54) and (57) we compute the Percentage Relative Efficiencies and tabulate the results in Table 2.
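Formulae (54) and (57) have the same shape, with ρ* playing the role of Ψ_j for the proposed estimator. The Python sketch below evaluates them from the summary values listed above for both populations. It is an illustrative calculation with hypothetical function names, not a reproduction of the authors' tables, and it compares the proposed estimator only with the ratio estimator t_2.

```python
def pre(psi, kappa_y, kappa_x, lambda22):
    """Percentage Relative Efficiency against the sample variance, Eq. (54)/(57)."""
    a = (lambda22 - 1) / (kappa_x - 1)
    denominator = (kappa_y - 1) + (kappa_x - 1) * psi * (psi - 2 * a)
    return (kappa_y - 1) / denominator * 100


def rho_star(S2x, kappa_x, Mx):
    """Constant of the proposed estimator."""
    return kappa_x * S2x / (kappa_x * S2x + Mx ** 2)


# Summary values as listed for the two populations in this section.
populations = {
    "Population I": dict(S2x=71.504, Mx=10.300,
                         kappa_y=2.267, kappa_x=2.866, lambda22=2.221),
    "Population II": dict(S2x=19840.7508, Mx=72.4375,
                          kappa_y=4.7596, kappa_x=7.0952, lambda22=4.6038),
}

for label, p in populations.items():
    moments = dict(kappa_y=p["kappa_y"], kappa_x=p["kappa_x"], lambda22=p["lambda22"])
    r = rho_star(p["S2x"], p["kappa_x"], p["Mx"])
    print(label)
    print(f"  PRE of ratio estimator t2 : {pre(1.0, **moments):.2f}")
    print(f"  PRE of proposed estimator : {pre(r, **moments):.2f}")
```

The common factor (1 - f)/n and the fourth power of S_y cancel in the ratio, so the PRE depends only on the moment ratios and the estimator's constant.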

Table 2. Percent Relative Efficiencies (PRE).

Percentage Relative Efficiency is a tool used to measure the efficiency of one estimator relative to another. From the findings summarized in Table 2 it is clear that our proposed estimator ${\hat{S}}_{PM}^{2}$ performs best, that is, it has the highest PRE among all the estimators considered. This implies that the proposed estimator can be applied in appropriate practical situations to obtain more efficient results than the traditional and other existing population variance estimators.

6. Conclusions

In this study we have proposed a modified ratio type population variance estimator that uses two known population parameters of the auxiliary variable x, namely its coefficient of kurtosis and its median.

We have analyzed the performance of our proposed estimator against the usual unbiased variance estimator and existing estimators using two natural populations by comparing their PREs.

Based on the results of our studies, the proposed estimator performs better than the other existing estimators considered, having the highest Percentage Relative Efficiency, and hence can be used in practical applications where the population parameters of the auxiliary variable are known. We also recommend that the proposed estimator be studied further by extending the Taylor series expansion beyond the second-order terms.

Acknowledgements

We give much appreciation to the authors for the numerous and valuable contribution to this work.

Conflicts of Interest

The authors declare that there is no conflict of interest in the publication of this paper.

Cite this paper

Milton, T.K., Odhiambo, R.O. and Orwa, G.O. (2017) Estimation of Population Variance Using the Coefficient of Kurtosis and Median of an Auxiliary Variable under Simple Random Sampling. Open Journal of Statistics, 7, 944-955. https://doi.org/10.4236/ojs.2017.76066

References

- 1. Cochran, W. (1940) The Estimation of the Yields of Cereal Experiments by Sampling for the Ratio of Grain to Total Produce. The Journal of Agricultural Science, 30, 262-275. https://doi.org/10.1017/S0021859600048012
- 2. Searls, D.T. (1964) The Utilization of a Known Coefficient of Variation in the Estimation Procedure. Journal of the American Statistical Association, 59, 1225-1226. https://doi.org/10.1080/01621459.1964.10480765
- 3. Sen, A.R. (1978) Estimation of the Population Mean When the Coefficient of Variation Is Known. Communications in Statistics—Theory and Methods, 7, 657-672. https://doi.org/10.1080/03610927808827656
- 4. Sisodia, B. and Dwivedi, V. (1981) Modified Ratio Estimator Using Coefficient of Variation of Auxiliary Variable. Journal-Indian Society of Agricultural Statistics, 33, 13-18.
- 5. Upadhyaya, L.N. and Singh, H.P. (1984) On the Estimation of the Population Mean with Known Coefficient of Variation. Biometrical Journal, 26, 915-922. https://doi.org/10.1002/bimj.4710260814
- 6. Hirano, K., Pandey, B.N. and Singh, J. (1973) On the Utilization of a Known Coefficient of Kurtosis in the Estimation Procedure of Variance. Annals of the Institute of Statistical Mathematics, 25, 51-55. https://doi.org/10.1007/BF02479358
- 7. Isaki, C.T. (1983) Variance Estimation Using Auxiliary Information. Journal of the American Statistical Association, 78, 117-123. https://doi.org/10.1080/01621459.1983.10477939
- 8. Searls, D.T. and Intarapanich, P. (1990) A Note on an Estimator for the Variance That Utilizes the Kurtosis. The American Statistician, 44, 295-296.
- 9. Upadhyaya, L. and Singh, H. (1999) An Estimator for Population Variance That Utilizes the Kurtosis of an Auxiliary Variable in Sample Surveys. Vikram Mathematical Journal, 19, 14-17.
- 10. Kadilar, C. and Cingi, H. (2006) Ratio Estimators for the Population Variance in Simple and Stratified Random Sampling. Applied Mathematics and Computation, 173, 1047-1059. https://doi.org/10.1016/j.amc.2005.04.032
- 11. Subramani, J. and Kumarapandiyan, G. (2012) Variance Estimation Using Median of the Auxiliary Variable. International Journal of Probability and Statistics, 1, 6-40. https://doi.org/10.5923/j.ijps.20120103.02
- 12. Subramani, J. and Kumarapandiyan, G. (2012) Variance Estimation Using Quartiles and Their Functions of an Auxiliary Variable. International Journal of Statistics and Applications, 2, 67-72. https://doi.org/10.5923/j.statistics.20120205.04
- 13. Subramani, J. and Kumarapandiyan, G. (2013) Estimation of Variance Using Known Coefficient of Variation and Median of an Auxiliary Variable. Journal of Modern Applied Statistical Methods, 12, 11. https://doi.org/10.22237/jmasm/1367381400
- 14. Khan, M. and Shabbir, J. (2013) A Ratio Type Estimator for the Estimation of Population Variance Using Quartiles of an Auxiliary Variable. Journal of Statistics Applications and Probability, 2, 157-162. https://doi.org/10.12785/jsap/020314
- 15. Singh, H.P., Tailor, R., Tailor, R. and Kakran, M. (2004) An Improved Estimator of Population Mean Using Power Transformation. Journal of the Indian Society of Agricultural Statistics, 58, 223-230.
- 16. Yadav, S.K., Misra, S. and Mishra, S. (2016) Efficient Estimator for Population Variance Using Auxiliary Variable. American Journal of Operational Research, 6, 9-15.
- 17. Sukhatme, P. (1944) Moments and Product Moments of Moment-Statistics for Samples of the Finite and Infinite Populations. Sankhya: The Indian Journal of Statistics, 6, 363-382.
- 18. Sukhatme, P. and Sukhatme, B. (1970) Sampling Theory of Surveys with Applications. Asia Publishing House, Bombay.
- 19. Srivastava, S.K. and Jhajj, H.S. (1981) A Class of Estimators of the Population Mean in Survey Sampling Using Auxiliary Information. Biometrika, 68, 341-343. https://doi.org/10.1093/biomet/68.1.341
- 20. Tracy, D.S. (1984) Moments of Sample Moments. Communications in Statistics-Theory and Methods, 13, 553-562. https://doi.org/10.1080/03610928408828700
- 21. Murthy, M.N. (1967) Sampling Theory and Methods. Statistical Publishing Society, Barrackpore.
- 22. Daroga, S. and Chaudhary, F. (1986) Theory and Analysis of Sample Survey Designs. Wiley Eastern Limited, New York.