Ratio-Cum-Product Estimator Using Multiple Auxiliary Attributes in Two-Phase Sampling
1. Introduction
The use of supplementary information is a widely discussed issue in sampling theory. Auxiliary variables are commonly used in sample surveys to obtain improved designs and to achieve higher precision in the estimates of population parameters such as the mean or the variance of a study variable. The concept of ratio estimation was introduced in sample surveys by Cochran [1]. It is preferred when the study variable is highly positively correlated with the auxiliary variable. Murthy [2] proposed the product estimator, analogous to the ratio estimator, for the case where the study variable and the auxiliary variable are negatively correlated.
Olkin [3] was the first author to use information on more than one supplementary characteristic positively correlated with the variable under study, using a linear combination of ratio estimators based on each auxiliary variable. Raj [4] suggested a method of using multi-auxiliary information in sample surveys. Using this idea, Singh [5] proposed a multivariate expression of the product estimator where the study variable is negatively correlated with the auxiliary variables. In the same year, Singh [6] proposed a ratio-cum-product estimator and its multivariate expression. Singh and Tailor [7] proposed a ratio-cum-product estimator for the finite population mean in simple random sampling using the coefficient of variation and coefficient of kurtosis, which was more efficient than the previous ratio-cum-product estimator.
Jhajj, Sharma and Grover [8] proposed a general family of estimators using information on an auxiliary attribute. They used known information on the population proportion possessing an attribute (highly correlated with the study variable Y). Attributes are normally used when auxiliary variables are not available, e.g. amount of milk produced and a particular breed of cow, or yield of wheat and a particular variety of wheat. Bahl and Tuteja [9] proposed ratio and product type exponential estimators using an auxiliary attribute. Rajesh, Pankaj, Nirmala and Florentins [10] used the information on an auxiliary attribute in a ratio estimator of the population mean of the variable of interest, using known parameters such as the coefficient of variation, coefficient of kurtosis and point biserial correlation coefficient. The estimator performed better than the usual sample mean and the Naik and Gupta [11] estimator. Rajesh, Pankaj, Nirmala and Florentins [10] also used auxiliary attributes in a ratio-product type exponential estimator following the work of Bahl and Tuteja [9]; the estimator was more efficient than the mean per unit, ratio and product type exponential estimators as well as the Naik and Gupta [11] estimator.
The concept of double sampling was first proposed by Neyman [12] for sampling human populations when the mean of the auxiliary variable(s) was unknown. It was later extended to multiphase sampling by Robson [13]. It is advantageous when the gain in precision is substantial compared to the increase in cost due to the collection of information on the auxiliary variable for large samples. In most surveys, auxiliary information is available, and every form of auxiliary information should be used in developing sampling strategies. Samiuddin and Hanif [14] introduced the following approach to using auxiliary variables:
1) Full information case: Information for all auxiliary variables is available.
2) No information case: Information is not available for any auxiliary variable.
3) Partial information case: Information for some auxiliary variables is available for all population units.
We have used these strategies to develop ratio-cum-product estimators using multiple auxiliary attributes for full information, partial and no information cases.
Hanif, Haq and Shahbaz [15] proposed a general family of estimators using multiple auxiliary attributes in single and double phase sampling. The estimators had a smaller MSE than that of Jhajj, Sharma and Grover [8]. They also extended their work to ratio, product and regression estimators, which were generalizations of the Naik and Gupta [11] estimator in single and double phase sampling with full information, partial information and no information.
The concept of multiple auxiliary attributes was proposed by Hanif, Haq and Shahbaz [16], and then extended to ratio and product estimators. In this paper, we incorporate multiple auxiliary attributes in the ratio-cum-product estimator in two-phase sampling as proposed by Singh [7], use the strategies introduced by Samiuddin and Hanif [14], and follow the Arora and Bansi [17] approach in writing down the mean squared error.
2. Preliminaries and Notations
2.1. Notations
Consider a population of N units. Let Y be the variable whose population mean we want to estimate, and let $\tau_1, \tau_2, \ldots, \tau_q$ be q auxiliary attributes. For the two-phase sampling design, let $n_1$ and $n_2$ ($n_2 < n_1$) be the sample sizes for the first and second phase respectively. In defining the attributes we assume complete dichotomy, so that $\tau_{ij} = 1$ if the $i$th unit possesses the $j$th attribute and $\tau_{ij} = 0$ otherwise. Let $A_j = \sum_{i=1}^{N} \tau_{ij}$ and $a_j = \sum_{i=1}^{n} \tau_{ij}$ be the total number of units in the population and in the sample, respectively, possessing attribute $\tau_j$. Let $P_j = A_j / N$ and $p_j = a_j / n$ be the corresponding proportions of units possessing attribute $\tau_j$, and let $\bar{y}_2$ be the mean of the main variable at the second phase. Let $\tau_{(1)j}$ and $\tau_{(2)j}$ denote the $j$th auxiliary attribute from the first and second phase samples respectively, and let $y_2$ denote the variable of interest from the second phase. Let $P_j$ and $C_{\tau_j}$ denote the population mean and coefficient of variation of the $j$th auxiliary attribute respectively, and let $\rho_{Pb_j}$ denote the population biserial correlation coefficient of Y and $\tau_j$. Let $p_{(1)j}$ be the proportion of units possessing attribute $\tau_j$ in the first phase sample of size $n_1$, while $p_{(2)j}$ is the proportion of units possessing attribute $\tau_j$ in the second phase sample of size $n_2$. Further, let
, , (1.0)
where, and are sampling error which are assumed to be very small. We let
, , (1.1)
while,
(1.2)
Here we shall take to, term of order as
(1.3)
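Let $C_y$ and $C_{\tau_j}$ be the coefficients of variation of the study variable and of the $j$th auxiliary attribute respectively. The biserial correlation coefficient between the study variable and the $j$th auxiliary attribute is given by $\rho_{Pb_j} = S_{y\tau_j} / (S_y S_{\tau_j})$, where $S_{y\tau_j}$ is their covariance and $S_y$, $S_{\tau_j}$ are the corresponding standard deviations. Then, for simple random sampling without replacement at both the first and second phases, we write, using phase-wise operation of expectations: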
(1.4)
If A is a nonsingular square matrix, its inverse can be written using the adjoint matrix as
$A^{-1} = \dfrac{\operatorname{adj}(A)}{|A|}$ (1.5)
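The adjoint formula (inverse equals adjoint matrix divided by determinant) can be checked numerically on a small correlation matrix. The following is a minimal sketch, not part of the original derivation: the helper names `minor`, `det`, `adjugate` and `inverse` and the example matrix `R` are illustrative assumptions, and cofactor expansion is only sensible for small matrices.

```python
from fractions import Fraction


def minor(matrix, i, j):
    """Return the matrix with row i and column j removed."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(matrix) if k != i]


def det(matrix):
    """Determinant by cofactor expansion along the first row."""
    if len(matrix) == 1:
        return matrix[0][0]
    return sum((-1) ** j * matrix[0][j] * det(minor(matrix, 0, j))
               for j in range(len(matrix)))


def adjugate(matrix):
    """Adjoint (adjugate) matrix: the transpose of the cofactor matrix."""
    n = len(matrix)
    if n == 1:
        return [[Fraction(1)]]
    cofactors = [[(-1) ** (i + j) * det(minor(matrix, i, j)) for j in range(n)]
                 for i in range(n)]
    return [[cofactors[j][i] for j in range(n)] for i in range(n)]


def inverse(matrix):
    """Inverse as adjugate divided by determinant; fails for singular input."""
    d = det(matrix)
    if d == 0:
        raise ValueError("matrix is singular, so it has no inverse")
    return [[Fraction(entry) / d for entry in row] for row in adjugate(matrix)]


# Hypothetical 2 x 2 correlation matrix with correlation 1/2 (illustration only).
R = [[Fraction(1), Fraction(1, 2)],
     [Fraction(1, 2), Fraction(1)]]
R_inv = inverse(R)
print(det(R), R_inv)
```

Exact fractions are used so that the product of a matrix and its computed inverse can be compared with the identity matrix without rounding error.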
Arora and Lai [1] (1.6)
The following notations will be used in deriving the mean squared errors of the proposed estimators:
Determinant of population correlation matrix of attributes.
Determinant of minor of corresponding to the element of.
Denotes the multiple coefficient of determination of y on.
Denotes the multiple coefficient of determination of y on.
Determinant of population correlation matrix of attributes.
Determinant of the correlation matrix of.
Determinant of the correlation matrix of.
Determinant of the minor corresponding to of the correlation matrix of
.
Determinant of the minor corresponding to of the correlation matrix
(1.7)
2.2. Mean per Unit in Two-Phase Sampling
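The sample mean under simple random sampling without replacement is given by
$\bar{y}_2 = \dfrac{1}{n_2} \sum_{i=1}^{n_2} y_i$ (1.8)
where $n_2$ is the second-phase sample size, while the variance of $\bar{y}_2$ is given by
$V(\bar{y}_2) = \left( \dfrac{1}{n_2} - \dfrac{1}{N} \right) S_y^2$ (1.9)
where $S_y^2$ is the population variance of Y (with divisor $N - 1$).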
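As a quick check on the mean per unit estimator, the sketch below assumes the standard result for simple random sampling without replacement, namely that the variance of the sample mean equals (1/n − 1/N) times the population variance with divisor N − 1. The function names and the toy population are hypothetical, introduced only for illustration; the closed form is compared against the exact variance obtained by enumerating every possible sample.

```python
import itertools
import statistics


def mean_per_unit(sample):
    """Mean per unit estimator: the plain sample mean."""
    return statistics.fmean(sample)


def variance_of_mean_srswor(population, n):
    """Closed-form variance of the sample mean under SRSWOR with sample size n."""
    N = len(population)
    s2 = statistics.variance(population)  # population variance with divisor N - 1
    return (1 / n - 1 / N) * s2


def exact_variance_by_enumeration(population, n):
    """Variance of the sample mean over all equally likely samples of size n."""
    means = [statistics.fmean(s) for s in itertools.combinations(population, n)]
    return statistics.pvariance(means)


toy_population = [1, 2, 3, 4, 5]  # made-up values, not the simulated data of the paper
print(variance_of_mean_srswor(toy_population, 2))
print(exact_variance_by_enumeration(toy_population, 2))
```

Enumeration is feasible only for very small populations; it is used here purely to confirm the closed form.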
2.3. Ratio and Product Estimators in Two-Phase Sampling Using One Auxiliary Attribute
To estimate the population mean of the study variable y when the population proportion P of units possessing the attribute is known in advance, Naik and Gupta [11] proposed the following ratio and product estimators:
(1.10)
(1.11)
The MSE of and up to the first order of approximation are given respectively by,
(1.12)
(1.13)
The optimum value are and for ratio and product estimator respectively.
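A minimal sketch of ratio and product estimation with a single auxiliary attribute follows. It assumes the classical forms: the sample mean multiplied by P/p for the ratio estimator and by p/P for the product estimator, where P is the known population proportion and p the sample proportion. The two-phase variants shown, in which the first-phase proportion stands in for an unknown P, are the usual double sampling analogue and are likewise an assumption; they may differ from the exact expressions (1.10) and (1.11). All function and parameter names are hypothetical.

```python
def ratio_estimator(y_bar, p_sample, p_population):
    """Ratio-type estimate: scale the sample mean by P / p."""
    if p_sample <= 0:
        raise ValueError("sample proportion must be positive")
    return y_bar * p_population / p_sample


def product_estimator(y_bar, p_sample, p_population):
    """Product-type estimate: scale the sample mean by p / P."""
    if p_population <= 0:
        raise ValueError("population proportion must be positive")
    return y_bar * p_sample / p_population


def two_phase_ratio_estimator(y2_bar, p_first, p_second):
    """Assumed two-phase analogue: the first-phase proportion replaces P."""
    return ratio_estimator(y2_bar, p_second, p_first)


def two_phase_product_estimator(y2_bar, p_first, p_second):
    """Assumed two-phase analogue of the product estimator."""
    return product_estimator(y2_bar, p_second, p_first)


# Made-up inputs: sample mean 50, sample proportion 0.4, known proportion 0.5.
print(ratio_estimator(50.0, 0.4, 0.5))
print(product_estimator(50.0, 0.4, 0.5))
```

The ratio form adjusts the mean upward when the attribute is under-represented in the sample, which helps when the attribute is positively correlated with the study variable; the product form adjusts in the opposite direction.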
2.4. Ratio and Product Estimators Using Multiple Auxiliary Attributes in Two-Phase Sampling
The ratio and product estimators by Hanif, Haq and Shahbaz [5] for single phase sampling using information on multiple auxiliary attributes are given respectively by,
(1.13)
(1.14)
The MSE of the and up to the first order of approximation are given respectively by,
(1.15)
(1.16)
3. Methodology
3.1. Ratio-Cum-Product Estimator Using Multiple Auxiliary Attributes for Full Information Case in Two-Phase Sampling
When information on all auxiliary attributes is available from the population, it is utilized in the form of their means. Taking advantage of the ratio-cum-product technique for two-phase sampling, a generalized estimator of the population mean of the study variable Y using multiple auxiliary attributes is proposed as:
(3.1)
Using (1.0), (1.1) in (3.1) and ignoring the second and higher terms for each expansion of product and after simplification, we write,
(3.2)
The mean squared error of ratio-cum-product estimator is:
(3.3)
Differentiating Equation (3.3) partially with respect to each of the unknown constants, equating the derivatives to zero, and using (1.4), (1.5) and (1.7), we get
(3.4)
(3.5)
Using the normal equations that are used to find the optimum values of the constants, (3.3) can be written in simplified form as:
(3.6)
Using (1.4) in (3.6), we get,
(3.7)
Using the optimum value and in (3.4) and (3.5) and (3.7), we get,
(3.8)
Or
(3.9)
Or
(3.11)
Using (1.6) in (3.11), we get,
(3.12)
3.2. Ratio-Cum-Product Estimator Using Multiple Auxiliary Attributes for Partial Information Case in Two-Phase Sampling
In this section, we propose a ratio-cum-product estimator using multiple auxiliary attributes for the partial information case in two-phase sampling. It uses k auxiliary attributes, with “s” known and “k − s” unknown, which are positively correlated with the study variable Y, and g − k auxiliary attributes, with “g − t” known and “g − k + t” unknown, which are negatively correlated with the study variable Y. The proposed ratio-cum-product estimator for the partial information case is as follows:
(3.13)
Using (1.0), (1.1) in (3.13) and ignoring the second and higher terms for each expansion of product and after simplification, we write,
(3.14)
Mean squared error of is given by
(3.15)
Differentiating Equation (3.15) with respect to each of the unknown constants, equating the derivatives to zero, and using (1.4), (1.6) and (1.7), the optimum values are as follows:
(3.16)
(3.17)
(3.18)
(3.19)
(3.20)
(3.21)
Using the normal equations that are used to find the optimum values of the constants, (3.15) can be written in simplified form as
(3.22)
Substituting (1.4), (3.14) to (3.19) in (3.20), we get
(3.23)
Or
(3.24)
Or
(3.25)
Or
(3.26)
Using (1.6) in (3.26), we get,
(3.27)
Or
(3.28)
3.3. Ratio-Cum-Product Estimator in Two-Phase Sampling (No Information Case)
When information on all auxiliary attributes is unavailable from the population, the first-phase sample proportions are used in place of the unknown population proportions. Taking advantage of the ratio-cum-product technique for two-phase sampling, a generalized estimator of the population mean of the study variable Y using multiple auxiliary attributes is suggested as:
(3.29)
Using (1.0), (1.1) in (3.29) and ignoring the second and higher terms for each expansion of product and after simplification, we write,
(3.30)
Mean squared error of estimator is given by
(3.31)
Differentiating Equation (3.31) partially with respect to each of the unknown constants, equating the derivatives to zero, and using (1.4), (1.5) and (1.7), we get:
(3.32)
(3.33)
Using the normal equations that are used to find the optimum values of the constants, (3.31) can be written in simplified form as:
(3.34)
Using (1.4) in (3.34), we get,
(3.35)
Substituting equation (3.32) and (3.33) in (3.35), we get
(3.36)
(3.37)
Or
(3.38)
Using (1.6) in (3.38), we get,
(3.39)
3.4. Bias and Consistency of Ratio-cum-Product Estimators
These ratio-cum-product estimators using multiple auxiliary attributes in two-phase sampling are biased. However, the biases are negligible for moderate and large samples.
It is easily shown that the ratio-cum-product estimators using multiple auxiliary attributes are consistent: since they are linear combinations of consistent estimators, it follows that they are also consistent.
4. Simulation, Result and Discussion
In this section, we carry out data simulation experiments to compare the performance of the ratio-cum-product estimator in two-phase sampling using multiple auxiliary attributes with existing estimators of the finite population mean that use one or multiple auxiliary attributes, namely the mean per unit, the ratio and product estimators using one auxiliary attribute, and the ratio and product estimators using two auxiliary attributes. The simulated data for the empirical study include a study variable and auxiliary attributes that are normally distributed with the following parameters:
N = 300, n = 45, Mean = 45, standard deviation = 5
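The sketch below shows one way such a two-phase simulation could be set up; it is not the authors' code. It assumes that n = 45 is the second-phase sample size, uses a hypothetical first-phase size of 100 (the text does not state one), and builds a hypothetical binary attribute by thresholding a noisy copy of the study variable. The function names and the seed are likewise illustrative.

```python
import random


def simulate_two_phase(N=300, n1=100, n2=45, mean=45.0, sd=5.0, seed=1):
    """Generate a population and draw nested first- and second-phase samples."""
    rng = random.Random(seed)
    y = [rng.gauss(mean, sd) for _ in range(N)]
    # Hypothetical attribute: 1 when a noisy copy of y exceeds the mean, else 0,
    # so that the attribute is positively associated with y.
    tau = [1 if value + rng.gauss(0.0, sd) > mean else 0 for value in y]
    first = rng.sample(range(N), n1)   # first-phase SRSWOR from the population
    second = rng.sample(first, n2)     # second-phase SRSWOR from the first phase
    return y, tau, first, second


def summarize(y, tau, first, second):
    """Second-phase mean of y and the attribute proportions at both phases."""
    y2_bar = sum(y[i] for i in second) / len(second)
    p1 = sum(tau[i] for i in first) / len(first)
    p2 = sum(tau[i] for i in second) / len(second)
    return y2_bar, p1, p2


y, tau, first, second = simulate_two_phase()
print(summarize(y, tau, first, second))
```

Repeating the draw over many seeds and averaging the squared errors of each estimator would give empirical mean squared errors for comparison.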
To evaluate the efficiency gain achieved by the proposed estimators, we calculated the variance of the mean per unit and the mean squared error of every estimator considered. We then calculated the percent relative efficiency of each estimator in relation to the variance of the mean per unit and compared these values; the estimator with the highest percent relative efficiency is considered the most efficient. The efficiency is calculated using the following formula:
$\mathrm{PRE}(t) = \dfrac{V(\bar{y})}{\mathrm{MSE}(t)} \times 100$ (4.0)
where $V(\bar{y})$ is the variance of the mean per unit and $\mathrm{MSE}(t)$ is the mean squared error of the estimator $t$.
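For illustration, percent relative efficiency can be computed as below, assuming the usual definition: the variance of the mean per unit divided by the mean squared error of the estimator, multiplied by 100. The function name and the numerical inputs are hypothetical and are not results from the study.

```python
def percent_relative_efficiency(var_mean_per_unit, mse_estimator):
    """PRE of an estimator relative to the mean per unit, in percent."""
    if mse_estimator <= 0:
        raise ValueError("mean squared error must be positive")
    return 100.0 * var_mean_per_unit / mse_estimator


# Made-up numbers: an estimator with half the MSE of the mean per unit.
print(percent_relative_efficiency(2.0, 1.0))
```

A value above 100 means the estimator has a smaller mean squared error than the mean per unit; the mean per unit itself scores exactly 100.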
Table 1 shows the percent relative efficiency of the existing and proposed estimators with respect to the mean per unit estimator for two-phase sampling. It is observed that the ratio and product estimators using one auxiliary attribute are more efficient than the mean per unit in two-phase sampling. Again, the ratio and product estimators using multiple auxiliary attributes are more efficient than the mean per unit and the ratio and product estimators using one auxiliary attribute in two-phase sampling. Finally, the ratio-cum-product estimator in two-phase sampling for the full information case using multiple auxiliary attributes is the most efficient of the five estimators, since it has the highest percent relative efficiency.
Table 2 shows the percent relative efficiency of the ratio-cum-product estimators with respect to the mean per unit estimator in two-phase sampling. It is observed that the ratio-cum-product estimators are more efficient than the mean per unit in the second phase sampling.
Finally, Table 3 compares the efficiency of full information case and partial case to no information case and full to partial information case. It is observed that the full information case and partial information case are more efficient than no information case because they have higher Percent Relative Efficiency than no information case. In addition, the full information case is more efficient than the partial information case because it has a higher Percent Relative Efficiency than partial information case.
5. Conclusion
The ratio-cum-product estimator using multiple auxiliary attributes in the full information case in two-phase sampling is recommended for estimating the population mean, as it outperforms the other estimators in two-phase sampling. If some auxiliary attributes are known, the ratio-cum-product estimator using multiple auxiliary attributes in the partial information case should be used, but if all the auxiliary attributes are unknown, the ratio-cum-product estimator using multiple auxiliary attributes in the no information case should be used to estimate the finite population mean. This is clear from Table 3.
Table 1. Relative efficiency of existing and proposed estimator with respect to mean per unit estimator for two-phase sampling.
Table 2. Relative efficiency of existing and proposed estimators with respect to mean per unit estimator for two-phase sampling.
Table 3. Comparisons of full, partial and no information cases for proposed ratio-cum-product estimator using multiple auxiliary variables.