Estimation of Population Ratio in Post-Stratified Sampling Using Variable Transformation

Extending the work carried out by [1], this paper proposes six combined-type estimators of the population ratio of two variables in the post-stratified sampling scheme, using variable transformation. Properties of the proposed estimators were obtained up to first order approximation, $o(n^{-1})$, both for achieved sample configurations (conditional argument) and over repeated samples of fixed size $n$ (unconditional argument). Efficiency conditions were obtained, under which the proposed combined-type estimators would perform better than the associated customary combined-type estimator. Furthermore, optimum estimators among the proposed combined-type estimators were obtained under both the conditional and unconditional arguments. An empirical study confirmed the theoretical results.


Introduction
The use of information on an auxiliary character to improve estimates of population parameters of the study variable is a common practice in sample surveys, and sometimes information on several variables is used to estimate or predict a characteristic of interest. Investigators often collect observations on more than one variable, including the variable of interest, $y$, and some auxiliary variables, $x$. The use of these variables (known as auxiliary information in sample survey design) often results in efficient estimates of population parameters (e.g. mean, ratio, proportion, etc.) under some realistic conditions, especially when there is a strong correlation between the study variable and the auxiliary variables. Many authors have made contributions in this regard, including [2] and [3]. In this context, the ratio, product and regression methods of estimation are good examples. Ratio and product-type estimators take advantage of the correlation between the auxiliary variable and the study variable to improve the estimate of the characteristic of interest. For example, when information is available on an auxiliary variable that is highly positively correlated with the study variable, the ratio method of estimation proposed by [4] is suitable for estimating the population mean, and when the correlation is negative, the product method of estimation, as envisaged by [5] and [6], is appropriate. However, in some studies, the ratio of the population means (or totals) of the study and auxiliary variables might be of great significance, hence the need to estimate such ratios.

The customary estimator of the population ratio, $R = \bar{Y}/\bar{X}$, of the population means of two variables, $y$ and $x$, under the simple random sampling scheme, is given as $\hat{R} = \bar{y}/\bar{x}$, which is the ratio of the sample means of the two variables ([2] and [7]). The estimator $\hat{R} = \bar{y}/\bar{x}$ uses information on only two variables, namely the study variable ($y$) and one auxiliary variable ($x$). However, several authors, like [7] and [8], have contributed to the problem of estimating the population ratio of two means, often utilizing additional information on one or more auxiliary variables. While it is possible to record increased efficiency by introducing such additional auxiliary variables, extra cost is involved in obtaining information on them. References [1] and [9] have argued that such extra cost could be avoided by using a variable transformation of the already observed auxiliary variable, instead of introducing additional (new) auxiliary variables. However, the works carried out by [1] and [9] were restricted to estimation of the population ratio in the simple random sampling scheme. The present study extends these works on ratio estimation to the post-stratified sampling scheme, so that the advantage of reduced cost from using variable transformation, instead of introducing additional (new) auxiliary variables, carries over to other sampling schemes when estimating the ratio of two population parameters.
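As a minimal sketch of the customary estimator $\hat{R} = \bar{y}/\bar{x}$ described above, the following Python fragment computes the ratio of two sample means. The function name `ratio_estimate` and the sample values are illustrative assumptions and are not taken from the study.

```python
from statistics import fmean


def ratio_estimate(y, x):
    """Customary ratio estimate: sample mean of y divided by sample mean of x."""
    if len(y) != len(x) or not y:
        raise ValueError("y and x must be non-empty and of equal length")
    x_bar = fmean(x)
    if x_bar == 0:
        raise ValueError("sample mean of x is zero, so the ratio is undefined")
    return fmean(y) / x_bar


# Illustrative (made-up) paired observations from a simple random sample.
y_sample = [3.2, 2.8, 3.9, 3.5]
x_sample = [2.0, 4.0, 1.0, 3.0]
r_hat = ratio_estimate(y_sample, x_sample)
print(f"R_hat = {r_hat:.4f}")
```

Only the two sample means enter the estimate, which is why additional auxiliary information, or a transformation of $x$, is needed to improve on it.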

The Proposed Combined-Type Estimators
Let $n$ units be drawn from a population of $N$ units using the simple random sampling method, and let the sampled units be allocated to their respective strata, where $n_h$ is the number of units that fall into stratum $h$, such that $\sum_{h=1}^{L} n_h = n$. Let $y_{hi}$ and $x_{hi}$ be the $i$th observations on the study and auxiliary variables, respectively, in stratum $h$.
Consider the variable transformation (2.1) of the auxiliary variable, $x$, under the post-stratified sampling scheme, which maps each observation $x_{hi}$ in each stratum $h$ to a transformed value $x^{*}_{hi}$.
An equivalent of the transformation (2.1), in the simple random sampling scheme, has been used by authors like [1] and [8]-[13]. The associated sample mean estimator of the transformed variable (2.1), in the post-stratified sampling scheme, is denoted $\bar{x}^{*}_{ps}$, where $\bar{x}_{ps}$ and $\bar{y}_{ps}$ are the sample mean estimators based on $x_{hi}$ and $y_{hi}$, respectively. Using the sample means $\bar{y}_{ps}$, $\bar{x}_{ps}$ and $\bar{x}^{*}_{ps}$, and assuming that the population mean, $\bar{X}$, of the auxiliary variable $x_{hi}$ is known, we propose six combined-type estimators of the population ratio $R = \bar{Y}/\bar{X}$ in the post-stratified sampling scheme, given in (2.3)-(2.8).
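The post-stratified sample means and the customary combined-type ratio can be sketched as follows. This is a hedged illustration only: it assumes the standard post-stratification weights $W_h = N_h/N$ and the usual combined form $\bar{y}_{ps}/\bar{x}_{ps}$, neither of which is spelled out in this text. The function names and sample values are hypothetical; the stratum sizes 32 and 18 are those of the numerical illustration.

```python
from collections import defaultdict
from statistics import fmean


def post_stratified_mean(values, strata, stratum_sizes):
    """Return sum over h of W_h * (sample mean in stratum h), W_h = N_h / N (assumed)."""
    population_size = sum(stratum_sizes.values())
    by_stratum = defaultdict(list)
    for value, stratum in zip(values, strata):
        by_stratum[stratum].append(value)
    empty = [h for h in stratum_sizes if not by_stratum[h]]
    if empty:
        raise ValueError(f"no sampled units fell into strata: {empty}")
    return sum(
        (size / population_size) * fmean(by_stratum[h])
        for h, size in stratum_sizes.items()
    )


def combined_ratio(y, x, strata, stratum_sizes):
    """Customary combined-type ratio: post-stratified mean of y over that of x (assumed form)."""
    return post_stratified_mean(y, strata, stratum_sizes) / post_stratified_mean(
        x, strata, stratum_sizes
    )


# Stratum sizes from the numerical illustration; sample values are made up.
sizes = {"low": 32, "high": 18}
strata = ["low", "low", "low", "high", "high"]
y = [3.6, 3.1, 3.8, 2.4, 2.9]
x = [1.0, 2.0, 3.0, 5.0, 4.0]
print(f"combined ratio = {combined_ratio(y, x, strata, sizes):.4f}")
```

Because units are assigned to strata only after selection, a stratum may receive no sampled units; the sketch raises an error in that case rather than guessing a mean.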

Conditional Properties of the Proposed Estimators
Reference [14] showed that under the conditional argument, that is, for the achieved sample configuration, the post-stratified estimator $\bar{y}_{ps}$ is unbiased for the population mean, $\bar{Y}$, with conditional variance $V_2(\bar{y}_{ps})$, where $V_2$ refers to conditional variance and $S^2_{yh}$ is the population variance of $y$ in stratum $h$. Similarly, Onyeka (2012) obtained the conditional variance of $\bar{x}_{ps}$ and the conditional covariance of $\bar{y}_{ps}$ and $\bar{x}_{ps}$. With the quantities defined in (2.12), the results (2.13) to (2.16) hold under the conditional argument.
Using (2.12), the first proposed estimator, $\hat{R}_{1c}$, given in (2.3), can be re-written up to first order approximation, $o(n^{-1})$, as in (2.17) and (2.18). We take conditional expectations of (2.17) and (2.18), and use (2.13) to (2.16) to make the necessary substitutions. This gives the conditional bias and mean square error of $\hat{R}_{1c}$.
Following a similar procedure, we obtain the conditional biases and mean square errors of the six proposed estimators, together with those of the customary combined-type estimator, $\hat{R}_{c}$, in post-stratified sampling, up to first order approximation, $o(n^{-1})$. Generally, for the six proposed combined-type estimators, we have a common expression in terms of the constants $\theta_q$, $q = 1, \ldots, 6$, given in (2.37).
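To illustrate how a conditional variance depends on the achieved sample configuration, the sketch below uses the standard textbook form $\sum_h W_h^2 (1/n_h - 1/N_h) S^2_{yh}$ with $W_h = N_h/N$. This form is an assumption here, since the expression used in the paper is not reproduced in this text and may differ in detail. The function name, the achieved counts and the stratum variances are hypothetical.

```python
def conditional_variance(stratum_sizes, sample_counts, stratum_variances):
    """Variance of the post-stratified mean given the achieved counts n_h.

    Assumed standard form: sum_h W_h^2 * (1/n_h - 1/N_h) * S_h^2, W_h = N_h / N.
    """
    population_size = sum(stratum_sizes.values())
    total = 0.0
    for h, stratum_size in stratum_sizes.items():
        n_h = sample_counts[h]
        if n_h <= 0:
            raise ValueError(f"stratum {h!r} has no sampled units")
        weight = stratum_size / population_size
        total += weight**2 * (1.0 / n_h - 1.0 / stratum_size) * stratum_variances[h]
    return total


# Stratum sizes from the numerical illustration; counts and variances are made up.
sizes = {"low": 32, "high": 18}
counts = {"low": 12, "high": 8}
variances = {"low": 0.25, "high": 0.40}
print(f"conditional variance = {conditional_variance(sizes, counts, variances):.6f}")
```

Two samples of the same total size $n$ can give different conditional variances when their stratum counts differ, which is the point of the conditional argument.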

Unconditional Properties of the Proposed Estimators
Following [14], we obtain the (unconditional) variances and covariance of $\bar{y}_{ps}$ and $\bar{x}_{ps}$, for repeated samples of fixed size $n$, given in (2.38)-(2.40), where $f = n/N$ is the population sampling fraction. These expressions involve sums over the $L$ strata of the stratum variances $S^2_{yh}$ and $S^2_{xh}$ and the stratum covariance $S_{yxh}$. By taking unconditional expectations of (2.17) and (2.18), and using (2.38)-(2.40) to make the necessary substitutions, we obtain the unconditional bias and mean square error of the first proposed estimator, $\hat{R}_{1c}$, up to first order approximation, $o(n^{-1})$.
Following a similar procedure, we obtain the unconditional biases and mean square errors of the six proposed estimators, together with those of the customary combined-type estimator, $\hat{R}_{c}$, in post-stratified sampling, up to first order approximation, $o(n^{-1})$. Generally, the unconditional mean square errors of the proposed combined-type estimators are obtained in a common form, where $\theta_q$, $q = 1, \ldots, 6$, is as given in (2.37).
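The first-order unconditional quantities can be sketched numerically under stated assumptions. The code below assumes the standard leading-order variance $\frac{1-f}{n}\sum_h W_h S^2_{yh}$ for the post-stratified mean and the usual linearized mean square error $\frac{1-f}{n\bar{X}^2}\sum_h W_h (S^2_{yh} - 2R S_{yxh} + R^2 S^2_{xh})$ for the customary combined-type ratio, with $W_h = N_h/N$. These are textbook forms and may differ in detail from (2.38)-(2.40). All function and parameter names, and the input values other than $N_h$ and $n$, are hypothetical.

```python
def unconditional_variance(n, stratum_sizes, stratum_variances):
    """Assumed leading-order variance of the post-stratified mean: (1-f)/n * sum_h W_h S_h^2."""
    population_size = sum(stratum_sizes.values())
    f = n / population_size  # sampling fraction f = n / N
    weighted = sum(
        (size / population_size) * stratum_variances[h]
        for h, size in stratum_sizes.items()
    )
    return (1.0 - f) / n * weighted


def combined_ratio_mse(n, stratum_sizes, var_y, var_x, cov_yx, ratio, x_mean):
    """Assumed linearized MSE of the customary combined-type ratio estimator."""
    population_size = sum(stratum_sizes.values())
    f = n / population_size
    weighted = sum(
        (size / population_size)
        * (var_y[h] - 2.0 * ratio * cov_yx[h] + ratio**2 * var_x[h])
        for h, size in stratum_sizes.items()
    )
    return (1.0 - f) / (n * x_mean**2) * weighted


# N_h and n follow the numerical illustration; all other inputs are made up.
sizes = {"low": 32, "high": 18}
mse = combined_ratio_mse(
    n=20,
    stratum_sizes=sizes,
    var_y={"low": 0.25, "high": 0.40},
    var_x={"low": 0.90, "high": 0.70},
    cov_yx={"low": -0.10, "high": -0.15},
    ratio=1.2,
    x_mean=2.5,
)
print(f"first-order MSE of the combined ratio = {mse:.6f}")
```

Unlike the conditional case, the achieved stratum counts do not appear here; only the fixed sample size $n$ and the population stratum weights do.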

Efficiency Comparison
The efficiencies of the six proposed combined-type estimators are first compared with that of the customary combined ratio estimator, $\hat{R}_{c}$, in estimating the population ratio $R$ of two population means under the conditional and unconditional arguments in the post-stratified random sampling scheme. Secondly, the performances of the proposed estimators among themselves are investigated. Furthermore, the optimum estimators among the proposed estimators are obtained. The efficiency comparison is carried out using the mean square errors of the estimators, and the results are shown in Table 1.

Numerical Illustration
Here, we use the final year GPA ($y$) and the level of absenteeism ($x$) of the 2012/2013 graduating students of the Statistics Department, Federal University of Technology Owerri, to illustrate the properties of the estimators proposed in the present study. Absenteeism is measured as the average number of days absent from lectures in a month. The class consists of 50 students, with 32 and 18 students respectively falling into the low-absenteeism (0-3 days per month) and high-absenteeism (4-6 days per month) groups or strata. Our interest is to estimate the ratio of final year GPA to absenteeism from lectures, based on a post-stratified sample of 20 out of the 50 graduating students in the class. The data statistics, consisting mainly of population parameters, are shown in Table 2. Table 3 shows the percentage relative efficiencies (PRE-1) of the proposed combined-type estimators, $\hat{R}_{qc}$, over the customary combined-type estimator, $\hat{R}_{c}$, under the conditional and unconditional arguments. The table also shows the percentage relative efficiency (PRE-2) of the proposed combined-type estimator, $\hat{R}_{1c}$, over the other combined-type estimators, under the conditional and unconditional arguments.
Table 1. Efficiency conditions under conditional and unconditional arguments (columns: Estimator; Conditional argument; Unconditional argument).
Table 3 shows that, apart from the estimators $\hat{R}_{2c}$ and $\hat{R}_{6c}$, the remaining four proposed combined-type estimators are more efficient than the customary combined-type estimator, $\hat{R}_{c}$, under both the conditional and unconditional arguments for the data under consideration, and their gains in efficiency (PRE-1) are relatively large. Also, using PRE-2, we observe that the proposed combined-type estimator $\hat{R}_{1c}$ is more efficient than the estimators $\hat{R}_{2c}$, $\hat{R}_{6c}$ and $\hat{R}_{c}$ under the conditional and unconditional arguments. The optimum estimator, as expected, has the highest gain in efficiency under both arguments. The customary combined-type estimator, on the other hand, is found to be more efficient than some of the proposed combined-type estimators for the given set of data. This confirms the theoretical results, which showed that the proposed estimators are not always more efficient than the customary combined-type estimator. Notice that, with the value 0.16 obtained for these data, $\beta' < R$, and from the theoretical results in Table 1, the proposed estimators would be more efficient than the customary combined-type estimator, under the unconditional argument, if $\theta_q < 1$. The empirical results in Table 3 show that $\theta_2 > 1$ and $\theta_6 > 1$, and the proposed estimators $\hat{R}_{2c}$ (PRE-1 = 44%) and $\hat{R}_{6c}$ (PRE-1 = 65%), under the unconditional argument, are less efficient than the customary combined-type estimator, $\hat{R}_{c}$. Hence the empirical results confirm the theoretical results.
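The percentage relative efficiencies discussed above can be read through the following sketch, which assumes the conventional definition: 100 times the mean square error of the reference estimator divided by that of the estimator being assessed. The definition is not stated explicitly in this text, and the function name and the input values are illustrative.

```python
def percent_relative_efficiency(mse_reference, mse_candidate):
    """PRE of a candidate estimator over a reference estimator (assumed definition)."""
    if mse_reference <= 0 or mse_candidate <= 0:
        raise ValueError("mean square errors must be positive")
    return 100.0 * mse_reference / mse_candidate


# Made-up mean square errors chosen only to show how the measure reads.
mse_customary = 0.020
mse_better = 0.010   # smaller MSE than the customary estimator
mse_worse = 0.040    # larger MSE than the customary estimator
print(round(percent_relative_efficiency(mse_customary, mse_better), 6))
print(round(percent_relative_efficiency(mse_customary, mse_worse), 6))
```

Under this convention a value above 100% indicates a gain over the reference estimator and a value below 100% a loss, which is consistent with the reading of PRE-1 = 44% and PRE-1 = 65% as lower efficiency than the customary combined-type estimator.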

Concluding Remarks
The study extends the use of variable transformation in estimating the population ratio from the simple random sampling scheme to the post-stratified sampling scheme. Efficiency conditions for preferring the proposed estimators to the customary combined-type estimator are obtained. The study shows that, in any given survey, these efficiency conditions should be used to determine which of the proposed combined-type estimators is appropriate for estimating the population ratio of two variables in the post-stratified sampling scheme using variable transformation.