Complex survey designs often involve unequal selection probabilities of clusters or units within clusters. When estimating models for complex survey data, scaled weights are incorporated into the likelihood, producing a pseudo likelihood. In a 3-level weighted analysis for a binary outcome, we implemented two methods for scaling the sampling weights in the National Health Survey of Pakistan (NHSP). For NHSP with health care utilization as a binary outcome we found age, gender, household (HH) goods, urban/rural status, community development index, province and marital status as significant predictors of health care utilization (p-value < 0.05). The variance of the random intercepts using scaling method 1 is estimated as 0.0961 (standard error 0.0339) for PSU level, and 0.2726 (standard error 0.0995) for household level respectively. Both estimates are significantly different from zero (p-value < 0.05) and indicate considerable heterogeneity in health care utilization with respect to households and PSUs. The results of the NHSP data analysis showed that all three analyses, weighted (two scaling methods) and un-weighted, converged to almost identical results with few exceptions. This may have occurred because of the large number of 3rd and 2nd level clusters and relatively small ICC. We performed a simulation study to assess the effect of varying prevalence and intra-class correlation coefficients (ICCs) on bias of fixed effect parameters and variance components of a multilevel pseudo maximum likelihood (weighted) analysis. The simulation results showed that the performance of the scaled weighted estimators is satisfactory for both scaling methods. Incorporating simulation into the analysis of complex multilevel surveys allows the integrity of the results to be tested and is recommended as good practice.

Multilevel modeling (MLM) of complex survey data is an approach increasingly being used in public health research. The influence of contextual factors on public health has been considered as important in recent public health research [

Complex survey designs often involve unequal selection probabilities of clusters and/or people within clusters. The survey includes design (sampling) weights to account for unequal selection probabilities. When estimating multilevel models that are based on complex survey data, sampling weights are incorporated into the likelihood, producing a pseudo likelihood [

In this paper we explore the utility of multilevel modeling of three-level survey data with a binary outcome by incorporating weights in the estimation procedure to investigate the determinants of health seeking behavior of the Pakistani population. It is important to build a comprehensive picture of the patterns of disease, conditions, and risk factors that affect the health of Pakistanis and their uptake of services.

We apply this methodology to the National Health Survey of Pakistan (NHSP) as our example of real data in survey research. The NHSP was a cross-sectional survey and provides a comprehensive picture of patterns of disease, conditions, and risk factors which affect the health of the people of Pakistan and their uptake of services. In the NHSP data primary sampling units (PSUs) are at level 3, households at level 2 and individuals at level 1. The outcome of interest in this paper is health care utilization coded as a binary outcome.

The most important recommendation for precise estimation of model parameters in weighted analysis is to have larger cluster sizes. Simulation studies have revealed that the regression coefficients are biased for small cluster sizes [

Two scaling methods are commonly used, but there is no general method available which can deal with all types of design and data issues [

Our literature search showed that three-level weighted analyses have rarely been reported. We found two important methodological papers [

The objective of the study is to apply the multilevel pseudo maximum likelihood (MPML) approach to estimate model parameters for determinants of health care utilization, taking sampling weights into account, for a three-level complex survey design with health care utilization as a binary outcome. These parameter estimates will lead to the determination of contextual and individual level predictors of health care utilization. In addition we assessed the performance of MPML for three-level binary complex survey data by conducting a simulation study that focused on varying the prevalence of the binary outcome and the intraclass correlation (ICC) at the third (PSU) and the second (household) level simultaneously.

The structure of this paper is as follows. In the methods section we describe the methodology for a multilevel analysis of three-level binary response data. We study two approaches for conducting the multilevel analysis using data from the NHSP. The first approach uses multilevel maximum marginal likelihood (MML) estimation (un-weighted analysis). The second approach uses MPML estimation with sampling weights (weighted analysis). In the weighted analysis section we apply two weight scaling methods, suggested in the literature by [

The design probability for a sampling unit is a feature of the survey design and is assumed to be known before data analysis. If the design probabilities are informative they are correlated with the response. The weights are defined as the inverse of the probability of selection for the sampling unit.

Weighting Scheme of Three-Level Data

Let us first consider the weighting strategy for three-level data. The 3rd (PSU) level weight can be computed as the reciprocal of the probability that PSU $k$ was selected from the sampling frame. It is defined as $w_k = 1/\pi_k$, where $\pi_k$ is the probability of selection of the $k$th 3rd level cluster.

The 2nd level weight is computed as the reciprocal of the probability that household $j$ was selected given that PSU $k$ was selected: $w_{j|k} = 1/\pi_{j|k}$, where $\pi_{j|k}$ is the probability of selection of the $j$th 2nd level cluster given that the $k$th 3rd level cluster was selected.

The first level weight is computed as the reciprocal of the probability that individual $i$ was selected in the $j$th 2nd level cluster, given that this cluster was selected in the $k$th 3rd level cluster: $w_{i|jk} = 1/\pi_{i|jk}$, where $\pi_{i|jk}$ is the conditional probability of selection of the $i$th individual.
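The weighting scheme above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `three_level_weights` and the example probabilities are hypothetical, and the overall weight is taken as the product of the three conditional weights, which is the usual convention for multistage designs.

```python
def three_level_weights(p_psu, p_hh_given_psu, p_ind_given_hh):
    """Return (w_k, w_j|k, w_i|jk, overall weight) for one sampled individual.

    Each weight is the reciprocal of the corresponding (conditional)
    selection probability. The function name and signature are hypothetical.
    """
    for p in (p_psu, p_hh_given_psu, p_ind_given_hh):
        if not 0.0 < p <= 1.0:
            raise ValueError("selection probabilities must lie in (0, 1]")
    w_psu = 1.0 / p_psu            # 3rd (PSU) level weight
    w_hh = 1.0 / p_hh_given_psu    # 2nd (household) level weight
    w_ind = 1.0 / p_ind_given_hh   # 1st (individual) level weight
    return w_psu, w_hh, w_ind, w_psu * w_hh * w_ind


# Hypothetical example: a PSU drawn with probability 0.02, 30 of its 240
# households sampled, and every resident of a sampled household included.
example = three_level_weights(0.02, 30 / 240, 1.0)
```

When every resident of a sampled household is interviewed, the level 1 selection probability is 1 and the level 1 weight only departs from 1 through nonresponse adjustments.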

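Consider a dichotomous outcome variable $y_{ijk}$ and the three-level random intercept logistic regression model

$$\operatorname{logit}\{P(y_{ijk} = 1 \mid \mathbf{x}_{ijk}, u_{jk}, v_k)\} = \mathbf{x}_{ijk}'\boldsymbol{\beta} + u_{jk} + v_k,$$

where $y_{ijk}$ is the response of the $i$th individual ($i = 1, \ldots, n_{jk}$) in the $j$th 2nd level cluster ($j = 1, \ldots, n_k$) and the $k$th 3rd level cluster ($k = 1, \ldots, n$), $\mathbf{x}_{ijk}$ is the vector of covariates and $\boldsymbol{\beta}$ the vector of regression parameters. Here $n_{jk}$ is the number of individuals in the $j$th 2nd level cluster of the $k$th 3rd level cluster, $n_k$ is the number of 2nd level clusters in the $k$th 3rd level cluster, and $n$ is the total number of 3rd level clusters.

Let $u_{jk}$ and $v_k$ be the random intercepts of the $j$th 2nd level cluster and the $k$th 3rd level cluster, with densities $g_2(u_{jk})$ and $g_3(v_k)$, respectively.

The log-likelihood contribution of a level 2 cluster, conditional on the random effect at level 3 is:

$$L_{jk}(v_k) = \log \int \exp\Big\{\sum_{i=1}^{n_{jk}} L_{ijk}(u_{jk}, v_k)\Big\}\, g_2(u_{jk})\, du_{jk} \qquad (3)$$

where $L_{ijk}(u_{jk}, v_k) = \log f(y_{ijk} \mid \mathbf{x}_{ijk}, u_{jk}, v_k)$ is the log-likelihood contribution of a level 1 unit, conditional on the random effects at levels 2 and 3.

The log-likelihood contribution of a 3rd level cluster is:

$$L_k = \log \int \exp\Big\{\sum_{j=1}^{n_k} L_{jk}(v_k)\Big\}\, g_3(v_k)\, dv_k \qquad (4)$$

where the integral is taken over the distribution of the level 3 random effect. The log-likelihood is the sum of these contributions,

$$L(\mathbf{y}) = \sum_{k=1}^{n} L_k,$$

where y is the vector of all responses.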

The PML estimates are obtained by maximizing the weighted log-likelihood. The sampling weights are incorporated into the likelihood for estimation of parameters and their standard errors. We maximize the weighted pseudo likelihood with respect to the regression parameters.

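The log-likelihood contribution of a level 2 unit, conditional on the random effect at level 3 is:

$$L^{w}_{jk}(v_k) = \log \int \exp\Big\{\sum_{i=1}^{n_{jk}} w_{i|jk}\, L_{ijk}(u_{jk}, v_k)\Big\}\, g_2(u_{jk})\, du_{jk} \qquad (7)$$

where $L_{ijk}(u_{jk}, v_k)$ is the level 1 log-likelihood contribution, $w_{i|jk}$ is the level 1 weight and $g_2$ is the density of the level 2 random effect $u_{jk}$.

The log-likelihood contribution of a 3rd level cluster is:

$$L^{w}_k = \log \int \exp\Big\{\sum_{j=1}^{n_k} w_{j|k}\, L^{w}_{jk}(v_k)\Big\}\, g_3(v_k)\, dv_k \qquad (8)$$

where $w_{j|k}$ is the level 2 weight and $g_3$ is the density of the level 3 random effect $v_k$.

Let $w_k$ be the weight of the $k$th 3rd level cluster. The log pseudo-likelihood is then $\sum_{k=1}^{n} w_k L^{w}_k$.

Equations (7) and (8) differ from (3) and (4), respectively, only by the addition of the weights as factors in the summation.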
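To make the role of the weights in Equations (7) and (8) concrete, the following is a minimal Python sketch of a weighted log pseudo-likelihood for an intercept-only three-level logistic model. It is an illustration under stated assumptions, not the GLLAMM implementation: the random intercepts are assumed normal, the integrals are approximated with a fixed 3-point Gauss-Hermite rule rather than adaptive quadrature, covariates are omitted, and all function names and the nested data layout are hypothetical.

```python
import math

# 3-point Gauss-Hermite rule for a standard normal variable Z:
# E[f(Z)] is approximated by sum(q * f(z)) over these (node, weight) pairs.
GH_RULE = [(-math.sqrt(3.0), 1.0 / 6.0), (0.0, 2.0 / 3.0), (math.sqrt(3.0), 1.0 / 6.0)]


def log_bernoulli(y, eta):
    """log f(y | eta) under a logit link, written in a numerically stable form."""
    return y * eta - (max(eta, 0.0) + math.log1p(math.exp(-abs(eta))))


def level2_contribution(household, v, beta0, sd_u):
    """Weighted level 2 contribution (cf. Equation (7)) given the PSU effect v.

    household: list of (y, w_i) pairs, w_i being the level 1 weight.
    """
    total = 0.0
    for z, q in GH_RULE:
        s = sum(w * log_bernoulli(y, beta0 + sd_u * z + v) for y, w in household)
        total += q * math.exp(s)
    return math.log(total)


def level3_contribution(psu, beta0, sd_u, sd_v):
    """Weighted level 3 contribution (cf. Equation (8)).

    psu: list of (w_j, household) pairs, w_j being the level 2 weight.
    """
    total = 0.0
    for z, q in GH_RULE:
        v = sd_v * z
        s = sum(w_j * level2_contribution(hh, v, beta0, sd_u) for w_j, hh in psu)
        total += q * math.exp(s)
    return math.log(total)


def log_pseudo_likelihood(data, beta0, sd_u, sd_v):
    """Sum of PSU contributions, each multiplied by its level 3 weight w_k.

    data: list of (w_k, psu) pairs.
    """
    return sum(w_k * level3_contribution(psu, beta0, sd_u, sd_v) for w_k, psu in data)
```

With all weights equal to 1 the function reduces to the un-weighted marginal log-likelihood, and with both standard deviations set to 0 it reduces to an ordinary logistic log-likelihood, which gives two simple checks. For realistic cluster sizes the inner exponentials can underflow, so a production implementation would work on the log scale and use more quadrature points.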

In scaling method 1 the weights are rescaled within each cluster so that the sum of the scaled weights equals the effective cluster size. It is suggested by Carle [

where

In scaling method 2 the scaled weights add up to the cluster sample sizes; the corresponding totals in method 1 are smaller [

Method 2 provides the least biased estimates of slopes, intercepts and odds ratios [
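The two scaling methods can be illustrated with a short Python sketch. The scaling factors below are the ones commonly used in the weight-scaling literature for these two targets (effective cluster size and actual cluster size); it is an assumption that they match the formulas of the original text. The function names are hypothetical, and the input is the list of conditional weights of the units within one cluster.

```python
def scale_method_1(weights):
    """Scale weights so they sum to the effective cluster size.

    Effective cluster size = (sum of w)^2 / (sum of w^2).
    """
    s1 = sum(weights)
    s2 = sum(w * w for w in weights)
    return [w * s1 / s2 for w in weights]


def scale_method_2(weights):
    """Scale weights so they sum to the actual cluster sample size."""
    n = len(weights)
    s1 = sum(weights)
    return [w * n / s1 for w in weights]


# Hypothetical conditional weights for four units in one cluster.
raw = [1.0, 2.0, 3.0, 4.0]
scaled_1 = scale_method_1(raw)  # sums to the effective size, 100 / 30
scaled_2 = scale_method_2(raw)  # sums to the sample size, 4
```

When all weights within a cluster are equal, both methods return weights of 1; the two methods differ only when the weights vary within the cluster, in which case the method 1 total is smaller than the method 2 total.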

The ICC for three level binary data can be defined for each level separately:
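As a worked illustration, the sketch below computes level-specific ICCs from the two random intercept variances. It assumes the latent-response formulation for logistic models, in which the level 1 residual variance is fixed at π²/3 and each ICC is the variance at that level divided by the total variance. This formula is an assumption about the definition used here, although it reproduces the un-weighted ICCs reported in the results (0.04 for PSU and 0.06 for household) from the un-weighted variance estimates. The function name is hypothetical.

```python
import math

# Level 1 residual variance of the standard logistic distribution.
LOGISTIC_RESIDUAL_VAR = math.pi ** 2 / 3


def icc_three_level(var_psu, var_hh):
    """Return level-specific ICCs for a three-level random intercept logit model."""
    total = var_psu + var_hh + LOGISTIC_RESIDUAL_VAR
    return {"PSU": var_psu / total, "HH": var_hh / total}


# Un-weighted variance estimates reported for the NHSP analysis.
unweighted = icc_three_level(var_psu=0.1355, var_hh=0.2244)
```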

The NHSP was a cross-sectional survey, conducted in 1990-1994. In the present application the outcome we are considering is health care utilization; if an individual had sought any medical care in the last 14 days the person was considered as utilizing health care. In the urban sampling frame a city/town was divided into enumeration blocks of 200 - 250 households. The sample of the NHSP for urban areas was drawn from the list of these enumeration blocks. A sample of 2400 households was considered sufficient to obtain estimates of important characteristics at the national level with an acceptable sampling error [

A multistage cluster sampling design with stratification was employed to collect data in the NHSP. There were 8 strata corresponding to the four provinces in Pakistan divided into urban and rural areas. In the NHSP a PSU is a block of households (HHs) in urban areas, and a village of households in rural areas. At the first stage PSUs were selected from each stratum with probability proportionate to size with respect to the number of households in the urban strata and to the population of the village in the rural strata. In total 80 PSUs were sampled, and out of these 32 were drawn from urban areas and 48 from rural areas. In the 2^{nd} stage, on average 30 households were selected through systematic random sampling from each PSU. All subjects residing in the 2400 households were selected for the study. The household non-response rate was 3.1% in the NHSP and the overall individual non-response rate was 7.6%. A total of 18,315 subjects were interviewed. For our analysis we considered health care utilization by subjects aged 14 years and older. As the objective is to determine the association of individual, household and PSU level characteristics with health care utilization, subjects younger than 14 years were excluded as their health seeking behavior would be determined by their guardians.

Out of the total of 18,315 subjects interviewed in the NHSP, 9856 were aged 14 years and older. However, the number of individuals in our analysis is smaller because of missing data. We excluded subjects with missing values and performed a complete case analysis. The socio-demographic characteristics differ by less than 3% between the sample (n = 9856) and the subset (n = 8454) used for this study, suggesting that our findings will be representative of the Pakistani population aged 14 years and above. Two separate weight adjustments were applied for nonresponse, one at the household level and one at the individual level.

Nine explanatory variables were considered across the three levels as potential predictors of health care utilization. The PSU level explanatory variables are urban/rural status, province, and community development index; for the community development index we used the results provided by Hadden et al. [

A three-level un-weighted multivariable logistic regression model was fitted using the covariates selected in the univariate analysis. The specified significance level for a variable to remain in the multivariable model was 0.05. We conducted scale examination for continuous predictor variables in the multivariable model. All possible interactions were assessed for significance (p-values < 0.05). We developed a final model conducting the un-weighted analysis using these rules for model selection. We then fitted the same model using the weighted analyses to compare the un-weighted and weighted analyses. The pseudo-maximum likelihood estimation for generalized linear mixed models with two or more levels using adaptive quadrature is implemented in GLLAMM, a user-written program for STATA [

About 22.1% of subjects aged 14 years and older sought medical care. The mean age of the respondents was 35.4 (±17.3) years. Most of the respondents were illiterate; 64.6% of individuals could not read and write, and only 12.2% had 10 or more years of education. About 28.6% of individuals reported that they had never been married while 63.9% were currently married. About 21.4% belonged to the Sindh province, 51.1% to Punjab, 10.0% to Baluchistan and 17.9% to North West Frontier Province (NWFP). Thirty seven percent of people belonged to urban and 63% to rural communities. The average number of durable goods owned per household was 3.1.

The results of analysis of the NHSP data showed that the findings of the scaled weighted analysis generally agree with the un-weighted analysis (

Variables* | Un-weighted ML: Estimate | Un-weighted ML: SE | Un-weighted ML: p-value | Weighted ML (scaling method 1): Estimate | Weighted ML (scaling method 1): SE | Weighted ML (scaling method 1): p-value | Weighted ML (scaling method 2): Estimate | Weighted ML (scaling method 2): SE | Weighted ML (scaling method 2): p-value |
---|---|---|---|---|---|---|---|---|---|
Individual level variables | | | | | | | | | |
Gender: Female | −0.0113 | 0.1151 | 0.922 | 0.05048 | 0.1451 | 0.728 | 0.0528 | 0.1564 | 0.730 |
Age (overall^{‡}) | | | <0.001 | | | 0.04 | | | 0.03 |
Age: 19 - 35 | 0.3107 | 0.1044 | 0.003 | 0.3263 | 0.1394 | 0.019 | 0.3201 | 0.1451 | 0.028 |
Age: 36 or more | 0.4789 | 0.1189 | <0.001 | 0.4531 | 0.1912 | 0.018 | 0.4629 | 0.2005 | 0.022 |
Marital status (overall^{‡}) | | | 0.811 | | | 0.320 | | | 0.382 |
Marital status: Married | −0.0123 | 0.1116 | 0.912 | 0.0906 | 0.1658 | 0.585 | 0.0745 | 0.1634 | 0.651 |
Marital status: Widow/divorced/separated | 0.0802 | 0.2169 | 0.711 | 0.1585 | 0.2779 | 0.569 | 0.1429 | 0.3003 | 0.622 |
Household level variables | | | | | | | | | |
Ownership of household goods (overall^{‡}) | | | 0.001 | | | <0.001 | | | <0.001 |
Ownership of household goods: 1 - 4 | 0.3974 | 0.1066 | <0.001 | 0.4410 | 0.1260 | <0.001 | 0.4792 | 0.1329 | <0.001 |
Ownership of household goods: 5 or more | 0.4843 | 0.1245 | <0.001 | 0.6423 | 0.1585 | <0.001 | 0.7101 | 0.1688 | <0.001 |
Community level variables | | | | | | | | | |
Urban/rural status: Urban | 1.4047 | 0.5323 | 0.008 | 1.3216 | 0.3645 | 0.001 | 1.3429 | 0.3783 | <0.001 |
CDI (overall^{‡}) | | | 0.01 | | | <0.001 | | | <0.001 |
CDI: Middle | 0.2165 | 0.1438 | 0.132 | 0.2615 | 0.1822 | 0.151 | 0.2702 | 0.1963 | 0.169 |
CDI: High | 0.7772 | 0.3330 | 0.020 | 0.8644 | 0.1262 | <0.001 | 0.9213 | 0.1333 | <0.001 |
Province (overall^{‡}) | | | <0.001 | | | <0.001 | | | <0.001 |
Province: NWFP | 0.9149 | 0.2238 | <0.001 | 0.6543 | 0.4025 | 0.104 | 0.6692 | 0.4216 | 0.107 |
Province: Sindh | 0.7797 | 0.2149 | <0.001 | 0.6493 | 0.3894 | 0.095 | 0.6732 | 0.4056 | 0.091 |
Province: Punjab | 1.1056 | 0.1983 | <0.001 | 0.9754 | 0.3828 | 0.011 | 1.0230 | 0.3979 | 0.010 |
Interactions | | | | | | | | | |
CDI * urban/rural (overall^{‡}) | | | 0.045 | | | 0.002 | | | 0.001 |
Middle community * urban | −1.3616 | 0.5552 | 0.014 | −1.3103 | 0.4082 | 0.001 | −1.3453 | 0.4267 | 0.002 |
High community * urban | −1.9071 | 0.6271 | 0.002 | −1.8898 | 0.3689 | <0.001 | −1.9680 | 0.3839 | <0.001 |
Marital status * gender (overall^{‡}) | | | 0.005 | | | 0.036 | | | 0.032 |
Married * female | 0.3201 | 0.1345 | 0.017 | 0.2278 | 0.1568 | 0.146 | 0.2519 | 0.1611 | 0.118 |
Widow/divorced/separated * female | 0.4998 | 0.2466 | 0.043 | 0.5523 | 0.3845 | 0.151 | 0.5719 | 0.4204 | 0.159 |
Log likelihood | −4267.95 | | | −2,098,251.4 | | | −2,326,415 | | |
Random intercept variance^{†}: PSU level | 0.1355 | 0.0348 | 0.001 | 0.0961 | 0.03393 | 0.001 | 0.11204 | 0.0381 | 0.001 |
Random intercept variance^{†}: household level | 0.2244 | 0.0625 | 0.001 | 0.2726 | 0.09952 | 0.001 | 0.28098 | 0.1052 | 0.001 |
ICC(HH) | 0.06 | | | 0.080 | | | 0.076 | | |
ICC(PSU) | 0.04 | | | 0.030 | | | 0.030 | | |

*Reference categories for individual level variables: male, age 14 - 18 years, never married, respectively; for the household level variable: none; for community level variables: rural, low, Baluchistan, respectively. ^{†}Estimates and standard errors for random intercept variance. ^{‡}Overall p-value of covariates with more than two categories.

interactions; between gender and marital status, and between the community development index and urban/rural status respectively. The estimates of intraclass correlation from hierarchical modeling are useful and help to interpret the data results.

The estimated variances (unobserved heterogeneity) of the random intercepts using the un-weighted analysis are 0.1355 (standard error 0.0348) for PSU level and 0.2244 (standard error 0.0625) for household level, respectively. Both estimates are significantly different from zero and indicate considerable heterogeneity in health care utilization with respect to households and PSUs that is unaccounted for by the predictor variables and should be adjusted for in an adequate analysis. For the analysis of the NHSP data, the household level variability had not been considered and adjusted for in previous research reported in the literature [

The estimated variances of the random effects using scaling method 1 are 0.096 (standard error 0.033) for PSU level and 0.272 (standard error 0.099) for household level, respectively. The variances of the random effects using scaling method 2 are estimated as 0.112 (standard error 0.038) for PSU level and 0.280 (standard error 0.105) for household level. The estimated intra-class correlations for scaling methods 1 and 2 are: household level, ICC_{method1} = 0.080 and ICC_{method2} = 0.076; PSU level, ICC_{method1} = 0.030 and ICC_{method2} = 0.031. The estimates of regression coefficients and standard errors from the weighted analysis for a PSU level variable, province, diverged somewhat from the un-weighted analysis. The effect of the PSU level variable province was highly significant (p-value < 0.001) for all three provinces (relative to Baluchistan) in the un-weighted analysis, but after weight adjustment this effect became marginal, except for the province Punjab (p-value = 0.01). The confidence intervals were narrow for the un-weighted analysis, but due to larger standard errors the confidence intervals became wider under the two scaled weighted analyses (methods 1 and 2).
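The widening of the confidence intervals can be checked directly from the reported estimates and standard errors. The sketch below computes a Wald z statistic, a two-sided normal p-value, and a 95% confidence interval on the log-odds and odds ratio scales. It assumes Wald-type inference based on the standard normal distribution; the function name is hypothetical, and the inputs are the NWFP coefficients from the un-weighted and scaling method 1 analyses.

```python
import math


def wald_summary(estimate, se, z_crit=1.96):
    """Wald z, two-sided p-value and 95% CI for a logistic regression coefficient."""
    z = estimate / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    lo, hi = estimate - z_crit * se, estimate + z_crit * se
    return {
        "z": z,
        "p": p,
        "ci": (lo, hi),
        "odds_ratio": math.exp(estimate),
        "odds_ratio_ci": (math.exp(lo), math.exp(hi)),
    }


# NWFP versus Baluchistan, as reported for the two analyses.
nwfp_unweighted = wald_summary(0.9149, 0.2238)
nwfp_method_1 = wald_summary(0.6543, 0.4025)
```

Under the un-weighted analysis the interval excludes zero, whereas under scaling method 1 the larger standard error produces an interval that includes zero, which is the marginal effect described above.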

Another difference that was observed between the weighted and un-weighted analysis was the estimate of an interaction between two individual level variables (gender and marital status). The estimated regression coefficient and standard errors from the weighted analysis diverged slightly from the un-weighted analysis.

To explore the stability of the models we carried out a simulation study for three level complex survey data with a binary outcome to assess the impact of varying prevalence of the outcome, and ICC at each level on the accuracy of the estimates of MPML.

We carried out a simulation study to assess the influence of different conditions on the parameter estimates using two weight scaling methods. It will determine which method provides less biased estimates for three level complex survey data [

The multilevel logistic regression model is:

where

care utilization and

The data were generated using the same level of clustering, number of clusters and average number of observations in each cluster as we have in the NHSP. The number of PSUs was set at

We set four scenarios for our simulation study to compare two weight scaling methods for each scenario: 1) mid-range prevalence of the outcome as 30% with low intra-class correlation at PSU level 0.05 and at household level 0.2, 2) low prevalence of the outcome as 10% with low ICC at PSU level 0.05 and at household level 0.2, 3) mid-range prevalence of the outcome as 30% with high intra-class correlation at PSU level 0.15 and at household level 0.3, and 4) low prevalence of the outcome as 10% with high ICC at PSU level 0.15 and at household level 0.3.
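A data-generating step for one such scenario can be sketched as follows. This is a simplified, hypothetical illustration rather than the SAS program used in the study: it is intercept-only, uses a fixed number of individuals per household, omits the sampling weights, and converts target ICCs into random intercept variances under the latent-response assumption (level 1 variance π²/3). The intercept `beta0` would have to be tuned to reach a given marginal prevalence; the value used in the example is arbitrary.

```python
import math
import random

LOGISTIC_RESIDUAL_VAR = math.pi ** 2 / 3


def variances_from_iccs(icc_psu, icc_hh):
    """Solve for (var_psu, var_hh) so that var / total variance equals each ICC."""
    total = LOGISTIC_RESIDUAL_VAR / (1.0 - icc_psu - icc_hh)
    return icc_psu * total, icc_hh * total


def simulate_three_level(n_psu, n_hh, n_ind, beta0, icc_psu, icc_hh, seed=0):
    """Generate (psu, household, individual, y) rows from a random intercept logit model."""
    rng = random.Random(seed)
    var_psu, var_hh = variances_from_iccs(icc_psu, icc_hh)
    rows = []
    for k in range(n_psu):
        v = rng.gauss(0.0, math.sqrt(var_psu))      # PSU random intercept
        for j in range(n_hh):
            u = rng.gauss(0.0, math.sqrt(var_hh))   # household random intercept
            for i in range(n_ind):
                p = 1.0 / (1.0 + math.exp(-(beta0 + u + v)))
                y = 1 if rng.random() < p else 0    # Bernoulli draw
                rows.append((k, j, i, y))
    return rows


# Low-ICC scenario with 80 PSUs and 30 households per PSU, as in the NHSP design;
# 4 individuals per household and beta0 = -1.0 are hypothetical choices.
data = simulate_three_level(80, 30, 4, beta0=-1.0, icc_psu=0.05, icc_hh=0.2, seed=1)
```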

A multistage cluster sampling design with stratification was employed to generate the data. The true values of the fixed effects and random effects parameters for the simulation study were specified within a reasonable range of the estimates from the MPML fit of three level NHSP data. The number of simulations we performed was decided based on detailed literature review, and on a sample size calculation suggested by Burton et al. [

We equated

We set the fixed effects parameters for scenario 1 as

To generate the outcome, a Bernoulli distribution with probability

lection the same weights were taken for simulated data as in the NHSP data. A computer program was prepared in SAS to simulate data for the four different scenarios. The weighted analysis was performed in GLLAMM of STATA to assess the performance of the MPML estimates.

To evaluate the performance of MPML we assessed the bias and coverage of the parameter estimates [

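The bias is defined as $\delta = \bar{\hat{\beta}} - \beta$, where $\beta$ is the true value of the parameter of interest and $\bar{\hat{\beta}}$ is the average of the estimates of interest over the B simulation runs. The accuracy of the standard error of a parameter estimate is assessed by computing the observed coverage of the 95% confidence intervals (CI) created by using the standard normal distribution [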

The coverage should be approximately equal to the nominal coverage (95%).

An acceptable criterion for the coverage is that the coverage should not fall outside of approximately two SEs of the nominal coverage probability (p) [

Hence, according to the within 2 SEs criterion (0.95 ± 0.03), the acceptable range for the coverage is 92% - 98%.
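The acceptable range follows from the binomial standard error of an observed coverage proportion, sqrt(p(1 − p)/B). The sketch below shows the calculation; the function name is hypothetical, and B = 200 is used purely as an illustrative number of simulation runs because it gives a half-width of about 0.03, in line with the range quoted above.

```python
import math


def acceptable_coverage_range(nominal, n_sims):
    """Return nominal coverage plus or minus two binomial standard errors."""
    se = math.sqrt(nominal * (1.0 - nominal) / n_sims)
    return nominal - 2.0 * se, nominal + 2.0 * se


# Hypothetical number of runs; see the lead-in for why 200 is used here.
low, high = acceptable_coverage_range(0.95, 200)
```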

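We also calculated the standardized bias, which describes the bias as a percentage of the standard error of the estimate of interest over all simulation runs, and is calculated as

$$\text{standardized bias} = 100 \times \frac{\bar{\hat{\beta}} - \beta}{SE(\hat{\beta})}.$$

A standardized bias of more than 40% in either direction has adverse impact on the bias, coverage and efficiency [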
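The three performance measures can be computed from the simulation output as in the sketch below. It assumes that the standard error in the denominator of the standardized bias is the empirical standard deviation of the estimates across runs, and that coverage is based on Wald intervals with a critical value of 1.96. The function name and the toy inputs are hypothetical.

```python
import statistics


def performance(estimates, std_errors, true_value, z_crit=1.96):
    """Bias, standardized bias (%) and coverage (%) over B simulation runs."""
    mean_est = statistics.mean(estimates)
    bias = mean_est - true_value
    empirical_se = statistics.stdev(estimates)  # SD of the estimates across runs
    covered = sum(
        1
        for est, se in zip(estimates, std_errors)
        if est - z_crit * se <= true_value <= est + z_crit * se
    )
    return {
        "bias": bias,
        "standardized_bias": 100.0 * bias / empirical_se,
        "coverage": 100.0 * covered / len(estimates),
    }


# Toy input: four runs with a true parameter value of 1.0.
result = performance([0.9, 1.0, 1.1, 1.2], [0.1, 0.1, 0.1, 0.1], true_value=1.0)
```

In this toy input three of the four intervals contain the true value, and the standardized bias stays just under the 40% threshold.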

The results of our simulation study indicate an acceptable bias for estimators of MPML for all scenarios. The performance of scaling method 1 is satisfactory (

The results suggest that the effect of the ICC is somewhat more pronounced when the prevalence is low, but it is not substantial; this could be due to the large sample size of the simulated data. For the few parameter estimates whose coverage is outside the acceptable range, the coverage is still close to the nominal value, and for most parameters the coverage falls within two SEs of the nominal coverage probability (p). The standardized bias is less than 40% for all scenarios for scaling method 1. Hence we conclude that this procedure does not have an adverse impact on the bias, coverage and efficiency of parameter estimates of fixed and random effects for sample sizes similar to those we have used.

The simulation results show that the performance of scaling method 1 is reasonable for 3 level data with a binary outcome. In simulation results of scaling method 2 we also found acceptable bias for the fixed effects and random effects parameters for all scenarios (

Scenario | Prevalence and ICC | Assessment | | | | | | | | | | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 30%^{1}; ^{3}ICC_{PSU} = 0.05; ^{3}ICC_{HH} = 0.2 | Bias | 0.030 | −0.010 | 0.010 | 0.010 | −0.020 | −0.040 | −0.020 | 0.007 | −0.010 | −0.040 |
1 | | Coverage % | 88 | 95 | 92 | 97 | 90 | 94 | 92 | 94 | 93 | 94 |
1 | | Standardized bias | 18 | −15 | 6 | 6 | −13 | −13 | −6 | 3 | −10 | −30 |
2 | 10%^{2}; ^{3}ICC_{PSU} = 0.05; ^{3}ICC_{HH} = 0.2 | Bias | −0.010 | −0.010 | 0.010 | 0.030 | 0.020 | −0.004 | 0.030 | 0.040 | 0.020 | −0.060 |
2 | | Coverage % | 95 | 94 | 93 | 91 | 92 | 92 | 93 | 94 | 95 | 92 |
2 | | Standardized bias | −2 | −12 | 5 | 15 | 13 | −1 | 10 | 14 | 13 | −25 |
3 | 30%^{1}; ^{4}ICC_{PSU} = 0.15; ^{4}ICC_{HH} = 0.3 | Bias | 0.020 | −0.010 | 0.020 | 0.020 | −0.030 | −0.050 | −0.040 | 0.00 | −0.060 | −0.070 |
3 | | Coverage % | 94 | 97 | 94 | 93 | 94 | 95 | 93 | 95 | 93 | 93 |
3 | | Standardized bias | 7 | −14 | 11.8 | 10 | −10 | −15 | −11 | 0 | −32 | −34 |
4 | 10%^{2}; ^{4}ICC_{PSU} = 0.15; ^{4}ICC_{HH} = 0.3 | Bias | 0.030 | −0.010 | 0.010 | 0.020 | −0.040 | 0.040 | 0.010 | 0.040 | −0.050 | −0.100 |
4 | | Coverage % | 90 | 95 | 92 | 95 | 92 | 94 | 94 | 92 | 93 | 91 |
4 | | Standardized bias | 9 | −9 | 4 | 9 | −15 | 10 | 3 | 12 | −19 | −37 |

^{1}Mid-range prevalence = 30%. ^{2}Low prevalence = 10%. ^{3}Low ICCs: ICC_{PSU} = 0.05; ICC_{HH} = 0.2. ^{4}High ICCs: ICC_{PSU} = 0.15; ICC_{HH} = 0.3.

Scenario | Prevalence and ICC | Assessment | | | | | | | | | | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 30%^{1}; ^{3}ICC_{PSU} = 0.05; ^{3}ICC_{HH} = 0.2 | Bias | −0.010 | −0.010 | 0.010 | 0.020 | −0.020 | −0.030 | 0.000 | 0.010 | 0.006 | −0.050 |
1 | | Coverage % | 91 | 93 | 95 | 97 | 95 | 92 | 94 | 96 | 94 | 93 |
1 | | Standardized bias | −4 | −15 | 7 | 14 | −14 | −11 | 0 | 4 | 5 | −25 |
2 | 10%^{2}; ^{3}ICC_{PSU} = 0.05; ^{3}ICC_{HH} = 0.2 | Bias | −0.040 | −0.006 | 0.010 | 0.030 | −0.020 | −0.003 | 0.030 | 0.040 | 0.010 | −0.070 |
2 | | Coverage % | 90 | 95 | 93 | 94 | 92 | 96 | 94 | 92 | 93 | 94 |
2 | | Standardized bias | −14 | −6 | 6 | 16 | −13 | −1 | 11 | 15 | 7 | −34 |
3 | 30%^{1}; ^{4}ICC_{PSU} = 0.15; ^{4}ICC_{HH} = 0.3 | Bias | −0.010 | −0.008 | 0.030 | 0.030 | −0.060 | −0.080 | 0.040 | −0.020 | −0.100 | −0.090 |
3 | | Coverage % | 92 | 96 | 94 | 95 | 95 | 92 | 96 | 93 | 92 | 91 |
3 | | Standardized bias | −4 | −11 | 18 | 17 | −21 | −25 | 11 | −4 | −41 | −35 |
4 | 10%^{2}; ^{4}ICC_{PSU} = 0.15; ^{4}ICC_{HH} = 0.3 | Bias | 0.030 | 0.006 | 0.010 | 0.010 | −0.060 | −0.070 | 0.006 | 0.050 | −0.080 | −0.110 |
4 | | Coverage % | 94 | 93 | 95 | 96 | 91 | 93 | 94 | 94 | 95 | 94 |
4 | | Standardized bias | 9 | 5 | 5 | 4 | −18 | −18 | 2 | 15 | −34 | −39 |

^{1}Mid-range prevalence = 30%. ^{2}Low prevalence = 10%. ^{3}Low ICCs: ICC_{PSU} = 0.05; ICC_{HH} = 0.2. ^{4}High ICCs: ICC_{PSU} = 0.15; ICC_{HH} = 0.3.

scenarios 1, 2 and 3 but relatively greater bias for random components estimates were observed as compared to scenarios 1 and 2. These biases are not alarming, but in comparison to scenarios 1 and 2 they are somewhat large. The coverage of the 95% Wald CIs of the simulated model for scaling method 2 is close to the nominal value for fixed and random effect parameter estimates. For most parameters the coverage falls within 2 SEs of the nominal coverage probability (p). The standardized bias is less than 40% for all scenarios for scaling method 2, except for the variance component at household level in scenario 3 (41%). However, the latter is a rather small deviation.

Parameter estimates of the 3-level logistic regression model obtained from un-weighted and weighted (scaling methods 1 and 2) analysis of NHSP data converged to similar results with few exceptions. Relatively small ICC, and a large number of PSUs and households are probably the key components of consistency in the results of the weighted and un-weighted analyses in the present application.

Our simulation results showed that the performance of the scaled weighted estimators is satisfactory for scaling methods (1 and 2) for 3-level data with a binary outcome in all scenarios that we considered. The results of the analysis of health care utilization from the NHSP data are consistent with the results of the simulation study with regard to agreement between the two scaling methods.

When substantial divergence is found in inferential conclusion, it suggests that sampling weights are highly informative (the design weights are correlated to the response) and also the ICC may be larger [

In our analysis of NHSP the standard errors for both the scaling methods are larger than those for the un-weighted analysis. The larger standard errors under the two scaled weighted analyses (methods 1 and 2) showed marginal divergence in inferential conclusion compared to un-weighted analysis.

Simulation studies [

We are unaware of any simulation study reported to date that has assessed the impact of different prevalence of the binary outcome, and ICCs at different clustering levels for three-level complex survey data on the bias of estimators of MPML using the two scaling methods. The results of our simulation will provide guidelines for future surveys with regard to obtaining unbiased estimates of MPML weighted analysis for three-level data with a binary outcome.

A simulation study assessed the influence of the prevalence of the outcome, and that of ICC for different sample sizes on the estimates of maximum marginal likelihood for two-level data. However, this study did not involve a complex survey design with weights [

In our simulation the ICCs were varied at the 3^{rd} level and 2^{nd} level clusters simultaneously. Though we did not find any particular pattern in the bias with respect to varying prevalence of the outcome and ICCs, the bias of the variance component increased slightly when the ICC increased for both scaling methods. The sample weight scaling methods considered in this paper can be biased when the number of clusters is small. We can explore this further in future research by varying sample sizes at different levels for the various scenarios of our simulation. We can conclude that both scaling methods 1 and 2 are effective for complex survey designs for the sample size, cluster size, ICC and range of binary outcome prevalence we have considered.

Contextual phenomena are highly important for public health research. To measure the contextual aspect of health care utilization we used multilevel measures of variance and intra-cluster correlation in addition to fixed effects at different levels of clustering. Our analysis indicates that in rural areas more developed communities seek significantly more health care than lesser developed communities. However, in urban areas there was no significant difference among communities with different levels of development. Communities with low development index are illiterate and economically deprived. In Pakistan health care centers of good quality are generally located in urban areas, hence it seems that subjects from communities with low development index in rural areas do not have the resources and/or awareness to travel to urban areas for seeking health care. In addition, our analysis indicates marked differences with regard to health care utilization among the four provinces of Pakistan; Punjab being the most developed in this regard and Baluchistan being the least developed. Underdevelopment of Baluchistan in general is a phenomenon that the present Government and politicians of Pakistan are advocating to address. Moreover, our analysis indicates that the socio-economic status of a household is associated with health care utilization; households belonging to higher socio-economic status seek significantly more health care. The people of Pakistan need to make progress in many areas including health behavior, and health care practices, and improve economic and social policy to improve the nation’s overall health.

In our analysis of NHSP we found that social and economic factors have created differences in the distribution of health utilization among different individuals, households and geographical areas. The epidemiological vision of multilevel analysis must therefore focus on measures of health variation that provide information about the distribution and determinants of health status within multiple contexts.

The investigation of variance components (or ICCs) in multilevel models for health care research provides more insight than odds ratios and regression coefficients, which are the traditional measures of association. If accurate information on contextual effects is provided to health policy makers, their decisions or policies could be very efficient in reducing health inequalities.

Data analysis is critical to formation of policy and program in the health sector, yet in low and middle income countries like Pakistan, there is little use of statistical information [

We express our heartiest gratitude to the Aga Khan University (AKU) for funding this study through a “Faculty Development Award” (FDA). We thank Dr Andrew Titman (Lecturer, Lancaster University) for helping in the procedure for the simulation. We thank Dr Nida Zahid (Instructor, Aga Khan University) for references of the manuscript.

The authors have declared no conflict of interest.

Rozi, S., Mahmud, S., Lancaster, G., Hadden, W. and Pappas, G. (2017) Multilevel Modeling of Binary Outcomes with Three-Level Complex Health Survey Data. Open Journal of Epidemiology, 7, 27-43. https://doi.org/10.4236/ojepi.2017.71004