Determinants of Infant Mortality in Rural India: a Three-level Model

Using a two-level analysis of infant mortality data from the second round of the National Family Health Survey, which takes the hierarchical structure of the data into account, the same group of authors recently reported determinants of infant mortality and examined how the results differ from those of traditional regression analysis, which ignores that structure. They reported that community-level (e.g., state-level) characteristics still play a major role in infant mortality in India. For better epidemiological understanding, the present study assesses determinants of infant mortality in rural India, where a three-level analysis was possible. The results indicate that even after these covariates are taken into account, variation in infant mortality remains significant not only between States but also between Districts. Further, as an additional observation, the probability of infant mortality is higher in rural areas of districts having a health facility beyond three kilometers than in their counterparts.


INTRODUCTION
The infant mortality rate (IMR) remains an important public health indicator [2,3]. Data on infant mortality are often hierarchical in nature, especially those available at national/state/district/village/household/individual level [1-5]. Hence, as reported earlier, data on IMR need to be analyzed with hierarchical (multilevel) methods, which take the hierarchical structure into account and make it possible to incorporate variables from all levels while retaining them at their own levels [1]. Depending on circumstances, various approaches may be possible for analyzing such data. In the present study, for better epidemiological understanding of rural areas, where the largest share of the country's population lives, the data relate only to rural India. The analysis involved a three-level model instead of the earlier two-level models [1]. At regional level, an up-to-date epidemiological understanding may be helpful to policy planners. It may also help in testing hypotheses related to population issues and generate important clues for public health programs.

Materials
The data used in the present study are from the second National Family Health Survey (NFHS), conducted in India in 1998-1999. The sampling methods, questionnaires, methods of data collection and all other aspects are documented elsewhere [1,6]. Detailed information on antenatal, delivery and postnatal care was obtained for the two most recent births that occurred to eligible women during the three years preceding the survey. Because the sample design was not self-weighting, the country-level weight was used in the analysis to restore the correct proportions [1]. As considered earlier [1], the dependent variable in the present analysis was infant mortality (i.e., death of a child before his/her first birthday) in the three years preceding the survey (0 = alive, 1 = died). In rural areas, a total of 24,493 births were recorded in the three years preceding the survey. Of these, 192 children who died after their first birthday were considered alive. Further, 7920 children who were alive but had not completed their first birthday were excluded from the analysis. Missing information also led to the exclusion of some children: religion + caste (144), standard of living index (199), place of delivery (82), size of child at birth (86), squeeze milk from breast (120), mother's education level (3), father's education level (40), consumed all given tablets (78), distance to nearest health facility (68), and Delhi-rural (54). Finally, complete information on 15,969 children was available for analysis.
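The outcome coding and the age-based exclusion described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name `code_infant_outcome` and its arguments are hypothetical and do not correspond to NFHS variable names, and ages are assumed to be recorded in completed months.

```python
from typing import Optional

FIRST_BIRTHDAY_MONTHS = 12  # assumption: ages are recorded in completed months


def code_infant_outcome(is_alive: bool,
                        age_months: Optional[int] = None,
                        age_at_death_months: Optional[int] = None) -> Optional[int]:
    """Return 1 (died in infancy), 0 (counted as alive), or None (excluded).

    Hypothetical helper mirroring the rules stated in the text:
    - death before the first birthday -> 1
    - death after the first birthday  -> 0 (considered alive)
    - alive but not yet one year old  -> excluded (None)
    - alive and at least one year old -> 0
    """
    if not is_alive:
        if age_at_death_months is None:
            raise ValueError("age at death is required for a deceased child")
        return 1 if age_at_death_months < FIRST_BIRTHDAY_MONTHS else 0
    if age_months is None:
        raise ValueError("current age is required for a living child")
    return 0 if age_months >= FIRST_BIRTHDAY_MONTHS else None


# Four illustrative (made-up) records, one per rule.
records = [
    {"is_alive": False, "age_at_death_months": 3},
    {"is_alive": False, "age_at_death_months": 14},
    {"is_alive": True, "age_months": 7},
    {"is_alive": True, "age_months": 20},
]
outcomes = [code_infant_outcome(**r) for r in records]
print(outcomes)  # [1, 0, None, 0]
```

Records that return `None` correspond to the children who had not completed their first birthday and would be dropped before modelling.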
As considered earlier [1], a number of explanatory variables were selected for the analysis; they are presented here again for completeness. Some qualitative variables were retained in their existing forms: place of residence (urban/rural), standard of living index (high/low/medium), sex of the child (girl/boy), consumed all given iron tablets (yes/no), and child received colostrum (yes/no). To enable a meaningful analysis, some variables available on a nominal scale were dichotomized after exploratory analysis: size at birth (average + large/small), status of birth (single/multiple), mother received at least 3 antenatal visits (yes/no), received at least 2 TT doses (yes/no), and place of delivery (institutional/non-institutional). Continuous variables were converted into categorical variables: preceding birth interval (≥24 months/first birth/<24 months), mother's education level (9 & above/0-5/6-8), father's years of schooling (9 & above/0-5/6-8), and birth order (2-3/1/4+). Some variables were derived from information collected in multiple forms. For example, exposure to radio, television, newspaper and poster in the month preceding the survey was recorded separately; mothers exposed to any of these media were coded yes and all others no, and the variable was named exposure to mass media (yes/no). Mother's age at birth was computed by subtracting the mother's date of birth from the child's date of birth and recoded as (20-29/<20/30+). Religion and caste were pooled to derive another variable, religion-caste (non-Hindu/SC-ST-OBC Hindu/other Hindu). At state level, the following covariates were considered: percentage of women aware of ORS (<63/≥63); percentage of births of order 4 and above (<28/≥28); percentage of women with middle school & above (<20/≥20); Government expenditure on health per head (<29/≥29); and percentage of women with any anemia (<48/≥49). For these, the confidence interval (CI) of the national-level estimate of each covariate was calculated and the lower limit of the CI was used as the threshold for categorization [1]. Further, in the analysis of rural data, one additional covariate, namely district with health facility within 3 km, was also considered.
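The derived covariates described above can be illustrated with a short sketch. All function names here are hypothetical, calendar dates are assumed for simplicity (the survey's own date coding may differ), and the threshold passed to `dichotomize` stands in for the lower CI limit of a national-level estimate.

```python
from datetime import date


def exposure_to_mass_media(radio: bool, television: bool,
                           newspaper: bool, poster: bool) -> str:
    """'yes' if the mother was exposed to any of the four media, else 'no'."""
    return "yes" if any((radio, television, newspaper, poster)) else "no"


def mother_age_group(mother_dob: date, child_dob: date) -> str:
    """Mother's age at the birth, in completed years, recoded into three groups."""
    age = child_dob.year - mother_dob.year
    # Subtract one year if the mother's birthday had not yet occurred that year.
    if (child_dob.month, child_dob.day) < (mother_dob.month, mother_dob.day):
        age -= 1
    if age < 20:
        return "<20"
    return "20-29" if age < 30 else "30+"


def dichotomize(value: float, threshold: float) -> str:
    """Split a state-level percentage at a threshold (e.g. a lower CI limit)."""
    return f"<{threshold:g}" if value < threshold else f">={threshold:g}"


print(exposure_to_mass_media(False, True, False, False))      # yes
print(mother_age_group(date(1978, 6, 15), date(1997, 3, 1)))  # <20
print(dichotomize(55.0, 63))                                  # <63
```

The input values in the three calls are made up for illustration; only the category labels and the threshold of 63 (awareness of ORS) come from the text.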

Analytical Methods
For the rural India data set, the distribution of children by the various covariates and the corresponding percentages of infant mortality were tabulated first. Because the outcome variable "infant mortality" is binary, logistic regression was used, as the traditional method of analysis, to identify the individual factors associated with infant mortality. Accordingly, the unadjusted risk ratio and its 95% confidence interval (CI) were worked out for each covariate and are presented in the corresponding tables.
As reported earlier [1], the sets of covariates were chosen for both statistical and public health relevance. On exploration, neither collinearity nor effect modification among the considered covariates was noticed in the data sets. As the traditional analysis, stepwise logistic regression was used to identify the factors associated with infant mortality without (TLR1) and with (TLR2) consideration of the district-level variable. For appropriate comparison, the covariates retained in the traditional logistic regression model (TLR2) were included in the hierarchical models (MLR1 and MLR2). Finally, the adjusted risk ratio and its 95% confidence interval are presented in the corresponding tables.
In contrast to the earlier study [1], exploration of the data structure on infant mortality in rural India showed that a three-level structure was appropriate for the analysis, conceptualized as children (level 1) nested within districts (level 2) and districts nested within states (level 3). In addition to the traditional multivariable stepwise logistic regression analysis (TLR2), the multilevel analysis was carried out twice: first as a random intercept model (MLR1) and second as a random intercept and slope model (MLR2). In other words, all the community-level variables were retained at the fixed level under MLR1, whereas they were retained at the random level under MLR2. The models used are simply extensions, from two levels to three levels, of those described earlier [1]. To reiterate, the hierarchical models assume that children/mothers in the same district, and districts in the same state, are correlated in terms of community effects; the traditional logistic regression model makes no such allowance.
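For orientation, a three-level random intercept logistic model of the kind described above is commonly written as follows. This is the generic textbook form, not notation taken from the source: the symbols are assumptions introduced here, with $i$ indexing children, $j$ districts and $k$ states, $\mathbf{x}_{ijk}$ the covariates, $u_{jk}$ the district effect and $v_k$ the state effect.

```latex
\begin{aligned}
y_{ijk} &\sim \mathrm{Bernoulli}(\pi_{ijk}), \\
\log\frac{\pi_{ijk}}{1-\pi_{ijk}} &= \beta_0 + \boldsymbol{\beta}^{\top}\mathbf{x}_{ijk} + u_{jk} + v_{k}, \\
u_{jk} &\sim N(0, \sigma_u^2), \qquad v_{k} \sim N(0, \sigma_v^2).
\end{aligned}
```

Setting $u_{jk} = v_k = 0$ recovers the single-level logistic model, which treats all children as independent. Non-zero $\sigma_u^2$ and $\sigma_v^2$ correspond to residual variation between districts and between states, respectively.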
As documented earlier [1], traditional discrete-response logistic regression models are usually estimated by maximum likelihood, a computationally friendly method. However, this procedure becomes computationally intensive for discrete-response multilevel models. Hence, the multilevel models were estimated using either iterative generalized least squares (IGLS) or reweighted IGLS in the MLwiN program, version 1.1. For this, the Marginal Quasi Likelihood (MQL) approximation with a first-order Taylor linearization was applied, followed by the second-order PQL procedure. The Wald test was used to assess the significance of the coefficients. Taking the 5% level as the threshold for significant association, the coefficients were transformed to obtain the risk ratio and its 95% CI. All analyses other than the multilevel models were carried out using STATA (version 9) and SPSS (version 14). The detailed results related to infant mortality in rural India are presented in Tables 1 and 2.
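The transformation from a coefficient to a ratio with a 95% CI, together with the Wald test, can be sketched as below. This is an illustration under stated assumptions, not the authors' procedure: the function names are hypothetical, the interval is assumed to be symmetric on the log scale, and exponentiating a logistic coefficient strictly yields a ratio on the odds scale. The example input is the unadjusted estimate reported in the Results for districts with a health facility beyond 3 km, 1.13 (0.92-1.39), from which a coefficient and standard error are backed out.

```python
import math

Z_95 = 1.959963984540054  # two-sided 95% quantile of the standard normal


def wald_summary(beta: float, se: float) -> dict:
    """Exponentiated coefficient, its 95% CI, and a two-sided Wald p-value."""
    z = beta / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return {
        "ratio": math.exp(beta),
        "ci_low": math.exp(beta - Z_95 * se),
        "ci_high": math.exp(beta + Z_95 * se),
        "z": z,
        "p_value": p_value,
    }


def coefficient_from_interval(ratio: float, lower: float, upper: float):
    """Back out (beta, se) from a reported ratio and its 95% CI.

    Assumes the interval is symmetric around beta on the log scale.
    """
    beta = math.log(ratio)
    se = (math.log(upper) - math.log(lower)) / (2.0 * Z_95)
    return beta, se


beta, se = coefficient_from_interval(1.13, 0.92, 1.39)
result = wald_summary(beta, se)
print(round(result["ratio"], 2), round(result["ci_low"], 2), round(result["ci_high"], 2))
print(result["p_value"] > 0.05)  # True
```

Because the interval includes 1, the Wald p-value exceeds 0.05, which agrees with the description of this unadjusted estimate as not statistically significant.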

RESULTS
As in the case of infant mortality in India overall [1], the distribution of infant deaths in relation to various socioeconomic and demographic characteristics in rural India (Table 1) also reveals that children are more likely to die before their first birthday if their mothers have comparatively lower education [2.13 ...]. Children belonging to districts with health facilities beyond 3 km had a comparatively higher chance of dying during infancy, although the difference was not statistically significant [1.13 (0.92-1.39)]. Children belonging to states with a low level of awareness about ORS among women [1.61 (1.43-1.81)], a low level of middle school and above literacy among mothers [1.64 (1.45-1.85)], a higher level of children with birth order 4 and above [1.61 (1.43-1.81)], a high level of any kind of anemia among mothers [1.51 (1.30-1.75)] and low per capita expenditure on health [1.18 (1.05-1.33)] had comparatively higher chances of dying before their first birthday.
As evident from Table 2, the results under multilevel analysis (MLR1 and MLR2) generally involve broader confidence intervals than those under traditional logistic regression analysis (TLR2). With a few exceptions, the results for rural areas were similar to those for overall India [1]. For example, the covariate complications during pregnancy did not enter the model at all. Further, among the community-level covariates, % of women with any anemia also did not enter the model. However, the additional covariate considered at district level in the rural India data, namely distance to health facilities, entered the model.
The results under the random intercept model for rural India reconfirm the findings obtained at national level [1]. Further, they indicate that even after consideration of these covariates, variation in infant mortality remains significant not only between States but also between Districts. Such variation cannot be explored under TLR2. Again, the results under the random coefficient model reconfirm the findings observed at national level [1]. Further, as an additional observation, the probability of infant mortality is higher in rural areas of districts having a health facility beyond three kilometers than in their counterparts.
Under the above model, the probability of infant mortality was allowed to vary across States, but the effects of the explanatory variables were assumed to be the same for each State. Under the random coefficient model (MLR2), this assumption was relaxed: the coefficients related to % of women with middle education & above and % of women with any anemia were also allowed to vary across States. The results indicate that the effect of % of women with middle school and above, which ranges from 12% to 56%, does indeed vary across States [28]. The results further indicate that State-level variation in the probability of infant mortality is higher in communities having a low % of women with middle school and above than in their counterparts. Likewise, State-level variation in the probability of infant mortality is likely to be higher in communities having a high % of women with any anemia, which ranges from 23% to 70% [28], than in their counterparts.
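The statement that State-level variation differs between categories of a covariate follows from the variance function of a random coefficient model. As a sketch in generic notation (the symbols are assumptions introduced here, not taken from the source), let $x_k$ be a dichotomous state-level covariate coded 0/1, $v_{0k}$ the state-level random intercept, $v_{1k}$ the state-level random slope on $x_k$, and $u_{jk}$ the district effect:

```latex
\begin{aligned}
\log\frac{\pi_{ijk}}{1-\pi_{ijk}} &= \beta_0 + \boldsymbol{\beta}^{\top}\mathbf{x}_{ijk} + (\beta_1 + v_{1k})\,x_k + u_{jk} + v_{0k}, \\
\operatorname{Var}(v_{0k} + v_{1k}x_k) &= \sigma_{v0}^2 + 2\,\sigma_{v01}\,x_k + \sigma_{v1}^2\,x_k^2 .
\end{aligned}
```

For $x_k = 0$ the between-state variance is $\sigma_{v0}^2$, and for $x_k = 1$ it is $\sigma_{v0}^2 + 2\sigma_{v01} + \sigma_{v1}^2$. The two categories therefore show different State-level variation whenever $2\sigma_{v01} + \sigma_{v1}^2 \neq 0$.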

DISCUSSION
As reported earlier for overall India [1], in rural India also, children are more likely to die before their first birthday if their mothers have comparatively lower education or no exposure to mass media; if their fathers have a low level of education; if they belong to SC/ST/OBC categories or to households with a low or medium standard of living index; if they are of first birth order; if their size at birth is small; if they are from a multiple birth; if they are either a first birth or born within 24 months of the previous child; if their mothers are younger than 20 years; if their mothers did not receive three ANC visits, did not receive two TT doses, could not consume all the iron tablets given during pregnancy, or delivered at a non-institutional place; and if they did not receive colostrum [13-24].
As expected, multilevel analysis yielded broader confidence intervals than traditional logistic regression analysis [26,27]. With a few exceptions, the results of the multilevel analysis of rural data were similar to those for overall India [1]. At individual level, the covariate complications during pregnancy did not enter the model at all. Among state-level covariates, unlike % of women with middle school and above, % of women with any anemia did not enter the model. However, the additional covariate considered at district level in the rural India data, namely distance to health facilities, entered the model. The results further indicate that even after consideration of these covariates, variation in infant mortality remains significant not only between States but also between Districts. As an additional observation, the probability of infant mortality is higher in rural areas of districts having a health facility beyond three kilometers than in their counterparts.
In summary, state/district-level developmental indicators, and their effects on infant mortality, are likely to still vary significantly across states/districts. Hence, for further improvement, there is a need to focus on district-level planning as well, instead of planning only at the state level. In view of the present findings, for data with a hierarchical structure there is a need to emphasize the use of the highest feasible number of levels in hierarchical (i.e., multilevel) models instead of settling for models with fewer levels. Such considerations may provide additional important clues to policy planners, leading to optimal use of the resources available for public health programs.

Table 1. Percentage of infants who died, by socioeconomic and demographic characteristics, in rural India, with unadjusted risk ratio (URR) and corresponding 95% confidence interval (95% CI).
* Percentages are taken from the NFHS-2, India report. @ State-wise population from the 1991 census; expenditure on health from the Tenth Five Year Plan 2002-2007, Planning Commission of India.

Table 2. Adjusted risk ratio (ARR) and corresponding 95% confidence interval (95% CI) for infant death in rural India by socioeconomic and demographic characteristics, using traditional (TLR) and multilevel (MLR) logistic regression analysis.