Investigating Spatio-Temporal Pattern of Relative Risk of Tuberculosis in Kenya Using Bayesian Hierarchical Approaches

Proper understanding of global distribution of infectious diseases is an important part of disease management and policy making. However, data are subject to complexities caused by heterogeneities across host classes and space-time epidemic processes. This paper seeks to suggest or propose Bayesian spatio-temporal model for modeling and mapping tuberculosis relative risks in space and time as well identify risks factors associated with the tuberculosis and counties in Kenya with high tuberculosis relative risks. In this paper, we used spatio-temporal Bayesian hierarchical models to study the pattern of tuberculosis relative risks in Kenya. The Markov Chain Monte Carlo method via WinBUGS and R packages were used for simulations and estimation of the parameter estimates. The best fitting model is selected using the Deviance Information Criterion proposed by Spiegelhalter and colleagues. Among the spatio-temporal models used, the Knorr-Held model with space-time interaction type III and IV fit the data well but type IV appears better than type III. Variation in tuberculosis risk is observed among Kenya counties and clustering among counties with high tuberculosis relative risks. The prevalence of HIV is identified as the determinant of TB. We found clustering and heterogeneity of TB risk among high rate counties and the overall tuberculosis risk is slightly decreasing from 2002-2009. We proposed that the Knorr-Held model with interaction type IV should be used to model and map Kenyan tuberculosis relative risks. Interaction of TB relative risk in space and time increases among rural counties that share boundaries with urban counties with high tuberculosis risk. This is due to the ability of models to borrow strength from neighboring counties, such that nearby counties have similar How to cite this paper: Iddrisu, A.-K., Alhassan, A. and Amidu, N. (2018) Investigating Spatio-Temporal Pattern of Relative Risk of Tuberculosis in Kenya Using Bayesian Hierarchical Approaches. Journal of Tuberculosis Research, 6, 175-197. https://doi.org/10.4236/jtr.2018.62017 Received: February 8, 2018 Accepted: June 26, 2018 Published: June 29, 2018 Copyright © 2018 by authors and Scientific Research Publishing Inc. This work is licensed under the Creative Commons Attribution International License (CC BY 4.0). http://creativecommons.org/licenses/by/4.0/ Open Access A.-K. Iddrisu et al. DOI: 10.4236/jtr.2018.62017 176 Journal of Tuberculosis Research risk. Although the approaches are less than ideal, we hope that our study provide a useful stepping stone in the development of spatial and spatio-temporal methodology for the statistical analysis of risk from tuberculosis in Kenya.


Introduction
Studying the spatio-temporal pattern of infectious diseases is an important health problem in biomedical research.This is because infectious diseases adversely affect the health of the human population as well as of domestic and wild animals.This calls for the need to employ appropriate approaches to explain the spatio-temporal pattern of such diseases.This will enable us to have a proper understanding of the sources of infectious diseases as well as how such diseases spread nationally or globally.The data used to study the spread of infectious diseases are complicated due to heterogeneities that could be of spatial or non-spatial in nature [1] [2] [3].Hence, methods of analysis are required to account for such complexities in order to produce valid statistical inferences [1] [2].
Various authors have proposed methods to study the spread of infectious diseases [4] [5] [6] [7] [8].The spatial and spatio-temporal patterns of a disease can be studied using methods that account for the spatial and the spatio-temporal patterns.The methods as well, will be able to create relative risk (RR) maps that visually display spatial and spatio-temporal variations of disease risk [2] [3].However, disease counts maps have various issues such as the modifiable areal unit problem (MAUP) [9]; which is a source of statistical bias that can affect the outcome of statistical hypothesis.One special case of MAUP is ecological or medical bias where the question of interest is whether inference can be made at the individual level from aggregate data [9].
There are various approaches to handle the MAUP [9].For instance, one can scale up the data to ensure smoothing or averaging of data and making statistical inferences at high aggregate level.Also, one can scale down to enable inferences at lower level than that used in the analysis.Multi-scale analysis can also be employed to handle MAUP [9].The multi-scale analysis handles spatial units that are completely matched when aggregated [10].Another important issue is that differences in populations between regions results to differences in variation of regional estimates.We can account for population difference using hierarchical Bayesian approaches [3] [9].Application of Bayesian methods in disease modeling and mapping has received great attention in biomedical research.For instance, Clayton and col-leagues [11] applied the Empirical Bayes (EB) method to obtain smooth relative maps of lip cancer.They assumed the multivariate normal for the log relative risks and addressed the spatial correlation between neighboring regions by using the conditional autoregressive model [12].However, the EB approach is not a fully Bayesian approach, since a quadratic approximation was used for the likelihood and this did not take into account the uncertainty in the estimates of the hyper-parameters in the specified model.Besag and Molié [12] first proposed the full Bayesian approach for disease modeling and mapping.The convolution prior model described in Section 3.1.1was used to model the log relative risk.
Their study revealed that the model shrunk extreme disease rates towards the mean and is also able to detect spatial association that was apparent in the raw data and noted that the fully Bayesian model yield more precise parameter estimates than that produced by EB approach.
Several spatio-temporal models have been developed and applied to count data.Bernardineli and colleagues [4] approach assumed that the count data follow the Poisson distribution where the log of the relative was the focus of modeling.
Waller and colleagues [5] proposed a spatio-temporal model which consist of some components from the [12] approach.This model uses the convolution prior and allows each time point to have separate spatial and non-spatial effect.
They assumed that the risk factors are constant over time and that disease counts followed a Poisson distribution.Waller and colleagues [5] model was applied to lung cancer deaths in 88 Ohio counties for the year 1968-1988.Each year was modeled as a separate time point.Their study revealed that lung cancer deaths increase and there is spatial clustering and variation of relative risks over time.
There was increasing evidence of clustering of relative risks among the high rate counties, but with higher rates increasing and lower rates were constant.This suggests that there is increasing variability (heterogeneity) in relative risks over the study period [13].Knorr-Held and colleagues [14] proposed a spatio-temporal model which is composed of both the convolution prior on space and also a similar prior for temporal trends.Their approach is an extension of the [5] model by assuming that the spatial terms were constant over time.This formulation allows for exploration of additional varying risk factors within each year.This model was applied to the same Ohio lung cancer data, but it was not clear that it revealed additional features of the data [13].
Iddrisu and Amoako [3] paper focus was on studying the spatial pattern of relative risk of tuberculosis in Kenya with main focus on finding suitable spatial model for modeling and mapping such relative risks.In this paper, we investigate spatio-temporal pattern of the RR of TB in Kenya and then proposed a best fitting Bayesian hierarchical approach for studying the spatio-temporal pattern of tuberculosis in Kenya.
We introduce the data used in this paper in Section 2. We discuss the methods used in Section 3. Section 3.1 discusses spatial distribution of diseases with much focus on the conditional autoregressive (CAR) model formulation for modeling spatial distribution of diseases [12].The CAR model allows us to explain clustering of disease risk between neighboring regions/counties/countries.This discussion is followed by a description of the Besag, York and Mollié (BYM) [12] model formulation in Section 3.1.2.The BYM formulation allows us to capture both heterogeneity (variability) and clustering of diseases risk simultaneously.
The discussions on the CAR and the BYM formulation are necessary because they form the components of the spatio-temporal models adopted in this paper.
The methodologies of the spatio-temporal models are discussed in Section 3.2, and then methods applied to the Kenya TB data and the results presented in Section 4. In Section 5, we give a discussion and concluding remarks in Section 6.

Data Description
The data used in this paper are obtained from Kenya DHS.

Methods
In this section we discuss the Bayesian spatial and spatio-temporal models used  in disease modeling and mapping.

Bayesian Hierarchical Spatial Model for Disease Mapping
Spatial data are associated with a given location on the surface of the earth where such models allow for borrowing of strength between neighboring locations.In this way, neighboring locations/countries/regions/counties will have similar risks whiles distant counties are expected to show variation in risks.Waldo Tobler's first law of spatial analysis states that "everything is related to everything else but near-by things are more related than distant things'' [15].

Conditional Autoregressive (CAR) Model
The conditional autoregressive (CAR) models were introduced by Besag and Molié [12] and they are used to identify and detect clustering of diseases risk.In these models, risks of disease at any given area are influenced by the risk in the neighboring areas.The CAR model is often referred to as the structured or the correlated heterogeneity (CH) model.This is because estimation of risk in any given area is to some extend affected by the diseases risk at neighboring areas [16].In the CAR model, the distances or boundaries between the regions/counties are often used in determining the neighborhood properties [17].
Application of the CAR model can be found in [18] [19] [20].Let the set of all regions.Let , i v i S ∈ be a random variable.We define the corres- ponding random field v as the vector ( ) In the normal CAR model, it is often assumed that each observation of the outcome variable i v has a conditional distribution defined as These are full conditionals where ij Φ is the weight of each observation on the mean of i v and also denotes the spatial dependence parameter.
the conditional expectation of a state's observation is the mean of all neighbours observations [20].
The CAR model is defined by mean and covariance function [21].Assuming that each conditional distribution is Gaussian, we will need the mean and the variance-covariance to define the CAR model.The mean and variance-covariance are respectively defined as Detailed information on the conditional probability density function of a CAR random variable i v can be found in [3] [12].Thus, a conditional autoregressive model v has a probability density function defined as and the joint probability density is The necessary and sufficient condition for (1) to be a valid joint probability density function is that its covariance matrix should be symmetric and positive definite (eigenvalues 0, , , [20].The i v is a Gaussian random variable with D Σ as a symmetric matrix (See [3] [18]).
The 2 v τ controls the overall variability of i v and ρ describes the overall effect of spatial dependence.The value of ρ is should be chosen carefully [20].
Detailed description of how to choose ρ values and parameter estimation in the CAR model can be found in [3] [20].Application of the CAR model to the Kenya TB data can be found in [3].

The Besag, York and Mollié (BYM) Model
The BYM is also a spatial model for risk smoothing which have appeared in literature and has received much attention [12].The BMY model is composed of the CAR model component i v discussed in Section 3.1.1and the unstructured heterogeneity (UH) component i u with normal distribution [3] [12].The component i u the prior distribution for the area-specific random effect.The BYM model was introduced by [11] and latter developed by [12].
The log relative risk ( ) Therefore, the relative risk for the i area is defined by ( ) . The log log-link function is defined as where , E β and ϑ are vectors of the covariate, the associated parameters, the expected number of cases, and the relative risks of TB prevalence respectively.The i u is the county level random effect capturing the residual log RR of dis- ease in county i.The i u (UH) is sometime thought of as a latent variable which captures the effect of unknown or unmeasured area level covariates and i v has a CAR model structure.Detailed information on formulation of the BYM model, parameter estimation and application to the Kenya TB data can be found in [3] [12].We now turn our attention to the spatio-temporal models, the focus of this paper.

Bayesian Hierarchical Spatio-Temporal Models Disease Mapping
Many disease mapping models are restricted to identification of spatial heterogeneity and clustering of diseases risk which are in fact constrained to a single time period.However, most data in public health are often in the form of time window for many years.Therefore, there is the need to consider models which account for spatial and temporal pattern of diseases risks.ZA is interaction of risk in space and time and it u is the unstructured random effect.The contribution of a given term may serve to increase or decrease the risk of disease.The intercept or µ gives a background amount of risk shared by all regions and periods.Most often, an unstructured extra variability term it u is included in the model so as to capture the overall effect of the other unaccounted and unobserved effects.The random effect it u is defined as A is often modeled as a structured random effect.
In the following sections, we describe the spatio-temporal models used and parameter estimation.Markov Chain Monte Carlo via Gibbs sampling is used to obtain parameter estimates under each model [16].

Spatio-Temporal Model I
This spatio-temporal model is based on the mode proposed by Bernardineli and colleagues [4].This is a kind of parametric model in which the risk in a given region is considered as a linear or quadratic function of time.This model takes into account spatio-temporal interaction where temporal trend in risk may differ for spatial locations and may even have a spatial structure [16].All the temporal trends are assumed to be linear and information is shared in both space and time [16].This approach is used to investigate statistical linear rise in the reported TB Pr , , | , , , , , , .
The prior distribution ( ) P u of u follows a normal distribution defined as ( ) One limitation of the model is the assumption of a linear time trend in each region.This limitation is resolve by [8] model (discuss in Section 3.2.5).Parameters estimation from Equation (3) was carried out using MCMC via Gibbs sampling.

Spatio-Temporal Model II
The spatio-temporal Approach used in this section was developed by Waller and colleagues [21].In this model, spatial effects are observed as a set of spatial models, one for each period of time, with almost no relation between them, except possibly for some restriction in their precision parameters [16].Under this model, the hierarchical specification is applied to each time point separately [12] [16].This model does not have a single spatial main effect and does allow spatial pattern at each time point to be completely different [2] [16].
Estimation of parameters from Equation ( 6) was achieved through Bayesian MCMC via Gibbs sampling.

Spatio-Temporal Model III
Knorr-Held and Rasser [8] proposed this approach which is a type of smooth temporal evolution model where the evolution of the estimated risk in each region is a smooth function of time.Knorr-Held and Rasser [8] proposed this model to overcome the limitation suffered by the Bernardineli and colleagues [4] model.Assume that Therefore the relative risk of disease is given by The log of the Poisson mean ( ) It should be noted that , u v and ℑ are the main effects whiles ψ is the space-time interaction term.This model is used to study smooth temporal evolution of the estimated relative risk of TB prevalence in Kenya in each region at given point in time.

Parameter Estimation
Given that ( ) ( ) Pr , , | , , , , , , , , Pr u of u has a normal distribution and prior distri- bution ( ) Pr v of v has CAR structure (See chapter 3.1.1).According to [8], if the main temporal random effect t ℑ assumes unstructured random effect, then its prior distribution would be ( ) ℑ and if it assumes structured random effect, then its prior density follows a first order random walk defined by ( ) According to [8], prior specification for the interaction term ψ depends on the spatial and temporal main effect which are assumed to interact.Different types interactions it ψ were classified by [22] with prior distribution denoted by ( ) Pr ψ and precision variance denoted by 2 ψ τ .Therefore, the posterior distri- bution for the relative risk ϑ is defined by The interaction type depends on which of the two possible type of temporal effects (unstructured or structured) interacts with the two main effects ( i u and i v ).Knorr-Held and Rasser defined four interaction types.Each of the four type of interactions has different prior interrelationships involving the interaction term it ψ TB1.
1) Interaction type I: If the unstructured main effects ( t ℑ and i u ) are ex- pected to interact, then the distribution of the interaction parameter it ψ is de- fined as This may be considered as an independent unobserved covariate for each combination of region and period ( ) , i t , thus without any structure TB1,TB2.On the other hand, if spatial and temporal main effects are present in the model, then the interaction effect only denote independence in the deviations from them.The main effects can cause contribution to risk in neighboring regions or in consecutive period of time to be highly correlated.This is a global space-time heterogeneity effect and it is often modeled as τ .This interaction type has independent prior with no structure in space-time interaction TB1, TB2.
2) Interaction type II: This interaction effect is distributed as a random walk independently of other counties if we modelled t ℑ as a random walk [14].The prior distribution for this interaction is defined by [8] [10] [14] ( )

∑∑
This type of interactions has no structure in space TB4.This implies that each region has a specific evolution structure that is independent of that in the neighbouring region TB2, TB1.
3) Interaction type III: If we assumed that the unstructured temporal main effect ( t ℑ ) and the spatially correlated or structured main effect ( i v ) interact, then the interaction effect parameter ( ) ( ) This interaction is assumed to have a spatial structure for each period, independent of adjacent periods (its neighbors in time).This interaction type is analogous to the clustering effect, which is often modeled as a CAR distribution (section 3.1.1)for each period [1] [4].Here we implicitly assumed that each specific region may have a slight deviation from the global trend, but that this deviation is likely to be identical to that in the neighboring regions, while at the same time, independent of that in that in the previous period of time [1] [8].
4) Interaction type IV: Type IV is completely dependent on space and time theoretically [14].Hence the effect can no longer be factorized into independent blocks if t ℑ is modeled as a random walk allowing interaction with the struc- tured main effect ( i v ). [8]defined type IV interaction as ( ) ∑∑  [8] stated that, type IV is the most interesting type of interaction that occurs when deviation from the global trends are highly correlated with their neighbors, both in space and time.Here, hidden factors whose effects exceed the limits of one or more regions and also persistent for one or more period of time can be modeled.This is also an efficient way of obtaining information from data, particularly in the case of rare diseases or less populated regions, since the risk estimation for the region-period is not performed on the basis of only locally observed data but also on that in the neighboring regions and periods TB2.
The hyper-prior distribution for 2 τ ℑ and 2 ψ τ are modeled as gamma distri- bution.Estimation of all parameters was achieved with Bayesian MCMC via Gibbs sampling.

Results
Best fitting spatio-temporal model to Kenya TB data was selected from the above candidate models based on their respective model's DICs and pDs presented in Table 4.Note that these values are subject to Monte Carlo error, which is difficult to quantify.We have therefore chosen a very long run of which convergence was reached at 1,200,200 after a burn-in period of 100,000 and thinning of every 30 th element of the chain for each model shown in Figure 2.
From Table 4, D is the mean of the posterior deviance, pD is the effective number of parameters and DIC = D + pD proposed by [22].Among the spa- tio-temporal models presented in Table 4, the model with the lowest DIC (4194.30)and lowest pD (375.306) is the Knorr-Held and Rasser [8] model.We therefore recommend model (7) as the best fitting space-time model to Kenya TB data for 2002-2009.Table 5 summaries our effort to identify appropriate interaction type.The Table 5 presents deviance summary of the interaction types after MC convergence at 1,200,000 and a burn-in period of 100,000 for each model.The models fit with interaction type III and IV fit the data well but type IV seems better than type III since it yields the lowest pD (362.494) and DIC (419.410).We now consider analysis and interpretation of results under the model with interaction type IV since it gives best fit of the Kenya TB data.In Table 6, the overall mean relative risk µ is insignificant and the precision va- riance parameters 2 v τ and 2 u τ indicates significance of clustering and hetero- geneity of relative over the studied period respectively.The precision of the variance parameter indicates significance of TB relative risk interaction in space-time.Generally, the pattern does not change much over the study period.However, some counties have interesting time trends.For instance, in Figure 3, the two    years Ohio respiratory cancer dataset and found that interaction type II was appropriate; offering lowest deviance [16].Depending on the data you are dealing with, any of the four interaction types can yield best fit for the data [8].Spatio-temporal Parametrization of log relative risk can take a variety of forms and it is not clear yet which form is most appropriate [16].

Discussion
This paper explores application of spatio-temporal models used in disease mod-

Figure 1 .
Figure 1.Map of Kenya showing counties and boarders.
Pr u of u follows a normal distribution and prior distribution( )Pr v of v has CAR structure (See Sec- tion 3.1.1).Therefore, the posterior distribution is defined as

Figure 3
Figure 3 displays the distribution of the posterior relative risk for 2002-2009.

Figure 4
Figure 4 displays decreasing temporal trend of posterior relative parameters for some highly urbanized counties such as Mombasa and Nairobi.In contrast, pronounced increasing trends were observed for most rural counties such as Nandi and Kiambu.More information on temporal trend behavior of posterior relative risk for the rest of the counties can be found in Appendix.The left panel of the Figure 5 displays a slight increasing temporal trend from 200-2004, slight decrease from 2004-2005, increase in 2006, a slight decrease in 2007-2008 and slight increase in 2009.The temporal trend effect does not change much for 2002-2009.The right panel of the Figure 5 displays area relative risk of the type IV interaction.Again high risk of TB is observed in the North, West, North-West and Central counties of Kenya and low risk in the South-East counties.Interaction type IV was identified by [23] as the best fitting spatio-temporal model for modeling and mapping salmonellosis counts in cattle in Switzerland, 1991-2008.Knorr-Held fitted the four different types of interaction to the 21

Figure 3 .
Figure 3. Type IV interaction posterior mean of the relative risk maps for 2002-2009.
eling and mapping of TB relative risk in Kenya.These models were fitted to Kenya's TB prevalence data from 2002-2009.Markov Chain Monte Carlo via Gibbs sampling was used for simulation of parameters from posterior distributions.Rubin and Gelman convergence diagnostics test was used to confirm A.-K. Iddrisu et al.DOI: 10.4236/jtr.2018.62017191 Journal of Tuberculosis Research

Figure 4 .
Figure 4. Type IV interaction posterior mean of the relative risk for 2002-2009.

Figure 5 .
Figure 5. Type IV temporal trend and area posterior mean relative risk.

Figure S2 .
Figure S2.Interaction type IV posterior mean relative risk temporal trend by counties over the study period 2002-2009.

Table 1 .
Summary statistics of explanatory variables.

Table 3 .
Summarises Statistics of County Level TB Cases From 2002-2009.
, representing each period of time under study.Let it n be the number of persons-times at risk in region i at period t and it y be the corresponding observed cases which are counts TB2.The observed data it y depends on it N , the number of people at risk in region i and period t in the study population observed.Let dictor, µ denotes the grand mean, i Z the main effect of region i, t A the temporal trend effect in period t, it is the space-time interaction.The log relative risk for area i and period t is t ℑ represents an unstructured or structured temporal effect and the parameter it ψ ,

Table 6 .
Interaction Type IV Posterior Statistics.