Time Series Analysis on Reported Cases of Tuberculosis in Minna Niger State Nigeria

Predicting the trend of non-seasonal data is a difficult task in social science. In this research work, we applied time series analysis to 144 monthly observations of reported tuberculosis cases at Minna General Hospital, Niger State, covering the period 2007-2018. Exploratory data analysis (EDA: time plot and descriptive statistics), a stationarity test (ADF), trend estimation ($T_t$), a normality test, and forecast evaluation were carried out. The Augmented Dickey-Fuller test for stationarity showed that the series is not stationary at level but becomes stationary after first differencing. The correlogram indicated that ARIMA (2, 1, 3) was the best model, and this was further confirmed by the Ljung-Box test. The fitted ARIMA (2, 1, 3) equation is $X_t + 0.6867X_{t-1} - 0.8859X_{t-2} = E_t + 1.3077E_{t-1} - 1.2328E_{t-2} + 0.5788E_{t-3}$, which was used to predict the likely number of tuberculosis cases in Minna over the five years 2019-2023. The projection shows that reported cases of tuberculosis fall by about 7% year on year over the period under consideration, which could be the result of interventions by government, health workers, and individuals. In line with these findings, we recommend that the management of the general hospital increase public awareness campaigns on the causes and dangers of tuberculosis.


Introduction
Tuberculosis is an infectious disease that usually affects the lungs. It is of particular concern for the population of Minna, Niger State, Nigeria, where this study takes place [1].
In the 18th and 19th centuries, a tuberculosis epidemic rampaged throughout Europe and North America, before the German microbiologist Robert Koch discovered the microbial cause of tuberculosis in 1882. Following Koch's discovery, the development of vaccines and effective drug treatment led to the belief that the disease was almost defeated; indeed, at one point the United Nations predicted that tuberculosis (TB) would be eliminated worldwide by 2025.
However, in the mid-1980s, tuberculosis began to rise worldwide, so that in 1993 the World Health Organization (WHO) declared tuberculosis a global emergency, the first time that a disease had been labeled as such. The World Health Organization estimates that nine million people a year get sick with tuberculosis, with three million of these "missed" by health systems. Tuberculosis is among the top three causes of death for women aged 15 to 44.
Tuberculosis symptoms (cough, fever, night sweats, weight loss, etc.) may be mild for many months, and people ill with tuberculosis can infect up to 10-15 other people through close contact over the course of a year. It is an airborne pathogen, meaning that the bacteria that cause tuberculosis can spread through the air from one person to another.
According to [3], people with compromised immune systems are most at risk of developing active tuberculosis. For instance, HIV suppresses the immune system, making it difficult for the body to control tuberculosis bacteria. People with both HIV and TB are around 20-30 percent more likely to develop active TB than those who do not have HIV. Tobacco use has also been found to increase the risk of developing active TB. About 8% of TB cases worldwide are related to smoking. Furthermore, [3] also stated that people with the following conditions have an increased risk: diabetes, certain cancers, malnutrition, and kidney disease.

Aim and Objectives
The main aim of this research is to fit a time series model that describes the reported cases of tuberculosis in Niger State. The specific objectives are:
• To test the stationarity of the series of reported tuberculosis patients in Minna, Niger State.
• To identify and estimate the model that best describes the data.
• To forecast future occurrences.

Literature Review
Tuberculosis (TB) is a major global health problem. Despite progress made in diagnosis and treatment, mortality associated with TB remains high. TB was classified by the WHO (World Health Organization) in 2016 as the deadliest infectious disease, with 5,000 deaths per day [2]. Multidrug-resistant tuberculosis (MDR-TB), defined as a Mycobacterium tuberculosis strain resistant to both isoniazid and rifampicin, represents a threat to global TB control. In 2015, among the 10.4 million new cases of TB disease, 48,000 were confirmed to be MDR-TB and 100,000 were found to have rifampicin-resistant TB (RR-TB) and were treated with second-line TB drugs [4]. Also contributing is HIV/AIDS, a well-known risk factor for TB disease, TB drug resistance and TB-related death [5]. Among the 1.8 million TB deaths in 2015, 22% were HIV co-infected, and 35% of HIV deaths were due to TB [4]. The emergence of MDR-TB has been sustained by the lack of diagnostics, non-adherence to first-line treatment, retreatment failure and poor second-line therapy success in many countries (Mc Bride et al., 2017). Several risk factors have been associated with MDR-TB: a history of prior TB treatment, HIV infection, contact with a known TB patient, receipt of more than two treatment courses, a large burden of bacilli on sputum microscopy, lung cavitation and bilateral lung disease [6] [7]. Since 2010, the World Health Organization (WHO) has recommended GeneXpert MTB/RIF for rapid detection of rifampicin resistance and for diagnosing MDR-TB in more than 90% of those tested [1]. In Mali, the prevalence of MDR-TB was 3.4% and 66.3% in new and previously treated TB patients, respectively [8]. The microbiological confirmation of MDR-TB is based on labor-intensive, costly and time-consuming culture and drug susceptibility testing (DST) methods that require extensive laboratory infrastructure and are not routinely available in countries with limited resources.
These limitations often result in overuse of unnecessary second-line TB drugs for individuals with drug-susceptible TB disease and non-tuberculous mycobacteria (NTM). The Mali study aimed to determine clinical factors associated with microbiologic confirmation of MDR-TB disease among patients suspected of having MDR-TB by clinical criteria in Bamako, Mali. This knowledge is important for clinicians to identify individuals who are most likely to have confirmed MDR-TB in that setting.
Tuberculosis (TB) is a disease caused by germs that are spread from person to person through the air. It usually affects the lungs, but it can also affect other parts of the body, such as the brain, the kidneys or the spine. A person with TB … [7]. Tuberculosis is a chronic infectious and communicable granulomatous disease caused by Mycobacterium tuberculosis [9]. The tubercle bacilli establish infection in the lungs after they are carried in droplets small enough (5 to 10 microns) to reach the alveolar spaces. If the defense system of the host fails to eliminate the infection, the bacilli proliferate inside alveolar macrophages and eventually kill the cells. The infected macrophages produce cytokines and chemokines that attract other phagocytic cells, including monocytes, other alveolar macrophages and neutrophils, which eventually form a nodular granulomatous structure called the tubercle. If the bacterial replication is not controlled, the tubercle enlarges and the bacilli enter local draining lymph nodes.
This leads to lymphadenopathy, a characteristic clinical manifestation of primary tuberculosis [5].
There are two sources of TB infection: human and bovine (connected with domestic and wild mammals). The most common source of infection is the human case whose sputum is positive for tubercle bacilli and who has either received no treatment or not been treated fully. Among the members of the Mycobacterium tuberculosis complex, M. tuberculosis is a human pathogen, whereas Mycobacterium bovis has a broad host range and is the principal agent responsible for tuberculosis (TB) in domestic and wild mammals. Mycobacterium bovis also infects humans, causing zoonotic TB through ingestion, inhalation and, less frequently, contact with mucous membranes and broken skin. Zoonotic TB is indistinguishable clinically or pathologically from TB caused by M. tuberculosis [9]. Tuberculosis is one of the most common life-threatening infections among persons living with HIV/AIDS, but it does not inevitably follow that HIV is common in TB patients. An earlier survey conducted in Bangladesh to evaluate the prevalence of HIV in TB patients showed insignificant levels [10].
People who have diabetes mellitus (DM) and live with TB patients are at greater risk of developing TB disease. Body immunity is key to protecting the body from infectious diseases. The incidence of tuberculosis is greatest among those with conditions impairing immunity, such as human immunodeficiency virus (HIV) infection and diabetes [11].
Currently, the standard short-course chemotherapy for TB treatment comprises a 6-month regimen. There are four drugs for the intensive phase, isoniazid (INH), rifampin (RIF), pyrazinamide (PZA) and ethambutol (EMB), and two drugs for the continuation phase, isoniazid and rifampin. The 4-month continuation phase, with only two drugs, is used for the majority of patients. Although these regimens are broadly applicable, there are modifications that should be made under specified circumstances. Alternative chemotherapy using more costly and toxic drugs, often for prolonged durations (generally 18 months), is required for multidrug-resistant and extensively drug-resistant tuberculosis. Directly observed treatment (DOT) is used as part of a holistic approach to limiting the development of drug resistance in tuberculosis [12].
There are two types of prevention: clinical and behavioral preventive measures. Isoniazid preventive therapy is used to prevent TB infection. BCG vaccination significantly reduces the risk of tuberculosis, by an average of 50%; vaccination with BCG was significantly associated with a reduction in the incidence of pulmonary tuberculosis and extrapulmonary disease.
Preventive therapy with isoniazid reduces the risk of disease among recently infected children by 60%-80%, and side effects are rare. Preventive treatment among adults with latent tuberculosis infection also has a protective efficacy in the range 60%-80%, depending on the duration of therapy. Effectiveness in routine practice may be limited by partial uptake and compliance [13].
Effective preventive measures are essential to reduce tuberculosis (TB) transmission. [14] conducted a study to determine knowledge and acceptability of potential patient-specific TB infection control measures in a rural South African community. Their results showed that most participants (89%) accepted the wearing of face masks in health facilities, but only 42% of TB suspects and 66% of participants accepted separate cohorting in health facilities and avoidance of co-sleeping with uninfected household members.
Another study was conducted by [15].
In summary, the majority of authors who have worked on TB-related research in Niger State, Nigeria, studied prevalence using percentage analysis, which may not be the best way to describe the rate of infection or to formulate a model for predicting the future. This study addresses that gap.

Research Methodology
The methodology for this research work is a time series approach. Before model formulation in time series analysis, it is important to test the data for stationarity and normality, as stated in objective 1. We used a graphical method and the Augmented Dickey-Fuller unit root test to assess the stationarity of the data, and computations of skewness and kurtosis to determine whether the data are normally distributed. A statistical explanation is given below:

Unit Root Tests (Testing for Series Stationarity)
For a univariate time series, the unit root test is frequently employed for testing stationarity. The test poses the null hypothesis that the given time series has a unit root, which means that the time series is non-stationary, and checks whether the null hypothesis can be statistically rejected in favor of the alternative hypothesis that the given time series is stationary. To detect whether a given series is non-stationary, let us assume that the relationship between the current value (at time $t$) and the last value (at time $t-1$) in the time series is as follows (Enders, 1995):
$X_t = \varphi X_{t-1} + E_t$
where $X_t$ is the observed value at time $t$ and $E_t$ is a white noise process. This model is a first-order autoregressive process. The time series $X_t$ converges as $t \to \infty$ to a stationary time series if $|\varphi| < 1$. If $|\varphi| = 1$ or $|\varphi| > 1$, the series $X_t$ is not stationary and the variance of $X_t$ is time dependent; in other words, the series has a unit root.
The unit root test subsequently tests the following one-sided hypothesis:
H0: $\varphi = 1$ (has a unit root)
H1: $\varphi < 1$ (has root outside the unit circle)
A well-known test that is valid in large samples is the Augmented Dickey-Fuller (ADF) test. This test uses the existence of a unit root as the null hypothesis.
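As a rough illustration of the idea behind the test, the sketch below runs the simplest (non-augmented) Dickey-Fuller regression $\Delta X_t = a + \rho X_{t-1} + e_t$ by ordinary least squares and reports the t-statistic of $\rho = \varphi - 1$. It is not the procedure used in this study: the function name, the simulated series and the approximate 5% critical value of -2.86 (large-sample, constant-only case) are assumptions made for illustration, and no lagged differences are included.

```python
import math
import random


def dickey_fuller_t(series):
    """t-statistic of rho in the OLS regression diff(X)_t = a + rho * X_{t-1} + e_t."""
    x = series[:-1]                                        # lagged levels X_{t-1}
    y = [b - a for a, b in zip(series[:-1], series[1:])]   # first differences
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    rho = sxy / sxx
    intercept = mean_y - rho * mean_x
    rss = sum((yi - intercept - rho * xi) ** 2 for xi, yi in zip(x, y))
    se_rho = math.sqrt(rss / (n - 2) / sxx)                # standard error of rho
    return rho / se_rho


random.seed(0)
# Simulated data only: white noise is stationary, a random walk has a unit root.
white_noise = [random.gauss(0.0, 1.0) for _ in range(200)]
random_walk = [0.0]
for _ in range(199):
    random_walk.append(random_walk[-1] + random.gauss(0.0, 1.0))

CRITICAL_5_PERCENT = -2.86  # assumed approximate large-sample critical value

t_noise = dickey_fuller_t(white_noise)
t_walk = dickey_fuller_t(random_walk)
print("white noise t =", round(t_noise, 2), "reject H0:", t_noise < CRITICAL_5_PERCENT)
print("random walk t =", round(t_walk, 2), "reject H0:", t_walk < CRITICAL_5_PERCENT)
```

A t-statistic below the critical value leads to rejection of the unit root hypothesis. The statistic does not follow the usual Student t distribution under H0, which is why a dedicated critical value is used instead of the ordinary t table.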

Forecasting
One of the most important objectives of time series analysis is to predict the future values of a series. Forecasting is about projecting a series into the future from its past values, on the basis of a model that effectively describes the evolution of the series.

Forecasting Based on Conditional Expectation
Let $y_{t+1}$ be forecast from a set of variables $X_t$ observed at date $t$, which here are the past values of the series. Suppose a forecast of $y_{t+1}$ is required based on its $m$ most recent values; $X_t$ would then consist of a constant plus $y_t, y_{t-1}, \ldots$
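A minimal sketch of this setup, assuming a linear forecast rule: the regressor vector $X_t$ is built from a constant and the $m$ most recent values, and the forecast is its inner product with a coefficient vector. The function names, the data and the coefficient values are hypothetical and are not estimates from this study.

```python
def build_regressors(history, m):
    """Return X_t = [1, y_t, y_{t-1}, ...] using the m most recent values."""
    if m < 1 or len(history) < m:
        raise ValueError("need m >= 1 and at least m observations")
    recent = history[-m:][::-1]          # most recent value first
    return [1.0] + [float(v) for v in recent]


def linear_forecast(history, coefficients):
    """Forecast y_{t+1} as the inner product of the coefficients with X_t."""
    m = len(coefficients) - 1            # first coefficient multiplies the constant
    x_t = build_regressors(history, m)
    return sum(c * x for c, x in zip(coefficients, x_t))


# Hypothetical coefficients: constant 2.0, weight 0.5 on y_t, 0.25 on y_{t-1}.
y = [10.0, 12.0, 16.0]
x_t = build_regressors(y, 2)
forecast = linear_forecast(y, [2.0, 0.5, 0.25])
print(x_t, forecast)
```

In practice the coefficients would be estimated from the data, for example by least squares; they are fixed here only to keep the arithmetic easy to follow.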

Stationarity Test
The time plot shown in Figure 1 revealed that the series of reported tuberculosis cases at Minna General Hospital is not stationary: the monthly records of tuberculosis cases do not maintain an identical pattern during corresponding periods of successive years from 2007-2018.
Moreover, the unit root test shown in section 4.2 gave a p-value of 0.1620, which is not significant at the 1%, 5% or 10% significance levels. We therefore fail to reject H0 and conclude that there is a unit root (i.e. the data are not stationary).

Unit Root (ADF) Test
Results of Augmented Dickey-Fuller Test at Level

Stationarity Test at First Difference
Since the series above is not stationary, we compute the first difference of the data and rerun the Augmented Dickey-Fuller test. The new result gives a p-value of 0.000, which is below the 1%, 5% and 10% significance levels, so we reject H0 and conclude that the differenced series is stationary.
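First differencing replaces each observation with its change from the previous period, $X_t - X_{t-1}$. The sketch below shows the operation on made-up numbers (not the hospital records); the function name is an assumption for illustration.

```python
def difference(series, lag=1):
    """Return the lag-differenced series X_t - X_{t-lag}."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]


# Illustrative values only: a series with a steady upward trend.
levels = [20, 23, 26, 29, 32, 35]
first_diff = difference(levels)
print(first_diff)   # the trend is removed and a constant change remains
```

One observation is lost per difference, so a series of 144 monthly values yields 143 differenced values.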

Time Plot of Reported Tuberculosis Patients after First Differencing
The time series plot below shows that, after first differencing, the fluctuations in the tuberculosis patient data reduce with time, and the differenced data stay within a linear range from 2007-2018. This was also confirmed in Figure 4 and Figure 5 given below.
It can be affirmed from the above graph and table that the monthly records of … The ACF and PACF from the correlogram above decline towards zero (0), which shows that there is a trend in the dataset.

Trend Analysis for First Differencing
The charts below further illustrate the autocorrelation and partial autocorrelation functions of the data at first difference, which reveal only one spike at the beginning of the plotted data.
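For readers who want to see what a correlogram plots, the sketch below computes sample autocorrelations with the common estimator $r_k = \sum_t (x_t - \bar{x})(x_{t+k} - \bar{x}) / \sum_t (x_t - \bar{x})^2$, together with the approximate 95% band $\pm 1.96/\sqrt{n}$ often drawn on such charts. The function names and the toy series are assumptions; this is not the software output used in the study.

```python
def sample_acf(series, max_lag):
    """Sample autocorrelations r_0, ..., r_max_lag."""
    n = len(series)
    mean = sum(series) / n
    centered = [v - mean for v in series]
    denom = sum(c * c for c in centered)
    acf = []
    for k in range(max_lag + 1):
        num = sum(centered[t] * centered[t + k] for t in range(n - k))
        acf.append(num / denom)
    return acf


def significance_band(n):
    """Approximate 95% band (+/- this value) for a white-noise series."""
    return 1.96 / n ** 0.5


# Toy series that alternates in sign, so neighbouring values are negatively related.
toy = [1, -1, 1, -1, 1, -1, 1, -1]
r = sample_acf(toy, 2)
band = significance_band(len(toy))
print(r, band)
```

A "spike" is a lag whose autocorrelation falls outside the band. For a series of 144 observations the band would be roughly ±0.16.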

ARIMA Model Selection
The next stage is to select the best ARIMA (p, d, q) model for forecasting. The goodness of fit of the candidate models was compared using the Akaike Information Criterion (AIC) and the mean absolute percentage error (MAPE); the model with the lowest values is selected. With careful consideration of all the criteria, ARIMA (2, 1, 3) was found to be the best. In Figure 6 below, the black line represents the actual values, the red line is the linear trend model and the green line represents the forecasted values.
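The two criteria can be sketched as follows. The AIC is written here in its Gaussian least-squares form, $n \ln(\mathrm{RSS}/n) + 2k$, which omits an additive constant and may therefore differ from the values reported by statistical software. The candidate models, residuals and parameter counts are hypothetical.

```python
import math


def mape(actual, fitted):
    """Mean absolute percentage error in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, fitted)) / len(actual)


def gaussian_aic(residuals, n_params):
    """AIC up to an additive constant: n * ln(RSS / n) + 2 * k."""
    n = len(residuals)
    rss = sum(e * e for e in residuals)
    return n * math.log(rss / n) + 2 * n_params


# Hypothetical residuals from two candidate models fitted to the same data.
candidates = {
    "model_A": {"residuals": [1.0, -1.0, 1.0, -1.0], "n_params": 2},
    "model_B": {"residuals": [2.0, -2.0, 2.0, -2.0], "n_params": 1},
}
scores = {name: gaussian_aic(c["residuals"], c["n_params"]) for name, c in candidates.items()}
best = min(scores, key=scores.get)   # lowest AIC wins
example_mape = mape([100.0, 200.0], [110.0, 180.0])
print(scores, best, example_mape)
```

The AIC penalizes each extra parameter, so a larger model is preferred only if it reduces the residual sum of squares enough. MAPE is undefined when an actual value is zero, which is worth checking for monthly count data.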

Model Evaluation
Model evaluation is an iterative procedure used to assess model adequacy by checking whether the model assumptions are satisfied. One of the basic assumptions is that the error term is uncorrelated, with zero mean and constant variance. Based on the graph shown below, some of the data points appear to lie off the line, so further tests are carried out to ascertain the adequacy of the model.
From the graph below it can be seen that the data are approximately normally distributed, with no discernible pattern, which suggests that the model is a good one. Based on this, there is some degree of assurance that the data are normally distributed.
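The assumption of uncorrelated errors is commonly checked with the Ljung-Box statistic cited in the abstract, $Q = n(n+2)\sum_{k=1}^{h} r_k^2/(n-k)$, where $r_k$ is the lag-$k$ autocorrelation of the residuals. The sketch below computes $Q$ for a toy residual series; the function name, the data and the choice of $h = 1$ are assumptions, and the value 3.841 is the 5% chi-square critical value for one degree of freedom.

```python
def ljung_box_q(residuals, max_lag):
    """Ljung-Box statistic Q = n (n + 2) * sum_{k=1..h} r_k^2 / (n - k)."""
    n = len(residuals)
    mean = sum(residuals) / n
    centered = [e - mean for e in residuals]
    denom = sum(c * c for c in centered)
    q = 0.0
    for k in range(1, max_lag + 1):
        r_k = sum(centered[t] * centered[t + k] for t in range(n - k)) / denom
        q += r_k * r_k / (n - k)
    return n * (n + 2) * q


CHI2_1DF_5PCT = 3.841  # 5% critical value, chi-square with 1 degree of freedom

# Toy residuals that alternate in sign, i.e. clearly autocorrelated.
toy_residuals = [1, -1, 1, -1, 1, -1, 1, -1]
q_stat = ljung_box_q(toy_residuals, 1)
print(q_stat, q_stat > CHI2_1DF_5PCT)
```

A large $Q$ relative to the chi-square critical value indicates that autocorrelation remains in the residuals. When the residuals come from a fitted ARMA model, the degrees of freedom are usually reduced by the number of estimated AR and MA parameters.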

Summary
This work explains the concepts of the autoregressive integrated moving average (ARIMA) model. The aim and all the objectives stated earlier in chapter one were fulfilled. The data were collected from the statistics department of the General Hospital, Minna, Niger State, and are based on the monthly records of tuberculosis patients reported at the hospital.

Conclusions
From the analysis in chapter four, the exploratory data analysis (EDA) established that the time plot of the series fluctuates. This shows that the data are non-stationary, with a skewness greater than zero, a kurtosis of 0.48 (peaked), a mean of 19.03 and a standard deviation of 13.49.
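The descriptive measures named above can be computed from the central moments of the data. The sketch below uses population (divide by $n$) moments and reports excess kurtosis, which is zero for a normal distribution. Which conventions the software used in this study followed is not stated, so the formulas and the toy data are assumptions.

```python
def describe(values):
    """Mean, population standard deviation, skewness and excess kurtosis."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n   # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n   # third central moment
    m4 = sum((v - mean) ** 4 for v in values) / n   # fourth central moment
    return {
        "mean": mean,
        "sd": m2 ** 0.5,
        "skewness": m3 / m2 ** 1.5,
        "excess_kurtosis": m4 / m2 ** 2 - 3.0,
    }


# Toy data: symmetric, so the skewness is zero.
summary = describe([1, 2, 3, 4, 5])
print(summary)
```

Skewness near zero and excess kurtosis near zero are consistent with normality; a positive skewness indicates a longer right tail, as is common for count data.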
The trend ($T_t$) equation for the reported cases of tuberculosis patients in Minna General Hospital, estimated by the least squares approach, was $T_t = 24.4544 - 0.0747548t$. This implies an estimated level of 24.4544 reported tuberculosis patients at the start of the 2007-2018 period ($t = 0$) and a decrease of about 0.07 cases per month, the time unit of the series.
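A least-squares trend line of the form $T_t = a + bt$ can be fitted as sketched below. The time index $t = 1, \ldots, n$, the function name and the toy data are assumptions; the hospital data are not reproduced here.

```python
def fit_linear_trend(values):
    """Least-squares fit of T_t = a + b * t for t = 1, ..., n."""
    n = len(values)
    t = list(range(1, n + 1))
    mean_t = sum(t) / n
    mean_y = sum(values) / n
    sxy = sum((ti - mean_t) * (yi - mean_y) for ti, yi in zip(t, values))
    sxx = sum((ti - mean_t) ** 2 for ti in t)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return intercept, slope


# Toy series that falls by exactly 0.5 per period from a baseline of 25.
toy = [25 - 0.5 * t for t in range(1, 11)]
a, b = fit_linear_trend(toy)
print(a, b)
```

Because the fitted line passes through the point of means, $a + b(n+1)/2$ equals the sample mean. With the reported intercept of 24.4544, a slope of magnitude 0.0747548 and $n = 144$, a negative slope gives $24.4544 - 0.0747548 \times 72.5 \approx 19.03$, which agrees with the reported mean of 19.03 and with the decreasing trend described in the text.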
The ADF test of stationarity gave a p-value of 0.1620, which is greater than the alpha level of 0.05 (0.1620 > 0.05), leading to the conclusion that the data are non-stationary. The data were therefore differenced once and the stationarity test was repeated; the differenced series was confirmed stationary with p-value = 0.000 < 0.05, after which model estimation was conducted. The correlogram revealed that the autocorrelation function (ACF) does not decay exponentially to zero and the partial autocorrelation function (PACF) does not cut off, so an ARIMA model was suspected and its order was chosen with the aid of the AIC. Of the AIC values generated, the minimum value for a stationary and invertible model occurs at order (2, 1, 3), which is the best ARIMA model for the monthly data. It implies that p = 2, d = 1, q = 3, with equation $X_t + 0.6867X_{t-1} - 0.8859X_{t-2} = E_t + 1.3077E_{t-1} - 1.2328E_{t-2} + 0.5788E_{t-3}$. Model evaluation was carried out to check whether the time series assumptions were satisfied or violated. It shows that none of the time series analysis assumptions is violated at the 5% level of significance.
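To show how an equation of this kind produces a forecast, the sketch below rearranges the ARIMA (2, 1, 3) equation reported in the abstract so that the next value is on the left-hand side, and sets the future shock to its expected value of zero. It assumes that $X_t$ denotes the first-differenced series, which the paper does not state explicitly. The input values are hypothetical and the function names are illustrative.

```python
# Coefficients from the equation in the abstract, moved to the right-hand side:
# X_{t+1} = -0.6867 X_t + 0.8859 X_{t-1} + E_{t+1} + 1.3077 E_t - 1.2328 E_{t-1} + 0.5788 E_{t-2}
AR = [-0.6867, 0.8859]            # multiply X_t, X_{t-1}
MA = [1.3077, -1.2328, 0.5788]    # multiply E_t, E_{t-1}, E_{t-2}


def one_step_forecast(x_recent, e_recent):
    """Conditional-expectation forecast of X_{t+1}, with E_{t+1} set to zero.

    x_recent: [X_t, X_{t-1}]           most recent value first
    e_recent: [E_t, E_{t-1}, E_{t-2}]  most recent residual first
    """
    ar_part = sum(c * x for c, x in zip(AR, x_recent))
    ma_part = sum(c * e for c, e in zip(MA, e_recent))
    return ar_part + ma_part


def undifference(last_level, forecast_change):
    """Assumption: X is the differenced series, so the change is added to the last level."""
    return last_level + forecast_change


# Hypothetical inputs, not taken from the hospital records.
change = one_step_forecast([1.0, 1.0], [0.0, 0.0, 0.0])
level = undifference(20.0, change)
print(change, level)
```

Forecasts further ahead repeat the same step, feeding each forecast back in as the newest value and replacing unknown future residuals with zero.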
The forecast evaluation indicated, from the graph, that if proper measures are put in place by health workers and individual members of the country, there will be a further decrease in the number of tuberculosis patients reporting.

Recommendation
Based on the findings and conclusions, the fitted model predicts that there will be a decrease in the number of tuberculosis patients in the next five years (2019-2023). The predicted values are lower than the values for the years under study (2007-2018). It is therefore recommended that:
• The government, through health providers, should increase public awareness of the causes and consequences of tuberculosis.
• People with compromised immune systems should be given special attention because they are most at risk of contracting the disease.
• Directly observed therapy should be encouraged so that health workers monitor the administration of drugs.
• More effort should be put into education and treatment so that the disease can be totally eradicated.