
Diabetes has become a concern in both developed and developing countries, with a growing number of patients reported in ministry of health records. This paper discusses the use of the Autoregressive Fractionally Integrated Moving Average (ARFIMA) technique to model diabetes patients' attendance at Al-Baha hospitals using monthly time series data. The data are monthly counts of diabetes patients covering the period January 2006-December 2016, collected from the General Directorate of Health Affairs, Al-Baha region. The ARFIMA approach was applied to the data through model identification, estimation, diagnostic checking, and forecasting. Hurst test results and the ACF confirmed long memory behavior in the diabetes patients data. The fractional difference of the diabetes series was estimated as $d = 0.44$. Moreover, unit root tests indicated that the fractionally differenced diabetes series is stationary. Furthermore, according to the AIC and BIC model selection criteria, the ARFIMA(1, 0.44, 0) model showed the smallest values; hence this model was chosen as an adequate representation of the data. A diagnostic check also confirmed that ARFIMA was appropriate and highly recommended for modeling and forecasting this type of data.

The increasing number of diabetic patients has become a great challenge for the General Directorate of Health Affairs, Al-Baha, Kingdom of Saudi Arabia; therefore, studying this phenomenon is an important issue. Diabetes is a common disease around the world that can lead to various systemic diseases and high mortality. It is characterized by high sugar levels in the blood and urine and is usually diagnosed by means of a glucose tolerance test (GTT). There are three kinds of diabetes mellitus [

There have been growing efforts by researchers in Saudi Arabia to study and analyze the incidence of diabetes, especially in the Al Baha region. In this study, autoregressive fractionally integrated moving average (ARFIMA) time series models are applied to data on diabetes patients in Al Baha hospitals, with the objective of deciding which of these models provides accurate predictions of diabetes patients in the Kingdom of Saudi Arabia based on criteria such as AIC and BIC. The study also hypothesizes that the attendance of diabetes patients at Al Baha hospitals trends upward over time, which leads to the long memory characteristic in the data. We identify the order of the ARFIMA models, estimate the parameters, and make forecasts based on the models. The paper is organized as follows. In Section 3, we briefly present the theoretical framework of ARFIMA models. Empirical results are discussed in Section 4. Finally, the conclusion is presented in Section 7.

Numerous forecasting models have been proposed for modeling and forecasting the number of patients for many diseases; however, very few papers on diabetes incidence use time series models.

Earnest et al. (2005) used autoregressive integrated moving average (ARIMA) models to predict the number of beds occupied during a severe acute respiratory syndrome (SARS) outbreak in a tertiary hospital. They used hospital admission and isolation-bed occupancy data from Tan Tock Seng Hospital for the period 14 March 2003 to 31 May 2003. They found that the ARIMA(1, 0, 3) model described and predicted the number of beds occupied during the SARS outbreak well. They also provided three-day forecasts of the number of beds required [

Appiah et al. (2015) used time series analysis to forecast malaria cases in Ejisu Juaben Municipality. They found that ARIMA(2, 1, 1), with an autoregressive process of order 2, differencing of order 1, and a moving average of order 1, was the best fit for the secondary data. Using the obtained model, they produced forecasts for the following two years, from 2014 to 2016. Pan et al. [

The classical approach to modeling time series data is to apply the Box-Jenkins methodology, depending on whether the series is stationary or not. If the series shows the long memory property, predictions based on the identified and estimated Box-Jenkins models may not be dependable [

$$\phi_k(B)\,(1 - B)^{d}\, x_t = \theta(B)\, e_t \tag{1.1}$$

where $x_t$ is the time series, $d$ is a nonnegative integer giving the order of differencing needed to achieve stationarity, $B$ is the lag (backward-shift) operator, $\phi$ denotes the autoregressive parameters, and $\theta$ denotes the moving average parameters. A long memory process is a stationary process whose autocorrelation function (ACF) $\rho_k$ decays gradually as the lag $k \to \infty$, so that

$$\sum_{k=0}^{\infty} |\rho_k| = \infty \tag{1.2}$$

Let $\{X_t\}$, $t = 1, \dots, T$, be a stochastic process. The model of an ARFIMA process of order (p, d, q) [

$$\phi(B)\,(1 - B)^{d}\, X_t = \theta(B)\, e_t \tag{1.3}$$

where $\phi(B) = 1 - \phi_1 B - \phi_2 B^2 - \cdots - \phi_p B^p$ and $\theta(B) = 1 - \theta_1 B - \theta_2 B^2 - \cdots - \theta_q B^q$ are polynomials in $B$ with no common factors and with roots outside the unit circle; $\phi_i$ ($i = 1, \dots, p$) and $\theta_i$ ($i = 1, \dots, q$) are the autoregressive and moving average parameters, respectively; $e_t$ is a white noise process with zero mean and constant variance $\sigma_e^2$; and $B$ is the backward-shift operator.

The fractional differencing operator is [

$$(1 - B)^{d} = \nabla^{d} \tag{1.4}$$

$$= \sum_{k=0}^{\infty} \binom{d}{k} (-1)^{k} B^{k} \tag{1.5}$$

$$= \sum_{k=0}^{\infty} \frac{\Gamma(k - d)}{\Gamma(-d)\,\Gamma(k + 1)}\, B^{k} \tag{1.6}$$

with $\Gamma(\cdot)$ denoting the gamma function. The parameter $d$ is allowed to take any real value; in particular, it need not be an integer (fractionally integrated). When $d = 0$, the process $\{X_t\}$, $t = 1, \dots, T$, is stationary and the model reduces to an ARMA(p, q) [
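The expansion in Equations (1.5) and (1.6) can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: the function names `frac_diff_weights`, `gamma_weight`, and `frac_diff` are hypothetical, and the filter is truncated at the length of the available series, which is an assumption made here for illustration.

```python
import math


def frac_diff_weights(d, n_weights):
    """Coefficients pi_k of (1 - B)^d = sum_k pi_k B^k for k = 0..n_weights-1.

    Uses the recursion pi_0 = 1, pi_k = pi_{k-1} * (k - 1 - d) / k, which is
    algebraically the same as Gamma(k - d) / (Gamma(-d) * Gamma(k + 1)).
    """
    weights = [1.0]
    for k in range(1, n_weights):
        weights.append(weights[-1] * (k - 1 - d) / k)
    return weights


def gamma_weight(d, k):
    """The same coefficient written directly in the gamma-function form."""
    return math.gamma(k - d) / (math.gamma(-d) * math.gamma(k + 1))


def frac_diff(series, d):
    """Apply the filter (1 - B)^d, truncated at the start of the series."""
    w = frac_diff_weights(d, len(series))
    return [
        sum(w[k] * series[t - k] for k in range(t + 1))
        for t in range(len(series))
    ]


if __name__ == "__main__":
    # First few weights for d = 0.44 (the value reported in this paper).
    print(frac_diff_weights(0.44, 5))
    # With d = 1 the filter reduces to ordinary first differencing.
    print(frac_diff([1.0, 3.0, 6.0, 10.0], 1.0))
```

The recursion avoids evaluating the gamma function at large arguments, and the weights decay slowly for fractional $d$, which is why a long history is needed before truncation becomes harmless.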

According to [

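$\rho_1 = \dfrac{d}{1 - d}$, $\rho_2 = \dfrac{d(1 + d)}{(1 - d)(2 - d)}$, and $\rho_k = \prod_{i=1}^{k} \dfrac{i - 1 + d}{i - d}$, such that

$$\rho_k = \frac{\Gamma(k + d)\,\Gamma(1 - d)}{\Gamma(d)\,\Gamma(k - d + 1)} \tag{1.7}$$

$$\approx \frac{\Gamma(1 - d)}{\Gamma(d)}\, k^{2d - 1}$$

$$\rho_k \approx m\, k^{2d - 1} \tag{1.8}$$

where $m = \Gamma(1 - d)/\Gamma(d)$ and the approximation holds for large $k$.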
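As a numerical check of Equations (1.7) and (1.8), the sketch below evaluates the autocorrelation of a purely fractionally differenced process in three ways: the product form, the gamma-function form, and the power-law approximation. The function names are hypothetical, and the code assumes $0 < d < 0.5$ so that every gamma argument is positive.

```python
import math


def acf_product(d, k):
    """rho_k as the product of (i - 1 + d) / (i - d) for i = 1..k."""
    rho = 1.0
    for i in range(1, k + 1):
        rho *= (i - 1 + d) / (i - d)
    return rho


def acf_gamma(d, k):
    """rho_k from Eq. (1.7), computed with log-gamma to avoid overflow."""
    log_rho = (
        math.lgamma(k + d)
        + math.lgamma(1 - d)
        - math.lgamma(d)
        - math.lgamma(k - d + 1)
    )
    return math.exp(log_rho)


def acf_asymptotic(d, k):
    """Large-lag approximation m * k^(2d - 1) from Eq. (1.8)."""
    m = math.gamma(1 - d) / math.gamma(d)
    return m * k ** (2 * d - 1)


if __name__ == "__main__":
    d = 0.44
    for k in (1, 2, 10, 100, 1000):
        print(k, acf_product(d, k), acf_gamma(d, k), acf_asymptotic(d, k))
```

For $d = 0.44$ the exponent $2d - 1 = -0.12$ is close to zero, so the autocorrelations decay very slowly and their sum diverges, in line with Equation (1.2).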

The partial autocorrelation function of the fractional ARIMA process can be expressed as follows [

$$\phi_{kk} = \frac{d}{k - d} \tag{1.9}$$
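Equation (1.9) can be checked against the autocorrelations of the purely fractional process by running the Durbin-Levinson recursion on them. This is a sketch under that assumption (no AR or MA terms); the function names are hypothetical.

```python
def arfima0_acf(d, max_lag):
    """Autocorrelations rho_0..rho_max_lag of the process (1 - B)^d X_t = e_t."""
    rho = [1.0]
    for k in range(1, max_lag + 1):
        rho.append(rho[-1] * (k - 1 + d) / (k - d))
    return rho


def pacf_durbin_levinson(rho):
    """Partial autocorrelations phi_kk for k = 1..len(rho)-1 from an ACF."""
    pacf = []
    prev = []  # prev[j] holds phi_{k-1, j+1}
    for k in range(1, len(rho)):
        num = rho[k] - sum(prev[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den
        # Update the order-k coefficients from the order-(k-1) ones.
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)]
        prev.append(phi_kk)
        pacf.append(phi_kk)
    return pacf


if __name__ == "__main__":
    d = 0.44
    pacf = pacf_durbin_levinson(arfima0_acf(d, 5))
    for k, value in enumerate(pacf, start=1):
        print(k, value, d / (k - d))
```

The two printed columns should agree up to floating-point error, and both decay hyperbolically rather than cutting off, which is the PACF signature of long memory.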

The Hurst parameter (H) measures the strength of long-range dependence in a time series. An ARFIMA(p, d, q) process with $0 < d < 0.5$ is stationary with long memory. The process is not stationary if $d \ge 0.5$.

There are several methods for estimating the Hurst parameter of a FARIMA model; the most important one is the R/S method [

The rescaled range (R/S) technique was first presented by Hurst, who defined the range [

$R(t, m)$ as:

$$\max_{1 \le t \le m} Y(t, m) - \min_{1 \le t \le m} Y(t, m) \tag{1.10}$$

where $t$ is the discrete integer-valued time and $m$ is the time span. The standard deviation of the process, $S(t, m)$, is:

$$S(t, m) = \sqrt{\frac{1}{m} \sum_{t=1}^{m} \left( Y(t, m) - \overline{Y(m)} \right)^{2}} \tag{1.11}$$

The R/S ratio allows the ranges of different processes to be compared over long periods. Hurst found that the ratio of the range $R(t, m)$ to the standard deviation $S(t, m)$ follows a power law:

$$E\left[ R(m) / S(m) \right] = c\, m^{H} \quad \text{as } m \to \infty \tag{1.12}$$

where $H$ is the Hurst parameter ($0 < H < 1$) and $c$ is a finite positive constant that does not depend on the period $m$. Taking logarithms of Equation (1.12) gives [

$$\log\left\{ E\left[ R(m) / S(m) \right] \right\} = C + H \log(m) + e(m) \tag{1.13}$$

Equation (1.13) is the basis of the pox diagram of R/S: $H$ is estimated as the slope of the regression of $\log(R/S)$ on $\log(m)$.
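The R/S procedure of Equations (1.10) to (1.13) can be sketched as follows. This is an illustration rather than the routine used in the paper: it assumes that $Y(t, m)$ is the cumulative sum of deviations from the block mean, that blocks are non-overlapping, and that the block sizes are chosen by the user. The names `rescaled_range` and `hurst_rs` are hypothetical, and the demonstration series is simulated white noise, not the diabetes data.

```python
import math
import random


def rescaled_range(block):
    """R/S statistic of one block: range of cumulative deviations over std. dev."""
    m = len(block)
    mean = sum(block) / m
    cumulative, running = [], 0.0
    for x in block:
        running += x - mean
        cumulative.append(running)
    r = max(cumulative) - min(cumulative)
    s = math.sqrt(sum((x - mean) ** 2 for x in block) / m)
    return r / s if s > 0 else float("nan")


def hurst_rs(series, block_sizes):
    """Estimate H as the OLS slope of log(mean R/S) on log(m), as in Eq. (1.13)."""
    xs, ys = [], []
    for m in block_sizes:
        ratios = [
            rescaled_range(series[i:i + m])
            for i in range(0, len(series) - m + 1, m)
        ]
        ratios = [r for r in ratios if not math.isnan(r)]
        if ratios:
            xs.append(math.log(m))
            ys.append(math.log(sum(ratios) / len(ratios)))
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    intercept = y_bar - slope * x_bar
    return slope, intercept


if __name__ == "__main__":
    random.seed(0)
    noise = [random.gauss(0.0, 1.0) for _ in range(4096)]
    h, c = hurst_rs(noise, [16, 32, 64, 128, 256, 512])
    print("Estimated H for simulated white noise:", h)
```

For an uncorrelated series the slope is expected to lie near 0.5 (with some upward small-sample bias), whereas values well above 0.5 point to long memory.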

This section discusses the empirical results of applying ARFIMA models to data on diabetes patients who attended Al-Baha hospitals from January 2006 to November 2016, covering tests for long memory, model identification, estimation, and diagnostic checking, all carried out in the R statistical software.

The sequence chart of diabetes patients who attended Al-Baha hospitals from January 2006 to November 2016 is shown in

The number of diabetes patients fluctuates, showing a slight increase from 2008 up to 4308 patients in February 2014, then a decrease to 245 patients in June 2014, after which it fluctuated steadily until the end of the study interval in 2016. The descriptive statistics of diabetes patients who attended Al-Baha hospitals from January 2006 to November 2016 are reported in

From

It can be seen that the autocorrelation function starts with large positive peaks and decays gradually to zero at increasing lags, while the partial autocorrelation function shows large positive peaks that cut off to zero after lag 5. These results confirm that the series of diabetes patients who attended Al-Baha hospitals is non-stationary and that the time series shows long memory.

Measure | Mean | Median | Max | Min | Std. Dev | Skewness | Kurtosis | Jarque-Bera | Prob |
---|---|---|---|---|---|---|---|---|---|
Value | 878.18 | 757 | 4308 | 26 | 651.789 | 1.258 | 7.13 | 130.14 | 0.000 |

To check whether the diabetes patients series is stationary, both the Augmented Dickey-Fuller (ADF) test and the Phillips-Perron (PP) test were applied to the series of diabetes patients who attended Al-Baha hospitals from January 2006 to November 2016, in levels and in first differences. The empirical results are reported in

A closer look at

The Hurst exponent test results for the diabetes patients data are reported in

Three estimation methods, namely the Sperio estimator, the Geweke and Porter-Hudak estimator, and R/S analysis ($d = H - 0.5$), were applied to the data on diabetes patients who attended Al-Baha hospitals to estimate the fractional difference $d$. The findings are shown in

The estimated values of fractional difference d are reported in

Unit root test results/Diabetes data

Test type | Level: test value | Level: prob | 1^{st} difference: test value | 1^{st} difference: prob |
---|---|---|---|---|
ADF test | −2.15 | 0.51 | −6.65 | 0.01 |
PP test | −8.74 | 0.21 | −14.52 | 0.01 |

Hurst Exponent Test Values

Simple R/S | Corrected R over S | Empirical Hurst Exp | Corrected empirical Hurst Exp | Theoretical Hurst Exp |
---|---|---|---|---|
0.80535 | 1.0635 | 0.74807 | 0.75877 | 0.52635 |

Fractional Differenced d Estimation Results

Method | Estimate of d |
---|---|
Sperio Estimate for “d” (dSperio)* | 0.74 |
Geweke and Porter-Hudak Estimator | 0.78 |
R/S Analysis, d = H – 0.5 | 0.44 |

Unit Root Test Results/fracdiff (Diabetes) Series

Test type | Level: test value | Level: prob |
---|---|---|
ADF test | −3.7785 | 0.022 |
PP test | −130.2 | 0.001 |

Both ADF and PP tests were applied to the fractionally differenced diabetes series ($d = 0.44$). The empirical findings confirmed that the fractionally differenced diabetes series is stationary. After both the correlogram and the Hurst exponent test produced by rescaled range analysis confirmed the presence of long memory in the data of diabetes patients who attended Al-Baha hospitals, and the fractional difference had been estimated, the findings confirmed that the autoregressive fractional time series model is appropriate for modeling and forecasting the diabetes patients data. To build an ARFIMA model, the fractional difference value $d = 0.44$, which satisfies $0 < d < 0.5$, is used for estimation. The fractionally differenced diabetes patients data were generated. Numerous ARFIMA(p, 0.44, q) models with the fractional parameter held fixed are estimated and tested in

A closer look at the

ARFIMA(p, d, q) | AIC | BIC | Fraction Difference d |
---|---|---|---|
ARFIMA(1, d, 0) | 1594.423 | 1605.959 | 0.44 |
ARFIMA(1, d, 1) | 1596.422 | 1610.836 | 0.44 |
ARFIMA(2, d, 1) | 1595.143 | 1612.437 | 0.44 |
ARFIMA(1, d, 2) | 1595.026 | 1612.323 | 0.44 |
ARFIMA(2, d, 0) | 1596.387 | 1610.801 | 0.44 |
ARFIMA(0, d, 2) | 1596.0357 | 1610.771 | 0.44 |
ARFIMA(2, d, 2) | 1596.236 | 1615.416 | 0.44 |
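The selection rule applied to the table above (pick the candidate with the smallest AIC and BIC) can be sketched as follows. The AIC and BIC values are copied from the table; the AIC entry for ARFIMA(0, d, 2) is garbled in the source and is read here as 1596.0357, which is an assumption. The helper functions `aic` and `bic` state the standard definitions only; the paper does not report log-likelihoods, so they are not evaluated on the fitted models.

```python
import math


def aic(loglik, k):
    """Akaike information criterion for log-likelihood loglik and k parameters."""
    return 2 * k - 2 * loglik


def bic(loglik, k, n):
    """Bayesian information criterion for n observations."""
    return k * math.log(n) - 2 * loglik


# (AIC, BIC) pairs as reported in the model comparison table.
candidates = {
    "ARFIMA(1, d, 0)": (1594.423, 1605.959),
    "ARFIMA(1, d, 1)": (1596.422, 1610.836),
    "ARFIMA(2, d, 1)": (1595.143, 1612.437),
    "ARFIMA(1, d, 2)": (1595.026, 1612.323),
    "ARFIMA(2, d, 0)": (1596.387, 1610.801),
    "ARFIMA(0, d, 2)": (1596.0357, 1610.771),  # AIC garbled in source; assumed
    "ARFIMA(2, d, 2)": (1596.236, 1615.416),
}

best_by_aic = min(candidates, key=lambda name: candidates[name][0])
best_by_bic = min(candidates, key=lambda name: candidates[name][1])

if __name__ == "__main__":
    print("Smallest AIC:", best_by_aic)
    print("Smallest BIC:", best_by_bic)
```

Both criteria point to the same candidate here, so the choice does not depend on how strongly extra parameters are penalized.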

ARFIMA(1, 0.44, 0) | Estimates | Std. Errors | Prob. |
---|---|---|---|
$\phi_1$ | −0.023 | 0.088 | 0.01 |

For the ARFIMA(1, 0.44, 0) model fitted to the diabetes patients who attended Al-Baha hospitals, the autoregressive parameter estimate is reported as statistically significant at the 0.05 significance level; therefore, this model appears to fit the data well (

The estimated equation of ARFIMA(1, 0.44, 0) is expressed as follows:

$$(1 + 0.023 B)\,(1 - B)^{0.44}\, \text{Diabetes}_t = \varepsilon_t$$
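To show how such a fitted equation yields forecasts, the sketch below expands the operator into its infinite autoregressive form and computes a truncated one-step-ahead prediction. It reads the fitted model as $(1 - \phi_1 B)(1 - B)^{d} X_t = \varepsilon_t$ with $\phi_1 = -0.023$ and $d = 0.44$, consistent with Equation (1.3); this reading, the mean-centering step, the truncation at the available history, and all function names are assumptions, and the example history consists of made-up numbers rather than the hospital data.

```python
def frac_diff_weights(d, n_weights):
    """Coefficients of (1 - B)^d: w_0 = 1, w_k = w_{k-1} * (k - 1 - d) / k."""
    weights = [1.0]
    for k in range(1, n_weights):
        weights.append(weights[-1] * (k - 1 - d) / k)
    return weights


def ar_inf_weights(phi1, d, n_weights):
    """Coefficients pi_j of (1 - phi1 * B)(1 - B)^d = sum_j pi_j B^j."""
    w = frac_diff_weights(d, n_weights)
    return [w[0]] + [w[j] - phi1 * w[j - 1] for j in range(1, n_weights)]


def one_step_forecast(history, phi1, d):
    """Truncated one-step-ahead forecast for the mean-centered series.

    From sum_j pi_j X_{t+1-j} = e_{t+1}, the predictor is
    X_hat_{t+1} = -sum_{j>=1} pi_j X_{t+1-j}.
    """
    mean = sum(history) / len(history)
    centered = [x - mean for x in history]
    pi = ar_inf_weights(phi1, d, len(centered) + 1)
    prediction = -sum(pi[j] * centered[-j] for j in range(1, len(centered) + 1))
    return mean + prediction


if __name__ == "__main__":
    # Hypothetical monthly counts, for illustration only.
    history = [700.0, 820.0, 760.0, 900.0, 850.0, 880.0]
    print(one_step_forecast(history, phi1=-0.023, d=0.44))
```

Because the weights decay slowly, observations far in the past still contribute to the forecast, which is the practical consequence of long memory.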

An increasing number of diabetes patients is a great challenge to the General Directorate of Health Affairs, Al Baha, Kingdom of Saudi Arabia, and this paper uses ARFIMA models to model diabetes patients who attended Al-Baha hospitals. Monthly records of diabetes patients covering the period from January 2006 to December 2016 were collected from the General Directorate of Health Affairs, Al-Baha region. The empirical results indicate that there is a slight increase in the number of diabetes patients in the region, and the correlogram shows large, positive, significant autocorrelations that decay slowly at increasing lags.

Both ADF and PP tests confirmed that the series in levels is non-stationary, whereas the first difference is stationary. Hurst test results and the ACF confirmed long memory behavior in the diabetes patients data, and the fractional difference of the diabetes series was estimated as $d = 0.44$. Unit root tests also indicated that the fractionally differenced diabetes series is stationary. Numerous models were considered for the diabetes patients data; according to the model selection criteria, the ARFIMA(1, 0.44, 0) model shows the smallest values of AIC and BIC, hence this model is chosen to represent the data. The diagnostic check confirms that ARFIMA(1, 0.44, 0) is an adequate and parsimonious model for diabetes patients who attended Al-Baha hospitals. These findings indicate that, for this particular type of data, ARFIMA is highly recommended for modeling and forecasting diabetes patients data.

The authors are grateful to Professor Muhammad Osman Abdullah, Al-Baha University, for valuable comments.

The authors declare no conflicts of interest regarding the publication of this paper.

Al Zahrani, S., Al Rahman Al Sameeh, F., Musa, A.C.M. and Shokeralla, A.A.A. (2020) Forecasting Diabetes Patients Attendance at Al-Baha Hospitals Using Autoregressive Fractional Integrated Moving Average (ARFIMA) Models. Journal of Data Analysis and Information Processing, 8, 183-194. https://doi.org/10.4236/jdaip.2020.83011