Using Box-Jenkins Models to Forecast Mobile Cellular Subscription

In this paper, the Box-Jenkins modelling procedure is used to identify an ARIMA model, which is then used for forecasting. The mobile cellular subscription data for the study were taken from the administrative data submitted to the Zambia Information and Communications Technology Authority (ZICTA) as quarterly returns by all three mobile network operators: Airtel Zambia, MTN Zambia and Zamtel. The time series of annual figures for mobile cellular subscription for all mobile network operators runs from 2000 to 2014 and has a total of 15 observations. Results show that ARIMA (1, 2, 1) is an adequate model which best fits the mobile cellular subscription time series and is therefore suitable for forecasting subscription. The model predicts a gradual rise in mobile cellular subscription over the next 5 years, culminating in a cumulative increase of about 9.0% by 2019.


Introduction
In Zambia, the penetration of information and communication technology (ICT) in general, and of mobile services in particular, plays an important role in the compilation of the national Gross Domestic Product (GDP). There are three (3) mobile cellular operators in Zambia with networks spanning a land area of almost 602,090 km², representing 80% network coverage. In 2014, Zambia had 67.1% of subscribers from the mobile cellular subsector with a revenue contribution of nearly K3.4 billion. At the end of December 2014, the population of Zambia was estimated at 15.1 million while mobile cellular subscription (MCS) was 10.1 million.
Studies have shown that the diffusion of mobile telecommunication affects the growth of GDP. Other studies have also shown that a long-run causal relationship exists between growth in telecommunications and the growth of the economy, at both sectoral and aggregate levels. Therefore, the importance of investment in the telecommunication subsector is acknowledged the world over. Globally, the socio-economic effect and economic development due to improved telecommunication cannot be repudiated.
Time series modelling is an important tool in many fields. It provides both short- and long-term forecasting techniques. Effective implementation of forecasting techniques maximises the prospect of adopting optimum strategies. The literature shows that researchers have used both stochastic and deterministic models to model and forecast telecommunication data. However, the stochastic models attributed to Box and Jenkins, the Auto Regressive Integrated Moving Average (ARIMA) models, have been found to be more efficient and reliable than deterministic models, even for short-term forecasting. Further, stochastic models are distribution-free, as no assumptions are required about the data or parameters; hence the adoption of this forecasting methodology in this paper.

Method and Materials
The MCS data for the study were taken from the administrative data submitted to the Zambia Information and Communications Technology Authority (ZICTA) as quarterly returns by all three mobile network operators (MNOs). The time series of annual figures for MCS for all MNOs runs from 2000 to 2014 and has a total of 15 observations. Each observation $X_t$ in the time series is the sum of the subscribers of Airtel Zambia, MTN Zambia and Zamtel in year $t$. Statistical Analysis System (SAS) [1] and Microsoft Excel are used to implement the stochastic models and the graphical representations, respectively.

Stochastic Modelling
The Box-Jenkins approach to forecasting was first described by statisticians George Box and Gwilym Jenkins and was developed as a direct result of their experience with forecasting problems in business, economic, and control engineering applications [2]. The Box-Jenkins methodology is a systematic process which is applied iteratively until an adequate model is achieved. The procedure is a step-by-step process of model identification, specification, estimation, diagnostic checking and forecasting. The ARIMA model has three parameters, viz. the autoregressive order (p), the differencing order (d) and the moving average order (q). The generic Box-Jenkins models are denoted by ARIMA (p, d, q) and given by $\phi(B)(1 - B)^d X_t = \theta(B) a_t$, where $\phi$, $\theta$ and $a_t$ are the autoregressive parameter, the moving average parameter and the residual, respectively, and $B$ is the backshift operator with $B X_t = X_{t-1}$. The residuals are assumed to be independent and identically distributed (iid) Normal. Using the backshift operator, the equation above, when d = 0, is expressed as $(1 - \phi_1 B - \dots - \phi_p B^p) X_t = (1 - \theta_1 B - \dots - \theta_q B^q) a_t$, where $a_t$ is a white noise process with mean 0 and variance $\sigma^2$ [3].
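As a worked instance of the generic form, the ARIMA (1, 2, 1) case selected later in the paper can be expanded into an ordinary difference equation. The sketch below assumes the Box-Jenkins sign convention (moving average terms entering with a minus sign) and omits any constant term; both are assumptions for illustration rather than the fitted model.

```latex
% ARIMA(1, 2, 1): p = 1, d = 2, q = 1, no constant (assumed)
(1 - \phi_1 B)(1 - B)^2 X_t = (1 - \theta_1 B)\, a_t

% Expand the left-hand operator:
(1 - \phi_1 B)(1 - 2B + B^2)
  = 1 - (2 + \phi_1) B + (1 + 2\phi_1) B^2 - \phi_1 B^3

% Solve for X_t:
X_t = (2 + \phi_1) X_{t-1} - (1 + 2\phi_1) X_{t-2} + \phi_1 X_{t-3}
      + a_t - \theta_1 a_{t-1}
```

Written this way, each new value depends on the three preceding observations and on the current and previous residuals.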

Measures of Forecast Accuracy
The statistics below are used to compare how well models fit the time series, chiefly the Akaike Information Criterion (AIC) and Schwarz's Bayesian Information Criterion (SBC).

Identification
This is the foremost step of the Box-Jenkins process of time series modelling. A time plot of the MCS is shown in Figure 1(a) and checked for stationarity and invertibility using a visual display of the autocorrelation function (ACF) and partial autocorrelation function (PACF) graphs. Figure 1(a) shows that the MCS time series is not stable and is therefore nonstationary. The nonstationary behaviour is confirmed by the ACF and PACF plots in Figure 2(a) and Figure 2(b). Therefore, some transformation of the series is necessary to make it stationary in both mean and variance. ARIMA models are designed to model stationary time series.
Converting a nonstationary time series to a stationary one through differencing (where needed) is an important part of the process of fitting an ARIMA model. Table 1 shows the details of various ARIMA models along with the forecast accuracy measures. An ARIMA model with the smallest values of the accuracy measures, particularly the AIC and SBC, is considered an efficient model for prediction. Therefore, for the MCS time series, ARIMA (1, 2, 1) is an adequate (best fit) model because it has the lowest values of the AIC and SBC statistics.
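The differencing transformation and the sample ACF used in this visual check can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function names and the input series are hypothetical and are not the ZICTA data.

```python
def difference(series, d=1):
    """Apply d rounds of first differencing: w_t = x_t - x_{t-1}."""
    out = list(series)
    for _ in range(d):
        out = [later - earlier for earlier, later in zip(out, out[1:])]
    return out


def sample_acf(series, max_lag):
    """Return the sample autocorrelations r_1 .. r_max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(e * e for e in dev)  # lag-0 sum of squares
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        for k in range(1, max_lag + 1)
    ]


# Hypothetical upward-trending series (illustrative values only).
x = [1, 3, 4, 8, 11, 17, 22, 30]
first_diff = difference(x, 1)   # still trending upward
second_diff = difference(x, 2)  # fluctuates around a fixed level
acf_second = sample_acf(second_diff, 2)
```

A slowly decaying ACF on the raw series and a quickly decaying ACF on the differenced series is the pattern the plots are inspected for.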
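The accuracy measures used to compare candidate models can be sketched as follows. This is an illustrative implementation under stated assumptions: AIC and SBC are computed from a supplied log-likelihood with k estimated parameters and n residuals, and the Gaussian log-likelihood helper assumes iid Normal residuals; the exact values reported by SAS may be computed slightly differently. All names and numbers are hypothetical.

```python
import math


def forecast_errors(actual, predicted):
    """Forecast error e_t = observed value minus forecast."""
    return [a - p for a, p in zip(actual, predicted)]


def mse(errors):
    """Mean Squared Error."""
    return sum(e * e for e in errors) / len(errors)


def mad(errors):
    """Mean Absolute Deviation."""
    return sum(abs(e) for e in errors) / len(errors)


def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent (actual values nonzero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def gaussian_loglik(errors):
    """Log-likelihood of iid Normal residuals with variance SSE / n (assumed)."""
    n = len(errors)
    sigma2 = sum(e * e for e in errors) / n
    return -0.5 * n * (math.log(2.0 * math.pi * sigma2) + 1.0)


def aic(loglik, k):
    """Akaike Information Criterion: -2 ln L + 2k."""
    return -2.0 * loglik + 2.0 * k


def sbc(loglik, k, n):
    """Schwarz's Bayesian Criterion: -2 ln L + k ln n."""
    return -2.0 * loglik + k * math.log(n)


# Hypothetical observed and fitted values (not the paper's data).
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]
errors = forecast_errors(actual, predicted)
```

Because SBC penalises each parameter by ln n rather than 2, it favours smaller models than AIC once n exceeds about 7, which matters for a series of only 15 observations.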

Parameter Estimation
Table 2 shows the estimated parameters and the associated p-values, tested at the 5% level of significance. Only the autoregressive parameter is significantly different from zero at the 5% level, implying that the constant and the moving average coefficient have little or no effect on the model.
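The model variable and factors are given in Table 2. Hence, the mathematical form of the ARIMA (1, 2, 1) is $(1 - B)^2 X_t = \mu + \frac{1 - \theta_1 B}{1 - \phi_1 B} a_t$, with the estimates of the constant $\mu$, the autoregressive parameter $\phi_1$ and the moving average parameter $\theta_1$ as given in Table 2.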

Diagnostic Check
Verification of the goodness of fit of any model should include a test of whether the residuals form a white noise process. The diagnostic check helps determine whether an estimated model is statistically adequate. If the identified model passes the diagnostic tests, the model is ready to be used for forecasting. If it does not, the diagnostic tests should indicate how the model ought to be modified, and a new cycle of identification, estimation and diagnosis is performed. The p-values of the autocorrelation check for white noise for the ARIMA (1, 2, 1) model in Table 4, assessed at the 5% level of significance, indicate that the model is adequate because the residuals are white noise.
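One common form of the white noise check is the Ljung-Box Q statistic computed on the model residuals. The sketch below is an illustration only: whether it matches the exact statistic behind the paper's autocorrelation check table is an assumption, and the residual values are hypothetical. The statistic is compared with a chi-square critical value whose degrees of freedom are the number of lags minus the number of estimated ARMA parameters.

```python
def ljung_box_q(residuals, max_lag):
    """Ljung-Box statistic Q = n(n+2) * sum_{k=1..m} r_k^2 / (n - k)."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [e - mean for e in residuals]
    c0 = sum(d * d for d in dev)  # lag-0 sum of squares
    q = 0.0
    for k in range(1, max_lag + 1):
        r_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        q += r_k * r_k / (n - k)
    return n * (n + 2) * q


# Hypothetical residuals that alternate in sign, so they are clearly
# not white noise (strong negative lag-1 autocorrelation).
residuals = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
q_stat = ljung_box_q(residuals, max_lag=1)

# 5% chi-square critical value for 1 degree of freedom.
CHI2_CRIT_1DF = 3.841
reject_white_noise = q_stat > CHI2_CRIT_1DF
```

A Q value below the critical value (equivalently, a p-value above 0.05) means the hypothesis of white noise residuals is not rejected, which is the outcome the diagnostic check looks for.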

Forecasting
The Box-Jenkins approach to forecasting stationary time series is relatively simple. The forecast of $X_{t+k}$ given all observations up to time $t$, the k-step-ahead forecast, is denoted by $\hat{x}_t(k)$. Table 3 shows five-year forecasts for mobile cellular subscription using ARIMA (1, 2, 1). The trajectory of the forecasts from 2015 to 2019 is shown in Figure 3.
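The k-step-ahead recursion for an ARIMA (1, 2, 1) model can be sketched as below. The sketch assumes a model without a constant, written as X_t = (2 + phi) X_{t-1} - (1 + 2 phi) X_{t-2} + phi X_{t-3} + a_t - theta a_{t-1}, with future residuals replaced by their expected value of zero. The function name, parameter values and history are hypothetical and are not the estimates or data of this study.

```python
def forecast_arima_121(history, phi, theta, last_residual, steps):
    """Recursive k-step-ahead forecasts for ARIMA(1, 2, 1), no constant.

    history       -- observed series, at least 3 values
    phi, theta    -- AR(1) and MA(1) parameters (assumed already estimated)
    last_residual -- one-step residual at the final observation
    steps         -- number of periods to forecast
    """
    if len(history) < 3:
        raise ValueError("need at least three observations")
    x = list(history)
    forecasts = []
    for k in range(1, steps + 1):
        # The MA term only affects the first step; later residuals are
        # unknown and are set to their mean of zero.
        ma_term = -theta * last_residual if k == 1 else 0.0
        nxt = (2 + phi) * x[-1] - (1 + 2 * phi) * x[-2] + phi * x[-3] + ma_term
        forecasts.append(nxt)
        x.append(nxt)  # forecasts feed back in as if they were observations
    return forecasts


# Hypothetical inputs for illustration.
linear = forecast_arima_121([1.0, 2.0, 3.0], phi=0.0, theta=0.0,
                            last_residual=0.0, steps=3)
damped = forecast_arima_121([1.0, 2.0, 4.0], phi=0.5, theta=0.4,
                            last_residual=1.0, steps=2)
```

With phi = 0 and theta = 0 the recursion reduces to straight-line extrapolation of the last two points, which is a quick sanity check on the coefficients.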

Discussion
ARIMA (1, 2, 1) is an adequate model which best fits the mobile cellular subscription time series and is therefore suitable for forecasting subscription. The potential implication of this study is that forecasting models for predicting mobile cellular subscription, developed in advance and updated on a regular basis, can support internal decisions and planning as well as market communication. The subscription forecast baseline in this study uses historical data from Airtel Zambia, MTN Zambia and Zamtel. The study also provides a model with which to foresee and allocate the resources needed to maintain a steady increase in mobile cellular subscription.

Conclusion
In this paper, the Box-Jenkins modelling procedure is used to identify an ARIMA model, which is then used for forecasting. The mobile cellular subscription data for the study were taken from the administrative data submitted to the Zambia Information and Communications Technology Authority (ZICTA) as quarterly returns by all three mobile network operators: Airtel Zambia, MTN Zambia and Zamtel. The time series of annual figures for mobile cellular subscription for all mobile network operators runs from 2000 to 2014 and has a total of 15 observations. Results show that ARIMA (1, 2, 1) is an adequate model which best fits the mobile cellular subscription time series and is therefore suitable for forecasting subscription. The model predicts a gradual rise in mobile cellular subscription over the next 5 years, culminating in a cumulative increase of about 9.0% by 2019.
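The aggregate series is $X_t = \alpha_t + \beta_t + \gamma_t$, where $\alpha_t$ = number of subscribers for Airtel Zambia, $\beta_t$ = number of subscribers for MTN Zambia, and $\gamma_t$ = number of subscribers for Zamtel.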

Figure 1(b) and Figure 1(c) show the first-order and second-order differenced MCS series, respectively. The ACF and PACF plots are shown in Figure 2(c), Figure 2(d), Figure 2(e) and Figure 2(f). The ACF and PACF plots indicate that the first and second differenced MCS series are stationary and hence require further examination to establish the most suitable transformation for the MCS series.
The Akaike Information Criterion (AIC) and Schwarz's Bayesian Information Criterion (SBC) are some of the measures of forecast accuracy that are widely used in SAS. Other measures used include Mean Squared Error (MSE), Mean Absolute Percentage Error (MAPE) and Mean Absolute Deviation (MAD). Forecast error is given by $e_t = X_t - \hat{X}_t$, the difference between the observed value and its forecast.
Footnote 1: iid = independent and identically distributed.

Table 1. Measures of accuracy for selected ARIMA models.

Table 2. Estimated parameter and significance tests.

Table 3. Autocorrelation check for white noise.