Modeling Consumer Price Index in Zambia: A Comparative Study between Multicointegration and ARIMA Approach

Consumer Price Index (CPI) is an important indicator used to determine inflation. The main objective of this research was to compare the forecasting ability of two time-series models using the Zambia monthly Consumer Price Index. We used monthly CPI data collected from January 2003 to December 2017. The models compared are the Autoregressive Integrated Moving Average (ARIMA) model and the Multicointegration (ECM) model. Results show that the ECM was the best-fitting model of CPI in Zambia since it showed the smallest error measures. Lastly, a forecast was produced using the ECM, and results show an average growth rate of 6.63% for food CPI and 7.41% for nonfood CPI. Forecasting CPI is important for any economy because it is essential in economic planning for the future. Hence, identifying a more accurate forecasting model is a major contribution to the development of Zambia.


Introduction
Rising prices affect everyone in terms of purchasing power, especially if wages remain constant. This lowers living standards. Generally, it is difficult to detect changes in price levels across products in the absence of a systematic approach. The consumer price index measures the weighted average of prices of a basket of goods and services, which include fuel, transport, food and medical care purchased by households. CPI identifies price changes across product categories relevant to the consumer. According to [1], CPI is a weighted aggregate index that is computed and published monthly. The CPI may not adequately explain actual movements in the cost of living according to [2]. This may be a result of biases, which may include inaccurate data. Thus, the Engel curve method introduced by [3] addresses the above bias.
In Zambia, the consumer price index is recorded monthly by the Central Statistics Office (CSO). To compute the monthly CPI, products that are essential to human needs, such as fuel, food and medical services, are grouped into two major categories: food, meaning edible products needed to sustain humans, and nonfood products such as fuel and education. The two groups are then used to calculate the monthly CPI as an average. Forecasts of CPI are important because they affect many economic decisions.
Without knowing future CPI rates, future inflation rates cannot be estimated, which would make it difficult for lenders to price loans, which in turn has a negative impact on the economy. Investors require good inflation forecasts, since the returns to stocks and bonds depend totally on what happens to inflation. Businesses need inflation forecasts to price their goods and services as well as to plan production. Modelling inflation is important from the point of view of poverty alleviation and social justice [4].

Literature Review
The study by [5] stated that the CPI is one of the main indicators of economic performance and also the key indicator of the results of the monetary policy of a country, because of its wide use as a measure of inflation. The ARIMA (4, 1, 6) was selected as a potential model which fits the data and gives accurate forecasts. The forecast was made for 12 months ahead of the year 2016, and the findings showed that the CPI was likely to continue rising over time.
Research by [6] further described CPI as a measure of changes in the general level of prices of a group of commodities. The best model was found to be the ARIMA (1, 1, 0) compared to the ARIMA (0, 1, 1) and ARIMA (1, 1, 1).
The study by [7] examined the relationship between CPI and oil prices in Turkey using the Error Correction Model (ECM). Their study revealed that a 1% increase in fuel prices caused the CPI to rise by 1.26% with an approximate one-year lag.
According to [8], cointegration is present in the long-run equilibrium relationship between different time series; it is a key idea in current econometrics and an important theoretical cornerstone of current research on combination forecasting with time series.
The paper by [9] modelled inflation using a structural cointegration approach. This paper used cointegration and error-correction models to analyze the relative impact of the monetary, labor and external sectors on Polish inflation from 1990 to 1999. Results showed that the labor and external sectors dominated the determination of Polish inflation during the above period, but their effects have been opposite since 1994. The monetary sector appears not to have exerted influence on inflation, suggesting monetary policy has been passive.

Methodology
To carry out this study, monthly food and nonfood CPI data collected from January 2003 to December 2017 were used. We used the monthly CPI (which is the average of the food and nonfood CPI) for the ARIMA model, and the food and nonfood CPI for Multicointegration to develop the error correction model. The statistical software package R (version 0.99.903) was used to obtain the results.
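As a small illustration of the averaging described above, the following Python sketch computes a monthly index as the simple mean of paired food and nonfood values. The function name `monthly_cpi` and the sample figures are hypothetical; the text only states that the monthly CPI is an average of the two groups, so equal weights are assumed here.

```python
def monthly_cpi(food, nonfood):
    """Return the simple average of paired food and nonfood index values."""
    if len(food) != len(nonfood):
        raise ValueError("series must have the same length")
    return [(x + y) / 2 for x, y in zip(food, nonfood)]


# Hypothetical index values, not taken from the CSO data.
food = [100.0, 102.0, 105.0]
nonfood = [100.0, 101.0, 103.0]
cpi = monthly_cpi(food, nonfood)
print(cpi)
```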

1) Variable Definition
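We let the monthly CPI be denoted by $U_t$, food by $X_t$ and non-food by $Y_t$.
2) Relationship among the Variables

The series shows an MA (q) process if the PACF decays exponentially (either directly or in an oscillatory manner) and the ACF cuts off after lag q. The series shows an ARMA (p, q) process if both the PACF and the ACF decay exponentially (either directly or in an oscillatory manner).
The AR, MA and ARMA models are defined as follows for a series $U_t$.
AR model:
$$U_t = \phi_1 U_{t-1} + \phi_2 U_{t-2} + \cdots + \phi_p U_{t-p} + \varepsilon_t$$
MA model:
$$U_t = \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \cdots + \theta_q \varepsilon_{t-q}$$
The combination of AR and MA gives the ARMA model:
$$U_t = \phi_1 U_{t-1} + \cdots + \phi_p U_{t-p} + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \cdots + \theta_q \varepsilon_{t-q}$$
where $\phi_i$ are the autoregressive parameters, $\varepsilon_t$ is the error term at time t and $\theta_j$ are the moving-average parameters.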
In order to build our tentative model, we will follow the three highlighted steps which are: Model Identification, Parameter Estimation and Diagnostic Checking.
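The identification step relies on differencing the series and inspecting its sample ACF and PACF. The sketch below shows one way to compute these quantities in plain Python; it is an illustration only (the study itself used R), the function names are hypothetical, and the short input series is invented for demonstration. The PACF is obtained with the Durbin-Levinson recursion.

```python
def difference(series, d=1):
    """Apply first differencing d times."""
    out = list(series)
    for _ in range(d):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def acf(series, max_lag):
    """Sample autocorrelations r_0 .. r_max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    c0 = sum(e * e for e in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(max_lag + 1)]


def pacf(series, max_lag):
    """Sample partial autocorrelations via the Durbin-Levinson recursion."""
    r = acf(series, max_lag)
    out, prev = [1.0], []
    for k in range(1, max_lag + 1):
        num = r[k] - sum(prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = num / den
        # Update the order-k coefficients from the order-(k-1) ones.
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        out.append(phi_kk)
    return out


# Invented index values with an upward trend, for illustration only.
index = [100.0, 101.2, 103.1, 104.0, 106.5, 107.1, 109.8, 111.0, 113.4, 114.2]
d1 = difference(index)  # d = 1
print([round(v, 3) for v in acf(d1, 3)])
print([round(v, 3) for v in pacf(d1, 3)])
```

Under the usual reading, an ACF that cuts off after lag q with a decaying PACF points to an MA (q) term, and the reverse pattern points to an AR (p) term; in practice several candidate orders are then compared using error measures.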

4) Multicointegration Model
According to [10], the tentative model is built in two steps.
Step 1, Unit root test: To test for a unit root in each variable ($X_t$ and $Y_t$), we used the Augmented Dickey-Fuller (ADF) test based on the hypotheses
H0: the series has a unit root
H1: the series has no unit root.
Step 2, Two-step method: This is based on the idea that cointegration between $X_t$ and $Y_t$ is tested using standard cointegration techniques before testing for multicointegration. We test for a cointegrating relationship between $X_t$ and $Y_t$ using the proposed cointegrating regression
$$Y_t = \alpha_0 + \alpha_1 X_t + z_t$$
where $X_t$ is food at time t, $Y_t$ is nonfood at time t, $\alpha_0$ and $\alpha_1$ are parameters and $z_t$ is the residual. If $z_t$ is stationary, then a cointegration relationship exists between $X_t$ and $Y_t$.

5) Error Correction Models (ECM)
Following the two-step method above, we estimate the error correction model for $X_t$ and $Y_t$. The ECM model is given by ( ) where

Results
Table 1 shows the summary statistics of the variables Food, Nonfood and monthly CPI. For food CPI, the minimum was 48.4 and the maximum was 197.8.
Of the data, 25% was less than or equal to 74.53, 50% was less than or equal to 106.2 and 75% was less than or equal to 134.32. On average, the food CPI was 109.64 with a standard deviation of 41.93263. Table 2 shows the ADF test for monthly CPI and differenced monthly CPI, which shows that the monthly CPI data is stationary at difference order 1 (d = 1).
Figure 2 shows the time plot of the differenced data of order 1.
Table 4 shows the measures of accuracy for selected ARIMA models. The ARIMA model with the smallest errors is the best model. The ARIMA (3, 1, 3) has been identified as the model with the smallest AIC, RMSE, MAE and MASE, as can be seen in Table 4. Next, we proceed to estimate the parameters.
Table 5 shows the estimated parameters for the ARIMA (3, 1, 3) model. Table 6 shows the Box-Ljung test results for the residuals. Since the test fails to reject the null hypothesis at the 5% level of significance, we conclude that the model is a good fit since the residuals are independent and uncorrelated.
Figure 4 shows the ACF plot of the residuals. There is no significant spike, so there is no residual correlation left in our data.
Figure 5 shows that the residuals are approximately normally distributed, and there is no correlation in the residuals, implying that ARIMA (3, 1, 3) was successfully selected as the tentative model to be used for forecasting.
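The Box-Ljung check on the residuals can be illustrated with a short Python sketch. It computes the statistic Q = n(n+2) Σ r_k² / (n − k) over the first few residual autocorrelations and compares it with a chi-square critical value. The function name, the residual values and the choice of three lags are assumptions made for illustration; they are not the values behind Table 6.

```python
def ljung_box_q(residuals, lags):
    """Ljung-Box statistic Q = n(n+2) * sum_{k=1..lags} r_k^2 / (n - k)."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [e - mean for e in residuals]
    c0 = sum(d * d for d in dev)
    total = 0.0
    for k in range(1, lags + 1):
        r_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        total += r_k ** 2 / (n - k)
    return n * (n + 2) * total


# Upper 5% points of the chi-square distribution for a few degrees of freedom.
CHI2_95 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}

# Hypothetical residuals, for illustration only.
residuals = [0.3, -0.5, 0.1, 0.4, -0.2, -0.1, 0.6, -0.4, 0.2, -0.3]
q = ljung_box_q(residuals, lags=3)
verdict = "reject independence" if q > CHI2_95[3] else "fail to reject independence"
print(round(q, 3), verdict)
```

When the residuals come from a fitted ARMA (p, q) model, the degrees of freedom are usually reduced by p + q, so more lags than fitted parameters are needed for the comparison to be meaningful.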

1) Multicointegration
Table 7 shows the Augmented Dickey-Fuller test results for the food and nonfood variables before and after differencing respectively. Results show that food and nonfood CPI are stationary after differencing.
Figure 6 shows time plots for Food CPI and Non Food CPI after differencing respectively and both time plots exhibit an upward trend.
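The logic of the unit root test before and after differencing can be sketched with a simplified Dickey-Fuller regression. This is a minimal illustration and not the test run in the study: it omits the lagged difference terms that make the test "augmented", the function name is hypothetical, and the series is a simulated random walk rather than the food or nonfood CPI. The statistic is the t-ratio on ρ in Δy_t = a + ρ y_{t-1} + e_t, compared with the asymptotic 5% Dickey-Fuller critical value of about −2.86 for a regression with a constant.

```python
import math
import random


def dickey_fuller_t(series):
    """t-statistic on rho in dy_t = a + rho * y_{t-1} + e_t (no lagged differences)."""
    y_lag = series[:-1]
    dy = [b - a for a, b in zip(series, series[1:])]
    n = len(dy)
    mean_x = sum(y_lag) / n
    mean_y = sum(dy) / n
    sxx = sum((x - mean_x) ** 2 for x in y_lag)
    sxy = sum((x - mean_x) * (d - mean_y) for x, d in zip(y_lag, dy))
    rho = sxy / sxx
    intercept = mean_y - rho * mean_x
    sse = sum((d - intercept - rho * x) ** 2 for x, d in zip(y_lag, dy))
    std_err = math.sqrt(sse / (n - 2) / sxx)
    return rho / std_err


random.seed(0)
# Simulated random walk (unit root by construction), standing in for a series in levels.
level = [0.0]
for _ in range(180):
    level.append(level[-1] + random.gauss(0.0, 1.0))
first_diff = [b - a for a, b in zip(level, level[1:])]

CRITICAL_5PCT = -2.86  # asymptotic Dickey-Fuller value, constant and no trend
t_level = dickey_fuller_t(level)
t_diff = dickey_fuller_t(first_diff)
print(round(t_level, 2), round(t_diff, 2))
```

A statistic below the critical value rejects the unit root hypothesis. For a series that is integrated of order one, this is expected for the differenced series and typically not for the levels.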

2) Johansen Cointegration Test
The results from the ADF test showed that both variables (food and non-food) become stationary at first difference. We then used the Johansen cointegration test, which yielded a test statistic of 62.539, compared with the critical value of 8.18 at the 5% significance level. Since the test statistic exceeds the critical value, there is sufficient evidence to conclude that the two variables are cointegrated.

3) Estimation of the Error Correction Model
Having identified that both the food and nonfood variables were stationary at first difference, the Error Correction Model was developed as shown below.
Table 8 shows the estimated parameters for food and non-food.
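The two-step procedure described in the methodology can be sketched as follows. This is a minimal illustration and not the authors' code or specification: it assumes a long-run regression of the form Y_t = α0 + α1 X_t + z_t and a textbook short-run equation ΔY_t = β0 + β1 ΔX_t + γ z_{t-1} + e_t, which may differ from the model behind Table 8. The helper names `ols` and `two_step_ecm` are hypothetical, and the two series are simulated so that they are cointegrated by construction.

```python
import random


def ols(columns, y):
    """Least-squares coefficients of y on the regressor columns (normal equations)."""
    k = len(columns)
    # Augmented matrix [X'X | X'y].
    a = [[sum(p * q for p, q in zip(columns[i], columns[j])) for j in range(k)]
         + [sum(p * q for p, q in zip(columns[i], y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda row: abs(a[row][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(k):
            if row != col:
                factor = a[row][col] / a[col][col]
                a[row] = [v - factor * w for v, w in zip(a[row], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


def two_step_ecm(x, y):
    """Return (alpha0, alpha1) from step 1 and (beta0, beta1, gamma) from step 2."""
    ones = [1.0] * len(x)
    alpha0, alpha1 = ols([ones, x], y)                       # long-run regression
    z = [yi - alpha0 - alpha1 * xi for xi, yi in zip(x, y)]  # equilibrium error
    dx = [b - a for a, b in zip(x, x[1:])]
    dy = [b - a for a, b in zip(y, y[1:])]
    beta0, beta1, gamma = ols([ones[1:], dx, z[:-1]], dy)    # short-run ECM
    return (alpha0, alpha1), (beta0, beta1, gamma)


random.seed(1)
# Simulated series: x is a random walk with drift, y adjusts toward 1 + 0.8 * x.
x, y = [50.0], [41.0]
for _ in range(180):
    step = random.gauss(0.5, 1.0)
    gap = y[-1] - 1.0 - 0.8 * x[-1]  # last period's deviation from the long-run relation
    y.append(y[-1] + 0.5 * step - 0.3 * gap + random.gauss(0.0, 0.1))
    x.append(x[-1] + step)

long_run, short_run = two_step_ecm(x, y)
print("long-run (alpha0, alpha1):", [round(v, 3) for v in long_run])
print("short-run (beta0, beta1, gamma):", [round(v, 3) for v in short_run])
```

A negative estimate of γ indicates that deviations from the long-run relationship are corrected over time, which is the behaviour an error correction model is meant to capture.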

4) Diagnostic Checking
We carried out an empirical fluctuation process and found that our observations were dynamic, which implied that lagged observations were included in our model in order to increase its accuracy. Further, Engle's ARCH test for residual heteroscedasticity was carried out, and we observed from the results that our model was significant for this research.
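For readers unfamiliar with Engle's ARCH test, the sketch below shows its simplest form with a single lag: the squared residuals are regressed on their own first lag, and the statistic n·R² is compared with a chi-square critical value with one degree of freedom. The number of lags used in the study is not stated, so one lag is an assumption; the function name and residual values are hypothetical.

```python
def arch_lm_one_lag(residuals):
    """Engle's LM statistic with one lag: n * R^2 from regressing e_t^2 on e_{t-1}^2."""
    sq = [e * e for e in residuals]
    x, y = sq[:-1], sq[1:]
    n = len(y)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((v - mean_x) ** 2 for v in x)
    syy = sum((v - mean_y) ** 2 for v in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # In a simple regression with an intercept, R^2 equals the squared correlation.
    r_squared = sxy * sxy / (sxx * syy)
    return n * r_squared


CHI2_95_DF1 = 3.841  # upper 5% point of the chi-square distribution, 1 degree of freedom

# Hypothetical residuals, for illustration only.
residuals = [1.0, -1.0, 2.0, -2.0, 1.0, -1.0]
lm = arch_lm_one_lag(residuals)
verdict = "ARCH effects detected" if lm > CHI2_95_DF1 else "no evidence of ARCH effects"
print(round(lm, 4), verdict)
```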
Results in Figure 7 show that the residuals are approximately normally distributed, and there is no correlation in the residuals, implying that the Error Correction Model was successfully selected as the tentative model to be used for forecasting.
Figure 7. The ACF, histogram and q-q plot of residuals for the Error Correction Model.

5) Model Comparison
Finally, we compare the prediction accuracy of the ARIMA and ECM models; the model with the smallest errors is selected as the better forecasting model.
Squared Error (MASE). A diagnostic check was carried out using the q-q plot, ACF plot and histogram of residuals. Results showed that the model was significant.
Multicointegration was also used as an appropriate approach to establish whether the two variables, food and nonfood, are cointegrated and whether they can be used to model CPI. We established that both variables were stationary at first difference, which enabled us to carry out a cointegration test as a special case.
Results from the Johansen cointegration test showed that the variables were cointegrated and that it was appropriate to estimate an ECM. An ECM was estimated successfully. To check whether the model was significant, we further carried out ARCH and stability tests, and the results showed that the model was significant.
The ECM was selected as the better model to forecast CPI as it showed the smallest errors. The identified model was later used to forecast the CPI of Zambia using the relationship between the food CPI and the non-food CPI. The forecast showed an average growth rate of 6.63% for food CPI and 7.41% for nonfood CPI.

Conclusion
The main objective of this research was to compare the forecasting ability of two time-series models using the Zambia monthly Consumer Price Index. Multicointegration was identified as the more accurate model for forecasting compared to the ARIMA (3, 1, 3). The ECM forecast showed an average growth rate of 6.63% for food CPI and 7.41% for nonfood CPI. The consumer price index plays a very important role as an economic indicator because it is key in the measurement of the inflation rate. The ability to forecast CPI is important for any economy because forecasting is essential in economic planning for the future. Forecasts need to be accurate to avoid future dilemmas such as underestimating or overestimating economic flow variables; hence, identifying a more accurate model to produce forecasts is a major contribution to the development of Zambia.

Figure 1 shows time plots for the variables considered in this study from January 2003 to December 2017. The figure clearly shows an upward trend in the monthly CPI, Food and Non Food.

Figure 3 shows the ACF (left) and PACF (right) respectively for d = 1. Although there are several ways to determine the best forecasting model, error measures were used in this study to select the best-fitting model. The best-fitting model is the one with minimal errors. The error indicators for our study are MPE, MAE, MASE, RMSE and MAPE, defined in Table 3.
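The error indicators named above can be computed as in the following Python sketch. The formulas are the commonly used textbook definitions and are assumed here, since the contents of Table 3 are not reproduced in this text; in particular, MASE is scaled by the mean absolute error of a one-step naive forecast on the training data. The function name and the numbers are hypothetical.

```python
import math


def error_measures(actual, forecast, training):
    """Return MPE, MAE, MASE, RMSE and MAPE for a set of forecasts."""
    errors = [a - f for a, f in zip(actual, forecast)]
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mpe = 100.0 * sum(e / a for e, a in zip(errors, actual)) / n
    mape = 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    # Scale for MASE: in-sample mean absolute error of the naive (previous value) forecast.
    naive_mae = sum(abs(b - a) for a, b in zip(training, training[1:])) / (len(training) - 1)
    return {"MPE": mpe, "MAE": mae, "MASE": mae / naive_mae, "RMSE": rmse, "MAPE": mape}


# Hypothetical values, for illustration only.
measures = error_measures(
    actual=[100.0, 110.0],
    forecast=[98.0, 113.0],
    training=[90.0, 94.0, 100.0],
)
for name, value in measures.items():
    print(name, round(value, 4))
```

When two models are compared, the same hold-out observations and the same scaling series should be used for both, so that differences in the measures reflect the models rather than the evaluation setup.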

Figure 1. Time plots for monthly CPI, food and nonfood.

Figure 5. Histogram and q-q plot of residuals.

Figure 6. Time plot for Food CPI and Non Food CPI after first differencing.

Table 3. The error measures for ARIMA model selection.

Table 4. Error measures of tentative ARIMA models.

Table 8. Estimated parameters for food and non-food.

Table 9 shows the comparison of the two models. The ECM shows the smallest errors as compared to the ARIMA (3, 1, 3) model. Thus, the ECM is the better forecasting model.

Table 10 shows the forecast for food from the ECM for January 2018 to December 2019. The average growth rate for food CPI is 6.63%. Table 11 shows the forecast for nonfood from the ECM for January 2018 to December 2019. The average growth rate for nonfood CPI is 7.41%.

Table 9. Comparison between ARIMA and ECM.

Table 10. Forecast for food from the ECM model.

Table 11. Forecast for nonfood from the ECM model.