Univariate Time-Series Analysis of Second-Hand Car Importation in Zambia

Zambia largely depends on the international second-hand car (SHC) market for its motor vehicle supply. The importation of second-hand cars into Zambia presents a time series problem. The data used in this paper are monthly data on SHC importation from 1 January 2014 to 31 December 2016. The data were analyzed using Exponential Smoothing (ES) models, namely Simple (SES), Double (DES) and Triple (TES) Exponential Smoothing, and Autoregressive Integrated Moving Average (ARIMA) models. The results showed that ARIMA (2, 1, 2) was the best fit for the SHC importation since its errors were smaller than those of the SES, DES and TES. The four error measures used were root-mean-square error (RMSE), mean absolute error (MAE), mean percentage error (MPE) and mean absolute percentage error (MAPE). Forecasts were also produced using the ARIMA (2, 1, 2) model for the next 18 months from January 2017. Although there is a percentage increase of 90.6% from November 2015 to December 2016, SHC importation is generally on the decrease in Zambia, with a percentage change of 59.5% from January 2014 to December 2016. The forecasts also show a gradual percentage decrease of 1.12% by June 2018. These results are useful to policy and decision makers in Government departments such as the Zambia Revenue Authority (ZRA) and the Road Development Agency (RDA) as they plan and execute their duties.

[...] the number of registered vehicles in developing countries rose from 110 million to 210 million, and by some estimates it is forecast to reach 1.2 billion by 2030 [1]. Rising incomes explain a large share of this growth; as people get richer, they can afford the personal mobility that an automobile confers. High-income countries export large numbers of second-hand vehicles to low-income countries, and this trade will probably grow [2]. Zambia, being one of the developing countries, has experienced strong economic growth over the last decade and the country's growth outlook is also positive [3]. The sustained positive growth of the Zambian economy has resulted in many shifts in the consumption patterns of Zambian households. The economic reality of Zambia is that the majority of the population is middle class and hence earns a middle income. This economic reality has forced Zambians to depend on the second-hand market for their motor vehicle supply. This is supported by [3], who noted that consumers with less purchasing power are more likely to be able to afford second-hand motor vehicles. In addition, car purchasing in most developing countries has been high owing to a rapid increase in ICT usage (Internet and mobile penetration), rising GDP and an emerging middle-class society [4].
The importation of second-hand cars into Zambia presents a time series problem. There are several techniques for analysing time series, but in this study we concentrate only on the Exponential Smoothing (ES) and Autoregressive Integrated Moving Average (ARIMA) models. In [5], various smoothing techniques are described, including Simple Exponential Smoothing (SES), for data with no trend or seasonality, Double Exponential Smoothing (DES), for data that show a trend, and Triple Exponential Smoothing (TES), in which the data show both trend and seasonality. The best fit of the three models, chosen according to whether the data exhibit a level and/or trend and/or seasonality, will be compared with the ARIMA model. The ARIMA model is another method that is used to model and forecast time series data. The ARIMA models are also known as the "Box-Jenkins" approach, following the work of Box and Jenkins [6]. This paper therefore focuses on univariate time-series models, which are major tools for decision making.

Methodology
Below is the flowchart of the methodology.
Two main classes of models are considered in this paper: the Exponential Smoothing (ES) and the Autoregressive Integrated Moving Average (ARIMA) models. The first class comprises the SES, DES and TES models. The three models will be analysed and the best-fit model will be chosen depending on whether the data exhibit a level and/or trend and/or seasonality. The second class comprises the ARIMA models, with the following model-building process: tentative identification of a model, estimation of the parameters in the identified model, and diagnostic checks. The best fits from the two classes will finally be compared to choose the model for forecasting (Figure 1).

Simple Exponential Smoothing (SES)
The SES is applied when the data pattern is nearly horizontal, that is, when the previous data show no particular trend or seasonal variation. For the series $y_1, y_2, \ldots, y_n$, the SES assigns weights $\alpha$ and $1-\alpha$ to the recent observation $y_t$ and forecast $F_t$ respectively, where $\alpha$ is the smoothing constant called alpha, $y_t$ is the actual value for period $t$ and $F_t$ is the forecast value for period $t$. The model is of the form
$$F_{t+1} = \alpha y_t + (1-\alpha) F_t, \quad 0 \le \alpha \le 1 \text{ and } t > 0. \tag{1}$$
The value of α is chosen subjectively: a value close to zero is used for smoothing out unwanted cyclical and irregular components, and a value close to one is used for forecasting.
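The SES recursion described above can be sketched in Python as follows. This is a minimal illustration, not the R routine used in the paper: the function name and the choice to initialise the first forecast with the first observation are assumptions made here.

```python
def simple_exponential_smoothing(series, alpha):
    """Return the forecasts F_1, ..., F_{n+1} for a series y_1, ..., y_n.

    Recursion: F_{t+1} = alpha * y_t + (1 - alpha) * F_t.
    Assumption: F_1 is initialised to y_1 (other choices are common).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    forecasts = [series[0]]  # F_1 := y_1
    for y_t in series:
        # Weighted average of the latest observation and the latest forecast.
        forecasts.append(alpha * y_t + (1 - alpha) * forecasts[-1])
    return forecasts
```

With alpha close to one the forecasts follow the most recent observation almost exactly, while with alpha close to zero they change slowly and smooth out short-term fluctuations.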

Double Exponential Smoothing (DES)
This technique is used when the data exhibit a trend, that is, when the time series can be described using an additive model with an increasing or decreasing trend and no seasonality. The model is
$$L_t = \alpha y_t + (1-\alpha)(L_{t-1} + b_{t-1}) \tag{2}$$
$$b_t = \beta (L_t - L_{t-1}) + (1-\beta) b_{t-1} \tag{3}$$
where $y_t$ is the actual value at time $t$, $L_t$ is the level of the series at time $t$ and $b_t$ is the slope (trend) of the time series at time $t$. $\alpha$ and $\beta$ ($= 0.1, 0.2, \ldots, 0.9$) are the smoothing coefficients for level and for trend respectively. The best values of $\alpha$ and $\beta$ correspond to the minimum mean square error (MSE).
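The search for the smoothing coefficients over 0.1, 0.2, ..., 0.9 by minimum MSE can be sketched as below. The sketch assumes the standard Holt linear-trend form of DES, with the level initialised to the first observation and the trend to the first difference; these initial values and all function names are assumptions for illustration.

```python
def holt_linear(series, alpha, beta):
    """One-step-ahead fitted values plus the final level and trend."""
    level = series[0]
    trend = series[1] - series[0]  # assumed initialisation
    fitted = []
    for y_t in series[1:]:
        fitted.append(level + trend)  # forecast of y_t made one step earlier
        prev_level = level
        level = alpha * y_t + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return fitted, level, trend


def mse(series, alpha, beta):
    """Mean square error of the one-step-ahead fitted values."""
    fitted, _, _ = holt_linear(series, alpha, beta)
    errors = [y - f for y, f in zip(series[1:], fitted)]
    return sum(e * e for e in errors) / len(errors)


def grid_search(series):
    """Return the (alpha, beta) pair on the 0.1..0.9 grid with minimum MSE."""
    grid = [k / 10 for k in range(1, 10)]
    pairs = ((a, b) for a in grid for b in grid)
    return min(pairs, key=lambda ab: mse(series, ab[0], ab[1]))
```

On a perfectly linear series every grid point gives zero error, so the grid search is only informative on data that deviate from a straight line.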

Triple Exponential Smoothing (TES)
The TES model, also referred to as the Holt-Winters (HW) model, is applied when time series data exhibit seasonality. It incorporates three smoothing equations: the first for the level, the second for the trend and the third for seasonality (Equations (5)-(7)). From these we obtain the prediction for time period $T + \tau$.
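As an illustration of how three smoothing equations combine into a prediction for period T + τ, the sketch below uses the additive Holt-Winters form. The additive form, the initialisation from the first two seasons, and the function and parameter names are all assumptions made here; the text does not say which variant was fitted.

```python
def holt_winters_additive(series, period, alpha, beta, gamma, horizon):
    """Additive Holt-Winters forecasts for T+1, ..., T+horizon."""
    if len(series) < 2 * period:
        raise ValueError("need at least two full seasons")
    # Assumed initialisation from the first two seasons.
    first = series[:period]
    second = series[period:2 * period]
    level = sum(first) / period
    trend = (sum(second) - sum(first)) / period ** 2
    seasonal = [y - level for y in first]
    for t in range(period, len(series)):
        y_t = series[t]
        s_old = seasonal[t % period]
        prev_level = level
        # Level: deseasonalised observation against the previous projection.
        level = alpha * (y_t - s_old) + (1 - alpha) * (prev_level + trend)
        # Trend: change in level against the previous trend.
        trend = beta * (level - prev_level) + (1 - beta) * trend
        # Seasonality: detrended observation against the old seasonal term.
        seasonal[t % period] = gamma * (y_t - level) + (1 - gamma) * s_old
    n = len(series)
    return [level + tau * trend + seasonal[(n + tau - 1) % period]
            for tau in range(1, horizon + 1)]
```

For monthly data such as the SHC imports, `period` would be 12, which requires at least 24 observations under this initialisation.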

Autoregressive Integrated Moving Average (ARIMA) Model
The ARIMA model has the following stages: identification, estimation, diagnosis and prediction. The "I" stands for integrated, which implies that the series needs to undergo differencing and that, upon completion of the modelling, the results undergo an integration process to produce final predictions and estimates [7]. The model is denoted ARIMA (p, d, q), which reduces to a stationary ARMA (p, q) process once the series has been differenced d times. In ARIMA (p, d, q), p stands for the order of the autoregressive (AR) part, d stands for the number of times the data need to be differenced to become stationary and q stands for the order of the moving average (MA) part. The R statistical package was used to perform the ARIMA modelling stages of identification, estimation, diagnosis and prediction. The expressions for AR, MA and ARMA are:
AR model:
$$y_t = \vartheta_1 y_{t-1} + \cdots + \vartheta_p y_{t-p} + \varepsilon_t$$
MA model:
$$y_t = \varepsilon_t + \phi_1 \varepsilon_{t-1} + \cdots + \phi_q \varepsilon_{t-q}$$
and ARMA model:
$$y_t = \vartheta_1 y_{t-1} + \cdots + \vartheta_p y_{t-p} + \varepsilon_t + \phi_1 \varepsilon_{t-1} + \cdots + \phi_q \varepsilon_{t-q}$$
where $\vartheta_i$ is the $i$-th auto-regressive parameter, $\varepsilon_t$ is the error term at time $t$ and $\phi_j$ is the $j$-th moving-average parameter.

Assumption: Stationarity
The stationarity assumption implies that the mean, variance and autocorrelation structures do not change over time. Stationarity will mean a flat looking series, without trend, constant variance over time and no periodic fluctuations (seasonality). However, this assumption of stationarity applies to ARIMA models and not ES models. When the data is found to be non-stationary, the first difference (d = 1) will be used. Only in extreme cases will second difference (d = 2) be applied.
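The differencing step, and the integration step that undoes it when producing final predictions, can be sketched as follows. The function names are assumptions introduced for illustration.

```python
def difference(series, d=1):
    """Apply first differencing d times: each pass maps y_t to y_t - y_{t-1}."""
    for _ in range(d):
        series = [b - a for a, b in zip(series, series[1:])]
    return series


def undifference(diffs, first_value):
    """Invert one round of differencing, given the first original value."""
    out = [first_value]
    for delta in diffs:
        out.append(out[-1] + delta)
    return out
```

A series with a linear trend, such as 2, 4, 6, 8, becomes the constant series 2, 2, 2 after one difference. This is why d = 1 is usually enough to remove a trend and d = 2 is reserved for extreme cases.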

Results and Discussion
The data collected were loaded into R version 3.3.3 to perform the necessary analysis, as outlined in the subsections that follow. Figure 2 shows a plot of the original SHC imports data.

Exponential Smoothing Output
Using the appropriate coding in R, the following output was automatically generated.

Simple Exponential Smoothing
The R output for the SES model, including the alpha level, was as shown in Table 2.

Double Exponential Smoothing
The fitted DES model for SHC importation was obtained using Equations (2) and (3).

Triple Exponential Smoothing
Table 4 shows the R output for the HW model. The smoothing parameters for level, trend and seasonality (alpha, beta and gamma) were estimated, with $\alpha = 0.8006$ and coefficient $a = 2158.93914$. Using Equations (5)-(7), we fitted the HW model for SHC imports.

Choice of Appropriate Exponential Smoothing Technique
Figure 3 shows the plots of the three fitted ES models and the original data for ease of comparison and selection. The AICs in Table 5 show that the SES was a better fit than the DES. Hence SES was chosen as the appropriate ES technique of the three compared.

Model Identification and Selection
To model an ARIMA, a time plot is the first step. Figure 4 shows time plots of the SHC imports for d = 0 and d = 1. The observed data are non-stationary, as evidenced by Figure 4(a) and the ACF and PACF plots in Figure 5. ARIMA modelling requires that the observed data be stationary and, if they are not, that they be made stationary. Hence Figure 4(b), Figure 5(c) and Figure 5(d) are the result of taking the first difference, that is, d = 1.
Model selection requires that the ACF and PACF plots for d = 1 in Figure 5 be examined to establish the most suitable ARIMA. However, the ACF and PACF plots did not give a clear indication of significant spikes at any one lag. Hence, several tentative ARIMA models and their respective AICs were examined, as shown in Table 6. ARIMA (2, 1, 2) was chosen as the best fit of the tentative ARIMA models examined. Although the first six had smaller AICs, their parameters were found to be insignificant. ARIMA (2, 1, 2) had all its estimated parameters significant, as Table 7 shows.
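The selection rule described above, taking the smallest AIC among the candidates whose parameters are all significant, might be sketched as follows. The candidate list, its AIC values and its p-values are hypothetical illustration data, not the values in Table 6, and the function names are assumptions.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def select_model(candidates, level=0.05):
    """Pick the lowest-AIC candidate whose parameters are all significant.

    candidates: list of dicts with keys 'order', 'aic' and 'p_values'.
    Returns None when no candidate has all p-values below `level`.
    """
    admissible = [c for c in candidates
                  if all(p < level for p in c["p_values"])]
    if not admissible:
        return None
    return min(admissible, key=lambda c: c["aic"])


# Hypothetical candidates (illustrative numbers only).
candidates = [
    {"order": (1, 1, 0), "aic": 500.0, "p_values": [0.40]},
    {"order": (0, 1, 1), "aic": 502.0, "p_values": [0.20]},
    {"order": (2, 1, 2), "aic": 505.0, "p_values": [0.01, 0.02, 0.01, 0.03]},
]
best = select_model(candidates)
print(best["order"])  # (2, 1, 2)
```

In this made-up example the two models with lower AIC are rejected because their parameters are not significant, which mirrors the reasoning applied to the tentative models in the text.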

Estimation
When estimating the parameters, R gave the output for ARIMA (2, 1, 2) shown in Table 7(a). Their significance was then tested using p-values (see Table 7(b) for the p-value of each parameter).
The parameters found significant were AR (1), AR (2), MA (1) and MA (2). Figure 6 is a plot of the fitted model against the observed SHC imports, which shows that the fitted values track the actual SHC imports closely.
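One common way to obtain a p-value for each estimated coefficient is a two-sided z-test on the ratio of the estimate to its standard error. The sketch below assumes this normal approximation. The text does not state which test produced the p-values in Table 7(b), so the method and the function name are assumptions.

```python
import math


def z_test_p_value(estimate, std_error):
    """Two-sided p-value for H0: coefficient = 0, under a normal approximation."""
    z = estimate / std_error
    # 2 * (1 - Phi(|z|)) is equal to erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))
```

A coefficient would be judged significant at the 5% level when the returned value is below 0.05, which corresponds to an absolute z-ratio above roughly 1.96.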

Diagnostic Checking
The best-fit model was checked by analysing its residuals to ensure that they form a white noise process. The ACF of the residuals, the Q-Q plot and the histogram of the residuals were used to show that the residuals of the fit form a white noise process. Figure 7 shows that the residuals are white noise and that all p-values of the Ljung-Box test are greater than 0.05. Hence ARIMA (2, 1, 2) is indeed the best-fit model.
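The Ljung-Box statistic used in this check is commonly defined as Q = n(n + 2) Σ ρ_k² / (n − k), summed over lags k = 1, ..., h, where ρ_k is the sample autocorrelation of the residuals at lag k. A minimal sketch is given below; the function names are assumptions. The p-value is not computed because it needs a chi-square distribution function, which is not in the Python standard library.

```python
def autocorrelation(x, lag):
    """Sample autocorrelation of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n))
    return num / denom


def ljung_box_q(residuals, max_lag):
    """Ljung-Box Q statistic over lags 1..max_lag."""
    n = len(residuals)
    return n * (n + 2) * sum(
        autocorrelation(residuals, k) ** 2 / (n - k)
        for k in range(1, max_lag + 1)
    )
```

Q is compared with a chi-square quantile, and a small Q (large p-value) is consistent with white-noise residuals. For residuals of a fitted ARMA model the degrees of freedom are usually reduced by the number of estimated AR and MA parameters.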

Discussion
All four models considered in this report, as highlighted in Figure 1, were fitted to the data. The accuracy of each fit was evaluated using four metrics, as discussed in the preceding section. Each approach was applied to determine and rank the performances of the models for the given time series. Table 8 summarizes the four models and their forecasting performances.
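The four accuracy metrics (RMSE, MAE, MPE and MAPE) can be computed as in the sketch below. It assumes the error is defined as actual minus predicted and that no actual value is zero; the function name is hypothetical.

```python
import math


def forecast_errors(actual, predicted):
    """Return RMSE, MAE, MPE (%) and MAPE (%) for paired observations."""
    errors = [a - p for a, p in zip(actual, predicted)]  # actual - predicted
    n = len(errors)
    return {
        "RMSE": math.sqrt(sum(e * e for e in errors) / n),
        "MAE": sum(abs(e) for e in errors) / n,
        "MPE": 100 * sum(e / a for e, a in zip(errors, actual)) / n,
        "MAPE": 100 * sum(abs(e / a) for e, a in zip(errors, actual)) / n,
    }
```

MPE keeps the sign of each error, so positive and negative errors can cancel and it mainly indicates bias. RMSE, MAE and MAPE measure the size of the errors, and smaller values indicate a better fit.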
The results indicate that the ARIMA model performs better than any of the other models for this time series. The ARIMA (2, 1, 2) has smaller prediction errors than the SES, and so it was concluded that ARIMA (2, 1, 2) is the best-fit model for the SHC imports data. Thus it can also be used to forecast future imports of SHCs. Note, however, that although the SES model gives the second-best forecast after the ARIMA model, the performance of each model depends on the data used.
It should be noted here that the differences between their performances are related to the differences between the methods of determining forecasts in the ES and in the ARIMA models. The forecasting method in the ES models relies on a weighted average of the past observed values in which the weights decline exponentially. This means that more recent observations contribute significantly more than older observations do. The ARIMA model,

Forecasting
Forecasting is usually the last stage in time series analysis, as shown in Figure 1. It plays a significant role in planning and decision making for policy makers. When both current and future events are taken into account, those entrusted with decision-making powers can make near-perfect decisions.
Thus, however uncertain forecasts might appear, decision makers should not ignore them, given their vital role in the entire process. Hence, Table 9 shows forecasts for 18 months (from January 2017 to June 2018).

Conclusion
Zambia largely depends on the international second-hand car market for its motor vehicle supply. In this paper, monthly time series data on second hand car