Additive Decomposition with ARIMA Model Forecasts When the Trend Component Is Quadratic

1. Introduction
There are two main reasons for time series analysis: 1) identifying the pattern of a series and 2) forecasting. Both goals require that the pattern of the observed time series be identified and described [1], and both are served well only when the right model is used for the analysis.
Descriptive time series analysis, also known as decomposition of time series, is the separation of an observed time series into its components: the trend ($T_t$), the seasonal ($S_t$), the cyclical ($C_t$) and the irregular ($I_t$) components. When the period of time is small, the cyclical component is embedded into the trend and the observed time series ($X_t$) can be decomposed into the trend-cycle component ($M_t$), the seasonal component ($S_t$) and the irregular/residual component ($e_t$) [2].
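There are three main decomposition models in descriptive time series:
Additive Model:
$$X_t = M_t + S_t + e_t \qquad (1)$$
Multiplicative Model:
$$X_t = M_t \times S_t \times e_t \qquad (2)$$
Pseudo-Additive/Mixed Model:
$$X_t = M_t \times S_t + e_t \qquad (3)$$
where, for time point t, $X_t$ is the observed series, $M_t$ is the trend-cycle component, $S_t$ is the seasonal component and $e_t$ is the irregular or random component.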
The assumption for the additive model (1) is that the irregular/error component $e_t$ is Gaussian $N(0, \sigma_1^2)$ white noise and the sum of the seasonal component $S_t$ over a complete period of length s is zero ($\sum_{j=1}^{s} S_{t+j} = 0$), while for the multiplicative model (2), $e_t$ is Gaussian $N(1, \sigma_2^2)$ white noise and the sum of the seasonal component over a complete period is s ($\sum_{j=1}^{s} S_{t+j} = s$).
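To make the two sets of assumptions concrete, the following minimal sketch builds a short series under the additive and the multiplicative model. The trend coefficients, seasonal patterns, noise levels and the names `simulate`, `SEASONAL_ADD` and `SEASONAL_MUL` are all hypothetical and chosen only for illustration; they are not taken from the data analysed in this paper.

```python
import random

# Hypothetical seasonal patterns for periodicity s = 4.
SEASONAL_ADD = [-2.0, 1.0, 3.0, -2.0]   # sums to 0 (additive assumption)
SEASONAL_MUL = [0.8, 1.1, 1.3, 0.8]     # sums to s = 4 (multiplicative assumption)


def simulate(model, m=5, seed=0):
    """Return m*4 observations built from a quadratic trend, a seasonal
    pattern and Gaussian noise, combined according to `model`."""
    rng = random.Random(seed)
    s = len(SEASONAL_ADD)
    series = []
    for t in range(1, m * s + 1):
        trend = 10.0 + 0.5 * t + 0.02 * t * t      # illustrative M_t
        j = (t - 1) % s                            # season index of time t
        if model == "additive":
            # X_t = M_t + S_t + e_t with e_t ~ N(0, sigma^2)
            series.append(trend + SEASONAL_ADD[j] + rng.gauss(0.0, 0.5))
        elif model == "multiplicative":
            # X_t = M_t * S_t * e_t with e_t ~ N(1, sigma^2)
            series.append(trend * SEASONAL_MUL[j] * rng.gauss(1.0, 0.02))
        else:
            raise ValueError("model must be 'additive' or 'multiplicative'")
    return series


additive_series = simulate("additive")
multiplicative_series = simulate("multiplicative")
```

In the additive series the seasonal swing keeps the same size as the trend rises, whereas in the multiplicative series it grows with the level of the trend.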
The major problem in the use of descriptive time series is the choice of an adequate model for decomposition. The methods used in the literature to choose between the additive, multiplicative and pseudo-additive models are graphical and non-graphical. For a time series in which the amplitude of both the seasonal and irregular variations does not change as the level of the trend rises or falls, the additive model is adopted. However, when the amplitude of both the seasonal and irregular variations increases as the level of the trend rises, the multiplicative model is adopted [2]. Iwueze et al. [3] state that, using the Buys-Ballot table, the relationship between the seasonal means and the seasonal standard deviations can indicate the appropriate model. In that study, the time plot of the means and standard deviations was used for the choice of model.
The method of coefficient of variation of the seasonal quotients and differences was proposed by Justo and Rivera [4]. They opined that if the coefficient of variation of the seasonal quotients is greater than that of the seasonal differences, the model for decomposition is additive; otherwise, the model is multiplicative. However, the basis for this choice and the way the seasonal quotients and differences are to be used were not stated. Iwueze and Nwogu [1] provided a framework for the choice of model in descriptive time series based on the Buys-Ballot table. According to them, the column (seasonal) variances of the Buys-Ballot table are simply the trending curve of the time series for the additive model and the product of the trending curve and the square of the seasonal effect for the multiplicative model.
The objective of this paper is to identify and remove the (quadratic) trend curve of a time series using the identified decomposition model (additive or multiplicative), then fit an ARIMA model to the de-trended series and use the fitted model for forecasts. The significance of the paper is that it gives the analyst greater confidence in choosing a suitable model for the decomposition of a time series when the trend is quadratic.
2. Method
2.1. Buys-Ballot Table
A Buys-Ballot table summarizes time series data arranged in m rows and s columns to reveal possible seasonal variation. In order to analyze the data, it is necessary to include the period and seasonal totals ($T_{i.}$ and $T_{.j}$), the period and seasonal averages ($\bar{X}_{i.}$ and $\bar{X}_{.j}$), and the grand total and mean ($T_{..}$ and $\bar{X}_{..}$). Wold [5] credits this arrangement of time series data to Buys-Ballot [6]; hence, the table is referred to as the Buys-Ballot table in the literature and is shown in Table 1. In Table 1, the rows represent the periods/years while the columns represent the seasons.
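Source: Iwueze and Nwogu [1].
For better understanding of Table 1, we define the column (j) totals, averages and standard deviations as follows:
$$T_{.j} = \sum_{i=1}^{m} X_{(i-1)s+j} \qquad (4)$$
$$\bar{X}_{.j} = \frac{T_{.j}}{m} \qquad (5)$$
$$\hat{\sigma}_{.j} = \sqrt{\frac{1}{m-1} \sum_{i=1}^{m} \left( X_{(i-1)s+j} - \bar{X}_{.j} \right)^2} \qquad (6)$$
where $X_t$ is the series, m is the number of periods/years, s is the periodicity, and $n = ms$ is the overall number of observations (sample size); $T_{.j}$ is the total for the jth season, $\bar{X}_{.j}$ is the average of the jth season and $\hat{\sigma}_{.j}$ is the standard deviation for the jth season.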
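The column summaries of a Buys-Ballot table can be sketched in a few lines of code. The function name `buys_ballot` and the toy series are hypothetical, and the standard deviation is assumed to use the sample divisor m − 1; the sketch needs at least two complete periods.

```python
import math


def buys_ballot(series, s):
    """Arrange `series` in m rows (periods) and s columns (seasons) and
    return the rows with the column totals, means and standard deviations."""
    if len(series) % s != 0:
        raise ValueError("series length must be a multiple of s")
    m = len(series) // s
    if m < 2:
        raise ValueError("at least two complete periods are required")
    # Row i holds observations X_{(i-1)s+1}, ..., X_{(i-1)s+s}.
    rows = [series[i * s:(i + 1) * s] for i in range(m)]
    col_totals = [sum(row[j] for row in rows) for j in range(s)]
    col_means = [total / m for total in col_totals]
    # Sample standard deviation per season (divisor m - 1, an assumption).
    col_sds = [
        math.sqrt(sum((row[j] - col_means[j]) ** 2 for row in rows) / (m - 1))
        for j in range(s)
    ]
    return rows, col_totals, col_means, col_sds


# Toy quarterly series over three years (illustrative values only).
toy = [1, 2, 3, 4, 3, 4, 5, 6, 5, 6, 7, 8]
rows, totals, means, sds = buys_ballot(toy, s=4)
```

Plotting `means` against `sds` gives the kind of display used later for choosing between the additive and multiplicative models.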
For the additive model:
Let $X_t$ be the observation of the time series at time t. Define $\bar{X}_{.j}$ as the column mean and $\hat{\sigma}^2_{.j}$ as the column variance for the additive model. We can write $X_t$ as $X_{(i-1)s+j}$, that is, $t = (i-1)s + j$, in terms of the row (i) and column (j) of the Buys-Ballot table.
For the multiplicative model:
Let $X_t$ be the observation of the time series at time t. Define $\bar{X}_{.j}$ as the column mean and $\hat{\sigma}^2_{.j}$ as the column variance for the multiplicative model. We can write $X_t$ as $X_{(i-1)s+j}$ in terms of the row (i) and column (j) of the Buys-Ballot table [3].
Several papers in the statistics literature have discussed the use of the time plot of the entire series to choose between additive and multiplicative models. A review of some of this work on model choice in descriptive time series analysis is therefore necessary in order to highlight the import of this study.
The time plot of a series can be used to choose between additive and multiplicative models. If the seasonal variation stays roughly the same size regardless of the mean level, then the model is additive, but if it increases in size in direct proportion to the mean level, then the model is multiplicative [2].
Iwueze et al. [3], using the Buys-Ballot table, state that the relationship between the seasonal means and the seasonal standard deviations can indicate the appropriate model. The additive model should be employed when the seasonal standard deviation shows no appreciable increase or decrease relative to any increase or decrease in the seasonal means. The multiplicative model should be employed when the seasonal standard deviation shows an appreciable increase or decrease relative to any increase or decrease in the seasonal means. This comparison was made using the time plot of the means and standard deviations.
Figures 1-4 respectively illustrate two (2) time series with their trend components, on which the choice of appropriate model can be easily decided. In the first case, the additive model was the appropriate choice because the differences between the trend and observed data (the seasonal differences) for the same periods in different years are almost the same, while in the second case the multiplicative model was chosen because the ratios of the trend and observed data (the seasonal indices) for the same periods in different years are almost the same. Thus, the appropriate model is either additive or multiplicative.
![]()
Figure 1. Additive model with non-seasonal effect.
![]()
Figure 2. Multiplicative model with non-seasonal effect.
![]()
Figure 3. Time series appropriate for additive decomposition with seasonal effect.
![]()
Figure 4. Time series appropriate for multiplicative decomposition with seasonal effect. Source: http://www.stats.govt.nz/surveys_and_methods/methods/data-analysis/seasonal-adjustment/theunderlying-model.aspx.
2.2. Autoregressive Moving Average (ARMA) Model
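Frequently, after stationarity has been achieved, a time series contains AR(p) and MA(q) components of certain orders, which can be combined and used for forecasting. Such a series is also called a mixed process. Thus, a mixture of an autoregressive process of order p, AR(p), and a moving average process of order q, MA(q), denoted ARMA(p, q), is of the form
$$X_t = \phi_1 X_{t-1} + \phi_2 X_{t-2} + \cdots + \phi_p X_{t-p} + e_t - \theta_1 e_{t-1} - \theta_2 e_{t-2} - \cdots - \theta_q e_{t-q} \qquad (7)$$
$$\phi(B) X_t = \theta(B) e_t \qquad (8)$$
where $\phi(B) = 1 - \phi_1 B - \cdots - \phi_p B^p$ and $\theta(B) = 1 - \theta_1 B - \cdots - \theta_q B^q$, B is the backshift operator, and $e_t$ is a sequence of independently and identically distributed (iid) random variables with $E(e_t) = 0$ and $\mathrm{Var}(e_t) = \sigma_e^2$.
For stationarity, the roots of $\phi(B) = 0$ all lie outside the unit circle, and for invertibility, the roots of $\theta(B) = 0$ all lie outside the unit circle.
For the ARMA(p, q) process, there are q autocorrelations, $\rho_q, \rho_{q-1}, \ldots, \rho_1$, whose values depend directly on the choice of the q moving average parameters $\theta_1, \ldots, \theta_q$, as well as on the p autoregressive parameters $\phi_1, \ldots, \phi_p$.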
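The stationarity condition can be checked numerically. The sketch below does so for the AR(2) case only, assuming the autoregressive polynomial is written as $1 - \phi_1 z - \phi_2 z^2$; the function names are hypothetical, and the example coefficients are the AR(2) estimates reported in the Minitab output of Appendix D.

```python
import cmath


def ar2_roots(phi1, phi2):
    """Roots of 1 - phi1*z - phi2*z**2 = 0 (requires phi2 != 0)."""
    if phi2 == 0:
        raise ValueError("phi2 must be non-zero for a second-order polynomial")
    # Rewrite as phi2*z**2 + phi1*z - 1 = 0 and apply the quadratic formula.
    disc = cmath.sqrt(phi1 ** 2 + 4.0 * phi2)
    return ((-phi1 + disc) / (2.0 * phi2), (-phi1 - disc) / (2.0 * phi2))


def is_stationary_ar2(phi1, phi2):
    """True when both roots lie strictly outside the unit circle."""
    return all(abs(z) > 1.0 for z in ar2_roots(phi1, phi2))


# AR(2) estimates from Appendix D.
roots = ar2_roots(1.3389, -0.4315)
stationary = is_stationary_ar2(1.3389, -0.4315)
```

For these estimates both roots are real with moduli of roughly 1.25 and 1.85, so the fitted AR(2) process satisfies the stationarity condition. The same check applied to the moving average polynomial gives the invertibility condition.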
2.2.1. Model Selection Criteria
When more than one candidate model is identified from the process in Equation (8), Akaike’s Information Criterion (AIC) is used to select the most suitable one. The AIC is most commonly given as:
$$\mathrm{AIC} = N \ln \hat{\sigma}_e^2 + 2r \qquad (9)$$
where r is the number of model parameters, N is the effective number of data points used in the estimation procedure and $\hat{\sigma}_e^2$ is the estimated residual variance (mean squared error, MSE) [7] [8]. The model that minimizes the AIC is the best model.
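As an illustration of the selection rule, the sketch below assumes the common form $\mathrm{AIC} = N \ln \hat{\sigma}_e^2 + 2r$; other texts scale the criterion differently, which does not change the ranking of models fitted to the same data. The function name `aic` and the candidate values are hypothetical and are not the entries of Table 3.

```python
import math


def aic(residual_variance, n_obs, n_params):
    """AIC in the assumed form N * ln(sigma_hat^2) + 2r."""
    if residual_variance <= 0:
        raise ValueError("residual variance must be positive")
    return n_obs * math.log(residual_variance) + 2 * n_params


# Hypothetical candidates: (label, residual variance, number of parameters).
candidates = [("model A", 13.5, 1), ("model B", 13.1, 2), ("model C", 13.1, 5)]
scores = {label: aic(var, 360, r) for label, var, r in candidates}
best = min(scores, key=scores.get)   # smallest AIC wins
```

With equal residual variances the criterion favours the model with fewer parameters, which is why "model B" is preferred to "model C" here.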
2.2.2. Accuracy Measures of the Estimated Values
1) Estimated Errors
To gauge the accuracy of our estimates, the estimated errors are used to compare the estimated forecast values with the observed values for 2013. They are obtained by subtracting the estimated forecast values (EFV) from the original or actual values (AV) [9]. The estimate error ($e_i$) is given by
$$e_i = AV_i - EFV_i \qquad (10)$$
The accuracy measure used in this paper is the Mean Absolute Percentage Error (MAPE).
2) Mean Absolute Percentage Error (MAPE)
This measures the percentage deviation between the actual values and the estimates [10]. It is obtained as:
$$\mathrm{MAPE} = \frac{100}{v} \sum_{i=1}^{v} \left| \frac{e_i}{AV_i} \right| \qquad (11)$$
where v is the number of forecast values.
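The two accuracy measures translate directly into code. This is a minimal sketch with hypothetical function names and made-up numbers; it assumes that none of the actual values is zero.

```python
def estimate_errors(actual, forecast):
    """Estimate errors: actual value minus estimated forecast value."""
    return [a - f for a, f in zip(actual, forecast)]


def mape(actual, forecast):
    """Mean absolute percentage error over the v forecast values."""
    errors = estimate_errors(actual, forecast)
    return 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / len(actual)


# Illustrative values only (not the 2013 prices).
actual_values = [100.0, 200.0]
forecast_values = [90.0, 220.0]
errors = estimate_errors(actual_values, forecast_values)
score = mape(actual_values, forecast_values)
```

Here each forecast misses its actual value by 10%, so the MAPE is 10%.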
This method is illustrated in Section 3 with real-life time series data.
3. Empirical Illustration
This section is divided into two parts: 1) choosing between the additive and multiplicative models for time series decomposition and 2) ARIMA modelling and forecasting.
3.1. Identification of Additive or Multiplicative Model
The Nigeria Spot component price of oil (US Dollar per Barrel) data for 1983-2013 (Appendix A) were arranged in a Buys-Ballot table to ascertain whether the additive or the multiplicative model should be used for time series decomposition. The column means, variances and standard deviations were obtained. The behaviour of the means and standard deviations is shown in Figure 5.
![]()
Figure 5. Decomposition model identification.
The time plot of the means and standard deviations in Figure 5 was used for the choice of model. Figure 5 shows that the standard deviation stays roughly the same size regardless of the mean level, which implies an additive model [2]. Thus, the appropriate model for the decomposition of the Nigeria Spot component price of oil (US Dollar per Barrel) is the additive model.
3.2. ARMA Model and Forecasting
This section is divided into three parts: de-trending the series using the appropriate decomposition model (i.e. the additive model), ARMA modelling of the series, and forecasting.
Data: Nigeria Spot component price of oil (US Dollar per Barrel) (1983-2013)
The Nigeria Spot component price of oil (NSCPO, US Dollar per Barrel) is a monthly series comprising 372 data points over 31 years (1983-2013). Appendix A presents the complete data in Buys-Ballot table form, and the series plot is shown in Figure 6. Note that the first 30 years (1983-2012) were used for model building (Xt) and the last year (2013) was used for comparison of forecasts.
Examining Figure 6, we notice that the NSCPO series rises from January 1983 to December 2012, which indicates a trend component that can be represented by a polynomial of suitable order. Hence, we fitted several polynomial trend curves of order m in order to identify which curve represents the trend component in the NSCPO series, using R-square ($R^2$) [11] [12] [13]. The polynomial trend whose $R^2$ is close to 1, with all its coefficients significant, is the best trend curve.
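The idea of comparing polynomial orders by $R^2$ can be sketched as follows. This is not the Excel or Minitab procedure used in the paper: it is a plain least-squares fit through the normal equations, with hypothetical function names and a made-up series, and it does not compute the significance of the coefficients. The normal equations become ill-conditioned for high orders and long series, so a library regression routine would be preferable in practice.

```python
def fit_polynomial(t, x, degree):
    """Least-squares coefficients [b0, b1, ..., b_degree] from the normal equations."""
    k = degree + 1
    # Normal equations A b = rhs with A[r][c] = sum t^(r+c), rhs[r] = sum x * t^r.
    a = [[float(sum(ti ** (r + c) for ti in t)) for c in range(k)] for r in range(k)]
    rhs = [float(sum(xi * ti ** r for ti, xi in zip(t, x))) for r in range(k)]
    # Forward elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        rhs[col], rhs[pivot] = rhs[pivot], rhs[col]
        for r in range(col + 1, k):
            factor = a[r][col] / a[col][col]
            for c in range(col, k):
                a[r][c] -= factor * a[col][c]
            rhs[r] -= factor * rhs[col]
    # Back substitution.
    coef = [0.0] * k
    for r in range(k - 1, -1, -1):
        tail = sum(a[r][c] * coef[c] for c in range(r + 1, k))
        coef[r] = (rhs[r] - tail) / a[r][r]
    return coef


def r_squared(t, x, coef):
    """Coefficient of determination of the fitted polynomial."""
    fitted = [sum(b * ti ** p for p, b in enumerate(coef)) for ti in t]
    mean_x = sum(x) / len(x)
    ss_res = sum((xi - fi) ** 2 for xi, fi in zip(x, fitted))
    ss_tot = sum((xi - mean_x) ** 2 for xi in x)
    return 1.0 - ss_res / ss_tot


# Made-up series that is exactly quadratic in t.
t = list(range(1, 11))
x = [2.0 + 3.0 * ti + 0.5 * ti * ti for ti in t]
linear = fit_polynomial(t, x, 1)
quadratic = fit_polynomial(t, x, 2)
r2_linear = r_squared(t, x, linear)
r2_quadratic = r_squared(t, x, quadratic)
```

For this toy series the quadratic fit recovers the generating coefficients and its $R^2$ exceeds that of the straight line, which is the pattern the comparison of orders looks for.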
3.2.1. Trend Identification and the De-Trended Series
The general polynomial trend is expressed as:
$$M_t = \beta_0 + \beta_1 t + \beta_2 t^2 + \cdots + \beta_m t^m \qquad (12)$$
where $M_t$ is the trend at time t and m is the order of the fitted polynomial at which $R^2$ is close to 1.
We used Equation (12) to estimate the coefficients of the polynomials, $\beta_0, \beta_1, \ldots, \beta_m$, in Figure 7 using Microsoft Excel. We also used the regression method to fit the coefficients of the polynomials of order m with the help of Minitab 17 (Appendix B); the results are summarized in Table 2.
![]()
Figure 6. Nigeria spot component price of oil (NSCPO) (1983-2012) “Xt”.
![]()
Figure 7. Trend curves fitted to NSCPO series (Xt).
From Table 2, one of the estimated coefficients of the polynomial of order 3 is not significant and two estimated coefficients of the polynomial of order 4 are also not significant. When these coefficients are removed from the fitted polynomial trend curves, the resultant trend curve is quadratic, as shown in Table 2 (the over-fitting method). Therefore, the best fitted polynomial trend curve is the quadratic trend, with $R^2$ = 86.5% (Appendix B). Another reason why the polynomials of order three and four cannot be used for this analysis is that they fit negative values to values that are positive. Thus, we conclude that the trend component is represented by the quadratic trend curve.
![]()
Table 2. Fitted polynomials of the NSCPO series.
Footnote: **p-values are greater than the significance level (0.05); the trend in bold is the optimal order (m = 2) identified.
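The fitted polynomial (quadratic trend; m = 2) is given as
$$\hat{M}_t = \hat{\beta}_0 + \hat{\beta}_1 t + \hat{\beta}_2 t^2 \qquad (13)$$
By substitution of the coefficients in Table 2 into Equation (13), we have
$$\hat{M}_t = 37.7815 - 0.424231\,t + 0.00178209\,t^2 \qquad (14)$$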
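As a check on the fitted trend, the sketch below evaluates the quadratic with the coefficients printed in the Minitab trend analysis of Appendix D at t = 361, ..., 372, the twelve months after the 360 observations used for model building. The function name is hypothetical.

```python
def quadratic_trend(t, b0=37.7815, b1=-0.424231, b2=0.00178209):
    """Fitted quadratic trend with the coefficients reported in Appendix D."""
    return b0 + b1 * t + b2 * t * t


# Trend values for the 12 periods following the model-building sample.
trend_2013 = [round(quadratic_trend(t), 3) for t in range(361, 373)]
```

The first and last values come out at about 116.878 and 126.580, matching the first and last trend forecasts listed in Appendix D.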
Next, we used the additive model identified in Section 3.1, represented by Equation (1), to remove the quadratic trend from the NSCPO series.
Additive Model Decomposition:
$$X_t = M_t + S_t + e_t \qquad (15)$$
Then,
$$Y_t = X_t - \hat{M}_t \qquad (16)$$
where $\hat{M}_t$ is the fitted quadratic trend and $Y_t$ is the de-trended series.
Figure 8 shows that the series is stationary with constant variance, which indicates that the quadratic trend has been removed completely by the use of Equation (16), i.e. by the additive method. We now fit the best ARMA(p, q) model to the de-trended series represented by Equation (16).
3.2.2. ARMA Modelling of the Series (16)
The autocorrelation function (ACF) and partial autocorrelation function (PACF) of the de-trended series in Figure 8 are shown in Figure 9 and Figure 10, respectively.
Figure 9 shows significant spikes at lags 1, 2, 3, 4 and 5, while Figure 10 shows significant spikes at lags 1 and 2 only. Figure 9 suggests an MA(5) process while Figure 10 suggests an AR(2) process. However, suitable mixed ARMA(p, q) models may also be appropriate.
![]()
Figure 8. De-trended series of the NSCPO series (Yt).
![]()
Figure 9. ACF correlogram of NSCPO “Yt”, Equation (16).
![]()
Figure 10. PACF correlogram of NSCPO “Yt”, Equation (16).
In addition, a test is needed to decide whether the constant mean $\mu$ should be included in the models. The hypothesis of interest is
$$H_0: \mu = 0 \quad \text{against} \quad H_1: \mu \neq 0 \qquad (17)$$
The test statistic for testing $H_0$ against $H_1$ is
$$t = \frac{\bar{Y}}{s_Y / \sqrt{n}} \qquad (18)$$
where $\bar{Y}$ and $s_Y$ are the sample mean and standard deviation of the de-trended series and n is the number of observations. The computed t-value is $t = -9.32 \times 10^{-7}$ with p-value = 1.000 (see Appendix C). Since this p-value is greater than the significance level of 0.05, we do not reject $H_0$. This implies that $\mu$ should not be included in the model.
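The one-sample t statistic for a zero mean can be computed as in the following sketch. The function name and the sample values are hypothetical; the p-value, which requires the t distribution, is not computed here.

```python
import math
import statistics


def one_sample_t(values, mu0=0.0):
    """t statistic for H0: mean = mu0, using the sample standard deviation."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)          # divisor n - 1
    return (mean - mu0) / (sd / math.sqrt(n))


t_centered = one_sample_t([-1.0, 0.0, 1.0])          # sample mean is exactly 0
t_shifted = one_sample_t([1.0, 2.0, 3.0, 4.0, 5.0])  # sample mean is 3
```

A t-value very close to zero is what one would expect for this series: the residuals of a least-squares fit that includes an intercept average to zero by construction.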
Various ARMA(p, q) models were fitted to the series in Equation (16), with their respective residuals being white noise; the summary is shown in Table 3. The criterion used to select the best among these models is Akaike’s Information Criterion (AIC) [Section 2.2.1, Equation (9)], whose values are also given in Table 3.
The model identified using Akaike’s Information Criterion in Table 3 is the AR(2) model, which can be expressed as
$$Y_t = \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + e_t \qquad (19)$$
Estimation of the Parameters of the Identified AR(2) Model for (16)
Estimates were obtained with Minitab 17 and the results are tabulated in Table 4.
The residual ACF and PACF in Figure 11 and Figure 12 reveal that the model is adequate for the series in Equation (16). The adequacy of the model was also checked using the Ljung-Box chi-square statistic [14], and the results are summarized in Table 5.
![]()
Table 3. AIC values for the fitted ARMA(p, q) models.
![]()
Table 4. Parameter estimates of AR(2) Model for (16).
Footnote: values after (±) are their standard errors.
![]()
Table 5. (Ljung-box) chi-square statistic for adequacy of (19).
Footnote: k is the lag.
![]()
Figure 11. Residual ACF correlogram of AR(2).
![]()
Figure 12. Residual PACF correlogram of AR(2).
In Table 5, comparing the computed statistic $Q$ with the tabulated value $\chi^2_{\alpha}$ [i.e. $Q < \chi^2_{\alpha}$], we conclude that the model is adequate and can be used for forecasting.
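For reference, the Ljung-Box statistic at lag K is $Q = n(n+2) \sum_{k=1}^{K} r_k^2 / (n-k)$, where $r_k$ is the lag-k sample autocorrelation of the residuals. The sketch below computes it with hypothetical function names and a made-up series; the comparison with the chi-square critical value is left to tables or a statistics library.

```python
def sample_acf(x, max_lag):
    """Sample autocorrelations r_1, ..., r_max_lag of the series x."""
    n = len(x)
    mean = sum(x) / n
    dev = [xi - mean for xi in x]
    c0 = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t + k] for t in range(n - k)) / c0
        for k in range(1, max_lag + 1)
    ]


def ljung_box(residuals, max_lag):
    """Ljung-Box Q statistic using lags 1..max_lag."""
    n = len(residuals)
    acf = sample_acf(residuals, max_lag)
    return n * (n + 2) * sum(r * r / (n - k) for k, r in enumerate(acf, start=1))


# Made-up, strongly autocorrelated series: alternating +1 and -1.
alternating = [1.0, -1.0] * 10
q1 = ljung_box(alternating, 1)
```

For the alternating series the lag-1 autocorrelation is −0.95, so Q is large and the white-noise hypothesis would be rejected; for adequate residuals Q stays small. In the Minitab output of Appendix D the degrees of freedom equal the lag minus the two estimated AR parameters (for example, 10 at lag 12).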
3.2.3. Forecasting
We obtained forecasts for the AR(2) model in Equation (19) starting at origin 360 for 12 months. The $\ell$-step forecasts, denoted by $\hat{Y}_{360}(\ell)$ for $\ell = 1, 2, \ldots, 12$, with the 95% confidence intervals of the forecasts are shown in Appendix D. Similarly, the forecast values using Equation (19) were obtained. The expected estimated forecast values for the NSCPO series (Xt), i.e. ($\hat{X}_t$), and the estimated errors ($e_i$) between the actual values for the year 2013 and the expected estimated forecast values, using Equation (14), are shown in Table 6.
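The forecast recursion can be sketched as follows, under two stated assumptions. First, because the last two de-trended observations are not listed here, the recursion is seeded with the first two forecasts printed in Appendix D (periods 361 and 362) and continued from there. Second, the price forecasts are assumed to be recombined additively, trend plus AR(2) forecast, in line with the additive model. All function names are hypothetical; the coefficients are those printed in Appendix D.

```python
PHI1, PHI2 = 1.3389, -0.4315          # AR(2) estimates from Appendix D


def ar2_forecasts(y_prev2, y_prev1, steps):
    """Iterate Y_hat(l) = PHI1 * Y_hat(l-1) + PHI2 * Y_hat(l-2) for `steps` steps."""
    out = []
    for _ in range(steps):
        nxt = PHI1 * y_prev1 + PHI2 * y_prev2
        out.append(nxt)
        y_prev2, y_prev1 = y_prev1, nxt
    return out


def quadratic_trend(t):
    """Fitted quadratic trend with the coefficients reported in Appendix D."""
    return 37.7815 - 0.424231 * t + 0.00178209 * t * t


# Seeds: Appendix D forecasts for periods 361 and 362 (an assumption, see above).
seeds = [-4.6296, -4.1159]
detrended = seeds + ar2_forecasts(seeds[0], seeds[1], 10)     # periods 361..372
# Assumed additive recombination: trend plus de-trended forecast.
combined = [quadratic_trend(t) + y for t, y in zip(range(361, 373), detrended)]
```

The recursion reproduces the remaining de-trended forecasts of Appendix D to within rounding of the published coefficients, and the forecasts decay towards zero, so the combined forecasts move towards the trend curve as the horizon grows.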
In Table 6, the accuracy measure of the estimated forecasts confirms that the additive model is suitable, because the forecasts are close to the original values for 2013. (Using the MAPE accuracy measure, the forecasts from the additive decomposition method deviate from the original values for the year 2013 by 8.64%, approximately 9%.)
![]()
Table 6. Comparison of forecast values obtained by Equation (14) and Equation (19) with the original values for 2013.
4. Conclusion
The Buys-Ballot procedure has shown that 1) the column means and variances are not the same for the two models (additive and multiplicative) and 2) when the trend is quadratic, the column variances mimic the shape of the trending series for both the additive and multiplicative models. The illustrative example, using the Nigeria Spot component price of oil (US Dollar per Barrel), showed the additive model to be the appropriate model for decomposition of the series, based on the relationship between the means and standard deviations. Finally, the AR(2) model was found to be adequate for the series under consideration; hence it was used to forecast. The comparison of the expected and observed prices, using the Mean Absolute Percentage Error (MAPE), showed no significant difference between them. Hence, we conclude that the additive model should be adopted when the trend component is quadratic and the variation does not change as the level of the trend rises or falls.
Appendix A
Buys-Ballot Table for Nigeria Spot component price of oil (US Dollar per Barrel)
Source: Central Bank of Nigeria Statistical Bulletin 2014.
Appendix B. Minitab 17 Output
Regression Analysis: NSCPO versus t
The regression equation is
NSCPO = −1.03 + 0.219t
Predictor Coef SE Coef T P
Constant −1.033 2.181 −0.47 0.636
t 0.21910 0.01047 20.93 0.000
S = 20.6454 R-Sq = 55.0% R-Sq(adj) = 54.9%
Analysis of Variance
Source DF SS MS F P
Regression 1 186,648 186,648 437.90 0.000
Residual Error 358 152,592 426
Total 359 339,240
Regression Analysis: NSCPO versus t, t2
The regression equation is
NSCPO = 37.8 − 0.424 t + 0.00178 t2
Predictor Coef SE Coef T P
Constant 37.781 1.803 20.95 0.000
t −0.42423 0.02307 −18.39 0.000
t2 0.00178209 0.00006188 28.80 0.000
S = 11.3404 R-Sq = 86.5% R-Sq(adj) = 86.4%
Analysis of Variance
Source DF SS MS F P
Regression 2 293,328 146,664 1140.43 0.000
Residual Error 357 44,912 129
Total 359 339,240
Regression Analysis: NSCPO versus t, t2, t3
The regression equation is
NSCPO = 30.2 − 0.174t + 0.000051t2 + 0.000003t3
Predictor Coef SE Coef T P
Constant 30.198 2.343 12.89 0.000
t −0.17388 0.05612 −3.10 0.002
t2 0.0000507 0.0003610 0.14 0.888
t3 0.00000320 0.00000066 4.86 0.000
S = 10.9968 R-Sq = 87.3% R-Sq(adj) = 87.2%
Analysis of Variance
Source DF SS MS F P
Regression 3 296,189 98,730 816.42 0.000
Residual Error 356 43,051 121
Total 359 339,240
Regression Analysis: NSCPO versus t, t2, t3, t4
The regression equation is
NSCPO = 26.5 + 0.031t − 0.00249 t2 + 0.000014t3 − 0.000000t4
Predictor Coef SE Coef T P
Constant 26.462 2.933 9.02 0.000
t 0.0305 0.1122 0.27 0.786
t2 −0.002489 0.001262 −1.97 0.049
t3 0.00001413 0.00000525 2.69 0.007
t4 −0.00000002 0.00000001 −2.10 0.036
S = 10.9445 R-Sq = 87.5% R-Sq(adj) = 87.3%
Analysis of Variance
Source DF SS MS F P
Regression 4 296,717 74,179 619.28 0.000
Residual Error 355 42,523 120
Total 359 339,240
Appendix C
One-Sample T: Additive Yt
Test of mu = 0 vs not = 0
Appendix D
Trend Analysis for NSCPO (Quadratic Trend)
Data NSCPO
Length 360
NMissing 0
Fitted Trend Equation
Yt = 37.7815 − 0.424231t + 0.00178209t2
Accuracy Measures
MAPE 24.532
MAD 7.733
MSD 127.532
Forecasts for 2013
Period Forecast
May 116.878
Jun 117.742
Jul 118.610
Aug 119.481
Sep 120.356
Oct 121.235
Nov 122.117
Dec 123.002
Jan 123.892
Feb 124.784
Mar 125.681
Apr 126.580
![]()
AR(2) Yt Forecasts for 2013
ARIMA Model: Additive Yt
Final Estimates of Parameters
Type Coef SE Coef T P
AR 1 1.3389 0.0477 28.08 0.000
AR 2 −0.4315 0.0477 −9.05 0.000
Number of observations: 360
Residuals: SS = 4685.13 (backforecasts excluded)
MS = 13.09 DF = 358
Modified Box-Pierce (Ljung-Box) Chi-Square statistic
Lag 12 24 36 48
Chi-Square 14.8 32.8 65.3 78.0
DF 10 22 34 46
P-Value 0.141 0.064 0.001 0.002
Forecasts from period 360
95 Percent Limits
Period Forecast Lower Upper Actual
361 −4.6296 −11.7215 2.4623
362 −4.1159 −15.9676 7.7357
363 −3.5134 −18.7995 11.7726
364 −2.9284 −20.5811 14.7243
365 −2.4050 −21.6465 16.8365
366 −1.9566 −22.2478 18.3346
367 −1.5821 −22.5595 19.3953
368 −1.2741 −22.6970 20.1487
369 −1.0233 −22.7339 20.6872
370 −0.8204 −22.7162 21.0753
371 −0.6570 −22.6716 21.3577
372 −0.5257 −22.6165 21.5652