This paper demonstrates the use of the Buys-Ballot table to identify the decomposition model by a graphical method when the trend-cycle component is quadratic. A suitable ARIMA model was fitted and used for forecasting. Using the Buys-Ballot technique, the column means, variances and standard deviations were estimated for model identification. The additive model showed no seasonal effect, whereas the multiplicative model did. The illustrative example, using the Nigeria Spot component price of oil (US Dollar per Barrel), showed the additive model to be the appropriate model for decomposition of this series. An AR(2) model was identified as a suitable ARIMA model for the de-trended Nigeria Spot component price of oil and was used to forecast the next twelve months. The expected oil prices were compared with the observed prices. The comparison showed no significant difference between them, using the Mean Absolute Percentage Error (MAPE).

There are two main reasons for time series analysis: 1) pattern identification of a series and 2) forecasting. These goals require that the observed time series data pattern be identified and described [

Descriptive time series analysis, also known as decomposition of time series, is the separation of an observed time series into its components: the trend ($T_t$), the seasonal ($S_t$), the cyclical ($C_t$) and the irregular ($e_t$) components. When the period of time is short, the cyclical component is embedded in the trend, and the observed time series ($X_t$, $t = 1, 2, \ldots, n$) can be decomposed into the trend-cycle component ($N_t$), the seasonal component ($S_t$) and the irregular/residual component ($e_t$) [

There are three main decomposition models in descriptive time series:

Additive Model:

$$X_t = N_t + S_t + e_t \tag{1}$$

Multiplicative Model:

$$X_t = N_t \times S_t \times e_t \tag{2}$$

Pseudo-Additive/Mixed Model:

$$X_t = N_t \times S_t + e_t \tag{3}$$

where, for time point $t$, $N_t$ is the trend-cycle component, $S_t$ is the seasonal component and $e_t$ is the irregular or random component.

The assumption for the additive model (1) is that the irregular/error component $e_t$ is Gaussian $N(0, \sigma_1^2)$ white noise and that the seasonal components sum to zero over a complete period, $\sum_{j=1}^{s} S_j = 0$. For the multiplicative model (2), $e_t$ is Gaussian $N(1, \sigma_2^2)$ white noise and the seasonal components sum to $s$ over a complete period, $\sum_{j=1}^{s} S_j = s$.

The major problem in descriptive time series analysis is the choice of an adequate model for decomposition. The methods used in the literature to choose between the additive, multiplicative and pseudo-additive models are graphical and non-graphical. When the amplitude of both the seasonal and irregular variations does not change as the level of the trend rises or falls, the additive model is adopted. However, when the amplitude of both the seasonal and irregular variations increases as the level of the trend rises, the multiplicative model is adopted [

The method of coefficient of variation of the seasonal quotients and differences was proposed by Justo and Rivera [

The objective of this paper is to identify and remove the quadratic trend curve of a time series using the identified decomposition model (additive or multiplicative), then fit an ARIMA model to the de-trended series and use the fitted model for forecasting. The significance of this paper is that it should improve the analyst's confidence in choosing a suitable decomposition model for a time series when the trend is quadratic.

A Buys-Ballot table summarizes time series data arranged in $m$ rows and $s$ columns to expose possible seasonal variation. In order to analyze the data, it is necessary to include the period and seasonal totals ($T_{i.}$ and $T_{.j}$), the period and seasonal averages ($\bar{X}_{i.}$ and $\bar{X}_{.j}$), and the grand total and mean ($T_{..}$ and $\bar{X}_{..}$). Wold [

For better understanding of

Period (i) \ Season (j) | 1 | 2 | … | j | … | s | $T_{i.}$ | $\bar{X}_{i.}$ | $\hat{\sigma}_{i.}$ |
---|---|---|---|---|---|---|---|---|---|
1 | $X_1$ | $X_2$ | … | $X_j$ | … | $X_s$ | $T_{1.}$ | $\bar{X}_{1.}$ | $\hat{\sigma}_{1.}$ |
2 | $X_{s+1}$ | $X_{s+2}$ | … | $X_{s+j}$ | … | $X_{2s}$ | $T_{2.}$ | $\bar{X}_{2.}$ | $\hat{\sigma}_{2.}$ |
3 | $X_{2s+1}$ | $X_{2s+2}$ | … | $X_{2s+j}$ | … | $X_{3s}$ | $T_{3.}$ | $\bar{X}_{3.}$ | $\hat{\sigma}_{3.}$ |
… | … | … | … | … | … | … | … | … | … |
i | $X_{(i-1)s+1}$ | $X_{(i-1)s+2}$ | … | $X_{(i-1)s+j}$ | … | $X_{(i-1)s+s}$ | $T_{i.}$ | $\bar{X}_{i.}$ | $\hat{\sigma}_{i.}$ |
… | … | … | … | … | … | … | … | … | … |
m | $X_{(m-1)s+1}$ | $X_{(m-1)s+2}$ | … | $X_{(m-1)s+j}$ | … | $X_{ms}$ | $T_{m.}$ | $\bar{X}_{m.}$ | $\hat{\sigma}_{m.}$ |
$T_{.j}$ | $T_{.1}$ | $T_{.2}$ | … | $T_{.j}$ | … | $T_{.s}$ | $T_{..}$ | - | - |
$\bar{X}_{.j}$ | $\bar{X}_{.1}$ | $\bar{X}_{.2}$ | … | $\bar{X}_{.j}$ | … | $\bar{X}_{.s}$ | - | $\bar{X}_{..}$ | - |
$\hat{\sigma}_{.j}$ | $\hat{\sigma}_{.1}$ | $\hat{\sigma}_{.2}$ | … | $\hat{\sigma}_{.j}$ | … | $\hat{\sigma}_{.s}$ | - | - | $\hat{\sigma}_{..}$ |

Source: Iwueze and Nwogu [

$$T_{.j} = \sum_{i=1}^{m} X_{(i-1)s+j}, \quad j = 1, 2, \ldots, s \tag{4}$$

$$\bar{X}_{.j} = \frac{T_{.j}}{m} = \frac{1}{m} \sum_{i=1}^{m} X_{(i-1)s+j}, \quad j = 1, 2, \ldots, s \tag{5}$$

$$\hat{\sigma}_{.j} = \sqrt{\frac{1}{m-1} \sum_{i=1}^{m} \left( X_{(i-1)s+j} - \bar{X}_{.j} \right)^2}, \quad j = 1, 2, \ldots, s \tag{6}$$

where $X_t$, $t = 1, 2, \ldots, n$, is the series, $m$ is the number of periods/years, $s$ is the periodicity, and $n = ms$ is the total number of observations (sample size).

Here $T_{.j}$ is the total for the $j$th season, $\bar{X}_{.j}$ is the average of the $j$th season, and $\hat{\sigma}_{.j}$ is the standard deviation for the $j$th season.
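As a minimal sketch of Equations (4) to (6), the seasonal totals, means and standard deviations can be computed by reshaping the series into an $m \times s$ array. The function name `buys_ballot_summary` and the toy series are illustrative assumptions, not part of the original analysis.

```python
import numpy as np

def buys_ballot_summary(x, s):
    """Arrange a series of length n = m*s into an m x s Buys-Ballot table
    and return the seasonal (column) totals, means and standard deviations."""
    x = np.asarray(x, dtype=float)
    if x.size % s != 0:
        raise ValueError("series length must be a multiple of the periodicity s")
    table = x.reshape(-1, s)          # row i, column j holds X_{(i-1)s + j}
    totals = table.sum(axis=0)        # T_.j, Equation (4)
    means = table.mean(axis=0)        # Xbar_.j, Equation (5)
    sds = table.std(axis=0, ddof=1)   # sigma_hat_.j, Equation (6), divisor m - 1
    return table, totals, means, sds

# Hypothetical toy series with m = 3 periods and s = 4 seasons.
toy = [1, 2, 3, 4, 2, 4, 6, 8, 3, 6, 9, 12]
table, totals, means, sds = buys_ballot_summary(toy, s=4)
print(totals, means, sds)
```

Row summaries ($T_{i.}$, $\bar{X}_{i.}$, $\hat{\sigma}_{i.}$) follow the same pattern with `axis=1`.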

For the additive model:

Let $X_t = a + bt + ct^2 + S_t + e_t$ be the observation of the time series at time $t$.

Define $\bar{X}_{.j}$ as the column mean and $\sigma_{.j}^2$ as the column variance for the additive model.

We can write $t = (i-1)s + j$ in terms of the row ($i$) and column ($j$) of the Buys-Ballot table.

For the multiplicative model:

Let $X_t = (a + bt + ct^2) \times S_t \times e_t$ be the observation of the time series at time $t$.

Define $\bar{X}_{.j}$ as the column mean and $\sigma_{.j}^2$ as the column variance for the multiplicative model.

We can write $t = (i-1)s + j$ in terms of the row ($i$) and column ($j$) of the Buys-Ballot table [
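The contrast between the two models can be illustrated with a small noise-free simulation. This is only a sketch under stated assumptions: the trend coefficients, the sinusoidal seasonal pattern and the helper names are hypothetical and are not taken from the paper's data. It shows that, with a quadratic trend, the column standard deviations of the additive series do not depend on the seasonal component, while those of the multiplicative series are scaled by it.

```python
import numpy as np

m, s = 10, 12                                  # hypothetical: 10 years, monthly
t = np.arange(1, m * s + 1, dtype=float)
trend = 10 + 0.5 * t + 0.01 * t**2             # hypothetical a + b*t + c*t^2
j = np.arange(1, s + 1)
pattern = 0.3 * np.sin(2 * np.pi * j / s)      # sums to zero over one period

# Noise-free series (e_t = 0 for additive, e_t = 1 for multiplicative).
x_add = trend + np.tile(20 * pattern, m)       # additive: seasonal effects sum to 0
x_mul = trend * np.tile(1 + pattern, m)        # multiplicative: indices sum to s

def column_sd(x):
    """Column (seasonal) standard deviations of the m x s Buys-Ballot table."""
    return x.reshape(m, s).std(axis=0, ddof=1)

def coefficient_of_variation(v):
    return v.std() / v.mean()

cv_add = coefficient_of_variation(column_sd(x_add))
cv_mul = coefficient_of_variation(column_sd(x_mul))
print(f"spread of column SDs: additive {cv_add:.3f}, multiplicative {cv_mul:.3f}")
```

In the additive case the seasonal term is constant within each column, so it drops out of the column variance and only the trend drives the small variation across columns.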

Several papers in the statistics literature have discussed the use of time plot of the entire series to make the appropriate choice between additive and multiplicative models. This makes a review of some of the works done as regards the choice of models in a descriptive time series analysis necessary in order to highlight the import of this study.

The time plot of a series can be used to choose between additive and multiplicative models. If the seasonal variation stays roughly the same size regardless of the mean level, then it is additive, but if it increases in size in direct proportion to the mean level, then it is said to be multiplicative [

Iwueze et al. [

Figures 1-4 illustrate two time series with their trend components, for which the appropriate model can be easily decided. In the first case, the additive model is appropriate because the differences between the trend and the observed data (the seasonal differences) for the same periods in different years are almost the same. In the second case, the multiplicative model is appropriate because the ratios of the trend and the observed data (the seasonal indices) for the same periods in different years are almost the same. Thus, the appropriate model is either additive or multiplicative.

Frequently, after achieving stationarity, a time series contains AR(p) and MA(q) components of certain orders, which can be combined and used for forecasting. The combination is also called a mixed process. Thus, a mixture of an autoregressive process of order $p$, AR(p), and a moving average process of order $q$, MA(q), denoted ARMA(p, q), is of the form

$$X_t = \phi_1 X_{t-1} + \phi_2 X_{t-2} + \cdots + \phi_p X_{t-p} + e_t + \theta_1 e_{t-1} + \theta_2 e_{t-2} + \cdots + \theta_q e_{t-q} \tag{7}$$

$$X_t - \phi_1 X_{t-1} - \phi_2 X_{t-2} - \cdots - \phi_p X_{t-p} = e_t + \theta_1 e_{t-1} + \theta_2 e_{t-2} + \cdots + \theta_q e_{t-q}$$

$$\left(1 - \phi_1 B - \phi_2 B^2 - \cdots - \phi_p B^p\right) X_t = \left(1 + \theta_1 B + \theta_2 B^2 + \cdots + \theta_q B^q\right) e_t$$

$$\phi(B) X_t = \theta(B) e_t \tag{8}$$

where $\phi(B) = 1 - \phi_1 B - \phi_2 B^2 - \cdots - \phi_p B^p$, $\theta(B) = 1 + \theta_1 B + \theta_2 B^2 + \cdots + \theta_q B^q$, and $e_t$ is a sequence of independently and identically distributed (iid) random variables with $E(e_t) = 0$ and $\mathrm{Var}(e_t) = \sigma_e^2$.

For stationarity, the roots of $\phi(B) = 0$ must all lie outside the unit circle, and for invertibility, the roots of $\theta(B) = 0$ must all lie outside the unit circle.
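The stationarity condition can be checked numerically by finding the roots of $\phi(B)$. The sketch below is illustrative only: the helper `is_stationary` is a hypothetical name, and the AR(2) coefficients are the ones reported in the Minitab output of Appendix D.

```python
import numpy as np

def is_stationary(phi):
    """Return (flag, roots), where flag is True when every root of
    phi(B) = 1 - phi_1 B - ... - phi_p B^p lies strictly outside the unit circle."""
    # numpy.roots expects coefficients ordered from the highest power to the constant.
    coeffs = [-c for c in phi[::-1]] + [1.0]
    roots = np.roots(coeffs)
    return bool(np.all(np.abs(roots) > 1.0)), roots

# AR(2) coefficients as reported in Appendix D.
ok, roots = is_stationary([1.3389, -0.4315])
print(ok, np.abs(roots))
```

The same function applied to the moving average polynomial, with the signs of the $\theta$ coefficients reversed to match $\theta(B) = 1 + \theta_1 B + \cdots$, would check invertibility.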

For the ARMA(p, q) process, there are $q$ autocorrelations, $\rho_1, \rho_2, \ldots, \rho_q$, whose values depend directly on the choice of the $q$ moving average parameters $\theta_1, \theta_2, \ldots, \theta_q$, as well as on the $p$ autoregressive parameters $\phi_1, \phi_2, \ldots, \phi_p$.

When more than one model is selected from the process enumerated in Equation (8), the Akaike’s Information Criterion (AIC) is then used to select the most suitable model amongst them. The Akaike’s Information Criterion is most commonly given as:

$$\mathrm{AIC} = N \log \hat{\sigma}^2 + 2r \tag{9}$$

where $r$ is the number of model parameters, $N$ is the effective number of data points used in the estimation procedure and $\hat{\sigma}^2$ is the estimated residual variance (mean squared error, MSE) [
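Equation (9) is simple to evaluate directly. The sketch below assumes the natural logarithm, which reproduces the AIC values reported in the model comparison table of this paper; the residual variances and parameter counts are copied from that table, and the function name `aic` is an illustrative choice.

```python
import math

def aic(n, sigma2, r):
    """AIC = N * log(sigma^2) + 2r, with the natural logarithm (assumed)."""
    return n * math.log(sigma2) + 2 * r

# (residual variance, number of parameters) for a few of the fitted models.
candidates = {
    "AR(1)": (16.03, 1),
    "AR(2)": (13.09, 2),
    "MA(5)": (14.12, 5),
    "ARMA(1, 1)": (13.81, 2),
}
scores = {name: aic(360, s2, r) for name, (s2, r) in candidates.items()}
best = min(scores, key=scores.get)   # model with the smallest AIC
for name, value in scores.items():
    print(f"{name}: {value:.2f}")
print("selected:", best)
```

The model with the smallest AIC is preferred, since the $2r$ term penalizes additional parameters.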

1) Estimated Errors

To gauge the accuracy of our estimates, the estimated errors are used to compare the expected estimated forecast values with the observed values for 2013. This is done by subtracting the estimated forecast values (EFV) from the original (actual) values (AV) to obtain the estimated errors [

$$e_i = \mathrm{AV}_i - \mathrm{EFV}_i, \quad i = 1, 2, \ldots, v \tag{10}$$

The accuracy measure used in this paper is the Mean Absolute Percentage Error (MAPE).

2) Mean Absolute Percentage Error (MAPE)

This accounts for the percentage of deviation between the actual values and estimates [

$$\mathrm{MAPE} = 100 \times \left[ \frac{1}{v} \sum_{i=1}^{v} \left| \frac{e_i}{\mathrm{AV}_i} \right| \right], \quad \mathrm{AV}_i \neq 0 \tag{11}$$

where $v$ is the number of forecast values.

This method is now illustrated in Section 3 with real-life time series data.

The section is divided into two parts: 1) choosing between the additive and multiplicative models for time series decomposition; 2) ARIMA modeling and forecasting.

Nigeria Spot component price of oil (US Dollar per Barrel) from 1983-2013 data (Appendix A) was applied to the Buys-Ballot table to ascertain if the additive or multiplicative models should be used for Time Series decomposition. The column means, variances and standard deviation were obtained. Then, the trend behaviour using means and standard deviation is shown below in

The time plot for means and standard deviation in

This section is divided into three parts: de-trending the series using the appropriate decomposition model (i.e., the additive model), ARMA modeling of the series, and forecasting.

Data: Nigeria Spot component price of oil (US Dollar per Barrel) (1983-2013)

The Nigeria Spot component price of oil (US Dollar per Barrel) is a monthly data comprising 372 data points given in 31 years (1983-2013). Appendix A shows a complete presentation of the data in Buys-Ballot table form and its series plot is shown as _{t}) and the last year (2013) was used for forecasts comparison.

Examining _{0} in other to identify which curve represents the trend component in the NSCPO series, using R-square (R^{2}) ( [^{2} is close to 1, with its entire coefficient being significant is the best trend curve.

The general polynomial trend is expressed as:

where, m is the order of the fitted polynomial at which R^{2} is close to 1.

We used Equation (12) to estimate the coefficients of the polynomials, C k , k = 1 , 2 , ⋯ , m in

method to fit the coefficients of the polynomials of order m with the help of Minitab 17 in Appendix B, summarized in

From ^{2} = 0.86%. Another reason why the polynomials of order three and four cannot be used for this analysis is that they fit negative values to values that are positive. Thus, we conclude that the trend component is represented by the quadratic trend curve.

Order m | $\hat{C}_0$ | $\hat{C}_1$ | $\hat{C}_2$ | $\hat{C}_3$ | $\hat{C}_4$ | $R^2$ | Remark |
---|---|---|---|---|---|---|---|
1 | −1.033 (0.636) | 0.219 (0.000) | | | | 55.0% | Linear trend |
**2** | **37.781 (0.000)** | **−0.4242 (0.000)** | **$1.782 \times 10^{-3}$ (0.000)** | | | **86.4%** | **Quadratic trend** |
3 | 30.198 (0.000) | −0.1739 (0.000) | $5.07 \times 10^{-5}$ (0.888)† | $3.20 \times 10^{-6}$ (0.000) | | 87.3% | Cubic trend |
4 | 26.462 (0.000) | 0.0305 (0.786)† | −0.0025 (0.05)† | $1.413 \times 10^{-5}$ (0.000) | $-2.0 \times 10^{-8}$ (0.036) | 87.4% | Quartic trend |

Footnote: entries are estimated coefficients with p-values in parentheses. †p-values are greater than the appropriate critical value (0.05). The bold trend is the optimal order (m = 2) identified.
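The order-selection step can be sketched as follows. The paper's fits were produced in Minitab 17 on the NSCPO data, which are not reproduced here, so this example uses a synthetic series with hypothetical coefficients and noise level; the helper `r_squared` is also an illustrative name. It fits polynomials of order 1 to 4 by least squares and compares $R^2$.

```python
import numpy as np

def r_squared(y, fitted):
    """Coefficient of determination for a least-squares fit."""
    ss_res = np.sum((y - fitted) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic stand-in for a monthly series with a quadratic trend (hypothetical).
rng = np.random.default_rng(0)
t = np.arange(1, 361, dtype=float)
y = 38 - 0.42 * t + 0.0018 * t**2 + rng.normal(0, 10, t.size)

u = t / t.max()          # rescale time to keep the polynomial fit well conditioned
r2 = {}
for order in range(1, 5):
    coeffs = np.polyfit(u, y, deg=order)
    r2[order] = r_squared(y, np.polyval(coeffs, u))
    print(f"order {order}: R^2 = {r2[order]:.3f}")
```

Because the models are nested, $R^2$ can only increase with the order, so a small gain beyond a given order, together with non-significant coefficients, is what argues for stopping there.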

The fitted polynomial (quadratic trend; m = 2) is given as

By substitution of the coefficients in

$$\hat{X}_t = 37.781 - 0.4242\,t + \left(1.782 \times 10^{-3}\right) t^2, \quad t = 1, 2, \ldots, 360 \tag{14}$$

Next, since the trend curve is quadratic, we used the additive model identified in Section 3.1, represented by Equation (1), to remove the trend from the NSCPO series.

Additive Model Decomposition:

$$Y_t = X_t - \hat{X}_t, \quad t = 1, 2, \ldots, 360 \tag{15}$$

Then,

$$Y_t = X_t - \left(37.781 - 0.4242\,t + \left(1.782 \times 10^{-3}\right) t^2\right), \quad t = 1, 2, \ldots, 360 \tag{16}$$
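The de-trending step and the extension of the trend into the forecast year can be sketched as below. The coefficients are the unrounded values from the Minitab trend analysis in Appendix D; the function names and the idea of passing the observed series as a plain list are illustrative assumptions.

```python
def quadratic_trend(t, a=37.7815, b=-0.424231, c=0.00178209):
    """Fitted quadratic trend; coefficients from the Minitab output in Appendix D."""
    return a + b * t + c * t * t

def detrend(x, start=1):
    """Additive de-trending, Equation (15): Y_t = X_t - Xhat_t."""
    return [value - quadratic_trend(start + k) for k, value in enumerate(x)]

# Trend values for the 2013 hold-out year, t = 361, ..., 372.
trend_2013 = [quadratic_trend(t) for t in range(361, 373)]
for t, value in zip(range(361, 373), trend_2013):
    print(t, round(value, 3))
```

The values agree with the trend forecasts listed in Appendix D, for example about 116.878 at $t = 361$ and about 126.580 at $t = 372$.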

From

It is also appropriate to test whether a constant mean $\mu$ should be included in the model. In this case, the hypothesis of interest is

$$H_0: \mu = 0 \quad \text{against} \quad H_1: \mu \neq 0 \tag{17}$$

The test statistic for testing $H_0: \mu = 0$ against $H_1: \mu \neq 0$ is

$$t = \frac{\bar{Y}_t}{\operatorname{std}(Y_t)} \tag{18}$$

The computed t-value is $t = -9.32 \times 10^{-7}$ with p-value = 1.000 (see Appendix C). Since this p-value is greater than the significance level of 0.05, we fail to reject $H_0: \mu = 0$. This implies that $\mu$ should not be included in the model.
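A t-value this close to zero is what one would expect here: residuals from a least-squares fit that includes an intercept sum to zero by construction. The sketch below illustrates this on a synthetic series (hypothetical coefficients and noise), and it assumes the usual one-sample form of the statistic, with the standard error $s/\sqrt{n}$ in the denominator.

```python
import numpy as np

# Synthetic stand-in for the observed series (hypothetical values).
rng = np.random.default_rng(1)
t = np.arange(1, 361, dtype=float)
x = 38 - 0.42 * t + 0.0018 * t**2 + rng.normal(0, 10, t.size)

u = t / t.max()                      # rescaled time for a well-conditioned fit
coeffs = np.polyfit(u, x, deg=2)     # quadratic fit, intercept included
y = x - np.polyval(coeffs, u)        # de-trended series (least-squares residuals)

n = y.size
t_stat = y.mean() / (y.std(ddof=1) / np.sqrt(n))
print(f"mean of de-trended series: {y.mean():.2e}, t = {t_stat:.2e}")
```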

Various ARMA(p, q) models were fitted to Equation (16) with respective residuals being white noise and the summary is shown in

The identified model using Akaike’s Information Criterion in

$$Y_t = \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + e_t \tag{19}$$

Estimation of the Parameter of the Identified AR(2) for (16)

Estimates were obtained by use of Minitab 17 software and the results are tabulated in

The residuals ACF and PACF in

Model | r | $\hat{\sigma}^2$ | N | AIC |
---|---|---|---|---|
AR(1) | 1 | 16.03 | 360 | 1000.81 |
AR(2) | 2 | 13.09 | 360 | 929.87 |
MA(5) | 5 | 14.12 | 360 | 963.13 |
ARMA(1, 1) | 2 | 13.81 | 360 | 949.14 |
ARMA(1, 2) | 3 | 13.24 | 360 | 935.97 |
ARMA(1, 3) | 4 | 13.15 | 360 | 935.51 |
ARMA(1, 4) | 5 | 13.14 | 360 | 937.24 |
ARMA(1, 5) | 6 | 13.05 | 360 | 936.76 |

AR(2) Model |
---|
$\phi_1 = 1.3389 \pm 0.0477$ |
$\phi_2 = -0.4315 \pm 0.0477$ |
$\sigma^2 = 13.09$ |

Footnote: values after (±) are standard errors.

k | df | AR(2) $Q(k)$ | Chi-square |
---|---|---|---|
12 | 10 | 14.8 | 18.3 |
24 | 22 | 32.8 | 33.9 |
36 | 34 | 55.3 | 58.8 |
48 | 46 | 67.0 | 67.5 |

Footnote: k is the lag.

In

We obtained forecasts from the AR(2) model in Equation (19) for 12 months, starting at origin point 360. The $l$-step forecasts, denoted by $\hat{Y}_t(l_i)$ for $i = 1, 2, \ldots, 12$, with their 95% confidence intervals are shown in Appendix D. Similarly, the forecast values using Equation (19) were obtained. The expected estimated forecast values for the NSCPO series ($X_t$), i.e. $X_t = Y_t + \hat{X}_t$, and the estimated errors ($e_i$) between the actual values for the year 2013 and the expected estimated forecast values, using Equation (14), are shown in
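For an AR(2) model without a constant, multi-step forecasts follow the recursion $\hat{Y}(l) = \phi_1 \hat{Y}(l-1) + \phi_2 \hat{Y}(l-2)$, because future errors have expectation zero. The sketch below is illustrative: the de-trended values at $t = 359$ and $t = 360$ are not listed in the text, so the recursion is seeded with the first two forecasts reported in Appendix D (periods 361 and 362), and the function name is an assumption.

```python
def ar2_forecasts(last_two, phi1, phi2, steps):
    """Iterate Y(l) = phi1 * Y(l-1) + phi2 * Y(l-2).
    last_two = (second most recent value, most recent value)."""
    prev2, prev1 = last_two
    out = []
    for _ in range(steps):
        nxt = phi1 * prev1 + phi2 * prev2
        out.append(nxt)
        prev2, prev1 = prev1, nxt
    return out

# Seeds: reported forecasts for periods 361 and 362 (Appendix D).
seed = (-4.6296, -4.1159)
later = ar2_forecasts(seed, phi1=1.3389, phi2=-0.4315, steps=10)  # periods 363..372
for period, value in zip(range(363, 373), later):
    print(period, round(value, 3))
```

The recursion tracks the Minitab forecasts to within rounding of the four-decimal coefficients, and the forecasts decay toward zero, so the combined forecast $Y_t + \hat{X}_t$ converges to the quadratic trend.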

In

Period (t) | Month | Actual (2013) | AR(2) $Y_t$ | Quadratic trend $\hat{X}_t$ | Expected estimated forecast value $X_t = Y_t + \hat{X}_t$ | Estimated error $e_i$ |
---|---|---|---|---|---|---|
361 | January | 115.41 | −4.6296 | 116.88 | 112.25 | 3.16 |
362 | February | 118.69 | −4.1159 | 117.74 | 113.62 | 5.07 |
363 | March | 110.57 | −3.5135 | 118.61 | 115.10 | −4.53 |
364 | April | 105.17 | −2.9284 | 119.48 | 116.55 | −11.38 |
365 | May | 105.83 | −2.4050 | 120.36 | 117.96 | −12.13 |
366 | June | 106.12 | −1.9566 | 121.24 | 119.28 | −13.16 |
367 | July | 110.21 | −1.5821 | 122.12 | 120.54 | −10.33 |
368 | August | 113.62 | −1.2741 | 123.00 | 121.73 | −8.11 |
369 | September | 114.3 | −1.0233 | 123.89 | 122.87 | −8.57 |
370 | October | 112.44 | −0.8204 | 124.78 | 123.96 | −11.52 |
371 | November | 111.47 | −0.6570 | 125.68 | 125.02 | −13.55 |
372 | December | 113.11 | −0.5257 | 126.58 | 126.05 | −12.94 |
MAPE | | | | | | 8.64 |
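The MAPE in the last row can be reproduced from the actual and expected values in the table using Equation (11), taking absolute values of the percentage errors. The sketch below copies those two columns; the function name `mape` is an illustrative choice.

```python
def mape(actual, forecast):
    """Mean Absolute Percentage Error, Equation (11); requires non-zero actuals."""
    if any(a == 0 for a in actual):
        raise ValueError("actual values must be non-zero")
    return 100 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)

# Actual 2013 prices and expected estimated forecast values from the table.
actual = [115.41, 118.69, 110.57, 105.17, 105.83, 106.12,
          110.21, 113.62, 114.30, 112.44, 111.47, 113.11]
expected = [112.25, 113.62, 115.10, 116.55, 117.96, 119.28,
            120.54, 121.73, 122.87, 123.96, 125.02, 126.05]

print(round(mape(actual, expected), 2))
```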

The Buys-Ballot procedure has shown that 1) the column means and variances are not the same for the two models (additive and multiplicative), and 2) when the trend is quadratic, the column variances mimic the shape of the trending series for both the additive and multiplicative models in the illustrative example. The illustrative example, using the Nigeria Spot component price of oil (US Dollar per Barrel), showed the additive model to be the appropriate model for decomposition of the series, based on the relationship between the means and standard deviations. Finally, AR(2) was found to be adequate for the series under consideration and was therefore used for forecasting. The comparison of the expected and observed prices showed no significant difference between them, with a Mean Absolute Percentage Error (MAPE) of 8.64%. Hence, we conclude that the additive model should be adopted when the trend component is quadratic (i.e., when the variation does not change as the level of the trend rises or falls).

The authors declare no conflicts of interest regarding the publication of this paper.

Emmanuel, B.O., Enegesele, D. and Arimie, C.O. (2020) Additive Decomposition with Arima Model Forecasts When the Trend Component Is Quadratic. Open Access Library Journal, 7: e6435. https://doi.org/10.4236/oalib.1106435

Buys-Ballot Table for Nigeria Spot component price of oil (US Dollar per Barrel)

Source: Central Bank of Nigeria Statistical Bulletin 2014.

Regression Analysis: NSCPO versus t

The regression equation is

NSCPO = −1.03 + 0.219t

Predictor | Coef | SE Coef | T | P |
---|---|---|---|---|
Constant | −1.033 | 2.181 | −0.47 | 0.636 |
t | 0.21910 | 0.01047 | 20.93 | 0.000 |

S = 20.6454, R-Sq = 55.0%, R-Sq(adj) = 54.9%

Analysis of Variance

Source | DF | SS | MS | F | P |
---|---|---|---|---|---|
Regression | 1 | 186,648 | 186,648 | 437.90 | 0.000 |
Residual Error | 358 | 152,592 | 426 | | |
Total | 359 | 339,240 | | | |

Regression Analysis: NSCPO versus t, t^{2}

The regression equation is

NSCPO = 37.8 − 0.424 t + 0.00178 t^{2}

Predictor | Coef | SE Coef | T | P |
---|---|---|---|---|
Constant | 37.781 | 1.803 | 20.95 | 0.000 |
t | −0.42423 | 0.02307 | −18.39 | 0.000 |
t^{2} | 0.00178209 | 0.00006188 | 28.80 | 0.000 |

S = 11.3404, R-Sq = 86.5%, R-Sq(adj) = 86.4%

Analysis of Variance

Source | DF | SS | MS | F | P |
---|---|---|---|---|---|
Regression | 2 | 293,328 | 146,664 | 1140.43 | 0.000 |
Residual Error | 357 | 44,912 | 129 | | |
Total | 359 | 339,240 | | | |

Regression Analysis: NSCPO versus t, t^{2}, t^{3}

The regression equation is

NSCPO = 30.2 − 0.174t + 0.000051t^{2} + 0.000003t^{3}

Predictor | Coef | SE Coef | T | P |
---|---|---|---|---|
Constant | 30.198 | 2.343 | 12.89 | 0.000 |
t | −0.17388 | 0.05612 | −3.10 | 0.002 |
t^{2} | 0.0000507 | 0.0003610 | 0.14 | 0.888 |
t^{3} | 0.00000320 | 0.00000066 | 4.86 | 0.000 |

S = 10.9968, R-Sq = 87.3%, R-Sq(adj) = 87.2%

Analysis of Variance

Source | DF | SS | MS | F | P |
---|---|---|---|---|---|
Regression | 3 | 296,189 | 98,730 | 816.42 | 0.000 |
Residual Error | 356 | 43,051 | 121 | | |
Total | 359 | 339,240 | | | |

Regression Analysis: NSCPO versus t, t^{2}, t^{3}, t^{4}

The regression equation is

NSCPO = 26.5 + 0.031t − 0.00249 t^{2} + 0.000014t^{3} − 0.000000t^{4}

Predictor | Coef | SE Coef | T | P |
---|---|---|---|---|
Constant | 26.462 | 2.933 | 9.02 | 0.000 |
t | 0.0305 | 0.1122 | 0.27 | 0.786 |
t^{2} | −0.002489 | 0.001262 | −1.97 | 0.049 |
t^{3} | 0.00001413 | 0.00000525 | 2.69 | 0.007 |
t^{4} | −0.00000002 | 0.00000001 | −2.10 | 0.036 |

S = 10.9445, R-Sq = 87.5%, R-Sq(adj) = 87.3%

Analysis of Variance

Source | DF | SS | MS | F | P |
---|---|---|---|---|---|
Regression | 4 | 296,717 | 74,179 | 619.28 | 0.000 |
Residual Error | 355 | 42,523 | 120 | | |
Total | 359 | 339,240 | | | |

One-Sample T: Additive Y_{t}

Test of mu = 0 vs not = 0

Trend Analysis for NSCPO (Quadratic Trend X ^ t )

Data NSCPO

Length 360

NMissing 0

Fitted Trend Equation

Y_{t} = 37.7815 − 0.424231t + 0.00178209t^{2}

Accuracy Measures

MAPE 24.532

MAD 7.733

MSD 127.532

Forecasts for 2013

Period | Forecast |
---|---|
May | 116.878 |
Jun | 117.742 |
Jul | 118.610 |
Aug | 119.481 |
Sep | 120.356 |
Oct | 121.235 |
Nov | 122.117 |
Dec | 123.002 |
Jan | 123.892 |
Feb | 124.784 |
Mar | 125.681 |
Apr | 126.580 |

AR(2) Y_{t} Forecasts for 2013

ARIMA Model: Additive Y_{t}

Final Estimates of Parameters

Type | Coef | SE Coef | T | P |
---|---|---|---|---|
AR 1 | 1.3389 | 0.0477 | 28.08 | 0.000 |
AR 2 | −0.4315 | 0.0477 | −9.05 | 0.000 |

Number of observations: 360

Residuals: SS = 4685.13 (backforecasts excluded)

MS = 13.09 DF = 358

Modified Box-Pierce (Ljung-Box) Chi-Square statistic

Lag | 12 | 24 | 36 | 48 |
---|---|---|---|---|
Chi-Square | 14.8 | 32.8 | 65.3 | 78.0 |
DF | 10 | 22 | 34 | 46 |
P-Value | 0.141 | 0.064 | 0.001 | 0.002 |

Forecasts from period 360

Period | Forecast | Lower (95 Percent Limit) | Upper (95 Percent Limit) | Actual |
---|---|---|---|---|
361 | −4.6296 | −11.7215 | 2.4623 | |
362 | −4.1159 | −15.9676 | 7.7357 | |
363 | −3.5134 | −18.7995 | 11.7726 | |
364 | −2.9284 | −20.5811 | 14.7243 | |
365 | −2.4050 | −21.6465 | 16.8365 | |
366 | −1.9566 | −22.2478 | 18.3346 | |
367 | −1.5821 | −22.5595 | 19.3953 | |
368 | −1.2741 | −22.6970 | 20.1487 | |
369 | −1.0233 | −22.7339 | 20.6872 | |
370 | −0.8204 | −22.7162 | 21.0753 | |
371 | −0.6570 | −22.6716 | 21.3577 | |
372 | −0.5257 | −22.6165 | 21.5652 | |