Time Series Forecasting Models for S&P 500 Financial Turbulence ()
1. Introduction
Understanding and predicting stock price developments and their causes has always been desirable in the financial world. Nowadays, most financial institutions make use of at least one type of forecasting method for their portfolio and risk management. Though Monte Carlo simulations with the use of Copulas may be the best way to have a full picture of all future possibilities, it does not provide the one-value results needed for planning ahead and showing forecasted results to ordinary investors without much knowledge about statistics. To achieve such one-value results, closed form forecasting methods are commonly used.
Not only should financial institutions try to forecast stock price developments, but they should also try to forecast risk parameters to better understand their portfolio risk level, and in addition make use of these forecasted risk parameters to better forecast stock price developments. Presumably, the most used risk parameter in portfolio and risk management is stock volatility. Stock volatility is so influential that a big part of the existing financial literature was dedicated to examining its predictability using various forecasting methods [1] - [8] .
Nonetheless, in the last decade two new promising risk parameters were discovered [9] [10] . These are Financial Turbulence (FT) and Absorption Ratio (AR). FT is given by Equation (1), whereas AR is given by Equation (2):
(1)
where,
= turbulence for a particular time period t (scalar)
= vector of asset returns for period t (1 × n vector)
= sample average vector of historical returns (1 × n vector)
= sample covariance matrix of historical returns (n × n matrix)
(2)
where,
= Absorption Ratio
= variance of the i-th eigen vector, sometimes called eigenportfolio
= variance of the j-th asset
= number of eigenvectors used to calculate AR
= number of assets
Salisu, Demirer & Gupta [11] showed that the use of these new financial indicators can indeed improve out-of-sample predictive performance of stock market volatility models over both the short and long time-horizon. Their use also extends to portfolio management [12] [13] [14] .
Despite the potential of FT and AR, there is still no research regarding their predictability through the use of forecasting methods in the scientific literature, which is not the case for other economic and financial indicators for which many similar papers to this paper exist [15] - [26] . Similarly to the stock volatility, forecasting FT and/or AR can be a great advantage for portfolio and risk management [11] [12] [13] [14] ; and thus, knowing how to best forecast these risk parameters is crucial for the success of financial institutions that would like to exploit these new risk parameters. As a result, this paper is dedicated to explore FT of the famous S&P 500 index predictability with the most common quantitative forecasting methods, namely Autoregressive model (AR(p)), Moving Average model (MA(q)), Autoregressive Integrated Moving Average model (ARIMA(p, d, q)), and Normal Dynamic Linear Model (NDLM(k)). Since this would be the first time a paper is covering this topic, the results of this paper will give financial institutions and individual investors the answers for the current questions regarding the predictability of the FT for the S&P 500 index through the use of common forecasting methods:
1) What is the best time series forecasting method for the short-term forecast of the S&P 500 FT among the most common time series forecasting methods?
2) How accurate and reliable is the best time series forecasting method for the short-term forecast of the S&P 500 FT?
3) What is the best time series forecasting method for the long-term forecast of the S&P 500 FT among the most common time series forecasting methods?
4) How accurate and reliable is the best time series forecasting method for the long-term forecast of the S&P 500 FT?
Unfortunately, due to time constraints, it was not possible to perform the same research with AR; yet, this paper’s author strongly encourages the scientific community to do the same research with AR, and with FT but with different forecasting methods or stock indexes.
2. Data and Methodology
2.1. Data
The data set used in this research was retrieved from Yahoo Finance through the use of the Python library yfinance. The time horizon was 4 years for the in-sample data and 1 year for the out-sample data. The in-sample is from 01/11/2017 until 31/10/2021, and the out-sample is from 01/11/2021 until 01/11/2022.
Due to the long-time horizon used in the in-sample, a few stocks that were present in the S&P 500 on 01/11/2022 do not have the historical data for the whole timeframe of the in-sample data. For the in-sample data, the stocks that did not have data throughout the whole time frame are 1) CARR, 2) CDAY, 3) CEG, 4) CTVA, 5) DOW, 6) FOX, 7) FOXA, 8) MRNA, 9) OGN, 10) OTIS, 11) VICI. Together they represent 2.19% of the total number of S&P 500 stocks and only 1.32% of the total S&P 500 market cap. Therefore, it can be concluded that their absence in the calculations would not have a significant effect on the final results of this research; and thus, they were excluded from the calculations to make the calculations more coherent given the need for the covariance matrix in the FT equation (Equation (1)).
2.2. Methodology
As already stated, four quantitative forecasting methods were used, namely AR(p), MA(q), ARIMA(p, d, q) and NDLM(k). AR(p) explores the fact that many time series phenomena linearly depend on their own previous values and on a time series process [27] , where “p” is the number of previous values considered to predict the value of the next time step. AR(p) is given as:
(3)
where,
= value of the next time step
= model parameters
= previous values
= white noise
The parameters that affect the prediction accuracy of AR(p) are “p” and
.
MA(q), on the other hand, explores the fact that various time series processes have their value of the next time step cross-correlated with a non-identical to itself random-variable [27] . That is, the next time step value linearly depends on the time series mean and the past errors, where “q” designates the number of previous errors that are considered. MA(q) is given by Equation (4):
(4)
where,
= value of the next time step
= mean of
= model parameters
= previous errors
= white noise
The parameters that affect the prediction accuracy of MA(q) are “q” and
.
There is even the possibility of combining these two models to create the ARMA(p, q) model, Yet, the big limitation of AR(p), MA(q), and ARMA(p, q) is that they only work well with linear stationary processes. Yet, some time series processes are non-stationary processes. In order to address this issue, ARIMA(p, d, q) models are frequently used [27] , where “d” is the number of times that the original series has to be differentiated to result in a stationary series (“d” is also known as order of homogeneity) [27] . ARIMA(p, d, q) is given as:
(5)
where,
when
:
, and when
:
, and so on.
The parameters that affect the prediction accuracy of ARIMA(p, d, q) are “p”, “d”, “q”,
and
.
In order to estimate p, q, and d of AR(p), MA(q), and ARIMA(p, d, q), Akaike information criterion (AIC) was used as the criterion selection. Afterwards, to estimate
and
, maximum likelihood estimation (MLE) was used. For both aforementioned processes, the arima function in R was used [28] .
Lastly, NDLM(k) can be used for both stationary and non-stationary series [29] . This is the case since it makes use of a polynomial trend equation to account for the time series process trend and a Fourier form representation to account for the seasonality present in the time series [29] [30] [31] . Moreover, “k” would be the number of parameters in the model, and NDLM(k) makes use of Bayesian inference to find the most likely parameters [29] [30] [31] . NDLM(k) can be represented with the following equations:
(6)
(7)
(8)
where,
= value at time step t
= transposed vector of dimension k composed by 1’s and 0’s
= parameters vector of dimension k
= observation noise
= observation variance
= k × k Jordan matrix
= system noise vector
= system covariance matrix
= conjugate prior distribution for the k parameters
= prior mean vector
= prior covariance matrix
= all information about
NDLM(k) is implemented by updating priors to obtain posteriors using a sequential approach. The posterior distribution is obtained through the Bayes theorem:
(9)
The forecasting function of this model is give as:
(10)
where,
= forecasted value for h time steps ahead
This model is usually represented as
. The assumptions for the use of this model in this research were that the observation variance was known and constant over time,
, that
and
were also constant over time,
and
, and that the system covariance matrix was unknown. Under the assumption that the system covariance matrix is unknown, the following equations hold:
(11)
(12)
(13)
In order to find the most suitable δ, we maximize the following mean standard equation MSE(δ):
(14)
Finally, the forecast function can be generalized thanks to the superposition principle into:
(15)
where,
= forecast function of the polynomial trend model of order k
= forecast function of the Fourier form representation with m used subperiods and with p being the period basis for the Fourier form representation
Furthermore, the time series process trend of the S&P 500 FT was captured by the polynomial trend model, where for k = 1 there is no trend, for k = 2 there is a linear trend, for k = 3 there is a quadratic trend, and so on. Regarding the seasonality of the S&P 500 FT, the Fourier form representation was used, where p is the period after which the seasonality seems to repeat itself and m is the best number subperiods within p that explains the seasonality without adding too much noise. The subperiods, λ, were determined by the equation:
(16)
Additionally, it is important to notice that for
, there is a difference in the aforementioned structure for NDLM(k). This difference being:
(17)
The parameters that affect the prediction accuracy of NDLM(k) are “k”,
,
,
,
(or
if
is unknown). For more details about NDLM(k)’s use, structure and variations, see [29] [30] [31] .
Determining the optimal k for the polynomial trend model is not very challenging. Yet, it cannot be said the same about determining the optimal m and p for the Fourier form representation [29] [30] [31] . Usually, qualitative analysis is needed to better understand the time series seasonality of the studied time series process, and thus determine the optimal k and m for the Fourier form representation [30] [31] .
Information gathered from ARIMA(p, d, q) and a qualitative analysis of the S&P 500 FT time series development were used to make an educated guess about the optimal k for the polynomial trend model and optimal m and p for the Fourier form representation. Afterwards, the R functions dlmModPoly and dlmModTrig were used to estimate the model matrices and update the posterior parameters for the in-sample data for respectively the polynomial trend model and the Fourier form representation [32] [33] . Thereafter, the algorithm that can be found here was used to estimate the posterior parameters for the out-sample data and to forecast the out-sample values.
In order to measure the efficacy of the studied forecasting methods, two forecasting tasks were used. The first one is a one-step ahead forecast and the second one is a full one business year (252 steps ahead) forecast. For the one-step ahead forecast, a rolling forecast was used, and the parameters and the matrices of NDLM(k) were updated at every time step. The parameters for the model AR(q), MA(q) and ARIMA(p, d, q) were also updated at every time step. The one business year forecast, on the other hand, had the parameters and matrices of NDLM(k) estimated with only the in-sample data and the parameters and matrices remained unchanged for the whole forecasted business year. The parameters for the model AR(q), MA(q) and ARIMA(p, d, q) also remained unchanged for the whole forecasted business year. On top of that,
and
used for AR(p), MA(q) and ARIMA(p, d, q) were respectively replaced by the forecasted value (
) and the function:
(18)
where,
= mean of in-sample
= forecasted value at time step
To evaluate each forecasting method performance, the Root-mean-square error/residual (RMSE) and Error/residual standard deviation (RSD) were used. Their equations are given by:
(19)
(20)
where,
= value at time step t
= forecasted value at time step t
= number of business days in the out-sample data (252 in this research)
Under the assumption that the residuals follow a normal distribution, 95% and 99% confidence intervals (CI) were calculated and used in the performance evaluation of each forecasting method. Lastly, a qualitative assessment was performed by visually evaluating the similarity between the actual stochastic development in the out-sample period and the forecasted values.
3. Results
The time-series development for the in-sample data can be found in Figure 1. With this graph, one can confidently affirm that the S&P 500 FT is stationary, though its variance temporally increased in 2020 due to COVID-19.
In Figure 2 and Figure 3, the in-sample Autocorrelation Function (ACF) and Partial Autocorrelation Function (PACF) can be found. It can be seen that the S&P 500 FT has a high autocorrelation, even after 40 lags there is still a significant autocorrelation (i.e. an autocorrelation higher than 0.1). Therefore, it is a priori expected that MA(q) will have a high number of parameters. Regarding the PACF, the partial autocorrelation stops having a significant value (i.e. greater than 0.1) after 7 lags. Thus, AR(p) will presumably have seven parameters.
In Table 1 the optimal p, q, and d of AR(p), MA(q), and ARIMA(p, d, q) given the in-sample data can be found and their respective log-likelihood and AIC.
Figure 1. The S&P 500 FT in-sample time-series development.
AR(7) has both the lowest AIC and the highest log-likelihood, showing that it has the most potential out of these three models. However, d equals 1 for the ARIMA(p, d, q) model showing that the S&P 500 FT might be non-stationary, and thus that AR(p) and MA(q) are not suitable to forecast the S&P 500 FT. Yet, most probably the ARIMA(p, d, q) model estimated that the S&P 500 FT is non-stationary due to the temporary increase in variance in 2020 caused by an extremely short and strong economic recession during the COVID-19 lockdown.
In Table 2 the optimal parameters values, given the in-sample data, for AR(7), MA(1), and ARIMA(4, 1, 2) and their respective standard errors (S.E.) can be found. Needless to say, these parameters changed at every time step for the one-step ahead forecast. The only striking result from this table is that the S.E. of
in ARIMA(4, 1, 2) is almost two times greater in magnitude than the
value, showing a great uncertainty in this parameter estimation.
Regarding NDLM(k), it was chosen to use an one-order polynomial trend model and Fourier form representation with p = 12 and k = 3. These values were chosen given the stationarity of the S&P 500 FT and the fact that the business year has 12 months (i.e. after 12 months it repeats itself). On top of that, it was chosen to only use the first three periods of the Fourier form representation in order to respectively represent the annual, quadrimester and trimester seasonality and avoid noise from shorter periods. As a result, k equals 7 and NDLM(7) was used. Needless to say, one could have made different choices and perhaps
Table 2. Optimal parameters values for AR(7), MA(1), and ARIMA(4, 1, 2).
have taken the business cycle into account, thus having a higher p. Yet, given the stationarity of the S&P 500 FT in the in-sample period, and out-sample period length (12 months), the choice of using an one-order polynomial trend model and Fourier form representation with p = 12 and k = 3 was considered the best choice.
Below the priors for NDLM(7) can be found. Most of them are non-informative, besides the first term of
and the range for δ. The first term of
was chosen based on a slightly lower value of the in-sample mean to account for the extreme event of the COVID-19 lockdown. The range from 0.7 until 1 for δ was chosen because according to the scientific literature, δ lies between this range in the great majority of cases [30] [32] .
Priors
Given the in-sample data, the posteriors for NDLM(7) were estimated, which can be seen below. Naturally, the posteriors changed at every time step for the one-step ahead forecast. Nevertheless, it is important to mention that for the one-step ahead forecast, it was assumed that δ was equal to 0.81 at every time step.
Posteriors
In Table 3 the forecasting quantitative performance results for the one-step ahead forecast can be found. Similarly, one can find the forecasting quantitative performance results for the one business year forecast in Table 4. As already expected from the log-likelihood and AIC values, AR(7) had the best results for the one-step ahead forecast, followed by ARIMA(4, 1, 2), MA(1), and finally
Table 3. Forecast performance quantitative results (1 day).
Table 4. Forecast performance quantitative results (1 year).
NDLM(7). Given that the S&P 500 FT ranged from roughly 200 to 1100 from 2017 until 2022, AR(7) results were positive. This is the case since on average AR(7) would wrongly predict FT values by roughly 10% of its observed interval. On top of that, AR(7) would not give a false high or low FT (i.e. it gives a high forecasted FT and the next day a low FT occurs) within 99% of the time. However, it would not be recommended to use this forecasting method for financial models that depend on the precise magnitude of the S&P 500 FT values. Regarding the one business year forecast, NDLM(7) surprisingly had the best results, even better results than its results for the one-step ahead forecast, followed by MA(1), AR(7), and ARIMA(4,1,2). Anew, NDLM(7) were positive, but less than AR(7) for the one-step ahead forecast. This is the case since on average it would wrongly predict FT values by roughly 15% of its observed interval. Once again, it would not be recommended to use this forecasting method for financial models that depend on the values of the S&P 500 FT.
In Figure 4 to Figure 11 the graphical comparison between the forecasted values for both the one-step ahead and one business year forecast can be found for each model. There are no surprises in these graphs besides the evidence of NDLM(7)’s clear forecasting superiority against MA(1), which cannot be observed with only the forecast quantitative performance results.
Figure 5. FT forecasting: 1-day ARIMA(4, 1, 2).
Figure 7. FT forecasting: 1-day NDLM(7).
Figure 9. FT forecasting: 1-year ARIMA(4, 1, 2).
4. Conclusions
The aim of this research was to evaluate the S&P 500 FT predictability through the use of common forecasting methods, namely AR(p), MA(q), ARIMA(p, d, q), and NDLM(k). The results of quantitative and qualitative evaluation methods show that for the out-sample period (from November 2021 until November 2022), AR(7) was the best forecasting method for the S&P 500 FT one-step ahead forecast, whereas NDLM(7) was the best forecasting method for the S&P 500 FT one business year forecast. AR(7) would on average wrongly predict FT values by roughly 10% of its observed interval for the S&P 500 FT one-step ahead forecast. NDLM(7), on the other hand, would on average wrongly predict FT values by approximately 15% of its observed interval for the S&P 500 FT one business year forecast.
Despite the positive results of both models, it would not be recommended to use them for financial models that depend on the values of the S&P 500 FT, unless there is no other alternative. Instead, it would be better to use those forecasting models to have a good idea about whether the market will likely be turbulent (i.e., with high volatility) or not on a certain day or at a certain period.
Given quantitative limitations of AR(7) and NDLM(7) for financial models that depend on the values of the S&P 500 FT, the author of this paper invites the scientific community to perform a similar study as this one, using other forecasting methods. The scientific community is also encouraged to perform similar studies using other financial asset indexes, periods, and even considering AR instead of FT.