Time Series Forecasting Models for S&P 500 Financial Turbulence

Hugo Gobato Souto

BASF Nederland B.V., Arnhem, Netherlands.

**DOI:** 10.4236/jmf.2023.131007

Although the risk parameter Financial Turbulence has repeatedly been shown to yield significant positive results in risk and portfolio management, there is currently no research on its predictability using time series forecasting methods. Given those results, accurately forecasting the Financial Turbulence of a financial asset index or portfolio could be a considerable advantage for the portfolio management of financial institutions. Therefore, this paper explores the predictability of the S&P 500 Financial Turbulence using common time series forecasting methods, namely the Autoregressive model (AR(*p*)), the Moving Average model (MA(*q*)), the Autoregressive Integrated Moving Average model (ARIMA(*p*, *d*, *q*)), and the Normal Dynamic Linear Model (NDLM(*k*)). In-sample data (from November 2017 until November 2021) and out-sample data (from November 2021 until November 2022) are used to evaluate the forecasting performance of these methods both quantitatively and qualitatively. The results indicate that, for the S&P 500 Financial Turbulence, AR(7) is the best forecasting method for the one-step ahead forecast, whereas NDLM(7) is the best forecasting method for the one business year forecast.

Keywords

Financial Time Series, Bayesian Forecasting, Financial Turbulence, S&P 500, Time Series Forecasting

Share and Cite:

Souto, H. (2023) Time Series Forecasting Models for S&P 500 Financial Turbulence. *Journal of Mathematical Finance*, **13**, 112-129. doi: 10.4236/jmf.2023.131007.

1. Introduction

Understanding and predicting stock price developments and their causes has always been desirable in the financial world. Nowadays, most financial institutions use at least one type of forecasting method for their portfolio and risk management. Though Monte Carlo simulations with Copulas may be the best way to obtain a full picture of all future possibilities, they do not provide the single-value results needed for planning ahead and for showing forecasted results to ordinary investors without much knowledge of statistics. To achieve such single-value results, closed form forecasting methods are commonly used.

Not only should financial institutions try to forecast stock price developments, but they should also try to forecast risk parameters to better understand their portfolio risk level, and in addition make use of these forecasted risk parameters to better forecast stock price developments. Presumably, the most used risk parameter in portfolio and risk management is stock volatility. Stock volatility is so influential that a big part of the existing financial literature was dedicated to examining its predictability using various forecasting methods [1] - [8] .

Nonetheless, in the last decade two new promising risk parameters were discovered [9] [10] . These are Financial Turbulence (FT) and Absorption Ratio (AR). FT is given by Equation (1), whereas AR is given by Equation (2):

${d}_{t}=\left({y}_{t}-\mu \right){\Sigma}^{-1}{\left({y}_{t}-\mu \right)}^{\prime}$ (1)

where,

${d}_{t}$ = turbulence for a particular time period *t* (scalar)

${y}_{t}$ = vector of asset returns for period *t* (1 × *n* vector)

$\mu $ = sample average vector of historical returns (1 × *n* vector)

$\Sigma $ = sample covariance matrix of historical returns (*n* × *n* matrix)

$\text{AR}=\frac{{\displaystyle {\sum}_{i=1}^{n}{\sigma}_{{E}_{i}}^{2}}}{{\displaystyle {\sum}_{j=1}^{N}{\sigma}_{Aj}^{2}}}$ (2)

where,

$\text{AR}$ = Absorption Ratio

${\sigma}_{{E}_{i}}^{2}$ = variance of the *i*-th eigenvector, sometimes called eigenportfolio

${\sigma}_{Aj}^{2}$ = variance of the *j*-th asset

$n$ = number of eigenvectors used to calculate AR

$N$ = number of assets
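Equation (1) can be evaluated by solving the linear system $\Sigma x={\left({y}_{t}-\mu \right)}^{\prime}$ instead of inverting $\Sigma $ explicitly. The following is a minimal sketch, not the code used in this paper. It relies only on the Python standard library, the function names are illustrative, and the divisor *T* − 1 in the sample covariance is an assumption, since the convention is not stated in the text. In practice, a numerical library would normally handle the linear solve.

```python
from typing import List, Sequence

Vector = List[float]
Matrix = List[List[float]]


def mean_vector(returns: Sequence[Sequence[float]]) -> Vector:
    """Column-wise sample mean of a T x n table of returns."""
    t = len(returns)
    return [sum(row[j] for row in returns) / t for j in range(len(returns[0]))]


def covariance_matrix(returns: Sequence[Sequence[float]], mu: Vector) -> Matrix:
    """Sample covariance matrix of a T x n table (divisor T - 1, an assumption)."""
    t, n = len(returns), len(mu)
    return [
        [sum((row[i] - mu[i]) * (row[j] - mu[j]) for row in returns) / (t - 1)
         for j in range(n)]
        for i in range(n)
    ]


def solve(a: Matrix, b: Vector) -> Vector:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def turbulence(y_t: Vector, mu: Vector, sigma: Matrix) -> float:
    """Equation (1): d_t = (y_t - mu) Sigma^{-1} (y_t - mu)'."""
    diff = [y - m for y, m in zip(y_t, mu)]
    return sum(d * z for d, z in zip(diff, solve(sigma, diff)))
```

With an identity covariance matrix the measure reduces to the squared Euclidean distance from the mean, which gives a quick sanity check: `turbulence([3.0, 4.0], [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])` evaluates to 25.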

Salisu, Demirer & Gupta [11] showed that the use of these new financial indicators can indeed improve out-of-sample predictive performance of stock market volatility models over both the short and long time-horizon. Their use also extends to portfolio management [12] [13] [14] .

Despite the potential of FT and AR, the scientific literature still contains no research on their predictability using forecasting methods, unlike other economic and financial indicators, for which many papers similar to this one exist [15] - [26] . As with stock volatility, forecasting FT and/or AR can be a great advantage for portfolio and risk management [11] [12] [13] [14] ; thus, knowing how best to forecast these risk parameters is crucial for financial institutions that would like to exploit them. This paper therefore explores the predictability of the FT of the S&P 500 index with the most common quantitative forecasting methods, namely the Autoregressive model (AR(*p*)), the Moving Average model (MA(*q*)), the Autoregressive Integrated Moving Average model (ARIMA(*p*, *d*, *q*)), and the Normal Dynamic Linear Model (NDLM(*k*)). Since this is the first paper to cover this topic, its results give financial institutions and individual investors answers to the following open questions regarding the predictability of the S&P 500 FT with common forecasting methods:

1) What is the best time series forecasting method for the short-term forecast of the S&P 500 FT among the most common time series forecasting methods?

2) How accurate and reliable is the best time series forecasting method for the short-term forecast of the S&P 500 FT?

3) What is the best time series forecasting method for the long-term forecast of the S&P 500 FT among the most common time series forecasting methods?

4) How accurate and reliable is the best time series forecasting method for the long-term forecast of the S&P 500 FT?

Unfortunately, due to time constraints, it was not possible to perform the same research with AR; yet, this paper’s author strongly encourages the scientific community to do the same research with AR, and with FT but with different forecasting methods or stock indexes.

2. Data and Methodology

2.1. Data

The data set used in this research was retrieved from Yahoo Finance through the use of the Python library yfinance. The time horizon was 4 years for the in-sample data and 1 year for the out-sample data. The in-sample is from 01/11/2017 until 31/10/2021, and the out-sample is from 01/11/2021 until 01/11/2022.

Due to the long time horizon of the in-sample period, a few stocks that were present in the S&P 500 on 01/11/2022 do not have historical data for the whole in-sample timeframe. These stocks are 1) CARR, 2) CDAY, 3) CEG, 4) CTVA, 5) DOW, 6) FOX, 7) FOXA, 8) MRNA, 9) OGN, 10) OTIS, 11) VICI. Together they represent 2.19% of the total number of S&P 500 stocks and only 1.32% of the total S&P 500 market cap. Therefore, it can be concluded that their absence would not have a significant effect on the final results of this research; they were thus excluded to keep the calculations coherent, given that the FT equation (Equation (1)) requires a covariance matrix.
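The exclusion step described above can be sketched as follows. This is only an illustration and not the code used in this research: the helper name `split_by_completeness`, the toy tickers, and the use of `None` for a missing price are all assumptions made for the example. With pandas, the same filter is commonly written as `DataFrame.dropna(axis=1)`.

```python
from typing import Dict, List, Optional, Tuple


def split_by_completeness(
    prices: Dict[str, List[Optional[float]]]
) -> Tuple[List[str], List[str]]:
    """Separate tickers with a full price history from those with gaps."""
    kept, dropped = [], []
    for ticker, series in sorted(prices.items()):
        if all(price is not None for price in series):
            kept.append(ticker)
        else:
            dropped.append(ticker)
    return kept, dropped


# Toy data with hypothetical tickers: "CCC" was listed after the window started.
toy_prices = {
    "AAA": [10.0, 10.2, 10.1],
    "BBB": [55.0, 54.1, 56.3],
    "CCC": [None, None, 20.5],
}
kept, dropped = split_by_completeness(toy_prices)
excluded_share = len(dropped) / len(toy_prices)  # fraction of tickers removed
```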

2.2. Methodology

As already stated, four quantitative forecasting methods were used, namely AR(*p*), MA(*q*), ARIMA(*p*, *d*, *q*) and NDLM(*k*). AR(*p*) exploits the fact that many time series phenomena depend linearly on their own previous values and on a stochastic term [27] , where “*p*” is the number of previous values considered to predict the value of the next time step. AR(*p*) is given as:

${Y}_{t}={\displaystyle {\sum}_{i=1}^{p}{\phi}_{i}{Y}_{t-i}}+{\epsilon}_{t}$ (3)

where,

${Y}_{t}$ = value of the next time step

${\phi}_{i}$ = model parameters

${Y}_{t-i}$ = previous values

${\epsilon}_{t}$ = white noise

The parameters that affect the prediction accuracy of AR(*p*) are “*p*” and ${\phi}_{i}$.
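A point forecast from Equation (3) is obtained by setting the white noise term to its mean of zero. The sketch below shows both the one-step ahead case and the recursive multi-step case, in which each forecast is fed back as if it had been observed. The function names are illustrative, and the series is assumed to be centered already, because Equation (3) has no intercept.

```python
from typing import List, Sequence


def ar_one_step(history: Sequence[float], phi: Sequence[float]) -> float:
    """Point forecast of Y_t from Equation (3) with the noise term set to zero.

    phi[0] multiplies the most recent value Y_{t-1}, phi[1] multiplies Y_{t-2}, etc.
    """
    p = len(phi)
    if p == 0 or len(history) < p:
        raise ValueError("need p >= 1 coefficients and at least p past values")
    recent = list(history[-p:])[::-1]  # Y_{t-1}, Y_{t-2}, ..., Y_{t-p}
    return sum(c * y for c, y in zip(phi, recent))


def ar_multi_step(history: Sequence[float], phi: Sequence[float],
                  horizon: int) -> List[float]:
    """Recursive forecast: each forecast is appended and reused as a past value."""
    path = list(history)
    forecasts = []
    for _ in range(horizon):
        f = ar_one_step(path, phi)
        forecasts.append(f)
        path.append(f)
    return forecasts
```

For example, with `history = [1.0, 2.0, 3.0]` and `phi = [0.5, 0.25]`, the one-step forecast is 0.5 × 3 + 0.25 × 2 = 2.0, and the second recursive step is 0.5 × 2.0 + 0.25 × 3 = 1.75.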

MA(*q*), on the other hand, exploits the fact that, in various time series processes, the value of the next time step is cross-correlated with a random variable other than the series itself [27] . That is, the next time step value depends linearly on the time series mean and the past errors, where “*q*” designates the number of previous errors that are considered. MA(*q*) is given by Equation (4):

${Y}_{t}=\mu +{\displaystyle {\sum}_{i=1}^{q}{\theta}_{i}{\epsilon}_{t-i}}+{\epsilon}_{t}$ (4)

where,

${Y}_{t}$ = value of the next time step

$\mu $ = mean of ${Y}_{t}$

${\theta}_{i}$ = model parameters

${\epsilon}_{t-i}$ = previous errors

${\epsilon}_{t}$ = white noise

The parameters that affect the prediction accuracy of MA(*q*) are “*q*” and ${\theta}_{i}$.
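Since the errors in Equation (4) are not observed, they have to be reconstructed before a forecast can be made. The sketch below uses the common conditional approach, which is an assumption and not necessarily what a given software package does internally: pre-sample errors are set to zero, past errors are recovered recursively from the data, and future errors are replaced by their expectation of zero. The function names are illustrative.

```python
from typing import List, Sequence


def _weighted_past(eps: Sequence[float], theta: Sequence[float]) -> float:
    """Sum of theta_i * eps_{t-i}; errors that are not available count as zero."""
    q = len(theta)
    past = list(eps[-q:])[::-1] if q else []
    return sum(t * e for t, e in zip(theta, past))


def ma_residuals(series: Sequence[float], mu: float,
                 theta: Sequence[float]) -> List[float]:
    """Recover eps_t = Y_t - mu - sum_i theta_i * eps_{t-i} step by step."""
    eps: List[float] = []
    for y in series:
        eps.append(y - mu - _weighted_past(eps, theta))
    return eps


def ma_forecast(series: Sequence[float], mu: float, theta: Sequence[float],
                horizon: int) -> List[float]:
    """Forecast `horizon` steps ahead from Equation (4)."""
    eps = ma_residuals(series, mu, theta)
    forecasts = []
    for _ in range(horizon):
        forecasts.append(mu + _weighted_past(eps, theta))
        eps.append(0.0)  # a future shock is replaced by its expectation
    return forecasts
```

One consequence visible in this sketch is that an MA(*q*) point forecast equals the mean $\mu $ for every horizon beyond *q* steps: `ma_forecast([10.0, 12.0], 10.0, [0.5], 3)` gives 11.0 for the first step and 10.0 afterwards.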

These two models can also be combined to create the ARMA(*p*, *q*) model. Yet, the big limitation of AR(*p*), MA(*q*), and ARMA(*p*, *q*) is that they only work well with linear stationary processes, whereas some time series processes are non-stationary. In order to address this issue, ARIMA(*p*, *d*, *q*) models are frequently used [27] , where “*d*” is the number of times that the original series has to be differenced to result in a stationary series (“*d*” is also known as the order of homogeneity) [27] . ARIMA(*p*, *d*, *q*) is given as:

${\left(1-B\right)}^{d}{Y}_{t}={\displaystyle {\sum}_{i=1}^{p}{\phi}_{i}{Y}_{t-i}}+{\displaystyle {\sum}_{i=1}^{q}{\theta}_{i}{\epsilon}_{t-i}}+{\epsilon}_{t}$ (5)

where,

${\left(1-B\right)}^{d}{Y}_{t}$ when $d=\text{1}$ : ${Y}_{t}-{Y}_{t-1}$ , and when $d=\text{2}$ : $\left({Y}_{t}-{Y}_{t-1}\right)-\left({Y}_{t-1}-{Y}_{t-2}\right)$ , and so on.
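The differencing operator ${\left(1-B\right)}^{d}$ amounts to taking successive differences *d* times. The sketch below shows the operation and, for *d* = 1, the inverse step that turns forecasted differences back into levels. The function names are illustrative.

```python
from typing import List, Sequence


def difference(series: Sequence[float], d: int = 1) -> List[float]:
    """Apply (1 - B)^d: each pass replaces the series by Y_t - Y_{t-1}."""
    out = list(series)
    for _ in range(d):
        out = [current - previous for previous, current in zip(out, out[1:])]
    return out


def undifference_once(last_level: float, diffs: Sequence[float]) -> List[float]:
    """Rebuild levels from first differences, starting at the last observed level."""
    levels, level = [], last_level
    for step in diffs:
        level += step
        levels.append(level)
    return levels
```

For the series 1, 4, 9, 16, one pass gives 3, 5, 7 and a second pass gives 2, 2, matching the expression for *d* = 2 above. Each pass shortens the series by one observation.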

The parameters that affect the prediction accuracy of ARIMA(*p*, *d*, *q*) are “*p*”, “*d*”, “*q*”, ${\phi}_{i}$ and ${\theta}_{i}$.

In order to estimate *p*, *q*, and *d* of AR(*p*), MA(*q*), and ARIMA(*p*, *d*, *q*), the Akaike information criterion (AIC) was used as the selection criterion. Afterwards, to estimate ${\phi}_{i}$ and ${\theta}_{i}$, maximum likelihood estimation (MLE) was used. For both aforementioned processes, the arima function in R was used [28] .
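For reference, the AIC of a fitted model is $2k-2\mathrm{ln}L$, where *k* is the number of estimated parameters and *L* is the maximized likelihood, and the candidate with the lowest AIC is selected. The sketch below illustrates only this selection rule. The function names and the candidate values in the usage note are made up for illustration and are not results from this paper, and how *k* is counted (for instance, whether the noise variance is included) depends on the software used.

```python
from typing import Dict, Tuple


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 ln L."""
    return 2.0 * n_params - 2.0 * log_likelihood


def select_by_aic(candidates: Dict[str, Tuple[float, int]]) -> str:
    """Return the label with the lowest AIC.

    `candidates` maps a label to (log-likelihood, number of parameters).
    """
    return min(candidates, key=lambda label: aic(*candidates[label]))
```

As a purely hypothetical usage example, `select_by_aic({"a": (-100.0, 3), "b": (-99.5, 8)})` returns `"a"`: the slightly higher log-likelihood of `"b"` does not compensate for its five extra parameters.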

Lastly, NDLM(*k*) can be used for both stationary and non-stationary series [29] . This is the case since it makes use of a polynomial trend equation to account for the time series process trend and a Fourier form representation to account for the seasonality present in the time series [29] [30] [31] . Moreover, “*k*” is the number of parameters in the model, and NDLM(*k*) makes use of Bayesian inference to find the most likely parameters [29] [30] [31] . NDLM(*k*) can be represented with the following equations:

${Y}_{t}={{F}^{\prime}}_{t}{\theta}_{t}+{\nu}_{t},\quad {\nu}_{t}\sim N\left(0,{v}_{t}\right)$ (6)

${\theta}_{t}={G}_{t}{\theta}_{t-1}+{\omega}_{t},\quad {\omega}_{t}\sim N\left(0,{W}_{t}\right)$ (7)

$\left({\theta}_{0}|{D}_{0}\right)\sim N\left({m}_{0},{C}_{0}\right)$ (8)

where,

${Y}_{t}$ = value at time step *t*

${{F}^{\prime}}_{t}$ = transposed vector of dimension *k* composed by 1’s and 0’s

${\theta}_{t}$ = parameters vector of dimension *k*

${\nu}_{t}$ = observation noise

${v}_{t}$ = observation variance

${G}_{t}$ = *k* × *k* Jordan matrix

${\omega}_{t}$ = system noise vector

${W}_{t}$ = system covariance matrix

${\theta}_{0}$ = conjugate prior distribution for the *k* parameters

${m}_{0}$ = prior mean vector

${C}_{0}$ = prior covariance matrix

${D}_{t}$ = all information about ${Y}_{0:t}$

NDLM(*k*) is implemented by updating priors to obtain posteriors using a sequential approach. The posterior distribution is obtained through the Bayes theorem:

$P\left({\theta}_{t}|{D}_{t}\right)\propto P\left({\theta}_{t}|{D}_{t-1}\right)P\left({Y}_{t}|{\theta}_{t},{D}_{t-1}\right)$ (9)

The forecasting function of this model is given as:

${f}_{t}\left(h\right)={{F}^{\prime}}_{t}{G}_{t}^{h}E\left({\theta}_{t}|{D}_{t}\right)$ (10)

where,

${f}_{t}\left(h\right)$ = forecasted value for *h* time steps ahead

This model is usually represented as $\left\{{F}_{t},{G}_{t},{v}_{t},{W}_{t}\right\}$ . The assumptions for the use of this model in this research were that the observation variance was known and constant over time, ${v}_{t}=v$ , that ${F}_{t}$ and ${G}_{t}$ were also constant over time, ${F}_{t}=F$ and ${G}_{t}=G$ , and that the system covariance matrix was unknown. Under the assumption that the system covariance matrix is unknown, the following equations hold:

$\text{Var}\left({\theta}_{t}|{D}_{t-1}\right)={R}_{t}$ (11)

${R}_{t}=\frac{{G}_{t}{C}_{t-1}{{G}^{\prime}}_{t}}{\delta}\quad \text{with}\quad \delta \in \left(0,1\right]$ (12)

$\therefore {W}_{t}^{*}=\frac{1-\delta}{\delta}{G}_{t}{C}_{t-1}{{G}^{\prime}}_{t}$ (13)

In order to find the most suitable *δ*, we minimize the following mean squared error MSE(*δ*):

$\text{MSE}\left(\delta \right)=\frac{1}{T}{\sum}_{t=1}^{T}{\left({Y}_{t}-{f}_{t}\left(\delta \right)\right)}^{2}$ (14)
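To make the role of the discount factor concrete, the sketch below runs the sequential updating for the simplest case, *k* = 1 with *F* = *G* = 1 and a known constant observation variance *v*, and then picks *δ* from a grid. It assumes the conventional reading in which MSE(*δ*) is the mean squared one-step ahead forecast error and is minimized. It is a scalar illustration with illustrative function names, not the seven-parameter model or the code used in this research.

```python
from typing import List, Sequence, Tuple


def filter_first_order(y: Sequence[float], m0: float, c0: float,
                       v: float, delta: float) -> Tuple[List[float], float, float]:
    """Discounted filtering for the k = 1 model with F = G = 1.

    Returns the one-step ahead forecasts and the final posterior mean and variance.
    """
    m, c = m0, c0
    forecasts = []
    for obs in y:
        r = c / delta      # Equation (12) with G = 1: prior variance R_t
        f = m              # one-step ahead forecast F' G m_{t-1}
        q = r + v          # variance of the one-step ahead forecast
        gain = r / q       # weight given to the new observation
        forecasts.append(f)
        m = m + gain * (obs - f)  # posterior mean m_t
        c = r - gain * gain * q   # posterior variance C_t
    return forecasts, m, c


def mse(y: Sequence[float], forecasts: Sequence[float]) -> float:
    """Mean squared one-step ahead forecast error."""
    return sum((a - b) ** 2 for a, b in zip(y, forecasts)) / len(y)


def best_delta(y: Sequence[float], m0: float, c0: float, v: float,
               grid: Sequence[float]) -> float:
    """Grid search: the discount factor with the smallest MSE."""
    return min(grid, key=lambda d: mse(y, filter_first_order(y, m0, c0, v, d)[0]))
```

A smaller *δ* inflates the prior variance more at each step, so the filter weights recent observations more heavily. With *δ* = 1 the implied system variance in Equation (13) is zero and the level is treated as fixed.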

Finally, the forecast function can be generalized thanks to the superposition principle into:

${f}_{t}\left(h\right)={P}_{t}^{k}\left(h\right)+{F}_{t}^{mp}\left(h\right)$ (15)

where,

${P}_{t}^{k}\left(h\right)$ = forecast function of the polynomial trend model of order *k*

${F}_{t}^{mp}\left(h\right)$ = forecast function of the Fourier form representation, with *m* being the number of subperiods used and *p* being the period basis for the Fourier form representation

Furthermore, the time series process trend of the S&P 500 FT was captured by the polynomial trend model, where for *k* = 1 there is no trend, for *k* = 2 there is a linear trend, for *k* = 3 there is a quadratic trend, and so on. Regarding the seasonality of the S&P 500 FT, the Fourier form representation was used, where *p* is the period after which the seasonality seems to repeat itself and *m* is the number of subperiods within *p* that best explains the seasonality without adding too much noise. The subperiods, *λ*, were determined by the equation:

${\lambda}_{j}=\left\{\begin{array}{ll}\frac{2\pi}{p}j\quad \text{for}\ j=1,\cdots ,m,& \text{when}\ p\ \text{is odd}\\ \frac{2\pi}{p}j\quad \text{for}\ j=1,\cdots ,m-1,\ \text{and}\ {\lambda}_{m}=\pi ,& \text{when}\ p\ \text{is even}\end{array}\right.$ (16)

Additionally, it is important to notice that for ${F}_{t}^{mp}\left(h\right)$ the structure of ${G}_{t}$ differs from the one described above for NDLM(*k*):

${G}_{t}=\left\{\begin{array}{ll}\text{blockdiag}\left({J}_{2}\left(1,{\lambda}_{1}\right),\cdots ,{J}_{2}\left(1,{\lambda}_{m}\right)\right),& \text{when}\ p\ \text{is odd}\\ \text{blockdiag}\left({J}_{2}\left(1,{\lambda}_{1}\right),\cdots ,{J}_{2}\left(1,{\lambda}_{m-1}\right),-1\right),& \text{when}\ p\ \text{is even}\end{array}\right.$ (17)

${J}_{2}\left(1,{\lambda}_{j}\right)=\left(\begin{array}{cc}\mathrm{cos}\left({\lambda}_{j}\right)& \mathrm{sin}\left({\lambda}_{j}\right)\\ -\mathrm{sin}\left({\lambda}_{j}\right)& \mathrm{cos}\left({\lambda}_{j}\right)\end{array}\right)$
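The block structure above can be assembled mechanically. The sketch below builds *F* and *G* for a constant level (first-order polynomial trend) combined with the first *m* harmonics of period *p*. It is restricted to *m* < *p*/2, so that every harmonic keeps its full 2 × 2 block and the special −1 block of the even case is not needed. The function names are illustrative and are not those of any DLM library.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def harmonic_block(lam: float) -> Matrix:
    """2 x 2 rotation block J_2(1, lambda)."""
    return [[math.cos(lam), math.sin(lam)],
            [-math.sin(lam), math.cos(lam)]]


def block_diag(blocks: List[Matrix]) -> Matrix:
    """Place square blocks along the diagonal of an otherwise zero matrix."""
    size = sum(len(block) for block in blocks)
    out = [[0.0] * size for _ in range(size)]
    offset = 0
    for block in blocks:
        for i, row in enumerate(block):
            for j, value in enumerate(row):
                out[offset + i][offset + j] = value
        offset += len(block)
    return out


def level_plus_harmonics(p: int, m: int) -> Tuple[List[float], Matrix]:
    """F and G for a constant level plus the first m harmonics of period p."""
    if m < 1 or 2 * m >= p:
        raise ValueError("this sketch requires 1 <= m < p / 2")
    blocks = [[[1.0]]]  # the level evolves as theta_t = theta_{t-1}
    blocks += [harmonic_block(2.0 * math.pi * j / p) for j in range(1, m + 1)]
    f = [1.0] + [1.0, 0.0] * m  # picks the level and the first state of each harmonic
    return f, block_diag(blocks)
```

Calling `level_plus_harmonics(12, 3)` yields a vector *F* of length 7 and a 7 × 7 matrix *G* whose harmonic blocks use the frequencies π/6, π/3 and π/2.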

The parameters that affect the prediction accuracy of NDLM(*k*) are “*k*”, ${F}_{t}$, ${G}_{t}$, ${v}_{t}$, and ${W}_{t}$ (or $\delta $ if ${W}_{t}$ is unknown). For more details about NDLM(*k*)’s use, structure and variations, see [29] [30] [31] .

Determining the optimal *k* for the polynomial trend model is not very challenging. The same cannot be said about determining the optimal *m* and *p* for the Fourier form representation [29] [30] [31] . Usually, qualitative analysis is needed to better understand the seasonality of the studied time series process, and thus to determine the optimal *m* and *p* for the Fourier form representation [30] [31] .

Information gathered from ARIMA(*p*, *d*, *q*) and a qualitative analysis of the S&P 500 FT time series development were used to make an educated guess about the optimal *k* for the polynomial trend model and optimal *m* and *p* for the Fourier form representation. Afterwards, the R functions dlmModPoly and dlmModTrig were used to estimate the model matrices and update the posterior parameters for the in-sample data for respectively the polynomial trend model and the Fourier form representation [32] [33] . Thereafter, the algorithm that can be found *here* was used to estimate the posterior parameters for the out-sample data and to forecast the out-sample values.

In order to measure the efficacy of the studied forecasting methods, two forecasting tasks were used. The first one is a one-step ahead forecast and the second one is a full one business year (252 steps ahead) forecast. For the one-step ahead forecast, a rolling forecast was used, and the parameters and the matrices of NDLM(*k*) were updated at every time step. The parameters of AR(*p*), MA(*q*) and ARIMA(*p*, *d*, *q*) were also updated at every time step. The one business year forecast, on the other hand, had the parameters and matrices of NDLM(*k*) estimated with only the in-sample data, and they remained unchanged for the whole forecasted business year. The parameters of AR(*p*), MA(*q*) and ARIMA(*p*, *d*, *q*) also remained unchanged for the whole forecasted business year. On top of that, the ${Y}_{t-i}$ and ${\epsilon}_{t-i}$ used for AR(*p*), MA(*q*) and ARIMA(*p*, *d*, *q*) were respectively replaced by the forecasted value (${f}_{t}$) and the function:

${\epsilon}_{t}={f}_{t-1}-\mu -{\displaystyle {\sum}_{i=1}^{q}{\theta}_{i}{\epsilon}_{t-i}}$ (18)

where,

$\mu $ = mean of in-sample ${Y}_{t}$

${f}_{t-1}$ = forecasted value at time step $t-\text{1}$

To evaluate the performance of each forecasting method, the Root-mean-square error/residual (RMSE) and Error/residual standard deviation (RSD) were used. Their equations are given by:

$\text{RMSE}=\sqrt{\frac{1}{T}{\sum}_{t=1}^{T}{\left({f}_{t}-{Y}_{t}\right)}^{2}}$ (19)

$\text{RSD}=\sqrt{\frac{1}{T}{\sum}_{t=1}^{T}{\left({e}_{t}-\bar{e}\right)}^{2}},\quad {e}_{t}={f}_{t}-{Y}_{t},\quad \bar{e}=\frac{1}{T}{\sum}_{t=1}^{T}{e}_{t}$ (20)

where,

${Y}_{t}$ = value at time step *t*

${f}_{t}$ = forecasted value at time step *t*

$T$ = number of business days in the out-sample data (252 in this research)
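The two evaluation measures can be sketched as follows, assuming their conventional definitions: the RMSE is the square root of the mean squared residual, and the RSD is the standard deviation of the residuals with divisor *T*. The interval helper applies the usual normal quantiles (1.96 for 95%, 2.576 for 99%). The function names are illustrative, and the exact conventions used in this research may differ.

```python
import math
import statistics
from typing import Sequence, Tuple


def rmse(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Root-mean-square error of the residuals f_t - Y_t."""
    errors = [f - y for f, y in zip(forecast, actual)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def residual_sd(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Standard deviation of the residuals around their own mean (divisor T)."""
    errors = [f - y for f, y in zip(forecast, actual)]
    return statistics.pstdev(errors)


def normal_interval(point: float, sd: float, z: float = 1.96) -> Tuple[float, float]:
    """Symmetric interval point +/- z * sd under a normality assumption."""
    return point - z * sd, point + z * sd
```

The two measures differ when the forecasts are biased: a forecast that is always 3 units too high has an RMSE of 3 but an RSD of 0, because the residuals do not vary around their mean.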

Under the assumption that the residuals follow a normal distribution, 95% and 99% confidence intervals (CI) were calculated and used in the performance evaluation of each forecasting method. Lastly, a qualitative assessment was performed by visually evaluating the similarity between the actual stochastic development in the out-sample period and the forecasted values.

3. Results

The time-series development for the in-sample data can be found in Figure 1. With this graph, one can confidently affirm that the S&P 500 FT is stationary, though its variance temporarily increased in 2020 due to COVID-19.

In Figure 2 and Figure 3, the in-sample Autocorrelation Function (ACF) and Partial Autocorrelation Function (PACF) can be found. It can be seen that the S&P 500 FT has a high autocorrelation: even after 40 lags there is still a significant autocorrelation (*i.e*. an autocorrelation higher than 0.1). Therefore, it is a priori expected that MA(*q*) will have a high number of parameters. Regarding the PACF, the partial autocorrelation stops having a significant value (*i.e*. greater than 0.1) after 7 lags. Thus, AR(*p*) will presumably have seven parameters.

Table 1 reports the optimal *p*, *q*, and *d* of AR(*p*), MA(*q*), and ARIMA(*p*, *d*, *q*) given the in-sample data, together with their respective log-likelihood and AIC.

Figure 1. The S&P 500 FT in-sample time-series development.

Figure 2. In-sample ACF.

Figure 3. In-sample PACF.

Table 1. Optimal *p*, *q*, and *d*.

AR(7) has both the lowest AIC and the highest log-likelihood, showing that it has the most potential out of these three models. However, *d* equals 1 for the ARIMA(*p*, *d*, *q*) model, suggesting that the S&P 500 FT might be non-stationary, and thus that AR(*p*) and MA(*q*) might not be suitable to forecast the S&P 500 FT. Yet, the ARIMA(*p*, *d*, *q*) model most probably estimated that the S&P 500 FT is non-stationary because of the temporary increase in variance in 2020 caused by an extremely short and strong economic recession during the COVID-19 lockdown.

In Table 2 the optimal parameters values, given the in-sample data, for AR(7), MA(1), and ARIMA(4, 1, 2) and their respective standard errors (S.E.) can be found. Needless to say, these parameters changed at every time step for the one-step ahead forecast. The only striking result from this table is that the S.E. of ${\phi}_{1}$ in ARIMA(4, 1, 2) is almost two times greater in magnitude than the ${\phi}_{1}$ value, showing a great uncertainty in this parameter estimation.

Table 2. Optimal parameters values for AR(7), MA(1), and ARIMA(4, 1, 2).

Regarding NDLM(*k*), a first-order polynomial trend model and a Fourier form representation with *p* = 12 and *m* = 3 were chosen. These values were chosen given the stationarity of the S&P 500 FT and the fact that the business year has 12 months (*i.e.* after 12 months it repeats itself). On top of that, only the first three harmonics of the Fourier form representation were used, which correspond to periods of 12, 6 and 4 months (frequencies π/6, π/3 and π/2), in order to represent the longer seasonal cycles and avoid noise from shorter periods. As a result, *k* equals 7 (one trend parameter plus two parameters per harmonic) and NDLM(7) was used. Needless to say, one could have made different choices and perhaps have taken the business cycle into account, thus having a higher *p*. Yet, given the stationarity of the S&P 500 FT in the in-sample period and the out-sample period length (12 months), the choice of a first-order polynomial trend model and a Fourier form representation with *p* = 12 and *m* = 3 was considered the best.

Below the priors for NDLM(7) can be found. Most of them are non-informative, except for the first term of ${m}_{0}$ and the range for *δ*. The first term of ${m}_{0}$ was set slightly below the in-sample mean to account for the extreme event of the COVID-19 lockdown. The range from 0.7 to 1 for *δ* was chosen because, according to the scientific literature, *δ* lies within this range in the great majority of cases [30] [32] .

Priors

$v=10$

$\delta \in \left[0.7,1\right]$

${m}_{0}={\left(488,0,0,0,0,0,0\right)}^{\prime}$

${C}_{0}=10{I}_{7}$

Given the in-sample data, the posteriors for NDLM(7) were estimated, which can be seen below. Naturally, the posteriors changed at every time step for the one-step ahead forecast. Nevertheless, it is important to mention that for the one-step ahead forecast, it was assumed that *δ* was equal to 0.81 at every time step.

Posteriors

$F=\left(1,1,0,1,0,1,0\right)$

$G=\left(\begin{array}{ccccccc}1& 0& 0& 0& 0& 0& 0\\ 0& \mathrm{cos}\left(\frac{\pi}{6}\right)& \mathrm{sin}\left(\frac{\pi}{6}\right)& 0& 0& 0& 0\\ 0& -\mathrm{sin}\left(\frac{\pi}{6}\right)& \mathrm{cos}\left(\frac{\pi}{6}\right)& 0& 0& 0& 0\\ 0& 0& 0& \mathrm{cos}\left(\frac{\pi}{3}\right)& \mathrm{sin}\left(\frac{\pi}{3}\right)& 0& 0\\ 0& 0& 0& -\mathrm{sin}\left(\frac{\pi}{3}\right)& \mathrm{cos}\left(\frac{\pi}{3}\right)& 0& 0\\ 0& 0& 0& 0& 0& \mathrm{cos}\left(\frac{\pi}{2}\right)& \mathrm{sin}\left(\frac{\pi}{2}\right)\\ 0& 0& 0& 0& 0& -\mathrm{sin}\left(\frac{\pi}{2}\right)& \mathrm{cos}\left(\frac{\pi}{2}\right)\end{array}\right)$

${m}_{T}={\left(514.64,72.58,-97.68,44.33,-33.63,-70.92,4.18\right)}^{\prime}$

${C}_{T}=\left(\begin{array}{ccccccc}1197.55& -48.22& 899.90& -318.76& 365.76& -319.57& 46.18\\ -48.22& 2071.19& 383.74& -409.42& 998.73& -609.11& 186.05\\ 899.90& 383.74& 2684.23& -760.62& 239.31& -355.65& 109.85\\ -318.76& -409.42& -760.62& 2028.40& -44.90& -366.92& 610.46\\ 365.76& 998.73& 239.31& -44.97& 2578.48& -954.20& -52.33\\ -319.57& -609.12& -355.65& -366.92& -954.20& 1980.65& -208.99\\ 46.18& 186.05& -109.85& 610.46& -52.33& -208.99& 2013.25\end{array}\right)$

$\delta =0.81$

${W}_{T}^{*}=\left(\begin{array}{ccccccc}280.81& 95.71& 188.40& 36.90& 107.61& 10.83& 74.93\\ 95.71& 599.52& 107.24& 113.78& 264.66& 24.90& 165.39\\ 188.40& 107.24& 515.55& -112.55& 57.95& -44.12& 0.81\\ 36.90& 113.78& -112.55& 563.24& 61.12& 60.94& 236.79\\ 107.61& 264.66& 57.95& 61.12& 517.01& -130.10& 37.36\\ 10.83& 24.90& -44.12& 60.94& -130.10& 472.08& 49.00\\ 74.93& 165.39& 0.81& 236.79& 37.36& 49.00& 464.43\end{array}\right)$
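Because each harmonic block of *G* is a rotation matrix, ${G}^{h}$ rotates the corresponding pair of states by $h{\lambda}_{j}$, so the forecast function in Equation (10) can be evaluated in closed form. The sketch below plugs in the posterior mean ${m}_{T}$ reported above and assumes *p* = 12 with three harmonics. The function name is illustrative, and *h* counts time steps in whatever unit the model was fitted on.

```python
import math
from typing import Sequence

# Posterior mean m_T as reported above: level, then one pair per harmonic.
M_T = (514.64, 72.58, -97.68, 44.33, -33.63, -70.92, 4.18)
PERIOD = 12


def point_forecast(h: int, m: Sequence[float] = M_T, p: int = PERIOD) -> float:
    """Evaluate F' G^h m for a level plus harmonics of period p."""
    total = m[0]  # the level is carried forward unchanged
    n_harmonics = (len(m) - 1) // 2
    for j in range(1, n_harmonics + 1):
        a, b = m[2 * j - 1], m[2 * j]  # the two states of harmonic j
        lam = 2.0 * math.pi * j / p
        # First row of the rotation block raised to the power h.
        total += a * math.cos(h * lam) + b * math.sin(h * lam)
    return total
```

By construction the point forecast is periodic: `point_forecast(h + 12)` equals `point_forecast(h)` up to rounding, so a long-horizon forecast from this structure repeats the same 12-step pattern around the level.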

In Table 3 the quantitative forecasting performance results for the one-step ahead forecast can be found. Similarly, the quantitative forecasting performance results for the one business year forecast can be found in Table 4. As expected from the log-likelihood and AIC values, AR(7) had the best results for the one-step ahead forecast, followed by ARIMA(4, 1, 2), MA(1), and finally NDLM(7). Given that the S&P 500 FT ranged from roughly 200 to 1100 from 2017 until 2022, the AR(7) results were positive, since on average AR(7) would mispredict FT values by roughly 10% of its observed interval. On top of that, 99% of the time AR(7) would not give a false high or low FT (*i.e*. it gives a high forecasted FT and a low FT occurs the next day). However, it would not be recommended to use this forecasting method for financial models that depend on the precise magnitude of the S&P 500 FT values. Regarding the one business year forecast, NDLM(7) surprisingly had the best results, even better than its own results for the one-step ahead forecast, followed by MA(1), AR(7), and ARIMA(4, 1, 2). Again, the NDLM(7) results were positive, though less so than those of AR(7) for the one-step ahead forecast, since on average it would mispredict FT values by roughly 15% of its observed interval. Once again, it would not be recommended to use this forecasting method for financial models that depend on the precise values of the S&P 500 FT.

Table 3. Forecast performance quantitative results (1 day).

Table 4. Forecast performance quantitative results (1 year).

In Figure 4 to Figure 11 the graphical comparison between the forecasted values for both the one-step ahead and one business year forecast can be found for each model. There are no surprises in these graphs besides the evidence of NDLM(7)’s clear forecasting superiority against MA(1), which cannot be observed with only the forecast quantitative performance results.

Figure 4. FT forecasting: 1-day AR(7).

Figure 5. FT forecasting: 1-day ARIMA(4, 1, 2).

Figure 6. FT forecasting: 1-day MA(1).

Figure 7. FT forecasting: 1-day NDLM(7).

Figure 8. FT forecasting: 1-year AR(7).

Figure 9. FT forecasting: 1-year ARIMA(4, 1, 2).

Figure 10. FT forecasting: 1-year MA(1).

Figure 11. FT forecasting: 1-year NDLM(7).

4. Conclusions

The aim of this research was to evaluate the S&P 500 FT predictability through the use of common forecasting methods, namely AR(*p*), MA(*q*), ARIMA(*p*, *d*, *q*), and NDLM(*k*). The results of quantitative and qualitative evaluation methods show that for the out-sample period (from November 2021 until November 2022), AR(7) was the best forecasting method for the S&P 500 FT one-step ahead forecast, whereas NDLM(7) was the best forecasting method for the S&P 500 FT one business year forecast. AR(7) would on average wrongly predict FT values by roughly 10% of its observed interval for the S&P 500 FT one-step ahead forecast. NDLM(7), on the other hand, would on average wrongly predict FT values by approximately 15% of its observed interval for the S&P 500 FT one business year forecast.

Despite the positive results of both models, it would not be recommended to use them for financial models that depend on the values of the S&P 500 FT, unless there is no other alternative. Instead, it would be better to use those forecasting models to have a good idea about whether the market will likely be turbulent (*i.e*., with high volatility) or not on a certain day or at a certain period.

Given the quantitative limitations of AR(7) and NDLM(7) for financial models that depend on the values of the S&P 500 FT, the author of this paper invites the scientific community to perform a study similar to this one using other forecasting methods. The scientific community is also encouraged to perform similar studies using other financial asset indexes, periods, and even considering AR instead of FT.

Conflicts of Interest

The author declares no conflicts of interest.

[1] Demirer, R., Gupta, R., Lv, Z. and Wong, W.K. (2019) Equity Return Dispersion and Stock Market Volatility: Evidence from Multivariate Linear and Nonlinear Causality Tests. Sustainability, 11, 351. https://doi.org/10.3390/su11020351

[2] Engle, R.F., Ghysels, E. and Sohn, B. (2013) Stock Market Volatility and Macroeconomic Fundamentals. Review of Economics and Statistics, 95, 776-797. https://doi.org/10.1162/REST_a_00300

[3] Inci, A.C., Li, H. and McCarthy, J. (2011) Financial Contagion: A Local Correlation Analysis. Research in International Business and Finance, 25, 11-25. https://doi.org/10.1016/j.ribaf.2010.05.002

[4] Liu, R., Demirer, R., Gupta, R. and Wohar, M. (2019) Volatility Forecasting with Bivariate Multifractal Models. Journal of Forecasting, 39, 155-167. https://doi.org/10.1002/for.2619

[5] Poon, S.H. and Granger, C.W.J. (2003) Forecasting Volatility in Financial Markets: A Review. Journal of Economic Literature, 41, 478-539. https://doi.org/10.1257/jel.41.2.478

[6] |
Rangel, J.G. and Engle, R.F. (2012) The Factor-Spline-GARCH Model for High and Low Frequency Correlations. Journal of Business & Economic Statistics, 30, 109-124. https://doi.org/10.1080/07350015.2012.643132 |

[7] |
Salisu, A.A. and Gupta, R. (2021) Oil Shocks and Stock Market Volatility of the BRICS: A GARCH-MIDAS Approach. Global Finance Journal, 48, Article ID: 100546. https://doi.org/10.1016/j.gfj.2020.100546 |

[8] | Salisu, A.A. and Ogbonna, A.E. (2022) The Return Volatility of Cryptocurrencies during the COVID-19 Pandemic: Assessing the News Effect. Global Finance Journal, 54, Article ID: 100641. https://doi.org/10.1016/j.gfj.2021.100641 |

[9] | Kritzman, M. and Li, Y. (2010) Skulls, Financial Turbulence, and Risk Management. Financial Analysts Journal, 66, 30-41. https://doi.org/10.2469/faj.v66.n5.3 |

[10] |
Kritzman, M., Li, Y., Page, S. and Rigobon, R. (2011) Principal Components as a Measure of Systemic Risk. The Journal of Portfolio Management, 37, 112-126. https://doi.org/10.3905/jpm.2011.37.4.112 |

[11] |
Salisu, A.A., Demirer, R. and Gupta, R. (2022) Financial Turbulence, Systemic Risk and the Predictability of Stock Market Volatility. Global Finance Journal, 52, 100699. https://doi.org/10.1016/j.gfj.2022.100699 |

[12] |
Nystrup, P., Boyd, S., Lindström, E. and Madsen, H. (2018) Multi-Period Portfolio Selection with Drawdown Control. Annals of Operations Research, 282, 245-271. https://doi.org/10.1007/s10479-018-2947-3 |

[13] | Liu, X.Y., Yang, H., Gao, J. and Wang, C. (2021) FinRL: Deep Reinforcement Learning Framework to Automate Trading in Quantitative Finance. SSRN Electronic Journal. https://doi.org/10.2139/ssrn.3955949 |

[14] |
Nystrup, P., Madsen, H. and Lindström, E. (2018) Dynamic Portfolio Optimization across Hidden Market Regimes. Quantitative Finance, 18, 83-95. https://doi.org/10.1080/14697688.2017.1342857 |

[15] | Rotela Junior, P., Salomon, F.L.R. and De Oliveira Pamplona, E. (2014) ARIMA: An Applied Time Series Forecasting Model for the Bovespa Stock Index. Applied Mathematics, 5, 3383-3391. https://doi.org/10.4236/am.2014.521315 |

[16] | Adebiyi, A.A., Adewumi, A.O. and Ayo, C.K. (2014) Comparison of ARIMA and Artificial Neural Networks Models for Stock Price Prediction. Journal of Applied Mathematics, 2014, Article ID: 614342. https://doi.org/10.1155/2014/614342 |

[17] | Ariyo, A.A., Adewumi, A.O. and Ayo, C.K. (2014) Stock Price Prediction Using the ARIMA Model. 2014 UKSim-AMSS 16th International Conference on Computer Modelling and Simulation, Cambridge, 26-28 March 2014, 106-112. https://doi.org/10.1109/UKSim.2014.67 |

[18] | Frennberg, P. (1998) An Evaluation of Alternative Models for Predicting Stock Volatility: Evidence from a Small Stock Market. Journal of International Financial Markets, Institutions & Money, 5, 117-134. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=7257 |

[19] | Gruber, L.F. and West, M. (2017) Bayesian Online Variable Selection and Scalable Multivariate Volatility Forecasting in Simultaneous Graphical Dynamic Linear Models. Econometrics and Statistics, 3, 3-22. https://doi.org/10.1016/j.ecosta.2017.03.003 |

[20] | Mondal, P., Shit, L. and Goswami, S. (2014) Study of Effectiveness of Time Series Modeling (ARIMA) in Forecasting Stock Prices. International Journal of Computer Science, Engineering and Applications, 4, 13-29. https://doi.org/10.5121/ijcsea.2014.4202 |

[21] | Nonejad, N. (2017) Forecasting Aggregate Stock Market Volatility Using Financial and Macroeconomic Predictors: Which Models Forecast Best, When and Why? Journal of Empirical Finance, 42, 131-154. https://doi.org/10.1016/j.jempfin.2017.03.003 |

[22] | Nystrup, P., Boyd, S., Lindström, E. and Madsen, H. (2018) Multi-Period Portfolio Selection with Drawdown Control. Annals of Operations Research, 282, 245-271. https://doi.org/10.1007/s10479-018-2947-3 |

[23] | Piccoli, P.P. (2015) Identification of a Dynamic Linear Model for the American GDP. Università Ca’ Foscari Venezia, Venice. http://hdl.handle.net/10579/6810 |

[24] | Zhang, W., Gong, X., Wang, C. and Ye, X. (2021) Predicting Stock Market Volatility Based on Textual Sentiment: A Nonlinear Analysis. Journal of Forecasting, 40, 1479-1500. https://doi.org/10.1002/for.2777 |

[25] | Zhu, X., Ma, M., Yang, H. and Ge, W. (2017) Modeling the Spatiotemporal Dynamics of Gross Domestic Product in China Using Extended Temporal Coverage Nighttime Light Data. Remote Sensing, 9, 626. https://doi.org/10.3390/rs9060626 |

[26] | Zolfaghari, M. and Gholami, S. (2021) A Hybrid Approach of Adaptive Wavelet Transform, Long Short-Term Memory and ARIMA-GARCH Family Models for the Stock Index Prediction. Expert Systems with Applications, 182, Article ID: 115149. https://doi.org/10.1016/j.eswa.2021.115149 |

[27] | Morettin, P.A. and Toloi, C.M. (2006) Análise de Séries Temporais. 2nd Edition (Revised and Expanded), Editora Edgard Blücher, São Paulo. |

[28] | RDocumentation (n.d.) arima Function—RDocumentation. https://www.rdocumentation.org/packages/stats/versions/3.6.2/topics/arima |

[29] | West, M. and Harrison, J. (1997) Bayesian Forecasting and Dynamic Models (Springer Series in Statistics). 2nd Edition, Springer, Berlin. |

[30] | Gamerman, D. (1998) Markov Chain Monte Carlo for Dynamic Generalised Linear Models. Biometrika, 85, 215-227. https://doi.org/10.1093/biomet/85.1.215 |

[31] | Migon, H.S., Gamerman, D., Lopes, H.F. and Ferreira, M.A. (2005) Dynamic Models. In: Handbook of Statistics, Elsevier, Amsterdam, 553-588. https://doi.org/10.1016/S0169-7161(05)25019-8 |

[32] | RDocumentation (n.d.) dlmModPoly Function—RDocumentation. https://www.rdocumentation.org/packages/dlm/versions/1.1-6/topics/dlmModPoly |

[33] | RDocumentation (n.d.) dlmModTrig Function—RDocumentation. https://www.rdocumentation.org/packages/dlm/versions/1.1-5/topics/dlmModTrig |

Copyright © 2023 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.