Relative Performance Evaluation of Competing Crude Oil Prices’ Volatility Forecasting Models: A Slacks-Based Super-Efficiency DEA Model
Input-oriented super efficiency analysis and output-oriented super efficiency analysis only take account of technical efficiency, whereas orientation-free super efficiency analysis takes account of an additional performance component; namely, slacks. Notice that the efficient model SMA20 maintained its best position in the rankings regardless of whether the DEA analysis is input-oriented, output-oriented, or orientation-free, because it is always on the efficient frontier and has zero slacks regardless of the performance measures used.
With respect to orientation-free super efficiency analysis, a close look at Table 3 reveals that, whether one measures biasedness by ME or MVolScE and goodness-of-fit by MAE or MAVolScE, the ranks of the best models (e.g., SMA20, SES, and AR(5)) and the worst models (e.g., RW, HM, and AR(1)) remain the same; i.e., they are robust to changes in measures. Similarly, whether one measures biasedness by ME or MVolScE and goodness-of-fit by MSE or MSVolScE, the ranks of the best models (e.g., SMA20, SES, and CGARCH(1, 1)) and the worst models (e.g., RW, HM, and AR(1)) remain the same. These rankings suggest that, for our data set, AR(5) tends to produce occasional large errors whereas CGARCH(1, 1) tends to produce small ones, as their ranks are sensitive to whether or not large errors are penalized more heavily than small ones. Finally, whether one measures biasedness by ME or MVolScE and goodness-of-fit by MMEU (respectively, MMEO), the ranks of the best models such as SMA20 and CGARCH(1, 1) (respectively, RW, HM, and SMA20) and the worst models such as RW, HM, and AR(1) (respectively, SMA60 and PARCH(1, 1)) remain the same. Notice that the rankings under MMEU and MMEO differ significantly, which suggests, for example, that the performance of models such as RW, HM, and CGARCH(1, 1) is very sensitive to whether one penalizes negative errors more than positive ones (i.e., the decision maker prefers models that under-estimate the forecasts) or vice versa.
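The sensitivity of ranks to whether large errors are penalized can be made concrete with the standard (unscaled) definitions of ME, MAE, and MSE; the volatility-scaled variants (MVolScE, MAVolScE, MSVolScE) apply the same formulas to scaled errors. A minimal sketch with hypothetical error series, chosen only to show MAE and MSE ranking two models in opposite orders:

```python
import numpy as np

def measures(actual, forecast):
    """ME captures biasedness; MAE and MSE capture goodness-of-fit,
    with MSE penalizing large errors more heavily than MAE does."""
    e = np.asarray(actual) - np.asarray(forecast)
    return {"ME": e.mean(), "MAE": np.abs(e).mean(), "MSE": (e ** 2).mean()}

actual = np.ones(4)
model_a = actual - 0.5                             # four medium errors of 0.5
model_b = actual - np.array([0.1, 0.1, 0.1, 1.5])  # small errors plus one large one

m_a, m_b = measures(actual, model_a), measures(actual, model_b)
# MAE ranks model_b first (0.45 < 0.5), but MSE ranks model_a first (0.25 < 0.57),
# because MSE penalizes model_b's single large error much more heavily.
```

A model whose rank improves when moving from MSE to MAE, as AR(5) does here, therefore tends to make occasional large errors.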
Table 1. Unidimensional rankings of competing forecasting models: models ranked in descending order of performance.
Table 2. Multidimensional rankings of volatility forecasting models based on super-efficiency DEA scores.
Model key: 1 = RW; 2 = HM; 3 = SMA20; 4 = SMA60; 5 = SES; 6 = ARMA(1, 1); 7 = AR(1); 8 = AR(5); 9 = GARCH(1, 1); 10 = GARCH-M(1, 1); 11 = EGARCH(1, 1); 12 = TGARCH(1, 1); 13 = PARCH(1, 1); 14 = CGARCH(1, 1).
In general, however, when under-estimated forecasts are penalized, most GARCH-type models tend to perform well, suggesting that they often produce over-estimated forecasts. On the other hand, when over-estimated forecasts are penalized, simple models such as RW, HM, and SES tend to perform very well, suggesting that they often produce under-estimated forecasts.
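This rank reversal is easy to reproduce with a generic asymmetric loss. The weights below are hypothetical and this is not the paper’s exact MMEU/MMEO formula; it only illustrates the direction of the effect:

```python
import numpy as np

def asymmetric_mean_error(actual, forecast, under_weight=1.0, over_weight=1.0):
    """Mean absolute error with separate (illustrative) weights on
    under-forecasts (actual > forecast) and over-forecasts."""
    e = np.asarray(actual) - np.asarray(forecast)   # e > 0 : under-forecast
    w = np.where(e > 0, under_weight, over_weight)
    return float((w * np.abs(e)).mean())

actual = np.full(4, 1.0)
over_model = np.full(4, 1.2)    # systematically over-forecasts
under_model = np.full(4, 0.8)   # systematically under-forecasts

# Penalize under-forecasts (the MMEU direction): the over-forecasting model wins.
u_over = asymmetric_mean_error(actual, over_model, under_weight=2.0)    # 0.2
u_under = asymmetric_mean_error(actual, under_model, under_weight=2.0)  # 0.4

# Penalize over-forecasts (the MMEO direction): the ranking reverses.
o_over = asymmetric_mean_error(actual, over_model, over_weight=2.0)     # 0.4
o_under = asymmetric_mean_error(actual, under_model, over_weight=2.0)   # 0.2
```

Any model with a systematic directional bias will swap ranks between the two loss directions, which is the pattern observed for RW, HM, and the GARCH family.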
Last, but not least, given our data set and the measures under consideration, the numerical results suggest that, with the exception of CGARCH, the family of GARCH models delivers only average performance compared with smoothing models such as SMA20 and SES. This suggests that the data generation process has a relatively long memory, which gives an advantage to models such as SMA20 and SES over GARCH(1, 1), GARCH-M(1, 1), EGARCH(1, 1), TGARCH(1, 1), and PARCH(1, 1), which are short-memory models. Similar findings on GARCH-type models were reported by Kang et al. [12].
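As an illustration of why such smoothing models carry a longer memory, consider minimal one-step-ahead implementations of the two forecasts (the smoothing constant `alpha` below is an arbitrary illustrative value, not one estimated in the paper): SMA20 weights the last 20 observations equally, and SES with a small `alpha` averages over the entire history, so both respond slowly to isolated recent shocks.

```python
import numpy as np

def sma_forecast(vol, window=20):
    """One-step-ahead SMA forecast: the mean of the last `window` observations."""
    return float(np.mean(vol[-window:]))

def ses_forecast(vol, alpha=0.2):
    """One-step-ahead SES forecast: a recursive weighted average in which the
    observation k periods back receives weight alpha * (1 - alpha) ** k."""
    f = float(vol[0])
    for v in vol[1:]:
        f = alpha * float(v) + (1 - alpha) * f
    return f

vol = np.full(60, 0.3)   # a flat volatility series
# On a flat series both forecasts recover the level, 0.3; a single recent
# spike moves SMA20 by only 1/20 of its size and SES by only alpha of it.
```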
Table 3. Multidimensional rankings of volatility forecasting models based on slacks-based super-efficiency DEA scores.
Model key: 1 = RW; 2 = HM; 3 = SMA20; 4 = SMA60; 5 = SES; 6 = ARMA(1, 1); 7 = AR(1); 8 = AR(5); 9 = GARCH(1, 1); 10 = GARCH-M(1, 1); 11 = EGARCH(1, 1); 12 = TGARCH(1, 1); 13 = PARCH(1, 1); 14 = CGARCH(1, 1).
5. Conclusion
Nowadays, forecasts play a crucial role in driving our decisions and shaping our future plans in many application areas, such as economics, finance and investment, marketing, and the design and operational management of supply chains. Obviously, forecasting problems differ along many dimensions; however, regardless of how one defines the forecasting problem, a common issue faced by both academics and practitioners is the performance evaluation of competing forecasting models. Although most studies use several performance criteria, and one or several metrics for each criterion, the assessment of the relative performance of competing forecasting models is generally restricted to a ranking per measure, which usually leads to different unidimensional rankings. Xu and Ouenniche [1] proposed an input-oriented radial super-efficiency DEA-based framework to evaluate the performance of competing forecasting models of crude oil prices’ volatility, which delivers a single ranking based on multiple performance criteria. However, such a framework suffers from four main issues. First, under the VRS assumption, input-oriented super-efficiency scores can differ from output-oriented super-efficiency scores, which may lead to different rankings. Second, radial super-efficiency DEA models may be infeasible for some efficient DMUs; therefore, ties would persist in the rankings. Third, radial super-efficiency DEA models ignore mix efficiency, which may lead to over-estimated efficiency scores. Fourth, in many applications such as ours, the choice of an orientation in DEA is rather superfluous. In this paper, we overcome these issues by proposing an orientation-free super-efficiency DEA framework; namely, a slacks-based super-efficiency DEA framework for assessing the relative performance of competing volatility forecasting models.
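To make the radial super-efficiency idea concrete, the sketch below implements the input-oriented CCR (constant-returns-to-scale) model as a linear program, excluding the DMU under evaluation from the reference set to obtain its super-efficiency score. The data are a hypothetical toy example, and this is deliberately the radial model rather than the slacks-based formulation proposed here; under VRS (adding the convexity constraint on the lambdas), this LP can become infeasible for some efficient DMUs, which is precisely the infeasibility issue noted above.

```python
import numpy as np
from scipy.optimize import linprog

def ccr_input_score(x, y, o, super_eff=False):
    """Input-oriented CCR efficiency of DMU o (x: (n, m) inputs, y: (n, s) outputs).
    With super_eff=True, DMU o is excluded from the reference set, so efficient
    DMUs obtain scores above 1 and can be ranked among themselves."""
    n, m = x.shape
    s = y.shape[1]
    ref = [j for j in range(n) if not (super_eff and j == o)]
    k = len(ref)
    c = np.zeros(1 + k)
    c[0] = 1.0                        # minimize theta; variables are [theta, lambdas]
    A_ub = np.zeros((m + s, 1 + k))
    b_ub = np.zeros(m + s)
    for i in range(m):                # sum_j lambda_j * x_ji <= theta * x_oi
        A_ub[i, 0] = -x[o, i]
        A_ub[i, 1:] = x[ref, i]
    for r in range(s):                # sum_j lambda_j * y_jr >= y_or
        A_ub[m + r, 1:] = -y[ref, r]
        b_ub[m + r] = -y[o, r]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub,
                  bounds=[(0, None)] * (1 + k), method="highs")
    return float(res.x[0]) if res.success else None

# Toy data: one input, one output, four DMUs; DMU 1 is the only efficient one.
x = np.array([[1.0], [2.0], [4.0], [3.0]])
y = np.array([[1.0], [4.0], [5.0], [1.0]])
# Standard score of DMU 1 is 1.0; its super-efficiency score rises above 1,
# breaking the tie among efficient DMUs that motivates super-efficiency models.
```

The slacks-based variant replaces the single radial contraction factor theta with input and output slacks in the objective, which is what makes it orientation-free.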
We assessed the relative performance of fourteen forecasting models of crude oil prices’ volatility based on three criteria commonly used in the forecasting community; namely, goodness-of-fit, biasedness, and the correct sign criterion. We considered several measures for each criterion in order to assess the robustness of the multidimensional rankings with respect to the choice of measures. The main conclusions of this research may be summarized as follows. First, models that are on the efficient frontier and have zero slacks regardless of the performance measures used (e.g., SMA20) maintain their ranks regardless of whether the DEA analysis is input-oriented, output-oriented, or orientation-free. Second, the multicriteria rankings of the best and the worst models seem robust to changes in most performance measures; SMA20, in particular, is the best performer across the board. Third, when under-estimated forecasts are penalized, most GARCH-type models tend to perform well, suggesting that they often produce over-estimated forecasts; on the other hand, when over-estimated forecasts are penalized, simple models such as RW, HM, and SES tend to perform very well, suggesting that they often produce under-estimated forecasts. Finally, our empirical results suggest that, with the exception of CGARCH, the family of GARCH models delivers only average performance compared with smoothing models such as SMA20 and SES, which suggests that the data generation process has a relatively long memory.