Forecasting and Backtesting of VaR in International Dry Bulk Shipping Market under Skewed Distributions

It is extremely important to model the empirical distributions of dry bulk shipping returns accurately in estimating risk measures. Based on several commonly used distributions and alternative distributions, this paper establishes nine different risk models to forecast the Value-at-Risk (VaR) of dry bulk shipping markets. Several backtests are explored to compare the accuracy of VaR forecasting. The empirical results indicate that the risk models based on commonly used distributions have relatively poor performance, while the alternative distributions, i.e. the Skewed Student-T (SST) distribution, the Skewed Generalized Error Distribution (SGED), and the Hyperbolic distribution (HYP), produce more accurate VaR measurement. The empirical results suggest that risk managers should consider more flexible empirical distributions when managing extreme risks in dry bulk shipping markets.


Introduction
Owing to global trade and to economic and policy uncertainty, the world dry bulk shipping market is a high-risk and highly volatile market, which brings various risks and opportunities to market participants [1].
Value-at-Risk (VaR) is widely used by financial institutions and banks as a standard tool for quantifying market risk [2]. Chao [3] applied VaR models under the Normal, Student-t (ST) and Skewed Student-T (SST) distributions to assess the risk of dry bulk freight rates, and concluded that an asymmetric long-memory volatility structure with SST-distributed innovations can obtain accurate [...]
Q. N. Du
[...] this paper makes VaR predictions for both the long position, which is governed by the left tail of the return distribution, and the short position, which is governed by the right tail, and this provides more robust conclusions. Third, since there is no evidence that any backtesting method has an absolute advantage over the others, several tests should be used to ensure robustness. This paper applies four tests to evaluate each risk model: the UC, IND, CC and DQ tests at six quantile levels.
The remainder of this paper is organized as follows: Section 2 introduces the risk prediction models based on different distributions; Section 3 reviews the backtests; Section 4 provides data descriptions and preliminary analysis; Section 5 presents the empirical results; Section 6 concludes.

Freight Rate Volatility Models
The excess returns in dry bulk shipping markets are specified as:

$$r_t = m_t + \varepsilon_t, \qquad \varepsilon_t = \sigma_t z_t$$

where $m_t$ and $\sigma_t$ are the conditional mean and standard deviation, and $z_t$ is assumed to follow one of the standardized distributions used in this study.
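This paper chooses the GJR-GARCH model [17] to model the volatility. It allows a negative shock in the previous period to have a larger impact on the current variance than a positive shock of the same size, which makes it more suitable for studying the asymmetric leverage effect in the dry bulk market. In its standard first-order form, the variance $\sigma_t^2$ is defined as:

$$\sigma_t^2 = \omega + \left(\alpha_1 + \gamma\,\mathbb{1}(\varepsilon_{t-1} < 0)\right)\varepsilon_{t-1}^2 + \beta_1\,\sigma_{t-1}^2$$

where $\varepsilon_{t-1} = r_{t-1} - m_{t-1}$ is the shock of the previous period and $\mathbb{1}(\cdot)$ equals 1 when its argument is true and 0 otherwise.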
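The asymmetric response of the variance can be illustrated with a minimal sketch. It assumes the standard GJR-GARCH(1,1) recursion, in which the next variance equals omega + (alpha1 + gamma * indicator) * shock^2 + beta1 * current variance, with the indicator equal to 1 for a negative shock. The function name and all parameter values are hypothetical and are not estimates from this paper.

```python
def gjr_garch_variance(residuals, omega, alpha1, gamma, beta1, initial_variance):
    """Return the conditional variance path of a GJR-GARCH(1,1) model.

    The first element is the starting variance; each later element is the
    one-step-ahead variance implied by the preceding residual.
    """
    sigma2 = [initial_variance]
    for eps in residuals:
        negative_shock = 1.0 if eps < 0 else 0.0  # leverage indicator
        next_variance = (omega
                         + (alpha1 + gamma * negative_shock) * eps ** 2
                         + beta1 * sigma2[-1])
        sigma2.append(next_variance)
    return sigma2


# Hypothetical parameter values, chosen only for illustration.
params = dict(omega=0.05, alpha1=0.10, gamma=0.08, beta1=0.80, initial_variance=1.0)
after_negative = gjr_garch_variance([-2.0], **params)[1]
after_positive = gjr_garch_variance([2.0], **params)[1]
print(after_negative, after_positive)
```

With a positive gamma, the shock of -2 raises the next variance more than the shock of +2, which is the leverage effect the model is chosen to capture.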

Modeling the Distributions
This section describes the distributions used for modeling. Theodossiou extends the GED distribution to accommodate the skewness and leptokurtosis of the empirical distribution of returns. The probability density function of the standardized SGED distribution is expressed as follows: [...]

[...] where $K_j$ is the modified Bessel function of the third kind with index $j$, which determines the shape of the distribution, $\alpha$ and $\delta$ are the shape parameters, and $\mu$ determines the location of the distribution. When $\lambda = 1$, function (6) simplifies as follows: [...]

This is the density function of the hyperbolic distribution (HYP), which is the simplest subclass of the generalized hyperbolic distribution family and is often preferred in practical applications.
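For reference, the generalized hyperbolic family is commonly written in the parametrization below. This is only a sketch: it assumes that function (6) follows this standard form, and the skewness parameter $\beta$ and the abbreviation $q(x)$ are introduced here for illustration.

```latex
% Standard generalized hyperbolic (GH) density (assumed parametrization).
f_{GH}(x;\lambda,\alpha,\beta,\delta,\mu)
  = \frac{(\alpha^2-\beta^2)^{\lambda/2}}
         {\sqrt{2\pi}\,\alpha^{\lambda-1/2}\,\delta^{\lambda}\,
          K_{\lambda}\!\left(\delta\sqrt{\alpha^2-\beta^2}\right)}
    \; q(x)^{\lambda-1/2}\,
    K_{\lambda-1/2}\!\left(\alpha\,q(x)\right)\,
    e^{\beta(x-\mu)},
\qquad q(x)=\sqrt{\delta^2+(x-\mu)^2}

% Setting \lambda = 1 and using K_{1/2}(z)=\sqrt{\pi/(2z)}\,e^{-z}
% gives the hyperbolic (HYP) density:
f_{HYP}(x;\alpha,\beta,\delta,\mu)
  = \frac{\sqrt{\alpha^2-\beta^2}}
         {2\,\alpha\,\delta\,K_{1}\!\left(\delta\sqrt{\alpha^2-\beta^2}\right)}
    \exp\!\left(-\alpha\,q(x)+\beta(x-\mu)\right)
```

The log-density of HYP is a hyperbola in $x$, which gives tails that are heavier than those of the normal distribution while keeping the density simple to evaluate.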

VaR Calculation
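This paper takes VaR as the risk forecasting measure. For a given time horizon and confidence level $q$, setting $\alpha = 1 - q$, the VaR is equal to:

$$\mathrm{VaR}_{t+1|t}(\alpha) = m_{t+1|t} + \sigma_{t+1|t}\,F^{-1}(\alpha) \tag{8}$$

where $F^{-1}(\alpha)$ is the $\alpha$-quantile of the assumed distribution of the standardized innovations, $m_{t+1|t}$ is the conditional mean, $\sigma_{t+1|t}$ is the conditional standard deviation, and $I_t$ represents all information available before the realization of $r_t$. The preceding 2500 observations are used as the in-sample. We estimate the parameters using the models in Sections 2.1 and 2.2, obtain the predicted values of $\sigma_{t+1|t}$ and $m_{t+1|t}$, and then compute the first VaR value by formula (8). By recursively updating the parameter estimates, the full sequence of VaR forecasts is obtained.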
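The forecasting loop can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the VaR is taken as the mean forecast plus the volatility forecast times the alpha-quantile of the standardized innovations, a standard normal quantile stands in for the nine distributions, a fixed-length rolling window is assumed, and `naive_forecast` is a hypothetical placeholder for the GJR-GARCH estimation step.

```python
from statistics import NormalDist, mean, stdev


def one_day_var(mean_forecast, sigma_forecast, alpha,
                innovation_quantile=NormalDist().inv_cdf):
    """VaR = m + sigma * F^{-1}(alpha) for standardized innovations."""
    return mean_forecast + sigma_forecast * innovation_quantile(alpha)


def rolling_var(returns, window, alpha, fit_and_forecast):
    """Re-estimate on each window and forecast the next day's VaR."""
    forecasts = []
    for end in range(window, len(returns)):
        m, sigma = fit_and_forecast(returns[end - window:end])
        forecasts.append(one_day_var(m, sigma, alpha))
    return forecasts


def naive_forecast(sample):
    # Placeholder for the model fit: sample mean and standard deviation.
    return mean(sample), stdev(sample)


# Hypothetical returns in percent.
returns = [0.5, -1.2, 0.3, 2.1, -0.7, 0.9, -1.5, 0.4]
var_long = rolling_var(returns, window=5, alpha=0.05, fit_and_forecast=naive_forecast)
var_short = rolling_var(returns, window=5, alpha=0.95, fit_and_forecast=naive_forecast)
print(var_long, var_short)
```

Passing a different `innovation_quantile` (for example the quantile function of a skewed distribution) changes only the last term, which is why the choice of distribution matters most in the tails.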

VaR Backtesting
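We use four backtesting methods to evaluate the predictive performance of the risk models. The first is Kupiec's unconditional coverage (UC) test. First, define a binary sequence that records a VaR "violation" at quantile level $\alpha$:

$$I_t(\alpha) = \begin{cases} 1, & r_t < \mathrm{VaR}_t(\alpha) \\ 0, & \text{otherwise} \end{cases} \tag{12}$$

For the short position, a violation occurs instead when $r_t > \mathrm{VaR}_t(\alpha)$.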
When the null hypothesis (that the risk model used to calculate the VaR is sufficiently accurate) holds, it can be proved that the likelihood ratio statistic $LR_{uc}$ satisfies:

$$LR_{uc} = -2\ln\left[(1-\alpha)^{T_0}\,\alpha^{T_1}\right] + 2\ln\left[\left(1-\frac{T_1}{T}\right)^{T_0}\left(\frac{T_1}{T}\right)^{T_1}\right] \sim \chi^2(1)$$

where $T$ is the total length of the violation sequence, $T_0$ is the number of zeros in the sequence, and $T_1$ is the number of ones. At a given quantile level, if the calculated LR statistic is greater than the critical value of the $\chi^2$ distribution with 1 degree of freedom at that level, the null hypothesis is rejected; otherwise, the null hypothesis is accepted and the risk model employed is considered sufficiently accurate. [...]

[...] $n_{00}$ denotes the number of observation periods in which the model is successful in the current period, that is, the actual loss does not exceed the VaR, and was also successful in the previous period.
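A minimal sketch of the unconditional coverage test, assuming the usual Kupiec form: the statistic compares the Bernoulli likelihood of the violation sequence at the nominal rate alpha with its likelihood at the observed rate, and is referred to a chi-square distribution with one degree of freedom. The function name and the two example sequences are hypothetical.

```python
import math


def kupiec_uc(violations, alpha):
    """Return (LR_uc, p_value) for a 0/1 violation sequence."""
    total = len(violations)
    n_ones = sum(violations)
    n_zeros = total - n_ones

    def log_likelihood(p):
        # Bernoulli log-likelihood; terms with a zero count are skipped
        # so that 0 * log(0) is treated as 0.
        value = 0.0
        if n_zeros:
            value += n_zeros * math.log(1.0 - p)
        if n_ones:
            value += n_ones * math.log(p)
        return value

    observed_rate = n_ones / total
    lr = max(-2.0 * (log_likelihood(alpha) - log_likelihood(observed_rate)), 0.0)
    # Survival function of chi-square(1): P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(lr / 2.0))
    return lr, p_value


# Hypothetical sequences of 100 days at the 5% level.
lr_ok, p_ok = kupiec_uc([1] * 5 + [0] * 95, alpha=0.05)     # 5 violations
lr_bad, p_bad = kupiec_uc([1] * 15 + [0] * 85, alpha=0.05)  # 15 violations
print(lr_ok, p_ok, lr_bad, p_bad)
```

The test only counts violations, so it cannot tell whether they arrive in clusters; that is the motivation for the independence test.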
Similarly, if the calculated LR statistic is greater than the critical value of the $\chi^2$ distribution at that level, the null hypothesis is rejected; otherwise, the null hypothesis is accepted.
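The independence (IND) test can be sketched as follows, assuming Christoffersen's first-order Markov formulation: the counts n_ij record how often state i on one day is followed by state j on the next, and the statistic compares a model with a single violation probability against one in which the probability depends on the previous day's state. The function names and example sequences are hypothetical.

```python
import math


def _count_times_log(count, prob):
    # Skip empty cells so that 0 * log(0) is treated as 0.
    return count * math.log(prob) if count else 0.0


def christoffersen_ind(violations):
    """Return (LR_ind, p_value) for a 0/1 sequence with at least two entries."""
    n = [[0, 0], [0, 0]]
    for previous, current in zip(violations, violations[1:]):
        n[previous][current] += 1
    n00, n01, n10, n11 = n[0][0], n[0][1], n[1][0], n[1][1]

    pi01 = n01 / (n00 + n01) if (n00 + n01) else 0.0  # P(violation | none before)
    pi11 = n11 / (n10 + n11) if (n10 + n11) else 0.0  # P(violation | violation before)
    pi = (n01 + n11) / (n00 + n01 + n10 + n11)        # unconditional rate

    ll_null = _count_times_log(n00 + n10, 1.0 - pi) + _count_times_log(n01 + n11, pi)
    ll_alt = (_count_times_log(n00, 1.0 - pi01) + _count_times_log(n01, pi01)
              + _count_times_log(n10, 1.0 - pi11) + _count_times_log(n11, pi11))
    lr = max(-2.0 * (ll_null - ll_alt), 0.0)
    return lr, math.erfc(math.sqrt(lr / 2.0))  # chi-square(1) survival function


# Two hypothetical sequences with 10 violations in 100 days.
clustered = [0] * 40 + [1] * 5 + [0] * 40 + [1] * 5 + [0] * 10
spread = [1 if day % 10 == 9 else 0 for day in range(100)]
lr_clustered, p_clustered = christoffersen_ind(clustered)
lr_spread, p_spread = christoffersen_ind(spread)
print(lr_clustered, p_clustered, lr_spread, p_spread)
```

Both sequences have the same violation rate, but only the clustered one is rejected, since a violation there is usually followed by another one.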
The conditional coverage (CC) test combines the two previous tests. When the null hypothesis holds, it can be proved that the likelihood ratio statistic $LR_{cc}$ satisfies:

$$LR_{cc} = LR_{uc} + LR_{ind} \sim \chi^2(2)$$

Finally, Engle proposed a dynamic quantile (DQ) test based on a linear regression of the hit variables. The hit function is expressed as:

$$Hit_t(\alpha) = I_t(\alpha) - \alpha$$

where $I_t(\alpha)$ is the binary sequence in Equation (12). A linear regression is then performed on the following formula:

$$Hit_t = X\lambda + \varepsilon_t \tag{15}$$

where $\varepsilon_t$ is a discrete process with a mean of zero and $X$ is a matrix of explanatory variables. When the null hypothesis holds, the DQ statistic should satisfy:

$$DQ = \frac{\hat{\lambda}' X' X \hat{\lambda}}{\alpha(1-\alpha)} \sim \chi^2(k)$$

where $k$ is the number of columns of $X$. At quantile level $\alpha$, if the DQ statistic is greater than the critical value of the $\chi^2$ distribution with $k$ degrees of freedom, the null hypothesis is rejected; otherwise, the null hypothesis is accepted, that is, the risk model used is considered accurate.
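The conditional coverage and DQ statistics can be sketched as follows. The sketch assumes that the CC statistic is the sum of the UC and IND statistics with a chi-square(2) reference distribution, and that the DQ regressors are a constant, four lagged hits and the current VaR forecast. The regressor set, the function names and the simulated inputs are assumptions for illustration; the surviving text does not specify X.

```python
import math
import random


def conditional_coverage(lr_uc, lr_ind):
    """Return (LR_cc, p_value); chi-square(2) survival is exp(-x / 2)."""
    lr_cc = lr_uc + lr_ind
    return lr_cc, math.exp(-lr_cc / 2.0)


def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def dq_statistic(violations, var_forecasts, alpha, lags=4):
    """Return (DQ, degrees_of_freedom) for the assumed regressor set."""
    hits = [v - alpha for v in violations]
    rows, targets = [], []
    for t in range(lags, len(hits)):
        lagged = [hits[t - k] for k in range(1, lags + 1)]
        rows.append([1.0] + lagged + [var_forecasts[t]])
        targets.append(hits[t])
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    lam = solve_linear(xtx, xty)  # ordinary least squares coefficients
    quad = sum(lam[i] * xtx[i][j] * lam[j] for i in range(k) for j in range(k))
    return quad / (alpha * (1.0 - alpha)), k


# Simulated inputs, for illustration only.
rng = random.Random(0)
alpha = 0.05
var_forecasts = [-2.0 - rng.random() for _ in range(500)]
violations = [1 if rng.random() < alpha else 0 for _ in range(500)]
dq, dof = dq_statistic(violations, var_forecasts, alpha)
print(dq, dof)
```

With six regressors, the statistic would be compared with the chi-square(6) critical value, which is about 12.59 at the 5% level.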

Data and Preliminary Analysis
This paper selects the Baltic Dry Bulk Daily Freight Index of the four sectors in the dry bulk market (Capesize, Panamax, Supramax, Handysize) as samples, that is, the BCI, BPI, BSI and BHSI. The period from September 2006 to December 2017 is taken as the in-sample, and the later data as the out-of-sample used to evaluate the prediction performance of each risk model. Because the data include the fluctuations during the 2008 financial crisis, they also pose a challenge for VaR forecasting. Define $p_t$ as the closing price on day $t$; the daily returns are calculated as

$$r_t = 100 \times \ln\left(p_t / p_{t-1}\right)$$

Figure 1 shows the daily return series for the four samples. Table 1 gives descriptive statistics.
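As a small illustration of the return definition, the sketch below computes percentage log returns, that is, 100 times the log of the ratio of consecutive index levels. The index levels are hypothetical and are not taken from the Baltic indexes.

```python
import math


def log_returns_percent(prices):
    """Return 100 * ln(p_t / p_{t-1}) for consecutive prices."""
    return [100.0 * math.log(current / previous)
            for previous, current in zip(prices, prices[1:])]


# Hypothetical index levels: a 5% rise followed by a 5% fall.
index_levels = [1000.0, 1050.0, 997.5]
returns = log_returns_percent(index_levels)
print(returns)
```

Log returns are additive over time, so the two returns here sum to the log return over both days.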
Notes to Table 1: LQ(1) is the Ljung-Box statistic of order 1 for the return series; ***, **, * denote significance at the 1%, 5%, and 10% levels, respectively.

According to the descriptive statistics in Figure 1, Figure 2 and Table 1, it can be found that:
1) The return series of the four indexes fluctuate greatly. BCI returns are concentrated between −20 and 20, the largest fluctuation range; BPI returns are concentrated between −10 and 10; BSI and BHSI returns are concentrated between −5 and 5, a relatively small fluctuation range. These results are in line with the actual fluctuations in each market.
2) The skewness of BPI and BHSI is negative, while that of BCI and BSI is positive, which indicates that all the samples display asymmetry. The kurtosis of all the samples is greater than 3, and that of the BSI and BHSI samples is almost 5 to 6 times the standard value, indicating that all four samples display significant leptokurtosis.
3) The J-B statistics show that the four samples all reject the hypothesis of normality. LQ(1) indicates that the return series are strongly serially correlated. The ADF and PP test results significantly reject the null hypothesis of non-stationarity, implying that all return series are stationary.
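The descriptive statistics discussed above can be sketched as follows. The sketch assumes the usual population-moment definitions of skewness and kurtosis, the Jarque-Bera statistic n/6 * (S^2 + (K - 3)^2 / 4) with a chi-square(2) reference distribution, and the Ljung-Box statistic n(n + 2) times the sum of squared autocorrelations divided by (n - k). The function names and the toy sample are hypothetical.

```python
import math


def skewness_kurtosis(x):
    """Return (skewness, kurtosis) based on population central moments."""
    n = len(x)
    mu = sum(x) / n
    m2 = sum((v - mu) ** 2 for v in x) / n
    m3 = sum((v - mu) ** 3 for v in x) / n
    m4 = sum((v - mu) ** 4 for v in x) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2


def jarque_bera(x):
    """Return (JB, p_value); chi-square(2) survival is exp(-x / 2)."""
    skew, kurt = skewness_kurtosis(x)
    jb = len(x) / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    return jb, math.exp(-jb / 2.0)


def ljung_box(x, lags=1):
    """Return the Ljung-Box Q statistic up to the given lag."""
    n = len(x)
    mu = sum(x) / n
    denom = sum((v - mu) ** 2 for v in x)
    total = 0.0
    for k in range(1, lags + 1):
        rho_k = sum((x[t] - mu) * (x[t - k] - mu) for t in range(k, n)) / denom
        total += rho_k ** 2 / (n - k)
    return n * (n + 2) * total


# Toy sample, for illustration only.
sample = [-2.0, -1.0, 0.0, 1.0, 2.0]
skew, kurt = skewness_kurtosis(sample)
jb, jb_p = jarque_bera(sample)
q1 = ljung_box(sample, lags=1)
print(skew, kurt, jb, jb_p, q1)
```

A kurtosis above 3 together with non-zero skewness inflates the JB statistic, which is the pattern reported for all four indexes.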

Estimation for GJR-GARCH Models with Different Distributions
This section discusses parameter estimation of the GJR-GARCH model based on nine statistical distributions. Due to space constraints, Table 2 only shows the BCI estimates.

Notes to Table 2: Standard errors are in parentheses. Ln(θ) is the maximized log-likelihood value. LQ(i) are the Ljung-Box statistics of order i. ***, **, * denote significance at the 1%, 5%, and 10% levels, respectively.

From Table 2, the ARCH and GARCH coefficients of all the models are highly significant, indicating that dry bulk shipping market returns display significant volatility clustering. All the shape and heavy-tail parameters of each model are significant at the 1% level except for the HYP distribution, indicating that the return series are asymmetric, leptokurtic and heavy-tailed. In addition, the skewness coefficients of these distributions, except for the GHST distribution, are all significantly positive, indicating that the distribution of Capesize dry bulk shipping market returns is right-skewed, which is consistent with the second conclusion drawn from Table 1. Moreover, at the same level of kurtosis, all nine distributions perform better in the negatively skewed case than in the positively skewed one. LQ tests with different lag orders find no autocorrelation in the standardized residuals, indicating that each model can capture the dynamics of the returns.

VaR Estimation Results
In this section, we present the one-day-ahead VaRs under the 9 different distributions. Due to space constraints, this paper only shows the Panamax sector at the 5% and 95% quantiles. For the sake of clarity, we randomly selected 250 predicted values for display.
Visually, see Figure 3 and

Backtesting Results
We use four methods to perform backtesting of each risk model. Tables 3-8 show the results for six quantile levels (1%, 5%, 10%, 99%, 95%, 90%). Table 9 summarizes the total number of rejections from Tables 3-8 to present the results more clearly. The main conclusions from the results in Tables 3-9 are as follows:

First, the risk prediction models based on the normal distribution generally display the lowest accuracy. Across the 96 cases (4 indexes × 6 quantile levels × 4 tests), the null hypothesis is rejected 27 times at the 1% significance level and 11 times at the 5% significance level, 38 rejections in total or about 39%, which is the most of any distribution. Table 9 shows that it also performs poorly in each shipping sector. The empirical results demonstrate that the normal distribution has the lowest accuracy in predicting the tail risk of the dry bulk shipping market. The GHST distribution is rejected in about 34% of cases, second only to the normal distribution, indicating that it cannot characterize the empirical return distribution of the dry bulk shipping market well. In addition, risk prediction models based on the commonly used distributions (normal, GED, ST) show lower accuracy. For the GED distribution, the null hypothesis is rejected 9 times at the 5% significance level and 18 times at the 1% significance level, 27 rejections out of 96 cases. It performs relatively better in the Capesize market, with only 4 rejections. For the ST distribution, the number of rejections is the same as for the GED distribution. These conclusions further suggest that, when forecasting risks in the dry bulk shipping market, managers should avoid the commonly used distributions and consider alternative distributions that can describe skewness and leptokurtosis.

Second, because the four segments of the dry bulk shipping market operate differently, their freight volatility and tail risks also differ. As shown in Table 9, risk models based on different distributions perform differently in each market, with the most accurate model in each market marked with lines.
Specifically, the backtesting results for BCI and BPI show that the SGED distribution exhibits the highest accuracy, with only 3 rejections at the 5% significance level and none at the 1% significance level. The backtesting results for BSI show that the SST and HYP distributions have the highest accuracy, with 4 rejections in total. For BHSI, the HYP distribution performs best, with only one rejection at the 5% significance level. It is worth noting from Table 1 that the BCI and BPI return series are relatively skewed while their leptokurtosis is less pronounced, whereas the BSI and BHSI returns are both more skewed and more leptokurtic. The SGED distribution describes skewness well, while the SST and HYP distributions can characterize the skewness and leptokurtosis of asset returns simultaneously, which corresponds to our backtesting results. Therefore, when forecasting and managing the risks of the dry bulk shipping market, participants should choose a more appropriate and accurate empirical distribution for each ship sector.
Third, the three alternative distributions (SGED, HYP and SST) generally show better accuracy than the commonly used distributions. With the HYP distribution, the null hypothesis is rejected 14 times at the 1% significance level and 6 times at the 5% significance level among the 96 cases, accounting for about 20%. The HYP, GHST and NIG distributions are all in the GH family, but HYP distinctly outperforms the other two. With the SGED distribution, the null hypothesis is rejected 9 times at the 1% significance level and 11 times at the 5% significance level, accounting for 17%.
With the SST distribution, the null hypothesis is rejected 11 times at the 1% significance level and 6 times at the 5% significance level, accounting for 17%, which is the best accuracy across the four samples. Risk prediction models based on these three distributions perform relatively well in the dry bulk shipping market, which provides empirical evidence that risk managers can consider the SGED, HYP or SST distribution to model and forecast risks.
Finally, compared with symmetric distributions, their skewed extensions perform better in forecasting risks in the dry bulk shipping market. The SST distribution, which extends the ST distribution, is about 9% more accurate than the ST distribution; the accuracy of the SGED distribution, which extends the GED distribution, is about 6% higher than that of the GED distribution; and the accuracy of the SN distribution is also about 6% higher than that of the normal distribution. Even though the ST and GED distributions can capture the tail features of asset returns well, they do not provide sufficient accuracy for forecasting risk in the dry bulk market. Considering their skewed extensions significantly improves the accuracy of the risk prediction models, which further suggests that the empirical distributions of dry bulk shipping market returns are skewed rather than normal. Risk managers should fully consider the skewness of the tail risk when making predictions in this kind of highly volatile and high-risk market.

American Journal of Industrial and Business Management

Robustness Test
This section tests the robustness of the main empirical results. Following the robustness test for risk prediction models of Lin (2014) [19], we select the risk models with better performance (SGED, SST, HYP) and redo the VaR forecasting over a longer sample period. Table 10 shows the backtesting results in the new sample period for the risk models based on these three distributions, which performed relatively well in Section 5.3. Specifically, with the HYP distribution, the null hypothesis is rejected 8 times at the 1% significance level and 6 times at the 5% significance level among 96 cases. With the SGED distribution, it is rejected 10 times at the 1% significance level and 6 times at the 5% significance level among 96 cases. With the SST distribution, it is rejected 8 times at the 1% significance level and 5 times at the 5% significance level among 96 cases. Overall, the backtesting results in Table 10 are in line with the results in Tables 3-9, indicating that the robustness test supports Section 5.3.

Notes to Table 10: [...] to 7, columns 10 to 14 report the P-values for each model. ** indicates significance at P < 0.01; * indicates significance at P < 0.05.

Conclusion
The environment of the international dry bulk shipping market is complex and volatile, and price changes are extremely dramatic. In such a highly volatile environment, coupled with the asymmetry and heavy tails of freight rate returns, forecasting market risks is extremely challenging. This paper tests risk prediction models based on nine different types of distributions from the perspective of both short and long positions. The empirical results show that the commonly used distributions, i.e. the normal, ST and GED distributions, perform poorly in the highly volatile dry bulk shipping market, while risk models based on the SST, SGED and HYP distributions perform better in general. This study provides some theoretical basis for market participants. First, when risk managers forecast the tail risks in the dry bulk shipping market, they should avoid some common distributions and consider the SST, SGED and HYP distributions to describe the skewness and leptokurtosis of returns. Second, risk managers should select the distributions with the best risk forecasting ability for each shipping sector, which can measure the extreme risks of dry bulk freight rates more accurately and further improve risk forecasting and management. Finally, this research inevitably has certain limitations. For example, the backtesting could be more comprehensive, and Expected Shortfall (ES) could be considered as a further risk prediction indicator.

Conflicts of Interest
The author declares no conflicts of interest regarding the publication of this paper.