Tail Quantile Estimation of Heteroskedastic Intraday Increases in Peak Electricity Demand

This paper discusses the modelling of intraday increases in peak electricity demand using an autoregressive moving average-exponential generalized autoregressive conditional heteroskedastic-generalized single Pareto (ARMA-EGARCH-GSP) approach. The developed model is then used for extreme tail quantile estimation using daily peak electricity demand data from South Africa for the period 2000 to 2011. The advantage of this modelling approach lies in its ability to capture conditional heteroskedasticity in the data through the EGARCH framework, while at the same time estimating the extreme tail quantiles through the GSP modelling framework. Empirical results show that the ARMA-EGARCH-GSP model produces more accurate estimates of extreme tails than a pure ARMA-EGARCH model.


Introduction
Peak electricity demand modelling is a policy concern for countries throughout the world. Many countries are investing heavily in the construction of new (reserve) generating plants in order to increase electricity supply during peak demand periods. Most countries, including those with emerging economies, have adopted new and smart energy-saving technologies and have put in place integrated demand-side management and energy efficiency strategies and policies in an effort to reduce consumption. In this paper we discuss the distribution of intraday changes in daily peak electricity demand and the modelling of extreme quantiles using an autoregressive moving average-exponential generalized autoregressive conditional heteroskedasticity-generalized single Pareto (ARMA-EGARCH-GSP) approach. We define intraday changes as the daily increases or decreases in daily peak demand (DPD), where DPD is the maximum hourly demand in a 24-hour period. The paper focuses on positive intraday changes. Modelling of unexpected extreme positive intraday increases is important to load forecasters, systems operators and demand managers in planning, load flow analysis and scheduling of electricity.
The use of extreme value distributions requires that the assumption of independent and identically distributed observations is met [1-4]. This assumption is an obstacle to the straightforward application of extreme value theory to both financial market returns and electricity return series [2,4]. To overcome this problem, we adopt the approach used by [4]. Using a two-stage approach, [4] estimate a GARCH model in stage one with a view to filtering the return series to get nearly independent and identically distributed residuals. In stage two, the extreme value theory (EVT) framework is then applied to the standardized residuals. The relative performance of value-at-risk (VAR) models on daily stock market returns is discussed in [5]. VAR is a measure of the risk of a portfolio. An EVT approach is used to generate VAR estimates and provide tail forecasts. Results from this study indicate that EVT-based VAR estimates are more accurate at higher quantiles. The modelling approach discussed in this paper is important for assessing risk in intraday increases in peak electricity demand forecasting. This is supported by [6], who use the generalized extreme value (GEV) theory and the block maxima approach to estimate the maximum load forecast errors in order to assess risk in long-term electricity load forecasting. An application of the modelling approach of [4] to electricity demand forecasting is discussed in the literature. Reference [2] applies a generalized Pareto distribution (GPD) to an autoregressive GARCH-filtered price change series. Empirical results from this study show that a peaks-over-threshold method provides accurate results in modelling tails of hourly electricity price changes. Reference [7] proposes a model that accommodates autoregression and weekly seasonalities in both the conditional mean and conditional volatility of daily electricity spot price returns. The tails of the distribution are then modelled using the EVT approach. The developed EVT-based model performs well in forecasting out-of-sample VAR.
The rest of the paper is organized as follows. In Section 2 we describe the data and provide a brief discussion of the return series data. Section 3 discusses the modelling approach together with the models used in this paper. The empirical results are presented in Section 4, and the conclusion is presented in Section 5.

Data
Hourly electricity data are collected for the years 2000 to 2011 from Eskom, South Africa's power utility company. The hourly data are then divided into blocks of 24 hours each, resulting in 4271 observations. All hours in a 24-hour block are from the same date. In each block the maximum hourly demand is recorded and is referred to as the daily peak demand (DPD). The plot of DPD in Figure 1 shows that these data exhibit strong seasonality with a steep positive linear trend. Formal unit root tests are conducted using the Augmented Dickey-Fuller test. Results indicate that the first difference of the natural logarithm of DPD is stationary. Based on the stationarity requirements we calculate the intraday percentage changes $r_t$, which are called the return series data, as given in Equation (1):
$$r_t = 100\,(\ln y_t - \ln y_{t-1}) \qquad (1)$$
where $y_t$ and $y_{t-1}$ are the current and one-period lagged DPD respectively. The returns $r_t$ given in Equation (1) are explained in detail in Appendix A.
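The construction of the DPD and return series described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, the toy demand values are randomly generated rather than Eskom data, and the hourly series is assumed to be complete, with exactly 24 readings per date.

```python
import numpy as np

def daily_peak_demand(hourly):
    """Split an hourly series into 24-hour blocks and keep each block's maximum."""
    hourly = np.asarray(hourly, dtype=float)
    if hourly.size % 24 != 0:
        raise ValueError("expected a whole number of 24-hour blocks")
    return hourly.reshape(-1, 24).max(axis=1)

def percentage_log_returns(dpd):
    """Return 100 * (ln y_t - ln y_{t-1}); the result has one value fewer than dpd."""
    dpd = np.asarray(dpd, dtype=float)
    return 100.0 * np.diff(np.log(dpd))

# Hypothetical toy input: three days of hourly demand (not real data).
rng = np.random.default_rng(0)
hourly = 25000.0 + 5000.0 * rng.random(72)
dpd = daily_peak_demand(hourly)          # three daily peaks
returns = percentage_log_returns(dpd)    # two percentage changes
```

Differencing removes one observation, which is consistent with the 4271 DPD values yielding 4270 returns.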

The DPD return series given in Figure 2 shows that volatility occurs in bursts, with a large number of extreme observations, which indicates the presence of volatility clustering.
The kernel density of the DPD return series given in Figure 3 shows that the empirical distribution of the data is non-normal. The density is estimated using kernel density estimation [8].

The Models
Electricity returns are highly volatile and display seasonalities in both their mean and volatility, exhibit leverage effects and clustering in volatility, and feature extreme levels of skewness and kurtosis [7]. This requires the use of the ARMA-GARCH extreme value theory modelling framework discussed in [2,4].

ARMA-EGARCH Model
Assuming a conditional normal distribution, we adopt an ARMA(p,q)-EGARCH(1,1) model whose mean and variance structures are given by the mean equation (2) and the variance equation (3), where $r_t$ is the return series of DPD, as defined in Equation (1). The EGARCH model was developed to capture the leverage effect in financial time series data [9]. Negative shocks in financial markets (bad news) generally have larger impacts on market volatility than positive shocks (good news). The presence of a leverage effect can be tested by a hypothesis on the asymmetry coefficient in Equation (3): if this coefficient is non-zero, then the impact is asymmetric. The EGARCH(1,1) model is used for three reasons: no inequality constraints are imposed on the parameters of Equation (3); oscillatory behaviour in the conditional variance is permitted, since the relevant coefficient can be either positive or negative; and the persistence of volatility shocks can be measured easily [10]. Reference [9] discusses in detail the advantages of using the EGARCH approach instead of the standard GARCH model.
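To make the variance structure concrete, the following sketch implements one common textbook form of the EGARCH(1,1) recursion, ln(sigma_t^2) = omega + alpha(|z_{t-1}| - E|z|) + gamma z_{t-1} + beta ln(sigma_{t-1}^2), with z the standardized shock. This parametrization and the names omega, alpha, gamma and beta are assumptions chosen for illustration; the paper's own Equation (3) may be written differently (some software, for example, omits the E|z| centring term).

```python
import numpy as np

def egarch11_variance(eps, omega, alpha, gamma, beta):
    """Conditional variances from an EGARCH(1,1) recursion (assumed textbook form).

    eps   : residuals from the mean equation
    gamma : asymmetry coefficient; a non-zero value makes the response to
            positive and negative shocks differ
    beta  : persistence of log-variance (|beta| < 1 is assumed for the start value)
    """
    eps = np.asarray(eps, dtype=float)
    log_var = np.empty(eps.size)
    # Start from the level the recursion settles at when shocks are average.
    log_var[0] = omega / (1.0 - beta)
    mean_abs_z = np.sqrt(2.0 / np.pi)  # E|z| for a standard normal z
    for t in range(1, eps.size):
        z = eps[t - 1] / np.exp(0.5 * log_var[t - 1])
        log_var[t] = (omega + alpha * (abs(z) - mean_abs_z)
                      + gamma * z + beta * log_var[t - 1])
    # Exponentiating keeps the variance positive without parameter constraints.
    return np.exp(log_var)
```

Because the recursion is written for the logarithm of the variance, positivity holds automatically, which is why no inequality constraints are needed on the coefficients.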

Generalized Single Pareto (GSP) Distribution
Reference [11] shows that above a reasonably high threshold, $\tau$, the tail of a generalized Burr-Gamma (GBG) distribution can be approximated by a generalized Pareto (GP)-type distribution. The GP-type distribution, which is a peaks-over-threshold (POT) distribution, is an approximation of the GPD with only one parameter to estimate. The distribution and survival functions of the GP-type distribution, which we refer to as the generalized single Pareto (GSP) distribution, with its shape parameter (also known as the extreme value index (EVI)), are given in Equations (4) and (5) respectively.
An expression for the tail quantiles is given in Equation (6). A derivation of the quantile function is given in Appendix B1. Let $r_t$ be the return series as defined in Equation (1). We then fit a GSP distribution to the residuals $\varepsilon_t$ obtained after fitting the ARMA-EGARCH model to $r_t$. In order to extract upper extremes from this sequence, we take the exceedances over a predetermined high threshold $\tau$. We determine the threshold $\tau$ using the generalized Pareto quantile plot as discussed in [1]. Equations (2), (3) and (6) combine to form the ARMA-EGARCH-GSP model.
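As an illustration of how a peaks-over-threshold tail quantile is computed, the sketch below uses the standard two-parameter GPD estimator, x_p = tau + (sigma / xi) * [((n / n_tau)(1 - p))^(-xi) - 1]. This is a stand-in and not the paper's Equation (6): the GSP distribution has a single parameter, so the free scale argument sigma, like the names xi, n and n_exceed, is an assumption made only for this example.

```python
import math

def pot_tail_quantile(p, threshold, xi, sigma, n, n_exceed):
    """Standard GPD peaks-over-threshold quantile estimate at probability p.

    threshold : the high threshold tau
    xi, sigma : GPD shape and scale fitted to the excesses over the threshold
    n         : total number of observations
    n_exceed  : number of observations above the threshold
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    ratio = (n / n_exceed) * (1.0 - p)
    if abs(xi) < 1e-12:
        # Limiting exponential tail as the shape parameter tends to zero.
        return threshold - sigma * math.log(ratio)
    return threshold + (sigma / xi) * (ratio ** (-xi) - 1.0)

# Hypothetical values, chosen only to show the call.
q99 = pot_tail_quantile(0.99, threshold=2.0, xi=0.1, sigma=1.0, n=1000, n_exceed=50)
```

When 1 - p equals the empirical exceedance fraction n_exceed / n, the estimate reduces to the threshold itself, which is a quick sanity check on any implementation.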

ARMA(p,q)-EGARCH(1,1) Model Results
In Table 1 we present descriptive statistics of the return series data (for which there are 4270 observations). The skewness and kurtosis presented in Table 1 show that the return series data are non-normal. The Jarque-Bera test is carried out to check whether the skewness and kurtosis are consistent with a normal distribution.
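For reference, the Jarque-Bera statistic combines the sample skewness S and kurtosis K as JB = (n / 6) * (S^2 + (K - 3)^2 / 4) and is compared with a chi-square distribution with two degrees of freedom. The sketch below is a minimal implementation under the assumption that moment (biased) estimators of S and K are used; the function name is hypothetical.

```python
import numpy as np

def jarque_bera(x):
    """Return (JB statistic, asymptotic p-value, skewness, kurtosis)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    d = x - x.mean()
    m2 = np.mean(d ** 2)
    skew = np.mean(d ** 3) / m2 ** 1.5
    kurt = np.mean(d ** 4) / m2 ** 2          # equals 3 for a normal distribution
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    # The chi-square(2) survival function has the closed form exp(-x / 2).
    p_value = float(np.exp(-jb / 2.0))
    return jb, p_value, skew, kurt
```

A small p-value leads to rejection of normality, which is the conclusion the descriptive statistics point to for the return series.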
Our ARMA(p,q)-EGARCH(1,1) model is given in Equation (7). As shown in Figure 1, electricity demand in South Africa exhibits strong seasonality. For DPD, seasonality is strong over the week, month and year. The terms AR(7), AR(28), AR(365) and MA(7) are therefore included in the model given in Equation (7) in order to filter out this seasonality from the data before fitting the GSP distribution. Several ARMA(p,q)-EGARCH(1,1) models are considered and the model with the smallest Akaike information criterion (AIC) is selected. The model parameters are estimated using the maximum likelihood method under the assumption that the errors are conditionally normally distributed. The estimates are obtained using the algorithm of [12] with numerical derivatives. The parameter estimates of the best model, along with their p-values in parentheses, are presented in Table 2.
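The AIC-based selection step can be sketched as below, using AIC = 2k - 2 ln L, where k is the number of estimated parameters and ln L the maximized log-likelihood. The candidate names and log-likelihood values are hypothetical placeholders, not results from the paper.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion; smaller values indicate a preferred model."""
    return 2.0 * n_params - 2.0 * log_likelihood

def select_by_aic(candidates):
    """candidates maps a model label to (maximized log-likelihood, parameter count)."""
    scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores

# Hypothetical fitted models (labels and numbers are illustrative only).
candidates = {
    "AR(1,7) + EGARCH(1,1)": (-5120.4, 7),
    "AR(1,7,28) MA(7) + EGARCH(1,1)": (-5101.9, 9),
}
best, scores = select_by_aic(candidates)
```

Some packages report the AIC divided by the sample size; for models fitted to the same data this does not change the ranking.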
The Ljung-Box test results given in Table 2 indicate that there is some autocorrelation remaining and that most of the heteroskedasticity has been removed. It should be noted that it may not be possible to remove all autocorrelation because we are dealing with high-frequency data.
Note to Table 2: Q(7) is the Ljung-Box test for serial correlation in the standardized residuals with 7 lags, while ARCH(7) is Engle's LM test of ARCH effects up to the 7th order. P-values are shown in parentheses. In all cases a 5% level of significance is used.
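The Q statistic used for the residual diagnostics can be computed as Q = n(n + 2) * sum over k = 1..h of rho_k^2 / (n - k), where rho_k is the lag-k sample autocorrelation. The following is a minimal sketch with a hypothetical function name; it returns only the statistic, which would then be compared with a chi-square distribution whose degrees of freedom are usually reduced by the number of fitted ARMA terms.

```python
import numpy as np

def ljung_box_q(x, lags=7):
    """Ljung-Box Q statistic for serial correlation up to the given lag."""
    x = np.asarray(x, dtype=float)
    n = x.size
    d = x - x.mean()
    denom = np.sum(d ** 2)
    total = 0.0
    for k in range(1, lags + 1):
        rho_k = np.sum(d[k:] * d[:-k]) / denom   # lag-k sample autocorrelation
        total += rho_k ** 2 / (n - k)
    return n * (n + 2) * total
```

Applying the function to standardized residuals with lags=7 corresponds to the Q(7) diagnostic.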

Threshold Estimation
We fit a GSP distribution to the upper tail of the residuals. A Pareto quantile plot is used to obtain the threshold. The Pareto quantile plot is defined as a scatter plot of points with logarithmic coordinates (see [1]). The observation at which the plot starts to follow a horizontal straight line is taken as the threshold.
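The coordinates of a Pareto quantile plot can be generated as in the sketch below. It assumes the common textbook definition, in which the j-th largest of n positive observations is plotted at (log((n + 1) / j), log x), for j = 1, ..., n; the plotting positions actually used in the paper are not recoverable from the text, so this choice and the function name are assumptions.

```python
import numpy as np

def pareto_quantile_plot_points(x):
    """Return (horizontal, vertical) coordinates of a Pareto quantile plot.

    Only positive observations can be used because logarithms are taken.
    """
    x = np.sort(np.asarray(x, dtype=float))
    if x[0] <= 0.0:
        raise ValueError("all observations must be positive")
    n = x.size
    j = np.arange(1, n + 1)
    horizontal = np.log((n + 1) / j)   # position of the j-th largest observation
    vertical = np.log(x[::-1])         # log of the j-th largest observation
    return horizontal, vertical
```

The returned arrays can be passed to any plotting routine; the threshold is then read off by eye from the region where the pattern of the points changes.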

In this case the threshold is $\tau = 7.3891$. There are 26 exceedances. The Pareto quantile plot is shown in Figure 4.

GSP Distribution Parameter Estimates
We now consider the error terms greater than $\tau$ to be GSP distributed. The shape parameter is estimated, using the ML method, as 0.0717. The derivation of the ML estimator of the shape parameter is given in Appendix B2.
The QQ plot of the residual observations in Figure 5 suggests that the GSP distribution is a relatively good fit to the data.
The unconditional GSP quantiles of the residual distribution are now estimated using the quantile function given in Equation (6), after substituting in the estimated parameter values. In the second stage of the modelling process we calculate the conditional tail quantiles of our original return distribution from Equation (8), in which the conditional mean and volatility are taken from the ARMA-EGARCH model. Equation (8) is used to estimate the conditional tail quantiles of the original return series.
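The second stage can be illustrated with the usual location-scale rescaling used in two-stage GARCH-EVT modelling, in which the conditional quantile is the conditional mean plus the conditional volatility times the residual quantile. That Equation (8) has exactly this form is an assumption, as are the function and argument names and the requirement that the residual quantile refers to standardized residuals.

```python
import numpy as np

def conditional_tail_quantile(mu_t, sigma_t, residual_quantile):
    """Rescale a residual tail quantile into conditional quantiles of the returns.

    mu_t, sigma_t     : conditional mean and volatility for each day
    residual_quantile : tail quantile of the (assumed standardized) residuals
    """
    mu_t = np.asarray(mu_t, dtype=float)
    sigma_t = np.asarray(sigma_t, dtype=float)
    return mu_t + sigma_t * residual_quantile

# Hypothetical inputs for two days.
quantiles = conditional_tail_quantile([0.0, 1.0], [1.0, 2.0], 3.0)
```

The resulting quantile changes from day to day with the fitted mean and volatility, even though the residual quantile itself is fixed.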

Evaluation of Estimated Tail Quantiles at Different Probabilities (Number of Exceedances)
The estimated tail quantiles at different probabilities using the conditional GSP distribution are evaluated. The estimated number of exceedances is then compared to the exceedances from fitting the ARMA-EGARCH model. A summary of the results is given in Table 3. The 90th observed quantile of the residuals, for example, is obtained as follows: 0.9 × 1993 ≈ 1794, the 1794th ordered observed residual in the data set is 3.0353, and the number of exceedances above 3.0353 is 200, given in parentheses. The quantile function given in Equation (6) for the GSP distribution yields the corresponding estimated tail quantile; the observations that are larger than this estimated tail quantile are then counted and found to number 137. Overall the ARMA-EGARCH-GSP model produces more accurate estimates of extreme tails than a pure ARMA-EGARCH model, as shown in Table 3.
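The counting procedure in this comparison can be sketched as follows. The rule for turning p times n into an order-statistic index (rounding up) and the strict inequality used to count exceedances are assumptions, and both function names are hypothetical; the text does not state which conventions were used.

```python
import math
import numpy as np

def observed_quantile(x, p):
    """Return (k, value) where value is the k-th ordered observation, k = ceil(p * n)."""
    x_sorted = np.sort(np.asarray(x, dtype=float))
    k = math.ceil(p * x_sorted.size)
    return k, float(x_sorted[k - 1])

def count_exceedances(x, level):
    """Number of observations strictly greater than the given level."""
    return int(np.sum(np.asarray(x, dtype=float) > level))
```

The same count_exceedances call can be applied to a model-based quantile, so that the observed and estimated numbers of exceedances are obtained in the same way before they are compared.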

Frequency Analysis of Exceedances (by Month)
There are 26 exceedances above the threshold ($\tau = 7.3891$). A summary of the monthly frequency analysis of the exceedances over the period 2000 to 2011 is presented in Table 4 and the histogram is given in Figure 6.
Over the sampling period large intraday increases are most frequently experienced in April followed by January.This frequency analysis of extreme intraday increases is important to system operators and decision makers in the electricity sector as it helps them in planning and scheduling electricity.
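A monthly frequency table of this kind can be produced with a simple tally, as in the sketch below. The function name and the example dates and residual values are hypothetical and are not taken from the data set.

```python
from collections import Counter
from datetime import date

MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")

def monthly_exceedance_counts(dates, values, threshold):
    """Count, for each calendar month, the observations above the threshold."""
    counts = Counter(d.month for d, v in zip(dates, values) if v > threshold)
    # Months without exceedances are reported explicitly as zero.
    return {name: counts.get(i + 1, 0) for i, name in enumerate(MONTHS)}

# Hypothetical example with four observations.
dates = [date(2000, 1, 5), date(2000, 4, 1), date(2001, 4, 2), date(2001, 6, 3)]
values = [8.0, 9.0, 7.5, 1.0]
table = monthly_exceedance_counts(dates, values, threshold=7.3891)
```

Pooling the counts over all years, as done here, matches the month-by-month layout of a frequency table but hides any year-to-year variation.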

Conclusion
In this paper the modelling and tail estimation of intraday increases in peak electricity demand using an ARMA-EGARCH-GSP approach is discussed. The advantage of this modelling approach lies in its ability to capture conditional heteroskedasticity in the data through the EGARCH framework, while at the same time estimating the extreme tail quantiles through the GSP modelling framework. Empirical results show that the ARMA-EGARCH-GSP model produces more accurate estimates of extreme tails than a pure ARMA-EGARCH model. Finally, we state some remaining issues. Interesting areas for future research would involve the modelling of the time-of-the-year seasonality of the volatility and also the use of other methods to determine the threshold. These will be studied elsewhere.

Appendix B1: Derivation of the Quantile Function for GSP Distribution
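The distribution function of the GSP distribution is given in Equation (4), and the quantile function in Equation (6) is derived from it.

Appendix B2: Derivation of the ML Estimator for GSP Distribution
The distribution function of the GSP distribution is given in Equation (4). The probability density function is then obtained from it, and we solve the likelihood equation, in which the derivative of the log-likelihood is set equal to 0, to obtain the ML estimator of the shape parameter.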

Figure 5. QQ plot of $\varepsilon_{t,p}$ above $\tau = 7.3891$. The horizontal axis represents the standard theoretical quantiles while the empirical quantiles are plotted on the vertical axis.

Note: the threshold is $\tau = 7.3891$ and the ML estimate of the shape parameter is 0.0717.

Figure 6. Histogram of the frequency of occurrence of exceedances ($\varepsilon_{t,p}$).

Table 1. Descriptive statistics of the returns.

Table 4. Monthly frequency of exceedances (2000-2011).
Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec