This paper proposes a simple two-step nonparametric procedure to estimate the intraday jump tail and measure the jump tail risk of asset prices with noisy high frequency data. We first propose a pre-averaging threshold approach to estimate the intraday jumps, and then use the peaks-over-threshold (POT) method and the generalized Pareto distribution (GPD) to model the intraday jump tail and measure the jump tail risk. Finally, an empirical example demonstrates the ability of the proposed method to measure the jump tail risk in the presence of microstructure noise.

It is well recognized that financial asset returns are not normally distributed, but instead exhibit more slowly decaying and asymmetric tails. The earliest influential research in Mandelbrot [

In contrast to the numerous studies on tail risk resulting from stochastic volatility, fewer studies address jump tail risk. To the best of our knowledge, recent contributions are mainly from Bollerslev and Todorov [

In this paper, we focus on studying the intraday jump tail and measuring the jump tail risk under market microstructure noise. A simple two-step nonparametric procedure is proposed to implement the analysis. In the first step, we use the pre-averaging threshold method to nonparametrically estimate the intraday jumps under the effect of microstructure noise. In particular, we first adopt local “pre-averaging” via a kernel function to produce a set of non-overlapping, (asymptotically) noise-free observations, and then use the threshold technique to identify the jump series. In the second step, we model the intraday jump tail based on extreme value theory (EVT) and calculate the jump tail risk measures (Value-at-Risk and Expected Shortfall). Our method is nonparametric and easy to implement. Finally, a real data example with high frequency data for MSFT is used to illustrate these procedures.

The remainder of this paper is organized as follows. Section 2 presents the methodology to estimate the intraday jump and jump tail risk measurement. Section 3 provides an empirical example to show the procedure. Section 4 draws conclusions.

In this section, a simple two-step procedure is proposed to measure the intraday jump tail risk with noisy high frequency data. In the first step, a pre-averaging threshold method is proposed to nonparametrically identify the intraday jumps under the effect of microstructure noise. In the second step, the peaks-over-threshold (POT) method based on the generalized Pareto distribution (GPD) is used to model the intraday jump tail and to calculate the jump tail risk measures, i.e. VaR (Value-at-Risk) and ES (Expected Shortfall).

Assume that the efficient logarithmic price $p_t$ of an asset, defined on a filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, \mathbb{P})$, evolves as

$$dp_t = b_t\,dt + \sigma_t\,dW_t + dJ_t, \tag{1}$$

where $W = (W_t)$ is an $\mathcal{F}$-adapted standard Brownian motion. The drift $b = (b_t)$ and the volatility $\sigma = (\sigma_t)$ are progressively measurable, adapted, right-continuous processes with left limits (càdlàg) which guarantee that (1) has a unique strong solution. $J = (J_t)$ is a compound Poisson process with finite jump activity. Note that $J_t$ can be written as $J_t = \sum_{i=1}^{N_t} X_{\tau_i}$, where $(N_t)$ is a Poisson process with intensity $\lambda$, and $X_{\tau_i}$ denotes the jump size at the jump time $\tau_i$. The $X_{\tau_i}$ are independent and identically distributed, and independent of $N_t$. We further assume that $N_t$ is independent of $W_t$. However, our results can be extended to scenarios with non-constant intensity and a more general dependence structure between $X_{\tau_i}$ and $N_t$.

Suppose that on the finite and fixed time horizon $[0, T]$ there are $n + 1$ discrete realizations $p_{t_0}, p_{t_1}, \cdots, p_{t_{n-1}}, p_{t_n}$ of the process $p_t$, where $0 = t_0 < t_1 < \cdots < t_n = T$ is an arbitrary partition of the interval $[0, T]$. For simplicity, assume that the observations are equally spaced. Denote $\Delta_n = T/n$, so that $t_i = i\Delta_n$. In the presence of microstructure noise, at any given time $t_i$ the actually observed log-price is $Z_{t_i}$ rather than $p_{t_i}$, which is given by

$$Z_{t_i} = p_{t_i} + \varepsilon_{t_i}, \tag{2}$$

where $\varepsilon_t$ is the noise term. Assume that the $\varepsilon_t$ are i.i.d. and independent of the processes $W_t$ and $J_t$, with $E\varepsilon_t = 0$ and $E\varepsilon_t^2 < \infty$. The noise need not be i.i.d.; this assumption is made only to simplify the proofs of the theoretical properties. See the studies in Yu et al. [

Our goal is to estimate the intraday jumps $X_i$ from the noisy observations $\{Z_{t_i}, i = 0, 1, \cdots, n\}$. For notational simplicity, we write $V_i^n = V_{i\Delta_n}$ and $\Delta_i^n V = V_i^n - V_{i-1}^n$ for any process $V = (V_t)$ in the following.
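To make the observation scheme (1)-(2) concrete, the following is a minimal simulation sketch rather than part of the proposed method. It assumes a constant drift and volatility, Gaussian jump sizes and Gaussian noise, and approximates the Poisson arrivals by one Bernoulli draw per step; the function name `simulate_noisy_prices` and all default parameter values are hypothetical.

```python
import math
import random


def simulate_noisy_prices(n, T=1.0, b=0.0, sigma=0.2, lam=5.0,
                          jump_sd=0.05, noise_sd=0.0005, seed=0):
    """Simulate Z_{t_i} = p_{t_i} + eps_{t_i} on an equally spaced grid.

    Assumptions (illustrative only): constant b and sigma, N(0, jump_sd^2)
    jump sizes, N(0, noise_sd^2) noise.
    """
    rng = random.Random(seed)
    dt = T / n
    p = 0.0
    efficient = [p]
    jump_steps = []
    for i in range(1, n + 1):
        # Diffusion increment: b*dt + sigma*dW with dW ~ N(0, dt).
        p += b * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        # Bernoulli(lam*dt) approximation of a Poisson jump arrival.
        if rng.random() < lam * dt:
            p += rng.gauss(0.0, jump_sd)
            jump_steps.append(i)
        efficient.append(p)
    # Additive i.i.d. noise, drawn independently of the efficient price.
    observed = [x + rng.gauss(0.0, noise_sd) for x in efficient]
    return efficient, observed, jump_steps
```

A call such as `simulate_noisy_prices(2340)` returns the efficient path, the noisy path and the step indices at which jumps were inserted, which is convenient for checking a jump detector against known jump locations.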

In this paper, we use the pre-averaging approach to diminish the effect of noise. Let $\bar{Z}_i^n$ denote the weighted average of the $k_n$ observations $Z_i^n, Z_{i+1}^n, \cdots, Z_{i+k_n-1}^n$, where $\bar{Z}_i^n = \sum_{j=1}^{k_n-1} g_j^n \Delta_{i+j}^n Z$, with weights $g_j^n = g(j/k_n)$. We require that the weighting function $g(x)$ is continuous on $[0, 1]$, piecewise $C^1$ with a piecewise Lipschitz derivative $g'$, and satisfies $g(0) = g(1) = 0$ and $\int_0^1 g(s)^2\,ds > 0$. We further require that the integer sequence $k_n$ satisfies $k_n\sqrt{\Delta_n} = \theta + o(\Delta_n^{1/4})$ for some constant $\theta > 0$.
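The pre-averaging sum can be sketched in a few lines. This is an illustrative implementation under the assumption that the weight function is $g(x) = x \wedge (1 - x)$, the choice used later in the empirical section; the names `g` and `pre_average` are hypothetical.

```python
def g(x):
    """Weight function g(x) = min(x, 1 - x); it vanishes at 0 and 1."""
    return min(x, 1.0 - x)


def pre_average(z, k, weight=g):
    """Return the list of bar_Z[i] = sum_{j=1}^{k-1} weight(j/k) * (z[i+j] - z[i+j-1]).

    z is the list of observed log-prices; bar_Z[i] uses z[i], ..., z[i+k-1],
    so the last admissible start index is len(z) - k.
    """
    w = [weight(j / k) for j in range(1, k)]
    out = []
    for i in range(len(z) - k + 1):
        out.append(sum(w[j - 1] * (z[i + j] - z[i + j - 1])
                       for j in range(1, k)))
    return out
```

For a path with a single unit jump, `pre_average([0, 0, 0, 1, 1, 1, 1], 4)` gives `[0.25, 0.5, 0.25, 0.0]`: the jump appears in every window that contains it, weighted by the value of $g$ at its position inside the window.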

Then we can use the threshold technique to identify the jumps from these pre-averaged observations $\{\bar{Z}_i^n\}$. The threshold function is required to satisfy the following assumption.

Assumption 1 The threshold function $r(\Delta_n)$ is a deterministic function of the step length $\Delta_n$, such that

(a) $\lim_{\Delta_n \to 0} r(\Delta_n) = 0$;

(b) $\lim_{\Delta_n \to 0} \dfrac{\Delta_n^{1/2}\left(\log\frac{1}{\Delta_n}\right)^2}{r(\Delta_n)} = 0$.

Power functions $r(\Delta_n) = \beta\Delta_n^{\alpha}$ for any $\alpha \in (0, 1/2)$ and $\beta \in \mathbb{R}$ are possible choices. Under Assumption 1, for $P$-almost all $\omega$ there exists $\Delta > 0$ such that for all $\Delta_n \le \Delta$ and all $i = 0, \cdots, n-1$, we have $I_{\{(\bar{Z}_i^n)^2 \le r(\Delta_n)\}} = I_{\{\cap_{j=i+1}^{i+k_n-1}(\Delta_j N = 0)\}}$, which says that the threshold function $r(\Delta_n)$ can be used to asymptotically identify the intervals where no jump occurred; also see the literature on noise- and jump-robust volatility estimation (Jing et al. [

$$\hat{T}_\tau = \left\{ t_{i^*} \in [0, T] : (\bar{Z}_{i^*}^n)^2 > r(\Delta_n),\ i = 0, \cdots, j_n \right\}, \tag{3}$$

where $i^* = i(k_n - 1)$.

We now turn to estimating the jump size by a simple nonparametric method. Denote $\Delta_i^{i+k_n-1} N = N_{(i+k_n-1)\Delta_n} - N_{i\Delta_n}$. If $\Delta_i^{i+k_n-1} N \ge 1$, denote by $\tau(i)$ the first instant at which a jump occurs within $(t_i, t_{i+k_n-1}]$ and by $X_{\tau(i)}$ the size of this first jump; also let $\bar{X} = \min_{j=1,\cdots,N_T} |X_{\tau_j}|$. For notational simplicity, we write $j_n = [n/(k_n-1)]$ in the following. For small $\Delta_n$, almost surely at most one jump occurs in any time interval $(t_i, t_{i+k_n-1}]$. Moreover, the pre-averaged observation $\bar{Z}_{0i}^n$ of the continuous diffusion process without jumps satisfies $\sup \bar{Z}_{0i}^n = O(\Delta_n^{1/4}\log(1/\Delta_n))$, while the pre-averaged observation $\bar{J}_i^n$ of the jump process is larger than a constant multiple of $\bar{X}$, which is not negligible. So we propose the following estimator for the jump size $X_{\tau(i)}$:

$$\hat{X}_{\tau(i)} = \bar{Z}_i^n I_{\{(\bar{Z}_i^n)^2 > r(\Delta_n)\}}. \tag{4}$$

Yu et al. [

and the size of the first jump occurring within $(t_i, t_{i+k_n-1}]$, $\bar{g} X_{\tau(i)} I_{\{\Delta_i^{i+k_n-1} N \ge 1\}}$, where $\bar{g} = \int_0^1 g(s)\,ds$.
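The identification rule (3) and the estimator (4) can be illustrated with the sketch below. It is an assumption-laden example, not the authors' code: it uses $g(x) = x \wedge (1 - x)$, a power threshold $r(\Delta_n) = \beta\Delta_n^{\alpha}$ with hypothetical defaults $\alpha = 0.4$ and $\beta = 1$, and the non-overlapping start indices $i^* = i(k_n - 1)$. The name `estimate_jumps` is hypothetical.

```python
def g(x):
    """Assumed weight function g(x) = min(x, 1 - x)."""
    return min(x, 1.0 - x)


def estimate_jumps(z, k, delta, alpha=0.4, beta=1.0):
    """Flag non-overlapping windows whose squared pre-average exceeds r(delta).

    Returns (start_index, bar_Z) pairs; bar_Z is the thresholded pre-averaged
    value of estimator (4), i.e. a g-weighted jump size, not a rescaled one.
    """
    r = beta * delta ** alpha  # power threshold; alpha must lie in (0, 1/2)
    jumps = []
    i = 0
    while i + k - 1 <= len(z) - 1:
        zbar = sum(g(j / k) * (z[i + j] - z[i + j - 1]) for j in range(1, k))
        if zbar * zbar > r:
            jumps.append((i, zbar))
        i += k - 1  # move to the next non-overlapping window
    return jumps
```

With `z = [0.0] * 5 + [1.0] * 16`, `k = 4` and `delta = 1e-4`, only the window starting at index 3 is flagged, with pre-averaged value 0.5. The threshold constants matter in practice and should be tuned to the data.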

In this subsection, we present how to model the intraday jump tail and then measure the jump tail risk, i.e. VaR (Value-at-Risk) and ES (Expected Shortfall), based on extreme value theory (EVT). Extreme value theory provides simple parametric models to capture the extreme tails of a distribution and to forecast risk. There are two main methods of applying EVT: the first is the block maxima (minima) (BMM) method based on the generalized extreme value (GEV) distribution, while the second is the peaks-over-threshold (POT) approach based on the generalized Pareto distribution (GPD). Since the POT method uses the GPD to fit the exceedances over a given threshold, it does not require as large a data set as BMM and is considered more efficient in modelling limited data (McNeil, Frey and Embrechts [

Suppose that the jump series $\{X_i\}$ are identically distributed random variables with unknown underlying distribution function $F(x) = P(X_i \le x)$. The excess distribution $F_u$ over a threshold $u$ is given by

$$F_u(y) = P(X - u \le y \mid X > u) = \frac{F(y+u) - F(u)}{1 - F(u)} = \frac{F(x) - F(u)}{1 - F(u)} \tag{5}$$

for $0 < y < x_F - u$, where $x_F \le \infty$ is the right endpoint of $F$, and $y = x - u$.

In the EVT framework, there is a key result that for a large class of underlying distributions $F$ (containing all the common continuous distributions in statistics, such as normal, lognormal, t, gamma, exponential, beta, etc.), as the threshold $u$ progressively increases, the excess distribution $F_u$ converges to a generalized Pareto distribution. In the sense of this result, the GPD is the natural model for the excess distribution above sufficiently high thresholds. That is, the excess distribution function $F_u$ can be approximated by a GPD for a sufficiently high $u$:

$$F_u(y) \approx G_{\xi,\sigma}(y), \tag{6}$$

where $G_{\xi,\sigma}$ is the generalized Pareto distribution (GPD), which is given by

$$G_{\xi,\sigma}(y) = \begin{cases} 1 - \left(1 + \dfrac{\xi}{\sigma}y\right)^{-1/\xi} & \text{if } \xi \ne 0, \\ 1 - e^{-y/\sigma} & \text{if } \xi = 0, \end{cases} \tag{7}$$

for $y \in [0, x_F - u]$ if $\xi \ge 0$, and $y \in [0, -\sigma/\xi]$ if $\xi < 0$. Here $\xi$ is the shape parameter and $\sigma$ is the scale parameter of the GPD.
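As a quick numerical companion to the definition of the GPD, the sketch below evaluates the standard GPD distribution function, including the exponential limit at $\xi = 0$ and the finite right endpoint $-\sigma/\xi$ when $\xi < 0$. The function name `gpd_cdf` is a hypothetical choice.

```python
import math


def gpd_cdf(y, xi, sigma):
    """Distribution function of the GPD with shape xi and scale sigma > 0."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    if y < 0:
        return 0.0
    if xi == 0.0:
        # Exponential case, the limit of the general formula as xi -> 0.
        return 1.0 - math.exp(-y / sigma)
    if xi < 0 and y >= -sigma / xi:
        # For xi < 0 the support ends at -sigma/xi.
        return 1.0
    return 1.0 - (1.0 + xi * y / sigma) ** (-1.0 / xi)
```

For example, `gpd_cdf(1.0, 1.0, 1.0)` equals 0.5, and `gpd_cdf(1.0, -0.5, 1.0)` equals 0.75 with the support ending at 2.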

Hence, for $x \ge u$, replacing $F_u$ by the GPD,

$$\bar{F}(x) = P(X > u)\,P(X > x \mid X > u) = \bar{F}(u)\,P(X - u > x - u \mid X > u) = \bar{F}(u)\,\bar{F}_u(x - u) = \bar{F}(u)\left(1 + \xi\frac{x-u}{\sigma}\right)^{-1/\xi}. \tag{8}$$

This gives a formula for tail probabilities. Inverting (8) gives the high quantile of the distribution, or VaR. Thus, for $\alpha \ge F(u)$ (i.e. the tail probability is $1 - \alpha$), VaR is given by

$$VaR_\alpha = q_\alpha(F) = u + \frac{\sigma}{\xi}\left(\left(\frac{1-\alpha}{\bar{F}(u)}\right)^{-\xi} - 1\right). \tag{9}$$

For $\xi < 1$, the ES is given by

$$ES_\alpha = \frac{1}{1-\alpha}\int_\alpha^1 q_x(F)\,dx = \frac{VaR_\alpha}{1-\xi} + \frac{\sigma - \xi u}{1-\xi}. \tag{10}$$
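For readers who want the intermediate steps, the closed form of the ES can be checked by substituting the VaR formula (9) into the integral, assuming $\xi \ne 0$ and $\xi < 1$ so that the integral converges:

```latex
\begin{aligned}
ES_\alpha
 &= \frac{1}{1-\alpha}\int_\alpha^1
    \left[u + \frac{\sigma}{\xi}\left(\left(\frac{1-x}{\bar{F}(u)}\right)^{-\xi} - 1\right)\right]dx \\
 &= u - \frac{\sigma}{\xi}
    + \frac{\sigma}{\xi}\,\frac{1}{1-\xi}\left(\frac{1-\alpha}{\bar{F}(u)}\right)^{-\xi} \\
 &= u - \frac{\sigma}{\xi}
    + \frac{1}{1-\xi}\left(VaR_\alpha - u + \frac{\sigma}{\xi}\right) \\
 &= \frac{VaR_\alpha}{1-\xi} + \frac{\sigma - \xi u}{1-\xi}.
\end{aligned}
```

The second line uses $\int_\alpha^1 (1-x)^{-\xi}\,dx = (1-\alpha)^{1-\xi}/(1-\xi)$, and the third line uses (9) in the form $\frac{\sigma}{\xi}\left(\frac{1-\alpha}{\bar{F}(u)}\right)^{-\xi} = VaR_\alpha - u + \frac{\sigma}{\xi}$.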

Equations (9) and (10) give the theoretical formulae for the jump tail risk measures. In the following, we show how to estimate the VaR and ES from the identified jump series.

For the identified jump series $\{\hat{X}_i\}$, if there are $n$ observations in total and $N_u$ of them lie above $u$, we get the empirical estimator $N_u/n$ of $\bar{F}(u)$. Combining this with the maximum likelihood estimates of the GPD parameters, we arrive at an estimator for the tail distribution $F(x)$,

$$\hat{F}(x) = 1 - \frac{N_u}{n}\left(1 + \frac{\hat{\xi}}{\hat{\sigma}}(x - u)\right)^{-1/\hat{\xi}}. \tag{11}$$

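Also, we get the estimator of VaR

$$\widehat{VaR}_\alpha = u + \frac{\hat{\sigma}}{\hat{\xi}}\left(\left(\frac{n}{N_u}(1-\alpha)\right)^{-\hat{\xi}} - 1\right) \tag{12}$$

and the estimator of ES

$$\widehat{ES}_\alpha = \frac{\widehat{VaR}_\alpha}{1-\hat{\xi}} + \frac{\hat{\sigma} - \hat{\xi}u}{1-\hat{\xi}}. \tag{13}$$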
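The plug-in estimators can be sketched as follows, taking the fitted GPD parameters as given inputs (the maximum likelihood fit itself is not shown). The VaR line is formula (9) with $\bar{F}(u)$ replaced by $N_u/n$. The function names `tail_cdf` and `var_es` and the numbers in the usage note are hypothetical, not the paper's estimates.

```python
import math


def tail_cdf(x, u, xi, sigma, n, n_u):
    """Tail estimator (11) of F(x) for x >= u."""
    if xi == 0.0:
        return 1.0 - (n_u / n) * math.exp(-(x - u) / sigma)
    return 1.0 - (n_u / n) * (1.0 + xi * (x - u) / sigma) ** (-1.0 / xi)


def var_es(alpha, u, xi, sigma, n, n_u):
    """Plug-in VaR and ES at level alpha from GPD estimates xi, sigma."""
    tail_prob = n_u / n  # empirical estimate of P(X > u)
    if not 0.0 < alpha < 1.0 or (1.0 - alpha) > tail_prob:
        raise ValueError("need alpha >= F(u), i.e. 1 - alpha <= N_u / n")
    if xi >= 1.0:
        raise ValueError("ES is finite only for xi < 1")
    ratio = (1.0 - alpha) / tail_prob
    if xi == 0.0:
        var = u - sigma * math.log(ratio)  # limit of (9) as xi -> 0
    else:
        var = u + sigma / xi * (ratio ** (-xi) - 1.0)
    es = var / (1.0 - xi) + (sigma - xi * u) / (1.0 - xi)
    return var, es
```

As a consistency check, `tail_cdf(var, ...)` evaluated at the returned VaR reproduces `alpha`, for instance with `alpha=0.99, u=0.2, xi=0.2, sigma=0.1, n=1000, n_u=100`.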

The estimation procedure presented above depends heavily on the threshold $u$. In this paper, we use the mean excess plot to choose a reasonable threshold. The idea behind this method is as follows. Given a high threshold $u_0$, suppose that the excess $X - u_0$ follows a GPD with parameters $\xi$ and $\sigma$. Then the mean excess over the threshold $u_0$ is

$$E(X - u_0 \mid X > u_0) = \frac{\sigma}{1-\xi}. \tag{14}$$

For any $u > u_0$, define the mean excess function $e(u)$ as

$$e(u) = E(X - u \mid X > u) = \frac{\sigma + \xi(u - u_0)}{1-\xi}. \tag{15}$$

Thus, for a fixed $\xi$, the mean excess function is a linear function of $u$ for $u > u_0$. This result leads to a simple graphical method for inferring an appropriate threshold value $u_0$ for the GPD. Define the empirical mean excess function as

$$\hat{e}(u) = \frac{1}{N_u}\sum_{i=1}^{N_u}(x_i - u), \tag{16}$$

where the sum runs over the $N_u$ observations exceeding $u$. The scatter plot of $\hat{e}(u)$ against $u$ is called the mean excess plot, which should be linear in $u$ for $u > u_0$. Hence, we can choose a reasonable threshold according to the mean excess plot.
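The empirical mean excess function is simple to compute directly. The sketch below evaluates it at a given threshold and at every distinct observed value except the largest, which yields the points of a mean excess plot; the function names are hypothetical, and the plotting itself is left to the reader's preferred tool.

```python
def mean_excess(xs, u):
    """Empirical mean excess: average of (x - u) over observations x > u."""
    excesses = [x - u for x in xs if x > u]
    if not excesses:
        raise ValueError("no observation exceeds the threshold")
    return sum(excesses) / len(excesses)


def mean_excess_points(xs):
    """Return (u, e_hat(u)) pairs for every distinct value except the maximum."""
    levels = sorted(set(xs))[:-1]
    return [(u, mean_excess(xs, u)) for u in levels]
```

For `xs = [1, 2, 3, 4]`, `mean_excess_points(xs)` returns `[(1, 2.0), (2, 1.5), (3, 1.0)]`. On real jump data one looks for the value of `u` above which these points are roughly linear.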

In this section, we apply our procedure for measuring the intraday jump tail risk to actual high frequency data. We collect the transaction data for Microsoft Corporation (MSFT) shares traded on NASDAQ from Jan 3, 2011 to Jul 29, 2011 from Wharton Research Data Services (WRDS). We use ten-second data to identify and estimate the intraday jumps in one-minute returns by implementing the pre-averaging step with $k_n = 7$ observations. Over this seven-month period, there were 336,960 ten-second observations in total, corresponding to 6.5 trading hours per day on 144 valid trading days, excluding weekends and holidays. The return is calculated as $r_{t_i} = (\log P_{t_i} - \log P_{t_{i-1}}) \times 100$, where $P_{t_i}$ denotes the transaction price at $t_i$.
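The sampling arithmetic in this paragraph can be verified with a short sketch: ten-second sampling over 6.5 hours gives 2,340 observations per day, and a pre-averaging window of $k_n = 7$ observations spans six ten-second increments, i.e. one minute. The variable and function names below are hypothetical.

```python
import math

SECONDS_PER_OBS = 10
TRADING_HOURS = 6.5
TRADING_DAYS = 144
K_N = 7

obs_per_day = int(TRADING_HOURS * 3600 / SECONDS_PER_OBS)  # 2340
total_obs = obs_per_day * TRADING_DAYS                     # 336960
window_seconds = (K_N - 1) * SECONDS_PER_OBS               # 60


def log_return_pct(p_now, p_prev):
    """Percentage log return (log P_t - log P_{t-1}) * 100."""
    return (math.log(p_now) - math.log(p_prev)) * 100.0
```

The total of 336,960 matches the count reported above, and the 60-second window is consistent with the one-minute returns used for jump identification.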

Firstly, we use the pre-averaging threshold method to estimate the intraday jumps. Let $g(x) = x \wedge (1 - x)$, which is used in Jacod et al. [

Next, we use the POT method and the generalized Pareto distribution (GPD) to fit the negative and positive jump tails respectively. The threshold $u$ is chosen by the mean excess function.

Based on the chosen threshold u ,

We then calculate the VaR and ES for negative and positive jumps based on the above estimation results of jump tail distribution. The results of VaR and ES are presented in

| | Negative jump | Positive jump |
|---|---|---|
| Counts | 437 | 452 |
| Percentage | 0.78% | 0.81% |
| Threshold $u$ | 0.20 | 0.25 |
| Counts of exceedances | 286 | 293 |
| Percentage of exceedances | 65.45% | 64.82% |
| $\hat{\xi}$ | 0.2176*** | −0.0803 |
| | (0.0669) | (0.0796) |
| $\hat{\sigma}$ | 0.0809*** | 0.1225*** |
| | (0.0071) | (0.0138) |

Note: Values in parentheses are the standard errors of the estimates; *, **, *** mean that the results are significant at the 10%, 5%, 1% level respectively.

| Sig. level | Negative tail VaR | Negative tail ES | Positive tail VaR | Positive tail ES |
|---|---|---|---|---|
| 5.00% | 0.5417 | 0.5787 | 0.5762 | 0.6095 |
| | (0.9654) | (0.9177) | (0.9955) | (0.7538) |
| 1.00% | 0.8408 | 0.8521 | 0.7216 | 0.7296 |
| | (0.9795) | (0.9795) | (0.7815) | (0.7815) |
| 0.50% | 1.0057 | 1.0124 | 0.7787 | 0.7829 |
| | (0.9854) | (0.9854) | (0.6139) | (0.6139) |
| 0.10% | 1.4994 | 1.5014 | 0.8995 | 0.9004 |
| | (0.3405) | (0.3405) | (0.2977) | (0.2977) |
| 0.01% | 2.5864 | 2.5867 | 1.0473* | 1.0474* |
| | (0.1108) | (0.1108) | (0.0962) | (0.0962) |

Note: Values in parentheses are the p values for testing the validity of the VaRs and ESs; *, **, *** mean that the risk measures are invalid at the 10%, 5%, 1% level respectively.

The jump component in the asset price process is a very important source of financial extreme risk. With the availability of high frequency data, it has attracted wide attention from researchers in the last two decades. However, as the data frequency increases, the identification of jumps and related studies run into the bias problem caused by market microstructure noise. In this paper, we propose a simple nonparametric method to identify the intraday jumps and measure the intraday jump tail risk with noisy high frequency data. We use a two-step procedure to measure the jump tail risk. In the first step, we use a pre-averaging approach to diminish the effects of noise, and then propose the pre-averaging threshold estimator of the intraday jumps. In the second step, we fit the tail distribution of the identified jump series with the POT method and the GPD, and then calculate the risk measures (VaR and ES) of the jump tail. Finally, we illustrate our procedure with a real data study. The results show that our proposed procedure for measuring the jump tail risk is valid and easy to implement. Moreover, the nonparametric identification of intraday jumps can also be used to analyze the dynamics of intraday jumps, which is useful for studying the microstructure of the market. Further studies on risk management, such as analyzing the factors that drive jump tail risk and dynamic jump tail risk forecasting, are directions for future research.

This research was supported in part by the NSFC (71601048), and the Fundamental Research Funds for the Central Universities in UIBE (13QD09).

Yu, C., Zhao, X.J. and Zhang, F. (2017) Measuring the Intraday Jump Tail Risk of Financial Asset Price with Noisy High Frequency Data. Open Journal of Statistics, 7, 72-83. https://doi.org/10.4236/ojs.2017.71006