Distributed Estimator of Market Beta under Extreme Conditions

Abstract

Market beta measures the volatility, or systematic risk, of a security or portfolio relative to the market as a whole. This paper considers the distributed estimation of market beta for massive data and establishes the consistency and asymptotic normality of the proposed estimator. Simulations further illustrate the finite-sample properties of this estimator.

Citation: Zhu, S. (2023) Distributed Estimator of Market Beta under Extreme Conditions. Journal of Applied Mathematics and Physics, 11, 3676-3701. doi: 10.4236/jamp.2023.1111232.

1. Introduction

Distributed statistical inference, as a hot topic and an effective method, has been widely discussed over the past ten years, and a large body of research has accumulated. Representative works include the following. In theory, Chen et al. (2021) [1] put forward the distributed Hill estimator and proved its oracle property; Volgushev et al. (2017) [2] proposed distributed inference for quantile regression processes, together with a method to assess the efficiency of this inference that requires almost no additional computational cost. In application, Mohammed et al. (2020) [3] proposed a technique to divide a deep neural network (DNN) into multiple partitions, which reduces the total latency of DNN inference; Smith and Hollinger (2018) [4] proposed a distributed inference-based multi-robot exploration technique that uses the observed map structure to infer unobserved map features, reducing the cumulative exploration path length in their trials; Ye (2017) [5] studied the stability of the beta coefficient in the Chinese stock market and identified the best estimation period for beta; Mitra (2019) [6] used a smooth linear transfer function to measure the amplitude and direction of market movements, and the proposed classification better captures the asymmetric behavior of beta.

Market beta, also known as systematic risk or equity beta, measures a stock's sensitivity to overall market movements. Its development can be traced back to the early 20th century, when economists and financial analysts began to understand the importance of systematic risk in determining asset returns. One of the pioneering works in this area is the paper of Markowitz (1952) [7] on portfolio selection, which laid the foundation for modern portfolio theory and emphasized the importance of diversification. Black and Scholes (1973) [8] introduced the concept of beta to measure the systematic risk of individual securities or stocks. Banks with a higher beta are expected to suffer larger capital losses in the event of an extremely adverse shock to the financial system.

Estimating market beta involves analyzing historical data on a stock's returns and their correlation with the market returns. The usual econometric approach regresses the stock's returns on the market returns over a specific time period; it has been used to evaluate the beta of financial returns on commodities and currencies (Atanasov and Nitschka (2014) [9]; Lettau et al. (2013) [10]), stocks (Post and Versijp (2004) [11]), and active trading strategies (Mitchell and Pulvino (2001) [12]). However, under extreme conditions the regression is based on only a small number of tail observations, which may produce a relatively large variance of the estimator, and financial market data are mostly heavy-tailed, which may further increase the error. To avoid these problems, Oordt and Zhou (2017) [13] proposed a new method to estimate market $\beta$.

Let $X$ and $Y$ be continuous random variables with distribution functions $F_X$ and $F_Y$, respectively. Assume that $1-F_X$ and $1-F_Y$ are heavy-tailed with tail indices $\alpha_x$ and $\alpha_y$, respectively. This means that

$$1-F_X(u)=u^{-\alpha_x}\ell_x(u)\quad\text{and}\quad 1-F_Y(u)=u^{-\alpha_y}\ell_y(u),\tag{1.1}$$

where $\ell_x(u)$ and $\ell_y(u)$ are slowly varying functions as $u\to\infty$. Let $Q_X(\bar p)=F_X^{-1}(1-\bar p)$ for small $\bar p$. The relation between $X$ and $Y$ restricted to extreme values of $X$ is given by

$$Y=\beta X+\varepsilon,\quad\text{for } X>Q_X(\bar p),\tag{1.2}$$

where $\varepsilon$ is an error term assumed to be independent of $X$ conditionally on $X>Q_X(\bar p)$.

To obtain an estimator by the EVT method, we consider the following tail dependence measure from multivariate EVT (see, e.g., Hult and Lindskog (2002) [14]):

$$\tau:=\lim_{p\to0}\tau(p)=\lim_{p\to0}\frac1p P\big(X>Q_X(p),\,Y>Q_Y(p)\big),\tag{1.3}$$

where $Q_Y(p)$ denotes the quantile function of $Y$ defined as $Q_Y(p)=F_Y^{-1}(1-p)$. We assume the usual second-order condition (see, e.g., de Haan and Stadtmüller (1996) [15]) for $X$, which quantifies the speed of convergence in this relation:

$$\lim_{u\to\infty}\frac{\dfrac{P(X>ux)}{P(X>u)}-x^{-\alpha_x}}{a_x(u)}=x^{-\alpha_x}\,\frac{x^{\rho_x\alpha_x}-1}{\rho_x\alpha_x},\tag{1.4}$$

where $a_x(u)=A_x\!\left(\frac1{1-F_X(u)}\right)$ with $\lim_{u\to\infty}A_x(u)=0$ and $\rho_x\le0$.

Suppose (1.4) holds. Under the linear model in (1.2), with $\alpha_y>\alpha_x/2$ and $\beta\ge0$, the following conclusion is given in Oordt and Zhou (2017) [13]:

$$\lim_{p\to0}\big(\tau(p)\big)^{1/\alpha_x}\,\frac{Q_Y(p)}{Q_X(p)}=\beta.\tag{1.5}$$

Naturally, considering independent and identically distributed (i.i.d.) observations $(X_1,Y_1),(X_2,Y_2),\ldots,(X_n,Y_n)$ with i.i.d. unobserved error terms $\varepsilon_1,\ldots,\varepsilon_n$, we mimic the limit procedure $p\to0$ by using only the $k$ most extreme observations in the tail region, where $k:=k(n)\to\infty$ and $k/n\to0$ as $n\to\infty$. Oordt and Zhou (2017) [13] give an estimator of $\beta$ as

$$\hat\beta:=\big(\hat\tau(k/n)\big)^{1/\hat\alpha_x}\,\frac{\hat Q_Y(k/n)}{\hat Q_X(k/n)}.\tag{1.6}$$

To prove asymptotic normality, the second-order condition for $Y$ is given by

$$\lim_{u\to\infty}\frac{\dfrac{P(Y>uy)}{P(Y>u)}-y^{-\alpha_y}}{a_y(u)}=y^{-\alpha_y}\,\frac{y^{\rho_y\alpha_y}-1}{\rho_y\alpha_y},\tag{1.7}$$

where $a_y(u)=A_y\!\left(\frac1{1-F_Y(u)}\right)$ is an eventually positive or eventually negative function, $\lim_{u\to\infty}A_y(u)=0$, and $\rho_y\le0$. Following Drees and Huang (1998) [16], define $R(x,y,p)=\frac1p P\big(X>Q_X(px),\,Y>Q_Y(py)\big)$. For this dependence structure, we assume that $R(x,y,p)\to R(x,y)$ as $p\to0$ for some positive function $R(x,y)$, with the following speed of convergence: there exists $\theta>0$ for which, as $p\to0$,

$$R(x,y,p)-R(x,y)=O(p^\theta)\tag{1.8}$$

for all $(x,y)\in[0,1]^2\setminus\{(0,0)\}$. In particular,

$$\lim_{p\to0}R(1,1,p)=\tau.\tag{1.9}$$

Under conditions (1.4), (1.7) and (1.8), suppose $k=O(n^\zeta)$, where

$$\zeta<\min\left(\frac{2\theta}{1+2\theta},\ \frac{-2\rho_x}{-2\rho_x+\alpha_x},\ \frac{-2\rho_y}{-2\rho_y+\alpha_y},\ \frac{3}{2+\alpha_y}\right);$$

then Oordt and Zhou (2017) [13] prove the asymptotic normality of $\hat\beta$.

This estimation method can be used not only for the assessment of investment risks but also in banking (Oordt and Zhou (2018) [17]), insurance and other fields. However, due to confidentiality, banks may not share their operating losses with each other, and insurance companies cannot share observations with the outside world in order to protect the privacy of their customers. Banks and insurance companies can therefore only compute statistics on their own data and share the results, in such a way that individual data cannot be re-identified from the shared information. Distributed statistical inference is well suited to these situations: it analyzes data stored on multiple machines, typically via a divide-and-conquer algorithm that estimates the required parameters on each machine and transmits the results to a central machine, which combines them, usually by simple averaging, into a computationally feasible estimator.

The objective of this paper is to apply the divide-and-conquer idea to estimating market $\beta$. Consider i.i.d. observations $(X_1,Y_1),(X_2,Y_2),\ldots,(X_n,Y_n)$ distributed across $k$ different machines, each machine holding $m$ observations with $n=mk$, and assume that, as $n\to\infty$,

$$k:=k(n)\to\infty,\qquad m:=m(n)\to\infty,\qquad \frac{m}{\log k}\to\infty.\tag{1.10}$$

We follow a divide-and-conquer algorithm: first estimate $\hat\beta_{n,j}$ on each machine, then take the average over the $k$ machines as the distributed estimator $\hat\beta_D$ of $\beta$:

$$\hat\beta_D=\frac1k\sum_{j=1}^k\hat\beta_{n,j}.\tag{1.11}$$

Sorting the observations $X_{j1},X_{j2},\ldots,X_{jm}$ in the $j$-th machine yields the order statistics $X_{j(1)}\ge X_{j(2)}\ge\cdots\ge X_{j(m)}$, and only the top $d$ are used to estimate $\beta$, where $d:=d(n)\to\infty$ and $d/m\to0$ as $n\to\infty$. From (1.5), we have

$$\hat\beta_{n,j}=\big(\hat\tau_j(d/m)\big)^{1/\hat\alpha_j}\,\frac{\hat Q_{Y_j}(d/m)}{\hat Q_{X_j}(d/m)},\tag{1.12}$$

where the tail index is estimated using the Hill estimator of Hill (1975) [18]:

$$\frac1{\hat\alpha_j}:=\frac1d\sum_{i=1}^d\big(\log X_{j(i)}-\log X_{j(d+1)}\big).$$

The estimator of the dependence measure is provided by multivariate EVT, see Embrechts et al. (2000) [19]:

$$\hat\tau_j(d/m)=\frac1d\sum_{t=1}^m 1\big\{Y_{jt}>Y_{j(d+1)},\,X_{jt}>X_{j(d+1)}\big\},$$

where $Y_{j(d+1)}$ is the $(d+1)$-th highest order statistic of $Y_{j1},\ldots,Y_{jm}$. Finally, $\hat Q_{X_j}(d/m)=X_{j(d+1)}$ and $\hat Q_{Y_j}(d/m)=Y_{j(d+1)}$.
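To make the construction concrete, the following Python sketch implements the per-machine estimator (1.12) and the averaged estimator (1.11). It is a minimal sketch under the assumptions above (positive tail observations, one array per machine); the function names `beta_machine` and `beta_distributed` and the choice of `d` are illustrative, not from the paper.

```python
import numpy as np

def beta_machine(x, y, d):
    """Per-machine estimator (1.12) from the m observations (x, y) on one machine."""
    x_desc = np.sort(x)[::-1]             # X_{j(1)} >= ... >= X_{j(m)}
    y_desc = np.sort(y)[::-1]
    x_thr, y_thr = x_desc[d], y_desc[d]   # (d+1)-th highest order statistics
    # Hill estimator of 1/alpha_x from the top d order statistics (requires x_thr > 0).
    inv_alpha = np.mean(np.log(x_desc[:d]) - np.log(x_thr))
    # Tail dependence estimator: joint exceedances of both thresholds, divided by d.
    tau_hat = np.sum((x > x_thr) & (y > y_thr)) / d
    # The quantile estimators are the thresholds themselves: Q_X(d/m), Q_Y(d/m).
    return tau_hat ** inv_alpha * y_thr / x_thr

def beta_distributed(x_parts, y_parts, d):
    """Divide-and-conquer estimator (1.11): average of the k per-machine estimates."""
    return np.mean([beta_machine(xj, yj, d) for xj, yj in zip(x_parts, y_parts)])
```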

When $\limsup_{n\to\infty}\tau(d/m)=0$, we require some additional conditions to ensure the asymptotic normality of the Hill estimator:

$$\lim_{n\to\infty}\sqrt d\,A_x\!\left(\frac md\right)=\lambda<\infty,\qquad \frac{\sqrt d}{\log m}\to+\infty.\tag{1.13}$$

Suppose there exists a sequence $p_n\to0$ as $n\to\infty$ such that, for sufficiently large $n$, $p_n<\bar p$; this implies that the linear model in (1.2) applies for sufficiently large $n$.

The remainder of this paper is organized as follows. Section 2 provides the main results; the finite-sample behavior of $\hat\beta_D$ is examined in Section 3; all proofs are deferred to Section 4.

2. Main Innovations and Results

The innovations of this paper are:

• Under extreme market conditions, with few tail observations and heavy tails, a new beta estimator is proposed using the distributed idea.

• In the numerical simulations, the case of contaminated data is considered and the expected performance is achieved, making the method applicable to a wider range of data.

The results of this paper are:

• The consistency of $\hat\beta_D$.

Theorem 2.1. Under the linear tail model in (1.2), assume that (1.1) and (1.4) hold, and that (1.13) holds when $\lim_{p\to0}\tau(p)=0$. Then, as $n\to\infty$, $\hat\beta_D\xrightarrow{P}\beta$.

• The asymptotic normality of $\hat\beta_D$.

Theorem 2.2. Assume that the conditions in Theorem 2.1 hold and that $P(Y<-u)=O(P(Y>u))$ as $u\to\infty$. Suppose both (1.7) and (1.8) hold and $\lim_{p\to0}\tau(p)=\tau\in(0,1)$. Further assume that $kd=O(n^\zeta)$, where

$$\zeta<\min\left(\frac{2\theta}{1+2\theta},\ \frac{2\rho_x}{2\rho_x-1},\ \frac{2\rho_y}{2\rho_y-1},\ \frac{2}{3+\alpha_y}\right).$$

Then, as $n\to\infty$,

$$\sqrt{kd}\,\big(\hat\beta_D-\beta\big)\xrightarrow{d}N\left(0,\ \frac{\beta^2}{\alpha_x^2}\left(\frac1\tau+2\log\tau-\frac2\tau\log\tau-\frac13(\log\tau)^2\right)\right).$$
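For reference, the limiting variance above can be evaluated numerically, for instance to form a rough normal-approximation confidence interval. This is a minimal sketch, assuming consistent plug-in estimates of $\tau$ and $\alpha_x$; the function name and the interval construction are illustrative, not from the paper.

```python
import numpy as np

def asymptotic_variance(beta, tau, alpha_x):
    """Variance of the limit law in Theorem 2.2, evaluated at (beta, tau, alpha_x)."""
    lt = np.log(tau)  # tau must lie in (0, 1)
    return (beta ** 2 / alpha_x ** 2) * (1 / tau + 2 * lt - 2 * lt / tau - lt ** 2 / 3)

# Illustrative 95% interval based on k*d effective tail observations:
# half_width = 1.96 * np.sqrt(asymptotic_variance(beta_D, tau_hat, alpha_hat) / (k * d))
```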

3. Simulation

We conduct two sets of simulations to demonstrate the finite-sample performance of the distributed beta estimator $\hat\beta_D$. In each simulation we consider three linear models, with $\beta=1.5$, $1$ and $0.5$, and generate samples of size n = 10,000. Based on r = 1000 repetitions, we obtain the finite-sample squared bias, variance and mean squared error (MSE) of our estimator.

3.1. Comparison for Different Levels of d

In the first set of simulations, we vary the level of $d$ in the distributed beta estimator to verify the theoretical results on the oracle property. The oracle sample $X_1,\ldots,X_n$ contains n = 10,000 observations stored on $k$ machines with $m$ observations each. We fix k = 20 and m = 500 and compare the finite-sample performance of the distributed beta estimator with that of the oracle beta estimator for different values of $d$. Since the Student's t-distribution is known to be heavy-tailed with tail index equal to its degrees of freedom, we simulate $X$ and $\varepsilon$ as random draws from a Student's t-distribution with four degrees of freedom. According to Lemma 1.3.1 in Embrechts et al. (1997) [20], the sum of two heavy-tailed random variables is also heavy-tailed, with tail index governed by the smaller of the two. The observations for $Y$ are then constructed by adding the simulated $X$ and $\varepsilon$, which guarantees that $Y$ is also heavy-tailed and that $\alpha_y>\frac12\alpha_x$.
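A minimal sketch of this data-generating process, assuming the linear model $Y=\beta X+\varepsilon$ is applied to all observations before splitting them across machines (variable names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n, k, m = 10_000, 20, 500           # n = k * m observations on k machines
beta = 1.0                          # the paper also uses 1.5 and 0.5

x = rng.standard_t(df=4, size=n)    # tail index alpha_x = 4
eps = rng.standard_t(df=4, size=n)  # heavy-tailed error term
y = beta * x + eps                  # Y is heavy-tailed by Lemma 1.3.1 in [20]

x_parts = x.reshape(k, m)           # machine j holds row j
y_parts = y.reshape(k, m)
```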

Figure 1. Finite-sample performance of the distributed beta estimator and the oracle beta estimator for different levels of d. The blue lines report the simulation results for the distributed EVT approach; the yellow lines report those for the EVT approach.

The first column of Figure 1 compares the mean squared error of the distributed beta estimator $\hat\beta_D$ and the oracle beta estimator $\hat\beta$. First, the MSE gradually decreases as $\beta$ increases. Theoretically, $\tau$ increases with $\beta$ while the MSE decreases as $\tau$ increases, so the simulation results agree with the theoretical results. Second, the second and third columns of Figure 1 decompose the MSE into squared bias and variance. We observe a trade-off between bias and variance for both estimators: as $d$ increases, the bias increases while the variance decreases. When the number of tail observations is small, the variance of the oracle beta estimator is smaller than that of the distributed beta estimator; as the number of observations grows, the two variances become equal, in line with the result of Theorem 2.2.
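The reported decomposition uses the identity MSE = squared bias + variance across the r = 1000 repetitions. A minimal sketch, assuming `estimates` holds one $\hat\beta_D$ per repetition:

```python
import numpy as np

def mse_decomposition(estimates, beta_true):
    """Squared bias, variance and MSE of repeated estimates of beta_true."""
    est = np.asarray(estimates)
    bias_sq = (est.mean() - beta_true) ** 2
    variance = est.var()
    return bias_sq, variance, bias_sq + variance
```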

3.2. Data Is Contaminated

In the second set of simulations, we examine whether the distributed estimator retains good properties when the data are contaminated. We simulate three cases: $X$ contaminated, $\varepsilon$ contaminated, and both $X$ and $\varepsilon$ contaminated. The total number of observations is unchanged: n = 10,000 observations are divided into k = 20 machines with m = 500 observations each.

Figure 2. X is contaminated: finite-sample performance of the distributed beta estimator and the oracle beta estimator for different levels of d. The blue lines represent the simulation results of the distributed beta estimator, and the red lines represent the corresponding oracle results.

Figure 2 shows the mean squared error, squared bias and variance of the two estimators when $X$ is contaminated. We again draw 10,000 observations of $\varepsilon$ from a Student's t-distribution with 4 degrees of freedom, while the observations of $X$ are drawn from a standard normal distribution with probability 0.1 and from a Student's t-distribution with 4 degrees of freedom with probability 0.9, so that on average 1000 out of the 10,000 observations are contaminated. We then sort the observations in each machine and use (1.12) to obtain $\hat\beta_{n,j}$.
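The contamination mechanism is a two-component mixture. A minimal sketch, assuming each observation is independently replaced by a standard normal draw with probability 0.1 (function name illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

def contaminated_sample(size, p_contam=0.1, df=4):
    """Draw from N(0,1) with prob. p_contam, else from Student-t(df)."""
    clean = rng.standard_t(df, size=size)
    noise = rng.standard_normal(size)
    mask = rng.random(size) < p_contam   # True marks a contaminated draw
    return np.where(mask, noise, clean)

x = contaminated_sample(10_000)          # on average 1000 contaminated draws
```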

The third column of Figure 2 shows the variance of the two estimators for different values of $d$; it is almost the same as when the observations are uncontaminated. When the number of tail observations is small, the variance of the distributed estimator is larger than that of the oracle estimator, and as $d$ increases the variance approaches zero. The first column shows that the mean squared error stays below 0.05, so the estimation remains accurate.

Figure 3. ε is contaminated: finite-sample performance of the distributed beta estimator and the oracle beta estimator for different levels of d. The blue lines represent the simulation results of the distributed beta estimator, and the red lines represent the corresponding oracle results.

Figure 3 shows the mean squared error, squared bias and variance of the two estimators when $\varepsilon$ is contaminated. We again draw 10,000 observations of $X$ from a Student's t-distribution with 4 degrees of freedom, while the observations of $\varepsilon$ are drawn from a standard normal distribution with probability 0.1 and from a Student's t-distribution with 4 degrees of freedom with probability 0.9, so that on average 1000 out of the 10,000 observations are contaminated. We then sort the observations in each machine and use (1.12) to obtain $\hat\beta_{n,j}$. Figure 3 is basically consistent with Figure 1, indicating that the choice of $\varepsilon$ does not affect the properties of the estimator; this matches the theory, in which the random error may even be thin-tailed.

Figure 4. Both ε and X are contaminated: finite-sample performance of the distributed beta estimator and the oracle beta estimator for different levels of d. The blue lines represent the simulation results of the distributed beta estimator, and the red lines represent the corresponding oracle results.

Figure 4 shows the mean squared error, squared bias and variance of the two estimators when both $\varepsilon$ and $X$ are contaminated. Observations of both $\varepsilon$ and $X$ are drawn from a standard normal distribution with probability 0.1 and from a Student's t-distribution with 4 degrees of freedom with probability 0.9, so that on average 1000 out of the 10,000 observations of each are contaminated. We then sort the observations in each machine and use (1.12) to obtain $\hat\beta_{n,j}$. The results are similar to Figure 1 and consistent with the theoretical results, indicating that the distributed estimator behaves comparably when the data are contaminated.

4. Proofs

In order to prove the main results, we need the following two lemmas.

Lemma 4.1. Assume that (1.4) and (1.7) hold and $kd=O(n^\zeta)$ with $\zeta<\min\left(\frac{2\rho_x}{2\rho_x-1},\ \frac{2\rho_y}{2\rho_y-1}\right)$. Then, as $n\to\infty$, we have

$$\lim_{n\to\infty}\sqrt{kd}\,A_x\!\left(\frac md\right)=0;\qquad \lim_{n\to\infty}\sqrt{kd}\,A_y\!\left(\frac md\right)=0.$$

Proof. According to Theorem 2.3.3 in de Haan and Ferreira (2006) [21], $A_x(t)$ is regularly varying with index $\rho_x<0$ as $t\to\infty$; then

$$\lim_{n\to\infty}\sqrt{kd}\,A_x\!\left(\frac md\right)=\lim_{n\to\infty}\sqrt{kd}\left(\frac md\right)^{\rho_x}\bar\ell_x\!\left(\frac md\right)=\lim_{n\to\infty}(kd)^{\frac12-\rho_x}\,n^{\rho_x}\,\bar\ell_x\!\left(\frac md\right)=\lim_{n\to\infty}n^{\zeta(\frac12-\rho_x)+\rho_x}\,\bar\ell_x\!\left(\frac md\right)=\lim_{n\to\infty}(kd)^{\zeta(\frac12-\rho_x)+\rho_x}\left(\frac n{kd}\right)^{\zeta(\frac12-\rho_x)+\rho_x}\bar\ell_x\!\left(\frac n{kd}\right),$$

where $\bar\ell_x$ is a slowly varying function. Since $\zeta<\frac{2\rho_x}{2\rho_x-1}$, we have $\zeta\left(\frac12-\rho_x\right)+\rho_x<0$; then, since $n/(kd)=m/d\to\infty$,

$$\lim_{n\to\infty}\sqrt{kd}\,A_x\!\left(\frac md\right)=0,$$

and similarly $\lim_{n\to\infty}\sqrt{kd}\,A_y\!\left(\frac md\right)=0$. ∎

Let

$$C:=\{Y>Q_Y(py),\ X>Q_X(px)\};$$

$$C_1:=\{\beta X>Q_Y(py)(1+\delta),\ X>Q_X(px),\ \varepsilon>-\delta Q_Y(py)\};$$

$$C_{21}:=\{\beta X>Q_Y(py)(1-\delta),\ X>Q_X(px)\};$$

$$C_{22}:=\{\varepsilon>\delta Q_Y(py),\ X>Q_X(px)\},$$

with $\delta:=\delta(p)>0$ to be specified later. It is clear that for any $0<\delta<1$, $C_1\subseteq C\subseteq C_{21}\cup C_{22}$: on $C_1$ we have $Y=\beta X+\varepsilon>(1+\delta)Q_Y(py)-\delta Q_Y(py)=Q_Y(py)$, while on $C$ either $\varepsilon>\delta Q_Y(py)$ or $\beta X=Y-\varepsilon>(1-\delta)Q_Y(py)$.

Lemma 4.2. Suppose $P(Y<-u)=O(P(Y>u))$ as $u\to\infty$. Further assume that (1.4) and (1.7) hold and $kd=O(n^\zeta)$, where

$$\zeta<\min\left(\frac{2\theta}{1+2\theta},\ \frac{2\rho_x}{2\rho_x-1},\ \frac{2\rho_y}{2\rho_y-1},\ \frac{2}{3+\alpha_y}\right);$$

then, as $n\to\infty$,

$$\sqrt{kd}\,P\left(\varepsilon>\delta_n Q_Y\!\left(\frac dm\right)\right)\to0;\tag{4.1}$$

$$\sqrt{kd}\,P\left(\varepsilon\le-\delta_n Q_Y\!\left(\frac dm\right)\right)\to0;\tag{4.2}$$

$$\lim_{n\to\infty}\sqrt{kd}\left[\frac md P\left(\beta X>Q_Y\!\left(\frac dm\right)(1\pm\delta_n)\right)-\left(\frac{\beta Q_X(\frac dm)}{Q_Y(\frac dm)}\right)^{\alpha_x}\right]=0.\tag{4.3}$$

Proof. Let $\delta=\delta_n=(kd)^{-\frac12-\kappa}$ with $0<\kappa<\frac1{2\alpha_y}$ chosen such that

$$\left(\frac12+\kappa\right)\alpha_y+\frac32<\frac1\zeta.$$

Since $\zeta<\frac2{\alpha_y+3}$, such a choice of $\kappa$ is feasible. We then have $\lim_{n\to\infty}\sqrt{kd}\,\delta_n=0$ and $\lim_{n\to\infty}\sqrt{kd}\,\delta_n^{-\alpha_y}\frac dm=0$. We first prove that $\delta_n Q_Y(\frac dm)\to\infty$ as $n\to\infty$.

From the heavy-tailed property of the distribution function of $Y$ in (1.1), we obtain $Q_Y(p)=p^{-1/\alpha_y}\tilde\ell_y(p)$, where $\tilde\ell_y$ is a slowly varying function as $p\to0$; then, since $d/m=kd/n$,

$$\lim_{n\to\infty}\delta_n Q_Y\!\left(\frac dm\right)=\lim_{n\to\infty}(kd)^{-\frac12-\kappa}\left(\frac{kd}n\right)^{-\frac1{\alpha_y}}\tilde\ell_y\!\left(\frac{kd}n\right)=\lim_{n\to\infty}n^{\frac1{\alpha_y}-\zeta(\frac12+\kappa+\frac1{\alpha_y})}\,\tilde\ell_y\!\left(\frac{kd}n\right)=\lim_{n\to\infty}(kd)^{\frac1{\alpha_y}-\zeta(\frac12+\kappa+\frac1{\alpha_y})}\left(\frac{kd}n\right)^{\zeta(\frac12+\kappa+\frac1{\alpha_y})-\frac1{\alpha_y}}\tilde\ell_y\!\left(\frac{kd}n\right).$$

Since $\zeta<\frac2{\alpha_y+3}$ and $0<\kappa<\frac1{2\alpha_y}$, we have $\zeta\left(\frac12+\kappa+\frac1{\alpha_y}\right)-\frac1{\alpha_y}<0$; together with $\frac dm\to0$, as $n\to\infty$ we obtain

$$\delta_n Q_Y\!\left(\frac dm\right)\to\infty.\tag{4.4}$$

We prove (4.1) first. Notice that

$$P\left(\varepsilon>\delta_n Q_Y\!\left(\tfrac dm\right)\right)=\frac{P\left(\varepsilon>\delta_n Q_Y(\frac dm),\ X>Q_X(\bar p)\right)}{P\big(X>Q_X(\bar p)\big)}\le\frac{P\left(Y>\delta_n Q_Y(\frac dm)+\beta Q_X(\bar p)\right)}{\bar p}\le\frac{P\left(Y>\delta_n Q_Y(\frac dm)\right)}{P\left(Y>Q_Y(\frac dm)\right)}\,\frac d{m\,\bar p}\sim\delta_n^{-\alpha_y}\,\frac d{m\,\bar p},$$

where the penultimate step uses $P(Y>Q_Y(\frac dm))=\frac dm$ and the last step uses (4.4) together with the regular variation of the tail of $Y$. Since $\sqrt{kd}\,\delta_n^{-\alpha_y}\frac dm\to0$ as $n\to\infty$, we conclude

$$\sqrt{kd}\,P\left(\varepsilon>\delta_n Q_Y\!\left(\frac dm\right)\right)\to0.$$

Next, we prove (4.2). For some $D>0$, we write

$$P\left(\varepsilon\le-\delta_n Q_Y\!\left(\tfrac dm\right)\right)=\frac{P\left(\varepsilon\le-\delta_n Q_Y(\frac dm),\ Q_X(\bar p)\le X\le\frac1{\beta+1}\delta_n Q_Y(\frac dm)\right)}{P\left(Q_X(\bar p)\le X\le\frac1{\beta+1}\delta_n Q_Y(\frac dm)\right)}\le\frac{P\left(Y\le-\frac1{\beta+1}\delta_n Q_Y(\frac dm)\right)}{\bar p-P\left(X>\frac1{\beta+1}\delta_n Q_Y(\frac dm)\right)}\le\frac{D\,P\left(Y>\frac1{\beta+1}\delta_n Q_Y(\frac dm)\right)}{\bar p-P\left(X>\frac1{\beta+1}\delta_n Q_Y(\frac dm)\right)},$$

where the first inequality holds because on this event $Y=\beta X+\varepsilon\le-\frac1{\beta+1}\delta_n Q_Y(\frac dm)$, and the last step uses the condition $P(Y<-u)=O(P(Y>u))$. By (4.4), the denominator converges to $\bar p$, which is positive and finite. As in the proof of (4.1), we have

$$\sqrt{kd}\,P\left(\varepsilon\le-\delta_n Q_Y\!\left(\frac dm\right)\right)\to0.$$

Finally, we prove (4.3). By Lemma 4.1, we know that $\sqrt{kd}\,a_x\big(Q_X(\frac dm)\big)\to0$. Recalling the second-order condition (1.4), we have

$$\lim_{u\to\infty}\sqrt{kd}\left(\frac{P(X>ux)}{P(X>u)}-x^{-\alpha_x}\right)=0.$$

Substituting $ux$ and $u$ by $Q_Y(\frac dm)(1\pm\delta_n)/\beta$ and $Q_X(\frac dm)$, together with the fact that $\frac{Q_Y(\frac dm)(1\pm\delta_n)}{\beta Q_X(\frac dm)}\to\tau^{-1/\alpha_x}$, we get that

$$\lim_{n\to\infty}\sqrt{kd}\left(\frac md P\left(\beta X>Q_Y\!\left(\frac dm\right)(1\pm\delta_n)\right)-\left(\frac{Q_Y(\frac dm)(1\pm\delta_n)}{\beta Q_X(\frac dm)}\right)^{-\alpha_x}\right)=0.$$

Comparing with (4.3), since

$$\lim_{n\to\infty}\sqrt{kd}\left(\left(\frac{Q_Y(\frac dm)(1\pm\delta_n)}{\beta Q_X(\frac dm)}\right)^{-\alpha_x}-\left(\frac{Q_Y(\frac dm)}{\beta Q_X(\frac dm)}\right)^{-\alpha_x}\right)=\lim_{n\to\infty}\left(\frac{Q_Y(\frac dm)}{\beta Q_X(\frac dm)}\right)^{-\alpha_x}\sqrt{kd}\left[(1\pm\delta_n)^{-\alpha_x}-1\right]=\lim_{n\to\infty}\left(\frac{Q_Y(\frac dm)}{\beta Q_X(\frac dm)}\right)^{-\alpha_x}\sqrt{kd}\,(\pm\delta_n)(-\alpha_x)(x^*)^{-\alpha_x-1}=0,$$

where the penultimate step uses Lagrange's mean value theorem, with $x^*$ between $1$ and $1\pm\delta_n$, (4.3) is proved since $\lim_{n\to\infty}\sqrt{kd}\,\delta_n=0$. ∎

Proof of Theorem 2.1. Since

$$\hat\beta_D=\frac1k\sum_{j=1}^k\hat\beta_{n,j}=\frac1k\sum_{j=1}^k\left[\big(\hat\tau_j(\tfrac dm)\big)^{\frac1{\hat\alpha_j}}\frac{\hat Q_{Y_j}(\frac dm)}{\hat Q_{X_j}(\frac dm)}\right]=\frac1k\sum_{j=1}^k\left[\left(\frac{\hat\tau_j(\frac dm)}{\tau(\frac dm)}\right)^{\frac1{\hat\alpha_j}}\times\big(\tau(\tfrac dm)\big)^{\frac1{\hat\alpha_j}-\frac1{\alpha_x}}\times\frac{\hat Q_{Y_j}(\frac dm)}{Q_Y(\frac dm)}\times\frac{Q_X(\frac dm)}{\hat Q_{X_j}(\frac dm)}\times\left(\big(\tau(\tfrac dm)\big)^{\frac1{\alpha_x}}\frac{Q_Y(\frac dm)}{Q_X(\frac dm)}\right)\right]:=\frac1k\sum_{j=1}^k\big[I_{j,1}\,I_{j,2}\,I_{j,3}\,I_{j,4}\,I_{j,5}\big],\tag{4.5}$$

where

$$I_{j,1}=\left(\frac{\hat\tau_j(\frac dm)}{\tau(\frac dm)}\right)^{\frac1{\hat\alpha_j}};\quad I_{j,2}=\big(\tau(\tfrac dm)\big)^{\frac1{\hat\alpha_j}-\frac1{\alpha_x}};\quad I_{j,3}=\frac{\hat Q_{Y_j}(\frac dm)}{Q_Y(\frac dm)};\quad I_{j,4}=\frac{Q_X(\frac dm)}{\hat Q_{X_j}(\frac dm)};\quad I_{j,5}=\big(\tau(\tfrac dm)\big)^{\frac1{\alpha_x}}\frac{Q_Y(\frac dm)}{Q_X(\frac dm)},$$

we show that $I_{j,1}=1+o_p(1)$, $I_{j,2}=1+o_p(1)$, $I_{j,3}=1+o_p(1)$, $I_{j,4}=1+o_p(1)$ and $I_{j,5}=\beta+o(1)$, uniformly for $j=1,2,\ldots,k$, one factor at a time.

We first deal with $I_{j,1}$. For $(x,y)$ in a neighborhood of $(1,1)$, denote

$$\tilde\tau_j(x,y)=\frac1d\sum_{t=1}^m 1\left\{X_{jt}>Q_X\!\left(\tfrac dm x\right),\ Y_{jt}>Q_Y\!\left(\tfrac dm y\right)\right\};$$

then $\hat\tau_j(d/m)$ can be written as

$$\hat\tau_j\!\left(\frac dm\right)=\tilde\tau_j\left(\frac md\big(1-F_X(X_{j(d+1)})\big),\ \frac md\big(1-F_Y(Y_{j(d+1)})\big)\right),$$

and $\left(\frac md\big(1-F_X(X_{j(d+1)})\big),\ \frac md\big(1-F_Y(Y_{j(d+1)})\big)\right)$ lies in a neighborhood of $(1,1)$. According to Corollary 2.2.2 in de Haan and Ferreira (2006) [21], as $n\to\infty$,

$$\sqrt d\left(\frac dm Z_{m-d,m}-1\right)\xrightarrow{d}N(0,1),$$

where $F_Z(a)=1-1/a$ and $Z_{1,m}\le Z_{2,m}\le\cdots\le Z_{m,m}$ are the corresponding order statistics. Combining this with $Z_{m-d,m}\stackrel{d}{=}\frac1{1-F_X(X_{j(d+1)})}$, we have $\sqrt d\left(\frac dm\,\frac1{1-F_X(X_{j(d+1)})}-1\right)\xrightarrow{d}N(0,1)$; then, as $n\to\infty$,

$$\sqrt d\left(\frac md\big(1-F_X(X_{j(d+1)})\big)-1\right)\xrightarrow{d}N(0,1).$$

Hence, for any $\delta>0$, as $n\to\infty$, we have

$$P\left(\left|\frac md\big(1-F_X(X_{j(d+1)})\big)-1\right|>d^{-\frac12+\delta}\right)\to0.$$

A similar relation holds for $Y$. Therefore, in order to prove that $I_{j,1}\xrightarrow{P}1$, we prove the more general result that

$$\frac{\tilde\tau_j(x,y)}{\tau(\frac dm)}\xrightarrow{P}1$$

uniformly for all $j=1,2,\ldots,k$ and all $(x,y)\in[1-d^{-1/2+\delta},\,1+d^{-1/2+\delta}]^2$ for some $0<\delta<1/2$.

Since the tail dependence function satisfies $R\big(x,y,\frac dm\big)=\frac md P\big(X>Q_X(\frac dm x),\,Y>Q_Y(\frac dm y)\big)$, denote

$$\xi_{n,j}(x,y)=\frac1m\sum_{t=1}^m 1\left\{X_{jt}>Q_X\!\left(\tfrac dm x\right),\ Y_{jt}>Q_Y\!\left(\tfrac dm y\right)\right\}.$$

Since the observations on different machines are independent and identically distributed, we have

$$\frac{\tilde\tau_j(x,y)}{R(x,y,\frac dm)}=\frac{\frac1d\sum_{t=1}^m 1\{X_{jt}>Q_X(\frac dm x),\,Y_{jt}>Q_Y(\frac dm y)\}}{\frac md P\big(X>Q_X(\frac dm x),\,Y>Q_Y(\frac dm y)\big)}=\frac{\frac1m\sum_{t=1}^m 1\{X_{jt}>Q_X(\frac dm x),\,Y_{jt}>Q_Y(\frac dm y)\}}{P\big(X>Q_X(\frac dm x),\,Y>Q_Y(\frac dm y)\big)}=\frac{\xi_{n,j}(x,y)}{E\,\xi_{n,j}(x,y)}.$$

Applying Chebyshev's inequality, for any $\varepsilon>0$, as $n\to\infty$, we have

$$P\left(\left|\frac{\xi_{n,j}(x,y)}{E\xi_{n,j}(x,y)}-1\right|>\varepsilon\right)=P\big(\left|\xi_{n,j}(x,y)-E\xi_{n,j}(x,y)\right|>\varepsilon\,E\xi_{n,j}(x,y)\big)\le\frac{\operatorname{Var}\xi_{n,j}(x,y)}{\varepsilon^2\big(E\xi_{n,j}(x,y)\big)^2}=\frac{\frac1m P\big(X>Q_X(\frac dm x),Y>Q_Y(\frac dm y)\big)\Big(1-P\big(X>Q_X(\frac dm x),Y>Q_Y(\frac dm y)\big)\Big)}{\varepsilon^2\Big[P\big(X>Q_X(\frac dm x),Y>Q_Y(\frac dm y)\big)\Big]^2}=\frac{1-P\big(X>Q_X(\frac dm x),Y>Q_Y(\frac dm y)\big)}{d\,\varepsilon^2\,\frac md P\big(X>Q_X(\frac dm x),Y>Q_Y(\frac dm y)\big)}\sim\frac{1-P\big(X>Q_X(\frac dm x),Y>Q_Y(\frac dm y)\big)}{d\,\varepsilon^2\,\tau}\to0,$$

where the penultimate step uses the convergence of $R(x,y,p)$, cf. (1.9). Hence, as $n\to\infty$, $\frac{\tilde\tau_j(x,y)}{R(x,y,\frac dm)}\xrightarrow{P}1$.

Hence, what remains to be proved is that $\lim_{n\to\infty}\frac{R(x,y,\frac dm)}{\tau(\frac dm)}=1$ holds uniformly for all $(x,y)\in[1-d^{-1/2+\delta},1+d^{-1/2+\delta}]^2$. If $\beta=0$, then $X$ and $Y=\varepsilon$ are independent in the tail region, so that, as $n\to\infty$,

$$\lim_{n\to\infty}\frac{R(x,y,\frac dm)}{\tau(\frac dm)}=\lim_{n\to\infty}\frac{\frac md\cdot\frac dm x\cdot\frac dm y}{\frac md\cdot\frac dm\cdot\frac dm}=\lim_{n\to\infty}xy=1.$$

If $\beta>0$, applying Lemma 1 in Oordt and Zhou (2017) [13] with $p=d/m$ directly gives that

$$\lim_{n\to\infty}\frac{R(x,y,\frac dm)}{\frac md P\left(X>\max\left(\frac{Q_Y(\frac dm y)}\beta,\ Q_X(\frac dm x)\right)\right)}=1$$

holds uniformly for all $(x,y)\in[1-d^{-1/2+\delta},1+d^{-1/2+\delta}]^2$. We further simplify the denominator as follows: as $n\to\infty$,

$$\frac md P\left(X>\max\left(\frac{Q_Y(\frac dm y)}\beta,\,Q_X\!\left(\frac dm x\right)\right)\right)=\frac{P\left(X>\max\left(\frac{Q_Y(\frac dm y)}\beta,\,Q_X(\frac dm x)\right)\right)}{P\big(X>Q_X(\frac dm)\big)}\sim\min\left(\left(\frac{\beta Q_X(\frac dm)}{Q_Y(\frac dm y)}\right)^{\alpha_x},\ \left(\frac{Q_X(\frac dm)}{Q_X(\frac dm x)}\right)^{\alpha_x}\right),$$

where the last step uses the second-order condition for $X$, that is, (1.4). From (1.5), as $n\to\infty$, $\left(\frac{\beta Q_X(\frac dm)}{Q_Y(\frac dm y)}\right)^{\alpha_x}\sim\tau(\frac dm)$ holds uniformly for $|y-1|\le d^{-1/2+\delta}$. In addition, as $n\to\infty$, $\frac{Q_X(\frac dm)}{Q_X(\frac dm x)}\to1$ holds uniformly for $|x-1|\le d^{-1/2+\delta}$. Combined with $\tau(\frac dm)\le1$, we get that

$$\frac md P\left(X>\max\left(\frac{Q_Y(\frac dm y)}\beta,\,Q_X\!\left(\frac dm x\right)\right)\right)\sim\tau\!\left(\frac dm\right)$$

uniformly for $(x,y)\in[1-d^{-1/2+\delta},1+d^{-1/2+\delta}]^2$ as $n\to\infty$. Hence, we have proved $\frac{\hat\tau_j(d/m)}{\tau(d/m)}\xrightarrow{P}1$; together with the consistency of $1/\hat\alpha_j$, we have $I_{j,1}=\left(\frac{\hat\tau_j(d/m)}{\tau(d/m)}\right)^{\frac1{\hat\alpha_j}}\xrightarrow{P}1$ uniformly for $j=1,2,\ldots,k$; thus $I_{j,1}=1+o_p(1)$.

Next, we deal with $I_{j,2}=\big(\tau(\frac dm)\big)^{\frac1{\hat\alpha_j}-\frac1{\alpha_x}}$. Note that the observations on different machines are independently and identically distributed. Similar to the proof in Oordt and Zhou (2017) [13], if $\limsup_{n\to\infty}\tau(\frac dm)>0$, then the consistency of $\hat\alpha_j$ leads to $I_{j,2}\xrightarrow{P}1$ as $n\to\infty$. If $\limsup_{n\to\infty}\tau(\frac dm)=0$, proving $I_{j,2}\xrightarrow{P}1$ is equivalent to proving

$$\left(\frac1{\hat\alpha_j}-\frac1{\alpha_x}\right)\log\tau\!\left(\frac dm\right)\xrightarrow{P}0.$$

Theorem 3.2.5 in de Haan and Ferreira (2006) [21] guarantees the asymptotic normality of $\hat\alpha_j$ under conditions (1.4) and (1.13): as $n\to\infty$,

$$\sqrt d\left(\frac1{\hat\alpha_j}-\frac1{\alpha_x}\right)\xrightarrow{d}N\left(\frac\lambda{\alpha_x(1-\rho_x)},\ \frac1{\alpha_x^2}\right),$$

so that

$$\sqrt d\left(\frac1{\hat\alpha_j}-\frac1{\alpha_x}\right)=O_p(1);$$

therefore, it only remains to prove that $\log\tau(\frac dm)=o(\sqrt d)$.

If $\beta=0$, then $\tau(\frac dm)=\frac dm$, and by (1.13), $\frac{\sqrt d}{\log m}\to+\infty$; then

$$\log\tau\!\left(\frac dm\right)=\log d-\log m=\sqrt d\left(\frac{\log d}{\sqrt d}-\frac{\log m}{\sqrt d}\right)=o(\sqrt d).$$

If $\beta>0$, by (1.5), for sufficiently large $n$ we have

$$\tau\!\left(\frac dm\right)\sim\left(\frac{\beta Q_X(\frac dm)}{Q_Y(\frac dm)}\right)^{\alpha_x}=\beta^{\alpha_x}\left(\frac dm\right)^{\frac{\alpha_x}{\alpha_y}-1}\left(\frac{\tilde\ell_x(\frac dm)}{\tilde\ell_y(\frac dm)}\right)^{\alpha_x}>D\left(\frac dm\right)^{\frac{\alpha_x}{\alpha_y}-1+\delta}$$

for some $D>0$ and $\delta>0$, where the last step uses the Potter bounds. Therefore $\log\tau(\frac dm)=O\big(\log\frac md\big)=o(\sqrt d)$. Thus $I_{j,2}\xrightarrow{P}1$, i.e. $I_{j,2}=1+o_p(1)$, uniformly for $j=1,2,\ldots,k$.

For $I_{j,3}$, according to Theorem 2.2.1 in de Haan and Ferreira (2006) [21], as $n\to\infty$, for $j=1,2,\ldots,k$, we have

$$\sqrt d\left(\frac{Y_{j(d+1)}-Q_Y(\frac dm)}{\frac dm\,Q_Y'(\frac dm)}\right)\xrightarrow{d}N(0,1),$$

so that

$$\frac{Y_{j(d+1)}-Q_Y(\frac dm)}{\frac dm\,Q_Y'(\frac dm)}=\frac1{\sqrt d}\,O_p(1).$$

Since $Q_Y(p)=p^{-1/\alpha_y}\tilde\ell_y(p)$,

$$\lim_{p\to0}\frac{p\,Q_Y'(p)}{Q_Y(p)}=\lim_{p\to0}\frac{p\left(-\frac1{\alpha_y}p^{-\frac1{\alpha_y}-1}\tilde\ell_y(p)+p^{-\frac1{\alpha_y}}\tilde\ell_y'(p)\right)}{p^{-\frac1{\alpha_y}}\tilde\ell_y(p)}=\lim_{p\to0}\left(-\frac1{\alpha_y}+\frac{p\,\tilde\ell_y'(p)}{\tilde\ell_y(p)}\right)=-\frac1{\alpha_y}.$$

The last step exploits the properties of slowly varying functions; then we have

$$\frac{Y_{j(d+1)}}{Q_Y(\frac dm)}-1=-\frac1{\sqrt d}\,\frac1{\alpha_y}\,O_p(1)\xrightarrow{P}0,$$

so $I_{j,3}=1+o_p(1)$ uniformly for $j=1,2,\ldots,k$.

For $I_{j,4}$, in the same way as for $I_{j,3}$, we know $\frac{X_{j(d+1)}}{Q_X(\frac dm)}-1=-\frac1{\sqrt d}\,\frac1{\alpha_x}\,O_p(1)$; then

$$I_{j,4}=\frac{Q_X(\frac dm)}{X_{j(d+1)}}=1+o_p(1).$$

Finally, according to (1.5), $I_{j,5}=\big(\tau(\frac dm)\big)^{\frac1{\alpha_x}}\frac{Q_Y(\frac dm)}{Q_X(\frac dm)}\to\beta$; then $I_{j,5}=\beta+o(1)$.

From the above analysis,

$$\hat\beta_D\xrightarrow{P}\beta.\qquad\blacksquare$$

Proof of Theorem 2.2. From (4.5), $\hat\beta_{n,j}=I_{j,1}I_{j,2}I_{j,3}I_{j,4}I_{j,5}$; we analyze $I_{j,1},I_{j,2},I_{j,3},I_{j,4},I_{j,5}$ separately, dealing with $I_{j,1}$ first. By definition, $R(x,y)$ is a homogeneous function of degree one. According to Lemma 2 in Oordt and Zhou (2017) [13], for $x>0$ we have $R(x,1)=\min(x,\tau)$. Thus, for $x/y>\tau$, $R(x,y)=yR(x/y,1)=\tau y$; for $x/y<\tau$, $R(x,y)=yR(x/y,1)=x$. Hence the partial derivatives of $R$ in a neighborhood of $(1,1)$ exist, with $R_1(1,1)=0$ and $R_2(1,1)=\tau$, where $R_1,R_2$ denote the partial derivatives of $R$ with respect to $x$ and $y$, respectively. For the stable tail dependence function $l(x,y)=x+y-R(x,y)$, Theorem 2 in Section 2 of Huang (1992) [22] gives the asymptotic normality of $\hat l(x,y)$: as $m\to\infty$ with $d=o\big(m^{\frac{2\theta}{1+2\theta}}\big)$, we have

$$\sqrt d\,\big(\hat l(x,y)-l(x,y)\big)\xrightarrow{d}B(x,y),$$

where

$$B(x,y):=W(x,y)-l_1(x,y)W(x,0)-l_2(x,y)W(0,y),\qquad l_1(x,y)=\frac{\partial}{\partial x}l(x,y),\quad l_2(x,y)=\frac{\partial}{\partial y}l(x,y),$$

and $W(x,y)$ is a continuous zero-mean Gaussian process with covariance

$$E\,W(x_1,y_1)W(x_2,y_2)=l(x_1\wedge x_2,y_1)+l(x_1\wedge x_2,y_2)+l(x_1,y_1\wedge y_2)+l(x_2,y_1\wedge y_2)-l(x_1,y_2)-l(x_2,y_1)-l(x_1\wedge x_2,y_1\wedge y_2).$$

Moreover, $l_1(1,1)=1-R_1(1,1)=1$, $l_2(1,1)=1-R_2(1,1)=1-\tau$ and $\tau=R(1,1)=2-l(1,1)$; then we have

$$\sqrt d\left(\hat\tau_j\!\left(\frac dm\right)-\tau\!\left(\frac dm\right)\right)\xrightarrow{d}(1-\tau)W_j(0,1)+W_j(1,0)-W_j(1,1).$$

Let

$$S_{j,1}=\sqrt d\left(\left(\frac{\hat\tau_j(\frac dm)}{\tau(\frac dm)}\right)^{\frac1{\hat\alpha_j}}-1\right);$$

then

$$S_{j,1}=\tau\!\left(\frac dm\right)^{-\frac1{\hat\alpha_j}}\left[\sqrt d\left(\hat\tau_j\!\left(\frac dm\right)^{\frac1{\hat\alpha_j}}-\tau\!\left(\frac dm\right)^{\frac1{\hat\alpha_j}}\right)\right]=\tau\!\left(\frac dm\right)^{-\frac1{\hat\alpha_j}}\frac1{\hat\alpha_j}\,\tilde\tau\!\left(\frac dm\right)^{\frac1{\hat\alpha_j}-1}\left[\sqrt d\left(\hat\tau_j\!\left(\frac dm\right)-\tau\!\left(\frac dm\right)\right)\right]\xrightarrow{d}\frac1{\alpha_x}\,\frac1\tau\left[(1-\tau)W_j(0,1)+W_j(1,0)-W_j(1,1)\right],$$

where the second step uses the mean value form of the delta method, with $\tilde\tau(\frac dm)$ between $\tau(\frac dm)$ and $\hat\tau_j(\frac dm)$. Thus $I_{j,1}=\frac1{\sqrt d}S_{j,1}+1$, and

$$S_{j,1}\xrightarrow{d}\frac1{\alpha_x\tau}\left[(1-\tau)W_j(0,1)+W_j(1,0)-W_j(1,1)\right].\tag{4.6}$$

Next we deal with $I_{j,2}=\big(\tau(\frac dm)\big)^{\frac1{\hat\alpha_j}-\frac1{\alpha_x}}$. According to Theorem 3.2.5 in de Haan and Ferreira (2006) [21], the same Gaussian process also controls the convergence of the tail index estimator:

$$\sqrt d\left(\frac1{\hat\alpha_j}-\frac1{\alpha_x}\right)=\frac1{\alpha_x}\left(\int_0^1 W_j(s,0)\,\frac{ds}s-W_j(1,0)\right)+\sqrt d\,A_0\!\left(\frac md\right)\frac1{1-\rho_x}+o_p(1),$$

where $W_j(x,y)$ is the same zero-mean Gaussian process as above and $A_x(\frac md)\sim A_0(\frac md)$ as $n\to\infty$. Using the delta method with $g(x)=\big(\tau(\frac dm)\big)^{x-\frac1{\alpha_x}}$, so that $g'(x)=\log\tau\cdot\big(\tau(\frac dm)\big)^{x-\frac1{\alpha_x}}$ and $g'\!\left(\frac1{\alpha_x}\right)=\log\tau$, we have

$$\sqrt d\left(\big(\tau(\tfrac dm)\big)^{\frac1{\hat\alpha_j}-\frac1{\alpha_x}}-1\right)\xrightarrow{d}\frac{\log\tau}{\alpha_x}\left(\int_0^1 W_j(s,0)\,\frac{ds}s-W_j(1,0)\right)+\log\tau\cdot\sqrt d\,A_0\!\left(\frac md\right)\frac1{1-\rho_x}.$$

From Lemma 4.1, $\sqrt d\,A_0(\frac md)\to0$ as $n\to\infty$; then

$$\sqrt d\left(\big(\tau(\tfrac dm)\big)^{\frac1{\hat\alpha_j}-\frac1{\alpha_x}}-1\right)\xrightarrow{d}\frac{\log\tau}{\alpha_x}\left(\int_0^1 W_j(s,0)\,\frac{ds}s-W_j(1,0)\right).$$

Let

$$S_{j,2}=\sqrt d\left(\big(\tau(\tfrac dm)\big)^{\frac1{\hat\alpha_j}-\frac1{\alpha_x}}-1\right);$$

thus $I_{j,2}=\frac1{\sqrt d}S_{j,2}+1$ and

$$S_{j,2}\xrightarrow{d}\frac{\log\tau}{\alpha_x}\left(\int_0^1 W_j(s,0)\,\frac{ds}s-W_j(1,0)\right).\tag{4.7}$$

For $I_{j,3}=\frac{Y_{j(d+1)}}{Q_Y(\frac dm)}$, according to Theorem 4 in Chen et al. (2021) [23], we know that

$$\sqrt d\left(\frac{Y_{j(d+1)}}{Q_Y(\frac dm)}-1\right)=\frac1{\alpha_y}W_j(0,1)+o_p(1).$$

Let $S_{j,3}=\sqrt d\left(\frac{Y_{j(d+1)}}{Q_Y(\frac dm)}-1\right)$; then $I_{j,3}=\frac1{\sqrt d}S_{j,3}+1$, and

$$S_{j,3}\xrightarrow{d}\frac1{\alpha_y}W_j(0,1).\tag{4.8}$$

Finally, we deal with $I_{j,4}=\frac{Q_X(\frac dm)}{X_{j(d+1)}}$. Similar to $I_{j,3}$, we know that

$$\sqrt d\left(\frac{X_{j(d+1)}}{Q_X(\frac dm)}-1\right)=\frac1{\alpha_x}W_j(1,0)+o_p(1).$$

Using the delta method with $g(x)=\frac1x$, so that $g'(x)=-\frac1{x^2}$ and $g'(1)=-1$, we get

$$\sqrt d\left(\frac{Q_X(\frac dm)}{X_{j(d+1)}}-1\right)\xrightarrow{d}-\frac1{\alpha_x}W_j(1,0).$$

Let $S_{j,4}=\sqrt d\left(\frac{Q_X(\frac dm)}{X_{j(d+1)}}-1\right)$; then $I_{j,4}=\frac1{\sqrt d}S_{j,4}+1$ and

$$S_{j,4}\xrightarrow{d}-\frac1{\alpha_x}W_j(1,0).\tag{4.9}$$

Thus,

$$\sqrt{kd}\left(\frac{\hat\beta_D}{\tau^{\frac1{\alpha_x}}\,\frac{Q_Y(\frac dm)}{Q_X(\frac dm)}}-1\right)=\frac1{\sqrt k}\sum_{j=1}^k\big[(S_{j,1}+S_{j,2}+S_{j,3}+S_{j,4})+o_p(1)\big]=\frac1{\sqrt k}\sum_{j=1}^k S_j,$$

where $S_j=S_{j,1}+S_{j,2}+S_{j,3}+S_{j,4}+o_p(1)$. Combining (4.6), (4.7), (4.8) and (4.9), we have

$$S_j\xrightarrow{d}\frac1{\alpha_x}\,\frac1\tau\left\{(1-\tau)W_j(0,1)+W_j(1,0)-W_j(1,1)\right\}+\frac{\log\tau}{\alpha_x}\left(\int_0^1 W_j(s,0)\,\frac{ds}s-W_j(1,0)\right)+\frac1{\alpha_y}W_j(0,1)-\frac1{\alpha_x}W_j(1,0).$$

Based on the expression for $l(x,y)$, we get $E\,S_j\to0$ and $\operatorname{Var}S_j\to\Sigma$, where

$$\Sigma=\frac1{\alpha_x^2}\left(\frac1\tau+2\log\tau-\frac2\tau\log\tau-\frac13(\log\tau)^2\right).$$

Since the $S_j$ are independent and identically distributed across machines, the central limit theorem gives

$$\sqrt{kd}\left(\frac{\hat\beta_D}{\tau^{\frac1{\alpha_x}}\,\frac{Q_Y(\frac dm)}{Q_X(\frac dm)}}-1\right)\xrightarrow{d}N(0,\Sigma).$$

Therefore, what remains to be proved is the following deterministic relation:

$$\lim_{n\to\infty}\sqrt{kd}\left(\frac{\tau^{\frac1{\alpha_x}}\,\frac{Q_Y(\frac dm)}{Q_X(\frac dm)}}\beta-1\right)=0.$$

According to (1.5), we know that $\lim_{n\to\infty}\frac{Q_Y(\frac dm)}{Q_X(\frac dm)}=\beta\,\tau^{-\frac1{\alpha_x}}>0$; by the delta method, the above relation is equivalent to

$$\lim_{n\to\infty}\sqrt{kd}\left(\tau-\left(\frac{\beta Q_X(\frac dm)}{Q_Y(\frac dm)}\right)^{\alpha_x}\right)=0.$$

Next, from (1.8) and (1.9), we have $\tau(p)-\tau=O(p^\theta)$; combined with $kd=O(n^\zeta)$ and $\zeta<\frac{2\theta}{1+2\theta}$, we get that

$$\lim_{n\to\infty}\sqrt{kd}\,\big|\tau-\tau(\tfrac dm)\big|=\lim_{n\to\infty}(kd)^{\frac12}\,O\!\left(\left(\frac dm\right)^\theta\right)=\lim_{n\to\infty}O\!\left(n^{\zeta(\frac12+\theta)-\theta}\right)=0,$$

so what remains to be proved is

$$\lim_{n\to\infty}\sqrt{kd}\left(\tau\!\left(\frac dm\right)-\left(\frac{\beta Q_X(\frac dm)}{Q_Y(\frac dm)}\right)^{\alpha_x}\right)=0.\tag{4.10}$$

Notice that Lemma 1 in Oordt and Zhou (2017) [13], applied with $p=\frac dm$ and $x=y=1$, can be written as

$$\lim_{n\to\infty}\frac{P\big(X>Q_X(\frac dm),\,Y>Q_Y(\frac dm)\big)}{P\left(X>\max\left(\frac{Q_Y(\frac dm)}\beta,\,Q_X(\frac dm)\right)\right)}=1.\tag{4.11}$$

By (1.5), we have $\lim_{p\to0}\frac{Q_Y(p)}{Q_X(\tau p)}=\lim_{p\to0}\tau^{\frac1{\alpha_x}}\frac{Q_Y(p)}{Q_X(p)}=\beta$; then, for any $\tau<z<1$ and sufficiently small $p$, we have $Q_Y(p)\ge\beta Q_X(zp)$, so that, for sufficiently large $n$, $Q_X(\frac dm)<Q_X(\frac dm z)\le Q_Y(\frac dm)/\beta$. Thus the maximum in (4.11) equals $Q_Y(\frac dm)/\beta$, and (4.11) becomes

$$\lim_{n\to\infty}\frac{P\big(X>Q_X(\frac dm),\,Y>Q_Y(\frac dm)\big)}{P\big(X>\frac{Q_Y(\frac dm)}\beta\big)}=\lim_{n\to\infty}\frac{\frac md P\big(X>Q_X(\frac dm),\,Y>Q_Y(\frac dm)\big)}{P\big(X>\frac{Q_Y(\frac dm)}\beta\big)\big/P\big(X>Q_X(\frac dm)\big)}=\lim_{n\to\infty}\frac{\tau(\frac dm)}{\big(\frac{\beta Q_X(\frac dm)}{Q_Y(\frac dm)}\big)^{\alpha_x}}=1,$$

i.e., as $n\to\infty$, we have

$$\tau\!\left(\frac dm\right)-\left(\frac{\beta Q_X(\frac dm)}{Q_Y(\frac dm)}\right)^{\alpha_x}\to0.$$

Let’s just prove that the convergence rate is k d . By referring to the set C , C 0 , C 1 , C 21 , C 22 , without loss of generality, let p = k d n = d m , x = y = 1 , by Lemma 4.2,

We prove (4.10) by dealing with the three sets C 1 , C 21 , C 22 . The limit relation in (4.3) implies that

lim n k d ( m d P ( C 21 ) ( β Q X ( d m ) Q Y ( d m ) ( 1 ± δ n ) ) α x ) = 0 ,

and the limit relation in (4.1) implies that

lim n k d m d P ( C 22 ) = 0.

For $C_1$, since $X$ and $\varepsilon$ are independent (conditionally on $X>Q_X(\bar p)$), we have

$$P(C_1)=P\left(\varepsilon>-\delta_n Q_Y\!\left(\tfrac dm\right)\right)P\left(X>Q_X\!\left(\tfrac dm\right),\ \beta X>(1+\delta_n)Q_Y\!\left(\tfrac dm\right)\right)=P\left(\varepsilon>-\delta_n Q_Y\!\left(\tfrac dm\right)\right)P\left(\beta X>(1+\delta_n)Q_Y\!\left(\tfrac dm\right)\right),$$

and the limit relation (4.2) implies that $\lim_{n\to\infty}\sqrt{kd}\left(P\big(\varepsilon>-\delta_n Q_Y(\frac dm)\big)-1\right)=0$. Together with (4.3), we have

$$\lim_{n\to\infty}\sqrt{kd}\left(\frac md P(C_1)-\left(\frac{\beta Q_X(\frac dm)}{Q_Y(\frac dm)}\right)^{\alpha_x}\right)=0.$$

Since $C_1\subseteq C\subseteq C_{21}\cup C_{22}$, combining the relations for $P(C_1)$, $P(C_{21})$ and $P(C_{22})$, we have

$$\lim_{n\to\infty}\sqrt{kd}\left(\frac md P(C)-\left(\frac{\beta Q_X(\frac dm)}{Q_Y(\frac dm)}\right)^{\alpha_x}\right)=0,$$

and then

$$\lim_{n\to\infty}\sqrt{kd}\left(\tau\!\left(\frac dm\right)-\left(\frac{\beta Q_X(\frac dm)}{Q_Y(\frac dm)}\right)^{\alpha_x}\right)=0.$$

Therefore,

$$\sqrt{kd}\,\big(\hat\beta_D-\beta\big)\xrightarrow{d}N\left(0,\ \frac{\beta^2}{\alpha_x^2}\left(\frac1\tau+2\log\tau-\frac2\tau\log\tau-\frac13(\log\tau)^2\right)\right).\qquad\blacksquare$$

Conflicts of Interest

The author declares no conflicts of interest regarding the publication of this paper.

References

[1] Chen, L., Li, D. and Zhou, C. (2021) Distributed Inference for the Extreme Value Index. Biometrika, 109, 257-264.
https://doi.org/10.1093/biomet/asab001
[2] Volgushev, S., Chao, S. and Cheng, G. (2017) Distributed Inference for Quantile Regression Processes. The Annals of Statistics, 47, 1634-1662.
https://doi.org/10.1214/18-AOS1730
[3] Mohammed, T., Joe-Wong, C., Babbar, R. and Di Francesco, M. (2020) Distributed Inference Acceleration with Adaptive DNN Partitioning and Offloading. IEEE INFOCOM 2020-IEEE Conference on Computer Communications, Toronto, 6-9 July 2020, 854-863.
https://doi.org/10.1109/INFOCOM41043.2020.9155237
[4] Smith, A.J. and Hollinger, G.A. (2018) Distributed Inference-Based Multi-Robot Exploration. Autonomous Robots, 42, 1651-1668.
https://doi.org/10.1007/s10514-018-9708-7
[5] Ye, Y. (2017) The Stability of Beta Coefficient in China’s Stock Market. Journal of Service Science and Management, 10, 177-187.
https://doi.org/10.4236/jssm.2017.102016
[6] Mitra, S. (2019) Measuring Asymmetric Nature of Beta Using a Smooth Linear Transformation. Theoretical Economics Letters, 9, 2019-2032.
https://doi.org/10.4236/tel.2019.96128
[7] Markowitz, H. (1952) Portfolio Selection. The Journal of Finance, 7, 77-91.
https://doi.org/10.1111/j.1540-6261.1952.tb01525.x
[8] Black, F. and Scholes, M. (1973) The Pricing of Options and Corporate Liabilities. Journal of Political Economy, 81, 637-654.
https://doi.org/10.1086/260062
[9] Atanasov, V. and Nitschka, T. (2014) Currency Excess Returns and Global Downside Market Risk. Journal of International Money and Finance, 47, 268-285.
https://doi.org/10.1016/j.jimonfin.2014.06.006
[10] Lettau, M., Maggiori, M. and Weber, M. (2013) Conditional Risk Premia in Currency Markets and Other Asset Classes. Journal of Financial Economics, 114, 197-225.
https://doi.org/10.1016/j.jfineco.2014.07.001
[11] Post, T. and Versijp, P. (2004) Multivariate Tests for Stochastic Dominance Efficiency of a Given Portfolio. Journal of Financial and Quantitative Analysis, 42, 489-515.
https://doi.org/10.1017/S0022109000003367
[12] Mitchell, M. and Pulvino, T. (2001) Characteristics of Risk and Return in Risk Arbitrage. Journal of Finance, 56, 2135-2175.
https://doi.org/10.1111/0022-1082.00401
[13] Oordt, M. and Zhou, C. (2017) Estimating Systematic Risk under Extremely Adverse Market Conditions. Journal of Financial Econometrics, 17, 432-461.
https://doi.org/10.1093/jjfinec/nbx033
[14] Hult, H., Lindskog, F. (2002) Multivariate Extremes, Aggregation and Dependence in Elliptical Distributions. Advances in Applied Probability, 34, 587-608.
https://doi.org/10.1239/aap/1033662167
[15] de Haan, L. and Stadtmüller, U. (1996) Generalized Regular Variation of Second Order. Journal of the Australian Mathematical Society (Series A), 61, 381-395.
https://doi.org/10.1017/S144678870000046X
[16] Drees, H. and Huang, X. (1998) Best Attainable Rates of Convergence for Estimators of the Stable Tail Dependence Function. Journal of Multivariate Analysis, 64, 25-46.
https://doi.org/10.1006/jmva.1997.1708
[17] Oordt, M. and Zhou, C. (2018) Systemic Risk and Bank Business Models. Journal of Applied Econometrics, 34, 365-384.
https://doi.org/10.1002/jae.2666
[18] Hill, B.M. (1975) A Simple General Approach to Inference about the Tail of a Distribution. Annals of Statistics, 3, 1163-1174.
https://doi.org/10.1214/aos/1176343247
[19] Embrechts, P., de Haan, L. and Huang, X. (2000) Modelling Multivariate Extremes. In: Embrechts, P., Ed., Extremes and Integrated Risk Management, Risk Book, London, 59-67.
[20] Embrechts, P., Klüppelberg, C. and Mikosch, T. (1997) Modelling Extremal Events. Springer, New York.
https://doi.org/10.1007/978-3-642-33483-2
[21] de Haan, L. and Ferreira, A. (2006) Extreme Value Theory. Springer, New York.
https://doi.org/10.1007/0-387-34471-3
[22] Huang, X. (1992) Statistics of Bivariate Extreme Values. Doctoral Thesis, Tinbergen Institute Research Series, Amsterdam.
[23] Chen, L., Li, D. and Zhou, C. (2021) Distributed Inference for Tail Empirical and Quantile Processes.
