
In this paper, random matrix theory is employed for information selection and denoising, and mean-realized variance-CVaR multi-objective portfolio models are constructed for high-frequency data both before and after denoising. The empirical study is based on high-frequency data for stocks in the SSE 180 Index. Compared with the existing literature, the main contribution of this paper is the introduction of both the realized covariance matrix and random matrix theory into the multi-objective portfolio problem. The results show that using the realized covariance matrix reduces the loss of market information, and that random matrix theory helps improve the quality of the information contained in the correlation matrix among assets. Under the denoised mean-realized variance-CVaR criterion, the selected portfolio has better out-of-sample performance.

The mean-variance model (MV model) proposed by Markowitz (1952) opened a new chapter in modern portfolio theory, and many scholars have since devoted themselves to extending and deepening it. Kolm et al. (2014) summarized its development, challenges and future directions. In Markowitz’s MV model, the mean and variance measure the average return and the risk of an asset portfolio, respectively. How the variance is calculated depends on the characteristics of the data. According to the sampling frequency, data can be divided into low-frequency and high-frequency data. High-frequency trading data, sampled over short time spans, reduce the loss of information about the financial market, and they have become easier to obtain with the rapid development of technology. Research on investment strategies based on high-frequency data is therefore both necessary and significant. The realized variance can be calculated from the realized volatility proposed by Andersen & Bollerslev (1998). The literature on realized volatility (variance) is rich, and most of it is devoted to its modification, extension and application to high-frequency financial data. Scholars have also introduced the realized variance (covariance) into asset allocation research, e.g., Pooter et al. (2008), Yao (2010), Song & Hu (2017) and Yin (2016).

In the above portfolio studies with realized variance, only one risk factor (variance) is considered. Since different risk measures describe different risk characteristics of assets, scholars have combined multiple measures to construct multi-objective portfolio optimization models. Early examples are the mean-absolute deviation-skewness model and the mean-variance-skewness model in Konno et al. (1993, 1995). Motivated by the excellent properties of CVaR, Roman et al. (2007) constructed a mean-variance-CVaR model that can produce a more balanced portfolio. Further, Li et al. (2012) and Yu & Ma (2014) used this model to study China’s foreign exchange reserves and sovereign fund investment, respectively. Gao et al. (2016) extended it to dynamic settings in financial markets, and Shi et al. (2019) considered the optimal investment and reinsurance problem in continuous time. However, the data in the above literature are low-frequency, and the high-frequency case remains to be explored.

In addition, with the increasing complexity and diversity of financial markets, Laloux et al. (1999) and Plerou et al. (1999) first applied random matrix theory (RMT) to the stock market, demonstrating the existence of “noise” in the asset correlation matrix and its effect on portfolio strategy. Later, RMT was used in financial risk management to improve the information quality of financial markets, for example by Han et al. (2014), Xie et al. (2018), Bun et al. (2017) and Shen et al. (2019). Li & Hong (2019) studied the stability of the network before and after “denoising” based on random matrix theory, as well as the efficient frontier of the portfolio under the mean-variance model.

In summary, this paper constructs a mean-realized variance-CVaR portfolio model and discusses the influence of denoising and of the realized covariance matrix on the optimal multi-objective strategy. The paper is organized as follows. Section 2 describes the related methods. Section 3 presents the datasets, the empirical procedure and the out-of-sample performance of the different portfolio strategies. Finally, Section 4 concludes the paper.

For convenience, we first give some notations.

$\Sigma_1$: realized covariance matrix; $\Sigma_2$: covariance matrix

$\tilde{\Sigma}_1$: realized covariance matrix after denoising of $\Sigma_1$; $\tilde{\Sigma}_2$: covariance matrix after denoising of $\Sigma_2$

$\Delta_1$: diagonal matrix of realized standard deviations; $\Delta_2$: diagonal matrix of standard deviations

$C_1$: asset correlation matrix based on $\Sigma_1$; $C_2$: Pearson correlation coefficient matrix

$\tilde{C}_1$: asset correlation matrix after denoising of $C_1$; $\tilde{C}_2$: asset correlation matrix after denoising of $C_2$

We consider the price process of $N$ financial assets, the $N \times 1$ vector $P(t) = (P_1(t), \cdots, P_N(t))^T$, where $P_j(t)$ represents the price of the $j$-th asset at time $t$. The logarithmic price vector is

$$q(t) = (\log[P_1(t)], \cdots, \log[P_N(t)])^T. \tag{1}$$

The return vector over a horizon of length $a$ ending at time $t + a$ is:

$$R(t + a, a) = q(t + a) - q(t). \tag{2}$$

Assume that the time period from $t$ to $t + 1$ is divided into $m$ segments, so that the asset returns on the $k$-th segment are $R(t + k/m, 1/m) = q(t + k/m) - q(t + (k - 1)/m)$, $k = 1, 2, \cdots, m$. The return matrix from time $t$ to $t + 1$ is then:

$$H_{t,t+1} = (R(t + 1/m, 1/m), \cdots, R(t + m/m, 1/m)). \tag{3}$$

Therefore, the realized covariance matrix ($\Sigma_1$) can be defined as:

$$\Sigma_1 = H_{t,t+1} H_{t,t+1}^T, \tag{4}$$

where the main diagonal elements of the realized covariance matrix are the realized variances of the individual assets.
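Formulas (1) to (4) can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `realized_covariance` and the toy prices are hypothetical, and the input is assumed to hold the prices of all assets on a common intraday grid of $m + 1$ points.

```python
import numpy as np

def realized_covariance(log_prices):
    """Realized covariance of one period, following formulas (2)-(4).

    log_prices: array of shape (m + 1, N) with the log prices of N assets
    at the grid points t, t + 1/m, ..., t + 1 (assumed layout).
    """
    q = np.asarray(log_prices, dtype=float)
    # Row k holds the segment return q(t + k/m) - q(t + (k-1)/m).
    segment_returns = np.diff(q, axis=0)      # shape (m, N)
    H = segment_returns.T                     # shape (N, m), as in formula (3)
    return H @ H.T                            # shape (N, N), formula (4)

# Hypothetical toy data: 2 assets observed at 4 grid points (m = 3).
prices = np.array([[10.0, 20.0],
                   [10.1, 19.9],
                   [10.0, 20.1],
                   [10.2, 20.0]])
q = np.log(prices)
sigma1 = realized_covariance(q)
realized_var = np.diag(sigma1)                # realized variance of each asset
```

The diagonal of `sigma1` equals the sum of squared segment returns of each asset, which is the realized variance mentioned above.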

A random matrix is expressed as:

$$Z = \frac{1}{L} A A^T, \tag{5}$$

where $A$ is an $N \times L$ matrix composed of $N$ uncorrelated random sequences of length $L$, each drawn from the $N(0, 1)$ distribution. Based on Wigner (1951), for the window width $Q = L/N$ $(> 1)$, the predicted maximum and minimum eigenvalues of the random matrix can be expressed as:

$$\lambda_{\max/\min}^{Z} = \sigma_Z^2 \left(1 + \frac{1}{Q} \pm 2\sqrt{\frac{1}{Q}}\right), \tag{6}$$

where $\sigma_Z^2$ is the variance of $Z$, and $\sigma_Z^2 = 1$ for a standardized matrix.
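As a quick numerical check of formula (6), the bounds can be evaluated directly. The sketch below uses a hypothetical helper name, `rmt_eigenvalue_bounds`, and takes $Q = 9.2$, the window width reported for the empirical study, with $\sigma_Z^2 = 1$.

```python
import math

def rmt_eigenvalue_bounds(Q, sigma2=1.0):
    """Predicted (lambda_min, lambda_max) of a random matrix, formula (6)."""
    if Q <= 1:
        raise ValueError("formula (6) is stated for Q = L / N > 1")
    spread = 2.0 * math.sqrt(1.0 / Q)
    lam_min = sigma2 * (1.0 + 1.0 / Q - spread)
    lam_max = sigma2 * (1.0 + 1.0 / Q + spread)
    return lam_min, lam_max

lam_min, lam_max = rmt_eigenvalue_bounds(9.2)
print(round(lam_min, 2), round(lam_max, 2))   # prints 0.45 1.77
```

Eigenvalues of an empirical correlation matrix that fall inside this band cannot be distinguished from those of a purely random matrix.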

Based on Kenett et al. (2009), the eigenvalue entropy (SE) of a random matrix is an effective tool for evaluating the information contained in the eigenvalues:

$$SE = -\frac{1}{\log(N)} \sum_{j=1}^{N} \frac{[\lambda_j - \lambda_{j-1}]^2}{\sum_{j}^{N} [\lambda_j - \lambda_{j-1}]^2} \log\left\{\frac{[\lambda_j - \lambda_{j-1}]^2}{\sum_{j}^{N} [\lambda_j - \lambda_{j-1}]^2}\right\}, \tag{7}$$

where $\lambda_j$ ($\lambda_j > \lambda_{j-1}$) denotes the eigenvalues of the matrix in increasing order. A smaller SE means less noise, that is, more economic information is contained in the eigenvalues, and vice versa.
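The following is a minimal sketch of formula (7). Two choices here are assumptions rather than part of the text: the sum runs over the $N - 1$ gaps between consecutive sorted eigenvalues (formula (7) does not define $\lambda_0$), and terms with zero weight contribute nothing, following the convention $0 \log 0 = 0$. The function name is hypothetical.

```python
import math

def eigenvalue_entropy(eigenvalues):
    """Eigenvalue entropy SE in the spirit of formula (7)."""
    lam = sorted(eigenvalues)
    n = len(lam)
    if n < 2:
        raise ValueError("at least two eigenvalues are needed")
    # Squared gaps between consecutive eigenvalues (assumption: N - 1 gaps).
    gaps_sq = [(lam[j] - lam[j - 1]) ** 2 for j in range(1, n)]
    total = sum(gaps_sq)
    if total == 0.0:
        raise ValueError("all eigenvalues are equal")
    entropy = 0.0
    for g in gaps_sq:
        p = g / total
        if p > 0.0:                      # convention: 0 * log(0) = 0
            entropy -= p * math.log(p)
    return entropy / math.log(n)         # normalization by log(N)

se_even = eigenvalue_entropy([1.0, 2.0, 3.0, 4.0, 5.0])   # evenly spread gaps
se_peak = eigenvalue_entropy([1.0, 1.0, 1.0, 10.0])       # one dominant gap
```

Evenly spaced eigenvalues give a high SE, while a spectrum dominated by a single large gap gives an SE of zero.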

For an $N \times N$ asset correlation matrix $C$ ($C_1$ or $C_2$),

$$C = E D E^T, \tag{8}$$

where $D$ is the diagonal matrix formed by the eigenvalues $\{\gamma_i\}_{i=1}^{N}$, and $E$ is the eigenvector matrix of $C$. Let $\mathcal{A} = \{\gamma_i \mid \gamma_i \in [\lambda_{\min}^{Z}, \lambda_{\max}^{Z}],\ i = 1, \cdots, N\}$; then $\mathcal{A}$ is the noise set. Here the PG+ denoising method is employed: all the eigenvalues in $\mathcal{A}$ are replaced by 0 to construct the new diagonal matrix $\tilde{D}$. The denoised asset correlation matrix $\tilde{C}$ ($\tilde{C}_1$ or $\tilde{C}_2$) can then be expressed as:

$$\tilde{C} = E \tilde{D} E^T. \tag{9}$$

We set the diagonal elements of $\tilde{C}$ to 1 to ensure that $\mathrm{Tr}(C) = \mathrm{Tr}(\tilde{C}) = N$.

The covariance matrix $\Sigma$ ($\Sigma_1$ or $\Sigma_2$) and the asset correlation matrix $C$ ($C_1$ or $C_2$) satisfy the following relationship:

$$\Sigma = \Delta C \Delta^T, \tag{10}$$

where $\Delta = \Delta_1$ (or $\Delta_2$) represents the diagonal matrix formed by the standard deviation of each asset. The denoised covariance matrix $\tilde{\Sigma}$ can therefore be obtained from $\tilde{C}$.
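The denoising chain in formulas (8) to (10) can be sketched with NumPy as follows. The function name, the toy covariance matrix and the noise band used in the example are hypothetical; in practice the band would come from formula (6).

```python
import numpy as np

def denoise_covariance(sigma, lam_min, lam_max):
    """Zero out eigenvalues inside [lam_min, lam_max] and rebuild Sigma."""
    sigma = np.asarray(sigma, dtype=float)
    std = np.sqrt(np.diag(sigma))                  # diagonal of Delta
    corr = sigma / np.outer(std, std)              # C from formula (10)
    gamma, E = np.linalg.eigh(corr)                # C = E D E^T, formula (8)
    noise = (gamma >= lam_min) & (gamma <= lam_max)
    gamma_tilde = np.where(noise, 0.0, gamma)      # noise eigenvalues -> 0
    corr_tilde = E @ np.diag(gamma_tilde) @ E.T    # formula (9)
    np.fill_diagonal(corr_tilde, 1.0)              # restore unit diagonal
    sigma_tilde = np.outer(std, std) * corr_tilde  # Delta C~ Delta^T
    return sigma_tilde, corr_tilde

# Hypothetical example: three assets driven by one strong common factor.
std = np.array([0.1, 0.2, 0.3])
corr = np.array([[1.0, 0.8, 0.8],
                 [0.8, 1.0, 0.8],
                 [0.8, 0.8, 1.0]])
sigma = np.outer(std, std) * corr
sigma_tilde, corr_tilde = denoise_covariance(sigma, lam_min=0.1, lam_max=1.5)
```

In this toy case the two small eigenvalues (0.2 each) fall inside the band and are removed, while the dominant eigenvalue (2.6) is kept, so the asset variances on the diagonal of `sigma_tilde` are unchanged.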

Suppose that $R = (R_1, R_2, \cdots, R_N)^T$ is the return vector of $N$ assets, and $x = (x_1, x_2, \cdots, x_N)^T$ is the weight vector. The variance of the cumulative return of the portfolio is $\mathrm{Var}(x^T R) = x^T M x$, where $M$ may be the realized covariance matrix $\Sigma_1$, the denoised realized covariance matrix $\tilde{\Sigma}_1$ or the denoised covariance matrix $\tilde{\Sigma}_2$.

Based on Roman et al. (2007), we study the following problem:

$$
(P)\quad
\begin{aligned}
\min\ & \mathrm{Var}(x^T R) = x^T M x \\
\text{s.t.}\ & x^T \mu = d \\
& \mathrm{CVaR}(x^T R) = z \\
& x^T \mathbf{1} = 1 \\
& x_j \ge 0, \quad \forall j \in \{1, \cdots, N\}
\end{aligned}
\tag{11}
$$

where $\mu$ is the vector of expected asset returns, $d$ represents the investor’s target return, $z$ represents the prescribed level of CVaR, $x^T \mathbf{1} = 1$ is the full-investment constraint on the weights, and $x_j \ge 0$ means that short-selling is not permitted. The determination of the parameters $d$ and $z$ is described in Appendix A.

To show the impact of random matrix theory and realized variance on investment strategies, the following three optimization models are considered in this paper:

Problem | Model | Abbreviation | Denoise | M |
---|---|---|---|---|
$P_{mrvc}^{rmt}$ | Mean-Realized Variance-CVaR Model (denoise) | MRVC (denoise) | yes | $\tilde{\Sigma}_1$ |
$P_{mrvc}$ | Mean-Realized Variance-CVaR Model | MRVC | no | $\Sigma_1$ |
$P_{mvc}^{rmt}$ | Mean-Variance-CVaR Model (denoise) | MVC (denoise) | yes | $\tilde{\Sigma}_2$ |

The average return is

$$\text{Average return} = E[R_{out} \times x^*], \tag{12}$$

where $R_{out}$ represents the out-of-sample return data, and $x^*$ represents the optimal investment weights.

The Omega Ratio (OR), proposed by Keating & Shadwick (2002), is defined as:

$$OR = \frac{\int_{\varepsilon}^{\infty} (1 - F(x))\,dx}{\int_{-\infty}^{\varepsilon} F(x)\,dx} = \frac{E[R_{out} \times x^* - \varepsilon]^+}{E[\varepsilon - R_{out} \times x^*]^+}, \tag{13}$$

where $F(x)$ represents the cumulative distribution function of portfolio returns and $\varepsilon$ is a specified threshold. Returns below the threshold are considered losses and returns above it gains. For convenience of calculation, $\varepsilon = 0$ is assumed (Clemente et al., 2019). An investor will prefer the portfolio with the highest ratio.
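The two performance measures in formulas (12) and (13) can be computed from out-of-sample returns as sketched below. The function names and the toy data are hypothetical, and the expectations are replaced by sample averages over the out-of-sample periods, which is an assumption about how they are estimated.

```python
def portfolio_returns(r_out, weights):
    """Per-period portfolio returns R_out x x* for a list of return rows."""
    return [sum(w * r for w, r in zip(weights, row)) for row in r_out]

def average_return(returns):
    """Sample version of formula (12)."""
    return sum(returns) / len(returns)

def omega_ratio(returns, eps=0.0):
    """Sample version of formula (13); the 1/T factors cancel in the ratio."""
    gains = sum(max(r - eps, 0.0) for r in returns)
    losses = sum(max(eps - r, 0.0) for r in returns)
    if losses == 0.0:
        return float("inf")              # no return falls below the threshold
    return gains / losses

# Hypothetical out-of-sample returns of 2 assets over 3 periods.
r_out = [[0.02, -0.01],
         [-0.03, 0.01],
         [0.01, 0.01]]
x_star = [0.5, 0.5]
rets = portfolio_returns(r_out, x_star)   # approximately [0.005, -0.01, 0.01]
avg = average_return(rets)
omega = omega_ratio(rets)                 # gains 0.015 over losses 0.01
```

With $\varepsilon = 0$, an OR above 1 means that accumulated gains exceed accumulated losses, and an OR below 1 means the opposite.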

The data come from the Shanghai Stock Exchange 180 (SSE 180) Index, which consists of 180 stocks that best represent China’s A-share market. Five-minute data for 120 of these stocks are collected from July 1, 2019 to August 10, 2019, and their five-minute logarithmic returns are calculated. The data from July 1, 2019 to July 31, 2019 are used as in-sample data and the rest as out-of-sample data.

The empirical study will be processed according to the following procedure.

Step 1: calculate the realized covariance matrix $\Sigma_1$ based on formulas (1) to (4).

Step 2: “noise” detection. The noise in the asset correlation matrix $C$ ($C_1$ or $C_2$) and in the random matrix $Z$ is analyzed through the eigenvalue entropy (SE), based on formulas (5), (6) and (7).

Step 3: construct the denoised covariance matrix $\tilde{\Sigma}$ ($\tilde{\Sigma}_1$ or $\tilde{\Sigma}_2$) according to formulas (8), (9) and (10).

Step 4: calculate the optimal asset weights under MRVC (denoise), MRVC and MVC (denoise) based on model (11), as follows.

(1) Let $d_p$ be the median of the average returns of all assets; the interval $[d_{\min}, d_{\max}]$ is determined by formulas (14) to (18). $d^*$ takes the $\frac{1}{6}$, $\frac{2}{6}$, $\frac{3}{6}$ and $\frac{4}{6}$ quantile values of this interval, denoted $d_1^*$, $d_2^*$, $d_3^*$ and $d_4^*$, respectively.

(2) Find the optimal asset weights $x_j^*$ under the mean-variance model for the $d^*$ chosen in the previous step.

(3) Given $d^*$ and $x_j^*$, the interval $[z_{d^*,\min}, z_{d^*,\max}]$ of $z$ is determined by formulas (19) and (20) for $\alpha = 0.01$. The values of $z$ are taken to be the $\frac{1}{4}$, $\frac{2}{4}$ and $\frac{3}{4}$ quantile values of the interval $[z_{d^*,\min}, z_{d^*,\max}]$ and $z_{d^*,\max}$ itself, denoted $z_1^*$, $z_2^*$, $z_3^*$ and $z_4^*$, respectively.

(4) The problems $P_{mrvc}$, $P_{mrvc}^{rmt}$ and $P_{mvc}^{rmt}$ with the given $d^*$ and $z$ are solved with the CVX toolbox in Matlab, which yields the optimal solution $x^*$.

Step 5: the in-sample optimal asset weights of the three models are obtained from Step 1 to Step 4. The average returns and OR values on the out-of-sample dataset are then calculated by formulas (12) and (13).
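The parameter grid of Step 4 can be illustrated with a short sketch. It rests on two assumptions that the text does not spell out: a “quantile value of an interval” is read as the point lying the given fraction of the way from the lower to the upper endpoint, and the $z$ grid is taken on $[z_{d^*,\min}, z_{d^*,\max}]$. The endpoint values below are hypothetical placeholders, not results from the study.

```python
def interval_points(lower, upper, fractions):
    """Points located at the given fractions of the interval [lower, upper]."""
    return [lower + f * (upper - lower) for f in fractions]

# Hypothetical endpoints; in the procedure they come from formulas (16)-(20).
d_min, d_max = 0.0, 0.006
z_min, z_max = 0.01, 0.05

# d_1*, ..., d_4*: the 1/6, 2/6, 3/6 and 4/6 points of [d_min, d_max].
d_grid = interval_points(d_min, d_max, [1 / 6, 2 / 6, 3 / 6, 4 / 6])

# z_1*, ..., z_3*: the 1/4, 2/4 and 3/4 points; z_4* is the upper endpoint.
z_grid = interval_points(z_min, z_max, [1 / 4, 2 / 4, 3 / 4]) + [z_max]

# Every (d*, z) pair defines one instance of problem (11).
parameter_pairs = [(d, z) for d in d_grid for z in z_grid]
```

Note that in the procedure the $z$ interval depends on the chosen $d^*$, so a full implementation would recompute `z_min` and `z_max` for each element of `d_grid`.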

We calculate the asset correlation matrix $C$ ($C_1$ or $C_2$) and detect its noise following Step 1 and Step 2. The results are shown in the two tables below.

Matrix | Q (T/N) | $\lambda_{\max}$ (C) | $\lambda_{\max}$ (RMT) | $\lambda_{\min}$ (C) | $\lambda_{\min}$ (RMT) | N ($\ge \lambda_{\max}^{Z}$) | Noise % |
---|---|---|---|---|---|---|---|
$C_1$ | 9.2 | 44.32 | 2.43 | 0.01 | 0.19 | 12 | 24.19 |
$C_2$ | 9.2 | 20.84 | 1.77 | 0.25 | 0.45 | 7 | 73.33 |

Asset correlation matrix | Eigenvalues | SE (asset correlation matrix) | SE (corresponding random matrix) |
---|---|---|---|
$C_1$ | A | 0.001895 | 0.512808 |
$C_1$ | B | 0.641649 | 0.641649 |
$C_2$ | A | 0.003654 | 0.540436 |
$C_2$ | B | 0.004707 | 0.710836 |

Notes: the symbol “A” means that all eigenvalues are considered, while “B” means that the 7 largest eigenvalues are removed.

From the first table, the maximum (minimum) eigenvalue of matrix $C_1$, 44.32 (0.01), and that of its corresponding random matrix, 2.43 (0.19), are greater (smaller) than the maximum (minimum) eigenvalue of $C_2$, 20.84 (0.25), and that of its corresponding random matrix, 1.77 (0.45). This tells us that matrix $C_1$ has a smaller noise interval. Moreover, the percentage of noise in $C_1$ is smaller than in $C_2$, which means that matrix $C_1$ contains more useful economic information.

It can be seen from

Based on Step 3 and Step 4 in Section 3.2, we obtain the optimal investment strategy under each model for different parameters; the average return and OR values are reported in the results table of this section.

The following results can be found from that table:

1) Under every constraint pair $(d^*, z)$ on the mean and CVaR, both the average return and the OR of MRVC (denoise) are higher than those of MVC (denoise), which suggests that introducing the realized covariance matrix for high-frequency data provides more effective market information and supports more appropriate investment decisions.

2) Compared with the MRVC model, the average return and OR of MRVC (denoise) are improved in most cases, which tells us that the use of random matrix theory can indeed improve the performance of investment portfolios to some extent. The performance of the MRVC (denoise) model is also sensitive to the choice of the parameters $d^*$ and $z$.

To further understand the out-of-sample performance of each model under different parameters, we plot the cumulative return obtained with the optimal portfolio weights.

From the cumulative return curves, the following can be observed:

1) For every $(d^*, z)$, MRVC (denoise) performs best and MRVC worst. This shows that combining the realized covariance matrix with random matrix theory in the optimization model improves portfolio performance.

2) There is little difference among the three models in the early stage, when the market fluctuates only slightly. However, MRVC (denoise) begins to show its superiority when the market fluctuates sharply.

3) At a fixed return target, the superiority of MRVC (denoise) gradually increases as the constraint on the risk measure CVaR is relaxed.

$d^*$ | Problem | Average return ($z_1^*$) | OR ($z_1^*$) | Average return ($z_2^*$) | OR ($z_2^*$) | Average return ($z_3^*$) | OR ($z_3^*$) | Average return ($z_4^*$) | OR ($z_4^*$) |
---|---|---|---|---|---|---|---|---|---|
$d_1^*$ | $P_{mrvc}$ | −0.1456 | 0.6350 | −0.1350 | 0.6762 | −0.1227 | 0.7173 | −0.1137 | 0.7483 |
$d_1^*$ | $P_{mrvc}^{rmt}$ | −0.1450 | 0.6329 | −0.1240 | 0.6874 | −0.1125 | 0.7246 | −0.1043 | 0.7512 |
$d_1^*$ | $P_{mvc}^{rmt}$ | −0.1713 | 0.5884 | −0.1708 | 0.5891 | −0.1708 | 0.5893 | −0.1707 | 0.5896 |
$d_2^*$ | $P_{mrvc}$ | −0.1174 | 0.7259 | −0.1107 | 0.7492 | −0.1146 | 0.7547 | −0.1201 | 0.7585 |
$d_2^*$ | $P_{mrvc}^{rmt}$ | −0.1138 | 0.7332 | −0.1037 | 0.7638 | −0.1009 | 0.7777 | −0.1025 | 0.6757 |
$d_2^*$ | $P_{mvc}^{rmt}$ | −0.1394 | 0.6699 | −0.1335 | 0.675 | −0.1365 | 0.6723 | −0.1331 | 0.7827 |
$d_3^*$ | $P_{mrvc}$ | −0.1436 | 0.7645 | −0.1450 | 0.7723 | −0.1424 | 0.7856 | −0.1372 | 0.7996 |
$d_3^*$ | $P_{mrvc}^{rmt}$ | −0.1252 | 0.7834 | −0.1248 | 0.7990 | −0.1197 | 0.8125 | −0.1183 | 0.8202 |
$d_3^*$ | $P_{mvc}^{rmt}$ | −0.1324 | 0.7822 | −0.1250 | 0.7825 | −0.1271 | 0.7809 | −0.1250 | 0.7826 |
$d_4^*$ | $P_{mrvc}$ | −0.2038 | 0.7667 | −0.2012 | 0.7703 | −0.1978 | 0.7759 | −0.1950 | 0.7809 |
$d_4^*$ | $P_{mrvc}^{rmt}$ | −0.1864 | 0.7791 | −0.1795 | 0.7831 | −0.1776 | 0.7846 | −0.1770 | 0.7851 |
$d_4^*$ | $P_{mvc}^{rmt}$ | −0.2013 | 0.7686 | −0.1994 | 0.7714 | −0.1998 | 0.7731 | −0.1996 | 0.7751 |

This paper studies a multi-objective investment strategy based on mean-realized variance-CVaR and random matrix theory for high-frequency data. Compared with Roman et al. (2007), the innovation of this paper is the introduction of the realized covariance matrix and random matrix theory into the optimization problem (Clemente et al., 2019). Compared with Li & Hong (2019), this paper considers CVaR and variance simultaneously as risk control factors. To a certain extent, the new model can better deal with the high-frequency, noisy and fat-tailed characteristics of financial market data. The empirical study finds that the noise percentage in the asset correlation matrix based on the realized covariance matrix is significantly reduced, so that this matrix carries more effective information. The out-of-sample performance of MRVC (denoise) is significantly better than that of the other two models, which suggests that the realized covariance matrix and random matrix theory may help improve the quality and effectiveness of the information extracted from high-frequency data in investment problems. Owing to space limitations, this paper only considers five-minute return data for 120 stocks; the relationship between the sampling frequency of the data, the denoising effect and the covariance matrix estimator is a direction for future research.

This work was partially supported by National Natural Science Foundation of China under Grant no. 71671104 and 11971301.

The authors declare no conflicts of interest regarding the publication of this paper.

Yang, Y. J., Zhu, Y. P., & Zhao, X. (2020). Portfolio Research Based on Mean-Realized Variance-CVaR and Random Matrix Theory under High-Frequency Data. Journal of Financial Risk Management, 9, 480-493. https://doi.org/10.4236/jfrm.2020.94026

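Based on the mean-variance-CVaR model in Roman et al. (2007), $CVaR_{1-\alpha}$ can be written as follows:

$$CVaR_{1-\alpha} = \frac{1}{\alpha} \sum_{i=1}^{T} p_i \left[-v - \sum_{j=1}^{N} x_j r_{ij}\right]^+ + v = \frac{1}{\alpha} \sum_{i=1}^{T} p_i y_i + v = z. \tag{14}$$

Thus problem (11) can be rewritten as $P_1$:

$$
(P_1)\quad
\begin{aligned}
\min\ & x^T M x \\
\text{s.t.}\ & \sum_{j=1}^{N} x_j \mu_j = d \\
& \frac{1}{\alpha} \sum_{i=1}^{T} p_i y_i + v = z \\
& y_i \ge -v - \sum_{j=1}^{N} x_j r_{ij}, \quad \forall i \in \{1, \cdots, T\} \\
& y_i \ge 0, \quad \forall i \in \{1, \cdots, T\} \\
& \sum_{j=1}^{N} x_j = 1 \\
& x_j \ge 0, \quad \forall j \in \{1, \cdots, N\}
\end{aligned}
\tag{15}
$$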

Here $v$ is the value of $VaR_{1-\alpha}$, $p_i$ represents the probability of the portfolio return $R_{xi}$ at time $i$, $R_{xi} = \sum_{j=1}^{N} x_j r_{ij}$, $r_{ij}$ represents the return of asset $j$ at time $i$, and $\mu_j$ is the expected return of asset $j$.
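To make formula (14) concrete, the sketch below evaluates the CVaR of a fixed portfolio from scenario returns. It relies on the fact that the expression in (14), viewed as a function of $v$, is convex and piecewise linear with kinks at the scenario losses $-R_{xi}$, so its minimum can be found by checking those points. The function names, the equal probabilities and the sample returns are hypothetical.

```python
def cvar_objective(v, returns, probs, alpha):
    """Value of (1/alpha) * sum_i p_i [-v - R_i]^+ + v, as in formula (14)."""
    shortfall = sum(p * max(-v - r, 0.0) for p, r in zip(probs, returns))
    return shortfall / alpha + v

def cvar(returns, alpha, probs=None):
    """CVaR_{1-alpha} of the scenario returns of a fixed portfolio."""
    n = len(returns)
    if probs is None:
        probs = [1.0 / n] * n            # assumption: equally likely scenarios
    # The minimum over v is attained at one of the scenario losses -R_i.
    candidates = [-r for r in returns]
    return min(cvar_objective(v, returns, probs, alpha) for v in candidates)

# Hypothetical portfolio returns in 10 equally likely scenarios.
rets = [-0.10, -0.05, 0.00, 0.02, 0.03, 0.01, 0.04, 0.02, -0.02, 0.05]
z = cvar(rets, alpha=0.2)                # mean of the two worst losses
```

With $\alpha = 0.2$ and ten scenarios, the result is the average of the two largest losses, 0.10 and 0.05, that is, 0.075.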

In order to ensure that $P_1$ has a feasible solution, $d$ and $z$ need to lie within certain ranges, that is, $d \in [d_{\min}, d_{\max}]$ and $z \in [z_{d^*,\min}, z_{d^*,\max}]$, where $d_{\min} = \max\{d_{\min}^{var}, d_{\min}^{cvar}\}$. $d_{\min}^{var}$ is determined by $P_2$:

$$
(P_2)\quad
\begin{aligned}
\min\ & x^T M x \\
\text{s.t.}\ & \sum_{j=1}^{N} x_j \mu_j = d_p \\
& \sum_{j=1}^{N} x_j = 1 \\
& x_j \ge 0, \quad \forall j \in \{1, \cdots, N\}
\end{aligned}
\tag{16}
$$

Solving $P_2$ gives $x_j^1$, and thus $d_{\min}^{var} = \sum_{j=1}^{N} x_j^1 \mu_j$.

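$d_{\min}^{cvar}$ is determined by:

$$
(P_3)\quad
\begin{aligned}
\min\ & \frac{1}{\alpha} \sum_{i=1}^{T} p_i \left[-v - \sum_{j=1}^{N} x_j r_{ij}\right]^+ + v \\
\text{s.t.}\ & \sum_{j=1}^{N} x_j \mu_j = d_p \\
& \sum_{j=1}^{N} x_j = 1 \\
& x_j \ge 0, \quad \forall j \in \{1, \cdots, N\}
\end{aligned}
\tag{17}
$$

Solving $P_3$ gives $x_j^2$, and thus $d_{\min}^{cvar} = \sum_{j=1}^{N} x_j^2 \mu_j$.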

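$d_{\max}$ is determined by:

$$
(P_4)\quad
\begin{aligned}
\max\ & \sum_{j=1}^{N} x_j \mu_j \\
\text{s.t.}\ & \sum_{j=1}^{N} x_j = 1 \\
& x_j \ge 0, \quad \forall j \in \{1, \cdots, N\}
\end{aligned}
\tag{18}
$$

Solving $P_4$ gives $x_j^3$, and thus $d_{\max} = \sum_{j=1}^{N} x_j^3 \mu_j$.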

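Here $z_{d^*,\min}$ is determined by the model:

$$
(P_5)\quad
\begin{aligned}
\min\ & \frac{1}{\alpha} \sum_{i=1}^{T} p_i \left[-v - \sum_{j=1}^{N} x_j r_{ij}\right]^+ + v \\
\text{s.t.}\ & \sum_{j=1}^{N} x_j \mu_j = d^* \\
& \sum_{j=1}^{N} x_j = 1 \\
& x_j \ge 0, \quad \forall j \in \{1, \cdots, N\}
\end{aligned}
\tag{19}
$$

Solving $P_5$ gives $x_j^4$ and $v_1$, and thus $z_{d^*,\min} = \frac{1}{\alpha} \sum_{i=1}^{T} p_i \left[-v_1 - \sum_{j=1}^{N} x_j^4 r_{ij}\right]^+ + v_1$.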

$z_{d^*,\max}$ is determined by:

$$
(P_6)\quad
\begin{aligned}
\min\ & \frac{1}{\alpha} \sum_{i=1}^{T} p_i \left[-v - \sum_{j=1}^{N} x_j^* r_{ij}\right]^+ + v \\
\text{s.t.}\ & \sum_{j=1}^{N} x_j = 1 \\
& x_j \ge 0, \quad \forall j \in \{1, \cdots, N\}
\end{aligned}
\tag{20}
$$

Solving $P_6$ gives $v_2$, and thus $z_{d^*,\max} = \frac{1}{\alpha} \sum_{i=1}^{T} p_i \left[-v_2 - \sum_{j=1}^{N} x_j^* r_{ij}\right]^+ + v_2$. Here $x^* = (x_1^*, \cdots, x_N^*)$ is the optimal portfolio weight vector of the mean-variance model under the mean constraint $\sum_{j=1}^{N} x_j \mu_j = d^*$.