Portfolio Research Based on Mean-Realized Variance-CVaR and Random Matrix Theory under High-Frequency Data

In this paper, random matrix theory is employed to perform information selection and denoising, and mean-realized variance-CVaR multi-objective portfolio models before and after denoising are constructed for high-frequency data. The empirical study is conducted on high-frequency data for stocks in the SSE 180 Index. Compared with the existing literature, the main contribution of this paper is the introduction of both the realized covariance matrix and random matrix theory into the multi-objective portfolio problem. The results show that using the realized covariance matrix can reduce the loss of market information, and that random matrix theory can help improve the quality of the information contained in the correlation matrix among assets. Under the denoised mean-realized variance-CVaR criterion, the new portfolio selection has better out-of-sample performance.


Introduction
The mean-variance (MV) model proposed by Markowitz (1952) opened a new chapter in modern portfolio theory, and many scholars have since devoted themselves to extending and deepening it. Kolm et al. (2014) summarized its development, challenges and future directions. In Markowitz's MV model, the mean and the variance measure the average return and the risk of a portfolio, respectively. How the variance is calculated depends on the characteristics of the data. According to the sampling frequency, data can be divided into low-frequency and high-frequency data. High-frequency trading data, sampled over short time spans, can reduce the loss of information in financial markets, and such data have become easier to obtain with the rapid development of technology. Research on investment strategies based on high-frequency data is therefore both necessary and significant. The realized variance can be calculated from the realized volatility proposed by Andersen & Bollerslev (1998). The literature on realized volatility (variance) is rich, and most of it is devoted to its modification, extension and application to high-frequency financial data. Scholars have also introduced the realized variance (covariance) into asset allocation research, e.g., Pooter et al. (2008), Yao (2010), Song & Hu (2017) and Yin (2016).
In the portfolio studies with realized variance cited above, only one risk factor (the variance) is considered. Since different risk measures describe different risk characteristics of assets, scholars have combined multiple measures to construct multi-objective portfolio optimization models. The earliest studies are the mean-absolute deviation-skewness model and the mean-variance-skewness model in Konno et al. (1993, 1995). Owing to the excellent properties of CVaR, Roman et al. (2007) constructed a mean-variance-CVaR model that can produce a more balanced portfolio. Further, Li et al. (2012) and Yu & Ma (2014) used this model to study China's foreign exchange reserves and sovereign fund investment, respectively. Gao et al. (2016) extended it to dynamic settings in financial markets, and Shi et al. (2019) considered the optimal investment and reinsurance problem in continuous time. However, the studies above all use low-frequency data, and the high-frequency case remains to be explored.
In addition, with the increasing complexity and diversity of financial markets, Laloux et al. (1999) and Plerou et al. (1999) first applied random matrix theory (RMT) to the stock market and demonstrated the existence of "noise" in the asset correlation matrix and its effect on portfolio strategy. Later, RMT was used in financial risk management to improve the information quality of financial markets, for example by Han et al. (2014), Xie et al. (2018), Bun et al. (2017) and Shen et al. (2019). Li & Hong (2019) studied the stability of the network before and after "denoising" based on random matrix theory, as well as the efficient frontier of portfolios under the mean-variance model.
In summary, this paper constructs a mean-realized variance-CVaR portfolio model and discusses the influence of the denoising technique and the realized covariance matrix on the optimal multi-objective strategy. The paper is organized as follows. Section 2 describes the related methods. Section 3 presents the dataset, the empirical procedure and the out-of-sample performance of the different portfolio strategies. Finally, Section 4 concludes the paper.

Methods
For convenience, we first give some notations.

Realized Covariance Matrix
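Consider the price process of $N$ financial assets, $P_t = (P_{1,t}, \ldots, P_{N,t})^\top$, an $N \times 1$ vector in which $P_{j,t}$ represents the price of the $j$-th asset at time $t$. The logarithmic price vector is $p_t = \ln P_t$, and the return vector is as follows:

$$r_t = p_t - p_{t-1}.$$

Assume that the time period from $t$ to $t+1$ is divided into $m$ segments, and the asset return on the $i$-th segment is $r_{t+i/m} = p_{t+i/m} - p_{t+(i-1)/m}$, $i = 1, \ldots, m$. The return matrix from time $t$ to $t+1$ is then described as:

$$R_{t+1} = (r_{t+1/m}, r_{t+2/m}, \ldots, r_{t+1}).$$

Therefore, the realized covariance matrix ($\Sigma_1$) can be defined as:

$$\Sigma_1 = R_{t+1} R_{t+1}^\top = \sum_{i=1}^{m} r_{t+i/m}\, r_{t+i/m}^\top.$$

Here, the main diagonal elements of the realized covariance matrix are the realized variances of the individual assets.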
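As an illustration of the definition, the following minimal Python sketch computes a realized covariance matrix as the sum of outer products of intraday log-return vectors. The function name `realized_covariance`, the (m, N) array layout and the toy prices are assumptions made for this example only; they are not taken from the paper's data.

```python
import numpy as np

def realized_covariance(intraday_returns):
    """Sum of outer products of the m intraday return vectors.

    intraday_returns: array of shape (m, N), one row per intraday segment.
    """
    r = np.asarray(intraday_returns, dtype=float)
    return r.T @ r  # equals sum_i r_i r_i^T, shape (N, N)

# Toy example (made-up prices): 4 consecutive prices give m = 3 returns for N = 2 assets.
log_prices = np.log(np.array([[10.0, 20.0],
                              [10.1, 19.9],
                              [10.0, 20.1],
                              [10.2, 20.0]]))
returns = np.diff(log_prices, axis=0)   # segment log returns, shape (3, 2)
rcov = realized_covariance(returns)     # realized covariance matrix
realized_var = np.diag(rcov)            # realized variance of each asset
```

The diagonal of `rcov` is the sum of squared segment returns of each asset, which matches the statement that the main diagonal holds the realized variances.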

Noise Detection
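Consider a random matrix, expressed as:

$$Z = \frac{1}{L} A A^\top,$$

where $A$ is an $N \times L$ matrix composed of $N$ uncorrelated random variables with sequence length $L$, and each sequence obeys a normal distribution with zero mean and variance $\sigma_Z^2$. Following Wigner (1951), as the window width $L \to \infty$ and $N \to \infty$ with $Q = L/N > 1$ fixed, the predicted maximum and minimum eigenvalues of the random matrix can be expressed as:

$$\lambda^Z_{\max/\min} = \sigma_Z^2 \left(1 + \frac{1}{Q} \pm 2\sqrt{\frac{1}{Q}}\right),$$

where $\sigma_Z^2$ is the variance of $Z$, and $\sigma_Z^2 = 1$ for a standardized matrix.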
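The eigenvalue bounds can be evaluated directly. The sketch below assumes the standard random-matrix form $\sigma_Z^2 (1 + 1/Q \pm 2\sqrt{1/Q})$ with $Q = L/N$; the function name `mp_eigenvalue_bounds` and the sizes N = 100, L = 400 are illustrative assumptions, not values from the paper.

```python
import math

def mp_eigenvalue_bounds(n_assets, n_obs, sigma2=1.0):
    """Predicted (min, max) eigenvalues of a random correlation matrix with Q = L / N."""
    q = n_obs / n_assets
    if q <= 1:
        raise ValueError("the bounds assume Q = L / N > 1")
    spread = 2.0 * math.sqrt(1.0 / q)
    lam_min = sigma2 * (1.0 + 1.0 / q - spread)
    lam_max = sigma2 * (1.0 + 1.0 / q + spread)
    return lam_min, lam_max

# Illustrative sizes only: N = 100 series of length L = 400, so Q = 4.
lam_min, lam_max = mp_eigenvalue_bounds(n_assets=100, n_obs=400)
print(lam_min, lam_max)  # 0.25 2.25
```

Eigenvalues of an empirical correlation matrix that fall inside this interval are the ones a purely random matrix could also produce, which is why they are treated as candidates for noise.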
Based on Kenett et al. (2009), the eigenvalue entropy (SE) of a matrix is an effective tool to evaluate the information contained in its eigenvalues, as follows:

$$SE = -\frac{1}{\ln N} \sum_{j=1}^{N} \Omega_j \ln \Omega_j, \qquad \Omega_j = \frac{\lambda_j}{\sum_{k=1}^{N} \lambda_k},$$

where $\lambda_j$ ($\lambda_{j-1} > \lambda_j$) represents the $j$-th eigenvalue of the matrix. A smaller SE means less noise, which shows that more economic information is contained in the eigenvalues, and vice versa.
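A small numerical sketch may help with the interpretation of SE. It assumes the normalized form $SE = -\sum_j \Omega_j \ln \Omega_j / \ln N$ with $\Omega_j = \lambda_j / \sum_k \lambda_k$; the function name `eigenvalue_entropy` and the two test matrices are illustrative assumptions.

```python
import numpy as np

def eigenvalue_entropy(corr):
    """Normalized entropy of the eigenvalue shares of a symmetric matrix."""
    lam = np.linalg.eigvalsh(np.asarray(corr, dtype=float))
    lam = np.clip(lam, 0.0, None)       # guard against tiny negative rounding errors
    omega = lam / lam.sum()             # share of each eigenvalue in the trace
    omega = omega[omega > 0]            # convention: 0 * ln(0) = 0
    return float(-(omega * np.log(omega)).sum() / np.log(len(lam)))

# Two extreme cases for N = 4 assets.
se_uncorrelated = eigenvalue_entropy(np.eye(4))        # all eigenvalues equal
se_one_factor = eigenvalue_entropy(np.ones((4, 4)))    # a single dominant eigenvalue
```

Under this normalization, fully uncorrelated assets give an SE of about 1 and perfectly co-moving assets give an SE of about 0, in line with the reading that a smaller SE indicates more structure in the eigenvalues.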

Denoising Method
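The asset correlation matrix $C$ ($C_1$ or $C_2$) can be decomposed as

$$C = E D E^{-1},$$

where $D$ is a diagonal matrix with the eigenvalues $\{\gamma_i\}_{i=1}^{n}$ as its diagonal elements, and $E$ is the eigenvector matrix of $C$. Let $A = \{\gamma_i : \gamma_i \le \lambda^Z_{\max}\}$; then $A$ is the noise set. Here the PG+ denoising method is employed: all the elements of set $A$ are replaced by 0 to construct the new diagonal matrix $\tilde{D}$. Then the denoised asset correlation matrix $\tilde{C}$ ($\tilde{C}_1$ or $\tilde{C}_2$) can be expressed as:

$$\tilde{C} = E \tilde{D} E^{-1}.$$

We set the diagonal elements of $\tilde{C}$ to 1 to ensure that $\mathrm{Tr}(\tilde{C}) = \mathrm{Tr}(C) = N$. As we know, the covariance matrix $\Sigma$ ($\Sigma_1$ or $\Sigma_2$) and the asset correlation matrix $C$ satisfy

$$\Sigma = \Delta C \Delta,$$

where $\Delta = \Delta_1$ or $\Delta_2$ represents the diagonal matrix formed by the standard deviation of each asset. So the denoised covariance matrix can be obtained through $\tilde{C}$ as $\tilde{\Sigma} = \Delta \tilde{C} \Delta$.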
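The denoising steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: eigenvalues not exceeding a threshold `lam_max` are treated as noise and set to zero, the diagonal is reset to 1, and the covariance is rebuilt from the standard deviations. The function names and the simulated one-factor data are hypothetical.

```python
import numpy as np

def pg_plus_denoise(corr, lam_max):
    """Zero the eigenvalues <= lam_max, rebuild the matrix, reset the diagonal to 1."""
    gamma, E = np.linalg.eigh(np.asarray(corr, dtype=float))
    gamma_tilde = np.where(gamma <= lam_max, 0.0, gamma)   # noise set replaced by 0
    C_tilde = E @ np.diag(gamma_tilde) @ E.T               # E is orthogonal, so E^-1 = E^T
    np.fill_diagonal(C_tilde, 1.0)                         # keeps Tr(C_tilde) = N
    return C_tilde

def covariance_from_correlation(corr, std):
    """Sigma = Delta C Delta, with Delta the diagonal matrix of standard deviations."""
    delta = np.diag(np.asarray(std, dtype=float))
    return delta @ corr @ delta

# Simulated data (assumption): N = 20 series of length L = 200 sharing one common factor.
rng = np.random.default_rng(0)
N, L = 20, 200
factor = rng.standard_normal(L)
X = 0.6 * factor + rng.standard_normal((N, L))
C = np.corrcoef(X)
lam_max = 1.0 + N / L + 2.0 * np.sqrt(N / L)   # random-matrix upper bound with Q = L / N
C_tilde = pg_plus_denoise(C, lam_max)
Sigma_tilde = covariance_from_correlation(C_tilde, X.std(axis=1, ddof=1))
```

Because the diagonal of the denoised correlation matrix is reset to 1, the diagonal of `Sigma_tilde` equals the original sample variances; only the off-diagonal co-movement is altered by the denoising.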

Mean-Realized Variance-CVaR Optimization Model
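Suppose that $r = (r_1, \ldots, r_N)^\top$ is the return vector of $N$ assets, and $x = (x_1, \ldots, x_N)^\top$ is the vector of portfolio weights. The variance of the portfolio is

$$\sigma^2(x) = x^\top M x,$$

where $M$ might be the realized covariance matrix $\Sigma_1$, the realized covariance matrix after noise reduction ($\tilde{\Sigma}_1$) or the covariance matrix after noise reduction ($\tilde{\Sigma}_2$).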
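Based on Roman et al. (2007), we will study the following problem:

$$\min_{x} \; x^\top M x \quad \text{s.t.} \quad E(r)^\top x \ge d, \quad \mathrm{CVaR}(x) \le z, \quad \sum_{j=1}^{N} x_j = 1, \quad x_j \ge 0, \; j = 1, \ldots, N, \tag{11}$$

where $d$ represents the investor's target return rate, $z$ represents the control of the CVaR level, $\sum_{j=1}^{N} x_j = 1$ is the weight constraint for full investment, and $x_j \ge 0$ means that no short selling is permitted. The specific determination of the parameters $d$ and $z$ is shown in Appendix A.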
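For reference, a CVaR restriction with level $z$ is commonly made tractable through the sample-based linearization of Rockafellar and Uryasev. The block below is a sketch under assumptions that the text does not state explicitly: $T$ in-sample return scenarios $r_t$, a confidence level $\beta$, an auxiliary threshold variable $v$ (whose optimal value is the VaR) and auxiliary variables $u_t$.

```latex
% Sample CVaR of the portfolio loss -r_t^\top x at confidence level \beta
\mathrm{CVaR}_\beta(x) = \min_{v \in \mathbb{R}}
  \left\{ v + \frac{1}{(1-\beta)\,T} \sum_{t=1}^{T} \left[ -r_t^\top x - v \right]^{+} \right\}

% Hence CVaR_\beta(x) \le z holds if and only if there exist v and u_1, ..., u_T with
v + \frac{1}{(1-\beta)\,T} \sum_{t=1}^{T} u_t \le z, \qquad
u_t \ge -r_t^\top x - v, \qquad u_t \ge 0, \qquad t = 1, \ldots, T.
```

With this substitution all constraints are linear in $(x, v, u)$ and the objective is a convex quadratic, so the problem can be passed to a standard convex solver.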
To show the impact of random matrix theory and realized variance on investment strategies, the following three optimization models are considered in this paper, as listed in Table 1.
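The models differ only in the matrix $M$ used in the objective, so a candidate weight vector can be checked against any of them in the same way. The sketch below evaluates the portfolio variance, mean return and sample CVaR for given weights. It assumes inequality constraints (mean at least $d$, CVaR at most $z$), a confidence level `beta` and simulated returns; the function names are hypothetical, and the code only evaluates a candidate rather than solving the optimization.

```python
import numpy as np

def sample_cvar(returns, x, beta=0.95):
    """Sample CVaR of the loss -r'x at confidence level beta (Rockafellar-Uryasev form)."""
    losses = -(np.asarray(returns, dtype=float) @ np.asarray(x, dtype=float))
    # F(v) = v + mean(max(loss - v, 0)) / (1 - beta) is convex and piecewise linear,
    # so its minimum over v is attained at one of the sample losses.
    excess = np.maximum(losses[None, :] - losses[:, None], 0.0)  # row i uses v = losses[i]
    return float(np.min(losses + excess.mean(axis=1) / (1.0 - beta)))

def evaluate_candidate(x, M, returns, d, z, beta=0.95, tol=1e-9):
    """Objective value and constraint check for a candidate weight vector x."""
    x = np.asarray(x, dtype=float)
    returns = np.asarray(returns, dtype=float)
    mean_ret = float(returns.mean(axis=0) @ x)
    cvar = sample_cvar(returns, x, beta)
    feasible = bool(mean_ret >= d - tol and cvar <= z + tol
                    and abs(x.sum() - 1.0) <= tol and np.all(x >= -tol))
    return {"variance": float(x @ M @ x), "mean": mean_ret,
            "cvar": cvar, "feasible": feasible}

# Simulated returns (assumption): 250 scenarios for 3 assets, equal weights, loose bounds.
rng = np.random.default_rng(1)
toy_returns = rng.normal(0.0005, 0.01, size=(250, 3))
M = np.cov(toy_returns, rowvar=False)
report = evaluate_candidate(np.full(3, 1.0 / 3.0), M, toy_returns, d=-1.0, z=1.0)

# Sanity check: losses 1..10 at beta = 0.8 have CVaR equal to the mean of the worst two.
cvar_check = sample_cvar(-np.arange(1.0, 11.0).reshape(10, 1), [1.0], beta=0.8)
```

Swapping `M` between a realized covariance matrix and its denoised counterpart changes only the `variance` entry, which is the sense in which the models share constraints but differ in the objective.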

Model Evaluation
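The average return:

$$\bar{R} = \frac{1}{T} \sum_{t=1}^{T} R_{out,t}\, x^*, \tag{12}$$

where $R_{out}$ represents the out-of-sample data, $R_{out,t}$ is its $t$-th observation, $T$ is the number of out-of-sample observations, and $x^*$ represents the optimal investment weight.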
The Omega ratio (OR), proposed by Keating & Shadwick (2002), is defined as:

$$OR = \frac{\int_{\varepsilon}^{+\infty} \left(1 - F(x)\right) dx}{\int_{-\infty}^{\varepsilon} F(x)\, dx}, \tag{13}$$

where $F(x)$ represents the cumulative distribution function of portfolio returns and $\varepsilon$ is a specified threshold. Returns below the threshold are considered losses and returns above it gains. For convenience of calculation, $\varepsilon = 0$ is assumed (Clemente et al., 2019). An investor will prefer the portfolio with the highest ratio.
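Both evaluation measures are easy to compute from out-of-sample returns. The sketch below uses the empirical counterpart of the Omega ratio, namely the mean gain above the threshold divided by the mean shortfall below it, which equals the ratio of the two integrals for the empirical distribution. The function names and the three-period toy data are assumptions for illustration.

```python
import numpy as np

def portfolio_returns(R_out, x_star):
    """Out-of-sample portfolio returns for fixed weights x_star."""
    return np.asarray(R_out, dtype=float) @ np.asarray(x_star, dtype=float)

def omega_ratio(returns, eps=0.0):
    """Empirical Omega ratio: mean gain above eps over mean shortfall below eps."""
    r = np.asarray(returns, dtype=float)
    gains = np.maximum(r - eps, 0.0).mean()
    shortfall = np.maximum(eps - r, 0.0).mean()
    return float("inf") if shortfall == 0 else float(gains / shortfall)

# Toy out-of-sample data (made up): 3 periods, 2 assets, equal weights.
R_out = np.array([[0.02, -0.01],
                  [-0.01, 0.00],
                  [0.01, 0.03]])
x_star = np.array([0.5, 0.5])
rp = portfolio_returns(R_out, x_star)   # approximately [0.005, -0.005, 0.02]
avg_return = rp.mean()
omega = omega_ratio(rp)                 # gains 0.025 against shortfall 0.005, about 5
```

A ratio above 1 means that, relative to the threshold, average gains outweigh average shortfalls.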

Dataset Description
The data come from the Shanghai Stock Exchange 180 (SSE 180) Index, which is composed of the 180 sample stocks that are most representative of China's A-share market. Five-minute return data for 120 stocks are collected from July 1, 2019 to August 10, 2019, and the corresponding five-minute logarithmic returns are calculated. The data from July 1, 2019 to July 31, 2019 are used as in-sample data and the rest as out-of-sample data.

Empirical Procedure
The empirical study proceeds according to the following procedure.
Step 4: Calculate the optimal asset weights under MRVC (denoise), MRVC and MVC (denoise) based on model (11).
(2) Find the optimal weights $x_j^*$ of the assets under the mean-variance model based on $d^*$ from the previous step.
(4) Solve the problems mrvc, rmt-mrvc and rmt-mvc with the given $d^*$ and $z$ using the CVX toolbox in Matlab, which yields the optimal solution $x^*$.
Step 5: The in-sample optimal asset weights of the three models are obtained from Step 1 to Step 4. The average returns and OR values for the out-of-sample dataset are then calculated by formulas (12) and (13).

Characteristic Analysis of Asset Correlation Matrix
We calculate the asset correlation matrix $C$ ($C_1$ or $C_2$) and detect its noise following Step 1 and Step 2. The results are shown in Table 2 and Table 3.
From Table 2, we find the maximum (minimum) eigenvalue 44.32 (0.01) of
It can be seen from Table 3 that the SE of the asset correlation matrix $C$ ($C_1$ or $C_2$) is much smaller than that of its corresponding random matrix, which means that $C$ ($C_1$ or $C_2$) contains more economic information than the random matrix. After removing the eigenvalues greater than $\lambda^Z_{\max}$ in $C_1$ and $C_2$ respectively, SE rises sharply, which indicates that removing the larger eigenvalues might reduce the information in the asset correlation matrix. Therefore, we only replace eigenvalues less than 5 with 0 when the PG+ method is used.

Out-of-Sample Performance of Optimal Asset Allocation
Based on Step 3 to Step 4 in Section 3.2, we can obtain the optimal investment strategy under each model with different parameters. The average return and OR values are shown in Table 4.
The following results can be found from Table 4.
1) Under all the constraint settings $(d^*, z)$ on the mean and CVaR, both the average return and the OR of MRVC (denoise) are higher than those of MVC (denoise), which means that introducing the realized covariance matrix for high-frequency data helps to capture more effective market information and to make more appropriate investment decisions.
2) Compared with the MRVC model, the average return and OR of MVC (denoise) are improved in most cases, which tells us that the use of random matrix theory can indeed improve the performance of investment portfolios to some extent. The performance under the MRVC (denoise) model is sensitive to the selection of the parameters $d^*$ and $z$.
To further understand the out-of-sample performance of each model under different parameters, we plot the cumulative return of the optimal portfolio weights in Figure 1.
From Figure 1 we can draw the following conclusions.
2) There is little difference among the three models when the market fluctuates slightly in the early stage. However, MRVC (denoise) begins to show its superiority when the market fluctuates sharply.
3) At a fixed return target, the superiority of MRVC (denoise) gradually increases as the constraint on the CVaR risk is relaxed.
Journal of Financial Risk Management
Table 4. Out-of-sample performance of optimal investment strategy.

Conclusion
This paper studies a multi-objective investment strategy based on mean-realized variance-CVaR and random matrix theory for high-frequency data. Compared with Roman et al. (2007), the innovation of this paper is the introduction of the realized covariance matrix and random matrix theory into the optimization problem (Clemente et al., 2019). Compared with Li & Hong (2019), this paper considers CVaR and variance simultaneously as risk control factors. To a certain extent, the new model can better deal with the high-frequency, noisy and heavy-tailed characteristics of financial market data. The empirical study finds that the noise percentage in the asset correlation matrix based on the realized covariance matrix is significantly reduced, so that the matrix carries more effective information. The out-of-sample performance of MRVC (denoise) is significantly better than that of the other two models, which suggests that the use of the realized covariance matrix and random matrix theory might help to improve the information quality and effectiveness of high-frequency data in investment problems. Because of length limitations, this paper only considers five-minute return data for 120 stocks. The relationship between different sampling frequencies, the denoising effect and the covariance matrix estimator is a direction for future research.
Thus formula (11) Here v is the value of Solving 5  to get