Since the great financial crisis of 2008, many studies have pointed out that even in a portfolio whose asset allocation is sufficiently diversified, the risk allocation can still be concentrated in a few assets. One approach to this problem is the risk parity strategy, which equalizes the risk contribution of each asset. However, even if the risk contributions are equalized, the underlying risk sources are not necessarily diversified. In this paper, we propose a non-hierarchical clustering risk parity strategy that equalizes the risk contribution from each cluster and from each asset within every cluster. In addition, to ensure the robustness of the clustering, we also propose the x-means++ algorithm, which combines k-means++ with x-means. Assuming that assets with similar movements share common risk sources, our approach constructs a portfolio that equalizes those risk sources. Empirical analysis using actual price data of various asset classes shows that our proposed method outperforms risk parity strategies and hierarchical clustering risk parity strategies.

Since the great financial crisis of 2008, many studies have pointed out that even in a portfolio whose asset allocation is sufficiently diversified, the risk allocation can still be concentrated in a few assets.

Traditionally, portfolios are constructed using the mean-variance approach [

One approach to this problem is the risk parity strategy, which equalizes the risk contribution of each asset [

As the risk of stocks is much larger than that of bonds, the majority of the portfolio risk comes from stocks. We make diversified investments in the first place because we expect other assets to support the overall performance when one asset is in poor condition.

When the majority of the portfolio risk is occupied by stocks, other assets cannot make up for a stock market slump, and the expected benefits of diversified investment cannot be obtained. Risk parity strategies have been proposed as an alternative under these conditions.

The concept of risk parity applies to the asset allocation problem and there are many previous studies. For example, [

On the other hand, some caveats have been pointed out for the risk parity strategy as well. [

Therefore, in this paper, we first group assets with similar movements using a non-hierarchical clustering method. Then, we propose a non-hierarchical clustering risk parity strategy in which the risk contributions are equal both across clusters and within each cluster. We also propose x-means++, which is a combination of the x-means algorithm [

Assuming that assets with similar movements share common risk sources, our approach constructs a portfolio that equalizes those risk sources. Empirical analysis using actual price data of various asset classes shows that our proposed method outperforms risk parity strategies [

The remaining sections of this paper are organized as follows. In Section 2, we briefly describe related studies of risk-based portfolios. In Section 3, we introduce the risk parity portfolio and the non-hierarchical clustering risk parity portfolio. In Section 4, we describe x-means++ clustering, and in Section 5, we verify its effectiveness through empirical analysis with actual financial market data. Finally, we conclude in Section 6.

Unlike the mean-variance portfolio, which uses both estimated return and risk, risk-based portfolios use only estimated risk to construct a portfolio. As predicting future returns is difficult and the error-maximization property of the mean-variance optimization approach tends to construct a portfolio concentrated in a few securities [

Typical risk-based portfolios are the minimum variance portfolio [

Furthermore, it is known that these three portfolios can be written as a generalized risk-based portfolio [

As an extension of risk parity, there are principal component risk parity [

We consider a portfolio of N risky assets and let R = ( R 1 , ⋯ , R N ) T be the return (random variable) vector of the assets, μ = ( μ 1 , ⋯ , μ N ) T be the vector of expected returns, and Σ = E [ ( R − μ ) ( R − μ ) T ] be the covariance matrix of asset returns. Additionally, we denote the weight vector of the portfolio as w = ( w 1 , ⋯ , w N ) T .

To derive the specific form of the risk parity portfolio, we introduce the Marginal Risk Contribution (MRC) as the derivative of the portfolio risk σ P = √( w T Σ w ) with respect to the weight w.

M R C = ∂ σ P ∂ w = Σ w σ P , M R C i = ( Σ w ) i σ P (1)

We can decompose the portfolio risk using the MRC as follows.

σ P = ∑ i = 1 N w i × M R C i = w T M R C (2)

We additionally define the Risk Contribution (RC) as below.

R C i = w i × M R C i σ P (3)

Finally, the risk parity portfolio can be defined as a portfolio in which the RC i of all assets i are equal.

R C i = R C j , for all i , j (4)
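As a quick numerical check of Eqs. (1)-(4), the sketch below computes the MRC and RC for an illustrative two-asset covariance matrix (the numbers are our own example, not from the text) and verifies the decomposition of Eq. (2):

```python
import numpy as np

def risk_contributions(w, Sigma):
    # Portfolio risk, marginal risk contributions (Eq. (1)) and
    # risk contributions (Eq. (3)); the RCs sum to one.
    w = np.asarray(w, dtype=float)
    sigma_p = np.sqrt(w @ Sigma @ w)
    mrc = Sigma @ w / sigma_p
    rc = w * mrc / sigma_p
    return sigma_p, mrc, rc

# Illustrative two-asset covariance matrix (example numbers of our own)
Sigma = np.array([[0.04, 0.01],
                  [0.01, 0.02]])
w = np.array([0.5, 0.5])
sigma_p, mrc, rc = risk_contributions(w, Sigma)
assert np.isclose(w @ mrc, sigma_p)   # Eq. (2): w^T MRC recovers sigma_P
assert np.isclose(rc.sum(), 1.0)      # the RCs decompose total risk
```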

Restricting short-selling and the use of leverage, [

min w ∑ i = 1 N ∑ j = 1 N ( R C i − R C j ) 2 (5)

s.t. ∑ i = 1 N w i = 1 , w i > 0 (6)
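The problem in Eqs. (5)-(6) can be solved with a general-purpose optimizer. The following is a minimal sketch using SciPy's SLSQP; the solver choice, starting point, and tolerances are our own assumptions, not a prescription from the text:

```python
import numpy as np
from scipy.optimize import minimize

def risk_parity_weights(Sigma):
    # Minimize the squared pairwise differences of risk contributions
    # subject to full investment and long-only weights (Eqs. (5)-(6)).
    n = Sigma.shape[0]

    def objective(w):
        rc = w * (Sigma @ w) / (w @ Sigma @ w)  # risk contributions, sum to 1
        return np.sum((rc[:, None] - rc[None, :]) ** 2)

    res = minimize(objective, np.full(n, 1.0 / n), method='SLSQP',
                   bounds=[(1e-6, 1.0)] * n,
                   constraints=({'type': 'eq',
                                 'fun': lambda w: w.sum() - 1.0},))
    return res.x

# With a diagonal covariance matrix, risk parity reduces to weights
# inversely proportional to volatility.
w = risk_parity_weights(np.diag([0.04, 0.01]))
```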

The essence of the risk parity portfolio is controlling the risk allocation. In constructing a risk parity portfolio we choose to allocate the risk contribution equally to each asset, but we can also consider alternative ways of allocating risk, known as risk budgeting strategies [

In this article, we aim to equalize the risk contribution from each cluster and, at the same time, equalize the risk contribution from each asset within every cluster.

To achieve this goal, we will first perform non-hierarchical clustering using asset returns to determine risk clusters.

Using these clusters, we then solve the optimization problem below to obtain the portfolio weights of the non-hierarchical clustering risk parity portfolio, where k stands for the number of clusters and N k for the number of assets in cluster k. In this portfolio, the risk contribution from each cluster is equalized, and the risk contributions from the assets within each cluster are equalized. We refer to this method as the non-hierarchical clustering risk parity strategy.

min w ∑ i = 1 N ( R C i − 1/( k N k ) ) 2 (7)

s.t. ∑ i = 1 N w i = 1 , w i > 0 (8)
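A minimal sketch of Eqs. (7)-(8), again using SciPy's SLSQP under our own solver assumptions. Here `labels` assigns each asset to a cluster, and asset i in a cluster of size N_k targets the risk budget 1/(k × N_k):

```python
import numpy as np
from scipy.optimize import minimize

def cluster_risk_parity_weights(Sigma, labels):
    # Asset i in a cluster with N_c members targets RC_i = 1/(k * N_c),
    # so clusters contribute equally and assets are equal within clusters.
    labels = np.asarray(labels)
    n = Sigma.shape[0]
    k = len(np.unique(labels))
    n_c = np.array([(labels == labels[i]).sum() for i in range(n)])
    target = 1.0 / (k * n_c)

    def objective(w):
        rc = w * (Sigma @ w) / (w @ Sigma @ w)
        return np.sum((rc - target) ** 2)

    res = minimize(objective, np.full(n, 1.0 / n), method='SLSQP',
                   bounds=[(1e-6, 1.0)] * n,
                   constraints=({'type': 'eq',
                                 'fun': lambda w: w.sum() - 1.0},))
    return res.x
```

With three equal-variance assets clustered as [0, 0, 1], the targets become (1/4, 1/4, 1/2): the singleton cluster receives as much risk as the two-asset cluster combined.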

The k-means is a standard non-hierarchical clustering algorithm which is easy to implement and has high calculation efficiency.

A cluster refers to a collection of data points aggregated together according to a certain distance, and the centroid C i is the center point of cluster i.

We first define a target centroid number k.

The k-means divides the data into k clusters so as to minimize the following evaluation function in which d ( x , y ) is the distance function.

∑ i = 1 k ∑ x ∈ C i ( d ( x , C i ) ) 2 (9)
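For concreteness, the evaluation function (9) with the Euclidean distance can be written in a few lines of NumPy (`labels` and `centroids` are assumed given, e.g. from a k-means run):

```python
import numpy as np

def kmeans_objective(X, labels, centroids):
    # Evaluation function (9): within-cluster sum of squared Euclidean
    # distances between each point and its assigned centroid.
    return sum(np.sum((X[labels == i] - c) ** 2)
               for i, c in enumerate(centroids))
```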

However, the k-means algorithm has two shortcomings. First, the result may depend on the initial clusters, so the algorithm does not guarantee optimal clustering. Second, the number of clusters k must be set in advance.

The initialization method called k-means++ [

The feature of k-means++ is the initialization of centroids C i . The k-means++ algorithm decides the k clusters as follows:

Step 1:

Choose one data point at random from the data as the initial centroid C 1 .

Step 2:

For each data point x i , compute d ( x i ) , the distance between x i and the nearest centroid that has already been chosen.

Step 3:

Choose one new data point x p at random as a new centroid with the following probability

( d ( x p ) ) 2 / ∑ i = 1 n ( d ( x i ) ) 2 (10)

Here, a data point already selected as a centroid has probability 0 because its distance to the nearest centroid is 0.

Step 4:

Repeat Steps 2 and 3 until k centroids have been chosen.
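Steps 1-4 above can be sketched as follows (a NumPy illustration, not a production implementation; degenerate inputs such as all-identical data are not handled):

```python
import numpy as np

def kmeans_pp_init(X, k, seed=None):
    # k-means++ seeding (Steps 1-4): the next centroid is sampled with
    # probability proportional to the squared distance to the nearest
    # centroid already chosen (Eq. (10)).
    rng = np.random.default_rng(seed)
    centroids = [X[rng.integers(len(X))]]          # Step 1
    while len(centroids) < k:                      # Step 4
        d2 = np.min([np.sum((X - c) ** 2, axis=1)
                     for c in centroids], axis=0)  # Step 2
        centroids.append(X[rng.choice(len(X), p=d2 / d2.sum())])  # Step 3
    return np.array(centroids)
```

Since an already-chosen point has zero distance to itself, its sampling probability is exactly 0, so the k returned centroids are always distinct data points.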

| | Pros | Cons |
---|---|---|
| k-means | Easy to implement; high calculation efficiency. | Depends on the initial clusters; the number of clusters k must be set in advance. |
| k-means++ | Decreases the dependency on the initial clusters. | The number of clusters k must be set in advance. |
| x-means | Determines the optimal number of clusters. | Depends on the initial clusters. |
| x-means++ (Ours) | Determines the optimal number of clusters; decreases the dependency on the initial clusters. | - |

The x-means algorithm can determine the optimal number of clusters, unlike the k-means algorithm, for which the number of clusters has to be given in advance.

The x-means procedure performs k-means repeatedly from k = 2 until the Bayesian information criterion (BIC) no longer improves.

This study applies the following algorithm proposed by [

Step 1:

We prepare p-dimensional data whose sample size is n.

Step 2:

We apply k-means ( k = 2 ) to all data. We name the divided clusters as C 1 , C 2 .

Step 3:

We repeat the following procedure, from Step 4 to Step 9, for i = 1 , 2 .

Step 4:

For a cluster C i , we apply k-means ( k = 2 ). We name the divided clusters as C i 1 , C i 2 .

Step 5:

We assume the following p-dimensional normal distribution for the data x i ∈ C i :

f ( x ; θ i ) = ( 2 π ) −p/2 ( det V i ) −1/2 exp ( − (1/2) ( x − μ i ) T V i −1 ( x − μ i ) ) (11)

Then, we calculate BIC as

B I C = − 2 log L ( θ i ; x i ∈ C i ) + q log n i (12)

where θ i = [ μ i , V i ] is the maximum likelihood estimate of the p-dimensional normal distribution; μ i is the p-dimensional mean vector and V i is the p × p covariance matrix; q is the number of parameters, which becomes q = p ( p + 3 ) / 2 ; n i is the number of elements in C i ; L is the likelihood function, L ( ⋅ ) = ∏ f ( ⋅ ) .
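A sketch of the single-cluster BIC of Eq. (12) under the full-covariance Gaussian of Eq. (11), using the maximum likelihood covariance estimate (numerical safeguards for near-singular covariances are omitted):

```python
import numpy as np

def cluster_bic(X):
    # BIC of a single p-variate Gaussian fitted by maximum likelihood
    # to the cluster data X (Eqs. (11)-(12)); q = p(p + 3)/2 parameters.
    n, p = X.shape
    mu = X.mean(axis=0)
    V = np.atleast_2d(np.cov(X, rowvar=False, bias=True))  # ML covariance
    _, logdet = np.linalg.slogdet(V)
    diff = X - mu
    quad = np.einsum('ij,jk,ik->i', diff, np.linalg.inv(V), diff)
    loglik = -0.5 * (n * p * np.log(2 * np.pi) + n * logdet + quad.sum())
    q = p * (p + 3) / 2
    return -2.0 * loglik + q * np.log(n)
```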

Step 6:

We assume p-dimensional normal distributions with parameters θ i ( 1 ) , θ i ( 2 ) for C i 1 , C i 2 respectively. The probability density function of this 2-division model becomes

g ( θ i ( 1 ) , θ i ( 2 ) ; x ) = α i [ f ( θ i ( 1 ) ; x ) ] δ i [ f ( θ i ( 2 ) ; x ) ] 1 − δ i , (13)

where

δ i = { 1 , if x is included in C i 1 0 , if x is included in C i 2 (14)

x will be included in either C i 1 or C i 2 ; α i is a constant which lets Equation (13) be a probability density function ( 1 / 2 ≤ α i ≤ 1 ). We approximate α i as follows:

α i = 0.5 / K ( β i ) , (15)

where β i is a normalized distance between the two clusters, shown by

β i = ‖ μ 1 − μ 2 ‖ / √( | V 1 | + | V 2 | ) , (16)

K ( ⋅ ) stands for the lower (cumulative) probability of the standard normal distribution. The BIC for this model is

B I C ′ = − 2 log L ′ ( θ ′ i ; x i ∈ C i ) + q ′ log n i (17)

where θ ′ i = [ θ i ( 1 ) , θ i ( 2 ) ] is the maximum likelihood estimate of the 2-division model; since there are two sets of mean and covariance parameters for the p variables, the number of parameters becomes q ′ = p ( p + 3 ) . L ′ is the likelihood function, L ′ ( ⋅ ) = ∏ g ( ⋅ ) .

Step 7:

If B I C > B I C ′ , we prefer the two-divide model, and decide to continue the division; we set C i ← C i 1 . As for C i 2 , we push the p-dimensional data, the cluster centers, the log likelihood and the BIC onto the stack. We return to Step 4.

Step 8:

If B I C ≤ B I C ′ , we prefer not to divide the cluster any further, and decide to stop. We extract the stacked data, which was stored in Step 7, set C i ← C i 2 , and return to Step 4. If the stack is empty, we go to Step 9.

Step 9:

The 2-division procedure for C i is completed. We renumber the cluster identification such that it becomes unique in C i .

Step 10:

The two-division procedure for initial k = 2 divided clusters is completed. We renumber all cluster identifications such that they become unique.

Step 11:

We note the outputs of the cluster identification, the center of each cluster, the log likelihood of each cluster, and the number of elements in each cluster.
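The full Steps 1-11 maintain an explicit stack and the two-component density of Eq. (13). As a simplified sketch of the x-means++ idea only, the recursion below 2-splits each cluster with k-means++ seeding (SciPy's `kmeans2` with `minit='++'`) and accepts a split only when the children's summed BIC improves on the parent's; a spherical-Gaussian BIC stands in for the full Eq. (12):

```python
import numpy as np
from scipy.cluster.vq import kmeans2

def xmeans_pp(X, min_size=4, seed=0):
    # Recursively 2-split clusters with k-means++ seeding and accept a
    # split only when the children's summed BIC beats the parent's BIC.
    labels = np.zeros(len(X), dtype=int)
    next_id = [1]

    def bic_one(Y):
        # Simplified spherical-Gaussian BIC (stand-in for Eq. (12))
        n, p = Y.shape
        var = np.mean(np.sum((Y - Y.mean(axis=0)) ** 2, axis=1)) / p + 1e-12
        loglik = -0.5 * n * p * (np.log(2 * np.pi * var) + 1.0)
        return -2.0 * loglik + (p + 1) * np.log(n)

    def split(idx):
        Y = X[idx]
        if len(idx) < min_size or np.allclose(Y, Y[0]):
            return                               # too small or degenerate
        _, lab = kmeans2(Y, 2, minit='++', seed=seed)
        a, b = idx[lab == 0], idx[lab == 1]
        if min(len(a), len(b)) < 2:
            return
        if bic_one(X[a]) + bic_one(X[b]) < bic_one(Y):
            labels[b] = next_id[0]               # accept: open a new cluster
            next_id[0] += 1
            split(a)
            split(b)

    split(np.arange(len(X)))
    return labels
```

On two well-separated groups of points this returns two cluster labels and then stops, because further splits no longer improve the BIC.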

This section describes the empirical study with real market data.

We perform empirical analysis using equity and bond futures price data. The indices we use in this study are summarized in

We compare risk parity (RP) [

Investment assets | |
---|---|
Equity future (15 assets) | S&P500 (SP), NASDAQ (NQ), CA (PT), GB (Z), FR (CF), DE (GX), EU (VG), ES (IB), NL (EO), CH (SM), NIKKEI (NK), TOPIX (TP), HK (HI), AU (XP), SG (QZ) |
Bond future (12 assets) | US2Y (TU), US5Y (FV), US10Y (TY), US20Y (US), AU3Y (YM), AU10Y (XM), CA10Y (CN), DE2Y (DU), DE5Y (OE), DE10Y (RX), GB10Y (G), JP10Y (JB) |

^{a}Words in parentheses denote tickers.

Performance Statistics | SP | NQ | PT | Z | CF | GX | VG | IB |
---|---|---|---|---|---|---|---|---|
Return (%, Ann) | 5.32 | 8.53 | 4.49 | 3.13 | 2.90 | 3.34 | 2.89 | 2.37 |
Risk (%, Ann) | 19.34 | 26.30 | 18.70 | 18.42 | 22.38 | 22.86 | 23.42 | 23.04 |
R/R | 0.28 | 0.32 | 0.24 | 0.17 | 0.13 | 0.15 | 0.12 | 0.10 |

Performance Statistics | EO | SM | NK | TP | HI | XP | QZ |
---|---|---|---|---|---|---|---|
Return (%, Ann) | 2.97 | 5.38 | 4.85 | 3.74 | 6.22 | 5.08 | 4.80 |
Risk (%, Ann) | 22.02 | 18.36 | 23.87 | 22.58 | 23.05 | 16.34 | 19.51 |
R/R | 0.13 | 0.29 | 0.20 | 0.17 | 0.27 | 0.31 | 0.25 |

Performance Statistics | TU | FV | TY | US | YM | XM | CN | DU |
---|---|---|---|---|---|---|---|---|
Return (%, Ann) | 1.33 | 3.03 | 4.31 | 5.76 | 2.80 | 2.80 | 3.92 | 0.83 |
Risk (%, Ann) | 1.42 | 3.72 | 5.75 | 9.88 | 7.17 | 7.17 | 5.41 | 1.15 |
R/R | 0.94 | 0.82 | 0.75 | 0.58 | 0.39 | 0.39 | 0.72 | 0.72 |

Performance Statistics | OE | RX | G | JB |
---|---|---|---|---|
Return (%, Ann) | 2.50 | 4.28 | 3.44 | 1.93 |
Risk (%, Ann) | 3.03 | 5.17 | 5.79 | 2.95 |
R/R | 0.82 | 0.83 | 0.59 | 0.65 |

First, we estimate the covariance matrix and perform the clustering methods using 250 days of asset return data. Then, we construct each portfolio every 20 business days. Our simulation period is from April 2001 to May 2020.

To evaluate an investment strategy, we use the following measures, which are widely used in finance. Returns are annualized, risk is calculated as the standard deviation of returns, and R/R stands for the return/risk ratio. In this paper, each portfolio has a different risk level because we utilize a wide range of assets with various risk levels. We therefore consider R/R, which measures the efficiency of portfolio performance, a more appropriate measure for performance evaluation in this study than return alone.

Return = ( 250 / T ) ∑ t = 1 T r t (18)

Risk = √( ( 250 / ( T − 1 ) ) ∑ t = 1 T ( r t − μ ) 2 ) (19)

R / R = Return / Risk (20)

maxDD = min k ∈ [ 1 , T ] ( 0 , W k / max j ∈ [ 1 , k ] W j − 1 ) (21)

Here, r t denotes the portfolio return at time t, μ denotes average of r t and W k denotes the wealth of portfolio at time k.
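The measures (18)-(21) can be computed from a daily return series as follows (a sketch; 250 is the annualization factor used in the text, and the wealth path is compounded from 1):

```python
import numpy as np

def performance_stats(r):
    # Annualized return, risk, R/R and maximum drawdown of a daily
    # portfolio return series r (Eqs. (18)-(21)), with 250 days per year.
    r = np.asarray(r, dtype=float)
    T = len(r)
    ann_return = 250.0 / T * r.sum()
    ann_risk = np.sqrt(250.0 / (T - 1) * np.sum((r - r.mean()) ** 2))
    wealth = np.cumprod(1.0 + r)                         # W_k
    drawdown = wealth / np.maximum.accumulate(wealth) - 1.0
    return ann_return, ann_risk, ann_return / ann_risk, min(0.0, drawdown.min())
```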

Performance Measures | RP | HRP | CRP2 | CRP3 | CRP4 | CRP5 | CRP6 | CRP7 | CRP8 | XRP |
---|---|---|---|---|---|---|---|---|---|---|
All Period (from Apr. 2001 to May 2020) | | | | | | | | | | |
Return (%, Ann) | 2.31 | 1.22 | 1.05 | 2.48 | 2.50 | 2.67 | 2.60 | 3.31 | 2.71 | 3.06 |
Risk (%, Ann) | 2.04 | 1.35 | 3.59 | 2.65 | 2.78 | 3.38 | 2.48 | 2.83 | 3.20 | 2.25 |
R/R | 1.14 | 0.90 | 0.29 | 0.93 | 0.90 | 0.79 | 1.05 | 1.17 | 0.85 | 1.36 |
maxDD (%) | 7.24 | 3.93 | 18.07 | 11.20 | 9.96 | 8.66 | 8.48 | 7.48 | 9.23 | 7.07 |
First Half Period (from Apr. 2001 to Apr. 2010) | | | | | | | | | | |
Return (%, Ann) | 2.28 | 1.74 | −0.06 | 2.21 | 2.42 | 2.31 | 2.30 | 2.34 | 2.81 | 2.91 |
Risk (%, Ann) | 2.42 | 1.82 | 4.26 | 3.14 | 3.54 | 3.23 | 2.83 | 3.00 | 3.95 | 2.62 |
R/R | 0.94 | 0.95 | −0.01 | 0.71 | 0.68 | 0.71 | 0.81 | 0.78 | 0.71 | 1.11 |
maxDD (%) | 7.24 | 3.93 | 18.07 | 11.20 | 9.96 | 8.66 | 8.48 | 7.48 | 9.23 | 7.07 |
Second Half Period (from May 2010 to May 2020) | | | | | | | | | | |
Return (%, Ann) | 2.34 | 0.75 | 2.04 | 2.71 | 2.58 | 2.99 | 2.86 | 4.17 | 2.62 | 3.18 |
Risk (%, Ann) | 1.61 | 0.68 | 2.86 | 2.12 | 1.85 | 3.50 | 2.11 | 2.67 | 2.32 | 1.85 |
R/R | 1.45 | 1.10 | 0.71 | 1.28 | 1.39 | 0.86 | 1.36 | 1.56 | 1.13 | 1.72 |
maxDD (%) | −4.79 | −1.23 | −8.11 | −5.68 | −5.59 | −8.86 | −5.71 | −5.81 | −5.37 | −4.23 |

In terms of R/R over the whole period, XRP is the most efficient among RP, HRP, and all CRPs. In addition, the return of XRP is higher than that of all other methods except CRP7. Also, the maxDD of XRP is smaller than that of all other methods except HRP. Our results show that XRP has the best overall performance. XRP also gives the best R/R and the second-best maxDD in both the first half and the second half.

Our study makes the following contributions:

· We propose a non-hierarchical clustering risk parity strategy in which the risk contributions are equal both across clusters and within each cluster.

· We also propose the x-means++ algorithm, which combines the k-means++ algorithm with the x-means algorithm to ensure the robustness of clustering.

· Empirical analysis shows that the portfolio constructed by our proposed approach, which equalizes the risk contribution from each risk source, outperforms risk parity strategies and hierarchical clustering risk parity strategies.

Our future tasks are to perform empirical analysis using larger datasets, such as individual stocks, to verify the robustness of our proposed strategy, and to apply our method to a complex-valued risk diversification strategy.

The authors declare no conflicts of interest regarding the publication of this paper.

Nakagawa, K., Kawahara, T. and Ito, A. (2020) Asset Allocation Strategy with Non-Hierarchical Clustering Risk Parity Portfolio. Journal of Mathematical Finance, 10, 513-524. https://doi.org/10.4236/jmf.2020.104031

R: Return (random variable) vector

μ : Vector of expected returns

Σ : Covariance matrix

w: Weight vector of portfolio

σ P : Portfolio risk

MRC: Marginal Risk Contribution

RC: Risk Contribution

C i : Center point in cluster i

k: Number of clusters

n i : Number of elements in C i

d ( ⋅ ) : Distance function

f ( x ; θ i ) : p-dimensional normal distribution for the data x and parameter θ i