
In this paper, a long-short beta-neutral portfolio strategy is proposed based on earnings-yield forecasts, where positions are adjusted over time through a risk-budgeting scheme driven by an appropriate market-integration measure. In contrast to previous works, which primarily rely on standard principal component analysis (PCA), here we exploit the advantages of a probabilistic PCA (PPCA) framework to extract the factors used to design an efficient integration measure, as well as to relate these factors to an asset-pricing model. Our experimental evaluation on a dataset of 12 developed equity market indexes reveals improvements of our proposed approach, in terms of an increased representation capability of the underlying principal factors, along with increased robustness to noisy and/or missing data in the original dataset.

Markets constitute a highly dynamic, evolving universe, which passes through distinct time periods and reacts to diverse conditions and phenomena; monitoring the opportunities that appear for investors during these periods is therefore of significant importance. One approach to taking this behavior into account in asset management is to exploit the importance of the market factor in explaining expected returns. More specifically, the part of the variance related to the global market component of risk can be considered a proxy of market integration. In this context, integration implies that all barriers are eliminated and, therefore, the risk premia associated with global factors are identical in all such markets.

The construction of optimal portfolios that accounts for diversification not only across assets but also across time has attracted the interest of the research community over the last decades [1-4]. In a recent work [

By adopting the same assumptions as in [

The developed strategy is based on the intuition that if global factors strongly determine the excess returns, then fewer alpha opportunities are left for long-short investors. It is then straightforward to use market integration as a guide for risk-budgeting decisions. Indeed, one would expect an investment manager to allocate more risk to those decisions he is most confident in, rather than to those he feels less certain about.

The standard principal component analysis (PCA) was employed in [

The rest of the paper is organized as follows: Section 2 introduces the main principles of PPCA and analyzes in detail the proposed methodology for portfolio optimization. In Section 3, an experimental evaluation of the proposed approach is performed on a set of 12 developed equity markets, while Section 4 concludes and gives directions for further extensions.

Market integration is a fundamental concept in financial economics [8,9]. For investors in particular, integration provides a broader range of available assets and lower risk premia, but at the cost of fewer diversification opportunities. Markets are internationally integrated if the reward for risk is identical regardless of the market in which one trades. Previous works [10,11] exploited an increase in the common component of equity-returns data as an indication of higher integration, although such approaches do not measure market integration in the strict sense.

Motivated by [

In the subsequent analysis, we consider an ensemble of M time-series, each with N observations, representing equity returns. This ensemble is assumed to follow a linear multiple-factor structure:

r_{m,t} = μ_{m} + Σ_{l=1}^{L} b_{m,l} f_{l,t} + ε_{m,t},   (1)

where L is the number of common factors, which capture the systematic component of risk, μ_{m} denotes the expected return on the m-th asset, f_{l,t} is the realization at time t of the l-th common factor, b_{m,l} is the sensitivity of the m-th asset to the movements of the l-th factor, and ε_{m,t} denotes the noise term. The assumption of a linear multiple-factor structure implies that the factors can be estimated by employing a principal component analysis, in our case within a probabilistic framework.
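As a toy illustration (not from the paper; all dimensions and noise levels are hypothetical), returns following this linear multiple-factor structure can be simulated as follows:

```python
import numpy as np

rng = np.random.default_rng(0)

M, N, L = 12, 250, 3                       # assets, observations, common factors
mu = rng.normal(0.0005, 0.0002, size=M)    # expected returns mu_m
B = rng.normal(0.0, 1.0, size=(M, L))      # sensitivities b_{m,l}
F = rng.normal(0.0, 0.01, size=(N, L))     # factor realizations f_{l,t}
eps = rng.normal(0.0, 0.005, size=(N, M))  # idiosyncratic noise eps_{m,t}

# r_{m,t} = mu_m + sum_l b_{m,l} f_{l,t} + eps_{m,t}
R = mu + F @ B.T + eps
```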

In this section, the main principles of PPCA are introduced in brief. A remarkable feature of the typical PCA is the absence of an associated probabilistic model for the observed data. A probabilistic formulation of PCA is obtained from a Gaussian latent-variable model, with the principal axes emerging as maximum-likelihood (ML) estimates. Moreover, the latent-variable formulation results naturally in an iterative, and computationally efficient, expectation-maximization (EM) algorithm for performing PPCA.

A further motivation is that PPCA offers some additional practical advantages: a) the probabilistic model offers the potential to extend the scope of standard PCA; for instance, multiple PPCA models may be combined as a probabilistic mixture, increasing the explanatory power of the principal factors, while PPCA projections can also be obtained in the case of missing observations; and b) along with its use as a dimensionality-reduction technique, PPCA can be employed as a general Gaussian density model. The benefit of doing so is that ML estimates for the parameters associated with the covariance matrix can be computed efficiently from the data's principal components. Potential applications include the detection and classification of abnormal changes, which may further serve as an alerting mechanism for an investment manager. This latter direction is left for a separate, thorough study.

Let X be the N × M data matrix whose columns are the observed time-series. A latent-variable model aims to relate an M-dimensional observation vector (a row of X) to a corresponding K-dimensional vector of latent (or unobserved) variables (a row of Z) as follows:

X = Z W^{⊤} + 1 μ^{⊤} + E,   (2)

where Z is the N × K matrix of latent variables, W denotes a linear mapping between the original space and the space of latent variables, 1 is a vector of all ones, μ is an M-dimensional parameter vector, which permits the model to have nonzero mean, and each row of E is an M-dimensional vector, which stands for the measurement error or the noise corrupting the observations.

In the following, let x_{n} and z_{n} denote an arbitrary row of the matrices X and Z, respectively. Working in a probabilistic framework, we make the following assumptions for the parameters contained in Equation (2):

• The latent variables are independent and identically distributed (i.i.d.) Gaussians with unit variance, that is, z_{n} ∼ N(0, I_{K}), where I_{K} is the K × K identity matrix.

• The error (or noise) model is isotropic Gaussian, that is, ε_{n} ∼ N(0, σ^{2} I_{M}).

• By combining the above two assumptions with Equation (2), a Gaussian distribution is also induced for the observations, namely, x_{n} ∼ N(μ, C), where the observation covariance model is given by C = W W^{⊤} + σ^{2} I_{M}.

From the above we deduce that the model parameters can be determined by ML via an iterative procedure. We also emphasize that the subspace defined by the ML estimates of the columns of W will not, in general, correspond to the principal subspace of the observed data.

The isotropic Gaussian noise model, in conjunction with Equation (2), yields the following conditional probability distribution of x_{n} given z_{n}:

p(x_{n} | z_{n}) = N(W z_{n} + μ, σ^{2} I_{M}).   (3)

Then, the associated log-likelihood function is given by

ℒ = −(N/2) { M ln(2π) + ln|C| + tr(C^{−1} S) },   (4)

where |·| and tr(·) denote the determinant and the trace, respectively, of a matrix, and the sample covariance matrix is given by

S = (1/N) Σ_{n=1}^{N} (x_{n} − μ)(x_{n} − μ)^{⊤},   (5)

with x_{n} denoting the n-th observation (row of X). Notice also that the ML estimate for μ is given by the sample mean of the data. Finally, the estimates for the probabilistic principal components, that is, the columns of W, and the noise variance σ^{2} are obtained by maximization of Equation (4):

W_{ML} = U_{K} (Λ_{K} − σ^{2} I_{K})^{1/2} R,   (6)

σ^{2}_{ML} = (1/(M − K)) Σ_{j=K+1}^{M} λ_{j},   (7)

where the K columns of the matrix U_{K} are the principal eigenvectors of S, with corresponding eigenvalues λ_{1}, …, λ_{K} constituting the diagonal matrix Λ_{K}, and R is an arbitrary orthogonal rotation matrix. In practice, R is ignored by simply setting R = I_{K}. Finally, the matrix F whose columns are the probabilistic principal factors is simply obtained by projecting the data matrix onto the probabilistic principal components, that is,

F = (X − 1 μ^{⊤}) W_{ML}.   (8)
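These closed-form ML estimates (eigendecomposition of the sample covariance, with the noise variance taken as the mean of the discarded eigenvalues) can be sketched in a few lines of Python; this is an illustrative reimplementation on synthetic data, not the authors' code:

```python
import numpy as np

def ppca_ml(X, K):
    """Closed-form PPCA ML estimates (Tipping & Bishop) for an (N, M) data matrix."""
    N, M = X.shape
    mu = X.mean(axis=0)                             # ML mean = sample mean
    Xc = X - mu
    S = (Xc.T @ Xc) / N                             # sample covariance matrix
    eigval, eigvec = np.linalg.eigh(S)              # ascending eigenvalues
    eigval, eigvec = eigval[::-1], eigvec[:, ::-1]  # reorder to descending
    sigma2 = eigval[K:].mean()                      # noise variance: mean of discarded eigenvalues
    W = eigvec[:, :K] * np.sqrt(np.maximum(eigval[:K] - sigma2, 0.0))  # rotation R = I
    return W, mu, sigma2

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 12))
W, mu, sigma2 = ppca_ml(X, K=3)
F = (X - mu) @ W                                    # probabilistic principal factors
```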

One of the main aspects of the proposed approach is the time-varying management of risk, through an integration measure based on second-order statistics, which is applied on overlapping time windows. More specifically, let h denote the window length and s the step size. Then, the integration measure we use at time t, G_{t}, is defined as

G_{t} = (1/M) Σ_{m=1}^{M} ρ^{2}(f_{1,t}, r_{m,t}),   (9)

where ρ^{2}(·,·) is the squared correlation between two variables, f_{1,t} denotes the first principal factor extracted by applying PPCA in the time window [t − h, t], and r_{m,t} is the vector of returns of the m-th asset during the time interval [t − h, t]. The integration measure is computed by rolling the time window every s time steps, where the time-scale depends on our specific needs (e.g., daily, weekly, monthly). We interpret periods when the returns are highly correlated with the first probabilistic principal component as periods of high integration. We emphasize again that measuring and monitoring integration is crucial in the pure-alpha framework, since a period of high integration calls for decisions with decreased risk. Moreover, since our data contain only equity indexes, we can use this measure to assess integration, assuming that equity markets are positively correlated most of the time.
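A minimal sketch of the computation of G_{t} on one window (PPCA with a single factor; the helper names and the synthetic data are illustrative). In the full strategy, the same computation is repeated on windows [t − h, t] rolled every s observations:

```python
import numpy as np

def first_factor(Rw):
    """First probabilistic principal factor of a window of returns Rw: (h, M)."""
    Xc = Rw - Rw.mean(axis=0)
    S = (Xc.T @ Xc) / len(Rw)
    eigval, eigvec = np.linalg.eigh(S)   # ascending order
    sigma2 = eigval[:-1].mean()          # noise: mean of discarded eigenvalues (K = 1)
    w1 = eigvec[:, -1] * np.sqrt(max(eigval[-1] - sigma2, 0.0))
    return Xc @ w1

def integration(Rw):
    """G_t: mean squared correlation of each asset with the first factor."""
    f1 = first_factor(Rw)
    rho = np.array([np.corrcoef(f1, Rw[:, m])[0, 1] for m in range(Rw.shape[1])])
    return float(np.mean(rho ** 2))

# toy, strongly integrated window: every asset loads heavily on one common shock
rng = np.random.default_rng(2)
mkt = rng.normal(size=250)
Rw = 0.8 * mkt[:, None] + 0.2 * rng.normal(size=(250, 12))
G = integration(Rw)   # close to 1 for strongly integrated markets
```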

The construction of an optimal portfolio entails a mean-variance optimization over an ensemble of assets' expected returns and a covariance matrix. For this purpose, accurate predictions of returns and covariances are required, with the former, that is, the forecast of returns, being the most challenging part.

Concerning the estimation of the sample covariance matrix, an initial estimate is computed for each time window [t − h, t], which is then corrected using shrinkage (Stein’s estimator) [13,14] to improve the out-of-sample performance and reduce the estimation risk.
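One simple form of such a correction shrinks the windowed sample covariance toward a scaled-identity target; the sketch below uses a fixed illustrative intensity delta, whereas a Stein-type estimator, as in the paper, would select it from the data:

```python
import numpy as np

def shrunk_covariance(R, delta=0.2):
    """Convex combination of the sample covariance and a scaled-identity target.
    delta = 0 keeps the sample estimate; delta = 1 keeps the target."""
    Xc = R - R.mean(axis=0)
    S = (Xc.T @ Xc) / (len(R) - 1)
    target = (np.trace(S) / S.shape[0]) * np.eye(S.shape[0])
    return (1.0 - delta) * S + delta * target

rng = np.random.default_rng(3)
R = rng.normal(size=(250, 12))   # one window of returns (synthetic)
Sigma = shrunk_covariance(R)     # better-conditioned than the raw estimate
```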

Market-neutral portfolios are portfolios uncorrelated with the market that deliver positive returns. This is only possible through the use of short sales; the optimization program returns the long and short positions, with the additional property of being beta neutral. Among the advantages of market-neutral investing is its ability to enhance the performance of any index portfolio by combining an alpha strategy with an equity index. Our goal is to build a zero-beta portfolio by determining optimal long and short positions, under the constraint that the portfolio’s beta equals zero.

According to modern portfolio theory, an investor is interested in maximizing the utility U(w) associated with a portfolio, with w being an M-dimensional vector of weights (one weight per asset). The vector of optimal weights is obtained by solving the following optimization problem:

w^{*} = argmax_{w} { w^{⊤} μ − λ w^{⊤} Σ̂ w },  subject to  w^{⊤} Σ̂ w ≤ σ_{*}^{2},   (11)

where w^{⊤} μ is the expected return of our portfolio, with μ being the vector of expected excess returns of the assets, w^{⊤} Σ̂ w is the portfolio’s variance, with Σ̂ being the estimated covariance matrix of the returns, and σ_{*}^{2} is the desired portfolio variance. In Equation (11), λ is a regularization parameter representing the investor’s level of risk aversion. In order to eliminate the exposure to the market or main factor, a beta constraint should be introduced:

β^{⊤} w = β_{p} = 0,   (12)

where β is the M-dimensional vector of the assets’ beta values, calculated with an equally weighted index (EWI) as the market, and β_{p} is the portfolio’s beta. The zero-portfolio-beta constraint yields an optimal portfolio which is not exposed to market risk. Motivated by [

In addition, the EWI assumption results in the following simple expression for the vector of assets’ betas:

β = (M Σ̂ 1) / (1^{⊤} Σ̂ 1),

where 1 is the M-dimensional vector of all ones.

We modify the approach introduced by [

In the case of a minimum variance portfolio, the objective function is modified such that only the risk part is optimized. The optimization problem in this case is then:

w^{*} = argmin_{w} w^{⊤} Σ̂ w,  subject to  β^{⊤} w = 0.   (13)
As we mentioned above, we expect that the assets selected in the optimal minimum-variance portfolios correspond to those most related to the smallest factors; in the same way, the market-neutral portfolios will be composed of assets not highly related to the main factor, and therefore related to factors of lower risk. The construction of portfolios with minimum variance is therefore an indirect way to build an alpha strategy.
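For concreteness, the zero-beta mean-variance problem admits a closed-form solution via a Lagrange multiplier; the sketch below (illustrative, with synthetic inputs and betas computed against an equally weighted index) is one way to obtain it:

```python
import numpy as np

def beta_neutral_weights(mu, Sigma, beta, lam=1.0):
    """Maximize mu'w - lam * w'Sigma w subject to beta'w = 0 (Lagrange multiplier)."""
    Sinv = np.linalg.inv(Sigma)
    gamma = (beta @ Sinv @ mu) / (beta @ Sinv @ beta)  # multiplier enforcing zero beta
    return Sinv @ (mu - gamma * beta) / (2.0 * lam)

rng = np.random.default_rng(4)
M = 12
A = rng.normal(size=(M, M))
Sigma = A @ A.T / M + np.eye(M)                    # well-conditioned toy covariance
mu = rng.normal(0.0, 0.05, size=M)                 # toy expected excess returns
ones = np.ones(M)
beta = M * (Sigma @ ones) / (ones @ Sigma @ ones)  # betas against an equally weighted index
w = beta_neutral_weights(mu, Sigma, beta)
# beta @ w is numerically zero: the portfolio carries no market exposure
```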

Having calculated the optimal weights (long and short positions) by solving the optimization problems expressed by Equations (11) and (13), the second stage aims at adapting these weights so as to keep a constant risk throughout the strategy, or to achieve a time-varying risk allocation. Two adjustments are considered: 1) reduction of the exposure to the market and enforcement of constant portfolio volatility for the initial strategy, and 2) adaptation of the corresponding weights by employing the values of the computed integration measure G_{t} for the scaled strategies.

As mentioned before, one of the main purposes of monitoring the degree of integration through time is to warn investors of periods of high integration, where fewer opportunities are left to generate alpha and, thus, more conservative decisions should be taken.

The approach introduced in [

More specifically, the integration measure G_{t}, given by Equation (9), is employed to decide on the amount of risk that can be taken at each period of time. The rule we propose for the adaptation of the weights is as follows: for small integration values, the portfolio is allowed its maximum risk exposure, which is equivalent to using the optimal weights obtained from the solution of the optimization problem defined by Equation (11). For high integration values, on the other hand, the risk is reduced by shrinking the weight of every position proportionally to the integration value.

The vector of optimal weights is scaled by an appropriate scaling factor γ_{t}, resulting in the following scaled weights:

w̃_{t} = γ_{t} w_{t}^{*},  with  γ_{t} = 1 if Ĝ_{t} ≤ G_{low},  γ_{t} = (G_{up} − Ĝ_{t}) / (G_{up} − G_{low}) if G_{low} < Ĝ_{t} < G_{up},  γ_{t} = 0 if Ĝ_{t} ≥ G_{up},   (14)

where Ĝ_{t} is the integration value at time t after removing the mean computed over all the overlapping windows, so that the same thresholds can be used for the whole time period of study. The lower and upper bounds of the integration, G_{low} and G_{up}, respectively, are thresholds which specify when the integration is low, and therefore higher possibilities to generate alpha exist, and when the integration reaches an upper limit, above which there are very few chances to generate alpha. Equation (14) implies that the optimal weights are modified linearly for integration values in the interval [G_{low}, G_{up}], while for integration values larger than G_{up} the weights are set equal to zero, and for integration values smaller than G_{low} the optimal weights are left unchanged. In practice, the values of the two thresholds G_{low} and G_{up} can be set based on the quantiles of the distribution of the historical integration measure.
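The piecewise-linear rule of Equation (14) amounts to the following small function (threshold values and weights below are illustrative):

```python
import numpy as np

def scaling_factor(G, G_low, G_up):
    """Risk-budgeting factor: 1 below G_low, 0 above G_up, linear in between."""
    if G <= G_low:
        return 1.0
    if G >= G_up:
        return 0.0
    return (G_up - G) / (G_up - G_low)

# thresholds from historical quantiles of a (toy) integration series
rng = np.random.default_rng(5)
hist = rng.uniform(0.3, 0.8, size=200)
G_low, G_up = np.quantile(hist, 0.2), np.quantile(hist, 0.8)

w_opt = np.array([0.10, -0.20, 0.05])  # toy optimal long-short weights
w_scaled = scaling_factor(0.5 * (G_low + G_up), G_low, G_up) * w_opt  # halfway -> half risk
```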

The proposed PPCA-based approach is now used to measure integration and build alpha-generating portfolio strategies for a set of financial data. We analyzed a group of 12 developed equity markets (Australia, Canada, France, Germany, Hong Kong, Japan, Singapore, Spain, Sweden, Switzerland, United Kingdom, and USA), for which liquid index futures contracts are available, in order to enable short positions in the portfolio strategies.

Closing prices at a daily frequency for the main futures index of each country have been collected, expressed in local currency, covering the period between January 2001 and January 2013. The use of data in local currencies can be advantageous in terms of diversification of international portfolios; moreover, any undesired exposure to a specific currency can be hedged using several methods, which need not be considered in this study.

During the selected time period, all markets went through the two main market crises of recent years: the IT bubble and the subprime and sovereign-debt crises. Both crises were followed by a recovery period, thus offering a good opportunity to study integration and long-term portfolio strategies. We note also that, in order to emphasize the short-run movements of the data, the relative change between consecutive time instants is used. This can be measured by computing the first difference of the natural logarithm (dlog) of the time-series samples. Thus, a preprocessing step is applied on the original ensemble of the M time-series as follows:

y_{m,t} = ln(x_{m,t}) − ln(x_{m,t−1}).

Moreover, to cope with significantly different variances across series, or series expressed in different units (as is the case in our dataset with the different currencies), a further normalization of the dlog time-series to zero mean and unit variance is performed. The performance of the proposed method is evaluated for a window length h = 250 and step size s = 25.
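The two preprocessing steps (dlog, then per-series standardization) can be sketched as follows, on synthetic price paths rather than the paper's data:

```python
import numpy as np

def preprocess(prices):
    """dlog transform followed by z-scoring each series."""
    dlog = np.diff(np.log(prices), axis=0)  # first difference of log prices
    return (dlog - dlog.mean(axis=0)) / dlog.std(axis=0)

rng = np.random.default_rng(6)
# 12 synthetic index-price paths as geometric random walks
prices = 100.0 * np.exp(np.cumsum(rng.normal(0.0, 0.01, size=(3001, 12)), axis=0))
X = preprocess(prices)  # shape (3000, 12), each column zero mean and unit variance
```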

To analyze the risk-adjusted performance of the different portfolio strategies we use three different indicators [

The Sharpe ratio is defined as the ratio of the average excess return of an asset over the risk-free rate to the volatility of this excess return:

SR = (R̄ − R_{f}) / σ,

where R̄ is the average return, R_{f} the risk-free rate, and σ the volatility of the excess returns.

This ratio is the most common risk-adjusted performance measure used in financial studies. Its main drawback is that it is only useful for symmetrical distributions of returns.

The Sortino ratio is a variation of the Sharpe ratio in which the return is adjusted by the downside volatility, that is, the volatility calculated only with negative returns. The Sortino ratio is defined as

Sortino = (R̄ − R_{f}) / σ_{down},

where σ_{down} denotes the downside volatility.

Since it takes into account the skewness of the distribution of returns, it would therefore be a better risk-adjusted performance indicator for payoffs with higher downside risk.

The maximum drawdown, which is defined as the maximum cumulated continuous loss over a given period, measures the degree of extreme losses. We also present the ratio of the maximum drawdown divided by the volatility, which indicates the degree of extreme loss in terms of the standard deviation of the returns. In particular, a lower maximum drawdown could be associated with lower overall risk. Moreover, this ratio standardizes the MDD measure in order to make it possible to compare payoffs with different volatilities.
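The three indicators can be computed as follows (a plain sketch on synthetic daily returns; annualization and risk-free-rate handling are omitted for brevity):

```python
import numpy as np

def sharpe(r, rf=0.0):
    """Average excess return over its volatility."""
    ex = r - rf
    return ex.mean() / ex.std()

def sortino(r, rf=0.0):
    """Average excess return over the downside volatility (negative returns only)."""
    ex = r - rf
    return ex.mean() / ex[ex < 0].std()

def max_drawdown(r):
    """Maximum cumulated continuous loss of the wealth curve."""
    wealth = np.cumprod(1.0 + r)
    peak = np.maximum.accumulate(wealth)
    return float(np.max(1.0 - wealth / peak))

rng = np.random.default_rng(7)
r = rng.normal(0.0005, 0.01, size=2500)  # toy daily strategy returns
sr, so, mdd = sharpe(r), sortino(r), max_drawdown(r)
```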

In this section, the characteristics of the probabilistic principal factors extracted with PPCA are exhibited for the complete dataset.

For a visual inspection of the difference between the principal components extracted with PCA and PPCA,

However, one of the major advantages of PPCA, in contrast to PCA, is its ability to simultaneously estimate the underlying noise variance. This feature can be very important for further processing, such as designing adaptive and more accurate predictors of the local or global trends by taking into account the spurious variations caused by the underlying noise. Indeed, the estimated noise variance varies notably over time, highlighting the significance of its accurate estimation for further actions, such as trend estimation and forecasting. Moreover, interestingly, the noise variance attains its local maximum values in periods close to the periods of crises.

Concerning the evolution of the integration measure G_{t} for PCA and PPCA, for (h, s) = (250, 25): as can be seen, the dominant principal component extracted by both PCA and PPCA results in the same mean squared correlations with the 12 market indexes. This was expected, since from

Although the integration measure defined by Equation (9) results in similar values for both PCA and PPCA, we emphasize once again the superiority of our PPCA-based approach, which is also capable of monitoring the time-varying behavior of the underlying noise component, and is robust against corrupted data, as will be shown experimentally in a subsequent section.

In this section, the efficiency of the proposed optimal portfolio construction method is evaluated. More specifically, the portfolio positions are calculated as the minimum-variance optimal weights. In addition, the risk is controlled after each optimization by adjusting the weights to a target volatility of 10%. As our optimized portfolio has long and short positions, it is possible to decrease (increase) the overall portfolio risk by reducing (enlarging) each position by a given proportion.

In the optimization process, the shrunk covariance matrix Σ̂ is used, obtained from a historical covariance matrix adjusted with a shrinkage method. We build a minimum-variance portfolio at each estimation date with the shrunk covariance matrix estimated from the matrix of excess returns for each window. The proposed strategy has a Sharpe ratio of 0.20, a Sortino ratio of 0.35, and a maximum drawdown of 44.3%. The results are summarized in

Integration is calculated for rolling windows of one-year length (h = 250) every 25 observations. As we mentioned before, we have observed that high integration allows very few opportunities to generate alpha. This is illustrated in

We observe that the strategy suffers during periods of high integration of the equity markets. In addition, integration spikes during periods of crisis, which makes this information useful for adjusting the total risk taken at each date.

We construct a scaling factor for our long-short weights based on the level of integration, in order to structure a downside protection that alerts during periods of high integration, when the market is highly integrated and there are few possibilities of alpha generation. The idea is to transform the integration measure, a variable defined in the range [0,1], where 1 indicates perfect integration and 0 indicates no integration, into a variable defined in the same range but where 1 indicates that the optimal weights from the optimization should be used and 0 indicates that no risk should be taken.

Three simple scaling methods are employed to transform the integration level: 1) a direct transformation of the raw integration series; 2) using Equation (14) for levels of G_{t} between G_{low} and G_{up}, while outside this range the levels are set equal to 0 for G_{t} ≥ G_{up} and equal to 1 for G_{t} ≤ G_{low}; and 3) applying the same rule to the linearly de-trended series of G_{t}. The optimal weights are scaled according to these three methods and the performance of each strategy is computed. For the second scaling method, we set G_{low} to be the 20th percentile of G_{t} observed up to the estimation point and G_{up} to be the 80th percentile, which we consider a high level of integration.

The average value of G_{t} for the whole period of study is 0.56. The minimum for the whole series of observations is 0.43 and the maximum is 0.66. We compare the basic minimum-variance strategy with the different proposed strategies whose risk is adjusted by integration. Sharpe ratios improve from 0.20 to 0.41 on average for the scaled strategies. The Sortino ratio is equal to 0.35 for the basic strategy and does not change on average for the risk-budgeting strategies; however, it differs across the three scaling methods. The ratio of maximum drawdown over volatility varies from 4.34 to 3.80, an improvement of 0.54 standard deviations.

The best of the three scaling methods is shown to be the third one, which improves the Sharpe ratio of the standard minimum variance strategy from 0.20 to 0.55 and the ratio of maximum drawdown over volatility from 4.34 to 3.68, while the size of the extreme loss of the strategy is reduced by 0.66 standard deviations. All the above results are summarized in

It is also important to note that all the scaled strategies improve the risk-adjusted performance indicators. As a general conclusion, the use of time-varying risk budgeting, based on an integration measure expressed in terms of the dominant probabilistic principal factor, improves the payoff of the basic minimum-variance strategy while protecting against losses during certain critical periods.