How External Trends and Internal Components Decomposition Method Improve the Predictability of Financial Time Series?
1. Introduction
Stock market management has gained importance over the past several years; however, researchers have mainly focused on forecasting returns and volatility (Rapach & Zhou, 2013; Pan et al., 2020; Sun & Yu, 2020). In financial economics, the efficient market hypothesis states that future prices cannot be predicted from past prices (Malkiel, 2003). This hypothesis has been repeatedly challenged in different ways (Lee & Lee, 2009; Rossi & Gunardi, 2018), and the dwindling support for it among researchers has encouraged the exploration of the structure of stock returns. Many studies have analysed the relationships between market agents with different tools to extract meaningful information and understand their correlations at both daily (Forbes & Rigobon, 2002) and intraday time scales (Munnix et al., 2010).
Several methods have been developed to account for non-linearity in the analysis of stock return dynamics (Fiedor, 2014c). Most analyses use synchronous correlations of equity returns. They have revealed a common factor that drives returns and have shown that stocks are arranged in groups (Fiedor, 2014b). Chaudhuri (1997) found evidence of a single common trend in stock markets through an empirical investigation. Therefore, many studies have proposed models for its prediction (Yang et al., 2000; Wen et al., 2019). Separating the market’s global trend from the local effects on stock markets has been a crucial problem. It allows distinguishing whether stocks are just following the common trend or are, on the contrary, the source of their own fluctuations.
Analysing stock markets from the point of view of predictability has attracted the attention of many researchers. Scholars focus on testing return predictability for different stock markets (Chen et al., 2010; Bannigidadmath & Narayan, 2016) or examining the robustness of the evidence on stock return predictability (Campbell & Yogo, 2006; Kostakis et al., 2015). A significant number of these studies used approximate entropy to analyse financial time series (Darbellay & Wuertz, 2000; Assaf et al., 2021) because of its suitability for characterising them (Pincus & Kalman, 2004) and its usefulness in quantifying market efficiency in stock and foreign exchange markets (Risso, 2009; Zunino et al., 2009). Entropy is a concept borrowed from mechanics and information theory, and its computation requires an infinite data series. The approximate entropy method was proposed to address this problem with simple computations based on the repetitive patterns of time series fluctuations. To the best of the authors’ knowledge, few publications discuss improving the predictability of returns, so studies that enhance the degree of predictability are needed.
This paper explores the possibility of improving the predictability of financial returns. Unlike previous studies exploring the relationship between their statistical properties and predictability (Duan & Stanley, 2011; Pan et al., 2005), it separates the local effects from the global trend imposed by the market. The decomposition method, which is based on the independent component analysis approach, evaluates the efficiency of local market policies. It offers a better understanding of the system dynamics and hence improves the accuracy of directional return prediction. With a good understanding of stock market patterns, dynamic modeling and decision making may be significantly improved by identifying risks and opportunities. The results show that the degree of predictability of returns improves after decomposition. Incorporating the absolute value of returns in the process can enhance its performance, which makes it a promising approach for improving the accuracy of directional return prediction. Moreover, this study examines the impact of frequency on predictability and finds that high-frequency data are more predictable than daily data.
The remainder of the paper is organized as follows: Section 2 presents the data description and the methods and techniques used in the empirical analysis. Section 3 discusses the empirical results. Finally, Section 4 concludes the work.
2. Data and Methodology
2.1. Data
This study is based on four stock indices from the USA and China to make the results more convincing. The data were downloaded from the Wind Financial Terminal platform. Due to data constraints, the dataset for the S&P 500 (Standard and Poor’s 500) and Nasdaq 100 (National Association of Securities Dealers Automated Quotations) extends from January 5, 2010, to May 28, 2019 (389 and 89 stocks), while that for the SSE Index (Shanghai Stock Exchange) and the SZSE 500 composite index (Shenzhen Stock Exchange) covers the period from January 4, 2000, to May 28, 2019 (315 and 245 stocks). This study also used 1-minute data for the same stocks listed on the S&P 500 to investigate the impact of frequency on predictability. The 1-minute data cover the period from March 27, 2019, to April 5, 2019, a window in which data are consistently available for all stocks, which makes them comparable.
The decomposition method used in this study requires the following data transformation:
(1)
where
and
are the prices at the instants t and
respectively.
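As an illustration of this kind of transformation, the following minimal Python sketch computes returns from a price series. It assumes logarithmic returns over a fixed lag; the function name `log_returns`, the parameter `lag`, and the logarithmic form itself are assumptions made for illustration and are not taken from Equation (1).

```python
import numpy as np


def log_returns(prices, lag=1):
    """Return ln(P(t + lag) / P(t)) for a 1-D series of positive prices.

    Assumption: logarithmic returns; the transformation used in the
    study may differ.
    """
    p = np.asarray(prices, dtype=float)
    if lag < 1 or lag >= p.size:
        raise ValueError("lag must satisfy 1 <= lag < len(prices)")
    # Divide each price by the price observed `lag` steps earlier.
    return np.log(p[lag:] / p[:-lag])
```

A series of T prices yields T - lag returns, so the first `lag` observations have no associated return.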
2.2. Decomposition Method
For a time series of returns
,
and
where i refers to a specific stock, the existing methods for the separation of the internal from the external contributions allow writing the time series in the following way:
(2)
where
represents the impact of the market trend on the stock i and
symbolises the contribution due to purely local factors.
Generally, these methods assume that the local components have a zero average. Under this assumption, de Menezes & Barabasi (2004) proposed a method to separate the internal dynamics, in which the external components are computed as follows:
(3)
where
(4)
(5)
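The separation described above can be sketched in Python as follows. The sketch assumes the form commonly attributed to de Menezes & Barabasi (2004), in which each stock receives a fixed share of the total activity observed at every instant; the function name, the array layout (stocks in rows, time in columns), and the variable names are hypothetical. The share is undefined when the grand total is zero, which can happen with signed returns.

```python
import numpy as np


def split_internal_external(f):
    """Split f[i, t] into external and internal parts.

    Assumed form (not copied from the paper's equations):
      share_i  = sum_t f[i, t] / sum_t sum_j f[j, t]
      ext[i,t] = share_i * sum_j f[j, t]
      int[i,t] = f[i, t] - ext[i, t]
    """
    f = np.asarray(f, dtype=float)
    total_per_time = f.sum(axis=0)   # activity of all stocks at each t
    grand_total = f.sum()            # activity over all stocks and times
    if grand_total == 0.0:
        raise ValueError("grand total is zero; shares are undefined")
    share = f.sum(axis=1) / grand_total
    external = np.outer(share, total_per_time)
    internal = f - external
    return external, internal
```

By construction, the internal part of each stock sums to zero over time, which corresponds to the zero-average assumption stated above.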
This method yields correct outcomes only in specific cases. Therefore, Barthelemy et al. (2010) proposed the external trend and internal components analysis (ETICA) decomposition method, which is based on an independent component analysis approach. The context is essentially the Arbitrage Pricing Theory (APT), in which
is the excessive
. The
’s estimation is not conceptually different from the more established Fama-MacBeth regression techniques widely used for factor extraction. However, the ETICA methodology is an alternative to the Fama-MacBeth approach within the APT context that adds value from a finance perspective. de Menezes & Barabasi (2004) proposed the separation method in which the internal component
has a zero average by definition. Its pricing implications yield the restriction that the elements of the parameter vector
are jointly equal to zero. However, the average of the internal contribution is expected to be non-zero in many cases; hence, this restriction yields incorrect results. The decomposition method assumes that the global trend is independent of the internal contributions, which are themselves required to be independent from one stock to another, so the external components can be written as:
(6)
where
is the collective trend common to all stocks, each of which reacts to it with the prefactor
, so the authors assumed:
(7)
The parameter
is estimated under two scenarios (the average of
and its dispersion). The first one assumes that in the absence of internal contributions:
(8)
where
(9)
or by an alternative assumption:
(10)
where
and
. In this case
and
can be fixed to:
while,
(
is the global normalized pattern). The second scenario assumes the absence of correlation between
’s (
and the temporal average of
’s. Barthelemy et al. used the second scenario to estimate
since the assumption of the absence of the internal contribution leads to incorrect results (Barthelemy et al., 2010). The parameter
is estimated by the slope of an observed linear correlation obtained from the following equation:
(11)
To handle the case of a strong correlation (negative or positive), we propose the following new approach:
(12)
means by definition that there exist a and b such that:
(13)
By replacing
in Equation (11), we get:
(14)
In the absence of any condition on a and
, we cannot separate them from each other [we can get
with a linear regression]. Therefore, to express that the correlation is equal to ±1, we assume that:
(15)
We then get
(16)
and
(17)
After collecting the data, this study applied the ETICA method once the conditions stated in Barthelemy et al. (2010) were satisfied. To check the effect of the
on the results, it considered Equations (10), (15), and (16) and compared the results obtained for different values. Table 1 summarises the intervals for the parameter
using stocks from the S&P 500 (daily data), NASDAQ 100 (3-day data), SSE (weekly data) and SZSE (monthly data) indices when the correlation between
and
is equal to −1, 0 and 1.
This paper extended the ETICA decomposition algorithm and applied it to
instead of
to enhance the predictability. This is important because many studies have suggested that the signs of returns are predictable (Chronopoulos et al., 2018).
Table 1. Intervals for the parameter
.
2.3. Approximate Entropy
The Kolmogorov-Sinai (K-S) entropy algorithm has been shown to work well for real dynamic systems, but even a small amount of noise prevents it from analysing a system’s complexity successfully (Delgado-Bonal & Marshak, 2019). To quantify the concept of changing complexity, Pincus developed a new statistic for experimental data series called “Approximate Entropy” (Pincus, 1991). That study concluded that the application of the K-S entropy was incorrect in some cases, such as in the presence of stochastic components.
To overcome this limitation of the K-S entropy, Pincus formulated the approximate entropy (ApEn) in the same spirit. Because ApEn does not depend on any model, it is suitable for many kinds of data analysis (Delgado-Bonal & Marshak, 2019). It is therefore applicable without any assumption about the data, which is why it is used extensively in different fields. As input, ApEn requires a pair of parameters: the embedding dimension m (a non-negative integer) and the noise filter r (a positive real number).
Given a time series
of length T, he defined the blocks:
(18)
and
(19)
The distance between them is:
(20)
Let the value of
, which measures the number of blocks of consecutive values (with length m) similar to a given block, be equal to:
(21)
The approximate entropy is calculated by:
(22)
where
(23)
where the parameters m and r can be fixed to recommended values. Although the method requires between 10^m and 30^m data points, it can be applied to data where
(Pincus, 1995).
According to Pincus (2008), the properties of approximate entropy make it better suited to analysing financial time series than other entropy measures. Therefore, this study used it to quantify the degree of predictability of the original time series, which represent the return rates of different stock markets at different time scales. Then we compared the results with the ones we obtained using both
and
. The ApEn parameter r was fixed to the recommended value of 0.2 times the standard deviation of the data series under analysis, which the literature considers a standard value (Chou, 2014). The embedding dimension was fixed to the widely validated value m = 2 (Pincus, 2008).
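The computation described in this section can be sketched as follows, using the standard formulation of approximate entropy (Pincus, 1991) with m = 2 and r equal to 0.2 times the standard deviation of the series. The function and variable names are illustrative assumptions, and the pairwise distance array grows quadratically with the series length, so the sketch is intended only for short series.

```python
import numpy as np


def approximate_entropy(u, m=2, r_factor=0.2):
    """Approximate entropy ApEn(m, r) with r = r_factor * std(u).

    Standard definition: ApEn = Phi^m(r) - Phi^(m+1)(r), where Phi^k is
    the average log-fraction of length-k blocks lying within distance r
    (maximum norm) of each block. Self-matches are counted.
    """
    u = np.asarray(u, dtype=float)
    T = u.size
    if T < m + 2:
        raise ValueError("series too short for the chosen m")
    r = r_factor * u.std()

    def phi(k):
        n = T - k + 1
        # All overlapping blocks of k consecutive values.
        blocks = np.array([u[i:i + k] for i in range(n)])
        # Maximum-norm distance between every pair of blocks.
        dist = np.max(np.abs(blocks[:, None, :] - blocks[None, :, :]), axis=2)
        # Fraction of blocks similar to each block (never zero: self-match).
        frac = np.sum(dist <= r, axis=1) / n
        return np.mean(np.log(frac))

    return phi(m) - phi(m + 1)
```

Lower values indicate a more regular, and hence more predictable, series: a strictly alternating series gives a value close to zero, whereas white noise gives a clearly larger one.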
3. Empirical Results
The external trend and internal components analysis decomposition method imposes some conditions on the data. This study used only the stocks that fulfilled the following:
· Internal fluctuations and the global trend are statistically independent.
· From stock to stock, correlations between the local fluctuations are negligible.
After decomposition, one of the most important conditions is that the prefactor
does not vary over time. Its stability was checked and confirmed for all the stocks used in this study (
is the harmless choice to make, as mentioned in Barthelemy et al. (2010)). Once this condition was fulfilled, the approximate entropy technique was applied to different quantities. The results were compared to determine whether
and
have smaller approximate entropy values than
. In this study, the ApEn values of the S&P 500 daily stocks were calculated using three different parameter values from Table 1. The results were similar for the three values, which suggests that the predictability does not depend on this parameter.
Figure 1 presents the kernel density of ApEn rate estimates for both
(solid line) and
(dashed line). It can be seen that, in general, the entropy rates of the
are lower than
’s entropy rates, which means that they are more predictable. These values were calculated using different embedding dimensions to explore the impact of the choice of m (m = 2 or 3, because higher embedding dimensions are rarely used in practice). As shown in Table 2,
and
have smaller approximate entropy estimated averages than
. From this, it can be concluded that both of the quantities
and
are more predictable than
. Additionally, the embedding dimension m affects the results (the ApEn values change slightly in Table 2).
As stated earlier, ApEn was applied to data from the NASDAQ 100, SSE and SZSE 500 composite indices (m = 2, r = 0.2). Figure 2 presents some of their kernel densities and Table 3 shows the estimated averages.
Figure 1. Kernel density for S&P 500 daily entropy rates (ApEn).
Table 2. Estimated ApEn averages for different embedding dimensions.
Table 3. Estimated ApEn averages of data from Nasdaq 100 (3-day data), SSE (weekly data) and SZSE composite (monthly data).
As shown in Table 3, the approximate entropy averages of
are smaller than
. Figure 2 shows the kernel density of ApEn rate estimates for both
(represented by the dashed line) and
(represented by solid line). As shown in the graph, the entropy rates of the
are, generally, lower than
’s entropy rates, which means that they are more predictable. The outcomes in Table 3 and Figure 2 show that the degree of predictability of the internal components has improved.
It has been mentioned above that the
’s are stable, which means that each stock reacts similarly to the common collective trend
over time. Therefore, it is reasonable to obtain equal approximate entropy values for the external parts of all the stocks belonging to the same stock market. As a result, their estimated averages for the same stock market indices mentioned earlier are equal to 1.54, 1.27, and 0.38, which are smaller than 1.56, 1.36, and 0.58, respectively (the estimated averages of
’s rates). Based on these results showing that both
and
are more predictable than
, it is possible to conclude that decomposing financial time series via ETICA is effective in decreasing the ApEn and, hence, improving the degree of predictability.
Figure 2. Kernel density for weekly and monthly entropy rates of the SSE index and SZSE 500 composite index, respectively.
The mixed empirical evidence on return predictability has led many studies to suggest that it is better to predict the sign of returns instead. Christoffersen & Diebold (2006) developed this area in their theoretical work and demonstrated that the sign of returns is predictable. Their model has been extended to offer investors the highest gains (Chronopoulos et al., 2018). Therefore, this study considered the consequences of decomposing the absolute value of returns instead of the returns. ETICA was applied to
(to obtain int abs and ext abs).
Figure 3 and Figure 4 show the number of stocks having improved predictability after decomposing
and
(for both internal and external components). The data are daily, 3-day, weekly, and monthly data from the S&P 500, NASDAQ 100, SSE and SZSE 500 composite indices. As can be seen from these histograms, the number of such stocks always increased when decomposing
instead of
. Hence, using the absolute value of returns enhanced the ability of the decomposition to improve predictability. For example, decomposing
from daily S&P 500 stocks (389) improved 104 stocks’
and 371
and these numbers become 210 and 381 when we decomposed
. Moreover, using the absolute value of
has affected both the number of stocks and their predictability.
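The counts reported in Figure 3 and Figure 4 amount to comparing, stock by stock, the ApEn of a component with the ApEn of the returns. A minimal sketch of that comparison is given below; the function name is hypothetical and the inputs are assumed to be arrays of per-stock ApEn values in the same stock order.

```python
import numpy as np


def count_improved(apen_returns, apen_component):
    """Count stocks whose component ApEn is strictly below the returns ApEn.

    Both inputs are assumed to hold one ApEn value per stock, aligned
    so that position i refers to the same stock in both arrays.
    """
    apen_returns = np.asarray(apen_returns, dtype=float)
    apen_component = np.asarray(apen_component, dtype=float)
    if apen_returns.shape != apen_component.shape:
        raise ValueError("inputs must have the same shape")
    return int(np.sum(apen_component < apen_returns))
```

Applying the function once to the internal components and once to the external components gives the two counts for a given index and frequency.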
This article also analysed the impact of data frequency on predictability. The estimated ApEn average of minute data from the S&P 500 index was compared with the daily averages. The estimated minute-data average is 1.27 (m = 2) and 0.95 (m = 3), while the daily averages are 1.63 (m = 2) and 1.08 (m = 3). The intraday averages are smaller, meaning that intraday data are more predictable than daily data, consistent with the empirical findings of Fiedor (2014a) for the NYSE 100 stocks. The findings of this study can be used as guidelines for studies of forecasting models that seek greater accuracy to support decision making and minimizing risks.
Figure 3. Number of stocks having internal components approximate entropy smaller than the approximate entropy of the returns.
Figure 4. Number of stocks having external components approximate entropy smaller than the approximate entropy of the returns.
4. Conclusion
This paper used the ETICA method (external trend and internal components analysis decomposition) to explore the possibility of improving the predictability of financial returns. The method offers a better understanding of the system dynamics. The outcomes show that returns become more predictable after the local components are separated from the external trend. Furthermore, decomposing the absolute value of returns instead of the returns themselves can enhance the performance of the process, which makes it a promising approach. This study also examined the effect of data frequency (intraday versus daily price changes) on predictability. The results show that intraday data are more predictable than daily data for S&P 500 stocks. The findings can guide studies dealing with forecasting models to improve the accuracy of directional return prediction. Moreover, they encourage further studies to design improved prediction models, which are usually used to guide decision making by identifying risks and opportunities.