Nonlinear GDP Forecasting: A Threshold-ARDLX Approach with Leading Macroeconomic Indicators
1. Introduction
Swings in energy prices play a crucial role in driving economic growth, as energy is a fundamental input for almost all economic activities (Alam, 2006; Kilian, 2008; Aslan et al., 2013; Van Eyden et al., 2019; Triantoro et al., 2023). These fluctuations can affect production costs, transportation, and the affordability of goods and services, influencing both consumer spending and business decision-making. In addition, energy price fluctuations can drive inflation, shaping monetary policy decisions that in turn shape broader economic activity (Jorgenson & Wilcoxen, 1993). Thus, forecasting GDP growth, the most fundamental macroeconomic variable, while accounting for the sources, direction, redirection, and use of energy is of paramount importance (Alam, 2006).
The significance of incorporating energy variables in the forecasting process of economic growth lies in the extensive research examining the relationship between energy indices and GDP. Many studies have focused on using energy-related variables to predict GDP effectively (Pan et al., 2018; Wu et al., 2021; Zhang et al., 2023). In particular, there is strong evidence of a correlation between energy consumption and economic growth in countries with varying economic structures and at different stages of economic development (Chiou-Wei et al., 2008). Furthermore, power indices of energy intensity have been proposed to predict future movements in economic growth (Diebold & Rudebusch, 1991; Eder & Provornaya, 2018).
In this paper, we contribute to the aforementioned discussion by exploring the ability of a set of energy and industrial variables to forecast GDP growth through leading indicator models.1 First, we develop three distinct leading indices that incorporate energy prices, energy consumption, capacity utilization, and industrial production, and evaluate the potential of these newly constructed indices as a simple and effective tool for condensing information on future trends in real GDP growth. To the best of our knowledge, these indicators have not been used previously in this context. Second, to assess the robustness of our forecasting exercise, we use the AR(1) and AR(2) models as benchmarks. Additionally, we apply both a recursive and a rolling window approach; the rolling window focuses on the most recent data, allowing the model to adapt to changing patterns and remain relevant over time. We conclude that our proposed indices offer performance enhancements when incorporated into TAR-DLX models, particularly in terms of mean squared error (MSE), where the highest level of outperformance relative to the benchmark models is observed.
The remainder of this paper is organized as follows: Section 2 delves into the relevant literature addressing key issues of economic growth, energy indices, and industrial production; Section 3 presents the data and methodology employed, ensuring transparency in our approach; Section 4 analyzes the results, providing compelling and informative insights. Finally, Section 5 presents the conclusion, highlighting significant findings and actionable policy implications that can drive future progress.
2. Brief Literature Review
The economic prosperity of each country is reflected in its gross domestic product and the ability to forecast it constitutes a critical task for governments and policy makers. Through GDP forecasting, effective government planning, sound decision-making, and rational allocation of national resources can be achieved. In this section, we present a brief review of the literature on four variables used to forecast Gross Domestic Product, namely energy consumption, capacity utilization, energy prices, and industrial production which due to their close link to the production process and economic growth, are directly related to GDP.
The energy consumption index is a key indicator of economic activity, since energy consumption is linked to economic activity at both the level of industrial production and the level of household consumption. Lee’s (2006) study showed that energy consumption and GDP do not have a neutral relationship, and revealed a two-way causality in the United States and a one-way relationship from energy consumption to GDP in Canada, Belgium, the Netherlands and Switzerland. In the same vein, Aslan et al. (2013) used wavelet filters to decompose the time series and tested the causality of each time scale with the relative time scale of the other series separately. Their findings highlighted the existence of a causal relationship between energy consumption and economic growth that is stronger at finer time scales and less evident at longer time horizons. Faisal et al. (2016) also argued for the existence of a two-way causal relationship between electricity consumption and GDP in Russia. Meanwhile, Da et al. (2017) introduced industrial electricity as a leading indicator of stock market returns, linking it to broader economic activity. They also found that electricity consumption data can provide early warnings of changes in economic output, enhancing their usefulness in GDP forecasting models. Moreover, Liu (2020) highlighted that the relationship between primary energy consumption and nominal GDP is strong through the effect of primary energy consumption on the prices of goods and services produced in an economy, while Bouznit et al. (2023) used cointegrating polynomial regressions with breakpoints and a simultaneous equation model to demonstrate a monotonically increasing relationship between energy use and real per capita GDP in Algeria. In the very recent literature, Lu et al. (2024) developed a new set of energy consumption indicators to predict GDP growth rates, confirming the strong predictive ability of the new indicators, especially the electricity consumption index from the industrial sector.
Energy prices play a crucial role in GDP forecasting, as they affect gross domestic product in many ways. On the supply side, energy prices affect production costs, whereas on the demand side they affect consumption. Therefore, increases or decreases in energy prices bring about changes in consumption and production levels, which are in turn passed on to GDP. Blanchard & Gali (2007) analyzed the effects of oil price shocks on inflation and economic activity, underlining that these depend on the structure of the economy and the flexibility of the labour market. Jamil & Ahmad (2010), with a multidimensional GDP forecasting model, suggested the integration of electricity management into economic planning, while Kilian & Vigfusson (2011) argued that changes in oil prices have asymmetric effects on GDP. In the same vein, Berk & Yetkiner (2014), using a modified endogenous model, showed significant cointegration between energy prices and real per capita GDP and found significant effects of composite energy prices on per capita GDP. Furthermore, Dagoumas et al. (2020) validated the influence of household electricity prices on real GDP, also noting that industrial electricity prices caused household electricity prices. Finally, Wang (2022), using the FMOLS cointegration estimation method, highlighted that changes in the price index of coal, electricity, and oil have an impact on China’s domestic economy.
The capacity utilization index is also an important determinant of economic growth as it is directly linked to productivity, technological progress, investment, and GDP growth. The findings of Bauer (1990) support the usefulness of this indicator in the analysis of economic conditions and in the estimation of inflation, while according to Kim & Lau (1994) whose research was one of the first to examine the relationship between capacity utilization and GDP, full capacity utilization, combined with investment in technology and infrastructure, contributes to productivity enhancement and GDP growth. Corrado & Mattey (1997) argued that despite changes in the structure of the economy, capacity utilization remains a reliable indicator of inflationary pressures. More recently, Bahramian & Saliminezhad (2021) reinforced the hypothesis that capacity utilization remains a critical indicator for assessing inflationary pressures and economic fluctuations.
Another widely used economic indicator for forecasting GDP is the industrial production index. Early studies indicate that industrial activities, through changes in output, affect broader macroeconomic conditions and GDP (Bernanke, 1983; Sims, 1980) and that fluctuations in industrial output usually predict the direction of GDP, since industry represents an important sector of the economy that reflects broader trends in economic growth (Stock & Watson,1989). Industrial production is also closely linked to energy prices and energy consumption, as suggested by Hamilton (2003), who studied the interaction between industrial sectors and energy consumption. Many empirical studies have also confirmed the strong correlation between the industrial production index and GDP. For the Eurozone, Ferrara & Koopman (2010) argued that industrial production data significantly improve the accuracy of GDP forecasts. Similarly, Camacho & Perez-Quiros (2010) highlighted the role of industrial production in GDP forecasts for European economies. Chauvet & Potter (2013) found that incorporating the industrial production index into dynamic factor models yields better forecasts of US GDP than traditional autoregressive models. Finally, Eraslan & Gotz (2021) introduced an unconventional activity index that combines monthly industrial production with high-frequency indicators and concluded that this indicator shows a high correlation with quarterly GDP growth.
The cited literature highlights that simultaneous analysis of multiple indicators such as energy consumption, energy prices, capacity utilization and industrial production, significantly improves the accuracy of GDP forecasts. Studies such as Sharma (2010), Wu et al. (2021) and Wang (2022) advocate the use of combined models that incorporate multidimensional data, offering a more comprehensive approach to forecasting economic growth than traditional models.
3. Methodology and Data
3.1. Data and Forecast Evaluation
We collected our variables from Federal Reserve Economic Data (FRED), the database maintained by the Federal Reserve Bank of St. Louis. The observations are quarterly and span the period from 1992:Q1 to 2024:Q4. The time series we consider include energy consumption, the energy price index2, capacity utilization, and the industrial production index. In each forecasting exercise we perform, we use GDP growth as our dependent variable.
The importance of these variables stems from their substantial role in forecasting GDP growth. The connection between energy indices and income is a crucial area of research that has significant implications for economic growth and sustainability (Abosedra & Baghestani, 1989; Soytas & Sari, 2003; Apergis & Payne, 2014). Empirical research has shown that there is compelling evidence indicating positive effects of primary export growth on both industrial export growth and GDP, both in the short and long term (Ciccone, 1987; Haluska, 2022).
To enhance the reliability of our analysis and separate the signal from the noise prevalent in time series data, we inspect and transform each variable, using logarithmic transformations and seasonal differences (Kyriazi et al., 2024) at their respective frequencies of observation. In the forecasting exercise, we employ both a recursive window and rolling windows of 36, 48, and 60 months, and we set three thresholds: −0.01, 0.0, and +0.01. These settings allow the analysis to be checked across estimation schemes and threshold values, giving us greater insight into our predictions. To evaluate the performance and accuracy of our methodology, we use two basic measures, the Mean Squared Error (MSE) and the Mean Absolute Error (MAE). When we split our sample into two components, namely the rolling window n0 and the evaluation window n1, we define the MSE and MAE for each model as follows:
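$$\mathrm{MSE} = \frac{1}{n_1}\sum_{t=n_0+1}^{n_0+n_1}\left(y_t - \hat{y}_t\right)^2, \qquad \mathrm{MAE} = \frac{1}{n_1}\sum_{t=n_0+1}^{n_0+n_1}\left|y_t - \hat{y}_t\right| \tag{1}$$
where $\hat{y}_t$ denotes the forecast of $y_t$ produced by the model.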
In the tables presenting our results, we include relative measures where we take the ratio of the MSE or MAE of each forecasting model relative to the same measure when the forecast is computed by the sample mean.
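As a minimal sketch of how such relative measures can be computed, the Python snippet below divides a model's MSE and MAE by those of the sample-mean forecast over the evaluation window. The function names and the toy numbers are illustrative assumptions, not the authors' code or data.

```python
def mean_squared(errors):
    """Average of squared forecast errors."""
    return sum(e * e for e in errors) / len(errors)


def mean_absolute(errors):
    """Average of absolute forecast errors."""
    return sum(abs(e) for e in errors) / len(errors)


def relative_errors(actual, forecasts, mean_forecasts):
    """Return (relative MSE, relative MAE) versus the sample-mean forecast."""
    n1 = len(actual)
    if n1 == 0 or not (n1 == len(forecasts) == len(mean_forecasts)):
        raise ValueError("inputs must be non-empty and of equal length")
    model_err = [y - f for y, f in zip(actual, forecasts)]
    mean_err = [y - m for y, m in zip(actual, mean_forecasts)]
    rel_mse = mean_squared(model_err) / mean_squared(mean_err)
    rel_mae = mean_absolute(model_err) / mean_absolute(mean_err)
    return rel_mse, rel_mae


# Toy evaluation window (hypothetical values, not the paper's data).
actual = [0.02, 0.03, 0.01, 0.04]
model = [0.025, 0.025, 0.015, 0.035]
mean_fc = [0.03, 0.03, 0.03, 0.03]
rel_mse, rel_mae = relative_errors(actual, model, mean_fc)
print(round(rel_mse, 3), round(rel_mae, 3))
```

A value below one means the model beats the sample-mean forecast on that metric, which is how the entries in the result tables are read.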
3.2. Methodology
Let yt be our dependent variable, defined as yt = log(GDPt/GDPt−4), where GDPt represents real Gross Domestic Product at time t and GDPt−4 corresponds to its value four quarters earlier. This transformation captures the annual (year-over-year) growth rate of real GDP, ensuring comparability across time periods while accounting for seasonal variations. We consider four key macroeconomic indicators as explanatory variables: Energy Consumption (ENC), Energy Price Index (ENP), Capacity Utilization (CPI) and Industrial Production Index (IP). To standardize these variables, we begin by computing their scaling factor, which is determined by expressing each value as a ratio of its first observation. Next, we calculate their growth rates using the first log-difference transformation. This approach helps to achieve stationarity and allows for a more meaningful interpretation of the data. The transformed variables are then used as explanatory factors in the forecasting model.
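The transformations described above can be sketched in a few lines of Python. The helper names and the toy series are hypothetical; the snippet only illustrates the four-quarter log growth of GDP, the scaling by the first observation, and the first log-difference.

```python
import math


def yoy_log_growth(levels, lag=4):
    """y_t = log(level_t / level_{t-lag}); the first `lag` observations are lost."""
    return [math.log(levels[t] / levels[t - lag]) for t in range(lag, len(levels))]


def scale_by_first(series):
    """Express each value as a ratio of the first observation."""
    return [v / series[0] for v in series]


def log_diff(series):
    """First log-difference, i.e. the one-period growth rate."""
    return [math.log(series[t] / series[t - 1]) for t in range(1, len(series))]


# Toy quarterly levels (hypothetical, not the paper's data).
gdp = [100.0, 101.0, 102.0, 103.0, 104.0, 105.0]
y = yoy_log_growth(gdp)            # two values: log(104/100) and log(105/101)

ip = [50.0, 51.0, 52.0]
ip_scaled = scale_by_first(ip)     # starts at 1.0
ip_growth = log_diff(ip_scaled)    # scaling does not change the growth rates
print(y, ip_scaled, ip_growth)
```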
To perform our forecasting exercise, we construct two indices. The first index is defined as the ratio of industrial production to lagged energy consumption, It1 = INDt/ECt−1, where INDt = IPt/IP0 and ECt = ENCt/ENC0. The second index is calculated as the ratio of energy consumption to the sum of lagged capacity utilization and the lagged energy price index, It2 = ECt/(CPt−1 + EPt−1), where CPt = CPIt/CPI0 and EPt = ENPt/ENP0. The explanatory variables are then defined as xt1 = log(It1/It−1,1) and xt2 = log(It2/It−1,2), and the corresponding threshold variables as τt1 = xt1·I(xt1 > c) and τt2 = xt2·I(xt2 > c), where I(·) is the indicator function and c is the threshold. We consider a broad forecasting model that is built on all these explanatory variables and belongs to the class of threshold-ARDLX, or TAR-DLX, models. We have:
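To make the construction concrete, the following Python sketch builds the two indices, their log growth rates, and the threshold variables from raw series. The function names and the toy inputs are assumptions made for illustration; only the formulas follow the definitions in the text.

```python
import math


def build_indices(ip, enc, cpi, enp):
    """Return (I1, I2); both start one period late because they use a lag."""
    ind = [v / ip[0] for v in ip]      # IND_t = IP_t / IP_0
    ec = [v / enc[0] for v in enc]     # EC_t  = ENC_t / ENC_0
    cp = [v / cpi[0] for v in cpi]     # CP_t  = CPI_t / CPI_0
    ep = [v / enp[0] for v in enp]     # EP_t  = ENP_t / ENP_0
    i1 = [ind[t] / ec[t - 1] for t in range(1, len(ip))]
    i2 = [ec[t] / (cp[t - 1] + ep[t - 1]) for t in range(1, len(ip))]
    return i1, i2


def growth_and_threshold(index, c):
    """x_t = log(I_t / I_{t-1}) and tau_t = x_t * 1{x_t > c}."""
    x = [math.log(index[t] / index[t - 1]) for t in range(1, len(index))]
    tau = [v if v > c else 0.0 for v in x]
    return x, tau


# Toy inputs (hypothetical, not the paper's data).
ip = [100.0, 102.0, 101.0, 104.0]
enc = [200.0, 202.0, 204.0, 203.0]
cpi = [80.0, 81.0, 80.0, 82.0]     # capacity utilization, abbreviated CPI in the text
enp = [50.0, 55.0, 52.0, 54.0]

i1, i2 = build_indices(ip, enc, cpi, enp)
x1, tau1 = growth_and_threshold(i1, c=0.0)
x2, tau2 = growth_and_threshold(i2, c=0.0)
print(x1, tau1)
```

The threshold variable is zero whenever the index growth does not exceed c, so it only switches on in the upper regime.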
(2)
This formulation, through the threshold formulation of τt1 and τt2, allows for a nonlinear analysis of macroeconomic fluctuations.
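For orientation, one specification consistent with this description can be written as below. It is a sketch under stated assumptions rather than a reproduction of Equation (2): the coefficient symbols $\phi$, $\beta$, $\gamma$, the error term $\varepsilon_t$, and the choice to start the distributed lags at lag 1 are assumptions made here for illustration.

```latex
y_t = \phi_0 + \sum_{i=1}^{2} \phi_i \, y_{t-i}
      + \sum_{k=1}^{2} \sum_{j=1}^{2}
        \left( \beta_{kj} \, x_{t-j,k} + \gamma_{kj} \, \tau_{t-j,k} \right)
      + \varepsilon_t
```

Under this form, setting all $\beta_{kj}$ and $\gamma_{kj}$ to zero gives an AR(2) model, and additionally setting $\phi_2 = 0$ gives an AR(1) model; keeping only the $k = 1$ or only the $k = 2$ terms gives a single-indicator variant.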
Note that the following hold: a) the model naturally nests both the AR(1) and the AR(2) benchmarks, and b) the model captures the first leading indicator, the second leading indicator, or both, each with a maximum lag of up to 2. Our empirical implementation is then as follows. We first estimate the parameters of the models, plus the AR(1) and AR(2) benchmarks, using both least squares and quantile regressions, with estimation performed using rolling windows of size 36, 48, and 60 months, as well as a recursive window. After estimation, we compute forecasts for all models and repeat the process until we exhaust all observations. We then evaluate our forecasts using mean-squared and mean-absolute error measures and report the top-performing models across a number of different sample periods.
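The estimate-forecast-repeat loop described above can be sketched as follows. This is a minimal illustration using least squares only (quantile regression is not shown), with one-step-ahead forecasts and regressors lagged one and two periods; the function names, the synthetic data, and the window lengths (counted in observations) are assumptions, not the authors' implementation. It relies on NumPy's `numpy.linalg.lstsq`.

```python
import numpy as np


def design_row(y, x, tau, t, use_x=True):
    """Regressors for predicting y[t] using information up to t-1."""
    row = [1.0, y[t - 1], y[t - 2]]
    if use_x:
        row += [x[t - 1], x[t - 2], tau[t - 1], tau[t - 2]]
    return row


def one_step_forecasts(y, x, tau, window=None, start=20, use_x=True):
    """window=None gives recursive (expanding) estimation, otherwise rolling."""
    forecasts = {}
    for t in range(start, len(y)):
        first = 2 if window is None else max(2, t - window)
        X = np.array([design_row(y, x, tau, s, use_x) for s in range(first, t)])
        beta, *_ = np.linalg.lstsq(X, np.asarray(y[first:t]), rcond=None)
        forecasts[t] = float(np.array(design_row(y, x, tau, t, use_x)) @ beta)
    return forecasts


# Synthetic data (hypothetical), just to exercise the loop.
rng = np.random.default_rng(0)
T = 60
x = rng.normal(0.0, 0.01, T)
tau = np.where(x > 0.0, x, 0.0)          # threshold c = 0.0
y = np.zeros(T)
for t in range(2, T):
    y[t] = 0.005 + 0.5 * y[t - 1] + 0.3 * x[t - 1] + rng.normal(0.0, 0.002)

rolling = one_step_forecasts(y, x, tau, window=24, start=30)
recursive = one_step_forecasts(y, x, tau, window=None, start=30)
ar2 = one_step_forecasts(y, x, tau, window=24, start=30, use_x=False)
print(len(rolling), len(recursive), len(ar2))
```

Passing `use_x=False` drops the indicator terms and leaves the AR(2) benchmark, which is one simple way to keep the benchmark and the extended model on exactly the same estimation windows.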
4. Discussion of Results
This section presents the results of the GDP forecast exercise for four distinct time periods: 1992-2024, 2000-2024, 2004-2024 and 2008-2024. The findings are summarized in Tables 1-4, which report the statistical results based on five forecasting models: AR(1), AR(2), TAR-DLX1(2,2), TAR-DLX2(2,2), and TAR-DLX12(2,2).
Table 1. Forecasting performance attribution of the threshold-ARDLX approach—MSE and MAE for period 1992:1-2024:4.
| Models | Rolling QR | Roll | QQ | Thres | Rolling LS | Roll | Thres | Recursive QR | QQ | Thres | Recursive LS | Thres |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Relative MSE | | | | | | | | | | | | |
| AR(1) | 0.545 | 36 | 0.50 | −0.01 | 0.617 | 60 | ALL | 0.647 | 0.75 | ALL | 0.650 | ALL |
| AR(2) | 0.568 | 36 | 0.50 | −0.01 | 0.617 | 60 | ALL | 0.709 | 0.75 | ALL | 0.650 | ALL |
| TARDLX1(2,2) | 0.549 | 36 | 0.50 | 0.01 | 0.636 | 60 | −0.01 | 0.652 | 0.75 | 0.01 | 0.660 | −0.01 |
| TARDLX2(2,2) | 0.538 | 60 | 0.50 | 0.01 | 0.571 | 60 | 0.00 | 0.627 | 0.75 | 0.00 | 0.609 | 0.00 |
| TARDLX12(2,2) | 0.535 | 60 | 0.50 | 0.01 | 0.576 | 60 | 0.01 | 0.631 | 0.50 | 0.00 | 0.610 | 0.00 |
| Relative MAE | | | | | | | | | | | | |
| AR(1) | 0.631 | 36 | 0.50 | ALL | 0.652 | 36 | ALL | 0.744 | 0.50 | ALL | 0.725 | ALL |
| AR(2) | 0.626 | 36 | 0.50 | ALL | 0.652 | 36 | ALL | 0.788 | 0.50 | ALL | 0.725 | ALL |
| TARDLX1(2,2) | 0.637 | 36 | 0.50 | 0.01 | 0.681 | 36 | 0.01 | 0.738 | 0.50 | −0.01 | 0.734 | 0.00 |
| TARDLX2(2,2) | 0.661 | 36 | 0.50 | 0.01 | 0.710 | 60 | 0.01 | 0.744 | 0.50 | 0.01 | 0.733 | 0.01 |
| TARDLX12(2,2) | 0.682 | 60 | 0.50 | 0.01 | 0.713 | 60 | 0.01 | 0.744 | 0.50 | 0.01 | 0.732 | 0.01 |
1. LS and QR columns show relative MSE and MAE, computed as the ratio to the sample mean forecast.
2. Roll is the rolling window of the estimation, Thres corresponds to the thresholds giving optimal forecasting performance.
3. QQ corresponds to the quantile giving optimal forecasting performance.
4. The forecasting models are as in Equation (2).
Table 2. Forecasting performance attribution of the threshold-ARDLX approach—MSE and MAE for period 2000:4-2024:4.
| Models | Rolling QR | Roll | QQ | Thres | Rolling LS | Roll | Thres | Recursive QR | QQ | Thres | Recursive LS | Thres |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Relative MSE | | | | | | | | | | | | |
| AR(1) | 0.780 | 36 | 0.50 | ALL | 0.931 | 60 | ALL | 0.849 | 0.75 | ALL | 0.955 | ALL |
| AR(2) | 0.836 | 60 | 0.75 | ALL | 0.931 | 60 | ALL | 0.966 | 0.75 | ALL | 0.955 | ALL |
| TARDLX1(2,2) | 0.782 | 36 | 0.50 | 0.01 | 0.956 | 48 | −0.01 | 0.832 | 0.75 | 0.01 | 0.958 | −0.01 |
| TARDLX2(2,2) | 0.764 | 60 | 0.75 | 0.01 | 0.846 | 60 | 0.00 | 0.820 | 0.75 | 0.01 | 0.885 | 0.00 |
| TARDLX12(2,2) | 0.749 | 60 | 0.75 | 0.00 | 0.852 | 60 | 0.01 | 0.828 | 0.75 | 0.01 | 0.885 | 0.00 |
| Relative MAE | | | | | | | | | | | | |
| AR(1) | 0.779 | 60 | 0.75 | ALL | 0.854 | 60 | ALL | 0.827 | 0.75 | ALL | 0.877 | ALL |
| AR(2) | 0.782 | 36 | 0.50 | ALL | 0.854 | 60 | ALL | 0.906 | 0.75 | ALL | 0.877 | ALL |
| TARDLX1(2,2) | 0.788 | 36 | 0.50 | −0.01 | 0.879 | 60 | 0.01 | 0.804 | 0.75 | 0.01 | 0.889 | −0.01 |
| TARDLX2(2,2) | 0.756 | 60 | 0.75 | 0.01 | 0.909 | 60 | −0.01 | 0.828 | 0.75 | 0.00 | 0.916 | 0.01 |
| TARDLX12(2,2) | 0.764 | 60 | 0.75 | 0.00 | 0.909 | 60 | −0.01 | 0.829 | 0.75 | 0.01 | 0.916 | 0.01 |
1. LS and QR columns show relative MSE and MAE, computed as the ratio to the sample mean forecast.
2. Roll is the rolling window of the estimation, Thres corresponds to the thresholds giving optimal forecasting performance.
3. QQ corresponds to the quantile giving optimal forecasting performance.
4. The forecasting models are as in Equation (2).
Table 3. Forecasting performance attribution of the threshold-ARDLX approach—MSE and MAE for period 2004:4-2024:4.
| Models | Rolling QR | Roll | QQ | Thres | Rolling LS | Roll | Thres | Recursive QR | QQ | Thres | Recursive LS | Thres |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Relative MSE | | | | | | | | | | | | |
| AR(1) | 0.773 | 36 | 0.50 | ALL | 0.950 | 60 | ALL | 0.841 | 0.75 | ALL | 0.975 | ALL |
| AR(2) | 0.844 | 36 | 0.75 | ALL | 0.950 | 60 | ALL | 0.892 | 0.75 | ALL | 0.975 | ALL |
| TARDLX1(2,2) | 0.775 | 36 | 0.50 | 0.01 | 0.966 | 48 | −0.01 | 0.820 | 0.75 | −0.01 | 0.987 | −0.01 |
| TARDLX2(2,2) | 0.764 | 60 | 0.75 | 0.01 | 0.857 | 60 | 0.00 | 0.788 | 0.75 | 0.00 | 0.872 | 0.00 |
| TARDLX12(2,2) | 0.746 | 60 | 0.75 | 0.00 | 0.857 | 48 | 0.01 | 0.777 | 0.75 | 0.00 | 0.879 | 0.01 |
| Relative MAE | | | | | | | | | | | | |
| AR(1) | 0.736 | 36 | 0.50 | ALL | 0.837 | 36 | ALL | 0.795 | 0.75 | ALL | 0.931 | ALL |
| AR(2) | 0.775 | 36 | 0.50 | ALL | 0.837 | 36 | ALL | 0.876 | 0.75 | ALL | 0.931 | ALL |
| TARDLX1(2,2) | 0.747 | 36 | 0.50 | −0.01 | 0.854 | 36 | 0.00 | 0.773 | 0.75 | −0.01 | 0.937 | −0.01 |
| TARDLX2(2,2) | 0.722 | 60 | 0.75 | 0.01 | 0.887 | 48 | 0.01 | 0.789 | 0.75 | 0.01 | 0.955 | 0.01 |
| TARDLX12(2,2) | 0.726 | 60 | 0.75 | 0.00 | 0.880 | 48 | 0.01 | 0.777 | 0.75 | 0.00 | 0.955 | 0.01 |
1. LS and QR columns show relative MSE and MAE, computed as the ratio to the sample mean forecast.
2. Roll is the rolling window of the estimation, Thres corresponds to the thresholds giving optimal forecasting performance.
3. QQ corresponds to the quantile giving optimal forecasting performance.
4. The forecasting models are as in Equation (2).
Table 4. Forecasting performance attribution of the threshold-ARDLX approach—MSE and MAE for period 2008:4-2024:4.
| Models | Rolling QR | Roll | QQ | Thres | Rolling LS | Roll | Thres | Recursive QR | QQ | Thres | Recursive LS | Thres |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Relative MSE | | | | | | | | | | | | |
| AR(1) | 0.079 | 60 | 0.50 | ALL | 0.119 | 60 | ALL | 0.063 | 0.50 | ALL | 0.122 | ALL |
| AR(2) | 0.039 | 60 | 0.50 | ALL | 0.119 | 60 | ALL | 0.033 | 0.50 | ALL | 0.122 | ALL |
| TARDLX1(2,2) | 0.065 | 60 | 0.50 | ALL | 0.119 | 60 | ALL | 0.056 | 0.50 | ALL | 0.122 | ALL |
| TARDLX2(2,2) | 0.047 | 60 | 0.50 | 0.0 | 0.169 | 60 | −0.01 | 0.050 | 0.50 | 0.0/0.01 | 0.152 | −0.01 |
| TARDLX12(2,2) | 0.075 | 60 | 0.50 | 0.01 | 0.169 | 60 | −0.01 | 0.050 | 0.50 | 0.0/0.01 | 0.152 | −0.01 |
| Relative MAE | | | | | | | | | | | | |
| AR(1) | 0.219 | 60 | 0.50 | ALL | 0.291 | 60 | ALL | 0.204 | 0.50 | ALL | 0.301 | ALL |
| AR(2) | 0.152 | 60 | 0.50 | ALL | 0.291 | 60 | ALL | 0.132 | 0.50 | ALL | 0.301 | ALL |
| TARDLX1(2,2) | 0.167 | 60 | 0.50 | 0.0/0.01 | 0.291 | 60 | ALL | 0.158 | 0.50 | 0.0/0.01 | 0.301 | ALL |
| TARDLX2(2,2) | 0.140 | 60 | 0.50 | −0.01 | 0.299 | 60 | −0.01 | 0.180 | 0.50 | −0.01 | 0.308 | −0.01 |
| TARDLX12(2,2) | 0.240 | 60 | 0.50 | 0.01 | 0.299 | 60 | −0.01 | 0.180 | 0.50 | −0.01 | 0.308 | −0.01 |
1. LS and QR columns show relative MSE and MAE, computed as the ratio to the sample mean forecast.
2. Roll is the rolling window of the estimation, Thres corresponds to the thresholds giving optimal forecasting performance.
3. QQ corresponds to the quantile giving optimal forecasting performance.
4. The forecasting models are as in Equation (2).
These models were estimated using two distinct methods: quantile regression (QR) and least squares (LS). Each table includes results for both rolling and recursive window estimation under these two methods. The evaluation metrics considered are the relative mean squared error (MSE) and the relative mean absolute error (MAE), with the conditional AR(1) and AR(2) models serving as benchmarks. Note that these measures are always relative to the recursive or rolling sample mean, making the values directly comparable. Furthermore, the results are reported for the best-performing rolling window among 36, 48, and 60 months, as well as for the best-performing of the three threshold values considered (−0.01, 0.0, +0.01). For the quantile regression method, we focus on two quantiles (QQ), namely 0.50 and 0.75. Consequently, each table corresponding to the four time periods provides a detailed and illustrative overview of the findings3.
Before analyzing our findings, it is important to note that our three newly introduced models are of the TAR-DLX variety: each combines a threshold autoregressive model with a distributed lag model in the respective exogenous variables, which in our case are the proposed indices. That is, we consider forecasting models that nest both the AR(1) and AR(2) benchmarks and capture either the first leading indicator, the second leading indicator, or both, with a maximum lag of up to 2. In the TAR-DLX1(2,2) model, the exogenous variable is the first proposed indicator, with a maximum lag of up to 2. Correspondingly, in the TAR-DLX2(2,2) model, the second proposed indicator is employed as the exogenous variable, with the same maximum lag. Finally, in the combined TAR-DLX12(2,2) model, both indicators are employed, with the same maximum lag.
We begin our discussion with the overall performance of the models in each period separately. Table 1 presents the forecasting results of the five models in terms of MSE and MAE for the period 1992:Q1-2024:Q4. In terms of the relative MSE across all forecasting methods considered, the TAR-DLX2(2,2) and TAR-DLX12(2,2) models outperform the benchmarks in the rolling window of 60 months, with QQ 0.50 and threshold 0.01, while the TAR-DLX1(2,2) model exhibits slightly higher values than the benchmarks in the rolling window of 36 months, using the same QQ and threshold. The findings for the relative MAE differ from those for the relative MSE. In the recursive window of the quantile regression method, the TAR-DLX1(2,2) model performs better (0.738) than AR(1) and AR(2) at QQ 0.50 with a threshold of −0.01, while TAR-DLX2(2,2) and TAR-DLX12(2,2) achieve the same performance (0.744) as AR(1) and surpass AR(2) at QQ 0.50 with a threshold of 0.01. In the remaining methods, the benchmark models, within a 36-month rolling window and with a QQ of 0.50, outperform the TAR-DLX models. According to Thomakos (2025), the disparity in outcomes between MSE and MAE can be attributed to the superiority of MSE as a measure of error and forecasting accuracy.
We now proceed to Table 2, which reports the period 2000:Q4-2024:Q4. The results for the relative MSE are similar to those in Table 1. The TAR-DLX2(2,2) and TAR-DLX12(2,2) models outperform the benchmarks in the rolling window of 60 months in all the methods utilized. The superiority of these two models is evident at the thresholds of 0.00 and 0.01 and, for the quantile regression method, with QQ set at 0.75. Conversely, the benchmarks outperform the TAR-DLX1(2,2) model in almost all methods across different rolling windows, QQ values, and thresholds. The exception is the recursive window of the quantile regression method, where the TAR-DLX1(2,2) model, with QQ 0.75 at the 0.01 threshold, outperforms the benchmarks. In the results for the relative MAE, we find the superiority of TAR-DLX2(2,2) and TAR-DLX12(2,2) in the quantile regression method (60-month rolling window, QQ 0.75, thresholds 0.01 and 0.00, respectively), while under least squares the benchmarks outperform all three TAR-DLX models.
Similarly, the findings for the period 2004:Q4-2024:Q4 are presented in Table 3. In terms of MSE, the results are analogous to those of the previous periods. The TAR-DLX2(2,2) and TAR-DLX12(2,2) models outperform the benchmark models across all methods, particularly in the 60-month rolling window combined with the thresholds 0.00 and 0.01. The results for the MAE are also noteworthy, as this is the first period in which the two estimation methods point in different directions. In the OLS results, the benchmarks slightly outperform our proposed indices in the 36-month rolling window at all thresholds, whereas in the corresponding QR results TAR-DLX2(2,2) and TAR-DLX12(2,2) improve on the benchmarks in the 60-month rolling window, with a QQ of 0.75 at thresholds 0.01 and 0.00, respectively. A similar pattern is observed in our findings for the recursive window of the two methods, with mixed results across the different thresholds.
Finally, we turn to Table 4, which summarizes the findings for the period 2008:Q4-2024:Q4. The OLS results demonstrate that, for both relative MSE and relative MAE, the benchmark models consistently outperform TAR-DLX2(2,2) and TAR-DLX12(2,2) in the 60-month rolling window and at all thresholds. The same outperformance of the benchmark models is also evident in the recursive window of the OLS method at all thresholds. We note that the TAR-DLX1(2,2) model performs the same as the benchmark models under OLS, in both the rolling and the recursive window and for both error metrics. In contrast, in the quantile regression method, the TAR-DLX1(2,2), TAR-DLX2(2,2), and TAR-DLX12(2,2) models show an intermediate performance, for both MSE and MAE, relative to the AR(1) and AR(2) benchmarks: they perform worse than the AR(2) benchmark but better than the AR(1) benchmark in the 60-month rolling window, with QQ 0.50, across various thresholds. A similar intermediate performance of the TAR-DLX models is also observed in the recursive window of the quantile regression method with QQ 0.50. The only exception to this pattern is the MAE value (0.240) of TAR-DLX12(2,2), which is worse than both benchmarks in the quantile regression method (60-month rolling window, QQ 0.50, threshold 0.01).
Based on the above findings, we conclude that the quantile regression method is more efficient than the ordinary least squares method, with lower MSE and MAE values across models and periods. In the rolling window, quantile regression yields better error metrics than OLS for all five models in every period. In the recursive window, quantile regression also outperforms OLS in most cases, especially in the most recent periods. In general, our findings show that recursive estimation performs better than fixed-window rolling estimation (under both QR and OLS) in some cases, especially in the most recent periods.
5. Conclusion and Policy Implication
In our empirical exercise, we construct three different leading indicators, each incorporating dominant variables that are particularly important in forecasting GDP: energy prices (ENP), energy consumption (ENC), capacity utilization (CPI) and industrial production (IP), and we assess the predictive ability of these indicators both individually and in combination for forecasting GDP. To ensure the reliability of our forecasts, we use the AR(1) and AR(2) models as benchmarks. Additionally, we focus on more recent data and apply both recursive and rolling window approaches. We aggregate the results from the application of these indicators using two different estimation methods: quantile regression (QR) and least squares (LS). Finally, we evaluate the individual and combined performance of the indicators, comparing them against the benchmarks as well as across estimation methods over the four periods considered.
The inclusion of these variables as indirect proxies for economic activity, rather than relying solely on GDP, which is typically reported on a quarterly or annual basis, makes their incorporation into GDP forecasting essential. Including indicators that encompass these variables provides valuable insights into the economic factors influencing GDP, offering a more accurate reflection of trends and economic shifts. This, in turn, enhances policy and decision-making.
Our findings demonstrate that, in most of the evaluated schemes and rolling window combinations, the proposed models exhibit superior performance compared to the benchmarks, whether employing least squares or quantile regression estimation. This performance improvement is evident in the MAE and MSE metrics, leading to an improvement in the GDP forecast. The most significant enhancement is observed in the MSE across all estimation methods, where the three proposed models outperform the benchmarks, with overall percentages per model ranging from 62% to 87%4, while in the MAE metric, the percentage of superiority per model ranges from 43% to 50%. Furthermore, the quantile regression method consistently demonstrates superior performance compared to the least squares method. We have also included the 2008 financial crisis as a limitation when referring to the period 2008:Q4-2024:Q4. The results show that the stability of our models is not undermined by this structural break, especially in the results of the quantile regression method.
Our approach appears promising and is likely to serve as a valuable tool for policy making by governments, businesses, and financial institutions. Early identification of trends leads to more informed and proactive options to mitigate potential risks before they fully materialize or to take advantage of upcoming opportunities. The use of these indices will give policy makers pivotal information for predicting and understanding economic trends. It will also help them to identify heterogeneities in the economy relative to standard economic structures, correct vulnerabilities, assess risks, and formulate proactive strategies to achieve economic stability. Future research could explore the adaptation of the proposed indicators by including additional dominant variables or constructing new indicators with other relevant factors. These indices could then be tested for their predictive ability, either individually or in combination with the indicators already introduced. Furthermore, given that the evaluation of our models has been carried out using data exclusively from the US economy, we suggest that they be evaluated in the context of other economies.
NOTES
1For additional methodological contributions see Guerard et al., 2020, 2024; Kyriazi and Thomakos, 2020a, 2020b; Kyriazi, 2024.
2The Energy Price Index includes the weighted average of coal, crude oil and natural gas prices, based on current US dollars.
3To assess predictive validity and ability, we first applied the well-known Mincer-Zarnowitz test regression for forecast unbiasedness and efficiency (validity); the test indicated that none of our best-performing models rejects the corresponding null hypothesis, so they pass the test. We then applied the Clark-West test for the predictive ability of our models versus the AR(1) and AR(2) benchmarks; the results were in general in favor of our models, although in some cases our models did not outperform the benchmark.
4The corresponding performances of the AR(1) and AR(2) benchmarks were used to calculate the percentage improvements in MSE.