TITLE:
Dynamic Conditional Feature Screening: A High-Dimensional Feature Selection Method Based on Mutual Information and Regression Error
AUTHORS:
Yi Zhao, Guangming Deng
KEYWORDS:
High-Dimensional Feature Screening, Conditional Mutual Information, Regression Error Difference, Dynamic Weighting, Dynamic Thresholding, Macroeconomic Forecasting, FRED-MD Dataset
JOURNAL NAME:
Open Journal of Statistics, Vol. 15 No. 2, April 22, 2025
ABSTRACT: Current high-dimensional feature screening methods still face significant challenges in handling mixed linear and nonlinear relationships, controlling redundant information, and improving model robustness. In this study, we propose a Dynamic Conditional Feature Screening (DCFS) method tailored for high-dimensional economic forecasting tasks. Our goal is to accurately identify key variables, enhance predictive performance, and provide both theoretical foundations and practical tools for macroeconomic modeling. DCFS constructs a combined test statistic by integrating conditional mutual information with conditional regression error differences. A dynamic weighting mechanism adaptively balances the linear and nonlinear contributions of features during screening, and a dynamic thresholding mechanism controls the false discovery rate (FDR), improving the stability and reliability of the screening results. On the theoretical side, we prove that the proposed method satisfies the sure screening property and rank consistency, ensuring accurate identification of the truly important feature set in high-dimensional settings. Simulation results show that under purely linear, purely nonlinear, and mixed dependency structures, DCFS consistently outperforms classical screening methods such as SIS, CSIS, and IG-SIS in terms of true positive rate (TPR), FDR, and rank correlation, indicating better accuracy, robustness, and stability. Furthermore, an empirical analysis based on the U.S. FRED-MD macroeconomic dataset confirms the practical value of DCFS in real-world forecasting tasks: DCFS achieves lower prediction errors (RMSE and MAE) and higher R² values in forecasting GDP growth. The selected key variables, including the Industrial Production Index (IP), Federal Funds Rate, Consumer Price Index (CPI), and Money Supply (M2), have clear economic interpretations, offering reliable support for economic forecasting and policy formulation.