Missing Data Handling: A Comprehensive Review, Taxonomy, and Comparative Evaluation - Journal of Computer and Communications

Journal of Computer and Communications

Volume 13, Issue 6 (June 2025)

ISSN Print: 2327-5219 ISSN Online: 2327-5227

PP. 81-102

DOI: 10.4236/jcc.2025.136006

Author(s)

Ikram Chourib

Affiliation(s)

Paris, France.

ABSTRACT

Missing data remains a pervasive challenge across a wide range of domains, degrading data analysis pipelines, predictive modeling outcomes, and the reliability of decision-making processes. This paper presents a comprehensive and updated review of missing data handling techniques, spanning traditional statistical methods as well as state-of-the-art graph-based and machine-learning approaches. A novel taxonomy is introduced that classifies strategies into three principal categories: preprocessing techniques, graph-based imputation, and algorithms inherently tolerant to missing values. Particular emphasis is placed on recent advances in deep learning architectures, including Generative Adversarial Imputation Networks (GAIN), Self-Attention-based Imputation for Time Series (SAITS), and MissFormer, as well as graph-based methods such as the Graph Recurrent Imputation Network (GRIN) and the Temporal Spatial Imputation Graph Neural Network (TSI-GNN). These models demonstrate notable improvements in handling complex missingness patterns and in scaling to large heterogeneous datasets. To complement the theoretical review, an empirical evaluation was conducted on two benchmark datasets (Heart Disease and Kidney Disease), examining the effectiveness and limitations of various imputation strategies under different missingness scenarios. The results underscore the importance of adapting missing data handling techniques to the nature of the dataset, the underlying missingness mechanism, and the proportion of missing entries. Finally, the paper outlines promising research directions: lightweight, explainable, and scalable models; online adaptive imputation strategies for streaming data; multimodal data integration techniques; and privacy-preserving imputation frameworks for federated and decentralized learning environments. Addressing these challenges is essential for building the next generation of reliable, transparent, and intelligent data-driven systems.
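
As an illustration of the kind of comparison described in the abstract, the following minimal sketch hides entries of a small synthetic table completely at random and compares two simple imputation strategies at two missingness rates. It is not the paper's experimental protocol: the synthetic data (rather than the Heart Disease or Kidney Disease datasets), the choice of mean and k-nearest-neighbour imputation, the RMSE metric, and all function names are assumptions made here for illustration only.

```python
import math
import random


def make_data(n_rows, rng):
    """Synthetic rows with three correlated features (illustrative only)."""
    rows = []
    for _ in range(n_rows):
        x = rng.gauss(0.0, 1.0)
        rows.append([x, 2.0 * x + rng.gauss(0.0, 0.1), -x + rng.gauss(0.0, 0.1)])
    return rows


def mask_mcar(rows, rate, rng):
    """Hide each cell independently with probability `rate` (MCAR)."""
    return [[None if rng.random() < rate else v for v in row] for row in rows]


def impute_mean(masked):
    """Replace each missing cell with the mean of its column's observed values."""
    means = []
    for j in range(len(masked[0])):
        observed = [r[j] for r in masked if r[j] is not None]
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[means[j] if v is None else v for j, v in enumerate(r)] for r in masked]


def impute_knn(masked, k=5):
    """Fill a missing cell with the average of that column over the k nearest
    rows, measuring distance on the columns both rows have observed.
    Falls back to the column mean when no comparable row exists."""
    filled = impute_mean(masked)
    for i, row in enumerate(masked):
        for j, v in enumerate(row):
            if v is not None:
                continue
            candidates = []
            for m, other in enumerate(masked):
                if m == i or other[j] is None:
                    continue
                shared = [c for c in range(len(row))
                          if row[c] is not None and other[c] is not None]
                if not shared:
                    continue
                dist = math.sqrt(
                    sum((row[c] - other[c]) ** 2 for c in shared) / len(shared))
                candidates.append((dist, other[j]))
            if candidates:
                candidates.sort(key=lambda t: t[0])
                nearest = candidates[:k]
                filled[i][j] = sum(val for _, val in nearest) / len(nearest)
    return filled


def rmse_on_missing(truth, masked, filled):
    """Root-mean-square error over the cells that were hidden."""
    errs = [(truth[i][j] - filled[i][j]) ** 2
            for i, row in enumerate(masked)
            for j, v in enumerate(row) if v is None]
    return math.sqrt(sum(errs) / len(errs)) if errs else 0.0


rng = random.Random(0)  # fixed seed so the run is repeatable
truth = make_data(200, rng)
results = {}
for rate in (0.1, 0.3):
    masked = mask_mcar(truth, rate, rng)
    results[rate] = {
        "mean": rmse_on_missing(truth, masked, impute_mean(masked)),
        "knn": rmse_on_missing(truth, masked, impute_knn(masked)),
    }
    print(rate, results[rate])
```

Because the synthetic features are strongly correlated, a neighbour-based method has information that a column mean ignores. On real data, the ranking of methods may differ with the missingness mechanism and the proportion of missing entries, which is the point the abstract makes.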

KEYWORDS

Missing Data, Data Imputation, Deep Learning, Machine Learning

Share and Cite:

Chourib, I. (2025) Missing Data Handling: A Comprehensive Review, Taxonomy, and Comparative Evaluation. Journal of Computer and Communications, 13, 81-102. doi: 10.4236/jcc.2025.136006.
