Open Journal of Statistics

Volume 11, Issue 4 (August 2021)

ISSN Print: 2161-718X   ISSN Online: 2161-7198

Google-based Impact Factor: 0.53  Citations  

Study on the Missing Data Mechanisms and Imputation Methods

HTML  XML Download Download as PDF (Size: 319KB)  PP. 477-492  
DOI: 10.4236/ojs.2021.114030    619 Downloads   4,278 Views  Citations

ABSTRACT

The absence of some data values in any observed dataset has been a real hindrance to achieving valid results in statistical research. This paper aimed at the missing data widespread problem faced by analysts and statisticians in academia and professional environments. Some data-driven methods were studied to obtain accurate data. Projects that highly rely on data face this missing data problem. And since machine learning models are only as good as the data used to train them, the missing data problem has a real impact on the solutions developed for real-world problems. Therefore, in this dissertation, there is an attempt to solve this problem using different mechanisms. This is done by testing the effectiveness of both traditional and modern data imputation techniques by determining the loss of statistical power when these different approaches are used to tackle the missing data problem. At the end of this research dissertation, it should be easy to establish which methods are the best when handling the research problem. It is recommended that using Multivariate Imputation by Chained Equations (MICE) for MAR missingness is the best approach to dealing with missing data.

Share and Cite:

Alruhaymi, A. and Kim, C. (2021) Study on the Missing Data Mechanisms and Imputation Methods. Open Journal of Statistics, 11, 477-492. doi: 10.4236/ojs.2021.114030.

Copyright © 2024 by authors and Scientific Research Publishing Inc.

Creative Commons License

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.