TITLE:
Study on the Missing Data Mechanisms and Imputation Methods
AUTHORS:
Abdullah Z. Alruhaymi, Charles J. Kim
KEYWORDS:
Missing Data, Mechanisms, Imputation Techniques, Models
JOURNAL NAME:
Open Journal of Statistics, Vol.11 No.4, August 11, 2021
ABSTRACT: Missing values in an observed dataset are a real hindrance to
obtaining valid results in statistical research. This paper addresses the
widespread missing data problem faced by analysts and statisticians in academia
and professional environments, and studies several data-driven methods for
obtaining accurate data. Projects that rely heavily on data are exposed to this
problem, and because machine learning models are only as good as the data used
to train them, missing data has a real impact on the solutions
developed for real-world problems. This work therefore attempts to address the
problem under different missingness mechanisms. It tests the effectiveness of
both traditional and modern data imputation techniques by determining the loss
of statistical power when each approach is used to handle missing data. The
goal is to establish which methods are best suited to the problem. For data
that are missing at random (MAR), the recommended approach is Multivariate
Imputation by Chained Equations (MICE).
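
As an illustration of the chained-equations idea named in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes synthetic Gaussian data, a hypothetical MAR mechanism in which the probability of a missing value depends only on a fully observed variable `z`, and a simplified imputation step (least-squares regression plus residual noise, without drawing the regression coefficients from their posterior as a full MICE implementation would). The function names `make_mar_data` and `chained_imputation` are invented for this example.

```python
import numpy as np


def make_mar_data(n=2000, seed=0):
    """Simulate three correlated variables and delete values under MAR."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)                       # always observed
    x = 0.8 * z + rng.normal(scale=0.6, size=n)
    y = 0.5 * x + 0.5 * z + rng.normal(scale=0.6, size=n)
    full = np.column_stack([z, x, y])
    data = full.copy()
    # MAR (assumed mechanism): missingness depends only on the observed z.
    p_missing = 0.6 / (1.0 + np.exp(-1.5 * z))
    data[rng.random(n) < p_missing, 1] = np.nan
    data[rng.random(n) < p_missing, 2] = np.nan
    return full, data


def chained_imputation(data, n_iter=10, seed=0):
    """One imputed dataset from a simplified chained-equations cycle."""
    rng = np.random.default_rng(seed)
    missing = np.isnan(data)
    filled = data.copy()
    # Start from column means of the observed values.
    col_means = np.nanmean(data, axis=0)
    for j in range(data.shape[1]):
        filled[missing[:, j], j] = col_means[j]
    for _ in range(n_iter):
        for j in range(data.shape[1]):
            if not missing[:, j].any():
                continue
            obs = ~missing[:, j]
            # Regress column j on all other (currently filled) columns.
            others = np.delete(filled, j, axis=1)
            design = np.column_stack([np.ones(len(filled)), others])
            beta, *_ = np.linalg.lstsq(design[obs], filled[obs, j], rcond=None)
            resid = filled[obs, j] - design[obs] @ beta
            sigma = resid.std(ddof=design.shape[1])
            # Replace missing entries with prediction plus random noise.
            n_mis = int(missing[:, j].sum())
            noise = rng.normal(scale=sigma, size=n_mis)
            filled[missing[:, j], j] = design[~obs] @ beta + noise
    return filled


full, data = make_mar_data()
true_mean = full[:, 2].mean()            # mean of y before deletion
observed_mean = np.nanmean(data[:, 2])   # what mean imputation would preserve
# Several imputed datasets, pooled by averaging the per-dataset estimates.
estimates = [chained_imputation(data, seed=s)[:, 2].mean() for s in range(5)]
pooled_mean = float(np.mean(estimates))
print(f"true={true_mean:.3f} observed-only={observed_mean:.3f} pooled={pooled_mean:.3f}")
```

In this setup the observed-only mean of `y` is biased because high-`z` rows are more likely to be missing, while the pooled estimate from the chained imputations should land closer to the mean of the complete data. Established implementations such as the R `mice` package or scikit-learn's `IterativeImputer` handle mixed data types and proper pooling rules and would be the practical choice.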