TITLE:
Dirty Data between Errors and Their Handling—A Firsthand Experience in Solving Dirty Data from Within
AUTHORS:
Faheem Bukhatwa, Ahmed Laarfi, Ismahan Salem
KEYWORDS:
Data Science, Database, Artificial Intelligence, System Analysis, Big Data
JOURNAL NAME:
International Journal of Intelligence Science, Vol.13 No.2, April 28, 2023
ABSTRACT: Managing large amounts of data is becoming part of everyday life in most
organizations. Handling, analyzing, searching, and making predictions from big
data is becoming the norm for organizations across many fields. Big data
provides the foundation for extracting greater benefit and value, but it also
brings many challenges that stand in the way of meeting those expectations.
Some of the problems that plague large data stores are termed dirty data. This
paper focuses on dirty data encountered while working on an organization's
real, live information system. The author was responsible for studying and
analyzing a faltering information system and for planning and carrying out the
required solutions and fixes. The importance of the work lies in the high level
of dirty data observed in the system. The paper therefore concentrates on the
dirty-data aspect of that work: how the team was affected by dirty data and how
it was dealt with.