TITLE:
Hoeffding Tree Algorithms for Anomaly Detection in Streaming Datasets: A Survey
AUTHORS:
Asmah Muallem, Sachin Shetty, Jan Wei Pan, Juan Zhao, Biswajit Biswal
KEYWORDS:
Hoeffding Trees, Distributed, Ensembles, Anomaly Detection, Machine Learning, Spark
JOURNAL NAME:
Journal of Information Security, Vol.8 No.4, October 25, 2017
ABSTRACT:
This survey aims to deliver an extensive and well-constructed overview of using
machine learning for the problem of detecting anomalies in streaming datasets.
The objective is to assess the effectiveness of Hoeffding Trees as
a machine learning solution for detecting anomalies
in streaming cyber datasets. We group the existing research
on Hoeffding Trees that is applicable to this problem into three
categories: distributed Hoeffding Trees, ensembles of
Hoeffding Trees, and existing techniques that use Hoeffding Trees for
anomaly detection. These categories are referred to as compositions within
this paper and were selected based on their relation to streaming data and the
flexibility of their techniques for use within different domains of streaming
data. We discuss how techniques from the research works
in these compositions can be combined to address the
anomaly detection problem in streaming cyber datasets. The goal is to show
how a combination of techniques from different compositions can solve a
prominent problem: anomaly detection.