TITLE:
Performance and Availability Evaluation of Big Data Environments in the Private Cloud
AUTHORS:
Tarcísio Rolim, Erica Sousa
KEYWORDS:
Cloud Computing, Big Data, Hadoop Cluster, Performance Evaluation, Availability Evaluation, Reliability Block Diagram, Stochastic Petri Nets
JOURNAL NAME:
Journal of Computer and Communications, Vol.12 No.12, December 31, 2024
ABSTRACT: Cloud computing allows scalability at a lower cost for data analytics in a big data environment. This paradigm considers the dimensioning of resources to process different volumes of data, minimizing the response time of big data processing. This work proposes a performance and availability evaluation of big data environments in the private cloud through a methodology and stochastic and combinatorial models, considering metrics such as execution time, processor utilization, memory utilization, and availability. The proposed methodology considers objective activities, performance modeling, and availability modeling to evaluate the private cloud environment. A performance model based on stochastic Petri nets is adopted to evaluate the big data environment in the private cloud. Reliability block diagram models are adopted to evaluate the availability of big data environments in the private cloud. Two case studies based on the CloudStack platform and a Hadoop cluster demonstrate the viability of the proposed methodology and models. Case Study 1 evaluates the performance metrics of the Hadoop cluster in the private cloud, considering different service offerings, workloads, and numbers of datasets. Sentiment analysis is applied to tweets from users with symptoms of depression to generate the analyzed datasets. Case Study 2 evaluates the availability of big data environments in the private cloud.
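
As an illustration of the kind of calculation a reliability block diagram supports, the sketch below computes steady-state availability for blocks in series and in parallel using the standard formulas. It is not the authors' model: the component names, the MTTF and MTTR values, and the block layout are hypothetical and chosen only to make the example runnable.

```python
from math import prod  # requires Python 3.8+


def availability(mttf_hours, mttr_hours):
    """Steady-state availability of one component: MTTF / (MTTF + MTTR)."""
    return mttf_hours / (mttf_hours + mttr_hours)


def series(*blocks):
    """Series blocks: the system is up only if every block is up."""
    return prod(blocks)


def parallel(*blocks):
    """Parallel (redundant) blocks: the system is up if at least one block is up."""
    return 1.0 - prod(1.0 - a for a in blocks)


# Hypothetical figures, not taken from the paper.
manager = availability(mttf_hours=2000.0, mttr_hours=5.0)
node = availability(mttf_hours=1000.0, mttr_hours=10.0)

# Assumed layout: one management block in series with two redundant nodes.
system = series(manager, parallel(node, node))
print(f"Estimated system availability: {system:.6f}")
```

The series and parallel helpers compose, so a larger diagram can be expressed by nesting calls in the same way the blocks nest in the diagram.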