Journal of Data Analysis and Information Processing

Volume 6, Issue 2 (May 2018)

ISSN Print: 2327-7211   ISSN Online: 2327-7203

Google-based Impact Factor: 1.59

Integrated Real-Time Big Data Stream Sentiment Analysis Service

PP. 46-66
DOI: 10.4236/jdaip.2018.62004

ABSTRACT

Opinion (sentiment) analysis of big data streams, from the text constantly generated on social media networks to hundreds of millions of online consumer reviews, gives organizations in every field an opportunity to discover valuable intelligence in massive user-generated text streams. However, traditional content analysis frameworks cannot efficiently handle the unprecedented volume of unstructured text streams or the complexity of the text analysis tasks required for real-time opinion analysis on big data streams. In this paper, we propose a parallel real-time sentiment analysis system, Social Media Data Stream Sentiment Analysis Service (SMDSSAS), which performs multiple phases of sentiment analysis on social media text streams in real time, using two fully analytic opinion mining models to cope with the scale of text data streams and the complexity of sentiment analysis on unstructured text streams. We propose two aspect-based opinion mining models, a Deterministic and a Probabilistic sentiment model, for real-time sentiment analysis of data streams related to a user-given topic. Experiments on Twitter stream traffic captured during the pre-election weeks of the 2016 Presidential election, analyzing public opinion toward two presidential candidates in real time, showed that the proposed system correctly predicted Donald Trump as the winner of the 2016 Presidential election. Cross-validation results showed that the proposed sentiment models, together with the real-time streaming components of our framework, analyzed opinions on the two presidential candidates effectively, with an average accuracy of 81% for the Deterministic model and 80% for the Probabilistic model, a 1% - 22% improvement over results in the existing literature.

Share and Cite:

Chung, S. and Aring, D. (2018) Integrated Real-Time Big Data Stream Sentiment Analysis Service. Journal of Data Analysis and Information Processing, 6, 46-66. doi: 10.4236/jdaip.2018.62004.

Copyright © 2024 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.