Sentiment Analysis of Investor Opinions on Twitter

Abstract

The rapid growth of social networks has produced an unprecedented amount of user-generated data, which provides an excellent opportunity for text mining. Sentiment analysis, an important part of text mining, attempts to infer an author’s opinion from the content and structure of a text. Such information is particularly valuable for determining the overall opinion of a large number of people. Applications include predicting box office sales and stock prices. One of the most accessible sources of user-generated data is Twitter, which makes the majority of its user data freely available through its data access API. In this study we seek to predict a sentiment value for stock-related tweets on Twitter, and to demonstrate a correlation between this sentiment and the movement of a company’s stock price in a real-time streaming environment. Both n-gram and “word2vec” textual representation techniques are used alongside a random forest classification algorithm to predict the sentiment of tweets. The predicted sentiment values are then tested for correlation with the stock price of each company. There are significant correlations between price and sentiment for several individual companies. Some companies, such as Microsoft and Walmart, show strong positive correlation, while others, such as Goldman Sachs and Cisco Systems, show strong negative correlation. This suggests that consumer-facing companies are affected differently than other companies. Overall, this appears to be a promising field for future research.
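To make two of the steps named in the abstract concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It shows how word n-grams can be extracted from a tweet and how a correlation between a daily sentiment series and a daily price series might be computed. The abstract does not say which correlation measure was used, so the Pearson coefficient here is an assumption. The function names `word_ngrams` and `pearson` and the sample numbers are hypothetical and invented for illustration only.

```python
from math import sqrt


def word_ngrams(text, n):
    """Return the word n-grams (as tuples) of a lowercased, whitespace-split tweet."""
    tokens = text.lower().split()
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical values, invented for illustration (not data from the study):
# one mean predicted sentiment score and one closing price per trading day.
daily_sentiment = [0.10, 0.30, -0.20, 0.40, 0.05]
daily_close = [46.1, 46.8, 45.9, 47.2, 46.3]

if __name__ == "__main__":
    # Bigrams such as these would serve as features for a classifier.
    print(word_ngrams("Buying more $MSFT today", 2))
    # A value near +1 or -1 indicates strong positive or negative correlation.
    print(round(pearson(daily_sentiment, daily_close), 3))
```

In a fuller pipeline, the n-gram counts (or word2vec vectors) would be fed to a random forest classifier to produce the per-tweet sentiment labels that are averaged into a daily series like `daily_sentiment`; that step is omitted here to keep the sketch free of third-party dependencies.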

Share and Cite:

Dickinson, B. and Hu, W. (2015) Sentiment Analysis of Investor Opinions on Twitter. Social Networking, 4, 62-71. doi: 10.4236/sn.2015.43008.

Conflicts of Interest

The authors declare no conflicts of interest.

Copyright © 2023 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.