Recommending Who to Follow on Twitter Based on Tweet Contents and Social Connections


In this paper, we examine methods that can provide accurate results in the form of a recommender system within a social networking framework. The social networking site of choice is Twitter, due to its interesting social graph connections and content characteristics. We built a recommender system that suggests users to follow by analyzing their tweets, using the CRM114 regex engine as the basis for content classification. The recommender system was evaluated on a dataset, created in late 2009, of real Twitter users.
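To make the idea of content-based follow recommendation concrete, here is a minimal sketch. It is not the CRM114-based classifier described in the paper: it swaps in a plain bag-of-words cosine similarity as a stand-in, and every name in it (`tokenize`, `profile`, `cosine`, `recommend`, the `already_following` filter, the sample users) is a hypothetical chosen for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and keep word-like tokens, including #hashtags and @mentions.
    return re.findall(r"[#@]?\w+", text.lower())


def profile(tweets):
    # Bag-of-words profile: token -> count over all of a user's tweets.
    counts = Counter()
    for tweet in tweets:
        counts.update(tokenize(tweet))
    return counts


def cosine(a, b):
    # Cosine similarity between two token-count profiles.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(target_tweets, candidates, already_following=(), top_n=3):
    # Rank candidate users by how similar their tweets are to the target's.
    # Assumption: users the target already follows are simply skipped; this
    # is a crude placeholder for the social-connection signal.
    target = profile(target_tweets)
    scored = [
        (cosine(target, profile(tweets)), name)
        for name, tweets in candidates.items()
        if name not in already_following
    ]
    # Highest score first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for score, name in scored[:top_n] if score > 0]


# Hypothetical sample data, not taken from the paper's dataset.
candidates = {
    "alice": ["python recommender systems are fun"],
    "bob": ["football match tonight"],
    "carol": ["recommender systems and python"],
}
target_tweets = ["I love python and recommender systems"]
suggestions = recommend(target_tweets, candidates)
```

A trained text classifier such as CRM114 would replace the `cosine` scoring step, while the surrounding flow (build a per-user text profile, score candidates, filter, rank) would stay broadly the same.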

Share and Cite:

Tsourougianni, E. and Ampazis, N. (2013) Recommending Who to Follow on Twitter Based on Tweet Contents and Social Connections. Social Networking, 2, 165-173. doi: 10.4236/sn.2013.24016.

Conflicts of Interest

The authors declare no conflicts of interest.



Copyright © 2022 by authors and Scientific Research Publishing Inc.

Creative Commons License

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.