Clustering Student Discussion Messages on Online Forum by Visualization and Non-Negative Matrix Factorization


Online discussion forums can effectively engage students in their studies. As the number of messages posted on a forum grows, it becomes harder for instructors to read and respond to them promptly. In this paper, we apply non-negative matrix factorization (NMF) and visualization to cluster message data, providing a summary view of the messages that discloses their deep semantic relationships. In particular, NMF is able to find the underlying issues hidden in the messages that concern most of the students. Visualization is employed to estimate the initial number of clusters and to show the relation communities. Experiments and comparisons on a real dataset are reported to demonstrate the effectiveness of the approaches.
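
To make the clustering step concrete, the following is a minimal sketch of how NMF can group messages, not the implementation evaluated in the paper. It assumes a toy term-by-message count matrix invented for illustration, uses the standard multiplicative update rules of Lee and Seung for the squared-error objective, and fixes the number of clusters k by hand, whereas the paper estimates it through visualization. All function names here (`matmul`, `transpose`, `nmf`, `cluster_labels`) are hypothetical helpers, written in plain Python so that no external library is needed.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*m)]


def nmf(v, k, iters=300, seed=0, eps=1e-9):
    """Approximate non-negative V (terms x messages) as W (terms x k) times
    H (k x messages) with multiplicative updates for squared error."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    # Strictly positive random start so every entry stays non-negative.
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(k)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n)]
    return w, h


def cluster_labels(h):
    """Assign each message (a column of H) to its largest-weight factor."""
    return [max(range(len(h)), key=lambda i: h[i][j]) for j in range(len(h[0]))]


# Toy data (an assumption for illustration): rows are terms, columns are messages.
terms = ["exam", "deadline", "grade", "install", "error", "compiler"]
V = [
    [2, 1, 1, 0, 0, 0],  # exam
    [1, 2, 0, 0, 0, 0],  # deadline
    [1, 1, 2, 0, 0, 0],  # grade
    [0, 0, 0, 2, 1, 1],  # install
    [0, 0, 0, 1, 2, 1],  # error
    [0, 0, 0, 1, 1, 2],  # compiler
]

W, H = nmf(V, k=2)
labels = cluster_labels(H)
print(labels)
```

In this sketch each column of W can be read as an issue expressed through weighted terms, and each column of H gives a message's weight on those issues. Messages that share vocabulary are expected to receive the same label, although which label number a group receives depends on the random initialization.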

Share and Cite:

X. Huang, J. Zhao, J. Ash and W. Lai, "Clustering Student Discussion Messages on Online Forum by Visualization and Non-Negative Matrix Factorization," Journal of Software Engineering and Applications, Vol. 6, No. 7B, 2013, pp. 7-12. doi: 10.4236/jsea.2013.67B002.

Conflicts of Interest

The authors declare no conflicts of interest.



Copyright © 2023 by authors and Scientific Research Publishing Inc.


This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.