TITLE:
Clusters Merging Method for Short Texts Clustering
AUTHORS:
Yu Wang, Lihui Wu, Hongyu Shao
KEYWORDS:
Short Texts Clustering, Slide Window, Information Gain, Hierarchical Clustering
JOURNAL NAME:
Open Journal of Social Sciences, Vol.2 No.9, August 27, 2014
ABSTRACT:
Driven by the mobile Internet, new social media such as microblogs, WeChat,
and question answering systems are constantly emerging. They produce huge amounts
of short texts, which pose new challenges for text clustering. To cope with the
large volume and dynamic growth of short texts, a two-stage clustering method is
proposed. The method slides a window over the stream of short texts. Inside each
window, hierarchical clustering is used; between windows, clusters are merged
using a method based on information gain. Experiments indicate that this method is
fast and achieves higher accuracy.
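
The two-stage idea described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact similarity measure, linkage, window policy, or information-gain formula, so everything concrete here is an assumption made for illustration: whitespace tokenization, Jaccard similarity with single linkage inside a window, non-overlapping windows, and an information gain defined as the term entropy of the merged cluster minus the token-weighted term entropy of the two clusters (a low gain suggests the clusters share a vocabulary and may be merged). The function names and the thresholds `sim_threshold` and `max_gain` are hypothetical.

```python
import math
from collections import Counter


def jaccard(a, b):
    """Jaccard similarity between two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_window(texts, threshold=0.3):
    """Stage 1 (assumed variant): single-linkage agglomerative clustering
    of the texts in one window. Merging stops when no pair of clusters
    has similarity >= threshold."""
    clusters = [[t] for t in texts]
    while len(clusters) > 1:
        best, pair = threshold, None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                sim = max(jaccard(set(x.split()), set(y.split()))
                          for x in clusters[i] for y in clusters[j])
                if sim >= best:
                    best, pair = sim, (i, j)
        if pair is None:
            break
        i, j = pair
        absorbed = clusters.pop(j)  # j > i, so index i stays valid
        clusters[i].extend(absorbed)
    return clusters


def term_counts(cluster):
    """Term frequencies over all texts of a cluster."""
    return Counter(w for t in cluster for w in t.split())


def entropy(counts):
    """Shannon entropy (bits) of a term-frequency distribution."""
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def information_gain(a, b):
    """Assumed criterion: entropy of the merged cluster minus the
    token-weighted entropy of a and b. Near 0 for clusters with the
    same term distribution, larger for clusters with different terms."""
    ca, cb = term_counts(a), term_counts(b)
    na, nb = sum(ca.values()), sum(cb.values())
    if na + nb == 0:
        return 0.0
    weighted = (na * entropy(ca) + nb * entropy(cb)) / (na + nb)
    return entropy(ca + cb) - weighted


def cluster_stream(texts, window_size=3, sim_threshold=0.3, max_gain=0.5):
    """Stage 2 (assumed variant): cluster each window, then merge every
    new cluster into the existing cluster with the lowest information
    gain, or keep it as a new cluster if that gain is >= max_gain."""
    merged = []
    for start in range(0, len(texts), window_size):
        window = texts[start:start + window_size]
        for new in cluster_window(window, sim_threshold):
            gains = [information_gain(old, new) for old in merged]
            if gains and min(gains) < max_gain:
                merged[gains.index(min(gains))].extend(new)
            else:
                merged.append(new)
    return merged


if __name__ == "__main__":
    stream = [
        "cheap flight tickets", "flight tickets sale", "python list sort",
        "cheap flight sale", "sort python list",
    ]
    for cluster in cluster_stream(stream):
        print(cluster)
```

With these toy inputs and thresholds, the sketch groups the three flight-related texts into one cluster and the two Python-related texts into another. Because each new cluster is compared only against the existing clusters rather than against every earlier text, the cost per window stays bounded as the stream grows, which is the motivation the abstract gives for the two-stage design.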