Finding Statistically Significant Communities in Networks with Weighted Label Propagation

Abstract

Various networks exist in the world today including biological, social, information, and communication networks with the Internet as the largest network of all. One salient structural feature of these networks is the formation of groups or communities of vertices that tend to be more connected to each other within the same group than to those outside. Therefore, the detection of these communities is a topic of great interest and importance in many applications and different algorithms including label propagation have been developed for such purpose. Speaker-listener label propagation algorithm (SLPA) enjoys almost linear time complexity, so desirable in dealing with large networks. As an extension of SLPA, this study presented a novel weighted label propagation algorithm (WLPA), which was tested on four real world social networks with known community structures including the famous Zachary's karate club network. Wilcoxon tests on the communities found in the karate club network by WLPA demonstrated an improved statistical significance over SLPA. Withthehelp of Wilcoxon tests again, we were able to determine the best possible formation of two communities in this network relative to the ground truth partition, which could be used as a new benchmark for assessing community detection algorithms. Finally WLPA predicted better communities than SLPA in two of the three additional real social networks, when compared to the ground truth.

Share and Cite:

Hu, W. (2013) Finding Statistically Significant Communities in Networks with Weighted Label Propagation. Social Networking, 2, 138-146. doi: 10.4236/sn.2013.23012.

Conflicts of Interest

The authors declare no conflicts of interest.

References

[1] S. A. Rice, “The Identification of Blocs in Small Political Bodies,” The American Political Science Review, Vol. 21, No. 3, 1927, pp. 619-627. doi:10.2307/1945514
[2] R. S. Weiss and E. Jacobson, “A Method for the Analysis of the Structure of Complex Organizations,” American Sociological Review, Vol. 20, No. 6, 1955, pp. 661-668. doi:10.2307/2088670
[3] S. Fortunato, “Community Detection in Graphs,” Physics Reports, Vol. 486, No. 3-5, 2010, pp. 75-174. doi:10.1016/j.physrep.2009.11.002
[4] G. Agarwal and D. Kempe, “Modularity-Maximizing Network Communities Using Mathematical Programming,” The European Physical Journal B, Vol. 66, No. 3, 2008, pp. 409-418. doi:10.1140/epjb/e2008-00425-1
[5] U. N. Raghavan, R. Albert and S. Kumara, “Near Linear Time Algorithm to Detect Community Structures in Large-Scale Networks,” Physical Review E, Vol. 76, 2007, Article ID: 036106. doi:10.1103/PhysRevE.76.036106
[6] I. X. Y. Leung, P. Hui, P. Lio and J. Crowcroft, “Towards Real-Time Community Detection in Large Networks,” Physical Review E, Vol. 79, 2009, Article ID: 066107. doi:10.1103/PhysRevE.79.066107
[7] J. Xie, B. K. Szymanski and X. Liu, “SLPA: Uncovering Overlapping Communities in Social Networks via A Speaker-Listener Interaction Dynamic Process,” IEEE ICDM Workshop on DMCCI 2011, Vancouver, 2011, pp 344-349.
[8] J. R. Xie, “Agent-Based Dynamics Modeling for Opinion Spreading and Community Detection in Large-Scale Social Networks,” Rensselaer Polytechnic Institute, Troy, 2012.
[9] J. Bagrow and E. Bollt, “A Local Method for Detecting Communities,” Physical Review E, Vol. 72, 2005, Article ID: 046108. doi:10.1103/PhysRevE.72.046108
[10] W. W. Zachary, “An Information Flow Model for Conflict and Fission in Small Groups,” Journal of Anthropological Research, Vol. 33, 1977, pp. 452-473.
[11] M. Girvan and M. E. J. Newman, “Community Structure in Social and Biological Networks,” Proceedings of the National Academy of Sciences of the United States of America, Vol. 99, 2002, pp. 7821-7826. doi:10.1073/pnas.122653799
[12] X. Xu, N. Yuruk, Z. Feng and T. A. J. Schweiger, “Scan: A Structural Clustering Algorithm for Networks,” Proceedings of 2007 International Conference Knowledge Discovery and Data Mining, San Jose, August 2007, pp. 824-833.
[13] http://www.orgnet.com/
[14] http://www-personal.umich.edu/~mejn/netdata/
[15] M. Girvan and M. E. J. Newman, “Community Structure in Social and Biological Networks,” Proceedings of the National Academy of Sciences of the United States of America, Vol. 99, No. 12, 2002, pp. 8271-8276. doi:10.1073/pnas.122653799
[16] M. A. Porter, J.-P. Onnela and P. J. Mucha, “Communities in Networks,” Notices of the American Mathematical Society, Vol. 56, No. 9, 2009, pp. 1082-1097, 1164-1166.
[17] A. Medus, G. Acuna, and C. Dorso, “Detection of Community Structures in Networks via Global Optimization,” Physica A: Statistical Mechanics and Its Applications, Vol. 358, No. 2-4, 2005, pp. 593-604.
[18] A. Lancichinetti, F. Radicchi, J. J. Ramasco and S. Fortunato, “Finding Statistically Significant Communities in Networks,” PLoS ONE, Vol. 6, No. 4, 2011, Article ID: e18961. doi:10.1371/journal.pone.0018961
[19] A. Mirshahvalad, J. Lindholm, M. Derlén and M. Rosvall, “Significant Communities in Large Sparse Networks,” PLoS ONE, Vol. 7, No. 3, 2012, Article ID: e33721. doi:10.1371/journal.pone.0033721
[20] J. Reichardt and S. Bornholdt, “Statistical Mechanics of Community Detection,” Physical Review E, Vol. 74, 2006, Article ID: 016110. doi:10.1103/PhysRevE.74.016110
[21] J. P. Ferry, J. O. Bumgarner and S. T. Ahearn, “Probabilistic Community Detection in Networks,” The 14th International Conference on Information Fusion, Chicago, 5-8 July 2011, pp. 1-8.
[22] M. Meila, “Comparing Clusterings by the Variation of Information,” In: B. Scholkopf and M. K. Warmuth, Eds., Learning Theory and Kernel Machines: 16th Annual Conference on Computational Learning Theory and 7th Kernel Workshop, Springer, Washington DC, 2003, pp. 173-187.
[23] L. Danon, A. Diaz-Guilera, J. Duch and A. Arenas, “Comparing Community Structure Identification,” Journal of Statistical Mechanics, No. 9, 2005, ArticleID: P09008. doi:10.1088/1742-5468/2005/09/P09008
[24] S. van Dongen, “Performance Criteria for Graph Clustering and Markov Cluster Experiments,” Technical Report INS-R0012, National Research Institute for Mathematics and Computer Science in the Netherlands, Amsterdam, 2000.
[25] W. M. Rand, “Objective Criteria for the Evaluation of Clustering Methods,” Journal of the American Statistical Association, Vol. 66, No. 336, 1971, pp. 846-850. doi:10.1080/01621459.1971.10482356

Copyright © 2023 by authors and Scientific Research Publishing Inc.

Creative Commons License

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.