Scalable Overlapping Community Detection in Internet-Scale Networks
Proceedings of the Workshop of Information Technology and Systems, Auckland 2014
15 Pages. Posted: 13 Dec 2016
Date Written: 2014
The rapid growth of the Internet, particularly the explosion of social media, has led to unprecedented increases in the volume of network data worldwide. Already, the Yahoo Web Graph collected in 2002 contains more than one billion URLs, the Facebook social network recently exceeded one billion users, and numerous other social networks and online communities claim millions of members. One fundamental task toward understanding the structural and functional properties of large-scale networks is detecting their community structure, where each community consists of a group of (relatively) densely interconnected nodes. Recently, interest in overlapping community detection has grown, driven by evidence of significant community overlap in large-scale real networks with ground-truth communities (Yang and Leskovec 2012). This is not surprising: it is generally accepted, for example, that actors in a social network tend to belong to multiple social groups (such as family, colleagues, and friends), depending on whom they are interacting with.
The discovered communities can be used in a number of important applications, such as identifying fraudulent activity in telecommunication networks (Pinheiro 2012), studying the dynamics of viral marketing (Leskovec et al. 2007), and identifying target groups in consumer networks (Hill et al. 2006). However, only a few algorithms have been successfully applied to networks with hundreds of millions of nodes or more, and to the best of our knowledge, none of them is based on a statistical framework.