ABSTRACT
We propose a novel method, based on concepts from expander graphs, to sample communities in networks. We show that our sampling method, unlike previous techniques, produces subgraphs representative of community structure in the original network. These generated subgraphs may be viewed as stratified samples in that they consist of members from most or all communities in the network. Using samples produced by our method, we show that the problem of community detection may be recast into a case of statistical relational learning. We empirically evaluate our approach against several real-world datasets and demonstrate that our sampling method can effectively be used to infer and approximate community affiliation in the larger network.
- M. Adler and M. Mitzenmacher. Towards compressing web graphs. In Proc. of the IEEE DCC, pages 203--212, 2000. Google ScholarDigital Library
- M. Boguna, R. P. Satorras, A. D. Guilera, and A. Arenas. Models of social networks based on social distance attachment. Physical Review E, 70(5), 2004.Google ScholarCross Ref
- E. Bullmore and O. Sporns. Complex brain networks: graph theoretical analysis of structural and functional systems. Nature reviews. Neuroscience, Feb 2009.Google Scholar
- B. A. Cipra. The best of the 20th century: Editors name top 10 algorithms. SIAM News, 33, 2000.Google Scholar
- A. Clauset, M. E. J. Newman, and C. Moore. Finding community structure in very large networks. Physical Review E, 70(6):066111+, Dec 2004.Google Scholar
- R. Clayford and T. Johnson. Operational parameters affecting use of anonymous cell phone tracking for generating traffic information. In 82nd TRB Annual Meeting, Jan 2003.Google Scholar
- D. Crandall, D. Cosley, D. Huttenlocher, J. Kleinberg, and S. Suri. Feedback effects between similarity and social influence in online communities. In KDD '08, pages 160--168, 2008. Google ScholarDigital Library
- L. Danon, A. Diaz-Guilera, J. Duch, and A. Arenas. Comparing community structure identification. J. Stat. Mech.: Theory and Experiment, 2005(09):P09008+, Sep 2005.Google Scholar
- J. Duch and A. Arenas. Community detection in complex networks using extremal optimization. Physical Review E, 72(2):027104+, 2005.Google Scholar
- T. Feder and R. Motwani. Clique partitions, graph compression and speeding-up algorithms. In J. Comp. and Sys. Sci., pages 123--133, 1991. Google ScholarDigital Library
- S. Fortunato. Community detection in graphs. Jun 2009.Google Scholar
- O. Frank. Network Sampling and Model Fitting. Cambridge University Press, February 2005.Google ScholarCross Ref
- J. Gehrke, P. Ginsparg, and J. Kleinberg. Overview of the 2003 kdd cup. SIGKDD Explor. Newsl., 5(2):149--151, 2003. Google ScholarDigital Library
- A. C. Gilbert and K. Levchenko. Compressing network graphs. In Proc. LinkKDD workshop at the 10th ACM Conference on KDD, Aug 2004.Google Scholar
- M. Girvan and M. E. Newman. Community structure in social and biological networks. Proc Natl Acad Sci USA, 99(12):7821--7826, Jun 2002.Google ScholarCross Ref
- S. Goel, Matthew, and J. Salganik. Respondent-driven sampling as markov chain monte carlo. Stat. in Medicine, 2009.Google ScholarCross Ref
- B. H. Good, Y.-A. de Montjoye, and A. Clauset. The performance of modularity maximization in practical contexts. arXiv ePrints, 0910.0165, 2009.Google Scholar
- D. Gusfield. Partition-distance: A problem and class of perfect graphs arising in clustering. Info. Proc. Letters, 82(3):159--164, May 2002.Google ScholarCross Ref
- S. Hoory, N. Linial, and A. Wigderson. Expander graphs and their applications. Bull. Amer. Math. Soc., 43:439--561, 2006.Google ScholarCross Ref
- C. Hubler, H.-P. Kriegel, K. Borgwardt, and Z. Ghahramani. Metropolis algorithms for representative subgraph sampling. In ICDM '08, pages 283--292, 2008. Google ScholarDigital Library
- H. Jeong, B. Tombor, R. Albert, Z. N. Oltvai, and A. L. Barabasi. The large-scale organization of metabolic networks. Nature, 407(6804):651--654, Oct 2000.Google ScholarCross Ref
- R. Kannan, S. Vempala, and A. Vetta. On clusterings: Good, bad and spectral. J. ACM, 51(3):497--515, May 2004. Google ScholarDigital Library
- J. Leskovec, L. Backstrom, R. Kumar, and A. Tomkins. Microscopic evolution of social networks. In KDD '08, pages 462--470, 2008. Google ScholarDigital Library
- J. Leskovec and C. Faloutsos. Sampling from large graphs. In KDD '06, pages 631--636, New York, NY, USA, 2006. Google ScholarDigital Library
- J. Leskovec, J. Kleinberg, and C. Faloutsos. Graphs over time: densification laws, shrinking diameters and possible explanations. In KDD '05, pages 177--187, 2005. Google ScholarDigital Library
- D. J. C. Mackay. Information Theory, Inference & Learning Algorithms. Cambridge University Press, Jun 2002. Google ScholarDigital Library
- S. A. Macskassy and F. Provost. Classification in networked data: A toolkit and a univariate case study. J. Mach. Learn. Res., 8:935--983, 2007. Google ScholarDigital Library
- A. Mehler and S. Skiena. Expanding network communities from representative examples. ACM TKDD, 3(2):1--27, 2009. Google ScholarDigital Library
- N. Metropolis, A. W. Rosenbluth, M. N. Rosenbluth, A. H. Teller, and E. Teller. Equation of state calculations by fast computing machines. J. Chem. Phys., 21(6):1087--92, 1953.Google ScholarCross Ref
- M. E. J. Newman. Finding community structure in networks using the eigenvectors of matrices. Physical Review E, 74(3), 2006.Google ScholarCross Ref
- D. Rafiei. Effectively visualizing large networks through sampling. In Proc. VIS 05, pages 375--382, 2005.Google Scholar
- M. Richardson, R. Agrawal, and P. Domingos. Trust Management for the Semantic Web, volume 2870. January 2003.Google Scholar
- S. Wasserman, K. Faust, and D. Iacobucci. Social Network Analysis : Methods and Applications. Cambridge University Press, Nov 1994.Google ScholarCross Ref
- W. Zachary. An information flow model for conflict and fission in small groups. J. Anthropological Research, 33:452--473, 1977.Google ScholarCross Ref
Index Terms
- Sampling community structure
Recommendations
Disjoint and Non-Disjoint Community Detection with Control of Overlaps Between Communities
AbstractOverlapping community detection has become an important challenge in networks analysis that motivates researchers to propose community detection methods that best fit existing complex and non-disjoint structures in real-world networks such as ...
Overlapping Community Detection by Collective Friendship Group Inference
ASONAM '10: Proceedings of the 2010 International Conference on Advances in Social Networks Analysis and MiningThere has been considerable interest in improving the capability to identify communities within large collections of social networking data. However, many of the existing algorithms will compartment an actor (node) into a single group, ignoring the fact ...
Community Structure Detection Using Firefly Algorithm
This article describes how parallel to the continuous growth of the Internet, which allows people to share and collaborate more, social networks have become more attractive as a research topic in many different disciplines. Community structures are ...
Comments