skip to main content
10.1145/1772690.1772762acmotherconferencesArticle/Chapter ViewAbstractPublication PageswwwConference Proceedingsconference-collections
research-article

Sampling community structure

Published:26 April 2010Publication History

ABSTRACT

We propose a novel method, based on concepts from expander graphs, to sample communities in networks. We show that our sampling method, unlike previous techniques, produces subgraphs representative of community structure in the original network. These generated subgraphs may be viewed as stratified samples in that they consist of members from most or all communities in the network. Using samples produced by our method, we show that the problem of community detection may be recast into a case of statistical relational learning. We empirically evaluate our approach against several real-world datasets and demonstrate that our sampling method can effectively be used to infer and approximate community affiliation in the larger network.

References

  1. M. Adler and M. Mitzenmacher. Towards compressing web graphs. In Proc. of the IEEE DCC, pages 203--212, 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. M. Boguna, R. P. Satorras, A. D. Guilera, and A. Arenas. Models of social networks based on social distance attachment. Physical Review E, 70(5), 2004.Google ScholarGoogle ScholarCross RefCross Ref
  3. E. Bullmore and O. Sporns. Complex brain networks: graph theoretical analysis of structural and functional systems. Nature reviews. Neuroscience, Feb 2009.Google ScholarGoogle Scholar
  4. B. A. Cipra. The best of the 20th century: Editors name top 10 algorithms. SIAM News, 33, 2000.Google ScholarGoogle Scholar
  5. A. Clauset, M. E. J. Newman, and C. Moore. Finding community structure in very large networks. Physical Review E, 70(6):066111+, Dec 2004.Google ScholarGoogle Scholar
  6. R. Clayford and T. Johnson. Operational parameters affecting use of anonymous cell phone tracking for generating traffic information. In 82nd TRB Annual Meeting, Jan 2003.Google ScholarGoogle Scholar
  7. D. Crandall, D. Cosley, D. Huttenlocher, J. Kleinberg, and S. Suri. Feedback effects between similarity and social influence in online communities. In KDD '08, pages 160--168, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. L. Danon, A. Diaz-Guilera, J. Duch, and A. Arenas. Comparing community structure identification. J. Stat. Mech.: Theory and Experiment, 2005(09):P09008+, Sep 2005.Google ScholarGoogle Scholar
  9. J. Duch and A. Arenas. Community detection in complex networks using extremal optimization. Physical Review E, 72(2):027104+, 2005.Google ScholarGoogle Scholar
  10. T. Feder and R. Motwani. Clique partitions, graph compression and speeding-up algorithms. In J. Comp. and Sys. Sci., pages 123--133, 1991. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. S. Fortunato. Community detection in graphs. Jun 2009.Google ScholarGoogle Scholar
  12. O. Frank. Network Sampling and Model Fitting. Cambridge University Press, February 2005.Google ScholarGoogle ScholarCross RefCross Ref
  13. J. Gehrke, P. Ginsparg, and J. Kleinberg. Overview of the 2003 kdd cup. SIGKDD Explor. Newsl., 5(2):149--151, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. A. C. Gilbert and K. Levchenko. Compressing network graphs. In Proc. LinkKDD workshop at the 10th ACM Conference on KDD, Aug 2004.Google ScholarGoogle Scholar
  15. M. Girvan and M. E. Newman. Community structure in social and biological networks. Proc Natl Acad Sci USA, 99(12):7821--7826, Jun 2002.Google ScholarGoogle ScholarCross RefCross Ref
  16. S. Goel, Matthew, and J. Salganik. Respondent-driven sampling as markov chain monte carlo. Stat. in Medicine, 2009.Google ScholarGoogle ScholarCross RefCross Ref
  17. B. H. Good, Y.-A. de Montjoye, and A. Clauset. The performance of modularity maximization in practical contexts. arXiv ePrints, 0910.0165, 2009.Google ScholarGoogle Scholar
  18. D. Gusfield. Partition-distance: A problem and class of perfect graphs arising in clustering. Info. Proc. Letters, 82(3):159--164, May 2002.Google ScholarGoogle ScholarCross RefCross Ref
  19. S. Hoory, N. Linial, and A. Wigderson. Expander graphs and their applications. Bull. Amer. Math. Soc., 43:439--561, 2006.Google ScholarGoogle ScholarCross RefCross Ref
  20. C. Hubler, H.-P. Kriegel, K. Borgwardt, and Z. Ghahramani. Metropolis algorithms for representative subgraph sampling. In ICDM '08, pages 283--292, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. H. Jeong, B. Tombor, R. Albert, Z. N. Oltvai, and A. L. Barabasi. The large-scale organization of metabolic networks. Nature, 407(6804):651--654, Oct 2000.Google ScholarGoogle ScholarCross RefCross Ref
  22. R. Kannan, S. Vempala, and A. Vetta. On clusterings: Good, bad and spectral. J. ACM, 51(3):497--515, May 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. J. Leskovec, L. Backstrom, R. Kumar, and A. Tomkins. Microscopic evolution of social networks. In KDD '08, pages 462--470, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. J. Leskovec and C. Faloutsos. Sampling from large graphs. In KDD '06, pages 631--636, New York, NY, USA, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. J. Leskovec, J. Kleinberg, and C. Faloutsos. Graphs over time: densification laws, shrinking diameters and possible explanations. In KDD '05, pages 177--187, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. D. J. C. Mackay. Information Theory, Inference & Learning Algorithms. Cambridge University Press, Jun 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. S. A. Macskassy and F. Provost. Classification in networked data: A toolkit and a univariate case study. J. Mach. Learn. Res., 8:935--983, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. A. Mehler and S. Skiena. Expanding network communities from representative examples. ACM TKDD, 3(2):1--27, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. N. Metropolis, A. W. Rosenbluth, M. N. Rosenbluth, A. H. Teller, and E. Teller. Equation of state calculations by fast computing machines. J. Chem. Phys., 21(6):1087--92, 1953.Google ScholarGoogle ScholarCross RefCross Ref
  30. M. E. J. Newman. Finding community structure in networks using the eigenvectors of matrices. Physical Review E, 74(3), 2006.Google ScholarGoogle ScholarCross RefCross Ref
  31. D. Rafiei. Effectively visualizing large networks through sampling. In Proc. VIS 05, pages 375--382, 2005.Google ScholarGoogle Scholar
  32. M. Richardson, R. Agrawal, and P. Domingos. Trust Management for the Semantic Web, volume 2870. January 2003.Google ScholarGoogle Scholar
  33. S. Wasserman, K. Faust, and D. Iacobucci. Social Network Analysis : Methods and Applications. Cambridge University Press, Nov 1994.Google ScholarGoogle ScholarCross RefCross Ref
  34. W. Zachary. An information flow model for conflict and fission in small groups. J. Anthropological Research, 33:452--473, 1977.Google ScholarGoogle ScholarCross RefCross Ref

Index Terms

  1. Sampling community structure

    Recommendations

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in
    • Published in

      cover image ACM Other conferences
      WWW '10: Proceedings of the 19th international conference on World wide web
      April 2010
      1407 pages
      ISBN:9781605587998
      DOI:10.1145/1772690

      Copyright © 2010 International World Wide Web Conference Committee (IW3C2)

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 26 April 2010

      Permissions

      Request permissions about this article.

      Request Permissions

      Check for updates

      Qualifiers

      • research-article

      Acceptance Rates

      Overall Acceptance Rate1,899of8,196submissions,23%

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    ePub

    View this article in ePub.

    View ePub