DOI: 10.1145/2492517.2492546
research-article

Community detection in content-sharing social networks

Published: 25 August 2013

ABSTRACT

Network structure and content on microblogging sites like Twitter influence each other: user A follows user B for the tweets that B posts, and A may then re-tweet the content shared by B to his or her own followers. In this paper, we propose a probabilistic model that jointly models link communities and content topics by leveraging both the social graph and the content shared by users. We model a community as a distribution over users, use it as a source for topics of interest, and jointly infer both communities and topics using Gibbs sampling. While modeling communities using the social graph and modeling topics using content have each received a great deal of attention, only a few recent approaches try to model topics in content-sharing platforms using both content and the social graph. Our work differs from the existing generative models in that we explicitly model the social graph of users along with the user-generated content, mimicking how the two co-evolve in content-sharing platforms. Recent studies have found Twitter to be more of a content-sharing network and less of a social network, and it seems hard to detect tightly knit communities from the follower-followee links alone. Still, the question of whether we can extract Twitter communities using both links and content is open. In this paper, we answer this question in the affirmative. Our model discovers coherent communities and topics, as evidenced by qualitative results on sub-graphs of Twitter users. Furthermore, we evaluate our model on the task of predicting follower-followee links. We show that joint modeling of links and content significantly improves link prediction performance on a sub-graph of Twitter (consisting of about 0.7 million users and over 27 million tweets), compared both to generative models based on only structure or only content and to path-based methods such as Katz.
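For readers unfamiliar with the Katz baseline named in the abstract, the following is a minimal sketch of Katz-style link scoring on a toy follower graph. It is not the authors' code or experimental setup: the function names, the damping value `beta`, the toy graph, and the convention that `adj[u, v] = 1` means "u follows v" are all assumptions made for illustration. The Katz score of a pair (u, v) sums, over all path lengths l, the number of length-l paths from u to v weighted by `beta**l`.

```python
import numpy as np


def katz_scores(adj, beta=0.05):
    """Return the Katz score matrix: sum over l >= 1 of beta**l * A**l."""
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    # The geometric series converges only if beta < 1 / spectral radius of A.
    radius = max(abs(np.linalg.eigvals(adj)))
    if radius > 0 and beta >= 1.0 / radius:
        raise ValueError("beta must be smaller than 1 / spectral radius")
    identity = np.eye(n)
    # Closed form of the series: (I - beta * A)^-1 - I.
    return np.linalg.inv(identity - beta * adj) - identity


def rank_candidate_links(adj, beta=0.05):
    """Rank absent directed links (u, v), u != v, by descending Katz score."""
    adj = np.asarray(adj)
    scores = katz_scores(adj, beta)
    n = scores.shape[0]
    candidates = [
        (u, v, scores[u, v])
        for u in range(n)
        for v in range(n)
        if u != v and adj[u, v] == 0
    ]
    return sorted(candidates, key=lambda item: item[2], reverse=True)


# Hypothetical toy graph (assumption): adj[u, v] = 1 means user u follows user v.
# Edges: 0 -> 1, 1 -> 2, 0 -> 3, 3 -> 2.
toy = np.array([
    [0, 1, 0, 1],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 0],
])

scores = katz_scores(toy, beta=0.05)
ranked = rank_candidate_links(toy, beta=0.05)
print("top candidate link:", ranked[0][:2])
```

In this toy graph, user 0 reaches user 2 by two distinct two-hop paths, so the absent link 0 -> 2 receives the highest score. A purely structural score like this ignores tweet content entirely, which is the limitation the joint model in the abstract is meant to address.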

Published in

ASONAM '13: Proceedings of the 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining
August 2013
1558 pages
ISBN: 9781450322409
DOI: 10.1145/2492517

Copyright © 2013 ACM

Publisher: Association for Computing Machinery, New York, NY, United States

Overall acceptance rate: 116 of 549 submissions (21%)
