research-article

Social Link Prediction in Online Social Tagging Systems

Published: 01 November 2013

Abstract

Social networks have become a popular medium for people to communicate and distribute ideas, content, news, and advertisements. Social content annotation has naturally emerged as a method of categorizing and filtering online information. The unrestricted vocabulary from which users choose annotations has often led to an explosion in the size of the space in which search is performed. In this article, we propose latent topic models as a principled way of reducing the dimensionality of such data and capturing the dynamics of the collaborative annotation process. We propose three generative processes to model latent user tastes with respect to the resources they annotate with metadata. We show that latent user interests, combined with social clues from the immediate neighborhood of users, can significantly improve social link prediction in the online music social media site Last.fm. Most link prediction methods suffer from the high class imbalance problem, resulting in low precision and/or recall. In contrast, our proposed classification schemes for social link recommendation achieve high precision and recall with respect to not only the dominant class (nonexistence of a link) but also the sparse positive instances, which are the most vital in social tie prediction.
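The class imbalance problem mentioned in the abstract can be illustrated with a minimal sketch. This is not the article's method: the counts (990 user pairs without a link, 10 with a link) and the helper name `precision_recall` are assumptions chosen only for illustration. A trivial baseline that always predicts "no link" scores high accuracy and high precision and recall on the dominant class, yet it recovers none of the positive instances.

```python
def precision_recall(y_true, y_pred, positive):
    """Precision and recall for the class labelled `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against empty denominators (no predictions or no true instances).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical toy data: 0 = no link between a user pair, 1 = link exists.
y_true = [0] * 990 + [1] * 10
# Trivial baseline that predicts "no link" for every pair.
y_pred = [0] * 1000

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
neg_precision, neg_recall = precision_recall(y_true, y_pred, positive=0)
pos_precision, pos_recall = precision_recall(y_true, y_pred, positive=1)

print(f"accuracy={accuracy:.2f}")
print(f"dominant class: precision={neg_precision:.2f} recall={neg_recall:.2f}")
print(f"positive class: precision={pos_precision:.2f} recall={pos_recall:.2f}")
```

Reporting precision and recall per class, as the abstract does, is what exposes this failure; a single accuracy figure would hide it.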



            • Published in

              ACM Transactions on Information Systems  Volume 31, Issue 4
              November 2013
              192 pages
              ISSN:1046-8188
              EISSN:1558-2868
              DOI:10.1145/2536736

              Copyright © 2013 ACM

              Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

              Publisher

              Association for Computing Machinery

              New York, NY, United States

              Publication History

              • Published: 1 November 2013
              • Accepted: 1 June 2013
              • Revised: 1 May 2013
              • Received: 1 January 2013


              Qualifiers

              • research-article
              • Research
              • Refereed
