Social Link Prediction in Online Social Tagging Systems

Authors:
Charalampos Chelmis, University of Southern California
Viktor K. Prasanna, University of Southern California

ACM Transactions on Information Systems, Volume 31, Issue 4, Article No. 20, pp. 1–27. https://doi.org/10.1145/2516891

Published: 01 November 2013

Abstract

Social networks have become a popular medium for people to communicate and distribute ideas, content, news, and advertisements. Social content annotation has naturally emerged as a method of categorizing and filtering online information. The unrestricted vocabulary from which users choose to annotate content has often led to an explosion in the size of the space in which search is performed. In this article, we propose latent topic models as a principled way of reducing the dimensionality of such data and capturing the dynamics of the collaborative annotation process. We propose three generative processes to model latent user tastes with respect to the resources they annotate with metadata. We show that latent user interests, combined with social clues from the immediate neighborhood of users, can significantly improve social link prediction in the online music social media site Last.fm. Most link prediction methods suffer from the high class imbalance problem, resulting in low precision and/or recall. In contrast, our proposed classification schemes for social link recommendation achieve high precision and recall with respect to not only the dominant class (nonexistence of a link), but also the sparse positive instances, which are the most vital in social tie prediction.
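The class imbalance point in the abstract can be illustrated with a small sketch. The code below is not the authors' evaluation code; the toy data (1000 candidate user pairs, 10 true links) and the helper name `precision_recall` are assumptions chosen for illustration. It shows how a degenerate classifier that always predicts "no link" looks strong on the dominant class while being useless on the sparse positive class.

```python
from typing import Sequence, Tuple


def precision_recall(y_true: Sequence[int], y_pred: Sequence[int], label: int) -> Tuple[float, float]:
    """Precision and recall for one class label; 0.0 when the ratio is undefined."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical toy data: 1000 candidate user pairs, only 10 of which are real links.
y_true = [1] * 10 + [0] * 990
# A degenerate classifier that always predicts "no link" (label 0).
y_pred = [0] * 1000

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
neg_precision, neg_recall = precision_recall(y_true, y_pred, label=0)
pos_precision, pos_recall = precision_recall(y_true, y_pred, label=1)

print(f"accuracy={accuracy:.2f}")
print(f"no-link class: precision={neg_precision:.2f}, recall={neg_recall:.2f}")
print(f"link class:    precision={pos_precision:.2f}, recall={pos_recall:.2f}")
```

In this toy setting, accuracy and the no-link metrics are near perfect even though no actual link is ever recovered, which is why reporting precision and recall separately for the positive class matters.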

Published in
ACM Transactions on Information Systems Volume 31, Issue 4
November 2013
192 pages
ISSN:1046-8188
EISSN:1558-2868
DOI:10.1145/2536736
Editor:
Jamie Callan
Carnegie Mellon University, USA
Copyright © 2013 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 1 November 2013
- Accepted: 1 June 2013
- Revised: 1 May 2013
- Received: 1 January 2013
Author Tags
Annotation
Last.fm
collaborative tagging
graphical models
link prediction
link recommendation
machine learning
social bookmarking
social media
topic models
unsupervised learning
Qualifiers
- research-article
- Research
- Refereed
Article Metrics
- Total Citations: 28
- Total Downloads: 779
- Downloads (Last 12 months): 14
- Downloads (Last 6 weeks): 3