Abstract
Social networks have become a popular medium for people to communicate and distribute ideas, content, news, and advertisements. Social content annotation has naturally emerged as a method of categorizing and filtering online information. The unrestricted vocabulary from which users choose to annotate content has often led to an explosion in the size of the space in which search is performed. In this article, we propose latent topic models as a principled way of reducing the dimensionality of such data and capturing the dynamics of the collaborative annotation process. We propose three generative processes to model latent user tastes with respect to the resources they annotate with metadata. We show that latent user interests, combined with social clues from the immediate neighborhood of users, can significantly improve social link prediction on the online music social media site Last.fm. Most link prediction methods suffer from a severe class imbalance problem, resulting in low precision and/or recall. In contrast, our proposed classification schemes for social link recommendation achieve high precision and recall not only for the dominant class (nonexistence of a link), but also for the sparse positive instances, which are the most vital in social tie prediction.
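To make the class imbalance point concrete, the following minimal Python sketch computes per-class precision and recall on a toy set of user pairs. The data, the 95-to-5 class split, and the helper name precision_recall are hypothetical and chosen only for illustration; they are not taken from the article or from Last.fm. The sketch shows why overall accuracy is misleading when links are rare: a classifier that always predicts "no link" looks strong on the dominant class yet never recovers a positive instance.

```python
def precision_recall(y_true, y_pred, positive):
    """Return (precision, recall), treating `positive` as the class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical toy data: 95 user pairs without a link (0) and 5 with a link (1).
y_true = [0] * 95 + [1] * 5

# Trivial baseline that always predicts the dominant class (no link).
always_no_link = [0] * 100

accuracy = sum(t == p for t, p in zip(y_true, always_no_link)) / len(y_true)
no_link_pr = precision_recall(y_true, always_no_link, positive=0)
link_pr = precision_recall(y_true, always_no_link, positive=1)

print("accuracy:", accuracy)
print("no-link class (precision, recall):", no_link_pr)
print("link class (precision, recall):", link_pr)
```

Reporting precision and recall separately for the link class, as the abstract does, exposes this failure mode, which a single accuracy figure would hide.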
Index Terms
- Social Link Prediction in Online Social Tagging Systems