ABSTRACT
While the volume of scholarly publications has grown at a frenetic pace, locating and consuming useful candidate papers in very large digital libraries has become an essential yet challenging task for scholars. Unfortunately, because of the language barrier, some scientists (especially junior researchers or graduate students who have not mastered other languages) cannot efficiently locate publications hosted in a foreign-language repository. In this study, we propose a novel solution, cross-language citation recommendation via Hierarchical Representation Learning on Heterogeneous Graph (HRLHG), to address this new problem. HRLHG learns a representation function that maps publications from multilingual repositories into a low-dimensional joint embedding space, drawing on the various kinds of vertices and relations in a heterogeneous graph. By leveraging both global (task-specific) and local (task-independent) information, together with a novel supervised hierarchical random walk algorithm, the proposed method optimizes the publication representations by maximizing the likelihood of locating the important cross-language neighborhoods on the graph. Experimental results show that the proposed method not only outperforms state-of-the-art baseline models but also improves the interpretability of the representation model for the cross-language citation recommendation task.