skip to main content
10.1145/3292500.3330860acmconferencesArticle/Chapter ViewAbstractPublication PageskddConference Proceedingsconference-collections
research-article

Scalable Graph Embeddings via Sparse Transpose Proximities

Published:25 July 2019Publication History

ABSTRACT

Graph embedding learns low-dimensional representations for nodes in a graph and effectively preserves the graph structure. Recently, a significant amount of progress has been made toward this emerging research area. However, there are several fundamental problems that remain open. First, existing methods fail to preserve the out-degree distributions on directed graphs. Second, many existing methods employ random walk based proximities and thus suffer from conflicting optimization goals on undirected graphs. Finally, existing factorization methods are unable to achieve scalability and non-linearity simultaneously. This paper presents an in-depth study on graph embedding techniques on both directed and undirected graphs. We analyze the fundamental reasons that lead to the distortion of out-degree distributions and to the conflicting optimization goals. We propose transpose proximity, a unified approach that solves both problems. Based on the concept of transpose proximity, we design STRAP, a factorization based graph embedding algorithm that achieves scalability and non-linearity simultaneously. STRAP makes use of the backward push algorithm to efficiently compute the sparse Personalized PageRank (PPR) as its transpose proximities. By imposing the sparsity constraint, we are able to apply non-linear operations to the proximity matrix and perform efficient matrix factorization to derive the embedding vectors. Finally, we present an extensive experimental study that evaluates the effectiveness of various graph embedding algorithms, and we show that \strap outperforms the state-of-the-art methods in terms of effectiveness and scalability.

References

  1. http://snap.stanford.edu/data/index.html.Google ScholarGoogle Scholar
  2. http://law.di.unimi.it/datasets.php.Google ScholarGoogle Scholar
  3. https://github.com/leoribeiro/struc2vec/.Google ScholarGoogle Scholar
  4. Amr Ahmed, Nino Shervashidze, Shravan Narayanamurthy, Vanja Josifovski, and Alexander J Smola. Distributed large-scale natural graph factorization. In WWW, pages 37--48. ACM, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. Rodrigo Aldecoa, Chiara Orsini, and Dmitri Krioukov. Hyperbolic graph generator. Computer Physics Communications, 196:492--496, 2015.Google ScholarGoogle ScholarCross RefCross Ref
  6. Bahman Bahmani, Abdur Chowdhury, and Ashish Goel. Fast incremental and personalized pagerank. VLDB, 4(3):173--184, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. Mikhail Belkin and Partha Niyogi. Laplacian eigenmaps and spectral techniques for embedding and clustering. In NIPS, pages 585--591, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. Mansurul Bhuiyan and Mohammad Al Hasan. Representing graphs as bag of vertices and partitions for graph classification. Data Science and Engineering, 3(2):150--165, 2018.Google ScholarGoogle ScholarCross RefCross Ref
  9. Shaosheng Cao, Wei Lu, and Qiongkai Xu. Grarep: Learning graph representations with global structural information. In CIKM, pages 891--900. ACM, 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. Shaosheng Cao, Wei Lu, and Qiongkai Xu. Deep neural networks for learning graph representations. In AAAI, pages 1145--1152, 2016. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. Shiyu Chang, Wei Han, Jiliang Tang, Guo-Jun Qi, Charu C Aggarwal, and Thomas S Huang. Heterogeneous network embedding via deep architectures. In SIGKDD, pages 119--128. ACM, 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. Siheng Chen, Sufeng Niu, Leman Akoglu, Jelena Kovavc ević, and Christos Faloutsos. Fast, warped graph embedding: Unifying framework and one-click algorithm. arXiv preprint arXiv:1702.05764, 2017.Google ScholarGoogle Scholar
  13. Kenneth L Clarkson and David P Woodruff. Low-rank approximation and regression in input sparsity time. JACM, 63(6):54, 2017. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. Peng Cui, Xiao Wang, Jian Pei, and Wenwu Zhu. A survey on network embedding. TKDE, 2018.Google ScholarGoogle Scholar
  15. Yuxiao Dong, Nitesh V Chawla, and Ananthram Swami. metapath2vec: Scalable representation learning for heterogeneous networks. In SIGKDD, pages 135--144. ACM, 2017. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. Rong-En Fan, Kai-Wei Chang, Cho-Jui Hsieh, Xiang-Rui Wang, and Chih-Jen Lin. Liblinear: A library for large linear classification. JMLR, 9(Aug):1871--1874, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. Xu Feng, Yuyang Xie, Mingye Song, Wenjian Yu, and Jie Tang. Fast randomized pca for sparse data. In Asian Conference on Machine Learning, pages 710--725, 2018.Google ScholarGoogle Scholar
  18. Palash Goyal and Emilio Ferrara. Graph embedding techniques, applications, and performance: A survey. Knowledge-Based Systems, 151:78--94, 2018.Google ScholarGoogle ScholarCross RefCross Ref
  19. Aditya Grover and Jure Leskovec. node2vec: Scalable feature learning for networks. In SIGKDD, pages 855--864. ACM, 2016. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. Glen Jeh and Jennifer Widom. Simrank: a measure of structural-context similarity. In SIGKDD, pages 538--543, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. Junghwan Kim, Haekyu Park, Ji-Eun Lee, and U Kang. Side: representation learning in signed directed networks. In WWW, pages 509--518. International World Wide Web Conferences Steering Committee, 2018. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. Thomas N Kipf and Max Welling. Semi-supervised classification with graph convolutional networks. ICLR, 2017.Google ScholarGoogle Scholar
  23. Kyle Kloster and David F Gleich. Heat kernel based community detection. In SIGKDD, pages 1386--1395. ACM, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. Dmitri Krioukov, Fragkiskos Papadopoulos, Maksim Kitsak, Amin Vahdat, and Marián Boguná. Hyperbolic geometry of complex networks. Physical Review E, 82(3):036106, 2010.Google ScholarGoogle ScholarCross RefCross Ref
  25. Peter Lofgren, Siddhartha Banerjee, and Ashish Goel. Personalized pagerank estimation and search: A bidirectional approach. In WSDM, pages 163--172, 2016. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. Jianxin Ma, Peng Cui, and Wenwu Zhu. Depthlgp: Learning embeddings of out-of-sample nodes in dynamic networks. AAAI, 2018.Google ScholarGoogle Scholar
  27. Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S Corrado, and Jeff Dean. Distributed representations of words and phrases and their compositionality. In NIPS, pages 3111--3119, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. Mathias Niepert, Mohamed Ahmed, and Konstantin Kutzkov. Learning convolutional neural networks for graphs. In ICML, pages 2014--2023, 2016. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. Mingdong Ou, Peng Cui, Jian Pei, Ziwei Zhang, and Wenwu Zhu. Asymmetric transitivity preserving graph embedding. In SIGKDD, pages 1105--1114. ACM, 2016. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. Lawrence Page, Sergey Brin, Rajeev Motwani, and Terry Winograd. The pagerank citation ranking: bringing order to the web. 1999.Google ScholarGoogle Scholar
  31. Bryan Perozzi, Rami Al-Rfou, and Steven Skiena. Deepwalk: Online learning of social representations. In SIGKDD, pages 701--710. ACM, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. Jiezhong Qiu, Yuxiao Dong, Hao Ma, Jian Li, Kuansan Wang, and Jie Tang. Network embedding as matrix factorization: Unifying deepwalk, line, pte, and node2vec. In WSDM, pages 459--467. ACM, 2018. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. Leonardo FR Ribeiro, Pedro HP Saverese, and Daniel R Figueiredo. struc2vec: Learning node representations from structural identity. In SIGKDD, pages 385--394. ACM, 2017. Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. Sam T Roweis and Lawrence K Saul. Nonlinear dimensionality reduction by locally linear embedding. science, 290(5500):2323--2326, 2000.Google ScholarGoogle Scholar
  35. Jian Tang, Meng Qu, Mingzhe Wang, Ming Zhang, Jun Yan, and Qiaozhu Mei. Line: Large-scale information network embedding. In WWW, pages 1067--1077. International World Wide Web Conferences Steering Committee, 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. Lei Tang, Suju Rajan, and Vijay K Narayanan. Large scale multi-label classification via metalabeler. In WWW, pages 211--220. ACM, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. Anton Tsitsulin, Davide Mottin, Panagiotis Karras, and Emmanuel Müller. Verse: Versatile graph embeddings from similarity measures. In WWW, pages 539--548. International World Wide Web Conferences Steering Committee, 2018. Google ScholarGoogle ScholarDigital LibraryDigital Library
  38. Grigorios Tsoumakas, Ioannis Katakis, and Ioannis Vlahavas. Mining multi-label data. In Data mining and knowledge discovery handbook, pages 667--685. Springer, 2009.Google ScholarGoogle ScholarCross RefCross Ref
  39. Daixin Wang, Peng Cui, and Wenwu Zhu. Structural deep network embedding. In KDD, pages 1225--1234. ACM, 2016. Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. Zhewei Wei, Xiaodong He, Xiaokui Xiao, Sibo Wang, Shuo Shang, and Ji-Rong Wen. Topppr: top-k personalized pagerank queries with precision guarantees on large graphs. In SIGMOD, pages 441--456. ACM, 2018. Google ScholarGoogle ScholarDigital LibraryDigital Library
  41. Shuhan Yuan, Xintao Wu, and Yang Xiang. Sne: signed network embedding. In PAKDD, pages 183--195. Springer, 2017.Google ScholarGoogle ScholarCross RefCross Ref
  42. Daokun Zhang, Jie Yin, Xingquan Zhu, and Chengqi Zhang. Network representation learning: A survey. IEEE transactions on Big Data, 2018.Google ScholarGoogle Scholar
  43. Ziwei Zhang, Peng Cui, Xiao Wang, Jian Pei, Xuanrong Yao, and Wenwu Zhu. Arbitrary-order proximity preserved network embedding. In SIGKDD, pages 2778--2786. ACM, 2018. Google ScholarGoogle ScholarDigital LibraryDigital Library
  44. Chang Zhou, Yuqiong Liu, Xiaofei Liu, Zhongyi Liu, and Jun Gao. Scalable graph embedding for asymmetric proximity. In AAAI, pages 2942--2948, 2017. Google ScholarGoogle ScholarDigital LibraryDigital Library
  45. Lekui Zhou, Yang Yang, Xiang Ren, Fei Wu, and Yueting Zhuang. Dynamic network embedding by modeling triadic closure process. 2018.Google ScholarGoogle Scholar

Index Terms

  1. Scalable Graph Embeddings via Sparse Transpose Proximities

        Recommendations

        Comments

        Login options

        Check if you have access through your login credentials or your institution to get full access on this article.

        Sign in
        • Published in

          cover image ACM Conferences
          KDD '19: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
          July 2019
          3305 pages
          ISBN:9781450362016
          DOI:10.1145/3292500

          Copyright © 2019 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 25 July 2019

          Permissions

          Request permissions about this article.

          Request Permissions

          Check for updates

          Qualifiers

          • research-article

          Acceptance Rates

          KDD '19 Paper Acceptance Rate110of1,200submissions,9%Overall Acceptance Rate1,133of8,635submissions,13%

          Upcoming Conference

          KDD '24

        PDF Format

        View or Download as a PDF file.

        PDF

        eReader

        View online with eReader.

        eReader