DOI: 10.1145/3066911.3066918
research-article

Supervised typing of big graphs using semantic embeddings

Published: 19 May 2017

ABSTRACT

We propose a supervised algorithm for generating type embeddings in the same semantic vector space as a given set of entity embeddings. The algorithm is agnostic to the derivation of the underlying entity embeddings. It does not require any manual feature engineering, generalizes well to hundreds of types and achieves near-linear scaling on Big Graphs containing many millions of triples and instances by virtue of an incremental execution. We demonstrate the utility of the embeddings on a type recommendation task, outperforming a non-parametric feature-agnostic baseline while achieving 15× speedup and near-constant memory usage on a full partition of DBpedia. Using state-of-the-art visualization, we illustrate the agreement of our extensionally derived DBpedia type embeddings with the manually curated domain ontology. Finally, we use the embeddings to probabilistically cluster about 4 million DBpedia instances into 415 types in the DBpedia ontology.
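The abstract does not spell out how the type embeddings are computed, so the following is only a minimal sketch of the general idea, not the paper's algorithm. It assumes a centroid-style rule: each type vector is the normalized sum of the embeddings of its labeled instances, accumulated in a single streaming pass, and types are recommended for an entity by cosine similarity. The function names (`build_type_embeddings`, `recommend_types`) and the toy two-dimensional vectors are hypothetical.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def build_type_embeddings(entity_vecs, typed_entities):
    """Sum instance embeddings per type in one pass, then normalize.

    entity_vecs: dict entity -> list of floats (pre-trained by any method).
    typed_entities: iterable of (entity, type) pairs, consumed as a stream.
    ASSUMPTION: the centroid-style update is illustrative only.
    """
    sums = {}
    for entity, type_name in typed_entities:
        vec = entity_vecs.get(entity)
        if vec is None:  # skip entities that have no embedding
            continue
        acc = sums.setdefault(type_name, [0.0] * len(vec))
        for i, x in enumerate(vec):
            acc[i] += x
    return {t: normalize(v) for t, v in sums.items()}


def recommend_types(entity_vec, type_vecs, k=3):
    """Return the k types whose vectors are most similar to the entity."""
    e = normalize(entity_vec)
    scored = [(sum(a * b for a, b in zip(e, tv)), t)
              for t, tv in type_vecs.items()]
    scored.sort(key=lambda s: (-s[0], s[1]))  # best score first, ties by name
    return [(t, score) for score, t in scored[:k]]


# Toy data (hypothetical): two "city-like" and two "person-like" entities.
entity_vecs = {
    "Berlin": [0.9, 0.1],
    "Paris": [0.8, 0.2],
    "Einstein": [0.1, 0.9],
    "Curie": [0.2, 0.8],
}
labels = [("Berlin", "City"), ("Paris", "City"),
          ("Einstein", "Person"), ("Curie", "Person")]

type_vecs = build_type_embeddings(entity_vecs, labels)
top = recommend_types([0.7, 0.3], type_vecs, k=1)
```

Because the pass keeps only one running sum per type, memory in this sketch grows with the number of types rather than the number of instances, which is the kind of behavior the abstract's "incremental execution" suggests.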


      • Published in

        SBD '17: Proceedings of The International Workshop on Semantic Big Data
        May 2017
        57 pages
        ISBN:9781450349871
        DOI:10.1145/3066911

        Copyright © 2017 ACM


        Publisher

        Association for Computing Machinery

        New York, NY, United States


        Acceptance Rates

        SBD '17 Paper Acceptance Rate: 8 of 15 submissions, 53%
        Overall Acceptance Rate: 30 of 54 submissions, 56%
