ABSTRACT
We propose a supervised algorithm for generating type embeddings in the same semantic vector space as a given set of entity embeddings. The algorithm is agnostic to how the underlying entity embeddings were derived. It requires no manual feature engineering, generalizes well to hundreds of types, and, by virtue of incremental execution, achieves near-linear scaling on big graphs containing many millions of triples and instances. We demonstrate the utility of the embeddings on a type recommendation task, outperforming a non-parametric, feature-agnostic baseline while achieving a 15× speedup and near-constant memory usage on a full partition of DBpedia. Using state-of-the-art visualization, we illustrate the agreement of our extensionally derived DBpedia type embeddings with the manually curated domain ontology. Finally, we use the embeddings to probabilistically cluster about 4 million DBpedia instances into 415 types in the DBpedia ontology.
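The abstract does not spell out the algorithm itself, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes that a type embedding can be formed by accumulating the unit-normalized embeddings of the entities annotated with that type in a single streaming pass, that types are recommended by cosine similarity, and that a softmax over those similarities gives a soft (probabilistic) assignment. All function names and the toy vectors are hypothetical.

```python
import math
from collections import defaultdict


def normalize(vector):
    """Return a unit-length copy of the vector (unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)


def build_type_embeddings(typed_entities, dim):
    """Single streaming pass over (entity_vector, types) pairs.

    Assumption: a type vector is the normalized sum of the unit vectors of
    its entities. Only one accumulator per type is held in memory, so
    memory does not grow with the number of entities.
    """
    sums = defaultdict(lambda: [0.0] * dim)
    for vector, types in typed_entities:
        unit = normalize(vector)
        for type_name in types:
            accumulator = sums[type_name]
            for i, x in enumerate(unit):
                accumulator[i] += x
    return {type_name: normalize(total) for type_name, total in sums.items()}


def recommend_types(vector, type_embeddings):
    """Rank types by cosine similarity to the entity vector, best first."""
    unit = normalize(vector)
    scores = {
        type_name: sum(a * b for a, b in zip(unit, embedding))
        for type_name, embedding in type_embeddings.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def type_probabilities(vector, type_embeddings):
    """Softmax over cosine similarities (an assumed way to get soft assignments)."""
    ranked = recommend_types(vector, type_embeddings)
    best = max(score for _, score in ranked)
    weights = {name: math.exp(score - best) for name, score in ranked}
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


# Toy 2-dimensional entity embeddings with known types (made-up data).
training = [
    ([1.0, 0.1], ["Person"]),
    ([0.9, 0.2], ["Person"]),
    ([0.1, 1.0], ["Place"]),
    ([0.0, 0.8], ["Place"]),
]
type_embeddings = build_type_embeddings(training, dim=2)
ranked = recommend_types([0.8, 0.3], type_embeddings)
probabilities = type_probabilities([0.8, 0.3], type_embeddings)
```

Because the accumulation touches each entity once and keeps one vector per type, a sketch of this kind is consistent with the incremental, near-constant-memory behavior the abstract describes, although the actual procedure may differ.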
Index Terms
- Supervised typing of big graphs using semantic embeddings