ABSTRACT
Several probabilistic extensions of description logic languages have been proposed and thoroughly studied. However, their practical use has been hampered by intractability of various reasoning tasks. While present-day knowledge bases (KBs) contain millions of instances and thousands of axioms, most state-of-the-art reasoners are capable of handling small scale KBs with thousands of instances. Thus, recent research has focused on leveraging the structure of KBs and queries in order to speed up inference runtime. However, these efforts have not been satisfactory in providing reasoners that are suitable for practical use in large scale KBs. In this study, we aim to tackle this challenging problem. In doing so, we use a probabilistic extension of OWL RL (called PRORL) as a modeling language and exploit graph neighborhoods (of undirected graphical models) for efficient approximate probabilistic inference. We show that subgraph extraction based inference is much faster and has comparable accuracy to full graph inference. We perform several experiments, in order to support our claim, over a NELL KB containing millions of instances and thousands of axioms. Furthermore, we propose a novel graph-based algorithm to automatically partition inferences rules based on their structure for efficient parallel inference.
- Sören Auer, Christian Bizer, Georgi Kobilarov, Jens Lehmann, Richard Cyganiak, and Zachary Ives. 2007. Dbpedia: A nucleus for a web of open data. In The semantic web. Springer, 722--735.Google ScholarDigital Library
- Kurt Bollacker, Colin Evans, Praveen Paritosh, Tim Sturge, and Jamie Taylor. 2008. Freebase: a collaboratively created graph database for structuring human knowledge. In Proceedings of the 2008 ACM SIGMOD international conference on Management of data. AcM, 1247--1250.Google ScholarDigital Library
- Rodrigo De Salvo Braz, Eyal Amir, and Dan Roth. 2005. Lifted first-order probabilistic inference. In IJCAI. Citeseer, 1319--1325.Google Scholar
- Guy Broeck. 2011. On the completeness of first-order knowledge compilation for lifted probabilistic inference. In Advances in Neural Information Processing Systems. 1386--1394.Google Scholar
- Andrew Carlson, Justin Betteridge, Bryan Kisiel, Burr Settles, Estevam R Hruschka Jr, and Tom M Mitchell. 2010. Toward an Architecture for Never-Ending Language Learning.. In AAAI. 306--1313.Google ScholarDigital Library
- Melisachew Wudage Chekol, Jakob Huber, Christian Meilicke, and Heiner Stuckenschmidt. 2016. Markov Logic Networks with Numerical Constraints. In ECAI 2016. 1017--1025.Google Scholar
- Yang Chen and Daisy Zhe Wang. 2014. Knowledge expansion over probabilistic knowledge bases. In SIGMOD. ACM, 649--660.Google Scholar
- Yang Chen, Daisy ZheWang, and Sean Goldberg. 2016. ScaLeKB: scalable learning and inference over large knowledge bases. The VLDB Journal 25, 6 (2016), 893-- 918.Google ScholarDigital Library
- Luc De Raedt, Anton Dries, Ingo Thon, Guy Van den Broeck, and Mathias Verbeke. 2015. Inducing probabilistic relational rules from probabilistic examples. In IJCAI. 1835--1842.Google ScholarDigital Library
- Xin Dong, Evgeniy Gabrilovich, Geremy Heitz, Wilko Horn, Ni Lao, Kevin Murphy, Thomas Strohmann, Shaohua Sun, and Wei Zhang. 2014. Knowledge Vault: A Web-Scale Approach to Probabilistic Knowledge Fusion. In SIGKDD. 601--610.Google ScholarDigital Library
- Laécio L. dos Santos, Rommel N. Carvalho, Marcelo Ladeira, Weigang Li, and Gilson Libório Mendes. 2015. PR-OWL2 RL - A Language for Scalable Uncertainty Reasoning on the Semantic Web information. In URSW co-located with ISWC, Bethlehem, USA, October 12, 2015. 14--25.Google Scholar
- Anthony Fader, Stephen Soderland, and Oren Etzioni. 2011. Identifying Relations for Open Information Extraction. In Proceedings of the Conference of Empirical Methods in Natural Language Processing (EMNLP '11). Edinburgh, Scotland, UK.Google ScholarDigital Library
- Luis Galárraga, Christina Teflioudi, Katja Hose, and Fabian M Suchanek. 2015. Fast Rule Mining in Ontological Knowledge Bases with AMIE+. The VLDB Journal 24, 6 (2015), 707--730.Google ScholarDigital Library
- Kiril Gashteovski, Rainer Gemulla, and Luciano Del Corro. 2017. MinIE: Minimizing Facts in Open Information Extraction. In EMNLP, Copenhagen, Denmark, September 9--11, 2017. 2630--2640.Google ScholarCross Ref
- Eric Gribkoff and Dan Suciu. 2016. SlimShot: In-Database Probabilistic Inference for Knowledge Bases. PVLDB 9, 7 (2016), 552--563.Google ScholarDigital Library
- Johannes Hoffart, Fabian M Suchanek, Klaus Berberich, Edwin Lewis-Kelham, Gerard De Melo, and Gerhard Weikum. 2011. YAGO2: exploring and querying world knowledge in time, space, context, and many languages. In Proceedings of the 20th international conference companion on World wide web. ACM, 229--232.Google ScholarDigital Library
- Jean Christoph Jung and Carsten Lutz. 2012. Ontology-based access to probabilistic data with OWL QL. In ISWC. Springer, 182--197.Google Scholar
- Tushar Khot, Niranjan Balasubramanian, Eric Gribkoff, Ashish Sabharwal, Peter Clark, and Oren Etzioni. 2015. Markov logic networks for natural language question answering. arXiv preprint arXiv:1507.03045 (2015).Google Scholar
- Angelika Kimmig, Bart Demoen, Luc De Raedt, Vitor Santos Costa, and Ricardo Rocha. 2011. On the implementation of the probabilistic logic programming language ProbLog. Theory and Practice of Logic Programming 11, 2--3 (2011), 235--262.Google ScholarDigital Library
- Markus Krötzsch. 2012. OWL 2 Profiles: An Introduction to Lightweight Ontology Languages. In Proceedings of the 8th Reasoning Web Summer School, Vienna, Austria, September 3--8 2012 (LNCS), Thomas Eiter and Thomas Krennwallner (Eds.), Vol. 7487. Springer, 112--183.Google ScholarCross Ref
- Happy Mittal, Anuj Mahajan, Vibhav G Gogate, and Parag Singla. 2015. Lifted inference rules with constraints. In Advances in Neural Information Processing Systems. 3519--3527.Google Scholar
- Mathias Niepert, Jan Noessner, and Heiner Stuckenschmidt. 2011. Log-linear description logics. In IJCAI. 2153--2158.Google Scholar
- Feng Niu, Christopher Ré, AnHai Doan, and Jude Shavlik. 2011. Tuffy: Scaling up statistical inference in markov logic networks using an RDBMS. Proc. of the VLDB Endowment 4, 6 (2011), 373--384.Google ScholarDigital Library
- Jan Noessner, Mathias Niepert, and Heiner Stuckenschmidt. 2013. RockIt: Exploiting Parallelism and Symmetry for MAP Inference in Statistical Relational Models. In Twenty-Seventh AAAI Conference on Artificial Intelligence.Google Scholar
- David Poole. 2003. First-order probabilistic inference. In IJCAI, Vol. 3. 985--991.Google ScholarDigital Library
- MatthewRichardson and Pedro Domingos. 2006. Markov logic networks. Machine learning 62, 1--2 (2006), 107--136.Google Scholar
- Stefan Schoenmackers, Oren Etzioni, Daniel SWeld, and Jesse Davis. 2010. Learning first-order horn clauses from web text. In EMNLP. 1088--1098.Google Scholar
- Jaeho Shin, SenWu, FeiranWang, Christopher De Sa, Ce Zhang, and Christopher Ré. 2015. Incremental knowledge base construction using deepdive. Proceedings of the VLDB Endowment 8, 11 (2015), 1310--1321.Google ScholarDigital Library
- Dan Suciu, Dan Olteanu, Christopher Ré, and Christoph Koch. 2011. Probabilistic databases. Synthesis Lectures on Data Management 3, 2 (2011), 1--180.Google ScholarDigital Library
- Denny Vrandecic and Markus Krötzsch. 2014. Wikidata: a free collaborative knowledgebase. Commun. ACM 57, 10 (2014), 78--85.Google ScholarDigital Library
- Ce Zhang and Christopher Ré. 2014. DimmWitted: A Study of Main-Memory Statistical Analytics. PVLDB 7, 12 (2014), 1283--1294.Google ScholarDigital Library
Index Terms
- Leveraging Graph Neighborhoods for Efficient Inference
Recommendations
Automatic differentiation variational inference
Probabilistic modeling is iterative. A scientist posits a simple model, fits it to her data, refines it according to her analysis, and repeats. However, fitting complex models to large data is a bottleneck in this process. Deriving algorithms for new ...
Efficient Attack Graph Analysis through Approximate Inference
Attack graphs provide compact representations of the attack paths an attacker can follow to compromise network resources from the analysis of network vulnerabilities and topology. These representations are a powerful tool for security risk assessment. ...
Towards Partition-Aware Lifted Inference
CIKM '18: Proceedings of the 27th ACM International Conference on Information and Knowledge ManagementThere is an ever increasing number of rule learning algorithms and tools for automatic knowledge base (KB) construction. These tools often produce weighted rules and facts that make up a probabilistic KB (PKB). In such a PKB, probabilistic inference is ...
Comments