Abstract
The Resource Description Framework (RDF) is a core data representation format for Linked Data and the Semantic Web. It provides a generic graph-based data model for describing things and their relationships with other things. As RDF datasets grow rapidly in size, RDF data management systems must cope with ever larger amounts of data. Although RDF data can be stored physically in a single relational table, querying a giant triple table becomes very expensive because of the multiple nested joins required to answer graph queries. In addition, the heterogeneity of RDF data poses entirely new challenges to database systems. This article provides a comprehensive study of the state of the art in handling and querying RDF data. In particular, we focus on data storage techniques, indexing strategies, and query execution mechanisms. Moreover, we provide a classification of existing systems and approaches. We also provide an overview of the various benchmarking efforts in this context and discuss some of the open problems in this domain.
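To make the join cost of a single triple table concrete, the following is a minimal sketch, not taken from any surveyed system. It assumes a hypothetical three-column table `triples(s, p, o)` in an in-memory SQLite database with invented sample data. It shows that a graph query with two triple patterns already needs one self-join on the table, and each further pattern would add another.

```python
import sqlite3

# Hypothetical sample data: (subject, predicate, object) triples.
TRIPLES = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("bob", "worksAt", "acme"),
    ("carol", "worksAt", "acme"),
    ("alice", "worksAt", "initech"),
]

def build_triple_table(triples):
    """Load all triples into one relational table (the 'triple table' layout)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")
    conn.executemany("INSERT INTO triples VALUES (?, ?, ?)", triples)
    return conn

# Graph pattern with two triple patterns:
#   ?x knows ?y .   ?y worksAt acme .
# Each pattern scans the same table, so the shared variable ?y
# becomes a self-join condition (t1.o = t2.s).
QUERY = """
SELECT t1.s AS x, t1.o AS y
FROM triples AS t1
JOIN triples AS t2 ON t1.o = t2.s
WHERE t1.p = 'knows'
  AND t2.p = 'worksAt'
  AND t2.o = 'acme'
ORDER BY x, y
"""

def friends_at_acme(conn):
    """Return (x, y) pairs where x knows y and y works at acme."""
    return conn.execute(QUERY).fetchall()

conn = build_triple_table(TRIPLES)
result = friends_at_acme(conn)
print(result)
```

In this layout a query with n triple patterns generally translates into n - 1 self-joins over the full table, which is the cost that the storage and indexing techniques covered by the survey aim to reduce.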