RDF Data Storage and Query Processing Schemes: A Survey

Published: 06 September 2018

Abstract

The Resource Description Framework (RDF) is a core data representation format for Linked Data and the Semantic Web. It supports a generic graph-based data model for describing things, including their relationships with other things. As RDF datasets grow rapidly in size, RDF data management systems must be able to cope with ever larger amounts of data. Although RDF data can be physically stored in a single relational table, querying a giant triple table becomes very expensive because of the multiple nested joins required to answer graph queries. In addition, the heterogeneity of RDF data poses entirely new challenges to database systems. This article provides a comprehensive study of the state of the art in handling and querying RDF data. In particular, we focus on data storage techniques, indexing strategies, and query execution mechanisms. Moreover, we provide a classification of existing systems and approaches. We also provide an overview of the various benchmarking efforts in this context and discuss some of the open problems in this domain.

Published in

ACM Computing Surveys, Volume 51, Issue 4
July 2019
765 pages
ISSN: 0360-0300
EISSN: 1557-7341
DOI: 10.1145/3236632
Editor: Sartaj Sahni

Copyright © 2018 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery
New York, NY, United States

Publication History

• Published: 6 September 2018
• Revised: 1 December 2017
• Accepted: 1 December 2017
• Received: 1 November 2016

Qualifiers

• Survey
• Research
• Refereed