research-article

Querying Regular Graph Patterns

Published: 01 January 2014

Abstract

Graph data appears in a variety of application domains, and many uses of it, such as querying, matching, and transforming data, naturally result in incompletely specified graph data, that is, graph patterns. While queries need to be posed against such data, techniques for querying patterns are generally lacking, and properties of such queries are not well understood.

Our goal is to study the basics of querying graph patterns. The key features of the patterns we consider are node and label variables, and edges specified by regular expressions. We provide a classification of patterns and study standard graph queries on graph patterns. We give precise characterizations of both data and combined complexity for each class of patterns. Where complexity is high, we further analyze the features that lead to intractability and identify lower-complexity restrictions. Since our patterns are based on regular expressions, query answering for them can be captured by a new automata model. These automata have two modes of acceptance: one captures queries returning nodes, and the other captures queries returning paths. We study properties of such automata and the key computational tasks associated with them. Finally, we provide additional restrictions for tractability, and show that some intractable cases can be naturally cast as instances of constraint satisfaction problems.
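
As a rough illustration of what an edge "specified by a regular expression" asks for, the sketch below evaluates a single regular path query on a small, fully specified edge-labelled graph. It uses the standard product of the graph with a finite automaton; it is not the automata model introduced in the article, and it does not handle node or label variables or incomplete data. The graph, the labels, the hand-built automaton for `a b*`, and the function name `reachable_by_regex` are all assumptions made for the example.

```python
from collections import deque

# Hypothetical edge-labelled graph given as (source, label, target) triples.
EDGES = [
    ("n1", "a", "n2"),
    ("n2", "b", "n3"),
    ("n3", "b", "n4"),
    ("n1", "c", "n4"),
]

# Hand-built NFA for the regular expression  a b* :
# state 0 --a--> state 1, state 1 --b--> state 1; state 1 is accepting.
NFA_START = 0
NFA_ACCEPT = {1}
NFA_DELTA = {(0, "a"): {1}, (1, "b"): {1}}


def reachable_by_regex(edges, source, start, accept, delta):
    """Return the nodes reachable from `source` along a path whose
    sequence of edge labels is accepted by the given NFA."""
    # Adjacency list: node -> list of (label, target).
    out = {}
    for u, label, v in edges:
        out.setdefault(u, []).append((label, v))

    # Breadth-first search over (graph node, automaton state) pairs.
    seen = {(source, start)}
    queue = deque(seen)
    answers = set()
    while queue:
        node, state = queue.popleft()
        if state in accept:
            answers.add(node)
        for label, nxt in out.get(node, []):
            for nxt_state in delta.get((state, label), ()):
                pair = (nxt, nxt_state)
                if pair not in seen:
                    seen.add(pair)
                    queue.append(pair)
    return answers


if __name__ == "__main__":
    print(sorted(reachable_by_regex(EDGES, "n1", NFA_START, NFA_ACCEPT, NFA_DELTA)))
```

The search visits each (node, state) pair at most once, so it runs in time proportional to the size of the graph times the size of the automaton. Querying patterns, where nodes and labels may be unknown, is the harder setting the article addresses.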

      • Published in

        Journal of the ACM  Volume 61, Issue 1
        January 2014
        222 pages
        ISSN:0004-5411
        EISSN:1557-735X
        DOI:10.1145/2578041
        Copyright © 2014 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 1 January 2014
        • Accepted: 1 November 2013
        • Revised: 1 May 2013
        • Received: 1 November 2011

        Qualifiers

        • research-article
        • Research
        • Refereed
