DOI: 10.1145/1007568.1007649
Article

TOSS: an extension of TAX with ontologies and similarity queries

Published: 13 June 2004

ABSTRACT

TAX is perhaps the best known extension of the relational algebra to handle queries to XML databases. One problem with TAX (as with many existing relational DBMSs) is that the semantics of terms in a TAX DB are not taken into account when answering queries. Thus, even though TAX answers queries with 100% precision, the recall of TAX is relatively low. Our TOSS system improves the recall of TAX via the concept of a similarity enhanced ontology (SEO). Intuitively, an ontology is a set of graphs describing relationships (such as isa, partof, etc.) between terms in a DB. An SEO also evaluates how similarities between terms (e.g. "J. Ullman", "Jeff Ullman", and "Jeffrey Ullman") affect ontologies. Finally, we show how the algebra proposed in TAX can be extended to take SEOs into account. The result is a system that provides a much higher answer quality than TAX does alone (quality is defined as the square root of the product of precision and recall). We experimentally evaluate the TOSS system on the DBLP and SIGMOD bibliographic databases and show that TOSS has acceptable performance.
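As a rough illustration of the quality measure mentioned in the abstract (the square root of the product of precision and recall), the following sketch computes it for two hypothetical result sets. The record identifiers, the helper names, and the numbers are assumptions made up for this example; they are not taken from the paper's experiments.

```python
import math


def precision(retrieved: set, relevant: set) -> float:
    """Fraction of retrieved items that are relevant (0.0 if nothing was retrieved)."""
    if not retrieved:
        return 0.0
    return len(retrieved & relevant) / len(retrieved)


def recall(retrieved: set, relevant: set) -> float:
    """Fraction of relevant items that were retrieved (0.0 if nothing is relevant)."""
    if not relevant:
        return 0.0
    return len(retrieved & relevant) / len(relevant)


def quality(p: float, r: float) -> float:
    """Quality as defined in the abstract: sqrt(precision * recall)."""
    return math.sqrt(p * r)


# Hypothetical ground truth: four records written by the same author,
# stored under different spellings of the name.
relevant = {"rec1", "rec2", "rec3", "rec4"}

# Hypothetical exact-match answer: only the record with the literal spelling.
exact_answer = {"rec1"}

# Hypothetical similarity-aware answer: all four records plus one false match.
similar_answer = {"rec1", "rec2", "rec3", "rec4", "rec9"}

for name, answer in [("exact", exact_answer), ("similarity", similar_answer)]:
    p = precision(answer, relevant)
    r = recall(answer, relevant)
    print(f"{name}: precision={p:.2f} recall={r:.2f} quality={quality(p, r):.3f}")
```

In this made-up case the exact-match answer has perfect precision but low recall, so its quality is 0.5, while the similarity-aware answer gives up a little precision for full recall and scores higher. That is the trade-off the abstract describes.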


  • Published in

    SIGMOD '04: Proceedings of the 2004 ACM SIGMOD international conference on Management of data
    June 2004
    988 pages
    ISBN:1581138598
    DOI:10.1145/1007568

    Copyright © 2004 ACM

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Qualifiers

    • Article

    Acceptance Rates

    Overall Acceptance Rate: 785 of 4,003 submissions, 20%
