DOI: 10.1145/1247480.1247549
Article

Towards keyword-driven analytical processing

Published: 11 June 2007

ABSTRACT

Gaining business insights from data has recently been the focus of research and product development. On-Line Analytical Processing (OLAP) tools provide elaborate query languages that allow users to group and aggregate data in various ways, and to explore interesting trends and patterns in the data. However, the dynamic nature of today's data, along with the overwhelming detail at which data is provided, makes it nearly impossible to organize the data in the way a business analyst needs in order to think about it. In this paper, we introduce "Keyword-Driven Analytical Processing" (KDAP), which combines intuitive keyword-based search with the power of aggregation in OLAP, without requiring considerable effort to organize the data in terms that the business analyst understands. Our design point is a user mentality that we frequently encounter: "users don't know how to specify what they want, but they know it when they see it". We present our complete solution framework, which implements various phases, from disambiguating the keyword terms to organizing and ranking the results in dynamic facets that allow the user to explore the aggregation space efficiently. We address specific issues that analysts encounter, such as joins, groupings and aggregations, and we provide efficient and scalable solutions. We show how KDAP can handle both categorical and numerical data equally well. Finally, using various experiments on real data, we demonstrate the generality and applicability of KDAP to two different aspects of OLAP: finding exceptions or surprises in the data, and finding bellwether regions, where local aggregates are highly correlated with global aggregates.
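The general idea of pairing keyword matching with OLAP-style aggregation can be illustrated with a minimal sketch. It is not the KDAP framework itself: the toy fact table, the dimension names, and the helper functions match_keyword and facets are all hypothetical, and the sketch omits the join handling, ranking, and scalability techniques the abstract mentions. It only shows a keyword being resolved to candidate (dimension, value) interpretations, and a measure then being aggregated along the remaining dimensions as simple facets.

```python
from collections import defaultdict

# Hypothetical toy fact table: dimension attributes plus one numeric measure.
SALES = [
    {"product": "laptop", "region": "east", "year": 2006, "revenue": 1200.0},
    {"product": "laptop", "region": "west", "year": 2006, "revenue": 800.0},
    {"product": "phone", "region": "east", "year": 2007, "revenue": 300.0},
    {"product": "laptop", "region": "east", "year": 2007, "revenue": 500.0},
]
DIMENSIONS = ("product", "region", "year")


def match_keyword(rows, keyword):
    """Return every (dimension, value) pair whose value contains the keyword.

    Each pair is one candidate interpretation of the keyword.
    """
    needle = keyword.lower()
    hits = set()
    for row in rows:
        for dim in DIMENSIONS:
            if needle in str(row[dim]).lower():
                hits.add((dim, row[dim]))
    return sorted(hits, key=str)


def facets(rows, dim, value, measure="revenue"):
    """Keep rows where dim == value, then sum the measure per remaining dimension."""
    selected = [r for r in rows if r[dim] == value]
    result = {}
    for other in DIMENSIONS:
        if other == dim:
            continue
        totals = defaultdict(float)
        for r in selected:
            totals[r[other]] += r[measure]
        result[other] = dict(totals)
    return result


if __name__ == "__main__":
    for dim, value in match_keyword(SALES, "laptop"):
        print(dim, value, facets(SALES, dim, value))
```

In this sketch every matching (dimension, value) pair is treated as an equally plausible interpretation; a real system would need to rank interpretations and facets, which is outside what the sketch attempts.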

Published in

SIGMOD '07: Proceedings of the 2007 ACM SIGMOD international conference on Management of data
June 2007
1210 pages
ISBN: 9781595936868
DOI: 10.1145/1247480
General Chairs: Lizhu Zhou, Tok Wang Ling
Program Chair: Beng Chin Ooi

      Copyright © 2007 ACM

Publisher: Association for Computing Machinery, New York, NY, United States


Overall Acceptance Rate: 785 of 4,003 submissions, 20%
