ABSTRACT
Gaining business insights from data has recently been the focus of research and product development. On-Line Analytical Processing (OLAP) tools provide elaborate query languages that allow users to group and aggregate data in various ways, and to explore interesting trends and patterns in the data. However, the dynamic nature of today's data, along with the overwhelming level of detail at which it is provided, makes it nearly impossible to organize the data in the way a business analyst needs in order to reason about it. In this paper, we introduce "Keyword-Driven Analytical Processing" (KDAP), which combines intuitive keyword-based search with the power of aggregation in OLAP, without requiring considerable effort to organize the data in terms that the business analyst understands. Our design centers on a user mentality that we frequently encounter: "users don't know how to specify what they want, but they know it when they see it". We present our complete solution framework, which implements phases ranging from disambiguating the keyword terms to organizing and ranking the results in dynamic facets that allow the user to explore the aggregation space efficiently. We address specific issues that analysts encounter, such as joins, groupings and aggregations, and we provide efficient and scalable solutions. We show how KDAP handles both categorical and numerical data equally well. Finally, using various experiments on real data, we demonstrate the generality and applicability of KDAP to two different aspects of OLAP: finding exceptions or surprises in the data, and finding bellwether regions where local aggregates are highly correlated with global aggregates.
Index Terms
- Towards keyword-driven analytical processing