ABSTRACT
Query processing and optimization in mediator systems that access distributed non-proprietary sources pose many novel problems. Cost-based query optimization is hard because the mediator does not have access to source statistics information and furthermore it may not be easy to model the source's performance. At the same time, querying remote sources may be very expensive because of high connection overhead, long computation time, financial charges, and temporary unavailability. We propose a cost-based optimization technique that caches statistics of actual calls to the sources and consequently estimates the cost of the possible execution plans based on the statistics cache. We investigate issues pertaining to the design of the statistics cache and experimentally analyze various tradeoffs. We also present a query result caching mechanism that allows us to effectively use results of prior queries when the source is not readily available. We employ the novel invariants mechanism, which shows how semantic information about data sources may be used to discover cached query results of interest.
- {1} S. Abiteboul and A. Bonner. (1991) Obiects and view., In Proc. of the ACM SIGMOD Conf. on Management of Data, pp. 238-247, 1991. Google Scholar
- {2} S. Abiteboul, S. Cluet, and T. Mile. (1993) Querying and updating the file., In Proc. Int. Conf. on Very Large Data Bases (VLDB), pp. 73-84, 1993. Google Scholar
- {3} S. Adah, K.S. Candan, Su-Shing Chen, K. Erol and V.S. Subrahmanian. (1995) Advanced Video Information System: Data Structures and Query Processing., Accepted for publication in the ACM-Springer Multi-media Systems Journal. Google Scholar
- {4} S. Adah and V.S. Subrahmanian. (1994) Amalgamating Knowledge Bases, III: Algorithms, data structures and query processing. Technical Report CS-TR-3124, Computer Science Department, University of Maryland, Aug. 1993. Accepted for publication in Journal of Logic Programming. (http://www.cs.umd.edu/projects/hermes/publications/abstracts/akbiii.ps)Google Scholar
- {5} S. Adah and R. Emery. (1995) A Uniform Framework For Integrating Knowledge In Heterogeneous Knowledge Systems, Proc. of the Eleventh International Conference on Data Engineering, pp. 513- 520, (http://www.cs.umd.edu/projects/hermes/publications/abstracts/cons.ps) Google Scholar
- {6} S. Adah and V.S. Subrahmanian. (1995) Intelligent Caching in Heterogeneous Reasoning and Mediator Systems, Proc. of the Second International Conference on Building and Sharing of Very Large-Scale Knowledge Bases (cd. N. Mars), pps 247-256, IOS Press, Twente, The Netherlands, May 1995.Google Scholar
- {7} J. Blakeley, N. Coburn, and P.-A. Larson. (1989) Updating derived relations: Detecting irrelevant and autonomously computable updates., ACM Trans. on Database Systems, 14(3):369-400, 1989. Google ScholarDigital Library
- {8} Stefano Ceri and Jennifer Widom. Deriving Production Rules for Incremental View Maintenance., Proc. of the 17th VLDB Conference, 1991. Google Scholar
- {9} U. Dayal. (1989) Queries and view in a object-oriented databases., In Int. Workshop on Database Programming Languages, 1989. Google Scholar
- {10} U. Dayal and H. Hwang. (1984) View definition and generalization for database integration in a multidatabase system., IEEE Trans. Software Eng., SE- 10(6):628-644, 1984.Google Scholar
- {11} N. Gehani, H. Jagadish, and W. Roome. (1994) OdeFS: A file system interface to an object-oriented database., Proc. Int. Conf. on Very Large Databases (VLDB), pp. 249-260, 1994. Google Scholar
- {12} Ashish Gupta, Dinesh Katiyar, and Inderpal Singh Mumick. (1992) Counting Solutions to the View Maintenance Problem., In Workshop on Deductive Databases, JICSLP, 1992.Google Scholar
- {13} A. Gupta, I.S. Mumick and V.S. Subrahmanian. (1993) Maintaining Views Incrementally., Proc. 1993 ACM SIGMOD Conf. on Management of Data, Washington, DC. Google Scholar
- {14} E. Hanson. (1987) A performance analysis of view materialization strategies., In Proc. of the ACM SIGMOD Conf. on Management of Data, pp. 440-453, 1987. Google Scholar
- {15} A. Kemper, C. Kilger, G. Moerkotte. (1994) Function Materialization in Object Bases: Design, Realization, and Evaluation., IEEE Transactions on Knowledge and Data Engineering, Vol. 6, No. 4, August 1994. Google Scholar
- {16} Laks V.S. Lakshmanan, F. Sadri amd I.N. Subramanian. (1993) On the logical foundations of schema integration and evolution in Heterogeneous Database Systems. , Proc. DOOD-93, Phoenix, Arizona.Google Scholar
- {17} Laks V.S. Lakshmanan, F. Sadri and I.N. Subramanian. (1995) Logic and Algebraic Languages for Interoperability in Multidatabase Systems, submitted to Journal of Logic Programming.Google Scholar
- {18} S. Leach and J. Lu. (1994) Computing Annotated Logic Programs., Proceedings of the 11th International Conference on Logic Programming (ed. P. Van Hentenryck), MIT Press, pps 257-271. Google Scholar
- {19} J. Lu, G. Moerkotte, J. Schue, V.S. Subrahmanian. (1995) Efficient Maintenance of Materialized Mediated Views, Proc. 1995 ACM SIGMOD Conf. on Management of Data, San Jose, CA, May 1995. Google Scholar
- {20} J. Lu, A. Nerode and V.S. Subrahmanian. (1993) Hybrid Knowledge Bases, Accepted for publication in: IEEE Trans. on Knowledge and Data Engineering. Google Scholar
- {21} A. Motro. (1987) Superviews: Virtual integration of multiple databases., IEEE Trans. Software Eng., 13(7):785-798, 1987. Google Scholar
- {22} I. S. Mumick. (1991) Query Optimization in Deductive and Relational Databases., Ph.D. Thesis, Stanford University, CA 94305, 1991. Google Scholar
- {23} M. Scholl, C. Laasch, and M. Tresch. (1991) Updatable views in object-oriented databases., In Proc. Int. Conf. on Deductive and Object-Oriented Databases (DOOD), 1991.Google Scholar
- {24} A. Sheth and J. Larson. (1990) Federated database systems for managing distributed, heterogeneous and autonomous databases., ACM Computing Surveys, 22(3):183-236, 1990. Google Scholar
- {25} Oded Shmueli and Alon Itai. (1984) Maintenance of Views. In Sigmod Record, 14(2):240-255, 1984. Google Scholar
- {26} M. Stonebraker, A. Jhingran, J. Goh, and S. Potamianos. (1990) On rules, procedures, caching and views in data base systems., In Proc. ACM SIGMOD Int. Conf. on Management of Data, pp. 281-290, 1990. Google Scholar
- {27} V.S. Subrahmanian. (1994) Amalgamating Knowledge Bases., ACM Trans, on Database Systems, 19, 2, pps 291-331, 1994. Google Scholar
- {28} V.S. Subrahmanian, S. Adah, A. Brink, R. Emery, J. Lu, A. Rajput, T.J. Rogers, R. Ross. (1994) HERMES: A Heterogeneous Reasoning and Mediator System, submitted for publication. (http://www.cs.umd.edu/projects/hermes/ overview/paper)Google Scholar
- {29} G. Wiederhold. (1992) Mediators in the Architecture of Future Information Systems, IEEE Computer, March 1992, pps 38-49. Google Scholar
- {30} G. Wiederhold, S. Jajodia, and W. Litwin. (1993) Integrating temporal data in a heterogeneous environment. , In Temporal Databases Benjamin/Cummings, Jan. 1993.Google Scholar
- {31} P. Selinger, M. Astrahan, D. Chamberlin, R. Lorie and T. Price. (1979) Access Path Selection in a Relational Database Management System., Proc. of the 1979 ACM SIGMOD International Conference on Management of Data, pp. 22-34. Google Scholar
- {32} D. Chimenti, R. Gamboa and R. Krishnamurty. (1989) Towards an open architecture for LDL., Proc. of the 15th International VLDB Conference, pp. 195-203, Amsterdam, The Netherlands, August 1989. Google Scholar
- {33} S. Chaudhuri and K. Shim. (1993) Query optimization in the Presence of Foreign Functions., Proc. of the 19th VLDB Conference. Google Scholar
- {34} S. Chaudhuri, R. Krishnamurty, S. Potamianos and K. Shim. (1995) Optimizing Queries with Materialized Views, Proc. of International Conference on Data Engineering, pp. 109-200, 1995. Google Scholar
- {35} J. D. Ullman. Principles of Database and knowledge Base Systems, volume 2. Computer Science Press, 1989. Google Scholar
- {36} C. M. Chen and N. Rousopoulos. (1994) Adaptive Selectivity Estimation Using Query Feedback., Proc. of the 1994 ACM-SIGMOD Conference on Management of Data, pp. 161-172. Google Scholar
- {37} H. Tamaki and T. Sato. (1986) OLD Resolution with Tabulation, Proc. 3rd Intl. Conf. on Logic Programming (ed. E. Shapiro), pps 84-98, Springer. Google Scholar
- {38} D. S. Warren. (1992) Memoing for Logic Programs, Comm. of the ACM, 35, 3, pps 94-111. Google Scholar
- {39} X. Qian. (1995) Query folding., To appear in the Proc. of the 1996 IEEE Data Engineering Conf., Technical Report SRI-CSL-95-09, Computer Science Laboratory, SRI International, July 1995. Google Scholar
- {40} S. Adah and X. Qian. (1995) Query Transformation in Heterogeneous Reasoning Systems., Submitted for publication.Google Scholar
- {41} S. Chawathe, H. Garcia-Molina, J. Hammer, K. Ireland, Y. Papakonstantinou, J. Ullman, and J. Widom. (1994) The TSIMMIS Project: Integration of Heterogeneous Information Sources., In Proceedings of IPSJ Conference, Tokyo, Japan, October 1994. (Also available via anonymous FTP from host db.stanford.edu, file/pub/chawathe/1994/tsimmis-overview.ps.)Google Scholar
- {42} W. Du and R. Krishnamurthy and M.-C. Shan. (1992) Query Optimization in Heterogeneous Database Management Systems., In Proc. VLDB Conference, pp. 277-291, Vancouver, Canada, 1992. Google Scholar
- {43} Q. Zhu and P.-A. Larson. (1994) A Query Sampling Method for Estimating Local Cost Parameters in a Multidatabase System., Proc. IEEE Data Engineering Conf., pp. 144-153, 1994. Google Scholar
- {44} H. Lu and B.-C. Ooi and C.-H. Goh. (1993) Multidatabase Query Optimization: Issues and Solutions., Proc. RIDE-IMS '93, pp. 137-143, 1993.Google Scholar
Index Terms
- Query caching and optimization in distributed mediator systems
Recommendations
Query caching and optimization in distributed mediator systems
Query processing and optimization in mediator systems that access distributed non-proprietary sources pose many novel problems. Cost-based query optimization is hard because the mediator does not have access to source statistics information and ...
Selective Victim Caching: A Method to Improve the Performance of Direct-Mapped Caches
Although direct-mapped caches suffer from higher miss ratios as compared to set-associative caches, they are attractive for today's high-speed pipelined processors that require very low access times. Victim caching was proposed by Jouppi [1] as an ...
Comments