ABSTRACT
The need to automatically extract and classify the contents of multimedia data archives such as images, video, and text documents has led to significant work on similarity based retrieval of data. To date, most work in this area has focused on the creation of index structures for similarity based retrieval. There is very little work on developing formalisms for querying multimedia databases that support similarity based computations and optimizing such queries, even though it is well known that feature extraction and identification algorithms in media data are very expensive. We introduce a similarity algebra that brings together relational operators and results of multiple similarity implementations in a uniform language. The algebra can be used to specify complex queries that combine different interpretations of similarity values and multiple algorithms for computing these values. We prove equivalence and containment relationships between similarity algebra expressions and develop query rewriting methods based on these results. We then provide a generic cost model for evaluating cost of query plans in the similarity algebra and query optimization methods based on this model. We supplement the paper with experimental results that illustrate the use of the algebra and the effectiveness of query optimization methods using the Integrated Search Engine (I.SEE) as the testbed.
- 1.S. Adall, C. Buff and Y. Temtanapat. "Integrated Search Engine", to appear in the Proc. of i997 IEEE Knowledge and Data Engineering Exchange Workshop, KI)~3X'97. Google ScholarDigital Library
- 2.S. Adatl, K.S. Candan, Y. Papakonstantinou and V.S. Subrahmanian. "Query Caching and Optimization in Distributed Mediator Systems", Proc. of the 1996 Sigmod Conference on Management of Data, pp. 137 - 148. Google ScholarDigital Library
- 3.M. Arya, W. Cody, C. Faloutsos, J. Richardson and A. Toga. (1995) "Design and Implementation of QBISM, a 3D Medical Image Database System", in: (V.S. Subrahmanian and S. Jajodia, eds.) "Multimedia Database Systems: Issues and Research Directions", Springer 1995. Google ScholarDigital Library
- 4.S. Berchtold, D. A. Keim, and Hans-Peter Kriegel. (1996) "The X-tree : An Index Structure for tIigh-Dimensional Data", Proc. 1996 Intl. Conf. on Very Large Databases, Bombay, India, pps 28-39. Google ScholarDigital Library
- 5.T. Blum, D. Keislar, J. Wheaton and E. Wold. (1995) "Audio Databases with Content-based Retrieval", Proc. 1995 IJCAI workshop on Intelligent Multimedia Information Retrieval, Montreal, Canada. Google ScholarDigital Library
- 6.E.W. Brown, J.P. Callan and W. B. Crof. (1994) "Fast incremental Indexing for Pull-Text Information retrieval", Proc. 1994 Intl. Conf. on Very Large databases, Santiago, Chile, pps, 192-202. Google ScholarDigital Library
- 7.M. J. Carey and D. Kossmann (1997) "On Saying "Enough Already!" in SQL." Proceedings of the 1997 Sigmod Conference on Management of Data, pp. 219-230. Google ScholarDigital Library
- 8.S. Chaudhuri and L. Gravano. (1996) "Optimizing Queries over Multimedia Repositories", Proc. 1996 ACM SIGMOD Conf. on Management of Data, pps 91-102. Google ScholarDigital Library
- 9.C. Faloutsos. (1996) "Searching Multimedia Databases by Content", Kluwer Academic Publishers. Google ScholarDigital Library
- 10.L. Gravano, C. Chang, H. Garcia-Molina and A. Paepcke. "STARTS: Stanford Proposal for Internet Meta- Searching", SIGMOD 1997, pp. 207-218. Google ScholarDigital Library
- 11.V.N. Gudivada and V.V. Raghavan. (1993) "Design and Evaluation of Algorithms for Image Retrieval by Spatial Similarity", ACM Transactions on Information Systems, 13,1, pps 115-144. Google ScholarDigital Library
- 12.H. V. Jagadish, A. Mendelzon and T. Milo. "Similarity- Based Queries", Proc. of ACM PODS 1995, pp. 36-45. Google ScholarDigital Library
- 13.K.-I. Lin, H.V. Jagadish and C. Faloutsos. (1994) "The TV-Tree: An Index Structure for High-dimensional Data", VLDB Journal, 3, pps 517-542. Google ScholarDigital Library
- 14.W. Niblack, et. al. (1993) "The QBIC Project: Querying Images by Content Using Color, Texture and Shape", IBM Research Report, Feb. 1993.Google ScholarCross Ref
- 15.S. Santini and R. Jain. (1996) "Similarity Matching", to appear in: IEEE Transactions on Pattern Analysis and Machine Intelligence.Google Scholar
- 16.X. Qian. "Query folding." In Proceedings of the Twelfth International Conference on Data Enginnering, 1996. Google ScholarDigital Library
- 17.A. Tomasic, H. Garcia-Molina and K. Shoens. (1994) "Incremental Updates of Inverted Lists for Text Retrieval", Proc. 1994 ACM SIGMOD Conf. on Management of Data, Minneapolis, pps 289--300. Google ScholarDigital Library
Index Terms
- A multi-similarity algebra
Recommendations
A multi-similarity algebra
The need to automatically extract and classify the contents of multimedia data archives such as images, video, and text documents has led to significant work on similarity based retrieval of data. To date, most work in this area has focused on the ...
Construction of quotient BCI(BCK)-algebra via a fuzzy ideal
The present paper gives a new construction of a quotient BCI(BCK)- algebra X/µ by a fuzzy ideal µ in X and establishes the Fuzzy Homomorphism Fundamental Theorem. We show that if µ is a fuzzy ideal (closed fuzzy ideal) of X. then X/µ is a commutative (...
A similarity based relational algebra for Web and multimedia data
Modelling vagueness and subjectivity in information accessWeb and multimedia data are becoming very important. A fundamental characteristic of these data is imprecision. Query languages for Web and multimedia data must express imprecision in features matching, similarity queries and user preferences. In ...
Comments