DOI: 10.1145/2911996.2912068
short-paper

Retrieval of Multimedia Objects by Fusing Multiple Modalities

Published: 06 June 2016

ABSTRACT

Effective multimedia retrieval requires the combination of the heterogeneous media contained within multimedia objects and the features that can be extracted from them. To this end, we extend a unifying framework that integrates all well-known weighted, graph-based, and diffusion-based fusion techniques that combine two modalities (textual and visual similarities) to model the fusion of multiple modalities. We also provide a theoretical formula for the optimal number of documents that need to be initially selected, so that the memory cost in the case of multiple modalities remains the same as in the case of two modalities. Experiments using two test collections and three modalities (similarities based on visual descriptors, visual concepts, and textual concepts) indicate improvements in the effectiveness over bimodal fusion under the same memory complexity.
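
As a rough illustration of the general idea in the abstract, the sketch below selects an initial set of candidate documents using one modality and then ranks them by a weighted linear combination of per-modality similarity scores. It is a minimal sketch of plain weighted fusion only, not the paper's unifying framework and not its formula for the optimal number of initially selected documents. The function names, the min-max normalization, the equal weights, the choice of the filtering modality, and the toy scores are all assumptions made for the example.

```python
from typing import Dict, List, Sequence


def min_max_normalize(scores: Sequence[float]) -> List[float]:
    """Rescale scores to [0, 1]; a constant vector maps to all zeros."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def fuse_modalities(
    scores: Dict[str, Sequence[float]],
    weights: Dict[str, float],
    filter_modality: str,
    top_l: int,
) -> List[int]:
    """Rank documents by a weighted sum of per-modality similarities.

    Only the top_l documents under `filter_modality` are kept as candidates
    (an assumed design choice), which bounds the size of the fusion step.
    """
    n_docs = len(scores[filter_modality])
    # Step 1: initial selection of candidates using a single modality.
    candidates = sorted(
        range(n_docs), key=lambda d: scores[filter_modality][d], reverse=True
    )[:top_l]
    # Step 2: normalize each modality over the candidates so scales match.
    normalized = {
        m: min_max_normalize([s[d] for d in candidates])
        for m, s in scores.items()
    }
    # Step 3: weighted linear combination, then sort by fused score.
    fused = [
        sum(weights[m] * normalized[m][i] for m in scores)
        for i in range(len(candidates))
    ]
    order = sorted(range(len(candidates)), key=lambda i: fused[i], reverse=True)
    return [candidates[i] for i in order]


# Toy query-to-document similarities for four documents (made-up numbers).
scores = {
    "visual_descriptors": [0.9, 0.2, 0.4, 0.1],
    "visual_concepts": [0.3, 0.8, 0.5, 0.2],
    "textual_concepts": [0.6, 0.7, 0.9, 0.1],
}
weights = {m: 1.0 / len(scores) for m in scores}
ranking = fuse_modalities(scores, weights, "textual_concepts", top_l=3)
print(ranking)
```

Restricting the fusion to `top_l` candidates is what keeps the cost bounded in this sketch: each additional modality adds one more score vector of length `top_l` rather than one over the whole collection.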


Published in

ICMR '16: Proceedings of the 2016 ACM on International Conference on Multimedia Retrieval
June 2016
452 pages
ISBN: 9781450343596
DOI: 10.1145/2911996

Copyright © 2016 ACM

Publisher

Association for Computing Machinery, New York, NY, United States


Acceptance Rates

ICMR '16 Paper Acceptance Rate: 20 of 120 submissions, 17%
Overall Acceptance Rate: 254 of 830 submissions, 31%
