short-paper

Retrieval of Multimedia Objects by Fusing Multiple Modalities

Authors:
Ilias Gialampoukidis, ITI-CERTH, Thessaloniki, Greece
Anastasia Moumtzidou, ITI-CERTH, Thessaloniki, Greece
Theodora Tsikrika, ITI-CERTH, Thessaloniki, Greece
Stefanos Vrochidis, ITI-CERTH, Thessaloniki, Greece
Ioannis Kompatsiaris, ITI-CERTH, Thessaloniki, Greece

ICMR '16: Proceedings of the 2016 ACM on International Conference on Multimedia Retrieval, June 2016, Pages 359–362, https://doi.org/10.1145/2911996.2912068

Published: 6 June 2016

ABSTRACT

Effective multimedia retrieval requires the combination of the heterogeneous media contained within multimedia objects and the features that can be extracted from them. To this end, we extend a unifying framework that integrates all well-known weighted, graph-based, and diffusion-based fusion techniques that combine two modalities (textual and visual similarities) to model the fusion of multiple modalities. We also provide a theoretical formula for the optimal number of documents that need to be initially selected, so that the memory cost in the case of multiple modalities remains the same as in the case of two modalities. Experiments using two test collections and three modalities (similarities based on visual descriptors, visual concepts, and textual concepts) indicate improvements in the effectiveness over bimodal fusion under the same memory complexity.
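
As a rough illustration of the two ideas in the abstract, the sketch below shows (a) a plain weighted linear fusion of per-modality similarity scores and (b) how a candidate-set size could be shrunk so that three modalities fit the memory budget of two. This is not the authors' framework or their formula, neither of which is given in this abstract. The function names, the example scores, the equal weights, and the cost model (one l-by-l similarity matrix per modality) are all assumptions made for illustration.

```python
import math
from typing import List, Sequence


def fuse_scores(scores_by_modality: Sequence[Sequence[float]],
                weights: Sequence[float]) -> List[float]:
    """Weighted linear combination of per-document similarity scores.

    scores_by_modality[m][i] is the similarity of document i to the query
    under modality m (hypothetical input layout).
    """
    if len(scores_by_modality) != len(weights):
        raise ValueError("one weight per modality is required")
    n_docs = len(scores_by_modality[0])
    if any(len(s) != n_docs for s in scores_by_modality):
        raise ValueError("all modalities must score the same documents")
    fused = [0.0] * n_docs
    for scores, w in zip(scores_by_modality, weights):
        for i, s in enumerate(scores):
            fused[i] += w * s
    return fused


def rank(fused: Sequence[float]) -> List[int]:
    """Document indices ordered from highest to lowest fused score."""
    return sorted(range(len(fused)), key=lambda i: fused[i], reverse=True)


def equal_memory_top_l(l_two: int, num_modalities: int) -> int:
    """Largest l with num_modalities * l**2 <= 2 * l_two**2.

    ASSUMPTION: memory is dominated by one l-by-l similarity matrix per
    modality. The paper's actual formula may differ.
    """
    budget = 2 * l_two * l_two
    return math.isqrt(budget // num_modalities)


# Hypothetical scores for three documents under three modalities:
# visual descriptors, visual concepts, textual concepts.
visual = [0.2, 0.9, 0.4]
visual_concepts = [0.1, 0.8, 0.7]
textual_concepts = [0.9, 0.1, 0.5]

fused = fuse_scores([visual, visual_concepts, textual_concepts],
                    [1 / 3, 1 / 3, 1 / 3])
ranking = rank(fused)
l_three = equal_memory_top_l(1000, 3)
```

Under the assumed cost model, moving from two to three modalities shrinks the initial candidate set by a factor of about sqrt(2/3), which is the kind of trade-off the abstract describes: fewer initially selected documents in exchange for an extra modality at the same memory cost.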

Index Terms

1. Information systems
  1. Information retrieval
    1. Retrieval models and ranking
      1. Combination, fusion and federated search
    2. Specialized information retrieval
      1. Multimedia and multimodal retrieval

Published in
ICMR '16: Proceedings of the 2016 ACM on International Conference on Multimedia Retrieval
June 2016
452 pages
ISBN: 9781450343596
DOI: 10.1145/2911996
General Chairs: John R. Kender (Columbia University, USA), John R. Smith (IBM Research, USA)
Program Chairs: Jiebo Luo (University of Rochester, USA), Susanne Boll (University of Oldenburg, Germany), Winston Hsu (National Taiwan University, Taiwan)
Copyright © 2016 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
graph-based methods
multimedia information retrieval
multimodal fusion
unsupervised fusion
Acceptance Rates
ICMR '16 paper acceptance rate: 20 of 120 submissions, 17%. Overall acceptance rate: 254 of 830 submissions, 31%.