research-article

Efficient Binary Coding for Subspace-based Query-by-Image Video Retrieval

Authors:
Ruicong Xu, University of Electronic Science and Technology of China, Chengdu, China
Yang Yang, University of Electronic Science and Technology of China, Chengdu, China
Fumin Shen, University of Electronic Science and Technology of China, Chengdu, China
Ning Xie, University of Electronic Science and Technology of China, Chengdu, China
Heng Tao Shen, University of Electronic Science and Technology of China, Chengdu, China

MM '17: Proceedings of the 25th ACM international conference on Multimedia, October 2017, Pages 1354–1362, https://doi.org/10.1145/3123266.3123392

Published: 19 October 2017

ABSTRACT

Subspace representations are widely used to model videos in many tasks. Subspace-based query-by-image video retrieval (QBIVR) in particular remains challenging, because it requires both a similarity-preserving image-to-video measure and an efficient retrieval scheme. In this paper, we propose a novel subspace-based QBIVR framework that enables efficient video search. We first define a new geometry-preserving distance metric to measure the image-to-video distance, which transforms the QBIVR task into a Maximum Inner Product Search (MIPS) problem. The merit of this metric is that it preserves the genuine geometric relationship between query images and database videos as far as possible. To solve the MIPS problem efficiently, we introduce two asymmetric hashing schemes that bridge the domain gap between images and videos. The first, termed Inner-product Binary Coding (IBC), achieves high-quality binary codes by learning the binary codes and coding functions simultaneously, without continuous relaxations. The second, Bilinear Binary Coding (BBC), employs compact bilinear projections instead of a single large projection matrix to further improve retrieval efficiency. Extensive experiments on four real-world video datasets verify the effectiveness of the proposed approaches in comparison with state-of-the-art methods.
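The abstract does not give the formula of the proposed metric, so the following is only a minimal sketch under stated assumptions. It uses the standard orthogonal-projection distance from a unit-norm query vector q to a video subspace with orthonormal basis U, for which the squared distance is 1 - ||U^T q||^2 = 1 - <q q^T, U U^T>, so the nearest subspace is also the one with the largest inner product. It then illustrates why a bilinear projection sign(R1^T X R2) can replace one large projection matrix. The function names, the random features, and the random (unlearned) projections are all hypothetical stand-ins, not the paper's IBC or BBC.

```python
import numpy as np


def video_subspace(frames, k):
    """Orthonormal basis (d x k) for the top-k directions of a video's frame features.

    `frames` holds one d-dimensional feature vector per row (assumed input layout).
    """
    u, _, _ = np.linalg.svd(frames.T, full_matrices=False)
    return u[:, :k]


def sq_distance_to_subspace(q, basis):
    """Squared Euclidean distance from q to span(basis); basis must be orthonormal."""
    residual = q - basis @ (basis.T @ q)
    return float(residual @ residual)


def inner_product_score(q, basis):
    """Frobenius inner product <q q^T, U U^T>, which equals ||U^T q||^2."""
    return float(np.sum(np.outer(q, q) * (basis @ basis.T)))


def bilinear_code(x_mat, r1, r2):
    """Binary code sign(R1^T X R2) built from two small projection matrices."""
    return np.sign(r1.T @ x_mat @ r2)


rng = np.random.default_rng(0)
d, n_frames, k = 16, 10, 3

# Five synthetic "videos", each reduced to a k-dimensional subspace.
videos = [video_subspace(rng.standard_normal((n_frames, d)), k) for _ in range(5)]

# A unit-norm synthetic query image feature.
q = rng.standard_normal(d)
q /= np.linalg.norm(q)

dists = [sq_distance_to_subspace(q, u) for u in videos]
scores = [inner_product_score(q, u) for u in videos]
best_by_distance = int(np.argmin(dists))
best_by_score = int(np.argmax(scores))

# Bilinear coding of the d x d matrix q q^T with two d x 4 projections (16 bits).
r1 = rng.standard_normal((d, 4))
r2 = rng.standard_normal((d, 4))
x_mat = np.outer(q, q)
code_bilinear = bilinear_code(x_mat, r1, r2)

# The same 16 bits from one large projection (d*d inputs, 16 outputs) on vec(q q^T).
code_full = np.sign(np.kron(r1.T, r2.T) @ x_mat.ravel()).reshape(4, 4)
```

In this sketch the two small matrices store 2 x 16 x 4 = 128 values, while the equivalent single projection stores 256 x 16 = 4096, which is the kind of saving that motivates bilinear projections for high-dimensional inputs.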

Index Terms

Information systems > Information retrieval > Specialized information retrieval > Multimedia and multimodal retrieval


Published in
MM '17: Proceedings of the 25th ACM international conference on Multimedia
October 2017
2028 pages
ISBN: 9781450349062
DOI: 10.1145/3123266
General Chairs: Qiong Liu (FXPAL, USA), Rainer Lienhart (Universität Augsburg, Germany), Haohong Wang (TCL America, USA)
Program Chairs: Sheng-Wei "Kuan-Ta" Chen (Academia Sinica, Taiwan), Susanne Boll (University of Oldenburg, Germany), Phoebe Chen (La Trobe University, Australia), Gerald Friedland (Lawrence Livermore National Lab, USA), Jia Li (Google, USA), Shuicheng Yan (Qihoo 360, China)
Copyright © 2017 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
asymmetric hashing
geometry-preserving distance metric
query-by-image
video retrieval
Acceptance Rates
MM '17 Paper Acceptance Rate: 189 of 684 submissions, 28%. Overall Acceptance Rate: 995 of 4,171 submissions, 24%.