DOI: 10.1145/2324796.2324816
research-article

Constrained keypoint quantization: towards better bag-of-words model for large-scale multimedia retrieval

Published: 05 June 2012

ABSTRACT

Bag-of-words models are among the most widely used and successful representations in multimedia retrieval. However, the quantization error introduced when mapping keypoints to visual words is one of the main drawbacks of the bag-of-words model. Although some techniques, such as soft-assignment to bags [23] and query expansion [27], have been introduced to deal with the problem, the performance gain always comes at the cost of longer query response time, which makes them difficult to apply to large-scale multimedia retrieval applications. In this paper, we propose a simple "constrained keypoint quantization" method which can effectively reduce the overall quantization error of the bag-of-words representation and greatly improve the retrieval efficiency at the same time. The central idea of the proposed quantization method is that if a keypoint is far away from all visual words, we simply remove it. At first glance, this simple strategy seems naive and dangerous. However, we show that the proposed method has a solid theoretical background. Our experimental results on three widely used datasets for near-duplicate image and video retrieval confirm that by removing a large number of keypoints which have high quantization error, we obtain comparable or even better retrieval performance while dramatically boosting retrieval efficiency.
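
The central idea stated in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how "far away" is decided, so the `max_dist` threshold, the Euclidean distance, the brute-force nearest-word search, and the function name `constrained_quantize` are all assumptions made for illustration.

```python
import math
from collections import Counter


def constrained_quantize(descriptors, codebook, max_dist):
    """Build a bag-of-words histogram, dropping keypoints far from every word.

    max_dist is a hypothetical threshold (an assumption of this sketch);
    the abstract does not specify the removal criterion.
    Returns (histogram, number_of_keypoints_kept).
    """
    histogram = Counter()
    kept = 0
    for d in descriptors:
        # Nearest visual word by Euclidean distance (brute force for clarity).
        best_word, best_dist = min(
            ((w, math.dist(d, c)) for w, c in enumerate(codebook)),
            key=lambda pair: pair[1],
        )
        # Keep the keypoint only if its quantization error is small enough.
        if best_dist <= max_dist:
            histogram[best_word] += 1
            kept += 1
    return histogram, kept


# Toy 2-D example: two visual words and four keypoint descriptors.
codebook = [(0.0, 0.0), (10.0, 0.0)]
descriptors = [(0.5, 0.0), (9.0, 1.0), (5.0, 5.0), (0.0, 1.0)]
hist, kept = constrained_quantize(descriptors, codebook, max_dist=2.0)
```

In this toy example the descriptor `(5.0, 5.0)` lies far from both visual words and is discarded, so only three keypoints contribute to the histogram. Fewer retained keypoints also means fewer index entries to process at query time, which is consistent with the efficiency gain the abstract reports.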

References

  1. cc_web_video: Near-duplicate web video dataset. available: http://vireo.cs.cityu.edu.hk/webvideo/.
  2. http://www.flickr.com.
  3. http://www.robots.ox.ac.uk/~vgg/data/oxbuildings.
  4. http://www.robots.ox.ac.uk/~vgg/research/affine.
  5. R. Baeza-Yates and B. Ribeiro-Neto. Modern information retrieval. ACM Press, 1999.
  6. O. Boiman, E. Shechtman, and M. Irani. In defense of nearest-neighbor based image classification. In Computer Vision and Pattern Recognition, 2008.
  7. S. Boughhorbel, J.-P. Tarel, and F. Fleuret. Non-mercer kernels for svm object recognition. In British Machine Vision Conference, 2004.
  8. Y. Cai, L. Yang, W. Ping, F. Wang, T. Mei, X.-S. Hua, and S. Li. Million-scale near-duplicate video retrieval system. In ACM Multimedia, 2011.
  9. G. Csurka, C. Dance, L. Fan, J. Willamowski, and C. Bray. Visual categorization with bags of keypoints. Workshop on Statistical Learning in Computer Vision, 2004.
  10. R. O. Duda, P. E. Hart, and D. G. Stork. Pattern Classification. Wiley-Interscience Publication, 2000.
  11. K. Grauman and T. Darrell. The pyramid match kernel: Efficient learning with sets of features. Journal of Machine Learning Research, 2007.
  12. W. Hoeffding. Probability inequalities for sums of bounded random variables. Journal of the American Statistical Association, 1963.
  13. H. Jégou, M. Douze, and C. Schmid. Improving bag-of-features for large scale image search. International Journal of Computer Vision, 2010.
  14. F. Jurie and B. Triggs. Creating efficient codebooks for visual recognition. In Computer Vision and Pattern Recognition, 2005.
  15. Y. Ke, R. Sukthankar, and L. Huston. Efficient near-duplicate detection and sub-image retrieval. In ACM Multimedia, 2004.
  16. D. Li, L. Yang, X.-S. Hua, and H.-J. Zhang. Large-scale robust visual codebook construction. In ACM Multimedia, 2010.
  17. F. Li, W. Tong, R. Jin, A. K. Jain, and J.-E. Lee. An efficient key point quantization algorithm for large scale image retrieval. In ACM workshop on Large-scale multimedia retrieval and mining, 2009.
  18. D. Lowe. Distinctive image features from scale-invariant keypoints. In International Journal of Computer Vision, 2004.
  19. S. Lyu. Mercer kernels for object recognition with local features. In Computer Vision and Pattern Recognition, 2005.
  20. M. Muja and D. G. Lowe. Fast approximate nearest neighbors with automatic algorithm configuration. In International Conference on Computer Vision Theory and Applications (VISAPP'09), 2009.
  21. D. Nister and H. Stewenius. Scalable recognition with a vocabulary tree. In Computer Vision and Pattern Recognition, 2006.
  22. J. Philbin, O. Chum, M. Isard, J. Sivic, and A. Zisserman. Object retrieval with large vocabularies and fast spatial matching. In Computer Vision and Pattern Recognition, 2007.
  23. J. Philbin, O. Chum, M. Isard, J. Sivic, and A. Zisserman. Lost in quantization: Improving particular object retrieval in large scale image databases. In Computer Vision and Pattern Recognition, 2008.
  24. J. Sivic and A. Zisserman. Video Google: A text retrieval approach to object matching in videos. In International Conference on Computer Vision, 2003.
  25. T. Tuytelaars and C. Schmid. Vector quantizing feature space with a regular lattice. In International Conference on Computer Vision, 2007.
  26. X. Wu, A. G. Hauptmann, and C.-W. Ngo. Practical elimination of near-duplicates from web video search. In ACM Multimedia, 2007.
  27. L. Yang, Y. Cai, A. Hanjalic, X.-S. Hua, and S. Li. Video-based image retrieval. In ACM Multimedia, 2011.
  28. L. Yang, B. Geng, Y. Cai, A. Hanjalic, and X.-S. Hua. Object retrieval using visual query context. IEEE Transactions on Multimedia, 2011.
  29. Y. Yang, F. Nie, D. Xu, J. Luo, Y. Zhuang, and Y. Pan. A multimedia retrieval framework based on semi-supervised ranking and relevance feedback. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2012.
  30. Y. Yang, Y.-T. Zhuang, F. Wu, and Y.-H. Pan. Harmonizing hierarchical manifolds for multimedia document semantics understanding and cross-media retrieval. IEEE Transactions on Multimedia, 2008.
  31. W.-L. Zhao, S. Tan, and C.-W. Ngo. Large-scale near-duplicate web video search: challenge and opportunity. In International Conference on Multimedia and Expo, 2009.

Published in

ICMR '12: Proceedings of the 2nd ACM International Conference on Multimedia Retrieval
June 2012
489 pages
ISBN: 9781450313292
DOI: 10.1145/2324796

Copyright © 2012 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher: Association for Computing Machinery, New York, NY, United States


Acceptance Rates

ICMR '12 paper acceptance rate: 50 of 145 submissions, 34%. Overall acceptance rate: 254 of 830 submissions, 31%.
