ABSTRACT
The bag-of-visual-words (BoW) model has recently been widely advocated for image classification and search. However, one critical limitation of existing BoW models is their lack of semantic information. To alleviate this limitation, it is important to construct a semantic-aware visual dictionary. In this paper, we propose a novel approach for learning a visual word dictionary that embeds intermediate-level semantics. Specifically, we first introduce an Attribute-aware Dictionary Learning (AttrDL) scheme to learn multiple sub-dictionaries with specific semantic meanings. We divide the training images into sets, each representing a specific attribute, and learn an attribute-aware sub-vocabulary for each set. The resulting sub-vocabularies are therefore more discriminative for semantics than traditional vocabularies. Second, to obtain a semantic-aware and discriminative BoW representation from the learned sub-vocabularies, we adopt L2,1-norm regularized sparse coding and recode the resulting sparse representation of each image. Experimental results show that the proposed scheme outperforms state-of-the-art algorithms in both image classification and search tasks.
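The two stages described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how each sub-vocabulary is learned or give the exact coding objective, so plain k-means stands in for the per-attribute learner, and the coding step assumes the objective 0.5 * ||X - C^T D||_F^2 + lam * ||C||_{2,1}, solved by proximal gradient descent. All function names, the attribute names, and the parameter values are hypothetical.

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Plain k-means; a stand-in for the (unspecified) sub-vocabulary learner."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(iters):
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return centers

def learn_sub_vocabularies(features_by_attribute, words_per_attribute):
    """One sub-vocabulary per attribute-specific set of local descriptors."""
    return {attr: kmeans(X, words_per_attribute)
            for attr, X in features_by_attribute.items()}

def l21_norm(C):
    """Sum of the L2 norms of the rows of C."""
    return np.linalg.norm(C, axis=1).sum()

def row_shrink(C, t):
    """Proximal operator of t * ||C||_{2,1}: shrink each row toward zero."""
    norms = np.linalg.norm(C, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - t / np.maximum(norms, 1e-12))
    return C * scale

def l21_sparse_code(X, D, lam=0.1, iters=200):
    """Code descriptors X (n x d) over dictionary D (K x d, one word per row).

    Returns C (K x n). Row k holds the responses of word k across all
    descriptors of the image, so the L2,1 penalty switches whole words off.
    """
    C = np.zeros((D.shape[0], X.shape[0]))
    step = 1.0 / (np.linalg.norm(D, 2) ** 2)  # 1 / Lipschitz constant
    for _ in range(iters):
        grad = D @ (C.T @ D - X).T            # gradient of the smooth term
        C = row_shrink(C - step * grad, step * lam)
    return C

# Toy usage with random descriptors for two hypothetical attributes.
rng = np.random.default_rng(0)
features = {"striped": rng.normal(size=(50, 16)),
            "furry": rng.normal(size=(50, 16)) + 1.0}
vocab = learn_sub_vocabularies(features, words_per_attribute=5)
D = np.vstack([vocab[a] for a in sorted(vocab)])   # concatenated dictionary
codes = l21_sparse_code(rng.normal(size=(30, 16)), D, lam=0.5)
image_repr = np.linalg.norm(codes, axis=1)         # one value per visual word
print(image_repr.shape)
```

Pooling each row of the code matrix by its L2 norm is only one way to turn the codes into a per-image vector; the abstract does not specify how the recoded representation is formed.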
Index Terms
Learning attribute-aware dictionary for image classification and search