ABSTRACT
In this work we propose a method that integrates multi-task learning (MTL) and deep learning. Our method appends an MTL-like loss to a deep convolutional neural network, so that the relations between tasks are learned jointly, and it also incorporates the label correlations between pairs of tasks. We apply the proposed method in a transfer learning scenario, where our objective is to fine-tune the parameters of a network originally trained on a large-scale image dataset for concept detection, so that it can be applied to a target video dataset and a corresponding new set of target concepts. We evaluate the proposed method on the video concept detection problem using the TRECVID 2013 Semantic Indexing dataset. Our results show that the proposed algorithm leads to better concept-based video annotation than existing state-of-the-art methods.
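The abstract describes appending an MTL-like loss, with a label-correlation term over pairs of tasks, to a shared deep network. The following is a minimal NumPy sketch of that idea, not the paper's actual formulation: `mtl_loss`, the squared-error per-task loss, and the use of a correlation matrix `R` to pull weight vectors of correlated concepts together are all illustrative assumptions.

```python
import numpy as np

def mtl_loss(W, X, Y, R, lam=0.1):
    """Hypothetical sketch of an MTL-style loss with a label-correlation
    constraint (illustrative only; not the paper's exact objective).

    W : (d, T) array, one weight vector per task (concept)
    X : (n, d) features from the shared network layers
    Y : (n, T) binary concept labels in {0, 1}
    R : (T, T) label-correlation matrix between pairs of tasks
    lam : weight of the correlation regularizer
    """
    scores = X @ W                        # (n, T) per-task predictions
    # Per-task squared error, a stand-in for the real per-concept loss.
    task_loss = np.mean((scores - Y) ** 2)
    # Correlation constraint: the weight vectors of positively correlated
    # tasks are encouraged to be similar.
    T = W.shape[1]
    reg = 0.0
    for i in range(T):
        for j in range(T):
            diff = W[:, i] - W[:, j]
            reg += R[i, j] * np.dot(diff, diff)
    return task_loss + lam * reg / (T * T)
```

For two tasks with identical weight vectors, the correlation penalty vanishes and the loss reduces to the per-task term; as the vectors of correlated tasks drift apart, the penalty grows.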
Index Terms
- Deep Multi-task Learning with Label Correlation Constraint for Video Concept Detection