DOI: 10.1145/3126686.3126740
research-article

Robust and Real-Time Visual Tracking with Triplet Convolutional Neural Network

Published: 23 October 2017

ABSTRACT

In this paper, we propose a new visual object tracking method that is robust to object occlusion and deformation. The method is built on a triplet convolutional neural network (triplet-CNN). The three inputs to the triplet-CNN are the current query frame, the tracked object from a previous frame, and a reference object. The object location in the query frame is predicted by fusing latent features from the three inputs. In addition, the predicted object is compared with the reference object using a Siamese CNN, so that occlusion and deformation are detected and the search range for the tracked object is set adaptively. Comprehensive experiments on a large-scale benchmark database showed that the proposed method outperformed state-of-the-art tracking methods in precision and robustness while running in real time (about 25 fps).
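
The abstract does not specify how the features are fused or how the search range is adapted, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes the three branches have already produced plain feature vectors, fuses them by concatenation, and uses cosine similarity in place of the learned Siamese comparison. The function names, the similarity threshold, and the search-window scales are all hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def fuse_features(query: Vector, previous: Vector, reference: Vector) -> List[float]:
    """Fuse the three branch embeddings by concatenation.

    Assumption: concatenation stands in for whatever fusion the paper uses.
    """
    return list(query) + list(previous) + list(reference)


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity of two embeddings; 0.0 if either one is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def adapt_search_scale(
    predicted: Vector,
    reference: Vector,
    base_scale: float = 2.0,   # hypothetical default search-window scale
    max_scale: float = 4.0,    # hypothetical widest search-window scale
    threshold: float = 0.5,    # hypothetical similarity cut-off
) -> Tuple[bool, float]:
    """Flag a likely occlusion or deformation and choose a search-window scale.

    Assumption: cosine similarity replaces the learned Siamese comparison.
    Returns (unreliable, scale), where scale multiplies the object size.
    """
    similarity = cosine_similarity(predicted, reference)
    if similarity >= threshold:
        # The prediction still resembles the reference: keep the normal window.
        return False, base_scale
    # Widen the window linearly as similarity falls from the threshold to 0;
    # negative similarities are clamped to 0 and get the widest window.
    shortfall = (threshold - max(similarity, 0.0)) / threshold
    return True, base_scale + shortfall * (max_scale - base_scale)
```

With these assumed defaults, a prediction that matches the reference keeps the base window, while a prediction whose embedding is orthogonal to the reference is flagged as unreliable and searched with the widest window.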

      • Published in

        Thematic Workshops '17: Proceedings of the on Thematic Workshops of ACM Multimedia 2017
        October 2017
        558 pages
        ISBN: 9781450354165
        DOI: 10.1145/3126686

        Copyright © 2017 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States
