DOI: 10.1145/2964284.2964295
research-article

Joint Graph Learning and Video Segmentation via Multiple Cues and Topology Calibration

Published: 01 October 2016

ABSTRACT

Video segmentation has become an important and active research area with a large diversity of proposed approaches. Graph-based methods, which achieve top performance on recent benchmarks, usually focus on either obtaining a precise similarity graph or designing efficient graph cutting strategies. However, these two components are often handled in two separate steps, so the obtained similarity graph may not be optimal for segmentation, which may lead to suboptimal results. In this paper, we propose a novel framework, joint graph learning and video segmentation (JGLVS), which learns the similarity graph and the video segmentation simultaneously. JGLVS learns the similarity graph by assigning adaptive neighbors to each vertex based on multiple cues (appearance, motion, boundary and spatial information). Meanwhile, a new rank constraint is imposed on the Laplacian matrix of the similarity graph, such that the number of connected components in the resulting similarity graph is exactly equal to the number of segments. Furthermore, JGLVS can automatically weigh multiple cues and calibrate the pairwise distance of superpixels based on their topology structures. Most notably, empirical results on the challenging VSB100 benchmark dataset show that JGLVS achieves promising performance, outperforming the state of the art by up to 11% on the BPR metric.
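The rank constraint mentioned above rests on a standard property of graph Laplacians: the number of zero eigenvalues of L = D - S equals the number of connected components of the graph, so a graph over n vertices with exactly c components has rank(L) = n - c. The following is a minimal sketch of that property only, not the JGLVS optimization itself. The toy similarity matrix, the function name `count_components_via_laplacian`, and the tolerance are illustrative assumptions and do not come from the paper.

```python
import numpy as np

def count_components_via_laplacian(S, tol=1e-8):
    """Count connected components as the number of near-zero Laplacian eigenvalues.

    S is a nonnegative similarity (adjacency) matrix; tol is an assumed
    numerical threshold for treating an eigenvalue as zero.
    """
    S = (S + S.T) / 2.0                 # symmetrize the similarity matrix
    L = np.diag(S.sum(axis=1)) - S      # unnormalized Laplacian L = D - S
    eigvals = np.linalg.eigvalsh(L)     # real eigenvalues, ascending order
    return int(np.sum(eigvals < tol))

# Toy graph over 5 hypothetical superpixels with two groups: {0, 1, 2} and {3, 4}.
S = np.zeros((5, 5))
S[0, 1] = S[1, 2] = S[0, 2] = 1.0
S[3, 4] = 1.0
S = S + S.T

n = S.shape[0]
c = count_components_via_laplacian(S)
rank_L = np.linalg.matrix_rank(np.diag(S.sum(axis=1)) - S)

print("components:", c)
print("rank(L):", rank_L, "n - c:", n - c)
```

In this toy case the graph already has the desired block structure. A joint method such as the one described would instead learn S so that this rank condition holds for the target number of segments.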


      • Published in
        MM '16: Proceedings of the 24th ACM international conference on Multimedia
        October 2016
        1542 pages
        ISBN: 9781450336031
        DOI: 10.1145/2964284

        Copyright © 2016 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States


        Acceptance Rates

        MM '16 Paper Acceptance Rate: 52 of 237 submissions, 22%
        Overall Acceptance Rate: 995 of 4,171 submissions, 24%
