Abstract
This paper presents a new method to both track and segment multiple objects in videos using min-cut/max-flow optimizations. We introduce objective functions that combine low-level pixel-wise measures (color, motion), high-level observations obtained via an independent detection module, motion prediction, and contrast-sensitive contextual regularization. One novelty is that external observations are used without adding any association step. The observations are image regions (pixel sets) that can be provided by any kind of detector. Minimizing appropriate cost functions simultaneously performs "detection-before-track" tracking (track-to-observation assignment and automatic initialization of new tracks) and segmentation of the tracked objects. When the detection module merges several tracked objects (e.g., a single foreground detection mask is obtained for several objects close to each other), a second minimization stage tracks and segments the individual objects correctly despite the confusion of the external detection module.
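To make the min-cut/max-flow idea concrete, the following minimal sketch labels a toy 1D signal as object or background by cutting a source/sink graph. It is an illustration under stated assumptions, not the paper's objective function: the squared-difference data term, the exponential contrast weight, and the names `min_cut_labels`, `fg_mean`, `bg_mean`, `lam`, and `beta` are all hypothetical, and the observation and motion-prediction terms mentioned in the abstract are left out.

```python
import math
from collections import deque


def min_cut_labels(intensities, fg_mean, bg_mean, lam=0.5, beta=10.0):
    """Label each sample 1 (object) or 0 (background) with a source/sink min cut.

    Assumed toy energy: squared distance to a class mean as the data term, plus
    a contrast-sensitive penalty lam * exp(-beta * diff^2) between neighbors.
    """
    n = len(intensities)
    s, t = n, n + 1  # terminal nodes: source = object, sink = background
    cap = [[0.0] * (n + 2) for _ in range(n + 2)]  # residual capacities

    for p, value in enumerate(intensities):
        # Cutting s->p puts p on the sink side, so it carries the background cost.
        cap[s][p] = (value - bg_mean) ** 2
        # Cutting p->t keeps p on the source side, so it carries the object cost.
        cap[p][t] = (value - fg_mean) ** 2
    for p in range(n - 1):
        # Neighbor links: cheap to cut across a strong intensity edge.
        diff = intensities[p] - intensities[p + 1]
        weight = lam * math.exp(-beta * diff * diff)
        cap[p][p + 1] = weight
        cap[p + 1][p] = weight

    def bfs_parents():
        """Breadth-first search from the source over edges with residual capacity."""
        parent = [-1] * (n + 2)
        parent[s] = s
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in range(n + 2):
                if parent[v] == -1 and cap[u][v] > 1e-12:
                    parent[v] = u
                    queue.append(v)
        return parent

    # Edmonds-Karp: push flow along shortest augmenting paths until none is left.
    while True:
        parent = bfs_parents()
        if parent[t] == -1:
            break
        bottleneck = math.inf
        v = t
        while v != s:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = t
        while v != s:
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
            v = u

    # Nodes still reachable from the source form the object side of the cut.
    reachable = bfs_parents()
    return [1 if reachable[p] != -1 else 0 for p in range(n)]


if __name__ == "__main__":
    signal = [0.1, 0.0, 0.2, 0.9, 1.0, 0.8, 0.1]
    print(min_cut_labels(signal, fg_mean=1.0, bg_mean=0.0))
```

On a real image the same construction would use one node per pixel and links between adjacent pixels. The dense capacity matrix and the simple augmenting-path solver are chosen here for readability only and would be too slow at that scale.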
Index Terms
- Track and cut: simultaneous tracking and segmentation of multiple objects with graph cuts