DOI: 10.1145/1873951.1873962
research-article

Crowdsourced automatic zoom and scroll for video retargeting

Published: 25 October 2010

ABSTRACT

Screen size and display resolution limit the experience of watching videos on mobile devices. The viewing experience can be improved by determining important or interesting regions within the video (called regions of interest, or ROIs) and displaying only the ROIs to the viewer. Previous work focuses on analyzing the video content using visual attention models to infer the ROIs. Such content-based techniques, however, have limitations. In this paper, we propose an alternative paradigm for inferring ROIs from a video. We crowdsource from a large number of users through their implicit viewing behavior in a zoom-and-pan interface, and infer the ROIs from their collective wisdom. A retargeted video, consisting of relevant shots determined from historical user behavior, can be automatically generated and replayed to subsequent users who would prefer a less interactive viewing experience. This paper presents how we collect the user traces, infer the ROIs and their dynamics, group the ROIs into shots, and automatically reframe those shots to improve the aesthetics of the video. A user study with 48 participants shows that our automatically retargeted video is of comparable quality to one handcrafted by an expert user.
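
The abstract does not say how ROIs are computed from the collected traces, so the following is only a minimal sketch of one plausible way to aggregate zoom-and-pan viewports for a single frame. It assumes each user trace contributes one viewport rectangle per frame, lets every viewport vote for the grid cells it covers, and reports the bounding box of cells seen by at least a given share of users. The function name `infer_roi` and the parameters `cell` and `min_share` are hypothetical and are not taken from the paper.

```python
from typing import List, Optional, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height) in pixels


def infer_roi(viewports: List[Rect], frame_w: int, frame_h: int,
              cell: int = 8, min_share: float = 0.5) -> Optional[Rect]:
    """Bounding box of grid cells viewed by at least min_share of users.

    Illustrative assumption only: a simple voting heat map, not the
    method described in the paper.
    """
    if not viewports:
        return None
    cols = (frame_w + cell - 1) // cell
    rows = (frame_h + cell - 1) // cell
    votes = [[0] * cols for _ in range(rows)]

    for x, y, w, h in viewports:
        # Clamp the viewport to the frame, then vote for each covered cell.
        x0, y0 = max(0, x), max(0, y)
        x1, y1 = min(frame_w, x + w), min(frame_h, y + h)
        if x1 <= x0 or y1 <= y0:
            continue  # viewport lies entirely outside the frame
        for r in range(y0 // cell, (y1 - 1) // cell + 1):
            for c in range(x0 // cell, (x1 - 1) // cell + 1):
                votes[r][c] += 1

    needed = min_share * len(viewports)
    hot = [(r, c) for r in range(rows) for c in range(cols)
           if votes[r][c] >= needed]
    if not hot:
        return None  # no region reached the required level of agreement

    r0, r1 = min(r for r, _ in hot), max(r for r, _ in hot)
    c0, c1 = min(c for _, c in hot), max(c for _, c in hot)
    x, y = c0 * cell, r0 * cell
    return (x, y,
            min(frame_w, (c1 + 1) * cell) - x,
            min(frame_h, (r1 + 1) * cell) - y)


if __name__ == "__main__":
    traces = [(0, 0, 32, 32), (16, 16, 32, 32), (16, 16, 32, 32)]
    print(infer_roi(traces, frame_w=64, frame_h=64))
```

A full pipeline would also have to track how such regions move over time and group them into shots, which this single-frame sketch does not attempt.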

        • Published in

          MM '10: Proceedings of the 18th ACM international conference on Multimedia
          October 2010
          1836 pages
          ISBN:9781605589336
          DOI:10.1145/1873951

          Copyright © 2010 ACM

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Acceptance Rates

          Overall Acceptance Rate: 995 of 4,171 submissions, 24%
