ABSTRACT
Amateur instructional videos often show a single uninterrupted take of a recorded demonstration without any edits. While easy to produce, such videos are often too long as they include unnecessary or repetitive actions as well as mistakes. We introduce DemoCut, a semi-automatic video editing system that improves the quality of amateur instructional videos for physical tasks. DemoCut asks users to mark key moments in a recorded demonstration using a set of marker types derived from our formative study. Based on these markers, the system uses audio and video analysis to automatically organize the video into meaningful segments and apply appropriate video editing effects. To understand the effectiveness of DemoCut, we report a technical evaluation of seven video tutorials created with DemoCut. In a separate user evaluation, all eight participants successfully created a complete tutorial with a variety of video editing effects using our system.
Index Terms
- DemoCut: generating concise instructional videos for physical demonstrations