research-article
Open Access

Efficient Neural Networks for Real-time Motion Style Transfer

Published: 26 July 2019

Abstract

Style is an intrinsic, inescapable part of human motion. It complements the content of motion to convey meaning, mood, and personality. Existing state-of-the-art motion style transfer methods require large quantities of example data and intensive computational resources at runtime. To ensure output quality, such style transfer applications are often run on desktop machines with GPUs and significant memory. In this paper, we present a fast and expressive neural-network-based motion style transfer method that generates stylized motion of quality comparable to the state of the art, but with much lower computational cost and a much smaller memory footprint. Our method also allows the output to be adjusted in a latent style space, something not offered by previous approaches. Our style transfer model is implemented with three multi-layered networks: a pose network, a timing network, and a foot-contact network. A one-hot style vector serves as an input control knob and determines the stylistic output of these networks. During training, the networks are trained on a large motion capture database containing heterogeneous actions and various styles: joint information vectors, together with one-hot style vectors, are extracted from the motion data and fed to the networks. Once the networks have been trained, the database is no longer needed on the device, removing the large memory requirement of previous motion style methods. At runtime, our model takes novel input and allows real-valued numbers in the style vector, which can be used for interpolation, extrapolation, or mixing of styles. With much lower memory and computational requirements, our networks are efficient and fast enough for real-time use on mobile devices. Because the method requires no information about future states, style transfer can be performed online. We validate our results both quantitatively and perceptually, confirming the method's effectiveness and its improvement over previous approaches.
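The conditioning scheme described in the abstract — joint features concatenated with a one-hot style vector at training time, with the one-hot constraint relaxed to real values at runtime for interpolating or mixing styles — can be sketched roughly as follows. This is a minimal illustrative sketch, not the paper's actual architecture: all names, layer sizes, style labels, and the number of styles are assumptions, and the real system comprises three trained networks (pose, timing, foot-contact) with learned weights rather than the single random-weight layer pair shown here.

```python
import numpy as np

NUM_STYLES = 8   # assumed number of training styles
POSE_DIM = 63    # assumed joint-feature dimension (e.g., 21 joints x 3)
HIDDEN = 256     # assumed hidden-layer width

# Stand-in random weights; in the paper these are learned from the
# motion capture database, which is then no longer needed at runtime.
rng = np.random.default_rng(0)
W1 = rng.standard_normal((POSE_DIM + NUM_STYLES, HIDDEN)) * 0.01
b1 = np.zeros(HIDDEN)
W2 = rng.standard_normal((HIDDEN, POSE_DIM)) * 0.01
b2 = np.zeros(POSE_DIM)

def pose_network(pose, style):
    """Map joint features plus a style vector to stylized joint features."""
    x = np.concatenate([pose, style])      # condition on the style vector
    h = np.maximum(0.0, x @ W1 + b1)       # ReLU hidden layer
    return h @ W2 + b2

def mix_styles(weights):
    """Build a real-valued style vector as a weighted mix of basis styles."""
    s = np.zeros(NUM_STYLES)
    for idx, w in weights:
        s[idx] = w
    return s

pose = rng.standard_normal(POSE_DIM)
# One-hot input reproduces a single trained style; real-valued entries
# blend styles, e.g. 70% of style 0 and 30% of style 3 (labels arbitrary):
blended = pose_network(pose, mix_styles([(0, 0.7), (3, 0.3)]))
```

Because the style vector is just an input to the networks, interpolation and extrapolation cost nothing extra at runtime: only the feed-forward pass is executed per frame, which is what makes the method feasible on mobile devices.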



• Published in

  Proceedings of the ACM on Computer Graphics and Interactive Techniques, Volume 2, Issue 2
  July 2019, 239 pages
  EISSN: 2577-6193
  DOI: 10.1145/3352480

          Copyright © 2019 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

          Publisher

          Association for Computing Machinery

          New York, NY, United States



          Qualifiers

          • research-article
          • Research
          • Refereed
