Efficient Neural Networks for Real-time Motion Style Transfer

Abstract
Style is an intrinsic, inescapable part of human motion. It complements the content of motion to convey meaning, mood, and personality. Existing state-of-the-art motion style methods require large quantities of example data and intensive computational resources at runtime. To ensure output quality, such style transfer applications are often run on desktop machines with GPUs and significant memory. In this paper, we present a fast and expressive neural-network-based motion style transfer method that generates stylized motion of quality comparable to state-of-the-art methods, but with far lower computational cost and a much smaller memory footprint. Our method also allows the output to be adjusted in a latent style space, something not offered by previous approaches. Our style transfer model is implemented with three multi-layered networks: a pose network, a timing network, and a foot-contact network. A one-hot style vector serves as an input control knob that determines the stylistic output of these networks. The networks are trained on a large motion capture database containing heterogeneous actions and various styles; joint-information vectors, together with one-hot style vectors, are extracted from the motion data and fed to the networks. Once the networks have been trained, the database is no longer needed on the device, removing the large memory requirement of previous motion style methods. At runtime, our model takes novel input and allows real-valued numbers in the style vector, enabling interpolation, extrapolation, and mixing of styles. With much lower memory and computational requirements, our networks are efficient and fast enough for real-time use on mobile devices. Because no information about future states is required, style transfer can be performed online. We validate our results both quantitatively and perceptually, confirming the method's effectiveness and its improvement over previous approaches.
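The style vector described above acts as a control knob: a strict one-hot encoding during training, relaxed to real-valued weights at runtime for interpolation, extrapolation, and mixing. The following is a minimal sketch of that idea; the style labels, their ordering, and the additive mixing scheme are illustrative assumptions, not the paper's actual implementation.

```python
# Hypothetical style labels; the paper's actual style set may differ.
STYLES = ["neutral", "proud", "angry", "depressed", "old"]

def one_hot(style):
    """Training-time style vector: exactly one style active."""
    return [1.0 if s == style else 0.0 for s in STYLES]

def mix(weights):
    """Runtime style vector with real-valued entries.

    `weights` maps style name -> coefficient. Values need not sum to 1:
    a weight above 1 extrapolates a style, and spreading weight across
    several entries blends (mixes) styles.
    """
    v = [0.0] * len(STYLES)
    for style, w in weights.items():
        v[STYLES.index(style)] += w
    return v

# Training input for an "angry" clip.
print(one_hot("angry"))            # [0.0, 0.0, 1.0, 0.0, 0.0]

# Runtime blend: 70% proud mixed with 30% old.
blended = mix({"proud": 0.7, "old": 0.3})
print(blended)                     # [0.0, 0.7, 0.0, 0.0, 0.3]
```

In this sketch, the resulting vector would be fed to the pose, timing, and foot-contact networks in place of the pure one-hot vector, which is what allows styles to be interpolated or exaggerated without retraining.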