Abstract
Many analysis tasks for human motion rely on high-level similarity between sequences of motions that are not exact matches in joint angles, timing, or ordering of actions. Even the same movements performed by the same person can vary in duration and speed. Similar motions are characterized by similar sets of actions that appear frequently. In this paper we introduce motion motifs and motion signatures, which are a succinct but descriptive representation of motion sequences. We first break the motion sequences into short-term movements called motion words, and then cluster the words in a high-dimensional feature space to find motifs. Hence, motifs are words that are both common and descriptive, and their distribution represents the motion sequence. To cluster words and find motifs, the challenge is to define an effective feature space in which the distances among motion words are semantically meaningful and variations in speed and duration are handled. To this end, we use a deep neural network to embed the motion words into a feature space using a triplet loss function. To define a signature, we choose a finite set of motion motifs, creating a bag-of-motifs representation for the sequence. Motion signatures are agnostic to movement order and to variations in speed or duration, and can distinguish fine-grained differences between motions of the same class. We illustrate examples of characterizing motion sequences by motifs, and demonstrate the use of motion signatures in a number of applications.
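The pipeline outlined in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, window length, stride, and margin are all hypothetical, the motion words are used directly in place of the learned deep embedding, and the motif centers are assumed to come from some clustering step that is not shown. The triplet loss is the standard hinge form on squared Euclidean distances, which may differ from the exact loss used in the paper.

```python
import numpy as np


def split_into_words(motion, word_len, stride):
    """Cut a (frames, features) array into overlapping fixed-length windows.

    Each window is flattened into one vector, a stand-in for a "motion word".
    """
    starts = range(0, motion.shape[0] - word_len + 1, stride)
    return np.stack([motion[s:s + word_len].ravel() for s in starts])


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge triplet loss: pull anchor toward positive, push it from negative."""
    d_pos = np.sum((anchor - positive) ** 2, axis=-1)
    d_neg = np.sum((anchor - negative) ** 2, axis=-1)
    return np.maximum(d_pos - d_neg + margin, 0.0)


def assign_to_motifs(embedded_words, motif_centers):
    """Return the index of the nearest motif center for every embedded word."""
    diffs = embedded_words[:, None, :] - motif_centers[None, :, :]
    return np.argmin(np.linalg.norm(diffs, axis=-1), axis=1)


def motion_signature(embedded_words, motif_centers):
    """Bag-of-motifs: normalized histogram of motif occurrences."""
    labels = assign_to_motifs(embedded_words, motif_centers)
    counts = np.bincount(labels, minlength=len(motif_centers)).astype(float)
    return counts / counts.sum()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    motion = rng.normal(size=(50, 6))      # synthetic data: 50 frames, 6 features
    words = split_into_words(motion, word_len=10, stride=5)
    centers = words[:3]                    # assumption: centers would come from clustering
    print(motion_signature(words, centers))
```

Because the signature is a histogram over motif assignments, reordering the words of a sequence leaves it unchanged, which mirrors the order-agnostic property claimed for motion signatures.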
Index Terms
- Deep motifs and motion signatures