research-article

Real-time Arm Skeleton Tracking and Gesture Inference Tolerant to Missing Wearable Sensors

Authors:
Yang Liu, City University of Hong Kong, Hong Kong, Hong Kong
Zhenjiang Li, City University of Hong Kong, Hong Kong, Hong Kong
Zhidan Liu, Shenzhen University, Shenzhen, China
Kaishun Wu, Shenzhen University, Shenzhen, China

MobiSys '19: Proceedings of the 17th Annual International Conference on Mobile Systems, Applications, and Services, June 2019, Pages 287–299
https://doi.org/10.1145/3307334.3326109

Published: 12 June 2019

ABSTRACT

This paper presents ArmTroi, a wearable system that analyzes detailed arm motions primarily by using the motion sensors in wrist-worn wearable devices. ArmTroi achieves real-time 3D arm skeleton tracking and reliable gesture inference that tolerates missing wearable sensors, which enables a range of application designs. ArmTroi addresses two major challenges. First, the skeleton of each arm is determined by the locations of both the elbow and the wrist, whereas a wearable device senses only a single point at the wrist. The potential solution space is therefore huge, and this underconstrained nature fundamentally challenges accurate, real-time arm skeleton tracking. Second, wearable sensors may not reliably provide sensory data, for example when the user is not wearing a device. Yet the learning tools used for gesture inference, such as deep learning, typically have static network structures, so nontrivial network adaptation is required to match the varying availability of the input and keep gesture inference reliable. We propose effective techniques to address these challenges, and all computations can be conducted on the user's smartphone, which makes ArmTroi a lightweight and portable system. We develop a prototype, and extensive evaluation shows the efficacy of the ArmTroi design.
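The underconstrained nature of the first challenge can be illustrated with a small geometric sketch. It is not ArmTroi's method, which the abstract does not detail. It assumes a simplified rigid two-segment arm with the shoulder fixed at the origin and known segment lengths; the function name `elbow_candidates` and the lengths in the usage note are hypothetical. Under these assumptions, a single wrist position leaves the elbow free to lie anywhere on a circle, the intersection of a sphere around the shoulder and a sphere around the wrist.

```python
import math


def _cross(a, b):
    """Cross product of two 3D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def elbow_candidates(wrist, upper_arm, forearm, n=8):
    """Sample n elbow positions consistent with one wrist position.

    Assumptions (illustrative only): shoulder at the origin, rigid
    segments of length upper_arm (shoulder to elbow) and forearm
    (elbow to wrist), and no anatomical joint limits.
    """
    d = math.sqrt(sum(c * c for c in wrist))
    if d == 0 or not abs(upper_arm - forearm) <= d <= upper_arm + forearm:
        raise ValueError("wrist position unreachable for these segment lengths")

    # Unit vector from shoulder to wrist; the circle's center lies on it.
    axis = tuple(c / d for c in wrist)
    # Distance from the shoulder to the circle's center along the axis.
    h = (upper_arm ** 2 - forearm ** 2 + d ** 2) / (2 * d)
    # Circle radius; clamp tiny negative values caused by rounding.
    r = math.sqrt(max(upper_arm ** 2 - h ** 2, 0.0))

    # Build an orthonormal basis (u, v) of the plane perpendicular to axis.
    helper = (1.0, 0.0, 0.0) if abs(axis[0]) < 0.9 else (0.0, 1.0, 0.0)
    u = _cross(axis, helper)
    u_norm = math.sqrt(sum(c * c for c in u))
    u = tuple(c / u_norm for c in u)
    v = _cross(axis, u)

    points = []
    for k in range(n):
        t = 2 * math.pi * k / n
        points.append(tuple(
            h * axis[i] + r * (math.cos(t) * u[i] + math.sin(t) * v[i])
            for i in range(3)
        ))
    return points
```

For instance, `elbow_candidates((0.4, 0.0, 0.0), 0.30, 0.25)` returns eight distinct elbow positions that all respect both segment lengths (in meters, chosen arbitrarily here). A tracker that observes only the wrist needs additional information, such as joint-range limits or motion continuity, to choose among them.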

Index Terms

1. Human-centered computing
  1. Ubiquitous and mobile computing

Published in
MobiSys '19: Proceedings of the 17th Annual International Conference on Mobile Systems, Applications, and Services
June 2019
736 pages
ISBN: 9781450366618
DOI: 10.1145/3307334
General Chairs: Junehwa Song (KAIST, South Korea), Minkyong Kim (Samsung Electronics)
Program Chairs: Nicholas D. Lane (University of Oxford & Samsung AI), Rajesh K. Balan (Singapore Management University)
Copyright © 2019 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
arm tracking
deep learning
gesture inference
mobile sensing
Acceptance Rates
Overall Acceptance Rate: 274 of 1,679 submissions, 16%