ABSTRACT
The proliferation of mobile devices has enabled a wide range of applications built on mobile sensing data, e.g., exercise and activity recognition and quantification. Typically, these applications rely on predefined features and apply only to predefined activities. In this work, we address deep understanding of arbitrary activities and semantic search for any activity over massive mobile sensing data. The challenges stem from the rich dynamics and the wide spectrum of activities that a human being can perform. We propose a hierarchical activity representation, extract common bases of motion data in an unsupervised manner using deep neural networks, and introduce a universal multi-resolution representation for all activities that requires no prior knowledge. Based on this representation, we design Lasagna, a system that manages and searches motion data semantically. We implement a prototype, and our comprehensive evaluation shows that it achieves highly accurate activity classification (98.9% precision) and search (almost 100% recall and about 90% precision) over a diverse set of activities.
Index Terms
- Lasagna: towards deep hierarchical understanding and searching over mobile sensing data