DOI: 10.1145/3210240.3210337

research-article

On-Demand Deep Model Compression for Mobile Devices: A Usage-Driven Model Selection Framework

Published: 10 June 2018

ABSTRACT

Recent research has demonstrated the potential of deploying deep neural networks (DNNs) on resource-constrained mobile platforms by trimming down network complexity with various compression techniques. Current practice, however, investigates only stand-alone compression schemes, even though each compression technique may be well suited only for certain types of DNN layers. Moreover, these techniques are optimized solely for DNN inference accuracy, without explicitly considering other application-driven system performance metrics (e.g., latency and energy cost) or the varying resource availability across platforms (e.g., storage and processing capability). In this paper, we explore the desirable tradeoff between performance and resource constraints, driven by user-specified needs, from a holistic system-level viewpoint. Specifically, we develop a usage-driven selection framework, referred to as AdaDeep, that automatically selects a combination of compression techniques for a given DNN to achieve an optimal balance between user-specified performance goals and resource constraints. In an extensive evaluation on five public datasets and across twelve mobile devices, AdaDeep achieves up to 9.8x latency reduction, 4.3x energy efficiency improvement, and 38x storage reduction in DNNs while incurring negligible accuracy loss. AdaDeep also uncovers multiple effective combinations of compression techniques unexplored in the existing literature.
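The selection problem the abstract describes (maximize accuracy subject to user-specified latency, energy, and storage budgets, searching over combinations of compression techniques) can be illustrated with a toy sketch. The technique names, multiplicative effect factors, and budget values below are purely hypothetical illustrations, not from the paper, and the exhaustive search stands in for AdaDeep's actual (learning-based) optimizer.

```python
from itertools import combinations

# Hypothetical per-technique effects (illustrative numbers, not from the paper):
# each technique multiplies the model's accuracy, latency, energy, and size.
TECHNIQUES = {
    "weight_pruning":      {"acc": 0.99, "lat": 0.60, "energy": 0.70, "size": 0.30},
    "quantization":        {"acc": 0.98, "lat": 0.80, "energy": 0.75, "size": 0.25},
    "low_rank_factorize":  {"acc": 0.97, "lat": 0.50, "energy": 0.65, "size": 0.50},
    "depthwise_separable": {"acc": 0.98, "lat": 0.40, "energy": 0.60, "size": 0.45},
}

def evaluate(combo, base):
    """Apply each selected technique's multiplicative effect to the base model."""
    m = dict(base)
    for t in combo:
        for metric, factor in TECHNIQUES[t].items():
            m[metric] *= factor
    return m

def select(base, budget):
    """Exhaustively search technique combinations; return the most accurate
    one whose latency, energy, and size fit the user-specified budget."""
    best, best_acc = None, -1.0
    for r in range(1, len(TECHNIQUES) + 1):
        for combo in combinations(TECHNIQUES, r):
            m = evaluate(combo, base)
            if (m["lat"] <= budget["lat"] and m["energy"] <= budget["energy"]
                    and m["size"] <= budget["size"] and m["acc"] > best_acc):
                best, best_acc = combo, m["acc"]
    return best

base = {"acc": 0.92, "lat": 120.0, "energy": 1.0, "size": 20.0}  # ms, J, MB
budget = {"lat": 40.0, "energy": 0.5, "size": 5.0}               # user constraints
print(select(base, budget))  # → ('weight_pruning', 'depthwise_separable')
```

Exhaustive search is only tractable for a handful of techniques; with layer-wise choices the space explodes, which is why a learned search (as in AdaDeep) is needed in practice.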


Supplemental Material

p389-liu.webm (webm, 88.7 MB)


Published in

MobiSys '18: Proceedings of the 16th Annual International Conference on Mobile Systems, Applications, and Services
June 2018, 560 pages
ISBN: 9781450357203
DOI: 10.1145/3210240

Copyright © 2018 ACM

Publisher: Association for Computing Machinery, New York, NY, United States



Acceptance Rates

Overall Acceptance Rate: 274 of 1,679 submissions, 16%
