Open Access

Deep Learning at Scale and at Ease

Published: 02 November 2016

Abstract

Deep learning techniques have recently enjoyed success in various multimedia applications, such as image classification and multimodal data analysis. Large deep learning models are developed to learn rich representations of complex data. Two challenges must be overcome before deep learning can be widely adopted in multimedia and other applications. One is usability: nonexperts must be able to implement different models and training algorithms without much effort, especially when the model is large and complex. The other is scalability: the deep learning system must be able to provision the huge amount of computing resources required to train large models on massive datasets. To address these two challenges, in this article we design a distributed deep learning platform called SINGA, which has an intuitive programming model based on the layer abstraction common to deep learning models. Good scalability is achieved through a flexible distributed training architecture and specific optimization techniques. SINGA runs on both GPUs and CPUs, and we show that it outperforms many other state-of-the-art deep learning systems. Our experience with developing and training deep learning models for real-life multimedia applications in SINGA shows that the platform is both usable and scalable.
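The "layer abstraction" mentioned in the abstract treats a model as an ordered stack of layers, each exposing a forward (feature) computation and a backward (gradient) computation. The following is a minimal generic sketch of that idea in plain NumPy; the class and method names are illustrative assumptions, not SINGA's actual API.

```python
import numpy as np

class Layer:
    """Common layer interface: a forward pass and a backward pass."""
    def forward(self, x):
        raise NotImplementedError
    def backward(self, grad):
        raise NotImplementedError

class Dense(Layer):
    """Fully connected layer with learnable weights and bias."""
    def __init__(self, in_dim, out_dim, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(0.0, 0.1, (in_dim, out_dim))
        self.b = np.zeros(out_dim)
    def forward(self, x):
        self.x = x                    # cache the input for the backward pass
        return x @ self.W + self.b
    def backward(self, grad):
        self.dW = self.x.T @ grad     # gradient w.r.t. the weights
        self.db = grad.sum(axis=0)    # gradient w.r.t. the bias
        return grad @ self.W.T        # gradient w.r.t. the layer input

class ReLU(Layer):
    def forward(self, x):
        self.mask = x > 0
        return x * self.mask
    def backward(self, grad):
        return grad * self.mask

class Net:
    """A model is just an ordered list of layers: training runs forward
    through all layers, then backward through them in reverse order."""
    def __init__(self, layers):
        self.layers = layers
    def forward(self, x):
        for layer in self.layers:
            x = layer.forward(x)
        return x
    def backward(self, grad):
        for layer in reversed(self.layers):
            grad = layer.backward(grad)
        return grad

net = Net([Dense(4, 8), ReLU(), Dense(8, 2)])
out = net.forward(np.ones((3, 4)))
print(out.shape)  # (3, 2)
```

Because every layer conforms to the same interface, new model architectures can be composed without touching the training loop, which is the usability property the abstract emphasizes.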


Published in: ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 12, Issue 4s: Special Section on Trust Management for Multimedia Big Data and Special Section on Best Papers of ACM Multimedia 2015. November 2016, 242 pages. ISSN: 1551-6857, EISSN: 1551-6865, DOI: 10.1145/2997658.

          Copyright © 2016 Owner/Author

          Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 2 November 2016
          • Accepted: 1 August 2016
          • Revised: 1 June 2016
          • Received: 1 February 2016
