research-article

Designing Future Warehouse-Scale Computers for Sirius, an End-to-End Voice and Vision Personal Assistant

Published: 06 April 2016

Abstract

As user demand scales for intelligent personal assistants (IPAs) such as Apple’s Siri, Google’s Google Now, and Microsoft’s Cortana, we are approaching the computational limits of current datacenter (DC) architectures. It is an open question how future server architectures should evolve to enable this emerging class of applications, and the lack of an open-source IPA workload is an obstacle in addressing this question. In this article, we present the design of Sirius, an open end-to-end IPA Web-service application that accepts queries in the form of voice and images, and responds with natural language. We then use this workload to investigate the implications of four points in the design space of future accelerator-based server architectures spanning traditional CPUs, GPUs, manycore throughput co-processors, and FPGAs. To investigate future server designs for Sirius, we decompose Sirius into a suite of eight benchmarks (Sirius Suite) comprising the computationally intensive bottlenecks of Sirius. We port Sirius Suite to a spectrum of accelerator platforms and use the performance and power trade-offs across these platforms to perform a total cost of ownership (TCO) analysis of various server design points. In our study, we find that accelerators are critical for the future scalability of IPA services. Our results show that GPU- and FPGA-accelerated servers improve the query latency on average by 8.5× and 15×, respectively. For a given throughput, GPU- and FPGA-accelerated servers can reduce the TCO of DCs by 2.3× and 1.3×, respectively.


    • Published in

      ACM Transactions on Computer Systems  Volume 34, Issue 1
      April 2016
      91 pages
      ISSN:0734-2071
      EISSN:1557-7333
      DOI:10.1145/2912578

      Copyright © 2016 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 6 April 2016
      • Accepted: 1 December 2015
      • Received: 1 October 2015


      Qualifiers

      • research-article
      • Research
      • Refereed
