research-article
Open Access

ApproxHPVM: a portable compiler IR for accuracy-aware optimizations

Published: 10 October 2019

Abstract

We propose ApproxHPVM, a compiler IR and system designed to enable accuracy-aware performance and energy tuning on heterogeneous systems with multiple compute units and approximation methods. ApproxHPVM automatically translates end-to-end application-level quality metrics into accuracy requirements for individual operations. It performs this translation in a hardware-agnostic accuracy-tuning phase, which provides greater portability across heterogeneous hardware platforms and enables future capabilities such as accuracy-aware dynamic scheduling and design space exploration.

ApproxHPVM incorporates three main components: (a) a compiler IR with hardware-agnostic approximation metrics, (b) a hardware-agnostic accuracy-tuning phase to identify error-tolerant computations, and (c) an accuracy-aware hardware scheduler that maps error-tolerant computations to approximate hardware components. As ApproxHPVM does not incorporate any hardware-specific knowledge as part of the IR, it can serve as a portable virtual ISA that can be shipped to all kinds of hardware platforms.
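As a rough illustration of how components (b) and (c) could fit together, the sketch below maps each operation to the cheapest compute unit whose error fits a per-operation tolerance. It is a toy model under stated assumptions, not ApproxHPVM's actual IR or scheduler: the `Op` and `Target` classes, the `schedule` function, the scalar error values, and the unit names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    name: str
    tolerance: float  # assumed: hardware-agnostic error bound produced by a tuning phase


@dataclass(frozen=True)
class Target:
    name: str
    error: float  # assumed: error this compute unit introduces per operation
    cost: float   # assumed: relative time or energy cost (lower is better)


def schedule(ops, targets):
    """Map each op to the cheapest target whose error fits the op's tolerance."""
    mapping = {}
    for op in ops:
        feasible = [t for t in targets if t.error <= op.tolerance]
        if not feasible:
            raise ValueError(f"no target satisfies the tolerance of {op.name}")
        mapping[op.name] = min(feasible, key=lambda t: t.cost).name
    return mapping


# Hypothetical operations and compute units, for illustration only.
ops = [Op("conv1", 0.0), Op("conv2", 0.05), Op("fc", 0.2)]
targets = [
    Target("exact_unit", 0.0, 1.0),
    Target("reduced_precision_unit", 0.01, 0.6),
    Target("approx_accelerator", 0.1, 0.2),
]
mapping = schedule(ops, targets)
```

Because the tolerances attached to each `Op` do not mention any particular device, the same annotated program could in principle be rescheduled against a different `targets` list without rerunning the tuning step, which is the portability argument the paragraph above makes.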

We evaluate our framework on nine benchmarks from the deep learning domain and five image processing benchmarks. Our results show that our framework can offload chunks of approximable computations to special-purpose accelerators that provide significant gains in performance and energy, while staying within user-specified application-level quality metrics with high probability. Across the 14 benchmarks, we observe 1-9x performance speedups and 1.1-11.3x energy reductions for very small reductions in accuracy.



Published in: Proceedings of the ACM on Programming Languages, Volume 3, Issue OOPSLA, October 2019, 2077 pages
EISSN: 2475-1421
DOI: 10.1145/3366395

Copyright © 2019 Owner/Author

This work is licensed under a Creative Commons Attribution-ShareAlike International 4.0 License.

Publisher: Association for Computing Machinery, New York, NY, United States