The Art of Differentiating Computer Programs: An Introduction to Algorithmic Differentiation
January 2012
Publisher:
  • Society for Industrial and Applied Mathematics
  • 3600 University City Science Center Philadelphia, PA
  • United States
ISBN: 978-1-61197-206-1
Published: 12 January 2012
Pages: 358
Abstract

This is the first entry-level book on algorithmic (also known as automatic) differentiation (AD), providing fundamental rules for the generation of first- and higher-order tangent-linear and adjoint code. The author covers the mathematical underpinnings as well as how to apply these observations to real-world numerical simulation programs. Readers will find many examples and exercises, including hints to solutions. Also included are the prototype AD tools dco and dcc for use with the examples and exercises. The derivative code compiler dcc provides first- and higher-order tangent-linear and adjoint modes for a limited subset of C/C++. In addition, readers will have access to a supplementary website containing sources of all software discussed in the book, additional exercises and comments on their solutions (growing over the coming years), links to other sites on AD, and errata. Audience: This book is intended for undergraduate and graduate students in computational science, engineering, and finance as well as applied mathematics and computer science. It will provide researchers and developers at all levels with an intuitive introduction to AD.

Cited By

  1. Deussen J and Naumann U (2023). Subdomain separability in global optimization, Journal of Global Optimization, 86:3, (573-588), Online publication date: 1-Jul-2023.
  2. Cherubin S and Agosta G (2020). Tools for Reduced Precision Computation, ACM Computing Surveys, 53:2, (1-35), Online publication date: 31-Mar-2021.
  3. Kofman E, Fernández J and Marzorati D (2021). Compact sparse symbolic Jacobian computation in large systems of ODEs, Applied Mathematics and Computation, 403:C, Online publication date: 15-Aug-2021.
  4. Mitusch S, Funke S and Kuchta M (2021). Hybrid FEM-NN models, Journal of Computational Physics, 446:C, Online publication date: 1-Dec-2021.
  5. Georgiou S, Rizou S and Spinellis D (2019). Software Development Lifecycle for Energy Efficiency, ACM Computing Surveys, 52:4, (1-33), Online publication date: 31-Jul-2020.
  6. Jonasson K, Sigurdsson S, Yngvason H, Ragnarsson P and Melsted P (2020). Algorithm 1005, ACM Transactions on Mathematical Software, 46:1, (1-20), Online publication date: 31-Mar-2020.
  7. Gray J, Hwang J, Martins J, Moore K and Naylor B (2019). OpenMDAO, Structural and Multidisciplinary Optimization, 59:4, (1075-1104), Online publication date: 1-Apr-2019.
  8. Hückelheim J, Hovland P, Strout M and Müller J (2019). Reverse-mode algorithmic differentiation of an OpenMP-parallel compressible flow solver, International Journal of High Performance Computing Applications, 33:1, (140-154), Online publication date: 1-Jan-2019.
  9. Naumann U (2019). Adjoint Code Design Patterns, ACM Transactions on Mathematical Software, 45:3, (1-32), Online publication date: 30-Sep-2019.
  10. Schanen M, Maldonado D and Anitescu M. A Framework for Distributed Approximation of Moments with Higher-Order Derivatives Through Automatic Differentiation, Computational Science – ICCS 2019, (251-260)
  11. Menon H, Lam M, Osei-Kuffuor D, Schordan M, Lloyd S, Mohror K and Hittinger J. ADAPT, Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis, (1-13)
  13. Merriënboer B, Moldovan D and Wiltschko A. Tangent, Proceedings of the 32nd International Conference on Neural Information Processing Systems, (6259-6268)
  14. Günther S, Gauger N and Schroder J (2018). A non-intrusive parallel-in-time adjoint solver with the XBraid library, Computing and Visualization in Science, 19:3-4, (85-95), Online publication date: 1-Jul-2018.
  15. Meyer X, Chopard B and Salamin N. Scheduling Finite Difference Approximations for DAG-Modeled Large Scale Applications, Proceedings of the Platform for Advanced Scientific Computing Conference, (1-12)
  16. Khan K, Watson H and Barton P (2017). Differentiable McCormick relaxations, Journal of Global Optimization, 67:4, (687-729), Online publication date: 1-Apr-2017.
  17. Vassiliadis V, Riehme J, Deussen J, Parasyris K, Antonopoulos C, Bellas N, Lalis S and Naumann U. Towards automatic significance analysis for approximate computing, Proceedings of the 2016 International Symposium on Code Generation and Optimization, (182-193)
  18. Sluşanschi E and Dumitrel V (2016). ADiJaC -- Automatic Differentiation of Java Classfiles, ACM Transactions on Mathematical Software, 43:2, (1-33), Online publication date: 2-Sep-2016.
  19. Lotz J, Schwalbach M and Naumann U (2016). A Case Study in Adjoint Sensitivity Analysis of Parameter Calibration, Procedia Computer Science, 80:C, (201-211), Online publication date: 1-Jun-2016.
  20. Safiran N, Lotz J and Naumann U (2016). Algorithmic Differentiation of Numerical Methods, Procedia Computer Science, 80:C, (2231-2235), Online publication date: 1-Jun-2016.
  21. Naumann U, Lotz J, Leppkes K and Towara M (2015). Algorithmic Differentiation of Numerical Methods, ACM Transactions on Mathematical Software, 41:4, (1-21), Online publication date: 26-Oct-2015.
  22. Safiran N, Lotz J and Naumann U (2015). Second-order Tangent Solvers for Systems of Parameterized Nonlinear Equations, Procedia Computer Science, 51:C, (231-238), Online publication date: 1-Sep-2015.
  23. Lotz J, Naumann U, Hannemann-Tamas R, Ploch T and Mitsos A (2015). Higher-order Discrete Adjoint ODE Solver in C++ for Dynamic Optimization, Procedia Computer Science, 51:C, (256-265), Online publication date: 1-Sep-2015.
  24. Elsheikh A (2015). An equation-based algorithmic differentiation technique for differential algebraic equations, Journal of Computational and Applied Mathematics, 281:C, (135-151), Online publication date: 1-Jun-2015.
  25. Towara M, Schanen M and Naumann U (2015). MPI-Parallel Discrete Adjoint OpenFOAM, Procedia Computer Science, 51:C, (19-28), Online publication date: 1-Sep-2015.
  26. Eriksson-Bique S, Polishchuk V and Sysikaski M. Optimal Geometric Flows via Dual Programs, Proceedings of the thirtieth annual symposium on Computational geometry, (100-109)
  27. Hascoet L and Pascual V (2013). The Tapenade automatic differentiation tool, ACM Transactions on Mathematical Software, 39:3, (1-43), Online publication date: 1-Apr-2013.
  28. Gebremedhin A, Nguyen D, Patwary M and Pothen A (2013). ColPack, ACM Transactions on Mathematical Software, 40:1, (1-31), Online publication date: 1-Sep-2013.
  29. Lotz J, Naumann U, Sagebaum M and Schanen M. Discrete adjoints of PETSc through dco/c++ and adjoint MPI, Proceedings of the 19th international conference on Parallel Processing, (497-507)
  30. Förster M and Naumann U. Solving a least-squares problem with algorithmic differentiation and OpenMP, Proceedings of the 19th international conference on Parallel Processing, (763-774)
  31. Minh B, Förster M and Naumann U. Towards tangent-linear GPU programs using OpenACC, Proceedings of the 4th Symposium on Information and Communication Technology, (27-34)
  32. Schanen M and Naumann U. A wish list for efficient adjoints of one-sided MPI communication, Proceedings of the 19th European conference on Recent Advances in the Message Passing Interface, (248-257)
Contributors
  • RWTH Aachen University


      Reviews

      Bernard Kuc

      A large part of my job for the last eight years has been dealing with the first- and second-order derivatives of financial instruments. Hence, I am intimately aware of the numerical inaccuracies and computational cost of finite difference techniques. Thus, it was with great anticipation that I started reading this book about a world of faster and more accurate derivatives.
      Chapter 1 broadly describes the motivation for algorithmic differentiation. It starts off with a few examples where derivatives are required, such as steepest descent search in nonlinear programming, the Newton algorithm for solving systems of nonlinear equations, and the handling of constraints. Having defined some problems, the author describes manual differentiation before moving on to approximation techniques. It is in the context of finite differences that the inaccuracies introduced by finite-precision floating-point numbers are analyzed.
      Chapter 2 introduces the tangent-linear and adjoint models on which algorithmic differentiation is based. The tangent-linear model computes the derivative of the output with respect to the input line by line through the source code, propagating the derivative along with the original computation. As such, this is also called the forward mode. The cost of computing the full derivative is proportional to the number of inputs. The adjoint model instead propagates the sensitivity of the output backwards to the inputs, and thus its cost scales in proportion to the number of outputs. In the many cases where there are significantly fewer outputs than inputs, the adjoint model provides sizable time savings. To achieve this, the flow of the code has to be reversed and the derivative computed going backwards through the code (reverse mode). Obviously, this is a much more onerous task, and so the chapter ends by describing how to achieve call tree reversal.
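To make the forward/reverse distinction concrete, here is a minimal hand-written sketch in Python. It is not taken from the book or from the review: the example function `f`, the `t1_`/`a1_` naming, and the helper functions are assumptions chosen purely for illustration. It shows a tangent-linear version that propagates a directional derivative alongside the original statements, an adjoint version that runs the original code and then walks it backwards, and a finite difference approximation for comparison.

```python
import math

def f(x):
    """Original function: y = sin(x0 * x1) + x0 (two inputs, one output)."""
    v = x[0] * x[1]
    return math.sin(v) + x[0]

def t1_f(x, t1_x):
    """Tangent-linear (forward) sketch: each statement is paired with its derivative."""
    v = x[0] * x[1]
    t1_v = t1_x[0] * x[1] + x[0] * t1_x[1]
    y = math.sin(v) + x[0]
    t1_y = math.cos(v) * t1_v + t1_x[0]
    return y, t1_y

def a1_f(x, a1_y):
    """Adjoint (reverse) sketch: run the original code, then walk it backwards."""
    # Forward sweep: compute the output and keep the intermediate value v.
    v = x[0] * x[1]
    y = math.sin(v) + x[0]
    # Reverse sweep: adjoints flow from the output back to the inputs.
    a1_v = math.cos(v) * a1_y       # from y = sin(v) + x0
    a1_x0 = a1_y + x[1] * a1_v      # x0 is used in both statements
    a1_x1 = x[0] * a1_v             # from v = x0 * x1
    return y, [a1_x0, a1_x1]

def gradient_forward(x):
    """One tangent-linear run per input direction, so cost grows with the input count."""
    n = len(x)
    grad = []
    for i in range(n):
        seed = [1.0 if j == i else 0.0 for j in range(n)]
        grad.append(t1_f(x, seed)[1])
    return grad

def gradient_reverse(x):
    """A single adjoint run seeded with a1_y = 1 yields the whole gradient."""
    return a1_f(x, 1.0)[1]

def gradient_fd(x, h=1e-6):
    """Forward finite differences, subject to truncation and rounding error."""
    y0 = f(x)
    grad = []
    for i in range(len(x)):
        xp = list(x)
        xp[i] += h
        grad.append((f(xp) - y0) / h)
    return grad

if __name__ == "__main__":
    point = [0.5, 2.0]
    print("forward:", gradient_forward(point))
    print("reverse:", gradient_reverse(point))
    print("finite differences:", gradient_fd(point))
```

With two inputs and one output, the forward variant needs two tangent-linear runs while the reverse variant needs a single adjoint run; the two agree to rounding error, whereas the finite difference result is only approximate.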
      Chapter 3 covers higher-order derivatives. Given that we have two modes to differentiate a function, we have four combinations of modes to compute second derivatives: forward over forward, reverse over forward, forward over reverse, and reverse over reverse. The first still generates tangent-linear code, while the remaining three are adjoint models. Each of these is dealt with in detail in this chapter, supplemented with example source code showing what a simple function looks like after each of the four possible combinations of transformations has been applied to it. Although code transformation, or rather the generation of derivative code, is the primary focus of this book, both tangent-linear and adjoint computation can also be performed by overloading operators on a custom-defined data type. Hence, this chapter also defines the data types required for computing first- and second-order derivatives, and shows how operator overloading can be used to compute the derivative with minimal change to the original source code. Timings of the various differentiation combinations on some sample toy problems are provided throughout.
      I found chapter 4 to be the weakest in the book. The intention is to detail how one would build a derivative code compiler. However, doing so requires a whole book's worth of knowledge on compiler design, not just half of a chapter. Hence, this chapter is a whirlwind of topics, including deterministic and nondeterministic finite automata; call flow graphs and parse trees; regular expressions and their conversion to finite automata; syntax analysis; top-down and bottom-up parsers; left-right (LR) parsers and dealing with operator precedence; and attribute grammars. Luckily, the author sticks to a single lexical analyzer generator (flex) and a single parser generator (bison). Input files, data formats, and syntax are described for both flex and bison, and usage examples are provided.
      The final chapter brings it all together.
      Using dcc version 0.9, the derivative code compiler written as part of research efforts into algorithmic differentiation, the author walks the reader through all the theoretical topics covered in chapters 2 and 3. Starting with how the compiler changes function signatures to include the additional input and output parameters that will contain the derivatives, Naumann then discusses how local variables are dealt with. The bulk of the chapter looks at three examples of increasing complexity. The first is the simple function y = sin(x). The second covers nested functions, to which the author adds the challenge of dealing with a function that modifies its input parameter. The final example is a function that takes a vector of doubles as input. For each of these examples, the author generates the two types of first-order derivative code (tangent-linear and adjoint), as well as the four types of second-order derivative code described above.
      The book ends with three appendices, the last of which contains hints for solutions to the numerous exercise problems at the end of each chapter.
      Overall, I must admit I expected better. To me, this book feels like two books, one on algorithmic differentiation and the other on derivative code compilers, neither of which is truly complete.
      Online Computing Reviews Service
