research-article

Open Access

Functional programming for modular Bayesian inference

Authors:
Adam Ścibior, University of Cambridge, UK / MPI Tübingen, Germany
Ohad Kammar, University of Oxford, UK
Zoubin Ghahramani, University of Cambridge, UK / Uber AI Labs, USA

Proceedings of the ACM on Programming Languages, Volume 2, Issue ICFP, Article No. 83, pp. 1–29. https://doi.org/10.1145/3236778

Published: 30 July 2018

Abstract

We present an architectural design of a library for Bayesian modelling and inference in modern functional programming languages. The novel aspect of our approach is the modular implementation of existing state-of-the-art inference algorithms. Our design relies on three inherently functional features: higher-order functions, inductive data-types, and support for either type-classes or an expressive module system. We provide a performant Haskell implementation of this architecture, demonstrating that high-level and modular probabilistic programming can be added as a library in sufficiently expressive languages. We review the core abstractions in this architecture: inference representations, inference transformations, and inference representation transformers. We then implement concrete instances of these abstractions, counterparts to particle filters and Metropolis-Hastings samplers, which form the basic building blocks of our library. By composing these building blocks we obtain state-of-the-art inference algorithms: Resample-Move Sequential Monte Carlo, Particle Marginal Metropolis-Hastings, and Sequential Monte Carlo Squared. We evaluate our implementation against existing probabilistic programming systems and find it is already competitively performant, although we conjecture that existing functional programming optimisation techniques could reduce the overhead associated with the abstractions we use. We show that our modular design enables deterministic testing of inherently stochastic Monte Carlo algorithms. Finally, we demonstrate using OCaml that an expressive module system can also implement our design.
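
The claim that a modular design enables deterministic testing of stochastic algorithms can be illustrated with a small sketch. It is written in Python for brevity and is not the paper's Haskell library or its API: the names coin_model, run_weighted, importance_mean and replay are hypothetical, and the model (a uniform prior on a coin's bias with one observed head) is an assumption chosen for illustration. The model is a higher-order function that receives its sampling and scoring effects as arguments, so the same inference code can run against a pseudo-random source or against a fixed, replayed sequence.

```python
import random


def coin_model(sample, score):
    """Hypothetical model: bias ~ Uniform(0, 1), then observe one head."""
    bias = sample()   # draw from the prior via the supplied effect
    score(bias)       # likelihood of observing heads given the bias
    return bias


def run_weighted(model, uniform_source):
    """Interpret a model as a weighted sampler, returning (value, weight)."""
    weight = 1.0

    def score(factor):
        nonlocal weight
        weight *= factor  # accumulate likelihood factors

    value = model(uniform_source, score)
    return value, weight


def importance_mean(model, uniform_source, n):
    """Self-normalised importance-sampling estimate of the posterior mean."""
    runs = [run_weighted(model, uniform_source) for _ in range(n)]
    total = sum(w for _, w in runs)
    return sum(v * w for v, w in runs) / total


def replay(values):
    """Deterministic stand-in for a random source: replays a fixed sequence."""
    it = iter(values)
    return lambda: next(it)


if __name__ == "__main__":
    # Deterministic test: the "random" draws are fixed in advance.
    print(importance_mean(coin_model, replay([0.25, 0.75]), 2))
    # Stochastic run: same inference code, seeded pseudo-random source.
    print(importance_mean(coin_model, random.Random(0).random, 20000))
```

With the replayed draws 0.25 and 0.75 the estimate is exactly 0.625, so the sampler can be checked with an equality assertion. With the pseudo-random source the estimate should be close to 2/3, the mean of the Beta(2, 1) posterior for this assumed model.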

Supplemental Material

a83-scibior.webm (webm, 94.4 MB)

Index Terms

Functional programming for modular Bayesian inference
1. Mathematics of computing
  1. Probability and statistics
    1. Probabilistic reasoning algorithms
    2. Probabilistic representations
      1. Nonparametric representations
        Bayesian nonparametric models
2. Software and its engineering
  1. Software notations and tools

Published in

Proceedings of the ACM on Programming Languages Volume 2, Issue ICFP
September 2018
1133 pages
EISSN:2475-1421
DOI:10.1145/3243631

Copyright © 2018 Owner/Author
This work is licensed under a Creative Commons Attribution International 4.0 License.
Publisher
Association for Computing Machinery
New York, NY, United States
Badges
- Artifacts Available
- Artifacts Evaluated & Functional
Author Tags
Anglican
Bayesian inference
Markov Chain Monte Carlo
Monte Carlo samplers
Sequential Monte Carlo
WebPPL
functional programming
higher-order functions
inductive types
machine learning
module systems
monad transformers
monads
probabilistic programming
type-classes
Qualifiers
- research-article