Research Article · Open Access · Artifacts Available · Artifacts Evaluated & Functional

Functional programming for modular Bayesian inference

Published: 30 July 2018

Abstract

We present an architectural design of a library for Bayesian modelling and inference in modern functional programming languages. The novel aspect of our approach is the modular implementation of existing state-of-the-art inference algorithms. Our design relies on three inherently functional features: higher-order functions, inductive data types, and support for either type classes or an expressive module system. We provide a performant Haskell implementation of this architecture, demonstrating that high-level and modular probabilistic programming can be added as a library to any sufficiently expressive language. We review the core abstractions in this architecture: inference representations, inference transformations, and inference representation transformers. We then implement concrete instances of these abstractions, counterparts to particle filters and Metropolis-Hastings samplers, which form the basic building blocks of our library. By composing these building blocks we obtain state-of-the-art inference algorithms: Resample-Move Sequential Monte Carlo, Particle Marginal Metropolis-Hastings, and Sequential Monte Carlo Squared. We evaluate our implementation against existing probabilistic programming systems and find that it is already competitively performant, although we conjecture that existing functional programming optimisation techniques could further reduce the overhead associated with the abstractions we use. We show that our modular design enables deterministic testing of inherently stochastic Monte Carlo algorithms. Finally, we demonstrate using OCaml that an expressive module system can also implement our design.
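To illustrate the type-class-based design the abstract describes, here is a minimal, self-contained Haskell sketch. The class and instance names (`MonadSample`, `MonadCond`, `Enumerate`) are simplified stand-ins for the paper's abstractions, not the library's actual API: a model is written once against sampling and conditioning interfaces, and an inference representation is just another monad instance. The example uses exhaustive weighted enumeration as the representation, which is only feasible for small discrete models.

```haskell
-- Models request random choices and soft constraints through type classes,
-- so one model definition can be run under many inference representations.
class Monad m => MonadSample m where
  bernoulli :: Double -> m Bool

class Monad m => MonadCond m where
  score :: Double -> m ()

-- One concrete inference representation: exhaustive weighted enumeration,
-- i.e. a list of outcomes paired with their unnormalised probabilities.
newtype Enumerate a = Enumerate { runEnumerate :: [(a, Double)] }

instance Functor Enumerate where
  fmap f (Enumerate xs) = Enumerate [(f x, w) | (x, w) <- xs]

instance Applicative Enumerate where
  pure x = Enumerate [(x, 1)]
  Enumerate fs <*> Enumerate xs =
    Enumerate [(f x, v * w) | (f, v) <- fs, (x, w) <- xs]

instance Monad Enumerate where
  Enumerate xs >>= k =
    Enumerate [(y, w * v) | (x, w) <- xs, (y, v) <- runEnumerate (k x)]

instance MonadSample Enumerate where
  bernoulli p = Enumerate [(True, p), (False, 1 - p)]

instance MonadCond Enumerate where
  score w = Enumerate [((), w)]

-- A model written once, against the interfaces only.
model :: (MonadSample m, MonadCond m) => m Bool
model = do
  x <- bernoulli 0.5
  score (if x then 0.9 else 0.1)  -- observation favouring x = True
  return x

main :: IO ()
main = print (runEnumerate model)
```

Running `main` prints the unnormalised posterior `[(True,0.45),(False,5.0e-2)]`. Swapping `Enumerate` for a sampling-based representation would require no change to `model`; that separation is what lets inference transformations and representation transformers compose into algorithms such as PMMH and SMC².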


Supplemental Material

a83-scibior.webm (webm, 94.4 MB)

