skip to main content
research-article
Open Access

A domain theory for statistical probabilistic programming

Published:02 January 2019Publication History
Skip Abstract Section

Abstract

We give an adequate denotational semantics for languages with recursive higher-order types, continuous probability distributions, and soft constraints. These are expressive languages for building Bayesian models of the kinds used in computational statistics and machine learning. Among them are untyped languages, similar to Church and WebPPL, because our semantics allows recursive mixed-variance datatypes. Our semantics justifies important program equivalences including commutativity.

Our new semantic model is based on `quasi-Borel predomains'. These are a mixture of chain-complete partial orders (cpos) and quasi-Borel spaces. Quasi-Borel spaces are a recent model of probability theory that focuses on sets of admissible random elements. Probability is traditionally treated in cpo models using probabilistic powerdomains, but these are not known to be commutative on any class of cpos with higher order functions. By contrast, quasi-Borel predomains do support both a commutative probabilistic powerdomain and higher-order functions. As we show, quasi-Borel predomains form both a model of Fiore's axiomatic domain theory and a model of Kock's synthetic measure theory.

Skip Supplemental Material Section

Supplemental Material

a36-vakar.webm

webm

89.9 MB

References

  1. Martin Abadi and Marcelo P Fiore. 1996. Syntactic considerations on recursive types. In Proc. LICS 1996. IEEE, 242–252. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. Nathanael L Ackerman, Cameron E. Freer, and Daniel M. Roy. 2011. Noncomputable conditional distributions. In Proc. LICS 2011. 107–116. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. J. Adamek and J. Rosicky. 1994. Locally Presentable and Accessible Categories. Cambridge University Press.Google ScholarGoogle Scholar
  4. Giorgio Bacci, Robert Furber, Dexter Kozen, Radu Mardare, Prakash Panangaden, and Dana Scott. 2018. Boolean-valued semantics for the stochastic λ-calculus. In Proc. LICS 2018. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. Tyler Barker. 2016. A monad for randomized algorithms, In Proc. MFPS 2016. Electronic Notes in Theoretical Computer Science 325, 47–62.Google ScholarGoogle ScholarCross RefCross Ref
  6. Ingo Battenfeld, Matthias Schröder, and Alex Simpson. 2007. A convenient category of domains. Electronic Notes in Theoretical Computer Science 172 (2007), 69–99. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. Johannes Borgström, Ugo Dal Lago, Andrew D. Gordon, and Marcin Szymczak. 2016. A lambda-calculus foundation for universal probabilistic programming. In Proc. ICFP 2016. 33–46. Full version at arxiv:abs/1512.08990. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. Thomas Ehrhard, Michele Pagani, and Christine Tasson. 2017. Measurable cones and stable, measurable functions: a model for probabilistic higher-order programming. Proceedings of the ACM on Programming Languages 2, POPL (2017), 59. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Marcelo P. Fiore. 1996. Axiomatic Domain Theory in Categories of Partial Maps. Cambridge University Press. Google ScholarGoogle Scholar
  10. Marcelo P Fiore and Gordon D Plotkin. 1994. An axiomatisation of computationally adequate domain theoretic models of FPC. In Proc. LICS 1994. 92–102.Google ScholarGoogle ScholarCross RefCross Ref
  11. Noah Goodman, Vikash Mansinghka, Daniel M Roy, Keith Bonawitz, and Joshua B Tenenbaum. 2008. Church: a language for generative models. In Proc. UAI 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. Noah Goodman and Andreas Stuhlmüller. 2014. Design and Implementation of Probabilistic Programming Languages. http://dippl.org . Online book.Google ScholarGoogle Scholar
  13. Jean Goubault-Larrecq and Daniele Varacca. 2011. Continuous random variables. In Logic in Computer Science (LICS), 2011 26th Annual IEEE Symposium on. IEEE, 97–106. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. Chris Heunen, Ohad Kammar, Sam Staton, and Hongseok Yang. 2017. A convenient category for higher-order probability theory. In Proc. LICS 2017.Google ScholarGoogle ScholarCross RefCross Ref
  15. Daniel Huang, Greg Morrisett, and Bas Spitters. 2018. An application of computable distributions to the semantics of probabilistic programs. (2018). arxiv:1806.07966.Google ScholarGoogle Scholar
  16. Claire Jones and Gordon D Plotkin. 1989. A probabilistic powerdomain of evaluations. In Proc. LICS 1989. 186–195. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. Achim Jung and Regina Tix. 1998. The troublesome probabilistic powerdomain, In Proc. Comprox 1998. Electronic Notes in Theoretical Computer Science 13, 70–91.Google ScholarGoogle ScholarCross RefCross Ref
  18. Olav Kallenberg. 2006. Foundations of modern probability. Springer.Google ScholarGoogle Scholar
  19. Olav Kallenberg. 2017. Random Measures, Theory and Applications. Probability Theory and Stochastic Modelling, Vol. 77. Springer.Google ScholarGoogle Scholar
  20. Ohad Kammar and Gordon D. Plotkin. 2012. Algebraic foundations for effect-dependent optimisations. In Proc. POPL 2012. 349–360. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. Alexander Kechris. 2012. Classical descriptive set theory. Vol. 156. Springer Science & Business Media.Google ScholarGoogle Scholar
  22. Klaus Keimel and Gordon D Plotkin. 2009. Predicate transformers for extended probability and non-determinism. Mathematical Structures in Computer Science 19, 3 (2009), 501–539. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. Oleg Kiselyov and Chung-Chieh Shan. 2009. Embedded probabilistic programming. In Domain-Specific Languages. Springer, 360–384. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. Anders Kock. 2012. Commutative monads as a theory of distributions. Theory and Applications of Categories 26, 4 (2012), 97–131.Google ScholarGoogle Scholar
  25. Jimmy D. Lawson. 1982. Valuations on continuous lattices. In Continuous Lattices and Related Topics, Rudolf-Eberhard Höffman (Ed.). Mathematik Arbeitspapiere, Vol. 27. Universitat Bremen, 204–225.Google ScholarGoogle Scholar
  26. Paul Blain Levy. 2004. Call-By-Push-Value: A Functional/Imperative Synthesis. Kluwer.Google ScholarGoogle Scholar
  27. Vikash K. Mansinghka, Daniel Selsam, and Yura N. Perov. 2014. Venture: a higher-order probabilistic programming platform with programmable inference. arXiv:1404.0099 (2014). http://arxiv.org/abs/1404.0099Google ScholarGoogle Scholar
  28. Dylan McDermott and Ohad Kammar. 2018. Factorisation systems for logical relations and monadic lifting in type-and-effect system semantics. Proc. MFPS 2018 (2018). To appear.Google ScholarGoogle Scholar
  29. Michael W Mislove. 2016. Domains and random variables. arXiv preprint arXiv:1607.07698 (2016).Google ScholarGoogle Scholar
  30. Eugenio Moggi. 1989. Computational lambda-calculus and monads. In Proc. LICS 1989. 14–23. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. Praveen Narayanan, Jacques Carette, Wren Romano, Chung-chieh Shan, and Robert Zinkov. 2016. Probabilistic inference by program transformation in Hakaru (system description). In Proc. FLOPS 2016. Springer, 62–79.Google ScholarGoogle ScholarCross RefCross Ref
  32. Federico Olmedo, Benjamin Lucien Kaminski, Joost-Pieter Katoen, and Christoph Matheja. 2016. Reasoning about recursive probabilistic programs. In Proc. LICS 2016. 672–681. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. Andrew M Pitts. 1996. Relational properties of domains. Inform. Comput. 127, 2 (1996), 66–90.Google ScholarGoogle ScholarCross RefCross Ref
  34. Gordon D. Plotkin. 1977. LCF Considered as a Programming Language. Theor. Comput. Sci. 5, 3 (1977), 223–255.Google ScholarGoogle ScholarCross RefCross Ref
  35. Nasser Saheb-Djahromi. 1980. CPO’s of measures for nondeterminism. Theoret. Comput. Sci. 12, 1 (1980), 19–37.Google ScholarGoogle ScholarCross RefCross Ref
  36. Adam Ścibior, Ohad Kammar, Matthijs Vákár, Sam Staton, Hongseok Yang, Yufei Cai, Klaus Ostermann, Sean K. Moss, Chris Heunen, and Zoubin Ghahramani. 2017. Denotational validation of higher-order Bayesian inference. Proc. ACM Program. Lang. 2, POPL, Article 60 (Dec. 2017), 29 pages. Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. Dana S. Scott. 1993. A type-theoretical alternative to ISWIM, CUCH, OWHY. Theoret. Comput. Sci. 121, 1 (1993), 411–440. Google ScholarGoogle ScholarDigital LibraryDigital Library
  38. Chung-chieh Shan and Norman Ramsey. 2017. Exact Bayesian inference by symbolic disintegration. In Proc. POPL 2017. 130–144. Google ScholarGoogle ScholarDigital LibraryDigital Library
  39. Michael B. Smyth and Gordon D. Plotkin. 1982. The category-theoretic solution of recursive domain equations. SIAM J. Comput. 11, 4 (1982), 761–783.Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. Sam Staton. 2017. Commutative semantics for probabilistic programming. In Proc. ESOP 2017. Google ScholarGoogle ScholarDigital LibraryDigital Library
  41. Sam Staton, Hongseok Yang, Frank Wood, Chris Heunen, and Ohad Kammar. 2016. Semantics for probabilistic programming: higher-order functions, continuous distributions, and soft constraints. In Proc. LICS 2016. 525–534. Google ScholarGoogle ScholarDigital LibraryDigital Library
  42. Matthijs Vákár and Luke Ong. 2018. On S-Finite Measures and Kernels. arXiv preprint arXiv:1810.01837 (2018).Google ScholarGoogle Scholar
  43. David Wingate, Andreas Stuhlmüller, and Noah Goodman. 2011. Lightweight implementations of probabilistic programming languages via transformational compilation. In Proc. AISTATS 2011.Google ScholarGoogle Scholar
  44. Frank Wood, Jan Willem van de Meent, and Vikash Mansinghka. 2014. A new approach to probabilistic programming inference. In Proc. AISTATS 2014. 1024–1032.Google ScholarGoogle Scholar

Index Terms

  1. A domain theory for statistical probabilistic programming

                Recommendations

                Reviews

                Santiago Escobar

                On the one hand, a statistical programming language is similar to a traditional programming language, but with libraries providing statistical functions. Examples are Mathematica, MATLAB, and the omnipresent R. On the other hand, probabilistic programming aims at providing a model of a probabilistic function or distribution as a program written in a programming language. As an example, WebPPL is a probabilistic extension of JavaScript. Essentially, we define a probabilistic function not mathematically, but as a program that computes an output using a sample from a specific given distribution function. Statistical probabilistic programming aims at their seamless combination. When writing a program in a traditional programming language, each one of its executions must provide an output for each given input. However, when reasoning about programs, we usually cannot compute all the possible executions and use other techniques. A very common one is compositionality. That is, we obtain properties of a given program by composing properties of its inner parts. But when writing a probabilistic program, there is no given input-we are not interested in specific data, but in input distribution functions. That is, the notion of an execution in a traditional program is related not to a concrete execution of the probabilistic program, but to executing the program many times using specific input data taken from an uncertainty distribution and extrapolating the probability associated to the output generated from each input. But reasoning about probabilistic programs becomes much more difficult, and compositionality is a must. This paper explores how the commutativity property of compositionality fails in statistical probabilistic programming and provides a denotational semantics for statistical probabilistic programming with recursive types. The authors use quasi-Borel predomains, which are "a mixture of chain-complete partial orders (CPOs) and quasi-Borel spaces." The main language is statistical FPC (SFPC), a statistical variant of fixed-point calculus (FPC), including sum, product, function, and recursive types; real numbers and measurable functions from reals into reals; a construct for testing real numbers for zero; a construct for random numbers; and a construct for soft constraints.

                Access critical reviews of Computing literature here

                Become a reviewer for Computing Reviews.

                Comments

                Login options

                Check if you have access through your login credentials or your institution to get full access on this article.

                Sign in

                Full Access

                • Published in

                  cover image Proceedings of the ACM on Programming Languages
                  Proceedings of the ACM on Programming Languages  Volume 3, Issue POPL
                  January 2019
                  2275 pages
                  EISSN:2475-1421
                  DOI:10.1145/3302515
                  Issue’s Table of Contents

                  Copyright © 2019 Owner/Author

                  This work is licensed under a Creative Commons Attribution-ShareAlike International 4.0 License.

                  Publisher

                  Association for Computing Machinery

                  New York, NY, United States

                  Publication History

                  • Published: 2 January 2019
                  Published in pacmpl Volume 3, Issue POPL

                  Permissions

                  Request permissions about this article.

                  Request Permissions

                  Check for updates

                  Qualifiers

                  • research-article

                PDF Format

                View or Download as a PDF file.

                PDF

                eReader

                View online with eReader.

                eReader