research-article

Open Access

Gen: a general-purpose probabilistic programming system with programmable inference

Authors:
Marco F. Cusumano-Towner

Massachusetts Institute of Technology, USA

Massachusetts Institute of Technology, USA
View Profile

,
Feras A. Saad

Massachusetts Institute of Technology, USA

Massachusetts Institute of Technology, USA
View Profile

,
Alexander K. Lew

Massachusetts Institute of Technology, USA

Massachusetts Institute of Technology, USA
View Profile

,
Vikash K. Mansinghka

Massachusetts Institute of Technology, USA

Massachusetts Institute of Technology, USA
View Profile

PLDI 2019: Proceedings of the 40th ACM SIGPLAN Conference on Programming Language Design and ImplementationJune 2019Pages 221–236https://doi.org/10.1145/3314221.3314642

Published:08 June 2019Publication History

PLDI 2019: Proceedings of the 40th ACM SIGPLAN Conference on Programming Language Design and Implementation

Pages 221–236

ABSTRACT

Although probabilistic programming is widely used for some restricted classes of statistical models, existing systems lack the flexibility and efficiency needed for practical use with more challenging models arising in fields like computer vision and robotics. This paper introduces Gen, a general-purpose probabilistic programming system that achieves modeling flexibility and inference efficiency via several novel language constructs: (i) the generative function interface for encapsulating probabilistic models; (ii) interoperable modeling languages that strike different flexibility/efficiency trade-offs; (iii) combinators that exploit common patterns of conditional independence; and (iv) an inference library that empowers users to implement efficient inference algorithms at a high level of abstraction. We show that Gen outperforms state-of-the-art probabilistic programming systems, sometimes by multiple orders of magnitude, on diverse problems including object tracking, estimating 3D body pose from a depth image, and inferring the structure of a time series.

Supplemental Material

p221-cusumano-towner.webm

webm

98.3 MB

Download

3314221.3314642.mp4

mp4

74.1 MB

Download

References

Martín Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, Manjunath Kudlur, Josh Levenberg, Rajat Monga, Sherry Moore, Derek G. Murray, Benoit Steiner, Paul Tucker, Vijay Vasudevan, Pete Warden, Martin Wicke, Yuan Yu, and Xiaoqiang Zheng. 2016. TensorFlow: A System for Large-Scale Machine Learning. In Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 2016). USENIX Association, 265-283.Google ScholarDigital Library
Ryan Adams, Hanna Wallach, and Zoubin Ghahramani. 2010. Learning the structure of deep sparse graphical models. In Proceedings of the 13th International Conference on Artificial Intelligence and Statistics (AISTATS 2010). PMLR, 1-8.Google Scholar
Eric Atkinson, Cambridge Yang, and Michael Carbin. 2018. Verifying handcoded probabilistic inference procedures. (2018). arXiv:1805.01863Google Scholar
Jeff Bezanson, Alan Edelman, Stefan Karpinski, and Viral B. Shah. 2017. Julia: A fresh approach to numerical computing. SIAM Rev. 59, 1 (2017), 65-98.Google ScholarDigital Library
Eli Bingham, Jonathan P. Chen, Martin Jankowiak, Fritz Obermeyer, Neeraj Pradhan, Theofanis Karaletsos, Rohit Singh, Paul Szerlip, Paul Horsfall, and Noah D. Goodman. 2018. Pyro: Deep universal probabilistic programming. (2018). arXiv:1810.09538Google Scholar
David M. Blei, Alp Kucukelbir, and Jon D. McAuliffe. 2017. Variational inference: A review for statisticians. J. Amer. Statist. Assoc. 112, 518 (2017), 859-877.Google ScholarCross Ref
David M. Blei, Andrew Y. Ng, and Michael I. Jordan. 2003. Latent Dirichlet allocation. Journal of Machine Learning Research 3 (Jan. 2003), 993-1022. Google ScholarDigital Library
Bob Carpenter, Andrew Gelman, Matthew Hoffman, Daniel Lee, Ben Goodrich, Michael Betancourt, Marcus Brubaker, Jiqiang Guo, Peter Li, and Allen Riddell. 2017. Stan: A probabilistic programming language. Journal of Statistical Software 76, 1 (2017), 1-32.Google ScholarCross Ref
Nick Chater, Joshua B. Tenenbaum, and Alan Yuille. 2006. Probabilistic models of cognition: Conceptual foundations. Trends in Cognitive Sciences 10, 7 (2006), 287-291.Google ScholarCross Ref
Marco Cusumano-Towner, Benjamin Bichsel, Timon Gehr, Martin Vechev, and Vikash K. Mansinghka. 2018. Incremental inference for probabilistic programs. In Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI 2018). ACM, 571-585.Google ScholarDigital Library
Marco Cusumano-Towner and Vikash K. Mansinghka. 2017. AIDE: An algorithm for measuring the accuracy of probabilistic inference algorithms. In Advances in Neural Information Processing Systems 30 (NIPS 2017). Curran Associates, Inc., 3000-3010.Google Scholar
Marco Cusumano-Towner, Alexey Radul, David Wingate, and Vikash K. Mansinghka. 2017. Probabilistic programs for inferring the goals of autonomous agents. (2017). arXiv:1704.04977Google Scholar
Martin de La Gorce, Nikos Paragios, and David J Fleet. 2008. Model-based hand tracking with texture, shading and self-occlusions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2008). IEEE, 1-8.Google ScholarCross Ref
Luca Del Pero, Joshua Bowdish, Daniel Fried, Bonnie Kermgard, Emily Hartley, and Kobus Barnard. 2012. Bayesian geometric modeling of indoor scenes. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2012). IEEE, 2719-2726. Google ScholarDigital Library
Arnaud Doucet, Nando De Freitas, and Neil Gordon. 2001. An introduction to sequential Monte Carlo methods. In Sequential Monte Carlo Methods in Practice, Arnaud Doucet, Nando de Freitas, and Neil Gordon (Eds.). Springer, 3-14.Google Scholar
Simon Duane, Anthony D. Kennedy, Brian J. Pendleton, and Duncan Roweth. 1987. Hybrid Monte carlo. Physics Letters B 195, 2 (1987), 216-222.Google ScholarCross Ref
David Duvenaud, James Robert Lloyd, Roger Grosse, Joshua B. Tenenbaum, and Zoubin Ghahramani. 2013. Structure discovery in nonparametric regression through compositional kernel search. In Proceedings of the 30th International Conference on Machine Learning (ICML 2010). PMLR, 1166-1174. Google ScholarDigital Library
Hong Ge, Kai Xu, and Zoubin Ghahramani. 2018. Turing: A language for flexible probabilistic inference. In Proceedings of the 21st International Conference on Artificial Intelligence and Statistics (AISTATS 2018). PMLR, 1682-1690.Google Scholar
Alan E. Gelfand and Adrian F. M. Smith. 1990. Sampling-based approaches to calculating marginal densities. J. Amer. Statist. Assoc. 85, 410 (1990), 398-409.Google ScholarCross Ref
Andrew Gelman, Hal S. Stern, John B. Carlin, David B. Dunson, Aki Vehtari, and Donald B. Rubin. 2013. Bayesian Data Analysis. CRC Press.Google Scholar
Stuart Geman and Donald Geman. 1987. Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. In Readings in Computer Vision: Issues, Problem, Principles, and Paradigms, Martin A. Fischler and Oscar Fischler (Eds.). Morgan Kaufmann Publishers Inc., 564-584. Google ScholarDigital Library
Noah Goodman, Vikash Mansinghka, Daniel M. Roy, Keith Bonawitz, and Joshua B. Tenenbaum. 2008. Church: a language for generative models. In Proceedings of the 24th Annual Conference on Uncertainty in Artificial Intelligence (UAI 2008). AUAI Press, 220-229. Google ScholarDigital Library
Noah D. Goodman and Andreas Stuhlmüller. 2014. The Design and Implementation of Probabilistic Programming Languages. http://dippl.org. Accessed: 2018-11-8.Google Scholar
Peter J. Green. 1995. Reversible jump Markov chain Monte Carlo computation and Bayesian model determination. Biometrika 82, 4 (1995), 711-732.Google ScholarCross Ref
Roger B. Grosse, Ruslan Salakhutdinov, William T. Freeman, and Joshua B. Tenenbaum. 2012. Exploiting compositionality to explore a large space of model structures. In Proceedings of the 28th Conference on Uncertainty in Artificial Intelligence (UAI 2012). AUAI Press, 306-315. Google ScholarDigital Library
Wilfred K. Hastings. 1970. Monte Carlo sampling methods using Markov chains and their applications. Biometrika 57, 1 (1970), 97-109.Google ScholarCross Ref
Geoffrey E. Hinton, Simon Osindero, and Yee-Whye Teh. 2006. A fast learning algorithm for deep belief nets. Neural Computation 18, 7 (2006), 1527-1554. Google ScholarDigital Library
Steven Holtzen, Yibiao Zhao, Tao Gao, Joshua B. Tenenbaum, and Song-Chun Zhu. 2016. Inferring human intent from video by sampling hierarchical plans. In 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 1489-1496.Google ScholarDigital Library
Daniel Huang, Jean-Baptiste Tristan, and Greg Morrisett. 2017. Compiling Markov chain Monte Carlo algorithms for probabilistic modeling. In Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI 2017). ACM, 111-125. Google ScholarDigital Library
Varun Jampani, Sebastian Nowozin, Matthew Loper, and Peter V. Gehler. 2015. The informed sampler. Computing Vision and Image Understanding 136, C (2015), 32-44. Google ScholarDigital Library
S. J. Julier and J. K. Uhlmann. 2004. Unscented filtering and nonlinear estimation. Proc. IEEE 92, 3 (2004), 401-422.Google ScholarCross Ref
Rudolph E. Kalman. 1960. A new approach to linear filtering and prediction problems. Journal of Basic Engineering 82, 1 (1960), 35-45.Google ScholarCross Ref
Bjarne Knudsen and Jotun Hein. 2003. Pfold: RNA secondary structure prediction using stochastic context-free grammars. Nucleic Acids Research 31, 13 (2003), 3423-3428.Google ScholarCross Ref
Daphne Koller, Nir Friedman, and Francis Bach. 2009. Probabilistic Graphical Models: Principles and Techniques. MIT Press. Google ScholarDigital Library
Tejas D. Kulkarni, Pushmeet Kohli, Joshua B. Tenenbaum, and Vikash Mansinghka. 2015. Picture: A probabilistic programming language for scene perception. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2015). IEEE, 4390-4399.Google ScholarCross Ref
John D. Lafferty, Andrew McCallum, and Fernando C. N. Pereira. 2001. Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In Proceedings of the 18th International Conference on Machine Learning (ICML 2001). Morgan Kaufmann Publishers Inc., 282-289. Google ScholarDigital Library
Steven M. LaValle. 1998. Rapidly-exploring random trees: A new tool for path planning. Technical Report TR 98-11. Computer Science Department, Iowa State University.Google Scholar
Vikash Mansinghka, Patrick Shafto, Eric Jonas, Cap Petschulat, Max Gasner, and Joshua B. Tenenbaum. 2016. CrossCat: A fully Bayesian nonparametric method for analyzing heterogeneous, high dimensional data. Journal of Machine Learning Research 17, 138 (2016), 1-49.Google ScholarDigital Library
Vikash K. Mansinghka, Tejas D. Kulkarni, Yura N. Perov, and Joshua B. Tenenbaum. 2013. Approximate Bayesian image interpretation using generative probabilistic graphics programs. In Advances in Neural Information Processing Systems 26. Curran Associates, Inc., 1520-1528. Google ScholarDigital Library
Vikash K. Mansinghka, Ulrich Schaechtle, Shivam Handa, Alexey Radul, Yutian Chen, and Martin Rinard. 2018. Probabilistic programming with programmable inference. In Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI 2018). ACM, 603-616.Google ScholarDigital Library
Andrew McCallum, Karl Schultz, and Sameer Singh. 2009. FACTORIE: Probabilistic programming via imperatively defined factor graphs. In Advances in Neural Information Processing Systems 22 (NIPS 2009). Curran Associates, 1249-1257. Google ScholarDigital Library
Brian Milch, Bhaskara Marthi, Stuart Russell, David Sontag, Daniel L. Ong, and Andrey Kolobov. 2005. BLOG: Probabilistic models with unknown objects. In Proceedings of the Nineteenth International Joint Conference on Artificial Intelligence (IJCAI 2005). Morgan Kaufmann Publishers Inc., 1352-1359. Google ScholarDigital Library
Michael Montemerlo, Sebastian Thrun, Daphne Roller, and Ben Wegbreit. 2003. FastSLAM 2.0: An improved particle filtering algorithm for simultaneous localization and mapping that provably converges. In Proceedings of the 18th International Joint Conference on Artificial Intelligence (IJCAI 2003). Morgan Kaufmann Publishers Inc., 1151-1156. Google ScholarDigital Library
Kevin P. Murphy. 2012. Machine Learning: A Probabilistic Perspective. MIT Press. Google ScholarDigital Library
Lawrence M Murray, Daniel Lundén, Jan Kudlicka, David Broman, and Thomas B Schön. 2018. Delayed sampling and automatic Rao-Blackwellization of probabilistic programs. (2018), 1037-1046.Google Scholar
Praveen Narayanan, Jacques Carette, Wren Romano, Chungchieh Shan, and Robert Zinkov. 2016. Probabilistic inference by program transformation in Hakaru (system description). In Proceedings of the 13th International Symposium on Functional and Logic Programming (FLOPS 2016). Springer, 62-79.Google ScholarCross Ref
Radford M. Neal. 2000. Markov chain sampling methods for Dirichlet process mixture models. Journal of Computational and Graphical Statistics 9, 2 (2000), 249-265.Google Scholar
Avi Pfeffer. 2001. IBAL: A probabilistic rational programming language. In Proceedings of the Seventeenth International Joint Conference on Artificial Intelligence (IJCAI 2001), Vol. 1. Morgan Kaufmann Publishers Inc., 733-740. Google ScholarDigital Library
Avi Pfeffer. 2016. Practical Probabilistic Programming (1 ed.). Manning Publications Co. Google ScholarDigital Library
Rajesh Ranganath, Sean Gerrish, and David Blei. 2014. Black box variational inference. In Proceedings of the 17th International Conference on Artificial Intelligence and Statistics (AISTATS 2014). PMLR, 814-822.Google Scholar
Daniel Ritchie, Andreas Stuhlmüller, and Noah Goodman. 2016. C3: Lightweight incrementalized MCMC for probabilistic programs using continuations and callsite caching. In Proceedings of the 19th International Conference on Artificial Intelligence and Statistics (AISTATS 2016). PMLR, 28-37.Google Scholar
Gareth O. Roberts and Richard L. Tweedie. 1996. Exponential convergence of Langevin distributions and their discrete approximations. Bernoulli 2, 4 (December 1996), 341-363.Google ScholarCross Ref
Stuart J. Russell and Peter Norvig. 2009. Artificial Intelligence: A Modern Approach. Pearson. Google ScholarDigital Library
Feras A. Saad, Marco Cusumano-Towner, Ulrich Schaechtle, Martin C. Rinard, and Vikash K. Mansinghka. 2019. Bayesian synthesis of probabilistic programs for automatic data modeling. Proceedings of the ACM on Programming Languages 3, POPL, Article 37 (2019), 29 pages.Google Scholar
Tim Salimans, Diederik P Kingma, and Max Welling. 2015. Markov chain Monte Carlo and variational inference: Bridging the gap. In Proceedings of the 32nd International Conference on Machine Learning (ICML 2015). PMLR, 1218-1226. Google ScholarDigital Library
John Salvatier, Thomas V. Wiecki, and Christopher Fonnesbeck. 2016. Probabilistic programming in Python using PyMC3. PeerJ Computer Science 2 (2016), e55.Google ScholarCross Ref
Ulrich Schaechtle, Feras Saad, Alexey Radul, and Vikash K. Mansinghka. 2016. Time Series Structure Discovery via Probabilistic Program Synthesis. (2016). arXiv:1611.07051Google Scholar
Leonid Sigal, Alexandru Balan, and Michael J. Black. 2008. Combined discriminative and generative articulated pose and non-rigid shape estimation. In Advances in Neural Information Processing Systems 20. Curran Associates, Inc., 1337-1344. Google ScholarDigital Library
Andreas Stuhlmüller, Jacob Taylor, and Noah Goodman. 2013. Learning stochastic inverses. In Advances in Neural Information Processing Systems 26 (NIPS 2013). Curran Associates, 3048-3056. Google ScholarDigital Library
Sebastian Thrun, Wolfram Burgard, and Dieter Fox. 2005. Probabilistic Robotics. MIT Press.Google Scholar
Dustin Tran, Matthew D. Hoffman, Rif A. Saurous, Eugene Brevdo, Kevin Murphy, and David M. Blei. 2017. Deep probabilistic programming. In International Conference on Learning Representations (ICLR).Google Scholar
Jean-Baptiste Tristan, Daniel Huang, Joseph Tassarotti, Adam C. Pocock, Stephen Green, and Guy L. Steele. 2014. Augur: Data-parallel probabilistic modeling. In Advances in Neural Information Processing Systems 27 (NIPS 2014). 2600-2608. Google ScholarDigital Library
Jan-Willem van de Meent, Brooks Paige, Hongseok Yang, and Frank Wood. 2018. An introduction to probabilistic programming. (2018). arXiv:1809.10756Google Scholar
Martin J. Wainwright and Michael I. Jordan. 2008. Graphical models, exponential families, and variational inference. Foundations and Trends in Machine Learning 1, 1 (2008), 1-305. Google ScholarDigital Library
Richard D. Wilkinson. 2013. Approximate Bayesian computation (ABC) gives exact results under the assumption of model error. Statistical Applications in Genetics and Molecular Biology 12, 2 (2013), 129-141.Google ScholarCross Ref
David Wingate, Andreas Stuhlmüller, and Noah Goodman. 2011. Light-weight implementations of probabilistic programming languages via transformational compilation. In Proceedings of the 14TH International Conference on Artificial Intelligence and Statistics (AISTATS 2011). PMLR, 770-778.Google Scholar
Frank Wood, Thomas L. Griffiths, and Zoubin Ghahramani. 2006. A non-parametric Bayesian method for inferring hidden causes. In Proceedings of the 22nd Conference on Uncertainty in Artificial Intelligence (UAI 2006). AUAI Press, 536-543. Google ScholarDigital Library
Frank Wood, Jan Willem Meent, and Vikash Mansinghka. 2014. A new approach to probabilistic programming inference. In Proceedings of the 17th International Conference on Artificial Intelligence and Statistics (AISTATS 2014). PMLR, 1024-1032.Google Scholar
Lingfeng Yang, Patrick Hanrahan, and Noah Goodman. 2014. Generating efficient MCMC kernels from probabilistic programs. In Proceedings of the 17th International Conference on Artificial Intelligence and Statistics (AISTATS 2014). PMLR, 1068-1076.Google Scholar
Mao Ye, Xianwang Wang, Ruigang Yang, Liu Ren, and Marc Pollefeys. 2011. Accurate 3D pose estimation from a single depth image. In Proceedings of the International Conference on Computer Vision (ICCV 2011). IEEE, 731-738. Google ScholarDigital Library

Index Terms

Gen: a general-purpose probabilistic programming system with programmable inference
1. Mathematics of computing
  1. Probability and statistics
    1. Probabilistic reasoning algorithms

Recommendations

Probabilistic programming with programmable inference
PLDI 2018: Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation

We introduce inference metaprogramming for probabilistic programming languages, including new language constructs, a formalism, and the rst demonstration of e ectiveness in practice. Instead of relying on rigid black-box inference algorithms hard-coded ...
Read More
Probabilistic abductive logic programming using Dirichlet priors

Probabilistic programming is an area of research that aims to develop general inference algorithms for probabilistic models expressed as probabilistic programs whose execution corresponds to inferring the parameters of those models. In this paper, we ...
Read More
Probabilistic programming with programmable inference
PLDI '18

We introduce inference metaprogramming for probabilistic programming languages, including new language constructs, a formalism, and the rst demonstration of e ectiveness in practice. Instead of relying on rigid black-box inference algorithms hard-coded ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

Published in
PLDI 2019: Proceedings of the 40th ACM SIGPLAN Conference on Programming Language Design and Implementation
June 2019
1162 pages
ISBN:9781450367127
DOI:10.1145/3314221
General Chair:
Kathryn S. McKinley
Google, USA
,
Program Chair:
Kathleen Fisher
Tufts University, USA
Copyright © 2019 Owner/Author
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 8 June 2019
Check for updates
Author Tags
Markov chain Monte Carlo
Probabilistic programming
sequential Monte Carlo
variational inference
Qualifiers
- research-article
Conference

Acceptance Rates
Overall Acceptance Rate406of2,067submissions,20%
Upcoming Conference
PLDI '24

Sponsor:

sigplan

ACM SIGPLAN Conference on Programming Language Design and Implementation

June 24 - 28, 2024

Copenhagen , Denmark
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 51
  Total Citations
  View Citations
- 21,919
  Total Downloads
- Downloads (Last 12 months)1,189
- Downloads (Last 6 weeks)119
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.