ABSTRACT
Although probabilistic programming is widely used for some restricted classes of statistical models, existing systems lack the flexibility and efficiency needed for practical use with more challenging models arising in fields like computer vision and robotics. This paper introduces Gen, a general-purpose probabilistic programming system that achieves modeling flexibility and inference efficiency via several novel language constructs: (i) the generative function interface for encapsulating probabilistic models; (ii) interoperable modeling languages that strike different flexibility/efficiency trade-offs; (iii) combinators that exploit common patterns of conditional independence; and (iv) an inference library that empowers users to implement efficient inference algorithms at a high level of abstraction. We show that Gen outperforms state-of-the-art probabilistic programming systems, sometimes by multiple orders of magnitude, on diverse problems including object tracking, estimating 3D body pose from a depth image, and inferring the structure of a time series.
Supplemental Material
- Martín Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, Manjunath Kudlur, Josh Levenberg, Rajat Monga, Sherry Moore, Derek G. Murray, Benoit Steiner, Paul Tucker, Vijay Vasudevan, Pete Warden, Martin Wicke, Yuan Yu, and Xiaoqiang Zheng. 2016. TensorFlow: A System for Large-Scale Machine Learning. In Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 2016). USENIX Association, 265-283.Google ScholarDigital Library
- Ryan Adams, Hanna Wallach, and Zoubin Ghahramani. 2010. Learning the structure of deep sparse graphical models. In Proceedings of the 13th International Conference on Artificial Intelligence and Statistics (AISTATS 2010). PMLR, 1-8.Google Scholar
- Eric Atkinson, Cambridge Yang, and Michael Carbin. 2018. Verifying handcoded probabilistic inference procedures. (2018). arXiv:1805.01863Google Scholar
- Jeff Bezanson, Alan Edelman, Stefan Karpinski, and Viral B. Shah. 2017. Julia: A fresh approach to numerical computing. SIAM Rev. 59, 1 (2017), 65-98.Google ScholarDigital Library
- Eli Bingham, Jonathan P. Chen, Martin Jankowiak, Fritz Obermeyer, Neeraj Pradhan, Theofanis Karaletsos, Rohit Singh, Paul Szerlip, Paul Horsfall, and Noah D. Goodman. 2018. Pyro: Deep universal probabilistic programming. (2018). arXiv:1810.09538Google Scholar
- David M. Blei, Alp Kucukelbir, and Jon D. McAuliffe. 2017. Variational inference: A review for statisticians. J. Amer. Statist. Assoc. 112, 518 (2017), 859-877.Google ScholarCross Ref
- David M. Blei, Andrew Y. Ng, and Michael I. Jordan. 2003. Latent Dirichlet allocation. Journal of Machine Learning Research 3 (Jan. 2003), 993-1022. Google ScholarDigital Library
- Bob Carpenter, Andrew Gelman, Matthew Hoffman, Daniel Lee, Ben Goodrich, Michael Betancourt, Marcus Brubaker, Jiqiang Guo, Peter Li, and Allen Riddell. 2017. Stan: A probabilistic programming language. Journal of Statistical Software 76, 1 (2017), 1-32.Google ScholarCross Ref
- Nick Chater, Joshua B. Tenenbaum, and Alan Yuille. 2006. Probabilistic models of cognition: Conceptual foundations. Trends in Cognitive Sciences 10, 7 (2006), 287-291.Google ScholarCross Ref
- Marco Cusumano-Towner, Benjamin Bichsel, Timon Gehr, Martin Vechev, and Vikash K. Mansinghka. 2018. Incremental inference for probabilistic programs. In Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI 2018). ACM, 571-585.Google ScholarDigital Library
- Marco Cusumano-Towner and Vikash K. Mansinghka. 2017. AIDE: An algorithm for measuring the accuracy of probabilistic inference algorithms. In Advances in Neural Information Processing Systems 30 (NIPS 2017). Curran Associates, Inc., 3000-3010.Google Scholar
- Marco Cusumano-Towner, Alexey Radul, David Wingate, and Vikash K. Mansinghka. 2017. Probabilistic programs for inferring the goals of autonomous agents. (2017). arXiv:1704.04977Google Scholar
- Martin de La Gorce, Nikos Paragios, and David J Fleet. 2008. Model-based hand tracking with texture, shading and self-occlusions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2008). IEEE, 1-8.Google ScholarCross Ref
- Luca Del Pero, Joshua Bowdish, Daniel Fried, Bonnie Kermgard, Emily Hartley, and Kobus Barnard. 2012. Bayesian geometric modeling of indoor scenes. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2012). IEEE, 2719-2726. Google ScholarDigital Library
- Arnaud Doucet, Nando De Freitas, and Neil Gordon. 2001. An introduction to sequential Monte Carlo methods. In Sequential Monte Carlo Methods in Practice, Arnaud Doucet, Nando de Freitas, and Neil Gordon (Eds.). Springer, 3-14.Google Scholar
- Simon Duane, Anthony D. Kennedy, Brian J. Pendleton, and Duncan Roweth. 1987. Hybrid Monte carlo. Physics Letters B 195, 2 (1987), 216-222.Google ScholarCross Ref
- David Duvenaud, James Robert Lloyd, Roger Grosse, Joshua B. Tenenbaum, and Zoubin Ghahramani. 2013. Structure discovery in nonparametric regression through compositional kernel search. In Proceedings of the 30th International Conference on Machine Learning (ICML 2010). PMLR, 1166-1174. Google ScholarDigital Library
- Hong Ge, Kai Xu, and Zoubin Ghahramani. 2018. Turing: A language for flexible probabilistic inference. In Proceedings of the 21st International Conference on Artificial Intelligence and Statistics (AISTATS 2018). PMLR, 1682-1690.Google Scholar
- Alan E. Gelfand and Adrian F. M. Smith. 1990. Sampling-based approaches to calculating marginal densities. J. Amer. Statist. Assoc. 85, 410 (1990), 398-409.Google ScholarCross Ref
- Andrew Gelman, Hal S. Stern, John B. Carlin, David B. Dunson, Aki Vehtari, and Donald B. Rubin. 2013. Bayesian Data Analysis. CRC Press.Google Scholar
- Stuart Geman and Donald Geman. 1987. Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. In Readings in Computer Vision: Issues, Problem, Principles, and Paradigms, Martin A. Fischler and Oscar Fischler (Eds.). Morgan Kaufmann Publishers Inc., 564-584. Google ScholarDigital Library
- Noah Goodman, Vikash Mansinghka, Daniel M. Roy, Keith Bonawitz, and Joshua B. Tenenbaum. 2008. Church: a language for generative models. In Proceedings of the 24th Annual Conference on Uncertainty in Artificial Intelligence (UAI 2008). AUAI Press, 220-229. Google ScholarDigital Library
- Noah D. Goodman and Andreas Stuhlmüller. 2014. The Design and Implementation of Probabilistic Programming Languages. http://dippl.org. Accessed: 2018-11-8.Google Scholar
- Peter J. Green. 1995. Reversible jump Markov chain Monte Carlo computation and Bayesian model determination. Biometrika 82, 4 (1995), 711-732.Google ScholarCross Ref
- Roger B. Grosse, Ruslan Salakhutdinov, William T. Freeman, and Joshua B. Tenenbaum. 2012. Exploiting compositionality to explore a large space of model structures. In Proceedings of the 28th Conference on Uncertainty in Artificial Intelligence (UAI 2012). AUAI Press, 306-315. Google ScholarDigital Library
- Wilfred K. Hastings. 1970. Monte Carlo sampling methods using Markov chains and their applications. Biometrika 57, 1 (1970), 97-109.Google ScholarCross Ref
- Geoffrey E. Hinton, Simon Osindero, and Yee-Whye Teh. 2006. A fast learning algorithm for deep belief nets. Neural Computation 18, 7 (2006), 1527-1554. Google ScholarDigital Library
- Steven Holtzen, Yibiao Zhao, Tao Gao, Joshua B. Tenenbaum, and Song-Chun Zhu. 2016. Inferring human intent from video by sampling hierarchical plans. In 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 1489-1496.Google ScholarDigital Library
- Daniel Huang, Jean-Baptiste Tristan, and Greg Morrisett. 2017. Compiling Markov chain Monte Carlo algorithms for probabilistic modeling. In Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI 2017). ACM, 111-125. Google ScholarDigital Library
- Varun Jampani, Sebastian Nowozin, Matthew Loper, and Peter V. Gehler. 2015. The informed sampler. Computing Vision and Image Understanding 136, C (2015), 32-44. Google ScholarDigital Library
- S. J. Julier and J. K. Uhlmann. 2004. Unscented filtering and nonlinear estimation. Proc. IEEE 92, 3 (2004), 401-422.Google ScholarCross Ref
- Rudolph E. Kalman. 1960. A new approach to linear filtering and prediction problems. Journal of Basic Engineering 82, 1 (1960), 35-45.Google ScholarCross Ref
- Bjarne Knudsen and Jotun Hein. 2003. Pfold: RNA secondary structure prediction using stochastic context-free grammars. Nucleic Acids Research 31, 13 (2003), 3423-3428.Google ScholarCross Ref
- Daphne Koller, Nir Friedman, and Francis Bach. 2009. Probabilistic Graphical Models: Principles and Techniques. MIT Press. Google ScholarDigital Library
- Tejas D. Kulkarni, Pushmeet Kohli, Joshua B. Tenenbaum, and Vikash Mansinghka. 2015. Picture: A probabilistic programming language for scene perception. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2015). IEEE, 4390-4399.Google ScholarCross Ref
- John D. Lafferty, Andrew McCallum, and Fernando C. N. Pereira. 2001. Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In Proceedings of the 18th International Conference on Machine Learning (ICML 2001). Morgan Kaufmann Publishers Inc., 282-289. Google ScholarDigital Library
- Steven M. LaValle. 1998. Rapidly-exploring random trees: A new tool for path planning. Technical Report TR 98-11. Computer Science Department, Iowa State University.Google Scholar
- Vikash Mansinghka, Patrick Shafto, Eric Jonas, Cap Petschulat, Max Gasner, and Joshua B. Tenenbaum. 2016. CrossCat: A fully Bayesian nonparametric method for analyzing heterogeneous, high dimensional data. Journal of Machine Learning Research 17, 138 (2016), 1-49.Google ScholarDigital Library
- Vikash K. Mansinghka, Tejas D. Kulkarni, Yura N. Perov, and Joshua B. Tenenbaum. 2013. Approximate Bayesian image interpretation using generative probabilistic graphics programs. In Advances in Neural Information Processing Systems 26. Curran Associates, Inc., 1520-1528. Google ScholarDigital Library
- Vikash K. Mansinghka, Ulrich Schaechtle, Shivam Handa, Alexey Radul, Yutian Chen, and Martin Rinard. 2018. Probabilistic programming with programmable inference. In Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI 2018). ACM, 603-616.Google ScholarDigital Library
- Andrew McCallum, Karl Schultz, and Sameer Singh. 2009. FACTORIE: Probabilistic programming via imperatively defined factor graphs. In Advances in Neural Information Processing Systems 22 (NIPS 2009). Curran Associates, 1249-1257. Google ScholarDigital Library
- Brian Milch, Bhaskara Marthi, Stuart Russell, David Sontag, Daniel L. Ong, and Andrey Kolobov. 2005. BLOG: Probabilistic models with unknown objects. In Proceedings of the Nineteenth International Joint Conference on Artificial Intelligence (IJCAI 2005). Morgan Kaufmann Publishers Inc., 1352-1359. Google ScholarDigital Library
- Michael Montemerlo, Sebastian Thrun, Daphne Roller, and Ben Wegbreit. 2003. FastSLAM 2.0: An improved particle filtering algorithm for simultaneous localization and mapping that provably converges. In Proceedings of the 18th International Joint Conference on Artificial Intelligence (IJCAI 2003). Morgan Kaufmann Publishers Inc., 1151-1156. Google ScholarDigital Library
- Kevin P. Murphy. 2012. Machine Learning: A Probabilistic Perspective. MIT Press. Google ScholarDigital Library
- Lawrence M Murray, Daniel Lundén, Jan Kudlicka, David Broman, and Thomas B Schön. 2018. Delayed sampling and automatic Rao-Blackwellization of probabilistic programs. (2018), 1037-1046.Google Scholar
- Praveen Narayanan, Jacques Carette, Wren Romano, Chungchieh Shan, and Robert Zinkov. 2016. Probabilistic inference by program transformation in Hakaru (system description). In Proceedings of the 13th International Symposium on Functional and Logic Programming (FLOPS 2016). Springer, 62-79.Google ScholarCross Ref
- Radford M. Neal. 2000. Markov chain sampling methods for Dirichlet process mixture models. Journal of Computational and Graphical Statistics 9, 2 (2000), 249-265.Google Scholar
- Avi Pfeffer. 2001. IBAL: A probabilistic rational programming language. In Proceedings of the Seventeenth International Joint Conference on Artificial Intelligence (IJCAI 2001), Vol. 1. Morgan Kaufmann Publishers Inc., 733-740. Google ScholarDigital Library
- Avi Pfeffer. 2016. Practical Probabilistic Programming (1 ed.). Manning Publications Co. Google ScholarDigital Library
- Rajesh Ranganath, Sean Gerrish, and David Blei. 2014. Black box variational inference. In Proceedings of the 17th International Conference on Artificial Intelligence and Statistics (AISTATS 2014). PMLR, 814-822.Google Scholar
- Daniel Ritchie, Andreas Stuhlmüller, and Noah Goodman. 2016. C3: Lightweight incrementalized MCMC for probabilistic programs using continuations and callsite caching. In Proceedings of the 19th International Conference on Artificial Intelligence and Statistics (AISTATS 2016). PMLR, 28-37.Google Scholar
- Gareth O. Roberts and Richard L. Tweedie. 1996. Exponential convergence of Langevin distributions and their discrete approximations. Bernoulli 2, 4 (December 1996), 341-363.Google ScholarCross Ref
- Stuart J. Russell and Peter Norvig. 2009. Artificial Intelligence: A Modern Approach. Pearson. Google ScholarDigital Library
- Feras A. Saad, Marco Cusumano-Towner, Ulrich Schaechtle, Martin C. Rinard, and Vikash K. Mansinghka. 2019. Bayesian synthesis of probabilistic programs for automatic data modeling. Proceedings of the ACM on Programming Languages 3, POPL, Article 37 (2019), 29 pages.Google Scholar
- Tim Salimans, Diederik P Kingma, and Max Welling. 2015. Markov chain Monte Carlo and variational inference: Bridging the gap. In Proceedings of the 32nd International Conference on Machine Learning (ICML 2015). PMLR, 1218-1226. Google ScholarDigital Library
- John Salvatier, Thomas V. Wiecki, and Christopher Fonnesbeck. 2016. Probabilistic programming in Python using PyMC3. PeerJ Computer Science 2 (2016), e55.Google ScholarCross Ref
- Ulrich Schaechtle, Feras Saad, Alexey Radul, and Vikash K. Mansinghka. 2016. Time Series Structure Discovery via Probabilistic Program Synthesis. (2016). arXiv:1611.07051Google Scholar
- Leonid Sigal, Alexandru Balan, and Michael J. Black. 2008. Combined discriminative and generative articulated pose and non-rigid shape estimation. In Advances in Neural Information Processing Systems 20. Curran Associates, Inc., 1337-1344. Google ScholarDigital Library
- Andreas Stuhlmüller, Jacob Taylor, and Noah Goodman. 2013. Learning stochastic inverses. In Advances in Neural Information Processing Systems 26 (NIPS 2013). Curran Associates, 3048-3056. Google ScholarDigital Library
- Sebastian Thrun, Wolfram Burgard, and Dieter Fox. 2005. Probabilistic Robotics. MIT Press.Google Scholar
- Dustin Tran, Matthew D. Hoffman, Rif A. Saurous, Eugene Brevdo, Kevin Murphy, and David M. Blei. 2017. Deep probabilistic programming. In International Conference on Learning Representations (ICLR).Google Scholar
- Jean-Baptiste Tristan, Daniel Huang, Joseph Tassarotti, Adam C. Pocock, Stephen Green, and Guy L. Steele. 2014. Augur: Data-parallel probabilistic modeling. In Advances in Neural Information Processing Systems 27 (NIPS 2014). 2600-2608. Google ScholarDigital Library
- Jan-Willem van de Meent, Brooks Paige, Hongseok Yang, and Frank Wood. 2018. An introduction to probabilistic programming. (2018). arXiv:1809.10756Google Scholar
- Martin J. Wainwright and Michael I. Jordan. 2008. Graphical models, exponential families, and variational inference. Foundations and Trends in Machine Learning 1, 1 (2008), 1-305. Google ScholarDigital Library
- Richard D. Wilkinson. 2013. Approximate Bayesian computation (ABC) gives exact results under the assumption of model error. Statistical Applications in Genetics and Molecular Biology 12, 2 (2013), 129-141.Google ScholarCross Ref
- David Wingate, Andreas Stuhlmüller, and Noah Goodman. 2011. Light-weight implementations of probabilistic programming languages via transformational compilation. In Proceedings of the 14TH International Conference on Artificial Intelligence and Statistics (AISTATS 2011). PMLR, 770-778.Google Scholar
- Frank Wood, Thomas L. Griffiths, and Zoubin Ghahramani. 2006. A non-parametric Bayesian method for inferring hidden causes. In Proceedings of the 22nd Conference on Uncertainty in Artificial Intelligence (UAI 2006). AUAI Press, 536-543. Google ScholarDigital Library
- Frank Wood, Jan Willem Meent, and Vikash Mansinghka. 2014. A new approach to probabilistic programming inference. In Proceedings of the 17th International Conference on Artificial Intelligence and Statistics (AISTATS 2014). PMLR, 1024-1032.Google Scholar
- Lingfeng Yang, Patrick Hanrahan, and Noah Goodman. 2014. Generating efficient MCMC kernels from probabilistic programs. In Proceedings of the 17th International Conference on Artificial Intelligence and Statistics (AISTATS 2014). PMLR, 1068-1076.Google Scholar
- Mao Ye, Xianwang Wang, Ruigang Yang, Liu Ren, and Marc Pollefeys. 2011. Accurate 3D pose estimation from a single depth image. In Proceedings of the International Conference on Computer Vision (ICCV 2011). IEEE, 731-738. Google ScholarDigital Library
Index Terms
- Gen: a general-purpose probabilistic programming system with programmable inference
Recommendations
Probabilistic programming with programmable inference
PLDI 2018: Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and ImplementationWe introduce inference metaprogramming for probabilistic programming languages, including new language constructs, a formalism, and the rst demonstration of e ectiveness in practice. Instead of relying on rigid black-box inference algorithms hard-coded ...
Probabilistic abductive logic programming using Dirichlet priors
Probabilistic programming is an area of research that aims to develop general inference algorithms for probabilistic models expressed as probabilistic programs whose execution corresponds to inferring the parameters of those models. In this paper, we ...
Probabilistic programming with programmable inference
PLDI '18We introduce inference metaprogramming for probabilistic programming languages, including new language constructs, a formalism, and the rst demonstration of e ectiveness in practice. Instead of relying on rigid black-box inference algorithms hard-coded ...
Comments