ABSTRACT
Learning the structure of Bayesian networks (BNs) is known to be NP-complete, and most recent work in the field is based on heuristics. Many recent approaches trade correctness and exactness for faster computation, yet remain computationally infeasible except for networks with few variables. In this paper we present a software/hardware co-design approach to learning Bayesian networks from experimental data that is scalable to very large networks. Our implementation improves the performance of algorithms traditionally developed for the von Neumann computing paradigm by more than four orders of magnitude. Through parallel implementation and exploitation of the reconfigurability of Field Programmable Gate Array (FPGA) systems, our design enables scientists to apply BN learning techniques to large problems, such as studies in molecular biology where the number of variables in the system overwhelms any state-of-the-art software implementation. We describe how we combine Markov Chain Monte Carlo (MCMC) sampling with Bayesian network learning techniques as well as supervised learning methods in a parallel and scalable design. We also present how our design is mapped and customized to run on the Berkeley Emulation Engine 2 (BEE2) multi-FPGA system. Experimental results are presented on synthetic data sets generated from standard Bayesian networks as well as a real-life problem in the context of systems biology.
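For readers unfamiliar with the core technique the abstract refers to, the following is a minimal software sketch of MCMC (Metropolis-Hastings) sampling over Bayesian-network structures, using single-edge proposal moves and a Bayesian-Dirichlet-style family score on binary data. It is not the paper's FPGA design; the scoring function, proposal moves, and variable names are illustrative assumptions.

```python
# Minimal sketch (not the paper's FPGA implementation): Metropolis-Hastings
# sampling over Bayesian-network structures with single edge add/delete moves.
# The score, proposals, and data are illustrative assumptions.
import itertools
import math
import random

import numpy as np


def log_family_score(data, child, parents, alpha=1.0):
    """Bayesian-Dirichlet style log score of one family, binary data assumed."""
    n = data.shape[0]
    score = 0.0
    for config in itertools.product((0, 1), repeat=len(parents)):
        mask = np.ones(n, dtype=bool)
        for p, v in zip(parents, config):
            mask &= data[:, p] == v
        counts = [int(np.sum(data[mask, child] == v)) for v in (0, 1)]
        score += math.lgamma(2 * alpha) - math.lgamma(2 * alpha + sum(counts))
        for c in counts:
            score += math.lgamma(alpha + c) - math.lgamma(alpha)
    return score


def network_log_score(data, parents_of):
    return sum(log_family_score(data, v, sorted(ps)) for v, ps in parents_of.items())


def creates_cycle(parents_of, child, new_parent):
    """Adding new_parent -> child is cyclic iff child is an ancestor of new_parent."""
    stack, seen = [new_parent], set()
    while stack:
        node = stack.pop()
        if node == child:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(parents_of[node])
    return False


def mcmc_structure_sampling(data, n_steps=5000, seed=0):
    """Metropolis-Hastings over DAGs; returns the sampled chain of structures."""
    rng = random.Random(seed)
    n_vars = data.shape[1]
    parents_of = {v: set() for v in range(n_vars)}   # start from the empty graph
    current = network_log_score(data, parents_of)
    samples = []
    for _ in range(n_steps):
        child, parent = rng.sample(range(n_vars), 2)
        proposal = {v: set(ps) for v, ps in parents_of.items()}
        if parent in proposal[child]:
            proposal[child].discard(parent)          # delete an existing edge
            valid = True
        elif not creates_cycle(parents_of, child, parent):
            proposal[child].add(parent)              # add a new, acyclic edge
            valid = True
        else:
            valid = False                            # cyclic proposal: reject
        if valid:
            proposed = network_log_score(data, proposal)
            if proposed >= current or rng.random() < math.exp(proposed - current):
                parents_of, current = proposal, proposed
        samples.append({v: frozenset(ps) for v, ps in parents_of.items()})
    return samples


if __name__ == "__main__":
    # Synthetic binary data in which x1 is a noisy copy of x0 and x2 is independent.
    gen = np.random.default_rng(0)
    x0 = gen.integers(0, 2, size=500)
    x1 = (x0 ^ (gen.random(500) < 0.1)).astype(int)
    x2 = gen.integers(0, 2, size=500)
    data = np.column_stack([x0, x1, x2])
    chain = mcmc_structure_sampling(data, n_steps=2000)
    # Posterior frequency of the edge x0 -> x1 over the sampled structures.
    print(sum(0 in g[1] for g in chain) / len(chain))
```

A parallel or hardware realization, as described in the paper, would distribute score evaluations and chain steps across processing elements rather than running this single sequential loop.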
Index Terms
- Reconfigurable computing for learning Bayesian networks