Abstract
Poker is a family of games that exhibit imperfect information, where players do not have full knowledge of past events. While many perfect information games have been solved (e.g., Connect-Four and checkers), no nontrivial imperfect information game played competitively by humans has previously been solved. In this paper, we announce that the smallest variant of poker in-play, heads-up limit Texas hold'em, is now essentially weakly solved. Furthermore, this computation formally proves the common wisdom that the dealer in the game holds a significant advantage. This result was enabled by a new algorithm, CFR+, which is capable of solving extensive-form games three orders of magnitude larger than previously possible. This paper is an extended version of the original 2015 Science article, with additional results showing Cepheus' in-game performance against computer and human opponents.
- Poker: A big deal. The Economist. Economist Newspaper Ltd., London. December 22, 31--38, 2007.Google Scholar
- Allis, V.L. A Knowledge-Based Approach to Connect-Four. The Game is Solved: White Wins. Master's thesis, Vrije Universiteit, Amsterdam, The Netherlands, 1988.Google Scholar
- Allis, V.L. Searching for Solutions in Games and Artificial Intelligence. PhD thesis, Vrije Universiteit Amsterdam, The Netherlands, 1994.Google Scholar
- Babbage, C. Passages from the Life of a Philosopher. Longman, Green, Longman, Roberts, and Green, London, 1864. Chapter 34.Google Scholar
- Billings, D., Burch, N., Davidson, A., Holte, R.C., Schaeffer, J., Schauenberg, T., Szafron, D. Approximating game-theoretic optimal strategies for full-scale poker. IJCAI, (2003), 661--668. Google ScholarDigital Library
- Billings, D., Davidson, A., Schaeffer, J., Szafron, D. The challenge of poker. Artificial Intelligence 134, 1--2 (2002), 201--240. Google ScholarDigital Library
- Borel, E., Ville, J. Applications de la théorie des probabilités aux jeux de hasard. Gauthier-Villars, 1938.Google Scholar
- Bowling, M., Burch, N., Johanson, M., Tammelin, O. The cepheus website, 2015. http://poker.srv.ualberta.ca.Google Scholar
- Bowling, M., Burch, N., Johanson, M., Tammelin, O. Heads-up limit hold'em poker is solved. Science 347, 6218 (Jan. 2015), 145--149.Google ScholarCross Ref
- Bowling, M., Burch, N., Johanson, M., Tammelin, O. Heads-up limit hold'em poker is solved: Supplementary online material, January 2015.Google Scholar
- Bowling, M., Johanson, M., Burch, N., Szafron, D. Strategy evaluation in extensive games with importance sampling. ICML, (2008), 72--79. Google ScholarDigital Library
- Bronowski, J. Ascent of man. Documentary, 1973. Episode 13.Google Scholar
- Buro, M. Solving the oshi-zumo game. Adv. in Comp. Games 135, (2004) 361--366.Google ScholarCross Ref
- Campbell, M., Hoane, A.J. Jr., Hsu, F. Deep blue. Artificial Intelligence 134, (Jan. 2002), 57--83. Google ScholarDigital Library
- Chen, k., Bowling, M. Tractable objectives for robust policy optimization. Adv. Neural Inf. Process. Syst. (NIPS) 25, (2012), 2078--2086. Google ScholarDigital Library
- Craig, M. The Professor, the Banker, and the Suicide King: Inside the Richest Poker Game of All Time. Grand Central Publishing, New York, NY, 2006.Google Scholar
- Ferrucci, D. Introduction to "this is watson." IBM J. Res. Dev 56, 3.4 (May 2012) 1:1--1:15. Google ScholarDigital Library
- Gilpin, A., Hoda, S., Peña, J., Sandholm, T. Gradient-based algorithms for finding nash equilibria in extensive form games. WINE, (2007), 57--69. Google ScholarDigital Library
- Gilpin, A., Sandholm, T. Lossless abstraction of imperfect information games. J. ACM 54, 5 (2007). Google ScholarDigital Library
- Jackson, E. Slumbot: An implementation of counterfactual regret minimization on commodity hardware. In Proceedings of the 2012 Computer Poker Symposium. (2012).Google Scholar
- Johanson, M., Bard, N. Burch, N., Bowling, M. Finding optimal abstract strategies in extensive form games. AAAI, (2012), 1371--1379. Google ScholarDigital Library
- Johanson, M., Bard, N., Lanctot, M., Gibson, R., Bowling, M. Efficient Nash equilibrium approximation through Monte Carlo counterfactual regret minimization. AAMAS (2012). Google ScholarDigital Library
- Johanson, M., Waugh, K., Bowling, M., Zinkevich, M. Accelerating best response calculation in large extensive games. IJCAI (2011), 258--265. Google ScholarDigital Library
- Johanson, M.B. Robust Strategies and Counter-Strategies: From Superhuman to Optimal Play. PhD thesis, University of Alberta, Edmonton, Alberta, Canada, 2016.Google Scholar
- Karmarkar, N. A new polynomial-time algorithm for linear programming. In Proceedings of the Sixteenth Annual ACM Symposium on Theory of Computing (1984), ACM, New York, NY, 302--311. Google ScholarDigital Library
- Koller, D., Megiddo, N. The complexity of two-person zero-sum games in extensive form. Games Econ. Behav 4, 4 (1992), 528--552.Google ScholarCross Ref
- Koller, D., Megiddo, N., von Stengel, B. Efficient computation of equilibria for extensive two-person games. Games Econ. Behav 14, 2 (1996).Google ScholarCross Ref
- Koller, D., Pfeffer, A. Representations and solutions for game-theoretic problems. Artificial Intelligence 94, (1997), 167--215. Google ScholarDigital Library
- Kuhn, H. Simplified two-person poker. In Contributions to the Theory of Games, volume 1 of Annals of mathematics studies. H. Kuhn and A. Tucker, eds. Princeton University Press, Princeton, New Jersey, 1950, 97--103.Google Scholar
- Mirowski, P. What were von neumann and morgenstern trying to accomplish? In Toward a History of Game Theory. Weintraub, ed. Duke University Press, 1992, 113--147. Mirowski cites Turing as author of the paragraph containing this remark. The paragraph appeared in {46}, in a chapter with Turing listed as one of three contributors. Which parts of the chapter are the work of which contributor, particularly the introductory material containing this quote, is not made explicit.Google Scholar
- Morgenstern, O. The cold war is cold poker. N. Y. Times Mag. (Feb. 5 1961) pages 21--22.Google Scholar
- Nash, J.F., Shapley, L.S. A simple 3-person poker game. In Contributions to the Theory of Games I. Princeton University Press, Princeton, New Jersey, 1950, 105--116.Google Scholar
- Nesterov, Y. Excessive gap technique in nonsmooth convex minimization. SIAM Journal on Optimization 16, 1 (2005), 233--249. Google ScholarDigital Library
- Rehmeyer, J., Fox, N., Rico, R. Ante up, human: The adventures of polaris the poker-playing robot. Wired 16, 12 (Dec. 2008), 186--191.Google Scholar
- Romanovskii, I.V. Reduction of a game with complete memory to a matrix game. Soviet Mathematics 3, (1962), 678--681.Google Scholar
- Rubin, J., Watson, I. Computer poker: a review. Artificial Intelligence 175,(2011), 958--987. Google ScholarDigital Library
- Sandholm, T. The state of solving large incomplete-information games, and application to poker. AI Mag. 31, 4 (2010), 13--32.Google Scholar
- Schaeffer, J., Lake, R., Lu, P., Bryant, M. Chinook the world man-machine checkers champion. AI Mag. 17, 1 (1996), 21--29.Google Scholar
- Schaeffer, J., Neil Burch, Y.B., Kishimoto, A., Müller, M., Lake, R., Lu, P. Sutphen, S. Checkers is solved. Science 317, 5844 (2007), 1518--1522.Google ScholarCross Ref
- Shannon, C.E. Programming a computer for playing chess. Philosophical Magazine, Series 7, 41, 314 (Mar. 1950), 256--275.Google Scholar
- Shi, J., Littman, M.L. Abstraction methods for game theoretic poker. In Comp. Games, (2000), 333--345. Google ScholarDigital Library
- Southey, F., Bowling, M., Larson, B., Piccione, C., Burch, N., Billings, D., Rayner, D.C. Bayes' bluff: Opponent modelling in poker. UAI, (2005) 550--558. Google ScholarDigital Library
- Tambe, M. Security and Game Theory: Algorithms, Deployed Systems, Lessons Learned. Cambridge University Press, Cambridge, England, 2011. Google ScholarDigital Library
- Tammelin, O. Cfr+. CoRR, abs/1407.5042, 2014.Google Scholar
- Tammelin, O., Burch, N., Johanson, M., Bowling, M. Solving heads-up limit texas hold'em. In Proceedings of the 24th International Joint Conference on Artificial Intelligence, 2015, 645--652. Google ScholarDigital Library
- Turing, A. Digital computers applied to games. In Faster Than Thought. B.V. Bowden, ed. Chapter 25. Pitman, 1976.Google Scholar
- von Neumann, J. Zur theorie der gesellschaftsspiele. Mathematische Annalen 100, 1 (1928), 295--320.Google Scholar
- von Neumann, J., Morgenstern, O. Theory of Games and Economic Behavior. Princeton University Press, Princeton, Second Edition, 1947.Google Scholar
- Waugh, K., Bagnell, J.A. A unified view of large-scale zero-sum equilibrium computation. In AAAI Workshop on Computer Poker and Imperfect Information, 2015.Google Scholar
- Waugh, K., Bard, N., Bowling, M. Strategy grafting in extensive games. In Advances in Neural Information Processing Systems 22 (NIPS-09), 2009. http://webdocs.cs.ualberta.ca/~games/poker/publications/NIPS09-graft.pdf. Google ScholarDigital Library
- Zermelo, E. Über eine Anwendung der Mengenlehre auf die Theorie des Schachspiels. In Proceedings of the Fifth International Congress Mathematics. Cambridge University Press, Cambridge, 1913, 501--504.Google Scholar
- Zinkevich, M., Johanson, M., Bowling, M., Piccione, C. Regret minimization in games with incomplete information. NIPS (2008), 905--912. Google ScholarDigital Library
- Zinkevich, M., Littman, M. The AAAI computer poker competition. J. Inter. Comp. Games Association 29, (2006), News item.Google Scholar
Index Terms
- Heads-up limit hold'em poker is solved
Recommendations
A Texas Hold'em poker player based on automated abstraction and real-time equilibrium computation
AAMAS '06: Proceedings of the fifth international joint conference on Autonomous agents and multiagent systemsWe demonstrate our game theory-based Texas Hold'em poker player. To overcome the computational difficulties stemming from Texas Hold'em's gigantic game tree, our player uses automated abstraction and real-time equilibrium approximation. Our player ...
A near-optimal strategy for a heads-up no-limit Texas Hold'em poker tournament
AAMAS '07: Proceedings of the 6th international joint conference on Autonomous agents and multiagent systemsWe analyze a heads-up no-limit Texas Hold'em poker tournament with a fixed small blind of 300 chips, a fixed big blind of 600 chips and a total amount of 8000 chips on the table (until recently, these parameters defined the heads-up endgame of sit-n-go ...
A heads-up no-limit Texas Hold'em poker player: discretized betting models and automatically generated equilibrium-finding programs
AAMAS '08: Proceedings of the 7th international joint conference on Autonomous agents and multiagent systems - Volume 2We present Tartanian, a game theory-based player for heads-up no-limit Texas Hold'em poker. Tartanian is built from three components. First, to deal with the virtually infinite strategy space of no-limit poker, we develop a discretized betting model ...
Comments