Some people look for a “tell” — a facial tic that gives away the fact that their opponent is bluffing. Others wait for a killer hand, then go all in.
When Michael Bowling plays poker he does neither — instead he simply looks for a counterfactual regret minimisation algorithm that finds Nash equilibria. Professor Bowling has “solved” poker.
A scientific paper published yesterday outlines his creation of a computer program that has been proved to have a perfect strategy for playing poker. Because it works only when there are two players and raises are fixed, it is unlikely to help high rollers. Nevertheless, it is considered a major advance in game theory.
Until now, computers had been able to find unbeatable strategies only in games such as Connect Four and draughts, which are known as “perfect information” games. Here, both players can see precisely what is happening on the board. Chess also fits this category, but has so far proved too complex for a perfect strategy to be developed.
Poker, conversely, is not a perfect information game, as an opponent’s cards are kept secret. Not only does this make it far harder for a computer to solve, it also makes solving it far more useful, because it is closer to real-world situations.
In the paper, published in the journal Science, the authors quote John von Neumann, founder of game theory, who said that solving games such as draughts has little use because real life is not like that. “Real life consists of bluffing, of little tactics of deception, of asking yourself what is the other man going to think I mean to do.” Real life, in other words, is like poker.
The type of poker known as “heads-up limit Texas hold ’em”, in which only two people are involved and raises are fixed, was chosen by the scientists because it is the simplest commonly played variant of the game. Even so, there are three hundred thousand trillion possible states the game can be in. Because the computer player cannot tell what cards its opponent has, many of those states are indistinguishable to it, and Professor Bowling, of the University of Alberta, was able to reduce that number to a mere three hundred trillion.
He then used what is known as a regret minimisation algorithm, which goes back over previous games after the hands have been revealed to see what could have been done better, and adjusts its strategy as it goes along.
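For readers curious about what regret minimisation looks like in practice, the sketch below applies the basic idea to rock-paper-scissors rather than poker. It is an illustration only: it uses plain regret matching, a simpler relative of the counterfactual method named in the paper, and it is not the researchers' program. The function names, the starting bias towards rock and the number of rounds are all assumptions made for the example. Each player looks back at how much better every move would have done, then plays the moves it regrets not having chosen more often.

```python
# Payoff to a player for its own action (row) against the opponent's (column).
# Actions: 0 = rock, 1 = paper, 2 = scissors. The game is symmetric,
# so the same table serves both players.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def strategy_from_regrets(regrets):
    """Play each action in proportion to its positive accumulated regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    # No positive regret yet: fall back to playing uniformly at random.
    return [1.0 / len(regrets)] * len(regrets)


def action_values(opponent_strategy):
    """Expected payoff of each action against the opponent's mixed strategy."""
    return [
        sum(PAYOFF[a][b] * opponent_strategy[b] for b in range(3))
        for a in range(3)
    ]


def train(iterations, initial_regrets=(1.0, 0.0, 0.0)):
    """Self-play with regret matching; returns both players' average strategies.

    initial_regrets is a hypothetical starting bias (player 0 begins by
    always playing rock) so that there is something to correct.
    """
    regrets = [list(initial_regrets), [0.0, 0.0, 0.0]]
    strategy_sums = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    for _ in range(iterations):
        # Both players fix their strategies before either one updates.
        strategies = [strategy_from_regrets(r) for r in regrets]
        for player in (0, 1):
            values = action_values(strategies[1 - player])
            expected = sum(strategies[player][a] * values[a] for a in range(3))
            for a in range(3):
                # Regret: how much better action a would have done
                # than the strategy that was actually played.
                regrets[player][a] += values[a] - expected
                strategy_sums[player][a] += strategies[player][a]
    return [[s / iterations for s in sums] for sums in strategy_sums]


if __name__ == "__main__":
    for player, average in enumerate(train(50000)):
        print(player, [round(p, 3) for p in average])
```

Run for long enough, each player's average strategy should drift towards playing rock, paper and scissors about a third of the time each, which is the Nash equilibrium of that game. Poker needs the counterfactual version because a player must weigh regrets at every decision point without knowing the opponent's cards.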
Although the closest human players get to regret minimisation strategies in casinos is asking their partners to haul them out if they are still playing at midnight, there are some lessons poker players can learn — not least that the program confirms a long-held view that the dealer has an advantage.
Professor Bowling hopes there could also be applications outside of card games. “With real-life decision-making settings almost always involving uncertainty and missing information, algorithmic advances such as those needed to solve poker are the key to future applications,” he writes.