Google's new AI, called Player of Games (PoG), was announced this week in a paper published on arXiv. According to DeepMind, the Google subsidiary behind PoG, the AI "reaches strong performance in chess and Go, beats the strongest openly available agent in heads-up no-limit Texas hold’em poker (Slumbot), and defeats the state-of-the-art agent in Scotland Yard."
That makes PoG one of the first general-purpose game-playing AIs to show serious aptitude in both games of incomplete information and games of complete information.
PoG beat Slumbot, the best publicly available poker AI, at heads-up no-limit Texas hold'em for an EV of +7 mbb/hand. That's a pretty close-run thing, especially as there are AIs, shut away in labs, that kick Slumbot's keister when they come out to play. For example, Facebook's AI ReBeL beat Slumbot for an EV of +45 mbb/hand.
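To put those win rates on a scale poker players use more often, the short sketch below converts mbb/hand into big blinds per 100 hands. It assumes the usual reading of "mbb" as one thousandth of a big blind; the function name is illustrative and does not come from the paper.

```python
def mbb_per_hand_to_bb_per_100(mbb_per_hand: float) -> float:
    """Convert milli-big-blinds per hand to big blinds per 100 hands.

    Assumes 1 mbb = 1/1000 of a big blind, so:
    bb/100 = (mbb/hand / 1000) * 100 = mbb/hand / 10
    """
    return mbb_per_hand / 10


# Win rates against Slumbot quoted in the article, in mbb/hand.
results = {"PoG": 7, "ReBeL": 45}

for name, mbb in results.items():
    print(f"{name}: {mbb} mbb/hand = {mbb_per_hand_to_bb_per_100(mbb)} bb/100")
```

On that scale, PoG's edge over Slumbot comes to less than one big blind per 100 hands, while ReBeL's is several.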
PoG also showed a similar middle-of-the-road performance against Stockfish in chess and AlphaZero in Go. So, it looks like PoG is a jack of all trades, master of none, though it will still give most human players a run for their money in more or less any game you can teach it.
The building blocks of how PoG gets good at a game aren't new. However, DeepMind's combination of these blocks is. PoG uses the tree search and classic machine learning strategies of DeepMind's AlphaZero. Then it adds in game-theoretic reasoning and counterfactual regret minimization (CFR). GTO play and CFR are intimately familiar to poker players, especially those who use solvers.
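For readers who have not looked inside a solver, here is a minimal sketch of regret matching, the update rule at the heart of counterfactual regret minimization, applied to rock-paper-scissors in self-play. It is not DeepMind's code and not the full CFR algorithm, which applies this kind of update at every decision point of a game tree. The function names, the starting regrets, and the iteration count are all illustrative assumptions.

```python
# PAYOFF[a][b] is the payoff for playing action a against action b.
# Order: 0 = rock, 1 = paper, 2 = scissors. The game is symmetric,
# so the same table serves both players.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]
NUM_ACTIONS = 3


def strategy_from_regrets(regrets):
    """Play each action in proportion to its positive accumulated regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    # No positive regret yet: fall back to a uniform strategy.
    return [1.0 / NUM_ACTIONS] * NUM_ACTIONS


def train(iterations, initial_regrets=(1.0, 0.0, 0.0)):
    """Run regret matching in self-play and return both average strategies.

    Player 0 starts with a made-up bias toward rock so the run does not
    begin at the equilibrium.
    """
    regrets = [list(initial_regrets), [0.0] * NUM_ACTIONS]
    strategy_sums = [[0.0] * NUM_ACTIONS, [0.0] * NUM_ACTIONS]

    for _ in range(iterations):
        strategies = [strategy_from_regrets(r) for r in regrets]
        for player in (0, 1):
            opponent = strategies[1 - player]
            # Expected payoff of each pure action against the opponent's mix.
            action_values = [
                sum(PAYOFF[a][b] * opponent[b] for b in range(NUM_ACTIONS))
                for a in range(NUM_ACTIONS)
            ]
            # Expected payoff of the mix the player actually used.
            expected = sum(
                strategies[player][a] * action_values[a]
                for a in range(NUM_ACTIONS)
            )
            for a in range(NUM_ACTIONS):
                # Regret: how much better action a would have done.
                regrets[player][a] += action_values[a] - expected
                strategy_sums[player][a] += strategies[player][a]

    return [[s / iterations for s in sums] for sums in strategy_sums]


if __name__ == "__main__":
    for player, average in enumerate(train(50_000)):
        print(player, [round(p, 3) for p in average])
```

The strategy used on any single iteration can swing around; it is the average strategy that drifts toward the equilibrium, which in rock-paper-scissors means playing each action about a third of the time.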
A game of information
The history of AI has been tangled up with game playing since the 1950s, when Arthur L. Samuel wrote a checkers program that learned to improve its own play. In the 1990s, the growing tower of scientists, each standing on the other's shoulders, built AIs to crush at backgammon and chess.
This left two major landmarks: Go and poker.
Google's DeepMind wowed the world when they built AlphaGo and let it loose on the best players in the world.
Until then it was considered a truism that no computer would be able to beat humans at Go. The decision tree was simply too large. Go begins with 361 possible starting moves and 360 first-responses and can last for hundreds of moves.
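A rough sketch of how quickly those numbers grow: the snippet below counts opening move sequences under the simplifying assumption that every move places a stone on an empty point, ignoring captures, passes and illegal moves. It is an illustration of scale, not an exact count of legal Go games, and the function name is hypothetical.

```python
BOARD_POINTS = 19 * 19  # 361 intersections on a standard Go board


def opening_sequences(plies: int) -> int:
    """Count move sequences for the first `plies` moves.

    Simplification: each move fills one empty point, so the number of
    choices drops by one per move (361, 360, 359, ...).
    """
    total = 1
    for move in range(plies):
        total *= BOARD_POINTS - move
    return total


for plies in (1, 2, 4, 10):
    count = opening_sequences(plies)
    print(f"{plies} moves: {count} sequences ({len(str(count))} digits)")
```

Two moves in, there are already 129,960 sequences under this simplification, and the count keeps multiplying with every move after that.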
Only it turned out the decision tree wasn't too large. In March 2016, AlphaGo beat Lee Sedol, one of the world's top players, four games to one.
AlphaZero, built off the back of AlphaGo, can beat any human player at chess or shogi too. But it isn't built for games of incomplete information, like poker.
Go's complexity comes from the depth of options. Poker's complexity comes from the fact that each player is ignorant of the other's cards. No-limit and pot-limit games combine these complexities. Big bet games have enormously deep decision trees since each possible bet size must be investigated separately by the algorithm.
PoG breaks new ground in being able to play games of either complete or incomplete information, with large decision trees or small.
And it does so with minimal input from its human masters.
Living in the future
The codified rules of chess and Go limit the parameters one has to code. So, general-purpose game-playing AIs, like PoG, are viewed as key stepping stones towards truly general-purpose AIs that can eventually break those parameters.
These will be the kind of computers that can figure out how to do the job of anyone from an international diplomat to the embassy chauffeur that ferries her around.
Self-driving cars and AI ambassadors aren't likely to be implemented soon. But DeepMind already has an AI that helps save Google money on its server plant's cooling bill. Plus the company has worked with the UK's National Health Service on an app that identifies patients at high risk of acute kidney injuries.
PoG takes its name from the title of an Iain M. Banks sci-fi novel, The Player of Games. In the novel, Gurgeh, the titular player, is sent to a far-off planet to win a game called Azad and in the process bring down a space empire.
Here the DeepMind team shows a sense of irony you might not expect from CompSci majors, because Gurgeh is a human from a society run largely by super-intelligent computers.
Today they play games for us, but what about tomorrow?
Featured image source: Flickr by ivva used under CC license