AI: bluffing, taking money from the public, and victory over uncertainty

Right now, as you read these lines, something remarkable is happening in the world: artificial intelligence is beating humans at yet another game. What is surprising is not that the machine is superior, but that so little is being written about it. Perhaps that is because this time the machine is competing not in the legendary game of Go, not in DOOM, lapta, or hide-and-seek, but in poker played as a sport.

Poker is often dismissively called a game of chance in which winning depends purely on luck, but today it is legally recognized as an official sport in a number of countries (though not in Russia). What matters most for us and for machines is that winning strategies in poker can be worked out using probability theory. Above all, poker is a game of incomplete information, unlike chess, checkers, or backgammon, where both players see the position of every piece on the board. Previously, AI could not win where there was an element of uncertainty. So what has changed?

The biggest games

Disclaimer: understanding this article requires no knowledge of poker, but for a deeper dive into the subject you should know at least the basics of the game.
In science fiction, robots often fight robots ("good" against "bad"), but in reality machine championships, with the exception of the well-known robot combat events, attract few viewers. Bots fighting each other lack emotion and competitive spirit, and interest only a narrow audience of specialists. A fight against humans is another matter! By 2016, computers could beat humans (or had found a mathematical solution) in some two dozen intellectual games: Nim, tic-tac-toe, Ghost, Connect Four, Gomoku (15×15), nine men's morris, pentominoes, Ohvalhu, Quarto, Teeko, Pangki, Renju (without opening rules), Awari (mancala family), Maharajah, Tigers and Goats, Fanorona, English checkers, Three Musketeers, Hex (8×8), Kalah (6×6), Chopsticks, Pentago, and Go (5×5, plus the classic game against Lee Sedol).

Not all of these games may be familiar to you, but each one deserves a separate article with a dramatic chapter on the battle against machines. Perhaps one of the most interesting and intense confrontations (besides the well-known ups and downs around Go) was the battle for the checkers crown. You can read more about that period in our first article in the AI series, but for now just remember one interesting number: English checkers is the largest game that has been completely solved so far. Its search space has a size of 5 × 10²⁰. Finding the solution took a network of personal computers (between 50 and 200 machines) 18 years and 10¹⁴ calculations.

If you have read our previous articles on AI, you already know that in complex games the machine does not win by going through every possible combination of moves. The estimated minimum number of distinct chess games, calculated in 1950 by the American mathematician Claude Shannon, is approximately 10¹¹⁸. For comparison, the number of atoms in the observable Universe is, according to various estimates, between 4 × 10⁷⁹ and 10⁸¹, that is, roughly 10³⁷ to 10³⁸ times smaller than the Shannon number.

Knowing all chess games "by heart" is unthinkable. Enumerating every possible position is not feasible either. And this holds not only for chess. Nevertheless, thanks to better algorithms and, later, improved convolutional neural networks, computers learned to win in areas where the human brain had been faster, or at least no worse.

Where does poker stand among these other worthy games? Take the most popular form of poker today, Texas Hold'em, in its limit variant. Because bet sizes are fixed, limit hold'em depends heavily on mathematics and lends itself well to algorithms: a one-on-one game has about 10¹⁸ game situations. If we take into account that some card combinations are equivalent to each other (for example, two aces of different suits are identical to any other pair of aces), we get about 10¹⁴ distinct game situations. For comparison, no-limit hold'em has 10¹⁶⁰, and the game of Go has 10¹⁷⁰ possible lines of play.
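The idea of treating equivalent card combinations as one can be illustrated on the smallest possible case, the two cards a player is dealt before any betting. The sketch below is an illustration only and is not taken from any of the programs discussed here; the names `canonical`, `RANKS`, and `SUITS` are made up for the example. It shows only the first step of the reduction: once community cards appear, the equivalence rules become more involved.

```python
from itertools import combinations

RANKS = "23456789TJQKA"   # low to high
SUITS = "cdhs"            # clubs, diamonds, hearts, spades

def canonical(card_a, card_b):
    """Map a two-card starting hand to a label that ignores concrete suits."""
    (rank_a, suit_a), (rank_b, suit_b) = card_a, card_b
    # Order the ranks from high to low so that (A, K) and (K, A) coincide.
    high, low = sorted((rank_a, rank_b), key=RANKS.index, reverse=True)
    if high == low:
        return high + low                      # a pair, e.g. "AA"
    # Only whether the suits match matters, not which suits they are.
    return high + low + ("s" if suit_a == suit_b else "o")

deck = [(rank, suit) for rank in RANKS for suit in SUITS]
raw_hands = list(combinations(deck, 2))
classes = {canonical(a, b) for a, b in raw_hands}

print(len(raw_hands))  # number of raw two-card hands
print(len(classes))    # number of classes after merging equivalent hands
```

All six ways of holding two aces collapse into the single label "AA", which is exactly the kind of merging the paragraph above describes.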

At first glance 10¹⁴ looks tiny next to chess, and smaller even than the much simpler English checkers. The difficulty is that poker players do not know their opponent's cards, nor do they know what combination their own cards will make in the next round.

In addition, if you remove the betting limits (that is, allow players to bet all the money they have) and increase the number of players from two to three or five, you get such a viscous swamp of mathematical uncertainty that even the most powerful supercomputers begin to sink in it. Human experience makes it possible to estimate the missing information intuitively by watching an opponent's actions, and even to draw conclusions and make predictions. Poker is therefore an excellent testing ground for AI: the difficulty is compounded by human opponents, who rely not only on logic but also on bluffs and other tricks to outwit their rivals. The AI has only game theory with which to search for optimal strategies.

Bots against humanity

The first serious attempts to create a poker bot were made in the early 1980s. In 1984, the famous poker expert Mike Caro introduced the Orac program, whose capabilities astonished many. Orac could, for example, detect an opponent's bluff with some success simply by measuring how long the opponent took to act: the longer a person thought, the higher the probability of a bluff.

In 1991, the University of Alberta (Canada) began developing Polaris, a one-on-one hold'em program. The project combined several poker bots that drew on a whole family of algorithms for finding equilibrium strategies. In 2007, after 16 years of work, it played a match against several poker professionals. Under the rules, the human and the computer were dealt the same cards, which minimized the effect of chance. At first Polaris won by a significant margin, but after analyzing several games the players spotted repeating patterns in the program's play and were able to win.

In July 2008, Polaris finally won the man-versus-machine championship. The total score of the sessions was 3 wins, 2 losses, and 1 draw. However, this victory did not usher in an era of machine domination or the death of online poker. As already mentioned, "machine poker" is played with a number of restrictions that do not apply in real online games between people.

The first serious challenge for humans was the 2015 tournament in which the Claudico program faced four of the world's top ten Texas Hold'em players. In that Brains Vs. Artificial Intelligence event the humans proved stronger.

Another poker program from Carnegie Mellon University, Tartanian7, managed to beat a few novices and other computer bots in 2014. What is notable is that after ten years of development it had learned to play no-limit hold'em tolerably well against novices.

Thus, until recently, AI felt more or less confident in limit games against a single person and was completely lost in no-limit and multiplayer poker. Nevertheless, there have always been people working to make bots play better than humans. The reason is simple: money.

Machines earning money from games

A single bot may lose to a human, but a hundred bots mathematically raise the chances of winning. In 2010, a big scandal broke out when bots that had earned more than two hundred thousand dollars in total were discovered on a well-known poker site. Specialists at PokerTableRatings found that the statistics of several players were strikingly similar across the board over a long run of hands. It was possible to prove statistically that all these suspicious players acted identically in every situation.

Did the bots always win, getting caught only because there were so many of them? Not exactly. An online gaming platform, the so-called "poker room", returns to players part of the commission it charges on each player's bets. This refund, called rakeback, is an extra bonus that poker rooms use to attract more players.

On average, the commission for a poker game is 5% and cannot exceed 3 to 5 conventional units (roughly dollars) per hand. Rakeback is a large enough source of profit for bots: thanks to it, a bot can break even against people and still earn from the percentage returned by the poker room. The existence of bots that play successfully against beginners is a fact. But they have no effect on the global development of artificial intelligence.
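The arithmetic behind a break-even bot that still makes money can be sketched as follows. The 5% commission (commonly called rake) and the cap of 3 units come from the figures above; the share of the commission attributed to one player and the 30% rakeback rate are hypothetical values chosen only for illustration, as are the function names.

```python
def rake(pot, rate=0.05, cap=3.0):
    """Commission the room takes from one pot: a percentage, limited by a cap."""
    return min(pot * rate, cap)

def rakeback_income(pots, player_share, rakeback_rate, rate=0.05, cap=3.0):
    """Money returned to one player whose table results sum to zero.

    player_share  -- fraction of each pot's rake attributed to this player (assumed)
    rakeback_rate -- fraction of that rake the room pays back (assumed)
    """
    paid = sum(rake(pot, rate, cap) * player_share for pot in pots)
    return paid * rakeback_rate

# 1,000 one-on-one pots of 40 units each: a 5% commission is 2 units per pot,
# half of which is attributed to our break-even player.
pots = [40.0] * 1000
income = rakeback_income(pots, player_share=0.5, rakeback_rate=0.30)
print(income)

# A large pot hits the cap instead of paying the full 5%.
print(rake(200.0))
```

Under these assumptions the player wins nothing at the tables yet ends the session in profit, and running many such bots multiplies that income.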

A decisive victory on the field of uncertainty

After the successes of Tartanian7, Carnegie Mellon University began creating a new, much more sophisticated poker bot, Libratus. While the DeepStack program from the rival University of Alberta showed very good results in preliminary tests, Libratus took on real poker professionals in January 2017. The computations performed while developing Libratus took 15 million core hours (Claudico required 2 to 3 million core hours). During play, Libratus uses the power of the Bridges supercomputer (1.35 petaflops).

As mentioned earlier, to reduce the number of possible poker situations, programs could (and did) use a simplification under which some card combinations were treated as identical to each other. In most cases this is acceptable, but not against the world's best professionals, where the difference between individual cards matters. Libratus uses a unique strategy for every situation in which it finds itself.

On January 11 a tournament began in which Libratus plays a total of 120,000 hands of one-on-one no-limit Texas Hold'em. The game is played for virtual money, but the $200,000 prize is entirely real for the four professional poker players, two of whom already have experience playing against a bot, having beaten Claudico in 2015. To keep the outcome from being too random, each match is duplicated, so that Player A receives the cards that the computer received in its game against Player B, and vice versa.
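Why duplicating deals makes the result less random can be shown with a toy model. This is a deliberately simplified sketch, not a description of how the tournament was scored: it assumes that the bot's result in a deal is its skill edge plus a "card luck" term, and that replaying the deal with the hands swapped flips the sign of that luck. The function name and all parameter values are invented for the example.

```python
import random
import statistics

def play_match(num_deals, skill_edge, luck_sd, duplicate, seed=0):
    """Per-deal results for the bot in a toy model: result = skill + card luck."""
    rng = random.Random(seed)
    results = []
    for _ in range(num_deals):
        luck = rng.gauss(0.0, luck_sd)   # how favourable the bot's cards are
        if duplicate:
            # The same deal is replayed at a second table with hands swapped,
            # so the bot's card luck there has the opposite sign.
            first_table = skill_edge + luck
            second_table = skill_edge - luck
            results.append((first_table + second_table) / 2)
        else:
            results.append(skill_edge + luck)
    return results

plain = play_match(10_000, skill_edge=0.1, luck_sd=10.0, duplicate=False)
mirrored = play_match(10_000, skill_edge=0.1, luck_sd=10.0, duplicate=True)

print(statistics.mean(plain), statistics.stdev(plain))
print(statistics.mean(mirrored), statistics.stdev(mirrored))
```

In this idealized model the luck cancels completely and only the skill edge remains. In real play the cancellation is only partial, because different players act differently with the same cards, but the spread of results still shrinks.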

Libratus took the lead from the very beginning, beating the humans on the first day and more than doubling its lead on the second. The longer the game lasts, the more information the AI gathers about the players, which makes it stronger. Every time the humans noticed a flaw in the program's strategy, it learned about this and adjusted its game the next day. By the end of last week, Libratus had already won almost $800,000. By January 30, its winnings had passed one million dollars.

The machine plays in a balanced way, meaning that it does a little of everything. It can bluff with bad cards or with good ones, it can bet big or small; its game adapts to the human's actions every time and keeps getting better.

How does Libratus actually work? There is no answer to this question yet. The scientists will not reveal the secret of how the program wins, at least until the end of the tournament. We know that the program is based on a specially designed algorithm for computing optimal strategies in games with incomplete information. A new technique is also used to reach a Nash equilibrium, a set of strategies in which no player can increase their gain by changing strategy as long as the other players do not change theirs.
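The definition of a Nash equilibrium can be checked directly on a tiny game. The sketch below says nothing about how Libratus itself computes its strategy, which is not disclosed here; it only tests the equilibrium condition for rock-paper-scissors, a two-player zero-sum game chosen for illustration. The function names and the payoff matrix are assumptions made for the example.

```python
def expected_payoff(matrix, row_strategy, col_strategy):
    """Expected payoff to the row player in a zero-sum matrix game."""
    return sum(
        row_strategy[i] * col_strategy[j] * matrix[i][j]
        for i in range(len(matrix))
        for j in range(len(matrix[0]))
    )

def is_nash(matrix, row_strategy, col_strategy, tol=1e-9):
    """True if neither player can gain by changing strategy unilaterally."""
    value = expected_payoff(matrix, row_strategy, col_strategy)
    n_rows, n_cols = len(matrix), len(matrix[0])
    # Best the row player can do with any pure strategy against col_strategy.
    best_row = max(
        sum(col_strategy[j] * matrix[i][j] for j in range(n_cols))
        for i in range(n_rows)
    )
    # The column player wants to minimise the row player's payoff.
    best_col = min(
        sum(row_strategy[i] * matrix[i][j] for i in range(n_rows))
        for j in range(n_cols)
    )
    return best_row <= value + tol and best_col >= value - tol

# Payoff to the row player; rows and columns are rock, paper, scissors.
RPS = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]
uniform = [1 / 3, 1 / 3, 1 / 3]
always_rock = [1.0, 0.0, 0.0]

print(is_nash(RPS, uniform, uniform))       # both players mix evenly
print(is_nash(RPS, always_rock, uniform))   # the opponent can switch to paper
```

Checking only pure deviations is enough, because the payoff of any mixed deviation is a weighted average of the payoffs of pure ones. Poker is vastly larger, but the condition being sought is the same.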

Will Libratus destroy online poker?

Texas Hold'em is the standard for research, but the methods being developed are not tied to one type of poker. In principle, a similar bot can be created for other variants (and for more than poker), but most importantly, scientists will get a working tool for solving problems under uncertainty. That covers not only a large class of other games (people still beat the computer in the first StarCraft) but also many real-world problems.

As for ordinary poker, rank-and-file players are already voicing fears that programs will put an end to the familiar online game, or at least complicate it considerably by forcing everyone to turn on a webcam and broadcast their actions online. But as we know from the past, programs did not destroy chess, and chess tournaments with huge prize funds are still popular. And even if online poker does gradually become a thing of the past, everything connected with computing power will ultimately benefit mankind.


Source: https://habr.com/ru/post/401087/
