
Dagaz: Looking Ahead

One hundred and thirteen times a second, it reaches out, farther and farther. If a confirmation came, the signal, it could stop, and it does not stop. It reaches out and finds new ways. It improvises, it learns. It does not realize that it does ...

James S. A. Corey, "Cibola Burn"

Generally speaking, "strong" game AI is not my priority. It would be foolish to compete with specialized game engines while building something universal with nothing but single-threaded JavaScript in the browser as a computing resource. Besides, there are a number of games in which the need for complex AI simply does not arise. In one, for example, the entire AI comes down to finding the shortest path, and in another a random bot copes with the task. Alas, such games are the exception rather than the rule. More often, one has to work quite hard before the program's moves stop looking idiotic.

The goal of my project is to popularize board games. I look for little-known or simply forgotten board games and try to bring them back to life. All kinds of games: ancient and modern, complex and quite childish, for two players or more. Even for one player (puzzles are games too). There is one criterion: the game should be interesting! At least to me.
At the current stage of the project, the server side amounts to nothing more than a set of HTML pages with accompanying JavaScript code. I want a game to load and run in any modern web browser, including locally, without Internet access. Competitions, ratings, and everything else related to interaction between human players is not what my project is about. Perhaps it will come, but hardly in the near future.

This is where bots, and the algorithms controlling them, come in. An interactive board with full enforcement of the rules is nice, but to interest a person who may not be familiar with the game at all, you need to play with them. And since play between people is not yet provided for, bots have to step in.

Basic instinct


As I said, there are games in which the machine needs not so much intelligence as instinct. That does not mean the games are bad! As a rule, such games are asymmetric: one side requires a creative approach (a human will handle that), while the other side's play is more routine. Here is a good example of such a game.

Having a complete chess "army" on your side, you need to catch (capture) just one piece: the Maharajah, which combines the moves of a chess knight and queen. It is a very strong piece, but if White plays correctly, the outcome is predetermined. Playing for Black, on the other hand, is rather trivial. The implementation of the bot differs little from a random one. It does the following:

  1. Among all available moves, it looks for one that leads directly to victory (in our case, capturing the opponent's king).
  2. If there is no such move, it looks for any move that captures an opponent's piece without exposing the Maharajah to a counter-attack.
  3. If there are no suitable moves, it makes any safe move.
  4. If there are no safe moves, it passes the baton to another bot (usually a random one).
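The four steps above can be sketched as a single routine. This is my illustration with made-up move flags (`win`, `capture`, `safe`), not the actual Dagaz API:

```javascript
// Sketch of the priority scheme above. Moves are plain objects with
// hypothetical flags (win / capture / safe), not the real Dagaz moves.
// `rand(n)` returns a random integer in [0, n).
function pickMove(moves, rand) {
  // 1. A move that wins immediately
  var wins = moves.filter(function(m) { return m.win; });
  if (wins.length > 0) return wins[0];
  // 2. A capture that does not expose the piece to a counter-attack
  var captures = moves.filter(function(m) { return m.capture && m.safe; });
  if (captures.length > 0) return captures[rand(captures.length)];
  // 3. Any safe move
  var safe = moves.filter(function(m) { return m.safe; });
  if (safe.length > 0) return safe[rand(safe.length)];
  // 4. Nothing suitable: delegate to a fallback bot
  return null;
}
```

Returning `null` here stands in for handing the position over to the next bot in the chain.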

Oddly enough, this simple strategy suits many games, at least as a first approximation. Here is a video of a game with rather exotic capture rules: a piece is removed from the board immediately if it is threatened (along the lines of a chess queen) by three or more opponent pieces:


Here is another example. In "Three Musketeers", captures are performed by only one of the players, but on every turn. The other player's task is to steer the "musketeers" so that they end up on the same file or rank. If that goal is not achieved and either side cannot make another move, the "musketeers" win:


Playing for the "musketeers", a random bot might well suffice, but it would be a shame if the computer lined its pieces up by chance while having another safe move in reserve. The bot should play more carefully. The code is as simple as possible.

Choosing a safe move
  CarefulAi.prototype.getMove = function(ctx) {
    var result = [];
    // Go through all available moves
    _.chain(Dagaz.AI.generate(ctx, ctx.board))
      // skipping the empty ones
      .filter(function(move) { return move.actions.length > 0; })
      .each(function(move) {
        // Apply the move to the current position
        var b = ctx.board.apply(move);
        // If the resulting position is safe,
        if (b.checkGoals(ctx.design, ctx.board.player) >= 0) {
          // remember the move
          result.push(move);
        }
      }, this);
    if (result.length > 0) {
      // Pick one of the safe moves at random
      var ix = this.params.rand(0, result.length - 1);
      return {
        done: true,
        move: result[ix],
        ai: "careful"
      };
    }
    // If there are no safe moves,
    if (this.parent) {
      // delegate to the parent bot
      return this.parent.getMove(ctx);
    }
  }

Of course, I am not saying that bots using more complex algorithms would not play these games better, but the simplest solutions have their uses (for example, a random bot can roll dice in games that require it). Besides, simple algorithms have one important advantage: they are very fast.

Games for one


In the case of puzzles, game AI is not so much a necessity as a way to hone one's skills. After all, the person is expected to solve them himself! On the other hand, you can watch endlessly as the computer tries to solve a puzzle for you:


A simple brute-force search (nothing more effective has yet been invented for this class of puzzles). To ensure that positions do not repeat, a Zobrist hash is used. In this video, you can see the single squares circling around each other for a long time, all because the bot treats them as different pieces. That is easy to fix, and then the solution goes much faster:


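A minimal sketch of position hashing along these lines (my illustration with a made-up board representation, not the Dagaz code): each (square, piece type) pair gets a random number, and a position's hash is the XOR of the numbers for all occupied squares. Identical pieces share one type index, so positions that differ only by swapping equal pieces hash the same and are visited once.

```javascript
// Illustrative Zobrist hashing for a sliding-block puzzle; not Dagaz code.
// SQUARES and TYPES are arbitrary sizes for the sketch.
var SQUARES = 16, TYPES = 4;
var zobrist = [];
for (var sq = 0; sq < SQUARES; sq++) {
  zobrist[sq] = [];
  for (var t = 0; t < TYPES; t++) {
    // 31-bit random numbers keep XOR results within safe integer range
    zobrist[sq][t] = (Math.random() * 0x7FFFFFFF) | 0;
  }
}

// board: array of SQUARES cells, each null or a piece-type index
function hash(board) {
  var h = 0;
  for (var sq = 0; sq < SQUARES; sq++) {
    if (board[sq] !== null) {
      h ^= zobrist[sq][board[sq]];
    }
  }
  return h;
}
```

Because equal pieces use the same type index, the search never distinguishes which of two identical blocks sits on which square, which is exactly the fix mentioned above.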
Another thing that personally made me very nervous: just a couple of steps "in a straight line" from the cherished goal, the bot could take a step "sideways" and shuffle the tiles for half an hour, moving further and further away from the solution. To solve this problem, I used the time allotted for animating a move. In 100 milliseconds, almost imperceptible to a human, hundreds of positions can be examined.

Running ahead
  var x = [];
  var queue = [ ctx.board ];
  var timestamp = Date.now();
  // Expand positions breadth-first while the time budget lasts
  while ((Date.now() - timestamp < this.params.AI_FRAME) && queue.length > 0) {
    var board = queue.shift();
    var moves = cache(x, board);
    // If a goal position is reached, head straight for it
    if (board.checkGoals(Dagaz.Model.getDesign(), board.player) != 0) {
      return {
        done: true,
        move: traceMove(ctx, board),
        ai: "win"
      };
    }
    for (var i = 1; i < moves.length; i++) {
      var b = board.apply(moves[i]);
      var k = getKey(b);
      // Queue only positions we have not seen yet
      if (_.isUndefined(x.cache[k]) && !isLoop(ctx, b)) {
        queue.push(b);
      }
    }
  }
  ...

We simply go through all possible moves while there is time. If we find the goal, we head for it in a straight line! Although this bot's work is quite visual, it does not solve the problem in the best way. It would be wiser to spend more time at the beginning and run the animation after the solution has been found. That is exactly what I did when developing a bot for solving peg solitaire:


Following the example of Leibniz, I solve this problem in the opposite direction: starting with a single piece standing in the center of the board and performing moves in reverse, I try to build the original figure. From this point of view, the task reduces to an ordinary backtracking search. Here is a piece of the log:

Search log
Goal: 10,16,17,18,24,31
Current: 22,23
Current: 22,25,24
Current: 22,27,24,26
Current: 22,27,38,26,31
Current: 24,27,38,26,31,23
Current: 22,27,38,24,31,25
Current: 22,27,38,26,29,30
Current: 22,27,38,26,33,32
Current: 22,27,38,26,17,24
Current: 22,27,10,26,17
Current: 24,27,10,26,17,23
Current: 22,27,10,24,17,25
Current: 22,27,10,26,15,16
Current: 22,27,10,26,19,18
Current: 22,27,10,26,31,24
Current: 22,39,24,32
Current: 22,37,24,32,38
Current: 22,23,24,32,38,30
Current: 22,37,26,32,38,25
Current: 22,37,10,32,38,17
Current: 22,37,24,30,38,31
Current: 22,37,24,34,38,33
Current: 22,37,24,46,38,39
Current: 22,37,24,18,38,25
...

You can see that I cut off the search once a set of 6 positions is reached (the target figure consists of that many pieces). Of course, this is not the only optimization. With the naive approach described above, positions can repeat. As always when repeated positions are involved, the Zobrist hash comes to the rescue (in fact, it helps even earlier: comparing hashes first is much cheaper than comparing positions element by element).
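The reverse search itself is ordinary backtracking. Here is a generic sketch under my own assumptions (a position is an array of occupied cells; `reverseMoves` and `apply` are hypothetical callbacks, not Dagaz functions):

```javascript
// Generic backtracking sketch: grow a position by reverse moves until it
// matches the goal set. Each reverse move adds one piece, so a position
// with as many pieces as the goal that does not match it is a dead end.
function solve(pos, goal, reverseMoves, apply) {
  if (sameSet(pos, goal)) return [];           // goal figure reached
  if (pos.length >= goal.length) return null;  // too many pieces: prune
  var moves = reverseMoves(pos);
  for (var i = 0; i < moves.length; i++) {
    var rest = solve(apply(pos, moves[i]), goal, reverseMoves, apply);
    if (rest !== null) {
      rest.unshift(moves[i]);                  // prepend the successful move
      return rest;
    }
  }
  return null;                                 // dead end: backtrack
}

// Compare two positions as sets of occupied cells
function sameSet(a, b) {
  if (a.length != b.length) return false;
  var s = a.slice().sort(function(x, y) { return x - y; });
  var t = b.slice().sort(function(x, y) { return x - y; });
  for (var i = 0; i < s.length; i++) {
    if (s[i] != t[i]) return false;
  }
  return true;
}
```

The length check is the same cutoff as in the log: once the position holds as many pieces as the goal figure, there is no point going deeper.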

By the way
Debugging this bot produced a funny incident. It so happens that I run the games I develop mostly under Firefox, only occasionally checking them in Chrome and IE. Once, while preparing a release, it turned out that the solitaire AI, working perfectly under IE and Firefox, ran ten times slower under Chrome! It was not very critical, but very strange.

To clarify the situation, I had to use a profiler. It turned out that the cause of the slowdown in Chrome was console.log, even with the console itself closed. After I removed the console output, everything ran at about the same speed everywhere. The moral of this story is very simple: if you need to output a large amount of text (for debugging purposes, for example) and you intend to use console.log for it, think again.
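One cheap way to keep such diagnostics without paying for them (a simple sketch of my own, not code from the project) is to buffer debug lines behind a flag and emit them in a single console.log call:

```javascript
// Sketch: collect debug output and emit it in one console.log call,
// and only when debugging is enabled. DEBUG and trace() are my additions.
var DEBUG = false;
var logBuffer = [];

function trace(line) {
  if (DEBUG) {
    logBuffer.push(line);
  }
}

function flushTrace() {
  if (DEBUG && logBuffer.length > 0) {
    console.log(logBuffer.join("\n"));  // one call instead of thousands
    logBuffer.length = 0;
  }
}
```

With `DEBUG` off, the per-line cost is a single branch, and even with it on, the browser console is hit once per flush rather than once per line.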

Having honed my skills on puzzles, I could move on to more complex matters.

It stretches


It is fairly obvious that more or less serious games, like Chess or Checkers, require more complex AI than anything described above. The choice, by and large, is small: minimax with alpha-beta pruning, or the Monte Carlo method (there are excellent articles on Habr about both). Each approach is universal; each has its pros and cons. Without claiming a complete review, I will list the first things that come to mind.

  1. To use the minimax method, you must define a function that evaluates a position (as an experienced board game developer, I can say that for some games this can be very difficult).
  2. Alpha-beta pruning is especially effective with depth-first search, or a combined search (breadth-first to some guaranteed depth, say 2-3 moves, then depth-first, no more than N levels down). This makes it inconvenient when computing resources are limited and, most importantly, when an answer must be produced no later than a given time (let knowledgeable people correct me if I am mistaken).
  3. The effectiveness of alpha-beta pruning can be significantly improved by pre-ranking the moves by quality (as I will show below, such heuristics are also useful with the Monte Carlo method).
  4. With the minimax method, a depth-first search cannot be stopped at an arbitrary point. When forced moves are detected (in Chess, these are checks and captures), the search must continue until a "quieter" position appears.
  5. With the Monte Carlo method, each simulation is played out to a terminal position (a win for one of the players, or a recognized draw). In games with long playouts, this can be a problem.
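For reference, the alpha-beta scheme from points 1-4 looks roughly like this in its negamax form, here over a toy explicit game tree (a textbook sketch of my own, not Dagaz code; a node is `{ value, childs }`, with `value` playing the role of the evaluation function from point 1):

```javascript
// Textbook alpha-beta pruning (negamax form) over an explicit game tree.
// A node is { value: staticScore, childs: [...] }; value is from the
// perspective of the side to move at that node. Not Dagaz code.
function negamax(node, depth, alpha, beta) {
  if (depth === 0 || node.childs.length === 0) {
    return node.value;  // leaf or depth limit: use the static evaluation
  }
  for (var i = 0; i < node.childs.length; i++) {
    var score = -negamax(node.childs[i], depth - 1, -beta, -alpha);
    if (score > alpha) alpha = score;
    if (alpha >= beta) break;  // beta cutoff: the remaining childs are pruned
  }
  return alpha;
}
```

Point 2 shows up in the structure directly: the recursion is depth-first, and nothing useful can be returned until a whole subtree has been searched, which is exactly why a hard time limit sits poorly with it.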

Taking all these considerations into account, I decided, without discarding the minimax method for good, to focus in the current iteration on the Monte Carlo method (having, of course, selected tasks suitable for it). As one of the games to test the algorithm, I chose "Four-Field Kono", a children's game from Korea.

This is a simple capture game, the goal of which is to deprive the opponent of all his pieces or of the possibility to move. Capturing in Kono is quite peculiar: to take an opponent's piece, you must land on it by jumping one of your pieces over another of your own (thanks to this, the game can get started even though the board is completely full in the initial position). Here is the debug output for the calculation of one move (time limit ~3 seconds):

Debug log
Player: Black
d2 - c2
Goal: 1; Deep: 26; Player: 1; P1: a2; P2: a4, a3, b3, c3, c2
Goal: 1; Deep: 20; Player: 1; P1: a1; P2: a4, b4, c4, d4, b3
Goal: 1; Deep: 16; Player: 1; P1: b1; P2: a4, b4, d4, a3, c3, d3
Goal: 1; Deep: 34; Player: 1; P1: c2; P2: a4, b4, d4, b3
Goal: 1; Deep: 22; Player: 1; P1: c3; P2: a3, b3, d3, b2, d2
Goal: 1; Deep: 24; Player: 1; P1: b1; P2: a4, b4, c3, b2, c2
Goal: 1; Deep: 30; Player: 1; P1: d2; P2: a4, c4, b3, b2
Goal: 1; Deep: 12; Player: 1; P1: a1; P2: a4, b4, c4, b3, d3, d2
Goal: 1; Deep: 34; Player: 1; P1: b1; P2: a4, b4, a3, c1
Goal: 1; Deep: 60; Player: 1; P1: d2; P2: a4, b4, b3, b1, c1
Goal: 1; Deep: 34; Player: 1; P1: d2; P2: a3, c2, a1, b1
Goal: 1; Deep: 36; Player: 1; P1: b1; P2: a4, d4, a3, d3, c2
Goal: 1; Deep: 32; Player: 1; P1: c1; P2: b4, a3, c3, a2, c2
Goal: 1; Deep: 24; Player: 1; P1: c2; P2: b3, b2
Goal: 1; Deep: 70; Player: 1; P1: a1; P2: b3, b2
Goal: 1; Deep: 38; Player: 1; P1: b1; P2: a4, b4, c3, a2, b2
Goal: 1; Deep: 28; Player: 1; P1: a1; P2: a4, d4, b3, c3, c2
Goal: 1; Deep: 34; Player: 1; P1: b2; P2: a4, b4, d4, a3, d3
Goal: 1; Deep: 20; Player: 1; P1: c1; P2: a4, b4, c4, d4, a3, c3
Goal: 1; Deep: 18; Player: 1; P1: c2; P2: a4, c4, a3, c3, b2, d1
Goal: 1; Deep: 28; Player: 1; P1: a2; P2: d4, c3, d3, c2
Goal: 1; Deep: 34; Player: 1; P1: d1; P2: b4, a3, b3, d3, a2
Goal: 1; Deep: 30; Player: 1; P1: b1; P2: a4, a3, b3, c3, b2, c2
Goal: 1; Deep: 36; Player: 1; P1: a2; P2: b4, a3, b3
Goal: 1; Deep: 24; Player: 1; P1: c1; P2: b4, a3, b3, d3
Goal: 1; Deep: 36; Player: 1; P1: a1; P2: a4, c3, a2, c2
Goal: 1; Deep: 22; Player: 1; P1: c1; P2: a4, b4, c4, d4, b2, c2
Goal: 1; Deep: 38; Player: 1; P1: c3; P2: a4, d4, b3, b2, c2
Goal: 1; Deep: 46; Player: 1; P1: a1; P2: a4, b4, c4, b2, c2
Goal: 1; Deep: 38; Player: 1; P1: a1; P2: a4, b4, d3, c2, d2
Goal: 1; Deep: 28; Player: 1; P1: b1; P2: b4, c4, d4, c3, a2
Goal: 1; Deep: 20; Player: 1; P1: d2; P2: a4, b4, c4, b3, c2
Goal: 1; Deep: 20; Player: 1; P1: c1; P2: a4, b3, c3, b2
Goal: 1; Deep: 48; Player: 1; P1: d1; P2: a3, a2, b2
Win = 5; All = 11; Move = d3 - c3
Win = 6; All = 12; Move = d3 - d2
Win = 13; All = 18; Move = a3 - b3
Win = 7; All = 13; Move = c4 - c3
Win = 3; All = 8; Move = b4 - b3
Player: White
a3 - b3

With such output in front of you, you can assess whether the algorithm behaves correctly. The first thing to check is whether the "goals" pursued by the algorithm match the player's goals under the rules of the game. During debugging, I ran into a couple of situations where the algorithm was searching for "the wrong thing". If the goals are fine, look at the evaluation of the moves (Win is the number of wins recorded, All is the total number of games played out). Out of the box, UCT works reasonably well, but I made a couple of changes:

  1. Limited the search depth (100 by default)
  2. Added move heuristics
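For context, the standard UCT selection rule that these changes modify looks roughly like this (a textbook sketch using the same win/all statistics as in the log above; not the Dagaz source):

```javascript
// Textbook UCT child selection: exploit the observed win rate, but keep
// exploring rarely-tried moves. win/all mirror the statistics in the log
// above; C is conventionally sqrt(2).
function uctSelect(childs, totalPlayouts, C) {
  var best = null, bestWeight = -Infinity;
  for (var i = 0; i < childs.length; i++) {
    var c = childs[i];
    if (c.all === 0) return c;  // untried moves are taken first
    var w = (c.win / c.all) +
            C * Math.sqrt(Math.log(totalPlayouts) / c.all);
    if (w > bestWeight) {
      bestWeight = w;
      best = c;
    }
  }
  return best;
}
```

The first term rewards moves that have been winning; the second grows for moves tried only a few times, which is what keeps the tree from fixating too early.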

For Kono, the heuristic is as follows
  Dagaz.AI.heuristic = function(ai, design, board, move) {
    var r = 1;
    if (move.actions.length > 0) {
      var pos = move.actions[0][1][0];
      // The destination square is occupied: this is a capture
      if (board.getPiece(pos) !== null) {
        r += 9;
      }
    }
    return r;
  }

That is, captures are 10 times preferable to "quiet" moves. Note that, unlike with the minimax method, sorting the moves by their heuristic is useless in Monte Carlo. Instead, the algorithm must be built so that the probability of choosing a move is proportional to its heuristic value. Both changes had a favorable effect on the win/playout statistics for Kono, and I took on other games.
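Choosing a move with probability proportional to its heuristic is classic roulette-wheel selection; a minimal sketch of my own (not the Dagaz source; `rand` returns a float in [0, 1)):

```javascript
// Roulette-wheel selection: a move's chance of being picked is
// proportional to its heuristic weight. My illustration, not Dagaz code.
function weightedChoice(moves, weights, rand) {
  var total = 0;
  for (var i = 0; i < weights.length; i++) {
    total += weights[i];
  }
  // Land somewhere on the wheel and walk the segments until we pass it
  var r = rand() * total;
  for (var i = 0; i < moves.length; i++) {
    r -= weights[i];
    if (r < 0) return moves[i];
  }
  return moves[moves.length - 1];  // guard against rounding error
}
```

With the Kono weights above (1 for a quiet move, 10 for a capture), a capture is picked ten times as often, yet quiet moves still get their share of playouts.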


There are no captures in this game. To win, you must lead your pieces across the entire board and then remove them from it (removal also counts as a move). You may only move forward, left, or right (orthogonally). The opponent's pieces move perpendicular to yours and get in the way. Locking the opponent's pieces in is not allowed! For all its apparent simplicity, the game is quite deep. Just try playing it against the computer.

The heuristic is fairly obvious
  Dagaz.AI.heuristic = function(ai, design, board, move) {
    var r = 1;
    for (var i = 0; i < move.actions.length; i++) {
      if ((move.actions[i][0] !== null) && (move.actions[i][1] !== null)) {
        // Reward movement in the player's forward direction
        var d = move.actions[i][1][0] - move.actions[i][0][0];
        if ((board.player == 1) && (d < -1)) {
          r += 10;
        }
        if ((board.player == 2) && (d == 1)) {
          r += 10;
        }
      }
      // Removing a piece from the board is also rewarded
      if ((move.actions[i][0] !== null) && (move.actions[i][1] === null)) {
        r += 5;
      }
    }
    return r;
  }

If a piece moves forward rather than sideways, we consider the move preferable.

In terms of debugging, this game revealed how the algorithm behaves on large boards. All the win counts for the moves instantly dropped to zero. The algorithm simply did not have time to find a single win, and as a result it began making the first available moves (by no means the best ones). Increasing the search depth did not help much; three seconds were clearly not enough to accumulate a "critical mass" of games played in the simulation phase. I had to take the heuristic into account when choosing a move:

Changes in UCT
  function UctAi(params) {
    this.params = params;
    ...
    if (_.isUndefined(this.params.UCT_COEFF)) {
      this.params.UCT_COEFF = Math.sqrt(2);
    }
  }

  UctAi.prototype.getMove = function(ctx) {
    this.generate(ctx, ctx.board);
    ...
    var mx = null;
    for (var i = 0; i < ctx.childs.length; i++) {
      var u = 0;
      if (ctx.childs[i].win > 0) {
        u = ctx.childs[i].win / ctx.childs[i].all;
      }
      var h = this.heuristic(ctx.design, ctx.board, ctx.childs[i].move);
      var w = this.params.UCT_WEIGHT * u + (1 - this.params.UCT_WEIGHT) * h;
      if ((mx === null) || (w > mx)) {
        mx = w;
        ctx.result = ctx.childs[i].move;
      }
      console.log("Weight: " + w + "; Win = " + ctx.childs[i].win +
                  "; All = " + ctx.childs[i].all +
                  "; Move = " + ctx.childs[i].move.toString());
    }
    ...
  }

Moves chosen by heuristic are certainly worse than those suggested by UCT, but in games that are not too difficult they let the computer survive to the point where UCT starts to work. The weights are chosen so that the heuristic has a significant influence only in cases where the Monte Carlo statistics do not.

The next game was Kamisado. I have written about it before. I cannot say that the Monte Carlo method works particularly well here. The program makes quite adequate moves, but if you set out to win, it is not at all difficult:


The main reason is that I could not come up with a simple and adequate heuristic for this game. As a result, the program makes its moves "by touch". With enough computing resources this would not be a problem (the Monte Carlo algorithm suits this game perfectly well), but 3 seconds of calculation in a single thread is the most I can afford (incidentally, UCT scales very well across multiple threads).

Of course, the Monte Carlo method, as I have implemented it now, is not suitable for every game, and I am not going to stop here. For games like Chess, minimax may well work better; I will still have a chance to investigate that question. I also have some thoughts on improving the current UCT implementation. We shall see what works best.

Source: https://habr.com/ru/post/330320/
