📜 ⬆️ ⬇️

The minimax algorithm on the example of the game "Collect 4 (connect4)"

The implementation of the minimax algorithm on the example of the game “Collect 4” is a very exciting activity and, in connection with this, there was a desire to tell someone else about this hobby, which he did. The game is available at this address .

The field of the game can be varied by setting constants, I took 7 to 6 as in the example of the link. The meaning of the game is to, making moves, to build 4 of their figures (crosses or zeros) in one row: horizontally, vertically, diagonally. To create a game (apparently any), two things are needed: a move generator and a move (position) analyzer.

The rules of the game are such that it is not possible to install a figure to any place on the board, but only to the bottom, and the field above the field occupied by the figure (any) is also considered the bottom. Board presented in the form of a one-dimensional array
TicTac field[NSIZE_I*NSIZE_J]; 
the structures themselves may be
 typedef enum {Tic, Tac, EMPTY} TicTac; 
, the game will need a lot of procedures (relatively of course), the code validation procedure implemented as
 int validstep(const TicTac *field, int step){ return ((field[step]==EMPTY)&&((step + NSIZE_J >= NSIZE_I*NSIZE_J)||((step + NSIZE_J < NSIZE_I*NSIZE_J)&&(field[step + NSIZE_J] != EMPTY))));} 
, the generator of moves was performed in the simplest and non-optimal way, but the least brainwashing - by simply looking at the equations of verticals, horizontals and diagonals, it turned out quite voluminous, but nothing complicated

 int c4getrait(const TicTac *field, TicTac Me){ int i, j, k, score=0, maxscore=-1; /* X X X */ for(i=0; i<NSIZE_I; i++){ score=0; for(j=0; j<NSIZE_J; j++){ //printf( "i=%d; j=%d\n", i, j ); if(field[i*NSIZE_J + j] == Me) score++; else { maxscore = (score > maxscore)?score:maxscore; score = 0; if(maxscore == WIN) return maxscore; } } maxscore = (score > maxscore)?score:maxscore; } /* XXX */ for(j=0; j<NSIZE_J; j++){ score=0; for(i=0; i<NSIZE_I; i++){ if(field[i*NSIZE_J + j] == Me) score++; else { maxscore = (score > maxscore)?score:maxscore; score = 0; if(maxscore == WIN) return maxscore; } } maxscore = (score > maxscore)?score:maxscore; } /*main up diag XXX XX X */ for(k=0; k<NSIZE_J; k++){ score=0; for(i=0, j=NSIZE_J-k-1; i<NSIZE_I&&j>=0; i++, j--){ //printf( "i=%d; j=%d\n", i, j ); if(field[i*NSIZE_J + j] == Me) score++; else { maxscore = (score > maxscore)?score:maxscore; score = 0; if(maxscore == WIN) return maxscore; } } //printf( "\n" ); maxscore = (score > maxscore)?score:maxscore; }... 
.

Of course the most important and difficult procedure in the program is the minimax algorithm itself. The essence of the algorithm is to find the optimal move, and, the assessment is developed as a set of assessments of the moves of his and his opponent. The procedure is recursive. At each step, we look for an assessment of our progress and an assessment of the opponent’s best move. For example: we estimate our move at 2 points, the opponent’s answer at 10 points, total = 2 - 10 = -8 - this is the assessment of our move, recursively making our way down the options tree, we estimate the position to a depth of N. This game, although it is enough simple, but going through all the options, and they are factorial 42 for an empty field, an impossible task on a personal computer (apparently only on a super computer). The inability to calculate until it stops makes it necessary to use heuristics - position calculation. I will give a minimax procedure:
')
 Step game(TicTac *field, int deep, WHO it, TicTac t){ int i=0; float rait, koeff = 1 - Koeff[it]*deep; Step s, r; s.step = -1; s.rait = -1000.; if(deep > DEEPMAX){ s.rait = 0.; return s; } for(i=0; i<NSIZE_I*NSIZE_J; i++){ if( validstep(field, i) ){ field[i] = t; rait = c4getrait(field, t); if(rait >= WIN){ field[i] = EMPTY; s.rait = 10.*koeff; s.step = i; return s; } else if(!isstep(field)){ rait = 0.; } else { r = game(field, deep+1, it, (t==Tic)?Tac:Tic); rait-=r.rait; } if(rait > s.rait){ s.rait = rait; s.step = i; } field[i] = EMPTY; } } s.rait = s.rait*koeff; return s; } 
The procedure iterates through all positions in the cycle.
 for(i=0; i<NSIZE_I*NSIZE_J; i++){... 
make another move
 field[i] = t; 
looking for evaluation
 rait = c4getrait(field, t); 
if we won
 if(rait >= WIN){ 
, it makes no sense to look deeper - no, the recursion stops, if there are no moves (a draw) - we return 0 (maybe not optimal), otherwise
 r = game(field, deep+1, it, (t==Tic)?Tac:Tic); rait-=r.rait; 
we estimate the opponent's reaction to our move and calculate the rating, then we select the maximum rating and return the position
 if(rait > s.rait){ s.rait = rait; s.step = i; } field[i] = EMPTY; 
, return the result
 s.rait = s.rait*koeff; return s; 
.

For a correct estimate, you must enter the coefficients (my theory is pure :). This is due to the fact that winnings at different depths are not equivalent concepts (and, in general, an estimate), because the value is higher, the smaller the search depth. Explanation: for example, at move A, you get a loss at a depth of 5, and at move B - at a depth of 2. It is clear that 2 will happen earlier than 5 :), and therefore, in this situation, move 5 is preferable for you, this is not the case only victories / defeats, but also simply assessments of the position. In this regard, it is necessary to form coefficients on the principle, the deeper, the less importance we attach to the move. But, this is where the difficulties begin :) It is necessary that the algorithm correctly weighs the positions, despite the contradiction - to think deeper, but to underestimate the score, the greater the depth. Here, apparently, strict mathematics is possible, but I have no special training. But it is possible and so obvious: in the cycle to hold the party between the programs with a gradual change in the coefficients and recording in the log. Then the log can find the best options and write off the coefficients. For depth of search 6 - I got 0.2. Approximately at this stage I finish, I see, although, in the course, another improvement is to vary the depth depending on the number of options, it is clear that the number of options in this game decreases proportionally to the occupied fields ... Now the algorithm easily wins the average player: ) With the program on the link above loses, if he walks second, first - draws on the maximum difficulty. The program was written 2 days along with debugging and configuration. Thanks for attention.

Source: https://habr.com/ru/post/224653/


All Articles