Learning Artificial Intelligence to play the game

Good day, dear reader!

In this article, we will develop a neural network that will be able to play a game specially created for it at a good level.

')
Note: this article does not explain the term " neural network " and everything connected with it, nor does it provide basic information about network training using the tracing method . We recommend briefly reading these concepts before reading the article.

Start: game rules description

The goal is to catch as many circles with a green rim as possible, avoiding circles with red.

Conditions:

Red circles flies onto the field twice as many as green ones;
The green circles remain within the field, the red ones fly out and are eliminated;
Green and red circles can collide and repel with their own kind;
Player - yellow ball on the screen, can move within the field.

After contact, the ball disappears, and the player receives the appropriate points.

Next: design AI

The receptors

In order for the AI to be able to decide where to move the player, you need to first obtain environmental data. To do this, create special scanners , which are straight lines. The first scanner is located at an angle of 180 degrees with respect to the lower boundary of the field.

There will be 68 of them: the first 32 react to bombs (red circles), the next 32 react to apples (green circles), and the last 4 receive data on the finding of field borders relative to the player. Let us call these 68 scanners the input neurons of the future neural network (receptor layer).

The length of the first 64 scanners is 1000px, the remaining 4 correspond to dividing the field in half according to the corresponding symmetry face

Sector (1/4) of AI scope

In the picture: input neurons receiving values from scanners.
Values on scanners are normalized, i.e. is brought to the range [0, 1], and the closer the object crossing the scanner is to the player, the greater the value transmitted to the scanner.

Algorithm for data acquisition by scanners and its implementation on JS

So, the scanner is straight. We have the coordinates of one of its points (player's coordinates) and the angle relative to the OX axis; we can get the second point using the trigonometric functions sin and cos.

From here we obtain the directing vector of the line, and therefore we can construct the canonical form of the equation of this line.

To get the value on the scanner, you need to see if some ball intersects this line, which means you need to present the equation of the line in a parametric form and substitute all this into the equations of a circle, successively for each circle for the field.

If we bring this substitution to a general form, we get a parametric equation for a, b and c, where these variables are the coefficients of the quadratic equation, starting with a square, respectively.

We offer the reader to familiarize yourself with this algorithm in more detail by using the simplest definitions of linear algebra
Below is the scanner code.

Ball.scaners: { // length: 1000, i: [], //   get_data: function(){ //Ball.scaners.get_data /     var angl = 0; var x0 = Ball.x, var y0 = Ball.y; var l = Ball.scaners.length; for(let k = 0; k < 32; k++){ x1 = l*Math.cos(angl), y1 = l*Math.sin(angl); Ball.scaners.i[k] = 0; for(i = 0; i < bombs.length; i++){ if(((k >= 0) && (k <= 8) && (bombs[i].x < x0) && (bombs[i].y < y0)) || ((k >= 8) && (k <= 16) && (bombs[i].x > x0) && (bombs[i].y < y0)) || ((k >= 16) && (k <= 24) && (bombs[i].x > x0) && (bombs[i].y > y0)) || ((k >= 24) && (k <= 32) && (bombs[i].x < x0) && (bombs[i].y > y0))){ //    var x2 = bombs[i].x, y2 = bombs[i].y; var p = true; //    var pt = true; //     var t1, t2; var a = x1*x1 + y1*y1, b = 2*(x1*(x0 - x2) + y1*(y0 - y2)), c = (x0 - x2)*(x0 - x2) + (y0 - y2)*(y0 - y2) - bombs[i].r*bombs[i].r; //------------------------------    if((a == 0) && (b != 0)){ t = -c/b; pt = false; } if((a == 0) && (b == 0)){ p = false; } if((a != 0) && (b != 0) && (c == 0)){ t1 = 0; t2 = b/a; } if((a != 0) && (b == 0) && (c == 0)){ t1 = 0; pt = false; } if((a != 0) && (b == 0) && (c != 0)){ t1 = Math.sqrt(c/a); t2 = -Math.sqrt(c/a); } if((a != 0) && (b != 0) && (c != 0)){ var d = b*b - 4*a*c; if(d > 0){ t1 = (-b + Math.sqrt(d))/(2*a); t2 = (-b - Math.sqrt(d))/(2*a); } if(d == 0){ t1 = -b/(2*a); } if(d < 0){ p = false; } } //----------------------------------- if(p == true){ if(pt == true){ let x = t1*x1 + x0; let y = t1*y1 + y0; let l1 = Math.pow((x - Ball.x), 2)+Math.pow((y - Ball.y), 2); x = t2*x1 + x0; y = t2*y1 + y0; let l2 = Math.pow((x - Ball.x), 2)+Math.pow((y - Ball.y), 2); if(l1 <= l2){ Ball.scaners.i[k] += 1 - l1/(l*l); }else{ Ball.scaners.i[k] += 1 - l2/(l*l); } }else{ let x = t1*x1 + x0; let y = t1*y1 + y0; Ball.scaners.i[k] += 1 - (Math.pow((x - Ball.x), 2)+Math.pow((y - Ball.y), 2))/(l*l); } }else{ Ball.scaners.i[k] += 0; } }else{ continue; } } angl += Math.PI/16; } //!---------------  for(k = 32; k < 64; k++){ x1 = l*Math.cos(angl), y1 = l*Math.sin(angl); Ball.scaners.i[k] = 0; for(i = 0; i < apples.length; i++){ if(((k >= 32) && (k <= 40) && (apples[i].x < x0) && (apples[i].y < y0)) || ((k >= 40) && (k <= 48) && (apples[i].x > x0) && (apples[i].y < y0)) || ((k >= 48) && (k <= 56) && (apples[i].x > x0) && (apples[i].y > y0)) || ((k >= 56) && (k <= 64) && (apples[i].x < x0) && (apples[i].y > y0))){ var x2 = apples[i].x, var y2 = apples[i].y; var p = true; //    var pt = true; //     var t1, t2; var a = x1*x1 + y1*y1, b = 2*(x1*(x0 - x2) + y1*(y0 - y2)), c = (x0 - x2)*(x0 - x2) + (y0 - y2)*(y0 - y2) - apples[i].r*apples[i].r; //------------------------------    if((a == 0) && (b != 0)){ t = -c/b; pt = false; } if((a == 0) && (b == 0)){ p = false; } if((a != 0) && (b != 0) && (c == 0)){ t1 = 0; t2 = b/a; } if((a != 0) && (b == 0) && (c == 0)){ t1 = 0; pt = false; } if((a != 0) && (b == 0) && (c != 0)){ t1 = Math.sqrt(c/a); t2 = -Math.sqrt(c/a); } if((a != 0) && (b != 0) && (c != 0)){ var d = b*b - 4*a*c; if(d > 0){ t1 = (-b + Math.sqrt(d))/(2*a); t2 = (-b - Math.sqrt(d))/(2*a); } if(d == 0){ t1 = -b/(2*a); } if(d < 0){ p = false; } } //----------------------------------- if(p == true){ if(pt == true){ let x = t1*x1 + x0; let y = t1*y1 + y0; let l1 = Math.pow((x - Ball.x), 2)+Math.pow((y - Ball.y), 2); x = t2*x1 + x0; y = t2*y1 + y0; let l2 = Math.pow((x - Ball.x), 2)+Math.pow((y - Ball.y), 2); if(l1 <= l2){ Ball.scaners.i[k] += 1 - l1/(l*l); }else{ Ball.scaners.i[k] += 1 - l2/(l*l); } }else{ let x = t1*x1 + x0; let y = t1*y1 + y0; Ball.scaners.i[k] += 1 - (Math.pow((x - Ball.x), 2)+Math.pow((y - Ball.y), 2))/(l*l); } }else{ Ball.scaners.i[k] += 0; } }else{ continue; } } angl += Math.PI/16; } Ball.scaners.i[64] = (1000 - Ball.x) / 1000; //  Ball.scaners.i[65] = Ball.x / 1000; //  Ball.scaners.i[66] = (500 - Ball.y) / 500; //  Ball.scaners.i[67] = Ball.y / 500; //  } }

Neural Network Design

So, for learning the NN, it was customary to use the trace algorithm (reference paths).
Note: in order to simplify the learning process matrix construction is omitted

Let's start:

At the output of the network, we represent 8 neurons responsible for directions: the first is left, the second is left + up, the third is up, the fourth is up + right, the fifth is right, and so on;
Let a positive value on a neuron means that you should move in the direction that is assigned to it, and the negative one is the opposite;
It is important for us to somehow combine the values from scanners that are at one angle relative to the lower edge of the field. Since if the output is negative, the AI will repel from this direction, we introduce an additional layer (hidden between the input and the output), which will add the values of neurons reflecting the corresponding scanners in pairs, and we will take the values from the red neurons with a "-";
Obviously, the movement to the left is mainly influenced by the information to the left of the player (not only, but this will be taken into account later (*)) - it means that we connect the 8 hidden neurons on the left with the neuron responsible for the movement to the left. We establish connections in the following way: a neuron corresponding to a scanner located parallel to the field boundary is assigned a weight of 1, then a neighboring two neurons are assigned a weight of 0.95, adjacent via a single weight of 0.9, and finally an extreme weight of 0.8 in this sector;
Neurons of borders are also taken into account: we make connections to those outputs from those border neurons whose boundaries can influence movement.
It does the same with the remaining seven neurons of the output layer.

By completing the above algorithm, we get the neural network shown in the figure below.

* But that is not all. It is important to correctly interpret the result, namely (Y is an array of output neurons):

  Ball.vx = -(Y[0] - Y[4]) + (-(Y[1] - Y[5]) + (Y[3] - Y[6]))*0.5; Ball.vy = -(Y[2] - Y[6]) + (-(Y[3] - Y[7]) + (Y[5] - Y[1]))*0.5;

Thus, we created and automatically trained the neural network that underlies the player's AI on the gif at the top of the article.

News code is available (implementation of the game and AI) at the link: Link to githab Ivan753 / Learning

That's all. Thanks for attention!

Source: https://habr.com/ru/post/418895/

All Articles

Learning Artificial Intelligence to play the game

Start: game rules description

Next: design AI

The receptors

Neural Network Design

More articles: