Programming in the World of Minecraft

Hello, Habr! While everyone is discussing AI in the world of Pac-Man, we will start building our own AI in Minecraft with the Malmo framework from Microsoft Research. There will be a Pac-Man here too. If you like the cubic world, or would like to start studying artificial intelligence, or have children with whom you cannot find a common hobby, or are simply interested in the topic, read on.

In this article I will try to cover several topics:

I'll share my opinion on children's craze for the cubic game
I'll explain the main idea of Malmo
I'll show a few examples with code and give a sense of where to go next
I'll describe the idea and the results of the Malmo Challenge

Minecraft: my backstory

I first came across the game when I was already a student. That did not stop me from putting off all my personal, work, and academic goals that same day and disappearing completely into the cubic universe. It let me go only a month later, but I still gladly spend an hour or so in my beloved world from time to time.
For me, Minecraft was a continuation of Lego, my favorite childhood toy, with its main drawback fixed: the constant shortage of pieces. A Lego analogue with unlimited pieces: what could be better?

I would especially like to note the absence of cruelty in this game. You can kill a zombie or run away from it, or jump off a cliff; no one disputes that. But the lack of blood is a very good thing, as is the sweet depiction of the birth of new life.

Minecraft has a very vague notion of a final goal. Of course, you can level up and kill the dragon, proudly declaring that you have completed the game. But no one does. The main thrill of the Minecraft world is that you can come up with a personal goal every time: explore the world and find a cave with hidden treasure, build your dream home, learn the basics of electricity, or join a server with a friend and build all kinds of traps. The lack of goals is, in my opinion, the game's main advantage. Minecraft gives you huge scope for creativity, with almost no restrictions.

While studying the subject, I accidentally learned that the world of Minecraft is not limited to the game, merchandise, let's plays, and fan art. Entire series are filmed in the game, and, unexpectedly, they are quite popular. I find that funny.

I was very pleased by the news that an open-source framework for programming in the world of Minecraft is available. I firmly believe that in the future the vast majority of occupations may require basic programming skills. A framework based on a favorite game is, in my opinion, a great way to show your child the exciting world of programming.

Malmo: the main idea

The Malmo framework was created by the joint efforts of several researchers whose main goal was to adapt this interesting world for experiments in the field of artificial intelligence. There are still relatively few AI algorithms, and all of them have great potential for more detailed study and improvement. I really like the fact that Microsoft is creating additional motivation to explore the unknown.

Technical points

Installation
Even if you follow the instructions exactly, you may run into a number of problems during installation. Mine were mainly caused by components that were already installed, but in a different version. All of these problems can be solved with the help of a well-known site.

OS and programming language support
Despite the bold claim of support for all three popular operating systems, it seemed to me that proper testing was done only on Windows. Once you have overcome the installation problems, your headaches on Windows should end. On Linux the problems are likely to continue, because the running server periodically crashes without giving a reason. If you continue my experiments, be sure to write about your experience in the comments.

The authors tried to support a large number of popular languages and made bindings for C#, C++, Lua, Python 2 and Java. I chose Python.

How to run a Malmo program

The main process is as follows: in one window you need to start the server and the client. There is a script for this, ./Minecraft/launchClient.* . Once the server is up, you can run the code with the main character-control logic in another window. How do you know that the server is up? Everything is extremely logical: you will see a running instance of Minecraft showing its initial menu, and the terminal will proudly display Building 95%.

You can run as many launchClient instances as you like. In that case, the first instance launched will be the server as well as a client representing one character. All subsequent instances will connect to the already running server, each adding an extra character to the world.

You can implement the logic for each of the characters in code, and you can also control a character yourself with the familiar WASD keys.

In addition to the server with the client and the file with the logic, we also have an XML file describing the initial state of the world. The authors do not insist on a separate file, and in their examples they often put the XML in a string and store it in the code. In my opinion, though, it is more convenient to make it a separate file right away and add pieces to it as needed.
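Keeping the mission in a separate file means the script has to read it back in. A minimal sketch of that step is below; the function name load_mission_xml and the file name mission.xml are my own illustrative choices, not part of Malmo. The sketch only checks that the file is well-formed XML; in the Malmo Python examples the resulting string is then passed to MalmoPython.MissionSpec, which is shown here only as a comment.

```python
import xml.etree.ElementTree as ET


def load_mission_xml(path):
    """Read a mission description from disk and check that it is well-formed XML.

    load_mission_xml is a hypothetical helper, not a Malmo function.
    """
    with open(path, "rb") as mission_file:
        raw = mission_file.read()
    # Raises xml.etree.ElementTree.ParseError if the file is malformed,
    # so typos are caught before the mission is sent to Minecraft.
    ET.fromstring(raw)
    return raw.decode("utf-8")


# Assumed usage with Malmo installed (not executed here):
# mission_xml = load_mission_xml("mission.xml")
# my_mission = MalmoPython.MissionSpec(mission_xml, True)
```

Checking the XML up front is optional, but it gives a readable Python error instead of a failure somewhere inside the mission start-up.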

The authors took care of us and wrote an impressive number of examples, each with a description.

My advice: do not try to start from scratch; take the first example as a base. Nothing happens in it: we simply create the simplest flat world and add the character to it. In the while loop at the end you can add whatever action you like. For example, write this there:

 agent_host.sendCommand("move 1")

And enjoy your hero's first steps. Note that the so-called ContinuousMovementCommands are used by default. Think of the commands given to the character as changing the position of a lever. By saying "move 1", you do not take a single step: you keep running until you give the command "move 0". In practice, code like this does not move the character at all:

 agent_host.sendCommand("move 1")
 agent_host.sendCommand("move 0")

The commands are executed within a split second of each other. Do not forget to insert "time.sleep(X)" lines between them. I am sure you know where to find information about the other commands (although, in my experience, it is easier to skim the tutorial and then look for what you need in the source).
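The lever analogy can be wrapped in a small helper: push the lever, wait, release it. This is only a sketch; move_for is a name I made up, and the only Malmo call it relies on is agent_host.sendCommand, which already appears in the snippets above.

```python
import time


def move_for(agent_host, seconds, speed=1):
    """Run forward for roughly `seconds`, then stop.

    move_for is a hypothetical helper, not part of Malmo.
    `agent_host` is any object with a sendCommand(str) method.
    """
    agent_host.sendCommand("move {}".format(speed))  # push the lever forward
    time.sleep(seconds)                              # let the character run
    agent_host.sendCommand("move 0")                 # release the lever
```

For example, move_for(agent_host, 2) should make the character run for about two seconds under continuous movement commands. The distance covered depends on the game, not on this code.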

In the XML file you can set the game mode:

 <AgentSection mode="Survival"> <!-- or "Creative" -->

You can also set the starting time and the character's position, and customize the world: make it flat or close to reality.

This code will draw a Pac-Man that eats balls and heads toward a rainbow crater:

 <FlatWorldGenerator generatorString="3;7,44*49,73,35:1,159:4,95:13,35:13,159:11,95:10,159:14,159:6,35:6,95:6;12;lake,lava_lake" />
 <DrawingDecorator>
   <DrawSphere x="-60" y="70" z="0" radius="30" type="air"/>
   <DrawSphere x="-60" y="80" z="30" radius="10" type="wool" colour="YELLOW"/>
   <DrawCuboid x1="-50" y1="80" z1="30" x2="-70" y2="70" z2="20" type="air"/>
   <DrawSphere x="-60" y="75" z="25" radius="2" type="wool" colour="WHITE"/>
   <DrawSphere x="-60" y="68" z="18" radius="2" type="wool" colour="MAGENTA"/>
   <DrawSphere x="-60" y="61" z="11" radius="2" type="wool" colour="PURPLE"/>
   <DrawSphere x="-60" y="54" z="4" radius="2" type="wool" colour="PINK"/>
 </DrawingDecorator>

Finally, in the XML you can specify the coordinates of the blocks the character should observe:

 <ObservationFromGrid>
   <Grid name="floor3x3">
     <min x="-1" y="0" z="-1" />
     <max x="1" y="0" z="1" />
   </Grid>
 </ObservationFromGrid>

By default, we cannot look around and get information about the nearest blocks. However, we can declare that we want to know what is around us. Note that in this case we need to use relative coordinates, measured from the block that contains the hero's feet. As a result of this line:

 grid = observations.get(u'floor3x3', 0)

we get an array of strings. Each string is a textual representation of the type of one of the blocks.

 floor3x3: ['lava', 'obsidian', 'obsidian', 'lava', 'obsidian', 'obsidian', 'lava', 'obsidian', 'obsidian']

In this way you can create an AI that explores the world, searches for something, and does not die for silly reasons. I implemented the simplest version, without machine learning, here.
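To use such a grid for "not dying for silly reasons", the flat list has to be mapped back to positions around the character. The sketch below does that for the 3x3 grid named floor3x3. It assumes the list is ordered with x changing fastest and z next, which is how I understand Malmo fills the grid; check this against your version before relying on it. The function name lava_neighbours is hypothetical. In the Malmo Python examples the JSON text comes from world_state.observations[-1].text.

```python
import json


def lava_neighbours(observation_text, grid_name="floor3x3"):
    """Return (dx, dz) offsets, relative to the agent, of lava blocks in a 3x3 grid.

    lava_neighbours is a hypothetical helper, not part of Malmo.
    Assumption: the grid list is ordered with x varying fastest, then z.
    """
    observations = json.loads(observation_text)
    grid = observations.get(grid_name, [])
    offsets = []
    for index, block_type in enumerate(grid):
        dx = index % 3 - 1   # -1, 0, 1 across a row (assumed x direction)
        dz = index // 3 - 1  # -1, 0, 1 from row to row (assumed z direction)
        if block_type == "lava":
            offsets.append((dx, dz))
    return offsets


# The observation from the article: lava along one side of the agent.
sample = json.dumps({"floor3x3": ["lava", "obsidian", "obsidian",
                                  "lava", "obsidian", "obsidian",
                                  "lava", "obsidian", "obsidian"]})
print(lava_neighbours(sample))
```

A simple hand-written agent can then refuse any move whose target offset appears in the returned list.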

Features for AI

Of course, the first thing I wanted for implementing AI algorithms in Malmo was the ability to move discretely. AI is complicated enough as it is, and I did not want to add constant adjustment of direction and speed on top of everything else.
We enable discrete movement in the XML like this:

 <DiscreteMovementCommands/>

Unfortunately, this will not be enough. To move discretely, your starting position must be strictly in the center of the cube:

 <Placement x="4.5" y="46.0" z="1.5" yaw="0"/> <!-- y is the height; x and z are the horizontal plane -->

Integer coordinates will put you at the corner where blocks meet; the character will refuse to move from the spot, and you will not see any warnings or errors. The tutorial does not warn about this either. It took me about 4 hours to understand the problem and make the x and z coordinates half-integers. (y is responsible for the height and plays no role in this story.)
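The half-integer rule is easy to automate. The sketch below converts integer block coordinates into the centre of that block and formats a Placement element like the one above. Both function names are my own and are not part of Malmo.

```python
def block_centre(block_x, block_y, block_z):
    """Return the (x, y, z) position at the centre of a block's footprint.

    Hypothetical helper: x and z are shifted by 0.5, y (height) is unchanged.
    """
    return (block_x + 0.5, float(block_y), block_z + 0.5)


def placement_tag(block_x, block_y, block_z, yaw=0):
    """Format a Placement element that starts the agent in the middle of a block."""
    x, y, z = block_centre(block_x, block_y, block_z)
    return '<Placement x="{}" y="{}" z="{}" yaw="{}"/>'.format(x, y, z, yaw)


print(placement_tag(4, 46, 1))
```

For the block at (4, 46, 1) this produces the same placement as in the snippet above.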

In addition, the researchers added some nice features for reinforcement learning tasks. Algorithms of this type constantly reward or punish the artificial intelligence for certain actions. The developers thought about this and made it possible to register these actions and events in the XML, which spares your code from repeating the same checks. You can also end the game when a certain event occurs:

 <RewardForTouchingBlockType>
   <Block reward="-100.0" type="lava" behaviour="onceOnly"/>
   <Block reward="100.0" type="lapis_block" behaviour="onceOnly"/>
 </RewardForTouchingBlockType>
 <RewardForSendingCommand reward="-1" />
 <AgentQuitFromTouchingBlockType>
   <Block type="lava" />
   <Block type="lapis_block" />
 </AgentQuitFromTouchingBlockType>

For example, here we punish the character slightly for every step that does not end in victory, reward it heavily for winning and punish it heavily for dying, and end the round on either death or victory.
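On the Python side, the rewards registered in the XML only need to be added up. The sketch below sums whatever rewards arrived in one world state; it assumes each entry in world_state.rewards has a getValue() method, as I remember from the Malmo Python examples. The name collect_reward and the stand-in classes used for the demonstration are hypothetical.

```python
def collect_reward(world_state):
    """Sum the rewards delivered in one world state.

    collect_reward is a hypothetical helper. Assumption: each item in
    world_state.rewards exposes getValue(), as in the Malmo Python examples.
    """
    return sum(reward.getValue() for reward in world_state.rewards)


# Stand-ins so the sketch runs without Minecraft (not Malmo classes).
class FakeReward(object):
    def __init__(self, value):
        self.value = value

    def getValue(self):
        return self.value


class FakeWorldState(object):
    def __init__(self, values):
        self.rewards = [FakeReward(v) for v in values]


# Three commands at -1 each, the last of which reaches the lapis block (+100).
episode = FakeWorldState([-1, -1, -1, 100])
print(collect_reward(episode))
```

With the XML above, a shorter path to the lapis block gives a higher total, which is exactly the pressure the per-command penalty is meant to create.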

Malmo: Conclusion

The authors of the framework gave us an amazing opportunity to dive into a beloved world from another side. Malmo is still in beta, and in many situations it... makes you improve your troubleshooting skills. Nevertheless, its advantages outweigh its disadvantages, and because the source code is openly available on GitHub, we can fix what we need ourselves or create an issue to get critical bugs fixed.

For obvious reasons, the authors of the project do not mention in any of their articles the possibility of teaching children with the framework: a child is unlikely to cope with small but frequent bugs. Nevertheless, I am sure that if a parent helps the child and programs together with them, it will give excellent results and let you spend time together productively.

Malmo Challenge: History and Results

In addition to the framework itself, Microsoft also held a competition based on the platform, called the Malmo Challenge. It was intended to encourage scientists and researchers to work on collaborative algorithms. The competition started about six months ago, and the results appeared on June 5th.

The essence of the challenge is as follows: we have a flat world with a fence of complex shape; inside the pen a pig runs around and 2 people walk. Our task is to create an AI for one of the characters that can cooperate with the second one, so that together they drive the pig into an enclosed space. The second character can behave randomly, or be controlled by a person or by another AI; it can even be a second copy of your own AI.

You can earn the maximum number of points by catching the pig, or a small number of points by jumping into a puddle at the side. You get nothing if your partner decides to jump into a puddle and refuses to cooperate with you.

In its general form this task is known as the Stag Hunt. It was formulated in the 18th century by Jean-Jacques Rousseau. Despite the impressive age of the problem, it is still unclear which algorithm solves it most effectively.
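The dilemma becomes clearer with numbers. The sketch below encodes the payoff structure described above and picks the better action given how likely the partner is to cooperate. The point values (25, 5 and 0) are made up purely for illustration and are not the scores used in the challenge; all names are hypothetical.

```python
# Illustrative payoffs only: PAYOFF[my_action][partner_action] -> my points.
PAYOFF = {
    "hunt": {"hunt": 25, "exit": 0},  # catching the pig needs both players
    "exit": {"hunt": 5, "exit": 5},   # the puddle gives a small, safe reward
}


def expected_payoff(action, p_partner_hunts):
    """Expected points for `action` if the partner hunts with the given probability."""
    return (p_partner_hunts * PAYOFF[action]["hunt"]
            + (1.0 - p_partner_hunts) * PAYOFF[action]["exit"])


def best_action(p_partner_hunts):
    """Choose the action with the higher expected payoff."""
    return max(PAYOFF, key=lambda action: expected_payoff(action, p_partner_hunts))


print(best_action(0.5))  # partner cooperates half of the time
print(best_action(0.1))  # partner rarely cooperates
```

With these illustrative numbers, hunting pays off only when you believe the partner cooperates often enough, which is why working out what kind of partner you have matters so much.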

I am pleased to share the results of the competition with you. I was very surprised by how the places in the standings were distributed.

First place went to a team from the UK. The authors soberly assessed the severe lack of time and realized that they were unlikely to manage to adapt complex existing algorithms to the task. They chose Bayesian inference to determine the type of partner, and Markov chains for the gameplay itself. And they won.

The runners-up decided to take the most complex of the existing solutions: they used DNNs, reinforcement learning, DQN, the A3C model... And none of this helped them beat Bayes and Markov chains.
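To give a feel for what "Bayesian inference to determine the type of partner" can mean, here is a minimal sketch of a single Bayes update. It is not the winning team's code: the partner types, the observations, and all probabilities are invented for illustration.

```python
def update_belief(prior, likelihoods, observation):
    """One step of Bayes' rule: posterior is proportional to prior times likelihood.

    prior:       {partner_type: probability}
    likelihoods: {partner_type: {observation: probability of seeing it}}
    All names and numbers here are hypothetical.
    """
    unnormalised = {
        partner_type: prior[partner_type] * likelihoods[partner_type][observation]
        for partner_type in prior
    }
    total = sum(unnormalised.values())
    if total == 0:
        return dict(prior)  # observation impossible under every type: keep the prior
    return {t: value / total for t, value in unnormalised.items()}


# Invented model: a cooperative partner usually steps towards the pig,
# a random partner does so only half of the time.
LIKELIHOODS = {
    "cooperative": {"towards_pig": 0.9, "away_from_pig": 0.1},
    "random": {"towards_pig": 0.5, "away_from_pig": 0.5},
}

belief = {"cooperative": 0.5, "random": 0.5}
for seen in ["towards_pig", "towards_pig", "towards_pig"]:
    belief = update_belief(belief, LIKELIHOODS, seen)
print(belief)
```

Each step towards the pig shifts the belief towards the cooperative type. An agent could then decide whether to chase the pig or give up based on that belief.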

I will sum up the article with this thought: keep it simple.

If you also want to try creating your own AI, join our Russian-language chat about neural networks on Telegram. There you can ask the questions that interest you and share your achievements.

A video of my talk about Malmo at a St. Petersburg Python meetup is already available on my YouTube channel. You will also find recordings of my other lectures and other IT chatter there.

Source: https://habr.com/ru/post/331034/