
OpenAI Universe. Open platform for training strong AI


A set of reinforcement learning tasks for training strong AI on OpenAI's universal platform

Founded by Elon Musk and colleagues, the nonprofit organization OpenAI, which aims to create safe (that is, publicly available and open) artificial intelligence, has taken another step toward that goal. OpenAI has introduced Universe, middleware for training strong AI. In theory, training can draw on all of humanity's information that is accessible through the Internet: games, websites, and other applications.

Just nine lines of code give your AI access to thousands of training environments.

With the Universe software platform, an intelligent agent uses the computer the same way a person does: it looks at the pixels on the screen and interacts through a keyboard and mouse (virtual ones, in this case).

The AI explores the world through a VNC remote desktop interface

The intent is to train an intelligent agent on the full range of tasks. The Universe platform opens up to AI any task that a person can perform at a computer.

OpenAI Gym Environments


The release of the universal platform continues OpenAI's planned effort to create an open, general-purpose AI available worldwide. In April of this year, the organization released a public beta of OpenAI Gym, a toolkit for developing and comparing reinforcement learning algorithms. The OpenAI "gym" consists of a large number of environments (from a humanoid robot simulator to Atari games). There is also a site for comparing and reproducing results.

OpenAI Gym is compatible with algorithms written in any framework, including TensorFlow and Theano. For now, environments are written in Python, but the developers plan to make it possible to implement them in any programming language.

OpenAI believes that reinforcement learning is an important branch of machine learning that will greatly advance AI. In this method, the system being trained (the agent) learns by interacting with an environment. Unlike traditional supervised learning, the reinforcement signals come as a response to the AI's own decisions, and some of the reinforcement rules are formed dynamically and are hard for a person to interpret, that is, they are based on the simultaneous activity of formal neurons.
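The agent-environment interaction described above can be illustrated with a very small sketch. It does not use Gym or Universe: the `TwoArmedBandit` environment, the `train` function, and all parameter values are hypothetical and chosen only for illustration. The agent picks one of two actions, receives a reward from the environment, and updates a running estimate of each action's value (an epsilon-greedy strategy).

```python
import random


class TwoArmedBandit:
    """Toy environment (hypothetical): each action pays 1.0 with a fixed probability."""

    def __init__(self, win_probs, rng):
        self.win_probs = win_probs
        self.rng = rng

    def step(self, action):
        # The reward is the environment's response to the agent's decision.
        return 1.0 if self.rng.random() < self.win_probs[action] else 0.0


def train(env, n_actions, steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent: mostly exploit the best estimate, sometimes explore."""
    rng = random.Random(seed)
    value = [0.0] * n_actions  # running average reward per action
    count = [0] * n_actions    # how many times each action was tried
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(n_actions)  # explore
        else:
            action = max(range(n_actions), key=lambda a: value[a])  # exploit
        reward = env.step(action)
        count[action] += 1
        # Incremental mean update of the estimated value of this action.
        value[action] += (reward - value[action]) / count[action]
    return value


env = TwoArmedBandit([0.2, 0.8], random.Random(1))
values = train(env, n_actions=2)
best_action = max(range(2), key=lambda a: values[a])
print("estimated values:", values, "best action:", best_action)
```

No labeled examples are involved here: the agent discovers which action is better purely from the rewards that follow its own choices, which is the key difference from supervised learning.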


Reinforcement signal recognized by OCR at 60 fps: video

OpenAI Universe Software


Universe, introduced today, is middleware that fully supports the OpenAI Gym toolkit and its runtime environment. With this middleware, OpenAI plans to dramatically increase the number of environments available for AI training.

Previously, the largest catalog of applications for reinforcement learning included only 55 Atari games (the Arcade Learning Environment). On the Universe platform, games from many other developers, including Valve, EA, and Microsoft, are expected to appear.

From the very start, thousands of games (Flash games, the multiplayer snake game Slither, StarCraft, GTA V, and others), various browser-based tasks (such as filling out forms), and applications (such as fold.it puzzles) are available through the Universe middleware. Virtually any game can be launched freely using the Python library universe, which is openly published on GitHub.

```python
import gym
import universe  # register Universe environments into Gym

env = gym.make('flashgames.DuskDrive-v0')  # any Universe environment ID here
observation_n = env.reset()

while True:
    # agent which presses the Up arrow 60 times per second
    action_n = [[('KeyEvent', 'ArrowUp', True)] for _ in observation_n]
    observation_n, reward_n, done_n, info = env.step(action_n)
    env.render()
```

The code above starts an AI agent that plays the game Dusk Drive.
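The `_n` suffixes in the snippet suggest that each value is a list with one entry per environment instance. The sketch below shows how such a loop might accumulate rewards and stop at the end of an episode. Running the real code requires the universe library and its runtime, so the sketch uses `StubVectorEnv`, a hypothetical stand-in that only mimics the shape of the values in the article's loop. It is not the universe API, and its reward rule is invented for the example.

```python
class StubVectorEnv:
    """Hypothetical stand-in that mimics the list-per-environment shape
    of the values in the article's loop. It is NOT the universe API."""

    def __init__(self, n, episode_length=5):
        self.n = n
        self.episode_length = episode_length
        self.t = 0

    def reset(self):
        self.t = 0
        return [{"frame": 0} for _ in range(self.n)]

    def step(self, action_n):
        assert len(action_n) == self.n, "one action list per environment"
        self.t += 1
        # Invented rule: reward 1.0 wherever the agent holds ArrowUp down.
        reward_n = [
            1.0 if ('KeyEvent', 'ArrowUp', True) in events else 0.0
            for events in action_n
        ]
        observation_n = [{"frame": self.t} for _ in range(self.n)]
        done_n = [self.t >= self.episode_length] * self.n
        return observation_n, reward_n, done_n, {}


def run_episode(env):
    """Run until every environment reports done; return total reward per environment."""
    observation_n = env.reset()
    total_n = [0.0] * len(observation_n)
    done_n = [False] * len(observation_n)
    while not all(done_n):
        # Same policy as in the article: always press the Up arrow.
        action_n = [[('KeyEvent', 'ArrowUp', True)] for _ in observation_n]
        observation_n, reward_n, done_n, info = env.step(action_n)
        total_n = [total + r for total, r in zip(total_n, reward_n)]
    return total_n


totals = run_episode(StubVectorEnv(n=2))
print(totals)
```

A learning agent would replace the fixed "always press Up" policy with one that chooses events based on `observation_n` and adjusts itself using `reward_n`.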

Dusk Drive Game

“Our ultimate goal is to develop a single intelligent agent who is able to flexibly apply the experience gained in the Universe to solve new problems and quickly gain new experience, which will be an important step towards a strong AI,” OpenAI said.

Universe environments run inside Docker containers. As mentioned above, they communicate with the intelligent agent through a visual interface, namely a "screen", "keyboard", and "mouse", just as they would with a person. The interface is implemented with VNC, a system for remote desktop access.
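To make the "keyboard" and "mouse" side of this interface more concrete, here is a small sketch of how an agent's actions could be assembled as lists of input events. The `('KeyEvent', key, is_down)` tuple comes from the article's own example. The `('PointerEvent', x, y, buttonmask)` layout is an assumption modeled on VNC-style pointer events and should be checked against the universe documentation. The helper names `key_tap`, `click`, and `actions_for` are hypothetical.

```python
def key_tap(key):
    """Press and release a key, using the (name, key, is_down) tuple from the article."""
    return [('KeyEvent', key, True), ('KeyEvent', key, False)]


def click(x, y):
    """Left click at (x, y).

    Assumed layout: ('PointerEvent', x, y, buttonmask), where buttonmask 1
    means the left button is held and 0 means no buttons are held.
    """
    return [
        ('PointerEvent', x, y, 0),  # move the pointer to the target
        ('PointerEvent', x, y, 1),  # press the left button
        ('PointerEvent', x, y, 0),  # release it
    ]


def actions_for(n_envs, events):
    """Give every environment its own copy of the same event list."""
    return [list(events) for _ in range(n_envs)]


action_n = actions_for(2, key_tap('ArrowUp') + click(100, 200))
for events in action_n:
    print(events)
```

Because every action is just a sequence of low-level input events, the same agent code can in principle drive any program that a person could operate with a keyboard and mouse.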

The idea is that steadily improving an AI's skills as it accumulates experience on many small tasks will help it master each new task faster by applying what it already knows. The Universe platform and its set of environments could become for intelligent agents the same kind of standard, unified benchmark for reinforcement learning that the ImageNet image dataset is for training neural network classifiers with supervised learning.

Reinforcement learning can indeed be very effective. For example, a Universe agent was trained for about six days to play the multiplayer web game Slither. After six days, the AI scores an average of 1,000 points per game session, with a best result of 1,400 points. For comparison, an OpenAI employee with five hours of play experience scores an average of 1,400 points, with a best score of 7,050.

Currently, the following games and applications from OpenAI partners are available to agents via the Universe middleware: Portal, Fable Anniversary, World of Goo, RimWorld, Slime Rancher, Shovel Knight, SpaceChem, Wing Commander III, Command & Conquer: Red Alert 2, Syndicate, Magic Carpet, Mirror's Edge, Sid Meier's Alpha Centauri, and Wolfram Mathematica. The list will grow.

Source: https://habr.com/ru/post/399701/

