Scientists cured AI of forgetfulness

Artificial neural networks differ from their biological counterparts in that they cannot "remember" past skills while learning a new task. An artificial intelligence trained to recognize dogs cannot tell people apart. To do that, it has to be retrained, and in the process the network "forgets" that dogs exist. The same goes for games: an AI that knows how to play poker will not win at chess.

This phenomenon is called catastrophic forgetting. However, researchers at DeepMind and Imperial College London have developed a training algorithm for deep neural networks that can acquire new skills while preserving the "memory" of previous tasks.

Photo by Dean Hochman, CC
A neural network consists of many connections, each of which carries a weight. In the new method, every weight is assigned a parameter F that measures how important it is to previously learned tasks. The greater the F value for a particular weight, the less likely the algorithm is to change it during further training. As a result, the network "remembers" its most important acquired skills.
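In the DeepMind paper, F is the diagonal of the Fisher information matrix, and training on a new task B minimizes L(θ) = L_B(θ) + Σᵢ (λ/2) Fᵢ (θᵢ − θ*_{A,i})², a quadratic penalty that anchors important weights near the values θ*_A they had after task A. Below is a minimal sketch of this idea in PyTorch; the function names, the batch format, and the λ value are illustrative assumptions, not DeepMind's actual implementation.

```python
import torch

def estimate_fisher(model, loss_fn, task_a_batches):
    """Diagonal Fisher approximation: average squared gradient of the
    task-A loss with respect to each parameter (the importance F)."""
    fisher = {name: torch.zeros_like(p) for name, p in model.named_parameters()}
    for inputs, targets in task_a_batches:  # list of (inputs, targets) pairs
        model.zero_grad()
        loss_fn(model(inputs), targets).backward()
        for name, p in model.named_parameters():
            if p.grad is not None:
                fisher[name] += p.grad.detach() ** 2
    return {name: f / len(task_a_batches) for name, f in fisher.items()}

def ewc_penalty(model, fisher, anchor_params, lam=1000.0):
    """Quadratic penalty (lam / 2) * F_i * (theta_i - theta*_i)^2 that
    discourages moving weights that mattered for task A."""
    penalty = torch.zeros(())
    for name, p in model.named_parameters():
        penalty = penalty + (fisher[name] * (p - anchor_params[name]) ** 2).sum()
    return (lam / 2.0) * penalty

# While training on task B, the total loss becomes:
#   loss = task_b_loss_fn(model(x), y) + ewc_penalty(model, fisher, anchor_params)
# where anchor_params = {n: p.detach().clone() for n, p in model.named_parameters()}
# was saved right after training on task A.
```

Weights with small F are free to move and absorb the new task; weights with large F are held near their old values, which is the "elastic" consolidation the name refers to.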

The technique is called Elastic Weight Consolidation (EWC). The algorithm was tested on Atari games. The researchers showed that without the weight anchoring, the program quickly forgot games once it stopped playing them (blue curve). With the EWC algorithm, the network retained the weights needed for all the previous tasks. And although the EWC network lost to the classical algorithm on each individual game, it achieved better results summed over all stages (red and brown curves).


The authors note that the scientific community has already attempted to create deep neural networks capable of performing several tasks at once. However, previous solutions were either not powerful enough or demanded large computational resources, because the networks were trained on one large combined dataset rather than on several tasks in sequence. That approach did not bring the algorithms any closer to the way the human brain learns.


There are also alternative neural network architectures for working with text, music, and other long data sequences. These are called recurrent networks; variants with long short-term memory (LSTM) can switch between global and local problems (for example, from analyzing individual words to the stylistic rules of the language as a whole).
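As a rough illustration of how such a network carries context along a sequence, here is an off-the-shelf LSTM layer in PyTorch; the dimensions are arbitrary and the example is not tied to any specific model from the article.

```python
import torch
import torch.nn as nn

# Toy dimensions chosen only for illustration: one 100-step sequence
# of 32-dimensional feature vectors, a hidden state of size 64.
lstm = nn.LSTM(input_size=32, hidden_size=64, batch_first=True)

sequence = torch.randn(1, 100, 32)    # batch of one sequence, 100 time steps
outputs, (h_n, c_n) = lstm(sequence)

# outputs: the hidden state at every step -- the "local" view of the data.
# h_n / c_n: the final hidden and cell states -- a compressed "global"
# summary that the network carries across the whole sequence.
print(outputs.shape)   # torch.Size([1, 100, 64])
```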

Recurrent neural networks have memory, but they lag behind deep feed-forward networks in their ability to analyze the complex feature sets that arise, for example, in image processing. DeepMind's new solution may therefore eventually enable universal intelligent algorithms, with applications in software that must solve problems requiring nonlinear transformations.

Source: https://habr.com/ru/post/323948/
