
Neural networks with reflection

Recently I was invited to give a TEDx talk, where I tried to give a popular account of the current state of affairs in AI and also outlined the essence of the neural networks we are currently working on (see the video).

Since the talk was aimed at a general audience, I did not go into any details there, but the model has interesting properties that I want to discuss in more depth.


Network structure


The basis was the well-known Hopfield network, but in addition to the main connections from every neuron to every neuron (which can technically be regarded as links with a delay of one cycle), additional links with delays of more than one cycle were added (delays of 2–8 cycles were investigated in practice).

These additional connections also ran from every neuron to every neuron, and several blocks of such connections were present simultaneously; in particular, there were all-to-all blocks delayed by 2, 3, 4, 5 and 6 cycles.

Thus, for a network of N neurons, not N^2 connections are created but (k + 1) * N^2, where k is the number of blocks of delayed connections.
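
To make the structure concrete, here is a minimal sketch of such a network in Python/NumPy. The names, the synchronous sign-threshold update, and the Hebbian-style learning rule are illustrative assumptions, not the exact code used in the experiments:

```python
import numpy as np

# Sketch of the structure described above (illustrative assumptions,
# not the exact experimental code). States are bipolar vectors {-1, +1}.
N = 100                      # neurons: one per pixel of a letter image
delays = [1, 2, 3, 4, 5, 6]  # the delay-1 block is the classic Hopfield matrix

# One N x N weight matrix per block: (k + 1) * N^2 connections in total,
# counting the delay-1 block as the "+1".
W = {d: np.zeros((N, N)) for d in delays}

def train(sequences):
    # Assumed Hebbian outer-product rule: the block with delay d associates
    # the pattern shown d cycles earlier with the current one, so that each
    # letter becomes a stable state and the delayed blocks encode letter
    # order. Negative indices wrap around, i.e. words are treated as cyclic.
    for seq in sequences:
        for t in range(len(seq)):
            for d in delays:
                W[d] += np.outer(seq[t], seq[t - d]) / len(seq)

def step(history):
    # One synchronous update; history must hold at least max(delays) past
    # states, and history[-d] is the network state d cycles ago.
    h = sum(W[d] @ history[-d] for d in delays)
    return np.where(h >= 0, 1, -1)
```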



Formulation of the problem


In general, the task I would eventually like to solve with such a network is unsupervised learning of dynamic patterns, that is, roughly speaking, the ability to pick out and memorize frequently occurring subsequences in a continuous sequence of images at the input.

But first we settled on a simpler task. There are several known sequences (namely, words composed of sequentially displayed letters); they are fed to the input in noisy form (and the display can start from any letter of the word, not necessarily the first), and the network should determine as quickly as possible which word is currently at the input. This cannot be determined immediately, since the words share common letters and even syllables.

The parameters of the model task are as follows: 6 words of 7 letters each, each letter a 100-pixel image, and a noise level of about 20% (i.e. at each display of each letter, 20% of randomly selected pixels are inverted).
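
Continuing the sketch above, the input stream for this model task might be generated as follows. The words and letter patterns here are stand-ins: real letter bitmaps and the actual word list are not reproduced here:

```python
rng = np.random.default_rng(0)

# Stand-in data: each "letter" is a random 100-pixel bipolar pattern
# (real letter bitmaps would be used in practice), and the six 7-letter
# words are illustrative -- like the real ones, they share letters and
# even syllables.
letters = {c: rng.choice([-1, 1], size=N) for c in "abcdefghijklmnopqrstuvwxyz"}
words = ["thought", "through", "brought", "thunder", "teacher", "trouble"]

def noisy(pattern, noise=0.20):
    # Invert 20% of randomly selected pixels at each display.
    flip = rng.random(pattern.size) < noise
    return np.where(flip, -pattern, pattern)

def show(word, start=0, repeats=2):
    # Present a word letter by letter; the display may start from any
    # letter of the word, not necessarily the first.
    for i in range(start, start + repeats * len(word)):
        yield noisy(letters[word[i % len(word)]])

train([[letters[c] for c in w] for w in words])
```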



Solution of the original problem


The network solved the problem quite successfully, almost at the human level: at a noise level where it is not always possible to tell by eye which letter is being displayed, the network can confidently say which word is shown after a few complete displays of the word. At a low noise level the word is recognized after the first 2–4 letters are displayed.
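
One plausible way to read the answer out of such a network (again an assumption for illustration, not the actual decision procedure): decode each state to the nearest stored letter and match the decoded sequence against the known words, allowing for a start in the middle of a word:

```python
import itertools

def nearest_letter(state):
    # Map a network state to the stored letter with the largest overlap.
    return max(letters, key=lambda c: state @ letters[c])

def guess_word(states):
    # Decode states to letters, collapse runs of repeated letters (a state
    # may persist for several cycles), and search for the result inside
    # each word doubled: "w + w" handles displays that start mid-word.
    decoded = (nearest_letter(s) for s in states)
    key = "".join(c for c, _ in itertools.groupby(decoded))
    return next((w for w in words if key in w + w), None)
```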

The result is good, but quite achievable by other methods.



Reflection effect


The most interesting behavior appeared after the network had been presented with some sequence at the input and the input was then "switched off" while the network continued to compute its states. It turned out that the network keeps passing through states that mostly resemble letters (which is natural: by the learning algorithm, letters should be stable states of the network), though not always. Moreover, these letters form sequences that somewhat resemble the syllables of the memorized words without being identical to them, change in a different rhythm (some letters persist for a long time before the state changes, others pass quickly), and loop with a period substantially exceeding the length of the original words (the memorized words are 7 letters long, while the sequences produced by this process can run for more than 50 letters before repeating).

Moreover, when different words were shown at the input before it was switched off, or when the input was switched off at different moments, the resulting sequences were different.
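
Within the same sketch, this regime is easy to reproduce: stop feeding input and let the network iterate on its own history, logging which stored letter each state most resembles:

```python
def free_run(history, n_steps):
    # Keep updating the network with the input switched off and log the
    # nearest stored letter at each step; the resulting letter sequence
    # is what resembles, but does not repeat, the memorized words.
    trace = []
    for _ in range(n_steps):
        history.append(step(history))
        trace.append(nearest_letter(history[-1]))
    return trace

# Drive the network with a noisy word starting mid-word, then cut the input.
history = [np.ones(N)] * max(delays)   # arbitrary initial history
for frame in show("thought", start=3):
    history.append(frame)              # clamp the noisy frame as the state
print("".join(free_run(history, 200)))
```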

That is, it can be argued that the state of the network depends on its previous states in a highly nontrivial way. If one allows a loose interpretation of what is happening, an analogy with dreams suggests itself: they form complex plots that reflect previous experience in some way, but do not repeat it directly.



All in all, the model exhibits interesting behavior that certainly gives one pause. In the video I allowed myself a rather bold interpretation of this behavior, but even if it is wrong, one can still hope that in the foreseeable future it will be possible to explain the basic mechanisms of thinking in a rough but fundamentally correct approximation.



P.S. Discussion is very welcome. I cannot figure out how to solve the learning problem in the case where the sequences are not given from outside but have to be extracted from the input stream, and I will be glad of any thoughts on this topic and beyond.

Source: https://habr.com/ru/post/146197/


