
A model of the manifestation of consciousness, or an ANN without the forgetting effect

This article continues “Model of functional separation of consciousness and the unconscious. Introduction”. There we described, in a lyrical manner, the principles of the Rosenblatt perceptron and raised the problem of learning from two or more teachers. The problem of “two or more teachers” is not considered here: I find it hard to formulate technically with simple examples (I can with complex ones). We may return to it in the next article, if there is some interest.

In this article we will talk about consciousness. If you want to skip the lyrics (they annoy specialists a little, but I want non-specialists to understand as well), start reading from the section “Model of "Zero consciousness" in Intellectronics”. Still, the lyrics give some idea of the connection with such fictitious concepts as consciousness.


First, let's define what the unconscious is. It is actually simpler than it may seem at first. The unconscious is the whole mechanism of memory and the effects that arise in it. The previous article was devoted to this in particular. All of that distribution, associativity and the two levels were never conscious; all these effects are internal. Only the final result (the reaction elements), under the influence of a conscious complex, is allowed to the surface.

The second, rather important point is the description of a conscious complex. How does it differ from the unconscious as a whole, from the rest of the work of memory?

The complex, according to Jung, is formed when an unconscious region of the psyche is set in motion. The processes occurring in the unconscious can in some way contradict each other. These processes are then brought to the level of consciousness. Consciousness may withhold the energy that should be directed to resolving this complex. If this happens, the so-called “lowering of the mental level” occurs. The intensity of conscious interests and activities gradually decreases, so that either apathetic inertness arises or a regressive development of the conscious functions, meaning their descent to infantile and archaic earlier stages, i.e. something like degeneration. But consciousness can also identify itself with the complex. The presence of such complexes then becomes one of the normal properties of the psyche. These conscious complexes manifest themselves in any differentiated typical attitude or need.

Such a conscious complex arises in the process of learning. Any learning is a declaration of things that contradict the unconscious. The complex establishes a certain activity, which is responsible for assigning external stimuli to a certain type of phenomena. This activity arises, as it were, from within, as the struggle of consciousness with the contradictory nature of the unconscious. In this it differs from the other activity that occurs in the unconscious in response to environmental stimuli.

The work of such a conscious complex seriously accelerates learning. The complex itself differentiates the various types of phenomena, so during training it only remains to reconcile this with the other activity arising in the unconscious. That is much easier than reconciling the rather contradictory activity of the unconscious in full.

Then the whole question is the degree to which consciousness captures the learning. If learning happened quickly, this shows a high degree of awareness. When the trained ability is later tested, the amount of training received becomes important. If it is extremely small (compared to the total volume of the field of knowledge), the unconscious still switches on and tries to substitute for the work of consciousness, generalizing the acquired knowledge in order to anticipate reality. Unfortunately, in this case the unconscious has practically nothing to generalize, and only nonsense comes out.

If the amount of knowledge gained is larger, it displaces the unconscious processes. Consciousness restores the previously acquired knowledge, and the unconscious is permitted only a minimal generalization of it. But for all the areas that were not directly memorized, it is then simply impossible to make any assumptions, let alone predict their values.

Intuition works satisfactorily only when the learning process has not been completely captured by consciousness and a certain detachment has occurred. This detachment, or the subsequent forgetting, should not be too large: roughly 80% of attention should remain. Learning is then somewhat slower, which lets the unconscious take part in it.

When the ability is later tested (with a sufficient amount of acquired knowledge), the work of consciousness delineates clear boundaries of knowledge, and the work of the unconscious fills in the places that did not receive attention, i.e. a certain generalization, a prediction (forecast), takes place.

Taken separately, the work of the unconscious in this case is a kind of “blurry spot”; when the work of consciousness is applied to it, reliable knowledge is obtained.

Sometimes, though, the learning process may not be accompanied by consciousness at all. A fairly satisfactory result will still be obtained, but clarity will never be achieved: the knowledge will remain blurred.

Model of "Zero consciousness" in Intellectronics

“The complex establishes a certain activity responsible for assigning external stimuli to a certain type of phenomena. This activity arises, as it were, from within, as the struggle of consciousness with the contradictory nature of the unconscious.”

For a practicing cyberneticist this phrase is very vague and says little about what features a model of consciousness should have. Here we will try to state the thought more concretely.

The model is built on a model of memory, Rosenblatt's perceptron. The perceptron has a known forgetting problem. It manifests itself as follows: during training, each subsequent stimulus can overwrite memory, or rather change the weights on which the response to a previous stimulus was based. So within a single pass (iteration), in which each stimulus of the training sample is shown once, a stimulus that has already received the correct response may be forgotten because of subsequent training. Training is therefore complete only when all stimulus-response pairs of the entire training set, which keep being forgotten along the way, match at the same time.
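The forgetting effect can be reproduced on a toy example. The sketch below is only an illustration under stated assumptions: the A-layer activity is given directly as binary vectors, there is a single R-element with outputs +1/-1, and the names `predict`, `train` and `SAMPLES` are hypothetical, not taken from the original article. It applies the usual error-correction rule and counts how many times a pair that was already answered correctly becomes wrong again after a correction made for another stimulus.

```python
def predict(weights, activity):
    """R-element output: +1 if the weighted sum of A-activity is positive, else -1."""
    total = sum(w * a for w, a in zip(weights, activity))
    return 1 if total > 0 else -1


def train(samples, max_epochs=100):
    """Error-correction training that also counts forgotten stimulus-response pairs."""
    weights = [0] * len(samples[0][0])
    forgotten = 0
    for epoch in range(1, max_epochs + 1):
        errors = 0
        for activity, target in samples:
            if predict(weights, activity) == target:
                continue
            errors += 1
            before = [predict(weights, a) == t for a, t in samples]
            # Correction: shift the weights of the active A-elements toward the target.
            weights = [w + target * a for w, a in zip(weights, activity)]
            after = [predict(weights, a) == t for a, t in samples]
            # A pair is "forgotten" if it was correct before this correction
            # and is incorrect after it.
            forgotten += sum(b and not c for b, c in zip(before, after))
        if errors == 0:
            # A full pass without corrections: every pair matches at once.
            return weights, epoch, forgotten
    return weights, max_epochs, forgotten


# Hypothetical toy data: both stimuli activate the middle A-element
# but require opposite responses.
SAMPLES = [((0, 1, 0), 1), ((0, 1, 1), -1)]
weights, epochs, forgotten = train(SAMPLES)
print(weights, epochs, forgotten)
```

Even with two stimuli, more than one pass is needed, because each correction on the shared A-element disturbs the response to the other stimulus.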

This is understandable, because training corresponds to finding the coefficients of a system of equations. Suppose coefficients have been found that satisfy K of the N equations in the system. When the search continues and another M equations become satisfied, some of the K equations may stop being satisfied. This problem is the cause of long training.

Training can be simplified, improved and thereby made faster by applying the technique described below. In its external results this technique resembles the manifestation of consciousness in the unconscious. But let's not get ahead of ourselves.

One could do without the perceptron model altogether and simply memorize the input-output correspondences. But then we would lose the ability to generalize and, most importantly, we would not have a model of the relationships in the subject area of interest, expressed as a system of nonlinear equations.

Similar results could be obtained by intervening directly in the activation of the elements of the associative layer (A-elements). What would that give?
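For the error-correction rule this can be written out explicitly. The notation is an assumption introduced here for illustration and does not come from the original article: $a_j$ is the A-layer activity vector for stimulus $j$, $t_j = \pm 1$ is the required response of one R-element, and $w$ is its weight vector.

```latex
% Training seeks weights w that satisfy all N conditions at once:
t_j \,(w \cdot a_j) > 0, \qquad j = 1, \dots, N .

% A correction made on stimulus j changes the weights:
w' = w + t_j\, a_j .

% Its effect on the condition for any other stimulus k:
t_k \,(w' \cdot a_k) = t_k \,(w \cdot a_k) + t_j t_k \,(a_j \cdot a_k) .
```

With binary A-activity, $a_j \cdot a_k$ is the number of A-elements active for both stimuli, so the added term is negative exactly when the two stimuli share active A-elements and require opposite responses. If it outweighs the first term, condition $k$, which was already satisfied, fails again.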

For example, let us allocate for this purpose as many additional A-elements as there are R-elements, i.e. we enlarge the memory by the number of reacting elements, without connecting the new elements to the sensory elements. During training, when a stimulus is applied to the sensory elements, we simultaneously set the activity of these additional A-elements so that it exactly matches the required state of the R-elements. The activity of the other A-elements will be pseudo-random, or rather will depend in some way on the input stimulus.

In this case training will be almost instantaneous, because a pattern identical to the one required at the output is formed artificially in memory (the associative layer). The training procedure is left with only one task: to neutralize the random activity of the other, “ordinary” A-elements.

Note that the “accelerated learning” method presented here has a significant drawback: such a model practically rules out generalization and subsequent prediction on a test set. In this form it calls into question the use of the perceptron at all.

But this is, of course, a degenerate example. Change a few details, and this “accelerated learning” technique gains significant advantages over any neural network in a number of respects.
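The scheme with additional A-elements can be sketched as follows. This is a minimal illustration, not the author's implementation: the layer sizes, the random ±1 S-to-A links, the zero threshold and all function names (`make_links`, `a_layer`, `respond`, `train`) are assumptions made for the example. The ordinary A-elements depend on the stimulus through fixed random links, while each additional A-element simply copies the required state of its R-element.

```python
import random


def make_links(n_sensors, n_ordinary, rng):
    """Fixed random S-to-A connections (+1 excitatory, -1 inhibitory)."""
    return [[rng.choice((-1, 1)) for _ in range(n_sensors)]
            for _ in range(n_ordinary)]


def a_layer(stimulus, target, links):
    """Ordinary A-elements depend on the stimulus; additional ones copy the target."""
    ordinary = [1 if sum(c * s for c, s in zip(row, stimulus)) > 0 else 0
                for row in links]
    extra = [1 if t > 0 else 0 for t in target]
    return ordinary + extra


def respond(weights, activity):
    """State of every R-element (+1 or -1) for a given A-layer activity."""
    return [1 if sum(w * a for w, a in zip(row, activity)) > 0 else -1
            for row in weights]


def train(patterns, n_r, max_epochs=10000):
    """Error-correction training of A-to-R weights; returns (weights, epochs, converged)."""
    weights = [[0] * len(patterns[0][0]) for _ in range(n_r)]
    for epoch in range(1, max_epochs + 1):
        errors = 0
        for activity, target in patterns:
            output = respond(weights, activity)
            for k in range(n_r):
                if output[k] != target[k]:
                    errors += 1
                    weights[k] = [w + target[k] * a
                                  for w, a in zip(weights[k], activity)]
        if errors == 0:
            return weights, epoch, True
    return weights, max_epochs, False


rng = random.Random(0)
N_SENSORS, N_ORDINARY, N_R = 6, 8, 2
links = make_links(N_SENSORS, N_ORDINARY, rng)
# Hypothetical training set: random binary stimuli with random required responses.
data = [([rng.randint(0, 1) for _ in range(N_SENSORS)],
         [rng.choice((-1, 1)) for _ in range(N_R)]) for _ in range(8)]
patterns = [(a_layer(s, t, links), t) for s, t in data]
weights, epochs, converged = train(patterns, N_R)
print(epochs, converged)
```

Because the additional A-elements already carry the answer, a solution always exists (a positive weight on the matching additional element and small negative weights elsewhere), whatever the ordinary A-elements do. The sketch says nothing about generalization: at test time the required response is unknown, so the additional elements cannot be set this way.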

The details are as follows. Some element of randomness (mutation) must be introduced into the activation of the additional A-elements mentioned above. For example, when an additional A-element needs to be activated, it will actually be activated in only 80% of cases (we will call this figure the attention factor or, technically more precisely, the prediction accuracy). This slows training down a little, since sensory activity is no longer neutralized so easily. And since the internal activity no longer reproduces the required output with 100% accuracy, the weight-adjustment procedure (training) has to take the sensory information into account and reconcile it with the internal one. This process can be called, figuratively or as a narrower, particular case, the manifestation of the conscious from the unconscious.

Most importantly, with this model the forgetting problem is left behind, and in such a neural network (unlike others) there are no situations in which it predicts worse than it was trained. Such a network makes it possible to regulate the level of generalization by means of the attention factor. At one extreme, the entire training sample is reproduced, but with almost no forecasting ability. At the other extreme, the result is similar to that of the perceptron: the generalization is so strong that the image becomes heavily blurred and it is no longer possible to tell which features of the image are reliable (correspond to the training). By changing the attention factor, one can smoothly adjust the level of generalization and obtain, in addition to the known features of the image, the most likely generalization. Moreover, the training speed becomes predictable and depends on the attention factor.
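The random activation of the additional A-elements can be sketched like this. The function name `extra_activity` and the choice to leave elements inactive when their R-element must be -1 are assumptions of this example; the text above only describes what happens when an element needs to be activated.

```python
import random


def extra_activity(target, attention, rng):
    """Activity of the additional A-elements for one presentation of a stimulus.

    An element whose R-element must be +1 is activated with probability
    `attention`; an element whose R-element must be -1 stays inactive
    (an assumption: the source describes only the first case).
    """
    return [1 if t > 0 and rng.random() < attention else 0 for t in target]


rng = random.Random(1)
target = [1, -1, 1]
# Present the same required response many times with an attention factor of 0.8.
trials = [extra_activity(target, 0.8, rng) for _ in range(10000)]
rate = sum(row[0] for row in trials) / len(trials)
print(rate)  # empirical share of presentations in which the first element fired
```

With `attention` equal to 1.0 the function reduces to the exact copy of the required response from the degenerate example; lower values leave part of the work to the ordinary, stimulus-driven A-elements.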

It is worth mentioning that, in the author's opinion, some well-known neural networks already use some of the principles presented here, though not quite consciously.

For example, S. Kak's CC4 method uses the ideas of corner classification and tries to accelerate training as much as possible by statically computing the activity of the inner layer, artificially activating only one A-element for different stimuli. This is similar in that we also act directly on the activity of the A-layer, but differs in that we match the activity of the A-layer more precisely to the desired result on the R-layer. In addition, the smoothness of generalization in corner classification (which is achieved differently from ours) leaves much to be desired.

Amosov's stochastic networks, in which, again in the author's opinion, unreasonably much attention is paid to the randomness of activation of the associative layer, also have some analogy with our approach. The analogy lies in the random activation of the additional A-elements, governed by the attention factor; unlike in Amosov's networks, however, the remaining A-elements are not subject to such randomness.

It is the integrity of our approach that allows us to speak of an analogy, albeit a distant one, with the informational processes that occur when a person manifests consciousness. In the end, our model of consciousness relates to human consciousness as the model of an artificial neuron relates to a biological neuron.

Finally, note that such a model of “zero consciousness” naturally needs to be improved.

First, there is the artificiality of the technique: we set the additional elements of the associative layer directly, as if “from the sky”. They should instead depend on feedback signals and predict the state that is about to occur. This is one of the reasons why Hawkins's network, Hierarchical Temporal Memory (HTM), is interesting.

Second, and most importantly, the model does not include the problem of “learning from several teachers”, and accordingly the contradiction, the conscious complex, has not yet arisen in it.

Source: https://habr.com/ru/post/148778/

