
A Model of the Functional Separation of Consciousness and the Unconscious. Introduction

Recently, a great many lightweight articles have begun to appear on the topic of AI. More precisely, they are not even about AI but about the philosophy of AI, and a philosophy that poses NO fundamental questions to researchers. Let's just say it is idle chatter. And surprisingly, such articles still collect a fair number of upvotes.

Which articles do I have in mind? I will give only the titles, without links, and I do not recommend reading them: "Another attempt to understand the problem of artificial intelligence", "On the possibility of AI to self-knowledge and knowledge of the creator", "On emotions, programs and artificial intelligence", "Artificial intelligence to be", "Look of a well-informed skeptic on artificial intelligence".

It is clear that such articles are written by people who lack even the relevant education. But that is not the main reason they appear. The people publishing such articles genuinely believe they can somehow point the way for a researcher who does have the relevant education. I will disappoint them: no, they cannot; there is not a single idea in those articles about where to go. Such mumbling arises because it seems to these authors that the researchers themselves do not know where to develop. And sometimes it does look that way. Even in the professional community of AI specialists there is often no shared understanding of which problems need to be solved; there is no "list of unsolved AI problems", so to speak, unlike in mathematics. Books often describe only methods for solving problems and say practically nothing about the problems that still need to be solved. It is hard for the younger generation to set themselves a task, so they begin to fantasize starting from the mere word "intelligence". And everyone has probably forgotten (or never knew?) that the name "Artificial Intelligence" is a provocation, a publicity stunt: serious scientists do not work on "strong AI", and not because it cannot be done, but because it has no technical statement of the problem.
Here I will give a refactored version of one of my popular science articles from 2006, which, I used to think, stood in the same row in meaning as those I criticized above. But now I see that, although the style is the same, behind my article there is (and can be stated) a clear technical problem. We will talk about it later; for now, a lyrical introduction. What matters is that I am lyrically leading up to one significant unsolved problem in the field of AI.



First, the necessary theory. Those who are familiar with Rosenblatt's perceptron can simply enjoy this lyrical description of it written for the general public, and focus on one important aspect, described in the section "The problem of learning from two or more teachers": a problem of AI that has not been solved to this day.

Memory model

First, the reader needs to abandon the view that human memory works like computer memory, i.e. that whatever is recorded can later be read back unchanged, and that a record occupies a clearly localized place. That is far from the truth!

So, the first main characteristic of memory is distribution. It has a number of consequences:
1. Memory is not static but dynamic. A concrete fact is distributed across memory and constantly changes its location;
2. Since a fact has no precisely localized place, such memory is more reliable (and not necessarily through redundancy). The loss of any memory element only degrades the quality of the memorized facts, but does not lead to the complete loss of a fact.
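This graceful degradation can be sketched with a toy model. Nothing below is from the article itself; it is a minimal illustration, assuming a fact is stored as a superposition of random patterns across many elements, so no single element holds any part of the fact:

```python
import random

random.seed(0)

N = 200  # number of memory elements

def encode(fact_bits, patterns):
    """Distribute a fact (bit vector) across N elements as a
    weighted superposition of random patterns."""
    trace = [0.0] * N
    for j, b in enumerate(fact_bits):
        s = 1 if b else -1
        for i in range(N):
            trace[i] += s * patterns[j][i]
    return trace

def decode(trace, patterns, n_bits):
    """Recover each bit by correlating the trace with that bit's pattern."""
    return [sum(trace[i] * patterns[j][i] for i in range(N)) > 0
            for j in range(n_bits)]

n_bits = 8
patterns = [[random.choice([-1, 1]) for _ in range(N)] for _ in range(n_bits)]
fact = [random.random() < 0.5 for _ in range(n_bits)]
trace = encode(fact, patterns)

# Destroy 30% of the memory elements: recall only degrades, it does not vanish.
for i in random.sample(range(N), 60):
    trace[i] = 0.0
recovered = decode(trace, patterns, n_bits)
print(sum(a == b for a, b in zip(fact, recovered)), "of", n_bits, "bits recovered")
```

Even after losing a third of the elements, the fact is still read back almost perfectly, because the signal is carried by the surviving majority.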

The second characteristic is associativity. Consider first the degenerate cases in which a memory model has one characteristic but not the other:
1. Distributed but not associative memory. Known in programming as a distributed database.
2. Non-distributed associative memory. Known as forward (or backward) inference, used in expert systems.

What associativity adds to the properties of distribution:
1. It combines distributed elements into a single whole;
2. It allows distributed elements to be combined in ways that were not provided for in advance;
3. New combinations can be assembled in response to a previously unknown stimulus, i.e. the memory effectively predicts the answer (forecasting).

Thus, distribution is a necessary condition, and associativity is a sufficient condition for the appearance of the effect of foresight (forecasting).
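The combination of distribution and associativity can be sketched as a Hebbian hetero-associative memory. This is my own minimal illustration, not the author's model: stimulus-response pairs are stored as correlations in one weight matrix, and a never-seen (noisy) stimulus still evokes the associated response, which is the "foresight" effect described above:

```python
import random

random.seed(1)
DIM = 64

def rand_pat():
    return [random.choice([-1, 1]) for _ in range(DIM)]

def store(W, stimulus, response):
    """Hebbian storage: accumulate response-stimulus correlations."""
    for i in range(DIM):
        for j in range(DIM):
            W[i][j] += response[i] * stimulus[j]

def recall(W, stimulus):
    """Associative recall: threshold the weighted sum of the stimulus."""
    return [1 if sum(W[i][j] * stimulus[j] for j in range(DIM)) > 0 else -1
            for i in range(DIM)]

W = [[0.0] * DIM for _ in range(DIM)]
pairs = [(rand_pat(), rand_pat()) for _ in range(3)]
for s, r in pairs:
    store(W, s, r)

# A noisy, never-before-seen variant of the first stimulus
# still evokes (predicts) the stored response:
s0, r0 = pairs[0]
noisy = s0[:]
for i in random.sample(range(DIM), 10):  # flip 10 of 64 components
    noisy[i] = -noisy[i]
out = recall(W, noisy)
print(sum(a == b for a, b in zip(out, r0)), "of", DIM, "components correct")
```

The response to the unknown stimulus was never explicitly provided; it emerges from the distributed traces of the stored pairs.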

The effect of foresight can be considered a manifestation of intuition. It is regarded as a necessary part of intellectual activity, and it should be attributed to the unconscious rather than to consciousness.

Conscious prediction has a different character: it is not based on distribution and associativity. It is deterministic and specifically targeted; such prediction rests on explicit analytical formulas.

Here we will continue to talk about unconscious prediction, which is one of the effects arising in distributed and associative memory.

The human brain is able to grasp only simple linear patterns. We never work with complex systems directly; we decompose them into simple ones and separately consider the relationships between these simple systems. All of object-oriented analysis is based on this principle.

We are not able to operate with characteristics that in some complex combinations produce one type of phenomenon and in other complex combinations produce another. We can reason only in terms of "if this and that characteristic is present, then such and such a phenomenon occurs". In reality such simplicity does not exist; if we could not find a mechanism for abstracting away complexity, we would have to reason like this: "if this and that characteristic is present, and this other one is absent, then in this particular case we have such and such a phenomenon". But such reasoning cannot be generalized, which means that any hope of forecasting is lost.

But human memory comes to the rescue here, showing us the principle of two-layeredness, the third characteristic of memory. It means that the brain does not perceive the primary stimuli (signals) from the outside world directly. It maps these stimuli onto the field of its memory elements. As a result of this mapping, the original nonlinear stimuli are transformed into a completely different set of stimuli that are already linear.

To implement the principle of two-layeredness, the memory model must contain a nonlinear mapping function. Why becomes clear from the following mathematical principle:
The principle of mapping nonlinearity into linearity. Any nonlinearly separable set of characteristics (represented by a nonlinear space), mapped by some nonlinear function onto a set of sufficiently larger volume (the required volume depends on the number of binary input-output pairs; for example, if the number of inputs is greater than 16 and the number of outputs is 16, the volume of the resulting finite set is 2^16), becomes linearly separable.
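The classic illustration of this principle is XOR. The sketch below is mine, not the author's: for determinism the nonlinear unit is hand-picked (a single AND/product unit), whereas in Rosenblatt's perceptron the associative layer achieves the same effect with random connections. A plain perceptron reaching zero errors on the projected patterns demonstrates that the new representation is linearly separable:

```python
# XOR in its original two inputs is not linearly separable:
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [0, 1, 1, 0]

def project(x):
    """Map the input into a larger space with one extra nonlinear unit."""
    return [x[0], x[1], x[0] * x[1]]

Z = [project(x) for x in X]

# Train an ordinary perceptron on the projected patterns.
w, bias = [0.0, 0.0, 0.0], 0.0
for _ in range(200):
    errors = 0
    for z, t in zip(Z, y):
        pred = 1 if sum(wi * zi for wi, zi in zip(w, z)) + bias > 0 else 0
        if pred != t:
            errors += 1
            for i in range(3):
                w[i] += (t - pred) * z[i]
            bias += t - pred
    if errors == 0:   # zero errors: the projected XOR is linearly separable
        break
print("training errors:", errors)
```

In two dimensions no line separates the XOR classes; after the nonlinear mapping, the hyperplane x0 + x1 - 2*(x0*x1) = 0.5 does.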

So, how does this third characteristic of memory, two-layeredness, manifest itself?
1. As already mentioned, it gives the ability to abstract away from complexity;
2. Thanks to this characteristic, we look at the world through "rose-colored glasses". We do not see its raw complexity, we do not see the chaos present in it: all the chaos has already been removed, and the world appears before us as a conscious world order. Many would like to take these "glasses" off and see life in all its complexity, but our body has taken this function upon itself. It deals with all the particulars (how much of this or that substance we need, and so on); this is the function of the unconscious and of the instincts. If we raised it to the level of consciousness (although this is not possible), we would degrade, not only because we would have no time left for creativity, but because we could not rise above the particulars: complexity would not let us see further. The price is that we are forced to see the world distorted, i.e. not exactly as it really is;
3. Unfortunately, this characteristic also has a negative side. A memory of far larger volume is needed than the perceived world of stimuli. It is true that each memory element is simpler than the relationships between the input stimuli of the environment. More precisely, the memory elements are not directly connected with each other, there are no relationships between them, so in themselves they are simple. It is their sheer volume that makes it possible to decompose the complex relationships between perceived stimuli. After all, there is a well-known mathematical rule that the number of intermediate computations depends on, and is much greater than, the number of inputs and outputs. That is, computing the task requires a large amount of intermediate memory.

Learning model (the conditioned reflex)

Learning is the process of recording onto distributed memory. This process is very different from an instantaneous write to a localized memory location, as happens when writing information to a computer disk. It is not known in advance in which elements the various parts of the fact being recorded should be stored. It is not even known in advance how the fact should be divided into parts.

On the one hand, it is customary to say that a stimulus (fact) causes certain activity in the memory elements. This activity is a consequence of the reaction that appears as the external manifestation of the transition from nonlinearity to linearity. If the reaction were simply linear, we would have an unconditioned reflex. But since we want to respond to a nonlinear world, we are biologically forced to use memory with the characteristics described above. On the other hand, the learner himself, or a "teacher", determines what reaction should be expected when a certain stimulus appears. In other words, during learning it is indicated (in more serious models, predicted) what type of phenomenon (class) the given fact (stimulus) belongs to.

Thus, correlating the activity of memory with the received instructions is the essence of learning.

The memory activity that arises (pseudo-randomly, in the sense that it obeys a certain law of formation but is effectively random for a particular stimulus) indicates where the recording should be made: on the active elements of memory.

The next question is how to distribute the information about the fact. Mathematically, this is the question of solving a system of equations: a large number of equations that must be reconciled with one another.
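This reconciliation can be sketched with the classical delta (Widrow-Hoff) rule, which is my choice of illustration rather than anything prescribed by the article. Each stored pattern of activity, paired with its required response, gives one equation w . x_k = t_k; iterating small corrections solves the whole system at once:

```python
# A hidden regularity generates consistent targets for four activity patterns.
true_w = [0.5, -0.25, 0.75]
patterns = [[1, 0, 1], [0, 1, 1], [1, 1, 0], [1, 1, 1]]
targets = [sum(wi * xi for wi, xi in zip(true_w, p)) for p in patterns]

w = [0.0, 0.0, 0.0]
lr = 0.1
for _ in range(2000):
    for x, t in zip(patterns, targets):
        y = sum(wi * xi for wi, xi in zip(w, x))
        err = t - y
        for i in range(3):
            w[i] += lr * err * x[i]   # delta rule: nudge every active element
print([round(wi, 3) for wi in w])
```

No single update "writes the fact"; the solution emerges from repeatedly reconciling all the equations, which is exactly the distributed-recording problem stated above.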

The problem of learning from two or more teachers

Another important point in the theory of learning is the ability to develop one's own opinion on the basis of learning from several different teachers. Note that in the more difficult case it is not an external teacher but one's own intuition, and it may be unable to predict unambiguously, indicating instead that there are 2 or more suitable answers, all of them correct but given from different points of view.

Approximation, finding the average between the opinions of different teachers, cannot be called developing one's own opinion. An opinion appears when a certain general situation is chosen as a basis, one that in principle unites the two or more teachers' opinions, at least within a limited area. That is, the stimuli introduced by the teachers are replaced with one's own stimuli, which are taken as the basis. If it is then possible to learn and obtain a result similar to the teachers' results, we can speak of developing one's own opinion.

Or, in a more complicated case, there is some law that unites the correct answers; but finding it requires great mental effort, is usually not applied in everyday life, and instead ends up on the pages of articles or textbooks. This is the manifestation of intellect, this and only this; everything else involved here is merely an element of organic life.

In this process, contradictions may appear during training within the picture offered by one and the same teacher. This indicates that, in the learner's view, the picture offered by the teacher is not accurate, and here approximation can be applied, not between the opinions of different approaches (teachers), but within the same approach in different situations. That is, in the learner's view the teacher acts differently in identical situations, while in fact simply failing to see that the situations are different. This effect arises because the learner initially chose somewhat different key points and, during training, replaced the teacher's stimuli with his own. The learning process described is valuable as the formation of new knowledge, the emergence of different angles of view on the same process, and, most importantly, such learning can be modeled and automated.
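The contrast between averaging two teachers and developing one's own rule can be made concrete. The sketch below is my own toy construction, not a solution to the problem the author poses: teacher A judges stimuli by one feature, teacher B by another; averaging their labels gives 0.5 ("no answer") wherever they disagree, whereas training only on the limited area where both opinions coincide yields a single rule of the learner's own that reproduces both teachers there:

```python
teacher_a = lambda p: 1 if p[0] > 0 else 0   # teacher A judges by the first feature
teacher_b = lambda p: 1 if p[1] > 0 else 0   # teacher B judges by the second

# a deterministic grid of stimuli (axes excluded to avoid ties)
points = [(i / 10, j / 10) for i in range(-10, 11) for j in range(-10, 11)
          if i != 0 and j != 0]

# Keep only the limited area where both opinions coincide.
agree = [p for p in points if teacher_a(p) == teacher_b(p)]

w, b = [0.0, 0.0], 0.0
for _ in range(500):
    errors = 0
    for p in agree:
        t = teacher_a(p)                     # == teacher_b(p) on this set
        pred = 1 if w[0] * p[0] + w[1] * p[1] + b > 0 else 0
        if pred != t:
            errors += 1
            w[0] += (t - pred) * p[0]
            w[1] += (t - pred) * p[1]
            b += t - pred
    if errors == 0:
        break

matches = sum((w[0] * p[0] + w[1] * p[1] + b > 0) == (teacher_a(p) == 1)
              for p in agree)
print(matches, "of", len(agree), "agreed cases reproduced by the learner's rule")
```

The learned rule (roughly "both features together") is neither teacher's rule, yet it matches both of them everywhere they agree: a replacement of the teachers' stimuli with the learner's own basis, in the limited area where a common basis exists.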

This is turning out to be rather a lot of text, so I will divide it into two parts. In the next part we will proceed directly to the definition of consciousness and what it means for artificial neural networks. To be continued...

Source: https://habr.com/ru/post/148777/

