In the comments to the article "Algorithmic undecidability is not an obstacle for algorithmic AI" I remarked that

For some reason everyone is fixated on NP problems, but for some reason no one sets the task of solving problems of class P FASTER (down to an instant response),

and hinted that the problems of building AI lie rather in the fundamental (insurmountable) slowness of computers built on the principle of the Turing machine.
That article also discussed the theoretical problem of algorithmic undecidability, using the halting problem as an example:
A person solves the halting problem, but does so with errors, whose probability grows as programs become more complex. A person's ability to solve algorithmically undecidable problems (as mass problems) is extremely doubtful. His ability to find solutions for individual particular cases proves nothing, because a computer can do that too.
And here, it seems, the human "algorithm" of working through particular cases is underestimated. It is beyond the power of a computer: the computer lacks a mechanism by which it could pick out particular cases on its own. Of course, in the comments to that article many readers immediately hand-coded these special cases, but the point is for the computer itself to find them in the course of solving. Finding particular cases, at the very least, fundamentally reduces the complexity of the computation. A person, by simplifying and idealizing, then proceeds to formulate laws, thereby moving to a qualitatively different level. And this is not available to a computer.
But let us take things in order.
How does a person solve problems?

A person solves problems (structural problems, not parametric ones; the terms are clarified below) significantly faster than they could be computed on any conceivable computing machine. At the expense of what? As a hypothesis: at the expense of losing the universality of the solution, i.e. by reducing it to a certain set of particular, finite solutions; only faith allows us to regard the problem as solved in general, until a particular case is met for which no solution has been found.
An example. Consider the period when a child is learning to count. He can already count to 100 and can add and subtract numbers up to 10, but he still cannot solve examples with adding tens. When asked to solve 10 + 10 = ? or 20 + 10 = ?, he has significant trouble, even though he understands the principle of addition and already counts to 100. That is, the child could simply compute 10 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1. But neither adults nor the child himself consider this a solution. Until he answers 10 + 10 = 20 at once, we believe that he does not know how to add. In other words, step-by-step calculation is not recognized by people as the ability to count; by this standard a computer cannot count either. One could also try to compute 10 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 in parallel, but it turns out that parallelization does not reduce the number of operations; it only shortens the time a little, and, more importantly, the solution requires knowledge of the global context (see below). What comes to the rescue? The introduction of structure into the calculation. It is not known why, but historically people have found it more convenient to count in tens (and not in fives, dozens, or binary). So when we teach a child to count, we separately teach him to add tens by discarding the zero, reducing the example to the form 10 + 10 → 1 + 1 = 2, with the "0" appended again: 20. That is, we remove the zeros, add the ones, and simply attach the zero back. This is a structural decomposition of the addition operation. The rule is far from universal, but it is precisely what allows one, in the particular cases where it applies, to count faster than any computer in which this aspect is not structured is potentially capable of.
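As a minimal illustration of this structural decomposition (a sketch; the function names and the digit-wise rule are my rendering of the example above, not the article's):

```python
def add_by_units(a, b):
    # "step-by-step" counting: add one unit at a time, b elementary steps in total
    for _ in range(b):
        a += 1
    return a

def add_tens(a, b):
    # structural shortcut for the particular case of whole tens:
    # discard the zeros, add the small numbers, append the zero again
    assert a % 10 == 0 and b % 10 == 0
    return (a // 10 + b // 10) * 10

print(add_by_units(10, 10))  # 20, after 10 elementary steps
print(add_tens(10, 10))      # 20, in one step -- but only for whole tens
```

The shortcut buys speed exactly by giving up universality: it answers instantly, but only on the particular cases it was built for.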
Examples of structural adaptation

How can the limit of polynomial computation be overcome for polynomial problems?
In order not to go into details, suppose that the Rosenblatt perceptron is solving the parity problem posed by Minsky. Roughly speaking, the problem is that even though the perceptron works in parallel, to determine whether the number of ones in the input is even or odd it needs at least one node connected to all of the inputs. In other words, it needs to know the global context.
And then we see that the value of parallel devices for tasks such as addition, multiplication, and the like is not very high: the achievable degree of parallelism on such basic operations is minimal.
Now suppose we have 4 points, each with the value 0 (no point) or 1 (point present). Then there are 16 combinations, and for each we must answer whether the number of points is even. If we remove at least one combination, the requirement of a global context becomes optional.
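To see why the full problem demands a global context, here is a small brute-force check (my own illustration, not from the original article): it enumerates every possible response rule that ignores one of the 4 inputs and shows that each such rule errs on half of the 16 combinations.

```python
from itertools import product

patterns = list(product((0, 1), repeat=4))
parity = {p: sum(p) % 2 for p in patterns}  # 1 if the number of points is odd

for ignored in range(4):
    best_errors = len(patterns)
    # enumerate all 2^8 = 256 Boolean functions of the remaining 3 inputs
    for table in range(256):
        errors = 0
        for p in patterns:
            rest = [bit for j, bit in enumerate(p) if j != ignored]
            index = rest[0] * 4 + rest[1] * 2 + rest[2]
            answer = (table >> index) & 1
            errors += int(answer != parity[p])
        best_errors = min(best_errors, errors)
    print(f"ignoring input {ignored}: at best {best_errors} errors out of 16")
```

Every line prints 8: flipping the ignored bit changes the parity but not the rule's answer, so the 16 patterns pair up and any input-blind rule fails on exactly one pattern of every pair.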
Then such a solution, with R = 4 and with the loss of one stimulus, shows us the logic of the approach: we lose the universality of the solution, but we solve everything else, and in addition we know exactly on which stimulus we make a mistake.
This approach leads us to a specific device. Suppose some system G tells us how many points there are on the retina, i.e. implements a size invariant. Then our training set (16 combinations) can be divided into 4 parts depending on size, and training can be conducted separately on different "columns" of intermediate neurons. What does this give? It significantly reduces the number of connections needed per neuron, from the required 4 down to 2. Why has this become possible? Because each training subsample no longer contains the problem "as a whole" and accordingly does not require solving the "parity" problem in general.
Using the "size" invariant, our task has even become degenerate, since knowledge of size = knowledge of the number of points, and using this knowledge it is already easier to say that this number is even or not. But suppose we solve a problem where the existing “size” invariant does not facilitate our problem so directly. What then? The relief will occur indirectly, the main thing with the help of the size invariant is the problem that will be divided into parts, and similarly the problem of "parity", the training sample will be divided, which will reduce the universality of the required solution. And again, there will be an order of magnitude less links than is needed to solve the problem in general.
On the way to the "liquid perceptron"It remains for us now to see what we should organize the system G, which in the particular case will implement the size invariant. Here, no miracle happened? Of course not - our law on the need for a global context cannot be overcome.
Suppose we implement the system G as 4 threshold elements that are connected to all inputs but have different thresholds: element a1 passes a signal if the number of points is greater than zero, a2 if greater than 1, a3 if greater than 2, and a4 if greater than 3. Then once again we have connected everything to everything and "overspent" connections. The fact is that logical structural adaptation without physical structural adaptation is indeed of little use.
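In code such a system G is simply a thermometer encoding of the input count (a sketch of my reading of the description above; the name system_G is illustrative):

```python
def system_G(inputs):
    # 4 threshold elements a1..a4, each connected to all inputs;
    # element a_k fires when the number of points exceeds k - 1
    n = sum(inputs)
    return [int(n > threshold) for threshold in range(4)]

print(system_G([1, 0, 1, 0]))  # [1, 1, 0, 0]: two points present
```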
But if for our system G we physically abandon the principle of wired connections and replace it with the physically more suitable mechanism of "fluid concentration", a practical effect will already be observed. Such a "fluid concentration" system can be imagined as follows. Suppose our perceptron floats in a liquid. The inputs send not only electrical signals through the connections but also act chemically on the liquid, which has a certain concentration level, its "acidity". The intermediate neurons have different levels of tolerance to this "acidity". The more electrical signals appear at the inputs, the higher the concentration in the liquid becomes, and as it rises, the groups of neurons least tolerant of the "acidity" are gradually "cut off". Thus, under the action of the electrical signals, only a specially selected group of neurons is trained, each on its specific part of the one task.
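Here is a toy sketch of this gating mechanism, under my own assumptions about the details (the tolerance values, the learning rule, and the names surviving_neurons/train_step are illustrative, not from the article): the total input activity sets a scalar "acidity", and only the neurons tolerant enough to survive it are adapted on the current stimulus.

```python
import random

N_INPUTS, N_HIDDEN = 4, 8
random.seed(0)

# each intermediate neuron has its own tolerance to "acidity"
tolerance = [random.randint(1, N_INPUTS) for _ in range(N_HIDDEN)]
weights = [[0.0] * N_INPUTS for _ in range(N_HIDDEN)]

def surviving_neurons(stimulus):
    # chemical channel: total input activity raises the acidity of the liquid,
    # and neurons less tolerant than the current acidity are "cut off"
    acidity = sum(stimulus)
    return [h for h in range(N_HIDDEN) if tolerance[h] >= acidity]

def train_step(stimulus, target, rate=0.1):
    # electrical channel: an ordinary perceptron update, but applied only
    # to the group of neurons selected by the current acidity level
    for h in surviving_neurons(stimulus):
        out = int(sum(w * x for w, x in zip(weights[h], stimulus)) > 0)
        for i in range(N_INPUTS):
            weights[h][i] += rate * (target - out) * stimulus[i]
```

Each activity level thus adapts its own subset of neurons, which is exactly the division of the training sample into parts described above.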
Readers familiar with neurophysiology will recognize that the described system is essentially a simplified prototype of a biological neural system. But whereas the theory of perceptrons previously used only notions of electrical interaction, it now becomes clear why chemical interaction is needed as well: if the electrical interaction implements the "focus", the chemical one provides the "global context".