
Fundamentals of an approach to the construction of universal intelligence. Part 1

From universal intelligence to strong AI: prospects for creating strong artificial intelligence


The field of artificial intelligence (AI) has produced many remarkable practical results in automating human activity across various domains, which is gradually changing the face of our civilization. However, the ultimate goal - the creation of truly intelligent machines (strong AI) - has not yet been achieved. At the same time, few scientists seriously doubt that strong AI can be created in one form or another. When objections are voiced, they are religious in character, appealing to the presence of an immaterial soul in humans. But even such radical views attribute to the immaterial only conceptually complex phenomena such as free will, creativity, or feeling, without denying the possibility of endowing a machine with behavior almost indistinguishable from a human's. Far less unambiguous are the answers to the questions of when and how exactly a strong AI can be created.

Artificial intelligence as a field has gone through different periods. The initial period, often described as romantic, promised thinking machines soon, within a couple of decades. Unjustified expectations led to a more pragmatic attitude and to the orientation of many researchers toward weak AI - non-universal intellectual systems capable of solving narrow classes of practical problems. The peak of this trend was expert systems (ES), which promised no longer machine intelligence but effective commercial solutions to complex applied problems. Here, too, however, expectations were not met. ES, although applied in practice rather successfully, did not become a breakthrough technology that would turn world business upside down, and the investments that had flowed into this area decreased markedly [McCarthy, 2005]. An AI winter began in the USA; in Japan, the Fifth Generation Computer project failed.

Nevertheless, research in the field of AI did not fade away. The many subfields that had split off from AI, such as computer vision, text analysis, and speech recognition, continued to bear fruit - not sensational, but increasingly significant. Business interest in weak AI systems revived. Statements about the extraordinary future importance of AI for all of humanity began to be repeated [Nilsson, 2005a]. And once again the idea was voiced that the field of AI needed to “officially” reinstate its ultimate goal - the creation of truly intelligent machines [Brachman, 2005].

At the same time, in strictly academic circles, scientists had stopped announcing timelines for the possible creation of strong AI. Yet a number of prominent specialists in the field are again naming dates a few decades away (sometimes even a single decade) [Hall, 2008]. And this time such expert expectations are supported by independent evidence. One piece of evidence is that, at least by some estimates, computing power comparable to the computational resources of the human brain is achievable by the 2030s (and by some estimates is achievable now [Hall, 2008]). Given that the lack of computing power was (as is now clear) one of the objective reasons why the early predictions about the creation of true AI proved unrealizable, the likely elimination of this cause in the near future is encouraging.
But computational power is only a necessary condition for creating a strong AI. Beyond it, there remain many substantive unsolved problems in the theory of AI. Can they be solved in the coming decades? Some confidence here comes from forecasts related to the technological singularity (see, for example, [Kurzweil, 2005]). The concept of the singularity rests on the accelerating growth in the complexity of technical (and, earlier, biological) systems. At each stage of global evolution the complexity of systems grows exponentially (Moore's law is a particular example), and at each transition between stages the exponent increases, that is, the doubling time decreases (for example, the doubling time for DNA complexity is hundreds of millions of years, and for nervous systems tens of millions). One should therefore expect this process to diverge to infinity in finite time.
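The arithmetic behind the "infinity in finite time" conclusion is just a convergent geometric series. A toy sketch of our own (the stage durations are hypothetical, chosen only for illustration):

```python
# If each successive stage of evolution lasts a fixed fraction r < 1 of the
# previous one, infinitely many stages fit into the finite total time
# T = first_stage / (1 - r): the sum of a geometric series.
def time_to_singularity(first_stage_years: float, r: float = 0.5) -> float:
    assert 0.0 < r < 1.0, "durations must shrink for the series to converge"
    return first_stage_years / (1.0 - r)

# Hypothetical numbers: a 400-million-year first stage with halving stages
# exhausts all subsequent stages within 800 million years in total.
print(time_to_singularity(400e6))
```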

Extrapolating the complexity curve does not allow placing the moment of the singularity later than 2050 (and usually places it earlier), and the emergence of some superhuman mind should probably be one of the subsequent stages of increasing system complexity. Of course, the possibility of reaching a true singularity can be disputed: the growth curve itself is objective, but its extrapolations may differ; still, the intervals to the next stages (metasystem transitions) should not suddenly start to lengthen too much. This means that this concept also supports the possibility of creating strong AI in the coming decades, which makes the problem a timely one, although it leaves open the question of how its solution should be approached.

At the same time, leading experts have noted that strong AI cannot be achieved through short-term projects [McCarthy, 2005], through the creation of highly specialized intelligent systems [Nilsson, 2005b], or even through the continual improvement of systems that solve isolated cognitive tasks such as learning or natural language understanding [Brachman, 2005]. The task of creating a strong AI must be posed and solved in its own right, even if no commercial results are expected from it in the first decade or more.

In the academic community, matters are so far limited to the entirely natural call to unite the subfields of AI [Bobrow, 2005; Brachman, 2005; Cassimatis et al., 2006], each of which has by now acquired its own deep specifics. The progress achieved in each subfield gives hope that combining the results will allow the construction of intelligent systems far more powerful than those built at the dawn of the computer era in the first attempts to create thinking machines. On the other hand, such unification should give a great deal to the subfields themselves: after all, the problems solved within them are often believed to be AI-complete. Thus, it is hardly possible to create universal systems for pattern recognition, language understanding, or automatic theorem proving without creating a strong AI, since all these tasks are fundamentally interrelated [Brachman, 2005].

The use and study of cognitive architectures as a means of combining in a single system all the functions necessary for full-fledged intelligence (learning, knowledge representation, reasoning, and so on) has been put forward as the new dominant paradigm in the field of AI as a whole [Brachman, 2005]. And it is this paradigm that is officially associated with the construction of artificial intelligence systems at the human level [Cassimatis et al., 2006; Cassimatis, 2006] or universal ones [Langley, 2006].

Such integration studies are necessary, but are they sufficient? The general idea that a strong AI should be created as a single system incorporating certain basic cognitive functionality is quite obvious and has been expressed for a very long time. However, there is still neither a minimally necessary list of cognitive functions nor, still less, a well-grounded account of how they should be implemented.
Moreover, there exist not only many essentially different cognitive architectures [Jones and Wray, 2006], but also architectural paradigms alternative to the cognitive one [Langley, 2006]. At the same time, cognitive architectures concentrate mainly on questions of integration, on the interaction of individual functions. But can a strong AI be obtained from weak cognitive components? In our opinion, the answer is unequivocal: no. Instead of (or at least in addition to) discussing the methodological issues of combining existing weak components, a theory of strong AI must be developed, from which both the structure of the strong components and the architecture needed to unify them will follow.

As rightly noted in [Cohen, 2005], “poor performance and universal scope” are preferable. Given that, as mentioned, the creation of effective highly specialized systems brings us hardly any closer to strong AI, it is natural to ask: what do modern cognitive systems lack in terms of universality?

Universality as algorithmic completeness.

Historically, several fundamental directions have taken shape in the field of artificial intelligence, such as search and learning. These directions become clearly visible when intellectual tasks are posed in their most simplified, pure form. Thus, for game problems or theorem proving one can propose a universal solution: exhaustive enumeration of options in the space of possible operations. Of course, with finite computational resources complete enumeration is impossible, but this does not eliminate search as a fundamental component of intelligence. When the search space is not known in advance, the problem of learning arises (more precisely, of predicting how particular operations will affect the states of the world and of the agent itself). Here the universal solution is less obvious, but it too has been known almost since the origin of the field of AI. It is R. Solomonoff's universal prediction method [Solomonoff, 1964], based on algorithmic information theory. This method is likewise inapplicable in practice, since it requires an enormous enumeration of options (and, strictly speaking, requires solving the algorithmically unsolvable halting problem).
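For reference, Solomonoff's scheme can be stated compactly (in standard notation, following presentations such as [Hutter, 2005]): the prior probability of a binary string x is the total weight of all programs p under which a universal machine U outputs something beginning with x, and prediction is obtained by conditioning:

$$M(x) \;=\; \sum_{p\,:\,U(p)=x*} 2^{-\ell(p)}, \qquad M(x_{n+1} \mid x_{1:n}) \;=\; \frac{M(x_{1:n}\,x_{n+1})}{M(x_{1:n})},$$

where $\ell(p)$ is the length of program $p$ in bits. The $2^{-\ell(p)}$ weighting is what favors short (simple) explanations of the observed data.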

It is precisely these ideal methods that should be approximated under conditions of limited computational resources, since only the resource limit separates them from serving as the basis of a strong AI. For example, all the problems of heuristic programming and metaheuristic search methods arose from attempts to solve the search problem with limited resources. Likewise, the problems of machine learning, including, for example, transfer learning, concept learning, and much else, stem from limited resources. Researchers developing practical methods, however, often do not look back at the ideal toward which one should strive. This leads to methods of weak artificial intelligence that possess a fatal defect: these methods are not Turing-complete, that is, they operate in a restricted space of algorithms and in principle cannot go beyond those restrictions. Although for different particular methods the domains in the space of algorithms may differ, their finite combination cannot yield an algorithmically complete space. For machine learning methods this means the impossibility of discovering an arbitrary regularity that may be present in the data, the inability to build a model of the world not foreseen by the developer.
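To make the defect tangible, here is a toy contrast of our own (none of it is from the article): a learner confined to low-degree polynomials, a class nowhere near algorithmically complete, cannot capture a regularity that a Turing-complete hypothesis space expresses as a one-line program.

```python
import numpy as np

# Target regularity: the Thue-Morse sequence, generated by a tiny program.
def thue_morse(n: int) -> int:
    return bin(n).count("1") % 2

xs = np.arange(64)
ys = np.array([thue_morse(int(n)) for n in xs])

# Restricted hypothesis space: polynomials of degree <= 2.
# No setting of its parameters can represent the rule above.
coeffs = np.polyfit(xs, ys, deg=2)
poly_acc = ((np.polyval(coeffs, xs) > 0.5).astype(int) == ys).mean()
print("polynomial accuracy:", poly_acc)          # near chance level

# An algorithmically complete space contains the generator itself as a
# short program; a 2**-length prior would strongly favor it.
program = "bin(n).count('1') % 2"
prog_acc = np.mean([eval(program, {"bin": bin}, {"n": int(n)}) == y
                    for n, y in zip(xs, ys)])
print("program accuracy:", prog_acc)             # 1.0
```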

Here lies the answer to why work on cognitive architectures (as an approach to strong AI) is not sufficient. It proceeds from the premise that modern methods of search in solution spaces, knowledge representation, and machine learning are adequate, and that all they lack is a combination from which a new quality - strong AI - will emerge. We, however, believe that the universality of intelligence consists in its being able, in principle, to operate with any models from an algorithmically complete space (even though in practice this is, of course, not fully achieved). In this connection it is useful to distinguish the concepts of universal and strong AI. Although they may in fact denote the same thing, the concept of strong AI implicitly carries the desire to create models outwardly resembling human intelligence, whereas the concept of universal AI forces attention, first of all, to the absence of insurmountable restrictions on what the AI will be able to learn and in what environments it will be able to act adequately.
To ensure this, one can start with an idealized model of strong AI operating with infinite resources. Since a truly autonomous artificial intelligence should be created as an embodied intellectual agent, an idealized model of such an agent must be developed that would hypothetically solve all the tasks a human can solve.

Attempts to create such models exist (the best known is AIXI [Hutter, 2005]), and we will discuss them later. For now we only note that consideration of such models leads different researchers to the conclusion that it is algorithmic completeness that ensures the universality of intelligence, and that this property should be preserved, at least in the limit (see, for example, [Pankov, 2008]).
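For concreteness, the action selection of such an idealized agent can be written out. In the notation of [Hutter, 2005], AIXI picks the action maximizing reward summed to a horizon $m$, in expectation over a Solomonoff-style mixture of all environment programs consistent with the interaction history:

$$a_t \;=\; \arg\max_{a_t}\sum_{o_t r_t}\;\cdots\;\max_{a_m}\sum_{o_m r_m}\,(r_t+\cdots+r_m)\sum_{q\,:\,U(q,\,a_1\ldots a_m)\,=\,o_1 r_1\ldots o_m r_m} 2^{-\ell(q)},$$

where $a$ are the agent's actions, $o$ and $r$ are the environment's observations and rewards, and the inner sum weights every environment program $q$ that reproduces the history by $2^{-\ell(q)}$.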
Thus, the first methodological principle is to preserve, without restriction, the algorithmic completeness of the set of models (laws, concepts, representations) that a universal AI system can derive or use.

Realizability as resource constraints.

Models of universal algorithmic intelligence can be a good starting point. But it is equally obvious that limited resources must be taken into account for these models to be realizable. After all, this limitation largely determines the specifics of our cognitive processes.

Indeed, judging by their “cognitive operations”, universal intelligence models have almost nothing in common with real intelligence. Such models will not explicitly build a system of concepts, will not carry out planning, will not have attention, and so on. It is extremely difficult to say whether they would have anything like “understanding” or self-awareness. Here one can draw an (incomplete) analogy with a chess program that, thanks to unlimited resources, performs an exhaustive search. Such a program is extremely simple. Its only fundamental operation is search. It has no description of chess positions in any derived terms, nothing resembling understanding. Yet within the framework of chess it behaves perfectly. In a similar way one can try to imagine an ideal embodied intellect acting in the real world.
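The analogy can be made concrete. Below is a minimal exhaustive-search player, written for tic-tac-toe rather than chess so that the complete search actually terminates (the game choice and the code are our illustration, not from the article). Its only fundamental operation is enumeration; it has no positional concepts and nothing like understanding, yet within its game it plays perfectly.

```python
# Boards are tuples of 9 characters from 'XO '.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def winner(board):
    for a, b, c in LINES:
        if board[a] != ' ' and board[a] == board[b] == board[c]:
            return board[a]
    return None

def minimax(board, player):
    """Return (game value for 'X' under perfect play, best move index)."""
    w = winner(board)
    if w is not None:
        return (1 if w == 'X' else -1), None
    free = [i for i, c in enumerate(board) if c == ' ']
    if not free:
        return 0, None                      # draw
    best_val, best_move = None, None
    for i in free:
        child = board[:i] + (player,) + board[i + 1:]
        val, _ = minimax(child, 'O' if player == 'X' else 'X')
        if (best_val is None or
                (player == 'X' and val > best_val) or
                (player == 'O' and val < best_val)):
            best_val, best_move = val, i
    return best_val, best_move

print(minimax((' ',) * 9, 'X'))             # (0, 0): perfect play is a draw
```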

The absence of most cognitive functions in such an ideal intellect can mean one of two things. Either these functions are a consequence of limited resources (for a number of them, such as attention, this is quite obvious). Or the intellect is something rather different from what is usually meant by the term (which here means a means of solving problems, chief among which is survival). Perhaps the second alternative is not so meaningless (and not so contrary to the first), if intelligence is taken to be not just any way of solving problems but some particular kind of way (that is, if what matters for intelligence is not so much the functionality as how it is achieved). At the same time, with infinite computational resources reasonable behavior can be achieved by far simpler means. Fortunately, there is no need to debate whether a system that realizes ideally adequate behavior through “brute computational power” rather than through “intelligence” (some structural complexity of “thinking” processes) deserves to be called intelligent, precisely because of the hypothetical character of such a system. The only thing that does need to be discussed is whether such a system would truly possess all the capabilities of natural intelligence. If there is any doubt on this point, it must be overcome either by justifying the attainability of the corresponding capabilities or by refining the model.

The idea of limited resources as a fundamental property of strong AI that defines its architecture has been expressed before [Wang, 2007]. But this idea alone is also insufficient to be guided by, as will be discussed below. For now we only note that accounting for limited resources must not violate the (algorithmic) universality of intelligence. Roughly speaking, real intelligence is an “anytime” method that tends toward ideal intelligence as computational resources grow without bound.
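The anytime character can be sketched as a pattern (our illustration, not from the article): the method always has an answer, and a larger budget buys an answer closer to the ideal one.

```python
import time

def anytime(refine, budget_seconds: float):
    """Run refine(depth) at ever greater depth until the budget runs out.

    refine stands for any search or inference procedure that can be
    deepened step by step; the deeper the pass, the closer its output is
    to what unlimited resources would yield. (The final pass may overrun
    slightly; a real anytime method would also be interruptible.)
    """
    deadline = time.monotonic() + budget_seconds
    depth, best = 0, None
    while time.monotonic() < deadline:
        depth += 1
        best = refine(depth)
    return best, depth

# Usage: approximating pi by the Leibniz series with 10**depth terms; with
# more budget the returned value tends toward the ideal limit.
value, depth = anytime(
    lambda d: 4 * sum((-1) ** k / (2 * k + 1) for k in range(10 ** d)), 0.2)
print(value, depth)
```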

The developers of universal models of algorithmic intelligence agree on the need to introduce resource constraints (see, for example, [Schmidhuber, 2007], [Hutter, 2007]). Attempts to impose resource restrictions in these models can be regarded as a second step toward universal AI, although it is difficult to judge how significant this step is: such models are often “too universal” in the sense that their authors try to build in minimal bias about which world they are to function in.
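A classic scheme for enforcing such resource bounds, in whose spirit these proposals are often framed, is Levin search: in phase i, each candidate program p is run for 2^(i - l(p)) steps, so the total effort per phase stays bounded while shorter programs receive exponentially more of it. A toy sketch (the "programs" and the interface are our own illustration):

```python
from itertools import count

def levin_search(programs, accepts, max_phase=25):
    """programs: list of (name, generator_factory, length_in_bits).

    In phase i a program of description length l is (re)started and run
    for 2**(i - l) steps, so short programs get exponentially more time.
    """
    for phase in range(1, max_phase + 1):
        for name, make_gen, length in programs:
            steps = 2 ** (phase - length) if phase >= length else 0
            gen = make_gen()                 # restart with a larger budget
            for _ in range(steps):
                out = next(gen, None)
                if out is not None and accepts(out):
                    return name, out, phase
    return None

# Two toy "programs" searching for the number 2**16: counting up by one
# (short description) and repeated doubling (slightly longer). Doubling
# reaches the target in 17 steps and wins at phase 8; counting would need
# 65536 steps and a far later phase.
programs = [
    ("increment", lambda: (k for k in count(1)), 2),
    ("double",    lambda: (2 ** k for k in count()), 3),
]
print(levin_search(programs, lambda x: x == 2 ** 16))  # ('double', 65536, 8)
```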
Thus, the second methodological principle is to construct the architecture of real universal intelligence by introducing resource constraints into the model of ideal universal intelligence.

A priori information about the world as the main content of the phenomenon of intelligence.

Embodied intelligence is limited not only in the number of computational operations performed when solving problems of induction and deduction, but also in the number of actions performed in the physical world. The second type of limitation is fundamentally irreducible to the first, although there is some interconnection between them: performing an action can remove the need to reason, and, conversely, thinking things through can reduce the number of trial actions in the physical world. It is this type of constraint that is not taken into account in models of ideal algorithmic intelligence with limited computational resources.

Globally, the growth in the effectiveness of the actions taken comes primarily from accumulating information about the external world. One can imagine a model of ideal intelligence with a minimum of a priori information. Such an intellect would be able to learn anything (including the efficient use of its own computational resources) and in the limit would be as effective as a specialized intellect, but it would take too much time to get there. And, of course, such an intelligence could not survive autonomously during its primary learning.

At the same time, a priori information for real intelligence can take the most diverse forms; in particular, it can exist in the form of abilities such as imitation. Indeed, an ideal intellect could be expected to master imitation without possessing this ability in advance, but to do so it would first have to accumulate too much information. If this ability is available immediately, it can considerably speed up the optimization of the agent's own actions in the physical world. It should be noted that models of learning by imitation in robots are now widely studied (as is the study of mirror neurons in neurophysiology). The problem, however, is to make this mechanism (like all other additional a priori mechanisms) consistent with the universality of intelligence. Likewise, linguistic abilities must to some extent be built in a priori - not because universal intelligence could not, in principle, acquire them independently, but because this acquisition might take too much time.

Explaining a number of cognitive abilities as a priori information about the external world (both physical and social), which speeds up the development of intelligence (a development that essentially comes down to the accumulation and processing of information), is fairly obvious. However, this explanation has not been used to specify the structure of universal intelligence. We are interested in the minimum amount of a priori information, and the form of its representation, that would allow a real AI to develop no more slowly than a human. A fundamental issue here is how a priori information is to be embedded in the structure of a universal AI.

The importance of this point can be seen in the flexibility of the architecture of natural intelligence. For example, the human brain does not assume in advance that linguistic information will be transmitted through speech. During the formation of proto-concepts, mechanisms related to conditioned reflexes are at work. If the ability to form true concepts is built in a priori, it is nevertheless not tied to a sensory modality. Such universality must be retained even when a priori elements are introduced into the structure of an AI. In current models of concept learning, by contrast, not only is the division into semantic and linguistic channels imposed a priori, but so is the binding to a modality. A similar conclusion can be drawn regarding the modeling of other cognitive mechanisms that reflect a priori information. The most striking example here is expert systems, in which a large amount of knowledge is built in a priori with no possibility of autonomous extension - something that obviously must be avoided in the case of universal AI.

On the other hand, it is precisely the volume of a priori information needed by real intelligence, and the diversity of its forms (it can concern the most varied aspects of the external world as well as heuristics for the optimal use of the system's own resources), that makes the creation of AI so difficult. In this sense, simple models of universal intelligence bring us only a little closer to its creation. Practically used cognitive architectures could be even more useful, were it not that they require complete reworking in any attempt to make them universal. Instead of adding the universality property to existing systems originally composed of weak components, it would be more productive to start with a universal but impractical system and consistently add to it the heuristics accumulated in the field of classical AI.

The substantive complexity of the intellect, its cognitive architecture, is what allows it to act in the existing world under limited resources and without excessively long training. But this means that the main complexity of our intellect stems from its being optimized for the surrounding world. The structure of such an intelligence cannot be derived theoretically within universal models of intelligence; it must be obtained empirically, either by a universal intelligence itself or by its developers. Naturally, we too want to make intelligence as universal as possible. More precisely, such an intelligence can be just as universal as the simplest models mentioned above; the difference between them will lie only in a shift of preferences, a bias toward our world. Naturally, the gain in effectiveness of such an intelligence in our world will come at the cost of reduced effectiveness (though not to zero - that is what universality consists in) in some other possible worlds; but given that it will have to act primarily in our world, this is quite acceptable.

The loss of universality itself, however, is unacceptable, since our world is itself a “universal environment”. In this regard, it is quite possible to begin building a real AI from the universal “unbiased” models. Heuristics reflecting the peculiarities of our world can then be introduced into them gradually, starting with the most general, until the AI can act autonomously (including self-optimization) with sufficient effectiveness.
Thus, the third methodological principle is to introduce a priori information into universal intelligence so as to reduce the amount of data that must be acquired in ontogenesis for the agent to function autonomously in the real world, provided that subsequent universal induction and deduction remain consistent with this a priori information.

Continued in Part 2.

Literature.

(McCarthy, 2005) McCarthy J. The Future of AI — A Manifesto // AI Magazine. 2005. V. 26. No 4. P. 39.
(Nilsson, 2005a) Nilsson N.J. Reconsiderations // AI Magazine. 2005. V. 26. No 4. P. 36–38.
(Nilsson, 2005b) Nilsson N.J. Human-Level Artificial Intelligence? Be Serious! // AI Magazine. 2005. V. 26. No 4. P. 68–75.
(Brachman, 2005) Brachman R. Getting Back to “The Very Idea” // AI Magazine. 2005. V. 26. No 4. P. 48–50.
(Bobrow, 2005) Bobrow D.G. AAAI: It's Time for Large-Scale Systems // AI Magazine. 2005. V. 26. No 4. P. 40–41.
(Cassimatis et al., 2006) Cassimatis N., Mueller E.T., Winston P.H. Achieving Human Intelligence through Integrated Systems and Research // AI Magazine. 2006. V. 27. No 2. P. 12–14.
(Langley, 2006) Langley P. Cognitive Architectures and General Intelligent Systems // AI Magazine. 2006. V. 27. No 2. P. 33–44.
(Cassimatis, 2006) Cassimatis N.L. A Cognitive Substrate for Achieving Human-Level Intelligence // AI Magazine. 2006. V. 27. No 2. P. 45–56.
(Jones and Wray, 2006) Jones R.M., Wray R.E. Comparative Analysis of Frameworks for Knowledge-Intensive Intelligent Agents // AI Magazine. 2006. V. 27. No 2. P. 57–70.
(Cohen, 2005) Cohen P.R. If Not Turing's Test, Then What? // AI Magazine. 2005. V. 26. No 4. P. 61–67.
(Hall, 2008) Hall J.S. Engineering Utopia // Frontiers in Artificial Intelligence and Applications (Proc. 1st AGI Conference). 2008. V. 171. P. 460–467.
(Pankov, 2008) Pankov S. A computational approximation to the AIXI model // Frontiers in Artificial Intelligence and Applications (Proc. 1st AGI Conference). 2008. V. 171. P. 256–267.
(Duch et al., 2008) Duch W., Oentaryo R.J., Pasquier M. Cognitive Architectures: Where Do We Go From Here? // Frontiers in Artificial Intelligence and Applications (Proc. 1st AGI Conference). 2008. V. 171. P. 122–136.
(Yudkowsky, 2011) Yudkowsky E. Complex Value Systems in Friendly AI // Proc. Artificial General Intelligence - 4th International Conference, AGI 2011, Mountain View, CA, USA, August 3–6, 2011. Lecture Notes in Computer Science 6830. Springer. 2011. P. 388–393.
(Kurzweil, 2005) Kurzweil R. The Singularity is Near. Viking, 2005.
(Solomonoff, 1964) Solomonoff R.J. A formal theory of inductive inference: parts 1 and 2 // Information and Control. 1964. V. 7. P. 1–22, 224–254.
(Schmidhuber, 2003) Schmidhuber J. The New AI: General & sound & relevant for physics. Technical Report TR IDSIA-04-03, Version 1.0, cs.AI/0302012 v1, IDSIA. 2003.
(Hutter, 2001) Hutter M. Towards a Universal Theory of Artificial Intelligence Based on Algorithmic Probability and Sequential Decisions // Proc. 12th European Conf. on Machine Learning (ECML-2001), volume 2167 of LNAI, Springer, Berlin. 2001.
(Hutter, 2005) Hutter M. Universal Artificial Intelligence. Sequential Decisions Based on Algorithmic Probability / Springer. 2005. 278 p.
(Wang, 2007) Wang P. The Logic of Intelligence // in Artificial General Intelligence. Cognitive Technologies, B. Goertzel and C. Pennachin (Eds.). Springer. 2007. P. 31–62.
(Schmidhuber, 2007) Schmidhuber J. Gödel Machines: Fully Self-Referential Optimal Universal Self-improvers // in Artificial General Intelligence. Cognitive Technologies, B. Goertzel and C. Pennachin (Eds.). Springer. 2007. P. 199–226.
(Hutter, 2007) Hutter M. Universal Algorithmic Intelligence: A Mathematical Top→Down Approach // in Artificial General Intelligence. Cognitive Technologies, B. Goertzel and C. Pennachin (Eds.). Springer. 2007. P. 227–290.
(Goertzel and Pennachin, 2007) Goertzel B., Pennachin C. The Novamente Artificial Intelligence Engine // in Artificial General Intelligence. Cognitive Technologies, B. Goertzel and C. Pennachin (Eds.). Springer. 2007. P. 63–130.
(Garis, 2007) de Garis H. Artificial Brains // in Artificial General Intelligence. Cognitive Technologies, B. Goertzel and C. Pennachin (Eds.). Springer. 2007. P. 159–174.
(Red'ko, 2007) Red'ko VG The Natural Way to Artificial Intelligence // in Artificial General Intelligence. Cognitive Technologies, B. Goertzel and C. Pennachin (Eds.). Springer. 2007. P. 327–352.

Source: https://habr.com/ru/post/145309/

