About computer understanding of text

A girl who works as a translator, gazing thoughtfully at the ceiling, once asked me: “Can computers ever understand a text the same way a person does?” At the time I could not answer, but now, having some knowledge of text semiotics, I am sure that computers will never be able to understand text the way humans do.

In this article I will discuss several examples of difficulties that pose no problem at all for a person but are practically insoluble for a computer.

(By computer, I do not mean abstract artificial intelligence, but a computing device that performs a computational process. This is important.)

To begin with, let's figure out what the meaning of a text is. It used to be thought, and many still think, that a text has a single intrinsic meaning. This seems obvious, but it is not so. Meaning is the change in the internal state of an interpreter under the influence of a message. Thus, meaning is a function of both the interpreter and the message, not of the message alone. If you change the interpreter or its initial state, the meaning changes. [1]

Consider the phrase: "First Nikolai printed out a letter from Sonya." How should "printed out" be understood here? In the Russian original, the same verb means both "to unseal" and "to print out". If you know that this is an excerpt from the novel “War and Peace”, the meaning must be "unsealed", "broke the seal", but why? Because the action takes place in the 19th century, when there were no printers and letters were sent in envelopes that had to be unsealed before they could be opened. The same word in the same phrase in a modern text, however, could well mean "sent to a printer". Thus, an interpreter who knows the origin of the text and something about the history of technology will understand it differently from one who lacks this knowledge.
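The idea that meaning depends on the interpreter as well as the message can be illustrated with a minimal Python sketch. Everything here is hypothetical and invented for illustration: the `Interpreter` class, the `interpret` function and the single background fact stand in for what would really be a vast body of knowledge.

```python
from dataclasses import dataclass, field


@dataclass
class Interpreter:
    """A toy reader: a name plus a set of background facts it knows."""
    name: str
    knowledge: set = field(default_factory=set)


def interpret(reader: Interpreter, message: str) -> str:
    """Return the sense this reader assigns to the ambiguous verb."""
    if "printed out" not in message:
        return "no ambiguity"
    # A reader who knows the text predates printers picks the older sense.
    if "set in the 19th century" in reader.knowledge:
        return "unsealed"
    # Without that knowledge, the modern reading is the default guess.
    return "sent to a printer"


message = "Nikolai printed out a letter."
informed = Interpreter("informed", {"set in the 19th century"})
naive = Interpreter("naive")

print(interpret(informed, message))  # unsealed
print(interpret(naive, message))     # sent to a printer
```

The same message yields two different results because the second argument, the reader, differs. The sketch hard-codes the one fact that matters; the difficulty described in this article is that a real interpreter cannot know in advance which facts will matter.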

(Amusingly, Google Translate renders "printed out" correctly in the phrase above, but only because it always translates this verb as "opened" when it appears next to the word "letter". As a result, it mistranslates the phrase "He printed out the letter and the printer broke.")

In a literary text, the author most often relies on a specific body of reader knowledge, the cultural context. Postmodern texts, however, may well be designed to be understood differently, yet equally validly, by readers from different cultural contexts. [1]

For a person, drawing on the cultural context (or knowledge base) is so natural that he usually does not notice it. Yet it is sometimes surprising how complex the logical constructions needed to prove what seems obvious to anyone turn out to be. A computer will have to work through these constructions, and the speed of its understanding will depend on their number and complexity. Of course, one could devise a lightweight language that requires nothing more for understanding than knowledge of the words used in the text. Such a language, however, would lack many properties of natural language. What is it that such a language must not require of an interpreter?

Understanding defaults

Once, a Japanese delegation flew to the United States. The head of the delegation sent the host party a letter asking for a bus to be sent to the hotel at 7:00 am. The next day, at 7:00 am, a bus was standing at the hotel, but without a driver. The Japanese had expected the Americans to understand what was left unsaid. In fact, a request to send a bus may or may not carry an unstated default, depending on the situation, and the interpreter must take this into account by analyzing the circumstances with the help of its knowledge base.

Understanding metaphors

A metaphor designates an object by the name of another object, chosen for the similarity of their properties. Thus, the interpreter must not only know the properties of both objects but also be able to pick out the shared ones, which requires still more knowledge, namely knowledge of the properties of properties.
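The first half of this requirement, finding the properties two objects share, can be sketched as a set intersection. This is only a toy illustration: the `PROPERTIES` table and the `metaphor_ground` function are hypothetical, and the property lists are invented.

```python
# Hypothetical property sets; a real knowledge base would be far larger.
PROPERTIES = {
    "lion": {"animal", "brave", "strong", "has a mane"},
    "soldier": {"human", "brave", "strong", "wears a uniform"},
    "snail": {"animal", "slow", "has a shell"},
    "old computer": {"machine", "slow", "heavy"},
}


def metaphor_ground(described: str, borrowed_name: str) -> set:
    """Return the properties shared by the object and the name applied to it."""
    return PROPERTIES[described] & PROPERTIES[borrowed_name]


print(sorted(metaphor_ground("soldier", "lion")))        # ['brave', 'strong']
print(sorted(metaphor_ground("old computer", "snail")))  # ['slow']
```

The intersection only works because the properties were written down by hand in matching words. Deciding which shared properties are actually relevant to the metaphor, and recognizing that two differently worded properties are alike, is the part that needs knowledge about the properties themselves, and the sketch does not attempt it.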

Understanding humor

Humor is a difficult thing, so difficult that if it were used in a Turing test, not every person would pass. Rigorous criteria for what is funny can be found, but checking a statement against them often requires non-trivial inferences that draw on the cultural context. That is why not everyone understands humor.

Reading "between the lines"

Often a text, especially a literary one, carries more information than appears at first glance; take, for example, the first sentence of this article. Perceiving such hidden meanings, however, requires an understanding of humor, metaphors and defaults.

Could a language that lacks all these properties be considered natural? Hardly. Rather, it would resemble a conventional programming language, namely Prolog, and learning it would take no less effort than learning any other programming language, which is not the goal of creating AI.

If we say that an AI understands text, we should expect it to solve the text analysis problems that a person can handle, at least passably. For example:

Determine the author's ideology (right / left) from the text.
After analyzing the text, print ":)" if the text is cheerful and ":(" if the text is sad.
From the first sentence of this article, determine the relationship between the author and the girl.
Decipher the meaning of a new, unfamiliar saying.
Translate a literary text from one language to another.
Write an essay on a given topic.
Etc.

An AI should not merely be able to solve these problems; it should solve them as quickly as people do, or faster. In addition, training the AI should take no more time than training a person. Otherwise, using AI makes no practical sense.

Why not simply represent all the knowledge of a human interpreter as a semantic network or a set of predicates? It is theoretically possible, but the volume of such a knowledge base would be enormous. Each word in each phrase calls for a definition, which is itself a phrase made of words, each of which calls for a definition in turn. And so on, until the definitions close in on themselves. And this does not yet take into account noun-verb and verb-adverb combinations, idioms, and so on, each of which can have its own unique definition. Thus, the volume of the knowledge base grows exponentially with the interpreter's vocabulary. At such volumes, even moving computers to components at the scale of elementary particles (if that is possible at all) would not provide enough memory or performance. This, in fact, is the obstacle that text analysis runs up against today.
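How a single word pulls in a chain of further definitions can be shown with a small Python sketch. The `DEFINITIONS` dictionary and the `definition_closure` function are hypothetical and invented for illustration. The sketch only demonstrates the expansion process on a toy vocabulary and makes no claim about actual growth rates.

```python
from collections import deque

# Hypothetical toy dictionary: each word is defined by other words.
DEFINITIONS = {
    "letter": ["written", "message", "paper"],
    "message": ["information", "sent", "person"],
    "written": ["marks", "paper"],
    "paper": ["material", "thin"],
    "information": ["facts"],
    "sent": ["moved", "person"],
    "person": ["human"],
}


def definition_closure(word: str) -> set:
    """Collect every word reachable by repeatedly expanding definitions."""
    seen = {word}
    queue = deque([word])
    while queue:
        current = queue.popleft()
        # Words without an entry are treated as primitives and not expanded.
        for part in DEFINITIONS.get(current, []):
            if part not in seen:
                seen.add(part)
                queue.append(part)
    return seen


closure = definition_closure("letter")
print(len(closure))  # 13
```

Even with three-word definitions and seven entries, understanding the one word "letter" requires thirteen words. The sketch also ignores word combinations and idioms, which would each need entries of their own.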

Evidently, the human brain does not use any of the known methods of knowledge representation. Moreover, there are many reasons to believe that the brain functions in a fundamentally non-algorithmic way and therefore cannot be modeled by a computational process. [2] The outstanding English mathematician Roger Penrose even thinks that the brain cannot do without quantum effects. In any case, nobody fully knows how the brain works, and many discoveries are still possible in this area. As for computers in the modern sense of the word, it can be said with confidence that they will never understand text.

1. The Role of the Reader. / U. Eco.
2. The Emperor's New Mind. / R. Penrose.

Source: https://habr.com/ru/post/126748/
