As you have probably guessed, this post is about so-called virtual interlocutors, or chat bots, as they are commonly called. In the title I deliberately used the term "intelligent dialogue system" (hereinafter abbreviated as IDS), because I believe that the terms "chat bot" and "virtual interlocutor" are thoroughly discredited and do not capture the essence of this phenomenon.
This post covers the design of an IDS and the difficulties that arise along the way. It also looks at common algorithms used in IDS, their advantages and disadvantages, and much more. If the topic interests you, read on.
Why do we need an IDS?
The range of applications for such systems is incredibly wide. The first things that come to mind:
- automatic testing of knowledge (schoolchildren, students, applicants for a position, etc.)
- an automated user support service
- diagnostics (diseases, malfunctions, etc.)
- entertainment, after all
What is the situation with IDS in Russia?
It's bad. There are many programs of this kind, but the overwhelming majority are primitive, and there can be no talk of any intelligence. Both self-learning and the ability to maintain a conversation on a given topic are usually absent. The most their creators manage is two dozen lines of code and a question-answer database.
Not long ago I came across the company Nanosemantika, which creates so-called "infs" (virtual interlocutors). But on closer acquaintance with these infs I was disappointed. They look nice, but in fact there is nothing there worthy of attention.
The company's website says:
Among other things, the integrated platform includes the inf's knowledge base: a set of flexible scenarios with predefined variants of questions and answers to them.
There is no self-learning. An inf cannot sustain a dialogue (unless the "teacher" tries hard and enters a very, very large number of answers into the database). In other words, it is primitive again. But I will not scold Nanosemantika. It is clear that this is a commercial project with purely pragmatic goals.
Portrait of an ideal IDS
What is the minimum a truly intelligent dialogue system should be able to do? I am firmly convinced of the following:
1. A dialogue system that claims to be intelligent should be able to maintain a conversation on a given topic. Ideally, it should not just answer questions, but also ask them, argue, and defend its point of view (and, of course, have one). Clearly, this is very hard to achieve in practice, but some ideas (I am speaking for myself) are already there.
2. An IDS should have a self-learning mechanism (at least a primitive one). Without it, one can hardly bring oneself to call the system intelligent.
3. An IDS should be able to construct a response in natural language (at least in some specific cases), and not just mindlessly return a canned response.
4. The dialogue system must have a goal of its own. At the beginning of a dialogue, such a goal could be getting to know the person: finding out their gender, age, needs, interests, and so on. Then, based on this data, it should address the user appropriately, for example by first name or by first name and patronymic, informally or formally (as the person prefers). There should also be a "memorization" and "recognition" mechanism, so that a new conversation does not have to start from scratch.
5. In some cases the dialogue system should have a personality and emotions. Otherwise, communicating with it will be boring.
6. Another skill, not the most important but still necessary, is the ability to perform actions: open a URL, search a site, register a user, send an e-mail, and so on.
Naturally, which of these skills a system needs should be determined by its scope. For example, an automated support system has no reason to have emotions (and they would even be harmful). I can already picture an angry user who, instead of a solution to their problem, gets the reply: "Why don't you go to ... I'm sad. So there!"
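The "memorization" and "recognition" mechanism from item 4 could start out as something very small. The following is only a sketch under assumptions: the names `UserProfile`, `Memory`, `memorize`, `recognize` and `greet`, and the idea of keying users by some `user_id`, are all hypothetical and not taken from any existing system.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class UserProfile:
    """What the system has learned about one user (hypothetical structure)."""
    name: Optional[str] = None
    facts: Dict[str, str] = field(default_factory=dict)  # e.g. age, interests


class Memory:
    """In-memory store of profiles; a real system would persist this."""

    def __init__(self) -> None:
        self._profiles: Dict[str, UserProfile] = {}

    def recognize(self, user_id: str) -> Optional[UserProfile]:
        # "Recognition": return the stored profile, or None for a stranger.
        return self._profiles.get(user_id)

    def memorize(self, user_id: str, name: Optional[str] = None, **facts: str) -> UserProfile:
        # "Memorization": create the profile on first contact, then update it.
        profile = self._profiles.setdefault(user_id, UserProfile())
        if name is not None:
            profile.name = name
        profile.facts.update(facts)
        return profile


def greet(memory: Memory, user_id: str) -> str:
    """Start a conversation differently for known and unknown users."""
    profile = memory.recognize(user_id)
    if profile is None or profile.name is None:
        return "Hello! What is your name?"
    return f"Welcome back, {profile.name}!"
```

With this in place, the first call to `greet` for a user asks for a name, and after `memorize(user_id, name=...)` the next conversation no longer starts from scratch.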
Internal structure of an IDS
Practically all existing IDS (at least those whose design I am familiar with) have a knowledge base of the form:
user phrase or its significant part | one or several system response options
There are, of course, exceptions, but this is the norm. Such an approach inevitably leads to problems. First, the ability to address a person according to their gender is lost, and the system has to settle on either the informal or the formal "you" for everyone. Second, typos and user errors have to be taken into account. This means that the database must contain entries such as:
what is your name | my name is so-and-so
how tibya zavut | my name is so-and-so
If errors are not taken into account, the program will not respond adequately to a misspelled question such as "how tibya zavut".
Third, if the database does not consist of complete phrases, incorrect system responses are inevitable. For example, the database contains the following entry:
how old | I am 2 years old
Obviously, the author wanted to cover several variants of the question about the program's age. For the variants of "How old are you?", the program answers correctly: "I am 2 years old".
But what if it is asked "How old is your creator?" or "How old is planet Earth?" The answer "I am 2 years old" will hardly satisfy the user in this case.
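The failure modes described above are easy to reproduce. The sketch below assumes the simplest possible implementation of the "phrase fragment | response" base, a dictionary searched by substring; `KNOWLEDGE_BASE` and `naive_reply` are hypothetical names used only for illustration.

```python
from typing import Optional

# Hypothetical knowledge base in the "phrase fragment | response" format.
KNOWLEDGE_BASE = {
    "what is your name": "My name is so-and-so",
    "how old": "I am 2 years old",
}


def naive_reply(user_phrase: str) -> Optional[str]:
    """Return the response of the first stored fragment found in the phrase."""
    text = user_phrase.lower()
    for fragment, response in KNOWLEDGE_BASE.items():
        if fragment in text:  # plain substring match, no understanding at all
            return response
    return None  # nothing matched


print(naive_reply("How old are you?"))          # correct answer
print(naive_reply("How old is planet Earth?"))  # same answer, now wrong
print(naive_reply("how tibya zavut"))           # typo: no match at all
```

The first two calls both return "I am 2 years old", and the misspelled question returns `None`, which is exactly the pair of problems the fragment-based base runs into.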
Fourth, with this method of storing knowledge it is very difficult to give the system the ability to "hold the topic".
Conclusions: the described (traditional) method of data storage is not optimal. With a large database that covers the most common questions, the program may seem rather "smart", but as soon as a person touches on a topic that is not represented in the database, all of its intelligence evaporates and the system starts responding with nonsense.
If you have concluded that I am categorically against the described method of data storage, you are mistaken. Such an approach has a right to exist (with certain reservations), but if it is the only one used, it is a dead end.
So, in what cases and under what conditions can the traditional method of storing knowledge be used in an IDS knowledge base?
1. Immediately before searching the database, the user's phrase should be checked for typos and errors. Naturally, any that are found should be corrected. This can be done with a dedicated module, a spellchecker.
2. The database should store the full version of the user's phrase, and not just its significant part. This eliminates erroneous system responses.
Indeed, why bring in morphology, syntax, semantics and other machinery just to answer the question "How are you?" Such very common questions can be (and sometimes should be, for example for speed) stored in the database in the traditional way.
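The two conditions above can be sketched together: correct the phrase first, then look up the full phrase exactly. This is only an illustration under assumptions. The standard library's `difflib.get_close_matches` stands in for a real spellchecker, and `FULL_PHRASES`, `correct_typos` and `reply`, as well as the sample answers, are hypothetical.

```python
import difflib
import re
from typing import Optional

# Hypothetical base of FULL phrases (condition 2), not fragments.
FULL_PHRASES = {
    "what is your name": "My name is so-and-so",
    "how are you": "Fine, thanks",
}

# Words the system knows; used as the dictionary for typo correction.
VOCABULARY = sorted({word for phrase in FULL_PHRASES for word in phrase.split()})


def correct_typos(user_phrase: str) -> str:
    """Condition 1: replace each word with its closest known word, if any.

    A crude stand-in for a spellchecker: a word with no sufficiently
    similar vocabulary entry is left unchanged.
    """
    words = re.findall(r"[a-z']+", user_phrase.lower())
    corrected = []
    for word in words:
        matches = difflib.get_close_matches(word, VOCABULARY, n=1, cutoff=0.6)
        corrected.append(matches[0] if matches else word)
    return " ".join(corrected)


def reply(user_phrase: str) -> Optional[str]:
    """Exact lookup of the corrected, normalized full phrase."""
    return FULL_PHRASES.get(correct_typos(user_phrase))


print(reply("Waht is yuor name?"))
print(reply("How old is planet Earth?"))
```

Because the lookup is on the whole phrase, an unrelated question simply finds nothing instead of triggering a wrong answer, and the database no longer needs a separate entry for every misspelling.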
How should everything else be stored? I believe that for this purpose it is necessary to develop an internal language that the system understands. That is, the user's request goes through preprocessing (for example, the previously mentioned correction of errors and typos, then morphological, syntactic and semantic analysis, and so on). It is then translated into an internal form the system understands. Only after that is it looked up in the database.
With this approach, different variants of a request can be reduced to a single form. Let me illustrate. Suppose we have the following user phrases:
How old are you? (literally: "how many years to you?")
How old are you? (the same phrase, with "to you" misspelled as "tibe")
What is your age?
First phrase: the interrogative "how many" unambiguously marks the phrase as a question. The pronoun "to you" unambiguously marks it as a personal question to the system. The noun "years" belongs to the category "units of time -> age". We get: the system was asked a personal question about its age.
Second phrase: the same, except that "tibe" is first corrected to "you".
Third phrase: the interrogative pronoun "which" unambiguously marks the phrase as a question. The possessive pronoun "your" unambiguously marks it as a personal question to the system. Finally, the noun "age" belongs to the category "age". Again we get: the system was asked a personal question about its age.
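The reasoning above can be sketched as a translation of each phrase into a small internal frame. Everything here is an assumption made for illustration: the `Frame` structure, the tiny English word lists, and the `TYPO_FIXES` table that stands in for a spellchecker. A real system would need proper morphological and syntactic analysis rather than word lookups.

```python
import re
from typing import NamedTuple, Optional


class Frame(NamedTuple):
    """Hypothetical internal form of a user phrase."""
    is_question: bool
    subject: Optional[str]  # "system" when the phrase is about the program itself
    topic: Optional[str]    # semantic category, e.g. "age"


QUESTION_WORDS = {"how", "what", "which"}
TOPIC_WORDS = {"old": "age", "years": "age", "age": "age"}
TYPO_FIXES = {"tibe": "you"}  # the typo from the second example phrase


def to_internal_form(user_phrase: str) -> Frame:
    words = [TYPO_FIXES.get(w, w) for w in re.findall(r"[a-z]+", user_phrase.lower())]
    is_question = bool(words) and words[0] in QUESTION_WORDS
    topic = next((TOPIC_WORDS[w] for w in words if w in TOPIC_WORDS), None)
    subject = None
    for i, word in enumerate(words):
        if word == "you":
            subject = "system"  # a personal question to the system
        elif word == "your" and i + 1 < len(words):
            following = words[i + 1]
            # "your age" is about the system; "your creator" is about the creator.
            subject = "system" if following in TOPIC_WORDS else following
    return Frame(is_question, subject, topic)


for phrase in ("How old are you?", "How old are tibe?", "What is your age?"):
    print(to_internal_form(phrase))
```

All three phrases reduce to the same frame, so the database needs a single entry for "personal question to the system about age", while "How old is your creator?" yields a different subject and no longer collides with it.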
Of course, this is much more complicated than the traditional way. Of course, it is fraught with many problems related to natural language processing. But who said that creating an intelligent system is easy?
That is probably enough for today. In the next article I will talk about how an IDS can be taught to "hold the topic" and to learn. I hope you enjoyed it.
P.S. In the comments I would like to see your thoughts and suggestions about storing information in the system's knowledge base.