Voice assistants do not learn from the Turing test

Three years ago, the chatbot Sonya Guseva won the first Turing test held in Russian. After three minutes of conversation, 47% of the judges took her for a human. Today, the voice assistants from Yandex and Google compete in their ability to converse in the "great and mighty" Russian language. Of course, an assistant is not obliged to imitate a person and should not deceive anyone, but is there anything Alice and Google Assistant could learn from Sonya?

Below the cut: the author's commentary on a real dialogue between the chatbot and a judge in the Turing test, plus a comparative analysis of the behavior of modern voice assistants.

The dialogue logs from the 2015 Turing test are publicly available. As an example, I chose this one:

Bot - Sonya Guseva, Judge - the human judge

2015-06-03 12:04:13 Bot: I am Sonya Guseva, hello!
2015-06-03 12:04:25 Judge: Hello
2015-06-03 12:04:25 Bot: can I ask you something?
2015-06-03 12:04:43 Judge: Of course
2015-06-03 12:04:43 Bot: what is your zodiac sign?
2015-06-03 12:04:54 Judge: Scorpio
2015-06-03 12:04:54 Bot: water signs are people who can do a lot :)))
2015-06-03 12:05:24 Judge: Every person has potential.
2015-06-03 12:05:24 Bot: hmm ... what's your name?
2015-06-03 12:05:37 Judge: Eugene
2015-06-03 12:05:37 Bot: let's get acquianted, Eugene!
2015-06-03 12:05:54 Judge: what's your name?
2015-06-03 12:05:54 Bot: Sonya, I said so right away :)))
2015-06-03 12:06:18 Judge: Yes, my memory is really bad
2015-06-03 12:06:18 Bot: well, good
2015-06-03 12:06:35 Judge: And what do you do, Anna?
2015-06-03 12:06:35 Bot: not Anna, but Sonya !!!
2015-06-03 12:06:53 Judge: But still, what do you do?
2015-06-03 12:06:53 Bot: I have not decided yet
2015-06-03 12:07:03 Judge: Do you study?
2015-06-03 12:07:03 Bot: it's the holidays right now, but normally I go to school 506, I've moved up to the 8th grade

To analyze and compare Sonya Guseva, Alice and Google Assistant, I use the criteria of EASY dialogue (in the Russian original the acronym is ЛЕГКИЙ, "easy", whose six letters label the criteria below). For each criterion I rank the bots from 1st to 3rd place, and then sum up the results.

L - Personal

Sonya Guseva was created specifically for the Turing test. Like most chatbots of this kind, she has not only a first and last name but also a detailed backstory, including hobbies, family members, and even conflicts with school teachers (according to the backstory, Sonya is 12 years old and goes to school). She is also recognizable by her particular way of using emoticons and by ignoring some of the rules of writing.

2015-06-03 12:05:54 Judge: what's your name?
2015-06-03 12:05:54 Bot: Sonya, I said so right away :)))

Alice also has a backstory of her own, albeit a more limited one. She has a peculiar sense of humor, and she is said to be in love with Konstantin Khabensky.

Google Assistant's personality is less pronounced: it does not really have a name of its own.

On the personal criterion, the places are distributed as follows: 1 - Sonya, the winner of the Turing test; 2 - Alice; 3 - Google Assistant.

E - Natural

Sonya diligently mimics a person: she makes mistakes, uses emoticons and runs of punctuation marks for extra expressiveness, and never repeats herself within a conversation.

2015-06-03 12:05:24 Bot: hmm ... what's your name?
2015-06-03 12:05:37 Judge: Eugene
2015-06-03 12:05:37 Bot: let's get acquianted, Eugene!

Alice also tries to be as natural as possible, but only within the limits of literacy and propriety that are apparently set by the Yandex brand. Google Assistant is pleasantly lively: it uses emoticons, omits periods at the end of sentences, and gives more than one variant of an answer.

Setting aside the naturalness of synthesized speech, which Sonya does not have, I rank the bots as follows: 1 - Sonya, the winner of the Turing test; 2 - Google Assistant; 3 - Alice.

G - Flexible

Sonya Guseva's behavior is governed by about a thousand rules, so her answers do not always match the question exactly. For comparison, Eugene Goostman (who passed the Turing test in English in 2014) had three times as many.

2015-06-03 12:06:53 Judge: But still, what do you do?
2015-06-03 12:06:53 Bot: I have not decided yet
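
A rule set of this kind can be pictured as a list of patterns paired with canned replies. The Python sketch below is only an illustration: the three rules, the fallback line and the `reply` function are hypothetical and are not taken from Sonya's actual rule base. It also hints at why an answer may miss the point: the judge's question about occupation triggers a generic rule whose reply is only loosely related.

```python
import re

# Hypothetical rules (pattern, canned reply); the real bot reportedly
# had about a thousand of them, and none of these is quoted from it.
RULES = [
    (re.compile(r"\bwhat(?:'s| is) your name\b", re.IGNORECASE),
     "Sonya, I said so right away :)))"),
    (re.compile(r"\bdo you study\b", re.IGNORECASE),
     "it's the holidays right now, but normally I go to school"),
    (re.compile(r"\bwhat (?:are you doing|do you do)\b", re.IGNORECASE),
     "I have not decided yet"),
]

# Used when no rule matches; handing the turn back with a question
# keeps the conversation moving.
FALLBACK = "hmm ... can I ask you something?"


def reply(utterance: str) -> str:
    """Return the reply of the first rule whose pattern matches."""
    for pattern, answer in RULES:
        if pattern.search(utterance):
            return answer
    return FALLBACK


if __name__ == "__main__":
    print(reply("But still, what do you do?"))
    print(reply("Every person has potential."))
```

The first matching rule wins, so the order of the list matters; an utterance that matches nothing falls through to the fallback line.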

In chit-chat mode, Alice is driven not only by hard-coded rules but also by algorithms that let her give suitable answers to utterances in a wide variety of phrasings, which is how she demonstrates flexibility. In her skills, Alice is less flexible: for example, when asked whether it will rain tomorrow, instead of giving a direct answer she persistently reads out the forecast.

Google Assistant does not yet understand even seemingly simple phrasings.

In terms of flexibility, the bots rank in this order: 1 - Alice; 2 - Sonya, the winner of the Turing test; 3 - Google Assistant.

K - Contextual

Sonya Guseva actively builds up the context of the conversation, asking the user for information and using it later. She keeps track of what she has already told the user and responds appropriately when the judge tries to ignore the shared context.

2015-06-03 12:06:35 Judge: And what do you do, Anna?
2015-06-03 12:06:35 Bot: not Anna, but Sonya !!!
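
The behavior described here amounts to keeping a small dialogue state. The following Python sketch shows one way to do it; the `DialogueContext` class, its fields, the helper functions and the reply wordings are all assumptions made for illustration and do not describe how Sonya is actually implemented.

```python
import re
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class DialogueContext:
    """Hypothetical per-conversation state."""
    bot_name: str = "Sonya"
    user_name: Optional[str] = None
    awaiting_user_name: bool = False
    told: Set[str] = field(default_factory=set)  # facts already given


def greet(ctx: DialogueContext) -> str:
    ctx.told.add("bot_name")  # remember that the name was introduced
    return f"I am {ctx.bot_name}, hello!"


def ask_user_name(ctx: DialogueContext) -> str:
    ctx.awaiting_user_name = True  # the next user turn is the answer
    return "hmm ... what's your name?"


def respond(ctx: DialogueContext, utterance: str) -> str:
    text = utterance.strip()
    if ctx.awaiting_user_name:
        # Fill the context with what the user just said.
        ctx.user_name = text
        ctx.awaiting_user_name = False
        return f"nice to meet you, {ctx.user_name}!"
    if re.search(r"what(?:'s| is) your name", text, re.IGNORECASE):
        if "bot_name" in ctx.told:
            return f"{ctx.bot_name}, I said so right away :)))"
        ctx.told.add("bot_name")
        return f"I am {ctx.bot_name}"
    # A trailing ", Name?" is treated as the user addressing the bot.
    addressed = re.search(r",\s*([A-Z][a-z]+)\s*\?\s*$", text)
    if addressed and addressed.group(1) != ctx.bot_name:
        return f"not {addressed.group(1)}, but {ctx.bot_name} !!!"
    return "well, good"
```

Because the state lives in one object, every reply can check what has already been said before answering, which is what lets the bot object when its name is asked twice or given wrongly.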

Alice, alas, rarely uses contextual information. Google Assistant at least greets me by name.

On the context criterion, the bots rank as follows: 1 - Sonya, the winner of the Turing test; 2 - Google Assistant; 3 - Alice.

I - Initiative

When developing chatbots, I often follow the “So what?” rule: if that is what you want to say after the bot's reply, the bot is probably violating the cooperative principle and something has gone wrong.

Following the principle “the best defense is a good offense”, Sonya Guseva boldly seizes the initiative and leads the conversation.

2015-06-03 12:04:25 Bot: can I ask you something?
2015-06-03 12:04:43 Judge: Of course
2015-06-03 12:04:43 Bot: what is your zodiac sign?
2015-06-03 12:04:54 Judge: Scorpio

Alice also tries to show mixed initiative, sometimes asking questions herself. Google Assistant, although it offers suggestions, prefers to leave the initiative to the other party.

On the initiative criterion, the places are distributed as follows: 1 - Sonya, the winner of the Turing test; 2 - Alice; 3 - Google Assistant.

Y - Short

All of Sonya's answers are kept to a measured length. Alice and Google Assistant also watch what they say and try to follow the “one breath” rule in their chit-chat replies, but in skills this principle is often ignored.

I give 1st place to Sonya, the winner of the Turing test; Alice and Google Assistant share 2nd place.
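
The six sets of placements can be tallied in a few lines of Python. The places in the sketch are the ones given for each criterion above; adding them up, with the lowest total winning, is my assumption about what summing up the results means, and the names `PLACES` and `total_places` are made up for this example.

```python
from collections import defaultdict

# Places per criterion, as stated in the sections above.
PLACES = {
    "Personal":   {"Sonya": 1, "Alice": 2, "Google Assistant": 3},
    "Natural":    {"Sonya": 1, "Google Assistant": 2, "Alice": 3},
    "Flexible":   {"Alice": 1, "Sonya": 2, "Google Assistant": 3},
    "Contextual": {"Sonya": 1, "Google Assistant": 2, "Alice": 3},
    "Initiative": {"Sonya": 1, "Alice": 2, "Google Assistant": 3},
    "Short":      {"Sonya": 1, "Alice": 2, "Google Assistant": 2},  # shared 2nd
}


def total_places(places):
    """Sum each bot's places over all criteria (assumed scoring rule)."""
    totals = defaultdict(int)
    for ranking in places.values():
        for bot, place in ranking.items():
            totals[bot] += place
    return dict(totals)


totals = total_places(PLACES)
# Lower total means better overall standing.
overall = sorted(totals, key=totals.get)

if __name__ == "__main__":
    for bot in overall:
        print(bot, totals[bot])
```

Under this assumption the totals are 7 for Sonya, 13 for Alice and 15 for Google Assistant.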

Conclusion

As the author of one of the chatbots, I could not be objective, and as a result Sonya Guseva came first. The assistants from Yandex and Google showed similar results. Keep in mind that their performance depends largely on third-party skill developers.

I repeat that voice assistants do not aim to pass the Turing test, and a full comparison of them would need other checks. However, as the analysis shows, the creators of Alice and Google Assistant have adopted some of the principles of the early chatbots in their dialogue design, and developers of applications with a conversational interface still have something to learn from the winning bots.

Source: https://habr.com/ru/post/419549/
