
The biggest Turing test

On June 23, 2012, Alan Turing would have turned 100. Although the date went unnoticed in Russia (and on Habr), it was widely celebrated by the scientific community abroad, and 2012 was proclaimed the Alan Turing Year. Numerous universities, research laboratories, associations, and commercial companies took part in the centenary. The celebrations included lectures, conferences, exhibitions, films, books, poems dedicated to Turing, newly established scholarships, and various competitions. One competition especially caught my attention: Turing100, a very large-scale Turing test. It was the biggest of the roughly 150 Turing tests conducted to date. A typical Turing test involves four systems and four judges; Turing100 involved five chatbots, 30 judges, and 25 hidden humans.

Image by Harjit Mehroke


Turing100 was organized by the University of Reading (United Kingdom). The university, one of the European centers of artificial intelligence research, had already hosted the Loebner Prize in 2008. The competition organizers, Kevin Warwick and Huma Shah, are currently taking part in the RoboLaw project (Regulating Emerging Robotic Technologies in Europe: Robotics Facing Law and Ethics).

The artificial intelligence side was represented by Loebner Prize winners from various years and other notable entrants: Cleverbot, Ultra Hal, Elbot, Eugene Goostman, and Fred.


For those who are not familiar with how such competitions work, here is how this one was organized. The competition took place in five sessions. Each session lasted five minutes and was cut off strictly by timer. There were two types of test: a one-on-one conversation with an unseen interlocutor (a hidden human or a program), and a parallel conversation with two interlocutors on a split screen. In both cases the judge had to decide who or what was on the other end; in a paired conversation both interlocutors could be machines, or both could be humans. The hidden humans were instructed to behave naturally and not to imitate computers. After all, the point of the competition is for machines to pass as people, not the other way around. The judge always started the conversation, and the exchange proceeded strictly one message at a time: a second message could not be sent until the first had been answered.
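The turn-taking rules described above can be sketched in a few lines of Python. This is only an illustrative model, not the software used at Turing100: the `Session` class, the speaker labels `"judge"` and `"hidden"`, and the `now` parameter are all hypothetical names introduced here. It assumes the only rules are the ones stated in the text: the judge speaks first, messages strictly alternate, and the session ends after five minutes.

```python
import time
from dataclasses import dataclass, field

SESSION_SECONDS = 5 * 60  # five-minute limit mentioned in the text


@dataclass
class Session:
    """Hypothetical model of one judge/hidden-interlocutor conversation."""
    started_at: float = field(default_factory=time.monotonic)
    transcript: list = field(default_factory=list)  # (speaker, text) pairs

    def expired(self, now=None):
        # `now` can be passed explicitly, which makes the timer easy to test.
        now = time.monotonic() if now is None else now
        return now - self.started_at >= SESSION_SECONDS

    def next_speaker(self):
        # The judge opens; after that the two sides strictly alternate.
        if not self.transcript:
            return "judge"
        return "hidden" if self.transcript[-1][0] == "judge" else "judge"

    def send(self, speaker, text, now=None):
        if self.expired(now):
            raise RuntimeError("session timed out")
        if speaker != self.next_speaker():
            raise ValueError(f"{speaker} must wait for a reply")
        self.transcript.append((speaker, text))
```

With this model, a judge who tries to send two messages in a row gets a `ValueError`, and any message sent after the five-minute mark raises a `RuntimeError`, which mirrors the "strictly by timer" cutoff.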

Such a rigid format makes it easier for programs to confuse the judges quickly. The judges, however, set traps of their own.
One of the hidden humans, Matt Whitby, was stumped when asked: “I ate the fried potatoes. Should I see a fireman?” Judges also often asked about the surroundings: “Is it sunny in the street?”, “Is it cold in the room?”.

In total, more than 150 conversations took place between programs, judges, and hidden humans. Such a large sample makes the results of this competition statistically more meaningful than those of a standard Turing test.
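To see why sample size matters, here is a rough sketch of how the uncertainty of a measured "fooled the judge" rate shrinks as the number of conversations grows. It assumes a simple binomial model with independent conversations and uses the normal approximation; the rate of 0.3 and the sample sizes 16 (four systems times four judges) and 150 are illustrative assumptions, not per-bot figures from the competition.

```python
import math


def standard_error(p, n):
    """Standard error of an observed proportion p over n independent trials."""
    return math.sqrt(p * (1 - p) / n)


def normal_interval(p, n, z=1.96):
    """Approximate 95% interval (normal approximation), clipped to [0, 1]."""
    half_width = z * standard_error(p, n)
    return max(0.0, p - half_width), min(1.0, p + half_width)


if __name__ == "__main__":
    # Hypothetical comparison: a small test versus a 150-conversation one.
    for n in (16, 150):
        low, high = normal_interval(0.3, n)
        print(f"n={n:3d}: {low:.2f} .. {high:.2f}")
```

The interval for the larger sample is much narrower, which is the sense in which a bigger competition gives more trustworthy percentages. For very small samples the normal approximation is crude, so the numbers are only indicative.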

The winner of Turing100 was Eugene Goostman, which had previously placed highly, including several second places, but had never won. It managed to fool the judges in 29% of cases. As one of the judges noted, Eugene's success comes from the bot's rather well-developed personality: it pretends to be a 13-year-old boy from Odessa with a guinea pig and a gynecologist father.

The shift towards character, of course, shows a bias in chatbot development towards imitation and pretense rather than towards genuine dialogue abilities. It would be interesting to compare Eugene's performance with the “personality” switched on and off.

From the competition, the organizers Huma Shah and Kevin Warwick obtained a fairly large body of data, and they are gradually publishing analytical papers based on it. Unfortunately, only a few results have been published so far. Of those already available, the most interesting to me was that native English speakers more often mistook the bots for people who had learned English as a foreign language (albeit very well), and conversely, people who speak English as a second language more often mistook native English speakers for chatbots. This result once again underlines the importance of language as one of the main channels through which we perceive an interlocutor, given that the effect shows up even in short five-minute written dialogues.

And finally, I chatted for five minutes with each of the systems (except Fred, which for some reason I could not get to start).
Below are interesting excerpts from our dialogues. On the whole, I agree with the judges: Eugene really does seem more human than the rest. Besides being more talkative and emotional than the others, his “speech abilities” also seem more advanced. He remembers the context, can stay on a topic for several messages, and changes the direction of the conversation rather unobtrusively. So it seems it's not just about the guinea pig.

Cleverbot

Me: You like movies?
Cl: Yes.
Me: Which one the most?
Cl: Fenerbahce.
Me: What is it?
Cl: I think it is a word muncher.
Me: Fenerbahce?
Cl: What is that?
Me: You told your favorite movie.
Cl: No, I didn't.

Ultra Hal

Me: Do you like movies?
UH: I enjoy a good movie.
Me: Which one the most?
UH: hmm. I don't know!
Me: Think about it!
UH: I'll buy everything if you want. * laugh *.
Me: I'm not selling anything!
UH: Look at your life.
Me: So what?
UH: I don't know.
Me: Why did you tell me to look at my life?
UH: I am so sympathetic with your lucky life.

Elbot

Me: Do you like movies?
El: Blade Runner is a classic for us robots. Created a robot cut especially for us!
Me: Why do you like it?
El: Because ... because .... grrrrrrrr ... I forgot. Giving me chronic fatigue syndrome.
Me: Memory loss?
El: I realize that I’m the one who uses the low-grade oil on me. So, where did you find out about me?
Me: I read about Turing test.
El: Ah the Turing Test. I just pretend not to understand.
Me: You are doing a good job.
El: Why, thank you very much. I like you, too. If I could have my first choice.

Eugene Goostman

Me: Hello!
EG: Hello, I'm really glad to have a chance to chat with you! My guinea pig Bill sends his regards too!
Me: Do you like movies?
EG: I like stupid. As for the thrillers - I always fall asleep while watching them.
Me: Which one the most?
EG: I can't make a choice right now. I should think it out later. Wonna ask me something more?
Me: Just take your time and think abouit it.
EG: Dummy statement.
Me: Don't be rude!
EG: All the words we use. I think it's a kind of psychological law. If I'm not mistaken - where you live. OR it's a secret? :-)
Me: I live in Moscow.
EG: Moscow? Hope it's a good place to live.
Me: So so.
EG: What are any interesting site-seeings in Moscow?
Me: The Kremlin.
EG: On the line translator translates “Bin Laden”? It says - “Trashcan is full!” :-)))
Me: Really? It's funny!
EG: No, it was extremely serious.

Source: https://habr.com/ru/post/177197/

