
What the Turing Test actually tests

Some time ago I invited readers to play the Turing test with my bot Misha. Anyone who tried it knows that the robot player is trivially easy to spot. Anyone who has not can read about it in the first comment under that article.

Now the time has come to hold a debriefing, describe how the bot is built, and draw conclusions.



Rules of the game


The rules of the game were listed in the previous article; I repeat them here so that neither you nor I have to follow links.
  1. Each participant connected to the bot can take the role of the player, who answers questions, or of the judge, who asks them.
  2. If a participant chooses to be a judge, the bot picks a player for them at random: either one of the human players or a robot. So in each game there is always exactly one respondent, and the judge puts questions only to that respondent.
  3. If a participant chooses to be a player, the bot likewise looks for a judge for them among the participants.
  4. The game is divided into rounds of 5 questions. At the end of each round the judge decides whether they are talking to a person or to a machine, and the game ends there. If the judge cannot decide, they can start the next round or give up, which stops the game.
  5. The robot plays the role of a little boy, Misha, five years old. So as not to make the judge's task easier, it is recommended that the human player also answer as Misha.


From launch until this article was written, 256 games were played (a bit odd, but true).

Of these, 115 games ended with a definite result: both sides played to the end of a round (no one played more than one round) and the judge made a decision.


There were 26 erroneous decisions, of which 15 were humans taken for robots and 11 were robots taken for humans.


As you can see, people were mistaken for robots more often than robots were mistaken for people.

As a result, the robot was correctly identified in (74-15) / 115 = 51% of games and was taken for a human in 11/115 = 9.6% of games, so we can confidently say that Misha-bot failed the test.
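The arithmetic above can be re-checked in a few lines. This is only a sketch: the variable names are mine, and it assumes that 74 is the number of "robot" verdicts, of which 15 were actually humans.

```python
# Figures from the text; the interpretation of 74 and 15 is an assumption.
decided = 115                  # games that ended with a verdict
robot_verdicts = 74            # judge answered "robot"
humans_taken_for_robots = 15   # ...but the player was human
robots_taken_for_humans = 11   # judge answered "human", player was the bot

correct_robot = robot_verdicts - humans_taken_for_robots
errors = humans_taken_for_robots + robots_taken_for_humans

share_correct = correct_robot / decided
share_fooled = robots_taken_for_humans / decided

print(f"erroneous decisions:        {errors}")
print(f"robot correctly identified: {share_correct:.1%}")
print(f"robot taken for a human:    {share_fooled:.1%}")
```

Under that reading the two kinds of error add up to the 26 erroneous decisions mentioned earlier.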


For comparison, at a competition in 2012, judges took the bot Eugene Goostman (Zhenya Gustman, a boy from Odessa) for a person in 29% of 150 conversations, and in 2014 it fooled 33% of the judges over 300 conversations.

What is under the hood


The bot is written in Python and by and large consists of three modules:


The main game algorithm lives in the Game class, which stores the participants' ids and implements a simple state machine:



On transitions between states, the bot relays the judge's questions to the player and the player's answers to the judge. Exactly one message is relayed per transition, after which the bot moves to the next state, so anyone who tried to ask two questions or give two answers in a row saw that the bot does not allow it.

When the bot receives an answer from the robot, it does not relay it immediately but after a delay. The delay was not there at first; I added it following a remark by galqiwi, thank you.
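A minimal sketch of such a state machine follows. The state names, the `on_message` method and the `outbox` list are hypothetical illustrations, not the bot's actual code; only the "one message per turn" rule and the 5-question round come from the text.

```python
from enum import Enum, auto


class State(Enum):
    WAIT_QUESTION = auto()  # judge's turn to ask
    WAIT_ANSWER = auto()    # player's turn to answer
    WAIT_VERDICT = auto()   # round is over, judge must decide


class Game:
    QUESTIONS_PER_ROUND = 5

    def __init__(self, judge_id, player_id):
        self.judge_id = judge_id
        self.player_id = player_id
        self.state = State.WAIT_QUESTION
        self.asked = 0
        self.outbox = []  # (recipient_id, text) pairs to deliver

    def on_message(self, sender_id, text):
        """Relay one message if it is the sender's turn; return False otherwise."""
        if self.state is State.WAIT_QUESTION and sender_id == self.judge_id:
            self.outbox.append((self.player_id, text))
            self.asked += 1
            self.state = State.WAIT_ANSWER
            return True
        if self.state is State.WAIT_ANSWER and sender_id == self.player_id:
            self.outbox.append((self.judge_id, text))
            if self.asked == self.QUESTIONS_PER_ROUND:
                self.state = State.WAIT_VERDICT
            else:
                self.state = State.WAIT_QUESTION
            return True
        return False  # out of turn: a second question or a second answer
```

Because every accepted message changes the state, a second message from the same side is rejected until the other side has replied.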

Every state change is saved to an SQLite database in case of unforeseen events: a power loss, an operating system reboot, or simply the bot's owner (that is, me) wanting to stop it and rework something. On the next launch the bot loads the saved games from the database and play continues.
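Persistence of this kind might look roughly like the sketch below, which uses Python's standard `sqlite3` module. The table layout and the function names are assumptions made for illustration; the article does not show the real schema.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open the database and create the (hypothetical) games table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS games ("
        " game_id INTEGER PRIMARY KEY,"
        " judge_id TEXT, player_id TEXT, state TEXT, asked INTEGER)"
    )
    return conn


def save_game(conn, game_id, judge_id, player_id, state, asked):
    """Write the current state; called on every state change."""
    with conn:  # commits on success, rolls back on an exception
        conn.execute(
            "INSERT OR REPLACE INTO games VALUES (?, ?, ?, ?, ?)",
            (game_id, judge_id, player_id, state, asked),
        )


def load_unfinished(conn):
    """Return the games to resume after a restart."""
    cur = conn.execute(
        "SELECT game_id, judge_id, player_id, state, asked FROM games"
        " WHERE state != 'FINISHED' ORDER BY game_id"
    )
    return cur.fetchall()
```

Writing on every transition costs one small transaction per message, which is cheap at this scale and means that at most the message in flight is lost on a crash.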

Next, there are two waiting queues: judges waiting for players, and players waiting for judges. When a participant starts a new game, the opposite queue is checked for a partner. If there is none, the participant is put into their own queue.
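The pairing logic can be sketched as follows. The names `waiting_judges`, `waiting_players` and `start_game` are hypothetical, and a real bot would also need locking, since the robot dispatcher reads the same queue from another thread.

```python
from collections import deque

waiting_judges = deque()   # judges waiting for a player
waiting_players = deque()  # players waiting for a judge


def start_game(participant_id, role):
    """Pair with a waiting partner and return (judge_id, player_id),
    or queue the participant and return None."""
    if role == "judge":
        own, other = waiting_judges, waiting_players
    elif role == "player":
        own, other = waiting_players, waiting_judges
    else:
        raise ValueError("role must be 'judge' or 'player'")

    if other:
        partner = other.popleft()  # first come, first served
        if role == "judge":
            return (participant_id, partner)
        return (partner, participant_id)

    own.append(participant_id)
    return None
```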

The robot dispatcher lives in a separate thread. It periodically scans the queue of judges waiting for players and creates a robot instance for them. A delay is deliberately built in so that the robot does not always grab the judge first and a human player also has a chance to join the game. The dispatcher can create different versions of robots; you only need to register their classes. So far only one has been implemented.

And finally, the robot module. But before describing the robot's internals, I'll describe the question-and-answer database it works with.
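A dispatcher along these lines could be sketched as below. Everything here is an assumption for illustration: the class and method names, the `EchoRobot` placeholder, and in particular the fixed `grace` period, since the text only says that some delay exists.

```python
import random
import threading
import time


class EchoRobot:
    """Placeholder robot; a real one would answer from the Q&A database."""

    def __init__(self, judge_id):
        self.judge_id = judge_id


class RobotDispatcher(threading.Thread):
    def __init__(self, waiting_judges, lock, grace=5.0, poll=1.0):
        super().__init__(daemon=True)
        self.waiting_judges = waiting_judges  # list of (judge_id, enqueued_at)
        self.lock = lock                      # shared with the matchmaking code
        self.grace = grace                    # head start given to human players
        self.poll = poll                      # seconds between scans
        self.robot_classes = []
        self.stopping = threading.Event()

    def register(self, robot_class):
        """Make one more robot version available to the dispatcher."""
        self.robot_classes.append(robot_class)

    def scan_once(self, now):
        """Attach a robot to every judge who has waited at least `grace` seconds."""
        robots = []
        with self.lock:
            still_waiting = []
            for judge_id, enqueued_at in self.waiting_judges:
                if self.robot_classes and now - enqueued_at >= self.grace:
                    robots.append(random.choice(self.robot_classes)(judge_id))
                else:
                    still_waiting.append((judge_id, enqueued_at))
            self.waiting_judges[:] = still_waiting
        return robots

    def run(self):
        # Event.wait returns False on timeout, so this loops until stopping is set.
        while not self.stopping.wait(self.poll):
            self.scan_once(time.monotonic())  # a real bot would start games here
```

Keeping the scan in its own method lets it be exercised with an explicit `now` value without starting the thread.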

At first I typed up a text file like this:

T: 
T:   
Q:   ?
Q:   ?
Q:   ?
Q:   ?
Q:  ?

T: 
Q:   ?

T:  
T: 
Q:  ?
Q:   ?
Q:   ?

T:  
T: , 
Q:  ?
Q:  ?

… . — , , --. . , . pymorphy2 kmike, YARN, XML SQLite — 22 , …

:

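    def normalSynonyms(self, orig):
        """Collect synonyms for every normal form of the word `orig`."""
        res = []
        # pymorphy2 can return several parses for an ambiguous word form
        for parse in self.morph.parse(orig):
            word = parse.normal_form
            # self.yarn is the author's own helper, presumably over the YARN data
            syns = self.yarn.synonyms(word)
            for g in syns:
                res.extend(self.yarn.words(syns[g]))
        return res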

.

getAnswer(self, text)

, . , - , . , .

:

  1. , . , .
  2. -, , , , . .


, «» , , — , . , , — .

" , , 12000 . «-» 300 . ."

, , , . , .


, , , , , .

, , , . , , .   - , . , , .

— , , . , .

. ,   , — - - . - , . , , « » — . , , .

, , , , - . , , , .


, -:


Raspberry Pi 3. SSD, , , .

, .


, , — . , , , , , , .

, , , " ?" ( «Computing Machinery and Intelligence»)   « ». , , — , . .

:
"    ,   « »   ,     .   , ,   - .   ,   , ,  , ."

- , , , , . , .


:
"     , ,  ,   .    ."

, , . , , , , , , — « [] ?», , .

:
"2018-10-23 13:01:53,385 186 Player2Judge , ?"

— , :



, , . , , ?

. « » ? R2-D2 C-3PO « »? «A.I.»? « »? « »? - ? , , , , - .

— , ? , , .

, R2-D2 -, , .

, , — . ( « » ) , , , , , , , .

, C-3PO , , , , . - , .

. , , :


, —   . .

, . , « ?», (1950), , . « » , .

:
"… , , , , ; , , ."

Source: https://habr.com/ru/post/427259/

