Translation of the post by Rita Crook, "Benedict Cumberbatch Can't Fool a Computer?".
Released this week, the highly anticipated film The Imitation Game tells the story of Alan Turing's life (his 100th birthday coincided with the 24th birthday of Mathematica; see Stephen Wolfram's post Happy Birthday, Alan Turing for details). Turing machines are a central theme of the film. Interestingly, in 2007 Wolfram Research announced a prize for proving the universality of the 2,3 Turing machine.
Of course, many people enjoyed Benedict Cumberbatch's promotional video, in which he imitates the voices and mannerisms of other famous actors. But I wanted to find out whether Mathematica's machine learning functionality could recognize his voice, or whether he could fool the computer too.
Personally, I cannot stop laughing when I watch this video, but I want to look at these impressions without bias.
So I wondered: does he really imitate other actors' voices that well, or are we all, myself included, simply charmed by his persona?
Could my mind be deceiving me? If we take the whole sample of original voices, will the impressions really be indistinguishable from them?
To answer this question 10 years ago, we would have had to walk the streets playing sound bites from James Bond, The Shining, and Batman alongside Cumberbatch's imitations of them, interview 300 people, and then analyze their opinions.
Today, you can use systems like Mathematica to answer such questions!
The Wolfram Language has built-in functionality for creating a classifier from training samples of audio fragments, which will ultimately let us find out whether Cumberbatch can "deceive" the computer. So I set myself the task of building a fairly "decent" database of voice fragments. In addition, I extracted the fragments corresponding to each of Cumberbatch's impressions and, finally, let Mathematica do the rest for me.
First, construct a path to each of the databases of voice fragments that Mathematica will use for the analysis.

Now import all original voices:

The classifier was created with the Classify function, which was given the training sample as input. To save time, a classifier that has been created once (a ClassifierFunction) can then be loaded instantly from the cfActorWDX.wdx file (the commented-out part of the code contains the expression that actually creates the classifier):
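The train-once, load-later idea is not specific to the Wolfram Language. A minimal sketch of the same pattern in Python follows. It is not the author's code, and the function name `load_or_train` and the use of `pickle` are assumptions made for illustration only:

```python
import os
import pickle


def load_or_train(cache_path, train):
    """Return the model stored at cache_path, training and saving it first if needed.

    `train` is any zero-argument function that returns a picklable model.
    Both names are hypothetical and used only for this illustration.
    """
    if os.path.exists(cache_path):
        # A cached model exists: skip the expensive training step.
        with open(cache_path, "rb") as fh:
            return pickle.load(fh)
    model = train()
    with open(cache_path, "wb") as fh:
        pickle.dump(model, fh)
    return model
```

On the first call the training function runs and its result is written to disk; later calls with the same path only read the file.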

My database includes samples of Benedict's own voice, the voices of the actors he imitates, and finally fragments of Benedict's impressions. The training sample was built from recordings of Alan Rickman, Christopher Walken, Jack Nicholson, John Malkovich, Michael Caine, Owen Wilson, Sean Connery, Tom Hiddleston, and Benedict Cumberbatch. I used 560 fragments in total, but of course the more data you use, the more reliable the result will be. At the same time, the samples should be as "clean" as possible (no laughter, music, other people talking, and so on).
They must also all have exactly the same length (3.00 s). To make sure every fragment has the same length, you can use this construction in the Wolfram Language:
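As a rough, language-neutral illustration of the fixed-length requirement, here is a Python sketch (not the author's Wolfram Language code). It assumes a fragment is a plain list of mono samples. The name `fix_length` and the choice to pad short fragments with silence rather than discard them are assumptions:

```python
def fix_length(samples, sample_rate, seconds=3.0):
    """Trim or zero-pad a list of mono samples to exactly `seconds` of audio."""
    target = int(round(sample_rate * seconds))
    # Keep at most `target` samples from the start of the fragment.
    trimmed = list(samples[:target])
    # Pad with silence (zeros) if the fragment was too short.
    return trimmed + [0] * (target - len(trimmed))
```

For a recording sampled at 44100 Hz, every fragment returned by `fix_length(samples, 44100)` contains exactly 132300 samples.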

Some of the files were not single-channel, so this also had to be corrected when generating and exporting the samples in order to get the best results.
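Reducing a multi-channel file to a single channel usually means averaging the channels of each frame. A small Python sketch of that idea, assuming each frame is a tuple of per-channel samples (the name `to_mono` is hypothetical, and this is not the code used in the post):

```python
def to_mono(frames):
    """Average the channels of every frame, giving one sample per frame."""
    return [sum(frame) / len(frame) for frame in frames]
```

A stereo frame such as `(1.0, 3.0)` becomes the single sample `2.0`, while frames that already have one channel keep their value.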

I thank Martin Hadley and Jon McLoone for their help with this code.
Drum roll ... time to talk about the results!
I am probably about to break everyone's heart, and I would really rather not, so I will "blame" Mathematica for everything: machine learning really can determine whose voice is heard in a given fragment, so it can recognize an imitation and determine who was actually speaking.
The results below show, for each fragment of Benedict's impressions of other actors, which actors Mathematica attributed the voice to and with what probability:

In most cases, the probability that the speaker was anyone other than Benedict Cumberbatch or Alan Rickman is negligible.

It may be worth noting that Rickman, Connery, and Wilson all speak rather slowly, with many pauses (quite noticeable in the fragments I used), which may somewhat "confuse" the algorithm.

Now it is time to get over this slight shock without holding a grudge against Benedict. He certainly remains very charming.
All in all, I am delighted with his talent and look forward to seeing his performance in the film I mentioned at the very beginning of this short post.
Resources for learning Wolfram Language (Mathematica) in Russian: http://habrahabr.ru/post/244451