
If the call centers you sometimes phone have a system for creating voice prints, you can be identified easily. If there is no such system yet, your prints can be created in one click from previously made recordings.
It works like this: every time you call, say, a bank that has a system for identifying customers by voice, your conversation is recorded. One or two minutes of your conversation with the operator is enough to create a fairly accurate profile of your voice. After that, you will be identified from your first phrase.
Creating a voice print and verifying against it are asymmetric processes. Creating a print takes more data (more talk time); checking takes roughly an order of magnitude less. The longest check I have seen, on a very noisy line, took 15 seconds.
What is a voice print?
A voice print is a record unique to a person, something like a fingerprint. It is not tied to what the person actually says (specific words or a specific phrase) but characterizes the voice as a whole. The technologies for creating voice prints are proprietary, but in short, we can talk about analyzing reference points in speech, for example the characteristic transitions between sounds.
The system responds to various physical characteristics: in addition to the pitch and speaking rate of a particular person, it even takes into account the physiological features of the vocal tract: the larynx, the throat, even the nose. About 50 indicators are considered in total, such as accented sounds, pronunciation features, speech rate, the way words and sounds are articulated, and the physical characteristics of the voice.
How is such an imprint used?
So, you have spoken for a full minute or more (only the time you yourself are speaking counts). This is enough to obtain your voice print.
Now imagine that the company you are calling has voice prints of potential fraudsters (or some other list of people under special watch). About 10 seconds of your speech is compared against a database of up to 1000 records, and if you have already ended up in such a database, the operator receives an alert. Any other notifications can be triggered as well, for example to the security department.
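Since the real algorithms are proprietary, the comparison step can only be illustrated with a rough sketch. It assumes, purely for illustration, that a voice print is a plain numeric vector and that two prints are compared by cosine similarity; the record names, the vectors, and the `alert_threshold` value are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def screen_against_watchlist(caller_print, watchlist, alert_threshold=0.9):
    """Return (record_id, score) pairs at or above the threshold, best first.

    The threshold is a made-up value; a real installation would calibrate it.
    """
    hits = []
    for record_id, stored_print in watchlist.items():
        score = cosine_similarity(caller_print, stored_print)
        if score >= alert_threshold:
            hits.append((record_id, score))
    hits.sort(key=lambda hit: hit[1], reverse=True)
    return hits


# Hypothetical watchlist of two records and one incoming caller.
watchlist = {
    "fraud-001": [0.9, 0.1, 0.3],
    "fraud-002": [0.1, 0.8, 0.5],
}
caller = [0.88, 0.12, 0.31]

hits = screen_against_watchlist(caller, watchlist)
for record_id, score in hits:
    print(f"ALERT: caller resembles {record_id} (score {score:.3f})")
```

With a list of up to 1000 records, a linear scan like this is cheap, so the time is spent on collecting enough speech rather than on the search itself.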
The second case is verification: confirming that the person who has just called really is the client whose account they want to access. That is, when you contact the bank to perform some operation or obtain important information, besides your account number they ask for a password, the name of your dog, the village where your grandmother celebrated her first wedding, and so on.
The problem is that a lot of this information is available on social networks. So if fraudsters want to gain access to your account, they will put a lot of effort, sometimes successfully, into collecting it. Perhaps they will just read your "Vkontakte" page and gather everything they need in one go. Sometimes they will call the contact center 99 times and try to brute-force the answers. If the attack is carried out slowly enough, it can succeed within a few months. This is where the voice print of the genuine client helps (built from conversations in which the client was verified correctly).
As a result, when a customer calls and says, “Hello, I’m so-and-so, I want to make a transfer from my account, my account number is such-and-such”, the system performs verification.
How was it tested?
Since voice prints are created by proprietary algorithms, a logical question arises about accuracy. I can tell you about our tests. To begin with, we simply shouted “Don't Believe Him” and other nonsense while the person was talking to the contact center. In this case the verification time increases slightly: the print takes a couple of seconds longer to build. In a strong wind outdoors, the system also needs more time.
We also switched from the handset to a speakerphone in a room where several people were talking at once, with the person actually speaking to the operator simply sitting closest to the phone. Even in this case, the system identified him.
A different verification threshold can be set for each client or situation. For example, “80% confident” is enough for a balance request to a cellular operator, while “100% sure” suits a bank before it asks questions about secret words.
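A per-operation threshold might be configured roughly as follows. This is a minimal sketch: the operation names and the `is_verified` helper are hypothetical, the 0.80 and 1.00 values simply mirror the two examples above, and the confidence score is assumed to arrive from the verification system as a number between 0 and 1.

```python
# Hypothetical mapping from operation to the minimum confidence required.
REQUIRED_CONFIDENCE = {
    "balance_inquiry": 0.80,  # low-risk request to a cellular operator
    "bank_operation": 1.00,   # high-risk request to a bank
}

# Operations missing from the table fall back to the strictest threshold.
STRICTEST = max(REQUIRED_CONFIDENCE.values())


def is_verified(operation, confidence):
    """Return True if the voice match is confident enough for this operation."""
    required = REQUIRED_CONFIDENCE.get(operation, STRICTEST)
    return confidence >= required


for operation in ("balance_inquiry", "bank_operation"):
    outcome = "accepted" if is_verified(operation, 0.85) else "needs extra checks"
    print(f"{operation}: {outcome}")
```

Falling back to the strictest threshold for unknown operations is a deliberate choice in this sketch: a missing configuration entry should never make verification easier.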
The technical environment of the contact center affects the voice verification system, so the false-positive threshold has to be calibrated for each individual installation. Sometimes even a single client needs several prints. For example, a customer calls from a cell phone abroad, and the recording quality is consistently poor because the channel itself is poor. Several profiles are then tied to the client's account, and the system checks each of them to see which fits the situation.
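Checking a call against several profiles of one client could look something like this. Again, this is only a sketch under assumptions: prints are treated as plain numeric vectors, cosine similarity stands in for the proprietary scoring, and the profile names and numbers are invented.

```python
import math


def similarity(a, b):
    """Cosine similarity, used here as a stand-in for the real scoring."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms if norms else 0.0


def best_profile_match(call_print, profiles):
    """Score the call against every stored profile and keep the best one.

    Returns (profile_name, score), or (None, 0.0) if no profiles are stored.
    """
    best_name, best_score = None, 0.0
    for name, stored_print in profiles.items():
        score = similarity(call_print, stored_print)
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score


# Hypothetical client with one print per typical calling environment.
client_profiles = {
    "landline": [0.9, 0.2, 0.1],
    "mobile_roaming": [0.25, 0.65, 0.6],
}
noisy_call = [0.2, 0.7, 0.6]

name, score = best_profile_match(noisy_call, client_profiles)
print(f"best matching profile: {name} (score {score:.3f})")
```

The best score is then compared with the threshold calibrated for the installation, so a client calling over a poor channel is judged against the print taken in similar conditions.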
Noise on the line affects the creation of a voice print to some extent. If the environment is as close as possible to that of the first call, during which the print was taken, the system gives an accuracy of about 100%. Roughly speaking, a call on a noisy line will give only about 80% accuracy.
Hi, paranoid!
Yes, you understood correctly. Your voice print can be created and your conversations found, just like in "Batman". However, this is not very realistic yet: there would be a lot of false positives on a large sample. So for now the main uses of voice prints are comparison against a database of fraudsters and client authentication for access to non-critical data. Of course, voice identification that has not worked at 100% (that is, not performed in the same technical environment as the original call) cannot be used as the only security barrier, but it is a great convenience in many contact center scenarios.
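Why a large sample is a problem can be shown with simple arithmetic. The sketch below assumes a hypothetical false-match rate of 0.1% per comparison (not a measured figure for any real system) and treats comparisons as independent, which is a simplification.

```python
def expected_false_matches(false_match_rate, database_size):
    """Average number of wrong matches when one call is compared to every record."""
    return false_match_rate * database_size


def prob_at_least_one_false_match(false_match_rate, database_size):
    """Chance of one or more wrong matches, assuming independent comparisons."""
    return 1 - (1 - false_match_rate) ** database_size


RATE = 0.001  # hypothetical: 1 wrong match per 1000 comparisons

for size in (1_000, 1_000_000):
    expected = expected_false_matches(RATE, size)
    chance = prob_at_least_one_false_match(RATE, size)
    print(f"{size:>9} records: ~{expected:.0f} false matches expected, "
          f"P(at least one) = {chance:.3f}")
```

Under these assumptions a watchlist of 1000 records produces about one false alert per call on average, while a search across a million recordings produces around a thousand, which is why the watchlist scenario is practical and the mass search is not.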
If you need details on the technical features of the implementation of such a system, write to AleEfimov@croc.ru.