Voice "prints" now officially work (and what the implementation process at Priorbank looks like)

- Don't you find it beneath you to answer anonymous questions over there at the bank?
- No, Vladimir Petrovich, I don't.

Priorbank, one of the largest commercial banks in Belarus and part of the Austrian Raiffeisen Group, uses reference voice templates (or, as they are often called, voice “fingerprints”) of its customers to confirm their identity when they call. So far this is only the second case in Russia and the CIS of a bank officially announcing that it uses such technology.
We have already written about voice biometrics: the ability to recognize and identify a caller, for example a contact center subscriber, even if he uses a different phone or pretends to be someone else, which matters for anti-fraud. Here I will describe what introducing voice biometrics involves, using Priorbank as the example.

Why it is needed

The number of calls to the bank's contact center that require the caller's identity to be confirmed is growing. In November-December 2007, 46% of calls required a request for personal information; in the same period of 2015 the share of such calls rose to 68% of all calls. Additional voice authentication can significantly reduce customer service time.

Terms

Two concepts in voice biometrics are often confused, so let's sort them out. Voice identification lets you determine that a person calling the contact center from an unfamiliar number may be Vasya Petrov. During identification, a single voice sample is compared against the many voices stored in the contact center's database.

Voice verification (also called authentication) lets you confirm a person's identity over the phone, that is, to assume with a certain degree of probability that Vasya Petrov really is Vasya Petrov. Verification compares two voice samples: the voice of the person whose identity is to be confirmed and the voice stored in the system's database, whose owner has already been reliably established.
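
To make the difference concrete, here is a minimal sketch of the two operations. It assumes each voice has already been reduced to a numeric feature vector and that vectors are compared by cosine similarity; the vectors, the 0.8 threshold and the function names are my own illustration, not details of the Priorbank system.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def verify(sample, template, threshold=0.8):
    """1:1 check: does the sample match the one claimed identity?"""
    return cosine(sample, template) >= threshold

def identify(sample, database, threshold=0.8):
    """1:N search: which enrolled voice, if any, matches best?"""
    best_name, best_score = None, threshold
    for name, template in database.items():
        score = cosine(sample, template)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name

# Hypothetical 3-dimensional vectors; real voice features are far larger.
database = {
    "Vasya Petrov": [0.9, 0.1, 0.3],
    "Ivan Ivanov": [0.1, 0.8, 0.5],
}
caller = [0.85, 0.15, 0.35]

print(verify(caller, database["Vasya Petrov"]))  # one comparison
print(identify(caller, database))                # one comparison per enrolled voice
```

Verification costs a single comparison, while identification has to score the sample against every entry, which is why the two are configured and used differently.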

All biometric systems are probabilistic, and none of the existing systems can guarantee the absence of errors.

Errors

Errors are of the first and second kind, otherwise known as false acceptance of an “alien” (False Acceptance Rate, FAR) and false rejection of a genuine client (False Rejection Rate, FRR). The two are interrelated, and tuning a biometric system means finding a compromise between them that best suits the task at hand.
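
The trade-off is easy to see on numbers. The sketch below assumes the system produces a similarity score per call and accepts the caller when the score reaches a threshold; the scores are made up purely for illustration.

```python
def error_rates(genuine_scores, impostor_scores, threshold):
    """Return (FAR, FRR) for a given acceptance threshold."""
    # An impostor at or above the threshold is wrongly accepted.
    false_accepts = sum(1 for s in impostor_scores if s >= threshold)
    # A genuine client below the threshold is wrongly rejected.
    false_rejects = sum(1 for s in genuine_scores if s < threshold)
    far = false_accepts / len(impostor_scores)
    frr = false_rejects / len(genuine_scores)
    return far, frr

# Made-up scores, for illustration only.
genuine = [0.91, 0.84, 0.77, 0.69, 0.95]
impostor = [0.12, 0.35, 0.58, 0.72, 0.40]

for threshold in (0.5, 0.7, 0.9):
    far, frr = error_rates(genuine, impostor, threshold)
    print(f"threshold={threshold:.1f}  FAR={far:.0%}  FRR={frr:.0%}")
```

On this toy data, raising the threshold pushes FAR down and FRR up, so the threshold has to be chosen according to which error is more costly for the task.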

Voice biometrics can be used in various scenarios: verifying users during a conversation with a contact center operator, automatic voice verification in IVR, granting access to a mobile application (together with other biometrics such as fingerprint or face), identifying fraudsters by voice (anti-fraud), and so on.

The voice biometrics project at Priorbank was carried out during 2015. Its key task is to confirm the client's identity by phone during a conversation with the operator and to shorten verification time, which in turn improves the quality of remote customer service and strengthens the protection of banking information.

Every month the contact center operators of Priorbank handle tens of thousands of calls, and more than half of them are requests for information on accounts and transactions. The main questions are “why doesn't my card work”, “how much do I owe on my loan”, “why doesn't the transaction go through” and so on. Such questions cannot be answered without confirming the client's identity: by law, account information is provided only to the account owner.

Account holders are not the only ones who try to get this information; third parties do too. That is why a person calling the bank is always asked for passport details, card number, mother's maiden name, and so on. Going through these questions takes 30−40 seconds per customer on average.

For the voice biometrics project the bank chose a solution from Speech Technology Center, since the Russian developer's biometric platform had already proven effective at verifying the identity of bank customers. Last summer a solution based on it was implemented in the mobile application of Wells Fargo, the largest bank in the world by market capitalization.

How the system works at Priorbank

On every call (incoming or outgoing), as soon as the conversation with the operator begins, the system starts checking the user in the background and collecting data about his voice.
Biometric voice parameters are measured in real time and compared with the previously saved voice template. The whole process takes a few seconds.
The result of voice verification is displayed on the contact center operator's monitor.

The process is reliable enough to tell apart, for example, the voices of twins, or to catch an impersonator. In such a case the system compares the voice with the template within a few seconds and reports that verification failed. The system is language independent, so a bank customer can speak any language he knows.

Stages of implementation

Development of technical specifications, design and implementation.
Integration with CRM.
System calibration.
Testing and trial operation.
Handover of the completed work.

Working with voice samples involves several processes:

initial filling of the database with customers' voice samples taken from recordings of their calls to the contact center (the client must consent to the recording of his reference sample when a template is created);
updating voice templates;
verifying customers by voice.

The database of customers' voice samples was built up as they called the bank's contact center. If a client confirmed his identity during a conversation with an operator in the usual way (passport data, secret words, contract numbers and other details), then once enough speech had been collected the system created a digital template based on the unique features of his voice: accent, speech rate, pronunciation and so on. Physical features were also taken into account: the vocal tract, the shape and size of the mouth, the structure of the nose. As a result, on the very next call to the contact center the client verification procedure took much less time thanks to background voice verification.

How a voice template is registered

When the client calls, he is identified by phone number (if he is calling from a mobile phone whose number is registered with the bank).
He then goes through standard authentication: full name, date of birth, passport and contract numbers, secret phrase. In short, everything down to the name of his cat.
During the conversation the system accumulates the amount of speech needed to create a voice template (usually about 40 seconds of clear speech) and, once enough has been collected, signals that it is ready to create one. The operator presses a button and, if the information the customer gave matches the data in the bank's systems, saves the voice template. Otherwise nothing is saved. The bank obtains the client's consent to create a voice template in advance, when the client signs the service contract.
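
The registration steps above can be sketched roughly as follows. This is only an illustration of the gating logic (enough speech, consent, identity confirmed by the operator); the class, its fields and the way the template is stored are hypothetical, and building a real template from audio is outside the sketch.

```python
from dataclasses import dataclass, field

MIN_ENROLL_SECONDS = 40.0  # "about 40 seconds of clear speech" per the text

@dataclass
class EnrollmentSession:
    client_id: str
    has_consent: bool  # given when the service contract was signed
    speech_seconds: float = 0.0
    segments: list = field(default_factory=list)

    def add_speech(self, segment, duration):
        """Accumulate one segment of clear client speech."""
        self.segments.append(segment)
        self.speech_seconds += duration

    @property
    def ready(self):
        """True once enough speech has been collected to build a template."""
        return self.speech_seconds >= MIN_ENROLL_SECONDS

    def save_template(self, identity_confirmed, store):
        """Operator pressed the button: save only if every check passed."""
        if not (self.ready and self.has_consent and identity_confirmed):
            return False
        # Stand-in for a real template built from the collected audio.
        store[self.client_id] = list(self.segments)
        return True
```

The point of the design is that collecting enough audio is not sufficient on its own: the save is refused unless the operator has also matched the caller's answers against the bank's records.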

How verification works

The customer calls and is identified by mobile phone number. If a voice template already exists for him, the system starts the verification procedure; if not, see the section above on registering a new template.
The operator asks the client a few simple questions (to introduce himself and state his date of birth).
Within a few seconds of conversation, once enough of the client's clear speech has accumulated (7–9 seconds), the system compares his voice with the template and shows the operator the result (“own”, “alien” or “not sure”).
The operator then either ends the questioning, continues it because the system is “not sure” the caller is its own, or politely refuses the client because he is “alien”.

If necessary, the operator can manually restart the client's biometric verification procedure, for example, if a third participant intervened in the conversation during the verification process.

After the verification procedure is completed, the system no longer “listens” to the conversation, and its resources are not used.
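
A minimal sketch of this behavior, under my own assumptions about how audio arrives: speech comes in chunks with a known duration, the comparison runs once when about 7 seconds have accumulated, and later audio is ignored until the operator restarts the check. The class and the `compare` callable are hypothetical stand-ins for the real engine.

```python
MIN_VERIFY_SECONDS = 7.0  # the text cites 7-9 seconds of clear speech

class VerificationSession:
    def __init__(self, compare):
        self.compare = compare  # hypothetical callable: list of chunks -> result
        self.audio = []
        self.seconds = 0.0
        self.result = None      # set once; afterwards audio is ignored

    def feed(self, chunk, duration):
        """Feed one chunk of client speech; return the result once known."""
        if self.result is not None:
            return self.result  # verification done: stop "listening"
        self.audio.append(chunk)
        self.seconds += duration
        if self.seconds >= MIN_VERIFY_SECONDS:
            self.result = self.compare(self.audio)
        return self.result

    def restart(self):
        """Manual restart by the operator, e.g. after a third voice cut in."""
        self.audio, self.seconds, self.result = [], 0.0, None
```

Because the result is cached after the first comparison, the engine is called once per verification, which matches the idea that the system's resources are freed as soon as the check is complete.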

Features of implementation

To make a decision, the system is configured with three thresholds (“first”, “second” and “enrichment”). If the comparison result is above the first threshold, the system considers the client its “own”; if it is below the second, the client is “alien”; if it falls between the two, the system cannot be sure of its decision.

If the comparison result is above both the first threshold and the “enrichment” threshold, the system automatically updates the voice template. This keeps templates up to date.
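
The three-threshold logic fits in a few lines. In this sketch the threshold values are invented for illustration, and I assume the score lies between 0 and 1 with higher meaning a closer match; the real system's score scale and settings are not given in this article.

```python
def decide(score, first=0.80, second=0.50, enrichment=0.90):
    """Map a comparison score to (verdict, update_template)."""
    if score > first:
        # Accepted; refresh the stored template only on very confident matches.
        return "own", score > enrichment
    if score < second:
        return "alien", False
    # Between the two thresholds the system cannot commit either way.
    return "not sure", False

for score in (0.95, 0.85, 0.65, 0.30):
    print(score, decide(score))
```

Keeping the enrichment threshold above the first one means that only high-confidence matches are allowed to modify a stored template, so a borderline acceptance cannot gradually drift it toward someone else's voice.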

Fighting scammers

It would be a great omission not to say that, among other things, voice biometrics helps to fight fraudsters. This is the so-called anti-fraud system. This subsystem has not been implemented at Priorbank, but I’ll tell you about it anyway.

According to Aite Group, fraudsters can obtain the answers to 20 to 50% of all secret questions. With voice biometrics, within a few seconds of conversation the client is verified automatically and the system checks that the caller is not a fraudster.

It works like this. In addition to the database of ordinary clients' voice templates, a blacklist of fraudsters is created containing voice samples of known offenders. On a call to the contact center, the caller's voice is compared with his own template in the client database (verification) and with all templates in the fraudster database (identification). Naturally, if the voice is found in the fraudster database, the service scenario for that caller changes.
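
Here is a rough sketch of that combined check. The similarity function is a toy one (it assumes features scaled to the range 0 to 1), and the scenario names, threshold and data layout are my own assumptions rather than anything from a real anti-fraud product.

```python
def similarity(a, b):
    """Toy score in [0, 1]: 1 minus the mean absolute feature difference."""
    return 1.0 - sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def screen_call(sample, client_template, blacklist, threshold=0.9):
    """Return (scenario, fraudster_id) for an incoming call."""
    # Identification (1:N): compare against every known fraudster voice.
    for fraudster_id, template in blacklist.items():
        if similarity(sample, template) >= threshold:
            return "fraud_alert", fraudster_id
    # Verification (1:1): compare against the claimed client's own template.
    if similarity(sample, client_template) >= threshold:
        return "verified", None
    return "manual_check", None

blacklist = {"f1": [0.2, 0.9, 0.4]}   # hypothetical fraudster voice
client = [0.8, 0.1, 0.6]              # hypothetical client template

print(screen_call([0.8, 0.1, 0.55], client, blacklist))
print(screen_call([0.2, 0.9, 0.45], client, blacklist))
```

The blacklist is checked first so that a match against a known fraudster overrides everything else, whatever identity the caller claims.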

Together, the authentication solution and the anti-fraud system make it possible to detect up to 90% of fraud at a false alarm rate of 0.1%.

As I said earlier, biometrics has two key probability indicators, expressed as percentages:

false acceptance, FAR (False Acceptance Rate, which reflects how strict the system is), i.e. the probability that the system will let an “alien” person through;
false rejection, FRR (False Rejection Rate), i.e. the probability that the system will turn away its “own” person.

The two indicators are closely interrelated. FAR determines specificity (specificity = 1 − FAR) and FRR determines sensitivity (sensitivity = 1 − FRR). By raising the system's sensitivity we lower its specificity, and vice versa.

As for false rejection, in the classic question-based scheme the rejection rate is about 10–15%; with additional real-time voice authentication it does not exceed 4%. As for false acceptance, under the standard scheme fraudsters pass authentication in 15–20% of cases; additional real-time authentication reduces this to 1–3%.