The story of how cognitive technologies help preserve karma

Recently a friend and I argued about whether it is right to leave negative feedback and give someone's work a bad rating. Say you come to a bank and a consultant is rude to you. I am convinced that it is worth doing, because without that feedback the person will go on being rude. My friend believes it is a big minus to your karma: you should not offend people, and they will figure everything out themselves in time. At about the same time we held a hackfest for partners, where I saw a solution that could save the karma of each of us. It would be a sin not to share it. As you may have guessed from the title, this article is about a product built on cognitive services.

Introduction

This is where a passage along the lines of “you know how important it is for any company to evaluate the quality of its service, it is the basis of growth” should go. In my opinion, these are fairly commonplace truths, so we’ll skip it.

Saving karma, inexpensively

The Heedbook service, which I will talk about today, has one very cool advantage over other ways of evaluating how employees work with customers: it automatically assesses the client's emotions in real time. So, coming back to my friend, his tolerance will not spare the consultant from an honest evaluation of his work. The wolves are fed, the sheep are safe, and my friend's karma is intact too.

How it works:

1. At the beginning of the working day, a front-line employee of a bank (or a pharmacy, an MFC (a multifunctional public services center), a store, or a similar business) logs in to the system through a browser.
2. A client, for example my friend, comes up to the employee.
3. In the background, the system receives and analyzes the video stream from a webcam in real time.
4. The data is processed by intelligent systems that recognize the client's emotions, speech, and other parameters.
5. Based on the analysis, the system provides detailed analytics: the structure of the client's emotions and the share of positive and negative ones, how attentive the client was to the employee, the content of the dialogue, and whether the employee followed the service script or used prohibited phrases.
6. The head of the office and staff at the head office receive detailed information about the quality of customer service, broken down by manager and by client. (And here we see our consultant starting to get nervous. :))
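
To make the "share of positive and negative emotions" from step 5 concrete, here is a minimal sketch. It is not Heedbook's actual analytics code: the `EmotionStats` class and the split of labels into positive and negative sets are assumptions made for illustration. The label names themselves are the emotion names returned by the Face API.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of Heedbook.
public static class EmotionStats
{
    // Assumption: which labels count as positive or negative is chosen for this sketch.
    // "Neutral" and "Surprise" are deliberately left out of both sets.
    public static readonly HashSet<string> Positive =
        new HashSet<string> { "Happiness" };
    public static readonly HashSet<string> Negative =
        new HashSet<string> { "Anger", "Contempt", "Disgust", "Fear", "Sadness" };

    // Fraction (0..1) of frames whose dominant emotion belongs to the given set.
    // Returns 0 for a dialog with no analyzed frames.
    public static double Share(IReadOnlyCollection<string> frameEmotions, ISet<string> labels)
    {
        if (frameEmotions.Count == 0) return 0.0;
        int hits = frameEmotions.Count(e => labels.Contains(e));
        return hits / (double)frameEmotions.Count;
    }
}
```

Calling `EmotionStats.Share(emotions, EmotionStats.Positive)` on the dominant emotions of all frames in one dialog gives the positive share for that dialog; the same call with `EmotionStats.Negative` gives the negative share.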

In addition, for each employee the system determines the average service time, the number of customers served, and the demographic structure of the client base.

Another interesting feature is that the director can connect to the video stream from front-line employees' workstations, and can also watch recordings of conversations later, together with detailed analytics on them. This adds a new scenario to our story: the consultant starts being rude to my friend, and then suddenly his eyes widen and he becomes sweet and polite. :)

And one last interesting detail: Heedbook has a rating system for employees.

How it works: through the eyes of the developer

Azure Functions Microservice Architecture

Together with Dima Soshnikov, I helped the team with part of the solution design. The first thing we did was decide to move away from a monolithic architecture and build the system on microservices (as you can tell from my recent articles, I find this a very interesting topic). We used Azure Functions (AF) for this. We also considered WebJobs, but they have performance limitations, and their pricing is not based on the number of operations performed.

The main AF development environment is the online function editor in the Azure portal. Since the end of May 2017, you can also create AF in Visual Studio 2017 Update 3.

Since AF is a new Microsoft product, it does not have complete documentation yet, so below we will walk through one AF from the Heedbook project. This should save you time if you decide to build a microservice architecture on Azure.

An AF can be triggered by an HTTP request, by a blob appearing in Azure Blob storage, by actions in OneDrive, or simply by a timer. The project uses almost all of these trigger types. We also implemented AF cascades, where one AF starts another, so that together they form a single business process for analyzing the data.

Our example AF is triggered by the appearance of a blob, in this case a picture. It determines the number of people in the picture and their emotions, using the Microsoft Face API cognitive service.

First, you need to reference the cognitive services libraries. In the AF online editor you have to do this manually by creating a project.json file and listing all the required dependencies in it:

    {
      "frameworks": {
        "net46": {
          "dependencies": {
            "Microsoft.ProjectOxford.Common": "1.0.324",
            "Microsoft.ProjectOxford.Face": "1.2.5"
          }
        }
      }
    }

If you create the AF in Visual Studio 2017 Update 3, you simply add the dependencies through NuGet.

Next, we need to declare the AF trigger and output parameters. In our case, the trigger is a blob appearing in a specific container, and the output is a record of the recognition results in an Azure SQL table. This is done in the function.json file:

    {
      "bindings": [
        {
          "name": "InputFace",
          "type": "blobTrigger",
          "direction": "in",
          "path": "frames/{name}",
          "connection": "heedbookhackfest_STORAGE"
        },
        {
          "type": "apiHubTable",
          "name": "FaceData",
          "dataSetName": "default",
          "tableName": "FaceEmotionGuid",
          "connection": "sql_SQL",
          "direction": "out"
        }
      ],
      "disabled": false
    }

And now, the Azure Function code itself!

    #r "System.IO"

    using System.IO;
    using Microsoft.ProjectOxford.Face;
    using Microsoft.ProjectOxford.Common.Contract;

    public static async Task Run(Stream InputFace, string name, IAsyncCollector<FaceEmotion> FaceData, TraceWriter log)
    {
        log.Info($"Processing face {name}");

        // The blob name is split on '-': part 0 is the dialog id, part 1 the timestamp
        var namea = Path.GetFileNameWithoutExtension(name).Split('-');

        // <Face_Api_Key> is a placeholder for your Face API key
        var cli = new FaceServiceClient(<Face_Api_Key>);
        var res = await cli.DetectAsync(InputFace, false, false,
            new FaceAttributeType[] { FaceAttributeType.Age, FaceAttributeType.Emotion, FaceAttributeType.Gender });

        // Pick the largest face in the frame: sort by width, widest first
        var fc = (from f in res orderby f.FaceRectangle.Width descending select f).FirstOrDefault();
        if (fc != null)
        {
            var R = new FaceEmotion();
            R.Time = DateTime.ParseExact(namea[1], "yyyyMMddHHmmss", System.Globalization.CultureInfo.InvariantCulture.DateTimeFormat);
            R.DialogId = int.Parse(namea[0]);
            var t = GetMainEmotion(fc.FaceAttributes.Emotion);
            R.EmotionType = t.Item1;
            R.FaceEmotionGuidId = Guid.NewGuid();
            R.EmotionValue = (int)(100 * t.Item2);  // score as a whole percentage
            R.Sex = fc.FaceAttributes.Gender.ToLower().StartsWith("m");  // true for male
            R.Age = (int)fc.FaceAttributes.Age;
            await FaceData.AddAsync(R);
            log.Info($" - recorded face, age={fc.FaceAttributes.Age}, emotion={R.EmotionType}");
        }
        else log.Info(" - no faces found");
    }

    // Returns the name and score of the highest-scoring emotion
    public static Tuple<string,float> GetMainEmotion(EmotionScores s)
    {
        float m = 0;
        string e = "";
        foreach (var p in s.GetType().GetProperties())
        {
            if ((float)p.GetValue(s) > m)
            {
                m = (float)p.GetValue(s);
                e = p.Name;
            }
        }
        return new Tuple<string,float>(e, m);
    }

    // One row of the output table
    public class FaceEmotion
    {
        public Guid FaceEmotionGuidId { get; set; }
        public DateTime Time { get; set; }
        public string EmotionType { get; set; }
        public float EmotionValue { get; set; }
        public int DialogId { get; set; }
        public bool Sex { get; set; }
        public int Age { get; set; }
    }
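
The function reads the dialog id and the frame timestamp from the blob name, which it splits on '-'. As a rough sketch of that step in isolation, the hypothetical `FrameName` helper below (not part of the Heedbook code) does the same parsing but returns false on a malformed name instead of throwing. The example file name is made up and only follows the pattern implied by the code.

```csharp
using System;
using System.Globalization;
using System.IO;

// Hypothetical helper mirroring the inline name parsing in Run.
public static class FrameName
{
    // Parses names of the form "<dialogId>-<yyyyMMddHHmmss>.<ext>".
    // Like the original code, any parts after the second one are ignored.
    public static bool TryParse(string blobName, out int dialogId, out DateTime time)
    {
        dialogId = 0;
        time = default(DateTime);
        if (string.IsNullOrEmpty(blobName)) return false;

        var parts = Path.GetFileNameWithoutExtension(blobName).Split('-');
        if (parts.Length < 2) return false;

        // Digits only for the id, exact format for the timestamp
        return int.TryParse(parts[0], NumberStyles.None, CultureInfo.InvariantCulture, out dialogId)
            && DateTime.TryParseExact(parts[1], "yyyyMMddHHmmss",
                   CultureInfo.InvariantCulture, DateTimeStyles.None, out time);
    }
}
```

For a name such as 42-20170601123000.jpg this yields dialog id 42 and a timestamp of 1 June 2017, 12:30:00. Whether to skip or log malformed names is a design choice the original code leaves open.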

This is an asynchronous function that works with the Cognitive Services Face API. The AF receives the blob as a Stream and sends it to Cognitive Services:

  var res = await cli.DetectAsync(InputFace,false,false,new FaceAttributeType[] { FaceAttributeType.Age, FaceAttributeType.Emotion, FaceAttributeType.Gender});

Next, it selects the largest face in the frame:

  var fc = (from f in res orderby f.FaceRectangle.Width descending select f).FirstOrDefault();

Finally, it writes the recognition results to the database:

 await FaceData.AddAsync(R);
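
The two selection steps, picking the largest face and then its dominant emotion, can be tried out without calling the Face API. The sketch below uses a hypothetical `DetectedFace` stand-in type rather than the SDK classes, and a dictionary of scores in place of the reflection over `EmotionScores` used in the function above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a detected face, reduced to what the sketch needs.
public class DetectedFace
{
    public int Width { get; set; }
    public Dictionary<string, float> Emotions { get; set; }
}

public static class FaceSelection
{
    // Returns the widest face, or null when the frame contains no faces.
    public static DetectedFace Largest(IEnumerable<DetectedFace> faces)
    {
        return faces.OrderByDescending(f => f.Width).FirstOrDefault();
    }

    // Returns the highest-scoring emotion together with its score.
    // Throws InvalidOperationException if the face has no emotion scores.
    public static KeyValuePair<string, float> MainEmotion(DetectedFace face)
    {
        return face.Emotions.OrderByDescending(e => e.Value).First();
    }
}
```

Sorting in descending order matters here: with the default ascending order, FirstOrDefault would return the smallest face in the frame rather than the largest.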

It's simple, isn't it? The future is microservices. :)

About problems

Okay, in fact it is not quite that simple.

Unfortunately, AF currently has a number of limitations (name binding does not work, and there are library conflicts). Fortunately, the .NET world always has plenty of workarounds, so if a problem cannot be solved in the basic scenario, you can usually find several ways around it.

Recording video and audio in the background

As you know, modern operating systems try to save battery life as much as possible by suspending the activity of all applications running in the background. This also applies to streaming in a web, mobile, or desktop application. After a long investigation of this issue, we chose a web solution.

We get the video and audio stream from the webcam using getUserMedia(). Next, we have to record the stream and extract data from it to send to the backend. This works as long as the browser tab stays active; as soon as the tab becomes inactive, the stream is no longer available for recording. Our task was to build a system that works in the background and does not get in the way of the employee's regular duties. The solution was to create our own stream variable, into which we record the video and audio data and from which we extract it.

Russian speech recognition quality

In the future, the service will move towards building its own speech recognition models, but for now it has to rely on external providers of Russian speech recognition. Choosing a service that recognizes Russian speech well was difficult. The current configuration uses a combination of speech recognition systems; for Russian speech it uses the Google Speech API, which showed the best recognition quality in our tests.

Back to reality

In fact, this solution is not just a fairy tale about the future. In the near future, Heedbook will go live in the Moscow region MFC and in the country's largest bank.

The Heedbook team will appreciate comments on their solution and will be glad to collaborate with professionals in ML, data analysis, SEO, and working with large clients. Share your thoughts in the comments or email info@heedbook.com.

Source: https://habr.com/ru/post/330942/
