Hello, Habr! We have already told you several times about LUIS, our intelligent language understanding service. Those stories always came with one caveat: LUIS is cool, but it does not understand Russian. Today that changes. Below the cut, you will learn how to add support for additional languages to LUIS using the Translation Cognitive API service.
Moed.ai is an Israeli startup that lets service providers manage their work calendars and add new events to them through a single cloud platform accessible from any device.
Using the Moed.ai control panel, users can schedule services, resource usage, and other events. Resources here include both objects, such as cars and meeting rooms, and employees, such as test drivers and a car dealer's sales representatives. The Moed.ai platform lets you plan the use of each resource and fit client meetings to its availability.
Currently, Moed.ai is building chat bots for each of its customers, so that clients of these companies can schedule services in the language they are comfortable with and through a familiar channel (Facebook Messenger, Skype, Slack, etc.).
Problem
Moed.ai is an Israeli company, so for many of its customers the native language is Hebrew. The English version of the Moed.ai chat bot can extract intents and entities from a user's message, and the company wants to build a Hebrew version with the same functionality. Unfortunately, the LUIS platform, which the company planned to use to extract intents and entities, currently has no official support for Hebrew.
Solution
The goal of our work with Moed.ai was to find a way to add Hebrew support to LUIS using the Translation Cognitive Service. In the course of the work, we compared two ways of supporting Hebrew. The first, feeding text translated by the cognitive service directly into the existing English LUIS model, gave unsatisfactory results, but we managed to come up with a more successful one.
We took a new approach to training the LUIS model: instead of proofread English phrases, we used unedited machine translations as examples. This approach allowed us to bridge the significant gap between what the machine translation produces and correct human speech.
To understand why this method works, consider the following situation.
Suppose a user sends the bot four Hebrew sentences:
אני רוצה לקבוע פגישה. אני רוצה לקבוע נסיעת מבחן. אני רוצה לקבוע נסיעת מבחן למחר. אפשר לקבוע נסיעת מבחן למחר?
The correct English translation of these sentences is:
I want to schedule a meeting. I want to schedule a test drive. I want to schedule a test drive for tomorrow. Can I schedule a test drive tomorrow?
However, the machine translation service produces the following result:
I want to schedule an appointment. I want to schedule a test drive. I want to make a test tomorrow. Can set a test tomorrow?
The first two translations convey the original meaning almost exactly. For the third and fourth sentences, however, the machine output ("I want to make a test tomorrow", "Can set a test tomorrow?") differs significantly from the true meaning ("I want to schedule a test drive for tomorrow", "Can I schedule a test drive tomorrow?").
For example, in both phrases the translation system replaced the concept of a "test drive" with the word "test", which looks similar but is far from the meaning of the source text. A LUIS model trained only on correct sentences, such as "I want to schedule a test drive for tomorrow", can hardly guess the meaning behind this substitution. Yet the error is typical of Hebrew-to-English translation: differences in grammar and word usage between the two languages cause the same inaccuracies to recur in translations for this language pair.
If we train the model on sentences translated from Hebrew from the start, it will quickly learn to bridge the gap between the inaccurate translation and the original meaning. Over time, the model will learn which errors the translation service makes with Hebrew in each specific context and will respond correctly to most requests.
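The idea can be sketched in a few lines of TypeScript. This is only an illustration under stated assumptions, not the code of the module described in this article: the `Translate` signature, the `buildTrainingSet` helper, and the intent names `ScheduleMeeting` and `ScheduleTestDrive` are all hypothetical, and the translation service is replaced by a stub that returns the machine translations quoted above.

```typescript
// Hypothetical signature: any function that turns Hebrew text into English.
type Translate = (text: string) => Promise<string>;

interface TrainingExample {
  source: string;    // original Hebrew utterance
  utterance: string; // unedited machine translation, used as the training example
  intent: string;    // label assigned by a human (names here are made up)
}

// Stub standing in for the translation service; it only knows the four
// sample sentences and returns the machine output quoted in the article.
const sampleTranslations = new Map<string, string>([
  ["אני רוצה לקבוע פגישה", "I want to schedule an appointment."],
  ["אני רוצה לקבוע נסיעת מבחן", "I want to schedule a test drive."],
  ["אני רוצה לקבוע נסיעת מבחן למחר", "I want to make a test tomorrow."],
  ["אפשר לקבוע נסיעת מבחן למחר?", "Can set a test tomorrow?"],
]);

const stubTranslate: Translate = async (text) => {
  const translated = sampleTranslations.get(text);
  if (translated === undefined) {
    throw new Error(`no sample translation for: ${text}`);
  }
  return translated;
};

// Pair each labelled Hebrew utterance with its raw machine translation.
// The translation is deliberately NOT corrected: the model should see the
// same kind of text it will receive at run time.
async function buildTrainingSet(
  labelled: Array<{ text: string; intent: string }>,
  translate: Translate
): Promise<TrainingExample[]> {
  const examples: TrainingExample[] = [];
  for (const { text, intent } of labelled) {
    examples.push({ source: text, utterance: await translate(text), intent });
  }
  return examples;
}

// Usage: print the examples that would be used for training.
buildTrainingSet(
  [
    { text: "אני רוצה לקבוע פגישה", intent: "ScheduleMeeting" },
    { text: "אני רוצה לקבוע נסיעת מבחן למחר", intent: "ScheduleTestDrive" },
  ],
  stubTranslate
).then((examples) => {
  for (const e of examples) {
    console.log(`${e.intent}: ${e.utterance}`);
  }
});
```

The important design choice is that the `ScheduleTestDrive` label ends up attached to "I want to make a test tomorrow." rather than to a hand-corrected sentence, so the model learns the translator's systematic errors together with the intent.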
Usage guide
This section describes how to train and use our node module to add support for additional languages to bots. It is assumed that you have already created a LUIS application and generated a key for the Translation Cognitive Service.
- Make a list of commands in the language you need (in our case, Hebrew). For example:
אני רוצה לקבוע פגישה
- Run the bulk translation and LUIS import script.
- Label the translated utterances with intents and entities in the LUIS portal.
- Use the training and testing tools on the LUIS portal to retrain and test your model until it learns to match translations from the new language with their meanings.
- Use the LUIS npm module to call the trained LUIS model and integrate it into your application.
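At run time, the last step amounts to a simple chain: translate the incoming message, then pass the translation to the trained model. Below is a minimal sketch of that chain, assuming both services are wrapped in plain async functions. The names `TranslateFn`, `RecognizeFn`, and `createTranslatingRecognizer` are invented for illustration and are not the API of the npm module; the two stubs only imitate the real services, and the result shape is simplified.

```typescript
// Simplified result shape; a real LUIS response carries more fields.
interface IntentResult {
  intent: string;
  score: number;
}

// Hypothetical wrappers around the translation service and the LUIS endpoint.
type TranslateFn = (text: string) => Promise<string>;
type RecognizeFn = (englishText: string) => Promise<IntentResult>;

interface BotUnderstanding extends IntentResult {
  original: string;   // message as the user typed it
  translated: string; // raw machine translation that the model received
}

// Compose the two services: translate first, then recognize the intent
// on the unedited translation.
function createTranslatingRecognizer(
  translate: TranslateFn,
  recognize: RecognizeFn
): (message: string) => Promise<BotUnderstanding> {
  return async (message) => {
    const translated = await translate(message);
    const { intent, score } = await recognize(translated);
    return { original: message, translated, intent, score };
  };
}

// Stub translator: knows a single sentence from the article.
const fakeTranslate: TranslateFn = async (text) =>
  text === "אפשר לקבוע נסיעת מבחן למחר?" ? "Can set a test tomorrow?" : text;

// Stub model: a keyword match standing in for a model trained on
// machine-translated examples. The scores are placeholders, not real output.
const fakeRecognize: RecognizeFn = async (englishText) =>
  /\btest\b/i.test(englishText)
    ? { intent: "ScheduleTestDrive", score: 1 }
    : { intent: "None", score: 0 };

const understand = createTranslatingRecognizer(fakeTranslate, fakeRecognize);

understand("אפשר לקבוע נסיעת מבחן למחר?").then((result) => {
  console.log(result.intent, "<-", result.translated);
});
```

Passing the two services in as functions keeps the chain independent of any particular SDK and makes it easy to test with stubs like the ones above.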
Code
You can find the source code and notes on using the described method on GitHub.
Use cases
The method described in this article can be used to detect intents and entities in text in any natural language supported by the cognitive translation service. It also applies to localizing many "conversation as a platform" products, to make the conversation with a bot feel more natural.
We remind you that you can try Azure for free.
A minute of advertising. If you want to try new technologies in your projects but never get around to it, apply to Microsoft's Tech Acceleration program. Its main feature is that we will select the required stack together with you, help you implement a pilot and, if it succeeds, do our best to make sure the whole market hears about you.
P.S. We thank Kostya Kichinsky (Quantum Quintum) for the illustration for this article.