In this article, we’ll look at language understanding (linguistic computing such as tagging, parsing, and so on), paying particular attention to two APIs: the Linguistic Analysis API and the Language Understanding Intelligent Service (LUIS). If you like English as well as Russian and enjoy learning about artificial intelligence, read on.

First, some background.
Microsoft Cognitive Services (developed by Microsoft Research) make it easier to work with intelligent algorithms. Even specialists unfamiliar with data science theory can use the ready-made APIs to add analytics to their applications.
Linguistic computing includes speech recognition, translation, sentiment analysis, summarization (possibly also used in Microsoft Word), language generation, and so on. Microsoft Cognitive Services offers more than 20 APIs designed to solve these problems.
We hope that this article will help you understand how the language understanding technologies in Microsoft Cognitive Services work.
Note. The article does not explain how to use these interfaces. For instructions, see the official documentation (Linguistic Analysis API, LUIS).
Linguistic Analysis API
This API provides basic language parsing capabilities. It returns the parse of the input sentences in JSON format. Using it takes two steps.
- Look up the available analyzers (a REST API call).
- Run the analysis (parsing) with the selected analyzers (a REST API call).
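The two steps above can be sketched as two HTTP requests. This is only an illustration under stated assumptions: the base URL, the `/analyzers` and `/analyze` paths, the `Ocp-Apim-Subscription-Key` header and the request body fields are not given in this article and should be checked against the official documentation. The helper names are hypothetical. The code only builds the requests and does not send them.

```python
import json
import urllib.request

# ASSUMPTION: base URL and header name are not taken from this article.
BASE_URL = "https://westus.api.cognitive.microsoft.com/linguistics/v1.0"
KEY_HEADER = "Ocp-Apim-Subscription-Key"


def build_list_request(key):
    """Step 1: a GET request asking which analyzers are available."""
    return urllib.request.Request(
        BASE_URL + "/analyzers", headers={KEY_HEADER: key}, method="GET"
    )


def build_analyze_request(key, analyzer_ids, text, language="en"):
    """Step 2: a POST request that runs the chosen analyzers on the text."""
    # ASSUMPTION: the body field names below are illustrative.
    body = {"language": language, "analyzerIds": list(analyzer_ids), "text": text}
    return urllib.request.Request(
        BASE_URL + "/analyze",
        data=json.dumps(body).encode("utf-8"),
        headers={KEY_HEADER: key, "Content-Type": "application/json"},
        method="POST",
    )


# Several analyzers can be requested in one call (IDs taken from the outputs
# shown in this article).
example = build_analyze_request(
    "YOUR-KEY",
    ["08ea174b-bfdb-4e64-987e-602f85da7f72", "4fa79af1-f22c-408d-98bb-b7d7aeef7f04"],
    "I want a sweet-smelling flower with a red flowerbot.",
)
print(example.get_method(), example.full_url)
```

To actually send a request you would pass it to `urllib.request.urlopen` with a valid subscription key.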
Three types of linguistic analyzers are currently available.
Token Analyzer (PennTreebank3 - regular expressions)
This is the simplest analyzer that breaks sentences into tokens.
Below is the result of tokenizing the sentence
I want a sweet-smelling flower with a red flowerbot.
On its own this output lacks a lot of useful information, but the analyzer can be used together with others (by specifying several analyzers in one API call), and the missing data can be taken from the combined results.
{ "analyzerId": "08ea174b-bfdb-4e64-987e-602f85da7f72", "result": [ { "Len": 52, "Offset": 0, "Tokens": [ { "Len": 1, "NormalizedToken": "I", "Offset": 0, "RawToken": "I" }, { "Len": 4, "NormalizedToken": "want", "Offset": 2, "RawToken": "want" }, { "Len": 1, "NormalizedToken": "a", "Offset": 7, "RawToken": "a" }, { "Len": 14, "NormalizedToken": "sweet-smelling", "Offset": 9, "RawToken": "sweet-smelling" }, { "Len": 6, "NormalizedToken": "flower", "Offset": 24, "RawToken": "flower" }, { "Len": 4, "NormalizedToken": "with", "Offset": 31, "RawToken": "with" }, { "Len": 1, "NormalizedToken": "a", "Offset": 36, "RawToken": "a" }, { "Len": 3, "NormalizedToken": "red", "Offset": 38, "RawToken": "red" }, { "Len": 9, "NormalizedToken": "flowerbot", "Offset": 42, "RawToken": "flowerbot" }, { "Len": 1, "NormalizedToken": ".", "Offset": 51, "RawToken": "." } ] } ] }

The sentence is not simply split on spaces and punctuation marks; it is divided into tokens according to context. For example, the API correctly tokenizes the phrase what's your name? (the same as what is your name?), Mr. Williams (where the period is not a sentence-ending punctuation mark), and so on.
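As a small illustration of how the token output can be consumed, the sketch below reads the JSON shown above and checks that each token's Offset and Len point back to the same text as its RawToken. The function name `token_spans` is hypothetical; the data is copied from the article.

```python
import json

TEXT = "I want a sweet-smelling flower with a red flowerbot."

# Token analyzer output copied from the article.
RESPONSE = json.loads("""
{"analyzerId": "08ea174b-bfdb-4e64-987e-602f85da7f72",
 "result": [{"Len": 52, "Offset": 0, "Tokens": [
  {"Len": 1, "NormalizedToken": "I", "Offset": 0, "RawToken": "I"},
  {"Len": 4, "NormalizedToken": "want", "Offset": 2, "RawToken": "want"},
  {"Len": 1, "NormalizedToken": "a", "Offset": 7, "RawToken": "a"},
  {"Len": 14, "NormalizedToken": "sweet-smelling", "Offset": 9, "RawToken": "sweet-smelling"},
  {"Len": 6, "NormalizedToken": "flower", "Offset": 24, "RawToken": "flower"},
  {"Len": 4, "NormalizedToken": "with", "Offset": 31, "RawToken": "with"},
  {"Len": 1, "NormalizedToken": "a", "Offset": 36, "RawToken": "a"},
  {"Len": 3, "NormalizedToken": "red", "Offset": 38, "RawToken": "red"},
  {"Len": 9, "NormalizedToken": "flowerbot", "Offset": 42, "RawToken": "flowerbot"},
  {"Len": 1, "NormalizedToken": ".", "Offset": 51, "RawToken": "."}
 ]}]}
""")


def token_spans(text, analyzer_output):
    """Return (RawToken, text slice) pairs for every token in every sentence."""
    spans = []
    for sentence in analyzer_output["result"]:
        for tok in sentence["Tokens"]:
            start = tok["Offset"]
            spans.append((tok["RawToken"], text[start:start + tok["Len"]]))
    return spans


for raw, sliced in token_spans(TEXT, RESPONSE):
    print(repr(raw), repr(sliced))
```

Keeping offsets rather than only the token strings makes it possible to map results from other analyzers back to positions in the original text.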
Part-of-Speech Tagging Analyzer (PennTreebank3 - cmm)
To extract keywords and analyze a sentence, you usually need to identify parts of speech (noun, verb, and so on). For example, if you want to pick out sentiment-bearing keywords and estimate the overall sentiment, the keywords will typically be adjectives.
The part-of-speech tagging analyzer assigns these tags. Below is the result of running it on the sentence
I want a sweet-smelling flower with a red flowerbot.
{ "analyzerId": "4fa79af1-f22c-408d-98bb-b7d7aeef7f04", "result": [ [ "PRP", "VBP", "DT", "JJ", "NN", "IN", "DT", "JJ", "NN", "." ] ] }
PRP : personal pronoun.
VBP : verb, non-third-person singular present.
DT : determiner.
JJ : adjective.
NN : noun, singular or mass.
IN : preposition or subordinating conjunction.
For more information, see the article "Penn Treebank part-of-speech tags - Penn (University of Pennsylvania)."
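To make the adjective example concrete, here is a minimal sketch that pairs the tokens with the part-of-speech tags from the two outputs shown above and keeps only the adjectives. The helper name `words_with_tag` is hypothetical, and the comparative and superlative tags JJR and JJS are included as an assumption about what you would normally want to catch.

```python
# Tokens and tags copied from the two analyzer outputs in the article.
TOKENS = ["I", "want", "a", "sweet-smelling", "flower",
          "with", "a", "red", "flowerbot", "."]
TAGS = ["PRP", "VBP", "DT", "JJ", "NN", "IN", "DT", "JJ", "NN", "."]

# Penn Treebank adjective tags: base, comparative, superlative.
ADJECTIVE_TAGS = {"JJ", "JJR", "JJS"}


def words_with_tag(tokens, tags, wanted):
    """Return the tokens whose tag is in the wanted set, in sentence order."""
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")
    return [tok for tok, tag in zip(tokens, tags) if tag in wanted]


print(words_with_tag(TOKENS, TAGS, ADJECTIVE_TAGS))  # ['sweet-smelling', 'red']
```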
A tag is assigned based not only on the word itself but also on its context. For example, the word dog is usually a noun, but it is used as a verb in the following sentence (the example is taken from the Wikipedia article “Part-of-speech tagging”):
The sailor dogs the hatch.
The tagging analyzer handles this example correctly, although minor tagging errors are still possible.
Constituency Grammar Analyzer (PennTreebank3 - SplitMerge)
Suppose you run an online flower shop with an intelligent search engine. Customers might enter the following queries.
- I want a red and sweet-smelling flower.
- I want a sweet-smelling flower except for red flowers.
- I want a sweet-smelling flower with a red flowerbot.
Each sentence describes a completely different object, but with the part-of-speech tagging analyzer (described above) you will not see the difference. For that you need the constituency grammar analyzer. Below is the result of running it on the same sentence as before:
I want a sweet-smelling flower with a red flowerbot.
{ "analyzerId": "22a6b758-420f-4745-8a3c-46835a67c0d2", "result": [ "(TOP (S (NP (PRP I)) (VP (VBP want) (NP (NP (DT a) (JJ sweet-smelling) (NN flower)) (PP (IN with) (NP (DT a) (JJ red) (NN flowerbot))))) (. .)))" ] }
S : simple declarative sentence.
NP : noun phrase.
VP : verb phrase.
PRP : personal pronoun.
VBP : verb, non-third-person singular present.
PP : prepositional phrase.
DT : determiner.
JJ : adjective.
NN : noun, singular or mass.
IN : preposition or subordinating conjunction.
More details can be found in the article “Penn Treebank II Tags - MIT”.
As we can see, the result is a tree structure. For example, “with a red flowerbot” is a prepositional phrase subordinate to “a sweet-smelling flower”.
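The bracketed string above can be turned into a nested structure with a few lines of code. The following is a minimal sketch, not the API's own client: it parses the string into nested lists of the form [label, child, ...] and lists every phrase with a given label. The function names are hypothetical, and malformed input is not handled gracefully.

```python
import re

# Constituency tree copied from the analyzer output in the article.
TREE = ("(TOP (S (NP (PRP I)) (VP (VBP want) (NP (NP (DT a) (JJ sweet-smelling) "
        "(NN flower)) (PP (IN with) (NP (DT a) (JJ red) (NN flowerbot))))) (. .)))")


def parse_tree(text):
    """Parse a bracketed tree into nested lists: [label, child, child, ...]."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def read_node():
        nonlocal pos
        if tokens[pos] != "(":
            raise ValueError("expected '('")
        pos += 1
        node = [tokens[pos]]  # the first item after "(" is the label
        pos += 1
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                node.append(read_node())  # nested phrase
            else:
                node.append(tokens[pos])  # a word (leaf)
                pos += 1
        pos += 1  # skip the closing ")"
        return node

    tree = read_node()
    if pos != len(tokens):
        raise ValueError("unexpected trailing input")
    return tree


def leaves(node):
    """Return the words under a node, left to right."""
    if isinstance(node, str):
        return [node]
    words = []
    for child in node[1:]:
        words.extend(leaves(child))
    return words


def phrases(node, label):
    """Yield the text of every subtree with the given label, outermost first."""
    if isinstance(node, str):
        return
    if node[0] == label:
        yield " ".join(leaves(node))
    for child in node[1:]:
        yield from phrases(child, label)


tree = parse_tree(TREE)
print(list(phrases(tree, "NP")))
print(list(phrases(tree, "PP")))
```

Listing the NP subtrees shows the nesting directly: the outer noun phrase contains both “a sweet-smelling flower” and the prepositional phrase “with a red flowerbot”.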
Language Understanding Intelligent Service (LUIS)
The constituency grammar analyzer (described above) provides a lot of useful information, but working with its tree structure in program code can still be difficult.
Unlike the Linguistic Analysis API, the Language Understanding Intelligent Service (LUIS) is not intended only for parsing. It gives a direct answer for certain language understanding scenarios, so you can use its result directly in the business logic of your application.
Suppose, for example, that you have an application for booking air tickets. Its interface contains a form with the fields "Point of Departure", "Point of Arrival" and "Date / Time". With LUIS, you can extract these values from sentences in natural language (for example, “I need a flight from Ekaterinburg to Moscow on July 23”). LUIS is well suited to this kind of language understanding task. For example, the service lets you directly extract the following values (point of departure, point of arrival, date / time), shown in braces.
- Book a flight from {Ekaterinburg} to {Moscow} on {10.29.2016}.
- Book me a flight to {Moscow} on {October 29}.
- I need a flight from {Ekaterinburg} to {Moscow} {next Saturday}.
- And so on.
Note. In the second example, the point of departure is not given. With LUIS you can tell which parameters are missing and ask the user to supply them.
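The missing-parameter check can be sketched as follows. Everything here is an assumption made for illustration: the result dictionary only imitates the general shape of a LUIS-style response (an intent plus a list of entities), and the intent name, entity type names and the `REQUIRED_SLOTS` table are hypothetical rather than taken from the service.

```python
# HYPOTHETICAL result for "Book me a flight to Moscow on October 29".
# The field names imitate a LUIS-style response and are assumptions.
RESULT = {
    "query": "Book me a flight to Moscow on October 29",
    "topScoringIntent": {"intent": "BookFlight", "score": 0.97},
    "entities": [
        {"entity": "moscow", "type": "Destination"},
        {"entity": "october 29", "type": "DateTime"},
    ],
}

# Which form fields each intent needs (application-side configuration).
REQUIRED_SLOTS = {"BookFlight": ["Origin", "Destination", "DateTime"]}


def filled_slots(result):
    """Map entity type -> extracted text for the entities in the result."""
    return {e["type"]: e["entity"] for e in result.get("entities", [])}


def missing_slots(result, required=REQUIRED_SLOTS):
    """Return the required slots that the utterance did not provide."""
    intent = result["topScoringIntent"]["intent"]
    filled = filled_slots(result)
    return [slot for slot in required.get(intent, []) if slot not in filled]


print(filled_slots(RESULT))
print(missing_slots(RESULT))  # ['Origin']
```

The application would then prompt the user for each missing slot before completing the booking form.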
Before using LUIS, you must register the scenarios (“intents”), for example “Book a flight”, “Check the weather”, and so on. Then, for each intent, you register sample sentences (“utterances”) and train the model. (When you press the Train button, LUIS learns the patterns.)
You can then call the LUIS REST endpoint. It returns a JSON result that matches one of the registered intents.
Note. LUIS supports active learning, so you can review and correct utterances that have not been labeled.
LUIS understands not only words but also the context of a sentence. For example, if you enter “Book a flight to such-and-such on October 29,” the phrase “such-and-such” will be recognized as a destination.
LUIS requires intents to be registered in advance. It is therefore not suitable for ad hoc tasks such as natural-language search, answering arbitrary questions (free-form conversation), and so on. Moreover, LUIS only extracts the targeted key phrases and does not analyze them. Consider the following example.
"I need a red pot that goes with red flowers for my mom."
LUIS may extract the phrase “a pot that goes with red flowers” as a key phrase, but this is not the same as “a pot and red flowers”. If you want to analyze this phrase, that is, to study the needs of a particular customer, use the Linguistic Analysis API.
Note. The name of the Text Analytics API in Microsoft Cognitive Services suggests language parsing, but this API simply assesses sentiment (satisfaction or dissatisfaction). Remember that, unlike the Emotion API (also part of Microsoft Cognitive Services), it does not detect specific emotions (joy, sadness, anger, and so on); its result is a single scalar value (the degree of sentiment). You can also extract the key phrases that influence the sentiment. For example, if you collect customer feedback and want to find out what customers dislike about your services, you can identify the likely reasons for dissatisfaction by extracting key phrases (“placement”, “employees”, etc.).
By the way, a few days ago there was news about updates in Cognitive Services and two cases of their use: Human Interact uses CRIS and LUIS so that people can communicate in the virtual world; Prism Skylab uses the Computer Vision API to analyze images from security cameras and detect specific events.