Hi, Habr! Today we want to share instructions for building a bot that analyzes questions and answers them. We could simply point you to QnA Maker, which does exactly that, but there is a catch: it supports only a limited number of languages. So below the cut you will find step-by-step instructions for building a Q&A bot that works for any language.

Situation
Recently, our colleagues worked with a Korean company that wanted a bot on its web portal capable of answering frequently asked questions (FAQ, QnA). The Bot Framework provides a service of this kind, QnA Maker: you upload a list of frequently asked questions and answers on a specific topic, also called a QnA package. After indexing the package, QnA Maker exposes a search service through a REST API that finds the most appropriate answer to a question asked in free-text form.
Problem
Currently, QnA Maker does not support Korean, so we had to build a similar FAQ service without it.
Solution
This article describes how to build your own service on top of Azure Search that answers frequently asked questions in languages QnA Maker does not currently support. To get familiar with the product, we will build a service that handles questions in English. However, all the steps described here apply to any language currently supported by Azure Search (see the list of supported languages in the Azure Search documentation).
Azure Search is a search engine offered in Azure as a service. It can index user data and run search queries against the indexed corpus. We can configure Azure Search to index an existing list of questions and answers and to find the answers and tags closest to the user's question. Azure Search is a good fit for building your own question-and-answer service, not only because it supports most common languages, but also because of two features:
- Analyzers expand a search query using a dictionary of synonyms and related expressions, which increases the chance of finding matching items in the corpus.
- Scoring profiles let you give more or less relative weight to individual fields of the indexed items. Azure Search uses the profile to score query results, which directly affects their position in the result list.
In our solution, Azure Search will process a QnA corpus shaped as a table with the following columns:
[id, category, url, question, answer, keywords]
- The id field is the unique key of the item.
- The keywords field contains the keywords that best describe the question.
- The url field is used when we want the response to send the user to an external link.
- The purpose of the category field will be described later, when we move on to a more complex scenario.
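To make the shape of one corpus row concrete, here is a minimal Python sketch. The class name QnaItem, the field types, and the sample values are assumptions made for illustration; only the six field names come from the table layout above.

```python
from dataclasses import dataclass, asdict


@dataclass
class QnaItem:
    """One row of the QnA corpus (hypothetical class name)."""
    id: str        # unique key of the item
    category: str  # used later for grouping and filtering results
    url: str       # optional external link to include in the reply
    question: str
    answer: str
    keywords: str  # space-separated keywords describing the question


# A made-up row for illustration only; it is not taken from the sample table.
item = QnaItem(
    id="1",
    category="modes",
    url="",
    question="Which mode is the device in?",
    answer="The status bar shows the current mode.",
    keywords="mode status current",
)
print(asdict(item))
```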
Here you can find an example of such a table.
Scoring profiles let us create several profiles and compare them by running the same search query against each one. There are no magic numbers that define the ideal profile, so we have to experiment with different field weights to get the most accurate results.
Our solution is built around keywords. The keywords field helps us boost the right results: by giving this field the most weight, we shift the search results in its favor, so items whose keywords match the words of the query receive a higher score.
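The effect of field weights can be illustrated with a toy model. This is not Azure Search's actual ranking algorithm; it only counts matching words per field and multiplies the count by a weight. The function names, the sample items, and the weights are all assumptions made for the sketch.

```python
# Toy model only: Azure Search uses its own relevance ranking. This sketch
# just shows how per-field weights can change the order of results.
FIELD_WEIGHTS = {"keywords": 5, "question": 3, "answer": 2}  # illustrative
FLAT_WEIGHTS = {field: 1 for field in FIELD_WEIGHTS}         # no boosting


def toy_score(query, item, weights):
    """Sum, over fields, of weight * number of query words found in the field."""
    terms = set(query.lower().split())
    score = 0
    for field, weight in weights.items():
        field_terms = set(item[field].lower().split())
        score += weight * len(terms & field_terms)
    return score


def rank(query, items, weights):
    """Return item ids ordered from highest to lowest toy score."""
    ordered = sorted(items, key=lambda it: -toy_score(query, it, weights))
    return [it["id"] for it in ordered]


# Made-up items for illustration.
items = [
    {"id": "1", "keywords": "mode which",
     "question": "what is the current state",
     "answer": "look at the status bar"},
    {"id": "2", "keywords": "login account",
     "question": "why am i logged out",
     "answer": "i think sessions expire"},
]

query = "which mode am i"
print(rank(query, items, FLAT_WEIGHTS))   # item 2 first: more raw word matches
print(rank(query, items, FIELD_WEIGHTS))  # item 1 first: keyword matches weigh more
```

With flat weights, the item that merely shares common words with the query comes first; once the keywords field is boosted, the item whose keywords match moves to the top.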
In our example, we will create a scoring profile that gives the fields described above the following weights:
- keywords: 5;
- question: 3;
- answer: 2.
We assume that the keywords field contains the words the user is most likely to use in a question, so this field gets the greatest weight. The question field should also be used, but with a lower weight than keywords. The answer field belongs in the scoring profile as well, since some words are likely to appear both in a question and in its answer; however, we give this field the least weight.
To make the test model easier to build, we will use a small QnA package. Precisely because the corpus is small, it may seem that we get good results even without scoring profiles. However, the more items a corpus contains, the more a scoring profile helps to get accurate results.
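For readers who prefer the REST API to the portal, these weights can be expressed as a scoring profile entry in the index definition. The sketch below builds that JSON fragment in Python; the profile name matches the one used in the queries later in this article, and the fragment is meant to sit in the "scoringProfiles" array of the index definition.

```python
import json

# Scoring profile as it would appear inside the "scoringProfiles" array
# of an Azure Search index definition.
scoring_profile = {
    "name": "qna-scoring-profile",
    "text": {
        "weights": {
            "keywords": 5,  # most weight: words the user is likely to type
            "question": 3,
            "answer": 2,    # least weight
        }
    },
}

print(json.dumps(scoring_profile, indent=2))
```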
To create a Q & A service in Azure Search, follow these steps:
Open the Azure portal and create a new resource group. Create a new Azure Search service in this resource group as shown below:

Enter the name of the search service, fill in the remaining required fields and click OK:

This is the deployed Azure Search service:

You can import data from multiple sources into Azure Search, for example, from document databases, SQL or Azure tables. In our case, we created a SQL database containing one table and used it as a source for the search service:

Fill in the required fields:

This is how the service should look after performing these actions:

To connect to the SQL database, we use SQL Server Management Studio. Then we create the table using the following query:

After that, we insert a few questions and answers into the table:
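The two steps above (creating the table and filling it) can be sketched in Python. This is only a stand-in: it uses the standard-library sqlite3 module with an in-memory database instead of Azure SQL, and the table name qna, the column types, and the sample rows are assumptions. Only the column names come from the table layout described earlier.

```python
import sqlite3

# In-memory SQLite database used as a local stand-in for the SQL database.
conn = sqlite3.connect(":memory:")

# Hypothetical table definition; column names follow the QnA table layout.
conn.execute(
    """
    CREATE TABLE qna (
        id       TEXT PRIMARY KEY,
        category TEXT,
        url      TEXT,
        question TEXT NOT NULL,
        answer   TEXT NOT NULL,
        keywords TEXT
    )
    """
)

# Made-up rows for illustration only.
rows = [
    ("1", "modes", None, "Which mode is the device in?",
     "The status bar shows the current mode.", "mode status current"),
    ("2", "account", None, "Why am I logged out?",
     "Sessions expire after a period of inactivity.", "login session logout"),
]
conn.executemany(
    "INSERT INTO qna (id, category, url, question, answer, keywords) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    rows,
)
conn.commit()

row_count = conn.execute("SELECT COUNT(*) FROM qna").fetchone()[0]
print(row_count)
```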

Now that the data is ready, it is time to import it into the search service. To do this, follow the steps marked with red frames:

After connecting to SQL, the data import wizard allows us to define the attributes of the index. Enter the name of the index, and then in the “Key field” drop-down list, select the Id field.
Now for each of the table fields we need to select several attributes:
- The "Retrievable" attribute means that the field will be returned in the search results.
- The "Filterable" attribute means that the results can be filtered by this field.
- The "Sortable" attribute means that the results can be sorted by this field.
- The "Facetable" attribute means that the search results can be grouped by the contents of this field, with the number of items shown next to each group. We will see later how this attribute works.
- The "Searchable" attribute means that the search engine will analyze the contents of the field when it executes a search.

Now we define the analyzer used for each field. In this case, we used the English Microsoft analyzer for all the searchable fields. Use the Microsoft language analyzer that corresponds to your language:
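The choices made in the wizard correspond roughly to an index definition in the Azure Search REST API. The sketch below builds such a definition in Python. The index name qna-index and the per-field attribute choices are assumptions, since the article sets them through the portal; "en.microsoft" is the name of the English Microsoft analyzer, and "ko.microsoft" would be the Korean one.

```python
import json

ANALYZER = "en.microsoft"  # swap for e.g. "ko.microsoft" when indexing Korean


def text_field(name):
    """A searchable, retrievable string field analyzed with ANALYZER."""
    return {
        "name": name,
        "type": "Edm.String",
        "searchable": True,
        "filterable": False,
        "sortable": False,
        "facetable": False,
        "retrievable": True,
        "analyzer": ANALYZER,
    }


index_definition = {
    "name": "qna-index",  # hypothetical index name
    "fields": [
        # Key field: must be a string and uniquely identifies each item.
        {"name": "id", "type": "Edm.String", "key": True,
         "searchable": False, "retrievable": True},
        # Category is not searched, but can be used for facets and filters.
        {"name": "category", "type": "Edm.String", "searchable": False,
         "filterable": True, "facetable": True, "retrievable": True},
        {"name": "url", "type": "Edm.String", "searchable": False,
         "retrievable": True},
        text_field("question"),
        text_field("answer"),
        text_field("keywords"),
    ],
}

print(json.dumps(index_definition, indent=2))
```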

For our test model, I set the indexer schedule to "Once". Here you can choose how often the system should re-index your database to pick up changes and updates:

Click OK. After completing the previous steps, you will receive a notification that the data was imported successfully.

To create a scoring profile, follow the steps marked with red frames in the screenshot below:

Then we set the weights for the fields, as discussed earlier:

Now we are ready to search our question-and-answer corpus. Click the Search explorer tile, as shown in the screenshot below:

Search explorer lets you call the REST API without leaving the portal. At this stage, we can start experimenting with our search engine.
Enter search=question_in_free_form in the query string, as shown below. For this request, the system performs a simple search over the searchable fields, without using the scoring profile.

Now let's apply the scoring profile by adding it to the query string:
search=in which mode am i&scoringProfile=qna-scoring-profile
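Outside the portal, the same query can be sent to the service's REST endpoint. The sketch below only builds the request URL and sends nothing. The service name my-qna-service, the index name qna-index, and the API version are assumptions; a real request would also need an api-key header carrying a query key of your service.

```python
from urllib.parse import urlencode


def build_search_url(service, index, query, scoring_profile=None,
                     api_version="2020-06-30"):
    """Build the URL of a Search Documents GET request (nothing is sent)."""
    params = {"api-version": api_version, "search": query}
    if scoring_profile:
        params["scoringProfile"] = scoring_profile
    return (
        f"https://{service}.search.windows.net/indexes/{index}/docs"
        f"?{urlencode(params)}"
    )


# Hypothetical service and index names.
url = build_search_url(
    "my-qna-service", "qna-index",
    "in which mode am i", "qna-scoring-profile",
)
print(url)
```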
Note. You can see that the results returned by Azure Search now contain different items and scores than before (the last item is now 14, whereas it was 5 in the search without a scoring profile). On a small corpus it is hard to demonstrate how much a scoring profile helps. With a large corpus, however, a scoring profile matters a great deal.
You can learn more about the search parameters and the search service API call from this article.

In this example, we used faceting so that the search engine counts the number of matching items in each category. This can be useful when each category holds only a few results: we can show the user how many items each category contains and ask which category they want to see. Once the user picks a category, we can refine the query by filtering the results to that category with an additional search:
search=in which mode am i?&scoringProfile=qna-scoring-profile&facet=category
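A sketch of how a bot might consume the facet counts and then refine the search. The response below is a hypothetical, trimmed example with made-up categories; facet counts are returned under the "@search.facets" key of the response. The follow-up query uses an OData $filter on the category field, which assumes that field was marked as filterable in the index. The function names are illustrative.

```python
from urllib.parse import urlencode

# Hypothetical, trimmed response to the facet query above.
response = {
    "@search.facets": {
        "category": [
            {"value": "modes", "count": 3},
            {"value": "billing", "count": 1},
        ]
    },
    "value": [],  # the matching documents would be listed here
}


def category_counts(resp):
    """Map each category to the number of matching items."""
    facets = resp.get("@search.facets", {}).get("category", [])
    return {facet["value"]: facet["count"] for facet in facets}


def refine_params(query, category, scoring_profile="qna-scoring-profile"):
    """Query parameters for a follow-up search restricted to one category."""
    escaped = category.replace("'", "''")  # OData doubles single quotes
    return {
        "search": query,
        "scoringProfile": scoring_profile,
        "$filter": f"category eq '{escaped}'",
    }


counts = category_counts(response)
params = refine_params("in which mode am i?", "modes")
print(counts)
print(urlencode(params))
```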

Opportunities for reuse
In this article, we described how to build a question-and-answer bot. With this approach you can, for example, build a question-and-answer system for Norwegian or Hebrew, languages that the official version of QnA Maker does not support.
Note. The approach described in this article is not limited to bots. It also applies to building a general-purpose search service for a specific subject area, and scoring profiles will help improve its results.
In addition, because Azure Search lets you attach tags to individually indexed entries, the approach above can be used for scalable multi-class text classification. This is useful for programs that need to categorize the intents users express in their messages. Of course, separate products already exist for this purpose, for example LUIS or Azure ML, but with more than 10–15 categories classification can become a difficult task for them. The described approach could be used, for example, by doctors to classify treatment protocols in a system where hundreds of protocols have already been entered. Azure Search helps classify a text by comparing it with a large corpus of similar materials.
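One possible way to turn search hits into a class label is a score-weighted vote over the top results. This is an assumption about how the idea could be implemented, not a method taken from the article. The hits below are made-up; each hit is assumed to carry its tag in a category field and its relevance in "@search.score", the score field Azure Search returns with every result.

```python
from collections import defaultdict


def classify(hits, top_k=5):
    """Pick the category with the largest total score among the top_k hits."""
    totals = defaultdict(float)
    for hit in hits[:top_k]:
        totals[hit["category"]] += hit["@search.score"]
    if not totals:
        return None  # no hits, so no prediction
    return max(totals, key=totals.get)


# Made-up search hits, ordered by score as a search response would be.
hits = [
    {"category": "protocol-a", "@search.score": 2.1},
    {"category": "protocol-b", "@search.score": 1.9},
    {"category": "protocol-b", "@search.score": 1.4},
]
print(classify(hits))
```

Here the single best hit belongs to one category, but two slightly weaker hits agree on another, so the vote goes to the latter. Whether voting beats simply taking the top hit depends on the corpus and is worth testing.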
As a reminder, we recently talked about how to teach LUIS to understand other languages.