
This article is based on the earlier article "Speech synthesis and recognition from Google for Asterisk", with no major changes. Speech recognition here uses the Yandex SpeechKit HTTP API.
The dialplan is unchanged (the example is for extensions.ael; in my opinion, AEL is more convenient than extensions.conf):
s => {
    Answer();
    Wait(1);
    Record(/tmp/${UNIQUEID}.wav,3,20);
    AGI(yandex_voice.php,/tmp/${UNIQUEID});
    NoOp(${TEXT});
    Hangup();
};
The example is very primitive: we answer the call, wait 1 second, record the speech, recognize what was said, and print the recognized text to the Asterisk console. Still, the principle of operation is clear.
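The dialplan reads ${TEXT} after the AGI call, so the script has to hand the recognized text back to the channel. The sketch below shows one way this can be done over the AGI protocol, using the standard SET VARIABLE command written to standard output. It is written in Python for illustration only and is not the author's PHP script; the helper names agi_set_variable and send_agi_command are hypothetical.

```python
import sys


def agi_set_variable(name, value):
    """Format an AGI SET VARIABLE command for a channel variable."""
    # An AGI command must fit on one line and the value is sent in quotes,
    # so escape backslashes and double quotes and flatten line breaks.
    safe = value.replace("\\", "\\\\").replace('"', '\\"')
    safe = " ".join(safe.splitlines())
    return 'SET VARIABLE {} "{}"'.format(name, safe)


def send_agi_command(command, out=sys.stdout, inp=sys.stdin):
    """Write one AGI command and return the reply line from Asterisk."""
    out.write(command + "\n")
    out.flush()  # Asterisk only sees the command once the buffer is flushed
    return inp.readline().strip()
```

In a real AGI script, send_agi_command(agi_set_variable("TEXT", recognized_text)) would be called after recognition, and Asterisk normally answers with a line starting with 200.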
Now as for the script itself.
First, a little about the variables used:
$key = 'my_secret_key' is your API key; you can get it by writing to speechkit@yandex-team.ru;
$topic = 'maps' is the recognition topic; the following options are possible:
• freeform - arbitrary text, notes, etc. Possible use: convert a voicemail message to text and send it by email or SMS;
• general - web search queries; I cannot think of a use for it in this context;
• maps - addresses and geographic points (names of bars, gas stations, hotels, etc.);
• music - names of songs, bands, etc.
$lang = 'ru-RU' is the recognition language; currently Russian ('ru-RU') and Turkish ('tr-TR') are supported, and Turkish only for the 'general' and 'maps' topics;
$uuid = '12345678123456781234567812345678' is a 32-digit string that must be unique for each request.
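As a rough illustration of how these four variables fit together, the Python sketch below generates a fresh 32-digit identifier for each request and assembles the query string. The endpoint URL and the query parameter names are assumptions and should be checked against the API document sent with the key; the function names are hypothetical, and this is not the author's PHP code.

```python
import secrets
import string
from urllib.parse import urlencode

# Assumed endpoint; confirm it against the API document sent with your key.
ASR_URL = "https://asr.yandex.net/asr_xml"

TOPICS = {"freeform", "general", "maps", "music"}
LANGS = {"ru-RU", "tr-TR"}
# According to the article, Turkish works only with these topics.
TURKISH_TOPICS = {"general", "maps"}


def new_request_uuid():
    """Return a random 32-digit string, so each request gets its own value."""
    return "".join(secrets.choice(string.digits) for _ in range(32))


def build_request_url(key, topic="maps", lang="ru-RU", uuid=None):
    """Validate the parameters and build the recognition request URL."""
    if topic not in TOPICS:
        raise ValueError("unknown topic: " + topic)
    if lang not in LANGS:
        raise ValueError("unsupported language: " + lang)
    if lang == "tr-TR" and topic not in TURKISH_TOPICS:
        raise ValueError("Turkish is supported only for general and maps")
    params = {
        "uuid": uuid or new_request_uuid(),
        "key": key,
        "topic": topic,
        "lang": lang,
    }
    return ASR_URL + "?" + urlencode(params)
```

Checking the topic and language combination before sending avoids spending a request on a combination the service is known to reject.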
The API is described in more detail in the file Yandex_SpeechKit_HTTP_API_May[5].pdf, which is sent to you along with the key. I have never read a shorter API manual, but that is for the best.
In my Asterisk configuration, the script file is located in the folder /usr/share/asterisk/agi-bin/
And here is the code of yandex_voice.php itself:
Yes, the code is not perfect; it can and should be improved. One option is to make it more universal by passing most of the variables as arguments, or to use it as a function in another AGI or ARI script. At the moment I use it to recognize the city in which the caller is located.
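The idea of passing the variables as arguments could look roughly like the Python sketch below. Asterisk hands the extra AGI() parameters to the script as command-line arguments, so the topic and language can be set from the dialplan, for example AGI(yandex_voice.php,/tmp/${UNIQUEID},maps,ru-RU). The argument order, the defaults and the function name parse_agi_args are assumptions made for illustration, not part of the original script.

```python
import argparse


def parse_agi_args(argv):
    """Parse the arguments that the dialplan passes to the AGI script."""
    parser = argparse.ArgumentParser(description="Speech recognition AGI")
    # First argument: path to the recording without the .wav extension.
    parser.add_argument("recording")
    # Optional arguments fall back to the values used in the article.
    parser.add_argument("topic", nargs="?", default="maps",
                        choices=["freeform", "general", "maps", "music"])
    parser.add_argument("lang", nargs="?", default="ru-RU",
                        choices=["ru-RU", "tr-TR"])
    return parser.parse_args(argv)
```

A script would call parse_agi_args(sys.argv[1:]) at startup, so the existing dialplan line with a single argument keeps working with the default topic and language.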