Voicemail and a leap in service quality thanks to Google (FreeSWITCH edition)

On modern telephony platforms, voicemail has become so familiar and so much in demand that developers of software PBXs write voicemail modules, manufacturers of hardware PBXs build dedicated voicemail boards, and telecom operators offer it as a service. Everything, as they say, is simple and clear: if you cannot reach the subscriber, the system forwards you to voicemail and offers to leave a message. After that there are several options. A notification is sent to the subscriber (most often by e-mail) saying that a voice message has been left and that they should log in to the system and listen to it. In a more advanced version, the recording arrives by e-mail right away as an attached file, which saves the time otherwise spent visiting the system interface: you listen to the file straight from the e-mail. But the service can be made even more convenient.

To improve the quality of the service, in my opinion, you can send the voicemail as TEXT, that is, use speech-to-text. A little background: on the FreeSWITCH forum, someone posted Asterisk scripts that use the Google Speech API for speech recognition, with a request to adapt them for FreeSWITCH. I took the bash script as my basis. Unfortunately, I do not know who wrote it, so first I will give the script without any changes:

#!/bin/sh
echo "1 SoX Sound Exchange - Convert WAV to FLAC with 16000"
sox $1 message.flac pad .1 0 rate 16k
echo "2 Submit to Google Voice Recognition"
wget -q -U "Mozilla/5.0" --post-file message.flac --header="Content-Type: audio/x-flac; rate=16000" -O - "http://www.google.com/speech-api/v1/recognize?lang=en-us&client=chromium" > message.ret
echo "3 SED Extract recognized text"
cat message.ret | sed 's/.*utterance":"//' | sed 's/","confidence.*//' > message.txt
echo "4 Remove Temporary Files"
rm message.flac
# rm message.ret
echo "5 Show Text "
cat message.txt

I altered this script a little to fit my needs and put it in the /usr/local/freeswitch/scripts/ folder. In the end it looks like this:

#!/bin/sh
cd /usr/local/freeswitch/scripts/
sox tmp.wav message.flac pad .1 0 rate 16k > /dev/null 2>&1
wget -q -U "Mozilla/5.0" --post-file message.flac --header="Content-Type: audio/x-flac; rate=16000" -O - "http://www.google.com/speech-api/v1/recognize?lang=ru-RU&client=chromium" > message.ret
cat message.ret | sed 's/.*utterance":"//' | sed 's/","confidence.*//' >> messages.log
cat message.ret | sed 's/.*utterance":"//' | sed 's/","confidence.*//' | iconv -f UTF8 -t KOI8-U | mutt -x -s "new voice mail" -- moya_pochta@domen.ru
rm message.flac
rm tmp.wav
rm message.ret

In a nutshell: the script takes a recording named tmp.wav, converts it to FLAC and sends it to Google. We also tell Google that we want Russian speech recognized (although, if you speak English with normal pronunciation, the text comes back in English). In response we receive a message like {"status":0,"id":"4ee1ad1a44f3cfbb58341972dd008e9c-1","hypotheses":[{"utterance":"call back later","confidence":0.43928865}]}

With sed we pull out the message text, append it to the log and send it by e-mail.
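The sed step can be tried on its own against a saved response before wiring it into the mail pipeline. The following is only a minimal sketch: the function name extract_utterance is hypothetical (it is not part of the original script), and it assumes a single-line JSON response with exactly one hypothesis. Unlike the original patterns, it tolerates optional spaces after the colon and comma.

```shell
#!/bin/sh
# Hypothetical helper (not in the original script): print the recognized
# text from a saved response file. Assumes one-line JSON with a single
# hypothesis; optional spaces after ':' and ',' are tolerated.
extract_utterance() {
    sed -e 's/.*"utterance": *"//' -e 's/", *"confidence.*//' "$1"
}

# Try it on a sample response written to a temporary file.
sample=$(mktemp)
printf '%s\n' '{"status":0,"id":"4ee1ad1a44f3cfbb58341972dd008e9c-1","hypotheses":[{"utterance":"call back later","confidence":0.43928865}]}' > "$sample"

text=$(extract_utterance "$sample")
echo "$text"

rm -f "$sample"
```

If the response contains no "utterance" field at all, neither substitution matches and the whole line is printed unchanged, so a real script would probably want to check for that case before mailing the result.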

To improve the script further, you can take the confidence value into account (Google's estimate of how reliable the recognition is) and include it in the e-mail as well. You can also attach the recording itself to the e-mail, either only when confidence is low or in every case. This can be done with mutt's "-a" option, but I advise converting the file to mp3 first.
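One possible way to act on the confidence value is sketched here. The helper names extract_confidence and is_low_confidence and the 0.5 threshold are assumptions made for illustration, not part of the original script. The sketch only decides whether the recording should be attached; the mp3 conversion and the mutt call with "-a" are left out.

```shell
#!/bin/sh
# Hypothetical helper: print the numeric confidence from a saved response.
# Assumes one-line JSON with a single hypothesis.
extract_confidence() {
    sed -n 's/.*"confidence": *\([0-9.]*\).*/\1/p' "$1"
}

# Hypothetical helper: exit status 0 when confidence ($1) is below the
# threshold ($2). awk is used because sh has no floating-point comparison.
is_low_confidence() {
    awk -v c="$1" -v t="$2" 'BEGIN { exit !(c + 0 < t + 0) }'
}

# Try it on a sample response written to a temporary file.
sample=$(mktemp)
printf '%s\n' '{"status":0,"id":"4ee1ad1a44f3cfbb58341972dd008e9c-1","hypotheses":[{"utterance":"call back later","confidence":0.43928865}]}' > "$sample"

confidence=$(extract_confidence "$sample")
threshold=0.5   # assumed cut-off; tune it to your own recordings

if is_low_confidence "$confidence" "$threshold"; then
    decision="attach recording"
else
    decision="text only"
fi
echo "confidence=$confidence -> $decision"

rm -f "$sample"
```

An empty confidence value (for example, when recognition failed) compares as 0 and therefore also falls into the "attach recording" branch, which seems like the safer default.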

To record the caller's message into tmp.wav and then run the script, I added the following lines of XML to the relevant context of the FreeSWITCH dialplan:

<extension name="s2t">
  <condition field="destination_number" expression="^11111$">
    <action application="export" data="api_hangup_hook=system /usr/local/freeswitch/scripts/s2t.sh"/>
    <action application="answer"/>
    <action application="playback" data="/usr/local/freeswitch/sounds/ru/RU/elena/voicemail/8000/vm-hello.wav"/>
    <action application="record" data="/usr/local/freeswitch/scripts/tmp.wav"/>
    <action application="hangup"/>
  </condition>
</extension>

A few comments on this piece of the FreeSWITCH dialplan. With api_hangup_hook we tell FreeSWITCH which script to execute after the call ends. Next, we answer the call and play the greeting, after which we start recording to the file /usr/local/freeswitch/scripts/tmp.wav.
After FreeSWITCH has recorded the message to tmp.wav and the call has ended, the /usr/local/freeswitch/scripts/s2t.sh script described above is called.
In this simple way you can, first, make your life easier and, second, significantly reduce the time spent processing voice messages. Judge for yourself how quickly (without even opening the e-mail) you can see what a message was about in the mail.ru interface.

Source: https://habr.com/ru/post/149750/
