Google Translate + Asterisk IVR

I spent a long time deciding which blog to post this in and concluded that this one fits best, if only because the main idea of the post is "sh can do anything".

In this post we set ourselves an interesting task: implementing an IVR for Asterisk (written as * below) using the speech synthesizer from Google Translate.

I had not actually planned to do this, but I got curious.

The first thing I did was find out how Google speaks. It speaks well, but only 100 characters at a time. That is quite enough to build an IVR, though. Pleased with the first result, I went looking for a way to extract that voice. A short search led me to the URL translate.google.com/translate_tts?q=текст&tl=ru
I pasted it into the browser and got an mp3 with the spoken text. Even more inspired, I fed the same line to wget.

[utfadm@SIP:/var/lib/asterisk]> wget "http://translate.google.com/translate_tts?q=текст&tl=ru"
--2011-12-01 13:24:53-- http://translate.google.com/translate_tts?q=%D1%82%D0%B5%D0%BA%D1%81%D1%82&tl=ru
Resolving translate.google.com (translate.google.com)... 173.194.32.225, 173.194.32.234, 173.194.32.235, ...
Connecting to translate.google.com (translate.google.com)|173.194.32.225|:80... connected.
HTTP request sent, awaiting response... 403 Forbidden
2011-12-01 13:24:53 ERROR 403: Forbidden.

--2011-12-01 13:24:53-- http://translate.google.com/translate_tts?q=%D1%82%D0%B5%D0%BA%D1%81%D1%82&tl=ru
Reusing existing connection to translate.google.com:80.
HTTP request sent, awaiting response... 403 Forbidden
2011-12-01 13:24:53 ERROR 403: Forbidden.

Here I hit the first snag. After a little thought, though, it occurred to me that the people at the corporation are not stupid and will not simply hand mp3s to wget. But they do hand them to a browser...

So let's disguise ourselves as a browser.

[utfadm@SIP:/tmp]> wget -U "Lynx 1.2.3.4" "http://translate.google.com/translate_tts?q=текст&tl=ru"
--2011-12-01 13:27:22-- http://translate.google.com/translate_tts?q=%D1%82%D0%B5%D0%BA%D1%81%D1%82&tl=ru
Resolving translate.google.com (translate.google.com)... 74.125.232.1, 74.125.232.10, 74.125.232.11, ...
Connecting to translate.google.com (translate.google.com)|74.125.232.1|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 0 [audio/mpeg]
Saving to: «translate_tts?q=\321%82\321%81\321%82&tl=ru»

[ <=> ] 0 --.-K/s in 0s

2011-12-01 13:27:22 (0,00 B/s) - «translate_tts?q=\321%82\321%81\321%82&tl=ru» saved [0/0]

Hmm... the file turned out to be zero bytes long. What if we try Latin text instead?

[utfadm@SIP:/tmp]> wget -U "Lynx 1.2.3.4" "http://translate.google.com/translate_tts?q=text&tl=ru"
--2011-12-01 13:29:59-- http://translate.google.com/translate_tts?q=text&tl=ru
Resolving translate.google.com (translate.google.com)... 74.125.232.2, 74.125.232.11, 74.125.232.12, ...
Connecting to translate.google.com (translate.google.com)|74.125.232.2|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 4421 (4,3K) [audio/mpeg]
Saving to: «translate_tts?q=text&tl=ru»

100%[===================================================================================================================>] 4 421 --.-K/s in 0s

2011-12-01 13:29:59 (95,5 MB/s) - «translate_tts?q=text&tl=ru» saved [4421/4421]

And this works...

We think, and think, and think...
Maybe Google does not accept Russian characters when they come from Lynx?
Then let's replace the User-Agent with one that definitely works with Russian letters.

[utfadm@SIP:/tmp]> /usr/local/bin/wget -U "Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.1.5) Gecko/20091102 Firefox/3.5.5" "http://translate.google.com/translate_tts?q=текст&tl=ru"
--2011-12-01 13:32:27-- http://translate.google.com/translate_tts?q=%D1%82%D0%B5%D0%BA%D1%81%D1%82&tl=ru
Resolving translate.google.com (translate.google.com)... 173.194.32.225, 173.194.32.234, 173.194.32.235, ...
Connecting to translate.google.com (translate.google.com)|173.194.32.225|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 4421 (4,3K) [audio/mpeg]
Saving to: «translate_tts?q=\321%82\321%81\321%82&tl=ru.1»

100%[===================================================================================================================>] 4 421 --.-K/s in 0s

2011-12-01 13:32:27 (103 MB/s) - «translate_tts?q=\321%82\321%81\321%82&tl=ru.1» saved [4421/4421]

Much better... only the file name is rather clumsy:
translate_tts?q=?%82?%81?%82&tl=ru.1
To fix that, use the -O option and give the file whatever name you need.
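
The download step so far can be wrapped in a small function. This is only a sketch: the function name fetch_tts and the WGET override variable are assumptions made for illustration, while the User-Agent string and the URL are the ones used above.

```sh
#!/bin/sh
# Sketch only: fetch_tts and WGET are illustrative names, not part of the original scripts.
WGET=${WGET:-wget}
UA="Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.1.5) Gecko/20091102 Firefox/3.5.5"

# fetch_tts TEXT LANG OUTFILE
# Downloads the spoken TEXT to OUTFILE, posing as a browser (-U)
# and choosing the output file name ourselves (-O).
fetch_tts() {
    "$WGET" -U "$UA" -O "$3" "http://translate.google.com/translate_tts?q=$1&tl=$2"
}
```

A call would then look like: fetch_tts "текст" ru /tmp/text.mp3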

So, now that we have learned how to fetch voice files, we need to teach * to do the same.

To do this, we write a small script:

#!/bin/sh
# $1 = text to speak, $2 = cache file name (without extension)
# Skip everything if the phrase is already cached.
if [ ! -f "/var/lib/asterisk/festivalcache/$2.gsm" ]; then
    NAME="/var/lib/asterisk/festivalcache/$2"
    # Fetch the mp3 from Google, posing as Firefox.
    /usr/local/bin/wget -U "Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.1.5) Gecko/20091102 Firefox/3.5.5" "http://translate.google.com/translate_tts?q=$1&tl=ru" -O "$NAME.mp3"
    # mp3 -> wav -> 8 kHz mono gsm
    /usr/local/bin/mpg123 -w "$NAME.wav" "$NAME.mp3"
    echo "Converting from wav to gsm"
    /usr/local/bin/sox -t wav "$NAME.wav" -r 8000 -c1 -t gsm "$NAME.gsm"
    # Remove the intermediate files.
    rm "$NAME.mp3"
    rm "$NAME.wav"
fi

Let's go over it.
The first line needs no explanation.
Next comes a check for the cached file: if it already exists, the script has nothing more to do.
If it does not, we build the full path to the file from the second launch parameter.
We send Google a request with the text from the first launch parameter and save the result as an mp3.
Then we convert it to wav, and then to gsm.
Finally, we delete the intermediate files.
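
In the logs above wget escaped the Cyrillic text by itself (q=%D1%82...). If the text ever has to be escaped before it reaches wget, for example because it contains & or other reserved characters, a percent-encoder can be sketched in plain sh. The function name urlencode is an assumption, and it deliberately over-encodes: every byte becomes %XX, which is valid but longer than strictly necessary.

```sh
#!/bin/sh
# Sketch only: urlencode is an illustrative helper, not part of the original script.
# Percent-encodes every byte of $1 (valid, though longer than strictly needed).
urlencode() {
    printf '%s' "$1" |
        od -An -v -tx1 |        # dump the bytes as hex pairs
        tr -d ' \n' |           # join them into one string
        tr 'a-f' 'A-F' |        # upper-case the hex digits
        sed 's/\(..\)/%\1/g'    # prefix each pair with %
}
```

In the script the request would then use q=$(urlencode "$1") instead of q=$1.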

The output is a gsm file, which * plays perfectly well.

Then we write something like this in the dialplan:

exten => 227,1,Set(home=/var/lib/asterisk/festivalcache)
exten => 227,2,Wait(1)
exten => 227,n,System(/bin/sh /var/lib/asterisk/tts.sh ". " "${EXTEN}.${PRIORITY}")
exten => 227,n,Playback(${home}/${EXTEN}.$[${PRIORITY} - 1])
exten => 227,n,Set(tic=${STRFTIME(${EPOCH},,%Y%m%d)})
exten => 227,n,System(/bin/sh /var/lib/asterisk/dt.sh 1 "${EXTEN}.${tic}")
exten => 227,n,Playback(${home}/date/${EXTEN}.${tic})
exten => 227,n,System(/bin/sh /var/lib/asterisk/tts.sh ". " "${EXTEN}.${PRIORITY}")
exten => 227,n,Playback(${home}/${EXTEN}.$[${PRIORITY} - 1])
exten => 227,n,System(/bin/sh /var/lib/asterisk/tts.sh ". ." "${EXTEN}.${PRIORITY}")
exten => 227,n,Playback(${home}/${EXTEN}.$[${PRIORITY} - 1])
exten => 227,n,Set(tic=${STRFTIME(${EPOCH},,%H%M%S)})
exten => 227,n,System(/bin/sh /var/lib/asterisk/dt.sh 2 "${tic}")
exten => 227,n,Playback(${home}/time/${tic})
exten => 227,n,Hangup()

Thus the dialplan will generate and play the files 227.3.gsm, 227.8.gsm, 227.10.gsm, and two more that I will come back to. The listed files are generated only once because, as we remember, the script does nothing if the file already exists. I doubt that even 50-60 phrases generated once will be a burden for Google, and they are enough for a complete menu.

The two files I promised to come back to hold the current date and time. They are generated and played by the lines
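
Since each phrase is fetched only once, the whole menu could also be generated ahead of time rather than on the first call. A minimal sketch, assuming a phrase list with one "name|text" entry per line; the list format, the function name warm_cache, and the TTS variable are all assumptions for illustration, while tts.sh is the script shown above.

```sh
#!/bin/sh
# Sketch only: warm_cache, TTS and the "name|text" list format are assumptions.
TTS=${TTS:-/var/lib/asterisk/tts.sh}

# warm_cache LISTFILE
# Calls tts.sh once per line of LISTFILE. tts.sh itself skips phrases
# that are already cached, so the function is safe to re-run.
warm_cache() {
    while IFS='|' read -r name text; do
        case $name in
            ''|'#'*) continue ;;   # skip blank lines and comments
        esac
        /bin/sh "$TTS" "$text" "$name" </dev/null
    done < "$1"
}
```

A list line such as "227.3|some greeting" would produce 227.3.gsm in the cache folder.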

exten => 227,n,Set(tic=${STRFTIME(${EPOCH},,%Y%m%d)})
exten => 227,n,System(/bin/sh /var/lib/asterisk/dt.sh 1 "${EXTEN}.${tic}")
exten => 227,n,Playback(${home}/date/${EXTEN}.${tic})

and

exten => 227,n,Set(tic=${STRFTIME(${EPOCH},,%H%M%S)})
exten => 227,n,System(/bin/sh /var/lib/asterisk/dt.sh 2 "${tic}")
exten => 227,n,Playback(${home}/time/${tic})

respectively.
As the calls show, these lines use a different script. It is a wrapper around the script we already looked at, the one that asks Google for a voice. It looks like this:

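#!/bin/sh
# $1 = 1 for the date, 2 for the time; $2 = cache file name
if [ "$1" -eq 1 ]; then
    q=`date +" %d.%m.%Y "`
    n="date/$2"
fi
if [ "$1" -eq 2 ]; then
    q=`date +" %H:%M:%S"`
    n="time/$2"
fi
echo "$q"
# Hand the text and the relative file name to the main script.
/var/lib/asterisk/tts.sh "$q" "$n"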

It is fairly self-explanatory. The first parameter selects the date or the time, the second is the file name. Dates are stored in the date folder, times in the time folder. The file name comes from *: tic=${STRFTIME(${EPOCH},,%Y%m%d)} is the year, month, and day, and tic=${STRFTIME(${EPOCH},,%H%M%S)} is the hour, minute, and second. So if you do not clean the time folder for a long time, it can accumulate every possible combination.

That is all there is to the wrapper.
To generate short phrases in any format, you just need to write a wrapper around the first script. Simple and tasteful.
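
Because the time folder gains a file for every distinct second at which someone calls, it makes sense to prune it now and then, for example from cron. A minimal sketch, assuming files older than a day are no longer needed; the function name prune_cache and the one-day threshold are assumptions.

```sh
#!/bin/sh
# Sketch only: prune_cache and the one-day threshold are assumptions.
# prune_cache DIR
# Removes .gsm files under DIR that were last modified at least 24 hours ago.
prune_cache() {
    find "$1" -type f -name '*.gsm' -mtime +0 -exec rm -f {} +
}
```

It would be called as: prune_cache /var/lib/asterisk/festivalcache/time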

I also wanted to read whole files aloud, but a file can contain more than 100 characters, so it has to be split into several requests. The files I need to read have many lines, but every line is shorter than 100 characters. So I wrote the following script:

[root@SIP:/var/lib/asterisk]# cat ttsb.sh
#!/bin/sh
# $1 = name of the text file in $Source, $2 = cache file name (without extension)
Source=/var/lib/asterisk/source
i=0
splitted=''
NAME="/var/lib/asterisk/festivalcache/$2"
# Do nothing if the result is already cached.
if [ ! -f "$NAME.gsm" ]; then
    # Fetch one mp3 per line (a for loop over `cat` would split on every word).
    while IFS= read -r str || [ -n "$str" ]
    do
        i=$((i + 1))
        WORKNAME="$NAME.work.$i.mp3"
        splitted="$splitted $WORKNAME"
        #echo $WORKNAME
        #echo $str
        #echo SP: $splitted
        /usr/local/bin/wget -U "Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.1.5) Gecko/20091102 Firefox/3.5.5" "http://translate.google.com/translate_tts?q=$str&tl=ru" -O "$WORKNAME"
    done < "$Source/$1"
    # Decode all pieces into a single wav, then convert it to gsm.
    # $splitted stays unquoted on purpose: it is a space-separated file list.
    /usr/local/bin/mpg123 -w "$NAME.wav" $splitted
    echo "Converting from wav to gsm"
    /usr/local/bin/sox -t wav "$NAME.wav" -r 8000 -c1 -t gsm "$NAME.gsm"
    rm $splitted
    rm "$NAME.wav"
fi

Again, it is all fairly self-explanatory. We take a file and feed each line to Google, then glue all the mp3 files into one wav, convert it to gsm, and delete the intermediate files. A short pause is audible between the pieces, so it works best when each line is a logical unit that naturally calls for a pause after it.

I think this is a good place to stop: the principle is laid out, there is a main script for which you can write wrappers that feed it the input it needs, and there is an example of a wrapper and an example of reading files.

I plan to set up Sphinx (I have seen a project with Russian grammars that claims 96% accuracy), feed the recognition result to Google, translate it into another language, and have Google speak it. What for, I honestly do not know yet.
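
The file-reading script relies on every line already being shorter than 100 characters. For text where that is not the case, the lines could be pre-split at word boundaries before they are fed to Google. A minimal sketch using fold; the function name split_text is an assumption. Depending on the system and locale, fold may count bytes rather than characters, in which case Cyrillic text in UTF-8 simply ends up in shorter pieces than the limit allows.

```sh
#!/bin/sh
# Sketch only: split_text is an illustrative helper, not part of ttsb.sh.
# split_text FILE [WIDTH]
# Prints FILE with every line folded to at most WIDTH columns (default 100),
# breaking after spaces where possible.
split_text() {
    fold -s -w "${2:-100}" "$1"
}
```

The output can be saved to a new file in the source folder and read with the script above as usual.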

Source: https://habr.com/ru/post/133782/
