ROS Speech Recognition with Google Speech API

I have already written about using pocketsphinx for speech recognition in ROS. This article covers gspeech, a ROS package that uses the Google Speech API: wiki.ros.org/gspeech.

Getting the Google API Key

So, let's begin. First you need a Google API key. This requires a Google account and a subscription to the chromium-dev@chromium.org mailing list.
Now you can get the key. Go to the Google developer console: cloud.google.com/console and create a project. Then activate the Speech API in the APIs section, under the APIs & auth item in the left menu. Be careful: the Speech API may be missing from the list, as happened to me. If you do not see it, check that you are subscribed to chromium-dev and that you are logged in with the Google account whose email address you used for the subscription.
The key itself is created in the Credentials section, under the same APIs & auth item: click the Create new Key button in the Public API access section.

Gspeech installation

All that remains is the easy part: installing the gspeech package. Clone gspeech from its GitHub page: github.com/kusha/gspeech. gspeech requires sox to be installed:

sudo apt-get install sox

You also need to insert your Google API key into the gspeech.py script in the line:

 api_key = "" # PASTE HERE YOUR GOOGLE API KEY
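
If you would rather not store the key in the script itself, one option is to read it from an environment variable. The following is only a sketch of that idea and not part of gspeech: the variable name GOOGLE_API_KEY and the helper load_api_key are hypothetical names chosen for illustration.

```python
import os


def load_api_key(env_var="GOOGLE_API_KEY", default=""):
    """Return the key stored in `env_var`, or `default` if it is unset.

    Both the variable name and this helper are assumptions made for
    illustration; gspeech itself expects the key in its api_key line.
    """
    # strip() guards against stray whitespace from copy-pasting the key.
    return os.environ.get(env_var, default).strip()


# This value could then replace the hard-coded string in gspeech.py.
api_key = load_api_key()
```

With this in place, the key would have to be exported as GOOGLE_API_KEY in the shell before the node is started.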

Running gspeech

Everything is ready and you can start the ROS gspeech node:

 rosrun gspeech gspeech.py

Gspeech recognition

During recognition, gspeech publishes the recognized phrases to the /speech topic as String messages and the recognition "confidence" to the /confidence topic as Int8 messages.
Recognizing a phrase may take some time, because gspeech sends requests to Google's servers. Nevertheless, its accuracy is fairly high: gspeech recognizes phrases much better than the pocketsphinx package. In my tests, gspeech recognized phrases with a "confidence" of 70-80, and in some cases as high as 94.
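
To use the results in your own node, you can subscribe to both topics and ignore phrases recognized with low confidence. Below is a minimal sketch of such a listener. It assumes that gspeech publishes one confidence value for every phrase, and it pairs them regardless of which arrives first. The class PhraseFilter, the function run_node, the node name gspeech_listener and the threshold of 70 are illustrative choices, not part of gspeech.

```python
import threading


class PhraseFilter(object):
    """Pairs each phrase with a confidence value and keeps confident ones."""

    def __init__(self, min_confidence=70):
        self.min_confidence = min_confidence
        self.accepted = []        # list of (phrase, confidence) tuples
        self._phrase = None       # phrase still waiting for its confidence
        self._confidence = None   # confidence still waiting for its phrase
        # rospy may invoke the two callbacks from different threads.
        self._lock = threading.Lock()

    def on_speech(self, msg):
        # std_msgs/String carries the text in its `data` field.
        with self._lock:
            self._phrase = msg.data
            self._flush()

    def on_confidence(self, msg):
        # std_msgs/Int8 carries the number in its `data` field.
        with self._lock:
            self._confidence = msg.data
            self._flush()

    def _flush(self):
        # Act only once both halves of a result have arrived.
        if self._phrase is None or self._confidence is None:
            return
        if self._confidence >= self.min_confidence:
            self.accepted.append((self._phrase, self._confidence))
        self._phrase = None
        self._confidence = None


def run_node():
    # ROS imports live here so the filter above can be used without ROS.
    import rospy
    from std_msgs.msg import Int8, String

    phrase_filter = PhraseFilter(min_confidence=70)
    rospy.init_node("gspeech_listener")  # hypothetical node name
    rospy.Subscriber("/speech", String, phrase_filter.on_speech)
    rospy.Subscriber("/confidence", Int8, phrase_filter.on_confidence)
    rospy.spin()
```

Calling run_node() from a script in your own package would start the listener while gspeech.py is running. The pairing is only a heuristic: if a message is lost, a phrase may be matched with the confidence of the next one.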

I wish you good luck in speech recognition with the Google Speech API.

Source: https://habr.com/ru/post/247539/
