Voice Control with Arduino, Processing, and the Google Speech API

Idea:

At some point I got the idea of controlling an Arduino by voice. The Arduino alone is not enough, though: for a smart home system you also need to communicate with the computer and its systems.

Search for a solution:

BitVoicer

I came across all sorts of articles that use BitVoicer together with Arduino. The problem is that BitVoicer works only on Windows, which rules out running the system on simple devices like a Raspberry Pi running Unix.

Arduino Voice Recognition

The Arduino can also be controlled by voice using a voice recognition module, but so far I don't have enough money to buy one, and the module comes with a number of inconveniences: a limited number of commands, tedious training, and the need to reflash the module for every new command, which is a drawback once the system has been debugged and installed.

Solution

I began looking for a cross-platform solution that would let the system work on a variety of operating systems, and found one: the Speech to Text Library for Java / Processing. It is built on the Processing (Java) language and the Google Speech API, which has already been written about here. The library lets you monitor voice input in real time with enableAutoRecord(), set the volume threshold with enableAutoThreshold(), connect external microphones with getLineIn(), and set the recognition language with setLanguage(String). A complete list of features and specifics is on the developer's site: http://stt.getflourish.com . To make it work you need a Google Speech API key. How to get one is described here: www.chromium.org/developers/how-tos/api-keys . The only downside is that Google Speech officially handles only 50 requests per day, although in practice more than 500 requests get processed.

To make the text easier to follow, I am attaching all the sources, which already include voice commands, the connection to the Arduino board, a sketch for the Arduino board, voice confirmation of phrases, and everything else that already exists and works: source codes . After downloading, place the GoogleTTS folder in Processing's libraries folder. The Arduino sketch is in the GoogleTTS / ArduinoSerial folder. Everything was written in Processing 3.0a4, which is available as a pre-release on the official website.

Implementation (“Listen to my command!”):

Recognition is sorted out. Now we need to catch the commands we care about and act on them. The section responsible for this:

    void commands() {
      if (result.equals("arduino")) {
        // a match was found:
        // run the action for this command here
      } else if (result.equals(" ")) {
        // run the action for the second command here
      }
    }

Voice response

Now we need a tool that answers us in a human voice when a match is found. I chose Google Translate for this, or rather its module that converts text to speech. The text is sent in a request to the Google server, converted into a sound file, and sent back to us in mp3 format. The section responsible for this:

    import java.io.*;
    import java.net.*;

    // Call as googleTTS(text, language); saves the spoken phrase as <text>.mp3
    void googleTTS(String txt, String language) {
      String u = "http://translate.google.com/translate_tts?tl=";
      u = u + language + "&q=" + txt;
      u = u.replace(" ", "%20");  // only spaces are escaped here
      try {
        URL url = new URL(u);
        try {
          URLConnection connection = url.openConnection();
          // identify as a desktop browser
          connection.setRequestProperty("User-Agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.0.3705; .NET CLR 1.1.4322; .NET CLR 1.2.30703)");
          connection.connect();
          InputStream is = connection.getInputStream();
          File f = new File(sketchPath + "/" + txt + ".mp3");
          OutputStream out = new FileOutputStream(f);
          byte buf[] = new byte[1024];
          int len;
          // copy the response body into the mp3 file
          while ((len = is.read(buf)) > 0) {
            out.write(buf, 0, len);
          }
          out.close();
          is.close();
          println("File created: " + txt + ".mp3");
        } catch (IOException e) {
          e.printStackTrace();
        }
      } catch (MalformedURLException e) {
        e.printStackTrace();
      }
    }
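The function above escapes only spaces, so a phrase containing "&", "?" or non-ASCII letters would produce a broken query string. A minimal sketch of building the same URL with the standard java.net.URLEncoder is shown below. The class name TtsUrl and the sample phrase are hypothetical and not part of the original sources; whether the server accepts "+" for spaces (form-style encoding) is an assumption I have not verified against this endpoint.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper, not part of the original sources.
class TtsUrl {
    // Same endpoint as in googleTTS() above.
    static final String BASE = "http://translate.google.com/translate_tts";

    // Builds the request URL with both parameters percent-encoded as UTF-8.
    static String build(String text, String language) {
        try {
            return BASE
                + "?tl=" + URLEncoder.encode(language, "UTF-8")
                + "&q=" + URLEncoder.encode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is required of every Java runtime.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // prints "http://translate.google.com/translate_tts?tl=en&q=light+on+%26+ready"
        System.out.println(build("light on & ready", "en"));
    }
}
```

In the sketch, the result of TtsUrl.build(txt, language) could be passed to new URL(...) in place of the hand-built string.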

The section that handles the text phrases themselves:

    void voicer(String s) {  // call as voicer(text)
      println(s);  // log the phrase
      File f = new File(sketchPath + "/" + s + ".mp3");
      if (f.exists()) {
        // the file is already on disk: play it
        println("  !  !");
        player = minim.loadFile(s + ".mp3");
        player.play();
      } else {
        // the file is missing: fetch it from Google, then play it
        println("  ! !");
        googleTTS(s, "ru");
        player = minim.loadFile(s + ".mp3");
        player.play();
      }
    }
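Because the phrase itself is used as the file name, a phrase containing characters such as "?" or "/" cannot be cached on every operating system. One possible approach, sketched here under my own assumptions, is to derive a safe file name from the phrase first. PhraseCache and fileNameFor are hypothetical names that do not exist in the original sources.

```java
import java.util.Locale;

// Hypothetical helper, not part of the original sources.
class PhraseCache {
    // Maps a phrase to a file name made only of letters, digits and "_".
    // Unicode letters (for example Cyrillic) are kept as they are.
    static String fileNameFor(String phrase) {
        String base = phrase.toLowerCase(Locale.ROOT)
            .replaceAll("[^\\p{L}\\p{Nd}]+", "_")  // runs of other characters become "_"
            .replaceAll("^_+|_+$", "");            // drop leading and trailing "_"
        if (base.isEmpty()) {
            base = "phrase";  // fallback for input without any letters or digits
        }
        return base + ".mp3";
    }

    public static void main(String[] args) {
        System.out.println(fileNameFor("Light on!"));         // prints "light_on.mp3"
        System.out.println(fileNameFor("What time is it?"));  // prints "what_time_is_it.mp3"
    }
}
```

Note that two phrases differing only in punctuation map to the same file, which seems acceptable here because they would be spoken almost identically.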

An example of recognition together with voice confirmation:

    void commands() {
      if (result.equals("")) {
        voicer("");  // speak the confirmation phrase
        // run the action for this command here
      }
    }

It's alive!

Processing + Arduino

Well, it seems to work, but something is missing. Now let's make all of this "make friends" with the Arduino.
Initialize the serial connection in Processing to send data to the Arduino (for Mac and Unix users):

    String portName = Serial.list()[0];  // first port in the list
    myPort = new Serial(this, portName, 9600);
    myPort.bufferUntil('\n');

For Windows users:

    // replace the port name with your own COM port
    myPort = new Serial(this, " COM-", 9600);
    myPort.bufferUntil('\n');

And we send a command there when a voice match is found:

    void commands() {
      if (result.equals(" ")) {   // the recognized phrase matches the first command
        myPort.write("High");     // send High to Serial
        voicer(" ");              // speak the confirmation
      } else if (result.equals(" ")) {
        myPort.write("Low");      // send Low to Serial
        voicer(" ");              // speak the confirmation
        // run any other action for this command here
      }
    }
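As the number of commands grows, the if/else chain gets long, and an exact equals() comparison fails on differences in case or stray spaces. Below is a minimal sketch of a lookup table that normalizes the recognized text before matching. The class CommandTable and the phrases "light on" and "light off" are my own placeholders; only the serial messages "High" and "Low" come from the code above.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of the original sources.
class CommandTable {
    private final Map<String, String> serialByPhrase = new LinkedHashMap<>();

    // Registers a spoken phrase and the text to send over Serial for it.
    void register(String phrase, String serialMessage) {
        serialByPhrase.put(normalize(phrase), serialMessage);
    }

    // Returns the serial message for a recognized phrase, or null if unknown.
    String lookup(String recognized) {
        if (recognized == null) {
            return null;
        }
        return serialByPhrase.get(normalize(recognized));
    }

    // Trims, lowercases and collapses inner whitespace to single spaces.
    static String normalize(String s) {
        return s.trim().toLowerCase(Locale.ROOT).replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        CommandTable table = new CommandTable();
        table.register("light on", "High");   // placeholder phrases
        table.register("light off", "Low");
        System.out.println(table.lookup("  Light   ON "));  // prints "High"
        System.out.println(table.lookup("open the door"));  // prints "null"
    }
}
```

Inside commands(), a non-null result of table.lookup(result) could then be passed straight to myPort.write().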

Now let's deal with the Arduino board. We need to listen on the serial port and, when a command from the list arrives, perform the corresponding action. The sketch is very simple:

    int led = 13;  // pin the LED is connected to

    void setup() {
      Serial.begin(9600);    // open the serial port
      pinMode(led, OUTPUT);  // configure the LED pin as an output
    }

    void loop() {
      int i = 0;         // write position in the buffer
      char buffer[100];  // holds the received characters
      if (Serial.available()) {  // data has arrived
        delay(100);              // give the rest of the message time to arrive
        while (Serial.available() && i < 99) {
          buffer[i++] = Serial.read();
        }
        buffer[i++] = '\0';  // terminate the string
        String val = buffer;
        if (val == "High") {            // received High
          Serial.println("Led is On");  // report back over Serial
          digitalWrite(led, HIGH);      // turn the LED on
        }
        if (val == "Low") {             // received Low
          Serial.println("Led is Off"); // report back over Serial
          digitalWrite(led, LOW);       // turn the LED off
        }
      }
    }

That's it. Let's test it.
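The sketch relies on delay(100) to let the whole word arrive before comparing it, so two commands sent close together would be glued into one string. One alternative, which is my assumption and not what the original sources do, is to end every command with '\n' (for example myPort.write("High\n")) and have the receiver split the stream on that character. The splitting logic is sketched in Java below to stay in the same language as the Processing side; LineFramer is a hypothetical name, and the same loop would have to be ported to the Arduino sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the original sources.
class LineFramer {
    // Characters received so far that do not yet form a complete line.
    private final StringBuilder pending = new StringBuilder();

    // Feeds a chunk of received text and returns the commands completed by it.
    List<String> feed(String chunk) {
        List<String> commands = new ArrayList<>();
        for (int i = 0; i < chunk.length(); i++) {
            char c = chunk.charAt(i);
            if (c == '\n') {
                // End of a command: trim() also drops a trailing '\r'.
                String line = pending.toString().trim();
                pending.setLength(0);
                if (!line.isEmpty()) {
                    commands.add(line);
                }
            } else {
                pending.append(c);
            }
        }
        return commands;
    }

    public static void main(String[] args) {
        LineFramer framer = new LineFramer();
        System.out.println(framer.feed("Hi"));           // prints "[]"
        System.out.println(framer.feed("gh\nLow\nHi"));  // prints "[High, Low]"
        System.out.println(framer.feed("gh\r\n"));       // prints "[High]"
    }
}
```

With this kind of framing the receiver no longer depends on timing: a command is acted on only once its terminating newline has been seen.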

Problems and plans:

Since I had never programmed before this project, I do not fully understand some things that come up during debugging. I would be grateful if someone could tell me how to solve the problems in the list below:

- The main problem: the spoken phrase is not played in full. The last letters disappear, even though the sound file arrives from the Google server intact. As I understand it, the problem is in the audio player, but where exactly is not yet clear.
- As I already wrote, the Google Speech API has a limit of 50 requests per day, although in practice more get through. Either way, this is not enough. I plan to add local recognition of the main command and, only after it is recognized, send the rest of the text to Google for processing. I am still looking for a solution.
- I think it would not hurt to send commands to an Arduino with an Ethernet shield, since some systems may be located a fair distance from the main computer, where a serial connection will not work. I will get to this a little later, because I do not yet have a router to connect the Arduino Ethernet shield to.

That's all for now! Please do not judge the code too harshly. I have only just started exploring this territory, and I will be extremely grateful if you point out what I am doing wrong and show me how to do it right. I will also be happy if other interested people join this project. I am always open to communication!

Source: https://habr.com/ru/post/236673/
