Jarvis is back in business

Surely everyone dreams of having their own voice assistant. Below the cut is yet another implementation of "Jarvis" from the famous movie.

I had long been thinking about my own "Jarvis" and about controlling the equipment in my house by voice. And then, finally, I got around to creating this miracle. I didn't have to think long about the brains.

So, the hardware:

Raspberry Pi 3 Model B
Logitech USB camera

Implementation

Our assistant will work on the same principle as Alexa / Hub:

Activate offline on a specific wake word
Recognize the command in the cloud
Execute the command
Report on the work done or provide the requested information.

Since my camera is supported out of the box, I didn't have to mess with drivers, so we proceed straight to the software part.

Offline activation

Activation is handled by CMU Sphinx. Everything would be fine, but out of the box recognition is very slow, taking more than 10 seconds, which is completely unsuitable. To solve the problem, you need to clear the dictionary of unnecessary words.

Install everything you need:

 pip3 install SpeechRecognition
 pip3 install pocketsphinx

Next, open the pronunciation dictionary:

 sudo nano /usr/local/lib/python3.4/dist-packages/speech_recognition/pocketsphinx-data/en-US/pronounciation-dictionary.dict

Delete everything except the "jarvis" entry we need:

  jarvis JH AA R V AH S

Now pocketsphinx recognizes the word quite quickly.
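
Instead of editing the file by hand, the trimming can also be scripted. Below is a minimal sketch, assuming the dictionary is a plain-text file with one entry per line in the form "word PHONE PHONE ...". The function name filter_dictionary and the sample entries are made up for illustration; back up the original file before overwriting it.

```python
def filter_dictionary(lines, keep_words):
    """Return only the dictionary lines whose headword is in keep_words."""
    keep = {word.lower() for word in keep_words}
    kept = []
    for line in lines:
        parts = line.split()
        # Skip blank lines; the headword is the first whitespace-separated token.
        if parts and parts[0].lower() in keep:
            kept.append(line.rstrip("\n"))
    return kept


if __name__ == "__main__":
    # Sample entries for illustration only, not the real dictionary contents.
    sample = [
        "hello HH AH L OW\n",
        "jarvis JH AA R V AH S\n",
        "world W ER L D\n",
    ]
    print(filter_dictionary(sample, ["jarvis"]))
```

To apply it to the real file, read the dictionary with readlines(), pass the lines through filter_dictionary, and write the result back with a trailing newline after each entry.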

Speech recognition

At first I planned to use the Google service, especially since SpeechRecognition supports it. But as it turned out, Google charges money for it and does not work with private individuals.

Fortunately, Yandex also provides this capability, free of charge and extremely simple to use.

Register and get an API key. All the work can be done with curl.

 curl -X POST -H "Content-Type: audio/x-wav" --data-binary "@file" "https://asr.yandex.net/asr_xml?uuid=ya_uid&key=ya_api_key&topic=queries"
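
The same request can be made from Python without calling curl. This is only a sketch: the endpoint and the uuid, key and topic parameters are taken from the curl command above, while the helper names (build_asr_url, first_variant, recognize) are made up here. The answer is assumed to be XML with variant elements, which is what the script in this article parses.

```python
import urllib.parse
import urllib.request
from xml.dom import minidom

# Endpoint copied from the curl command; not verified against current Yandex docs.
ASR_ENDPOINT = "https://asr.yandex.net/asr_xml"


def build_asr_url(uuid, api_key, topic="queries"):
    """Build the recognition URL with properly encoded query parameters."""
    query = urllib.parse.urlencode({"uuid": uuid, "key": api_key, "topic": topic})
    return ASR_ENDPOINT + "?" + query


def first_variant(xml_text):
    """Return the text of the first <variant> element, or None if there is none."""
    doc = minidom.parseString(xml_text)
    variants = doc.getElementsByTagName("variant")
    if variants and variants[0].firstChild is not None:
        return variants[0].firstChild.nodeValue
    return None


def recognize(wav_path, uuid, api_key):
    """POST a WAV file to the service and return the first recognized variant."""
    with open(wav_path, "rb") as wav_file:
        audio = wav_file.read()
    request = urllib.request.Request(
        build_asr_url(uuid, api_key),
        data=audio,  # passing data makes this a POST request
        headers={"Content-Type": "audio/x-wav"},
    )
    with urllib.request.urlopen(request) as response:
        return first_variant(response.read())
```

Keeping the URL building and the XML parsing in separate functions makes both easy to check without network access.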

Speech synthesis

Here again Yandex helps us. We send the text and in response receive a file with the synthesized speech:

 curl "https://tts.voicetech.yandex.net/generate?format=wav&lang=ru-RU&speaker=zahar&emotion=good&key=ya_api_key" -G --data-urlencode "text=text" > file
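
For reference, here is a hedged Python sketch of the same synthesis request. The endpoint and parameter values come from the curl command above; the function names build_tts_url and synthesize are hypothetical. Percent-encoding the text in Python avoids quoting problems when the phrase contains quotes or other special characters.

```python
import urllib.parse
import urllib.request

# Endpoint copied from the curl command; not verified against current Yandex docs.
TTS_ENDPOINT = "https://tts.voicetech.yandex.net/generate"


def build_tts_url(text, api_key, lang="ru-RU", speaker="zahar", emotion="good"):
    """Build the synthesis URL; the text is percent-encoded like --data-urlencode."""
    params = {
        "format": "wav",
        "lang": lang,
        "speaker": speaker,
        "emotion": emotion,
        "key": api_key,
        "text": text,
    }
    return TTS_ENDPOINT + "?" + urllib.parse.urlencode(params, quote_via=urllib.parse.quote)


def synthesize(text, api_key, out_path):
    """Download the synthesized speech and save it to out_path."""
    with urllib.request.urlopen(build_tts_url(text, api_key)) as response:
        with open(out_path, "wb") as out_file:
            out_file.write(response.read())
```

The saved file can then be played with aplay, as in the shell version.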

Jarvis

Putting it all together, we get this script.

    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    import os
    import random
    import sys
    from xml.dom import minidom

    import speech_recognition as sr

    r = sr.Recognizer()
    ya_uuid = ''
    ya_api_key = ''

    # os.system('echo "+ +" |festival --tts --language russian')


    def convert_ya_asr_to_key():
        # Return the first recognition variant from the Yandex answer, or False.
        xmldoc = minidom.parse('./asr_answer.xml')
        itemlist = xmldoc.getElementsByTagName('variant')
        if len(itemlist) > 0:
            return itemlist[0].firstChild.nodeValue
        else:
            return False


    def jarvis_on():
        # Check offline whether the recording contains the wake word.
        with sr.WavFile("send.wav") as source:
            audio = r.record(source)
        try:
            t = r.recognize_sphinx(audio)
            print(t)
        except sr.UnknownValueError:
            # Nothing was recognized, so the wake word was not spoken.
            print("Could not understand audio")
            return False
        return t == "jarvis"


    def jarvis_say(phrase):
        # Synthesize the phrase with Yandex and play the result.
        os.system(
            'curl "https://tts.voicetech.yandex.net/generate?format=wav&lang=ru-RU&speaker=zahar&emotion=good&key=' + ya_api_key + '" -G --data-urlencode "text=' + phrase + '" > jarvis_speech.wav')
        os.system('aplay jarvis_speech.wav')


    def jarvis_say_good():
        # Confirm a completed command with a random phrase.
        phrases = ["", "", "", "", "- ?", ]
        randitem = random.choice(phrases)
        jarvis_say(randitem)


    try:
        while True:
            # Record 3 seconds and look for the wake word.
            os.system('arecord -B --buffer-time=1000000 -f dat -r 16000 -d 3 -D plughw:1,0 send.wav')
            if jarvis_on():
                os.system('aplay jarvis_on.wav')
                # Record the command itself and send it to Yandex.
                os.system('arecord -B --buffer-time=1000000 -f dat -r 16000 -d 3 -D plughw:1,0 send.wav')
                os.system(
                    'curl -X POST -H "Content-Type: audio/x-wav" --data-binary "@send.wav" "https://asr.yandex.net/asr_xml?uuid=' + ya_uuid + '&key=' + ya_api_key + '&topic=queries" > asr_answer.xml')
                command_key = convert_ya_asr_to_key()
                if (command_key):
                    if (command_key in ['key_word', 'key_word1', 'key_word2']):
                        os.system('')
                        jarvis_say_good()
                        continue
    except Exception:
        jarvis_say('-   ')

What is going on here? We run an infinite loop, record three seconds with arecord, and send the recording to Sphinx for recognition. If the word "jarvis" appears in the file

  if jarvis_on():

we play a pre-recorded activation alert file.

Then we record another 3 seconds and send them to Yandex; in response we get our command. Next, we perform actions based on the command.

That's all there is to it. You can create as many scenarios as you like.
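
As the number of scenarios grows, the chain of "if (command_key in [...])" checks can be replaced by a lookup table. This is a minimal sketch, not the author's code: the phrases, the echo commands and the helper names below are placeholders.

```python
import subprocess

# Placeholder phrases and commands, for illustration only.
SCENARIOS = {
    ("lights on", "turn on the light"): ["echo", "lights on"],
    ("lights off", "turn off the light"): ["echo", "lights off"],
}


def build_lookup(table):
    """Flatten {(phrase, ...): argv} into {phrase: argv} for direct lookups."""
    lookup = {}
    for phrases, argv in table.items():
        for phrase in phrases:
            lookup[phrase.strip().lower()] = argv
    return lookup


def find_command(lookup, recognized):
    """Return the argv list for the recognized phrase, or None if it is unknown."""
    if not recognized:  # also covers the False returned when nothing was recognized
        return None
    return lookup.get(recognized.strip().lower())


def run_command(argv):
    """Run the command without a shell and return its exit code."""
    return subprocess.run(argv).returncode


if __name__ == "__main__":
    lookup = build_lookup(SCENARIOS)
    argv = find_command(lookup, "Lights on")
    if argv is not None:
        run_command(argv)
```

Passing the command as a list of arguments also sidesteps the quoting problems of building os.system strings by hand.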

Use cases

Now some real examples from my own use.

Philips Hue

Install:

 pip install phue

In the Hue app, set a static IP for the bridge.

Run:

    #!/usr/bin/python
    import sys
    from phue import Bridge

    b = Bridge('192.168.0.100')  # Enter bridge IP here.

    # If running for the first time, press button on bridge and run with b.connect() uncommented
    # b.connect()

    print(b.get_scene())

Write down the IDs of the scenes you need, for example "470d4c3c8-on-0".

The final script:

    #!/usr/bin/python
    import sys
    from phue import Bridge

    b = Bridge('192.168.0.100')  # Enter bridge IP here.

    # If running for the first time, press button on bridge and run with b.connect() uncommented
    # b.connect()

    if (sys.argv[1] == 'off'):
        # Turn off lights 1, 2 and 3.
        b.set_light([1, 2, 3], 'on', False)
    else:
        # Activate the scene whose ID was passed as the first argument.
        b.activate_scene(1, sys.argv[1])

Add to Jarvis:

    if (command_key in [' ', ' ', '']):
        os.system('python3 /home/pi/smarthome/hue/hue.py a1167aa91-on-0')
        jarvis_say_good()
        continue
    if (command_key in [' ', ' ']):
        os.system('python3 /home/pi/smarthome/hue/hue.py ac637e2f0-on-0')
        jarvis_say_good()
        continue
    if (command_key in [' ', ' ']):
        os.system('python3 /home/pi/smarthome/hue/hue.py "off"')
        jarvis_say_good()
        continue

LG TV

We take the script from here. After the first launch and entry of the pairing code, the code itself does not change, so you can cut that part out of the script and keep only the control part.

Add to Jarvis:

    # 1 - POWER
    # 24 - VOLUME_UP
    # 25 - VOLUME_DOWN
    # 400 - 3D_VIDEO
    if (command_key in [' ', ' ']):
        os.system('python3 /home/pi/smarthome/TV/tv2.py 1')
        jarvis_say_good()
        continue
    if (command_key in [' ', '']):
        os.system('python3 /home/pi/smarthome/TV/tv2.py 24')
        jarvis_say_good()
        continue
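
The key codes from the comment can also be kept in a small table so the calls do not rely on magic numbers. This is a sketch, assuming tv2.py accepts the numeric key code as its only argument, as in the calls above; the dictionary name and the tv_command helper are made up for illustration.

```python
import subprocess

# Key codes taken from the comment above; the names are only labels.
TV_KEYS = {
    "POWER": 1,
    "VOLUME_UP": 24,
    "VOLUME_DOWN": 25,
    "3D_VIDEO": 400,
}

# Script path as used in the article.
TV_SCRIPT = "/home/pi/smarthome/TV/tv2.py"


def tv_command(key_name):
    """Return the argv list that sends the given key to the TV."""
    if key_name not in TV_KEYS:
        raise ValueError("unknown TV key: " + key_name)
    return ["python3", TV_SCRIPT, str(TV_KEYS[key_name])]


def press(key_name):
    """Run the control script for the given key and return its exit code."""
    return subprocess.run(tv_command(key_name)).returncode
```

For example, press("VOLUME_UP") runs the same command as the second branch above.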

Radio

 sudo apt-get install mpg123