
Voice control of a multimedia center

In this article I describe my experience using the Web Speech API in Google Chrome to implement voice search and automatic playback of videos from YouTube. To reproduce this setup, take the following steps:

  1. Install the software: Apache2 and PHP5 (the curl extension is required).
  2. Have a Dune HD multimedia center available, or install XBMC and configure it for Internet access.
  3. Get a YouTube API key to perform search queries.

I will not describe how to do any of the above, since plenty of articles already cover these topics. The principle of implementation is:

  1. Recognize the phrase with a script written in JavaScript (it works only in Google Chrome).
  2. Search for videos matching the query.
  3. Get direct links to the videos.
  4. Create a playlist of the links and video titles.
  5. Send the playlist to the device for playback.

Network topology: the Internet connection arrives at the WAN port of a Wi-Fi router, and the devices connect to that router.

JavaScript speech recognition script - index.html:
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="ru" lang="ru">
<head>
<meta charset="UTF-8" />
<title> </title>
<script language="javascript" type="text/javascript">
/* Create an XMLHttpRequest object for talking to the web server */
var xmlHttp = false;
/*@cc_on @*/
/*@if (@_jscript_version >= 5)
try {
  xmlHttp = new ActiveXObject("Msxml2.XMLHTTP");
} catch (e) {
  try {
    xmlHttp = new ActiveXObject("Microsoft.XMLHTTP");
  } catch (e2) {
    xmlHttp = false;
  }
}
@end @*/
if (!xmlHttp && typeof XMLHttpRequest != 'undefined') {
  xmlHttp = new XMLHttpRequest();
}
</script>
<style>
* { font-family: Verdana, Arial, sans-serif; font-size: 20px; }
a:link { color:#000; text-decoration: none; }
a:visited { color:#000; }
a:hover { color:#33F; }
body { text-align: center; }
.button { background: -webkit-linear-gradient(top,#008dfd 0,#0370ea 100%); border: 1px solid #076bd2; border-radius: 3px; color: #fff; display: none; font-size: 13px; font-weight: bold; line-height: 1.3; padding: 8px 25px; text-align: center; text-shadow: 1px 1px 1px #076bd2; letter-spacing: normal; }
.final { color: black; padding-right: 3px; }
.interim { color: gray; }
.info { font-size: 14px; text-align: center; color: #777; display: none; }
.sidebyside { display: inline-block; width: 45%; min-height: 40px; text-align: left; vertical-align: top; }
#headline { font-size: 40px; font-weight: 300; }
#info { font-size: 20px; text-align: center; color: #777; visibility: hidden; }
#results { font-size: 14px; font-weight: bold; border: 1px solid #ddd; padding: 15px; text-align: left; min-height: 30px; width: 500px; margin: 0 auto; }
#start_button { border: 0; padding: 0; background: url(images/mic.gif); width: 50px; height: 50px; cursor: pointer; vertical-align: top; }
#info_speak_now, #info_no_speech, #info_no_microphone, #info_upgrade { display: none; }
</style>
</head>
<body>
<div id="messages">
  <input type="button" id="start_button" onclick="startButton(event);" />
  <p id="info_start">Click the microphone and say a phrase.</p>
  <p id="info_speak_now">Speak now!</p>
  <p id="info_no_speech">No speech was detected.</p>
  <p id="info_no_microphone">No microphone was found.</p>
  <p id="info_upgrade">Your browser does not support the Web Speech API.</p>
</div>
<div id="results">
  <span id="final_span" class="final"></span>
</div>
<script>
var start_button = document.getElementById('start_button'),
    final_span = document.getElementById('final_span'),
    recognizing = false,   // true while recognition is running
    final_transcript = '';

// Check that the browser supports the Speech API
if (!('webkitSpeechRecognition' in window)) {
  start_button.style.display = "none";
  showInfo("info_upgrade");
} else {
  /* Create the recognition object */
  var recognition = new webkitSpeechRecognition();
  recognition.lang = 'ru';          // recognition language (a language code)
  recognition.continuous = true;    // keep listening after the user pauses

  /* Called when recognition starts */
  recognition.onstart = function() {
    recognizing = true;
    showInfo('info_speak_now');                                  // show the prompt
    start_button.style.background = 'url(images/mic-animate.gif)'; // animate the icon
  };

  /* Error handling */
  recognition.onerror = function(event) {
    if (event.error == 'no-speech') {
      start_button.style.background = 'url(images/mic.gif)';
      showInfo('info_no_speech');
    }
    if (event.error == 'audio-capture') {
      start_button.style.background = 'url(images/mic.gif)';
      showInfo('info_no_microphone');
    }
  };

  /* Called when recognition ends */
  recognition.onend = function() {
    recognizing = false;
    //recognition.start();
    start_button.style.background = 'url(images/mic.gif)';
    showInfo('info_start');
  };

  /* Called when a result arrives. The event carries:
     - resultIndex - index of the first result that changed
     - results - the list of recognition results */
  recognition.onresult = function(event) {
    /* Loop over the new results */
    for (var i = event.resultIndex; i < event.results.length; ++i) {
      /* Take only final (not interim) results */
      if (event.results[i].isFinal) {
        final_transcript += event.results[i][0].transcript.toLowerCase();
      }
    }
    final_span.innerHTML = final_transcript;
    var newText2 = final_transcript.replace(/(^\s+|\s+$)/g, '');
    // encodeURIComponent also escapes & and =, which encodeURI leaves as-is
    var url = "/voice_search.php?q=" + encodeURIComponent(newText2);
    xmlHttp.open("GET", url, true);
    xmlHttp.send(null);
    final_transcript = '';   // reset for the next phrase
  };
}

/* Show one status message and hide the others */
function showInfo(id) {
  var messages = document.querySelectorAll('p');
  for (var i = 0; i < messages.length; i++) {
    messages[i].style.display = 'none';
  }
  document.getElementById(id).style.display = 'block';
}

/* Microphone button click handler */
function startButton(event) {
  if (recognizing) {   // recognition is already running, so stop it
    recognition.stop();
    document.getElementById('final_span').innerHTML = '';
    return;
  }
  recognition.start();
}
</script>
</body>
</html>


To complete the script, create an images folder and put the microphone pictures in it (the code refers to them as images/mic.gif and images/mic-animate.gif); they can be taken here and here.

This script does only two things: it recognizes the phrase and sends it to the PHP script with an AJAX request. Note that all scripts must be saved in UTF-8 (on Windows, UTF-8 without BOM).
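The second of those two actions, building the request to the PHP script, can be sketched on its own. This is only an illustration: buildSearchUrl is a hypothetical helper name, and the only thing taken from the article is the /voice_search.php?q= endpoint. The sketch uses encodeURIComponent so that characters such as & inside the phrase do not split the query string.

```javascript
// Hypothetical helper (not part of the article's script): turns a recognized
// phrase into the request URL for voice_search.php.
function buildSearchUrl(transcript) {
  // Normalize the phrase the same way the page does: lower case, no outer spaces.
  var phrase = transcript.toLowerCase().trim();
  // encodeURIComponent escapes &, = and ?, so the phrase stays one parameter.
  return "/voice_search.php?q=" + encodeURIComponent(phrase);
}

// In the browser the URL would then be sent roughly like this:
//   var xhr = new XMLHttpRequest();
//   xhr.open("GET", buildSearchUrl("Rock & Roll"), true);
//   xhr.send(null);
```

On the PHP side the phrase is then available as $_GET['q'] without any extra decoding.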

PHP script for searching video clips - voice_search.php:
<?php
// Send a status message back to the caller
function send_info($info) {
    echo $info;
}

// Fetch a URL with cURL and return the response body
function send_req($url) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows NT 5.1; rv:7.0.1) Gecko/20100101 Firefox/7.0.1");
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
    curl_setopt($ch, CURLOPT_HEADER, false);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_REFERER, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
    $out = curl_exec($ch);
    curl_close($ch);
    return $out;
}

// Resolve a YouTube video ID to a direct link to the video file
function get_video_url($videoId) {
    $url = 'http://www.youtube.com/get_video_info?&video_id='.$videoId.'&asv=3&el=detailpage&hl=en_US';
    $first_found = "";
    $last_found = "";
    $first_quality = "";
    $last_quality = "";
    //$video_quality = 'medium';
    $video_quality = 'hd1080';

    // The response is a query string: split it into name => value pairs
    $doc = send_req($url);
    $t = array();
    foreach (explode("&", $doc) as $r) {
        $c = explode("=", $r);
        $t[$c[0]] = isset($c[1]) ? $c[1] : '';
    }
    if (!isset($t['url_encoded_fmt_stream_map'])) {
        return false;
    }

    $links = explode(',', urldecode($t['url_encoded_fmt_stream_map']));
    $dlinks = array();
    foreach ($links as $link) {
        parse_str($link, $linkarr);
        if (!isset($linkarr['itag'], $linkarr['quality'], $linkarr['url'])) {
            continue;
        }
        $itag = $linkarr['itag'];
        $quality = $linkarr['quality'];
        // Only the mp4 formats: itag 18, 22, 37, 38
        if (in_array($itag, array('18', '22', '37', '38'))) {
            if (isset($linkarr['s'])) {
                // Signed link: collected in $dlinks, but not returned for playback
                $linkarr['signature'] = file_get_contents('http://dune-club.info/echo?message=' . $linkarr['s']);
                unset($linkarr['s']);
                $dlinks[$linkarr['itag']] = $linkarr['url'] . "&signature=" . $linkarr['signature'];
            } else {
                $dlinks[$linkarr['itag']] = $linkarr['url'];
                $playback_url = $dlinks[$linkarr['itag']];
                if ($first_found === "") {
                    $first_found = $playback_url;
                    $first_quality = $quality;
                }
                $last_found = $playback_url;
                $last_quality = $quality;
                if (($quality === $video_quality) || (($quality !== 'medium') && ($video_quality === 'hdonly'))) {
                    return urldecode($playback_url);
                }
            }
        }
    }

    // No exact quality match: fall back to the first or last link found
    if (($last_found !== "") && ($video_quality !== 'hdonly')) {
        if ($video_quality === 'hd1080') {
            return urldecode($first_found);
        } else {
            return urldecode($last_found);
        }
    } else {
        //hd_print("--> video: $id; no mp4-stream.");
        return false;
    }
}

// Build the YouTube API v3 search request
if (isset($_GET['q']) == false or $_GET['q'] == "") {
    $url = "https://www.googleapis.com/youtube/v3/search?part=snippet&q=%20%20&type=video&maxResults=10&key=Youtube_API_key";
} else {
    $url = "https://www.googleapis.com/youtube/v3/search?part=snippet&q=".urlencode($_GET['q'])."&type=video&maxResults=10&key=Youtube_API_key";
}

$res = json_decode(send_req($url));
if (empty($res->items)) {
    $info = "Nothing found!";
    send_info($info);
} else {
    $res = $res->items;
    //print_r($res);

    // Write the playlist: a header line, then a title and a link per clip
    $fp = fopen('play_list.m3u', 'w+t');
    $start = "#EXTM3U\r\n";
    fwrite($fp, $start);
    foreach ($res as $searchResult) {
        $title = ($searchResult->snippet->title);
        $videoId = ($searchResult->id->videoId);
        $clip_url = get_video_url($videoId);
        if (empty($clip_url)) {
            $info = "Link not found!";
            send_info($info);
        } else {
            $clip = "#EXTINF:-1,$title\r\n$clip_url\r\n";
            fwrite($fp, $clip);
        }
    }
    fclose($fp);
    $info = "Playlist created";
    send_info($info);

    // Keep only the $url line for your device and delete the other one.
    // URL for Dune HD
    $url = "http://ip_addr_dune/cgi-bin/do?cmd=launch_media_url&media_url=http://server_ip/play_list.m3u";
    // URL for XBMC: insert your login, password and IP address into the host part.
    // The JSON request is URL-encoded so that its quotes do not break the PHP string or the URL.
    $url = 'http://:@ip-:8080/jsonrpc?request=' . urlencode('{"jsonrpc":"2.0","id":"1","method":"Player.Open","params":{"item":{"file":"http://server_ip/play_list.m3u"}}}');

    $curl = curl_init();
    curl_setopt($curl, CURLOPT_URL, $url);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
    $out = curl_exec($curl);
    curl_close($curl);
}
?>
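The two player URLs at the end of the script are easier to read when assembled from parts. The following is a rough JavaScript sketch of that assembly, not the code the article runs: buildDuneUrl and buildXbmcUrl are hypothetical helper names, the host values in the usage lines are made-up documentation addresses, and only the URL paths and the Player.Open request come from the script itself. It leaves out the XBMC login and password.

```javascript
// Hypothetical helpers; the paths and the JSON-RPC method are from voice_search.php.

// Dune HD: launch a media URL through the built-in CGI command.
function buildDuneUrl(duneHost, playlistUrl) {
  return "http://" + duneHost +
    "/cgi-bin/do?cmd=launch_media_url&media_url=" + playlistUrl;
}

// XBMC: the JSON-RPC request travels in the "request" query parameter,
// so it is serialized and URL-encoded as a whole.
function buildXbmcUrl(xbmcHost, playlistUrl) {
  var request = {
    jsonrpc: "2.0",
    id: "1",
    method: "Player.Open",
    params: { item: { file: playlistUrl } }
  };
  return "http://" + xbmcHost + ":8080/jsonrpc?request=" +
    encodeURIComponent(JSON.stringify(request));
}

// Usage with assumed example addresses:
var playlist = "http://192.0.2.20/play_list.m3u";
var duneUrl = buildDuneUrl("192.0.2.10", playlist);
var xbmcUrl = buildXbmcUrl("192.0.2.11", playlist);
```

Building the JSON with JSON.stringify rather than by hand avoids the quoting problems that come from pasting a JSON literal into a string.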


At the very end of this script, edit $url to match the settings of your multimedia center and delete the line you do not need. Also replace server_ip with the IP address of your Apache server and insert your Youtube_API_key. What happens here: the text of the recognized phrase arrives from the speech recognition script; the YouTube API v3 is then used to search for video clips matching the query; the results are passed through a loop that extracts the direct link to each video file and writes it to the playlist play_list.m3u. This script does not claim to be ideal code: it was written purely for demonstration, so it contains almost no error checks.
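The playlist format itself is simple enough to show separately. Here is a minimal sketch of how play_list.m3u is laid out, written in JavaScript purely for illustration (the article builds the file in PHP): buildM3u is a hypothetical name, and the clip objects with title and url fields are an assumed shape. The #EXTM3U header, the #EXTINF:-1 lines and the \r\n line endings match what the script writes.

```javascript
// Hypothetical helper: builds the text of an extended M3U playlist.
// Each clip is assumed to look like { title: "...", url: "..." }.
function buildM3u(clips) {
  var lines = ["#EXTM3U"];
  clips.forEach(function (clip) {
    // Skip clips for which no direct link was found.
    if (!clip.url) {
      return;
    }
    // A line break inside a title would corrupt the playlist, so flatten it.
    var title = String(clip.title).replace(/[\r\n]+/g, " ");
    // -1 means the duration is unknown.
    lines.push("#EXTINF:-1," + title);
    lines.push(clip.url);
  });
  return lines.join("\r\n") + "\r\n";
}
```

Clips without a link are skipped rather than written as empty entries, which mirrors the check the PHP loop makes before calling fwrite.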

That's all. Now open the web server by its IP address. You can do this from any device: a tablet, smartphone, or laptop. The only thing I have noticed is that recently, on Android smartphones, the speech recognition script sends the phrase a second time when there is no further speech. I have not yet figured out why; this did not happen before.
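Since the cause of the repeated phrase on Android is not known, the following is only one possible workaround and not a confirmed fix: drop a phrase if it is identical to the previous one and arrives within a short time window. Everything here is an assumption made for illustration, including the createPhraseFilter name and the 3000 ms window.

```javascript
// Hypothetical workaround: suppress an identical phrase that repeats
// within windowMs milliseconds of the previous one.
function createPhraseFilter(windowMs) {
  var lastPhrase = null;
  var lastTime = 0;
  // Returns true if the phrase should be sent, false if it looks like a repeat.
  return function shouldSend(phrase, now) {
    var isRepeat = phrase === lastPhrase && (now - lastTime) < windowMs;
    lastPhrase = phrase;
    lastTime = now;
    return !isRepeat;
  };
}

// Possible use inside recognition.onresult, before sending the request:
//   var shouldSend = createPhraseFilter(3000);
//   if (shouldSend(newText2, Date.now())) { /* open and send the XHR */ }
```

The trade-off is that deliberately saying the same phrase twice in quick succession is also suppressed, so the window should stay short.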

Many more interesting things can be built on this material, such as voice search for music in VK or control of 1-Wire devices. Give it a try, and if something does not work, ask: I will gladly answer your questions.

PS: This article is based on the following materials:

W3C Web Speech API
YouTube API v3

The script for getting direct links is taken from the YouTube application for Dune HD and slightly modified to fit my needs. Those who just want to try controlling multimedia centers without writing scripts can do so here or here. The result of my work is on YouTube.

Source: https://habr.com/ru/post/270809/

