
Speech recognition from a non-default microphone

Many developers face the task of recognizing speech in order to convert it to text for further processing. The problem itself has been described many times, and Google readily turns up solutions, but in my own work I ran into an issue that required a non-trivial fix: how to make the engine recognize speech from a microphone that is not set as the default device. If you have hit the same problem, this article is for you.



So, what we have: Microsoft Windows ships two different speech recognition engines (the standard one and the Microsoft Speech SDK 11), and I will say right away that at the moment only one of them supports Russian, namely the Microsoft Speech SDK 11 with the Russian language pack installed. If you read the documentation for both engines carefully, you will see that the only input device you can select is the default one (via SetInputToDefaultAudioDevice). Digging deeper, namely opening the assembly in the Telerik JustDecompile decompiler, I confirmed that all the managed classes are just wrappers over COM objects, and those COM objects do expose the function we need. Naturally, like any programmer, I was lazy and did not want to work with COM directly, so I combined the two approaches. The first thing we need is to find the token of the desired microphone; the following function does that:

    private SpeechLib.SpObjectToken FindMicByName(string name)
    {
        if (isot == null)
            return null;

        for (int i = 0; i < isot.Count; i++)
        {
            sot = isot.Item(i);
            // 1033 is the LCID for en-US; the device description is matched by substring.
            string desc = sot.GetDescription(1033);
            if (desc.Contains(name))
                return sot;
        }

        return null; // no device with a matching description was found
    }


After we get the token, we need to call private members of the recognition engine class. This is done via reflection in the following function:

    public void SetMicByName(string name, ref System.Speech.Recognition.SpeechRecognitionEngine sre)
    {
        if (isot != null)
        {
            sre.SetInputToDefaultAudioDevice();
            sot = FindMicByName(name);
            if (sot != null)
            {
                // Reach into the private _sapiRecognizer field of the engine
                // and call its non-public SetInput method with the token we found.
                FieldInfo fi = sre.GetType().GetField("_sapiRecognizer",
                    BindingFlags.Instance | BindingFlags.NonPublic);
                object _sapiRecognizer = fi.GetValue(sre);
                MethodInfo mi = _sapiRecognizer.GetType().GetMethod("SetInput",
                    BindingFlags.Instance | BindingFlags.NonPublic);
                object[] parms = new object[] { sot, true };
                mi.Invoke(_sapiRecognizer, parms);
            }
        }
    }


Finally, here is the full text of the helper class used in my project. You can use it as a whole, in part, or simply copy the code fragments you need. All of this has been tested on Windows 7, 8 and 8.1 with .NET 4.0 and 4.5, but I expect it will work with other versions as well.

    using System;
    using System.Reflection;

    namespace RMI.SmartHouse.Service
    {
        /// <summary>
        /// Selects a specific microphone for a speech recognition engine.
        /// </summary>
        public class MicSelector : IDisposable
        {
            #region Fields

            private SpeechLib.SpInProcRecoContext _siprc;
            private SpeechLib.ISpeechObjectTokens _isot;
            private SpeechLib.SpObjectToken _sot;
            private bool _isDisposed;

            #endregion

            #region Constructors

            /// <summary>
            /// Creates the selector and enumerates the available audio inputs.
            /// </summary>
            public MicSelector()
            {
                _siprc = new SpeechLib.SpInProcRecoContext();
                _isot = _siprc.Recognizer.GetAudioInputs(null, null);
                _sot = null;
            }

            #endregion

            #region Properties

            /// <summary>
            /// The list of audio input device tokens.
            /// </summary>
            public SpeechLib.ISpeechObjectTokens Isot
            {
                get { return _isot; }
            }

            #endregion

            #region Public methods

            /// <summary>
            /// Sets the input of a System.Speech recognition engine to the named microphone.
            /// </summary>
            /// <param name="name">Part of the microphone description.</param>
            /// <param name="sre">The recognition engine to configure.</param>
            public void SetMicByName(string name, ref System.Speech.Recognition.SpeechRecognitionEngine sre)
            {
                if (_isot != null)
                {
                    sre.SetInputToDefaultAudioDevice();
                    _sot = FindMicByName(name);
                    if (_sot != null)
                    {
                        FieldInfo fi = sre.GetType().GetField("_sapiRecognizer",
                            BindingFlags.Instance | BindingFlags.NonPublic);
                        if (fi != null)
                        {
                            object sapiRecognizer = fi.GetValue(sre);
                            MethodInfo mi = sapiRecognizer.GetType().GetMethod("SetInput",
                                BindingFlags.Instance | BindingFlags.NonPublic);
                            object[] parms = { _sot, true };
                            mi.Invoke(sapiRecognizer, parms);
                        }
                    }
                }
            }

            /// <summary>
            /// Sets the input of a Microsoft.Speech recognition engine to the named microphone.
            /// </summary>
            /// <param name="name">Part of the microphone description.</param>
            /// <param name="sre">The recognition engine to configure.</param>
            public void SetMicByName(string name, ref Microsoft.Speech.Recognition.SpeechRecognitionEngine sre)
            {
                if (_isot != null)
                {
                    sre.SetInputToDefaultAudioDevice();
                    _sot = FindMicByName(name);
                    if (_sot != null)
                    {
                        FieldInfo fi = sre.GetType().GetField("_sapiRecognizer",
                            BindingFlags.Instance | BindingFlags.NonPublic);
                        if (fi != null)
                        {
                            object sapiRecognizer = fi.GetValue(sre);
                            MethodInfo mi = sapiRecognizer.GetType().GetMethod("SetInput",
                                BindingFlags.Instance | BindingFlags.NonPublic);
                            object[] parms = { _sot, true };
                            mi.Invoke(sapiRecognizer, parms);
                        }
                    }
                }
            }

            /// <summary>
            /// Refreshes the list of audio input devices.
            /// </summary>
            /// <returns>The updated device list.</returns>
            public SpeechLib.ISpeechObjectTokens UpdateDeviceList()
            {
                if (_siprc != null)
                {
                    _isot = _siprc.Recognizer.GetAudioInputs(null, null);
                    return _isot;
                }
                return null;
            }

            /// <summary>
            /// Releases the resources used by the selector.
            /// </summary>
            public void Dispose()
            {
                Dispose(true);
                GC.SuppressFinalize(this);
            }

            #endregion

            #region Private and protected methods

            /// <summary>
            /// Releases the resources used by the selector.
            /// </summary>
            protected virtual void Dispose(bool disposing)
            {
                if (!_isDisposed)
                {
                    if (disposing)
                    {
                        _sot = null;
                        _isot = null;
                        _siprc = null;
                    }
                    _isDisposed = true;
                }
            }

            /// <summary>
            /// Finds a microphone token by a fragment of its description.
            /// </summary>
            /// <param name="name">Part of the microphone description.</param>
            /// <returns>The matching token, or null if none is found.</returns>
            private SpeechLib.SpObjectToken FindMicByName(string name)
            {
                if (_isot == null)
                    return null;

                for (int i = 0; i < _isot.Count; i++)
                {
                    SpeechLib.SpObjectToken token = _isot.Item(i);
                    // 1033 is the LCID for en-US.
                    string desc = token.GetDescription(1033);
                    if (desc.Contains(name))
                    {
                        _sot = token;
                        return _sot;
                    }
                }

                return null; // no matching device found
            }

            #endregion
        }
    }
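To show how the class ties together, here is a minimal usage sketch of my own (not from the original project). The device-name substring "USB Audio" and the dictation grammar are placeholders; substitute a fragment of the description your microphone shows in the Windows sound settings. This only runs on Windows with the SpeechLib COM interop referenced.

    // Hypothetical usage sketch; "USB Audio" is a placeholder device name.
    var sre = new System.Speech.Recognition.SpeechRecognitionEngine();
    sre.LoadGrammar(new System.Speech.Recognition.DictationGrammar());

    using (var selector = new RMI.SmartHouse.Service.MicSelector())
    {
        // Redirect the engine's input from the default device
        // to the microphone whose description contains the substring.
        selector.SetMicByName("USB Audio", ref sre);
    }

    sre.RecognizeAsync(System.Speech.Recognition.RecognizeMode.Multiple);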



Source: https://habr.com/ru/post/214635/


