I recently had the idea of making a "talker" in Russian. The scheme in my head was simple:
1) Recognize speech from a microphone
2) Come up with a more or less reasonable answer.
At this point you can do a lot of interesting things.
For example, you could control something physical, or something not so physical.
3) Convert the answer to speech and play it back.
Best of all, Python libraries already existed for every one of these steps, and I used them.
The result is a bundle that is almost independent of the language chosen for the conversation.
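The three steps above can be sketched as one pass through a loop. This is only an outline and not the code from the article: `talk_once` and its three callable parameters are hypothetical names that stand in for whichever recognition, answering and synthesis libraries get plugged in.

```python
from typing import Callable, Optional


def talk_once(
    listen: Callable[[], Optional[str]],
    think: Callable[[str], str],
    speak: Callable[[str], None],
) -> Optional[str]:
    """Run one listen -> think -> speak cycle and return the answer."""
    heard = listen()       # step 1: speech from the microphone, as text
    if not heard:          # nothing was recognized, so there is nothing to answer
        return None
    answer = think(heard)  # step 2: come up with a reply
    speak(answer)          # step 3: turn the reply into sound
    return answer
```

Because the steps are passed in as functions, the spoken language only matters inside `listen` and `speak`, which is why the bundle stays nearly language-independent.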
SpeechRecognition is a wrapper over many popular speech recognition services and libraries.
Of all the services in the library's list, Google Speech Recognition was the first one I got working, so I used it from then on.
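A minimal sketch of the recognition step with SpeechRecognition and its Google backend. The function name `recognize_from_microphone` is hypothetical, and the `"ru-RU"` language code is an assumption about how Russian is selected; `sr.Microphone` also needs PyAudio to be installed.

```python
def recognize_from_microphone(language="ru-RU"):
    """Listen to one phrase and return it as text, or None on failure."""
    # Imported here so that merely defining the function needs no audio stack.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        # Sample the background noise briefly so quiet speech is not cut off.
        recognizer.adjust_for_ambient_noise(source)
        audio = recognizer.listen(source)

    try:
        # Sends the recording to Google's web speech API.
        return recognizer.recognize_google(audio, language=language)
    except sr.UnknownValueError:
        return None  # the speech could not be understood
    except sr.RequestError:
        return None  # the service could not be reached
```

Switching to another spoken language should only require a different `language` value.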
The chatterbot library uses machine learning techniques. It is trained on datasets in dialog format.
The training process in the chatterbot library
The training data can come from files in a simple format.
In essence, they are sets of dialogs of the form:
- - - ... -
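Whatever the exact file layout, a dialog boils down to an ordered list of utterances. The sketch below assumes chatterbot 1.x, where trainers are separate objects (older releases used `bot.set_trainer(...)` followed by `bot.train(...)` instead). `dialog_to_pairs` and `train_on_dialog` are hypothetical helper names; the first one only illustrates that each utterance is treated as a possible response to the one before it.

```python
def dialog_to_pairs(dialog):
    """Pair every utterance with the reply that follows it."""
    return list(zip(dialog, dialog[1:]))


def train_on_dialog(bot, dialog):
    """Train a ChatBot instance on one dialog (a list of strings)."""
    # Imported lazily so dialog_to_pairs works without chatterbot installed.
    from chatterbot.trainers import ListTrainer

    ListTrainer(bot).train(dialog)
```

For a dialog of three utterances, `dialog_to_pairs` yields two statement/response pairs, which is roughly the amount of knowledge the bot gains from it.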
For English there is a good set of trainer classes, one of which takes dialogs from the Ubuntu Dialog Corpus and another from Twitter.
Unfortunately, I did not find an alternative of comparable size to the Ubuntu Dialog Corpus for Russian, although the same TwitterTrainer should work.
As an experiment, I tried using the dialogues from the first volume of War and Peace.
The result was funny but not very convincing, because the dialogues there are often addressed to specific characters in the novel.
Since it is hard to get an interesting conversation partner out of a bot without a lot of data, the search for a good conversational corpus is still ongoing.
The chatterbot library also provides a set of "logic modules" (LogicAdapter), with which you can, for example, filter the answer or teach the bot to count or tell the current time.
The library is quite flexible; it lets you write your own trainer classes and logic modules.
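Logic modules are selected when the bot is created. The sketch below lists three adapters that ship with chatterbot; `build_bot` and the bot name are hypothetical. The arithmetic and time adapters are written with English input in mind, so Russian phrasing may need a custom adapter.

```python
# Dotted import paths of adapters bundled with chatterbot.
LOGIC_ADAPTERS = [
    "chatterbot.logic.BestMatch",               # pick the closest known reply
    "chatterbot.logic.MathematicalEvaluation",  # answer simple arithmetic
    "chatterbot.logic.TimeLogicAdapter",        # answer questions about the time
]


def build_bot(name="talker"):
    """Create a bot that consults all adapters listed above."""
    # Imported lazily so the adapter list can be inspected without chatterbot.
    from chatterbot import ChatBot

    return ChatBot(name, logic_adapters=LOGIC_ADAPTERS)
```

A reply is then obtained with `build_bot().get_response("...")`; the adapter that reports the highest confidence supplies the answer.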
The third library, gTTS, converts a string into an mp3 file with speech. Since it relies on Google's text-to-speech service, there are many languages to choose from, including Russian.
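A minimal sketch of the synthesis step, assuming the library in question is gTTS. `save_speech` is a hypothetical helper, and its `tts_class` parameter exists only so the function can be tried out without network access.

```python
def save_speech(text, path="answer.mp3", lang="ru", tts_class=None):
    """Render text to an mp3 file and return the file path."""
    if tts_class is None:
        # Default to the real gTTS class; imported lazily on first use.
        from gtts import gTTS
        tts_class = gTTS

    # gTTS requests the audio from Google and writes it to disk on save().
    tts_class(text=text, lang=lang).save(path)
    return path
```

Playing the resulting file is a separate concern; any mp3-capable player or audio library can be used for that.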
The code is available at the link: GitHub
I recommend creating a separate virtual environment for Python right away, for example with conda.
conda create --name speech_ai
source activate speech_ai
conda install python=3.5
For experiments with the set of libraries above, the following is suitable:
The packages are installed by following the instructions on their sites:
Also, when installing SpeechRecognition, you sometimes have to install one of its dependencies (PyAudio) by hand:
sudo apt-get install python-pyaudio python3-pyaudio
pip3 install pyaudio
The chatterbot documentation advises using MongoDB in production.
By default a JSON file is used as the data store, which makes training on medium-sized samples several times slower.
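Switching the storage backend is a matter of constructor arguments. The sketch below assumes chatterbot 1.x keyword names (older releases took the database name as a separate argument) and a MongoDB server on localhost; `build_mongo_bot` and the database name `speech_ai` are hypothetical.

```python
MONGO_SETTINGS = {
    # Storage adapter bundled with chatterbot for MongoDB.
    "storage_adapter": "chatterbot.storage.MongoDatabaseAdapter",
    # Assumed local server; the last path component is the database name.
    "database_uri": "mongodb://localhost:27017/speech_ai",
}


def build_mongo_bot(name="talker"):
    """Create a bot that keeps its knowledge in MongoDB."""
    # Imported lazily so the settings can be inspected without chatterbot.
    from chatterbot import ChatBot

    return ChatBot(name, **MONGO_SETTINGS)
```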
Some thoughts:
Source: https://habr.com/ru/post/323570/