
DIY Voice Translator

About a week ago, a good friend of mine complained that she wanted to learn English but could not attend courses because of her work schedule. It would be nice, she said, to have some kind of self-study translator for an ordinary phone (read: the J2ME platform). Being a moderator on one of the mobile-related forums, I started digging through the threads for something like that. There were a few candidates. The problems with them were:
1) size from 350 KB up to 7-8 MB
2) text translation only
3) only a couple of languages (Russian-English)

As a true Jedi developer, I decided to fix the situation and impress the girl; besides, I had long wanted to work with voice on J2ME.
As the engine for the whole thing, I immediately chose a product from the Corporation of Good (Google).
I wrote myself a spec:
- minimalist interface (two text fields plus a choice of source and target languages)
- text translation
- spoken playback of the translated text
- voice input

For the translation engine, I took source code kindly provided by a developer I know, Doctor Drive.
I found the speech generation part on Habr. A few hours later a prototype was ready that translated text and spoke it aloud.

That left the most interesting and, at first glance, unrealistic stage: voice recognition on J2ME. I dropped the idea of doing the processing in software on the phone right away as pure fantasy and started reading up on recognizing speech with GoogleSpeechAPI. There is plenty of information on the topic; the main problem is that the API accepts FLAC or Speex. Fine, I downloaded libraries for those formats written for J2SE and started porting them to J2ME.

At 4 in the morning, after 20-30-50 cups of coffee, staring at the upcoming struggle with several thousand lines of code, I remembered that I once had Google voice search on a Nokia N73, and it worked surprisingly fast. Either it shipped some extremely efficient FLAC codec, or there was a trick. I put the code and libraries aside and created a new project whose only purpose was to loop through all the audio formats the phone supports and check each one for compatibility with GoogleSpeechAPI. Compile, run, a few minutes of fear that I was wrong, a very long wall of error text from the API... and one single answer standing out from the crowd. The format was "encoding=amr"!

The joy was short-lived: every attempt to send my voice came back with vague results such as "ooooh". Racking what was left of my brain after a sleepless night, already at work, I decided to check what parameters the phone's voice recorder actually uses when recording; thanks to aNNiMON, who promptly helped with this. It turned out that "audio/amr; rate=16000" had to be replaced with "audio/amr; rate=8000". After work I checked how it behaved. Bingo!
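
The probing step can be sketched roughly as follows. This is not the author's code, which the post does not show; it is a minimal illustration in plain Java SE rather than J2ME. It assumes the phone reports its capture formats as a space-separated string such as "encoding=pcm encoding=amr" (on a device this would typically come from the MMAPI system property `audio.encodings`), and the class and method names `EncodingProbe`, `encodingNames` and `contentType` are hypothetical. The sketch only builds the candidate Content-Type values to try; sending each one to the recognition service is left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: turns a device's list of audio encodings into
// candidate Content-Type values for a speech recognition request.
final class EncodingProbe {

    // Extracts encoding names from a string like
    // "encoding=pcm&rate=8000 encoding=amr" (assumed format).
    static List<String> encodingNames(String property) {
        List<String> names = new ArrayList<>();
        if (property == null) {
            return names;
        }
        for (String entry : property.trim().split("\\s+")) {
            for (String field : entry.split("&")) {
                if (field.startsWith("encoding=") && field.length() > 9) {
                    names.add(field.substring(9));
                }
            }
        }
        return names;
    }

    // Builds a header value such as "audio/amr; rate=8000".
    static String contentType(String encodingName, int sampleRateHz) {
        return "audio/" + encodingName + "; rate=" + sampleRateHz;
    }

    public static void main(String[] args) {
        String sample = "encoding=pcm encoding=amr"; // stand-in for the device value
        int[] rates = {16000, 8000};
        // Print every encoding/rate combination that would be tried.
        for (String name : encodingNames(sample)) {
            for (int rate : rates) {
                System.out.println(contentType(name, rate));
            }
        }
    }
}
```

Trying several sample rates per encoding matters here: in the story above the encoding was accepted at both rates, but only the rate matching what the recorder really produced (8000) gave meaningful recognition results.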
Now the program and its source code could be shaped into something more or less usable. The final version can:
- translate text from one language to another; the list of languages matches the one on translate.google.com
- speak the translated text
- translate voice-to-voice: you dictate the text, it appears in the input field, gets translated, and is spoken
- check pronunciation (not a perfect solution, but it works): you dictate a phrase and it is compared with the reference
- open text from the file system and translate it
- log translations; the log is stored in the voice recorder folder (in the file system)
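
The post does not say how the pronunciation check compares the dictated phrase with the reference, so the following is only one possible sketch, written in plain Java SE. It assumes the recognizer returns the dictated phrase as text, normalizes both strings (lowercase, punctuation removed), and scores the share of reference words that were recognized in the same position. `PronunciationCheck`, `normalize` and `score` are hypothetical names.

```java
import java.util.Locale;

// Hypothetical pronunciation check: compares recognized text with a reference phrase.
final class PronunciationCheck {

    // Lowercases the text and replaces every run of characters that are
    // neither letters nor digits with a single space.
    static String normalize(String s) {
        return s.toLowerCase(Locale.ROOT)
                .replaceAll("[^\\p{L}\\p{Nd}]+", " ")
                .trim();
    }

    // Returns the fraction (0.0 to 1.0) of reference words that appear
    // at the same position in the recognized text.
    static double score(String reference, String recognized) {
        String[] ref = normalize(reference).split(" ");
        String[] rec = normalize(recognized).split(" ");
        if (ref.length == 1 && ref[0].isEmpty()) {
            return 0.0; // empty reference: nothing to compare against
        }
        int hits = 0;
        for (int i = 0; i < ref.length && i < rec.length; i++) {
            if (ref[i].equals(rec[i])) {
                hits++;
            }
        }
        return (double) hits / ref.length;
    }

    public static void main(String[] args) {
        System.out.println(score("How are you?", "how are you"));
        System.out.println(score("How are you?", "how old you"));
    }
}
```

A position-by-position comparison is deliberately crude: one inserted or dropped word shifts everything after it. An edit-distance measure would be more forgiving, at the cost of more code on a constrained device.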
I sent the program to my own phone, to friends and acquaintances, and to a couple of forums to catch bugs. The reaction has been mixed: on some sites the threads got a single post, while some friends asked for the same thing for Android or desktop. As for me, I am still waiting for the reaction of that very friend.

UPD: fixed the problems that were found. Recording now opens a window with a stop button and a recording timer, and on a server or network error the corresponding notification is shown.

Source: https://habr.com/ru/post/146374/

