After publishing the post "Development of a Russian-speaking 'analogue' of Siri in 7 days," I received a lot of valuable advice and offers of help. Thank you very much. I took much of that advice and many of the comments into account and kept developing. What came out of it is under the cut.
The first thing I did was change the design. Many thanks to vipzona, who kindly sent a new design for the application. Now the design looks like this:

Thanks to Limosha, who suggested where to dig in the Apple docs, I changed the sampling rate from 44.1 kHz to 16 kHz. The file sent to the server became smaller. I wanted to shrink it further by lowering the sampling rate to 8 kHz, but the recognition quality deteriorated noticeably, so I stopped at 16 kHz.
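To see why the sampling rate matters so much for the upload, here is a rough estimate of the size of an uncompressed recording. It is only a sketch: it assumes 16-bit mono PCM and a 44-byte WAV header, neither of which is stated in this post, and the function name is made up for the illustration.

```python
WAV_HEADER_BYTES = 44  # assumed: canonical header of a simple PCM WAV file


def wav_size_bytes(sample_rate_hz, seconds, bits_per_sample=16, channels=1):
    """Approximate size of an uncompressed PCM WAV recording."""
    bytes_per_second = sample_rate_hz * channels * bits_per_sample // 8
    return WAV_HEADER_BYTES + int(bytes_per_second * seconds)


if __name__ == "__main__":
    # Compare the three sampling rates mentioned above for a 5-second phrase.
    for rate in (44100, 16000, 8000):
        print(rate, "Hz:", wav_size_bytes(rate, 5), "bytes")
```

Under these assumptions a 5-second phrase shrinks to a little over a third of its size when going from 44.1 kHz to 16 kHz, and halves again at 8 kHz.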
Most importantly, I changed the speech engine and moved to Google. The recognition quality is many times better than with ispeech.org. Unfortunately, I never managed to find a library that converts WAV to FLAC directly on the iPhone, so I decided to do the conversion on my server using the flac library. Developers on the forums write that this is faster than converting on the device anyway, but it seems to me that this applies only to older iPhone models.
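The server-side conversion could look roughly like the sketch below. It assumes the reference `flac` command-line encoder is installed and on the PATH; the post does not say how the flac library is actually invoked, and the function names and file paths here are hypothetical.

```python
import subprocess


def build_flac_command(wav_path, flac_path):
    """Command line for the reference `flac` encoder.

    -f  overwrite the output file if it already exists
    -s  silent mode, no progress output
    -o  name of the output file
    """
    return ["flac", "-f", "-s", "-o", flac_path, wav_path]


def convert_wav_to_flac(wav_path, flac_path):
    """Run the encoder; raises CalledProcessError if flac exits with an error."""
    subprocess.run(build_flac_command(wav_path, flac_path), check=True)
    return flac_path
```

Keeping the command construction separate from the call makes it easy to check the arguments without having the encoder installed.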
The move to the Google engine required more powerful resources, so I ordered a dedicated server. Since I had previously used only shared hosting, the dedicated server gave me some trouble. I got a machine with CentOS and PHP 5.1, which has no json_encode or json_decode, plus MySQL, in which full-text search over Russian text in the utf8_general_ci collation worked incorrectly. I "killed" two days setting up the server: I updated PHP, installed the flac library, and sorted out other small things.
After all the changes, the logic of the application looks like this:
A) the phone sends the WAV file to my server;
B) on the server, the file is re-encoded to FLAC and sent to the Google server for recognition, which returns the recognized string;
C) my server processes the received information, forms the answer and sends it to the phone;
D) the phone plays back the received information.
With this scheme everything works much faster, but the load on the server has increased considerably. I don't even know how many simultaneous requests the server can handle. It is dedicated, but its capacity is not unlimited.
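The server-side part of the flow described above can be sketched as a single function. This is an illustration only: the real server is written in PHP, and the three callables are hypothetical placeholders for the FLAC conversion, the call to Google's recognizer, and the answer logic.

```python
def handle_voice_request(wav_bytes, transcode, recognize, build_answer):
    """Process one request from the phone on the server.

    transcode:    WAV bytes  -> FLAC bytes
    recognize:    FLAC bytes -> recognized text
    build_answer: text       -> reply that is sent back to the phone
    """
    flac_bytes = transcode(wav_bytes)
    text = recognize(flac_bytes)
    return {"question": text, "answer": build_answer(text)}


if __name__ == "__main__":
    # Stand-ins so the sketch runs without a network or an encoder.
    reply = handle_voice_request(
        b"RIFF",
        transcode=lambda wav: b"fLaC" + wav,
        recognize=lambda flac: "what is the weather",
        build_answer=lambda text: "You asked: " + text,
    )
    print(reply)
```

Passing the steps in as functions keeps the slow parts (encoding and the remote recognition call) easy to time or swap out separately.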
Improved functionality
Next, having carefully read all the advice sent to me and written in the comments, I set about improving the functionality. The first thing I did was add a weather widget that shows the weather by GPS coordinates, or by IP if the coordinates could not be determined. The widget is a standard one taken from gismeteo.ru. It is fine in every respect except that it shows advertising. A little later I will probably have to make my own weather widget. There are several sites that return an XML file for the coordinates you send them. You only need to parse it, draw icons for the sun, clouds, etc., and display the result.
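A home-made widget along those lines might start from something like this sketch. The XML layout, the tag names and the icon file names are all invented for the example; a real weather service will have its own format.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from condition codes to icon files.
ICONS = {"clear": "sun.png", "cloudy": "cloud.png", "rain": "rain.png"}


def pick_location(gps_fix, ip_lookup):
    """Prefer GPS coordinates; fall back to an IP-based lookup."""
    if gps_fix is not None:
        return gps_fix
    return ip_lookup()


def parse_forecast(xml_text):
    """Extract (temperature, icon) from a hypothetical XML reply."""
    root = ET.fromstring(xml_text)
    temperature = float(root.findtext("temperature"))
    condition = root.findtext("condition")
    return temperature, ICONS.get(condition, "unknown.png")


if __name__ == "__main__":
    sample = ("<forecast><temperature>-3.5</temperature>"
              "<condition>cloudy</condition></forecast>")
    print(pick_location(None, lambda: (55.75, 37.62)), parse_forecast(sample))
```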

Then I added the ability to call someone by saying the contact's name. If the contact is not found, the app says so. Of course, many people, myself included, name their contacts in ways that are nearly impossible to pronounce correctly. To make this more convenient, I search for a contact not only by first name or surname but also by nickname. You add a nickname to the contact, and calling becomes much easier. In principle, the nickname approach is the most correct one; even Apple advises using it with Siri, because no matter how much you train the program, teaching it to decline every name is very difficult. I decided not to go any deeper into phone management (timer, alarm clock, notes) for now, so that Apple would not reject the app. They have already sent me a letter saying, more or less, that they are sorry, but reviewing my application will require additional time, thank you for waiting, and so on.
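The lookup by first name, surname or nickname can be sketched as follows. The contact structure (a dictionary with `first`, `last` and `nickname` keys) is an assumption made for the example, not the format of the iPhone address book.

```python
def find_contact(spoken_name, contacts):
    """Return the first contact whose first name, surname, full name
    or nickname equals the recognized text (case-insensitive)."""
    wanted = spoken_name.strip().lower()
    for contact in contacts:
        first = contact.get("first", "")
        last = contact.get("last", "")
        candidates = [first, last, (first + " " + last).strip(),
                      contact.get("nickname", "")]
        if any(wanted == c.lower() for c in candidates if c):
            return contact
    return None  # the caller reports that the contact was not found


if __name__ == "__main__":
    book = [{"first": "Ivan", "last": "Petrov", "nickname": "boss"}]
    print(find_contact("Boss", book))
```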
Then I added the ability to find my location on the map and show nearby places: restaurants, cafes, night clubs, etc. For this I use queries to Yandex. Considering that the real Siri does not search for places outside the United States, this turned out to be a very useful feature.


I also added a bit of humor and answers to frequently asked but absolutely meaningless questions. To do this, I read up on which questions Siri is asked most often. Since every question put to my assistant is also stored in the database, I will soon know what our people ask most often.
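Finding the most common questions in such a log is a single grouped query. The sketch below uses an in-memory SQLite database as a stand-in for the MySQL server, and the `questions` table with a `text` column is a hypothetical schema.

```python
import sqlite3


def most_frequent_questions(conn, limit=10):
    """Return (question, count) pairs, most frequent first."""
    rows = conn.execute(
        "SELECT text, COUNT(*) AS n FROM questions "
        "GROUP BY text ORDER BY n DESC, text LIMIT ?",
        (limit,),
    )
    return rows.fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE questions (text TEXT)")
    conn.executemany(
        "INSERT INTO questions (text) VALUES (?)",
        [("who are you",), ("tell a joke",), ("who are you",)],
    )
    print(most_frequent_questions(conn, limit=2))
```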
I also added the ability to search for photos and pictures on Google and connected the Wolfram|Alpha Webservice API. I connected it but have not activated it yet, because I read that Apple ruthlessly rejects all programs that use the Wolfram|Alpha knowledge base in a voice assistant, since Siri uses the same one. Using the Wolfram|Alpha API is free as long as the number of requests does not exceed 2000 per day.
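To stay inside the free limit, the server can count requests per day before forwarding them. This is a minimal sketch with a hypothetical class name; it keeps the counter in memory, so a real server would need to store it somewhere shared between processes.

```python
import datetime


class DailyQuota:
    """Counts requests per calendar day and refuses calls over the limit."""

    def __init__(self, limit=2000):
        self.limit = limit
        self.day = None
        self.used = 0

    def try_acquire(self, today=None):
        """Return True and count the request if the quota allows it."""
        today = today or datetime.date.today()
        if today != self.day:  # a new day has started: reset the counter
            self.day, self.used = today, 0
        if self.used >= self.limit:
            return False
        self.used += 1
        return True
```

A request handler would call `try_acquire()` first and fall back to another answer source when it returns False.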



Then I took on something really necessary: searching for flights and booking hotels. The result can be seen in the screenshots.


In the end, it turned out to be quite a decent application that I am not ashamed to put in the App Store. True, the knowledge base still needs to be filled out; for now my assistant is more like Ellochka the Cannibal from "The Twelve Chairs", with her very limited vocabulary.
If Apple approves the application, it could turn into a very good startup. Then I will port the application to Android and Windows Phone. If it is not approved, I will not be very upset. I think the work will not go to waste. I already have ideas about how and where else speech recognition can be applied in mobile applications.
A video demonstration of the pre-release version's features is here:
www.youtube.com/watch?v=JlkJva-TGfY
P.S. As always, I will be happy to hear advice and constructive criticism.
If any Habr readers run services for booking tickets, hotels or anything else useful, with a sane API, I will be happy to discuss integrating these services into my application.
UPD: Apple rejected the app tonight. I will now enter into a long and tedious correspondence with them and change whatever they require.
