Prototyping a voice-controlled shopping list for WP8, Win8 and Android with an Azure backend in 2.5 hours

From November 9 to 11, the Windows 8 hackathon RUWOWZAPP took place. I first registered as a participant and was then honored to attend the event as an expert. As an expert, I got to know a lot of great people and their projects. It was so interesting that I kept advising even at night, leaving only 4-5 hours for sleep. I was so infected by the positive energy and people's desire to create that I could not resist building a small prototype of my own: a shopping list with voice recognition support.
In a couple of hours I managed to make a functional prototype that demonstrates the idea of the application, with clients for WP, Win8 and Android.

I did not want to enter the application contest with such a raw prototype, but I really wanted to show what I had done in a couple of hours. At the last moment, just before the final participant's presentation, I got in line, and the moderator allowed me to demonstrate my creation.

The application aroused great interest among the hackathon participants, and this is the promised article with answers to all the questions I did not have time for.

For those who want to see the code right away, the source code can be downloaded here.
Everyone else, please read on.

Unlike in the video, during the presentation I was in a hurry and did not have time to launch the Android version. I expected the main interest to be in how exactly voice recognition works, but I did not expect so much interest in everything else. For the next half hour, a couple of dozen people asked all kinds of questions about the project: how exactly synchronization works, how commands are parsed, how to write for Android, how the server side is built, and so on. I apologize that I could not show everyone the project's source code at the time. This article, which I promised there, answers all the questions I was asked.

The idea of the application.

Initially, I just wanted to finally try out voice recognition in WP8, which had become available to developers. I also wanted a solution that works well with the Russian language.
I settled on the following set of commands:

Buy [product] - add products to the list
Bought [product] - set the "bought" checkbox
Remove [product] - remove a product from the list
Delete list - clear the list
Price [product] [price] - set the price
[product] in the store [store] - specify the store where the product can be bought

I figured I could build such an application for three platforms in 6 hours. Looking ahead, I will say that I had less time than I expected and managed to implement only the first 4 commands.

Voice recognition - 1 hour.

WP8 works very well with English and recognizes speech well even with my accent. But it turned out that recognition in Russian is much more limited: for Russian, WP8 can only recognize words from a predefined dictionary. I wasted about half an hour on this.
I really wanted to support Russian, and since I already had experience with voice recognition services, I decided to plug in some commercial voice recognition engine temporarily. However, nothing had changed since I last worked with them: none of them offered an automated trial or paid plan, and every service required talking to a manager. So for the demo I decided to use voice recognition from Google. I specifically looked for the terms of use of the Google voice recognition engine and could not find them, but I remember seeing somewhere that it cannot be used for commercial purposes (although I may be mistaken). Many thanks to Yakhnev for the excellent article with C# sources. It took only 10 minutes to turn his desktop project into a web project with an API for voice recognition. But since the application would not be able to save a file to disk there, and there was no time to rework the recognition to run in memory, I had to give up the free Web Role in Azure. Fortunately, I already had a couple of virtual machines deployed in Azure, and modifying the project and uploading it to the server was no problem. As a result, I brought up the recognition service at the endpoint voice.akhmed.ru/recognize.ashx: you POST a WAV file in the request and get the text in response.
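
A minimal sketch of how a client might call such an endpoint, assuming the service accepts the raw WAV bytes as the request body. The RecognitionClient class and its methods are hypothetical, and the audio/wav content type is an assumption; the article only says that the WAV file is sent by POST and the text comes back.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

// Hypothetical helper, not part of the original project.
public static class RecognitionClient
{
    // Endpoint named in the article.
    public const string Endpoint = "http://voice.akhmed.ru/recognize.ashx";

    // Builds the POST request with the WAV bytes as the body.
    public static HttpRequestMessage BuildRequest(byte[] wav)
    {
        var content = new ByteArrayContent(wav);
        // Assumption: the content type is illustrative; the service may ignore it.
        content.Headers.ContentType = new MediaTypeHeaderValue("audio/wav");
        return new HttpRequestMessage(HttpMethod.Post, Endpoint) { Content = content };
    }

    // Sends the recording and returns the response body (the recognized text).
    public static async Task<string> RecognizeAsync(HttpClient http, byte[] wav)
    {
        using (var request = BuildRequest(wav))
        using (var response = await http.SendAsync(request))
        {
            response.EnsureSuccessStatusCode();
            return await response.Content.ReadAsStringAsync();
        }
    }
}
```

Keeping the request construction separate from the network call makes the first part easy to check without a live server.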

WP7 app - 30 minutes

The WP7 application took the most time, but only because this platform was the testing ground and its code changed constantly during development.

Once the voice recognition service was up, the question of voice recognition on the device came up.
Since this was a functional prototype, I decided to throw out everything unnecessary: user authorization, handling of button presses, a loading indicator, resubmission, error handling (so the application may crash from time to time), saving to a database, saving WAV files, and so on.
Since the application also had to be ported to Android, I decided to build the prototype without MVVM, so the code ended up a terrible mess.

Since the application no longer had to target WP8 specifically, I decided to build a WP7 version, which gave an additional advantage: the prototype works on any WP device. Recording from the microphone is a rather non-trivial task on WP7, but I already had my WPExtensions library, which makes it easy to record voice to a WAV file. In the AppBar, I added one dummy button for adding entries to the list by hand and a microphone button that starts recording on the first press and, when pressed again, sends the recording to the server and processes the result:

    private bool isRecording = false;
    private readonly MicrophoneWrapper microphone = new MicrophoneWrapper();

    private void ApplicationBarRecordIconButton_Click(object sender, System.EventArgs e)
    {
        if (!isRecording)
        {
            // First press: start recording.
            microphone.Record();
            PageTitle.Text = "...";
        }
        else
        {
            // Second press: stop, send the WAV to the server, restore the header.
            microphone.Stop();
            var wav = microphone.GetWavContent();
            Send(wav);
            PageTitle.Text = defaultHeader;
        }
        isRecording = !isRecording;
    }

The sending method is also quite trivial: it sends the recording to the server and processes the response.

    private void Send(byte[] wav)
    {
        var client = new HttpWebClient();
        client.Post("http://voice.akhmed.ru/recognize.ashx", wav,
            (result) => Dispatcher.BeginInvoke(() => ParseString(result)));
    }

    private void ParseString(string result)
    {
        logicLayer.Parse(result);
        RefreshView();
    }

There were a lot of questions about how commands are parsed, which library I use for text analysis, and how superfluous words like “buy” or “i” are filtered. Of course, a release version needs a much more competent solution with morphological and syntactic analysis, but for now the code is outrageously simple. I just use the first word as the command and filter out all words of up to two letters.

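    public void Parse(string voiceText)
    {
        var words = voiceText.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        if (words.Length > 1)
        {
            // The first word is the command; the rest are its arguments.
            var command = words.First();

            // "Buy [product]": add the products to the list.
            if (command.Equals(""))
            {
                Add(words.Skip(1));
                IncrementUpdate();
            }

            // "Bought [product]": mark the products as bought.
            if (command.StartsWith(""))
            {
                SetBoughtStatusTrue(words.Skip(1));
                IncrementUpdate();
            }

            // "Remove [product]" or "Delete list".
            if (command.Equals("") || command.Equals(""))
            {
                if (words[1].Equals(""))
                {
                    // "Delete list": clear the whole list.
                    shopList.ShopItems.Clear();
                }
                else
                {
                    RemoveShopListItems(words.Skip(1));
                }
                IncrementUpdate();
            }
        }
    }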
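
The parsing idea described above (first word is the command, words of up to two letters are dropped) can be sketched as a self-contained class. This is only an illustration: VoiceCommandParser is a hypothetical name, the English command words stand in for the ones used in the prototype, and treating every remaining word as a separate product is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, self-contained illustration; not the prototype's LogicLayer.
public class VoiceCommandParser
{
    // Product name -> "bought" flag.
    public Dictionary<string, bool> Items { get; } = new Dictionary<string, bool>();

    public void Parse(string voiceText)
    {
        var words = voiceText.ToLowerInvariant()
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        if (words.Length < 2) return;

        // The first word is the command.
        var command = words[0];
        // Drop filler words of one or two letters.
        var args = words.Skip(1).Where(w => w.Length > 2).ToList();

        switch (command)
        {
            case "buy":
                foreach (var name in args) Items[name] = false;
                break;
            case "bought":
                foreach (var name in args)
                    if (Items.ContainsKey(name)) Items[name] = true;
                break;
            case "remove":
            case "delete":
                // "delete list" clears everything; otherwise remove the named products.
                if (words[1] == "list") Items.Clear();
                else foreach (var name in args) Items.Remove(name);
                break;
        }
    }
}
```

For example, after Parse("buy milk or bread") the list holds two unbought items, because the two-letter word "or" is filtered out.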

Application backend - 20 minutes

To synchronize with other devices, a server side was needed. Of course, such a backend is an ideal candidate for hosting in Azure as a Web Role, but for the prototype it could live on the same Azure virtual machine as voice recognition. Since time was very limited, it made sense to build a SOAP service, because Visual Studio can quickly generate a proxy for it on the client.
The service is also embarrassingly simple. There is a single shopping list that is passed between client and server (on the client, the class is generated in the proxy).

    public class ShopList
    {
        public ShopList()
        {
            ShopItems = new List<ShopItem>();
        }

        public List<ShopItem> ShopItems { get; set; }
        public int Version { get; set; }
    }

    public class ShopItem
    {
        public string Name { get; set; }
        public decimal Price { get; set; }
        public char Valute { get; set; }
        public bool IsBought { get; set; }
    }

Honestly, the two fields Price and Valute are superfluous, since I did not have time to use them, but I quote the code “as is”.
Saving and retrieving the list is also implemented very simply.

    public class GroceryService : System.Web.Services.WebService
    {
        private LogicLayer logicLayer = new LogicLayer();

        [WebMethod]
        public ShopList GetVersion()
        {
            return logicLayer.GetShopList();
        }

        [WebMethod]
        public void UploadVersion(ShopList request)
        {
            logicLayer.Update(request);
        }
    }

Of course, a release version should not upload the whole list as is but only the changed data; for the prototype, though, this will do.
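
One possible shape of such a partial update, as a rough sketch rather than anything the prototype contains: compute the difference between the previous and the current list and send only that. All names here (DeltaItem, ListDelta, DeltaSync) are hypothetical, and the sketch assumes product names are unique within a list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical types for illustration only.
public class DeltaItem
{
    public string Name;
    public bool IsBought;
}

public class ListDelta
{
    public List<DeltaItem> Upserted = new List<DeltaItem>();   // new or changed items
    public List<string> RemovedNames = new List<string>();     // items that disappeared
}

public static class DeltaSync
{
    // Assumption: names are unique, so they can serve as dictionary keys.
    public static ListDelta Diff(IEnumerable<DeltaItem> before, IEnumerable<DeltaItem> after)
    {
        var old = before.ToDictionary(i => i.Name);
        var now = after.ToDictionary(i => i.Name);
        var delta = new ListDelta();

        foreach (var item in now.Values)
        {
            DeltaItem previous;
            // Keep the item if it is new or its "bought" flag changed.
            if (!old.TryGetValue(item.Name, out previous) || previous.IsBought != item.IsBought)
                delta.Upserted.Add(item);
        }

        delta.RemovedNames.AddRange(old.Keys.Where(name => !now.ContainsKey(name)));
        return delta;
    }
}
```
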
The logic is also embarrassingly simple: since this is a prototype, there is no database, and the current value is simply kept in memory. Honestly, there was no point in having this layer, but I quote it “as is”. The method names turned out poorly, but I did not change them.

    public class LogicLayer
    {
        private static ShopList shopList = new ShopList();

        public ShopList GetShopList()
        {
            return shopList;
        }

        internal void Update(ShopList newshopList)
        {
            shopList = newshopList;
        }
    }

In the end, I deployed this service at voicegrocery.akhmed.ru/GroceryService.asmx
Now the question is how to deliver updates to clients. Via push notifications, of course. But implementing them could take a lot of time, which I was short of, so I made the client poll the server every 5 seconds.

    DispatcherTimer dispathcerTimer = new DispatcherTimer();
    dispathcerTimer.Interval = TimeSpan.FromSeconds(5);
    dispathcerTimer.Tick += dispathcerTimer_Tick;
    dispathcerTimer.Start();

The update logic to and from the client is very simple.
1. If the current version is lower than the one received from the server, the current list is replaced by the server's.
2. If a change occurs on the client, the version is incremented by 1 and the list is sent to the server.

    // Every tick: ask the server for its current list.
    void dispathcerTimer_Tick(object sender, System.EventArgs e)
    {
        var client = new ServiceReference1.GroceryServiceSoapClient();
        client.GetVersionCompleted += client_GetVersionCompleted;
        client.GetVersionAsync();
    }

    // Rule 1: replace the local list only if the server version is newer.
    void client_GetVersionCompleted(object sender, ServiceReference1.GetVersionCompletedEventArgs e)
    {
        if (e.Result.Version > logicLayer.GetVersion())
        {
            logicLayer.UpdateShopList(e.Result);
            RefreshView();
        }
    }

    // Rule 2: after a local change, upload the list with the version incremented by 1.
    private void IncrementUpdate()
    {
        var shopListItem = new ShopList()
        {
            Version = shopList.Version + 1,
            ShopItems = shopList.ShopItems
        };
        var client = new ServiceReference1.GroceryServiceSoapClient();
        client.UploadVersionAsync(shopListItem);
    }
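
The two rules can also be tried out without any server. The sketch below is a simplified model with hypothetical names (VersionedList, SyncSketch); unlike the prototype code, it increments the local version immediately instead of waiting for the next poll.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical types that only mirror the idea of ShopList.Version.
public class VersionedList
{
    public int Version;
    public List<string> Items = new List<string>();
}

public class SyncSketch
{
    public VersionedList Local = new VersionedList();

    // Rule 1: take the server copy only if it is newer than the local one.
    public bool Pull(VersionedList server)
    {
        if (server.Version <= Local.Version) return false;
        Local = new VersionedList
        {
            Version = server.Version,
            Items = new List<string>(server.Items)
        };
        return true;
    }

    // Rule 2: after a local change, bump the version and return the copy to upload.
    public VersionedList Push(string newItem)
    {
        Local.Items.Add(newItem);
        Local.Version += 1;
        return new VersionedList
        {
            Version = Local.Version,
            Items = new List<string>(Local.Items)
        };
    }
}
```

With this scheme, two clients that change the list starting from the same version upload the same version number, and the later upload simply replaces the earlier one. That is acceptable for a demo but is one more reason a release would need something more careful.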

Porting to Windows 8 - 10 minutes.

Porting the application to Win8 was very easy. I did not implement voice recognition on this client, so synchronization is one-way. The XAML was copied almost unchanged; only the code that talks to the server needed a small fix. It even became a little simpler, fitting in one method:

    async void dispathcerTimer_Tick(object sender, object e)
    {
        var client = new ServiceReference1.GroceryServiceSoapClient();
        var result = await client.GetVersionAsync();
        if (result.Body.GetVersionResult.Version > logicLayer.GetVersion())
        {
            logicLayer.UpdateShopList(result.Body.GetVersionResult);
            RefreshView();
        }
    }

Porting the application to Android - 15 minutes.

I love the Mono platform. The code remained almost unchanged; only the UI had to be fixed. Since building the presentation is much harder on Android, I did not spend much time on a custom adapter: after 5 minutes I rolled back and made a simple text list with textual crosses in brackets:

    void client_GetVersionCompleted(object sender, ru.akhmed.voicegrocery.GetVersionCompletedEventArgs e)
    {
        try
        {
            list.Clear();
            var result = e.Result.ShopItems;
            foreach (var item in result)
            {
                var checkBox = item.IsBought ? "( X ) " : "( ) ";
                list.Add(checkBox + item.Name);
            }
            this.RunOnUiThread(() =>
            {
                this.ListAdapter = new ArrayAdapter<string>(this, Resource.Layout.ListItem, list);
                ((BaseAdapter)this.ListAdapter).NotifyDataSetChanged();
            });
        }
        catch (Exception) { }
    }

Porting to iOS - not available

Of course, I thought about porting to iOS, but I did not have any i-devices ~~only hackintoshes, which I use for development at home and which it would be improper to show at such events~~ and there was very little time, so I put this idea off. ~~Besides, I did not have a hackintosh on the laptop I had with me~~

Results

Not counting the 40 minutes spent researching the capabilities of the WP8 platform, in less than 2 hours, including the time for uploading to the server and minor bug fixes, I built a full-fledged prototype that shows the main idea of the application and that I will not mind throwing away.
Of course, the code is very dirty and sub-optimal, with a bunch of flaws and unfinished features. But functional prototypes are needed for exactly this: like a paper sketch, they show the customer or management, in draft form, the product that will come out.

Source: https://habr.com/ru/post/158481/
