
Wikistream - Worldwide audio guide based on Wikipedia articles.

We have released an audio guide that is based on Wikipedia.

About a million Wikipedia articles are tied to specific points on the planet: 172 thousand of them are in English and 17 thousand are in Russian. We turned all this wealth into a spoken audio guide.
It is available if you have a GPS-equipped smartphone with Java, or an iPhone 3G. You pay only for Internet traffic.

In this post I want to talk about some of the problems we ran into while implementing the project.

1. As it turned out, the coordinates of places listed in Wikipedia are not separate entities in its database; they are just part of the article text. As a result, one article may contain several coordinates scattered throughout the text, and some articles with coordinates do not actually describe any particular object.
2. There are external services that have done the titanic work of the inverse mapping, so it is now possible to get an article's URL from coordinates. Unfortunately, experience shows that these services are not always accurate (probably because replication happens with a long delay) and, besides, they are very unstable. We are thinking about repeating the feat ourselves, but for the time being we have to put up with the existing quality.

3. Articles vary widely in size. Once voiced, some of them cannot be listened to even in 30 minutes. In addition, speech synthesis itself takes time, and we cannot fit within the 2 seconds allotted for the whole content extraction process. Therefore, we have to parse the structure of the article and carefully separate out the annotation as the most significant part.

4. The physical objects that articles describe have a different “footprint” in reality. Some of them are cities, and for a tourist the visibility radius of such an object can be 10-20 kilometers. Others (a monument to a leader) have a radius of only 30 meters. We are now working on analyzing the semantics of articles, but for now every article is a circle with a radius of 100 meters. As a reminder, Toozla supports objects of arbitrary shape, which makes it possible to create complex sequential audio guides that you can start listening to from any point.

5. We did not spend long choosing the voice engine. Our benchmark was the quality of Russian speech, and here the undisputed leader is the well-known Swedish company Acapela. They have a convenient API, and the number of languages is quite satisfactory for the first stage.
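
To make problem 1 more concrete, here is a minimal sketch of pulling coordinates out of article text. It is not our production code. It assumes the article is available as raw wikitext and that coordinates are written with the English Wikipedia {{coord}} template, either as decimal degrees or as degrees/minutes/seconds with N/S/E/W letters. The function names and the rule "prefer the coordinate marked display=title, otherwise take the first one" are assumptions made for illustration.

```python
import re

# Matches {{coord|...}} in any letter case; assumes no nested templates inside.
COORD_RE = re.compile(r"\{\{\s*coord\s*\|([^{}]*)\}\}", re.IGNORECASE)


def parse_coord(params):
    """Return (lat, lon, is_title) or None if the form is not recognized."""
    tokens = [t.strip() for t in params.split("|")]
    # display=title (or inline,title) marks the coordinate of the article itself.
    is_title = any(
        t.replace(" ", "").startswith("display=") and "title" in t for t in tokens
    )
    values = []  # finished decimal components (latitude, then longitude)
    nums = []    # pending degree/minute/second numbers
    for t in tokens:
        if t.upper() in ("N", "S", "E", "W") and nums:
            deg = sum(n / 60 ** i for i, n in enumerate(nums))
            values.append(-deg if t.upper() in ("S", "W") else deg)
            nums = []
        else:
            try:
                nums.append(float(t))
            except ValueError:
                break  # first non-numeric token ends the coordinate part
    if len(values) == 2:
        return values[0], values[1], is_title
    if not values and len(nums) == 2:  # plain decimal form: lat|lon
        return nums[0], nums[1], is_title
    return None


def coords_in_article(wikitext):
    """All recognizable coordinates, in the order they appear in the text."""
    found = []
    for match in COORD_RE.finditer(wikitext):
        parsed = parse_coord(match.group(1))
        if parsed:
            found.append(parsed)
    return found


def primary_coord(wikitext):
    """Assumed heuristic: title coordinate if present, else the first one."""
    found = coords_in_article(wikitext)
    for lat, lon, is_title in found:
        if is_title:
            return lat, lon
    return found[0][:2] if found else None
```

An article that returns several coordinates and none marked as the title one is a good candidate for the "does not describe any particular object" case and probably deserves manual review.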
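
For problem 3, separating the annotation can be sketched roughly as follows. This is only an illustration under assumptions: the input is raw wikitext, the annotation is everything before the first section heading of the form == Title ==, and the character budget passed to trim_to_budget is a hypothetical stand-in for whatever limit keeps speech synthesis fast enough.

```python
import re

# A section heading line such as "== History ==" or "=== Early years ===".
HEADING_RE = re.compile(r"^==+.*==+[ \t]*$", re.MULTILINE)


def lead_section(wikitext):
    """Return the text that precedes the first section heading."""
    match = HEADING_RE.search(wikitext)
    lead = wikitext[:match.start()] if match else wikitext
    return lead.strip()


def trim_to_budget(text, max_chars):
    """Cut the text at the last sentence end that fits into max_chars."""
    if len(text) <= max_chars:
        return text
    cut = text[:max_chars]
    end = max(cut.rfind(". "), cut.rfind("! "), cut.rfind("? "))
    # If no sentence boundary fits, fall back to a hard cut.
    return cut[:end + 1] if end != -1 else cut
```

Cutting on a sentence boundary matters more for audio than for text: a voice that stops mid-sentence sounds broken, while a shorter but complete annotation does not.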
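
The "circle of radius 100 meters" from problem 4 boils down to a distance check between the listener and the article's point. Below is a minimal sketch assuming a spherical Earth and the haversine formula. The tuple layout, the function names and the optional per-article radius are hypothetical and only show how a city and a monument could later get different radii.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius, spherical approximation
DEFAULT_RADIUS_M = 100      # the radius currently used for every article


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def articles_in_range(user_lat, user_lon, articles):
    """articles: iterable of (title, lat, lon, radius_m or None).

    Returns the titles whose circle contains the user, nearest first.
    """
    hits = []
    for title, lat, lon, radius_m in articles:
        distance = haversine_m(user_lat, user_lon, lat, lon)
        if distance <= (radius_m if radius_m is not None else DEFAULT_RADIUS_M):
            hits.append((distance, title))
    return [title for distance, title in sorted(hits)]
```

For example, with a monument 55 meters away, another object a kilometer away and a city with a 20 km radius, the listener would hear the monument first and then the city:

```python
places = [
    ("Monument", 55.7500, 37.6200, None),
    ("Far object", 55.7600, 37.6200, None),
    ("City", 55.8000, 37.6200, 20_000),
]
print(articles_in_range(55.7505, 37.6200, places))
```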

The obvious name for the stream is Wikistream. Thanks to TarzanASG for the help, and thanks to everyone who tested our app before it went to the AppStore.
Details and an example of a robot voice.

Recall that Toozla can have both free and paid streams.
Wikistream content is provided free of charge, just like all of Wikipedia (Creative Commons CC-BY-SA 3.0 Unported license).
You can download the Java application for free from the website; the iPhone application is available in the AppStore.

Source: https://habr.com/ru/post/92558/

