
Yandex has announced on its blog a new feature of its Yandex.Disk file storage service. File search can now find JPEG, GIF and PNG images that contain the text of the search query, because the system has learned to recognize text in images.
Yandex claims to have built a universal OCR system that can recognize text in very different kinds of pictures. First, a neural-network-based image classifier picks out, from all of the user's files, the images that contain text. The text is then split into lines and the lines into individual letters, which the system recognizes while taking the peculiarities of the language into account.
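The "text into lines, lines into letters" step can be illustrated with a small sketch. It uses the classic projection-profile technique on a binarized image, which is an assumption made for illustration only: the announcement does not say how Yandex performs segmentation. The names `runs`, `split_lines`, `split_chars` and `segment` are hypothetical helpers, and the classifier and letter-recognition stages are left out.

```python
from typing import List, Tuple

# A binarized image: 1 = ink, 0 = background (assumed input format).
Image = List[List[int]]


def runs(flags: List[bool]) -> List[Tuple[int, int]]:
    """Return half-open (start, end) ranges of consecutive True values."""
    spans = []
    start = None
    for i, flag in enumerate(flags):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(flags)))
    return spans


def split_lines(img: Image) -> List[Tuple[int, int]]:
    """Text lines are runs of rows that contain at least one ink pixel."""
    return runs([any(row) for row in img])


def split_chars(img: Image, top: int, bottom: int) -> List[Tuple[int, int]]:
    """Within one line, letters are runs of columns that contain ink."""
    width = len(img[0])
    cols = [any(img[r][c] for r in range(top, bottom)) for c in range(width)]
    return runs(cols)


def segment(img: Image) -> List[List[Tuple[int, int, int, int]]]:
    """Return, per line, the (top, bottom, left, right) box of each letter."""
    return [
        [(top, bottom, left, right)
         for left, right in split_chars(img, top, bottom)]
        for top, bottom in split_lines(img)
    ]


page = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 0, 0],
    [0, 1, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
boxes = segment(page)  # two lines, two letter boxes in each
```

Each box would then be passed to a character recognizer. Real photos need much more than this (skew correction, touching letters, noisy backgrounds), which may be part of why photos are harder to recognize than screenshots.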
Yandex promises different recognition accuracy for different types of pictures: 80% for scanned documents in Russian, 63.2% for photos with inscriptions, and almost 100% for screenshots. Besides Russian, the system also recognizes English, Ukrainian and Turkish. The average text recognition accuracy is around 70%.