
Offline Continuous Handwriting Recognition

Foreword

As is well known, recognizing continuous (cursive) handwritten text in off-line mode is still considered an unsolved problem.

I have managed to solve this problem both theoretically and practically. The practical part currently exists as a demo version of a program. The solution is general: it is not limited to any particular application, language, or dictionary size.

About the program

The program is fully trainable. The training process is simple: you write characters in on-line mode, and the program generalizes them and extracts the writing algorithm. This is the first stage of training. The second stage takes place during operation. If the program encounters a symbol whose general writing algorithm matches one of the known ones, but the values of some properties fall outside the ranges calculated at the first stage, the ranges are expanded. Of course, this happens only after the user confirms the overall recognition result. Incidentally, at the first stage three to seven samples of a character are enough for the algorithm to be ready.
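The two training stages can be sketched roughly as follows. This is only an illustration, assuming that each character sample is reduced to a set of named numeric properties. The class `SymbolModel`, its methods, and the property names are hypothetical and are not taken from the program.

```python
class SymbolModel:
    """Hypothetical model of one character: an allowed (low, high) range per property."""

    def __init__(self):
        self.ranges = {}  # property name -> (low, high)

    def expand(self, sample):
        """Widen each range just enough to include the sample's values."""
        for name, value in sample.items():
            low, high = self.ranges.get(name, (value, value))
            self.ranges[name] = (min(low, value), max(high, value))

    def train(self, samples):
        """Stage one: derive ranges from a few on-line samples of the character."""
        for sample in samples:
            self.expand(sample)

    def out_of_range(self, sample):
        """Names of properties whose values fall outside the learned ranges."""
        return sorted(
            name for name, value in sample.items()
            if name not in self.ranges
            or not (self.ranges[name][0] <= value <= self.ranges[name][1])
        )

    def confirm(self, sample, user_confirmed):
        """Stage two: expand the ranges only after the user confirms the result."""
        if user_confirmed:
            self.expand(sample)


# Three presentations of one character (property names and values are made up).
model = SymbolModel()
model.train([
    {"loop_height": 0.40, "slant": 10.0},
    {"loop_height": 0.55, "slant": 14.0},
    {"loop_height": 0.48, "slant": 8.0},
])

seen = {"loop_height": 0.62, "slant": 12.0}
outliers = model.out_of_range(seen)       # properties outside the learned ranges
model.confirm(seen, user_confirmed=True)  # ranges grow to cover the new sample
```

In this sketch the ranges only ever grow, and only on confirmed results, which mirrors the rule described above.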
Theory

A little about the theory. There are several approaches to this problem, usually divided into two types: structural and template-based. The first is based on extracting and analyzing the structural elements of a symbol and their properties. The second compares the character being recognized with a set of predefined templates. Neither kind of method solves the problem in the general case.

The task of handwriting recognition in on-line mode has been fully and successfully solved. In every case the solution is based on building algorithms for writing characters that take into account the trajectory of the pen, that is, the sequence in which its coordinates change. There have been proposals to reduce off-line recognition to on-line recognition. For that, it would be enough to correctly read the lines from a graphic copy of the text. But this is fundamentally impossible. You can read the line segments between intersections, but connecting them correctly already requires interpretation.
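To see why joining the segments requires interpretation, consider a single crossing where four segment ends meet. The image alone does not say which end continues into which. A minimal sketch enumerates the candidates. The function `pairings` and the labels are illustrative only and are not part of the program.

```python
def pairings(ends):
    """Yield every way to join the segment ends meeting at a crossing into pairs."""
    if not ends:
        yield []
        return
    first, rest = ends[0], ends[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


# Four segment ends (hypothetical labels) meet at one crossing point.
options = list(pairings(["a", "b", "c", "d"]))
for option in options:
    print(option)
```

One crossing of two lines already gives three candidate connections, and the candidates multiply with every additional crossing in the word.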

Only one solution remains: to reconstruct the characters while interpreting the segments obtained when reading the digital graphic copy of the text. This requires two components: a special representation of the algorithm for writing a symbol that makes this possible, and a segment interpretation algorithm that can analyze all possible interpretations.
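The idea of analyzing all possible interpretations can be illustrated with a toy search. This is not the author's algorithm. It assumes, purely for illustration, that the segments are ordered left to right, that each symbol consists of consecutive segments, and that a hypothetical table `templates` says which symbols a group of segments can form.

```python
def interpretations(segments, templates, max_group=3):
    """Yield every reading of `segments` as a sequence of known symbols.

    Assumes (for illustration only) that segments are ordered left to right
    and that each symbol is made of consecutive segments.
    """
    if not segments:
        yield []
        return
    for size in range(1, min(max_group, len(segments)) + 1):
        group = tuple(segments[:size])
        for symbol in templates.get(group, []):
            for rest in interpretations(segments[size:], templates, max_group):
                yield [symbol] + rest


# Hypothetical segment shapes and the symbols each group of shapes can form.
templates = {
    ("arc",): ["c"],
    ("stem",): ["l", "i"],
    ("arc", "stem"): ["d", "a"],
}

readings = ["".join(r) for r in interpretations(["arc", "stem", "stem"], templates)]
print(readings)
```

Even three segments and a tiny table produce eight readings here, which is why the choice among them has to be deferred to a later stage of interpretation.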

Practice

All of this has been done. The main task of a demo version is, of course, to demonstrate that the problem is solved in principle. What, in this sense, can the prototype do now? The program can recognize a single word written in any continuous handwriting on white paper. To convert it into a digital file, the word can be scanned or photographed with a webcam or digital camera. In principle, recognition of whole texts has also been implemented, but this feature still needs some work.

Below are examples of recognized words. As you can see, they include not only ordinary handwriting but also “complicated” variants: crossed-out words, symbols written in separate pieces, symbols with superfluous parts, and the like. This shows that in its finished form the program will be able to recognize quite noisy texts.

[Images: four examples of recognized handwritten words]

Obviously, only characters that have all the necessary parts approximately in place can be recognized with confidence. If parts are missing or heavily distorted, interpretation at the word level is needed. A dictionary raises the recognition rate but does not solve every problem. In some cases a word cannot be interpreted unambiguously without understanding the meaning of the phrase. This requires an artificial intelligence system capable of understanding the meaning of natural-language phrases. Until recently there was no information about such systems being available on the market. Now there is: ABBYY has announced the Compreno system, which uses semantic interpretation of phrases based on a “world model” that does not depend on a specific language.
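A small sketch shows both how a dictionary helps and where it stops helping. It assumes a hypothetical recognizer that returns several candidate letters for each position of a word. The function `dictionary_readings`, the candidates, and the dictionary are all made up for illustration.

```python
from itertools import product


def dictionary_readings(candidates, dictionary):
    """Return the dictionary words consistent with per-position letter candidates."""
    words = ("".join(letters) for letters in product(*candidates))
    return sorted(word for word in set(words) if word in dictionary)


# Hypothetical recognizer output: possible letters for each position of a word.
candidates = [["c", "e"], ["a", "o"], ["t", "l"]]
dictionary = {"cat", "cot", "eat", "dog"}

matches = dictionary_readings(candidates, dictionary)
print(matches)
```

The dictionary cuts eight letter combinations down to three words, but choosing among the three still depends on the meaning of the surrounding phrase.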

I also have a prototype of an AI system capable of understanding the meaning of text. Judging by the information about Compreno now available in the media, my system is functionally much broader. It is trainable, can generalize information, and actively seeks knowledge when it does not have enough to complete a task. In other words, such a system is fully capable of working as a personal secretary. But it has one serious drawback compared to Compreno: in terms of overall readiness, it has not yet reached even the demo stage.

Commerce

And finally, a little about the commercial side of the project. There is an interview on the Internet with the Vice-President of ABBYY Lingvo, Aram Pakhchanyan. Regarding the recognition of continuous handwritten text in off-line mode, he says, in essence, that this problem should not be solved: the cost of solving it (very large, one must assume) will not pay off. And this is apparently mainly because ABBYY Lingvo has practically made continuous writing irrelevant already: it has completely solved the problem of recognizing hand-printed (separate) characters and has developed suitable forms for all occasions.

Perhaps it was a joke. Still, it makes sense to say the following. Writing in ordinary continuous handwriting is more convenient and easier than writing letters in boxes. If the computer recognizes the former as well as the latter, the latter will become a thing of the past, just like punched cards, black-and-white televisions, and camera film.

The short video below shows the program in action. You may find it interesting.


Conclusion

One more important point: the performance indicators, namely recognition time and recognition rate. The demo, of course, focused on the second criterion. A rate of at least 70% has been reached so far. For the finished version, this indicator can be formulated as follows: if a person can read the text, so can the program. As for recognition time, all we can say for now is that it can be brought down to acceptable values.

If all goes well, there will be more articles on some technical aspects of text recognition and on AI.

Thank you for your attention.
____________
Update.
Dear Habr readers! Thank you all for your feedback; it is very important and useful to us. On the whole, the topic was received positively, which is gratifying.

To those who are indignant, I would like to say: dear readers, we are not fairground conjurers. We stand by our words. If we wrote that recognition accuracy in the finished product will approach 100%, then we are sure of it.

This article can be considered an announcement; its goal was not to disclose all the technical details. However, given the interest shown, another article describing the recognition process in more detail will follow after some time.

A demo version of the program will also be available for download.

Source: https://habr.com/ru/post/136165/

