
Camera support and digit recognition in Opera browser

Yesterday I got thoroughly carried away. Typing apartment account numbers into the “Gosuslugi” form seemed boring and sad, so I decided to build automatic number recognition and, along the way, learn how to work with the camera from the browser.



Recognition



Camera access is an experimental standard, so few browsers support it. I used a special build, Opera-Labs-Camera, which is available from the Opera snapshots site. My example works only there, so if you want to try it, you will need to download that build.



Generally speaking, Chrome has a similar API; you only need to enable it in the config. I added support for it but did not test it.


The API turned out to be simple. All you need to do is connect the VIDEO tag to the video stream source in a special way. Then, 10 times per second, I grab a video frame and put it into a CANVAS (the colored one in the screenshot). From there I cut out a small area, convert it to grayscale, take 75% of the average value as a threshold, and convert everything to black and white.



    var vid = document.getElementById('video-stream');

    // Fall back to the WebKit-prefixed implementation where it exists
    navigator.getUserMedia = navigator.getUserMedia || navigator.webkitGetUserMedia;

    if (navigator.getUserMedia) {
        // getUserMedia takes the media type and a callback
        // that receives the stream once access is granted
        navigator.getUserMedia('video', function (stream) {
            // Connect the stream to the VIDEO element
            vid.src = window.webkitURL ? window.webkitURL.createObjectURL(stream) : stream;
        });
    }
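The grayscale and threshold step described above can be sketched roughly as follows. This is not the author's code: the helper name `binarize`, the plain channel average used for grayscale, and the choice to treat pixels darker than the threshold as black are all assumptions made for illustration.

```javascript
// Hypothetical helper: converts RGBA pixel data (laid out like ImageData.data)
// into a flat array of 0/1 values, where 1 means "black".
function binarize(rgba, thresholdFactor) {
    // Assumption: 75% of the average brightness is the default cut-off
    var factor = thresholdFactor === undefined ? 0.75 : thresholdFactor;
    var gray = [];
    var sum = 0;
    for (var i = 0; i < rgba.length; i += 4) {
        // Plain average of R, G and B; the alpha channel is ignored
        var value = (rgba[i] + rgba[i + 1] + rgba[i + 2]) / 3;
        gray.push(value);
        sum += value;
    }
    var threshold = (sum / gray.length) * factor;
    // Assumption: anything darker than the threshold becomes black
    return gray.map(function (value) {
        return value < threshold ? 1 : 0;
    });
}
```

In the browser, the input would typically come from `context.getImageData(x, y, width, height).data` after drawing the current frame with `context.drawImage(vid, 0, 0)`, with the whole step run from a `setInterval` callback every 100 ms.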




Most of the three hours I spent on this went into the digit recognition itself. The algorithms I found on the Internet were monstrous, and rewriting something that already exists was boring, so I tried several ideas of my own.



In the end, the following algorithm recognized digits best.







First, obviously, I split the image into digits. This part is completely uninteresting, but I will describe it briefly. Moving up from the bottom, I look for the lower boundary where black pixels appear, then keep moving until they disappear. The same goes for left and right. Then, in much the same way, I look for the boundaries of the individual digits, discarding runs that are too short and cutting runs that are too long (in case there are stains on the paper or digits stuck together).



This yields the bounding rectangles of the digits.
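A minimal sketch of this splitting step might look like the code below. It assumes the image is already a 2D array of rows holding 0 (white) or 1 (black). The function names and the `minWidth` / `maxWidth` parameters are hypothetical, and the vertical scan is simplified to "first and last row containing black" rather than the author's exact procedure.

```javascript
function rowHasBlack(bitmap, y) {
    return bitmap[y].indexOf(1) !== -1;
}

function columnHasBlack(bitmap, x, top, bottom) {
    for (var y = top; y <= bottom; y++) {
        if (bitmap[y][x] === 1) return true;
    }
    return false;
}

// Returns bounding boxes {left, right, top, bottom} (inclusive coordinates)
function findDigitBoxes(bitmap, minWidth, maxWidth) {
    var height = bitmap.length;
    var width = height ? bitmap[0].length : 0;

    // Vertical extent of the text: first and last rows containing black
    var top = 0;
    while (top < height && !rowHasBlack(bitmap, top)) top++;
    if (top >= height) return []; // nothing black in the image
    var bottom = height - 1;
    while (bottom > top && !rowHasBlack(bitmap, bottom)) bottom--;

    var boxes = [];
    var start = -1; // start column of the current run of black columns
    for (var x = 0; x <= width; x++) {
        var black = x < width && columnHasBlack(bitmap, x, top, bottom);
        if (black && start === -1) {
            start = x;
        } else if (!black && start !== -1) {
            // Runs shorter than minWidth are treated as noise and dropped
            if (x - start >= minWidth) {
                // Runs longer than maxWidth are cut into maxWidth-wide pieces
                for (var left = start; left < x; left += maxWidth) {
                    boxes.push({
                        left: left,
                        right: Math.min(left + maxWidth, x) - 1,
                        top: top,
                        bottom: bottom
                    });
                }
            }
            start = -1;
        }
    }
    return boxes;
}
```

Cutting an over-long run at fixed intervals is the crudest possible choice; it is used here only to keep the sketch short.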



The rest is shown in the picture: each digit is notionally split by a vertical line down the middle. To the left and right of that line I count the number of transitions to black, moving outward from the center, row by row. This gives a pair of numbers for each row.



For example, in the picture there is one transition on each side of the line in every row except row 4, where there is no transition on the right.



I then thin them out: identical consecutive pairs are replaced by a single one. As a result, the record for a “six” looks like this: 1 | 1, 1 | 0, 1 | 1. The vertical bar here stands for my dividing line.
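The counting and thinning can be sketched as follows. The function names and the tiny 5x7 bitmap of a “six” are made up for illustration, and the sketch assumes each scan starts in a "white" state, so a black pixel right next to the dividing line counts as a transition. The text does not spell out that detail.

```javascript
// Counts white-to-black transitions along a sequence of 0/1 pixels
function countTransitionsToBlack(pixels) {
    var count = 0;
    var previous = 0; // assumption: the scan starts from white
    for (var i = 0; i < pixels.length; i++) {
        if (pixels[i] === 1 && previous === 0) count++;
        previous = pixels[i];
    }
    return count;
}

// Builds the thinned list of "left|right" pairs for one cropped digit
function sideSignature(digit) {
    var signature = [];
    digit.forEach(function (row) {
        var mid = Math.floor(row.length / 2);
        // Left half is reversed so that both scans move away from the center
        var left = countTransitionsToBlack(row.slice(0, mid).reverse());
        var right = countTransitionsToBlack(row.slice(mid));
        var pair = left + '|' + right;
        // Thinning: skip a pair identical to the previous one
        if (signature[signature.length - 1] !== pair) signature.push(pair);
    });
    return signature;
}

// Hypothetical 5x7 bitmap of a "six" (1 = black)
var SIX = [
    [0, 1, 1, 1, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 1, 1, 1, 0],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [0, 1, 1, 1, 0]
];

console.log(sideSignature(SIX).join(', ')); // prints "1|1, 1|0, 1|1"
```

For this bitmap the result matches the record for the “six” given above.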



The resulting chains already characterize a digit quite well, although some digits cannot be told apart by this pattern alone. So I also count the number of black/white transitions along my imaginary vertical line.



For example, “0” and “8” give the combination “1 | 1”, but differ in the number of transitions on the middle line: two and three, respectively.
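To illustrate the middle-line check, here is a small sketch that counts transitions to black down the central column. The function name and the two 5x5 bitmaps are hypothetical, and, as an assumption, the scan again starts from a white state.

```javascript
// Counts white-to-black transitions along the central column of a digit
function middleLineTransitions(digit) {
    var mid = Math.floor(digit[0].length / 2);
    var count = 0;
    var previous = 0; // assumption: the scan starts from white
    for (var y = 0; y < digit.length; y++) {
        var pixel = digit[y][mid];
        if (pixel === 1 && previous === 0) count++;
        previous = pixel;
    }
    return count;
}

// Hypothetical 5x5 bitmaps (1 = black)
var ZERO = [
    [0, 1, 1, 1, 0],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [0, 1, 1, 1, 0]
];
var EIGHT = [
    [0, 1, 1, 1, 0],
    [1, 0, 0, 0, 1],
    [0, 1, 1, 1, 0],
    [1, 0, 0, 0, 1],
    [0, 1, 1, 1, 0]
];

console.log(middleLineTransitions(ZERO));  // prints 2
console.log(middleLineTransitions(EIGHT)); // prints 3
```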



You can see these combinations in the large text field in the top screenshot. Each entry starts with the digit's ordinal position, followed by the middle-line count and, after the colon, the pairs of counts for the left and right sides.



The algorithm is noisy, because the hand holding the paper trembles and because of the lighting. So I accumulate statistics and show, for each position, the digit that was recognized there more often than any other. I reset the statistics whenever the number of objects on the screen changes.
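The accumulation of statistics could be sketched like this. It is an illustration under assumptions, not the original code: `createVoter` is a hypothetical name, each frame's result is assumed to arrive as a string of digits, and ties are resolved in favor of the most recently seen digit.

```javascript
// Returns a function that takes one frame's recognized digits (a string)
// and returns the most frequent digit seen so far at each position
function createVoter() {
    var counts = [];     // counts[position][digit] = number of times seen
    var lastLength = -1; // number of digits in the previous frame

    return function vote(frame) {
        var digits = frame.split('');
        // Reset the statistics when the number of digits on screen changes
        if (digits.length !== lastLength) {
            counts = digits.map(function () { return {}; });
            lastLength = digits.length;
        }
        return digits.map(function (digit, position) {
            var tally = counts[position];
            tally[digit] = (tally[digit] || 0) + 1;
            // Pick the digit with the highest count at this position
            var best = digit;
            Object.keys(tally).forEach(function (candidate) {
                if (tally[candidate] > tally[best]) best = candidate;
            });
            return best;
        }).join('');
    };
}

var vote = createVoter();
vote('123');
vote('123');
console.log(vote('128')); // prints "123": a single misread "8" is outvoted
```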



It is not an ideal algorithm, but I was not aiming for a production-ready solution.



I am not quoting the code here because it is quite long; it is available at the link.

Source: https://habr.com/ru/post/141624/


