
Can license plate recognition be squeezed into any Tamagotchi?

We wrote about license plate recognition on Habr long ago, and I hope it was even interesting. It seems the time has come to explain how it is applied, why it is needed at all, and where it can be squeezed in. And, most importantly, how all of this has changed in recent years with the arrival of new machine vision algorithms.



Oddly enough, plate recognition is used for more than enforcing traffic rules. There are dozens of applications, and each has its own specifics: target quality, input images, speed requirements, and so on. Let's start with the simplest and most mundane:

Traffic Control


To begin with, it is worth telling a few stories about how traffic cameras are used. We deal with this area only superficially ourselves, but we have talked to many large firms and understand the state of the industry more or less.

For example, a simple question: who, in your opinion, installs the cameras and monitors their operation?


Speed control

Oddly enough, this is the prerogative of private companies. Traffic police inspectors receive the fines in their database, check that everything is filled in correctly, and send them out to you and me. The revenue from fines is split 50/50: half goes to the concessionaire that services the camera, half to the budget.

A concession is a contract to install and maintain a certain number of cameras. Regions usually put up a couple of dozen cameras as a single lot, but it varies. One of the most epic examples of a concession is how the Cossacks issue fines (and, by the way, this is not the only such case).

Today, several kinds of violations are detected (the list is not complete; I name the main ones):

  1. Speed control
  2. Traffic-light control
  3. Lane-marking control
  4. Moving control


Moving control

Outside Moscow, the tastiest morsel is speed control; that is where most of the fines come from. Cameras at traffic lights and on lane markings quickly stop making a profit: people start driving properly after the first fine, especially in a small region. Speeding, however, brings steady revenue, especially on major highways.
A couple of years ago, all speed cameras were radar-based. But the radar is the most expensive part of the camera, and for 15 years everyone has been trying to get rid of it.


Speed control with tripod

Only a year or two ago was the first radarless system certified. Now there are already a few. To be honest, I do not know how many there are today; I think three, but I could be wrong. Because many new plate recognition algorithms have appeared, quality has improved, and good hardware has become cheaper, a large-scale replacement, expansion, and renewal of the camera fleet is now under way. There are many more locations where a camera can be installed and pay for itself. Most likely, 80 percent of installations will be converted to radarless systems over the next five years.

Given that the cameras are installed by concessionaires, the business is terribly profitable and liquid.
The cameras can be mounted in very different places: on a tripod, on a special gantry, in stationary roadside cabinets, or on ordinary poles.

About 10 years ago the algorithms were weaker and plates were recognized worse. Now you can get good quality even on a mobile phone, which widens the choice of installation sites.
What does "recognized worse" mean? In fact, a well-installed camera with proper lighting could recognize about 95-98% of passing plates. That was before neural networks, 5-7 years ago. Now the percentage is slightly higher, but not by much. The plates that remain unrecognized are smeared with mud, covered in snow, or damaged; 90% of what the system cannot read, a person cannot read either. What has improved greatly is recognition on cameras in bad conditions: hung on a crooked pole off to the side, given too little light, and so on. Now all of that works with a minimal amount of tuning.

Of course, in a snowstorm recognition quality drops to zero, but that does not bother anyone.
Despite the difficulty of entering the market and its overregulation by the state (certifying a plate recognition system takes at least half a year), the market is oversaturated with solutions.

Parkon


What sets Parkon systems apart is that image quality is worse than anywhere else. Shooting is done either from moving cars or by inspectors on foot. The data arriving for recognition can look something like this (recorded with a dashcam and recognized by our algorithm, so it has nothing to do with Parkon; the video is just an example):


Because recognition quality is poor, what the system recognized has to be checked constantly (in Parkon itself the camera is much better, the lighting is set up, etc.). In addition, one has to check that it really is the same car standing in the parking spot. For example, at DIT, as I have heard, the algorithms are still of the previous generation. A year ago, fines were processed by a shop floor of almost 100 workers who verified all the data by eye (all of this is rumor circulating in the field and may not correspond to reality).

By the way, data from inspectors on foot is verified as well. This is an extra safeguard so that an inspector cannot issue a fine to a personal enemy.

With new-generation algorithms, verification is no longer so critical, so I very much hope they have already updated their pipeline.

In any case, the main difficulty is not recognizing the plate but making the system robust to the data it receives and eliminating errors.
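One common way to make such a pipeline less sensitive to a single bad frame is to read the same car several times and send the result to a human only when the reads disagree. The sketch below is an illustration, not the method used by Parkon or by the authors; the function name `consolidate_reads` and the `min_share` threshold are hypothetical.

```python
from collections import Counter


def consolidate_reads(reads, min_share=0.6):
    """Pick the plate most frames agree on and flag weak agreement.

    reads: plate strings from several frames of the same car
           (None or "" for frames where nothing was read).
    Returns (plate, needs_review). needs_review is True when the winning
    plate got less than min_share of the non-empty reads.
    """
    reads = [r for r in reads if r]          # drop empty reads
    if not reads:
        return None, True                    # nothing read: always review
    plate, votes = Counter(reads).most_common(1)[0]
    needs_review = votes / len(reads) < min_share
    return plate, needs_review
```

With three reads where two agree, the majority plate is returned without review; with two conflicting reads, the first one is returned but flagged for an operator.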

Recognition on smartphones / mobile devices


From inspectors on foot we can move smoothly to a broader category: recognition on smartphones.



Plate recognition can be done in two ways: on the server or on the device. Recognition on the device is harder; two years ago it was impossible to do it well.

We will return to comparing the two approaches. First, let me describe where this kind of recognition is needed:

The first task is speeding up data entry. It arises when an employee has to fill in a document out on the street: a traffic inspector, an insurance agent, a car dealer. The employee photographs the car and presses a button, and the recognized plate number is entered into the document automatically.

It is not much of a task, but they say it helps people in winter.

The second task solved by recognition on a smartphone is vehicle control on private territory: parking lots, unloading areas, and so on. An inspector walks around and, right after taking a picture, gets the necessary information on a tablet.
A distinctive feature of recognition on smartphones is that the person corrects the number if it was recognized incorrectly (which does not happen with Parkon). The psychological threshold of correct recognition for the first task is about 80%, so even previous-generation algorithms coped well. We tried our previous algorithm on several such tasks. It works fine, but the economic effect is unclear. The pilots did not take off; it was used only once, when a system's customer ordered such a feature and the contractor turned to us. And even then it was all for show.

For the second task you want about 95%, and only modern algorithms deliver that.

Some error rate is acceptable. People still check everything manually, and if something is wrong they verify and correct it.

Barriers


For better or worse, Russia is a country of barriers. Barriers can be absolutely anywhere: at parking lots, at entrances to enterprise grounds, at supermarket entrances, at the entrance to your summer cottage community or residential courtyard.



Wherever a territory has to be controlled and the response has to be quick, the easiest way is to put up a barrier. But once you start digging, it turns out to be a mess.

No, even the most advanced recognition algorithm does not give 100% quality. Of course, recognition has made a huge leap recently, and for barriers quality has jumped too: it was 93-95 percent and became 97-98. The growth came from the cases with dirt and poor installation: too little light, steep angles, a bad combination of filters. Installation is now much easier and no longer requires a super-expert installer.

In reality, only a handful of solutions use the new technology. Usually things are worse.

Keeping 2-5% of cars out of the territory is unacceptable. Someone has to be able to correct the recognized number. A security guard?

Large warehouses with guarded territory went down this path long ago. Their plate recognition is tied to a nearby booth with a guard. Such a solution usually costs 50-100 thousand, depending on the hardware, camera, algorithm, and lighting. Plate recognition conditions are close to ideal. The solution is usually set up by an installer who knows the subject, and those services can add another ten or twenty thousand to the project.

And that is without the barrier, just for the recognition system!

The second approach, popular in Moscow courtyards, is a barrier that opens by phone call. You call a number, and if your phone is in the database, the barrier opens. But again: someone forgot their phone, someone's battery died, guests came to visit. As a result, access by plate number is very often added. And since hanging a good camera is difficult and expensive, and most solutions run old-generation algorithms, a guard ends up looking at the picture.

Besides, the phone scheme often fails: people who have access turn courtyards into private parking lots, to the detriment of their neighbors.



There are many companies that specialize in installing barriers in courtyards, but almost none of them try to add recognition at their sites, although the market is huge. In the half year since we built the new algorithm we have tried to launch several pilots, and one of them is still running successfully. The ideal scheme for any courtyard is phone/card + plate recognition + access to the camera for recognition.
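The combined scheme can be sketched as a simple decision function. This is a minimal illustration under assumptions: the names `decide_access`, `allowed_phones`, and `allowed_plates` are hypothetical, and the "operator" outcome assumes that a person with access to the camera picture makes the final call when neither credential matches.

```python
def decide_access(plate, phone, allowed_plates, allowed_phones):
    """Return "open" if any credential matches, otherwise "operator".

    plate: recognized plate string, or None if nothing was recognized.
    phone: caller's phone number, or None if nobody called.
    """
    if phone is not None and phone in allowed_phones:
        return "open"                 # classic open-by-call path
    if plate is not None and plate in allowed_plates:
        return "open"                 # plate recognized and whitelisted
    return "operator"                 # fall back to a human looking at the camera


allowed_plates = {"A123BC77"}
allowed_phones = {"+70000000001"}
print(decide_access("A123BC77", None, allowed_plates, allowed_phones))
```

The point of the fallback is that a misread plate or a dead phone battery ends in a quick human decision rather than a car stuck at the barrier.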

Statistics


Another interesting application is statistics on plate numbers. Shopping centers sometimes use this to find out how often people come (and often what cars they drive).

It can be installed at car washes to keep track of cars, at car service stations, at ticket booths, etc.
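Counting how often each car comes back takes only a few lines once plates are being logged. The sketch below assumes a hypothetical event log of `(plate, timestamp)` pairs and counts distinct visit days per plate; the data is made up for illustration.

```python
from collections import defaultdict
from datetime import datetime


def visit_days(events):
    """Count on how many distinct days each plate was seen.

    events: iterable of (plate, datetime) pairs from the recognition log.
    Returns a dict {plate: number_of_distinct_days}.
    """
    days = defaultdict(set)
    for plate, ts in events:
        days[plate].add(ts.date())    # several reads on one day count once
    return {plate: len(d) for plate, d in days.items()}


events = [
    ("A123BC77", datetime(2017, 11, 1, 9, 0)),
    ("A123BC77", datetime(2017, 11, 1, 18, 0)),
    ("A123BC77", datetime(2017, 11, 3, 12, 0)),
    ("B456DE99", datetime(2017, 11, 2, 10, 0)),
]
print(visit_days(events))
```

Since a 2-3% loss is tolerable for statistics, no correction step is included here.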

Server


Recognition on servers is very common. Everything except traffic enforcement can be recognized on servers. Here I will briefly list the tasks that use only server-side recognition; a more general discussion of usage follows a little further down:


One of our backup servers on a Jetson (why let a good piece of hardware sit idle!):



How and where should the algorithms work?

In listing the applications, I abstracted away from where and how everything is recognized: where to put the cascade of networks and where to do the processing.

In principle, if you summarize everything, there are not so many options:


How to use it all

In reality, all the questions above come down to one problem: what to do with recognition errors? What to do about quality?

If we fix the recognition algorithm, the quality of the system is determined by the quality of the equipment: light, lenses, installation point.

Optimizing the quality of this combination is a task familiar to every installer. There is almost nothing new to invent. Every company that installs cameras has its own stack of solutions, and which one is applied depends on how much money the client has for the task.
To catch errors, operators are usually brought in. All fines pass through them. Parking control systems go through operators as well. One operator can be enough for a couple of dozen barriers.

With recognition on smartphones, the user does the checking. If recognition is only for statistics, nobody bothers with checking; a loss of 2-3% is considered acceptable.

What we came to

We constantly received offers to test our algorithm, attach it to something, or put it to use. Plus, we ran a bunch of experiments ourselves. It has not yet come to a full-fledged deployment, but along the way we built about five other machine vision systems that are already on sale.

As the description above shows, plate recognition is an unstable thing. There are always errors, even if only 1%. I wanted to find a way to make the algorithm absolutely rock-solid, so that it worked in any task without much tuning or additions.

But even we started out doing everything wrong. Since we had a large margin of quality and good speed, one of the first thoughts was: "why not run it on an RPi and hang one on every barrier?" The hardware cost of this approach was minimal.



We took an RPi, soldered up a control board for the barrier, and stuffed it all in a box.

But then we realized one simple thing. Even for us it is hard to mount a camera at a barrier so as to protect it from every possible artifact. We can cope with headlights, the sun, a wrong mounting angle, or blur. But how is a person setting up plate recognition for the first time supposed to cope with all of that?!

It is hard to adjust the picture on a device buried somewhere in the depths of a barrier. You have to connect a laptop and align the camera.

We tried Bluetooth. Same thing. Even if you show the picture on a phone, it is not much easier. You can set everything up perfectly, and still after a while the device stops working. You have to crawl back into its guts and reconfigure, often at random. At some point you knock the camera out of alignment and spend 10 minutes wondering what is wrong.

Or the system works for two days and suddenly stops. What happened?

When we did recognition from mobile phones about 3 years ago, there were no such problems. Errors coming in? You filter out what was not recognized, try to extract the error pattern, and either refine the algorithm or train the operators. In 1-2 days the error is fixed.

This is clearly nicer than crawling under each barrier.

In the end we realized that the most stable option is recognition on the server. On the server it is much easier to understand the cause of errors, compensate for them, and show them. A tool that builds the day's statistics and finds the times when the system makes mistakes or does not work can be written in an hour. Recognition graphs:



The blue graph is recognition quality over time; the red one is the percentage of frames containing plates over time. Has quality dropped? An error shows up on the operator console. And to understand the error quickly, a map of the latest recognitions is displayed:



We immediately see that some of the plates are not being caught. What is the matter? Let's look closer:



The backlight is gone, and the plate is unreadable!
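The statistics behind such graphs can be sketched in a few lines. This is a hedged illustration, not the authors' tool: it assumes a hypothetical log of `(hour, has_plate, read_ok)` records per frame, where `read_ok` means the detected plate was read successfully, and it raises an alert for hours whose quality falls under an assumed 95% threshold.

```python
from collections import defaultdict


def hourly_stats(frames):
    """Aggregate per-hour monitoring numbers.

    frames: iterable of (hour, has_plate, read_ok) records.
    Returns {hour: (plate_share, quality)} where plate_share is the share of
    frames containing a plate and quality is the share of those plates that
    were read (None if no plates were seen in that hour).
    """
    total = defaultdict(int)
    detected = defaultdict(int)
    read = defaultdict(int)
    for hour, has_plate, read_ok in frames:
        total[hour] += 1
        if has_plate:
            detected[hour] += 1
            if read_ok:
                read[hour] += 1
    stats = {}
    for hour in sorted(total):
        share = detected[hour] / total[hour]
        quality = read[hour] / detected[hour] if detected[hour] else None
        stats[hour] = (share, quality)
    return stats


def low_quality_hours(stats, min_quality=0.95):
    """Hours that should show up as errors on the operator console."""
    return [h for h, (_, q) in stats.items() if q is not None and q < min_quality]


# Illustrative data: hour 9 is healthy, in hour 10 half the plates are unreadable.
frames = ([(9, True, True)] * 20 + [(9, False, False)] * 20
          + [(10, True, True)] * 5 + [(10, True, False)] * 5)
stats = hourly_stats(frames)
print(low_quality_hours(stats))
```

Tracking the share of frames with plates separately from quality helps tell "the camera sees no cars" apart from "the camera sees cars but cannot read them".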

Moreover, if something is not working, the error can often be corrected on the server. If the camera has turned toward a zone where it does not work, redo the homography. If the camera is out of focus, many models can be refocused remotely.

If the problem is with the installation, you can give a clear diagnosis: "from 10:00 to 10:30 the sun shines into the camera." If the camera is installed without internet access and statistics, catching errors of this kind is very difficult and tedious. With 2-3 such errors, quality can fall below 95% and the client will be disappointed in the system.
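A rough back-of-the-envelope estimate shows why a couple of such faults are enough. The numbers are illustrative assumptions, not measurements: a baseline quality of q_0 = 0.98, traffic spread evenly over 24 hours, and k installation faults that each blind the camera for 30 minutes a day, so each one costs a share f of the daily traffic.

```latex
\begin{align*}
q(k) &= q_0\,(1 - k f), \qquad f = \frac{0.5}{24} \approx 0.021 \\
q(1) &\approx 0.98 \times 0.979 \approx 0.960 \\
q(2) &\approx 0.98 \times 0.958 \approx 0.939 \\
q(3) &\approx 0.98 \times 0.938 \approx 0.919
\end{align*}
```

Under these assumptions the second fault already pushes daily quality under 95%. Real traffic is not uniform, so the actual effect can be larger or smaller.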

On top of that, the device itself becomes simpler. You can use an RPi + LAN camera + switch:



Or you can go all the way and use an Arduino with GPRS, which tells the server when to process the camera, receives the answer, and opens the barrier:



(Yes, an Arduino is expensive and not the optimal board. But when you need 10 units plus stability, developing your own board would cost more.)
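The server side of such a thin-client scheme can be sketched as one handler. Everything here is an assumption made for illustration: the handler name `handle_trigger`, the `OPEN`/`DENY` replies, and the `grab_frame` and `recognize` callables, which stand in for whatever camera access and recognition code the server actually runs.

```python
def handle_trigger(camera_id, grab_frame, recognize, allowed_plates):
    """Process one "car at the barrier" request from the thin client.

    camera_id:      identifier sent by the device at the barrier.
    grab_frame:     callable(camera_id) -> frame (hypothetical camera access).
    recognize:      callable(frame) -> plate string or None (hypothetical).
    allowed_plates: set of plates permitted to enter.
    Returns the short reply the device acts on.
    """
    frame = grab_frame(camera_id)
    plate = recognize(frame)
    return "OPEN" if plate in allowed_plates else "DENY"


# Stub implementations, only to show the call shape.
reply = handle_trigger(
    "gate-1",
    grab_frame=lambda cam: f"frame-from-{cam}",
    recognize=lambda frame: "A123BC77",
    allowed_plates={"A123BC77"},
)
print(reply)
```

Keeping the decision on the server means the device at the barrier only needs to send a trigger and drive a relay, and every request leaves a trace that can be analyzed later.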

This quality control rule applies to any deep learning system. There will always be errors. And the only way to make the customer like the system is to build in ways of catching and dealing with errors before they appear: dataset collection, online monitoring of statistics, additional training. All of this is best designed in so that errors are caught and compensated for before the client has time to notice that something is not working and take offense.

When we recognized goods on shelves, every recognition went to human verification. When we recognized containers, it went to operator verification. And so on. Not everything can be sent for verification, and not always. But you always need to come up with a method that tells you the algorithm has not fallen apart.

We had a wonderful experience when one of our customers shipped a half-full system consisting of several hundred modules (not plate recognition, a rather distant task). We developed the mathematical core for this system.

The system actually worked well. But its users were not very educated people, and they set about sabotaging it directly.

Our customer was a very sensible person: the system supported remote retraining, reflashing, and swapping the algorithm, and he had a whole support team. As a result, the sabotage was cut off very promptly, and algorithms were built to detect and handle it.

It is the same with license plates. The quality of the system must be monitored continuously. If problems arise, the customer may abandon the system even when the problem is on their side. Of course, there are always limits that cannot be crossed. But simply adding monitoring lets you solve 80% of problems proactively, through competent communication with the customer.

And it does not matter whether it is a server, a camera on the highway, at a courtyard barrier, or at a car wash. The main thing is to set up monitoring properly, so that it distracts as little as possible but controls as much as possible.

Source: https://habr.com/ru/post/343512/

