In the course of our work, we face the daily problem of setting development priorities. Given the high dynamics of the IT industry and the ever-growing demand from business and the state for new technologies, every time we choose a development vector and invest our own resources and the scientific potential of our company, we make sure that our research and projects remain fundamental and interdisciplinary.
Therefore, while developing our core technology, the HIEROGLYPH data recognition framework, we care both about improving the quality of document recognition (our main line of business) and about applying the technology to related recognition problems. In this article we describe how, on the basis of our document recognition engine, we built recognition of larger, strategically important objects in a video stream.
The task: using the existing groundwork, build a tank recognition system that classifies the object and determines its basic geometric parameters (orientation and distance) under poorly controlled conditions, without specialized equipment.
As the main approach we chose statistical machine learning. One of the key problems of machine learning, however, is the need for a sufficient amount of training data. Obviously, natural images of real scenes containing the objects we need were not available to us, so we decided to generate the required training data ourselves, since we have considerable experience in this area. Still, fully synthesizing the data for this task seemed unnatural, so a special scale model was built for simulating real scenes. The model contains various objects imitating a rural landscape: characteristic ground cover, bushes, trees, fences, etc. Images were captured with a small-format digital camera. During capture, the background of the scene was changed substantially to make the algorithms more robust to background variation.
The target objects were models of four battle tanks: the T-90 (Russia), M1A2 Abrams (USA), T-14 (Russia), and Merkava III (Israel). The objects were placed at different positions on the mock-up, thereby extending the range of admissible viewing angles. Engineering barriers, trees, bushes, and other landscape elements also played a significant role.
Thus, in a couple of days we collected a set sufficient for training and for subsequent evaluation of the algorithm's quality (several tens of thousands of images).
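For illustration, here is a minimal sketch of how such a captured image set can be divided into training and evaluation subsets; the directory layout, file extension, and 80/20 split ratio are assumptions for the example, not a description of our actual pipeline.

```python
# A minimal sketch of splitting captured images into training and evaluation
# subsets; the directory layout and 80/20 ratio are illustrative assumptions.
import random
from pathlib import Path

def split_dataset(image_dir, train_ratio=0.8, seed=42):
    """Return (train, eval) lists of image paths using a fixed random seed."""
    images = sorted(Path(image_dir).glob("*.jpg"))
    random.Random(seed).shuffle(images)
    cut = int(len(images) * train_ratio)
    return images[:cut], images[cut:]

train_set, eval_set = split_dataset("captured_frames")
print(len(train_set), "training images,", len(eval_set), "evaluation images")
```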
We decided to split recognition into two stages: object localization and object classification. Localization was performed with a trained Viola-Jones classifier (after all, a tank is an ordinary rigid object, no worse than a face, so the Viola-Jones method, which is "blind to details", quickly localizes the target object). Classification and angle estimation, however, were entrusted to a convolutional neural network: in this task it is important that the detector reliably picks out the features that, say, distinguish a T-90 from a Merkava. As a result, we built an effective composition of algorithms that successfully solves the localization and classification of similar objects.
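To make the scheme concrete, below is a minimal sketch of such a two-stage pipeline in Python, with OpenCV for the Viola-Jones stage and PyTorch for the classification stage. The cascade file name, the network architecture, the crop size, and the class handling are illustrative assumptions rather than our actual implementation, and the angle-estimation head is omitted for brevity.

```python
# Sketch of the two-stage scheme: Viola-Jones localization followed by a CNN
# classifier. "tank_cascade.xml" and the TankNet layout are hypothetical.
import cv2
import torch
import torch.nn as nn

CLASSES = ["T-90", "M1A2 Abrams", "T-14", "Merkava III"]

class TankNet(nn.Module):
    """Small CNN that predicts the tank class for a localized crop."""
    def __init__(self, n_classes=len(CLASSES)):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(64, n_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

detector = cv2.CascadeClassifier("tank_cascade.xml")  # Viola-Jones localizer
net = TankNet().eval()                                 # classification stage

def recognize(frame_bgr):
    """Localize candidate objects, then classify each crop with the CNN."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    boxes = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    results = []
    for (x, y, w, h) in boxes:
        crop = cv2.resize(frame_bgr[y:y + h, x:x + w], (64, 64))
        tensor = torch.from_numpy(crop).permute(2, 0, 1).float().unsqueeze(0) / 255
        with torch.no_grad():
            label = CLASSES[net(tensor).argmax(1).item()]
        results.append(((x, y, w, h), label))
    return results
```

In the same spirit, a second regression head on the CNN could estimate the viewing angle of the localized object; we leave it out here to keep the sketch short.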
Next, we ran the resulting program on all the platforms we have (Intel, ARM, Elbrus, Baikal, COMDIV) and optimized the computationally heavy algorithms to improve performance (we have written about this before, for example in https://habr.com/ru/company/smartengines/blog/438948/ and https://habr.com/ru/company/smartengines/blog/351134/), achieving stable real-time operation of the program on the device.
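A minimal sketch of such a real-time processing loop is shown below; it reuses a recognize(frame) function like the one sketched earlier, and the camera index and FPS-reporting interval are assumptions for illustration, not parameters of our product.

```python
# Sketch of a real-time video-stream loop that reports the achieved FPS;
# the recognize() callable is the two-stage pipeline sketched above.
import time
import cv2

def run_stream(recognize, camera_index=0, report_every=100):
    """Process frames as they arrive and periodically report the frame rate."""
    cap = cv2.VideoCapture(camera_index)
    frames, start = 0, time.time()
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        detections = recognize(frame)      # localization + classification
        frames += 1
        if frames % report_every == 0:
            fps = frames / (time.time() - start)
            print(f"{fps:.1f} FPS, last frame: {detections}")
    cap.release()
```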
As a result of all the steps described above, we have a full-fledged software product with significant tactical and technical characteristics.
So, we present our new development, Smart Tank Reader, a program for recognizing tank images in a video stream, which:
- localizes tanks in the frame and classifies them among the four models listed above (T-90, M1A2 Abrams, T-14, Merkava III);
- determines the basic geometric parameters of the object (orientation and distance);
- works in real time on the Intel, ARM, Elbrus, Baikal, and COMDIV platforms.
Usually, in the conclusion of our articles on Habr, we give a link to a marketplace where anyone can download a demo version of the application to their mobile phone and evaluate the performance of the technology for themselves. This time, given the specifics of the resulting application, we simply wish all our readers never to face the problem of quickly determining which side a tank belongs to.
Source: https://habr.com/ru/post/446176/