Did you know that video-system software vendors offer many solutions for applied retail tasks: modules for counting visitors, measuring queue length, monitoring cashier operations, and so on? Yet there are practically no offerings for industrial and manufacturing problems. The reason is that we, the developers of video surveillance software, rarely build for production the way we do for retail, and when we do, it is expensive.
Why? Let's figure it out.
Initially, video surveillance systems were invented for security. They solved traditional tasks: storing an archive, displaying video on operators' screens, motion detection, and archive search. Whatever industry the site with the video system belongs to, the approach to these tasks is the same. The technologies behind them, developed once, are successfully applied at millions of different sites.
Over time, video surveillance systems went further and learned to solve somewhat more complicated tasks, namely video analysis: recognizing faces, recognizing license plates, searching by various parameters. Again, the kind of site hardly matters and the process is roughly the same. For a license plate recognition module, for example, it makes no difference whether the car is at the gate of a shopping center or of a factory.
The next step was to solve narrower, specialized tasks that go beyond security. Such systems appeared first of all for retail. They can analyze customer movement, determine queue length, count visitors, and identify the busiest zones of the sales floor. And it all works quite effectively.
Macroscop heat map of traffic intensity in a lobby
Now it seems logical to take the next step: if we have taught systems to analyze video and solve applied problems at some sites, why not introduce this everywhere? Not only in retail but also, for example, in production, where such systems could then replace workers engaged in low-skilled labor.
No such luck...
In practice, it is not that simple. Basic security tasks are all of one type, and video analysis tasks for retail are also of one type (what exactly is sold matters little: stores operate in much the same way regardless of what is on the shelves). Production tasks, by contrast, are extremely diverse. Every production facility is different, and each has its own production process.
Judge for yourself. At Macroscop, we receive many requests for video analytics from various production companies. For example, our customers want to use video systems to:
• recognize a broken-off excavator tooth;
• determine the size fractions of crushed stone in the beds of dump trucks in a quarry;
• count plastic bottles on a pallet;
• detect gems on a conveyor belt.
Even these few tasks are so different and so specific that each of them requires individual development. Standard tools, developed once, do not solve them. And such individual development is expensive.
It stands to reason that a widely replicated product is cheaper than one produced in small volumes. Developing a technology and setting up production requires investment that has to be recouped through sales. The more copies of the product are sold, the smaller the share of the development cost built into the price of each unit. When your task is entirely individual, the product that solves it is produced as a single copy (and the likelihood that anyone else will ever buy it is small), so the developer's entire investment is included in its price. The proportion is simple: the lower the demand, the higher the price.
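The arithmetic behind this proportion can be illustrated with a small sketch. It assumes the simplest possible model, in which the one-time development cost is spread evenly over the copies sold; the function name `unit_price`, the `per_copy_cost` parameter, and all the figures are hypothetical and are not Macroscop prices.

```python
def unit_price(development_cost: float, copies_sold: int,
               per_copy_cost: float = 0.0) -> float:
    """Break-even price of one copy under an even-split assumption.

    development_cost: one-time investment in the technology (hypothetical).
    copies_sold:      number of copies the investment is spread over.
    per_copy_cost:    cost of delivering each extra copy (hypothetical).
    """
    if copies_sold < 1:
        raise ValueError("copies_sold must be at least 1")
    return development_cost / copies_sold + per_copy_cost


# Hypothetical development budget, in arbitrary currency units.
DEVELOPMENT_COST = 1_000_000

for copies in (1, 10, 1000):
    print(copies, unit_price(DEVELOPMENT_COST, copies))
```

With a single customer the whole budget lands in one price tag; with a thousand customers each one carries a thousandth of it.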
You can save money
Macroscop's practice also includes examples of customers solving non-standard production problems on their own at minimal cost, simply by showing ingenuity and using standard video analysis tools.
A life hack
For example, a company that produces roofing materials detects defects (holes) in its material with a video system that uses motion detection. Here is how their system works:

The material moves along a conveyor belt through a dark box. A light source is installed at the bottom of the box, pointing vertically upward (through the belt); the video camera is mounted above, opposite the source, pointing vertically downward. If the section of material inside the box has no holes, the camera image is completely black; if there is a defect, the camera picks up the gaps as spots of light. Macroscop is configured to react to the appearance of these light spots (using motion detection), and the resulting event triggers an external application that informs the operator of the defect.
The software component of this solution costs 1800 rubles (for one camera).
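The principle behind the dark-box setup can be sketched in a few lines. This is not how Macroscop's motion detector works internally; it is a simplified stand-in that replaces motion detection with a plain brightness threshold on a single grayscale frame. The frame representation, the function names, and the values of `threshold` and `min_pixels` are all assumptions made for illustration; a real system would read frames from a camera with an image library.

```python
from typing import List

# A grayscale frame as rows of pixel intensities in the range 0..255
# (assumed representation, chosen to keep the sketch dependency-free).
Frame = List[List[int]]


def count_bright_pixels(frame: Frame, threshold: int = 200) -> int:
    """Count pixels at or above the brightness threshold."""
    return sum(1 for row in frame for pixel in row if pixel >= threshold)


def has_defect(frame: Frame, threshold: int = 200, min_pixels: int = 3) -> bool:
    """Report a defect when enough bright pixels appear in the dark frame.

    min_pixels filters out isolated sensor noise (hypothetical value).
    """
    return count_bright_pixels(frame, threshold) >= min_pixels


# Intact material: the camera sees a completely black picture.
dark = [[0] * 8 for _ in range(6)]

# Material with a hole: light from below shows up as a bright spot.
holed = [row[:] for row in dark]
for y, x in [(2, 3), (2, 4), (3, 3), (3, 4)]:
    holed[y][x] = 255

print(has_defect(dark))
print(has_defect(holed))
```

Because the box keeps the background black, even this crude check separates intact material from defective material, which is why a standard tool is enough for the task.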
Alternative point of view
Not everyone shares our point of view. Some of our colleagues in the industry believe that, by and large, all tasks, however different they may be, can be classified and covered by some kind of universal development.
But classification is effective only when each class contains tasks that are truly close to one another and can be solved with the same approaches. If a class contains five seemingly similar tasks and there is still no universal algorithm that solves them, the classification becomes meaningless. Which class, for example, does detecting a broken excavator tooth belong to?
On the other hand, it is absolutely clear that this is where the future lies. After all, tasks can be classified as, for example, the recognition of some object or of some event. And very soon, probably, video systems will include algorithms that can recognize anything. To a certain extent, such developments are already usable today.
Not long ago, we met with the company Blippar, which, by its own account, made the world's first visual browser. It built a mobile application that recognizes any object caught in the phone's camera lens and displays content about it. Today it is a fully working application that shows quite good results.
In the end
All this suggests that promising video analysis technologies are already being created: technologies that are universal, work in different conditions, and do not require further development for specific tasks.
It is quite obvious that in the future recognition will become a fundamental, basic function that can be applied everywhere, including at production facilities to solve specialized problems.
But the present must be kept separate from the future. Today, while the technologies have not yet reached that level, customers with a narrow, specific production task have three options:
1. Pay quite a lot of money for individual development.
2. Show ingenuity and resourcefulness, using standard tools in non-standard ways.
3. Forgo video analysis and rely on human labor instead.