
Computers, robots, artificial intelligence... Many advanced technologies grew out of the desire to reproduce or imitate human thinking, perception, and behavior.
Acoustic, video, and pressure sensors, for example, were created only after we understood how our own hearing and vision work and how we perceive pressure.
Vision is undoubtedly one of our most important senses. It lets us perceive our surroundings, interpret and analyze a situation, and act accordingly.
Human vision is an extraordinarily complex "machine" that consumes a significant share of the brain: neurons dedicated to processing visual information occupy roughly 30% of the cortex.
For years, scientists and engineers have been working to create devices and objects that can "see" their environment and analyze and interpret what they see.
Technological complexity, high resource demands, and prohibitive cost once limited the reach of computer vision and the analytics built on it, confining them to security and video surveillance systems. Today the situation has changed dramatically: the market for video sensors is growing rapidly, cameras are built into all kinds of devices and objects, both mobile and stationary, and the computing power of end devices and cloud platforms has surged. The result is a revolution in computer vision.
Affordable sensors and cameras, rising sensor resolution and dynamic range, and abundant compute for processing video and images are all driving the wider spread of such systems and opening up new uses for them.
In today's world of connected embedded systems, devices, and objects, intelligent analysis of images and video has become possible using both classical processing and deep learning, running on the device itself as well as in the cloud.
As a result, we are witnessing a boom in autonomous cars, unmanned aerial vehicles, robots, automated systems for industry, retail, and transportation, security and video surveillance, home appliances, medical devices and healthcare solutions, sports and entertainment, consumer augmented and virtual reality, and, of course, the ubiquitous mobile phone. Computer vision technologies and the analytics built on them are developing rapidly across the Internet of Things, and this is only the beginning.
In fact, no other sensor has driven a comparable revolution. Video has become part of daily life, and most people already take it for granted. With streaming video, video on demand, and video calls all around us, it is easy to overlook the impact of the sensor itself in the world of Internet-connected media and devices; the video sensor is the most underrated hero of the Internet of Things. Paired with intelligent video and image analysis, video sensors are creating a whole new dimension for the market.
One of the main drivers of computer vision's rapid development has been the ever-growing number of mobile phones with built-in cameras. Before the mobile phone revolution, video sensors, cameras, and the corresponding analytics tools were used mainly in security and video surveillance systems. Then camera phones arrived, accompanied by a sharp rise in the computing power of end devices and cloud systems available for video analytics. This combination proved explosive, catalyzing the rapid spread of video sensors into robots, drones, automobiles, industrial equipment, home appliances, and beyond.
There are various types of video sensors, but complementary metal-oxide-semiconductor (CMOS) sensors have certainly had the greatest impact, driving the explosive development of these technologies and the integration of video sensors into all manner of systems and smartphones.
Sensors are everywhere, and they are numerous. An autonomous car today uses more than ten cameras, a drone three or four; surveillance cameras are installed almost everywhere, and mobile phones can stream video in real time. Video from these sources is sent to the cloud for further analysis, while real-time processing runs on the devices themselves.
The resolution and dynamic range of video sensors, as well as their sheer number, continue to grow, and in the foreseeable future this trend will only accelerate. Processing, transferring, and storing such volumes of video demands ever more resources.
At first, everyone tried to stream video to the cloud for analysis, whether in real time or after the fact. The cloud offered tremendous computing power, but transmitting video, even compressed, required channels with very high bandwidth. The need to store huge amounts of data, significant latency, and potential security and privacy issues have forced users to rethink this approach. Many now analyze video on the device or object itself and reserve the cloud for offline processing.
With the arrival of high-speed, low-latency 5G connectivity, the idea emerged of splitting real-time video processing between end devices and the cloud. It remains to be seen, however, whether it is feasible, or even sensible, to push compressed video from millions of endpoints to the cloud in real time, saturating communication channels in the process.
As awareness of the importance of on-device analytics grows, systems-on-chip (SoCs), graphics processing units (GPUs), and video accelerators are proliferating. GPU-accelerated cloud resources are used to analyze archival video and to train neural networks on large datasets, while real-time processing happens on accelerator-equipped end devices.
Deep learning technologies and optimized SoCs, together with video accelerators for traditional image processing, reinforce the trend toward on-device analysis, with events, parameters, and derived analytics forwarded to the cloud for further study and correlation. Cloud resources will continue to handle video archives, while some systems will still perform real-time analysis there.
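The edge-first pattern just described is easy to illustrate. Below is a minimal sketch, assuming Python with OpenCV and the requests library, in which a device runs classical motion detection locally and uploads only compact event metadata instead of the video itself; the endpoint URL and event schema are hypothetical.

```python
# Minimal edge-first analytics sketch: detect motion on the device,
# ship only small JSON events to the cloud (never the raw video).
import time

import cv2
import requests

EVENTS_URL = "https://example.com/api/events"  # hypothetical cloud endpoint

def stream_motion_events(camera_index: int = 0, min_area: int = 500) -> None:
    cap = cv2.VideoCapture(camera_index)
    subtractor = cv2.createBackgroundSubtractorMOG2()  # classical on-device processing
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        mask = subtractor.apply(frame)
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)
        regions = [cv2.boundingRect(c) for c in contours
                   if cv2.contourArea(c) >= min_area]
        if regions:
            # An event is a few hundred bytes, versus megabits per second
            # for the compressed video stream it replaces.
            requests.post(EVENTS_URL,
                          json={"ts": time.time(), "regions": regions},
                          timeout=2)
    cap.release()
```

The same division of labor applies to deep-learning pipelines: the device filters, and the cloud only ever sees events and archived footage.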
Computer vision: real-world examples
The market for computer vision and its analytics tools will continue to develop rapidly. Several remarkable technology trends should give computer vision systems fresh momentum for years to come. Here are just a few examples:
3D cameras and 3D sensors. 3D cameras, or more broadly 3D-enabled sensor technology, determine depth in a scene and build 3D scene maps. The technology has been around for some time: it is widely used in gaming systems such as Microsoft Kinect, and more recently in the iPhone X 3D sensor for biometrics. This market is poised for another growth spurt once smartphones provide the acceleration needed for a far wider set of applications. Robots, drones, and autonomous cars with 3D cameras will be able to recognize the shape and size of objects and use these technologies for navigation, mapping, and obstacle detection. 3D and stereoscopic cameras also underpin augmented, virtual, and mixed reality (a minimal depth-from-stereo sketch appears after this list).
Deep learning on end devices and in the cloud. Artificial intelligence systems based on neural networks are spreading rapidly. Again, deploying deep learning networks became feasible only thanks to the enormous computing power available today. Other factors have fueled their rise and practical application: vast amounts of data (video, photos, text) available for training, and advanced research at universities and first-tier companies that popularizes open-source solutions and systems. The result is a proliferation of neural networks built for specific practical tasks. For robots, autonomous cars, and drones, on-device deep learning on GPUs and SoCs is already the norm (see the inference sketch after this list). Cloud resources will continue to serve for training deep networks and for processing archived video. Distributing processing across architectures spanning end devices and the cloud is also possible where network and video-stream latency is acceptable.
SLAM in cars, robots, and drones. Simultaneous localization and mapping (SLAM) is a key component of autonomous cars, robots, and drones equipped with various cameras and sensors, including radar, lidar, and ultrasonic sensors.
Augmented/virtual reality and perceptual computing. Take Microsoft HoloLens: the system is built on six cameras combined with depth sensors. Microsoft has even announced a research center in Cambridge (USA) dedicated to developing computer vision technologies for HoloLens.
Security/CCTV. This article does not cover this branch of video collection and analysis; it is a very large market in its own right.
Biometric authentication on mobile phones and embedded devices. Biometric authentication can give new impetus to mobile applications, and here, again, video sensors work in combination with analytics on end devices and in the cloud. As the technology matures, it will reach a wide range of embedded devices.
Retail. Amazon Go is one example of cameras paired with advanced video analytics. Soon, consultant robots equipped with several cameras, a video analysis system, and other sensors will meet customers at the shelves.
Media. Video analytics is already widely used in the media industry; analytics systems can scan large video files for a specific topic, scene, object, or person.
Sports. Real-time 3D video, video analytics, and virtual reality will enable a new generation of personalized sports and entertainment systems.
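As a rough illustration of the depth extraction that the 3D and stereoscopic cameras above perform, here is a minimal sketch using OpenCV's classical stereo block matching; the image file names and the focal length and baseline mentioned in the comments are placeholders, not values from any particular camera.

```python
# Depth from a rectified stereo pair via classical block matching.
import cv2

left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

# numDisparities must be a multiple of 16; blockSize must be odd.
matcher = cv2.StereoBM_create(numDisparities=64, blockSize=15)
disparity = matcher.compute(left, right)  # larger disparity = closer object

# With known focal length f (in pixels) and stereo baseline b (in meters),
# per-pixel depth follows as z = f * b / disparity.
vis = cv2.normalize(disparity, None, 0, 255, cv2.NORM_MINMAX).astype("uint8")
cv2.imwrite("disparity.png", vis)
```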
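For the on-device deep learning mentioned above, the pattern looks like the following sketch, which runs a pretrained detector through OpenCV's DNN module so that only detections, not frames, need to leave the device. The model files are placeholders; any SSD-style Caffe model loadable by OpenCV would fit this pattern.

```python
# On-device inference with a pretrained SSD-style detector.
import cv2

net = cv2.dnn.readNetFromCaffe("deploy.prototxt", "model.caffemodel")
# On supported hardware the same code can target a built-in accelerator,
# e.g. net.setPreferableTarget(cv2.dnn.DNN_TARGET_OPENCL)

frame = cv2.imread("frame.jpg")
blob = cv2.dnn.blobFromImage(frame, scalefactor=0.007843,
                             size=(300, 300), mean=(127.5, 127.5, 127.5))
net.setInput(blob)
detections = net.forward()  # shape (1, 1, N, 7) for SSD-style models

h, w = frame.shape[:2]
for det in detections[0, 0]:
    confidence = float(det[2])
    if confidence > 0.5:
        x1, y1, x2, y2 = (det[3:7] * [w, h, w, h]).astype(int)
        print(f"class={int(det[1])} conf={confidence:.2f} "
              f"box=({x1},{y1})-({x2},{y2})")
```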
Prospects, challenges, and problems
The constant push for higher resolution, dynamic range, and frame rate, and for better video analytics performance, demands a matching increase in computing power and in data transmission and storage capacity. These needs cannot always be met quickly.
Several companies are taking a different approach to this problem. Just as neural networks drew on biology, research projects and commercial computer vision products are now appearing that respond to changes in a scene and emit a small stream of events instead of transmitting a sequence of full frames. This allows video capture and processing systems with far more modest capabilities.
This approach is promising and could drastically change how video is captured and processed. Cutting the required computing power so sharply would also yield large energy savings.
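A real event sensor does this in hardware at the pixel level; the sketch below only emulates the idea in software with frame differencing (assuming Python with OpenCV, NumPy, and a local webcam), to show how little data survives when a scene is mostly static.

```python
# Software emulation of event-based sensing: emit per-pixel
# brightness-change events instead of full frames.
import cv2
import numpy as np

THRESHOLD = 20  # minimum brightness change that counts as an event

cap = cv2.VideoCapture(0)
ok, first = cap.read()
prev = cv2.cvtColor(first, cv2.COLOR_BGR2GRAY).astype(np.int16)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY).astype(np.int16)
    diff = gray - prev
    ys, xs = np.nonzero(np.abs(diff) > THRESHOLD)
    polarity = np.sign(diff[ys, xs])  # +1 got brighter, -1 got darker
    # For a static scene this is near-empty: orders of magnitude fewer
    # values than the full frame a conventional sensor would transmit.
    print(f"{len(xs)} events of {gray.size} pixels "
          f"({int((polarity > 0).sum())} brighter)")
    prev = gray
cap.release()
```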
Video sensors will remain a key catalyst for the rapid growth of the Internet of Things. Likewise, on-device video analytics will keep driving the SoC and semiconductor industries, spurring improvements in video accelerators based on GPUs, application-specific integrated circuits (ASICs), SoCs optimized for inference, field-programmable gate arrays (FPGAs), and digital signal processors (DSPs). All of this will also advance traditional image processing and deep learning, and give developers more options for programming these systems.
Today this is a battlefield where big players and startups alike are competing.
Embedded low-power video sensors
Millions of battery-powered objects already use video sensors and video analytics, so improving embedded, low-power video sensors remains both a major growth driver for the industry in this new era and one of its key unsolved problems. The spread of devices with embedded video sensors and analytics also makes it essential to address privacy and security at the design stage.
Despite all the problems and challenges, systems that combine computer vision with the Internet of Things have a great future and enormous market potential; the companies that overcome these challenges will be richly rewarded.