Today we want to share a series of Python examples for OpenCV learners on the Raspberry Pi, specifically for the dual-camera StereoPi board. The finished code (plus a ready-made Raspbian image) will help you go through all the steps, from capturing a picture to getting a depth map from the captured video.
Introduction
I want to stress right away that these examples are meant for a comfortable introduction to the topic, not for a production solution. If you are an advanced OpenCV user and have worked with the Raspberry Pi, you know that for serious work it is preferable to code in C and even to use the Raspberry Pi's GPU. At the end of the article I will touch in a bit more detail on the bottlenecks of the Python solution and on performance in general.
What we work with
The hardware setup is as follows:
A StereoPi board with a Raspberry Pi Compute Module 3+ on board, and two of the simplest Raspberry Pi V1 cameras (based on the OV5647 sensor).
What is installed:
Raspbian Stretch (kernel 4.14.98-v7 +)
Python 3.5.3
OpenCV 3.4.4 (pre-compiled, installed via pip from Python Wheels)
The process of installing all the software is beyond the scope of this article, so we simply suggest downloading the ready-made Raspbian image (links to GitHub are at the end of the article).
Step One: Capturing a Picture
To do this, use the script 1_test.py
Open the console, go from the home folder to the folder with examples:
cd stereopi-tutorial
Run the script:
python 1_test.py
After launch, a preview of our stereo image is displayed. The process can be stopped by pressing the Q key; the last captured image is then saved and will be used by one of the following scripts to tune the depth map.
This script allows you to make sure that all the hardware is working correctly, as well as get the first image for future use.
Here is the first script at work:
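To give a feel for what the capture looks like, here is a minimal sketch of a side-by-side stereo preview, assuming the picamera library and its stereo mode on a Compute Module. This is not the full 1_test.py; the resolution, window title and file name are just illustrative:

import cv2
import numpy as np
from picamera import PiCamera

# Side-by-side stereo mode glues both cameras into one wide frame
camera = PiCamera(stereo_mode='side-by-side')
camera.resolution = (1280, 480)
camera.framerate = 20

# Pre-allocated buffer; BGRA because it is the GPU-native 4-channel format
frame = np.zeros((480, 1280, 4), dtype=np.uint8)

for _ in camera.capture_continuous(frame, format='bgra', use_video_port=True):
    cv2.imshow('Stereo pair', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        cv2.imwrite('photo.png', frame)   # keep the last frame for later steps
        break

cv2.destroyAllWindows()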
Step Two: Collect Images for Calibration
Speaking of a spherical cow in a vacuum: to get a good-quality depth map we would need two absolutely identical cameras, with their vertical and optical axes perfectly parallel and their horizontal axes coinciding. In the real world, though, all cameras are slightly different and it is impossible to mount them perfectly. So a software calibration trick was invented: a large number of pictures of a known object are taken with the two real-world cameras (in our case, a picture of a chessboard), and then a special algorithm calculates all the imperfections and tries to correct the pictures so that they are close to the ideal.
This script does the first stage of that work: it helps you take a series of photos for calibration.
Before each photo the script starts a 5-second countdown. This time is usually enough to move the board to a new position, make sure it does not go over the edge of the frame on either camera, and hold it steady (so the photo is not blurred). By default the series size is set to 30 photos.
Run:
python 2_chess_cycle.py
Process:
As a result, we have a series of photos in the /scenes folder.
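For reference, the countdown loop boils down to something like the sketch below. It is a simplified take, not the exact 2_chess_cycle.py; the file names and the /scenes path are illustrative:

import os
import time
import cv2
import numpy as np
from picamera import PiCamera

TOTAL_PHOTOS = 30        # size of the calibration series
COUNTDOWN = 5            # seconds to reposition the chessboard

camera = PiCamera(stereo_mode='side-by-side')
camera.resolution = (1280, 480)
camera.framerate = 20

os.makedirs('./scenes', exist_ok=True)
frame = np.zeros((480, 1280, 4), dtype=np.uint8)

counter = 0
start = time.time()
for _ in camera.capture_continuous(frame, format='bgra', use_video_port=True):
    cv2.imshow('Calibration', frame)
    if time.time() - start >= COUNTDOWN:
        # Time is up: save the current stereo pair and restart the countdown
        cv2.imwrite('./scenes/scene_{:02d}.png'.format(counter), frame)
        counter += 1
        start = time.time()
    if cv2.waitKey(1) & 0xFF == ord('q') or counter >= TOTAL_PHOTOS:
        break

cv2.destroyAllWindows()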
Cut the pictures into pairs
The third script, 3_pairs_cut.py, cuts the captured photos into "left" and "right" pictures and saves them in the /pairs folder. We could actually drop this script and do the cutting on the fly, but it is very useful for further experiments. For example, you can keep the cut pairs from different series, use your own scripts to work with them, or even feed in pictures from other stereo cameras as pairs.
Plus, before cutting each picture, the script displays it, which often lets you spot unsuccessful photos before the calibration stage and simply delete them.
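The cutting itself is plain NumPy slicing of the wide frame into two halves; roughly like this (folder and file names are just examples):

import glob
import os
import cv2

os.makedirs('./pairs', exist_ok=True)

for num, filename in enumerate(sorted(glob.glob('./scenes/scene_*.png'))):
    pair = cv2.imread(filename)
    if pair is None:
        continue
    half = pair.shape[1] // 2
    cv2.imwrite('./pairs/left_{:02d}.png'.format(num), pair[:, :half])   # left half
    cv2.imwrite('./pairs/right_{:02d}.png'.format(num), pair[:, half:])  # right half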
Run the script:
python 3_pairs_cut.py
Short video:
The ready-made Raspbian image already contains the set of photos and cut pairs that we used in our experiments.
Calibration
The 4_calibration.py script loads all the pairs with chessboards and calculates the corrections needed to rectify the pictures. The script automatically discards photos in which it cannot find a chessboard, so unsuccessful shots do not stop the process. After all 30 pairs of pictures are loaded, the calculation begins; it takes us about a minute and a half. When it is done, the script takes one of the stereo pairs and, based on the calculated calibration parameters, "corrects" it, displaying the rectified picture. At this point you can evaluate the quality of the calibration.
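Under the hood this is the standard OpenCV chessboard calibration. Below is a condensed sketch of the idea; the real script is organized differently, and the board size, square size and file names here are assumptions:

import glob
import cv2
import numpy as np

CHESSBOARD = (9, 6)        # inner corners of the board (an assumption)
SQUARE_SIZE = 2.5          # square size in centimeters (an assumption)

# Reference grid of 3D object points, reused for every detected board
objp = np.zeros((CHESSBOARD[0] * CHESSBOARD[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:CHESSBOARD[0], 0:CHESSBOARD[1]].T.reshape(-1, 2) * SQUARE_SIZE

obj_points, left_points, right_points = [], [], []

for left_name in sorted(glob.glob('./pairs/left_*.png')):
    right_name = left_name.replace('left', 'right')
    left = cv2.imread(left_name, cv2.IMREAD_GRAYSCALE)
    right = cv2.imread(right_name, cv2.IMREAD_GRAYSCALE)
    found_l, corners_l = cv2.findChessboardCorners(left, CHESSBOARD)
    found_r, corners_r = cv2.findChessboardCorners(right, CHESSBOARD)
    if not (found_l and found_r):
        continue               # discard pairs where the board was not found
    obj_points.append(objp)
    left_points.append(corners_l)
    right_points.append(corners_r)

image_size = left.shape[::-1]  # (width, height) of the last loaded image

# Calibrate each camera, then the stereo rig, then compute rectification
_, M1, d1, _, _ = cv2.calibrateCamera(obj_points, left_points, image_size, None, None)
_, M2, d2, _, _ = cv2.calibrateCamera(obj_points, right_points, image_size, None, None)
_, M1, d1, M2, d2, R, T, _, _ = cv2.stereoCalibrate(
    obj_points, left_points, right_points, M1, d1, M2, d2, image_size,
    flags=cv2.CALIB_FIX_INTRINSIC)
R1, R2, P1, P2, Q, _, _ = cv2.stereoRectify(M1, d1, M2, d2, image_size, R, T)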
Run the command:
python 4_calibration.py
The calibration script at work:
Depth Map Setup
The 5_dm_tune.py script loads the image made by the first script and the calibration results. Then an interface is displayed that lets you change the depth map settings and see how the map changes. Tip: before tuning the parameters, capture a frame that contains objects at different distances at the same time: close (30-40 centimeters), at a medium distance (one to two meters), and far away. This will let you pick parameters at which close objects appear red and distant objects dark blue.
The Raspbian image also includes a file with our depth map settings. You can load it in the script simply by pressing the "Load settings" button.
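The map itself is built by OpenCV's StereoBM block matcher. Here is a minimal sketch of how the tuned parameters map onto it; the values and file names below are just an example, not our saved settings:

import cv2

# Example values; in the real script they come from the GUI sliders
sbm = cv2.StereoBM_create(numDisparities=64, blockSize=21)
sbm.setMinDisparity(2)
sbm.setPreFilterCap(30)
sbm.setUniquenessRatio(10)
sbm.setSpeckleWindowSize(100)
sbm.setSpeckleRange(32)

# Inputs should already be rectified grayscale images (hypothetical file names)
left = cv2.imread('rectified_left.png', cv2.IMREAD_GRAYSCALE)
right = cv2.imread('rectified_right.png', cv2.IMREAD_GRAYSCALE)

# StereoBM returns fixed-point disparities multiplied by 16
disparity = sbm.compute(left, right).astype('float32') / 16.0

# Normalize to 0..255 and colorize: near objects hot, far objects cold
disp_vis = cv2.normalize(disparity, None, 0, 255, cv2.NORM_MINMAX).astype('uint8')
cv2.imshow('Depth map', cv2.applyColorMap(disp_vis, cv2.COLORMAP_JET))
cv2.waitKey(0)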
Run:
python 5_dm_tune.py
Here is the setup process:
Real-Time Depth Map
The last script, 6_dm_video.py, builds a depth map from video, using the results of the previous scripts (the calibration and the depth map settings).
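Structurally this is the same capture loop as before, with rectification and StereoBM applied to every frame. A compressed sketch, assuming the rectification maps were saved to a hypothetical calibration_maps.npz and that the matcher settings are just examples:

import cv2
import numpy as np
from picamera import PiCamera

# Hypothetical file and keys; in reality these come from the previous scripts
calib = np.load('calibration_maps.npz')
sbm = cv2.StereoBM_create(numDisparities=64, blockSize=21)

camera = PiCamera(stereo_mode='side-by-side')
camera.resolution = (1280, 480)
camera.framerate = 20

frame = np.zeros((240, 640, 4), dtype=np.uint8)   # downscaled by the GPU

for _ in camera.capture_continuous(frame, format='bgra',
                                   use_video_port=True, resize=(640, 240)):
    gray = cv2.cvtColor(frame, cv2.COLOR_BGRA2GRAY)
    left, right = gray[:, :320], gray[:, 320:]

    # Rectify both halves with the calibration maps, then match
    left_r = cv2.remap(left, calib['leftMapX'], calib['leftMapY'], cv2.INTER_LINEAR)
    right_r = cv2.remap(right, calib['rightMapX'], calib['rightMapY'], cv2.INTER_LINEAR)
    disparity = sbm.compute(left_r, right_r)

    cv2.imshow('Depth map', cv2.convertScaleAbs(disparity, alpha=255.0 / (64 * 16)))
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cv2.destroyAllWindows()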
Run:
python 6_dm_video.py
And here is the result:
We hope that our scripts will be useful in your experiments!
Just in case, I'll add that all the scripts handle keystrokes, and you can stop them by pressing the Q key. If you stop them roughly, for example with Ctrl+C, the interaction between Python and the camera may break and you will have to reboot the Raspberry Pi.
For advanced users
While running, the first script displays the average time between frame captures and, on completion, the average FPS. It is a simple and convenient tool for picking image parameters at which Python is still "not choking". Using it, we settled on 1280x480 at 20 FPS, at which the video comes through without delay.
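The measurement itself is trivial and boils down to something like the sketch below; tick() and report() are just illustrative names, not the exact code from the script:

import time

frame_times = []
last = time.time()

def tick():
    """Call once per captured frame to record the inter-frame interval."""
    global last
    now = time.time()
    frame_times.append(now - last)
    last = now

def report():
    """Print the average interval between frames and the resulting FPS."""
    avg = sum(frame_times) / len(frame_times)
    print('Average time between frames: {:.4f} s'.format(avg))
    print('Average FPS: {:.1f}'.format(1.0 / avg))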
You may notice that we capture the stereo pair at a resolution of 1280x480 and then scale it down to 640x240.
A reasonable question arises: why do all this, instead of capturing a reduced picture right away and sparing our Python the downscaling?
Answer: the Raspberry Pi kernel currently has problems with direct capture at very low resolutions (the picture breaks up). So we capture at a larger resolution and then shrink the picture. Here we use a little trick: the picture is scaled not by Python but by the GPU, so there is no load on the ARM core.
Why capture video in BGRA format rather than BGR? We use the GPU to reduce the size of the image, and BGRA is the native format for the resize module. If we used BGR instead of BGRA, we would get two drawbacks. The first is a slightly lower final FPS (about 20 percent lower in our tests). The second is a constant PiCameraAlphaStripping warning in the console: "using alpha-stripping to convert to non-alpha format; you may find the equivalent alpha format faster". Googling it led to the section of the Picamera documentation that describes this trick.
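Both tricks live in the same capture call; the relevant fragment looks roughly like this (a sketch; the numbers are the ones mentioned above):

import numpy as np
from picamera import PiCamera

camera = PiCamera(stereo_mode='side-by-side')
camera.resolution = (1280, 480)      # capture at a size the kernel handles well
camera.framerate = 20

# The buffer matches the *resized* frame; 4 channels because BGRA is the
# GPU-native format, so no alpha-stripping is needed
frame = np.zeros((240, 640, 4), dtype=np.uint8)

# resize=(640, 240) makes the GPU downscale the picture before Python sees it
for _ in camera.capture_continuous(frame, format='bgra',
                                   use_video_port=True, resize=(640, 240)):
    break   # one frame is enough for this illustration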
And where is PiRGBArray?
It is, after all, the native Picamera class for working with the camera, but it is not used here. It just turned out that in our tests, working with a "manually prepared" numpy array is noticeably faster (about one and a half times) than going through PiRGBArray. This does not mean that PiRGBArray is bad; most likely it is just our clumsy hands.
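For comparison, the two approaches look roughly like this (a sketch; actual timings will of course depend on your setup):

import numpy as np
from picamera import PiCamera
from picamera.array import PiRGBArray

camera = PiCamera(stereo_mode='side-by-side')
camera.resolution = (1280, 480)

# Variant 1: the native PiRGBArray helper
raw = PiRGBArray(camera)
camera.capture(raw, format='bgr', use_video_port=True)
image = raw.array                    # numpy array assembled by picamera

# Variant 2: a "manually prepared" numpy buffer (what our scripts use);
# picamera writes straight into it via the buffer protocol
frame = np.zeros((480, 1280, 4), dtype=np.uint8)
camera.capture(frame, format='bgra', use_video_port=True)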
How loaded is the processor while the depth map is being calculated? Let's answer with a picture:
We see that of the processor's 4 cores only one is loaded, and only to about 70%. And this is despite the fact that we are running a GUI and displaying the images and depth maps to the user. This means there is a good performance margin, and fine-tuning OpenCV with OpenMP and other C-level goodies, as well as a "combat" mode without a GUI, can give very interesting results.
What is the maximum depth map FPS achievable with these settings?
The maximum we achieved was 17 FPS, with 20 frames per second captured from the camera. The parameters most "responsive" in terms of speed among the depth map settings are MinDisparity and NumOfDisparities. This is logical, since they determine the number of "steps" taken by the comparison window inside the matching algorithm. The second most responsive is preFilterCap; it affects, in particular, the "smoothness" of the depth map.
What is the temperature of the processor?
On the Compute Module 3+ Lite (the new series, with a metal "cap" on the processor) it shows something like this:
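You can check it at any moment yourself, for example like this (both vcgencmd and the sysfs thermal file are standard on Raspbian):

import subprocess

# Firmware-side reading, prints something like "temp=48.3'C"
print(subprocess.check_output(['vcgencmd', 'measure_temp']).decode().strip())

# Kernel-side reading from sysfs, in millidegrees Celsius
with open('/sys/class/thermal/thermal_zone0/temp') as f:
    print('CPU temperature: {:.1f} C'.format(int(f.read()) / 1000))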
How to use the GPU?
At a minimum, it can be used for undistortion and rectification of pictures in real time; there are examples of this (in WebGL), in Python with Pi3d, as well as in the Processing project (with examples for the Raspberry Pi).
There is another interesting project by Koichi Nakamura called py-videocore. In our correspondence he suggested that to speed up StereoBM one could use his core together with the OpenCV sources that have CUDA support. In general, when it comes to optimization, there is, as they say, a whole unplowed field here.