
How I fought with cameras, or the locus of points in inept hands

Good evening, dear habrovchane, good evening, glorious city of Belgorod.
Today I will tell you a tale about a fool. And he is a fool (that is, me) because he ignored one simple truth:

The famous laziness of a programmer means that instead of making extra motions (whether human or machine), it is better to think and find a simpler, more elegant solution.


And the tale is about how the fool tried to teach a program to find the position of a camera in space.


The setup



I had to write a programming project at the end of my second year. As usual, I decided to go for a freebie and join some group at the end of the year. It did not work out. I agreed with the dean's office and my instructor to hand in the project at the end of August and calmly left to teach children about art. Returning to Nerizinovaya, I realized that I had essentially two weeks left. From then on I spent almost all my time sitting in a coffee shop, doing some rather pleasant coding. The essence of the project was to determine the spatial position of the fingertips in real time with the help of two webcams.

It is clear that each pixel in a camera image corresponds to a ray in space. Two cameras give two rays that intersect at the point of interest. In theory, everything is simple. In practice, I spent a night hooking the OpenCV library up to MSVS, then a week and a half creating various image processing algorithms, quickly wrote a simple 3D viewer, combined the two cameras and debugged, debugged, debugged... At that time I set up the basis in space the easy way: I put the cameras on one line, pointed them "approximately upwards" and took the distance between the cameras to be 1000 units.
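
The two-ray idea can be sketched in a few lines. This is only an illustration, not the project's code: the function name `closest_point_between_rays` is hypothetical, the rays are treated as infinite lines, and because real rays from two cameras almost never meet exactly, the sketch returns the midpoint of the shortest segment between them.

```python
def closest_point_between_rays(o1, d1, o2, d2):
    """Midpoint of the shortest segment between lines o1 + t*d1 and o2 + t*d2."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    w = [p - q for p, q in zip(o1, o2)]      # vector from origin 2 to origin 1
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b                    # zero when the rays are parallel
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel, no unique closest point")
    t1 = (b * e - c * d) / denom             # parameter along the first ray
    t2 = (a * e - b * d) / denom             # parameter along the second ray
    p1 = [o + t1 * k for o, k in zip(o1, d1)]
    p2 = [o + t2 * k for o, k in zip(o2, d2)]
    return [(x + y) / 2 for x, y in zip(p1, p2)]


# Two cameras 1000 units apart on one line, both seeing the same point.
point = closest_point_between_rays((0, 0, 0), (500, 0, 800),
                                   (1000, 0, 0), (-500, 0, 800))
```

Here the camera origins and ray directions are made-up numbers that only echo the 1000-unit baseline; in the real setup the directions would come from the pixel coordinates and the camera orientation.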

In general, everything was almost ready. Each camera on its own could track a single finger with good accuracy, all the math for the coordinate conversion was worked out, and I had even written a feature that allowed almost any fixed background, not just a black one. But something was wrong: the point in space made freakish somersaults with an amplitude of about a centimeter whenever the hand moved. Trouble. And then I realized that three hours earlier the waitress had just slightly bumped a camera.

image

Problem statement



I sighed and realized that I would have to write a function that determines the position and orientation of the camera itself. The camera's field of view is ideally an infinite quadrangular pyramid with a rectangular base. It is fully defined by eight values: 3 coordinates of the apex, 2 coordinates of the direction vector (the axis of the pyramid), 1 for the rotation around that axis, and 2 more for the angular width of the field of view.

The last two values are known in advance: google the camera's diagonal angle of view and solve the simplest geometric problem. The rotation around the axis is self-explanatory, but it may vary depending on how the camera is placed. The direction vector needs only two coordinates, because it can be specified as the end of a vector of a fixed length l (3 coordinates), and one unknown is eliminated by the equation x^2 + y^2 + z^2 = l^2. The three coordinates of the apex need no explanation. So we need to calculate 6 values.
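
The "simplest geometric problem" could look roughly like this. It is a sketch under assumptions the text does not state: an ideal pinhole camera with square pixels and no lens distortion. The function name `fov_from_diagonal` and the sample numbers are hypothetical.

```python
import math


def fov_from_diagonal(diag_fov_deg, width_px, height_px):
    """Horizontal and vertical angles of view (degrees) from the diagonal one."""
    diag_px = math.hypot(width_px, height_px)
    # Tangent of half the diagonal angle: half-diagonal over focal distance.
    t = math.tan(math.radians(diag_fov_deg) / 2)
    # The half-width and half-height scale that tangent proportionally.
    horizontal = 2 * math.atan(t * width_px / diag_px)
    vertical = 2 * math.atan(t * height_px / diag_px)
    return math.degrees(horizontal), math.degrees(vertical)


# Example call for a 640x480 image and a made-up 78-degree diagonal angle.
h_fov, v_fov = fov_from_diagonal(78, 640, 480)
```

Working with tangents rather than the angles themselves matters: the angles do not scale linearly with the image sides, but the tangents of the half-angles do.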

"Oh! I need a triangle!" I exclaimed. 3 points, and from the image each one gives 2 numbers. The plan, in total: we put an isosceles right triangle in space and declare that its vertices have the coordinates (100, 0, 0), (0, 0, 0) and (0, 100, 0). Then we mark the vertices of this triangle in the camera image, and all that remains is to substitute the values into a simple formula. Well, that is what I thought, anyway.

But no such luck. I killed 4 hours trying to find this formula with the conventional methods of exact mathematics, brought two of my best math friends from school into the search, learned to type the address of Wolfram Alpha faster than my password, but all I got was that an exact solution exists, and that after finding it I would attain zen.

And this is where the fool made his mistake. I had a system of 6 trigonometric equations, closely related to the equation of a circle. And in the very next semester we were to take Calculus, which, as is well known, describes a method for solving nonlinear systems. The right thing would have been to read up on the theory and do everything properly: even though it would take more time, the result would be better and faster, and it is useful for self-development. But no, the habits of Peter I awoke in me and I decided to hack it out with an ax.

Solution



As is known from the school planimetry course, the locus of points from which a given segment (AB) is seen at a given angle (alpha) is an arc of a circle. The picture makes everything clear.

image
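
For readers without the picture, here is a small sketch of that arc in the plane. It relies on the inscribed angle theorem: the circle has radius |AB| / (2 sin alpha), and its center lies on the perpendicular bisector of AB at a distance of radius times cos alpha from the midpoint. The names `locus_arc` and `viewing_angle` are hypothetical, and only the arc on one side of AB is generated; the full locus also contains its mirror image.

```python
import math


def locus_arc(a, b, alpha, n=50):
    """Points (one side of AB) from which segment AB is seen at angle alpha."""
    ax, ay = a
    bx, by = b
    length = math.hypot(bx - ax, by - ay)
    ux, uy = (bx - ax) / length, (by - ay) / length   # unit vector along AB
    nx, ny = -uy, ux                                   # unit normal to AB
    r = length / (2 * math.sin(alpha))                 # circle radius
    cx = (ax + bx) / 2 + nx * r * math.cos(alpha)      # circle center
    cy = (ay + by) / 2 + ny * r * math.cos(alpha)
    phi_max = math.pi - alpha                          # arc ends at A and B
    points = []
    for i in range(1, n + 1):                          # skip the endpoints
        phi = -phi_max + 2 * phi_max * i / (n + 1)
        s, c = math.sin(phi), math.cos(phi)
        points.append((cx + r * (s * ux + c * nx),
                       cy + r * (s * uy + c * ny)))
    return points


def viewing_angle(p, a, b):
    """Angle APB in radians."""
    v1 = (a[0] - p[0], a[1] - p[1])
    v2 = (b[0] - p[0], b[1] - p[1])
    cos_angle = (v1[0] * v2[0] + v1[1] * v2[1]) / (math.hypot(*v1) * math.hypot(*v2))
    return math.acos(max(-1.0, min(1.0, cos_angle)))
```

Calling `viewing_angle` on any point returned by `locus_arc` should give back `alpha` up to rounding, which is a quick way to check the construction.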

In space, moreover, this picture can be rotated around the segment. We get something like a torus, but without a hole. Since there are three segments, we get three tori, or rather, three torus surfaces. One surface is a two-dimensional figure, the intersection of two is already a line (in the general case, several closed curves), and the intersection of three is already a point. The tori are in the picture below.

So, the clumsy method: we have to intersect these three tori. And since computing is discrete, we have to represent the surface of each torus as the nodes of a grid stretched over it. Like this:

image

Then the distances between the nodes are compared by brute force and the closest ones are found.
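
The two building blocks of this clumsy method, grid nodes on one surface and a brute-force comparison of two node sets, might be sketched as follows. This is a reconstruction under assumptions, not the original code: the names `surface_grid` and `closest_pair` and the grid sizes are hypothetical, and how the original combined all three surfaces is not described, so only the pairwise step is shown.

```python
import math


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def unit(v):
    norm = math.sqrt(dot(v, v))
    return tuple(x / norm for x in v)


def surface_grid(a, b, alpha, n_arc=20, n_rot=36):
    """Grid nodes on the surface from which segment AB is seen at angle alpha."""
    u = unit(sub(b, a))                                  # axis of rotation
    helper = (1.0, 0.0, 0.0) if abs(u[0]) < 0.9 else (0.0, 1.0, 0.0)
    n1 = unit(cross(u, helper))                          # two unit vectors
    n2 = cross(u, n1)                                    # perpendicular to AB
    mid = tuple((x + y) / 2 for x, y in zip(a, b))
    r = math.sqrt(dot(sub(b, a), sub(b, a))) / (2 * math.sin(alpha))
    phi_max = math.pi - alpha
    nodes = []
    for i in range(1, n_arc + 1):                        # steps along the arc
        phi = -phi_max + 2 * phi_max * i / (n_arc + 1)
        along = r * math.sin(phi)                        # offset along AB
        radial = r * (math.cos(alpha) + math.cos(phi))   # distance from AB line
        for j in range(n_rot):                           # steps around AB
            psi = 2 * math.pi * j / n_rot
            nodes.append(tuple(
                m + along * uu + radial * (math.cos(psi) * p + math.sin(psi) * q)
                for m, uu, p, q in zip(mid, u, n1, n2)))
    return nodes


def closest_pair(nodes_a, nodes_b):
    """Brute force: compare every node of one grid with every node of the other."""
    best = None
    for p in nodes_a:
        for q in nodes_b:
            d2 = dot(sub(p, q), sub(p, q))
            if best is None or d2 < best[0]:
                best = (d2, p, q)
    return best[1], best[2], math.sqrt(best[0])
```

The nested loop in `closest_pair` grows with the product of the two grid sizes, and a finer grid is the only way to get more accuracy, which goes a long way toward explaining the memory use and the running time.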

As a result, this function ate more memory than the rest of the project with all its debug images and mountains of semi-necessary garbage, ran for five minutes per camera (long live real time!) and was sometimes wrong.

And half a year later, out of boredom in a lecture, I did write a proper function for intersecting these tori. Everything as it should be, with derivations, matrices and the rest. It ran instantly, computed accurately, and in general was easy and pleasant. But since that project was already over, it was written on the fly with zero polish, which means I can no longer understand anything in its text (back then I was still afraid of the word "class"), so I left the source in my office. And now it is time to finally finish this project, which is what I am actually doing. But that's another story.

Goodbye, dear habrovchane, sweet dreams, city of Belgorod.

PS In the near future I plan to describe the image processing algorithms, as soon as I remember them myself. So see you soon!

Source: https://habr.com/ru/post/152437/

