
Robot Loader Project: Determining Orientation

A month ago, I wrote about how my robot loader determines its own position. (Sorry, I posted that article at an unfortunate time, late on a Saturday night, so few people saw it.) As I noted there, the wheel-sensor readings let the robot determine its position quite accurately: the slowly accumulating error is corrected as soon as the robot scans the barcode on any of the warehouse shelves. The accumulated heading error, on the other hand, had nothing to be corrected against.

I discussed my difficulties with a humanities-major friend and asked her how she knew how to orient herself in space. She told me that in the London Science Museum she had come across an exhibit devoted to how ants orient themselves by looking straight up overhead. Visitors were asked to take a mirror and walk around the room, watching the patterns on the ceiling in the mirror and focusing only on them. (A map of the ceiling was provided.)

I decided to check: what does my robot see on the ceiling of the warehouse?


Even in such a low-quality recording, the long rectangular windows are perfectly distinguishable. All of them are, of course, parallel to one of the warehouse axes, and therefore parallel to the rows of shelves. It follows that if a window gets into the frame and I can determine its direction, then I get the robot's heading, up to an ambiguity of 180°. The windows are not visible from every point of the warehouse, but for periodic course correction they get into the frame often enough.
I am no expert in pattern recognition, so the first thing I did was ask on Q&A whether there was anything ready-made for recognizing rectangular windows, for example whether OpenCV was applicable. Unfortunately, I got nothing sensible back: the people who were "in the know" considered it beneath their dignity to spell things out for a newbie.

An American forum: you ask a question - you get an answer.
An Israeli forum: you ask a question - you get a question back.
A Russian forum: you ask a question - they explain to you what an asshole you are.

Besides, the robot was controlled from Microsoft Robotics Development Studio, and I could not find a ready-made .NET assembly for working with OpenCV, so I decided to write the recognition from scratch. It is not rocket science, after all - just recognizing bright white rectangles.

The robot sees the warehouse ceiling something like this:


To begin with, let's separate the bright white window from the dark background. We convert from RGB to YCbCr and threshold at Y = 227 (the threshold was chosen empirically; in an ideal world it would adapt to the overall brightness of the image). Along the way we downscale the original 640x480 image to 320x240 - that is enough for us, and processing speeds up fourfold.

Code:
byte[] bytes = response.Frame;
int stride = bytes.Length / height;
// [-3..322, -3..242]: a 320x240 working area plus three-pixel margins
byte[,] belong = (byte[,])Array.CreateInstance(typeof(byte),
    new int[] { 326, 246 }, new int[] { -3, -3 });
int Ythreshold = settings.Ythreshold;
for (int y = 0; y < 240; y++)
{
    int offset = stride * y * 2;
    for (int x = 0; x < 320; x++)
    {
        // sum each channel over a 2x2 block of the 640x480 frame (downscaling to 320x240)
        int blu = bytes[offset] + bytes[offset + 3] + bytes[offset + stride] + bytes[offset + stride + 3];
        offset++;
        int grn = bytes[offset] + bytes[offset + 3] + bytes[offset + stride] + bytes[offset + stride + 3];
        offset++;
        int red = bytes[offset] + bytes[offset + 3] + bytes[offset + stride] + bytes[offset + stride + 3];
        offset += 4; // advance to the next 2x2 block (2 pixels x 3 bytes in total)
        // luma (Y) of the averaged pixel, thresholded into a binary mask
        belong[x, y] = (0.299 / 4 * red + 0.587 / 4 * grn + 0.114 / 4 * blu > Ythreshold
            ? (byte)1 : (byte)0);
    }
}


The separation produces this image:


As usual with video, in addition to the correctly selected window we get noise from individual pixels on glare-prone surfaces, while inside the window itself there are "dead pixels" that are not bright enough.

We remove the noise with a median filter (on Q&A, for some reason, I was advised to use a Gaussian blur; what good would a blur do here?) with a radius of 3 pixels and a threshold of 5 bright pixels out of the 21 examined. Such a filter "leans" towards connecting bright areas separated by a few dark pixels, so that our three-pane window merges into a single rectangle.

private bool[,] filtered = (bool[,])Array.CreateInstance(typeof(bool),
    new int[] { 326, 246 }, new int[] { -3, -3 });
int medianThreshold = settings.MedianThreshold;
for (int x = 0; x < 320; x++)
    for (int y = 0; y < 240; y++)
        // 21-pixel neighbourhood of radius 3 (a rough disc: rows y-2 .. y+2)
        filtered[x, y] =
            belong[x - 1, y - 2] + belong[x, y - 2] + belong[x + 1, y - 2] +
            belong[x - 2, y - 1] + belong[x - 1, y - 1] + belong[x, y - 1] + belong[x + 1, y - 1] + belong[x + 2, y - 1] +
            belong[x - 2, y] + belong[x - 1, y] + belong[x, y] + belong[x + 1, y] + belong[x + 2, y] +
            belong[x - 2, y + 1] + belong[x - 1, y + 1] + belong[x, y + 1] + belong[x + 1, y + 1] + belong[x + 2, y + 1] +
            belong[x - 1, y + 2] + belong[x, y + 2] + belong[x + 1, y + 2]
            > medianThreshold;

The dimensions of the belong and filtered arrays - [-3..322, -3..242] - were deliberately chosen with three-pixel "margins" on each side, so that we can work with these arrays without bothering with index checks.
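For reference, here is a minimal sketch of how such an array with non-zero lower bounds is created in .NET (the bounds mirror the ones above; the name padded is mine):

// Lengths 326 x 246 with lower bounds -3 x -3 give valid indices
// [-3..322] x [-3..242]: a 320x240 working area plus 3-pixel margins.
byte[,] padded = (byte[,])Array.CreateInstance(
    typeof(byte),
    new int[] { 326, 246 },  // lengths of each dimension
    new int[] { -3, -3 });   // lower bounds of each dimension

// The median filter reads at most 2 pixels away from [0..319, 0..239]
// and the flood fill at most 1, so every access stays inside the margins,
// and the margin cells simply remain 0 (i.e. "dark").
padded[-2, -2] = 1;          // legal: no index-check gymnastics needed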

After filtering, only the window merged from its three panes and a few highlights on a shelf remain white in the image:


Let us declare that the biggest white blob in the frame must be a window. We flood-fill every white blob and take the one with the largest area.

int biggest_area = 0;
byte area_id = 1, biggest_id = 0; // area ids start from 2
Rectangle bounds = new Rectangle();
PointF cg = new PointF(); // center of mass
Point[] stack = new Point[320 * 200];
for (int x = 0; x < 320; x++)
    for (int y = 0; y < 240; y++)
        if (filtered[x, y] && belong[x, y] <= 1)
        {
            int area = 0, left = 320, top = 240, right = 0, bottom = 0;
            int sx = 0, sy = 0;
            ++area_id;
            // FloodFill
            int sp = 0;
            stack[0] = new Point(x, y);
            while (sp >= 0)
            {
                Point next = stack[sp--];
                area++;
                sx += next.X; sy += next.Y;
                belong[next.X, next.Y] = area_id;
                if (next.X < left) left = next.X;
                if (next.X > right) right = next.X;
                if (next.Y < top) top = next.Y;
                if (next.Y > bottom) bottom = next.Y;
                if (filtered[next.X - 1, next.Y] && belong[next.X - 1, next.Y] <= 1) stack[++sp] = new Point(next.X - 1, next.Y);
                if (filtered[next.X, next.Y - 1] && belong[next.X, next.Y - 1] <= 1) stack[++sp] = new Point(next.X, next.Y - 1);
                if (filtered[next.X, next.Y + 1] && belong[next.X, next.Y + 1] <= 1) stack[++sp] = new Point(next.X, next.Y + 1);
                if (filtered[next.X + 1, next.Y] && belong[next.X + 1, next.Y] <= 1) stack[++sp] = new Point(next.X + 1, next.Y);
            }
            if (area > biggest_area)
            {
                biggest_area = area;
                biggest_id = area_id;
                bounds = new Rectangle(left, top, right - left, bottom - top);
                cg = new PointF((float)sx / area, (float)sy / area);
            }
        }

The image is ready for recognition: two-level, a white rectangle on a black background. During the fill we even computed the coordinates of its "center of mass" cg (the red dot in the image) and its bounding box (the green dot is the center of the bounding box). We can now check how much our white blob resembles a window: its area biggest_area must be at least 2000 pixels, and the distance between the two centers must be no more than 20 pixels, otherwise the shape is too asymmetric for a rectangle.
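A minimal sketch of that window-likeness check (the 2000-pixel and 20-pixel thresholds are the ones quoted above; Sqr is the same squaring helper used in the histogram code below, and the early return is just my guess at how a bad frame would be skipped):

// Reject blobs that are too small or too lopsided to be a window.
PointF bbCenter = new PointF(bounds.Left + (float)bounds.Width / 2,
                             bounds.Top + (float)bounds.Height / 2);
double centersDist = Math.Sqrt(Sqr(cg.X - bbCenter.X) + Sqr(cg.Y - bbCenter.Y));
if (biggest_area < 2000 || centersDist > 20)
    return;   // not window-like enough: ignore this frame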


But how do we determine the rectangle's orientation? Rocket scientists would probably apply the Hough transform: map the image into the 4-dimensional space of rectangle parameters (width, length, angle, offset from the origin) and look for a maximum there. I, however, took a workers-and-peasants approach and began by building a "histogram" of the distances of the rectangle's points from its geometric center:

PointF c = new PointF(bounds.Left + (float)bounds.Width / 2, bounds.Top + (float)bounds.Height / 2);
float cX = c.X, cY = c.Y; // geometric center of the bounding box
int[] hist = new int[400];
for (int i = 0; i < 400; i++) hist[i] = 0;
int maxdist = 0;
for (int x = bounds.Left; x <= bounds.Right; x++)
    for (int y = bounds.Top; y <= bounds.Bottom; y++)
        if (belong[x, y] == biggest_id)
        {
            // distance from the geometric center, truncated to whole pixels
            int dist = (int)Math.Sqrt(Sqr(x - cX) + Sqr(y - cY));
            hist[dist]++;
            if (dist > maxdist) maxdist = dist;
        }




The gray chart is the histogram; the dark bar on it marks its maximum, i.e. the greatest number of points lies at exactly that distance from the center of the rectangle. (A dark circle of that radius is drawn on the rectangle.) It is easy to see that this distance is exactly the half-width of the rectangle: up to it, each circle fits entirely inside the rectangle and the histogram grows linearly (with coefficient π); beyond it, the histogram slowly decreases until the radius reaches the half-length of the rectangle; from then on, only the four "corner arcs" of each circle fit inside, and the histogram drops sharply. Unfortunately, the length cannot be found from this "break" in the histogram: the edges of the rectangle turn out to be too ragged, and the tail of the histogram is too noisy. So we take another route and look at a circle whose radius is 5 pixels larger than the half-width. It will contain two black "segments" lying exactly on the transverse axis of the rectangle:



Now we carefully find the centers of mass of these "segments"; the difficulty here is to keep them separate. We compute the center of mass in each "quadrant" separately, and then combine the quadrants into two pairs according to which centers lie close to each other.
int r1 = 0; // incircle radius
for (int x = maxdist; x >= 3; x--)
{
    if (hist[x] > hist[r1]) r1 = x;
}
int rSample = r1 + 5;
int[] voters = new int[4];
Point[] sums = new Point[4];
Point sampleOrg = new Point(Math.Max((int)(cX - rSample), 0), Math.Max((int)(cY - rSample), 0));
Rectangle sample = new Rectangle(sampleOrg,
    new Size(Math.Min((int)(cX + rSample), 319) - sampleOrg.X,
             Math.Min((int)(cY + rSample), 239) - sampleOrg.Y));
for (int x = sample.Left; x <= sample.Right; x++)
    for (int y = sample.Top; y <= sample.Bottom; y++)
        if (belong[x, y] != biggest_id)
        {
            int dist = (int)Math.Sqrt(Sqr(x - cX) + Sqr(y - cY));
            if (dist > r1 && dist <= rSample)
            {
                int idx = y < cY ? (x < cX ? 1 : 0) : (x < cX ? 2 : 3);
                voters[idx]++;
                sums[idx].X += x;
                sums[idx].Y += y;
            }
        }
PointF adjusted = new PointF();
int vAbove = voters[0] + voters[1], vBelow = voters[2] + voters[3],
    vLeft = voters[2] + voters[1], vRight = voters[0] + voters[3],
    allVoters = vAbove + vBelow;
if (allVoters == 0)
{
    // abort: no black pixels found
}
else
{
    if (vAbove > 0 && vBelow > 0)
    {
        // split vertically
        PointF above = new PointF((float)(sums[0].X + sums[1].X) / vAbove - cX,
                                  (float)(sums[0].Y + sums[1].Y) / vAbove - cY),
               below = new PointF((float)(sums[2].X + sums[3].X) / vBelow - cX,
                                  (float)(sums[2].Y + sums[3].Y) / vBelow - cY);
        double dAbove = Math.Sqrt(above.X * above.X + above.Y * above.Y),
               dBelow = Math.Sqrt(below.X * below.X + below.Y * below.Y);
        if (dAbove >= r1 && dAbove <= rSample && dBelow >= r1 && dBelow <= rSample)
            // the split is valid
            adjusted = new PointF((above.X * vAbove - below.X * vBelow) / allVoters,
                                  (above.Y * vAbove - below.Y * vBelow) / allVoters);
    }
    if (adjusted.X == 0 && adjusted.Y == 0 && vLeft > 0 && vRight > 0)
    {
        // split horizontally
        PointF toleft = new PointF((float)(sums[2].X + sums[1].X) / vLeft - cX,
                                   (float)(sums[2].Y + sums[1].Y) / vLeft - cY),
               toright = new PointF((float)(sums[0].X + sums[3].X) / vRight - cX,
                                    (float)(sums[0].Y + sums[3].Y) / vRight - cY);
        double dLeft = Math.Sqrt(toleft.X * toleft.X + toleft.Y * toleft.Y),
               dRight = Math.Sqrt(toright.X * toright.X + toright.Y * toright.Y);
        if (dLeft >= r1 && dLeft <= rSample && dRight >= r1 && dRight <= rSample)
            // the split is valid
            adjusted = new PointF((toleft.X * vLeft - toright.X * vRight) / allVoters,
                                  (toleft.Y * vLeft - toright.Y * vRight) / allVoters);
    }
}

The resulting adjusted vector points from the center of the rectangle along its transverse axis:


And that is essentially the whole orientation. What remains is a bit of plain geometry: compute the rectangle's length and check whether the length-to-width ratio looks like a real window.
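That remaining geometry might look something like the sketch below. It is only an illustration under my own assumptions: the length is estimated from the blob area divided by the width 2*r1, and the aspect-ratio limits are made-up placeholders rather than the values actually used in the project.

// Orientation of the window, modulo 180 degrees: the adjusted vector lies on
// the transverse axis, so the window's long axis is perpendicular to it.
double angle = Math.Atan2(adjusted.Y, adjusted.X) * 180.0 / Math.PI + 90.0;
angle = ((angle % 180.0) + 180.0) % 180.0;   // fold into [0, 180)

// Treat the blob as a width x length rectangle with area biggest_area.
double width = 2.0 * r1;                     // twice the incircle radius
double length = biggest_area / width;

// Plausibility check: does the shape have window-like proportions?
double aspect = length / width;
bool looksLikeWindow = aspect >= 1.5 && aspect <= 6.0;   // placeholder limits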

Let me demonstrate how it works on another example, where the window does not fit into the frame completely. Since recognition does not rely on the "tail" of the histogram, the incomplete window does not confuse us at all.

Second example
Source Image:


After separation:


After filtering:


Histogram:


Circle with the "segments":


Transverse axis:


I packaged the code as an MRDS service, WindowDetector, in the image and likeness of the standard Technologies\Vision\ColorSegment. My service binds to the video camera, and on every frame update it sends subscribers an UpdateFoundWindow notification with the angle of the window in the frame. The MRDS service that drives the robot binds to WindowDetector the same way it binds to the barcode scanner, and handles messages from both in much the same way: correcting the heading in the first case and the position in the second.
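The heading correction itself is then straightforward: every window is parallel to a warehouse axis, so a sighting pins the heading down modulo 180°, and the dead-reckoned estimate only has to be snapped to the nearest consistent value. A sketch with hypothetical names follows; the real handler lives inside the robot-control service and its message types are omitted here.

// windowAngle: angle of the window in the camera frame, in degrees.
// heading: current dead-reckoned heading estimate, in degrees.
// Assumptions: windows lie along the warehouse's 0/180-degree axis,
// and dead reckoning drifts by well under 90 degrees between sightings.
static double CorrectHeading(double heading, double windowAngle)
{
    double implied = -windowAngle;                        // heading implied by the sighting, modulo 180
    double k = Math.Round((heading - implied) / 180.0);   // pick the 180-degree branch closest to the estimate
    return implied + k * 180.0;
}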

With the window detector, my robot traveled through the warehouse rather briskly:


The robot's fate is a sad one. Just an hour after this video was shot, the wheel sensor got clogged with warehouse dust, and the robot stopped tracking its position. During the robot's initial assembly this sensor is installed before everything else, so to replace it you essentially have to take the whole thing apart and put it back together again. I decided to try swapping the sensor on the "live" robot, but through my own clumsiness I shorted the battery against something inside, and the robot stopped driving altogether. I had to leave it on a storage shelf next to last year's robot, the one based on a Roomba. A veritable robot graveyard rather than a shelf.

So before leaving Britain I never managed to turn Eddie into a working loader robot. But this spring I will most likely return there with spare parts, revive it, and continue the experiments.

Source: https://habr.com/ru/post/164979/

