
This article was prompted by the following post: "Converting bmp images into a matrix and back for further processing".
At one time I had to write a lot of research C# code implementing various compression and processing algorithms. I mention that the code was research code on purpose: such code has specific requirements. On the one hand, optimization is not very important; what matters is testing the idea. On the other hand, I would rather the test did not stretch for hours or days (when running with different parameters, or processing a large set of test images). The bmp.GetPixel(x, y) method for reading pixel brightness, used in the aforementioned post, is where my first project started too. It is the slowest way, albeit the easiest. Is it worth bothering here? Let's measure.
We will use the classic Bitmap class (System.Drawing.Bitmap). It is convenient because it hides the details of raster format encoding, which usually do not interest us. It supports all the common formats, such as BMP, GIF, JPEG and PNG.
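By the way, here is a first tip for beginners. The Bitmap class has a constructor that opens an image file. But it has an unpleasant feature: it keeps the file locked, so repeated attempts to open it lead to an exception. To correct this behavior, you can use the following method, which makes the bitmap "release" the file immediately. Note that a Bitmap created from a stream needs that stream to stay open for its whole lifetime, so the method returns an independent copy made before the stream is closed:
    // Requires: using System.Drawing; using System.IO;
    public static Bitmap LoadBitmap(string fileName)
    {
        using (FileStream fs = new FileStream(fileName, FileMode.Open, FileAccess.Read, FileShare.Read))
        using (Bitmap fromStream = new Bitmap(fs))
            // Copy the pixels so the result no longer depends on the stream or the file.
            return new Bitmap(fromStream);
    }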
Measurement Technique
We will measure by converting the image-processing classic, Lena (http://en.wikipedia.org/wiki/Lenna), into an array and back into a Bitmap. It is a free image that can be found in a large number of works on image processing (and at the top of this post, too). Its size is 512 × 512 pixels.
A little about the technique: in such cases I prefer not to chase high-precision timers but simply to perform the same action many times. Of course, this means the data and code will already be in the processor cache. On the other hand, it separates out the one-time costs of the first run, such as JIT compilation of the MSIL code into machine code and other overhead. To ensure this, we first run each piece of code once as a so-called "warm-up".
Compile the code in Release mode. Run it outside Visual Studio; it is even advisable to close the studio altogether, since I have seen cases where the mere fact that it was running affected the results. It is also advisable to close other applications.
Run the code several times until the results are typical; you need to make sure they are not affected by some unexpected process, say, an antivirus waking up. All these measures give stable, repeatable results.
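The warm-up-then-repeat procedure described above can be sketched roughly as follows. This is a minimal sketch, not the project's actual measuring code: the `Bench` class and the `MeasureSeconds` name are made up for illustration, and `Stopwatch` is my choice of timer (the article does not say which one was used).

```csharp
using System;
using System.Diagnostics;

public static class Bench
{
    // Runs `action` once as a warm-up (so JIT compilation is not timed),
    // then `runs` more times, and returns the total time of the timed runs in seconds.
    // Hypothetical helper: the name and signature are illustrative only.
    public static double MeasureSeconds(Action action, int runs)
    {
        if (action == null) throw new ArgumentNullException(nameof(action));
        if (runs <= 0) throw new ArgumentOutOfRangeException(nameof(runs));

        action(); // warm-up call, deliberately not timed

        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < runs; ++i)
            action();
        sw.Stop();
        return sw.Elapsed.TotalSeconds;
    }
}
```

With a helper like this, a figure "per 100 transformations" would correspond to something like `Bench.MeasureSeconds(() => { /* convert to array and back */ }, 100)`, repeated a few times to check that the numbers are stable.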
"Naive" method
This method was used in the original post. It relies on the Bitmap.GetPixel(x, y) method. Here is the full code of such a method, which converts the contents of a bitmap into a three-dimensional byte array. The first dimension is the color component (0 to 2), the second is the y position, and the third is the x position. That is simply how it turned out in my projects; if you want to arrange the data differently, I think you will have no problems.
    public static byte[, ,] BitmapToByteRgbNaive(Bitmap bmp)
    {
        int width = bmp.Width, height = bmp.Height;
        byte[, ,] res = new byte[3, height, width];
        for (int y = 0; y < height; ++y)
        {
            for (int x = 0; x < width; ++x)
            {
                Color color = bmp.GetPixel(x, y);
                res[0, y, x] = color.R;
                res[1, y, x] = color.G;
                res[2, y, x] = color.B;
            }
        }
        return res;
    }
The inverse transform is similar, only the data goes in the other direction. I will not give its code here; anyone can look at the project source via the link at the end of the article.
100 round-trip conversions (image to array and back) on my laptop with an i5-2520M 2.5 GHz processor take 43.90 seconds. So for a 512 × 512 image, almost half a second (about 0.44 s per round trip) is spent on nothing but transferring the data!
Direct work with Bitmap data
Fortunately, the Bitmap class provides a faster way to access its data. To use it, we need the pointer provided by the BitmapData class and some pointer arithmetic:
    // Requires: using System.Drawing; using System.Drawing.Imaging;
    // and compiling with unsafe code enabled.
    public unsafe static byte[, ,] BitmapToByteRgb(Bitmap bmp)
    {
        int width = bmp.Width, height = bmp.Height;
        byte[, ,] res = new byte[3, height, width];
        BitmapData bd = bmp.LockBits(new Rectangle(0, 0, width, height), ImageLockMode.ReadOnly, PixelFormat.Format24bppRgb);
        try
        {
            byte* curpos;
            for (int h = 0; h < height; h++)
            {
                // Rows are bd.Stride bytes apart; each pixel is stored as B, G, R.
                curpos = ((byte*)bd.Scan0) + h * bd.Stride;
                for (int w = 0; w < width; w++)
                {
                    res[2, h, w] = *(curpos++);
                    res[1, h, w] = *(curpos++);
                    res[0, h, w] = *(curpos++);
                }
            }
        }
        finally
        {
            bmp.UnlockBits(bd);
        }
        return res;
    }
This approach gives us 0.533 seconds per 100 transformations (an 82-fold speedup)! I think that already answers the question of whether more complex conversion code is worth writing. But can we speed things up further while staying within managed code?
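The conversion above steps from row to row by bd.Stride rather than by width * 3, because each row of a 24bpp bitmap is padded to a multiple of four bytes. The sketch below only illustrates that arithmetic. The `Rows24bpp` class and its method names are made up for this example, and it assumes a top-down bitmap with a positive stride; in real code, read bd.Stride instead of computing it.

```csharp
using System;

public static class Rows24bpp
{
    public const int BytesPerPixel = 3; // 24bpp: one byte each for B, G, R

    // Length of one row in bytes, rounded up to a multiple of 4.
    // Illustrative only: real code should take this value from BitmapData.Stride.
    public static int StrideFor(int width)
    {
        if (width <= 0) throw new ArgumentOutOfRangeException(nameof(width));
        return (width * BytesPerPixel + 3) / 4 * 4;
    }

    // Byte offset (from the start of the pixel data) of the blue component
    // of pixel (x, y); green and red follow at +1 and +2.
    public static int OffsetOf(int x, int y, int stride)
    {
        return y * stride + x * BytesPerPixel;
    }
}
```

For the 512-pixel-wide test image the row length is already a multiple of four, so there is no padding, but for a width of 5, for example, each 15-byte row is followed by one padding byte.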
Arrays vs pointers
Multidimensional arrays are not the fastest data structures. Every access is bounds-checked, and the element's address is computed with multiplications and additions. Since pointer arithmetic has already given us a significant speedup when working with the Bitmap data, why not try it on the multidimensional array as well? Here is the code of the forward conversion:
    public unsafe static byte[, ,] BitmapToByteRgbQ(Bitmap bmp)
    {
        int width = bmp.Width, height = bmp.Height;
        byte[, ,] res = new byte[3, height, width];
        BitmapData bd = bmp.LockBits(new Rectangle(0, 0, width, height), ImageLockMode.ReadOnly, PixelFormat.Format24bppRgb);
        try
        {
            byte* curpos;
            fixed (byte* _res = res)
            {
                // The three color planes lie one after another in memory: R, then G, then B.
                byte* _r = _res, _g = _res + width * height, _b = _res + 2 * width * height;
                for (int h = 0; h < height; h++)
                {
                    curpos = ((byte*)bd.Scan0) + h * bd.Stride;
                    for (int w = 0; w < width; w++)
                    {
                        // The bitmap stores each pixel as B, G, R.
                        *_b = *(curpos++); ++_b;
                        *_g = *(curpos++); ++_g;
                        *_r = *(curpos++); ++_r;
                    }
                }
            }
        }
        finally
        {
            bmp.UnlockBits(bd);
        }
        return res;
    }
The result? 0.162 seconds per 100 transformations. That is another 3.3-fold speedup (about 270 times faster than the "naive" version). This is the code I used in my research on algorithms.
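The pointer version works because a byte[3, height, width] array is stored in row-major order, so the G plane starts width * height bytes after the R plane and the B plane starts 2 * width * height bytes after it. The following sketch checks that layout in safe code. The `PlanarLayout` class and its method names are made up for illustration and are not part of the article's project.

```csharp
using System;

public static class PlanarLayout
{
    // Linear position of res[c, y, x] inside a byte[3, height, width] array:
    // rectangular arrays are stored row-major, so the last index varies fastest.
    public static int FlatIndex(int c, int y, int x, int height, int width)
    {
        return (c * height + y) * width + x;
    }

    // Copies a rectangular byte array into a one-dimensional array,
    // preserving the order in which the elements lie in memory.
    public static byte[] Flatten(byte[,,] src)
    {
        if (src == null) throw new ArgumentNullException(nameof(src));
        byte[] flat = new byte[src.Length];
        Buffer.BlockCopy(src, 0, flat, 0, flat.Length);
        return flat;
    }
}
```

FlatIndex(1, 0, 0, height, width) equals width * height, which is exactly the offset used for the _g pointer in the method above.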
Why copy the data at all?
It may not be obvious why the data should be copied out of the Bitmap at all. Why not carry out all the transformations in place? I agree that this is one possible option. But many algorithms are more convenient to test on floating-point data: then there are no problems with overflow or loss of precision at intermediate stages. Conversion to a double or float array is done in the same way. The reverse conversion requires a range check when converting back to byte. Here is simple code for this check:
    private static byte Limit(double x)
    {
        if (x < 0) return 0;
        if (x > 255) return 255;
        return (byte)x;
    }
Adding such checks and type conversions slows the code down. The version with pointer arithmetic on double arrays takes 0.713 seconds (per 100 transformations). But compared with the "naive" variant it is still lightning fast.
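Note that the cast (byte)x in Limit truncates, so 199.9 becomes 199. Whether that matters depends on the algorithm being tested. If rounding to the nearest value is preferred, a variant along the following lines could be used. This is my own variation, not code from the article's project, and the `Clamp` class and `LimitRounded` name are made up for illustration.

```csharp
using System;

public static class Clamp
{
    // Variant of Limit that rounds to the nearest integer instead of truncating
    // and maps NaN to 0. Hypothetical helper, not part of the original project.
    public static byte LimitRounded(double x)
    {
        if (double.IsNaN(x) || x < 0) return 0;
        if (x > 255) return 255;
        // x is now in [0, 255]; adding 0.5 before truncation rounds halves up.
        return (byte)(x + 0.5);
    }
}
```

The extra addition and NaN check cost a little time per sample, so they would have to be measured the same way as the other variants.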
And if you need it faster?
If you need it faster, write the transfer and processing in C or assembly and use SIMD instructions. Load the raster format directly, without the Bitmap wrapper. Of course, in that case we leave managed code behind, with all the attendant advantages and disadvantages. And it makes sense to do so only for an already debugged algorithm.
The full code for the article can be found here:
rasterconversion.codeplex.com/SourceControl/latest
Update 2013-10-08: At the suggestion of commenters, I added a variant that transfers the data to an array using Marshal.Copy(). This was done purely for testing purposes, since this way of working has its limitations:
- The order of the data is exactly the same as in the original Bitmap, that is, the components are interleaved. If we want to separate them, we still have to loop over the array and copy the data.
- The samples remain of type byte, whereas it is often convenient to perform intermediate calculations in floating point.
- Marshal.Copy() works with one-dimensional arrays. Yes, they are of course the fastest, and writing rgb[x + width * y] everywhere is not very hard, but still...
Copying in both directions this way takes 0.158 seconds (per 100 transformations). Compared with the more flexible pointer-based variant, the speedup is very small and lies within the run-to-run variation of the results.
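To illustrate the first limitation: after a Marshal.Copy() the buffer still holds interleaved B, G, R bytes with row padding, so separating the components takes another pass. Below is a rough sketch of that pass over a plain byte array. The `Deinterleave` class and `BgrToPlanarRgb` name are made up for this example, and it assumes the buffer was filled from a locked 24bpp bitmap with a positive stride.

```csharp
using System;

public static class Deinterleave
{
    // Splits an interleaved 24bpp buffer (B, G, R per pixel, rows `stride` bytes apart)
    // into a planar [3, height, width] array with R at index 0, G at 1 and B at 2.
    // Hypothetical helper: assumes a top-down buffer with a positive stride.
    public static byte[,,] BgrToPlanarRgb(byte[] bgr, int width, int height, int stride)
    {
        if (bgr == null) throw new ArgumentNullException(nameof(bgr));
        if (width <= 0 || height <= 0) throw new ArgumentOutOfRangeException();
        if (stride < width * 3 || bgr.Length < (long)stride * (height - 1) + width * 3)
            throw new ArgumentException("Buffer is too small for the given dimensions.");

        byte[,,] res = new byte[3, height, width];
        for (int y = 0; y < height; ++y)
        {
            int pos = y * stride; // start of row y; padding at the end of a row is skipped
            for (int x = 0; x < width; ++x)
            {
                res[2, y, x] = bgr[pos++]; // blue
                res[1, y, x] = bgr[pos++]; // green
                res[0, y, x] = bgr[pos++]; // red
            }
        }
        return res;
    }
}
```

This extra loop does essentially the same work as the direct conversion, which is consistent with Marshal.Copy() bringing no real gain once the components have to be separated.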
Update 2016-04-25: The user Ant00 pointed out an error in the code of the BitmapToByteRgbQ method. It did not affect the timing, but the data was transferred incorrectly. The mistake crept in when I copied fragments from working code. It is now corrected. Thank you for your persistence (it took me more than one attempt to look carefully at the code in an article that is already 2.5 years old).