The System.Drawing.Bitmap class is very useful in .NET because it lets you read and save files in a variety of image formats. The only problem is that it is poorly suited to pixel-by-pixel processing, for example when you need to convert a bitmap to black and white. Here is a small sketch on this topic.
// load the source image
sourceBitmap = (Bitmap)Image.FromFile("Zap.png");
// create an empty target bitmap with the same size and pixel format
targetBitmap = new Bitmap(sourceBitmap.Width, sourceBitmap.Height, sourceBitmap.PixelFormat);
We want targetBitmap to be the same as sourceBitmap, only black and white. In C# this is simple to do:
void NaïveBlackAndWhite()
{
    for (int y = 0; y < sourceBitmap.Height; ++y)
        for (int x = 0; x < sourceBitmap.Width; ++x)
        {
            Color c = sourceBitmap.GetPixel(x, y);
            // the weighted sum of the color channels gives the gray level
            byte rgb = (byte)(0.3 * c.R + 0.59 * c.G + 0.11 * c.B);
            targetBitmap.SetPixel(x, y, Color.FromArgb(c.A, rgb, rgb, rgb));
        }
}
This solution is clear and simple but, unfortunately, terribly inefficient. To get faster code, you can try writing the whole thing in C++. First, create a structure to store the color values of a pixel.
// 32bpp RGBA pixel; the bytes are laid out in memory as B, G, R, A
struct Pixel {
    BYTE Blue;
    BYTE Green;
    BYTE Red;
    BYTE Alpha;
};
Now you can write a function that makes a pixel black and white:
Pixel MakeGrayscale(const Pixel& pixel)
{
    // the weighted sum of the color channels gives the gray level
    const BYTE scale = static_cast<BYTE>(0.3 * pixel.Red + 0.59 * pixel.Green + 0.11 * pixel.Blue);
    Pixel p;
    p.Red = p.Green = p.Blue = scale;
    p.Alpha = pixel.Alpha;
    return p;
}
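As an aside, the same weighting can be approximated without floating point. The sketch below is not something the article measures: the name MakeGrayscaleFixedPoint and the integer weights 77, 151 and 28 (the 0.3 / 0.59 / 0.11 factors scaled by 256 and rounded) are assumptions made for illustration, and the Pixel struct is repeated with std::uint8_t in place of BYTE so that the block compiles on its own.

```cpp
#include <cstdint>

// Same byte layout as the Pixel struct above (B, G, R, A).
struct Pixel {
    std::uint8_t Blue;
    std::uint8_t Green;
    std::uint8_t Red;
    std::uint8_t Alpha;
};

// Hypothetical helper, not part of the original code.
// Weights are 0.3, 0.59 and 0.11 scaled by 256 and rounded: 77 + 151 + 28 = 256,
// so a pure white pixel still maps to exactly 255.
inline Pixel MakeGrayscaleFixedPoint(const Pixel& pixel)
{
    const unsigned sum = 77u * pixel.Red + 151u * pixel.Green + 28u * pixel.Blue;
    const std::uint8_t gray = static_cast<std::uint8_t>(sum >> 8);  // divide by 256
    return Pixel{gray, gray, gray, pixel.Alpha};
}
```

Because the weights are rounded, the result can differ by one gray level from the floating-point version for some inputs; whether it is actually faster would have to be measured.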
Now we write the traversal function itself:
CPPSIMDLIBRARY_API void AlterBitmap(BYTE* src, BYTE* dst, int width, int height, int stride)
{
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x)
        {
            // stride is the length of one row in bytes, including any padding
            int offset = x * sizeof(Pixel) + y * stride;
            Pixel& s = *reinterpret_cast<Pixel*>(src + offset);
            Pixel& d = *reinterpret_cast<Pixel*>(dst + offset);
            // write the grayscale pixel to d
            d = MakeGrayscale(s);
        }
    }
}
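To see why the offset uses stride rather than width * sizeof(Pixel), here is a small self-contained sketch that runs the same traversal over a 2×2 image whose rows are padded from 8 to 12 bytes. The names AlterBitmapPlain and GrayscaleDemo, the padding size and the 0xEE filler byte are all made up for the illustration; the DLL export macro is dropped, and std::memcpy replaces the reinterpret_cast so the block stays portable.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

struct Pixel {
    std::uint8_t Blue;
    std::uint8_t Green;
    std::uint8_t Red;
    std::uint8_t Alpha;
};

static Pixel MakeGrayscale(const Pixel& p)
{
    const std::uint8_t gray =
        static_cast<std::uint8_t>(0.3 * p.Red + 0.59 * p.Green + 0.11 * p.Blue);
    return Pixel{gray, gray, gray, p.Alpha};
}

// Same loop structure as AlterBitmap above (hypothetical name, no export macro).
static void AlterBitmapPlain(const std::uint8_t* src, std::uint8_t* dst,
                             int width, int height, int stride)
{
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            // Row start is y * stride, not y * width * sizeof(Pixel).
            const int offset = x * static_cast<int>(sizeof(Pixel)) + y * stride;
            Pixel s;
            std::memcpy(&s, src + offset, sizeof(Pixel));
            const Pixel d = MakeGrayscale(s);
            std::memcpy(dst + offset, &d, sizeof(Pixel));
        }
    }
}

// Hypothetical demo: row 0 is pure red, row 1 is pure green,
// and every row carries 4 padding bytes filled with 0xEE.
std::vector<std::uint8_t> GrayscaleDemo()
{
    const int width = 2, height = 2, stride = 12;
    std::vector<std::uint8_t> src(stride * height, 0xEE);
    std::vector<std::uint8_t> dst(stride * height, 0xEE);
    const Pixel red{0, 0, 255, 255};
    const Pixel green{0, 255, 0, 255};
    for (int x = 0; x < width; ++x) {
        std::memcpy(&src[x * sizeof(Pixel)], &red, sizeof(Pixel));
        std::memcpy(&src[stride + x * sizeof(Pixel)], &green, sizeof(Pixel));
    }
    AlterBitmapPlain(src.data(), dst.data(), width, height, stride);
    return dst;
}
```

The traversal never touches the padding bytes at the end of each row, which is exactly what the stride-based offset is for.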
All that remains is to call it from C#.
void UnmanagedBlackAndWhite()
{
    // lock the bits of both bitmaps in memory
    Rectangle rect = new Rectangle(0, 0, sourceBitmap.Width, sourceBitmap.Height);
    BitmapData srcData = sourceBitmap.LockBits(rect, ImageLockMode.ReadWrite, sourceBitmap.PixelFormat);
    BitmapData dstData = targetBitmap.LockBits(rect, ImageLockMode.ReadWrite, sourceBitmap.PixelFormat);
    // call the unmanaged function (imported via P/Invoke)
    AlterBitmap(srcData.Scan0, dstData.Scan0, srcData.Width, srcData.Height, srcData.Stride);
    // unlock the bits
    sourceBitmap.UnlockBits(srcData);
    targetBitmap.UnlockBits(dstData);
}
This improved the speed, but I wanted more. I added an OpenMP directive before the y loop and got a predictable 2x speedup. Then I wanted to experiment and try applying SIMD as well. For this, I wrote the following, not very readable, code:
CPPSIMDLIBRARY_API void AlterBitmap(BYTE* src, BYTE* dst, int width, int height, int stride)
{
    // per-channel weights; _mm_set_ps takes its arguments from the highest
    // element to the lowest, so the order here is alpha, red, green, blue
    static __m128 factor = _mm_set_ps(1.0f, 0.3f, 0.59f, 0.11f);
    #pragma omp parallel for
    for (int y = 0; y < height; ++y)
    {
        const int offset = y * stride;
        __m128i* s = (__m128i*)(src + offset);
        __m128i* d = (__m128i*)(dst + offset);
        // each iteration handles 4 pixels (16 bytes); if width is not
        // a multiple of 4, the remaining pixels in the row are skipped
        for (int x = 0; x < (width >> 2); ++x) {
            // go through the 4 pixels one at a time
            for (int p = 0; p < 4; ++p)
            {
                // convert the four bytes of the pixel to floats
                __m128 pixel;
                pixel.m128_f32[0] = s->m128i_u8[(p<<2)];
                pixel.m128_f32[1] = s->m128i_u8[(p<<2)+1];
                pixel.m128_f32[2] = s->m128i_u8[(p<<2)+2];
                pixel.m128_f32[3] = s->m128i_u8[(p<<2)+3];
                // four multiplications in a single instruction!
                pixel = _mm_mul_ps(pixel, factor);
                // sum the weighted color channels
                const BYTE sum = (BYTE)(pixel.m128_f32[0] + pixel.m128_f32[1] + pixel.m128_f32[2]);
                // write the gray value and the original alpha
                d->m128i_u8[p<<2] = d->m128i_u8[(p<<2)+1] = d->m128i_u8[(p<<2)+2] = sum;
                d->m128i_u8[(p<<2)+3] = (BYTE)pixel.m128_f32[3];
            }
            s++;
            d++;
        }
    }
}
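For comparison, the OpenMP step on its own (without the SIMD part) might look roughly like the sketch below. The name AlterBitmapParallel is hypothetical, the Pixel struct and MakeGrayscale are repeated with std::uint8_t in place of BYTE so the block compiles on its own, and std::memcpy is used in place of the reinterpret_cast. It assumes the compiler is run with OpenMP enabled (for example /openmp or -fopenmp); without that flag the pragma is ignored and the loop simply runs on one thread.

```cpp
#include <cstdint>
#include <cstring>

struct Pixel {
    std::uint8_t Blue;
    std::uint8_t Green;
    std::uint8_t Red;
    std::uint8_t Alpha;
};

static Pixel MakeGrayscale(const Pixel& p)
{
    const std::uint8_t gray =
        static_cast<std::uint8_t>(0.3 * p.Red + 0.59 * p.Green + 0.11 * p.Blue);
    return Pixel{gray, gray, gray, p.Alpha};
}

// Hypothetical name; a scalar traversal with the rows split across threads.
void AlterBitmapParallel(const std::uint8_t* src, std::uint8_t* dst,
                         int width, int height, int stride)
{
    // Rows do not depend on each other, so each thread can take its own
    // share of y values without any synchronization.
    #pragma omp parallel for
    for (int y = 0; y < height; ++y) {
        const std::uint8_t* srcRow = src + y * stride;
        std::uint8_t* dstRow = dst + y * stride;
        for (int x = 0; x < width; ++x) {
            Pixel s;
            std::memcpy(&s, srcRow + x * sizeof(Pixel), sizeof(Pixel));
            const Pixel d = MakeGrayscale(s);
            std::memcpy(dstRow + x * sizeof(Pixel), &d, sizeof(Pixel));
        }
    }
}
```

Parallelizing the outer loop rather than the inner one keeps the per-thread work large (a whole row at a time), which is presumably why the directive was placed before the y loop.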
Although the SIMD version performs four multiplications at once (the _mm_mul_ps instruction), all these conversions yielded no gain over ordinary operations; on the contrary, the algorithm ran slower. Here are the results of the functions on a 360 × 480 image. A 2-core MacBook with 4 GB of RAM was used, and the results were averaged.
SetPixel/GetPixel is evil; you should not touch them.
Source: https://habr.com/ru/post/60085/