Before OpenCL 2.0, a kernel could not both read and write the same image. You could create an image with CL_MEM_READ_WRITE, but once the image was passed to a kernel, it had to be qualified as either __read_only (read-only access) or __write_only (write-only access). OpenCL 2.0 adds the ability to read and write an image within a single kernel. However, there are several caveats that we will discuss in detail in this post.

    input1 = clCreateImage(
        oclobjects.context,
        CL_MEM_READ_WRITE | CL_MEM_COPY_HOST_PTR,
        &format,
        &desc,
        &input_data1[0],
        &err
    );
    SAMPLE_CHECK_ERRORS( err );

Code snippet 1. Creating an image buffer with CL_MEM_READ_WRITE

    __kernel void Alpha(
        __read_write image2d_t inputImage1,
        __read_only image2d_t inputImage2,
        uint width,
        uint height,
        float alpha,
        float beta,
        int gamma )

Code snippet 2. In OpenCL 2.0, it became possible to read and write images within a single kernel.

| Options | Description |
|---|---|
| -h, --help | Display this text and exit. |
| -p, --platform <number or string> | Select the platform whose devices are used. |
| -t, --type all \| cpu \| gpu \| acc \| default \| <OpenCL constant for device type> | Select the type of device on which the OpenCL kernel runs. |
| -d, --device <number or string> | Select the device on which all work is performed. |
| -i, --infile <24-bit or 32-bit input bmp file> | Name of the first input file in BMP format. The default is input1.bmp. |
| -j, --infile <24-bit or 32-bit input bmp file> | Name of the second input file in BMP format. The default is input2.bmp. |
| -o, --outfile <24-bit or 32-bit input bmp file> | Name of the output file to write to. The default is output.bmp for OCL1.2 and 20_output.bmp for OCL2.0. |
| -a, --alpha <floating-point value from zero to one> | A non-zero positive value that determines how strongly the two images are blended when combined. The default alpha value is 0.84089642. The default beta value is 0.15950358. |

Each output pixel is computed as follows:

    float beta = 1 - alpha;
    calculatedPixel = ((currentPixelImage1 * alpha) + (currentPixelImage2 * beta) + gamma);
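The blend formula can be tried out on the CPU. The following is a minimal sketch in plain C, not the sample's kernel: the function name `alpha_blend_in_place` and the flat `float` array layout are assumptions made for illustration, and `gamma` is taken as a `float` here for simplicity. The first buffer is read and then overwritten at the same index, which mirrors the role of the `__read_write` image in the `Alpha` kernel.

```c
#include <stddef.h>

/* Blend image2 into image1 in place.
 * image1 stands in for the __read_write image: each element is read
 * and then overwritten at the same index.
 * Hypothetical helper, not part of the original sample. */
void alpha_blend_in_place(float *image1, const float *image2,
                          size_t count, float alpha, float gamma)
{
    const float beta = 1.0f - alpha;  /* same relation as in the formula */

    for (size_t i = 0; i < count; ++i) {
        image1[i] = image1[i] * alpha + image2[i] * beta + gamma;
    }
}
```

Because every element depends only on the values at its own index, the in-place update is safe here; no element ever reads a value that another iteration has already overwritten.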

    // Assumption: imageSampler is not declared in the original snippet.
    // A program-scope sampler such as this one is needed for the kernels to compile.
    __constant sampler_t imageSampler =
        CLK_NORMALIZED_COORDS_FALSE | CLK_ADDRESS_CLAMP | CLK_FILTER_NEAREST;

    __kernel void GaussianBlurHorizontalPass(
        __read_only image2d_t inputImage,
        __write_only image2d_t outputImage,
        __constant float* mask,
        int maskSize)
    {
        int2 currentPosition = (int2)(get_global_id(0), get_global_id(1));
        float4 currentPixel = (float4)(0,0,0,0);
        float4 calculatedPixel = (float4)(0,0,0,0);

        // Accumulate the weighted neighbors along the x axis.
        for(int maskIndex = -maskSize; maskIndex < maskSize+1; ++maskIndex)
        {
            currentPixel = read_imagef(inputImage, imageSampler, currentPosition + (int2)(maskIndex, 0));
            calculatedPixel += currentPixel * mask[maskSize + maskIndex];
        }
        write_imagef(outputImage, currentPosition, calculatedPixel);
    }

    __kernel void GaussianBlurVerticalPass(
        __read_only image2d_t inputImage,
        __write_only image2d_t outputImage,
        __constant float* mask,
        int maskSize)
    {
        int2 currentPosition = (int2)(get_global_id(0), get_global_id(1));
        float4 currentPixel = (float4)(0,0,0,0);
        float4 calculatedPixel = (float4)(0,0,0,0);

        // Accumulate the weighted neighbors along the y axis.
        for(int maskIndex = -maskSize; maskIndex < maskSize+1; ++maskIndex)
        {
            currentPixel = read_imagef(inputImage, imageSampler, currentPosition + (int2)(0, maskIndex));
            calculatedPixel += currentPixel * mask[maskSize + maskIndex];
        }
        write_imagef(outputImage, currentPosition, calculatedPixel);
    }

Code snippet 3. Gaussian blur kernels in OpenCL 1.2

    __kernel void GaussianBlurDualPass(
        __read_only image2d_t inputImage,
        __read_write image2d_t tempRW,
        __write_only image2d_t outputImage,
        __constant float* mask,
        int maskSize)
    {
        int2 currentPosition = (int2)(get_global_id(0), get_global_id(1));
        float4 currentPixel = (float4)(0,0,0,0);
        float4 calculatedPixel = (float4)(0,0,0,0);

        currentPixel = read_imagef(inputImage, currentPosition);

        // Horizontal pass: read the input image, write the intermediate result to tempRW.
        for(int maskIndex = -maskSize; maskIndex < maskSize+1; ++maskIndex)
        {
            currentPixel = read_imagef(inputImage, currentPosition + (int2)(maskIndex, 0));
            calculatedPixel += currentPixel * mask[maskSize + maskIndex];
        }
        write_imagef(tempRW, currentPosition, calculatedPixel);

        // Note: a barrier synchronizes only the work-items of one work-group.
        // OpenCL 2.0 also defines CLK_IMAGE_MEM_FENCE for image memory.
        barrier(CLK_GLOBAL_MEM_FENCE);

        // Reset the accumulator so the vertical pass does not add to the horizontal sum.
        calculatedPixel = (float4)(0,0,0,0);

        // Vertical pass: read the intermediate result back from tempRW.
        for(int maskIndex = -maskSize; maskIndex < maskSize+1; ++maskIndex)
        {
            currentPixel = read_imagef(tempRW, currentPosition + (int2)(0, maskIndex));
            calculatedPixel += currentPixel * mask[maskSize + maskIndex];
        }
        write_imagef(outputImage, currentPosition, calculatedPixel);
    }

Code snippet 4. Gaussian blur kernel in OpenCL 2.0

Source: https://habr.com/ru/post/268851/
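For checking the output of the kernels in code snippets 3 and 4, a CPU reference of the two-pass blur can be handy. The following plain C sketch is an illustration under stated assumptions, not code from the sample: the names `blur_pass` and `gaussian_blur_two_pass` are hypothetical, the image is a single-channel row-major `float` array, and out-of-range coordinates are clamped to the nearest edge. The mask holds `2 * maskSize + 1` weights and is indexed with `maskSize + k`, as in the kernels.

```c
#include <stddef.h>

/* Clamp a coordinate to [0, size - 1]. Edge handling is an assumption. */
static int clamp_coord(int v, int size)
{
    if (v < 0) return 0;
    if (v >= size) return size - 1;
    return v;
}

/* One 1-D convolution pass over the whole image.
 * (dx, dy) = (1, 0) gives the horizontal pass, (0, 1) the vertical pass. */
static void blur_pass(const float *src, float *dst, int width, int height,
                      const float *mask, int maskSize, int dx, int dy)
{
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            float sum = 0.0f;  /* fresh accumulator for every pixel and pass */
            for (int k = -maskSize; k <= maskSize; ++k) {
                int sx = clamp_coord(x + k * dx, width);
                int sy = clamp_coord(y + k * dy, height);
                sum += src[(size_t)sy * width + sx] * mask[maskSize + k];
            }
            dst[(size_t)y * width + x] = sum;
        }
    }
}

/* Horizontal pass into temp, then vertical pass from temp into output. */
void gaussian_blur_two_pass(const float *input, float *temp, float *output,
                            int width, int height,
                            const float *mask, int maskSize)
{
    blur_pass(input, temp, width, height, mask, maskSize, 1, 0);
    /* The whole temp image is complete at this point. */
    blur_pass(temp, output, width, height, mask, maskSize, 0, 1);
}
```

The sequential version makes one dependency explicit: the vertical pass reads neighbors from `temp` that were produced by other iterations of the horizontal pass, so `temp` has to be fully written before the second pass starts. In the single-kernel version, those neighbors are written by other work-items.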