
Looking for a fast, universal library for working with image files, and getting to know Google Benchmark along the way



Nowadays, when neural networks roam Big Data and artificial intelligence ponders whether it would be profitable to be paid for its work in Bitcoin, the task of finding the fastest open-source cross-platform library for loading, saving and transcoding image files may look like a real anachronism. In fact, the task is more relevant than ever: all computer vision and machine learning technologies need to load gigabytes of pictures and sometimes save intermediate data as images, so doing this as fast as possible is highly desirable. In this article we will find the required library and, most importantly, get acquainted with a very useful product that greatly simplifies this and many similar tasks: Google Benchmark.

So, the exact statement of the problem is as follows. The application loads, that is, decodes into memory, jpeg and tiff files with a color depth of 24 and 8 bits, as well as 32-bit bmp files. Image size varies from tiny (32x32 pixels) to large, with 15K resolution. The files are modified along the way, after which they must be saved to disk in the specified formats. All this should be done by an open-source cross-platform library with maximum performance on modern Intel processors that support AVX2 vector instructions. Support in the library for DirectX DXT1 compressed textures is also desirable. The Windows Imaging Component (WIC), the standard framework for working with images in Windows, is taken as the performance baseline; that is, you need to find a library that is as fast as WIC or faster.

But the most important requirement is that the solution is needed right now, or better yet, yesterday.

Meet the libraries for working with bmp, tiff and jpeg


The solution begins with an obvious and simple, though not very quick, step: a thorough search of GitHub, Stack Overflow and the rest of Google for suitable candidates for the role of the required library. There turned out to be only a few: FreeImage, CImg, DevIL, OpenImageIO, Boost GIL and SDL_image 2.0.


All the libraries found were built under Windows at the Visual Studio compiler's maximum optimization level, with the /arch:AVX2 switch.

The same applies to the libjpeg, libpng and libtiff libraries, which, to save time, were taken from a recent OpenCV package.

Meet Google Benchmark


The next step is just as obvious: create a benchmark to compare the performance of the libraries found. Google Benchmark, a microbenchmarking library that is widely known in narrow circles, makes this quick and easy.
Google Benchmark can accurately measure the performance of pieces of code that you place in the body of a C++ loop:

 #include <benchmark/benchmark.h>

 static void BM_foo1(benchmark::State& state) {
   // set-up that should not be timed
   Init_your_code();
   for (auto _ : state) {
     // code to be measured
     your_code_to_benchmark();
   }
 }

Such functions are registered as benchmarks:

 // register the function as a benchmark
 BENCHMARK(BM_foo1);

They are then run with:

 BENCHMARK_MAIN(); 

Google Benchmark then produces a report in the chosen format: console output, JSON or CSV.

The report contains information about the system the benchmarks ran on (processor, cache configuration), the total wall-clock time of each measured function, and the CPU time it consumed. In general these two times differ: the first, for example, includes read/write delays, while the second, for multi-threaded benchmarks, is the sum of the running time across all cores.
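To make the difference between the two columns concrete, here is a minimal standard-library sketch that times one call with both clocks. It is not how Google Benchmark itself measures time. The names `Timing`, `time_once` and `simulated_io_wait` are invented for this illustration, and the 50 ms sleep merely stands in for a blocking disk read. One caveat: with the Microsoft runtime, `std::clock` reports elapsed wall time, so the gap only shows up on platforms where it returns processor time.

```cpp
#include <cassert>
#include <chrono>
#include <ctime>
#include <thread>

// Hypothetical result type for this sketch.
struct Timing {
    double wall_seconds;  // elapsed real time, including any waiting
    double cpu_seconds;   // processor time charged to the process
};

// Runs `work` once and reads both a wall clock and the process CPU clock.
template <typename Fn>
Timing time_once(Fn&& work) {
    const auto wall_start = std::chrono::steady_clock::now();
    const std::clock_t cpu_start = std::clock();
    work();
    const std::clock_t cpu_end = std::clock();
    const auto wall_end = std::chrono::steady_clock::now();
    assert(cpu_end >= cpu_start);
    return {std::chrono::duration<double>(wall_end - wall_start).count(),
            static_cast<double>(cpu_end - cpu_start) / CLOCKS_PER_SEC};
}

// Stand-in for a blocking read: the thread waits but uses almost no CPU.
inline void simulated_io_wait() {
    std::this_thread::sleep_for(std::chrono::milliseconds(50));
}
```

Calling `time_once(simulated_io_wait)` should give a wall time of roughly 50 ms and a CPU time close to zero, which is the same pattern as a report row whose Time is far larger than its CPU value.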

The last value Google Benchmark reports is the number of iterations executed for the function, which is what it needed for a statistically sound, accurate measurement of the running time. The system chooses it automatically after making preliminary measurements.
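The idea of picking the iteration count from preliminary measurements can be sketched with the standard library alone. This is an illustration of the general technique, not Google Benchmark's actual algorithm: the batch size is simply doubled until one batch runs for at least a chosen minimum time. The names `Measurement` and `measure`, and the doubling policy, are assumptions made for this example.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <functional>

// Hypothetical result type for this sketch.
struct Measurement {
    std::uint64_t iterations;   // size of the final, accepted batch
    double seconds_per_iter;    // average time of one call in that batch
    double total_seconds;       // wall time of the whole batch
};

// Doubles the batch size until a batch runs for at least `min_seconds`.
Measurement measure(const std::function<void()>& fn, double min_seconds) {
    assert(min_seconds > 0.0);
    using clock = std::chrono::steady_clock;
    std::uint64_t n = 1;
    for (;;) {
        const auto start = clock::now();
        for (std::uint64_t i = 0; i < n; ++i) fn();
        const double elapsed =
            std::chrono::duration<double>(clock::now() - start).count();
        // Accept the batch once it is long enough (or the cap is reached).
        if (elapsed >= min_seconds || n >= (1ull << 40)) {
            return {n, elapsed / static_cast<double>(n), elapsed};
        }
        n *= 2;  // batch was too short to trust, try a bigger one
    }
}
```

With such a scheme a function that takes hundreds of milliseconds is accepted after one or two calls, while a function that takes about a millisecond needs hundreds, which matches the spread of values in the Iterations column.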

What is an "accurate measurement" of running time? One could write a dissertation on the subject, but here it is enough to say that:


One point deserves attention: Google Benchmark does not flush the cache between benchmark iterations. If you need that, you have to take care of it yourself.
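One common best-effort way of taking care of it is to walk over a buffer larger than the last-level cache before each measured run, so that the previous contents are pushed out. The helper below is a hypothetical sketch and not part of Google Benchmark. The name `evict_cache` and the 64-byte line size are assumptions, and real eviction depends on the cache's associativity and replacement policy, so it gives no guarantee.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Writes one byte in every cache line of a large scratch buffer and
// returns the number of lines touched. Pass a size well above the L3 size.
std::size_t evict_cache(std::size_t buffer_bytes, std::size_t line_bytes = 64) {
    assert(line_bytes > 0 && buffer_bytes >= line_bytes);
    static std::vector<unsigned char> buffer;  // reused between calls
    if (buffer.size() != buffer_bytes) buffer.assign(buffer_bytes, 0);
    // volatile access keeps the compiler from removing the writes
    volatile unsigned char* data = buffer.data();
    std::size_t lines_touched = 0;
    for (std::size_t i = 0; i < buffer_bytes; i += line_bytes) {
        data[i] = static_cast<unsigned char>(data[i] + 1);
        ++lines_touched;
    }
    return lines_touched;
}
```

Inside a Google Benchmark loop, such a call would go between `state.PauseTiming()` and `state.ResumeTiming()` so that the flush itself is not counted. For the machine in the report, with about 8 MB of L3, a buffer of 32 MB or so would be a reasonable starting point.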

But Google Benchmark can do many other things as well:


Google Benchmark is downloaded from its repository on GitHub and built for the target platform with CMake (a Visual Studio generator is available for Windows). The resulting library is linked to your project (on Windows you also need to link against the shlwapi library), the benchmark.h header file is included in your code, and after that everything works as described above.

If it doesn't work, the only place besides the repository site mentioned above where you can get at least some information and help on Google Benchmark is the product's dedicated forum.

In our case everything worked without problems. After discussion with the customers, four benchmarks were defined, each loading a file and saving it under a different name: 8-bit jpeg, 24-bit jpeg, 24-bit tiff and 32-bit bmp.


Meet the results


It was originally planned that all the libraries found, namely FreeImage, CImg, DevIL, OpenImageIO, Boost GIL and SDL_image 2.0, would take part in the comparison with the Windows Imaging Component (WIC). But we were asked to keep the last three, which depend on such "monsters" as Boost and SDL, in reserve for an emergency, in case the required library was not found among the first three. Fortunately, it was found, although not immediately.

Below is a report generated by Google Benchmark, which shows that:


That leaves DevIL. It shows excellent results when loading bmp and tiff (3 and 2.8 times faster than WIC, respectively) and 8-bit grayscale jpeg (1.75 times faster than WIC), but stumbles slightly on ordinary 24-bit jpeg, where it is as much as 3% slower than WIC.

08/15/18 11:15:44
Running c:\WIC\WIC_test\Release\WIC_test.exe
Run on (8 X 4008 MHz CPU s)
CPU Caches:
  L1 Data 32K (x4)
  L1 Instruction 32K (x4)
  L2 Unified 262K (x4)
  L3 Unified 8388K (x1)
------------------------------------------------------
Benchmark                  Time        CPU  Iterations
------------------------------------------------------
BM_WIC8jpeg               72 ms      70 ms          11
BM_cimg8jpeg             562 ms      52 ms          10
BM_FreeImage8jpeg        147 ms     144 ms           5
BM_devIL8jpeg             41 ms      41 ms          17
BM_WIC24jpeg             266 ms     260 ms           3
BM_cimg24jpeg            656 ms     128 ms           6
BM_FreeImage24jpeg       594 ms     594 ms           1
BM_devIL24jpeg           276 ms     276 ms           3
BM_WIC24tiff             844 ms     844 ms           1
BM_cimg24tiff            808 ms     131 ms           5
BM_FreeImage24tiff       953 ms     938 ms           1
BM_devIL24tiff           305 ms     305 ms           2
BM_WIC32                   3 ms       3 ms         236
BM_cimg32                 71 ms       7 ms          90
BM_FreeImage32             6 ms       5 ms         112
BM_devIL32                 1 ms       1 ms         747

Of course, DevIL could have been rejected at this stage, but here another library enters the frame: libjpeg-turbo.

Its entrance deserves applause. libjpeg-turbo is a cross-platform library that fully implements the libjpeg API and adds functionality of its own (for example, working with 32-bit buffers). On the x86 architecture it makes heavy use of vector instructions (SSE2, AVX2) and, according to its authors, is 2 to 6 times faster than libjpeg (!).

Therefore, the next step is to build DevIL with libjpeg-turbo instead of libjpeg. libjpeg-turbo builds with Visual Studio via CMake without any problems, and then almost immediately (after changing the single #define in a DevIL header file that specifies the libjpeg version) starts working as part of DevIL.

As a result, the Google benchmark report looks like this:

------------------------------------------------------
Benchmark                  Time        CPU  Iterations
------------------------------------------------------
BM_WIC8jpeg               72 ms      68 ms           9
BM_cimg8jpeg             565 ms      39 ms          10
BM_FreeImage8jpeg        148 ms     141 ms           5
BM_devIL8jpeg             31 ms      31 ms          24
BM_WIC24jpeg             269 ms     266 ms           2
BM_cimg24jpeg            675 ms     131 ms           5
BM_FreeImage24jpeg       604 ms     594 ms           1
BM_devIL24jpeg           149 ms     150 ms           5
BM_WIC24tiff             833 ms     828 ms           1
BM_cimg24tiff            785 ms     138 ms           5
BM_FreeImage24tiff       943 ms     938 ms           1
BM_devIL24tiff           318 ms     320 ms           2
BM_WIC32                   4 ms       3 ms         236
BM_cimg32                 74 ms       8 ms          56
BM_FreeImage32             6 ms       5 ms         100
BM_devIL32                 1 ms       1 ms         747

Of course, a jpeg speed-up of even two times over libjpeg is not visible here, but that is to be expected: the speed advantage applies only to jpeg encoding and decoding, while the test also includes file read/write overhead.
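This reasoning is the familiar Amdahl's-law argument: if only part of the run time is spent in the codec, speeding up the codec speeds up the whole test by less. A small sketch of the arithmetic follows. The function name `overall_speedup` and its parameters are made up for this illustration, and the article gives no measured split between decoding and file I/O, so any fraction plugged in is hypothetical.

```cpp
#include <cassert>

// decode_fraction: share of the original run time spent in the jpeg codec (0..1)
// codec_speedup:   how many times faster the replacement codec is
// Returns the speed-up of the whole load-and-save test.
double overall_speedup(double decode_fraction, double codec_speedup) {
    assert(decode_fraction >= 0.0 && decode_fraction <= 1.0);
    assert(codec_speedup > 0.0);
    const double remaining_time =
        (1.0 - decode_fraction) + decode_fraction / codec_speedup;
    return 1.0 / remaining_time;
}
```

For example, under the purely hypothetical assumption that 70% of the time went into the codec and the new codec is 3 times faster, `overall_speedup(0.7, 3.0)` gives 1.875, which is of the same order as the change seen for 24-bit jpeg between the two reports (276 ms versus 149 ms).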

But it is clear that, on average, DevIL is faster than WIC by a factor of 2.3 for 8-bit jpeg, 1.8 for 24-bit jpeg, 2.7 for 24-bit tiff and 3.5 for 32-bit bmp.
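Ratios like these can be recomputed directly from the Time column. The sketch below uses the WIC and DevIL values from the second report (DevIL built with libjpeg-turbo). The names `Pair`, `second_report`, `speedup` and `print_speedups` are invented for the example.

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <vector>

// One test case: wall time for WIC and for DevIL, in milliseconds.
struct Pair {
    std::string test;
    double wic_ms;
    double devil_ms;
};

// Time-column values copied from the second report.
std::vector<Pair> second_report() {
    return {{"8-bit jpeg", 72.0, 31.0},
            {"24-bit jpeg", 269.0, 149.0},
            {"24-bit tiff", 833.0, 318.0},
            {"32-bit bmp", 4.0, 1.0}};
}

// How many times faster the candidate is than the baseline.
double speedup(double baseline_ms, double candidate_ms) {
    assert(candidate_ms > 0.0);
    return baseline_ms / candidate_ms;
}

// Prints one line per test case with the DevIL-over-WIC ratio.
void print_speedups() {
    for (const Pair& p : second_report()) {
        std::printf("%-12s %.2fx\n", p.test.c_str(),
                    speedup(p.wic_ms, p.devil_ms));
    }
}
```

From this single report the ratios come out at roughly 2.3, 1.8, 2.6 and 4.0. The tiff and bmp values differ a little from the averages quoted above, which would be consistent with averaging over both runs and with the report rounding to whole milliseconds, although the article does not say how the averages were taken.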

Problem solved. The whole solution took three summer working days just before a vacation. Of course, with a little more time a library with even more impressive results might have turned up, and with significantly more time I might have written the required library myself.

But what we have is impressive enough. So if you are looking for a fast and easy-to-use cross-platform library for working with image files, take a look at DevIL, and if you need to make quick and reliable comparative measurements of code, Google Benchmark is at your service.

Source: https://habr.com/ru/post/425021/

