
Compressing photos without apparent loss of quality: the Yelp experience

More than 100 million user photos are stored on Yelp, from pictures of dinners and haircuts to one of our newest features, #yelfies. These images account for the bulk of traffic for users of the app and the website, and storing and transmitting them is expensive. In our effort to give people the best possible service, we worked hard to optimize all of these photos and achieved an average size reduction of 30%. This saves our users time and bandwidth and reduces our cost of serving the images. Oh, and we did it without degrading photo quality!

Initial data


Yelp has been storing user-uploaded photos for 12 years. We save lossless formats (PNG, GIF) as PNG and everything else as JPEG. We use Python and Pillow to save the files, and a photo upload starts off with roughly this snippet:

# do a typical thumbnail, preserving aspect ratio
new_photo = photo.copy()
new_photo.thumbnail(
    (width, height),
    resample=PIL.Image.ANTIALIAS,
)
thumbfile = cStringIO.StringIO()
save_args = {'format': format}
if format == 'JPEG':
    save_args['quality'] = 85
new_photo.save(thumbfile, **save_args)

After that, we started looking for ways to optimize file size without losing quality.

Optimization


First, we had to decide whether to process the files ourselves or let a CDN provider magically modify our photos. Since we prioritize high-quality content, it made sense to evaluate the options and the potential size/quality trade-offs ourselves. We studied the current state of the art in file-size optimization: which changes could be made, and how size and quality would shift with each of them. By the end of the study we had settled on three main areas of work. The rest of this article describes what we did and how much benefit we gained from each optimization.

  1. Pillow Changes
    • Optimize flag
    • Progressive jpeg
  2. Changes in photo application logic
    • Large PNG recognition
    • Jpeg dynamic quality
  3. JPEG Encoder Changes
    • Mozjpeg (trellis quantization, custom quantization matrix)

Pillow Changes


Optimize flag


This is one of the easiest changes we made: handing Pillow the responsibility of trading extra CPU time for a smaller file ( optimize=True ). By definition, this does not affect photo quality at all.

For JPEG, this flag tells the encoder to find the optimal Huffman code by making an extra pass over each image scan. The first pass, instead of writing to the file, computes occurrence statistics for every value, which is the information an ideal encoding needs. PNG uses zlib internally, so in that case the optimize flag tells the encoder to use the equivalent of gzip -9 instead of gzip -6.
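
For illustration, a minimal sketch of how this looks in Pillow (the input file name is hypothetical):

import cStringIO

import PIL.Image

photo = PIL.Image.open('example.jpg')  # hypothetical input file

# JPEG: optimize=True makes the encoder take an extra pass to
# build optimal Huffman tables for this specific image
jpeg_out = cStringIO.StringIO()
photo.save(jpeg_out, format='JPEG', quality=85, optimize=True)

# PNG: optimize=True asks zlib for its maximum compression level
png_out = cStringIO.StringIO()
photo.save(png_out, format='PNG', optimize=True)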

This change was easy to make, but it turned out to be no silver bullet, shrinking files by only a few percent.

Progressive jpeg


When saving a JPEG, you can choose between several different types:

  • Baseline JPEG, which is decoded in a single pass, from top to bottom.
  • Progressive JPEG, which is decoded in several scans, from blurry to sharp.

Moreover, the way progressive images are packed usually yields a smaller file. As the Wikipedia article explains in more detail, JPEG entropy-codes each 8×8 pixel block in a zigzag traversal. When the values of these blocks are unrolled in that order, the non-zero values usually come first, followed by a run of zeros, and this pattern repeats and alternates for every 8×8 block in the image. With progressive encoding, the order in which the block data is processed changes. The large values of each block come first in the file (which gives the early scans of a progressive image their characteristic blockiness), and long runs of small values, including many more zeros, are stored closer to the end, carrying the fine detail. This redistribution of the data does not change the image itself, but it increases the number of consecutive zeros (which compress more readily).

Baseline JPEG and Progressive JPEG Comparison

An example of how Baseline JPEG rendering works.


An example of how Progressive JPEG rendering works.
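
In Pillow, switching to progressive encoding is a one-argument change; a minimal sketch (input name hypothetical):

import cStringIO

import PIL.Image

photo = PIL.Image.open('example.jpg')  # hypothetical input file

out = cStringIO.StringIO()
# progressive=True reorders the scan data from "top to bottom" to
# "coarse to fine" without changing the decoded pixels
photo.save(out, format='JPEG', quality=85, optimize=True, progressive=True)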

Changes in photo application logic


Large PNG recognition


Yelp deals with two formats for user-generated content: JPEG and PNG. JPEG is great for photographs but usually struggles with high-contrast graphic content such as logos. PNG, in contrast, compresses images completely losslessly, which is great for graphics but too heavy for photos, where small distortions are imperceptible anyway. When users upload photos in PNG format, we can save a lot of space by recognizing such files and re-saving them as JPEG. The main sources of PNG photos on Yelp are screenshots from mobile devices and apps that modify photos, apply effects, and add frames.


Left: a typical composite PNG with a logo and a frame. Right: a typical PNG from a screenshot.

We wanted to reduce the number of these unnecessary PNGs, but it was important not to overreach by changing formats or degrading the quality of logos, graphics, and the like. How do we tell that an image is a photograph? From its pixels?

After checking an experimental sample of 2500 images, we found that combining file size with the number of unique pixel colors identifies photos quite accurately. We generate the thumbnail copy at our largest resolution and check whether the resulting file is larger than 300 KiB. If it is, we also check the pixel contents for more than 2^16 unique colors (Yelp converts uploaded RGBA images to RGB, but if we did not, we would check that as well).
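
A minimal sketch of this heuristic (the function names are ours, and a real pipeline would work on in-memory images rather than file paths):

import os

import PIL.Image


def looks_like_photo(path, size_threshold=300 * 1024, color_threshold=2 ** 16):
    """Heuristic: a big PNG with many unique colors is probably a photo."""
    if os.path.getsize(path) <= size_threshold:
        return False
    image = PIL.Image.open(path).convert('RGB')
    # getcolors() returns None when the image contains more unique
    # colors than maxcolors, which is exactly the signal we want
    return image.getcolors(maxcolors=color_threshold) is None


def resave_as_jpeg(path, out_path, quality=85):
    """Re-save a photo-like PNG as JPEG (hypothetical helper)."""
    image = PIL.Image.open(path).convert('RGB')
    image.save(out_path, format='JPEG', quality=quality, optimize=True)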

On the experimental sample, this hand-tuned definition of "big images" caught 88% of all files that could potentially benefit from the optimization, with no false positives on graphics.

Jpeg dynamic quality


The first and best-known way to shrink JPEG files is the setting called quality. Many applications that can save to JPEG specify quality as a single number.

Quality is something of an abstraction. In fact, there are separate quality levels for each color channel of a JPEG image. Quality levels 0 through 100 map to different quantization tables for the color channels and determine how much data is lost (usually in the high frequencies). Signal quantization is the step of the JPEG encoding process where information is discarded.
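
You can see these tables for yourself: Pillow exposes the quantization tables a JPEG was encoded with (a sketch; the exact table layout can vary between Pillow versions):

import PIL.Image

im = PIL.Image.open('example.jpg')  # hypothetical JPEG input
# im.quantization maps a table slot (typically 0 for luma, 1 for chroma)
# to its 64 coefficients; larger coefficients mean coarser quantization
for table_id, table in sorted(im.quantization.items()):
    print('table %d starts with: %r' % (table_id, list(table)[:8]))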

The simplest way to reduce file size is to degrade image quality by allowing more noise. However, not every image loses the same amount of information at the same level of quality.

We can vary the quality setting dynamically, tuning it for each individual image to strike the ideal balance between quality and size. There are two ways to do this:

  1. Bottom-up: algorithms that generate tuned, per-image quantization tables by processing the image at the level of its 8×8 blocks.
  2. Top-down: approaches that compare a whole re-compressed image against its original and measure how much information was lost.

We evaluated a bottom-up algorithm and concluded that it did not deliver adequate results at the high quality settings we wanted to use (although it appears to have potential in the middle of the quality range, where the encoder can be bolder about which bytes it discards). Most of the academic papers on this strategy were published in the early 1990s, when computing power was scarce, which made the resource-intensive techniques that top-down approaches (option 2 above) rely on, such as evaluating interactions between blocks, hard to apply.

So we turned to the second approach: using a bisection algorithm to generate candidate images at different quality levels, and evaluating each candidate's drop in quality by computing its structural similarity index ( SSIM ) with pyssim, until that value falls within a static but configurable threshold. This let us selectively lower the average file size (and average quality) only for images with headroom above the perceptual threshold.

In the chart below, we plot the SSIM values for 2500 images re-generated with three different quality strategies.

  1. Images generated with our existing method, quality = 85, are shown in blue.
  2. An alternative approach that shrinks files by lowering the setting to quality = 80 is shown in red.
  3. Finally, the approach we ultimately settled on, dynamic quality with SSIM 80-85, is shown in orange. Here the quality is chosen from the range 80 to 85 (inclusive), based on meeting or exceeding an SSIM ratio: a pre-computed static value that places the transition somewhere in the middle of the image distribution. This lets us lower the average file size without pushing poorly performing images below our quality threshold.



SSIM indices for 2500 images with three different strategies for changing quality settings

But what is SSIM?
There are several algorithms for assessing image quality that try to imitate the human visual system. We evaluated many of them, and we think that SSIM, although older, is best suited for this kind of iterative optimization thanks to its characteristics:

  1. It is sensitive to JPEG quantization error
  2. It is a fast, simple algorithm
  3. It can be computed on native PIL objects without converting images to PNG and passing them to CLI applications (see point 2)

Sample code for dynamic quality:

import cStringIO
from math import log  # needed by _ssim_iteration_count below

import PIL.Image
from ssim import compute_ssim


def get_ssim_at_quality(photo, quality):
    """Return the ssim for this JPEG image saved at the specified quality"""
    ssim_photo = cStringIO.StringIO()
    # optimize is omitted here as it doesn't affect
    # quality but requires additional memory and cpu
    photo.save(ssim_photo, format="JPEG", quality=quality, progressive=True)
    ssim_photo.seek(0)
    ssim_score = compute_ssim(photo, PIL.Image.open(ssim_photo))
    return ssim_score


def _ssim_iteration_count(lo, hi):
    """Return the depth of the binary search tree for this range"""
    if lo >= hi:
        return 0
    else:
        return int(log(hi - lo, 2)) + 1


def jpeg_dynamic_quality(original_photo):
    """Return an integer representing the quality that this JPEG image should be
    saved at to attain the quality threshold specified for this photo class.

    Args:
        original_photo - a prepared PIL JPEG image (only JPEG is supported)
    """
    ssim_goal = 0.95
    hi = 85
    lo = 80

    # working on a smaller size image doesn't give worse results but is faster
    # changing this value requires updating the calculated thresholds
    photo = original_photo.resize((400, 400))

    # _should_use_dynamic_quality() is a feature-flag check
    # defined elsewhere in the codebase
    if not _should_use_dynamic_quality():
        default_ssim = get_ssim_at_quality(photo, hi)
        return hi, default_ssim

    # 95 is the highest useful value for JPEG. Higher values cause different behavior
    # Used to establish the image's intrinsic ssim without encoder artifacts
    normalized_ssim = get_ssim_at_quality(photo, 95)
    selected_quality = selected_ssim = None

    # loop bisection. ssim function increases monotonically so this will converge
    for i in xrange(_ssim_iteration_count(lo, hi)):
        curr_quality = (lo + hi) // 2
        curr_ssim = get_ssim_at_quality(photo, curr_quality)
        ssim_ratio = curr_ssim / normalized_ssim

        if ssim_ratio >= ssim_goal:
            # continue to check whether a lower quality level also exceeds the goal
            selected_quality = curr_quality
            selected_ssim = curr_ssim
            hi = curr_quality
        else:
            lo = curr_quality

    if selected_quality:
        return selected_quality, selected_ssim
    else:
        default_ssim = get_ssim_at_quality(photo, hi)
        return hi, default_ssim
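
A hypothetical call site for the snippet above might look like this (note again that _should_use_dynamic_quality() is a feature flag external to the sample code):

import cStringIO

import PIL.Image

photo = PIL.Image.open('example.jpg')  # hypothetical input file
quality, ssim_score = jpeg_dynamic_quality(photo)

outfile = cStringIO.StringIO()
photo.save(outfile, format='JPEG', quality=quality,
           optimize=True, progressive=True)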

There are several other blog posts about this technique; here is one from Colt McAnlis. And just as we were about to publish, Etsy posted theirs as well! High five, speedy internet!

JPEG Encoder Changes


Mozjpeg


Mozjpeg is an open-source fork of libjpeg-turbo that trades encoding time for file size. Such an approach fits well with an offline pipeline for regenerating files. At 3-5 times the resource consumption of libjpeg-turbo, this encoder does make images smaller!

One of mozjpeg's distinguishing features is its use of alternative quantization tables. As mentioned above, quality is an abstraction over per-channel quantization tables. Everything suggests that the default JPEG quantization tables are fairly easy to beat. In the words of the JPEG specification:

These tables are provided as examples only and are not necessarily suitable for any particular application.

So, naturally, you should not be surprised that these are exactly the tables most encoder implementations use by default...

Mozjpeg has done the hard work of benchmarking alternative tables for us, and it encodes images using the best-performing general-purpose candidates.
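
Incidentally, you do not need a custom encoder build just to experiment with alternative tables: Pillow itself accepts a qtables argument at save time. A sketch, assuming the input is itself a JPEG (the preset names here are Pillow's, not mozjpeg's):

import cStringIO

import PIL.Image

photo = PIL.Image.open('example.jpg')  # hypothetical JPEG input
out = cStringIO.StringIO()
# 'keep' reuses the tables the file was encoded with; presets such as
# 'web_low', 'web_high' and 'photoshop' are also accepted, as are
# explicit lists of 64 coefficients per table
photo.save(out, format='JPEG', qtables='keep')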

Mozjpeg + Pillow


Most Linux distributions ship stock libjpeg by default, so mozjpeg does not work under Pillow out of the box, but it is not hard to set up. When building mozjpeg, use the --with-jpeg8 flag and make sure the resulting library can be linked against by Pillow. If you use Docker, you could write a Dockerfile like this:

FROM ubuntu:xenial

RUN apt-get update \
    && DEBIAN_FRONTEND=noninteractive apt-get -y --no-install-recommends install \
    # build tools
    nasm \
    build-essential \
    autoconf \
    automake \
    libtool \
    pkg-config \
    # python tools
    python \
    python-dev \
    python-pip \
    python-setuptools \
    # cleanup
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*

# Download and compile mozjpeg
ADD https://github.com/mozilla/mozjpeg/archive/v3.2-pre.tar.gz /mozjpeg-src/v3.2-pre.tar.gz
RUN tar -xzf /mozjpeg-src/v3.2-pre.tar.gz -C /mozjpeg-src/
WORKDIR /mozjpeg-src/mozjpeg-3.2-pre
RUN autoreconf -fiv \
    && ./configure --with-jpeg8 \
    && make install prefix=/usr libdir=/usr/lib64
RUN echo "/usr/lib64\n" > /etc/ld.so.conf.d/mozjpeg.conf
RUN ldconfig

# Build Pillow
RUN pip install virtualenv \
    && virtualenv /virtualenv_run \
    && /virtualenv_run/bin/pip install --upgrade pip \
    && /virtualenv_run/bin/pip install --no-binary=:all: Pillow==4.0.0

That's it! Build the image and you can use Pillow backed by mozjpeg in your normal image-processing workflow.

Effect


How much did each of these improvements matter to us? We started with a random sample of 2500 Yelp business photos, ran them through our processing pipeline, and measured the change in size:

  1. Changing the Pillow settings saved 4.5%
  2. Detecting big PNGs saved 6.2%
  3. Dynamic quality saved 4.5%
  4. Switching to the mozjpeg encoder saved 13.8%

All together, this reduced average image size by roughly 30% for our largest and most common photo resolutions, making the site faster for users and saving terabytes per day in data transfer. As measured at the CDN:


Change in average file size over time, as seen from the CDN (together with files that are not images)

What we did not do


This section describes a few other common optimizations that you could apply, but that did not suit Yelp, either because our tools' defaults already cover them or because we consciously chose not to make the trade-off.

Subsampling


Chroma subsampling is a major factor in both the quality and the file size of web images. Detailed descriptions of subsampling are easy to find online, but for this article it is enough to say that we already subsample at 4:1:1 (Pillow's default when nothing else is specified), so further changes were unlikely to buy us anything.
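
For reference, if you ever need to pin the subsampling explicitly rather than rely on the default, Pillow accepts it as a save-time argument (a sketch; the integer codes follow Pillow's convention):

import cStringIO

import PIL.Image

photo = PIL.Image.open('example.jpg')  # hypothetical input file
out = cStringIO.StringIO()
# subsampling codes in Pillow: 0 = 4:4:4, 1 = 4:2:2, 2 = 4:2:0
photo.save(out, format='JPEG', quality=85, subsampling=2)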

Lossy PNG encoding


Given what we do with PNGs, it would also have made sense to keep those images in the same format but use a lossy encoder such as pngmini; we nevertheless chose re-encoding to JPEG. That said, the encoder's author reports file-size reductions of 72-85%, so it is a credible alternative with sound results.

More modern formats


We certainly considered support for more modern formats such as WebP or JPEG 2000. But even if we had shipped that hypothetical project, the long tail of users who need JPEG/PNG images would remain, so the effort spent optimizing those formats would not have been wasted in any case.

Svg


We use SVG in many places on the site, for example for the static assets our designers create for the style guide. Although the format, together with optimization tools such as svgo, can noticeably cut page weight, it is not applicable to our task here.

Vendor magic


Plenty of companies offer image delivery, resizing, cropping, and transcoding as a service, including the open-source thumbor. Some day that may be the easiest way for us to support responsive images and dynamic content types and to stay on the cutting edge. But for now we manage on our own.

Additional reading


The two books mentioned here stand entirely on their own outside the context of this article and are highly recommended for further reading on the subject.

Source: https://habr.com/ru/post/331588/

