More than 100 million user photos are stored on Yelp, from pictures of dinners and hairstyles to one of our latest features, #yelfies. These images account for the bulk of traffic for users of the app and the website, and storing and transmitting them is expensive. In our effort to give people the best possible service, we worked hard to optimize all of our photos and achieved an average size reduction of 30%. This saves our users time and bandwidth, and it also reduces our cost of serving these images. Oh, and we did it without degrading photo quality!
Initial data
Yelp has been storing user photos for 12 years. We save lossless formats (PNG, GIF) as PNG and all other formats as JPEG. We use Python and Pillow to save files, and a photo upload starts with a snippet along these lines:
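A minimal sketch of such a save step, assuming a typical Pillow thumbnail-and-save flow (the helper name, target size, and `quality=85` default are illustrative, not necessarily Yelp's exact code):

```python
import io

import PIL.Image


def save_upload(photo, max_size, fmt):
    """Thumbnail an uploaded image, preserving aspect ratio, then save
    it in the chosen format (quality=85 for JPEG as a starting point)."""
    new_photo = photo.copy()
    new_photo.thumbnail(max_size, resample=PIL.Image.LANCZOS)
    out = io.BytesIO()
    save_args = {"format": fmt}
    if fmt == "JPEG":
        save_args["quality"] = 85  # illustrative default
    new_photo.save(out, **save_args)
    return out.getvalue()
```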
From there, we started looking for ways to optimize file size without losing quality.
Optimization
First, we had to decide whether to process the files ourselves or let a CDN provider magically modify our photos for us. Since we prioritize high-quality content, it made sense to evaluate the options and the potential trade-offs between size and quality ourselves. We studied the current state of the art in file-size optimization: which changes could be made, and how size and quality would shift with each of them. At the end of this research, we decided to work in three main areas. The rest of this article describes what we did and how much benefit we gained from each optimization.
- Pillow changes
  - Optimize flag
  - Progressive JPEG
- Changes in photo application logic
  - Large PNG recognition
  - JPEG dynamic quality
- JPEG encoder changes
  - Mozjpeg (trellis quantization, custom quantization matrix)
Pillow Changes
Optimize flag
This was one of the easiest changes we made: handing Pillow the responsibility of trading extra CPU time for file size (`optimize=True`). By definition, this does not affect photo quality.

For JPEG, this flag instructs the encoder to find the optimal Huffman code by making an extra pass over each image scan. The first pass, instead of writing to the file, computes occurrence statistics for each value, which is the information needed for optimal coding. PNG internally uses zlib, so in that case the optimize flag effectively tells the encoder to use `gzip -9` instead of `gzip -6`.
This change was easy to make, but it turned out not to be a silver bullet: it reduced file sizes by only a few percent.
Progressive JPEG
When saving a JPEG, you can choose among several types:

- Baseline JPEGs, which load from top to bottom.
- Progressive JPEGs, which load from blurry to sharp. Progressive mode is easy to enable in Pillow (`progressive=True`). As a result, perceived quality improves: it is easier to notice that part of an image is missing than that it is not perfectly sharp.
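A short sketch combining both Pillow flags discussed so far (the helper name and the `quality=85` default are illustrative):

```python
import io

import PIL.Image


def save_optimized(photo, quality=85):
    """Save a JPEG with both flags discussed above: optimize=True
    (an extra pass to build optimal Huffman tables) and
    progressive=True (scans reordered to render blurry-to-sharp)."""
    out = io.BytesIO()
    photo.save(
        out,
        format="JPEG",
        quality=quality,
        optimize=True,
        progressive=True,
    )
    return out.getvalue()
```

A quick way to confirm a file is progressive is to look for the SOF2 marker (`0xFFC2`) in the output bytes; baseline files use SOF0 instead.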
In addition, the way progressive images are packed usually results in a smaller file size. As explained more fully in the Wikipedia article, the JPEG format uses a zigzag traversal of 8×8 pixel blocks for entropy coding. When these blocks of coefficients are unpacked and laid out in order, the non-zero values usually come first, followed by a run of zeros, and that pattern repeats and alternates for every 8×8 block in the image. With progressive encoding, the order in which pixel blocks are processed changes. The large values for each block come first in the file (which gives the early scans of a progressive image their characteristic blockiness), and long runs of small values, including more zeros, are stored closer to the end; these runs carry the fine detail. This redistribution of data in the file does not change the image itself, but it increases the number of consecutive zeros (which compress more easily).
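The effect of the reordering on zero runs can be seen with a toy model, assuming simplified "blocks" of a few large coefficients followed by zeros (an illustration of the idea only, not actual JPEG entropy coding):

```python
def longest_zero_run(data):
    """Length of the longest run of consecutive zero bytes."""
    best = cur = 0
    for b in data:
        cur = cur + 1 if b == 0 else 0
        best = max(best, cur)
    return best


# Toy model: each "block" has a few significant leading coefficients
# followed by a tail of zeros (values are arbitrary but non-zero).
blocks = [
    bytes([(i * 37) % 251, (i * 91) % 251, 7, 3] + [0] * 60)
    for i in range(1, 101)
]

# Baseline-like order: blocks one after another -> many short zero runs.
baseline_order = b"".join(blocks)

# Progressive-like order: all the big leading values first, then all the
# zero-heavy tails grouped together -> one very long zero run.
progressive_order = b"".join(b[:4] for b in blocks) + b"".join(b[4:] for b in blocks)
```

Longer runs of identical symbols are cheaper for the entropy coder, which is why the reordered stream tends to compress better.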
Baseline JPEG and Progressive JPEG Comparison
An example of how Baseline JPEG rendering works.
An example of how Progressive JPEG rendering works.

Changes in photo application logic
Large PNG recognition
Yelp works with two formats for user content: JPEG and PNG. JPEG is great for photographs but usually struggles with high-contrast design content such as logos. PNG, by contrast, compresses images completely losslessly, which is great for graphics but too bulky for photos, where small distortions are imperceptible anyway. When users upload photos in PNG format, we can save a lot of space by recognizing such files and saving them as JPEG. The main sources of PNG photos on Yelp are screenshots from mobile devices and apps that modify photos, apply effects, and add frames.
Left: a typical composited PNG with a logo and frame. Right: a typical PNG from a screenshot.

We wanted to reduce the number of these unnecessary PNGs, but it was important not to overreach by changing formats or degrading the quality of logos, graphics, and so on. How do we determine whether an image is a photograph? By its pixels?
After checking an experimental sample of 2,500 images, we found that a combination of file size and number of unique pixel colors identifies photographs quite accurately. We generate a scaled-down copy at our maximum resolution and check whether the file is larger than 300 KiB. If it is, we then check whether the image has more than 2^16 unique colors (Yelp converts uploaded RGBA images to RGB, but if we didn't, we would check that too).

On the experimental sample, these hand-tuned "big PNG" thresholds catch 88% of all files that are candidates for conversion, with no false positives on graphics.
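A sketch of this heuristic as a standalone check (the function names and the Pillow-based color count are illustrative; the 300 KiB and 2^16 thresholds are the ones described above):

```python
import PIL.Image


def count_unique_colors(image):
    """Count distinct RGB colors by materializing the pixel data
    (fine for scaled-down copies)."""
    return len(set(image.convert("RGB").getdata()))


def looks_like_photo(file_size_bytes, unique_colors):
    """Treat a PNG as a likely photograph (a JPEG-conversion candidate)
    only when it is both big on disk and rich in distinct colors."""
    return file_size_bytes > 300 * 1024 and unique_colors > 2 ** 16
```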
JPEG dynamic quality
The first and best-known way to shrink JPEG files is the setting called `quality`. Many applications that can save JPEGs specify `quality` as a single number.
Quality is something of an abstraction. In fact, there are separate quality levels for each color channel of a JPEG image. Quality levels from 0 to 100 map to different quantization tables for the color channels and determine how much data is lost (usually in the high frequencies). Signal quantization is the step of the JPEG encoding process where information is lost.
The simplest way to reduce file size is to degrade image quality by allowing more noise. But not every image loses the same amount of information at a given quality level.
We can change the quality setting dynamically, tuning it for each individual image to achieve the ideal balance between quality and size. There are two ways to do this:

- Bottom-up: these algorithms generate tuned quantization tables while processing the image at the 8×8 block level. They calculate how much theoretical quality is lost and how that lost data amplifies or dampens distortion as seen by the human eye.
- Top-down: these algorithms compare a whole recompressed image against its original and measure how much information was lost. By successively generating candidates at different quality settings, we can pick the one that meets a minimum score under whichever evaluation algorithm we use.
We evaluated a bottom-up algorithm and concluded that it did not give adequate results at the high quality settings we wanted to use (although it seems to have potential in the middle quality range, where the encoder can be bolder about which bytes to discard). Much of the scientific literature on this strategy was published in the early 1990s, when computing resources were scarce, which made the expensive methods the top-down approach uses, such as evaluating interactions between blocks, hard to apply.
So we turned to the second approach: using a bisection algorithm to generate candidate images at different quality levels and evaluating each candidate's quality drop by computing its structural similarity index (SSIM) with pyssim, checking that this value stays within a static threshold. This let us selectively lower the average file size (and average quality) only for images that were above the perceptual threshold.
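The search itself can be sketched as a plain bisection over the quality range; here `score_at` stands in for an SSIM comparison against the original (the function names and defaults are illustrative, not Yelp's exact code):

```python
import math


def pick_quality(score_at, lo=80, hi=85, goal=0.95):
    """Bisect the quality range for the lowest quality whose similarity
    score still meets the goal; fall back to the top of the range if no
    candidate qualifies. Assumes score_at() rises with quality."""
    steps = int(math.log(hi - lo, 2)) + 1 if hi > lo else 0
    selected = None
    for _ in range(steps):
        mid = (lo + hi) // 2
        if score_at(mid) >= goal:
            selected = mid  # good enough; keep looking lower
            hi = mid
        else:
            lo = mid
    return selected if selected is not None else hi
```

In the real pipeline, the score would be the SSIM ratio between a candidate encode and the original image.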
In the chart below, we plot the SSIM values of 2,500 images regenerated with three different quality strategies.

- The originals, created with the current method at `quality = 85`, are shown in blue.
- An alternative approach to reducing file size, lowering the setting to `quality = 80`, is shown in red.
- Finally, the approach we ultimately settled on, dynamic quality with `SSIM 80-85`, is shown in orange. Here the quality is chosen from the range 80 to 85 (inclusive), depending on whether the image meets or exceeds an SSIM ratio: a precomputed static value that makes this transition happen somewhere in the middle of the image distribution. This lets us lower the average file size without dropping the quality of our worst-looking images.
SSIM scores for 2,500 images under the three quality strategies.

SSIM?

There are several image quality algorithms that try to imitate the human vision system. We evaluated many of them, and we think that SSIM, although older, is best suited to this kind of iterative optimization for several reasons:
- It is sensitive to JPEG quantization error
- It is a fast, simple algorithm
- It can be computed on native PIL objects without converting images to PNG and shelling out to CLI tools (see point 2)
Sample code for dynamic quality:
```python
import cStringIO

import PIL.Image
from ssim import compute_ssim


def get_ssim_at_quality(photo, quality):
    """Return the ssim for this JPEG image saved at the specified quality"""
    ssim_photo = cStringIO.StringIO()
    # optimize is omitted here as it doesn't affect
    # quality but requires additional memory and cpu
    photo.save(ssim_photo, format="JPEG", quality=quality, progressive=True)
    ssim_photo.seek(0)
    ssim_score = compute_ssim(photo, PIL.Image.open(ssim_photo))
    return ssim_score
```
There are several other blog posts about this technique; here's one from Colt McAnlis. And just as we were about to publish, Etsy published theirs too! High five, fast internet!
JPEG Encoder Changes
Mozjpeg
Mozjpeg is an open-source fork of libjpeg-turbo that trades encoding time for file size. This approach fits well with an offline pipeline that regenerates files. At 3 to 5 times the resource cost of libjpeg-turbo, this encoder produces noticeably smaller images!
One of mozjpeg's distinguishing features is that it uses alternative quantization tables. As mentioned above, quality is an abstraction over per-channel quantization tables. All indications are that the default JPEG quantization tables are fairly easy to beat. In the words of the JPEG specification:

These tables are provided as examples only and are not necessarily suitable for any particular application.

So naturally, you should not be surprised that these are the very tables most encoder implementations use by default...

Mozjpeg has done the hard work of benchmarking alternative tables for us, and it generates images using the best-performing alternatives.
Mozjpeg + Pillow
Most Linux distributions ship stock libjpeg, so mozjpeg does not work under Pillow by default, but it is not too difficult to set up in a build configuration. When building mozjpeg, use the `--with-jpeg8` flag and make sure the resulting library can be linked against by Pillow. If you use Docker, you can write a Dockerfile like this:
```dockerfile
FROM ubuntu:xenial

RUN apt-get update \
    && DEBIAN_FRONTEND=noninteractive apt-get -y --no-install-recommends install \
        # build tools
        nasm \
        build-essential \
        autoconf \
        automake \
        libtool \
        pkg-config \
        # python tools
        python \
        python-dev \
        python-pip \
        python-setuptools \
    # cleanup
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*

# Download and compile mozjpeg
ADD https://github.com/mozilla/mozjpeg/archive/v3.2-pre.tar.gz /mozjpeg-src/v3.2-pre.tar.gz
RUN tar -xzf /mozjpeg-src/v3.2-pre.tar.gz -C /mozjpeg-src/
WORKDIR /mozjpeg-src/mozjpeg-3.2-pre
RUN autoreconf -fiv \
    && ./configure --with-jpeg8 \
    && make install prefix=/usr libdir=/usr/lib64
RUN echo "/usr/lib64\n" > /etc/ld.so.conf.d/mozjpeg.conf
RUN ldconfig

# Build Pillow
RUN pip install virtualenv \
    && virtualenv /virtualenv_run \
    && /virtualenv_run/bin/pip install --upgrade pip \
    && /virtualenv_run/bin/pip install --no-binary=:all: Pillow==4.0.0
```
That's it! Build the image and you can use Pillow backed by mozjpeg in your normal image-processing workflow.
Effect
How much did each of these improvements matter to us? We started with a random sample of 2,500 Yelp business photos, ran them through our processing pipeline, and measured the change in size:

- Changing the Pillow settings saved 4.5%
- Detecting large PNGs saved 6.2%
- Dynamic quality saved 4.5%
- Switching to the mozjpeg encoder saved 13.8%
All together, this reduced average image size by about 30% for our largest and most common photo resolutions, making the site faster for users and saving terabytes per day in data transfer. Here it is as measured at the CDN:

Average file size over time at the CDN (together with other files that are not images).

What we did not do
This section describes several other typical optimizations you might use. They did not suit Yelp, either because of the default settings of our tools or because of a conscious decision not to make that trade-off.
Subsampling
Subsampling is a major factor in determining both the quality and the file size of web images. More detailed descriptions of subsampling can be found online, but for this article it suffices to say that we already subsample at `4:1:1` (Pillow's default when nothing else is specified), so we are unlikely to gain anything from further optimization here.
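For reference, Pillow's JPEG encoder does let you set chroma subsampling explicitly via the `subsampling` save option; this sketch compares two settings on a synthetic image (the helper name is illustrative, and exact sizes depend on content):

```python
import io

import PIL.Image


def encode_with_subsampling(photo, subsampling):
    """Encode as JPEG with an explicit chroma subsampling mode:
    "4:4:4" keeps full chroma resolution, "4:2:0" quarters it."""
    out = io.BytesIO()
    photo.save(out, format="JPEG", quality=85, subsampling=subsampling)
    return out.getvalue()


# Build a small image with busy chroma so the difference is visible.
img = PIL.Image.new("RGB", (128, 128))
img.putdata([
    ((x * 7) % 256, (y * 13) % 256, ((x + y) * 5) % 256)
    for y in range(128) for x in range(128)
])
full_chroma = encode_with_subsampling(img, "4:4:4")
sub_chroma = encode_with_subsampling(img, "4:2:0")
```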
Lossy PNG encoding
Given what we do with PNGs, it would make sense to keep those images in PNG but run them through a lossy encoder such as pngmini; we chose conversion to JPEG instead. Still, the encoder's author reports file-size reductions of 72-85%, so this is a viable alternative with credible results.
More modern formats
We definitely considered supporting more modern formats such as WebP or JPEG 2000. But even if we had implemented that hypothetical project, a long tail of users who need JPEG/PNG images would remain, so the effort spent optimizing those formats would not have been wasted in any case.
SVG
We use SVG in many places on the site, for example for the static images our designers create for the style guide. While this format and optimization tools like svgo can reduce page weight nicely, they do not fit this task.
Vendor magic
Plenty of companies offer image delivery, resizing, cropping, and transcoding as a service, including the open-source thumbor. In the future this may be the easiest way for us to support responsive images and dynamic content types while staying on the cutting edge. But for now, we manage on our own.
Further reading

The two books mentioned here stand entirely on their own outside the context of this article and are highly recommended for deeper study of the subject.