Images: formats and compression (2/3)

Hello again! After the break in the month, we continue the tour of image formats and compression algorithms. Where are we staying? Oh yes, the eighties.

TIFF: Collect yourself

In the yard in 1986. PCX and TGA formats are beginning to provide spectacular lucky owners of ordinary desktops. But the matter is not limited to home computers; there are also enterprises and universities. And they already have powerful machines and scanners, and in the environment of scanners ... a mess and disgrace. The task of the scanner is to convert the image to paper into a digital format, and each manufacturer had this format.

Imagine for a moment, you are doing a scan of some Important Drawing, carry it to the printing house, and there they say that they cannot read it. We'll have to overtake it in a different format, but it is better to just bring a photocopy for reliability. It's a shame.
')
Aldus took up the solution to this problem, eventually presenting the TIFF format. All scans were now in the same format, and even out of the box could contain several pages of a document in one file. Plus, together with the raster data it became possible to store meta information: the name of the document, the author, the date of the scan, the model of the scanner - and almost everything.

The format turned out to be so expandable that it is alive and well today. And all because of the fact that Aldus rethought the very concept of a graphic format. Instead of a fixed layer of information, relatively small structures are used from which a file can be assembled. The author is known - insert the structure, unknown - just delete. The use of such “tags”, in fact, was played up in the very name of the format: “Tagged Image File Format”.

The downside of such extensibility, however, has become the exorbitant swelling of the specification. Attempts by Aldus (and then Adobe, which bought it) to embrace the immense, have affected the enthusiasm of other developers not for the better.

Format for sharing images

The new TIFF format, possessing elegant advantages, was only suitable for printing: the first revision provided for only black and white images without semitones. And in 1987, VGA graphics cards with a palette of up to 256 colors appeared. TGA and PCX (with dopilivanii) allowed to store the notorious 256 colors, but the file size doubled, and even more - a simple RLE on the large palette worked much worse.

To estimate the scale, take a full-screen VGA image (320 × 200 @ 256) in PCX format - about 50 KiB , plus or minus. Now we will try to transmit it using a fast, 9600 baud, dayl-up modem, and we’ll get loading time in about a minute. Not cool. CompuServe seemed to be doing just that, making money by giving access to its BBS in sunny Columbus, Ohio.

But how can you get out of this situation? You can technically, cut out the extra fields, not put unnecessary, and use a powerful compression algorithm. You can use humanitarian savvy, and show something similar to the picture before it is fully loaded, and then it will be subjective to the user to think that everything is loading faster.

Both of these approaches and combined in the format of GIF ("Graphics Interchange Format"). And now more.

GIF: Structure

To find out what a GIF can do and how it works, crawl into the documentation. Previously, the format was patented, but these patents expired a dozen years ago, and everything is kept in open access.

Headline

The first six bytes of the file are the header. It consists of the signature “ GIF ” and the version: “ 87a ” or “ 89a ”, where 87 and 89 are the years in which they were developed. Version 89a extends 87a and is fully backward compatible with it. The specification requires 87a to be used whenever possible, but personally I would not advise it - browsers work crookedly with 87a. As a demonstration, run a bit ahead and show the focus:

What is shown in the picture? If one square, then you have Firefox or Chrome. If two, then IE. If three, then Opera (which is still on Presto).

Logical screen description block

The logical screen is just a rectangular area, selected for displaying graphics, within the meaning of something like a viewing port . GIF can contain several images, and they can be smaller in size than the logical screen. Moreover, in GIF87a, these images are not animation frames, but rather layers.

In other words, if you need to draw 800 × 600 in the middle of the kitten in the middle, and a flower in the corner, then instead of one huge “flattened” image, you can store the kitten and the flower with their shifts relative to the logical screen.

A small disclaimer. Despite the fact that at the intuitive level the gif itself (arrgh ... well, “jiffy”) is a picture that is shown on the screen, we will adhere to the terminology specified in the specification: “logical screen” is the size of the entire GIF'k, and the “image” is a layer / frame.

So, the block of the description of the logical screen always takes 7 bytes. The first dimensions are two bytes in width and two bytes in height (UInt16LE). It turns out, the maximum size is 65535 × 65535 pixels. The specification does not indicate that the width or height must be greater than zero, but the realization of this does not give us anything, except perhaps curiosity.

Here you should make an important note: if the GIF file is version 87a, Firefox and Chrome always ignore this field and use the size of the first image as the logical screen size. Opera and IE do the same only if the coordinates of the first image are zero.

Demo:

Files are identical, except for the version - on the left 87a, on the right 89a. They are the same in Opera and IE, not in Firefox and Chrome. Because of the vertical alignment on Habré, this is not very noticeable; the sizes of these pictures are different.

This is followed by a single byte containing several fields. The seventh bit (most significant) indicates whether there is a global palette, and bits 0 through 2 define its size (from 2 ¹ to 2 ⁸ ). The remaining 4 bits after years of age have lost their meaning.

The next field is one byte, indicating the background color in the global palette (if used). This color should be filled with all areas of the logical screen that are not occupied by any of the images. In theory. In practice, this requirement is fulfilled only by Photoshop and IrfanView. Common browsers leave unallocated space transparent. And Opera 12 has an interesting label: if the GIF is version 87a, and it contains more than one image, this empty space is filled with black.

The last byte of the description of the logical screen contains information about the proportions of pixels. Browsers ignore this parameter, and all pixels are considered square.

Demo:

There should be a circle, but instead an ellipse is displayed. So that you do not think that I am cheating, try opening this file in Photoshop.

Global palette

If the global palette checkbox is set to 0, it simply does not exist. And if there is, then the size of the entire block in bytes is equal to the number of colors in the palette multiplied by three. Filling principle: RGB RGB RGB ..., just like in PCX.

A global palette is applied to each image in the file, if it does not have a local palette. And if there is neither one nor the other? Speke on this occasion says "uh ... well, I don’t know, figure it out somehow," and suggests using the standard palette of the device, but it is desirable that the first color in it be black and the second white. In 2013, everyone has already forgotten what a standard device palette is, and this trick most often fails.

Demo:

File without palette. In Firefox, Chrome and old IE, we see a black rectangle, in Opera - a transparent one. Photoshop believes that the palette is 256-color greyscale. And here, IE10 and IrfanView follow the recommendations of the specs, and the image is easy to read.

At this point, the mandatory, fixed format fields end, and the optional fields begin.

Picture

Here the raster data itself and its location relative to the logical screen are stored. The file can be any number of images - from zero to infinity.

It is from scratch! Speke quite admits that there are no images in the file at all. For example, if you just need to initialize the global palette on the device. But the current browsers do not understand these subtleties, and consider in such cases that the file is “broken.”

The image begins with a marker, an ASCII character " , " (comma). This marker is needed in order to understand which structure has now begun: an image (“ , ”), an extension block (“ ! ”), Or a file end block (“ ; ”).

This is followed by four two-byte integers: two coordinates of the position of the left upper edge of the image relative to the logical screen and the width / height of the image itself. According to the specification, the lower right edge of the image should not go beyond the logical screen.

Next, bytes with packed fields. The most significant bit: does the image have a local palette, three low ones: what size, approximately the same as in the description of the logical screen. The 6th bit is responsible for the scanning order: line by line (progressive) or interlaced (interlaced). I think it’s not necessary to describe how the download of an interlaced image looks like, I’ve seen it a thousand times, but just in case, here’s a smart article with pictures .

After that, as in the description of the logical screen, a local palette should immediately follow if there is one.

By the way, since we were talking, you guessed it, how can you display more than 256 colors on one logical screen of GIF?

The next byte is the size of the LZW codes. In the LZW compressed data stream, the length of the initial dictionary is not specified, and it must be specified separately. But we will come back to this.

Packing data of arbitrary length

Then in the image are directly compressed LZW data. But they are not stored by simple data bursts with the specified length, but are packed into blocks. (Apparently, to avoid arbitrary IO and excessive memory consumption when encoding.) Packing principle: one byte of block length, n data bytes, one byte of block length, n data bytes, and so on, until the data runs out. And at the end of the data indicates a block of length 0.

For example, a well-known pangram about a quick fox can be packaged like this:

\x2B The quick brown fox jumps over the lazy dog \x00

And it is possible and in another way, we will spend for one byte more, but it will not be considered as an error:

\x20 The quick brown fox jumps over t \x0B he lazy dog \x00

In general, it is generally possible to use at least one byte per block, then the packed data will be twice as large as the original. And from the loaf of white (or black) bread you can make a trolley bus.

Extensions

The attentive reader has probably noticed that neither in the image description, nor in the description of the logical screen, has there been a GIF “buns” proper: transparency and animation. This is because they were introduced in version 89a as an extension block.

There are four types of extension blocks: graphics control extension, text extension, comment and application extension. All blocks are optional, making the GIF89a format backward compatible with 87a.

Graphics management extension

This extension is essentially a set of additional fields for the image that immediately follows it. In fact, it turns the vaguely described concept of “image” into a very specific “frame”. Of the fixed fields, only two byte markers: 0x21 , “this is an extension” and 0xF9 , “this is an extension of the graphics control”.

Further fields are packed in the same way as raw image data - block length, n bytes, block length, n bytes. It is a different matter that, according to the specification, it is always one block 4 bytes long. Why in general it was necessary to break into blocks - I didn’t even want to figure it out, apparently, it was easier.

Anyway, the first of these four bytes is a set of bit fields. The three bits are reserved. Then three bits, forming a whole from 0 to 7, indicate what should be done with the frame before showing the next one. 0: nothing (not defined); 1: nothing (leave everything as it is); 2: clear the screen; 3: to return what was before this frame; values 4-7 are reserved.

Bit 1 - the most interesting from a historical point of view. Whether to wait from the user “any key” in order to show the frame following this. Yes, yes, on the basis of GIF it was possible to make not only a slideshow, but also quite a presentation. Now this feature is not actually supported.

But if you still want to check

The lower bit indicates whether any of the colors in the palette is considered transparent.

The next two bytes form UInt16, indicating how many hundredths of a second you need to wait before showing the next frame. Thus, the delay can be in the range from 0 to 655.35 seconds.

Finally, the last byte is the index of the “transparent” color in the palette (if it was indicated above that it exists).

Application extension

This extension is designed to allow applications to write their proprietary metadata to the GIF. The format is already familiar: two bytes of a marker-signature, 0x21 0xFF , the rest is blocks with an indication of the length in front of them. And again the length of the first block is strictly fixed: 11 bytes, which identify the application. Subsequent blocks can be of any length and contain anything.

One such extension was invented by the Netscape Navigator browser. The fact is that the GIF format itself does not provide for the animation cycle, and with the help of the proprietary extension " NETSCAPE2.0 " you can specify how many times the animation is scrolled: from 1 to 65535 times or infinitely.

Now all browsers understand this construction, and if it is not there, it is considered that the animation is not looped.

The rest of the application extensions in order to reduce the file size can safely mow.

Text extension

This extension is necessary in order to draw any text on the logical screen. Specify the size of the glyphs in pixels, the size of the text field, the position relative to the logical screen, and voila! You can not just a slideshow or a presentation, but generally make a visual novel.

But alas, there is no real support for this expansion. In nature, this species does not occur, crossed out of the red book. And we go further.

Comment

The last extension is the simplest. Two byte markers 0x21 0xFE , and then arbitrary content, packed in blocks.

Closing block

The trailing unit (“trailer”) consists of a single byte 0x3B (“ ; ”). It is located at the end of the file and means that in this place, in fact, the end of the file. Everything after it is considered garbage and is ignored.

This construction according to the specification is obligatory, do not forget about it.

Whew! It seems that everything is structured. Honestly, I myself expected it to be easier. If you suddenly have any misunderstandings, due to the style of the presentation or the translation of the terms, you can familiarize yourself with the official specification GIF89a . I also strongly recommend the GIF opener program, it’s a hundred years old at lunchtime, but I’ve never seen anything better for low-level gif editing.

Lempel-Ziva-Welch Algorithm

Now that we’ve seen the neighborhood, we’ll go to the very heart of GIF — the Lempel-Ziv-Welch algorithm, aka LZW. First of all, it is worth noting that this algorithm is vocabulary . Let's see what that means.

Compression with a dictionary, a simple example

Dictionary compression is a direct development of the idea of RLE. In general, the principle is as follows: we look for duplicate values and enter them into the dictionary, and then in the right place make an indication of the element of the dictionary. Thus, not only the same values in a row are compressed, but also groups of different values, between which there can be a certain distance.

For example, take the following line:

It can be seen that it is directly stuffed with repetitive " ". Choose some unused symbol, for example, “ 1 ” and say that any such encountered symbol should be replaced with “ ”. As a result, we get the string “ 11 11 ”, half as much. And so that someone other than us could understand what this encryption will read, you need to remember to write the dictionary itself, that is, what exactly you need to replace " 1 " The result will be something like:

1=; 11 11

You can go a little further, and replace “ 1 ” with “ 2 ”:

1=,2=1; 11 22

Hmm ... it seems that we replaced more duplicate values, and the string was longer. Alas, the dictionary also needs to be stored, and the more it is, the greater the overhead.

Of course, they are trying to fight this somehow. For example, the LZ77 algorithm is formally considered to be a dictionary, but the decoded data itself acts as a dictionary - the decoder is given the command to go back n bytes and copy m bytes from there. You can read more about LZ77 in this article by the author of the respected horror_x .

In LZW, the “descendant” LZ77, dictionary entries also appear as they are decoded, but this is no longer just copying with offset, but a full-fledged dictionary.

LZW: General Coding Principle

For example, take the same line, " ."

Now you need to choose the encoding of this line. The use of multibyte encoding will greatly complicate the example, so who will lay down UTF-8 on the spot. The dust of seven-bit encodings, such as KOI-7 H1 (GOST 13052-74), will not be disturbed. Choose some eight-bit, for example, everyone's favorite CP866.

Thus, the original dictionary will be identical to the CP866:

  00000000 <nul>
 00000001 <SOH>
 ...
 10,000,000 A
 10000001 B
 ...
 10011111 I
 ...
 11111111 <nbsp>

This dictionary is already full, and nothing will be added to it. But in LZW, the dictionary should always be able to add a new entry. As soon as the previous code length begins to be missed for this, it needs to be increased:

  000000000 <nul>
 ...
 011111111 <nbsp>

Now, seven bits are used to indicate the index of the dictionary (this is the “code length”).By the way, when they say something like “LZW with a code length of 8”, we mean the length of the codes of the initial dictionary, and we start coding with the length of the codes one unit longer. A little confusing, but so it is.

The line begins with the capital "A". It corresponds to the binary value in the dictionary 010000000. We write out the bits directly, LZW works directly with the bits, and the fact that a byte is usually 8 bits does not matter for this algorithm.

 And 010000000 (we add nothing to the dictionary)

The next letter is “B”. Similarly, we enter it, but now along with this we add a new entry to the dictionary: the previous encoded string plus the first character of this one. The last line was “A”, this one is “B”, and its first character, respectively, “B”.

B 010000001 | add to the dictionary AB = 100000000

We continue further:

 010010000 | Br = 100000001
And 010000000 | RA = 100000010
K 010001010 | AK = 1,000,00011
And 010000000 | KA = 100000100
D 010000100 | BP = 100000101

Then it would be possible to write down “A” in the same way, but we already have “AB” in the dictionary! We use it. And do not forget that only the first letter goes to the new dictionary entry.

AB 100000000 | YES = 100000111
RA 100000010 | ADB = 100001000
<SP> 000100000 | RA <SP> = 100001001
From 010010001 | <SP> C = 100001010
<SP> 000100000 | C <SP> = 100001010
X 010010101 | <SP> X = 100001011

Now the jackpot! We already have “ADB” in the dictionary, and we can insert three letters at once in one letter. Choosing the longest matching line from the dictionary seems like a damn good idea.

ADB 100001000 | HA = 100001100
And 010000000 | ABRA = 100001101
HA 100001100 | AH = 100001110
BR 100000001 | HUB = 100001111
And 010000000 | ARB = 100010000

Now all received bits (from the second column) are written out in a line, grouped by 8 bytes and ready. Instead of 24 × 8 = 192 bits, we got 18 × 9 = 162 bits. 15% of the savings, despite the fact that RLE would not remove a single bit here.

Greed is bad

In the description of the algorithm, it is indicated that when encoding, a string of the greatest length should be selected from the dictionary. But as the song about the boy Billy from Treasure Island teaches, greed is not the best option. Take the previous example and roll back a little bit:

 A 010000000
B 010000001 | AB = 100000000
 010010000 | Br = 100000001
And 010000000 | RA = 100000010
K 010001010 | AK = 1,000,00011
And 010000000 | KA = 100000100
D 010000100 | BP = 100000101
AB 100000000 | YES = 100000111
RA 100000010 | ADB = 100001000
<SP> 000100000 | RA <SP> = 100001001
From 010010001 | <SP> C = 100001010
<SP> 000100000 | C <SP> = 100001010
X 010010101 | <SP> X = 100001011
ADB 100001000 | HA = 100001100
And 010000000 | ABRA = 100001101

In this place, we are going to go against greed and choose from the dictionary not “ ” a shorter string, “ ”:

X 010010101 | AH = 100001110
ABRA 100001101 | HA = 100001111

Now the length of the encoded message is 17 × 9 = 153 bits, the difference is about 5%. But we manually collected the file, but how to determine algorithmically, not to be led by greed? In the forehead will not work: the complexity of brute force exponential. So, as they say in all sorts of smart books, I will leave this to the reader for consideration. Maybe another Turing award.

Compression of the same values

Let us now try to compress a primitive string consisting only of the letters "A".

 A |
A | AA
AA | AA
AA |  AAA
AAA |  AAA
 ...

Somehow not very efficient, the dictionary is clogged with duplicates. But in this case, a cunning device is provided in LZW. In the second line, we add to the dictionary " ", although this entry would be useful to us right on the spot. Actually, this is how we can do:

 A |
AA | AA
AAA |  AAA
 ...

For the definition of a new record, only the first character to be encoded is relevant. We know him for sure, this is " ". Therefore, we first add a new entry to the dictionary, and then look for the string with the longest length in it.

This approach works not only with the same letters, but also with longer sequences:

 A |
B | AB
AB | BA
ABA | ABA
 ...

You may wonder how the decoder will behave if they say to him: “now insert the seventh entry”, and he has only six of them in the dictionary. It's okay, he is trained in such things and will understand without problems.

LZW to GIF

Well, on cats have trained, now turn to real action. Create a GIF with such a simple image.

For convenience, increased by 8 times, these sizes: 11 × 6 pixels, the height of each strip is 2 pixels.

Whose flag is it? No one's Just three horizontal stripes. But if it’s more convenient, you can consider this as the flag of the RGB Sovereign Color Model.

The size of the palette is chosen exactly the same as in PCX - as small as possible. In the image there are only three colors, so we will use a palette of 4 colors. Let me remind you that for LZW the order of colors in the palette does not matter, only the length of the codes. The image in the file will be one, so it doesn’t matter to us whether this is a local palette or a global one. Let it be global.

We will not use interlace, we will choose the pixels simply from left to right, top to bottom. The size of the logical screen is 11 × 6, the image size is the same, the size of the codes is 2 bits, let's go!

Compiled dictionary:

 00 R (palette index 0)
01 G (palette index 1)
10 B (palette index 2)
11 (palette index 3)

In fact, of course, LZW has no idea about a particular color, and encodes only its number in the palette. But then there will be so many numbers in the listing that we risk confusing them, so I’ll mark them as “R”, “G” and “B”.

That's not all. We add the special value “Clear Code” to the dictionary. It does not encode anything, only instructs the decoder to reset the dictionary to the initial state.

The dictionary also has another special meaning “End of Information”, which indicates that the compressed stream has ended. Why is it needed? Suppose that a single dictionary entry takes 4 bits, and there is an encoded value " 00010010 00110000". It is not clear whether this is decoded as 1 2 3, or as 1 2 3 0, and the EoI code indicates the end of the stream with bit accuracy, and the problem disappears.

So, the initial dictionary here looks like this:

 000 R (  0)
001 G (  1)
010 B (  2)
011 (  3)
100 <CC>
101 <EoI>

CC . . - , , , .

 100 <CC> |

, . - , . ( - ). .

. PCX 0 3 , GIF .

 000 R |

Further, the technique described above: use the dictionary entry at the same time as adding it.

110 RR | RR = 110
111 RRR | RRR = 111

The dictionary is full, and it needs to be increased by one bit. The next entry will occupy 4 bits.

1000 RRRR | RRRR = 1000

10 pixels are coded, and in one scanline 11. But in GIF there are no limitations associated with scanlines - all pixels are “stretched to the line”, and are encoded as one big sequence. If we used interlace, then the next one would take the blue band, which, of course, would not have a positive effect on compression.

1001 RRRRR | RRRRR = 1001
1010 RRRRRR | RRRRRR = 1010

21 red pixel encoded, left alone. This time the pixel is encoded with four bits. Despite the fact that the entry " RR" in the dictionary already exists, we add it again.

0000 R | RRRRRRR = 1011

We continue the same way with green pixels.

0001 G | RG = 1100
1101 GG | GG = 1101
1110 GGG | GGG = 1110
1111 GGGG | GGGG = 1111

Again, increase the length of the codes.

10,000 GGGGG | GGGGG = 10,000
10001 GGGGGG | GGGGGG = 10001
00001 G | GGGGGGG = 10010
00010 B | GB = 10011
10100 BB | BB = 10100
10101 BBB | BBB = 10101
10110 BBBB | BBBB = 10110
10111 BBBBB | BBBBB = 10111
11000 BBBBBB | BBBBBB = 11000
00010 B | BBBBBBB = 11001

Finally, specify the end of data marker.

 00101 <EoI> |

Now, the received bits are packed in eight pieces per byte, from the youngest to the oldest. From the sequence (100) (000) (110) (111) (1000) …we get:

 10) (000) (100) = 0x84
(1000) (111) (1 = 0x8F
 ...

13 ( 0x0D = 13) 0x00 , , .

, , :

. , .

, , 25 , — 31, 35. , , .

, , «» — . , , .

:

 100 <CC> |
000 R |
110 RR | RR = 110
111 RRR | RRR = 111
1000 RRRR | RRRR = 1000
1001 RRRRR | RRRRR = 1001
1010 RRRRRR | RRRRRR = 1010
0000 R | RRRRRRR = 1011

We reset the dictionary, and continue coding.

0100 <CC> | (dictionary reset)
001 G |
110 GG | GG = 110
111 GGG | GGG = 111
1000 GGGG | GGGG = 1000
1001 GGGGG | GGGGG = 1001
1010 GGGGGG | GGGGGG = 1010
0001 G | GGGGGGG = 1011

Once again.

0100 <CC> | (dictionary reset)
010 B |
110 BB | BB = 110
111 BBB | BBB = 111
1000 BBBB | BBBB = 1000
1001 BBBBB | BBBBB = 1001
1010 BBBBBB | BBBBBB = 1010
0010 B | BBBBBBB = 1011
0101 <EoI> |

It turned out 90 bits, and the total file size was reduced from 52 to 51 bytes, despite the fact that it remained readable:

.And, by the way, gifsicle, the number one tool for compressing GIF, cannot be compressed anymore.

By the way, you can see that all three bands were coded almost the same. For LZW, there is no difference between “ 0x0022 times” and “ 0x0222 times”, and you can choose any arrangement of colors in the palette.

Non-standard

A little bit of freaking. The dictionary reset code (“CC”) at the very beginning of the stream does not affect anything. And the end-of-stream code (“EoI”) is, in general, not needed - if there is any data left in the tail of a byte, they will already be outside the image. Cutting them out, we can shrink the image even more, up to 50 bytes, and it will be displayed normally in browsers:

.

If you are a happy owner of a browser in which it is not shown, please write in the comments.

But there is one thing, and this “but” is fundamental: it is a violation of the specification. Is it worth the savings of one byte - decide for yourself.

Animated gif

, , GIF. PNG . , , PNG « »: APNG W3C, ; W3C' MNG . GIF .

. , . . , .

Composition

Here you should start with the main thing: if on the 800 × 600 logical screen with a new frame only a 2 × 2 square has changed, then it makes sense to make a 2 × 2 frame, and not 800 × 600. I think this is logical. But, alas, not all image processing programs are bothered by such an analysis, and the lion’s share of GIF optimization comes from correcting this trivial error.

And now to the details, namely to the Disposal Method field of image expansion, which indicates what to do with the frame before showing a new one. I repeat a little:

Values 0 and 1 with different formulations indicate to the renderer to do nothing. Ie, draw the pixels of the next frame directly on top (except for transparent ones, of course).

Value 2 - before displaying a new frame, delete everything from the logical screen. According to the specification, the logical screen should be filled with a background color, in practice it becomes completely transparent.

Value 3 is the most interesting - before displaying the next frame, return everything as it was before the current frame was displayed. So, when coding, say, the fifth frame, it is worth making a difference not only with the fourth, but also with the third - it will suddenly shrink better.

Transparent color

Note that a transparent color is indicated for each image, even if a global palette is used. What can it give us? Look at the picture:

The picture is removed under the spoiler, not everyone loves animation in the middle of the text.

How many colors are there? Red, green, blue, yellow and transparent - five. That is what GIMP and Photoshop think. But they are wrong - there are only four colors. All four frame images are exactly the same, and use a global palette. The difference is only in the extensions of the image that stand in front of them; each color is alternately declared transparent.

Palette and code size

LZW ( ) , . , 24- , .

, , . , - , .

… … ! , .

8 :

, , 0 (-) .

: - . ( 1). - LZW . , 0 3, - . , — 2.

, ?

. , GIF' !

«» 437 . , 6 , 443 . , Photoshop, 469 , 7% .

Total

We summarize. GIF:

Removal of unnecessary blocks, including image extensions;
The choice of the minimum possible color depth (palette size);
Minimizing the size of image frames;
Use local palette;
A departure from the greed of the LZW;
Reset LZW dictionary;
Choose between progressive or interlaced scanning (if valid);
Using frame cleaning methods: “leave”, “clear”, “undo” (only for animation);
"Reuse" transparent colors (only for animation);
Optimizing the palette to reduce the size of the codes (only for animation).

Works, but is not encouraged:

Delete the end of the LZW stream marker and the dictionary reset code at the beginning.

To be continued

That's all for today. And the next, final part is the graphic format PNG and its derivatives.

Source: https://habr.com/ru/post/184660/

All Articles

Images: formats and compression (2/3)

TIFF: Collect yourself

Format for sharing images

GIF: Structure

Headline

Logical screen description block

Global palette

Picture

Packing data of arbitrary length

Extensions

Graphics management extension

Application extension

Text extension

Comment

Closing block

Lempel-Ziva-Welch Algorithm

Compression with a dictionary, a simple example

LZW: General Coding Principle

Greed is bad

Compression of the same values

LZW to GIF

Non-standard

Animated gif

Composition

Transparent color

Palette and code size

Total

To be continued

More articles: