Hello Habr!
This article is about how to compress files into ZIP archives properly and as tightly as possible. I decided to write it because a lot of applications pack their file formats into ZIP. We will look at ZIP compression methods, at applications that compress into ZIP, and at how compression can be improved.
ZIP compression method
To begin with, ZIP supports several compression methods (Copy, Deflate, Deflate64, BZip2, LZMA, PPMd), but we will consider only one of them, Deflate, because it is the method used by most applications that pack their formats into ZIP. Here is a small list of file formats that are actually ZIP archives: open-file.ru (enter the ASCII signature of the header, PK, in the search). I should note right away that this is only a small list of such formats.
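To check whether a given file is really a ZIP archive in disguise, you can look for the PK signature mentioned above. Here is a minimal sketch in Python; the helper name `starts_with_pk` and the sample archive are my own illustration, and the standard `zipfile.is_zipfile` function is shown next to it as the more thorough check.

```python
import io
import zipfile


def starts_with_pk(data: bytes) -> bool:
    """Cheap check: does the data begin with the ASCII bytes 'PK'?"""
    return data[:2] == b"PK"


# Build a tiny ZIP archive in memory so there is something to inspect.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("hello.txt", "hello")
archive = buf.getvalue()

print(starts_with_pk(archive))                    # signature check
print(zipfile.is_zipfile(io.BytesIO(archive)))    # stricter standard-library check
print(starts_with_pk(b"plain text, not a ZIP"))   # a non-ZIP input
```

The two-byte test is only a quick hint; `zipfile.is_zipfile` checks for the archive's end record and is the safer choice.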
Deflate compression method
Today there are several libraries based on the Deflate compression method:
So, before choosing a ZIP archiver, you need to decide what result you want and how much time you are willing to spend to get it. Deflate has one characteristic property: the higher the compression ratio, the more time compression takes.
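The trade-off between time and size is easy to see for yourself. Below is a small sketch that uses Python's `zlib` module (one Deflate implementation, which wraps the stream in a zlib header) on synthetic text. The data generator, the chosen levels, and all the names are assumptions made for this illustration, and the timings will differ from machine to machine.

```python
import random
import time
import zlib


def sample_text(n_words=40000, seed=1):
    """Generate reproducible pseudo-text from a small random vocabulary."""
    rng = random.Random(seed)
    letters = "abcdefghijklmnopqrstuvwxyz"
    vocab = [
        "".join(rng.choice(letters) for _ in range(rng.randint(3, 8)))
        for _ in range(500)
    ]
    return " ".join(rng.choice(vocab) for _ in range(n_words)).encode("ascii")


def measure(data, levels=(1, 6, 9)):
    """Return {level: (compressed_size, seconds)} for each Deflate level."""
    results = {}
    for level in levels:
        start = time.perf_counter()
        packed = zlib.compress(data, level)
        elapsed = time.perf_counter() - start
        results[level] = (len(packed), elapsed)
    return results


data = sample_text()
results = measure(data)
print(f"original: {len(data)} bytes")
for level, (size, elapsed) in results.items():
    print(f"level {level}: {size} bytes, {elapsed:.4f} s")
```

On data like this, the higher levels should produce smaller output at the cost of more time; the dedicated tools discussed below push the same trade-off much further.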
ZIP archivers
In this section, we will consider only those applications that are free to use.
7-zip algorithm
Here we will talk about two programs that implement the 7-zip algorithm: 7-zip and advzip.
When creating a ZIP archive with 7-zip, I use the following parameters:
-r -mm=Deflate -y -tzip -mpass=15 -mfb=258 -mx9
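If you call 7-zip from a script, the parameters above can be assembled into a full command line. This is only a sketch under assumptions: I assume the executable is named `7z` and is on the PATH, that `a` is the command that adds files to an archive, and the names `out.zip` and `my_folder` are hypothetical. The switches themselves are copied verbatim from the line above.

```python
import shutil
import subprocess

# Switches copied verbatim from the article.
SEVEN_ZIP_SWITCHES = [
    "-r", "-mm=Deflate", "-y", "-tzip", "-mpass=15", "-mfb=258", "-mx9",
]


def build_7z_command(archive, inputs, exe="7z"):
    """Return the argument list for adding `inputs` to `archive`."""
    return [exe, "a", *SEVEN_ZIP_SWITCHES, archive, *inputs]


def run_7z(archive, inputs, exe="7z"):
    """Run 7-zip; raises if the executable (assumed name) is not found."""
    if shutil.which(exe) is None:
        raise FileNotFoundError(f"{exe} not found on PATH")
    subprocess.run(build_7z_command(archive, inputs, exe), check=True)


# Only build and show the command here; call run_7z(...) to execute it.
command = build_7z_command("out.zip", ["my_folder"])
print(" ".join(command))
```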
The distinctive feature of advzip is that it works with existing ZIP archives: you simply specify the path to the archive, and it tries to recompress it. This is convenient when you already have a finished archive and do not want to unpack it and pack it again.
Kzip algorithm
The kzip algorithm is implemented in the kzip application. It runs extremely slowly, but almost always gives the best result. It has settings (/s, /n, /b) that can improve or worsen ZIP compression.
Recommendations
Here are a few recommendations on how to get the best compression ratio (they are based on personal experience):
- If the files you are archiving include ZIP archives, I recommend decompressing those inner archives, that is, repacking them without compression (for convenience, you can use advzip with the /z0 option). The reason is that ZIP with Deflate does not support solid archives: every file is compressed separately. When Deflate then compresses a decompressed inner archive, it sees that archive as one whole file, so its contents are compressed as if they were a solid archive.
- If you want to squeeze out the maximum, you can use the zipmix application. Suppose you created two ZIP archives with the same content using kzip, but with different settings, and got archives of different sizes. This does not mean that every file in the smaller archive is compressed better than the same file in the larger one. This is where zipmix helps: it builds a third, smaller archive from the two, because it compares each file individually and takes the version whose compressed size is smaller. zipmix works not only with kzip archives.
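The first recommendation above can be checked with a small experiment. This is a sketch using Python's `zipfile` module rather than advzip, on synthetic files that share many lines with each other; the file contents, the names, and the sizes of the test set are all assumptions made for illustration.

```python
import io
import random
import string
import zipfile


def make_files(count=50, seed=0):
    """Create small text files that draw their lines from a shared pool."""
    rng = random.Random(seed)
    pool = [
        "".join(rng.choice(string.ascii_lowercase) for _ in range(60))
        for _ in range(40)
    ]
    files = {}
    for i in range(count):
        lines = rng.sample(pool, 10)
        files[f"file{i:02d}.txt"] = ("\n".join(lines) + "\n").encode("ascii")
    return files


def build_zip(files, method):
    """Pack {name: bytes} into an in-memory ZIP with the given method."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=method, compresslevel=9) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return buf.getvalue()


def outer_size(inner_bytes):
    """Size of an outer Deflate ZIP that contains the inner archive."""
    return len(build_zip({"inner.zip": inner_bytes}, zipfile.ZIP_DEFLATED))


files = make_files()
size_deflated_inner = outer_size(build_zip(files, zipfile.ZIP_DEFLATED))
size_stored_inner = outer_size(build_zip(files, zipfile.ZIP_STORED))

print("outer size, inner archive deflated:", size_deflated_inner)
print("outer size, inner archive stored:  ", size_stored_inner)
```

With the inner archive stored, the outer Deflate pass can reuse matches across file boundaries, which is why the second number should come out smaller on data like this.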
Practice
So I decided to show how it all works in practice. As an example, I took an iPad game, Angry Birds HD version 2.0.0. The original size of the game is 13,547,363 bytes.
Application | Result, bytes | Elapsed time, seconds |
advzip | 12,891,768 | 195 |
7-zip | 12,891,143 | 720 |
kzip | 12,877,794 | 2770 |
7-zip + advzip | 12,858,419 | - |
kzip + advzip | 12,849,101 | - |
kzip + 7-zip + advzip | 12,842,760 | - |
As you can see, zipmix can slightly improve compression. Personally, when I need the maximum, I simply combine the results of all three (kzip + advzip + 7-zip) into one. This is much better than trying out different parameter combinations in kzip.
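The per-file selection idea behind zipmix can be sketched in a few lines. This is not zipmix itself: the sketch only reports which archive holds the smallest version of each file and what the combined total would be, without writing a new archive. The archive labels, file names, and test data are hypothetical.

```python
import io
import zipfile


def make_zip(entries):
    """entries: list of (name, data, compress_type) tuples."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, data, method in entries:
            zf.writestr(name, data, compress_type=method)
    return buf.getvalue()


def compressed_sizes(zip_bytes):
    """Return {filename: compressed size in bytes} for one archive."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        return {info.filename: info.compress_size for info in zf.infolist()}


def pick_smallest(archives):
    """archives: {label: zip bytes}. Returns {filename: (label, size)}."""
    best = {}
    for label, zip_bytes in archives.items():
        for name, size in compressed_sizes(zip_bytes).items():
            if name not in best or size < best[name][1]:
                best[name] = (label, size)
    return best


text = b"hello zip " * 200
# Each archive compresses one file well and the other badly (stored).
archives = {
    "first": make_zip([("a.txt", text, zipfile.ZIP_DEFLATED),
                       ("b.txt", text, zipfile.ZIP_STORED)]),
    "second": make_zip([("a.txt", text, zipfile.ZIP_STORED),
                        ("b.txt", text, zipfile.ZIP_DEFLATED)]),
}

best = pick_smallest(archives)
totals = {label: sum(compressed_sizes(z).values()) for label, z in archives.items()}
mixed_total = sum(size for _, size in best.values())

print("per-archive totals:", totals)
print("best source per file:", best)
print("combined total:", mixed_total)
```

Because the choice is made per file, the combined total can never be larger than the best single archive, and it accepts any number of input archives, which matches the idea of mixing all three results.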