📜 ⬆️ ⬇️

Broo lossless compression algorithm and delta encoding, compared with Xdelta3. Home project development

We are glad to welcome you. Almost a year has passed since the last article was published, and we are ready to tell you what happened with the algorithm itself and how delta coding is involved.


image


Introduction


After the release of the article on improvements to the Broo algorithm, we were faced with an obstacle in improving the level of compression and performance, namely it was impossible to improve the level of compression without degrading the speed of decompression and vice versa. I will immediately make a reservation, the improvements were made without prejudice to the other characteristics of the algorithm, but these changes are minor, then we will write about these changes. So, after, we thought about where we can apply the accumulated expertise and knowledge in a similar direction. And the choice fell on delta encoding .


What is delta coding?


Delta encoding ( English Delta encoding) - a way to present data as a difference ( delta ) between serial data instead of the data itself.

In practice, if the compression algorithms allow you to reduce the file size and store or send it without any dependencies on other files, then the delta encoding algorithms allow you to build a smaller patch (difference) based on two files (data set) and apply a patch to the file ( data set) 1 - get the file (data set) 2 .


The most common use for delta coding is to update applications on your phones and PCs. Instead of downloading the application completely and then replacing the files, a much smaller patch is built (depending on the number of changes), which makes it much faster to download the update, and the speed with which the patch is applied directly affects the update speed of the application itself.


If you know where else delta coding is applied, then write in the comments.


On changes in the Broo algorithm


As we said, they are few:



These changes are still at the stage of experimentation and debugging. The main problem - after adding support for large files, the decompression speed dropped by 20%, which is unacceptable for us. So we are still searching for a solution.


Below we present only one table of comparisons of the old version of the algorithm, the experimental one and some levels of zstd. File "xml" from the previous article .


Processor: Intel i7-7700HQ


Memory: DDR4-2400


Algorithm NamePacking speedUnpacking speedCompressed file size, byte% of the original
memcpy17460 MB / s17194 MB / s5345280100.00
zstd 1.3.1 -6141 MB / s1311 MB / s58581010.96
broo 1.211 MB / s1905 MB / s60683811.35
zstd 1.3.1 -5196 MB / s1207 MB / s61951011.59
zstd 1.3.1-4357 MB / s1214 MB / s63758711.93
zstd 1.3.1 -3366 MB / s1220 MB / s63907311.96
broo 1.114 MB / s2005 MB / s64308412.03
zstd 1.3.1 -2394 MB / s1108 MB / s69050812.92
zstd 1.3.1 -1479 MB / s1213 MB / s70309313.15

Like many algorithms, the speed depends on the processor, as we can see in the table, the decompression speed is more than 1.5 times faster than the first level zstd, on the Intel i7-7700HQ. While on the old Intel i3-550, the decompression speed was approximately equal to the zstd decompression speed, you can see the comparative tables here .


This suggests that you can perform tighter integration with individual processors. Depends on the specifics of the problem.


Delta Coding and Broo


As you might have guessed, we developed our own delta encoding algorithm and named it DBroo (Delta Broo).


Main characteristics and features:



There are ready-made solutions, such as diff, bsdiff, xdelta and others. There was a goal to find the best (and also available) in this direction and to be with it. In a purely experimental way, Xdelta3 turned out to be the main competitor. It gives a good compression and a fairly fast speed of the patch. Also, Xdelta3 is used to update CyanogenMod (now LineageOS ).


Now look at the comparison table DBroo and Xdelta3. "Xml" is used as the reference file, and the same but changed randomly as the new file.


Algorithm NamePatch creation speedPatch speedPatch size, byte% of original
memcpy18052 MB / s18665 MB / s5326823100.00
Xdelta3 -9 + lzma5.40 MB / s306 MB / s1065422.00
Xdelta3 -6 + lzma20 MB / s310 MB / s1219162.28
DBroo 1.07.40 MB / s1600.00 MB / s1230522.31
Xdelta3 -97.00 MB / s688.24 MB / s1797323.37
Xdelta3 -636.71 MB / s694.09 MB / s2016813.78
Xdelta3 -359.22 MB / s637.43 MB / s2372184.45
Xdelta3 -272.73 MB / s582.75 MB / s2792235.24
Xdelta3 -181.43 MB / s540.53 MB / s4788248.9

PS


Only those products that are in demand in the market receive development. Therefore, we welcome your comments. We also created a telegram channel .


Thank.


')

Source: https://habr.com/ru/post/425255/


All Articles