Chain conversion

Algorithm Demonstration

I am glad to present my own algorithm for preprocessing data before compression. After learning from the example of BWT that data can be prepared for subsequent compression, I decided to invent my own method.

The essence of the algorithm is to build a small table (dictionary) for a two-way conversion. The algorithm is two-pass: the first pass builds the table, and the second pass encodes the data. Decoding takes a single pass and uses the previously stored table.

Create a table of 2^(n*m) entries, where n is the number of bits in a word and m is the depth of the chain (the number of words).
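As a quick check of how fast the table grows, the sketch below evaluates the entry count for a few settings. It assumes the size is meant as (2^n)^m = 2^(n*m), that is, one entry per possible chain of m words; the function name table_entries is illustrative only.

```python
def table_entries(n: int, m: int) -> int:
    """Entries in the table: one per possible chain of m words of n bits each."""
    return (2 ** n) ** m


for n, m in [(2, 2), (8, 2), (8, 3)]:
    print(f"n={n}, m={m}: {table_entries(n, m)} entries")
# n=2, m=2: 16 entries
# n=8, m=2: 65536 entries
# n=8, m=3: 16777216 entries
```

With byte-sized words, a chain depth of 2 keeps the table small, while each extra level of depth multiplies it by 256.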
The table is filled as follows:
Read m words a, b, c, ... and increment by one the table entry at index a * 2^(n*0) + b * 2^(n*1) + c * 2^(n*2) + ..., then move the window forward by one word and repeat.
The resulting table shows how often each word (byte) occurs after the preceding word (byte). The words in each row of the table are then sorted by frequency of occurrence, most frequent first.
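The first pass can be sketched as follows for m = 2. This is a minimal illustration, not the author's implementation: the function name build_tables, the two-dimensional layout (row = preceding word, column = following word) and the rule that ties are broken by the smaller word value are all assumptions. With the n = 2 data from the conversion example in this article it produces the same filled and sorted tables.

```python
def build_tables(data, n):
    """First pass for m = 2: count successors, then rank them by frequency."""
    size = 1 << n  # number of distinct n-bit words
    counts = [[0] * size for _ in range(size)]
    # Slide a two-word window over the data, one word at a time.
    for prev, cur in zip(data, data[1:]):
        counts[prev][cur] += 1
    # For each preceding word, list the words from most to least frequent.
    # Assumption: ties are broken by the smaller word value.
    ranked = [sorted(range(size), key=lambda w: (-row[w], w)) for row in counts]
    return counts, ranked


data = [0, 2, 1, 3, 1, 1, 3, 1, 0, 1, 3, 0]
counts, ranked = build_tables(data, n=2)
for row in counts:
    print(row)
for row in ranked:
    print(row)
```

Each row of the sorted table is a permutation of all possible words, which is what makes the conversion reversible.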

Encoding works as follows:
Take the first and second words. Look up the pair in the table using the formula above and write the value found in place of the second word. The original value of the second word then becomes the first word for the next step (it is kept in memory only and is not written to the file).
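A hedged sketch of the second pass and of single-pass decoding for m = 2 is given below. It assumes that the value written for each word is its position (rank) in the frequency-sorted row of the preceding word, that the very first word is copied unchanged, and that ties are broken by the smaller word value; the names rank_table, encode and decode are hypothetical. Since the text leaves these details open, the stream this sketch produces need not match another implementation digit for digit.

```python
def rank_table(data, n):
    """Build the frequency-sorted successor table for m = 2."""
    size = 1 << n
    counts = [[0] * size for _ in range(size)]
    for prev, cur in zip(data, data[1:]):
        counts[prev][cur] += 1
    # Assumption: ties are broken by the smaller word value.
    return [sorted(range(size), key=lambda w: (-row[w], w)) for row in counts]


def encode(data, ranked):
    """Replace each word by its rank among the successors of the previous word."""
    if not data:
        return []
    out = [data[0]]  # assumption: the first word is stored as is
    prev = data[0]
    for cur in data[1:]:
        out.append(ranked[prev].index(cur))
        prev = cur  # the original word, not the written code, is the next context
    return out


def decode(codes, ranked):
    """Invert encode() in a single pass using the stored table."""
    if not codes:
        return []
    out = [codes[0]]
    prev = codes[0]
    for code in codes[1:]:
        prev = ranked[prev][code]  # the rank points back to the original word
        out.append(prev)
    return out


data = [0, 2, 1, 3, 1, 1, 3, 1, 0, 1, 3, 0]
table = rank_table(data, n=2)
codes = encode(data, table)
print(codes)
print(decode(codes, table) == data)
```

The decoder only needs the sorted table and the code stream, which is why the table has to be stored alongside the converted data.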

Conversion example for n = 2, m = 2:
Initial data: 0,2,1,3, 1,1,3,1, 0,1,3,0
Filled table:
0,1,1,0
1,1,0,3
0,1,0,0
1,2,0,0

Sorted table:
1,2,0,3
3,0,1,2
1,0,2,3
1,0,2,3
The converted data: 0,0,0,0, 0,2,0,3, 1,2,1,0

Initial data (Picture file from the example in the video)

After conversion

The output shows what happens: the converted data ends up ordered by frequency. The value 0 is always more frequent than 1, 1 is always more frequent than 2, and so on.
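The frequency ordering can be checked on a small input with the sketch below. It is an illustration under assumptions, not the author's code: each word is replaced by its rank in the frequency-sorted successor list of the preceding word, the first word is copied unchanged, ties go to the smaller word value, and the name convert is hypothetical.

```python
from collections import Counter


def convert(data, n):
    """Replace each word by its frequency rank among successors of the previous word."""
    size = 1 << n
    counts = [[0] * size for _ in range(size)]
    for prev, cur in zip(data, data[1:]):
        counts[prev][cur] += 1
    ranked = [sorted(range(size), key=lambda w: (-row[w], w)) for row in counts]
    # Assumption: the first word has no predecessor and is copied unchanged.
    return [data[0]] + [ranked[p].index(c) for p, c in zip(data, data[1:])]


data = [0, 2, 1, 3, 1, 1, 3, 1, 0, 1, 3, 0]
histogram = Counter(convert(data, n=2))
print([histogram[code] for code in range(4)])  # [8, 3, 1, 0]
```

Small values dominate the converted stream, which is the skew a subsequent entropy coder can exploit.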

The algorithm can also be applied to already compressed data (rar, zip), but the gain is too small to make recompressing the data worthwhile.

Source: https://habr.com/ru/post/130703/
