Neural network as a predictor for PNG image coding

This is a translation of the article "Neural Network As Predictor For Image Coding (PNG)". The author's blog is here.

Research topic

The main motivation for this work was to improve on the existing PNG pre-filters: to create a new filter that uses an artificial neural network to make a better prediction and thereby achieve better file compression.

Compression

Classically, PNG compression is divided into two steps:

Pre-filtering (using predictors);
Compression (using DEFLATE).

Only the first step matters for this article. The image below shows the existing filters currently in use and how they store the difference between the real and the predicted pixel.

How a predictor works

Current filters + new solution:

Type	Name	Filter function	Reconstruction function
0	None	Filt(x) = Orig(x)	Recon(x) = Filt(x)
1	Sub	Filt(x) = Orig(x) - Orig(a)	Recon(x) = Filt(x) + Recon(a)
2	Up	Filt(x) = Orig(x) - Orig(b)	Recon(x) = Filt(x) + Recon(b)
3	Average	Filt(x) = Orig(x) - floor((Orig(a) + Orig(b)) / 2)	Recon(x) = Filt(x) + floor((Recon(a) + Recon(b)) / 2)
4	Paeth	Filt(x) = Orig(x) - PaethPredictor(Orig(a), Orig(b), Orig(c))	Recon(x) = Filt(x) + PaethPredictor(Recon(a), Recon(b), Recon(c))
5	Neural Network	Filt(x) = Orig(x) - NN(ArrayOfInputPixels)	Recon(x) = Filt(x) + NN(ArrayOfInputPixels)
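
The four standard filters can be sketched in a few lines of Java. This is only an illustration, not the author's code: the class and method names are hypothetical, and the formulas follow the PNG specification, where a is the sample to the left, b the sample above, c the sample above and to the left, Average uses floor((a + b) / 2), and all arithmetic is done modulo 256.

```java
// Illustrative sketch of PNG filter types 0..4 on single byte values (0..255).
// Names are hypothetical; a = left, b = above, c = upper left.
final class PngFilterSketch {

    /** Paeth predictor: returns whichever of a, b, c is closest to a + b - c. */
    static int paethPredictor(int a, int b, int c) {
        int p = a + b - c;
        int pa = Math.abs(p - a);
        int pb = Math.abs(p - b);
        int pc = Math.abs(p - c);
        if (pa <= pb && pa <= pc) return a;
        if (pb <= pc) return b;
        return c;
    }

    /** Prediction made by filter types 0 (None) to 4 (Paeth). */
    static int predict(int type, int a, int b, int c) {
        switch (type) {
            case 0: return 0;
            case 1: return a;
            case 2: return b;
            case 3: return (a + b) / 2; // equals floor, since a and b are non-negative
            case 4: return paethPredictor(a, b, c);
            default: throw new IllegalArgumentException("unknown filter type " + type);
        }
    }

    /** Filt(x) = Orig(x) - prediction, modulo 256. */
    static int filter(int type, int orig, int a, int b, int c) {
        return (orig - predict(type, a, b, c)) & 0xFF;
    }

    /** Recon(x) = Filt(x) + prediction, modulo 256. */
    static int reconstruct(int type, int filt, int a, int b, int c) {
        return (filt + predict(type, a, b, c)) & 0xFF;
    }

    public static void main(String[] args) {
        int orig = 120, a = 118, b = 125, c = 121; // made-up sample values
        for (int type = 0; type <= 4; type++) {
            int filt = filter(type, orig, a, b, c);
            System.out.println("type " + type + ": filtered = " + filt
                    + ", reconstructed = " + reconstruct(type, filt, a, b, c));
        }
    }
}
```

The neural filter (type 5) fits the same pattern: only the body of predict changes, while filter and reconstruct stay the same.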

Neural network as a predictor

The last filter is the new one implemented by the author of the article. Internally it uses a neural network that takes an array of input pixels and returns a predicted pixel value. As with the other filters, the difference between the original and the predicted value is stored. But what are these input values? In the figure below, the author tried to show as clearly as possible how the input values are fed to the neural network. The image is divided into three different parts:

Copied (indicated by RED);
Input pixels for the neural network (indicated by GREEN);
Predicted pixel (denoted BLUE).

Copied pixels

The entire red area is copied 1:1, because the neural filter needs initial data before it can start working. This is why a frame of the image is copied unchanged. The network configuration was as follows:

28 input neurons (marked GREEN): (8 * 4 - 4) px.
1 output neuron (marked BLUE): the 29th px.

So all pixels from 1st to 28th will be copied.

Input pixels

The first pixel processed by the filter is at position (5, 4). It can be predicted by the neural network from the 28 pixels that precede it, as the illustration above shows.

Predicted pixel

All green pixels are input pixels that the neural network processes, resulting in the predicted value for the BLUE pixel.
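
A minimal Java sketch of how the 28 input pixels could be collected for one predicted pixel. The window layout is an assumption inferred from the numbers in the text (8 * 4 - 4 = 28 inputs, first predicted pixel in column 5 of row 4): three full rows of 8 pixels above the current pixel plus the 4 pixels to its left. The class and method names are hypothetical, and coordinates in the code are zero-based.

```java
// Hypothetical helper: gathers the 28 neighbours used as network inputs.
// Assumed layout: 3 rows above (8 pixels each) + 4 pixels to the left.
final class InputWindowSketch {
    static final int ROWS_ABOVE = 3; // full rows above the predicted pixel
    static final int LEFT = 4;       // columns to the left of the predicted pixel
    static final int RIGHT = 3;      // columns to the right (rows above only)

    /** Collects the neighbours of (x, y) in raster order; image[y][x] holds 0..255. */
    static int[] gatherInputs(int[][] image, int x, int y) {
        int[] inputs = new int[ROWS_ABOVE * (LEFT + 1 + RIGHT) + LEFT]; // 3 * 8 + 4 = 28
        int n = 0;
        for (int row = y - ROWS_ABOVE; row < y; row++) {
            for (int col = x - LEFT; col <= x + RIGHT; col++) {
                inputs[n++] = image[row][col];
            }
        }
        for (int col = x - LEFT; col < x; col++) {
            inputs[n++] = image[y][col];
        }
        return inputs;
    }

    /** True if the whole window fits inside the image; other pixels are copied as-is. */
    static boolean isPredictable(int width, int x, int y) {
        return y >= ROWS_ABOVE && x >= LEFT && x + RIGHT < width;
    }

    public static void main(String[] args) {
        int[][] image = new int[4][8];
        for (int r = 0; r < 4; r++) {
            for (int c = 0; c < 8; c++) {
                image[r][c] = r * 8 + c + 1; // number the pixels 1..32 in raster order
            }
        }
        int[] inputs = gatherInputs(image, 4, 3); // zero-based (4, 3) is column 5, row 4
        System.out.println("inputs: " + inputs.length + ", target pixel number: " + image[3][4]);
    }
}
```

Under this assumed layout, the copied frame consists of the top three rows, the four leftmost columns and the three rightmost columns of the image.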

Components

In this section, the author describes the components he developed and used. All code is written in Java.
The first stage is to train the neural network. To speed this step up, the author developed the Pattern Exporter, which creates a training set for the JavaNNS tool. The figure below illustrates this step.

Once trained, the neural network has to be used in the encoder / decoder. The figure below explains this stage in detail.

Input image: the image to be compressed with the help of the neural network.
PNG Encoder / Decoder: encodes and decodes an image using the neural network predictor.
Neural Network: a neural network implemented in the Java programming language.
JNNSParser
Output image: the resulting image, which should be smaller than the input.
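
The reason the same predictor can sit in both the encoder and the decoder is that it only ever looks at pixels the decoder has already reconstructed. The following Java sketch shows this round trip on a single row of samples. It is not the author's encoder and does not use pngj; the Predictor interface and the simple "left neighbour" predictor are hypothetical stand-ins for the neural network.

```java
import java.util.Arrays;

// Illustrative lossless round trip with a pluggable predictor.
final class PredictiveCodingSketch {

    /** Hypothetical predictor: guesses sample i using samples 0..i-1 only. */
    interface Predictor {
        int predict(int[] known, int i);
    }

    /** Encoder: stores the residual Filt(x) = Orig(x) - prediction (mod 256). */
    static int[] encode(int[] orig, Predictor p) {
        int[] filt = new int[orig.length];
        for (int i = 0; i < orig.length; i++) {
            filt[i] = (orig[i] - p.predict(orig, i)) & 0xFF;
        }
        return filt;
    }

    /** Decoder: rebuilds Recon(x) = Filt(x) + prediction (mod 256), left to right. */
    static int[] decode(int[] filt, Predictor p) {
        int[] recon = new int[filt.length];
        for (int i = 0; i < filt.length; i++) {
            // recon[0..i-1] are already final here, so the prediction matches the encoder's
            recon[i] = (filt[i] + p.predict(recon, i)) & 0xFF;
        }
        return recon;
    }

    public static void main(String[] args) {
        Predictor left = (known, i) -> i == 0 ? 0 : known[i - 1]; // stand-in for the network
        int[] row = {10, 12, 13, 13, 200, 198};
        int[] filt = encode(row, left);
        System.out.println("residuals: " + Arrays.toString(filt));
        System.out.println("lossless:  " + Arrays.equals(row, decode(filt, left)));
    }
}
```

With a neural network as the predictor, the rounding of its output to an integer would have to be identical in the encoder and the decoder, otherwise the reconstruction is no longer exact.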

For encoding and decoding, the author used the pngj library. You can find it here.

Results

There are many ways to configure a neural network. The choices to be made include:

selection of the number of input neurons;
selection of which pixels serve as input neurons;
selection of the number of hidden neurons;
selection of the number of hidden layers of neurons;
selection of the neuron activation functions;
selection of the learning algorithm;
and so on.

Below are some of the best neural network design options the author found. He evaluated them mainly by testing on several sample images, calculating the BPP (bits per pixel) achieved with the neural network, and picking the best parameters. This led to the following results:

Estimated neural network configuration:

Number of input neurons: 28.
Number of hidden neurons:
- 9 neurons (3x3);
- 25 neurons (5x5).
Number of hidden layers: 1.
Activation function: sigmoid, with the range limited to 0.2 to 0.8.
Learning algorithm: backpropagation of error.
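
A forward pass through a network of this shape (28 inputs, one hidden layer of 9 neurons, 1 output, sigmoid activations) could look roughly like the Java sketch below. It is not the author's implementation. In particular, the reading of "range from 0.2 to 0.8" as a linear mapping of pixel values 0..255 onto 0.2..0.8 is an assumption, the class and method names are hypothetical, and the random weights in main only stand in for weights that would come from training.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical 28-9-1 feed-forward network used as a pixel predictor.
final class TinyMlpSketch {
    private final double[][] hiddenWeights; // [hiddenNeuron][input]
    private final double[] hiddenBias;      // [hiddenNeuron]
    private final double[] outputWeights;   // [hiddenNeuron]
    private final double outputBias;

    TinyMlpSketch(double[][] hiddenWeights, double[] hiddenBias,
                  double[] outputWeights, double outputBias) {
        this.hiddenWeights = hiddenWeights;
        this.hiddenBias = hiddenBias;
        this.outputWeights = outputWeights;
        this.outputBias = outputBias;
    }

    static double sigmoid(double z) {
        return 1.0 / (1.0 + Math.exp(-z));
    }

    /** Assumed scaling: maps a pixel value 0..255 linearly onto 0.2..0.8. */
    static double toActivation(int pixel) {
        return 0.2 + 0.6 * pixel / 255.0;
    }

    /** Inverse of toActivation, rounded and clamped to 0..255. */
    static int toPixel(double activation) {
        long v = Math.round((activation - 0.2) / 0.6 * 255.0);
        return (int) Math.max(0, Math.min(255, v));
    }

    /** Forward pass: scaled inputs -> sigmoid hidden layer -> sigmoid output -> pixel. */
    int predict(int[] inputs) {
        double out = outputBias;
        for (int h = 0; h < hiddenWeights.length; h++) {
            double z = hiddenBias[h];
            for (int i = 0; i < inputs.length; i++) {
                z += hiddenWeights[h][i] * toActivation(inputs[i]);
            }
            out += outputWeights[h] * sigmoid(z);
        }
        return toPixel(sigmoid(out));
    }

    public static void main(String[] args) {
        int inputs = 28, hidden = 9;
        Random rnd = new Random(42); // placeholder weights, not trained ones
        double[][] hw = new double[hidden][inputs];
        double[] hb = new double[hidden];
        double[] ow = new double[hidden];
        for (int h = 0; h < hidden; h++) {
            ow[h] = rnd.nextGaussian() * 0.1;
            for (int i = 0; i < inputs; i++) {
                hw[h][i] = rnd.nextGaussian() * 0.1;
            }
        }
        TinyMlpSketch net = new TinyMlpSketch(hw, hb, ow, 0.0);
        int[] window = new int[inputs];
        Arrays.fill(window, 128);
        System.out.println("predicted pixel: " + net.predict(window));
    }
}
```

Keeping targets away from 0 and 1 is a common choice with sigmoid outputs, since the sigmoid only reaches those values asymptotically; whether that was the author's motivation is not stated in the article.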

Comparison with other PNG predictors

At the next stage, the author compared his neural filter with other PNG filters that are currently in use. Testing took place on several images.

It can be seen that the neural network compresses images somewhat worse than the Paeth and Average filters, but much better than Sub and Up. After this check, a second one was carried out on a much larger set of images (111), all of them photographs of nature. The goal was to find out which images the filter handles best and which worst. Below are the images on which the neural network did much better than all the other filters:

I wasn't sure what these pictures have in common. Well, there are many flowers. So maybe my neural network really likes flowers. But I wasn't very comfortable with that explanation.

We can therefore conclude that the neural network works well when the image contains:

many textures;
various textures;
little noise.

Next, the author went through his holiday photos looking for one that satisfies the conditions described above, and found one:

As a result, the following BPP values were calculated for 6 filters:

Type	Name	BPP
0	None	7.289
1	Sub	6.681
2	Up	6.667
3	Average	6.433
4	Paeth	6.486
5	Neural Network	6.368

This confirmed the hypothesis about which image features favour compression with the neural filter.
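
For reference, a BPP figure of this kind can be obtained by running the filtered bytes through DEFLATE and dividing the compressed size in bits by the number of pixels. The Java sketch below uses java.util.zip.Deflater for this. It is an assumption about how such a measurement can be done, not the author's exact procedure (which, for example, may count file headers or colour channels differently); the class and method names are hypothetical and the test data is synthetic.

```java
import java.util.zip.Deflater;

// Illustrative BPP measurement: DEFLATE the (filtered) bytes, then divide by pixel count.
final class BppSketch {

    /** Compresses the data with DEFLATE and returns the compressed size in bytes. */
    static int deflatedSize(byte[] data) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        deflater.setInput(data);
        deflater.finish();
        byte[] buffer = new byte[4096];
        int total = 0;
        while (!deflater.finished()) {
            total += deflater.deflate(buffer); // output is discarded, only its size matters
        }
        deflater.end();
        return total;
    }

    /** Bits per pixel = compressed size in bits / number of pixels. */
    static double bitsPerPixel(long compressedBytes, int width, int height) {
        return 8.0 * compressedBytes / ((double) width * height);
    }

    public static void main(String[] args) {
        int width = 256, height = 256;
        byte[] raw = new byte[width * height];
        byte[] sub = new byte[width * height];
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                int i = y * width + x;
                raw[i] = (byte) ((x * x / 64 + y) & 0xFF);     // synthetic smooth grayscale image
                int left = x == 0 ? 0 : raw[i - 1] & 0xFF;
                sub[i] = (byte) (((raw[i] & 0xFF) - left) & 0xFF); // Sub filter residual
            }
        }
        System.out.println("BPP unfiltered: " + bitsPerPixel(deflatedSize(raw), width, height));
        System.out.println("BPP Sub filter: " + bitsPerPixel(deflatedSize(sub), width, height));
    }
}
```

A lower BPP means the filter left residuals that DEFLATE can compress better, which is the sense in which the values in the table are compared.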

Comparison of images of nature with images of man-made objects

The author conducted one more test to find out how the origin of an object in an image affects compression. The following results were obtained:

Conclusion

There is a lot of potential. There was not enough time to find a suitable neural network setup. Perhaps, with more specialization in this area, the neural filter could beat the other filters in terms of BPP.
A different neural network topology might bring improvements. There were thoughts about recursive neural networks...
Another idea was to train a neural network to handle only one type of image.
Performance was not a goal of this work. The other filters clearly process images much faster than the solution described above.

The project is on GitHub for anyone who is interested.

Source: https://habr.com/ru/post/303842/
