What happens if you combine a photo editor with a neural network

An example of the Neural Photo Editor at work. In the center is the original image. The red and blue squares visualize regions of the latent space learned by the neural network. They can be manipulated both directly (as is usually done) and indirectly, by means of a “context brush”.

Do you think Photoshop works wonders in distorting reality? Yes, it can completely remove a person from a photo or add hair to someone's head, as with Elon Musk, with the help of a “context brush”. But that is nothing compared with what a neural network can do when it is allowed to edit with an understanding of context. This is a completely different reality. A neural network can make a person in a photo smile, give your girlfriend the features of Angelina Jolie, and so on. The possibilities are endless.

One of the first tools in this area is the Neural Photo Editor, developed by staff of the School of Engineering and Physical Sciences at Heriot-Watt University (Edinburgh, United Kingdom) together with a colleague from Renishaw.

Recent advances in generative models for images have produced neural networks that, after training, generate samples and interpolations of high quality. Two main methods, introduced in 2013-2014, are used in this area: the Variational Autoencoder (VAE) and the Generative Adversarial Network (GAN). They showed that a neural network is able to model the complex, high-dimensional structure of natural images.

Variational autoencoders (VAE) are probabilistic graphical models that are trained to maximize a variational lower bound on the likelihood of the data. They encode an input image into a latent space and then reconstruct the image from that space.
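As a rough illustration of the variational lower bound, the sketch below computes its negative for one common setup: a diagonal Gaussian encoder, a standard normal prior, and a unit-variance Gaussian likelihood. These modelling choices and the function names are assumptions made for the example, not details taken from the Neural Photo Editor.

```python
import numpy as np

def gaussian_kl(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dimensions."""
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var, axis=-1)

def negative_elbo(x, x_recon, mu, log_var):
    """Negative variational lower bound, up to an additive constant.

    Assumes a unit-variance Gaussian likelihood, so the reconstruction
    term reduces to half the squared error.
    """
    recon = 0.5 * np.sum((x - x_recon) ** 2, axis=-1)
    return recon + gaussian_kl(mu, log_var)

# Toy batch of one "image" with three pixels and a two-dimensional latent code.
x = np.array([[0.2, 0.8, 0.5]])
x_recon = np.array([[0.1, 0.9, 0.5]])   # hypothetical decoder output
mu = np.array([[0.3, -0.2]])            # hypothetical encoder mean
log_var = np.array([[-0.1, 0.2]])       # hypothetical encoder log-variance
loss = negative_elbo(x, x_recon, mu, log_var)
```

Minimizing this quantity trades reconstruction accuracy against keeping the encoder's distribution close to the prior.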

Generative adversarial networks (GAN) learn a generative model by training one network (the “discriminator”) to distinguish real data from generated data. At the same time, another network (the “generator”) learns to produce samples that the discriminator cannot tell apart from the real ones.
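The two competing objectives can be sketched with plain binary cross-entropy. The example assumes the discriminator outputs a probability that its input is real and uses the common "non-saturating" generator loss; the function names are illustrative and do not come from the editor's code.

```python
import numpy as np

def bce(p, target):
    """Mean binary cross-entropy between probabilities p and 0/1 targets."""
    p = np.clip(p, 1e-12, 1.0 - 1e-12)
    return -np.mean(target * np.log(p) + (1.0 - target) * np.log(1.0 - p))

def discriminator_loss(d_real, d_fake):
    """The discriminator wants real samples scored 1 and generated samples scored 0."""
    return bce(d_real, np.ones_like(d_real)) + bce(d_fake, np.zeros_like(d_fake))

def generator_loss(d_fake):
    """The generator wants its samples scored 1 (non-saturating variant)."""
    return bce(d_fake, np.ones_like(d_fake))

# Hypothetical discriminator outputs early in training, when fakes are easy to spot.
d_real = np.array([0.9, 0.8])
d_fake = np.array([0.1, 0.2])
d_loss = discriminator_loss(d_real, d_fake)
g_loss = generator_loss(d_fake)
```

With these toy scores the discriminator's loss is low and the generator's is high, which is the situation the generator is trained to reverse.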

Both methods are suitable for editing images through the latent space, for example to add a smile to a gloomy face. Each has its own advantages and disadvantages.

The Neural Photo Editor is an innovative interface for working with the latent space of generative models. It lets you make specific semantic edits to an image using the “context brush”, which changes the latent vector indirectly.

The key idea of the Neural Photo Editor is to change the latent space in an intuitive way, that is, by editing an ordinary image. The user selects a brush color and size and paints on the output image. The network backpropagates the difference between the color of the painted pixels and the color of the brush, and changes the latent vector so as to minimize that difference. The result is semantically meaningful edits in the output image: changes in hair, smile, dimples, and so on.
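The brush mechanism described above can be sketched as gradient descent on a latent vector. The example replaces the real generator with a toy linear decoder (an assumption that keeps the gradient easy to write by hand); `decode`, `brush_step`, and `brush_error` are hypothetical names, not functions from the published code.

```python
import numpy as np

rng = np.random.default_rng(0)
n_pixels, n_latent = 16, 4

# Toy stand-in for a trained generator: a fixed linear decoder.
W = rng.normal(size=(n_pixels, n_latent))
b = rng.normal(size=n_pixels)

def decode(z):
    """Map a latent vector to a flat 'image'."""
    return W @ z + b

def brush_error(z, mask, color):
    """Half the squared difference between painted pixels and the brush color."""
    return 0.5 * np.sum((mask * (decode(z) - color)) ** 2)

def brush_step(z, mask, color, lr=0.01):
    """One gradient-descent step on the latent vector for a binary brush mask."""
    diff = mask * (decode(z) - color)   # error only where the brush touched
    grad = W.T @ diff                   # gradient of brush_error w.r.t. z
    return z - lr * grad

z = rng.normal(size=n_latent)           # latent code of the current image
mask = np.zeros(n_pixels)
mask[:4] = 1.0                          # the brush covers the first four pixels
color = 0.0                             # requested brush color (e.g. black)

before = brush_error(z, mask, color)
for _ in range(50):
    z = brush_step(z, mask, color)
after = brush_error(z, mask, color)
```

Because the update acts on the latent vector rather than on the pixels, unpainted pixels also change, which is why a single stroke can produce a coherent edit such as added hair.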

The result of editing photos with the Neural Photo Editor

A simple example: if we take a photo of a light-skinned face with black hair and apply a black brush to the forehead, the Neural Photo Editor automatically adds hair there. The editor runs in real time on a decent GPU.

To improve the quality of edits, the editor lets you modify the network's reconstruction of the image and then transfer those changes to the original image through an interpolation mask. The result is better this way (see the illustration below).
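One way to read the interpolation mask is as a per-pixel blend: where the mask is 1 the output takes the edited reconstruction, and where it is 0 the output keeps the original pixels by adding the reconstruction error back. The sketch below follows that reading; the function name and the exact blending formula are assumptions made for illustration.

```python
import numpy as np

def masked_edit(x, x_recon, x_recon_edited, mask):
    """Transfer edits made on a reconstruction back onto the original image.

    x              : original image
    x_recon        : the network's reconstruction of x
    x_recon_edited : the reconstruction after latent-space editing
    mask           : values in [0, 1]; 1 where the edit should apply
    """
    delta = x_recon_edited - x_recon      # change requested by the edit
    error = x - x_recon                   # detail the reconstruction lost
    return x_recon + mask * delta + (1.0 - mask) * error

# Toy four-pixel example with the edit applied to the first two pixels.
x = np.array([0.9, 0.8, 0.7, 0.6])
x_recon = np.array([0.8, 0.8, 0.6, 0.5])
x_recon_edited = np.array([0.1, 0.2, 0.6, 0.5])
mask = np.array([1.0, 1.0, 0.0, 0.0])
y = masked_edit(x, x_recon, x_recon_edited, mask)
```

Intermediate mask values smooth the boundary between edited and untouched regions.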

Visualization of the interpolation mask. Top row, left to right: reconstruction, reconstruction error (delta), original image. Bottom row: modified reconstruction, delta, resulting image.

The following images show examples of the network reconstructing and interpolating photos from the CelebA, ImageNet, and SVHN datasets. The original images are on the left, and each step to the right shows a further stage of the network's gradual reconstruction.

The authors published their paper on arXiv.org on September 22, 2016.

The code for the Neural Photo Editor is published on GitHub. The same repository contains the code of the Introspective Adversarial Network, a hybrid of a variational autoencoder (VAE) and a generative adversarial network (GAN).

To run the Neural Photo Editor you will need:

Theano, a Python library for efficiently defining, optimizing, and evaluating mathematical expressions involving multidimensional arrays.
Lasagne, a library for building and training neural networks on top of Theano.
cuDNN (recommended for better performance, but not required), Nvidia's library for GPU acceleration of standard routines such as forward and backward convolution, pooling, normalization, and activation layers. It is part of the Nvidia Deep Learning SDK.
numpy, scipy, PIL, Tkinter and tkColorChooser, which your Python distribution most likely already includes.
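
Before launching the editor, it may help to check which of these packages are importable. The helper below is a small illustrative script, not part of the repository. The import names are assumptions based on the list above, and `Tkinter` and `tkColorChooser` are the Python 2 module names.

```python
import importlib

def missing_modules(names):
    """Return the subset of module names that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing

# Import names assumed from the dependency list above (Python 2 naming for Tk).
REQUIRED = ["theano", "lasagne", "numpy", "scipy", "PIL",
            "Tkinter", "tkColorChooser"]

if __name__ == "__main__":
    print("Missing modules:", missing_modules(REQUIRED))
```

cuDNN is not a Python module, so it is not covered by this check.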

Source: https://habr.com/ru/post/397847/
