
Another algorithm for recovering blurred images

Good day. So much has already been said about image deconvolution methods that it seems there is nothing left to add. However, there is always a newer and better algorithm than the previous ones. Not so long ago an iterative algorithm was described that has a linear convergence rate, low memory requirements, is stable, and parallelizes well. A while later it was improved even further, to quadratic convergence. Meet the (Fast) Iterative Shrinkage-Thresholding Algorithm.



Formulation of the problem


A while ago a whole series of articles on image deconvolution algorithms was published. That series considered the classical approach of solving systems of linear algebraic equations and their regularization. The method described here, in terms of problem formulation, is practically no different from the classical one, but the devil is in the details. Let's start the math magic. This is the classic expression for the received damaged and noisy image:

$$\mathbf{y} = \mathbf{A}\mathbf{x} + \mathbf{n}$$


The matrix $\mathbf{A}$ is a convolution matrix; from a physical point of view it describes the relationship between the points of the original image and the corrupted data. In the case of images, this matrix can be constructed from the convolution kernel. The vectors $\mathbf{x}$ and $\mathbf{y}$ are the original and damaged images in vectorized form. In other words, all pixel columns of an image are concatenated one after another.

The first problem with this formulation is the amount of memory required. If the image has dimensions 640x480, the problem stores two image vectors of size 307200x1 and a matrix of 307200x307200 elements. The convolution matrix is sparse, but even in sparse form it is large and consumes a lot of memory. The product with the convolution matrix can be replaced by a two-dimensional convolution with the kernel, which does not require storing a large matrix in memory; this solves the problem of storing the matrix $\mathbf{A}$.
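As an illustration (not code from the original article), here is a minimal MATLAB sketch of how multiplication by $\mathbf{A}$ and $\mathbf{A}^T$ can be expressed through 2D convolutions instead of a matrix product. The Gaussian kernel, the 'same' boundary handling, and the test image are assumptions made for the example:

% Hypothetical sketch: apply A and A' as 2D convolutions instead of
% storing a 307200x307200 matrix. Assumes a known blur kernel h.
h  = fspecial('gaussian', 9, 2);               % example blur kernel (assumption)
A  = @(img) conv2(img, h, 'same');             % y = A*x as a convolution
At = @(img) conv2(img, rot90(h, 2), 'same');   % A'*y as correlation with the flipped kernel
x  = im2double(imread('cameraman.tif'));       % any grayscale test image
y  = A(x);                                     % blurred image, no matrix ever formed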
To find an image as close as possible to the original, we write the objective function in the classical form of the squared norm of the discrepancy between the damaged and the desired images:

$$\min_{\hat{\mathbf{x}}} \|\mathbf{y} - \mathbf{A}\hat{\mathbf{x}}\|^2$$



Difference from classical methods


This formulation of the problem is one of the most popular, but we will extend it to improve performance. The approach used is called majorization-minimization, and it consists in replacing the original problem with another, very similar one. The new problem has two properties:

  1. It is much easier to solve.
  2. At all points except the current one, its value is greater than that of the original objective.

This replacement is performed at each iteration of the algorithm. It sounds more complicated than it really is. The function minimized at iteration $k$ is written as follows:

$$J_k(\hat{\mathbf{x}}) = \|\mathbf{y} - \mathbf{A}\hat{\mathbf{x}}\|^2 + (\hat{\mathbf{x}} - \hat{\mathbf{x}}_k)^T(\alpha\mathbf{I} - \mathbf{A}^T\mathbf{A})(\hat{\mathbf{x}} - \hat{\mathbf{x}}_k)$$



The matrix in the right-hand term is positive definite. This is achieved by choosing $\alpha > \max\operatorname{eig}(\mathbf{A}^T\mathbf{A})$. The minimized function is the sum of two quantities, each of which is non-negative. At the point $\hat{\mathbf{x}} = \hat{\mathbf{x}}_k$ the second term is zero, which is why the second condition from the list is satisfied.
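As a side note (not part of the original derivation), the bound $\max\operatorname{eig}(\mathbf{A}^T\mathbf{A})$ can be estimated without forming the matrix, for example with a few power iterations using the convolution handles A and At from the sketch above. The image size, iteration count, and safety factor below are assumptions:

% Hypothetical sketch: estimate max eig(A'*A) by power iteration.
v = randn(480, 640);                 % random image-sized start (size is an assumption)
for i = 1:20                         % a handful of power iterations
    v = At(A(v));
    v = v / norm(v(:));
end
w = At(A(v));
maxEig = norm(w(:));                 % estimate of the largest eigenvalue of A'*A
alpha  = 1.1 * maxEig;               % choose alpha slightly above the bound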

Finding a solution


We begin the search for a solution in the classical way: expand the brackets, write out the gradient, and set it to zero. Note that the gradient is taken at iteration $k$. Warning: a lot of math ahead!

$$J_k(\hat{\mathbf{x}}) = (\mathbf{y} - \mathbf{A}\hat{\mathbf{x}})^T(\mathbf{y} - \mathbf{A}\hat{\mathbf{x}}) + (\hat{\mathbf{x}} - \hat{\mathbf{x}}_k)^T(\alpha\mathbf{I} - \mathbf{A}^T\mathbf{A})(\hat{\mathbf{x}} - \hat{\mathbf{x}}_k)$$


$$J_k(\hat{\mathbf{x}}) = \mathbf{y}^T\mathbf{y} + \hat{\mathbf{x}}^T\mathbf{A}^T\mathbf{A}\hat{\mathbf{x}} - 2\mathbf{y}^T\mathbf{A}\hat{\mathbf{x}} + \hat{\mathbf{x}}^T(\alpha\mathbf{I} - \mathbf{A}^T\mathbf{A})\hat{\mathbf{x}} + \hat{\mathbf{x}}_k^T(\alpha\mathbf{I} - \mathbf{A}^T\mathbf{A})\hat{\mathbf{x}}_k - 2\hat{\mathbf{x}}_k^T(\alpha\mathbf{I} - \mathbf{A}^T\mathbf{A})\hat{\mathbf{x}}$$



$$\frac{\delta J_k(\hat{\mathbf{x}})}{\delta\hat{\mathbf{x}}} = 2\mathbf{A}^T\mathbf{A}\hat{\mathbf{x}} - 2\mathbf{A}^T\mathbf{y} + 2\alpha\mathbf{I}\hat{\mathbf{x}} - 2\mathbf{A}^T\mathbf{A}\hat{\mathbf{x}} - 2(\alpha\mathbf{I} - \mathbf{A}^T\mathbf{A})\hat{\mathbf{x}}_k$$



As a result, the derivative will have the following form:

$$\frac{\delta J_k(\hat{\mathbf{x}})}{\delta\hat{\mathbf{x}}} = -2\mathbf{A}^T\mathbf{y} - 2(\alpha\mathbf{I} - \mathbf{A}^T\mathbf{A})\hat{\mathbf{x}}_k + 2\alpha\hat{\mathbf{x}}$$



The next classical step in this situation is to set the gradient to zero and express the desired vector $\hat{\mathbf{x}}$ from the resulting expression. The obtained $\hat{\mathbf{x}}$ we denote $\hat{\mathbf{x}}_{k+1}$, the solution at the next iteration.

$$\frac{\delta J_k(\hat{\mathbf{x}})}{\delta\hat{\mathbf{x}}} = -2\mathbf{A}^T\mathbf{y} - 2(\alpha\mathbf{I} - \mathbf{A}^T\mathbf{A})\hat{\mathbf{x}}_k + 2\alpha\hat{\mathbf{x}} = 0$$


$$\mathbf{A}^T\mathbf{y} + \alpha\mathbf{I}\hat{\mathbf{x}}_k - \mathbf{A}^T\mathbf{A}\hat{\mathbf{x}}_k = \alpha\hat{\mathbf{x}}$$


$$\hat{\mathbf{x}}_{k+1} = \hat{\mathbf{x}}_k + \frac{1}{\alpha}\mathbf{A}^T(\mathbf{y} - \mathbf{A}\hat{\mathbf{x}}_k)$$



The resulting expression is called the Landweber iteration. This is the basic update formula between iterations. Using it, the discrepancy is guaranteed to decrease from iteration to iteration at a linear rate.
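For illustration, a minimal MATLAB sketch of the bare Landweber update, assuming the convolution handles A, At, the blurred image y, and the step 1/alpha from the sketches above (not code from the original article):

% Hypothetical sketch: plain Landweber iteration, no thresholding yet.
x_hat = At(y);                                      % initial estimate x_0 = A'*y
for k = 1:100
    x_hat = x_hat + (1/alpha) * At(y - A(x_hat));   % x_{k+1} = x_k + (1/alpha)*A'*(y - A*x_k)
end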

The basic algorithm contains one additional step, called soft-thresholding, which assumes sparsity of the solution. This step sets to zero all values of the desired vector whose absolute value is smaller than a given threshold, and shrinks the remaining values toward zero by that threshold. This reduces the impact of noise on the recovery result.
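For reference (this matches the soft_gpu function in the code further below), the soft-thresholding operator is usually written as:

$$\operatorname{soft}(x, T) = \operatorname{sign}(x)\,\max(|x| - T,\ 0)$$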

What it looks like


As can be seen from the solution, we have two parameters that we tune to taste. $\lambda$ is responsible for the accuracy of the approximation and sets the threshold level; $\alpha$ is responsible for the stability and the convergence rate of the algorithm.

$$\hat{\mathbf{x}}_{k+1} = \operatorname{soft}\!\left(\hat{\mathbf{x}}_k + \frac{1}{\alpha}\mathbf{A}^T(\mathbf{y} - \mathbf{A}\hat{\mathbf{x}}_k),\ \lambda/\alpha\right)$$
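Putting the Landweber step and the soft-thresholding together gives the basic ISTA loop. This is an illustrative sketch under the same assumptions as above (handles A and At, parameters alpha and lambda), not the author's code:

% Hypothetical sketch of basic ISTA: Landweber step + soft-thresholding.
soft   = @(x, T) sign(x) .* max(abs(x) - T, 0);   % soft-threshold operator
lambda = 0.001;                                   % threshold parameter (assumption)
x_hat  = At(y);                                   % initial estimate
for k = 1:100
    x_hat = soft(x_hat + (1/alpha) * At(y - A(x_hat)), lambda/alpha);
end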



Algorithm modification


Later the algorithm was improved with an additional term, similar in spirit to the conjugate gradient method. By adding at each step a multiple of the difference between the results of the two previous iterations, the convergence of the algorithm is accelerated to quadratic. The final procedure of the algorithm is as follows:

  1. Set the parameters $\lambda, \alpha$.
  2. Set $t_1 = 1$, $t_0 = 1$, $\mathbf{y}_{temp} = \hat{\mathbf{x}}_0 = \mathbf{A}^T\mathbf{y}$.
  3. Iterate the base algorithm: $\hat{\mathbf{x}}_{k+1} = \operatorname{soft}\left(\mathbf{y}_{temp} + \frac{1}{\alpha}\mathbf{A}^T(\mathbf{y} - \mathbf{A}\mathbf{y}_{temp}),\ \lambda/\alpha\right)$.
  4. Update the temporary variables: $t_0 = t_1$; $t_1 = 0.5 + \sqrt{0.25 + t_1^2}$.
  5. Add the conjugate component: $\mathbf{y}_{temp} = \hat{\mathbf{x}}_{k+1} + \frac{t_0 - 1}{t_1}\cdot(\hat{\mathbf{x}}_{k+1} - \hat{\mathbf{x}}_k)$.

Now about the parallelism of the algorithm. The loop that applies the threshold at each iteration can be expressed through elementwise products (Hadamard products) and comparison operations, which are available in both OpenCL and CUDA, so it is easy to parallelize.

I implemented the algorithm in MATLAB in two versions: one for computing on the CPU and one on the GPU. Later I used only the GPU version of the code, since it is more convenient. The code is presented below.

function [x] = fista_gpu(y, H, Ht, lambda, alpha, Nit)
% y      - corrupted image (gpuArray)
% H, Ht  - function handles for the blur operator A and its transpose A'
% lambda - threshold (sparsity) parameter
% alpha  - step parameter, alpha > max eig(A'*A)
% Nit    - number of iterations
    x = Ht(y);                  % initial estimate x_0 = A'*y
    y_k = x;                    % temporary variable y_temp
    t_1 = 1;
    T = lambda / alpha;         % threshold level
    for k = 1:Nit
        x_old = x;
        x = soft_gpu(y_k + (1/alpha) * Ht(y - H(y_k)), T);  % base ISTA step
        t_0 = t_1 - 1;
        t_1 = 0.5 + sqrt(0.25 + t_1^2);
        y_k = x + (t_0 / t_1) * (x - x_old);                % conjugate (momentum) term
    end
end

function [y] = soft_gpu(x, T)
% Elementwise soft-thresholding: zero out small values, shrink the rest.
    eq = ge(abs(x), abs(T));
    y = eq .* sign(x) .* (abs(x) - abs(T));
end
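A hypothetical usage sketch (not from the article): the blur operator and its transpose are passed in as conv2-based function handles, and the data is moved to the GPU with gpuArray. The kernel, the file name, and the parameter values are assumptions:

% Hypothetical call of fista_gpu with convolution-based operator handles.
h  = fspecial('gaussian', 15, 3);                % assumed blur kernel
H  = @(img) conv2(img, h, 'same');               % operator A
Ht = @(img) conv2(img, rot90(h, 2), 'same');     % operator A'
y  = gpuArray(im2double(imread('blurred.png'))); % grayscale corrupted image on the GPU (assumed file)
x  = fista_gpu(y, H, Ht, 0.001, 1.1, 100);       % lambda, alpha, iteration count are assumptions
x  = gather(x);                                  % copy the result back from GPU memory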


Let's move closer to practice and test the performance of the algorithm depending on the image size.

My laptop configuration:
Intel Core i3-4005U
NVIDIA GeForce 820M

On the left is a graph of the run time of 100 iterations of the algorithm as a function of the number of pixels, for the CPU and for the GPU.





The following graphs present a comparison of the basic and the improved algorithm, as well as the dependence of the algorithm's convergence on the $\alpha$ and $\lambda$ parameters.




And finally, an example of the algorithm applied to a Full HD picture.

Source picture


Corrupted and noisy picture


Recovered picture


As you can see, the result is quite good, especially since the whole process takes only 15 seconds, of which 1.5 s is the algorithm itself and 13.5 s is transferring the picture from GPU memory.
All the code used for the algorithm is available on GitHub.

References


Source: https://habr.com/ru/post/324052/

