Optimization and Inference for Computer Vision: octubre 2015

martes, 20 de octubre de 2015

Image Denoising

Denoising Modelling

The goal is to apply function optimizations to denoise an image.

The image denoising is modelled as the following energy functionals which have to be minimized:

Tikhonov

Rudin, Osher and Fatemi (ROF):

For each denoising model (Tikhonov and ROF), the first integral computes the "smoothness" of the resulting image, and the second integral computes the "similarity" of the resulting image with the input image.

The gradient is computed following the Euler-Lagrange equations, which give us:

We will use this gradient to iteratively descend the gradient to find the value of u that minimizes the energy functional.

Discretization

The gradients are defined in continuous space. To compute them we have to discretize them, and we do this as:

From which we can deduce an explicit iterative method for each model:

Tikhonov:

ROF:

This iterative method is the one implemented to find the u that minimizes the energy functional.

Parameters

λ : Similarity weight

Lambda weights out the importance of the similarity parameter, versus the smoothness parameter. The greater the lambda, the smoother the image will be, as the similarity parameter will be weighted less.

The value has to be chosen depending on the input image. For an input image with small noise, one is interested in retreiving a similar image, there is no need for a big smoothing, therefore the chosen lambda should be low.

Some examples on how λ affects the end result:

As we can see, the greater the λ, the less noise there is in the image, but also countours are removed.

ε: Zero offset

Epsilon is a simple offset for the ROF model, as the |∇u| is not differentiable at zero. Smaller values of ε are, the more blurred the restored image is. The ε parameter restricts the minimum energy which the smoothness term can reach. For instance, if ε is too small the resulting image will remain noisy. Otherwise, if ε is too big the resulting image will be too smooth and the information of the contours will be completely lost.

dt: Discretization time

The discretization time is the step amount along the gradient between iterations. This step has to be small enough to ensure that the iterative algorithm reaches the minimum of the functional, but not too small or it will require many iterations to find the minimum. If the steps are too big the iterative algorithm may diverge, therefore it will never reach the minimal solution. This can be seen in the following image which are shown 3 results obtained for differents values of dt.

Seeds

The minumum result is independent of the initial seed, as the functional is convex and there are no restrictions, so the solution is unique. Therefore the starting point only affects the number of iterations the method will need to reach the result.

In this examples we can see the effect of the seed, using Thikonov with λ = 15 and dt= 0.1:

Energy

The energy decreases every iteration because the iterative process follows the energy gradient. Each step will reduce the energy of the functional until we reach the minimum u or reach the iterations limit.

Here we can see an example of the energy in Thikonov, using as a λ = 15, dt = 0.1 and with Neumann boundary conditions:

Boundary Conditions

With boundary condition Dirichlet, where the boundary pixels have value 0, the edge pixels are greatly affected with the Tikhonov model, as the resulting image is highly smoothed. This smoothness makes the boundary zero values to “smudge” into the image. The lambda factor can be used to lower the amount of smoothness and weight more the similarity to the image, reducing the black border generated.

Here we can see an example of how affect Dirichlet boundary conditions to an image, using Tikhonov model with λ = 150 and dt = 0.1:

Here we can see an example of how affect Dirichlet boundary conditions to an image, using ROF model with λ = 150 and dt = 0.1:

It is important to note that with the ROF model we do not see a black border. This is because the ROF model penalizes less the edges, so the black contour created is not smudged into the image.

With boundary condition Neumann, where the boundary pixels have the same value as the edge pixels (mirror-like), the edge pixels are not affected by the boundary condition. This is expected, as the gradient at the borders is zero (which is the only term computed in the borders).

Here we can see an example of how affect Neumann boundary conditions to an image, using Tikhonov model with λ = 150 and dt = 0.1:

Here we can see an example of how affect Neumann boundary conditions to an image, using ROF model with λ = 0.1 and dt = 10^-5.

Data Fidelity term

One could ask himself "¿Why is the data fidelity term needed? If we want to denoise we just need a smooth image".

The data fidelity term makes sure that the computed image is “similar” to the noisy image. Together with the smoothing term the resulting image is similar and smooth. If the similarity term is not used, then the image will only be smooth, a plain image with the average value of the seed image.

Conclusions

We can conclude that both methods, using both boundary conditions are not good enough for a real world problem.
Tikhonov denoising generates images too smooth, greatly removing the image high frequencies, making it barely visible.
ROF denoising correctly removes the noise, but at the cost of losing the intermediate grayscales, having as a result a staircase effect, where segments of the image are represented by a value of gray, with great value differences between patches.

Image Inpainting

ROF Dual Formulation

We saw on the last entry the formulation of the Total Variation (TV) as:

This formulation has two problems:

It is not differentiable at the origin
It is not differentiable if u has discontinuities

Therefore we need another formulation for the |∇u|. A definitions suggests:

Where p is a vector of 2 components (2D images) which has the same direction as ∇u when the |∇u| ≠ 0.

With this new definition, we can now use it in a new formulation of the ROF model, a dual formulation.

We define the ROF as follows:

Where

This means that we have two variables to minimize, therefore we need to constrain the first, minimize with respect to the second, then constrain the second and minimize respect to the first.

From the Euler-Lagrange equations we see that constraining p we get that the optimal u is:

Substituting this value into the original model minimizing with respect to p we get that the gradient descent of p is:

In the discrete form:

Which is the explicit algorithm we will use to find the minimum of the ROF model.

Inpainting Modelling

The goal is to apply function optimizations to inpaint an image.

The inpainting model can be defined as:

Where u is the image to be computed, and f is the input image, missing some parts to be inpainted. This model states the the resulting image is f at the borders of the missing pixels, and is the TV minimum value at the missing pixels.

We can reformulate it, again, as a dual model:

Where we compute the image u as the solution of the ROF dual model using v as input and viceversa, and v is then used to recompose the image where the information is known, using as source the input image to be inpainted f.

This leads again to a three step iterative approach.

1. Fixing v, and minimizing

2. Then fixing u and minimizing

Which, as u is constant, becomes:

And by simple inspection we can see that the minimum is when u = v.

3. The last step is to restore the computed image v with the known information, from f. We fill the known pixels in v with the pixels from f.

Parameters

λ : Similarity weight

Trying out values for lambda we conclude that it has to be less than 0.1 to get a reasonably good image, not blurred.

Boundary Conditions

The boundary conditions in this model are derived from the internal models. We know that the best boundary condition for the ROF model is Neumann, therefore we can derive that the best condition for p is Dirichlet, because p is the ∇u normalized, therefore in the borders the gradient should be zero for Neumann conditions on u.

Seed

The seed that we are using in the inpainting model is the input image f.

We can deduce this from the equation

Where u in the initial iteration is equal to f, as the p is (0,0) (Dirichlet Conditions), and div(0,0) = 0.

Differences between primal and dual

Minimizing with respect to v

As commented in the introduction:

Fixing u and minimizing

Which, as u is constant, becomes:

And by simple inspection we can see that the minimum is when u = v.

Multichannel extension of Total Variation

The vectorial TV norm for multichannel images, as RGB images, is defined as follows:

where u and p are vectorial functions where each component corresponds to one of the three channels of RGB. There are many ways to compute the vectorial TV norm. In this post, we defined as the sum of the scalar TV of each channel independently.

Given this formulation, the optimization process for the multichannel extension is equivalent to optimize each component separately as scalar functions. Thus, the equations for solving this problem are the same as we mentioned earlier.

Since the three channels are considered independent, the results of the inpainting process wouldn't be as good as expected.

Here there is the obtained result from the optimization process. As we can see the minimization problem needs more iteration to remove completely the text in red. Although we can see how the text is partially removed and the red holes have been blurred a little bit.

Examples with the given images

Here we can see some examples of the algorithm applied in the given images. These images are created using λ = 0.1 and 200 iterations.

Conclusions

We can conclude that the more damage the original image is, the less resolution the resulting image has and more iterations are needed to reach the minimum. For instance, the pictures 3 and 4 from the image above need more iteration due to they lost more information from the original image.

Moreover, the quality of the restored image depend on how the lost information is spread across the image. If the image have big holes, it will be more difficult to restore it completely and the inpainting results will have big homogeneous areas where there wasn't information.