Tuesday, October 20, 2015

Image Inpainting

ROF Dual Formulation

We saw in the previous entry the formulation of the Total Variation (TV) as:
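The equation image is missing here; presumably it showed the standard TV functional:

```latex
TV(u) = \int_\Omega |\nabla u| \, dx
```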


This formulation has two problems:
  1. It is not differentiable at the origin
  2. It is not differentiable if u has discontinuities
Therefore we need another formulation for |∇u|. A definition suggests:
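The equation image did not survive; a plausible reconstruction, assuming the standard dual characterization of the TV integrand, is:

```latex
|\nabla u| = \max_{|p| \le 1} \; p \cdot \nabla u
```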


Where p is a vector with two components (for 2D images) that has the same direction as ∇u wherever |∇u| ≠ 0.


With this new definition, we can now use it in a new formulation of the ROF model, a dual formulation.

We define the ROF as follows:
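The model equation is missing; presumably it was the primal-dual (min-max) ROF functional, written here assuming the convention where λ divides the similarity term (so larger λ means stronger smoothing, as discussed in the Parameters section):

```latex
\min_u \; \max_{|p| \le 1} \; \int_\Omega p \cdot \nabla u \, dx \;+\; \frac{1}{2\lambda} \int_\Omega (u - f)^2 \, dx
```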


Where
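The missing equation here presumably defined the admissible set of the dual variable:

```latex
p : \Omega \to \mathbb{R}^2, \qquad |p(x)| \le 1 \quad \forall x \in \Omega
```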


This means that we have two variables to optimize. We therefore proceed alternately: fix the first variable and minimize with respect to the second, then fix the second and minimize with respect to the first.

From the Euler-Lagrange equations we see that, with p fixed, the optimal u is:
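The missing optimality condition, reconstructed by differentiating the functional with respect to u (after integrating the dual term by parts), is presumably:

```latex
u = f + \lambda \, \operatorname{div} p
```

This is consistent with the Seed section below, where u equals f on the first iteration because p = (0,0).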


Substituting this value into the original model and minimizing with respect to p, we obtain the gradient step for p:
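A plausible reconstruction of the missing evolution equation for p (a gradient flow on the dual variable, with p kept inside the constraint |p| ≤ 1 by reprojection):

```latex
\frac{\partial p}{\partial t} = \nabla\big( f + \lambda \, \operatorname{div} p \big) = \nabla u, \qquad |p| \le 1
```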


In the discrete form:
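The missing discrete scheme is presumably a projected explicit update of the form below, where τ is the time step (Chambolle's analysis of this kind of scheme suggests τ ≤ 1/8 for stability):

```latex
u^{n} = f + \lambda \, \operatorname{div} p^{n}, \qquad
p^{n+1}_{i,j} = \frac{p^{n}_{i,j} + \tfrac{\tau}{\lambda} \, (\nabla u^{n})_{i,j}}
{\max\!\big( 1, \; \big| p^{n}_{i,j} + \tfrac{\tau}{\lambda} \, (\nabla u^{n})_{i,j} \big| \big)}
```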



Which is the explicit algorithm we will use to find the minimum of the ROF model.
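A minimal NumPy sketch of this explicit scheme. The helper names (grad, div, rof_dual) and the default parameters are my own choices; the discretization assumed here is the usual forward-difference gradient with its negative-adjoint divergence, and τ = 1/8 follows Chambolle's stability bound:

```python
import numpy as np

def grad(u):
    """Forward differences with Neumann boundary (zero gradient at the border)."""
    gx = np.zeros_like(u)
    gy = np.zeros_like(u)
    gx[:-1, :] = u[1:, :] - u[:-1, :]
    gy[:, :-1] = u[:, 1:] - u[:, :-1]
    return gx, gy

def div(px, py):
    """Backward differences; the negative adjoint of grad."""
    fx = np.zeros_like(px)
    fy = np.zeros_like(py)
    fx[0, :] = px[0, :]
    fx[1:-1, :] = px[1:-1, :] - px[:-2, :]
    fx[-1, :] = -px[-2, :]
    fy[:, 0] = py[:, 0]
    fy[:, 1:-1] = py[:, 1:-1] - py[:, :-2]
    fy[:, -1] = -py[:, -2]
    return fx + fy

def rof_dual(f, lam=0.1, tau=0.125, n_iter=200):
    """Explicit projected dual iteration for the ROF model."""
    px = np.zeros_like(f)   # Dirichlet seed: p = (0, 0)
    py = np.zeros_like(f)
    for _ in range(n_iter):
        u = f + lam * div(px, py)          # optimal u for the current p
        gx, gy = grad(u)
        px = px + (tau / lam) * gx         # gradient step on p
        py = py + (tau / lam) * gy
        norm = np.maximum(1.0, np.sqrt(px ** 2 + py ** 2))
        px /= norm                          # reproject onto |p| <= 1
        py /= norm
    return f + lam * div(px, py)
```

On a constant image the iteration is a fixed point (p stays at zero), and on a noisy image the total variation of the output decreases, which is an easy sanity check for the implementation.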


Inpainting Modelling

The goal is to apply function optimizations to inpaint an image.


The inpainting model can be defined as:
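The model equation is missing; presumably, writing D for the set of missing pixels, it was a constrained TV minimization:

```latex
\min_u \int_\Omega |\nabla u| \, dx \qquad \text{subject to} \qquad u = f \;\; \text{on} \;\; \Omega \setminus D
```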

Where u is the image to be computed and f is the input image, which is missing some parts to be inpainted. This model states that the resulting image matches f at the borders of the missing pixels, and takes the TV-minimizing values at the missing pixels.

We can reformulate it, again, as a dual model:
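A plausible reconstruction of the missing coupled model, introducing an auxiliary image v that carries the known data:

```latex
\min_{u,\,v} \; \int_\Omega |\nabla u| \, dx \;+\; \frac{1}{2\lambda} \int_\Omega (u - v)^2 \, dx, \qquad v = f \;\; \text{on} \;\; \Omega \setminus D
```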


Where we compute the image u as the solution of the dual ROF model using v as input and vice versa, and v is then used to recompose the image where the information is known, taking the input image f as the source.

This leads again to a three step iterative approach.

1. Fixing v and minimizing
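The missing functional here is presumably the ROF model with v playing the role of the input image:

```latex
\min_u \; \int_\Omega |\nabla u| \, dx \;+\; \frac{1}{2\lambda} \int_\Omega (u - v)^2 \, dx
```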


2. Then fixing u and minimizing
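The missing functional is presumably the same energy, now seen as a function of v:

```latex
\min_v \; \int_\Omega |\nabla u| \, dx \;+\; \frac{1}{2\lambda} \int_\Omega (u - v)^2 \, dx
```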


Which, as u is constant, becomes:
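Presumably the reduced problem, dropping the now-constant TV term:

```latex
\min_v \; \frac{1}{2\lambda} \int_\Omega (u - v)^2 \, dx
```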


And by simple inspection we can see that the minimum is when u = v.


3. The last step is to restore the known information from f into the computed image v: we fill the known pixels of v with the corresponding pixels of f.
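The three steps above can be sketched as the following self-contained NumPy loop. The function and parameter names (inpaint, known, n_outer, n_inner) are my own; the inner loop runs a few explicit dual ROF steps with v as input, step 2 takes v = u, and step 3 restores the known pixels from f:

```python
import numpy as np

def grad(u):
    # Forward differences, Neumann boundary
    gx = np.zeros_like(u); gy = np.zeros_like(u)
    gx[:-1, :] = u[1:, :] - u[:-1, :]
    gy[:, :-1] = u[:, 1:] - u[:, :-1]
    return gx, gy

def div(px, py):
    # Backward differences; negative adjoint of grad
    fx = np.zeros_like(px); fy = np.zeros_like(py)
    fx[0, :] = px[0, :]; fx[1:-1, :] = px[1:-1, :] - px[:-2, :]; fx[-1, :] = -px[-2, :]
    fy[:, 0] = py[:, 0]; fy[:, 1:-1] = py[:, 1:-1] - py[:, :-2]; fy[:, -1] = -py[:, -2]
    return fx + fy

def inpaint(f, known, lam=0.1, tau=0.125, n_outer=100, n_inner=8):
    """Three-step alternation: dual ROF on v, then v = u, then restore known pixels."""
    v = f.copy()
    px = np.zeros_like(f); py = np.zeros_like(f)   # Dirichlet seed: p = (0, 0)
    for _ in range(n_outer):
        # Step 1: a few explicit dual ROF steps with v as the input image
        for _ in range(n_inner):
            u = v + lam * div(px, py)
            gx, gy = grad(u)
            px = px + (tau / lam) * gx
            py = py + (tau / lam) * gy
            norm = np.maximum(1.0, np.sqrt(px ** 2 + py ** 2))
            px /= norm; py /= norm
        # Step 2: the quadratic term is minimized at v = u
        v = v + lam * div(px, py)
        # Step 3: overwrite the known pixels with the input image f
        v[known] = f[known]
    return v
```

For example, a small hole punched into a constant image gets pulled back toward the surrounding value after a few outer iterations, while the known pixels remain exactly equal to f.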


Parameters

λ : Similarity weight

Lambda balances the importance of the similarity term against the smoothness term. The greater the lambda, the smoother the resulting image, since the similarity term is weighted less.

Trying out values for lambda, we conclude that it has to be less than 0.1 to obtain a reasonably good image that is not blurred.




Boundary Conditions

The boundary conditions in this model are derived from the internal models. We know that the best boundary condition for the ROF model is Neumann on u; from this we can derive that the appropriate condition for p is Dirichlet, because p is the normalized ∇u, and Neumann conditions on u make the gradient zero at the borders.


Seed

The seed that we are using in the inpainting model is the input image f.

We can deduce this from the equation 
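The missing equation is presumably the u-update evaluated at the first iteration, with v initialized to f and p initialized to zero:

```latex
u^{0} = v^{0} + \lambda \, \operatorname{div} p^{0} = f + \lambda \, \operatorname{div}(0,0) = f
```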

Where u in the first iteration is equal to f, since p is (0,0) (Dirichlet conditions) and div(0,0) = 0.

Differences between primal and dual






Minimizing with respect to v

As commented in the introduction:

Fixing u and minimizing
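As in the three-step scheme, the missing functional here is presumably the energy viewed as a function of v:

```latex
\min_v \; \int_\Omega |\nabla u| \, dx \;+\; \frac{1}{2\lambda} \int_\Omega (u - v)^2 \, dx
```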


Which, as u is constant, becomes:
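Presumably the reduced quadratic problem, with the constant TV term dropped:

```latex
\min_v \; \frac{1}{2\lambda} \int_\Omega (u - v)^2 \, dx
```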


And by simple inspection we can see that the minimum is when u = v.

Multichannel extension of Total Variation

The vectorial TV norm for multichannel images, as RGB images, is defined as follows:
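The missing definition is presumably the channel-wise sum described just below:

```latex
TV_{vec}(u) = \sum_{c \,\in\, \{R,G,B\}} \int_\Omega |\nabla u_c| \, dx
```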

where u and p are vector-valued functions whose components correspond to the three RGB channels. There are many ways to compute a vectorial TV norm; in this post we define it as the sum of the scalar TV of each channel taken independently.

Given this formulation, the optimization process for the multichannel extension is equivalent to optimizing each component separately as a scalar function. Thus, the equations for solving this problem are the same as the ones mentioned earlier.

Since the three channels are treated as independent, the results of the inpainting process are not as good as one might expect.

Here is the result obtained from the optimization process. As we can see, the minimization needs more iterations to completely remove the red text, although the text is partially removed and the red holes have been blurred a little.


Examples with the given images

Here we can see some examples of the algorithm applied to the given images. These images were created using λ = 0.1 and 200 iterations.


Conclusions

We can conclude that the more damaged the original image is, the lower the resolution of the resulting image and the more iterations are needed to reach the minimum. For instance, pictures 3 and 4 in the image above need more iterations because they have lost more information from the original image.

Moreover, the quality of the restored image depends on how the lost information is spread across the image. If the image has big holes, it will be more difficult to restore completely, and the inpainting results will contain big homogeneous areas where there was no information.
