Wednesday, November 18, 2015

Graphical models


In this post, we are going to deal with the problem of image completion using millions of photographs from a huge database.

Given a target image to complete:
  1. Find the most similar images to the target in a huge database.
  2. From every retrieved image, find the region most similar to the hole of the target.
  3. Using graphical models, automatically fine-tune the editing mask.
  4. Copy the region.
  5. Perform the Poisson editing correction.
  6. Choose the best result.

In particular, we are going to focus on the 3rd step.

Once we have an image similar to the target, we place it into the hole of the incomplete image at its best placement, using seam finding and standard Poisson blending to smooth the transition between the patch and the surrounding image.

The seam finding operation adapts the seam in such a way that some pixels of the existing image may not be preserved. The algorithm uses a relaxed constraint that allows the seam finding operation to remove some pixels from the given image, but discourages cutting too many of them by penalizing with a cost that increases with distance from the hole.

The seam finding operation minimizes the gradient of the image difference along the seam. The labelling L is chosen to minimize the energy

$$E(L) = \sum_p C_d(p, L(p)) + \sum_{(p,q)} C_i(p, q, L(p), L(q)),$$

where C_d(p, L(p)) is the unary cost of assigning pixel p to a specific label L(p), and C_i(p, q, L(p), L(q)) is the cost of two adjacent pixels taking labels L(p) and L(q) respectively. Depending on whether a pixel is taken from the match image or the incomplete image, the label of each pixel, L(p), has two values: patch and exist. Thus, the value of C_d(p, exist) is a very large number for the missing pixels of the existing image. Likewise, C_d(p, patch) is a very large number for those pixels that don't belong to the hole region.


The seam finding operation also assigns a cost to each existing pixel that the seam cuts away from the incomplete image, a cost that increases with the pixel's distance from the hole.
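A minimal sketch of how these unary costs could be assembled, using the distance penalty to relax the hard constraint as described earlier; the helper name unary_costs, the weight k and the constant BIG are illustrative assumptions, not the original implementation:

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

BIG = 1e9  # effectively forbids a label

def unary_costs(hole_mask, k=1.0):
    """Data costs C_d over the two labels {0: exist, 1: patch}.

    hole_mask : bool (H, W), True where the incomplete image has no data.
    k         : assumed weight of the distance-based cutting penalty.
    """
    # distance of each valid pixel to the nearest hole pixel
    dist = distance_transform_edt(~hole_mask)
    Cd = np.zeros(hole_mask.shape + (2,))
    Cd[hole_mask, 0] = BIG                    # missing pixels cannot keep label 'exist'
    Cd[~hole_mask, 1] = k * dist[~hole_mask]  # cutting valid pixels costs more far from the hole
    return Cd
```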

C_i(p, q, L(p), L(q)) assigns a cost based on the labels of immediately adjacent pixels. When the connected pixels p and q share the same label, L(p) = L(q), the cost is zero. Along the seams, when L(p) ≠ L(q), C_i(p, q, L(p), L(q)) is equal to the gradient of the SSD between the incomplete image and the match image at pixels p and q.


Then, this cost functional is minimized using loopy belief propagation. An image can be represented as a graph where each pixel is a node with two possible states (patch, exist). The unary potential (or factor) of each node is given by C_d. The edge potentials encode the relation between neighbouring pixels, i.e. adjacent nodes, and are therefore given by C_i.
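As an illustration, here is a minimal sketch of synchronous min-sum loopy belief propagation on the 4-connected pixel grid. It assumes the pairwise cost has the Potts form described above (zero when labels agree, an edge weight, e.g. the gradient of the SSD, when they differ); the function names and the iteration count are assumptions, not the original implementation:

```python
import numpy as np

def loopy_bp_potts(unary, wx, wy, n_iters=30):
    """Approximate MAP labelling of a 2-label grid MRF by min-sum loopy BP.

    unary : (H, W, 2) data costs C_d for the labels {0: exist, 1: patch}.
    wx    : (H, W-1) costs of a label change between horizontal neighbours.
    wy    : (H-1, W) costs of a label change between vertical neighbours.
    """
    H, W, _ = unary.shape
    # messages arriving at each pixel from its left/right/upper/lower neighbour
    m_l, m_r = np.zeros((H, W, 2)), np.zeros((H, W, 2))
    m_u, m_d = np.zeros((H, W, 2)), np.zeros((H, W, 2))

    def potts_msg(h, w):
        # min-sum Potts update: m(l) = min(h(l), min_l' h(l') + w)
        return np.minimum(h, h.min(axis=2, keepdims=True) + w[..., None])

    for _ in range(n_iters):
        belief = unary + m_l + m_r + m_u + m_d
        # each sender subtracts the message it received from the receiver
        new_l = potts_msg(belief[:, :-1] - m_r[:, :-1], wx)  # to right neighbour
        new_r = potts_msg(belief[:, 1:] - m_l[:, 1:], wx)    # to left neighbour
        new_u = potts_msg(belief[:-1, :] - m_d[:-1, :], wy)  # to lower neighbour
        new_d = potts_msg(belief[1:, :] - m_u[1:, :], wy)    # to upper neighbour
        m_l[:, 1:], m_r[:, :-1], m_u[1:, :], m_d[:-1, :] = new_l, new_r, new_u, new_d
        for m in (m_l, m_r, m_u, m_d):
            m -= m.min(axis=2, keepdims=True)  # normalize to avoid drift

    # final beliefs: pick the cheapest label per pixel (0 = exist, 1 = patch)
    return (unary + m_l + m_r + m_u + m_d).argmin(axis=2)
```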






Tuesday, November 10, 2015

Poisson

Let S, a closed subset of ℝ², be the image definition domain, and let Ω be a closed subset of S with boundary ∂Ω. Let f* be a known scalar function defined over S minus the interior of Ω, and let f be an unknown scalar function defined over the interior of Ω. Finally, let v be a vector field defined over Ω.
To interpolate f from f* over Ω, the solution is the membrane interpolant:

$$\min_f \iint_\Omega |\nabla f|^2 \quad \text{with} \quad f|_{\partial\Omega} = f^*|_{\partial\Omega}.$$
To avoid a blurred interpolation, we can modify the problem by introducing further constraints in the form of a guidance field v:

$$\min_f \iint_\Omega |\nabla f - \mathbf{v}|^2 \quad \text{with} \quad f|_{\partial\Omega} = f^*|_{\partial\Omega}.$$
We choose the guidance field directly as the gradient of the source image g:

$$\mathbf{v} = \nabla g.$$
So now, the solution to the problem is the unique solution of the Poisson equation with Dirichlet boundary conditions:

$$\Delta f = \operatorname{div} \mathbf{v} = \Delta g \quad \text{over } \Omega, \qquad f|_{\partial\Omega} = f^*|_{\partial\Omega}.$$
We can discretize the Laplacian operator with the standard five-point stencil:

$$\Delta f(x, y) \approx f(x+1, y) + f(x-1, y) + f(x, y+1) + f(x, y-1) - 4 f(x, y).$$

So the solution can be discretized as follows: for each pixel p in Ω, with N_p its set of 4-neighbours,

$$|N_p|\, f_p - \sum_{q \in N_p \cap \Omega} f_q = \sum_{q \in N_p \cap \partial\Omega} f^*_q + \sum_{q \in N_p} (g_p - g_q).$$

Results

Given a source and a destination image, we want to copy the eyes and the mouth of the girl (source) into the Lena image (destination).

 

Gauss-Seidel scheme
The discretized model can be solved with the Gauss-Seidel iterative method. The result is shown below.
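A minimal (and deliberately simple) sketch of such a solver, assuming lap_src holds the five-point Laplacian of the source image g and that the clone mask does not touch the image border; the function name and the fixed iteration count are assumptions:

```python
import numpy as np

def poisson_gauss_seidel(dest, lap_src, mask, n_iters=500):
    """Seamless cloning by in-place Gauss-Seidel sweeps over the clone region.

    dest    : destination image (float), supplies the boundary values f*.
    lap_src : discrete Laplacian of the source image g (the guidance term).
    mask    : bool array, True inside the clone region (away from the border).
    """
    f = dest.astype(float).copy()
    ys, xs = np.nonzero(mask)
    for _ in range(n_iters):
        for y, x in zip(ys, xs):
            # rearranged stencil: 4 f_p - sum(neighbours) = -lap_src_p;
            # neighbours outside the mask keep their destination values
            f[y, x] = (f[y - 1, x] + f[y + 1, x]
                       + f[y, x - 1] + f[y, x + 1] - lap_src[y, x]) / 4.0
    return f
```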



Conjugate gradient method
The model problem described above can be expressed as a linear system of equations

$$A\,\mathbf{f} = \mathbf{b},$$

where f collects the pixels in the clone region. A is a matrix of size N × N, where N is the number of pixels in the clone region, and is defined as follows: each diagonal entry is 4 (the number of neighbours of the corresponding pixel), each off-diagonal entry is −1 when the two pixels are 4-neighbours inside the clone region, and 0 otherwise.
The vector b is a finite approximation of the source Laplacian at every pixel in the clone region. Moreover, for border pixels, the fixed values of their neighbours outside the clone region are added to the Laplacian value.

Once the matrix A and vector b are defined, the conjugate gradient method will attempt to solve the linear system of equations.
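A sketch of how the system could be assembled with SciPy sparse matrices and solved with the conjugate gradient method; the helper name and the assembly details are assumptions consistent with the description above:

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import cg

def solve_poisson_cg(dest, lap_src, mask):
    """Assemble A f = b over the clone region and solve it with CG."""
    idx = -np.ones(mask.shape, dtype=int)
    ys, xs = np.nonzero(mask)
    idx[ys, xs] = np.arange(len(ys))          # pixel -> unknown index
    A = sparse.lil_matrix((len(ys), len(ys)))
    b = np.empty(len(ys))
    for k, (y, x) in enumerate(zip(ys, xs)):
        A[k, k] = 4.0
        b[k] = -lap_src[y, x]                 # source Laplacian (guidance)
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if mask[ny, nx]:
                A[k, idx[ny, nx]] = -1.0      # unknown neighbour
            else:
                b[k] += dest[ny, nx]          # fixed boundary value
    f, info = cg(A.tocsr(), b)                # A is symmetric positive definite
    out = dest.astype(float).copy()
    out[ys, xs] = f
    return out
```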


Conclusions

The results present a bit of blurring where the destination image has edges. This happens because seamless cloning blends the source and destination regions, and the effect of this blending is more noticeable when the inserted object is close to another one. However, the use of mixed gradients prevents this undesirable effect.
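The mixed-gradient variant keeps, at each point, the stronger of the source and destination gradients as the guidance field: v(x) = ∇f*(x) wherever |∇f*(x)| > |∇g(x)|, and v(x) = ∇g(x) elsewhere. A minimal sketch (the function name is an assumption):

```python
import numpy as np

def mixed_gradient_field(src, dst):
    """Per-pixel choice of the stronger gradient (mixing gradients)."""
    gy_s, gx_s = np.gradient(src.astype(float))  # source gradients
    gy_d, gx_d = np.gradient(dst.astype(float))  # destination gradients
    keep_dst = gx_d**2 + gy_d**2 > gx_s**2 + gy_s**2
    vx = np.where(keep_dst, gx_d, gx_s)
    vy = np.where(keep_dst, gy_d, gy_s)
    # b then becomes the divergence of (vx, vy) inside the clone region
    return vx, vy
```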

Segmentation

The Chan-Vese model is defined as follows:

$$\min_{c_1, c_2, C}\; \mu\, \mathrm{length}(C) + \nu\, \mathrm{area}(\mathrm{inside}(C)) + \lambda_1 \int_{\mathrm{inside}(C)} |f(x) - c_1|^2\, dx + \lambda_2 \int_{\mathrm{outside}(C)} |f(x) - c_2|^2\, dx.$$

The first term controls the regularity by penalizing the length. The second term penalizes the enclosed area of C to control its size. The third and fourth terms penalize the discrepancy between the piecewise constant approximation u (with value c1 inside C and c2 outside) and the input image f.

Define H as the Heaviside function and δ the Dirac mass, its distributional derivative:

$$H(t) = \begin{cases} 1 & \text{if } t \ge 0, \\ 0 & \text{if } t < 0, \end{cases} \qquad \delta(t) = \frac{d}{dt} H(t).$$

To be able to obtain δ, H is regularized as

$$H_\epsilon(t) = \frac{1}{2}\left(1 + \frac{2}{\pi}\arctan\left(\frac{t}{\epsilon}\right)\right),$$

so δ is

$$\delta_\epsilon(t) = H_\epsilon'(t) = \frac{\epsilon}{\pi\,(\epsilon^2 + t^2)},$$

where t = φ.

In the function H(φ) we can see the set enclosed by C, so the length of C is obtained as the total variation of H(φ):

$$\mathrm{length}(C) = \int_\Omega |\nabla H(\varphi)|\, dx = \int_\Omega \delta(\varphi)\, |\nabla \varphi|\, dx.$$
For fixed φ, the optimal values of c1 and c2 are the region averages:

$$c_1 = \frac{\int_\Omega f(x)\, H(\varphi(x))\, dx}{\int_\Omega H(\varphi(x))\, dx}, \qquad c_2 = \frac{\int_\Omega f(x)\, \big(1 - H(\varphi(x))\big)\, dx}{\int_\Omega \big(1 - H(\varphi(x))\big)\, dx}.$$

Taking into account where H is defined, c1 is the average value of f over the set enclosed by C, whilst c2 is the average value over the outer points.

The evolution of φ can be discretized in space as

$$\frac{\partial \varphi_{i,j}}{\partial t} = \delta_\epsilon(\varphi_{i,j}) \Big[ A_{i,j} (\varphi_{i+1,j} - \varphi_{i,j}) - A_{i-1,j} (\varphi_{i,j} - \varphi_{i-1,j}) + B_{i,j} (\varphi_{i,j+1} - \varphi_{i,j}) - B_{i,j-1} (\varphi_{i,j} - \varphi_{i,j-1}) - \nu - \lambda_1 (f_{i,j} - c_1)^2 + \lambda_2 (f_{i,j} - c_2)^2 \Big],$$

where

$$A_{i,j} = \frac{\mu}{\sqrt{\eta^2 + (\nabla^{+}_x \varphi_{i,j})^2 + (\nabla^{0}_y \varphi_{i,j})^2}}$$

and

$$B_{i,j} = \frac{\mu}{\sqrt{\eta^2 + (\nabla^{0}_x \varphi_{i,j})^2 + (\nabla^{+}_y \varphi_{i,j})^2}},$$

with

$$\nabla^{+}_x \varphi_{i,j} = \varphi_{i+1,j} - \varphi_{i,j}$$ the forward difference in the x dimension,

$$\nabla^{-}_x \varphi_{i,j} = \varphi_{i,j} - \varphi_{i-1,j}$$ the backward difference in the x dimension, and

$$\nabla^{0}_x \varphi_{i,j} = \frac{\varphi_{i+1,j} - \varphi_{i-1,j}}{2}$$ the central difference in the x dimension (and analogously in y).



Knowing that φⁿ_{i,j} is the function in our current iteration and φⁿ⁺¹_{i,j} is the function in the next one, we can discretize the time derivative as

$$\frac{\partial \varphi_{i,j}}{\partial t} \approx \frac{\varphi^{n+1}_{i,j} - \varphi^{n}_{i,j}}{dt}.$$

It means that we can isolate the function in the next iteration as follows:

$$\varphi^{n+1}_{i,j} = \frac{\varphi^{n}_{i,j} + dt\,\delta_\epsilon(\varphi^{n}_{i,j}) \left[ A_{i,j}\,\varphi^{n}_{i+1,j} + A_{i-1,j}\,\varphi^{n}_{i-1,j} + B_{i,j}\,\varphi^{n}_{i,j+1} + B_{i,j-1}\,\varphi^{n}_{i,j-1} - \nu - \lambda_1 (f_{i,j} - c_1)^2 + \lambda_2 (f_{i,j} - c_2)^2 \right]}{1 + dt\,\delta_\epsilon(\varphi^{n}_{i,j}) \left( A_{i,j} + A_{i-1,j} + B_{i,j} + B_{i,j-1} \right)}.$$
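To make the scheme concrete, here is a minimal explicit sketch of one evolution step; it uses a forward Euler update instead of the semi-implicit isolation above, and the function name is an assumption. The parameter defaults follow the values used in the results below:

```python
import numpy as np

def chan_vese_step(phi, f, mu=1.0, nu=0.0, lam1=1.0, lam2=1.0,
                   eps=1.0, eta=0.01, dt=1e-2):
    """One explicit (forward Euler) update of the Chan-Vese evolution."""
    # regularized Heaviside and Dirac delta
    H = 0.5 * (1.0 + (2.0 / np.pi) * np.arctan(phi / eps))
    delta = (eps / np.pi) / (eps**2 + phi**2)

    # region averages: c1 inside the curve, c2 outside
    c1 = (f * H).sum() / H.sum()
    c2 = (f * (1.0 - H)).sum() / (1.0 - H).sum()

    # curvature term div(grad(phi) / |grad(phi)|); eta avoids division by zero
    py, px = np.gradient(phi)
    norm = np.sqrt(eta**2 + px**2 + py**2)
    curv = np.gradient(px / norm, axis=1) + np.gradient(py / norm, axis=0)

    # gradient descent on the Chan-Vese energy
    force = mu * curv - nu - lam1 * (f - c1)**2 + lam2 * (f - c2)**2
    return phi + dt * delta * force
```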


Results

The parameters used in these results are the same unless otherwise specified: λ1 = λ2 = 1, ε = 1, η = 0.01 and dt = 10⁻².
These first examples are obtained using μ = 1 and ν = 0. The videos show, on the left, how φ evolves and, on the right, the curve C it describes in the image f.

 

Effects of μ

The parameter μ is used to penalize the length of C. This means that if μ is small, length(C) is penalized less, so the curve can grow longer and fit the input image more accurately. On the other hand, if μ is larger, the boundaries of the curve will be smoother.

In the videos below we can observe the results using different values of μ.

The first case uses μ = 1 and ν = 0.5, the second case uses μ = 0.6 and ν = 0.5, and the third case uses μ = 0.3 and ν = 0.5. In the first case, as we penalize the length more, the curve cannot fit the shapes as well, so it is smoother. In the third case, as we penalize the length very little, the curve can expand more, which is why more shapes are selected than the ones we want. With an intermediate value, as in the second case, the result is noticeably better and the curve fits the shapes we want.



Effects of ν

The parameter ν is used to control the size of C by penalizing the area inside it. When ν is too small (it can even be negative), the boundary expands to fill the full domain, but if it is too large, the boundary shrinks until it vanishes.

In the videos below we can observe the results using different values of ν. In the first case, where μ = 0.5 and ν = 0, as the area is penalized less, the curve expands without fitting any shape. In the third case, as the area of C is penalized more, the curve shrinks without fitting any shape and finally vanishes. In the second case, with an intermediate value, the curve fits the shapes well.


When it doesn't work

When the mean value inside the curve and the mean value outside are the same (c1 = c2), φ doesn't change, so the curve never fits the shapes.

In the video below we can see an example with an image whose mean value is the same everywhere. We can observe that, due to the length penalty, φ decreases, so the length of the curve shrinks. Even so, the constants inside and outside stay the same, so the curve cannot evolve towards any shape. It continues this way until the curve finally vanishes.