Optimization and Inference for Computer Vision: Graphical models

In this post, we are going to deal the problem of image completion using millions of photographs from a huge database.

Given a target image to complete:

Finding the most similar images to the target from a huge database.
From every found image, find the most similar region to the hole of the target.
Using graphical models, automatically fine tune of the editing mask.
Copy the region.
Perform the Poisson Editing correction
Choose the best result

In particularly, we are going to focus on the 3rd step.

Once we have the similar image to the target, we placed the image into the hole of the incomplete image at its best placement using a seam finding and standard poisson blending to smooth the transition between the patch and the hole.

The seam finding operation tries to adapt the seam such that it could be possible that some pixel of the image won’t preserve. The algorithm use a relax constraint that allow the seam finding operation to remove some pixels from the given image but discourage the cutting of too many pixels by penalizing with a cost which increases with distance from the hole.

The seam finding operation minimizes the gradient of the image difference along the seam:

where C_d(p, L(p)) are the unary costs of assigning any pixel p, to a specific label L(p), and C_i(p, q, L(p), L(q)) is the cost for two pixels taking labels L(p) and L(q) respectively. Depending on if a pixel is taken from the match image or the incomplete image, the label of each pixel, L(p), has two values patch and exist. Thus, the value of C_d(p, exist) for a missing pixels of the existing image is a very large number. Likewise, the C_d(p, patch) is a very large number for those pixels that don’t belongs to the hole region.

The seam finding operation assigns a cost to each pixel taken from the incomplete image which increases with a pixel's distance from the hole.

C_i(p, q, L(p), L(q)) assigns a cost based on the labels of the immediately adjacent pixels. When the connected pixels p and q share the same label, L(p)=L(q), the cost is zero. Along the seams, when L(p)~=L(q), C_i(p, q, L(p), L(q)) is equal to the gradient of the SSD between the incomplete image and the match image at pixels p and q.

Then, this cost functional is minimized using a Loopy Belief Propagation. An image can be represented as a graph where each pixel is a node that could have two state values (patch, exist). The unary potentials (or factor) of each node is given by C_d. The edge potentials encode the relation between neighbors pixels or adjacent nodes, therefore, is given by C_i.

Optimization and Inference for Computer Vision

miércoles, 18 de noviembre de 2015

Graphical models

No hay comentarios:

Publicar un comentario