Object Removal by Exemplar-Based Inpainting 翻译
?
?
?
Object Removal by Exemplar-Based Inpainting
通過基于樣本塊的圖像修復(fù)來實(shí)現(xiàn)遮擋物移除
Abstract:
? A new algorithm is proposed for removing large objects from digital images. The challenge is to fill in the hole that is left behind in a visually plausible way.
? In the past, this problem has been addressed by two classes of algorithms: (i) “texture synthesis” algorithms for generating large image regions from sample textures, and (ii) “inpainting” techniques for filling in small image gaps. The former work well for “textures” – repeating twodimensional patterns with some stochasticity; the latter focus on linear “structures” which can be thought of as onedimensional patterns, such as lines and object contours.
? This paper presents a novel and efficient algorithm that combines the advantages of these two approaches. We first note that exemplar-based texture synthesis contains the essential process required to replicate both texture and structure; the success of structure propagation, however, is highly dependent on the order in which the filling proceeds. We propose a best-first algorithm in which the confidence in the synthesized pixel values is propagated in a manner similar to the propagation of information in inpainting. The actual colour values are computed using exemplar-based synthesis. Computational efficiency is achieved by a blockbased sampling process.
? A number of examples on real and synthetic images demonstrate the effectiveness of our algorithm in removing large occluding objects as well as thin scratches. Robustness with respect to the shape of the manually selected target region is also demonstrated. Our results compare favorably to those obtained by existing techniques.
摘要:
? 針對移除數(shù)字圖像中大的物體,本文提出一個(gè)新的算法,它的挑戰(zhàn)在于用合適的方法填補(bǔ)圖像缺失的部分。
? 在過去的方法中,過去,這一問題通過兩類算法來解決:(i)“紋理合成”算法:通過樣本紋理生成圖像丟失的大區(qū)域,(i i)“修復(fù)”技術(shù):填充圖像的小間隙。前者在“紋理”上具有很好的效果——它通過一些隨機(jī)性的重復(fù)二維部分,后者著眼于線性結(jié)構(gòu),它可以被看做是一維模式,例如線條或者物體的輪廓。
? 這篇文章結(jié)合上述兩種算法,提出了一種新的高效的方法。我們首先注意到基于樣例的紋理合成過程中需要包含必要的紋理和結(jié)構(gòu)兩個(gè)方面的復(fù)制。然而,結(jié)構(gòu)修復(fù)的成功很大程度上取決于填充的順序。我們提出了一種最佳優(yōu)先算法,該算法合成像素值中的置信度(好像是一個(gè)評判指標(biāo),論文的后面應(yīng)該會介紹到)的傳播類似于在修復(fù)圖像過程中信息傳播的方式傳播。實(shí)際顏色值的計(jì)算則是基于樣例的合成。計(jì)算效率是通過基于塊的采樣過程來實(shí)現(xiàn)的。
? 在真實(shí)和合成圖像上的大量例子表明了我們的算法在去除大的物體遮擋和小的劃痕方面上都是有效的。并且對人工選擇的目標(biāo)區(qū)域形狀的魯棒性也進(jìn)行了論證。我們的結(jié)果優(yōu)于現(xiàn)有技術(shù)的結(jié)果。
1. Introduction
? This paper presents a novel algorithm for removing objects from digital photographs and replacing them with visually plausible backgrounds. Figure 1 shows an example of this task, where the foreground person (manually selected as the target region) is replaced by textures sampled from the remainder of the image. The algorithm effectively hallucinates new colour values for the target region in a way that looks “reasonable” to the human eye.
1.介紹
? 本文提出了一種新的算法,用于去除數(shù)字圖像中不需要的物體,并將其替換為視覺上可信的背景。圖1顯示了這個(gè)任務(wù)的一個(gè)示例,其中前景人物(手動選擇作為目標(biāo)區(qū)域)被從圖像其余部分采樣的紋理替換。該算法有效地為目標(biāo)區(qū)域產(chǎn)生新的顏色值,使人眼看起來“合理”。
? In previous work, several researchers have considered texture synthesis as a way to fill large image regions with?“pure” textures – repetitive two-dimensional textural patterns with moderate stochasticity. This is based on a large body of texture-synthesis research, which seeks to replicate texture ad infinitum, given a small source sample of pure texture [1, 8, 9, 10, 11, 12, 14, 15, 16, 19, 22]. Of particular interest are exemplar-based techniques which cheaply and effectively generate new texture by sampling and copying colour values from the source [1, 9, 10, 11, 15].
? As effective as these techniques are in replicating consistent texture, they have difficulty filling holes in photographs of real-world scenes, which often consist of linear structures and composite textures – multiple textures interacting spatially [23]. The main problem is that boundaries between image regions are a complex product of mutual influences between different textures. In constrast to the twodimensional nature of pure textures, these boundaries form what might be considered more one-dimensional, or linear, image structures.
? A number of algorithms specifically address this issue for the task of image restoration, where speckles, scratches, and overlaid text are removed [2, 3, 4, 7, 20]. These image inpainting techniques fill holes in images by propagating linear structures (called isophotes in the inpainting literature) into the target region via diffusion. They are inspired by the partial differential equations of physical heat flow,and work convincingly as restoration algorithms. Their drawback is that the diffusion process introduces some blur, which is noticeable when the algorithm is applied to fill larger regions.
? The algorithm presented here combines the strengths of both approaches. As with inpainting, we pay special attention to linear structures. But, linear structures abutting the target region only influence the fill order of what is at core an exemplar-based texture synthesis algorithm. The result is an algorithm that has the efficiency and qualitative performance of exemplar-based texture synthesis, but which also respects the image constraints imposed by surrounding linear structures.
?之前的圖像修復(fù)工作中,一些研究人員認(rèn)為紋理合成是用“純”紋理去填充圖像中丟失的大區(qū)域的方法——用適度的隨機(jī)性重復(fù)二維紋理部分。這類方法基于大量的紋理合成研究去尋求復(fù)制紋理,并給出一個(gè)小的純紋理源樣本[1,8,9,10,11,12,14,15,16,19,22]。特別值得關(guān)注的是基于示例的技術(shù),通過從源圖像中采樣和復(fù)制顏色值,可以通過較低的代價(jià)有效地生成新的紋理[1、9、10、11、15]。
? 盡管這些技術(shù)在復(fù)制一致的紋理方面很有效,但它們很難填補(bǔ)真實(shí)場景照片中的漏洞。真實(shí)場景通常包括了線性結(jié)構(gòu)和復(fù)合紋理結(jié)構(gòu),復(fù)合紋理結(jié)構(gòu)是指多個(gè)紋理在空間上相互作用[23]。主要問題是圖像區(qū)域之間的邊界是不同紋理之間相互影響的復(fù)雜產(chǎn)物。在構(gòu)造二維自然圖像的純紋理時(shí),這些邊界形成了可能被認(rèn)為是一維或線性的圖像結(jié)構(gòu)。?
??許多的算法專門去除斑點(diǎn)、劃痕和重疊文本[2、3、4、7、20]。這些圖像修復(fù)技術(shù)通過擴(kuò)散將線性結(jié)構(gòu)(在圖像修復(fù)文獻(xiàn)中稱為等壓線)傳播到目標(biāo)區(qū)域來填補(bǔ)圖像中的漏洞。它們受到物理熱流(?)偏微分方程的啟發(fā),并且在恢復(fù)算法上取得了令人信服的成就。它們的缺點(diǎn)是擴(kuò)散過程會引入一些模糊,當(dāng)應(yīng)用該算法填充較大的區(qū)域時(shí),這種模糊很明顯。
???本文提出的算法結(jié)合了這兩種方法的優(yōu)點(diǎn)。與圖像修復(fù)方法一樣,我們特別注意線性結(jié)構(gòu)。但是,與目標(biāo)區(qū)域相鄰區(qū)域的線性結(jié)構(gòu)只影響核心區(qū)域的填充順序,這是一種基于示例的紋理合成算法。該算法不僅具有基于實(shí)例的紋理合成的效率和定性性能,而且還考慮了周圍線性結(jié)構(gòu)對圖像的約束。?
? Our algorithm builds on very recent research along similar lines. The work in [5] decomposes the original image into two components; one of which is processed by inpainting and the other by texture synthesis. The output image is the sum of the two processed components. This approach still remains limited to the removal of small image gaps, however, as the diffusion process continues to blur the filled region (cf., [5], fig.5 top right). The automatic switching between “pure texture-” and “pure structure-mode” described in [21] is also avoided.
? One of the first attempts to use exemplar-based synthesis specifically for object removal was by Harrison [13]. There, the order in which a pixel in the target region is filled was dictated by the level of “texturedness” of the pixel’s neighborhood. Although the intuition is sound, strong linear structures were often overruled by nearby noise, minimizing the value of the extra computation. A related technique drove the fill order by the local shape of the target region, but did not seek to explicitly propagate linear structure [6].?
我們的算法建立在最近類似研究的基礎(chǔ)上。[5]中的工作將原始圖像分解為兩個(gè)部分,一個(gè)部分通過著色處理,另一個(gè)部分通過紋理合成處理。輸出圖像是兩個(gè)處理的總和。但是,這種方法仍然局限于去除較小的圖像間隙,因?yàn)閿U(kuò)散過程會持續(xù)模糊填充區(qū)域(參見[5],圖5右上角)。它也避免了[21]中描述的“純紋理”和“純結(jié)構(gòu)模式”之間的自動切換。?
Harrison[13]首次嘗試使用基于實(shí)例的合成來去除物體。在這里,目標(biāo)區(qū)域中的像素填充順序由像素鄰域的“紋理”級別決定。雖然想法是合理的,但強(qiáng)線性結(jié)構(gòu)常常被附近的噪聲所干擾,從而使額外計(jì)算的值最小化。相關(guān)技術(shù)通過目標(biāo)區(qū)域的局部形狀來確定填充順序,但并未尋求明確擴(kuò)散的線性結(jié)構(gòu)[6]。?
Finally, Zalesny et al. [23] describe an interesting algorithm for the parallel synthesis of composite textures. They devise a special-purpose solution for the interface between two textures. In this paper we show that, in fact, only one mechanism is sufficient for the synthesis of both pure and composite textures.
Section 2 presents the key observation on which our algorithm depends. Section 3 describes the details of the algorithm.
Results on both synthetic and real imagery are presented in section 4.
最后,Zalesny等人[23]描述了一種有趣的復(fù)合紋理并行合成算法。他們?yōu)閮煞N紋理之間的界面設(shè)計(jì)了一種特殊目的的解決方案。在本文中,我們證明,事實(shí)上,只有一種機(jī)制就足以合成純紋理和復(fù)合紋理。?
第2節(jié)介紹了我們的算法所依賴的關(guān)鍵實(shí)驗(yàn)觀察。第3節(jié)描述了算法的細(xì)節(jié),第4節(jié)給出了合成圖像和真實(shí)圖像的結(jié)果。?
2. Exemplar-based synthesis suffices
The core of our algorithm is an isophote-driven imagesampling process. It is well-understood that exemplarbased approaches perform well for two-dimensional textures [1, 9, 15]. But, we note in addition that exemplarbased texture synthesis is sufficient for propagating extended linear image structures, as well. A separate synthesis?mechanism is not required for handling isophotes.
2.基于樣例的合成
我們算法的核心是一個(gè)等壓線驅(qū)動的圖像采樣過程。眾所周知,基于示例的方法在二維紋理方面表現(xiàn)良好[1,9,15]。但是,我們還注意到,基于示例的紋理合成對于擴(kuò)展線性圖像結(jié)構(gòu)的傳播也是足夠的。因此,處理等壓線并不需要單獨(dú)的合成機(jī)制。?
? Figure 2 illustrates this point. For ease of comparison, we adopt notation similar to that used in the inpainting literature. The region to be filled, i.e., the target region is indicated by Ω, and its contour is denoted δΩ. The contour evolves inward as the algorithm progresses, and so we also refer to it as the “fill front”. The source region, Φ, which remains fixed throughout the algorithm, provides samples used in the filling process.
? We now focus on a single iteration of the algorithm to show how structure and texture are adequately handled by exemplar-based synthesis. Suppose that the square template Ψp ∈ Ω centred at the point p (fig. 2b), is to be filled. The best-match sample from the source region comes from the patch Ψq ∈ Φ, which is most similar to those parts that are already filled in Ψp. In the example in fig. 2b, we see that if Ψp lies on the continuation of an image edge, the most likely best matches will lie along the same (or a similarly coloured) edge (e.g., Ψq and Ψq in fig. 2c).?
? ?? All that is required to propagate the isophote inwards is a simple transfer of the pattern from the best-match source patch (fig. 2d). Notice that isophote orientation is automatically preserved. In the figure, despite the fact that the original edge is not orthogonal to the target contour δΩ, the propagated structure has maintained the same orientation as in the source region.
?圖2說明了這一點(diǎn)。為了便于比較,我們采用了類似圖像修復(fù)文獻(xiàn)中使用的符號。要填充的區(qū)域,即目標(biāo)區(qū)域用Ω表示,其輪廓用δΩ表示。輪廓隨著算法的發(fā)展而向內(nèi)填充,因此我們也將其稱為“填充線”。源區(qū)域Φ在整個(gè)算法中保持不變,它提供填充過程中使用的樣本。
現(xiàn)在我們將重點(diǎn)放在算法的某一次迭代上,以展示如何通過基于示例的合成充分處理結(jié)構(gòu)和紋理。假設(shè)要填充以點(diǎn)P為中心的方形模板θp∈Ω(如圖2b),源區(qū)的最佳匹配樣本來自于Ψq∈Φ,Ψq是與Ψp已填補(bǔ)部分最相似的部分。在圖2b中的示例中,我們發(fā)現(xiàn),如果Ψp位于圖像邊緣的延續(xù)上,則最可能的最佳匹配將位于相同(或顏色相似)的邊緣上(如圖2c中的Ψq和Ψq)。?
所有需要填充的等壓線向內(nèi)傳播是通過一個(gè)簡單的模式,從最佳匹配的源圖像塊轉(zhuǎn)移到待修復(fù)的塊中(圖2d)。請注意,等壓線方向是自動保留的。在圖中,盡管原始邊緣不與目標(biāo)輪廓δΩ正交,但填充部分與源區(qū)域保持相同的方向。
3. Region-filling algorithm
We now proceed with the details of our algorithm.
First, a user selects a target region, Ω, to be removed and filled. The source region, Φ, may be defined as the entire image minus the target region (Φ = I?Ω), as a dilated band around the target region, or it may be manually specified by the user.
Next, as with all exemplar-based texture synthesis [10], the size of the template window Ψ must be specified. We provide a default window size of 9×9 pixels, but in practice require the user to set it to be slightly larger than the largest?distinguishable texture element, or “texel”, in the source region.
3. 區(qū)域填充算法
我們現(xiàn)在開始詳細(xì)介紹我們的算法。
首先,標(biāo)記要填充的目標(biāo)區(qū)域Ω。源區(qū)域Φ可以定義為整個(gè)圖像減去目標(biāo)區(qū)域(Φ=I?Ω),作為目標(biāo)區(qū)域周圍的擴(kuò)展帶,也可以通過手動指定。
接下來,與基于示例的紋理合成方法[10]一樣,必須指定模板窗口的大小ψ。我們提供了9×9像素的默認(rèn)窗口大小,但實(shí)際上需要用戶將其設(shè)置為略大于源區(qū)域中最大的可分辨紋理元素(texel)。
Once these parameters are determined, the remainder of the region-filling process is completely automatic.
In our algorithm, each pixel maintains a colour value (or “empty”, if the pixel is unfilled) and a confidence value, which reflects our confidence in the pixel value, and which is frozen once a pixel has been filled. During the course of the algorithm, patches along the fill front are also given a temporary priority value, which determines the order in which they are filled. Then, our algorithm iterates the following three steps until all pixels have been filled:
3.1. Computing patch priorities.
Filling order is crucial to non-parametric texture synthesis [1, 6, 10, 13]. Thus far, the default favourite has been the “onion peel” method, where the target region is synthesized from the outside inward, in concentric layers. To our knowledge, however, designing a fill order which explicitly encourages propagation of linear structure (together with texture) has never been explored. Our algorithm performs this task through a best-first filling algorithm that depends entirely on the priority values that are assigned to each patch on the fill front. The priority computation is biased toward those patches which are on the continuation of strong edges and which are surrounded by high-confidence pixels. Given a patch Ψp centred at the point p for some p ∈ δΩ (see fig. 3), its priority P(p) is defined as the product of two terms:
一旦確定了這些參數(shù),區(qū)域填充過程的其余部分將完全自動進(jìn)行。?
在我們的算法中,每個(gè)像素都保持一個(gè)顏色值(如果像素未填充,則為“空”)和一個(gè)置信值,這反映了我們對像素值的信心,并且一旦像素被填充,就會被凍結(jié)。在算法的執(zhí)行過程中,填充圖像塊的前面的塊也會被賦予一個(gè)臨時(shí)的優(yōu)先級值,該值決定了補(bǔ)丁的填充順序。然后,我們的算法重復(fù)以下三個(gè)步驟,直到所有像素都被填滿:
3.1.計(jì)算修補(bǔ)程序優(yōu)先級。
填充順序?qū)Ψ菂?shù)紋理合成至關(guān)重要[1,6,10,13]。到目前為止,默認(rèn)的最受歡迎的方法是“洋蔥皮”法,即目標(biāo)區(qū)域是從外部向內(nèi),在同心層中合成的。然而,據(jù)我們所知,一個(gè)明確的鼓勵線性結(jié)構(gòu)(結(jié)合紋理因素)傳播的填充順序從未被探索過。我們的算法通過最佳優(yōu)先的填充算法來執(zhí)行這項(xiàng)任務(wù),該算法完全依賴于分配給填充前面的每個(gè)補(bǔ)丁的優(yōu)先級值。優(yōu)先級的計(jì)算偏向于那些在明顯邊界的延續(xù)上和被高置信像素包圍的補(bǔ)丁。給定一個(gè)以點(diǎn)p為中心的圖像塊,對于一些p∈δΩ(見圖3),其優(yōu)先級P(p)定義為兩個(gè)術(shù)語的乘積:?
We call C(p) the confidence term and D(p) the data term, and they are defined as follows:
我們稱C(p)為置信項(xiàng),D(p)為數(shù)據(jù)項(xiàng),定義如下:?
where |Ψp| is the area of Ψp, α is a normalization factor (e.g., α = 255 for a typical grey-level image), and np is a unit vector orthogonal to the front δΩ in the point p. The priority is computed for every border patch, with distinct patches for each pixel on the boundary of the target region.
式中,|Ψp|是Ψp的面積,α是歸一化因子(例如,對于典型的灰度圖像而言,α=255),np是在點(diǎn)p上與前面的δΩ正交的單位向量。為每個(gè)邊界圖像塊計(jì)算優(yōu)先級,目標(biāo)區(qū)域邊界上的每個(gè)像素都有不同的圖像塊。 ?
During initialization, the function C(p) is set to C(p) = 0 ?p ∈ Ω, and C(p) = 1 ?p ∈ I ?Ω. The confidence term C(p) may be thought of as a measure of the amount of reliable information surrounding the pixel p.?
對于C(p)的初始化,函數(shù)C(p)設(shè)為0時(shí), ?p∈Ω,C(p)設(shè)為1時(shí),?p∈I?Ω。置信項(xiàng)C(p)可以被認(rèn)為是圍繞像素p的可靠信息量的度量。 ? ??
The intention is to fill first those patches which have more of their pixels already filled, with additional preference given to pixels that were filled early on (or that were never part of the target region).
其目的是首先填充那些已經(jīng)填充了更多像素的補(bǔ)丁,并對早期填充的像素(或不屬于目標(biāo)區(qū)域的像素)進(jìn)行了額外的偏好設(shè)置。
This automatically incorporates preference towards certain shapes along the fill front. For example, patches that include corners and thin tendrils of the target region will tend to be filled first, as they are surrounded by more pixels from the original image. These patches provide more reliable information against which to match. Conversely, patches at the tip of “peninsulas” of filled pixels jutting into the target region will tend to be set aside until more of the surrounding pixels are filled in.?
這些自動合并傾向于先合并某些形狀或者是填充邊緣。例如,那些包含角或者細(xì)小紋理的目標(biāo)區(qū)域塊往往首先被填充,因?yàn)樗鼈儽辉紙D像中的更多像素包圍(?)。這些塊提供了更可靠的匹配信息。相反,突出到目標(biāo)區(qū)域的待填充“半島”尖端的塊將傾向于被留出,直到填充完更多的周圍像素后再填充。?
At a coarse level, the term C(p) of (1) approximately enforces the desirable concentric fill order. As filling proceeds, pixels in the outer layers of the target region will tend to be characterized by greater confidence values, and therefore be filled earlier; pixels in the centre of the target region will have lesser confidence values.
在粗略的水平上,第一次迭代的C(p)近似地執(zhí)行所需的同心填充順序。隨著填充的進(jìn)行,目標(biāo)區(qū)域外層的像素將趨向于更大的置信值,因此會更早填充;目標(biāo)區(qū)域中心的像素將具有更小的置信值。?
The data term D(p) is a function of the strength of isophotes hitting the front δΩ at each iteration. This term boosts the priority of a patch that an isophote “flows” into. This factor is of fundamental importance in our algorithm because it encourages linear structures to be synthesized first, and, therefore propagated securely into the target region. Broken lines tend to connect, thus realizing the “Connectivity Principle” of vision psychology [7, 17] (cf., fig. 4, fig. 7d, fig. 8b and fig. 13d).
數(shù)據(jù)項(xiàng)D(p)是在每次迭代時(shí)計(jì)算δΩ等壓線的強(qiáng)度的函數(shù)。這個(gè)過程提高了包括等壓線塊的優(yōu)先級。這一因素在我們的算法中至關(guān)重要,因?yàn)樗膭钍紫群铣删€性結(jié)構(gòu),從而安全地傳播到目標(biāo)區(qū)域。斷線傾向于連接,從而實(shí)現(xiàn)視覺心理學(xué)的“連接原理”[7,17](參見圖4,圖7d,圖8b和圖13d)。
There is a delicate balance between the confidence and data terms. The data term tends to push isophotes rapidly inward, while the confidence term tends to suppress precisely this sort of incursion into the target region. As presented in the results section, this balance is handled gracefully via the mechanism of a single priority computation for all patches on the fill front.
置信值和數(shù)據(jù)項(xiàng)之間存在微妙的平衡。數(shù)據(jù)項(xiàng)傾向于迅速地向內(nèi)推送等壓線,而置信項(xiàng)傾向于精確地抑制這種侵入目標(biāo)區(qū)域的行為。如結(jié)果部分所示,通過對填充區(qū)域上所有圖像塊的單個(gè)優(yōu)先級計(jì)算機(jī)制,可以很好地處理此平衡。 ? ? ? ? ? ? ?? ? ? ? ?
Since the fill order of the target region is dictated solely by the priority function P(p), we avoid having to predefine an arbitrary fill order as done in existing patch-based approaches [9, 19]. Our fill order is function of image properties, resulting in an organic synthesis process that eliminates the risk of “broken-structure” artefacts (fig. 7c) and also reduces blocky artefacts without an expensive patch-cutting step [9] or a blur-inducing blending step [19].?
由于目標(biāo)區(qū)域的填充順序完全由優(yōu)先級函數(shù)P(p)決定,因此我們避免了現(xiàn)有基于補(bǔ)丁的方法中預(yù)先定義任意填充順序的缺陷[9,19]。我們的填充順序是通過圖像優(yōu)先級的函數(shù)得到的,從而產(chǎn)生一個(gè)有組織的合成過程,消除了“結(jié)構(gòu)破壞”的風(fēng)險(xiǎn)(圖7c),并且還在沒有塊切割步驟[9]或模糊步驟[19]的情況下,減少了塊效應(yīng)。?
3.2. Propagating texture and structure information.
Once all priorities on the fill front have been computed, the patch Ψp^ with highest priority is found. We then fill it with data extracted from the source region Φ.
In traditional inpainting techniques, pixel-value information is propagated via diffusion. As noted previously, diffusion necessarily leads to image smoothing, which results in blurry fill-in, especially of large regions (see fig. 10f).
On the contrary, we propagate image texture by direct sampling of the source region. Similar to [10], we search in the source region for that patch which is most similar to Ψp^. Formally,
3.2.傳播紋理和結(jié)構(gòu)信息。
計(jì)算完填充面上的所有優(yōu)先級后,即可找到優(yōu)先級最高的補(bǔ)丁Ψp^。然后我們用從源區(qū)域Φ提取的數(shù)據(jù)填充它。?
在傳統(tǒng)的著色技術(shù)中,像素值信息是通過擴(kuò)散傳播的。如前所述,擴(kuò)散必然導(dǎo)致圖像平滑,從而導(dǎo)致填充模糊,尤其是大區(qū)域(見圖10f)。?
相反,我們通過直接采樣源區(qū)域來傳播圖像紋理。與[10]類似,我們在源區(qū)域中搜索最類似于Ψp^.的補(bǔ)丁,?
where the distance d(Ψa,Ψb) between two generic patches Ψa and Ψb is simply defined as the sum of squared differences (SSD) of the already filled pixels in the two patches. We use the CIE Lab colour space because of its property of perceptual uniformity [18].
Having found the source exemplar Ψ?q, the value of each pixel-to-be-filled, p |p ∈ Ψ?p∩Ω, is copied from its corresponding position inside Ψ?q.
This suffices to achieve the propagation of both structure and texture information from the source Φ to the target region Ω, one patch at a time (cf., fig. 2d). In fact, we note that any further manipulation of the pixel values (e.g., adding noise, smoothing and so forth) that does not explicitly depend upon statistics of the source region, is far more likely to degrade visual similarity between the filled region and the source region, than to improve it.
其中,兩個(gè)普通面片之間的距離d(ψa,ψb)定義為兩個(gè)面片中已填充像素的平方差(ssd)之和。我們使用CIE實(shí)驗(yàn)室顏色空間是因?yàn)樗哂懈兄鶆蛐訹18]。?
找到源示例_q后,將從其在_q內(nèi)的相應(yīng)位置復(fù)制要填充的每個(gè)像素的值p p∈_pΩ。
這足以實(shí)現(xiàn)結(jié)構(gòu)和紋理信息從源Φ到目標(biāo)區(qū)域Ω的傳播,一次一個(gè)圖像塊(參見圖2d)。事實(shí)上,我們注意到,對像素值的任何進(jìn)一步不顯式依賴源區(qū)域的統(tǒng)計(jì)信息的操作(例如,添加噪聲、平滑等),更可能的操作是降低填充區(qū)域和源區(qū)域之間的視覺相似性,而不是改善它。?
3.3. Updating confidence values.
After the patch Ψ?p has been filled with new pixel values, the confidence C(p) is updated in the area delimited by Ψ?p as follows:
3.3.更新置信值。?
補(bǔ)丁Ψ?p被新的像素值填充后,置信度C(p)在Ψ?p界定的區(qū)域更新如下:?
This simple update rule allows us to measure the relative confidence of patches on the fill front, without imagespecific parameters. As filling proceeds, confidence values decay, indicating that we are less sure of the colour values of pixels near the centre of the target region.
這個(gè)簡單的更新規(guī)則允許我們在沒有圖像特定參數(shù)的情況下測量填充面上補(bǔ)丁的相對置信度。隨著填充過程的進(jìn)行,置信值會衰減,這表明我們對目標(biāo)區(qū)域中心附近像素的顏色值不太確定。?
A pseudo-code description of the algorithmic steps is shown in table 1. The superscript t indicates the current iteration.
算法步驟的偽代碼描述如表1所示。上標(biāo)t表示當(dāng)前迭代。?
4. Results and comparisons
Here we apply our algorithm to a variety of images, ranging from purely synthetic images to full-colour photographs that include complex textures. Where possible, we make side-by-side comparisons to previously proposed methods. In other cases, we hope the reader will refer to the original source of our test images (many are taken from previous literature on inpainting and texture synthesis) and compare these results with the results of earlier work.
In all of the experiments, the patch size was set to be greater than the largest texel or the thickest structure (e.g., edges) in the source region. Furthermore, unless otherwise stated the source region has been set to be Φ = I ?Ω. All experiments were run on a 2.5GHz Pentium IV with 1GB of RAM.
The Kanizsa triangle. We perform our first experiment on the well-known Kanizsa triangle [17] to show how the algorithm works on a structure-rich synthetic image.
As shown in fig. 4, our algorithm deforms the fill front δΩ under the action of two forces: isophote continuation (the data term, D(p)) and the “pressure” from surrounding filled pixels (the confidence term, C(p)).
4.結(jié)果和比較?
在這里,我們將我們的算法應(yīng)用于各種圖像,從純合成圖像到包含復(fù)雜紋理的全彩照片。在可能的情況下,我們將與先前提出的方法進(jìn)行比較。在其他情況下,我們希望讀者參考我們測試圖像的原始來源(許多是從以前的關(guān)于圖像修復(fù)和紋理合成的文獻(xiàn)中獲取的),并將這些結(jié)果與早期工作的結(jié)果進(jìn)行比較。?
在所有的實(shí)驗(yàn)中,圖像塊大小被設(shè)置為大于源區(qū)域中最大的紋理元素或最厚的結(jié)構(gòu)(例如邊緣)。此外,除非另有說明,否則源區(qū)設(shè)置為Φ=I?Ω。所有的實(shí)驗(yàn)都是在一個(gè)2.5GHz的Pentium IV上運(yùn)行的,內(nèi)存為1GB。?
Kanizza三角。我們對著名的Kanizza三角形[17]進(jìn)行了第一次實(shí)驗(yàn),以展示該算法如何在結(jié)構(gòu)豐富的合成圖像上工作。?
如圖4所示,我們的算法在兩種函數(shù)的作用下使填充前δΩ變形:等壓線連續(xù)(數(shù)據(jù)項(xiàng)D(p))和周圍填充像素的“壓力”(置信項(xiàng)C(p))。
The sharp linear structures of the incomplete green triangle are grown into the target region. But also, no single structural element dominates all of the others; this balance among competing isophotes is achieved through the naturally decaying confidence values (in an earlier version of our algorithm which lacked this balance, “runaway” structures led to large-scale artefacts.)
Figures 4e,f also show the effect of the confidence term in smoothing sharp appendices such as the vertices of the target region (in red).
As described above, the confidence is propagated in a manner similar to the front-propagation algorithms used in inpainting. We stress, however, that unlike inpainting, it is the confidence values that are propagated along the front (and which determine fill order), not colour values themselves, which are sampled from the source region.
Finally, we note that despite the large size of the removed region, edges and lines in the filled region are as sharp as any found in the source region. There is no blurring from diffusion processes. This is a property of exemplar-based texture synthesis.
將不完全綠色三角形的尖銳線性結(jié)構(gòu)擴(kuò)展到目標(biāo)區(qū)域。但是,沒有一個(gè)單一的結(jié)構(gòu)元素支配所有其他元素;這種在競爭的等壓線之間的平衡是通過自然衰減的置信值實(shí)現(xiàn)的(在缺乏這種平衡的早期版本中,“失控”結(jié)構(gòu)導(dǎo)致了大規(guī)模的人工制品)。?
圖4e,f還顯示了置信值對平滑尖銳部分(如目標(biāo)區(qū)域頂點(diǎn))的影響(紅色)。
如上所述,置信度的傳播方式類似于修復(fù)中使用的前傳播算法。然而,我們強(qiáng)調(diào),與圖像修復(fù)方法不同的是,置信值是沿著前面?zhèn)鞑サ?#xff08;并決定填充順序),而不是顏色值本身,它們是從源區(qū)域取樣的。
最后,我們注意到,盡管移除區(qū)域的大小很大,填充區(qū)域中的邊和線與源區(qū)域中的任何邊和線一樣鋒利。擴(kuò)散過程沒有模糊。這是基于范例的紋理合成的一個(gè)特性。
The effect of different filling strategies. Figures 5, 6 and 7 demonstrate the effect of different filling strategies.
Figure 5f shows how our filling algorithm achieves the best structural continuation in a simple, synthetic image.
Figure 6 further demonstrates the validity of our algorithm on an aerial photograph. The 40 × 40-pixel target region has been selected to straddle two different textures (fig. 6b). The remainder of the 200 × 200 image in fig. 6a was used as source for all the experiments in fig. 6.
With raster-scan synthesis (fig. 6c) not only does the top region (the river) grow into the bottom one (the city area), but visible seams also appear at the bottom of the target region. This problem is only partially addressed by a concentric filling (fig 6d). Similarly, in fig. 6e the sophisticated ordering proposed by Harrison [13] only moderately succeeds in preventing this phenomenon.
In all of these cases, the primary difficulty is that since the (eventual) texture boundary is the most constrained part?of the target region, it should be filled first. But, unless this is explicitly addressed in determining the fill order, the texture boundary is often the last part to be filled. The algorithm proposed in this paper is designed to address this problem, and thus more naturally extends the contour between the two textures as well as the vertical grey road.
不同填充策略的效果。圖5、6和7展示了不同填充策略的效果。?
圖5f顯示了我們的填充算法如何在簡單的合成圖像中實(shí)現(xiàn)最佳的結(jié)構(gòu)延續(xù)。
圖6進(jìn)一步證明了我們的算法在航空照片上的有效性。選擇40×40像素的目標(biāo)區(qū)域跨越兩種不同的紋理(圖6b)。圖6a中200×200圖像的其余部分用作圖6中所有實(shí)驗(yàn)的源區(qū)域。
通過光柵掃描合成(圖6c),不僅頂部區(qū)域(河流)生長到底部區(qū)域(城市區(qū)域),而且目標(biāo)區(qū)域底部也會出現(xiàn)可見接縫。這一問題只能通過同心填補(bǔ)來部分解決(圖6d)。同樣,在圖6e中,Harrison[13]提出的復(fù)雜排序僅在一定程度上成功地防止了這種現(xiàn)象。?
在所有這些情況下,主要的困難是,由于(最終)紋理邊界是目標(biāo)區(qū)域中最受約束的部分,所以應(yīng)該首先填充它。但是,除非在確定填充順序時(shí)明確說明這一點(diǎn),否則紋理邊界通常是要填充的最后一部分。本文提出的算法是針對這一問題而設(shè)計(jì)的,從而更自然地?cái)U(kuò)展了兩種紋理之間的輪廓以及垂直的灰色道路。?
In the example in fig. 6, our algorithm fills the target region in only 2 seconds, on a Pentium IV, 2.52GHz, 1GB RAM. Harrison’s resynthesizer [13], which is the nearest in quality, requires approximately 45 seconds.
Figure 7 shows yet another comparison between the concentric filling strategy and the proposed algorithm. In the presence of concave target regions, the “onion peel” filling may lead to visible artefacts such as unrealistically broken structures (see the pole in fig. 7c). Conversely, the presence of the data term of (1) encourages the edges of the pole to grow “first” inside the target region and thus correctly reconstruct the complete pole (fig. 7d). This example demonstrates the robustness of the proposed algorithm with respect to the shape of the selected target region.
Comparisons with inpainting. We now turn to some examples from the inpainting literature. The first two examples show that our approach works at least as well as inpainting.
The first (fig. 8) is a synthesized image of two ellipses [4]. The occluding white torus is removed from the input image and two dark background ellipses reconstructed via our algorithm (fig. 8b). This example was chosen by authors of the original work on inpainting to illustrate the structure propagation capabilities of their algorithm. Our results are visually identical to those obtained by inpainting ([4], fig.4).
We now compare results of the restoration of an handdrawn image. In fig. 9 the aim is to remove the foreground text. Our results (fig. 9b) are mostly indistinguishable with those obtained by traditional inpainting 3. This example demonstrates the effectiveness of both techniques in image restoration applications.
It is in real photographs with large objects to remove,?however, that the real advantages of our approach become apparent. Figure 10 shows an example on a real photograph, of a bungee jumper in mid-jump (from [4], fig.8). In the original work, the thin bungee cord is removed from the image via inpainting. In order to prove the capabilities of our algorithm we removed the entire bungee jumper (fig. 10e). Structures such as the shore line and the edge of the house have been automatically propagated into the target region along with plausible textures of shrubbery, water and roof tiles; and all this with no a priori model of anything specific to this image.
在圖6中的示例中,我們的算法在Pentium IV,2.52GHz、1GB RAM上,只需2秒鐘就可以填充目標(biāo)區(qū)域。哈里森的再合成器[13]的質(zhì)量最接近,大約需要45秒。
圖7顯示了同心填充策略和對比算法之間的另一個(gè)比較。在存在凹面目標(biāo)區(qū)域的情況下,“洋蔥皮”填充物可能導(dǎo)致可見的假象,例如不真實(shí)的斷裂結(jié)構(gòu)(見圖7c中的桿)。相反,第一次迭代(?)的數(shù)據(jù)項(xiàng)的存在鼓勵極點(diǎn)的邊緣向目標(biāo)區(qū)域內(nèi)生長,從而正確地重建整個(gè)極點(diǎn)(圖7d)。這個(gè)例子說明了所提出的算法對所選目標(biāo)區(qū)域形狀的魯棒性。?
與圖像修復(fù)比較。現(xiàn)在,我們來看一些來自于圖像修復(fù)文獻(xiàn)的例子。前兩個(gè)例子表明,我們的方法至少能起到修復(fù)的作用。
第一個(gè)(圖8)是兩個(gè)橢圓的合成圖像[4]。通過我們的算法(圖8b),從輸入圖像中去除阻塞的白色圓環(huán),重建兩個(gè)暗背景橢圓。這個(gè)例子是由最初的圖像修復(fù)工作的作者選擇的,以說明他們的算法的結(jié)構(gòu)傳播能力。我們的結(jié)果在視覺上與通過圖像修復(fù)獲得的結(jié)果相同([4],圖4)。?
我們現(xiàn)在比較了手工繪制圖像的恢復(fù)結(jié)果。圖9的目的是刪除前景文本。我們的結(jié)果(圖9b)與傳統(tǒng)的圖像修復(fù)算法所得的結(jié)果幾乎不可區(qū)分。這個(gè)例子演示了這兩種技術(shù)在圖像恢復(fù)應(yīng)用中的有效性。?
然而,在大量物體的真實(shí)照片中證實(shí),我們的方法的真正優(yōu)勢變得顯而易見。圖10顯示了一張真實(shí)照片上的例子,一個(gè)蹦極運(yùn)動員在跳中(從[4],圖8)。在最初的作品中,薄的橡皮筋繩通過內(nèi)涂從圖像中去除。為了證明我們的算法的能力,我們移除了整個(gè)蹦極跳線(圖10e)。諸如海岸線和房屋邊緣等結(jié)構(gòu)已經(jīng)自動傳播到目標(biāo)區(qū)域,以及灌木、水和屋頂瓷磚的合理紋理;所有這一切都沒有針對這幅圖像的先驗(yàn)?zāi)P汀?
For comparison, figure 10f shows the result of filling the same target region (fig. 10b) by image inpainting. Considerable blur is introduced into the target region because of inpainting’s use of diffusion to propagate colour values; and high-frequency textural information is entirely absent.
Figure 11 compares our algorithm to the recent “texture and structure inpainting” technique described in [5]. Figure 11(bottom right) shows that also our algorithm accomplishes the propagation of structure and texture inside the selected target region. Moreover, the lack of diffusion steps avoids blurring propagated structures (see the vertical edge in the encircled region) and makes the algorithm more computationally efficient.
Synthesizing composite textures. Fig. 12 demonstrates that our algorithm behaves well also at the boundary between two different textures, such as the ones analyzed in [23]. The target region selected in fig. 12c straddles two different textures. The quality of the “knitting” in the contour reconstructed via our approach (fig. 12d) is similar to the original image and to the results obtained in the original work (fig. 12b), but again, this has been accomplished without complicated texture models or a separate boundaryspecific texture synthesis algorithm.
Further examples on photographs. We show two more examples on photographs of real scenes.
Figure 13 demonstrates, again, the advantage of the proposed approach in preventing structural artefacts (cf., 7d). While the onion-peel approach produces a deformed horizon, our algorithm reconstructs the boundary between sky and sea as a convincing straight line.?
Finally, in fig. 14, the foreground person has been manually selected and the corresponding region filled in automatically. The filled region in the output image convincingly mimics the complex background texture with no prominent artefacts (fig. 14f). During the filling process the topological changes of the target region are handled effortlessly.
為了進(jìn)行比較,圖10f顯示了通過圖像修復(fù)填充相同目標(biāo)區(qū)域(圖10b)的結(jié)果。由于圖像修復(fù)使用擴(kuò)散來傳播顏色值,因此在目標(biāo)區(qū)域引入了相當(dāng)大的模糊;并且完全沒有高頻紋理信息。?
圖11將我們的算法與[5]中描述的最新“紋理和結(jié)構(gòu)修復(fù)”技術(shù)進(jìn)行了比較。圖11(右下角)顯示,我們的算法還可以在選定的目標(biāo)區(qū)域內(nèi)完成結(jié)構(gòu)和紋理的傳播。此外,由于缺乏擴(kuò)散步驟,避免了傳播結(jié)構(gòu)的模糊(參見環(huán)繞區(qū)域的垂直邊緣),使算法的計(jì)算效率更高。?
合成復(fù)合紋理。圖12表明,我們的算法在兩種不同紋理之間的邊界也表現(xiàn)良好,如[23]中分析的紋理。圖12c中選擇的目標(biāo)區(qū)域跨越兩種不同的紋理。通過我們的方法重建的輪廓中的“編織”質(zhì)量(圖12d)與原始圖像和原始工作中(圖12b)相似,但同樣,這是在沒有復(fù)雜的紋理模型或單獨(dú)的邊界特定紋理合成算法的情況下完成的。?
關(guān)于照片的更多例子。我們在真實(shí)場景的照片上再展示兩個(gè)例子。?
圖13再次證明了建議的方法在防止結(jié)構(gòu)人工制品方面的優(yōu)勢(參見,7d)。當(dāng)洋蔥皮方法產(chǎn)生一個(gè)變形的地平線時(shí),我們的算法將天空和海洋之間的邊界重建為一條令人信服的直線。
最后,在圖14中,手動選擇了前景人物,并自動填寫相應(yīng)的區(qū)域。輸出圖像中的填充區(qū)域令人信服地模仿了復(fù)雜的背景紋理,沒有突出的人工痕跡(圖14f)。在填充過程中,對目標(biāo)區(qū)域的拓?fù)渥兓M(jìn)行了簡單的處理。?
5. Conclusion and future work
This paper has presented a novel algorithm for removing large objects from digital photographs. The result of object removal is an image in which the selected object has been replaced by a visually plausible background that mimics the appearance of the source region.
Our approach employs an exemplar-based texture synthesis technique modulated by a unified scheme for determining the fill order of the target region. Pixels maintain a confidence value, which together with image isophotes, influence their fill priority.
The technique is capable of propagating both linear structure and two-dimensional texture into the target region. Comparative experiments show that a careful selection of the fill order is necessary and sufficient to handle this task.
Our method performs at least as well as previous techniques designed for the restoration of small scratches, and in instances in which larger objects are removed, it dramatically outperforms earlier work in terms of both perceptual quality and computational efficiency.
Currently, we are investigating extensions for more accurate propagation of curved structures in still photographs and for object removal from video, which promise to impose an entirely new set of challenges.
5. 總結(jié)和后續(xù)工作
本文提出了一種從數(shù)字圖像中去除大物體的新算法。去除遮擋物的結(jié)果是得到一個(gè)圖像,該圖像中所選物體已被一個(gè)視覺上可信的背景所替換。?
我們的方法采用了一種基于范例的紋理合成技術(shù),并通過統(tǒng)一的方案來確定目標(biāo)區(qū)域的填充順序。像素的置信值與圖像等焦線一起影響其填充的優(yōu)先級。?
該技術(shù)能夠?qū)⒕€性結(jié)構(gòu)和二維紋理傳播到目標(biāo)區(qū)域。對比實(shí)驗(yàn)表明,對填充順序進(jìn)行仔細(xì)的選擇是處理這一任務(wù)的必要和充分的。?
我們的方法在修復(fù)小劃痕上和以前設(shè)計(jì)的技術(shù)一樣;在移除較大物體的情況下,在感知質(zhì)量和計(jì)算效率方面,它明顯優(yōu)于早期的工作。?
目前,我們正在研究在靜止照片中更精確地傳播曲線結(jié)構(gòu)以及從視頻中去除物體的擴(kuò)展,這將帶來一系列全新的挑戰(zhàn)。?
參考文獻(xiàn)
[1] M. Ashikhmin. Synthesizing natural textures. In Proc. ACM Symp. on Interactive 3D Graphics, pp. 217–226, Research Triangle Park, NC, Mar 2001.
[2] C. Ballester, V. Caselles, J. Verdera, M. Bertalmio, and G. Sapiro. A variational model for filling-in gray level and color images. In Proc. ICCV, pp. I: 10–16, Vancouver, Canada, Jun 2001.
[3] M. Bertalmio, A.L. Bertozzi, and G. Sapiro. Navier-stokes, fluid dynamics, and image and video inpainting. In Proc. Conf. Comp. Vision Pattern Rec., pp. I:355–362, Hawai, Dec 2001.
[4] M. Bertalmio, G. Sapiro, V. Caselles, and C. Ballester. Image inpainting. In Proc. ACM Conf. Comp. Graphics
(SIGGRAPH), pp. 417–424, New Orleans, LU, Jul 2000. http://mountains.ece.umn.edu/~guille/inpainting.htm.
[5] M. Bertalmio, L. Vese, G. Sapiro, and S. Osher. Simultaneous structure and texture image inpainting. to appear,
2002. http://mountains.ece.umn.edu/~guille/inpainting.htm.
[6] R. Bornard, E. Lecan, L. Laborelli, and J-H. Chenot. Missing data correction in still images and image sequences. In ACMMultimedia, France, Dec 2002.
[7] T. F. Chan and J. Shen. Non-texture inpainting by curvature-driven diffusions (CDD). J. Visual Comm. Image Rep., 4(12), 2001.
[8] J.S. de Bonet. Multiresolution sampling procedure for analysis and synthesis of texture images. In Proc. ACM Conf. Comp. Graphics (SIGGRAPH), volume 31, pp. 361–368, 1997.
[9] A. Efros and W.T. Freeman. Image quilting for texture synthesis and transfer. In Proc. ACM Conf. Comp. Graphics (SIGGRAPH), pp. 341–346, Eugene Fiume, Aug 2001.
[10] A. Efros and T. Leung. Texture synthesis by non-parametric sampling. In Proc. ICCV, pp. 1033–1038, Kerkyra, Greece, Sep 1999.
[11] W.T. Freeman, E.C. Pasztor, and O.T. Carmichael. Learning lowlevel vision. Int. J. Computer Vision, 40(1):25–47, 2000.
[12] D. Garber. Computational Models for Texture Analysis and Texture Synthesis. PhD thesis, Univ. of Southern California, USA, 1981.
[13] P. Harrison. A non-hierarchical procedure for re-synthesis of complex texture. In Proc. Int. Conf. Central Europe Comp. Graphics, Visua. and Comp. Vision, Plzen, Czech Republic, Feb 2001.
[14] D.J. Heeger and J.R. Bergen. Pyramid-based texture analysis/synthesis. In Proc. ACM Conf. Comp. Graphics (SIGGRAPH), volume 29, pp. 229–233, Los Angeles, CA, 1995.
[15] A. Hertzmann, C. Jacobs, N. Oliver, B. Curless, and D. Salesin. Image analogies. In Proc. ACM Conf. Comp. Graphics (SIGGRAPH), Eugene Fiume, Aug 2001.
[16] H. Igehy and L. Pereira. Image replacement through texture synthesis. In Proc. Int. Conf. Image Processing, pp. III:186–190, 1997.
[17] G. Kanizsa. Organization in Vision. Praeger, New York, 1979.
[18] J. M. Kasson and W. Plouffe. An analysis of selected computer interchange color spaces. In ACM Transactions on Graphics, volume 11, pp. 373–405, Oct 1992.
[19] L. Liang, C. Liu, Y.-Q. Xu, B. Guo, and H.-Y. Shum. Real-time texture synthesis by patch-based sampling. In ACM Transactions on Graphics, 2001.
[20] S. Masnou and J.-M. Morel. Level lines based disocclusion. In Int. Conf. Image Processing, Chicago, 1998.
[21] S. Rane, G. Sapiro, and M. Bertalmio. Structure and texture fillingin of missing image blocks in wireless transmission and compression applications. In IEEE. Trans. Image Processing, 2002. to appear.
[22] L.-W. Wey and M. Levoy. Fast texture synthesis using treestructured vector quantization. In Proc. ACMConf. Comp. Graphics (SIGGRAPH), 2000.
[23] A. Zalesny, V. Ferrari, G. Caenen, and L. van Gool. Parallel composite texture synthesis. In Texture 2002 workshop - (in conjunction with ECCV02), Copenhagen, Denmark, Jun 2002.
總結(jié)
以上是生活随笔為你收集整理的Object Removal by Exemplar-Based Inpainting 翻译的全部內(nèi)容,希望文章能夠幫你解決所遇到的問題。
- 上一篇: Netconf资料收集
- 下一篇: 如何让百度搜索到自己的博客(自己的博客名