
Supplementary Materials: Generative Image Inpainting with Contextual Attention

Jiahui Yu1 Zhe Lin2 Jimei Yang2 Xiaohui Shen2 Xin Lu2 Thomas S. Huang1

1University of Illinois at Urbana-Champaign    2Adobe Research

A. More Results on CelebA, CelebA-HQ, DTD and ImageNet

CelebA-HQ [?] We show results from our full model trained on the CelebA-HQ dataset in Figure 1. Note that the original image resolution of the CelebA-HQ dataset is 1024 × 1024. We resize images to 256 × 256 for both training and evaluation.

CelebA [?] We show more results from our full model trained on the CelebA dataset in Figure 2. Note that the original image resolution of the CelebA dataset is 218 × 178. We resize images to 315 × 256 and take a random crop of size 256 × 256 so that face landmarks are roughly unaligned, for both training and evaluation.

ImageNet [?] We show more results from our full model trained on the ImageNet dataset in Figure 3.

DTD textures [?] We show more results from our full model trained on the DTD dataset in Figure 4.

B. Comparisons with More Methods

We show more qualitative comparisons with additional methods, including Photoshop Content-Aware Fill [?], Image Melding [?] and StructCompletion [?], in Figures 5 and 6. For all these methods, we use default hyper-parameter settings.

C. More Visualization with Case Study

In addition to attention map visualization, we visualize which parts of the input image are attended to for pixels in holes. To do so, we highlight the regions that receive the maximum attention score and overlay them on the input image. As shown in Figure 7, the visualization results given holes at different locations demonstrate the effectiveness of our proposed contextual attention in borrowing information from distant spatial locations.
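The overlay can be computed directly from the attention scores: for every pixel inside the hole, take the background location with the highest attention weight and mark it on the input image. A minimal NumPy sketch of this step is given below; the array names and the dense (H, W, H, W) score layout are illustrative assumptions rather than the released implementation.

import numpy as np

def highlight_attended_regions(image, attention, hole_mask, color=(255, 0, 0)):
    """Overlay the most-attended background locations onto the input image.

    image:     (H, W, 3) uint8 input image.
    attention: (H, W, H, W) scores; attention[i, j] is the score map over
               background locations attended by hole pixel (i, j).
    hole_mask: (H, W) bool array, True inside the hole.
    """
    overlay = image.copy()
    attended = np.zeros(image.shape[:2], dtype=bool)
    for i, j in zip(*np.nonzero(hole_mask)):
        # Background location receiving the maximum attention from this hole pixel.
        flat_idx = np.argmax(attention[i, j])
        bi, bj = np.unravel_index(flat_idx, image.shape[:2])
        attended[bi, bj] = True
    overlay[attended] = color  # highlight the attended pixels
    return overlay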

D. Network Architectures

In addition to Section 3, we report more details of our network architectures. For simplicity, we denote layers with K (kernel size), D (dilation), S (stride size) and C (channel number).

Inpainting network The inpainting network consists of two encoder-decoder architectures stacked together, with each encoder-decoder having the following architecture:

K5S1C32 - K3S2C64 - K3S1C64 - K3S2C128 - K3S1C128 - K3S1C128 - K3D2S1C128 - K3D4S1C128 - K3D8S1C128 - K3D16S1C128 - K3S1C128 - K3S1C128 - resize (2×) - K3S1C64 - K3S1C64 - resize (2×) - K3S1C32 - K3S1C16 - K3S1C3 - clip.
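To make the notation concrete, the following is a minimal PyTorch sketch of one such encoder-decoder built directly from this spec. It is an illustrative reconstruction rather than the released (TensorFlow) implementation: the ELU activations, zero "same" padding, nearest-neighbor resizing, the 5-channel input and the [-1, 1] clipping range are assumptions.

import torch
import torch.nn as nn

def conv(c_in, c_out, k, s=1, d=1):
    # "Same"-style zero padding for the given kernel size and dilation.
    p = d * (k - 1) // 2
    return nn.Sequential(nn.Conv2d(c_in, c_out, k, s, p, dilation=d), nn.ELU())

class CoarseEncoderDecoder(nn.Module):
    def __init__(self, c_in=5):  # image + auxiliary channels; 5 is an assumption
        super().__init__()
        self.net = nn.Sequential(
            conv(c_in, 32, 5), conv(32, 64, 3, s=2), conv(64, 64, 3),
            conv(64, 128, 3, s=2), conv(128, 128, 3), conv(128, 128, 3),
            conv(128, 128, 3, d=2), conv(128, 128, 3, d=4),
            conv(128, 128, 3, d=8), conv(128, 128, 3, d=16),
            conv(128, 128, 3), conv(128, 128, 3),
            nn.Upsample(scale_factor=2, mode="nearest"),  # resize (2x)
            conv(128, 64, 3), conv(64, 64, 3),
            nn.Upsample(scale_factor=2, mode="nearest"),  # resize (2x)
            conv(64, 32, 3), conv(32, 16, 3),
            nn.Conv2d(16, 3, 3, 1, 1),                    # K3S1C3, no activation
        )

    def forward(self, x):
        # "clip": clamp the RGB output to the valid range (assumed [-1, 1]).
        return torch.clamp(self.net(x), -1.0, 1.0)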

Local WGAN-GP critic We use Leaky ReLU with α = 0.2 as the activation function for all WGAN-GP critics.

K5S2C64 - K5S2C128 - K5S2C256 - K5S2C512 - fully-connected to 1.

Global WGAN-GP critic K5S2C64 - K5S2C128 - K5S2C256 - K5S2C256 - fully-connected to 1.
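Both critics share the same pattern of strided 5×5 convolutions with Leaky ReLU followed by a single fully-connected output. A minimal PyTorch sketch of the two critics is given below; the 3-channel input and the use of nn.LazyLinear (so the final layer adapts to the input resolution) are assumptions for illustration.

import torch.nn as nn

def wgan_gp_critic(channels):
    # Stack of K5S2 convolutions with Leaky ReLU (alpha = 0.2), then FC to 1.
    layers, c_in = [], 3
    for c_out in channels:
        layers += [nn.Conv2d(c_in, c_out, kernel_size=5, stride=2, padding=2),
                   nn.LeakyReLU(0.2)]
        c_in = c_out
    layers += [nn.Flatten(), nn.LazyLinear(1)]  # fully-connected to 1
    return nn.Sequential(*layers)

local_critic = wgan_gp_critic([64, 128, 256, 512])   # K5S2C64 - ... - K5S2C512
global_critic = wgan_gp_critic([64, 128, 256, 256])  # K5S2C64 - ... - K5S2C256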

Contextual attention branch K5S1C32 - K3S2C64 - K3S1C64 - K3S2C128 - K3S1C128 - K3S1C128 - contextual attention layer - K3S1C128 - K3S1C128 - concat.
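For reference, a minimal single-image PyTorch sketch of the contextual attention layer itself: background feature patches are used as convolution filters to score similarity against foreground (hole) features, the scores are softmax-normalized over valid background patches, and the same patches are then used as transposed-convolution filters to reconstruct the hole. This is an illustrative reconstruction only; the function name, the patch-validity threshold and the softmax scale are assumptions, and details such as attention propagation and strided patch extraction from the full implementation are omitted.

import torch.nn.functional as F

def contextual_attention(fg, bg, mask, ksize=3, softmax_scale=10.0):
    """fg, bg: (1, C, H, W) feature maps; mask: (1, 1, H, W) float, 1 inside the hole."""
    # 1. Extract ksize x ksize background patches to use as filters: (L, C, k, k).
    patches = F.unfold(bg, ksize, padding=ksize // 2)    # (1, C*k*k, L)
    L = patches.shape[-1]
    w = patches.transpose(1, 2).reshape(L, bg.shape[1], ksize, ksize)

    # 2. Convolve the foreground with L2-normalized patches (cosine-style similarity).
    w_normed = w / w.flatten(1).norm(dim=1).clamp_min(1e-4).view(-1, 1, 1, 1)
    scores = F.conv2d(fg, w_normed, padding=ksize // 2)  # (1, L, H, W)

    # 3. Keep only patches centered outside the hole, then softmax over patches.
    valid = (F.unfold(1.0 - mask, ksize, padding=ksize // 2)
             .mean(dim=1).reshape(1, L, 1, 1) > 0.5).float()
    scores = F.softmax(scores * softmax_scale * valid, dim=1) * valid

    # 4. Reconstruct hole features by transposed convolution with the raw patches.
    out = F.conv_transpose2d(scores, w, padding=ksize // 2) / (ksize ** 2)
    return out, scores  # `scores` can also be rendered as the attention map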

Figure 1: More inpainting results of our full model with contextual attention on CelebA-HQ faces. Each triad, from left to right, shows the original image, the input masked image and the result. All input images are masked from the validation set (the training and validation split is provided in the released code). All results are direct outputs from the same trained model without post-processing.

Figure 2: More inpainting results of our full model with contextual attention on CelebA faces. Each triad, from left to right, shows the input image, the result and the attention map (upscaled 4×). All input images are masked from the validation set (face identities do NOT overlap between the training and validation sets). All results are direct outputs from the same trained model without post-processing.

Figure 3: More inpainting results of our full model with contextual attention on ImageNet. Each triad, from left to right, shows the input image, the result and the attention map (upscaled 4×). All input images are masked from the validation set. All results are direct outputs from the same trained model without post-processing.

Figure 4: More inpainting results of our full model with contextual attention on DTD textures. Each triad, from left to right, shows the input image, the result and the attention map (upscaled 4×). All input images are masked from the validation set. All results are direct outputs from the same trained model without post-processing.

Figure 5: More qualitative results and comparisons. Each example shows, from left to right and top to bottom, the original image, the input image, Content-Aware Fill [?], Image Melding [?], StructCompletion [?], our baseline model, our full model, and the attention map of the full model. All input images are masked from the validation set. All our results are direct outputs from the same trained model without post-processing. Best viewed with zoom-in.

Figure 6: More qualitative results and comparisons. Each example shows, from left to right and top to bottom, the original image, the input image, Content-Aware Fill [?], Image Melding [?], StructCompletion [?], our baseline model, our full model, and the attention map of the full model. All input images are masked from the validation set. All our results are direct outputs from the same trained model without post-processing. Best viewed with zoom-in.

Figure 7: Visualization (highlighted regions) of which parts of the input image are attended. Each triad, from left to right, shows the input image, the result and the attention visualization.
