
Fingerprint Inpainting with Generative Models

s1739461 s1313429 s1771851 (Group 107)

Abstract

We propose and test two generative models to perform inpainting on fingerprint data. These models fill in missing patches of fingerprint images based on surrounding context. The first uses a deep residual architecture with a combination of reconstruction loss and adversarial loss, inspired by Generative Adversarial Networks, thereby encouraging mode selection as opposed to mode averaging. The second model we experiment with is a PixelCNN++. It produces qualitatively different fingerprint images. We demonstrate that they perform equally well when classifying inpainted fingerprints from 220 subjects with AlexNet. Our models significantly outperform the classification of non-inpainted images, which indicates their potential use in modern identification systems where fingerprints are often partly occluded. Finally, we present an example of a real-world application and a break-down test of our inpainting models.

1. Introduction

While supervised deep learning has shown remarkable results in many application domains (Szegedy et al., 2016), semantically labelling data by hand is time and labour-intensive. As such, the unsupervised deep learning paradigm, which tries to learn a suitable representation of data without using any labels, has seen a new wave of popularity in recent years. Here we focus our attention on a specific unsupervised task: inpainting. This was recently demonstrated by Pathak et al. (2016), who removed large patches from street view images and used a Convolutional Neural Network (CNN) to reconstruct the missing pixels. Their primary insight was to add an adversarial term to the loss function, inspired by earlier work on Generative Adversarial Networks (GANs) (Goodfellow et al., 2014). This approach was designed to encourage mode selection as opposed to mode averaging; the direct result was sharper, more realistic inpainting rather than blurry inpainting. While inpainting has been shown to be a relevant feature extractor in the transfer learning paradigm, owing to time constraints we do not consider this aspect in this research.

Generative models are a particularly powerful set of unsupervised learning techniques that learn a low-dimensional representation of the data and sample from this to produce new data. The idea is that the new data exists in close vicinity to the training data on a high-dimensional hyper-plane specific to the problem domain. GANs are recognised as the current state-of-the-art for such generative tasks and have been shown to produce convincingly realistic images by learning data distributions implicitly (e.g. Radford et al., 2015). They can be described as the competition between a generator, which learns to produce images that look real, and a discriminator, which learns to discriminate fake from real images. These two components learn in unison. Cao & Jain (2018) showed that GANs alone can produce random, fake fingerprints virtually indistinguishable from real ones.

GANs often show unstable learning behaviours; an alternative class of generative models are autoregressive models. These estimate data distributions explicitly by minimising the negative log-likelihood under a well-specified probability model. One such model, the PixelCNN++ (Salimans et al., 2017), generates images pixel by pixel by decomposing the probability density over colours as conditionals on previously generated pixels. Such models are, however, notoriously slow to train.

In our earlier work (Group107, 2018), we provided an autoencoder baseline for the inpainting task on fingerprint images. We achieved somewhat disappointing results because the autoencoder alone did not perform mode selection; instead, it mostly produced averaged-out, grey pixels, resulting in blurry images. In this work, we aim to improve inpainting performance by using a combination of deep Residual Neural Networks (ResNets) (He et al., 2016) and GANs, as well as a PixelCNN++. We show that the inpainting results are much more accurate with such models, and demonstrate how to quantify their relative performance by solving a classification problem using inpainted and non-inpainted fingerprints. We run two final experiments: first, we intentionally mask poor-quality fingerprint regions, showing how these can be inpainted to "fix" the fingerprints; second, we perform a break-down test by inpainting iteratively larger masked regions.


Section 2 briefly discusses the fingerprint data set. Sections 3 and 4 introduce the GAN and PixelCNN++ inpainting models respectively. We compare these models by means of a classification task in Section 5. Finally, we test our models on a real-world application in Section 6 and perform a break-down study in Section 7.

2. Data: Fingerprints

We have provided a detailed description of the fingerprint datasets we use herein in our previous work (Group107, 2018). Thus, we only provide a brief summary of this description in what follows.

2.1. Overview

From smart-phone authentication to evidence in criminal trials, fingerprints are ubiquitous in modern biometric systems. However, the images at hand are a challenge in deep learning since they suffer from high intra-class variation (owing to varying skin moisture, scanning deviations, artefacts from ageing, etc.) and low inter-class variation (i.e., fingerprints often look similar). As such, modern fingerprint matching technology typically relies on heuristic algorithms (Maltoni et al., 2009). These look to extract a combination of three levels of features: (1) the overall shape of the fingerprint, (2) the minutiae points, namely where the ridge lines either come to an end (ridge-ending) or split in two (bifurcation), and (3) the sweat pores visible on the ridges of the fingerprint. We show that our inpainting can often reproduce the first two features. The third feature requires high-resolution scanners and is often not available; it is not inpainted by our models.

2.2. The Data

There are several fingerprint datasets available for direct download or via request. We obtained a number of these in our earlier work. It is important to note that those datasets only contain a small number (eight maximum) of fingerprints per class, i.e. per finger.

Figure 1. Examples of fingerprint crops with four random patches.

A detailed description of the data we have used to construct suitable datasets for this research can be found in Group107 (2018). Owing to the insights gained through this earlier research, we have made several changes regarding the usage of the data described therein:

1. Input data size. We are now using 128 × 128 pixel crops of the fingerprints instead of 384 × 288 pixel full fingerprints. This is because our earlier work showed that a wider context did not improve inpainting performance. Moreover, the smaller input image sizes allowed us to fit deeper models on the GPU memory.

2. Data standardisation. In our earlier work we standardised fingerprints on a dataset level, because they each used different scanners. The abovementioned cropping made this step less relevant, so we now follow a simple approach of rescaling the data to the range [−1, 1]. We match this scale in the output images by using a tanh activation function in the final layer of our generative models.

3. Patching. We have set the patching protocol to place four patches of size 32 × 32 pixels (points 2 and 3 are sketched in code after this list).

4. Data split. Two of the original databases were held out as validation and test data, to ensure that the inpainting is invariant to the physical scanner used during data collection. The validation set is used for comparison in Sections 3 and 4, and both sets are used for the classification task in Section 5.
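To make points 2 and 3 concrete, the following is a minimal NumPy sketch of the rescaling and random-patching steps. The function names, the constant fill value used inside the patches, and the uniform placement of the patches are our own illustrative assumptions rather than the exact protocol of our data pipeline (which follows Group107 (2018)).

import numpy as np

def rescale(img_uint8):
    # Point 2: map 8-bit intensities to [-1, 1], matching the tanh output
    # range of the generative models.
    return img_uint8.astype(np.float32) / 127.5 - 1.0

def apply_random_patches(img, n_patches=4, patch_size=32, fill_value=0.0, rng=None):
    # Point 3: blank out n_patches square regions of patch_size x patch_size
    # pixels at uniformly random locations; also return the boolean mask.
    rng = rng if rng is not None else np.random.default_rng()
    patched = img.copy()
    mask = np.zeros(img.shape, dtype=bool)
    h, w = img.shape
    for _ in range(n_patches):
        top = rng.integers(0, h - patch_size + 1)
        left = rng.integers(0, w - patch_size + 1)
        patched[top:top + patch_size, left:left + patch_size] = fill_value
        mask[top:top + patch_size, left:left + patch_size] = True
    return patched, mask

# Example: one 128 x 128 crop with four 32 x 32 patches removed.
crop = rescale(np.random.randint(0, 256, (128, 128), dtype=np.uint8))
patched_crop, patch_mask = apply_random_patches(crop)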

No other changes from our earlier work have been made to the data. Figure 1 gives examples of the 128 × 128 patched fingerprint crops used as input data throughout this work. Note that many of the images displayed in this work are of full fingerprints as opposed to those shown in Figure 1. These are computed by processing overlapping regions of 128 × 128 pixels and averaging the resultant output, using soft contributions at the region edges, as sketched below.
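The following sketch illustrates this overlap-and-blend reconstruction. The Hann-window taper used as the "soft contribution" at the region edges and the stride of 64 pixels are our illustrative assumptions; inpaint_fn stands for either trained model applied to a 128 × 128 region.

import numpy as np

def blend_full_fingerprint(full_img, inpaint_fn, window=128, stride=64):
    # Slide a window x window region over the full fingerprint, inpaint each
    # region, and average the overlapping outputs with an edge-tapered weight.
    h, w = full_img.shape
    taper_1d = np.hanning(window) + 1e-3               # avoid exactly-zero weights
    taper = np.outer(taper_1d, taper_1d)
    out = np.zeros((h, w), dtype=np.float64)
    norm = np.zeros((h, w), dtype=np.float64)
    for top in range(0, h - window + 1, stride):
        for left in range(0, w - window + 1, stride):
            region = full_img[top:top + window, left:left + window]
            out[top:top + window, left:left + window] += taper * inpaint_fn(region)
            norm[top:top + window, left:left + window] += taper
    # Borders not covered by a full window would need separate handling.
    return out / np.maximum(norm, 1e-8)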

3. Experiment 1: GAN Inpainting

3.1. Description

In our earlier work on fingerprint inpainting we used a patch autoencoder which produced blurry regions (Group107, 2018). This was mostly due to the limited depth of the model and to the autoencoder prioritising "mode averaging" rather than "mode selection". In other words, the autoencoder often found that predicting an average pixel value (grey) would result in a lower reconstruction error. Pathak et al. (2016) proposed to include a GAN component in the inpainting pipeline for that very purpose. The rest of this section describes how we implemented a ResNet as the generator component of a GAN, and encouraged mode selection, which improved inpainting.

3.2. Implementation

After some preliminary hyper-parameter search, we decided to use the architecture shown in Figure 2 for the ResNet generator and discriminator. Deeper architectures were shown to yield a lower reconstruction loss (Group107, 2018), and so we used the deepest architecture that would fit on the GTX 1060 GPU used. We then focused our experimentation on the addition of an adversarial loss.

Figure 2. (a) and (b) are the ResNet architectures for the generator and discriminator components of the GAN. For each convolution we show the respective kernel size, number of output channels and stride length used.

Figure 2 (a) illustrates the ResNet generator architecture we used for this work, while Figure 2 (b) is the architecture of the discriminator component of the GAN. Note that the discriminator has a skip connection from the input layer to the fully-connected output layer. Its architecture, including the small residual skip connection length, was found through a brief hyper-parameter search. Each building block shown in Figures 2 (a) and (b) consists of a convolution, leaky ReLU activation and batch normalisation.
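As a sketch of what such a building block looks like in code, a minimal TensorFlow/Keras version is given below. The kernel sizes, channel counts and strides are placeholders here; the actual values per layer are those annotated in Figure 2, and this is not a verbatim excerpt of our codebase.

import tensorflow as tf

def building_block(x, filters, kernel_size=3, stride=1):
    # One building block from Figure 2: convolution, leaky ReLU, batch norm.
    x = tf.keras.layers.Conv2D(filters, kernel_size, strides=stride, padding='same')(x)
    x = tf.keras.layers.LeakyReLU()(x)
    return tf.keras.layers.BatchNormalization()(x)

def residual_unit(x, filters, n_blocks=2):
    # A short stack of building blocks with an additive skip connection, as in
    # the ResNet generator of Figure 2 (a); assumes `filters` matches the
    # channel count of `x` so the two branches can be added.
    skip = x
    for _ in range(n_blocks):
        x = building_block(x, filters)
    return tf.keras.layers.Add()([x, skip])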

The output of the generator component – an inpainted fingerprint – is used as input to the discriminator network shown in Figure 2 (b) to compute the adversarial loss, which we define in what follows. Let G(x^(i)) be the inpainted images produced by the generator G, x_0^(i) be the original images, and x^(i) be the patched images. For a batch of size M, the "fake" images G(x^(i)) and the "real" images x_0^(i) are both run through the discriminator D and classified. The adversarial loss is then defined as the negative cross-entropy (Goodfellow et al., 2014):

$$L_{\mathrm{adv}} = \frac{1}{M} \sum_{i=1}^{M} \left[ \log D(x_0^{(i)}) + \log\left(1 - D(G(x^{(i)}))\right) \right] \tag{1}$$

Since the arguments of the log terms are probabilities, we have L_adv ≤ 0; intuitively, the generator and discriminator play a "mini-max" game over this quantity. This adversarial loss term is then added to a reconstruction loss L_recon, for which we used the Huber loss (Huber, 1964), to yield the loss L_G of the generator component:

$$L_G = \lambda_{\mathrm{adv}} L_{\mathrm{adv}} + \lambda_{\mathrm{recon}} L_{\mathrm{recon}} \tag{2}$$

The discriminator loss L_D is given by (Goodfellow et al., 2014):

$$L_D = -\frac{1}{M} \sum_{i=1}^{M} \left[ \log\left(1 - D(G(x^{(i)}))\right) \right] \tag{3}$$

The generator loss L_G and discriminator loss L_D are optimised in turn for each batch. Both training losses were optimised with the Adam optimizer, using a small learning rate decay of 2.5 × 10⁻⁴% per training step.
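A minimal TensorFlow sketch of Equations (1)–(3) is given below, assuming the discriminator outputs probabilities (sigmoid activations) for the real batch d_real and the inpainted batch d_fake; the small epsilon for numerical stability and the helper names are ours.

import tensorflow as tf

EPS = 1e-8  # numerical safety inside the logs

def adversarial_loss(d_real, d_fake):
    # Equation (1): negative cross-entropy, which is <= 0.
    return tf.reduce_mean(tf.math.log(d_real + EPS)
                          + tf.math.log(1.0 - d_fake + EPS))

def generator_loss(x_original, x_inpainted, d_real, d_fake,
                   lambda_adv=2e-6, lambda_recon=1.0):
    # Equation (2): weighted sum of adversarial and Huber reconstruction losses.
    l_adv = adversarial_loss(d_real, d_fake)
    l_recon = tf.reduce_mean(tf.keras.losses.huber(x_original, x_inpainted))
    return lambda_adv * l_adv + lambda_recon * l_recon

def discriminator_loss(d_fake):
    # Equation (3), as defined in the text.
    return -tf.reduce_mean(tf.math.log(1.0 - d_fake + EPS))

# The generator and discriminator losses are then minimised in turn for each
# batch, each with its own Adam optimizer and a small learning rate decay.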

The main hyper-parameters that influence the success or failure of the addition of adversarial loss are the loss mixing coefficients λ_recon and λ_adv for the reconstruction and adversarial losses, respectively (see Equation 2). We tested many combinations of these over several weeks, three of which are compared to a ResNet without adversarial loss in the following section.

3.3. Results

Figure 3 shows the validation reconstruction losses between original, non-patched images and inpainted images for different values of λ_adv, where we kept λ_recon = 1 constant. In addition, we also included a model with λ_adv = 0 to show how the model would train without a GAN component. All the λ_adv values are kept extremely small because we quickly found that larger values resulted in essentially no reconstruction at all.

Looking at Figure 3 and the qualitative differences between inpainted images, we conclude that λ_adv = 2 × 10⁻⁶ and λ_recon = 1 performs best, with the exception of the pure ResNet autoencoder. This is to be expected, as the ResNet autoencoder's sole aim is to reconstruct the patched pixels, unlike the adversarial models, which also aim to generate realistic patches. In fact, while the model without a GAN component yields a lower loss, it does not yield images that are as visually sharp.

Figure 3. Huber Loss on the validation set for inpainting reconstructions with different values of λ_adv (4 × 10⁻⁴, 2 × 10⁻⁶ and 4 × 10⁻⁶), as well as without the adversarial loss component (λ_adv = 0, "No GAN"). A smoothing function is used to reduce the jitter of these curves.

The inpainting performance of the aforementioned best GAN model is demonstrated in Figure 4 (e) and (f). While the inpainted patches are somewhat blurry, the generated lines are clearly recognisable, as the contrast between ridges and valleys is much more prevalent than in our previous work (Group107, 2018).

4. Experiment 2: PixelCNN++ Inpainting

4.1. Description

A PixelCNN (van den Oord et al., 2016) is an image density model that uses autoregressive connections to model images pixel by pixel. The basic idea is to decompose the image distribution as a product of conditionals:

$$p(x) = \prod_{i=1}^{n^2} p(x_i \mid x_1, \ldots, x_{i-1}) \tag{4}$$

where n is the image dimension, which is 128 in our case.

Figure 4. (a) and (b) are original fingerprints selected from the validation set. (c) and (d) are the same fingerprints after applying patches that cover up to 25% of the images. (e) and (f) are inpainted using the generator component of the GAN set-up. (g) and (h) are the inpainted results using the PixelCNN++.

A PixelCNN learns a distribution over the entire colour-space for each pixel. This approach does not leverage the knowledge that similar colours will have similar likelihoods of being present. This and several other pitfalls motivated the advances that resulted in the PixelCNN++ (Salimans et al., 2017), which models the colour of each pixel as a mixture of K logistic distributions (K = 10 in the original paper). Thus, for any given pixel, the PixelCNN++ models:

$$p(x_i \mid \pi_i, \mu_i, s_i, x_{<i}) = \sum_{j=1}^{K} \pi_{ij} \left[ \sigma\!\left(\frac{x_i + 0.5 - \mu_{ij}}{s_{ij}}\right) - \sigma\!\left(\frac{x_i - 0.5 - \mu_{ij}}{s_{ij}}\right) \right] \tag{5}$$

where x_i is the pixel colour being estimated, π_ij is the mixture weight of the j-th logistic component for the i-th pixel, μ_ij is the mean of the j-th component, s_ij is its scale, and σ is the logistic sigmoid function. Furthermore, the colour channels are linked so that the green channel depends linearly on Equation 5 and the value of the estimated red channel, while the blue channel depends on its own estimate and a linear combination of the red and green channels. Owing to the grayscale nature of fingerprints, we discard this component.

All parameters in the neural network and in the loss function that defines the colour of each pixel are learned through direct minimisation of the negative log-likelihood of Equation 5, allowing the PixelCNN++ to define a powerful and flexible relationship between pixel intensities in an image.
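To make the grayscale variant concrete, the sketch below computes the per-image negative log-likelihood of Equation 5 for pixels rescaled to [−1, 1]. Tensor names and shapes are our own, and the special treatment of the extreme intensities (handled explicitly in the original OpenAI implementation) is omitted.

import tensorflow as tf

def grayscale_mixture_nll(x, logit_pi, mu, log_s):
    # x:        pixel intensities in [-1, 1],        shape [B, H, W, 1]
    # logit_pi: unnormalised mixture weights pi_ij,  shape [B, H, W, K]
    # mu:       component means mu_ij,               shape [B, H, W, K]
    # log_s:    log of component scales s_ij,        shape [B, H, W, K]
    K = logit_pi.shape[-1]
    x = tf.repeat(x, K, axis=-1)
    inv_s = tf.exp(-log_s)
    # The +/- 0.5 in Equation 5 is half a quantisation bin; with 256 grey
    # levels rescaled to [-1, 1], a bin is 2/255 wide, so the half-bin is 1/255.
    half_bin = 1.0 / 255.0
    cdf_plus = tf.sigmoid(inv_s * (x + half_bin - mu))
    cdf_minus = tf.sigmoid(inv_s * (x - half_bin - mu))
    prob_j = tf.maximum(cdf_plus - cdf_minus, 1e-12)       # per-component mass
    log_pi = tf.nn.log_softmax(logit_pi, axis=-1)
    # Stable log of the mixture: log sum_j pi_ij * P_j.
    log_prob = tf.reduce_logsumexp(log_pi + tf.math.log(prob_j), axis=-1)
    return -tf.reduce_sum(log_prob, axis=[1, 2])            # per-image NLL in nats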

4.2. Implementation

Our implementation is largely adapted from the original PixelCNN++ code published by OpenAI (github.com/openai/pixel-cnn). Nonetheless, some changes were needed, such as making the PixelCNN++ gray-scale friendly. We also set K = 3, as the model needs less flexibility to estimate gray pixels than colour pixels. Other than this, the default parameters were used. The main challenge was the long training time of such a model, which could exceed a week on a GTX 1060 GPU. This greatly limited the amount of hyper-parameter fine-tuning we could perform.

4.3. Results

We trained a PixelCNN++ model for nearly 60 epochs using the parameters of Salimans et al. (2017). We do not believe that the model had converged at this point, but the extremely slow training time made further training prohibitive. Figure 5 shows the bits per dimension loss for the PixelCNN++ training. Bits per dimension is commonly used to compare explicit distribution estimators, such as autoregressive models, because the log-likelihood is directly accessible. It is, essentially, the loss per dimension in bits and is defined as:

$$\mathrm{bpd} = \frac{1}{n^2 \log(2)} \sum_{i=1}^{n^2} -\log p(x_i \mid x_{<i}) \tag{6}$$

for a single image x with n² pixels. For instance, a value of bpd = 2.92 was obtained using PixelCNN++ on CIFAR-10 data (Salimans et al., 2017).
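For completeness, converting such a per-image negative log-likelihood (in nats, e.g. the output of the sketch above) into Equation 6's bits per dimension for an n × n grayscale image amounts to a single division; the helper below is our own.

import numpy as np

def bits_per_dimension(nll_nats, n=128):
    # Equation (6): divide the NLL by the number of pixels and convert
    # nats to bits by dividing by log(2).
    return nll_nats / (n * n * np.log(2.0))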

Figure 5. Bits per dimension loss for inpainting reconstructions using PixelCNN++ (training and validation curves).

The minimum validation bits per dimension we obtained was 3.74; it was at this point that the model was frozen and used for the inference tests of the following sections. Interestingly, the validation loss is lower than the training loss. This might be attributed to the held-out validation data representing 'easier' fingerprints to model, but is most likely the result of strong regularisation (e.g. dropout) being applied during training but not during inference.

Figure 4 (g) and (h) give examples of the inpainting using PixelCNN++ on randomly placed patches. The quality of the inpainting is different from that of the GAN-based inpainting. In particular, the PixelCNN++ inpainting seems more noisy and "pixelated". The autoregressive nature of this approach means that the reconstruction errors compound from top to bottom (owing to the PixelCNN++ conditioning on pixels above in its receptive field). Nonetheless, the inpainting using PixelCNN++ is certainly able to complete regions in a realistic fashion. The main advantage this autoregressive model has over an adversarial approach is that the resultant inpainting is not blurry. Upon close inspection it becomes evident that the ridge-valley line completions are more true to the original fingerprint when using the PixelCNN++. However, the associated speckle-type noise is a clear drawback compared to the smooth lines the GAN ResNet generates.

5. Experiment 3: Classification

5.1. Description

As mentioned previously, finding a metric to evaluate the quality of the inpainting task can be challenging. This is because the loss function does not necessarily indicate that the relevant features of the fingerprints have been inpainted. Here, we define "relevant" features as those features which can be used to discriminate one finger from another. Intuitively, this may correspond to the overall shape of the fingerprint and its minutiae points (see Section 2.1), although it is not constrained to be just those.

One simple way of quantifying the quality of our inpainting models is to run a classification task on patched images, both with and without inpainting. We expect the classifier to perform better on the inpainted images if the regions have been filled in with relevant features. Note that in this experiment we are not concerned with achieving state-of-the-art classification accuracy across classes. Indeed, our approach is not amenable to real-world classification, since there could be millions of fingerprints in a given database, each of which would need to be a class in our classifier network. Instead, we are solely interested in the relative performance of the classifier for different inpainting cases.

5.2. Implementation

We perform a standard classification task on 220 subjects from the datasets that were excluded from training in Sections 3 and 4. As previously, there are 8 impressions per subject, taken by the same scanner but in different conditions, such as various moisture levels. These conditions are randomised as we split each class into 6, 1, and 1 impressions for the training, validation and test sets respectively. We run the classifier on four versions of the dataset:

• without patches;

• with patches;

• with patches inpainted with the GAN model; and

  • with patches inpainted with the PixelCNN++ model.

The latter three cases were patched using 27 patches of size 32 × 32 pixels, with overlapping allowed. This corresponds to a maximum of 25% of the total area of the input images, which are cropped to 374 × 300 pixels (see Group107 (2018)).

Since we do not have much data and are interested in the relative classification accuracy, we consider a fairly shallow model: AlexNet (Krizhevsky et al., 2012). We implement this architecture exactly as per the original paper, which most notably splits the 2nd, 4th and 5th convolutional layers into two groups in order to reduce the number of parameters. We train this model with the Adam optimizer and a decaying learning rate. Training converges in about one hour on a GTX 1060 GPU.
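For illustration, such grouped convolutions can be expressed directly in recent versions of Keras via the groups argument of Conv2D; the sketch below shows the idea for a single layer (filter counts and kernel sizes follow the original AlexNet paper and are placeholders here, and this is not an excerpt of our implementation).

import tensorflow as tf

def grouped_conv(x, filters, kernel_size):
    # Splitting a convolution into two groups means each half of the filters
    # only sees half of the input channels, roughly halving the parameter
    # count of that layer (as in AlexNet's 2nd, 4th and 5th convolutions).
    return tf.keras.layers.Conv2D(filters, kernel_size, padding='same',
                                  groups=2, activation='relu')(x)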

The small number of images per class poses a serious challenge when training a deep CNN. We propose two solutions. First, we augment the input images with a random ±20 degree rotation. Then, we perform cross-validation on the training and validation sets: we perform 7 training runs, each using different images for the validation set. This allows us to average the validation accuracy across all those runs when comparing different cases of inpainting.
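A sketch of these two steps is shown below. The rotation helper and the index-based fold generator are our own illustrative code; the mapping from impression indices to actual image files is dataset-specific and omitted.

import numpy as np
from scipy.ndimage import rotate

def augment(img, rng, max_deg=20.0):
    # Random rotation in [-20, 20] degrees, applied on the fly during training.
    angle = rng.uniform(-max_deg, max_deg)
    return rotate(img, angle, reshape=False, mode='nearest')

def cross_validation_folds(n_train_val=7):
    # One impression per class is held out for testing; the remaining 7 are
    # rotated through the validation slot, giving the 7 runs whose validation
    # accuracies are averaged.
    for val_idx in range(n_train_val):
        train_idx = [i for i in range(n_train_val) if i != val_idx]
        yield train_idx, [val_idx]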

5.3. Results

The final validation accuracies, averaged over all cross-validation runs, are shown in Figure 6. All training runs (not shown in that figure) converge quickly to 100% training accuracy, which is a sign of overfitting. This was expected owing to the small number of fingerprint impressions per class. In fact, there are so few examples that, even with data augmentation, our shallow architecture is data-starved, meaning it can memorise some of the training data without necessarily learning a relevant representation thereof.

Considering the above, it may seem counter-intuitive to try to classify fingerprints given the data we have access to. The reader is reminded, however, that this experiment is intended to show the relative performance of fingerprint inpainting procedures.

While the absolute validation accuracies are not particularly high, the relationship between the curves demonstrates the value of our inpainting models. Indeed, the accuracy increases by over one third when inpainting the missing patches, which means that some features relevant for classification were inpainted correctly. The GAN and PixelCNN++ inpainting perform similarly well.

Figure 6. Validation accuracies for patched fingerprints with and without inpainting (Patched, GAN and PixelCNN++ curves). These curves are averages of 7 cross-validation runs. The classification accuracy without any patching (Original) is also shown as a baseline. A smoothing function is used to reduce the jitter of these curves.

6. Real-World Use Case

To test the real-world applicability of the models learned through this research, we apply them to fingerprints where regions have been manually identified for replacement. These regions typically correspond to wrinkles or acquired scars, and may not be present in the fingerprints of the same finger in a database. Examples can be seen in Figure 7.

Of course, this inpainting of poor-quality regions would not necessarily improve the classification accuracy of the fingerprints, for instance if all impressions in the dataset contain the same wrinkles or scars. This remains to be explored as future work. Nonetheless, the reconstruction is highly encouraging in that the continuity of both inpainting approaches is good, resulting in realistic 'wrinkle-free' fingerprints.

Figure 7. Real-world use case: fingerprints intentionally patched to encourage reconstructions in poor-quality regions. From left to right: original fingerprint, manually patched, GAN inpainting, and PixelCNN++ inpainting.

7. Break-down Testing

The models we train must learn to leverage the surrounding context in order to perform inpainting. It is not immediately clear, however, how close the context must be to the inpainted pixels in order to be effective. A simple break-down test is demonstrated in this section: we inpaint a central patch of increasingly larger size in order to assess at what point the inpainting procedures fail. This is demonstrated in Figure 8.

It can be seen that breakdown occurs at approximately the same patch size that was used for training, which is somewhat expected. However, it is also evidence that much of the advantage gained over our earlier work (Group107, 2018) is owing to the much deeper ResNet architecture employed, rather than to the adversarial loss that assisted in training the generative component of the GAN. Indeed, a GAN alone would inpaint the squares with realistic fingerprint features even without any context. Interestingly, the GAN ResNet tends to merge lines together when the inpainting is ambiguous, which is often not the case in the real fingerprint counterpart.

The PixelCNN++ inpainting, however, fails in a fashion consistent with the autoregressive structure of the model. That is, each pixel is conditioned on those above it in a convolutional manner. This receptive field structure means that regions at the lower end of the patches in Figure 8 are more noise-prone. Overall, the GAN ResNet can inpaint larger regions than the PixelCNN++.
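The break-down test itself only requires generating central square masks of increasing size, as in the sketch below; the specific sequence of sizes and the inpaint and apply_mask helpers are placeholders for the procedure described above rather than our exact code.

import numpy as np

def central_mask(size, img_size=128):
    # Boolean mask marking a size x size square centred in the crop.
    mask = np.zeros((img_size, img_size), dtype=bool)
    start = (img_size - size) // 2
    mask[start:start + size, start:start + size] = True
    return mask

# for size in (32, 48, 64, 80, 96):                 # increasingly large patches
#     masked = apply_mask(crop, central_mask(size)) # blank out the masked pixels
#     result = inpaint(masked)                      # GAN generator or PixelCNN++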

8. Conclusion

In this report we demonstrated the inpainting performance of a ResNet autoencoder with an adversarial loss and of a PixelCNN++ model on a dataset of patched fingerprints. For the former model, we explored different weightings of the adversarial loss component, while we used standard parameters for the latter. Both models showed promising inpainting performance: the adversarial model generated realistic but blurry patches, while the PixelCNN++ resulted in sharper but more pixelated inpainting. We evaluated the inpainting performance of both approaches by framing a classification problem. Classifiers trained on fingerprints inpainted via both models worked similarly well and much better than a classifier trained on the patched fingerprints, albeit not as well as one trained on the original, non-patched fingerprints.

We have considered the real-world applications of our methods, in particular their use in denoising poor-quality fingerprints. This has a direct application in the auto-completion of partial latent fingerprints in criminal investigations. Lastly, we tested the robustness of our inpainting models by inpainting increasingly large patches. We found that the inpainting performance starts to become unreliable when the patch size becomes bigger than the patch sizes used in the training data.

The results of this work are encouraging and could, given minor adjustments, have direct applications in the biometric sciences. However, due to the complexity of the models, there is much that we have not experimented with and which could form the basis for future work. For instance, if we were to continue this work, we would aim to use a Siamese architecture for determining whether two fingerprints originate from the same source. Albeit a combinatorial rephrasing of the fingerprint classification problem, this is how current state-of-the-art fingerprint identification systems work. Moreover, posing the problem as such would circumvent the issue of low numbers of data samples per class.

With more time, we would also have liked to add an adversarial loss to the PixelCNN++ and blend our two presented approaches together. This is fundamentally difficult owing to the differences in how these models estimate data distributions, and is thus interesting for more general domains too.

The inpainting models could also be used as data augmenters for deep learning classification. One could augment an entire image patch by patch and obtain an output fingerprint without scars or line discontinuities. One could also use our ResNet encoder model as a feature extractor and transfer the learned representation across datasets. Finally, we would like to test our models more thoroughly by showing how much context is actually used during inpainting. Perhaps a context much smaller than 128 × 128 yields satisfactory inpainting, thereby allowing for deeper models or faster training.

Figure 8. Breakdown test for incrementally larger patches. From left to right: fingerprints with patches, GAN processed result, and PixelCNN++ processed result.

9. Acknowledgements

Our Python TensorFlow codebase was partly adapted from the Machine Learning Practical GitHub repository (https://github.com/CSTR-Edinburgh/mlpractical) and amended with inspiration from Antreas Antoniou's GitHub (https://github.com/AntreasAntoniou).


References

Cao, Kai and Jain, Anil K. Fingerprint synthesis: Evaluating fingerprint search at scale. 2018.

Goodfellow, Ian, Pouget-Abadie, Jean, Mirza, Mehdi, Xu, Bing, Warde-Farley, David, Ozair, Sherjil, Courville, Aaron, and Bengio, Yoshua. Generative adversarial nets. In Ghahramani, Z., Welling, M., Cortes, C., Lawrence, N. D., and Weinberger, K. Q. (eds.), Advances in Neural Information Processing Systems 27, pp. 2672–2680. Curran Associates, Inc., 2014. URL http://papers.nips.cc/paper/5423-generative-adversarial-nets.pdf.

Group107. MLP coursework 3: Fingerprint inpainting. 2018.

He, Kaiming, Zhang, Xiangyu, Ren, Shaoqing, and Sun, Jian. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778, 2016.

Huber, Peter J. Robust estimation of a location parameter. The Annals of Mathematical Statistics, pp. 73–101, 1964.

Krizhevsky, Alex, Sutskever, Ilya, and Hinton, Geoffrey E. ImageNet classification with deep convolutional neural networks. In Pereira, F., Burges, C. J. C., Bottou, L., and Weinberger, K. Q. (eds.), Advances in Neural Information Processing Systems 25, pp. 1097–1105. Curran Associates, Inc., 2012.

Maltoni, Davide, Maio, Dario, Jain, Anil K, and Prabhakar, Salil. Handbook of Fingerprint Recognition. Springer Science & Business Media, 2009.

Pathak, Deepak, Krahenbuhl, Philipp, Donahue, Jeff, Darrell, Trevor, and Efros, Alexei A. Context encoders: Feature learning by inpainting. CoRR, abs/1604.07379, 2016. URL http://arxiv.org/abs/1604.07379.

Radford, Alec, Metz, Luke, and Chintala, Soumith. Unsupervised representation learning with deep convolutional generative adversarial networks. CoRR, abs/1511.06434, 2015. URL http://arxiv.org/abs/1511.06434.

Salimans, Tim, Karpathy, Andrej, Chen, Xi, and Kingma, Diederik P. PixelCNN++: Improving the PixelCNN with discretized logistic mixture likelihood and other modifications. arXiv preprint arXiv:1701.05517, 2017.

Szegedy, Christian, Ioffe, Sergey, Vanhoucke, Vincent, and Alemi, Alex A. Inception-v4, Inception-ResNet and the impact of residual connections on learning. In ICLR 2016 Workshop, 2016. URL https://arxiv.org/abs/1602.07261.

van den Oord, Aaron, Kalchbrenner, Nal, Espeholt, Lasse, Vinyals, Oriol, Graves, Alex, et al. Conditional image generation with PixelCNN decoders. In Advances in Neural Information Processing Systems, pp. 4790–4798, 2016.