

Fundus2Angio: A Novel Conditional GAN Architecture for Generating Fluorescein Angiography Images from Retinal Fundus Photography

Sharif Amit Kamran 1
[email protected]

Khondker Fariha Hossain 2
[email protected]

Alireza Tavakkoli 1
[email protected]

Stewart Lee Zuckerbrod 3
[email protected]

1 University of Nevada, Reno; Reno, NV, USA
2 Deakin University; Melbourne, Australia
3 Houston Eye Associates; Houston, TX, USA

Abstract

Carrying out clinical diagnosis of retinal vascular degeneration using Fluorescein Angiography (FA) is a time-consuming process and can pose significant adverse effects on the patient. Angiography requires injection of a dye that may cause severe adverse reactions and can even be fatal. Currently, there are no non-invasive systems capable of generating Fluorescein Angiography images. Retinal fundus photography, however, is a non-invasive imaging technique that can be completed in a few seconds. In order to eliminate the need for FA, we propose a conditional generative adversarial network (GAN) to translate fundus images to FA images. The proposed GAN incorporates a novel residual block capable of generating high-quality FA images. These images are important tools in the differential diagnosis of retinal diseases without the need for an invasive procedure with possible side effects. Our experiments show that the proposed architecture outperforms other state-of-the-art generative networks, and that the angiograms it produces are qualitatively indistinguishable from real ones.

1 Introduction

For a long time, Fluorescein Angiography (FA) combined with retinal funduscopy has been used for diagnosing retinal vascular and pigment epithelial-choroidal diseases [35]. The process requires the injection of a fluorescent dye, which appears in the optic vein within 8-12 seconds depending on the age and cardiovascular structure of the eye and stays up to 10 minutes [33]. Although generally considered safe, there have been reports of mild to severe complications due to allergic reactions to the dye [2, 28, 41]. Side effects range from nausea and vomiting to anaphylaxis, heart attack, anaphylactic shock, and death [11, 12, 27, 31, 32]. In addition, leakage of fluorescein in the intravenous area is common.

© 2020. The copyright of this document resides with its authors. It may be distributed unchanged freely in print or electronic forms.

arXiv:2005.05267v1 [eess.IV] 11 May 2020


However, the concentration of the fluorescein solution does not have any direct impact on the adverse effects mentioned above [44].

Given the complications and risks associated with this procedure, a non-invasive, affordable, and computationally effective alternative is quite imperative. The only current alternatives to fluorescein angiography (FA) are Optical Coherence Tomography and basic image processing techniques, and these systems are generally quite expensive. Without a computationally effective and financially viable mechanism to generate reliable and reproducible fluorescein angiograms, the only alternative is to utilize retinal funduscopy for differential diagnosis. Although automated systems consisting of image processing and machine learning algorithms have been proposed for diagnosing underlying conditions and diseases from fundus images [13, 14, 32, 37], there has not been an effective effort to generate FA images from retinal photographs. In this paper, we propose a novel conditional Generative Adversarial Network (GAN) called Fundus2Angio, capable of synthesizing fluorescein angiograms from retinal fundus images. The procedure is completely automated and does not require any human intervention. We use both qualitative and quantitative metrics to test the proposed architecture, and compare it with other state-of-the-art conditional GANs [21, 42, 48]. Our model outperforms these networks in terms of quantitative measurements. For qualitative evaluation, expert ophthalmologists were asked to distinguish fake angiograms within a random, balanced set of real and fake angiograms over two trials. The results show that the angiograms generated by the proposed network are quite indistinguishable from real FA images.

2 Literature Review

Generative adversarial networks have revolutionized many image manipulation tasks, such as image editing [8, 47], image styling [6, 38], and image style transfer [42, 43, 48]. Multi-resolution architectures are common practice in computer vision, and coupled architectures have the capability to combine fine and coarse information from images [3, 4]. Recently, techniques for conditional [9, 19] and unconditional GANs [5, 45] have explored the idea of combined resolutions within the architecture for domain-specific tasks. Inspired by this, we propose an architecture that extracts features at different scales.

Some approaches have also used multi-scale discriminators for style transfer [24, 42, 46]. However, they only attach discriminators to the generator that deals with fine features, while ignoring discriminators for the coarse generator entirely. In order to learn useful features at the coarsest scale, separate multi-scale discriminators are necessary. Our proposed architecture employs them for both the coarse and fine generators.

For high-quality image synthesis, a pyramid network with multiple pairs of discriminators and generators, termed SinGAN [39], has also been proposed. Though it produces high-quality synthesized images, the model works only on unpaired images. Compounding this problem, each generator's input is the synthesized output produced by the previous generator; as a result, it cannot be employed for pair-wise image training that satisfies a condition. To alleviate this problem, a connection needs to be established that can propagate features from the coarse to the fine generator. In this paper, we propose such an architecture: it has a feature-appending mechanism between the coarse and fine generators, making it a two-level pyramid network with multi-scale discriminators, as illustrated in Fig. 1.


Figure 1: Proposed Generative Adversarial Network

3 The Proposed Methodology

This paper proposes a new conditional generative adversarial network (GAN) comprising a novel residual block for producing realistic FA images from retinal fundus images. First, we introduce the residual block in section 3.1. We then delve into the proposed conditional GAN, encompassing fine and coarse generators and four multi-scale discriminators, in sections 3.2 and 3.3. Lastly, in section 3.4, we discuss the objective function and the loss weight distribution for each of the networks that form the proposed architecture.

3.1 Novel Residual Block

Recently, residual blocks have become the norm in many image classification, detection, and segmentation architectures [16, 17]. Generative architectures have employed these blocks in applications ranging from image-to-image translation to super-resolution [22, 29, 42]. In its atomic form, a residual unit consists of two consecutive convolution layers; the output of the second layer is added to the input, allowing for deeper networks. Computationally, regular convolution layers are expensive compared to a newer variant called separable convolution [7], which performs a depth-wise convolution followed by a point-wise convolution. This, in turn, helps extract and retain depth and spatial information through the network. It has been shown that interspersing convolutional layers allows for more efficient and accurate networks [23]. We incorporate this idea to design a novel residual block that retains both depth and spatial information, decreases computational complexity, and ensures efficient memory usage, as shown in Table 1.

Table 1: Comparison between Original and Proposed Residual Block

Residual Block | Equation                                                    | Activation        | No. of Parameters¹
Original       | $[R_i \circledast F_{Conv} \circledast F_{Conv}] + R_i$    | ReLU (pre) [17]   | 18,688
Proposed       | $[R_i \circledast F_{Conv} \circledast F_{SepConv}] + R_i$ | Leaky-ReLU (post) | 10,784

¹ $F_{Conv}$ and $F_{SepConv}$ have kernel size K = 3, stride S = 1, padding P = 0, and number of channels C = 32.
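For reference, these counts appear consistent with 32-channel, 3×3 layers when batch-normalization parameters are counted and convolution biases are omitted (our reading; the paper does not break the numbers down): the original block has 2 × (3·3·32·32) = 18,432 convolution weights plus 2 × 128 = 256 batch-norm parameters, totalling 18,688, while replacing the second convolution with a separable one shrinks its weights from 9,216 to 3·3·32 + 32·32 = 1,312, giving 9,216 + 1,312 + 256 = 10,784.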


Figure 2: Proposed Residual Block

As illustrated in Fig. 2, we replace the last convolution operation with a separable convolution. We also use batch normalization [20] and Leaky-ReLU as a post-activation mechanism after both the convolution and separable convolution layers. For better results, we incorporate reflection padding, as opposed to zero padding, before each convolution operation. The entire operation can be formulated as shown in Eq. 1:

$R_{i+1} = \left[R_i \circledast F_{Conv} \circledast F_{SepConv}\right] + R_i = F(R_i) + R_i$    (1)

Here, ~ refers to convolution operation while Fconv and FSepConv signify the back-to-backconvolution and separable convolution operations. By exploiting convolution and separableconvolution layer with Leaky-ReLU, we ensure that two distinct feature maps (spatial &depth information) can be combined to generate fine fluorescein angiograms.
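A minimal PyTorch sketch of the proposed block follows, assuming a Leaky-ReLU slope of 0.2 and the K = 3, C = 32 configuration of Table 1; the class and layer names are ours, not taken from a released implementation:

```python
import torch.nn as nn

class SeparableConv2d(nn.Module):
    """Depth-wise convolution followed by a point-wise (1x1) convolution."""
    def __init__(self, channels, kernel_size=3):
        super().__init__()
        self.depthwise = nn.Conv2d(channels, channels, kernel_size,
                                   groups=channels, bias=False)
        self.pointwise = nn.Conv2d(channels, channels, 1, bias=False)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

class ProposedResidualBlock(nn.Module):
    """Conv -> SepConv with reflection padding, batch-norm and
    post-activation Leaky-ReLU; the input is added back at the end,
    i.e. R_{i+1} = F(R_i) + R_i as in Eq. 1."""
    def __init__(self, channels=32):
        super().__init__()
        self.body = nn.Sequential(
            nn.ReflectionPad2d(1),
            nn.Conv2d(channels, channels, kernel_size=3, bias=False),
            nn.BatchNorm2d(channels),
            nn.LeakyReLU(0.2),
            nn.ReflectionPad2d(1),
            SeparableConv2d(channels, kernel_size=3),
            nn.BatchNorm2d(channels),
            nn.LeakyReLU(0.2),
        )

    def forward(self, r):
        return self.body(r) + r
```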

Figure 3: Generator and Discriminator Architectures

3.2 Coarse and Fine GeneratorsUsing a coarse-to-fine generator for both conditional and unconditional GANs results in veryhigh quality image generation, as observed in recent architectures, such as pix2pixHD [42]


and SinGAN [39]. Inspired by this idea, we use two generators (G_fine and G_coarse) in the proposed network, as illustrated in Fig. 3. The generator G_fine synthesizes fine angiograms from fundus images by learning local information, including retinal venules, arterioles, hemorrhages, exudates, and microaneurysms. The generator G_coarse, on the other hand, tries to extract and preserve global information, such as the structures of the macula and optic disc, color, contrast, and brightness, while producing coarse angiograms.

The generator G_fine takes input images of size 512×512 and produces output images at the same resolution. Similarly, G_coarse takes an image of half that size (256×256) and outputs an image of the same size as its input. In addition, G_coarse outputs a feature vector of size 256×256×64 that is eventually added to one of the intermediate layers of G_fine. Such hybrid generators are quite powerful for sharing local and global information between multiple architectures, as seen in [22, 39, 42]. Both generators use convolution layers for downsampling and transposed convolution layers for upsampling. It should be noted that in G_coarse the input is downsampled twice (×2) before being upsampled twice again with transposed convolutions. In both generators, the proposed residual blocks are placed after the last downsampling operation and before the first upsampling operation, as illustrated in Fig. 3. In G_fine, by contrast, downsampling takes place once with the necessary convolution layer, followed by the addition of the feature vector, a repetition of residual blocks, and an upsampling stage that produces the fine angiography image. All convolution and transposed convolution operations are followed by batch normalization [20] and Leaky-ReLU activations. To train these generators, we start by batch-training G_coarse on random samples once, and then train G_fine once with a new set of random samples. During this time, the discriminators' weights are frozen so that they are not trainable. Lastly, we jointly fine-tune all the discriminators and generators together to train the GAN.
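The coupling between the two generators can be sketched as follows; this is a simplification of Fig. 3, and the generator signatures (returning the feature map alongside the coarse output, and accepting it in G_fine) are assumptions on our part:

```python
import torch.nn.functional as F

def forward_coupled(G_coarse, G_fine, fundus_512):
    """One joint forward pass of the coarse-to-fine generator pair
    (tensors in NCHW layout)."""
    # G_coarse works at half resolution and returns both its coarse
    # angiogram and a 64-channel, 256x256 feature map.
    fundus_256 = F.interpolate(fundus_512, scale_factor=0.5,
                               mode='bilinear', align_corners=False)
    angio_256, feat_256 = G_coarse(fundus_256)  # feat: N x 64 x 256 x 256

    # G_fine downsamples its 512x512 input once to 256x256, adds the
    # coarse feature map to its intermediate activations, runs the
    # residual blocks, and upsamples back to 512x512.
    angio_512 = G_fine(fundus_512, coarse_features=feat_256)
    return angio_256, angio_512
```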

3.3 Multi-scale PatchGAN as Discriminator

For synthesizing fluorescein angiography images, the GAN discriminators need to adapt to both coarse and fine generated images in order to distinguish between real and fake ones. To achieve this, we would either need a deeper architecture or a kernel with a wider receptive field. Both solutions lead to overfitting and increase the number of parameters, and a large amount of processing power would be required to compute them. To address this issue, we exploit the idea of using two Markovian discriminators, first introduced in a technique called PatchGAN [30]; this technique takes input at different scales, as previously seen in [39, 42].

We use four discriminators that share a similar network structure but operate at different image scales. Specifically, we downsample the real and generated angiograms by a factor of 2 using Lanczos sampling [10] to create an image pyramid of three scales (original, 2× downsampled, and 4× downsampled). We group the four discriminators into two sets, D_fine = [D1_fine, D2_fine] and D_coarse = [D1_coarse, D2_coarse], as seen in Fig. 1. The discriminators are then trained to distinguish between real and generated angiography images at the three distinct resolutions.
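The three-scale pyramid can be built, for instance, with Pillow's Lanczos resampler (a sketch; the paper does not name its implementation):

```python
from PIL import Image

def lanczos_pyramid(angio: Image.Image):
    """Return the angiogram at the original, 2x- and 4x-downsampled
    scales, e.g. 512 -> 256 -> 128 for a 512x512 crop; the scales are
    then distributed among D_fine and D_coarse as in Fig. 1."""
    w, h = angio.size
    half = angio.resize((w // 2, h // 2), Image.LANCZOS)
    quarter = angio.resize((w // 4, h // 4), Image.LANCZOS)
    return angio, half, quarter
```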

The outputs of the PatchGAN are 64×64 and 32×32 for D_fine, and 32×32 and 16×16 for D_coarse. With the given discriminators, the loss function can be formulated as given in Eq. 2. It is a multi-task problem of maximizing the loss of the discriminators while minimizing the loss of the generators:

$\min_{G_{fine},\,G_{coarse}} \; \max_{D_{fine},\,D_{coarse}} \; \mathcal{L}_{cGAN}(G_{fine}, G_{coarse}, D_{fine}, D_{coarse})$    (2)


Despite the discriminators having similar network structures, the one that learns features at a lower resolution has the wider receptive field. It tries to extract and retain more global features, such as the macula, the optic disc, color, and brightness, to generate better coarse images. In contrast, the discriminator that learns features at the original resolution dictates that the generator produce fine features, such as retinal veins, arteries, and exudates. By doing this, we combine feature information at the global and local scales while training the generators independently with their paired multi-scale discriminators.

3.4 Weighted Objective Function and Adversarial Loss

We use LSGAN [34] for calculating the loss and training our conditional GAN. The objective function for our conditional GAN is given in Eq. 3:

$\mathcal{L}_{cGAN}(G, D) = \mathbb{E}_{x,y}\left[(D(x,y) - 1)^2\right] + \mathbb{E}_{x}\left[(D(x, G(x)) + 1)^2\right]$    (3)

where the discriminators are first trained on the real fundus image x and the real angiography image y, and then on the real fundus image x and the fake angiography image G(x). We start by training the discriminators D_fine and D_coarse for a couple of iterations on random batches of images. Next, we train G_coarse while keeping the weights of the discriminators frozen. Following that, we train G_fine on a batch of random samples in a similar fashion. We use the Mean Squared Error (MSE) to calculate the individual loss of each generator, as shown in Eq. 4:

$\mathcal{L}_{L2}(G) = \mathbb{E}_{x,y}\left[\lVert G(x) - y \rVert_2\right]$    (4)

where $\mathcal{L}_{L2}$ is the reconstruction loss for a real angiogram y given a generated angiogram G(x). We use this loss for both G_fine and G_coarse so that the model can generate high-quality angiograms at different scales. Previous techniques have also exploited the idea of combining the basic GAN objective with an MSE loss [36]. From Eqs. 3 and 4 we can formulate our final objective function, given in Eq. 5:

$\min_{G_{fine},\,G_{coarse}} \; \max_{D_{fine},\,D_{coarse}} \; \mathcal{L}_{cGAN}(G_{fine}, G_{coarse}, D_{fine}, D_{coarse}) + \lambda\left[\mathcal{L}_{L2}(G_{fine}) + \mathcal{L}_{L2}(G_{coarse})\right]$    (5)

Here, λ dictates whether to prioritize the discriminators or the generators. For our architecture, more weight is given to the reconstruction loss of the generators, and we thus pick a large λ value.
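Taken together, Eqs. 3 to 5 translate into per-batch losses along the following lines; this is a sketch assuming the ±1 least-squares targets of Eq. 3 and λ = 10 from Sec. 4.2, applied once per generator/discriminator pair:

```python
import torch
import torch.nn.functional as F

def discriminator_loss(D, x, y_real, y_fake):
    # Eq. 3: real pairs (x, y) are pushed toward +1,
    # generated pairs (x, G(x)) toward -1.
    out_real = D(x, y_real)
    out_fake = D(x, y_fake.detach())
    return (F.mse_loss(out_real, torch.ones_like(out_real)) +
            F.mse_loss(out_fake, -torch.ones_like(out_fake)))

def generator_loss(D, x, y_real, y_fake, lam=10.0):
    # Adversarial term (generated pairs pushed toward the real label)
    # plus the weighted reconstruction term of Eqs. 4 and 5.
    out_fake = D(x, y_fake)
    adv = F.mse_loss(out_fake, torch.ones_like(out_fake))
    rec = F.mse_loss(y_fake, y_real)
    return adv + lam * rec
```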

4 Experiments

In the following section, different experiments and evaluations are provided for our proposed architecture. First, we elaborate on the data preparation and pre-processing scheme in Sec. 4.1. We then define our hyper-parameter settings in Sec. 4.2. Following that, different architectures are compared on qualitative evaluation metrics in Sec. 4.3 and, lastly, on quantitative evaluation metrics in Sec. 4.4.


4.1 Dataset

For training, we use the fundus and angiography dataset provided by Hajeb et al. [15]. The dataset consists of 30 diabetic retinopathy pairs and 29 normal pairs of angiography and fundus images from 59 patients. Because not all of the pairs are perfectly aligned, we select 17 pairs for our experiment based on alignment; the selected images are either perfectly or nearly aligned. Both fundus images and angiograms have a resolution of 576×720. Fundus photographs are in RGB format, whereas angiograms are in gray-scale. Due to the shortage of data, we take 50 random crops of size 512×512 from each image to train our model, so the total number of training samples is 850 (17×50).
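The cropping scheme can be sketched as follows, assuming each aligned pair has been loaded as a pair of H×W(×C) arrays (the loading code is omitted):

```python
import numpy as np

def random_crops(fundus, angio, n=50, size=512, rng=None):
    """Take n aligned random crops from one fundus/angiogram pair.
    With 17 pairs of 576x720 images this yields 17 * 50 = 850 samples."""
    rng = rng or np.random.default_rng()
    h, w = fundus.shape[:2]
    crops = []
    for _ in range(n):
        top = rng.integers(0, h - size + 1)
        left = rng.integers(0, w - size + 1)
        crops.append((fundus[top:top + size, left:left + size],
                      angio[top:top + size, left:left + size]))
    return crops
```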

4.2 Hyper-parameter tuning

LSGAN [34] was found to be effective for generating the desired synthetic images for our task. We picked λ = 10 (Eq. 5). For the optimizer, we used Adam [26] with learning rate α = 0.0002, β1 = 0.5, and β2 = 0.999. We train with mini-batches of size b = 4 for 100 epochs. It took approximately 10 hours to train our model on an NVIDIA RTX 2070 GPU.
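These settings map directly onto, for example, PyTorch's Adam (a sketch; the paper does not name its framework):

```python
import torch

def make_optimizers(generator: torch.nn.Module, discriminator: torch.nn.Module):
    """Adam with lr = 2e-4, beta1 = 0.5, beta2 = 0.999 (Sec. 4.2);
    training then runs with mini-batches of size 4 for 100 epochs."""
    opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4, betas=(0.5, 0.999))
    opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4, betas=(0.5, 0.999))
    return opt_g, opt_d
```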

Figure 4: Angiogram generated from transformed Fundus images

4.3 Qualitative Evaluation

To evaluate the performance of the network, we took 14 images and cropped one 512×512 section from each quadrant of each image. We conducted two sets of experiments to evaluate both the network's robustness to global changes in the imaging modes and its ability to adapt to structural changes in the vascular patterns and structure of the eye. We used the GNU Image Manipulation Program (GIMP) [40] to transform and distort the images.

In the first set of experiments, three transformations were applied to the images: 1) blurring, to represent out-of-focus funduscopy or fundus photography in the presence of severe cataracts, 2) sharpening, to represent pupil dilation, and 3) noise, to represent interference


during photography. Good robustness is indicated by the generated angiograms' similarity to the real FA image, since these transformations do not affect the vascular structure of the retina. A side-by-side comparison of the different architectures' predictions is shown in Fig. 4. As can be observed, the proposed architecture produces images very similar to the ground truth (GT) under these global changes to the fundus image.

In the case of blurred fundus images, our model is less affected than the other architectures, as seen in the second row of Fig. 4: the structure of smaller veins is preserved better compared to Pix2Pix and Pix2PixHD.

In the case of sharpened images, the angiograms produced by Pix2Pix and Pix2PixHD show vein-like structures introduced in the background, which are not present in our prediction. These are seen in the third row of Fig. 4.

In the case of noisy images, as seen in the last row of Fig. 4, our prediction remains unaffected by this pixel-level alteration. Both Pix2Pix and Pix2PixHD, however, fail to generate thin and small vessel structures because they fail to extract low-level features.

Figure 5: Angiogram generated from distorted Fundus images with biological markers

In the second set of experiments, we modified the vascular pattern of the retina in the fundus images. These structural changes are represented by two different types of distortion: 1) pinch, representing the flattening of the retina that results in a pulled/pushed retinal structure, and 2) whirl, representing retinal distortions caused by increased intra-ocular pressure (IOP). Good adaptation to structural changes in the retina is achieved if the generated angiograms are similar to the angiograms with the changed vascular structure. The effects of pinch and whirl on the predicted angiograms are illustrated in Fig. 5.

Pinch represents the globe-flattening condition, which manifests vascular changes on the retina as a result of distortions of the retinal subspace. This experiment shows the adaptability and reproducibility of the proposed network in uncovering changes in vascular structure. From the first row of Fig. 5, it is evident that our model locates the retinal vessels more effectively than the other techniques.

Whirl represents changes in the IOP or vitreous changes in the eye that may result in twists in the vascular structure. Similar to pinch, the network's ability to adapt to this structural change can be measured by whether the generated FA image is similar to the real angiogram showing the changed vascular structure. As seen in the last row of Fig. 5, our network encodes the feature information of the vessel structures and is much less affected by this kind of distortion. The other architectures fail to generate micro-vessel structures, as can be seen in Fig. 5.


4.4 Quantitative Evaluations

For quantitative evaluation, we also performed two experiments. In the first, we use the Fréchet inception distance (FID) [18], which has been used to evaluate similar style-transfer GANs [1, 24, 25]. We computed the FID scores of the different architectures on the generated FA images against the original angiograms, and likewise for the images generated from fundus images altered by the five global and structural changes, i.e., blurring, sharpening, noise, pinch, and whirl. The results are reported in Table 2; note that a lower FID score means a better result.

Table 2: Fréchet inception distance (FID) for different architectures

Architecture   | Orig. | Noise        | Blur        | Sharp       | Whirl       | Pinch
Ours           | 30.3  | 41.5 (11.2↑) | 32.3 (2.0↑) | 34.3 (4.0↑) | 38.2 (7.9↑) | 33.1 (2.8↑)
Pix2PixHD [42] | 42.8  | 53.0 (10.2↑) | 43.7 (1.1↑) | 47.5 (4.7↑) | 45.9 (3.1↑) | 39.2 (3.6↓)
Pix2Pix [21]   | 48.6  | 46.8 (1.8↓)  | 50.8 (2.2↑) | 47.1 (1.5↓) | 43.0 (5.6↓) | 43.7 (4.9↓)

As Table 2 shows, on the original fundus images the FID of our network's angiograms is 30.3, while the other techniques are at least 10 points worse: Pix2PixHD (42.8) and Pix2Pix (48.6). For noisy images, the FID for Pix2Pix dropped slightly but increased for both Pix2PixHD and our technique; notice that the FID of our technique is still better than that of both Pix2Pix and Pix2PixHD. For all other changes, the FID score of our technique increased slightly but still outperformed Pix2Pix and Pix2PixHD in both robustness and adaptation to structural changes.
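For reference, FID is the Fréchet distance between Gaussian fits to Inception features of the two image sets; a minimal numpy/scipy sketch over precomputed feature matrices is shown below (the paper does not state which implementation it used):

```python
import numpy as np
from scipy import linalg

def fid(feats_real: np.ndarray, feats_fake: np.ndarray) -> float:
    """Frechet inception distance between two (n_samples, dim) sets of
    Inception activations: ||mu1 - mu2||^2 + Tr(C1 + C2 - 2 (C1 C2)^1/2)."""
    mu1, mu2 = feats_real.mean(axis=0), feats_fake.mean(axis=0)
    c1 = np.cov(feats_real, rowvar=False)
    c2 = np.cov(feats_fake, rowvar=False)
    covmean = linalg.sqrtm(c1 @ c2)
    if np.iscomplexobj(covmean):  # numerical noise can leave tiny
        covmean = covmean.real    # imaginary components
    diff = mu1 - mu2
    return float(diff @ diff + np.trace(c1 + c2 - 2.0 * covmean))
```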

Table 3: Results of the qualitative experiment with an undisclosed fake/real ratio

       | Correct | Incorrect
Fake   | 15%     | 85%
Real   | 80%     | 20%

Average: Missed 53%, Found 48%, Confusion 52.5%

In the next experiment, we evaluate the quality of the generated angiograms by asking experts (ophthalmologists) to identify the fake angiograms in a collection of 40 balanced (50%/50%) and randomly mixed angiograms. The experts were not told how many of the images were real and how many were fake. This undisclosed ratio of fake and real images was a significant design choice, as it allows us to evaluate three metrics: 1) incorrectly labeled fake images, representing how real the generated images look; 2) correctly labeled real images, representing how accurately the experts recognized salient angiogram features; and 3) the confusion metric, representing how effective our proposed method was overall in confusing the experts. The results are shown in Table 3.

As can be seen from Table 3, the experts labeled 85% of the fake angiograms as real. This shows that the experts had difficulty identifying the fake images, while they easily identified the real angiograms, with 80% accuracy. Overall, the experts misclassified 53% of all images, resulting in a confusion factor of 52.5%. This is significant, as a confusion factor of 50% is the best achievable result.
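For concreteness, assuming the balanced set contained 20 fake and 20 real images, 85% of fakes labeled real (17 images) plus 20% of reals labeled fake (4 images) gives 21/40 ≈ 53% of all images misclassified, while averaging the two per-class error rates yields the reported confusion of (85% + 20%)/2 = 52.5%.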


5 Conclusion

In this paper, we introduced Fundus2Angio, a novel conditional generative architecture capable of generating angiograms from retinal fundus images. We further demonstrated its robustness, adaptability, and reproducibility by synthesizing high-quality angiograms from transformed and distorted fundus images. Additionally, we illustrated how changes in biological markers do not affect the adaptability and reproducibility of the angiograms synthesized by our technique. This ensures that the proposed architecture effectively preserves known biological markers (e.g., vascular patterns and structures). As a result, the proposed network can be effectively utilized to produce accurate FA images for the same patient from his or her fundus images over time. This allows better monitoring of a patient's disease progression and can help uncover newly developed diseases or conditions. One future direction for this work is to incorporate retinal vessel segmentation and exudate localization.

References

[1] Andrew Brock, Jeff Donahue, and Karen Simonyan. Large scale GAN training for high fidelity natural image synthesis. arXiv preprint arXiv:1809.11096, 2018.

[2] Knut Brockow and Mario Sánchez-Borges. Hypersensitivity to contrast media and dyes. Immunology and Allergy Clinics, 34(3):547–564, 2014.

[3] Matthew Brown, David G Lowe, et al. Recognising panoramas. In ICCV, volume 3, page 1218, 2003.

[4] Peter Burt and Edward Adelson. The Laplacian pyramid as a compact image code. IEEE Transactions on Communications, 31(4):532–540, 1983.

[5] Qifeng Chen and Vladlen Koltun. Photographic image synthesis with cascaded refinement networks. In Proceedings of the IEEE International Conference on Computer Vision, pages 1511–1520, 2017.

[6] Wengling Chen and James Hays. SketchyGAN: Towards diverse and realistic sketch to image synthesis. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 9416–9425, 2018.

[7] François Chollet. Xception: Deep learning with depthwise separable convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1251–1258, 2017.

[8] Tali Dekel, Chuang Gan, Dilip Krishnan, Ce Liu, and William T Freeman. Sparse, smart contours to represent and edit images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 3511–3520, 2018.

[9] Emily L Denton, Soumith Chintala, Rob Fergus, et al. Deep generative image models using a Laplacian pyramid of adversarial networks. In Advances in Neural Information Processing Systems, pages 1486–1494, 2015.

[10] Claude E Duchon. Lanczos filtering in one and two dimensions. Journal of Applied Meteorology, 18(8):1016–1022, 1979.

[11] N El Harrar, B Idali, S Moutaouakkil, M El Belhadji, K Zaghloul, A Amraoui, and M Benaguida. Anaphylactic shock caused by application of fluorescein on the ocular conjunctiva. Presse Médicale (Paris, France: 1983), 25(32):1546, 1996.

[12] Vittorio Fineschi, Giorgio Monasterolo, Roberto Rosi, and Emanuela Turillazzi. Fatal anaphylactic shock during a fluorescein angiography. Forensic Science International, 100(1-2):137–142, 1999.

[13] Huazhu Fu, Jun Cheng, Yanwu Xu, Changqing Zhang, Damon Wing Kee Wong, Jiang Liu, and Xiaochun Cao. Disc-aware ensemble network for glaucoma screening from fundus image. IEEE Transactions on Medical Imaging, 37(11):2493–2501, 2018.

[14] Nikita Gurudath, Mehmet Celenk, and H Bryan Riley. Machine learning identification of diabetic retinopathy from fundus images. In 2014 IEEE Signal Processing in Medicine and Biology Symposium (SPMB), pages 1–7. IEEE, 2014.

[15] Shirin Hajeb Mohammad Alipour, Hossein Rabbani, and Mohammad Reza Akhlaghi. Diabetic retinopathy grading by digital curvelet transform. Computational and Mathematical Methods in Medicine, 2012, 2012.

[16] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 770–778, 2016.

[17] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Identity mappings in deep residual networks. In European Conference on Computer Vision, pages 630–645. Springer, 2016.

[18] Martin Heusel, Hubert Ramsauer, Thomas Unterthiner, Bernhard Nessler, and Sepp Hochreiter. GANs trained by a two time-scale update rule converge to a local Nash equilibrium. In Advances in Neural Information Processing Systems, pages 6626–6637, 2017.

[19] Xun Huang, Yixuan Li, Omid Poursaeed, John Hopcroft, and Serge Belongie. Stacked generative adversarial networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 5077–5086, 2017.

[20] Sergey Ioffe and Christian Szegedy. Batch normalization: Accelerating deep network training by reducing internal covariate shift. arXiv preprint arXiv:1502.03167, 2015.

[21] Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, and Alexei A Efros. Image-to-image translation with conditional adversarial networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1125–1134, 2017.

[22] Justin Johnson, Alexandre Alahi, and Li Fei-Fei. Perceptual losses for real-time style transfer and super-resolution. In European Conference on Computer Vision, pages 694–711. Springer, 2016.

[23] S. A. Kamran, S. Saha, A. S. Sabbir, and A. Tavakkoli. Optic-Net: A novel convolutional neural network for diagnosis of retinal diseases from optical tomography images. In 2019 18th IEEE International Conference On Machine Learning And Applications (ICMLA), pages 964–971, 2019.

[24] Tero Karras, Timo Aila, Samuli Laine, and Jaakko Lehtinen. Progressive growing of GANs for improved quality, stability, and variation. arXiv preprint arXiv:1710.10196, 2017.

[25] Tero Karras, Samuli Laine, and Timo Aila. A style-based generator architecture for generative adversarial networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 4401–4410, 2019.

[26] Diederik P Kingma and Jimmy Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.

[27] Anthony SL Kwan, Chris Barry, Ian L McAllister, and Ian Constable. Fluorescein angiography and adverse drug reactions revisited: the Lions Eye experience. Clinical & Experimental Ophthalmology, 34(1):33–38, 2006.

[28] Kris A Kwiterovich, Maureen G Maguire, Robert P Murphy, Andrew P Schachat, Neil M Bressler, Susan B Bressler, and Stuart L Fine. Frequency of adverse systemic reactions after fluorescein angiography: results of a prospective study. Ophthalmology, 98(7):1139–1142, 1991.

[29] Christian Ledig, Lucas Theis, Ferenc Huszár, Jose Caballero, Andrew Cunningham, Alejandro Acosta, Andrew Aitken, Alykhan Tejani, Johannes Totz, Zehan Wang, et al. Photo-realistic single image super-resolution using a generative adversarial network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 4681–4690, 2017.

[30] Chuan Li and Michael Wand. Precomputed real-time texture synthesis with Markovian generative adversarial networks. In European Conference on Computer Vision, pages 702–716. Springer, 2016.

[31] Phillip Lieberman, Stephen F Kemp, John Oppenheimer, David M Lang, I Leonard Bernstein, Richard A Nicklas, John A Anderson, David I Bernstein, Jonathan A Bernstein, Jordan N Fink, et al. The diagnosis and management of anaphylaxis: an updated practice parameter. Journal of Allergy and Clinical Immunology, 115(3):S483–S523, 2005.

[32] Rodrigo Pessoa Cavalcanti Lira, Cleriston Lucena de Andrade Oliveira, Marta Virgínia Ribeiro Brito Marques, Alaine Rocha Silva, and Cristiano de Carvalho Pessoa. Adverse reactions of fluorescein angiography: a prospective study. Arquivos Brasileiros de Oftalmologia, 70(4):615–618, 2007.

[33] Naresh Mandava, Elias Reichel, D Guyer, et al. Fluorescein and ICG angiography. St Louis: Mosby, 106:800–808, 2004.

[34] Xudong Mao, Qing Li, Haoran Xie, Raymond YK Lau, Zhen Wang, and Stephen Paul Smolley. Least squares generative adversarial networks. In Proceedings of the IEEE International Conference on Computer Vision, pages 2794–2802, 2017.

[35] Viola Stella Mary, Elijah Blessing Rajsingh, and Ganesh R Naik. Retinal fundus image analysis for diagnosis of glaucoma: a comprehensive survey. IEEE Access, 2016.

[36] Deepak Pathak, Philipp Krahenbuhl, Jeff Donahue, Trevor Darrell, and Alexei A Efros. Context encoders: Feature learning by inpainting. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2536–2544, 2016.

[37] Ryan Poplin, Avinash V Varadarajan, Katy Blumer, Yun Liu, Michael V McConnell, Greg S Corrado, Lily Peng, and Dale R Webster. Prediction of cardiovascular risk factors from retinal fundus photographs via deep learning. Nature Biomedical Engineering, 2(3):158, 2018.

[38] Patsorn Sangkloy, Jingwan Lu, Chen Fang, Fisher Yu, and James Hays. Scribbler: Controlling deep image synthesis with sketch and color. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 5400–5409, 2017.

[39] Tamar Rott Shaham, Tali Dekel, and Tomer Michaeli. SinGAN: Learning a generative model from a single natural image. In Proceedings of the IEEE International Conference on Computer Vision, pages 4570–4580, 2019.

[40] GIMP Team et al. GIMP: GNU Image Manipulation Program. GIMP Team, 2019.

[41] MJ Torres, C Mayorga, and M Blanca. Nonimmediate allergic reactions induced by drugs: pathogenesis and diagnostic tests. Journal of Investigational Allergology & Clinical Immunology, 19(2):80, 2009.

[42] Ting-Chun Wang, Ming-Yu Liu, Jun-Yan Zhu, Andrew Tao, Jan Kautz, and Bryan Catanzaro. High-resolution image synthesis and semantic manipulation with conditional GANs. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 8798–8807, 2018.

[43] Wenqi Xian, Patsorn Sangkloy, Varun Agrawal, Amit Raj, Jingwan Lu, Chen Fang, Fisher Yu, and James Hays. TextureGAN: Controlling deep image synthesis with texture patches. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 8456–8465, 2018.

[44] Lawrence A Yannuzzi, Kathleen T Rohrer, Lori J Tindel, Russell S Sobel, Marcelle A Costanza, William Shields, and Edith Zang. Fluorescein angiography complication survey. Ophthalmology, 93(5):611–617, 1986.

[45] Han Zhang, Tao Xu, Hongsheng Li, Shaoting Zhang, Xiaogang Wang, Xiaolei Huang, and Dimitris N Metaxas. StackGAN: Text to photo-realistic image synthesis with stacked generative adversarial networks. In Proceedings of the IEEE International Conference on Computer Vision, pages 5907–5915, 2017.

[46] He Zhang and Vishal M Patel. Densely connected pyramid dehazing network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 3194–3203, 2018.

[47] Jun-Yan Zhu, Philipp Krähenbühl, Eli Shechtman, and Alexei A Efros. Generative visual manipulation on the natural image manifold. In European Conference on Computer Vision, pages 597–613. Springer, 2016.

[48] Jun-Yan Zhu, Taesung Park, Phillip Isola, and Alexei A Efros. Unpaired image-to-image translation using cycle-consistent adversarial networks. In Proceedings of the IEEE International Conference on Computer Vision, pages 2223–2232, 2017.