15
Motion Estimation and Correction in Cardiac CT Angiography Images using Convolutional Neural Networks T. Lossau (n´ ee Elss) a,b , H. Nickisch a , T. Wissel a , R. Bippus a , H. Schmitt a , M. Morlock b , M. Grass a a Philips Research, Hamburg, Germany b Hamburg University of Technology, Germany Abstract Cardiac motion artifacts frequently reduce the interpretability of coronary computed tomography angiogra- phy (CCTA) images and potentially lead to misinterpretations or preclude the diagnosis of coronary artery disease (CAD). In this paper, a novel motion compensation approach dealing with Coronary Motion esti- mation by Patch Analysis in CT data (CoMPACT) is presented. First, the required data for supervised learning is generated by the Coronary Motion Forward Artifact model for CT data (CoMoFACT) which introduces simulated motion to 19 artifact-free clinical CT cases with step-and-shoot acquisition proto- col. Second, convolutional neural networks (CNNs) are trained to estimate underlying 2D motion vectors from 2.5D image patches based on the coronary artifact appearance. In a phantom study with computer- simulated vessels, CNNs predict the motion direction and the motion magnitude with average test accuracies of 13.37 ± 1.21 and 0.77 ± 0.09 mm, respectively. On clinical data with simulated motion, average test accuracies of 34.85 ± 2.09 and 1.86 ± 0.11 mm are achieved, whereby the precision of the motion direction prediction increases with the motion magnitude. The trained CNNs are integrated into an iterative motion compensation pipeline which includes distance-weighted motion vector extrapolation. Alternating motion estimation and compensation in twelve clinical cases with real cardiac motion artifacts leads to significantly reduced artifact levels, especially in image data with severe artifacts. In four observer studies, mean artifact levels of 3.08 ± 0.24 without MC and 2.28 ± 0.29 with CoMPACT MC are rated in a five point Likert scale. Keywords: Coronary Computed Tomography Angiography, Motion Compensation, Deep Learning 1. Introduction Non-invasive coronary computed tomography an- giography (CCTA) has become a preferred tech- nique for the detection and diagnosis of coronary artery disease (CAD) (Budoff et al., 2017; Foy et al., 2017; Camargo et al., 2017; Liu et al., 2017), but the temporal resolution of CCTA images is restricted by the angular range required for reconstruction and the system rotation time. Despite the appli- cation of dual source CT systems (Petersilka et al., 2008) and ECG-gated acquisition, cardiac motion frequently leads to artifacts in the reconstructed CT image volumes which hamper reliable evalua- tion (Ghekiere et al., 2017). Several software-based solutions for motion ar- tifact detection, quantification and reduction have been developed in the last few years. A selection of related papers is listed and compared in Table 1. Motion vector field (MVF) estimation and subse- quent motion compensated reconstruction (MCR) are the key components of most motion compen- sation (MC) algorithms. A variety of MCR meth- ods including motion compensated iterative recon- struction (Isola et al., 2010), motion compensated filtered back-projection (MC-FBP) (Sch¨ afer et al., 2006; van Stevendaal et al., 2008) and backproject- then-warp (BPW) strategies (Bhagalia et al., 2012; Brendel et al., 2014) are known. We assigned the algorithms in Table 1 into registration-based, PAR- based (partial angle reconstruction), metric-based and image-based approaches. Registration-based: Motion estimation by 3-D/3-D registration of multiple heart phases has shown great results in the reduction of moderate and severe motion artifacts (van Stevendaal et al., 2008; Isola et al., 2010; Bhagalia et al., 2012; Tang et al., 2012) but requires an extended temporal Preprint submitted to ELSEVIER June 24, 2019

Motion Estimation and Correction in Cardiac CT Angiography … · 2019. 6. 24. · temporal resolution of CCTA images is restricted by the angular range required for reconstruction

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Motion Estimation and Correction in Cardiac CT Angiography … · 2019. 6. 24. · temporal resolution of CCTA images is restricted by the angular range required for reconstruction

Motion Estimation and Correction in Cardiac CT Angiography Imagesusing Convolutional Neural Networks

T. Lossau (nee Elss)a,b, H. Nickischa, T. Wissela, R. Bippusa, H. Schmitta, M. Morlockb, M. Grassa

aPhilips Research, Hamburg, GermanybHamburg University of Technology, Germany

Abstract

Cardiac motion artifacts frequently reduce the interpretability of coronary computed tomography angiogra-phy (CCTA) images and potentially lead to misinterpretations or preclude the diagnosis of coronary arterydisease (CAD). In this paper, a novel motion compensation approach dealing with Coronary Motion esti-mation by Patch Analysis in CT data (CoMPACT) is presented. First, the required data for supervisedlearning is generated by the Coronary Motion Forward Artifact model for CT data (CoMoFACT) whichintroduces simulated motion to 19 artifact-free clinical CT cases with step-and-shoot acquisition proto-col. Second, convolutional neural networks (CNNs) are trained to estimate underlying 2D motion vectorsfrom 2.5D image patches based on the coronary artifact appearance. In a phantom study with computer-simulated vessels, CNNs predict the motion direction and the motion magnitude with average test accuraciesof 13.37 ± 1.21 and 0.77 ± 0.09 mm, respectively. On clinical data with simulated motion, average testaccuracies of 34.85± 2.09 and 1.86± 0.11 mm are achieved, whereby the precision of the motion directionprediction increases with the motion magnitude. The trained CNNs are integrated into an iterative motioncompensation pipeline which includes distance-weighted motion vector extrapolation. Alternating motionestimation and compensation in twelve clinical cases with real cardiac motion artifacts leads to significantlyreduced artifact levels, especially in image data with severe artifacts. In four observer studies, mean artifactlevels of 3.08± 0.24 without MC and 2.28± 0.29 with CoMPACT MC are rated in a five point Likert scale.

Keywords: Coronary Computed Tomography Angiography, Motion Compensation, Deep Learning

1. Introduction

Non-invasive coronary computed tomography an-giography (CCTA) has become a preferred tech-nique for the detection and diagnosis of coronaryartery disease (CAD) (Budoff et al., 2017; Foy et al.,2017; Camargo et al., 2017; Liu et al., 2017), but thetemporal resolution of CCTA images is restrictedby the angular range required for reconstructionand the system rotation time. Despite the appli-cation of dual source CT systems (Petersilka et al.,2008) and ECG-gated acquisition, cardiac motionfrequently leads to artifacts in the reconstructedCT image volumes which hamper reliable evalua-tion (Ghekiere et al., 2017).

Several software-based solutions for motion ar-tifact detection, quantification and reduction havebeen developed in the last few years. A selectionof related papers is listed and compared in Table 1.

Motion vector field (MVF) estimation and subse-quent motion compensated reconstruction (MCR)are the key components of most motion compen-sation (MC) algorithms. A variety of MCR meth-ods including motion compensated iterative recon-struction (Isola et al., 2010), motion compensatedfiltered back-projection (MC-FBP) (Schafer et al.,2006; van Stevendaal et al., 2008) and backproject-then-warp (BPW) strategies (Bhagalia et al., 2012;Brendel et al., 2014) are known. We assigned thealgorithms in Table 1 into registration-based, PAR-based (partial angle reconstruction), metric-basedand image-based approaches.

Registration-based: Motion estimation by3-D/3-D registration of multiple heart phases hasshown great results in the reduction of moderateand severe motion artifacts (van Stevendaal et al.,2008; Isola et al., 2010; Bhagalia et al., 2012; Tanget al., 2012) but requires an extended temporal

Preprint submitted to ELSEVIER June 24, 2019

Page 2: Motion Estimation and Correction in Cardiac CT Angiography … · 2019. 6. 24. · temporal resolution of CCTA images is restricted by the angular range required for reconstruction

Table 1: Literature survey in the research field of coronary motion artifacts in CCTA images. Papers are clustered intoartifact reduction approaches, artifact detection/quantification approaches and approaches from our group which are based onsynthetically motion-perturbed data generated with the Coronary Motion Forward Artifact model for CT data (CoMoFACT).In this manuscript, a method for Coronary Motion estimation by Patch Analysis in CT data (CoMPACT) is introduced.

Paper Purpose Approach Data Constraints Keywords

SymbolsH: hand-craftedD: data-drivenΓ: required angular

scan range

Art

ifac

tD

etec

tion

Art

ifact

Quanti

fica

tion

Moti

onE

stim

ati

on

Art

ifact

Red

uct

ion

Reg

istr

atio

n-B

ase

d

PA

R-B

ase

d

Met

ric-

Base

d

Imag

e-B

ase

d

Alg

ori

thm

Man

ually

Lab

eled

Forw

ard

Model

Appro

xim

ate

Cen

terl

ine

Γ=

180

+F

an

Angle

Γ>

180

+F

an

Angle Abbreviations

PAR partial angle reconstructionMC-FBP motion compensated filtered back-projectionMC-IR motion compensated iterative reconstructionBPW backproject-then-warpME motion estimationTRIM temporal resolution improvement methodMAM motion artifact measure

van Stevendaal et al. (2008) X X X H X model-based surface reg.; MC-FBPIsola et al. (2010) X X X H X multi-phase elatic image reg.; MC-IRBhagalia et al. (2012) X X X H X X centerline reg.; subphasic warp and add; BPWTang et al. (2012) X X X H X multi-phase image reg.; alternating ME and MC-FBPGrass et al. (2016) X X X X H X vesselness filtering; opposite PAR reg.Kim et al. (2018) X X X X X H X linear PAR reg.; metric-based MVF refinementSchondube et al. (2011) X X H X iterative PAR; histogram constraint; TRIMRohkohl et al. (2013) X X X H X X iterative MAM optimizationHahn et al. (2017) X X X X H X X PAR; MAM optimization; BPWJung et al. (2018) X X D X X X cross-phase style transfer; image-to-image translation

Sprem et al. (2017) X X D X X X coronary artery calcification; deep learningMa et al. (2018) X X X H X X fold overlap ratio; low-intensity region score

Elss et al. (2018b) X X D X X X CoMoFACT; deep learningElss et al. (2018a) X X D X X X CoMoFACT; deep learningLossau et al. (2019) X X X D X X X CoMoFACT; deep learningThis manuscript X X X D X X X CoMoFACT; deep learning; CoMPACT

scan range which corresponds to increased radia-tion doses. In (Grass et al., 2016) a series of CTimage volumes with reduced angular range of 75

is reconstructed at the angular positions −120,−60, +60 and +120 around a selected centerphase. The resulting partial angle volumes arepost-processed by first combining high frequenciesfrom the partial scans with low frequencies froma central full scan and subsequent vessel featureenhancement according to (Wiemker et al., 2013).Elastic image registration of diametrically opposedpartial scans as described by Kim et al. (2015)yields dense MVFs which are integrated intoMC-FBP. By this procedure, Grass et al. (2016)reduced the required angular scan range to 315

plus fan angle of the reconstruction field of view.

PAR-based: The increased temporal resolution ofPARs are exploited in several MC methods (Grasset al., 2016; Hahn et al., 2017; Kim et al., 2018).Schondube et al. (2011) introduced the temporalresolution improvement method based on an itera-tive PAR with an additional histogram constraint.

Metric-based: An initial metric-based approachdealing with MVF estimation by iterative min-imization of handcrafted motion artifact mea-sures (MAM) has been presented by Rohkohlet al. (2013). This method was extended in

(Hahn et al., 2017) by introducing a novel motionmodel parametrization and application of estimatedMVFs to PARs. Kim et al. (2018) proposed acombination of these approaches by first estimatinglinear motion using registration of PARs and sub-sequent MVF refinement by information potentialminimization.

Image-based: Image-to-image translation us-ing deep residual convolutional neural networks(CNNs) allows for artifact suppression without con-sideration of the corresponding projection data(Jung et al., 2018). However, these approaches areessentially restricted by the information content ofthe motion perturbed input patches.

So far, most MC approaches are rule-based, i.e.they exploit hand-crafted features for MVF deter-mination. Machine learning holds the promise tosolve tasks of any complexity (Hornik et al., 1989),but requires either appropriate manually labeledor synthesized data. Forward models may helpto circumvent time-consuming and possibly noise-affected hand-labeling processes with the benefit ofreproducibility. Coronary motion artifacts do man-ifest in typical patterns containing intensity un-dershoots and arc-shaped blurring due to the CTreconstruction geometry which can be realisticallysimulated by the Coronary Motion Forward Arti-

2

Page 3: Motion Estimation and Correction in Cardiac CT Angiography … · 2019. 6. 24. · temporal resolution of CCTA images is restricted by the angular range required for reconstruction

x m

oti

on

y m

oti

on

x m

oti

on

y m

oti

on

x-y plane x-z plane y-z plane

x

y

x

z

y

z

x-y plane x-z plane y-z plane

x

y

x

z

y

z

Figure 1: Constant linear motion is introduced to phantom vessel trees using the CoMoFACT. Depending on the relation ofmotion direction, reconstruction direction and coronary orientation, motion artifacts of different appearance occur. In caseof motion in direction of the coronary artery, artifacts are hardly visible due to blurring within the vessel (left box, bottomrow). The forward projected coronary artery mask depicted in the center is split into two scanning shots. Patch-based motionprediction models have to be robust regarding related stack transition artifacts.

fact model for CT data (CoMoFACT) from Elsset al. (2018b) and Lossau et al. (2019). This pre-liminary work has demonstrated that CNNs trainedon synthesized data are applicable for motion arti-fact recognition and quantification in clinical prac-tice, i.e. CNNs are capable of identifying coronarymotion artifact patterns. Phantom studies further-more revealed that the relation between angular re-construction range and motion direction is crucialfor the artifact appearance (Elss et al., 2018a).

This paper addresses the question of how wellmotion estimation can be performed from a singlereconstructed CT image patch based on the coro-nary artifact appearance. Furthermore, potentialand limitations of single-phase, image-based mo-tion estimation using CNNs for MC in clinical prac-tice are investigated. An initial feasibility studyfor axial motion estimation in coronary artery seg-ments which are aligned along the scanners z-axishas been presented in (Elss et al., 2018a). Buildingon this work, the Coronary Motion estimation byPatch Analysis in CT data (CoMPACT) method isintroduced here. We extended the model from Elsset al. (2018a) for application on coronary artery seg-ments with arbitrary orientations and adapted thenetwork architecture, accordingly. Furthermore,learned motion vector prediction networks are inte-grated into a novel iterative motion compensationpipeline which enables reduction of severe artifactsin dose efficient short scan data. The following stepsare applied to build and test the proposed motioncompensation method, whereby the main contribu-tions of this work are 2. and 3., i.e. the deep-learning-based motion estimation approach and themotion compensation pipeline:

1. The CoMoFACT presented by Lossau et al.(2019) enables introduction of simulated andhence controlled motion to artifact-free cardiacCT data. By restricting the trajectories of theCoMoFACT to constant linear motion in theaxial plane, 19 000 pairs of motion perturbedimage patches and underlying 2D motion vec-tors are generated from 19 clinical cases withexcellent image quality (see Section 3.1).

2. Based on the synthetically motion perturbeddata, CNNs are trained to estimate underlying2D motion vectors from 2.5D image patches.First, a phantom study is performed as startingpoint for motion vector estimation in a well-controlled scenario without variation in back-ground intensities, contrast agent density ornoise level. Finally, CNNs are trained for themore difficult task of motion vector estimationon clinical data (see Section 3.2).

3. The trained CNNs are integrated into an itera-tive motion compensation pipeline which usesalternating MVF estimation and MC-FBP.The MVF estimation step includes distance-weighted extrapolation of motion vectors pre-dicted along the approximate coronary center-lines (see Section 3.3).

4. CoMPACT MC is tested on twelve clinicalcases with real artifacts and compared to theregistration-based MC approach from Grasset al. (2016) in order to evaluate generaliza-tion capabilities of the trained CNNs regard-ing non-synthetic artifacts and the feasibility ofpatch-based motion estimation in clinical prac-tice (see Section 4.2).

3

Page 4: Motion Estimation and Correction in Cardiac CT Angiography … · 2019. 6. 24. · temporal resolution of CCTA images is restricted by the angular range required for reconstruction

2. Material

The CoMoFACT from Lossau et al. (2019) usesartifact-free CCTA cases with step-and-shoot ac-quisition protocol as reference point for the mo-tion introduction process. In addition to the re-constructed CT image volumes which determine theno motion state, the corresponding coronary arterytrees and the raw projection data are required. Therestriction to step-and-shoot cases offers the advan-tage to generate artifacts in a well-controlled situa-tion without table movement or multi-cycle recon-struction. Phantom as well as patient data stud-ies are performed. Section 2.1 details collectionand pre-processing of the clinical reference data.The design of the computer-simulated vessels is de-scribed in Section 2.2. Twelve additional clinicalcases with real motion-perturbation are collectedin order to test the transferability to non-syntheticartifacts (see Section 2.3).

2.1. Clinical reference data without artifacts

Slice-by-slice visual inspection is performed togather contrast-enhanced cardiac CT data setswhich exhibit no coronary motion artifacts in thereconstructed CT image volume. In total, 19prospectively ECG-triggered clinical data sets fromdifferent patients are collected. A 256-slice CTscanner (Brilliance iCT, Philips Healthcare, Cleve-land, OH, USA) with a gantry rotation speed of0.272 sec per turn was used for the acquisition ofthese reference cases. The mean heart rates HRmean

ranged from 45.2 bpm to 68.9 bpm. Cardiac CT im-age volumes are reconstructed at the mid-diastolicquiescent phase by aperture-weighted cardiac re-construction (AWCR) (van Stevendaal et al., 2007).The center of the cardiac gating window hereaftercalled the reference cardiac phase r is chosen be-tween 70% and 80% R-R interval, respectively. Thecoronary artery tree of each case is segmented usingthe Comprehensive Cardiac Analysis Software (In-telliSpace Portal 9.0, Philips Healthcare, Cleveland,OH, USA) delivering a set of centerline points ~c ∈ Cwith associated information on the lumen contour.Centerline points with a minimum vessel diameterof 1.5 millimeters are utilized for data generation.

2.2. Phantom reference data without artifacts

From each clinical reference case, one binaryphantom mask is extracted which contains the seg-mented lumen contour of the entire vessel tree (seeFigure 1). Ray-driven forward projection (Bippus

et al., 2011) and subsequent high-pass filtering de-livers the projection data required for applicationof the CoMoFACT. The projection geometry, theECG-data and the coronary centerline points areadopted from the corresponding clinical data set.

The phantom study allows one to identify thelimits of deep-learning-based motion estimation ina well-controlled scenario without variation in back-ground intensities, contrast agent density or noiselevel and focuses on variations in the vessel struc-ture comprising different orientations, curvatures,radii and bifurcations.

2.3. Clinical test data with real artifactsTwelve additional clinical cases which belong to

different patients and exhibit real motion artifactsat the coronary arteries are collected for testing pur-poses. Acquisition is performed by a Brilliance iCTscanner using the same scan protocol as in the ref-erence data. The mean heart rates of the patientsHRmean ranged from 57.9 bpm to 83.0 bpm.

3. Methods

CNNs are trained for motion vector estimationin coronary image patches. The required data forsupervised learning is generated using an adaptedversion of the CoMoFACT for simulated motion in-troduction. Subsection 3.1 details the data genera-tion process including synthetic motion vector field(MVF) creation and patch sampling. Data augmen-tation and data separation strategies as well as thesupervised learning setups are described in Section3.2. Finally, the trained CNNs are integrated intoan iterative motion compensation pipeline which in-cludes distance-weighted extrapolation of the pre-dicted motion vectors (see Section 3.3).

3.1. Data generationThe CoMoFACT from Lossau et al. (2019) en-

ables the generation of CT image data with simu-lated and hence controlled motion at the coronaryarteries. It is based on motion compensated filteredback-projection (MC-FBP) (Schafer et al., 2006)taking artifact-free CT images and synthetic MVFsas input. In the MC-FBP, the attenuation coeffi-cient µ(~ν) of each voxel ~ν ∈ Ω in the field of viewΩ ⊂ R3 is calculated by:

µ(~ν) =

tend∫tstart

wAWCR(t, ~ν+~d(t, ~ν)) pfilt(t, ~ν+~d(t, ~ν)) dt

(1)

4

Page 5: Motion Estimation and Correction in Cardiac CT Angiography … · 2019. 6. 24. · temporal resolution of CCTA images is restricted by the angular range required for reconstruction

As part of the filtered back-projection, line-integrals in pfilt are already re-binned to parallelbeam geometry and high-pass filtered with a rampfilter. The projection data pfilt(t, ~ν) indicates thefiltered line-integral which passes through the voxel~ν at time point t. It is multiplied by a weight-ing function wAWCR which includes angular weight-ing for gated reconstruction, pi-partner and aper-ture weighting for normalization of redundant andoblique rays according to (van Stevendaal et al.,2007). As illustrated in Figure 2, line integralsare spatially corrected with respect to the esti-mated voxel displacement ~d(t, ~ν) at each time pointt within the acquisition period [tstart, tend]. Appli-cation of the MC-FBP on CT data with excellentquality using synthetic MVFs reverses the usual ef-fect of motion compensation. Inconsistent projec-tion data is created and motion artifacts are in-duced in the reconstructed CT image volume.

Figure 2: In voxel-driven MC-FBP, line integrals are spa-tially adapted with respect to the input MVF ~d(t, ~ν) whichcontains the voxel displacements between a reference time t0and the time ti each specific projection was acquired.

Synthetic MVF: In principle, arbitrary motiontrajectories can be simulated using this approachby adjusting the synthetic MVF. For simplic-ity, we restrict the model to constant linear mo-tion. Therefore, minor adaptations of the sim-ulated MVF from Lossau et al. (2019) are per-formed. In our CoMoFACT variant, the displace-ment ~d~c : [0%, 100%)× Ω→ R3 of each voxel ~ν attime point tcc ∈ [0%, 100%) in millimeters is calcu-lated by:

~d~c(tcc, ~ν) = s ·m~c(~ν) · ~δ~c(tcc, α) (2)

As described in Lossau et al. (2019), tcc is mea-sured in percent cardiac cycle, s ∈ R+ denotes thetarget motion strength and m~c : Ω → [0, 1] indi-cates a weighting mask which limits the motion

to the area of the currently processed centerlinepoint ~c ∈ Ω. The motion direction is determinedby ~δ~c : [0%, 100%)× (−180, 180] which is adaptedto:

~δ~c(tcc, α) =60bpm

HRmean· ~ρ~c(α)

‖~ρ~c(α)‖2(3)

·

−0.5 if tcc < r − 10%(tcc−r)

20% if r − 10% ≤ tcc ≤ r + 10%

+0.5 if tcc > r + 10%

(4)

The parameter HRmean denotes the patient’s av-erage heart rate during acquisition and r is the ref-erence cardiac phase during AWCR. The motiondirection determined by ~ρ~c(α) is limited to the ax-ial plane, i.e. the z-component is set to zero. Thex- and y-components of ~ρ~c(α) are defined in such away that α corresponds to the angle between meanreconstruction direction of the currently processedcenterline point ~c and the motion direction (see Fig-ure 3). The mean reconstruction direction is de-fined by the gantry rotation angle γmean at the ref-erence heart phase r and is constant for each voxelreconstructed by the same circular scanning shoot.It has to be noted, that the system rotation direc-tion is consistent for all cases. This is important,since the reverse rotational directions would lead toa flipping of the artifact shapes.

Lossau et al. (2019) investigated the feasibility ofmotion artifact recognition and quantification byutilizing the parameter s for target value assign-ment. As an extension of this work, our forwardmodel has one additional (angular) degree of free-dom α, i.e. each MVF is now defined by the pa-rameter tuple (s, α). The target motion strength sscales the length of each displacement vector in theMVF and therefore determines the motion magni-tude. On the basis of the velocity measurementsat the coronary arteries by Vembar et al. (2003),the target motion strength s is limited to the inter-val [0, 10] in all experiments. The newly introducedangle parameter α ∈ (−180, 180] determines thein-plane motion direction. Both parameters s andα are randomly sampled from uniform distributionsin the following experiments. The correspondingCartesian coordinates x = s cos(α) and y = s sin(α)define the ground-truth labels for the supervisedlearning task.

Patch sampling: The extended CoMoFACT en-ables the generation of multiple motion-perturbedCT image volumes with controlled motion level and

5

Page 6: Motion Estimation and Correction in Cardiac CT Angiography … · 2019. 6. 24. · temporal resolution of CCTA images is restricted by the angular range required for reconstruction

𝛼 = -90°

𝛼 = −60°

𝛼 = −30°

𝛼 = 0°

𝛼 = 30°

𝑠 = 0𝑠 = 3𝑠 = 6𝑠 = 9 𝑠 = 9𝑠 = 6𝑠 = 3

𝛼 = 60°

𝛼 = 90°

α

patient

patient𝑥

𝑦

𝑧

mean

patient

𝛾

mean reconstruction direction

motion direction

mean reconstruction direction

𝑥

𝑦

𝑧

𝛼 = -90°

𝛼 = −60°

𝛼 = −30°

𝛼 = 0°

𝛼 = 30°

𝑠 = 0𝑠 = 3𝑠 = 6𝑠 = 9 𝑠 = 9𝑠 = 6𝑠 = 3

𝛼 = 60°

𝛼 = 90°

Figure 3: The x-y plane of an example centerline point is illustrated in phantom (top) and clinical (bottom) mode for varyingparameter settings (s, α). For better visualization, the patches of size 60 × 60 pixels are illustrated as circles. Depending onthe motion angle α, distinct blurring artifacts occur. Orthogonal motion (α = ±90) leads to the most severe banana-shapedartifacts while parallel motion (α = 0 or α = 180) causes bird-shaped blurring. In the clinical data, visibility of blurringartifacts and intensity undershoots are strongly influenced by surrounding background intensities, i.e. artifact types are visuallymore difficult to distinguish.

motion direction at a specific coronary centerlinepoint ~c. For each centerline point ~c and param-eter setting (s, α), one 2.5D image patch I2.5D issampled as input data for supervised learning. As

illustrated in Figure 3, γmean defines the relation be-tween the static patient coordinate system and therotated patch coordinate system. Each 2.5D patchcontains three orthogonal image slices of size 60×60

6

Page 7: Motion Estimation and Correction in Cardiac CT Angiography … · 2019. 6. 24. · temporal resolution of CCTA images is restricted by the angular range required for reconstruction

Notation: , BN = batch normalization, ReLU = Rectified Linear Unitkernel size, stride

1 x

60

x 6

0

16

x 6

0 x

60

16

x 6

0 x

60

16

x 6

0 x

60

32

x 3

0 x

30

32

x 3

0 x

30

64

x 1

5 x

15

64

x 1

5 x

15

64

x 1

x 1

+

64

x 1

x 1

x 3

1 x

60

x 6

0

16

x 6

0 x

60

16

x 6

0 x

60

16

x 6

0 x

60

32

x 3

0 x

30

32

x 3

0 x

30

64

x 1

5 x

15

64

x 1

5 x

15

64

x 1

x 1

1 x

60

x 6

0

16

x 6

0 x

60

16

x 6

0 x

60

16

x 6

0 x

60

32

x 3

0 x

30

32

x 3

0 x

30

64

x 1

5 x

15

64

x 1

5 x

15

64

x 1

x 1

x-y plane

x-z plane

y-z plane

BN + ReLU

BN

ReLU

Conv: 3 x 3, /2

Conv: 3 x 3

Conv: 1 x 1 /2

BN + ReLU

ReLU

BN

Conv: 3 x 3

Conv: 3 x 3ො𝑥ො𝑦

Conv: 3 x 3, /2 ConcatenateAdd

Basic blockConv block

Stride blockGlobal average pooling

Dense +

Figure 4: The orthogonal planes of each 2.5D input patch are processed separately in an individual path. Information aboutin-plane artifact pattern are finally fused in a dense layer with two output neurons to predict x and y. The network comprises523 346 learned weights.

pixels with an image resolution of 0.4 × 0.4 mm2

per pixel, the so-called x-y plane, x-z plane and y-z plane. The centerline point ~c defines the patchcenter and the patch orientation is contingent onthe angular reconstruction range. The z-axis corre-sponds to the scanner’s z-axis, while the x-y plane isspanned by two orthogonal vectors which are con-structed with respect to the mean reconstructiondirection of the centerline point, i.e. a rotation ofthe coordinate system by γmean about the z-axisis performed. By this procedure, the informationabout the angular reconstruction range is embed-ded in the patch orientation.

In both studies, phantom and patient data, theCoMoFACT with subsequent patch sampling is ap-plied 1000 times per reference case, thus, deliveringa total amount of 19 000 samples as database for su-pervised learning. The artifact appearance with re-spect to the relation between motion direction andvessel orientation is illustrated in Figure 1. Figure3 shows distinct blurring artifacts depending on an-gular reconstruction range and motion direction.

3.2. Supervised Learning

Based on the synthetically motion perturbeddata, CNNs are trained for patch-based motion es-timation. The networks take one image patch I2.5D

as input and deliver the predicted predominant mo-tion vector (x, y) as output. Due to the patch simi-larity of adjacent centerline points, the data is case-wise separated for training, validation and testingwith a ratio of 11 : 4 : 4 to avoid a bias. Bythis procedure, robustness of the trained networks

is evaluated with regard to unknown variations inthe vessel geometry and background intensity.

Data augmentation: The data basis during net-work training is extended by online data augmen-tation. Inherent symmetry properties of motion ar-tifacts are exploited for random mirroring of eachaxis in the input patches. The target labels x, y areadapted accordingly. This procedure increases theamount of vessel geometry variations in the phan-tom study and background variations in the pa-tient data study. Additionally, translation is per-formed as label-preserving augmentation strategy.The patch center is randomly shifted in a range of[−10, 10] voxels in x, y and z direction in order tobuild translation invariance into the networks. Thisallows deviation between extracted and actual cen-terline positions. During testing and validation, nomirroring and no translation is performed.

Learning setup: Multiple patch sampling strate-gies (2D, 2.5D and 3D), network architectures andhyperparameter settings were tested by extensivecross-validation. Best generalization capabilitiesare achieved by the CNN visualized in Figure 4which is employed in all subsequent experiments.The x-y plane, the x-z plane and the y-z plane areprocessed separately in an eight-layer ResNet (Heet al., 2016). No weight sharing between the pathsis performed since each plane exhibits individual,characteristic motion artifact pattern. The outputsof the global average pooling are concatenated andinformation are merged in a final dense layer withlinear activation function and two output neuronsto predict x and y.

7

Page 8: Motion Estimation and Correction in Cardiac CT Angiography … · 2019. 6. 24. · temporal resolution of CCTA images is restricted by the angular range required for reconstruction

The learning process is driven by the squared er-ror l = (x− x)2 + (y− y)2. The stochastic gradientdescent solver Adam (Kingma, Diederik and Ba,Jimmy) with an initial learning rate of 0.01, a mini-batch size of 32 and a momentum of 0.8 is used fornetwork optimization. Training is performed over45 epochs while the learning rate halves after every15th epoch. L2 regularization with a weight of 0.001is used. Network training from scratch is performedon the phantom and the clinical data, separately.Furthermore, weight initialization based on previ-ous phantom studies and subsequent fine-tuning onclinical data is investigated. This is motivated bythe recent success of transfer learning approaches(Long et al., 2015; Zeiler and Fergus, 2014) andcomparable with learning to recognize digits beforelearning to read house numbers.Bagging approach: Three ensembles of five

CNNs each are learned using the aforementionednetwork design and hyper-parameter setting by thefollowing bagging approach:

1. Four validation cases and four test cases arerandomly sampled.

2. Network training is performed based on the re-maining eleven clinical cases.

3. After every epoch of the learning process thegeneralization capability is examined by meansof the validation set.

4. The model with the lowest mean euclidean er-ror εx,y on the validation set during 45 epochsof training is selected for calculation of the testmetrics.

5. Steps 1.-4. are performed five times in total.6. Mean and standard deviation of each test met-

ric over the five splits are calculated.7. Steps 5.-6. are performed for the phantom data

and the clinical data with and without networkinitialization based on the learned weights fromthe phantom study (using the same separationsin training, validation and testing for compa-rability).

3.3. Motion compensation pipeline

CNNs trained on the clinical database becomethe main component of the following motion com-pensation pipeline. A cardiac CT image volumewith corresponding set of approximate centerlinepositions C ⊂ R3 and the raw projection data arerequired as input. By patch sampling and subse-quent application of the trained CNNs, a set of mo-

tion vectors ~d~c ∈ R3 for ~c ∈ C is predicted along the

centerline. The estimated motion vectors (x, y, 0)

are back rotated in ~d~c by −γmean about the z-axisand specify the displacement in the static patientcoordinate system. For MC-FBP, these sparse mo-tion information are transformed into a dense mo-

tion vector field ~d(tcc, ~ν) : [0%, 100%) × Ω → R3

by distance-weighted extrapolation (see Figure 5).The estimated voxel displacements are calculatedby:

~d(tcc, ~ν) = mC(~ν)60bpm

HRmean

·

0 if ~ν /∈ Θτ (C)∑~c∈C : ~c∈Θτ (~ν)

w(~ν,~c) ~d~c if ~ν ∈ Θτ (C)

·

0.5 if tcc < r − 10%(r−tcc)

20% if r − 10% ≤ tcc ≤ r + 10%

−0.5 if tcc > r + 10%

(5)

w(~ν,~c) =g(~c− ~ν)∑

~ci∈C : ~ci∈Θτ (~ν)g(~ci − ~ν)

(6)

The bounding volume of a set A ⊂ R3 is definedas Θτ (A) = ~ν | ∃~υ ∈ A : ‖~ν − ~υ‖ < τ. The dis-tance weighting is performed using a 3D Gaussian

kernel g(~c − ~ν) = exp(−‖~c− ~ν‖2 /(2σ2))/√

2πσ23

with σ = 8 (in millimeters). Beside MV extrap-olation, the distance weighting also leads to MVFsmoothing. The weighting mask mC : R3 7→ [0, 1]is generated by dilation of each centerline point inC with a kernel radius of 15 mm and subsequentuniform filtering with a kernel radius of 6.2 mm ac-cording to Lossau et al. (2019). In order to yielda smooth transition to zero, the bounding volumeradius τ is set to 21.2 mm in the following experi-ments. The MVF extrapolation in equation (5) isperformed shot-wise, i.e. for each circular acqui-sition, as motion across different scanning shots isusually not smooth.

Iterative MC: In order to increase the robustnessof the MV estimation, an ensemble of five CNNs(see bagging approach in Section 3.2) is utilizedfor gradual approximation in an iterative fashion.An additional input parameter kmax is introducedwhich indicates the number of alternating MV esti-mation and MC-FBP steps. Algorithm 1 providesthe pseudo code of the iterative CoMPACT MCpipeline.

As an extension, alternative stopping criteria in-

8

Page 9: Motion Estimation and Correction in Cardiac CT Angiography … · 2019. 6. 24. · temporal resolution of CCTA images is restricted by the angular range required for reconstruction

Θ𝑟 Ԧ𝑣

Θ𝑟 𝐶

Ԧ𝑣

Figure 5: For each voxel ~ν ∈ Θτ (C) in the centerline bound-ing volume highlighted in gray, a motion vector is calculatedby scattered data extrapolation. Distance weighting of ad-jacent centerline points ~c ∈ Θτ (~ν) is performed accordingto the equations (5) and (6).

Input: image volume IΩ0 , raw projection data

p, approximate centerline C, numberiterations kmax

Output: improved image volume IΩkmax

set ~d~c = ~0 for ~c ∈ C;for k = 1, 2, .., kmax do

for ~c ∈ C doI2.5D = sample

(IΩk−1,~c

);

~d~c += ensemble(I2.5D

);

end

calculate ~d(tcc, ~ν) according to (5);

IΩk = MC-FBP

(p, ~d(tcc, ~ν)

);

endAlgorithm 1: Iterative MC in CCTA images us-ing patch-based motion estimation.

stead of a fixed number of iterations are conceiv-able. Termination with respect to the convergencebehavior or exploitation of the artifact level quan-tification networks from Lossau et al. (2019) arepossible solutions. The proposed CoMPACT MCpipeline can be either applied on the full coronaryartery tree, on individual segments or on a singlecenterline point. In case of |C| = 1 and accurate

motion estimation, the dense MVF ~d(tcc, ~ν) equals

to ~d−1~c (tcc, ~ν). The duration and the robustness re-

garding outliers can be regulated by the density ofcenterline points.

4. Experiments and Results

The Microsoft Cognitive Toolkit (CNTK v2.5+,Microsoft Research, Redmond, WA, USA) is usedas deep learning framework. In Section 4.1, net-works accuracies are analyzed based on syntheticmotion artifacts. Qualitative performance evalua-tion of the proposed CoMPACT MC pipeline basedon real motion artifacts, observer studies and a run-time analysis are provided in Section 4.2.

4.1. Quantitative analysis on synthetic artifacts

The following error metrics are introduced fornetwork evaluation:

εx,y =√

(x− x)2 + (y − y)2,

ε = | − | , for ∈ s, x, y,εα = min(|α− α| , 360 − |α− α|)

Table 2 summarizes the results on the testingsubsets. As expected, more accurate MV predic-tion is achieved during the phantom study. Fine-tuning shows a slight advantage over network op-timization from scratch. However, during qualita-tive performance evaluation of the proposed CoM-PACT MC pipeline based on real motion artifacts,networks trained from scratch yield better resultsthan fine-tuned ones. The fine-tuned networks de-liver too conservative in clinical practice, i.e. theyseldom venture predictions far from the mean ~0 alsoin the presence of severe artifacts. This could beexplained by possible overfitting on the syntheticartifacts. For this reason, the following quantita-tive error analysis and qualitative experiments areperformed based on the bagging ensemble of thefive networks trained on the clinical data withoutfine-tuning.

Figure 6 illustrates the correlation between theaccuracy of the predicted motion direction and theintroduced motion strength s. High angle errorsεα correlate with low s values, i.e. most accurateprediction of the motion direction is feasible for im-age patches with severe motion artifacts. Figure 7shows the mean confusion matrix of the target mo-tion strength s. The CNNs in the network ensem-ble frequently deliver too conservative predictions.Especially high motion levels tend to be underesti-mated. This network behavior supports the itera-tive MC scheme. Furthermore, a weakness in thedifferentiation of low motion levels s ∈ [0, 3] is ob-servable, which is also difficult for a human observer(see Figure 3).

9

Page 10: Motion Estimation and Correction in Cardiac CT Angiography … · 2019. 6. 24. · temporal resolution of CCTA images is restricted by the angular range required for reconstruction

Table 2: Error metrics on the test cases with synthetic artifacts during the phantom study and the patient data study with and

without fine-tuning (FT). The baseline corresponds to the mean ground truth MV ( ~d~c = ~0) determined in the CoMoFACT.

εx,y εs εx εy εα

Baseline 5.00 5.00 3.24 3.24 NaN∗

Phantom 1.10± 0.12 0.77± 0.09 0.72± 0.05 0.66± 0.11 13.37 ± 1.21

Clinical (no FT) 2.92± 0.13 1.89± 0.07 1.75± 0.10 1.96± 0.07 35.66 ± 1.57

Clinical (with FT) 2.87± 0.16 1.86± 0.11 1.71± 0.08 1.93± 0.13 34.85 ± 2.09

∗ The angle error εα is 90.00 for arbitrary constant ~d~c 6= ~0

Figure 6: Bar plot of mean and median angle error evaluatedfor subsets determined by the selected s ranges.

4.2. Qualitative analysis on real artifacts

Local MC: In the first experiment, we investi-gate how well motion estimation and subsequentcompensation can be done from a single 2.5D im-age patch, i.e. in case of |C| = 1. Centerline pointsfor 24 test patches are manually selected from thetwelve clinical test cases described in Section 2.3at vessel segments of varying position, orientationand artifact level. In Figure 8, the correspondingx-y planes of size 60 × 60 pixels are visualized be-fore and after k ∈ 1, 3, 10 iterations of CoMPACTMC. For comparison, the registration-based MC ap-proach from Grass et al. (2016) is considered whichexploits the entire 3D image field of view (FOV).

The CoMPACT MC shows gradual improvementof the image quality in a multitude of test cases (e.g.in Figure 8c,h,i,k) and sensible convergence prop-erties (except for Figure 8m). Is has to be noted,that the registration-based approach also fails inthe patch of Figure 8m. Indeed, our pipeline hasan advantage over the registration-based approachin several patches, like for instance in Figure 8d,p,t.The networks are robust regarding slight shifts be-tween patch center and vessel position (see Figure8b,e). The main weakness of our method is the

Figure 7: Confusion matrix of the target motion strength s.

restricted motion model complexity. We assumeconstant linear motion in the axial plane which isspatially constant in a local neighborhood. In somecases, these assumptions do not seem to be fulfilled,e.g. in the presence of more complex motion trajec-tories like turning motion (see Figure 8v) and spa-tially varying predominant motion directions (seeFigure 8i,m,q). Nevertheless, the CoMPACT MCis remarkably successful in view of the little infor-mation content obtained from a single 2.5D imagepatch.

Global MC: In the second experiment, we in-vestigate how well motion estimation and subse-quent compensation can be done in clinical prac-tice by application on the whole coronary arterytree. The simultaneous consideration of variouscenterline points also allows for spatially irregu-lar predominant motion directions. Figure 9 showsthe multiplanar reformats of four vessels withoutMC, after 10 iterations of CoMPACT MC and afterregistration-based MC. In Figure 9a,c,d, the corre-

10

Page 11: Motion Estimation and Correction in Cardiac CT Angiography … · 2019. 6. 24. · temporal resolution of CCTA images is restricted by the angular range required for reconstruction

(a)

org

(b)

(c)

(d)

(e)

(f)

(g)

(h)

(i)

(j)

(k)

(l)

k= 1 k= 3 k= 10 reg

(m)

org

(n)

(o)

(p)

(q)

(r)

(s)

(t)

(u)

(v)

(w)

(x)

k= 1 k= 3 k= 10 reg

Figure 8: The x-y planes of 24 image patches belonging to twelve different patients are visualized before (org) and afterk ∈ 1, 3, 10 steps of CoMPACT MC. For comparison, the registration-based MC (reg) is considered. Significantly reducedartifact levels are observable in the majority of the test patches.

sponding centerlines were extracted from the out-put image volume of the registration-based MC us-ing the Comprehensive Cardiac Analysis Software.

The centerline in Figure 9b was determined basedon the original image volume as the registration-based approach leads to increased artifact levels

11

Page 12: Motion Estimation and Correction in Cardiac CT Angiography … · 2019. 6. 24. · temporal resolution of CCTA images is restricted by the angular range required for reconstruction

15.2 45.8 76.2 106.8 137.2 167.8 198.2 228.8

org

15.2 45.8 76.2 106.8 137.2 167.8 198.2 228.8

k=

10

15.2 45.8 76.2 106.8 137.2 167.8 198.2 228.8

reg

(a)

9.4 28.1 46.9 65.6 84.4 103.1 121.9 140.6

org

9.4 28.1 46.9 65.6 84.4 103.1 121.9 140.6

k=

10

9.4 28.1 46.9 65.6 84.4 103.1 121.9 140.6

reg

(b)

18.8 56.2 93.8 131.2 168.8 206.2 243.8 281.2

org

18.8 56.2 93.8 131.2 168.8 206.2 243.8 281.2

k=

10

18.8 56.2 93.8 131.2 168.8 206.2 243.8 281.2

reg

(c)

14.7 44.1 73.4 102.8 132.2 161.6 190.9 220.3

org

14.7 44.1 73.4 102.8 132.2 161.6 190.9 220.3

k=

10

14.7 44.1 73.4 102.8 132.2 161.6 190.9 220.3

reg

(d)

Figure 9: The multiplanar reformats of four vessels belonging to different patients and branches of the right coronary artery(RCA) are visualized before (org) and after k = 10 iterations of CoMPACT MC. Corresponding cross-sectional image patcheswhich are perpendicular to the extracted centerline are given below for visual inspection. Furthermore, registration-based MC(reg) is considered for comparison. Both MC approaches, the registration-based and the proposed CoMPACT MC, lead tosignificant reduction of moderate and severe artifacts along the vessel.

12

Page 13: Motion Estimation and Correction in Cardiac CT Angiography … · 2019. 6. 24. · temporal resolution of CCTA images is restricted by the angular range required for reconstruction

in the distal vessel segment. Significantly reducedartifact levels after CoMPACT MC are observablein all cases, also in the presence of noise (see Fig-ure 9a) or bifurcations (see Figure 9d). In Figure9c, moderate artifacts in the proximal RCA are re-moved while the artifact-free mid and distal vesselsegments captured by the second scanning shot re-main unchanged.Observer studies: Four separate observer stud-

ies were performed to rate cross-sectional imagepatches before MC, after k = 10 iterations ofCoMPACT MC and after registration-based MC.Eight cross-sectional image patches are equidis-tantly sampled along the RCA (as illustrated inFigure 9) from eleven test cases resulting in a to-tal number of 4 · 3 · 8 · 11 = 1056 labeled patches.The twelfth clinical test case was omitted as stacktransition artifacts preclude the automatic coronaryartery tree segmentation by means of the Compre-hensive Cardiac Analysis Software. Rating is per-formed in a five point Likert scale (1: excellent,2: good, 3: mixed, 4: strong artifact, 5: non-diagnostic). Vessel segments are presented in ran-dom order without indication of the underlying al-gorithm (org/k = 10/reg) to the readers. It hasto be noted that the readers were no radiologists,but research scientists with high level of expertise inreading cardiac CT images. The resulting annota-tions are summarized in Figure 10. Mean observerscores of 3.08 ± 0.24, 2.28 ± 0.29 and 2.42 ± 0.23are achieved on image volumes without MC, af-ter k = 10 iterations of CoMPACT MC and afterregistration-based MC. The performed experimentsdemonstrate the generalization capabilities of thetrained neural networks on non-synthetic motionartifacts and a reasonable convergence behavior ofthe iterative MC scheme.

1 2 3 4 5k= 10

1

2

3

4

5

org

38 5 0 0 0

15 57 12 0 0

8 40 39 3 1

9 22 19 19 1

9 23 13 12 7

1 2 3 4 5reg

1

2

3

4

5

org

33 10 0 0 0

12 66 5 1 0

8 35 38 9 1

6 20 17 22 5

7 14 21 13 9

Figure 10: Cross-sectional image patches before and afterMC are rated by four human observers in a five point Likertscale (from 1: excellent to 5: non-diagnostic).

Run-time analysis: Table 3 provides the re-sults of a five-fold run-time analysis performed on

a NVIDIA GeForce GTX 1080 Ti. In case of k = 1,patch sampling is performed with respect to themean reconstruction direction and voxel positionsare cached for faster sampling in subsequent it-erations (k > 1). Motion vector prediction bymeans of the networks, i.e. ensemble application,is the most time-efficient processing step, whereasMVF extrapolation and MC-FBP are quite time-consuming in the current implementation. Accel-eration is possible by parallel processing of the in-dividual scanning shots and restriction of the re-construction region during MC-FBP to the cachedvoxel positions required for patch sampling. Fur-thermore, the run-time is controllable by adjustingthe number of iteration steps kmax and the center-line point density along the vessel.

Table 3: Mean duration of the CoMPACT MC pipeline incase of |C| = 1000 and a FOV of size 512× 512× 300 voxels.

Processing Step Duration [secs]

Patch Sampling k = 1 62.10± 3.61k > 1 11.54± 0.27

Ensemble Application 4.86± 0.03MVF Extrapolation 172.17± 0.87MC-FBP 116.16± 0.16

Total (kmax = 10) 3196.44± 139.39

5. Discussion

We proposed the first single-phase motion es-timation approach which works solely on recon-structed image data. The designed motion modelwhich comprises linear trajectories in the axialplane, reveals potential and limitations of image-based motion estimation. Despite severe simpli-fication of the actual, more complex heart mo-tion, significant artifact reduction is achieved onclinical test data. More complex trajectories (e.g.turning motion) could be determined by perform-ing constant linear motion estimation at multipletime points. Areas around the ostia exhibit 3D ve-locities with a noticeable contribution from the z-component. In contrast, mid- and distal RCA andmid-LCX segments have a dominant axial compo-nent and velocities in these segments are typicallyhigher (Wang et al., 1999; Vembar et al., 2003).The introduced procedure of data generation by theCoMoFACT and subsequent supervised learning is,in principle, extendable to arbitrary non-linear 3Dmotion trajectories. However, the information con-tent of the reconstructed image volumes is a lim-iting factors in model extension to more complex

13

Page 14: Motion Estimation and Correction in Cardiac CT Angiography … · 2019. 6. 24. · temporal resolution of CCTA images is restricted by the angular range required for reconstruction

motion, i.e. motion along the z-axis. Furthermore,performed experiments on network fine-tuning fromphantom to clinical data reveal the problem of po-tential overfitting to synthetic artifacts.

For application of the proposed CoMPACT MCpipeline, the approximate locations of the coronaryarteries have to be known. In case of incompleteor incorrect fully automatic centerline segmenta-tion due to severe motion artifacts, semi-automaticapproaches which enable user-interaction have tobe considered. This requirement constitutes themain disadvantage of patch-based MC in compari-son to registration-based MC. Both approaches leadto significantly reduced artifact levels in the CCTAimages. CoMPACT requires the minimal angularrange of 180 degrees in parallel rebinned geome-try while the registration-based MC demands 315degrees for the partial image reconstructions. Fur-thermore, CoMPACT shows very promising resultsdespite minimal spatial information which enablesfast local processing of a few centerline points andtheir neighborhood.

Patch-based motion estimation offers a lot ofpotential for further research. Additional predic-tion of network uncertainty and integration intothe distance-weighted MVF extrapolation might beuseful in patches with little information content, i.e.in case of low contrast enhancement. So far, theproposed CoMPACT method is merely based on 19clinical data sets. In general, CCTA images are ac-quired with a wide variety of scanner types, imag-ing protocols and reconstruction algorithms. In or-der to increase the network’s robustness, collectionof more data and network fine-tuning is required.The transferability of the proposed CoMPACT MCpipeline to other scanner types and imaging proto-cols should be investigated.

The methodology of first introducing simulatedmotion to clinical cases with excellent quality andsubsequent supervised learning of motion estima-tion models based on the artifact appearance is,in principle, not restricted to contrast-enhancedcoronary arteries. By providing a set of referencecases without motion artifacts, patch-based motionestimation and compensation is on-site trainablefor data of arbitrary contrast protocol and otherparts of the human anatomy. Possible examplesare motion artifact reduction at the aortic valveor correction of calcium scores in non-contrast CT.The information content of the reconstructed im-age patches and overfitting to synthetic artifactsare again potentially limiting factors.

6. Conclusions

Typical coronary artifact patterns are introducedin phantom and clinical data by a forward modelwhich simulates linear, axial motion. The generatedimage data is used for subsequent supervised learn-ing of CNNs for estimation of underlying motionvectors which are integrated into an iterative mo-tion compensation pipeline. Despite variations innoise level, background intensity and contrast agentdensity, CNNs are remarkably successful in patch-based MV estimation on clinical data. The pro-posed CoMPACT MC method furthermore gener-alizes to non-synthetic artifacts and deep-learning-based motion estimation is particularly suitable forMC in clinical cases with severe artifacts.

References

Bhagalia, R., Pack, J.D., Miller, J.V., Iatrou, M., 2012. Non-rigid registration-based coronary artery motion correctionfor cardiac computed tomography. Medical Physics 39,4245–4254.

Bippus, R.D., Koehler, T., Bergner, F., Brendel, B., Hansis,E., Proksa, R., 2011. Projector and backprojector for iter-ative CT reconstruction with blobs using CUDA, in: Fully3D 2011: 11th International Meeting on Fully Three-Dimensional Image Reconstruction in Radiology and Nu-clear Medicine, Potsdam, Germany, 11-15 July 2011.

Brendel, B., Bippus, R., Kabus, S., Grass, M., 2014.Motion compensated backprojection versus backproject-then-warp for motion compensated reconstruction, in:The Third International Conference on Image Formationin X-ray Computed Tomography, Salt Lake City, USA,pp. 169–172.

Budoff, M.J., Li, D., Kazerooni, E.A., Thomas, G.S., Mieres,J.H., Shaw, L.J., 2017. Diagnostic accuracy of nonin-vasive 64-row computed tomographic coronary angiogra-phy (CCTA) compared with myocardial perfusion imag-ing (MPI): the PICTURE study, a prospective multicen-ter trial. Academic Radiology 24, 22–29.

Camargo, G.C., Peclat, T., Souza, A.C., Lima, R.d.S.L.,Gottlieb, I., 2017. Prognostic performance of coronarycomputed tomography angiography in asymptomatic in-dividuals as compared to symptomatic patients with anappropriate indication. Journal of Cardiovascular Com-puted Tomography 11, 148–152.

Elss, T., Nickisch, H., Wissel, T., Bippus, R., Morlock, M.,Grass, M., 2018a. Motion estimation in coronary CTangiography images using convolutional neural networks.Medical Imaging with Deep Learning (MIDL) .

Elss, T., Nickisch, H., Wissel, T., Schmitt, H., Vembar,M., Morlock, M., Grass, M., 2018b. Deep-learning-based CT motion artifact recognition in coronary arter-ies, in: Medical Imaging 2018: Image Processing, Inter-national Society for Optics and Photonics. p. 1057416.doi:10.1117/12.2292882.

Foy, A.J., Dhruva, S.S., Peterson, B., Mandrola, J.M., Mor-gan, D.J., Redberg, R.F., 2017. Coronary computed to-mography angiography vs functional stress testing for pa-

14

Page 15: Motion Estimation and Correction in Cardiac CT Angiography … · 2019. 6. 24. · temporal resolution of CCTA images is restricted by the angular range required for reconstruction

tients with suspected coronary artery disease: A system-atic review and meta-analysis. JAMA Internal Medicine177, 1623–1631. doi:10.1001/jamainternmed.2017.4772.

Ghekiere, O., Salgado, R., Buls, N., Leiner, T., Mancini,I., Vanhoenacker, P., Dendale, P., Nchimi, A., 2017. Im-age quality in coronary CT angiography: challenges andtechnical solutions. The British Journal of Radiology 90,20160567.

Grass, M., Thran, A., Bippus, R., Kabus, S., Wiemker, R.,Vembar, M., Schmitt, H., 2016. Fully automatic cardiacmotion compensation using vessel enhancement, in: Ab-stracts of the 11th Annual Scientific Meeting of the Soci-ety of Cardiovascular Computed Tomography, JCCT.

Hahn, J., Bruder, H., Rohkohl, C., Allmendinger, T., Stier-storfer, K., Flohr, T., Kachelrieß, M., 2017. Motion com-pensation in the region of the coronary arteries based onpartial angle reconstructions from short-scan CT data.Medical Physics 44, 5795–5813.

He, K., Zhang, X., Ren, S., Sun, J., 2016. Deep residuallearning for image recognition, in: Proceedings of theIEEE Conference on Computer Vision and Pattern Recog-nition (CVPR), pp. 770–778.

Hornik, K., Stinchcombe, M., White, H., 1989. Multilayerfeedforward networks are universal approximators. NeuralNetworks 2, 359–366.

Isola, A.A., Grass, M., Niessen, W.J., 2010. Fully auto-matic nonrigid registration-based local motion estimationfor motion-corrected iterative cardiac CT reconstruction.Medical Physics 37, 1093–1109.

Jung, S., Lee, S., Jeon, B., Jang, Y., Chang, H.J., 2018.Deep learning based coronary artery motion artifact com-pensation using style-transfer synthesis in CT images, in:International Workshop on Simulation and Synthesis inMedical Imaging, Springer. pp. 100–110.

Kim, S., Chang, Y., Ra, J.B., 2015. Cardiac motion correc-tion based on partial angle reconstructed images in X-rayCT. Medical Physics 42, 2560–2571.

Kim, S., Chang, Y., Ra, J.B., 2018. Cardiac motion cor-rection for helical CT scan with an ordinary pitch. IEEETransactions on Medical Imaging .

Kingma, Diederik and Ba, Jimmy, 2015. Adam: A methodfor stochastic optimization.

Liu, T., Maurovich-Horvat, P., Mayrhofer, T., Puchner,S.B., Lu, M.T., Ghemigian, K., Kitslaar, P.H., Broersen,A., Pursnani, A., Hoffmann, U., et al., 2017. Quantitativecoronary plaque analysis predicts high-risk plaque mor-phology on coronary computed tomography angiography:results from the ROMICAT II trial. The InternationalJournal of Cardiovascular Imaging , 1–9.

Long, J., Shelhamer, E., Darrell, T., 2015. Fully convolu-tional networks for semantic segmentation, in: Proceed-ings of the IEEE Conference on Computer Vision and Pat-tern Recognition, pp. 3431–3440.

Lossau, T., Nickisch, H., Wissel, T., Bippus, R., Schmitt, H.,Morlock, M., Grass, M., 2019. Motion artifact recognitionand quantification in coronary CT angiography using con-volutional neural networks. Medical Image Analysis 52,68–9.

Ma, H., Gros, E., Baginski, S.G., Laste, Z.R., Kulkarni,N.M., Okerlund, D., Schmidt, T.G., 2018. Automatedquantification and evaluation of motion artifact on coro-nary CT angiography images. Medical Physics .

Petersilka, M., Bruder, H., Krauss, B., Stierstorfer, K.,Flohr, T.G., 2008. Technical principles of dual sourceCT. European Journal of Radiology 68, 362–368.

Rohkohl, C., Bruder, H., Stierstorfer, K., Flohr, T., 2013.Improving best-phase image quality in cardiac CT by mo-tion correction with MAM optimization. Medical Physics40.

Schafer, D., Borgert, J., Rasche, V., Grass, M., 2006.Motion-compensated and gated cone beam filtered back-projection for 3-D rotational X-ray angiography. IEEETransactions on Medical Imaging 25, 898–906.

Schondube, H., Allmendinger, T., Stierstorfer, K., Bruder,H., Flohr, T., 2011. Evaluation of a novel CT imagereconstruction algorithm with enhanced temporal resolu-tion, in: Medical Imaging 2011: Physics of Medical Imag-ing, International Society for Optics and Photonics. p.79611N.

van Stevendaal, U., von Berg, J., Lorenz, C., Grass, M.,2008. A motion-compensated scheme for helical cone-beam reconstruction in cardiac CT angiography. MedicalPhysics 35, 3239–3251.

van Stevendaal, U., Koken, P., Begemann, P.G., Koester,R., Adam, G., Grass, M., 2007. ECG gated circular cone-beam multi-cycle short-scan reconstruction algorithm, in:Medical Imaging 2007: Physics of Medical Imaging, In-ternational Society for Optics and Photonics. p. 65105P.

Tang, Q., Cammin, J., Srivastava, S., Taguchi, K., 2012.A fully four-dimensional, iterative motion estimation andcompensation method for cardiac CT. Medical Physics39, 4291–4305.

Vembar, M., Garcia, M., Heuscher, D., Haberl, R.,Matthews, D., Bohme, G., Greenberg, N., 2003. A dy-namic approach to identifying desired physiological phasesfor cardiac imaging using multislice spiral CT. MedicalPhysics 30, 1683–1693.

Sprem, J., de Vos, B.D., de Jong, P.A., Viergever, M.A.,Isgum, I., 2017. Classification of coronary artery calcifi-cations according to motion artifacts in chest CT using aconvolutional neural network, in: Styner, M.A., Angelini,E.D. (Eds.), Medical Imaging 2018: Image Processing, In-ternational Society for Optics and Photonics. p. 101330R.doi:10.1117/12.2253669.

Wang, Y., Vidan, E., Bergman, G.W., 1999. Cardiac motionof coronary arteries: variability in the rest period and im-plications for coronary MR angiography. Radiology 213,751–758.

Wiemker, R., Klinder, T., Bergtholdt, M., Meetz, K.,Carlsen, I.C., Bulow, T., 2013. A radial structure ten-sor and its use for shape-encoding medical visualizationof tubular and nodular structures. IEEE Transactions onvisualization and computer graphics 19, 353–366.

Zeiler, M.D., Fergus, R., 2014. Visualizing and understand-ing convolutional networks, in: European conference oncomputer vision, Springer. pp. 818–833.

15