OPTIMAL DESIGN IN MEDICAL INVERSION
Lior Horesh, Eldad Haber, Luis Tenorio
IBM TJ Watson Research Center - Toronto, Canada - June 2011
INTRODUCTION
EXPOSITION - INVERSE PROBLEMS
• Aim: infer model $m$
• Given
• Design parameters $y$
• Measurements $d(m; y)$
• Observation model $F(m; y) + \eta = d(m; y)$
• Naïve inversion ... Fails... $\|\hat{m} - m\|_2 = O(1)$
• Cast as an optimization problem
$\hat{m} = \arg\min_m \underbrace{\|F(m; y) - d(y)\|_2^2}_{\text{data fit}} + \underbrace{S(m)}_{\text{regularization}}$
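The failure of naïve inversion can be illustrated with a small numerical sketch. Everything specific here is an assumption made for illustration and is not taken from the slides: a linear Gaussian-blur matrix stands in for $F$, the model is a box function, and the regularizer is the Tikhonov choice $S(m) = \alpha \|m\|_2^2$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
x = np.linspace(0.0, 1.0, n)

# Assumed toy observation operator: a row-normalized Gaussian blur (ill-conditioned).
F = np.exp(-((x[:, None] - x[None, :]) ** 2) / (2 * 0.03**2))
F /= F.sum(axis=1, keepdims=True)

m_true = (np.abs(x - 0.5) < 0.2).astype(float)   # box-shaped model
eta = 1e-3 * rng.standard_normal(n)              # measurement noise
d = F @ m_true + eta

# Naive inversion: solve F m = d directly, so the noise is amplified.
m_naive = np.linalg.solve(F, d)

# Regularized inversion with the assumed penalty S(m) = alpha * ||m||^2.
alpha = 1e-3
m_reg = np.linalg.solve(F.T @ F + alpha * np.eye(n), F.T @ d)

err_naive = np.linalg.norm(m_naive - m_true)
err_reg = np.linalg.norm(m_reg - m_true)
print(f"naive error: {err_naive:.3g}, regularized error: {err_reg:.3g}")
```

The regularization weight `alpha` is hand-picked here; choosing the regularizer in a principled way is the subject of Part I.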
HOW TO IMPROVE MODEL RECOVERY ?
• How can we ...
• Improve observation model ?
• Use more meaningful a-priori information ?
• Extract more information in the measurement procedure ?
$\hat{m} = \arg\min_m \underbrace{\|F(m; y) - d(y)\|_2^2}_{\text{data fit}} + \underbrace{S(m)}_{\text{regularization}}$
• Provide more efficient optimization schemes ?
PART I - REGULARIZATION DESIGN
REGULARIZATION DESIGN - BACKGROUND
REGULARIZATION APPROACHES
• Why is regularization necessary ?
• Imposes a-priori information
• Stabilizes the inversion process
• Provides a unique solution
• Two approaches
• Explicit
• Sparse representation
$\hat{m} = \arg\min_m \underbrace{\|F(m; y) - d(y)\|_2^2}_{\text{data fit}} + \underbrace{S(m)}_{\text{regularization}}$
HOW TO REPRESENT SPARSELY ?
• Principle of parsimony → True model can be represented by a small number of parameters
• Each column of $D$ is a prototype model ⇔ atom
• Sparse representation vector $u$
[Figure: standard representation $m = Du$, with over-complete dictionary $D$ and sparse vector $u$ with entries $u_i$; dimension labels $l$ and $p$]
SPARSE REPRESENTATION
• Ideally sparsest solution achieved by $\ell_0$-'norm' penalty
• Non-convex → NP-hard combinatorial problem
• Instead employ $\ell_1$-norm (Donoho 2006)
$\hat{u} = \arg\min_u \|F(Du; y) - d(y)\|_2^2 + \alpha \|u\|_1$
$\hat{m} = \arg\min_m \underbrace{\|F(m; y) - d(y)\|_2^2}_{\text{data misfit}} + \underbrace{S(m)}_{\text{regularization}}$
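A minimal sketch of the $\ell_1$-penalized problem for a linear observation operator, solved by iterative soft thresholding (ISTA). ISTA is a swapped-in textbook solver chosen for brevity, not a method named in the slides; the random matrices `F` and `D`, the sparsity pattern of `u_true` and the value of `alpha` are all hypothetical.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (componentwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def objective(A, d, u, alpha):
    """||A u - d||_2^2 + alpha * ||u||_1."""
    return np.sum((A @ u - d) ** 2) + alpha * np.sum(np.abs(u))

def ista(A, d, alpha, n_iter=500):
    """Minimize ||A u - d||_2^2 + alpha ||u||_1 by iterative soft thresholding."""
    lip = 2.0 * np.linalg.norm(A, 2) ** 2      # Lipschitz constant of the data-fit gradient
    u = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = 2.0 * A.T @ (A @ u - d)
        u = soft_threshold(u - grad / lip, alpha / lip)
    return u

# Hypothetical linear setting: F(Du) = F @ D @ u with random F and D.
rng = np.random.default_rng(1)
F = rng.standard_normal((30, 40)) / np.sqrt(30)
D = rng.standard_normal((40, 80)) / np.sqrt(40)    # over-complete: 80 atoms for 40 unknowns
u_true = np.zeros(80)
u_true[[3, 17, 42]] = [1.5, -2.0, 1.0]
d = F @ D @ u_true + 0.01 * rng.standard_normal(30)

A = F @ D
u_hat = ista(A, d, alpha=0.05)
m_hat = D @ u_hat                                  # recovered model m = D u
print("nonzeros in u_hat:", np.count_nonzero(u_hat))
```

With step size `1 / lip` the objective cannot increase from one iteration to the next, which makes the loop safe to run without a line search.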
SPARSE REPRESENTATION PERFORMANCE
[Figure panels: $m$, $d$, PSF, Smoothness, Energy, Total Variation, Sparse Representation]
SPARSE REPRESENTATION PERFORMANCE - DIFFERENT OPERATORS
[Figure (Fadili et al 2007): true model $m_T$; data from two operators, $F_1(m) + \eta = d_1$ and $F_2(m) + \eta = d_2$; reconstructions $D_1 u_1$ and $D_2 u_2$ for each operator]
SPARSE REPRESENTATION PERFORMANCE - DIFFERENT DICTIONARIES
[Figure: model $m$ and data $d$, with two reconstructions:
Lanczos Hybrid Bidiagonalization Regularization (HyBR), singular vectors (Chung, Nagy, O'Leary 2006);
Gradient Projection Sparse Representation (GPSR), wavelets (Figueiredo, Nowak, Wright 2007)]
IMPLICIT REGULARIZATION - RATIONALE
• Sometimes sparse representation performs well, sometimes not...
Why?
• Model and operator dependent
• Some dictionaries perform better than others for specific problems
• $D$ should be chosen such that it sparsifies the representations
• One approach: choose from a known set of transforms (Steerable wavelet, Curvelet, Contourlets, Bandlets, Singular vectors…)
Examples of $D$:
Local DCT: locally oscillatory, stationary texture
Wavelets: piecewise smooth, isotropic structures
Curvelets: piecewise smooth with $C^2$ contours
• Objective vs. subjective function
• Heuristic choice of regularization functional based on ad-hoc assumptions
• Solutions inherently depend on the (subjective) choice of regularization functional
• Adaptability - account for the problem’s statistics (model, operator and noise)
• Efficiency and precision - use the right jargon to express a message/model
IMPLICIT REGULARIZATION BY DICTIONARY DESIGN
How to construct more objective regularization functionals ?
Design a dictionary by learning from authentic examples $\{m_1, \ldots, m_s\}$
DICTIONARY DESIGN - PREVIOUS WORK
• Approximated Maximum Likelihood (Olshausen & Field 1996, 1997 )
• Overcomplete ICA (Lewicki 2000 )
• Method of Optimal Directions (Engan et al 2001, 2005 )
• Sparse Bayesian Learning (Girolami 2001, Wipf 2005 )
• FOCUSS (Delgado et al 2003 ) - Bayesian MAP & relative complexity
• K-SVD (Aharon & Elad 2006 )
• FOCUSS+ (Murray & Delgado 2007 )
• But: all addressed sparse coding → observation operator was identity, $F = I$
REGULARIZATION LEARNING – STATISTICAL MERIT
DICTIONARY LEARNING - OPTIMALITY CRITERION
• Loss
$\mathcal{L}(m, D) := \|D\,\hat{u}(d(m, \eta), D) - m\|_2^2$
← Depends on the noise $\eta$
← Depends on an unknown model $m$
• Mean Square Error
$\mathrm{MSE}(m, D) := \mathbb{E}_{\eta} \|D\,\hat{u}(d(m, \eta), D) - m\|_2^2$
← Depends on an unknown model $m$
DICTIONARY LEARNING - OPTIMALITY CRITERION
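• Bayes risk
$\mathcal{R}_{\text{true}}(D) := \mathbb{E}_{m,\eta} \|D\,\hat{u}(m, D) - m\|_2^2, \quad m \in \mathcal{M}$
← Computationally infeasible
• Bayes empirical risk
• Assume a set of feasible authentic model examples $\{m_1, \ldots, m_s\}$ is available
$\mathcal{R}_{\text{empirical}}(D) = \frac{1}{s} \sum_{i=1}^{s} \mathbb{E}_{\eta} \|D\,\hat{u}_i(m_i, D) - m_i\|_2^2$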
REGULARIZATION LEARNING – OPTIMIZATION FRAMEWORK
OVER-COMPLETE DICTIONARY DESIGN - FORMULATION
• Bi-level optimization problem (Horesh & Haber 2009)
$\hat{D} = \arg\min_D \frac{1}{s} \sum_{i=1}^{s} \mathbb{E}_{\eta} \|D\,\hat{u}_i(m_i, D) - m_i\|_2^2$
$\text{s.t. } \hat{u}_i = \arg\min_{u_i} \|F(D u_i; y) - d_i(y)\|_2^2 + \alpha \|u_i\|_1$
• Non-smooth $\ell_1$-norm is replaced by a smooth optimization problem with inequality constraints: $u = p - q$, $p, q \ge 0$
• Sensitivity by differentiating the necessary conditions of the decomposition
$\frac{\partial p_I}{\partial D_k} = f(F, D, p), \quad \frac{\partial q_J}{\partial D_k} = g(F, D, p)$
• Non-smooth optimization framework → Modified L-BFGS (Overton 2003 )
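The replacement of the non-smooth $\ell_1$ term can be written out explicitly. What follows is the standard positive/negative splitting, given here as an assumption about what the slide abbreviates by $u = p - q$; the symbol $e$ (the vector of ones) is introduced only for this note.

```latex
\min_{u}\ \|F(Du; y) - d(y)\|_2^2 + \alpha \|u\|_1
\quad\Longleftrightarrow\quad
\min_{p \ge 0,\ q \ge 0}\ \|F(D(p - q); y) - d(y)\|_2^2 + \alpha\, e^{\top}(p + q)
```

For $\alpha > 0$, at a minimizer at most one of $p_k$, $q_k$ is nonzero for each $k$, since otherwise both could be reduced by the same amount without changing $u$. Hence $e^{\top}(p + q) = \|u\|_1$ there, and the objective is smooth in $(p, q)$ whenever $F$ is.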
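REGULARIZATION DESIGN – NUMERICAL RESULTS
DICTIONARY DESIGN - TRAINING SET
Horesh & Haber 2009
DICTIONARY DESIGN - COMPARISON
[Figure (Horesh & Haber 2009): true model $m_T$; data $F(m_T) + \eta = d$; reconstructions $D_0 u_0$, $m_{TV}$ and $D_t u_t$]
DICTIONARY LEARNING – ASSESSMENT WITH NOISE
[Figure (Horesh & Haber 2009): $m_{true}$ and data $F(m) + \eta = d$ at noise levels 0.1%, 1% and 5%; reconstructions $D_0 u_0$, $m_{TV}$ and $D_t u_t$]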
PART II - OPTIMAL EXPERIMENTAL DESIGN
OPTIMAL EXPERIMENTAL DESIGN - MOTIVATION
MOTIVATION – LIMITED ANGLE TOMOGRAPHY
Clerbout 2000
MOTIVATION – DIFFUSE OPTICAL TOMOGRAPHY
MOTIVATION – ULTRASOUND IMAGING
Clerbout 2000
DESIGN EXPERIMENTAL LAYOUT
Stonehenge, 2500 B.C.
DESIGN EXPERIMENTAL PROCESS
Galileo Galilei, 1564-1642
RESPECT EXPERIMENTAL CONSTRAINTS…
French nuclear test, Mururoa, 1970
OPTIMAL EXPERIMENTAL DESIGN - BACKGROUND
• Previous work
• Well-posed problems - well established (Fedorov 1997, Pukelsheim 2006 )
• Ill-posed problems - under-researched (Curtis 1999, Bardow 2008 )
• Many practical problems in engineering and sciences are ill-posed (under-determined)
What makes non-linear ill-posed problems so special ?
ILL VS. WELL-POSED OPTIMAL EXPERIMENTAL DESIGN
$\hat{m} = \arg\min_m \underbrace{\|F(m; y) - d(y)\|_2^2}_{\text{data fit}} + \underbrace{S(m)}_{\text{regularization}}$
OPTIMALITY CRITERIA IN OVER-DETERMINED PROBLEMS
• For linear inversion, employ Tikhonov regularized least squares solution
$\hat{m} = \big(\underbrace{J^\top J + \alpha L^\top L}_{C(y)}\big)^{-1} J^\top d, \quad J \equiv \frac{\partial F(m, y)}{\partial m}$
• Bias - variance decomposition
$\mathbb{E}\|\hat{m} - m\|_2^2 = \underbrace{\sigma^2\,\mathrm{trace}\big(J\,C(y)^{-2} J^\top\big)}_{\text{variance}} + \underbrace{\alpha^2 \|C(y)^{-1} L^\top L\, m\|_2^2}_{\text{bias}}$
• For over-determined problems $\alpha = 0$
• A-optimal design problem
$\min_y \mathrm{trace}\big(C(y)^{-1}\big)$
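The bias-variance decomposition of the Tikhonov estimate can be checked numerically. The sketch below assumes a random linear operator `J`, `L` equal to the identity and i.i.d. Gaussian noise with standard deviation `sigma`; these choices are illustrative and are not taken from the slides. It compares the closed-form mean squared error with a Monte Carlo average.

```python
import numpy as np

rng = np.random.default_rng(0)
n_data, n_model = 30, 10
J = rng.standard_normal((n_data, n_model))   # hypothetical linear operator
L = np.eye(n_model)                          # assumed regularization operator
alpha, sigma = 0.5, 0.1
m = rng.standard_normal(n_model)             # the (normally unknown) true model

C = J.T @ J + alpha * (L.T @ L)
C_inv = np.linalg.inv(C)

# Closed-form terms of the decomposition.
variance = sigma**2 * np.trace(J @ C_inv @ C_inv @ J.T)
bias_sq = alpha**2 * np.linalg.norm(C_inv @ (L.T @ L) @ m) ** 2
mse_formula = variance + bias_sq

# Monte Carlo estimate of E||m_hat - m||^2 over noise realizations.
n_trials = 20000
noise = sigma * rng.standard_normal((n_trials, n_data))
data = (J @ m)[None, :] + noise              # one data vector per row
m_hat = data @ J @ C_inv                     # each row is (C^{-1} J^T d)^T, C symmetric
mse_mc = np.mean(np.sum((m_hat - m) ** 2, axis=1))

print(f"formula: {mse_formula:.5f}, Monte Carlo: {mse_mc:.5f}")
```

The bias term depends on the unknown model `m`, which is why the decomposition alone does not yield a usable design criterion once $\alpha > 0$.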
OPTIMALITY CRITERIA IN OVER-DETERMINED PROBLEMS
• Optimality criteria of the information matrix $C(y)^{-1}$
• A-optimal design ⇔ average variance: $\min_y \mathrm{trace}\big(C(y)^{-1}\big)$
• D-optimality ⇔ uncertainty ellipsoid: $\min_y \det\big(C(y)^{-1}\big)$
• E-optimality ⇔ minimax: $\min_y \max \mathrm{eig}\big(C(y)^{-1}\big)$
• Almost a complete alphabet…
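The three criteria can be evaluated from the eigenvalues of $C(y)$. In this sketch a design is assumed to be a choice of rows of a hypothetical sensitivity matrix `J_all`; the function name `design_criteria` and the random test data are illustrative only.

```python
import numpy as np

def design_criteria(J, L, alpha):
    """A-, D- and E-criteria of C^{-1}, with C = J^T J + alpha L^T L."""
    C = J.T @ J + alpha * (L.T @ L)
    inv_eigs = 1.0 / np.linalg.eigvalsh(C)     # eigenvalues of C^{-1}
    return {"A": inv_eigs.sum(), "D": inv_eigs.prod(), "E": inv_eigs.max()}

rng = np.random.default_rng(0)
J_all = rng.standard_normal((40, 8))    # hypothetical sensitivities of 40 candidate measurements
L = np.eye(8)
alpha = 0.0                             # over-determined case

subset = design_criteria(J_all[:12], L, alpha)   # design using the first 12 measurements
full = design_criteria(J_all, L, alpha)          # design using all 40 measurements
print("subset:", subset)
print("full:  ", full)
```

Adding measurements can only increase $C$ in the positive semidefinite ordering, so none of the three criteria gets worse. The design question is which few measurements buy the most.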
THE PROBLEM...
• For non-linear ill-posed problems → none of these apply !
• Non-linearity → bias-variance decomposition is impossible
• Ill-posedness → controlling the variance alone reduces the error only mildly
What strategy can be used ?
Proposition 1 - Common practice so far
Trial and Error…
EXPERIMENTAL DESIGN BY TRIAL AND ERROR
• Pick a model
• Run observation model of different experimental designs, and get data
• Invert and compare recovered models
• Choose the experimental design that provides the best model recovery
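[Figure: true model $m_T$; designs $y_1$ and $y_2$ with data $F(m_T, y_1) + \eta = d_1$ and $F(m_T, y_2) + \eta = d_2$; recovered models $\hat{m}_1$ and $\hat{m}_2$]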
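The four trial-and-error steps can be sketched for a linear toy problem. All specifics are assumptions for illustration: a design is a subset of rows of a hypothetical matrix `J_all`, the picked model `m_T` is a half sine wave, and the inversion is Tikhonov-regularized least squares.

```python
import numpy as np

rng = np.random.default_rng(0)
n_model = 20
J_all = rng.standard_normal((40, n_model))       # all candidate measurements
m_T = np.sin(np.linspace(0.0, np.pi, n_model))   # step 1: pick a model
alpha, sigma = 1e-2, 1e-2

def invert(J, d, alpha):
    """Tikhonov-regularized least squares (assumed inversion scheme)."""
    return np.linalg.solve(J.T @ J + alpha * np.eye(J.shape[1]), J.T @ d)

# Two hypothetical candidate designs, each a set of measurement indices.
designs = {"y1": np.arange(0, 10), "y2": np.arange(0, 40, 2)}

errors = {}
for name, rows in designs.items():
    J = J_all[rows]
    d = J @ m_T + sigma * rng.standard_normal(len(rows))     # step 2: simulate data
    errors[name] = np.linalg.norm(invert(J, d, alpha) - m_T)  # step 3: invert and compare

best = min(errors, key=errors.get)                            # step 4: choose the best design
print(errors, "best:", best)
```

The outcome rests on a single model and a single noise draw, so the chosen design need not be good for other models.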
THE PROBLEM...
• For non-linear ill-posed problems → none of these apply !
• Non-linearity → bias-variance decomposition is impossible
• Ill-posedness → controlling the variance alone reduces the error only mildly
What other strategy can be used ?
Proposition 2 - Minimize bias and variance altogether by some optimality criterion
How to define the optimality criterion ?
Horesh, Haber & Tenorio 2010
OPTIMAL EXPERIMENTAL DESIGN - STATISTICAL MERIT
OPTIMALITY CRITERION
• Loss
$\mathcal{L}(m, y) := \|\hat{m}(y) - m\|_2^2$
← Depends on the noise $\eta$
← Depends on an unknown model $m$
• Mean Squared Error
$\mathrm{MSE}(m, y) := \mathbb{E}_{\eta} \|\hat{m}(y) - m\|_2^2$
← Depends on an unknown model $m$
OPTIMALITY CRITERION
• Bayes risk
$\mathcal{R}_{\text{true risk}}(y) := \mathbb{E}_{m,\eta} \|\hat{m}(y) - m\|_2^2, \quad m \in \mathcal{M}$
← Computationally infeasible
• Bayes empirical risk
• Assume a set of feasible authentic model examples $\{m_1, \ldots, m_s\}$ is available
$\mathcal{R}(y, m) := \frac{1}{sk} \sum_{i,j=1}^{k,s} \|\hat{m}_{ij}(y) - m_j\|_2^2$
How can $y$ be regularized ?
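A minimal sketch of the Bayes empirical risk for a linear problem, averaging the recovery error over $s$ training models and $k$ noise realizations. The function name `empirical_risk`, the row-subset notion of a design and the Tikhonov inversion are assumptions made for illustration.

```python
import numpy as np

def empirical_risk(J_all, rows, models, noises, alpha):
    """(1/(s k)) sum_{i,j} ||m_hat_ij(y) - m_j||_2^2 for the design y = rows."""
    J = J_all[rows]
    C = J.T @ J + alpha * np.eye(J.shape[1])
    total = 0.0
    for m_j in models:                       # j = 1..s training models
        for eta_i in noises:                 # i = 1..k noise realizations
            d_ij = J @ m_j + eta_i[rows]
            m_hat_ij = np.linalg.solve(C, J.T @ d_ij)
            total += np.sum((m_hat_ij - m_j) ** 2)
    return total / (len(models) * len(noises))

rng = np.random.default_rng(0)
J_all = rng.standard_normal((40, 20))
models = rng.standard_normal((5, 20))            # s = 5 hypothetical training models
noises = 0.01 * rng.standard_normal((10, 40))    # k = 10 noise realizations

for name, rows in {"y1": np.arange(10), "y2": np.arange(0, 40, 2)}.items():
    print(name, empirical_risk(J_all, rows, models, noises, alpha=1e-2))
```

Unlike the single-model trial-and-error comparison, this quantity scores a design against the whole training set, so it can be used as an objective to optimize over $y$.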
OPTIMALITY CRITERION – SPARSITY CONTROLLED DESIGN
• Regularized empirical risk - Direct density penalty for activation
• Assume: fixed number of experiments
• Let $y = \{Q, V\}$
• The data
$d(m; \underbrace{V, Q}_{y}) = V^\top A(m)^{-1} Q + \eta = F(m, y) + \eta$
• Regularized risk
$\mathcal{R}_{\text{reg}}(y, m) := \frac{1}{sk} \sum_{i,j=1}^{k,s} \|\hat{m}_{ij}(y) - m_j\|_2^2 + \beta \|y\|_p$
[Figure: designs $y_1$ and $y_2$]
Horesh, Haber & Tenorio 2011
OPTIMALITY CRITERION – SPARSITY CONTROLLED DESIGN
• Regularized empirical risk - Direct approach
• Total number of experiments may be large
• Effective when activation of each source and receiver is expensive
• Derivatives of the forward operator w.r.t. $y$: difficult…
Horesh, Haber & Tenorio 2011
OPTIMALITY CRITERION – SPARSITY CONTROLLED DESIGN
• Regularized empirical risk - Weights formulation
• Density penalty over selected experiments from a predefined set
[Figure: predefined set of candidate experiments $F(m; y_1), F(m; y_2), F(m; y_3), \ldots, F(m; y_{1000})$, from which a subset such as $F(m; y_1), F(m; y_2), F(m; y_{999})$ is selected]
Haber, Horesh & Tenorio 2010
OPTIMALITY CRITERION – SPARSITY CONTROLLED DESIGN
• Regularized empirical risk - Weights formulation
• Let $\{y_1, \ldots, y_s\}$ be a discretization of the space $Y$
• Let $F = \begin{Bmatrix} F(y_1) \\ \vdots \\ F(y_s) \end{Bmatrix}$
• The observation operator is weighted
$W^{1/2}\big(F(m) + \eta\big) = W^{1/2} d(m), \quad W^{1/2} = \mathrm{diag}(w)^{1/2}$
• If experiment $i$ is not conducted, $w_i = 0$
$\mathcal{R}_{\text{reg}}(w, m) := \frac{1}{sk} \sum_{i,j=1}^{k,s} \|\hat{m}_{ij}(w) - m_j\|_2^2 + \beta \|w\|_p, \quad w \ge 0$
Haber, Horesh & Tenorio 2010
OPTIMALITY CRITERION – SPARSITY CONTROLLED DESIGN
• Regularized empirical risk - Weights formulation
• Suitable when conducting each experiment is costly
• Source and receiver activation may be highly populated
• Fewer degrees of freedom (DOF)
• No explicit access to the observation operator needed
Haber, Horesh & Tenorio 2010
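A small sketch of why the weights formulation only needs the stacked operator: for a linear problem, setting $w_i = 0$ gives the same estimate as deleting experiment $i$. The function name `weighted_inversion`, the Tikhonov penalty and the random data are assumptions made for illustration.

```python
import numpy as np

def weighted_inversion(J, d, w, alpha):
    """argmin_m ||W^{1/2}(J m - d)||_2^2 + alpha ||m||_2^2 with W = diag(w)."""
    JtW = J.T * w                                   # J^T W via broadcasting
    return np.linalg.solve(JtW @ J + alpha * np.eye(J.shape[1]), JtW @ d)

rng = np.random.default_rng(0)
J = rng.standard_normal((12, 5))                    # 12 candidate experiments, 5 unknowns
d = J @ rng.standard_normal(5) + 0.01 * rng.standard_normal(12)

w = np.zeros(12)
keep = [0, 3, 4, 7, 8, 10, 11]
w[keep] = 1.0                                       # w_i = 0 switches experiment i off

m_weighted = weighted_inversion(J, d, w, alpha=1e-2)
m_subset = weighted_inversion(J[keep], d[keep], np.ones(len(keep)), alpha=1e-2)
print("same estimate:", np.allclose(m_weighted, m_subset))
```

Because $w$ enters only through the diagonal matrix $W$, the design variables are continuous and a sparsity penalty on $w$ can select experiments without differentiating the forward operator with respect to $y$.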
OPTIMAL EXPERIMENTAL DESIGN - OPTIMIZATION FRAMEWORK
THE OPTIMIZATION PROBLEMS
• Direct formulation (Horesh, Haber & Tenorio 2011)
$\min_y \frac{1}{sk} \sum_{i,j=1}^{k,s} \|\hat{m}_{ij} - m_j\|_2^2 + \beta \|y\|_p$
$\text{s.t. } \hat{m}_{ij} = \arg\min_m \|F(m_j; y) + \eta_i - F(m; y)\|_2^2 + S(m)$
• Weights formulation (Haber, Horesh & Tenorio 2010)
$\min_w \frac{1}{sk} \sum_{i,j=1}^{k,s} \|\hat{m}_{ij} - m_j\|_2^2 + \beta \|w\|_p$
$\text{s.t. } \hat{m}_{ij} = \arg\min_m \|W^{1/2}\big(F(m_j) + \eta_i - F(m)\big)\|_2^2 + S(m), \quad w \ge 0$
THE OPTIMIZATION PROBLEM
• Bi-level optimization problem (Haber, Horesh & Tenorio 2010)
$\min_w \frac{1}{sk} \sum_{i,j=1}^{k,s} \|\hat{m}_{ij} - m_j\|_2^2 + \beta \|w\|_p$
$\text{s.t. } \hat{m}_{ij} = \arg\min_m \|W^{1/2}\big(F(m_j) + \eta_i - F(m)\big)\|_2^2 + S(m), \quad w \ge 0$
• Assuming the lower optimization level is:
• Convex with a well defined minimum
• With no inequality constraints
• The lower level is then replaced by its necessary condition (constant factors absorbed into $S$)
$\min_w \frac{1}{sk} \sum_{i,j=1}^{k,s} \|\hat{m}_{ij} - m_j\|_2^2 + \beta \|w\|_p$
$\text{s.t. } c_{ij} \equiv c(\hat{m}_{ij}, w) = J(\hat{m}_{ij})^\top W \big(F(\hat{m}_{ij}) - F(m_j) - \eta_i\big) + S'(\hat{m}_{ij}) = 0, \quad w \ge 0$
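THE OPTIMIZATION PROBLEM
• $m$ is eliminated from the equations and viewed as a function of $w$
• Compute gradient by implicit differentiation, with $d_{ij} = F(m_j) + \eta_i$ and $K_{ij}$ collecting the second-derivative terms of $F$
$\frac{\partial c_{ij}}{\partial m_{ij}} = J(m_{ij})^\top W J(m_{ij}) + S''(m_{ij}) + K_{ij}$
$\frac{\partial c_{ij}}{\partial w} = J(m_{ij})^\top \mathrm{diag}\big(F(m_{ij}) - d_{ij}\big)$
• The sensitivity
$M_{ij} := \frac{\partial m_{ij}}{\partial w} = -\left(\frac{\partial c_{ij}}{\partial m_{ij}}\right)^{-1} \frac{\partial c_{ij}}{\partial w}$
• The reduced gradient (for $p = 1$, with $e$ the vector of ones)
$\nabla_w \mathcal{R}\big(m(w), w\big) = \frac{2}{sk} \sum_{i,j} M_{ij}^\top (m_{ij} - m_j) + \beta e$
Haber, Horesh & Tenorio 2010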
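The implicit-differentiation gradient can be verified on a linear toy problem. The sketch assumes a linear operator $F(m) = Jm$, a lower level of the form $\tfrac12\|W^{1/2}(Jm - d)\|_2^2 + \tfrac{\alpha}{2}\|m\|_2^2$ (so that $c = J^\top W (Jm - d) + \alpha m$ and $K_{ij} = 0$), and $p = 1$. These choices and all names are illustrative, and the result is compared against finite differences.

```python
import numpy as np

def risk_and_gradient(w, J, models, noises, alpha, beta):
    """Regularized empirical risk (p = 1) and its reduced gradient w.r.t. w."""
    s, k = len(models), len(noises)
    risk = beta * np.sum(w)                  # beta * ||w||_1 for w >= 0
    grad = beta * np.ones_like(w)            # beta * e
    C = J.T @ (w[:, None] * J) + alpha * np.eye(J.shape[1])   # dc/dm
    for m_j in models:
        for eta_i in noises:
            d_ij = J @ m_j + eta_i
            m_ij = np.linalg.solve(C, J.T @ (w * d_ij))       # solves c(m, w) = 0
            err = m_ij - m_j
            risk += err @ err / (s * k)
            # M_ij = -C^{-1} J^T diag(J m_ij - d_ij), hence
            # M_ij^T err = -(J m_ij - d_ij) * (J C^{-1} err), using symmetry of C.
            resid = J @ m_ij - d_ij
            grad += (2.0 / (s * k)) * (-resid * (J @ np.linalg.solve(C, err)))
    return risk, grad

rng = np.random.default_rng(0)
J = rng.standard_normal((8, 5))
models = rng.standard_normal((3, 5))
noises = 0.05 * rng.standard_normal((4, 8))
w = rng.uniform(0.5, 1.5, size=8)
args = (J, models, noises, 0.1, 0.01)

risk, grad = risk_and_gradient(w, *args)

# Central finite differences as an independent check of the reduced gradient.
h = 1e-6
fd = np.array([
    (risk_and_gradient(w + h * e_l, *args)[0] - risk_and_gradient(w - h * e_l, *args)[0]) / (2 * h)
    for e_l in np.eye(8)
])
print("max gradient discrepancy:", np.max(np.abs(grad - fd)))
```

Each gradient evaluation costs one extra linear solve with $\partial c / \partial m$ per training pair, and the sensitivity matrix $M_{ij}$ is never formed explicitly.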
OPTIMAL EXPERIMENTAL DESIGN – NUMERICAL STUDIES
IMPEDANCE TOMOGRAPHY – OBSERVATION MODEL
• Governing equations
$\nabla \cdot (m \nabla u) = 0 \text{ in } \Omega, \quad \text{B.C. on } \partial\Omega$
• Following Finite Element discretization
$A(m)\,u = Q$
• Given model $m \in \mathbb{R}^k$ and design settings $y$
• Find data $d \in \mathbb{R}^n$, $n < k$
$d(m; \underbrace{V, Q}_{y}) = V^\top A(m)^{-1} Q + \eta = F(m, y) + \eta$
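The structure $d = V^\top A(m)^{-1} Q$ can be sketched on a one-dimensional chain of cells. This is a deliberately simplified stand-in for the finite element model: the difference-operator discretization, the grounding of node 0 as a boundary treatment, and the function name `forward` are all assumptions, and noise is omitted.

```python
import numpy as np

def forward(m, Q, V, ground=1.0):
    """d = V^T A(m)^{-1} Q for a 1D chain of cells with conductivities m."""
    n_nodes = m.size + 1
    G = np.diff(np.eye(n_nodes), axis=0)     # nodal difference operator, shape (n_nodes-1, n_nodes)
    A = G.T @ (m[:, None] * G)               # discrete version of -div(m grad u)
    A[0, 0] += ground                        # assumed boundary treatment: node 0 is grounded
    U = np.linalg.solve(A, Q)                # potentials, one column per source
    return V.T @ U

m = np.ones(4)                               # homogeneous model on 4 cells (5 nodes)
Q = np.eye(5)[:, [4, 2]]                     # unit current injected at node 4, then at node 2
V = np.eye(5)[:, [1, 3]]                     # receivers at nodes 1 and 3
d = forward(m, Q, V)
print(d)
```

Columns of `Q` are sources and columns of `V` are receivers, so the design $y = \{Q, V\}$ determines both which experiments are run and where they are measured.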
IMPEDANCE TOMOGRAPHY – DESIGNS COMPARISON
[Figure panels: True model, Naive design, Optimized design]
Horesh, Haber & Tenorio 2011
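MAGNETOTELLURICS TOMOGRAPHY – OBSERVATION MODEL
• Governing equations
$\nabla \times \mu^{-1} \nabla \times \vec{E} - i\omega m \vec{E} = i\omega \vec{s} \text{ in } \Omega$
$\nabla \times \vec{E} \times \hat{n} = 0 \text{ on } \partial\Omega$
• Following Finite Volume discretization
$d(m; \omega) = V^\top A(m; \omega)^{-1} (i\omega s) + \eta, \quad y = \omega$
• Given: model $m \in \mathbb{R}^k$ and design settings $y$ (frequency $\omega$)
• Find: data $d \in \mathbb{R}^n$, $n < k$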
MAGNETOTELLURICS TOMOGRAPHY – DESIGNS COMPARISON
[Figure panels: True model; Naive design; Optimized non-linear design (Haber, Horesh & Tenorio 2010); Optimal linearized design (Haber, Horesh & Tenorio 2008)]
THE PARETO CURVE – A DECISION MAKING TOOL
Haber, Horesh & Tenorio 2010
[Figure: Pareto curve of risk (vertical axis, 0 to 10) versus $\|w\|_1$ (horizontal axis, 0 to 100)]
SUMMARY
• Generic approaches for design in ill-posed inverse problems
• Design of adaptive regularization
• Optimal experimental design
• Only two (important) elements in the big puzzle...
• New frontiers in inverse problems and optimization
• Vast range of applications in medical imaging, offering:
• Faster
• Safer
• Higher fidelity image reconstructions
ACKNOWLEDGMENTS
DESIGN IN INVERSION – OPEN COLLABORATIVE RESEARCH
• IBM Research
• MITACS
• University of British Columbia
Andy Conn
Michael Henderson
Ulisses Mello
David Nahamoo
ACKNOWLEDGMENTS
Michele Benzi
Eldad Haber
Raya Horesh
Jim Nagy
Questions? Thank you