Computational Algebraic Problems in
Variational PDE Image Processing
Tony F. Chan
Department of Mathematics, UCLA
International Summer School in Numerical Linear Algebra
Chinese University of Hong Kong
Reports: www.math.ucla.edu/~imagers
Supported by ONR, NIH and NSF
Collaborators
Peter Blomgren (Stanford)
Raymond Chan (CUHK)
Gene Golub (Stanford)
Pep Mulet (Valencia)
Jackie Shen (Minnesota)
Luminita Vese (UCLA)
Justin Wan (Waterloo)
C.K. Wong
Jamylle Carter (MSRI)
Berta Sandberg (TechFinity)
Ke Chen (Liverpool)
Xue-Cheng Tai (Bergen, Norway)
Jean-Francois Aujol (Cachan)
Selim Esedoglu (Michigan)
Fred Park (UCLA -> Michigan)
Outline of 4 Lectures
1. Introduction to PDE Image Models & Algorithms:
Denoising, deblurring, active contours, segmentation
2. Algebraic Problems from Denoising & Deblurring
Linear preconditioning techniques
Nonlinear + Optimization (duality) techniques
Multigrid techniques
3. Algebraic Problems from Active Contours/Segmentation
Curve evolution techniques
Direct optimization techniques
Goal of Lectures
• Broad overview rather than latest techniques
• Details on only a few topics --- see papers + reports + web
• No comprehensive referencing
• Limit to PDE aspects (there are non-PDE approaches; see forthcoming SIAM book by Hansen, Nagy, O’Leary)
• Please ask questions (English, Cantonese, Mandarin)
TYPICAL IMAGE PROCESSING TASKS
* Denoising/Inpainting * Object Detection/Identification
* Deblurring * Object/Pattern Recognition
* Enhancement * Gray-Scale vs Vector-Valued
* Compression * Still vs Video
* Segmentation * Registration
Related fields: Computer Graphics, Computer Vision.
IP and Applied Math
• Important applications: medical, astronomy, computer vision/computer graphics, …
• Math Models: standard or create your own
• Math Tools: harmonic analysis, PDEs, Bayesian statistics, differential geometry, CFD, multiscale, optimization, …
• Analysis of Models: existence, uniqueness, properties
• Challenging Computations: 3D + time + multi-components, nonlinearity, non-smoothness
Examples of PDE Image Models
Denoising and Inpainting
The Restoration Problem

A given observed image z is related to the true image u through the blur K and noise n:

  z = Ku + n

(Figure: initial blur; blur + noise.)

Inverse Problem: restore u, given K and statistics for n.
Ill-posed: needs proper regularization.
Keeping edges sharp and in the correct location is a key problem!
Total Variation Regularization

  TV(u) = ∫ |∇u| dx

• Measures “variation” of u, w/o penalizing discontinuities.
• |·| similar to Huber function in robust statistics.
• 1D: If u is monotonic in [a,b], then TV(u) = |u(b) – u(a)|, regardless of whether u is discontinuous or not.
• nD: If u = char fcn of D, then TV(u) = “surface area” of D.
• Coarea formula:  ∫ f |∇u| dx = ∫_ℝ ( ∫_{u=r} f ds ) dr
• Thus TV controls both size of jumps and geometry of boundaries.
• Extensions to vector-valued functions
• Color TV: Blomgren-C 98; Ringach-Sapiro, Kimmel-Sochen
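The 1-D monotonicity property is easy to check numerically. Below is a minimal NumPy sketch of a discrete isotropic TV (forward differences, replicated boundary); the function name and discretization are illustrative choices, not from the slides:

```python
import numpy as np

def total_variation(u):
    """Discrete isotropic TV(u) = sum over pixels of |grad u|,
    using forward differences with a replicated (Neumann) boundary."""
    ux = np.diff(u, axis=1, append=u[:, -1:])  # horizontal forward difference
    uy = np.diff(u, axis=0, append=u[-1:, :])  # vertical forward difference
    return np.sqrt(ux**2 + uy**2).sum()

# 1D property: for monotone profiles, TV(u) = |u(b) - u(a)|,
# whether or not u is discontinuous.
step = np.array([[0.0, 0.0, 1.0, 1.0]])
ramp = np.array([[0.0, 0.5, 0.75, 1.0]])
print(total_variation(step), total_variation(ramp))  # 1.0 1.0
```

Both profiles rise monotonically from 0 to 1, so both have TV = 1, jump or no jump.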
Total Variation Restoration

Regularization:

  TV(u) = ∫ |∇u| dx

Variational Model:

  min_u f(u) = α TV(u) + ½ ||Ku − z||²

Gradient flow:

  u_t = α ∇·( ∇u/|∇u| ) − K*(Ku − z),   ∂u/∂n = 0 on the boundary
        [anisotropic diffusion]  [data fidelity]

* First proposed by Rudin-Osher-Fatemi ’92.
* Allows for edge capturing (discontinuities along curves).
* TVD schemes popular for shock capturing.
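For the denoising case K = I, the gradient flow can be marched explicitly. A hedged sketch (the β-regularized gradient magnitude anticipates a later slide; α, β, Δt are illustrative values, not the slides'):

```python
import numpy as np

def tv_denoise_explicit(z, alpha=0.1, beta=0.01, dt=0.1, iters=150):
    """Explicit time marching for u_t = alpha*div(grad u/|grad u|) - (u - z),
    the TV gradient flow with K = I. |grad u| is regularized by beta."""
    u = z.astype(float).copy()
    for _ in range(iters):
        ux = np.diff(u, axis=1, append=u[:, -1:])
        uy = np.diff(u, axis=0, append=u[-1:, :])
        mag = np.sqrt(ux**2 + uy**2 + beta)
        # divergence: (negative) adjoint of the forward-difference gradient
        div = np.diff(ux / mag, axis=1, prepend=0.0) \
            + np.diff(uy / mag, axis=0, prepend=0.0)
        u += dt * (alpha * div - (u - z))
    return u
```

The explicit step restricts Δt: the effective diffusion coefficient is bounded by α/√β, so roughly Δt ≲ √β/(4α) here.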
Comparison of different methods for signal denoising & reconstruction
Denoising: TV vs H1
Inpainting: Generalized Restoration Models

Scratch removal, disocclusion, graffiti removal.
Examples of TV Inpaintings
Where is the Inpainting Region?
Unified TV Restoration & Inpainting Model

  J[u] = ∫_{E∪D} |∇u| dx dy + (λ/2) ∫_E |u − u⁰|² dx dy

Euler-Lagrange equation:

  −∇·( ∇u/|∇u| ) + λ_e (u − u⁰) = 0,   λ_e(z) = λ for z ∈ E, 0 for z ∈ D.

Here D is the inpainting region and E the surrounding region where the data u⁰ is kept.
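The same explicit flow with a spatially varying fidelity weight λ_e gives a sketch of the unified model: λ_e = 0 on the hole D, λ elsewhere. The hole geometry and parameters below are made up for illustration:

```python
import numpy as np

def tv_inpaint(z, hole, lam=1.0, beta=0.01, dt=0.02, iters=500):
    """Gradient flow for the unified model:
    u_t = div(grad u/|grad u|) - lam_e*(u - z), with lam_e = 0 on the hole D."""
    lam_e = np.where(hole, 0.0, lam)
    u = z.astype(float).copy()
    for _ in range(iters):
        ux = np.diff(u, axis=1, append=u[:, -1:])
        uy = np.diff(u, axis=0, append=u[-1:, :])
        mag = np.sqrt(ux**2 + uy**2 + beta)
        div = np.diff(ux / mag, axis=1, prepend=0.0) \
            + np.diff(uy / mag, axis=0, prepend=0.0)
        u += dt * (div - lam_e * (u - z))
    return u

# fill a 4x4 hole punched into a constant image
z = np.ones((16, 16)); hole = np.zeros_like(z, dtype=bool)
hole[6:10, 6:10] = True; z[hole] = 0.0
u = tv_inpaint(z, hole)
```

Inside D the fidelity term vanishes, so the TV diffusion alone propagates the surrounding values into the hole.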
Examples of PDE Image Models
Blind (unknown blur) and Non-blind Deblurring
- Blurring operator K usually ill-conditioned
- Need to solve systems with differential-convolution operator:
(Figure: original image; out-of-focus blur; blurred image.)
Deblurring
  −α ∇·( ∇u/|∇u| ) + K*Ku = K*z
Deblurring: TV vs H1
TV Blind Deconvolution C. + Wong (98)
* Variational Model:

  min_{u,k} f(u,k) = ½ ||k∗u − u⁰||² + α₁ TV(u) + α₂ TV(k)

* Alternating minimization (AM) algorithm:

  u^{n+1} = arg min_u f(u, k^n)
  k^{n+1} = arg min_k f(u^{n+1}, k)

* Algorithm gives a 1-parameter family of solutions, determined by the SNR.
TV Blind deconvolution
(Figures: recovered image and recovered blurring function for α₂ = 0, 1e-7, 1e-5, 1e-4; original image; out-of-focus and Gaussian blurred images, blind vs non-blind recovery.)
Examples of PDE Image Models
Active Contours & Segmentation
Features:
Automatically detects interior contours!
Works very well for concave objects
Robust w.r.t. noise
Detects blurred contours
The initial curve can be placed anywhere!
Allows for automatic changes of topology
Active Contour w/o Edges (C.-Vese 99)
Evolution of C | Objects Found
Europe nightlights: the model detects contours without gradient (cognitive contours).
MRI brain image
Motion Segmentation (Moelich, Chan 2004)
Olympic Blvd, LA(2 frames per sec)
UCLA(1 frame per sec)
Westwood Blvd
Motion determined by logical AND & OR on frame differences over 3 consecutive frames.
Designed for low frame rate videos.
Another extension of Chan-Vese 2001 model.
Implemented via level sets.
Other Related PDE Image Models

* Geometry driven diffusion (see book by Bart M. ter Haar Romeny 1994)
* Anisotropic diffusion (Perona-Malik 87)
* Fundamental IP PDE (Alvarez-Guichard-Lions-Morel 92)
* Affine invariant flow (Sapiro-Tanenbaum 93)
* TV + Textures (Meyer 2001, Osher-Vese 02, Osher-Sole-Vese 02, Osher-Sapiro-Vese 02, Chambolle 03)
  u_t = F(curvature(u)) |∇u|,   curvature(u) = ∇·( ∇u/|∇u| )   (fundamental equation)
  u_t = ∇·( f(|∇u|) ∇u )   (Perona-Malik)
  u_t = |∇u| (curvature(u))^{1/3}   (affine invariant flow)
Different Frameworks for Image Processing
Statistical/Stochastic Models:
Maximum Likelihood Estimation with uncertain data
Transform-Based Models:
Fourier/Wavelets --- process features of images (e.g. noise)
in transform space (e.g. thresholding)
Variational PDE Models:
Evolve image according to local derivative/geometric info,
e.g. denoising diffusion
Concepts are related mathematically:
Brownian motion – Fourier Analysis --- Diffusion Equation
Features & Advantages of PDE Imaging Models
* Use PDE concepts: gradients, diffusion, curvature, level sets
* Exploit sophisticated PDE and CFD (e.g. shock capturing) techniques
Restoration:
- sharper edges, less edge artifacts, often morphological
Segmentation:
- scale adaptivity, geometry-based, controlled regularity of boundaries, segments can have complex topologies
Newer, less well developed/accepted.
Combining PDE with other techniques.
Computational Challenges

* Size: large # of pixels, color and multi-channel, 3D, videos.
* TV(u) non-differentiable where |∇u| = 0; need numerical regularization:
    |∇u| → √(|∇u|² + β),  β > 0.
* High nonlinearity of ∇·( ∇u/|∇u| ).
* Ill-conditioning: K discretizes a compact operator ⇒ cond(K*K) large.
* Highly varying coefficients: 1/|∇u| = O(1/h) across edges.
* Need to precondition differential + convolution operators.
Some Books/Surveys for PDE Imaging

• Morel-Solimini 94: Variational Meths in Image Segmentation
• Romeny 94: Geometry Driven Diffusion in Computer Vision
• Alvarez-Morel 94: Acta Numerica (review article)
• IEEE Tran. Image Proc. 3/98, Special Issue
• J. Weickert 98: Anisotropic Diffusion in Image Processing
• G. Sapiro 2000: Geometric PDE & Imaging
• Aubert-Kornprost 2002: Math Aspects of Image Processing
• Osher-Fedkiw 2003: “Level Set Bible”
• Chan, Shen & Vese Jan 03, Notices of AMS (review)
• Paragios, Chen, Faugeras 2005: Collection of articles
• Chan-Shen 2005: Image Processing & Analysis
Available since Sept 05 (www.siam.org)
Outline of 4 Lectures
1. Introduction to PDE Image Models & Algorithms:
Denoising, deblurring, active contours, segmentation
2. Algebraic Problems from Denoising & Deblurring
Linear preconditioning techniques
Nonlinear + Optimization (duality) techniques
Multigrid techniques
3. Algebraic Problems from Active Contours/Segmentation
Curve evolution techniques
Direct optimization techniques
Restoration Problem in Discrete Algebraic Form

Continuous model and gradient flow:

  min_u f(u) = α TV(u) + ½ ||Ku − z||²
  u_t = α ∇·( ∇u/|∇u| ) − K*(Ku − z)

Discrete objective (denoising case K = I):

  f(u) = α Σ_{i=1}^N ||A_i^T u||₂ + ½ ||u − z||₂²,

where A_i^T denotes the discrete gradient at pixel i (and A_i the discrete divergence).

Gradient:

  g(u) = α Σ_{i=1}^N A_i A_i^T u / ||A_i^T u|| + (u − z) = 0

Gradient Flow:  u_t = −g(u).

Morphologically scaled flow (Marquina-Osher, next slide):

  u_t = |∇u| [ α ∇·( ∇u/|∇u| ) − K*(Ku − z) ]
Morphological Diagonal Scaling (Marquina, Osher 2000)
Two different but related motivations:
- morphological evolution of level sets --- moves in direction of normal with speed proportional to curvature, independent of contrast.
- diagonally scaled Richardson stationary iteration.
Advantages:
- cost per time step similar to time marching
- much faster convergence to steady state
Unconstrained Modular Solver for Discrepancy Principle
Blomgren-C. 96

Discrepancy Principle Constrained Problem:

  min_u R(u)  subject to  ½ ||Ku − z||² = σ²

Tikhonov Unconstrained Problem:

  min_u f(u) = ½ ||Ku − z||² + α R(u)

Modular Solver: efficient solver u ← S(α, u) for fixed α (e.g. Time Marching, Fixed Point, Primal-Dual).
- Can make use of S to solve the constrained problem efficiently.
- Based on block elimination + Newton’s method for the constrained problem, via calls to S for computing directional derivatives.
A Modular Newton’s Method

System of nonlinear equations: G(u, α) = 0; N(u, α) = 0.

Newton’s Method:

  [ G_u  G_α ] [ δu ]     [ G ]
  [ N_u  N_α ] [ δα ] = − [ N ]

Block elimination: define w = −G_u⁻¹ G,  v = G_u⁻¹ G_α; then

  δα = −(N_α − N_u v)⁻¹ (N_u w + N),   δu = w − v δα.

Main idea: replace computation of w & v by calls to S(u, α).
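The block-elimination formulas can be exercised on a toy scalar system standing in for G and N; the particular G and N below are invented purely to show the mechanics (they are not the TV equations):

```python
def modular_newton(u, a, iters=20):
    """Block-elimination Newton on a toy system standing in for the slide's
    G(u, a) = 0 (state equation) and N(u, a) = 0 (constraint):
        G(u, a) = u^2 + a - 2,   N(u, a) = u - a.
    Mirrors: w = -G_u^{-1} G, v = G_u^{-1} G_a,
             da = -(N_a - N_u v)^{-1} (N_u w + N),  du = w - v da."""
    for _ in range(iters):
        G, Gu, Ga = u * u + a - 2.0, 2.0 * u, 1.0
        N, Nu, Na = u - a, 1.0, -1.0
        w = -G / Gu                 # w = -G_u^{-1} G
        v = Ga / Gu                 # v =  G_u^{-1} G_a
        da = -(N + Nu * w) / (Na - Nu * v)
        du = w - v * da
        u, a = u + du, a + da
    return u, a

print(modular_newton(2.0, 0.0))  # converges to u = a = 1
```

Each step solves the full 2×2 Newton system using only G_u-solves, which is what the modular calls to S provide in the real solver.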
Robustness of Modular Solver
Efficiency of Modular Solver for Constrained Problem
Outline of 4 Lectures
1. Introduction to PDE Image Models & Algorithms:
Denoising, deblurring, active contours, segmentation
2. Algebraic Problems from Denoising & Deblurring
Linear preconditioning techniques
Nonlinear + Optimization (duality) techniques
Multigrid techniques
3. Algebraic Problems from Active Contours/Segmentation
Curve evolution techniques
Direct optimization techniques
Difficulties with Primal TV

• TV norm non-differentiable ⇒ regularize functional:
    TV_β(u) = ∫ √(|∇u|² + β) dx
• Primal: gradient flow also needs regularization
• Problem becomes difficult for small β
• But edges smeared for large β
• Artificial time marching at best linearly convergent
• Nonlinear relaxation (e.g. GS) non-convergent w/o regularization
• Ill-conditioning due to spatial scales (CFL; MG)
Towards Quadratically Convergent Methods

Newton’s Method [R. Chan, T. Chan, Zhou, ‘95]
• As β → 0, size of domain of convergence → 0
• Remedy: continuation on β.
• But efficient continuation not easy to obtain!
Primal-Dual Method (Chan, Golub, and Mulet ‘95)

Introduce auxiliary (dual) variable w := ∇u / |∇u|; note |w| = 1.
Linearize the (w,u)-system instead of the u-system.

• Similar to primal-dual (Conn & Overton ’95, Anderson ’95)
• w = ∇u/|∇u| is normal to the level sets of u; ∇·w = curvature of level sets
• Time marching regarded as curvature driven flow
• Better global convergence of Newton’s method for the (w,u)-system:
  the (w,u)-system is more “globally linear” than the u-system
Why Linearization Works Well
• Linearization of u-system
• Linearization of (w,u)-system
• Similar Structure and Cost for Both Systems
Linearized Systems
CGM Method: Convergence Results
Residual vs. Iterations ||un-utrue|| vs. Iterations
Relatively robust w.r.t. β; but iteration # increases as β → 0.
Introducing Primal-Dual (J. Carter ’01)

TV Model: rewrite by introducing dual variable w:
Swap inf & sup (G strictly convex in u and concave (linear) in w):
For each w, solve inf_u G(u, w):
Back-substitute u, and use:

Primal-Dual ⇒ Dual

Problem reduces to the dual problem:
Optimality Conditions (discrete case):
Complementarity:
Update for u:
Objective quadratic; but many constraints
Potential Difficulties of the Dual Problem

• Many constraints
• Use standard constrained optimization methods:
  – Barrier methods [Carter, Vandenberghe ‘02]
  – Interior point methods?
  – Penalty methods?
• But need to estimate algorithmic parameters
• Other related ideas:
  – Second order cone programming [Yin, Goldfarb, Osher]
  – Graph cut [Zabih, Boykov, Kolmogorov, …]
Chambolle’s Key Observation (‘04)

Optimality conditions from dual TV formulation:
Key observation:
Complementarity:
⇒ Lagrange multipliers eliminated!

Thus the Lagrangian simplifies, and the problem can be solved via semi-implicit gradient descent (Chambolle), which reduces to an explicit scheme.
Convergence of Scheme

Theorem (Chambolle):
• Iterates decrease the energy
• Globally convergent for any initial p
• Convergence: at best linear
• No regularization parameter β needed
• Empirically: # iter ~ # pixels (2-D)
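Chambolle's dual iteration is short enough to write out in full. A sketch for min_u TV(u) + (1/2λ)||u − g||², with forward-difference gradient, matching adjoint divergence, τ = 1/8, and an illustrative λ:

```python
import numpy as np

def grad(u):
    """Forward-difference gradient, replicated boundary (last diff = 0)."""
    return (np.diff(u, axis=1, append=u[:, -1:]),
            np.diff(u, axis=0, append=u[-1:, :]))

def div(px, py):
    """Discrete divergence: negative adjoint of grad above."""
    return np.diff(px, axis=1, prepend=0.0) + np.diff(py, axis=0, prepend=0.0)

def chambolle_tv(g, lam=0.2, tau=0.125, iters=200):
    """Chambolle ('04): solve min_u TV(u) + (1/(2*lam))||u - g||^2 via
    p <- (p + tau*grad(div p - g/lam)) / (1 + tau*|grad(div p - g/lam)|);
    the denoised image is u = g - lam * div p."""
    px = np.zeros_like(g, dtype=float)
    py = np.zeros_like(g, dtype=float)
    for _ in range(iters):
        ex, ey = grad(div(px, py) - g / lam)
        mag = np.sqrt(ex**2 + ey**2)
        px = (px + tau * ex) / (1.0 + tau * mag)
        py = (py + tau * ey) / (1.0 + tau * mag)
    return g - lam * div(px, py)
```

Note that no β-smoothing of |∇u| appears anywhere: the normalization in the p-update is well defined even where the gradient vanishes.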
ROF Dual: Residual vs # Iterations

(Figure: observed image; residuals for Time Marching (TM) vs Chambolle Dual (CD) at 250 and 1500 iterations.)
# Iterations versus λ

(Figure: # iterations grows from ~10² to ~10⁴ as λ ranges over 0-800.)
# Iterations versus Image Size

(Figure: # iterations, ~10²-10⁴, versus image size, 10³-10⁶ pixels.)
Non-Smooth Newton Methods

• Recent works by M. Ng, Qi 2005
• Chambolle’s equation is non-differentiable
• Non-smooth Newton: uses sub-gradients near singularities
• Superlinear convergence is achievable in theory
Outline of 4 Lectures
1. Introduction to PDE Image Models & Algorithms:
Denoising, deblurring, active contours, segmentation
2. Algebraic Problems from Denoising & Deblurring
Linear preconditioning techniques
Nonlinear + Optimization (duality) techniques
Multigrid techniques
3. Algebraic Problems from Active Contours/Segmentation
Curve evolution techniques
Direct optimization techniques
Time Marching vs Fixed Point

• In image denoising, the TV regularization approach leads to (1-D):

    α ( u_x/|u_x| )_x − (u − u⁰) = 0.

• In the fixed-point iteration, one fixes |u_x| = |u_x^n| and needs to solve:

    α ( u_x^{n+1}/|u_x^n| )_x − (u^{n+1} − u⁰) = 0.

• Apply 1 step of Richardson with relaxation parameter Δt & preconditioner B:

    u^{n+1} = u^n + Δt B⁻¹ [ α ( u_x^n/|u_x^n| )_x − (u^n − u⁰) ].
Time Marching vs Fixed Point (cont.)

• B⁻¹ = I gives the time marching scheme of Rudin-Osher-Fatemi (92):

    (u^{n+1} − u^n)/Δt = α ( u_x^n/|u_x^n| )_x − (u^n − u⁰).

• B⁻¹ = |u_x^n| gives the time marching scheme of Marquina-Osher (00), i.e. diagonally preconditioned Richardson:

    (u^{n+1} − u^n)/Δt = |u_x^n| [ α ( u_x^n/|u_x^n| )_x − (u^n − u⁰) ].

• In general, one can choose other B, e.g. multigrid, to speed up convergence.
• If 1 pre- & 1 post- GS smoothing are used, 1 MG cycle ≈ 4 time marching steps.
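In 1-D the exact fixed point (each linearized system solved fully, here by a dense solve standing in for MG) fits in a few lines; α, β, and the assembly details are illustrative:

```python
import numpy as np

def fp_lagged_diffusivity(z, alpha=1.0, beta=1e-3, iters=30):
    """1-D lagged-diffusivity fixed point: freeze |u_x| at u^n and solve the
    linear system  -alpha (u_x^{n+1}/|u_x^n|)_x + u^{n+1} = z  exactly."""
    z = np.asarray(z, dtype=float)
    n = z.size
    u = z.copy()
    for _ in range(iters):
        d = alpha / np.sqrt(np.diff(u) ** 2 + beta)   # edge diffusivities
        A = np.eye(n)                                  # identity: fidelity term
        for i in range(n - 1):                         # assemble -alpha (d u_x)_x
            A[i, i] += d[i];         A[i, i + 1] -= d[i]
            A[i + 1, i + 1] += d[i]; A[i + 1, i] -= d[i]
        u = np.linalg.solve(A, z)
    return u
```

Each outer iteration here costs a full linear solve; the inexact variant on the next slides replaces it by a single MG V-cycle.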
Comparison of Preconditioners
Inner iteration: 1 Richardson step with diag or MG preconditioner
Nonlinear Iteration: TM vs FP
Exact FP vs Inexact FP
Inexact FP: 1 MG V-cycle; Exact FP: ~10 MG V-cycles
A demonstration: video of an X-ray of a hand.
Multigrid for Differential+Convolution Problems
-Linear MG, V-cycles, various smoothers (R. Chan, C., Wan 97)
> works well for I, Laplacian but not as well for TV.
> difficult to find good smoother for diff-conv problems
> spectral properties of diff and conv operators “flipped”.
-Multilevel Additive Schwarz (Hanke, Vogel 98)
> 2 grid levels, projection of diff-conv operator directly to
a coarse grid.
> coarse problem dense, solved using direct method.
  −α ∇·( ∇u/|∇u| ) + K*Ku = K*z
Spectrum of −αΔ | Spectrum of −αΔ + K*K

(Figure: eigenvectors for the smallest, middle and largest eigenvalues, for α = 1, 10⁻⁴, 10⁻⁸.)
Richardson Smoothing

(Figure: initial error and error after 1, 5, 10 iterations, for α = 1, 10⁻⁴, 10⁻⁸.)
Outline
1. PDE Image Models:
Denoising, deblurring, active contours, segmentation
2. Algebraic Problems from Denoising & Deblurring
Nonlinear + Optimization techniques
Linear preconditioning techniques
3. Algebraic Problems from Active Contours/Segmentation
Features:
Automatically detects interior contours!
Works very well for concave objects
Robust w.r.t. noise
Detects blurred contours
The initial curve can be placed anywhere!
Allows for automatic changes of topology
Active Contour w/o Edges (C.-Vese 99)
Evolution of C | Objects Found
  inf_{c₁,c₂,C} F(c₁,c₂,C) = μ·Length(C) + ν·Area(inside(C))
      + λ₁ ∫_{inside(C)} |u⁰ − c₁|² dx dy + λ₂ ∫_{outside(C)} |u⁰ − c₂|² dx dy
An Active Contour model “without edges”
Fitting + Regularization terms (length, area)
Connection with Segmentation:
Active contour model partitions the image into 2 segments –
inside and outside.
Level Sets (Osher - Sethian ‘87)

  C = { (x,y) : φ(x,y) = 0 }
  Inside C: φ > 0.  Outside C: φ < 0.

  Normal:  n = ∇φ / |∇φ|
  Curvature:  κ = div( ∇φ / |∇φ| )

* Allows automatic topology changes, cusps, merging and breaking.
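The normal and curvature formulas can be checked on the signed distance function of a circle, whose level sets have curvature 1/ρ at radius ρ. A sketch with central differences (np.gradient, unit spacing):

```python
import numpy as np

def level_set_curvature(phi):
    """Curvature of level sets: div(grad phi / |grad phi|),
    central differences on a unit grid."""
    py, px = np.gradient(phi)         # np.gradient returns (d/dy, d/dx)
    mag = np.sqrt(px**2 + py**2) + 1e-12
    nx, ny = px / mag, py / mag       # unit normal n = grad phi / |grad phi|
    _, nx_x = np.gradient(nx)         # d(nx)/dx
    ny_y, _ = np.gradient(ny)         # d(ny)/dy
    return nx_x + ny_y

n = 64
y, x = np.mgrid[0:n, 0:n] - n // 2
phi = np.sqrt(x**2 + y**2) - 10.0     # zero level set: circle of radius 10
kappa = level_set_curvature(phi)
```

At a point on the zero level set (radius 10) the computed κ is ≈ 0.1, as expected.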
Main Evolutionary Equation in Active Contour Model

  φ_t = δ(φ) [ μ div( ∇φ/|∇φ| ) − λ₁ (u⁰ − c₁)² + λ₂ (u⁰ − c₂)² ],
  φ(0,x,y) = φ₀(x,y).

(A variant replaces δ(φ) by |∇φ|.)

Possible Approaches:
- Time marching using explicit or implicit schemes
- Solve for steady state directly
- Linearize curvature term using fixed point:  div( ∇φ^{n+1} / |∇φ^n| )
- Use pointwise relaxation for the linear systems
Fast Algorithms for Level Set Segmentation
• Implicit methods (CV ’01): allow larger time steps
• Multigrid (Fedkiw, C, Kang, Vese ’01, Tsai, Willsky, Yezzi ‘00): – interpolate LSF from coarse grid as initial guess for
fine grid
• Direct Optimization (Song-C ’02): – sweep through pixels, decide pixel’s region
membership by value of energy functional.
Two Linear Schemes (Fixed Point)

Evolutionary iterative scheme:

  (φ^{n+1} − φ^n)/Δt = δ_ε(φ^n) [ μ div( ∇φ^{n+1}/|∇φ^n| ) − λ₁ (u⁰ − c₁(φ^n))² + λ₂ (u⁰ − c₂(φ^n))² ]

Because our approximation δ_ε(φ) is strictly positive everywhere, the steady state also satisfies

  0 = μ div( ∇φ/|∇φ| ) − λ₁ (u⁰ − c₁)² + λ₂ (u⁰ − c₂)²,

which suggests the stationary iterative scheme:

  0 = μ div( ∇φ^{n+1}/|∇φ^n| ) − λ₁ (u⁰ − c₁(φ^n))² + λ₂ (u⁰ − c₂(φ^n))²
Typically, we use only 1 step of Gauss-Seidel relaxation for φ^{n+1}.
Evolutionary Scheme (CPU time = 59.13 sec)
0 It 50 It 500 It 1000 It 2000 It 4000 It
Stationary Scheme (CPU time = 0.63 sec)
0 It 10 It 20 It 30 It 40 It 50 It
Comparison of the evolutionary/stationary schemes
(Parameters: Δt = 0.1, h = 1; μ = 0.01·255², ν = 0, λ₁ = λ₂ = 1.)
Multigrid Ideas For Active Contours
- Use MG for solving the linear systems arising in evolution
- Use MG for solving nonlinear steady state equations
- Use full MG to obtain better initial guess for curve:
> Down-sample image to lower resolution
> Solve active contour problem on low resolution image
> Interpolate level set function to fine resolution image.
> Gives a smooth, good approximation from the low resolution solve.
> Evolution on fine resolution image picks up details.
Refs: Tsai, Willsky, Yezzi 2000, C., Fedkiw, Kang, Vese 2000
Original image
256x171
32x22 64x43
128x86 256x171
Multigrid for Active Contours
Animation of Multigrid Active Contours
4 levels: 256x171, 128x86, 64x43, 32x22
Fast Direct Search Algorithm (Song-C ’02)

Insight: segmentation only needs the sign of the LSF, not its value.

1. Initialization. Partition the domain into Ω₁ = {φ > 0} and Ω₂ = {φ < 0}.
2. Advance. For each point x in the domain, if the energy F is lower when we change φ(x) to −φ(x), then flip the point. (F can be updated fast.)
3. Repeat step 2 until the energy F remains unchanged.

(Related to the K-means algorithm, and the “region merging” algorithm of Koepfler, Lopez, Morel ’94.)
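For a clean 2-phase image the sweep (without the length term) is a few lines; running sums make each proposed flip O(1), which is the "can be updated fast" remark. Names and the sweep cap are illustrative:

```python
import numpy as np

def direct_search(img, phi, sweeps=10):
    """Song-Chan-style direct search, no length term: flip phi(x) -> -phi(x)
    whenever it lowers F = sum_in (u0-c1)^2 + sum_out (u0-c2)^2.
    Up to an additive constant, F = -s1^2/n1 - s2^2/n2 for region sums s and
    counts n, so each candidate flip is evaluated from four scalars."""
    phi = phi.copy()
    n, s = img.size, float(img.sum())
    n1 = int((phi > 0).sum()); s1 = float(img[phi > 0].sum())

    def F(k, t):                      # energy minus the constant sum of v^2
        e = 0.0
        if k:     e -= t * t / k
        if n - k: e -= (s - t) ** 2 / (n - k)
        return e

    for _ in range(sweeps):
        changed = False
        for idx in np.ndindex(img.shape):
            v = float(img[idx])
            if phi[idx] > 0: k2, t2 = n1 - 1, s1 - v   # flip: leave region 1
            else:            k2, t2 = n1 + 1, s1 + v   # flip: join region 1
            if F(k2, t2) < F(n1, s1) - 1e-12:
                phi[idx] = -phi[idx]
                n1, s1 = k2, t2
                changed = True
        if not changed:
            break
    return phi
```

Because the global means enter F only through the region sums and counts, the exact global energy change of a local flip is available locally, which is why finite-step convergence is possible at all.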
A 2-phase example
(a), (b), (c), (d) are four different initial conditions. All of them converge in one sweep!
Example with Noise
Converged in 4 steps.
(Gradient Descent on Euler-Lagrange took > 400 steps.)
Convergence of the algorithm
Theorem: For 2-phase images, algorithm (w/o length term) converges in 1 sweep, independent of sweeping order.
Why is 1-step convergence possible?
Problem is global: usually cannot have finite step convergence based on local updates only
But, in our case, we can exactly calculate the global energy change via local update (can update global average locally)
Application to Piecewise Linear CV Model (Vese 2002)

  F(H(φ), c₁, c₂) = μ |∇H(φ)| + λ₁ |u⁰ − (a₀ + a₁x + a₂y)|² H(φ)
      + λ₂ |u⁰ − (b₀ + b₁x + b₂y)|² (1 − H(φ))

(Figure: original; piecewise constant fit, converged in 4 steps; piecewise linear fit, converged in 6 steps.)
Other Fast Algorithms for Level Set Segmentation
• Narrow Band (see Osher-Fedkiw ’03): – only solve PDE near zero LS
• Operator Splitting (Gibou-Fedkiw ’02): – split length term (nonlinear diffusion) from fidelity term (optimized via k-means).
• Threshold Dynamics (Esedoglu & Tsai ’04, +Ruuth ‘05):– Extends Merriman, Bence, Osher ’92 diffusion generated motion
by mean curvature to MS segmentation. Alternates diffusion with thresholding.
– Operator split phase field formulation of Mumford-Shah functional