Depth Image Coding & Processing
Part 3: Depth Image Processing
Gene Cheung
National Institute of Informatics
6th July, 2015
Outline
• Depth Image Denoising
• Graph Sparsity Prior
• Graph-signal Smoothness Prior
• Bit-depth Enhancement
Introduction to PWS Image Denoising
• Limitations of current sensing technologies:
  - acquired piecewise smooth (PWS) images are often corrupted by non-negligible acquisition noise.
• Denoising is an inverse imaging problem.
• Signal prior is key to inverse imaging problems!
• Depth images are PWS, self-similar.
• Observation model: y = x + v, where y is the observation, x the desired signal, and v the noise.
Existing Image Denoising Methods
• Local methods (e.g., bilateral filtering).
• Nonlocal image denoising.
  - Assumption: nonlocal self-similarity.
  Buades et al., "A non-local algorithm for image denoising," CVPR 2005.
• Dictionary-learning based.
  - Represent a signal by a linear combination of a few atoms from a dictionary.
  Elad et al., "Image denoising via sparse and redundant representations over learned dictionaries," TIP 2006.
• Other related works:
  - Huhle et al., "Robust non-local denoising of colored depth data," CVPR Workshop 2008.
  - Tallon et al., "Upsampling and denoising of depth maps via joint segmentation," EUSIPCO 2012.
Key Idea in Non-local GFT
• Unify local piecewise smoothness and nonlocal self-similarity in the GFT domain.
• Challenges and our method:
  1. Adapt to nonlocal statistics: exploit nonlocal self-similarity.
  2. Characterize PWS: use a GFT representation, and learn the GFT dictionary efficiently.
NL-GFT Algorithm [1]
Algorithm:
1. Identify similar patches, compute the average patch (self-similarity).
2. Given the average patch, use a Gaussian kernel to compute weights between adjacent pixels.
3. Compute the graph Fourier transform (GFT).
4. Given the GFT, apply soft thresholding to the transform coefficients for a sparse representation.

Objective:

  \min_{U, \{\alpha_i\}} \sum_{i=1}^{N} \| y_i - U \alpha_i \|_2^2 + \lambda \sum_{i=1}^{N} \| \alpha_i \|_0

where U is the common GFT computed from the average patch, \alpha_i is the code vector for observation i, and y_i is observation i.

[1] W. Hu, X. Li, G. Cheung, O. Au, "Depth Map Denoising using Graph-based Transform and Group Sparsity," IEEE International Workshop on Multimedia Signal Processing, Pula (Sardinia), Italy, October 2013. (Top 10% paper award.)
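Below is a minimal NumPy sketch of the four steps for a single cluster of similar patches. The 4-connected grid graph, the plain soft threshold standing in for the group-sparsity step of [1], and the names and parameter values (sigma, tau) are all illustrative assumptions, not taken from the paper.

```python
import numpy as np

def nlgft_denoise_cluster(patches, sigma=10.0, tau=30.0):
    """Sketch of NL-GFT denoising for one cluster of similar square patches.

    patches: (N, M) array of vectorized noisy patches (rows are observations y_i).
    """
    N, M = patches.shape
    side = int(np.sqrt(M))
    avg = patches.mean(axis=0)                 # step 1: average patch

    # Step 2: Gaussian-kernel weights between 4-connected neighbors of the avg patch.
    W = np.zeros((M, M))
    for i in range(M):
        r, c = divmod(i, side)
        for dr, dc in ((0, 1), (1, 0)):
            rr, cc = r + dr, c + dc
            if rr < side and cc < side:
                j = rr * side + cc
                w = np.exp(-((avg[i] - avg[j]) ** 2) / (2 * sigma ** 2))
                W[i, j] = W[j, i] = w

    # Step 3: GFT = eigenvectors of the combinatorial Laplacian L = D - W.
    L = np.diag(W.sum(axis=1)) - W
    _, U = np.linalg.eigh(L)                   # columns of U form the GFT basis

    # Step 4: soft-threshold the GFT coefficients of each observation.
    coeffs = patches @ U                       # row i holds (U^T y_i)^T
    coeffs = np.sign(coeffs) * np.maximum(np.abs(coeffs) - tau, 0.0)
    return coeffs @ U.T                        # reconstructed patches x_i = U alpha_i
```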
Justification of Sparsity Prior
• GFT-domain sparsity prior in the objective function:

  \min_{\{\alpha_i\}} \sum_{i=1}^{K} \| y_i - U \alpha_i \|_2^2 + \lambda \sum_{i=1}^{K} \| \alpha_i \|_0

• Argument:
  - The GFT approximates the KLT if the statistical model is a GMRF and each graph weight captures the correlation of two connected pixels [2, 3].
  - The underlying "causes" of PWS signals are few; a PWS signal can be sparsely represented in the GFT domain [4, 5].
[2] C. Zhang and D. Florencio, "Analyzing the optimality of predictive transform coding using graph-based models," IEEE Signal Processing Letters, vol. 20, no. 1, pp. 106-109, January 2013.
[3] W. Hu, G. Cheung, A. Ortega, O. Au, "Multi-resolution Graph Fourier Transform for Compression of Piecewise Smooth Images," IEEE Transactions on Image Processing, January 2015.
[4] G. Shen, W.-S. Kim, S. K. Narang, A. Ortega, J. Lee, and H. Wey, "Edge-adaptive transforms for efficient depth map coding," IEEE Picture Coding Symposium, Nagoya, Japan, December 2010.
[5] W. Hu, G. Cheung, X. Li, O. Au, "Depth Map Compression using Multi-resolution Graph-based Transform for Depth-image-based Rendering," IEEE International Conference on Image Processing, Orlando, FL, September 2012.
Experimental Results (1)
• Setup:
  - Test Middlebury depth maps: Cones, Teddy, Sawtooth.
  - Add additive white Gaussian noise (AWGN).
  - Compare against bilateral filtering (BF), non-local means denoising (NLM) and block-matching 3D (BM3D).
• Results:
  - Up to 2.28 dB improvement over BM3D.
[Figure: visual comparison of NL-GFT, BM3D, NLM and BF.]
Experimental Results (2)
• Same setup as above; additional visual comparisons of NL-GFT, BM3D, NLM and BF.
Outline
• Depth Image Denoising
• Graph Sparsity Prior
• Graph-signal Smoothness Prior
• Bit-depth Enhancement
Motivation (I)
• Image denoising is a basic restoration problem:

  y = x + e   (y: observation, x: desired signal, e: noise)

• It is under-determined and needs image priors for regularization:

  \min_x \| y - x \|_2^2 + \lambda \, \mathrm{prior}(x)   (fidelity term + prior term)

• Graph Laplacian regularizer: S_G(x) = x^T L x should be small for the target patch, where L = D - A is the graph Laplacian matrix.
• Many works use a Gaussian kernel to compute the graph weights [1, 6]:

  w_{ij} = \exp\!\left( -\frac{\mathrm{dist}(i, j)^2}{2\epsilon^2} \right),

  where dist(i, j) is some distance metric between pixels i and j.

[6] D. Shuman et al., "The emerging field of signal processing on graphs: extending high-dimensional data analysis to networks and other irregular domains," IEEE Signal Processing Magazine, vol. 30, no. 3, pp. 83-98, 2013.
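A minimal sketch of this regularizer on a 4-connected grid, assuming pixel-intensity differences as dist(i, j); the function name and parameter values are illustrative.

```python
import numpy as np

def laplacian_regularizer(patch, sigma=10.0):
    """Evaluate S_G(x) = x^T L x for a square patch on a 4-connected grid graph,
    with Gaussian-kernel edge weights on intensity differences."""
    h, w = patch.shape
    x = patch.astype(float).ravel()
    M = h * w
    A = np.zeros((M, M))                        # adjacency matrix
    for i in range(M):
        r, c = divmod(i, w)
        for rr, cc in ((r, c + 1), (r + 1, c)): # right and bottom neighbors
            if rr < h and cc < w:
                j = rr * w + cc
                A[i, j] = A[j, i] = np.exp(-(x[i] - x[j]) ** 2 / (2 * sigma ** 2))
    L = np.diag(A.sum(axis=1)) - A              # combinatorial Laplacian L = D - A
    return x @ L @ x                            # small for signals smooth on the graph

# A flat (piecewise smooth) patch scores lower than a noisy one:
flat = np.full((8, 8), 100.0)
noisy = flat + np.random.randn(8, 8) * 5
print(laplacian_regularizer(flat), laplacian_regularizer(noisy))
```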
Motivation (II)
• View: the discrete graph approximates a continuous manifold.
• However…
  a. Why is S_G(x) = x^T L x a good prior?
  b. Why use a Gaussian kernel for the edge weights?
  c. How to design a discriminant x^T L x for restoration?
• We answer these basic questions by viewing the discrete graph as samples of a high-dimensional manifold [7, 8].

[7] J. Pang, G. Cheung, A. Ortega, O. C. Au, "Optimal Graph Laplacian Regularization for Natural Image Denoising," IEEE International Conference on Acoustics, Speech and Signal Processing, Brisbane, Australia, April 2015.
[8] J. Pang, G. Cheung, W. Hu, O. C. Au, "Redefining Self-Similarity in Natural Images for Denoising Using Graph Signal Gradient," APSIPA ASC, Siem Reap, Cambodia, December 2014.
Our Contributions
1. Using a Gaussian kernel to compute the graph weights, the graph Laplacian regularizer S_G(x) = x^T L x converges to a continuous functional S_\Omega for regularization.
2. Analysis of the functional S_\Omega provides understanding of how signals are being discriminated and to what extent; careful graph construction leads to a discriminant signal prior. An effective regularizer S_G is obtained by selecting edge weights via "feature functions".
3. We derive the optimal graph Laplacian regularizer for denoising, which is discriminant for small noise and robust when very noisy.
Graph-Based Image Processing
• Graph for image restoration:
  - Each pixel corresponds to a vertex in a graph (denote the number of pixels as M).
  - Regard the image as a signal defined on a weighted graph.
  - With a proper graph configuration, construct a filter for the image (graph signal) using prior knowledge (i.e., the signal is smooth on the graph).
  e.g., the graph of a 5×5 patch (not necessarily a grid graph).
Road Map
From the continuous domain to the discrete domain:
1. Choose continuous feature functions \{f_n\}_{n=1}^N.
2. Obtain the metric space G \in R^{2 \times 2} on a point-by-point basis.
3. Sample to obtain the discrete features \{f_n^D\}_{n=1}^N, and compute the weights and Laplacian L \in R^{M \times M}.
4. The resulting graph Laplacian regularizer S_G(h^D) converges to the continuous functional S_\Omega(h).
• Different features lead to different regularization behavior!
Graph Construction (I)
• First, define:
  - 2D domain \Omega \subset R^2: the shape of an image patch.
  - Samples s_i = [x_i \; y_i]^T, 1 \le i \le M: M uniformly distributed random samples on \Omega (pixel locations in our work).
• (Freely) choose N continuous functions f_n(x, y): \Omega \to R, 1 \le n \le N, called feature functions, for example:
  - intensity for a gray-scale image (N = 1);
  - R, G, B channels for a color image (N = 3).
Graph Construction (II)
• Sampling at the M positions in \Omega gives discretized feature functions

  f_n^D = [ f_n(x_1, y_1) \; f_n(x_2, y_2) \; \cdots \; f_n(x_M, y_M) ]^T.

• For each sample s_i, define a length-N vector

  v_i = [ f_1^D(i) \; f_2^D(i) \; \cdots \; f_N^D(i) ]^T.

• Build a graph G with M vertices; each sample s_i has a vertex V_i.
Graph Construction (III)
• Weight between vertices V_i and V_j:

  w_{ij} = (\gamma_i \gamma_j)^{-1} \varphi(d_{ij}),

  where d_{ij}^2 = \| v_i - v_j \|_2^2 is the "distance" between two feature vectors, and

  \gamma_i = \frac{1}{M} \sum_{j=1}^{M} \varphi(d_{ij})

  is the degree before normalization, serving as a normalization factor.
• \varphi is a clipped Gaussian kernel:

  \varphi(d) = \exp\!\left( -\frac{d^2}{2\epsilon^2} \right) if d \le r, and 0 otherwise,

  where r = C_r \epsilon and C_r is a constant.
• G is an r-neighborhood graph, i.e., there is no edge connecting two vertices with distance greater than r.
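A sketch of this weight construction, assuming the feature vectors v_i are stacked row-wise in an M×N array; eps and Cr are illustrative values.

```python
import numpy as np

def clipped_kernel_weights(V, eps=0.3, Cr=2.0):
    """Graph weights from feature vectors V (M x N): clipped Gaussian kernel
    on pairwise feature distances, followed by degree normalization."""
    d = np.linalg.norm(V[:, None, :] - V[None, :, :], axis=2)          # d_ij
    phi = np.where(d <= Cr * eps, np.exp(-d**2 / (2 * eps**2)), 0.0)   # clipped kernel
    gamma = phi.mean(axis=1)                   # degree before normalization
    W = phi / np.outer(gamma, gamma)           # normalized weights
    np.fill_diagonal(W, 0.0)                   # no self-loops
    L = np.diag(W.sum(axis=1)) - W             # Laplacian L = D - A
    return W, L
```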
Graph Construction (IV)
• Adjacency matrix A: its (i, j)-th entry is w_{ij}. Degree matrix D: diagonal, with i-th diagonal entry \sum_{j=1}^{M} w_{ij}. The unnormalized graph Laplacian is L = D - A.
• Our graph G is very general; e.g., one can show that the popular 2D grid graph is a special case of ours.
• Let h(x, y): \Omega \to R be some continuous candidate function, and

  h^D = [ h(x_1, y_1) \; h(x_2, y_2) \; \cdots \; h(x_M, y_M) ]^T

  its discrete version.
• S_G(h^D) = (h^D)^T L h^D is the graph Laplacian regularizer, a functional on R^M.
Convergence of the Graph Laplacian Regularizer (I)
• The continuous counterpart of S_G is a functional S_\Omega on domain \Omega:

  S_\Omega(h) = \int_\Omega (\nabla h)^T G^{-1} (\nabla h) \sqrt{\det G} \; dx \, dy,

  where \nabla h = [\partial h / \partial x \;\; \partial h / \partial y]^T is the gradient of h.
• G is a 2×2 matrix, the structure tensor [9] of the gradients \{\nabla f_n\}_{n=1}^N:

  G = \sum_{n=1}^{N} \nabla f_n \nabla f_n^T
    = \begin{bmatrix} \sum_n (f_{n,x})^2 & \sum_n f_{n,x} f_{n,y} \\ \sum_n f_{n,x} f_{n,y} & \sum_n (f_{n,y})^2 \end{bmatrix}.

• G is computed from \{f_n\}_{n=1}^N on a point-by-point basis.

[9] H. Knutsson, C.-F. Westin, and M. Andersson, "Representing local structure using tensors II," in Image Analysis, Springer, 2011, vol. 6688, pp. 545-556.
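A sketch of the point-by-point computation of G from N feature channels, using finite differences for the gradients; the array layout (N×H×W) is an assumption for illustration.

```python
import numpy as np

def structure_tensor(features):
    """Per-pixel structure tensor G = sum_n grad(f_n) grad(f_n)^T for a stack
    of N feature channels (N x H x W). Returns G with shape (H, W, 2, 2)."""
    G = np.zeros(features.shape[1:] + (2, 2))
    for f in features:
        fy, fx = np.gradient(f.astype(float))   # per-channel gradients
        G[..., 0, 0] += fx * fx
        G[..., 0, 1] += fx * fy
        G[..., 1, 0] += fx * fy
        G[..., 1, 1] += fy * fy
    return G
```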
Convergence of the Graph Laplacian Regularizer (II)
• Theorem (convergence of S_G to S_\Omega): as 1) the number of samples M increases and 2) the neighborhood r = C_r \epsilon shrinks,

  \lim_{M \to \infty, \; \epsilon \to 0} S_G(h^D) \sim S_\Omega(h),

  where "\sim" means there exists a constant such that equality holds.
• With the results of [10], we proved the theorem by viewing the graph as a proxy of an N-dimensional Riemannian manifold:
  vertex V_i has coordinate s_i = [x_i \; y_i]^T on \Omega and coordinate v_i = [f_1^D(i) \; f_2^D(i) \; \cdots \; f_N^D(i)]^T on the N-D manifold.

[10] M. Hein, "Uniform convergence of adaptive graph-based regularization," in Learning Theory, Springer, 2006, pp. 50-64.
Interpretation of Graph Laplacian Regularizer (I)
• S_G converges to S_\Omega. With S_\Omega in hand, what new insights do we gain on S_G(h^D) = (h^D)^T L h^D?
• Inspect the equations carefully; three observations:
  1. (\nabla h)^T G^{-1} (\nabla h) measures the length of \nabla h in a metric space built by \{f_n\}_{n=1}^N.
  2. The eigen-space of G = \sum_{n=1}^N \nabla f_n \nabla f_n^T reflects the dominant directions of \{\nabla f_n\}.
  3. S_\Omega integrates the gradient norm over the domain \Omega.
Justification of Graph Laplacian Regularizer (II)
• What metric space is defined by G = \sum_{n=1}^N \nabla f_n \nabla f_n^T?
• Recall S_\Omega(h) = \int_\Omega (\nabla h)^T G^{-1} (\nabla h) \sqrt{\det G} \; dx \, dy. At a certain location (x, y) on the image, the gradients \{\nabla f_n\} (green dots in the figure) determine the metric.
[Figure: in the (\partial x, \partial y) plane, ellipses are contours (isolines) of the metric, reflecting how concentrated \{\nabla f_n\} are; l is the dominant direction, the eigenvector corresponding to the largest eigenvalue of G.]
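A small sketch of reading the dominant direction l off G; the tensor values here are made up for illustration.

```python
import numpy as np

G = np.array([[2.5, 1.2],
              [1.2, 0.9]])            # 2x2 structure tensor at some (x, y)
evals, evecs = np.linalg.eigh(G)      # ascending eigenvalues, orthonormal eigenvectors
l = evecs[:, -1]                      # dominant direction l (largest eigenvalue)
print("dominant direction:", l, "anisotropy:", evals[-1] / evals[0])
```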
Justification of Graph Laplacian Regularizer (III)
• The 2D metric space provides a clear picture of what signals are being discriminated and to what extent, on a point-by-point basis in the continuous domain.
[Figure: three metric spaces relative to the ground-truth gradient direction: (a) and (b) are both correct, but (b) is more discriminant; (c) is discriminant but incorrect.]
• Lesson: when the ground truth is unknown, one should design a discriminant metric space only to the extent that estimates of the ground truth are reliable!
Noise Modeling in Gradient Domain
• For a noisy patch p_0 \in R^M, identify K-1 similar patches on the noisy image; the K patches \{p_k\}_{k=0}^{K-1} form a cluster.
• On patch p_k, the gradient at pixel i is g_i^k. Dropping the superscript i, model the noisy gradients \{g_k\}_{k=0}^{K-1} as

  g_k = g + e_k, \quad 0 \le k \le K-1,

  where g is the unknown ground truth and the noise term e_k follows a 2D Gaussian with zero mean and covariance \sigma_e^2 I.
• The PDF of g_k given the ground truth g (the likelihood) is simply

  \Pr(g_k \,|\, g) = \frac{1}{2\pi\sigma_e^2} \exp\!\left( -\frac{\| g_k - g \|_2^2}{2\sigma_e^2} \right).
Seeking the Optimal Metric Space (I)
• We first establish an ideal metric space assuming we know the ground truth g:

  G_0(g) = g g^T + \alpha I.

  It is discriminant to g; a smaller \alpha makes the space more skewed.
• With noisy gradients \{g_k\}_{k=0}^{K-1}, seek the optimal metric space

  G^* = \arg\min_{G} \int_\Delta \| G - G_0(g) \|_F^2 \cdot \Pr(g \,|\, \{g_k\}_{k=0}^{K-1}) \, dg,

  where \Delta is the whole gradient domain and \Pr(g \,|\, \{g_k\}) is the posterior probability of the ground truth. The minimizer of this Frobenius-norm objective is the conditional mean:

  G^* = \int_\Delta G_0(g) \cdot \Pr(g \,|\, \{g_k\}_{k=0}^{K-1}) \, dg.   (1)
Seeking the Optimal Metric Space (II)
• Assume the prior \Pr(g) is a 2D Gaussian with covariance \sigma_g^2 I. We derive the posterior

  \Pr(g \,|\, \{g_k\}_{k=0}^{K-1}) = \frac{1}{2\pi\sigma^2} \exp\!\left( -\frac{\| g - \bar{g} \|_2^2}{2\sigma^2} \right),

  where the "ensemble" mean and variance are

  \bar{g} = \frac{1}{K + \sigma_e^2 / \sigma_g^2} \sum_{k=0}^{K-1} g_k, \qquad
  \sigma^2 = \frac{\sigma_e^2}{K + \sigma_e^2 / \sigma_g^2},

  and \sigma_e^2 is the noise variance of the g_k.
• Carrying out the integral in (1) gives the optimal metric space

  G^* = \bar{g} \, \bar{g}^T + (\sigma^2 + \alpha) I.   (2)

• Intuition: if the noise is small, \bar{g} \, \bar{g}^T dominates and G^* is discriminant; if the noise is large, (\sigma^2 + \alpha) I dominates and G^* defaults to the Euclidean space!
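A sketch of equation (2), combining the ensemble mean and variance above; alpha and the test values are illustrative.

```python
import numpy as np

def optimal_metric(gk, sigma_e2, sigma_g2, alpha=0.1):
    """Optimal 2x2 metric from K noisy gradient observations, per Eq. (2).
    gk: (K, 2) array of noisy gradients g_k."""
    K = gk.shape[0]
    denom = K + sigma_e2 / sigma_g2
    g_bar = gk.sum(axis=0) / denom              # ensemble mean (shrunk toward 0)
    sigma2 = sigma_e2 / denom                   # ensemble variance
    return np.outer(g_bar, g_bar) + (sigma2 + alpha) * np.eye(2)

# Many clean observations: G* skewed along the true gradient (discriminant);
# few or very noisy observations: G* close to a scaled identity (robust).
gk = np.array([3.0, 1.0]) + 0.1 * np.random.randn(20, 2)
print(optimal_metric(gk, sigma_e2=0.01, sigma_g2=1.0))
```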
From Metric Space to Graph Laplacian
• Our work is closely related to joint (or cross) bilateral filtering, with the average of similar patches as the guidance image. However, we adapt to the noise, resulting in robust weight estimates.
• The structure of G^* allows us to select N = 3 feature functions such that they lead to the optimal metric space:
  - f_1^D(i) = c \cdot x_i (spatial),
  - f_2^D(i) = c \cdot y_i (spatial),
  - f_3^D = \frac{1}{K + \sigma_e^2 / \sigma_g^2} \sum_{k=0}^{K-1} p_k (intensity),
  where c is a constant scaling of the pixel coordinates.
• f_1^D and f_2^D correspond to the (\sigma^2 + \alpha) I term in G^*; f_3^D leads to the \bar{g} \, \bar{g}^T term.
Formulation and Algorithm
• Adopt a patch-based recovery framework. For a noisy patch p_0:
  1. Find the K-1 patches most similar to p_0 in terms of Euclidean distance.
  2. Compute the feature functions, leading to the edge weights and the Laplacian L.
  3. Solve the unconstrained quadratic optimization (sketched in code below)

     q^* = \arg\min_q \| p_0 - q \|_2^2 + \lambda \, q^T L q = (I + \lambda L)^{-1} p_0

     to obtain the denoised patch.
• Aggregate the denoised patches to form an updated image.
• Denoise the image iteratively to gradually enhance its quality.
• We call the algorithm Optimal Graph Laplacian Regularization for Denoising (OGLRD).
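A sketch of step 3: since the objective is an unconstrained quadratic, setting its gradient to zero gives the closed form directly. In the full algorithm, L would come from the feature-based graph construction of the previous slides.

```python
import numpy as np

def denoise_patch(p0, L, lam=1.0):
    """Closed-form solution of q* = argmin_q ||p0 - q||^2 + lam * q^T L q,
    i.e., q* = (I + lam*L)^{-1} p0."""
    M = p0.size
    return np.linalg.solve(np.eye(M) + lam * L, p0)  # solve; avoid explicit inverse
```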
Experimentation (I)
• Test images: Lena, Boats, Peppers and Airplane.
• i.i.d. additive white Gaussian noise (AWGN).
• Compare OGLRD to NLM and BF.
• Up to 1.5 dB better than NLM!
Experimental Results (III)
• Some visual results when \sigma_n = 30. [Figure: before/after pairs.]
Summary
• Image denoising is an ill-posed problem; we use graph Laplacian
regularizer as prior for regularization.
• Graph Laplacian regularizer with Gaussian kernel weights converges to a
continuous functional.
• Analysis of the continuous functional provides theoretical justification of
why and to what extent the graph Laplacian regularizer can be
discriminant.
• We describe a methodology to derive the optimal edge weights given
nonlocal noisy gradient observations.
• Our denoising algorithm with graph Laplacian regularizer and gradient-based similarity outperforms NLM by up to 1.5 dB.
Outline
• Depth Image Denoising
• Graph Sparsity Prior
• Graph-signal Smoothness Prior
• Bit-depth Enhancement
Image Bit-depth Enhancement
• Problem: given a low bit-depth (LBD) image y, a quantized version of an underlying high bit-depth (HBD) image x, compute an estimate of the original HBD image.

[11] L. Rudin et al., "Nonlinear total variation based noise removal algorithms," Physica D, Elsevier, 1992.
Image Bit-depth Enhancement
• Objective: find the estimate \hat{x} that minimizes the mean squared error (MSE):

  \hat{x}_{MMSE} = \arg\min_{\hat{x}} \int \| \hat{x} - x \|_2^2 \, f(x \,|\, y) \, dx,

  where \| \hat{x} - x \|_2^2 is the squared error and f(x \,|\, y) is the posterior probability of the HBD signal x given the LBD signal y.
• Likelihood: f(y \,|\, x) equals 1 iff x_i quantizes to y_i for every pixel i.
• Smoothness prior: the HBD signal x is smooth. Conventional smoothness priors (e.g., total variation [11]) are signal-independent and over-smooth.
• Question: what is a good signal smoothness prior?

[12] P. Wan, G. Cheung, D. Florencio, C. Zhang, O. Au, "Image Bit-depth Enhancement via Maximum-a-Posteriori Estimation of Graph AC Component," IEEE International Conference on Image Processing, Paris, France, October 2014. (Top 10% accepted paper recognition.)
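A small sketch of the quantization-bin structure behind this likelihood, assuming uniform quantization from 8-bit HBD to 4-bit LBD; the bit depths are illustrative.

```python
import numpy as np

hbd_bits, lbd_bits = 8, 4
step = 2 ** (hbd_bits - lbd_bits)            # quantization bin size (here 16)

x = np.array([7, 130, 250])                   # HBD pixel values
y = x // step                                 # observed LBD image
lo, hi = y * step, (y + 1) * step - 1         # feasible bin [lo, hi] per pixel
print(y, lo, hi)                              # any x' with lo <= x' <= hi gives f(y|x') = 1
```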
Graph-signal Smoothness Prior
• Prior: x^T L x is small, where L is the graph Laplacian matrix describing inter-pixel similarities; it reconstructs a smooth signal without blurring edges.
• The MMSE problem is now well posed, but difficult to solve.
Image Bit-depth Enhancement
• MAP finds the smoothest solution in the feasible space, but it can have arbitrarily large MSE!
• The smoothest feasible signal is basically DC.
Image Bit-depth Enhancement: Proposed ACDC Algorithm
• Decompose the signal into AC and DC components: x = x_A + x_D.
• Compute edge weights from the quantized signal.
• Compute the MAP solution of the AC signal.
• Compute the MMSE solution of the DC signal given the AC signal.
Why better?
• Posterior ≈ likelihood × prior.
• The likelihood of the AC component, f(y | x_A), is less skewed (it integrates over possible DC values).
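A minimal sketch of the AC/DC split x = x_A + x_D, assuming the DC component is the constant (mean) part of the patch; the values are illustrative.

```python
import numpy as np

x = np.array([100.0, 104.0, 98.0, 102.0])   # patch values
x_D = np.full_like(x, x.mean())             # DC component: constant graph signal
x_A = x - x_D                               # AC component: zero-mean remainder
assert np.allclose(x_A + x_D, x)
```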
Summary
• Inverse imaging requires good signal priors.
• Depth Image Denoising
• Graph Sparsity Prior (probabilistic interpretation)
• Graph-signal Smoothness Prior (deterministic interpretation)
• Bit-depth Enhancement
• Instead of a fidelity term, the feasible space is restricted by quantization bin constraints (acting as the likelihood term).
Conclusion
Depth Image Coding & Processing
• Coding: graph Fourier transform (GFT), generalized graph Fourier transform (GGFT).
• Denoising: graph sparsity prior, graph-signal smoothness prior
Future Work
• Natural image coding using graph-based transforms.
• Depth image denoising / interpolation for non-AWGN noise.
• Applications given depth images: foreground/background segmentation, tracking, face modeling, etc.