Chebyshev Estimator Presented by: Orr Srour


References

Yonina Eldar, Amir Beck and Marc Teboulle, "A Minimax Chebyshev Estimator for Bounded Error Estimation" (2007), to appear in IEEE Trans. Signal Proc.

Amir Beck and Yonina C. Eldar, "Regularization in Regression with Bounded Noise: A Chebyshev Center Approach", SIAM J. Matrix Anal. Appl. 29 (2), 606-625 (2007).

Jacob (Slava) Chernoi and Yonina C. Eldar, "Extending the Chebyshev Center Estimation Technique", TBA

Chebyshev Center - Agenda

- Introduction
- CC - Basic Formulation
- CC - Geometric Interpretation
- CC - So why not..?
- Relaxed Chebyshev Center (RCC)
  - Formulation of the problem
  - Relation with the original CC
  - Feasibility of the original CC
  - Feasibility of the RCC
  - CLS as a CC relaxation
  - CLS vs. RCC
  - Constraints formulation
- Extended Chebyshev Center

Notations

- y (boldface lowercase): a vector
- y_i: the i'th component of the vector y
- A (boldface uppercase): a matrix
- $\hat{x}$ (hat): the estimated vector of x
- $A \succ B$ ($A \succeq B$): A - B is PD (PSD)

The Problem

Estimate the deterministic parameter vector $x \in \mathbb{R}^m$ from observations $y \in \mathbb{R}^n$ with:

$$y = Ax + w$$

- A: n x m model matrix
- w: perturbation vector

LS Solution

When nothing else is known, a common approach is to find the vector that minimizes the data error:

$$\|Ax - y\|^2$$

Known as "least squares", this solution can be written explicitly:

$$\hat{x}_{LS} = (A^*A)^{-1}A^*y$$

(Assuming A has full column rank)

But…

In practical situations A is often ill-conditioned -> poor LS results
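To make the ill-conditioning remark concrete, here is a minimal NumPy sketch. The 2 x 2 matrix, the true vector and the perturbation are made-up values chosen for illustration (they do not come from the slides), and `least_squares` is a hypothetical helper name.

```python
import numpy as np

def least_squares(A, y):
    """LS estimate argmin_x ||Ax - y||^2, assuming A has full column rank."""
    # lstsq avoids forming (A^* A)^{-1} explicitly, which is numerically safer.
    x_hat, *_ = np.linalg.lstsq(A, y, rcond=None)
    return x_hat

# Illustrative (made-up) model: two nearly collinear columns.
A = np.array([[1.0, 1.0],
              [1.0, 1.0001]])
x_true = np.array([1.0, 1.0])
w = np.array([0.0, 1e-3])      # small perturbation
y = A @ x_true + w

x_ls = least_squares(A, y)
cond_number = np.linalg.cond(A)
noise_norm = np.linalg.norm(w)
error_norm = np.linalg.norm(x_ls - x_true)
print(cond_number, noise_norm, error_norm)
```

In this toy case a perturbation of norm 1e-3 moves the LS estimate far away from `x_true`: the error can be amplified by a factor as large as the reciprocal of the smallest singular value of A.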

Regularized LS

Assume we have some simple prior information regarding the parameter vector x.

Then we can use the regularized least squares (RLS):

$$\hat{x}_{RLS} = \arg\min_{x} \{ \|Ax - y\|^2 : \|Lx\|^2 \le \eta \}$$

But…

But what if we have some prior information regarding the noise vector as well…?

What if we have some more complicated information regarding the parameter vector x?

Assumptions

From now on we assume that the noise is norm-bounded:

$$\|w\|^2 \le \rho$$

And that x lies in a set defined by:

$$\mathcal{C} = \{ x : f_i(x) = x^T Q_i x + 2 g_i^T x + d_i \le 0,\ 1 \le i \le k \}$$

$$Q_i \succeq 0,\quad g_i \in \mathbb{R}^m,\quad d_i \in \mathbb{R}$$

(hence C is the intersection of k ellipsoids)

Assumptions

The feasible parameter set of x is then given by:

$$\mathcal{Q} = \{ x : x \in \mathcal{C},\ \|y - Ax\|^2 \le \rho \}$$

(hence Q is compact)

Q is assumed to have non-empty interior

Constrained Least Squares (CLS)

Given the prior knowledge $x \in \mathcal{C}$, a popular estimation strategy is:

$$\hat{x}_{CLS} = \arg\min_{x \in \mathcal{C}} \|y - Ax\|^2$$

- Minimization of the data error over C
- But: the noise constraint is unused…

More importantly, it doesn't necessarily lead to small estimation error:

$$\|\hat{x} - x\|$$

Chebyshev Center

The goal: estimator with small estimation error

Suggested method: minimize the worst-case error over all feasible vectors

$$\min_{\hat{x}} \max_{x \in \mathcal{Q}} \|\hat{x} - x\|^2$$

Chebyshev Center – Geometric Interpretation

Alternative representation:

$$\min_{\hat{x},\, r} \{ r : \|\hat{x} - x\|^2 \le r \ \text{for all}\ x \in \mathcal{Q} \}$$

-> find the smallest ball (hence its center $\hat{x}$ and its radius $\sqrt{r}$) which encloses the set Q.
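As a small illustration of the "smallest enclosing ball" picture, the sketch below approximates the Chebyshev center of a finite set of points. This is a stand-in for the continuous set Q, and it uses the simple Badoiu-Clarkson iteration, which is not the method of the slides. The function names and the rectangle example are hypothetical.

```python
import math

def farthest_point(center, points):
    """Return the point of `points` farthest from `center`."""
    return max(points, key=lambda p: math.dist(center, p))

def chebyshev_center_of_points(points, iterations=10000):
    """Approximate center and radius of the smallest ball enclosing `points`.

    Badoiu-Clarkson iteration: repeatedly move the current center a
    shrinking step toward the point that is currently farthest away.
    """
    center = tuple(points[0])
    for i in range(1, iterations + 1):
        far = farthest_point(center, points)
        center = tuple(c + (f - c) / (i + 1) for c, f in zip(center, far))
    radius = max(math.dist(center, p) for p in points)
    return center, radius

# Illustrative set: corners of a 4 x 2 rectangle (exact center is (2, 1)).
corners = [(0.0, 0.0), (4.0, 0.0), (4.0, 2.0), (0.0, 2.0)]
center, radius = chebyshev_center_of_points(corners)
print(center, radius)
```

The iteration only approximates the center, so the printed values are close to, but not exactly, (2, 1) and sqrt(5).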


Chebyshev Center

This problem is more commonly known as finding the "Chebyshev center".

Pafnuty Lvovich Chebyshev (16.5.1821 – 08.12.1894)

Chebyshev Center – The problem

- The inner maximization is non-convex
- Computing CC is a hard optimization problem
- Can be solved efficiently over the complex domain for intersection of 2 ellipsoids

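Relaxed Chebyshev Center (RCC)

Let us consider the inner maximization first:

$$\max_{x} \{ \|\hat{x} - x\|^2 : f_i(x) \le 0,\ 0 \le i \le k \}$$

$$f_i(x) = x^T Q_i x + 2 g_i^T x + d_i,\quad 0 \le i \le k$$

and:

$$Q_0 = A^T A,\quad g_0 = -A^T y,\quad d_0 = \|y\|^2 - \rho$$

(with $\rho$ the bound on the noise norm)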
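The index-0 quadratic encodes the noise constraint: with $Q_0 = A^TA$, $g_0 = -A^Ty$ and $d_0 = \|y\|^2 - \rho$ (writing $\rho$ for the noise bound), $f_0(x)$ equals $\|Ax - y\|^2 - \rho$. A quick numerical check of that identity, using made-up data and hypothetical helper names:

```python
import numpy as np

def noise_constraint_quadratic(A, y, rho):
    """Return (Q0, g0, d0) such that f0(x) = ||Ax - y||^2 - rho."""
    Q0 = A.T @ A
    g0 = -A.T @ y
    d0 = float(y @ y) - rho
    return Q0, g0, d0

def quadratic_form(Q, g, d, x):
    """Evaluate f(x) = x^T Q x + 2 g^T x + d."""
    return float(x @ Q @ x + 2.0 * g @ x + d)

# Made-up data, for illustration only.
A = np.array([[2.0, 0.0], [1.0, 1.0], [0.0, 3.0]])
y = np.array([1.0, 2.0, 3.0])
rho = 0.5
x = np.array([0.3, 0.9])

Q0, g0, d0 = noise_constraint_quadratic(A, y, rho)
f0 = quadratic_form(Q0, g0, d0, x)
direct = float(np.sum((A @ x - y) ** 2)) - rho
print(f0, direct)
```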

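Relaxed Chebyshev Center (RCC)

Denoting $\Delta = xx^T$, we can write the optimization problem as:

$$\max_{(\Delta, x) \in \mathcal{G}} \{ \|\hat{x}\|^2 - 2\hat{x}^T x + \mathrm{Tr}(\Delta) \}$$

with:

$$\mathcal{G} = \{ (\Delta, x) : f_i(\Delta, x) \le 0,\ 0 \le i \le k,\ \Delta = xx^T \}$$

$$f_i(\Delta, x) = \mathrm{Tr}(Q_i \Delta) + 2 g_i^T x + d_i$$

- The objective: concave
- The set G: not convex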

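Relaxed Chebyshev Center (RCC)

Let us replace G with:

$$\mathcal{T} = \{ (\Delta, x) : f_i(\Delta, x) \le 0,\ 0 \le i \le k,\ \Delta \succeq xx^T \}$$

(Convex)

And write the RCC as the solution of:

$$\min_{\hat{x}} \max_{(\Delta, x) \in \mathcal{T}} \{ \|\hat{x}\|^2 - 2\hat{x}^T x + \mathrm{Tr}(\Delta) \}$$

(Convex)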

Relaxed Chebyshev Center (RCC)

- T is bounded
- The objective is concave (linear) in $(\Delta, x)$
- The objective is convex in $\hat{x}$

We can replace the order: min-max to max-min

$$\max_{(\Delta, x) \in \mathcal{T}} \min_{\hat{x}} \{ \|\hat{x}\|^2 - 2\hat{x}^T x + \mathrm{Tr}(\Delta) \}$$

Relaxed Chebyshev Center (RCC)

The inner minimization is a simple quadratic problem resulting in

$$\hat{x} = x$$

Thus the RCC problem can be written as:

$$\max_{(\Delta, x) \in \mathcal{T}} \{ -\|x\|^2 + \mathrm{Tr}(\Delta) \}$$

Note: this is a convex optimization problem.
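The inner minimization step can be spelled out as a short worked derivation ($\Delta$ denotes the matrix variable that stands in for $xx^T$ in the relaxation): setting the gradient with respect to $\hat{x}$ to zero and substituting back gives

```latex
\begin{aligned}
\nabla_{\hat{x}} \left( \|\hat{x}\|^2 - 2\hat{x}^T x + \mathrm{Tr}(\Delta) \right)
  &= 2\hat{x} - 2x = 0
  \;\Longrightarrow\; \hat{x} = x, \\
\|x\|^2 - 2x^T x + \mathrm{Tr}(\Delta)
  &= -\|x\|^2 + \mathrm{Tr}(\Delta).
\end{aligned}
```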

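RCC as an upper bound for CC

RCC is not generally equal to the CC (except for k = 1 over the complex domain)

Since $\mathcal{G} \subseteq \mathcal{T}$ we have:

$$\begin{aligned}
\min_{\hat{x}} \max_{x \in \mathcal{Q}} \|\hat{x} - x\|^2
&= \min_{\hat{x}} \max_{(\Delta, x) \in \mathcal{G}} \{ \|\hat{x}\|^2 - 2\hat{x}^T x + \mathrm{Tr}(\Delta) \} \\
&\le \min_{\hat{x}} \max_{(\Delta, x) \in \mathcal{T}} \{ \|\hat{x}\|^2 - 2\hat{x}^T x + \mathrm{Tr}(\Delta) \}
\end{aligned}$$

Hence the RCC provides an upper bound on the optimal minimax value.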

CC – The problem

RCC Solution

Theorem: The RCC estimator is given by:

$$\hat{x}_{RCC} = -\left( \sum_{i=0}^{k} \alpha_i Q_i \right)^{-1} \left( \sum_{i=0}^{k} \alpha_i g_i \right)$$

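RCC Solution

Where $(\alpha_0, \ldots, \alpha_k)$ are the optimal solution of:

$$\min_{\alpha_i} \left\{ \left( \sum_{i=0}^{k} \alpha_i g_i \right)^T \left( \sum_{i=0}^{k} \alpha_i Q_i \right)^{-1} \left( \sum_{i=0}^{k} \alpha_i g_i \right) - \sum_{i=0}^{k} \alpha_i d_i \right\}$$

subject to:

$$\sum_{i=0}^{k} \alpha_i Q_i \succeq I,\qquad \alpha_i \ge 0,\ 0 \le i \le k$$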
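As a sanity check of the closed form $\hat{x}_{RCC} = -(\sum_i \alpha_i Q_i)^{-1}(\sum_i \alpha_i g_i)$, the sketch below evaluates it for given multipliers. It assumes the $\alpha_i$ are already known (solving for them is not shown), uses made-up data, and `rcc_from_multipliers` is a hypothetical name. When only the noise constraint is present ($Q_0 = A^TA$, $g_0 = -A^Ty$), the factor $\alpha_0$ cancels and the formula reduces to the LS solution.

```python
import numpy as np

def rcc_from_multipliers(alphas, Qs, gs):
    """Evaluate -(sum_i a_i Q_i)^{-1} (sum_i a_i g_i) for given multipliers."""
    Q_sum = sum(a * Q for a, Q in zip(alphas, Qs))
    g_sum = sum(a * g for a, g in zip(alphas, gs))
    return -np.linalg.solve(Q_sum, g_sum)

# Made-up model; only the noise constraint (index 0) is present.
A = np.array([[2.0, 0.0], [1.0, 1.0], [0.0, 3.0]])
y = np.array([1.0, 2.0, 3.0])
Q0, g0 = A.T @ A, -A.T @ y

x_rcc = rcc_from_multipliers([0.7], [Q0], [g0])   # any alpha_0 > 0 gives the same point
x_ls, *_ = np.linalg.lstsq(A, y, rcond=None)
print(x_rcc, x_ls)
```

This matches the geometric picture: with a single ellipsoid, the center of the smallest enclosing ball is the center of the ellipsoid.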

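RCC Solution – as SDP

Or as a semidefinite program (SDP):

$$\min_{\alpha_i,\, t} \left\{ t - \sum_{i=0}^{k} d_i \alpha_i \right\}$$

s.t.:

$$\begin{pmatrix} \sum_{i=0}^{k} \alpha_i Q_i & \sum_{i=0}^{k} \alpha_i g_i \\ \left( \sum_{i=0}^{k} \alpha_i g_i \right)^T & t \end{pmatrix} \succeq 0$$

$$\sum_{i=0}^{k} \alpha_i Q_i \succeq I,\qquad \alpha_i \ge 0,\ 0 \le i \le k$$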
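The SDP form follows from the Schur complement. As a brief worked step, write $Q_\alpha = \sum_i \alpha_i Q_i$ and $g_\alpha = \sum_i \alpha_i g_i$ (shorthand introduced here, not in the slides). Since $Q_\alpha \succeq I \succ 0$:

```latex
\begin{pmatrix} Q_\alpha & g_\alpha \\ g_\alpha^T & t \end{pmatrix} \succeq 0
\quad\Longleftrightarrow\quad
t \ge g_\alpha^T Q_\alpha^{-1} g_\alpha
```

So at the optimum t equals the quadratic term $g_\alpha^T Q_\alpha^{-1} g_\alpha$, and minimizing $t - \sum_i d_i \alpha_i$ is the same as minimizing that term minus $\sum_i \alpha_i d_i$ under the constraints $Q_\alpha \succeq I$, $\alpha_i \ge 0$.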

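Feasibility of the CC

Proposition: $\hat{x}_{CC}$ is feasible.

Proof:

Let us write the optimization problem as:

$$\min_{\hat{x}} \{ \|\hat{x}\|^2 + \varphi(\hat{x}) \}$$

with:

$$\varphi(\hat{x}) = \max_{x \in \mathcal{Q}} \{ -2\hat{x}^T x + \|x\|^2 \}$$

1. $\varphi$ is convex in $\hat{x}$
2. The objective $\|\hat{x}\|^2 + \varphi(\hat{x})$ is strictly convex
3. Hence the problem has a UNIQUE solution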

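Feasibility of the CC

Let us assume that $\hat{x}$ is infeasible, and denote by y its projection onto Q.

By the projection theorem:

$$(\hat{x} - y)^T (x - y) \le 0 \quad \forall x \in \mathcal{Q}$$

and therefore:

$$\|\hat{x} - x\|^2 = \|\hat{x} - y\|^2 + \|y - x\|^2 - 2(\hat{x} - y)^T (x - y) \ge \|\hat{x} - y\|^2 + \|y - x\|^2$$

with $\|\hat{x} - y\|^2 > 0$, because $\hat{x} \notin \mathcal{Q}$ while $y \in \mathcal{Q}$.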

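Feasibility of the CC

So:

$$\|y - x\|^2 < \|\hat{x} - x\|^2 \quad \forall x \in \mathcal{Q}$$

Which using the compactness of Q implies:

$$\max_{x \in \mathcal{Q}} \|y - x\|^2 < \max_{x \in \mathcal{Q}} \|\hat{x} - x\|^2$$

But this contradicts the optimality of $\hat{x}$.

Hence: $\hat{x}$ is unique and feasible.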

Feasibility of the RCC

Proposition: $\hat{x}_{RCC}$ is feasible.

Proof:

Uniqueness follows from the approach used earlier.

Let us prove feasibility by showing that any solution of the RCC lies in Q, the feasible set of the CC.

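Feasibility of the RCC

Let $(\hat{\Delta}, \hat{x}_{RCC}) \in \mathcal{T}$ be a solution for the RCC problem. Then:

$$\mathrm{Tr}(Q_i \hat{\Delta}) + 2 g_i^T \hat{x}_{RCC} + d_i \le 0,\quad 0 \le i \le k$$

Since:

$$\hat{\Delta} \succeq \hat{x}_{RCC} \hat{x}_{RCC}^T,\qquad Q_i \succeq 0$$

We get:

$$\begin{aligned}
f_i(\hat{x}_{RCC}) &= \hat{x}_{RCC}^T Q_i \hat{x}_{RCC} + 2 g_i^T \hat{x}_{RCC} + d_i \\
&\le \mathrm{Tr}(Q_i \hat{\Delta}) + 2 g_i^T \hat{x}_{RCC} + d_i \le 0
\end{aligned}$$

$$\Longrightarrow\ \hat{x}_{RCC} \in \mathcal{Q}$$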
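The key inequality in this argument is $x^T Q x \le \mathrm{Tr}(Q\Delta)$ whenever $\Delta \succeq xx^T$ and $Q \succeq 0$: the gap is $\mathrm{Tr}(Q(\Delta - xx^T))$, the trace of a product of two PSD matrices. A small numerical illustration with randomly generated (illustrative) matrices:

```python
import numpy as np

rng = np.random.default_rng(0)          # fixed seed, illustrative data only
m = 4
x = rng.standard_normal(m)

B = rng.standard_normal((m, m))
Q = B.T @ B                              # PSD by construction
C = rng.standard_normal((m, m))
Delta = np.outer(x, x) + C.T @ C         # Delta - x x^T is PSD by construction

quad = float(x @ Q @ x)                  # x^T Q x
relaxed = float(np.trace(Q @ Delta))     # Tr(Q Delta)
gap = relaxed - quad                     # equals Tr(Q C^T C), which is >= 0
print(quad, relaxed, gap)
```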

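CLS as CC relaxation

We now show that CLS is also a (looser) relaxation of the Chebyshev center.

Reminder:

$$\hat{x}_{CLS} = \arg\min_{x \in \mathcal{C}} \|y - Ax\|^2$$

$$\mathcal{C} = \{ x : f_i(x) = x^T Q_i x + 2 g_i^T x + d_i \le 0,\ 1 \le i \le k \}$$

$$Q_i \succeq 0,\quad g_i \in \mathbb{R}^m,\quad d_i \in \mathbb{R}$$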

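CLS as CC relaxation

Note that $x \in \mathcal{Q}$ is equivalent to

$$x \in \mathcal{C},\qquad \|y - Ax\|^2 \le \rho$$

Define the following CC relaxation:

$$\mathcal{V} = \{ (\Delta, x) : x \in \mathcal{C},\ \mathrm{Tr}(A^T A \Delta) - 2 y^T A x + \|y\|^2 - \rho \le 0,\ \Delta \succeq xx^T \}$$

- $x \in \mathcal{C}$: unharmed
- the noise constraint: relaxed

$$\max_{(\Delta, x) \in \mathcal{V}} \{ -\|x\|^2 + \mathrm{Tr}(\Delta) \}$$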

CLS as CC relaxation

Theorem: The CLS estimate is the same as the relaxed CC over V (here CCV).

Proof: Let us assume $(\Delta_1, x_1)$ is the CCV solution, and $x_2$ the CLS solution, with $x_1 \ne x_2$.

The CLS is a strictly convex problem, so its solution is unique, and since $x_1 \in \mathcal{C}$:

$$r = \|y - Ax_1\|^2 - \|y - Ax_2\|^2 > 0$$

CLS as CC relaxation

Define

$$\Delta' = x_2 x_2^T + \Delta_1 - x_1 x_1^T + \frac{r}{\mathrm{Tr}(A^T A)} I$$

It is easy to show that

$$(\Delta', x_2) \in \mathcal{V}$$

(hence it is a feasible point for the CCV)

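CLS as CC relaxation

Denote by $P(\Delta, x)$ the objective of the CCV.

By definition:

$$P(\Delta', x_2) = P(\Delta_1, x_1) + \frac{m\, r}{\mathrm{Tr}(A^T A)} > P(\Delta_1, x_1)$$

(the added term is > 0; m is the length of x), contradicting the optimality of $(\Delta_1, x_1)$.

$$\Longrightarrow\ x_1 = x_2$$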

CLS vs. RCC

Now, as in the proof of the feasibility of the RCC, we know that:

$$f_i(x) \le f_i(\Delta, x)$$

And so:

$$\mathcal{T} \subseteq \mathcal{V}$$

Which means that the CLS estimate is the solution of a looser relaxation than that of the RCC.

Modeling Constraints

- The RCC optimization method is based upon a relaxation of the set Q
- Different characterizations of Q may lead to different relaxed sets.
- Indeed, the RCC depends on the specific chosen form of Q (unlike CC and CLS).

Linear Box Constraints

Suppose we want to append box constraints upon x:

$$l \le a^T x \le u$$

These can also be written as:

$$(a^T x - l)(a^T x - u) \le 0$$

Which of the two is preferable…?
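Both forms describe exactly the same set of x (for l ≤ u), so any difference between them can only come from how each one is relaxed. A tiny check of the equivalence on a grid of scalar values t standing for $a^Tx$. The bounds and the grid are illustrative, and the function names are hypothetical.

```python
def in_box_linear(t, l, u):
    """Two linear inequalities: l <= t <= u, with t standing for a^T x."""
    return l <= t <= u

def in_box_quadratic(t, l, u):
    """Single quadratic inequality: (t - l)(t - u) <= 0."""
    return (t - l) * (t - u) <= 0

# Illustrative bounds and a grid of scalar values t = a^T x.
l, u = -1.0, 2.0
grid = [k / 4 for k in range(-12, 13)]          # -3.0, -2.75, ..., 3.0
agree = all(in_box_linear(t, l, u) == in_box_quadratic(t, l, u) for t in grid)
print(agree)
```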

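Linear Box Constraints

Define:

$$\mathcal{T}_1 = \{ (\Delta, x) : \mathrm{Tr}(Q_i \Delta) + 2 g_i^T x + d_i \le 0,\ a^T x - u \le 0,\ l - a^T x \le 0,\ \Delta \succeq xx^T \}$$

$$\mathcal{T}_2 = \{ (\Delta, x) : \mathrm{Tr}(Q_i \Delta) + 2 g_i^T x + d_i \le 0,\ \mathrm{Tr}(aa^T \Delta) - (u + l) a^T x + ul \le 0,\ \Delta \succeq xx^T \}$$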

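Linear Box Constraints

Suppose $(\Delta, x) \in \mathcal{T}_2$, then:

$$\mathrm{Tr}(aa^T \Delta) - (u + l) a^T x + ul \le 0$$

Since $\Delta \succeq xx^T$, it follows that:

$$x^T aa^T x - (u + l) a^T x + ul \le 0$$

Which can be written as:

$$(a^T x - l)(a^T x - u) \le 0 \ \Longrightarrow\ (\Delta, x) \in \mathcal{T}_1$$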

Linear Box Constraints

Hence:

$$\mathcal{T}_2 \subseteq \mathcal{T}_1$$

T1 is a looser relaxation -> T2 is preferable.

Linear Box Constraints

An example in R2: the constraints have been chosen as the intersection of:

- A randomly generated ellipsoid
- [-1, 1] x [-1, 1]


An example – Image Deblurring

- x is a raw vector of a 16 x 16 image.
- A is a 256 x 256 matrix representing atmospheric turbulence blur (4 HBW, 0.8 STD).
- w is a WGN vector with std 0.05.
- The observations are Ax+w
- We want x back…
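A setup of this kind can be simulated along the following lines. This is only a sketch under stated assumptions: the blur is modeled as a separable, banded Gaussian kernel with half-bandwidth 4 and standard deviation 0.8, the normalization is chosen here for convenience, and the image is a random stand-in with pixel values in [0, 4]. None of these details are confirmed by the slides, and `gaussian_blur_matrix` is a hypothetical helper.

```python
import numpy as np

def gaussian_blur_matrix(n=16, half_bandwidth=4, std=0.8):
    """Separable Gaussian blur acting on a vectorized n x n image (n^2 x n^2 matrix)."""
    offsets = np.arange(n)
    kernel = np.exp(-offsets**2 / (2.0 * std**2))
    kernel[half_bandwidth:] = 0.0                 # keep only the central band
    idx = np.abs(offsets[:, None] - offsets[None, :])
    T = kernel[idx]                               # symmetric banded Toeplitz block
    T /= T.sum(axis=1).max()                      # scale so rows sum to at most 1
    return np.kron(T, T)

rng = np.random.default_rng(0)
image = rng.uniform(0.0, 4.0, size=(16, 16))      # stand-in image, not the one from the slides
x = image.reshape(-1)                             # vectorized image, length 256
A = gaussian_blur_matrix()
w = 0.05 * rng.standard_normal(x.size)            # white Gaussian noise, std 0.05
y = A @ x + w
print(A.shape, y.shape)
```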

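An example – Image Deblurring

LS:

$$\hat{x}_{LS} = (A^T A)^{-1} A^T y$$

RLS: with regularization bound $1.1\|x\|^2$

CLS:

$$\min \{ \|Ax - y\|^2 : x_i (x_i - 4) \le 0,\ 1 \le i \le 256 \}$$

RCC:

$$\mathcal{Q} = \{ x : \|Ax - y\|^2 \le \rho,\ x_i (x_i - 4) \le 0,\ 1 \le i \le 256 \}$$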


Questions..?

Now is the time…