
Selecting Observations against Adversarial Objectives


Page 1: Selecting Observations against Adversarial Objectives

Carnegie Mellon

Selecting Observations against Adversarial Objectives

Andreas Krause, Brendan McMahan, Carlos Guestrin, Anupam Gupta

Page 2: Selecting Observations against Adversarial Objectives

Observation selection problems

Place sensors for building automation
Monitor rivers and lakes using robots
Detect contaminations in water networks

Set V of possible observations (sensor locations, ...). We want to pick a subset A* ⊆ V such that

$A^* = \operatorname{argmax}_{|A| \le k} F(A)$

For most interesting utilities F, this is NP-hard!

Page 3: Selecting Observations against Adversarial Objectives

Key observation: Diminishing returns

Placement A = {S1, S2} vs. placement B = {S1, ..., S5}: adding a new sensor S' to A will help a lot, while adding S' to B doesn't help much.

Formalization: Submodularity
For A ⊆ B:

$F(A \cup \{S'\}) - F(A) \ge F(B \cup \{S'\}) - F(B)$
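For illustration, here is a minimal Python check of this diminishing-returns property on a toy coverage function (area covered, as mentioned on the next slide); the sensors and covered elements are made-up example data, not from the talk.

```python
# Minimal sketch: verify diminishing returns for a toy coverage function.
# F(A) = number of ground elements covered by the sensors in A (submodular).
coverage = {
    "S1": {1, 2, 3},
    "S2": {3, 4},
    "S3": {5, 6},
    "S4": {1, 6, 7},
    "S5": {7, 8},
    "Snew": {2, 3, 8, 9},   # the new sensor S'
}

def F(A):
    """Coverage utility: size of the union of elements covered by A."""
    covered = set()
    for s in A:
        covered |= coverage[s]
    return len(covered)

A = {"S1", "S2"}                      # small placement
B = {"S1", "S2", "S3", "S4", "S5"}    # larger placement, A is a subset of B
gain_A = F(A | {"Snew"}) - F(A)
gain_B = F(B | {"Snew"}) - F(B)
print(gain_A, gain_B)                 # e.g. 2 vs 1: adding S' helps A more
assert gain_A >= gain_B               # diminishing returns
```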

Page 4: Selecting Observations against Adversarial Objectives

Submodularity [with Guestrin, Singh, Leskovec, VanBriesen, Faloutsos, Glance]

We prove submodularity for:
Mutual information F(A) = H(unobserved) - H(unobserved | A)  [UAI '05, JMLR '07: spatial prediction]
Outbreak detection F(A) = impact reduction from sensing at A  [KDD '07: water monitoring, ...]

Also submodular:
Geometric coverage F(A) = area covered
Variance reduction F(A) = Var(Y) - Var(Y | A)
...

Page 5: Selecting Observations against Adversarial Objectives

Why is submodularity useful?

Theorem [Nemhauser et al. '78]: The greedy algorithm gives a constant-factor approximation:

$F(A_{greedy}) \ge (1 - 1/e)\, F(A_{opt})$   (~63% of optimal)

Can get online (data-dependent) bounds for any algorithm
Can significantly speed up the greedy algorithm
Can use MIP / branch & bound for the optimal solution

Greedy algorithm (forward selection): starting from the empty set, repeatedly add the element with the largest marginal gain,

$s_{j+1} = \operatorname{argmax}_{s \in V \setminus A_j} F(A_j \cup \{s\})$
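For illustration, a minimal Python sketch of this forward-selection rule for an arbitrary set function F; it can be tried on the toy coverage function sketched above, and the budget k is whatever the application dictates.

```python
def greedy(F, V, k):
    """Forward selection: repeatedly add the element with the largest marginal gain."""
    A = set()
    for _ in range(k):
        best, best_gain = None, float("-inf")
        for s in V - A:
            gain = F(A | {s}) - F(A)
            if gain > best_gain:
                best, best_gain = s, gain
        A.add(best)
    return A

# Example with the toy coverage function F defined earlier:
# print(greedy(F, set(coverage), k=2))
```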

Page 6: Selecting Observations against Adversarial Objectives

Robust observation selection

What if ...
... the parameters θ of the model P(X_V | θ) are unknown or change?
... sensors fail?
... an adversary selects the outbreak scenario?

[Figure: sensor placement example. Callouts: "More variability here now", "Attack here!", best placement for the old parameters θ_old vs. the new parameters θ_new, sensor locations.]

Page 7: Selecting Observations against Adversarial Objectives

Robust prediction

Instead: minimize the "width" of the confidence bands.
For every location s ∈ V, define F_s(A) = Var(s) - Var(s | A).
Minimizing the "width" means simultaneously maximizing all F_s(A).
Each F_s(A) is (often) submodular! [Das & Kempe '07]

Typical objective: minimize the average variance (MSE).

[Figure: pH value vs. horizontal position V, with confidence bands. The average-variance-optimal placement achieves low average variance (MSE) but high maximum variance, in the most interesting part of the curve!]
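As an illustration (not from the slides), here is a minimal numpy sketch of the per-location variance reduction F_s(A) = Var(s) - Var(s | A) for a Gaussian process with a squared-exponential kernel; the kernel, lengthscale, noise level, and locations are assumptions made up for this example.

```python
import numpy as np

def sq_exp_kernel(X, Y, lengthscale=1.0):
    """Squared-exponential covariance between 1-D location arrays X and Y."""
    d = X[:, None] - Y[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def variance_reduction(V, A_idx, noise=0.1):
    """F_s(A) = Var(s) - Var(s | A) for every location s in V (GP posterior)."""
    K_VV = sq_exp_kernel(V, V)
    prior_var = np.diag(K_VV).copy()
    if not A_idx:
        return np.zeros_like(prior_var)
    A = V[A_idx]
    K_AA = sq_exp_kernel(A, A) + noise * np.eye(len(A_idx))
    K_VA = sq_exp_kernel(V, A)
    post_var = prior_var - np.sum(K_VA @ np.linalg.inv(K_AA) * K_VA, axis=1)
    return prior_var - post_var   # one F_s value per location s

V = np.linspace(-3, 3, 25)            # candidate locations (as on the plot axis)
print(variance_reduction(V, A_idx=[0, 12, 24]))
```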

Page 8: Selecting Observations against Adversarial Objectives


Adversarial observation selection

Given: possible observations V, submodular functions F1, ..., Fm (e.g., one F_i for each location i)

Want to solve

$A^* = \operatorname{argmax}_{|A| \le k} \min_i F_i(A)$

Can model many problems this way:
Width of confidence bands: F_i is the variance reduction at location i
Unknown parameters: F_i is the information gain under parameters θ_i
Adversarial outbreak scenarios: F_i is the utility for scenario i
...

Unfortunately, min_i F_i(A) is not submodular!

Page 9: Selecting Observations against Adversarial Objectives

How does greedy do?

Set A    F1    F2    min_i F_i
{x}      1     0     0
{y}      0     2     0
{z}      ε     ε     ε
{x,y}    1     2     1
{x,z}    1     ε     ε
{y,z}    ε     2     ε

Greedy picks z first (its min value ε is positive); then it can choose only x or y, ending with min_i F_i = ε. The optimal solution for k = 2 is {x, y}, with min_i F_i = 1.

Theorem: The problem max_{|A| ≤ k} min_i F_i(A) does not admit any approximation unless P = NP.
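For illustration, a minimal Python sketch of this counterexample, reusing the greedy routine sketched earlier and treating each F_i as additive; the value of ε is arbitrary.

```python
EPS = 0.01   # the epsilon from the table; arbitrary small value

F1 = {"x": 1, "y": 0, "z": EPS}
F2 = {"x": 0, "y": 2, "z": EPS}

def F_min(A):
    """Adversarial objective min_i F_i(A), with each F_i taken to be additive here."""
    if not A:
        return 0.0
    return min(sum(F1[s] for s in A), sum(F2[s] for s in A))

print(greedy(F_min, {"x", "y", "z"}, k=2))   # picks z first -> score ~ EPS
print(F_min({"x", "y"}))                      # the optimal set scores 1
```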

Greedy does arbitrarily badly. Is there something better?

Page 10: Selecting Observations against Adversarial Objectives

Alternative formulation

If somebody told us the optimal value

$c^* = \max_{|A| \le k} \min_i F_i(A),$

could we recover the optimal solution A*? We would need to solve the dual covering problem

$A^* = \operatorname{argmin}_A |A| \ \text{such that} \ \min_i F_i(A) \ge c^*$

Is this any easier? Yes, if we relax the constraint |A| ≤ k.

Page 11: Selecting Observations against Adversarial Objectives

Solving the alternative problem

Trick: For each F_i and c, define the truncation

$F'_i(A) = \min\{F_i(A), c\}$

and the truncated average

$F'_{avg,c}(A) = \frac{1}{m} \sum_i F'_i(A)$

[Figure: F'_i(A) follows F_i(A) as |A| grows, then stays flat once it reaches c.]

Example (c = 1):

Set A    F1    F2    F'1    F'2    F'avg,1    min_i F_i
{x}      1     0     1      0      1/2        0
{y}      0     2     0      1      1/2        0
{z}      ε     ε     ε      ε      ε          ε
{x,y}    1     2     1      1      1          1
{x,z}    1     ε     1      ε      (1+ε)/2    ε
{y,z}    ε     2     ε      1      (1+ε)/2    ε

min_i F_i(A) ≥ c  ⟺  F'_avg,c(A) = c

Lemma: F'_avg,c(A) is submodular!
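For illustration, a minimal Python sketch of the truncated average as a set function; the F_i passed in are placeholders for whichever objectives are being used.

```python
def truncated_avg(Fs, c):
    """Return F'_avg,c(A) = (1/m) * sum_i min(F_i(A), c) as a set function."""
    def F_avg(A):
        return sum(min(F(A), c) for F in Fs) / len(Fs)
    return F_avg

# Example with the additive F1, F2 from the counterexample (c = 1):
# F_avg_1 = truncated_avg([lambda A: sum(F1[s] for s in A),
#                          lambda A: sum(F2[s] for s in A)], c=1.0)
# F_avg_1({"x", "y"})  -> 1.0
```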

Page 12: Selecting Observations against Adversarial Objectives

Why is this useful?

We can use the greedy algorithm to find an (approximate) solution!

Proposition: The greedy covering algorithm finds A_G with

F'_avg,c(A_G) = c and |A_G| ≤ α k (for any achievable value c ≤ OPT_k),

where α = 1 + log max_s Σ_i F_i({s}).
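A minimal sketch of the greedy covering step referenced in this proposition: keep adding the element with the largest gain in F'_avg,c until the value reaches c. The tolerance handling is an added implementation detail, not from the slides.

```python
def greedy_cover(F_avg, V, c, tol=1e-9):
    """Greedily grow A until F'_avg,c(A) = c (i.e., all truncated F_i are saturated)."""
    A = set()
    while F_avg(A) < c - tol:
        best, best_gain = None, 0.0
        for s in V - A:
            gain = F_avg(A | {s}) - F_avg(A)
            if gain > best_gain:
                best, best_gain = s, gain
        if best is None:          # no element improves the objective: c is unachievable
            return None
        A.add(best)
    return A
```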

Page 13: Selecting Observations against Adversarial Objectives

Back to our example

Guess c=1First pick xThen pick y

Optimal solution!

How do we find c?

Set F1 F2 mini Fi

F’avg,1

{x} 1 0 0 ½{y} 0 2 0 ½{z} {x,y}

1 2 1 1

{x,z}

1 (1+)/2

{y,z}

2 (1+)/2

Page 14: Selecting Observations against Adversarial Objectives

Submodular Saturation Algorithm

Given: set V, integer k, and functions F1, ..., Fm

Initialize c_min = 0, c_max = min_i F_i(V)
Do binary search: c = (c_min + c_max) / 2
  Use the greedy covering algorithm to find A_G such that F'_avg,c(A_G) = c
  If |A_G| > α k: c is too high, decrease c_max
  If |A_G| ≤ α k: c is too low (achievable), increase c_min
until convergence

[Figure: the binary-search interval c_min ... c ... c_max shrinks in each iteration.]
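Putting the pieces together, a minimal Python sketch of Saturate using the truncated_avg and greedy_cover helpers sketched above; the fixed number of binary-search iterations is a simplification for illustration.

```python
import math

def saturate(Fs, V, k, iters=30):
    """Binary-search the achievable adversarial value c, as in the Saturate algorithm."""
    alpha = 1 + math.log(max(sum(F({s}) for F in Fs) for s in V))
    c_min, c_max = 0.0, min(F(V) for F in Fs)
    best = set()
    for _ in range(iters):
        c = (c_min + c_max) / 2
        A_G = greedy_cover(truncated_avg(Fs, c), V, c)
        if A_G is None or len(A_G) > alpha * k:
            c_max = c             # c too high: covering needs too many elements
        else:
            c_min, best = c, A_G  # c achievable with few elements: raise the bar
    return best

# Example on the 3-element counterexample (k = 2): the returned set scores at least
# OPT_k = 1 on min_i F_i, possibly using slightly more than k elements (the alpha*k relaxation).
# Fs = [lambda A: sum(F1[s] for s in A), lambda A: sum(F2[s] for s in A)]
# print(saturate(Fs, {"x", "y", "z"}, k=2))
```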

Page 15: Selecting Observations against Adversarial Objectives

Theoretical guarantees

Theorem: The problem max_{|A| ≤ k} min_i F_i(A) does not admit any approximation unless P = NP.

Theorem: Saturate finds a solution A_S such that

min_i F_i(A_S) ≥ OPT_k and |A_S| ≤ α k,

where OPT_k = max_{|A| ≤ k} min_i F_i(A) and α = 1 + log max_s Σ_i F_i({s}).

Theorem: If there were a polytime algorithm with a better constant β < α, then NP ⊆ DTIME(n^{log log n}).

Page 16: Selecting Observations against Adversarial Objectives

Experiments:
Minimizing maximum variance in GP regression
Robust biological experimental design
Outbreak detection against adversarial contaminations

Goals:
Compare against the state of the art
Analyze the appropriateness of the "worst-case" assumption

Page 17: Selecting Observations against Adversarial Objectives

Spatial prediction

Compare to the state of the art [Sacks et al. '88, Wiens '05, ...]: highly tuned simulated-annealing heuristics (7 parameters).

Saturate is competitive and faster, and better on larger problems.

[Figure: maximum marginal variance vs. number of sensors (lower is better) for Greedy, Simulated Annealing, and Saturate; two panels: environmental monitoring and precipitation data.]

Page 18: Selecting Observations against Adversarial Objectives

Maximum vs. average variance

Minimizing the worst-case leads to good average-case score, not vice versa

[Figure: marginal variance vs. number of sensors (lower is better), two panels: environmental monitoring and precipitation data. Curves compare the maximum and average variance achieved when optimizing the average (Greedy) vs. optimizing the maximum (Saturate).]

Page 19: Selecting Observations against Adversarial Objectives

Outbreak detection

Results are even more pronounced for water-network monitoring (12,527 nodes).

[Figure: water networks, detection time in minutes vs. number of sensors (lower is better). Left panel: maximum and average detection time for Saturate vs. Greedy. Right panel: maximum detection time for Greedy, Simulated Annealing, and Saturate.]

Page 20: Selecting Observations against Adversarial Objectives

Robust experimental design

Learn the parameters θ of a nonlinear function y_i = f(x_i, θ) + w.
Choose stimuli x_i to facilitate the MLE of θ. Difficult optimization problem!

Common approach: linearization!

$y_i \approx f(x_i, \theta_0) + \nabla_\theta f_{\theta_0}(x_i)^T (\theta - \theta_0) + w$

This allows a nice closed-form (fractional) solution!

How should we choose θ_0?
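As a concrete illustration of the linearization step, a minimal sketch with the Michaelis-Menten model f(x, θ) = θ1 x / (θ2 + x) used in the experiments later; the particular parameter values and stimulus are made up for this example.

```python
import numpy as np

def f(x, theta):
    """Michaelis-Menten response: theta[0] * x / (theta[1] + x)."""
    return theta[0] * x / (theta[1] + x)

def grad_f(x, theta):
    """Gradient of f with respect to theta, evaluated at (x, theta)."""
    return np.array([x / (theta[1] + x),
                     -theta[0] * x / (theta[1] + x) ** 2])

theta0 = np.array([1.0, 0.5])          # initial parameter estimate (illustrative)
theta = np.array([1.2, 0.4])           # "true" parameters (illustrative)
x = 2.0                                 # one stimulus

linearized = f(x, theta0) + grad_f(x, theta0) @ (theta - theta0)
print(f(x, theta), linearized)          # close whenever theta is near theta0
```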

Page 21: Selecting Observations against Adversarial Objectives

Robust experimental design

State of the art [Flaherty et al., NIPS '06]:
Assume a perturbation of the Jacobian ∇_θ f_{θ_0}(x_i)
Solve a robust SDP against the worst-case perturbation
Minimize the maximum eigenvalue of the estimation error (E-optimality)

This paper:
Assume a perturbation of the initial parameter estimate θ_0
Use Saturate to perform well against all initial parameter estimates
Minimize the MSE of the parameter estimate (Bayesian A-optimality, typically submodular!)

Page 22: Selecting Observations against Adversarial Objectives

Experimental setup
Estimate the parameters of the Michaelis-Menten model (to compare results)
Evaluate the efficiency of the designs:

$\text{efficiency} \equiv \frac{\lambda_{\max}[\operatorname{Cov}(\hat\theta \mid \theta_{true}, w_{opt}(\theta_{true}))]}{\lambda_{\max}[\operatorname{Cov}(\hat\theta \mid \theta_{true}, w_{\rho}(\theta_0))]}$

The numerator is the loss of the optimal design knowing the true parameter θ_true; the denominator is the loss of the robust design assuming the (wrong) initial parameter θ_0.

Page 23: Selecting Observations against Adversarial Objectives

Robust design results

Saturate is more efficient than the SDP when optimizing for high parameter uncertainty.

[Figure: efficiency (w.r.t. E-optimality) vs. initial parameter estimate θ_{0,2}, two panels: low uncertainty in θ_0 and high uncertainty in θ_0. Curves: classical E-optimal design, the SDP approach (at different perturbation levels), and Saturate.]

Page 24: Selecting Observations against Adversarial Objectives

Future (current) work
Incorporating complex constraints (communication, etc.)
Dealing with large numbers of objectives: constraint generation
Improved guarantees for certain objectives (sensor failures)
Trading off worst-case and average-case scores

[Figure: adversarial score vs. expected score trade-off curves for k = 5, 10, 15, 20.]

Page 25: Selecting Observations against Adversarial Objectives

Conclusions

Many observation selection problems require optimizing an adversarially chosen submodular function,

$A^* = \operatorname{argmax}_{|A| \le k} \min_i F_i(A)$

The problem is not approximable to any factor!
Presented an efficient algorithm: Saturate
Achieves the optimal score, with a bounded increase in cost
Guarantees are the best possible under reasonable complexity assumptions

Saturate performs well on real-world problems
Outperforms state-of-the-art simulated-annealing algorithms for sensor placement, with no parameters to tune
Compares favorably with SDP-based solutions for robust experimental design