Searching for Extremes Among Distributed Data Sources with Optimal Probing
Zhenyu (Victor) Liu
Computer Science Department, UCLA
Why Extremes?
[Figure: a central server sends the query "highest raindrop" to Sensor 1, Sensor 2, ..., Sensor n; the answer is Sensor i (the highest one), plus its value.]
Identifying severe weather conditions (flood / drought)
[Figure: a central server sends the query "slowest link" to link 1, link 2, ..., link n on a network path from L.A. to N.Y.; the answer is link i (the slowest one), plus its transfer speed.]
Identifying the network bottleneck
[Figure: a central server sends the query "best Web site for 'Computer Algorithms'" to Amazon, Barnes & Noble, and CampusI.com; the answer is Website i (the best one), plus the matching Web pages.]
Identifying the best Web database for a user's query
What Is the Challenge?
Constant communication between sensors and the central server is too expensive
Can the central server contact only a few sensors (i.e. use probing) to find out the maximum?
A Motivating Example
[Figure a) The central server without the latest sensor updates: Sensor 1, Sensor 2, ..., Sensor n each have a possible value range, and the actual value of each sensor (e.g. Sensor 1) is unknown; communication with the sensors is expensive.]
[Figure b) Probing sensors' readings to reduce uncertainty: a probe returns a reading of 1000.]
Data Model
The reading of each source as a random variable, X1, …, Xn
[li, ui] as Xi's value range
  Bounded model: li, ui as real numbers
  Unbounded model: [-∞, ui], [li, +∞], [-∞, +∞]
Given Xi's probability distribution in [li, ui]: fi(x), Fi(x)
X1, …, Xn independent
Probing Xi results in xi, costs ci
  Uniform-cost model: c1 = c2 = … = 1
  Non-uniform-cost model
Uncertainty in The Answer
Two variables X1 and X2, uniform distribution
U(<X2, 880>) = 0.12, cost: 1 probe
U(<X2, 880>) = 0, cost: 2 probes
[Figure: two panels showing the density functions f1(x) and f2(x) of X1 and X2; axis labels 0, 600, 800, 880, 900, 1000, with the value 0.12 marked.]
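The 0.12 figure can be reproduced with a short calculation. This is a minimal sketch under two assumptions not stated in the text: X1 is uniform on [0, 1000] (a range read off the figure's axis labels), and the uncertainty of an answer is the probability that some unprobed source exceeds the returned value. The function names are illustrative.

```python
def uniform_tail(lower: float, upper: float, x: float) -> float:
    """P(X > x) for X uniform on [lower, upper]."""
    if x <= lower:
        return 1.0
    if x >= upper:
        return 0.0
    return (upper - x) / (upper - lower)


def answer_uncertainty(returned_value: float, unprobed_ranges) -> float:
    """Probability that at least one unprobed, independent uniform
    source exceeds the value returned as the maximum."""
    p_all_below = 1.0
    for lower, upper in unprobed_ranges:
        p_all_below *= 1.0 - uniform_tail(lower, upper, returned_value)
    return 1.0 - p_all_below


# X2 was probed and read 880; X1 (assumed uniform on [0, 1000]) is unprobed.
u_one_probe = answer_uncertainty(880, [(0, 1000)])
# Both variables probed: nothing is left unprobed.
u_two_probes = answer_uncertainty(880, [])
print(round(u_one_probe, 4), u_two_probes)
```

With one probe the remaining uncertainty is P(X1 > 880) = 120/1000; with two probes nothing is left to doubt.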
Uncertainty / Probing Cost Tradeoff
[Figure: uncertainty in the answer plotted against probing cost. Less probing gives high uncertainty; more probing gives low uncertainty. The tradeoff point is set by the user-specified uncertainty threshold.]
The Problem
Given the uncertainty data model, design a probing policy
P: X1 → X2 → … → Xn
that
  incurs the least probing cost
  finds the maximum variable with an uncertainty lower than ε
Brute-force searching takes n! orderings
Optimal Probing under Zero-Uncertainty
ε = 0, i.e. return an absolutely correct answer
Two policies
  P1: X1 → X2
  P2: X2 → X1
[Figure: density functions f1(x) and f2(x) of X1 and X2; axis labels 0, 800, 900, 1000.]
Optimal Probing under Zero-Uncertainty
Theorem 1: If X1, …, Xn are ranked in descending order of their upper bounds, i.e., u1 > … > un, then
P: X1 → X2 → … → Xn
is optimal in the zero-uncertainty case
The upper bound ui serves as a "representative point" for Xi
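The ordering in Theorem 1 can be sketched as a probing loop. This is an illustration only: the stopping rule (stop once the best reading so far is at least the upper bound of every unprobed source) is an assumption about how the policy is executed, and the `probe` callback, the bounds, and the readings are hypothetical.

```python
from typing import Callable, List, Tuple


def probe_zero_uncertainty(
    upper_bounds: List[float],
    probe: Callable[[int], float],
) -> Tuple[int, float, int]:
    """Probe sources in descending order of upper bound and stop as soon
    as no unprobed source can exceed the best reading seen so far.

    Returns (index of the maximum, its value, number of probes used).
    """
    order = sorted(range(len(upper_bounds)),
                   key=lambda i: upper_bounds[i], reverse=True)
    best_index, best_value, probes = -1, float("-inf"), 0
    for i in order:
        # Every source not yet probed has an upper bound <= upper_bounds[i],
        # so none of them can beat best_value once this test passes.
        if best_value >= upper_bounds[i]:
            break
        value = probe(i)
        probes += 1
        if value > best_value:
            best_index, best_value = i, value
    return best_index, best_value, probes


# Hypothetical upper bounds and actual readings for three sources.
uppers = [1000.0, 900.0, 600.0]
readings = [950.0, 700.0, 400.0]
result = probe_zero_uncertainty(uppers, lambda i: readings[i])
print(result)
```

In this hypothetical run the first reading (950) already exceeds the remaining upper bounds (900 and 600), so a single probe suffices.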
Optimal Probing under Non-Zero-Uncertainty
ε = 0.15
Two policies
  P1: X1 → X2, saves the 2nd probe if X1 > 885
  P2: X2 → X1, saves the 2nd probe if X2 > 850
[Figure: density functions f1(x) and f2(x) of X1 and X2; axis labels 0, 800, 850, 885, 900, 1000.]
Critical Point
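Critical point τi ∈ [li, ui] s.t. P(Xi > τi) = ε
Lemma 1: With two variables X1 and X2, the optimal policy always probes the one with the larger critical point first
[Figure: density functions f1(x) and f2(x) of X1 and X2 (axis labels 0, 800, 850, 885, 900, 1000), and the cumulative distribution functions F1(x) and F2(x), which reach the level 1 - ε = 0.85 at τ1 = 850 and τ2 = 885 respectively.]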
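The critical points 850 and 885 can be reproduced numerically. This is a minimal sketch assuming X1 is uniform on [0, 1000] and X2 is uniform on [800, 900] (ranges inferred from the axis labels, not stated in the text) and an uncertainty threshold of 0.15. It also compares the expected number of probes of the two policies, assuming the second probe is skipped exactly when the first reading exceeds the other variable's critical point. All names are illustrative.

```python
def uniform_cdf(lower: float, upper: float, x: float) -> float:
    """F(x) = P(X <= x) for X uniform on [lower, upper]."""
    if x <= lower:
        return 0.0
    if x >= upper:
        return 1.0
    return (x - lower) / (upper - lower)


def uniform_critical_point(lower: float, upper: float, epsilon: float) -> float:
    """Point t in [lower, upper] with P(X > t) = epsilon."""
    return upper - epsilon * (upper - lower)


epsilon = 0.15
# Assumed ranges, read off the figure's axis labels.
ranges = {"X1": (0.0, 1000.0), "X2": (800.0, 900.0)}
critical = {name: uniform_critical_point(lo, hi, epsilon)
            for name, (lo, hi) in ranges.items()}

# Lemma 1: probe the variable with the larger critical point first.
first_probe = max(critical, key=critical.get)


def expected_probes(first: str, second: str) -> float:
    """One probe always, plus a second one unless the first reading
    exceeds the critical point of the other variable."""
    lo, hi = ranges[first]
    p_second_needed = uniform_cdf(lo, hi, critical[second])
    return 1.0 + p_second_needed


cost_p1 = expected_probes("X1", "X2")  # policy X1 -> X2
cost_p2 = expected_probes("X2", "X1")  # policy X2 -> X1
print(critical, first_probe, cost_p1, cost_p2)
```

Under these assumptions the policy that starts with X2 skips the second probe half of the time, while the policy that starts with X1 skips it only when X1 > 885.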
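Deriving The Optimal Policy from The Critical Points?
Theorem 2: The optimal policy should always place Xi before Xj if:
  Cond1: τi > τj
  Cond2: ∀x > τj, Fi(x) < Fj(x)
[Figure: cumulative distribution functions Fi(x) and Fj(x), with the level 1 - ε and the critical points τj and τi marked on the x axis.]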
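The two conditions of Theorem 2 can be tested numerically for a given pair of variables. This is a rough sketch, not an exact procedure: it samples the second condition on a finite grid, and it checks only the open interval between the critical point of Xj and a caller-supplied `x_max`, on the assumption that both cumulative distribution functions reach 1 at `x_max` and the strict inequality is not meant to hold beyond it. The uniform ranges in the example are hypothetical.

```python
from typing import Callable


def make_uniform_cdf(lower: float, upper: float) -> Callable[[float], float]:
    """Return F(x) = P(X <= x) for X uniform on [lower, upper]."""
    def cdf(x: float) -> float:
        if x <= lower:
            return 0.0
        if x >= upper:
            return 1.0
        return (x - lower) / (upper - lower)
    return cdf


def must_precede(cdf_i, cdf_j, crit_i: float, crit_j: float,
                 x_max: float, steps: int = 1000) -> bool:
    """Grid-based test of the two conditions for probing Xi before Xj."""
    # Cond1: Xi has the larger critical point.
    if not crit_i > crit_j:
        return False
    # Cond2 (sampled): Fi(x) < Fj(x) on the open interval (crit_j, x_max).
    for k in range(1, steps):
        x = crit_j + (x_max - crit_j) * k / steps
        if not cdf_i(x) < cdf_j(x):
            return False
    return True


# Hypothetical example with threshold 0.15:
# Xi uniform on [500, 1000] -> critical point 925
# Xj uniform on [400, 900]  -> critical point 825
cdf_i = make_uniform_cdf(500.0, 1000.0)
cdf_j = make_uniform_cdf(400.0, 900.0)
i_before_j = must_precede(cdf_i, cdf_j, 925.0, 825.0, x_max=1000.0)
j_before_i = must_precede(cdf_j, cdf_i, 825.0, 925.0, x_max=1000.0)
print(i_before_j, j_before_i)
```

A grid check can miss a violation between sample points, so a positive result here is only indicative.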
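Applying Theorem 2 to Derive The Optimal Policy
Case 1:
[Figure: cumulative distribution functions F1(x), F2(x), …, Fn(x), with the level 1 - ε and the critical points τ1, τ2, …, τn marked on the x axis.]
Optimal policy:
P: X1 → X2 → … → Xn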
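Applying Theorem 2 to Derive The Optimal Policy
Case 2:
Possible candidate policies: {X1, X2, X3} → {X4, X5}, and X1 must be before X2
  X1 → X2 → X3 → X4 → X5
  X1 → X2 → X3 → X5 → X4
  X1 → X3 → X2 → X4 → X5
  X1 → X3 → X2 → X5 → X4
  X3 → X1 → X2 → X4 → X5
  X3 → X1 → X2 → X5 → X4
[Figure: cumulative distribution functions F1(x), F2(x), F3(x), F4(x), F5(x), with the level 1 - ε marked.]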
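The six candidate policies of Case 2 can be enumerated mechanically. This is a minimal sketch that reads the slide as saying every variable in {X1, X2, X3} is probed before every variable in {X4, X5}, and X1 comes before X2; the function and argument names are illustrative.

```python
from itertools import permutations, product


def candidate_policies(groups, precedences):
    """Enumerate probing orders in which the groups appear in the given
    sequence and every (a, b) pair in `precedences` has a before b."""
    per_group = []
    for group in groups:
        valid_orders = [
            order for order in permutations(group)
            if all(order.index(a) < order.index(b)
                   for a, b in precedences
                   if a in order and b in order)
        ]
        per_group.append(valid_orders)
    # Concatenate one valid ordering per group, keeping the group sequence.
    return [sum(combo, ()) for combo in product(*per_group)]


groups = [("X1", "X2", "X3"), ("X4", "X5")]
policies = candidate_policies(groups, [("X1", "X2")])
for policy in policies:
    print(" -> ".join(policy))
```

The constraints leave 6 orderings to compare instead of the 5! = 120 that brute-force search would examine.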
Experimental Set-up
166 rainfall sensors across Washington State
Recording the rainfall at each sensor location, on every day over the past 46 years
Probability Distribution
From the historical data, generate one distribution per sensor per day
Distinguish two kinds of historical data:
  Yesterday was dry
  Yesterday was rainy
Preliminary Results
Complexity of optimal-policy searching
Future Experimental Study
The behavior of the optimal policy on the rainfall sensor data
  Uncertainty threshold vs. number of sensor probes
The behavior of the optimal policy on synthetic datasets
  Reduction in the search space vs. number of sensor probes
Summary
Under the proposed data model, find the maximum variable with uncertainty less than ε
Optimal probing policy
  ε = 0: sort variables according to their upper bounds
  ε > 0: derive probing preferences (Xi before Xj) and reduce the search space