Data-Driven Optimization under Distributional Uncertainty
Shuo Han
Postdoctoral Researcher, Electrical and Systems Engineering
University of Pennsylvania
Shuo Han, UTC-IASE, Apr 2017
Internet of Things (IoT)
• A network of physical objects
  - Devices
  - Vehicles
  - Buildings
• Allows objects to be
  - Sensed and controlled
  - Remotely across the network
• Growing rapidly, by 2020:
  - 50 billion devices
  - 6.58 devices per person
What IoT Brings: More Sensing (and More Data)
• Total size of dataset: 267 GB
• 1.1 billion taxi and Uber trips (2009 - 2015)
• Pick-up and drop-off dates/times, locations, distances...
[Source: nyc.gov]
TLC: Taxi and Limousine Commission
What IoT Brings: More Control
• Connected and Autonomous Vehicles
• Smart Home Appliances
• Wireless Traffic Light Control
• Smart Buildings
Smart Cities: IoT + Decision Support
[Figure: loop between City Infrastructure and Control & Optimization Algorithm. Sensor Data flows from the infrastructure to the algorithm; Actions flow from the algorithm back to the infrastructure.]
Investment in Smart Cities
“... an infrastructure to continuously improve the collection, aggregation, and use of data to improve the life of their residents – by harnessing the growing data revolution, low-cost sensors, and research collaborations, and doing so securely to protect safety and privacy.”
The Smart Cities Initiative from the White House (Sep 2015)
TerraSwarm
TerraSwarm: Swarm at The Edge of The Cloud
“How should we make use of data?”
“How should we send data?”
“How should we collect data?”
Research Interests
• Theory: Convex Optimization, Control Theory, Statistics
• Applications: Energy, Transportation
• Research Topics: Multi-Agent Systems, Stochastic Systems, Network Dynamics
Research Overview
• Data-Driven Optimization: [ACC13], [SIOPT15], [CDC15], [TASE16], [ICCPS17]
• Privacy Solutions for Cyber-Physical Systems: [Allerton14], [TAC16]
• Pricing for Ridesharing: [ACC17]
[Figure: power generation vs. time of day (6am, 12pm, 6pm)]
Motivation: Wind Energy Integration
[Figure: Wind Energy Data. Power generation (0 to 150) vs. time of day (6am, 12pm, 6pm). Source: AESO]
[Figure: conventional power plant + wind power + storage devices]
Control Action: Allocation of energy storage
How can we make use of the wind power generation data to maximally utilize wind power?
Motivation: On-Demand Ridesharing in Cities
Trip data:
- Pick-up and drop-off times
- Pick-up/drop-off locations
- Travel distances
Cause: Mismatch between supply and demand
Control Action: Redistribution of empty vehicles
How can we make use of the trip data to reduce the average wait time for passengers?
Background: Stochastic Programming
$$\underset{x}{\text{minimize}} \quad \mathbb{E}_{\theta \sim d}\,[f(x, \theta)]$$
• $f$: Objective function
• $x$: Decision variable
  - Ridesharing: Redirection of empty vehicles
  - Wind power integration: Allocation of storage
• $\theta$: Stochastic phenomenon
  - Ridesharing: Future passenger demand
  - Wind power integration: Wind power generation
• $d$: Probability distribution of $\theta$ (the probability distribution that models the stochastic phenomenon)
Distribution is Not Always Available
We often do not have: the probability distribution itself
Instead, we have: samples drawn from it
[Figure: power generation (0 to 150) vs. time of day (6am, 12pm, 6pm)]
Question: How should these samples be used in a computationally tractable way with performance guarantees?
Using Sampled Data: Previous Methods
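Sample average approximation
$$\underset{x}{\text{minimize}} \quad \frac{1}{n} \sum_{i=1}^{n} f(x, \theta_i)$$
• Weak guarantee on performance

Robust optimization
$$\underset{x}{\text{minimize}} \quad \max_{\theta \in \Theta} f(x, \theta)$$
[Figure: samples $\theta_i$ enclosed in an uncertainty set $\Theta$]
• Can be extremely conservative

Distributional Information + Uncertainty?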
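The contrast between sample average approximation and robust optimization can be sketched on a toy problem. This is a minimal sketch, not a method from the talk: the cost function `cost`, the sample values, the interval uncertainty set, and the grid search are all hypothetical choices made for illustration.

```python
# Toy comparison of sample average approximation (SAA) and robust optimization.
# Everything here (cost model, numbers, grid) is hypothetical and for illustration only.

def cost(x, theta, unit_cost=0.3):
    """Hypothetical f(x, theta): pay for capacity x, plus any unmet amount theta - x."""
    return unit_cost * x + max(theta - x, 0.0)

def saa_objective(x, samples):
    """(1/n) * sum_i f(x, theta_i): the empirical average cost."""
    return sum(cost(x, t) for t in samples) / len(samples)

def robust_objective(x, lo, hi):
    """max over theta in [lo, hi] of f(x, theta).
    f is convex in theta, so the maximum over an interval is at an endpoint."""
    return max(cost(x, lo), cost(x, hi))

samples = [2.0, 3.0, 3.0, 4.0, 9.0]   # made-up observations theta_i
grid = [0.5 * i for i in range(21)]   # candidate decisions x in [0, 10]

x_saa = min(grid, key=lambda x: saa_objective(x, samples))
x_robust = min(grid, key=lambda x: robust_objective(x, min(samples), max(samples)))
print("SAA decision:", x_saa, " robust decision:", x_robust)
```

With these made-up numbers, the SAA decision is sized for the bulk of the samples, while the robust decision is sized for the single largest sample. That gap is one way to picture why robust optimization can be conservative.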
Using Sampled Data: Distributional Uncertainty
• Distributional uncertainty
  - An ambiguity set $\mathcal{D}$ in the space of probability distributions
  - No assumption on the type (continuous vs. discrete, Gaussian, uniform, ...) of distributions
  - Contains the true distribution $d$ with high probability
  - Informally: “Uncertainty of uncertainty”
[Figure: ambiguity set $\mathcal{D}$ containing $d$ (true distribution)]
• Decision making problem: Distributionally robust optimization
$$\underset{x}{\text{minimize}} \quad \max_{d \in \mathcal{D}} \mathbb{E}_{\theta \sim d}\,[f(x, \theta)] \qquad \Big(\text{vs. } \underset{x}{\text{minimize}} \quad \mathbb{E}_{\theta \sim d}\,[f(x, \theta)]\Big)$$
  - Strong worst-case guarantees
  - Subsumes conventional robust optimization
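One way to see the "subsumes conventional robust optimization" claim is the short argument below. It is a sketch under an assumption that is not stated on the slide: the ambiguity set is taken to be $\mathcal{D}_\Theta$, the set of all distributions supported on an uncertainty set $\Theta$. The symbols $\mathcal{D}_\Theta$ and $\delta_{\theta_0}$ (a Dirac mass at $\theta_0$) are introduced here for illustration.

```latex
% Assumption: \mathcal{D}_\Theta = \{ d : d(\Theta) = 1 \}, all distributions supported on \Theta.
\begin{align*}
\sup_{d \in \mathcal{D}_\Theta} \mathbb{E}_{\theta \sim d}[f(x,\theta)]
  &\le \sup_{\theta \in \Theta} f(x,\theta)
  && \text{(an average never exceeds the supremum)} \\
\sup_{d \in \mathcal{D}_\Theta} \mathbb{E}_{\theta \sim d}[f(x,\theta)]
  &\ge \sup_{\theta_0 \in \Theta} \mathbb{E}_{\theta \sim \delta_{\theta_0}}[f(x,\theta)]
   = \sup_{\theta_0 \in \Theta} f(x,\theta_0)
  && \text{(every Dirac mass } \delta_{\theta_0} \text{ lies in } \mathcal{D}_\Theta\text{)}
\end{align*}
```

The two bounds coincide, so with this choice of ambiguity set the distributionally robust problem reduces to the robust problem over $\Theta$. At the other extreme, an ambiguity set containing only the true distribution gives back the stochastic program.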
Distributional Uncertainty
• Method 1: Based on certain (pseudo)metric $M$
  - KL divergence
  - Wasserstein metric (earth mover’s distance)
• Metric ball centered at the empirical distribution $\hat{d}$:
$$\mathcal{D}(\epsilon) = \{ d : M(d, \hat{d}) \le \epsilon \}$$
• The ball contains $d$ (true distribution) with high probability
• Advantage: “Nonparametric” characterization
• Disadvantage: Complexity of decision making against $\mathcal{D}$ grows quickly with the number of samples
[Figure: ball $\mathcal{D}$ centered at the empirical distribution $\hat{d}$ and containing $d$ (true distribution)]
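To make the metric-ball idea concrete, here is a minimal sketch for one special case only: one-dimensional data, the Wasserstein-1 metric, and two empirical distributions with the same number of equally weighted samples. In that case the distance is the mean absolute difference between the sorted samples. The function names and sample values are hypothetical, not from the talk.

```python
def wasserstein1_equal_size(xs, ys):
    """Wasserstein-1 distance between two 1-D empirical distributions
    with the same number of equally weighted samples."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("need two non-empty samples of equal size")
    # In 1-D the optimal transport plan matches the i-th smallest points.
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)

def in_ball(candidate, center, eps):
    """True if the candidate empirical distribution lies within distance eps
    of the center empirical distribution."""
    return wasserstein1_equal_size(candidate, center) <= eps

center = [1.0, 2.0, 4.0]      # made-up samples defining the empirical distribution
candidate = [3.0, 1.5, 2.5]   # made-up samples from some other distribution

distance = wasserstein1_equal_size(candidate, center)
print(distance, in_ball(candidate, center, eps=1.0), in_ball(candidate, center, eps=0.5))
```

The ball in the talk contains arbitrary distributions, not only empirical ones of the same size; this sketch only shows how membership is tested once a distance can be computed.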
Distributional Uncertainty (cont’d)
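• Method 2: Based on generalized moments (this talk)
$$\mathcal{D} = \{ d : \mathbb{E}_{\theta \sim d}[g(\theta)] \le 0 \}$$
• Assume: $g$ is easily bounded
• Examples
  - Moments: $\mathbb{E}[\theta] = \hat{\theta}$, $\operatorname{cov}[\theta] = \hat{\Sigma}$
  - Tail probability: $\mathbb{P}(\theta \ge \bar{\theta}) \le \epsilon$
• Classical concentration inequalities can be used to compute the probability that $\mathcal{D}$ contains $d$ (true distribution)
Example (Hoeffding’s inequality), for i.i.d. samples $\theta_i \in [0, 1]$:
$$\mathbb{P}\left( \frac{1}{n} \sum_{i=1}^{n} \theta_i \ge \mathbb{E}\theta + t \right) \le \exp(-2nt^2)$$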
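The Hoeffding bound above can be turned into numbers with a few lines of code. This is a minimal sketch that assumes i.i.d. samples taking values in [0, 1], the setting in which the bound exp(-2nt^2) holds; the function names and the example values of n, t, and delta are hypothetical.

```python
import math

def hoeffding_tail_bound(n, t):
    """Upper bound exp(-2 n t^2) on P(sample mean >= true mean + t),
    assuming n i.i.d. samples with values in [0, 1]."""
    if n <= 0 or t < 0:
        raise ValueError("need n > 0 and t >= 0")
    return math.exp(-2.0 * n * t * t)

def samples_needed(t, delta):
    """Smallest n for which the bound exp(-2 n t^2) is at most delta."""
    if t <= 0 or not (0.0 < delta < 1.0):
        raise ValueError("need t > 0 and 0 < delta < 1")
    return math.ceil(math.log(1.0 / delta) / (2.0 * t * t))

print(hoeffding_tail_bound(n=100, t=0.1))
print(samples_needed(t=0.1, delta=0.05))
```

Read in the direction of the talk: after choosing a failure probability delta, the margin t that the bound supports tells how wide a moment constraint must be for the ambiguity set to contain the true distribution with probability at least 1 - delta.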
Challenges and My Contribution
$$\underset{x}{\text{minimize}} \quad \max_{d \in \mathcal{D}} \mathbb{E}_{\theta \sim d}\,[f(x, \theta)]$$
• Challenge: Finding the worst-case distribution
  - Infinite-dimensional optimization problem
  - Not numerically tractable
• My contribution
  - Formulate equivalent convex optimization problem (under certain conditions)
  - Tractable numerical solutions
  - Conditions apply to many resource allocation and scheduling problems
• Previous work on special instances
  - [Scarf, 1958]: Analytical solution for a special case
  - [Bertsimas, Popescu, 2005]: Optimal probability inequalities
  - [Vandenberghe, Boyd, Comanor, 2007]: Optimal Chebyshev bounds
  - [Delage, Ye, 2010]: Piecewise affine functions
Main Result: Equivalent Convex Optimization Problem
Theorem: There exists an equivalent convex optimization problem for computing the worst-case distribution if
• The objective $f$ is piecewise concave: $f(\theta) = \max_{k} f^{(k)}(\theta)$, with each $f^{(k)}$ concave
• The constraint $g$ is piecewise convex: $g(\theta) = \min_{l} g^{(l)}(\theta)$, with each $g^{(l)}$ convex
[Figures: plots of $f(\theta)$ and $g(\theta)$]
Shuo Han, Molei Tao, Ufuk Topcu, Houman Owhadi, Richard M. Murray, “Convex optimal uncertainty quantification,” SIAM Journal on Optimization, 25(3), 1368–1387, 2015.
Piecewise Concave Functions
• Concave: resource allocation/scheduling
• Piecewise affine, $\max_{k \in K} \{ a_k^T \theta + b_k \}$: resource allocation/scheduling
• 0-1 indicator, $I(\theta \ge a)$: failure rate
Piecewise Convex Functions
• Linear: mean
• Convex: covariance & higher moments
• 0-1 indicator, $I(\theta \ge a)$: tail probability
The Convex Optimization Problem
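For:
$$f(\theta) = \max_{k \in \{1, 2, \cdots, K\}} f^{(k)}(\theta), \qquad g(\theta) = \min_{l \in \{1, 2, \cdots, L\}} g^{(l)}(\theta)$$
$$\begin{aligned}
\underset{\{p_{kl},\, \gamma_{kl}\}_{k,l}}{\text{maximize}} \quad & \sum_{k,l} p_{kl}\, f^{(k)}(\gamma_{kl} / p_{kl}) \\
\text{subject to} \quad & \sum_{k,l} p_{kl} = 1 \\
& p_{kl} \ge 0, \quad \forall k, l \\
& \sum_{k,l} p_{kl}\, g^{(l)}(\gamma_{kl} / p_{kl}) \le 0
\end{aligned}$$
• The worst case is always attained by a discrete distribution
• Total number of Dirac masses in the distribution: $K \cdot L$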
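The statement that the worst case is attained by a discrete distribution can be checked on a textbook instance. The sketch below is not the convex program from the talk. It assumes a nonnegative scalar θ with a known mean, takes the 0-1 indicator I(θ ≥ a) as the objective, and brute-forces over two-point distributions on a grid. The best value found matches Markov's bound mean/a, which is the true worst case for this instance. The function name, grid, and numbers are hypothetical.

```python
def worst_case_tail_two_point(mean, a, grid):
    """Largest P(theta >= a) over two-point distributions supported on `grid`
    whose mean equals `mean`. Returns (value, ((lo, p_lo), (hi, p_hi)))."""
    best_value, best_dist = 0.0, None
    for lo in grid:
        for hi in grid:
            if lo == hi or not (lo <= mean <= hi):
                continue
            p_hi = (mean - lo) / (hi - lo)   # weight on hi is fixed by the mean constraint
            p_lo = 1.0 - p_hi
            value = p_lo * (lo >= a) + p_hi * (hi >= a)
            if value > best_value:
                best_value, best_dist = value, ((lo, p_lo), (hi, p_hi))
    return best_value, best_dist

grid = [float(i) for i in range(11)]   # support points 0, 1, ..., 10
value, dist = worst_case_tail_two_point(mean=1.0, a=4.0, grid=grid)
print(value, dist)
```

In this instance the maximizing distribution puts its mass on two points only, at 0 and at the threshold a, so the infinite-dimensional search over distributions collapses to a search over a few Dirac masses.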
Storage Allocation for Power Grid
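[Figure: power network with storage $x_i$, power flow $f_{ij}$, and wind power $\theta_i$ (stochastic)]
Storage Allocation Problem
$$\min_{x} \; \max_{d} \; \mathbb{E}_{\theta \sim d} \Big[ \underbrace{\min_{f} \; \text{Wind Energy Wasted}(x, \theta, f)}_{\text{optimal power flow; piecewise concave in } \theta} \Big]$$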
Numerical Example: IEEE 14-Bus Test Case
• Network with 5 generators
• Time: one day, 3-hour interval
• Mean and covariance obtained from real wind generation data
[Figure: power generation (0 to 150) vs. time of day (6am, 12pm, 6pm). Source: AESO]
The Influence of Information Constraints
[Figure: expected cost (0 to 50) vs. total storage (0 to 15) for three cases: Exact distribution; Support Information Only; Support + Mean + Covariance]
Shuo Han, Ufuk Topcu, Molei Tao, Houman Owhadi, Richard M. Murray, “Convex optimal uncertainty quantification: Algorithms and a case study in energy storage placement for power grids,” American Control Conference, 2013.
On-Demand Ridesharing
[Figure: Dispatch Center, Vehicle, and Customer. The dispatch center receives Predicted Demand (Distribution of Customer Demand) and issues a Dispatch Command]
$$\min_{X_{1:T}} \; \sum_{t=1}^{T} \left[ J_D(X_t) + J_E(X_t, r_t) \right]$$
• $X_t$: Vehicle Flows
• $J_D$: Cost of Rebalancing
• $J_E$: Wait Time
• $r_t$: Demand
Robust vs. Non-Robust
• Robust optimization against demand uncertainty
$$\min_{X_{1:T}} \; \max_{r_{1:T} \in \Delta} \; \sum_{t=1}^{T} \left[ J_D(X_t) + J_E(X_t, r_t) \right]$$
NYC Dataset: 4 years, 100 GB
[Figure: Boxplot of total requests number during one hour. Horizontal axis: Region ID (1 to 16); vertical axis: total requests number]
[Figure: Costs distribution of dispatch solutions. Horizontal axis: cost range (12 to 52); vertical axis: number of experiments; series: non-robust solutions, robust solutions]
Robust solution: 35.5% reduction
Fei Miao, Shuo Han, Shan Lin, George J. Pappas, “Taxi dispatch under model uncertainties,” IEEE Conference on Decision and Control, 2015.
Conventional vs. Distributionally Robust
• Distributionally robust formulation
$$\min_{X_{1:T}} \; \max_{d \in \mathcal{D}} \; \mathbb{E}_{r \sim d} \left\{ \sum_{t=1}^{T} \left[ J_D(X_t) + J_E(X_t, r_t) \right] \right\}$$
[Figure: Average cost comparison of DRO and RO. Horizontal axis: confidence level ϵ (0 to 1); vertical axis: average cost (×10^4). Legend: SOC and Box (robust #1 and robust #2), NR (non-robust), DRO (distr. robust)]
Note: Confidence level not optimized for distr. robust opt.
Confidence level: Probability that the true parameter/distribution lies outside the ambiguity set
Fei Miao, Shuo Han, Abdeltawab Hendawi, et al., “Data-driven distributionally robust vehicle balancing with dynamic region partition,” ACM/IEEE Intl. Conf. on Cyber-Physical Systems, 2017.
Future Directions
Large Datasets
• Distributed computation
• Approximate algorithms
Structured Models
• Markov properties
• Prior knowledge
Online Optimization
• When to discard old data
• When to re-learn
Summary
• Distributional uncertainty: A new approach to data-driven optimization
• A rigorous way to make use of sampled data
  - Probabilistic guarantees
  - Worst-case analysis/design: Often required for engineering applications
• Computationally efficient
  - Convex formulation available for a large class of problems
  - Examples: Resource allocation and scheduling
[Figures: power generation vs. time of day; ambiguity set $\mathcal{D}$ containing $d$ (true distribution)]