Traveling Salesman Problems Motivated by
Robot Navigation
Maria Minkoff
MIT
With Avrim Blum, Shuchi Chawla, David Karger, Terran Lane,
Adam Meyerson
A Robot Navigation Problem
• Robot delivering packages in a building
• Goal: deliver as quickly as possible
• Classic model: Traveling Salesman Problem
  • Find a tour of minimum length
• Additional constraints:
  • some packages have higher priority
  • uncertainty in robot’s behavior
    • battery failure
    • sensor error, motor control error
Markov Decision Process Model
• State space $S$
• Choice of actions $a \in A$ at each state $s$
• Transition function $T(s' \mid s, a)$
  • action determines probability distribution on next state
  • sequence of actions produces a random path through graph
• Rewards $R(s)$ on states
  • If arrive in state $s$ at time $t$, receive discounted reward $\gamma^t R(s)$ for a discount factor $\gamma \in (0,1)$
• MDP Goal: policy for picking an action from any state that maximizes total discounted reward
Exponential Discounting
• Motivates to get to desired state quickly
• Inflation: reward collected in the distant future decreases in value due to uncertainty
  • at each time step the robot loses power with fixed probability
  • probability of still being alive at time $t$ decays exponentially in $t$
  • discounting reflects value of reward in expectation
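One way to make the expectation argument concrete, under the illustrative assumption (not stated on the slide) that the robot fails independently with the same probability $p$ at every time step:

```latex
\begin{align*}
\Pr[\text{alive at time } t] &= (1-p)^t = \gamma^t, \qquad \gamma := 1 - p,\\
\mathbb{E}[\text{reward collected on reaching } s \text{ at time } t] &= \gamma^t R(s).
\end{align*}
```

Under this reading the discount factor is simply the per-step survival probability.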
Solving MDP
• Fixing action at each state produces a Markov chain with transition probabilities $p_{vw}$
• Can compute expected discounted reward $\rho_v$ if start at state $v$:
  $\rho_v = r_v + \sum_w p_{vw} \gamma^{t(v,w)} \rho_w$
• Choosing actions to optimize this recurrence is polynomial-time solvable
  • Linear programming
  • Dynamic programming (like shortest paths)
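For a fixed choice of actions, the recurrence on this slide can be evaluated by plain fixed-point iteration. The sketch below is illustrative only: it writes `rho[v]` for the expected discounted reward from state v, and the function name `evaluate_chain`, the dictionary layout, and the three-state example are assumptions, not part of the talk. It also assumes every travel time is positive, so the iteration converges.

```python
def evaluate_chain(rewards, transitions, gamma, tol=1e-12, max_iters=10_000):
    """Iterate rho_v = r_v + sum_w p_vw * gamma**t(v,w) * rho_w to a fixed point.

    rewards:     dict state -> r_v
    transitions: dict state -> list of (next_state, probability, travel_time)
    """
    rho = {v: 0.0 for v in rewards}
    for _ in range(max_iters):
        new_rho = {
            v: rewards[v]
            + sum(p * gamma ** t * rho[w] for w, p, t in transitions.get(v, []))
            for v in rewards
        }
        # Stop once no state's value changed by more than the tolerance.
        if max(abs(new_rho[v] - rho[v]) for v in rewards) < tol:
            return new_rho
        rho = new_rho
    return rho


# Hypothetical chain: a -> b (time 1), b -> c (time 2), c is absorbing.
rewards = {"a": 0.0, "b": 1.0, "c": 4.0}
transitions = {"a": [("b", 1.0, 1)], "b": [("c", 1.0, 2)]}
values = evaluate_chain(rewards, transitions, gamma=0.5)
```

Here `values["c"]` is just the reward at c, `values["b"]` adds the reward at c discounted by two time units, and `values["a"]` discounts the value of b by one more unit.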
Solving the wrong problem
• Package can only be delivered once
  • So should not get reward each time reach target
• One solution: expand state space
  • New state = (current location, past locations), where past locations record packages already delivered
  • Reward nonzero only on states where current location not included in list of previously visited
• Now apply MDP algorithm
• Problem: new state space has exponential size
Tackle an easier problem
• Problem has two novel elements for “theory”
  • Discounting of reward based on arrival time
  • Probability distribution on outcome of actions
• We will set aside second issue for now
  • In practice, robot can control errors
• Even first issue by itself is hard and interesting
  • First step towards solving whole problem
Discounted-Reward TSP
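Given
• undirected graph $G=(V,E)$
• edge weights (travel times) $d_e \ge 0$
• weights on nodes (rewards) $r_v \ge 0$
• discount factor $\gamma \in (0,1)$
• root node $s$
Goal: find a path $P$ starting at $s$ that maximizes total discounted reward
$\rho(P) = \sum_{v \in P} r_v \gamma^{d_P(v)}$, where $d_P(v)$ is the time at which $P$ first reaches $v$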
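The objective can be evaluated directly for any candidate path. The following is a minimal sketch under stated assumptions: the name `discounted_reward`, the `frozenset` edge keys, and the three-vertex instance are all hypothetical, and each vertex is credited only on its first visit (a package is delivered once).

```python
def discounted_reward(path, edge_time, reward, gamma):
    """Sum reward[v] * gamma**(first arrival time at v) along the path."""
    total, elapsed, seen = 0.0, 0.0, set()
    for i, v in enumerate(path):
        if i > 0:
            # Undirected edge: look up the travel time by unordered pair.
            elapsed += edge_time[frozenset((path[i - 1], v))]
        if v not in seen:
            seen.add(v)
            total += reward[v] * gamma ** elapsed
    return total


# Hypothetical instance: triangle on s, a, b.
edge_time = {
    frozenset(("s", "a")): 1.0,
    frozenset(("a", "b")): 2.0,
    frozenset(("s", "b")): 2.0,
}
reward = {"s": 0.0, "a": 4.0, "b": 8.0}
via_a_first = discounted_reward(["s", "a", "b"], edge_time, reward, 0.5)
via_b_first = discounted_reward(["s", "b", "a"], edge_time, reward, 0.5)
```

In this instance visiting the nearer, smaller reward first collects more discounted reward than heading straight for the larger one, which is the kind of trade-off the objective encodes.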
Approximation Algorithms
• Discounted-Reward TSP is NP-hard (and so is more general MDP-type problem)
  • reduction from minimum latency TSP
• So intractable to solve exactly
• Goal: approximation algorithm that is guaranteed to collect at least some constant fraction of the best possible discounted reward
Related Problems
Goal of Discounted-Reward TSP seems to be to find a “short” path that collects “lots” of reward
• Prize-Collecting TSP
  • Given a root vertex v, find a tour containing v that minimizes total length + foregone reward (undiscounted)
  • Primal-dual 2-approximation algorithm [GW 95]
k-TSP
• Find a tour of minimum length that visits at least k vertices
• 2-approximation algorithm known for undirected graphs based on algorithm for PC-TSP [Garg 99]
• Can be extended to handle node-weighted version
Mismatch
Constant factor approximation on length doesn’t exponentiate well
• Suppose optimum solution reaches some vertex $v$ at time $t$ for reward $\gamma^t r$
• Constant factor approximation would reach within time $2t$ for reward $\gamma^{2t} r$
• Result: get only a $\gamma^t$ fraction of optimum discounted reward, not a constant fraction.
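A few numbers show how quickly the guarantee degrades. This is a small illustrative calculation; the helper name `fraction_of_optimum` and the sample values of t are assumptions made for the example.

```python
def fraction_of_optimum(gamma, t, c=2.0):
    """Reward collected on arriving at time c*t, relative to arriving at time t."""
    return gamma ** (c * t) / gamma ** t


# With gamma = 1/2, a 2-approximation on length keeps only a gamma**t fraction.
for t in (1, 5, 10, 20):
    print(t, fraction_of_optimum(0.5, t))
```

The ratio shrinks geometrically with the arrival time t, so no fixed constant bounds it from below.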
Orienteering Problem
Find a path of length at most D that maximizes net reward collected
• Complement of k-TSP
  • approximates reward collected instead of length
  • avoids changing length, so exponentiation doesn’t hurt
  • unrooted case can be solved via k-TSP
• Drawback: no constant factor approximation for rooted non-geometric version previously known
• Our techniques also give a constant factor approximation for Orienteering problem
Our Results
Using an $\alpha$-approximation for k-TSP as subroutine (substitute $\alpha = 2$, announced by Garg in 1999)
• $(\tfrac{3}{2}\alpha + 2)$-approximation for Orienteering (→ 5 with $\alpha = 2$)
• $e(\tfrac{3}{2}\alpha + 2)$-approximation for Discounted-Reward Collection (→ 13 with $\alpha = 2$)
• constant-factor approximations for tree- and multiple-path versions of the problems
Eliminating Exponentiation
• Let $d_v$ = shortest path distance (time) to $v$
• Define the prize at $v$ as $\pi_v = \gamma^{d_v} r_v$
  • max discounted reward possibly collectable at $v$
• If given path reaches $v$ at time $t_v$, define excess $e_v = t_v - d_v$
  • difference between shortest path and chosen one
• Then discounted reward at $v$ is $\gamma^{e_v} \pi_v$
• Idea: if excess small, prize ~ discounted reward
• Fact: excess only increases as traverse path
  • excess reflects lost time; can’t make it up
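The prize and excess of every vertex on a given path can be computed from shortest-path distances. The sketch below is a hedged illustration: the function names, the adjacency-list layout, and the triangle instance are assumptions, and Dijkstra's algorithm is used here simply as a standard way to obtain the shortest-path distance to each vertex.

```python
import heapq


def shortest_distances(adj, root):
    """Dijkstra: adj maps vertex -> list of (neighbor, edge_length)."""
    dist = {root: 0.0}
    heap = [(0.0, root)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for w, length in adj[u]:
            nd = d + length
            if nd < dist.get(w, float("inf")):
                dist[w] = nd
                heapq.heappush(heap, (nd, w))
    return dist


def prizes_and_excesses(adj, root, reward, gamma, path):
    """Return (prize per vertex, excess per vertex visited by path)."""
    d = shortest_distances(adj, root)
    prize = {v: gamma ** d[v] * reward[v] for v in reward}
    arrival, elapsed = {path[0]: 0.0}, 0.0
    for u, v in zip(path, path[1:]):
        elapsed += dict(adj[u])[v]
        arrival.setdefault(v, elapsed)  # keep the first arrival time only
    excess = {v: arrival[v] - d[v] for v in arrival}
    return prize, excess


# Hypothetical triangle: s-a has length 1, s-b and a-b have length 2.
adj = {
    "s": [("a", 1.0), ("b", 2.0)],
    "a": [("s", 1.0), ("b", 2.0)],
    "b": [("s", 2.0), ("a", 2.0)],
}
reward = {"s": 0.0, "a": 4.0, "b": 8.0}
prize, excess = prizes_and_excesses(adj, "s", reward, 0.5, ["s", "a", "b"])
```

On the path s, a, b the vertex b is reached one time unit later than its shortest-path distance, so its excess is 1 and the reward actually collected there is its prize discounted by that excess.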
Optimum path
• assume $\gamma = 1/2$ (can scale edge lengths)
Claim: at least ½ of optimum path’s discounted reward $R$ is collected before path’s excess reaches 1
Proof by contradiction:
• Let $u$ be first vertex with $e_u \ge 1$
• Suppose more than $R/2$ reward follows $u$
• Can shortcut directly to $u$ then traverse the rest of optimum
  • reduces all excesses after $u$ by at least 1
  • so “undiscounts” rewards by factor $\gamma^{-1} = 2$
  • so doubles discounted reward collected
  • but this was more than $R/2$: contradiction
[Figure: path from s through u, with numeric labels on its vertices]
New problem: Approximate Min-Excess Path
• Suppose there exists an s-t path $P^*$ with prize value $\Pi$ of length $\ell(P^*) = d_t + e$
• Optimization: find s-t path $P$ with prize value ≥ $\Pi$ that minimizes excess $\ell(P) - d_t$ over shortest path to $t$
  • equivalent to minimizing total length, e.g. k-TSP
• Approximation: find s-t path $P$ with prize value ≥ $\Pi$ that approximates optimum excess over shortest path to $t$, i.e. has length $\ell(P) = d_t + ce$
  • better than approximating entire path length
Using Min-Excess Path
• Recall discounted reward at $v$ is $\gamma^{e_v} \pi_v$
• Prefix of optimum discounted reward path:
  • collects discounted reward $\sum_v \gamma^{e_v} \pi_v \ge R/2$
  • hence spans prize $\sum_v \pi_v \ge R/2$
  • and has no vertex with excess over 1
• Guess $t$ = last node on opt path with excess $e_t \le 1$
• Find a path to $t$ of approximately (4 times) minimum excess that spans $\ge R/2$ prize (we can guess $R/2$)
• Excesses at most 4, so $\gamma^{e_v} \pi_v \ge \pi_v/16$
  • hence discounted reward on found path $\ge R/32$
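The arithmetic behind the final bound can be spelled out as follows, writing $\pi_v$ for the prize and $e_v$ for the excess of vertex $v$ on the found path, and using the scaling $\gamma = 1/2$ assumed earlier in the talk. The optimum prefix reaches $t$ with excess at most 1, so a 4-approximate min-excess path has excess at most 4 at every vertex.

```latex
\begin{align*}
e_v \le 4,\ \gamma = \tfrac12
  &\implies \gamma^{e_v} \ge \left(\tfrac12\right)^{4} = \tfrac{1}{16},\\
\sum_{v} \gamma^{e_v}\,\pi_v
  &\ge \tfrac{1}{16}\sum_{v}\pi_v
   \ge \tfrac{1}{16}\cdot\tfrac{R}{2}
   = \tfrac{R}{32}.
\end{align*}
```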
Solving Min-Excess Path problem
Exactly solvable case: monotonic paths
• Suppose optimum path goes through vertices in strictly increasing distance from root
• Then can find optimum by dynamic program
  • Just as can solve longest path in an acyclic graph
• Build table
  • For each vertex $v$: is there a monotonic path from $v$ with length $\ell$ and prize $\Pi$?
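One possible shape for such a table is sketched below. It is a variant chosen for illustration rather than the table from the talk: it assumes integer prizes, a complete metric given as a dict of dicts, and it stores, for each vertex and each prize level (capped at the target), the length of the shortest monotone path from the root. The function name and the four-vertex instance are hypothetical.

```python
def min_length_monotone_path(dist, prize, s, t, target):
    """Shortest s-t path visiting vertices in strictly increasing distance
    from s and collecting prize >= target; returns None if none exists."""
    order = sorted(dist, key=lambda v: dist[s][v])
    INF = float("inf")
    # best[v][p]: shortest monotone s-v path whose prize, capped at target, is p
    best = {v: [INF] * (target + 1) for v in dist}
    best[s][min(prize[s], target)] = 0.0
    for v in order:
        for u in order:
            if dist[s][u] >= dist[s][v]:
                continue  # predecessor must be strictly closer to the root
            for p, length in enumerate(best[u]):
                if length < INF:
                    q = min(p + prize[v], target)
                    best[v][q] = min(best[v][q], length + dist[u][v])
    return best[t][target] if best[t][target] < INF else None


# Hypothetical metric on four points; distances from s are 1, 2, 3.
pairs = {("s", "a"): 1, ("s", "b"): 2, ("s", "t"): 3,
         ("a", "b"): 2, ("a", "t"): 2, ("b", "t"): 2}
dist = {v: {v: 0.0} for v in "sabt"}
for (u, v), d in pairs.items():
    dist[u][v] = dist[v][u] = float(d)
prize = {"s": 0, "a": 1, "b": 2, "t": 1}
lengths = [min_length_monotone_path(dist, prize, "s", "t", k) for k in (2, 3, 4, 5)]
```

Because every vertex's predecessors are strictly closer to the root, processing vertices in order of distance guarantees each table row is final before it is read, exactly as in longest-path computation on an acyclic graph.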
Solving Min-Excess Path problem
Approximable case: wiggly paths
• Length of path to $v$ is $\ell_v = d_v + e_v$
• If $e_v > d_v$ then $\ell_v > e_v > \ell_v/2$
  • i.e., take twice as long as necessary to reach $v$
• So if approximate $\ell_v$ to constant factor, also approximate $e_v$ to twice that constant factor
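The last step can be checked in one line. Suppose a path to $v$ has length $\ell'_v \le c\,\ell_v$ for some constant $c \ge 1$, where $\ell_v = d_v + e_v$ is the optimum length and $e_v > d_v$; the symbols $\ell'_v$, $e'_v$ and $c$ are introduced here only for this derivation.

```latex
\begin{align*}
e'_v = \ell'_v - d_v
  &\le c\,(d_v + e_v) - d_v
   = (c-1)\,d_v + c\,e_v \\
  &< (c-1)\,e_v + c\,e_v
   = (2c-1)\,e_v
   < 2c\,e_v .
\end{align*}
```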
Approximating path length
• Can use k-TSP algorithm to find approximately shortest s-t path with specified prize
  • merge s and t into vertex r
  • opt path becomes a tour
  • solve k-TSP with root r
  • “unmerge”: can get one or more cycles
  • connect s and t by shortest path
[Figure: s and t merged into a single vertex r]
Decompose optimum path
[Figure: optimum path split into alternating monotone and wiggly segments]
• > 2/3 of each wiggly path is excess
• Divides into independent problems
Decomposition Analysis
• 2/3 of each wiggly segment is excess
• That excess accumulates into whole path
  • total excess of wiggly segments ≤ excess of whole path
  • so total length of wiggly segments ≤ 3/2 of path excess
• Use dynamic program to find shortest (min-excess) monotonic segments collecting target prize
• Use k-TSP to find approximately shortest wiggles collecting target prize
  • Approximates length, so approximates excess
• Over all monotonic and wiggly segments, approximates total excess
Dynamic program for Min-Excess Path
• For each pair of vertices and each (discretized) prize value, find
  • Shortest monotonic path collecting desired prize
  • Approximately shortest wiggly path collecting desired prize
• Note: polynomially many subproblems
• Use dynamic programming to find optimum pasting together of segments
Solving Orienteering Problem: special case
• Given a path from s that
  • collects prize $\Pi$
  • has length ≤ $D$
  • ends at t, the farthest point from s
• For any constant integer $r \ge 1$, there exists a path from s to some v with
  • prize ≥ $\Pi/r$
  • excess ≤ $(D - d_v)/r$
[Figure: path from s to t through an intermediate vertex v, with numeric labels on its vertices]
Solving Orienteering Problem
General case: path ends at arbitrary t
• Let u be the farthest point from s
• Connect t to s via shortest path
• One of path segments ending at u
  • has prize ≥ $\Pi/2$
  • has length ≤ $D$
Reduced to special case
• Using 4-approximation for Min-Excess Path get 8-approximation for Orienteering
[Figure: path with endpoints s and t and farthest point u]
Budget Prize-Collecting Steiner Tree problem
Find a rooted tree of edge cost at most D that spans maximum amount of prize
• Complement of k-MST
• Create Euler tour of opt tree T* of cost at most 2D
• Divide this tour into two paths starting at root, each of length at most D
• One of them contains at least ½ of total prize
• Path is a type of tree
• Use c-approximation algorithm for Orienteering to obtain 2c-approximation for Budget PCST
Summary
• Showed maximum discounted reward can be approximated using min-excess path
• Showed how to approximate min-excess path using k-TSP
• Min-excess path can also be used to solve rooted Orienteering problem (previously an open question)
  • Also solves “tree” and “cycle” versions of Orienteering
Open Questions
• Non-uniform discount factors
  • each vertex $v$ has its own $\gamma_v$
• Non-uniform deadlines
  • each vertex specifies its own deadline by which it has to be visited in order to collect reward
• Directed graphs
  • We used k-TSP, only solved for undirected
  • For directed, even standard TSP has no known constant factor approximation
  • We only use k-TSP/undirectedness in wiggly parts
Future directions
• Stochastic actions
  • Stochastic seems to imply directed
  • Special case: forget rewards.
    • Given choice of actions, choose to minimize cover time of graph
• Applying discounting framework to other problems:
  • Scheduling
  • Exponential penalty in place of hard deadlines