Upload
others
View
0
Download
0
Embed Size (px)
Citation preview
Urban Link Travel Time Estimation Using Large-scale Taxi Data with
Partial Information
Xianyuan Zhan*
Satish V. Ukkusuri*
*Civil Engineering, Purdue University
24/04/2014
โข Introduction
โข Study Region
โข Link Travel Time Estimation Model
โข Base Model
โข Probabilistic Model
โข Numerical Results
โข Conclusion
โข Questions/Comments
Outline
Introduction Study region Base model Probabilistic model Numerical results Conclusion
2
MPE 2013+
Urban Link Travel Time Estimation Using Large-scale Taxi Data with Partial Information
Xianyuan Zhan
โข New York City has the largest market for
taxis in North America:
โ 12,779 yellow medallion (2006)
โ Industrial revenue $1.82 billion (2005)
โ Serving 240 million passengers per year
โ 71% of all Manhattan residentsโ trips
โข GPS devices are installed in each taxicab
โข Taxi data recorded by New York Taxi and
Limousine Commission (NYTLC)
Introduction
Introduction Study region Base model Probabilistic model Numerical results Conclusion
3
โข Massive amount of data!
โ 450,000 to 550,000 daily trip records
โ More than 180 million taxi trips a year
โ Providing a lot of opportunities!
MPE 2013+
Urban Link Travel Time Estimation Using Large-scale Taxi Data with Partial Information
Xianyuan Zhan
Taxi trips in NYC
Introduction
Introduction Study region Base model Probabilistic model Numerical results Conclusion
4
Trip Origin Trip Destination
MPE 2013+
Urban Link Travel Time Estimation Using Large-scale Taxi Data with Partial Information
Xianyuan Zhan
Estimating urban link travel times
โข Traditional approaches:
โ Loop detector data
โ Automatic Vehicle Identification tags
โ Video camera data
โ Remote microwave traffic sensors
โข Why taxicab data?
โ Novel large-scale data sources
โ Ideal probes monitoring traffic condition
โ Large coverage
โ Do not need fixed sensors
โ Cheap!
Introduction
Introduction Study region Base model Probabilistic model Numerical results Conclusion
5
MPE 2013+
Urban Link Travel Time Estimation Using Large-scale Taxi Data with Partial Information
Xianyuan Zhan
The data
โข NYTLC records taxi GPS trajectory data, but not public
โข Only trip basis data available
โ Contains only OD coordinate, trip travel time and distance, etc.
โ Path information not available
โ Large-scale data with partial information
The problem
โข Given large-scale taxi OD trip data, estimate urban link travel times
โข Sub-problems to solve:
โ Map data to the network
โ Path inference
โ Estimate link travel time based on OD data
Introduction
Introduction Study region Base model Probabilistic model Numerical results Conclusion
6
MPE 2013+
Urban Link Travel Time Estimation Using Large-scale Taxi Data with Partial Information
Xianyuan Zhan
Study Region
Introduction Study region Base model Probabilistic model Numerical results Conclusion
7
โข 1370ร1600m rectangle area in Midtown Manhattan
โข Data records fall within the region are subtracted
MPE 2013+MPE 2013+
Urban Link Travel Time Estimation Using Large-scale Taxi Data with Partial Information
Xianyuan Zhan
Study Region
Introduction Study region Base model Probabilistic model Numerical results Conclusion
8
Test network
โข Network contains:
โ 193 nodes
โ 381 directed links
MPE 2013+
Urban Link Travel Time Estimation Using Large-scale Taxi Data with Partial Information
Xianyuan Zhan
Study Region
Introduction Study region Base model Probabilistic model Numerical results Conclusion
9
Number of observations in the study region
โข Day 1: Weekday (2010/03/15, Monday)
โข Day 2: Weekend (2010/03/20, Saturday)
MPE 2013+
Urban Link Travel Time Estimation Using Large-scale Taxi Data with Partial Information
Xianyuan Zhan
0
200
400
600
800
1000
1200
Fre
qu
en
cy
Histogram for day 1
0
100
200
300
400
500
600
Fre
qu
en
cy
Histogram for day 6
Base Model
Introduction Study region Base model Probabilistic model Numerical results Conclusion
10
MPE 2013+
Urban Link Travel Time Estimation Using Large-scale Taxi Data with Partial Information
Xianyuan Zhan
Base link travel time estimation model*
โข Hourly average link travel time estimations
โข Direct optimization approach
โข Overall framework: four phases
* Zhan, X., Hasan, S., Ukkusuri, S. V., & Kamga, C. (2013). Urban link travel time estimation using large-scale taxi data with partial information.Transportation Research Part C: Emerging Technologies, 33, 37-49.
Base Model
Introduction Study region Base model Probabilistic model Numerical results Conclusion
11
MPE 2013+
Urban Link Travel Time Estimation Using Large-scale Taxi Data with Partial Information
Xianyuan Zhan
Data mapping
โข Mapping points to nearest links in the network
โข Mapped point (blue) are used
โข Identify intermediate origin/
destination nodes
โข ๐ผ1, ๐ผ2 are defined as distance proportions from mapped
points to the intermediate
origin/destination node
Base Model
Introduction Study region Base model Probabilistic model Numerical results Conclusion
12
MPE 2013+
Urban Link Travel Time Estimation Using Large-scale Taxi Data with Partial Information
Xianyuan Zhan
Construct reasonable path sets
โข Number of possible paths could be huge!
โข Need to shrink the size of possible path set
โข Use trip distance to eliminate unreasonable paths
โข K-shortest path algorithm* (k=20) is used to generate initial path sets
* Y. Yen, Finding the K shortest loopless paths in a network, Management Science 17:712โ716, 1971.
โข Filter out unreasonable paths (threshold:
weekday 15%~25%, weekend 50%)
Base Model
Introduction Study region Base model Probabilistic model Numerical results Conclusion
13
MPE 2013+
Urban Link Travel Time Estimation Using Large-scale Taxi Data with Partial Information
Xianyuan Zhan
Route choice model
โข Assumption:
โ Each driver wants to minimize both trip time and distance to make more trips
thus make more revenue
โข A MNL model based on utility maximization scheme
๐๐ ๐ก, ๐, ๐ =๐โ๐๐ถ๐ ๐ก,๐๐
๐โ๐ ๐ ๐โ๐๐ถ๐ ๐ก,๐๐
โข Path cost measured as a function of trip travel time and distance
๐ถ๐ ๐ก, ๐๐ = ๐ฝ1 โ ๐๐ ๐ก + ๐ฝ2 โ ๐๐
๐๐ ๐ก = ๐ผ1๐ก๐ + ๐ผ2๐ก๐ท +
๐โ๐ฟ
๐ฟ๐๐๐ก๐
Base Model
Introduction Study region Base model Probabilistic model Numerical results Conclusion
14
MPE 2013+
Urban Link Travel Time Estimation Using Large-scale Taxi Data with Partial Information
Xianyuan Zhan
Link travel time estimation
โข Minimizing the squared difference between expected (๐ธ ๐๐|๐ ๐ ) and observed ๐๐ path travel times
๐ธ ๐๐|๐ ๐ =
๐โ๐ ๐
๐๐( ๐ก)๐๐ ๐ก, ๐, ๐
๐ก = argmin ๐ก
๐โ๐ท
๐ฆ๐ โ ๐ธ ๐๐|๐ ๐2
โข Solve using Levenberg-Marquardt (LM) method
โข Parallelized codes developed to estimate the model
โข Entire optimization solved within 10 minutes on an intel i7 laptop
โข Numerical results show in later section
Probabilistic Model
Introduction Study region Base model Probabilistic model Numerical results Conclusion
15
MPE 2013+
Urban Link Travel Time Estimation Using Large-scale Taxi Data with Partial Information
Xianyuan Zhan
Limitations of the base model
โข Point estimate of hourly average travel time
โข Not incorporating variability of link travel times
โข Not utilizing historical data
โข Problems of compensation effect
โข Less robust
Solution: Adopt a probabilistic framework
โข Accounting for variability in link travel times
โข More robust
โข Historical information can be incorporated as priors
Probabilistic Model
Introduction Study region Base model Probabilistic model Numerical results Conclusion
16
MPE 2013+
Urban Link Travel Time Estimation Using Large-scale Taxi Data with Partial Information
Xianyuan Zhan
Assumptions:
1. Link travel time: ๐ฅ๐ โผ ๐ฉ(๐๐ , ๐๐2)
2. Path travel time is the summation of a set of link travel times
๐ ๐ฆ๐|๐, ๐ = ๐ ๐ฆ๐|๐, ๐, ๐ฎ = ๐ ๐ผ1๐0 + ๐ผ2๐๐ท +
๐โ๐
๐๐ , ๐ผ1๐๐2 + ๐ผ2๐๐ท
2 +
๐โ๐
๐๐2
3. Route choice based on the perceived mean link travel times and distance
๐๐๐ ๐,๐ท, ๐๐ =
exp โ๐ถ๐๐ ๐,๐ท, ๐๐
๐ โ๐ ๐exp โ๐ถ๐
๐ ๐,๐ท, ๐๐
โข where ๐, ๐, ๐ฎ are the vector of link travel times, their mean and variance
Probabilistic Model
Introduction Study region Base model Probabilistic model Numerical results Conclusion
17
MPE 2013+
Urban Link Travel Time Estimation Using Large-scale Taxi Data with Partial Information
Xianyuan Zhan
Mixture model
โข A Mixture model is developed to model the posterior probability of the
observed taxi trip travel times given link travel time parameters ๐, ๐ฎ
๐ป ๐|๐, ๐ฎ,๐ซ =
๐=1
๐
๐โ๐ ๐
๐๐๐ ๐,๐ท, ๐๐ ๐ ๐ฆ๐|๐, ๐, ๐ฎ
โข Introducing ๐ง๐๐ as the latent variable
indicating if path ๐ is used by observation ๐
Plate notation
Probabilistic Model
Introduction Study region Base model Probabilistic model Numerical results Conclusion
18
MPE 2013+
Urban Link Travel Time Estimation Using Large-scale Taxi Data with Partial Information
Xianyuan Zhan
Bayesian Mixture model
โข Incorporating historical information:
โ Prior on ๐:
โ Priors on ๐ and variance ๐ฎ
๐ป ๐|๐, ๐ฎ,๐ซ =
๐=1
๐
๐โ๐ ๐
๐๐๐ ๐,๐ท, ๐๐ ๐ ๐ฆ๐|๐, ๐, ๐ฎ โ
๐โ๐ฟ
๐ ๐๐
๐ป ๐|๐, ๐ฎ,๐ซ =
๐=1
๐
๐โ๐ ๐
๐๐๐ ๐,๐ท, ๐๐ ๐ ๐ฆ๐|๐, ๐, ๐ฎ โ
๐โ๐ฟ
๐ ๐๐ ๐ ๐๐2
Probabilistic Model
Introduction Study region Base model Probabilistic model Numerical results Conclusion
19
MPE 2013+
Urban Link Travel Time Estimation Using Large-scale Taxi Data with Partial Information
Xianyuan Zhan
Solution approach
โข An EM algorithm is proposed for estimation
โข A iterative procedure of two steps:
โ E-step:
๐ผ ๐ง๐๐ = ๐ง๐๐ ๐ง๐๐ ๐๐๐ ๐,๐ท, ๐๐ ๐ ๐ฆ๐|๐, ๐, ๐ฎ
๐ง๐๐
๐ง๐๐ ๐ โ๐ ๐ ๐๐
๐ ๐,๐ท, ๐๐ ๐ ๐ฆ๐|๐ , ๐, ๐ฎ๐ง๐ ๐= ๐พ ๐ง๐
๐
โ M-step: Let ๐๐ = ๐๐2, ๐ = ๐ฎ,
๐ ๐, ๐ = ๐ผ๐ ln ๐ ๐, ๐|๐, ๐ =
๐=1
๐
๐โ๐ ๐
๐พ ๐ง๐๐ ln ๐๐
๐ ๐,๐ท, ๐๐ + ln๐ ๐ฆ๐|๐, ๐, ๐
๐๐๐๐ค, ๐๐๐๐ค = ๐๐๐๐๐๐ฅ๐,๐๐ ๐, ๐
Probabilistic Model
Introduction Study region Base model Probabilistic model Numerical results Conclusion
20
MPE 2013+
Urban Link Travel Time Estimation Using Large-scale Taxi Data with Partial Information
Xianyuan Zhan
Solving for large-scale data and large networks
โข The M-step involves a large-scale optimization problem
โข Our goal:
โ Solve for large-scale data input
โ Solve for large network
โ Short term link travel time estimation (say 15min)
โข Solution: parallelize the computation!
โ Alternating Direction Method of Multiplier (ADMM) to decouple the problem
into smaller sub-problems
โ Solve decomposed sub-problems in parallel
โ Deals with large size of network and data
โ Faster model estimation
Numerical Results
Introduction Study region Base model Probabilistic model Numerical results Conclusion
21
MPE 2013+
Urban Link Travel Time Estimation Using Large-scale Taxi Data with Partial Information
Xianyuan Zhan
Model results for base model
โข Validation metrics
โข Root mean square error
โข Mean absolute percentage error
RMSE =1
๐
๐=1
๐
๐๐๐๐ โ ๐๐
๐๐ 2
MAPE =1
๐
๐=1
๐๐๐๐๐ โ ๐๐
๐๐
๐๐๐๐ ร 100%
Numerical Results
Introduction Study region Base model Probabilistic model Numerical results Conclusion
22
MPE 2013+
Urban Link Travel Time Estimation Using Large-scale Taxi Data with Partial Information
Xianyuan Zhan
Model results for base model
โข Test data: 3/15/2010 ~ 3/21/2010
Day ErrorTime Period
9:00-10:00 13:00-14:00 19:00-20:00 21:00-22:00
MondayRMSE (min) 2.614 1.981 1.937 1.372
MAPE 29.51% 24.22% 26.27% 21.87%
TuesdayRMSE (min) 2.461 2.302 1.827 1.437
MAPE 29.63% 25.59% 23.33% 22.20%
WednesdayRMSE (min) 3.827* 3.216* 2.18 1.691
MAPE 41.32%* 34.97%* 28.73% 24.40%
ThursdayRMSE (min) 2.468 2.699 2.49 1.382
MAPE 27.28% 27.92% 28.54% 21.05%
FridayRMSE (min) 2.26 2.179 1.692 1.334
MAPE 27.76% 27.04% 25.17% 22.26%
SaturdayRMSE (min) 1.034 1.69 1.839 1.584
MAPE 16.84% 24.58% 27.14% 21.61%
SundayRMSE (min) 2.041 1.518 1.395 1.16
MAPE 25.44% 23.70% 22.72% 19.87%
* Traffic disturbance caused by Patrick's Day Parade.
Numerical Results
Introduction Study region Base model Probabilistic model Numerical results Conclusion
23
MPE 2013+
Urban Link Travel Time Estimation Using Large-scale Taxi Data with Partial Information
Xianyuan Zhan
โข Two new models are proposed to estimate urban link travel times
โข Utilizing data with only partial information
โข Efficiently estimation using base model with reasonable accuracy
โข Mixture models are proposed to get more robust and accurate estimations
โข Applicable to trajectory data, can provide more accurate estimations
Future work
โข Test the mixture models for larger network
โข Efficient implementation using distributed computing technique
โข Result validation
Conclusion
Introduction Study region Base model Probabilistic model Numerical results Conclusion
24
MPE 2013+
Urban Link Travel Time Estimation Using Large-scale Taxi Data with Partial Information
Xianyuan Zhan
Q&A
Introduction Study region Base model Probabilistic model Numerical results Conclusion
25
Thank you!
Questions / Comments ?