Big Data “Triage” for Long Range Planning

Transportation Engineering and Safety Conference

Reuben S MacMartin
December 12, 2014

Delaware Valley Regional Planning Commission

Metropolitan Planning Organization (MPO)
2 States
9 Counties
351 Municipalities
5.6 Million Population
3,800 sq. miles
~115 employees

Activities –
Long Range Plan (LRP)
Transportation Improvement Program (TIP)
Wide range of planning and technical support for regional partners

Outline

What do we use data for?
Traditional data sources – traffic counts, surveys, demographic data
The old-new – OSM, GTFS, VPP Suite, Bluetooth
The new-new – CycleTracks, real-time transit data,…, data-mined GPS data, etc.

What do we use data for?

Current conditions on transportation studies

Current Conditions

Definition and analysis of congestion for the Congestion Management Process (CMP)

A bad day compared to average

Long Range Planning
Calibration and validation of travel forecasting models
250 Riders in 2040

Also a data provider – e.g., RIMIS

“Traditional” Planning Data Sources

Inventories
  Traffic counts (78,300+)
  Bike and Ped counts (1000+)
  Travel time surveys
Behavioral Surveys
  Household travel survey (2012-2013)
  Transit on-board (2010-2012)
Demographic Data
  Census, American Community Survey
  National Establishment Time-Series (NETS)

The old “new” data

These were innovative 5 years ago –
Open source data for our travel demand model networks

Travel Demand Model Networks

The need:
  Accurate representations of regional highway and transit networks
The past:
  “hand” code from paper maps, schedules, etc. or, combine a multitude of different data sources
The innovation:
  Fuse OpenStreetMap (OSM) and GTFS (i.e. “Google-transit”) and add extra data for modeling
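One small step in fusing the two sources is attaching each GTFS stop to a node of the OSM street network. The sketch below is only an illustration of that idea, not the procedure DVRPC used: the function names, the 50 m tolerance, and the sample coordinates are all assumptions.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two WGS84 points."""
    r = 6371000.0  # mean Earth radius in meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def snap_stops_to_nodes(stops, nodes, max_dist_m=50.0):
    """Map each stop id to the nearest node id, or None if no node is within max_dist_m."""
    result = {}
    for stop_id, (slat, slon) in stops.items():
        best_id, best_d = None, float("inf")
        for node_id, (nlat, nlon) in nodes.items():
            d = haversine_m(slat, slon, nlat, nlon)
            if d < best_d:
                best_id, best_d = node_id, d
        # Stops with no nearby node are left unmatched for manual review.
        result[stop_id] = best_id if best_d <= max_dist_m else None
    return result

# Hypothetical sample data (not from the presentation).
osm_nodes = {101: (39.9526, -75.1652), 102: (39.9540, -75.1652)}
gtfs_stops = {"S1": (39.9527, -75.1651), "S2": (40.1000, -75.0000)}
matches = snap_stops_to_nodes(gtfs_stops, osm_nodes)
```

The brute-force search is fine for a sketch; a regional network would normally use a spatial index instead.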

Open Data Mash-up for Transportation Modeling

Data integration
Data objects of different origin are merged
New relationships are created

[Diagram: data model of the integrated network. Source labels: from OSM, from GTFS, Travel Demand Data]
Node: Number
Link: From Node, To Node
Stop Point: Number
Stop Area: Number
Line: Name
Service Pattern: Line Name, Route Name, Direction
Scheduled Run: Line Name, Route Name, Direction, Index
Zone: Zone Number
Connector: Zone Number, Node Number, Direction
Legend (relationship cardinality): 2; 1 or more; 0 or more; Exactly 1

Integrated Street & Transit Network

© in part by OSM and CC-by-SA

TIM 2 Highway Network

© in part by OSM and CC-by-SA

New, accurate topology (& routable)
Legacy DVRPC network model

Original SEPTA GTFS (2010)

VISUM Imported Network

VISUM Exported Network (WKTPoly shape)

The old “new” data

These were innovative 5 years ago –
Open source data for our travel demand model networks
Bluetooth detectors for speed and O-D data

The old “new” data

These were innovative 5 years ago –
Open source data for our travel demand model networks
Bluetooth detectors for speed and O-D data
Automated Passenger Counter (APC) data – SEPTA

Why APC data?

Time stamped boarding and alighting data by line by stop
Time period level targets for modeling
Stop and line level expansion values for On Board Survey work
Used in calibration/validation of path builder
Transit studies: O-D matrices by line
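Two of the uses above, time period targets and survey expansion values, can be sketched in a few lines. This is a minimal illustration under stated assumptions: the record layout, the period boundaries, and the survey counts are hypothetical, and the expansion factor is taken simply as APC boardings divided by completed surveys for the same line and period.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical time-period boundaries (hour of day); real model periods may differ.
PERIODS = [("AM", 6, 10), ("MD", 10, 15), ("PM", 15, 19)]

def period_of(ts):
    """Return the period name for a timestamp; anything else counts as night (NT)."""
    for name, start, end in PERIODS:
        if start <= ts.hour < end:
            return name
    return "NT"

def boardings_by_line_period(records):
    """Sum boardings per (line, period) from (line, stop, timestamp, ons, offs) records."""
    totals = defaultdict(int)
    for line, stop, ts, ons, offs in records:
        totals[(line, period_of(ts))] += ons
    return dict(totals)

def expansion_factors(apc_totals, survey_counts):
    """Expansion factor = APC boardings / completed surveys for the same (line, period)."""
    return {
        key: apc_totals[key] / n
        for key, n in survey_counts.items()
        if n > 0 and key in apc_totals
    }

# Hypothetical APC records (not real SEPTA data).
records = [
    ("R1", "A", datetime(2012, 5, 1, 7, 15), 12, 0),
    ("R1", "B", datetime(2012, 5, 1, 7, 22), 8, 5),
    ("R1", "A", datetime(2012, 5, 1, 16, 40), 20, 2),
]
totals = boardings_by_line_period(records)
factors = expansion_factors(totals, {("R1", "AM"): 4})
```

Keying the same totals by (line, stop, period) instead would give stop level expansion values.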

The new “new” data

User-sourced bike data - CyclePhilly

CyclePhilly – User Generated GPS Data

www.cyclephilly.org

Raw GPS Trace

Snapped GPS
Model Path
Model vs. Data
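Going from a raw GPS trace to a snapped one can be illustrated with a nearest-segment projection. This is a simplified sketch, not the method used for CyclePhilly: it assumes planar (projected) coordinates, treats each point independently, and ignores network connectivity, which a real map-matching procedure would take into account. All names and sample values are hypothetical.

```python
def snap_to_segment(p, a, b):
    """Project point p onto segment a-b (planar coordinates), clamped to the endpoints."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len2 = dx * dx + dy * dy
    if seg_len2 == 0:
        return (float(ax), float(ay))  # degenerate segment
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len2
    t = max(0.0, min(1.0, t))
    return (ax + t * dx, ay + t * dy)

def snap_trace(trace, segments):
    """Snap every GPS point to the closest location on any network segment."""
    snapped = []
    for p in trace:
        candidates = (snap_to_segment(p, a, b) for a, b in segments)
        best = min(candidates, key=lambda q: (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)
        snapped.append(best)
    return snapped

# Hypothetical L-shaped street and a noisy trace along it.
segments = [((0, 0), (10, 0)), ((10, 0), (10, 10))]
trace = [(2, 1), (9, -1), (11, 4)]
snapped = snap_trace(trace, segments)
```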

The new “new” data

User-sourced bike data – CyclePhilly
Vehicle probe data – INRIX

PM Peak TTI – INRIX

Archived Operational Data – INRIX
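The Travel Time Index (TTI) is commonly defined as the ratio of the observed travel time to the free-flow travel time on a segment. A minimal sketch of that calculation follows; the function name and sample speeds are assumptions, and the exact reference speed a probe data vendor uses for free flow may differ.

```python
def travel_time_index(length_mi, observed_mph, free_flow_mph):
    """TTI = observed travel time / free-flow travel time for one segment."""
    observed_tt = length_mi / observed_mph
    free_flow_tt = length_mi / free_flow_mph
    return observed_tt / free_flow_tt

# Hypothetical segment: 2 miles, 30 mph in the PM peak, 60 mph free flow.
tti = travel_time_index(2.0, 30.0, 60.0)
```

A TTI of 2 means the peak trip takes twice as long as it would at free-flow speed.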

The new “new” data

User-sourced bike data – CyclePhilly
Vehicle probe data – INRIX
SEPTA Key (new fare payment technology) data – SEPTA (availability TBD)

Fare Card Data – Possibilities

Anonymized full day transit-based tour data for all riders
O-D data
Route choice data
Transfer behavior
Frequency of transit use
Much higher resolution data than current survey methods
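As an illustration of how transfer behavior might be derived from tap records, the sketch below chains each card's taps into linked trips. Everything here is hypothetical, since the fare card data format was not yet available: the (card, timestamp, route, stop) layout, the 60-minute transfer window, and the sample taps are assumptions.

```python
from collections import defaultdict
from datetime import datetime, timedelta

TRANSFER_WINDOW = timedelta(minutes=60)  # assumed threshold, not from the source

def link_trips(taps):
    """Group each card's taps into linked trips.

    A tap on a different route within TRANSFER_WINDOW of the previous tap
    is treated as a transfer; anything else starts a new trip.
    """
    by_card = defaultdict(list)
    for card, ts, route, stop in taps:
        by_card[card].append((ts, route, stop))

    trips = {}
    for card, events in by_card.items():
        events.sort()  # chronological order
        card_trips = []
        for ts, route, stop in events:
            last = card_trips[-1][-1] if card_trips else None
            if last and ts - last[0] <= TRANSFER_WINDOW and route != last[1]:
                card_trips[-1].append((ts, route, stop))
            else:
                card_trips.append([(ts, route, stop)])
        trips[card] = card_trips
    return trips

# Hypothetical anonymized taps for one card on one day.
taps = [
    ("c1", datetime(2014, 12, 1, 8, 0), "R1", "A"),
    ("c1", datetime(2014, 12, 1, 8, 20), "R2", "B"),
    ("c1", datetime(2014, 12, 1, 17, 30), "R2", "C"),
]
trips = link_trips(taps)
transfers = sum(len(legs) - 1 for legs in trips["c1"])
```

The first boarding of each linked trip gives a trip origin; destinations would need a further inference step, since a tap-on record alone does not say where the rider got off.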

Triage – Making Data Usable

Aggregation – Resolution and limits of existing analytical tools/methods
Cleaning – You can’t check every data point
  Initial spot check and clean as you go if you find discrepancies
Sampling Biases – Not all big data is truly random
  Compare non-random to random sources whenever possible
  Declare biases of data when using it
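Comparing a non-random source to a random one can be as simple as comparing category shares. The sketch below assumes hypothetical counts by age group from a self-selected app sample and from a random household survey; a ratio above 1 flags a group that is over-represented in the app data, which is the kind of bias to declare when the data is used.

```python
def share(counts):
    """Convert raw counts to shares that sum to 1."""
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()}

def representation_ratio(sample_counts, reference_counts):
    """Share in the non-random sample divided by share in the random reference."""
    s, r = share(sample_counts), share(reference_counts)
    return {k: s.get(k, 0.0) / r[k] for k in r if r[k] > 0}

# Hypothetical counts (not real data).
app_users = {"under_35": 70, "35_plus": 30}
survey = {"under_35": 40, "35_plus": 60}
ratios = representation_ratio(app_users, survey)
```

Here riders under 35 come out at roughly 1.75 times their survey share, and riders 35 and over at about half.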
