Will we arrive on time?
Forecasting train delays by using data rather than assumptions.

ANDREAS GUTWENIGER
WWW.PUENKTLICHKEIT.CH


www.puenktlichkeit.ch

Privately developed and maintained. Non-profit.

Based solely on «open data».

My Motivation:

Learn.

Enjoy.

Get to know interesting people.


3 questions to the audience

Imagine you do this:
- At Bern station, record all forecasts for train arrivals within the next 60 minutes
- Repeat every minute; observe over a period of 6 weeks (Jan 17 – Feb 27, 2018)
- Compare with the actual arrival of the train
- Classify forecast as «correct» if difference is less than 60 seconds (+ / -)

1. What percentage of delay forecasts made by customer information systems turns out to be correct?
2. What percentage of delay forecasts turns out to be too high?
3. What percentage of those trains is actually on time (delay <60 seconds, compared to railway timetable)?

Answers (in the order shown on the slide): 71.0%, 0.05%, 68.4%
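The evaluation described above can be sketched in a few lines of Python. This is only an illustration, not the code behind puenktlichkeit.ch: the names `classify_forecast` and `share` are hypothetical, and it assumes that «too high» means the forecast arrival lies at least 60 seconds after the actual arrival.

```python
from datetime import datetime, timedelta

# Threshold from the slide: a forecast counts as «correct» if it is
# less than 60 seconds off in either direction.
TOLERANCE = timedelta(seconds=60)


def classify_forecast(predicted: datetime, actual: datetime) -> str:
    """Label one forecast as 'correct', 'too high' or 'too low'."""
    error = predicted - actual
    if abs(error) < TOLERANCE:
        return "correct"
    # Assumption: 'too high' = arrival forecast later than it really was.
    return "too high" if error > timedelta(0) else "too low"


def share(labels, wanted):
    """Fraction of labels equal to `wanted` (0.0 for an empty list)."""
    return labels.count(wanted) / len(labels) if labels else 0.0
```

Recording one label per forecast and per minute, then calling `share(labels, "correct")`, gives the kind of percentage asked for in question 1.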


Why put so much effort into a small improvement of forecast precision?

Customers would like to be informed about looming delays:
- To adjust their travel plans.
- To spend waiting time in a more pleasing way.

Railway staff should be informed about looming delays:
- To take countermeasures.
- To avoid spill-over of delays to other trains.


Predicting delays: How are Swiss railways performing?

[Figure: two charts, «All arrivals at Bern main station» and «Only delayed arrivals (>60 sec)». Horizontal axis: minutes before actual arrival. Vertical axis: percentage of correct predictions.]

Room for improvement?


How am I achieving this?

By growing many small trees!


Decision tree: «Arrival Thun -> Bern (scheduled :54) is at least 3 minutes delayed»

Predictors:
- Predictor 1: Departure Spiez > Thun, scheduled :25 (i.e. 29 minutes prior to the prediction event)
- Predictor 2: Arrival Bern > Thun, scheduled :24 (i.e. 30 minutes prior to the prediction event)

Splitting criteria shown in the tree:
- delay of 308 seconds
- delay of 366 seconds
- delay of 92 seconds

Information shown per node:
- Prediction suggested by model
- Probability of a delay of 3 minutes or more
- Proportion of training cases that fall into this node

Callouts:
- Departure Spiez > Thun: < 130 sec. delay
- Departure Spiez > Thun: 308 to 366 sec. delay AND Arrival Bern > Thun: < 92 sec. delay
- Predict a delay of 3 minutes or more
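How a single splitting criterion such as «delay of 308 seconds» could be found can be sketched as follows. This is a minimal illustration under stated assumptions, not the method used for the tree above: the slides do not say which impurity measure or software is used, so Gini impurity is an assumption, the function names are hypothetical, and the data is made up purely for the example.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels (0.0 for an empty list)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_split(delays, late):
    """Return (threshold, weighted_gini) for the best rule 'delay < threshold'."""
    pairs = list(zip(delays, late))
    values = sorted(set(delays))
    best = (None, gini(late))  # start with the impurity of the unsplit node
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2  # candidate split halfway between neighbours
        left = [y for x, y in pairs if x < threshold]
        right = [y for x, y in pairs if x >= threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best[1]:
            best = (threshold, score)
    return best


# Made-up data: delay (seconds) of an upstream departure, and whether the
# later arrival turned out to be at least 3 minutes late (1) or not (0).
upstream_delay = [0, 20, 45, 90, 120, 300, 330, 400, 500, 620]
target_late = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

threshold, impurity = best_split(upstream_delay, target_late)
```

Recursive partitioning repeats this search inside each resulting subset, possibly on a different predictor, until the nodes are small or pure enough.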


If … then x minutes later

If … and … then y minutes later


From open data to performance evaluation

Diagram labels:
- Historical data
- Decision trees
- Real time data
- RT predictions of Swiss railways
- My real time predictions
- Prediction quality
- Quality benchmarks
- Comparison
- Arrivals / departures that recently happened
- Arrival times predicted by Swiss railways
- Training (recursive partitioning) and validation
- Reference for correctness (on next day)


Overestimated:
- Swiss railways: 0.089%
- puenktlichkeit.ch: 0.087%


Intercity Brig -> Bern, scheduled arrival :23

- Delayed arrival / departure at Thun becomes apparent
- Delayed arrival / departure at Spiez becomes apparent
- Delay of preceding «Lötschberger» train in Kander valley becomes apparent


Why does it work?

Traditionally, railways have a very strong «engineering culture», leading to the design of sophisticated systems.

The «empirical culture» is not equally strong, because in the past, feedback loops could rarely be applied.

Today, data is at hand!


So please: launch the feedback loop!

Andreas Gutweniger