Freeing Tower Bridge

It’s 2017! Why the f*ck are we still scraping web sites?

Bad Londoner

Confession

I have lived in London for over 35 years

I have never seen Tower Bridge lift

* Who are hiring (obviously)

Geographical Convenience

Opportunity!

Notification

Data feed → Android → Pebble

Profit!

Data

No Data

(Machine readable)

DIY

It’s 2017! Why the f*ck are we still scraping web sites?

Web::Query

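use Web::Query;

my @lifts;

# Fetch the page, pick out each row of the lift times table
# and store the text of its cells as one array ref per row
wq('http://www.towerbridge.org.uk/lift-times/')
  ->find('table.lined tbody tr')
  ->each(sub {
    push @lifts, [ map { $_->text } $_[1]->contents ];
  });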

$VAR1 = [
  [ 'Sat', '11 Mar', '07:30', 'Maintenance Lift ', 'Up river' ],
  [ 'Sat', '11 Mar', '08:00', 'Maintenance Lift ', 'Down river' ],
  ...
];
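The munging step can be sketched in plain Perl. The field names below (day, date, time, vessel, direction) are only a guess at what the five scraped cells mean, and row_to_lift is a hypothetical helper rather than anything from the talk's code.

```perl
use strict;
use warnings;

# Assumed names for the five cells in each scraped row.
my @FIELDS = qw(day date time vessel direction);

# Turn one scraped row (an array ref of cell text) into a hash ref,
# trimming stray whitespace such as the trailing space in
# 'Maintenance Lift '.
sub row_to_lift {
    my ($row) = @_;
    my %lift;
    @lift{@FIELDS} = map {
        my $cell = $_;
        $cell =~ s/^\s+|\s+$//g;
        $cell;
    } @$row;
    return \%lift;
}

my $lift = row_to_lift(
    [ 'Sat', '11 Mar', '07:30', 'Maintenance Lift ', 'Up river' ]
);
print "$lift->{time} $lift->{date}: $lift->{vessel} ($lift->{direction})\n";
```

Named fields make the later steps easier to read than $_->[3] and $_->[4], at the cost of one more pass over the data.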

Munge

iCal

Profit!

Well...

Data::ICal

Data::ICal::Entry::Event

my $ical = Data::ICal->new();

for (@lifts) {
  my $date = ...;
  my $event = Data::ICal::Entry::Event->new();
  $event->add_properties(...);
  $ical->add_entry($event);
}

# $fh is a filehandle already opened for writing
print $fh $ical->as_string;

No Year

my $date = $dt_parser->parse_datetime(
  "$_->[2] $_->[1] $curr_year"
);

# If the month number of this event is less
# than the current month number then we've
# gone to the next year. Increment the year
# number and re-calculate.
if ($date->mon < $curr_mon) {
  ++$curr_year;
  $date = $dt_parser->parse_datetime(
    "$_->[2] $_->[1] $curr_year"
  );
}

# Tower Bridge web site occasionally
# has duplicates
next if $seen{$date->epoch}++;
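Here is a minimal, core-Perl sketch of the "no year" fix and the duplicate check together. It is not the talk's code: add_years is a hypothetical helper, the sample rows are made up, and it compares each row against the previous row's month (an assumption on my part) rather than a fixed current month.

```perl
use strict;
use warnings;

my %MONTH_NUM;
@MONTH_NUM{qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec)} = 1 .. 12;

# Hypothetical helper. Takes the year and month to start from, then rows
# shaped like the scraped data: [ day name, 'DD Mon', 'HH:MM', ... ].
# Returns 'YYYY-MM-DD HH:MM' strings with the year filled in.
sub add_years {
    my ($year, $last_mon, @rows) = @_;
    my (@stamps, %seen);

    for my $row (@rows) {
        my ($day, $mon_name) = split ' ', $row->[1];
        my $mon = $MONTH_NUM{$mon_name}
            or die "Unknown month '$mon_name'";

        # A month earlier than the previous one means we've wrapped
        # round into the next year.
        ++$year if $mon < $last_mon;
        $last_mon = $mon;

        my $stamp = sprintf '%04d-%02d-%02d %s',
            $year, $mon, $day, $row->[2];

        # Skip rows that repeat a date and time we've already seen.
        next if $seen{$stamp}++;
        push @stamps, $stamp;
    }

    return @stamps;
}

# Made-up sample rows spanning a year boundary.
print "$_\n" for add_years(2017, 12,
    [ 'Sat', '30 Dec', '10:00' ],
    [ 'Sat', '30 Dec', '10:00' ],   # duplicate row
    [ 'Mon', '01 Jan', '09:15' ],
);
```

Tracking the previous month means a list that runs from December through January into February only bumps the year once.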

my $event = Data::ICal::Entry::Event->new();

$event->add_properties(
  summary     => 'Tower Bridge Lift',
  description => "$_->[3] ($_->[4])",
  dtstart     => dt2ical($date),
  duration    => 'PT30M',
  dtstamp     => $now_ical,
  uid         => $date->epoch . '@towerbridge.dave.org.uk',
);
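To see roughly what text those properties turn into in the feed, here is a hand-rolled sketch in core Perl. Data::ICal does all of this for you, so this is illustration only: vevent and ical_text are hypothetical helpers, the uid and dtstamp values are made up, and the sketch ignores details such as folding long lines.

```perl
use strict;
use warnings;

# Escape a value for an iCalendar TEXT property: backslash, semicolon,
# comma and newline all need a backslash escape.
sub ical_text {
    my ($text) = @_;
    $text =~ s/([\\;,])/\\$1/g;
    $text =~ s/\n/\\n/g;
    return $text;
}

# Hypothetical helper: render one event by hand.
# iCalendar content lines end in CRLF.
sub vevent {
    my (%p) = @_;
    return join "\r\n",
        'BEGIN:VEVENT',
        "UID:$p{uid}",
        "DTSTAMP:$p{dtstamp}",
        "DTSTART:$p{dtstart}",
        "DURATION:$p{duration}",
        'SUMMARY:' . ical_text($p{summary}),
        'DESCRIPTION:' . ical_text($p{description}),
        'END:VEVENT',
        '';
}

print vevent(
    uid         => 'example-1@example.org',    # made-up value
    dtstamp     => '20170310T120000Z',         # made-up value
    dtstart     => '20170311T073000',
    duration    => 'PT30M',
    summary     => 'Tower Bridge Lift',
    description => 'Maintenance Lift (Up river)',
);
```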

A detour

Different Timezones

(I assume)

Timezones are easy

my $dt_parser = DateTime::Format::Strptime->new(
  pattern   => '%H:%M %d %b %Y',
  time_zone => 'Europe/London',
);

sub dt2ical {
  my ($dt) = @_;

  return $dt->ymd('') . 'T' . $dt->hms('') .
    # Or something like this.
    # Check iCal specs.
    $dt->time_zone_short_name;
}

Failed validation

https://icalendar.org/validator.html

DateTime::Format::ICal

TZID=Europe/London:20170311T105600

Looked OK

Failed validation

“Invalid TZID”

TZID=Europe/London:20170311T105600

To the standard definition!

This property specifies the text value that uniquely identifies the "VTIMEZONE" calendar component in the scope of an iCalendar object.

If present, the "VTIMEZONE" calendar component defines the set of Standard Time and Daylight Saving Time observances (or rules) for a particular time zone for a given interval of time.

Add VTIMEZONE section to the iCal file

Data::ICal::Entry::TimeZone

This module is not yet useful, because every time zone declaration needs to contain at least one STANDARD or DAYLIGHT component, and these have not yet been implemented.

Plan C

Back to the iCal standard definition

DATE WITH LOCAL TIME

The date with local time form is simply a DATE-TIME value that does not contain the UTC designator nor does it reference a time zone. For example, the following represents January 18, 1998, at 11 PM:

19980118T230000

sub dt2ical {
  my ($dt) = @_;

  return $dt->ymd('') . 'T' . $dt->hms('');
}
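As a quick check of that format against the example in the standard, here is a core-Perl sketch. local_ical is a hypothetical stand-in for dt2ical that takes plain numbers instead of a DateTime object, so it runs with no CPAN modules installed.

```perl
use strict;
use warnings;

# Hypothetical stand-in for dt2ical: builds the "date with local time"
# form (no UTC designator, no time zone reference) from plain numbers.
sub local_ical {
    my ($year, $mon, $day, $hour, $min, $sec) = @_;
    return sprintf '%04d%02d%02dT%02d%02d%02d',
        $year, $mon, $day, $hour, $min, $sec // 0;
}

# January 18, 1998, at 11 PM, as in the standard's example.
print local_ical(1998, 1, 18, 23, 0, 0), "\n";   # 19980118T230000
```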

And we have a valid iCal feed

Throw together a web site

http://towerbridge.dave.org.uk

Rebuild the data daily

Stick the code on GitHub

https://github.com/davorg/towerbridge

Subscribe to the calendar

Profit!

13th Feb 2017 13:30

Prior Art

Sun 2nd April: 20:30 & 21:15

Dave Cross

@davorg

@perlhacks

https://perlhacks.com/