60
Case Study: Wind Sports Mashup on Google App Engine JAOO Århus 2009 | Jakob A. Dam | [email protected]

Case Study: Mashup for Wind Sports on GAE€¦ · Paging: only by using the geohash index Kind Property Value Key Spot geohash sws8whkz7yzb ... Spot geohash u1vvsqd1rzrb ... Spot

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Case Study: Mashup for Wind Sports on GAE€¦ · Paging: only by using the geohash index Kind Property Value Key Spot geohash sws8whkz7yzb ... Spot geohash u1vvsqd1rzrb ... Spot

Case Study:Wind Sports Mashup on

Google App Engine

JAOO Århus 2009 | Jakob A. Dam | [email protected]

Page 2: Case Study: Mashup for Wind Sports on GAE€¦ · Paging: only by using the geohash index Kind Property Value Key Spot geohash sws8whkz7yzb ... Spot geohash u1vvsqd1rzrb ... Spot

Explaining the title Case Study:Wind Sports Mashup on Google App Engine

Page 3: Case Study: Mashup for Wind Sports on GAE€¦ · Paging: only by using the geohash index Kind Property Value Key Spot geohash sws8whkz7yzb ... Spot geohash u1vvsqd1rzrb ... Spot

Explaining the title Case Study:Wind Sports Mashup on Google App Engine

The problem: finding the winddirectionspeedspottime

Page 4: Case Study: Mashup for Wind Sports on GAE€¦ · Paging: only by using the geohash index Kind Property Value Key Spot geohash sws8whkz7yzb ... Spot geohash u1vvsqd1rzrb ... Spot

Explaining the title Case Study:Wind Sports Mashup on Google App Engine

Page 5: Case Study: Mashup for Wind Sports on GAE€¦ · Paging: only by using the geohash index Kind Property Value Key Spot geohash sws8whkz7yzb ... Spot geohash u1vvsqd1rzrb ... Spot

Agenda

Motivation, vision, and demo Architectural overview Problem: No cron jobs (GAE) Challenge: Inequality filters on one property only (GAE) Challenge: Result set <= 1000 entities (GAE)

Page 6: Case Study: Mashup for Wind Sports on GAE€¦ · Paging: only by using the geohash index Kind Property Value Key Spot geohash sws8whkz7yzb ... Spot geohash u1vvsqd1rzrb ... Spot

http://ifm.frv.dk/

Motivation               

Page 7: Case Study: Mashup for Wind Sports on GAE€¦ · Paging: only by using the geohash index Kind Property Value Key Spot geohash sws8whkz7yzb ... Spot geohash u1vvsqd1rzrb ... Spot

http://www.dmi.dk/dmi/index/danmark/borgervejr.htm?map=map1&param=wind

Motivation           

Page 8: Case Study: Mashup for Wind Sports on GAE€¦ · Paging: only by using the geohash index Kind Property Value Key Spot geohash sws8whkz7yzb ... Spot geohash u1vvsqd1rzrb ... Spot

is_surfable(direction, speed, spot, time)

Key predicate

Page 9: Case Study: Mashup for Wind Sports on GAE€¦ · Paging: only by using the geohash index Kind Property Value Key Spot geohash sws8whkz7yzb ... Spot geohash u1vvsqd1rzrb ... Spot

Problem

Page 10: Case Study: Mashup for Wind Sports on GAE€¦ · Paging: only by using the geohash index Kind Property Value Key Spot geohash sws8whkz7yzb ... Spot geohash u1vvsqd1rzrb ... Spot

+

+

+

wind sports info and logic=

A global mashup that assists practitioners of wind sports

Page 11: Case Study: Mashup for Wind Sports on GAE€¦ · Paging: only by using the geohash index Kind Property Value Key Spot geohash sws8whkz7yzb ... Spot geohash u1vvsqd1rzrb ... Spot

Demo

http://welovewind.com

Page 12: Case Study: Mashup for Wind Sports on GAE€¦ · Paging: only by using the geohash index Kind Property Value Key Spot geohash sws8whkz7yzb ... Spot geohash u1vvsqd1rzrb ... Spot

How to make it fly?

Serving infrastructure

Page 13: Case Study: Mashup for Wind Sports on GAE€¦ · Paging: only by using the geohash index Kind Property Value Key Spot geohash sws8whkz7yzb ... Spot geohash u1vvsqd1rzrb ... Spot

Google App Engine

Page 14: Case Study: Mashup for Wind Sports on GAE€¦ · Paging: only by using the geohash index Kind Property Value Key Spot geohash sws8whkz7yzb ... Spot geohash u1vvsqd1rzrb ... Spot

Google App Engine

Page 15: Case Study: Mashup for Wind Sports on GAE€¦ · Paging: only by using the geohash index Kind Property Value Key Spot geohash sws8whkz7yzb ... Spot geohash u1vvsqd1rzrb ... Spot

GAE Restrictions Feb '09

Python only

Request duration <= 10 seconds

Request only way to start processing Inequality filters on one property only ...

Page 16: Case Study: Mashup for Wind Sports on GAE€¦ · Paging: only by using the geohash index Kind Property Value Key Spot geohash sws8whkz7yzb ... Spot geohash u1vvsqd1rzrb ... Spot

Restrictions lifted since

Python only (Java, JRE subset)

Request duration <= 10 seconds (30 seconds)

Request only way to start processing (cron jobs, however, only 20) Inequality filters on one property only

Experimental Task Queue for offline processing

Page 17: Case Study: Mashup for Wind Sports on GAE€¦ · Paging: only by using the geohash index Kind Property Value Key Spot geohash sws8whkz7yzb ... Spot geohash u1vvsqd1rzrb ... Spot

How to make it fly?

A web service for connecting all the distributed resources

Page 18: Case Study: Mashup for Wind Sports on GAE€¦ · Paging: only by using the geohash index Kind Property Value Key Spot geohash sws8whkz7yzb ... Spot geohash u1vvsqd1rzrb ... Spot

Web service data model

Page 19: Case Study: Mashup for Wind Sports on GAE€¦ · Paging: only by using the geohash index Kind Property Value Key Spot geohash sws8whkz7yzb ... Spot geohash u1vvsqd1rzrb ... Spot

Architecture

123

Page 20: Case Study: Mashup for Wind Sports on GAE€¦ · Paging: only by using the geohash index Kind Property Value Key Spot geohash sws8whkz7yzb ... Spot geohash u1vvsqd1rzrb ... Spot

Architecture

123

GET /forecast_points/GET /weather_stations/

Page 21: Case Study: Mashup for Wind Sports on GAE€¦ · Paging: only by using the geohash index Kind Property Value Key Spot geohash sws8whkz7yzb ... Spot geohash u1vvsqd1rzrb ... Spot

Architecture

123

GET /weatherapi/locationforecast/1.6/?lat=56.2274;lon=10.3083Host: api.yr.no

Page 22: Case Study: Mashup for Wind Sports on GAE€¦ · Paging: only by using the geohash index Kind Property Value Key Spot geohash sws8whkz7yzb ... Spot geohash u1vvsqd1rzrb ... Spot

Architecture

123

PUT /forecast_points/56.2274,10.3083/ (JSON forecasts)

Page 23: Case Study: Mashup for Wind Sports on GAE€¦ · Paging: only by using the geohash index Kind Property Value Key Spot geohash sws8whkz7yzb ... Spot geohash u1vvsqd1rzrb ... Spot

Architecture

123

GET /forecast_points/...GET /spots/...GET /weather_stations/...

POST /spots/

Page 24: Case Study: Mashup for Wind Sports on GAE€¦ · Paging: only by using the geohash index Kind Property Value Key Spot geohash sws8whkz7yzb ... Spot geohash u1vvsqd1rzrb ... Spot

Problem:

How to flush out stale weather data?

Page 25: Case Study: Mashup for Wind Sports on GAE€¦ · Paging: only by using the geohash index Kind Property Value Key Spot geohash sws8whkz7yzb ... Spot geohash u1vvsqd1rzrb ... Spot

Solutions:Delete stale data with a cron job.

Page 26: Case Study: Mashup for Wind Sports on GAE€¦ · Paging: only by using the geohash index Kind Property Value Key Spot geohash sws8whkz7yzb ... Spot geohash u1vvsqd1rzrb ... Spot

Solutions:Delete stale data with a cron job. Maintain when inserting weather data.

Update "existing" or insert new entity if non-existing

Page 27: Case Study: Mashup for Wind Sports on GAE€¦ · Paging: only by using the geohash index Kind Property Value Key Spot geohash sws8whkz7yzb ... Spot geohash u1vvsqd1rzrb ... Spot

How? Reuse db keysForecast key names: /forecast_points/-23.0161,-43.3063/time_delta/9//forecast_points/-23.0161,-43.3063/time_delta/12//forecast_points/-23.0161,-43.3063/time_delta/15/...

Calculating time delta:time_delta = forecast time - calculation time    

Page 28: Case Study: Mashup for Wind Sports on GAE€¦ · Paging: only by using the geohash index Kind Property Value Key Spot geohash sws8whkz7yzb ... Spot geohash u1vvsqd1rzrb ... Spot

Too resource intensive

~100 entities for each forecast point are updated

Page 29: Case Study: Mashup for Wind Sports on GAE€¦ · Paging: only by using the geohash index Kind Property Value Key Spot geohash sws8whkz7yzb ... Spot geohash u1vvsqd1rzrb ... Spot

Solutions cont'd:

Combine the one-to-many relationship into one entity.

Page 30: Case Study: Mashup for Wind Sports on GAE€¦ · Paging: only by using the geohash index Kind Property Value Key Spot geohash sws8whkz7yzb ... Spot geohash u1vvsqd1rzrb ... Spot

class ForecastPoint(db.Model): point = db.GeoPtProperty() calculation_time = db.DateTimeProperty() forecasts = db.TextProperty() ...

Page 31: Case Study: Mashup for Wind Sports on GAE€¦ · Paging: only by using the geohash index Kind Property Value Key Spot geohash sws8whkz7yzb ... Spot geohash u1vvsqd1rzrb ... Spot

class ForecastPoint(db.Model): point = db.GeoPtProperty() calculation_time = db.DateTimeProperty() forecasts = db.TextProperty() ... forecasts is a JSON list: [ { "direction": 269.1, "speed": 6.2, "temp": 7.7, "time": "2009-10-04T23:00:00" },(...)]

Page 32: Case Study: Mashup for Wind Sports on GAE€¦ · Paging: only by using the geohash index Kind Property Value Key Spot geohash sws8whkz7yzb ... Spot geohash u1vvsqd1rzrb ... Spot

Forecasts as text:

Forecasts as entities:

Page 33: Case Study: Mashup for Wind Sports on GAE€¦ · Paging: only by using the geohash index Kind Property Value Key Spot geohash sws8whkz7yzb ... Spot geohash u1vvsqd1rzrb ... Spot

Agenda

Motivation, vision, and demo Architectural overview Problem: No cron jobs (GAE) Challenge: Inequality filters on one property only (GAE) Challenge: Result set <= 1000 entities (GAE)

Page 34: Case Study: Mashup for Wind Sports on GAE€¦ · Paging: only by using the geohash index Kind Property Value Key Spot geohash sws8whkz7yzb ... Spot geohash u1vvsqd1rzrb ... Spot

Geo. queries are not directly supported

 

Page 35: Case Study: Mashup for Wind Sports on GAE€¦ · Paging: only by using the geohash index Kind Property Value Key Spot geohash sws8whkz7yzb ... Spot geohash u1vvsqd1rzrb ... Spot

Too many points

Page 36: Case Study: Mashup for Wind Sports on GAE€¦ · Paging: only by using the geohash index Kind Property Value Key Spot geohash sws8whkz7yzb ... Spot geohash u1vvsqd1rzrb ... Spot

SELECT *FROM SpotsWHERE lat > 54 AND lat < 58 AND lon > 8 AND lon < 16;

Page 37: Case Study: Mashup for Wind Sports on GAE€¦ · Paging: only by using the geohash index Kind Property Value Key Spot geohash sws8whkz7yzb ... Spot geohash u1vvsqd1rzrb ... Spot

SELECT *FROM SpotsWHERE lat > 54 AND lat < 58 AND lon > 8 AND lon < 16;

"Inequality Filters Are Allowed On One Property Only"

-- GAE

Page 38: Case Study: Mashup for Wind Sports on GAE€¦ · Paging: only by using the geohash index Kind Property Value Key Spot geohash sws8whkz7yzb ... Spot geohash u1vvsqd1rzrb ... Spot

Bounding box query

Using index on lat. and index on lon.

Page 39: Case Study: Mashup for Wind Sports on GAE€¦ · Paging: only by using the geohash index Kind Property Value Key Spot geohash sws8whkz7yzb ... Spot geohash u1vvsqd1rzrb ... Spot

Solution:Convert points to values

in a single dimension

using a scheme that preserves proximity.

Page 40: Case Study: Mashup for Wind Sports on GAE€¦ · Paging: only by using the geohash index Kind Property Value Key Spot geohash sws8whkz7yzb ... Spot geohash u1vvsqd1rzrb ... Spot

GeohashBase32 = "0123456789bcdefghjkmnpqrstuvwxyz"Value = 012... 31"0" <=> 000002 <=> (-67.5°, -157.5°)

Page 41: Case Study: Mashup for Wind Sports on GAE€¦ · Paging: only by using the geohash index Kind Property Value Key Spot geohash sws8whkz7yzb ... Spot geohash u1vvsqd1rzrb ... Spot

GeohashBase32 = "0123456789bcdefghjkmnpqrstuvwxyz"Value = 012... 31"00"<=> 00000 000002 <=> (-87.1875°,-174.375°)

Page 42: Case Study: Mashup for Wind Sports on GAE€¦ · Paging: only by using the geohash index Kind Property Value Key Spot geohash sws8whkz7yzb ... Spot geohash u1vvsqd1rzrb ... Spot

Note:

Points in the same grid cell have the same geohash prefix

Page 43: Case Study: Mashup for Wind Sports on GAE€¦ · Paging: only by using the geohash index Kind Property Value Key Spot geohash sws8whkz7yzb ... Spot geohash u1vvsqd1rzrb ... Spot
Page 44: Case Study: Mashup for Wind Sports on GAE€¦ · Paging: only by using the geohash index Kind Property Value Key Spot geohash sws8whkz7yzb ... Spot geohash u1vvsqd1rzrb ... Spot

Prefix query for proximity points (SQL)

SELECT *FROM SpotsWHERE geohash LIKE 'U1%'

Page 45: Case Study: Mashup for Wind Sports on GAE€¦ · Paging: only by using the geohash index Kind Property Value Key Spot geohash sws8whkz7yzb ... Spot geohash u1vvsqd1rzrb ... Spot

Prefix query for proximity points (SQL)

SELECT *FROM SpotsWHERE geohash LIKE 'U1%'

LIKE not available on GAE!

Page 46: Case Study: Mashup for Wind Sports on GAE€¦ · Paging: only by using the geohash index Kind Property Value Key Spot geohash sws8whkz7yzb ... Spot geohash u1vvsqd1rzrb ... Spot

Prefix query for proximity points (SQL)

SELECT *FROM SpotsWHERE geohash LIKE 'U1%'

LIKE not available on GAE!

SELECT *FROM SpotsWHERE geohash >= 'U1' AND geohash < 'U2'

Page 47: Case Study: Mashup for Wind Sports on GAE€¦ · Paging: only by using the geohash index Kind Property Value Key Spot geohash sws8whkz7yzb ... Spot geohash u1vvsqd1rzrb ... Spot

Prefix query for proximity points (GAE)

query = db.Query(Spot)query.filter('geohash >=', 'u1')query.filter('geohash <', 'u1' + u'\ufffd') The largest possible unicode char: �

Page 48: Case Study: Mashup for Wind Sports on GAE€¦ · Paging: only by using the geohash index Kind Property Value Key Spot geohash sws8whkz7yzb ... Spot geohash u1vvsqd1rzrb ... Spot

Advantage: proximity queries supported by index

Kind Property Value KeySpot geohash sws8whkz7yzb . Spot geohash u1vvsqd1rzrb .Spot geohash u1yznthncyzb .Spot geohash u1zjy5pd7fxg ....Spot geohash u3bqk1wvrgzy .

Page 49: Case Study: Mashup for Wind Sports on GAE€¦ · Paging: only by using the geohash index Kind Property Value Key Spot geohash sws8whkz7yzb ... Spot geohash u1vvsqd1rzrb ... Spot

Challenge:

"If more than 1000 entities match the query

only the first 1000 results are returned" 

 -- GAE doc.       

Page 50: Case Study: Mashup for Wind Sports on GAE€¦ · Paging: only by using the geohash index Kind Property Value Key Spot geohash sws8whkz7yzb ... Spot geohash u1vvsqd1rzrb ... Spot

Solution:

Apply paging using the geohash index.

Page 51: Case Study: Mashup for Wind Sports on GAE€¦ · Paging: only by using the geohash index Kind Property Value Key Spot geohash sws8whkz7yzb ... Spot geohash u1vvsqd1rzrb ... Spot

Paging: only by using the geohash index

Kind Property Value KeySpot geohash sws8whkz7yzb ... Spot geohash u1vvsqd1rzrb ...Spot geohash u1yznthncyzb ...Spot geohash u1zjy5pd7fxg ......Spot geohash u3bqk1wvrgzy ...

Page 52: Case Study: Mashup for Wind Sports on GAE€¦ · Paging: only by using the geohash index Kind Property Value Key Spot geohash sws8whkz7yzb ... Spot geohash u1vvsqd1rzrb ... Spot

Spots Paging: using the geohash index.../api/spots/?gh_prefix=u1&gh_offset=u1zrfef3xbzgPAGE_SIZE = 2 def index(request): prefix = request.GET.get('gh_prefix', '') offset = request.GET.get('gh_offset', prefix) (...)

Page 53: Case Study: Mashup for Wind Sports on GAE€¦ · Paging: only by using the geohash index Kind Property Value Key Spot geohash sws8whkz7yzb ... Spot geohash u1vvsqd1rzrb ... Spot

Spots Paging: using the geohash index.../api/spots/?gh_prefix=u1&gh_offset=u1zrfef3xbzgPAGE_SIZE = 2 def index(request): prefix = request.GET.get('gh_prefix', '') offset = request.GET.get('gh_offset', prefix) q = db.Query(Spot) q.filter('geohash >=', offset) q.filter('geohash <', prefix + u'\ufffd') q.order('geohash') spots = q.fetch(PAGE_SIZE + 1) (...)

Page 54: Case Study: Mashup for Wind Sports on GAE€¦ · Paging: only by using the geohash index Kind Property Value Key Spot geohash sws8whkz7yzb ... Spot geohash u1vvsqd1rzrb ... Spot

Spots Paging: using the geohash index.../api/spots/?gh_prefix=u1&gh_offset=u1zrfef3xbzgPAGE_SIZE = 2 def index(request): prefix = request.GET.get('gh_prefix', '') offset = request.GET.get('gh_offset', prefix) q = db.Query(Spot) q.filter('geohash >=', offset) q.filter('geohash <', prefix + u'\ufffd') q.order('geohash') spots = q.fetch(PAGE_SIZE + 1) has_next_page = len(spots) > PAGE_SIZE if has_next_page: qs = request.GET.copy() qs['gh_offset'] = spots[-1].geohash spots = spots[:-1] # create representation with uri to next page (...)

Page 55: Case Study: Mashup for Wind Sports on GAE€¦ · Paging: only by using the geohash index Kind Property Value Key Spot geohash sws8whkz7yzb ... Spot geohash u1vvsqd1rzrb ... Spot

Spots Representation: http://welovewind.com/api/spots/?gh_prefix=u1 { "items":[ { "name": "Bork Havn", "lon": 8.2757949829101562, "lat": 55.84650606768372, "uri": "/api/spots/dk/bork_havn/", "forecast_point": "/api/forecast_points/55.8465,8.2758/", "country_code": "dk" },(...)], "next": "/api/spots/?gh_prefix=u1&gh_offset=u1zrfef3xbzg"}

Page 56: Case Study: Mashup for Wind Sports on GAE€¦ · Paging: only by using the geohash index Kind Property Value Key Spot geohash sws8whkz7yzb ... Spot geohash u1vvsqd1rzrb ... Spot

Challenge:

The proximity property is not preserved

in all cases with geohash.

Page 57: Case Study: Mashup for Wind Sports on GAE€¦ · Paging: only by using the geohash index Kind Property Value Key Spot geohash sws8whkz7yzb ... Spot geohash u1vvsqd1rzrb ... Spot

g... u...

Problem: Proximity property of geohash

Page 58: Case Study: Mashup for Wind Sports on GAE€¦ · Paging: only by using the geohash index Kind Property Value Key Spot geohash sws8whkz7yzb ... Spot geohash u1vvsqd1rzrb ... Spot

Include all neighbor cells

http://www.welovewind.com/examples/geohash/index.html

Page 59: Case Study: Mashup for Wind Sports on GAE€¦ · Paging: only by using the geohash index Kind Property Value Key Spot geohash sws8whkz7yzb ... Spot geohash u1vvsqd1rzrb ... Spot

ConclusionIn this talk

Motivation and vision Architectural overview Problem: No cron jobsChallenge: Limited inequality operators Challenge: Result set <= 1000 entities

The challenges are your friend. The result

A mashup designed with high scalability.

Page 60: Case Study: Mashup for Wind Sports on GAE€¦ · Paging: only by using the geohash index Kind Property Value Key Spot geohash sws8whkz7yzb ... Spot geohash u1vvsqd1rzrb ... Spot

ConclusionIn this talk

Motivation and vision Architectural overview Problem: No cron jobsChallenge: Limited inequality operators Challenge: Result set <= 1000 entities

The challenges are your friend.

The resultA mashup designed with high scalability.

More info

http://welovewind.com/about Thank you.