Graphing the Second Screen - Glen Ford @ GraphConnect London 2013


Glen will show how a graph database is helping zeebox improve both performance and the end-user experience.


zeebox: Graphing the Second Screen

zeebox is Your TV Sidekick

In the beginning…

We came up with a relational model from which we can build an EPG.

So, what is the problem?

An EPG looks like a good fit for a relational database?

An EPG, maybe. But we have potentially 1.5 million EPGs…

We also have a lot of available data sources…

zeebox analyses live TV to understand the context of what’s on air…

+ lots, lots more…

With some smart implementation it all worked well, but it suffers from limitations, including how often we can update those EPGs.

And we are/want to be much, much more than just an EPG.

One problem to solve…

What channel? What time? Can Ian even see it?

Now it can be done

but it’s ugly

and it’s slow.

http://www.fanpop.com/clubs/lt-commander-data/images/31158615/title/data-photo

We have all the data. Is there a smarter way to structure it?

Need to handle:
Structured and Semi-Structured
Densely Connected
High Read Rates
Relatively Low Write Rates

[Diagram, built up over several slides: the episode "Dr Who S1 EP1" is linked to a Broadcast by AIRED_ON; the Broadcast is linked to a Time by AIRED_AT and to a Channel by AIRED_BY; the Channel is linked to a Provider by AVAILABLE_ON.]

MATCH (b:Broadcast)<-[:AIRED_ON]-(ep:Episode)
      -[:AIRED_ON]->(d:Broadcast)
      -[:AIRED_BY]->(c:Channel)
      -[:AVAILABLE_ON]->(p:Provider),
      (d:Broadcast)-[:AIRED_AT]->(t:Time)
WHERE b.broadcast_id = {JIMs BROADCAST}
  AND p.provider_id = {IANs PROVIDER}
RETURN d,c,t;
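To make the shape of that query concrete, here is a minimal in-memory sketch in Python of the same walk: from Jim's broadcast back to the episode, out to its other broadcasts, and on to the channels that Ian's provider carries. The relationship names come from the slides; every node id and the sample data are invented for illustration, and skipping the starting broadcast is an assumption about the intended result. This is not zeebox's implementation.

```python
from collections import defaultdict

# Hypothetical sample data: (source, relationship, target) triples.
# Relationship names follow the slides; all node ids are invented.
EDGES = [
    ("ep:drwho-s1e1", "AIRED_ON", "b:1"),
    ("ep:drwho-s1e1", "AIRED_ON", "b:2"),
    ("ep:drwho-s1e1", "AIRED_ON", "b:3"),
    ("b:1", "AIRED_BY", "ch:A"),
    ("b:2", "AIRED_BY", "ch:B"),
    ("b:3", "AIRED_BY", "ch:C"),
    ("b:1", "AIRED_AT", "t:mon-19:00"),
    ("b:2", "AIRED_AT", "t:tue-21:00"),
    ("b:3", "AIRED_AT", "t:wed-20:00"),
    ("ch:A", "AVAILABLE_ON", "p:jim"),
    ("ch:B", "AVAILABLE_ON", "p:ian"),
    ("ch:C", "AVAILABLE_ON", "p:jim"),
]

# Index edges in both directions so each hop is a dictionary lookup.
out_edges = defaultdict(list)
in_edges = defaultdict(list)
for src, rel, dst in EDGES:
    out_edges[(src, rel)].append(dst)
    in_edges[(dst, rel)].append(src)


def other_airings(broadcast_id, provider_id):
    """Return (broadcast, channel, time) for other airings of the same
    episode on channels that the given provider carries."""
    results = []
    for episode in in_edges[(broadcast_id, "AIRED_ON")]:
        for broadcast in out_edges[(episode, "AIRED_ON")]:
            if broadcast == broadcast_id:
                continue  # assumption: the starting broadcast is not wanted
            for channel in out_edges[(broadcast, "AIRED_BY")]:
                if provider_id not in out_edges[(channel, "AVAILABLE_ON")]:
                    continue
                for time in out_edges[(broadcast, "AIRED_AT")]:
                    results.append((broadcast, channel, time))
    return results


print(other_airings("b:1", "p:ian"))
```

Each hop only touches the neighbours of the node it is standing on, which is the property the graph model is meant to exploit.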

But  what  are  the  numbers?  

So, some early benchmarks based on 7 days' worth of broadcast data.

Running on a 2011 MBP, 2.3GHz, 8GB with SSD. Neo 1.9, MySQL 5.1.

MySQL                  80 seconds
Cypher 1st attempt     6 seconds
Cypher after tuning    190 milliseconds
Traversal*             42 milliseconds
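For a sense of scale, the reported timings can be turned into speed-up factors relative to the MySQL figure. The short Python sketch below does only that arithmetic on the four numbers from the slide; the variable names are illustrative.

```python
# Timings as reported on the slide, converted to milliseconds.
timings_ms = {
    "MySQL": 80_000,
    "Cypher 1st attempt": 6_000,
    "Cypher after tuning": 190,
    "Traversal": 42,
}

# Speed-up of each approach relative to the MySQL baseline.
baseline = timings_ms["MySQL"]
speedups = {name: baseline / ms for name, ms in timings_ms.items()}

for name, factor in speedups.items():
    print(f"{name:<20} {factor:8.1f}x")
```

On these figures the first Cypher attempt is roughly 13 times faster than MySQL, the tuned Cypher query roughly 420 times, and the traversal roughly 1,900 times.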

So that’s cool, you can make your system faster. Nice.

But that’s not actually the really good bit.

Deeper questions of the data…

[Diagram: Tom Baker is linked to Dr Who and BlackAdder by APPEARED_IN, and to Little Britain by NARRATED. A second build adds Broadcast, Time and Channel nodes with AIRED_ON, AIRED_AT and AIRED_BY relationships.]
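One "deeper question" this kind of structure supports is: which other shows share a person with the one being watched, and in what capacity? The Python sketch below answers it over the three credits shown in the diagram. The function name and data layout are hypothetical and stand in for a graph query.

```python
# Credits from the diagram as (person, relationship, show) triples.
CREDITS = [
    ("Tom Baker", "APPEARED_IN", "Dr Who"),
    ("Tom Baker", "APPEARED_IN", "BlackAdder"),
    ("Tom Baker", "NARRATED", "Little Britain"),
]


def related_shows(show):
    """Map each other show to the (person, relationship) pairs linking it
    to `show` through a shared person."""
    people = {person for person, _, s in CREDITS if s == show}
    related = {}
    for person, rel, other in CREDITS:
        if person in people and other != show:
            related.setdefault(other, []).append((person, rel))
    return related


print(related_shows("Dr Who"))
```

Because the relationship type travels with the result, the caller can tell "appeared in" apart from "narrated" without a separate lookup.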

Flexibility of the data…

[Diagram: nodes for "Dr Who S1 EP1", "Dr Who S1" and "Dr Who", with two Image nodes attached.]

Represent the data as it is, no “wedging”

[Diagram: "Star Trek" (FRANCHISE), with "Star Trek VI" (MOVIE), "Star Trek S1 EP1" (EPISODE) and "Star Trek Con. Live" (LIVE EVENT), each of the latter three linked to its own Broadcast.]
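The point of the diagram is that a movie, an episode and a live event can each carry a Broadcast without being forced into one common record shape. A rough Python sketch of that idea follows. The node kinds come from the slide; the ids, the PART_OF relationship name and the broadcast ids are hypothetical.

```python
# Content nodes keyed by id; "kind" mirrors the labels on the slide.
# Ids are invented for illustration.
NODES = {
    "star-trek": {"kind": "FRANCHISE", "title": "Star Trek"},
    "star-trek-vi": {"kind": "MOVIE", "title": "Star Trek VI"},
    "star-trek-s1e1": {"kind": "EPISODE", "title": "Star Trek S1 EP1"},
    "star-trek-con": {"kind": "LIVE EVENT", "title": "Star Trek Con. Live"},
}

# Hypothetical PART_OF relationship tying each item to its franchise.
PART_OF = {
    "star-trek-vi": "star-trek",
    "star-trek-s1e1": "star-trek",
    "star-trek-con": "star-trek",
}

# Any kind of content node can be the source of an AIRED_ON relationship.
AIRED_ON = [
    ("star-trek-vi", "b:10"),
    ("star-trek-s1e1", "b:11"),
    ("star-trek-con", "b:12"),
]


def broadcasts_in_franchise(franchise_id):
    """Return sorted (kind, broadcast) pairs for everything in a franchise."""
    return sorted(
        (NODES[content]["kind"], broadcast)
        for content, broadcast in AIRED_ON
        if PART_OF.get(content) == franchise_id
    )


print(broadcasts_in_franchise("star-trek"))
```

Nothing in the lookup cares whether the content is a movie, an episode or a live event, so adding a new kind of content needs no schema change in this sketch.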

[Diagram: Glen WORKS_FOR zeebox; Chief Architect role (HOLDS_ROLE, HAS_ROLE); USES_TWITTER @glen_ford; HAS_EMAIL glen@zeebox.com]

Thank you.
