Acunu Analytics Realtime Big Data Analytics Tom Wilkie, Acunu 16th July 2012

Acunu Analytics @ Cassandra London

DESCRIPTION

My talk about Acunu Analytics - for video see http://skillsmatter.com/podcast/nosql/acunu-analytics


Page 1: Acunu Analytics @ Cassandra London

Acunu Analytics
Realtime Big Data Analytics

Tom Wilkie, Acunu
16th July 2012

Page 2: Acunu Analytics @ Cassandra London

Analytics

• Motivation / alternatives

• What is it?

• How does it work?

• What's it good for?

Page 4: Acunu Analytics @ Cassandra London

Analytics

time          page                           session id    duration
...           ...                            ...           ...
14:58:03.234  /index.html                    248.180.3.40  175
14:58:03.409  /csi/csi/council/freedom.html  248.180.3.40  1234
14:58:03.877  /docs/access/chapter8.txt      99.1.10.178   52
14:58:03.877  /docs/access/chapter8.txt      99.1.10.178   52
14:58:03.877  /docs/access/chapter8.txt      99.1.10.178   52
14:58:03.409  /csi/csi/council/freedom.html  248.180.3.40  1234
14:58:03.877  /docs/access/chapter8.txt      99.1.10.178   52
14:58:03.409  /csi/csi/council/freedom.html  248.180.3.40  1234
14:58:03.877  /docs/access/chapter8.txt      99.1.10.178   52


Page 5: Acunu Analytics @ Cassandra London

Analytics

Live & historical aggregates...

Trends...

Drill downs and roll ups

Combining “big” and “real-time” is hard


Page 6: Acunu Analytics @ Cassandra London

Analytics

Solution      Con
[not shown]   Scalability, $$$
[not shown]   Not realtime, inefficient recomputation
[not shown]   Spartan query semantics => complex, DIY solutions

Page 7: Acunu Analytics @ Cassandra London

Analytics

• Motivation / alternatives

• What is it?

• How does it work?

• What's it good for?

Page 8: Acunu Analytics @ Cassandra London

Analytics

• Simple, real-time, incremental analytics

• Push processing into ingest phase

[Diagram: click stream, sensor data, etc. -> events -> Acunu Analytics -> counter updates]
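A minimal sketch of what "push processing into ingest phase" can mean, assuming a plain in-memory dictionary stands in for the counter store. The class name `IngestTimeCounter` and the function `count_by_scan` are hypothetical and are not part of Acunu Analytics.

```python
from collections import defaultdict


class IngestTimeCounter:
    """Keeps a running count per page, updated as each event arrives."""

    def __init__(self):
        self.counts = defaultdict(int)

    def ingest(self, event):
        # All aggregation work happens here, once per event.
        self.counts[event["page"]] += 1

    def query(self, page):
        # A query is a single lookup; no scan over raw events.
        return self.counts.get(page, 0)


def count_by_scan(events, page):
    """Query-time alternative for comparison: rescan every raw event."""
    return sum(1 for e in events if e["page"] == page)


events = [
    {"page": "/index.html"},
    {"page": "/docs/access/chapter8.txt"},
    {"page": "/index.html"},
]

counter = IngestTimeCounter()
for e in events:
    counter.ingest(e)

print(counter.query("/index.html"))          # 2
print(count_by_scan(events, "/index.html"))  # 2
```

Both approaches return the same answer; the difference is that the ingest-time version pays a small cost per event so that reads stay cheap as the event log grows.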

Page 9: Acunu Analytics @ Cassandra London

Analytics

{
  time     : TIME(HOUR; MIN; SEC),
  page     : PATH(/),
  category : STRING,
  loadTime : LONG
}

{
  select : ["COUNT", "AVG(loadTime)"],
  where  : "time, ?path",
  group  : "time, ?category"
}
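One way to picture how COUNT and AVG(loadTime) could be kept up to date per group is to store a running count and a running total and divide at read time. The sketch below assumes grouping by hour and category only and ignores the `where` clause; the function names and the category values are made up for illustration and do not reflect the product's internals.

```python
from collections import defaultdict

# Running [count, total loadTime] per (hour, category) group.
groups = defaultdict(lambda: [0, 0])


def ingest(event):
    """Fold one event into its group's running totals."""
    hour = event["time"].split(":")[0]
    group = groups[(hour, event["category"])]
    group[0] += 1
    group[1] += event["loadTime"]


def count_and_avg(hour, category):
    """Answer COUNT and AVG(loadTime) for one group from the running totals."""
    count, total = groups.get((hour, category), (0, 0))
    return count, (total / count if count else None)


# Hypothetical events; the category values are assumptions.
for e in [
    {"time": "14:58:03", "page": "/index.html", "category": "home", "loadTime": 175},
    {"time": "14:58:03", "page": "/docs/access/chapter8.txt", "category": "docs", "loadTime": 52},
    {"time": "14:59:10", "page": "/docs/access/chapter8.txt", "category": "docs", "loadTime": 148},
]:
    ingest(e)

print(count_and_avg("14", "docs"))  # (2, 100.0)
```

Storing the sum and count rather than the average itself is what makes the aggregate incremental: each new event is a pair of additions.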


Page 10: Acunu Analytics @ Cassandra London

Analytics

• Motivation / alternatives

• What is it?

• How does it work?

• What's it good for?

Page 11: Acunu Analytics @ Cassandra London

Analytics

Introduction


Page 12: Acunu Analytics @ Cassandra London

Analytics

count grouped by ... day

count distinct (session)

count ... geography

... browser

avg(duration)

Page 13: Acunu Analytics @ Cassandra London

Analytics

Data Definition

  time       : TIME(HOUR; MIN; SEC),
  cust_id    : LONG,
  session_id : LONG,
  geography  : STRING,
  browser    : STRING,
  load_time  : LONG

Query Patterns

  { select: "COUNT"
    patterns: [
      { where : "?time", group : "?time" },
      { where : "", group : "geography" },
      { where : "", group : "browser" }
    ]},
  { select: ["COUNT_DISTINCT(session_id)", "AVG(load_time)"],
    where: "time",
    group: "" }

Page 14: Acunu Analytics @ Cassandra London

Analytics

21:00 all→1345 :00→45 :01→62 :02→87 ...

22:00 all→3221 :00→22 :01→19 :02→104 ...

... ...

UK all→228 user01→1 user14→12 user99→7 ...

US all→354 user01→4 user04→8 user56→17 ...

...

UK, 22:00 all→1904 ...

∅ all→87314 UK→238 US→354 ...

{
  cust_id: user01,
  session_id: 102,
  geography: UK,
  browser: IE,
  time: 22:02,
}

Page 15: Acunu Analytics @ Cassandra London

Analytics

21:00 all→1345 :00→45 :01→62 :02→87 ...

22:00 all→3222 :00→22 :01→19 :02→105 ...

... ...

UK all→229 user01→2 user14→12 user99→7 ...

US all→354 user01→4 user04→8 user56→17 ...

...

UK, 22:00 all→1905 ...

∅ all→87315 UK→239 US→354 ...

{
  cust_id: user01,
  session_id: 102,
  geography: UK,
  browser: IE,
  time: 22:02,
}

Page 16: Acunu Analytics @ Cassandra London

Analytics

21:00 all→1345 :00→45 :01→62 :02→87 ...

22:00 all→3222 :00→22 :01→19 :02→105 ...

... ...

UK all→229 user01→2 user14→12 user99→7 ...

US all→354 user01→4 user04→8 user56→17 ...

...

UK, 22:00 all→1905 ...

∅ all→87315 UK→239 US→354 ...

where time 21:00-22:00, count(*)

where time 22:00-23:00, group by minute

where geography=UK, group all by user

count all

group all by geo
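The counter rows on these slides can be mimicked with nested dictionaries. This is a minimal sketch, assuming each row is keyed by a string such as "22:00" or "UK" and that the empty string stands in for the ∅ row. The `ingest` function and the key layout are illustrative assumptions, not Acunu's actual storage format. The seed values are the ones shown on the slide before the event arrives.

```python
from collections import defaultdict

# counters[row][column] -> count, seeded with values from the slide.
counters = defaultdict(lambda: defaultdict(int))
counters["22:00"].update({"all": 3221, ":00": 22, ":01": 19, ":02": 104})
counters["UK"].update({"all": 228, "user01": 1, "user14": 12, "user99": 7})
counters["UK, 22:00"].update({"all": 1904})
counters[""].update({"all": 87314, "UK": 238, "US": 354})


def ingest(event):
    """Increment every counter that the event contributes to."""
    hour, minute = event["time"].split(":")
    hour_row = f"{hour}:00"
    geo = event["geography"]
    counters[hour_row]["all"] += 1
    counters[hour_row][f":{minute}"] += 1
    counters[geo]["all"] += 1
    counters[geo][event["cust_id"]] += 1
    counters[f"{geo}, {hour_row}"]["all"] += 1
    counters[""]["all"] += 1
    counters[""][geo] += 1


ingest({"cust_id": "user01", "session_id": 102, "geography": "UK",
        "browser": "IE", "time": "22:02"})

# where time 22:00-23:00, count(*): a single lookup
count_22 = counters["22:00"]["all"]
# where time 22:00-23:00, group by minute: read one row
by_minute = {k: v for k, v in counters["22:00"].items() if k != "all"}
# count all
total = counters[""]["all"]
# group all by geo
by_geo = {k: v for k, v in counters[""].items() if k != "all"}

print(count_22, total, by_geo)  # 3222 87315 {'UK': 239, 'US': 354}
```

After the event is ingested the values match the second slide (3222, 105, 229, 2, 1905, 87315, 239), and each query reads one row or one cell rather than scanning events.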

Page 17: Acunu Analytics @ Cassandra London

Analytics

• SUM, COUNT, MIN, MAX, STDDEV, AVG, TOP k, COUNT DISTINCT

• Also: approx top k, approx count distinct

• Also: idempotent update

• RESTful JSON interface, CLI
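The slides do not say which algorithm backs "approx count distinct". As one generic illustration of the idea, the sketch below uses a k-minimum-values (KMV) estimator: keep only the k smallest hash values seen and infer the number of distinct items from how small the k-th one is. The class name `KMVDistinct` and the choice of SHA-1 are assumptions made for this example.

```python
import hashlib
import heapq

HASH_SPACE = 2 ** 64


class KMVDistinct:
    """Approximate distinct count using the k smallest 64-bit hash values."""

    def __init__(self, k=256):
        self.k = k
        self._heap = []        # negated hashes, so _heap[0] is the largest kept
        self._members = set()  # hashes currently kept, for duplicate checks

    @staticmethod
    def _hash(value):
        digest = hashlib.sha1(str(value).encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add(self, value):
        h = self._hash(value)
        if h in self._members:
            return  # re-adding a kept value changes nothing
        if len(self._heap) < self.k:
            heapq.heappush(self._heap, -h)
            self._members.add(h)
        elif h < -self._heap[0]:
            # New hash is smaller than the largest kept one: swap them.
            evicted = -heapq.heappushpop(self._heap, -h)
            self._members.discard(evicted)
            self._members.add(h)

    def estimate(self):
        if len(self._heap) < self.k:
            return len(self._heap)  # fewer than k distinct hashes seen
        kth_smallest = max(-self._heap[0], 1)
        return (self.k - 1) * HASH_SPACE // kth_smallest


sessions = KMVDistinct(k=256)
for i in range(100):
    sessions.add(f"session-{i}")
    sessions.add(f"session-{i}")  # duplicates do not inflate the count
print(sessions.estimate())  # 100
```

Memory stays bounded by k regardless of how many events arrive, at the cost of an estimate whose relative error shrinks roughly as 1/sqrt(k). Because adding the same value twice leaves the state unchanged, this structure also happens to tolerate replayed events.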


Page 18: Acunu Analytics @ Cassandra London

Analytics

• Motivation / alternatives

• What is it?

• How does it work?

• What's it good for?

Page 19: Acunu Analytics @ Cassandra London

Analytics

Manufacturing

Systems Monitoring

Financial Services

Social Media Ad Analytics

Oil + Gas

Page 20: Acunu Analytics @ Cassandra London

Analytics

“Up and running in about 4 hours”

“We found out a competitor was scraping our data”

“We keep discovering use cases we hadn’t thought of”