Page 1: Norikra in action

Norikra in action

Data/Stream Processing Meetup (2013/06/28)

TAGOMORI Satoshi (@tagomoris)

Saturday, June 29, 2013

Page 2: Norikra in action

TAGOMORI Satoshi (@tagomoris)

LINE corp.

Ruby, Perl, Node.js, Hadoop, ...


Page 3: Norikra in action


Page 4: Norikra in action

System Overview (architecture diagram)

Components: Web Servers, Fluentd Cluster, Archive Storage (scribed), Fluentd Watchers, Graph Tools, Notifications (IRC), Hadoop Cluster (HDFS, YARN), webhdfs, Huahin Manager, hiveserver, Shib, ShibUI, Norikra

Processing paths: STREAM, BATCH, SCHEDULED BATCH

Page 5: Norikra in action

Stream query

Custom fluentd plugin: not so casual enough

xQL: declarative language

Stream processing for optional data fields

No more schema management

Connectivity with Fluentd

Page 6: Norikra in action

Stream query: vs stored data query

No more query wait time

Immediate result for time batch

No more storage

No more query execution management

Once a query is registered, it runs forever

Page 8: Norikra in action

Norikra is not only for Fluentd.

Page 9: Norikra in action

Norikra query: vs Fluentd custom plugin

SQL!!!

No more restart for new queries

register queries whenever we want

No more private plugins

No more fat Fluentd configurations


Page 10: Norikra in action

Norikra

Full Esper feature set on JRuby

Simple RPC: msgpack-rpc-over-http

Simple RPC Server: mizuno (jetty + rack)

Simple Client Library: norikra-client

Exactly the same code for CRuby/JRuby

Page 11: Norikra in action

Norikra (architecture diagram)

Norikra Server (on JVM): Norikra Engine, Esper Instance (Query Engine), Type Definition Manager, Output Event Pool

RPC Server: mizuno (Jetty + Rack), Rack RPC Handler

Clients: NorikraClient (JRuby), NorikraClient (CRuby)

Transport: msgpack-rpc-over-http

Page 12: Norikra in action

Norikra Query: target "sales"

goods_id:5 price:49.8 num:1 shop:"LINE"
goods_id:2 price:12.5 num:3 shop:"Cookpad"
goods_id:4 price:36.6 num:10 shop:"Cookpad"

SELECT shop, sum(price*num) AS amount
FROM sales.win:time_batch(10 minutes)
GROUP BY shop

goods_id:5 price:49.8 num:1 shop:"LINE"
goods_id:2 price:12.5 num:3 shop:"Cookpad" affiliate:"BiS"

SELECT affiliate, count(*) AS cnt
FROM sales.win:time_batch(1 hour)
GROUP BY affiliate
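To make the first query on this slide concrete, here is a minimal Python sketch of what a time-batch aggregation computes. It is not Norikra or Esper code. The `time` field (in seconds) and the function name `time_batch_sum` are made up for illustration, and windows are simply aligned to multiples of the window size, which simplifies how a real engine opens its batches.

```python
from collections import defaultdict


def time_batch_sum(events, window_seconds):
    """Group events into fixed, non-overlapping windows and,
    within each window, sum price * num per shop."""
    batches = defaultdict(lambda: defaultdict(float))
    for ev in events:
        # Index of the window this event's timestamp falls into.
        window = int(ev["time"] // window_seconds)
        batches[window][ev["shop"]] += ev["price"] * ev["num"]
    # One output row per (window, shop), ordered by window, then shop.
    return [
        {"window": w, "shop": shop, "amount": amount}
        for w in sorted(batches)
        for shop, amount in sorted(batches[w].items())
    ]


# The three events from the slide; "time" is a hypothetical timestamp field.
sales = [
    {"time": 10, "goods_id": 5, "price": 49.8, "num": 1, "shop": "LINE"},
    {"time": 200, "goods_id": 2, "price": 12.5, "num": 3, "shop": "Cookpad"},
    {"time": 590, "goods_id": 4, "price": 36.6, "num": 10, "shop": "Cookpad"},
]

# A 10-minute batch, as in time_batch(10 minutes).
rows = time_batch_sum(sales, window_seconds=600)
for row in rows:
    print(row)
```

All three events fall into the same 10-minute window, so the sketch emits one row per shop, which is the shape of result the GROUP BY shop query describes.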

Page 13: Norikra in action

Esper and Norikra

Esper: queries for streams
stream: a set of field-type pairs of events
users need to know the field set variations (or manage 'map subtypes' on their own)

Norikra: queries for targets
target: virtual name for the union of field set variations
users don't need to know the details of a target

Page 14: Norikra in action

Automated stream inheritance of Norikra's target

Legend: Base typedef, Query typedef, Data typedef

b_xxxxxxxxx

Minimal fieldset definition:
name: 'string'
id: 'long'
valid: 'boolean'
action_type: 'string'

Page 15: Norikra in action

Automated stream inheritance of Norikra's target

Legend: Base typedef, Query typedef, Data typedef

b_xxxxxxxxx

Event data fieldset definition:
name: 'string'
id: 'long'
valid: 'boolean'
action_type: 'string'
product_code: 'string'
charge: 'integer'
shop_code: 'long'

e_xxxxxxxx1

Page 16: Norikra in action

Automated stream inheritance of Norikra's target

Legend: Base typedef, Query typedef, Data typedef

b_xxxxxxxxx

e_xxxxxxxx1 e_xxxxxxxx2

Event data fieldset definition:
name: 'string'
id: 'long'
valid: 'boolean'
action_type: 'string'
product_code: 'string'
charge: 'integer'
shop_code: 'long'
affiliate: 'string'

Page 17: Norikra in action

Automated stream inheritance of Norikra's target

Legend: Base typedef, Query typedef, Data typedef

b_xxxxxxxxx

e_xxxxxxxx1 e_xxxxxxxx2

New query:
SELECT count(*)
FROM target.win:time_batch(1min)
WHERE affiliate.length() > 0

Page 18: Norikra in action

Automated stream inheritance of Norikra's target

Legend: Base typedef, Query typedef, Data typedef

b_xxxxxxxxx

e_xxxxxxxx1 e_xxxxxxxx2'

Event data fieldset definition:
name: 'string'
id: 'long'
valid: 'boolean'
action_type: 'string'
affiliate: 'string'

q_xxxxxxxx0

New query:
SELECT count(*)
FROM target.win:time_batch(1min)
WHERE affiliate.length() > 0

Page 19: Norikra in action

Automated stream inheritance of Norikra's target

Legend: Base typedef, Query typedef, Data typedef

b_xxxxxxxxx

e_xxxxxxxx1 e_xxxxxxxx2'

q_xxxxxxxx0

Registered EPL:
SELECT count(*)
FROM q_xxxxxxxx0.win:time_batch(1min)
WHERE affiliate.length() > 0

Page 20: Norikra in action

Automated stream inheritance of Norikra's target

Legend: Base typedef, Query typedef, Data typedef

b_xxxxxxxxx

e_xxxxxxxx1' e_xxxxxxxx2'

q_xxxxxxxx0

e_xxxxxxxx3'

q_xxxxxxxx1
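The inheritance idea on the preceding slides can be modeled roughly as follows. This Python sketch is only an illustration under assumptions, not Norikra's implementation: it treats a data typedef as a child of a query typedef when the event's fieldset contains every field the query needs, with the same type. The short names `e1`, `e2`, `q0` and the helpers `is_child_of` and `route` are hypothetical stand-ins for the generated names such as e_xxxxxxxx1 and q_xxxxxxxx0.

```python
# Base fieldset shared by every event of the target (from the slides).
BASE_FIELDS = {
    "name": "string",
    "id": "long",
    "valid": "boolean",
    "action_type": "string",
}


def is_child_of(event_fields, query_fields):
    """True if the event fieldset has every field the query needs,
    with a matching type (assumed matching rule)."""
    return all(event_fields.get(k) == t for k, t in query_fields.items())


def route(event_typedefs, query_typedefs):
    """For each query typedef, list the data typedefs that can feed it."""
    return {
        q: sorted(e for e, f in event_typedefs.items() if is_child_of(f, qf))
        for q, qf in query_typedefs.items()
    }


# Two event fieldset variations seen on the same target.
e1 = {**BASE_FIELDS, "product_code": "string", "charge": "integer", "shop_code": "long"}
e2 = {**e1, "affiliate": "string"}

# Fieldset of the query that filters on affiliate.
q0 = {**BASE_FIELDS, "affiliate": "string"}

routing = route({"e1": e1, "e2": e2}, {"q0": q0})
print(routing)
```

In this model only `e2` carries `affiliate`, so only `e2` is attached under `q0`, while both variations still satisfy the base fieldset.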

Page 21: Norikra in action

Output data pooling

Output event data: pushed

Event pushing brings many problems

Pooling + fetch

Typical use case: aggregation

-> not so many outputs
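The "pooling + fetch" approach can be sketched in a few lines of Python. This is an assumed shape for illustration, not Norikra's actual output pool: the class name `OutputPool`, its methods, and the query name `sales_by_shop` are all hypothetical. The engine side appends output events under a query name, and a client later fetches them, which drains the pool for that query.

```python
from collections import defaultdict


class OutputPool:
    """Holds output events per query until a client fetches them."""

    def __init__(self):
        self._events = defaultdict(list)

    def push(self, query_name, event):
        # Called when a query produces an output event.
        self._events[query_name].append(event)

    def fetch(self, query_name):
        # Return all pooled events for the query and clear them.
        return self._events.pop(query_name, [])


pool = OutputPool()
pool.push("sales_by_shop", {"shop": "LINE", "amount": 49.8})
pool.push("sales_by_shop", {"shop": "Cookpad", "amount": 403.5})

first = pool.fetch("sales_by_shop")   # both pooled events
second = pool.fetch("sales_by_shop")  # nothing left after the first fetch
print(len(first), len(second))
```

Because aggregation queries emit few events per window, holding them in memory until the next fetch stays cheap, which fits the use case named on the slide.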

Page 22: Norikra in action

fluent-plugin-norikra

Fluentd plugin to use Norikra

Norikra server autostart

Automatically defined target

Pre-defined queries for each target

Page 23: Norikra in action

fluent-plugin-norikra

installation

`gem install fluent-plugin-norikra`

configuration

see DEMO


Page 24: Norikra in action

Demo: bootstrap

rbenv shell jruby-1.7.4
gem install norikra
which norikra
rbenv shell 2.0.0-pxxx
gem install fluent-plugin-norikra
vi demo.conf
fluentd -c demo.conf

Page 25: Norikra in action

Demo: query streams

some messages over fluent-cat

register queries with norikra-client

more messages over fluent-cat & norikra-client


Page 26: Norikra in action

Roadmap of Norikra

Norikra is still UNDER DEVELOPMENT

Norikra feature updates (JOINs, etc)

Web GUI

Query & target list management

Save & restore metadata

Distributed & orchestrated nodes

Page 27: Norikra in action

See also:

http://fluentd.org/
http://fluentd.org/plugin/
https://github.com/tagomoris/norikra
https://github.com/tagomoris/norikra-client
https://github.com/tagomoris/fluent-plugin-norikra
http://esper.codehaus.org/

"Fluentd: The ruby based middleware across the world"
http://www.slideshare.net/tagomoris/fluentd-in-tkrk10

"Log analysis system with Hadoop in livedoor 2013 Winter"
http://www.slideshare.net/tagomoris/log-analysis-with-hadoop-in-livedoor-2013