Log everything in JSON. Sadayuki Furuhashi, Treasure Data, Inc.

Fluentd meetup #2

Page 1: Fluentd meetup #2

Log everything in JSON.

Sadayuki Furuhashi
Treasure Data, Inc.

Page 2: Fluentd meetup #2

Self-introduction

> Sadayuki Furuhashi
  twitter: @frsyuki

> Original author of Fluentd

> Treasure Data, Inc.
  Software Architect; Founder

> open-source
  MessagePack - efficient serialization format

Page 3: Fluentd meetup #2

0. Why logging?

1. Why Fluentd? - Design of Fluentd

> Extensibility

> Unified log format

> Simplicity

2. Who uses Fluentd?

3. Future of Fluentd

Page 5: Fluentd meetup #2

0. Why logging?

> Error notifications
> Performance monitoring
> User segment analysis
> Funnel analysis
> Heatmap analysis
> Market prediction

etc...

Page 6: Fluentd meetup #2

0. Why logging? - Error notifications

Error!

Page 7: Fluentd meetup #2

0. Why logging? - Performance monitor

Page 8: Fluentd meetup #2

0. Why logging? - User segment analysis

Page 9: Fluentd meetup #2

0. Why logging? - Funnel analysis

-27%! -28%!

Page 10: Fluentd meetup #2

0. Why logging? - Heatmap analysis

Page 11: Fluentd meetup #2

0. Why logging? - Market prediction

Page 12: Fluentd meetup #2

0. Why logging?

1. Why Fluentd? - Design of Fluentd

> Extensibility

> Unified log format

> Simplicity

2. Who uses Fluentd?

3. Future of Fluentd

Page 14: Fluentd meetup #2

[Figure: log utilization (Alerting, Analysis, Archiving); systems shown: Nagios, MongoDB, Hadoop, Amazon S3, MySQL]

Page 15: Fluentd meetup #2

[Figure: log sources (Access logs, App logs, System logs, Databases; systems shown: Apache, Frontend, Backend, syslogd) and log utilization (Alerting, Analysis, Archiving; systems shown: Nagios, MongoDB, Hadoop, Amazon S3, MySQL)]

Page 16: Fluentd meetup #2

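[Figure: log sources (Access logs, App logs, System logs, Databases; systems shown: Apache, Frontend, Backend, syslogd) shown together with the systems that use the logs (Alerting, Analysis, Archiving; systems shown: Nagios, MongoDB, Hadoop, Amazon S3, MySQL)]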

Page 17: Fluentd meetup #2

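[Figure: perl scripts, bash scripts, and rsync servers added to the diagram of log sources (Access logs, App logs, System logs, Databases; systems shown: Apache, Frontend, Backend, syslogd) and the systems that use the logs (Alerting, Analysis, Archiving; systems shown: Nagios, MongoDB, Hadoop, Amazon S3, MySQL)]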

Page 18: Fluentd meetup #2

Problems...

No unified method to collect logs

> Too many bash/perl scripts
  Fragile for changes
  Less reliable

> Mixed log formats
  Old-fashioned "Human-readable" text logs
  Not ready to analyze

> High latency
  must wait a day for log rotation

Page 19: Fluentd meetup #2

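[Figure: log sources (Access logs, App logs, System logs, Databases; systems shown: Apache, Frontend, Backend, syslogd) shown together with the systems that use the logs (Alerting, Analysis, Archiving; systems shown: Nagios, MongoDB, Hadoop, Amazon S3, MySQL)]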

Page 20: Fluentd meetup #2

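[Figure: log sources (Access logs, App logs, System logs, Databases; systems shown: Apache, Frontend, Backend, syslogd) and the systems that use the logs (Alerting, Analysis, Archiving; systems shown: Nagios, MongoDB, Hadoop, Amazon S3, MySQL), with the label "filter / buffer / routing"]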

Page 21: Fluentd meetup #2

Input Plugins Output Plugins

Buffer Plugins Filter Plugins

Page 22: Fluentd meetup #2

Input Plugins Output Plugins

2012-02-04 01:33:51
myapp.buylog
{
  "user": "me",
  "path": "/buyItem",
  "price": 150,
  "referer": "/landing"
}

JSON format

Page 23: Fluentd meetup #2

Input Plugins Output Plugins

time: 2012-02-04 01:33:51
tag: myapp.buylog
record:
{
  "user": "me",
  "path": "/buyItem",
  "price": 150,
  "referer": "/landing"
}

JSON format
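
A minimal sketch of the event shape on this slide, written in Python rather than Fluentd's own Ruby. It models an event as a time / tag / record triple and serializes the record as JSON. The `Event` class, the `to_json_line` helper, and the tab-separated layout are assumptions made for illustration; they are not Fluentd APIs.

```python
import json
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Event:
    """Hypothetical container mirroring the slide's time / tag / record triple."""
    time: datetime
    tag: str
    record: dict


def to_json_line(event: Event) -> str:
    """Render the event as 'time<TAB>tag<TAB>record-as-JSON' (an assumed layout)."""
    stamp = event.time.strftime("%Y-%m-%d %H:%M:%S")
    return f"{stamp}\t{event.tag}\t{json.dumps(event.record)}"


event = Event(
    time=datetime(2012, 2, 4, 1, 33, 51),
    tag="myapp.buylog",
    record={"user": "me", "path": "/buyItem", "price": 150, "referer": "/landing"},
)
line = to_json_line(event)
print(line)
```

Because the record is plain JSON, a consumer can parse it back with `json.loads` and read fields such as `price` directly, without a format-specific parser.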

Page 24: Fluentd meetup #2

Why Fluentd?

> Extensibility - Plugin architecture
  collect logs from various systems
  forward logs to various systems

> Unified log format - JSON format
  modern "Machine-readable" log format
  immediately ready to analyze

> Reliable - HA configuration

> Easy to install - RPM/deb packages
  deploy instantly to everywhere

Page 26: Fluentd meetup #2

Comparison with other log collectors:

> Scribe
  Less extensible
  No unified log format
  No longer developed?

> Flume
  Less simple
  No unified log format
  Little information about Flume-NG

Page 27: Fluentd meetup #2

0. Why logging?

1. Why Fluentd? - Design of Fluentd

> Extensibility

> Unified log format

> Simplicity

2. Who uses Fluentd?

3. Future of Fluentd

Page 29: Fluentd meetup #2

NHN Japan COOKPAD NAVER

Crocos

http://www.quora.com/Who-uses-Fluentd-in-production

Page 30: Fluentd meetup #2

0. Why logging?

1. Why Fluentd? - Design of Fluentd

> Extensibility

> Unified log format

> Simplicity

2. Who uses Fluentd?

3. Future of Fluentd

Page 32: Fluentd meetup #2

Future of Fluentd

> <filter>
> <match> in <source>
> <label>
> MessagePack for Ruby v5
> td-agent-lite
> Pub/Sub & Monitoring API
> New process model & Live restart
> Backward compatibility

Page 33: Fluentd meetup #2

<source>
  type tail
  path /var/log/httpd.log
  format apache
  tag not_filtered.apache
</source>

<match not_filtered.**>
  type rewrite
  remove_prefix not_filtered
  <rule>
    key status
    pattern ^500$
    ignore true
  </rule>
</match>

<match **>
  type forward
  host log.server
</match>

Before

Mysterious tag

tag operations
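
The tag juggling on this slide can be sketched in Python. This is an illustration, not Fluentd's implementation: it assumes `*` matches exactly one dot-separated tag part and `**` matches zero or more parts, and that `remove_prefix` strips a leading `prefix.` from the tag. The function names `tag_matches` and `remove_prefix` are hypothetical.

```python
def tag_matches(pattern: str, tag: str) -> bool:
    """Return True if a dot-separated tag matches a pattern.

    Assumed semantics: '*' matches exactly one tag part,
    '**' matches zero or more tag parts.
    """
    def match(p, t):
        if not p:
            return not t
        head, rest = p[0], p[1:]
        if head == "**":
            # Let '**' consume 0..len(t) parts and try the rest each time.
            return any(match(rest, t[i:]) for i in range(len(t) + 1))
        if not t:
            return False
        if head == "*" or head == t[0]:
            return match(rest, t[1:])
        return False

    return match(pattern.split("."), tag.split("."))


def remove_prefix(tag: str, prefix: str) -> str:
    """Strip 'prefix.' from the front of a tag; leave other tags unchanged."""
    head = prefix + "."
    return tag[len(head):] if tag.startswith(head) else tag


tag = "not_filtered.apache"
if tag_matches("not_filtered.**", tag):
    tag = remove_prefix(tag, "not_filtered")
print(tag)
```

The sketch shows why the "mysterious tag" is needed in this setup: the temporary `not_filtered` prefix exists only to steer events into the rewrite step before the catch-all `**` match sees them.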

Page 34: Fluentd meetup #2

<source>
  type tail
  path /var/log/httpd.log
  format apache
  tag apache
</source>

<filter **>
  type rewrite
  <rule>
    key status
    pattern ^500$
    ignore true
  </rule>
</filter>

<match **>
  type forward
  host log.server
</match>

After (v11)

Filter plugins!
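
A rough Python sketch of what such a filter step does to the stream. It assumes that `ignore true` means "drop records whose `status` field matches the pattern"; that reading is an assumption based on the slide, and `make_ignore_rule` is a hypothetical helper, not part of any Fluentd plugin.

```python
import re


def make_ignore_rule(key: str, pattern: str):
    """Build a predicate that keeps a record unless record[key] matches pattern."""
    regex = re.compile(pattern)

    def keep(record: dict) -> bool:
        value = record.get(key)
        # Records without the key pass through untouched.
        return value is None or regex.search(str(value)) is None

    return keep


keep = make_ignore_rule("status", r"^500$")
records = [
    {"path": "/", "status": "200"},
    {"path": "/buyItem", "status": "500"},
    {"path": "/landing", "status": "404"},
]
kept = [r for r in records if keep(r)]
print(kept)
```

Note that the filter works on records only; the tag stays `apache` throughout, which is the point of the "After" version.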

Page 35: Fluentd meetup #2

<source>
  type tail
  path /var/log/httpd.log
  format apache
  tag apache

  <filter **>
    type rewrite
    <rule>
      key status
      pattern ^500$
      ignore true
    </rule>
  </filter>
</source>

<match **>
  type forward
  host log.server
</match>

After (v11)

<filter>/<match> in <source>

Page 36: Fluentd meetup #2

<source>
  type tail
  path /var/log/httpd.log
  tag apache
</source>

<match **>
  type forward
  host log.server
</match>

Before

I want to add flowcounter here...

Page 37: Fluentd meetup #2

<source>
  type tail
  path /var/log/httpd.log
  tag apache
</source>

<match flow.traffic>
  type forward
  host traffic.server
</match>

<match **>
  type copy
  <store>
    type flowcounter
    tag flow.traffic
  </store>

  <store>
    type forward
    host log.server
  </store>
</match>

Before

Nested!
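
The copy-to-several-stores idea can be sketched in Python as a simple fan-out. This is an illustration under stated assumptions: `make_copy` and `FlowCounter` are hypothetical names, and the counter only tallies events, which is a simplification of what a real flow counter plugin reports.

```python
from typing import Callable, List, Tuple

# A store is any callable that accepts (tag, record).
Store = Callable[[str, dict], None]


class FlowCounter:
    """Hypothetical stand-in for a flow counter: it only counts events."""

    def __init__(self) -> None:
        self.count = 0

    def __call__(self, tag: str, record: dict) -> None:
        self.count += 1


def make_copy(stores: List[Store]) -> Store:
    """Return an emitter that hands every event to each store in turn."""
    def emit(tag: str, record: dict) -> None:
        for store in stores:
            # Each store gets its own copy so one cannot mutate another's data.
            store(tag, dict(record))
    return emit


forwarded: List[Tuple[str, dict]] = []
counter = FlowCounter()
copy = make_copy([counter, lambda tag, record: forwarded.append((tag, record))])

for status in ("200", "404"):
    copy("apache", {"status": status})

print(counter.count, len(forwarded))
```

Every event reaches both the counter and the forwarder, which is the behavior the two `<store>` sections describe.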

Page 38: Fluentd meetup #2

<source>
  type tail
  path /var/log/httpd.log
  tag apache
</source>

<filter **>
  type copy
  <match>
    type flowcounter
    tag flow.traffic
    <match>
      type forward
      host traffic.server
    </match>
  </match>
</filter>

<match **>
  type forward
  host log.server
</match>

After (v11)

Filtering pipeline

Page 39: Fluentd meetup #2

<source>
  type forward
</source>

<filter **>
  type copy
  <match>
    type file
    path /mnt/local_archive
  </match>
</filter>

<label alert>
  <match **>
    ...
  </match>
</label>

<label analysis>
  ...
</label>

# copy & label & forward
<filter **>
  type copy
  <match>
    type forward
    label alert
    host alerting.server
  </match>
</filter>

# copy & label & forward
<filter **>
  type copy
  <match>
    type forward
    label analysis
    host analysis.server
  </match>
</filter>

After (v11)

Page 40: Fluentd meetup #2

MessagePack for Ruby v5

[Chart: Serialize and Deserialize throughput in tweets/sec (axis from 0 to 40000) for msgpack v5, msgpack v4, yajl, and json]

Page 41: Fluentd meetup #2

td-agent-lite

> in_tail + out_forward in "single" binary
  statically linked ruby binary + scripts tied with the binary

Page 42: Fluentd meetup #2

New process model & Live restart

Old multiprocess model

[Figure: Supervisor, Engine, fork(), and two detached processes]

all data pass through the central process

Page 43: Fluentd meetup #2

New process model & Live restart

New multiprocess model

[Figure: Supervisor, Engine, ProcessManager, and two detached processes]

direct communication

Page 44: Fluentd meetup #2

New process model & Live restart

New multiprocess model

[Figure: Supervisor, Engine, ProcessManager, and two detached processes, plus a second Engine and ProcessManager pair]

Live restart

Page 45: Fluentd meetup #2

Backward compatibility

Fluentd v11 includes 2 namespaces:
> Fluentd:: new code base
> Fluent:: old code base + wrapper classes

Check out the repository for details:
> http://github.com/frsyuki/fluentd-v11

Page 46: Fluentd meetup #2

Conclusion

Fluentd makes logging better
> Plugin architecture
> JSON format
> HA configuration
> RPM/deb package

Fluentd is under active development

Fluentd is supported by many committers

Page 50: Fluentd meetup #2

Tools used for log collection/analysis

Page 51: Fluentd meetup #2

Where logs are stored

Page 52: Fluentd meetup #2

Barriers to adopting Fluentd