Masahiro Nakagawa
Feb 21, 2015
RubyKansai #65
Fluentd: Unified logging layer
Who are you?
> Masahiro Nakagawa
> github / twitter: @repeatedly
> Treasure Data, Inc.: Senior Software Engineer, Fluentd / td-agent developer
> Living at OSS :)
> D language: Phobos committer
> Fluentd: main maintainer
> MessagePack / RPC: D and Python (only RPC)
> The organizer of several meetups (Presto, DTM, etc.)
> etc.
Structured logging !
Reliable forwarding !
Pluggable architecture
http://fluentd.org/
What’s Fluentd?
> Data collector for unified logging layer > Streaming data transfer based on JSON > Written in Ruby
> Various gem-based plugins > http://www.fluentd.org/plugins
> Working in production > http://www.fluentd.org/testimonials
Background
Data Analytics Flow
Collect Store Process Visualize
Data source
Reporting
Monitoring
Data Analytics Flow
Store Process
Cloudera
Hortonworks
Treasure Data
Collect Visualize
Tableau
Excel
R
easier & shorter time
???
TD Service Architecture
Time to Value
Send query result Result Push
Acquire Analyze Store
Plazma DB Flexible, Scalable, Columnar Storage
Web Log
App Log
Sensor
CRM
ERP
RDBMS
Treasure Agent (Server) / SDK (JS, Android, iOS, Unity)
Streaming Collector
Batch / Reliability
Ad-hoc /Low latency
KPI
KPI Dashboard
BI Tools
Other Products
RDBMS, Google Docs, AWS S3, FTP Server, etc.
Metric Insights
Tableau, Motion Board, etc.
POS
REST API / ODBC / JDBC (SQL, Pig)
Bulk Uploader
Embulk,TD Toolbelt
SQL-based query
@AWS or @IDCF
Connectivity
Economy & Flexibility Simple & Supported
Dive into…
Divide & Conquer & Retry
error retry
error retry retry
retry
Batch
Stream
Other stream
Application
・・・
Server2
Application
・・・
Server3
Application
・・・
Server1
FluentLog Server
High latency! Must wait for a day...
Before…
Application
・・・
Server2
Application
・・・
Server3
Application
・・・
Server1
Fluentd Fluentd Fluentd
Fluentd Fluentd
In streaming!
After…
Core Plugins
> Divide & Conquer
> Buffering & Retrying
> Error handling
> Message routing
> Parallelism
> Input: read / receive data from API, database, command, etc…
> Output: write / send data to API, database, alert, graph, etc…
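The input/output split above can be sketched in plain Ruby. This is a hypothetical shape, not Fluentd's real API: actual v0.12 plugins inherit `Fluent::Output` and register via `Fluent::Plugin.register_output`; here just enough is faked to run standalone.

```ruby
require 'json'

# Hypothetical sketch of an output plugin's job: take a tag plus an
# event stream of [time, record] pairs and write each event somewhere
# (API, database, file, ...). Not Fluentd's actual class hierarchy.
class StdoutSketchOutput
  def emit(tag, es)
    es.map { |time, record| "#{tag} #{time} #{record.to_json}" }
  end
end

out = StdoutSketchOutput.new
puts out.emit("apache.access", [[1423958400, { "method" => "GET" }]])
```

An input plugin is the mirror image: it produces the (tag, time, record) events that outputs like this one consume.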
Apache to Mongo
tail
insert
event buffering
127.0.0.1 - - [11/Dec/2012:07:26:27] "GET / ...
127.0.0.1 - - [11/Dec/2012:07:26:30] "GET / ...
127.0.0.1 - - [11/Dec/2012:07:26:32] "GET / ...
127.0.0.1 - - [11/Dec/2012:07:26:40] "GET / ...
127.0.0.1 - - [11/Dec/2012:07:27:01] "GET / ...
...
Fluentd
Web Server
2012-02-04 01:33:51 apache.log
{ "host": "127.0.0.1", "method": "GET", ... }
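The tail-to-record step above can be sketched with a few lines of Ruby. The regexp here is deliberately simplified and `parse_apache` is a hypothetical helper; Fluentd's real `apache` format captures more fields (user, code, size, referer, agent).

```ruby
# Simplified sketch of what in_tail's Apache parser does to one
# access-log line. Illustrative only; not Fluentd's real regexp.
def parse_apache(line)
  m = line.match(/^(?<host>\S+) \S+ \S+ \[(?<time>[^\]]+)\] "(?<method>\S+) (?<path>\S+)/)
  { "host" => m[:host], "method" => m[:method], "path" => m[:path] }
end

p parse_apache('127.0.0.1 - - [11/Dec/2012:07:26:27] "GET / HTTP/1.1" 200 777')
# {"host"=>"127.0.0.1", "method"=>"GET", "path"=>"/"}
```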
Event structure (log message)
✓ Tag
> where the event is from
> used for message routing
✓ Time
> second unit by default
> from the data source
✓ Record
> JSON format
> MessagePack internally
> schema-free
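The (tag, time, record) triple can be illustrated in a few lines. `format_event` is a hypothetical helper for display; internally Fluentd serializes records with MessagePack rather than JSON.

```ruby
require 'json'

# Illustrative sketch: a Fluentd event is a (tag, time, record) triple.
def format_event(tag, time, record)
  "#{tag} #{time} #{JSON.generate(record)}"
end

puts format_event("apache.access",            # tag: routing key
                  1423958400,                 # time: second-unit timestamp
                  { "host" => "127.0.0.1" })  # record: schema-free JSON
```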
Architecture (v0.12 or later)
Engine: Input → Filter → Output (+ Buffer)
> Input: Forward, File tail, ...
> Filter: grep, record_transformer, ...
> Output: Forward, File, ...
> Buffer: File, Memory
> Parser / Formatter
(the Engine itself is not pluggable)
Configuration and operation
> No central / master node > include helps configuration sharing
> Operation depends on your environment > Use your daemon management tool > We use Chef at Treasure Data
> Apache-like syntax and Ruby DSL
# receive events via HTTP
<source>
  type http
  port 8888
</source>

# read logs from a file
<source>
  type tail
  path /var/log/httpd.log
  format apache
  tag apache.access
</source>

# save access logs to MongoDB
<match apache.access>
  type mongo
  database apache
  collection log
</match>
# save alerts to a file
<match alert.**>
  type file
  path /var/log/fluent/alerts
</match>

# forward other logs to servers
<match **>
  type forward
  <server>
    host 192.168.0.11
    weight 20
  </server>
  <server>
    host 192.168.0.12
    weight 60
  </server>
</match>

include http://example.com/conf
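The `<match>` sections route events by tag patterns such as `alert.**` and `**`. A minimal sketch of how such pattern matching could work (`tag_match?` is a hypothetical helper; Fluentd's real matcher also supports `{a,b}` alternatives and matches `a.**` against `a` itself):

```ruby
# Hedged sketch of tag matching: "**" spans any tag parts,
# "*" matches a single dot-separated part.
def tag_match?(pattern, tag)
  regex = Regexp.new('\A' + Regexp.escape(pattern)
                             .gsub('\*\*', '.*')
                             .gsub('\*', '[^.]+') + '\z')
  !!(tag =~ regex)
end

p tag_match?("apache.access", "apache.access")  # true
p tag_match?("alert.**", "alert.disk.full")     # true
p tag_match?("**", "anything.at.all")           # true
p tag_match?("apache.access", "apache.error")   # false
```

In the config above, an event tagged `apache.access` would be taken by the mongo match, while `alert.disk.full` would go to the file output.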
Plugins - use rubygems
$ fluent-gem search -rd fluent-plugin

$ fluent-gem search -rd fluent-mixin

$ fluent-gem install fluent-plugin-mongo
in_tail
✓ read a log file
✓ custom regexp
✓ custom parser in Ruby
Apache → Fluentd
access.log
Supported formats:
> apache > apache_error > apache2 > nginx > syslog
> json > csv > tsv > ltsv > none
out_webhdfs
Fluentd
buffer
✓ retry automatically
✓ exponential retry wait
✓ persistent on a file
✓ slice files based on time
2013-01-01/01/access.log.gz
2013-01-01/02/access.log.gz
2013-01-01/03/access.log.gz
...
HDFS
✓ custom text formatter
Apache
access.log
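The exponential retry wait can be sketched as follows. The doubling base, the cap, and the `retry_wait` helper are all illustrative assumptions; real Fluentd also randomizes the wait and makes the limits configurable.

```ruby
# Hedged sketch of exponential retry wait: the delay roughly doubles
# after each failed buffer flush, up to a cap.
def retry_wait(base, attempt, max: 60.0)
  [base * (2 ** attempt), max].min
end

p (0..6).map { |n| retry_wait(1.0, n) }
# [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 60.0]
```

Because chunks stay buffered on disk between attempts, a temporarily unreachable HDFS does not lose events.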
out_copy
✓ routing based on tags
✓ copy to multiple storages
Amazon S3
Fluentd
buffer
Apache
access.log
out_forward
apache
✓ automatic fail-over
✓ load balancing
Apache → Fluentd
access.log / buffer
✓ retry automatically
✓ exponential retry wait
✓ persistent on a file
Fluentd
Fluentd
Fluentd
Before
After
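out_forward's load balancing can be sketched with the weights from the earlier sample config (20 and 60, a 1:3 ratio). `pick` is a hypothetical helper; the real plugin also tracks node health via heartbeats for fail-over.

```ruby
# Hedged sketch of weighted server selection for out_forward.
SERVERS = [{ host: "192.168.0.11", weight: 20 },
           { host: "192.168.0.12", weight: 60 }].freeze

# Pick the server whose weight band contains r (0 <= r < total weight);
# pass rand(total_weight) for r in real use.
def pick(r)
  SERVERS.find { |s| (r -= s[:weight]) < 0 }
end

p pick(10)[:host]  # "192.168.0.11"
p pick(30)[:host]  # "192.168.0.12"
```

Over many events, roughly a quarter of the traffic lands on the first server and three quarters on the second.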
or Embulk
Nagios
MongoDB
Hadoop
Alerting
Amazon S3
Analysis
Archiving
MySQL
Apache
Frontend
Access logs
syslogd
App logs
System logs
Backend
Databases
buffering / processing / routing
M x N → M + N
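The "M x N → M + N" claim is simple arithmetic, illustrated here with assumed counts of 10 sources and 5 destinations:

```ruby
# With M log sources and N destinations, point-to-point wiring needs
# M * N connections; routing through a unified logging layer needs
# only M + N (each node talks just to the layer).
M, N = 10, 5
p M * N  # 50 point-to-point connections
p M + N  # 15 connections via the unified layer
```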
Use-cases
Treasure Data
Frontend / Job Queue
Worker / Hadoop
Presto
Fluentd
Applications push metrics to Fluentd (via local Fluentd)
Librato Metrics for realtime analysis
Treasure Data
for historical analysis
Fluentd sums up data every few minutes (partial aggregation)
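Partial aggregation can be sketched like this: counts are summed per minute locally, so the backend receives one record per minute instead of one per event. The data and helper names are illustrative, not from any specific plugin.

```ruby
# Hedged sketch of per-minute partial aggregation.
EVENTS = [
  { "time" => 60,  "count" => 1 },
  { "time" => 70,  "count" => 1 },
  { "time" => 130, "count" => 1 },
].freeze

# Bucket events by minute, then sum the counts in each bucket.
PER_MINUTE = EVENTS.group_by { |e| e["time"] / 60 }
                   .map { |min, es| [min * 60, es.sum { |e| e["count"] }] }
                   .to_h

p PER_MINUTE  # {60=>2, 120=>1}
```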
hundreds of app servers
sends event logs
sends event logs
sends event logs
Rails app td-agent
td-agent
td-agent
Google Spreadsheet
Treasure Data
MySQL
Logs are available
after several mins.
Daily/Hourly
Batch
KPI
visualization
Feedback rankings
Rails app
Rails app
✓ Unlimited scalability
✓ Flexible schema
✓ Realtime
✓ Less performance impact
Cookpad
✓ Over 100 RoR servers (2012/2/4)
Slideshare
http://engineering.slideshare.net/2014/04/skynet-project-monitor-scale-and-auto-heal-a-system-in-the-cloud/
Log Analysis System And its designs in LINE Corp. 2014 early
Roadmap
v0.10 (old stable)
> Mainly for log forwarding > with good performance > working in production
> most users use td-agent > Various plugins
> http://www.fluentd.org/plugins
v0.12 (current stable)
> Event handling improvement
> Filter > Label > Error Stream
> At-least-once semantics in forwarding > require_ack_response parameter > http://ogibayashi.github.io/blog/2014/12/16/try-fluentd-v0-dot-12-at-least-once/
> Apply filtering routine to event stream > No more tag tricks!
Filter
v0.10:

<match access.**>
  type record_reformer
  tag reformed.${tag}
</match>

<match reformed.**>
  type growthforecast
</match>

v0.12:

<filter access.**>
  type record_transformer
  ...
</filter>

<match access.**>
  type growthforecast
</match>
> Internal event routing > Redirect events to another group
> much easier to group and share plugins
Label
v0.10:

<source>
  type forward
</source>

<match app1.**>
  type record_reformer
</match>

...

v0.12:

<source>
  type forward
  @label @APP1
</source>

<label @APP1>
  <match access.**>
    type s3
  </match>
</label>
Error stream with Label
> Can handle errors at the record level > It is still a prototype
[Diagram: Input emits chunk1 ({"event":1, ...} through {"event":3, ...}) and chunk2 ({"event":4, ...} through {"event":6, ...}) to Output; one record fails (ERROR) while the rest succeed (OK), and the failing record is redirected to the error stream]
<label @ERROR>
  <match **>
    type file
    ...
  </match>
</label>
Error stream
The built-in @ERROR label is used when an error occurs in "emit"
v0.14 (next stable)
> New plugin APIs > Actor > New base classes (#309)
> ServerEngine based core engine > Robust supervisor
> Sub-second time support (#461) > Zero downtime restart
Actor
> Easy to write popular routines
> Hide implementation details
v0.10:

class TimerWatcher < Coolio::TimerWatcher
  ...
end

def start
  @loop = Coolio::Loop.new
  @timer = ...
  @loop.attach(@timer)
  @thread = ...
end

v0.14:

def configure(conf)
  actor.every(@interval) {
    router.emit(...)
  }
end

def start
  actor.start
end
Zero downtime restart
> Socket manager shares resources with workers
1. The supervisor listens on the TCP socket
2. The supervisor passes the socket to the worker (with heartbeats between them)
3. The same steps happen when the worker restarts, keeping the TCP socket open
TODO: How to implement on JRuby?
v1 (future stable)
> Fix new features / APIs > Plugin APIs > Default configurations
> Clear versioning and stability > No breaking API changes!
> Compatibility may be broken by Fluentd v2?
Roadmap summary
> v0.10 (old stable)
> v0.12 (current stable) > Filter / Label / At-least-once
> v0.14 (spring, 2015) > New plugin APIs, ServerEngine, Time…
> v1 (early summer, 2015) > Fix new features / APIs
https://github.com/fluent/fluentd/wiki/V1-Roadmap
Other TODO
> Windows support > Need feedback! > https://github.com/fluent/fluentd/tree/windows
> Also check: http://qiita.com/okahashi117
> JRuby support > msgpack / cool.io now work on JRuby > https://github.com/fluent/fluentd/issues/317
Ecosystem
Treasure Agent (td-agent)
> Treasure Data distribution of Fluentd > Treasure Agent 2 is current stable
> Updated core components > We recommend using v2, not v1
> The next version, 2.2.0, uses Fluentd v0.12 > Coming this week or next
fluentd-forwarder
> Forwarding agent written in Go
> Focused on forwarding logs to Fluentd > Works on Windows
> Bundles TCP input/output and TD output > No flexible plugin mechanism > We plan to add more inputs/outputs
> Similar products > fluent-agent-lite, fluent-agent-hydra, ik
fluentd-ui
> Manage Fluentd instance via Web UI > https://github.com/fluent/fluentd-ui
Embulk
> Bulk Loader version of Fluentd > Pluggable architecture
> JRuby, JVM languages > High performance parallel processing
> Share your script as a plugin > https://github.com/embulk
http://www.slideshare.net/frsyuki/embuk-making-data-integration-works-relaxed
HDFS
MySQL
Amazon S3
Embulk
CSV Files
SequenceFile
Salesforce.com
Elasticsearch
Cassandra
Hive
Redis
✓ Parallel execution ✓ Data validation ✓ Error recovery ✓ Deterministic behaviour ✓ Idempotent retrying
Plugins Plugins
bulk load
Check: treasuredata.com
Cloud service for the entire data pipeline
Recommended