Storm 0.8.2

STORM

COMPARISON - INTRODUCTION - CONCEPTS

PRESENTATION BY KASPER MADSEN

NOVEMBER - 2012

Slides updated for STORM 0.8.2

HADOOP VS STORM

Hadoop:

Batch processing

Jobs run to completion

JobTracker is SPOF*

Stateful nodes

Scalable

Guarantees no data loss

Open source

Storm:

Real-time processing

Topologies run forever

No single point of failure

Stateless nodes

Scalable

Guarantees no data loss

Open source

* Hadoop 0.21 added some checkpointing

SPOF: Single Point Of Failure

COMPONENTS

Nimbus daemon is the master; it is comparable to the Hadoop JobTracker

Supervisor daemon spawns workers; it is comparable to the Hadoop TaskTracker

Worker is spawned by supervisor, one per port defined in storm.yaml configuration

Executor is spawned by worker, run as a thread

Task is run by an executor, inside the executor's thread (one or more tasks per executor)

Zookeeper* is a distributed system, used to store metadata. Nimbus and Supervisor daemons are fail-fast and stateless. All state is kept in Zookeeper.

* Zookeeper is an Apache top-level project

Notice that all communication between Nimbus and the Supervisors is done through Zookeeper

On a cluster with 2k+1 Zookeeper nodes, the system can recover as long as at most k nodes fail.
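The 2k+1 rule follows from majority quorums: the ensemble keeps working as long as a strict majority of its nodes is alive. A minimal sketch of that arithmetic, assuming plain majority voting; the class and method names are hypothetical and not part of Storm or Zookeeper:

```java
// Hypothetical helper illustrating majority-quorum arithmetic.
// Not part of the Storm or Zookeeper APIs.
final class QuorumMath {
    // Smallest number of nodes that forms a strict majority of the ensemble.
    static int quorumSize(int ensembleSize) {
        return ensembleSize / 2 + 1;
    }

    // Number of node failures the ensemble survives while keeping a majority.
    static int toleratedFailures(int ensembleSize) {
        return ensembleSize - quorumSize(ensembleSize);
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 7; n++) {
            System.out.println(n + " nodes: quorum " + quorumSize(n)
                    + ", tolerates " + toleratedFailures(n) + " failure(s)");
        }
    }
}
```

With 2k+1 nodes the quorum is k+1, so k failures are tolerated. An even-sized ensemble tolerates no more failures than the next smaller odd size, which is why odd sizes are the usual choice.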

EXECUTORS

Executor is a new abstraction

• Decouples the number of tasks of a component from the number of threads

• Allows dynamically changing #executors, without changing #tasks

• Makes elasticity much simpler, as semantics are kept valid (e.g. for a grouping)

• Enables elasticity in a multi-core environment
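Why a fixed number of tasks keeps grouping semantics valid can be sketched in plain Java. This is an illustration under assumptions, not Storm's scheduler: the class, the contiguous assignment policy and the hash-based routing are all hypothetical simplifications.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration: tasks are fixed, executors (threads) can change.
final class ExecutorAssignment {
    // Spread task ids 0..numTasks-1 over numExecutors threads in contiguous ranges.
    static List<List<Integer>> assign(int numTasks, int numExecutors) {
        if (numExecutors < 1 || numExecutors > numTasks) {
            throw new IllegalArgumentException("need 1 <= executors <= tasks");
        }
        List<List<Integer>> executors = new ArrayList<>();
        int next = 0;
        for (int e = 0; e < numExecutors; e++) {
            // The first (numTasks % numExecutors) executors take one extra task.
            int size = numTasks / numExecutors + (e < numTasks % numExecutors ? 1 : 0);
            List<Integer> tasks = new ArrayList<>();
            for (int i = 0; i < size; i++) {
                tasks.add(next++);
            }
            executors.add(tasks);
        }
        return executors;
    }

    // Simplified value-based routing: depends only on the number of tasks.
    static int taskFor(String key, int numTasks) {
        return Math.floorMod(key.hashCode(), numTasks);
    }

    public static void main(String[] args) {
        System.out.println("4 tasks on 2 executors: " + assign(4, 2));
        System.out.println("4 tasks on 4 executors: " + assign(4, 4));
        // The target task for a key is the same before and after the change.
        System.out.println("\"dog\" goes to task " + taskFor("dog", 4));
    }
}
```

Because routing is computed against the number of tasks, changing the number of executors only changes which thread runs a task, not which task receives a given value.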

STREAMS

Stream is an unbounded sequence of tuples.

Topology is a graph where each node is a spout or bolt, and the edges indicate which bolts are subscribing to which streams.

• A spout is a source of a stream

• A bolt consumes a stream (and possibly emits a new one)

• An edge represents a grouping

[Figure: example topology]

Spout: source of stream A

Spout: source of stream B

Bolt: subscribes to A, emits C

Bolt: subscribes to A, emits D

Bolt: subscribes to A and B

Bolt: subscribes to C and D

GROUPINGS

Each spout or bolt runs X instances in parallel (called tasks).

Groupings decide which task of the subscribing bolt a tuple is sent to

Shuffle grouping is a random grouping

Fields grouping groups by value, such that tuples with equal values go to the same task

All grouping replicates to all tasks

Global grouping makes all tuples go to one task

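None grouping means the grouping does not matter; it currently behaves like shuffle grouping, with the intent that the bolt eventually runs in the same thread as the bolt/spout it subscribes to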

Direct grouping lets the producer (the task that emits) control which consumer task receives the tuple

[Figure: example topology whose components run 2, 2, 4 and 3 tasks]
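How the main groupings pick target tasks can be sketched as follows. This is a simplified illustration, not Storm's internal implementation: the class and method names are hypothetical, tasks are numbered 0..n-1, and the fields grouping is reduced to a hash of a single value modulo the number of tasks.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical sketch: each method returns the task indices that receive one tuple.
final class GroupingSketch {
    // Shuffle grouping: one randomly chosen task.
    static List<Integer> shuffle(int numTasks, Random rand) {
        return Collections.singletonList(rand.nextInt(numTasks));
    }

    // Fields grouping: equal values always map to the same task.
    static List<Integer> fields(Object fieldValue, int numTasks) {
        return Collections.singletonList(Math.floorMod(fieldValue.hashCode(), numTasks));
    }

    // All grouping: the tuple is replicated to every task.
    static List<Integer> all(int numTasks) {
        List<Integer> targets = new ArrayList<>();
        for (int t = 0; t < numTasks; t++) {
            targets.add(t);
        }
        return targets;
    }

    // Global grouping: every tuple goes to one task (here the lowest index).
    static List<Integer> global() {
        return Collections.singletonList(0);
    }

    public static void main(String[] args) {
        Random rand = new Random();
        System.out.println("shuffle: " + shuffle(4, rand));
        System.out.println("fields(dog): " + fields("dog", 4));
        System.out.println("fields(dog) again: " + fields("dog", 4));
        System.out.println("all: " + all(4));
        System.out.println("global: " + global());
    }
}
```

Direct grouping is left out of the sketch because the target is chosen by the emitting task itself rather than computed by the grouping.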

EXAMPLE

    TopologyBuilder builder = new TopologyBuilder();

    builder.setSpout("words", new TestWordSpout(), 10);

    builder.setBolt("exclaim1", new ExclamationBolt(), 3)
           .shuffleGrouping("words");

    builder.setBolt("exclaim2", new ExclamationBolt(), 2)
           .shuffleGrouping("exclaim1");

The source code for this example is part of the storm-starter project on GitHub

setSpout: create stream called "words", run 10 tasks

First setBolt: create stream called "exclaim1", run 3 tasks, subscribe to stream "words" using shuffle grouping

Second setBolt: create stream called "exclaim2", run 2 tasks, subscribe to stream "exclaim1" using shuffle grouping

A bolt can subscribe to an unlimited number of streams, by chaining groupings.

[Figure: TestWordSpout -> ExclamationBolt -> ExclamationBolt]

EXAMPLE – 1

TestWordSpout

    public void nextTuple() {
        Utils.sleep(100);
        final String[] words = new String[] {"nathan", "mike", "jackson", "golda", "bertels"};
        final Random rand = new Random();
        final String word = words[rand.nextInt(words.length)];
        _collector.emit(new Values(word));
    }

The TestWordSpout emits a random string from the array words every 100 milliseconds


EXAMPLE – 2

ExclamationBolt

    OutputCollector _collector;

    public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
        _collector = collector;
    }

    public void execute(Tuple tuple) {
        _collector.emit(tuple, new Values(tuple.getString(0) + "!!!"));
        _collector.ack(tuple);
    }

    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("word"));
    }

declareOutputFields is used to declare streams and their schemas. It is possible to declare several streams and specify the stream to use when outputting tuples in the emit function call.

Prepare is called when bolt is created

Execute is called for each tuple

declareOutputFields is called when bolt is created


TRIDENT TOPOLOGY

Trident topology is a new abstraction built on top of STORM primitives

• Supports
  • Joins
  • Aggregations
  • Grouping
  • Functions
  • Filters

• Easy to use, read the wiki

• Guarantees exactly-once processing if using an (opaque) transactional spout

• Some basic ideas are the same as in the deprecated transactional topology*
  • Tuples are processed as small batches
  • Each batch gets a transaction id (txid); if a batch is replayed, the same txid is given
  • State updates are strongly ordered among batches
  • State updates atomically store meta-data with data

• Transactional topology is superseded by the Trident topology from 0.8.0

* See my first slides (March 2012) on STORM for detailed information: www.slideshare.com/KasperMadsen

EXACTLY-ONCE-PROCESSING - 1

Transactional spouts guarantee that the same data is replayed for every batch

Guaranteeing exactly-once-processing for transactional spouts

• txid is stored with the data, such that the last txid that updated the data is known

• This information is used to know what to update in case of replay

Example

1. Currently processing txid: 2, with data ["man", "dog", "dog"]

2. Current state is:
   "man" => [count=3, txid=1]
   "dog" => [count=2, txid=2]

3. Batch with txid 2 fails and gets replayed.

4. Resulting state is:
   "man" => [count=4, txid=2]
   "dog" => [count=2, txid=2]

5. Because txid is stored with the data, it is known that the count for "dog" should not be increased again.
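The txid check from the example can be sketched in plain Java. This is an in-memory illustration under assumptions, not the Trident state API: the class and its methods are hypothetical, and a real store would have to write count and txid atomically.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory word count that stores the last txid with each count.
final class TransactionalCountState {
    private static final class Cell {
        long count;
        long txid;
        Cell(long count, long txid) { this.count = count; this.txid = txid; }
    }

    private final Map<String, Cell> state = new HashMap<>();

    // Seed a value, used here only to reproduce the state from the example.
    void seed(String key, long count, long txid) {
        state.put(key, new Cell(count, txid));
    }

    long count(String key) {
        Cell c = state.get(key);
        return c == null ? 0 : c.count;
    }

    // Apply a batch; keys already updated by this txid are skipped on replay.
    void applyBatch(long txid, List<String> words) {
        // Aggregate first so every key is updated at most once per batch.
        Map<String, Long> increments = new HashMap<>();
        for (String w : words) {
            increments.merge(w, 1L, Long::sum);
        }
        for (Map.Entry<String, Long> inc : increments.entrySet()) {
            Cell c = state.get(inc.getKey());
            if (c == null) {
                c = new Cell(0, -1); // -1: never updated (assumes txids >= 0)
                state.put(inc.getKey(), c);
            }
            if (c.txid == txid) {
                continue; // same batch already applied to this key
            }
            c.count += inc.getValue();
            c.txid = txid;
        }
    }

    public static void main(String[] args) {
        TransactionalCountState s = new TransactionalCountState();
        s.seed("man", 3, 1); // not yet updated by txid 2
        s.seed("dog", 2, 2); // already updated by txid 2 before the failure
        s.applyBatch(2, Arrays.asList("man", "dog", "dog")); // replay of txid 2
        System.out.println("man=" + s.count("man") + ", dog=" + s.count("dog"));
    }
}
```

The check is only safe because a transactional spout replays exactly the same data for a given txid, so a key that already carries the txid is known to hold the right value.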

EXACTLY-ONCE-PROCESSING - 2

Opaque transactional spout is not guaranteed to replay a failed batch with the same data the batch originally contained.

• Guarantees every tuple is successfully processed in exactly one batch

• Useful for having exactly-once-processing while allowing some inputs to fail

Guaranteeing exactly-once-processing for opaque transactional spouts

• The same trick doesn't work, as a replayed batch might have changed, meaning some state might already have stored incorrect data. Consider the previous example!

• The problem is solved by storing more meta-data with the data (the previous value)

Example

Step   Data            Count (dog, cat)   prevValue (dog, cat)   Txid (dog, cat)
1      2 dog, 1 cat    2, 1               0, 0                   1, 1
2      1 dog, 2 cat    3, 1               2, 1                   2, 1
2.1    2 dog, 2 cat    4, 3               2, 1                   2, 2

Step 2: updates dog count then fails

Step 2.1: batch contains new data, but the update is ok as previous values are used

Consider the potential problems if the new data for 2.1 doesn't contain any cat.
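The previous-value trick can be sketched in plain Java as well. This is an in-memory illustration under assumptions, not the Trident opaque state API: the class and method names are hypothetical, and a real store would write count, previous value and txid atomically.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory count that keeps the previous value with each count.
final class OpaqueCountState {
    private static final class Cell {
        long count;      // current value
        long prev;       // value before the batch identified by txid was applied
        long txid = -1;  // -1: never updated (assumes txids >= 0)
    }

    private final Map<String, Cell> state = new HashMap<>();

    long count(String key) {
        Cell c = state.get(key);
        return c == null ? 0 : c.count;
    }

    // Apply the increment that batch txid carries for one key.
    void update(String key, long increment, long txid) {
        Cell c = state.get(key);
        if (c == null) {
            c = new Cell();
            state.put(key, c);
        }
        if (c.txid == txid) {
            // Replay: the earlier attempt may have used different data,
            // so rebuild the count from the previous value.
            c.count = c.prev + increment;
        } else {
            // First attempt for this txid: remember the old value, then update.
            c.prev = c.count;
            c.count += increment;
            c.txid = txid;
        }
    }

    public static void main(String[] args) {
        OpaqueCountState s = new OpaqueCountState();
        // Step 1: txid 1 with 2 dog, 1 cat.
        s.update("dog", 2, 1);
        s.update("cat", 1, 1);
        // Step 2: txid 2 with 1 dog, 2 cat; fails after the dog update.
        s.update("dog", 1, 2);
        // Step 2.1: txid 2 replayed with new data, 2 dog and 2 cat.
        s.update("dog", 2, 2);
        s.update("cat", 2, 2);
        System.out.println("dog=" + s.count("dog") + ", cat=" + s.count("cat"));
    }
}
```

In this sketch a key that is missing from the replayed batch is simply not touched, so a partial update from the failed attempt would remain in place; a fuller implementation would have to account for that case.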

ELASTICITY

• Rebalancing workers and executors (not tasks)
  • Pause spouts
  • Wait for message timeout
  • Set new assignment
  • All moved tasks will be killed and restarted in new location

• Swapping (STORM 0.8.2)
  • Submit inactive new topology
  • Pause spouts of old topology
  • Wait for message timeout of old topology
  • Activate new topology
  • Deactivate old topology
  • Kill old topology

What about state on tasks which are killed and restarted?

It is up to the user to solve!

LEARN MORE

Website (http://storm-project.net/)

Wiki (https://github.com/nathanmarz/storm/wiki)

Storm-starter (https://github.com/nathanmarz/storm-starter)

Mailing list (http://groups.google.com/group/storm-user)

#storm-user room on freenode

UTSL: https://github.com/nathanmarz/storm

More slides: www.slideshare.net/KasperMadsen

from: http://www.cupofjoe.tv/2010/11/learn-lesson.html
