STORM
COMPARISON – INTRODUCTION – CONCEPTS
PRESENTATION BY KASPER MADSEN
NOVEMBER 2012
Slides updated for STORM 0.8.2
HADOOP VS STORM

Hadoop                          Storm
Batch processing                Real-time processing
Jobs run to completion          Topologies run forever
JobTracker is SPOF*             No single point of failure
Stateful nodes                  Stateless nodes
Scalable                        Scalable
Guarantees no data loss         Guarantees no data loss
Open source                     Open source

* Hadoop 0.21 added some checkpointing. SPOF: Single Point Of Failure
COMPONENTS

Nimbus daemon is comparable to the Hadoop JobTracker. It is the master.
Supervisor daemon spawns workers; it is comparable to the Hadoop TaskTracker.
Worker is spawned by the supervisor, one per port defined in the storm.yaml configuration.
Executor is spawned by a worker and runs as a thread.
Task is spawned by an executor and runs within the executor's thread.
Zookeeper* is a distributed system used to store metadata. The Nimbus and Supervisor daemons are fail-fast and stateless; all state is kept in Zookeeper.
* Zookeeper is an Apache top-level project
Notice that all communication between Nimbus and the Supervisors is done through Zookeeper.
On a cluster with 2k+1 Zookeeper nodes, the system can recover when at most k nodes fail.
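To illustrate the one-worker-per-port rule, a supervisor's storm.yaml defines its worker slots as a list of ports (the four ports shown are Storm's defaults; the length of the list caps the number of workers the machine can run):

```yaml
# storm.yaml on a supervisor node: one worker slot per listed port,
# so this machine can run at most 4 workers.
supervisor.slots.ports:
    - 6700
    - 6701
    - 6702
    - 6703
```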
EXECUTORS

The executor is a new abstraction:
• Decouples the tasks of a component from the number of threads
• Allows dynamically changing the number of executors without changing the number of tasks
• Makes elasticity much simpler, as semantics are kept valid (e.g. for a grouping)
• Enables elasticity in a multi-core environment
STREAMS

A stream is an unbounded sequence of tuples.
A topology is a graph where each node is a spout or a bolt, and the edges indicate which bolts subscribe to which streams.
• A spout is a source of a stream
• A bolt consumes a stream (and possibly emits a new one)
• An edge represents a grouping
[Diagram: two spouts, sources of streams A and B, feed four bolts: one subscribes to A and emits C; one subscribes to A and emits D; one subscribes to A & B; one subscribes to C & D]
GROUPINGS

Each spout or bolt runs X instances in parallel (called tasks).
Groupings are used to decide which task in the subscribing bolt a tuple is sent to:
• Shuffle grouping distributes tuples randomly
• Fields grouping partitions by field value, so that equal values always go to the same task
• All grouping replicates every tuple to all tasks
• Global grouping sends all tuples to a single task
• None grouping makes the bolt run in the same thread as the bolt/spout it subscribes to
• Direct grouping lets the producer (the task that emits) decide which consumer task receives the tuple
[Diagram: example topology with components running 2, 2, 4 and 3 tasks]
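The idea behind fields grouping can be sketched as a simple hash-partitioning rule. This is an illustrative sketch, not Storm's actual implementation: the grouping field's hash, taken modulo the number of tasks in the subscribing bolt, always maps equal values to the same task.

```java
import java.util.Objects;

// Sketch of fields-grouping task selection (illustrative, not Storm's code):
// equal field values always hash to the same consumer task index.
public class FieldsGroupingSketch {
    // Returns the index of the consumer task for a given field value.
    static int chooseTask(Object fieldValue, int numTasks) {
        // floorMod keeps the result non-negative even for negative hash codes.
        return Math.floorMod(Objects.hashCode(fieldValue), numTasks);
    }

    public static void main(String[] args) {
        // Equal values map to the same task, which is what makes
        // per-key aggregation correct under parallelism.
        System.out.println(chooseTask("nathan", 4) == chooseTask("nathan", 4));
    }
}
```

This determinism is exactly the semantics that must stay valid when the number of executors changes but the number of tasks does not.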
EXAMPLE
TopologyBuilder builder = new TopologyBuilder();
builder.setSpout("words", new TestWordSpout(), 10);
builder.setBolt("exclaim1", new ExclamationBolt(), 3)
.shuffleGrouping("words");
builder.setBolt("exclaim2", new ExclamationBolt(), 2)
.shuffleGrouping("exclaim1");
The source code for this example is part of the storm-starter project on GitHub.

• setSpout("words", …, 10) creates a stream called "words" and runs 10 tasks
• setBolt("exclaim1", …, 3) creates a stream called "exclaim1", runs 3 tasks, and subscribes to stream "words" using shuffle grouping
• setBolt("exclaim2", …, 2) creates a stream called "exclaim2", runs 2 tasks, and subscribes to stream "exclaim1" using shuffle grouping

A bolt can subscribe to an unlimited number of streams by chaining groupings.
[Diagram: TestWordSpout → ExclamationBolt → ExclamationBolt]
EXAMPLE – 1
TestWordSpout
public void nextTuple() {
    Utils.sleep(100);
    final String[] words = new String[] {"nathan", "mike", "jackson", "golda", "bertels"};
    final Random rand = new Random();
    final String word = words[rand.nextInt(words.length)];
    _collector.emit(new Values(word));
}
The TestWordSpout emits a random string from the array words, every 100 milliseconds.
EXAMPLE – 2
ExclamationBolt
OutputCollector _collector;

public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
    _collector = collector;
}

public void execute(Tuple tuple) {
    _collector.emit(tuple, new Values(tuple.getString(0) + "!!!"));
    _collector.ack(tuple);
}

public void declareOutputFields(OutputFieldsDeclarer declarer) {
    declarer.declare(new Fields("word"));
}
declareOutputFields is used to declare streams and their schemas. It
is possible to declare several streams and specify the stream to use
when outputting tuples in the emit function call.
prepare is called when the bolt is created.
execute is called for each tuple.
declareOutputFields is called when the bolt is created.
TRIDENT TOPOLOGY

A Trident topology is a new abstraction built on top of the STORM primitives.
• Supports
  • Joins
  • Aggregations
  • Grouping
  • Functions
  • Filters
• Easy to use, read the wiki
• Guarantees exactly-once processing, if using an (opaque) transactional spout
• Some basic ideas are the same as in the deprecated transactional topology*
  • Tuples are processed as small batches
  • Each batch gets a transaction id (txid); if a batch is replayed, the same txid is given
  • State updates are strongly ordered among batches
  • State updates atomically store meta-data with the data
• The transactional topology is superseded by the Trident topology from STORM 0.8.0
* See my first slide deck (March 2012) on STORM for detailed information: www.slideshare.com/KasperMadsen
EXACTLY-ONCE-PROCESSING – 1

Transactional spouts guarantee that the same data is replayed for every batch.

Guaranteeing exactly-once-processing for transactional spouts:
• The txid is stored with the data, so the last txid that updated the data is known
• This information is used to decide what to update in case of a replay

Example
1. Currently processing txid 2, with data ["man", "dog", "dog"]
2. Current state is: "man" => [count=3, txid=1]
                     "dog" => [count=2, txid=2]
3. The batch with txid 2 fails and gets replayed.
4. Resulting state is: "man" => [count=4, txid=2]
                       "dog" => [count=2, txid=2]
5. Because the txid is stored with the data, it is known that the count for "dog" must not be increased again.
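The update rule in this example can be sketched in a few lines. This is an illustrative sketch, not Trident's actual state API: the txid is stored next to each count, so replaying a batch whose txid was already applied does not double-count.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the transactional-spout update rule (illustrative, not Trident's API).
public class TxStateSketch {
    static final class Entry {
        long count, txid;
        Entry(long count, long txid) { this.count = count; this.txid = txid; }
    }

    final Map<String, Entry> state = new HashMap<>();

    void applyCount(String key, long batchCount, long txid) {
        Entry e = state.get(key);
        if (e == null) {
            state.put(key, new Entry(batchCount, txid));
        } else if (e.txid != txid) {
            e.count += batchCount;   // first time this txid touches the key
            e.txid = txid;
        }                            // else: already applied, replay is a no-op
    }

    public static void main(String[] args) {
        TxStateSketch s = new TxStateSketch();
        s.state.put("man", new Entry(3, 1));   // state from the example above
        s.state.put("dog", new Entry(2, 2));
        // Replay of the failed batch txid=2 containing ["man", "dog", "dog"]:
        s.applyCount("man", 1, 2);
        s.applyCount("dog", 2, 2);
        System.out.println(s.state.get("man").count); // 4
        System.out.println(s.state.get("dog").count); // 2, not double-counted
    }
}
```

Note this only works because a transactional spout replays the exact same batch for a given txid.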
EXACTLY-ONCE-PROCESSING – 2

An opaque transactional spout is not guaranteed to replay exactly the data that originally existed in a failed batch.
• Guarantees every tuple is successfully processed in exactly one batch
• Useful for having exactly-once-processing while allowing some inputs to fail

Guaranteeing exactly-once-processing for opaque transactional spouts:
• The same trick doesn't work, as a replayed batch might be changed, meaning some state might now store incorrect data. Consider the previous example!
• The problem is solved by storing more meta-data with the data (the previous value)

Example (values are listed as dog, cat):

Step | Batch data    | Count | prevValue | txid
1    | 2 dog, 1 cat  | 2, 1  | 0, 0      | 1, 1
2    | 1 dog, 2 cat  | 3, 1  | 2, 0      | 2, 1
2.1  | 2 dog, 2 cat  | 4, 3  | 2, 1      | 2, 2

Step 2 updates the dog count, then fails before updating cat.
Step 2.1 replays txid 2; the batch contains new data, but the updates are correct because the previous values are used.
Consider the potential problems if the new data for step 2.1 doesn't contain any cat.
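The opaque update rule differs from the transactional one in a single branch. Again an illustrative sketch, not Trident's actual API: because a replayed batch may contain different data, a repeated txid is re-applied on top of the stored previous value instead of being skipped.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the opaque-transactional update rule (illustrative, not Trident's API).
public class OpaqueStateSketch {
    static final class Entry {
        long count, prev, txid;
        Entry(long count, long prev, long txid) {
            this.count = count; this.prev = prev; this.txid = txid;
        }
    }

    final Map<String, Entry> state = new HashMap<>();

    void applyCount(String key, long batchCount, long txid) {
        Entry e = state.get(key);
        if (e == null) {
            state.put(key, new Entry(batchCount, 0, txid));
        } else if (e.txid == txid) {
            // Replay; batch contents may differ: rebuild from prevValue.
            e.count = e.prev + batchCount;
        } else {
            e.prev = e.count;        // remember the value before this txid
            e.count += batchCount;
            e.txid = txid;
        }
    }

    public static void main(String[] args) {
        OpaqueStateSketch s = new OpaqueStateSketch();
        s.applyCount("dog", 2, 1);   // step 1: batch txid=1 has 2 dog, 1 cat
        s.applyCount("cat", 1, 1);
        s.applyCount("dog", 1, 2);   // step 2: txid=2 updates dog, then fails
        s.applyCount("dog", 2, 2);   // step 2.1: replay of txid=2, now 2 dog, 2 cat
        s.applyCount("cat", 2, 2);
        System.out.println(s.state.get("dog").count); // 4
        System.out.println(s.state.get("cat").count); // 3
    }
}
```

Running the table's steps through this sketch reproduces the final row: dog ends at count 4 (prev 2) and cat at count 3 (prev 1), both with txid 2.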
ELASTICITY

Rebalancing workers and executors (not tasks):
• Pause spouts
• Wait for the message timeout
• Set the new assignment
• All moved tasks will be killed and restarted in the new location
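A rebalance like this can be triggered from the command line with Storm's rebalance command; the topology name, worker count, and component name below are examples:

```shell
# Redistribute a running topology across 5 workers and give the
# "exclaim1" component 6 executors (the number of tasks is unchanged).
storm rebalance mytopology -n 5 -e exclaim1=6
```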
Swapping (STORM 0.8.2):
• Submit the new topology in an inactive state
• Pause the spouts of the old topology
• Wait for the message timeout of the old topology
• Activate the new topology
• Deactivate the old topology
• Kill the old topology

What about state on tasks which are killed and restarted? It is up to the user to solve!
LEARN MORE
Website (http://storm-project.net/)
Wiki (https://github.com/nathanmarz/storm/wiki)
Storm-starter (https://github.com/nathanmarz/storm-starter)
Mailing list (http://groups.google.com/group/storm-user)
#storm-user room on freenode
UTSL: https://github.com/nathanmarz/storm
More slides: www.slideshare.net/KasperMadsen