Concurrent Stream Processing


DESCRIPTION

One of the greatest benefits of Clojure is its ability to create simple, powerful abstractions that operate at the level of the problem while also operating at the level of the language. This talk discusses a query processing engine built in Clojure that leverages this abstraction power to combine streams of data for efficient concurrent execution.

* Representing processing trees as s-expressions
* Streams as sequences of data
* Optimizing processing trees by manipulating s-expressions
* Direct execution of s-expression trees
* Compilation of s-expressions into nodes and pipes
* Concurrent processing of nodes and pipes using a fork/join pool


Concurrent Stream Processing

Alex Miller - @puredanger
Revelytix - http://revelytix.com

Contents
• Query execution - the problem
• Plan representation - plans in our program
• Processing components - building blocks
• Processing execution - executing plans

Query Execution

Relational Data & Queries

SELECT NAME
FROM PERSON
WHERE AGE > 20

NAME  AGE
Joe   30

RDF
"Resource Description Framework" - a fine-grained graph representation of data

(Diagram: the graph node http://data/Joe is linked by http://demo/age to 30 and by http://demo/name to "Joe")

Subject           Predicate         Object
http://data/Joe   http://demo/age   30
http://data/Joe   http://demo/name  "Joe"

SPARQL queries
SPARQL is a query language for RDF

PREFIX demo: <http://demo/>
SELECT ?name
WHERE { ?person demo:age ?age .
        ?person demo:name ?name .
        FILTER (?age > 20) }

A "triple pattern"

Natural join on ?person


Relational-to-RDF
• W3C R2RML mappings define how to virtually map a relational db into RDF

NAME  AGE
Joe   30

(Diagram: the PERSON row above maps to the RDF triples http://data/Joe http://demo/age 30 and http://data/Joe http://demo/name "Joe")

SELECT NAME
FROM PERSON
WHERE AGE > 20

Enterprise federation
• Model domain at enterprise level
• Map into data sources
• Federate across the enterprise (and beyond)

(Diagram: an enterprise endpoint fans a SPARQL query out to federated SPARQL endpoints, which translate to SQL against relational sources)

Query pipeline
• How does a query engine work?

Parse → Plan → Resolve → Optimize → Process

(Diagram: SQL goes in, Results! come out; Parse produces an AST, and Plan, Resolve (consulting Metadata), and Optimize each produce a Plan. Trees!)


Plan Representation

SQL query plans

(Plan tree: table Person [Name, Age, DeptID] and table Dept [DeptID, DeptName] feed a join on DeptID, then a filter Age > 20, then a project of Name, DeptName)

SELECT Name, DeptName
FROM Person, Dept
WHERE Person.DeptID = Dept.DeptID AND Age > 20

SPARQL query plans

(Plan tree: triple patterns TP1 { ?Person :Age ?Age } and TP2 { ?Person :Name ?Name } feed a join on ?Person, then a filter ?Age > 20, then a project of ?Name)

SELECT ?Name
WHERE { ?Person :Name ?Name .
        ?Person :Age ?Age .
        FILTER (?Age > 20) }

Common model
Streams of tuples flowing through a network of processing nodes

What kind of nodes?
• Tuple generators (leaves)
  – In SQL: a table or view
  – In SPARQL: a triple pattern
• Combinations (multiple children)
  – Join
  – Union
• Transformations
  – Filter
  – Dup removal
  – Sort
  – Grouping
  – Project
  – Slice (limit / offset)
  – etc.

Representation
Tree data structure with nodes and attributes

(Java class diagram: PlanNode (childNodes) with subclasses TableNode (table), JoinNode (joinType, joinCriteria), FilterNode (criteria), ProjectNode (projectExpressions), and SliceNode (limit, offset))

s-expressions
Tree data structure with nodes and attributes

(* (+ 2 3) (- 6 5))
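The expression above is plain data as well as a program; a minimal sketch of treating the same s-expression as a tree we can inspect and evaluate (standard Clojure only, eval-arith is an illustrative helper):

```clojure
(def expr '(* (+ 2 3) (- 6 5)))

(first expr)   ;; => *
(second expr)  ;; => (+ 2 3)

;; A tiny evaluator that walks the tree:
(defn eval-arith [e]
  (if (seq? e)
    (let [[op & args] e
          f ({'* *, '+ +, '- -} op)]
      (apply f (map eval-arith args)))
    e))

(eval-arith expr)  ;; => 5
```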

List representation
Tree data structure with nodes and attributes

(project+ [Name DeptName]
  (filter+ (> Age 20)
    (join+ (table+ Empl [Name Age DeptID])
           (table+ Dept [DeptID DeptName]))))

Query optimization
Example - pushing criteria down

(project+ [Name DeptName]
  (filter+ (> Age 20)
    (join+ (project+ [Name Age DeptID]
             (bind+ [Age (- (now) Birth)]
               (table+ Empl [Name Birth DeptID])))
           (table+ Dept [DeptID DeptName]))))

Query optimization
Example - rewritten

(project+ [Name DeptName]
  (join+ (project+ [Name DeptID]
           (filter+ (> (- (now) Birth) 20)
             (table+ Empl [Name Birth DeptID])))
         (table+ Dept [DeptID DeptName])))
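Because the plan is just a list, an optimization like this push-down is an ordinary function from tree to tree. A toy sketch with clojure.walk (push-filter-down and its rewrite rule are hypothetical and far simpler than a real optimizer): it moves a filter+ beneath a join+ onto the left input, assuming the predicate only references left-side columns.

```clojure
(require '[clojure.walk :as walk])

;; Rewrite (filter+ pred (join+ left right)) into
;; (join+ (filter+ pred left) right), everywhere in the tree.
;; Assumption (unchecked here): pred only uses left-side columns.
(defn push-filter-down [plan]
  (walk/postwalk
    (fn [node]
      (if (and (seq? node)
               (= 'filter+ (first node))
               (seq? (nth node 2))
               (= 'join+ (first (nth node 2))))
        (let [[_ pred [_ left right]] node]
          (list 'join+ (list 'filter+ pred left) right))
        node))
    plan))
```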

Hash join conversion

(Diagram: a join+ over left and right trees is rewritten into a let+ that binds hashes to (first+ (preduce+ hash-tuples left-tree)) and then runs the right tree through mapcat tuple-matches)

(join+ _left _right)

becomes

(let+ [hashes (first+ (preduce+ (hash-tuple join-vars {} #(merge-with concat %1 %2))
                                _left))]
  (mapcat (fn [tuple] (tuple-matches hashes join-vars tuple))
          _right))
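The same shape can be pictured on plain Clojure seqs, without pipes; a sketch (hash-join is a stand-in for the slide's hash-tuple/tuple-matches helpers) that builds a hash map from the left input in one pass and then streams the right input through it:

```clojure
(defn hash-join
  "Toy hash join over seqs of tuple maps, joining on join-vars."
  [join-vars left right]
  (let [key-fn (fn [tuple] (select-keys tuple join-vars))
        hashes (group-by key-fn left)]        ;; one pass over the left side
    (mapcat (fn [tuple]                       ;; stream the right side through
              (for [match (get hashes (key-fn tuple))]
                (merge match tuple)))
            right)))
```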

Processing trees

• Compile abstract nodes into more concrete stream operations:
  – map+, mapcat+, filter+
  – first+, mux+
  – let+, let-stream+
  – pmap+, pmapcat+, pfilter+, preduce+
  – number+, reorder+, rechunk+
  – pmap-chunk+, preduce-chunk+

Summary
• SPARQL and SQL query plans have essentially the same underlying algebra
• Model is a tree of nodes where tuples flow from leaves to the root
• A natural representation of this tree in Clojure is as a tree of s-expressions, just like our code
• We can manipulate this tree to provide
  – Optimizations
  – Differing levels of abstraction

Processing Components

Pipes
Pipes are streams of data

Producer → Pipe → Consumer

Producer side:
(enqueue pipe item)
(enqueue-all pipe items)
(close pipe)
(error pipe exception)

Consumer side:
(dequeue pipe item)
(dequeue-all pipe items)
(closed? pipe)
(error? pipe)
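A minimal in-memory pipe with this surface can be sketched with an atom (illustrative only, not Revelytix's implementation; new-pipe is an assumed constructor name, and the dequeue/error operations are omitted):

```clojure
(defn new-pipe []
  (atom {:items [] :closed false}))

(defn enqueue [pipe item]
  (swap! pipe update :items conj item))

(defn enqueue-all [pipe items]
  (swap! pipe update :items into items))

(defn dequeue-all [pipe]
  ;; atomically take everything currently in the pipe
  (let [[old _] (swap-vals! pipe assoc :items [])]
    (:items old)))

(defn close [pipe]
  (swap! pipe assoc :closed true))

(defn closed? [pipe]
  (:closed @pipe))
```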

Pipe callbacks

Events on the pipe trigger callbacks which are executed on the caller's thread

1. (add-callback pipe callback-fn)
2. (enqueue pipe "foo")
3. (callback-fn "foo") ;; during enqueue
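The three steps can be sketched as a separate, self-contained toy pipe whose callbacks run synchronously on the enqueueing thread (again an assumed implementation, not the real one):

```clojure
(defn make-pipe []
  (atom {:items [] :callbacks []}))

(defn add-callback [pipe callback-fn]
  (swap! pipe update :callbacks conj callback-fn))

(defn enqueue [pipe item]
  (swap! pipe update :items conj item)
  (doseq [f (:callbacks @pipe)]
    (f item)))  ;; (callback-fn "foo") happens here, during the enqueue

(def seen (atom []))
(def p (make-pipe))
(add-callback p #(swap! seen conj %))
(enqueue p "foo")
@seen  ;; => ["foo"]
```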

Pipes
Pipes are thread-safe functional data structures


Batched tuples
• To a pipe, data is just data. We actually pass data in batches through the pipe for efficiency.

[ {:Name "Alex" :Eyes "Blue" }
  {:Name "Jeff" :Eyes "Brown"}
  {:Name "Eric" :Eyes "Hazel"}
  {:Name "Joe"  :Eyes "Blue" }
  {:Name "Lisa" :Eyes "Blue" }
  {:Name "Glen" :Eyes "Brown"} ]
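Batching itself is just chunking the tuple seq; for example, with partition-all (the batch size of 2 is chosen arbitrarily):

```clojure
(def tuples
  [{:Name "Alex" :Eyes "Blue"}  {:Name "Jeff" :Eyes "Brown"}
   {:Name "Eric" :Eyes "Hazel"} {:Name "Joe"  :Eyes "Blue"}
   {:Name "Lisa" :Eyes "Blue"}  {:Name "Glen" :Eyes "Brown"}])

;; Split the stream into batches; each batch travels through the
;; pipe as a single value.
(partition-all 2 tuples)  ;; three batches of two tuples each
```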

Pipe multiplexer
Compose multiple pipes into one

Pipe tee
Send output to multiple destinations

Nodes
• Nodes transform tuples from the input pipe and put results on the output pipe.

(Diagram: Input Pipe → Node (fn) → Output Pipe)

A node has:
• input-pipe
• output-pipe
• task-fn
• state
• concurrency
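The attribute list above maps naturally onto a Clojure record; a sketch using the slide's field names (the semantics in the comments are assumed):

```clojure
(defrecord Node [input-pipe    ;; where tuples arrive
                 output-pipe   ;; where results go
                 task-fn       ;; fn applied to each chunk of tuples
                 state         ;; per-node accumulated state (e.g. for reduce)
                 concurrency]) ;; max concurrent tasks for this node

;; A hypothetical filter node; pipes omitted for brevity:
(def filter-node
  (map->Node {:task-fn     (partial filter #(> (:Age %) 20))
              :concurrency 1}))

((:task-fn filter-node) [{:Age 30} {:Age 10}])  ;; => ({:Age 30})
```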

Processing Trees
• Tree of nodes and pipes

(Diagram: a tree of fn nodes connected by pipes)

Data flow

SPARQL query example

(Plan tree: triple patterns TP1 { ?Person :Age ?Age } and TP2 { ?Person :Name ?Name } feed a join on ?Person, then a filter ?Age > 20, then a project of ?Name)

SELECT ?Name
WHERE { ?Person :Name ?Name .
        ?Person :Age ?Age .
        FILTER (?Age > 20) }

(project+ [?Name]
  (filter+ (> ?Age 20)
    (join+ [?Person]
      (triple+ [?Person :Name ?Name])
      (triple+ [?Person :Age ?Age]))))

Processing tree

(Plan tree after hash join conversion: one triple pattern feeds preduce+ hash-tuples and first+ to bind hashes in a let+, the other triple pattern streams through mapcat tuple-matches, and the result flows through the filter ?Age > 20 and the project of ?Name)

Mapping to nodes
• An obvious mapping to nodes and pipes

(Diagram: each plan operation - the triple patterns, let+, first+, preduce+, filter+, project+ - becomes its own fn node connected by pipes)

Mapping to nodes
• Choosing between compilation and evaluation

(Diagram: the same tree with the filter ?Age > 20 and project ?Name collapsed into a single eval node; the triple patterns, let+, first+, and preduce+ remain real fn nodes)

Compile vs eval
• We can evaluate our expressions
  – Directly on streams of Clojure data using Clojure
  – Indirectly via pipes and nodes (more on that next)
• Final step before processing makes the decision
  – Plan nodes that combine data are real nodes
  – Plan nodes that allow parallelism (p*) are real nodes
  – Most other plan nodes can be merged into a single eval
  – Many leaf nodes are actually rolled up and sent to a database
  – Lots more work to do on where these splits occur
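Merging plan nodes "into a single eval" can be pictured as composing ordinary sequence functions; a sketch (not the engine's actual compiler) that collapses the running filter+/project+ example into one pass over tuple maps:

```clojure
;; filter+ (> ?Age 20) and project+ [?Name], merged into one
;; sequence pipeline over tuple maps:
(defn eval-plan [tuples]
  (->> tuples
       (filter #(> (:Age %) 20))        ;; filter+ (> ?Age 20)
       (map #(select-keys % [:Name])))) ;; project+ [?Name]

(eval-plan [{:Name "Joe" :Age 30} {:Name "Ann" :Age 15}])
;; => ({:Name "Joe"})
```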

Processing Execution

Execution requirements
• Parallelism
  – Across plans
  – Across nodes in a plan
  – Within a parallelizable node in a plan
• Memory management
  – Allow arbitrary intermediate result sets w/o OOME
• Ops
  – Cancellation
  – Timeouts
  – Monitoring

Event-driven processing
• Dedicated I/O thread pools stream data into the plan

(Diagram: I/O threads feed the leaf fn nodes of the tree; compute threads drive the interior fn nodes)

Task creation
1. Callback fires when data is added to the input pipe
2. Callback takes the fn associated with the node and bundles it into a task
3. Task is scheduled with the compute thread pool

Fork/join vs Executors
• Fork/join thread pool vs classic Executors
  – Optimized for finer-grained tasks
  – Optimized for larger numbers of tasks
  – Optimized for more cores
  – Works well on tasks with dependencies
  – No contention on a single queue
  – Work stealing for load balancing
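Submitting such tasks to a fork/join pool from Clojure is a few lines of interop, since a Clojure fn is already a java.util.concurrent.Callable (a sketch; the real engine adds per-node semaphores and callbacks around this):

```clojure
(import '(java.util.concurrent ForkJoinPool))

(def pool (ForkJoinPool.))  ;; parallelism defaults to the core count

(defn submit-task!
  "Schedule f on the fork/join pool; returns a ForkJoinTask future."
  [^ForkJoinPool pool f]
  ;; hint Callable to pick the right submit overload
  (.submit pool ^java.util.concurrent.Callable f))

(def fut (submit-task! pool (fn [] (reduce + (range 1000)))))
(.get fut)  ;; => 499500
```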


Task execution
1. Pull next chunk from the input pipe
2. Execute the task function with access to the node's state
3. Optionally, output one or more chunks to the output pipe - this triggers the upstream callback
4. If data is still available, schedule a new task, simulating a new callback on the current node

Concurrency

• Delicate balance between Clojure refs/STM and Java concurrency primitives
• Clojure refs - managed by STM
  – Input pipe
  – Output pipe
  – Node state
• Java concurrency
  – Semaphore - "permits" to limit tasks per node
  – Per-node scheduling lock
• Key integration constraint
  – Clojure transactions can fail and retry!

Concurrency mechanisms

(Flowchart of process-input, run-task, and close-output. process-input: acquire semaphore, dequeue input; a Data message creates a task, a Close message sets closed = true. run-task: invoke the task; enqueue result data on the output pipe; if the result closes the output, set closed = true; release 1 semaphore. close-output, when closed && !closed_done: acquire all semaphores, run-task w/ nil msg, set closed_done = true, close the output pipe, release all semaphores. Legend: blue outline = Java lock, green outline = Clojure txn, blue shading = Clojure atom; everything runs under the Java semaphore)

Memory management
• Pipes are all on the heap
• How do we avoid OutOfMemory?

Buffered pipes
• When heap space is low, store pipe data on disk
• Data is serialized / deserialized to/from disk
• Memory-mapped files are used to improve I/O

(Diagram: the pipes between fn nodes spill their data to files on disk)
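Writing and reading a spilled chunk through memory-mapped files is plain java.nio interop; a sketch (the engine's serialization format is its own; mmap-write!/mmap-read are illustrative helpers):

```clojure
(import '(java.io File RandomAccessFile)
        '(java.nio.channels FileChannel$MapMode))

(defn mmap-write!
  "Write bytes to path through a memory-mapped buffer."
  [^String path ^bytes data]
  (with-open [raf (RandomAccessFile. (File. path) "rw")]
    (.put (.map (.getChannel raf)
                FileChannel$MapMode/READ_WRITE 0 (alength data))
          data)))

(defn mmap-read
  "Read the whole file back through a memory-mapped buffer."
  [^String path]
  (with-open [raf (RandomAccessFile. (File. path) "r")]
    (let [len (.length raf)
          buf (.map (.getChannel raf) FileChannel$MapMode/READ_ONLY 0 len)
          out (byte-array len)]
      (.get buf out)
      out)))
```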

Memory monitoring
• JMX memory beans
  – To detect when memory is tight -> write to disk
    • Use memory pool threshold notifications
  – To detect when memory is ok -> write to memory
    • Use polling (no notification on decrease)
• Composite pipes
  – Build a logical pipe out of many segments
  – As memory conditions go up and down, each segment is written to the fastest place. We never move data.
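The JMX hooks described above come from java.lang.management; a sketch of arming a usage threshold on the heap pools (the 0.8 fraction is an arbitrary example, and pools whose max is undefined are skipped):

```clojure
(import '(java.lang.management ManagementFactory MemoryPoolMXBean MemoryType))

(defn heap-pools []
  (filter #(= MemoryType/HEAP (.getType ^MemoryPoolMXBean %))
          (ManagementFactory/getMemoryPoolMXBeans)))

(defn set-threshold!
  "Arrange a JMX notification when pool usage passes fraction of max."
  [^MemoryPoolMXBean pool fraction]
  (let [max-bytes (.getMax (.getUsage pool))]
    (when (and (.isUsageThresholdSupported pool)
               (pos? max-bytes))                 ;; max can be -1 (undefined)
      (.setUsageThreshold pool (long (* fraction max-bytes))))))

(doseq [p (heap-pools)]
  (set-threshold! p 0.8))
```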

Cancellation
• Pool keeps track of which nodes belong to which plan
• All nodes check for cancellation during execution
• Cancellation can be caused by:
  – Error during execution
  – User intervention from the admin UI
  – Timeout from query settings
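The per-node cancellation check can be sketched as a shared flag consulted before each chunk of work (plan/node bookkeeping omitted; names hypothetical):

```clojure
(def cancelled? (atom false))

(defn run-task
  "Apply task-fn to chunk unless the plan has been cancelled."
  [task-fn chunk]
  (if @cancelled?
    :cancelled            ;; skip the work; the node winds down
    (task-fn chunk)))

(run-task inc 1)          ;; => 2
```

Flipping the flag (e.g. from the admin UI or a timeout watcher) makes every subsequent run-task call return :cancelled instead of doing work.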

Summary
• Data flow architecture
  – Event-driven by arrival of data
  – Compute threads never block
  – Fork/join to handle scheduling of work
• Clojure as abstraction tool
  – Expression tree lets us express plans concisely
  – Also lets us manipulate them with tools in Clojure
  – Lines of code:
    • Fork/join pool, nodes, pipes - 1200
    • Buffer, serialization, memory monitor - 970
    • Processor, compiler, eval - 1900
• Open source? Hmmmmmmmmmmm…….

Thanks...
Alex Miller
@puredanger
Revelytix, Inc.
