Concurrent Stream Processing


DESCRIPTION

One of the greatest benefits of Clojure is its ability to create simple, powerful abstractions that operate at the level of the problem while also operating at the level of the language. This talk discusses a query processing engine built in Clojure that leverages this abstraction power to combine streams of data for efficient concurrent execution.

* Representing processing trees as s-expressions
* Streams as sequences of data
* Optimizing processing trees by manipulating s-expressions
* Direct execution of s-expression trees
* Compilation of s-expressions into nodes and pipes
* Concurrent processing of nodes and pipes using a fork/join pool


Concurrent Stream Processing

Alex Miller - @puredanger
Revelytix - http://revelytix.com

Contents
• Query execution - the problem
• Plan representation - plans in our program
• Processing components - building blocks
• Processing execution - executing plans

Query Execution

Relational Data & Queries

SELECT NAME
FROM PERSON
WHERE AGE > 20

NAME  AGE
Joe   30

RDF
"Resource Description Framework" - a fine-grained graph representation of data

(Diagram: the graph node http://data/Joe is linked by http://demo/age to 30 and by http://demo/name to "Joe")

Subject           Predicate         Object
http://data/Joe   http://demo/age   30
http://data/Joe   http://demo/name  "Joe"

SPARQL queries
SPARQL is a query language for RDF

PREFIX demo: <http://demo/>
SELECT ?name
WHERE { ?person demo:age ?age .
        ?person demo:name ?name .
        FILTER (?age > 20) }

A "triple pattern"

Natural join on ?person


Relational-to-RDF
• W3C R2RML mappings define how to virtually map a relational db into RDF

NAME  AGE
Joe   30

(Diagram: the PERSON row above maps to the RDF triples http://data/Joe http://demo/age 30 and http://data/Joe http://demo/name "Joe")

SELECT NAME
FROM PERSON
WHERE AGE > 20

Enterprise federation
• Model domain at enterprise level
• Map into data sources
• Federate across the enterprise (and beyond)

(Diagram: an enterprise endpoint fans a SPARQL query out to federated SPARQL endpoints, which translate to SQL against relational sources)

Query pipeline
• How does a query engine work?

Parse → Plan → Resolve → Optimize → Process

(Diagram: SQL goes in, Results! come out; Parse produces an AST, and Plan, Resolve (consulting Metadata), and Optimize each produce a Plan. Trees!)


Plan Representation

SQL query plans

(Plan tree: table Person [Name, Age, DeptID] and table Dept [DeptID, DeptName] feed a join on DeptID, then a filter Age > 20, then a project of Name, DeptName)

SELECT Name, DeptName
FROM Person, Dept
WHERE Person.DeptID = Dept.DeptID AND Age > 20

SPARQL query plans

(Plan tree: triple patterns TP1 { ?Person :Age ?Age } and TP2 { ?Person :Name ?Name } feed a join on ?Person, then a filter ?Age > 20, then a project of ?Name)

SELECT ?Name
WHERE { ?Person :Name ?Name .
        ?Person :Age ?Age .
        FILTER (?Age > 20) }

Common model
Streams of tuples flowing through a network of processing nodes

What kind of nodes?
• Tuple generators (leaves)
  – In SQL: a table or view
  – In SPARQL: a triple pattern
• Combinations (multiple children)
  – Join
  – Union
• Transformations
  – Filter
  – Dup removal
  – Sort
  – Grouping
  – Project
  – Slice (limit / offset)
  – etc.

Representation
Tree data structure with nodes and attributes

(Java class diagram: PlanNode (childNodes) with subclasses TableNode (table), JoinNode (joinType, joinCriteria), FilterNode (criteria), ProjectNode (projectExpressions), and SliceNode (limit, offset))

s-expressions
Tree data structure with nodes and attributes

(* (+ 2 3) (- 6 5))
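The expression above is plain data as well as a program; a minimal sketch of treating the same s-expression as a tree we can inspect and evaluate (standard Clojure only, eval-arith is an illustrative helper):

```clojure
(def expr '(* (+ 2 3) (- 6 5)))

(first expr)   ;; => *
(second expr)  ;; => (+ 2 3)

;; A tiny evaluator that walks the tree:
(defn eval-arith [e]
  (if (seq? e)
    (let [[op & args] e
          f ({'* *, '+ +, '- -} op)]
      (apply f (map eval-arith args)))
    e))

(eval-arith expr)  ;; => 5
```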

List representation
Tree data structure with nodes and attributes

(project+ [Name DeptName]
  (filter+ (> Age 20)
    (join+ (table+ Empl [Name Age DeptID])
           (table+ Dept [DeptID DeptName]))))

Query optimization
Example - pushing criteria down

(project+ [Name DeptName]
  (filter+ (> Age 20)
    (join+ (project+ [Name Age DeptID]
             (bind+ [Age (- (now) Birth)]
               (table+ Empl [Name Birth DeptID])))
           (table+ Dept [DeptID DeptName]))))

Query optimization
Example - rewritten

(project+ [Name DeptName]
  (join+ (project+ [Name DeptID]
           (filter+ (> (- (now) Birth) 20)
             (table+ Empl [Name Birth DeptID])))
         (table+ Dept [DeptID DeptName])))
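Because the plan is just a list, an optimization like this push-down is an ordinary function from tree to tree. A toy sketch with clojure.walk (push-filter-down and its rewrite rule are hypothetical and far simpler than a real optimizer): it moves a filter+ beneath a join+ onto the left input, assuming the predicate only references left-side columns.

```clojure
(require '[clojure.walk :as walk])

;; Rewrite (filter+ pred (join+ left right)) into
;; (join+ (filter+ pred left) right), everywhere in the tree.
;; Assumption (unchecked here): pred only uses left-side columns.
(defn push-filter-down [plan]
  (walk/postwalk
    (fn [node]
      (if (and (seq? node)
               (= 'filter+ (first node))
               (seq? (nth node 2))
               (= 'join+ (first (nth node 2))))
        (let [[_ pred [_ left right]] node]
          (list 'join+ (list 'filter+ pred left) right))
        node))
    plan))
```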

Hash join conversion

(Diagram: a join+ over left and right trees is rewritten into a let+ that binds hashes to (first+ (preduce+ hash-tuples left-tree)) and then runs the right tree through mapcat tuple-matches)

(join+ _left _right)

becomes

(let+ [hashes (first+ (preduce+ (hash-tuple join-vars {} #(merge-with concat %1 %2))
                                _left))]
  (mapcat (fn [tuple] (tuple-matches hashes join-vars tuple))
          _right))
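The same shape can be pictured on plain Clojure seqs, without pipes; a sketch (hash-join is a stand-in for the slide's hash-tuple/tuple-matches helpers) that builds a hash map from the left input in one pass and then streams the right input through it:

```clojure
(defn hash-join
  "Toy hash join over seqs of tuple maps, joining on join-vars."
  [join-vars left right]
  (let [key-fn (fn [tuple] (select-keys tuple join-vars))
        hashes (group-by key-fn left)]        ;; one pass over the left side
    (mapcat (fn [tuple]                       ;; stream the right side through
              (for [match (get hashes (key-fn tuple))]
                (merge match tuple)))
            right)))
```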

Processing trees

• Compile abstract nodes into more concrete stream operations:
  – map+, mapcat+, filter+
  – first+, mux+
  – let+, let-stream+
  – pmap+, pmapcat+, pfilter+, preduce+
  – number+, reorder+, rechunk+
  – pmap-chunk+, preduce-chunk+

Summary
• SPARQL and SQL query plans have essentially the same underlying algebra
• Model is a tree of nodes where tuples flow from leaves to the root
• A natural representation of this tree in Clojure is as a tree of s-expressions, just like our code
• We can manipulate this tree to provide
  – Optimizations
  – Differing levels of abstraction

Processing Components

Pipes
Pipes are streams of data

Producer → Pipe → Consumer

Producer side:
(enqueue pipe item)
(enqueue-all pipe items)
(close pipe)
(error pipe exception)

Consumer side:
(dequeue pipe item)
(dequeue-all pipe items)
(closed? pipe)
(error? pipe)
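A minimal in-memory pipe with this surface can be sketched with an atom (illustrative only, not Revelytix's implementation; new-pipe is an assumed constructor name, and the dequeue/error operations are omitted):

```clojure
(defn new-pipe []
  (atom {:items [] :closed false}))

(defn enqueue [pipe item]
  (swap! pipe update :items conj item))

(defn enqueue-all [pipe items]
  (swap! pipe update :items into items))

(defn dequeue-all [pipe]
  ;; atomically take everything currently in the pipe
  (let [[old _] (swap-vals! pipe assoc :items [])]
    (:items old)))

(defn close [pipe]
  (swap! pipe assoc :closed true))

(defn closed? [pipe]
  (:closed @pipe))
```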

Pipe callbacks

Events on the pipe trigger callbacks which are executed on the caller's thread

1. (add-callback pipe callback-fn)
2. (enqueue pipe "foo")
3. (callback-fn "foo") ;; during enqueue
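The three steps can be sketched as a separate, self-contained toy pipe whose callbacks run synchronously on the enqueueing thread (again an assumed implementation, not the real one):

```clojure
(defn make-pipe []
  (atom {:items [] :callbacks []}))

(defn add-callback [pipe callback-fn]
  (swap! pipe update :callbacks conj callback-fn))

(defn enqueue [pipe item]
  (swap! pipe update :items conj item)
  (doseq [f (:callbacks @pipe)]
    (f item)))  ;; (callback-fn "foo") happens here, during the enqueue

(def seen (atom []))
(def p (make-pipe))
(add-callback p #(swap! seen conj %))
(enqueue p "foo")
@seen  ;; => ["foo"]
```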

Pipes
Pipes are thread-safe functional data structures


Batched tuples
• To a pipe, data is just data. We actually pass data in batches through the pipe for efficiency.

[ {:Name "Alex" :Eyes "Blue" }
  {:Name "Jeff" :Eyes "Brown"}
  {:Name "Eric" :Eyes "Hazel"}
  {:Name "Joe"  :Eyes "Blue" }
  {:Name "Lisa" :Eyes "Blue" }
  {:Name "Glen" :Eyes "Brown"} ]
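Batching itself is just chunking the tuple seq; for example, with partition-all (the batch size of 2 is chosen arbitrarily):

```clojure
(def tuples
  [{:Name "Alex" :Eyes "Blue"}  {:Name "Jeff" :Eyes "Brown"}
   {:Name "Eric" :Eyes "Hazel"} {:Name "Joe"  :Eyes "Blue"}
   {:Name "Lisa" :Eyes "Blue"}  {:Name "Glen" :Eyes "Brown"}])

;; Split the stream into batches; each batch travels through the
;; pipe as a single value.
(partition-all 2 tuples)  ;; three batches of two tuples each
```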

Pipe multiplexer
Compose multiple pipes into one

Pipe tee
Send output to multiple destinations

Nodes
• Nodes transform tuples from the input pipe and put results on the output pipe.

(Diagram: Input Pipe → Node (fn) → Output Pipe)

A node has:
• input-pipe
• output-pipe
• task-fn
• state
• concurrency
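The attribute list above maps naturally onto a Clojure record; a sketch using the slide's field names (the semantics in the comments are assumed):

```clojure
(defrecord Node [input-pipe    ;; where tuples arrive
                 output-pipe   ;; where results go
                 task-fn       ;; fn applied to each chunk of tuples
                 state         ;; per-node accumulated state (e.g. for reduce)
                 concurrency]) ;; max concurrent tasks for this node

;; A hypothetical filter node; pipes omitted for brevity:
(def filter-node
  (map->Node {:task-fn     (partial filter #(> (:Age %) 20))
              :concurrency 1}))

((:task-fn filter-node) [{:Age 30} {:Age 10}])  ;; => ({:Age 30})
```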

Processing Trees
• Tree of nodes and pipes

(Diagram: a tree of fn nodes connected by pipes)

Data flow

SPARQL query example

(Plan tree: triple patterns TP1 { ?Person :Age ?Age } and TP2 { ?Person :Name ?Name } feed a join on ?Person, then a filter ?Age > 20, then a project of ?Name)

SELECT ?Name
WHERE { ?Person :Name ?Name .
        ?Person :Age ?Age .
        FILTER (?Age > 20) }

(project+ [?Name]
  (filter+ (> ?Age 20)
    (join+ [?Person]
      (triple+ [?Person :Name ?Name])
      (triple+ [?Person :Age ?Age]))))

Processing tree

(Plan tree after hash join conversion: one triple pattern feeds preduce+ hash-tuples and first+ to bind hashes in a let+, the other triple pattern streams through mapcat tuple-matches, and the result flows through the filter ?Age > 20 and the project of ?Name)

Mapping to nodes
• An obvious mapping to nodes and pipes

(Diagram: each plan operation - the triple patterns, let+, first+, preduce+, filter+, project+ - becomes its own fn node connected by pipes)

Mapping to nodes
• Choosing between compilation and evaluation

(Diagram: the same tree with the filter ?Age > 20 and project ?Name collapsed into a single eval node; the triple patterns, let+, first+, and preduce+ remain real fn nodes)

Compile vs eval
• We can evaluate our expressions
  – Directly on streams of Clojure data using Clojure
  – Indirectly via pipes and nodes (more on that next)
• Final step before processing makes the decision
  – Plan nodes that combine data are real nodes
  – Plan nodes that allow parallelism (p*) are real nodes
  – Most other plan nodes can be merged into a single eval
  – Many leaf nodes are actually rolled up and sent to a database
  – Lots more work to do on where these splits occur
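Merging plan nodes "into a single eval" can be pictured as composing ordinary sequence functions; a sketch (not the engine's actual compiler) that collapses the running filter+/project+ example into one pass over tuple maps:

```clojure
;; filter+ (> ?Age 20) and project+ [?Name], merged into one
;; sequence pipeline over tuple maps:
(defn eval-plan [tuples]
  (->> tuples
       (filter #(> (:Age %) 20))        ;; filter+ (> ?Age 20)
       (map #(select-keys % [:Name])))) ;; project+ [?Name]

(eval-plan [{:Name "Joe" :Age 30} {:Name "Ann" :Age 15}])
;; => ({:Name "Joe"})
```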

Processing Execution

Execution requirements
• Parallelism
  – Across plans
  – Across nodes in a plan
  – Within a parallelizable node in a plan
• Memory management
  – Allow arbitrary intermediate result sets w/o OOME
• Ops
  – Cancellation
  – Timeouts
  – Monitoring

Event-driven processing
• Dedicated I/O thread pools stream data into the plan

(Diagram: I/O threads feed the leaf fn nodes of the tree; compute threads drive the interior fn nodes)

Task creation
1. Callback fires when data is added to the input pipe
2. Callback takes the fn associated with the node and bundles it into a task
3. Task is scheduled with the compute thread pool

Fork/join vs Executors
• Fork/join thread pool vs classic Executors
  – Optimized for finer-grained tasks
  – Optimized for larger numbers of tasks
  – Optimized for more cores
  – Works well on tasks with dependencies
  – No contention on a single queue
  – Work stealing for load balancing
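Submitting such tasks to a fork/join pool from Clojure is a few lines of interop, since a Clojure fn is already a java.util.concurrent.Callable (a sketch; the real engine adds per-node semaphores and callbacks around this):

```clojure
(import '(java.util.concurrent ForkJoinPool))

(def pool (ForkJoinPool.))  ;; parallelism defaults to the core count

(defn submit-task!
  "Schedule f on the fork/join pool; returns a ForkJoinTask future."
  [^ForkJoinPool pool f]
  ;; hint Callable to pick the right submit overload
  (.submit pool ^java.util.concurrent.Callable f))

(def fut (submit-task! pool (fn [] (reduce + (range 1000)))))
(.get fut)  ;; => 499500
```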


Task execution
1. Pull next chunk from the input pipe
2. Execute the task function with access to the node's state
3. Optionally, output one or more chunks to the output pipe - this triggers the upstream callback
4. If data is still available, schedule a new task, simulating a new callback on the current node

Concurrency

• Delicate balance between Clojure refs/STM and Java concurrency primitives
• Clojure refs - managed by STM
  – Input pipe
  – Output pipe
  – Node state
• Java concurrency
  – Semaphore - "permits" to limit tasks per node
  – Per-node scheduling lock
• Key integration constraint
  – Clojure transactions can fail and retry!

Concurrency mechanisms

(Flowchart of process-input, run-task, and close-output. process-input: acquire semaphore, dequeue input; a Data message creates a task, a Close message sets closed = true. run-task: invoke the task; enqueue result data on the output pipe; if the result closes the output, set closed = true; release 1 semaphore. close-output, when closed && !closed_done: acquire all semaphores, run-task w/ nil msg, set closed_done = true, close the output pipe, release all semaphores. Legend: blue outline = Java lock, green outline = Clojure txn, blue shading = Clojure atom; everything runs under the Java semaphore)

Memory management
• Pipes are all on the heap
• How do we avoid OutOfMemory?

Buffered pipes
• When heap space is low, store pipe data on disk
• Data is serialized / deserialized to/from disk
• Memory-mapped files are used to improve I/O

(Diagram: the pipes between fn nodes spill their data to files on disk)
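Writing and reading a spilled chunk through memory-mapped files is plain java.nio interop; a sketch (the engine's serialization format is its own; mmap-write!/mmap-read are illustrative helpers):

```clojure
(import '(java.io File RandomAccessFile)
        '(java.nio.channels FileChannel$MapMode))

(defn mmap-write!
  "Write bytes to path through a memory-mapped buffer."
  [^String path ^bytes data]
  (with-open [raf (RandomAccessFile. (File. path) "rw")]
    (.put (.map (.getChannel raf)
                FileChannel$MapMode/READ_WRITE 0 (alength data))
          data)))

(defn mmap-read
  "Read the whole file back through a memory-mapped buffer."
  [^String path]
  (with-open [raf (RandomAccessFile. (File. path) "r")]
    (let [len (.length raf)
          buf (.map (.getChannel raf) FileChannel$MapMode/READ_ONLY 0 len)
          out (byte-array len)]
      (.get buf out)
      out)))
```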

Memory monitoring
• JMX memory beans
  – To detect when memory is tight -> write to disk
    • Use memory pool threshold notifications
  – To detect when memory is ok -> write to memory
    • Use polling (no notification on decrease)
• Composite pipes
  – Build a logical pipe out of many segments
  – As memory conditions go up and down, each segment is written to the fastest place. We never move data.
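The JMX hooks described above come from java.lang.management; a sketch of arming a usage threshold on the heap pools (the 0.8 fraction is an arbitrary example, and pools whose max is undefined are skipped):

```clojure
(import '(java.lang.management ManagementFactory MemoryPoolMXBean MemoryType))

(defn heap-pools []
  (filter #(= MemoryType/HEAP (.getType ^MemoryPoolMXBean %))
          (ManagementFactory/getMemoryPoolMXBeans)))

(defn set-threshold!
  "Arrange a JMX notification when pool usage passes fraction of max."
  [^MemoryPoolMXBean pool fraction]
  (let [max-bytes (.getMax (.getUsage pool))]
    (when (and (.isUsageThresholdSupported pool)
               (pos? max-bytes))                 ;; max can be -1 (undefined)
      (.setUsageThreshold pool (long (* fraction max-bytes))))))

(doseq [p (heap-pools)]
  (set-threshold! p 0.8))
```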

Cancellation
• Pool keeps track of which nodes belong to which plan
• All nodes check for cancellation during execution
• Cancellation can be caused by:
  – Error during execution
  – User intervention from the admin UI
  – Timeout from query settings
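The per-node cancellation check can be sketched as a shared flag consulted before each chunk of work (plan/node bookkeeping omitted; names hypothetical):

```clojure
(def cancelled? (atom false))

(defn run-task
  "Apply task-fn to chunk unless the plan has been cancelled."
  [task-fn chunk]
  (if @cancelled?
    :cancelled            ;; skip the work; the node winds down
    (task-fn chunk)))

(run-task inc 1)          ;; => 2
```

Flipping the flag (e.g. from the admin UI or a timeout watcher) makes every subsequent run-task call return :cancelled instead of doing work.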

Summary
• Data flow architecture
  – Event-driven by arrival of data
  – Compute threads never block
  – Fork/join to handle scheduling of work
• Clojure as abstraction tool
  – Expression tree lets us express plans concisely
  – Also lets us manipulate them with tools in Clojure
  – Lines of code:
    • Fork/join pool, nodes, pipes - 1200
    • Buffer, serialization, memory monitor - 970
    • Processor, compiler, eval - 1900
• Open source? Hmmmmmmmmmmm…….

Thanks...
Alex Miller
@puredanger
Revelytix, Inc.
