
Cuneiform: A Functional Workflow Language Implementation in Erlang

Jörgen Brandt

Humboldt-Universität zu Berlin

2015-12-01

Jörgen Brandt (HU Berlin) Cuneiform 2015-12-01 1 / 27


Cuneiform: A Functional Language for Large Scale Scientific Data Analysis

  • Black-box operator model
    Pro: Operators can be any piece of software
  • Black-box data model
    Pro: Input and output data can be anything
  • Features of advanced workflow languages
    Abstractions, lists, operations on lists, conditions
  • Light-weight Foreign Function Interface (FFI)
    Wrapping in R, Matlab, Octave, Python, Lisp, Perl, Bash
  • Automatic parallelization
    Scalability with large data sets


Motivation

“New hardware is increasingly parallel, so new programming languages must support concurrency or they will die.”

Joe Armstrong


“New Hardware”


DNA Sequencing is becoming cheap


Decentralized software development


Scientific Workflow Systems


Workflows as DAGs

Scientific workflows are DAGs:

  • Nodes are tasks
  • Edges are data dependencies
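The DAG view can be illustrated with a minimal Python sketch. The task names and the dependency edges below are hypothetical and chosen only for illustration. The sketch uses the standard-library `graphlib.TopologicalSorter` to derive an order in which every task runs after the tasks it depends on, and a small helper that lists the tasks that are ready to run at a given point.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each key is a task (node), each value is the set
# of tasks whose output it consumes (the incoming data-dependency edges).
dependencies = {
    "gunzip": set(),
    "index": set(),
    "align": {"gunzip", "index"},
    "call-variants": {"align"},
}

def execution_order(deps):
    """Return one valid order in which the tasks can be executed."""
    return list(TopologicalSorter(deps).static_order())

def ready_tasks(deps, done):
    """Tasks not yet done whose dependencies are all done.
    They do not depend on each other, so they could run in parallel."""
    return sorted(t for t, pre in deps.items() if t not in done and pre <= done)

order = execution_order(dependencies)
print(order)
print(ready_tasks(dependencies, done=set()))
```

Tasks that are ready at the same time share no edge, which is where a workflow system can derive parallelism from the graph alone.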


Example: Galaxy Workflow System

Focus on:

  • Usability
  • Integration of tools/libraries
  • Systematic documentation
  • Reproducibility


The Next Generation Sequencing use case


Desired Features in a Language

Is there a language that is ...

  • Like a workflow language
    So we can integrate all the tools
  • Like MapReduce
    So we can derive parallelism and distribute the work
  • Like a functional programming language
    So we can write arbitrary programs using lists and operations on lists



Cuneiform example

deftask gunzip( out( File ) : gz( File ) )in bash *{
  gzip -c -d $gz > $out
}*

gunzip( gz: 'myarchive1.gz' 'myarchive2.gz' 'myarchive3.gz' );

[Figure: workflow graph of the gunzip task application]
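The effect of applying a task to a list of inputs can be sketched in Python. This is only a model of the idea, not how Cuneiform executes tasks: Python's `gzip` module stands in for the Bash body, the helper `apply_task` is hypothetical, and the demo archives are created in a temporary directory. Each list element leads to its own task invocation, and the invocations share no state, so they can run concurrently.

```python
import gzip
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

def gunzip(gz_path):
    """Decompress one archive and return the path of the output file.
    Stands in for the Bash body `gzip -c -d $gz > $out`."""
    out_path = gz_path[:-3] if gz_path.endswith(".gz") else gz_path + ".out"
    with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
        dst.write(src.read())
    return out_path

def apply_task(task, inputs):
    """Apply a single-input task to every element of a list.
    One invocation per element; results keep the input order."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(task, inputs))

# Build three small archives in a temporary directory for the demo.
workdir = tempfile.mkdtemp()
archives = []
for i in (1, 2, 3):
    path = os.path.join(workdir, f"myarchive{i}.gz")
    with gzip.open(path, "wb") as f:
        f.write(f"content {i}".encode())
    archives.append(path)

outputs = apply_task(gunzip, archives)

contents = []
for p in outputs:
    with open(p, "rb") as f:
        contents.append(f.read())
print(contents)
```

The caller writes a single application; the number of invocations follows from the length of the input list.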


Workflow Implementations Available

Available example workflows:

  • Variant calling
    https://www.github.com/joergen7/variant-call
  • Methylation
    https://www.github.com/joergen7/methylation
  • RNA-Seq
    https://www.github.com/joergen7/rna-seq
  • etc. (ChIP-Seq, miRNA detection, consensus prediction, ...)

[Figure: workflow graph with the task nodes samtools-faidx, fastq-dump, cufflinks, cummerbund, tophat-single, cuffdiff, cuffmerge]


Cuneiform Operational Semantics


Program Interpretation in Cuneiform

Current design in Java implementation:

[Figure: pipeline diagram with the components EBNF, Parser Generator (ANTLR), Scanner/Parser, Parse tree, Transcription Visitor, Abstract Program, Program, Interpreter, Result]


Designated design but never implemented:

[Figure: pipeline diagram with the components EBNF, Parser Generator (ANTLR), Scanner/Parser, Parse tree, Transcription Visitor, Abstract Program, Operational Semantics, Program Extraction(?), Program, Interpreter, Result]

Designated design in Erlang:

[Figure: pipeline diagram with the components EBNF, Parser Generator (ANTLR), Scanner/Parser, Parse tree, Transcription Visitor, Abstract Program, Operational Semantics, Program Extraction(?), Program, Interpreter, Result]


The eval Function

fun eval(Expr, ρ, GetFuture, Global, Fin) → Result when
  Expr      :: expr
  ρ         :: string => expr
  GetFuture :: fun
  Global    :: string => lam
  Fin       :: id => expr
  Result    :: expr

Expr       The expression to be evaluated
ρ          Current scope
GetFuture  A function returning a future for a foreign task application
Global     Task definitions
Fin        Results of foreign task applications
Result     The result of evaluation (may contain futures)
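The roles of GetFuture and Fin can be illustrated with a small Python sketch. All class and function names here (`Str`, `Future`, `App`, `make_get_future`, `resolve`) are hypothetical and are not taken from the Cuneiform code base. The sketch assumes that a future is a placeholder carrying an id, and that Fin is a mapping from such ids to finished result expressions.

```python
from dataclasses import dataclass
from itertools import count

@dataclass(frozen=True)
class Str:
    """A plain value expression."""
    value: str

@dataclass(frozen=True)
class Future:
    """Placeholder for the result of a foreign task application."""
    id: int

@dataclass(frozen=True)
class App:
    """Application of a named foreign task to an argument expression."""
    task: str
    arg: object

def make_get_future():
    """Return a GetFuture function and the record of submitted work."""
    ids = count(1)
    submitted = {}  # future id -> (task name, argument)

    def get_future(app):
        fut = Future(next(ids))
        submitted[fut.id] = (app.task, app.arg)
        return fut

    return get_future, submitted

def resolve(expr, fin):
    """Replace a future by its result if Fin already holds one;
    otherwise leave the expression as it is."""
    if isinstance(expr, Future) and expr.id in fin:
        return fin[expr.id]
    return expr

get_future, submitted = make_get_future()
fut = get_future(App("gunzip", Str("myarchive1.gz")))
pending = resolve(fut, fin={})
finished = resolve(fut, fin={fut.id: Str("myarchive1")})
print(pending, finished)
```

While Fin has no entry for the id, the future stays in the expression, which matches the note that a result may contain futures.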


Computational Semantics

fun eval(Expr, ρ, GetFuture, Global, Fin) →
  Next = step(Expr, ρ, GetFuture, Global, Fin)
  case Next of
    Expr → Expr
    _    → eval(Next, ρ, GetFuture, Global, Fin)
  end

  • A single step is computed
  • If the step has no effect, evaluation terminates
  • Otherwise, eval is called recursively
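The step-until-fixpoint scheme can be sketched in Python on a toy expression language. The language (integers, variable names, and `("add", left, right)` tuples) is an assumption made for illustration, and the signature is reduced to the expression and the scope; the GetFuture, Global, and Fin parameters are left out. The loop is the iterative form of the recursive call.

```python
def step(expr, rho):
    """Perform at most one reduction; return expr unchanged if none applies.
    Only the hypothetical operator "add" is supported in this sketch."""
    if isinstance(expr, str):                # variable: look it up in scope
        return rho.get(expr, expr)
    if isinstance(expr, tuple):
        op, left, right = expr
        if isinstance(left, int) and isinstance(right, int):
            return left + right              # both operands are values
        new_left = step(left, rho)
        if new_left != left:
            return (op, new_left, right)     # reduce the left operand first
        return (op, left, step(right, rho))
    return expr                              # integers are already values

def eval_expr(expr, rho):
    """Apply step repeatedly until a step has no effect."""
    while True:
        nxt = step(expr, rho)
        if nxt == expr:                      # no effect: evaluation terminates
            return expr
        expr = nxt                           # otherwise continue with Next

result = eval_expr(("add", "x", ("add", 1, 2)), {"x": 4})
stuck = eval_expr(("add", "y", 1), {})
print(result, stuck)
```

An expression with an unbound variable simply stops changing and is returned as is, so termination does not require that the result is fully reduced.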


Formalisms Suitable for Operational Semantics

Choosing a language:

  • Functional Language
  • With Pattern Matching
  • Concurrency Orientation

Common Lisp, Scala, ML, Erlang, Haskell


Program Interpretation in Cuneiform

New design:

[Figure: pipeline diagram with the components EBNF, Parser Generator (Leex/Yecc), Scanner/Parser, Operational Semantics, Compiler (Erlang), Abstract Program, Program, Interpreter, Result]


Fault Tolerance


Two types of complexity

[Figure: diagram labeled "n Processes" and "n Tasks"]

Distributed applications can be complex in two independent ways:

  • in the number of processes involved to compute one task
    the more system components, the more likely one component fails
  • in the number of tasks contributing to a workflow
    the more tasks, the more likely one task fails
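The claim that more parts make a failure more likely can be made concrete with a short Python calculation. It rests on two assumptions that are not stated on the slide: parts fail independently, and every part fails with the same probability p. Under these assumptions the probability that at least one of n parts fails is 1 - (1 - p)^n. The value p = 0.01 is an arbitrary illustration, not a measurement.

```python
def p_any_failure(p, n):
    """Probability that at least one of n independent parts fails,
    given that each part fails with probability p."""
    return 1.0 - (1.0 - p) ** n

# Illustrative per-part failure probability (an assumption).
P_FAIL = 0.01

for n in (1, 10, 100, 1000):
    print(n, round(p_any_failure(P_FAIL, n), 3))
```

The same formula applies to both dimensions: n can count the processes involved in one task or the tasks in one workflow.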


Distributed Application: Workflow System

[Figure: architecture diagram with the components Scheduler, FS/Work (three times), Cache, Query (three times)]


How to achieve fault tolerance when ...

  • Workflow systems are complex in both
    the number of processes involved in computing one task and
    the number of tasks in one workflow,
    so failures are likely
  • All components need to maintain state,
    so plain restarting of components is not enough
  • Restarting the workflow helps only if workflows are small and the system has few components

Generic process behaviour

Golden path:
    P1 sends request to P2
    Asynchronously P1 receives reply

Figure: P1 and P2, with a request arrow from P1 to P2 and a reply arrow from P2 to P1
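The golden path can be sketched with a toy mailbox model. This is a single-threaded Python simulation, not the Erlang implementation from the talk; the `Process` class, the message tuples and the name `task-1` are assumptions made for illustration. The point it shows is that sending only enqueues a message, so P1 is free to continue until the reply arrives.

```python
from collections import deque


class Process:
    """A toy process with a mailbox; a stand-in for an Erlang process."""

    def __init__(self, name):
        self.name = name
        self.mailbox = deque()

    def send(self, target, msg):
        # Sending only enqueues the message; the sender never blocks.
        target.mailbox.append(msg)


def golden_path():
    p1, p2 = Process("P1"), Process("P2")
    trace = []

    # P1 sends a request to P2 and carries on immediately.
    p1.send(p2, ("request", p1, "task-1"))
    trace.append("P1 sent request")
    trace.append("P1 continues with other work")

    # Later, P2 takes the request from its mailbox and replies.
    _kind, sender, payload = p2.mailbox.popleft()
    p2.send(sender, ("reply", payload, "result of " + payload))
    trace.append("P2 sent reply")

    # P1 picks up the reply whenever it next checks its mailbox.
    _kind, payload, result = p1.mailbox.popleft()
    trace.append("P1 received " + result)
    return trace
```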

Generic process behaviour

P2 may fail:
    P1 sends request to P2
    P1 creates monitor on P2
    P1 memorizes request
    P2 fails
    P2 supervisor restarts P2
    request is replayed to P2
    monitor is recreated

Figure: P1 and P2, linked by a request arrow and a monitor arrow; { } next to P1
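The steps above can be played through in a small simulation. This is a Python sketch, not the talk's Erlang code; the classes `Client` (for P1) and `Worker` (for P2), the `monitoring` flag and the task names are all hypothetical. It also assumes that the `{ }` in the figure stands for P1's set of memorized requests, and that the restarted P2 starts with empty state. In Erlang, the monitor would be created with `erlang:monitor/2` and the failure would arrive as a `'DOWN'` message.

```python
class Worker:
    """Toy stand-in for P2. Its state is lost when it crashes."""

    def __init__(self):
        self.alive = True
        self.requests = []

    def handle(self, request):
        if not self.alive:
            raise RuntimeError("worker is down")
        self.requests.append(request)


class Client:
    """Toy stand-in for P1."""

    def __init__(self, worker):
        self.worker = worker
        self.pending = set()     # memorized requests (assumed meaning of "{ }")
        self.monitoring = False  # models whether a monitor on P2 exists

    def request(self, req):
        self.worker.handle(req)  # P1 sends request to P2
        self.monitoring = True   # P1 creates monitor on P2
        self.pending.add(req)    # P1 memorizes request

    def on_reply(self, req):
        # A reply arrived, so the request no longer needs replaying.
        self.pending.discard(req)

    def on_down(self, restarted_worker):
        # The monitor reported the failure and P2 has been restarted.
        self.worker = restarted_worker
        for req in sorted(self.pending):
            self.worker.handle(req)  # request is replayed to P2
        self.monitoring = True       # monitor is recreated


def scenario():
    p2 = Worker()
    p1 = Client(p2)
    p1.request("task-1")
    p1.request("task-2")
    p1.on_reply("task-1")    # task-1 finished before the crash

    p2.alive = False         # P2 fails
    p1.monitoring = False    # a monitor fires once and is then gone
    restarted = Worker()     # P2's supervisor restarts P2 (fresh state)
    p1.on_down(restarted)
    return p1, restarted
```

Only the unanswered request is replayed, which is why memorizing requests on the sender side lets P2 be restarted without restarting the whole workflow.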

Generic process behaviour

P1 may fail:
    P1 sends request to P2
    P2 creates monitor on P1
    P1 fails
    request is canceled
    supervisor restarts P1

Figure: P1 and P2
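The mirror case can be sketched the same way. Again this is a hypothetical Python simulation rather than the Erlang implementation; the `Worker` class, the client ids and the task names are assumptions. Here P2 monitors every client it accepts a request from and drops that client's queued requests when the monitor reports the client's failure.

```python
class Worker:
    """Toy stand-in for P2: tracks queued requests per client."""

    def __init__(self):
        self.queue = {}         # client id -> list of queued requests
        self.monitored = set()  # client ids P2 currently monitors

    def on_request(self, client_id, req):
        self.monitored.add(client_id)  # P2 creates monitor on P1
        self.queue.setdefault(client_id, []).append(req)

    def on_down(self, client_id):
        # The monitor reports that the client died. Nobody is waiting
        # for these results any more, so the requests are canceled.
        self.monitored.discard(client_id)
        return self.queue.pop(client_id, [])


def scenario():
    p2 = Worker()
    p2.on_request("P1", "task-1")      # P1 sends request to P2
    p2.on_request("other", "task-9")   # an unrelated client
    canceled = p2.on_down("P1")        # P1 fails; request is canceled
    # Restarting P1 is the job of P1's supervisor and does not involve
    # P2; in this sketch the restarted P1 has no pending requests.
    return p2, canceled
```

Work belonging to other clients is untouched, so a failing requester costs only its own outstanding requests.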

Conclusion

Cuneiform:
    Functional
    Integrate anything
    Parallelism
    Runs on Hadoop

Implementation in Erlang:
    Concise stateless semantics
    Fine-grained fault tolerance

Figure: workflow graph with the tasks samtools-sort, samtools-faidx, varscan, gunzip, bowtie2-build, samtools-view, untar, annovar, bowtie2-align, samtools-mpileup

https://github.com/joergen7/cre
