ACAT 2019
14.03.2019
Design Pattern for Analysis Automation on Interchangeable,
Distributed Resources using Luigi Analysis Workflows .
Marcel Rieger, Martin Erdmann
lawluigi analysis workflow
Marcel Rieger - 14.3.19 2 Motivational questions
● Portability: Does the analysis depend on ...
■ where it runs?
■ where it stores data?
▻ Execution/storage should not dictate code design!
● Reproducibility: When a M.Sc. / PhD / Postdoc leaves, ...
■ can someone else run the analysis?
■ is there a loss of information? Is a new framework required?
▻ Dependencies often only exist in the physicists head!
● Preservation: After an analysis is published ...
■ are people investing time to preserve their work?
■ can it be repeated after O(years)?
▻ Daily working environment should provide preservation features out-of-the-box!
WLCG
?
?
?
?
Marcel Rieger - 14.3.19 3 Landscape of HEP analyses
● Scale: measure of resource consumption and amount of data
● Complexity: measure of granularity and inhomogeneity of workloads
● Future analyses likely to be large and complex,
bottlenecks:
■ Undocumented structure & requirements between workloads, only exists in the physicist’s head
■ Bookkeeping of data, revisions, …
■ Manual execution/steering of jobs
■ Error-prone & time-consuming
→ Analysis workflow management essential for future measurements!
ScaleCom
plex
ity
Computing infrastructures
(WLCG)
Good scripts & code structure
Single machine, single command
Analysis workflow
management
Marcel Rieger - 14.3.19 4 Abstraction: analysis workflows
● Workflow, decomposable into particular workloads
● Workloads related to each other by common interface
■ In/outputs define directed acyclic graph (DAG)
● Alter default behavior via parameters
● Computing resources
■ Run location (CPU, GPU, WLCG, …)
■ Storage location (local, dCache, EOS, …)
● Software environment
● Collaborative development and processing
● Reproducible intermediate and final results
Selection
Reconstruction
MVA Split
MVA MVA Evaluation
Inference
MVA Training
Weights
Example
CPU
GPU
→ Reads like a checklist for analysis workflow management
Marcel Rieger - 14.3.19 5 Example: ttbb cross section measurement
vispa
e_bdt_eval
e_bdt_roc
e_hist
e_syst_bdt_plot_btag
e_syst_bdt_plot_mistag
e_syst_bdt_plot_jer
e_syst_bdt_plot_jes
e_syst_bdt_plot_pdf
e_syst_bdt_plot_pileup
mu_data_cms
mu_data_lumi
mu_data_merge
sfb_coefficients
mu_data_tuple
e_data_tuple
mu_tuple
e_tuple
mu_enrich_fill
mu_enrich_old_weighter
mu_syst_scaleUp_tuple
mu_syst_scaleDown_tuple
e_syst_scaleUp_bdt
e_syst_scaleDown_bdt
mu_syst_matchingUp_tuple
mu_syst_matchingDown_tuple
e_syst_matchingUp_bdt
e_syst_matchingDown_bdt
mu_syst_btagUp_tuple
mu_syst_btagDown_tuple
e_syst_btagUp_bdt
e_syst_btagDown_bdt
mu_syst_mistagUp_tuple
mu_syst_mistagDown_tuple
e_syst_mistagUp_bdt
e_syst_mistagDown_bdt
mu_syst_jerUp_tuple
mu_syst_jerDown_tuple
e_syst_jerUp_bdt
e_syst_jerDown_bdt
mu_syst_jesUp_tuple
mu_syst_jesDown_tuple
e_syst_jesUp_bdt
e_syst_jesDown_bdt
mu_syst_pdfUp_tuple
mu_syst_pdfDown_tuple
e_syst_pdfUp_bdt
e_syst_pdfDown_bdt
mu_syst_pileupUp_tuple
mu_syst_pileupDown_tuple
e_syst_pileupUp_bdt
e_syst_pileupDown_bdt
mu_qcd_tuple
sfb_plot_coefficients
pu_weight pu_weight_plot
e_data_cms
e_data_lumi
e_data_mergee_syst_btag_hist
e_syst_mistag_hist
e_syst_jer_hist
e_syst_jes_hist
e_syst_pdf_hist
e_syst_pileup_hist
e_syst_scale_hist
e_syst_matching_hist
e_plot
mu_chain
crab_mc
sfb_tuple
crab_mc_merge
crab_data crab_data_lumi
mu_enrich_old_chain
mu_syst_scaleUp_chain
mu_syst_scaleDown_chain
mu_syst_matchingUp_chain
mu_syst_matchingDown_chain
mu_syst_btagUp_chain
mu_syst_btagDown_chain
mu_syst_mistagUp_chain
mu_syst_mistagDown_chain
mu_syst_jerUp_chain
mu_syst_jerDown_chain
mu_syst_jesUp_chain
mu_syst_jesDown_chain
mu_syst_pdfUp_chain
mu_syst_pdfDown_chain
mu_syst_pileupUp_chain
mu_syst_pileupDown_chain
crab_data_merge
mu_qcd_chain
mc_nominal_merge
mu_bdt_train__ttbb_vs_all__nominal
mu_bdt_train__ttbb_vs_all__enriched
mu_bdt_train__ttb_vs_ttbb__nominal
mu_bdt_train__ttbb_vs_ttb__nominal
mu_bdt_train__ttlight_vs_rest__nominal
mu_hist
mu_enrichment_comparison_plot
mu_bdt_eval
mu_syst_bdt_plot_pileup
mu_plot
mu_plot_no_toppt
mu_plot_mcOnly
mu_plot_mc_2d
mu_plot_mcOnly_ttlight
mu_plot_events
sfb_chain
mu_syst_scaleUp_bdt
mu_syst_scaleDown_bdt
mu_syst_matchingUp_bdt
mu_syst_matchingDown_bdt
mu_syst_btagUp_bdt
mu_syst_btagDown_bdt
mu_syst_mistagUp_bdt
mu_syst_mistagDown_bdt
mu_syst_jerUp_bdt
mu_syst_jerDown_bdt
mu_syst_jesUp_bdt
mu_syst_jesDown_bdt
mu_syst_pdfUp_bdt
mu_syst_pdfDown_bdt
mu_syst_pileupUp_bdt
mu_syst_pileupDown_bdt
mu_qcd_bdt
mu_bdt_train__ttbb_vs_all__nominal_plot
mu_bdt_train__ttbb_vs_all__enriched_plot
mu_bdt_train__ttb_vs_ttbb__nominal_plot
mu_bdt_train__ttbb_vs_ttb__nominal_plot
mu_bdt_train__ttlight_vs_rest__nominal_plot
e_bdt_roc_plot
th_combined
th_batch_prep
mc_scale_merge
mc_matching_merge
mu_syst_plot_pdf_comparison
mu_syst_plot_pdf_stack
mu_syst_plot_pdf_shape
mu_qcd_fit
mu_qcd_ratio
mu_qcd_scatter
mu_qcd_plot_mc_2d
th_qcd
mu_syst_btag_hist
mu_syst_mistag_hist
mu_syst_jer_hist
mu_syst_jes_hist
mu_syst_pdf_hist
mu_syst_pileup_hist
mu_syst_scale_hist
mu_syst_matching_hist
mu_qcd_hist_sr
mu_qcd_hist_sb_nominal
mu_qcd_hist_sb_variation
mc_nominal_das mc_nominal_cms
mc_ttbaralt_das mc_ttbaralt_cms mc_ttbaralt_merge
mc_scale_das mc_scale_cms
mc_matching_das mc_matching_cms
mu_data_das
e_data_das
pu_calc
mu_syst_bdt_plot_scale
mu_syst_bdt_plot_matching
mu_syst_bdt_plot_mistag
mu_syst_bdt_plot_jer
mu_syst_bdt_plot_jes
mu_syst_bdt_plot_pdf
mu_bdt_plot_output
mu_bdt_roc
mu_qcd_stack
mu_qcd_comparison
mu_bdt_roc_plot
sfb_plot_control
Robert Fischer
Marcel Rieger - 14.3.19 5 Example: ttbb cross section measurement
vispa
e_bdt_eval
e_bdt_roc
e_hist
e_syst_bdt_plot_btag
e_syst_bdt_plot_mistag
e_syst_bdt_plot_jer
e_syst_bdt_plot_jes
e_syst_bdt_plot_pdf
e_syst_bdt_plot_pileup
mu_data_cms
mu_data_lumi
mu_data_merge
sfb_coefficients
mu_data_tuple
e_data_tuple
mu_tuple
e_tuple
mu_enrich_fill
mu_enrich_old_weighter
mu_syst_scaleUp_tuple
mu_syst_scaleDown_tuple
e_syst_scaleUp_bdt
e_syst_scaleDown_bdt
mu_syst_matchingUp_tuple
mu_syst_matchingDown_tuple
e_syst_matchingUp_bdt
e_syst_matchingDown_bdt
mu_syst_btagUp_tuple
mu_syst_btagDown_tuple
e_syst_btagUp_bdt
e_syst_btagDown_bdt
mu_syst_mistagUp_tuple
mu_syst_mistagDown_tuple
e_syst_mistagUp_bdt
e_syst_mistagDown_bdt
mu_syst_jerUp_tuple
mu_syst_jerDown_tuple
e_syst_jerUp_bdt
e_syst_jerDown_bdt
mu_syst_jesUp_tuple
mu_syst_jesDown_tuple
e_syst_jesUp_bdt
e_syst_jesDown_bdt
mu_syst_pdfUp_tuple
mu_syst_pdfDown_tuple
e_syst_pdfUp_bdt
e_syst_pdfDown_bdt
mu_syst_pileupUp_tuple
mu_syst_pileupDown_tuple
e_syst_pileupUp_bdt
e_syst_pileupDown_bdt
mu_qcd_tuple
sfb_plot_coefficients
pu_weight pu_weight_plot
e_data_cms
e_data_lumi
e_data_mergee_syst_btag_hist
e_syst_mistag_hist
e_syst_jer_hist
e_syst_jes_hist
e_syst_pdf_hist
e_syst_pileup_hist
e_syst_scale_hist
e_syst_matching_hist
e_plot
mu_chain
crab_mc
sfb_tuple
crab_mc_merge
crab_data crab_data_lumi
mu_enrich_old_chain
mu_syst_scaleUp_chain
mu_syst_scaleDown_chain
mu_syst_matchingUp_chain
mu_syst_matchingDown_chain
mu_syst_btagUp_chain
mu_syst_btagDown_chain
mu_syst_mistagUp_chain
mu_syst_mistagDown_chain
mu_syst_jerUp_chain
mu_syst_jerDown_chain
mu_syst_jesUp_chain
mu_syst_jesDown_chain
mu_syst_pdfUp_chain
mu_syst_pdfDown_chain
mu_syst_pileupUp_chain
mu_syst_pileupDown_chain
crab_data_merge
mu_qcd_chain
mc_nominal_merge
mu_bdt_train__ttbb_vs_all__nominal
mu_bdt_train__ttbb_vs_all__enriched
mu_bdt_train__ttb_vs_ttbb__nominal
mu_bdt_train__ttbb_vs_ttb__nominal
mu_bdt_train__ttlight_vs_rest__nominal
mu_hist
mu_enrichment_comparison_plot
mu_bdt_eval
mu_syst_bdt_plot_pileup
mu_plot
mu_plot_no_toppt
mu_plot_mcOnly
mu_plot_mc_2d
mu_plot_mcOnly_ttlight
mu_plot_events
sfb_chain
mu_syst_scaleUp_bdt
mu_syst_scaleDown_bdt
mu_syst_matchingUp_bdt
mu_syst_matchingDown_bdt
mu_syst_btagUp_bdt
mu_syst_btagDown_bdt
mu_syst_mistagUp_bdt
mu_syst_mistagDown_bdt
mu_syst_jerUp_bdt
mu_syst_jerDown_bdt
mu_syst_jesUp_bdt
mu_syst_jesDown_bdt
mu_syst_pdfUp_bdt
mu_syst_pdfDown_bdt
mu_syst_pileupUp_bdt
mu_syst_pileupDown_bdt
mu_qcd_bdt
mu_bdt_train__ttbb_vs_all__nominal_plot
mu_bdt_train__ttbb_vs_all__enriched_plot
mu_bdt_train__ttb_vs_ttbb__nominal_plot
mu_bdt_train__ttbb_vs_ttb__nominal_plot
mu_bdt_train__ttlight_vs_rest__nominal_plot
e_bdt_roc_plot
th_combined
th_batch_prep
mc_scale_merge
mc_matching_merge
mu_syst_plot_pdf_comparison
mu_syst_plot_pdf_stack
mu_syst_plot_pdf_shape
mu_qcd_fit
mu_qcd_ratio
mu_qcd_scatter
mu_qcd_plot_mc_2d
th_qcd
mu_syst_btag_hist
mu_syst_mistag_hist
mu_syst_jer_hist
mu_syst_jes_hist
mu_syst_pdf_hist
mu_syst_pileup_hist
mu_syst_scale_hist
mu_syst_matching_hist
mu_qcd_hist_sr
mu_qcd_hist_sb_nominal
mu_qcd_hist_sb_variation
mc_nominal_das mc_nominal_cms
mc_ttbaralt_das mc_ttbaralt_cms mc_ttbaralt_merge
mc_scale_das mc_scale_cms
mc_matching_das mc_matching_cms
mu_data_das
e_data_das
pu_calc
mu_syst_bdt_plot_scale
mu_syst_bdt_plot_matching
mu_syst_bdt_plot_mistag
mu_syst_bdt_plot_jer
mu_syst_bdt_plot_jes
mu_syst_bdt_plot_pdf
mu_bdt_plot_output
mu_bdt_roc
mu_qcd_stack
mu_qcd_comparison
mu_bdt_roc_plot
sfb_plot_control
vispa
e_bdt_eval
e_bdt_roc
e_hist
e_syst_bdt_plot_btag
e_syst_bdt_plot_mistag
e_syst_bdt_plot_jer
e_syst_bdt_plot_jes
e_syst_bdt_plot_pdf
e_syst_bdt_plot_pileup
mu_data_cms
mu_data_lumi
mu_data_merge
sfb_coefficients
mu_data_tuple
e_data_tuple
mu_tuple
e_tuple
mu_enrich_fill
mu_enrich_old_weighter
mu_syst_scaleUp_tuple
mu_syst_scaleDown_tuple
e_syst_scaleUp_bdt
e_syst_scaleDown_bdt
mu_syst_matchingUp_tuple
mu_syst_matchingDown_tuple
e_syst_matchingUp_bdt
e_syst_matchingDown_bdt
mu_syst_btagUp_tuple
mu_syst_btagDown_tuple
e_syst_btagUp_bdt
e_syst_btagDown_bdt
mu_syst_mistagUp_tuple
mu_syst_mistagDown_tuple
e_syst_mistagUp_bdt
e_syst_mistagDown_bdt
mu_syst_jerUp_tuple
mu_syst_jerDown_tuple
e_syst_jerUp_bdt
e_syst_jerDown_bdt
mu_syst_jesUp_tuple
mu_syst_jesDown_tuple
e_syst_jesUp_bdt
e_syst_jesDown_bdt
mu_syst_pdfUp_tuple
mu_syst_pdfDown_tuple
e_syst_pdfUp_bdt
e_syst_pdfDown_bdt
mu_syst_pileupUp_tuple
mu_syst_pileupDown_tuple
e_syst_pileupUp_bdt
e_syst_pileupDown_bdt
mu_qcd_tuple
sfb_plot_coefficients
pu_weight pu_weight_plot
e_data_cms
e_data_lumi
e_data_mergee_syst_btag_hist
e_syst_mistag_hist
e_syst_jer_hist
e_syst_jes_hist
e_syst_pdf_hist
e_syst_pileup_hist
e_syst_scale_hist
e_syst_matching_hist
e_plot
mu_chain
crab_mc
sfb_tuple
crab_mc_merge
crab_data crab_data_lumi
mu_enrich_old_chain
mu_syst_scaleUp_chain
mu_syst_scaleDown_chain
mu_syst_matchingUp_chain
mu_syst_matchingDown_chain
mu_syst_btagUp_chain
mu_syst_btagDown_chain
mu_syst_mistagUp_chain
mu_syst_mistagDown_chain
mu_syst_jerUp_chain
mu_syst_jerDown_chain
mu_syst_jesUp_chain
mu_syst_jesDown_chain
mu_syst_pdfUp_chain
mu_syst_pdfDown_chain
mu_syst_pileupUp_chain
mu_syst_pileupDown_chain
crab_data_merge
mu_qcd_chain
mc_nominal_merge
mu_bdt_train__ttbb_vs_all__nominal
mu_bdt_train__ttbb_vs_all__enriched
mu_bdt_train__ttb_vs_ttbb__nominal
mu_bdt_train__ttbb_vs_ttb__nominal
mu_bdt_train__ttlight_vs_rest__nominal
mu_hist
mu_enrichment_comparison_plot
mu_bdt_eval
mu_syst_bdt_plot_pileup
mu_plot
mu_plot_no_toppt
mu_plot_mcOnly
mu_plot_mc_2d
mu_plot_mcOnly_ttlight
mu_plot_events
sfb_chain
mu_syst_scaleUp_bdt
mu_syst_scaleDown_bdt
mu_syst_matchingUp_bdt
mu_syst_matchingDown_bdt
mu_syst_btagUp_bdt
mu_syst_btagDown_bdt
mu_syst_mistagUp_bdt
mu_syst_mistagDown_bdt
mu_syst_jerUp_bdt
mu_syst_jerDown_bdt
mu_syst_jesUp_bdt
mu_syst_jesDown_bdt
mu_syst_pdfUp_bdt
mu_syst_pdfDown_bdt
mu_syst_pileupUp_bdt
mu_syst_pileupDown_bdt
mu_qcd_bdt
mu_bdt_train__ttbb_vs_all__nominal_plot
mu_bdt_train__ttbb_vs_all__enriched_plot
mu_bdt_train__ttb_vs_ttbb__nominal_plot
mu_bdt_train__ttbb_vs_ttb__nominal_plot
mu_bdt_train__ttlight_vs_rest__nominal_plot
e_bdt_roc_plot
th_combined
th_batch_prep
mc_scale_merge
mc_matching_merge
mu_syst_plot_pdf_comparison
mu_syst_plot_pdf_stack
mu_syst_plot_pdf_shape
mu_qcd_fit
mu_qcd_ratio
mu_qcd_scatter
mu_qcd_plot_mc_2d
th_qcd
mu_syst_btag_hist
mu_syst_mistag_hist
mu_syst_jer_hist
mu_syst_jes_hist
mu_syst_pdf_hist
mu_syst_pileup_hist
mu_syst_scale_hist
mu_syst_matching_hist
mu_qcd_hist_sr
mu_qcd_hist_sb_nominal
mu_qcd_hist_sb_variation
mc_nominal_das mc_nominal_cms
mc_ttbaralt_das mc_ttbaralt_cms mc_ttbaralt_merge
mc_scale_das mc_scale_cms
mc_matching_das mc_matching_cms
mu_data_das
e_data_das
pu_calc
mu_syst_bdt_plot_scale
mu_syst_bdt_plot_matching
mu_syst_bdt_plot_mistag
mu_syst_bdt_plot_jer
mu_syst_bdt_plot_jes
mu_syst_bdt_plot_pdf
mu_bdt_plot_output
mu_bdt_roc
mu_qcd_stack
mu_qcd_comparison
mu_bdt_roc_plot
sfb_plot_control
Robert Fischer
Marcel Rieger - 14.3.19 6
● Python package for building complex pipelines
● Development started at Spotify, now open-source and community-driven
1. Workloads defined as Task classes
2. Tasks require other tasks & output Targets
3. Parameters customize tasks and
control behavior
● Web interface, error handling, command line tools, task history, collaborative features, …
● github.com/spotify/luigi
Building blocks
Marcel Rieger - 14.3.19 7 Luigi in a nutshell
> python reco.py Reconstruction --dataset ttJets
Marcel Rieger - 14.3.19 8 make-like execution system
● Luigi’s execution model is make-like
1. Create dependency tree for triggered task
2. Determine tasks to actually run:
- Walk through tree (top-down)
- For each path, stop when all output targets of a task exist
● Only processes what is really necessary
● Error handling & automatic re-scheduling
● Clear & scalable through simple structure
triggered task
required task
dependency
Marcel Rieger - 14.3.19 9 Example trees
Marcel Rieger - 14.3.19 9 Example trees
Work of a B.Sc. student after 2 weeks ❗
lawluigi analysis workflow
Marcel Rieger - 14.3.19 11 law - luigi analysis workflow
● law: layer on top of luigi (i.e. it does not replace luigi)
● Software design follows 2 primary goals:
1. Scalability on HEP infrastructure (but not limited to)
2. Decoupling of run locations, storage locations & software environments
▻ No fixation on dedicated resources ▻ All components interchangeable
● Provides a toolbox to follow an analysis design pattern
■ No constraint on language or data structures
→ Not a framework!
lawluigi analysis workflow
Run location
Analysis
Storage location
Software environment
Marcel Rieger - 14.3.19 12 law features (1)
1. Job submission
■ Idea: submission built into tasks, no need to write extra code
■ Currently supported job systems: HTCondor, LSF, gLite, ARC, (CRAB)
▻ Backend not hard-coded, selectable at runtime
■ Mandatory features
▻ Automatic resubmission, dashboard interface
■ From the htcondor_at_cern example:
lawluigi analysis workflow
lxplus129:law_test > law run CreateChars --version v1 --poll-interval 0.5 --workflow htcondorINFO: [pid 30564] Worker Worker(host=lxplus129.cern.ch, username=mrieger) running CreateChars(branch=-1, start_branch=0, end_branch=26, version=v1)going to submit 26 htcondor job(s)submitted 1/26 job(s)submitted 26/26 job(s)14:35:40: all: 26, pending: 26 (+26), running: 0 (+0), finished: 0 (+0), retry: 0 (+0), failed: 0 (+0)...14:37:10: all: 26, pending: 0 (+0), running: 26 (+26), finished: 0 (+0), retry: 0 (+0), failed: 0 (+0)14:37:40: all: 26, pending: 0 (+0), running: 10 (-16), finished: 16 (+16), retry: 0 (+0), failed: 0 (+0)14:38:10: all: 26, pending: 0 (+0), running: 0 (+0), finished: 26 (+10), retry: 0 (+0), failed: 0 (+0)INFO: [pid 30564] Worker Worker(host=lxplus129.cern.ch, username=mrieger) done!
lxplus129:law_test >
Marcel Rieger - 14.3.19 13 law features (2)
2. Remote targets
■ Idea: work with remote files as if they were local
■ Remote targets built on top of GFAL2 Python bindings
▻ Supports all WLCG protocols (dCache, XRootD, GridFTP, SRM, ...) + DropBox
▻ API identical to local targets
■ Mandatory features
▻ Automatic retries, local caching
■ Example: working with files on EOS
lawluigi analysis workflow
“FileSystem” configuration
● Base path prefixed to all paths using this “fs”
● Configurable per file operation (stat, listdir, ...)
● Protected against removal of directories above
Marcel Rieger - 14.3.19 13 law features (2)
2. Remote targets
■ Idea: work with remote files as if they were local
■ Remote targets built on top of GFAL2 Python bindings
▻ Supports all WLCG protocols (dCache, XRootD, GridFTP, SRM, ...) + DropBox
▻ API identical to local targets
■ Mandatory features
▻ Automatic retries, local caching
■ Example: working with files on EOS
lawluigi analysis workflow
“FileSystem” configuration
● Base path prefixed to all paths using this “fs”
● Configurable per file operation (stat, listdir, ...)
● Protected against removal of directories above
Reading remote files (json)
Marcel Rieger - 14.3.19 13 law features (2)
2. Remote targets
■ Idea: work with remote files as if they were local
■ Remote targets built on top of GFAL2 Python bindings
▻ Supports all WLCG protocols (dCache, XRootD, GridFTP, SRM, ...) + DropBox
▻ API identical to local targets
■ Mandatory features
▻ Automatic retries, local caching
■ Example: working with files on EOS
lawluigi analysis workflow
“FileSystem” configuration
● Base path prefixed to all paths using this “fs”
● Configurable per file operation (stat, listdir, ...)
● Protected against removal of directories above
Reading remote files (json)Conveniently reading remote files (json)
Marcel Rieger - 14.3.19 13 law features (2)
2. Remote targets
■ Idea: work with remote files as if they were local
■ Remote targets built on top of GFAL2 Python bindings
▻ Supports all WLCG protocols (dCache, XRootD, GridFTP, SRM, ...) + DropBox
▻ API identical to local targets
■ Mandatory features
▻ Automatic retries, local caching
■ Example: working with files on EOS
lawluigi analysis workflow
“FileSystem” configuration
● Base path prefixed to all paths using this “fs”
● Configurable per file operation (stat, listdir, ...)
● Protected against removal of directories above
Reading remote files (json)Conveniently reading remote files (json)Conveniently reading remote files
Marcel Rieger - 14.3.19 13 law features (2)
2. Remote targets
■ Idea: work with remote files as if they were local
■ Remote targets built on top of GFAL2 Python bindings
▻ Supports all WLCG protocols (dCache, XRootD, GridFTP, SRM, ...) + DropBox
▻ API identical to local targets
■ Mandatory features
▻ Automatic retries, local caching
■ Example: working with files on EOS
lawluigi analysis workflow
“FileSystem” configuration
● Base path prefixed to all paths using this “fs”
● Configurable per file operation (stat, listdir, ...)
● Protected against removal of directories above
Reading remote files (json)Conveniently reading remote files (json)Conveniently reading remote files
Marcel Rieger - 14.3.19 13 law features (2)
2. Remote targets
■ Idea: work with remote files as if they were local
■ Remote targets built on top of GFAL2 Python bindings
▻ Supports all WLCG protocols (dCache, XRootD, GridFTP, SRM, ...) + DropBox
▻ API identical to local targets
■ Mandatory features
▻ Automatic retries, local caching
■ Example: working with files on EOS
lawluigi analysis workflow
“FileSystem” configuration
● Base path prefixed to all paths using this “fs”
● Configurable per file operation (stat, listdir, ...)
● Protected against removal of directories above
Reading remote files (json)Conveniently reading remote files (json)Conveniently reading remote files
Marcel Rieger - 14.3.19 14 law features (3)
3. Environment sandboxing
■ Diverging software requirements between typical workloads is a great feature / challenge / problem
■ Introduce sandboxing:
▻ Run entire task in different environment
■ Existing sandbox implementations:
▻ Sub-shell with init file
▻ Docker images
▻ Singularity images
docker::imgA
docker::imgB
shell::myEnv.sh
Singularity
singularity::imgC
lawluigi analysis workflow
Marcel Rieger - 14.3.19 15 law in action
> python reco.py Reconstruction --dataset ttJets
☐ luigi task ☐ law task ☐ Run on HTCondor ☐ Store on EOS ☐ Run in docker
✔
Marcel Rieger - 14.3.19 15 law in action
> python reco.py Reconstruction --dataset ttJets > law run Reconstruction --dataset ttJets
☐ luigi task ☐ law task ☐ Run on HTCondor ☐ Store on EOS ☐ Run in docker
✔
✔
Marcel Rieger - 14.3.19 15 law in action
> python reco.py Reconstruction --dataset ttJets > law run Reconstruction --dataset ttJets > law run Reconstruction --dataset ttJets --workflow htcondor
☐ luigi task ☐ law task ☐ Run on HTCondor ☐ Store on EOS ☐ Run in docker
✔
✔
✔
Marcel Rieger - 14.3.19 15 law in action
> python reco.py Reconstruction --dataset ttJets > law run Reconstruction --dataset ttJets > law run Reconstruction --dataset ttJets --workflow htcondor
☐ luigi task ☐ law task ☐ Run on HTCondor ☐ Store on EOS ☐ Run in docker
✔
✔
✔
✔
Marcel Rieger - 14.3.19 15 law in action
> python reco.py Reconstruction --dataset ttJets > law run Reconstruction --dataset ttJets > law run Reconstruction --dataset ttJets --workflow htcondor
☐ luigi task ☐ law task ☐ Run on HTCondor ☐ Store on EOS ☐ Run in docker
✔
✔
✔
✔
✔
Marcel Rieger - 14.3.19 16 Successful applications
● ttH analysis at CMS (JHEP 03 (2019) 026)
■ Large-scale:
▻ ∼100 TB of storage, ∼500k tasks
■ Complex:
▻ DNNs/BDTs/MEM
▻ ∼80 systematic variations
■ Distributed:
▻ 7 CEs, (GPU) clusters, local machines
▻ 2 SEs (dCache), local disk, Dropbox, CERNBox
■ Clear separation of duties within group
■ Entire analysis operable by everyone at any time
● DeepCSV + DeepJet b-tagging scale factors at CMS
● Multiple theses
......
......
2.2 The tt̄H Process at Hadron Colliders
Under the assumption that the Higgs boson decay occurs perpendicular to the direction of itsmotion, the spatial angle between the two jets in the observer’s reference frame at typical mo-menta of pH = 100 GeV amounts to f ⇡ 103�. Therefore, one can estimate that in the majorityof cases the two jets exhibit a sufficiently large spatial separation, allowing for their resolvedmeasurement and identification based on features of displaced secondary vertices.
Signal processes in which the Higgs boson decays into particles other than a pair of bottomquarks, such as H ! W+W� and H ! t+t�, are taken into account in the following. Despitetheir minor expected yield due to smaller branching ratios and different final-state signature,events of those processes can potentially pass phase space selection criteria and contribute tothe total number of tt̄H signal events.
Moreover, the decay of the tt̄ system is considered in the single-lepton and dilepton decaychannels (cf. Section 2.2.3). A corresponding leading-order Feynman diagram is presented inFig. 2.9a. It should be noted that more diagrams exist to describe tt̄H production. An exampleis the production of a pair of top quarks (cf. Section 2.2.3) where one top quark emits a Higgsboson. Similarly to generic tt̄ production, gluon-initiated processes have the largest contributionto the total tt̄H cross section at
ps = 13 TeV [33].
g
g
b̄n̄
b̄
q̄0, l+
t
t
t̄H
t̄ W�
W+
b
b
l�
q, n
(a)
g
g
b̄n̄
b̄
q̄0, l+
t
t
t̄g
t̄W�
W+
b
b
l�
q, n
(b)
Figure 2.9: Feynman diagrams showing tt̄H (H ! bb̄) (a) and tt̄+bb̄ production (b) in thesingle-lepton and dilepton tt̄ decay channels. Their final state is identical and despite differ-ent spin and color charge relations, the event topology is quite similar. It should be noted thatmore possible diagrams exist for both processes.
In total, the measurable final state consists of six jets and an isolated lepton in the single-lepton,and four jets and two isolated leptons with opposite charge in the dilepton channel, respectively.In both cases, four jets are supposed to originate from b-hadron decays and a significant amountof missing transverse energy is expected due to the non-detectable neutrinos. Given the highcombinatorial complexity due the number of jets and the typical detector resolution of jet ob-servables, the full reconstruction of the event is rather challenging. The net cross section is
stt̄H,bb̄,SL+DL = stt̄H · BRH!bb̄ · BRtt̄,SL+DL = 98.4 +6.9�9.9 fb, (2.50)
which corresponds to ⇠ 3500 produced events in the dataset recorded by the CMS detector in2016 with an integrated luminosity of 35.9 fb�1.
23
Marcel Rieger - 14.3.19 17 Summary
● HEP analyses likely to increase in scale and complexity
■ Analysis workflow management essential
■ Need for toolbox providing a design pattern, not a framework
● Luigi is able to model even complex workflows
● Law adds convenience & scalability in the HEP context
● All information transparently encoded in tasks, targets & dependencies
● Aim for out-of-the-box preservation
● github.com/riga/law, law.readthedocs.io
{Singularity
WLCGlawluigi analysis workflowSoftware
environment
Run location
Storage location
Backup
Marcel Rieger - 14.3.19 19 Links
● law - luigi analysis workflow
■ Repository ☞ github.com/riga/law
■ Paper ☞ arXiv:1706.00955 (CHEP16 proceedings)
■ Documentation ☞ law.readthedocs.io (in preparation)
■ Minimal example ☞ github.com/riga/law/tree/master/examples/loremipsum
■ HTCondor example ☞ github.com/riga/law/tree/master/examples/htcondor_at_cern
■ Contact ☞ Marcel Rieger
● luigi - Powerful Python pipelining package (by Spotify)
■ Repository ☞ github.com/spotify/luigi
■ Documentation ☞ luigi.readthedocs.io
■ “Hello world!” ☞ github.com/spotify/luigi/blob/master/examples/hello_world.py
● Technologies
■ GFAL2 ☞ dmc.web.cern.ch/projects/gfal-2/home
■ Docker ☞ docker.com
■ Singularity ☞ singularity.lbl.gov
Marcel Rieger - 14.3.19 20 order: structure external HEP data
● Pythonic class collection to order “soft”, external HEP data
■ physics processes & cross sections
■ campaigns & datasets
■ channels & categories
■ variables & systematics
● Some data could be centrally managed, some is analysis specific
● Run the example:
● Use as data backend: > law run Reconstruction --dataset ttH125_bb --...
github.com/riga/order
Marcel Rieger - 14.3.19 21 Thoughts on HEP analyses
● What is a framework?
→ Bash scripts, python tools, crab configs, CMSSW modules, magic
→ Connections mostly exist in the physicists head
● Documentation?
→ Not the most beloved hobby in the physics community
● When a M.Sc. / PhD / Postdoc leaves ...
→ Can someone else run the analysis?
→ Is this information lost? Is a new framework required?
● Does execution dictate code design?
→ Does the analysis depend on where it runs?
● From my experience: ⅔ of time required for technicalities, ⅓ for physics
→ Physics output doubled if it was the other way round?
Marcel Rieger - 14.3.19 22 Existing WMS: MC production
● Structure known in advance
● Workflows static & recurring
● One-dimensional design
● Special infrastructures
● Homogeneous software requirements
Generator Showering Simulation Digitization Reco
Tailored systems
→ Requirements for HEP analyses mostly orthogonal
Marcel Rieger - 14.3.19 22 Existing WMS: MC production
● Structure known in advance
● Workflows static & recurring
● One-dimensional design
● Special infrastructures
● Homogeneous software requirements
Generator Showering Simulation Digitization Reco
Tailored systems Wishlist for end-user analyses
● Structure “iterative”, a-priori unknown
● Dynamic workflows, fast R&D cycles
● Tree design, arbitrary dependencies
● Incorporate existing infrastructure
● Use custom software, everywhere
→ Requirements for HEP analyses mostly orthogonal
Marcel Rieger - 14.3.19 23 WMS comparison
→ Existing WMS highly specialized for designated use case→ Requirements for HEP analyses mostly orthogonal
Existing WMSe.g. MC Management Generic Analysis WMS
Development Processfinal objective
known in advanceiterative, final composition
a priori unknown
Workflow Structurechain structure,
mostly one-dimensionaltree structure,
arbitrarily branched
Evolutionstatic over time,
recurrent executiondynamic,
fast R&D cycles
Infrastructurespecially tailored,
e.g. storage systems, DBsincorporate existing,
quickly adapt to changes
Applicability tuned to particular use caseflexible, able to model every possible workflow
Marcel Rieger - 14.3.19 24 Achievements
1. Toolbox providing building blocks for analyses
→ Design pattern, not a framework (no constraint on language or data structure)
→ Full decoupling of run locations, storage locations and software environments
2. All information transparently encoded in tasks, targets & dependencies
→ Results reproducible by developer, groups, collaboration, ...
→ Analysis preservation out-of-the-box
3. make-like execution across distributed resources
→ Reduces overhead of manual management
→ Improves cycle times & error-proneness
→ Changed paradigm from executing to defining an analysis → Move focus back to physics
Marcel Rieger - 14.3.19 25 A typical example: ML workflow with uncertainties
Reconstruction
MVA Split
MVA MVA Evaluation
Inference
MVA Training
...
...
train test evaluate
weights
Nominal MC
Marcel Rieger - 14.3.19 25 A typical example: ML workflow with uncertainties
Reconstruction
MVA Split
MVA MVA Evaluation
Inference
MVA Training
...
...
train test evaluate
weights
real data
Nominal MCData
Marcel Rieger - 14.3.19 25 A typical example: ML workflow with uncertainties
Reconstruction
MVA Split
MVA MVA Evaluation
Inference
MVA Training
...
...
train test evaluate
weightsMC with systematic derived from nominal
sample
Nominal MCDataMC, Syst. I
Marcel Rieger - 14.3.19 25 A typical example: ML workflow with uncertainties
Reconstruction
MVA Split
MVA MVA Evaluation
Inference
MVA Training
...
...
train test evaluate
weights
MC with systematic generated from
new events
Nominal MCDataMC, Syst. IMC, Syst. II
Marcel Rieger - 14.3.19 26 luigi/law architecture
Load .dependencies
Task Tree(Workers)
Network Local Remote
User
Central Scheduler
Analysis & Task Classes
Input / OutputTargets
Workers Software & Images
Command-lineInterface
Register Tasks
Next task?
Read
LoadSubmit as job
Poll status
Write Read Write
1
Marcel Rieger - 14.3.19
Scenario A: file not cached yet
27 Local caching
Remote storage (e.g. eos)
Remote
Local machine
law/python process Local cache
PWD /tmp
" Need to access file “a.root” (has unique, path-dep. hash X)
Local requestRemote request
# File “a.root” with hash X in cache with latest mtime? → no
$ S
tat fil
e “a
.root
”%
Download “a.root”
& Return local path in cache
' Store “a.root” using hash X
( Work with local file ) Change mtime of file to value from stat (see $ )
Marcel Rieger - 14.3.19
Scenario B: file already cached
28 Local caching
Remote storage (e.g. eos)
Remote
Local machine
law/python process Local cache
PWD /tmp
" Need to access file “a.root” (has unique, path-dep. hash X)
Local requestRemote request
# File “a.root” with hash X in cache with latest mtime? → yes
$ S
tat fil
e “a
.root
”
% Return local path in cache' Work with local file
Marcel Rieger - 14.3.19 29 CLI