
Prologue Taxonomy PROVA! Conclusions

Achieve Reproducible Research with PROVA!: Performance Reproduction of Various Applications

Helmar Burkhart, Antonio Maffia, Danilo Guerrera

University of Basel

[email protected]

Friedrich-Alexander-Universität Erlangen-Nürnberg

Tue, Oct 26, 2015

Dilemma of Experimental Science

"Trouble at the lab. Scientists like to think of science as self-correcting. To an alarming degree, it is not." [The Economist, October 2013]

Psychology: 9 separate experiments could not verify a famous study in the field.

Cancer research: out of 53 studies, only 6 could be reproduced.

Pharma research: only 25% of 67 seminal studies could be verified.

Reproducibility - A Science Principle

"Non-reproducible single occurrences are of no significance to science." (Karl Popper, The Logic of Scientific Discovery, 1934/1959)

How Do Computational Disciplines Perform?

Algorithmic Engineering: J. of Experimental Algorithmics, www.jea.acm.org
  Authors are asked to make every effort to simplify the verification process.

Programming Languages & Methodologies: http://evaluate.inf.usi.ch
  Canon (What is good science?); Artifacts (Software, Data): Evaluation and Certificate.

Artificial Intelligence: http://recomputation.org
  Manifesto ("If we can compute your experiment now, anyone can recompute it 20 years from now"); Virtual Machine ("the only way"; runtime performance is a secondary issue).

Computational Sciences: Tools Available
  Workbench Approach: Taverna, VisTrails, Kepler, KNIME
  Version Control Approach: Sumatra, Madagascar
  Virtualization Approach: CDE, Emulab

Good ad-hoc efforts, but none is addressing Performance!

Problem / Method / System

A computational problem is solved by an algorithmic method on a compute system.

Problem: specification of the problem, including characteristic parameters.

Method: description of the algorithmic approach used to tackle the problem.

System: representation of the compute environment (both hardware and software) on which an experiment is run.

Micro- and Macro-Experiments

[Figure: the Problem / Method / System space. For a fixed problem (Pro1) and system (Sys1), each method (Met1, Met2, Met3) defines one micro-experiment (micro-experiments 1 to 3); taken together, the micro-experiments sharing the same problem form a macro-experiment.]
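The figure's taxonomy can be sketched as a cross product: one problem and one system combined with each method yields a micro-experiment, and their union forms a macro-experiment (names taken from the figure).

```shell
#!/bin/bash
# Sketch of the experiment taxonomy: one problem and one system,
# combined with each method, yields one micro-experiment.
problem="Pro1"
system="Sys1"
methods=("Met1" "Met2" "Met3")

i=1
for m in "${methods[@]}"; do
    echo "micro-experiment $i: ($problem, $m, $system)"
    i=$((i + 1))
done
# All micro-experiments over the same problem form a macro-experiment.
echo "macro-experiment: ${#methods[@]} micro-experiments on $problem"
```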

Reproducibility Levels

Repetition: re-running the original micro- or macro-experiment without any variation of the parameters should lead to the same results; a certain level of credibility is guaranteed (completeness of documentation).

Replication: relates to the system hosting an experiment. An experiment should not be bound to a specific compute environment (portability).

Re-experimentation: if changing the methods leads to the same outputs, the scientific approach is proven (correctness of the approach).
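A minimal sketch of what a repetition check amounts to, assuming a deterministic experiment; run_experiment is a hypothetical stand-in for a real micro-experiment, not part of PROVA!:

```shell
#!/bin/bash
# Hypothetical stand-in for a micro-experiment with fixed parameters.
run_experiment() {
    echo "params: 20 20, threads: 2, result: 1.23 GFlop/s"
}

# Repetition: re-run without any variation of the parameters and
# require identical results.
out1=$(run_experiment)
out2=$(run_experiment)

if [ "$out1" = "$out2" ]; then
    status="repetition OK"
else
    status="repetition FAILED"
fi
echo "$status"
```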

Our Goals

Provide a clean and easy-to-use environment

Do not introduce overhead that affects the performance measurements

Allow on-the-fly generation of graphs for the chosen performance metric (so far, GFlop/s)

Automatic generation of the documentation of both system and software stack

Support the reproducibility levels for benchmark experiments

Provide a collaborative approach

Create an ecosystem of trusted scientific computing, starting from our field
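The GFlop/s metric mentioned above is just the operation count divided by runtime; a sketch with illustrative numbers (the matrix size and runtime are hypothetical, not measurements from the slides):

```shell
#!/bin/bash
# GFlop/s = floating-point operations / (runtime in seconds * 1e9).
N=1000                     # matrix dimension (hypothetical)
FLOP=$((2 * N * N * N))    # a dense matrix multiply costs ~2*N^3 flops
RUNTIME=0.8                # seconds (hypothetical measurement)

GFLOPS=$(awk -v f="$FLOP" -v t="$RUNTIME" 'BEGIN { printf "%.2f", f / (t * 1e9) }')
echo "GFlop/s: $GFLOPS"
```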

PROVA! - First Version of the Architecture

[Figure: users reach a front-end machine running the framework via a web browser (https) or ssh; the front-end machine drives a cluster of parallel machines and an experiment and analysis server; per-user workspaces (Workspace User1 ... Workspace UserN) reside on NFS file storage.]

Replication Workflow

Admin
  Modules: add, remove; list available and installed

Scientist
  Projects: add, remove, list
  Methods: add, remove implemented project code; list, download available local implemented example code

Network interface to manage experiments

Workflow Structure

The root directory ($BENCHROOT) contains the following folders:

$BENCHROOT
  modules_avail
    openmp
    mpi
  modules_installed
    openmp
  scripts
    compile.sh
    run.sh
    ...
  util

Workspace Structure

The root directory ($WORKSPACE) contains as many folders as the projects defined by the user:

$WORKSPACE
  Project 1
    Method 1
      src
        Makefile
        example.c
        dim_input.h
      out
        DATE-PARAMS-THREADS.out
      bin
        project
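The DATE-PARAMS-THREADS.out entry suggests one output file per run; the following is a hypothetical reconstruction of such a name (the exact format PROVA! uses is not given in the slides):

```shell
#!/bin/bash
# Hypothetical reconstruction of the DATE-PARAMS-THREADS.out scheme.
DATE=$(date +%Y-%m-%d_%H-%M-%S)
PARAMS="20-20"    # e.g. x_max=20, y_max=20
THREADS=2

OUTFILE="${DATE}-${PARAMS}-${THREADS}.out"
echo "$OUTFILE"
```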

How to install a module?

#!/bin/bash
set -e

PWD_PATH=$(pwd)
PACKAGE_NAME="mpich-3.1.2.tar.gz"
URL="http://www.mpich.org/static/downloads/3.1.2/$PACKAGE_NAME"
RELEASE="mpich-3.1.2"   # name of the folder inside the downloaded package

if [ -a "$PACKAGE_NAME" ]; then
    rm "$PACKAGE_NAME"
fi
# Download MPI
wget "$URL"
# Extract MPI
tar xf "$PACKAGE_NAME"
rm -f "$PACKAGE_NAME"
if [ -d "bin" ]; then
    rm -rf bin
fi
cd "$RELEASE"
# Configure to install in a local directory
./configure --prefix="$PWD_PATH/bin" --disable-fortran --disable-fc

NPROC=$(nproc)
if [ "$NPROC" -gt 1 ]; then
    make -j"$NPROC"
else
    make
fi
make install

Modules Management

So far, modules are managed by means of scripts:

the admin has to write the installation script and take care of dependencies

users must ask the admin to install a module before using it in their projects

Our idea is to exploit Lmod [1], an environment modules system, to provide a convenient way to dynamically change the users' environment through modulefiles. This includes easily adding directories to, or removing them from, the PATH environment variable. A modulefile contains the information necessary to allow a user to run a particular application or to access a particular library.

[1] https://www.tacc.utexas.edu/research-development/tacc-projects/lmod

Scientific Software Management

On a higher level, we plan to use EasyBuild [2], a software build and installation framework for managing (scientific) software on High Performance Computing (HPC) systems in an efficient way. It can be used to keep track of versions and to easily build a consistent software stack.

a flexible framework for building/installing (scientific) software

fully automates software builds

allows for easily reproducing previous builds

keeps the software build recipes/specifications simple and human-readable

supports co-existence of versions/builds via dedicated installation prefixes and module files

enables sharing with the HPC community

automatic dependency resolution

[2] http://hpcugent.github.io/easybuild/

Lmod + EasyBuild

Easily install new software as a module

Clean environment for all of the users

Keep track of the software installed

Possibility to save a configuration file with the modules you use most often

Possible to create such a file and attach it to an experiment: export source code + environment

...

PROVA! - Current Version of the Architecture

[Figure: as in the first version (web browser via https or ssh, front-end machine running the framework, cluster of parallel machines, experiment and analysis server, per-user workspaces on NFS file storage), now extended with a scheduler on the front-end machine.]

New Workflow Structure

The root directory ($BENCHROOT) contains the following folders:

$BENCHROOT
  methodType_avail
  software
    lua
      lmod
    python
      easybuild
        modules
        my_easyblocks
        software
        ...
  ...

Commands overview

eb_cmd       Use the EasyBuild included in the tool
methodType   Manage method types
project      Manage projects (CRUD)
method       Manage methods (CRUD)
compile      Manage compilation of the methods of a project
run          Manage runs of the methods of a project
exp          Manage the experiment definition and execution
run_exp      Manage the execution of an experiment (compile + run + gather)


Sample usage

Initialization

cd $PROVAROOT; source ./util/BaseSetup.sh ~/myworkspace


Sample usage

Installation of the tool

workflow install


Sample usage

List available method types

workflow methodType -l

Output:

Installed:

Available:
List of MethodTypes in $BENCHROOT/methodType_avail:
MethodTypes: OpenMP-4.0-GCC-4.9.2 OpenMPI-1.8.4-GCC-4.9.2


Sample usage

How to get new methodType?

Add a directory to the "methodType_avail" path:

methodType_avail
  methodTypeName
    .methodType
    compile.sh
    run.sh
    src
      example.c
      Makefile
    ...
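Concretely, registering a new methodType amounts to creating that skeleton; a sketch using the names from the layout above (the underscore in methodType_avail is an assumption, and myMethodType is a hypothetical name):

```shell
#!/bin/bash
set -e
# Scaffold a new methodType directory (names per the layout above;
# methodType_avail with underscore is assumed, myMethodType is hypothetical).
MT="methodType_avail/myMethodType"
mkdir -p "$MT/src"
touch "$MT/.methodType" "$MT/compile.sh" "$MT/run.sh"
touch "$MT/src/example.c" "$MT/src/Makefile"

ls -A "$MT"
```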

Sample usage

How to get new methodType?

{
  "name": "OpenMPI-1.8.4-GCC-4.9.2",
  "eb_modules": [
    "OpenMPI/1.8.4-GCC-4.9.2"
  ],
  "version": "1.0",
  "comment": "OpenMPI 1.8.4 based on GCC 4.9.2"
}


Sample usage

Installation of a methodType

workflow methodType -i OpenMP-4.0-GCC-4.9.2

Sample usage

Installation of a methodType

fn_install() {
    # read the method type name
    METHOD_TYPE=$1
    # read the EasyBuild modules needed by the method type into EB_MODULES
    read -r -a EB_MODULES <<< "$(awk -F'"' '/eb_modules/ { getline; while ($0 !~ /]/) { print $2; getline } }' "$methodType_avail_dir/$METHOD_TYPE/.methodType")"
    # take the number of EasyBuild modules
    ebModLen=${#EB_MODULES[*]}
    # for each EasyBuild module, check if it is installed; if not, try to install it!
    for (( i = 0; i < ebModLen; i++ )); do
        # try to install the eb module
        EB_MOD="${EB_MODULES[$i]//\//-}"
        echo "Try to install $EB_MOD.eb"
        eb "$EB_MOD.eb" -r || true   # skip errors and check afterwards
        # try to load the module's category
        EB_MOD_CAT=$(awk '/moduleclass/ { print $3 }' "$EASYBUILD_PREFIX/ebfiles_repo/${EB_MOD%%-*}/$EB_MOD.eb")
        cd "$EASYBUILD_PREFIX/modules"
        module use "$EB_MOD_CAT"
        # check the module installation
        ...
    done
}
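The awk pipeline in the installation function above (fn_install) can be exercised in isolation on the sample .methodType from the previous slide, written to a temporary file here:

```shell
#!/bin/bash
set -e
# Sample .methodType content (from the earlier slide), written locally.
cat > .methodType <<'EOF'
{
  "name": "OpenMPI-1.8.4-GCC-4.9.2",
  "eb_modules": [
    "OpenMPI/1.8.4-GCC-4.9.2"
  ],
  "version": "1.0",
  "comment": "OpenMPI 1.8.4 based on GCC 4.9.2"
}
EOF

# Same extraction as in fn_install: after the "eb_modules" line, field 2
# (quote-delimited) of every line until the closing "]" is a module name.
read -r -a EB_MODULES <<< "$(awk -F'"' '/eb_modules/ { getline; while ($0 !~ /]/) { print $2; getline } }' .methodType)"

echo "modules: ${EB_MODULES[*]}"
rm -f .methodType
```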


Sample usage

Creation of a project into the defined workspace

workflow project -c -p myfirstproject --params x_max y_max --threads 2 --values 20 20 --comment "first project with PROVA!"

Output:

I will create the project: "myfirstproject" in the folder: /scicore/home/burkhart/guerrera/new_workspace/
Project: "myfirstproject" created!
You can add a method to this project running:
workflow method -c -p myfirstproject -m method_type -n method_name


Sample usage

Creation of a method in a project

workflow method -c -p myfirstproject -m OpenMP-4.0-GCC-4.9.2 -n openmp --comment "My first method with PROVA!: it uses OpenMP 4.0 and GCC 4.9.2"


Scenarios

Research Result Review

Period of Time Observation

Reproducibility in the Curriculum

Research Result Review

Comparison of blocking and non-blocking primitives.
Replication: first attempt in Potsdam by the group of Prof. Schnor.

Problem: Calculate a 3-D Poisson equation of 800^3 elements (IEEE double-precision arithmetic) with an error of 0.01.

System (original experiment):
SW: distributed-memory computer with Message Passing Interface (MPICH2 1.0.2), GCC undefined
HW: 128 nodes - CPU: dual Opteron 246, 2 GHz, cache undefined - RAM: undefined - OS: undefined - Network: GbE

Method:
1. Blocking collective communication
2. Non-blocking collective communication using the NBC library

System:
SW: distributed-memory computer with Message Passing Interface (MPICH 3.1.2), GCC undefined
HW: 28 homogeneous nodes (Dell R610) - CPU: 2x Intel Xeon E5520 4-Core (SMT deactivated), 2.26 GHz, 8 MiB L3 cache - RAM: 48 GiB - OS: Scientific Linux, Kernel 2.6.18 - Network: GbE NIC on-board Broadcom Corporation NetXtreme II BCM5709

[Figure (a): Gigabit Ethernet: speedup of blocking vs. non-blocking over the number of CPUs (8 to 96).]

System:
SW: distributed-memory computer with Message Passing Interface (MPICH 3.1.4), GCC 4.7.2
HW: 4 homogeneous nodes - CPU: 2x AMD Opteron 6274 16-Core, 2.2 GHz, 12 MiB L3 cache - RAM: 256 GiB - OS: Ubuntu 12.04.4, Kernel 3.8.0-38

[Figure: speedup of Blocking_MPICH vs. Non-Blocking LibNBC_MPICH over the number of cores (8 to 96).]

Period of Time Observation

Comparison of stencil variants implemented with Chapel Cray v1.4 and v1.11.
Replication: both experiments held in Basel.

Problem: Calculate a 3-D wave equation of 200^3 elements (IEEE double-precision arithmetic) in 100 timesteps.

System:
SW: PGAS programming model using Chapel v1.4, GCC 4.7.2
HW: 1 node - CPU: 2x AMD Opteron 6274 16-Core, 2.2 GHz, 12 MiB L3 cache - RAM: 256 GiB - OS: Ubuntu 12.04.4, Kernel 3.8.0-38

Method:
1. Use 3 nested "for" loops (outermost parallelized).
2. Use "forall" statement with "domain map" function.

System (second run):
SW: PGAS programming model using Chapel v1.11, GCC 4.7.2
HW: unchanged

[Figures: performance comparison of project ChapelWave, parameters (X_MAX T_MAX): 200 100. GFlop/s over 1 to 16 threads, comparing methods wave3_1_4 vs. wave3_1_11 and wave2_1_4 vs. wave2_1_11.]

Reproducibility in the Curriculum

PROVA! has been used during the HPC course in FS15; students did their assignments using it.
Repetition:

Problem: Compute a 2-D matrix multiplication of 1000^2 elements (IEEE double-precision arithmetic).

System:
SW: shared-memory computer using OpenMP 3.1, GCC 4.7.2
HW: 1 node - CPU: 2x AMD Opteron 6274 16-Core, 2.2 GHz, 12 MiB L3 cache - RAM: 256 GiB - OS: Ubuntu 12.04.4, Kernel 3.8.0-38

Method: to be defined by the students: naive, cache awareness, ...

[Figure: performance comparison of project SpeedAss3Task2, parameters (SIZE): 1000. GFlop/s over 1 to 32 threads for the methods of Students 1 to 6.]


Requirements

Python >2.7

Numpy >1.9

Lmod >5.8

EasyBuild >2.2

Bash shell

Current limitations

Jobs: no clue about when a job finishes its execution
  The web interface has a field showing the output in real time (not useful anymore). How to get notified? (Email is not useful.) Solved by using GC3Pie (from the next version).

Homogeneity of nodes: libraries and software are installed on a shared NFS, so all the nodes must be homogeneous to run that software
  Solution: EasyBuild allows installing on all nodes via jobs. Drawback: must manage different mount points for the software stack (EB delegates this to sysadmins; architecture specific). Issue: a job contains a command of type "workflow"; we must first make it more fine-grained, writing only bash commands in the job (so that PROVA! can be installed on the front end only).

Experiment is run as a block: bad resource usage
  It allocates the maximum number of cores required by the experiment from the beginning.

Installation of the modules is simply delegated to EasyBuild
  In the first version, libraries and the software stack were not managed at all; now you need to know EasyBuild to customize the software you use (e.g. changing a flag on the command line vs. creating a new easyconfig).

Visualization is not so powerful

Conclusions

Reproducibility needs to be emphasized in computational sciences.

Repeatability of an experiment is only possible if a precise description of the experiment is given: Problem, System, and Method.

Repeatability: world-wide access to experiments through the Internet is feasible (security and authentication mechanisms essential).

Replication and re-experimentation: harder to achieve, but not impossible.

Future Work

Porting to external HPC environments

[Figure: the current architecture (front-end machine with framework and scheduler, cluster, experiment and analysis server, workspaces on NFS) extended with remote storage in the cloud: per-user workspace git repositories synchronized with the front-end workspaces.]

Collaborative Performance Engineering

Integration of a Performance Tool (Likwid)

Correctness check for experiment results

Reorganize the visualization of experiments and results

Documentation


The End