
Slide-1

Hackystat and the DARPA High Productivity Computing Systems Program

Philip Johnson
University of Hawaii

Slide-2

Overview of HPCS

Slide-3

High Productivity Computing Systems

Goal: Provide a new generation of economically viable high productivity computing systems for the national security and industrial user community (2007–2010)

Impact:
• Performance (time-to-solution): speed up critical national security applications by a factor of 10X to 40X
• Programmability (time-for-idea-to-first-solution): reduce cost and time of developing application solutions
• Portability (transparency): insulate research and operational application software from the system
• Robustness (reliability): apply all known techniques to protect against outside attacks, hardware faults, & programming errors

Fill the Critical Technology and Capability Gap: Today (late 80's HPC technology) ... to ... Future (Quantum/Bio Computing)

Applications: Intelligence/surveillance, reconnaissance, cryptanalysis, weapons analysis, airborne contaminant modeling and biotechnology

HPCS Program Focus Areas

Slide-4

Fill the high-end computing technology and capability gap for critical national security missions

[Figure: HPC evolution from 1980's technology (vector and parallel vector systems) through commodity HPCs and tightly coupled parallel systems to 2010 high-end computing solutions. Moore's Law: double raw performance every 18 months. New goal: double value every 18 months.]

Vision: Focus on the lost dimension of HPC – "User & System Efficiency and Productivity"

Slide-5

HPCS Technical Considerations

[Figure: architecture types (custom vector, vector supercomputer, parallel vector systems, scalable vector, symmetric multiprocessors, distributed shared memory, massively parallel processors, commodity HPC, commodity clusters, grids) arranged against communication programming models, from microprocessor and shared-memory multi-processing to distributed-memory multi-computing ("MPI").]

Single point design solutions are no longer acceptable.

HPCS focus: tailorable balanced solutions spanning performance characterization & precision, programming models, hardware technology, software technology, and system architecture.

Slide-6

HPCS Program Phases I - III

[Figure: HPCS program timeline across fiscal years 02–10. Phase I: industry concept study. Phase II: R&D, drawing on industry and academia, with research platforms, early software tools, metrics and benchmarks, application analysis, and technology assessments. Phase III: full-scale development, leading to early pilot platforms, research prototypes & pilot systems, performance assessment, industry procurements, and HPCS capability or products. Critical program milestones include concept reviews, requirements and metrics, a system design review, PDR, DDR, Phase II readiness reviews, and a Phase III readiness review, all within an industry evolutionary development cycle.]

Slide-7

Application Analysis/Performance Assessment Activity Flow

[Figure: activity flow. Motivated by the DDR&E & IHEC mission analysis and the HPCS program, mission partners (DOD, DOE, NNSA, NSA, NRO) supply mission work flows as inputs. HPCS applications, 1. Cryptanalysis, 2. Signal and Image Processing, 3. Operational Weather, 4. Nuclear Stockpile Stewardship, 5. Etc., yield common critical kernels and compact applications that define system requirements and characteristics (HPCS technology drivers). Application analysis produces benchmarks & metrics; impacts include improved mission capability and a mission-specific roadmap. Participants: Cray, IBM, Sun, DARPA. Productivity is the ratio of utility/cost, with metrics for development time (cost) and execution time (cost), plus implicit factors.]

Slide-8

Workflow Priorities & Goals

Implicit Productivity Factors

Workflow     Perf.   Prog.   Port.   Robust.
Researcher           High
Enterprise   High    High    High    High
Production   High                    High

[Figure: productivity vs. problem size, showing workstation, cluster, and HPCS regimes for the researcher, enterprise, and production workflows, with the HPCS goal marked at high productivity and large problem size.]

• Workflows define scope of customer priorities
• Activity and Purpose benchmarks will be used to measure Productivity
• HPCS Goal is to add value to each workflow
– Increase productivity while increasing problem size

Mission Needs

System Requirements

Slide-9

Productivity Framework Overview

Phase I: Define Framework & Scope Petascale Requirements

Phase II: Implement Framework & Perform Design Assessments

Phase III: Transition To HPC Procurement Quality Framework

Value Metrics: Execution, Development

Benchmarks: Activity, Purpose

Workflows: Production, Enterprise, Researcher

[Figure: framework evolution from preliminary multilevel system models & prototypes to final multilevel system models & SN001. Roles: HPCS vendors; HPCS FFRDC & government R&D partners; mission agencies (acceptance-level tests); commercial or nonprofit productivity sponsor (run evaluation experiments).]

HPCS needs to develop a procurement quality assessment methodology that will be the basis of 2010+ HPC procurements.

Slide-10

HPCS Phase II Teams

Industry: Cray (PI: Smith), IBM (PI: Elnozahy), Sun (PI: Rulifson)
Goal: Provide a new generation of economically viable high productivity computing systems for the national security and industrial user community (2007–2010)

Productivity Team (Lincoln Lead): MIT Lincoln Laboratory (PI: Kepner), with PIs Lucas, Koester, Basili, Benson & Snavely, Vetter, Lusk, Post, Bailey, Gilbert, Edelman, Ahalt, and Mitchell, spanning LCS, Ohio State, and partner institutions
Goal: Develop a procurement quality assessment methodology that will be the basis of 2010+ HPC procurements

Slide-11

Motivation: Metrics Drive Designs

Execution Time (Example)

Current metrics favor caches and pipelines

• Systems ill-suited to applications with
– Low spatial locality
– Low temporal locality

Development Time (Example)

• No metrics widely used

• Least common denominator standards

• Difficult to use

• Difficult to optimize

“You get what you measure”

[Figure: benchmarks and applications plotted on temporal locality vs. spatial locality axes, from low/low to high/high: Top500 Linpack Rmax, Streams Add, large FFTs (reconnaissance), Table Toy/GUPS (intelligence), adaptive multi-physics (weapons design, vehicle design), and weather; HPCS targets the full space.]

[Figure: tradeoffs between language expressiveness and language performance, with Assembly/VHDL, C/Fortran, MPI/OpenMP, UPC/CAF, SIMD/DMA, and Matlab/Python placed along the curve; the HPCS target is high-performance high-level languages.]

Slide-12

Phase 1: Productivity Framework

[Figure: work flows and activity & purpose benchmarks drive an actual system or model through a common modeling interface, yielding the productivity metrics, development time (cost) and execution time (cost), and productivity itself (ratio of utility/cost). Example system parameters: BW bytes/flop (balance), memory latency, memory size, processor flop/cycle, processor integer op/cycle, bisection BW, size (ft³), power/rack, facility operation, code size, restart time (reliability), code optimization time.]
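Read as a formula, the framework's figure of merit is the ratio the slides name; splitting cost into the two measured components is our reading of the diagram, not an equation given on the slide:

```latex
% Productivity (Psi) as the framework's "ratio of utility/cost", with cost
% decomposed into the two metrics the framework measures (assumed split).
\[
\Psi \;=\; \frac{\text{Utility}}{\text{Cost}}
     \;\approx\; \frac{U}{C_{\text{development}} + C_{\text{execution}}}
\]
```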

Slide-13

Phase 2: Implementation

[Figure: the Phase 1 framework instantiated, with a development interface and an execution interface feeding the common modeling interface. Annotated responsibilities: metrics analysis of current and new codes (Lincoln, UMD & mission partners); university experiments (MIT, UCSB, UCSD, UMD, USC); ANL & Pmodels group; ISI, LLNL & UCSD; Mitre, ISI, LBL, Lincoln, HPCMO, LANL & mission partners; performance analysis (ISI, LLNL & UCSD); Lincoln, OSU, CodeSourcery.]

Slide-14

HPCS Mission Work Flows

[Figure: mission work flows and their cycle times. Production: an observe–orient–decide–act loop with hours-to-minutes response time. Enterprise: a design–simulation–visualize cycle (months to days), with a development cycle of design, code, test, and port/scale/optimize covering initial product development and porting legacy software (years to months). Researcher: theory, experiment, design, prototyping, code, and test, with initial development in days to hours and execution turnaround in hours to minutes. Each workflow divides into development and execution activities.]

HPCS Productivity Factors: Performance, Programmability, Portability, and Robustness are very closely coupled with each work flow.

Slide-15

HPC Workflow SW Technologies

Production Workflow

• Many technologies targeting specific pieces of workflow

• Need to quantify workflows (stages and % time spent)

• Need to measure technology impact on stages

[Figure: production workflow stages, algorithm development, spec, design/code/test (workstation), port/scale/optimize, and run (supercomputer), mapped against mainstream and HPC software technologies. Operating systems: Linux, RT Linux. Compilers: C++, F90. Libraries: ATLAS, BLAS, FFTW, PETE, PAPI, MPI. Tools: Globus, UML, TotalView. Problem solving environments: POOMA, CORBA, CCA, PVL, VSIPL, VSIPL++, ESMF. Also shown: Matlab, UPC, Co-array Fortran, Java, OpenMP, DRI.]

Slide-16

Prototype Productivity Models

Special Model with Work Estimator (Sterling): expressed in the terms w, S, P, E, A and the cost coefficients c_f, n, c_m, c_o, T.

Least Action (Numrich): S = ∫ (w_dev + w_comp) dt, with δS = 0.

Efficiency and Power (Kennedy, Koelbel, Schreiber): total cost T(P_L) = I(P_L) + r·E(P_L); writing language power as ρ_L = I(P_0)/I(P_L) and efficiency as ε_L = E(P_0)/E(P_L) gives T(P_L) = I(P_0)/ρ_L + r·E(P_0)/ε_L.

Time-To-Solution (Kogge):
[Figure: programming time vs. execution time, hour to year on both axes. Missions fall into programming-bounded and execution-bounded regions; plotted missions include surveillance, cryptanalysis, intelligence, weather (operational and research), and weapons design, with the HPCS goal of shrinking both times.]

CoCoMo II (software engineering community): Effort = A × Size^(scale factors) × (effort multipliers).

Productivity Factor Based (Kepner): productivity = productivity factor × (useful ops/second, e.g. GUPS ... Linpack) / hardware cost, where the productivity factor combines a mission factor with language level, parallel model, portability, availability, and maintenance.

Utility (Snir): P(S, A, U(·)) = min over Cost of U(T(S, A, Cost)) / Cost.

HPCS has triggered groundbreaking activity in understanding HPC productivity: the community is focused on quantifiable productivity (potential for broad impact).
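A toy numeric reading of the Kennedy/Koelbel/Schreiber cost model as reconstructed above; the two hypothetical languages, hours, and run counts are illustrative, not from the slide:

```latex
% T(P_L) = I(P_L) + r * E(P_L): implementation time plus r weighted runs.
% Hypothetical language A is low-level (slow to write, fast to run);
% hypothetical language B is high-level (fast to write, slow to run).
\[
T_A = 200 + r \cdot 1, \qquad T_B = 20 + r \cdot 10
\]
% The curves cross at r = 20: below 20 runs the mission is
% programming-bounded (B wins); above it, execution-bounded (A wins),
% matching the two regions in Kogge's time-to-solution picture.
```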

Slide-17

Example Existing Code Analysis

[Figure: MG performance.]

[Figure: NAS MG line counts (0 to 1200) for MPI, Java, HPF, OpenMP, Serial, and A-ZPL versions, broken down into comm/sync/dir, declarations, and computation.]

Analysis of existing codes is used to test metrics and identify important trends in productivity and performance.

Slide-18

Example Experiment Results (N=1)

[Figure: performance (speedup × efficiency, roughly 0.1 to 1000) vs. development time (0 to 1000 lines of code) for Matlab, C, C++, BLAS, BLAS/OpenMP, BLAS/MPI, PVL/BLAS/MPI, MatlabMPI, and pMatlab implementations, spanning single processor, shared memory, and distributed memory regimes and contrasting research approaches with current practice.]

• Same application (image filtering)
• Same programmer
• Different langs/libs: Matlab, BLAS, BLAS/OpenMP, BLAS/MPI*, PVL/BLAS/MPI*, MatlabMPI, pMatlab*

*Estimate

Controlled experiments can potentially measure the impact of different technologies and quantify development time and execution time tradeoffs.

Slide-19

Summary

• Goal is to develop an acquisition quality framework for HPC systems that includes
– Development time
– Execution time
• Have assembled a team that will develop models, analyze existing HPC codes, develop tools and conduct HPC development time and execution time experiments
• Measures of success
– Acceptance by users, vendors and acquisition community
– Quantitatively explain HPC rules of thumb:
• "OpenMP is easier than MPI, but doesn't scale as high"
• "UPC/CAF is easier than OpenMP"
• "Matlab is easier than Fortran, but isn't as fast"
– Predict impact of new technologies

Slide-20

Example Development Time Experiment

• Goal: Quantify development time vs. execution time tradeoffs of different parallel programming models
– Message passing (MPI)
– Threaded (OpenMP)
– Array (UPC, Co-Array Fortran)
• Setting: Senior/1st Year Grad Class in Parallel Computing (MIT/BU, Berkeley/NERSC, CMU/PSC, UMD/?, …)
• Timeline:
– Month 1: Intro to parallel programming
– Month 2: Implement serial version of compact app
– Month 3: Implement parallel version
• Metrics:
– Development time (from logs; see the sketch after this list), SLOCS, function points, …
– Execution time, scalability, comp/comm, speedup, …
• Analysis:
– Development time vs. Execution time of different models
– Performance relative to expert implementation
– Size relative to expert implementation
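One plausible mechanic for deriving development time "from logs" is to sum gaps between consecutive timestamped sensor events, discarding long idle gaps. This is a sketch, not the experiment's actual tooling; the 15-minute idle threshold and the event list are assumptions:

```python
from datetime import datetime, timedelta

IDLE_CUTOFF = timedelta(minutes=15)  # assumed idle threshold, not from the slides

def active_time(timestamps):
    """Estimate development time from sorted event timestamps:
    sum inter-event gaps, ignoring gaps longer than IDLE_CUTOFF."""
    total = timedelta()
    for prev, curr in zip(timestamps, timestamps[1:]):
        gap = curr - prev
        if gap <= IDLE_CUTOFF:
            total += gap
    return total

# Hypothetical sensor events (e.g., editor saves, compiles, test runs).
events = [datetime(2006, 3, 1, 9, 0), datetime(2006, 3, 1, 9, 10),
          datetime(2006, 3, 1, 9, 12), datetime(2006, 3, 1, 13, 0),  # lunch gap dropped
          datetime(2006, 3, 1, 13, 20)]
print(active_time(sorted(events)))  # -> 0:32:00
```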

Slide-21

Hackystat in HPCS

Slide-22

About Hackystat

• Five years old:
– I wrote the first LOC during first week of May, 2001.
– Current size: 320,562 LOC (not all mine)
– ~5 active developers
– Open source, GPL
• General application areas:
– Education: teaching measurement in SE
– Research: Test Driven Design, Software Project Telemetry, HPCS
– Industry: project management
• Has inspired startup: 6th Sense Analytics

Slide-23

Goals for Hackystat-HPCS

• Support automated collection of useful low-level data for a wide variety of platforms, organizations, and application areas.
• Make Hackystat low-level data accessible in a standard XML format for analysis by other tools (a hypothetical sample appears after this list).
• Provide workflow and other analyses over low-level data collected by Hackystat and other tools to support:
– discovery of developmental bottlenecks
– insight into impact of tool/language/library choice for specific applications/organizations.
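To make the XML-export goal concrete, here is a hedged sketch of what such an export and a consuming tool could look like. The element and attribute names are invented for illustration; they are not Hackystat's actual schema:

```python
import xml.etree.ElementTree as ET

# Hypothetical sensor-data export; tag and attribute names are invented,
# not Hackystat's real schema.
SAMPLE = """
<SensorData>
  <Entry tool="Emacs"  type="EditorActivity" timestamp="2006-03-01T09:10:00" file="gauss.c"/>
  <Entry tool="CUTest" type="UnitTest"       timestamp="2006-03-01T09:12:30" result="fail"/>
  <Entry tool="Shell"  type="Command"        timestamp="2006-03-01T09:14:00" command="make run"/>
</SensorData>
"""

# A downstream analysis tool would consume the export like this:
for entry in ET.fromstring(SAMPLE):
    print(entry.get("timestamp"), entry.get("tool"), entry.get("type"))
```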

Slide-24

Pilot Study, Spring 2006

• Goal: Explore issues involved in workflow analysis using Hackystat and students.

• Experimental conditions (were challenging):
– Undergraduate HPC seminar
– 6 students total, 3 did assignment, 1 collected data.
– 1 week duration
– Gauss-Seidel iteration problem, written in C, using PThreads library, on cluster

• As a pilot study, it was successful.

Slide-25

Data Collection: Sensors

• Sensors for Emacs and Vim captured editing activities.

• Sensor for CUTest captured testing activities.

• Sensor for Shell captured command line activities.

• Custom makefile with compilation, testing, and execution targets, each instrumented with sensors (one possible shape is sketched below).
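One way the instrumented makefile targets could work is to wrap each build/test/run command in a small shim that records the invocation. This is a sketch under assumptions: the script, its arguments, and the local log file are illustrative, not the actual sensor implementation:

```python
#!/usr/bin/env python
"""Hypothetical sensor shim. A makefile target such as
     test: ; python sensor_shim.py UnitTest ./run_tests
would run the command and record the outcome as sensor data."""
import subprocess
import sys
import time

def main():
    activity, command = sys.argv[1], sys.argv[2:]
    start = time.time()
    result = subprocess.run(command)
    record = {
        "type": activity,                  # e.g. Compile, UnitTest, Execute
        "command": " ".join(command),
        "exit": result.returncode,
        "seconds": round(time.time() - start, 1),
    }
    # A real sensor would post this to the Hackystat server;
    # here we just append to a local log file.
    with open("sensor.log", "a") as log:
        log.write(repr(record) + "\n")
    sys.exit(result.returncode)

if __name__ == "__main__":
    main()
```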

Slide-26

Example data: Editor activities

Slide-27

Example data: Testing

Slide-28

Example data: File Metrics

Slide-29

Example data: Shell Logger

Slide-30

Data Analysis: Workflow States

• Our goal was to see if we could automatically infer the following developer workflow states:

– Serial coding

– Parallel coding

– Validation/Verification

– Debugging

– Optimization

Slide-31

Workflow State Detection: Serial coding

• We defined the "serial coding" state as the editing of a file not containing any parallel constructs, such as MPI, OpenMP, or PThread calls.

• We determine this through the MakeFile, which runs SCLC over the program at compile time and collects Hackystat FileMetric data that provides counts of parallel constructs.

• We were able to identify the Serial Coding state if the MakeFile was used consistently.

Slide-32

Workflow State Detection: Parallel Coding

• We defined the "parallel coding" state as the editing of a file containing a parallel construct (MPI, OpenMP, PThread call).

• Similarly to serial coding, we get the data required to infer this phase using a MakeFile that runs SCLC and collects FileMetric data; a classification sketch follows below.

• We were able to identify the parallel coding state if the MakeFile was used consistently.
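A minimal sketch of the serial/parallel inference, assuming a FileMetric record that exposes counts of parallel constructs; the field names below are invented for illustration:

```python
# Hypothetical FileMetric record: counts of parallel constructs per file,
# as SCLC might report them at compile time. Field names are illustrative.
PARALLEL_KEYS = ("mpi_calls", "openmp_pragmas", "pthread_calls")

def coding_state(file_metrics):
    """Classify an editing event as serial or parallel coding, based on
    whether the edited file contains any parallel constructs."""
    if any(file_metrics.get(k, 0) > 0 for k in PARALLEL_KEYS):
        return "parallel coding"
    return "serial coding"

print(coding_state({"mpi_calls": 0, "openmp_pragmas": 0, "pthread_calls": 4}))
# -> parallel coding
print(coding_state({}))  # no parallel constructs recorded -> serial coding
```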

Slide-33

Workflow State Detection: Testing

• We defined the "testing" state as the invocation of unit tests to determine the functional correctness of the program.

• Students were provided with test cases and the CUTest framework to test their program.

• We were able to infer the Testing state if CUTest was used consistently.

Slide-34

Workflow State Detection: Debugging

• We have not yet been able to generate satisfactory heuristics to infer the "debugging" state from our data.

– Students did not use a debugging tool that would have allowed instrumentation with a sensor.

– UMD heuristics, such as the presence of "printf" statements, were not collected by SCLC.

– Debugging is entwined with Testing.

Slide-35

Workflow State Detection: Optimization

• We have not yet been able to generate satisfactory heuristics to infer the "optimization" state from our data.

– Students did not use a performance analysis tool that would have allowed instrumentation with a sensor.

– Repeated command line invocation of the program could potentially identify the activity as "optimization" (a rough version of this heuristic is sketched below).
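A rough version of that heuristic, as a sketch only: the event shape and the threshold of three consecutive runs are assumptions, not values from the pilot study:

```python
# Hypothetical event stream: ("edit", ...) from editor sensors,
# ("run", ...) from the shell or makefile "execute" target.
RUN_THRESHOLD = 3  # assumed: this many consecutive runs suggests tuning

def optimization_episodes(events):
    """Yield a marker for spans of repeated program runs with no edits
    in between, which we treat as candidate 'optimization' activity."""
    streak = 0
    for kind, _ in events:
        if kind == "run":
            streak += 1
            if streak == RUN_THRESHOLD:
                yield "candidate optimization episode"
        else:
            streak = 0

events = [("edit", "gauss.c"), ("run", "a.out"), ("run", "a.out"),
          ("run", "a.out"), ("edit", "gauss.c")]
print(list(optimization_episodes(events)))  # -> ['candidate optimization episode']
```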

Slide-36

Insights from the pilot study, 1

• Automatic inference of these workflow states in a student setting requires:

– Consistent use of MakeFile (or some other mechanism to invoke SCLC consistently) to infer serial coding and parallel coding workflow states.

– Consistent use of an instrumented debugging tool to infer the debugging workflow state.

– Consistent use of an "execute" MakeFile target (and/or an instrumented performance analysis tool) to infer the optimization workflow state.

Slide-37

Insights from the pilot study, 2

• Ironically, it may be easier to infer workflow states from industrial settings than from classroom settings!

– Industrial settings are more likely to use a wider variety of tools which could be instrumented and provide better insight into development activities.

– Large scale programming leads inexorably to consistent use of MakeFiles (or similar scripts) that should simplify state inference.

Slide-38

Insights from the pilot study, 3

• Are we defining the right set of workflow states?

• For example, the "debugging" phase seems difficult to distinguish as a distinct state.

• Do we really need to infer "debugging" as a distinct activity?

• Workflow inference heuristics appear to be highly contextual, depending upon the language, toolset, organization, and application. (This is not a bug, this is just reality. We will probably need to enable each mission partner to develop heuristics that work for them.)

Slide-39

Next steps

• Graduate HPC classes at UH.
– The instructor (Henri Casanova) has agreed to participate with UMD and UH/Hackystat in data collection and analysis.
– Bigger assignments, more sophisticated students, hopefully larger class!
• Workflow Inference System for Hackystat (WISH)
– Support export of raw data to other tools.
– Support import of raw data from other tools.
– Provide high-level rule-based inference mechanism to support organization-specific heuristics for workflow state identification (one possible shape is sketched below).
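One shape such a rule-based mechanism could take, purely illustrative since WISH's actual design is not described here, is a registry of predicate-to-state rules that an organization populates with its own heuristics and evaluates in order:

```python
# Illustrative rule registry: each rule pairs a predicate over a sensor
# event with a workflow state. First matching rule wins.
RULES = []

def rule(state):
    def register(predicate):
        RULES.append((predicate, state))
        return predicate
    return register

@rule("parallel coding")
def _parallel(e):  # assumed FileMetric field, as in the pilot-study heuristic
    return e.get("type") == "edit" and e.get("pthread_calls", 0) > 0

@rule("testing")
def _testing(e):
    return e.get("tool") == "CUTest"

def infer_state(event):
    for predicate, state in RULES:
        if predicate(event):
            return state
    return "unknown"

print(infer_state({"tool": "CUTest"}))  # -> testing
```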