Managing Scientific Computing Projects

Page 1: Managing Scientific Computing Projects

Sep 13, 2006, Scientific Computing

Managing Scientific Computing Projects

Erik Deumens

QTP and HPC Center

Page 2: Managing Scientific Computing Projects

Overview

What is a scientific computing project?

Procedures to manage scientific computing projects

Page 3: Managing Scientific Computing Projects

Commodity computing

E-mail

Web access

Writing: papers, letters, thesis, presentations, web content

Drawing: graphs, figures, plots

Calculating: spreadsheets, Mathematica, Maple, SAS, Matlab

Page 4: Managing Scientific Computing Projects

Science and Engineering

Computing with software
  • Physics: VASP, WIEN
  • Chemistry: Gaussian, Q-Chem
  • Engineering: ANSYS

Developing software
  • Programming
  • Prototyping
  • Debugging
  • Performance analysis

Page 5: Managing Scientific Computing Projects

Scientific Computing Project

Significant human effort

Many steps with dependencies

Takes a long time on one computer or many computers to complete

Involves a lot of data
  • Input given to be processed
  • Intermediate data for the computation
  • Output produced to be analyzed

Page 6: Managing Scientific Computing Projects

Example SCP

Test a set of model parameters
  • Given basic parameters B_n
  • Compute dependent values D_j
  • Compare to test values T_j

If the number of dependent and test value sets is large, say 1,000

And each computation takes time, say 1 h

Then this is a project
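
The example above can be sketched in Python. This is a minimal illustration, not the author's code: the polynomial toy model, the function names, and the tolerance are all assumptions standing in for a real computation that would take about 1 h per value.

```python
def compute_dependent(basic_params, j):
    """Stand-in for the expensive computation of one dependent value D_j.

    Hypothetical toy model: a polynomial with coefficients B_n evaluated at j.
    In a real project this call would be the 1 h computation.
    """
    return sum(b * j ** n for n, b in enumerate(basic_params))


def compare_to_tests(basic_params, test_values, tol=1e-6):
    """Return the indices j where D_j differs from T_j by more than tol."""
    mismatches = []
    for j, t in enumerate(test_values):
        d = compute_dependent(basic_params, j)
        if abs(d - t) > tol:
            mismatches.append(j)
    return mismatches


def serial_days(n_cases, hours_per_case):
    """Wall-clock days needed when the cases run one after another."""
    return n_cases * hours_per_case / 24.0


# Three test values; the last one deliberately disagrees with the model.
mismatches = compare_to_tests([1.0, 2.0], [1.0, 3.0, 5.5])
print(mismatches)

# 1,000 cases at 1 h each, run serially on one computer.
print(round(serial_days(1000, 1.0), 1), "days")
```

With the numbers from the slide, 1,000 cases at 1 h each come to about 42 days on a single computer, which is why the work has to be treated as a project.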

Page 7: Managing Scientific Computing Projects

Recognizing SCP

Act from the early stages as if it is a SCP

Then the procedures are tested and reliable by the time the science of the project becomes harder and requires all your attention

Page 8: Managing Scientific Computing Projects

Reliability of modern computers

Computers, networks and software are
  • Very stable
  • Very powerful

This leads to the widespread belief that they are
  • Infinitely stable
  • Infinitely powerful

Probability of failure
  • Small chance times lots of work = big chance
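
The last point can be made quantitative with a short calculation. The sketch below assumes that runs fail independently and uses an illustrative per-run failure probability of 0.001; neither assumption comes from the slides.

```python
def prob_any_failure(p_single, n_runs):
    """Probability that at least one of n_runs independent runs fails.

    Assumes every run fails independently with the same probability p_single.
    """
    return 1.0 - (1.0 - p_single) ** n_runs


# Illustrative numbers: a 0.1% failure chance per run, 1,000 runs.
print(f"{prob_any_failure(0.001, 1000):.2f}")
```

Under these assumptions the chance of seeing at least one failure in 1,000 runs is roughly 63%, even though each individual run is 99.9% reliable.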

Page 9: Managing Scientific Computing Projects

Overview

What is a scientific computing project?

Procedures to manage scientific computing projects

Page 10: Managing Scientific Computing Projects

Manage a SCP

Project analysis
  • Data
  • Computation

Develop strategy
  • Organize the computation
  • Manage the data

Automation
  • Avoid human errors
  • Protect against disasters

Page 11: Managing Scientific Computing Projects

Project analysis

Often a project starts small

Once you decide the project is worthwhile, perform a project analysis
  • Data: before, during, after
  • Computation: how many, how long
  • Precautions: minimize effect of disasters
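
A project analysis of this kind can start as a back-of-the-envelope estimate. The following sketch is one possible way to write it down; the class name, its fields, and the sample sizes are hypothetical and only illustrate the "data: before, during, after" and "computation: how many, how long" questions.

```python
from dataclasses import dataclass


@dataclass
class ProjectEstimate:
    """Rough size of a project; all fields are illustrative assumptions."""
    n_runs: int           # computation: how many
    hours_per_run: float  # computation: how long
    input_mb: float       # data before each run
    scratch_mb: float     # intermediate data during each run
    output_mb: float      # data after each run

    def cpu_hours(self):
        return self.n_runs * self.hours_per_run

    def wall_days(self, concurrent_runs):
        """Elapsed days if concurrent_runs runs execute at the same time."""
        return self.cpu_hours() / concurrent_runs / 24.0

    def permanent_storage_gb(self):
        """Input and output files that must be kept for all runs."""
        return self.n_runs * (self.input_mb + self.output_mb) / 1024.0

    def peak_scratch_gb(self, concurrent_runs):
        """Intermediate files that exist only while runs are executing."""
        return concurrent_runs * self.scratch_mb / 1024.0


est = ProjectEstimate(n_runs=1000, hours_per_run=1.0,
                      input_mb=0.1, scratch_mb=2048.0, output_mb=50.0)
print(est.cpu_hours(), "CPU hours")
print(round(est.wall_days(50), 2), "days with 50 concurrent runs")
print(round(est.permanent_storage_gb(), 1), "GB to keep")
print(est.peak_scratch_gb(50), "GB of scratch at peak")
```

Rerunning the same estimate whenever the number of runs changes is a cheap way to notice when the original plan no longer fits.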

Page 12: Managing Scientific Computing Projects

Develop strategy

Organize the computation
  • Choose computer system
  • Study scheduling system
  • Match the project computation flow onto the scheduling policies

Manage the data
  • Input files generated by hand? By machine?
  • Space for large intermediate files
  • Space for output files
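
Matching the computation flow onto the scheduling policies often comes down to packing runs into jobs that fit a wall-time limit. The sketch below assumes a hypothetical limit of 24 h per job and a safety margin of 20%; real limits depend on the local scheduler and must be looked up there.

```python
import math


def plan_jobs(n_runs, hours_per_run, max_walltime_hours, safety=0.8):
    """Pack runs into scheduler jobs that fit the wall-time limit.

    safety is the fraction of the limit we plan to use, leaving headroom
    for runs that take longer than expected (an assumed margin).
    Returns (runs_per_job, n_jobs).
    """
    usable_hours = max_walltime_hours * safety
    runs_per_job = int(usable_hours // hours_per_run)
    if runs_per_job < 1:
        raise ValueError("a single run does not fit in the wall-time limit")
    n_jobs = math.ceil(n_runs / runs_per_job)
    return runs_per_job, n_jobs


# 1,000 runs of 1 h each under an assumed 24 h limit per job.
print(plan_jobs(1000, 1.0, 24.0))
```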

Page 13: Managing Scientific Computing Projects

Automation

Extra tools needed to manage the project?

Generate input files from a database?
  • Write scripts? Use a tool already developed?

Generate scheduler command files?
  • Does a tool exist? Some tools are very complex. Is it easier to write scripts than to learn the tool?

Collect data from output files into a database?
  • Write scripts? Write a compiled program?
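
As one illustration of the scripting route, the sketch below generates an input file and a command script per run from a small SQLite table. The table layout, the input file format, and the program name my_solver are hypothetical placeholders; the scheduler directives are left as a comment because they depend on the local batch system.

```python
import sqlite3
import tempfile
from pathlib import Path


def generate_run_files(conn, out_dir, program="my_solver"):
    """Write one input file and one command script per row of table 'runs'.

    Assumed (hypothetical) schema: runs(id INTEGER, alpha REAL, beta REAL).
    'my_solver' is a placeholder for the real program.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    query = "SELECT id, alpha, beta FROM runs ORDER BY id"
    for run_id, alpha, beta in conn.execute(query):
        stem = f"run_{run_id:06d}"
        input_file = out_dir / f"{stem}.inp"
        input_file.write_text(f"alpha = {alpha}\nbeta = {beta}\n")
        script = out_dir / f"{stem}.sh"
        script.write_text(
            "#!/bin/sh\n"
            "# Scheduler directives for the local batch system go here.\n"
            f"{program} {input_file.name} > {stem}.out 2> {stem}.err\n"
        )
        written.append(stem)
    return written


# Demonstration with an in-memory database and a temporary directory.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE runs (id INTEGER, alpha REAL, beta REAL)")
conn.executemany("INSERT INTO runs VALUES (?, ?, ?)",
                 [(1, 0.1, 2.0), (2, 0.2, 2.0)])
with tempfile.TemporaryDirectory() as tmp:
    stems = generate_run_files(conn, tmp)
    created = sorted(p.name for p in Path(tmp).iterdir())
print(stems)
print(created)
```

Generating the files by machine means the parameter table is the only thing maintained by hand.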

Page 14: Managing Scientific Computing Projects

Automation

Computation and data monitoring

Check status of each run
  • Submit the job again if it failed

Check correctness and integrity of output data
  • Even if the job finished
    • it may have generated an error message
    • there may be no result
    • or the result may be invalid or incorrect
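
The checks listed above can be automated with a small validator. The sketch below assumes a hypothetical output format in which the program prints a line of the form "RESULT = <number>"; the error-message test is deliberately crude and would need to be adapted to the messages of the actual program.

```python
import math
import re
import tempfile
from pathlib import Path

# Assumed (hypothetical) output convention: a line "RESULT = <number>".
RESULT_RE = re.compile(r"^RESULT\s*=\s*(\S+)\s*$", re.MULTILINE)


def check_output(path):
    """Return (ok, reason) for one output file."""
    path = Path(path)
    if not path.is_file() or path.stat().st_size == 0:
        return False, "no output"
    text = path.read_text(errors="replace")
    # Crude test: any occurrence of the word "error" counts as a failure.
    if re.search(r"\berror\b", text, re.IGNORECASE):
        return False, "error message in output"
    match = RESULT_RE.search(text)
    if match is None:
        return False, "no result"
    try:
        value = float(match.group(1))
    except ValueError:
        return False, "result is not a number"
    if not math.isfinite(value):
        return False, "result is not finite"
    return True, "ok"


def runs_to_resubmit(paths):
    """Output files whose runs should be submitted again."""
    return [str(p) for p in paths if not check_output(p)[0]]


# Demonstration on a few hand-made output files.
with tempfile.TemporaryDirectory() as tmp:
    samples = {
        "good.out": "RESULT = 1.25\n",
        "crashed.out": "Error: convergence failed\n",
        "empty.out": "",
        "nan.out": "RESULT = nan\n",
    }
    for name, text in samples.items():
        (Path(tmp) / name).write_text(text)
    demo_status = {name: check_output(Path(tmp) / name) for name in samples}

for name, status in demo_status.items():
    print(name, status)
```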

Page 15: Managing Scientific Computing Projects

Precautions

Prepare for some disasters
  • Some or all computed data is lost or corrupted?
    • Make sure all files created manually are on disks that are backed up
    • At least, you can run computations again
  • Some output has been processed
    • Make sure partial results are on disks that are backed up
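
One way to act on these precautions is to copy hand-made files and partial results to a backed-up location and record a checksum for each, so that later corruption can be detected. The sketch below is an assumption about how this might be scripted; the destination directory merely stands in for a disk that is actually backed up.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path):
    """SHA-256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_to_backed_up_disk(src_dir, dest_dir):
    """Copy every regular file in src_dir to dest_dir; return {name: sha256}."""
    src_dir, dest_dir = Path(src_dir), Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    manifest = {}
    for src in sorted(p for p in src_dir.iterdir() if p.is_file()):
        shutil.copy2(src, dest_dir / src.name)
        manifest[src.name] = sha256_of(src)
    return manifest


def find_damaged(dest_dir, manifest):
    """Names of files that are missing or no longer match the manifest."""
    dest_dir = Path(dest_dir)
    damaged = []
    for name, digest in manifest.items():
        target = dest_dir / name
        if not target.is_file() or sha256_of(target) != digest:
            damaged.append(name)
    return damaged


# Demonstration: copy two files, then simulate corruption of one copy.
with tempfile.TemporaryDirectory() as work, tempfile.TemporaryDirectory() as safe:
    (Path(work) / "a.txt").write_text("partial result A\n")
    (Path(work) / "b.txt").write_text("partial result B\n")
    manifest = copy_to_backed_up_disk(work, safe)
    intact = find_damaged(safe, manifest)
    (Path(safe) / "b.txt").write_text("garbage\n")
    damaged = find_damaged(safe, manifest)
print(intact, damaged)
```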

Page 16: Managing Scientific Computing Projects

Growing projects

Often projects start small
  • Procedures are developed and used
  • They work well for 1,000 cases

Then the scope is increased
  • After partial success
  • Procedures are used unchanged
  • They do not work for 1,000,000 cases!

Must perform new analysis when scope changes

Page 17: Managing Scientific Computing Projects

Tool choices

Small operations
  • Scripts are easy to write and change
  • Run fast for small numbers

Large operations
  • Running a script 10,000,000 times may be very slow and cause unexpected side effects
  • Investigate better tools
    • Program in compiled language
    • Use database instead of simple files
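
As an illustration of the last point, the sketch below keeps one row per run in a SQLite database instead of one small file per run. The table layout and the status values are hypothetical; SQLite is used here only because it ships with Python, not because the slides recommend it.

```python
import sqlite3


def open_results(db_path=":memory:"):
    """Open (or create) a results database with an assumed schema."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS results ("
        "run_id INTEGER PRIMARY KEY, status TEXT NOT NULL, value REAL)"
    )
    return conn


def record_results(conn, rows):
    """Insert or update many (run_id, status, value) rows in one transaction."""
    with conn:
        conn.executemany(
            "INSERT OR REPLACE INTO results (run_id, status, value) "
            "VALUES (?, ?, ?)",
            rows,
        )


def failed_runs(conn):
    """Run ids whose status is anything other than 'ok'."""
    query = "SELECT run_id FROM results WHERE status != 'ok' ORDER BY run_id"
    return [row[0] for row in conn.execute(query)]


# Demonstration: 1,000 runs, every hundredth one marked as failed.
conn = open_results()
record_results(
    conn,
    ((j, "failed" if j % 100 == 0 else "ok", float(j)) for j in range(1000)),
)
failed = failed_runs(conn)
print(len(failed), failed[:3])
```

A single query then answers questions such as "which runs must be repeated?" without opening a large number of files.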

Page 18: Managing Scientific Computing Projects

Conclusion

A little bit of thought can save you from a lot of trouble and extra work

Every scientific computation project that is worth doing is worth a little bit of thought about how to do it.