Managing Scientific Computing Projects
Erik Deumens
QTP and HPC Center
Sep 13, 2006

Overview
- What is a scientific computing project?
- Procedures to manage scientific computing projects

Commodity computing
- E-mail
- Web access
- Writing: papers, letters, thesis, presentations, web content
- Drawing: graphs, figures, plots
- Calculating: spreadsheets, Mathematica, Maple, SAS, Matlab

Science and Engineering
- Computing with software
  - Physics: VASP, WIEN
  - Chemistry: Gaussian, Q-Chem
  - Engineering: ANSYS
- Developing software
  - Programming
  - Prototyping
  - Debugging
  - Performance analysis

Scientific Computing Project
- Significant human effort
- Many steps with dependencies
- Takes a long time on one computer or many computers to complete
- Involves a lot of data
  - Input given to be processed
  - Intermediate data for the computation
  - Output produced to be analyzed

Example SCP
- Test a set of model parameters
  - Given basic parameters B_n
  - Compute dependent values D_j
  - Compare to test values T_j
- If the number of dependent and test value sets is large, say 1,000
- And each computation takes time, say 1 h
- Then this is a project
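
The arithmetic behind this example can be checked in a few lines. This is a minimal sketch using the slide's figures (1,000 sets, 1 h each); the function name and the machine counts are illustrative assumptions, and it assumes the runs are independent and spread evenly over the machines.

```python
import math


def wall_clock_hours(n_runs: int, hours_per_run: float, n_machines: int = 1) -> float:
    """Wall-clock time when independent runs are spread evenly over machines."""
    # Each machine handles one run at a time, so the work proceeds in rounds.
    rounds = math.ceil(n_runs / n_machines)
    return rounds * hours_per_run


if __name__ == "__main__":
    # Machine counts below are assumptions chosen for illustration.
    for machines in (1, 10, 100):
        hours = wall_clock_hours(1000, 1.0, machines)
        print(f"{machines:>3} machine(s): {hours:6.0f} h = {hours / 24:5.1f} days")
```

On a single machine the 1,000 one-hour runs take 1,000 hours, which is roughly six weeks of uninterrupted computing.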

Recognizing SCP
- Act from the early stages as if it is an SCP
- Then procedures are tested and reliable by the time the science of the project becomes harder and requires all attention

Reliability of modern computers
- Computers, networks and software are
  - Very stable
  - Very powerful
- Leads to widespread belief that they are
  - Infinitely stable
  - Infinitely powerful
- Probability of failure
  - Small chance times lots of work = big chance
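
The last point can be made quantitative. This is a minimal sketch, assuming each run fails independently with the same small probability; the value 0.001 per run is an illustrative assumption, not a measured figure.

```python
def prob_at_least_one_failure(p_per_run: float, n_runs: int) -> float:
    """Probability that at least one of n independent runs fails."""
    # All runs succeed with probability (1 - p)^n; the complement is the answer.
    return 1.0 - (1.0 - p_per_run) ** n_runs


if __name__ == "__main__":
    p = 0.001  # assumed per-run failure probability, for illustration only
    for n in (1, 100, 1000, 10000):
        print(f"{n:>6} runs: P(at least one failure) = {prob_at_least_one_failure(p, n):.3f}")
```

Under these assumptions a single run almost never fails, but across 1,000 runs the chance of hitting at least one failure is about 0.63, so a project of that size should expect to handle failures.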

Overview
- What is a scientific computing project?
- Procedures to manage scientific computing projects

Manage an SCP
- Project analysis
  - Data
  - Computation
- Develop strategy
  - Organize the computation
  - Manage the data
- Automation
  - Avoid human errors
  - Protect against disasters

Project analysis
- Often a project starts small
- Once you decide the project is worthwhile, perform a project analysis
  - Data: before, during, after
  - Computation: how many, how long
  - Precautions: minimize effect of disasters

Develop strategy
- Organize the computation
  - Choose computer system
  - Study scheduling system
  - Match the project computation flow onto the scheduling policies
- Manage the data
  - Input files generated by hand? By machine?
  - Space for large intermediate files
  - Space for output files
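
Matching the computation flow onto a scheduling policy often comes down to deciding how many runs to pack into one job. This is a rough sketch, assuming the scheduler enforces a wall-time limit per job; the 24 h limit, the safety factor, and the function name are hypothetical choices, not properties of any particular scheduler.

```python
import math


def plan_jobs(n_tasks: int, hours_per_task: float,
              walltime_limit_hours: float, safety_factor: float = 0.8):
    """Return (tasks_per_job, n_jobs) so that each job fits under the limit."""
    # Keep a margin below the limit because run times vary from case to case.
    usable_hours = walltime_limit_hours * safety_factor
    tasks_per_job = int(usable_hours // hours_per_task)
    if tasks_per_job < 1:
        raise ValueError("a single task does not fit in the wall-time limit")
    n_jobs = math.ceil(n_tasks / tasks_per_job)
    return tasks_per_job, n_jobs


if __name__ == "__main__":
    # 1,000 one-hour tasks under an assumed 24 h wall-time limit.
    per_job, jobs = plan_jobs(1000, 1.0, 24.0)
    print(f"{per_job} tasks per job, {jobs} jobs")
```

The same kind of estimate applies to the data side: multiplying the size of one intermediate or output file by the number of runs shows early whether the available disk space is sufficient.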

Automation
- Extra tools needed to manage the project?
  - Generate input files from a database?
    - Write scripts? Use a tool already developed?
  - Generate scheduler command files?
    - Does a tool exist? Some tools are very complex. Is it easier to write scripts than to learn the tool?
  - Collect data from output files into a database?
    - Write scripts? Write a compiled program?
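
As an illustration of the "write scripts" option, the sketch below generates one input file and one job script per case from a table of parameters. Everything specific here is an assumption made for the example: the `name = value` input format, the program name `./model`, and the file naming. Real scheduler directives are deliberately left out because they depend on the system chosen.

```python
import tempfile
from pathlib import Path


def generate_case_files(cases: dict, out_dir) -> list:
    """Write one input file and one job script per case; return the script paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    scripts = []
    for case_id, params in cases.items():
        # Hypothetical input format: one "name = value" line per parameter.
        input_path = out_dir / f"{case_id}.in"
        input_path.write_text(
            "".join(f"{name} = {value}\n" for name, value in sorted(params.items()))
        )
        # Hypothetical job script; "./model" stands in for the real program.
        script_path = out_dir / f"{case_id}.sh"
        script_path.write_text(
            "#!/bin/sh\n"
            "# Scheduler directives for the chosen system would go here.\n"
            f"./model {input_path.name} > {case_id}.out 2>&1\n"
        )
        scripts.append(script_path)
    return scripts


if __name__ == "__main__":
    cases = {f"case{i:04d}": {"b1": 0.1 * i, "b2": 2} for i in range(3)}
    with tempfile.TemporaryDirectory() as tmp:
        for script in generate_case_files(cases, tmp):
            print(script.name)
```

Generating the files from a single table means a change to the parameters is made in one place, which removes a common source of human error.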

Automation
- Computation and data monitoring
  - Check status of each run
    - Submit the job again if it failed
  - Check correctness and integrity of output data
    - Even if the job finished
      - it may have generated an error message
      - there may be no result
      - or the result may be invalid or incorrect
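
The checks listed above can be sketched as a small classifier that looks at each output file rather than trusting the job's exit status. The output conventions are assumptions for the example: error messages contain the word `ERROR`, and a valid run prints a line of the form `RESULT = <number>`. A real program will have its own markers.

```python
import math
from pathlib import Path


def classify_output(path) -> str:
    """Classify one output file as missing, error, no_result, invalid, or ok."""
    path = Path(path)
    if not path.exists():
        return "missing"
    text = path.read_text()
    if "ERROR" in text:  # assumed error marker
        return "error"
    result_lines = [ln for ln in text.splitlines() if ln.startswith("RESULT =")]
    if not result_lines:
        return "no_result"
    try:
        value = float(result_lines[-1].split("=", 1)[1])
    except ValueError:
        return "invalid"
    # NaN or infinity counts as an invalid result even though it parses.
    return "ok" if math.isfinite(value) else "invalid"


def cases_to_resubmit(out_dir, case_ids) -> list:
    """Return the cases whose output is anything other than ok."""
    out_dir = Path(out_dir)
    return [c for c in case_ids if classify_output(out_dir / f"{c}.out") != "ok"]
```

The list returned by `cases_to_resubmit` can be fed back into whatever mechanism submits jobs, so that failed runs are repeated without anyone having to inspect files by hand.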

Precautions
- Prepare for some disasters
  - Some or all computed data is lost or corrupted?
    - Make sure all files created manually are on disks that are backed up
    - At least, you can run computations again
  - Some output has been processed
    - Make sure partial results are on disks that are backed up
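
Backups protect against loss; detecting loss or corruption in the first place is a separate problem. One possible approach, not part of the original slides, is to keep a manifest of checksums for the files that matter and compare against it later. The sketch below uses SHA-256 from the Python standard library; the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size: int = 1 << 20) -> str:
    """Checksum a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root) -> dict:
    """Map each file under root (by relative path) to its checksum."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_damaged(root, manifest: dict) -> list:
    """Return relative paths that are missing or whose content has changed."""
    root = Path(root)
    damaged = []
    for rel, expected in manifest.items():
        path = root / rel
        if not path.is_file() or sha256_of(path) != expected:
            damaged.append(rel)
    return damaged
```

The manifest itself is a manually valuable file, so it belongs on a backed-up disk alongside the hand-written inputs.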

Growing projects
- Often projects start small
  - Procedures are developed and used
  - They work well for 1,000 cases
- Then the scope is increased
  - After partial success
  - Procedures are used unchanged
  - They do not work for 1,000,000 cases!
- Must perform new analysis when scope changes

Tool choices
- Small operations
  - Scripts are easy to write and change
  - Run fast for small numbers
- Large operations
  - Running a script 10,000,000 times may be very slow and cause unexpected side effects
  - Investigate better tools
    - Program in compiled language
    - Use database instead of simple files
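
To illustrate the last suggestion, here is a minimal sketch that keeps results in a single SQLite database instead of one small file per case. The table layout and function names are assumptions for the example; SQLite is used only because it ships with Python, and a larger project might well choose a different database.

```python
import sqlite3


def open_results_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a results database with one row per case."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS results ("
        "case_id TEXT PRIMARY KEY, value REAL NOT NULL, status TEXT NOT NULL)"
    )
    return conn


def record_results(conn: sqlite3.Connection, rows) -> None:
    """Insert many (case_id, value, status) rows in a single transaction."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO results (case_id, value, status) VALUES (?, ?, ?)",
            rows,
        )


def summarize(conn: sqlite3.Connection) -> dict:
    """Count cases per status, e.g. to see how many still need rerunning."""
    return dict(conn.execute("SELECT status, COUNT(*) FROM results GROUP BY status"))


if __name__ == "__main__":
    conn = open_results_db()
    record_results(conn, [("case0001", 1.5, "ok"), ("case0002", 0.0, "error")])
    print(summarize(conn))
```

With millions of cases, a query such as the one in `summarize` replaces a loop over millions of files, which is where scripts over simple files tend to become slow.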

Conclusion
- A little bit of thought can save you from a lot of trouble and extra work
- Every scientific computation project that is worth doing is worth a little bit of thought about how to do it.