Building the e-Minerals Minigrid
Rik Tyer, Lisa Blanshard, Kerstin Kleese (Data Management Group)
Rob Allan, Andrew Richards (Grid Technology Group)
AHM, Nottingham 2003
Royal Institution
University of Reading
Project Members
Who we are
• One of Europe's largest research support organisations
• Provides large-scale experimental, data, and computing facilities
• Serves the UK research community in both academia and industry
• Annually supports 12,000 scientists from all major scientific domains
• 1800 members of staff across three sites:
Rutherford Appleton Laboratory in Oxfordshire
Daresbury Laboratory in Cheshire
Chilbolton Observatory in Hampshire
• Large quantities of data associated with the various facilities
• Major e-Science centre in the UK
http://www.cclrc.ac.uk
Council for the Central Laboratory of the Research Councils
Environmental Issues
Radioactive waste disposal
Crystal growth and scale inhibition
Pollution: molecules and atoms on mineral surfaces
Crystal dissolution and weathering
Examples of Codes
DL_POLY3: parallel molecular dynamics code. Modifications aimed at running efficient simulations with millions of atoms for studies of radiation damage (Daresbury)
SIESTA: Order-N quantum mechanics code. Objective to run with large samples and realistic fluids (Cambridge)
SURFACE SIMULATIONS: New developments aimed at efficient scanning of many configurations of complex fluid-mineral interfaces, for studies of crystal growth and dissolution (Bath)
Data Management Requirements
• Many output files produced from each simulation run
• Each set of input and output files is a dataset
• Need to keep information about each simulation – metadata
• Other scientists need access to this information and datasets
• Need to search different metadata repositories at once
• Access could be from anywhere in the world
• Need to categorise data so it can be found by someone else
• These requirements are the same for all scientists
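The requirement to search several metadata repositories at once can be sketched as a simple fan-out query. This is a hypothetical illustration only: the repository names and the record layout below are invented, not the DataPortal's actual schema.

```python
# Hypothetical sketch of "search different metadata repositories at once":
# fan a keyword query out over several repositories and merge the hits.
# Repository names and the record layout are invented for illustration.

def search_repositories(repositories, keyword):
    """Return every record whose topic or description mentions the keyword."""
    keyword = keyword.lower()
    hits = []
    for repo_name, records in repositories.items():
        for record in records:
            text = (record["topic"] + " " + record["description"]).lower()
            if keyword in text:
                hits.append({"repository": repo_name, **record})
    return hits

repos = {
    "daresbury": [
        {"topic": "Radiation Damage", "description": "DL_POLY3 run, millions of atoms"},
    ],
    "cambridge": [
        {"topic": "Mineral Surfaces", "description": "SIESTA fluid-mineral interface study"},
    ],
}

results = search_repositories(repos, "mineral")  # matches the Cambridge record only
```

In the real system each repository sits behind its own web service; the fan-out idea is the same, but the queries travel over SOAP rather than in-process calls.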
Integrated Portals using web services
[Architecture diagram: external applications, the DataPortal, and the HPCPortal are connected by web services to metadata databases and to high-performance computers on the Grid]
DataPortal
Metadata Object (Scientific Metadata Model)
Topic: the discipline hierarchy, e.g. Earth Sciences/Soil Contamination/Heavy Metals/Arsenic
Study Description: provenance, recording what the study is, who did it and when
Access Conditions: conditions of use, stating who can access the data and how
Data Description: a detailed description of the organisation of the data into datasets and files
Data Location: navigational references to where the data on the study can be found
Related Material: references into the literature and community, providing context about the study
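The six-part metadata object can be sketched as a small Python dataclass. The field names below paraphrase the slide; this is an illustration, not the CCLRC metadata schema itself.

```python
from dataclasses import dataclass, field

@dataclass
class StudyMetadata:
    """Sketch of the six components of the scientific metadata model."""
    topic: str                  # discipline hierarchy, e.g. "Earth Sciences/.../Arsenic"
    study_description: str      # provenance: what the study is, who did it, when
    access_conditions: str      # who may access the data, and how
    data_description: str       # organisation of the data into datasets and files
    data_locations: list = field(default_factory=list)    # where the data can be found
    related_material: list = field(default_factory=list)  # literature/community references

record = StudyMetadata(
    topic="Earth Sciences/Soil Contamination/Heavy Metals/Arsenic",
    study_description="Surface simulation study, e-Minerals project, 2003",
    access_conditions="Project members only",
    data_description="One dataset per simulation run: input deck plus output files",
    data_locations=["srb://placeholder"],  # placeholder, not a real SRB handle
)
```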
DataPortal – Use Cases
[Use-case diagram: a scientist or an external application uses the DataPortal to request metadata from multiple metadata repositories, store associated data files, and transfer data files to and from remote machines]
Plan of Work
High-requirement science
High-performance codes
Collaborative environment
e-Minerals Minigrid and Portal
• The minigrid makes the project's shared computing resources available through the UK e-Science grid (built using Globus and incorporating the Storage Resource Broker)
• The e-Minerals minigrid is accessed through the e-Minerals portal, based on the HPCPortal and DataPortal developed at Daresbury Laboratory
• The e-Minerals minigrid links to a Storage Resource Broker to store the outputs of simulation runs
e-Minerals Portal
DataPortal Results
Condor Technologies
Condor:
A maturing technology for building small or large distributed computing systems from standard desktop computers
The important point is that Condor can exploit idle time on desktops, harnessing the potential of powerful processors that would otherwise sit unused
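A Condor job is described in a submit description file and queued with `condor_submit`. The fragment below is a hypothetical example: the executable and file names are made up, and a real DL_POLY run would need its own input and transfer settings.

```
# Hypothetical Condor submit description file: queue one simulation run
# in the vanilla universe (executable and file names are invented).
universe    = vanilla
executable  = dl_poly.exe
input       = CONTROL
output      = run.out
error       = run.err
log         = run.log
queue
```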
The UCL Windows Condor Pool
• Runs WTS (Windows Terminal Server)
• Approximately 750 CPUs in 30 clusters
• Most are 1 GHz Pentium 4 machines with 256/512 MB RAM and 40 GB hard disks
• All are 90%+ underutilised and running 24/7
• We are using Condor to turn this pool into a massive distributed computing system
Technology Used
• Operating system: SuSE Linux 8.1 (kernel 2.4.19-4GB)
• Sun Microsystems J2SDK version 1.4
• All Data Portal web services built and deployed under Apache Tomcat version 4.1.18 using Apache Ant version 1.5.1
• Apache Axis as the SOAP engine
• MyProxy server used for authentication
• Systinet UDDI Server version 4.5 for the Lookup Web Service
• PostgreSQL (version 7.2.2-16) databases for the Lookup, Session Manager, Access & Control, and Shopping Cart Web Services
• HPCPortal services built using the Globus 2 toolkit with gSOAP 2 libraries, deployed under the standard Apache HTTP server 1.3.27
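The portal's web services speak SOAP over HTTP (Apache Axis on the server side). A request envelope can be sketched with nothing but the standard library; the namespace, operation, and parameter names below are invented for illustration, since the real services define their own in their WSDL.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace (standard); the service namespace is invented.
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"

def build_envelope(operation, params, ns="urn:example:dataportal"):
    """Build a SOAP 1.1 request envelope; operation/params are hypothetical."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element("{%s}Envelope" % SOAP_NS)
    body = ET.SubElement(envelope, "{%s}Body" % SOAP_NS)
    op = ET.SubElement(body, "{%s}%s" % (ns, operation))
    for name, value in params.items():
        ET.SubElement(op, "{%s}%s" % (ns, name)).text = value
    return ET.tostring(envelope, encoding="unicode")

xml = build_envelope("requestMetadata", {"sessionId": "abc123", "topic": "Arsenic"})
# The envelope would then be POSTed to the service endpoint with a
# Content-Type of text/xml and a SOAPAction header.
```

In practice a generated client stub (from the service's WSDL) would build and send this envelope; the sketch just shows what travels over the wire.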
Further Information
Environment from the Molecular Level
http://www.e-science.clrc.ac.uk/web/projects/eminerals
http://eminerals.org/
e-Minerals Minigrid (needs an X.509 certificate)
http://wk-pc1.dl.ac.uk:8080/dataportal/eminerals.html
Integrated e-Science Environment Portal
http://esc.dl.ac.uk/IeSE/
HPC Grid Services Portal
http://esc.dl.ac.uk/HPCPortal/
DataPortal demonstration
http://esc.dl.ac.uk:9000/index.html
UK CCLRC e-Science Centre
http://www.e-science.clrc.ac.uk