Introduction to Grid Computing
Alina Bejan - University of Chicago
SC08 Education Program, OU - Oklahoma
Aug 14 2008
1
Introduction to Grid Computing
2
Computing “Clusters” are today’s Supercomputers
[Diagram: a typical cluster — a cluster-management "frontend", tape backup robots, I/O servers (typically RAID fileservers), disk arrays, lots of worker nodes, and a few head nodes, gatekeepers and other service nodes.]
Introduction to Grid Computing
3
Cluster Architecture
[Diagram: cluster users connect over Internet protocols to the head node(s), which provide login access (ssh), the cluster scheduler (PBS, Condor, SGE), web services (http), and remote file access (scp, FTP, etc.). The head node passes job execution requests and status to the compute nodes (Node 0 … Node N; 10 to 10,000 PCs with local disks), which share a cluster filesystem/storage holding applications and data.]
Introduction to Grid Computing
4
Scaling up Science: Citation Network Analysis in Sociology
[Figure: snapshots of the citation network from 1975 to 2002]
Work of James Evans, University of Chicago, Department of Sociology
Introduction to Grid Computing
5
Scaling up the analysis
Query and analysis of 25+ million citations
Work started on desktop workstations; queries grew to month-long duration
With data distributed across the U of Chicago TeraPort cluster, 50 (faster) CPUs gave a 100x speedup
Many more methods and hypotheses can be tested!
Higher throughput and capacity enable deeper analysis and broader community access.
Introduction to Grid Computing
6
Grids consist of distributed clusters
[Diagram: a grid client (application & user interface plus grid client middleware) talks grid protocols to resource, workflow & data catalogs and to many grid sites — Grid Site 1: Fermilab, Grid Site 2: São Paulo, … Grid Site N: UWisconsin — each running grid service middleware in front of a compute cluster and grid storage.]
Introduction to Grid Computing
7
HPC vs. HTC
HTC: using many computing resources over long periods of time to accomplish a computational task.
HPC tasks are characterized as needing large amounts of computing power for short periods of time, whereas HTC tasks also require large amounts of computing, but for much longer times (months and years, rather than hours and days).
Fine-grained calculations are better suited to HPC: big, monolithic supercomputers, or very tightly coupled computer clusters with lots of identical processors and an extremely fast, reliable network between the processors.
Embarrassingly parallel (coarse-grained) calculations are ideal for HTC: more loosely coupled networks of computers where delays in getting results from one processor will not affect the work of the others.
Introduction to Grid Computing
8
Grids represent a different approach
– Building an HTC resource by joining clusters together in a grid
– Example: the Open Science Grid's current compute resources
  – 70 Open Science Grid sites, but all very different
  – Connected via Inet2, NLR, ... at 10 Gbps – 622 Mbps
  – Compute Elements & Storage Elements
  – Most are Linux clusters
  – Most are shared
    • Campus grids
    • Local non-grid users
  – More than 20,000 CPUs
    • A lot of opportunistic usage
    • Total computing capacity difficult to estimate
    • Same with storage
The OSG
Introduction to Grid Computing
9
Initial Grid driver: High Energy Physics
There is a "bunch crossing" every 25 nsecs.
There are 100 "triggers" per second.
Each triggered event is ~1 MByte in size.
Physicists work on analysis "channels". Each institute will have ~10 physicists working on one or more channels; data for these channels should be cached by the institute server.
[Diagram (image courtesy Harvey Newman, Caltech): the LHC tiered computing model — the online system (~PBytes/sec from the detector) feeds the ~20 TIPS offline processor farm at the CERN Computer Centre (Tier 0) at ~100 MBytes/sec; Tier 1 regional centres (FermiLab ~4 TIPS, France, Italy, Germany) connect at ~622 Mbits/sec (or air freight, deprecated); Tier 2 centres at ~1 TIPS each (e.g. Caltech) connect at ~622 Mbits/sec; institutes (~0.25 TIPS, Tier 3) hold physics data caches and serve physicist workstations at ~1 MBytes/sec. 1 TIPS is approximately 25,000 SpecInt95 equivalents.]
Introduction to Grid Computing
10
Mining Seismic data for hazard analysis (Southern Calif. Earthquake Center).
InSAR Image of the Hector Mine Earthquake
• A satellite-generated Interferometric Synthetic Aperture Radar (InSAR) image of the 1999 Hector Mine earthquake.
• Shows the displacement field in the direction of radar imaging.
• Each fringe (e.g., from red to red) corresponds to a few centimeters of displacement.
[Diagram: inputs to the Seismic Hazard Model — seismicity, paleoseismology, local site effects, geologic structure, faults, stress transfer, crustal motion, crustal deformation, seismic velocity structure, rupture dynamics.]
Introduction to Grid Computing
11
Grids can process vast datasets. Many HEP and Astronomy experiments consist of:
Large datasets as inputs (find datasets)
"Transformations" which work on the input datasets (process)
The output datasets (store and publish)
The emphasis is on the sharing of these large datasets.
Workflows of independent programs can be parallelized.
[Figure: Montage workflow — ~1200 jobs, 7 levels (NVO, NASA, ISI/Pegasus; Deelman et al.); a mosaic of M42 created on TeraGrid. Legend: data transfer and compute job nodes.]
Introduction to Grid Computing
12
Ian Foster’s Grid Checklist
A Grid is a system that:
Coordinates resources that are not subject to centralized control
Uses standard, open, general-purpose protocols and interfaces
Delivers non-trivial qualities of service
Introduction to Grid Computing
13
The Grid Middleware Stack (and course modules)
[Diagram: layered stack, top to bottom — Grid Application (often includes a Portal); Workflow system (explicit or ad-hoc); Job Management, Data Management and Grid Information Services, resting on the Grid Security Infrastructure; Core Globus Services; Standard Network Protocols and Web Services.]
Job and resource management
14
Introduction to Grid Computing
15
How do we access the grid?
Command line, with tools that you'll use
Specialised applications
  Ex: a program to process images that sends data to run on the grid as a built-in feature
Web portals
  I2U2
  SIDGrid
Introduction to Grid Computing
16
Grid Middleware glues the grid together
A short, intuitive definition:
the software that glues together different clusters into a grid
taking into consideration the socio-political side of things (such as common policies on who can use what, how much, and what for)
Introduction to Grid Computing
17
Grid middleware components
Job management
Storage management
Information Services
Security
Introduction to Grid Computing
18
Globus and Condor play key roles
Globus Toolkit provides the base middleware
  Client tools which you can use from a command line
  APIs (scripting languages, C, C++, Java, …) to build your own tools, or use directly from applications
  Web service interfaces
  Higher level tools built from these basic components, e.g. Reliable File Transfer (RFT)
Condor provides both client & server scheduling
  In grids, Condor provides an agent to queue, schedule and manage work submission
Introduction to Grid Computing
19
Globus Toolkit
Developed at UChicago/ANL and USC/ISI (Globus Alliance) - globus.org
Open source
Adopted by different scientific communities and industries
Conceived as an open set of architectures, services and software libraries that support grids and grid applications
Provides services in major areas of distributed systems:
  Core services
  Data management
  Security
Introduction to Grid Computing
20
Globus - core services
A toolkit that can serve as a foundation to build a grid
  Authorization
  Message-level security
  System-level services (e.g., monitoring)
Associated data management provides file services:
  GridFTP
  RFT (Reliable File Transfer)
  RLS (Replica Location Service)
Introduction to Grid Computing
21
Local Resource Managers (LRM)
Compute resources have a local resource manager (LRM) that controls:
  Who is allowed to run jobs
  How jobs run on a specific resource
  The order and location of jobs
Example policy: each cluster node can run one job; if there are more jobs, they must wait in a queue
Examples: PBS, LSF, Condor (a minimal PBS script is sketched below)
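As a concrete illustration of what a user hands to an LRM such as PBS, here is a minimal sketch of a batch script; the job name, resource request and file name are only illustrative:
  #!/bin/sh
  #PBS -N hello                              # job name
  #PBS -l nodes=1:ppn=1,walltime=00:10:00    # ask for 1 CPU for 10 minutes
  cd $PBS_O_WORKDIR                          # run in the directory the job was submitted from
  /bin/hostname                              # the actual work: print the worker node's name
It would be submitted with 'qsub hello.pbs' and inspected with 'qstat'; GRAM (next slides) hides these LRM-specific details behind a uniform interface.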
Introduction to Grid Computing
22
Local Resource Manager: a batch scheduler for running jobs on a computing cluster
Popular LRMs include:
  PBS – Portable Batch System
  LSF – Load Sharing Facility
  SGE – Sun Grid Engine
  Condor – originally for cycle scavenging, Condor has evolved into a comprehensive system for managing computing
LRMs execute on the cluster's head node
The simplest LRM allows you to "fork" jobs quickly
  Runs on the head node (gatekeeper) for fast utility functions
  No queuing (but this is emerging, to "throttle" heavy loads)
In GRAM, each LRM is handled with a "job manager"
Introduction to Grid Computing
23
GRAM: Globus Resource Allocation Manager
GRAM provides a standardised interface to submit jobs to LRMs.
Clients submit a job request to GRAM.
GRAM translates it into something a(ny) LRM can understand.
The same job request can be used for many different kinds of LRM.
Introduction to Grid Computing
24
Job Management on a Grid
[Diagram: a user submits jobs to the Grid; at each site (Site A … Site D) GRAM hands them to that site's LRM — Condor, PBS, LSF, or simple fork.]
Introduction to Grid Computing
25
GRAM's abilities
Given a job specification, GRAM:
  Creates an environment for the job
  Stages files to and from the environment
  Submits the job to a local resource manager
  Monitors the job
  Sends notifications of job state changes
  Streams the job's stdout/err during execution
(An illustrative job specification, in RSL, is sketched below.)
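For the pre-web-service GRAM used in this course, such a job specification is expressed in the Globus RSL language; a minimal sketch, where the executable, arguments and file names are placeholders:
  & (executable = /bin/echo)
    (arguments  = "Hello from the grid")
    (stdout     = hello.out)
    (stderr     = hello.err)
    (count      = 1)
Tools like globus-job-run construct a request of this form for you from their command-line arguments.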
Introduction to Grid Computing
26
GRAM components
[Diagram: globus-job-run on the submitting machine (e.g. the user's workstation) contacts the gatekeeper over the Internet; the gatekeeper starts a jobmanager, which submits the work to the LRM (e.g. Condor, PBS, LSF), which in turn runs it on the worker nodes / CPUs.]
Introduction to Grid Computing
27
Submitting a job with GRAM
The globus-job-run command:
$ globus-job-run workshop1.ci.uchicago.edu /bin/hostname
Run '/bin/hostname' on the resource workshop1.ci.uchicago.edu
We don't care what LRM is used on 'workshop1'; this command works with any LRM. (To target a specific LRM, a jobmanager can be named explicitly, as sketched below.)
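A hedged sketch of naming the job manager in the GRAM contact string; whether 'workshop1' actually runs a jobmanager-pbs is an assumption for illustration:
  $ globus-job-run workshop1.ci.uchicago.edu/jobmanager-pbs /bin/hostname
Here the host/jobmanager-xxx form asks the gatekeeper to hand the job to that particular job manager (and thus that LRM) instead of the site's default.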
Introduction to Grid Computing
28
Condor
Condor is a specialized workload management system for compute-intensive jobs (aka a batch system)
It is a software system that creates an HTC environment
Created at UW-Madison
  Detects machine availability
  Harnesses available resources
  Uses remote system calls to send R/W operations over the network
  Provides powerful resource management by matching resource owners with consumers (broker)
Introduction to Grid Computing
29
Condor manages your cluster
Given a set of computers…
  Dedicated
  Opportunistic (desktop computers)
And given a set of jobs…
  Can be a very large set of jobs
Condor will run the jobs on the computers
  Fault tolerance (restart the jobs)
  Priorities
  Can run them in a specific order
  With more features than we can mention here…
Introduction to Grid Computing
30
Condor - features
Checkpoint & migration
Remote system calls
Able to transfer data files and executables across machines
Job ordering
Job requirements and preferences can be specified via powerful expressions
Introduction to Grid Computing
31
Condor-G
Condor-G is a marketing term
  It's a feature of Condor
  It can be used without using Condor on your cluster
Feature: instead of submitting just to your local cluster, you can submit to an external grid — the "G" in Condor-G
Many grid types are supported: Globus (old or new), NorduGrid, CREAM, Amazon EC2, … others (illustrative grid_resource lines are sketched below)
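In a Condor-G submit file the grid type appears in the grid_resource line; a hedged sketch, where every host name is a placeholder and the exact syntax for each grid type should be checked against the Condor manual:
  universe      = grid
  # pre-web-service Globus GRAM at some site:
  grid_resource = gt2 some.site.edu/jobmanager-pbs
  # or, for a NorduGrid ARC site:
  # grid_resource = nordugrid arc.some.site.edu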
Introduction to Grid Computing
32
Remote Resource Access: Condor-G + Globus + Condor
[Diagram: in Organization A, the user's jobs (myjob1, myjob2, myjob3, myjob4, myjob5, …) sit in a Condor-G queue; Condor-G speaks the Globus GRAM protocol to Globus GRAM in Organization B, which submits the jobs to the LRM there.]
Introduction to Grid Computing
33
Why Condor-G?
Globus has command-line tools, right?
Condor-G provides extra features:
Access to multiple types of grid sites
A reliable queue for your jobs
Job throttling
Introduction to Grid Computing
34
Four Steps to Run a Job with Condor
These choices tell Condor how, when and where to run the job, and describe exactly what you want to run:
1. Choose a universe for your job
2. Make your job batch-ready
3. Create a submit description file
4. Run condor_submit
Introduction to Grid Computing
35
1. Choose a Universe
There are many choices:
  Vanilla: any old job
  Grid: run jobs on the grid
  Standard: checkpointing & remote I/O
  Java: better for Java jobs
  MPI: run parallel MPI jobs
  Virtual Machine: run a virtual machine as a job
  …
For now, we'll just consider grid
Introduction to Grid Computing
36
2. Make your job batch-ready
Must be able to run in the background: no interactive input, windows, GUI, etc.
Condor is designed to run jobs as a batch system, with pre-defined inputs for jobs
Can still use STDIN, STDOUT, and STDERR (the keyboard and the screen), but files are used for these instead of the actual devices
Organize data files
Introduction to Grid Computing
37
3. Create a Submit Description File
A plain ASCII text file (Condor does not care about file extensions)
Tells Condor about your job:
  Which executable to run and where to find it
  Which universe
  Location of input, output and error files
  Command-line arguments, if any
  Environment variables
  Any special requirements or preferences
Introduction to Grid Computing
38
Simple Submit Description File
  # myjob.submit file
  # Simple condor_submit input file
  # (Lines beginning with # are comments)
  # NOTE: the words on the left side are not
  # case sensitive, but filenames are!
  executable    = /sw/national_grids/primetest
  arguments     = 143
  output        = results.output
  error         = results.error
  log           = results.log
  notification  = never
  universe      = grid
  grid_resource = gt2 workshop1.ci.uchicago.edu/jobmanager-fork
  queue
Introduction to Grid Computing
39
4. Run condor_submit
You give condor_submit the name of the submit file you have created:
condor_submit my_job.submit
condor_submit parses the submit file
Introduction to Grid Computing
40
Other Condor commands
condor_q – show status of the job queue
condor_status – show status of compute nodes
condor_rm – remove a job
condor_hold – hold a job temporarily
condor_release – release a job from hold
(a usage sketch follows below)
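A hedged usage sketch; the job ID 42.0 is just a placeholder for whatever condor_submit and condor_q report:
  $ condor_q                 # list your jobs and their states (idle, running, held)
  $ condor_status            # list the compute nodes Condor knows about
  $ condor_hold 42.0         # put job 42.0 on hold
  $ condor_release 42.0      # let it run again
  $ condor_rm 42.0           # remove it from the queue entirely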
Introduction to Grid Computing
41
Submitting more complex jobs
Express dependencies between jobs: WORKFLOWS
And we would also like the workflow to be managed even in the face of failures
Introduction to Grid Computing
42
DAGMan: Directed Acyclic Graph Manager
DAGMan allows you to specify the dependencies between your Condor jobs, so it can manage them automatically for you
(e.g., "Don't run job B until job A has completed successfully.")
Introduction to Grid Computing
43
What is a DAG?
A DAG is the data structure used by DAGMan to represent these dependencies.
Each job is a "node" in the DAG.
Each node can have any number of "parent" or "child" nodes – as long as there are no loops!
[Diagram: a diamond-shaped DAG — Job A is the parent of Jobs B and C, which are both parents of Job D.]
Introduction to Grid Computing
44
Defining a DAG
A DAG is defined by a .dag file, listing each of its nodes and their dependencies:
  # diamond.dag
  Job A a.sub
  Job B b.sub
  Job C c.sub
  Job D d.sub
  Parent A Child B C
  Parent B C Child D
Each node will run the Condor job specified by its accompanying Condor submit file
[Diagram: the diamond DAG from the previous slide — Job A above Jobs B and C, which are above Job D.]
Introduction to Grid Computing
45
Submitting a DAG
To start your DAG, just run condor_submit_dag with your .dag file, and Condor will start a personal DAGMan daemon to begin running your jobs:
  % condor_submit_dag diamond.dag
condor_submit_dag submits a Scheduler Universe job with DAGMan as the executable.
Thus the DAGMan daemon itself runs as a Condor job, so you don't have to baby-sit it.
Introduction to Grid Computing
46
Why DAGMan?
Submit and forget
  You can submit a large workflow and let DAGMan run it
Fault tolerance
  DAGMan can help recover from many common failures
  Example: you can retry jobs automatically if they fail (a sketch follows below)
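A minimal sketch of the retry feature, added to the diamond.dag example from the previous slides (the retry count of 3 is arbitrary):
  # In diamond.dag: if node B fails, DAGMan re-submits it up to 3 times
  # before declaring the node (and hence the DAG) failed.
  Retry B 3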
Grid Workflow
47
Introduction to Grid Computing
48
A typical workflow pattern in image analysis runs many filtering apps (fMRI - functional magnetic resonance imaging)
[Diagram (workflow courtesy James Dobson, Dartmouth Brain Imaging Center): four anatomy volumes (3a, 4a, 5a, 6a) each pass through an align_warp job (against the reference volume ref) and then a reslice job; a softmean job averages the resliced volumes into an atlas; slicer jobs cut the atlas along x, y and z, and convert jobs turn the resulting .ppm slices into atlas_x.jpg, atlas_y.jpg and atlas_z.jpg.]
Introduction to Grid Computing
49
Workflow management systems
Orchestration of many resources over long time periods
  Very complex to do manually - workflow automates this effort
Enables restart of long-running scripts
Write scripts in a manner that's location-independent: run anywhere
  A higher level of abstraction gives increased portability of the workflow script (over ad-hoc scripting)
Introduction to Grid Computing
50
OSG & job submissions OSG sites present interfaces allowing remotely submitted
jobs to be accepted, queued and executed locally.
OSG supports the Condor-G job submission client, which interfaces to either the pre-web-service or the web-services Globus GRAM interface at the executing site.
Job managers at the backend of the GRAM gatekeeper support job execution by local Condor, LSF, PBS, or SGE batch systems.
Data Management
51
Introduction to Grid Computing
52
High-performance tools to solve data problems
The huge volume of data:
  Storing it
  Moving it
  [ Measured in terabytes, petabytes, and further … ]
The huge number of filenames:
  10^12 filenames are expected soon
  A collection of 10^12 of anything is a lot to handle efficiently
How to find the data
Introduction to Grid Computing
53
Data Questions on the Grid
Where are the files I want?
How do I move data/files to/from where I want?
Introduction to Grid Computing
54
Data management services provide the mechanisms to find, move and share data
GridFTP
  Fast, flexible, secure, ubiquitous data transport
  Often embedded in higher level services
RFT
  Reliable file transfer service using GridFTP
Replica Location Service (RLS)
  Tracks multiple copies of data for speed and reliability
Storage Resource Manager (SRM)
  Manages storage space allocation, aggregation, and transfer
Introduction to Grid Computing
55
GridFTP
A high-performance, secure, and reliable data transfer protocol, based on standard FTP
Extensions include:
  Strong authentication, encryption via Globus GSI
  Multiple data channels for parallel transfers
  Third-party transfers
Introduction to Grid Computing
56
A file transfer with GridFTP
The control channel can go either way, depending on which end is the client and which is the server; the data channel still flows in the same direction.
[Diagram: a client at Site A holds a control channel to a GridFTP server at Site B; the data channel carries the file between the two sites.]
Introduction to Grid Computing
57
Third-party transfer
A file transfer without data flowing through the client; the controller can be separate from the source and destination. Useful for moving data from storage to compute.
[Diagram: the client holds control channels to GridFTP servers at Site A and Site B; the data channel flows directly between the two servers.]
Introduction to Grid Computing
58
Going fast – parallel streams
Use several data channels (a command sketch follows below).
[Diagram: the client's control channel goes to the server; multiple parallel data channels carry the file between Site A and Site B.]
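With globus-url-copy, parallel data channels are requested on the command line; a hedged sketch reusing the placeholder URLs from the next slides (the stream count of 8 is arbitrary):
  $ globus-url-copy -p 8 \
      gsiftp://osg-edu.cs.wisc.edu/scratch/YOURLOGIN/largefile \
      file:///home/YOURLOGIN/dataex/largefile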
Introduction to Grid Computing
59
GridFTP usage: globus-url-copy
Conventions on URL formats:
  file:///home/YOURLOGIN/dataex/largefile
    a file called largefile on the local file system, in directory /home/YOURLOGIN/dataex/
  gsiftp://osg-edu.cs.wisc.edu/scratch/YOURLOGIN/
    a directory accessible via gsiftp on the host called osg-edu.cs.wisc.edu, in directory /scratch/YOURLOGIN
Introduction to Grid Computing
60
GridFTP examples
globus-url-copy file:///home/YOURLOGIN/dataex/myfile gsiftp://osg-edu.cs.wisc.edu/nfs/osgedu/YOURLOGIN/ex1

globus-url-copy gsiftp://osg-edu.cs.wisc.edu/nfs/osgedu/YOURLOGIN/ex2 gsiftp://tp-osg.ci.uchicago.edu/YOURLOGIN/ex3
Introduction to Grid Computing
61
RFT = Reliable File Transfer
• A service that provides reliability and fault tolerance for file transfers
• Part of the Globus Toolkit
• RFT acts as a client to GridFTP, providing management of a large number of transfer jobs (much as Condor does for GRAM)
• RFT can:
  • keep track of the state of each job
  • run several transfers at once
  • deal with connection failure, network failure, or failure of any of the servers involved
Introduction to Grid Computing
62
OSG & Data management
OSG relies on the GridFTP protocol for the raw transport of data, using Globus GridFTP in all cases except where interfaces to storage management systems (rather than file systems) dictate individual implementations.
OSG supports the SRM interface to storage resources to enable management of space and data transfers: to prevent unexpected errors due to running out of space, to prevent overload of the GridFTP services, and to provide capabilities for pre-staging, pinning and retention of data files.
OSG currently provides reference implementations of two storage systems: BeStMan and dCache.
Grid Security
63
Introduction to Grid Computing
64
Grid security is a crucial component
Need for secure communication:
  Privacy - only the intended parties can understand the conversation (use encryption)
  Integrity - messages arrive unchanged (use signed messages)
  Authentication - verify entities are who they claim to be (use certificates and CAs)
  Authorization - allow or deny access to services based on policies
Need to support "single sign-on" for users of the grid
Delegation of credentials for computations that involve multiple resources and/or sites
Introduction to Grid Computing
65
Identity & Authentication
Each entity should have an identity
Authenticate: establish identity
  Is the entity who it claims to be?
Examples:
  Driving license
  Username/password
Stops masquerading imposters
Introduction to Grid Computing
66
Authorization
Establishing rights: what can a certain identity do?
Examples:
  Are you allowed to be on this flight?
    As a passenger? As the pilot?
  Unix read/write/execute permissions
Must authenticate first
Introduction to Grid Computing
67
Single Sign-on
Important for complex applications that need to use Grid resources
Enables easy coordination of varied resources
Enables automation of processes
Allows remote processes and resources to act on the user's behalf
Authentication and delegation
Introduction to Grid Computing
68
Certificates
• Central concept in GSI authentication
• Every user, resource and service on the Grid is identified via a certificate
• Contains:
  – Subject name (identifies the entity)
  – The corresponding public key
  – The identity of the CA that has signed the certificate (to certify that the public key and the identity both belong to the subject)
  – The digital signature of the CA
• GSI certs are encoded in the X.509 certificate format (a sketch of inspecting a certificate follows below)
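A hedged sketch of inspecting your own certificate with the Globus command-line tools (output omitted):
  $ grid-cert-info -subject    # print the certificate's subject name
  $ grid-cert-info -issuer     # print the CA that signed it
  $ grid-cert-info -enddate    # print when the certificate expires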
Introduction to Grid Computing
69
Grid Security - in practice - steps:
Get a certificate from the relevant CA
  DOEGrids in our case
Request to be authorized for resources
  Meaning you will be added to the OSGEDU VOMS
Generate a proxy as needed
  Using grid-proxy-init
Run clients
  Authenticate, authorize, delegate as required
Numerous resources, different CAs, numerous credentials (a command-line sketch follows below)
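A hedged command-line sketch of the proxy step followed by a first job; the host is the workshop machine used elsewhere in this deck, and prompts/output are omitted:
  $ grid-proxy-init          # create a short-lived proxy from your certificate
                             # (you are asked for your GRID pass phrase)
  $ grid-proxy-info          # check the proxy's subject and remaining lifetime
  $ globus-job-run workshop1.ci.uchicago.edu /bin/hostname    # run a job using the proxy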
National Grid Cyberinfrastructure
70
Introduction to Grid Computing
71
Grid Resources in the US
The OSG
Origins: National Grid projects (iVDGL, GriPhyN, PPDG) and LHC Software & Computing Projects
Current compute resources:
  70 Open Science Grid sites
  Connected via Inet2, NLR, ... at 10 Gbps – 622 Mbps
  Compute & Storage Elements
  All are Linux clusters
  Most are shared
    Campus grids
    Local non-grid users
  More than 20,000 CPUs
  A lot of opportunistic usage
  Total computing capacity difficult to estimate; same with storage

The TeraGrid
Origins: national supercomputing centers, funded by the National Science Foundation
Current compute resources:
  14 TeraGrid sites
  Connected via dedicated multi-Gbps links
  Mix of architectures
    ia64, ia32 Linux
    Cray XT3
    Alpha (Tru64)
    SGI SMPs
  Resources are dedicated, but Grid users share with local users
  1000s of CPUs, > 40 TeraFlops
  100s of TeraBytes
Introduction to Grid Computing
72
Open Science Grid (OSG) provides shared computing resources, benefiting a broad set of disciplines
OSG incorporates advanced networking and focuses on general services, operations, end-to-end performance
Composed of a large number (>70 and growing) of shared computing facilities, or “sites”
http://www.opensciencegrid.org/
A consortium of universities and national laboratories, building a sustainable grid infrastructure for science.
Introduction to Grid Computing
73
OSG Snapshot
96 resources across production & integration infrastructures
20 Virtual Organizations + 6 operations
  Includes 25% non-physics
~20,000 CPUs (from 30 to 4000)
~6 PB of tape
~4 PB of shared disk
Snapshot of jobs on OSG — sustaining through OSG submissions:
  3,000-4,000 simultaneous jobs
  ~10K jobs/day
  ~50K CPU-hours/day
  Peak test jobs of 15K a day
Using production & research networks
Introduction to Grid Computing
74
To efficiently use a Grid, you must locate and monitor its resources:
  Check the availability of different grid sites
  Discover different grid services
  Check the status of "jobs"
  Make better scheduling decisions with information maintained on the "health" of sites
Introduction to Grid Computing
75
OSG - VORS (VO Resource Locator)
Introduction to Grid Computing
76
OSG Middleware
[Diagram: the OSG middleware layering, from infrastructure up to applications —
  Existing operating systems, batch systems and utilities
  Core grid technology distributions: Condor, Globus, MyProxy (shared with TeraGrid and others)
  Virtual Data Toolkit (VDT): core technologies + software needed by stakeholders; many components shared with EGEE
  OSG Release Cache: OSG-specific configurations, utilities, etc.
  VO Middleware, e.g. HEP (data and workflow management, etc.), Biology (portals, databases, etc.), Astrophysics (data replication, etc.)
  User science codes and interfaces]
Introduction to Grid Computing
77
TeraGrid provides vast resources via a number of huge computing facilities.
It is the world's largest, most comprehensive distributed cyberinfrastructure for open scientific research (750 teraflops of computing capability and more than 30 petabytes of storage).
Introduction to Grid Computing
78
Contact us
[email protected]
opensciencegrid.org/Education
  Online training course
  Materials, resources
Introduction to Grid Computing
79
Acknowledgments:
Presentation based on various OSG speakers' notes (Globus, Condor, Swift, GSI teams).
Introduction to Grid Computing
80
Lab
URL: opensciencegrid.org/GridSchool
Machines we'll use for login:
  workshop1.ci.uchicago.edu
  workshop2.ci.uchicago.edu
Username: trainXX, where XX is in [01 .. 40]
Password: 0808OKLA