Job submission architectures in GRID environment
Masamichi Ando, M1 Student, 26403
Taura Lab., Department of Information Science and Technology
Background(1)
Large computing power is required in emerging scientific fields: astronomy, high-energy physics, genomic science, etc.
Background(2)
Connecting many computers through a network gives us large computing power:
  Fast computational resources
  Fast access to large quantities of data
  Access to data that is physically distant
But large-scale distributed computing (Grid) still has many problems. Here we look at job submission.
Contents
1. GRID environment
2. How single sign-on is realized
3. Towards rapid job submission
4. GRID beyond firewalls
5. Conclusion
1. GRID environment
Because of its characteristics, a grid environment requires specific architectural support, such as secure authentication and authorization.
Grid
A Grid is large-scale (continental-scale or national-scale) distributed computing.
E.g., CERN’s LHC:
  A global project in particle physics that will start in 2006
  It will generate petascale data every year
Grid includes different administrative domains
User population is large and dynamic
A system that has only one “master” does not scale.
[Figure: many users and nodes (node1 to node5) all depending on a single server]
Resource pool is large and dynamic
A system in which a crash in one part affects the whole does not scale.
[Figure: networks connected together, with a crash in one part]
A computation acquires and releases resources dynamically.
Single sign-on: the user should be able to authenticate once and then compute without further authentication.
[Figure: a user reaching computing resources without further authentication]
Communication support
Some applications require specific communication mechanisms:
  Unicast and multicast
  Low-level communication connections (e.g., TCP)
  Dynamic connections for dynamic resources and users
Authentication and Security
Resources are subject to their local security policies.
An individual user is associated with a different local name space in each administrative domain.
About job submission
We require:
  Single sign-on
  Rapid and scalable job submission
  More nodes able to participate in the Grid
2. How single sign-on is realized
Survey of GSI (Grid Security Infrastructure), developed as part of the Globus project
Globus toolkit (de facto standard)
The Globus toolkit is a bag of services for GRID computing.
One of them is GSI (Grid Security Infrastructure).
GSI provides single sign-on and other security architectures.
USER PROXY
Definition: a session manager process given permission to act on behalf of a user for a limited period of time.
Advantage: the user can realize single sign-on by generating a user proxy before computing.
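As a rough illustration of the user-proxy idea, the Python sketch below models a proxy as a credential that the user signs once and that is valid only until an expiry time. It is a toy model and not how GSI is implemented: GSI is built on X.509 certificates, and the names `issue_proxy` and `verify_proxy`, as well as the HMAC signing scheme, are assumptions made for this sketch only.

```python
import hashlib
import hmac
import time


def issue_proxy(user, user_secret, lifetime_s, now=None):
    """The user signs once; the proxy may act for the user until it expires."""
    now = time.time() if now is None else now
    expires = int(now + lifetime_s)
    message = f"{user}|{expires}".encode()
    signature = hmac.new(user_secret, message, hashlib.sha256).hexdigest()
    return {"user": user, "expires": expires, "signature": signature}


def verify_proxy(proxy, user_secret, now=None):
    """Accept the proxy only if the signature matches and it has not expired.

    Toy simplification: the verifier shares the user's secret. A real system
    would use public-key signatures so the verifier never sees the secret.
    """
    now = time.time() if now is None else now
    message = f"{proxy['user']}|{proxy['expires']}".encode()
    expected = hmac.new(user_secret, message, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, proxy["signature"]) and now < proxy["expires"]
```

The user enters the secret only when calling `issue_proxy`; every later check goes through the proxy, which is the single sign-on property described above.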
RESOURCE PROXY
Definition: an agent that represents a resource. It serves as the interface between the grid security architecture and the local security architecture.
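One job of such an interface is to translate a grid-wide identity into whatever local account each site uses. The sketch below shows that translation as a per-site lookup table. The site names, identities, account names, and the function `map_to_local_user` are all hypothetical and chosen only for illustration.

```python
# Hypothetical per-site tables: the same global identity maps to a
# different local account name in each administrative domain.
SITE_MAPS = {
    "site_a": {"grid:ando": "mando"},
    "site_b": {"grid:ando": "guest042"},
}


def map_to_local_user(site, global_id, site_maps=SITE_MAPS):
    """Return the local account for a global identity, or None if the
    site has no entry for it (the site does not authorize the user)."""
    return site_maps.get(site, {}).get(global_id)
```

Keeping the table per site lets each domain apply its own local policy without the user having to know the local names.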
Resource Allocation Protocol
[Figure: a user, a user proxy process, resource proxies, and a child process spread across Site A, Site B, and Site C]
3. Towards rapid job submission
Survey of Gfpmd (Gfarm Process Management Daemon), developed by Iwasaki
Gfpmd
Gfpmd is developed as part of Gfarm (Grid Data Farm).
The Gfarm architecture is designed for global petascale data-intensive computing.
Gfarm uses GSI for communication.
Overhead of authentication
When GSI is used for authentication and a job is started in a naive way, startup time is proportional to the number of nodes.
Starting a job that consists of thousands of processes is expected to take several thousand seconds.
Gfpmd aims to shorten this overhead.
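The estimate above can be written as simple arithmetic. The sketch below assumes about one second per GSI authentication, a value chosen only because it is consistent with "several thousand seconds for thousands of processes" and not a measured figure. It also assumes that, once daemons have authenticated each other in advance, the user authenticates to a single daemon, and it ignores the cost of forwarding the job request.

```python
def naive_startup_seconds(num_processes, auth_seconds=1.0):
    """Authenticate to every process in turn: cost grows linearly."""
    return num_processes * auth_seconds


def preconnected_startup_seconds(auth_seconds=1.0):
    """Daemons are already connected; the user authenticates once.
    Forwarding cost between daemons is ignored in this sketch."""
    return auth_seconds
```

For example, `naive_startup_seconds(3000)` gives 3000.0 seconds under these assumptions, while the pre-connected case stays at a single authentication regardless of job size.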
Gfpmd: connect before computing (GSI authentication with host credential)
[Figure: a gfpmd daemon runs on each of Node A, Node B, Node C, and Node D; the user reaches the daemons through GSI authentication]
Ring-connection structure (1)
[Figure: a crash occurs]
Ring-connection structure (2)
Ring-connection structure (3)
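The slides do not spell out how the ring reacts to a crash. One plausible reading, sketched here purely as an assumption, is that the node just before the crashed one reconnects to the next live node so that the ring stays closed. The function name `ring_successors` is hypothetical.

```python
def ring_successors(nodes, crashed=frozenset()):
    """Map each live node to the next live node in ring order.

    Crashed nodes are skipped, so their predecessor links directly
    to their successor and the ring remains closed.
    """
    alive = [n for n in nodes if n not in crashed]
    return {n: alive[(i + 1) % len(alive)] for i, n in enumerate(alive)}
```

With nodes A, B, C, D and C crashed, B's successor becomes D, so a failure in one part does not break the whole structure.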
I/O tree is built in parallel for each job
[Figure: two I/O trees built over nodes 1 to 7]
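A minimal sketch of building such a tree follows. It assumes a complete binary tree laid out by list index. That shape is an assumption and is not taken from the gfpmd implementation, although the node counts used in the measurement (3, 7, 15, 31, 63) are all of the form 2^k - 1, which would fit complete binary trees. The function names are hypothetical.

```python
def build_io_tree(nodes):
    """Arrange nodes as a complete binary tree: the node at list index i
    gets the nodes at indices 2i+1 and 2i+2 as children, if they exist."""
    tree = {}
    for i, node in enumerate(nodes):
        tree[node] = [nodes[c] for c in (2 * i + 1, 2 * i + 2) if c < len(nodes)]
    return tree


def tree_depth(num_nodes):
    """Hops from the root to the deepest node (requires num_nodes >= 1)."""
    return num_nodes.bit_length() - 1
```

Under this layout the depth grows logarithmically: 7 nodes need 2 hops and 63 nodes need 5, so output can be collected in far fewer steps than the number of nodes.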
Examination
Examine gfpmd with a small job. (2001.1.29)
[Chart: elapsed time in seconds (0 to 7) versus number of nodes (3, 7, 15, 31, 63), broken down into authentication, job request, I/O tree building, and output]
4. GRID beyond firewalls
Survey of VPG (Virtual Private Grid), developed by Kaneda
Restriction
VPG is designed to:
  Automatically work around administrative restrictions
  Utilize machines without changing administrative restrictions
[Figure: Node A, Node B, and Node C around a subnet; a direct connection is not possible, so one must log on to a gateway]
VPG
VPG provides a shell with:
  Nicknaming (giving a unique name to each host, independent of DNS names)
  Job submission to any nicknamed host
  Redirection from/to a file on any host
  Pipes between commands executed on any host
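VPG architecture
[Figure: Node A, Node B, and Node C across a LAN and the Internet, with private and global IP addresses; where a node cannot connect directly, vpgd daemons keep a bi-directional connection]
Using SSH port forwarding
[Figure: three nodes, one with a private IP on a LAN and two with global IPs; where a direct connection is impossible, the vpgd daemons use SSH port forwarding with an empty pass-phrase]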
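For readers unfamiliar with SSH port forwarding, the sketch below builds the command line for a local forward through a gateway. How vpgd actually invokes ssh is not shown in the slides, so this is only an assumed illustration; the function name, host names, and port numbers are hypothetical. The `-L` and `-N` options themselves are standard OpenSSH options (local port forwarding, and no remote command).

```python
def ssh_forward_command(gateway, target_host, target_port, local_port, user=None):
    """Build an argv list for `ssh -N -L`, which forwards connections made
    to local_port on this machine to target_host:target_port via gateway."""
    destination = f"{user}@{gateway}" if user else gateway
    forward_spec = f"{local_port}:{target_host}:{target_port}"
    return ["ssh", "-N", "-L", forward_spec, destination]
```

The returned list could be passed to `subprocess.Popen`; with a key that has an empty pass-phrase, as the slide describes, the tunnel can be set up without user interaction.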
VPG nicknaming
[Figure: three nodes, each running vpgd]
  Node A: global IP “xxx.xxx.xxx.xxx”, *.u-tokyo.ac.jp, nickname “earth”
  Node B: LAN X, private IP “192.168.0.2”, no DNS name, nickname “sky”
  Node C: LAN Y, private IP “192.168.0.2”, no DNS name, nickname “marine”
Node B and Node C have the same private IP and no DNS name, so a job is sent to Node B or to Node C by nickname.
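The nickname table from this slide can be sketched as a small lookup structure. The host entries are taken from the figure; the dictionary layout and the function name `resolve` are assumptions made for illustration and do not describe VPG's internal data structures.

```python
# Nickname table from the slide: two hosts share the same private IP
# but sit on different LANs, so only the nickname tells them apart.
NICKNAMES = {
    "earth": {"network": "global", "address": "xxx.xxx.xxx.xxx"},
    "sky": {"network": "LAN X", "address": "192.168.0.2"},
    "marine": {"network": "LAN Y", "address": "192.168.0.2"},
}


def resolve(nickname, table=NICKNAMES):
    """Return (network, address) for a nickname; raises KeyError if unknown."""
    entry = table[nickname]
    return entry["network"], entry["address"]
```

Here `resolve("sky")` and `resolve("marine")` return the same address but different networks, which is why the IP address alone cannot identify the target host.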
Spanning tree connection
[Figure: a spanning tree of connections starting from the home node; labels: home node, normal ssh]
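To show how a job might travel hop by hop across such connections, the sketch below finds the path from the home node to a target host with a breadth-first search over the links. The link list, the node names in the usage example, and the function `route` are hypothetical; the slides do not describe VPG's actual routing.

```python
from collections import deque


def route(links, home, target):
    """Breadth-first search for the hop-by-hop path from home to target.

    links is a list of (a, b) pairs, each a bi-directional connection.
    Returns the list of nodes on the path, or None if target is unreachable.
    """
    adjacency = {}
    for a, b in links:
        adjacency.setdefault(a, []).append(b)
        adjacency.setdefault(b, []).append(a)

    previous = {home: None}
    queue = deque([home])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in adjacency.get(node, []):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None
```

For instance, with links home-earth, earth-sky, and earth-marine, a job for "marine" would pass through "earth", which corresponds to a 2-hop submission.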
Examination
Compare vpg with other tools by submitting a small job.
[Chart: elapsed time in seconds (0 to 6) for 1 hop, 2 hops, and 3 hops, comparing globus-job-run, ssh, rsh, and vpg]
5. Conclusion
We introduced the GRID environment and three architectures for job submission:
  A single sign-on architecture using USER PROXY
  A rapid job submission architecture using gfpmd
  An architecture to utilize machines beyond firewalls using vpgd