
Grid Computing

Project Supervisor: Dr. Hassan Rajaei, Dept. of Computer Science, Bowling Green State University, Bowling Green, OH

Grid Computing With MPI Over Multiple Clusters

Presented by: Vasil Lalov James Murithi


Presentation Overview

Introduction

Clustering Concepts

Grid Computing Concepts

Our Contribution

Demonstration

Q/A Time


Parallel Programming Concepts

Example of a typical computer program:

[Diagram labels: Application; Process Data; Results]


Parallel Programming Concepts

An example of a primitive parallel program:

[Diagram labels: Application; Master Process; Data; Processor; Processor; Results]
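The master/worker layout on this slide can be illustrated with a minimal sketch. It is not the presenters' code: the names `master`, `split`, and `process_chunk` are hypothetical, the workload (a sum of squares) is made up for illustration, and threads stand in for the separate processors that an MPI program would use.

```python
from concurrent.futures import ThreadPoolExecutor


def process_chunk(chunk):
    """Work done by one worker: sum the squares of its share of the data."""
    return sum(x * x for x in chunk)


def split(data, parts):
    """Split data into `parts` contiguous chunks of nearly equal size."""
    size, extra = divmod(len(data), parts)
    chunks, start = [], 0
    for i in range(parts):
        # The first `extra` chunks each take one additional element.
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def master(data, workers=2):
    """Master: distribute the data, then combine the partial results."""
    chunks = split(data, workers)
    # Threads are an assumption made to keep the sketch self-contained;
    # a real cluster job would hand each chunk to a separate process.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_results = list(pool.map(process_chunk, chunks))
    return sum(partial_results)


if __name__ == "__main__":
    print(master(list(range(10)), workers=2))
```

The master only splits and combines; all per-element work happens in the workers, which is the property that lets the same structure scale from one machine to a cluster.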


Clustering Concepts

Example of a small cluster:

Head Node

Network Switch

Compute Nodes


Grid Computing Concepts

Definition of a Grid:

1998: “A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities” - Carl Kesselman and Ian Foster

Grid computing is an emerging computing model that provides the ability to perform higher throughput computing by taking advantage of many networked computers to model a virtual computer architecture that is able to distribute process execution across a parallel infrastructure - from Wikipedia

[Image caption: Ian Foster]


Categories of Grids

Computational grids (CPU scavenging) – monitor the network for idle resources and use them for high-performance computing.

Data grids – deal with the controlled sharing and management of large amounts of distributed data.

Equipment grids – are built around a primary piece of equipment, such as a telescope, from which the grid collects data and analyses it.


Grid Computing Concepts

Key Elements of a Computational Grid:

Coordination of resources that are subject to decentralized control: resources from different domains (VO, company, department); users from different domains; resources are often geographically separated

Use of standard, open, general-purpose protocols and interfaces: authentication/authorization; resource discovery/access

Delivery of non-trivial quality of service: utility of the combined system >> sum of its parts


Grid Computing Concepts

Grid Types:

Global Grid – includes resources located in multiple countries around the world; used for solving problems of global importance; rarely used for time-sensitive applications

National Grid (e.g. TeraGrid) – includes resources located within the boundaries of a single country; often used for governmental purposes

Mini Grid – includes resources owned and managed by a single organization (company, university, etc.); primarily used for research and education purposes; true commercial use is still in its infancy


Grid Computing Concepts

Grid Organization of Resources:

[Diagram labels: Virtual Organization; Cluster 1; Cluster 2; Data Warehouse]


Grid Resource Managers

Definition and Examples:

Definition – a software package that is responsible for: detecting and managing available resources on the grid; collecting, distributing and managing jobs that use the grid resources; providing a simple user interface for submitting jobs to the grid; enforcing security policies for protecting resources, data and users on the grid

Popular Grid Resource Managers: Globus Toolkit; Condor
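To make the first two responsibilities in the definition concrete (tracking available resources and distributing queued jobs), here is a toy sketch. It is only an illustration under simplifying assumptions: `Machine` and `ToyResourceManager` are hypothetical names, jobs are matched first-come, first-served to the first idle machine with enough CPUs, and security enforcement is left out entirely. It does not reflect how Globus or Condor are implemented.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Machine:
    name: str
    cpus: int
    busy: bool = False


@dataclass
class ToyResourceManager:
    machines: list = field(default_factory=list)
    queue: deque = field(default_factory=deque)

    def submit(self, job_name, cpus_needed):
        """User interface: put a job in the queue."""
        self.queue.append((job_name, cpus_needed))

    def schedule(self):
        """Match queued jobs to idle machines; unmatched jobs keep waiting."""
        assignments = {}
        waiting = deque()
        while self.queue:
            job_name, cpus_needed = self.queue.popleft()
            # First idle machine with enough CPUs wins (hypothetical policy).
            machine = next(
                (m for m in self.machines
                 if not m.busy and m.cpus >= cpus_needed),
                None,
            )
            if machine is None:
                waiting.append((job_name, cpus_needed))
            else:
                machine.busy = True
                assignments[job_name] = machine.name
        self.queue = waiting
        return assignments


if __name__ == "__main__":
    manager = ToyResourceManager(
        machines=[Machine("node1", 2), Machine("node2", 8)]
    )
    manager.submit("small", 1)
    manager.submit("big", 4)
    print(manager.schedule())
```

A production resource manager adds much more on top of this loop, such as priorities, preemption, and authentication, but the submit, match, and run cycle is the core idea.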


MPI Compilers for Grid Computing


Grid Resource Managers

Problems with Globus Toolkit:

Complex installation and configuration

To run parallel jobs, MPICH-G2 is required: very difficult installation; requires 2 IP addresses per compute node; requires recompiling existing MPI-based software

Current Source Code is broken

Runs on Java (slow, problematic)


Grid Resource Managers

Condor Grid Manager:

Requires only MPI 1.2.x: no need for a second NIC and external IP addresses; no need for recompiling existing MPI-based software

Extremely versatile and scalable: can manage very small and very large grids; manages multiple types of resources; automatically finds, configures and uses resources; works with many types of job schedulers (PBS, SGE, etc.)

Easy to use once installed and configured

Standalone Application (faster)

Huge community support

Current version is 6.8.6


Grid Resource Managers

Condor Grid Manager Details:

Condor Universes – a universe is a run-time environment:

Standard – allows a job running under Condor to handle system calls by returning them to the machine where the job was submitted

Vanilla – provides a way to run batch-ready jobs that cannot be relinked; these jobs cannot be relocated

MPI – obsolete universe

Parallel – parallel jobs, including MPI

What is Condor good/used for? “Hunting” for available resources; maximizing the grid throughput; background jobs (BOINC); interfacing with other job managers (Globus, SGE)
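As an illustration of the Parallel universe, the sketch below renders a Condor submit description file for an MPI job. It is an assumption-laden example, not the presenters' script: the helper `parallel_submit_description`, the wrapper name `mpi_wrapper.sh`, the program name `hello_mpi`, and the output file names are all hypothetical, and the sketch assumes the MPI program is launched through a wrapper script on each allocated node.

```python
def parallel_submit_description(wrapper, program, machine_count, args=()):
    """Render a submit description for Condor's parallel universe.

    `wrapper` is a hypothetical script that starts the MPI program;
    `program` is the MPI executable passed to it as the first argument.
    """
    arguments = " ".join([program, *map(str, args)])
    lines = [
        "universe = parallel",
        f"executable = {wrapper}",
        f"arguments = {arguments}",
        f"machine_count = {machine_count}",
        # $(NODE) expands to the node number within the parallel job.
        "output = job.$(NODE).out",
        "error = job.$(NODE).err",
        "log = job.log",
        "should_transfer_files = YES",
        "when_to_transfer_output = ON_EXIT",
        f"transfer_input_files = {program}",
        "queue",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(parallel_submit_description("mpi_wrapper.sh", "hello_mpi", 4))
```

The resulting text would be saved to a file and handed to `condor_submit`; whether extra settings are needed depends on how the pool's dedicated scheduler is configured.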


Demonstration

Grid Monitoring of Resources

Condor Job Submission Scripts

Condor Job Submission Process


Future Work

Improve on the current Condor Configuration on Protos Cluster

Research on interoperability of Globus and Condor

Install and configure Condor on BWP4 Cluster

Test the mini-grid

Scale up the current platform


In Conclusion

Grid computing is considerably more complex than cluster computing

Grids are usually designed for a wide range of applications

Execution of MPI jobs in Grid environment requires additional setup

Overall, Grids are more reliable than clusters but not as consistent


Q/A Time

Thanks

Questions?