
GRID: Computing Without Borders

Kajari Mazumdar

Department of High Energy Physics, Tata Institute of Fundamental Research, Mumbai.

Soft-computing workshop, University of Mumbai, December 1, 2009

Disclaimer:
• I am only a physicist whose research field induces and utilizes cutting-edge technology
• I have mostly borrowed slides from various resources


Plan of talk

Grid concept in simple terms

Requirements of today’s scientific community

Evolution of Grid

LHC Computing Grid

TIFR grid computing centre

DAE contributions

Outlook


Grid computing in simple words

The Grid is a utility, or infrastructure, for huge and complex computations, in which remote resources are accessible through the web (internet) from a desktop, laptop or mobile phone. It is similar to a power grid: the user does not have to worry about the source of the computing power.

Imagine millions of computers owned by individuals, institutes from various countries across the world connected to form a single, huge, super-computer!

This technology, developed only over the last decade, is presently being used by:

• High energy physicists, to analyze the data that will be produced very soon by the LHC experiments, in which Indian scientists are taking part.

• Earth scientists, to monitor ozone layer activity (dealing daily with a data volume equivalent to 150 CDs).

It is the natural evolution of the internet.


1. Share more than information: data, computing power and applications, in dynamic, multi-institutional virtual organizations (Ian Foster: The Anatomy of the Grid)

2. Efficient use of resources at many institutes. People from many institutions working to solve a common problem (virtual organisation).

3. Join local communities.

4. Interactions with the underlying layers must be transparent and seamless to the user.

From Web to Grid Computing


Computing requirements


High-end computing application

Weather forecast


Share data between thousands of scientists with multiple interests

• Link major and minor computer centres

• Ensure all data accessible anywhere, anytime

• Grow rapidly, yet remain reliable for more than a decade

• Cope with different management policies of different centres

• Ensure data security

• Be up and running routinely

• Need to check the health of the facility on a 24x7 basis.

A huge workforce is at work invisibly.

Challenges in scientific computations


Ever-increasing demand

• A PC of the early 2000s is as fast as a supercomputer of the 1990s. Still, for many applications it is not adequate, and users continue to buy new machines!

• The storage available in a PC today was unthinkable in the 1990s: storage capacity doubles every 12 months or so!

• Recent years of this decade are seeing mammoth scientific projects in which the data size is several petabytes per year.

• To work with a colleague on petabyte-scale data, even across a campus, we need an ultrafast network.

Even though CPU power, disk storage and communication speed continue to increase, computing resources are failing to satisfy users’ demands, and they remain difficult to use.


Supercomputer


Parallel computer


PC clusters, multiple PCs


Clusters: Primary IT infrastructures

Clusters replace traditional computing platforms and can be configured according to the need:
• Network load distribution and load balance
• High availability
• High performance / computation intensive, ..

Issues related to building clusters

• Scalability of interconnection network
• Scalability of software components (libraries, applications, ..)
• Auto-installation, cluster management, trouble-shooting, …
• Space management (desktop/rack mounted)
• Layout of nodes, noises, cable layout, cooling, ..
• Power management
• Centralized infrastructure management software
• Performance / Price / Power consumption

Cost of ownership not very low!


Peer to Peer (P2P) computing

Computing based on the idea of peers sharing distributed resources with each other, with or without the support of a server.

There are many under-utilised resources. With powerful PCs, real utilisation today is < 10%.

In large organizations with thousands of PCs, a number increasing day by day, utilise that idle capacity in cycle-stealing mode!
• Total delivered power is > few Mflops
• Total available free disk space is > 100 Terabytes

• The latency and bandwidth of a LAN environment are mostly quite adequate for P2P computing.
• Space is also not a problem: keep the PCs wherever they are!
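The cycle-stealing argument above can be made concrete with a little arithmetic. The following is only a rough sketch: the number of PCs and the free disk space per PC are hypothetical values chosen for illustration, and only the 10% utilisation bound comes from the slide.

```python
def idle_capacity(n_pcs: int, utilisation: float) -> float:
    """Return the idle capacity in PC-equivalents for a given average utilisation (0..1)."""
    if not 0.0 <= utilisation <= 1.0:
        raise ValueError("utilisation must be between 0 and 1")
    return n_pcs * (1.0 - utilisation)


# Hypothetical organisation; only UTILISATION is taken from the slide.
N_PCS = 5000                 # assumed number of desktop PCs
UTILISATION = 0.10           # "real utilisation today is < 10%"
FREE_DISK_PER_PC_GB = 20     # assumed free disk space per PC

idle = idle_capacity(N_PCS, UTILISATION)
free_disk_tb = N_PCS * FREE_DISK_PER_PC_GB / 1000

print(f"Idle capacity: {idle:.0f} PC-equivalents out of {N_PCS}")
print(f"Pooled free disk space: {free_disk_tb:.0f} TB")
```

Under these assumptions most of the installed capacity sits idle, which is the resource a cycle-stealing scheme tries to harvest.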


Internet computing

• Today you cannot run your jobs on the internet.

• Internet computing using idle PCs is becoming an important computing platform (SETI@home, Napster, ..)

• The WWW is the promising candidate for the core component of a wide-area distributed computing environment:
  - Efficient client/server models and protocols
  - Transparent networking, navigation, GUI with multimedia access and dissemination for data visualization

• Mechanisms for distributed computing: CGI, Java

• With improved price/performance and open source, free software, web-services, it is becoming easy to develop loosely coupled distributed applications.


Working together apart


Essentials of GRID computing


Virtual organizations: GRID

TIFR


Grid Components


Grid overview


How does grid work?


GRID portal / Gateway


Grid Services: grid middleware


LHC and the GRID Computing

A pathologist uses a microscope to examine blood cells, of size about one thousandth of a mm, i.e. 10^-6 m.

High energy probes the structure of fundamental matter.

The LHC will collide very, very high energy protons for this purpose.

Mammoth, very complex detectors (length 30 m, diameter 20 m) are the technical eyes of several thousand scientists, used to probe the smallest length scales.


Complexity of LHC experiments

When two very high energy protons collide at the LHC, the situation in the detector will mostly look like this: very crowded.

About 10 million electrical signals will have to be recorded in a tiny fraction of a second, repeatedly and for a long time (about 10 years). Using computers, a digital image is created for each such instance. The image size is about 2 MB on average, but varies considerably.

But most of these pictures are not interesting! Good things are always rare!


In an LHC experiment, the task of the scientist is to look for an instance with patterns of this type among 10 thousand billion (10^13) crowded pictures.

This picture contains the clue about our universe.

Such a job is like searching for a needle in a million haystacks! It is similar to looking for one particular person in a thousand times today's world population (6 billion; India’s population is 1.2 billion).

A single computing system will never scale up to the challenge. The concept of GRID computing developed from such requirements.
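To see why a single computing system cannot scale to this search, here is a rough estimate. The 10^13 pictures come from the slide; the processing time per picture and the number of workers are assumptions chosen purely for illustration, and the sketch assumes the pictures can be processed independently, so the work divides evenly.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600

N_PICTURES = 10**13          # from the slide
SECONDS_PER_PICTURE = 1e-3   # assumption: 1 ms of CPU time per picture
N_WORKERS = 100_000          # assumption: number of computers working in parallel


def scan_time_seconds(n_pictures: float, seconds_per_picture: float, n_workers: int = 1) -> float:
    """Wall-clock time to scan all pictures, assuming perfectly parallel work."""
    return n_pictures * seconds_per_picture / n_workers


single_years = scan_time_seconds(N_PICTURES, SECONDS_PER_PICTURE) / SECONDS_PER_YEAR
grid_hours = scan_time_seconds(N_PICTURES, SECONDS_PER_PICTURE, N_WORKERS) / 3600

print(f"One computer: about {single_years:.0f} years")
print(f"{N_WORKERS} computers: about {grid_hours:.1f} hours")
```

With these assumed numbers a single machine needs roughly three centuries, while the same work spread over many machines finishes in about a day. Real analysis jobs are far more expensive per picture, which only strengthens the point.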


The LHC will produce 600-800 million proton-proton collisions per second for several years.

Only 1 in 20 thousand collisions will have an important tale to tell, but we do not know which one! So we have to search through all of them!

Huge task!

• 15 PBytes (1 PByte = 10^15 bytes) of data a year

• Analysis requires ~100,000 computers to get results in a reasonable time.

GRID computing is essential
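The numbers above translate into sustained data rates as follows. This is a back-of-envelope sketch: the 15 PBytes per year and the ~2 MB average image size are taken from the slides, while the 1 Gbit/s link speed is an assumed value used only to illustrate transfer times.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600   # calendar year, ignoring leap days

DATA_PER_YEAR_BYTES = 15e15          # 15 PBytes per year (from the slide)
IMAGE_SIZE_BYTES = 2e6               # ~2 MB average image size (from an earlier slide)
LINK_BITS_PER_SECOND = 1e9           # assumption: a 1 Gbit/s network link


def average_rate_mb_per_s(bytes_per_year: float) -> float:
    """Average rate in MB/s if the yearly volume were spread evenly over the year."""
    return bytes_per_year / SECONDS_PER_YEAR / 1e6


def transfer_days(n_bytes: float, link_bits_per_second: float) -> float:
    """Days needed to move n_bytes over a fully used link of the given speed."""
    return n_bytes * 8 / link_bits_per_second / 86400


avg_rate_mb_s = average_rate_mb_per_s(DATA_PER_YEAR_BYTES)
images_per_year = DATA_PER_YEAR_BYTES / IMAGE_SIZE_BYTES
days_on_link = transfer_days(DATA_PER_YEAR_BYTES, LINK_BITS_PER_SECOND)

print(f"Average data rate: {avg_rate_mb_s:.0f} MB/s")
print(f"Equivalent number of 2 MB images per year: {images_per_year:.2e}")
print(f"Days to ship one year of data over the assumed link: {days_on_link:.0f}")
```

Under these assumptions, one year of data would occupy a single 1 Gbit/s link for several years, so both storage and processing have to be distributed over many sites.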

In hard numbers


LHC-CERN DAE collaboration

[Network diagram; link bandwidths shown: 175 Mbps, 100 Mbps, 100 Mbps, 100 Mbps]


The way CMS uses the GRID (WLCG)

WLCG Computing Grid Infrastructure

[Diagram: data flow from the CMS detector through Tier-0, Tier-1, Tier-2 and Tier-3 centres. Rates shown on the links: 450 MB/s (300 Hz); 30-300 MB/s (ag. 800 MB/s); 50-500 MB/s; 10-20 MB/s; 100 MB/s. Job rates shown: ~150k jobs/day; ~50k jobs/day.]

Tier-0
• Prompt reconstruction
• Archival of a copy of RAW and first RECO data
• Calibration streams (CAF)
• Data distribution to Tier-1

7 Tier-1s
• Re-reconstruction
• Skimming
• Second archival of RAW
• Served copy of RECO
• Archival of simulation
• Data distribution to Tier-2

~50 Tier-2s
• Primary resources for physics analysis and detector studies by users
• MC simulation, sent to Tier-1

Tier-3s

CMS in Total: 1 Tier-0 at CERN (GVA) 7 Tier-1s on 3 continents 50 Tier-2s on 4 continents
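The tier structure on this slide can be written down as a small data structure. The sketch below is only an illustration: the site counts and roles are transcribed from the slide, while the class and function names are hypothetical and do not correspond to any CMS or WLCG software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    n_sites: int
    roles: tuple


# Site counts and roles as listed on the slide; all names here are illustrative only.
CMS_TIERS = (
    Tier("Tier-0", 1, (
        "prompt reconstruction",
        "archival of a copy of raw and first reco data",
        "calibration streams (caf)",
        "data distribution to tier-1",
    )),
    Tier("Tier-1", 7, (
        "re-reconstruction",
        "skimming",
        "second archival of raw",
        "served copy of reco",
        "archival of simulation",
        "data distribution to tier-2",
    )),
    Tier("Tier-2", 50, (
        "physics analysis and detector studies by users",
        "mc simulation",
    )),
)


def tiers_with_role(keyword: str) -> list:
    """Return the names of the tiers having at least one role that mentions the keyword."""
    key = keyword.lower()
    return [t.name for t in CMS_TIERS if any(key in role for role in t.roles)]


total_sites = sum(t.n_sites for t in CMS_TIERS)
print(f"Total Tier-0/1/2 sites: {total_sites}")
print("Tiers that archive data:", tiers_with_role("archival"))
```

A query such as `tiers_with_role("simulation")` shows how responsibilities are split: simulation is produced at the Tier-2s and archived at the Tier-1s.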

P. Kreuzer - GRID Computing - Mumbai


Tiered/Layered Structure connecting computers across the globe

[Diagram: tiers from the experimental site down to the individual scientist's PC]

• Tier 0: experimental site; CERN computer centre, Geneva
• Tier 1: national centres (France, Italy, Germany, USA, ASIA (Taiwan))
• Tier 2: regional groups (India, China, Korea, Pakistan); T2_IN_TIFR
• Different universities, institutes (TIFR, Delhi U., Panjab U.)
• Individual scientist’s PC

Useful model for Particle Physics experiments, but not necessary for others


Hardware at TIFR site: T2_IN_TIFR

About 50 users/scientists at present, still growing.

Another similar Tier2 centre in Kolkata for a different experiment at LHC.

Grid facility has been functional at TIFR for almost a year.

The CMS collaboration at LHC, CERN has been using the computer resources.

• Storage: 350 TB
• 300 worker nodes
• Internet bandwidth: 1 GBps, to be upgraded in the near future

Note: continuous monitoring is essential. We are managing with 5 engineers, not all of them full time.


Networking, GRID Middleware, Sites

GRID Middleware Services
- Storage Elements
- Computing Element
- Workload Management System
- Local File Catalog
- Information System
- Virtual Organisation Management Service
- Inter-operability between GRIDs: EGEE, OSG, NorduGrid, ..

Networking

Site specificities, e.g. Storage/Batch systems at CMS Tier-1s:

Site      Storage               Batch
FNAL      dCache/Enstore        Condor
RAL       Castor                Torque/Maui
CCIN2P3   dCache/HPSS           BQS
PIC       dCache/Enstore        Torque/Maui
ASGC      Castor                Torque/Maui
INFN      Castor + Storm        LSF
FZK       dCache/Chimera, TSM   PBSPro


Data Transfers from/to TIFR

• TIFR-T1 transfer quality (last 3 months): improving, aim for stability

• TIFR-T1 production transfers (last year): modest but ready to grow!

TIFR to ASGC: 8 TB MC data (custodial storage)

T1 to TIFR: total 37 TB from 7 T1s


Statistics and plots

Site summary table

Site history

Site ranking


CMS Software Deployment

Deployment of CMS SW to 90% of sites in a few hours

Basic strategy: Use RPM (with apt-get) in CMS SW area

EGEE


CMS Centers and Computing Shifts

CMS Centre at CERN: monitoring, computing operations, analysis

CMS Experiment Control Room

CMS Remote Operations Centre at Fermilab

• CMS running Computing shifts 24/7
• Encourage remote shifts
• Main task: monitor and alarm
• CMS sites & Computing Experts


Challenge


Current Issues involved


Very soon you may be one of these scientists, working at the LHC and using the GRID computing facility, or even trying to improve it!


Compute-communication cross-over


World Wide Web – Information Sharing

• Invented at CERN by Tim Berners-Lee (1989-1990)

• Agreed protocols: HTTP, HTML, URLs

• Anyone can access information and post their own

• Quickly crossed over into public use

[Plot: number of Internet hosts (millions) versus year]

Going back
