
Page 1

Cluster computing facility for CMS simulation work at NPD-BARC

Raman Sehgal

Page 2

Outline of the talk

Example of a typical cluster.

Cluster types.

Setup of the cluster.

Components required to make up a cluster.

Selection of different components for the cluster.

Overview of selected components.

Page 3

[Diagram] A Typical Cluster: sixteen compute nodes (NODE-01 to NODE-16) connected to a file server through a Fast / Gigabit Ethernet switch over CAT-5/6 cable.

Page 4

Cluster Types

High Performance clusters
  Parallel computers and jobs
  Optimized for better job performance
  Emphasis on better interconnects

High Throughput clusters
  Large numbers of sequential jobs
  Emphasis on better storage and I/O
  Load balancing (a rough sketch follows below)

High Availability clusters
  Clusters that provide reliable service
  Web servers, database servers, etc.
  Emphasis on better redundancy
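To make the high-throughput idea concrete, here is a minimal sketch (not from the slides) that spreads a batch of independent, sequential jobs over the worker nodes in round-robin fashion; the node names and job IDs are hypothetical.

# Minimal sketch: round-robin assignment of independent jobs to worker nodes.
# Node names and job IDs are hypothetical examples.
from itertools import cycle

worker_nodes = [f"node{i:02d}" for i in range(1, 16)]   # node01 .. node15
jobs = [f"cms_sim_job_{j}" for j in range(100)]          # 100 independent sequential jobs

assignment = {}                                          # job -> node
for job, node in zip(jobs, cycle(worker_nodes)):
    assignment[job] = node

# Count jobs per node to confirm the load is spread roughly evenly.
per_node = {}
for node in assignment.values():
    per_node[node] = per_node.get(node, 0) + 1
print(per_node)

A real batch system such as PBS does this through queues and per-node slot counts rather than a fixed round-robin, but the goal is the same: keep all worker nodes busy with sequential jobs.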

Page 5

[Diagram] Cluster Setup: users (user1 to user4) reach the head node (UI, PBS, NIS) over the public network, and the head node also connects to the CMS Grid. Fifteen worker nodes (Node 1 to Node 15) and a 20 TB storage box sit on the private network, linked by a Gigabit switch over Cat 5/6 cable and by Infiniband cable.
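The head node runs PBS as the batch system, so jobs reach the worker nodes by being submitted from the head node. The snippet below is a minimal sketch of that workflow, assuming a queue named batch and a placeholder command in place of the real CMSSW job; the details are illustrative, not taken from the slides.

# Minimal sketch of submitting a batch job through PBS from the head node.
# Queue name, resource request and the command being run are hypothetical.
import subprocess
import tempfile

pbs_script = """#!/bin/bash
#PBS -N cms_sim_example
#PBS -l nodes=1:ppn=1
#PBS -q batch
cd $PBS_O_WORKDIR
echo "running on $(hostname)"
# cmsRun my_simulation_cfg.py   # placeholder for the actual CMSSW job
"""

with tempfile.NamedTemporaryFile("w", suffix=".pbs", delete=False) as f:
    f.write(pbs_script)
    script_path = f.name

# qsub prints the job ID on success.
result = subprocess.run(["qsub", script_path], capture_output=True, text=True)
print(result.stdout.strip() or result.stderr.strip())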

Page 6

Components that make up a cluster

Nodes
  Compute (worker) nodes: for running jobs
  Service nodes (head node): management, monitoring, user interface
  Storage nodes: central storage area for users and all input and output files

Network
  Switches
  Cables

Support equipment
  Racks: house nodes and network equipment
  KVM switches, terminal servers: for console sharing
  Power distribution units: power sequencing and control

Software
  Operating system
  Networking protocols
  Applications

Page 7

Selection of Node

The computing power of a cluster depends on the power of a compute node:
  Choice of processor (Xeon, Opteron, Itanium …), cache, frequency
  Memory
  Single / dual / quad processor
  Network ports (single / dual)
  Expansion slots (PCI, PCIe)
  Management (IPMI …)
  Form factor (tower, rack mountable, chassis, power supply)
  Heat dissipation, cooling
  OS support

[Images] Desktop node; rack mount server.

Page 8

Selection of network switch

Selection of a good interconnect is an important part of the design of any cluster:
  Gigabit Ethernet over copper is the standard for cluster interconnects.
  Infiniband provides high throughput and low latency (a rough comparison follows after this list).
  Infiniband is useful when there is high I/O and communication overhead.
  Choice of 24- and 48-port stackable switches.
  Chassis (core) switches are available for large configurations (up to 672 ports).

[Images] Stackable fixed-port switch; chassis-based switch.
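For a back-of-the-envelope comparison of the two interconnects, the sketch below computes the ideal transfer time for a sample data volume over Gigabit Ethernet (1 Gbps) and Infiniband 4x DDR (20 Gbps). It ignores latency and protocol overhead, and the 100 GB figure is only an example.

# Rough, idealized transfer-time comparison (ignores latency and protocol overhead).
def transfer_time_seconds(data_gigabytes, link_gbps):
    # 1 gigabyte = 8 gigabits; assumes the link runs at full wire speed
    return data_gigabytes * 8 / link_gbps

data_gb = 100  # hypothetical dataset size in gigabytes
for name, gbps in [("Gigabit Ethernet", 1), ("Infiniband 4x DDR", 20)]:
    t = transfer_time_seconds(data_gb, gbps)
    print(f"{name:18s}: {t:6.1f} s for {data_gb} GB at {gbps} Gbps (ideal)")

For small, latency-bound messages the gap is set by latency rather than bandwidth, which is why Infiniband helps most when communication overhead is high.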

Page 9

Storage Box

Central storage server(s) are needed to store all users' directories and all input and output files.

Should have high capacity.

Ensure reliability using RAID.

Automated backup.

Console Sharing

Access to each node's console (keyboard, mouse, display) is required during installation.

KVM switches share a single keyboard, mouse and video among multiple nodes.

Page 10

Operating System

The majority of clusters nowadays run some distribution of Linux:
  Robust
  Open-source solution
  Cost-effective
  Support for clusters (auto installers, cluster management tools)

Widely used distributions: Red Hat, SUSE and Debian
Other Linux distributions: Mandriva and Gentoo
Red Hat based distributions: Scientific Linux CERN, CentOS

Selection of a distribution for a cluster depends mainly on compatibility with the code to be run, so for our purpose the selected distribution is Scientific Linux CERN 4.0 or higher.

Code to be run: CMSSW

Page 11

Overview of selected components

1) Worker node and head node

Processor: dual Intel quad-core Xeon at 3.0 GHz or higher, Harpertown series, with 12 MB cache or higher and 1600 MHz FSB.

Memory: 16 GB 800 MHz DDR2.

Hard disk: two to three 750 GB or higher Enterprise SATA II / SAS hard disks.

Network:
  1. Two or more PXE-boot-compliant on-board 1000BaseT Gigabit Ethernet ports
  2. One Infiniband 4x DDR (20 Gbps) port on PCI Express x8 or higher

Form factor: 1U rack mountable.
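As a rough, theoretical capacity estimate (not stated on the slides): a Harpertown-class core can issue 4 double-precision floating-point operations per cycle, so a dual quad-core node at 3.0 GHz peaks at about 96 GFLOPS, and 15 such worker nodes at roughly 1.4 TFLOPS. The arithmetic:

# Back-of-the-envelope peak performance estimate (theoretical, not measured).
sockets_per_node = 2
cores_per_socket = 4
clock_ghz = 3.0
flops_per_cycle = 4          # double-precision FLOPs/cycle assumed for a Harpertown-class core
worker_nodes = 15

node_peak_gflops = sockets_per_node * cores_per_socket * clock_ghz * flops_per_cycle
cluster_peak_tflops = node_peak_gflops * worker_nodes / 1000.0
print(f"per node: {node_peak_gflops:.0f} GFLOPS peak")
print(f"15 nodes: {cluster_peak_tflops:.2f} TFLOPS peak")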

Page 12

2) Storage server: quantity 2

Processor: dual Intel quad-core Xeon at 2.5 GHz or higher, with 12 MB cache or higher, 1333 MHz FSB, 80 W per processor.

Memory: 16 GB, 667 MHz DDR2 DIMM.

Capacity: 24 x 450 GB = 10.8 TB with 15,000 rpm SAS disks.

Network:
  Two PXE-boot-compliant on-board 1000BaseT Gigabit Ethernet ports
  Dual 10G Ethernet multimode fiber port on PCI Express
  One Infiniband 4x DDR (20 Gbps) port on PCI Express

Form factor: 3/4 U rack mountable.

RAID: support for RAID levels 0, 1, 10, 5, 6.

Server management: via the Intelligent Platform Management Interface (IPMI).
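The 10.8 TB is raw capacity; usable space depends on the RAID level. A small sketch of that arithmetic, assuming a single array over all 24 disks (the actual array layout is not specified on the slides):

# Usable capacity of one array of 24 x 450 GB disks at different RAID levels.
# Assumes a single array spanning all disks; hot spares and filesystem overhead ignored.
disks = 24
disk_gb = 450
raw_tb = disks * disk_gb / 1000.0

usable_tb = {
    "RAID 0":  raw_tb,                              # striping, no redundancy
    "RAID 1":  raw_tb / 2,                          # mirrored pairs
    "RAID 10": raw_tb / 2,                          # striped mirrors
    "RAID 5":  (disks - 1) * disk_gb / 1000.0,      # one disk's worth of parity
    "RAID 6":  (disks - 2) * disk_gb / 1000.0,      # two disks' worth of parity
}
print(f"raw: {raw_tb:.1f} TB")
for level, tb in usable_tb.items():
    print(f"{level:7s}: {tb:.2f} TB usable")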

Page 13

Networking Switches

3) Infiniband switch with the following specifications:

Number of ports: 24 InfiniBand 4x DDR (20 Gbps) ports

Bandwidth: 960 Gbps

Chassis: 1U rack mountable chassis

Management protocols: SNMP, Telnet, SSH, HTTP, FTP

Infiniband cables: 4x DDR Infiniband CX4, 30 AWG passive cable
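The quoted 960 Gbps is consistent with 24 ports of 20 Gbps each counted full duplex (both directions), which is how aggregate switch bandwidth is usually quoted:

# Aggregate switch bandwidth check: 24 ports x 20 Gbps, full duplex.
ports = 24
per_port_gbps = 20
duplex_factor = 2            # both directions counted
aggregate_gbps = ports * per_port_gbps * duplex_factor
print(aggregate_gbps)        # 960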

4) Gigabit Ethernet switch

(a) 24-port autosensing, unmanaged Gigabit switch

(b) 24-port autosensing, managed Gigabit switch

Page 14

5) Rack mountable KVM switch with the following specifications:

Number of ports: 16, with PS/2 and USB keyboard and mouse support

Features: integrated 15-inch LCD monitor, keyboard and touch pad

Form factor: 1U with the LCD panel folded

6) Uninterruptible Power Supply (UPS)

Capacity: 6 kVA

Form factor: 3U rack mountable with rail kit and internal batteries

Nominal output voltage: 230 V at 50 Hz

Backup time: 30 minutes or more at half load on the internal batteries
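To put the 6 kVA rating in context, assuming a power factor of about 0.8 (an assumption, not given on the slides): half load is roughly 2.4 kW, so 30 minutes of backup implies on the order of 1.2 kWh of usable battery energy.

# Rough UPS sizing arithmetic; the 0.8 power factor is an assumption.
capacity_kva = 6.0
power_factor = 0.8
full_load_kw = capacity_kva * power_factor      # ~4.8 kW
half_load_kw = full_load_kw / 2                 # ~2.4 kW
backup_hours = 0.5                              # 30 minutes at half load
usable_battery_kwh = half_load_kw * backup_hours
print(f"half load: {half_load_kw:.1f} kW, usable battery energy: {usable_battery_kwh:.1f} kWh (approx.)")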

Page 15

THANKS