COMP3050 Linux Clustering

  • 8/6/2019 COMP3050 Linux Clustering

    Linux cluster computing

    Morris Law, IT Coordinator, Science Faculty, Hong Kong Baptist University

    PII 4-node clusters started in 1999

    PIII 16-node cluster purchased in 2001.

    Plan for grid

    For test base

    HKBU 64-node P4 Xeon cluster, ranked #300 in the TOP500

    16-node P4 Xeon cluster for computational research, from 2005

    16 compute nodes, each with:

        P4 Xeon 3.2GHz x 2

        2GB RAM

        36GB SCSI hard disk

    ROCKS 4.0.0

    The cluster management team

    OUTLINE

    What is a PC cluster?

    Different kinds of PC cluster

    Beowulf cluster

    SSI cluster

    HPCC cluster and parallel computing applications

    What is a PC cluster?

    An ensemble of networked, stand-alone, commodity off-the-shelf computers used together to solve a given problem.

    Different kinds of PC cluster

    High Performance Computing Cluster (Beowulf cluster)

    Load Balancing

    High Availability

    High Performance Computing Cluster (Beowulf)

    Started in 1994

    Donald Becker of NASA assembled the world's first such cluster from 16 DX4 PCs and 10 Mb/s Ethernet

    Also called a Beowulf cluster

    Built from commodity off-the-shelf hardware

    Applications include data mining, simulations, parallel processing, weather modelling, computer graphics rendering, etc.

    Examples of Beowulf cluster

    Scyld Cluster O.S. originated by Donald Becker

    http://www.scyld.com

    ROCKS from NPACI

    http://www.rocksclusters.org

    OSCAR from open cluster group

    http://oscar.sourceforge.net

    OpenSCE from Thailand

    http://www.opensce.org

    SCore from PC Cluster Consortium, Japan

    http://www.pccluster.org/

    Load Balancing Cluster

    PC clusters deliver load-balancing performance

    Commonly used for busy FTP and web servers with a large client base

    Large number of nodes to share the load
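How requests might be shared across nodes can be sketched in a few lines. The two policies below (round robin and least connections) are common scheduling strategies, shown here only as an illustration; the class names and the node names `web1`, `web2`, `web3` are hypothetical and are not taken from any particular cluster product.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hand out backend nodes in a fixed rotating order."""

    def __init__(self, nodes):
        nodes = list(nodes)
        if not nodes:
            raise ValueError("at least one node is required")
        self._rotation = cycle(nodes)

    def pick(self):
        # Each call returns the next node in the rotation.
        return next(self._rotation)


class LeastConnectionsBalancer:
    """Send each new request to the node with the fewest active connections."""

    def __init__(self, nodes):
        self.active = {node: 0 for node in nodes}
        if not self.active:
            raise ValueError("at least one node is required")

    def pick(self):
        # Ties are broken by the order in which nodes were given.
        node = min(self.active, key=self.active.get)
        self.active[node] += 1
        return node

    def release(self, node):
        # Call when a request handled by `node` has finished.
        self.active[node] -= 1


if __name__ == "__main__":
    rr = RoundRobinBalancer(["web1", "web2", "web3"])
    print([rr.pick() for _ in range(5)])
```

Round robin suits nodes of equal capacity serving similar requests; least connections adapts better when request durations vary.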

    High Availability Cluster

    Avoid downtime of services

    Avoid single point of failure

    Always with redundancy

    Almost all load balancing clusters have HA capability
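The effect of redundancy on downtime can be estimated with a small calculation. This is a simplified sketch that assumes node failures are independent and that the service stays up as long as at least one redundant node is up; the function names and the 99% figure are illustrative assumptions, not measurements from any cluster described here.

```python
def combined_availability(node_availability, nodes):
    """Probability that at least one of `nodes` redundant nodes is up.

    Assumes independent failures, which real clusters only approximate.
    """
    if not 0.0 <= node_availability <= 1.0:
        raise ValueError("node_availability must be between 0 and 1")
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    return 1.0 - (1.0 - node_availability) ** nodes


def downtime_hours_per_year(availability):
    """Expected downtime in hours over a 365-day year."""
    return (1.0 - availability) * 365 * 24


if __name__ == "__main__":
    for n in (1, 2, 3):
        a = combined_availability(0.99, n)
        print(n, a, downtime_hours_per_year(a))
```

Under these assumptions each added node multiplies the expected downtime by the single-node failure probability, which is why a second node already removes most of the outage time.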

    Examples of Load Balancing and High Availability Cluster

    RedHat Cluster Suite

    http://www.redhat.com/cluster_suite/

    Turbolinux Cluster Server

    http://www.turbolinux.com/products/middleware/tlcs8.html

    Linux Virtual Server Project

    http://www.linuxvirtualserver.org/

    Single System Image Cluster for Linux

    http://www.openssi.org

    Shaolin HA cluster

    http://www.shaolinmicro.com/product/hacluster/

    Snapshots 1

    An example of Beowulf Cluster:

    ROCKS

    (http://www.rocksclusters.org)

    ROCKS SNAPSHOTS

    The schematic diagram of a rocks cluster

    ROCKS SNAPSHOTS

    Installation of a compute node

    ROCKS SNAPSHOTS

    Ganglia Monitoring tools

    HPCC Cluster and parallel computing applications

    Message Passing Interface:

        MPICH (http://www-unix.mcs.anl.gov/mpi/mpich/)

        LAM/MPI (http://lam-mpi.org)

    Mathematical:

        fftw (fast Fourier transform)

        pblas (parallel basic linear algebra subprograms)

        atlas (a collection of mathematical libraries)

        sprng (scalable parallel random number generator)

        MPITB -- MPI toolbox for MATLAB

    Quantum chemistry software:

        gaussian, qchem, gamess

    Molecular dynamics solvers:

        NAMD, gromacs, amber

    Weather modelling:

        MM5 (http://www.mmm.ucar.edu/mm5/mm5-home.html)
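The typical structure of a message-passing program (split the work by rank, compute locally, then reduce the partial results) can be sketched without an MPI installation. The example below is only a stand-in: it uses Python threads in place of cluster nodes, and the names `partial_pi` and `parallel_pi` are hypothetical. A real MPICH or LAM/MPI program would run one process per node and combine the results with an MPI reduction; Python threads do not give a real speed-up for this CPU-bound loop.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def partial_pi(rank, size, intervals):
    """Midpoint-rule sum of 4/(1+x^2) over the strips owned by `rank`.

    Strips are dealt out cyclically: rank r handles r, r+size, r+2*size, ...
    """
    width = 1.0 / intervals
    total = 0.0
    for i in range(rank, intervals, size):
        x = (i + 0.5) * width
        total += 4.0 / (1.0 + x * x)
    return total * width


def parallel_pi(size, intervals):
    """Estimate pi with `size` workers; the final sum plays the role of a reduce."""
    with ThreadPoolExecutor(max_workers=size) as pool:
        parts = pool.map(partial_pi, range(size), [size] * size, [intervals] * size)
        return sum(parts)


if __name__ == "__main__":
    estimate = parallel_pi(4, 100_000)
    print(estimate, abs(estimate - math.pi))
```

The cyclic distribution of strips keeps the work per rank nearly equal, which is the same decomposition used in many introductory MPI examples.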

    NAMD2 software for molecular dynamics

    Single System Image (SSI) Cluster

    MOSIX

    openMosix

    MOSIX and openMosix

    MOSIX: MOSIX is a software package that enhances the Linux kernel with cluster capabilities. The enhanced kernel supports any size cluster of X86/Pentium based boxes. MOSIX allows for the automatic and transparent migration of processes to other nodes in the cluster, while standard Linux process control utilities, such as 'ps', will show all processes as if they are running on the node the process originated from.

    openMosix: openMosix is a spin-off of the original Mosix. The first version of openMosix is fully compatible with the last version of Mosix, but is going to go in its own direction.
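The idea of automatic process migration can be illustrated with a toy model. The sketch below is not the MOSIX or openMosix algorithm (which works inside the kernel and weighs CPU, memory and communication costs); it simply moves one process at a time from the busiest node to the idlest one until the process counts are roughly even. The function name `rebalance` and the node names are hypothetical.

```python
def rebalance(loads, threshold=1):
    """Even out process counts by migrating one process at a time.

    `loads` maps node name -> number of running processes.
    Returns the new loads and the list of (source, target) migrations.
    """
    if threshold < 1:
        raise ValueError("threshold must be at least 1 to guarantee termination")
    loads = dict(loads)  # work on a copy, leave the caller's data untouched
    migrations = []
    while True:
        busiest = max(loads, key=loads.get)
        idlest = min(loads, key=loads.get)
        if loads[busiest] - loads[idlest] <= threshold:
            return loads, migrations
        loads[busiest] -= 1
        loads[idlest] += 1
        migrations.append((busiest, idlest))


if __name__ == "__main__":
    new_loads, moves = rebalance({"node1": 6, "node2": 0, "node3": 0})
    print(new_loads)
    print(moves)
```

In this model the user starts all six processes on one node and never names a target node, which mirrors the transparency that an SSI cluster aims for.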

    OpenMosix installation

    Install Linux on each node

    Download and install

        openmosix-kernel-2.4.26-openmosix1.i686.rpm

        openmosix-tools-0.3.6-2.i386.rpm

    and related packages such as those at www.openmosixview.com

    Reboot with the openMosix kernel

    Screenshots 2

    OpenMosix cluster management

    openMosix cluster management tools

    openMosixView

    openMosixmigmon

    3dmosmon

    Advantages of SSI cluster

    No need to parallelize code

    Automatic process migration, i.e. load balancing

    Add / delete nodes at any time

    Well aware of hardware and system resources

    Reference URLs

    Clustering and HA

    Beowulf, parallel Linux cluster

    ROCKS from NPACI

    OPENMOSIX, scalable cluster computing with process migration

    High Performance Cluster Computing Centre Supported by Dell and Intel

    Linux Cluster Information Center

    The Quantian Scientific Computing Environment

    PC cluster nowadays

    Node hardware

    Multi-core CPUs with L2 and L3 cache

    DDR RAM

    Large hard disks (over 500GB per disk)

    Blade / Rack mount server

    Storage

    SAN, I/O nodes, parallel file systems

    PC cluster nowadays (cont)

    Interconnect

    Gigabit Ethernet, Myrinet, Infiniband, Quadrics

    Thank you!

    Welcome to visit HPCCC, HKBU

    http://www.hkbu.edu.hk/hpccc/

    http://www.hkbu.edu.hk/tdgc/

    The Scientific Computing Lab.

    Opening of the High Performance Cluster Computing Centre Supported by Dell and Intel

    TDG cluster configuration

    Master node:

        DELL PE2650, P4 Xeon 2.8GHz x 2

        4GB ECC DDR RAM

        36GB x 2 internal HD running RAID 1 (mirror)

        73GB x 10 HD array running RAID 5 with hot spare

    Compute nodes x 64, each with:

        DELL PE2650, P4 Xeon 2.8GHz x 2

        2GB ECC DDR RAM

        36GB internal HD
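The usable capacity of the master node's storage follows from the RAID levels listed above. The sketch below works through that arithmetic under two assumptions that the slide does not state: that exactly one of the ten 73GB disks is the hot spare, and that the remaining nine form a single RAID 5 set. The function names are hypothetical, and formatted capacity would be somewhat lower in practice.

```python
def raid1_usable_gb(disk_gb):
    """RAID 1 mirrors the data, so usable capacity equals one disk."""
    return disk_gb


def raid5_usable_gb(disk_gb, disks, hot_spares=0):
    """RAID 5 spends one disk's worth of space on parity.

    Hot spares sit idle until a failure, so they add no capacity.
    """
    active = disks - hot_spares
    if active < 3:
        raise ValueError("RAID 5 needs at least 3 active disks")
    return disk_gb * (active - 1)


if __name__ == "__main__":
    # Assumed layout: 2 x 36GB mirror, 10 x 73GB array with one hot spare.
    print(raid1_usable_gb(36))
    print(raid5_usable_gb(73, 10, hot_spares=1))
```

If the hot spare were a separate eleventh disk, the RAID 5 figure would instead be computed with `hot_spares=0`.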

    Interconnect configuration

    Extreme BlackDiamond 6816 Gigabit Ethernet switch