20
 Page 1 Server Architectures SMP: Symmetric Multi-Processors René J. Chevance Jan uar y 2005 Page 2 © RJ Chevance Foreword n This presentation is an introduction to a set of presentations about server architectures. They are based on the following book: Serveurs Archit ectures: Multiproc essors, Clusters, Parallel Systems, Web Servers, Storage Solutions René J. Chevance Digital Press December 2004 ISBN 1-55558-333-4 http:/ /books . else vier.com/ This book has been derived from the following one: Serveurs multiprocesseurs, clusters et architect ures parallèle s René J. Chevance Eyrolles Avril 2000 ISBN 2-212 -09114-1 http://www.eyrolles.com/ The English version integrates a lot of updates as well as a new chapter on Storage Solutions. Contact: www.chevance.com [email protected]

ServerArchitectures SMPSymmetricMulti Processors

Embed Size (px)

DESCRIPTION

ppt slides from the book

Citation preview

  • Page 1

    Server ArchitecturesSMP: Symmetric Multi-Processors

    Ren J. ChevanceJanuary 2005

    Page 2

    RJ Chevance

    Forewordn This presentation is an introduction to a set of presentations

    about server architectures. They are based on the following book:

    Serveurs Architectures: Multiprocessors, Clusters, Parallel Systems, Web Servers, Storage Solutions

    Ren J. ChevanceDigital Press December 2004 ISBN 1-55558-333-4

    http://books. elsevier.com/

    This book has been derived from the following one:

    Serveurs multiprocesseurs, clusters et architectures parallles

    Ren J. ChevanceEyrolles Avril 2000 ISBN 2-212-09114-1

    http://www.eyrolles.com/

    The English version integrates a lot of updates as well as a new chapter on Storage Solutions.

    Contact: www.chevance.com [email protected]

  • Page 2

    Page 3

    RJ Chevance

    Organization of the Presentationsn Introductionn Processors and Memoriesn Input/Outputn Evolution of Software Technologies Symmetric Multi-Processors (this document)

    SMP Architecture Overview Microprocessors and SMP Operating Systems and SMP SMP with moderate number of processors (8 processors) SMP with Many Processors (16 processors): Nearly UMA Systems, CC-

    NUMA, COMA Performance Improvement Provided by SMP Advantages and disadvantages of SMP architecture Partitioning: Virtual Machines, Hardware Partitioning, Logical Partitioning

    n Cluster and Massively Parallel Machinesn Data Storagen System Performance and Estimation Techniquesn DBMS and Server Architectures n High Availability Systemsn Selection Criteria and Total Cost of Possessionn Conclusion and Prospects

    Page 4

    RJ Chevance

    SMP Architecture Overviewn Symmetric MultiProcessor Architecture

    With SMP, all processorsmay accessto all system resources (memory, I/O). A SMP is operating under the control of one Operating System managing all system resources.

    The native programming model is the sharing of memory between processes.

    Processor Processor Processor

    System ControllerMemory

    I/O Controller I/O Controller

    Processor -memory-I/O interconnect

    I/O bus

    or

    Systme Operating System

    DBMS(s)Middleware

    Applications

    Hardware resources(processors, memory, controllers and peripherals)

    b) SoftwareViewof a Symmetrical Multiprocessora) Hardwareview of a Symmetrical Mutiprocessor

  • Page 3

    Page 5

    RJ Chevance

    SMP Architecture Overview(2)

    n SMP Programming and Execution Model

    Process

    Processor

    Process

    Process

    Processor

    Processus

    Processor

    Process

    Processor

    Process

    Processor

    Process

    a) - Uniprocessor b) - Tightly-Coupled Multprocessor /SMP

    Page 6

    RJ Chevance

    SMP Architecture Overview(3)n Lightweight processes (threads)

    o Threads are sharing process resourceso Cost of thread switching is lower than process

    switching

    Single-threadedUser Process

    2GBSharedLibrariesSharedMemory

    Stack

    Data

    Code (text)0

    Proc. Context

    Multi-threadedUser Process

    Kernel codeand data

    4GB

    2GB

    OperatingSystemAddressSpace

    u area: Status and control informationassociated with the process

    UserAddress Space

    2GBSharedLibraries

    SharedMemory

    Data

    Code (text) 0

    Proc. Context

    Stack

    Thread 0 Thread n

    Main process

    Proc. Context

    Stack

  • Page 4

    Page 7

    RJ Chevance

    Microprocessors and SMP

    n Microprocessors must integrate specific features to operate in an SMP environment

    n Cache Coherence - Illustration

    Store '4' ->A

    Memory

    (1)

    Processor P1 P2 P3

    Cache

    Load A Load A

    b

    A=1

    A=1A=1

    Memory

    (2)

    Processor P1 P2 P3

    Cache

    Load A Load A

    b

    A=1

    A=1A=1A=4

    Page 8

    RJ Chevance

    Microprocessors and SMP(2)n Cache Coherence Protocol (e.g. MESI)

    n Memory Consistencyn Synchronization Mechanisms

    Invalid Shared

    ExclusiveModified

    Read miss

    LB

    LB

    U

    U

    Read miss

    External write

    External write

    External read

    Externalread

    Write hit

    Write hit

    Read Hit

    Read hit

    Read hit

    Write hit

    External write

    U indicates an update of the memoryLB indicates loading a blockinto the cache

  • Page 5

    Page 9

    RJ Chevance

    Operating Systems and SMPn Operating systems must be designed for SMP

    (e.g. Windows NT) or adpated to SMP (e.g. Unix)n e.g. transforming an OS for SMP:

    o MP Safe Operation then MP efficiento Transforming kernel processes into threadso Pre-emptive kernelo Locking to prevent simultaneous access to shared

    resources (e.g. data structures)l Granularity, number of locks, duration of locksl Dead locks

    o Allocating processes/threads to processors (affinity)o

    n Need constant tuning to adapt to changing characteristics such as the number of processors,.

    Page 10

    RJ Chevance

    SMP with moderate number of processorsn By 2003, we can assume that up to 8 is a moderate number

    of processors (no need to use sophisticated technologies). This limit is growing with time as microprocessor vendors integrate more and more system capability into their silicon.

    n Internal Structure of the AMD Opteron Processor (according to AMD)

    DDR MemoryController

    DDR MemoryController

    OpteronProcessor

    Core

    OpteronProcessor

    Core

    L1Instruction

    Cache

    L1Instruction

    Cache

    L1Data

    Cache

    L1Data

    Cache

    L2Cache

    L2Cache

    HyperTransportHyperTransport

  • Page 6

    Page 11

    RJ Chevance

    SMP with moderate number of processors(2)

    n 4 and 8 processor HyperTransport-basedSMP Systems (according to AMD)

    HyperTransportTo AGP Bridge

    HyperTransportTo AGP Bridge

    HyperTransportTo PCI-X bridgeHyperTransportTo PCI-X bridge

    HyperTransportTo PCI-X bridgeHyperTransportTo PCI-X bridge SouthbridgeSouthbridge

    OpteronOpteron

    OpteronOpteron

    OpteronOpteron

    OpteronOpteron

    8xAGP8x

    AGP

    AGP optional

    a) 4 way SMP a) 8 way SMP

    OpteronOpteron

    OpteronOpteron

    OpteronOpteron

    OpteronOpteron

    OpteronOpteron

    OpteronOpteron

    OpteronOpteron

    OpteronOpteron

    PCI Adapters Memory

    Page 12

    RJ Chevance

    SMP with moderate number of processors(3)n Overview of IBMs X-Architecture (IBM)

    Proc. Proc.

    Proc. Proc.Mem

    ory

    Proc.Proc.

    Proc.Proc.

    Mem

    ory

    Proc. Proc.

    Proc. Proc.Mem

    ory

    Proc.Proc.

    Proc.Proc.

    Mem

    ory

  • Page 7

    Page 13

    RJ Chevance

    SMP with moderate number of processors(4)n Architecture of a 4-Way Itanium-2 Based SMP Server, the

    SR870NB4 (Source INTEL)

    Itanium 2Itanium 2 Itanium 2Itanium 2 Itanium 2Itanium 2 Itanium 2Itanium 2

    System Bus 6.4 GB/s

    Scalable NodeController

    SNC

    Scalable NodeController

    SNC6.4 GB/s

    DMHDDR

    DDR

    DMHDDR

    DDR

    DMHDDR

    DDR

    DMHDDR

    DDR

    Memory128 GB

    FirmwareHub

    I/O HubI/O Hub

    Scalability Ports

    LegacyI/O Hub

    PCI-XBridge

    Hub Interface1 GB/s per link

    PCI-XBridge

    Page 14

    RJ Chevance

    SMP with moderate number of processors(5)

    n Architecture of a 4-Way Xeon MP-based Server (according to ServerWorks)

    Pentium 4Xeon MP

    Pentium 4Xeon MP

    Pentium 4Xeon MP

    Pentium 4Xeon MP

    Pentium 4Xeon MP

    Pentium 4Xeon MP

    Pentium 4Xeon MP

    Pentium 4Xeon MP

    GC HEGC HE

    CIOB-XCIOB-XIMBus

    CIOB-XCIOB-X

    CIOB-XCIOB-X

    PCI-X

    DDR 200 DDR 200

    DDR 200 DDR 20016 B

    CSB5Legacy I/O

    CSB5Legacy I/O 32-bit PCI

  • Page 8

    Page 15

    RJ Chevance

    SMP with moderate number of processors(6)

    n Crossbar-based Architectures: Sun Fire 3800 Architecture (Source: Sun)

    Processor& E Cache

    Memory8 GB

    CPU Data Switch

    Processor& E Cache

    Memory8 GB

    2.4 GB/sec2.4 GB/sec

    2.4 GB/sec2.4 GB/sec

    Processor& E Cache

    Memory8 GB

    CPU Data Switch

    Processor& E Cache

    Memory8 GB

    2.4 GB/sec2.4 GB/sec

    2.4 GB/sec2.4 GB/sec

    Address Repeater

    Board Data Switch

    4.8 G /sec

    4.8 GB/sec

    DatapathController

    150 millionaddresses/sec

    4.8 GB/sec

    Page 16

    RJ Chevance

    SMP with Many Processors 16 processorsn Because of the physical and electrical constraints, larger SMP

    systems must use a hierarchy of interconnect mechanisms; in arecursive manner, a large SMP tends to be a collection of smaller systems, each of which is itself an SMP

    n Such a hierarchy means that memory access times differ - when aprocessor accesses its own memory, it will see a much shorter access time than when it accesses memory in a distant module. This non-uniformityof memory access times can have a majoreffect on system behavior and on the software running on it

    n The degree of uniformity allows us to distinguish two classes of SMP systems.o Systems in which access to any memory takes an identical amount

    of time, or very nearly identical are called(respectively) UMA - forUniform Memory Access - or nearly UMA .

    o Systems in which there is a noticeable disparity in access times - afactor greater than 2 - are referred to as CC-NUMA, for Cache -Coherent Non-Uniform Memory Access

    n This classification is purely empirical, and its practical significance is that in a distinctly NUMA system, software must -for maximum efficiency - take account of the varying accesstimes. The amount of work needed for this adaptation tends toincrease with the NUMA ratio.

  • Page 9

    Page 17

    RJ Chevance

    Nearly UMA Systems

    n Fujitsu Prime Power System architecture (simplified) (after Fujitsu)

    ProcessorsMemoryPCI cardsL1 Crossbar

    System card

    Up to 8 cards(32 processors)

    L2 Crossbar

    Cabinet(32 processors)

    M

    PPPP

    PCI

    L1

    L2 L2

    L2L2

    Memory latency: 300 ns when a L2 crossbar is part of thaccess path (one may assume 200 ns for local access time)

    Page 18

    RJ Chevance

    Nearly UMA Systems(2)

    n HP Superdomeo HP Superdome Cell (after HP)

    CellController

    MemoryUp to 16 GB

    PA

    -860

    0

    PA

    -860

    0

    PA

    -860

    0

    PA

    -860

    0

    Cro

    ssba

    r

    I/OController

    AS

    IC PCI

    AS

    IC PCI

    AS

    ICPCI

    AS

    ICPCI

    1.6 GB/s

    6.4 GB/s

  • Page 10

    Page 19

    RJ Chevance

    Nearly UMA Systems(3)

    n HP Superdome Architecture (after HP)

    I/O

    CellC

    ross

    bar

    Cro

    ssba

    r

    Cro

    ssba

    r

    I/O

    Cell

    Cro

    ssba

    r

    BackplaneBackplane

    CabinetCabinet

    Memory latencies for a fully-loaded systemare said to be:

    within-cell: 260 ns; on the same crossbar: 320 to 350 ns; within the same cabinet: 395 ns; between cabinets: 415 ns.

    Page 20

    RJ Chevance

    Nearly UMA Systems(4)

    n IBM pSeries 690o Structure of Power4 MCM (after IBM)

    CPUCore

    CPUCore

    L3

    Dir

    .

    Shared L2

    Chip-Chip Comm.

    CPUCore

    CPUCore

    L3

    Dir

    .

    Shared L2

    Chip-Chip Comm.

    CPUCore

    CPUCore

    L3

    Dir

    .

    Shared L2

    Chip-Chip Comm.

    CPUCore

    CPUCore

    L3

    Dir

    .

    Shared L2

    Chip-Chip Comm.

    Multi-chip Module Boundary

    L3MemoryCtrl

    L3MemoryCtrl

    Mem

    ory

    Mem

    ory

    GX Bus

    GX Bus

    L3 MemoryCtrl

    L3 MemoryCtrl

    Mem

    ory

    Mem

    ory

    GX Bus

    GX Bus

  • Page 11

    Page 21

    RJ Chevance

    Nearly UMA Systems(5)

    o Architecture of IBM pSeries 690 (after IBM)

    P PL2

    P PL2

    P PL2

    P PL2

    GX

    GX

    GX GX

    L3L3

    L3L3

    L3L3 L3L3

    PPL2

    PPL2

    PPL2

    PPL2

    GX

    GX

    GXGX

    L3 L3

    L3 L3

    L3 L3L3 L3

    P PL2

    P PL2

    P PL2

    P PL2

    GX

    GX

    L3L3

    L3L3

    L3L3 L3L3

    PPL2

    PPL2

    PPL2

    PPL2

    GX

    GX

    L3 L3

    L3 L3

    L3 L3L3 L3

    GX GX GXGX

    GX Slot

    GX Slot

    GX Slot

    MemorySlot

    MemorySlot

    MemorySlot

    MemorySlot MemorySlot

    MemorySlot

    MemorySlot

    MemorySlot

    MCM 0MCM 1

    MCM 2 MCM 3

    Page 22

    RJ Chevance

    Nearly UMA Systems(6)

    n z900 Hardware Architecture (after IBM)

    PU01

    L1

    PU02

    L1

    PU03

    L1

    PU04

    L1

    PU05

    L1

    PU06

    L1

    PU07

    L1

    PU08

    L1

    PU09

    L1

    L2 cache control chip and L2 cache data chips (16 MB shared )

    PU00

    L1

    MBA

    1

    MBA

    0

    STISTI

    Memorycard

    0

    Memorycard

    2

    Clock

    PU0B

    L1

    PU0C

    L1

    PU0D

    L1

    PU0E

    L1

    PU0F

    L1

    PU10

    L1

    PU11

    L1

    PU12

    L1

    PU13

    L1

    L2 cache control chip and L2 cache data chips (16 MB shared )

    PU0A

    L1

    MBA

    2

    MBA

    3

    STISTI

    Memorycard

    1

    Memorycard

    3

    CCE 0 CCE 1

    ETR

    ETR

  • Page 12

    Page 23

    RJ Chevance

    Nearly UMA Systems(7)

    n Architecture of a system with 16 Itanium 2 processors (after INTEL)

    Itanium2

    Itanium2

    Itanium2

    Itanium2

    SNC

    Mem

    ory

    Fir

    mw

    are

    Hu

    b

    Itanium2

    Itanium2

    Itanium2

    Itanium2

    SNC

    Mem

    ory

    Fir

    mw

    are

    Hu

    b

    ScalabilityPort Switch

    I/O HubLegacy I/OController Hub

    PCI-X PCI-X

    Itanium2

    Itanium2

    Itanium2

    Itanium2

    SNC

    Mem

    ory

    Fir

    mw

    are

    Hu

    b

    Itanium2

    Itanium2

    Itanium2

    Itanium2

    SNC

    Mem

    ory

    Fir

    mw

    are

    Hu

    b

    ScalabilityPort Switch

    I/O HubLegacy I/OController Hub

    PCI-X PCI-X

    The system is based on 4-processor building blocks, using - here - the Scalability Port Switch and its Scalability Ports.These provide both connection and coherence, and offer a reasonable NUMA factor - Intel indicates a value of 2.2 for this system.

    Page 24

    RJ Chevance

    Nearly UMA Systems(8)

    n Structure Suns 3800(8), 4800(12) and 6800(24) Systems (after Sun)

    Fireplane address router

    Fireplane data crossbar

    CPU/MemoryBoard 4

    CPU/MemoryBoard 5

    CPU/MemoryBoard 6

    CPU/MemoryBoard 3

    CPU/MemoryBoard 2

    CPU/MemoryBoard 1

    I/O Assembly 1I/O Assembly 1

    I/O Assembly 2I/O Assembly 2

    I/O Assembly 4I/O Assembly 4

    I/O Assembly 3I/O Assembly 3

    Sun Fire 3800 Server

    Sun Fire 4800/4810 Server

    Sun Fire 6800 Server

    Local memory access time: 180 ns, access to memory on another card takes 240 ns

  • Page 13

    Page 25

    RJ Chevance

    CC-NUMA Architecture

    n Early CC-NUMA systems were based upon 4-way building blocks (Pentium-based) and used derivatives of the SCI (Scalable Coherent Interface). Early systems were characterized by poor performance

    n Principles of CC-NUMA Architecture

    Proc.

    Cache

    Memory

    E/SI/O

    Coherentnetwork i/f

    Module

    Coherent Network Interconnect

    Proc.

    Cache

    MmoireMmoire

    MmoireMmoire

    Memory

    E/SE/S

    E/SE/S

    I/O

    Physical Configuration Equivalent Logical Configuration

    Proc.

    Cache

    Memory

    E/SI/OCoherentnetwork i/f

    Module

    Page 26

    RJ Chevance

    CC-NUMA Architecture(2)

    n CC-NUMA performance is very sensitive to locality of information. Large NUMA factor (e.g. above 5) requires extensive software optimizations

    n Some examples of possible optimizations whenNUMA factor is large:o Affinity scheduling: arranging that a process, once

    suspended, is rescheduled on the same processor on which it last executed in the expectation that the caches contain useful state from the earlier execution

    o When allocating memory, prioritize local allocation (i.e., allocation within the module on which the program is executing) above distant allocation

    o When a process has had pages paged out, allocate memory for the pages when paged back in on the same module as the process is executing

  • Page 14

    Page 27

    RJ Chevance

    CC-NUMA Architecture(3)

    n General Architecture of SGI Origin 3000 System (after SGI)

    Possessor A Possessor B

    HubChip

    Memoryand

    Directory

    I/OCrossbar

    Module 0

    Scalable Interconnect Network : NUMAlink = 3.2 GB/sec.

    Module 1 Module 255

    Page 28

    RJ Chevance

    CC-NUMA Architecture(4)n 32 and 64 processor SGI Origin 3000

    configurations (after SGI)

    R

    M

    M

    R M

    M

    RM

    M

    R

    M

    M R

    M

    M

    R M

    M

    RM

    M

    R

    M

    M R

    M

    M

    R

    M

    M

    R M

    M

    RM

    M

    R

    M

    M R

    M

    M

    R M

    M

    RM

    M

    R

    M

    M R

    M

    M

    R

    M

    M

    R M

    M

    RM

    M

    R

    M

    M R

    M

    M

    R M

    M

    RM

    M

    R

    M

    M R

    M

    M

    32 processors

    64 processors

  • Page 15

    Page 29

    RJ Chevance

    CC-NUMA Architecture(5)n SGI Origin 3000 - 128-processor Configuration

    (after SGI)

    R

    M

    M

    R M

    M

    RM

    M

    R

    M

    M R

    M

    M

    R M

    M

    RM

    M

    R

    M

    M R

    M

    M

    R

    M

    M

    R M

    M

    RM

    M

    R

    M

    M R

    M

    M

    R M

    M

    RM

    M

    R

    M

    M R

    M

    M

    R

    M

    M

    R M

    M

    RM

    M

    R

    M

    M R

    M

    M

    R M

    M

    RM

    M

    R

    M

    M R

    M

    M

    R

    M

    M

    R M

    M

    RM

    M

    R

    M

    M R

    M

    M

    R M

    M

    RM

    M

    R

    M

    M R

    M

    M

    R

    RR

    R

    RR

    R

    R128 processors

    Hierachical Fat Bristled Hypercube

    Page 30

    RJ Chevance

    COMA Architecturen COMA Architecture (Cache Only Memory Architecture)

    aimed at large SMP: data doesnt have a fixed place in memory (that is, has no fixed address in the memory of a module), and the memory of each module acts as a cache. Data is moved, by hardware, to where it is being accessed, and data that is shared for reading may therefore be resident in several modules at any one time

    n Schematic diagram of a COMA architecture

    E/SE/S

    E/SE/S

    I/O

    Cache

    Proc. Cache(Memory)

    Physical Configuration Equivalent Logical Configuration

    Proc.

    Cache

    Memory

    E/SI/OCoherent network i/f

    Module

    Coherent Network Interconnect

    Proc.

    Cache

    Memory

    E/SI/OCoherent network i/f

    Module

  • Page 16

    Page 31

    RJ Chevance

    COMA Architecture(2)n Only one Company (KSR now disappeared) has introduced a

    COMA system to the market placen Advantages:

    o because memory is managed as a cache, data is always local to the module using it

    o many copies of data used read-only may exists simultaneously

    n Disadvantages:o the hardware is more complexo directory sizes increase, as does the time necessary to make

    allocate data

    n As originally described, COMA required a specific microprocessor architecture in the area of memory management (e.g. KSR) and so a very large investment

    n S-COMA (Simple COMA), a project at Stanford University), is using standard microprocessors

    Page 32

    RJ Chevance

    SMP, CC-NUMA, COMA: a Summary

    n The traditional UMA SMP architecture provides the best character istics as regards memory access time and the simplicity of the required cache coherence protocols

    n CC-NUMAs data uniqueness property (0 or 1 copies of an object in memory at any time) simplifies the coherence protocol at the expense of increased access time to data in other modules

    n COMAs ability to replicate data copies improves effective access time but results in a greater complexity in the needed cache coherence protocol

    VM Extracts0 or 1 copies,managed by

    the VM system

    VirtualMemory

    (VM)Implementation

    of VM

    Classic SMP CC-NUMA COMA

    VM Extracts0..N copies

    Managed by cache coherence

    VM Extracts0..N copies

    Managed by cache coherence

    Memory

    VM Extracts0..N copies

    Managed by cache coherence

    VM Extracts0 or 1 copies,managed by

    the VM system

    VM Extracts0 or 1 copies

    VM Extracts0 or 1 copies

    VM Extracts0..N copies

    Managed by cache coherence

    Cache-coherentinterconnect

    Proc.

    Cache

    Proc.

    Cache

    Proc.

    Cache

    Proc.

    Cache

    Proc.

    Cache

    Proc.

    Cache

    Proc.

    Cache

    Proc.

    Cache

    Proc.

    Cache

    Proc.

    Cache

    Memory Memory MemoryMemory

  • Page 17

    Page 33

    RJ Chevance

    Performance Improvement Provided by SMP n A number of factors prevents an N-processor

    SMP having N times the performance of a uniprocessor e.g.o Contention for the bandwidth of the processor-

    memory interconnect increases with the number of processors

    o Contention for shared memory increases with the number of processors

    o Maintaining cache coherence means that cache line invalidations, and movement of data from cache to cache, must occur

    o Reduction of cache hit rate as a result of more context switches

    o An increase in the number of instructions which must be executed to implement a given function because of the existence of locks and the contention for them

    n Very few data publicly available about scalability of SMP

    Page 34

    RJ Chevance

    Performance Improvement Provided by SMP(2)n Evolution of the behavior of a system multiprocessor

    (transactional application and DBMS) according to the successive versions of software showing the importance of improvements in the OS and the DBMS (the hardware being unchanged)

    0

    1

    2

    3

    4

    5

    1 2 3 4 5 6 7 8

    Number of Processors

    Rel

    ativ

    e P

    erfo

    rman

    ce

    Version i+1

    Version i

  • Page 18

    Page 35

    RJ Chevance

    Advantages and disadvantages of SMP architecture

    Advantages Disadvantages. Scalability of the system at a moderate cost. . Existence of at least one single point of failure - the

    operating system. Simple performance increase by adding a card or a processor module

    . Maintenance or update activities on the hardware and the OS generally require the whole system to be stopped

    . Multiprocessor effectiveness - the system performance increases (within definite limits) as the num ber of processors grows

    . Limitation of the number of processors because of access contention in the hardware (e.g., the bus) and software (operating system and DBMSs).

    . Simplicity of the programming model . Upgrade capability limited by rapid system obsolescence versus processor generations

    . Application transparency - single processor applications continue to work (although only multi- threaded applications can benefit from the architecture)

    . Complexity of the hardware.

    . Availability: if one processor fails, the system may be restarted with fewer processors

    . Adaptation and expensive tuning of the operating system.

    . The ability to partition the system . Necessary application adaptation to benefit from the performance available.

    Page 36

    RJ Chevance

    Partitioningn Partitioning divides a single physical SMP system into a number of

    independent logical SMPs, each logical SMP running its own OS. Examples of situations where partitioning is useful:o when consolidating several servers into a single system (for investment

    rationalization, for example), while maintaining the apparent independence of the servers;

    o sharing the same systems between independent activities whose balance changes over time

    n Example of system partitioning:

    Proc.

    Cache

    Proc.

    Cache

    Proc.

    Cache

    Memory

    SMP Physical Configuration

    Proc.

    Cache

    Proc.

    Cache

    Production System

    Proc.

    Cache

    Proc.

    Cache

    Test System forNew Versions

    (Operating Systemand/or Applications)

    Proc.

    Cache

    Proc.

    Cache

    DevelopmentSystem

    MemoryMemoryMemory

  • Page 19

    Page 37

    RJ Chevance

    Partitioning(2)n Techniques of Partitioning

    o Virtual Machines (introduced in the late 60s for mainframes and widely used, being now used for servers based upon standard technologies)l Virtual Machine - Principles of Operation

    l Each virtual machine has its own address space and Operating System (all can be different)

    l The Hpervisor executes privilegedmode instruction issued by guest OS onto real resources

    Hypervisor

    Proc.Proc. PhysicalPage

    PhysicalPage Storage

    OperatingSystem

    OperatingSystem

    HardwareResources

    Virtual Machine 1 Virtual

    Machine N

    App

    licat

    ion

    App

    licat

    ion

    App

    licat

    ion

    App

    licat

    ion

    Page 38

    RJ Chevance

    Partitioning(3)n Hardware Partitioning

    o Consists of defining, for a given physical system configuration, a number of logical systems, each with its own allocation of hardware resources (processor, memory, controllers and peripherals)

    n Logical Partitioningo Logical partitioning does not require dedication of the physical resources to

    logical systems; rather, the resources are managed by a special operating system which dynamically multiplexes the physical resources correctly between the logical systems

    Proc.Proc. Physical Page

    Physical Page Storage

    OperatingSystem

    OperatingSystem

    HardwareResources

    LogicalMachine 1

    LogicalMachine 1

    App

    licat

    ion

    App

    licat

    ion

    App

    licat

    ion

    App

    licat

    ion

    Virtual SystemAbstraction Layer

    Virtual SystemAbstraction Layer

    System Abstraction Layer

    Hyp

    ervi

    sor

    Res

    ourc

    eM

    anag

    emen

    t

  • Page 20

    Page 39

    RJ Chevance

    Partitioning(4)

    n Dynamic partitioning is key for the flexibility of the operation (otherwise the whole system must be stopped, reconfigured and restarted). Virtual Machines and Logical Partitioning facilitate dynamic partitioning

    n Workload Managers:o Complements an OS by ensuring that a given

    application (or set of applications) can be afforded a guaranteed slice of the actual physical systems resources (e.g, physical memory, processor time.)

    o A Resource Manager reserves sufficient resources to ensure that critical work may be performed as required, while allowing a system so support multiple concurrent applications. When the critical work does not use all pre -allocated resources, the unused resource slots may be used by other applications

    Page 40

    RJ Chevance

    Server Consolidationn The pressure on costs (TCO reduction), information managers are

    looking at reducing the number of servers deployed. Several approaches are proposed such as:o physical concentration were several servers are regrouped in the same

    location possibly in a cluster configuration (offering high availability if needed);

    o concentration were all applications distributed on several servers are supported on a single system (i.e. in a way similar to the traditional mainframe) with technologies such as:l Virtual machine approach l Workload management were a single operating system supports

    simultaneously several applications. Resource sharing between applications is being managed according to a set of stated rules.

    o data centric were data is supported by a single (high availability) system, offering to application servers the data service (application servers does not support data) ;

    o application integration were various applications and associated data are concentrated on a smaller set of servers in order to facilitate the cooperation between applications.

    n The choice of a solution depends heavily on the characteristics of the existing configuration and the set of constraints. Obviously, reducing the number of servers contributes to reduce the TCO. Both cost and business continuity issues may prevent to go further than physical concentration.