Dynamic Partitioning in Windows Longhorn

Santosh Jodh, Software Design Engineer, Windows Kernel Platform Group, santoshj @ microsoft.com, Microsoft Corporation

Mike Tricker, Program Manager, Windows Kernel Platform Group, miketri @ microsoft.com, Microsoft Corporation


Session Outline

Introduction to Dynamic Partitioning (DP)

Clarifying the terminology

Reliability, Availability & Serviceability (RAS)

Capacity on Demand (CoD)

Resource Management (RM)

Hot Add, Replace & Remove

Goals and non-goals for DP on Windows codenamed “Longhorn”

What we’re expecting others to do to support this

Session Goals

Attendees should leave this session with a good understanding of the following:

What Microsoft means by Dynamic Partitioning

DP-related terminology and acronyms

Microsoft’s goals and non-goals for DP in Windows Longhorn

Knowledge of where to find resources for DP

An Introduction to Dynamic Partitioning

A hardware partitionable server has the ability to create one or more isolated hardware partitions comprising processors, memory and I/O, each supporting a single Windows instance

A dynamically partitionable server has the ability to add, replace or remove hardware within a partition without needing to reboot the OS instance within the partition

Why is this interesting?

Hardware partition support has been available on some large servers for a number of years

Windows is supported on hardware partitionable systems today, but does not support dynamic hardware partitioning

With the projected increase in processor performance, Microsoft expects a number of these features to become available on mid-range systems

Microsoft plans to add support for dynamically partitionable hardware in Windows Longhorn

Why You Should Care About DP

Microsoft believes that the capabilities that have previously been limited to expensive high end systems are moving into the mainstream

Together with the introduction of multi-core processors this will make relatively small and inexpensive systems as powerful and reliable as today’s high end systems

This will push highly fault-tolerant enterprise-critical applications such as large databases and management information applications onto less expensive platforms

This means that a range of hardware that has not previously had to consider some of the issues with dynamic hardware will now need to do so

Much as RAID 5 changed the way in which we considered disks in the 1990s

What Do We Mean By a Partition?

[Figure: a single system built from six cells (Cell 1 to Cell 6) and divided into hardware partitions, each running its own OS. The partitions show three usage models: one scale-up application, e.g., a database (SQL); multiple applications (SQL, Exchange) running on one OS under Resource Management; and multiple Virtual Machines (VM1, VM2, VM3) running on one OS under Virtual Server. All running on a single system.]

Reliability, Availability and Serviceability

Minimizing unplanned downtime due to failing hardware

E.g. if a processor starts to show signs of failing (increasing number of corrected errors or thermal events) swap it with one that’s on standby without needing to reboot the computer (similar to a hot spare disk in RAID 5)
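The "signs of failing" idea can be sketched as a sliding-window count of corrected errors. This is a minimal illustration only: the class name, the threshold and the window length are assumptions made for the example, and it does not represent any Windows or WHEA interface.

```python
from collections import deque


class CorrectedErrorMonitor:
    """Flags a processor for replacement when corrected errors cluster in time."""

    def __init__(self, max_errors, window_seconds):
        self.max_errors = max_errors          # hypothetical policy threshold
        self.window_seconds = window_seconds  # hypothetical observation window
        self._timestamps = deque()

    def record(self, timestamp):
        """Record one corrected error; return True if replacement is advised."""
        self._timestamps.append(timestamp)
        # Drop errors that have aged out of the observation window.
        while self._timestamps and timestamp - self._timestamps[0] > self.window_seconds:
            self._timestamps.popleft()
        return len(self._timestamps) > self.max_errors
```

A management application could feed error timestamps into `record` and trigger a Hot Replace of the affected processor when it returns True.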

Capacity on Demand

The ability to enable processors that are physically present in the computer but not enabled by default

E.g. buy a system with 8 processors, only 4 of which are initially paid for, enabled and used by the OS, and then when the workload grows pay to enable 2 or 4 more

Resource Management

Sharing resources between two or more partitions

E.g. If the load on partition 1 is increasing whilst the workload on partition 2 is decreasing move processors and/or memory from partition 2 to partition 1 to better handle the increasing workload
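The example above can be sketched as a simple rebalancing rule. This is a toy model under stated assumptions: the `load` metric (demand in processor-equivalents), the function name and the one-processor-at-a-time policy are all hypothetical, and a real partition manager would apply its own vendor-specific policy.

```python
def rebalance(partitions, min_processors=1):
    """Move one processor from the least to the most loaded partition.

    `partitions` maps a partition name to {"processors": int, "load": float},
    where load is total demand in processor-equivalents (a hypothetical metric).
    Returns (donor, receiver), or None when no move would help.
    """
    def per_cpu(name):
        p = partitions[name]
        return p["load"] / p["processors"]

    receiver = max(partitions, key=per_cpu)
    donor = min(partitions, key=per_cpu)
    if donor == receiver or partitions[donor]["processors"] <= min_processors:
        return None
    # Only move if the donor would still be less loaded per processor
    # than the receiver is now.
    donor_after = partitions[donor]["load"] / (partitions[donor]["processors"] - 1)
    if donor_after >= per_cpu(receiver):
        return None
    partitions[donor]["processors"] -= 1
    partitions[receiver]["processors"] += 1
    return donor, receiver
```

Moving a processor out of a running partition implies a Hot Remove on the donor side, so the sketch describes the general concept rather than any particular OS release.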

More Terminology

Socket

A physical socket into which a processor and/or memory may be plugged mechanically

Sockets may also be independently powered

Partition Unit (PU)

A collection of system resources that form the smallest building blocks that can be assigned to a partition

E.g. processors, memory and I/O host bridges

More than one PU may be required to boot a partition

Yet More Terminology

Hot Add

Adding a socket or cell to a running partition

Hot Remove

Removing a socket or cell from a running partition

Hot Replace

Replacing a socket or cell in a running partition with one that is already physically present in the system but offline before the operation is started

Note that Hot Replace is NOT the same as Hot Remove followed by Hot Add

And Yet More Terminology

Hot Swap

Some vendors support a model that does not require the stand-by hardware to be physically present before the Replace operation is started, and thus IS equivalent to a Hot Remove followed by a Hot Add

Hot Plug

A term typically covering Hot Add and Hot Remove
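The difference between Hot Replace and a Hot Remove followed by a Hot Add can be sketched with a toy model. The `Partition` class, its logical slots and the `observed_sizes` log are hypothetical constructs used only to show what software running in the partition would observe; they do not describe how Windows or any firmware implements these operations.

```python
class Partition:
    """Toy model: logical slots seen by the OS mapped to physical units."""

    def __init__(self, units):
        self.slots = dict(enumerate(units))   # logical slot -> physical unit
        self.observed_sizes = [len(self.slots)]

    def _observe(self):
        self.observed_sizes.append(len(self.slots))

    def hot_add(self, unit):
        self.slots[max(self.slots, default=-1) + 1] = unit
        self._observe()

    def hot_remove(self, unit):
        slot = next(s for s, u in self.slots.items() if u == unit)
        del self.slots[slot]
        self._observe()

    def hot_replace(self, old_unit, spare_unit):
        # The spare takes over the same logical slot in one step, so the
        # resource count seen by software never changes.
        slot = next(s for s, u in self.slots.items() if u == old_unit)
        self.slots[slot] = spare_unit
        self._observe()
```

In this model a replace leaves the observed sizes at `[2, 2]` for a two-unit partition, while a remove followed by an add produces `[2, 1, 2]`: software briefly sees fewer resources.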

Assumptions We’re Making About Hardware

Future partitionable machines will contain PUs which comprise

Processors and memory together

Processors

Memory

I/O host bridges

The ACPI tables in those systems will be updated to expose specific methods required to support changing the hardware configuration without needing to reboot

The firmware will be able to assist the OS during Hot Add and Hot Replace operations

More Hardware Assumptions

Systems will include a Service Processor (SP) or Baseboard Management Controller (BMC)

PUs can be electrically isolated when not in use

No hardware assigned to a specific PU can be shared with other partitions, ensuring that a single failure cannot affect more than one partition

Dynamic Hardware Partitioning

[Figure: Future Hardware Partitionable Server. Multi-core processor sockets (cores sharing a cache) with memory, connected to IO Bridges and PCI Express, with a Service Processor and a Partition Manager.]

1. Partition Manager provides the UI for partition creation and management

2. Service Processor controls the inter processor and IO connections

3. Platforms partitionable to the socket level. Virtualization used for sub socket partitioning

4. Support for dynamic partitioning and socket replacement

Longhorn dynamic hardware partitioning features are focused on improving server RAS

Goals For Windows Longhorn

Support the Hot Add of:

Processors

Memory

I/O host bridges

Support the Hot Replace of:

Processors

Memory

OS support only

x64 and Itanium only – no 32-bit support will be provided

Server SKUs only - for SKUs supporting 4 processors or more only

Non-Goals For Windows Longhorn

Hot Remove

Windows Longhorn will not support the Hot Remove of processors or memory

However tools will be supplied to allow both device driver and application developers to validate that they behave correctly in the case of a Hot Remove operation for either processors or memory

Partition Manager

Today’s Partition Managers are proprietary to each major OEM’s platform, and Microsoft will not be providing equivalent functionality in Windows Longhorn

Microsoft will work with the system vendors to enable Windows DP support via their partition management tools

SP & BMC “drivers”

SPs and BMCs are devices that can be accessed from Windows via a device driver

Windows Longhorn will include an IPMI driver which can communicate with SPs and BMCs via a standard interface, but will not provide specific drivers for any vendor’s SP or BMC

Supporting Technologies

Windows Hardware Error Architecture (WHEA)

Error infrastructure designed to support (amongst other things) DP, especially Hot Replace operations

Making hardware error information more easily available for management applications to analyze and make failure predictions

Extends the Machine Check Architecture available with the Intel Itanium platform

Multi-level rebalance

Windows Longhorn offers more sophisticated and extensive rebalance operations when hardware is added or removed

This is not specific to DP, but will be leveraged by DP to make these operations as efficient as possible

PCI Express and specifically Advanced Error Reporting

The PCI bus is unable to report many errors, and most end up as NMIs

PCI Express introduces AER and supports error correction, which will be exposed by WHEA for error prediction by management applications

Status of the Various Components

Hot Add of memory is already supported by Windows Server 2003

x86 support shipped in Windows Server 2003 RTM

x64 & Itanium support was added in Windows Server 2003 Service Pack 1

Hot Add of I/O

Various device classes supporting Hot Plug are already available

With Windows Longhorn the extended support for PCI Express devices makes this a very compelling feature

Hot Add Processor support is now in test

On x64 and Itanium

Hot Replace for processors and memory is under development

What DP Implies to an Application Developer

Add: applications can register for plug & play notifications of new hardware arriving

Application developers with hard dependencies on memory or number of threads should watch for these notifications and update their behavior accordingly

Resource management software, such as Microsoft’s WSRM, can abstract these changes such that the majority of applications do not need to explicitly handle these notifications

Replace: applications will be unaffected and will see no change in the system

Application developers need do nothing

Remove: applications cannot make hard assumptions about memory or thread affinity

Application developers can no longer assume, as they may today, that the amount of memory is fixed, and should not rely upon thread affinity or the size of thread pools being static
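The advice not to treat thread pool sizes as static can be sketched as follows. This is a minimal illustration, assuming the application has some way of learning that hardware changed; the `WorkerBudget` class and its `on_processor_change` handler are hypothetical names, and the notification mechanism itself is not shown.

```python
import os


class WorkerBudget:
    """Derives a worker count from the current processor count on demand."""

    def __init__(self, workers_per_processor=2, count_processors=os.cpu_count):
        self.workers_per_processor = workers_per_processor
        self._count_processors = count_processors  # injectable for testing
        self.target_workers = self._compute()

    def _compute(self):
        # os.cpu_count() may return None, so fall back to a single processor.
        return (self._count_processors() or 1) * self.workers_per_processor

    def on_processor_change(self):
        """Hypothetical handler: call when a hardware-change notification arrives.

        Returns how many workers to start (positive) or retire (negative).
        """
        previous, self.target_workers = self.target_workers, self._compute()
        return self.target_workers - previous
```

The point of the design is that the processor count is re-read whenever the hardware changes, rather than being captured once at startup and cached for the life of the process.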

What DP Implies to a Driver Developer

Add: drivers can register for plug & play notifications of new hardware arriving

Driver developers have fewer memory size limitations than application developers, and pool sizes will not change even if overall memory grows.

The addition of processors and the related interrupt routing changes should also be invisible to drivers

So in the Add case most drivers will not do anything new

Replace: drivers will be unaffected and will see no change in the system

There are implications around device timeouts, as it will be necessary to quiesce the system whilst the replace operation completes

Remove: drivers cannot make hard assumptions about memory or thread affinity

Drivers cannot make any assumptions around thread affinity, or even that the affinity mask will remain contiguous as it is today
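The point about non-contiguous affinity masks can be sketched by comparing two ways of turning a mask into processor indices. Drivers themselves are written in C against the kernel's affinity types; Python is used here only to show the logic, and both function names are hypothetical.

```python
def processors_in_mask(mask):
    """Return the processor indices whose bits are set in an affinity mask."""
    indices = []
    index = 0
    while mask:
        if mask & 1:
            indices.append(index)
        mask >>= 1
        index += 1
    return indices


def naive_processors(mask):
    """Incorrect shortcut: assumes processors 0..n-1 are the ones present."""
    return list(range(bin(mask).count("1")))
```

For a contiguous mask such as `0b1111` both functions agree. For a mask with gaps such as `0b101101`, only `processors_in_mask` returns the processors that are actually present (0, 2, 3 and 5), which is why code that counts bits and then indexes from zero breaks once processors can be removed.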

Logo Requirements and Testing

NOTE: DP is a Server-only feature, so there are no new Client requirements arising from this feature

A number of new requirements are being proposed for the Microsoft logo program for Server to support DP

Most apply to either platform firmware or device drivers

Specific ACPI method support

Device drivers must not assume that the processor affinity mask is contiguous

We will also be providing test tools to ensure that you’re ready for Hot Remove support in a subsequent Windows release

These will apply to both applications and device drivers

Other Implications of DP

What about NUMA?

What happens to the System Resource Affinity Table (SRAT) or System Locality Distance Information Table (SLIT) when new hardware gets added?

Nothing happens to the SRAT as it’s a static table updated (and read by Windows) only at boot time

So it will be updated the first time the system reboots after hardware is added

For Windows Longhorn we’re not making use of the SLIT nor supporting the _SLI method to update locality information dynamically, so again nothing needs to be done here

Summary

Windows Longhorn is planned to contain support for:

Hot Add of processors, memory and I/O host bridges

Hot Replace of processors and memory

Windows Longhorn will not contain support for:

Hot Remove of memory and processors

An in-box Partition Manager

There are things you’ll need to do to:

Enable DP on your systems

Make use of the benefits offered by DP if your application is hardware-aware, and avoid failing when hardware changes underneath you

Ensure that your device drivers work correctly on DP-capable systems

Call to Action

Application developers can benefit from DP if they make their application DP-aware

Driver developers need to make their drivers DP-aware to work well on DP-capable systems

Either may fail completely if it behaves badly when hardware changes beneath it

You may already be talking to us if you’re interested in DP

If you’re interested and aren’t yet talking to us then please do!

Community Resources

Windows Hardware & Driver Central (WHDC)

www.microsoft.com/whdc/default.mspx

Technical Communities

www.microsoft.com/communities/products/default.mspx

Non-Microsoft Community Sites

www.microsoft.com/communities/related/default.mspx

Microsoft Public Newsgroups

www.microsoft.com/communities/newsgroups

Technical Chats and Webcasts

www.microsoft.com/communities/chats/default.mspx

www.microsoft.com/webcasts

Microsoft Blogs

www.microsoft.com/communities/blogs

Additional Resources

Email: dpfb @ microsoft.com

Related Sessions

Windows Hardware Error Architecture

Error Management Solutions Synergy with WHEA

© 2005 Microsoft Corporation. All rights reserved. This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.