Dave Probert, Ph.D. - Windows Kernel Architect, Microsoft Windows Division
Copyright Microsoft Corporation

Evolution of the Windows Kernel Architecture, by Dave Probert


Originally from: http://www.palermo.edu/ingenieria/Oct2009.ppt


Page 2: Evolution of the Windows Kernel Architecture, by Dave Probert

About Me
• Ph.D. in Computer Engineering (Operating Systems w/o Kernels)
• Kernel Architect at Microsoft for over 13 years
  • Managed platform-independent kernel development in Win2K/XP
  • Working on multi-core & heterogeneous parallel computing support
  • Architect for UMS in Windows 7 / Windows Server 2008 R2
• Co-instigator of the Windows Academic Program
  • Providing kernel source and curriculum materials to universities
  • http://microsoft.com/WindowsAcademic or [email protected]
• Wrote the Windows material for leading OS textbooks
  • Tanenbaum, Silberschatz, Stallings
  • Consulted on others, including a successful OS textbook in China

Page 3: Evolution of the Windows Kernel Architecture, by Dave Probert

UNIX vs NT Design Environments

Environment which influenced fundamental design decisions

UNIX [1969]                            Windows (NT) [1989]
-------------------------------------  -------------------------------------
16-bit program address space           32-bit program address space
Kbytes of physical memory              Mbytes of physical memory
Swapping system with memory mapping    Virtual memory
Kbytes of disk, fixed disks            Mbytes of disk, removable disks
Uniprocessor                           Multiprocessor (4-way)
State-machine based I/O devices        Micro-controller based I/O devices
Standalone interactive systems         Client/Server distributed computing
Small number of friendly users         Large, diverse user populations

Page 4: Evolution of the Windows Kernel Architecture, by Dave Probert

Effect on OS Design

NT vs UNIX
Although both Windows and Linux have adapted to changes in the environment, the original design environments (i.e. in 1989 and 1969) heavily influenced the design choices:

Design choice          NT vs UNIX                   Influenced by
---------------------  ---------------------------  ----------------------
Unit of concurrency    Threads vs processes         Addr space, uniproc
Process creation       CreateProcess() vs fork()    Addr space, swapping
I/O                    Async vs sync                Swapping, I/O devices
Namespace root         Virtual vs Filesystem        Removable storage
Security               ACLs vs uid/gid              User populations

Page 5: Evolution of the Windows Kernel Architecture, by Dave Probert

Today’s Environment [2009]

64-bit addresses

GBytes of physical memory

TBytes of rotational disk

New Storage hierarchies (SSDs)

Hypervisors, virtual processors

Multi-core/Many-core

Heterogeneous CPU architectures, Fixed function hardware

High-speed internet/intranet, Web Services

Media-rich applications

Single user, but vulnerable to hackers worldwide

Convergence: Smartphone / Netbook / Laptop / Desktop / TV / Web / Cloud


Page 6: Evolution of the Windows Kernel Architecture, by Dave Probert

Windows Architecture

[Diagram: layered block diagram of Windows, top to bottom]
User mode:
• System Processes: Service Control Mgr., LSASS, WinLogon, Session Manager
• Services: SvcHost.Exe, WinMgt.Exe, SpoolSv.Exe, Services.Exe
• Applications: Task Manager, Explorer, User Application
• Environment Subsystems: Windows, POSIX, OS/2
• Subsystem DLLs, Windows DLLs
• NTDLL.DLL
Kernel mode:
• System Threads
• System Service Dispatcher
• (kernel mode callable interfaces)
• I/O Mgr, File System Cache, Object Mgr., Plug and Play Mgr., Power Mgr., Security Reference Monitor, Virtual Memory, Processes & Threads, Configuration Mgr (registry), Local Procedure Call
• Windows USER, GDI
• Device & File Sys. Drivers, Graphics Drivers
• Kernel
• Hardware Abstraction Layer (HAL)
hardware interfaces (buses, I/O devices, interrupts, interval timers, DMA, memory cache control, etc., etc.)

Page 7: Evolution of the Windows Kernel Architecture, by Dave Probert

Kernel-mode Architecture of Windows

[Diagram: layers, top to bottom]
user mode:
• NT API stubs (wrap sysenter) -- system library (ntdll.dll)
kernel mode:
• NTOS kernel layer: Trap/Exception/Interrupt Dispatch; CPU mgmt: scheduling, synchr, ISRs/DPCs/APCs
• NTOS executive layer: Procs/Threads, Virtual Memory, IPC, Object Mgr, Security, Caching Mgr, I/O, Registry, glue
• Drivers: Devices, Filters, Volumes, Networking, Graphics
• Hardware Abstraction Layer (HAL): BIOS/chipset details
firmware/hardware:
• CPU, MMU, APIC, BIOS/ACPI, memory, devices

Page 8: Evolution of the Windows Kernel Architecture, by Dave Probert

Kernel/Executive layers
• Kernel layer (ntos/ke, ~5% of NTOS source)
  • Abstracts the CPU
    • Threads, Asynchronous Procedure Calls (APCs)
    • Interrupt Service Routines (ISRs)
    • Deferred Procedure Calls (DPCs, aka Software Interrupts)
  • Provides low-level synchronization
• Executive layer
  • OS Services running in a multithreaded environment
  • Full virtual memory, heap, handles
• Extensions to NTOS: drivers, file systems, network, …

Page 9: Evolution of the Windows Kernel Architecture, by Dave Probert

NT (Native) API examples

NtCreateProcess (&ProcHandle, Access, SectionHandle, DebugPort, ExceptionPort, …)

NtCreateThread (&ThreadHandle, ProcHandle, Access, ThreadContext, bCreateSuspended, …)

NtAllocateVirtualMemory (ProcHandle, Addr, Size, Type, Protection, …)

NtMapViewOfSection (SectionHandle, ProcHandle, Addr, Size, Protection, …)

NtReadVirtualMemory (ProcHandle, Addr, Size, …)

NtDuplicateObject (srcProcHandle, srcObjHandle, dstProcHandle, dstHandle, Access, Attributes, Options)


Page 10: Evolution of the Windows Kernel Architecture, by Dave Probert

Windows Vista Kernel Changes
• Kernel changes mostly minor improvements
  • Algorithms, scalability, code maintainability
• CPU timing: Uses Time Stamp Counter (TSC)
  • Interrupts not charged to threads
  • Timing and quanta are more accurate
• Communication
  • ALPC: Advanced Lightweight Procedure Calls
  • Kernel-mode RPC
  • New TCP/IP stack (integrated IPv4 and IPv6)
• I/O
  • Remove a context switch from I/O Completion Ports
  • I/O cancellation improvements
• Memory management
  • Address space randomization (DLLs, stacks)
  • Kernel address space dynamically configured
• Security: BitLocker, DRM, UAC, Integrity Levels

Page 11: Evolution of the Windows Kernel Architecture, by Dave Probert

Windows 7 Kernel Changes
• Miscellaneous kernel changes
• MinWin
  • Change how Windows is built
  • Lots of DLL refactoring
  • API Sets (virtual DLLs)
• Working-set management
  • Runaway processes quickly start reusing own pages
  • Break up kernel working-set into multiple working-sets
    • System cache, paged pool, pageable system code
• Security
  • Better UAC, new account types, fewer BitLocker blockers
• Energy efficiency
  • Trigger-started background services
  • Core Parking
  • Timer-coalescing, tick skipping
• Major scalability improvements for large server apps
  • Broke apart last two major kernel locks, >64p
• Kernel support for ConcRT
  • User-Mode Scheduling (UMS)

Page 12: Evolution of the Windows Kernel Architecture, by Dave Probert

MinWin
• MinWin is first step at creating architectural partitions
  • Can be built, booted and tested separately from the rest of the system
  • Higher layers can evolve independently
  • An engineering process improvement, not a microkernel NT!
• MinWin was defined as set of components required to boot and access network
  • Kernel, file system driver, TCP/IP stack, device drivers, services
  • No servicing, WMI, graphics, audio or shell, etc, etc, etc
• MinWin footprint: 150 binaries, 25MB on disk, 40MB in-memory

Page 13: Evolution of the Windows Kernel Architecture, by Dave Probert

MinWin Layering

[Diagram: two layers]
• Upper layer: Shell, Graphics, Multimedia, Layered Services, Applets, Etc.
• MinWin (lower layer): Kernel, HAL, TCP/IP, File Systems, Drivers, Core System Services

Page 14: Evolution of the Windows Kernel Architecture, by Dave Probert

Timer Coalescing
• Secret of energy efficiency: Go idle and Stay idle
• Staying idle requires minimizing timer interrupts
• Before, periodic timers had independent cycles even when period was the same
• New timer APIs permit timer coalescing
  • Application or driver specifies tolerable delay
  • Timer system shifts timer firing

MarkRuss
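The coalescing idea on the Timer Coalescing slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Windows timer implementation: the function name `coalesce` and the tuple layout `(name, due_ms, tolerable_delay_ms)` are hypothetical, and the greedy grouping rule (fire at the latest moment the most urgent timer allows, then sweep in every timer whose window covers that moment) is one simple way to minimize wakeups, chosen here for clarity.

```python
from typing import List, Tuple

Timer = Tuple[str, int, int]  # (name, due_ms, tolerable_delay_ms), hypothetical layout


def coalesce(timers: List[Timer]) -> List[Tuple[int, List[str]]]:
    """Group timers into as few wakeups as possible.

    Each timer may fire anywhere in [due_ms, due_ms + tolerable_delay_ms].
    Returns a list of (fire_time_ms, [timer names]) in firing order.
    """
    # Process timers in order of the latest moment each one may fire.
    pending = sorted(timers, key=lambda t: t[1] + t[2])
    wakeups: List[Tuple[int, List[str]]] = []
    for name, due, tolerance in pending:
        if wakeups and due <= wakeups[-1][0]:
            # The previous wakeup already falls inside this timer's window,
            # so the timer rides along instead of causing its own interrupt.
            wakeups[-1][1].append(name)
        else:
            # Start a new wakeup as late as this timer tolerates.
            wakeups.append((due + tolerance, [name]))
    return wakeups


if __name__ == "__main__":
    timers = [("a", 100, 50), ("b", 120, 50), ("c", 140, 10), ("d", 300, 0)]
    for fire_time, names in coalesce(timers):
        print(fire_time, names)
```

With zero tolerance every timer needs its own interrupt; the larger the tolerable delay callers declare, the more timers share a wakeup and the longer the processor can stay idle.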

Page 15: Evolution of the Windows Kernel Architecture, by Dave Probert

Broke apart the Dispatcher Lock
• Scheduler Dispatcher lock hottest on server workloads
  • Lock protects all thread state changes (wait, unwait)
  • Very hot lock at >64x
• Dispatcher lock broken up in Windows 7 / Server 2008 R2
  • Each object protected by its own lock
  • Many operations are lock-free
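The move from one global dispatcher lock to per-object locks can be illustrated with a small Python sketch. Everything here is hypothetical and greatly simplified: `DispatcherObject`, `add_waiter`, `signal`, and `demo` are illustrative names, not kernel structures, and the real kernel also relies on lock-free techniques that this sketch does not attempt to show. The point is only that threads touching different objects never contend on a shared lock.

```python
import threading


class DispatcherObject:
    """Hypothetical stand-in for a waitable object (event, mutex, ...)."""

    def __init__(self, name):
        self.name = name
        self.lock = threading.Lock()  # per-object lock, not one global lock
        self.signaled = False
        self.waiters = []

    def add_waiter(self, waiter):
        """Register a waiter; returns False if the object is already signaled."""
        with self.lock:
            if self.signaled:
                return False
            self.waiters.append(waiter)
            return True

    def signal(self):
        """Signal the object and return the waiters that should be woken."""
        with self.lock:
            self.signaled = True
            woken, self.waiters = self.waiters, []
        return woken


def demo(num_objects=4, waits_per_object=1000):
    """Each thread works on its own object, so no lock is shared."""
    objects = [DispatcherObject(f"event{i}") for i in range(num_objects)]

    def worker(obj):
        for i in range(waits_per_object):
            obj.add_waiter((obj.name, i))

    threads = [threading.Thread(target=worker, args=(o,)) for o in objects]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [len(o.signal()) for o in objects]


if __name__ == "__main__":
    print(demo())
```

With a single global lock, every `add_waiter` and `signal` call in `demo` would serialize across all objects; with per-object locks the only serialization is between threads that touch the same object.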

Page 16: Evolution of the Windows Kernel Architecture, by Dave Probert

Removed PFN Lock
• Windows tracks the state of pages in physical memory
  • In use: in working sets
  • Not assigned: on paging lists: free, modified, standby, …
• Before, all page state changes protected by global PFN (Physical Frame Number) lock
• As of Windows 7 the PFN lock is gone
  • Pages are now locked individually
  • Improves scalability for large memory applications

Page 17: Evolution of the Windows Kernel Architecture, by Dave Probert

The Silicon Power Wall
The situation:
• Power² ∝ Clock frequency
• Voltage ∝ Power²
  ⇨ Clock frequency and Voltage offset each other
• Clock frequency inversely proportional to logic path length
Bad News:
• Power is about as low as it can go
• Logic paths between clocked elements are pretty short
Good News:
• Moore’s Law continues (# transistors doubles ~22 months)
• All that parallel computational theory is going into practice

Transistors going into more cores, not faster cores!

Software subject to Amdahl’s Law, not Moore’s Law
(or Gustafson’s Law – if my wife can find large enough datasets she cares about)
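A short worked calculation shows why the slide contrasts Amdahl’s Law with Gustafson’s Law. The formulas below are the standard textbook statements of the two laws; the function names and the sample values (a 95% parallel workload on 64 cores) are assumptions chosen for illustration, not figures from the talk.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Fixed problem size: the serial part limits the speedup."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / cores)


def gustafson_speedup(parallel_fraction: float, cores: int) -> float:
    """Problem size grows with the core count: the parallel part scales."""
    serial = 1.0 - parallel_fraction
    return serial + parallel_fraction * cores


if __name__ == "__main__":
    p, n = 0.95, 64  # illustrative values only
    print(f"Amdahl:    {amdahl_speedup(p, n):.2f}x")
    print(f"Gustafson: {gustafson_speedup(p, n):.2f}x")
```

Under Amdahl’s Law a 5% serial portion caps the speedup below 20x no matter how many cores are added, whereas Gustafson’s Law stays close to linear scaling as long as the dataset grows with the machine, which is the point of the slide’s remark about finding large enough datasets.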

Page 18: Evolution of the Windows Kernel Architecture, by Dave Probert

Approaches to HW parallelism
Homogeneous
• More big superscalar cores
  • Extend with private (or shared) SIMD engines (SSE on steroids)
  • (Maybe) not very energy efficient
• A few more big cores and lots of smaller, slower, cooler cores
  • Use SIMD for performance
  • Shut off idle small cores for energy efficiency (but leakage?)
• Lots of little fully programmable cores, all the same
  • Nobody has ever gotten this to work – more on this later
Heterogeneous
• Programmable Accelerators (e.g. GPUs)
  • Attach loosely-coupled, specialized (non-x86), energy-efficient cores
• Fixed-function Accelerators
  • Very energy-efficient, device-like computational units for very-specific tasks

Page 19: Evolution of the Windows Kernel Architecture, by Dave Probert

User Mode Scheduling (UMS)
• Improve support for efficient cooperative multithreaded scheduling of small tasks (over-decomposition)
  • Want to schedule tasks in user-mode
  • Use NT threads to simulate CPUs, multiplex tasks onto these threads
  • When a task calls into the kernel and blocks, the CPU may get scheduled to a different app
  • If a single NT thread per CPU, when it blocks it blocks.
  • Could have extra threads, but then kernel and user-mode are competing to schedule the CPU
• Tasks run arbitrary Win32 code (but only x64/IA64)
  • Assumes running on an NT thread (TEB, kernel thread)
• Used by ConcRT (Visual Studio 2010’s Concurrency Run-Time)

Page 20: Evolution of the Windows Kernel Architecture, by Dave Probert

Windows 7 User-Mode Scheduling
• UMS breaks NT thread into two parts:
  • UT: user-mode portion (TEB, ustack, registers)
  • KT: kernel-mode portion (ETHREAD, kstack, registers)
• Three key properties:
  • User-mode scheduler switches UTs w/o ring crossing
  • KT switch is lazy: at kernel entry (e.g. syscall, pagefault)
  • CPU returned to user-mode scheduler when KT blocks
• KT “returns” to user-mode by queuing completion
  • User-mode scheduler schedules corresponding UT
  • (similar to scheduler activations, etc)

Page 21: Evolution of the Windows Kernel Architecture, by Dave Probert

Normal NT Threading

[Diagram: user threads UT0, UT1, UT2 above the user/kernel boundary, kernel threads KT0, KT1, KT2 below it, with trap code, the NTOS executive, and the kernel-mode scheduler running on an x86 core]

• NT Thread is Kernel Thread (KT) and User Thread (UT)
• UT/KT form a single logical thread representing NT thread in user or kernel
• KT: ETHREAD, KSTACK, link to EPROCESS
• UT: TEB, USTACK

Page 22: Evolution of the Windows Kernel Architecture, by Dave Probert

User-Mode Scheduling (UMS)

[Diagram: user side shows the User-mode Scheduler, the Primary Thread, UT0, UT1, and the UT Completion list; kernel side shows trap code, the NTOS executive, Thread Parking, and KT0, KT1, KT2, with KT0 blocking]

• Only primary thread runs in user-mode
• Trap code switches to parked KT
• KT blocks ⇨ primary returns to user-mode
• KT unblocks & parks ⇨ queue UT completion

Page 23: Evolution of the Windows Kernel Architecture, by Dave Probert

UMS Based on NT threads
• Each NT thread has user & kernel parts (UT & KT)
• When a thread becomes UMS, KT never returns to UT (Well, sort of)
  • Instead, the primary thread calls the USched
• USched
  • Switches between UTs, all in user-mode
  • When a UT enters kernel and blocks, the primary thread will hand CPU back to the USched declaring UT blocked
  • When UT unblocks, kernel queues notification
  • USched consumes notifications, marks UT runnable
• Primary Thread
  • Self-identified by entering kernel with wrong TEB
  • So UTs can migrate between threads
  • Affinities of primaries and KTs are orthogonal issues
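The USched control flow described on this slide can be mimicked with a toy simulation. This is a sketch under loud assumptions, not the Windows UMS API: Python generators stand in for UTs, a `yield` stands in for a blocking system call, and the names `task`, `usched`, `blocked`, and `completion_list` are all hypothetical. The simulated kernel simply completes the oldest blocked call whenever the scheduler has nothing runnable.

```python
from collections import deque


def task(name, steps, log):
    """A hypothetical UT: does a slice of work, then blocks in the 'kernel'."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield "block"  # stands in for a system call that blocks


def usched(tasks):
    """Toy user-mode scheduler running on a single simulated primary thread."""
    runnable = deque(tasks)    # UTs ready to run in user mode
    blocked = deque()          # UTs whose kernel half is blocked
    completion_list = deque()  # notifications queued by the simulated kernel

    while runnable or blocked or completion_list:
        # Consume completion notifications and mark those UTs runnable.
        while completion_list:
            runnable.append(completion_list.popleft())

        if not runnable:
            # Simulated kernel: the oldest blocked call finishes and the
            # kernel queues a notification instead of resuming the UT itself.
            completion_list.append(blocked.popleft())
            continue

        ut = runnable.popleft()
        try:
            next(ut)            # switch to the UT entirely in "user mode"
            blocked.append(ut)  # UT blocked; the CPU comes back to the scheduler
        except StopIteration:
            pass                # UT finished


if __name__ == "__main__":
    log = []
    usched([task("A", 2, log), task("B", 2, log)])
    print(log)
```

When task A blocks, the scheduler immediately runs task B on the same simulated CPU rather than giving the processor away, which is the property the slide describes as the primary thread handing the CPU back to the USched.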

Page 24: Evolution of the Windows Kernel Architecture, by Dave Probert

UMS Thread Roles
• Primary threads: represent CPUs. Normal app threads enter the USched world and become primaries; primaries can also be created by UScheds to allow parallel execution
  • Primaries represent concurrent execution
• UMS threads (UT/KTs): allow blocking in the kernel without losing the CPU
  • UMS threads represent concurrent blocking in kernel

Page 25: Evolution of the Windows Kernel Architecture, by Dave Probert

Thread Scheduling vs UMS

[Diagram, first panel: conventional thread scheduling, with Thread1 through Thread6 distributed across Core 1, Core 2, and a pool of non-running threads. Second panel: UMS, with Core 1 and Core 2 and six user thread / kernel thread pairs (User Thread1 / Kernel Thread1 through User Thread6 / Kernel Thread6)]

MarkRuss

Page 26: Evolution of the Windows Kernel Architecture, by Dave Probert

Win32 compat considerations
• Why not Win32 fibers?
• TEB issues
  • Contains TLS and Win32-specific fields (incl LastError)
  • Fibers run on multiple threads, so TEB state doesn’t track
• Kernel thread issues
  • Visibility to TEB
  • I/O is queued to thread
  • Mutexes record thread owner
  • Impersonation
  • Cross-thread operations expect to find threads and IDs
  • Win32 code has thread and affinity awareness

Page 27: Evolution of the Windows Kernel Architecture, by Dave Probert

Futures: Master/Slave UMS?

[Diagram: remote side (Remote x86) shows the remote kernel, the Remote Scheduler, and UT0, UT1, UT2; the host x86 core shows trap code, the NTOS executive, the Kernel-mode Scheduler, Thread Parking, and KT0, KT1, KT2; the two sides are linked by a Syscall Request Queue and a Syscall Completion Queue]

• UTs (can) run on accelerators or x86s
• KTs run on x86s, syscalls remoted/batched
• Pagefaults are just like syscalls
• Accelerator never “loses the CPU” (implicit primary)

Page 28: Evolution of the Windows Kernel Architecture, by Dave Probert

Operating Systems Futures
• Many-core challenge
  • New driving force in software innovation: Amdahl’s Law overtakes Moore’s Law as high-order bit
  • Heterogeneous cores?
• OS Scalability
  • Loosely-coupled OS: mem + cpu + services?
• Energy efficiency
  • Shrink-wrap and Freeze-dry applications?
• Hypervisor/Kernel/Runtime relationships
  • Move kernel scheduling (cpu/memory) into run-times?
  • Move kernel resource management into Hypervisor?

Page 29: Evolution of the Windows Kernel Architecture, by Dave Probert

Windows Academic Program
• Windows Kernel Internals
  • Windows kernel in source (Windows Research Kernel – WRK)
  • Windows kernel in PowerPoint (Curriculum Resource Kit – CRK)
• Based on Windows Server 2003 Service Pack 1
  • Latest kernel at time of release
  • First kernel release with AMD64 support
• Joint program between Windows Product Group and MS Academic Groups
  • Program directed by Arkady Retik (Need a DVD? Have questions?)
  • Information available at http://microsoft.com/WindowsAcademic OR [email protected]
• Microsoft Academic Contacts in Buenos Aires
  • Miguel Saez ([email protected]) or Ezequiel Glinsky ([email protected])

Page 30: Evolution of the Windows Kernel Architecture, by Dave Probert

Thank you very much