Why can’t we do ‘raw’ I/O?
How the x86 stops user-programs from directly controlling devices, and how we devise a ‘workaround’
x86 Privilege Levels
• To let multiple users do multiple tasks in a manner that affords each some ‘protection’ against interference by others, any modern CPU implements two or more separate levels of ‘privilege’ for its operations: an ‘unrestricted privileges’ arena for the code in its Master Control Program (its ‘kernel’), and a ‘restricted privileges’ realm for the code in users’ application programs
Four Privilege Rings
Ring 3 (least-trusted level)
Ring 2
Ring 1
Ring 0 (most-trusted level)
Suggested purposes
Ring0: operating system kernel
Ring1: operating system services
Ring2: custom extensions
Ring3: ordinary user applications
Unix/Linux and Windows
Ring0: operating system
Ring1: unused
Ring2: unused
Ring3: application programs
IOPL
• The Intel x86 processor can either allow or prohibit accesses to system peripheral devices by code executing in the various ‘privilege rings’. A 2-bit field within the x86 FLAGS register, known as the I/O Privilege Level (IOPL) field, controls whether or not ‘in’ and ‘out’ are allowed to execute: they are permitted only when the current ring number is no greater than IOPL. Linux normally sets IOPL to zero
The x86 API registers
Intel Core-2 Quad processor
RAX RBX RCX RDX
RSP RBP RSI RDI
R8 R9 R10 R11
R12 R13 R14 R15
RIP RFLAGS
CS DS ES FS GS SS
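The FLAGS register
Bits 14..0, from most-significant to least-significant:
NT | IOPL (bits 13,12) | OF | DF | IF | TF | SF | ZF | 0 | AF | 0 | PF | 1 | CF
Status-flags: CF, PF, AF, ZF, SF, OF
Control-flags: TF, IF, DF, IOPL, NT
Legend:
CF = Carry Flag, PF = Parity Flag, AF = Auxiliary Flag, ZF = Zero Flag, SF = Sign Flag, OF = Overflow Flag
TF = Trap Flag, IF = Interrupt Flag, DF = Direction Flag, IOPL = I/O Privilege Level, NT = Nested Task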
‘seeflags.cpp’
• This demo-program allows us to view the settings of bits in the RFLAGS register – and the IOPL-field in particular (bits 13,12)
• When IOPL == 0, only ring0 code will be able to execute ‘in’ and ‘out’ instructions
• When IOPL == 3, then code executing in any of the rings will be able to execute I/O
• So – let’s change IOPL to 3 – but how?
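The bit-arithmetic behind such a check can be sketched in C. This is not the course's ‘seeflags.cpp’ itself: the helper names `iopl_of`, `read_rflags` and `report_iopl` are hypothetical, and the inline assembly assumes GCC or Clang on x86-64 Linux.

```c
#include <stdint.h>
#include <stdio.h>

/* Extract the 2-bit IOPL field (bits 13,12) from an RFLAGS image. */
static unsigned iopl_of(uint64_t rflags)
{
    return (unsigned)((rflags >> 12) & 3);
}

#if defined(__x86_64__)
/* Copy RFLAGS into a variable by way of the stack (pushfq, then popq).
   The two 'lea' instructions step over the 128-byte red zone that the
   x86-64 ABI lets user-mode code keep below %rsp; 'lea' alters no flags. */
static uint64_t read_rflags(void)
{
    uint64_t flags;
    __asm__ volatile("lea -128(%%rsp), %%rsp\n\t"
                     "pushfq\n\t"
                     "popq %0\n\t"
                     "lea 128(%%rsp), %%rsp"
                     : "=r"(flags) : : "memory");
    return flags;
}

/* Print the current RFLAGS value and its IOPL field. */
static void report_iopl(void)
{
    uint64_t flags = read_rflags();
    printf("RFLAGS=%016llx  IOPL=%u\n",
           (unsigned long long)flags, iopl_of(flags));
}
#endif
```

Calling `report_iopl()` from an ordinary application would normally be expected to show IOPL=0, in line with the Linux default mentioned earlier.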
‘pushfq’/’popfq’
• An idea suggested by the ‘inline’ assembly language in our ‘seeflags.cpp’ demo would be to just ‘pop’ a suitably designed value from the stack into the RFLAGS register
• But the CPU is not about to allow that if it’s currently executing ring3 code while IOPL is set to 0 – that would compromise the system’s intended ‘protection’
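The refusal described above can be observed directly. The sketch below assumes GCC or Clang on x86-64 Linux and uses a hypothetical helper name. It pops an RFLAGS image with IOPL set to 3 and then reads RFLAGS back. At ring3 the CPU raises no fault for this; it simply leaves the IOPL bits as they were.

```c
#include <stdint.h>

/* Try to set IOPL=3 with 'popfq', then report whether the IOPL bits
   actually changed. Returns 1 if they changed, 0 if they did not.
   On other architectures this stub just returns 0. */
static int popfq_changes_iopl(void)
{
#if defined(__x86_64__)
    uint64_t before, after;
    __asm__ volatile(
        "lea -128(%%rsp), %%rsp\n\t"   /* step over the ABI red zone     */
        "pushfq\n\t"                   /* push the current RFLAGS        */
        "movq (%%rsp), %0\n\t"         /* remember the original image    */
        "orq $0x3000, (%%rsp)\n\t"     /* set IOPL=3 in the stack image  */
        "popfq\n\t"                    /* ask the CPU to load that image */
        "pushfq\n\t"                   /* read RFLAGS back               */
        "popq %1\n\t"
        "lea 128(%%rsp), %%rsp"
        : "=&r"(before), "=r"(after) : : "memory", "cc");
    return ((before ^ after) & 0x3000) != 0;
#else
    return 0;
#endif
}
```

In an ordinary ring3 program `popfq_changes_iopl()` returns 0, which is why the change has to be made by ring0 code instead.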
Must do it from ring0!
• Our classroom’s Linux systems will allow us to install our own code-module, as an ‘add-on’ to the running kernel, and such code could therefore be executed without any restrictions (i.e., at ring0)
• This idea motivates us to explore briefly the programming ideas needed for writing our own LKM (Linux Kernel Module)
A module’s organization
my_info: the module’s ‘payload’ function
module_init, module_exit: the module’s two required administrative functions
Our ‘newproc.cpp’ utility
• For the type of LKM that creates a pseudo-file in the ‘/proc’ directory, there is a ‘skeleton’ of C-language code we can start from, and then add our own specific functionality to that skeleton-code
• You can quickly create this ‘skeleton’ file by using our ‘newproc.cpp’ utility-program
Software interrupts
• One way for a user-program, which normally executes in ring3, to switch to ring0 (if it’s allowed) is by using a ‘software interrupt’
• This is how the 32-bit version of Linux did its various system-calls, with ‘int $0x80’
• We can craft an LKM whose ‘payload’ is an interrupt service routine that would be able to change the IOPL from 0 to 3
Systems programming
• To accomplish this design-idea, we’ll need an understanding of our CPU’s interrupt mechanism, including some special data-structures located in kernel memory and some special CPU registers which allow the CPU to locate those data-structures
Descriptor Tables
IDTR locates the IDT: Interrupt Descriptor Table (256 Gate Descriptors)
GDTR locates the GDT: Global Descriptor Table (Segment Descriptors)
IDTR and GDTR are special processor registers used by the CPU for locating its Descriptor Tables within the system’s memory
IDT Descriptor-format
Each gate descriptor occupies four 32-bit words (fields listed from bit 31 down to bit 0):
3: reserved (=0)
2: offset 63..32
1: offset 31..16 | P | DPL | 0 | gate type | 00000 | IST
0: segment selector | offset 15..0
LEGEND:
segment-selector (for the handler’s code-segment)
offset within code-segment to handler’s entry-point
gate-type (0xE = Interrupt Gate, 0xF = Trap Gate)
IST = Interrupt Stack Table (0..7)
P = Present (1 = yes, 0 = no)
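The descriptor layout can be written down as a C structure. This is a minimal sketch, not code from ‘iokludge.c’: the struct and function names are hypothetical, and the field order assumes the little-endian memory layout of x86-64.

```c
#include <stdint.h>

/* One 16-byte gate descriptor, fields in memory order. */
struct gate_desc64 {
    uint16_t offset_lo;   /* handler offset 15..0                       */
    uint16_t selector;    /* selector for the handler's code-segment    */
    uint8_t  ist;         /* bits 2..0 = IST index, bits 7..3 = 0       */
    uint8_t  type_attr;   /* P (bit 7), DPL (bits 6..5), 0 (bit 4),
                             gate type (bits 3..0)                      */
    uint16_t offset_mid;  /* handler offset 31..16                      */
    uint32_t offset_hi;   /* handler offset 63..32                      */
    uint32_t reserved;    /* must be 0                                  */
};

_Static_assert(sizeof(struct gate_desc64) == 16, "gate must be 16 bytes");

/* Build a 'present' gate from its component fields. */
static struct gate_desc64 make_gate(uint64_t handler, uint16_t selector,
                                    unsigned type, unsigned dpl, unsigned ist)
{
    struct gate_desc64 g;
    g.offset_lo  = (uint16_t)(handler & 0xFFFF);
    g.selector   = selector;
    g.ist        = (uint8_t)(ist & 7);
    g.type_attr  = (uint8_t)(0x80 | ((dpl & 3) << 5) | (type & 0xF));
    g.offset_mid = (uint16_t)((handler >> 16) & 0xFFFF);
    g.offset_hi  = (uint32_t)(handler >> 32);
    g.reserved   = 0;
    return g;
}
```

A gate meant to be reached by a software interrupt from ring3 needs DPL = 3, so an Interrupt Gate of that kind would be built as `make_gate(handler, selector, 0xE, 3, 0)`, giving a type/attribute byte of 0xEE.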
IDTR register-format
IDTR (80-bits): Base-Address of the IDT segment (64-bits) | segment limit (16-bits)
Special processor instructions are used to ‘load’ this 10-byte register from a memory-image (‘LIDT’), or to ‘store’ this register’s value (‘SIDT’)
The ‘LIDT’ instruction can only be executed by code running in Ring0, but the ‘SIDT’ can be executed by code running at any privilege level.
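The 10-byte memory-image that ‘LIDT’ reads and ‘SIDT’ writes can be sketched as a packed C structure. The names here are hypothetical, and `__attribute__((packed))` assumes GCC or Clang; in memory the 16-bit limit comes first, followed by the 64-bit base-address.

```c
#include <stdint.h>

/* Memory-image of the IDTR register: 2-byte limit, then 8-byte base. */
struct idtr_image {
    uint16_t limit;   /* size of the table in bytes, minus 1 */
    uint64_t base;    /* address of the table's first gate   */
} __attribute__((packed));

_Static_assert(sizeof(struct idtr_image) == 10, "IDTR image is 10 bytes");

/* Describe a table of 'ngates' 16-byte gate descriptors at 'base'. */
static struct idtr_image make_idtr(uint64_t base, unsigned ngates)
{
    struct idtr_image r;
    r.limit = (uint16_t)(ngates * 16 - 1);
    r.base  = base;
    return r;
}
```

For the full table of 256 gates the limit works out to 256 × 16 − 1 = 4095, so the whole IDT fits exactly in one 4096-byte page.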
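Stack layout after an interrupt
Ring0 stack (64-bit entries), as the handler finds it on entry:
32(%rsp) SS
24(%rsp) RSP
16(%rsp) RFLAGS
8(%rsp) CS
0(%rsp) RIP
RSP0 = the ring0 stack-pointer value from which the CPU begins pushing these entries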
Our interrupt-9 handler
//-------------------- INTERRUPT SERVICE ROUTINE -----------------
void isr_entry( void );
asm(" .text ");
asm(" .type isr_entry, @function ");
asm("isr_entry: ");
asm(" orq $0x3000, 16(%rsp) ");
asm(" iretq ");
//----------------------------------------------------------------
Our ‘iokludge.c’ kernel module uses this ‘inline’ assembly language to generate the machine-code for handling an interrupt-9, which merely sets the IOPL-field (in the saved image of the RFLAGS register) to 3, and then resumes execution of the interrupted application program.
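The effect of that single ‘orq’ instruction can be modeled in ordinary C, treating the five saved quadwords as an array. This is only a user-space illustration with hypothetical names, not part of the kernel module.

```c
#include <stdint.h>

/* Index of each saved quadword; index * 8 is its offset from %rsp. */
enum {
    FRAME_RIP,      /*  0(%rsp) */
    FRAME_CS,       /*  8(%rsp) */
    FRAME_RFLAGS,   /* 16(%rsp) */
    FRAME_RSP,      /* 24(%rsp) */
    FRAME_SS,       /* 32(%rsp) */
    FRAME_QWORDS
};

/* Model of 'orq $0x3000, 16(%rsp)': set both IOPL bits (13,12) in the
   saved RFLAGS image and leave every other saved value untouched. */
static void set_saved_iopl3(uint64_t frame[FRAME_QWORDS])
{
    frame[FRAME_RFLAGS] |= 0x3000;
}
```

Because ‘iretq’ reloads RFLAGS from this saved image, the interrupted program resumes with IOPL equal to 3.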
Core-2 Quad system
Intel Core-2 Quad processor: CPU0, CPU1, CPU2, CPU3
The four CPUs, the system memory and the I/O devices are all attached to the system bus
‘smp_call_function()’
• This Linux kernel ‘helper’ routine allows a CPU to request all other CPUs to execute a specified subroutine of type: void function( void *info );
• In our current Linux kernel (vers. 2.6.26.6) this helper-routine takes four arguments:
  – The address of the subroutine’s entry-point
  – The address of data the subroutine needs
  – A flag that indicates whether or not to ‘retry’
  – A flag that indicates whether or not to ‘wait’
• (Note: Newer kernels omit the ‘retry’ argument)
Working with LKM’s
• Create an LKM skeleton using ‘newproc’
• Compile a new LKM using ‘mmake’
• Install an LKM’s compiled ‘kernel object’ using the Linux ‘/sbin/insmod’ command
• Remove an LKM from the running kernel using the Linux ‘/sbin/rmmod’ command
‘iokludge.c’
module_init:
1) Allocate a kernel memory page, to be used as a new Interrupt Descriptor Table
2) Save original contents of system register IDTR, so it can be restored later
3) Prepare a memory-image for the new value of register IDTR, referring to kpage
4) Setup pointers ‘oldidt’ and ‘newidt’ and copy the original IDT to our new page
5) Setup a Gate-Descriptor, to be installed as Gate 9 in our new IDT array
6) Activate the new Interrupt Descriptor Table on all the processors in our system
7) Return 0, to indicate a successful module-installation
module_exit:
1) Restore the original value to register IDTR in each of our system’s processors
2) Free the page of kernel memory that was previously allocated for use as an IDT
‘tryiopl3.cpp’
• This demo-program is a modification of our earlier ‘seeflags.cpp’ example – but here we included the software interrupt instruction ‘int $9’ which, if ‘iokludge.ko’ has been installed, will allow us to check that indeed the RFLAGS register’s IOPL has been changed from 0 to 3 – thereby permitting ‘in’ and ‘out’ to be executed!
Homework exercise
• Modify the ‘82573pci.cpp’ program that we weren’t able to execute, even with ‘sudo’, at our previous class meeting, replacing its call to Linux’s ‘iopl()’ library-function by the ‘inline’ assembly language statement for software interrupt 9, i.e. asm(" int $9 ");
• Then try again to compile and execute our ‘82573.cpp’ demo-program, only this time with our ‘iokludge.ko’ LKM installed