14
1    TERM PAPER OF DATA ST RUCTURE COURSE CODE:-CSE205  TOPIC:- Role of Data structure in OS kernel design DATE OF SUBMISSION:- 01-05-2010   SUBMITTED TO:-    SUBMITTED BY:- MISS.KOMAL     ARADHANA KUMARI SECTION:-C2801B29 REG:-10809078    

Lovely Profession University

Embed Size (px)

Citation preview

Page 1: Lovely Profession University

8/8/2019 Lovely Profession University

http://slidepdf.com/reader/full/lovely-profession-university 1/14

1

 

 

 

TERM PAPER OF DATA STRUCTURE

COURSE CODE:-CSE205 

TOPIC:- Role of Data structure in OS kernel design

DATE OF SUBMISSION:-01-05-2010

 

 

SUBMITTED TO:-       SUBMITTED BY:-

MISS.KOMAL         ARADHANA KUMARI

SECTION:-C2801B29

REG:-10809078

 

 

 

Page 2: Lovely Profession University

8/8/2019 Lovely Profession University

http://slidepdf.com/reader/full/lovely-profession-university 2/14

 

 

ACKNOWLEDGEMENT 

I would like to express my gratitude to our data structure lecturer MissKomal mam  who have given me the opportunity to make project on such an

interesting topic. This work is the consequence of the guidance of my lecturer.

The approach used in order to complete this work is totally research oriented. 

I extend my gratitude to my friends  who  have  helped  and  encouraged

me to  bring out this  fruitful result. Sincerely thanks to u all.

At last but not least I am thankful to almighty who helped me at every step of my

research and analysis. 

 

 

Aradhana kumari  

 

 

 

 

 

 

 

 

 

 

 

 

Page 3: Lovely Profession University

8/8/2019 Lovely Profession University

http://slidepdf.com/reader/full/lovely-profession-university 3/14

 

Data Structures plays vital roll in real world applications. In OS process management we use

different scheduling schemes, in those there is imp roll of these data structures. we have different

file systems like ext2,ext3 etc, to maintain these also we use data structures,(superblock-double

linked lists), in task management also we use these data structures.

KERNEL:-

The kernel is the central part of an operating system, that directly controls the computer 

hardware. Usually, the kernel is the first of the user-installed software on a computer, booting

directly after the BIOS. Operating system kernels are specific to the hardware on which they are

running, thus most operating systems are distributed with different kernel options that are

configured when the system is installed. Changing major hardware components such as the

motherboard, processor, or memory, often requires a kernel update. Additionally, often new

kernels are offered that improve system security or performance. The two major types of kernels

competing in today's computer markets are the Windows kernel and the unix-like kernels.

The Windows kernel is available only with the Microsoft Windows series of operating systems.

It is proprietary software, developed and distributed by Microsoft Corporation. Introduced in

Windows/386, it's many incarnations have since gone by several different names, and some had

no names at all. The latest version of the Windows kernel was introduced in Windows NT, and

has had many of it's functions removed and placed in user-mode software for Windows Vista.

This leads to increased system stability and security. In Vista, application-level software exploits

have much less access to the core functions of the operating system, and application crashes will

not bring down the OS.

Unix-like kernels are a family of operating system kernels that are based upon, or operate similar to, the original Bell Labs UNIX operating system. Common examples of unix-like kernels are

the Linux kernel, BSD, Mac OS, and Solaris. While many of these kernels were developed with

original Bell Labs code as part of the software, not all of them have direct lineage to Bell. Linux,

for instance, was developed as a free alternative to Minix, itself an independently developed

variation of UNIX. Although originally running an original kernel design, Mac OS was outfitted

with a unix-like kernel in 1988 with the introduction of A/UX. All subsequent Apple operating

systems have unix-like kernels, including the current Mac OS-X's BSD-derived kernel

Also,,

In computing, the kernel is the central component of most computer operating systems; it is a

bridge between applications and the actual data processing done at the hardware level. The

kernel's responsibilities include managing the system's resources (the communication between

hardware and software components). Usually as a basic component of an operating system, a

kernel can provide the lowest-level abstraction layer for the resources (especially processors and

I/O devices) that application software must control to perform its function. It typically makes

Page 4: Lovely Profession University

8/8/2019 Lovely Profession University

http://slidepdf.com/reader/full/lovely-profession-university 4/14

 

these facilities available to application processes through inter-process communication

mechanisms and system calls.

 

 

Data structures are very much important in Kernel development. Simple example would be stack 

what we say where local variables get stores is implemented with data structures.. Even the

Buffers are maintained in the double linked list..

Kernel Data Structures

The operating system must keep a lot of information about the current state of the system. Asthings happen within the system these data structures must be changed to reflect the current

reality. For example, a new process might be spawned when a user logged into the system. Thekernel must create a data structure representing the new process. The scheduler can then use it if 

it decides to schedule that process to run when it is its turn.

Mostly these data structures exist in physical memory and are accessible only by the kernel andits subsystems. Data structures contain data and pointers; addresses of other data structures or the

addresses of routines. Taken all together, the data structures used by the Linux kernel can look very confusing. Every data structure has a purpose and although some are used by several kernel

subsystems, they are more simple than at first seen. This book bases its description of the Linuxkernel on its data structures. It talks about each kernel subsystem in terms of its algorithms, its

methods of getting things done, and their usage of the kernel's data structures. The idea behindAbstract Data Types is to encapsulate the entire implementation of a data structure, and provide

just a well defined interface for manipulating it. The benefit of this approach is that it provides aclean separation. The data type can be implemented with no knowledge of the application which

might end up using it, and the application can be implemented with no knowledge of theimplementation details of the data type. Both sides simply write code based on the interface

which works like a contract to explicitly state what is required and what can be expected.

Kernel basic facilities

The kernel's primary purpose is to manage the computer's resources and allow other programs torun and use these resources. Typically, the resources consist of:

Page 5: Lovely Profession University

8/8/2019 Lovely Profession University

http://slidepdf.com/reader/full/lovely-profession-university 5/14

 

� The Central Processing Unit (CPU, the processor). This is the most central part of acomputer system, responsible for running or executing programs on it. The kernel takes

responsibility for deciding at any time which of the many running programs should be allocatedto the processor or processors (each of which can usually run only one program at a time)

�The computer's memory. Memory is used to store both program instructions and data.Typically, both need to be present in memory in order for a program to execute. Often multiple

programs will want access to memory, frequently demanding more memory than the computer 

has available. The kernel is responsible for deciding which memory each process can use, anddetermining what to do when not enough is available.

� Any Input/Output (I/O) devices present in the computer, such as keyboard, mouse, disk 

drives, printers, displays, etc. The kernel allocates requests from applications to perform I/O toan appropriate device (or subsection of a device, in the case of files on a disk or windows on a

display) and provides convenient methods for using the device (typically abstracted to the pointwhere the application does not need to know implementation details of the device).

Key aspects necessary in resource managements are the definition of an execution domain(address space) and the protection mechanism used to mediate the accesses to the resources

within a domain.

Kernels also usually provide methods for synchronization and communication between processes(called inter-process communication or IPC).

A kernel may implement these features itself, or rely on some of the processes it runs to provide

the facilities to other processes, although in this case it must provide some means of IPC to allowprocesses to access the facilities provided by each other.

Finally, a kernel must provide running programs with a method to make requests to access thesefacilities. 

Process management

The main task of a kernel is to allow the execution of applications and support them withfeatures such as hardware abstractions. A process defines which memory portions the application

can access..Kernel process management must take into account the hardware built-in equipmentfor memory protection.And here data structures uses for management process.

To run an application, a kernel typically sets up an address space for the application, loads thefile containing the application's code into memory (perhaps via demand paging), sets up a stack 

for the program and branches to a given location inside the program, thus starting its execution.

Multi-tasking kernels are able to give the user the illusion that the number of processes being runsimultaneously on the computer is higher than the maximum number of processes the computer 

is physically able to run simultaneously. Typically, the number of processes a system may run

Page 6: Lovely Profession University

8/8/2019 Lovely Profession University

http://slidepdf.com/reader/full/lovely-profession-university 6/14

 

simultaneously is equal to the number of CPUs installed (however this may not be the case if theprocessors support simultaneous multithreading).

In a pre-emptive multitasking system, the kernel will give every program a slice of time and

switch from process to process so quickly that it will appear to the user as if these processes were

being executed simultaneously. The kernel uses scheduling algorithms to determine whichprocess is running next and how much time it will be given. The algorithm chosen may allow for some processes to have higher priority than others. The kernel generally also provides these

processes a way to communicate; this is known as inter-process communication (IPC) and themain approaches are shared memory, message passing and remote procedure calls (see

concurrent computing).

Other systems (particularly on smaller, less powerful computers) may provide co-operativemultitasking, where each process is allowed to run uninterrupted until it makes a special request

that tells the kernel it may switch to another process. Such requests are known as "yielding", andtypically occur in response to requests for interprocess communication, or for waiting for an

event to occur. Older versions of Windows and Mac OS both used co-operative multitasking butswitched to pre-emptive schemes as the power of the computers to which they were targeted

grew.

The operating system might also support multiprocessing (SMP or Non-Uniform Memory

Access); in that case, different programs and threads may run on different processors. A kernelfor such a system must be designed to be re-entrant, meaning that it may safely run two different

parts of its code simultaneously. This typically means providing synchronization mechanisms(such as spinlocks) to ensure that no two processors attempt to modify the same data at the same

time.

Memory management

The kernel has full access to the system's memory and must allow processes to safely access this

memory as they require it. Often the first step in doing this is virtual addressing, usually achievedby paging and/or segmentation. Virtual addressing allows the kernel to make a given physical

address appear to be another address, the virtual address. Virtual address spaces may be differentfor different processes; the memory that one process accesses at a particular (virtual) address

may be different memory from what another process accesses at the same address. This allowsevery program to behave as if it is the only one (apart from the kernel) running and thus prevents

applications from crashing each other.

On many systems, a program's virtual address may refer to data which is not currently inmemory. The layer of indirection provided by virtual addressing allows the operating system touse other data stores, like a hard drive, to store what would otherwise have to remain in main

memory (RAM). As a result, operating systems can allow programs to use more memory thanthe system has physically available. When a program needs data which is not currently in RAM,

the CPU signals to the kernel that this has happened, and the kernel responds by writing thecontents of an inactive memory block to disk (if necessary) and replacing it with the data

Page 7: Lovely Profession University

8/8/2019 Lovely Profession University

http://slidepdf.com/reader/full/lovely-profession-university 7/14

 

requested by the program. The program can then be resumed from the point where it wasstopped. This scheme is generally known as demand paging.

Virtual addressing also allows creation of virtual partitions of memory in two disjointed areas,

one being reserved for the kernel (kernel space) and the other for the applications (user space).

The applications are not permitted by the processor to address kernel memory, thus preventingan application from damaging the running kernel. This fundamental partition of memory spacehas contributed much to current designs of actual general-purpose kernels and is almost universal

in such systems, although some research kernels (e.g. Singularity) take other approaches.

Device management

To perform useful functions, processes need access to the peripherals connected to the computer,

which are controlled by the kernel through device drivers. For example, to show the user something on the screen, an application would make a request to the kernel, which would

forward the request to its display driver, which is then responsible for actually plotting the

character/pixel.

A kernel must maintain a list of available devices. This list may be known in advance (e.g. on an

embedded system where the kernel will be rewritten if the available hardware changes),configured by the user (typical on older PCs and on systems that are not designed for personal

use) or detected by the operating system at run time (normally called plug and play).

In a plug and play system, a device manager first performs a scan on different hardware buses,such as Peripheral Component Interconnect (PCI) or Universal Serial Bus (USB), to detect

installed devices, then searches for the appropriate drivers.

As device management is a very OS-specific topic, these drivers are handled differently by eachkind of kernel design, but in every case, the kernel has to provide the I/O to allow drivers tophysically access their devices through some port or memory location. Very important decisions

have to be made when designing the device management system, as in some designs accessesmay involve context switches, making the operation very CPU-intensive and easily causing a

significant performance overhead.[citation needed]

System calls

To actually perform useful work, a process must be able to access the services provided by the

kernel. This is implemented differently by each kernel, but most provide a C library or an API,

which in turn invokes the related kernel functions.[14]

The method of invoking the kernel function varies from kernel to kernel. If memory isolation isin use, it is impossible for a user process to call the kernel directly, because that would be a

violation of the processor's access control rules. A few possibilities are:

� Using a software-simulated interrupt. This method is available on most hardware, and istherefore very common.

Page 8: Lovely Profession University

8/8/2019 Lovely Profession University

http://slidepdf.com/reader/full/lovely-profession-university 8/14

 

� Using a call gate. A call gate is a special address stored by the kernel in a list in kernelmemory at a location known to the processor. When the processor detects a call to that address, it

instead redirects to the target location without causing an access violation. This requireshardware support, but the hardware for it is quite common.

�Using a special system call instruction. This technique requires special hardware support,which common architectures (notably, x86) may lack. System call instructions have been added

to recent models of x86 processors, however, and some (but not all) operating systems for PCs

make use of them when available.

� Using a memory-based queue. An application that makes large numbers of requests butdoes not need to wait for the result of each may add details of requests to an area of memory that

the kernel periodically scans to find requests.

Kernel design decisions

Issues of kernel support for protection

An important consideration in the design of a kernel is the support it provides for protection from

faults (fault tolerance) and from malicious behaviors (security). These two aspects are usuallynot clearly distinguished, and the adoption of this distinction in the kernel design leads to the

rejection of a hierarchical structure for protection.[1]

The mechanisms or policies provided by the kernel can be classified according to several criteria,as: static (enforced at compile time) or dynamic (enforced at runtime); preemptive or post-

detection; according to the protection principles they satisfy (i.e. Denning[15][16]); whether theyare hardware supported or language based; whether they are more an open mechanism or a

binding policy; and many more.

Fault tolerance

A useful measure of the level of fault tolerance of a system is how closely it adheres to theprinciple of least privilege. In cases where multiple programs are running on a single computer,

it is important to prevent a fault in one of the programs from negatively affecting the other.Extended to malicious design rather than a fault, this also applies to security, where it is

necessary to prevent processes from accessing information without being granted permission.

The two major hardware approaches for protection (of sensitive information) are Hierarchical

protection domains (also called ring architectures, segment architectures or supervisor mode),andCapability-based addressing.

 

 

 

Page 9: Lovely Profession University

8/8/2019 Lovely Profession University

http://slidepdf.com/reader/full/lovely-profession-university 9/14

 

 

 

 

Privilege rings, such as in the x86, are a common implementation of Hierarchical protection

domains used in many commercial systems to have some level of fault tolerance.

Hierarchical protection domains are much less flexible, as is the case with every kernel with ahierarchical structure assumed as global design criterion. In the case of protection it is not

possible to assign different privileges to processes that are at the same privileged level, andtherefore is not possible to satisfy Denning's four principles for fault tolerance (particularly the

Principle of least privilege). Hierarchical protection domains also have a major performancedrawback, since interaction between different levels of protection, when a process has to

manipulate a data structure both in 'user mode' and 'supervisor mode', always requires messagecopying (transmission by value). A kernel based on capabilities, however, is more flexible in

assigning privileges, can satisfy Denning's fault tolerance principles, and typically doesn't suffer from the performance issues of copy by value.

Both approaches typically require some hardware or firmware support to be operable and

efficient. The hardware support for hierarchical protection domains is typically that of "CPUmodes." An efficient and simple way to provide hardware support of capabilities is to delegate

the MMU the responsibility of checking access-rights for every memory access, a mechanismcalled capability-based addressing. Most commercial computer architectures lack MMU support

for capabilities. An alternative approach is to simulate capabilities using commonly-supportedhierarchical domains; in this approach, each protected object must reside in an address space that

the application does not have access to; the kernel also maintains a list of capabilities in suchmemory. When an application needs to access an object protected by a capability, it performs a

system call and the kernel performs the access for it. The performance cost of address space

switching limits the practicality of this approach in systems with complex interactions betweenobjects, but it is used in current operating systems for objects that are not accessed frequently or which are not expected to perform quickly. Approaches where protection mechanism are not

firmware supported but are instead simulated at higher levels (e.g. simulating capabilities bymanipulating page tables on hardware that does not have direct support), are possible, but there

are performance implications. Lack of hardware support may not be an issue, however, for systems that choose to use language-based protection.

Page 10: Lovely Profession University

8/8/2019 Lovely Profession University

http://slidepdf.com/reader/full/lovely-profession-university 10/14

10

 

An important kernel design decision is the choice of the abstraction levels where the securitymechanisms and policies should be implemented. Kernel security mechanisms play a critical role

in supporting security at higher levels.

One approach is to use firmware and kernel support for fault tolerance (see above), and build the

security policy for malicious behavior on top of that (adding features such as cryptographymechanisms where necessary), delegating some responsibility to the compiler. Approaches thatdelegate enforcement of security policy to the compiler and/or the application level are often

called language-based security.

The lack of many critical security mechanisms in current mainstream operating systems impedesthe implementation of adequate security policies at the application abstraction level.[28] In fact,

a common misconception in computer security is that any security policy can be implemented inan application regardless of kernel support.

Hardware-based protection or language-based protection

Typical computer systems today use hardware-enforced rules about what programs are allowedto access what data. The processor monitors the execution and stops a program that violates a

rule (e.g., a user process that is about to read or write to kernel memory, and so on). In systemsthat lack support for capabilities, processes are isolated from each other by using separate

address spaces. Calls from user processes into the kernel are regulated by requiring them to useone of the above-described system call methods.

An alternative approach is to use language-based protection. In a language-based protection

system, the kernel will only allow code to execute that has been produced by a trusted languagecompiler. The language may then be designed such that it is impossible for the programmer to

instruct it to do something that will violate a security requirement.

Advantages of this approach include:

� No need for separate address spaces. Switching between address spaces is a slow

operation that causes a great deal of overhead, and a lot of optimization work is currentlyperformed in order to prevent unnecessary switches in current operating systems. Switching is

completely unnecessary in a language-based protection system, as all code can safely operate inthe same address space.

� Flexibility. Any protection scheme that can be designed to be expressed via a

programming language can be implemented using this method. Changes to the protection scheme(e.g. from a hierarchical system to a capability-based one) do not require new hardware.

Disadvantages include:

� Longer application start up time. Applications must be verified when they are started toensure they have been compiled by the correct compiler, or may need recompiling either from

source code or from bytecode.

Page 11: Lovely Profession University

8/8/2019 Lovely Profession University

http://slidepdf.com/reader/full/lovely-profession-university 11/14

11

 

� Inflexible type systems. On traditional systems, applications frequently performoperations that are not type safe. Such operations cannot be permitted in a language-based

protection system, which means that applications may need to be rewritten and may, in somecases, lose performance.

Examples of systems with language-based protection include JX and Microsoft's Singularity.

Process cooperation

Edsger Dijkstra proved that from a logical point of view, atomic lock and unlock operationsoperating on binary semaphores are sufficient primitives to express any functionality of process

cooperation.[33] However this approach is generally held to be lacking in terms of safety andefficiency, whereas a message passing approach is more flexible.

I/O devices management

The idea of a kernel where I/O devices are handled uniformly with other processes, as parallelco-operating processes, was first proposed and implemented by Brinch Hansen (although similar 

ideas were suggested in 1967). In Hansen's description of this, the "common" processes arecalled internal processes, while the I/O devices are called external processes.

Kernel-wide design approaches

Naturally, the above listed tasks and features can be provided in many ways that differ from eachother in design and implementation.

The principle of separation of mechanism and policy is the substantial difference between the

philosophy of micro and monolithic kernels.Here a mechanism is the support that allows theimplementation of many different policies, while a policy is a particular "mode of operation".

For instance, a mechanism may provide for user log-in attempts to call an authorization server todetermine whether access should be granted; a policy may be for the authorization server to

request a password and check it against an encrypted password stored in a database. Because themechanism is generic, the policy could more easily be changed (e.g. by requiring the use of a

security token) than if the mechanism and policy were integrated in the same module.

In minimal microkernel just some very basic policies are included, and its mechanisms allows

what is running on top of the kernel (the remaining part of the operating system and the other applications) to decide which policies to adopt (as memory management, high level process

scheduling, file system management, etc.). A monolithic kernel instead tends to include manypolicies, therefore restricting the rest of the system to rely on them.

Per Brinch Hansen presented cogent arguments in favor of separation of mechanism and

policy.The failure to properly fulfill this separation, is one of the major causes of the lack of substantial innovation in existing operating systems, a problem common in computer 

architecture. The monolithic design is induced by the "kernel mode"/"user mode" architecturalapproach to protection (technically called hierarchical protection domains), which is common in

Page 12: Lovely Profession University

8/8/2019 Lovely Profession University

http://slidepdf.com/reader/full/lovely-profession-university 12/14

12 

 

conventional commercial systems; in fact, every module needing protection is thereforepreferably included into the kernel. This link between monolithic design and "privileged mode"

can be reconducted to the key issue of mechanism-policy separation; in fact the "privilegedmode" architectural approach melts together the protection mechanism with the security policies,

while the major alternative architectural approach, capability-based addressing, clearly

distinguishes between the two, leading naturally to a microkernel design (see Separation of protection and security).

While monolithic kernels execute all of their code in the same address space (kernel space)microkernels try to run most of their services in user space, aiming to improve maintainability

and modularity of the codebase. Most kernels do not fit exactly into one of these categories, butare rather found in between these two designs. These are called hybrid kernels. More exotic

designs such as nanokernels and exokernels are available, but are seldom used for productionsystems. The Xen hypervisor, for example, is an exokernel.

 

IN LINUX:-

On the other hand, one of the costs of this approach is tightly connected with the abstract natureof the interface. The point of abstracting an interface is to remove unnecessary details. This is

good for introductory computing students, but is bad for kernel programmers. In kernelprogramming. performance is very important, coming as a close third after correctness and

maintainability, and sometimes taking precedence over maintainability. Not all code paths in thekernel are performance-critical, but many are, and the development process benefits from the

same data structures being using in both performance critical and less critical paths. So it isessential that data types are not overly abstracted, but that all details of the implementation are

visible so that the programmer can make optimal choices when using them.

So the first principle of data structures in the kernel is not to hide detail. To see how this applies,and to discover further principles from which to extract patterns, we will explore a few of the

more successful data types used in the Linux kernel.

A Micro-kernel for Grid Data Structures

As a consequence of the outcome of algorithm requirements analysis, we can identify a set of 

functional primitives, forming a micro-kernel for grid data structures. This micro-kernel serves

two fundamental purposes: First, it separates basic (atomic) functionality from derived(composite) functionality, thus answering the question ``What belongs into a grid datastructure?'' And second, in the context of generic programming, it serves as a broker layer 

between concrete data representations and generic components like algorithms and other datastructures.

Page 13: Lovely Profession University

8/8/2019 Lovely Profession University

http://slidepdf.com/reader/full/lovely-profession-university 13/14

13 

 

The identification of a small yet sufficient set of basic primitives is an iterative process,depending on an analysis of both typical algorithm requirements and data structure capabilities,

as mentioned in the preceding section.

With respect to the universe of possible operations on grids the micro-kernel is (almost) minimal :

No part of it can be expressed appropriately by other parts in the general case. With respect tothe basic functionality offered by concrete grid data structures it is maximal : A givenrepresentation component in general implements only a part of it. For example, a simple

triangulation data structure which only stores the vertex indices for each cell cannot provideiteration over neighbor cells. On the other hand, we are able to provide a generic implementation

of neighbor iteration, using the part of the micro-kernel supported by the triangulationcomponent. However, this will not be as efficient as a specialized implementation by extending

the triangulation component.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Page 14: Lovely Profession University

8/8/2019 Lovely Profession University

http://slidepdf.com/reader/full/lovely-profession-university 14/14

14 

 

 

REFERENCES:-  

http://lwn.net/Articles/336255/

sunsite.nus.ed u.sg/ LDP / LDP /tlk/node23.html  

http://www.math.tu-cottbus.de/INSTITUT/lsnmwr/papers/tmpw00/node3.html

http://what-is-what.com/what_is/kernel.html