Kernel Chintech Notes3

  • 8/6/2019 Kernel Chintech Notes3

    1/51

    NZ chintechLK - Main Algorithms 1

    MAIN ALGORITHMS


    SIGNALS

    A signal is a short message that may be sent to a process or a group of processes.
    It is one of the oldest methods of inter-process communication.
    Each signal is represented in the system by a name of the form SIGXXXX, e.g. SIGKILL, SIGSTOP, SIGCONT.
    Signals serve two main purposes:
    - to make a process aware that a specific event has occurred;
    - to force a process to execute a signal handler function included in its code.


    SIGNALS

    The kernel uses signals to inform processes about certain events.
    The user uses signals to abort processes or to switch interactive programs to a defined state.
    Processes use signals to synchronize themselves with other processes.


    SIGNALS

    The kernel can send a signal to any process.
    A normal process can send a signal to another process provided at least one of the following holds:
    - the sender has the proper capabilities;
    - the destination process belongs to the same user as the sender (matching UID);
    - the signal is SIGCONT and the destination process is in the same login session as the sending process.
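The permission rules can be probed from user space. The sketch below assumes a POSIX system and uses kill() with signal number 0, which performs the permission check without delivering anything. The helper name can_signal is made up for this illustration.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the caller may signal pid,
 * 0 if the kernel refuses with EPERM, and -1 on any other error
 * (for example ESRCH when no such process exists). */
int can_signal(pid_t pid)
{
    assert(pid > 0);            /* keep the sketch to single processes */
    if (kill(pid, 0) == 0)      /* signal 0: check only, nothing is sent */
        return 1;
    return (errno == EPERM) ? 0 : -1;
}
```

A process may always signal itself, so can_signal(getpid()) is expected to return 1. For an unprivileged user, can_signal(1) would typically return 0.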


    SIGNALS
    Actions performed on delivering a signal
    There are three ways in which a process can respond to a signal:
    - Explicitly ignore the signal.
    - Execute the default action associated with the signal. This action, which is predefined by the kernel, depends on the signal type.
    - Catch the signal by invoking a corresponding signal-handler function.
    The SIGKILL and SIGSTOP signals cannot be ignored, caught or blocked, and their default actions must always be executed.
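From user space, the three responses are selected with sigaction(). The following is a minimal sketch assuming a POSIX system. The names on_usr1 and demo_dispositions are illustrative only, and the choice of SIGUSR1 and SIGUSR2 is arbitrary.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t caught = 0;

static void on_usr1(int sig) { (void)sig; caught = 1; }

/* Returns 0 when every disposition behaved as described on the slide. */
int demo_dispositions(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sigemptyset(&sa.sa_mask);

    sa.sa_handler = SIG_IGN;            /* explicitly ignore SIGUSR2 */
    if (sigaction(SIGUSR2, &sa, NULL) != 0) return -1;
    raise(SIGUSR2);                     /* silently dropped */

    sa.sa_handler = on_usr1;            /* catch SIGUSR1 with a handler */
    if (sigaction(SIGUSR1, &sa, NULL) != 0) return -1;
    raise(SIGUSR1);                     /* handler runs before raise() returns */
    if (!caught) return -2;

    sa.sa_handler = SIG_DFL;            /* back to the kernel's default action */
    if (sigaction(SIGUSR1, &sa, NULL) != 0) return -1;

    sa.sa_handler = on_usr1;            /* SIGKILL cannot be caught */
    if (sigaction(SIGKILL, &sa, NULL) == 0 || errno != EINVAL) return -3;
    return 0;
}
```

The last step shows the special status of SIGKILL: the attempt to install a handler is rejected with EINVAL.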


    SIGNALS

    Signals are sent via the function
        int send_sig_info(int sig, struct siginfo *info, struct task_struct *t);
    where sig is the signal number, info carries information about the sender, and t is a pointer to the destination process. Two special values of info are recognized: if info is 0, the signal was sent by a user mode process; if info is 1, it was sent by the kernel.
    This function sets two components in the task_struct of the destination process:
    - pending contains information about the signals that have been received, the sender of each signal, etc.
    - sigpending is set whenever a signal is handed over to a process.


    SIGNALS

    Signals that have been generated but not yet delivered are called pending signals.
    At any time, only one pending signal of a given type may exist for a process; additional pending signals of the same type sent to the same process are discarded.
    The blocked component in task_struct contains a bitmask of the signals that have been blocked by the process. A blocked signal cannot be delivered to a process until it is unblocked.
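The interplay of blocked and pending signals can be observed from user space. The sketch below assumes Linux behaviour for standard (non-real-time) signals. demo_pending and count_usr1 are names invented for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t deliveries = 0;

static void count_usr1(int sig) { (void)sig; deliveries++; }

/* Blocks SIGUSR1, raises it twice, then unblocks it.
 * Returns the number of handler invocations, or -1 on error. */
int demo_pending(void)
{
    struct sigaction sa;
    sigset_t block, pend;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = count_usr1;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0) return -1;

    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    if (sigprocmask(SIG_BLOCK, &block, NULL) != 0) return -1;

    deliveries = 0;
    raise(SIGUSR1);                 /* becomes pending */
    raise(SIGUSR1);                 /* same type already pending: discarded */

    if (sigpending(&pend) != 0 || !sigismember(&pend, SIGUSR1)) return -1;
    if (deliveries != 0) return -1; /* blocked, so nothing delivered yet */

    /* Removing the block lets the single pending instance through. */
    if (sigprocmask(SIG_UNBLOCK, &block, NULL) != 0) return -1;
    return deliveries;
}
```

Although the signal is raised twice, the handler is expected to run only once, because only one pending instance per signal type is recorded.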


    SIGNALS

    Why are signals pending?
    - Signals can only be delivered to the currently running process. If a process is not executing, any signals sent to it are saved by the kernel until it resumes execution.
    - Signals of a given type may be blocked by a process. In this case the process does not receive the signal until it removes the block.
    - When a process executes a signal handler function, it usually masks the corresponding signal, i.e. it automatically blocks that signal until the handler terminates.


    SIGNALS

    Delivering a Signal

    The kernel checks the sigpending flag of the task structure every time it finishes handling an interrupt or a system call.
    If sigpending is set and the signal is not blocked, the kernel invokes do_signal(), which takes over the actual signal handling.
    do_signal() refers to the sig component of task_struct, which holds information about how the process handles every possible signal.


    SIGNALS

    Amongst other things, sig contains either the address of a routine that will handle the signal or a flag telling Linux that the process wishes either to ignore this signal or to let the kernel handle it.
    If the signal is caught, a user-defined function has to be called; in that case do_signal() calls handle_signal().


    Hardware Interrupts

    Interrupts allow the hardware to communicate with the operating system.
    Interrupts can occur at any time, including when the kernel would rather finish something else it was doing.
    The kernel's goal is therefore to get the interrupt out of the way as soon as possible and to defer as much processing as it can.


    Hardware Interrupts

    Suppose a block of data has appeared on a network line. When the hardware interrupts the kernel, the kernel can mark the presence of the data, give the processor back to whatever was running before, and do the rest of the processing later.
    The activities that the kernel needs to perform are divided into two parts: a top half that the kernel executes right away and a bottom half that is left for later.
    The kernel keeps a queue pointing to all the functions that represent bottom halves waiting to be executed, and pulls them off the queue to execute them at particular points in processing.


    Hardware Interrupts

    Bottom half is implemented using softirqs,

    tasklets or bottom halves.


    The main handling routine for hardware interrupts is do_IRQ():

    unsigned int do_IRQ(struct pt_regs regs)
    {
        int irq;
        struct irqaction *action;

        /* take the irq number from the register */
        irq = regs.orig_eax & 0xff;

        /* find the respective handler */
        action = irq_desc[irq].action;

        /* and execute the actions */
        while (action) {
            action->handler(irq, regs);
            action = action->next;
        }

        /* the actual hardware interrupt is exited here */
        if (softirq_active & softirq_mask)
            do_softirq();

        return 1;
    }


    Software Interrupts

    Whenever a system call is about to return to user space, or a hardware interrupt handler exits, any `software interrupts' which are marked pending (usually by hardware interrupts) are run.
    Much of the real interrupt handling work is done here.
    Tasklets, softirqs and bottom halves all fall into the category of `software interrupts'.


    Software Interrupts

    Early in the transition to SMP, there were only `bottom halves' (BHs), which did not take advantage of multiple CPUs. No matter how many CPUs you have, no two BHs will run at the same time. This hurts performance.
    Softirqs are fully SMP versions of BHs: they can run on as many CPUs at once as required.
    Tasklets are like softirqs, except that any given tasklet will only run on one CPU at any time, although different tasklets can run simultaneously (unlike different BHs).


    Software Interrupts

    Softirq

    Linux 2.4 uses a limited number of softirqs. There are only four kinds:

    enum {
        HI_SOFTIRQ = 0,     /* index 0: handles high-priority tasklets and bottom halves */
        NET_TX_SOFTIRQ,     /* index 1: transmits packets to network cards */
        NET_RX_SOFTIRQ,     /* index 2: receives packets from network cards */
        TASKLET_SOFTIRQ     /* index 3: handles tasklets */
    };


    Software Interrupts

    The main data structure used to represent softirqs is the softirq_vec array, which holds 32 elements of type softirq_action:
        static struct softirq_action softirq_vec[32];
    A softirq handler is registered (initialized) with open_softirq(). It takes three parameters: the softirq index, a pointer to the function to be executed, and a pointer to a data structure that the softirq function may need.
    Softirqs are activated by invoking raise_softirq(), which takes one parameter, the softirq index.
    do_softirq() executes the softirq functions.


    void open_softirq(int nr, void (*action)(struct softirq_action *), void *data);
    void raise_softirq(int nr);
    void do_softirq(void);
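To make the register / raise / run cycle concrete, here is a user-space toy model. It is only a sketch: every toy_ name is invented for this illustration, the pending bitmask is an assumption about how raised softirqs can be tracked, and none of this is the kernel's actual implementation.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NR_SOFTIRQS 32

struct toy_softirq_action {
    void (*action)(struct toy_softirq_action *);
    void *data;
};

static struct toy_softirq_action toy_vec[TOY_NR_SOFTIRQS];
static unsigned long toy_pending;       /* bit n set: softirq n has been raised */

/* Register a handler and its private data under index nr. */
void toy_open_softirq(int nr, void (*action)(struct toy_softirq_action *), void *data)
{
    assert(nr >= 0 && nr < TOY_NR_SOFTIRQS);
    toy_vec[nr].action = action;
    toy_vec[nr].data = data;
}

/* Mark softirq nr as pending; raising it twice has no extra effect. */
void toy_raise_softirq(int nr)
{
    assert(nr >= 0 && nr < TOY_NR_SOFTIRQS);
    toy_pending |= 1UL << nr;
}

/* Run every raised handler once, lowest index first; return how many ran. */
int toy_do_softirq(void)
{
    unsigned long pending = toy_pending;
    int nr, ran = 0;

    toy_pending = 0;
    for (nr = 0; nr < TOY_NR_SOFTIRQS; nr++) {
        if ((pending & (1UL << nr)) && toy_vec[nr].action) {
            toy_vec[nr].action(&toy_vec[nr]);
            ran++;
        }
    }
    return ran;
}

/* Example handler: increments the int that data points to. */
void toy_count_handler(struct toy_softirq_action *a)
{
    ++*(int *)a->data;
}
```

Iterating from index 0 upward mirrors the idea that HI_SOFTIRQ (index 0) is served before the other kinds.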


    Software Interrupts

    Tasklets

    Tasklets differ from softirqs in that a given tasklet cannot be executed by two CPUs at the same time. Different tasklets can be executed concurrently on several CPUs.
    A tasklet is registered with the function tasklet_init(). Calling tasklet_schedule() marks the tasklet for processing and activates the software interrupt TASKLET_SOFTIRQ.


    /* Suppose you are writing a device driver and want to use a tasklet.
       First allocate a new tasklet_struct data structure. */
    struct tasklet_struct *t;

    /* Initialize the tasklet by invoking tasklet_init(). It takes three
       parameters: the address of the tasklet descriptor, the address of
       your tasklet function, and its optional integer argument. */
    void tasklet_init(struct tasklet_struct *t, void (*func)(unsigned long), unsigned long data);

    /* Activate the tasklet by calling tasklet_schedule(). */
    void tasklet_schedule(struct tasklet_struct *t);


    Software Interrupts

    Bottom halves

    Software interrupts and tasklets are new in Linux 2.4, whereas bottom halves have been available for a very long time.
    Bottom halves are globally serialized, i.e. while one bottom half is executing, the other CPUs cannot execute any bottom half, even one of a different type.
    This degrades the performance of the Linux kernel on multiprocessor systems.


    Booting the system

    Once LILO finds the Linux kernel and loads it into memory, execution starts at the entry point start:, which is held in the arch/i386/boot/setup.S file.
    This section contains assembler code responsible for initializing the hardware.
    Once the essential hardware parameters have been established, the CPU is switched into protected mode by setting the protected mode bit in the machine status word.


    Booting the system
    The assembler instruction
        jmpi 0x100000, __KERNEL_CS
    then initiates a jump to the start address of the 32-bit code of the actual operating system kernel, and execution continues from startup_32: in the file arch/i386/kernel/head.S. This code
    - initializes more sections of the hardware (the MMU, the coprocessor, etc.);
    - sets up the environment required for the execution of the first Linux process;
    - sets up the environment required for the execution of the kernel's C functions.


    Booting the system

    Once initialization is complete, the first C function, start_kernel() from init/main.c, is called.
    The start_kernel() function completes the initialization of the Linux kernel. Nearly every kernel component is initialized by this function; for example:
    1. Initialise irqs.
    2. Initialise data required for the scheduler.
    3. Initialise time keeping data.
    4. Initialise the softirq subsystem.
    5. Parse boot command-line options.
    6. Initialise page tables.
    7. If module support was compiled into the kernel, initialise the dynamic module loading facility.


    Booting the system
    The kernel idle thread (process 0) generates a kernel thread for process 1 (commonly called the init process), which executes init(), one of the routines defined in linux/init/main.c:
        kernel_thread(init, NULL, ...)
    After creating process 1, process 0 is only concerned with using up unused computing time. It executes the cpu_idle() function and is selected by the scheduler only when there are no other processes in the TASK_RUNNING state.
    init() carries out the remaining initialization. In particular, do_basic_setup() initializes all drivers for the hardware.


    Booting the system

    static int init()
    {
        do_basic_setup();

    Now an attempt is made to establish a connection with the console and open the file descriptors 0, 1 and 2.

        if (open("/dev/console", O_RDWR, 0) < 0)
            printk("Warning: Unable to open an initial console.\n");

    Then an attempt is made to execute a boot program specified by the user or one of the programs /sbin/init, /etc/init or /bin/init.


    Booting the system

    These programs usually start the background processes running under Linux and make sure that the getty program runs on each connected terminal, so that a user can log on to the system.

        if (execute_command)
            execve(execute_command, argv_init, envp_init);
        execve("/sbin/init", argv_init, envp_init);
        execve("/etc/init", argv_init, envp_init);
        execve("/bin/init", argv_init, envp_init);


    Booting the system

    If none of the programs mentioned above exists, an attempt is made to start a shell, so that the superuser can repair the system. If this is not possible, the system is stopped.

        execve("/bin/sh", argv_init, envp_init);
        panic("No init found");
    }
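The fallback order can be mimicked in user space. The sketch below deliberately swaps execve() for access(), so that it only reports which candidate would be started instead of replacing the calling process. The function name first_runnable is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Returns the index of the first candidate that exists and is executable,
 * or -1 if none is (the point at which the kernel would panic). */
int first_runnable(const char *const candidates[], size_t n)
{
    size_t i;

    assert(candidates != NULL);
    for (i = 0; i < n; i++) {
        if (candidates[i] != NULL && access(candidates[i], X_OK) == 0)
            return (int)i;
    }
    return -1;
}
```

Called with the list "/sbin/init", "/etc/init", "/bin/init", "/bin/sh", it returns the position of the first program present on the system, following the same order of preference as the kernel code.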


    Timer Interrupts

    Important global variables
    - jiffies
        kernel/sched.c: unsigned long volatile jiffies = 0;
        Number of ticks (10 ms each) since the system was started.
    - xtime
        kernel/sched.c: volatile struct timeval xtime;
        The actual (wall-clock) time.
    The timer interrupt routine, do_timer(), updates jiffies and marks the bottom half as active. The bottom half is called later and handles the rest of the work.
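Since one tick lasts 10 ms, a jiffies value converts to wall-clock units with simple arithmetic. The helpers below are a sketch that assumes a tick rate of 100 per second. TOY_HZ and the function names are invented for the example.

```c
#include <assert.h>

#define TOY_HZ 100UL    /* assumed tick rate: 100 ticks per second (10 ms per tick) */

/* Milliseconds represented by j ticks. */
unsigned long toy_jiffies_to_msecs(unsigned long j)
{
    return j * (1000UL / TOY_HZ);
}

/* Whole seconds represented by j ticks. */
unsigned long toy_jiffies_to_secs(unsigned long j)
{
    return j / TOY_HZ;
}

/* Ticks elapsed between two readings; unsigned subtraction stays
 * correct even if the counter wrapped around in between. */
unsigned long toy_elapsed(unsigned long now, unsigned long then)
{
    return now - then;
}
```

For example, 250 ticks correspond to 2500 ms, or 2 full seconds.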


    Timer Interrupts

    void do_timer(struct pt_regs *regs)
    {
        /* update jiffies, which holds the number of ticks elapsed since the system started */
        (*(unsigned long *)&jiffies)++;

        /* check how long the current process has been running */
        update_process_times(user_mode(regs));

        /* activate the TIMER_BH bottom half routine */
        mark_bh(TIMER_BH);
        if (TQ_ACTIVE(tq_timer))
            mark_bh(TQUEUE_BH);
    }


    Timer Interrupts

    Each invocation of the top half of the timer interrupt handler marks the TIMER_BH bottom half as active. As soon as the kernel leaves interrupt mode, timer_bh(), which is associated with TIMER_BH, starts.

    void timer_bh(void){

    //updates the system date and time and computes the system load

    update_times();

    //checks whether timers have expired.

    run_timer_list();

    }


    Timer Interrupts
    The update_times() function is responsible for updating the times.

    static inline void update_times(void)
    {
        unsigned long ticks;

        /* wall_jiffies stores the time of the last update of the xtime variable */
        ticks = jiffies - wall_jiffies;
        if (ticks) {
            wall_jiffies += ticks;
            /* updates the real-time xtime; called when some time has
               passed since the last call of the function */
            update_wall_time(ticks);
        }
        /* counts the processes in the TASK_RUNNING or TASK_UNINTERRUPTIBLE
           state and uses this number to update the system load statistics */
        calc_load(ticks);
    }


    Timer Interrupts

    The function update_process_times() collects data for the scheduler and decides whether it has to be called.

        struct task_struct *p = current;

        update_one_process(p, ticks, user, system, 0);
        if (p->pid) {
            p->counter -= 1;
            if (p->counter <= 0) {
                p->counter = 0;
                p->need_resched = 1;
            }
        }


    Timer Interrupts

    The if condition checks whether the kernel is executing the process with PID 0, i.e. the swapper process. This is an idle process which runs whenever there are no other processes in the TASK_RUNNING state.
    For any other process, the counter component of the task structure is updated. When counter reaches zero, the time slice of the current process has expired and the scheduler is activated.
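The time slice accounting can be modelled with a few lines of ordinary C. This is a sketch only: struct toy_task and toy_tick are made-up stand-ins that keep just the three fields the text talks about.

```c
#include <assert.h>
#include <stddef.h>

struct toy_task {
    int pid;            /* 0 denotes the idle (swapper) process */
    int counter;        /* remaining time slice in ticks */
    int need_resched;   /* set when the scheduler should run */
};

/* Account one timer tick to the running task.
 * Returns 1 if rescheduling has been requested, 0 otherwise. */
int toy_tick(struct toy_task *p)
{
    assert(p != NULL);
    if (p->pid == 0)
        return 0;               /* idle task: no time slice accounting */
    p->counter -= 1;
    if (p->counter <= 0) {
        p->counter = 0;
        p->need_resched = 1;    /* slice used up: ask for the scheduler */
    }
    return p->need_resched;
}
```

A task that starts with counter = 3 survives two ticks and requests rescheduling on the third.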


    Timer Interrupts

    Under Linux, it is possible to limit a process's CPU time consumption. This is done with the setrlimit system call. The timer interrupt checks whether the limit has been exceeded, and the process is either informed via the SIGXCPU signal or aborted by means of the SIGKILL signal.
    Subsequently, the interval timers of the current task must be updated. When these have expired, the task is informed by a corresponding signal.
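From user space, the CPU time limit is set through setrlimit() with the RLIMIT_CPU resource. The following sketch assumes a POSIX system. The wrapper name set_soft_cpu_limit is invented, and clamping the request to the hard limit is a choice made for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <sys/resource.h>

/* Sets the soft CPU limit to `seconds`, clamped to the hard limit.
 * Stores the limit actually in force in *applied.
 * Returns 0 on success and -1 on error. */
int set_soft_cpu_limit(rlim_t seconds, rlim_t *applied)
{
    struct rlimit rl;

    assert(applied != NULL);
    if (getrlimit(RLIMIT_CPU, &rl) != 0)
        return -1;
    if (rl.rlim_max != RLIM_INFINITY && seconds > rl.rlim_max)
        seconds = rl.rlim_max;      /* the soft limit may not exceed the hard limit */
    rl.rlim_cur = seconds;          /* crossing this triggers SIGXCPU */
    if (setrlimit(RLIMIT_CPU, &rl) != 0)
        return -1;
    if (getrlimit(RLIMIT_CPU, &rl) != 0)
        return -1;
    *applied = rl.rlim_cur;
    return 0;
}
```

The soft limit (rlim_cur) is the threshold for SIGXCPU, while the hard limit (rlim_max) is the one whose violation leads to SIGKILL.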


    /* invoked by update_process_times() */
    void update_one_process(p, user, system, cpu)
    {
        p->per_cpu_utime[cpu] += user;
        p->per_cpu_stime[cpu] += system;
        do_process_times(p, user, system);
        do_it_virt(p, user);
        do_it_prof(p);
    }

    void do_process_times(p, user, system)
    {
        psecs = (p->times.tms_utime += user);
        psecs += (p->times.tms_stime += system);
        if (psecs / HZ > p->rlim[RLIMIT_CPU].rlim_cur) {
            /* soft limit exceeded: send SIGXCPU once per second */
            if (!(psecs % HZ))
                send_sig(SIGXCPU, p, 1);
            /* hard limit exceeded: kill the process */
            if (psecs / HZ > p->rlim[RLIMIT_CPU].rlim_max)
                send_sig(SIGKILL, p, 1);
        }
    }


    void do_it_virt(p, user)
    {
        unsigned long it_virt = p->it_virt_value;

        if (it_virt) {
            it_virt -= user;
            if (it_virt == 0) {
                /* timer expired: reload it and notify the task */
                it_virt = p->it_virt_incr;
                send_sig(SIGVTALRM, p, 1);
            }
            p->it_virt_value = it_virt;
        }
    }


    The scheduler

    The scheduler is responsible for allocating the processor to individual processes.
    The main variables in task_struct related to scheduling are:
    policy: This is the scheduling policy that will be applied to this process. There are two types of Linux process, normal and real time. Real time processes have a higher priority than all of the other processes. If there is a real time process ready to run, it will always run first. Real time processes may have two types of policy, round robin and first in first out. In round robin scheduling, each runnable real time process is run in turn. In first in, first out scheduling, each runnable process is run in the order that it is in on the run queue, and that order is never changed.
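User programs see these policies through the POSIX constants SCHED_OTHER (normal), SCHED_FIFO and SCHED_RR. The sketch below assumes a POSIX system providing <sched.h>. The helper names policy_name and rt_priority_span are illustrative.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sched.h>

/* Human-readable description of a scheduling policy constant. */
const char *policy_name(int policy)
{
    switch (policy) {
    case SCHED_OTHER: return "normal (SCHED_OTHER)";
    case SCHED_FIFO:  return "real time, first in first out (SCHED_FIFO)";
    case SCHED_RR:    return "real time, round robin (SCHED_RR)";
    default:          return "unknown";
    }
}

/* Width of the static priority range offered for a policy, or -1 on error. */
int rt_priority_span(int policy)
{
    int lo = sched_get_priority_min(policy);
    int hi = sched_get_priority_max(policy);

    if (lo == -1 || hi == -1)
        return -1;
    assert(hi >= lo);
    return hi - lo;
}
```

The two real time policies offer a range of priorities, whereas the normal policy has a single static priority level, so its span is zero.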


    The scheduler

    priority: This is the priority that the scheduler will give to this process. It is the value used for recalculation when all runnable processes have a counter value of 0. You can alter the priority of a process by means of system calls and the renice command.
    rt_priority: Linux supports real time processes, and these are scheduled to have a higher priority than all of the other non-real time processes in the system. This field allows the scheduler to give each real time process a relative priority. The priority of a real time process can be altered using system calls.


    The scheduler

    counter: This is the amount of time (in jiffies) that this process is allowed to run for. It is set to priority when the process is first run and is decremented on each clock tick.
    The Linux scheduling algorithm is implemented in schedule() (kernel/sched.c).
    The scheduler is run from several places within the kernel:
    - There are system calls which call schedule(), usually indirectly by calling sleep_on().
    - It may also be run at the end of a system call, just before a process is returned to user mode from system mode. The need_resched flag is checked by the ret_from_sys_call() routine. If it is set, the scheduler is called. One reason it might need to run is that the system timer has just set the current process's counter to zero.


    The scheduler

    Each time the scheduler is run it does the following:
    - The scheduler runs the bottom half handlers and processes the scheduler task queue.
    - The process with the highest priority is determined.
    - The new process becomes the current process.


    The scheduler

    Simplified version of the schedule() :

    asmlinkage void schedule(void)

    {

    struct task_struct * prev, * next, *p;

    prev = current;

    prev->need_resched = 0;


    The scheduler

    The software interrupts are processed first, i.e. any bottom half handlers are run now, because they may manipulate information capable of influencing scheduling.

    if (softirq_active(this_cpu) & softirq_mask(this_cpu))

    do_softirq();


    The scheduler

    If schedule() was called because the current process has to wait for an event, it is removed from the run queue. If the scheduling policy of the current process is round robin and its time slice is used up, it is given a new slice and put at the back of the run queue.

        if (!prev->counter && prev->policy == SCHED_RR) {
            prev->counter = prev->priority;
            move_last_runqueue(prev);
        }
        if (prev->state != TASK_RUNNING) {
            del_from_runqueue(prev);
        }


    The scheduler

    Next the scheduler looks through the processes

    on the run queue looking for the most deserving

    process to run. If there are any real time

    processes (those with a real time scheduling

    policy) then those will get a higher weighting than

    ordinary processes. The function goodness()

    calculates the priority for each process.
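A much simplified model of such a weighting function is sketched below. It assumes that a real time process is weighted 1000 plus its rt_priority and that a normal process is weighted by its counter plus its priority. The real goodness() considers further factors, and all toy_ names are invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>

enum toy_policy { TOY_OTHER, TOY_FIFO, TOY_RR };

struct toy_proc {
    enum toy_policy policy;
    int counter;        /* remaining time slice */
    int priority;       /* static priority of a normal process */
    int rt_priority;    /* relative priority among real time processes */
};

/* Simplified weight: the larger the value, the more deserving the process. */
int toy_goodness(const struct toy_proc *p)
{
    assert(p != NULL);
    if (p->policy != TOY_OTHER)
        return 1000 + p->rt_priority;   /* real time always outranks normal */
    if (p->counter == 0)
        return 0;                       /* time slice used up */
    return p->counter + p->priority;
}
```

The offset of 1000 guarantees that any real time process beats every normal process, whose weight stays far below that value.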


    next = idle_task;   /* next process */
    next_p = -1000;     /* and its priority */

    list_for_each(p, &runqueue_head) {
        if (!can_schedule(p))
            continue;
        weight = goodness(p, prev, this_cpu);
        if (weight > next_p) {
            next_p = weight;
            next = p;
        }
    }


    The scheduler

    If next_p is greater than zero, we have found a suitable candidate. If next_p is less than zero, there is no ready-to-run process and we must activate the idle task. In both cases next points to the task to be activated next.
    If next_p is equal to zero, there are ready-to-run processes, but we must recalculate their dynamic priorities (the value of counter). The counter values of all processes are recalculated, and then the scheduler is restarted.


        if (next_p == 0) {
            for_each_task(p) {
                p->counter = (p->counter / 2) + p->priority;
            }
        }

    The task indicated by next is the next to be activated.

        if (prev != next)
            switch_to(prev, next);
    } /* schedule() */
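The recalculation formula has a useful property that a short experiment makes visible. A process that used up its slice simply gets priority ticks again, while a process that keeps being skipped (for example because it sleeps) accumulates a bonus that stays below twice its priority. The helper names below are invented for this sketch.

```c
#include <assert.h>

/* One recalculation step, as in the loop over all tasks. */
int toy_recalc(int counter, int priority)
{
    return counter / 2 + priority;
}

/* Apply the recalculation `rounds` times to a task that never runs in between. */
int toy_recalc_n(int counter, int priority, int rounds)
{
    assert(rounds >= 0 && counter >= 0 && priority >= 0);
    while (rounds-- > 0)
        counter = toy_recalc(counter, priority);
    return counter;
}
```

With priority 20 and an initial counter of 0, successive recalculations give 20, 30, 35, 37, 38, 39 and then stay at 39, just under 2 * 20.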