Designing safety-critical operating systems

By David Kleidermacher, Director of Engineering
E-mail: [email protected]

Mark Griglock, Safety-Critical Engineering Manager
E-mail: [email protected]

Green Hills Software

Whether you are designing a telecom switch, a piece of medical equipment or one of the many complex systems aboard an aircraft, certain critical parts of the application must be able to operate under all conditions. Indeed, given the steadily increasing speed of processors and the economically driven desire to run multiple applications, at varying levels of criticality, on the same processor, the risks continue to grow.

Consider a blood gas analyzer used in an intensive care unit. The analyzer may serve two distinct purposes. First, it monitors the level of oxygen and other gasses in the patient's bloodstream in real time. If any monitored gas reaches a dangerously low or high level, the analyzer should produce an audible alarm or take some more direct, interventionary action. But the device may have a second use, offering a historical display of gas levels for offline analysis. In such a system, data logging, data display and user interface threads may compete with the critical monitoring and alarm threads for use of the processor and other resources.

In order for threads of varying importance to safely coexist in the same system, the OS that manages the processor and other resources must be able to properly partition the software to guarantee resource availability. The key word here is guarantee. Post-design, post-implementation testing cannot be counted on. Safety-critical systems must be safe at all times.

Terminology

The following terms are used in this article:

Thread: A lightweight unit of program execution.

Process: A heavyweight unit consisting primarily of a distinct address space, within which one or more threads execute.

Kernel: The portion of an OS that provides core system services such as scheduling, thread synchronization and inter-process communication.

Memory protection

Fault tolerance begins with memory protection. For many years, microprocessors have included on-chip memory management units (MMU) that enable individual threads of software to run in hardware-protected address spaces. But many commercial RTOS never enable the MMU, even if such hardware is present in the system.

When all of an application's threads share the same memory space, any thread could, intentionally or unintentionally, corrupt the code, data or stack of another thread. A misbehaved thread could even corrupt the kernel's own code or internal data structures. It is easy to see how a single errant pointer in one thread could bring down the entire system, or at least cause it to behave unexpectedly.

For safety and reliability, a process-based RTOS is preferable. To create processes with individual address spaces, the RTOS need only create some RAM-based data structures and enable the MMU to enforce the protections described therein. The basic idea is that a new set of logical addresses is switched in at each context switch.

The MMU maps a logical address used during an instruction fetch or a data read or write to a physical address in memory through the current mapping. It also flags attempts to access illegal logical addresses, which have not been mapped to any physical address. The cost of processes is the overhead inherent in memory access through a look-up table. But the payoff is huge. Careless or malicious corruption across process boundaries is rendered impossible. A bug in a user interface thread cannot corrupt the code or data of a more critical thread. It is truly a wonder that non-memory-protected OS are still used in complex embedded systems where reliability, safety or security are important.

Enabling the MMU has other benefits as well. One big advantage stems from the ability to selectively map and unmap pages into a logical address space. Physical memory pages are mapped into the logical space to hold the current process' code; others are mapped for data. Likewise, physical memory pages are mapped in to hold the stacks of threads that are part of the process. An RTOS can easily provide the ability to leave a page's worth of the logical addresses after each thread's stack unmapped. That way, if any thread overflows its assigned stack, a hardware memory protection fault will occur. The kernel will suspend the thread instead of allowing it to corrupt other important memory areas within the address space (like another thread's stack). This adds a level of protection between threads, even within the same address space.

Memory protection, including this kind of stack overflow detection, is often helpful during the development of an application. Programming errors will generate exceptions that are immediately detected and easily traceable to the source code. Without memory protection, bugs can cause subtle corruptions that are very difficult to track down. In fact, since RAM is often located at physical address zero in a flat memory model, even NULL pointer dereferences will go undetected! (Clearly, logical page zero is a good one to add to the unmap list.)
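As a toy illustration of the translation and fault-flagging described above, here is a single-level page table in which logical page zero is left unmapped; the page count and layout are invented for the example:

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define NUM_PAGES  64          /* toy address space: 64 logical pages */
#define UNMAPPED   UINT32_MAX  /* marker for an illegal logical page */

/* One logical-to-physical mapping per process; page 0 stays
 * unmapped so NULL dereferences fault instead of reading RAM. */
typedef struct {
    uint32_t frame[NUM_PAGES]; /* physical frame number, or UNMAPPED */
} addr_space_t;

/* Translate a logical address; returns false to signal a fault. */
bool translate(const addr_space_t *as, uint32_t laddr, uint32_t *paddr) {
    uint32_t page = laddr >> PAGE_SHIFT;
    uint32_t off  = laddr & ((1u << PAGE_SHIFT) - 1);
    if (page >= NUM_PAGES || as->frame[page] == UNMAPPED)
        return false;          /* MMU flags the illegal access */
    *paddr = (as->frame[page] << PAGE_SHIFT) | off;
    return true;
}
```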

System call

Another issue is that the kernel must protect itself against improper system calls. Many kernels return the actual pointer to a newly created kernel object, such as a semaphore, to the thread that created it, as a handle. When that pointer is passed back to the kernel in subsequent system calls, it may be dereferenced directly. But what if the thread uses that pointer to modify the kernel object directly, or simply overwrites its handle with a pointer to some other memory? The results may be disastrous.

A bad system call should never be able to take down the kernel. An RTOS should, therefore, employ opaque handles for kernel objects. It should also validate the parameters to all system calls.
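One way an RTOS might implement opaque handles is as an index into a kernel-private object table, validated on every call. The semaphore pool below is a minimal sketch of that idea, not any particular kernel's API:

```c
#include <stdbool.h>

#define MAX_SEMS 16

typedef struct { int count; bool in_use; } ksem_t;

static ksem_t sem_pool[MAX_SEMS];   /* kernel-private; never exposed */

/* Create a semaphore and return an opaque integer handle, never a
 * pointer into kernel memory. Returns -1 if the pool is exhausted. */
int ksem_create(int initial) {
    for (int h = 0; h < MAX_SEMS; h++) {
        if (!sem_pool[h].in_use) {
            sem_pool[h].in_use = true;
            sem_pool[h].count = initial;
            return h;
        }
    }
    return -1;
}

/* Every system call validates the handle before touching the object;
 * a garbage handle is rejected instead of crashing the kernel. */
bool ksem_post(int h) {
    if (h < 0 || h >= MAX_SEMS || !sem_pool[h].in_use)
        return false;           /* bad system call, kernel unharmed */
    sem_pool[h].count++;
    return true;
}
```

Because user code holds only an integer, it has nothing it can dereference or overwrite to reach kernel data structures.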

Fault tolerance and high availability

Even the best software has latent bugs. As applications become more complex, performing more functions for a software-hungry world, the number of bugs in fielded systems will continue to rise. System designers must, therefore, plan for failures and employ fault recovery techniques. Of course, the effect of fault recovery is application-dependent: a user interface can restart itself in the face of a fault; a flight-control system probably cannot.

Figure 1: Redundancy via system heartbeats (two active nodes backed by a redundant node).

One way to do fault recovery is to have a supervisor thread in an address space all its own. When a thread faults (for example, due to a stack overflow), the kernel should provide some mechanism whereby notification can be sent to the supervisor thread. If necessary, the supervisor can then make a system call to close down the faulted thread or the entire process and restart it. The supervisor might also be hooked into a software watchdog setup, whereby thread deadlocks and starvation can be detected as well.

In many critical systems, high availability is assured by employing multiple redundant nodes. In such a system, the kernel running on a redundant node must have the ability to detect a failure in one of the operating nodes. One method is to provide a built-in heartbeat in the interprocessor message-passing mechanism of the RTOS (Figure 1). Upon system startup, a communications channel is opened between the redundant nodes and each of the operating nodes. During normal operation, the redundant nodes continually receive heartbeat messages from the operating nodes. If the heartbeat fails to arrive, the redundant node can take control automatically.
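The failure-detection side of such a heartbeat scheme reduces to a timestamp check on the redundant node; the node count and 500ms timeout below are arbitrary illustrative choices:

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_NODES  2
#define TIMEOUT_MS 500   /* miss threshold chosen by the designer */

typedef struct { uint64_t last_beat_ms; } node_state_t;

static node_state_t nodes[NUM_NODES];

/* Called whenever a heartbeat message arrives from an operating node. */
void heartbeat_received(int node, uint64_t now_ms) {
    nodes[node].last_beat_ms = now_ms;
}

/* Polled periodically on the redundant node: returns true when an
 * operating node has gone silent and failover should begin. */
bool node_failed(int node, uint64_t now_ms) {
    return now_ms - nodes[node].last_beat_ms > TIMEOUT_MS;
}
```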

Mandatory vs. discretionary access control

An example of a discretionary access control is a Unix file: a process or thread can, at its sole discretion, modify the permissions on a file, thereby permitting access to the file by another process in the system. Discretionary access controls are useful for some objects in some systems.

An RTOS that is used in a safety- or security-critical system must be able to go one big step further and provide mandatory access control of critical system objects. For example, consider an aircraft sensor device, access to which is controlled by a flight control program. The system designer must be able to set up the system statically such that the flight control program, and only the flight control program, has access to this device. Another application in the system cannot dynamically request and obtain access to this device. And the flight control program cannot dynamically provide access to the device to any other application in the system. The access control is enforced by the kernel, is not circumventable by application code and is thus mandatory. Mandatory access control provides guarantees. Discretionary access controls are only as effective as the applications using them, and these applications must be assumed to have bugs in them.
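A mandatory policy can be as simple as a constant, kernel-held table fixed at build time, consulted on every device system call. The process and device names below are hypothetical:

```c
#include <stdbool.h>

enum { PID_FLIGHT_CONTROL = 1, PID_USER_INTERFACE = 2 };
enum { DEV_AIRSPEED_SENSOR = 0, DEV_DISPLAY = 1, NUM_DEVS = 2 };

/* Access table fixed at system build time. Because it is const and
 * consulted only inside the kernel, no application can grant itself
 * (or any other process) access at run time: the control is
 * mandatory, not discretionary. */
static const int device_owner[NUM_DEVS] = {
    [DEV_AIRSPEED_SENSOR] = PID_FLIGHT_CONTROL,
    [DEV_DISPLAY]         = PID_USER_INTERFACE,
};

/* Kernel-side check performed on every device system call. */
bool may_access(int pid, int dev) {
    if (dev < 0 || dev >= NUM_DEVS)
        return false;
    return device_owner[dev] == pid;
}
```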

Guaranteed resource availability: Space domain

In safety-critical systems, a critical application cannot, as a result of malicious or careless execution of another application, run out of memory resources. In most RTOS, memory used to hold thread control blocks and other kernel objects comes from a central store.

When a thread creates a new thread, semaphore or other kernel object, the kernel carves off a chunk of memory from this central store to hold the data for the object. A bug in one thread could, therefore, result in a situation where the program creates too many kernel objects and the central store is exhausted (Figure 2a). A more critical thread could fail as a result, perhaps with disastrous effects.

To guarantee that this scenario cannot occur, the RTOS can provide a memory quota system wherein the system designer statically defines how much physical memory each process has (Figure 2b). For example, a user interface process might be provided a maximum of 128KB and a flight control program a maximum of 196KB. If a thread within the user interface process encounters the aforementioned failure scenario, the process may exhaust its own 128KB of memory, but the flight control program and its 196KB of memory are wholly unaffected.

In a safety-critical system, memory should be treated as a hard currency: when a thread wants to create a kernel object, its parent process must provide a portion of its memory quota to satisfy the request. This kind of space domain protection should be part of the RTOS design. Central memory stores and discretionarily assigned limits are insufficient when guarantees are required.
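The hard-currency model might look like the following bookkeeping sketch, reusing the 128KB and 196KB budgets from the example above:

```c
#include <stdbool.h>
#include <stddef.h>

/* Per-process memory quota: every kernel object a process' threads
 * create is charged against that process' own budget, never against
 * a shared central store. */
typedef struct {
    size_t quota;   /* fixed at system design time */
    size_t used;
} proc_mem_t;

/* Charge an allocation to the process; fails only for this process. */
bool quota_charge(proc_mem_t *p, size_t bytes) {
    if (p->used + bytes > p->quota)
        return false;   /* this process is out; others unaffected */
    p->used += bytes;
    return true;
}

/* Return memory to the process' budget when an object is destroyed. */
void quota_release(proc_mem_t *p, size_t bytes) {
    p->used -= bytes;
}
```

A runaway thread can bankrupt only its own process, which is precisely the guarantee a central store cannot make.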

If an RTOS provides a memory quota system, dynamic loading of low-criticality applications can be tolerated; high-criticality applications already running are guaranteed to have the physical memory they will require. In addition, the memory used to hold any new processes should come from the memory quota of the creating process. If this memory comes from a central store, then process creation can fail if a malicious or carelessly written application attempts to create too many new processes. (Most programmers have either mistakenly executed, or at least heard of, a Unix fork bomb, which can easily take down an entire system.) In most safety-critical systems, dynamic process creation will simply not be tolerated at all, and the RTOS should be configurable such that this capability can be removed from the system.

Guaranteed resource availability: Time domain

The vast majority of RTOS employ priority-based, preemptive schedulers. Under this scheme, the highest priority ready thread in the system always gets to use the processor (execute). If multiple threads are at the same highest priority level, they generally share the processor equally, via timeslicing. The problem with this timeslicing (or even run-to-completion) within a given priority level is that there is no provision for guaranteeing processor time for critical threads.

Figure 2: a) Before memory quotas, a central store can be exhausted and a process memory-starved; b) after, each process' memory is guaranteed.

Figure 3: Traditional scheduler versus scheduler using weights; in the traditional scheduler Thread B is starved, while with weights Thread B's CPU resource is guaranteed.

Consider the following scenario: the system includes two threads at the same priority level. Thread A is a non-critical, background thread. Thread B is a critical thread that needs at least 40 percent of the processor time to get its work done. Because Thread A and B are assigned the same priority level, the typical scheduler will timeslice them so that both threads get 50 percent of the processor. At this point, Thread B is able to get its work done. Now suppose Thread A creates a new thread at the same priority level. Consequently, there are three highest priority threads sharing the processor. Suddenly, Thread B is only getting 33 percent of the processor and cannot get its critical work done. For that matter, if the code in Thread A has a bug or virus, it may create dozens or even hundreds of confederate threads, causing Thread B to get a tiny fraction of the runtime.

One solution to this problem is to enable the system designer to inform the scheduler of a thread's maximum weight within the priority level (Figure 3). When a thread creates another equal-priority thread, the creating thread must give up part of its own weight to the new thread. In our previous example, suppose the system designer had assigned weights to Thread A and Thread B such that Thread A has 60 percent of the runtime and Thread B has 40 percent. When Thread A creates the third thread, it must provide part of its own weight, say 30 percent. Now Thread A and the new thread each get 30 percent of the processor time, but critical Thread B's 40 percent remains inviolate. Thread A can create many confederate threads without affecting the ability of Thread B to get its work done; Thread B's processor reservation is thus guaranteed. A scheduler that provides this kind of guaranteed resource availability in addition to the standard scheduling techniques is required in some critical embedded systems, particularly avionics.
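The weight-transfer rule can be modeled in a few lines; this sketch tracks only the weight accounting, not the actual timeslicing:

```c
#include <stdbool.h>

#define MAX_THREADS 8

/* Each thread at a given priority level holds an explicit share of
 * the runtime (in percent). A new thread receives its share from its
 * creator, so the total never grows and no other thread's
 * reservation can shrink. */
typedef struct { int weight; bool live; } thread_t;

static thread_t threads[MAX_THREADS];

/* Spawn a thread funded from the creator's own weight.
 * Returns the new thread's id, or -1 on failure. */
int spawn(int creator, int donated_weight) {
    if (donated_weight <= 0 || threads[creator].weight < donated_weight)
        return -1;
    for (int i = 0; i < MAX_THREADS; i++) {
        if (!threads[i].live) {
            threads[creator].weight -= donated_weight;
            threads[i].weight = donated_weight;
            threads[i].live = true;
            return i;
        }
    }
    return -1;
}
```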

The problem inherent in all thread schedulers is that they are ignorant of the process in which threads reside. Continuing our previous example, suppose that Thread A executes in a user interface process while critical Thread B executes in a flight control process. The two applications are partitioned and protected in the space domain but not in the time domain. Designers of safety-critical systems require the ability to guarantee that the run-time characteristics of the user interface cannot possibly affect the run-time characteristics of the flight control system. Thread schedulers simply cannot make this guarantee.

Consider a situation in which Thread B normally gets all the runtime it needs because it is assigned a higher priority than Thread A or any of the other threads in the user interface. Due to a bug, poor design or improper testing, Thread B may lower its own priority (the ability to do so is available in practically all kernels), causing a thread in the user interface to gain control of the processor. Similarly, Thread A may raise its priority above that of Thread B, with the same effect.

A convenient way to guarantee that the threads in processes of different criticality cannot affect each other is to provide a process-level scheduler. Designers of safety-critical software have noted this requirement for a long time. The process, or partition, scheduling concept is a major part of ARINC Specification 653, an Avionics Application Software Standard Interface.

The ARINC 653 partition scheduler runs partitions, or processes, according to a timeline established by the system designer. Each process is provided one or more windows of execution within the repeating timeline. During each window, all the threads in the other processes are not runnable; only the threads within the currently active process are runnable (and typically are scheduled according to the standard thread scheduling rules). When the flight control application's window is active, its processing resource is guaranteed; a user interface application cannot run and take away processing time from the critical application during this window.

Although not specified in ARINC 653, a prudent addition to the implementation is the concept of a background partition. When there are no runnable threads within the active partition, the partition scheduler should be able to run background threads, if any, in the background partition, instead of idling. An example background thread might be a low-priority diagnostic agent that runs occasionally but does not have hard real-time deadlines.
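A partition timeline with a background fallback might be sketched as follows; the window lengths, partition ids and 50ms major frame are invented for the example and are not taken from the ARINC 653 standard:

```c
#include <stdbool.h>

#define BACKGROUND 99   /* id of the background partition */

typedef struct { int partition; int start_ms, len_ms; } window_t;

/* Repeating major frame fixed by the system designer. */
static const window_t timeline[] = {
    { /* flight control */ 0,  0, 30 },
    { /* user interface */ 1, 30, 20 },
};
#define NUM_WINDOWS    2
#define MAJOR_FRAME_MS 50

/* Which partition may run at time t_ms? If the window's owner has no
 * runnable thread, fall through to the background partition instead
 * of idling. runnable[] says whether each partition has work. */
int active_partition(int t_ms, const bool runnable[]) {
    int t = t_ms % MAJOR_FRAME_MS;
    for (int i = 0; i < NUM_WINDOWS; i++) {
        const window_t *w = &timeline[i];
        if (t >= w->start_ms && t < w->start_ms + w->len_ms)
            return runnable[w->partition] ? w->partition : BACKGROUND;
    }
    return BACKGROUND;
}
```

Note that the lookup cost depends only on the number of windows, never on the number of threads, which is the constant-time property argued for below.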

Attempts have been made to add partition scheduling on top of commercial off-the-shelf OS by selectively halting all the threads in the active partition and then running all the threads in the next partition. Thus, partition switching time is linear in the number of threads in the partitions, an unacceptably poor implementation. The RTOS must implement the partition scheduler within the kernel and ensure that partition switching takes constant time and is as fast as possible.

Schedulability

Meeting hard deadlines is one of the most fundamental requirements of an RTOS and is especially important in safety-critical systems. Depending on the system and the thread, missing a deadline can be a critical fault.

Rate monotonic analysis (RMA) is frequently used by system designers to analyze and predict the timing behavior of systems. In doing so, the system designer relies on the underlying OS to provide fast and temporally deterministic system services. Not only must the designer understand how long it takes to execute the thread's code, but also any overhead associated with the thread must be determined. Overhead typically includes context switch time, the time required to execute kernel system calls, and the overhead of interrupts and interrupt handlers firing and executing.

All RTOS incur the overhead of context switching. Lower context switching time implies lower overhead, more efficient use of available processing resources and increased likelihood of meeting deadlines. An RTOS's context switching code is usually hand-optimized for speed.

Interrupt latency

A typical embedded system has several types of interrupts resulting from the use of various kinds of devices. Some interrupts are higher priority and require a faster response time than others. For example, an interrupt that signals the kernel to read a sensor that is critical to an aircraft's flight control should be handled with the minimum possible latency. On the other hand, a typical timer tick interrupt frequency may be 60Hz or 100Hz. Ten milliseconds is an eternity in hardware terms, so interrupt latency for the timer tick interrupt is not as critical as for most other interrupts.

Most kernels disable all interrupts while manipulating internal data structures during system calls. Interrupts are disabled so that the timer tick interrupt cannot occur (a timer tick may cause a context switch) at a time when internal kernel data structures are being changed. The system's interrupt latency is directly related to the length of the longest critical section in the kernel.

Figure 4: Priority inversion (H, M and L denote high, medium and low priority threads).

Figure 5: Priority inheritance.

In effect, most kernels increase the latency of all interrupts just to avoid a low priority timer interrupt. A better solution is to never disable interrupts in kernel system calls and instead to postpone the handling of an intervening timer tick until the system call completes. This strategy depends on all kernel system calls being short (or at least restartable when they are not short), so that scheduling events can preempt the completion of the system call. Therefore, the time to get back to the scheduler may vary by a few instructions (insignificant for a 60Hz scheduler) but will always be short and bounded. It is much more difficult to engineer a kernel that has preemptible system calls in this manner, which is why most kernels do not do it this way.
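The deferred-tick strategy boils down to a pending flag checked on system call exit, as in this sketch; a real kernel would additionally have to keep the calls themselves short or restartable:

```c
#include <stdbool.h>

/* Instead of disabling interrupts across kernel critical sections,
 * the tick handler records that a tick arrived while a system call
 * was in progress; the scheduler then runs the moment the call
 * finishes. Interrupt latency no longer depends on the length of
 * the kernel's longest critical section. */
static volatile bool in_syscall;
static volatile bool tick_pending;
static int sched_runs;          /* counts scheduler invocations */

void timer_tick_isr(void) {
    if (in_syscall)
        tick_pending = true;    /* defer: kernel data is inconsistent */
    else
        sched_runs++;           /* safe to reschedule immediately */
}

void syscall_exit(void) {
    in_syscall = false;
    if (tick_pending) {         /* handle the postponed tick now */
        tick_pending = false;
        sched_runs++;
    }
}
```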

Bounded execution times

To allow computation of the overhead of system calls that a thread will execute while doing its work, an RTOS should provide bounded execution times for all such calls. Two major problems involve the timing of message transfers and the timing of mutex take operations.

A thread may spend time performing a variety of activities. Of course, its primary activity is executing code. Other activities include sending and receiving messages. Message transfer times vary with the size of the data. How can the system designer account for this time? The RTOS can provide a capability for controlling whether transfer times are attributed to the sending thread, to the receiving thread or shared between them. Indeed, the kernel's scheduler should treat all activities, not just the primary activity, as prioritized units of execution so that the system designer can properly control and account for them.

Priority inversion

Priority inversion has long been the bane of system designers attempting to perform rate monotonic analysis, since RMA depends on higher priority threads running before lower priority threads. Priority inversion occurs when a high priority thread is unable to run because a mutex (or binary semaphore) it attempts to obtain is owned by a low priority thread, but the low priority thread is unable to execute and release the mutex because a medium priority thread is also runnable (Figure 4). The most common RTOS solution to the priority inversion problem is to support the priority inheritance protocol.

A mutex that supports priority inheritance works as follows: if a high priority thread attempts to take a mutex already owned by a low priority thread, the kernel automatically elevates the low priority thread to the priority of the high priority thread. Once the low priority thread releases the mutex, its priority will be returned to normal and the high priority thread will run. The dynamic priority elevation prevents a medium priority thread from running while the high priority thread is waiting; priority inversion is avoided (Figure 5). In this example, the critical section execution time (the time the low priority thread holds the mutex) is added to the overhead of the high priority thread.
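Stripped of real scheduling, the inheritance rule is just a priority transfer on contention, as this toy model (not any kernel's actual API) shows:

```c
#include <stddef.h>

/* Toy model of a priority inheritance mutex. Threads are just
 * (base priority, effective priority) pairs; larger is higher. */
typedef struct { int base, eff; } pi_thread_t;
typedef struct { pi_thread_t *owner; } pi_mutex_t;

/* Take: if the mutex is owned by a lower priority thread, boost the
 * owner to the requester's priority so no medium priority thread can
 * preempt it while the requester waits. Returns 1 when acquired,
 * 0 when the requester must block. */
int pi_take(pi_mutex_t *m, pi_thread_t *t) {
    if (m->owner == NULL) { m->owner = t; return 1; }
    if (m->owner->eff < t->eff)
        m->owner->eff = t->eff;      /* priority inheritance */
    return 0;
}

void pi_release(pi_mutex_t *m) {
    m->owner->eff = m->owner->base;  /* revert to normal priority */
    m->owner = NULL;
}
```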

A weakness of the priority inheritance protocol is that it does not prevent chained blocking. Suppose a medium priority thread attempts to take a mutex owned by a low priority thread, but while the low priority thread's priority is elevated to medium by priority inheritance, a high priority thread becomes runnable and attempts to take another mutex already owned by the medium priority thread. The medium priority thread's priority is increased to high, but the high priority thread now must wait for both the low priority thread and the medium priority thread to complete before it can run again.

The chain of blocking critical sections can extend to include the critical sections of any threads that might access the same mutex. Not only does this make it much more difficult for the system designer to compute overhead, but since the system designer must compute the worst-case overhead, the chained blocking phenomenon may result in a much less efficient system (Figure 6). These blocking factors are added into the computation time for tasks in the RMA analysis, potentially rendering the system unschedulable. This may force the designer to resort to a faster CPU or to remove functionality from the system.

A solution called the priority ceiling protocol not only solves the priority inversion problem but also prevents chained blocking (Figure 7). In one implementation scheme (called the highest locker), each semaphore has an associated priority, which is assigned by the system designer to be the priority of the highest priority thread that might ever try to acquire that object. When a thread takes such a semaphore, it is immediately elevated to the priority of the semaphore. When the semaphore is released, the thread reverts to its original priority. Because of this priority elevation, no other threads that might contend for the same semaphore can run until the semaphore is released. It is easy to see how this prevents chained blocking.
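The highest locker rule differs from inheritance in that elevation happens on acquisition, not on contention; again as a toy model rather than a real kernel interface:

```c
#include <stddef.h>

/* Toy model of the highest locker variant of priority ceilings:
 * each semaphore carries the priority of the highest priority thread
 * that might ever take it, and takers are boosted to that ceiling
 * for the duration of the critical section. */
typedef struct { int base, eff; } hl_thread_t;
typedef struct { int ceiling; hl_thread_t *owner; } hl_sem_t;

int hl_take(hl_sem_t *s, hl_thread_t *t) {
    if (s->owner != NULL)
        return 0;             /* with correct ceilings, a potential
                                 rival never even runs to get here */
    s->owner = t;
    if (t->eff < s->ceiling)
        t->eff = s->ceiling;  /* immediate elevation to the ceiling */
    return 1;
}

void hl_release(hl_sem_t *s) {
    s->owner->eff = s->owner->base;  /* revert on release */
    s->owner = NULL;
}
```

Because the holder already runs at the ceiling, no thread that could take a second contested semaphore gets the CPU, so a blocking chain never forms.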

Several RTOS provide support for both priority inheritance and priority ceilings, leaving the decision up to the system designer.

Changing requirements

Many of the RTOS in use today were originally designed for software systems that were smaller, simpler and ran on processors without memory protection hardware. With the ever-increasing complexity of applications in today's embedded systems, fault tolerance and high availability features have become increasingly important. Especially stringent are the requirements for safety-critical systems.

Fault tolerance begins with processes and memory protection but extends to much more, especially the need to guarantee resource availability in the time and space domains. Kernel support for features like the priority ceiling protocol gives safety-critical system designers the capabilities needed to maximize efficiency and guarantee schedulability in their systems.

Figure 6: Chained blocking caused by priority inheritance (threads H, M and L contend for critical sections S1 and S2; the chained blocking adds to H's overhead).

Figure 7: Priority ceilings.