WHITE PAPER Building Intelligent Devices with MontaVista Linux Consumer Electronics Edition Prepared by Bill Weinberg MontaVista Software

Linux for Advanced Consumer Electronics



Table of Contents

Abstract
Introduction
    Consumer Electronics Applications
Embedded Software for Consumer Electronics
    A Platform vs. A Stack
    The MontaVista Software CEE Strategy
    Strategic Microprocessor Architectures
Key Technologies
    Power Management
    Storage and File Systems
    Boot Sequence
    Towards Instant On
    Scaling Linux RAM/ROM Footprint
    Execute in Place
    Optimizing Performance of Consumer Electronics Applications
Conclusion
Revision History

Copyright © 2003, MontaVista Software, Inc. This material may be distributed only subject to the terms and conditions set forth in the Open Publication License, v1.0. See <http://www.opencontent.org/openpub/>.


Abstract

This white paper reviews the key technologies and tools required to build next-generation advanced consumer electronics applications, with particular attention to the features and capabilities in MontaVista Linux Consumer Electronics Edition.


Introduction

Embedded Linux found its first commercial applications in networking infrastructure. Starting in 1999, designers of switches, routers, firewalls, VPNs and access equipment began migrating existing applications and building new ones based on the Open Source GNU/Linux operating system. They chose (and continue to choose) Linux for its stability, robustness, flexibility, and world-class networking capabilities.

Today, manufacturers of consumer electronics equipment are building and deploying intelligent consumer devices based on embedded Linux. Global suppliers such as Sony, Matsushita, NEC, Philips, Motorola, and others choose Linux as their strategic embedded platform for the very same reasons, and also for its scalability, hardware support, graphical user interface options, and the significant savings they can realize in development and deployment.

To address this burgeoning use of embedded Linux in intelligent consumer devices, and to better serve developers of these applications, MontaVista Software introduced MontaVista Linux Consumer Electronics Edition. This white paper describes the genesis and capabilities of that product, and considerations for its use in building the next generation of advanced consumer electronics applications.

Consumer Electronics Applications

MontaVista Linux Consumer Electronics Edition provides significant value for developers of a broad range of application types. In particular, these applications break down as follows:

Mobile and Wireless: Mobile Phones; Wireless Handhelds; Portable Media Players; Portable Game Consoles; Intelligent Remote Controls; Digital Still and Video Cameras

TV and Home Entertainment: Digital / HDTV; PVR / DVR; Set Top Box; Digital Audio Receivers; Musical Instruments; Karaoke; Game Consoles

Automotive Telematics and In-Car Entertainment: Navigation Systems; Vehicle Management; Digital Radio; Digital Media Players; Hands-free Mobile Phones; Wireless Data and Media Sharing

Moreover, both MontaVista Linux Consumer Electronics Edition and MontaVista’s flagship product, MontaVista Linux Professional Edition, form the basis for many home networking, access and small-office applications:

Home Networking and Control: Home Gateway; Broadband Access; Home Automation; Security & Monitoring; Domestic Robotics

Small Office and Imaging: Laser and Inkjet Printers; Fax & Scanners; Intelligent Copiers; Multi-function Peripherals; Network Printers; Routers, Firewalls, VPN; IP Telephony Clients; Audio & Video Conferencing; PBX & Voicemail


Embedded Software for Consumer Electronics

A Platform vs. A Stack

Today it is common to speak of a software “stack” – the entire software content of the delivered final product. For device manufacturers whose main value-add comes from manufacturing or channel (not primary technology), the availability of components that constitute a complete software stack is key. Most device manufacturers, however, add significant value in terms of original design and functionality.

Linux CE Solution | Developer Value-Add | Challenges
100% Finished final product, off-the-shelf | Brand, Manufacturing | Minimal differentiation
80% Shrink-wrapped “solution” stack | Look & Feel, Management Interface | Opportunities for branding offset by identical functionality; underlying components commoditized
60% OS platform, development tools and middleware | Application Stack, Management Interface | Device OEM invests to add value / differentiate
40% Hardware and OS platform | Application Stack, Management Interface, Middleware and Drivers | Maximum opportunity to add value – significant engineering required
10% Bare hardware and ROM monitor only | Entire software stack, including OS, Middleware and Applications | Very large development and code management effort

The MontaVista Software CEE Strategy

MontaVista Software made a very conscious decision to pursue a strategy of providing an OS platform, development tools and key middleware (60% in table above) in MontaVista Linux Consumer Electronics Edition. By offering a “strategic platform” to device manufacturers, MontaVista benefits the maximum number of project types for the greatest number of hardware platforms.

More than MontaVista Linux Professional Edition

MontaVista Linux Professional Edition targets a very broad range of embedded microprocessors and system boards. “Pro”, however, is a “horizontal” product, designed to provide a solid foundation across an array of CPU architectures and to give developers a head start in applications as diverse as networking, instrumentation, control, imaging, and multimedia.

Developers of consumer electronics (CE) devices have already successfully developed and deployed dozens of applications using MontaVista Linux Professional Edition. This first generation of projects typically involved the porting and/or development of application-specific software stacks and the development of application-specific peripheral device drivers.


MontaVista Linux Consumer Electronics Edition builds on the success of these first CE projects with Pro, and helps CE device developers meet their ever-tightening development schedules by enhancing the following areas:

Comprehensive support for application-targeted peripherals on high-integration “SoC” processors like TI OMAP and Intel XScale and the reference boards they populate

Integrated Dynamic Power Management (DPM) in the MontaVista Linux kernel and device drivers, and the addition of power management middleware to optimize battery life in handheld devices

Support for fast launching and footprint optimization with kernel and application execute-in-place (XIP) from advanced flash file systems like CramFS

Tools to measure and optimize interrupt and preemption latency, memory utilization, and boot-up times

Other optimizations to support CE market requirements for streaming media, flash memory, and security

Why not a “solution stack”?

Microsoft and other embedded Linux suppliers offer CE products that they tout as “complete solution stacks” -- aggregations of operating system, drivers, middleware and domain-specific application code that purport to include 95% of the needed software in intelligent consumer devices.

MontaVista Software chose a platform rather than a stack-based approach in direct response to input from top global device manufacturers. These household-name OEMs (and dozens of lesser-known companies) found the MontaVista platform approach superior because:

So-called solution stacks offer a “one size fits all” product look and feel that leaves little or no room for product differentiation and branding – e.g., a Microsoft Phone 2000 device is first and foremost a Microsoft device; the hardware and other CE value-add is effectively commoditized, like a PC.

Pre-packaged stacks are usually monolithic and royalty-bearing; CE device manufacturers need the freedom to choose “best fit” software components and optimize their BOM costs.

Leading device manufacturers continue to invest valuable engineering resources and want to leverage past investments in application and middleware by porting this IP to an embedded Linux platform – they don’t want to throw these investments out and start over, with new, incremental costs.

Solution stack suppliers typically add value in only one particular area (e.g., networking components or tools). As such, the strategic OS platform, device drivers, and other key components are treated as commodity technologies. Examples of this shortcoming are products from several small home gateway stack providers, who include non-standard (forked) Linux kernels, inferior tools, and untested device drivers to support their royalty-bearing VoIP, VPN and other access software.

Instead, MontaVista and its key technology partners in networking, graphics, databases, multimedia and other CE-relevant areas have put together a menu of “best in class” solutions for CE applications. These open solution offerings help large and small developers of home gateways, set top boxes, digital mobile handsets and other applications get to market quickly, with maximum choice and flexibility.


Partners building these solutions together with MontaVista include IPinfusion, Trolltech, Real Networks, OpenWave, Access, Hughes Software, IBM, and PacketVideo.

Strategic Microprocessor Architectures

The next-generation CE devices described in this white paper demand the performance profiles offered only by 32- and 64-bit processors with fully integrated and enabled memory management units (MMUs). While many shipping and legacy consumer electronics devices (PDAs, phones, older STBs, etc.) still employ MMU-less processors [1], designers who opt for the power of embedded Linux also opt for more processing power. Specialized versions of Linux (uClinux) do exist for these older, limited architectures, but top-tier CE manufacturers look past the low price points of these CPUs and focus on total system cost, including development, deployment, and maintenance expenditures down the road.

The key characteristics of these high-integration “systems on a chip” include

RISC Processor Core with integrated, hardware-assisted MMU

Highly coherent, multi-way associative integrated cache and cache controllers

Integrated memory and bus controllers

Rich on-chip peripheral set – serial, networking, USB, display, I2C, IR

Multimedia coprocessors and/or DSP to implement CODECs

Pervasive power management (encompassing CPU core and peripherals)

Scalable clock and voltage for CPU, bus, and peripherals

Processors that deliver this combination of capabilities include those in the ARM architecture (ARM7 and ARM9 derivatives, like the TI OMAP and Intel XScale). To a lesser extent, this level of SoC integration also exists for select PowerPC and MIPS processors.

Approaches to CPU and Board Support

While it is common for embedded OS and tools suppliers to speak of processor support in terms of processor families and “core” types, there often exists a large gap between processor support in the abstract and offering a useful “out of the box” experience to support rapid prototyping, development, integration, and deployment. As such, it is essential that an embedded Linux platform supplier not merely provide generic tools and kernel for a given processor/family, but target actual existing board-level platforms.

Linux Support Package - LSP

MontaVista Software addresses the CPU and board support challenge by building and providing comprehensive, integrated and quality-assured board-level implementations of the Linux kernel and device drivers. Each “Linux Support Package” (LSP) [2] is built especially for the reference and evaluation systems provided by semiconductor manufacturers to facilitate design-in of their processors. For example, for the TI OMAP 5910/1510, MontaVista Linux Consumer Electronics Edition targets and comprehensively supports the TI “Innovator” platform.

[1] Especially 16/32-bit 68000 derivatives like DragonBall and M-Core, and older ARM implementations.

[2] Cf. traditional RTOS terminology “board support package” – BSP.


With MontaVista Linux Consumer Electronics Edition, the pre-built and targeted LSP boots out-of-the-box, and includes pre-integrated support for both the on-chip and board-based peripherals present in the CPU and on the reference board. By comprehensively supporting such reference hardware, MontaVista Linux lets developers rapidly prototype and develop their applications on stable, working hardware. When the hardware design team delivers the application-specific custom platform, it is then much simpler to move the OS and application code to the custom hardware than it would be to start development there.

Differential Debugging

Even with high integration processors like the TI OMAP and Intel XScale, where most or all of the peripherals in the final design will be on-chip, leveraging reference hardware fosters faster development and avoids debugging challenges that arise from early and potentially unstable hardware. When program bugs surface, the reference board paradigm allows both developers and the MontaVista Support team to determine if the fault lies in software or in hardware – if code executes correctly on the reference platform, the bug probably exists in the new hardware; if not, then it is likely to be a software bug.

Key Technologies

The following sections review key technologies required by CE device manufacturers, specifically:

Power Management

Non-volatile Storage

System Boot Time

Memory Footprint

Execute in Place

Power Management

Power Management Scope

Anyone who owns a notebook computer will perceive that their portable device behaves differently when running on battery vs. on AC mains power – the screen dims, the processor clock slows, and the system drifts off to standby or to sleep whenever possible. PDA owners will also contend with screens dimming and devices sleeping after a quiescent period, and cell phone users will have noted that after dialing, backlight and key-lights extinguish. Behind these visible behaviors is a mix of hardware and software technology and policy.

Gross behaviors like full throttle, stand-by and sleep leverage native CPU capabilities to reduce operational voltage and/or clock frequency to save power. What most device users don’t perceive is that, in addition to wholesale changes among system-wide states, power management can also be incremental and can occur many hundreds of times a second.

Any Dynamic Power Management (DPM) strategy begins with scaling the operating voltage and frequency of the one or more processor cores present in a portable device – high integration PowerPC, ARM and x86-based systems often feature a DSP or intelligent base band processor. Indeed, CPUs such as the TI OMAP™ processor family, the recently announced IBM® PowerPC™ 405LP and others offer dynamic scaling of core voltage and frequency.


Modern embedded processors, however, are so power-efficient that the CPU is not always the single greatest energy consumer – CPU power consumption typically accounts for 20-25% of total system energy budgets. Other energy hogs can include high-performance memories, color displays and radio interfaces. Therefore, a dynamic power management system concerned only with scaling the voltage and frequency of the processor core may be of limited use.

A truly useful dynamic power management scheme will support rapid scaling of a variety of voltages and clocks, in concert with and/or independently of the CPU core operation.
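A quick back-of-the-envelope calculation shows why core-only scaling has limited reach. The sketch below uses the 20-25% CPU share quoted above; the 50% reduction in CPU energy is an assumed figure chosen purely for illustration, and the function name is hypothetical.

```python
def system_savings(cpu_share, cpu_reduction):
    """Fraction of total system energy saved when only the CPU is scaled.

    cpu_share     -- CPU fraction of the system energy budget (0..1)
    cpu_reduction -- fraction by which CPU energy is cut (0..1), assumed
    """
    return cpu_share * cpu_reduction


# CPU shares from the text (20% and 25%); the 50% cut is a hypothetical value.
for share in (0.20, 0.25):
    saved = system_savings(share, 0.50)
    print(f"CPU share {share:.0%}: system-wide saving {saved:.1%}")
```

Under these assumptions, even halving CPU energy trims the system budget by only 10-12.5%, which is why the remaining components also need to be managed.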

Component | Power Management Opportunities
CPU Core | Clock frequency and voltage; sleep/halt
DSP Core | Clock frequency and voltage, disable
RAM | Bus clock, DRAM refresh rate, burst modes, enable/disable
Flash | Access modes, write enable/disable, use of shadow RAM
System Bus | Vary voltage and frequency
LCD Display | Change scan frequency, turn off/dim backlight, power-down
Audio Output | Scale volume, disable output stage
Base band/RF | Cycle power, disable RF output stage

Voltage and Frequency Scaling

Note that while the processors named above, and others, allow for dynamic, almost independent scaling of frequency and voltage, in practice the two must be scaled together. As it turns out, both frequency and voltage present interesting challenges in their scaling functions.

Challenges to Frequency Scaling

The range of allowable clock frequencies is dependent on the particular manufacturing process employed in building a CPU, while the granularity of scaling depends on the choice of scaling method. While it is possible to build a continuously scalable clock (using a PLL or equivalent), it is usually more practical to supply a high frequency clock as an input to the processor package and to divide that frequency down to a suitable clock using counters/dividers. For example, to establish clocks at 50, 100, 133, and 200 MHz, a design could divide a 400 MHz input clock by 8, 4, 3, and 2 respectively. It is not always practical, however, to start with a high enough frequency input clock to arrive at all desired sub-frequencies or a continuous range of frequencies, so stepwise clock frequency selection is the more common practice.
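The divider arithmetic in the example above can be checked with a few lines of code. This is a minimal sketch; the helper names, the divisor limit, and the tolerance are illustrative assumptions rather than properties of any particular processor.

```python
def divided_clocks(input_mhz, divisors):
    """Return the clock (in MHz) produced by each integer divisor."""
    return {d: input_mhz / d for d in divisors}


# Input clock and divisors taken from the example in the text.
clocks = divided_clocks(400, [8, 4, 3, 2])
for divisor, mhz in clocks.items():
    print(f"400 MHz / {divisor} = {mhz:.1f} MHz")


def reachable(input_mhz, target_mhz, max_divisor=16, tolerance_mhz=0.5):
    """True if some integer divisor lands within tolerance of the target.

    max_divisor and tolerance_mhz are illustrative assumptions.
    """
    return any(abs(input_mhz / d - target_mhz) <= tolerance_mhz
               for d in range(1, max_divisor + 1))
```

Dividing by 3 gives roughly 133.3 MHz, which the text rounds to 133. A call such as `reachable(400, 150)` returns `False`, illustrating why a single input clock cannot supply every desired frequency.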


Challenges to Voltage Scaling

Energy savings from clock scaling pale in comparison to savings from voltage reduction – energy usage typically increases as a quadratic function of voltage in a circuit, so low voltage operation yields very high benefit.

At any given clock frequency, however, there exists a range of acceptable voltages needed to drive processor circuits to “saturation” (that is, to a full logical “1”). Voltages below the “floor” of this range result in unreliable operation. This minimum voltage limits the power savings available at a given frequency and further argues for pre-determining a fixed set of clocks and voltages for CPU operations.

The following table illustrates this pairing of frequency and operating voltage for a typical embedded processor core:

Figure 1. – Relationship between CPU clock frequency and power supply voltage
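The quadratic dependence on voltage can be made concrete with the common first-order CMOS approximation that dynamic power is proportional to V² · f. In the sketch below, the operating points are hypothetical values chosen for illustration; they are not taken from Figure 1 or from any data sheet.

```python
# Hypothetical (frequency MHz, core voltage V) operating points, for illustration only.
OPERATING_POINTS = [(50, 1.0), (100, 1.1), (133, 1.3), (200, 1.5)]


def relative_power(freq_mhz, volts, ref_freq_mhz, ref_volts):
    """Dynamic power relative to a reference point, using P ~ V^2 * f."""
    return (volts / ref_volts) ** 2 * (freq_mhz / ref_freq_mhz)


ref_f, ref_v = OPERATING_POINTS[-1]          # fastest point is the reference
for f, v in OPERATING_POINTS:
    both = relative_power(f, v, ref_f, ref_v)
    freq_only = relative_power(f, ref_v, ref_f, ref_v)
    print(f"{f:>3} MHz @ {v:.1f} V: {both:.2f} of peak power "
          f"(frequency scaling alone: {freq_only:.2f})")
```

Comparing the two columns for each operating point shows how much of the saving comes from the voltage reduction rather than from the slower clock.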

Architecture

Coming from the universe of “white box” PCs and notebooks are two existing schemes for power management: the legacy Advanced Power Management (APM) scheme (still used in many Linux-based portable devices but phased out in Microsoft-based notebooks and hand-helds) and ACPI (Advanced Configuration and Power Interface), the current standard backed by Intel, Toshiba and others. Systems like ACPI are preferred for COTS (commercial off-the-shelf) hardware like PCs, notebooks, servers, and even blades for communications equipment, but present strong dependencies on the prevailing x86/IA-32 BIOS architecture.


Embedded systems usually have no BIOS (in the PC/AT sense) and usually do not have the luxury of machine abstractions to insulate the operating system from low-level device and power management activities. In embedded Linux as in other OSes that target battery-powered applications, power management activities thereby require specific intervention on the part of the OS kernel and device drivers. It is important to note, however, that while low-level implementation of dynamic power management is resident in the OS kernel, power management strategies and policies can and do emanate from middleware and user application code.

[Diagram labels: User Programs (Generic Application, DPM-Aware Application, Application in DPM Wrapper); Middleware (Policy Manager); System Software (Linux Kernel, DPM Subsystem, Drivers)]

Figure 2 – Power Management and the Embedded Linux Software Stack

Interfaces and APIs

Ideally, a power management system would be almost completely transparent to as many levels of the software stack as possible. Indeed, this was the path followed by Transmeta in its Crusoe architecture and has been the goal of existing BIOS-based power management schemes. However, developers with experience in building handheld devices will testify to the fact that some degree of explicit participation is required across the system, as follows:

Kernel Interfaces

In the DPM architecture for Linux, the DPM subsystem within the kernel maintains the power state of the system and ties together the various power-managed elements of a DPM system. Relatively few, if any, other parts of the kernel need to interact with DPM directly, however; rather, DPM is best thought of as a service provider to drivers, middleware and applications.

Driver Interfaces

DPM-enabled device drivers are more “stateful” than default drivers: they are driven through various states by external events and through callbacks from the kernel DPM subsystem to reflect/follow operational policies. Driver APIs also allow drivers to register the basic operational characteristics of the devices they interface/manage, to allow for finer-grained policy decisions.


User-program APIs

User programs (applications) will fall into one of three categories:

Power Management Aware

Legacy Applications in PM-aware “wrappers”

Legacy Applications with no power management

Power-management-aware applications can leverage the APIs available from a Policy Manager to establish their base constraints and/or to force changes in power management policy to match their execution requirements. Legacy applications without explicit power management capabilities can be “wrapped” in code and/or patched to achieve comparable effects, but can also be left to run with default behaviors, dependent upon default policy management of a wider scope.

The actual mechanisms under embedded Linux DPM include APIs like dpm_set_os() [kernel], assert_constraint(), remove_constraint() and set_operating_state() [kernel and driver], set_policy() and set_task_state() [user level via system calls] and the /proc interface.
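The general idea of constraint-driven power management can be sketched with a toy model. The code below is not the MontaVista DPM API: the class, its method signatures, the notion of a constraint as a minimum CPU frequency, and all values are assumptions made for illustration. Only the names assert_constraint and remove_constraint are borrowed from the list above.

```python
class ToyDpm:
    """Toy stand-in for a DPM subsystem; NOT the MontaVista API."""

    def __init__(self, operating_points):
        # operating_points: name -> CPU frequency in MHz (hypothetical values)
        self.operating_points = dict(operating_points)
        self.constraints = {}            # owner -> minimum frequency in MHz

    def assert_constraint(self, owner, min_mhz):
        """Record that `owner` needs at least `min_mhz` while it is active."""
        self.constraints[owner] = min_mhz

    def remove_constraint(self, owner):
        """Drop the constraint previously asserted by `owner`, if any."""
        self.constraints.pop(owner, None)

    def select_operating_point(self):
        """Pick the slowest operating point that satisfies every constraint."""
        floor = max(self.constraints.values(), default=0)
        allowed = {n: f for n, f in self.operating_points.items() if f >= floor}
        if not allowed:
            raise RuntimeError("no operating point satisfies all constraints")
        return min(allowed, key=allowed.get)


dpm = ToyDpm({"idle": 50, "normal": 100, "video": 200})
print(dpm.select_operating_point())          # no constraints asserted
dpm.assert_constraint("lcd_driver", 100)     # hypothetical driver requirement
print(dpm.select_operating_point())
dpm.remove_constraint("lcd_driver")
print(dpm.select_operating_point())
```

In this model the policy is simply "run as slowly as the active constraints allow"; a real policy manager would weigh additional inputs such as battery level and task state.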

Real-time Impact of DPM

Until recently, scaling CPU voltage and frequency presented significant challenges to real-time performance. The instability introduced by changes in either parameter, and the accompanying time needed to “relock” phase-locked loops and other dynamic clock mechanisms, introduced long latencies (sometimes many milliseconds) during which the CPU could neither perform compute operations nor respond to outside events (interrupts).

[Diagram: timeline of a Slow Clock phase, a Transition, and a Fast Clock phase, with the transition interval marked as Latency]

Figure 3 – Clock Transition Latency

Many modern embedded processors can scale frequencies with latencies measured in a handful of microseconds, and respond to changing voltages with a latency measured in tens of microseconds, all without interrupting system operations, allowing for much more aggressive and fine-grained policies. For example, voltage and frequency can be reduced between frames of MPEG video or IP-based voice packets.

Note, however, that while the CPUs supported by MontaVista Linux Consumer Electronics Edition may themselves support rapid scaling of voltage and frequency, not all reference platforms offer clock generation circuits and power supplies that match this agility. One reference board reportedly required up to 200 milliseconds to change operating voltage.
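Whether scaling between frames is practical comes down to comparing transition latency against the idle gap in each frame. The sketch below assumes 30 frames per second and a 20 ms decode time per frame, both hypothetical, and uses a simple assumed rule that a scale-down plus a scale-up must fit inside the idle gap.

```python
def scaling_worthwhile(frame_period_ms, busy_ms, transition_ms):
    """True if a down-and-up transition pair fits in the idle gap of one frame.

    All three arguments are in milliseconds.
    """
    idle_ms = frame_period_ms - busy_ms
    return 2 * transition_ms < idle_ms


FRAME_PERIOD_MS = 1000 / 30      # assumed 30 frames per second
BUSY_MS = 20.0                   # assumed decode time per frame (hypothetical)

# Tens of microseconds (agile CPU) versus the 200 ms board cited in the text.
for label, transition_ms in (("agile CPU, 0.05 ms", 0.05),
                             ("slow reference board, 200 ms", 200.0)):
    ok = scaling_worthwhile(FRAME_PERIOD_MS, BUSY_MS, transition_ms)
    print(f"{label}: scale between frames? {ok}")
```

With these numbers the idle gap is about 13 ms, so microsecond-class transitions fit easily while a 200 ms voltage change spans several frames.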

A more general challenge to real-time performance is that of response to interrupts during deep sleep modes. While most on-chip peripherals may be programmed to “wake up” the system upon receipt of an interrupt, developers must carefully specify policies to enable (selective) device-based wake-up and account for system-wide latencies and storage classes to ensure timely execution of interrupt handlers and/or user-space responses to events (preemption latency).


Storage and File Systems

Streaming-Optimized File Systems

Traditionally, file systems have been optimized for fast sequential access (as with magnetic tape) and flexible random access (as with rotating media, and more recently with flash). In optimization for random access, modern operating systems like Linux take into account statistics for real-world file access (tendencies for often-used data records to cluster together in a file) and for the need to rewind/reset file data structures and disk read/write heads quickly. As such, most file systems rely heavily on caching to speed up read and write operations, often with significant RAM usage and system overhead.

Multimedia streams are typically handled with traditional file system mechanisms, yet present usage patterns very different from those of conventionally stored files. That is, while media streams comprise large amounts of data and also benefit from caching, the bulk of the data in a stream is ephemeral – it need not be held in memory for longer than the time needed for audio or video output. As such, streamed data puts undue stress on traditional file systems and can overtax limited physical memory on many types of devices, especially portable devices.

To optimize system response in the presence of multimedia streaming, MontaVista Linux Consumer Electronics Edition includes an optimization to the Linux Virtual File System (VFS), enabled via the O_STREAMING flag used with calls like open(). This flag provides a “hint” to the VFS and to the Virtual Memory Manager (VMM) that allows for process-based pruning of a file’s page cache once the file index passes the pages being read or written (drop-behind). Normally, these pages would be retained in cache for as long as possible to accelerate potential re-reading or re-writing, but with streaming file operation, the data flow is guaranteed to be uni-directional and single-use.

[Diagram: successive snapshots of the page cache during streaming, showing cached pages, look-ahead pages, and pages returned to the free list]

Figure 4 – O_STREAMING, Page Cache, and Look-ahead

Note that O_STREAMING is beneficial in terms of memory usage, yet entirely safe for the integrity of streamed data – it still enables look-ahead, which is essential for stable viewing of multimedia, and does not simply drop all pages.
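The memory effect of drop-behind with look-ahead can be illustrated with a small simulation. This is a conceptual model only, not the kernel implementation: the function name, the page counts, and the rule of releasing a page immediately after it is consumed are assumptions made for illustration.

```python
def peak_resident_pages(total_pages, lookahead, drop_behind):
    """Simulate one sequential pass over a file and return the peak cache size.

    total_pages -- number of pages in the stream
    lookahead   -- pages prefetched ahead of the current read position
    drop_behind -- if True, release each page once the reader has passed it
    """
    cache = set()
    peak = 0
    for pos in range(total_pages):
        # Prefetch the current page plus the look-ahead window.
        for page in range(pos, min(pos + lookahead + 1, total_pages)):
            cache.add(page)
        peak = max(peak, len(cache))
        if drop_behind:
            cache.discard(pos)       # page already consumed; free it
    return peak


# Hypothetical stream: 10,000 pages with a 4-page look-ahead window.
print(peak_resident_pages(10_000, 4, drop_behind=False))
print(peak_resident_pages(10_000, 4, drop_behind=True))
```

Without drop-behind the whole stream accumulates in the cache; with it, the resident set in this model never exceeds the look-ahead window plus the page being read.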


Flash for Boot and Persistent Storage

Before the advent of embedded Linux and the popularity of “peer-level” operating systems in embedded devices, developers typically meant any of three things when they said “Flash File System”:

Flash Boot: Ability to copy contents of flash device to RAM and execute OS and applications there. Leverages field-programmability of flash for patches and upgrades to device firmware, usually by wholesale image replacement.

Block Store: R/W, block-level access to stored parameter sets for saving non-volatile critical data, like user preferences, device calibration, etc. For NOR flash, read is memory mapped, while write operations require a special driver. Typical for small, on-chip SoC flash arrays.

Disk Emulation: Block or file system-level emulation of rotating media, either through a h/w disk interface (IDE or SCSI) or a software layer.

With embedded Linux, all three options are core native capabilities of the platform. Booting from flash is standard on all supported architectures, including popular embedded CPUs like Intel IA-32, ARM cores (Intel StrongARM, Intel XScale, TI OMAP, and others), PowerPC (Motorola and IBM), MIPS, and Hitachi SH. Block Access under Linux is also straightforward, but does require that both read and write operations map the flash device’s physical address into Linux logical address space. Disk emulation is also easily realized, with many native file system options.

Flash File Systems

MontaVista Linux Consumer Electronics Edition (CEE) supports a range of file systems that overlay NOR memory-mapped flash and provide the expected standard I/O operations like open, close, read, write, etc. and that implement standard Linux directory structure, protection and security.

In particular, CEE supports two flash-based file systems, CramFS and JFFS2, with the following characteristics:

| Feature | CramFS | JFFS2 |
|---|---|---|
| Read/Write Operations | Read-Only | Read/Write |
| Write Operations | N/A | Journaled |
| Data Compression | 2:1 (N/A for XIP images) | 1.5-1.7:1 |
| Block Wear-Reduction | N/A | Supported |
| Start-Up | Immediate on Mount | Start-up time scales with blocks in use |
| Execute-in-Place | Enabled with “sticky” bit | N/A |
| Kernel / User Space | Kernel | Kernel |

Figure 5 – CramFS and JFFS2 Features Compared

When to use CramFS vs. JFFS2?

At first glance, a developer might want to use JFFS2 for all persistent storage – it’s the most flexible with read/write capability and very reliable, being a journal-based file system. Upon further consideration, however, many developers elect to use BOTH JFFS2 and CramFS, each for different purposes: They use CramFS for the “base” of their application – the fixed, unchanging application images and default data configurations; they use JFFS2 for data that changes on a regular basis – logging, downloaded images, data caching, etc. They leverage the fact that CramFS supports XIP to launch applications faster, and rely on the fact that read-only CramFS is much more difficult to hack.

A real-world application example from a well-known handset manufacturer uses multiple file systems as follows:

Raw NOR flash: XIP boot kernel image

CramFS: Root file system, core configuration utilities, default configuration data, base PIM stack components, Java Virtual Machine executable and base class storage, fonts and display elements

JFFS2: Optional utilities, carrier-supplied s/w, downloaded programs in Java and binary, user data storage (phone numbers, etc.), ring tones, “desktop” and background images, photos from integrated cameras, etc.

RAMfs: Temporary files, streaming

Boot Sequence

Developers of intelligent consumer devices face stringent requirements for “instant on” from their customers, the public at large. Quick power-up expectations come from our common experience with analogue solid-state appliances like televisions, radios, VCRs, and basic cell phones.

To understand the challenge faced when embedding Linux (or any other embedded OS) in consumer-grade devices, it is instructive to compare the boot-up times for enterprise-type workstations and servers based on Windows and Linux with the more aggressive times required in embedded devices:

Figure 6 – Comparative Boot Times for Enterprise and Embedded Computing

The divergence in required startup times from the desktop to embedded is two orders of magnitude! To understand this gulf between embedded expectations and desktop reality, let’s look at the components that comprise the boot sequence, and consider how to optimize the time spent in each.

Boot Sequence Components

Booting can be broken into the five stages illustrated in Figure 7:

[Figure 7 diagram: the five boot stages in sequence, each annotated with its main activities and the factors that influence its duration.

Power-On

BIOS / Monitor: memory test, h/w init, PCI discovery. Factors: BIOS yes/no, BIOS options, memory size, number/depth of buses.

Kernel Load: decompress, copy to RAM, RAMdisk (N/A if XIP). Factors: kernel RAM execution vs. XIP, memory size, kernel size, file system size.

Kernel Init: time base, page tables, bus discovery, device drivers, file systems. Factors: precalculated vs. h/w calibration, number of pages, number of buses, drivers in kernel vs. modules, file system start-up.

Init: networking daemons, cron, GUI desktop, loadable modules, web services, printing services, system logging, console, Java, user file systems, final application(s). Factors: serial or parallel execution, application RAM execution or XIP, number of executables and modules, daemon/server init times.

Timing labels shown in the figure: “200 MHz CPU < 0.2 secs”, “0-15 secs”, “2-60 secs”, “0-5 secs”, “0-15 secs”.]

Figure 7 – Boot Sequence Stages

Power-On

For most systems, turning on the power supply is effectively instantaneous, but a small finite time must pass from “cold” power on until available voltage stabilizes, usually a matter of a few milliseconds.

BIOS/Monitor

In a PC/AT-type system, a standard BIOS supports hardware initialization, memory testing, bus discovery, and other cold-start procedures. In custom designs, a basic boot monitor will still have to perform some subset of these tasks, and defer others to kernel initialization or post-init processing. In either case, the time spent in this phase can vary greatly, from under 1 second to over 15 seconds. Influencing factors include:

Amount of physical memory and complexity of memory testing, if any

Type and number of devices to initialize

Number and depth of buses (e.g., PCI) to discover, traverse and map

Other timeouts built into the firmware by default (e.g., console checking)

With an off-the-shelf BIOS, you usually have the option of disabling or scaling back memory tests and polled checks; with custom firmware, you have the luxury of deciding whether to perform the checks at all (of course your application may need MORE not LESS stringent power-up testing than a BIOS).

Some memory checking and useful bus discovery is performed (again) later, during kernel initialization, so you might want to scale back on these activities in the firmware initialization phase.

Kernel Load

In a standard Linux desktop system, a boot loader (e.g., GRUB or LILO) locates a compressed kernel image (vmlinuz) in a disk file system and copies it into RAM, decompressing during the copy sequence. Having written the image to RAM, the boot loader transfers execution (jumps) to a predetermined start address to begin kernel initialization. In many embedded systems (and some desktop/server scenarios), the BIOS/monitor must also decompress and copy a RAMdisk-based root file system to RAM.

While this booting phase is simple, it is not always short. The time needed to copy kernel and file system images to RAM increases more or less linearly with the size of the image. For small kernel and file system images, the time spent is trivial; for larger images, it may impact instant-on requirements. For example, if a system requires on average 0.5 microseconds to read, decompress, and write a byte of kernel code to RAM, then a 2 MB image (2,097,152 bytes) will take 1,048,576 microseconds, or just over one second, to load; the same holds for a like-sized file system image. A 10 MB image will take five times as long, contributing just over five seconds to the total boot sequence.
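The load-time estimate above is simple enough to check numerically. The following sketch assumes the 0.5 microseconds-per-byte figure from the example; the function name is made up for illustration, and the per-byte cost on a real board would have to be measured.

```python
MB = 1024 * 1024


def load_time_seconds(image_bytes, microseconds_per_byte=0.5):
    """Estimate the time to read, decompress, and copy an image to RAM.

    Assumes cost grows linearly with image size, as in the example above.
    """
    return image_bytes * microseconds_per_byte / 1_000_000


if __name__ == "__main__":
    for size_mb in (2, 10):
        seconds = load_time_seconds(size_mb * MB)
        print(f"{size_mb} MB image: {seconds:.2f} s")
```

Substituting a measured per-byte cost for the target's flash and decompressor gives a first-order budget for this phase before any optimization work begins.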

For larger images, using Execute In Place (XIP) from flash instead of copying effectively eliminates this entire phase – running directly from flash/ROM eliminates the need to decompress and copy. However, depending on flash memory access speed, XIP may slow down kernel initialization and subsequent execution (see XIP section below).

Kernel Initialization

Understanding all components of Linux kernel initialization would entail understanding large portions of the kernel itself, which is beyond the scope of this document. However, several key kernel initialization activities merit discussion here:

Timing Loops / Delays – Kernels for different architectures need to establish time bases and calibrated operation to match discovered hardware. Time spent in calibration loops of course slows down the initialization process. In many cases, empirically-derived calibration values can be determined by the developer and preset in lieu of calibration code.

Page Table Initialization – In systems with large amounts of physical RAM to manage, the task of initializing the various management structures for that memory can be time-consuming. Consider that for a system with 128 MB of DRAM and a 4 KB page size, 32,768 pages must be mapped and referenced initially in a free page list, and then allocated as needed for use by the kernel, drivers, and for storing page information itself. For systems with both unchanging memory size (no upgrades) and a predictable boot sequence, memory management structures could be pre-initialized instead of set up iteratively; larger page sizes also accelerate this task.

Bus Discovery – Most consumer electronics devices have fixed on-chip interconnects and a single off-chip bus (unlike complex CPCI chassis), but can have peripheral interfaces, like USB or FireWire, with discovery requirements. If establishing peripheral population proves to be time-consuming, consider deferring that activity until after user space init.

Device Driver Initialization – For fixed-function devices, developers often prefer to bind the complete driver set to the kernel at build time, for simplicity. However, statically bound drivers must also be initialized at boot time. Since this process is serialized and since some devices present long start-up latencies, this activity can add significant delays to kernel initialization. You should consider migrating drivers for slow-start devices into loadable modules, and deferring their initialization.

File System Initialization – Many file system types preferred in consumer electronics applications are kernel-based (e.g., CramFS, JFFS2). As such, you don’t have too much choice about when to perform initialization. Luckily, the base initialization of underlying device drivers for flash file systems is fairly trivial – the most burdensome activities for a system like JFFS2 occur at mount time. So, to avoid the need to mount “slow” file systems during kernel initialization, choose your root file system type wisely (RAMfs is a good choice) and defer mounting other systems until after user space init.
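The page count quoted under Page Table Initialization, and the effect of a larger page size on it, can be verified with a few lines of arithmetic. This is only a sketch; the function name and the alternative page sizes are illustrative assumptions, and which page sizes are actually available depends on the CPU architecture.

```python
KB = 1024
MB = 1024 * KB


def page_count(ram_bytes, page_bytes):
    """Number of pages the kernel must set up for a given RAM and page size."""
    if ram_bytes % page_bytes:
        raise ValueError("RAM size must be a whole number of pages")
    return ram_bytes // page_bytes


if __name__ == "__main__":
    ram = 128 * MB
    # 4 KB is the page size used in the text; 16 KB and 64 KB are
    # hypothetical alternatives shown only for comparison.
    for page_kb in (4, 16, 64):
        print(f"{page_kb:>2} KB pages: {page_count(ram, page_kb * KB):,} pages")
```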

User Space Initialization – Init

Once the system has powered up and the kernel has been copied to RAM and initialized, control is passed to the “init” process, with the following steps:

The kernel looks in /sbin for init

init runs the /etc/rc.d/rc.sysinit script

init runs all the scripts for the default run-level (as specified in /etc/inittab and by the contents of the run-level directories rc0.d . . . rc6.d in directory /etc/rc.d/)

init runs /etc/rc.d/rc.local (and your application start-up)

The bulk of the operations visible on the console come from the items specified for normal Linux operation (run level 5) in directory /etc/rc.d/rc5.d. For a typical workstation³ configuration, this directory contains

K04splash_late K04xdm K05atd K05cron K05hwscan K05nscd K06postfix K06splash K07alsasound K07fbset K07rpmconfigcheck K07sshd K11portmap K11splash_early K13pcmcia

K13resmgr K14hotplug K15syslog K16network K20isdn K20random S01isdn S01random S05network S06syslog S07hotplug S08pcmcia S08resmgr S10portmap S10splash_early

S14alsasound S14fbset S14rpmconfigcheck S14sshd S15kbd S15postfix S15splash S16atd S16cron S16hwscan S16nscd S17splash_late S17xdm

You can choose to optimize this init sequence by removing or deferring items from inittab and from the run-level directory (rc5.d).
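To see how the entries in a run-level directory translate into a start-up order, and what deferring some of them looks like, consider the following sketch. It relies on the usual SysV convention that K entries are stopped and S entries started in lexical order; the helper names and the choice of services to defer are hypothetical, and the sample list is a small subset of the directory contents shown above.

```python
import re

# Matches SysV-style run-level entries such as "S14sshd" or "K04xdm".
RC_ENTRY = re.compile(r"^([KS])(\d{2})(.+)$")


def plan_runlevel(entries):
    """Return (stop_order, start_order) service lists for one run-level directory."""
    stop_order, start_order = [], []
    # Lexical sorting puts K entries before S entries and orders each
    # group by its two-digit sequence number.
    for name in sorted(entries):
        match = RC_ENTRY.match(name)
        if match is None:
            continue  # ignore anything that is not a K/S entry
        kind, _sequence, service = match.groups()
        (stop_order if kind == "K" else start_order).append(service)
    return stop_order, start_order


def split_deferred(start_order, deferred):
    """Split the start list into services run at boot and services started later."""
    at_boot = [s for s in start_order if s not in deferred]
    later = [s for s in start_order if s in deferred]
    return at_boot, later


if __name__ == "__main__":
    sample = ["S17xdm", "S05network", "K04xdm", "S14sshd", "S06syslog", "S16cron"]
    stop_order, start_order = plan_runlevel(sample)
    # Hypothetical choice: start sshd and cron only after the main application is up.
    at_boot, later = split_deferred(start_order, deferred={"sshd", "cron"})
    print("stop:", stop_order)
    print("start at boot:", at_boot)
    print("start later:", later)
```

On a real device the deferred services would be launched from a late script or by the application itself; which services can safely wait is a product decision, not something the sketch can determine.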

Towards Instant On

There exist two approaches to enhancing “instant on” from the end user perspective – (1) tuning and reconfiguring the Linux kernel, drivers, and user-space code to optimize the boot sequence, and (2) taking steps to improve the end user’s perception of how long that sequence takes.

Optimizing Start-Up

The first step to improving actual start-up time is to analyze the components of the boot sequence and to measure the contribution of each. MontaVista supplies the System Timing Tool as part of MontaVista Linux Consumer Electronics Edition.

3 In this case, SuSE Personal Edition version 8.2

Figure 8 – System Timing Tool

With the metrics provided by the tool in hand, your team can make informed decisions about kernel configuration, sequencing, pre-initializing and migrating components and functions in or out of the kernel and user space.

Improving the “User Experience” at Boot Time

The requirement for rapid boot, as seen above, can exist in opposition to the myriad tasks that an intelligent device must perform during start-up. If the actual total time cannot be reduced, then a variety of “stagecraft” techniques can be employed to improve the end-user’s perception of start-up time:

Display a splash screen as early as possible in the boot sequence, if possible in the BIOS or early in kernel initialization (like the penguin image in server and desktop Linux consoles)

Design an early splash screen to mimic later functional screens – some developers go so far as to draw passive buttons and other controls that are later drawn over with active versions

Sequence more critical services (like emergency dialing on a handset) to start as early as possible and defer others until boot and user space init have completed

Defer time-consuming operations (like mounting journaling file systems) until after init

Scaling Linux RAM/ROM Footprint

Developers new to Linux, especially those coming from a background of RTOS applications programming, will initially perceive Linux to be much larger than legacy platforms of their acquaintance. Indeed, the minimum footprint for a Linux system (see Figure 9) will dwarf the 50-100K byte profiles of “classic” RTOSes and executives. The reasons for this size differential are several:

Size is as Size does – the small memory footprints of traditional RTOSes represent significantly less functionality than Linux brings to an application. Add TCP/IP, file systems, security, memory protection and a standards-based API to a small RTOS and it will consume as many resources as embedded Linux, if not more.

Linux is a Platform, Not a Services Library – most RTOSes are structured as extended run-time libraries. That is to say that RTOS applications link in just the RTOS services that they actually use. If an application does not use, for example, semaphores, then the implementing code is not included for deployment. Linux, by contrast, is intended to provide a standard repertoire of services to each application as it is deployed today, as it is updated tomorrow, and to new applications that make their way onto a deployed device. While the open source Linux kernel can be scaled and many services and capabilities configured in or out (e.g., SMP support, kernel-based file systems, even TCP/IP), ad hoc “codectomy” can limit your product’s ability to run off-the-shelf s/w and future versions of your application code without kernel redeployment.

Size is Relative – while legacy RTOSes might fit into under 100K, and embedded Linux into under 500K, realize that the minimum deployment footprint of CE.net variants like PocketPC is 24-27 megabytes, larger again by almost fifty times!

Application Profiles

To give developers a sense of resource usage, MontaVista established a series of application profiles that correspond to basic embedded application types:

Minimum Profile: Simple console application over a serial interface that boots and reports memory usage. No networking or login shell.

Basic Profile: TCP/IP-enabled system with telnet daemon and login shell.

Router Profile: Basic Profile plus Zebra routing package. Note that the footprint includes space for routing code and routing tables.

Profile memory usage is broken into two pieces, kernel and file system, and shown at two stages, boot-time and run-time. The kernel image includes only code from the kernel itself, including device drivers (serial and Ethernet herein), RAMdisk file system code (RAMfs), and TCP/IP (where applicable). The file system image includes the requisite root file system and the various daemons and programs used in the profile. The boot-time images are compressed (approximately 2:1) and are expanded fully into RAM for execution and access.

RAMdisks are used in these profiles for simplicity. It is important to note, however, that a RAMdisk image can imply a three-way space usage: once in compressed form, once in expanded form, and up to a complete third time as programs are loaded into RAM for execution. With flash file systems, no decompression to RAM need take place at boot-time; using an XIP-capable flash file system reduces program memory use to a single (uncompressed) image.
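The three-way space usage of a RAMdisk, and the contrast with an XIP-capable flash file system, can be made concrete with a rough model. The sketch below is a deliberate simplification: it treats the whole image as program text, ignores data, stack, and heap, and uses the approximate 2:1 compression ratio mentioned above. The function names and the 3,000K image size are hypothetical.

```python
def ramdisk_usage(image_kb, compression_ratio=2.0, loaded_fraction=1.0):
    """Rough storage and RAM cost of a compressed RAMdisk image.

    loaded_fraction is the share of the image that is loaded again into
    RAM as running programs (1.0 is the worst case described in the text).
    """
    compressed = image_kb / compression_ratio  # copy 1: compressed, in boot storage
    expanded = image_kb                        # copy 2: expanded RAMdisk in RAM
    loaded = image_kb * loaded_fraction        # copy 3: programs loaded for execution
    return {"flash_kb": compressed, "ram_kb": expanded + loaded}


def xip_usage(image_kb):
    """Rough cost of the same content in an XIP-capable flash file system."""
    # A single uncompressed image in flash; program text needs no RAM copy.
    return {"flash_kb": image_kb, "ram_kb": 0}


if __name__ == "__main__":
    image_kb = 3000  # hypothetical uncompressed file system size
    print("RAMdisk:", ramdisk_usage(image_kb))
    print("XIP flash:", xip_usage(image_kb))
```

The model makes the trade visible: the RAMdisk needs less boot storage but can occupy up to twice the image size in RAM, while the XIP layout moves the entire cost into flash.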

[Figure 9 chart: stacked bars showing Kernel and File System sizes on a scale from 0K to 5,000K, each profile shown Compressed and In Use. Minimum Profile: kernel, one app (reports usage). Basic Profile: networking, nfs root, telnet. Router Profile: basic + Zebra router with RIP.]

Figure 9 – Boot-time and Run-time Memory Usage for Three Linux Application Profiles for ARM CPUs

In the simplest profile, the embedded Linux kernel for the ARM architecture occupies less than 250K bytes of boot storage (and expands to just over 500K at run-time). The more feature-rich Basic and Router profiles grow to just under 400K at boot time and execute in just under 1 MB.

The size of the RAMdisk image varies directly with actual program and data content. The larger profiles are larger because the code therein does more. Still, a full routing application can be implemented in around 1.5 MB of flash and will run in approximately 4 MB of system memory.

Footprint Scaling with MontaVista Linux Consumer Electronics Edition

MontaVista Linux CEE provides a number of tools and capabilities for measuring and optimizing memory usage in your application.

Target Configuration Tool

In MontaVista Linux versions 2.1 and 3.0, the Target Configuration Tool (TCT) provides a graphical front end to kernel configuration and supplies information about how kernel configuration options impact total system RAM and flash footprint.

Library Optimizer Tool

This tool examines populated, deployable platform images (file systems) and analyzes dynamic library usage by all applications and utilities being readied for deployment. Based on symbolic information inside executable programs, library modules not needed by actual application code are identified and removed from deployment versions of the libraries, reducing run-time footprints significantly.

TCT and Library Optimizer Tool (LOT) functionality is integrated into the graphical developer environment available in MontaVista Linux version 3.1, including MontaVista Linux Consumer Electronics Edition 3.1.

Figure 10 – Target Configuration Tool showing sizes for selected configurations

System Measurement Tool – Memory Map

The MontaVista Linux Consumer Electronics Edition features various capabilities in the System Measurement tool for analyzing performance and resource utilization. The Memory Map display shows how much system RAM (in pages) is being used at any one time and how much free memory is available.

Figure 11 – System Measurement Tool Memory Map Display

Other Scaling Techniques

Embedded Linux offers developers many types of opportunities to reduce boot-time and run-time memory usage:

Use compressing flash file systems (JFFS2 and CramFS) to save space

Reduce run-time RAM usage by experimenting with the kernel command-line options that limit available memory. To obtain the minimum viable run-time RAM size, step available memory down in 16K increments at boot-time and attempt to boot the system until you reach a value that fails to boot. The smallest value that still booted is the minimum; use a value larger than this minimum to suit your needs, but smaller than the default.

Remove device drivers from the kernel boot image and load them later as modules – this move will save on kernel image size at the expense of (compressed) file system space.

Strip symbolic information from executables, but only after you have thoroughly debugged your application

Carefully audit the contents of /bin, /sbin, /usr/bin, and /etc and remove unneeded code and configuration files

Build your application and selectively rebuild the kernel and libraries with gcc optimizations biased towards size over execution speed (MontaVista defaults to speed over size)
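The step-down procedure for finding the minimum viable RAM size, described in the list above, amounts to a simple linear search. The sketch below assumes the limit is applied with the standard `mem=` kernel command-line parameter; the `boots` callback is hypothetical and stands in for whatever your lab set-up does to boot the target with a given argument and report success.

```python
def find_minimum_ram_kb(start_kb, boots, step_kb=16):
    """Step memory down until boot fails; return the smallest size that booted.

    boots is a caller-supplied function (hypothetical here) that takes a
    memory size in KB, boots the target with it, and returns True on success.
    """
    if not boots(start_kb):
        raise ValueError("system does not boot at the starting size")
    size_kb = start_kb
    while size_kb - step_kb > 0 and boots(size_kb - step_kb):
        size_kb -= step_kb
    return size_kb


def kernel_mem_argument(size_kb):
    """Format a size as a mem= kernel command-line argument."""
    return f"mem={size_kb}K"


if __name__ == "__main__":
    # Stand-in for a real target: pretend it needs at least 3,900 KB to boot.
    def fake_target_boots(size_kb):
        return size_kb >= 3900

    minimum = find_minimum_ram_kb(8192, fake_target_boots)
    print("smallest size that booted:", kernel_mem_argument(minimum))
```

A linear search mirrors the procedure in the text; with a slow boot cycle, a bisection over the same range would reach the answer in fewer attempts, provided boot success is monotonic in memory size.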

Execute in Place

When Linux made its big splash in 1999, both for enterprise and embedded applications, there was a stark divide between traditional embedded systems and workstations and servers. Embedded device software was built as single large program images, containing applications, libraries, and usually an OS (or RTOS), that were copied from ROM to RAM at start-up. Lacking any sort of file system, these devices were simple to program but hard to upgrade. In contrast, workstations and servers based on Linux (or Windows), with resident (disk-based) file systems, copied the OS from disk to RAM at start-up. Disk-based applications, libraries, and even the OS images could be upgraded on disk at will.

When deploying Linux as an embedded OS, one of the first differences embedded developers notice is the presence of a file system, even in ROM-based devices. The Linux/UNIX programming paradigm requires a file system to support invocation or spawning of new programs, or more correctly, child processes. At the very least, a mechanism must exist to resolve file paths for program images in system calls like exec.

For reasons of stability and security, embedded Linux developers early on wanted a pure ROM-based file system capability. Early efforts built ad hoc block-level interfaces underneath the standard EXT2 disk file system to both NAND and NOR flash devices and disk-like subsystems. While this EXT2 work leveraged an open source base, companies like M-Systems built proprietary flash disk systems around NAND flash. While these proprietary solutions were performant and reasonably cost-effective, they suffered from being overly device-specific (lacking abstraction), non-portable, and sometimes conflicting with open source licensing norms.

Embedding Linux with Flash Today

In the last two years, a series of developments have arisen in Open Source (and elsewhere) that ease embedding applications in flash. The first is the development of an abstraction layer within Linux for managing memory devices – the Memory Technology Device, or MTD subsystem abstraction. The aim of the MTD Project (see http://www.linux-mtd.infradead.org/) is to simplify building flash drivers for new devices, by providing a generic interface between the drivers and the upper layers of the system.

The second is the proliferation of flash-optimized file systems with options to support different flash media types, read/write, XIP, and licensing options:

| File System | Description | R/W | XIP | MTD | Wear Leveling | Media Type | Licensing |
|---|---|---|---|---|---|---|---|
| EXT2 | Standard Linux Disk File System | R/W | N | N | N | Disk, NAND or NOR Flash | Open Source |
| JFFS2 | Linux Journaling Flash File System | R/W | N | Y | Y | Any NOR and NAND Flash | Open Source |
| CramFS | Compressing Flash File System | R | Y | Y | N/A | Any NOR Flash | Open Source |
| VFM | Intel Virtual File Manager | R/W | Y | N | Y | Intel NOR Flash StrataFlash | Intel |

Figure 12 – Off-the-shelf Options for Linux Flash File Systems

Other options also exist for entirely proprietary NAND flash subsystems (like M-Systems DiskOnChip), for compatibility with specific media types (like CompactFlash, which just uses the Linux IDE driver family), and for compatibility with Microsoft FAT file systems (MFFS).

The Need for Execute-In-Place (XIP)

In both traditional (RTOS-based) embedded systems and with embedded Linux, the OS and application code both execute from RAM (see diagram above). While boot loaders and file systems exist to host copies of program executables, it has been the exception, not the rule, for code to run from ROM or flash. With linearly-addressed, “flat” RTOS systems, the option always existed for XIP, but was eschewed for performance reasons (fetches from RAM are typically faster than those from flash). With Linux, XIP has faced obstacles arising from file compression, disk-optimized boot loaders, and the complexity of managing paged, virtually addressed “TEXT” (code) pages in the Linux memory management system.

XIP Benefits

Allowing the Linux kernel and applications to execute in place offers distinct advantages to a variety of application types. In general, any embedded application will benefit from XIP, but manufacturers of mobile phones, PDAs, and other consumer electronics devices have the most to gain from XIP:

Faster System Boot / start-up (no need to decompress/copy into RAM as part of booting)

Faster Application Launch (no need to decompress/copy into RAM as part of fork/exec sequence)

Less RAM Needed (trade-off with ROM/Flash)

Lower Power Consumption (flash is more efficient than RAM)

Better security (with kernel or read-only CramFS, since core images are immutable)

Flexibility (can be used in concert with other file systems / technologies)

Figure 13 – RAM-based and XIP Models Compared

Unique XIP Requirements for Applications and Kernel

When considering where and how to use XIP in an embedded Linux application, it is important to understand the differing implications of user program vs. kernel execution from flash memory.

Leveraging CramFS for User Program XIP

Standard Linux (and MontaVista Linux in particular) supports application-level XIP (for user programs) by leveraging the native XIP capabilities in the CramFS file system. CramFS offers read-only operation with high compression capability for both programs and data stored in a CramFS image (up to 2:1 compression).

A deployed CramFS image can of course contain a mix of XIP programs, stored programs, and data. When CramFS images are created, the mkcramfs utility looks at the “sticky bit” for files in the local hosted directory structure, storing executables with that bit set without the usual CramFS compression, and optimized for in-place execution.
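Marking executables for in-place execution therefore comes down to setting the sticky bit on the chosen files in the host-side staging directory before the image is built. The following sketch shows one way to do that from a build script; the helper names are hypothetical, it assumes a Linux development host, and it says nothing about how mkcramfs itself is invoked.

```python
import os
import stat
import tempfile


def mark_for_xip(path):
    """Set the sticky bit on a file, leaving its other permission bits unchanged."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, mode | stat.S_ISVTX)


def is_marked_for_xip(path):
    """Report whether the sticky bit is set on a file."""
    return bool(os.stat(path).st_mode & stat.S_ISVTX)


if __name__ == "__main__":
    # Demonstrate on a throwaway file standing in for a staged executable.
    with tempfile.TemporaryDirectory() as staging:
        program = os.path.join(staging, "player")
        with open(program, "wb") as handle:
            handle.write(b"\x7fELF")
        os.chmod(program, 0o755)
        mark_for_xip(program)
        print("marked for XIP:", is_marked_for_xip(program))
```

Files left without the bit are stored compressed as usual, so the same staging tree can hold both XIP and compressed programs.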

Kernel XIP on Raw Flash Devices

Since the Linux kernel must execute before any file systems are initialized, and itself supports the operation of all file systems, it cannot depend on the XIP capabilities of any file system. Instead, it must be enabled to execute as a single, contiguous executable image mapped from a flash device into Linux logical memory.

With NOR flash, the initial boot operation is very straightforward (although somewhat architecture-dependent). The boot loader needs merely to jump to a known start address, without the usual need to decompress a vmlinuz image. Minor complications arise once the processor Memory Management Unit (MMU) is enabled, causing all physical addresses in the kernel to be offset by a fixed amount in subsequent (logically addressed) accesses and operations. The most immediate concern in this area is the loading and decompression of a RAMdisk or other ROM-based root file system.

XIP Trade-offs

XIP is a powerful capability of MontaVista Linux Consumer Electronics Edition, but one that presents some trade-offs in terms of flash vs. RAM utilization, and of execution speed.

XIP Saves RAM but Requires More Flash

XIP saves RAM by not copying programs and system code from ROM. In doing so, it increases flash/ROM requirements, primarily because XIP images must also be readable in-place, and can't be compressed like "normal" files or kernel boot images. Since CramFS can achieve a 2:1 or better compression ratio, XIP images effectively take up twice the space they would were they compressed and executed from RAM.

At first glance, the trade appears to come down to speed vs. size: the larger, uncompressed XIP kernel and program images boot and launch faster, at the price of needing more flash in a system. Purely cost-conscious applications should then probably eschew XIP, since RAM is significantly cheaper than flash. However, RAM uses more power than flash, limiting battery life in handheld devices or requiring larger (and more expensive) batteries.
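The flash vs. RAM trade can be made concrete with a simplified footprint calculation. The sketch below assumes the 2:1 compression ratio cited above and deliberately ignores writable data, stacks, and demand paging, so the figures are upper-level estimates only. The function name `footprint` is hypothetical.

```python
def footprint(image_kib: float, xip: bool, compression_ratio: float = 2.0):
    """Return (flash_kib, ram_kib) consumed by the code of one image.

    XIP:     stored uncompressed in flash, executed there, no RAM copy.
    non-XIP: stored compressed in flash, expanded into RAM to execute.
    """
    if xip:
        return image_kib, 0.0
    return image_kib / compression_ratio, image_kib


# A 1 MiB kernel image under both schemes:
xip_flash, xip_ram = footprint(1024, xip=True)
std_flash, std_ram = footprint(1024, xip=False)
```

In this simplified model the XIP image needs twice the flash of the compressed image but frees a full megabyte of RAM.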

XIP, Flash Technologies, and Kernel Boot Image Size

The option for execute in place exists only for NOR flash devices. NAND flash is also quite popular for embedded devices, and depending on configuration, can be less expensive than NOR flash for comparable or larger capacities. NAND, however, cannot support XIP because it is architected for block-level access and cannot be connected directly to a processor local bus – it must be accessed like a storage subsystem, through a controller, and interfaced via a block device driver. In some cases, NAND flash is configured to mimic standard IDE devices.

While NOR flash offers the option of direct bus connection, giving the appearance of simple read-only memory, access times for NOR can differ markedly from those for comparably sized RAM. First, basic read cycles from NOR flash are slower than RAM read cycles, often requiring the insertion of wait states in a memory controller. Second, sequential reads from NOR flash traverse the contents of flash cells in sequence, such that for reads out of sequence, the device must traverse all intervening bytes in the flash cell, further slowing read performance. As such, execution in place from NOR flash will almost always be slower than execution from RAM (but will benefit from the existence of an instruction cache).

It is interesting, then, to compare Linux kernel XIP from NOR flash with the time needed to copy a kernel image from NAND flash into RAM and execute it there. While direct execution from (NOR) flash can begin almost instantaneously (no copy into RAM is needed), byte-for-byte XIP can take longer than execution from an image already in RAM.

Thus, the key is to compare total execution time, including any copy into RAM, to determine whether XIP brings an advantage in execution speed.
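One way to frame this comparison is a simple timing model: XIP pays a higher cost on every fetch, while copy-to-RAM pays a one-time copy cost up front. The sketch below is a back-of-the-envelope model only. All timing figures and the function names are assumptions chosen for illustration, not measurements, and the model ignores caches and burst reads.

```python
import math


def total_time_xip_ns(fetches: int, flash_fetch_ns: float) -> float:
    """Total time when every fetch goes to NOR flash."""
    return fetches * flash_fetch_ns


def total_time_ram_ns(fetches: int, ram_fetch_ns: float,
                      image_bytes: int, copy_bytes_per_us: float) -> float:
    """Total time when the image is first copied to RAM, then executed."""
    copy_ns = image_bytes * 1000 / copy_bytes_per_us
    return copy_ns + fetches * ram_fetch_ns


def break_even_fetches(flash_fetch_ns: float, ram_fetch_ns: float,
                       image_bytes: int, copy_bytes_per_us: float) -> float:
    """Number of fetches beyond which copy-to-RAM beats XIP."""
    if flash_fetch_ns <= ram_fetch_ns:
        return math.inf  # flash is not slower, so XIP never loses
    copy_ns = image_bytes * 1000 / copy_bytes_per_us
    return copy_ns / (flash_fetch_ns - ram_fetch_ns)


# Assumed figures: 1 MiB image, 20 bytes/us copy rate,
# 100 ns per flash fetch, 50 ns per RAM fetch.
n = break_even_fetches(100, 50, 1_048_576, 20)
```

With these assumed figures, XIP is ahead for short-lived or rarely executed code, and copy-to-RAM wins once the number of fetches passes the break-even point `n`.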

Execution Speed / Fetch Cycles from Flash vs. RAM

Another trade-off depends on the memory access speed and bus design of the embedded hardware. In some systems, read-only memory is significantly slower than RAM, with longer fetch times and bus cycles. Thus, while an XIP kernel and XIP applications boot and launch faster, they might actually execute more slowly. Such storage medium speed deficits can be greatly reduced or even eliminated in CPU architectures with reasonable amounts of cache memory.
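The effect of cache on this speed deficit can be estimated with the usual weighted-average model of memory access time. The sketch below assumes a single cache level and treats a miss as costing one full access to the backing memory. The latency figures are invented for illustration and the function name is hypothetical.

```python
def effective_access_ns(hit_rate: float, cache_ns: float, memory_ns: float) -> float:
    """Average fetch time for a given cache hit rate (0.0 to 1.0)."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * cache_ns + (1.0 - hit_rate) * memory_ns


# Assumed latencies: 5 ns cache, 100 ns NOR flash, 50 ns RAM.
for hit_rate in (0.0, 0.90, 0.95, 0.99):
    flash = effective_access_ns(hit_rate, 5, 100)
    ram = effective_access_ns(hit_rate, 5, 50)
    print(f"hit rate {hit_rate:.2f}: flash {flash:.2f} ns, RAM {ram:.2f} ns")
```

In this model the absolute gap between flash and RAM shrinks in proportion to the miss rate, which is why a well-sized instruction cache can hide much of the flash penalty.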

XIP and Read/Write File Systems

NOR flash, while offering finer granularity for write operations (and direct memory mapping needed for XIP), suffers from lengthy multi-byte write and erase cycle times (up to many milliseconds, depending on the amount of data). During write cycles, the contents of NOR flash devices cannot be read, even for blocks/partitions not being overwritten or erased. As such, write-capable flash file systems (like JFFS2) have no simple means to support XIP – during a write operation the CPU can no longer fetch code from the device being programmed! While some applications can be suspended until write operations are complete, usually the kernel and drivers cannot be.

The simplest solution is of course to place device-programming code outside of the file system being (over)written. If the flash file system is the only persistent device, then programming code must reside in RAM. This seemingly simple solution, however, presents serious obstacles to system performance, especially event response. Applications with tight, real-time response requirements, e.g., in the dozens of microseconds, cannot tolerate waiting tens of milliseconds for a flash write operation to complete before servicing a real-time event. Again, no free lunch.

There do exist proprietary flash solutions with both XIP and read/write capability, such as Intel’s VFM (a.k.a. Hat Creek). Such file systems include code that resides in RAM to field interrupts during flash write cycles and to queue write operations behind time-critical service code.

Optimizing Performance of Consumer Electronics Applications

MontaVista Linux Consumer Electronics Edition provides a unique set of tools for analyzing and tuning real-time performance characteristics of intelligent devices.

Interrupt Latency Measurement Tool

Figure 14 - Interrupt Latency Measurement Tool

This powerful tool performs two kinds of analysis and display of interrupt latency information gathered by an instrumented MontaVista Linux kernel:

Displays histograms of collected actual interrupt latency samples, where the X axis represents a series of “buckets” that store latency times in increments of two microseconds, and the Y axis displays the number of actual samples for each bucket.

Lists kernel and driver routine names that account for the latency samples in the histogram, along with the latency (in microseconds) and sample contributions for each.
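The two views described above can be approximated in a few lines of Python. This is only a sketch of the bucketing idea, not the tool's implementation. The function names and the `(routine, latency)` sample format are assumptions made for the example.

```python
from collections import Counter

BUCKET_US = 2  # bucket width from the text: two microseconds


def histogram(latencies_us):
    """Map each bucket's lower edge (in us) to its number of samples."""
    return Counter(int(lat // BUCKET_US) * BUCKET_US for lat in latencies_us)


def by_routine(samples):
    """Summarize (routine_name, latency_us) pairs.

    Returns {routine_name: (worst_latency_us, sample_count)}.
    """
    summary = {}
    for routine, latency in samples:
        worst, count = summary.get(routine, (0, 0))
        summary[routine] = (max(worst, latency), count + 1)
    return summary
```

For example, `histogram([0.5, 1.9, 2.0, 3.1, 7.0])` places two samples in the 0 µs bucket, two in the 2 µs bucket, and one in the 6 µs bucket.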

The tool lets you both see the interrupt response characteristics of your system under varying loads and pinpoint the location of significant outliers (long latency contributors).

With this data in hand, you can

Find performance bottlenecks in driver and kernel code

Make more informed decisions about interrupt prioritization

Ensure that your hardware and software implementation meets key design criteria

Preemption Latency Measurement Tool

The Preemption Latency Measurement Tool is conceptually similar to the Interrupt Latency Measurement Tool, but instead of displaying data gathered for low-level interrupt response times, it analyzes and displays preemption or task-response latency – the time from the receipt of an interrupt (signaling data availability to a driver) until a sleeping process/thread waiting on that event is scheduled and executed.

Figure 15 – Preemption Latency Measurement Tool

This tool is dynamic, showing a running display of actual latencies, and a cumulative tally of average and worst case preemption latency samples under actual loads.
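A cumulative tally of this kind needs only three running values. The following sketch shows one way such a tally could be kept as samples arrive. The class name `LatencyTally` is hypothetical and the code is not taken from the tool.

```python
class LatencyTally:
    """Running average and worst-case tracker for latency samples (in us)."""

    def __init__(self):
        self.count = 0
        self.total_us = 0.0
        self.worst_us = 0.0

    def add(self, latency_us: float) -> None:
        """Fold one new sample into the running totals."""
        self.count += 1
        self.total_us += latency_us
        self.worst_us = max(self.worst_us, latency_us)

    @property
    def average_us(self) -> float:
        """Mean of all samples so far, or 0.0 before the first sample."""
        return self.total_us / self.count if self.count else 0.0


tally = LatencyTally()
for sample in (10.0, 20.0, 60.0):
    tally.add(sample)
```

Because only the count, the sum, and the maximum are stored, the tally can run indefinitely under load without keeping every sample in memory.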

Conclusion

Global manufacturers of intelligent consumer electronics devices are already leveraging the value and technical capabilities of embedded Linux for their next-generation devices. In Linux, they see not just a reliable, low-cost embedded OS, but a strategic platform on which to unify their technology investment and to build their leading products today and for the decade ahead.

MontaVista Linux Consumer Electronics Edition enables these market leaders to focus their investment on their unique added-value technology while building on a solid foundation they can trust.

To learn more about Consumer Electronics Edition, visit

http://www.mvista.com/cee/

and see how MontaVista Software technology leadership can bring your next project to market quickly, reliably and at significantly lower cost.

Copyright © 2003, MontaVista Software, Inc. This material may be distributed only subject to the terms and conditions set forth in the Open Publication License, v1.0. See <http://www.opencontent.org/openpub/>.

Revision History

Rev. Date By Comments

0.5 26oct03 BW Added detail on start-up, instant-on; new title; integrated additional comments from marketing and engineering; Final Review

0.4 17oct03 BW Edits from SH on markets, applications, technologies, partners

0.3 05sep03 BW Added O_STREAMING; Version sent out for review

0.2 18aug03 BW Added introduction, strategy, microprocessor, power management, footprint, boot-time content, tool screen shots

0.1 07apr03 BW New – from outline generated with Product Management with abstract, material on flash and XIP

MontaVista Software, Inc.
1237 East Arques Ave.
Sunnyvale, CA 94085
Tel: (408) 328-9200
Fax: (408) 328-9204
http://www.mvista.com

Powering The Embedded Revolution

Copyright © 2003 MontaVista Software, Inc. All rights reserved. MontaVista Linux is a registered trademark of MontaVista Software, Inc. Linux is a registered trademark of Linus Torvalds. All other trademarks are the property of their respective owners.