The ARM is a 32

  • 8/8/2019 The ARM is a 32

    Project 2010 Design And Implementation Of

    An Embedded Web Server

    Based On ARM

    CHAPTER 1

    LITERATURE SURVEY

    The ARM is a 32-bit reduced instruction set computer (RISC) instruction set

    architecture (ISA) developed by ARM Holdings. It was known as the Advanced RISC

    Machine, and before that as the Acorn RISC Machine. The ARM architecture is the

    most widely used 32-bit ISA in terms of numbers produced. The ARM was originally

    conceived as a processor for desktop personal computers by Acorn Computers, a

    market now dominated by the x86 family used by IBM PC compatible and Apple

    Macintosh computers. The relative simplicity of ARM processors made them suitable

    for low power applications. This has made them dominant in the mobile and

    embedded electronics market, as relatively low cost, and small microprocessors and

    microcontrollers.

    As of 2007, about 98 percent of the more than one billion mobile phones sold each

    year use at least one ARM processor. As of 2009, ARM processors account for

    approximately 90% of all embedded 32-bit RISC processors. ARM processors are

    used extensively in consumer electronics, including PDAs, mobile phones, digital

    media and music players, hand-held game consoles, calculators and computer

    peripherals such as hard drives and routers.

    The ARM architecture is licensable. Companies that are current or former ARM

    licensees include Alcatel-Lucent, Apple Inc., Atmel, Broadcom, Cirrus Logic, Digital

    Equipment Corporation, Freescale, Intel (through DEC), LG, Marvell Technology

    Group, Microsoft, NEC, Nuvoton, Nvidia, NXP (previously Philips), Oki, Qualcomm,

    Samsung, Sharp, STMicroelectronics, Symbios Logic, Texas Instruments, VLSI

    Technology, Yamaha and ZiiLABS.

    ARM processors are developed by ARM and by ARM licensees. Prominent ARM

    processor families developed by ARM Holdings include the ARM7, ARM9, ARM11 and

    Cortex. Notable ARM processors developed by

    licensees include DEC StrongARM, Freescale i.MX, Marvell (formerly Intel) XScale,

    Department of ECE, SBCE

    Nintendo, Nvidia Tegra, ST-Ericsson Nomadik, Qualcomm Snapdragon, the Texas

    Instruments OMAP product line, the Samsung Hummingbird and the Apple A4[3].

    1.1 HISTORY

    After achieving some success with the BBC Micro computer, Acorn Computers Ltd

    considered how to move on from the relatively simple MOS Technology 6502

    processor to address business markets like the one that would soon be dominated by

    the IBM PC, launched in 1981. The Acorn Business Computer (ABC) plan required a

    number of second processors to be made to work with the BBC Micro platform, but

    processors such as the Motorola 68000 and National Semiconductor 32016 were

    unsuitable, and the 6502 was not powerful enough for a graphics based user

    interface.

    Acorn would need a new architecture, having tested all of the available processors

    and found them wanting. Acorn then seriously considered designing its own

    processor, and their engineers came across papers on the Berkeley RISC project.

    They felt it showed that if a class of graduate students could create a competitive

    32-bit processor, then Acorn would have no problem. A trip to the Western Design

    Center in Phoenix showed Acorn engineers Steve Furber and Sophie Wilson that

    they did not need massive resources and state-of-the-art R&D facilities.

    Wilson set about developing the instruction set, writing a simulation of the processor

    in BBC Basic that ran on a BBC Micro with a 6502 second processor. It convinced

    the Acorn engineers that they were on the right track. Before they could go any

    further, however, they would need more resources. It was time for Wilson to

    approach Acorn's CEO, Hermann Hauser, and explain what was afoot. Once the go-

    ahead had been given, a small team was put together to implement Wilson's model

    in hardware.

    1.1.1 Acorn RISC Machine: ARM2

    The official Acorn RISC Machine project started in October 1983. VLSI Technology,

    Inc was chosen as silicon partner, since it already supplied Acorn with ROMs and

    some custom chips. The design was led by Wilson and Furber, with a key design

    goal of achieving low-latency input/output (interrupt) handling like the MOS

    the design team in 1990 into a new company called Advanced RISC Machines Ltd.

    For this reason, ARM is sometimes expanded as Advanced RISC Machine instead of

    Acorn RISC Machine. Advanced RISC Machines became ARM Ltd when its parent

    company, ARM Holdings plc, floated on the London Stock Exchange and NASDAQ

    in 1998.

    The new Apple-ARM work would eventually turn into the ARM6, first released in early

    1992. Apple used the ARM6-based ARM 610 as the basis for their Apple Newton

    PDA. In 1994, Acorn used the ARM 610 as the main CPU in their Risc PC

    computers. DEC licensed the ARM6 architecture and produced the StrongARM. At

    233 MHz this CPU drew only 1 Watt of power (more recent versions draw far less).

    This work was later passed to Intel as a part of a lawsuit settlement, and Intel took

    the opportunity to supplement their aging i960 line with the StrongARM. Intel later

    developed its own high performance implementation known as XScale which it has

    since sold to Marvell[3].

    1.2 LICENSING GROWTH

    The ARM core has remained largely the same size throughout these changes. ARM2

    had 30,000 transistors, while the ARM6 grew to only 35,000. ARM's business has

    always been to sell IP cores, which licensees use to create microcontrollers and

    CPUs based on this core. The most successful implementation has been the

    ARM7TDMI with hundreds of millions sold. The idea is that the Original Design

    Manufacturer combines the ARM core with a number of optional parts to produce a

    complete CPU, one that can be built on old semiconductor fabs and still deliver

    substantial performance at a low cost. Atmel has been an early design center for

    ARM7TDMI-based embedded systems.

    ARM licensed about 1.6 billion cores in 2005. In 2005, about 1 billion ARM cores

    went into mobile phones. As of January 2008, over 10 billion ARM cores have been

    built, and in 2008 iSuppli predicted that by 2011, 5 billion ARM cores will be shipping

    per year. As of January 2011, ARM states that over 15 billion ARM processors have

    shipped.

    The architecture used in smartphones, personal digital assistants and other mobile

    devices ranges from ARMv5 in obsolete or low-end devices to ARMv7-A (Cortex-A) in

    current high-end devices. XScale and ARM926 processors are ARMv5TE, and are

    now more numerous in high-end devices than the StrongARM, ARM9TDMI and

    ARM7TDMI based ARMv4 processors, but lower-end devices may use older cores

    with lower licensing costs. ARMv6 processors represented a step up in performance

    from standard ARMv5 cores, and are used in some cases, but Cortex processors

    (ARMv7) now provide faster and more power-efficient options than all those previous

    generations. Cortex-A targets applications processors, as needed by smartphones

    that previously used ARM9 or ARM11. Cortex-R targets real-time applications, and

    Cortex-M targets microcontrollers.

    In 2009, some manufacturers introduced netbooks based on ARM architecture

    CPUs, in direct competition with netbooks based on Intel Atom.

    1.3 ARM CORES

    ARM provides a summary of the numerous vendors who implement ARM cores in

    their design. KEIL also provides a somewhat newer summary of vendors of ARM

    based processors. ARM further provides a chart displaying an overview of the ARM

    processor lineup with performance and functionality versus capabilities for the more

    recent ARM7, ARM9, ARM11, Cortex-M, Cortex-R and Cortex-A device families.

    1.4 ARCHITECTURE

    From 1995 onwards, the ARM Architecture Reference Manual has been the primary

    source of documentation on the ARM processor architecture and instruction set,

    distinguishing interfaces that all ARM processors are required to support (such as

    instruction semantics) from implementation details that may vary. The architecture

    has evolved over time, and starting with the Cortex series of cores, three "profiles"

    are defined:

    "Application" profile: Cortex-A series

    "Real-time" profile: Cortex-R series

    "Microcontroller" profile: Cortex-M series

    Profiles are allowed to subset the architecture. For example the ARMv7-M profile

    used by the Cortex-M3 core is notable in that it supports only the Thumb-2 instruction

    set, and the ARMv6-M profile (used by the Cortex-M0) is a subset of the ARMv7-M

    profile (supporting fewer instructions).

    1.5 INSTRUCTION SET

    To keep the design clean, simple and fast, the original ARM implementation was

    hardwired without microcode, like the much simpler 8-bit 6502 processor used in

    prior Acorn microcomputers.

    1.6 RISC FEATURES

    The ARM architecture includes the following RISC features:

    Load/store architecture.

    No support for misaligned memory accesses (now supported in ARMv6 cores, with some exceptions related to load/store multiple word instructions).

    Uniform 16 × 32-bit register file.

    Fixed instruction width of 32 bits to ease decoding and pipelining, at the cost of decreased code density. Later, the Thumb instruction set increased code density.

    Mostly single-cycle execution.

    To compensate for the simpler design, compared with contemporary processors like

    the Intel 80286 and Motorola 68020, some additional design features were used:

    Conditional execution of most instructions, reducing branch overhead and compensating for the lack of a branch predictor.

    Arithmetic instructions alter condition codes only when desired.

    32-bit barrel shifter which can be used without performance penalty with most arithmetic instructions and address calculations.

    Powerful indexed addressing modes.

    A link register for fast leaf function calls.

    Simple, but fast, 2-priority-level interrupt subsystem with switched register banks.
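
    As a rough illustration of the barrel shifter feature listed above, the following Python sketch models a data-processing instruction whose second operand is shifted before the addition takes place. The helper names (lsl, lsr, ror, add_shifted) are assumptions made for this example and do not come from any ARM tool; registers are modelled as unsigned 32-bit integers.

```python
MASK32 = 0xFFFFFFFF  # registers are modelled as unsigned 32-bit values


def lsl(value, amount):
    """Logical shift left, truncated to 32 bits."""
    return (value << amount) & MASK32


def lsr(value, amount):
    """Logical shift right on an unsigned 32-bit value."""
    return (value & MASK32) >> amount


def ror(value, amount):
    """Rotate right within 32 bits."""
    amount %= 32
    value &= MASK32
    return ((value >> amount) | (value << (32 - amount))) & MASK32


def add_shifted(rn, rm, shift=lsl, amount=0):
    """Model an ADD whose second operand passes through the shifter first."""
    return (rn + shift(rm, amount)) & MASK32


# Adding four times a value in one modelled instruction: 10 + (3 << 2)
print(add_shifted(10, 3, lsl, 2))
```

    The point of the sketch is that the shift and the addition are a single operation from the programmer's point of view, which is why scaled array indexing is cheap on ARM.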

    1.7 CONDITIONAL EXECUTION

    The conditional execution feature (called predication) is implemented with a 4-bit

    condition code selector (the predicate) on every instruction; one of the four-bit codes

    is reserved as an "escape code" to specify certain unconditional instructions, but

    nearly all common instructions are conditional. Most CPU architectures only have

    condition codes on branch instructions.

    This cuts down significantly on the encoding bits available for displacements in

    memory access instructions, but on the other hand it avoids branch instructions when

    generating code for small if statements.

    One of the ways that Thumb code provides a more dense encoding is to remove that

    four bit selector from non-branch instructions.
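
    The predication scheme described above can be sketched in Python as follows. The mnemonic table follows the standard ARM condition field encoding (bits 31 to 28 of an instruction word); the function names are assumptions made for this illustration, and the flags are passed in as plain booleans rather than read from a status register.

```python
# Condition mnemonics indexed by the 4-bit field in bits 31..28 of an ARM instruction.
CONDITIONS = ["EQ", "NE", "CS", "CC", "MI", "PL", "VS", "VC",
              "HI", "LS", "GE", "LT", "GT", "LE", "AL"]


def condition_field(instruction_word):
    """Extract the 4-bit condition selector from a 32-bit ARM instruction word."""
    return (instruction_word >> 28) & 0xF


def condition_passed(cond, n, z, c, v):
    """Return True if an instruction with condition `cond` would execute.

    n, z, c, v are the Negative, Zero, Carry and oVerflow flags.
    """
    if cond == 0b1111:
        raise ValueError("0b1111 is the escape code, not a testable condition")
    base = [
        z,                 # EQ / NE
        c,                 # CS / CC
        n,                 # MI / PL
        v,                 # VS / VC
        c and not z,       # HI / LS
        n == v,            # GE / LT
        n == v and not z,  # GT / LE
        True,              # AL
    ][cond >> 1]
    # The low bit of the field selects the inverted form of each test.
    return (not base) if cond & 1 else bool(base)
```

    For a small if statement, a compiler can emit both arms with opposite conditions (for example EQ and NE) and let the flags decide which instructions take effect, instead of branching around them.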

    1.8 OTHER FEATURES

    Another feature of the instruction set is the ability to fold shifts and rotates into the

    "data processing" (arithmetic, logical, and register-register move) instructions, so

    that, for example, the C statement

    a += (j

    strictly speaking, for them it's not possible to generate efficient code that would

    behave the way one would expect for C objects of type "int16_t"[3].

    1.9 PIPELINES AND OTHER IMPLEMENTATION ISSUES

    The ARM7 and earlier implementations have a three stage pipeline; the stages being

    fetch, decode, and execute. Higher performance designs, such as the ARM9, have

    deeper pipelines: Cortex-A8 has thirteen stages. Additional implementation changes

    for higher performance include a faster adder, and more extensive branch prediction

    logic. The difference between the ARM7DI and ARM7DMI cores, for example, was

    an improved multiplier (hence the added "M").
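
    To make the pipeline depth figures concrete, the following Python sketch models an idealised pipeline in which every instruction spends exactly one cycle in each stage. It deliberately ignores stalls, branches and multi-cycle instructions, so it is an upper bound on throughput rather than a model of any real core; the function names are assumptions made for this example.

```python
def pipeline_cycles(instruction_count, stages=3):
    """Cycles to run a straight-line sequence on an ideal pipeline with no stalls."""
    if instruction_count == 0:
        return 0
    return stages + instruction_count - 1


def pipeline_trace(instructions, stage_names=("fetch", "decode", "execute")):
    """Return one dict per cycle, mapping each stage to the instruction in it."""
    depth = len(stage_names)
    total = pipeline_cycles(len(instructions), depth)
    trace = []
    for cycle in range(total):
        row = {}
        for position, name in enumerate(stage_names):
            index = cycle - position
            in_range = 0 <= index < len(instructions)
            row[name] = instructions[index] if in_range else None
        trace.append(row)
    return trace


for cycle, row in enumerate(pipeline_trace(["i0", "i1", "i2"])):
    print(cycle, row)
```

    Once the pipeline is full, one instruction completes per cycle regardless of depth; a deeper pipeline pays a larger fill cost (and a larger penalty whenever it must be refilled) in exchange for a higher achievable clock rate.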

    1.10 COPROCESSORS

    The architecture provides a non-intrusive way of extending the instruction set using

    "coprocessors" which can be addressed using MCR, MRC, MRRC, MCRR, and

    similar instructions. The coprocessor space is divided logically into 16 coprocessors

    with numbers from 0 to 15, coprocessor 15 (cp15) being reserved for some typical

    control functions like managing the caches and MMU operation (on processors that

    have one).
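
    As an illustration of how a coprocessor register transfer is addressed, the sketch below packs the fields of an MRC or MCR instruction into a 32-bit ARM instruction word. The bit layout follows the ARM-state encoding given in the ARM Architecture Reference Manual; the function name and argument order are assumptions made for this example.

```python
def encode_coproc_transfer(to_arm, coproc, opc1, rt, crn, crm, opc2=0, cond=0b1110):
    """Encode MRC (to_arm=True) or MCR (to_arm=False) as a 32-bit ARM word.

    coproc is the coprocessor number (0-15), rt the ARM register,
    crn and crm the coprocessor registers, opc1 and opc2 the opcodes.
    """
    fields = [(cond, 4), (coproc, 4), (opc1, 3), (rt, 4),
              (crn, 4), (crm, 4), (opc2, 3)]
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError("field value does not fit in its bit width")
    return ((cond << 28) | (0b1110 << 24) | (opc1 << 21) | (int(to_arm) << 20)
            | (crn << 16) | (rt << 12) | (coproc << 8) | (opc2 << 5)
            | (1 << 4) | crm)


# MRC p15, 0, r0, c1, c0, 0: read a cp15 control register into r0
print(hex(encode_coproc_transfer(True, 15, 0, 0, 1, 0, 0)))
```

    The 4-bit coprocessor field is what limits the space to the 16 coprocessors mentioned above.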

    In ARM-based machines, peripheral devices are usually attached to the processor by

    mapping their physical registers into ARM memory space or into the coprocessor

    space or connecting to another device (a bus) which in turn attaches to the

    processor. Coprocessor accesses have lower latency so some peripherals (for

    example XScale interrupt controller) are designed to be accessible in both ways

    (through memory and through coprocessors). In other cases, chip designers only

    integrate hardware using the coprocessor mechanism. For example, an image

    processing engine might be a small ARM7TDMI core combined with a coprocessor

    that has specialized operations to support a specific set of HDTV transcoding

    primitives.

    1.11 DEBUGGING

    All modern ARM processors include hardware debugging facilities; without them,

    software debuggers could not perform basic operations like halting, stepping, and

    breakpointing of code starting from reset. These facilities are built using JTAG

    support, though some newer cores optionally support ARM's own two-wire "SWD"

    protocol. In ARM7TDMI cores, the "D" represented JTAG debug support, and the "I"

    represented presence of an "Embedded ICE" debug module. For ARM7 and ARM9

    core generations, Embedded ICE over JTAG was a de-facto debug standard,

    although it was not architecturally guaranteed.

    The ARMv7 architecture defines basic debug facilities at an architectural level. These

    include breakpoints, watchpoints, and instruction execution in a "Debug Mode";

    similar facilities were also available with Embedded ICE. Both "halt mode" and

    "monitor" mode debugging are supported. The actual transport mechanism used to

    access the debug facilities is not architecturally specified, but implementations

    generally include JTAG support.

    There is a separate ARM "CoreSight" debug architecture, which is not architecturally

    required by ARMv7 processors[3].

    1.12 DSP ENHANCEMENT INSTRUCTIONS

    To improve the ARM architecture for digital signal processing and multimedia

    applications, a few new instructions were added to the set. These are signified by an

    "E" in the name of the ARMv5TE and ARMv5TEJ architectures. E-variants also imply

    T, D, M and I.

    The new instructions are common in digital signal processor architectures. They are

    variations on signed multiply-accumulate, saturated add and subtract, and count

    leading zeros.
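
    The behaviour of two of these instruction classes can be sketched in Python. The functions below model what saturated add and subtract (QADD, QSUB) and count leading zeros (CLZ) compute on 32-bit values; they are a behavioural illustration only, and the Python function names are assumptions made for this example.

```python
INT32_MIN = -(1 << 31)
INT32_MAX = (1 << 31) - 1


def saturate32(value):
    """Clamp an integer to the signed 32-bit range instead of wrapping."""
    return max(INT32_MIN, min(INT32_MAX, value))


def qadd(a, b):
    """Saturating signed add."""
    return saturate32(a + b)


def qsub(a, b):
    """Saturating signed subtract."""
    return saturate32(a - b)


def clz(value):
    """Count leading zero bits of a 32-bit unsigned value (32 when value is 0)."""
    return 32 - (value & 0xFFFFFFFF).bit_length()


print(qadd(INT32_MAX, 1), clz(1))
```

    Saturation matters in signal processing because a sample that overflows should clip at full scale rather than wrap around to the opposite sign.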

    1.13 JAZELLE

    Jazelle is a technique that allows Java Bytecode to be executed directly in the ARM

    architecture as a third execution state (and instruction set) alongside the existing

    ARM and Thumb-mode. Support for this state is signified by the "J" in the ARMv5TEJ

    architecture, and in ARM9EJ-S and ARM7EJ-S core names. Support for this state is

    required starting in ARMv6 (except for the ARMv7-M profile), although newer cores

    only include a trivial implementation that provides no hardware acceleration.

    1.14 THUMB

    To improve compiled code-density, processors since the ARM7TDMI have featured

    the Thumb instruction set state. (The "T" in "TDMI" indicates the Thumb feature.)

    When in this state, the processor executes the Thumb instruction set, a variable-

    length instruction set providing 32-bit and 16-bit instructions. Most of the Thumb

    instructions are directly mapped to normal ARM instructions. The space-saving

    comes from making some of the instruction operands implicit and limiting the number

    of possibilities compared to the ARM instructions executed in the ARM instruction set

    state.

    In Thumb, the 16-bit opcodes have less functionality. For example, only branches

    can be conditional, and many opcodes are restricted to accessing only half of all of

    the CPU's general purpose registers. The shorter opcodes give improved code

    density overall, even though some operations require extra instructions. In situations

    where the memory port or bus width is constrained to less than 32 bits, the shorter

    Thumb opcodes allow increased performance compared with 32-bit ARM code, as

    less program code may need to be loaded into the processor over the constrained

    memory bandwidth.
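
    The bandwidth argument above can be illustrated with a small calculation. The sketch counts bus accesses needed to fetch a run of fixed-width instructions over a bus of a given width. The instruction counts used (100 ARM instructions versus 130 Thumb instructions for the same routine) are hypothetical figures chosen only for illustration, not measurements.

```python
import math


def bus_transfers(instruction_count, instruction_bits, bus_bits):
    """Number of bus accesses needed to fetch a run of fixed-width instructions."""
    return instruction_count * math.ceil(instruction_bits / bus_bits)


# Hypothetical routine: 100 ARM instructions, or 130 Thumb instructions.
arm_fetches = bus_transfers(100, 32, 16)    # each 32-bit opcode needs two accesses
thumb_fetches = bus_transfers(130, 16, 16)  # each 16-bit opcode needs one access

print(arm_fetches, thumb_fetches)
```

    Under these assumed counts the Thumb version needs fewer fetches on a 16-bit bus even though it executes more instructions; on a 32-bit bus the advantage disappears, since each ARM opcode then arrives in a single access.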

    Embedded hardware, such as the Game Boy Advance, typically has a small amount

    of RAM accessible with a full 32-bit datapath; the majority is accessed via a 16-bit or

    narrower secondary datapath. In this situation, it usually makes sense to compile

    Thumb code and hand-optimise a few of the most CPU-intensive sections using full

    32-bit ARM instructions, placing these wider instructions into the 32-bit bus

    accessible memory.

    The first processor with a Thumb instruction decoder was the ARM7TDMI. All ARM9

    and later families, including XScale, have included a Thumb instruction decoder.

    1.15 THUMB-2

    Thumb-2 technology made its debut in the ARM1156 core, announced in 2003.

    Thumb-2 extends the limited 16-bit instruction set of Thumb with additional 32-bit

    instructions to give the instruction set more breadth. A stated aim for Thumb-2 is to

    achieve code density similar to Thumb with performance similar to the ARM

    instruction set on 32-bit memory. In ARMv7 this goal can be said to have been met.

    Thumb-2 extends both the ARM and Thumb instruction set with yet more

    instructions, including bit-field manipulation, table branches, and conditional

    execution. A new "Unified Assembly Language" (UAL) supports generation of either

    Thumb-2 or ARM instructions from the same source code; versions of Thumb seen

    on ARMv7 processors are essentially as capable as ARM code (including the ability

    to write interrupt handlers). This requires a bit of care, and use of a new "IT" (if-then)

    instruction, which permits up to four successive instructions to execute based on a

    tested condition. When compiling into ARM code this is ignored, but when compiling

    into Thumb-2 it generates an actual instruction.
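
    The effect of the IT instruction can be sketched as follows. Each letter after the initial "I" in the mnemonic covers one following instruction: "T" (then) applies the stated condition and "E" (else) applies its inverse. The Python function name and the table layout are assumptions made for this illustration.

```python
# Each condition paired with its logical inverse.
INVERSE = {"EQ": "NE", "NE": "EQ", "CS": "CC", "CC": "CS",
           "MI": "PL", "PL": "MI", "VS": "VC", "VC": "VS",
           "HI": "LS", "LS": "HI", "GE": "LT", "LT": "GE",
           "GT": "LE", "LE": "GT"}


def it_block_conditions(mnemonic, first_cond):
    """Expand e.g. ("ITTE", "EQ") into the condition for each covered instruction."""
    mnemonic = mnemonic.upper()
    first_cond = first_cond.upper()
    pattern = mnemonic[1:]  # letters after the leading "I"; the first is always "T"
    if not mnemonic.startswith("IT") or not 1 <= len(pattern) <= 4:
        raise ValueError("an IT block covers one to four instructions")
    if set(pattern) - {"T", "E"}:
        raise ValueError("only T (then) and E (else) may follow IT")
    if first_cond not in INVERSE:
        raise ValueError("condition has no inverse in this sketch")
    return [first_cond if letter == "T" else INVERSE[first_cond]
            for letter in pattern]


print(it_block_conditions("ITTE", "EQ"))
```

    This is how Thumb-2 recovers conditional execution without a condition field in every opcode: one IT instruction supplies the conditions for the short run that follows it.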

    All ARMv7 chips support the Thumb-2 instruction set. Some chips, such as the

    Cortex-M3, support only the Thumb-2 instruction set. Other chips in the Cortex and

    ARM11 series support both "ARM instruction set mode" and "Thumb-2 instruction set

    mode".

    1.16 THUMB EXECUTION ENVIRONMENT (THUMBEE)

    ThumbEE, also known as Thumb-2EE, and marketed as Jazelle RCT (Runtime

    Compilation Target), was announced in 2005, first appearing in the Cortex-A8

    processor. ThumbEE is a fourth processor mode, making small changes to the

    Thumb-2 extended Thumb instruction set. These changes make the instruction set

    particularly suited to code generated at runtime (e.g. by JIT compilation) in managed

    Execution Environments. ThumbEE is a target for languages such as Limbo, Java,

    C#, Perl and Python, and allows JIT compilers to output smaller compiled code

    without impacting performance.

    New features provided by ThumbEE include automatic null pointer checks on every

    load and store instruction, an instruction to perform an array bounds check, access to

    registers r8-r15 (where the Jazelle/DBX Java VM state is held), and special

    instructions that call a handler. Handlers are small sections of frequently called code,

    commonly used to implement a feature of a high level language, such as allocating

    memory for a new object. These changes come from repurposing a handful of

    opcodes, and knowing the core is in the new ThumbEE mode.

    1.17 VECTOR FLOATING POINT (VFP)

    VFP (Vector Floating Point) technology is a coprocessor extension to the ARM

    architecture. It provides low-cost single-precision and double-precision floating-point

    computation fully compliant with the ANSI/IEEE Std 754-1985 Standard for Binary

    Floating-Point Arithmetic. VFP provides floating-point computation suitable for a wide

    spectrum of applications such as PDAs, smartphones, voice compression and

    decompression, three-dimensional graphics and digital audio, printers, set-top boxes,

    and automotive applications. The VFP architecture also supports execution of short

    vector instructions but these operate on each vector element sequentially and thus

    do not offer the performance of true SIMD (Single Instruction Multiple Data)

    parallelism. This mode can still be useful in graphics and signal-processing

    applications, however, as it allows a reduction in code size and instruction fetch and

    decode overhead.

    Other floating-point and/or SIMD coprocessors found in ARM-based processors

    include FPA, FPE, iwMMXt. They provide some of the same functionality as VFP but

    are not opcode-compatible with it.

    1.18 ADVANCED SIMD (NEON)

    The Advanced SIMD extension, marketed as NEON technology, is a combined 64-

    and 128-bit single instruction multiple data (SIMD) instruction set that provides

    standardized acceleration for media and signal processing applications. NEON can

    execute MP3 audio decoding on CPUs running at 10 MHz and can run the GSM

    AMR (Adaptive Multi-Rate) speech codec at no more than 13 MHz. It features a

    comprehensive instruction set, separate register files and independent execution

    hardware. NEON supports 8-, 16-, 32- and 64-bit integer and single-precision (32-bit)

    floating-point data and operates in SIMD operations for handling audio and video

    processing as well as graphics and gaming processing. In NEON, the SIMD supports

    up to 16 operations at the same time. The NEON hardware shares the same floating-

    point registers as used in VFP.
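
    The figure of 16 simultaneous operations follows from dividing a 128-bit register into 8-bit elements. The Python sketch below shows that arithmetic and models a lane-wise add in which each element wraps independently; it is a behavioural illustration with assumed function names, not a description of how NEON hardware is built.

```python
def lanes(register_bits, element_bits):
    """Number of elements that fit side by side in one SIMD register."""
    return register_bits // element_bits


def simd_add(a, b, element_bits=8):
    """Lane-wise wrapping add of two equal-length lists of unsigned elements."""
    if len(a) != len(b):
        raise ValueError("operands must have the same number of lanes")
    mask = (1 << element_bits) - 1
    return [(x + y) & mask for x, y in zip(a, b)]


print(lanes(128, 8))                    # 8-bit elements in a 128-bit register
print(simd_add([250] * 16, [10] * 16))  # every lane wraps on its own
```

    Wider elements trade parallelism for range: the same 128-bit register holds only four 32-bit values.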

    1.19 SECURITY EXTENSIONS (TRUSTZONE)

    The Security Extensions, marketed as TrustZone Technology, are found in ARMv6KZ

    and later application profile architectures. They provide a low cost alternative to adding

    an additional dedicated security core to a SoC, by providing two virtual processors

    backed by hardware based access control. This enables the application core to

    switch between two states, referred to as worlds (to reduce confusion with other

    names for capability domains), in order to prevent information from leaking from the

    more trusted world to the less trusted world. This world switch is generally orthogonal

    to all other capabilities of the processor, thus each world can operate independently

    of the other while using the same core. Memory and peripherals are then made

    aware of the operating world of the core and may use this to provide access control

    to secrets and code on the device. Typical applications of TrustZone Technology are

    to run a rich operating system in the less trusted world, and smaller security-

    specialized code in the more trusted world (known as TrustZone Software, a

    TrustZone optimized version of the Trusted Foundations(TM) Software developed by

    Trusted Logic), allowing much tighter Digital Rights Management for controlling the

    use of media on ARM-based devices, and preventing any unapproved use of the

    device.

    In practice, since the specific implementation details of TrustZone are proprietary and

    have not been publicly disclosed for review, it is unclear what level of assurance is

    provided for a given threat model.

    1.20 NO-EXECUTE PAGE PROTECTION

    As of ARMv6, the ARM architecture supports no-execute page protection, which is

    referred to as XN, for Execute Never.

    1.21 ARM LICENSEES

    ARM Ltd does not manufacture and sell CPU devices based on its own designs, but

    rather, licenses the processor architecture to interested parties. ARM offers a variety

    of licensing terms, varying in cost and deliverables. To all licensees, ARM provides

    an integratable hardware description of the ARM core, as well as complete software

    development toolset (compiler, debugger, SDK), and the right to sell manufactured

    silicon containing the ARM CPU. Fabless licensees, who wish to integrate an ARM

    core into their own chip design, are usually only interested in acquiring a ready-to-

    manufacture verified IP core. For these customers, ARM delivers a gate netlist

    description of the chosen ARM core, along with an abstracted simulation model and

    test programs to aid design integration and verification. More ambitious customers,

    including integrated device manufacturers (IDM) and foundry operators, choose to

    acquire the processor IP in synthesizable RTL (Verilog) form. With the synthesizable

    RTL, the customer has the ability to perform architectural level optimizations and

    extensions. This allows the designer to achieve exotic design goals not otherwise

    possible with an unmodified netlist (high clock speed, very low power consumption,

    instruction set extensions, etc.). While ARM does not grant the licensee the right to

    resell the ARM architecture itself, licensees may freely sell manufactured product

    (chip devices, evaluation boards, complete systems, etc.). Merchant foundries can be

    a special case; not only are they allowed to sell finished silicon containing ARM

    cores, they generally hold the right to remanufacture ARM cores for other customers.

    Like most IP vendors, ARM prices its IP based on perceived value. In architectural

    terms, the lower performance ARM cores command a lower license cost than the

    higher performance cores. In terms of silicon implementation, a synthesizable core is

    more expensive than a hard macro (black box) core. Complicating price matters, a

    merchant foundry who holds an ARM license (such as Samsung and Fujitsu) can

    offer reduced licensing costs to its fab customers. In exchange for acquiring the ARM

    core through the foundry's in-house design services, the customer can reduce or

    eliminate payment of ARM's upfront license fee. Compared to dedicated

    semiconductor foundries (such as TSMC and UMC) without in-house design

    services, Fujitsu/Samsung charge 2 to 3 times more per manufactured wafer. For low

    to mid volume applications, a design service foundry offers lower overall pricing

    (through subsidization of the license fee). For high volume mass produced parts, the

    long term cost reduction achievable through lower wafer pricing reduces the impact

    of ARM's NRE (Non-Recurring Engineering) costs, making the dedicated foundry a

    better choice.

    Many semiconductor or IC design firms hold ARM licenses; Analog Devices, Atmel,

    Broadcom, Cirrus Logic, Energy Micro, Faraday Technology, Freescale, Fujitsu, Intel

    (through its settlement with Digital Equipment Corporation), IBM, Infineon

    Technologies, Nintendo, NXP Semiconductors, OKI, Qualcomm, Samsung, Sharp,

    STMicroelectronics, Texas Instruments and VLSI are some of the many companies

    who have licensed the ARM in one form or another.

    1.22 APPROXIMATE LICENSING COSTS

    ARM's 2006 annual report and accounts state that royalties totalling £88.7 million

    ($164.1 million) were the result of licensees shipping 2.45 billion units. This is

    equivalent to £0.036 ($0.067) per unit shipped. However, this is averaged across all

    cores, including expensive new cores and inexpensive older cores.

    In the same year ARM's licensing revenues for processor cores were £65.2 million

    (US$119.5 million), in a year when 65 processor licenses were signed, an average of

    £1 million ($1.84 million) per license. Again, this is averaged across both new and old

    cores.

    Given that ARM's 2006 income from processor cores was approximately 60% from

    royalties and 40% from licenses, ARM makes the equivalent of £0.06 ($0.11) per unit

    shipped including both royalties and licenses. However, as one-off licenses are

    typically bought for new technologies, unit sales (and hence royalties) are dominated

    by more established products. Hence, the figures above do not reflect the true costs

    of any single ARM product.
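
    The per-unit and per-license averages quoted in this section can be reproduced from the totals with a few divisions. The Python sketch below uses only the figures stated above; the first figure of each pair is taken as reported and the second is the US dollar figure. The variable names are assumptions made for this example, and the 60% royalty share is the approximate value given in the text.

```python
# Totals quoted in the text, all in millions.
royalties_m = 88.7
royalties_usd_m = 164.1
units_m = 2450.0          # 2.45 billion units shipped
licensing_m = 65.2
licensing_usd_m = 119.5
licenses_signed = 65

# Average royalty per unit shipped.
royalty_per_unit = royalties_m / units_m
royalty_per_unit_usd = royalties_usd_m / units_m

# Average revenue per license signed, in millions.
average_license_m = licensing_m / licenses_signed
average_license_usd_m = licensing_usd_m / licenses_signed

# If royalties are about 60% of processor income, total income per unit
# is the royalty per unit divided by that share.
royalty_share = 0.60
total_per_unit = royalty_per_unit / royalty_share
total_per_unit_usd = royalty_per_unit_usd / royalty_share

print(round(royalty_per_unit, 3), round(royalty_per_unit_usd, 3))
print(round(average_license_m, 2), round(average_license_usd_m, 2))
print(round(total_per_unit, 2), round(total_per_unit_usd, 2))
```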

    1.23 OPERATING SYSTEMS

    1.23.1 Acorn systems

    The very first ARM-based Acorn Archimedes personal computers ran an interim

    operating system called Arthur, which evolved into RISC OS, used on later ARM-

    based systems from Acorn and other vendors.

    1.23.2 Embedded operating systems

    The ARM architecture is supported by a large number of embedded and real-time

    operating systems, including Windows CE, Symbian OS, eCos, INTEGRITY, Nucleus

    PLUS, MicroC/OS-II, QNX, RTXC Quadros, ThreadX and VxWorks.

    1.23.3 Unix-like

    The ARM architecture is supported by Unix and Unix-like operating systems such as

    GNU/Linux, BSD, Plan 9 from Bell Labs, Inferno, Solaris, Apple iOS, WebOS and

    Android.

    1.23.4 Windows

    Microsoft announced on 5 January 2011 that the next major version of the Windows

    NT family will include support for ARM processors. Microsoft demonstrated a

    preliminary version of Windows (version 6.2.7867) running on an ARM-based

    computer at the 2011 Consumer Electronics Show[3].

    CHAPTER 2

    PROBLEM STATEMENT

    This project designs an embedded web server built around Samsung's ARM9 S3C2440AL

    processor, with Linux as its operating system. The hardware architecture of the

    system is presented first, followed by the process of porting the Linux operating

    system to the ARM platform. The implementation of the Boa web server and the dynamic

    interaction between the browser and the embedded system through CGI are analysed in

    detail. Finally, the implemented embedded web server is tested to show that it

    responds rapidly and operates efficiently and stably, which meets the intended

    design goals[1].

    2.1 EMBEDDED WEB SERVER

    An Embedded Web Server (EWS) is a Web server that runs on an embedded system

    with limited computing resources and serves embedded Web documents to a Web

    browser. By embedding a Web server into a network device, it is possible for an

    EWS to provide a powerful Web-based management user interface constructed

    using HTML, graphics and other features common to Web browsers. When applied

    to embedded systems, Web technologies offer graphical user interfaces, which are

    user-friendly, inexpensive, cross-platform, and network-ready. General Web servers,

    which were developed for general purpose computers such as NT servers or Unix

    and Linux workstations, typically require megabytes of memory, a fast processor, a

    pre-emptive multitasking operating system, and other resources. A Web server can

    be embedded in a device to provide remote access to the device from a Web

    browser if the resource requirements of the Web server are reduced. The end result

    of reducing the resource requirements of the Web server is typically a portable set of

    code that can run on embedded systems with limited computing resources.

    An embedded system can be used to serve embedded Web documents, including

    static and dynamic information about embedded systems, to Web browsers. This

    type of Web server is called an Embedded Web Server (EWS).

    EWSs are used to convey the state information of embedded systems, such as a

    system's working statistics, current configuration and operation results, to a Web

    browser. EWSs are also used to transfer user commands from a Web browser to an

    embedded system. The state information is extracted from an embedded system

    application and the control command is implemented through the embedded system

    application. In many instances, it makes sense for embedded Web software to be a

    lightweight version of Web software. For network devices, such as routers, switches

    and hubs, it is possible to place an EWS directly in the devices without additional

    hardware.

    The development of an EWS must take into account the relative scarcity of system

    resources. An EWS must fit within the device's memory constraints and limited

    processing power. More difficult than overcoming the memory limitation is managing

    the impact of Web request servicing on the system CPU. An EWS process as a

    subordinate process to the main purpose of the device must use as few CPU

    resources as possible in order not to interfere with the main task of the system.

    Generally, network devices require high reliability. As one embedded component of a

    network device, an EWS must also be highly reliable. Because it is a subordinate

    process, at the very least it must protect against propagation of internal failure to the

    whole system. An EWS needs to run on a much broader range of embedded system

    environments in terms of the facilities they provide, and with much tighter resource

    constraints than mainstream computing hardware. Consequently, it must be highly

    portable. Security is an important concern in many possible applications for

    embedded Web server technology, especially ones that involve equipment

    configuration or administration. It is often desirable to limit access to this

    information to a specific set of users. While Web documents can be quickly prototyped with

    readily available desktop authoring tools, the prototype must then be integrated into

    the system software. An EWS works with a fixed set of integrated Web documents

    that are usually frozen at the time the embedded system is manufactured, but then

    incorporates dynamic information of system software at run-time. For rapid

    development, an easy but powerful integration mechanism must be provided.

    In this context, an embedded web server means a Web server built into on-site

    monitoring and control equipment. With suitable hardware and software support,

    traditional monitoring and control equipment becomes an Internet-based device that

    uses TCP/IP as its underlying communication protocol and Web server technology as

    its core. Embedded devices have limited resources and are not expected to handle

    many simultaneous user requests, so a Web server designed specifically for embedded

    use is needed instead of Apache, which is commonly used on Linux. Typical Web

    servers for embedded Linux systems include Boa, httpd and thttpd. httpd supports

    only static pages and does not support CGI, so it is clearly unsuitable for more

    advanced applications. thttpd and Boa provide similar functions, but thttpd requires

    considerably more resources at run time than Boa. Boa is a single-tasking HTTP

    server. Unlike a traditional Web server, it does not fork a child process for each

    of the simultaneous connections. Instead it multiplexes all ongoing HTTP connections

    internally and forks only for CGI programs, automatic directory generation and

    automatic file gunzipping. This saves as much of the system's resources as possible,

    which is vital for embedded systems. For these reasons Boa has many advantages on an

    embedded platform, and it is used as the Web server in this paper. Its architecture

    is shown in Fig. 2.1.

    Fig. 2.1. Architecture of Boa server[1]

    2.2 THE CHOICE OF EMBEDDED WEB SERVER

    Generally speaking, embedded devices have limited resources and do not need to

    handle requests from many users simultaneously. They therefore do not need Apache,

    the most commonly used Linux server. A Web server designed specifically for embedded

    devices is used instead. This kind of Web server requires relatively little storage

    space and memory to run, which makes it well suited to embedded applications.

    There are three typical embedded Web servers: httpd, Boa and thttpd. httpd is the

    simplest and has the weakest feature set of the three. It supports neither

    authentication nor CGI, whereas Boa and thttpd support both. If the Web server only

    has to provide static pages, such as simple online help or a system introduction, a

    static server is sufficient. If the system needs better security or user

    interaction, such as real-time status queries and login, dynamic Web technology is

    required, and either Boa or thttpd can be used. The present work adopts Boa as the

    Web server for the embedded system, because thttpd offers fewer functions and needs

    far more resources to run[2].

    2.2.1 The Principle Of Embedded Web Server Boa

    Boa is a single-tasking Web server. Unlike a traditional Web server, Boa does not

    create a separate process for each incoming connection, nor does it handle multiple

    connections by forking copies of itself. Instead, Boa handles multiple connections

    by maintaining a list of HTTP requests, and it forks a new process only for CGI

    programs. In this way system resources are saved as far as possible. Like an

    ordinary Web server, an embedded Web server receives requests from the client,

    analyses them, responds to them and finally returns the results to the client. Its

    work process is as follows.

    1. Initialise the Web server: create the environment variables, create a socket,

    bind it to a port, listen on the port, then enter the main loop and wait for

    connection requests from clients.

    2. When a connection request arrives from a client, the Web server accepts it and

    saves the related information.

    3. After accepting the connection, Boa calls its parsing module to analyse the

    request and determine the request method, the target URL and any form data. It then

    processes the request accordingly.

    4. When processing is finished, the Web server sends the response to the client

    browser and then closes the TCP connection with the client.

    Boa responds differently to different request methods. If the request method is

    HEAD, only the response header is sent to the browser. If the request method is GET,

    the server sends the response header and also reads the file named by the requested

    URL and sends it to the client browser. If the request method is POST, the form data

    is passed to the corresponding CGI program as its input, the CGI program is

    executed, and its results are sent to the client browser. Boa's flowchart is shown

    in Fig. 2.2.

    Fig. 2.2 Embedded Web server flowchart
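
    The initialisation step and the method-specific responses described in Section

    2.2.1 can be sketched in C. This is a minimal illustration under stated assumptions,

    not Boa's actual source code: the names open_listener and action_for_request and

    the enum values are hypothetical, and a GET request is mapped to a static file only,

    following the description in the text.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create a TCP socket, bind it to `port` on all
 * interfaces and start listening. Returns the socket, or -1 on error. */
int open_listener(unsigned short port)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    /* Allow quick restarts of the server on the same port. */
    int yes = 1;
    setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof yes);

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, 16) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* What the server does for a request, following the text above. */
enum action {
    SEND_HEADER_ONLY,      /* HEAD: response header only        */
    SEND_HEADER_AND_FILE,  /* GET: header plus the target file  */
    RUN_CGI,               /* POST: hand the form data to CGI   */
    REJECT                 /* any other method                  */
};

/* Hypothetical helper: choose the action from the HTTP request line,
 * for example "GET /index.html HTTP/1.0". */
enum action action_for_request(const char *request_line)
{
    if (strncmp(request_line, "HEAD ", 5) == 0)
        return SEND_HEADER_ONLY;
    if (strncmp(request_line, "GET ", 4) == 0)
        return SEND_HEADER_AND_FILE;
    if (strncmp(request_line, "POST ", 5) == 0)
        return RUN_CGI;
    return REJECT;
}
```

    A single-tasking server in this style would call open_listener once at start-up

    and then apply action_for_request to each request it reads inside its main loop.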

    2.3 EMBEDDED LINUX

    Linux running on an embedded system is known as Embedded Linux. A minimal Linux

    configuration can occupy on the order of 100 KB of memory. Nowadays most embedded

    systems (ES) are based on 32-bit processors such as ARM, PowerPC and ColdFire and

    have a sufficient amount of flash and RAM. Linux is one of the favourite operating

    systems for ES, for the following reasons:

    1. Linux is compact and occupies little space in memory.

    2. Linux has real-time capabilities. Since the 2.6.x kernel releases, the Linux

    kernel can be built as a preemptible kernel, which suits soft real-time use.

    3. Linux is fully configurable: you can include only the components you need and

    leave out the rest.

    4. Linux supports virtual memory. This is a particular requirement of safety-critical

    products such as aeroplanes, trains and nuclear reactors.

    5. Linux supports all major devices, such as USB, webcams and printers, and various

    file systems such as FAT, NFS and FFS.

    6. Linux is open source, so the user can configure it fully at every level.

    7. ES are designed to be kept at a low price. This makes Linux a more suitable OS,

    because it is free.

    8. Linux is fully supported by its community.

    9. Commercial Linux distributions are also available from vendors such as

    MontaVista, TimeSys and Wind River.

    10. Linux supports more than 150 processors.

    2.4 FEATURES OF LINUX

    - Monolithic kernel with support for a modular architecture.

    - Protected mode, so programs or users cannot access unauthorized areas.

    - Networking with TCP/IP and other protocols.

    - Multiple user capability.

    - Shared libraries.

    - True multitasking.

    - X: a graphical user interface similar to Windows, which also supports remote

    sessions over a network.

    2.5 FEATURES OF EMBEDDED LINUX

    Configurable kernel: configurable features, configurable size, configurable

    functionality.

    Device support: a wide range of devices is supported, such as USB and Ethernet.

    Royalty free: no royalty has to be paid for any type of product.

    Support for many embedded applications: database (SQLite, Metalite), web server

    (Boa, thttpd), graphics (PEG, Nano).

    Open source: the source code can be customized for the specific needs of an

    embedded system.

    2.6 BENEFIT OF USING EMBEDDED LINUX

    Embedded Linux offers many benefits:

    1. Vendor independence

    Using Linux means you no longer depend on a particular vendor for the supply of

    tools. Everything in Linux is available from the open source community. The service

    model of all Linux vendors is also much the same: they provide the Linux kernel,

    libraries and so on, so a user can easily switch from one vendor to another.

    Even if a user wants to work without a vendor, everything is freely available. In

    that case, however, the integration work and BSP development have to be done by the

    user.

    2. Easy availability of tools

    Many development tools and utilities for embedded Linux are readily available. Users

    can download and use them freely, which shortens development time for embedded

    system products.

    3. Broad hardware support

    The Linux community is very active and regularly adds support for new hardware.

    Linux is used in research laboratories and universities worldwide, so it stays up to

    date with the latest hardware.

    4. Low-cost development

    Using Linux in an embedded system product keeps development costs low. Linux

    development tools are free and easily available, and Linux is royalty free: no

    royalty has to be paid however many products are made.

    2.7 EMBEDDED LINUX DEVELOPMENT ENVIRONMENT

    Fig.2.3. Embedded Linux Setup[4]

    The following are essential for an embedded Linux setup:

    1. Embedded system development board (like ARM9 board)

    2. Host PC

    3. Serial cable

    4. Ethernet cross cable

    5. Embedded Linux kernel running on the board

    The serial connection is used to bring up a shell on the host PC. The Ethernet

    connection is used for downloading the kernel and for debugging.

    The host must have the following tools:

    1. Eclipse IDE

    2. GCC toolchain for Embedded Linux

    3. TFTP (Trivial File Transfer Protocol) server, used for downloading modules

    CHAPTER 3

    BLOCK DIAGRAM

    3.1 DESIGN OF THE HARDWARE SYSTEM

    The S3C2440AL processor is used as the core of the hardware platform in this paper.

    Fig. 3.1 is the block diagram of the hardware system, which includes a serial port,

    an Ethernet interface, a JTAG port, the storage system and so on. The Samsung

    S3C2440AL runs at 400 MHz, with a maximum of 533 MHz. To suit its internal clock

    circuit, a 12 MHz crystal is chosen. JTAG (Joint Test Action Group) is an

    international test protocol standard. Software simulation, single-step debugging and

    U-Boot download can all be carried out through the JTAG port, which makes it a

    simple and efficient means of developing and debugging embedded systems. The SDRAM

    in the system has a capacity of 64 MB, a working voltage of 3.3 V, a 32-bit data bus

    and a clock frequency of up to 100 MHz. Both Auto-Refresh and Self-Refresh are

    supported.

    Fig.3.1 Block diagram of hardware system[1]

    To support booting from NAND flash, the S3C2440AL provides an internal SRAM buffer

    called the Steppingstone. When the system starts, the first 4 KB of the NAND flash

    is loaded into the Steppingstone and executed. In general this startup code copies

    the contents of the NAND flash to the SDRAM. When ECC is used, the data read from

    NAND flash is checked for validity. Once the copy is complete, the main program is

    executed from the SDRAM[5]. The S3C2440AL UART provides three serial I/O ports, each

    of which can operate in interrupt or DMA mode. The UART supports a maximum baud rate

    of 115.2 kbps when using the system clock. Each UART channel contains two 64-byte

    FIFOs, one for the receiver and one for the transmitter. The LCD interface of the

    S3C2440AL integrates a 4-wire resistive touch screen interface, which can be

    connected directly to a four-wire resistive touch screen.

    3.2 DESIGN OF THE SOFTWARE SYSTEM

    Software development on top of an OS includes setting up the cross-compiler, porting

    the bootloader, porting embedded Linux and developing the embedded Web server.

    First, a cross-compilation environment based on EABI-4.3.3 is established. Next,

    U-Boot, developed by the German DENX group, is used as the bootloader. The function

    of the bootloader is to initialise the hardware devices and establish the memory

    mapping tables, thereby setting up a suitable hardware and software environment in

    preparation for the final call to the operating system kernel. In addition, a yaffs

    file system is built.

    3.2.1 The Transplant Of Linux Kernel

    Linux is used as the operating system because it has a layered structure and its

    kernel source is completely open. An important feature of Linux is its portability:

    it supports a wide range of hardware platforms and runs on most architectures. It

    also comes with a comprehensive set of editing, debugging and other development

    tools, a graphical interface, powerful networking support and a rich set of

    applications. In addition, the kernel can be reduced in size through configuration.

    Porting includes the following steps:

    1) Kernel configuration. make menuconfig is used to configure the Linux kernel:

    enable TCP/IP support under Networking options, enable Network device support under

    Device Drivers, and select DM9000 support under Ethernet (10 or 100Mbit).

    2) Modify the corresponding kernel code.

    3) Adjust the linker script.

    4) Mount the file system.

    5) Driver porting (USB device driver, LCD driver, etc.).

    The Linux development platform is ready once the above components have been

    compiled and written to the flash of the system board.

    3.2.2 Set Up Of Web Server

    1) Download Boa source[7]. The boa-0.94.13 version is used in this paper.

    2) Decompress and compile the Boa source. Run ./configure to generate the Makefile,

    then edit the Makefile, changing CC=gcc and CPP=gcc -E to CC=arm-linux-gcc and

    CPP=arm-linux-g++ -E.

    3) Compile and optimize.

    4) Configure Boa.

    5) Test Boa run.

    NFS is generally used for testing. First, copy the Boa web server binary to the

    sbin/ directory of the file system and copy the Boa configuration file "boa.conf" to

    the etc/boa/ directory of the file system. Next, add the application to the file

    system: in the shared directory, create a web directory containing the cgi-bin

    directory and the static HTML page index.html. Last, boot the target board over NFS.

    With the target board's IP address set to 10.16.12.149, open a browser on the PC and

    enter http://10.16.12.149. The page will appear as shown in Figure 3.

    3.2.3 Implement of Dynamic Web Pages

    Many technologies can be used to implement dynamic Web pages. Commonly used ones

    include CGI, ASP, PHP and JSP. On Linux, CGI is often used to implement dynamic

    pages.

    1) Overview on CGI

    CGI gives the Web server a way to execute external programs, which enables

    interaction between the browser and the server. CGI programs can be written in any

    programming language, for example Shell, Fortran, Pascal, Perl or C. CGI programs

    written in C have advantages in execution speed and security, so C is used for CGI

    program design in this paper.

    Fig. 3.2. Work process of CGI

    The CGI work process is shown in Fig. 3.2. The concrete steps are as follows[1]:

    a) The user makes a request to the Web server from the client browser.

    b) The Web server examines the request. If the request is for a static file, the

    Web server transfers the file directly to the client browser. Otherwise it activates

    the CGI program.

    c) The Web server daemon creates a child process, which sets the environment

    variables and establishes two standard I/O data channels between the server and the

    external CGI process.

    d) The Web server starts the CGI program specified by the URL. The CGI program reads

    and processes the client's input data through environment variables and standard

    input (stdin), and calls the appropriate external program according to the request.

    e) After processing and formatting, the CGI program passes the result to the server

    daemon through standard output (stdout).

    2) Standard of CGI interface

    The CGI interface standard covers standard input, environment variables and standard

    output. Clients mostly submit requests through a FORM. A FORM can be submitted with

    either the GET method or the POST method, and which one is used depends on the

    METHOD setting of the FORM. With the GET method, the CGI program obtains the data

    from the environment variable QUERY_STRING. GET is usually used to retrieve data

    from the server without changing it. If the string is too long, however, the POST

    method is generally used. With POST, the Web server passes the data to the CGI

    program through stdin. No EOF character marks the end of the data, so the

    environment variable CONTENT_LENGTH should be used to determine how many bytes to

    read. Standard output (stdout) carries the results produced by the CGI program. The

    output consists of the MIME headers followed by the content the browser actually

    displays, such as HTML source code.
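
    The GET and POST conventions described above can be sketched as a single C helper.

    This is an illustrative sketch, not code from the project: the function name

    read_form_data and the limit MAX_FORM_BYTES are hypothetical, while REQUEST_METHOD,

    QUERY_STRING and CONTENT_LENGTH are the standard CGI environment variables.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical upper bound on the amount of POST data accepted. */
#define MAX_FORM_BYTES 4096L

/* Hypothetical helper: return the raw (still URL-encoded) form data as a
 * newly allocated string, or NULL on error. The caller must free() it.
 * `in` is the stream carrying the POST body; a CGI program passes stdin. */
char *read_form_data(FILE *in)
{
    const char *method = getenv("REQUEST_METHOD");
    if (method == NULL)
        return NULL;

    if (strcmp(method, "GET") == 0) {
        /* GET: the data arrives in the QUERY_STRING environment variable. */
        const char *query = getenv("QUERY_STRING");
        if (query == NULL)
            query = "";
        char *copy = malloc(strlen(query) + 1);
        if (copy != NULL)
            strcpy(copy, query);
        return copy;
    }

    if (strcmp(method, "POST") == 0) {
        /* POST: no EOF marks the end of the data, so read exactly
         * CONTENT_LENGTH bytes from the input stream. */
        const char *length_text = getenv("CONTENT_LENGTH");
        if (length_text == NULL)
            return NULL;

        char *end = NULL;
        long length = strtol(length_text, &end, 10);
        if (end == length_text || *end != '\0' ||
            length < 0 || length > MAX_FORM_BYTES)
            return NULL;

        char *body = malloc((size_t)length + 1);
        if (body == NULL)
            return NULL;
        size_t got = fread(body, 1, (size_t)length, in);
        body[got] = '\0';
        return body;
    }

    return NULL;  /* other methods are not handled in this sketch */
}
```

    Bounding the length before allocating is a deliberate choice for a device with

    little memory: a request that claims an oversized body is rejected rather than

    buffered.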

    3) Invoke of CGI program

    A CGI program is generally called in one of two ways:

    (1) Directly.

    (2) Combined with a FORM.

    When the user submits an HTML FORM, the Web browser first encodes the data from the

    FORM in the following format: name1=value1&name2=value2&name3=value3&name4=value4&...

    This string is URL encoded, so the CGI program has to parse and decode it,

    converting the special characters back into the corresponding ASCII characters.

    These special characters are:

    +: converted to the space character; %xx: converted to the ASCII character whose

    hexadecimal code is xx. Last, the CGI program sends the final results back to

    the client[9].
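
    The two decoding rules above (+ becomes a space, %xx becomes the character with

    hexadecimal code xx) can be written as a short in-place decoder. This is a minimal

    sketch: the function names url_decode and hex_value are hypothetical, and malformed

    escapes are simply copied through unchanged.

```c
#include <stddef.h>

/* Return the value of one hexadecimal digit, or -1 if `c` is not one. */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Hypothetical helper: decode a URL-encoded string in place.
 * The result is never longer than the input, so no extra buffer is needed. */
void url_decode(char *s)
{
    char *out = s;

    while (*s != '\0') {
        if (*s == '+') {
            /* '+' stands for a space character. */
            *out++ = ' ';
            s++;
        } else if (s[0] == '%' && hex_value(s[1]) >= 0 && hex_value(s[2]) >= 0) {
            /* "%xx" stands for the character with hexadecimal code xx. */
            *out++ = (char)(hex_value(s[1]) * 16 + hex_value(s[2]));
            s += 3;
        } else {
            /* Ordinary character, or a malformed escape: copy as is. */
            *out++ = *s++;
        }
    }
    *out = '\0';
}
```

    In a complete CGI program the string would normally be split at & and = first and

    each name and value decoded separately, so that an encoded & or = inside a value is

    not mistaken for a separator.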

    4) Testing of dynamic Web page

    A dynamic Web page can be tested once CGI support is in place. First a simple

    helloweb.c file is written, then it is compiled and copied to the cgi-bin directory:

    # arm-linux-gcc -o helloweb.cgi helloweb.c

    # cp helloweb.cgi /opt/EmbedSky/root_nfs/web/cgi-bin

    Finally, http://10.16.12.149/cgi-bin/helloweb.cgi is entered in the browser on the

    PC and the test page (helloweb.cgi) opens, as shown in Fig. 3.3.

    Fig.3.3 Dynamic page of helloweb.cgi[1]
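
    The source does not list helloweb.c itself. A minimal CGI test program of this kind

    might look like the sketch below, which is an assumption for illustration only: the

    function name write_hello_page and the page text are hypothetical. It shows the

    required output order: the MIME header, a blank line, then the HTML body.

```c
#include <stdio.h>

/* Hypothetical helper: write a complete CGI response to `out`.
 * Returns 0 on success and -1 if writing fails. */
int write_hello_page(FILE *out)
{
    static const char response[] =
        /* MIME header, terminated by an empty line. */
        "Content-Type: text/html\r\n"
        "\r\n"
        /* HTML body shown by the browser. */
        "<html><head><title>helloweb</title></head>\n"
        "<body><h1>Hello from the embedded Web server</h1></body></html>\n";

    if (fputs(response, out) == EOF)
        return -1;
    return fflush(out) == EOF ? -1 : 0;
}
```

    A complete helloweb.c would add a main function that calls

    write_hello_page(stdout), because the Web server reads the CGI program's standard

    output and forwards it to the browser.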

    REFERENCES

    [1] Mo Guan; Minghai Gu, Design and Implementation of an Embedded Web Server

    Based on ARM, pp. 612-615, 2010.

    [2] Yakun Liu; Xiaodong Cheng, Design and Implementation of Embedded Web

    Server Based on ARM and Linux, 30-31 May 2010, pp. 316-319.

    [3] www.wikipedia.com

    [4] embeddedcraft.org/embeddedlinux.html
