8/8/2019 The ARM is a 32
1/32
Project 2010 Design And Implementation Of
An Embedded Web Server
Based On ARM
CHAPTER 1
LITERATURE SURVEY
The ARM is a 32-bit reduced instruction set computer (RISC) instruction set
architecture (ISA) developed by ARM Holdings. It was known as the Advanced RISC
Machine, and before that as the Acorn RISC Machine. The ARM architecture is the
most widely used 32-bit ISA in terms of numbers produced. ARM processors were originally
conceived as processors for desktop personal computers by Acorn Computers, a
market now dominated by the x86 family used by IBM PC compatible and Apple
Macintosh computers. The relative simplicity of ARM processors made them suitable
for low-power applications. This has made them dominant in the mobile and
embedded electronics market as relatively low-cost, small microprocessors and
microcontrollers.
As of 2007, about 98 percent of the more than one billion mobile phones sold each
year use at least one ARM processor. As of 2009, ARM processors account for
approximately 90% of all embedded 32-bit RISC processors. ARM processors are
used extensively in consumer electronics, including PDAs, mobile phones, digital
media and music players, hand-held game consoles, calculators and computer
peripherals such as hard drives and routers.
The ARM architecture is licensable. Companies that are current or former ARM
licensees include Alcatel-Lucent, Apple Inc., Atmel, Broadcom, Cirrus Logic, Digital
Equipment Corporation, Freescale, Intel (through DEC), LG, Marvell Technology
Group, Microsoft, NEC, Nuvoton, Nvidia, NXP (previously Philips), Oki, Qualcomm,
Samsung, Sharp, STMicroelectronics, Symbios Logic, Texas Instruments, VLSI
Technology, Yamaha and ZiiLABS.
ARM processors are developed by ARM and by ARM licensees. Prominent ARM
processor families developed by ARM Holdings include the ARM7, ARM9, ARM11
and Cortex. Notable ARM processors developed by
licensees include DEC StrongARM, Freescale i.MX, Marvell (formerly Intel) XScale,
Department of ECE, SBCE
Nintendo, Nvidia Tegra, ST-Ericsson Nomadik, Qualcomm Snapdragon, the Texas
Instruments OMAP product line, the Samsung Hummingbird and the Apple A4[3].
1.1 HISTORY
After achieving some success with the BBC Micro computer, Acorn Computers Ltd
considered how to move on from the relatively simple MOS Technology 6502
processor to address business markets like the one that would soon be dominated by
the IBM PC, launched in 1981. The Acorn Business Computer (ABC) plan required a
number of second processors to be made to work with the BBC Micro platform, but
processors such as the Motorola 68000 and National Semiconductor 32016 were
unsuitable, and the 6502 was not powerful enough for a graphics based user
interface.
Acorn would need a new architecture, having tested all of the available processors
and found them wanting. Acorn then seriously considered designing its own
processor, and their engineers came across papers on the Berkeley RISC project.
They felt it showed that if a class of graduate students could create a competitive 32-bit
processor, then Acorn would have no problem. A trip to the Western Design
Center in Phoenix showed Acorn engineers Steve Furber and Sophie Wilson that
they did not need massive resources and state-of-the-art R&D facilities.
Wilson set about developing the instruction set, writing a simulation of the processor
in BBC Basic that ran on a BBC Micro with a 6502 second processor. It convinced
the Acorn engineers that they were on the right track. Before they could go any
further, however, they would need more resources. It was time for Wilson to
approach Acorn's CEO, Hermann Hauser, and explain what was afoot. Once the go-
ahead had been given, a small team was put together to implement Wilson's model
in hardware.
1.1.1 Acorn RISC Machine: ARM2
The official Acorn RISC Machine project started in October 1983. VLSI Technology,
Inc was chosen as silicon partner, since it already supplied Acorn with ROMs and
some custom chips. The design was led by Wilson and Furber, with a key design
goal of achieving low-latency input/output (interrupt) handling like the MOS
the design team in 1990 into a new company called Advanced RISC Machines Ltd.
For this reason, ARM is sometimes expanded as Advanced RISC Machine instead of
Acorn RISC Machine. Advanced RISC Machines became ARM Ltd when its parent
company, ARM Holdings plc, floated on the London Stock Exchange and NASDAQ
in 1998.
The new Apple-ARM work would eventually turn into the ARM6, first released in early
1992. Apple used the ARM6-based ARM 610 as the basis for their Apple Newton
PDA. In 1994, Acorn used the ARM 610 as the main CPU in their Risc PC
computers. DEC licensed the ARM6 architecture and produced the StrongARM. At
233 MHz this CPU drew only 1 Watt of power (more recent versions draw far less).
This work was later passed to Intel as a part of a lawsuit settlement, and Intel took
the opportunity to supplement their aging i960 line with the StrongARM. Intel later
developed its own high performance implementation known as XScale which it has
since sold to Marvell[3].
1.2 LICENSING GROWTH
The ARM core has remained largely the same size throughout these changes. ARM2
had 30,000 transistors, while the ARM6 grew to only 35,000. ARM's business has
always been to sell IP cores, which licensees use to create microcontrollers and
CPUs based on this core. The most successful implementation has been the
ARM7TDMI with hundreds of millions sold. The idea is that the Original Design
Manufacturer combines the ARM core with a number of optional parts to produce a
complete CPU, one that can be built on old semiconductor fabs and still deliver
substantial performance at a low cost. Atmel has been an early design center for
ARM7TDMI-based embedded systems.
ARM licensed about 1.6 billion cores in 2005. In 2005, about 1 billion ARM cores
went into mobile phones. As of January 2008, over 10 billion ARM cores have been
built, and in 2008 iSuppli predicted that by 2011, 5 billion ARM cores will be shipping
per year. As of January 2011, ARM states that over 15 billion ARM processors have
shipped.
The architecture used in smartphones, personal digital assistants and other mobile
devices is anything from ARMv5 in obsolete/low-end devices to ARMv7-A (Cortex-A) in
current high-end devices. XScale and ARM926 processors are ARMv5TE, and are
now more numerous in high-end devices than the StrongARM, ARM9TDMI and
ARM7TDMI based ARMv4 processors, but lower-end devices may use older cores
with lower licensing costs. ARMv6 processors represented a step up in performance
from standard ARMv5 cores, and are used in some cases, but Cortex processors
(ARMv7) now provide faster and more power-efficient options than all those previous
generations. Cortex-A targets applications processors, as needed by smartphones
that previously used ARM9 or ARM11. Cortex-R targets real-time applications, and
Cortex-M targets microcontrollers.
In 2009, some manufacturers introduced netbooks based on ARM architecture
CPUs, in direct competition with netbooks based on Intel Atom.
1.3 ARM CORES
ARM provides a summary of the numerous vendors who implement ARM cores in
their design. KEIL also provides a somewhat newer summary of vendors of ARM
based processors. ARM further provides a chart displaying an overview of the ARM
processor lineup with performance and functionality versus capabilities for the more
recent ARM7, ARM9, ARM11, Cortex-M, Cortex-R and Cortex-A device families.
1.4 ARCHITECTURE
From 1995 onwards, the ARM Architecture Reference Manual has been the primary
source of documentation on the ARM processor architecture and instruction set,
distinguishing interfaces that all ARM processors are required to support (such as
instruction semantics) from implementation details that may vary. The architecture
has evolved over time, and starting with the Cortex series of cores, three "profiles"
are defined:
"Application" profile: Cortex-A series
"Real-time" profile: Cortex-R series
"Microcontroller" profile: Cortex-M series
Profiles are allowed to subset the architecture. For example, the ARMv7-M profile
used by the Cortex-M3 core is notable in that it supports only the Thumb-2 instruction
set, and the ARMv6-M profile (used by the Cortex-M0) is a subset of the ARMv7-M
profile (supporting fewer instructions).
1.5 INSTRUCTION SET
To keep the design clean, simple and fast, the original ARM implementation was
hardwired without microcode, like the much simpler 8-bit 6502 processor used in
prior Acorn microcomputers.
1.6 RISC FEATURES
The ARM architecture includes the following RISC features:
- Load/store architecture.
- No support for misaligned memory accesses (now supported in ARMv6
  cores, with some exceptions related to load/store multiple word instructions).
- Uniform 16 × 32-bit register file.
- Fixed instruction width of 32 bits to ease decoding and pipelining, at the cost
  of decreased code density. Later, the Thumb instruction set increased code
  density.
- Mostly single-cycle execution.
To compensate for the simpler design, compared with contemporary processors like
the Intel 80286 and Motorola 68020, some additional design features were used:
- Conditional execution of most instructions, reducing branch overhead and
  compensating for the lack of a branch predictor.
- Arithmetic instructions alter condition codes only when desired.
- 32-bit barrel shifter which can be used without performance penalty with most
  arithmetic instructions and address calculations.
- Powerful indexed addressing modes.
- A link register for fast leaf function calls.
- Simple, but fast, 2-priority-level interrupt subsystem with switched register
  banks.
1.7 CONDITIONAL EXECUTION
The conditional execution feature (called predication) is implemented with a 4-bit
condition code selector (the predicate) on every instruction; one of the four-bit codes
is reserved as an "escape code" to specify certain unconditional instructions, but
nearly all common instructions are conditional. Most CPU architectures only have
condition codes on branch instructions.
This cuts down significantly on the encoding bits available for displacements in
memory access instructions, but on the other hand it avoids branch instructions when
generating code for small if statements.
One of the ways that Thumb code provides a denser encoding is to remove that
four-bit selector from non-branch instructions.
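As an illustration of the 4-bit condition selector described above, the following Python sketch extracts the condition field (bits 31 to 28) from a 32-bit ARM instruction word and evaluates it against the N, Z, C and V flags. The function names are illustrative, and the sample words (0x01A00001 for MOVEQ r0, r1 and 0xE1A00001 for an unconditional MOV r0, r1) are examples chosen here rather than taken from the project.

```python
# Condition field values (bits 31..28 of an ARM instruction word) and the
# flag test each one performs. Flags are passed as booleans n, z, c, v.
CONDITIONS = {
    0x0: ("EQ", lambda n, z, c, v: z),
    0x1: ("NE", lambda n, z, c, v: not z),
    0x2: ("CS", lambda n, z, c, v: c),
    0x3: ("CC", lambda n, z, c, v: not c),
    0x4: ("MI", lambda n, z, c, v: n),
    0x5: ("PL", lambda n, z, c, v: not n),
    0x6: ("VS", lambda n, z, c, v: v),
    0x7: ("VC", lambda n, z, c, v: not v),
    0x8: ("HI", lambda n, z, c, v: c and not z),
    0x9: ("LS", lambda n, z, c, v: (not c) or z),
    0xA: ("GE", lambda n, z, c, v: n == v),
    0xB: ("LT", lambda n, z, c, v: n != v),
    0xC: ("GT", lambda n, z, c, v: (not z) and n == v),
    0xD: ("LE", lambda n, z, c, v: z or n != v),
    0xE: ("AL", lambda n, z, c, v: True),
}


def condition_field(word):
    """Return the 4-bit condition selector of a 32-bit ARM instruction."""
    return (word >> 28) & 0xF


def would_execute(word, n=False, z=False, c=False, v=False):
    """Return True if the instruction would execute with the given flags."""
    cond = condition_field(word)
    if cond == 0xF:
        # The escape code: treated here as "always executes", since it
        # selects a space of unconditional instructions.
        return True
    _, test = CONDITIONS[cond]
    return bool(test(n, z, c, v))


moveq_executes_when_equal = would_execute(0x01A00001, z=True)
moveq_executes_when_not_equal = would_execute(0x01A00001, z=False)
```

With the Z flag set, the MOVEQ word executes; with Z clear it is skipped without any branch being taken, which is the saving the text describes for small if statements.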
1.8 OTHER FEATURES
Another feature of the instruction set is the ability to fold shifts and rotates into the
"data processing" (arithmetic, logical, and register-register move) instructions, so
that, for example, the C statement
a += (j
strictly speaking, for them it's not possible to generate efficient code that would
behave the way one would expect for C objects of type "int16_t"[3].
1.9 PIPELINES AND OTHER IMPLEMENTATION ISSUES
The ARM7 and earlier implementations have a three-stage pipeline, the stages being
fetch, decode, and execute. Higher-performance designs, such as the ARM9, have
deeper pipelines; the Cortex-A8 has thirteen stages. Additional implementation changes
for higher performance include a faster adder, and more extensive branch prediction
logic. The difference between the ARM7DI and ARM7DMI cores, for example, was
an improved multiplier (hence the added "M").
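The effect of the three-stage pipeline can be sketched with a small model. The Python below assumes an idealized pipeline in which every stage takes one cycle and there are no stalls or branches; these are simplifying assumptions, and the function names are illustrative.

```python
STAGES = ("fetch", "decode", "execute")


def pipeline_schedule(instructions):
    """Map each cycle number to the instruction occupying each stage.

    Instruction i (0-based) enters fetch in cycle i + 1 and moves one
    stage further along in every following cycle.
    """
    schedule = {}
    for index, name in enumerate(instructions):
        for offset, stage in enumerate(STAGES):
            cycle = index + offset + 1
            schedule.setdefault(cycle, {})[stage] = name
    return schedule


def total_cycles(instruction_count, depth=len(STAGES)):
    """Cycles needed to complete a straight-line run of instructions."""
    if instruction_count == 0:
        return 0
    return instruction_count + depth - 1
```

Under these assumptions four instructions complete in six cycles, and once the pipeline is full one instruction finishes per cycle. A deeper pipeline only adds to the fill time, for example `total_cycles(4, depth=13)` for a thirteen-stage design.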
1.10 COPROCESSORS
The architecture provides a non-intrusive way of extending the instruction set using
"coprocessors" which can be addressed using MCR, MRC, MRRC, MCRR, and
similar instructions. The coprocessor space is divided logically into 16 coprocessors
with numbers from 0 to 15, coprocessor 15 (cp15) being reserved for some typical
control functions like managing the caches and MMU operation (on processors that
have one).
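To make the 16-coprocessor numbering concrete, the sketch below builds the 32-bit ARM-state encoding of an MRC or MCR instruction, in which the coprocessor number occupies a 4-bit field (bits 11 to 8). The bit layout follows the ARM-state MCR/MRC encoding as documented in the ARM Architecture Reference Manual; the function names and the two sample instructions are illustrative choices, not part of the project.

```python
def encode_coproc_transfer(coproc, opc1, rt, crn, crm, opc2, to_arm, cond=0xE):
    """Encode MRC (to_arm=True) or MCR (to_arm=False) as a 32-bit word.

    coproc: coprocessor number 0..15 (15 is the system control coprocessor)
    rt:     ARM register number; crn, crm: coprocessor register numbers
    cond:   4-bit condition selector, 0xE meaning "always"
    """
    limits = {"coproc": (coproc, 15), "opc1": (opc1, 7), "rt": (rt, 15),
              "crn": (crn, 15), "crm": (crm, 15), "opc2": (opc2, 7),
              "cond": (cond, 15)}
    for name, (value, maximum) in limits.items():
        if not 0 <= value <= maximum:
            raise ValueError("%s out of range: %r" % (name, value))
    return ((cond << 28) | (0b1110 << 24) | (opc1 << 21) | (int(to_arm) << 20)
            | (crn << 16) | (rt << 12) | (coproc << 8) | (opc2 << 5)
            | (1 << 4) | crm)


def coprocessor_number(word):
    """Extract the 4-bit coprocessor number from an MRC/MCR word."""
    return (word >> 8) & 0xF


# MRC p15, 0, r0, c0, c0, 0 : read a cp15 identification register into r0
read_id = encode_coproc_transfer(15, 0, 0, 0, 0, 0, to_arm=True)
# MCR p15, 0, r0, c7, c10, 4 : write r0 to a cp15 cache/barrier operation
write_op = encode_coproc_transfer(15, 0, 0, 7, 10, 4, to_arm=False)
```

Because the field is only four bits wide, values outside 0 to 15 cannot be encoded, which is why the helper rejects them.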
In ARM-based machines, peripheral devices are usually attached to the processor by
mapping their physical registers into ARM memory space or into the coprocessor
space or connecting to another device (a bus) which in turn attaches to the
processor. Coprocessor accesses have lower latency so some peripherals (for
example XScale interrupt controller) are designed to be accessible in both ways
(through memory and through coprocessors). In other cases, chip designers only
integrate hardware using the coprocessor mechanism. For example, an image
processing engine might be a small ARM7TDMI core combined with a coprocessor
that has specialized operations to support a specific set of HDTV transcoding
primitives.
1.11 DEBUGGING
All modern ARM processors include hardware debugging facilities; without them,
software debuggers could not perform basic operations like halting, stepping, and
breakpointing of code starting from reset. These facilities are built using JTAG
support, though some newer cores optionally support ARM's own two-wire "SWD"
protocol. In ARM7TDMI cores, the "D" represented JTAG debug support, and the "I"
represented the presence of an "EmbeddedICE" debug module. For ARM7 and ARM9
core generations, EmbeddedICE over JTAG was a de facto debug standard,
although it was not architecturally guaranteed.
The ARMv7 architecture defines basic debug facilities at an architectural level. These
include breakpoints, watchpoints, and instruction execution in a "Debug Mode";
similar facilities were also available with EmbeddedICE. Both "halt mode" and
"monitor mode" debugging are supported. The actual transport mechanism used to
access the debug facilities is not architecturally specified, but implementations
generally include JTAG support.
There is a separate ARM "CoreSight" debug architecture, which is not architecturally
required by ARMv7 processors[3].
1.12 DSP ENHANCEMENT INSTRUCTIONS
To improve the ARM architecture for digital signal processing and multimedia
applications, a few new instructions were added to the set. These are signified by an
"E" in the name of the ARMv5TE and ARMv5TEJ architectures. E-variants also imply
T, D, M and I.
The new instructions are common in digital signal processor architectures. They are
variations on signed multiply-accumulate, saturated add and subtract, and count
leading zeros.
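The behaviour of two of these operations can be sketched in a few lines. The Python below models saturated 32-bit signed add and subtract and a 32-bit count of leading zeros; it is a behavioural sketch only, the function names are illustrative, and status flags that real hardware may update on saturation are not modelled.

```python
INT32_MAX = 2**31 - 1
INT32_MIN = -(2**31)


def saturate32(value):
    """Clamp an integer to the signed 32-bit range instead of wrapping."""
    return max(INT32_MIN, min(INT32_MAX, value))


def saturating_add(a, b):
    """Signed 32-bit add that sticks at the limits on overflow."""
    return saturate32(a + b)


def saturating_sub(a, b):
    """Signed 32-bit subtract that sticks at the limits on overflow."""
    return saturate32(a - b)


def count_leading_zeros(value):
    """Number of zero bits above the highest set bit of a 32-bit word."""
    value &= 0xFFFFFFFF
    return 32 - value.bit_length()
```

Saturation is what makes these operations useful for audio and similar signals: an overflowing sample clips to the maximum value rather than wrapping round to a large negative one.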
1.13 JAZELLE
Jazelle is a technique that allows Java bytecode to be executed directly in the ARM
architecture as a third execution state (and instruction set) alongside the existing
ARM and Thumb states. Support for this state is signified by the "J" in the ARMv5TEJ
architecture, and in ARM9EJ-S and ARM7EJ-S core names. Support for this state is
required starting in ARMv6 (except for the ARMv7-M profile), although newer cores
only include a trivial implementation that provides no hardware acceleration.
1.14 THUMB
To improve compiled code-density, processors since the ARM7TDMI have featured
the Thumb instruction set state. (The "T" in "TDMI" indicates the Thumb feature.)
When in this state, the processor executes the Thumb instruction set, a variable-
length instruction set providing 32-bit and 16-bit instructions. Most of the Thumb
instructions are directly mapped to normal ARM instructions. The space-saving
comes from making some of the instruction operands implicit and limiting the number
of possibilities compared to the ARM instructions executed in the ARM instruction set
state.
In Thumb, the 16-bit opcodes have less functionality. For example, only branches
can be conditional, and many opcodes are restricted to accessing only half of all of
the CPU's general purpose registers. The shorter opcodes give improved code
density overall, even though some operations require extra instructions. In situations
where the memory port or bus width is constrained to less than 32 bits, the shorter
Thumb opcodes allow increased performance compared with 32-bit ARM code, as
less program code may need to be loaded into the processor over the constrained
memory bandwidth.
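A rough worked example shows why shorter opcodes can win on a narrow bus. The figures below are assumptions chosen for illustration (a 16-bit bus, a routine of 1000 ARM instructions, and a Thumb version needing 30 percent more instructions); they are not measurements from this project.

```python
import math


def bus_transfers(instruction_count, instruction_bits, bus_bits):
    """Bus accesses needed to fetch a run of fixed-width instructions."""
    per_instruction = math.ceil(instruction_bits / bus_bits)
    return instruction_count * per_instruction


BUS_BITS = 16                 # assumed memory bus width
ARM_INSTRUCTIONS = 1000       # assumed size of the routine in ARM code
THUMB_EXTRA_PERCENT = 30      # assumed extra instructions needed in Thumb

thumb_instructions = ARM_INSTRUCTIONS * (100 + THUMB_EXTRA_PERCENT) // 100

arm_fetches = bus_transfers(ARM_INSTRUCTIONS, 32, BUS_BITS)
thumb_fetches = bus_transfers(thumb_instructions, 16, BUS_BITS)
```

Under these assumptions the ARM version needs 2000 bus accesses and the Thumb version 1300, even though the Thumb version executes more instructions. On a full 32-bit bus the ARM code needs only one access per instruction, so the advantage disappears.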
Embedded hardware, such as the Game Boy Advance, typically has a small amount
of RAM accessible with a full 32-bit datapath; the majority is accessed via a 16-bit or
narrower secondary datapath. In this situation, it usually makes sense to compile
Thumb code and hand-optimise a few of the most CPU-intensive sections using full
32-bit ARM instructions, placing these wider instructions into the memory that is
accessible over the 32-bit bus.
The first processor with a Thumb instruction decoder was the ARM7TDMI. All ARM9
and later families, including XScale, have included a Thumb instruction decoder.
1.15 Thumb-2
Thumb-2 technology made its debut in the ARM1156 core, announced in 2003.
Thumb-2 extends the limited 16-bit instruction set of Thumb with additional 32-bit
instructions to give the instruction set more breadth. A stated aim for Thumb-2 is to
achieve code density similar to Thumb with performance similar to the ARM
instruction set on 32-bit memory. In ARMv7 this goal can be said to have been met.
Thumb-2 extends both the ARM and Thumb instruction sets with yet more
instructions, including bit-field manipulation, table branches, and conditional
execution. A new "Unified Assembly Language" (UAL) supports generation of either
Thumb-2 or ARM instructions from the same source code; versions of Thumb seen
on ARMv7 processors are essentially as capable as ARM code (including the ability
to write interrupt handlers). This requires a bit of care, and use of a new "IT" (if-then)
instruction, which permits up to four successive instructions to execute based on a
tested condition. When compiling into ARM code this is ignored, but when compiling
into Thumb-2 it generates an actual instruction.
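The behaviour of an IT block can be sketched as follows. In assembly the mnemonic is "IT" followed by up to three further "T" (then) or "E" (else) letters, one letter per conditional instruction, with the first instruction always a "then". The Python below only models which instructions in the block execute for a given outcome of the tested condition; the function name is illustrative.

```python
def it_block_execution(mnemonic, condition_true):
    """Return, per instruction in the IT block, whether it executes.

    mnemonic: e.g. "IT", "ITE", "ITTEE"; the letters after the leading
              "I" describe the one to four instructions in the block.
    condition_true: outcome of the condition named by the IT instruction.
    """
    text = mnemonic.upper()
    slots = text[1:]
    if not text.startswith("IT") or not 1 <= len(slots) <= 4:
        raise ValueError("expected IT followed by up to three T/E letters")
    if set(slots) - {"T", "E"}:
        raise ValueError("only T and E are allowed after IT")
    # "T" instructions run when the condition holds, "E" when it does not.
    return [condition_true if slot == "T" else not condition_true
            for slot in slots]
```

For example, `it_block_execution("ITE", True)` yields `[True, False]`: the "then" instruction runs and the "else" instruction is skipped, mirroring a two-way if/else without a branch.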
All ARMv7 chips support the Thumb-2 instruction set. Some chips, such as the
Cortex-M3, support only the Thumb-2 instruction set. Other chips in the Cortex and
ARM11 series support both "ARM instruction set mode" and "Thumb-2 instruction set
mode".
1.16 THUMB EXECUTION ENVIRONMENT (THUMBEE)
ThumbEE, also known as Thumb-2EE, and marketed as Jazelle RCT (Runtime
Compilation Target), was announced in 2005, first appearing in the Cortex-A8
processor. ThumbEE is a fourth processor mode, making small changes to the
Thumb-2 extended Thumb instruction set. These changes make the instruction set
particularly suited to code generated at runtime (e.g. by JIT compilation) in managed
Execution Environments. ThumbEE is a target for languages such as Limbo, Java,
C#, Perl and Python, and allows JIT compilers to output smaller compiled code
without impacting performance.
New features provided by ThumbEE include automatic null pointer checks on every
load and store instruction, an instruction to perform an array bounds check, access to
registers r8-r15 (where the Jazelle/DBX Java VM state is held), and special
instructions that call a handler. Handlers are small sections of frequently called code,
commonly used to implement a feature of a high level language, such as allocating
memory for a new object. These changes come from repurposing a handful of
opcodes, and knowing the core is in the new ThumbEE mode.
1.17 VECTOR FLOATING POINT (VFP)
VFP (Vector Floating Point) technology is a coprocessor extension to the ARM
architecture. It provides low-cost single-precision and double-precision floating-point
computation fully compliant with the ANSI/IEEE Std 754-1985 Standard for Binary
Floating-Point Arithmetic. VFP provides floating-point computation suitable for a wide
spectrum of applications such as PDAs, smartphones, voice compression and
decompression, three-dimensional graphics and digital audio, printers, set-top boxes,
and automotive applications. The VFP architecture also supports execution of short
vector instructions but these operate on each vector element sequentially and thus
do not offer the performance of true SIMD (Single Instruction Multiple Data)
parallelism. This mode can still be useful in graphics and signal-processing
applications, however, as it allows a reduction in code size and instruction fetch and
decode overhead.
Other floating-point and/or SIMD coprocessors found in ARM-based processors
include FPA, FPE, iwMMXt. They provide some of the same functionality as VFP but
are not opcode-compatible with it.
1.18 ADVANCED SIMD (NEON)
The Advanced SIMD extension, marketed as NEON technology, is a combined 64-
and 128-bit single instruction multiple data (SIMD) instruction set that provides
standardized acceleration for media and signal processing applications. NEON can
execute MP3 audio decoding on CPUs running at 10 MHz and can run the GSM
AMR (Adaptive Multi-Rate) speech codec at no more than 13 MHz. It features a
comprehensive instruction set, separate register files and independent execution
hardware. NEON supports 8-, 16-, 32- and 64-bit integer and single-precision (32-bit)
floating-point data and uses SIMD operations to handle audio and video
processing as well as graphics and gaming workloads. In NEON, the SIMD hardware supports
up to 16 operations at the same time. The NEON hardware shares the same floating-point
registers as used in VFP.
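The figure of 16 simultaneous operations follows from the register and element widths given above: a 128-bit register holds sixteen 8-bit elements. The short sketch below tabulates the number of lanes for each combination; the names are illustrative.

```python
REGISTER_WIDTHS = (64, 128)        # NEON register widths in bits
ELEMENT_WIDTHS = (8, 16, 32, 64)   # supported element widths in bits


def lanes(register_bits, element_bits):
    """Number of elements processed by one SIMD operation."""
    if register_bits % element_bits:
        raise ValueError("element width must divide the register width")
    return register_bits // element_bits


lane_table = {(reg, elem): lanes(reg, elem)
              for reg in REGISTER_WIDTHS
              for elem in ELEMENT_WIDTHS}
```

The largest entry in `lane_table` is 16, for 8-bit elements in a 128-bit register, while 32-bit single-precision data gives four lanes in a 128-bit register.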
1.19 SECURITY EXTENSIONS (TRUSTZONE)
The Security Extensions, marketed as TrustZone Technology, are found in ARMv6KZ
and later application profile architectures. They provide a low-cost alternative to adding
an additional dedicated security core to a SoC, by providing two virtual processors
backed by hardware based access control. This enables the application core to
switch between two states, referred to as worlds (to reduce confusion with other
names for capability domains), in order to prevent information from leaking from the
more trusted world to the less trusted world. This world switch is generally orthogonal
to all other capabilities of the processor, thus each world can operate independently
of the other while using the same core. Memory and peripherals are then made
aware of the operating world of the core and may use this to provide access control
to secrets and code on the device. Typical applications of TrustZone Technology are
to run a rich operating system in the less trusted world, and smaller security-
specialized code in the more trusted world (known as TrustZone Software, a
TrustZone optimized version of the Trusted Foundations(TM) Software developed by
Trusted Logic), allowing much tighter Digital Rights Management for controlling the
use of media on ARM-based devices, and preventing any unapproved use of the
device.
In practice, since the specific implementation details of TrustZone are proprietary and
have not been publicly disclosed for review, it is unclear what level of assurance is
provided for a given threat model.
1.20 NO-EXECUTE PAGE PROTECTION
As of ARMv6, the ARM architecture supports no-execute page protection, which is
referred to as XN, for Execute Never.
1.21 ARM LICENSEES
ARM Ltd does not manufacture and sell CPU devices based on its own designs, but
rather, licenses the processor architecture to interested parties. ARM offers a variety
of licensing terms, varying in cost and deliverables. To all licensees, ARM provides
an integratable hardware description of the ARM core, as well as a complete software
development toolset (compiler, debugger, SDK), and the right to sell manufactured
silicon containing the ARM CPU. Fabless licensees, who wish to integrate an ARM
core into their own chip design, are usually only interested in acquiring a ready-to-
manufacture verified IP core. For these customers, ARM delivers a gate netlist
description of the chosen ARM core, along with an abstracted simulation model and
test programs to aid design integration and verification. More ambitious customers,
including integrated device manufacturers (IDM) and foundry operators, choose to
acquire the processor IP in synthesizable RTL (Verilog) form. With the synthesizable
RTL, the customer has the ability to perform architectural level optimizations and
extensions. This allows the designer to achieve exotic design goals not otherwise
possible with an unmodified netlist (high clock speed, very low power consumption,
instruction set extensions, etc.). While ARM does not grant the licensee the right to
resell the ARM architecture itself, licensees may freely sell manufactured product
(chip devices, evaluation boards, complete systems, etc.). Merchant foundries can be
a special case; not only are they allowed to sell finished silicon containing ARM
cores, they generally hold the right to remanufacture ARM cores for other customers.
Like most IP vendors, ARM prices its IP based on perceived value. In architectural
terms, the lower performance ARM cores command a lower license cost than the
higher performance cores. In terms of silicon implementation, a synthesizable core is
more expensive than a hard macro (black box) core. Complicating price matters, a
merchant foundry who holds an ARM license (such as Samsung and Fujitsu) can
offer reduced licensing costs to its fab customers. In exchange for acquiring the ARM
core through the foundry's in-house design services, the customer can reduce or
eliminate payment of ARM's upfront license fee. Compared to dedicated
semiconductor foundries (such as TSMC and UMC) without in-house design
services, Fujitsu/Samsung charge 2 to 3 times more per manufactured wafer. For low
to mid volume applications, a design service foundry offers lower overall pricing
(through subsidization of the license fee). For high volume mass produced parts, the
long term cost reduction achievable through lower wafer pricing reduces the impact
of ARM's NRE (Non-Recurring Engineering) costs, making the dedicated foundry a
better choice.
Many semiconductor or IC design firms hold ARM licenses; Analog Devices, Atmel,
Broadcom, Cirrus Logic, Energy Micro, Faraday Technology, Freescale, Fujitsu, Intel
(through its settlement with Digital Equipment Corporation), IBM, Infineon
Technologies, Nintendo, NXP Semiconductors, OKI, Qualcomm, Samsung, Sharp,
STMicroelectronics, Texas Instruments and VLSI are some of the many companies
who have licensed the ARM in one form or another.
1.22 APPROXIMATE LICENSING COSTS
ARM's 2006 annual report and accounts state that royalties totalling £88.7 million
($164.1 million) were the result of licensees shipping 2.45 billion units. This is
equivalent to £0.036 ($0.067) per unit shipped. However, this is averaged across all
cores, including expensive new cores and inexpensive older cores.
In the same year ARM's licensing revenues for processor cores were £65.2 million
(US$119.5 million), in a year when 65 processor licenses were signed, an average of
£1 million ($1.84 million) per license. Again, this is averaged across both new and old
cores.
Given that ARM's 2006 income from processor cores was approximately 60% from
royalties and 40% from licenses, ARM makes the equivalent of £0.06 ($0.11) per unit
shipped including both royalties and licenses. However, as one-off licenses are
typically bought for new technologies, unit sales (and hence royalties) are dominated
by more established products. Hence, the figures above do not reflect the true costs
of any single ARM product.
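The per-unit and per-license averages quoted above can be reproduced from the totals. The sketch below uses only the figures given in this section, treating the amounts without a dollar sign as pounds sterling; the variable names are illustrative.

```python
UNITS_SHIPPED = 2.45e9          # units shipped by licensees in 2006
ROYALTIES_GBP = 88.7e6          # royalty income, pounds
ROYALTIES_USD = 164.1e6         # royalty income, US dollars
LICENSE_REVENUE_GBP = 65.2e6    # licensing revenue, pounds
LICENSE_REVENUE_USD = 119.5e6   # licensing revenue, US dollars
LICENSES_SIGNED = 65            # processor licenses signed in 2006

# Average royalty per unit shipped.
royalty_per_unit_gbp = ROYALTIES_GBP / UNITS_SHIPPED
royalty_per_unit_usd = ROYALTIES_USD / UNITS_SHIPPED

# Average revenue per license signed.
revenue_per_license_gbp = LICENSE_REVENUE_GBP / LICENSES_SIGNED
revenue_per_license_usd = LICENSE_REVENUE_USD / LICENSES_SIGNED

# Royalties and licenses combined, spread over the units shipped.
combined_per_unit_gbp = (ROYALTIES_GBP + LICENSE_REVENUE_GBP) / UNITS_SHIPPED
combined_per_unit_usd = (ROYALTIES_USD + LICENSE_REVENUE_USD) / UNITS_SHIPPED

# Share of processor-core income that comes from royalties.
royalty_share = ROYALTIES_GBP / (ROYALTIES_GBP + LICENSE_REVENUE_GBP)
```

The royalty share works out at a little under 60%, consistent with the approximate 60/40 split stated in the text, and the combined figure is about six pence, or roughly eleven to twelve US cents, per unit.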
1.23 OPERATING SYSTEMS
1.23.1 Acorn systems
The very first ARM-based Acorn Archimedes personal computers ran an interim
operating system called Arthur, which evolved into RISC OS, used on later ARM-
based systems from Acorn and other vendors.
1.23.2 Embedded operating systems
The ARM architecture is supported by a large number of embedded and real-time
operating systems, including Windows CE, Symbian OS, eCos, INTEGRITY, Nucleus
PLUS, MicroC/OS-II, QNX, RTXC Quadros, ThreadX and VxWorks.
1.23.3 Unix-like
The ARM architecture is supported by Unix and Unix-like operating systems such as
GNU/Linux, BSD, Plan 9 from Bell Labs, Inferno, Solaris, Apple iOS, WebOS and
Android.
1.23.4 Windows
Microsoft announced on 5 January 2011 that the next major version of the Windows
NT family will include support for ARM processors. Microsoft demonstrated a
preliminary version of Windows (version 6.2.7867) running on an ARM-based
computer at the 2011 Consumer Electronics Show[3].
CHAPTER 2
PROBLEM STATEMENT
This project designs an embedded web server built around Samsung's ARM9-based S3C2440AL
processor and running the Linux operating system. First, the system hardware
architecture is presented. Then the process of porting the Linux operating system
to the ARM platform is introduced. The implementation of the Boa web server and the dynamic
interaction between the browser and the embedded system using CGI are analyzed in detail.
Finally, the implemented embedded web server is tested to show that it responds
rapidly and operates efficiently and stably, which achieves the intended design
goals[1].
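The CGI interaction mentioned above can be illustrated with a minimal sketch. Under CGI, the web server passes the request's query string to the program in the QUERY_STRING environment variable, and the program writes a header block, a blank line and the page body to standard output. The project's own CGI programs are not reproduced here; Python is used for brevity, and the `led` parameter and the page text are hypothetical.

```python
import html
import os
import sys
from urllib.parse import parse_qs


def build_response(query_string):
    """Build a complete CGI response (headers plus HTML body)."""
    params = parse_qs(query_string)
    # "led" is a hypothetical parameter name used only for illustration.
    led = params.get("led", ["unknown"])[0]
    body = ("<html><body><p>LED state requested: {}</p></body></html>"
            .format(html.escape(led)))
    # A blank line separates the CGI headers from the document body.
    return "Content-Type: text/html\r\n\r\n" + body


if __name__ == "__main__":
    # The web server sets QUERY_STRING before running the program.
    sys.stdout.write(build_response(os.environ.get("QUERY_STRING", "")))
```

A request such as `/cgi-bin/led.cgi?led=on` (a hypothetical URL) would then produce a page reporting the requested state. Escaping the value before placing it in the page avoids echoing raw markup from the browser.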
2.1 EMBEDDED WEB SERVER
An Embedded Web Server (EWS) is a Web server that runs on an embedded system
with limited computing resources and serves embedded Web documents to a Web
browser. By embedding a Web server into a network device, it is possible for an
EWS to provide a powerful Web-based management user interface constructed
using HTML, graphics and other features common to Web browsers. When applied
to embedded systems, Web technologies offer graphical user interfaces, which are
user-friendly, inexpensive, cross-platform, and network-ready. General Web servers,
which were developed for general purpose computers such as NT servers or Unix
and Linux workstations, typically require megabytes of memory, a fast processor, a
pre-emptive multitasking operating system, and other resources. A Web server can
be embedded in a device to provide remote access to the device from a Web
browser if the resource requirements of the Web server are reduced. The end result
of reducing the resource requirements of the Web server is typically a portable set of
code that can run on embedded systems with limited computing resources.
An embedded system can be used to serve embedded Web documents, including
static and dynamic information about the embedded system, to Web browsers. This
type of Web server is called an Embedded Web Server (EWS).
EWSs are used to convey the state information of embedded systems, such as a
system's working statistics, current configuration and operation results, to a Web
browser. EWSs are also used to transfer user commands from a Web browser to an
embedded system. The state information is extracted from an embedded system
application and the control command is implemented through the embedded system
application. In many instances, it makes sense for embedded Web software to be a
lightweight version of Web software. For network devices, such as routers, switches
and hubs, it is possible to place an EWS directly in the devices without additional
hardware.
The development of an EWS must take into account the relative scarcity of system
resources. An EWS must fit within the device's memory budget and limited
processing power. More difficult than overcoming the memory limitation is managing
the impact of Web request servicing on the system CPU. Because an EWS runs as a
process subordinate to the main purpose of the device, it must use as few CPU
resources as possible so as not to interfere with the main task of the system.
Generally, network devices require high reliability. As one embedded component of a
network device, an EWS must also be highly reliable. Because it is a subordinate
process, at the very least it must protect against propagation of internal failure to the
whole system. An EWS needs to run on a much broader range of embedded system
environments in terms of the facilities they provide, and with much tighter resource
constraints than mainstream computing hardware. Consequently, it must be highly
portable. Security is an important concern in many possible applications for
embedded Web server technology, especially ones that involve equipment
configuration or administration. It is often desirable to limit access to this
information to a specific set of users. While Web documents can be quickly prototyped with
readily available desktop authoring tools, the prototype must then be integrated into
the system software. An EWS works with a fixed set of integrated Web documents
that are usually frozen at the time the embedded system is manufactured, but then
incorporates dynamic information of system software at run-time. For rapid
development, an easy but powerful integration mechanism must be provided.
An embedded web server brings a Web server onto on-site monitoring and control
equipment. With the support of a suitable hardware platform and software system,
it turns traditional monitoring and control equipment into an Internet-based
device that uses TCP/IP as the underlying communication protocol and has Web
server technology at its core. The resources of embedded devices are limited, and
they are not able to handle many simultaneous user requests, so a Web server
specifically designed for embedded use is needed instead of Apache, which is
commonly used on Linux. In embedded Linux systems, the typical Web servers are
Boa, httpd, thttpd and so on. Httpd supports only static pages and does not
support CGI, so it is clearly inappropriate for advanced applications. thttpd and
Boa provide similar functions, except that thttpd requires considerably more
resources than Boa when running. Boa is a single-task HTTP server. Unlike a
traditional Web server, it does not fork a subprocess for each of several
simultaneous connections. Instead it multiplexes all ongoing HTTP connections
internally and forks only for CGI programs, automatic directory generation and
automatic file decompression (gunzip). This is vitally important for embedded
systems, because it saves system resources to the greatest possible extent.
Because of these advantages on an embedded platform, Boa is used as the Web
server in this paper. Its architecture is shown in Fig. 2.1.
Fig. 2.1. Architecture of Boa server[1]
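The single-task idea described above, one process multiplexing every open
connection instead of forking per connection, can be sketched as follows. This is
only an illustration under stated assumptions: Boa itself is written in C, and the
function name serve_until_idle, the handler callback and the idle timeout are
hypothetical names chosen for this sketch. It also assumes each request arrives in
a single read, which a real server cannot rely on.

```python
import selectors
import socket


def serve_until_idle(listener, handler, timeout=0.2):
    """Serve all clients of `listener` from ONE process; stop when idle for `timeout` s.

    `handler` is a hypothetical callback mapping request bytes to response bytes.
    Returns the number of requests answered.
    """
    sel = selectors.DefaultSelector()
    listener.setblocking(False)
    sel.register(listener, selectors.EVENT_READ)
    served = 0
    while True:
        events = sel.select(timeout)
        if not events:                      # nothing happened: leave the loop
            break
        for key, _mask in events:
            sock = key.fileobj
            if sock is listener:
                # New connection: add it to the set of watched sockets, no fork.
                conn, _addr = listener.accept()
                conn.setblocking(False)
                sel.register(conn, selectors.EVENT_READ)
            else:
                # Existing connection is readable: answer it and close it.
                data = sock.recv(4096)
                if data:
                    sock.sendall(handler(data))   # fine for small replies only
                    served += 1
                sel.unregister(sock)
                sock.close()
    sel.close()
    return served
```

All connections share one loop and one address space, which is where the saving
in memory and process-creation cost comes from. Only work that must run as a
separate program, such as a CGI script, would need a child process.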
2.2 THE CHOICE OF EMBEDDED WEB SERVER
Generally speaking, embedded devices have limited resources and do not need to
handle requests from many users simultaneously. Therefore they do not need
Apache, the most commonly used Linux Web server. Instead, a Web server specifically
designed for embedded devices is applied in such cases. This kind of Web server
requires relatively small storage space and less memory to run, which makes it quite
suitable for embedded applications.
There are three typical embedded Web servers, namely httpd, Boa and thttpd. As
the simplest of the three, httpd has the weakest functions. It does not support
authentication or CGI, while Boa and thttpd support both. If the Web server only
provides static web pages, such as simple online help and a system introduction,
then a static server can be adopted. If you need to improve system security or
interact with users, for example for real-time status queries and login, then you
have to use dynamic Web technologies. In that situation, either Boa or thttpd can
achieve these goals. In the present research we adopt Boa, a Web server well
suited to embedded systems, because thttpd offers fewer functions and
needs far more resources to run[2].
2.2.1 The Principle Of Embedded Web Server Boa
Boa is a single-task Web server. The difference between Boa and a traditional Web
server is that when a connection request arrives, Boa does not create a separate
process for each connection, nor does it handle multiple connections by copying
itself. Instead, Boa handles multiple connections by maintaining a list of HTTP
requests, and it forks a new process only for a CGI program. In this way, system
resources are saved to the largest extent. Like a common Web server, an embedded
Web server accomplishes tasks such as receiving requests from the client,
analyzing them, responding to them, and finally returning results to the client.
Its work process is as follows.
Complete the initialization of the Web server: create the environment variables,
create a socket, bind it to a port, listen on the port, enter the loop, and
wait for connection requests from a client.
When a connection request arrives from a client, the Web server receives the
request and saves the related information.
After receiving the connection request, Boa analyzes it by calling the analysis
module, which works out the request method, the URL target and the form
information. It then processes the request accordingly.
After the corresponding processing is finished, the Web server sends the response
to the client browser and then closes the TCP connection with the client. The
embedded Web server Boa responds differently to different request methods. If the
request method is HEAD, only the response header is sent to the browser. If the
request method is GET, then in addition to sending the response header, the server
reads the URL target file requested by the client and sends it to the client
browser. If the request method is POST, the form information is passed as a
parameter to the corresponding CGI program, the CGI program is executed, and
finally the results are sent to the client browser. Boa's
flowchart is shown in Fig. 2.2.
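The per-method behaviour described above can be summarized in a short sketch.
It is written in Python for brevity and is not Boa's actual code: the function
name build_response, the in-memory files dictionary and the run_cgi callback are
assumptions made for this illustration.

```python
def build_response(method, path, files, form_data=b"", run_cgi=None):
    """Return the raw HTTP response bytes for a single request.

    files    -- hypothetical in-memory document root: {url_path: content_bytes}
    run_cgi  -- hypothetical callback (path, form_data) -> CGI output bytes,
                where the CGI output carries its own header lines
    """
    if method == "POST":
        # POST: hand the form data to the CGI program and relay its output.
        if run_cgi is None:
            return b"HTTP/1.0 501 Not Implemented\r\n\r\n"
        return b"HTTP/1.0 200 OK\r\n" + run_cgi(path, form_data)
    if method not in ("HEAD", "GET"):
        return b"HTTP/1.0 501 Not Implemented\r\n\r\n"
    if path not in files:
        return b"HTTP/1.0 404 Not Found\r\n\r\n"
    body = files[path]
    header = ("HTTP/1.0 200 OK\r\n"
              "Content-Type: text/html\r\n"
              f"Content-Length: {len(body)}\r\n\r\n").encode("ascii")
    # HEAD: header only. GET: header followed by the target file.
    return header if method == "HEAD" else header + body
```

For the same URL, the HEAD response is exactly the GET response without the file
content, which is why a client can use HEAD to check a page cheaply.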
Fig. 2.2 Embedded Web server flowchart
2.3 EMBEDDED LINUX
Linux running on an embedded system is known as Embedded Linux. A minimal Linux
configuration can occupy only about 100 KB of memory. Nowadays most embedded
systems (ES) based on 32-bit processors such as ARM, PowerPC and ColdFire have a
sufficient amount of flash and RAM. Linux is in fact one of the favorite operating
systems for ES, for the following reasons:
1. Linux is compact and occupies little memory.
2. Linux has real-time capabilities. Since the 2.6.x kernels, the Linux kernel can
be built as a preemptible kernel, which improves its real-time behavior.
3. Linux is fully configurable: you can include only the components you need and
leave out the others.
4. Linux supports virtual memory. This is a particular requirement of safety-critical
products such as aeroplanes, trains and nuclear reactors.
5. Linux supports all major devices such as USB, webcams and printers, and various
file systems such as FAT, NFS and FFS.
6. Linux is open source, so the user can configure it fully at every level.
7. Embedded systems are designed to be kept at a low price. This makes Linux a more
suitable OS, because it is free.
8. Linux is fully supported by its community.
9. Commercial Linux distributions are also available from vendors such as
MontaVista, TimeSys and Wind River.
10. Linux supports more than 150 processors.
2.4 FEATURES OF LINUX
- Linux is a monolithic kernel with support for a modular architecture.
- Protected mode, so programs or users cannot access unauthorized areas.
- Networking with TCP/IP and other protocols.
- Multiple user capability.
- Shared libraries.
- True multitasking.
- X: a graphical user interface similar to Windows, but one that supports remote
sessions over a network.
2.5 FEATURES OF EMBEDDED LINUX
Configurable kernel: configurable features, configurable size, configurable
functionality.
Device support: a wide range of devices is supported, such as USB and Ethernet.
Royalty free: there is no need to pay a royalty for any type of product.
Support for many embedded applications: databases (SQLite, Metalite), Web servers
(Boa, thttpd), graphics (PEG, Nano).
Open source: source code can be customized for the specific needs of an embedded
system.
2.6 BENEFIT OF USING EMBEDDED LINUX
Using embedded Linux brings many benefits:
1. Vendor independent
Using Linux means you no longer depend on a particular vendor for the supply of
tools. In Linux everything is available from the open source community. Even the
service model of all Linux vendors is almost the same: they provide the Linux
kernel, libraries and so on. The user can therefore easily switch from one vendor
to another.
Even if the user wants to go without a vendor, everything is freely available. In
that case, however, the integration work and BSP development have to be done by
the user.
2. Easy availability of tools
For embedded Linux, many development tools and utilities are easily available. The
user can download them and use them freely. This results in a short development
time for embedded system products.
3. Broad hardware support
The Linux community is very active and regularly adds support for new hardware.
Linux is used in research laboratories and universities worldwide, so it is
always up to date with the latest hardware support.
4. Low cost development
By using Linux in an embedded system product, we can develop low cost products.
Linux development tools are free and easily available. Linux is royalty free, so
there is no need to pay a royalty however many products are made.
2.7 EMBEDDED LINUX DEVELOPMENT ENVIRONMENT
Fig.2.3. Embedded Linux Setup[4]
The following are essential for an embedded Linux setup:
1. Embedded system development board (such as an ARM9 board)
2. Host PC
3. Serial cable
4. Ethernet crossover cable
5. Embedded Linux kernel running on the board
The serial connection is used to bring up a shell on the host PC. The Ethernet
connection is used for downloading the kernel and for debugging.
The host must have the following tools:
1. Eclipse IDE
2. GCC toolchain for Embedded Linux
3. TFTP (Trivial File Transfer Protocol) server, used for downloading modules
CHAPTER 3
BLOCK DIAGRAM
3.1 DESIGN OF THE HARDWARE SYSTEM
The S3C2440AL processor is used as the core of the hardware platform in this
paper. Fig. 3.1 is the block diagram of the hardware system, which includes a
serial port, an Ethernet interface, a JTAG port, storage systems and so on. The
Samsung S3C2440AL runs at 400 MHz and can reach a maximum of 533 MHz. To suit its
internal clock circuit, a 12 MHz crystal is chosen. JTAG (Joint Test Action Group)
is an international test protocol standard. Software simulation, single-step
debugging and u-boot download can be carried out through the JTAG port, which
makes it a simple and efficient means of developing and debugging embedded
systems. The SDRAM capacity in the system is 64 MB, its working voltage is 3.3 V,
its data bus is 32 bits wide and its clock frequency is up to 100 MHz. Auto-Refresh
and Self-Refresh are both supported.
Fig.3.1 Block diagram of hardware system[1]
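How a 12 MHz crystal yields a 400 MHz core clock can be illustrated with the
main PLL relation. The sketch below assumes the MPLL formula as given in the
S3C2440A user's manual, Mpll = (2 * m * Fin) / (p * 2^s) with m = MDIV + 8,
p = PDIV + 2 and s = SDIV. The divider values used in the comment (92, 1, 1) are
a commonly used setting and are an assumption here; this report does not state
which values the board uses. The function name is hypothetical.

```python
def s3c2440_mpll_hz(fin_hz, mdiv, pdiv, sdiv):
    """Main PLL output frequency in Hz for the given divider fields.

    Assumed formula (S3C2440A user's manual):
        Mpll = (2 * m * Fin) / (p * 2**s),  m = MDIV + 8, p = PDIV + 2, s = SDIV
    """
    m = mdiv + 8
    p = pdiv + 2
    return (2 * m * fin_hz) // (p * (1 << sdiv))


# With a 12 MHz crystal and the assumed dividers MDIV=92, PDIV=1, SDIV=1:
#   2 * 100 * 12 MHz / (3 * 2) = 400 MHz
core_clock_hz = s3c2440_mpll_hz(12_000_000, 92, 1, 1)
```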
To support a boot loader in NAND Flash, the S3C2440AL is equipped with an internal
SRAM buffer named Steppingstone. When the system starts, the first 4 KB of NAND
Flash are loaded into the Steppingstone and executed. In general, this startup
code then copies the contents of the NAND Flash to SDRAM. When hardware ECC is
used, the validity of the NAND Flash data is checked. Once the copy is complete,
the main program is executed from SDRAM[5]. The S3C2440AL UART provides three
serial I/O ports, each of which can operate in interrupt or DMA mode. The UART
supports a maximum baud rate of 115.2 Kbps when using the system clock. Each UART
channel includes two 64-byte FIFOs, one for the receiver and one for the
transmitter. The LCD interface of the S3C2440AL integrates a 4-wire resistive
touch screen interface, which can be connected directly to a four-wire resistive
touch screen.
3.2 DESIGN OF THE SOFTWARE SYSTEM
The OS-based software development process includes: establishing the
cross-compiler, porting the Bootloader, porting embedded Linux, and developing the
embedded Web server. To begin with, a cross-compilation environment using
EABI-4.3.3 is established. Next, U-Boot, developed by the German DENX group, is
used as the Bootloader. The function of the Bootloader is to initialize the
hardware devices and establish memory mapping tables, thereby setting up an
appropriate hardware and software environment in preparation for the final call to
the operating system kernel. In addition, a yaffs file system is built.
3.2.1 The Transplant Of Linux Kernel
Linux is used as the operating system because it has a hierarchical structure and
its kernel source is completely open. An important feature of Linux is its
portability: it supports a wide range of hardware platforms and can run on most
architectures. It also comes with a comprehensive set of editing, debugging and
other development tools, a graphical interface, powerful network support and rich
applications. In addition, the kernel can be reduced in size through configuration.
Porting includes the following steps:
1) Kernel configuration. make menuconfig is used to configure the Linux kernel.
Enable TCP/IP protocol support in the Networking options, enable the Network
device support option in Device Drivers, and select DM9000 support under
Ethernet (10 or 100Mbit).
2) Modify the corresponding kernel code.
3) Link script.
4) Mount the file system.
5) Port the drivers (USB device driver, LCD driver, etc.).
The Linux development platform is complete once the above components have been
compiled and written into the flash of the system board.
3.2.2 Set Up Of Web Server
1) Download the Boa source[7]. Version boa-0.94.13 is used in this paper.
2) Decompress the Boa source and prepare it for compilation. Run ./configure to
generate the Makefile, then modify the Makefile, changing CC=gcc and CPP=gcc -E
to CC=arm-linux-gcc and CPP=arm-linux-g++ -E.
3) Compile and optimize.
4) Configure Boa.
5) Test that Boa runs.
NFS is generally used for testing. First, copy the Boa web server to the sbin/
directory of the file system, and copy the Boa configuration file "boa.conf" to
the etc/boa/ directory of the file system. Next, add the application to the file
system: in the shared directory, create a web directory that contains the cgi-bin
directory and the static HTML page index.html. Last, boot the target board over
NFS. Set the target board's IP address to 10.16.12.149, open the browser on the
PC and enter http://10.16.12.149; the page will appear as shown in Figure 3.
3.2.3 Implement of Dynamic Web Pages
There are many technologies for implementing dynamic Web pages; commonly used
ones include CGI, ASP, PHP and JSP. On Linux, CGI is often used to implement
dynamic pages.
1) Overview on CGI
CGI gives the Web server a way to execute external programs; this server-side
technology makes interaction between the browser and the server possible. CGI
programs can be written in any programming language, for example Shell, Fortran,
Pascal, Perl or C. CGI programs written in C, however, have advantages in
execution speed, security and so on, so C is used for CGI program design in this
paper.
Fig. 3.2. Work process of CGI
The working process of CGI is shown in Fig. 3.2. The concrete steps are as
follows[1]:
a) The user makes a request to the Web server from the client browser.
b) The Web server evaluates the request. If the request is for a static file, the
Web server transfers the file directly to the client browser; otherwise it
activates the CGI program.
c) The Web server daemon creates a subprocess, which sets the environment
variables and establishes two standard I/O data channels between the server and
the external CGI process.
d) The Web server starts the CGI program specified by the URL. The CGI program
reads and processes the client's input data through the environment variables and
standard input (stdin), and calls the appropriate external program according to
the request.
e) After processing and formatting, the CGI program passes the result through
standard output (stdout) to the server daemon.
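Steps c) to e) can be sketched from the server's side as follows. This is a
minimal illustration in Python, not Boa's C implementation: the function name
run_cgi and the stand-in program echo_cgi are hypothetical. The environment
variable names REQUEST_METHOD, QUERY_STRING, CONTENT_LENGTH and
GATEWAY_INTERFACE come from the CGI specification.

```python
import os
import subprocess
import sys


def run_cgi(argv, method, query="", body=b""):
    """Start a CGI program as a child process and return what it writes to stdout.

    The request is described through environment variables; the request body,
    if any, is fed to the child's standard input.
    """
    env = dict(os.environ)
    env.update({
        "GATEWAY_INTERFACE": "CGI/1.1",
        "REQUEST_METHOD": method,
        "QUERY_STRING": query,
        "CONTENT_LENGTH": str(len(body)),
    })
    result = subprocess.run(argv, input=body, env=env,
                            stdout=subprocess.PIPE, check=True)
    return result.stdout


# Stand-in "CGI program" for experiments: the Python interpreter running a
# one-liner that echoes the method, the query string and the request body.
echo_cgi = [
    sys.executable, "-c",
    "import os, sys; n = int(os.environ['CONTENT_LENGTH']); "
    "sys.stdout.write(os.environ['REQUEST_METHOD'] + ':' + "
    "os.environ['QUERY_STRING'] + ':' + sys.stdin.read(n))",
]
```

The child process is the only place where a fork is needed, which matches the
earlier observation that Boa forks only for CGI programs.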
2) Standard of CGI interface
The CGI interface standard comprises standard input, environment variables and
standard output. Clients mostly submit requests through a FORM. GET and POST are
the two FORM submission methods; which one is used depends on the METHOD setting
of the FORM. With the GET method, the CGI program obtains the data from the
environment variable QUERY_STRING. GET is usually used to retrieve data from the
server without changing it. If the string is too long, however, the POST method is
often used. With POST, the Web server passes the data through stdin, but no EOF
character marks the end of the data, so the environment variable CONTENT_LENGTH
should be used to read the data length. Standard output (stdout) carries the
result produced by the CGI program. The output includes the MIME headers and the
HTML source code that the browser actually displays.
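From the CGI program's side, the GET and POST rules above reduce to a few lines.
The sketch below is in Python rather than the C used in this paper, and the
function name read_form_data is hypothetical. It takes the environment and the
input stream as parameters so that it can be exercised without a Web server.

```python
import io


def read_form_data(environ, stdin):
    """Return the still URL-encoded form data of a CGI request.

    GET  -> data is in the QUERY_STRING environment variable.
    POST -> data is on stdin; exactly CONTENT_LENGTH characters are read,
            because the server does not mark the end of the data with EOF.
    """
    method = environ.get("REQUEST_METHOD", "GET").upper()
    if method == "POST":
        length = int(environ.get("CONTENT_LENGTH") or 0)
        return stdin.read(length)
    return environ.get("QUERY_STRING", "")


# Example with a fake request: only the first 7 characters belong to the body.
example = read_form_data(
    {"REQUEST_METHOD": "POST", "CONTENT_LENGTH": "7"},
    io.StringIO("a=1&b=2<unrelated bytes>"),
)
```

In a real CGI script the arguments would be os.environ and sys.stdin.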
3) Invocation of CGI program
A CGI program is generally invoked in one of the following two ways:
(1) Directly.
(2) Combined with a FORM.
When the user submits an HTML FORM, the Web browser first encodes the data from
the FORM in the following format:
name1=value1&name2=value2&name3=value3&name4=value4&...
This format is URL encoded, so the CGI program needs to parse and decode it,
converting certain special characters back into the corresponding ASCII
characters. These special characters are:
+: convert + to the space character; %xx: convert to the ASCII character whose
hexadecimal value is xx. Last, the CGI program processes the data and returns the
final results to the client[9].
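The two decoding rules can be written out by hand as below. This is a sketch for
illustration, with the hypothetical function names decode_component and
parse_form. In practice Python's standard library already provides
urllib.parse.parse_qsl for the same job, and a C program such as the one used in
this paper would implement the same loop over a character buffer.

```python
HEX_DIGITS = "0123456789abcdefABCDEF"


def decode_component(text):
    """Undo URL encoding: '+' becomes a space, '%xx' becomes the byte 0xXX."""
    out = bytearray()
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "+":
            out.append(0x20)
            i += 1
        elif (ch == "%" and i + 2 < len(text)
              and text[i + 1] in HEX_DIGITS and text[i + 2] in HEX_DIGITS):
            out.append(int(text[i + 1:i + 3], 16))
            i += 3
        else:
            out.extend(ch.encode("utf-8"))   # ordinary character, copy as is
            i += 1
    return out.decode("utf-8", errors="replace")


def parse_form(encoded):
    """Split 'name1=value1&name2=value2' into decoded (name, value) pairs."""
    pairs = []
    for field in encoded.split("&"):
        if not field:
            continue
        name, _sep, value = field.partition("=")
        pairs.append((decode_component(name), decode_component(value)))
    return pairs
```

Splitting on & and = must happen before decoding, otherwise an encoded %26 or
%3D inside a value would be mistaken for a field separator.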
4) Testing of dynamic Web page
A dynamic Web page can be tested once the CGI program has been implemented. First
a simple helloweb.c file is written, then it is compiled and copied to the
cgi-bin directory:
# arm-linux-gcc -o helloweb.cgi helloweb.c
# cp helloweb.cgi /opt/EmbedSky/root_nfs/web/cgi-bin
Finally, http://10.16.12.149/cgi-bin/helloweb.cgi is entered in the browser on
the PC and the test page (helloweb.cgi) opens, as shown in Figure 5.
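The source of helloweb.c is not listed in this report. The sketch below, written
in Python rather than C, only illustrates what such a minimal CGI program has to
write to standard output: a MIME header, an empty line, and then the HTML that
the browser displays. The function name hello_page and the page text are
assumptions made for this example.

```python
import html


def hello_page(message="Hello Web"):
    """Return the complete output of a minimal CGI program as one string."""
    safe = html.escape(message)          # never echo raw text into HTML
    return ("Content-Type: text/html\r\n"
            "\r\n"                       # empty line ends the header section
            f"<html><head><title>{safe}</title></head>"
            f"<body><h1>{safe}</h1></body></html>\n")
```

A CGI script would send this with sys.stdout.write(hello_page()). The equivalent
C program would print the same lines with printf.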
Fig.3.3 Dynamic page of helloweb.cgi[1]
REFERENCES
[1] Mo Guan; Minghai Gu, Design And Implementation Of An Embedded Web Server
Based On ARM, pp. 612-615, 2010.
[2] Yakun Liu; Xiaodong Cheng, Design and implementation of embedded Web
server based on ARM and Linux, 30-31 May 2010, pp. 316-319.
[3] www.wikipedia.com
[4] embeddedcraft.org/embeddedlinux.html