SoC System Manager white paper delivered for the IP-SOC Conference in Grenoble, France (November 2010). One of the key challenges associated with designing SoC system management schemes stems from the growing number of programmable devices on-chip. Programmable devices exponentially increase the number of combinations of software operations that drive hardware state changes in real time. This in turn complicates system level testing in order to achieve reasonable test coverage. Optimizing the SoC design for a single operating system provides little relief, because the diversity of applications running on the SoC continues to multiply the testing complexities at the system level. This paper will discuss design considerations and compare and contrast three system management architectures. The first is an ad hoc system management, which is comprised of combinations of hardware and software elements that serve a dual purpose, one being normal operation, and one for system management. The second is including system management as part of the on-chip interconnects implementation. The third architecture introduces a control plane approach for system management which complements the data centric global interconnect.
Optimizing System Management in the Platform SoC Era
Howard Pakosh, ChipStart and Phil Casini, AdvanceTech Marketing
November 2010
Introduction
Consumer-focused SoCs have evolved
into platform architectures that are now
being driven by requirements from
operating systems such as Android,
iPhone, Linux, and Windows and the
thousands of applications they support.
Over time, more of the system is moving
into silicon. As a result, system
management functions have moved into
the SoC. Traditional feature-based
regression testing at the silicon level
must now be increasingly complemented
with complex system level testing in
order to maintain a high level of system
coverage across SoC road maps.
Balancing price-performance-power and
high system level test coverage therefore
creates complex system management
design challenges that affect both
hardware and software operation.
System management must now be
considered as a central feature and
responsibility of the SoC architecture,
not just as a tactical design consideration
for the development of each individual
SoC. System management should
provide adequate synchronization of
hardware state changes driven by
software, maintain reasonable time to
market and maximize system test
coverage and support.
The remainder of this paper will discuss
design considerations and compare and
contrast three system management
architectures. The first is an ad hoc
system management, which is comprised
of combinations of hardware and
software elements that serve a dual
purpose, one being normal operation,
and one for system management. The
second is including system management
as part of the on-chip interconnects
implementation. The third architecture
introduces a control plane approach for
system management which complements
the data centric global interconnect.
Finally the paper will discuss the
growing importance of integrated
subsystem design and IP for SoCs and
how system level partitioning will play a
growing role in achieving efficient
system management.
System management design
considerations
One of the key challenges associated
with designing SoC system management
schemes stems from the growing number
of programmable devices on-chip.
Programmable devices exponentially
increase the number of combinations of
software operations that drive hardware
state changes in real time. This in turn
complicates system level testing in order
to achieve reasonable test coverage.
Optimizing the SoC design for a single
operating system provides little relief,
because the diversity of applications
running on the SoC continues to
multiply the testing complexities at the
system level.
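The growth described above can be made
concrete with a minimal sketch. It
assumes, purely for illustration, that
each programmable core has a fixed
number of software-driven hardware
states and that cores combine freely;
the function name and the figure of 8
states per core are hypothetical and do
not come from a real SoC.

```python
import math


def joint_state_count(states_per_core):
    """Return the number of joint hardware states.

    Assumes every core's states can combine freely with every other
    core's states, so the total is the product of the per-core counts.
    """
    return math.prod(states_per_core)


# Hypothetical SoCs: each core is assumed to have 8 reachable states.
for cores in (1, 2, 4, 8):
    print(cores, "cores ->", joint_state_count([8] * cores), "joint states")
```

Under these assumptions, doubling the
core count squares the number of joint
states, which is why regression suites
cannot keep pace by adding tests alone.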
System level testing via traditional
silicon level functional and data path
regressions must now be augmented by
system functional test suites that
include the programmable elements and
their impact on hardware state changes.
Each programmable core can be isolated
and tested to achieve a high level of
code coverage, and each execution path
through the different core combinations
can be tested, but the combinations of
hardware state changes they require as a
result of application behavior make it
almost impossible to achieve adequate
system level coverage solely from
testing the cores and the buses in
isolation or even in pseudo-random
combinations.
It is at this point that compromises are
often made in the SoC design. How
much risk is affordable when trading off
the cost and time to build these complex
system level regression suites against
the actual test coverage achieved? As
volumes grow, the answer is that risk
must be mitigated, and it therefore
becomes essential to minimize these
tradeoffs.
This paper challenges the increasing
“tax” on project costs needed to balance
adequate system level test coverage and
risk under current system management
architecture assumptions.
Specifically, instead of continuing to
grow regression suites and make risk
choices based on the assumption that the
associations between the levels of
hardware and system testing are tightly
coupled, abstraction layers can be
inserted into the architecture to decouple
the hardware, operating system, and
application support functions.
Furthermore, each of these components
can be tested through independent
elements introduced into the SoC
architecture.
In fact, this trend has already begun. The
growing use of decoupled global
interconnect structures, such as those
that employ OCP or similar features,
provides a proven example of how to
ease chip architecture design as it
evolves from single to multicore or
multi-layer. By “abstracting” the data
plane, and allowing the associations
between the IP cores to become linked
through the independent global
interconnect structure, system
performance at the hardware level
becomes more predictable and tunable
(CPU to off chip memory for example).
This predictability affords opportunities
to streamline the design process because
these loosely coupled associations are
less affected by specific design changes.
This leads to more rapid timing closure
even though the complexity of the data
plane has grown significantly.
Similar abstraction techniques can be
applied to system management. The
software and hardware layers, the system
management, and the functional
operation of the SoC can be decoupled,
making it easier to test each component
of the system level architecture while
considering the system level driven
hardware state changes. This results in a
system level design which is more easily
understood and has better test coverage.
This approach also abstracts the system
management operational complexities
between hardware and software even as
the number of applications grows.
The next section of the paper will
discuss three potential methods of
abstraction that lead to varied degrees of
optimizing system management.
System Management Scheme
Comparisons
Given that the objective is to reduce
overall system management complexity,
there are three baseline characteristics
against which system management schemes
should be benchmarked:
1. How well does the approach
achieve independence between
the silicon, operating system,
and application layers?
2. How flexible is the approach to
adapt to each derivative design
in a SoC road map?
3. How much test coverage does
the resultant system
management scheme achieve
for the SoC architecture?
By applying these benchmark criteria,
three methods can be evaluated.
Method 1: Using a single operating
system hosted on a “master” CPU. This
has been a popular approach to perform
system management because silicon
elements already required for real time
operation also execute system
management functions.
When SoC complexities are relatively
low, this scheme is very efficient. It
requires no extra silicon and only some
extra software development, which is
easily contained.
However, the growth in complexity
associated with multicore SoCs for
consumer designs today has weakened
the effectiveness of this approach. As
system tasks become distributed, that
is, more interdependent as more cores
are added to the SoC, the visibility and
control of any one core over the others
is reduced with each new element added.
Visibility and control become more
dependent on the global interconnect as
well as the cores, adding even more
complexity to executing control
functions. The global interconnect must
be included in system testing in this
case because it controls access to
external memory, a key element in system
operations.
If the master CPU can no longer manage
and verify the hardware state changes of
the other core elements, the increasing
number of possible states results in
unpredictable coverage and the
methodology no longer has value.
Extending the scheme to add system test
then does not return meaningful
dividends on the potentially massive
investment of developing the tests and
verification infrastructure.
Applying the criteria then to this method
for today’s platform SoCs:
1. This approach fundamentally
breaks down for multicore SoCs
because it will not adequately
allow the economical
construction of operating system
and application level system test
layers.
2. This criterion is considered
inconsequential given that the
method failed the first
criterion.
3. This approach will yield
extremely low system test
coverage and therefore its
usefulness is directly dependent
on the complexity of the SoC.
[Figure: block diagram with Host CPU, two IP Cores, and I/O]
Method 2: Introducing global
interconnect structures and additional
logic to support pseudo-control plane
system management functions. This
approach is an extension of method 1
because often the host CPU continues to
act as the system management master.
Side band signaling, either contained in
the interconnect or designed separately,
is used for the control functions.
Mixing data plane and control functions
introduces abstraction levels that aid in
achieving higher system test coverage, as
long as the SoC does not drive the
interconnect requirements to become so
complex that the control functions
become a small, lower-priority part of
the overall mix of functions. When this
occurs, the control tasks are executed
sub-optimally: priority choices between
functional operations and system
management tasks introduce delays
through complex arbitration sequences
and delayed communication through
blocked hierarchical buses.
Applying the criteria then to this method
for today’s SoCs:
1. This approach introduces levels
of abstraction that make the
approach feasible for some
multicore SoCs.
2. However, the approach also has a
ceiling of usefulness, which is
normally reached when extra
logic is required to manage
“special” cases for each of the
derivatives in the SoC road map
and the inefficiencies tolerated
to minimize time to market begin
to mount. One area where this
occurs is when the system
management master, usually the
host CPU, requests that another
core power down.
Inefficiencies sometimes occur
when complex arbitration
schemes and blocked requests
delay the actual action of
powering down the core. These
delays can often be measured in
thousands of cycles, during
which power is consumed for no
useful system function and is
therefore wasted.
3. As a result of the ceiling in the
benefits of the approach, overall
coverage is directly dependent on
the complexity of the SoC and as
such is useful only within a range
of SoC complexity.
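The power-down inefficiency noted in
item 2 can be put in rough numbers with
a toy model. This is a sketch under
stated assumptions, not a model of any
real interconnect: the function names,
the fixed-priority queueing rule, and
every figure used (40 pending bursts,
64 cycles per burst, a 200 MHz clock,
50 mW of idle power) are hypothetical.

```python
def cycles_until_power_down(pending_data_bursts, burst_cycles,
                            control_cycles, shared=True):
    """Cycles between a power-down request and the core powering down.

    Toy model: on a shared interconnect the request is assumed to wait
    behind every pending higher-priority data burst. On a dedicated
    control path it is assumed not to wait at all.
    """
    wait = pending_data_bursts * burst_cycles if shared else 0
    return wait + control_cycles


def wasted_energy_nj(delay_cycles, clock_mhz, idle_power_mw):
    """Energy consumed by the idle core during the delay, in nanojoules."""
    seconds = delay_cycles / (clock_mhz * 1e6)
    return idle_power_mw * 1e-3 * seconds * 1e9


# Hypothetical scenario: 40 queued bursts of 64 cycles, 4-cycle control message.
shared = cycles_until_power_down(40, 64, 4, shared=True)
dedicated = cycles_until_power_down(40, 64, 4, shared=False)
print("shared path delay (cycles):", shared)
print("dedicated path delay (cycles):", dedicated)
print("energy wasted on shared path (nJ):", wasted_energy_nj(shared, 200, 50))
```

With these assumed inputs the shared
path delays the request by a few
thousand cycles, the order of magnitude
cited above, while the dedicated path
costs only the control message itself.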
Method 3: Introducing a control plane
that complements a data plane global
interconnect.
[Figure: block diagram with Host CPU, I/O, and four IP Cores]
This approach differs from the first two
methods because it does not extend the
traditional host CPU system master
approach. Rather, it introduces a
separate control plane and an
independent system controller to
perform system management tasks.
An independent control plane essentially
abstracts the system management tasks
from any one entity. As such, it can be
controlled by any or all SoC elements as
required, and therefore offers multiple
layers of abstraction. System testing can
be developed by software, hardware,
verification, and system engineers and
applied using a common framework with
equal effectiveness.
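The idea that any SoC element can direct
the system controller, and that tests
can observe the result through one
common framework, can be sketched as
follows. This is a minimal illustration
only: the class, its methods, the block
names, and the three power states are
all hypothetical and are not taken from
any specific product or interface.

```python
from dataclasses import dataclass, field

VALID_STATES = {"on", "idle", "off"}  # hypothetical power states


@dataclass
class SystemController:
    """Toy independent system controller attached to a control plane."""
    states: dict = field(default_factory=dict)  # block name -> current state
    log: list = field(default_factory=list)     # (requester, target, old, new)

    def register(self, block, initial="on"):
        """Make a block known to the controller with an initial state."""
        self.states[block] = initial

    def request(self, requester, target, new_state):
        """Let any registered block request a state change for any block."""
        if requester not in self.states or target not in self.states:
            raise KeyError("unknown block")
        if new_state not in VALID_STATES:
            raise ValueError(f"unsupported state: {new_state}")
        old = self.states[target]
        self.states[target] = new_state
        # Every transition is recorded, so tests can check it in one place.
        self.log.append((requester, target, old, new_state))
        return old


ctrl = SystemController()
for name in ("cpu", "dsp", "media_engine"):
    ctrl.register(name)

ctrl.request("dsp", "media_engine", "off")  # a non-host core drives the change
ctrl.request("cpu", "dsp", "idle")
print(ctrl.states)
print(ctrl.log)
```

Because every request passes through the
same entry point and is logged, a test
written by a software, hardware, or
verification engineer can check state
transitions without knowing which core
issued them.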
This approach is also advantageous
because it separates targeted control
tasks, ideally executed with low latency,
from longer, more complex, and often
performance-sensitive data plane tasks.
This separation is often necessary when
complexity is high, because traditional
approaches reach the ceiling of
effectiveness discussed under method 2.
Applying the criteria then to this method
for today’s SoCs:
1. This approach creates maximum
levels of abstraction for system
management but introduces
control plane functionality.
2. This approach introduces high
levels of flexibility as both
control and data plane functions
can be tuned for each SoC
derivative without changing the
base architecture.
3. This approach also maximizes
the coverage achievable because
any source can direct the system
management. As such, operations
(applications) can be isolated
and tested within the approach
without compromising overall
coverage.
Summary:
While method 3 introduces new control
plane functionality, it also enables SoCs
of virtually any complexity to be tested
and operated with maximum efficiency
achieved using the same approach. As
such it is best suited for roadmaps that
contain a wide variety of complexity or
when extreme flexibility is required for
the SoC architecture. The ability to
direct the system controller using any
SoC core is especially noteworthy
because it allows multiple applications
to directly control the hardware states in
real time when needed, without the
overhead of channeling their requests
through other entities, thus avoiding
inter-function dependencies,
complexities, and delays.
The Impact of SoC Subsystems on
System Management
The basic theme in achieving better
system management is successful
partitioning in order to achieve adequate
levels of system test coverage. This is
why method 3 was chosen as the most
effective for today’s system management
needs.
[Figure: System Controller, CPU, DSP, Media Engine, DRAM Controller, High Speed I/O, and Low Speed I/O attached to a Global Interconnect and a Control Plane]
It stands to reason, then, that the use
of subsystems further abstracts
the system management tasks. However,
creating systems within systems also
introduces hierarchies of complexity and,
as such, further renders traditional
methods of system management ineffective.
The growing use of subsystems over the
next generations of SoC design will
therefore accelerate the adoption of
control plane based system management
as the preferred architecture.
Hierarchical levels of complexity can
then be absorbed into the system
management architecture while
maintaining a common architecture that
provides flexibility and scalability
and minimizes the risks and costs of
expensive architecture redesigns, which
will otherwise accelerate as system
requirements continue to become more
complex.