David Zhang, Cadence Design Systems
ARM Tech Symposia
Shanghai
November 2015
Portable and Reusable System-Level Verification Use Cases for ARM®-Based SoCs
2 © 2015 Cadence Design Systems, Inc. All rights reserved.
Agenda
• System-level verification
• Reusable use cases
• Cache coherency
• I/O coherency
• Power shutoff
System-Level Verification
A system-centric look at a modern SoC
• Many IP functions
  – Standard I/O: WiFi, USB, PCI Express® (PCIe®), etc.
  – System infrastructure: interconnect, interrupt control, UART, timers…
  – Differentiators: custom accelerators, modem…
• Many cores
  – Both symmetric and asymmetric
  – Both homogeneous and heterogeneous
• Lots of software
  – Part of core functionality: communication stack, DSP software, GPU microcode…
  – User application software infrastructure: Android, Linux…
[Figure: block diagram of a modern SoC and its software.
Hardware blocks: compute subsystem (ARM v8 CPU subsystem with Cortex-A57 and Cortex-A53 cores, L2 caches, and a cache-coherent fabric); customer's application-specific components (3D GFX, DSP, A/V, modem); SoC interconnect fabric; high-speed, wired-interface peripherals (DDR3 PHY, USB 3.0 with 3.0 and 2.0 PHYs, PCIe Gen 2/3 PHY, Ethernet PHY); other general-purpose peripherals (SATA, MIPI, HDMI, WLAN, LTE); low-speed peripheral subsystem (PMU, MIPI, JTAG, INTC, I2C, SPI, timer, GPIO, display, UART); ARM M0 boot processor.
Software blocks: bare-metal software (DSP software; initialization software for boot, power, and security); mobile communications software stack (RTOS, drivers, firmware/HAL, communications L1, L2, and L3); application software stack (operating systems, drivers, firmware/HAL, middleware, applications).]
SoC-level verification and validation requirements
• How to communicate/share use cases between users
• How to create and reuse use cases from IP to SoC
• How to use C code to execute natively on many cores and communicate between cores
• How to run use cases across platforms and run more constrained random variants on faster platforms
[Figure: axes of use-case reuse, labeled vertical reuse, horizontal reuse, and use-case reuse]
• Platform: virtual platform, simulation, emulation, FPGA prototype, silicon board
• User: architect, hardware developer, software developer, verification engineer, software test engineer, post-silicon validation engineer
• Scope (integration): IP, subsystem, SoC (hardware + software), bare-metal software, OS and drivers, middleware (graphics, audio, etc.)
The gaps
• Hardware/software complexity can result in testing at either a very low level or on top of the full production software stack
  – Does the SoC work with the production OS?
  – Bare-metal tests have historically been limited to programmer reference validation
  – Can all IP be activated?
• Challenges
  – Multi-cluster, multi-processor with cache-coherent fabric
  – Crossing states between cores
  – Traversing multi-step transitions
  – How to develop and debug the tests
    – Bare-metal: needs more directed tests, which are more complicated to develop
    – On top of an OS: more infrastructure to speed development, but issues are harder to debug because of side effects that a complex OS can introduce
  – How to measure what has actually been tested
  – How to develop and debug tests and reuse them on all validation platforms
Reusable System-Level Use Cases for ARM Architecture
Perspec library for ARM architecture
• Table-driven CPU and memory subsystem configuration
• Out-of-the-box testing
  – Coherency true sharing and false sharing
  – Power up/down
  – DVM
• Set of operations to manage the ARM compute cluster
  – Memory management
    – Page table handling, virtual address
  – Predefined actions that designers can use in their programs
    – Write data, read data, copy data
  – Caching operations
    – True sharing
    – False sharing
    – I/O coherency
    – Low power
• Takes a description of the system in a spreadsheet and creates system scenarios
Perspec library processor configuration tables
• No modeling is required: CPU and memory subsystem models are generated automatically from the library by reading the system configuration tables
[Figure: system configuration tables]
• Memory blocks
• Processors, names, clusters, coherency
• Pages, virtual address (VA), physical address (PA), size
• Processor-to-memory accessibility/restrictions
Tables can be extended to add more design-specific attributes
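To make the table-driven idea concrete, here is a minimal Python sketch of a page table with VA, PA, and size columns and a lookup built on it. It is not the Perspec table format: the `Page` class, the row values, and the `translate` and `check_no_overlap` helpers are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Page:
    name: str
    va: int    # virtual base address
    pa: int    # physical base address
    size: int  # size in bytes


# Hypothetical rows; real values would come from the design's own tables.
PAGES = [
    Page("code",   va=0x0000_0000, pa=0x8000_0000, size=0x1000),
    Page("data",   va=0x0000_1000, pa=0x8010_0000, size=0x1000),
    Page("shared", va=0x0000_2000, pa=0x9000_0000, size=0x2000),
]


def check_no_overlap(pages):
    """Raise ValueError if two pages cover the same virtual address."""
    ordered = sorted(pages, key=lambda p: p.va)
    for prev, cur in zip(ordered, ordered[1:]):
        if prev.va + prev.size > cur.va:
            raise ValueError(f"pages {prev.name} and {cur.name} overlap")


def translate(va, pages=PAGES):
    """Map a virtual address to a physical address using the table rows."""
    for page in pages:
        if page.va <= va < page.va + page.size:
            return page.pa + (va - page.va)
    raise LookupError(f"unmapped virtual address {va:#x}")


check_no_overlap(PAGES)
print(hex(translate(0x1004)))  # falls in the "data" page
```

Keeping the description as plain rows means a design-specific attribute can be added as one more column without touching the lookup logic.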
PSLib for ARM architectures
• Support for
− Cache coherency
− I/O coherency
− Power up/down
− DVM
− Crosses
Cache Coherency
Coherency verification challenges: intra-cluster
[Figure: ARM v8 mobile example system block diagram. Components shown: A53 cluster and A57 cluster (cores #1 to #4 and an L2 cache each, in DVFS CLK/PSO domains), MALI or customer GPU, customer DMA, CCI-400 or customer CCI (ports S0 to S4), ADB bridges, GIC-400, NIC-400 interconnects (including a 2x1 instance), SMMU, TZC-400, DMC-400 or customer DDR controller (F0 to F3), PCIe RC, LCD, DMA, on-chip ROM, SRAM, video SRAM, timers, UART, other IP blocks, and the system control processor. Legend: coherent masters, non-coherent masters, CLK/PSO domains. Software thread #1 is highlighted.]
First, intra-cluster cache tests are needed to cross-cover MOESI states of L1 and L2 caches
The MOESI protocol
• A state machine describes the transitions on a given cache block
• State transitions are caused by
  – Processor read/write instructions
  – External probe (snoop) requests
• Instructions on one processor cause probes on the others
[Figure: Processor 1 and Processor 2 connected by a coherent bus]
Notes:
• Other transitions are caused by
  – Special instructions
  – Cache management (e.g., eviction)
• Corresponds to a certain cache policy
  – Write-back, read-allocate, write-allocate
• MESI and MOSI are variants in which the owned and exclusive states, respectively, are collapsed to shared
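The state machine described above can be sketched as a small transition function for one cache line in one cache. This is a simplified textbook view of MOESI, written as an assumption for illustration: it ignores evictions and special instructions, and the event names (`read`, `write`, `snoop_read`, `snoop_write`) are hypothetical, not taken from any ARM or Cadence interface.

```python
def next_state(state, event, others_have_copy=False):
    """Return the next MOESI state of one cache line in one cache.

    state: one of "M", "O", "E", "S", "I"
    event: "read"/"write" from the local processor, or
           "snoop_read"/"snoop_write" probes caused by another processor
    others_have_copy: only used for a local read miss
    """
    if event == "read":
        if state == "I":
            # Read miss: Shared if another cache holds the line, else Exclusive.
            return "S" if others_have_copy else "E"
        return state  # read hit keeps the current state
    if event == "write":
        return "M"  # the writer ends up with the only, dirty copy
    if event == "snoop_read":
        # A dirty owner keeps supplying data (M -> O); Exclusive drops to Shared.
        return {"M": "O", "E": "S"}.get(state, state)
    if event == "snoop_write":
        return "I"  # another processor's write invalidates this copy
    raise ValueError(f"unknown event {event!r}")


def trace(start, events):
    """Return the list of states visited when applying events in order."""
    states = [start]
    for event in events:
        states.append(next_state(states[-1], event))
    return states


print(trace("I", ["write", "snoop_read", "snoop_write"]))
```

The example trace is a local write, followed by a read and then a write from another processor, as seen from the first processor's cache.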
Reaching specific situations
Scenario goal: write to a line when it is in shared state
[Figure: scheduling view and control flow view of the generated scenario]
Proc2 line transitions: Invalid -> Modified -> Owned -> Invalid
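One way to picture how a tool could reach such a goal state is a search over operation sequences. The sketch below is an illustration under stated assumptions, not the Perspec solver: it uses a simplified two-core MOESI model (no evictions) and a breadth-first search, and the names `step`, `shortest_path`, and `OPS` are hypothetical. It finds a shortest setup that puts core 0's line in Shared and then appends the write, so it may differ from the generated scenario on the slide.

```python
from collections import deque

# Each operation is (kind, core); two cores share one cache line.
OPS = [("read", 0), ("write", 0), ("read", 1), ("write", 1)]


def step(states, op):
    """Apply one operation to a tuple of per-core line states."""
    kind, core = op
    new = list(states)
    if kind == "write":
        # A write invalidates every other copy; the writer becomes Modified.
        new = ["I"] * len(states)
        new[core] = "M"
    elif states[core] == "I":
        # Read miss: other holders are snooped (M -> O, E -> S).
        others_valid = False
        for i, s in enumerate(states):
            if i != core and s != "I":
                others_valid = True
                new[i] = {"M": "O", "E": "S"}.get(s, s)
        new[core] = "S" if others_valid else "E"
    return tuple(new)  # a read hit leaves all states unchanged


def shortest_path(start, reached):
    """Breadth-first search for the shortest op list leading to reached(states)."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        states, path = queue.popleft()
        if reached(states):
            return path
        for op in OPS:
            nxt = step(states, op)
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [op]))
    return None


# Goal: core 0 writes while its copy of the line is Shared.
setup = shortest_path(("I", "I"), lambda s: s[0] == "S")
scenario = setup + [("write", 0)]
print(scenario)
```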
Crossing state between cores
Scenario goal: observe any state on core 1 in parallel to any state on core 2
• Generate scenarios for all combinations
• Collect cross coverage on legal combinations
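The cross of states between two cores can be enumerated directly. The sketch below assumes the usual textbook MOESI compatibility rules (Modified and Exclusive require all other copies to be Invalid; Owned can coexist only with Shared or Invalid); a specific implementation may allow a different set, and the `COMPATIBLE` table and `holes` helper are hypothetical names.

```python
from itertools import product

STATES = "MOESI"

# For each state on one core, the states the other core may legally hold.
COMPATIBLE = {
    "M": {"I"},
    "E": {"I"},
    "O": {"S", "I"},
    "S": {"S", "O", "I"},
    "I": set(STATES),
}


def is_legal(a, b):
    """True if core 1 in state a and core 2 in state b can coexist."""
    return b in COMPATIBLE[a] and a in COMPATIBLE[b]


LEGAL_PAIRS = [(a, b) for a, b in product(STATES, repeat=2) if is_legal(a, b)]


def holes(observed):
    """Return the legal state pairs that have not been observed yet."""
    seen = set(observed)
    return [pair for pair in LEGAL_PAIRS if pair not in seen]


print(len(LEGAL_PAIRS), "legal combinations out of", len(STATES) ** 2)
print(holes([("M", "I"), ("S", "S")]))
```

Each legal pair is a scenario goal to generate, and `holes` reports the cross-coverage bins still missing after a set of runs.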
Coherency verification challenges Intra-cluster
[Figure: ARM v8 mobile example system block diagram, with two software threads labeled #1 highlighted]
• First, intra-cluster cache tests are needed to cross-cover MOESI states of L1 and L2 caches
• Next, add inter-cluster cache tests to stress L2 through adjacent snoop traffic
Coherency: false sharing
[Figure: cache lines plotted against address offset]
• Processors run in parallel
• Processors split cache lines
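A false-sharing access plan can be sketched as an address calculation: every processor gets its own byte range inside each shared cache line, so no two processors touch the same bytes while all of them contend for the same lines. The 64-byte line size and the `false_sharing_plan` helper are assumptions for illustration.

```python
LINE_SIZE = 64  # bytes; an assumption, use the target's cache line size


def false_sharing_plan(base, num_procs, num_lines, line_size=LINE_SIZE):
    """Give each processor a private byte range inside every cache line.

    Returns {proc: [(address, length), ...]} with one entry per line.
    """
    if base % line_size:
        raise ValueError("base must be cache-line aligned")
    chunk = line_size // num_procs
    if chunk == 0:
        raise ValueError("more processors than bytes in a line")
    plan = {}
    for proc in range(num_procs):
        plan[proc] = [
            (base + line * line_size + proc * chunk, chunk)
            for line in range(num_lines)
        ]
    return plan


plan = false_sharing_plan(base=0x1000, num_procs=4, num_lines=2)
for proc, ranges in plan.items():
    print(proc, [(hex(addr), length) for addr, length in ranges])
```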
Coherency: true sharing scenarios
[Figure: cache lines plotted against address offset]
• Processors run in parallel and synchronize
• Processors access the same address at different times
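The true-sharing pattern can be imitated on a host machine with threads standing in for processors. This is only an analogy, not bare-metal code for the target: the threads run in parallel, synchronize through a condition variable, and take turns reading and then overwriting one shared location. The function and variable names are hypothetical.

```python
import threading


def run_true_sharing(num_procs=4, rounds=3):
    """Threads take turns on one shared location; each checks the previous write."""
    shared = {"value": 0}  # stands in for one shared memory address
    turn = {"next": 0}     # global access counter used for the hand-off
    cond = threading.Condition()
    log, errors = [], []

    def worker(pid):
        for r in range(rounds):
            my_turn = r * num_procs + pid
            with cond:
                # Synchronize: wait until it is this thread's turn.
                cond.wait_for(lambda: turn["next"] == my_turn)
                # Read and check the value left by the previous writer.
                if shared["value"] != my_turn:
                    errors.append((pid, my_turn, shared["value"]))
                shared["value"] = my_turn + 1  # write for the next reader
                log.append(pid)
                turn["next"] += 1
                cond.notify_all()

    threads = [threading.Thread(target=worker, args=(p,)) for p in range(num_procs)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared["value"], log, errors


value, log, errors = run_true_sharing()
print(value, log, errors)
```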
I/O Coherency
Coherency verification challenges: critical coherent I/O
[Figure: ARM v8 mobile example system block diagram, with software threads #1 and #2 highlighted on the clusters and a software thread #1 for I/O]
• First, intra-cluster cache tests are needed to cross-cover MOESI states of L1 and L2 caches
• Next, add inter-cluster cache tests to stress L2 through adjacent snoop traffic
• Software threads are needed to create I/O coherency scenarios
Coherency verification challenges
[Figure: ARM v8 mobile example system block diagram, with software threads #1 to #4 highlighted on both clusters and a software thread #1 for I/O]
• First, intra-cluster cache tests are needed to cross-cover MOESI states of L1 and L2 caches
• Next, add inter-cluster cache tests to stress L2 through adjacent snoop traffic
• Software threads are needed to create I/O coherency scenarios
• Extreme stress is now introduced with further threads on both clusters
Pre-defined basic software operations
• Basic software operations
  − Write: write_data
    − Generates data and writes it into the memory
  − Copy: copy_data
    − Copies data from one area to another
  − Read: read_check_data
    − Reads data from a previously written area
    − Checks against the reference model
• Main control knobs
  − Alignment
  − Data size
  − Memory block/address
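The three operations and their knobs can be sketched in a few lines. Only the operation names `write_data`, `copy_data`, and `read_check_data` come from the slide; the signatures, the `Memory` class, the `align_up` helper, and the byte-level reference model below are assumptions for illustration, not the library's API.

```python
import random


class Memory:
    """Byte-addressable memory plus a reference model of expected contents."""

    def __init__(self, size):
        self.data = bytearray(size)  # stands in for the memory under test
        self.expected = {}           # reference model: address -> expected byte


def align_up(addr, alignment):
    """Alignment knob: round addr up to the next multiple of alignment."""
    return (addr + alignment - 1) // alignment * alignment


def write_data(mem, addr, size, rng):
    """Generate random data, write it, and record it in the reference model."""
    payload = bytes(rng.randrange(256) for _ in range(size))
    mem.data[addr:addr + size] = payload
    for i, byte in enumerate(payload):
        mem.expected[addr + i] = byte


def copy_data(mem, src, dst, size):
    """Copy a previously written area and mirror the copy in the reference model."""
    payload = bytes(mem.data[src:src + size])
    expected = [mem.expected[src + i] for i in range(size)]
    mem.data[dst:dst + size] = payload
    for i, byte in enumerate(expected):
        mem.expected[dst + i] = byte


def read_check_data(mem, addr, size):
    """Read back an area and compare every byte with the reference model."""
    return all(mem.data[addr + i] == mem.expected[addr + i] for i in range(size))


rng = random.Random(1)       # fixed seed keeps the run reproducible
mem = Memory(256)
src = align_up(3, 16)        # alignment knob
write_data(mem, src, 32, rng)  # data size knob
copy_data(mem, src, 128, 32)
print(read_check_data(mem, 128, 32))
```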
Advanced software operation: all processors to all memories
[Figure: different processors accessing different memories]
I/O coherency verification challenges
[Figure: ARM® v8 mobile example system block diagram, with software threads #1 to #4 highlighted on both clusters]
Now we need to test I/O coherency for all peripherals that generate shareable transactions
False sharing with I/O: PCIe example
True sharing with I/O: PCIe example
Power
Power shutoff verification challenges
[Figure: ARM® v8 mobile example system block diagram, with software threads #1 to #4 highlighted on both clusters and a block labeled "Power and Clock Control Software"]
• The system control processor (SCP) controls clocks, power, and resets. To be confident the system is robust, we need to exercise the full range of legal power shutoff (PSO) scenarios and the traffic that goes with them
• ARM big.LITTLE™ with dynamic voltage and frequency scaling (DVFS) creates potential hazards through the combinations of clock frequencies. We need to drive the SCP and coherent traffic to cover all the clock combinations
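Covering all clock combinations amounts to enumerating operating points per cluster and visiting each legal pair. The sketch below is an assumption-laden illustration: the frequency values, the "off" state used to model power shutoff, and the rule that one cluster must stay powered are hypothetical and would come from the design's power specification.

```python
import random
from itertools import product

# Hypothetical operating points in MHz; "off" models a shut-off power domain.
BIG_POINTS = ["off", 600, 1200, 1800]
LITTLE_POINTS = ["off", 400, 800]


def is_legal(big, little):
    # Assumed rule for this sketch: one cluster must stay on to drive traffic.
    return not (big == "off" and little == "off")


def clock_combinations():
    """Return every legal (big, little) operating-point pair."""
    return [(b, l) for b, l in product(BIG_POINTS, LITTLE_POINTS) if is_legal(b, l)]


def schedule(seed):
    """Return all legal combinations in a seed-dependent visiting order."""
    order = clock_combinations()
    random.Random(seed).shuffle(order)
    return order


for big, little in schedule(seed=0):
    print(f"big={big} little={little}")
```

Changing the seed gives a different visiting order over the same set, which is one simple way to vary the transitions between operating points from run to run.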
Coherency low power: coherency_low_power, stages 1, 2, 3
Coherency low power: coherency_low_power, stages 4, 5
Summary
• Reusable portable stimulus library
  – Configurable for your ARM CPU subsystem
  – Automatically generates correct-by-construction, complex multi-core, multi-cluster tests
  – Tests execute at speed across verification and validation platforms (simulation, emulation, FPGA, post-silicon)
• Out-of-the-box testing
  – Focus on coverage, not on developing tests
• Reusable actions and scenarios
  – Enable test writers to create custom, complex test scenarios to verify important use cases for your SoC
Come visit us at the Cadence booth
© 2015 Cadence Design Systems, Inc. All rights reserved worldwide. Cadence and the Cadence logo are registered trademarks of Cadence Design Systems. ARM
and the ARM logo are registered trademarks of ARM Limited (or its subsidiaries) in the EU and/or elsewhere. All rights reserved. PCI-SIG, PCI Express, and PCIe are
registered trademarks and/or service marks of PCI-SIG. All other trademarks are the property of their respective owners.