David Zhang, Cadence Design Systems. ARM Tech Symposia, Shanghai, November 2015. Portable and Reusable System-Level Verification Use Cases for ARM®-Based SoCs


David Zhang, Cadence Design Systems

ARM Tech Symposia

Shanghai

November 2015

Portable and Reusable System-Level Verification Use Cases for ARM®-Based SoCs


© 2015 Cadence Design Systems, Inc. All rights reserved.

Agenda

• System-level verification
• Reusable use cases
• Cache coherency
• I/O coherency
• Power shutoff


System-Level Verification


A system-centric look at a modern SoC

• Many IP functions
  – Standard I/O: WiFi, USB, PCI Express® (PCIe®), etc.
  – System infrastructure: interconnect, interrupt control, UART, timers…
  – Differentiators: custom accelerators, modem…
• Many cores
  – Both symmetric and asymmetric
  – Both homogeneous and heterogeneous
• Lots of software
  – Part of core functionality: communication stack, DSP software, GPU microcode…
  – User application software infrastructure: Android, Linux…

[Figure: block diagram of a modern SoC and its software. Hardware blocks on the SoC interconnect fabric: the compute subsystem (ARM v8 CPU subsystem with Cortex-A53 and Cortex-A57 cores, L2 caches, and a cache coherent fabric), the customer's application-specific components (3D GFX, DSP, A/V, modem), high-speed wired-interface peripherals (USB 3.0 with 3.0 and 2.0 PHYs, PCIe Gen 2/3 PHY, Ethernet PHY), DDR3 PHY, other general-purpose peripherals (SATA, MIPI, HDMI, WLAN, LTE), a low-speed peripheral subsystem (PMU, MIPI, JTAG, INTC, I2C, SPI, timer, GPIO, display, UART), and an ARM M0 boot processor. Software: bare-metal software, DSP software, initialization software for boot, power, and security, a mobile communications software stack (RTOS, drivers, firmware/HAL, communications L1 to L3), and an application software stack (operating systems, drivers, firmware/HAL, middleware, applications).]


SoC-level verification and validation requirements

• How to communicate and share use cases between users
• How to create and reuse use cases from IP to SoC
• How to use C code to execute natively on many cores and communicate between cores
• How to run use cases across platforms and run more constrained-random variants on faster platforms

[Figure: use-case reuse along three axes. Horizontal reuse across platforms: virtual platform, simulation, emulation, FPGA prototype, silicon board. Users: architect, hardware developer, software developer, verification engineer, software test engineer, post-silicon validation engineer. Vertical reuse across integration scope: IP, subsystem, SoC (hardware + software), bare-metal software, OS and drivers, middleware (graphics, audio, etc.).]


The gaps

• Hardware/software complexity can result in testing at either a very low level or on top of the full production software stack
  – Does the SoC work with the production OS?
  – Bare-metal tests historically limited to programmer reference validation
  – Can all IP be activated?
• Challenges
  – Multi-cluster, multi-processor with cache-coherent fabric
  – Crossing states between cores
  – Traversing multi-step transitions
  – How to develop and debug the tests
    – Bare-metal: needs more directed tests, which are more complicated to develop
    – On top of an OS: more infrastructure to speed development, but issues are harder to debug because of side effects that a complex OS can introduce
  – How to measure what has actually been tested
  – How to develop and debug tests and reuse them on all validation platforms


Reusable System-Level Use Cases for ARM Architecture


Perspec library for ARM architecture

• Table-driven CPU and memory subsystem configuration
• Out-of-the-box testing
  – Coherency true sharing and false sharing
  – Power up/down
  – DVM
• Set of operations to manage the ARM compute cluster
  – Memory management
  – Page table handling, virtual address
  – Predefined actions that designers can use in their programs
    – Write data, read data, copy data
  – Caching operation
    – True sharing
    – False sharing
    – I/O coherency
  – Low power
• Takes a description of the system in a spreadsheet and creates system scenarios


Perspec library processor configuration tables

• Zero modeling is required: CPU and memory subsystem models are generated automatically from the library by reading system configuration tables
• The tables describe
  – Memory blocks
  – Processors, names, clusters, coherency
  – Pages, virtual address (VA), physical address (PA), size
  – Processor-to-memory accessibility/restrictions
• Tables can be extended to add more design-specific attributes

[Figure: screenshot of the system configuration tables.]
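To make the table-driven idea concrete, here is a minimal Python sketch of configuration tables and two queries a generator might run against them. It is not the Perspec table format or API: every name (`Processor`, `MemoryBlock`, `RESTRICTED`, `accessible_memories`, and so on), every address, and every size below is a hypothetical value chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Processor:
    name: str
    cluster: str
    coherent: bool


@dataclass(frozen=True)
class MemoryBlock:
    name: str
    base: int  # physical base address (made-up value)
    size: int  # size in bytes (made-up value)


# Hypothetical table contents; a real flow would read these from a spreadsheet.
PROCESSORS = [
    Processor("a53_0", "a53", True),
    Processor("a53_1", "a53", True),
    Processor("a57_0", "a57", True),
    Processor("scp", "system_control", False),
]

MEMORIES = [
    MemoryBlock("ddr", 0x8000_0000, 0x4000_0000),
    MemoryBlock("sram", 0x0400_0000, 0x0004_0000),
    MemoryBlock("video_sram", 0x0500_0000, 0x0002_0000),
]

# Accessibility table, expressed as forbidden (processor, memory) pairs.
RESTRICTED = {("scp", "video_sram"), ("scp", "ddr")}


def accessible_memories(proc_name):
    """Return the names of the memory blocks a processor may access."""
    return [m.name for m in MEMORIES if (proc_name, m.name) not in RESTRICTED]


def coherent_processors(cluster=None):
    """Return coherent processor names, optionally limited to one cluster."""
    return [
        p.name
        for p in PROCESSORS
        if p.coherent and (cluster is None or p.cluster == cluster)
    ]


def legal_accesses():
    """Enumerate every (processor, memory) pair a generated test may use."""
    return [(p.name, m) for p in PROCESSORS for m in accessible_memories(p.name)]
```

Adding a design-specific attribute in this sketch would mean adding a field to one of the dataclasses, which mirrors the slide's point that the tables are extensible.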


PSLib for ARM architectures

• Support for

− Cache coherency

− I/O coherency

− Power up/down

− DVM

− Crosses


Cache Coherency


Coherency verification challenges: intra-cluster

[Figure: block diagram of the ARM v8 mobile example system. Labeled blocks: A53 cluster and A57 cluster (cores #1 to #4 and an L2 cache each), Mali or customer GPU, CCI-400 or customer CCI, ADB bridges, GIC-400, several NIC-400 instances, PCIe RC, LCD, DMA, customer DMA, SMMU, TZC-400, DMC-400 or customer DDR controller, on-chip ROM, SRAM, video SRAM, timers, UART, IP blocks, system control processor, a DVFS CLK/PSO domain, and a CLK/PSO domain. Port labels S0 to S4 and F0 to F3 appear on the interconnect. Masters are marked as coherent or non-coherent. Software thread #1 runs on one cluster.]

First, intra-cluster cache tests are needed to cross-cover the MOESI states of the L1 and L2 caches.


The MOESI protocol

• A state machine describes the transitions on a given cache block
• State transitions are caused by
  – Processor read/write instructions
  – External probe (snoop) requests
• Instructions on one processor cause probes on the others

[Figure: Processor 1 and Processor 2 connected by a coherent bus.]

Notes:

• Other transitions are caused by
  – Special instructions
  – Cache management (e.g., eviction)
• Corresponds to a certain cache policy
  – Write-back, read-allocate, write-allocate
• MESI and MOSI are variants in which the owned and exclusive states, respectively, are collapsed to shared
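The state machine can be sketched in a few lines of Python. This is a simplified textbook model of one cache line under a write-back, write-allocate policy, covering only processor reads and writes and the probes they cause. It is an illustrative assumption, not the protocol of any particular ARM interconnect and not Perspec code; the names `State`, `read`, `write`, and `run` are hypothetical.

```python
from enum import Enum


class State(Enum):
    M = "Modified"
    O = "Owned"
    E = "Exclusive"
    S = "Shared"
    I = "Invalid"


def read(states, p):
    """Processor p reads the line; returns the new per-processor states."""
    states = list(states)
    if states[p] is not State.I:
        return states  # read hit: no state change anywhere
    others_valid = False
    for q, s in enumerate(states):
        if q == p:
            continue
        if s is not State.I:
            others_valid = True
        if s is State.M:
            states[q] = State.O  # probe: dirty line is now shared, owner keeps it
        elif s is State.E:
            states[q] = State.S  # probe: clean exclusive line becomes shared
    states[p] = State.S if others_valid else State.E
    return states


def write(states, p):
    """Processor p writes the line; every other copy is invalidated."""
    return [State.M if q == p else State.I for q in range(len(states))]


def run(n_procs, ops):
    """Replay (operation, processor) pairs from all-Invalid; return the history."""
    states = [State.I] * n_procs
    history = [tuple(states)]
    for op, p in ops:
        states = read(states, p) if op == "read" else write(states, p)
        history.append(tuple(states))
    return history


# Index 0 stands for processor 1 and index 1 for processor 2.
history = run(2, [("write", 1), ("read", 0), ("write", 0)])
proc2_states = [step[1].value for step in history]
# proc2_states is ["Invalid", "Modified", "Owned", "Invalid"]
```

In this replay, processor 1 performs its final write while its own copy is Shared, and every change on processor 2 after its initial write is caused by a probe from processor 1, which is the cross-processor effect the bullets describe.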


Reaching specific situations

• Scenario goal: write to a line when it is in shared state
• Proc2 line transitions: Invalid -> Modified -> Owned -> Invalid

[Figure: scheduling view and control flow view of the generated scenario.]


Crossing state between cores

• Scenario goal: observe any state on core 1 in parallel to any state on core 2
• Generate scenarios for all combinations
• Collect cross coverage on legal combinations
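One way to picture "all combinations" versus "legal combinations" is to enumerate the 25 state pairs for two cores, filter them with the usual single-line MOESI invariants, and search for a shortest operation sequence that reaches each legal pair. The Python sketch below does this with a simplified textbook model (read, write, and evict only). The model, the invariants as coded, and the names `is_legal`, `step`, and `shortest_scenarios` are assumptions for illustration; this is not how Perspec solves scenarios.

```python
from collections import deque
from itertools import product

STATES = "MOESI"


def is_legal(a, b):
    """Invariant for one line held by two caches.

    Modified and Exclusive require the other copy to be Invalid,
    and at most one cache may be the Owner.
    """
    if a in "ME" and b != "I":
        return False
    if b in "ME" and a != "I":
        return False
    return not (a == "O" and b == "O")


def step(pair, op, core):
    """Apply one operation by `core` to a (core0, core1) state pair."""
    s = list(pair)
    other = 1 - core
    if op == "write":
        s[core], s[other] = "M", "I"
    elif op == "evict":
        s[core] = "I"
    elif op == "read" and s[core] == "I":
        if s[other] == "M":
            s[other] = "O"
        elif s[other] == "E":
            s[other] = "S"
        s[core] = "E" if s[other] == "I" else "S"
    return tuple(s)


def shortest_scenarios():
    """Breadth-first search from (I, I); maps each reachable pair to its ops."""
    start = ("I", "I")
    paths = {start: []}
    queue = deque([start])
    while queue:
        pair = queue.popleft()
        for op, core in product(("read", "write", "evict"), (0, 1)):
            nxt = step(pair, op, core)
            if nxt not in paths:
                paths[nxt] = paths[pair] + [(op, core)]
                queue.append(nxt)
    return paths


legal = [p for p in product(STATES, repeat=2) if is_legal(*p)]
scenarios = shortest_scenarios()
```

Under these assumptions 12 of the 25 pairs are legal, and the search reaches exactly those 12, so each legal cross-coverage bin has at least one generating scenario. For example, `scenarios[("O", "S")]` is a write on core 0 followed by a read on core 1.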


Coherency verification challenges: intra-cluster

[Figure: block diagram of the ARM v8 mobile example system (A53 and A57 clusters with L2 caches, Mali or customer GPU, CCI-400 or customer CCI, NIC-400 instances, GIC-400, TZC-400, DMC-400 or customer DDR controller, SMMU, coherent and non-coherent masters, system control processor, CLK/PSO domains), with software thread #1 running on each cluster.]

First, intra-cluster cache tests are needed to cross-cover the MOESI states of the L1 and L2 caches.

Next, add inter-cluster cache tests to stress L2 through adjacent snoop traffic.


Coherency: false sharing

[Figure: generated false-sharing scenario, with axes labeled "Cache lines" and "Address offset". Processors run in parallel and split cache lines between them.]
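The address pattern behind false sharing can be sketched as follows: each processor gets its own disjoint byte range, but all ranges fall inside the same cache line, so the processors never touch the same bytes yet still contend for the line. The 64-byte line size, the base address, and the helper names `split_cache_line` and `same_line` are assumptions for illustration, not values from the slides.

```python
CACHE_LINE_BYTES = 64  # assumed line size; check the target core's documentation


def split_cache_line(line_base, n_procs, line_bytes=CACHE_LINE_BYTES):
    """Give each processor a disjoint (address, size) slice of one cache line."""
    if line_base % line_bytes != 0:
        raise ValueError("line_base must be cache-line aligned")
    if n_procs < 1 or line_bytes % n_procs != 0:
        raise ValueError("n_procs must evenly divide the line size")
    chunk = line_bytes // n_procs
    return [(line_base + i * chunk, chunk) for i in range(n_procs)]


def same_line(addr_a, addr_b, line_bytes=CACHE_LINE_BYTES):
    """True when both addresses map to the same cache line."""
    return addr_a // line_bytes == addr_b // line_bytes


# Four processors share one line at a made-up base address.
slices = split_cache_line(0x8000_0000, 4)
```

Here `slices[1]` is `(0x8000_0010, 16)`: processor 1 owns bytes 16 to 31 of the line. A check on such a scenario would expect each processor to read back only its own data, whatever the interleaving of the other processors' writes.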


Coherency: true sharing scenarios

[Figure: generated true-sharing scenario, with axes labeled "Cache lines" and "Address offset". Processors run in parallel and synchronize, accessing the same address at different times.]
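A host-side Python model can illustrate the true-sharing pattern: two workers take turns on the same address, each checking the value the other left behind before writing its own. Threads stand in for processors and a dictionary stands in for memory, so this only shows the synchronization structure; a real test would run natively on the cores. The function name `run_true_sharing` and the address are hypothetical.

```python
import threading


def run_true_sharing(rounds=4, addr=0x8000_0000):
    """Two workers alternate on one shared address; returns (log, final value)."""
    memory = {addr: 0}
    turn = [threading.Event(), threading.Event()]
    log = []

    def worker(me, other):
        for _ in range(rounds):
            turn[me].wait()   # synchronize: wait until it is this worker's turn
            turn[me].clear()
            seen = memory[addr]        # read what the other worker wrote
            log.append((me, seen))
            memory[addr] = seen + 1    # write a new value to the same address
            turn[other].set()          # hand the address over

    threads = [
        threading.Thread(target=worker, args=(0, 1)),
        threading.Thread(target=worker, args=(1, 0)),
    ]
    for t in threads:
        t.start()
    turn[0].set()  # worker 0 goes first
    for t in threads:
        t.join()
    return log, memory[addr]


log, final_value = run_true_sharing(rounds=4)
```

Because each access is ordered by the handoff, every read must observe the most recent write by the other worker. On hardware, a stale value at this point would indicate a coherency failure.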


I/O Coherency


Coherency verification challenges: critical coherent I/O

[Figure: block diagram of the ARM v8 mobile example system (A53 and A57 clusters with L2 caches, Mali or customer GPU, CCI-400 or customer CCI, NIC-400 instances, GIC-400, TZC-400, DMC-400 or customer DDR controller, SMMU, coherent and non-coherent masters, system control processor, CLK/PSO domains), with software threads #1 and #2 on one cluster and software thread #1 on the other.]

First, intra-cluster cache tests are needed to cross-cover the MOESI states of the L1 and L2 caches.

Next, add inter-cluster cache tests to stress L2 through adjacent snoop traffic.

Software threads are needed to create I/O coherency scenarios.


Coherency verification challenges

[Figure: block diagram of the ARM v8 mobile example system (A53 and A57 clusters with L2 caches, Mali or customer GPU, CCI-400 or customer CCI, NIC-400 instances, GIC-400, TZC-400, DMC-400 or customer DDR controller, SMMU, coherent and non-coherent masters, system control processor, CLK/PSO domains), with software threads #1 to #4 running on both clusters.]

First, intra-cluster cache tests are needed to cross-cover the MOESI states of the L1 and L2 caches.

Next, add inter-cluster cache tests to stress L2 through adjacent snoop traffic.

Software threads are needed to create I/O coherency scenarios.

Extreme stress is now introduced with further threads on both clusters.


Pre-defined basic software operations

• Basic software operations
  − Write: write_data
    − Generates data and writes it into the memory
  − Copy: copy_data
    − Copies data from one area to another
  − Read: read_check_data
    − Reads data from a previously written area
    − Checks against the reference model
• Main control knobs
  − Alignment
  − Data size
  − Memory block/address
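The relationship between the three operations and the reference model can be sketched in Python. Only the operation names `write_data`, `copy_data`, and `read_check_data` come from the slide; the class `ScenarioMemory`, the method signatures, and the dictionary standing in for target memory are assumptions for illustration, not the library's actual interface.

```python
import random


class ScenarioMemory:
    """Toy stand-in for target memory plus a reference model of its contents."""

    def __init__(self, seed=0):
        self.target = {}     # byte address -> value; stands in for real memory
        self.reference = {}  # what the test expects memory to contain
        self.rng = random.Random(seed)

    def write_data(self, addr, size, alignment=1):
        """Generate `size` random bytes and write them at an aligned address."""
        if addr % alignment != 0:
            raise ValueError("address does not meet the requested alignment")
        data = bytes(self.rng.randrange(256) for _ in range(size))
        for i, byte in enumerate(data):
            self.target[addr + i] = byte
            self.reference[addr + i] = byte
        return data

    def copy_data(self, src, dst, size):
        """Copy `size` bytes from one area to another (areas must not overlap)."""
        for i in range(size):
            self.target[dst + i] = self.target[src + i]
            self.reference[dst + i] = self.reference[src + i]

    def read_check_data(self, addr, size):
        """Read a previously written area and check it against the reference."""
        got = bytes(self.target[addr + i] for i in range(size))
        expected = bytes(self.reference[addr + i] for i in range(size))
        if got != expected:
            raise AssertionError(f"data mismatch at {addr:#x}")
        return got


# Write 16 bytes, copy them elsewhere, then read back and check the copy.
mem = ScenarioMemory(seed=1)
written = mem.write_data(0x1000, 16, alignment=8)
mem.copy_data(0x1000, 0x2000, 16)
checked = mem.read_check_data(0x2000, 16)
```

The three control knobs on the slide map onto the `alignment`, `size`, and `addr` parameters in this sketch. On a real platform the target side would be actual loads and stores, and a mismatch against the reference model would flag a data-integrity or coherency bug.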


Advanced software operation: all processors to all memories

[Figure: generated scenario in which different processors access different memories.]


I/O coherency verification challenges

[Figure: block diagram of the ARM® v8 mobile example system (A53 and A57 clusters with L2 caches, Mali or customer GPU, CCI-400 or customer CCI, NIC-400 instances, GIC-400, TZC-400, DMC-400 or customer DDR controller, SMMU, coherent and non-coherent masters including PCIe RC, LCD, DMA, and customer DMA, system control processor, CLK/PSO domains), with software threads #1 to #4 running on both clusters.]

I/O coherency now needs to be tested for all peripherals that generate shareable transactions.


False sharing with I/O: PCIe example


True sharing with I/O: PCIe example


Power


Power shutoff verification challenges

[Figure: block diagram of the ARM® v8 mobile example system (A53 and A57 clusters with L2 caches, Mali or customer GPU, CCI-400 or customer CCI, NIC-400 instances, GIC-400, TZC-400, DMC-400 or customer DDR controller, SMMU, coherent and non-coherent masters, CLK/PSO domains), with software threads #1 to #4 running on both clusters and power and clock control software running on the system control processor.]

The system control processor (SCP) controls clocks, power, and resets. To be confident the system is robust, we need to exercise the full range of legal power shutoff (PSO) scenarios and the traffic that goes with them.

ARM big.LITTLE™ with dynamic voltage and frequency scaling (DVFS) creates potential hazards through the combinations of clock frequencies. We need to drive the SCP and coherent traffic to cover all the clock combinations.


Coherency low power: coherency_low_power, stages 1, 2, 3


Coherency low power: coherency_low_power, stages 4, 5


Summary

• Reusable portable stimulus library
  – Configurable for your ARM CPU subsystem
  – Automatically generates correct-by-construction, complex multi-core, multi-cluster tests
  – Tests execute at speed across verification and validation platforms (simulation, emulation, FPGA, post-silicon)
• Out-of-the-box testing
  – Focus on coverage, not on developing tests
• Reusable actions and scenarios
  – Enable test writers to create custom, complex test scenarios to verify important use cases for your SoC


Come visit us at the Cadence booth


© 2015 Cadence Design Systems, Inc. All rights reserved worldwide. Cadence and the Cadence logo are registered trademarks of Cadence Design Systems. ARM and the ARM logo are registered trademarks of ARM Limited (or its subsidiaries) in the EU and/or elsewhere. All rights reserved. PCI-SIG, PCI Express, and PCIe are registered trademarks and/or service marks of PCI-SIG. All other trademarks are the property of their respective owners.