Freescale™ and the Freescale logo are trademarks of Freescale Semiconductor, Inc. All other product or service names are the property of their respective owners. © Freescale Semiconductor, Inc. 2009.
QorIQ™ P4080 Product Overview
Richard Schnur, Sr. Multicore Portfolio Marketing Manager
July 2009
Freescale’s Objective:
Provide and/or make available solutions that enable our customers to achieve their system-level requirements better than their non-Freescale alternatives.
• Solutions and enablement technology focus
• Optimized solutions throughout the ecosystem
• Deeper partnering to meet market challenges
Make Our Customers’ Products More Compelling
Silicon ● Tools ● Run-time SW ● Boards ● Reference Designs ● Services ● Etc.
Freescale Introduces Product Longevity Program
►The embedded market needs long-term product support, which allows OEMs to provide assurance to their customers.
►Freescale has a longstanding track record of providing long-term production support for our products.
►Freescale is pleased to introduce a formal product longevity program for the market segments we serve.
• For the automotive and medical segments, Freescale will manufacture select devices for a minimum period of 15 years.
• For all other market segments in which Freescale participates, Freescale will manufacture select devices for a minimum period of 10 years.
►A list of applicable Freescale products is available at www.freescale.com.
Four 45SOI Products Taped Out
PowerQUICC® MPC8569: Flip Chip PBGA, 130M transistors, >1B vias and contacts
Multicore DSP MSC8156: Flip Chip PBGA, 0.5B transistors, 2.5B vias and contacts, >2 miles of interconnect
QorIQ™ P2020: Wire Bond TEPBGA-II, ~100M transistors, >1B vias and contacts

• Verified basic functionality:
  • 6 DSP cores
  • MAPLE-B accelerator
  • M2/M3 memories
• Running SW BIST on memories

• 1st silicon in house
• Functional CPU and QUICC Engine™ module
• Passing traffic on:
  • DDR
  • Ethernet
  • HSSI

• 1st silicon in house
• Teams validating part at component and board level
P4080 key differentiators
• Multicore key “challenges”
• SoC architecture and interconnect
• DPAA: Data Path Acceleration Architecture
• A “Trust” architecture
• Partitioning and virtualization
• Complete software and debug enablement
Multicore “challenges”
► Interconnect
How do you support high internal throughput between all agents: cores, peripherals, memory and accelerators?
► Partitioning and virtualization
How do you reliably run and manage concurrent SW environments on the same SoC device? This has to be accomplished with protection, sharing and virtualization.
► Load balancing and sharing
How do you ensure that increasing the number of cores really improves performance? This has to do with support from both software (OS & tools) and hardware (flow classification and distribution, shared resource abstraction …).
Not exclusive to multicore, but potentially worsened by it:
► Security
How do you make a system “Trusted” and/or “IP Protected”?
► Debug
How do you make a system “debug-able” on such a complex platform?
In addition to performance/power targets, these are the questions that drove the P4080 design …
QorIQ™ P4 Solutions for Multicore Challenges
► Single-core SoCs struggle to retrieve minimum-size packets at line rate
• Interrupt latency
• Buffer management
• Buffer descriptor ring management
► Hardware accelerators require different descriptors to be built
• A single core cannot build descriptors at a rate that utilizes the full throughput capability of the accelerator
► Increasing the number of cores does not guarantee improved performance
• Distributing packets to different cores
• Maintaining flow order
• Shared data structures
► Virtualization
• Sharing resources
• Protection across cores
• Messaging
► Debug
• Volume of information
• Core interaction
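To put a number on the "minimum-size packets at line rate" problem, here is a rough back-of-the-envelope sketch. It uses the standard Ethernet figures (64-byte minimum frame, 8-byte preamble/SFD, 12-byte interframe gap) and assumes, purely for illustration, a single core clocked at 1.5 GHz handling an entire link by itself. The function names are made up for this example.

```python
# Standard Ethernet per-frame wire overhead, in bytes.
PREAMBLE_SFD = 8
INTERFRAME_GAP = 12
MIN_FRAME = 64  # minimum Ethernet frame, including FCS


def packets_per_second(link_bps: float, frame_bytes: int = MIN_FRAME) -> float:
    """Maximum frame rate on a link for a given frame size."""
    bits_on_wire = (frame_bytes + PREAMBLE_SFD + INTERFRAME_GAP) * 8
    return link_bps / bits_on_wire


def cycles_per_packet(core_hz: float, link_bps: float,
                      frame_bytes: int = MIN_FRAME) -> float:
    """Clock cycles available per frame if one core serves the whole link."""
    return core_hz / packets_per_second(link_bps, frame_bytes)


if __name__ == "__main__":
    CORE_HZ = 1.5e9  # assumed core clock for this illustration
    for name, bps in (("1GE", 1e9), ("10GE", 10e9)):
        pps = packets_per_second(bps)
        budget = cycles_per_packet(CORE_HZ, bps)
        print(f"{name}: {pps / 1e6:.2f} Mpps, {budget:.1f} cycles per packet")
```

Under these assumptions a 10GE link leaves roughly 100 core cycles per minimum-size frame, which has to cover interrupt handling, buffer management and descriptor ring updates before any useful work is done.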
The Right Core Matters for Multicore
►Frequency does still matter
►Higher efficiency (IPC) means higher performance
►Cache size impacts performance
►Cache “stashing” and warming
►Hypervisor hardware
►Floating point is useful in many applications
►Performance monitoring and instruction trace are keys to debug
[Figure: e500mc core complex. Blocks shown: e500mc core (I-fetcher, dispatch unit, branch unit, two integer units, load/store unit, GPR, completion unit, FPU, APUs), MMU, 32 KB D-cache/SRAM, 32 KB I-cache/SRAM, and processor interface unit with write, read-1, read-2 and address buses]
Power Architecture® e500mc Core
► e500mc Core:
• Efficient dual-issue, out-of-order
• Seven-stage, superscalar
• High-frequency
• Multicore-optimized
► ISA Improvements
• Hypervisor provides protection and partitioning guarantees for multicore systems
• Special-purpose “statistics instructions” (a.k.a. decorated stores)
• “Classic” floating point in place of SPE floating point
  Compatible with e300 and e600 cores
► Private Backside L2
• Provides low-latency access to private cache
• Provides up to 4x more private cache resources for locking cache lines into the L2 (as well as L1), to facilitate determinism, fast interrupt handlers, etc.
• Flexible allocation modes:
  Unified: all 8 ways can be used for instruction or data
  D-only: all 8 ways are reserved for data
  I-only: all 8 ways are reserved for instructions
  Per-way: N ways are reserved for data, 8-N ways are reserved for instructions
• Reduces snoop traffic in the system
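The four L2 allocation modes listed above can be modeled as a small lookup. This is only an illustrative sketch: the 8 ways come from the slide and the 128 KB L2 size from the deck's block diagram, while `L2Mode`, `way_allocation` and the dictionary layout are hypothetical names invented here, not a Freescale API or register description.

```python
from enum import Enum

L2_WAYS = 8                   # from the slide: 8-way backside L2
L2_SIZE_KB = 128              # from the deck's block diagram
WAY_KB = L2_SIZE_KB // L2_WAYS


class L2Mode(Enum):
    UNIFIED = "unified"
    D_ONLY = "d-only"
    I_ONLY = "i-only"
    PER_WAY = "per-way"


def way_allocation(mode: L2Mode, data_ways: int = 0) -> dict:
    """Return how many ways are shared, data-only and instruction-only."""
    if mode is L2Mode.UNIFIED:
        return {"shared": L2_WAYS, "data": 0, "instr": 0}
    if mode is L2Mode.D_ONLY:
        return {"shared": 0, "data": L2_WAYS, "instr": 0}
    if mode is L2Mode.I_ONLY:
        return {"shared": 0, "data": 0, "instr": L2_WAYS}
    # PER_WAY: N ways for data, the remaining 8-N ways for instructions.
    if not 0 <= data_ways <= L2_WAYS:
        raise ValueError(f"data_ways must be between 0 and {L2_WAYS}")
    return {"shared": 0, "data": data_ways, "instr": L2_WAYS - data_ways}


if __name__ == "__main__":
    alloc = way_allocation(L2Mode.PER_WAY, data_ways=3)
    print({kind: ways * WAY_KB for kind, ways in alloc.items()})  # sizes in KB
```

With these figures each way is 16 KB, so a per-way split of N = 3 would reserve 48 KB for data and 80 KB for instructions.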
P4080 Block Diagram
[Figure: QorIQ™ P4080 multicore processor block diagram. Blocks shown: Power Architecture® e500mc cores, each with 32 KB D-cache, 32 KB I-cache and 128 KB backside L2 cache; CoreNet™ coherency fabric; two 1024 KB frontside L3 caches, each with a 64-bit DDR-2/3 memory controller; Peripheral Access Management Units (PAMU); two Frame Managers (parse, classify, distribute; buffer), each with 4x 1GE and 1x 10GE; Queue Manager; Buffer Manager; Security 4.0; Pattern Match Engine 2.0; RapidIO Message Unit (RMU); 2x DMA; PCIe and sRIO controllers; 18-lane 5 GHz SerDes; real-time debug (watchpoint cross trigger, performance monitor, CoreNet trace, Aurora); eOpenPIC; power management; 2x USB 2.0/ULPI; SD/MMC; clocks/reset; 2x DUART; 4x I2C; SPI; GPIO; eLBC; test port/SAP; pre-boot loader; security monitor; internal boot ROM; CCSR]
QorIQ™ P4 Platform P4080
► Industry-leading performance in under 30 watts (max)
► Streamlined programming
• Through close partner collaboration, the P4080 is well tooled, even before silicon availability. Leveraging the hybrid simulation environment, Simics P4080 from Virtutech, developers can migrate code, work through code partitioning and even have fully debugged software early in the development cycle.
► Eight e500mc cores, built on Power Architecture® technology
• Operating at frequencies up to 1.5 GHz with private L2 cache and embedded hypervisor technology, these are the most advanced cores available in a multicore architecture today. Who needs 16 when you can do it on eight?
► Advanced virtualization technology
• Each core is able to operate fully independently of the other cores: accesses to memories, datapath accelerators and network interfaces are completely contained; safe and autonomous operation of multiple individual operating systems is ensured.
► On-demand Data Path Acceleration Architecture (DPAA)
• Datapath acceleration IP works in concert with the cores to manage packet routing, security, quality-of-service (QoS) and deep packet inspection, freeing the cores to focus on value-added services and application processing.
► CoreNet™ coherency fabric
• Eliminates bus contention, bottlenecks and latency issues associated with scaling shared bus/shared memory architectures that are common in other multicore approaches.
Platform Interconnect is Critical to Delivering Multicore Scalability
►Multicore interconnects must address:
• Scalability of CPU cores, memory and I/O bandwidth
• Flexible inter-processor communication programming models
• QoS differentiation for control/data plane and network traffic
• Efficient memory subsystem including caching and hardware coherency
►The CoreNet™ interconnect fabric on the QorIQ™ P4080 addresses the scalability needs of multicore processors
[Figure: P4080 read bandwidth. Aggregated interface bandwidth (MB/s) for 1 core, 2 cores and 8 cores]
Datapath Acceleration Architecture
“Intelligence is the ability to avoid doing work, yet getting the work done.” - Linus Torvalds
Handles over-the-top traffic
► Bandwidth-intensive multimedia and mobile traffic affected by social patterns or new service creation (Facebook, Telepresence, Skype)
► Drives new demands for network architecture responsiveness in service creation and transport
► Freescale’s next-generation Datapath Acceleration Architecture (DPAA) provides the ability to meet such demands
► 18 Mpps parse and classify, load-steering, network accelerators and multi-level prioritized queuing
DPAA simultaneously enables a lower complexity software environment as well as very high networking performance
[Figure: QorIQ™ P4 Platform DPAA. Blocks shown: FMan, QMan and BMan between the network interfaces, cores and accelerators. Functions shown: parse, classify, steer, policing, congestion management, stash context, enqueue, manage work queues]
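The parse, classify and steer steps are what let several cores share a traffic load without reordering packets inside a flow. Here is a minimal software sketch of that idea, assuming a plain CRC32 hash over the packet 5-tuple; it is not a description of how the Frame Manager or Queue Manager actually work, and the packet dictionary layout and function names are hypothetical.

```python
import zlib
from collections import defaultdict


def flow_key(pkt: dict) -> bytes:
    """Build a key from the 5-tuple so every packet of a flow hashes identically."""
    fields = (pkt["src_ip"], pkt["dst_ip"],
              pkt["src_port"], pkt["dst_port"], pkt["proto"])
    return "|".join(str(f) for f in fields).encode()


def select_queue(pkt: dict, num_queues: int) -> int:
    """Steer a packet to a work queue based on its flow hash."""
    return zlib.crc32(flow_key(pkt)) % num_queues


def distribute(packets, num_queues: int) -> dict:
    """Append packets to per-queue lists in arrival order."""
    queues = defaultdict(list)
    for pkt in packets:
        queues[select_queue(pkt, num_queues)].append(pkt)
    return queues


if __name__ == "__main__":
    flow_a = {"src_ip": "10.0.0.1", "dst_ip": "10.0.0.2",
              "src_port": 1234, "dst_port": 80, "proto": "tcp"}
    flow_b = dict(flow_a, src_port=5678)
    # Interleave two flows; "seq" marks arrival order within each flow.
    packets = [dict(flow, seq=i) for i in range(3) for flow in (flow_a, flow_b)]
    for q, pkts in sorted(distribute(packets, num_queues=8).items()):
        print(q, [(p["src_port"], p["seq"]) for p in pkts])
```

Because the queue is chosen from the flow key alone, all packets of one flow land on the same queue in arrival order, while different flows can spread across cores.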
“Trust” Platform
Secure Platform Boot: configured to boot from on-chip ROM
• CPU#0 begins to boot from on-chip ROM; all other CPUs are held in reset
• The CPU, executing on-chip trusted boot code (provided by FSL), performs initial SoC configuration and health checks, then verifies a signature over the micro-kernel, stored in the NV RAM of the OEM’s choice
Secure boot ensures that the system begins executing trusted code. This trusted code can test the trustworthiness of other system code before allowing it to execute.
Note: ‘Trusted’ = passes signature check. Don’t sign it if you don’t trust it!
[Figure: QorIQ™ P4080 multicore processor with external tamper detection circuits]
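The chain of trust described on this slide (each trusted stage checks the next one before letting it run) can be sketched as follows. This is a hedged illustration only: a real secure boot flow would use a public-key signature check, whereas this sketch substitutes HMAC-SHA256 from the Python standard library as a stand-in, and `sign`, `verify` and `secure_boot` are hypothetical names, not Freescale boot ROM interfaces.

```python
import hashlib
import hmac


def sign(image: bytes, key: bytes) -> bytes:
    """Stand-in for the OEM signing step (HMAC replaces a real signature)."""
    return hmac.new(key, image, hashlib.sha256).digest()


def verify(image: bytes, signature: bytes, key: bytes) -> bool:
    """'Trusted' simply means the image passes the signature check."""
    return hmac.compare_digest(sign(image, key), signature)


def secure_boot(stages, key: bytes) -> list:
    """Run stages in order; stop at the first one that fails verification.

    stages: list of (name, image, signature) tuples in boot order.
    Returns the names of the stages that were allowed to execute.
    """
    executed = []
    for name, image, signature in stages:
        if not verify(image, signature, key):
            break  # untrusted code is never given control
        executed.append(name)
    return executed


if __name__ == "__main__":
    key = b"oem-secret"  # illustrative key only
    kernel, app = b"micro-kernel image", b"application image"
    chain = [("micro-kernel", kernel, sign(kernel, key)),
             ("application", app + b" (tampered)", sign(app, key))]
    print(secure_boot(chain, key))  # the tampered application is not executed
```

The sketch also reflects the slide's warning: the check only proves that an image was signed, so whoever holds the signing key decides what counts as trusted.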
Multicore Operating Systems
► Wide variation of customer use-cases
• Multiple operating systems utilized across cores on a single device
• Proprietary, 3rd-party and open source multicore operating systems
• Symmetric Multi-Processing (SMP) and Asymmetric Multi-Processing (AMP), often running concurrently
• Often no OS, or an engineered light OS, used on forwarding/data plane cores
► Leverage Power Architecture® technology’s 3rd-party OS ecosystem
• Enabled by Freescale embedded hypervisor
  Freescale boot standards, including u-boot
  Leverage open boot protocol and API standards (e.g. Power.org™)
  Freescale Light Weight Executive (LWE) for run-to-completion data plane processing
  Demonstrate performance and provide reference example for customers
[Figure: eight Power Architecture® cores, each with D-cache, I-cache and L2 cache. Labels shown: forwarding/data plane, control plane, Light Weight Executive, light RTOS, Linux®, AMP, SMP, 3rd-party services]
Partitioning a Multicore
►A common multicore usage model is to run multiple operating systems
►Requires partitioning hardware resources:
• Private resources: CPUs, memory, I/O devices
• Shared resources: memory, devices
► Doing this cooperatively (all operating systems well-behaved) presents challenges …
[Figure: multicore system hardware (four CPUs, memory, I/O, shared cache, interrupt controller) partitioned among Linux®, RTOS and legacy OS, each running its own applications]
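One way to picture the partitioning problem is as a consistency check over a resource table: private CPUs, memory ranges and devices must not overlap between operating systems unless a device is explicitly declared shared. The sketch below assumes a simple table format invented for this example (`Partition`, `find_conflicts`); it does not model shared memory regions and is not the hypervisor's configuration format.

```python
from dataclasses import dataclass, field


@dataclass
class Partition:
    name: str
    cpus: set
    memory: tuple  # (start, end) physical address range, end exclusive
    devices: set = field(default_factory=set)


def find_conflicts(partitions, shared_devices=frozenset()):
    """Return (partition_a, partition_b, resource) for every illegal overlap."""
    conflicts = []
    for i, a in enumerate(partitions):
        for b in partitions[i + 1:]:
            if a.cpus & b.cpus:
                conflicts.append((a.name, b.name, "cpu"))
            if a.memory[0] < b.memory[1] and b.memory[0] < a.memory[1]:
                conflicts.append((a.name, b.name, "memory"))
            if (a.devices & b.devices) - set(shared_devices):
                conflicts.append((a.name, b.name, "device"))
    return conflicts


if __name__ == "__main__":
    # Illustrative layout only; addresses and device names are made up.
    linux = Partition("linux", {0, 1}, (0x0000_0000, 0x4000_0000), {"eth0", "uart0"})
    rtos = Partition("rtos", {2}, (0x4000_0000, 0x5000_0000), {"eth1", "uart0"})
    legacy = Partition("legacy", {2, 3}, (0x4800_0000, 0x6000_0000), {"eth2"})
    for conflict in find_conflicts([linux, rtos, legacy], shared_devices={"uart0"}):
        print(conflict)
```

A check like this only catches mistakes in the table. Enforcing the partitioning at run time, when an operating system misbehaves, is the part that needs hardware and hypervisor support.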
Simulation to Hardware: Same Software
[Figure: two identical software stacks (applications, SMP/AMP operating system, hypervisor, optimized high-speed drivers), one running on the Freescale multicore simulation environment (functional, cycle-accurate, API) and one on Freescale multicore P4080 silicon, with a common IDE (compiler / debugger / build tools). Freescale-supplied SDK items are marked]
►Provider of high-performance, high-fidelity, full-system simulators
• High Performance: fast enough to run real software loads
• High Fidelity: run full production software, including firmware, device drivers, hypervisor, RTOS/OS, application software
• Full System: simulate entire systems, not just processor cores, SoCs or boards
Simics™ Modeling
Physical hardware
►Both run real binaries
►HW has real timing
►Simics offers convenient debug
►Simics often faster
►Simics available before hardware
►Fault-injection possible in Simics
►Simics better at networks
Host-compiled simulation
►Simics runs real binaries, host simulation doesn’t
►Simics requires no special build and OS emulation layer
►Simics provides correct relative execution speed between nodes
►Both handle networks
►Host-compiled might be faster
Instruction-set simulators
►Simics provides the whole system
►Simics typically faster than “old” ISS
►ISS doesn’t run whole system; need to modify SW so it will run on ISS
Cycle-accurate simulation
►More detailed timing than Simics
►Low-level timed interaction visible
►Simics much faster (10x to 100x)
►Models take time to create
►CA usually not fast enough to run real SW (takes hours or days to boot Linux®)
Freescale Solution Enablement Strategy
• Focus on silicon optimization capabilities
• Performance-optimized solutions built around standard platforms and available from strategic partners
• Enables strategic partners to develop, market, and provide comprehensive Freescale integrated solutions
• Addresses strategic partner challenges keeping up with market needs
[Figure: FSLDBG tools: CoreNet Perfmon, DDR Perfmon, Core Perfmon, Register Analyzer]
Wide Range of QorIQ™ P4080 Debug Scenarios
► Debug via run control
• Useful when target system H/W and S/W is not stable (board bring-up)
• Useful when non-instrumented but non-real-time debug is required
  No debug software required to run on P4080
  Cores halt during debug; does affect real-time characteristics of execution
• Provides most complete access to e500mc, SoC resources
► Debug via Nexus trace
• Useful when low-intrusive, real-time debug is required
  No debug software required to run on P4080, cores do not halt
• Host debugger reconstructs, correlates instruction, data path history
• Captured trace history can be “played backward” by host debugger
  Enables identification of upstream cause of downstream errors
► Debug via software agent
• Useful for OS-aware debug and S/W debug of one core by another
  Host-based debugger can provide “system viewer” functionality
• Useful for debug of software running on a simulator
  Agent should execute identically as on real silicon
• Debug software required to run on P4080, potentially intrusive
P4080 Debug Architecture
[Figure: P4080 debug architecture. Blocks shown: CPUs #1 to #N, each with debug registers, CPU Nexus, MBIST, performance monitor and instruction jamming, attached through memory-mapped or coherent interfaces; JTAG IEEE 1149.1/7 TAP; HBDP (SerDes); Nexus Port Controller (NPC) with control/arbitration and a 16K trace buffer; Aurora interface and Aurora link; Event Processing Unit (EPU) with select, sequence and action units, events and cross-triggers; performance monitor; bus interface trace on the internal bus; ICT trace for PCIe (PEX controller), DDR (DDR controller) and data path (network packet data path) with ICT arbitration; GigE, DDR and PCIe interfaces]
QorIQ™ P4080 Multicore News
• Eight e500mc cores
• CoreNet™ scales to 32 cores
• PCI Express® 2.0, 10GbE
• PME 2.0, SEC 4.0
• Data path acceleration
• Trust/secure boot
• Hypervisor
• Standardized debug
• Virtualization with real applications
• High-performance SoC
• Advanced technology
• Tier one partnerships
• Outstanding ecosystem
► Innovative Multicore Microarchitecture for unprecedented computing efficiency, performance and scalability.
• On-chip coherency fabric
• Back-side cache per CPU core
• On-demand application acceleration
► Multicore Simulation Environment for accurate, fast code development and debugging.
• Fully tap the capabilities of the multicore platform
• Debug software, not hardware
• Dynamic, real-time debug with non-intrusive capture
► 45-nm Process Technology for industry-leading power-to-performance solution.
• Provides highest instructions-per-cycle (IPC) and frequency for given milliwatt/area
It’s a smarter approach to multicore. Freescale’s Multicore Platforms
Q&A
►Thank you for attending this presentation. We’ll now take a few moments for the audience’s questions.