VLSI Algorithmic Design Automation Lab.
THE TI OMAP PLATFORM APPROACH TO SOC
CONTENTS
5.1 Overview of OMAP5910
5.2 OMAP5910 Block Diagram
5.3 Features
5.4 DSP Subsystem
5.5 Component of DSP Subsystem
5.6 DSP Module Block Diagram
5.7 TMS320C55x DSP Core
5.8 Feature of TMS320C55x
5.9 C55x Block Diagram
5.10 IU
5.11 PU
5.12 AU
5.13 DU
INTRODUCTION
- OMAP Platform
  - General-purpose computing engines
  - HW accelerators
  - Memory, peripherals and interfaces
- Platform design parameters
  - Performance
  - Power
  - Cost
  - Time to market
- SoC Platform: systematic reuse
HIERARCHY OF PLATFORMS
The OMAP HW/SW Platform
KEY COMPONENTS OF WORK
- Application engineering
- Reference design
- SW architecture and development
- Performance evaluation
  - Estimation of the workload
  - Architecture exploration
  - Architecture tuning
  - Performance verification
  - Silicon evaluation
MULTI-PROCESSOR SW ARCHITECTURE
MULTI-PROCESSOR SW ARCHITECTURE 2
The TI WIRELESS SoC PLATFORM
- Ideal socket
  - Dataflow
  - Clock cycle
  - Clocks
  - Reset
  - Interrupt and DMA requests
  - Semantics
  - Scalable performance
  - Higher level functions
  - Extensibility and flexibility
  - Compliance
The TI WIRELESS SoC PLATFORM - Platforms
The TI WIRELESS SoC PLATFORM - Software
- Software issues
  - Port of the OS to basic peripherals
  - Integration of additional devices
  - Commonality
  - Reuse between different OSes
The TI WIRELESS SoC PLATFORM - Future Proofing
Robustness against the evolution of
- Technology
  - Process scaling
- High performance
- Additional masters
- Increasing area, decreasing size
- New high-bandwidth modules and workloads
- Additional, heterogeneous smart accelerators
The TI WIRELESS SoC PLATFORM - Advantages of Platform-Based Design
- A wide product range allowing reuse of hardware and software development
- A hardware architecture adapted to the problem
- A software architecture delivering all the benefits of the hardware to the application developer
- An efficient SoC platform comprising hardware and low-level software
- A complete and flexible socket allowing hardware to be easily developed, verified and integrated
- SoC platform definition for hardware and software reuse
5.1 Overview of OMAP5910 [6][7]
- Highly integrated hardware and software platform
- Designed for next-generation embedded devices
- Unique dual-core architecture
  - TI-enhanced ARM™ 925 processor (TI925T)
    - Command and control
  - TMS320C55x™ DSP core
    - Low-power
    - High-performance
- Applications
  - Mobile communications, video and image processing, advanced speech applications, audio processing, graphics and video acceleration, generalized web access, data processing
5.2 OMAP5910 Block Diagram
5.3 Features (1/4)
- TI925T MPU subsystem
- DSP subsystem
- DSP MMU
- System DMA controller
- External memory interfaces
- Internal SRAM memory
- External memory traffic controller
- Mailboxes
- Endianism conversion
- Elastic buffering
- JTAG port
- Clock management
- Peripherals
5.3 Features (2/4)
- TI925T MPU
  - Instruction cache: 16K bytes
  - Data cache: 8K bytes
  - MMU
  - 17-word write buffer (WB)
    - Increases system performance
- DSP (TMS320C55x DSP core)
  - Dual-access RAM (DARAM), single-access RAM (SARAM), ROM
  - Instruction cache
  - Hardware accelerators
  - DMA controller
- DSP MMU
  - Address translation
  - Access permission checks
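The two DSP MMU functions listed above, address translation and access permission checks, can be illustrated with a toy model. This is a minimal sketch and not the OMAP5910 register interface: the 4 KB page size, the `PageEntry` fields, the `MMUFault` exception, and the `translate` function are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096  # hypothetical page size, chosen only for illustration


@dataclass(frozen=True)
class PageEntry:
    physical_base: int  # physical address of the mapped page frame
    readable: bool
    writable: bool


class MMUFault(Exception):
    """Raised when a translation or a permission check fails."""


def translate(page_table, virtual_addr, write=False):
    """Translate a virtual address and enforce access permissions."""
    page_number, offset = divmod(virtual_addr, PAGE_SIZE)
    entry = page_table.get(page_number)
    if entry is None:
        raise MMUFault(f"no mapping for page {page_number}")
    if write and not entry.writable:
        raise MMUFault("write to a read-only page")
    if not write and not entry.readable:
        raise MMUFault("read from a non-readable page")
    return entry.physical_base + offset


if __name__ == "__main__":
    # Page 0 is mapped read-only at a made-up physical base address.
    table = {0: PageEntry(0x10000000, readable=True, writable=False)}
    print(hex(translate(table, 0x0123)))  # 0x10000123
```

In this sketch a read of virtual address 0x0123 succeeds, while a write to the same page raises `MMUFault`, which is the kind of protection a permission check provides when the DSP accesses shared memory.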
5.3 Features (3/4)
- System DMA controller
  - Six ports, nine channels
  - Additional dedicated DMA: LCD controller
  - Transfer: 8-, 16-, or 32-bit
  - Simultaneous transfers
  - Low-power design (no clocking when idle)
- Two external memory interfaces
  - External memory interface slow (EMIFS)
  - External memory interface fast (EMIFF)
- Clock management
  - One digital phase-locked loop (DPLL)
  - Three clock management units
  - System power management
5.3 Features (4/4)
- Peripherals
  - For the MPU
  - For the DSP
  - Shared peripherals
5.4 DSP Subsystem
5.5 Component of DSP Subsystem
- DSP module
  - DSP core: TMS320C55x (C55x)
  - Hardware accelerators (HWA)
    - DCT/IDCT
    - Motion estimation
    - Half-pixel interpolation
  - Memories
    - DARAM
    - SARAM
    - PDROM
  - External memory interface (EMIF)
  - 6-channel DMA controller
  - MPUI
  - TIPB
5.5 Component of DSP Subsystem
- DSP peripherals
  - Three general-purpose 32-bit timers
  - One general-purpose UART
  - 16-signal general-purpose input/output (GPIO)
  - Mailbox
    - For inter-processor communication (between MPU and DSP)
  - Watchdog timer
  - Level 2 interrupt handler
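The mailbox's role in MPU-to-DSP communication can be sketched as a small software model. The slide only states that the mailbox is used for inter-processor communication; everything else here is an assumption made for illustration: the 16-bit command and data words, the pending flag, the callback that stands in for an interrupt, and the names `Mailbox`, `post`, and `fetch`.

```python
class Mailbox:
    """One-directional mailbox: a command word, a data word, and a flag."""

    WORD_MASK = 0xFFFF  # assumption: 16-bit command and data words

    def __init__(self, on_interrupt=None):
        self.command = 0
        self.data = 0
        self.flag = False  # True while a message is waiting to be read
        self._on_interrupt = on_interrupt

    def post(self, command, data=0):
        """Sender side: returns False if the previous message is unread."""
        if self.flag:
            return False
        self.data = data & self.WORD_MASK
        self.command = command & self.WORD_MASK
        self.flag = True
        if self._on_interrupt is not None:
            # Stands in for the interrupt raised on the receiving processor.
            self._on_interrupt(self)
        return True

    def fetch(self):
        """Receiver side: read the pending message and clear the flag."""
        if not self.flag:
            return None
        message = (self.command, self.data)
        self.flag = False
        return message


if __name__ == "__main__":
    mpu_to_dsp = Mailbox()
    mpu_to_dsp.post(0x0001, 0xBEEF)  # MPU sends a command with one data word
    print(mpu_to_dsp.fetch())        # DSP reads it, freeing the mailbox
```

A pair of such mailboxes, one per direction, gives each processor a way to signal the other without sharing a lock; larger payloads would typically be passed by reference through shared memory.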
5.6 DSP Module Block Diagram
5.7 TMS320C55x DSP Core
- Advanced multiple-bus architecture
- Unified program/data memory architecture
- Dual 17 x 17-bit multipliers
- Add/compare/select (CSSU) unit
- Exponent encoder
- Two address generators
- 8M x 16-bit (16M-byte) memory space
- Repeat operations
- 288 MIPS/144 MHz, 320 MIPS/160 MHz, 400 MIPS/200 MHz, 600 MIPS/300 MHz
  - ARM9: 220 MIPS/200 MHz
- 0.05 mW/MIPS (20 mW)
  - ARM9: 0.8 mW/MHz (160 mW)
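The throughput, power, and memory-space figures above can be cross-checked with simple arithmetic. The sketch below assumes that the clock and MIPS values pair up in the order listed, and that the 20 mW and 160 mW figures refer to the 400 MIPS and 200 MHz operating points; neither pairing is stated explicitly on the slide. All function and constant names are illustrative.

```python
# Cross-check of the throughput and power figures quoted above.
# Assumption: clock and MIPS figures pair up in the order listed.
C55X_OPERATING_POINTS = [(144, 288), (160, 320), (200, 400), (300, 600)]  # (MHz, MIPS)
ARM9_OPERATING_POINT = (200, 220)  # (MHz, MIPS)


def mips_per_mhz(mhz, mips):
    """Instructions per clock cycle implied by a (MHz, MIPS) pair."""
    return mips / mhz


def c55x_power_mw(mips, mw_per_mips=0.05):
    """C55x power in mW at a given MIPS rating."""
    return mips * mw_per_mips


def arm9_power_mw(mhz, mw_per_mhz=0.8):
    """ARM9 power in mW at a given clock frequency."""
    return mhz * mw_per_mhz


def address_space_bytes(words=8 * 2**20, bits_per_word=16):
    """Size in bytes of a word-addressed memory space."""
    return words * bits_per_word // 8


if __name__ == "__main__":
    for mhz, mips in C55X_OPERATING_POINTS:
        print(f"C55x at {mhz} MHz: {mips_per_mhz(mhz, mips):.1f} MIPS/MHz")
    print(f"ARM9: {mips_per_mhz(*ARM9_OPERATING_POINT):.1f} MIPS/MHz")
    print(f"C55x at 400 MIPS: {c55x_power_mw(400):.0f} mW")
    print(f"ARM9 at 200 MHz: {arm9_power_mw(200):.0f} mW")
    print(f"Memory space: {address_space_bytes() // 2**20} MB")
```

Under these assumptions the C55x delivers 2 MIPS per MHz at every listed operating point, against 1.1 for the ARM9, and the quoted power figures imply roughly an eightfold difference (20 mW versus 160 mW) at a 200 MHz clock.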
5.7 TMS320C55x DSP Core
- Conditional execution
- Seven-stage pipeline
- Instruction buffer unit (IU)
- Program flow unit (PU)
- Address data flow unit (AU)
- Data computation unit (DU)
5.8 Feature of TMS320C55x
- 64 x 8-bit instruction buffer queue
- Two 17 x 17-bit MAC units
- One 40-bit ALU
  - Performs high-precision arithmetic and logical operations
- One 40-bit barrel shifter
- One 16-bit ALU
  - Performs simpler arithmetic in parallel to main ALU
- Four 40-bit accumulators
- Twelve independent buses:
  - Three data read buses
  - Two data write buses
  - Five data address buses
  - One program read bus
  - One program address bus
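To illustrate why two MAC units and 40-bit accumulators matter, the sketch below computes two adjacent FIR outputs per inner-loop pass, one per MAC unit, with both units sharing the same coefficient. It is a behavioral illustration under stated assumptions, not C55x code: the shared-coefficient scheduling, the saturation on every accumulate, and the names `dual_mac_fir` and `saturate40` are choices made for this example, and the 17 x 17-bit operand width, rounding, and fractional modes are not modeled. The `BUS_COUNTS` table simply restates the bus list above.

```python
ACC_BITS = 40
ACC_MAX = (1 << (ACC_BITS - 1)) - 1
ACC_MIN = -(1 << (ACC_BITS - 1))

# Bus counts restated from the list above; they add up to twelve.
BUS_COUNTS = {
    "data read": 3,
    "data write": 2,
    "data address": 5,
    "program read": 1,
    "program address": 1,
}


def saturate40(value):
    """Clamp a value to the signed 40-bit accumulator range."""
    return max(ACC_MIN, min(ACC_MAX, value))


def dual_mac_fir(samples, coeffs):
    """Compute y[i] = sum(coeffs[k] * samples[i + k]), two outputs per pass."""
    taps = len(coeffs)
    num_outputs = len(samples) - taps + 1
    outputs = []
    for i in range(0, num_outputs, 2):
        acc0 = acc1 = 0
        has_second = i + 1 < num_outputs  # odd output count: last pass uses one MAC
        for k, c in enumerate(coeffs):
            # Both MAC units use the same coefficient on different samples.
            acc0 = saturate40(acc0 + c * samples[i + k])
            if has_second:
                acc1 = saturate40(acc1 + c * samples[i + 1 + k])
        outputs.append(acc0)
        if has_second:
            outputs.append(acc1)
    return outputs


if __name__ == "__main__":
    print(sum(BUS_COUNTS.values()))                   # 12
    print(dual_mac_fir([1, 2, 3, 4, 5], [1, 2, 3]))   # [14, 20, 26]
```

Sharing one coefficient between the two multiply-accumulates halves the number of inner-loop passes for a given filter, which is one plausible reading of how dual MAC units translate into the 2 MIPS per MHz ratings quoted for the core.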
5.9 C55x Block Diagram
Summary (1/2)
- DSP processors
  - Fast and powerful performance of digital signal processing operations
  - Specialized instruction set: shift, multiplication, addition
- Piccolo
  - Digital signal processing unit for ARM7
  - Licensable core
- v5TE: signal processing instruction set for ARM-E
- Teak & TeakLite
  - Synthesizable embedded DSP core
  - Process-independent soft core
- OMAP (TI): software platform with MCU and DSP core
  - TI925T MPU & TMS320C55x DSP core
Summary (2/2)
                 Piccolo      TeakLite      Teak          OMAP (TMS320C55x)
ALU              One 32-bit   One 36-bit    One 40-bit    One 40-bit, one 16-bit
Barrel shifter   One 32-bit   One 36-bit    One 40-bit    One 40-bit
Multiplier       One 16-bit   One 16-bit    Two 16-bit    Two 17-bit
Accumulator      -            Four 36-bit   Four 40-bit   Four 40-bit
Performance (MHz): 70; 135; 144, 160, 200, 300 (TMS320C55x)
Power: 0.27 mA/MHz; 0.45 mA/MHz; 0.05 mW/MIPS (TMS320C55x)