
4GWE Fall 2009: Development Tools for 4G Hardware and Software

9/2/2009

© Tensilica, Steepest Ascent, Synopsys 1

Development Tools for 4G Hardware and Software

Frank Schirrmeister, Synopsys

Frank Vince, Steepest Ascent

Chris Rowen, Tensilica

So it’s the Software…

“Phone differentiation used to be about radios and antennas and things like that. We think, going forward, the phone of the future will be differentiated by software.”

Steve Jobs, CEO, Apple, August 11, 2008


… Unless it’s the Hardware

“PA Semi is going to do system-on-chips for iPhones...”

Steve Jobs, CEO, Apple, June 10, 2008

“…a strategic choice…ensuring Apple can continue to differentiate its flagship phone...”

Forbes.com, April 23, 2008

Some 4G Development Challenges

MIMO-OFDM Transmitter Receiver Chain

Decide on the right algorithm

Optimize the algorithm

HW/SW Implementation

Find the best implementation architecture

Optimize algorithm in the architecture context

Optimize HW/SW Partitioning

HW/SW Chip and System Integration

Optimize HW/SW Integration

Start HW/SW integration early

[Figure: Mobile Chipset Hardware with a Modem Subsystem (MPU, DSP, Modem; HAL; Device Drivers; RTOS; Comms Layers 1 to 3) and an Application Subsystem (MPU, DSP, Multi-Media; HAL; Device Drivers; Operating System; Middleware; Applications), connected by IPC]


Frank Vince, Steepest Ascent

STEEPEST ASCENT


Steepest Ascent: Optimizing the 4G Algorithm

[Figure: the 4G development challenges diagram (algorithm, HW/SW implementation, HW/SW chip and system integration), repeated from the earlier slide]

Presentation Overview

LTE Library for System Studio introduction

Features and capabilities

Why the LTE Library? Productivity increase

LTE physical resources overview

Efficient use of matrices in LTE Library

LTE channel coding overview

LTE Library capabilities

Physical channels and modulation overview

Implementation example with LTE Library

Summary


Introduction: LTE Library for System Studio

Part of Steepest Ascent’s 3G Evolution Lab product family

Comprehensive PHY simulation library

Release 8 of E-UTRA standard

Library applications

Golden reference verification

Custom T&M waveform generation

Algorithm design – don’t sweat the math


LTE System Complexity

LTE: High system complexity

Advanced algorithms required

Extensive testing required

Simplify the Design Process

[Figure: System Studio with the LTE Library. Labels: reference receivers; test models & RMCs; custom waveforms; test, simulation, verification; synchronisation; ch. estimation; MIMO receiver & equalisation]


Library Overview

Physical layer blocks offered

Channel coding/decoding

Modulation/demodulation

Transport and physical channels supported

                                     Downlink                                       Uplink
Tr. channels & control information   DL-SCH, BCH, PCH, DCI, CFI, HI                 UL-SCH, UCI
Physical channels & signals          PDSCH, PDCCH, PBCH, PHICH, PCFICH, PSS, SSS    PUSCH, PUCCH, DRS, SRS

[Figure: OSI stack (Application, Presentation, Session, Transport, Network, Data Link, Physical); the LTE Library for System Studio covers the Physical Layer]


Library Features

Downlink and uplink FDD duplexing

Transmit receive processing chain

Transport channel coding/decoding

Scrambling/descrambling

Symbol modulation/demapping

Layer mapping & precoding

Resource element mapping

OFDM & SC-FDMA modulation

DCI message creation

Why the LTE Library for System Studio?

Productivity Increase

Prebuilt models: get standard waveforms fast

Test models & RMCs

Speed of execution: assess your designs fast

Efficient use of matrices to represent resource grid

Minimises data passing overhead between library blocks

Exploit MKL library available in System Studio

Tried and tested functionality


3G Evolution Lab – LTE Library

Physical Resources

LTE Introduction

This section describes how the physical resources of time and frequency are quantised and used in LTE

Time is quantised as

Frame, subframe and slot

Frequency is quantised in the following way

Subcarriers for OFDM modulation

Resource blocks and resource allocations


Frame Structure Type 1 (FDD)

Type 1 (for FDD) frame structure

Frame: 10 msec (20 slots, numbered 0 to 19)

Subframe: 1 msec (2 slots)

Slot: 0.5 msec

Resource Grid

2 dimensional structure: OFDM symbol (time) and subcarrier (freq)

[Figure: resource grid for one slot; axes: OFDM symbols (time) and subcarriers (bandwidth, freq); one grid cell is a resource element]
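The frame timing and the two-dimensional grid described above can be sketched in a few lines of Python. This is only an illustration, not the LTE Library's own representation: the function names are hypothetical, and the grid size in the usage note (72 subcarriers by 7 OFDM symbols) is an assumed example value that the slides do not state.

```python
# Timing constants taken from the slide: 10 ms frame, 20 slots, 2 slots per subframe.
FRAME_MS = 10.0
SLOTS_PER_FRAME = 20
SLOTS_PER_SUBFRAME = 2
SLOT_MS = FRAME_MS / SLOTS_PER_FRAME  # 0.5 ms per slot


def slot_to_subframe(slot):
    """Return the subframe index (0..9) that contains the given slot (0..19)."""
    if not 0 <= slot < SLOTS_PER_FRAME:
        raise ValueError("slot index must be in 0..19")
    return slot // SLOTS_PER_SUBFRAME


def slot_start_ms(slot):
    """Return the start time of a slot, in ms from the start of the frame."""
    if not 0 <= slot < SLOTS_PER_FRAME:
        raise ValueError("slot index must be in 0..19")
    return slot * SLOT_MS


def make_slot_grid(n_subcarriers, n_symbols):
    """Build an empty resource grid for one slot.

    Rows are subcarriers (frequency), columns are OFDM symbols (time);
    each entry is one resource element holding a complex value.
    """
    return [[0j] * n_symbols for _ in range(n_subcarriers)]
```

For example, `make_slot_grid(72, 7)` gives a 72 x 7 grid, and `slot_to_subframe(19)` places the last slot of the frame in subframe 9.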


LTE Library Resource Grid

Represented using multidimensional arrays

[Figure: array with subcarriers along one dimension and OFDM symbols along the other]

Efficient when passing between blocks

Efficient use of MKL matrix operations in System Studio

3G Evolution Lab – LTE Library

Channel Coding


Channel Coding in LTE

LTE makes use of the following techniques

CRC bits

Code block segmentation

Turbo and convolutional coding

Rate matching

Code block concatenation

Repetition and block codes
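As an illustration of the first technique in the list, the sketch below attaches CRC bits to a transport block using bitwise polynomial division. It assumes the 24-bit CRC24A generator polynomial from TS 36.212 with a zero initial register; the function names are hypothetical and are not the LTE Library's block names.

```python
# Assumed generator (CRC24A in TS 36.212):
# D^24 + D^23 + D^18 + D^17 + D^14 + D^11 + D^10 + D^7 + D^6 + D^5 + D^4 + D^3 + D + 1
CRC24A_POLY = 0x1864CFB
CRC_LEN = 24


def _remainder(bits):
    """Remainder of the bit sequence (MSB first) divided by the generator, over GF(2)."""
    reg = 0
    for b in bits:
        reg = (reg << 1) | (b & 1)
        if reg & (1 << CRC_LEN):
            reg ^= CRC24A_POLY
    return reg


def crc24a(bits):
    """Return the 24 parity bits (MSB first) for a list of 0/1 message bits."""
    reg = _remainder(list(bits) + [0] * CRC_LEN)
    return [(reg >> i) & 1 for i in range(CRC_LEN - 1, -1, -1)]


def attach_crc(bits):
    """Append the parity bits to the message."""
    return list(bits) + crc24a(bits)


def crc_ok(block):
    """A block with a valid CRC attached divides by the generator with zero remainder."""
    return _remainder(block) == 0
```

A receiver-side check is then `crc_ok(received_block)`; any single flipped bit makes the remainder non-zero.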

Don’t sweat the math

Steepest Ascent LTE library takes care of the math behind the physical layer

Downlink Transport Channel Coding

DL-SCH, PCH & MCH BCH


Downlink Control Information Coding

CFI & HI DCI

Data & Control on PUSCH

Data and control information can be transmitted on the uplink shared channel

data control


Control Signalling on PUSCH

Control information can be transmitted on the uplink shared channel

Control Information on PUCCH

Control information can be transmitted on the uplink control channel


3G Evolution Lab – Channel Coding

Full channel coding capability as per TS 36.212

Channel coding operations grouped in blocks


3G Evolution Lab – Channel Coding

Fully parameterisable blocks

3G Evolution Lab – LTE Library

Physical Channels & Modulation


Physical Channels & Modulation in LTE

Review

Physical channels and signals in LTE

Processing chain

Mapping to resource grid

LTE Library for System Studio

Supported physical channels and signals

PDSCH and Ref Signal example

Physical Channels & Signals

Physical channels

Set of resource elements carrying information originating at higher layers

Physical signals

Set of resource elements carrying information not originating at higher layers


Physical Channels & Signals

Different physical channels and signals have

Different processing chain requirements

Different mapping to the resource grid

PDSCH Processing Chain

PDSCH (data channel) processing chain

Other physical channels have similar chains

Code words are the result of channel coding stages

Layer mapping & precoding: multi-antenna processing


PUSCH Processing Chain

PUSCH (data channel) processing chain

Other physical channels have similar chains

Main differences with DL processing

No multi-antenna processing

SC-FDMA used instead of OFDM

Precoding stage: DFT for SC-FDMA
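The DFT precoding stage can be illustrated with a direct (non-fast) DFT in plain Python. This is a minimal sketch that assumes the unitary 1/sqrt(M) scaling used for transform precoding in TS 36.211; `transform_precode` is a hypothetical name, and a real implementation would use an FFT.

```python
import cmath
import math


def transform_precode(symbols):
    """Apply an M-point DFT with 1/sqrt(M) scaling to one block of modulation symbols.

    The output samples are what would then be mapped onto M allocated subcarriers.
    """
    m = len(symbols)
    if m == 0:
        raise ValueError("need at least one symbol")
    scale = 1.0 / math.sqrt(m)
    return [
        scale * sum(d * cmath.exp(-2j * cmath.pi * i * k / m)
                    for i, d in enumerate(symbols))
        for k in range(m)
    ]
```

With this scaling the block energy is preserved, so the precoder spreads each symbol across all allocated subcarriers without changing the total power.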

Physical Signals Mapping Example

[Figure: resource grid over two slots (subcarriers against time) showing the positions of the cell specific reference signals, primary synch signals and secondary synch signals]


Physical Channels Mapping Example

[Figure: resource grid over two slots (subcarriers against time) showing the positions of PDCCH, PHICH, PCFICH, PBCH and PDSCH]

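Mapping to Physical Resources

Equations describing mapping operations can be complex

$$
m = \begin{cases}
N_{\mathrm{RB}}^{(2)} & \text{if } n_{\mathrm{PUCCH}}^{(1)} < c \cdot N_{\mathrm{cs}}^{(1)} / \Delta_{\mathrm{shift}}^{\mathrm{PUCCH}} \\[6pt]
\left\lfloor \dfrac{n_{\mathrm{PUCCH}}^{(1)} - c \cdot N_{\mathrm{cs}}^{(1)} / \Delta_{\mathrm{shift}}^{\mathrm{PUCCH}}}{c \cdot N_{\mathrm{sc}}^{\mathrm{RB}} / \Delta_{\mathrm{shift}}^{\mathrm{PUCCH}}} \right\rfloor + N_{\mathrm{RB}}^{(2)} + \left\lceil \dfrac{N_{\mathrm{cs}}^{(1)}}{8} \right\rceil & \text{otherwise}
\end{cases}
$$

$$
c = \begin{cases}
3 & \text{normal cyclic prefix} \\
2 & \text{extended cyclic prefix}
\end{cases}
$$

$$
n_{\mathrm{PRB}} = \begin{cases}
\left\lfloor \dfrac{m}{2} \right\rfloor & \text{if } (m + n_{\mathrm{s}} \bmod 2) \bmod 2 = 0 \\[6pt]
N_{\mathrm{RB}}^{\mathrm{UL}} - 1 - \left\lfloor \dfrac{m}{2} \right\rfloor & \text{if } (m + n_{\mathrm{s}} \bmod 2) \bmod 2 = 1
\end{cases}
$$

Mapping is made easy by using the LTE Library

Mapping blocks are controlled by simple parameters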
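To show what such a mapping block hides, here is a minimal Python sketch of the PUCCH resource block mapping on this slide. It assumes the slide shows the PUCCH format 1/1a/1b equations of TS 36.211 (Release 8); the function and parameter names are hypothetical and do not reflect the LTE Library's actual block parameters.

```python
import math

N_SC_RB = 12  # subcarriers per resource block


def pucch_m(n_pucch1, n_cs1, n_rb2, delta_shift, normal_cp=True):
    """Return the mapping index m for PUCCH formats 1/1a/1b (assumed TS 36.211 form)."""
    c = 3 if normal_cp else 2
    # The divisions below are assumed to be exact, as for the permitted parameter values.
    if n_cs1 % delta_shift or (c * N_SC_RB) % delta_shift:
        raise ValueError("n_cs1 and c*12 must be multiples of delta_shift")
    offset = c * n_cs1 // delta_shift
    if n_pucch1 < offset:
        return n_rb2
    per_rb = c * N_SC_RB // delta_shift
    return (n_pucch1 - offset) // per_rb + n_rb2 + math.ceil(n_cs1 / 8)


def pucch_n_prb(m, n_s, n_ul_rb):
    """Return the physical resource block used in slot n_s for mapping index m."""
    if (m + n_s % 2) % 2 == 0:
        return m // 2
    return n_ul_rb - 1 - m // 2
```

For instance, with 50 uplink resource blocks, `pucch_n_prb(2, 0, 50)` and `pucch_n_prb(2, 1, 50)` return 1 and 48: the allocation hops between the two edges of the band from one slot to the next.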


3G Evolution Lab – DL Phy. Channels

Supported downlink physical channels

PDSCH

PBCH

PDCCH

PCFICH

PHICH

3G Evolution Lab – DL Phy. Signals

Supported downlink physical signals

Primary synch signals

Secondary synch signals

Cell specific reference signal


3G Evolution Lab – UL Phy. Channels

Supported uplink physical channels

PUSCH

PUCCH

3G Evolution Lab – UL Phy. Signals

Supported uplink physical signals

Demodulation reference signals (DRS)

Sounding reference signals (SRS)


PDSCH & Reference Signal Example

Generate a waveform with

PDSCH

Reference signal

Processing chain to implement

[Figure: processing chain combining the reference signal and PDSCH]

3G Evolution Lab – PDSCH Example

PDSCH modulation operations

Scrambling

Symbol modulation

Layer mapping & precoding

Reference signal generation

Mapping to resource grid (PDSCH & ref signal)

OFDM modulation

IFFT modulation

Cyclic prefix insertion

Windowing
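The first two PDSCH operations in this list, scrambling and symbol modulation, can be sketched as follows. The sketch assumes the length-31 Gold sequence and the QPSK mapping defined in TS 36.211; the function names are hypothetical, and `c_init` in any call is an arbitrary example seed rather than a value derived from real cell and user identities.

```python
import math

NC = 1600  # offset into the Gold sequence (assumed from TS 36.211)


def gold_sequence(c_init, length):
    """Generate `length` bits of the pseudo-random scrambling sequence."""
    x1 = [1] + [0] * 30                          # fixed first m-sequence seed
    x2 = [(c_init >> i) & 1 for i in range(31)]  # second seed taken from c_init
    for n in range(NC + length - 31):
        x1.append((x1[n + 3] + x1[n]) % 2)
        x2.append((x2[n + 3] + x2[n + 2] + x2[n + 1] + x2[n]) % 2)
    return [(x1[n + NC] + x2[n + NC]) % 2 for n in range(length)]


def scramble(bits, c_init):
    """XOR the codeword bits with the scrambling sequence."""
    seq = gold_sequence(c_init, len(bits))
    return [b ^ s for b, s in zip(bits, seq)]


def qpsk_modulate(bits):
    """Map bit pairs to QPSK symbols: 00 -> (1+1j)/sqrt(2), 11 -> (-1-1j)/sqrt(2)."""
    if len(bits) % 2:
        raise ValueError("QPSK needs an even number of bits")
    a = 1.0 / math.sqrt(2.0)
    return [complex(a * (1 - 2 * bits[i]), a * (1 - 2 * bits[i + 1]))
            for i in range(0, len(bits), 2)]
```

Because scrambling is an XOR with a known sequence, applying `scramble` a second time with the same seed descrambles the bits, which is how a receiver model would undo it.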


PDSCH Modulation

Processing chain

System Studio implementation with LTE Library

Ref Signal Generation & Mapping

Map PDSCH and reference signal to the resource grid


Ref Signal Generation & Mapping

System Studio implementation with LTE Library

OFDM Modulation

Apply OFDM modulation


OFDM Modulation

System Studio implementation with LTE Library

IFFT modulation

Cyclic prefix insertion

Windowing
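The first two OFDM steps listed here can be sketched in plain Python. This is a simplified illustration only: it uses a direct inverse DFT instead of an IFFT, leaves windowing out, and ignores LTE-specific details such as subcarrier ordering; the function names and sizes are hypothetical.

```python
import cmath


def idft(freq):
    """Direct inverse DFT: frequency-domain resource elements to time-domain samples."""
    n = len(freq)
    return [
        sum(x * cmath.exp(2j * cmath.pi * k * t / n) for k, x in enumerate(freq)) / n
        for t in range(n)
    ]


def ofdm_symbol(freq, cp_len):
    """Build one OFDM symbol: inverse DFT, then prepend the last cp_len samples."""
    if not 0 <= cp_len <= len(freq):
        raise ValueError("cyclic prefix length must be between 0 and the DFT size")
    time = idft(freq)
    prefix = time[len(time) - cp_len:]  # empty when cp_len is 0
    return prefix + time
```

One column of the resource grid goes in as `freq`, and the returned list is that OFDM symbol's time-domain samples with its cyclic prefix in front.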

Fully Parameterisable Blocks

Cell specific reference signal generator block parameters

PDSCH mapper block parameters


Efficient Use of Matrices

Multi-dimensional matrices passed between blocks

High efficiency

Matrix sizes can change during simulation (unlike other simulators)

3G Evolution Lab – LTE Library for System Studio

Summary


Summary

The LTE Library for System Studio enables design and validation of the physical layer early in the product life cycle

Full channel coding/decoding capability provides authentic data

Physical channel, signal generation and mapping of data to stimulate subsequent design stages

Block based model: Don’t sweat the math

Prebuilt systems available

Test models

RMCs

Why the LTE Library for System Studio?

Comprehensive set of blocks takes care of the math behind the physical layer

Increase productivity:

Prebuilt models: Test Models & RMCs

Exploit System Studio speed of execution

Efficient use of data types

Exploit MKL library available in System Studio


Chris Rowen, Tensilica

TENSILICA

Tensilica: Optimizing HW/SW Implementation

[Figure: the 4G development challenges diagram (algorithm, HW/SW implementation, HW/SW chip and system integration), repeated from the earlier slide]


Outline

• Baseband Trends
• A Fresh Look at Dataplane Processors
• Reaching for Performance:
  – Integrating RTL Accelerators
  – Special-purpose Instruction Extension for Acceleration
  – Advanced Baseband DSPs
• LTE Reference Architecture
• Drill Down on Turbo Processor
• Methodology Flow for Flexible Baseband
• Tensilica Integration with Synopsys Tools

Next-Generation Baseband Standards Drive Fundamental Change in Market

[Figure: peak performance (GOPS, 1 to 1000) against power (Watts, 0.1 to 100), both on logarithmic axes, showing the 2G, 3G and 4G requirements relative to general purpose processors, embedded DSPs and high end DSPs]

Drive towards multi-standard receivers requires programmable solutions

Emerging standards (LTE, WiMAX) require processing power exceeding the capabilities of today’s DSPs

Push towards low-cost green infrastructure requires high performance at very low power


Complex Cellular Standards

[Figure: timeline of cellular standards with the time axis marked 1983, 1993, 1995, 2005 and 2010. Generations: 1G (Voice), 2G (Capacity/Coverage), 3G (Data), Beyond 3G (Mobile broadband), Future. Standards shown: AMPS; GSM, GPRS, EDGE, WCDMA, HSDPA, HSUPA, HSPA+; CDMA, RTT, EVDO Rev. 0, EVDO Rev. A, EVDO Rev. B; TD-SCDMA; LTE; 4G (LTE-A)]

• LTE (Long Term Evolution) is the next generation cellular standard: supports data rates of 150Mbps
• LTE uses 2 new fundamental technologies
  – OFDM: Orthogonal Frequency Division Multiplexing
    • Has been used in WLAN, DSL, DVB
    • Requires high computation: Fast Fourier Transforms
  – MIMO: Multiple Input Multiple Output
    • Uses multiple antennas both for transmit and receive
    • Proven successful in WLAN
• Traditional DSP based radio designs are not sufficient for LTE/4G basebands

Components of digital baseband

Main data-path
• Computationally very intensive
• Configurability requirements for the data-path are not trivial (many different modes)

[Figure: feed-forward receive data path from ADC to MAC: Filters, FFT, MIMO, Demod., FEC. Control elements: freq/phase offsets and gain control; sync./decoding control; control channel processing; channel estimation; baseband master control]

Control Elements
• Requires programmable solution
• Cat 4/5 LTE data-rates (150Mbps) have high computational requirement even for control.


Typical Baseband Designer Problems

• Have solutions for current radio standards, but need latest generation (e.g. LTE)

– May reuse existing [certified] blocks or may want to cover all with programmable solution

• Have historically used hardwired accelerator blocks, controlled by RISC, but the number and service demands of blocks have grown beyond the capacity of the current structure, especially for multi-standard platforms

• Have historically used DSP cores, but new designs require >10x higher throughput

• As baseband algorithms grow larger and more complex, customers are outgrowing assembly-level tools and simple DSP architectures.

New Wireless Standards Drive Performance and Efficiency

Evolving from 2G to 4G:
• >100x increase in operation rate
• Baseband power budget reduced by more than 2-3x over previous generations

Preferred Implementation:
• 2G (GSM): DSP
• 3G (UMTS): DSP + function-specific coprocessors
• 3.9G/4G (LTE/LTE-A): Dataplane Processing Units (DPUs)

[Figure: peak performance (GOPS, 1 to 1000) against power (Watts, 0.1 to 10), showing the 2G, 3G and 4G requirements relative to general purpose processors, embedded DSPs and high end DSPs]


Typical 3G Design

DSP + multiple big RTL accelerators

[Figure: Infineon X-GOLD chip, with the 2/2.5G accelerators and 3G accelerators marked]

Tensilica Focus: Dataplane Processing Units (DPUs)

DPUs: A unique blend of CPU + DSP that delivers programmability and improved power, performance & cost

[Figure: Main Applications CPU and Embedded Controller for Dataplane Processing; Tensilica focus: Dataplane Processors]


What is Automated Dataplane Processor Generation?

Processor Configuration
1. Select from menu
2. Add instruction description (TIE)
3. Automatic instruction (XPRES)

Xtensa Processor Generator
Build SW + Hardware Estimates: <5 minutes
Build Full Hardware: <2 hours

Complete Hardware Design
Source pre-verified RTL, EDA scripts, test suite
Cores as small as 0.02 mm² (45nm)

Customized Software Tools
C/C++ compiler, Debuggers, Simulators, RTOSes, System Models

[Figure labels: Processor, Extensions]

Chips – Correct the First Time

Anatomy of an Extensible Processor: Xtensa LX2 Block Diagram

[Figure: Xtensa LX2 block diagram. Recoverable block labels: Instruction Fetch and Decode; Base Execution Pipeline (ALU, Register File); N-issue FLIX parallel pipelines; user-defined execution units, register files and interfaces; TIE Queues, TIE Ports and TIE Lookup interface (to RTL, memories, FIFOs, CPUs or lookup tables); FPU; MUL16/32, MAC16; Vectra LX DSP; HiFi2 Audio Engine; NSA, MIN, MAX, etc.; zero overhead loop; Interrupt Control; Timers 0 to 3; Exception Support; On-Chip Debug, JTAG Tap, TRAX PC Trace, Trace Port; LD/ST1 and LD/ST2; Instr Mem MMU and Data Mem MMU with ECC/Parity; Inst Cache, IROM, IRAM0; Data Cache, DROM, DRAM0, XLMI; Write Buffer; I-Fetch Buffer; PIF; DMA; AMBA AHB/AXI Bridges; Master Interface and Slave Interface on the System Bus (SDRAM/DDR, Device A, Device B; RTL, CoProc, Shared RAM).
Legend: Base ISA Feature; Configurable Function; Optional Function; Optional & Configurable; Designer Defined Extensions; External RTL & Peripherals; Memories & Caches]


DPU Architecture Choices

Key Tradeoff: Adaptability vs. Efficiency

Combine ultra-efficient task engines with existing or new (programmable) accelerators. Example: LTE Handsets

Use state-of-the-art programmable DSPs optimized for baseband. Example: LTE Basestations

[Figure: energy efficiency against adaptability, ranging from hard-wired logic to a traditional DSP]

Integration of Existing Accelerators

• RTL accelerator blocks have many interface types and widths
  – Data input stream
  – Data output stream
  – Data command inputs
  – Data output flags
  – Configuration registers
  – Mode control
  – Status outputs
• Extensible processor matches RTL interface type and width (to 1024b)
  – Output queues
  – Input queues
  – Read only lookups
  – Read/write lookups
  – Import wires
  – Export states
• Full software support for interfaces:
  – Mapped to instructions and compiler
  – Modeling in high-level and RTL tools
  – Visible to source debugger
• Multiple RTL blocks controlled by one processor
• Processor performs “smart DMA” for RTL data transfers

[Figure: accelerator control processor with data memory, control memory and system memory bus interface, connected to the datapath elements of an RTL accelerator block through Data In, Data Out, Command, Mode Control, Status, Data Flags, Flags Out and Config Regs; additional RTL accelerators attach in the same way]


Direct Control of Multiple RTL Blocks

[Diagram: processor with instruction memory and data memories on the on-chip bus, directly connected to three RTL blocks (RTL A, RTL B, RTL C).]

A 5-slot VLIW instruction streams data from memory, through 3 RTL data-paths and back to memory:

Load Reg[] | Reg[] Store | RTL_A(Cmd) Reg[] | RTL_B(Cmd) Reg[] | RTL_C(Cmd) Reg[]

Interface fields: Input Data (128b), Output Data (128b), Control word (32b), Cmd word (3b)

Interface and Instruction Set Declaration:

    regfile DR 128 16 d
    lookup LUA {`128+32+8`, Mstage} {`128`, Mstage+3}
    state ModeA 32 add_read_write
    lookup LUB {`128+32+8`, Mstage} {`128`, Mstage+3}
    state ModeB 32 add_read_write
    lookup LUC {`128+32+8`, Mstage} {`128`, Mstage+3}
    state ModeC 32 add_read_write
    format f64 64 {l_slot,s_slot,a_slot,b_slot,c_slot}
    table cmdA 8 8 {0, 1, 2, 3, 4, 5, 6, 7}
    table cmdB 8 8 {0, 1, 2, 3, 4, 5, 6, 7}
    table cmdC 8 8 {0, 1, 2, 3, 4, 5, 6, 7}
    slot_opcodes l_slot {LDIU}
    slot_opcodes s_slot {SDIU}
    slot_opcodes a_slot {LUOpA}
    slot_opcodes b_slot {LUOpB}
    slot_opcodes c_slot {LUOpC}
    operation LUOpA {out DR do, in DR di, in cmdA cmd}
      {in ModeA, out LUA_Out, in LUA_In} {
      assign LUA_Out = {cmd,ModeA,di};
      assign do = LUA_In;}
    operation LUOpB {out DR do, in DR di, in cmdB cmd}
      {in ModeB, out LUB_Out, in LUB_In} {
      assign LUB_Out = {cmd,ModeB,di};
      assign do = LUB_In;}
    operation LUOpC {out DR do, in DR di, in cmdC cmd}
      {in ModeC, out LUC_Out, in LUC_In} {
      assign LUC_Out = {cmd,ModeC,di};
      assign do = LUC_In;}

Typical operations per cycle:
• 1 128b read from memory
• 1 128b operation through RTL A
• 1 128b operation through RTL B
• 1 128b operation through RTL C
• 1 128b write to memory
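The lookup declarations on this slide send a request word built as {cmd, Mode, di} (8 + 32 + 128 = 168 bits) to each RTL block and receive 128 bits back. The following is a minimal Python sketch of that bit packing, assuming Verilog-style concatenation in which the leftmost field occupies the most significant bits. The helper names `pack_request` and `unpack_request` are hypothetical and are not part of TIE or any Tensilica tool.

```python
# Field widths taken from the slide's lookup declaration: `128+32+8`.
DATA_BITS, MODE_BITS, CMD_BITS = 128, 32, 8
REQUEST_BITS = DATA_BITS + MODE_BITS + CMD_BITS  # 168 bits in total


def pack_request(cmd, mode, data):
    """Model {cmd, mode, data}: cmd in the top bits, data in the low bits."""
    assert 0 <= cmd < (1 << CMD_BITS)
    assert 0 <= mode < (1 << MODE_BITS)
    assert 0 <= data < (1 << DATA_BITS)
    return (cmd << (MODE_BITS + DATA_BITS)) | (mode << DATA_BITS) | data


def unpack_request(word):
    """Split a request word back into (cmd, mode, data), as an RTL block would."""
    data = word & ((1 << DATA_BITS) - 1)
    mode = (word >> DATA_BITS) & ((1 << MODE_BITS) - 1)
    cmd = word >> (MODE_BITS + DATA_BITS)
    return cmd, mode, data


# Example values are arbitrary and only exercise all three fields.
word = pack_request(cmd=5, mode=0xDEADBEEF, data=(1 << 127) | 1)
```

Because the three fields travel in one word, a single instruction slot can hand an RTL block its command, mode and operand in the same cycle.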

Full Accelerator Integration

• For new functions, integrated acceleration is easy and efficient
• Your proprietary accelerators are fully integrated into instruction set and software tools for each processor
• Add any number of new data pipelines, registers, memories, inter-processor channels (up to 100s of ops per cycle)
• Tensilica Instruction Extension (TIE) format typically 10x more concise than Verilog
• The cycle-by-cycle behavior of each accelerator written in standard C and modeled in fast cycle-accurate simulator
• Use multiple small processors for additional throughput on complex sets of tasks

[Diagram: accelerator control processor with control memory, private memory, data memory and a system memory bus interface, extended with register files, wide datapaths, special function units with their own registers, and dedicated communication channels.]


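Tensilica DSPs for LTE/4G baseband design

[Chart: DSP product range ordered by multiplier count. Levels shown: Single MAC, Dual MAC, Quad MAC, 8 MAC, 16 MAC and more. Products shown: MAC16, ConnX D2, ConnX Vectra LX, ConnX 545CK, ConnX BBE, Xtensa TIE. Categories: Custom DSPs; Comms DSP (16 / 32-bit).]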

Anatomy of a DSP: ConnX Baseband Engine Architecture

[Diagram: block diagram of the ConnX Baseband Engine datapath.] Labeled elements:
• Local Memory and/or Cache
• Load Store Unit (32b/128b, 160b)
• VR Vector Register Bank (16 x 4 x 40b)
• YR Vector Register Bank (8 x 4 x 40b)
• Vector element formats: 40-bit (32 + 8 guard bits); 20-bit real and 20-bit imag
• AR General Registers (16 x 32 bits)
• UR Alignment Registers (4 x 128 bits)
• Vector Selection Registers (4 x 32 bits)
• Multipliers and adders with ACC Registers, Shift / Saturation and rounding (labels: 40b, 36b)
• ALU: Arith, Logical, Shift Ops
• Xtensa 32b Base Ops
• Addressing Modes: Immediate, Immediate updating, Indexed, Indexed updating, Aligning updating, Circular, Bit-reversed


Anatomy of a DSP: ConnX Baseband Engine Instruction Set

Rich baseline instruction set: up to 153 operations
DSP instruction set: 285 operations in 3 VLIW slots

Load/Store ops:
• Addressing modes: offset, offset-update, index, index-update, circular, bit-reversed
• Load 16b/32b scalars and vectors
• Store 16b/32b scalar, vectors, transposed
• Load/store unaligned and masked delivers full bandwidth loads and stores with unaligned data

Multiply ops:
• Complex and scalar 18bx18b multiplies
• Multiply, multiply-round, multiply-add, multiply-subtract
• Multiply complex conjugate
• Magnitude-squared of complex
• Full precision and saturated/rounded outputs
• Up to 16 multiplies per operation
• FIR-optimized multiply-add

ALU ops:
• 20b/40b extended precision in 160b vectors
• Full arithmetic, logical and shift with saturation operations
• SIMD boolean setting for compares
• Ops: ABS, ADD, AND, ASUB, CLAMPS, EQ, XOR, LE, MAX, MAXB, MAXU, MIN, MINB, NAND, NEG, NSA, NSAU, OR, PACK, SLL, SLLI, SLLV, SRA, SRAI, RADD, SUB

Other ops:
• Direct support for single-cycle radix-2 and radix-4 butterfly operations
• 8-way SIMD integer and fractional divide [optional]
• 4-way SIMD reciprocal square root [optional]
• Arbitrary permutation and selection from vector pairs
• Zero-overhead looping
• Conditional vector moves

Anatomy of a DSP: Baseband Engine FFT and FIR performance

• Multiple wide memories and parallel execution units for high performance:
– Native support for complex arithmetic
– VLIW instructions and vector register files support complex code with minimum load-store
– Rich addressing modes minimize data reference overhead
– Advanced compilers for register allocation, code scheduling, software pipelining and vectorization

Performance includes cache and local-data-memory modeling:

Complex points            512     1024    2048    4096    8192
FFT (incl bit reversal)   853     1,812   3,630   7,930   16,247
FIR - 8 tap               1,100   2,200   4,350   8,700   17,400
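One way to read the performance table is to compare it with a simple multiplier-bound estimate. The sketch below assumes that the table entries are cycle counts (the slide does not state units) and that one engine issues 16 real multiplies per cycle, as suggested by "up to 16 multiplies per operation". The radix-2 operation count is a textbook estimate, not a description of the actual FFT implementation, and the function names are illustrative only.

```python
# Table values from the slide, assumed here to be cycle counts.
FFT_CYCLES = {512: 853, 1024: 1812, 2048: 3630, 4096: 7930, 8192: 16247}
FIR8_CYCLES = {512: 1100, 1024: 2200, 2048: 4350, 4096: 8700, 8192: 17400}

REAL_MULTS_PER_COMPLEX_MULT = 4  # (a+jb)(c+jd) needs four real multiplies
MULTS_PER_CYCLE = 16             # assumption: 16 real multiplies per cycle


def fir_ideal_cycles(points, taps):
    """Multiplier-bound cycles for a complex FIR over `points` samples."""
    return points * taps * REAL_MULTS_PER_COMPLEX_MULT / MULTS_PER_CYCLE


def fft_ideal_cycles(points):
    """Multiplier-bound cycles for a radix-2 FFT: (N/2) * log2(N) butterflies,
    each counted as one complex multiply."""
    stages = points.bit_length() - 1  # log2(N) for a power-of-two N
    butterflies = (points // 2) * stages
    return butterflies * REAL_MULTS_PER_COMPLEX_MULT / MULTS_PER_CYCLE


for n in sorted(FFT_CYCLES):
    fir_ratio = fir_ideal_cycles(n, taps=8) / FIR8_CYCLES[n]
    fft_ratio = fft_ideal_cycles(n) / FFT_CYCLES[n]
    print(f"{n:5d} points: FIR {fir_ratio:.0%}, FFT {fft_ratio:.0%} of the estimate")
```

Under these assumptions the 8-tap FIR figures sit within roughly ten percent of the multiplier bound (1,024 ideal cycles against 1,100 quoted for 512 points), which is consistent with reading the table as cycle counts.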


2-8 ConnX Baseband Engine Cluster

[Diagram: typical 4-engine configuration. Each BaseBand Engine (1 to 4) has 160b vector registers, computation units, 32b scalar registers, a 64b instruction cache, 128b DataRAM 0, 128b DataRAM 1 and a Processor InterFace (PIF); the engines are connected by 128b links.]

• 2-8 Baseband Engines form powerful shared memory baseband processor platform
• 8 engine cluster:
– 128 MACs/cycle
– Up to 640 ops/cycle
– 880K 2048pt complex FFTs per second
• Distributed DataRAM space visible to all engines accessed across 128b pipelined interconnect
• Write-buffered interface allows aggregate 120GB/s processor load/store data bandwidth and 60GB/s inter-engine data bandwidth (at 500MHz)
• Native SystemC modeling of multi-engine processors, including cycle-accurate and fast “Turbo” mode bit-accurate simulation

A continuum of design solutions

[Diagram: six configurations built from ConnX BBE, ConnX Vectra/D2, Xtensa Controller, Xtensa Task Engine (TE) and RTL/TIE accelerator blocks.]

• Single Baseband Engine
• Multiple Baseband Engines
• Baseband Engine plus RTL or TIE function-specific accelerators
• Lighter DSP plus RTL or TIE function-specific accelerators
• One Xtensa Controller plus RTL or TIE function-specific accelerators
• Multiple Xtensa Task Engines with RTL or TIE function-specific accelerators


Putting it all together: LTE-Terminal Baseband PHY Development Architecture

• Purpose
– Demonstrate ease of design of world-class LTE terminal PHY
– Evaluate performance, cost, power, and flexibility
– Support customers and partners in delivering production solutions
• Functionality
– LTE Category-4 Solution, FDD
– Functional Domain: Transport Blocks -- Time Domain Frames @Nyquist
• Features
– Configurable
– Modular-Distributed Processing
– Resource Sharing
  • Signal-BBE: NCO, FFT, Channel Estimation, Synchronization
  • Channel-BBE: MIMO decode: SM and TD
  • Signal-BBE: Layer Mapping, Pre-coding, DFT, FFT
– Block-Adaptive Pipelining
  • Low pipeline delay
  • Low memory requirement
  • Fine-grain power management
– Ease of Verification and Bringup
– Partitioned into four domains: Signal, Matrix, Bit, Control
• Physical Parameters: 65LP process @ <300MHz

Domain Partitioning of Baseband PHY

1. Signal Domain (Time -- Frequency)
– Signal (I-Q Data) Operations
– FFT, DFT, Synchronization, Channel Estimation
– Two BBEs with Extensions
2. Matrix Domain (Frequency -- Soft Bits)
– Matrix Operations (I-Q Data)
– MIMO Decoding
3. Bit Domain (Soft Bits -- Bits)
– FEC Encoding, Bit Scrambling, Interleaving, Rate Matching
– Soft Descrambling, Deinterleaving, HARQ Recombining, Turbo Decoding
– Turbo Processor
– HARQ Processor: Controller 64-bit
– PDCCH Processor: Vectra LX Class Processor + Viterbi Acc
– Tx Bit Processor: Vectra LX Class Processor
4. Control Domain
– Configure and Control System
– Communicate with MAC and Host


Tensilica LTE PHY Development Architecture: ConnX Baseband DSPs + Accelerators for Minimum Power / Area

[Diagram: block diagram of the transmit (Tx) and receive (Rx) paths.]

Specialized Processor as Efficient as RTL: Tensilica Turbo Processor Development Architecture

• LTE requires high-data-rate Turbo decoding: ~6000 ops per bit
• Xtensa’s instruction extensions, wide data-paths and multiple wide memories enable efficient programmable Turbo Engine
• Method:
– 8 parallel windows per block, two bits per cycle
– 0.5 cycle for each forward and backward pass (2 cycles per iteration) per bit
– Log correction term for improved bit error rate
• Implementation:
– 325K gates + 80KB memory (2mm^2 in 65LP) achieves 154Mbps at 350MHz

[Chart: Size of Turbo Decoders (scaled to 154Mbps). Vertical axis: K gates to achieve 154Mbps @ 350MHz (8 iterations) for published Turbo blocks, 0 to 1,600. Horizontal axis: Design, with entries [Lin], [Bickerstaff 1], [Bickerstaff 2], [Salmela 1], [Agarwala, Wolf], [Vogt], [Thul], [Shin], [Benkeser], [Xtensa], [Salmela 2].]


Turbo Processor Architecture

• Highly specialized 5-slot VLIW instruction set:
– LS0: LD_SYS128.I, LD_SYS128.IU
– LS1: LD_PAR128.I, LD_PAR128.IU
– Addr: RD_INTERADDR, RD_DEINTERADDR
– RdWrSt: RD_STATE_U
– MAP αβ: ALPHA, BETA, DECISION
• Six memory references per cycle
– Single 640b wide state memory
– Single 30b interleave address memory
– Dual 64b interleaved apriori memory
– Dual 128b load/store interface for main memory
• Massive SIMD:
– Each state update: 21 8b operations
– 8 states per window
– 2 successive bits per window
– 8 windows in parallel
– Ops per cycle: >2600
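The figures quoted for the Turbo processor can be cross-checked with a few lines of arithmetic. This is a sketch under stated assumptions, not the authors' own derivation: it assumes 8 decoder iterations (the value shown on the decoder size chart) and four passes per iteration (a forward and a backward pass for each of the two half-iterations), which is how "0.5 cycle per pass, 2 cycles per iteration" is interpreted here.

```python
# Figures quoted on the slides.
OPS_PER_STATE_UPDATE = 21        # 8-bit operations per state update
STATES_PER_WINDOW = 8
BITS_PER_WINDOW_PER_CYCLE = 2
PARALLEL_WINDOWS = 8
CLOCK_HZ = 350e6
QUOTED_THROUGHPUT_BPS = 154e6

# Assumptions made for this sketch (see the lead-in paragraph).
ITERATIONS = 8
PASSES_PER_ITERATION = 4

# SIMD width: every cycle updates all states of two bits in all windows.
ops_per_cycle = (OPS_PER_STATE_UPDATE * STATES_PER_WINDOW
                 * BITS_PER_WINDOW_PER_CYCLE * PARALLEL_WINDOWS)

# Each pass advances 2 bits in each of the 8 windows per cycle.
bits_per_pass_per_cycle = BITS_PER_WINDOW_PER_CYCLE * PARALLEL_WINDOWS
cycles_per_bit = ITERATIONS * PASSES_PER_ITERATION / bits_per_pass_per_cycle

# Peak rate ignores loop setup, interleaving and other overheads.
peak_throughput_bps = CLOCK_HZ / cycles_per_bit

# Operations spent per decoded bit at the quoted 154 Mbps.
ops_per_bit = ops_per_cycle * CLOCK_HZ / QUOTED_THROUGHPUT_BPS

print(ops_per_cycle, cycles_per_bit, peak_throughput_bps, round(ops_per_bit))
```

With these assumptions the SIMD width comes to 2,688 operations per cycle (matching ">2600"), the peak rate is 175 Mbps at 350 MHz, and the quoted 154 Mbps corresponds to about 6,100 operations per bit, in line with the "~6000 ops per bit" figure.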

Turbo Processor inner loop C code example

    for (k = kq/2 - 1; k >= 0; k--) {
        InterleaveAddr = RD_INTERADDR(DEC);
        StateWord = RD_STATEU();
        BETA(StateWord, InterleaveAddr, *(SystemP), *(--ParityP));
    }

Disassembly:

    650: rd_state st1, a15, 0
    653: addmi a2, a1, 0x600
    656: addmi a2, a2, 0x7f00
    659: wr_state st1, a15
    65c: rd_state st0, a9, 2
    65f: l32i a2, a2, 208
    662: state_statem1st0
    665: { ld_sys128.iu sy0, a0, -16; ld_par128.iu pa0, a2, -16; rd_interaddr ia0, 0; rd_stateu st0; nop }
    66d: { ld_sys128.iu sy1, a0, -16; ld_par128.iu pa1, a2, -16; rd_interaddr ia1, 0; rd_stateu st1; nop }
    675: loopgtz a3, 688 <main+0x688>
    678: { ld_sys128.iu sy0, a0, -16; ld_par128.iu pa0, a2, -16; rd_interaddr ia0, 0; rd_stateu st0; beta st0, ia0, sy0, pa0 }
    680: { ld_sys128.iu sy1, a0, -16; ld_par128.iu pa1, a2, -16; rd_interaddr ia1, 0; rd_stateu st1; beta st1, ia1, sy1, pa1 }
    688: movi a3, 191


Top-Down Baseband Design Methodology

[Diagram: functions f1 to f8 are refined step by step from an algorithm model into DSP C code and accelerators, then integrated with interconnect and peripherals and mapped to hardware blocks.]

• Algorithm Design
– Tools: Synopsys System Studio, Matlab
• Fixed Point Algorithm Refinement
– Tools: Visual C++, Synopsys System Studio, Xtensa Xplorer, Tensilica cstubs
• Partitioning to DSP C and Accelerators: Simulation of sub-systems
– Tools: Tensilica fast ISS, Tensilica cycle-accurate ISS, Tensilica TIE Compiler
• Integration of sub-system models
– Tools: Tensilica XTSC, CoWare Platform Architect, Carbon SoC Designer
• Mapping of system to FPGA and ASIC RTL
– Tools: Tensilica FPGA netlist, Tensilica RTL generation, Tensilica pin-level XTSC, Verilog simulators, FPGA synthesis and mapping, Tensilica RTL Testbench

Top-Down Baseband Design Methodology

• MIMO-OFDM Transmitter Receiver Chain
– Decide on the right algorithm
– Optimize the algorithm
• HW/SW Implementation
– Find the best implementation architecture
– Optimize algorithm in the architecture context
– Optimize HW/SW Partitioning
• HW/SW Chip and System Integration
– Optimize HW/SW Integration
– Start HW/SW integration early

[Diagram: software stack (Applications, Middleware, Operating System, RTOS, Comms Layers 1 to 3, HAL, Device Drivers) on Mobile Chipset Hardware made of a Modem Subsystem (MPU, DSP, Modem) and an Application Subsystem (MPU, DSP, Multi-Media) connected through IPC.]


Algorithm ↔ Processor Optimization

[Flow diagram for a Tensilica DPU. Labeled steps: Reference Algorithm; Host-based Cstubs; Profile Algorithm on ISS; Optimized algorithm; (optional) Instruction extensions; Host-based functional Verification; Validate performance on ISS; Validate functionality and performance in SystemC model; Processor subsystem RTL generation; RTL integration verification; Validate in FPGA and SOC system implementation.]

Tensilica DPU ISS in Synopsys Innovator

[Diagram: Tensilica core ISS wrapped with a Wire-to-Pin Adaptor, a PIF-to-TLM-2.0 Adaptor and a Scripting Engine.]


Tensilica DPU in Synopsys Innovator

[Screenshot: information for debugger attachment.]

Tensilica DPU in Synopsys Innovator: Debugger attach


Tensilica DPU in Synopsys Innovator: Debugging

[Screenshot: Xtensa Xplorer Debug Environment and Xtensa Xplorer console window.]

Wrap-up

• Baseband Trends
• A Fresh Look at Dataplane Processors
• Reaching for Performance:
– Integrating RTL
– Special-purpose Instruction Extension
– Advanced Baseband DSPs
• LTE Reference Architecture
• Drill Down on Turbo Processor
• Methodology Flow for Flexible Baseband
• Tensilica Integration with Synopsys Tools


Frank Schirrmeister, Synopsys

SYNOPSYS

Synopsys: Enabling Optimization and Software Development

• MIMO-OFDM Transmitter Receiver Chain
– Decide on the right algorithm
– Optimize the algorithm
• HW/SW Implementation
– Find the best implementation architecture
– Optimize algorithm in the architecture context
– Optimize HW/SW Partitioning
• HW/SW Chip and System Integration
– Optimize HW/SW Integration
– Start HW/SW integration early

[Diagram: software stack (Applications, Middleware, Operating System, RTOS, Comms Layers 1 to 3, HAL, Device Drivers) on Mobile Chipset Hardware made of a Modem Subsystem (MPU, DSP, Modem) and an Application Subsystem (MPU, DSP, Multi-Media) connected through IPC.]


Outline

• Challenges
• Algorithm Optimization
– Performance
– Productivity
– Flow Integration
• HW/SW Optimization
– Software Development
– Verification Integration
– Architecture Analysis
– System Prototyping
• Ecosystem

Some System-Level Challenges

Source: EETimes, 02/05/2007

• How do I develop the right signal processing algorithms, implement them and export them to HDL and Verification flows?
• How do I get a virtual model of my platform to the programmers for software development early?


ALGORITHM OPTIMIZATION

Algorithm Design & Analysis

Designer’s Task: Create algorithm and model meeting key requirements
• BER
• SNR
• Word length
• Image/audio quality
• Sync speed/quality
• . . .

Designer’s Challenge: Growing complexity ⇒ efficiency is key
• Modeling efficiency
• Simulation performance
• Analysis capabilities
• Design & verification flow integration

Designer’s Wish: Model-based design, ultrafast simulation, and analysis of signal processing algorithms


Handling Complexity

Users need to handle complexity:

• Performance
– Highest simulation performance through Stream-Driven Simulation
– Fastest fixed-point simulation (20x-200x faster than OSCI SystemC reference)
• Productivity
– Model-based design
– Signal-processing specific analysis
– Workgroup / data management
– Simulation management
• Flow Integration
– Verification: Export to SystemC (FIFO & RTL interface), Reference model, Verification testbench
– Implementation: Code Generation


Modeling Efficiency: Getting to Simulations Earlier

• Writing a model
– “Text book” modeling style
– Focus on your algorithm
– Design checks
• Developing a design
– Block based design
– Standard interfaces
– Instantiate models via drag & drop
• Managing your design & team
– Compiler/linker flags, Makefile generation, build process
– Parameters & scripting
– Link to revision control system

Model Libraries: From Simple to Most Complex

• Signal-processing models
– Data sources
– Channel models
– Display models
– Analysis models (e.g. BER)
– Filters
– Coder / Decoder
• PLUS: Reference design kits (RDKs)
– Jump-start with standard compliant reference models of advanced wireless, multimedia and telecom technical standards


Example LTE PHY Library

Complete physical layer simulation chain designed to accelerate your physical layer and algorithm development

[Diagram: transmitter example with parallel processing chains. Blocks shown: Channel coding, Scrambling, Modulation mapper, Layer mapper, Pre-coding, Resource mapper, OFDM modulation.]
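Two of the transmitter blocks named on this slide, scrambling and the modulation mapper, are compact enough to sketch directly. The code below is an illustrative stand-alone sketch and is not taken from the LTE PHY Library: it follows the length-31 Gold sequence and the QPSK mapping defined in 3GPP TS 36.211, the function names are hypothetical, and the `c_init` value in the example is arbitrary rather than derived from real cell and user identities.

```python
import math


def gold_sequence(c_init, length):
    """Pseudo-random sequence c(n) built from two length-31 m-sequences."""
    nc = 1600  # fixed offset applied to both m-sequences
    x1 = [0] * (nc + length + 31)
    x2 = [0] * (nc + length + 31)
    x1[0] = 1                      # x1 starts as 1 followed by 30 zeros
    for i in range(31):            # x2 is initialized from the bits of c_init
        x2[i] = (c_init >> i) & 1
    for n in range(nc + length):
        x1[n + 31] = (x1[n + 3] + x1[n]) % 2
        x2[n + 31] = (x2[n + 3] + x2[n + 2] + x2[n + 1] + x2[n]) % 2
    return [(x1[n + nc] + x2[n + nc]) % 2 for n in range(length)]


def scramble(bits, c_init):
    """XOR the bit stream with the scrambling sequence."""
    sequence = gold_sequence(c_init, len(bits))
    return [b ^ c for b, c in zip(bits, sequence)]


def qpsk_map(bits):
    """Map bit pairs to QPSK symbols: 00 -> (+1+1j)/sqrt(2), 11 -> (-1-1j)/sqrt(2)."""
    scale = 1 / math.sqrt(2)
    return [complex((1 - 2 * bits[i]) * scale, (1 - 2 * bits[i + 1]) * scale)
            for i in range(0, len(bits), 2)]


payload = [1, 0, 0, 1, 1, 1, 0, 0]
scrambled = scramble(payload, c_init=0x1234)  # arbitrary seed for illustration
symbols = qpsk_map(scrambled)
```

Because scrambling is an XOR with a known sequence, applying `scramble` a second time with the same `c_init` recovers the payload, which is how a receiver model would undo it.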

LTE PHY Library Applications

• Design and verify communications algorithms
• Create custom test and measurement waveforms
• Generate test models and reference measurement channels (RMCs)
• Supports Golden Reference verification for both hardware and software
• Generate BitErrorRate (BER) and BlockErrorRate (BLER) link level curves

Developed by and available from Steepest Ascent, dedicated experts on the LTE standard

Beta availability: Immediately
General availability: October 2009


LTE PHY Library Highlights

• 3GPP Release 8 E-UTRA Physical Layer implementation conforming to TS36.211, TS36.212 and TS36.213

• Tracking compliance with Release 8 into Release 9

• Full support for Downlink Reference and Synchronization Signals and Uplink Reference Signals

• Complete support for 1, 2 and 4 antenna transmissions including all MIMO layering and precoding options

• Encode/decode data channels: DL-SCH, UL-SCH, PCH, MCH & BCH

• Encode/decode control region channels: DCI, HI and CFI

LTE Library: Model List


Simulation Speed: Getting to Results Faster

• High-speed dataflow simulation engine
– Enabling industry’s fastest simulation and analysis of signal-processing algorithms
– Highly efficient stream-driven simulation technology
• Fastest fixed-point simulation
– Speed-up SystemC fixed-point by 20x to 200x
– Simulate bit-true models almost at floating-point speed
• Distributed Simulations
– Automatically distribute simulations on your compute cluster
– Get to results faster
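A bit-true model reproduces the rounding and overflow behavior of the fixed-point hardware. As a minimal sketch of what such a model computes, the function below quantizes a value to a signed fixed-point format described by a total word length `wl` and an integer word length `iwl` (the parameter convention used by SystemC's `sc_fixed<wl, iwl>`). Truncation toward minus infinity and saturation on overflow are choices made for this illustration; the name `quantize` is hypothetical and this is not how System Studio implements its fixed-point types.

```python
import math


def quantize(value, wl, iwl):
    """Quantize `value` to a signed fixed-point grid.

    wl  : total word length in bits (sign bit included)
    iwl : integer word length in bits (sign bit included)
    The fractional word length is fwl = wl - iwl.
    """
    fwl = wl - iwl
    scale = 1 << fwl                      # number of steps per unit
    raw = math.floor(value * scale)       # truncate toward minus infinity
    lowest = -(1 << (wl - 1))             # most negative raw code
    highest = (1 << (wl - 1)) - 1         # most positive raw code
    raw = max(lowest, min(highest, raw))  # saturate instead of wrapping
    return raw / scale


# An 8-bit format with 1 integer bit covers [-1, 1) in steps of 1/128.
print(quantize(0.3, wl=8, iwl=1))   # 0.296875
print(quantize(1.5, wl=8, iwl=1))   # 0.9921875 (saturated)
print(quantize(-2.0, wl=8, iwl=1))  # -1.0 (saturated)
```

Running a model like this on every sample is what makes library-based fixed-point simulation slow compared with native floating point, which is the gap the slide's 20x to 200x speed-up claim addresses.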


Complexity Demands FASTEST Simulation

[Plot: Bit Error Rate (10^-3 down to 10^-6) versus SNR. Annotation: simulate 100 million data samples for one performance point.]

• Performance is judged by simulating billions of input data
• Mandatory to deploy most efficient simulation paradigm
• Compiled C/C++ simulations

Is FASTEST not Fast Enough? Parallel Iterations

[Diagram: SNR points distributed across CPU cores 1 to N, showing the time saving in time to results.]

• Use of job scheduler significantly reduces time to results
• Utilize the power of your entire compute farm
• Get to results faster!
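The "100 million data samples for one performance point" figure follows from a common rule of thumb: to estimate a bit error rate with reasonable confidence, simulate until on the order of 100 errors have been observed. That rule is an assumption of this sketch, not a statement from the slides. The example also runs a small Monte Carlo simulation of uncoded BPSK over an AWGN channel, chosen only because its theoretical BER is known in closed form; all function names are illustrative.

```python
import math
import random


def samples_needed(target_ber, min_errors=100):
    """Bits to simulate so that about `min_errors` errors are expected."""
    return round(min_errors / target_ber)


def bpsk_ber_theory(ebn0_db):
    """Closed-form BER of uncoded BPSK over AWGN."""
    ebn0 = 10 ** (ebn0_db / 10)
    return 0.5 * math.erfc(math.sqrt(ebn0))


def bpsk_ber_simulated(ebn0_db, num_bits, seed=1):
    """Monte Carlo BER estimate for one SNR point."""
    rng = random.Random(seed)
    ebn0 = 10 ** (ebn0_db / 10)
    sigma = math.sqrt(1 / (2 * ebn0))    # noise std for unit-energy symbols
    errors = 0
    for _ in range(num_bits):
        bit = rng.getrandbits(1)
        received = (1 - 2 * bit) + rng.gauss(0, sigma)  # 0 -> +1, 1 -> -1
        decided = 1 if received < 0 else 0
        errors += decided != bit
    return errors / num_bits


print(samples_needed(1e-6))              # 100000000
print(bpsk_ber_theory(4.0))
print(bpsk_ber_simulated(4.0, 200_000))
```

Each SNR point is an independent run, which is why a BER curve parallelizes naturally: one job per point can be handed to a scheduler and the results collected afterwards.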


Fast Fixed-Point Simulation: Get The Best of Two Worlds

[Chart: simulation efficiency versus modeling efficiency. Entries shown: native data types, fixed-point libraries (with wl, iwl and fwl word-length parameters), System Studio, and the objective region combining both. Scale marks: 1x, 20x, 30x.]


Flow Integration

[Diagram: System Studio feeding both HW Design & Verification and Embedded Software.]

• Re-use your algorithm
• HW design & verification
– Golden reference
– Functional testbench
• Embedded SW dev.
– Use in virtual platforms
– RTL not available or too slow
• De-risk your flow

SystemC/C++ Export: Integration into design/verification flow

• Automatic generation of SystemC/C++ wrapper...
– Function call interface
– SystemC FIFO interface
– SystemC signal interface
• ... and utilities
– Makefile
– Existence testbench
• Plug&play: Export into HDL simulation
• No System Studio knowledge required


HDL Import

• Automatic interface and wrapper generation
– Dataflow interface for HDL model
• Automatic synchronization of simulation engines
• Support for all major HDL simulators for Verilog and VHDL
– VCS-MX
– MTI
– NC-SIM

RTL Code Generation

• Synthesizable RTL from dataflow description
• Two options
– Library based: full control over HDL implementation
– Single-source concept: fastest route to RTL
– Both options can be mixed
• Automatic generation of HDL co-simulation


Integration into HW Verification Flow: Exporting Algorithmic Models

• Automatic generation of wrappers
– C/C++ interface
– SystemC FIFO or signal
• Replaces need for paper spec & I/O files
• No System Studio knowledge required by HW verification engineer
• Optional encryption

[Diagram: an Algorithmic Model inside a SystemC Wrapper with clock and reset inputs, parameters (param1, param2=256) and ports port0, port1 and port2, each with data, valid and ready signals.]
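The wrapper in the diagram exposes each port as data, valid and ready signals. A transfer happens only in a cycle where the sender asserts valid and the receiver asserts ready. The sketch below models that handshake around a golden function in plain Python; it is a conceptual illustration only, the class and function names are hypothetical, and it does not represent the SystemC code that System Studio generates.

```python
from collections import deque


class HandshakeWrapper:
    """Cycle-based stand-in for a wrapper with data/valid/ready ports."""

    def __init__(self, model, depth=2):
        self.model = model      # algorithmic (golden) function
        self.fifo = deque()     # results waiting at the output port
        self.depth = depth      # output buffering before input stalls

    def out_valid(self):
        return bool(self.fifo)

    def clock(self, in_valid, in_data, out_ready):
        """Advance one cycle. Returns (input_accepted, output_or_None)."""
        in_ready = len(self.fifo) < self.depth  # sampled at start of cycle
        output = None
        if out_ready and self.fifo:             # output transfer
            output = self.fifo.popleft()
        accepted = in_valid and in_ready        # input transfer
        if accepted:
            self.fifo.append(self.model(in_data))
        return accepted, output


def run(wrapper, samples, ready_pattern):
    """Drive samples in and collect outputs under consumer back-pressure."""
    pending, outputs, cycle = deque(samples), [], 0
    while pending or wrapper.out_valid():
        out_ready = ready_pattern[cycle % len(ready_pattern)]
        data = pending[0] if pending else None
        accepted, output = wrapper.clock(bool(pending), data, out_ready)
        if accepted:
            pending.popleft()
        if output is not None:
            outputs.append(output)
        cycle += 1
    return outputs


# The consumer is ready only every third cycle; no sample is lost or reordered.
results = run(HandshakeWrapper(lambda x: x * 256), [1, 2, 3, 4, 5],
              ready_pattern=[True, False, False])
```

The point of the handshake is that the verification testbench can apply arbitrary back-pressure while the reference results stay identical to a plain function call over the same inputs.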

Hardware-in-the-Loop Simulation

• Execute RTL code on CHIPit box (FPGA prototyping)
• Reuse algorithm simulation setup for stimuli generation and analysis


Integration into Embedded SW Flow: Model Reuse in Virtual Platforms

[Diagram: virtual platform of an SoC. Labeled elements: ARM920, DMA, Memory controller, RAM, System Bus, Peripheral Bus, VT100, LCD, Picture Generator, Rendering, Analysis. Annotations: Stimuli generation, Functional model.]

Algorithm Design & Analysis Needs

[Diagram: needs grouped under Productivity and Performance. Labeled elements: Model-based Design, Analysis and Debugging, Models (Source Code), Highest Simulation Performance, Verification Flow Integration.]


Synopsys’ DSP Algorithm Portfolio

• System Studio Libraries
– 2000+ models (signal gen, basic DSP, analog and digital)
• System Studio
– Environment for design and analysis of signal processing algorithms
– Fastest way to functional specification through model based design
– Shortest time to results through highest simulation performance
– Increase overall verification productivity through HDL import and SystemC export capabilities

HW/SW OPTIMIZATION


Why Virtual Platforms?
A Chip Design Project P&L Without Virtual Platforms

• Prototype for software development available late in the design cycle
• Tests for post-silicon validation are developed late as well

Virtual Platforms
Pre-Silicon Models of the Hardware

Early Availability, Enhanced Debugging, Easy Deployment

• Fully functional software model of SoC, board, I/O, user interface
• Executes unmodified production code
• Runs at almost real-time
• High system visibility and control incl. multi-core debug


Virtual Platform Business Impact
Get to market early and increase profit

Start software development pre-silicon, long before hardware is available!

Start pre-silicon verification and post-silicon validation using virtual platform as “DUT”

Virtual Platform Usage Example
Pre- and Post-Silicon

[Figure: Number of runtime licenses over time, marking Early (Pre-Si) Software Development, Hardware Availability and Deliver to OEM.]


Virtual Platforms
From Functional Specification to RTL Implementation

[Figure: Flow from Functional Specification (defining the overall hardware / software system) through Platform Creation & Analysis to Platform Deployment, alongside the RTL to GDSII Implementation Flow. Inputs: Model Creation, Algorithms, IP-XACT, IP Libraries (SystemC, Your-lib-1). Deployment side: 3rd Party Software Debuggers.]

Virtual Platform Value
Three Major Use Cases of Virtual Platforms

Software Development (Prior to RTL)
• High-performance simulation of complete systems
• Professional IDE for enhanced debug & visibility
• Open “Framework” based on SystemC TLM-2.0
• Scalable environment that enables internal & external users
• Integration with all major embedded software tool chains

Verification & Validation (Post RTL / Silicon)
• Simulation of complete systems including board-level test harnesses
• Innovator GUI enables test engineer productivity, emulating test harnesses
• Links to VCS / VMM “System to Silicon” verification flow
• Proven links to emulation solutions: EVE, Palladium
• Proven links to Synopsys CHIPit

Performance Profiling (Architecture Analysis)
• Hybrid simulation of “loosely” and “accurately” timed models
• Innovator “Platform Analyzer” add-on enables customizable analysis
• Innovator extensions to SystemC for profiling & instrumentation
• Proven integration with 3rd party models
• Developing “save & restore” capability to enable more visibility


Virtual Platform Development Needs

• Powerful IDE
– Graphical design capture
– Model creation wizard
– Debugging support
– Domain-specific visualization
• Native SystemC support
• Run Time
– User’s seat
– Enabling execution of virtual platform

Library Needs
Example: DesignWare® System-Level Library

• Accelerate virtual platform creation, through a portfolio of transaction-level models (TLMs)
– High performance & quality
– 100+ titles & growing …
• Written in SystemC™
• Migrating to TLM-2.0 API
• Tool independent: works with any IEEE-1666 compliant SystemC simulator
• Supported on Windows & Linux
• Delivered in binary format
• Model Authoring Libraries

[Figure: Library contents: Processors, DesignWare Cores, DW AMBA Models, CoreConnect Models, PrimeCell Infrastructure Models (new), Model Authoring Libraries, Pre-assembled Platforms (new).]


Example: Texas Instruments

[Figure: TI device roadmap for Cellular and Consumer markets. Families shown: OMAP1®, OMAP2®, OMAP3®, OMAP-Vox®, DaVinci Family. Devices shown: OMAP1510, 1610, 1623, OMAP2420, OMAP2430, OMAP3430, OMAPV1030, OMAPV1230, OMAPV2230, OMAPV2320, TCS2310 (LoCosto), DRP, DA295S, DM420, unannounced devices.]

• Virtual platform available 9-12 months before HW is available, allowing 85-90% of all software to be developed pre-silicon
• 2-5x SW development productivity improvement
• 1st-day HW/SW integration
• Deployed to TI customers

“TI used the VPOM-2430 Virtual Platform successfully to accelerate software development for the OMAP2430 processor device. Because of the effective simulation environment provided by SNPS, our teams were able to immediately run and test our software when the OMAP2430 processor became available. SNPS is a key element of TI's plan to reduce the time needed to provide software after silicon becomes available.”

Avner Goren, Worldwide Marketing Director, Cellular Systems, TI's Wireless Terminal Business Unit.

Virtual Platform Value
Three Major Use Cases of Virtual Platforms

Software Development (Prior to RTL)
• High-performance simulation of complete systems
• Professional “Innovator” IDE for enhanced debug & visibility
• Open “Framework” based on SystemC TLM-2.0
• Scalable environment that enables internal & external users
• Integration with all major embedded software tool chains

Verification & Validation (Post RTL / Silicon)
• Simulation of complete systems including board-level test harnesses
• Innovator GUI enables test engineer productivity, emulating test harnesses
• Links to VCS / VMM “System to Silicon” verification flow
• Proven links to emulation solutions: EVE, Palladium
• New links to Synplicity HAPS available soon

Performance Profiling (Architecture Analysis)
• Hybrid simulation of “loosely” and “accurately” timed models
• Innovator “Platform Analyzer” add-on enables customizable analysis
• Innovator extensions to SystemC for profiling & instrumentation
• Proven integration with 3rd party models
• Developing “save & restore” capability to enable more visibility


RTL Verification in the Presence of SW
Mixing TLM Model, RTL & Assertions

Key Values & Usage
– Software/Hardware Integration Validation
  • Validate HW/SW integration on actual hardware (RTL)
– System Validation
  • Scales verification to system context
  • Software becomes part of the verification test bench
  • Verification confidence increases with “real” system scenarios
– RTL Simulation Speed-up
  • Maintain TLM level where possible
  • TLM model used to generate system stimuli

[Figure: System-on-chip running (system) software: CPU(s) on an instruction set simulator, TLM bus, memory controller, peripherals, system I/O, camera controller, flash memory and USB, with a transactor connecting the TLM bus to an RTL (sub)system in VCS/VMM. Surrounding context: System/Device.]

Early Testbench Creation & Integration
Integration with VMM Methodology

Key Values & Usage
– Early Testbench Development
  • Develop all TB infrastructure with TLM platform
– Early Test-case / Scenario Development
  • VMM scenarios / test-cases
  • “Embedded directed software” tests used for system (integration) testing can be efficiently developed on TLM model
– Higher Test-case (software) productivity
  • Faster turnaround & better visibility into TLM platform
– Technology / Methodology
  • SystemC support in VMM
  • Layered VMM testbench approach
  • VCS TLI interface

[Figure: Early testbench creation & test development. The testbench (test-case / scenario, coverage / self-check, driver, monitor) is shown with the virtual platform and with the RTL DUT.]


Co-Verification
Mixing TLM Model, RTL & Assertions

[Figure: Screenshots labelled Assertion Specification, DesignWare™ Virtual Platform, Software Debugger, VCS / DVE.]

System Prototyping
Virtual & Hardware Prototypes


Combined Virtual And FPGA Prototype
Best Of Both Worlds

Use Models

– Offload software to workstation

– Virtual Platform re-using existing RTL in FPGA Prototype

– Virtual Platform as test bench for FPGA Prototype

– Joint virtual / real system environment connections (USB, SATA, …)

– Virtual ICE in virtual platform connected to FPGA prototype

System Prototyping
From pre-RTL virtual prototype to hardware prototype

[Figure: End-to-end prototyping design flow. Semiconductor house: Chip Architecture, RTL, Netlist, GDSII, Si Proto, with Firmware, OS & Driver and Middleware. System house: Product Architecture, Device Proto, with OS & Driver, Middleware and Application SW. Prototypes shown: Previous Chip (for derivative), Virtual Prototype, FPGA Prototype, Hybrid, SI Prototype. Annotations: SDK usage, Schedule Improvement.]



Transaction-level Modeling Abstraction Levels

• App View TLM (AV): 80+ MIPS
• Prog. View TLM (LT): 40-60 MIPS
• PV with Timing TLM (AT): 1-10 MIPS
• C-translated-RTL models, co-emulation, RTL co-simulation: 1-100 KIPS

Accuracy scale: Functionally Accurate, Cycle Approximate, Cycle Accurate

Use cases: Pre-silicon Software Development & Integration & Real-Time SW Development; Architectural Exploration; System Verification
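To put the speed ranges on this slide in perspective, the following sketch estimates the wall-clock time needed to simulate a fixed software workload at each abstraction level. The workload size of 2 billion instructions is a hypothetical figure chosen for illustration, and the pairing of each speed with a level (taking the lower bound of each range) is an assumption read off the slide layout, not a measured result.

```python
# Simulation speed in instructions per second, lower bound of each range.
# Assumption: speeds are paired with levels in the order shown on the slide.
SPEEDS_IPS = {
    "App View TLM (AV)": 80e6,
    "Prog. View TLM (LT)": 40e6,
    "PV with Timing TLM (AT)": 1e6,
    "RTL-level (co-simulation / co-emulation)": 1e3,
}

# Hypothetical workload, e.g. an OS boot plus a short test scenario.
WORKLOAD_INSTRUCTIONS = 2_000_000_000


def wall_clock_seconds(instructions, instructions_per_second):
    """Time to simulate a workload at a given simulation speed."""
    return instructions / instructions_per_second


if __name__ == "__main__":
    for level, ips in SPEEDS_IPS.items():
        seconds = wall_clock_seconds(WORKLOAD_INSTRUCTIONS, ips)
        print(f"{level}: {seconds:,.0f} s")
```

Under these assumptions the same workload takes well under a minute at the AV or LT level but on the order of weeks at RTL speeds, which is the reason the deck reserves the slower levels for verification rather than software development.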


SW Centric Architecture Analysis

[Figure: TLM-2.0 “AT” simulation writes a logfile that feeds interactive data processing.]

• Data visualization and processing
• Flexible and extensible
– Support for user-defined types
– Field configurable through plug-in API
• Rich set of data views
• High performance logger and loader
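As a rough illustration of the kind of post-processing such a log enables, the sketch below summarizes logged bus transactions per initiator. The record layout (initiator, start time, end time, byte count) and the sample values are hypothetical; the slide does not specify the actual log format or plug-in API.

```python
from collections import defaultdict

# Hypothetical log records: (initiator, start_ns, end_ns, bytes_transferred).
LOG = [
    ("cpu", 0, 40, 32),
    ("dma", 10, 90, 64),
    ("cpu", 50, 70, 32),
    ("dma", 100, 180, 64),
]


def summarize(log):
    """Aggregate transaction count, average latency and bytes per initiator."""
    totals = defaultdict(lambda: {"count": 0, "latency_ns": 0, "bytes": 0})
    for initiator, start_ns, end_ns, nbytes in log:
        entry = totals[initiator]
        entry["count"] += 1
        entry["latency_ns"] += end_ns - start_ns
        entry["bytes"] += nbytes
    return {
        name: {
            "count": t["count"],
            "avg_latency_ns": t["latency_ns"] / t["count"],
            "bytes": t["bytes"],
        }
        for name, t in totals.items()
    }


if __name__ == "__main__":
    for name, stats in summarize(LOG).items():
        print(name, stats)
```

Per-initiator latency and traffic totals of this kind are typical inputs to the architecture views the slide refers to, for example when deciding whether a DMA engine is starving the CPU of bus bandwidth.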

Display Options (Examples)


Synopsys’ Virtual Platform Portfolio

DesignWare® System-Level Library: High-performance models to build virtual platforms (SystemC™ Transaction Level Models: Processors, DesignWare Cores, DesignWare AMBA Components, Pre-Assembled Platforms)

Innovator: Environment for developing, running & debugging virtual platforms

Services: Expert services for model creation, virtual platform assembly & customization

• Start SW development early and shrink time-to-market using high-performance Virtual Platforms
• Enhance design quality through SystemC executable specification
• Increase design confidence through complete HW/SW system verification

SYSTEM-LEVEL ECO-SYSTEM


Example: Virtual Platforms
Solution for Early and Productive Software Development!

[Figure: Traditional design flow compared with the end-to-end prototyping design flow. In both, the semiconductor house covers Chip Architecture, RTL, Netlist, GDSII and Si Proto with Firmware, OS & Driver and Middleware, and the system house covers Product Architecture and Device Proto with OS & Driver, Middleware and Application SW. The traditional flow relies on the SI Prototype and Device Prototype; the prototyping flow adds Previous Chip (for derivative), Virtual Prototype and FPGA Prototype with SDK usage, yielding a Schedule Improvement.]

Multiple Players Needed

• Models from multiple vendors
• Different Simulator Options
• Different Options for Software Debug
• Driver Software for Specific IP Blocks


System-Level Eco-system

Steepest Ascent and Synopsys

[Figure: Steepest Ascent LTE Library in Synopsys System Studio. Library contents: Test models & RMCs, Custom waveforms, Reference receivers (synchronisation, ch. estimation, MIMO receiver & equalisation). Uses: test, simulation, verification.]


Tensilica DPU ISS in Synopsys Innovator

[Figure: Tensilica core ISS integrated via a Wire-to-Pin Adaptor, a PIF-to-TLM-2.0 Adaptor and a Scripting Engine.]

SUMMARY


System-Level Challenges

How do I develop the right signal processing algorithms, implement them and export them to HDL and Verification flows?

How do I get a virtual model of my platform to the programmers for software development early?

Source: EETimes, 02/05/2007

• System Studio: Design and analysis of signal processing algorithms
• System Studio Model Libraries
• Innovator: Develop, execute & analyze virtual platforms
• DesignWare® System-Level Library
• Services: Model & virtual platform creation & support; training & methodology transfer

Predictable Success