www.imgtec.com

Saeid Azmoodeh

Director of Engineering, Bristol Design Centre

24th September, 2012

Multicore, Scalability and High Performance at Low Power

2 © Imagination Technologies Multicore Conference Bristol 2012

Leading silicon, software & cloud IP supplier

Graphics, video, comms, processor, cloud

Licensing and royalty business model

Licensed to many top 20 semis & OEMs

Servicing high volume, high growth markets

Shipped by most major consumer brands

Smartphones, media players, tablets/netbooks, TVs/STBs, gaming devices, radios, connected devices, dashboards/navigation

Strategic product division: PURE

Digital radio, internet connected audio (today)

IP business pathfinder, market maker

Established technology powerhouse

Founded 1985; London FTSE 250 (IMG.L)

Employees: 1,200+

UK HQ; operations world-wide

Global customer base

Company Overview

UK Headquarters

R&D

Sales

Solution Centric IP

Delivering all the world’s standards:

Our Technology business

SoC Technologies

Comms:

TV

Radio

Mobile TV

Wi-Fi

Bluetooth

CPU OS:

Linux

Android

MeOS

Graphics & GPU:

OpenGL ES

OpenCL EP

OpenGL

OpenCL

OpenVG

DirectX 9/10/11

OpenRL

Display:

De-interlace

Frame Rate Conversion

Noise reduction

Video:

H264

MPEG4

MPEG2

VC1

AVS

VP6/8

SoC

PowerVR Video

PowerVR Display

Customer & 3rd Party IP

PowerVR Graphics

IMGworks Customised IP

Real Time

Audio

Meta Processor

Signal Processing

General Processing

Ensigma Communications

Mobile Phone

Handheld Multimedia

Home Electronics

Mobile Computing

Automotive

Emerging Markets

DSP/Real Time:

Audio

SW Stacks

DSP

Real Time

V.VoIP

VoLTE

I/O

Audio

Screens

Video & graphics

Caustic Professional: Ray Tracing Solutions

HelloSoft: VoLTE & V.VoIP

Flow: Cloud Connectivity Solutions

Our technology partnerships growing stronger

Many partners including:

[Partner logos]

How we see SoCs

[Diagram: an SoC comprising CPU, Memory, GPU + VPU and RPU]

From CPU-centric Homogeneous…

[Diagram: software stack on a CPU-centric SoC]

Application

Operating System

Low level Drivers (three blocks)

“optimised code”

CPU

Peripherals

To Heterogeneous Computing

[Diagram: software stack on a heterogeneous SoC]

Application

Operating System

Processors: CPU, RPU (Radio), GPU, VPU

APIs and drivers: ISA low level API, Communications API Drivers, OpenGL ES graphics API, OpenCL compute API, OpenMAX video/audio API

Making Heterogeneous Real

Uniqueness

Differentiation

Cost optimised

A typical SoC

Memory

Meta MTP: Audio DSP

Ensigma RPU: connectivity & broadcast

PowerVR Display: Frame Rate Converter

PowerVR VPU: Video decoder & encoder

PowerVR GPU: Graphics 3D/2D, Ray Tracing and GPU compute

Video

Audio

Wi-Fi

Bluetooth

DTV

Radio

Camera

HelloSoft V.VoIP/VoLTE: Voice & video over IP

Meta HTP: Android/Linux

Ensigma RPU: connectivity

Wireless

Display

IMGworks Custom: Pixel Display Pipeline

Anticipate Trends

Started development at least 5-10 years before market traction

SoC

PowerVR Video

Audio

Display

PowerVR Display

Customer’s Internal IP

PowerVR Graphics

3rd Party IP

IMGworks Customised IP

Meta Applications

Ensigma Communications

PowerVR Graphics: Tile-based Deferred Rendering; highest performance per mm² & mW; highly scalable architecture

Ensigma: universal architecture for broadcast and connectivity; optimum mix of configurable hardware and programmability

Meta: hardware multi-threading; Automatic MIPS Allocation; configurable GP/DSP threads

New Key Technology Trends

GPUs will power the parallel software revolution

GPUs will become the “heavy lifting” processors of SoCs

GPUs drive mass market scalable parallel processing

This will trigger a new wave of parallel software

On-chip RPUs complete SoC evolution

Communications must move on-chip just like everything else

RPUs (Radio Processor Units), supporting a broad range of global connectivity and broadcast receiver standards, will drive this

Multi-threading makes embedded processors effective

Without it, unified memory architectures just stall

[Diagram: an SoC comprising CPU, Memory, GPU + VPU and RPU]

Why multi-threading works: Pipeline Occupancy

[Diagram: pipeline occupancy against elapsed time, two cases]

Conventional CPU pipeline occupancy: Task A and Task B run one after the other. The pipeline is idle during each cache miss, and supervisory code (RTOS task switch) runs between the two tasks.

Meta CPU pipeline occupancy: Thread A and Thread B share the pipeline, with the remaining slots marked Idle. Super-threading can occur where there is no resource conflict.

The Meta Advantage

Realise as:

Less MHz

Less Power

Smaller Core

More Processing
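The occupancy argument on this slide can be illustrated with a toy model. The sketch below is not Imagination's scheduler: it assumes a single-issue, in-order pipeline, a fixed-priority choice between ready threads, and a hypothetical cache-miss latency of 4 cycles (`MISS`), and it ignores the RTOS task-switch overhead shown for the conventional CPU. All names are illustrative.

```python
def occupancy(threads):
    """Return the fraction of cycles in which a single-issue pipeline issues.

    Each thread is a non-empty list of instruction latencies in cycles:
    1 for a cache hit, more for an instruction that waits on a cache miss.
    """
    ready_at = [0] * len(threads)  # earliest cycle each thread may issue again
    pc = [0] * len(threads)        # index of each thread's next instruction
    cycle = issued = 0
    while any(pc[t] < len(threads[t]) for t in range(len(threads))):
        for t in range(len(threads)):  # fixed priority: lowest index first
            if pc[t] < len(threads[t]) and ready_at[t] <= cycle:
                ready_at[t] = cycle + threads[t][pc[t]]
                pc[t] += 1
                issued += 1
                break  # single issue: at most one instruction per cycle
        cycle += 1
    # Elapsed time ends when the last instruction's result is ready.
    return issued / max(ready_at)


MISS = 4  # assumed cache-miss latency in cycles (illustrative only)
task = [1, MISS, 1, MISS]

one_thread = occupancy([task])         # the pipeline idles during every miss
two_threads = occupancy([task, task])  # the second thread fills some idle slots
print(one_thread, two_threads)
```

With these assumed numbers the single-thread run issues in 4 of 10 cycles, while the two-thread run issues in 8 of 12, because the second thread uses cycles the first spends waiting on memory.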

Ensigma RPU – Overview (Dataflow)

[Dataflow diagram]

Receive path: IF / Low IF / I/Q → SCP → Well Timed Complex Samples → MCP → Soft Decisions → ECP → Output Stream

Transmit path: Tx Data → TXP → Complex Symbols; Resampler; I/Q

Signal Conditioning Processor (SCP)

Digital Down Conversion

Multi-mode Hardware

Modulation & Coding Processor (MCP)

Software (de)modulation of the signal

SIMD Complex Vector Processor

Executes firmware (software)

Utilizes local code & global data stores

Has access to external memory

Error Correction Processor (ECP)

Error correction & recovery

Multi-mode Hardware Toolkit

Transmit Processor (TXP)

Error Correction & IQ mapping

Multi-mode Hardware
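The receive dataflow on this slide (SCP, then MCP, then ECP) can be sketched as three chained stages. Everything in the sketch is a stand-in chosen for brevity, not Ensigma's actual processing: BPSK on a low-IF carrier, integrate-and-dump demodulation, and a 3x repetition code. The constants and function names are hypothetical.

```python
import cmath
import math

SAMPLES_PER_SYMBOL = 4       # assumed oversampling
IF_CYCLES_PER_SAMPLE = 0.25  # assumed low-IF carrier: one cycle per 4 samples
REPEAT = 3                   # stand-in error-correcting code: 3x repetition


def transmit(bits):
    """Test source: repetition-code the bits, map to BPSK, modulate onto the IF."""
    symbols = [1.0 if b else -1.0 for b in bits for _ in range(REPEAT)]
    return [a * math.cos(2 * math.pi * IF_CYCLES_PER_SAMPLE * n)
            for k, a in enumerate(symbols)
            for n in range(k * SAMPLES_PER_SYMBOL, (k + 1) * SAMPLES_PER_SYMBOL)]


def scp(if_samples):
    """SCP stand-in: digital down conversion of real IF samples to complex I/Q."""
    return [x * cmath.exp(-2j * math.pi * IF_CYCLES_PER_SAMPLE * n)
            for n, x in enumerate(if_samples)]


def mcp(iq):
    """MCP stand-in: integrate each symbol period into one soft decision."""
    return [sum(iq[i:i + SAMPLES_PER_SYMBOL]).real
            for i in range(0, len(iq), SAMPLES_PER_SYMBOL)]


def ecp(soft):
    """ECP stand-in: decode the repetition code by summing soft decisions."""
    return [1 if sum(soft[i:i + REPEAT]) > 0 else 0
            for i in range(0, len(soft), REPEAT)]


def receive(if_samples):
    """Chain the stages in the order the dataflow diagram shows."""
    return ecp(mcp(scp(if_samples)))


bits = [1, 0, 1, 1, 0]
corrupted = transmit(bits)
corrupted[0:SAMPLES_PER_SYMBOL] = [-x for x in corrupted[0:SAMPLES_PER_SYMBOL]]
print(receive(corrupted) == bits)  # one inverted symbol is still corrected
```

The point of the sketch is the separation of concerns: each stage consumes the previous stage's output type (I/Q samples, soft decisions, decoded bits), so a stage can be swapped for a different standard without touching the others.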

Ensigma UCCP Series3

Multiple processors; multi-context engines

[Block diagram: Ensigma UCCP]

(Shared) System Memory

Meta LTP/MTP Processor

External Host

SCP (two instances)

L2 (MAC) Accelerators

TXP

Tx Digital IF/IQ

Rx Digital IF/ZIF

MPEG-2 TS

ECP

MCP (two instances)

UCCP Memory

SoC Fabric

Core Memories

PTP

It used to be so simple…

Then came mobile…

And they got smaller – and bigger too…

But they were all still just phones

Then came the apps…

But “supercomputing in mobile”?

Yes you can!

[Chart: Performance (vertical axis) against Time (horizontal axis), showing a performance range with “Today” marked]

GPU increasingly dominates SoC processing

Data parallel architectures

GPU multi-processor and multi-pipe configurability enables far more extensive processor scaling than CPUs

SMPs unlikely to scale past 4 CPUs

OpenCL unlocks the enormous processing potential of GPU Compute

PowerVR Series 6 Rogue Block Diagram

Architecture & Data Flow Overview

[Block diagram: PowerVR Series 6 - Rogue]

Host Bus

Host CPU Interface

Control and Register Bus

Core Management Unit

Vertex Data Master

Compute Data Master

Pixel Data Master

Coarse Grain Scheduler

Unified Shading Cluster Array: USC0, USC1, … USCn-1, USCn

Shared texture pipeline (shown once beside USC0/USC1 and once beside USCn-1/USCn)

Tiling Co-Processor

Tessellation Co-Processor

Pixel Co-Processor

2D Core (PTLA)

Multi-level Memory Cache Unit (MCU)

System Memory Interface

System Memory Bus

Note: Tessellation Co-Processor included in DX11 Rogue Architecture cores only

Imagination’s Scalable Approach

Multi-threading to Multi-Pipeline to Multi-Core

[Diagram: scheduling hierarchy]

Applications

API Driver

Low Level Services

MP Core Scheduling

Pipe Scheduling (appears three times)

Thread Scheduling (appears four times)

GPU graphics - more than games

Graphics and Video: User Interface, 3D Navigation, Gaming, Films

Real-time Processing: Camera Enhancement, Augmented Reality, Game Physics, Feature Detection

GPU compute – much more than graphics

Image Processing: dynamic contrast enhancement, image sensor compensation, post-processing effects

Image Analysis: gesture recognition, object “lassoing” and tracking

Complex Physics: advanced body dynamics, fluid dynamics

Data Analysis: in-scene object classification, image correlation

OpenCL APIs will be used alongside OpenGL ES APIs to bring a wealth of new capabilities to mobile and embedded platforms…
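Workloads like the image-processing items above suit GPU compute because each output pixel can be computed independently. The sketch below shows that shape using a plain linear contrast stretch, which is an assumption: the slides do not say which contrast algorithm is meant, and the function names are illustrative. In an OpenCL version, the per-pixel function would typically become the kernel body, with one work-item per pixel.

```python
def stretch_pixel(value, lo, hi):
    """Kernel body: remap one 8-bit pixel so that [lo, hi] spans 0..255."""
    if hi == lo:
        return value  # flat image: nothing to stretch
    scaled = (value - lo) * 255 // (hi - lo)
    return max(0, min(255, scaled))


def enhance_contrast(image):
    """Apply the kernel to every pixel of a row-major greyscale image."""
    flat = [p for row in image for p in row]
    lo, hi = min(flat), max(flat)  # reduction step over the whole image
    # Each stretch_pixel call depends only on its own pixel plus lo and hi,
    # so the calls could run in any order or all at once.
    return [[stretch_pixel(p, lo, hi) for p in row] for row in image]


print(enhance_contrast([[50, 100], [150, 200]]))  # [[0, 85], [170, 255]]
```

The only step that needs the whole image is the min/max reduction; everything after it is a pure per-pixel map, which is the part that scales with the number of parallel execution units.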

Summary: how we see SoCs

Each processor is multicore; each SoC is heterogeneous

[Diagram: an SoC comprising CPU, Memory, GPU + VPU and RPU]

Conclusions

We’re entering the era of mass market supercomputing

It will be driven by highly parallel architectures such as GPUs complementing today’s CPUs

The evolution will be less dramatic than you may think…but the consequences will be profound
