

© ARM 2017

ARM instruction sets and CPUs for wide-ranging applications

Chris Turner

ARM Tech Forum Taipei

Director, CPU technology marketing

July 4th 2017


ARM computing is everywhere

6.6Bn ARM-based embedded chips shipped in 2016

#1 shipping GPU in the world is Mali

> 5Bn people using ARM-based mobile phones

100Bn ARM-based chips to date


Total computing from sensors to server

Efficiency enabling distributed compute

Efficiency delivering TCO benefits

Demands more performance with greater emphasis on efficiency and power management

Performance, efficiency and security


ARM CPU architecture for total computing

Cortex-A: Highest performance. Optimized for high-level operating systems

Cortex-R: Fast response. Optimized for high performance, hard real-time applications

Cortex-M: Smallest/lowest power. Optimized for discrete processing and microcontrollers

SecurCore: Tamper resistant. Optimized for physical security


Total computing in mobile


The right combination of CPUs

Cortex-M: Low power, deterministic sensing and control

Cortex-A: Rich UI and OS, open system, high performance

Cortex-R: Safety, performance and real-time control

Automotive, Consumer, IoT, Mobile


ARM architecture profiled for the application

ARMv8-A: A64, A32, T32

ARMv8-R: A32, T32

ARMv8-M: T32

Features labelled across the profiles: programmable exception model, automated exceptions, virtual memory, protected memory, TrustZone, virtualization, NEON, SIMD, DSP

Automotive, Consumer, IoT, Mobile
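As a rough illustration of the profile split on this slide, the instruction sets per profile can be written as a small lookup table. The Python sketch below assumes the diagram's labels pair up as ARMv8-A with A64, A32 and T32, ARMv8-R with A32 and T32, and ARMv8-M with T32. The names `PROFILE_ISAS`, `profiles_supporting` and `common_isas` are hypothetical helpers for this example, not part of any ARM tool.

```python
# Instruction sets per ARMv8 profile, as read from the slide.
# The pairing of labels to profiles is an assumption about the diagram.
PROFILE_ISAS = {
    "ARMv8-A": ("A64", "A32", "T32"),
    "ARMv8-R": ("A32", "T32"),
    "ARMv8-M": ("T32",),
}


def profiles_supporting(isa):
    """Return the profiles whose listed instruction sets include `isa`."""
    return [profile for profile, isas in PROFILE_ISAS.items() if isa in isas]


def common_isas(*profiles):
    """Return the instruction sets shared by all given profiles."""
    shared = set(PROFILE_ISAS[profiles[0]])
    for profile in profiles[1:]:
        shared &= set(PROFILE_ISAS[profile])
    # Report in a fixed order so the result is deterministic.
    return [isa for isa in ("A64", "A32", "T32") if isa in shared]


if __name__ == "__main__":
    print(profiles_supporting("T32"))
    print(common_isas("ARMv8-A", "ARMv8-R"))
```

Read this way, T32 is the one instruction set common to all three profiles.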


Evolving architecture that supports an ecosystem

Silicon partners

Applications

Software vendors

Development tools

ARM11, ARM9


ARM ecosystem – widest choice, most innovation


ARM application processors are everywhere

▪ Cortex-A CPUs cover a wide variety of markets

▪ Scale efficiently to substantially higher performance

▪ Fit even more compute in a smaller footprint with less power

Mobile and consumer; Servers and networking; IoT and embedded; Automotive, industrial


ARM Cortex-A portfolio

Timeline: 2008 – 2013, 2014, 2015, 2016, 2017 (year of IP release, volume devices in the subsequent year)

ARMv7-A

Cortex-A5/A7: Smallest and lowest power (32-bit)

Cortex-A9: Well-established, mid-range processor (32-bit)

Cortex-A15/A17: Infrastructure performance; mobile efficiency (32-bit)

ARMv8-A

Cortex-A32: Smallest, lowest power 32-bit ARMv8-A (32-bit)

Cortex-A35: Smallest, lowest power ARMv8-A (64/32-bit)

Cortex-A53: Balanced performance and efficiency (64/32-bit)

Cortex-A55: Highest efficiency mid-range processor (64/32-bit)

Cortex-A57: Proven infrastructure performance (64/32-bit)

Cortex-A72: For all applications (64/32-bit)

Cortex-A73: For mobile and consumer (64/32-bit)

Cortex-A75: Ground-breaking performance for all markets (64/32-bit)

A3x Series, A5x Series, A7x Series

big.LITTLE compatible


Meeting performance needs for any application

Cortex-A portfolio performance comparisons*

[Chart] Ultra high efficiency: relative iso-frequency performance of Cortex-A5, Cortex-A7, Cortex-A32 and Cortex-A35

[Chart] High efficiency: relative iso-frequency performance of Cortex-A9, Cortex-A35, Cortex-A53 and Cortex-A55

[Chart] High performance: relative performance at target frequency of Cortex-A9 @ 1.5GHz, Cortex-A15 @ 1.9GHz, Cortex-A17 @ 1.6GHz, Cortex-A57 @ 2.1GHz, Cortex-A72 @ 2.5GHz, Cortex-A73 @ 2.7GHz and Cortex-A75 @ 3GHz

*Performance comparison using SPECINT2000 benchmark suite


big.LITTLE – performance and efficiency for mobile

▪ Faster, more responsive systems

▪ Increased battery life

▪ Higher, longer sustained performance

▪ Tuned for mobile, consumer and embedded

Higher performance

Superior power efficiency

Higher compute capacity

big cluster (with cache), LITTLE cluster (with cache)

Cache Coherent Interconnect

Interrupt controller


DynamIQ plus big.LITTLE

Example configurations: 1b+4L, 1b+3L, 1b+2L, 1b+7L, 2b+6L, 4b+4L

Cluster architecture: 1 to 8 CPUs (CPU1 to CPU8), each with L1C & L2C

Asynchronous bridges

DynamIQ Shared Unit (DSU): snoop filter, power manager, L3 cache, bus i/f, ACP and peripheral port i/f
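The configuration labels on this slide can be checked against the stated cluster size with a few lines of Python. This is a minimal sketch that assumes a label such as `1b+4L` means one big core plus four LITTLE cores in a single cluster, and that the "1 to 8 CPUs" limit applies to the sum. The helper names `parse_config` and `fits_single_cluster` are hypothetical.

```python
import re

MAX_CLUSTER_CPUS = 8  # the slide states "1 to 8 CPUs" per cluster

# Labels as they appear on the slide.
SLIDE_CONFIGS = ["1b+4L", "1b+3L", "1b+2L", "1b+7L", "2b+6L", "4b+4L"]


def parse_config(label):
    """Split a label like '2b+6L' into (big_count, little_count)."""
    match = re.fullmatch(r"(\d+)b\+(\d+)L", label)
    if match is None:
        raise ValueError(f"unrecognised configuration label: {label!r}")
    return int(match.group(1)), int(match.group(2))


def fits_single_cluster(label):
    """Return True if the total core count is within 1..MAX_CLUSTER_CPUS."""
    big, little = parse_config(label)
    return 1 <= big + little <= MAX_CLUSTER_CPUS


if __name__ == "__main__":
    for label in SLIDE_CONFIGS:
        big, little = parse_config(label)
        print(label, big + little, fits_single_cluster(label))
```

Under that reading, every configuration on the slide totals between three and eight cores, so each fits in one cluster.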


NEON – flexible high-performance data computing

▪ Enhanced multimedia user experience

▪ Intensive data processing

▪ Parallel computing

▪ AI, ML, CV

Delivers performance

Wide software support for TTM

Seamless development
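NEON is a SIMD extension, so a single instruction applies the same operation to several data lanes at once. The Python sketch below only emulates that lane-wise behaviour for illustration; it is not NEON code. It assumes eight signed 16-bit lanes (one 128-bit vector) and a saturating add, which on real hardware would typically be written with an intrinsic from `arm_neon.h` such as `vqaddq_s16`. The function name `saturating_add_s16` is hypothetical.

```python
INT16_MIN, INT16_MAX = -32768, 32767
LANES = 8  # eight 16-bit lanes fill one 128-bit vector


def saturating_add_s16(a, b):
    """Add two 8-lane vectors lane by lane, clamping each sum to int16 range.

    Inputs are plain lists standing in for vector registers; they are
    assumed to already hold values within the int16 range.
    """
    if len(a) != LANES or len(b) != LANES:
        raise ValueError(f"expected {LANES} lanes per vector")
    return [max(INT16_MIN, min(INT16_MAX, x + y)) for x, y in zip(a, b)]


if __name__ == "__main__":
    pixels = [100, 200, 300, 32000, -32000, 0, 5, -5]
    gain = [50, 50, 50, 1000, -1000, 0, -5, 5]
    print(saturating_add_s16(pixels, gain))
```

Saturation rather than wrap-around is the usual choice for image and audio data, where an overflowing sample should clip instead of changing sign.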


ARM compute library leverages NEON for CV/ML

▪ Optimized low-level functions for CPU and GPU

▪ Most popular CV and ML functions

▪ Common functions underpinning popular ML frameworks

▪ Faster performance and development cycle

▪ Public availability as open source

Key function categories

Basic arithmetic

Convolutions

Colour manipulation

Feature detection

Neural network

GEMM

Pyramids

Filters

Image reshaping

Mathematical functions

CPU version - tested on Huawei Mate 8 (single threaded)
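GEMM in the list above stands for general matrix multiplication, which computes alpha·A·B + beta·C and underlies most neural-network layers. As a rough reference for what such a function computes, here is a naive Python sketch. It is not the Compute Library's API or implementation, and the name `gemm` and its argument order are assumptions for this example; an optimized library would use NEON or GPU kernels instead of nested loops.

```python
def gemm(alpha, a, b, beta, c):
    """Return alpha * (a @ b) + beta * c for row-major lists of lists.

    a is m x k, b is k x n, c is m x n. Naive triple loop, for reference only.
    """
    m, k = len(a), len(a[0])
    if len(b) != k:
        raise ValueError("inner dimensions of a and b do not match")
    n = len(b[0])
    if len(c) != m or any(len(row) != n for row in c):
        raise ValueError("c must have shape m x n")
    result = []
    for i in range(m):
        row = []
        for j in range(n):
            # Dot product of row i of a with column j of b.
            acc = sum(a[i][p] * b[p][j] for p in range(k))
            row.append(alpha * acc + beta * c[i][j])
        result.append(row)
    return result


if __name__ == "__main__":
    a = [[1, 2], [3, 4]]
    b = [[5, 6], [7, 8]]
    zero = [[0, 0], [0, 0]]
    print(gemm(1, a, b, 0, zero))
```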


ARM TrustZone protecting billions of devices

Device Security

Communications Security

Lifecycle Security

trusted software

Crypto

Root of Trust

non-trusted

trusted

trusted hardware

secure system

secure storage

Authentication

Mobile Payment

Content Protection

Enterprise Security


ARMv8-M – TrustZone for IoT

▪ Optimized for microcontrollers

▪ Simplified development

▪ Real-time response

▪ Efficient security

▪ Functional safety

Cortex-M, CryptoCell, CoreLink


ARM Cortex-M portfolio

Tiers: High performance; Performance efficiency; Lowest power & area

ARMv6-M

Cortex-M0: Lowest cost, low power

Cortex-M0+: Highest energy efficiency

ARMv7-M

Cortex-M3: Performance efficiency

Cortex-M4: Mainstream control and DSP

Cortex-M7: Maximum performance, control and DSP

ARMv8-M

Cortex-M23: TrustZone in smallest area, lowest power

Cortex-M33: Flexibility, control and DSP with TrustZone

25Bn total units shipped*

*Data as of Dec. 2016


Cortex-M23 – TrustZone in the smallest footprint

Smallest footprint; Maximum efficiency; Constrained applications

75% smaller than Cortex-M33

+50% more efficient than Cortex-M33

Same ultra-high efficiency as Cortex-M0+

Making energy harvesting IoT viable

Secure: Smart lock

Ultra efficient: Smart bandage

Safe: Medical nanorobot

Ubiquitous: Asset tracking


Cortex-M33 – efficiency, security and flexibility

Extremely compact; Configurable and extensible; Widely applicable

80% smaller than Cortex-A5

Cortex-A5, Cortex-M33 size based on 40nm

Base core, TrustZone, DSP, FPU, Co-proc i/f


Cortex-R – real-time, high performance, safety

▪ HDD and SSD storage

▪ 3G, 4G, 5G and modems

▪ Automotive – functional safety

▪ Industrial control

▪ Communications

▪ Networking

▪ SoC real-time controllers

Market-leading, real-time compute across many markets

> 4.5Bn units shipped to date


ARM Cortex-R portfolio

Storage and modem

Cortex-R7: High performance 4G modem and storage

Cortex-R8: Highest performance 5G modem and storage

Functional safety

Cortex-R4: Real-time performance

Cortex-R5: Real-time performance with functional safety

Cortex-R52: Most advanced processor for functional safety

Cortex-R5: Real-time performance and peripheral control

ARMv7-R: Cortex-R4, Cortex-R5, Cortex-R7, Cortex-R8

ARMv8-R: Cortex-R52


Cortex-R8 – next generation mobile and storage

Market-leading performance: Best-in-class hard real-time performance and power efficiency

Scalable & efficient: Software workload can be spread across up to 4 cores

Software compatible: Reduce time-to-market and protect software investment

[Chart] Spanning performance needs: CoreMarks* for Cortex-R4, Cortex-R5, Cortex-R7 and Cortex-R8

* Total MP CoreMarks using 28nm HPM. Max multi processor config.

5G


Cortex-R52 – ARM’s most advanced processor for safety

ARM’s highest performance, real-time processor for safety applications

Enabling partner choice through the standardized ARM architecture and #1 ecosystem

Simplifying functional safety. Providing enhanced safety features and safety support


ARM SecurCore for physical security

The easiest and most proven path to physical security

SecurCore SC000 (ARMv6-M): Optimized area, anti-tampering

SecurCore SC300 (ARMv7-M): Performance, anti-tampering

• 32-bit embedded, high performance CPU with anti-tampering

• De facto standard for secure elements

• De facto standard for SIM and identification

• Small 32-bit embedded secure CPU for constrained applications

Proven solution; Anti-tampering; Performance; Ultra low power


ARM SecurCore portfolio

Tiers: High performance; Performance efficiency; Lowest power & area

ARMv6-M

Cortex-M0: Lowest cost, low power

Cortex-M0+: Highest energy efficiency

ARMv7-M

Cortex-M3: Performance efficiency

Cortex-M4: Mainstream control and DSP

Cortex-M7: Maximum performance, control and DSP

Anti-tampering

SC000 (ARMv6-M): Optimized area, anti-tampering

SC300 (ARMv7-M): Performance, anti-tampering

2.5Bn SecurCore shipments in 2016


Summary

ARM provides the world’s most power-efficient processors

ARM’s diverse portfolio has solutions for a wide range of applications

The ARM ecosystem is innovating for Total Computing


Performance and scalability for a diverse range of applications

Previous (ARMv4, ARMv5, ARMv6): ARM7TDMI, ARM920T, ARM926EJ-S, ARM946E-S, ARM968E-S, ARM1136J(F)-S, ARM1156T2(F)-S, ARM1176JZ(F)-S, ARM11MPCore

Cortex-A (High performance, High efficiency, Ultra high efficiency)

ARMv7-A: Cortex-A5, Cortex-A7, Cortex-A8, Cortex-A9, Cortex-A15, Cortex-A17

ARMv8-A: Cortex-A32, Cortex-A35, Cortex-A53, Cortex-A55, Cortex-A57, Cortex-A72, Cortex-A73, Cortex-A75

Cortex-R (Real Time)

ARMv7-R: Cortex-R4, Cortex-R5, Cortex-R7, Cortex-R8

ARMv8-R: Cortex-R52

Cortex-M (High performance, Performance efficiency, Lowest power and area)

ARMv6-M: Cortex-M0, Cortex-M0+

ARMv7-M: Cortex-M3, Cortex-M4, Cortex-M7

ARMv8-M: Cortex-M23, Cortex-M33

The trademarks featured in this presentation are registered and/or unregistered trademarks of ARM Limited (or its subsidiaries) in the EU and/or elsewhere. All rights reserved. All other marks featured may be trademarks of their respective owners.

Copyright © 2017 ARM Limited