Company Public – NXP, the NXP logo, and NXP secure connections for a smarter world are trademarks of NXP
B.V. All other product or service names are the property of their respective owners. © 2018 NXP B.V.
Auto MCU Applications Engineer (automotive microcontrollers and processors)
Ye Ge / 葛烨
Acceleration on Automotive Radar Signal Processing – S32R
September 2018 | APF-AUT-T3279
Agenda
• Automotive MMW Radar Basics
• Automotive MMW Radar MCU - S32R
• Software Enablement - Radar SDK
Automotive MMW Radar Basics
Basic Automotive MMW-Radar System
• Processor
• MMIC (monolithic microwave integrated circuit)
Applications In Automotive
• Long-range radar: Adaptive Cruise Control, Automatic Emergency Braking, Forward Collision Warning
• Medium-range radar: Blind Side Detection, Lane Change Assistant
• Short/medium-range radar: Park Assist, Cross-Traffic Alert
Higher Resolution
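Typical radar signal processing flow – Chirp Sequence
• RF (timeline label): transceiver in low-power state
• M samples per chirp, N chirps per frame, P receiving channels
Radar signal processing algorithm: description
• Range FFT: N*P M-point real FFTs, completed while the MMIC is sampling; yields range information
• Doppler FFT: M/2*P N-point complex FFTs; yields velocity information
• Non-coherent Combining: M/2*N*P complex magnitude-and-accumulate operations; yields the non-coherently accumulated magnitude matrix of the P receive channels
• CFAR: computes the noise threshold and extracts K1 target points from the M/2*N magnitude matrix
• DoA Estimation: computes the azimuth of the K1 target points; yields K2 target points (with azimuth information)
• Clustering & Tracking: clusters the K2 target points and tracks them across frames; yields K3 real target tracks
• Post processing: application dependent
Time: frame processing time (typical value 50 ms)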
Range (1D) & Doppler (2D) FFTs
Main time-consuming operation: FFT
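The two FFT stages can be illustrated with a minimal pure-Python sketch. It uses a textbook recursive radix-2 FFT, not the SPT hardware engine; the toy sizes (M = 8 samples, N = 4 chirps), the single synthetic target and all function names are assumptions made for illustration.

```python
import cmath
import math

def fft(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def range_doppler(frame):
    """frame[n][m]: real ADC sample m of chirp n, for one receive channel.
    Returns rd[r][d] with M/2 range bins and N Doppler bins."""
    n_chirps = len(frame)
    m_samples = len(frame[0])
    # Range FFT: one M-point FFT per chirp; the input is real, so keep M/2 bins.
    rng = [fft(chirp)[: m_samples // 2] for chirp in frame]
    # Doppler FFT: one N-point complex FFT per range bin, taken across chirps.
    return [fft([rng[n][r] for n in range(n_chirps)])
            for r in range(m_samples // 2)]

# Toy frame (assumed): one target at range bin 2 and Doppler bin 1.
M, N = 8, 4
frame = [[math.cos(2 * math.pi * (2 * m / M + 1 * n / N)) for m in range(M)]
         for n in range(N)]
rd = range_doppler(frame)
cells = [(r, d) for r in range(M // 2) for d in range(N)]
peak = max(cells, key=lambda c: abs(rd[c[0]][c[1]]))  # strongest range-Doppler cell
```

For one channel this performs N range FFTs and M/2 Doppler FFTs, which matches the N*P and M/2*P counts in the processing flow once repeated over P channels.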
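Non-coherent Combining
• Input: complex matrices (M × N), channels 1 to P
• Magnitude: $mag = \sqrt{Re^2 + Im^2}$, giving magnitude matrices (M × N), channels 1 to P
• Accumulate: $sum_{i,j} = \sum_{P} mag_{i,j}$, giving the channel-accumulated magnitude matrix (M × N)
Main time-consuming operation: complex magnitude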
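The magnitude and accumulation steps of this slide can be sketched in a few lines. This is a plain software illustration, assuming the magnitude is the usual complex modulus; on the S32R the slides assign this work to the SPT. The function name and the toy data are hypothetical.

```python
import math

def noncoherent_combine(cubes):
    """cubes[p][i][j]: complex range-Doppler bin (i, j) of receive channel p.
    Returns, per bin, the sum over channels of sqrt(Re^2 + Im^2)."""
    rows, cols = len(cubes[0]), len(cubes[0][0])
    acc = [[0.0] * cols for _ in range(rows)]
    for channel in cubes:
        for i in range(rows):
            for j in range(cols):
                z = channel[i][j]
                acc[i][j] += math.sqrt(z.real ** 2 + z.imag ** 2)
    return acc

# Two channels (P = 2) with opposite phase in bin (0, 0).
ch1 = [[3 + 4j, 0j], [1j, 2 + 0j]]
ch2 = [[-3 - 4j, 0j], [1 + 0j, -2j]]
combined = noncoherent_combine([ch1, ch2])
```

In bin (0, 0) a coherent (complex) sum of the two channels would cancel to zero, while the non-coherent sum adds the two magnitudes, which is why the phase difference between antennas does not hide the target at this stage.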
CFAR (Constant False Alarm Rate) Detection
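• Input: channel-accumulated magnitude matrix (M × N)
• 2D peak search: $mag_{i,j} > mag_{i\pm1,j\pm1}$, giving the magnitude matrix with peak cells marked
• Noise threshold estimation + target detection: CA-CFAR or OS-CFAR
• Output: target list (target 1, target 2, ..., target K1)
Main time-consuming operations: 2D peak search, noise threshold estimation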
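A minimal sketch of the two CFAR stages, assuming the cell-averaging (CA-CFAR) variant with a square training ring. The guard width, training width and scale factor are illustrative assumptions, not values from the slides, and the function names are hypothetical.

```python
def local_peaks(mag):
    """2D peak search: cells strictly greater than all eight neighbours
    (border cells are skipped for simplicity)."""
    rows, cols = len(mag), len(mag[0])
    peaks = []
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            neighbours = [mag[i + di][j + dj]
                          for di in (-1, 0, 1) for dj in (-1, 0, 1)
                          if (di, dj) != (0, 0)]
            if all(mag[i][j] > v for v in neighbours):
                peaks.append((i, j))
    return peaks

def ca_cfar(mag, peaks, guard=1, train=2, scale=4.0):
    """Keep the peaks that exceed scale * mean(training cells).
    Training cells form a ring of width `train` outside `guard` cells,
    clipped at the matrix edges."""
    rows, cols = len(mag), len(mag[0])
    reach = guard + train
    targets = []
    for i, j in peaks:
        cells = [mag[a][b]
                 for a in range(max(0, i - reach), min(rows, i + reach + 1))
                 for b in range(max(0, j - reach), min(cols, j + reach + 1))
                 if abs(a - i) > guard or abs(b - j) > guard]
        noise = sum(cells) / len(cells)      # cell-averaging noise estimate
        if mag[i][j] > scale * noise:
            targets.append((i, j))
    return targets

# Toy magnitude matrix: flat noise floor, one strong and one weak local peak.
mag = [[1.0] * 8 for _ in range(8)]
mag[3][4] = 20.0
mag[5][2] = 1.5
peaks = local_peaks(mag)
targets = ca_cfar(mag, peaks)
```

Both cells pass the peak search, but only the strong one survives the threshold test. An OS-CFAR variant would replace the mean with an order statistic of the sorted training cells.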
DoA Estimation
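• Input: target list (target 1, target 2, ..., target K1) and complex matrices (M × N), channels 1 to P
• Method 1: DBF, giving a K1 × L complex matrix (L is the number of beams), followed by magnitude + 1D peak search
− $y_{i,j} = \sum_{P} x_i w_j$
− $mag = \sqrt{Re^2 + Im^2}$
− $mag_i > mag_{i\pm1},\ threshold$
• Method 2: MUSIC / ESPRIT, etc.
− Matrix computation: SVD, eigendecomposition, inversion, etc.
• Output: target list (target 1, target 2, ..., target K2)
Main time-consuming operations:
• Method 1: multiply-accumulate, complex magnitude, 1D peak search
• Method 2: matrix operations (SVD, eigendecomposition, inversion, etc.)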
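Method 1 (DBF) can be sketched as a multiply-accumulate over the P channels for each of the L beams, followed by a magnitude and a 1D peak search. The sketch below assumes a uniform linear array with half-wavelength spacing and a 10-degree beam grid; those choices, the threshold and the function names are assumptions for illustration only.

```python
import cmath
import math

def steering_weights(num_channels, angles_deg, spacing=0.5):
    """w[j][p] for beam j and channel p of a uniform linear array.
    spacing is the element spacing in wavelengths (0.5 is an assumption)."""
    weights = []
    for angle in angles_deg:
        psi = 2 * math.pi * spacing * math.sin(math.radians(angle))
        weights.append([cmath.exp(-1j * psi * p) for p in range(num_channels)])
    return weights

def dbf_doa(x, weights, angles_deg, threshold):
    """x[p]: complex value of one detected target on channel p.
    Returns the beam angles that are 1D local peaks above threshold."""
    # Beamforming: multiply-accumulate over the P channels, then magnitude.
    mags = [abs(sum(xp * wp for xp, wp in zip(x, w))) for w in weights]
    # 1D peak search: larger than both neighbours and above the threshold.
    return [angles_deg[i] for i in range(1, len(mags) - 1)
            if mags[i] > mags[i - 1] and mags[i] > mags[i + 1]
            and mags[i] > threshold]

P = 4
angles = list(range(-60, 61, 10))            # L = 13 beams
true_psi = math.pi * math.sin(math.radians(20))
x = [cmath.exp(1j * true_psi * p) for p in range(P)]   # synthetic target at +20 degrees
doa = dbf_doa(x, steering_weights(P, angles), angles, threshold=2.0)
```

Because the weights for equally spaced beams in sin(angle) form a DFT matrix, this step can also be computed with an FFT across the channel dimension.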
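Clustering & Tracking
• Clustering: target list (target 1, target 2, ..., target K2) → DBSCAN / K-means → cluster list (cluster 1, cluster 2, ..., cluster K4)
• Tracking: cluster list and track list (track 1, track 2, ..., track H) → LAPJV + Kalman filter → track list (track 1, track 2, ..., track K3)
Main time-consuming operations:
• Matrix operations: multiplication, inversion, etc.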
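As an illustration of the matrix arithmetic behind the tracking stage, here is a minimal sketch of one predict/update step of a 1D constant-velocity Kalman filter with the 2 × 2 matrices written out by hand. The motion model, the noise values and the function name are assumptions; clustering (DBSCAN / K-means) and track association (LAPJV) are not shown.

```python
def kalman_cv_step(state, cov, z, dt=1.0, q=1e-3, r=0.25):
    """One predict + update step of a 1D constant-velocity Kalman filter.
    state = [position, velocity]; cov = 2x2 covariance; z = measured position."""
    x, v = state
    (p00, p01), (p10, p11) = cov
    # Predict: x' = F x and P' = F P F^T + Q, with F = [[1, dt], [0, 1]], Q = q * I.
    x_pred = x + dt * v
    v_pred = v
    a00 = p00 + dt * (p10 + p01) + dt * dt * p11 + q
    a01 = p01 + dt * p11
    a10 = p10 + dt * p11
    a11 = p11 + q
    # Update with H = [1, 0]: innovation, innovation variance, gain K = P' H^T / s.
    innovation = z - x_pred
    s = a00 + r
    k0, k1 = a00 / s, a10 / s
    new_state = [x_pred + k0 * innovation, v_pred + k1 * innovation]
    # P = (I - K H) P'
    new_cov = [[(1 - k0) * a00, (1 - k0) * a01],
               [a10 - k1 * a00, a11 - k1 * a01]]
    return new_state, new_cov

# Track a target moving at 2 units per frame, starting from an uninformed state.
state, cov = [0.0, 0.0], [[100.0, 0.0], [0.0, 100.0]]
for t in range(1, 21):
    state, cov = kalman_cv_step(state, cov, z=1.0 + 2.0 * t)
```

A real tracker would run one such filter per track in more dimensions (for example range, velocity and azimuth), which is where the larger matrix multiplications and inversions listed on the slide come from.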
Automotive MMW Radar MCU - S32R
50% of radar modules used NXP radar technology in 2016
#1 Radar MCU Provider
S32R: #1 in Radar Processing Integration & Performance Per Watt
Scalable, highly integrated, safe and secure family driving the digitalization of radar and sensor data fusion.
• Central Smart Radar
• Integrated Smart Sensor
• Multi Mode TX/RX
SPT Development:
• SPT1.0 (MPC5775K): World’s 1st IP module for processing of FCM RADAR
− Fast FFT/Win
− Transpose
− Multi-Dimensional data handling
• SPT2.0 (S32R27x): Extended functionality
− 4x perf improvement for FFTs
− Histogram, thresholding
− 2D Peak Search
• SPT2.5 (S32R37x): Optimized for Higher Efficiency
− 20bit operand
− Code compatible
S32R - Highly Integrated & Revolutionary
NXP 77 GHz Chipset replaces
▪ Bare Die RF solutions with an RF Chipset based on RCP package technology
▪ Discrete Filter Components and Amplifiers
S32Rx Product Family replaces
▪ 8 ADC
▪ 1 DAC
▪ 1 FPGA
▪ External SRAM
▪ General purpose MCU
NXP enables
✓ Significant PCB area saving
✓ Reduced assembly cost
✓ Increased PCB quality
[Figure: system block diagrams before and with the S32Rx]
▪ Before the S32Rx: ADC, BB filter, amplifier, FPGA (signal processing, timing controller, chirp generation), DAC, SRAM, VCO
▪ With the S32Rx: RF_RX, RF_TX, transceiver, S32Rx safety MCU
S32R274 (Racerunner Ultra) – Block Diagram
Specification
▪ CPU: 2xZ7 240MHz (w/ SPE2) & Z4 120MHz in permanent lockstep
▪ SPT 2.0: FFT Accelerator, DMA, additional mathematical functions
▪ Analog: 4xSD ADC & 2xSAR, Low jitter PLL, D/A as option for 24GHz
▪ Package: 257 MAPBGA (14 x 14 mm², 0.8 mm pitch)
▪ Temp Range (Ta): -40 to 125 °C (150 °C Tj), AEC-Q100 Grade 1
▪ Main Supply: 3.3V IO & 1.25V Core (ext or PMU)
Key Features
▪ Functional Safety: as per ISO26262 with target ASIL-D
▪ Security: CSE2
▪ DSE: Radar acceleration mathematical functions
▪ Memory: 2MB Flash/1.5MB SRAM (both ECC)
▪ Top of Class Analogue IP: PLL, OSC & SD ADC
▪ SW Enablement: Safe Autosar MCAL ASIL-B (-D)
Block diagram:
▪ CPU Platform: Z4 LS @ 120MHz (8kB I-cache 2 way, 4kB D-cache 2 way, SFPU); Z7 @ 240MHz (16kB I-cache 2 way, 16kB D-cache 2 way, 32KB DTCM, SPE2-SIMD, VFPU-SIMD); Z7 @ 240MHz (16kB I-cache 2 way, 16kB D-cache 2 way, 64KB TCM, SPE2-SIMD, VFPU-SIMD); 64kB TCM
▪ NV Memory: 2MB with ECC
▪ Volatile Emb. Memory: 1.5MB RAM with ECC
▪ General ADC: 2x SAR ADC, 12bit 1MSps, ch mux
▪ ADC Input: 4 x SD ADC, 12bit 10MSps
▪ DAC Output: 10MSps
▪ External ADC Interface: MIPI-CSI2
▪ Radar Processing Platform: SPT2.0 (Signal Processing Toolbox) with Command Sequencer, COPY, FFT, DMA, MAXS, VMT
▪ Connectivity: 1 x Cross Trig Unit, 2 x IIC, 1 x FlexPWM (12 ch), 1 x LinFlex Ctrl, 2 x dSPI, 2 x eTimers (6 ch. each), 3 x FlexCAN/CAN-FD, SWT & STM
▪ Master Comm Bus: 128 msg FlexRay, Gigabit Ethernet
▪ Vehicle secure Network
▪ Safety & Support: OSC and PLL, T-Sensor, FCCU/FOSU & CRC, Safe DMA, DEBUG Nexus 3+, PMU, Safe Memory MEMU
▪ Security: CSE2
▪ Fabric: 64 bit XBAR with E2E ECC
SPT2.0 - Overview
• SPT sub-modules:
− Data Acquisition and Sample DMA (SDMA)
− Programmable DMA (PDMA)
− Command Sequencer with DMA (CSDMA)
− Hardware Accelerator for signal processing
− Local Memory Controller
− DMA Arbitration
− Peripheral I/F
[Figure: SPT2.0 block diagram. Recoverable labels:]
• Hardware Accelerator: FFT, Copy, MAXS, VMT, HIST
• Local Memory Controller: Work Register, Twiddle RAM, Operand RAM
• Command Sequencer: Operation scheduler, Command Queue
• Data Acquisition: Sample DMA, ADC, CSI
• Programmable DMA: Fast-DMA Engine, Data type conv, Compression, Decompr., Aggregation
• Peripheral I/F: Configuration and Status Register, IPS
• DMA Arbitration: AHB
How to accelerate processing on S32R274
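For each radar signal processing algorithm: description, time-consuming computation, acceleration method
• Range FFT
− Description: N*P M-point real FFTs, completed while the MMIC is sampling; yields range information
− Time-consuming computation: many large-point FFTs
− Acceleration: SPT2.0 FFT hardware engine
• Doppler FFT
− Description: M/2*P N-point complex FFTs; yields velocity information
− Time-consuming computation: many large-point FFTs
− Acceleration: SPT2.0 FFT hardware engine
• Non-coherent Combining
− Description: M/2*N*P complex magnitude-and-accumulate operations; yields the non-coherently accumulated magnitude matrix of the P receive channels
− Time-consuming computation: many complex magnitude operations
− Acceleration: SPT2.0 VMT hardware engine
• CFAR Detection
− Description: computes the noise threshold and extracts K1 target points from the M/2*N magnitude matrix
− Time-consuming computation: 2D peak search, noise threshold estimation
− Acceleration: SPT2.0 MAXS and HIST hardware engines; PowerPC® Z7 core SIMD instructions
• DoA Estimation
− Description: computes the azimuth of the K1 target points; yields K2 target points (with azimuth information)
− Time-consuming computation (DBF): multiply-accumulate, complex magnitude, 1D peak search; acceleration: SPT2.0 FFT, VMT and MAXS hardware engines
− Time-consuming computation (MUSIC/ESPRIT): matrix operations (SVD, eigendecomposition, inversion, etc.); acceleration: PowerPC® Z7 core SIMD instructions
• Clustering & Tracking
− Description: clusters the K2 target points and tracks them across frames; yields K3 real target tracks
− Time-consuming computation: matrix operations (multiplication, inversion, etc.)
− Acceleration: PowerPC® Z7 core SIMD instructions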
RADAR Algorithm Mapping Example
[Figure: mapping of the radar processing chain onto the S32R274. Recoverable labels:]
• Inputs: A/D (4ch, up to 10MSps) or MIPI CSI2; Timing Engine
• SPT: Acquisition, SDMA, PDMA, Range FFT, Doppler FFT, Beam-Forming, Detection; COPY, VMT, MAXS, HWL, Mag, Peak Search; ORAM; TRAM
• SRAM: Chirp Buffer, RADAR Cube Buffer, Object List; TCM
• Z7 Core1 / Z7 Core2 with SPE2: Tracking, Adv. Processing, Timing Control, I/F Driver, AutoSAR OS
• Interfaces: CAN-FD, GbE, LVDS
• Debug: Nexus/Aurora; Data logging via PDMA w/o Compression
Where Acceleration is Used
• RADAR Processing chain: Transmitter/Receiver, Signal Conditioning, Signal Analysis, Detection, Clustering, Tracking
• Dolphin + S32R274: Transceiver, SPTv2.0, e200z7
SPT Development
• SPT1.0 (MPC5775K): World’s 1st IP module for processing of FCM RADAR
− Fast FFT/Win
− Transpose
− Multi-Dimensional data handling
• SPT2.0 (S32R27x): Extended functionality
− 4x perf improvement for FFTs
− Histogram, thresholding
− 2D Peak Search
• SPT2.5 (S32R37x): Optimized for Higher Efficiency
− 20bit operand
− Code compatible
Software Enablement - Radar SDK
Radar Software Development Kit
Radar SDK
[Figure: Radar SDK software stack. Recoverable labels:]
• Development Tools: SPT Assembler, Mem Visualization, Graphical Chirp Designer, Graph Tool SPT2.0 (RadarSDK)
• Simulation Support: Matlab / C Func. models, Virtual Prototype
• Target Software, Base Libraries: SPT Driver, SPT Lib, Radar Frontend with Frontend Adaption API (Eagle Adaption Layer, Dolphin Adaption Layer, ??? Adaption Layer)
• Target Software, Pro Libraries: SPT Algo Lib, Advanced Algorithm Library
• Demo Application: Radar Demo Application
• Early-access release available
RSDK1.0 Components
Component groups: S32R Libraries, Auxiliary, Demo/Example Projects
• SPT Control & Processing: SPT Driver Lib, SPT Kernel Lib
• NXP Radar RF Front-End Interface: Abstraction Layer, SPI Driver, MIPI-CSI2 Driver
• Offline Modelling Tools: MATLAB Bit-Exact Kernels
• CPU Algorithms: CFAR Example
• Debug & Development Tools: Host File I/O, Status Reporting, Trace Logging
• Sample Applications: 1RF 4Antennas, TD-MIMO, SPT Standalone
• Reference Data: Test input/output, Twiddle & Window
• Documentation: Doxygen API Description, Module User Guide
• Quality Package: Test Reports
RSDK Application Block Diagram
[Figure: RSDK application block diagram. Recoverable labels:]
• Radar RF Front-end connected to the S32R: SPI (Config, Status), MIPI CSI2 (ADC data)
• SPT 2.0: Range FFT, Doppler FFT, Antenna Combining, DoA (Beamforming), Peak Search
• SRAM: Radar Data ‘Cube’, SPT Code, Peak List
• e200z4/e200z7 Core: User Algorithms, CPU Radar Algorithms, SPT Driver, Radar FE Abstraction Layer, SPI Driver, CSI2 Driver
Simplifying SPT Application Development
• RSDK provides application developers with a meaningful interface to the SPT hardware
• Exposes SPT functionality through a set of configuration objects and functions
• Allows for easy integration and execution of SPT programs (kernels)
RSDK SPT initialization steps:
Construct SPT initialization object
• Number of samples
• Number of chirps
• Input sample buffer address
• Acquisition mode
• Operating mode
Call initialization function
• Basic check of argument validity
• Initializes driver internal state
• Configures module registers
• Returns status information
Construct memory initialization kernel context
• Provide addresses of FFT twiddle factors
• Provide addresses of window function coefficients
Call run function with memory initialization kernel
• Uses context object to provide user arguments to the kernel
• Loads SPT memory with constant data necessary for signal processing
Application now ready to leverage the performance of SPT!
Additional Features
• Full support for radar ADC sample data acquisition
− Input either from MIPI-CSI2 or onboard Sigma-Delta ADCs
− Reporting of SDMA acquisition statistics (min, max, sum, toggle)
− Automatic DC offset compensation
• Blocking and non-blocking operation modes
− In blocking mode the driver polls for SPT completion and returns when execution is complete
− In non-blocking mode the SPT uses an interrupt to signal completion and the user provides a call-back function
What is an SPT Kernel?
• Kernel refers to an SPT program/command sequence.
− A series of SPT instructions intended to perform a radar processing function
A kernel consists of a sequence of operations. Example: Range FFT Kernel
• Input Data Transfer: PDMA SysRAM to SPT RAM; COPY prepare data
• N point FFT: Radix instruction; Radix instruction
Each operation is performed through the execution of one or more SPT instructions
• Kernels are assembled into SPT machine code and linked into the application binary
• The linked address is used with SPT driver to allow SPT to execute the kernel…
Driver-Kernel Relationship
• SPT Driver can be used with kernels provided by RSDK or with user custom kernels
• A kernel context object is used to pass information from application to the driver and then kernel
− Selection of blocking or non-blocking execution mode (+ call-back function)
− System memory address of the kernel code to be executed
− Kernel-specific input argument list (such as input/output buffer addresses, scaling factor)
Application Code
• Define context object
• Pass object address to SPT driver run function
SPT Driver
• Set up execution environment
• Parse input arguments and set work registers
• Start SPT command sequencer
SPT Kernel
• Read input arguments from work registers
• Write return values to work registers
• SPT driver defines a calling convention for parameter exchange with SPT kernels
− WR1:WR10 are used to pass input arguments to the kernel
− SPT driver parses the context argument list and writes to the work registers in sequential order
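The calling convention on this slide can be modelled in a few lines of Python. This is only a behavioural sketch: `KernelContext`, `spt_run` and `toy_kernel` are hypothetical names, not the RSDK API, and the choice of WR1 for the return value is an assumption (the slide only says return values go to work registers).

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

NUM_ARG_REGS = 10  # WR1..WR10 carry kernel input arguments

@dataclass
class KernelContext:
    """Hypothetical context object; field names are illustrative only."""
    kernel_address: int                       # system memory address of the kernel code
    args: List[int] = field(default_factory=list)
    blocking: bool = True
    callback: Optional[Callable[[List[int]], None]] = None

def spt_run(ctx, kernel):
    """Model of a driver run function: load work registers, then start the kernel."""
    if len(ctx.args) > NUM_ARG_REGS:
        raise ValueError("at most 10 kernel arguments fit in WR1..WR10")
    if not ctx.blocking and ctx.callback is None:
        raise ValueError("non-blocking mode needs a call-back function")
    work_regs = [0] * (NUM_ARG_REGS + 1)      # index 0 unused so WRn is work_regs[n]
    for n, value in enumerate(ctx.args, start=1):
        work_regs[n] = value                  # written in sequential order
    kernel(work_regs)                         # stands in for the SPT command sequencer
    if not ctx.blocking:
        ctx.callback(work_regs)               # real hardware would signal this by interrupt
    return work_regs

def toy_kernel(work_regs):
    """Toy kernel: reads two arguments and writes a result back to WR1 (assumed)."""
    work_regs[1] = work_regs[1] * work_regs[2]

regs = spt_run(KernelContext(kernel_address=0x40000000, args=[6, 7]), toy_kernel)
```

In this model the call-back runs synchronously; on the device the non-blocking run function would return immediately and the call-back would fire from the completion interrupt.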
RSDK Kernel Library
Fast-chirp radar signal processing functions:
• Range (1D) FFT
• Doppler (2D) FFT
• Non-Coherent Combining
• 3D FFT
• Peak Search
• Digital Beamforming and DoA
Supported radar system configurations:
• 256 samples, 256 chirps
• 512 samples, 128 chirps
• 1024 samples, 128 chirps*
• 256 samples, 256 chirps TD-MIMO*
* These configurations use CP4D type 2:1 compression mode for the SRAM radar data cube, which must be taken into account when extracting antenna data from the compressed cube
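A rough sizing sketch shows how the listed configurations compare in memory footprint. It assumes 4 receive channels, 4 bytes per complex range bin (16-bit real plus 16-bit imaginary) and M/2 range bins kept after the real-input range FFT; the channel count and sample width are assumptions, not figures from the slides, and the TD-MIMO case is not modelled.

```python
def cube_bytes(samples, chirps, channels=4, bytes_per_bin=4, compression=1):
    """Radar data cube size after the range FFT.
    channels and bytes_per_bin are assumptions, not values from the slides."""
    range_bins = samples // 2           # real-input range FFT keeps M/2 bins
    return range_bins * chirps * channels * bytes_per_bin // compression

sizes_kib = {
    "256 samples, 256 chirps": cube_bytes(256, 256) // 1024,
    "512 samples, 128 chirps": cube_bytes(512, 128) // 1024,
    "1024 samples, 128 chirps (uncompressed)": cube_bytes(1024, 128) // 1024,
    "1024 samples, 128 chirps (2:1 compressed)": cube_bytes(1024, 128, compression=2) // 1024,
}
```

Under these assumptions the uncompressed 1024-sample cube is twice the size of the other two configurations, and 2:1 compression brings it back to the same footprint.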
SPT Kernel Timing
[Figure: SPT kernel timing for a typical long range radar: 128 chirps, 512 samples. Legend: SPT blocks]
• Radar frame (10 ms): Chirp 0, Chirp 1, ..., Chirp 127, 78 us each; one range FFT per chirp (FFT Chirp 0, FFT Chirp 1, ..., FFT Chirp 127), 28 us each
• After the frame: Doppler FFT (~2 ms), then Peak Search (2-3 ms)
• TRX low-power state (30 ms)
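The numbers on this slide can be cross-checked with simple arithmetic. The sketch assumes that 78 us is the period of one chirp and 28 us is the SPT time for one range FFT; that reading of the diagram labels is an assumption.

```python
CHIRPS = 128
CHIRP_PERIOD_US = 78        # assumed: duration of one chirp
RANGE_FFT_US = 28           # assumed: SPT time for one range FFT
DOPPLER_FFT_MS = 2          # "~2 ms" on the slide
PEAK_SEARCH_MS = (2, 3)     # "2-3 ms" on the slide
LOW_POWER_MS = 30           # TRX low-power state

# Acquisition time for the whole chirp sequence.
acquisition_ms = CHIRPS * CHIRP_PERIOD_US / 1000

# Fraction of each chirp period the SPT spends on that chirp's range FFT.
range_fft_busy = RANGE_FFT_US / CHIRP_PERIOD_US

# Processing left to do once the last chirp has been received.
post_frame_ms = tuple(DOPPLER_FFT_MS + p for p in PEAK_SEARCH_MS)
```

With these inputs the chirp sequence takes 9.984 ms, consistent with the 10 ms radar frame; the range FFT occupies roughly a third of each chirp period, so it keeps up with acquisition; and the 4 to 5 ms of Doppler FFT and peak search fit well inside the 30 ms low-power window of the transceiver.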
Radar Front-End Abstraction
• Radar systems require configuration and control of the attached radar RF front-end transceiver
− Define the chirp shape(s), transmit/receive settings and chirp sequencer behavior
− Reporting of transceiver status/errors allowing the MCU to react accordingly
• RSDK offers a unified high-level API which gives the application a flexible and powerful method of interacting with various transceiver hardware
− Support for general radar transceiver concepts, specifically chirp design and MCU communications
− Includes bare-metal SPI and CSI2 drivers to enable communication
[Figure: layered front-end software architecture]
• Top: Radar Front-End Abstraction API
• Middle: MR3003 (Eagle) Adaptation Layer, TEF810x (Dolphin) Adaptation Layer, Other Vendor FE Adaptation Layer
• Drivers: SPI Driver, CSI2 Driver
• Hardware: MR3003 HW, TEF810x HW, Other HW
Frame and Chirp Characteristics
Abstraction layer converts the generic chirp definition into front-end specific configuration settings
Generic objects used to capture radar frame and chirp characteristics
Radar Frame Parameters
• Number of samples
• Number of chirps
• Sampling frequency
• Data acquisition settings
Chirp A Shape
• Timing parameters
• Acq bandwidth and centre freq
• Slope direction
Chirp B Shape
• Timing parameters
• Acq bandwidth and centre freq
• Slope direction
Multiple chirp shapes supported (used for MIMO)
Transceiver Configuration Flow
• User responsible for configuration of SPI pin multiplexing settings based on specific hardware
• MIPI-CSI2 uses dedicated device pads (no pin muxing required)
Set the init params and link default optional with common
• Front-end type
• SPI chip select and module ID
• External clock mode
Call initialisation function
• Basic check of argument validity
• Initialise the SPI module for control
• Bring up the front end and report initial status
Construct mandatory frame parameter object
• Number of samples
• Number of chirps
• Sampling frequency
• Number of chirp shapes to employ
• Pointer to shapes
Construct chirp shape object(s)
• Time parameters
• Acq bandwidth
• Centre freq
• Transmitter channel
Call frame config function
• Check arguments are valid and supported by RFE
• Configure chirp waveform
• Init MIPI-CSI2 link
Ready to trigger radar chirp sequence using single function call!
MATLAB Software Development
Two distinct NXP software products combine to provide a MATLAB-based PC development environment for SPT:
• NXP Model-Based Design Toolbox for Radar is a simulation of the SPT instruction set implemented in MATLAB
− Provides bit-exact simulation of the SPT instruction set to represent the SPT module low-level commands
− E.g. RDX4, RDX2, MAXS, VMT, WIN, ADD etc.
− Owned by AMP enablement tools team (not AMP software team)
• RSDK MATLAB Bit-Exact Model for SPT Kernels uses the Toolbox to simulate the RSDK kernel functions
− These MATLAB scripts represent the kernel command sequences provided with RSDK
− E.g. Range FFT, Doppler FFT, peak search, DBF etc.
[Figure: on the PC, MATLAB hosts the SPT Kernel Functions on top of the NXP Model-Based Design Toolbox for Radar]
MATLAB Environment
• Must install the Model-Based Design Toolbox add-on (see link)
• RSDK Bit-exact kernel scripts are included in release package
− Each MATLAB ‘.m’ script represents a kernel
− Scripts dynamically adapt to samples/chirps configuration
• Support for MATLAB on 32-bit and 64-bit (R2015aSP1, R2016a, R2016b, R2017a, R2017b)
• Users have the ability to develop radar algorithms in MATLAB that incorporate the RSDK kernels
− These algorithms can then be implemented in C and used with RSDK on the S32R274 embedded system
• Advanced users can use the Design Toolbox to design their own SPT kernels in MATLAB
− In the simulation environment we can ignore the SPT memory layout demands
− Once the desired simulation results are achieved the MATLAB script must be translated into SPT code
SPT Example Script
• SPT_Example uses RSDK reference input data and executes an example sequence of kernels
• Identical operation to SPT_int_tester example project which runs on real S32R274 EVB hardware
− We can compare the output of the MATLAB script with the example project output and the reference output data
www.nxp.com