Company External – NXP, the NXP logo, and NXP secure connections for a smarter world are trademarks of
NXP B.V. All other product or service names are the property of their respective owners. © 2018 NXP B.V.
Microcontrollers
Jia Chen
Voice and Audio Solutions: Creating Smart Devices with NXP
September 2018 | APF-MME-T3284
PUBLIC 1
Agenda
• Voice and Audio Market Trends
• What Issues Need to Be Solved
• NXP Leading Tech Solutions for Voice and Audio Markets
• NXP – Production Grade SoC
• Innovations
PUBLIC 2
Voice and Audio Market Trends
PUBLIC 3
PUBLIC 4
Market Sizing: Intelligent Home Speaker
Source: Strategy Analytics
PUBLIC 5
Voice Becoming the Key Interface in the Smart Home Market
• Smart speakers are driving voice assistants in the home
• Google will eventually take smart speaker lead
• Google Assistant can leverage the wider Google device
and service ecosystem in the home
PUBLIC 6
Echo/Assistant to 3rd Party Brands
• In 2022 more than half of Intelligent
Speaker shipments will be from third
party hardware manufacturers who
license the virtual assistant technology.
• Connected audio brands such as
Sonos, Bose, Harman Kardon and
Denon will be key players in this market
as they develop partnerships with voice
platform providers.
• 1st party = Amazon Alexa (Echo, Dot) & Google Assistant
PUBLIC 7
The Smart Home Revolution: Smart Life is Coming
SMART HOME SMART CITY
PUBLIC 8
The Smart Home Application Scenario
[Diagram: smart home voice control scenario, spoken command "XX, turn on the light"]
• Bedroom: Cloud Voice Service, Skills, Home Hub, Light Switch, Dimmers
• Kitchen: Skills, Voice Enabled Lamp, Voice Enabled Coffee Machine
PUBLIC 9
What Issues Need to be Solved
PUBLIC 10
Use Cases
[Diagram: Calling use case]
• Blocks on i.MX: hardware audio codec/ADC, ECNS, beamforming, on-device RX signal processing, mixer, local content, PA
• Signals: desired speech (near end), echo signal, far end
[Diagram: Voice control use case]
• Blocks on i.MX: hardware audio codec/ADC, ECNS, beamforming, wake word detection (WWD), small vocabulary ASR, NLU, on-device audio processing and speech synthesis, mixer, content, local content, PA
• Signals: echo signal, barge-in
PUBLIC 11
Calling: LifeVibes VoiceExperience (LVVE) in Detail
PUBLIC 12
Voice Control
[Diagram: voice control signal chain]
• Front end (for "far-field"): microphone array (one or more), hardware audio codec/ADC, beamforming, dereverberation, AEC, trigger phrase/hot word detection
• Back end processing in the cloud ("Ecosystem Play"): voice recognition, voice assistant, information and/or action
• MPU today: DSP companion OR integrated SW
• MCU today: DSP companion PLUS
AEC = Acoustic Echo Cancellation. Important distinction: far field (across the room) vs. near field (i.e., on a headset)
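To make the AEC block concrete, here is a minimal sketch of one textbook approach, a normalized LMS (NLMS) adaptive filter. It is not NXP's implementation; the function name, tap count, step size and the synthetic 4-tap echo path are all assumptions chosen for illustration. The filter learns how the loudspeaker signal (`ref`) leaks into the microphone (`mic`) and subtracts that estimate.

```python
import random


def nlms_echo_cancel(mic, ref, taps=8, mu=0.5, eps=1e-8):
    """Subtract an adaptive FIR estimate of the echo of `ref` from `mic`."""
    weights = [0.0] * taps   # current estimate of the echo path
    history = [0.0] * taps   # most recent reference samples, newest first
    output = []
    for mic_sample, ref_sample in zip(mic, ref):
        history = [ref_sample] + history[:-1]
        echo_estimate = sum(w * x for w, x in zip(weights, history))
        error = mic_sample - echo_estimate          # echo-reduced output sample
        power = sum(x * x for x in history) + eps   # normalization term
        weights = [w + mu * error * x / power for w, x in zip(weights, history)]
        output.append(error)
    return output


def energy(signal):
    return sum(v * v for v in signal)


# Synthetic test: the microphone hears only an echo of the playback signal.
rng = random.Random(0)
ref = [rng.uniform(-1.0, 1.0) for _ in range(4000)]
echo_path = [0.6, 0.3, -0.2, 0.1]  # hypothetical room response
mic = [sum(h * ref[n - k] for k, h in enumerate(echo_path) if n - k >= 0)
       for n in range(len(ref))]
residual = nlms_echo_cancel(mic, ref)
print(energy(mic[-500:]), energy(residual[-500:]))
```

In a real far-field product the echo path is hundreds of taps long and near-end speech is present, so double-talk handling and a much longer filter would be needed.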
PUBLIC 13
Audio Play Back: NXP Smart Amplifier Makes a Real Difference
Real-time sensing of speaker temperature and excursion enables superior audio quality and loudness management capability while providing critical speaker safety information.
[Diagram: NXP Smart Amplifier (TFA9894) with NXP host software codec]
• Hardware: audio in, 10V boost converter, Class-D amp, current sensing
• Embedded processing: loudspeaker model, speaker protect and audio boost
• Audio capture: high SPL noise suppression, front/back audio zoom
• Speaker-as-Mic (SaM): microphone processing, motion filter, SaM tap sensing
• Music enhancement software
PUBLIC 14
NXP Leading Tech Solutions for Voice and Audio Markets
PUBLIC 15
IoT Market Offering
Premium solutions to improve audio and voice end-user experiences for a wide range of application domains
Automotive
Software
• Enable premium quality calling/speech UI
• ITU-T and Apple CarPlay compliant
Easily Deployable:
• Pre-integrated in NXP Radio and i.MX
• Tools for easy and automated acoustic tuning
AI Voice Nodes
Software
• Enable premium quality calling/speech UI
• Skype Premium and Cortana Premium compliance
Smart Amplifier: TFA9862, TFA9892, TFA9894
Supported Platforms:
• Intel ISST, AMD
• Microsoft Windows APO
Tools: for easy and automated acoustic tuning
AI Speaker & Smart Home
Software
• Enable premium quality calling and speech UI incl. far-field support (roadmap)
Smart Amplifier:
• TFA9892, TFA9894/74, TFA9862
Easily Deployable:
• Pre-integrated in i.MX8M / LPC next
• Tools for easy and automated acoustic tuning
PUBLIC 16
Automotive
PUBLIC 17
Automotive: What are the Problems to Solve?
Echo from multiple speakers
Engine noise
Passenger noise/voice
Fan noise
Exhaust noise
Wind noise from window
Road noise from tyres
Echo: Echo makes it difficult to be understood
Noise: Interior and exterior noises make it difficult to be understood and to understand
PUBLIC 18
NXP Audio Solutions – Automotive Offering
NXP Voice Audio Solutions offers premium software solutions to improve end-user voice experiences for automotive
Communicate
• Premium call quality in car
• Multi-mic echo cancellation and noise suppression
• Automotive certified (ITU-T 1100, 1110)
• CarPlay compliant
Control
Improves the robustness of the speech recognition engine:
• Barge-in recognition robustness
• Background noise robustness
Easily Deployable
Pre-integrated:
• NXP Dirana/Dione/Mercury
• NXP i.MX
Tuning tools
Roadmap
• ICC: improve intelligibility for in-car communication
• ESE: enhance in-cabin engine sound
PUBLIC 19
DIONE-ECNR – SAF775C Feature Overview
Radio
• Highly sensitive AM/FM tuner with best-in-class performance
• Integrated FM antenna buffer
• RDS and RBDS demodulation and decoding
• Support of WX
• Digital radio support (HD/DRM) in combined use of Saturn via baseband I2S
• Support of advanced radio processing (iMS, CEQ etc.)
Audio
• 4 x ADCs, 4 x DACs
• Versatile IO ports (I2S, S/PDIF, TDM)
• Two integrated DSPs dedicated for core audio control (SRC, volume, tone, equalization etc.)
• One open audio core (Tensilica HiFi2) dedicated for advanced audio processing (ASE, ANC, ECNR, ESE) with max. frequency of 300MHz
IC Package
• HLQFP176
• HVQFN184
[Block diagram: DIONE (SAF775D/C). AM/FM/HD/DRM tuner with optional FM RF buffer; radio processing on E7a DSPs; audio processing on E7a DSPs and a HiFi-2 DSP; audio ADCs and DACs; digital audio interfaces; ARM M0 µP; PDC; INFRA]
PUBLIC 20
Echo Cancellation and Noise Reduction (ECNR)
Echo from multiple speakers
Engine noise
Passenger noise/voice
Fan noise
Exhaust noise
Wind noise from window
Road noise from tyres
Echo: Echo makes it difficult to be understood
Noise: Interior and exterior noise make it difficult to be understood and to understand
PUBLIC 21
CarVoice 1.0: 1-mic ECNR for Automotive
Description
1-mic voice enhancements software suite for car. Including HD voice, super-wideband (EVS, VoIP) and full-band (VoIP)
Key Features:
• Uplink 1-mic Acoustic Echo Cancellation and Noise Suppression
• Downlink Loudness Maximizer and Noise Suppression
• Full-band support: from 8 to 48kHz rates to fully support EVS and VoIP
• Automotive certified
Tool:
• Parameter tool with automated tuning for AEC
Certifications (current state as of August 18th 2017):
• Voice Call, ITU-T P.1100 and ITU-T P.1110: Qualified. ITU-T P.1110: fulfilled with class 1 double talk at nominal level
• Voice Call, Apple CarPlay: Qualified
Platforms:
• NXP i.MX8
• NXP DiRaNA 3 / DIONE-ECNR
PUBLIC 22
PUBLIC 23
NXP CarVoice – Compliance Status
Mobile / Smartphone (spec / operator: result)
• AT&T (2.6.1): Passed
• T-Mobile US: Passed
• Vodafone: Passed
• Orange: Passed
• CMCC: Passed
Car (spec: result)
• VDA 1.6: Passed VDA 1.5
• ITU-T 1100/1110: Passed
• ITU-T 1140: Passed
• CarPlay (ECNR part): Passed
PUBLIC 24
Sample Rate Matters
• Narrowband: 8 kHz
• Wideband: 16 kHz
• Ultra-wideband: 24 kHz
• Full band: 48 kHz
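Why the sample rate matters can be shown with the Nyquist limit: a signal sampled at rate fs can represent audio content only up to fs/2. The short sketch below applies that rule to the four rates on this slide; the dictionary and function names are illustrative only.

```python
# Sample rates as listed on the slide, in Hz.
SAMPLE_RATES_HZ = {
    "Narrowband": 8_000,
    "Wideband": 16_000,
    "Ultra-wideband": 24_000,
    "Full band": 48_000,
}


def max_audio_bandwidth_hz(sample_rate_hz):
    """Nyquist limit: the highest representable frequency is half the sample rate."""
    return sample_rate_hz / 2


for name, rate in SAMPLE_RATES_HZ.items():
    print(f"{name}: {rate} Hz sampling, audio content up to {max_audio_bandwidth_hz(rate):.0f} Hz")
```

Practical codecs use somewhat less than the Nyquist limit because of anti-aliasing filter roll-off.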
PUBLIC 25
Getting ITU-T P1000, P1100 and ECNR CarPlay Certification
NXP has contacted the following labs and can refer customers to them:
• Shenzhen Academy of Metrology & Quality Inspection (深圳市计量质量检测研究院), http://www.smq.com.cn/Web/index.aspx
  Location: Shenzhen
  Contact person: Mr Li (Director Li), 13760491339
  Estimated cost to receive ITU-T certification: 2 days @ 3000 RMB/hour or 20 kRMB/day => 40 kRMB
• China Telecommunication Technology Labs (中国泰尔实验室), http://www.chinattl.com/
  Location: Beijing
  Contact person: Xinglong Zhao, 18311191852
  Estimated cost to receive ITU-T certification: 2 days @ 3000 RMB/hour or 20 kRMB/day => 40 kRMB
• MicroTest (深圳微测), http://www.mtitest.com/ and http://www.mtitest.com/Projects/Projects-08391546581.html
  Location: Shenzhen
  Contact person: Nancy Le, 15019402635, mail: [email protected]
  Estimated cost to receive ITU-T certification: 2 days @ 2000 RMB/hour => 32 kRMB
PUBLIC 26
AI Speaker and Smart Home
PUBLIC 27
AI Speaker Far Field Reference Design
[Diagram: far-field signal chain on NXP i.MX 7D / 8M Mini with codec and DSP]
• Front end (for "far-field"): 7-microphone array, hardware audio codec/ADC, beamforming, dereverberation, AEC, "XXXX" trigger phrase/hot word detection
• Back end processing in the cloud: ASR, voice assistant, information and/or action
AEC = Acoustic Echo Cancellation. Important distinction: far field (across the room) vs. near field (i.e., on a headset)
PUBLIC 28
AI Speaker Ecosystem Components
[Diagram: two system options with numbered components 1 to 5]
• With codec DSP: microphone array (1 to 8 mics), audio codec with DSP (hardware DSP), i.MX 7D/8M Mini running Yocto Linux (wake word royalty)
• System integrator, production
• Without codec DSP: microphone array, i.MX 7D/8M Mini running Yocto Linux; all far field audio runs on i.MX (wake word & audio algorithms royalty)
PUBLIC 29
Microphone Arrays
Circular, Rectangular, or Triangular Arrays:
• Best for 360° capture field – center room locations
Line Array:
• Best for up to 180° capture field
• Suitable for close to, and on wall locations
PUBLIC 30
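Voice Front End Solutions Overview
Partners in alphabetical order. Fields: microphones; front end processing; trigger phrase detection; cloud; interfaces; demo availability; mass market support; device part no.
• ADI: 7 mics (6 in circle, 1 in center); front end: yes (Amazon restricted); trigger phrase: yes (Alexa); cloud: Amazon; interfaces: I2S, SPI; demo: from Amazon; mass market support: no; part: ADN8080
• ADI: 4 mics (3 in triangle, 1 in center); front end: yes; trigger phrase: yes (Sensory); cloud: Amazon; interfaces: I2S, SPI; demo: limited; mass market support: no; part: ADN8080
• Synaptics: 2 mics (CX20921) or 4 mics (CX20924); front end: yes; trigger phrase: yes (Sensory); cloud: Amazon; interfaces: USB, I2S, SPI; demo: now; mass market support: Arrow
• Microsemi: 2 or 3 mics; front end: yes; trigger phrase: yes (Sensory); cloud: Amazon; interfaces: I2S; demo: yes; mass market support: Arrow, Future
• DSP Concepts: 2 to 7 mics; front end: yes, software; trigger phrase: yes (Sensory); cloud: Amazon; interfaces: API; demo: 3Q2017; mass market support: planned
• Rokid: 2 to 6 mics; front end: yes, software; trigger phrase: yes (Rokid); cloud: Rokid; interfaces: API; demo: 2Q2018; mass market support: yes; device: i.MX8M/8M Mini
• iFlytek: 2 to 6 mics; front end: yes, software; trigger phrase: yes (iFlytek); cloud: iFlytek; interfaces: API; demo: 2Q2018; mass market support: yes; device: i.MX8M/8M Mini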
PUBLIC 31
Voice HAT Solution: i.MX + Daughterboard
• Daughterboard that plugs directly into the PicoPi baseboard
• HAT comes with 2x I2S microphones and 2x NXP TFA9892 speaker amplifiers, suitable for voice assistant applications such as AVS and Google Voice Assistant
• Co-designed, manufactured and sold by TechNexion
[Diagram: HAT board with 40-pin header, 2x TFA9892, header, Mic A, Mic B, power management, I2S, I2C, SPKR A, SPKR B]
PUBLIC 32
Creoir Far Field Voice HAT Solution
• Daughterboard plugs directly into the PicoPi baseboard
• AVS for i.MX6UL, i.MX7D, i.MX8M, i.MX8M Mini
• Utilizes the same Yocto image
• Short design cycle
• Creoir is a certified AVS System Integrator with experience in audio design/tuning and in working with customers through the AVS product certification process
• Creating a companion setup app
[Diagram: HAT board with 40-pin header, CX20921, header, Mic A, Mic B, power management, I2S, I2C]
PUBLIC 33
https://developer.amazon.com/alexa-voice-service/dev-kits/amazon-7-mic
PUBLIC 34
Synaptics Solution: 2Mic + NXP PICO-PI-IMX7
Kit includes:
• PICO-PI-IMX7 i.MX7D development board
  − Pre-flashed with Yocto Linux BSP
  − AVS Device SDK
  − Sensory wake word
• CX20921 evaluation board, pre-flashed with firmware
• Microphone module with two omnidirectional mics
• Microphone holder board
This kit can be ordered through Arrow: https://www.arrow.com/en/products/conexantnxp2micforavs/arrow-development-tools
[Photo labels: NXP PICO-PI-IMX7, Conexant 2 Microphone Module, Conexant CX20921 Evaluation Board]
PUBLIC 35
Amazon Alexa & Google Assistant Reference Platform
PUBLIC 36
Cloud Service Support
PUBLIC 37
Audio Lab Support by NXP Lab
Reproduce complex
acoustic test cases
PUBLIC 38
3rd Audio Lab: Digital Benchmark
• The curves below show Sensory hot word detection without SpeechAssist on Win32
• Eventually, with the lab, we will have curves with and without SpeechAssist 4.0 at different SNRs/distances on i.MX RT1050, along with competitors' curves
• We're moving from subjective and unverified claims (often made by the competition) to objective measurements
[Plots: true positive percentage vs. SNR; true positive percentage vs. reverb at SNR = 12 dB]
PUBLIC 39
AI Voice Nodes
PUBLIC 40
The Low Cost Voice Control Project
1. Cost effective robust MCU based voice control
2. Seamless WiFi and Bluetooth Connectivity
3. Robust to best-in-class Security options
4. Confidence to Production experience
“We want to add voice control to our existing application but we still need to be
within our target cost range. MCU voice would be a great fit for us if it doesn’t delay
our timeline, or add a bunch of risk to the project.”
PUBLIC 41
Easily/inexpensively add voice control capabilities to your existing application…
Targeted Product Areas Include
PUBLIC 42
Use Case – Cloud Based Voice Enablement
Enablement Functionality: Key Provisioning | Cloud Onboarding | Mobile Control | Discovery | BLE Streaming | Skill Enablement
• Enable voice in all products, each with specific target skills to control.
• Build a home that acts as a voice repeater (removing voice dead zones).
• Allow consumers to multi-task, unlike single-task mobile use. For example:
  − Prepare dinner.
  − Listen to your favourite podcast.
  − Turn on the kettle.
  − Control home equipment.
  − Set reminders.
• Products are controlled devices instead of controllers
PUBLIC 43
Low Cost MCU Voice Control Hardware System Diagram
[Diagram: application microcontroller surrounded by power supply, memory, audio input (analog front end; analog or digital microphones), audio output (analog amplifier; analog speaker) and connectivity (wireless interfaces; WiFi and Bluetooth radio)]
Voice control as an add-on, or integrate the existing application and voice on the same MCU?
PUBLIC 44
HW Block Diagram Preview for the Voice Control Solution
Design focused on an optimized BOM to achieve a cost-optimized solution
[Block diagram: two boards joined by a 40-pin board-to-board connector]
• i.MX RT control module: MIMXRT1051DVL06A main MCU (196 MAPBGA, 10x10mm, 0.65); 256Mbit QSPI flash memory; CYW4343W WiFi and BLE wireless SoC with u.FL connector; A71CH authentication (security); KW21Z Zigbee/Thread wireless SoC; 5 to 3.3V regulator
• Interfaces shown: QSPI, SDIO, UART, I2C, SPI, I2S, USB
• Voice application board: TFA9894DB audio amplifier (I2S, audio out); 3x SPH0641LM4H-1 digital mics (mic in connector); 1.8V and 3.3V regulators; USB connector; serial port header; RGB LED, WiFi LED, power LED; SW1, SW2; 5V DC; i.MX RT JTAG and KW21Z JTAG
• Ethernet (KSZ8081, Ethernet connector) on proto version only
Note: Default 180 is 2 Mics
PUBLIC 45
Layout Preview for the i.MX RT Connected Module
Theoretical dimensions: 22mm x 30mm. Max dimensions: 30mm x 40mm.
Front view:
• NXP MIMXRT1052DVL6B application microcontroller, 196-ball MAPBGA, 10x10mm, 0.65
• MURATA LBEE5KL1DX (1DX) WiFi and BLE module, LGA 6.95x5.15mm
• MOLEX 0734120110 u.FL micro antenna connector, SMT 1.25mm
• ABRACON ABM8G-24.000MHZ-B4Y-T 24MHz crystal, 30ppm, 3.2x2.5mm
• ABRACON ABS07-32.768KHz-7-T 32.768kHz crystal, 30ppm, 3.2x2.5mm
• IS25LP256DJ 256Mbit QSPI NOR flash
• ISSI IS26KL256S-DABLI00 256Mbit HYPER flash, 24-ball TBGA, 6x8mm, 5x5
Back view:
• MOLEX 501745-0401 board-to-board connector, 40-pin, 10x2.3mm, 0.4
PUBLIC 46
Layout Preview for the Voice Control Board
Dimensions: 22mm x 30mm
Front view:
• NXP TFA9894DB audio amplifier, I2S interface, 48 WLCSP, 3.55x2.5mm
• KNOWLES SPH0641LM4H-1 MEMS microphone, PDM output, bottom port LGA, 3.5x2.65mm
• MOLEX 201267-0005 USB Type-C connector, right angle, top/surface mount
• Speaker connector, user RGB LED, SW1, SW2, power LED, WiFi LED, pin-holes for I2S signals
Back view:
• MOLEX 051338-0474 board-to-board connector, 40-pin, 10x2.3mm, 0.4
PUBLIC 47
Layout Preview for the Overall Voice Control Solution
75mm x 75mm x 45mm
Connected System Module with Certified Radio
Design based on two boards stacked together:
• Connected Module embedding power, digital, radio
• Audio Board embedding sensors and user interfaces
[Layout labels: speaker connector; RT1052DVL06 (196 MAPBGA, 10x10mm, 0.65); 1DX WiFi and BLE; TBD 256Mbit Octal SPI NOR flash; A71CH; KW21Z (48 QFN, 7x7mm, 0.5)]
PUBLIC 48
Software Block Diagram for the Voice Control Solution
[Block diagram; legend: NXP SW, Customer SW, Partner SW*, Later Phase*; * = pre-integrated by NXP]
• Customer/Solutions application: key provisioning*, onboarding*, WiFi access point mode, companion app communication, local commands, callback
• Cloud and network stack: AVS Client SDK, MQTT, HTTP/2, mBedTLS, LWIP
• Audio framework: media codec, streamer, audio input processor
• Audio processing: beamforming, barge-in, echo cancellation, noise suppression
• Wake word engine: enable processing callback; audio input streaming when the wake word triggers
• Driver layer: ADC driver, WiFi driver, XMOS/I2C/PWM, additional drivers
PUBLIC 49
Resource Estimates
[Software block diagram as on the previous slide, annotated with resource figures; legend: NXP SW, Customer SW, Partner SW*, Later Phase*; * = pre-integrated by NXP]
• Init: ~300 MHz; CPU: ~80 MHz; Flash: 233 Kb; RAM: 216 Kb
• Estimated: CPU: ~150 MHz; Flash: 192 Kb; RAM: 160 Kb. Includes: model, algorithm, WWE
• Does not include the decimation, which is currently being worked on.
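A quick way to read the two estimates is to add them up and compare against the MCU. The sketch below does that under several assumptions that are not stated on this slide: the two sets of figures are additive, "Kb" means kilobytes, and the target is an i.MX RT1050-class part (Cortex-M7 up to 600 MHz, 512KB SRAM, as listed later in this deck). The block names are placeholders.

```python
# Figures from the slide; the block names are hypothetical labels.
ESTIMATES = {
    "measured_block": {"cpu_mhz": 80, "flash_kb": 233, "ram_kb": 216},
    "estimated_block": {"cpu_mhz": 150, "flash_kb": 192, "ram_kb": 160},
}

# Assumed budget: i.MX RT1050-class MCU (600 MHz core, 512 KB SRAM).
BUDGET = {"cpu_mhz": 600, "ram_kb": 512}


def totals(estimates):
    """Sum each resource across all blocks."""
    keys = ("cpu_mhz", "flash_kb", "ram_kb")
    return {k: sum(block[k] for block in estimates.values()) for k in keys}


def headroom(total, budget):
    """Remaining budget for every resource that has a known limit."""
    return {k: budget[k] - total[k] for k in budget}


total = totals(ESTIMATES)
print(total)
print(headroom(total, BUDGET))
```

Decimation is excluded from the slide's figures, so the real headroom would be smaller than this sum suggests.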
PUBLIC 50
Audio Processing Overview
[Block diagram; legend: NXP SW, Partner SW*; * = pre-integrated by NXP]
• i.MX-RT connected module: MIMXRT1052DVL06A main MCU (196 MAPBGA, 10x10mm, 0.65); QSPI NOR flash memory (QSPI); Cypress CYW4343W WiFi + BT4.1 wireless SoC (SPI); regulator/protection; uFL connector
• Driver layer: I2S 1 out, I2S 2 in, I2S 3 in; audio pre-processing with decimation on each input
• Audio processing: beamforming, barge-in, echo cancellation, noise suppression
• Wake word engine: enable processing callback; audio input streaming when the wake word triggers
• Audio framework: media codec, streamer, audio input processor
PUBLIC 51
Driver and Decimation
[Block diagram: same i.MX-RT connected module as on the previous slide, highlighting the driver layer: I2S 1 out, I2S 2 in, I2S 3 in, audio pre-processing with decimation on each input; legend: NXP SW, Partner SW*; * = pre-integrated by NXP]
• Two I2S ports are used for the individual microphones
• The last I2S port is used for the audio amplifier
• Decimation reduces the sample rate to the target rate
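The decimation step can be illustrated with the simplest possible scheme: average each block of M input samples (a crude boxcar low-pass filter) and emit one output sample per block. This is only a sketch of the idea, not the NXP driver code; production designs typically use CIC or FIR filters, and the factor of 3 below (for example 48 kHz down to 16 kHz) is an assumed value.

```python
def decimate(samples, factor):
    """Reduce the sample rate by `factor`: average each block, keep one value per block."""
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    usable = len(samples) - len(samples) % factor  # drop an incomplete trailing block
    return [sum(samples[i:i + factor]) / factor for i in range(0, usable, factor)]


# Seven input samples, factor 3: two full blocks, the last sample is dropped.
print(decimate([0, 3, 6, 1, 1, 1, 9], 3))
```

Averaging before discarding samples matters: simply keeping every third sample would fold high-frequency content back into the audio band as aliasing.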
PUBLIC 52
Audio Processing Software
[Block diagram: audio framework (media codec, streamer, audio input processor); audio processing (beamforming, barge-in, echo cancellation, noise suppression); wake word engine (enable processing callback; audio input streaming when the wake word triggers); local commands; legend: NXP SW, Partner SW*; * = pre-integrated by NXP]
The audio processing software is a library that is optimized for the product and its microphone placement/orientation. It cleans up the microphone audio before handing it off to the wake word engine or audio streamer. It also supports far-field implementations at distances greater than eight feet.
Barge-in: the process of taking the audio being output and removing it from the microphone input. This is normally needed when playing music while also trying to issue voice commands.
Echo Cancellation: the process of removing the sounds reflected from floors, walls and ceilings.
Beamforming: the process of taking multiple microphone inputs and focusing on the sounds coming from the direction of the user.
Noise Suppression: the process of reducing the environmental noise in the microphone input.
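The beamforming idea described above can be sketched with delay-and-sum, the most basic beamformer: sound from the chosen direction reaches each microphone at a slightly different time, so each channel is shifted to line those arrivals up and the channels are then averaged. This is a generic textbook illustration, not the partner library's algorithm; the integer sample delays are assumed to be known in advance.

```python
def delay_and_sum(channels, delays):
    """Advance each channel by its arrival delay (in samples), then average the channels."""
    length = min(len(ch) - d for ch, d in zip(channels, delays))
    return [
        sum(ch[n + d] for ch, d in zip(channels, delays)) / len(channels)
        for n in range(length)
    ]


# A pulse from the target direction hits mic 0 at sample 2 and mic 1 at sample 4.
mic0 = [0, 0, 1, 0, 0, 0]
mic1 = [0, 0, 0, 0, 1, 0]
steered = delay_and_sum([mic0, mic1], delays=[0, 2])
unsteered = delay_and_sum([mic0, mic1], delays=[0, 0])
print(steered)    # aligned pulses add up to full amplitude
print(unsteered)  # misaligned pulses are each halved
```

Sounds arriving from other directions stay misaligned after the shift, so averaging attenuates them relative to the target.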
PUBLIC 53
Wake Word Engine
[Block diagram: audio framework (media codec, streamer, audio input processor); audio processing (beamforming, barge-in, echo cancellation, noise suppression); wake word engine (enable processing callback; audio input streaming when the wake word triggers); legend: NXP SW, Partner SW*; * = pre-integrated by NXP]
After the audio has been processed, it is fed into the wake word engine. This engine determines whether the Alexa wake word was uttered.
What happens when the wake word is said:
• The engine will inform the Audio Framework to start streaming.
• The streaming data will be sent to AVS.
• The engine will inform the Audio Framework to stop streaming.
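The three steps above amount to a small two-state machine: idle until the wake word is detected, then streaming until told to stop. The sketch below models that flow with placeholder names; `AudioFramework`, `run_wake_word_loop` and the two detector callbacks are hypothetical and do not correspond to the real NXP or AVS SDK APIs. How the engine decides to stop is not stated in the slides, so an end-of-utterance callback is assumed.

```python
class AudioFramework:
    """Stand-in for the audio framework: records what would be streamed to the cloud."""

    def __init__(self):
        self.streaming = False
        self.sent = []

    def start_streaming(self):
        self.streaming = True

    def stop_streaming(self):
        self.streaming = False

    def send(self, frame):
        if self.streaming:
            self.sent.append(frame)


def run_wake_word_loop(frames, is_wake_word, is_end_of_utterance, framework):
    """Feed processed audio frames through the idle/streaming state machine."""
    for frame in frames:
        if not framework.streaming:
            if is_wake_word(frame):
                framework.start_streaming()   # step 1: start streaming
        elif is_end_of_utterance(frame):
            framework.stop_streaming()        # step 3: stop streaming
        else:
            framework.send(frame)             # step 2: forward audio to the cloud


framework = AudioFramework()
frames = ["noise", "alexa", "turn", "on", "<end>", "noise"]
run_wake_word_loop(frames, lambda f: f == "alexa", lambda f: f == "<end>", framework)
print(framework.sent)
```

Strings stand in for audio frames here so the control flow is easy to follow; a real engine would operate on PCM buffers.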
PUBLIC 54
Audio Framework
[Block diagram: media player, media browser, media indexing, playlist / play queue on top of a common Audio Framework API; below it: media device support, device manager, input manager]
Audio Framework provided by NXP
• Provides an audio media player
• Supports industry codecs
  – MP3
  – WAV
  – AAC
• Support for multiple audio interfaces.
• A lightweight NXP IP version of GStreamer.
PUBLIC 55
Opportunity to Start with Kit Plus Rich Set of IDEx IP
OOB HW/SW: BOMs, RF certification artifacts, layouts, schematics, software source, training videos, documentation, design flows
NXP IoT Solutions Provide
• Coherency
• Pre-Integration
• Testing
• Production Readiness
PUBLIC 56
Project: NXP Hotword
Voice User Interface Solution: Long term goal
[Diagram: microphones; i.MX RT running EC / NR (NXP), wake-up word (NXP) and the audio framework, with NXP HW machine learning acceleration; NXP TFA amplifier; legend: NXP HW, NXP SW]
• i.MX RT: MICR
• TFA Amplifier: SIP
• EC / NR: SIP
• Wake-up word: SIP
• Audio Framework: SIP / MICR
• ML SDK: MICR
PUBLIC 57
NXP – Production Grade SoC
PUBLIC 58
i.MX Series: 3 Processor Families with Targeted Features
i.MX 8 series, ARM® v8-A (32-bit/64-bit)
• i.MX 8 family: Advanced Graphics & Performance
• i.MX 8M family: Advanced Audio, Voice & Video
• i.MX 8X family: Safety Certifiable & Efficient Performance
i.MX 7 series, ARM® v7-A
• i.MX 7 family: Flexible Efficient Connectivity
• i.MX 7ULP family: Ultra Low Power with Graphics
i.MX 6 series, ARM® v7-A (32-bit)
• i.MX 6QuadPlus, i.MX 6Quad, i.MX 6DualPlus, i.MX 6Dual, i.MX 6DualLite, i.MX 6Solo, i.MX 6SoloLite, i.MX 6SoloX, i.MX 6UltraLite, i.MX 6SLL, i.MX 6ULL
[Diagram labels: CPU cores per family (A72, A53, A35, A9, A7, M4); pin-to-pin compatible; software compatible]
PUBLIC 59
i.MX 8M 'Mini' Family
Cost-effective applications processors in 3rd generation 14LPC FinFET process, for consumer and industrial applications
Target applications: Streaming Media, Voice Assistants, Industrial IoT, Edge Compute, Video Conferencing, AI / Machine Learning
i.MX 8M Mini (Quad/Dual/Solo)
• ARM CPU: 1x, 2x, or 4x Cortex-A53 | Cortex-M4, up to 18,000 DMIPS
• 2D + 3D GPU: separate 2D/3D; 1 Vec4 shader; up to 6.4 GFLOPS; OpenGL ES 2.0; OpenVG 1.1; up to 40 MTri/s; up to 400 MPix/s
• Video: decode 1080p60 H.265, VP9, H.264, VP8; encode 1080p60 H.264, VP8
• Display / Camera: MIPI-DSI 4-lane; MIPI-CSI 4-lane
• Audio I/O: 20 channels; 32-bit @ 384 kHz; DSD512, TDM; SPDIF Tx & Rx; 8-ch PDM MIC
• Connectivity I/O: 2x USB 2.0; 1x PCIe; 3x SDIO; 1x GbE
i.MX 8M Mini Lite (Quad/Dual/Solo)
• ARM CPU: 1x, 2x, or 4x Cortex-A53 | Cortex-M4, up to 18,000 DMIPS
• No video acceleration
• GPU, display / camera, audio I/O and connectivity I/O: same as i.MX 8M Mini
Software compatibility and pin compatibility across the family: 14x14 0.5mm or 17x17 0.75mm (*TBD)
PUBLIC 60
Quad/Dual/Solo ARM Cortex-A53 @ 1.6-2.0 GHz (up to 18,400 DMIPS)
− ARM v8 Fully 64-bit capable
ARM Cortex-M4 @ 400+ MHz for Low Power, Security
Package: FCBGA 14x14mm, 0.5mm pitch de-pop array
FCBGA 17x17mm, 0.75mm pitch (*TBD)
Operating System targets: Linux OS, Android OS, FreeRTOS
Qualification for Consumer and Industrial applications
Feature Highlights:
− Security: DRM support for RSA, AES, 3DES, DES
− GC NanoUltra 3D Graphics GPU, 1 shader core, OpenGL ES 2.0, 6.4 GFLOPS, 400M
Pix/s, 40M Tri/s
− GC328 2D Graphics GPU
− 1080p60 H.265/HEVC, VP9, H.264, VP8 decoder
− 1080p60 H.264, VP8 encoder
− x16, x32 LPDDR4/DDR4/DDR3(L) (up to 3200 Mtps) LPDDR4: 1600bps @ 800MHz
− High quality image resizing and graphics overlay
− Audio: S/PDIF Rx & Tx, 20x I2S (up to 20ch 32bit @ 384Khz support)
− Display Interfaces: 1x MIPI DSI (4-lane) with PHY
− Camera Interfaces: 1x MIPI CSI2 input (4-lane each) with PHY
− 2x USB 2.0 OTG with PHY
− 1x Gb Ethernet (MAC): AVB & IEEE 1588 for sync, and EEE for low power
− 1x PCIe 2.0 (1-lane) with L1 substates (low power, fast wakeup)
− 4x UART, 4x I2C, 3x SPI
− 3x SDIO3.0 / eMMC5.0 / SD4
− Raw NAND controller (BCH62)
− Quad-SPI for fast boot from SPI NOR; with Execute in Place (XIP); single-bit, dual-bit,
quad-bits and octal-bits access are supported.
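Two of the headline numbers in this list can be cross-checked with simple arithmetic. The sketch below assumes 2.3 DMIPS/MHz per Cortex-A53 core, a commonly quoted Arm figure that does not appear in these slides, and treats the audio figure as raw payload (channels x bits x sample rate) with no framing overhead. Function names are illustrative.

```python
# Assumption: commonly quoted Dhrystone figure for Cortex-A53, not from the slides.
A53_DMIPS_PER_MHZ = 2.3


def total_dmips(cores, clock_mhz, dmips_per_mhz=A53_DMIPS_PER_MHZ):
    """Aggregate Dhrystone MIPS for identical cores at the same clock."""
    return cores * clock_mhz * dmips_per_mhz


def audio_payload_mbps(channels, bits_per_sample, sample_rate_hz):
    """Raw PCM payload in Mbit/s, ignoring any framing overhead."""
    return channels * bits_per_sample * sample_rate_hz / 1e6


# Quad Cortex-A53 at 2.0 GHz: close to the "up to 18,400 DMIPS" quoted above.
print(total_dmips(cores=4, clock_mhz=2000))
# 20 channels of 32-bit audio at 384 kHz.
print(audio_payload_mbps(channels=20, bits_per_sample=32, sample_rate_hz=384_000))
```

The first result matching the slide's figure suggests the quoted DMIPS number is for the quad-core part at the top 2.0 GHz clock.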
i.MX 8M Mini
[Block diagram]
• Main CPU platform: Quad/Dual Cortex-A53, each core with NEON, FPU, 32KB I-cache and 32KB D-cache; 512KB L2 cache
• Low power, security CPU: Cortex-M4 with 16KB I-cache, 16KB D-cache, 256KB TCM (SRAM)
• System control: Secure JTAG, XTAL, PLLs, Smart DMA x3, Timer x6, PWM x4, Watchdog x3, temperature sensor
• Security: random number, TrustZone, secure clock, eFuse key storage, DRM ciphers, 32KB secure RAM
• Multimedia: 3D graphics GC NanoUltra; 2D graphics GC328; 1080p60 VP8, VP9, H.264, H.265 decoder; 1080p60 VP8, H.264 encoder; MIPI-CSI 4-lane with PHY; MIPI-DSI 4-lane with PHY; 8ch PDM input; S/PDIF Rx & Tx; 20x I2S/SAI
• External memory: x16-x32 LPDDR4/DDR4/DDR3(L) up to 3200 Mbps; 3x SDIO3.0/MMC5.1/SD4; NAND controller (BCH62); dual-ch QuadSPI (XIP)
• Connectivity & I/O: 4x UART 5Mbps; 4x I2C, 3x SPI; 1x PCIe 2.0, 1-lane; 1Gb Ethernet (IEEE 1588, EEE & AVB); 2x USB2.0 OTG + PHY
PUBLIC 61
i.MX 8M Mini Key Features
• System Design Optimization
- 14x14 0.5mm package designed for maximum feature
enablement with 6-8 layer board design and no microvias
- Pin-compatible with the i.MX 8M Nano provides drop-in scalable
product performance
- 17x17 0.75mm package for the broad market (*TBD)
- 8ch DMIC support for direct connection of PDM
microphones (no CODEC) enables low system-cost
- Enabling software such as Linux/Android BSP and solutions
software (e.g. Voice, Machine Learning, Audio Framework)
• Triple-Play Audio/Voice/Video
− Up to 1080p60 video decoding (H.265, H.264, VP8/9)
− Up to 1080p60 video encoding (H.264, VP8) using
parallel VPU engine enables video transcode
applications (video calling)
− 2D and 3D GPU to enable 1080p media UI
− Advanced audio capabilities including 8ch DMIC support, 32-bit @ 384 kHz audio interfaces, multiple audio channels
• Broad System Connectivity
− MIPI-DSI (4-lanes) for display
− MIPI-CSI (4-lanes) for camera input
− Multiple SDIO interfaces to enable flexibility in supporting
boot, expansion and connectivity (Wi-Fi)
− PCIe with L1 low power substates enables a range of high-
performing Wi-Fi/BT solutions and other connectivity
− Gigabit Ethernet and USB 2.0
• Scalable Performance at Low Power
- Advanced process technology node delivers much lower
leakage than standard technology
- Single-, dual- or quad-core Cortex-A53 cores up to 2.0
GHz; scalable performance in a pin-compatible package
- Heterogeneous multi-core processing with Cortex-M4
running at 400+ MHz; offload tasks, optimize power
- Power efficient 3D GPU and VPU enables 1080p video
transcode and display
- DDR3L, DDR4, and LPDDR4 Support
Preliminary, subject to change
PUBLIC 62
Scalability of Embedded Processing, the New Normal
[Diagram: i.MX 6UL/ULL, i.MX RT and i.MX 7ULP positioned for ultra-low power (dynamic & static); architecture tiers: ARM v8/v8m + GPU/DSP, ARM v7/v7m + 2D/3D, ARM v7m + Audio]
PUBLIC 63
i.MX RT1050: Key Differentiators
High Performance Real-Time Processing
• Cortex-M7 up to 600MHz (50% faster than current existing M7 products)
• 20ns interrupt latency
• Up to 512KB Tightly Coupled Memory
High Level of Integration
• High security enabled by AES-128, HAB and on-the-fly QSPI flash decryption
• 2D graphics acceleration engine
• Parallel camera sensor interface
• LCD display controller up to WXGA (1366x768)
• Audio interface with three I2S for multichannel high performance audio
Low BOM Cost
• Competitive pricing, starting @ $2.98 10k RSL
• Fully integrated PMIC with DC-DC
• Low cost package, 10x10 BGA, enabling 4-layer PCB design
• SDRAM interface
Easy to Use
• MCU customers can leverage their current toolchain (MCUXpresso, IAR, Keil)
• Rapid and easy prototyping and development with NXP FreeRTOS, SDK, ARM mbed and the global ARM ecosystem
• Single voltage input simplifies power circuit design
• Scalability to Kinetis & i.MX products
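The 20ns interrupt latency and the 600MHz clock can be related with one line of arithmetic: latency in clock cycles equals latency in nanoseconds times clock frequency in MHz, divided by 1000. The helper below is only a worked conversion using the two figures quoted above; its name is illustrative.

```python
def latency_cycles(latency_ns, clock_mhz):
    """Convert a latency in nanoseconds to core clock cycles."""
    return latency_ns * clock_mhz / 1000


# 20 ns at 600 MHz corresponds to 12 core clock cycles.
print(latency_cycles(latency_ns=20, clock_mhz=600))
```

Seen this way, the 20ns figure applies at the top clock speed; at a lower clock the same cycle count takes proportionally longer.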
PUBLIC 64
Specifications
• Package: MAPBGA196 | 10x10mm^2, 0.65mm pitch (130 GPIOs)
• Temp / Qual: -40 to 105°C (Tj) Industrial / 0 to 95°C (Tj) Consumer
High Performance Real Time system
• Cortex-M7 up to 600MHz , 50% faster than any other existing M7 products
• 20ns interrupt latency, a TRUE Real time processor
• 512KB SRAM, configurable to 512KB TCM
Rich Peripheral
• Motor Control: Flex PWM X 4, Quad Timer X 4, ENC X 4
• 2x USB, 2x SDIO, 2x CAN, 1x ENET with 1588, 8xUART, 4x SPI, 4X I2C
• 8/16-bit CSI interface and 8/16/24-bit LCD interface
• Quad-SPI interface, with Bus Encryption Engine
• Audio interface: 3x SAI/ SPDIF RX & TX/ 1x ESAI
Security
• TRNG&PRNG(NIST SP 800-90 Certified)
• 128-AES cryptography
• Bus Encryption Engine: Protect QSPI Flash Content
Ease of Use
• FreeRTOS with SDK
• MCUXpresso
• Comprehensive ecosystem
Low BOM Cost
• Competitive Price
• Fully integrated PMIC with DC-DC
• Low cost package, 10x10 BGA with 0.65mm Pitch
• SDRAM interface
i.MX RT1050 Series Block Diagram
Key Features and Benefits
PUBLIC 65
i.MX RT1050: Enablement Overview
• Runtime Software (RTOS, Middleware; Partners): Comprehensive frameworks and solutions for low-power, connected, and secure embedded systems
• Software Development Tools (IDE / Toolchains): Industry leading IDE support and intuitive software configuration tools to accelerate application development
• Hardware Development Tools (Evaluation Kits; Partner Solutions): Low cost hardware platforms for evaluation and application development. Partner solutions for hardware debugging
• Application Specific (NXP Solutions; Connectivity Solutions, including 802.15.4): Software frameworks and development tools for targeted applications and certified connectivity solutions
  − Graphics, Touch HMI, Camera interface, Motor Control, Voice activation, Audio, Sensor Fusion, Cloud Connectivity
• Support: Get started quickly and get the support you need, when you need it
  − Broad Market: NXP Community, Solution Designs, Application Notes, Schematics
  − High Touch: Professional Support, Professional Services
PUBLIC 66
NXP Smart Amplifier for IoT Device
PUBLIC 67
What is NXP Smart SPK Amplifier?
Get the best possible sound from a small form factor
• 3-4 dB higher perceived loudness (with vs. without the amp)
• Enhanced audio fidelity with dynamic EQ and MBDRC
• Protection against speaker excursion and temperature
• DSP embedded: ease of use/integration
• Proven solution in the market
[Diagram: digital interface audio in; embedded processing (Speaker Boost, tuning tool, loudspeaker modeling, X-limit / T-max); analog hardware (voltage + current sensing, Class-DG, integrated boosted DC-DC); managed audio out]
Bring your product to the next level
PUBLIC 68
Mobile Audio Solution
• One-Stop-Shop for Audio Solutions in Mobile Devices
• Creative and Innovative teams with many Ideas
• Knowledgeable Local Support teams solving customers' problems
• Successful deployments with key players providing good references
• Extending scope to Adjacent Markets
Markets: Feature Phones; Smartphones / Tablets; Wearables / Hearables; Smart Home / Internet of Things
PUBLIC 69
Smart Amplifier Product Overview
Features                      | TFA9892   | TFA9894 | TFA9874 | TFA9896
Embedded DSP                  | Yes       | Yes     | No      | Yes
Package size (mm2)            | 10 / 11   | 8       | 6       | 6
Customer Sample               | Available | 1Q18    | 4Q17    | Available
Audio input interfaces        | I2S/PDM   | I2S/PDM | I2S/TDM | I2S/TDM
Boost Voltage                 | 12V       | 9.5V    | 9.5V    | 6.1V
RMS Output Power (8 ohm)      | 6.3W      | 4W      | 4W      | 2.1W
RMS Output Power (4 ohm)      | >7W       | 5.5W    | 5.5W    | 3.2W
Speaker Channels              | Mono      | Mono    | Mono    | Mono
Typical Speaker Load (Ω)      | 4-8-32    | 4-8-32  | 4-8-32  | 4-8-32
SNR (dB)                      | 115       | 100     | 100     | 102
Noise (uV) / Receiver mode    | 12        | 12      | 12      | 30/18
MB-DRC Yes (3-band) None (On Host) None
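The boost voltage and the 8 ohm output power rows can be sanity-checked against each other. For an undistorted sine wave with peak voltage V across a load R, the RMS power is V²/(2R). The sketch below assumes, purely for illustration, that the boost voltage is the maximum peak swing across the speaker and that the amplifier is lossless, so the result is an upper bound rather than a datasheet value.

```python
def ideal_sine_power_w(peak_volts, load_ohms):
    """RMS power of an undistorted sine with the given peak voltage into a resistive load."""
    return peak_volts ** 2 / (2 * load_ohms)


# (boost voltage in V, rated RMS power into 8 ohm in W), taken from the table above.
PARTS = {
    "TFA9892": (12.0, 6.3),
    "TFA9894": (9.5, 4.0),
    "TFA9874": (9.5, 4.0),
    "TFA9896": (6.1, 2.1),
}

for name, (boost_v, rated_w) in PARTS.items():
    bound = ideal_sine_power_w(boost_v, 8)
    print(f"{name}: ideal bound {bound:.2f} W, rated {rated_w} W")
```

Every rated figure sits below its ideal bound, which is what one would expect once real-world losses and distortion limits are accounted for.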
PUBLIC 70
NXP schedule
• Customer sample: May 2017; ramp up: Dec 2017
• Customer sample: June 2017; ramp up: Dec 2017
[Footprint diagrams with external components (CVDDP, CBST, CVBAT, LBST, CSENSE, RSENSE, CVDDD)]
• TFA9874: dimension labels 4.87, 4.45, 2.65, 2.55
• TFA9894 (with DSP): dimension labels 5.27, 4.45, 3.13, 2.55
Key Improvement versus TFA9872
• WLCSP package
New Layout of TFA9894 – Supporting IoT Device
[Figures: TFA9894 – Easy Routing; TFA9894 Original]
The POD (Package Outline Dimension) will be the same: no change in the physical location of the bumps, only their function will change
PUBLIC 72
Innovations
PUBLIC 73
Neural Network Applied to: Audio to Haptic
[Diagram: an audio stream (voice, sound effects, music, ambient) passes through a sound effect detector that separates speech from non-speech; audio is routed either to the loudspeaker or to the haptic actuator; content labels: voice, music, sound effects]
https://youtu.be/4DZxE77QPcY?t=165
PUBLIC 74
AI Based Speech Detection – Ex: Speaker Authentication
PUBLIC 75
Example: Speech vs. Music
PUBLIC 76
NXP MCU/MPU ML/DL Support
[Diagram: training in the cloud (at the customer); inference at the edge on the customer board (CPU, GPU, DSP, MLa, firmware, storage); NXP software: Vision SDK, Sensor SDK, Machine Learning SDK, eIQ-Engine; ready for inference: voice, vision]
PUBLIC 77
Q & A
www.nxp.com