Company External NXP, the NXP logo, and NXP secure connections for a smarter world are trademarks of NXP B.V. All other product or service names are the property of their respective owners. © 2018 NXP B.V. Microcontrollers Jia Chen Voice and Audio Solutions: Creating Smart Devices with NXP September 2018 | APF-MME-T3284


  • Company External – NXP, the NXP logo, and NXP secure connections for a smarter world are trademarks of NXP B.V. All other product or service names are the property of their respective owners. © 2018 NXP B.V.

    Microcontrollers

    Jia Chen

    Voice and Audio Solutions: Creating Smart Devices with NXP

    September 2018 | APF-MME-T3284

  • PUBLIC 1

    Agenda

    • Voice and Audio Market Trends

    • What Issues Need to Be Solved

    • NXP Leading Tech Solutions for Voice and Audio Markets

    • NXP – Production Grade SoC

    • Innovations

  • PUBLIC 2

    Voice and Audio Market Trends

  • PUBLIC 3

  • PUBLIC 4

    Market Sizing: Intelligent Home Speaker

    Source: Strategy Analytics

  • PUBLIC 5

    Voice Becoming the Key Interface in the Smart Home Market

    • Smart speakers are driving voice assistants in the home

    • Google will eventually take the smart speaker lead

    • Google Assistant can leverage the wider Google device and service ecosystem in the home

  • PUBLIC 6

    Echo/Assistant to 3rd Party Brands

    • In 2022, more than half of Intelligent Speaker shipments will be from third-party hardware manufacturers who license the virtual assistant technology.

    • Connected audio brands such as Sonos, Bose, Harman Kardon and Denon will be key players in this market as they develop partnerships with voice platform providers.

    • 1st party = Amazon Alexa (Echo, Dot) & Google Assistant

  • PUBLIC 7

    The Smart Home Revolution: Smart Life is Coming

    SMART HOME SMART CITY

  • PUBLIC 8

    The Smart Home Application Scenario

    [Diagram labels: Bedroom; Kitchen; spoken command "XX, turn on the light" (shown twice); Cloud Voice Service; Skills; Home Hub; Light Switch; Dimmers; Voice Enabled Lamp; Voice Enabled Coffee Machine]

  • PUBLIC 9

    What Issues Need to be Solved

  • PUBLIC 10

    Use Cases

    [Diagram: Calling use case. Blocks: Hardware, Audio Codec/ADC, ECNS, Beamforming, On-device RX signal processing, PA, i.MX, Mixer, Local Content. Signals: Desired speech, Near End, Echo signal, Far End.]

    [Diagram: Voice control use case. Blocks: Hardware, Audio Codec/ADC, ECNS, Beamforming, Wake Word Detection (WWD), NLU, Small Vocabulary ASR, On-device Audio Processing, Speech Synthesis, PA, i.MX, Mixer, Content, Local Content. Signals: Echo signal, Barge-In.]

  • PUBLIC 11

    Calling: LifeVibes VoiceExperience (LVVE) in Detail

  • PUBLIC 12

    Voice Control

    [Diagram: voice control signal chain.
    Hardware: microphone array (one or more microphones), Audio Codec/ADC.
    Front End (for "far-field"): Beamforming, Dereverberation, AEC, Trigger Phrase/Hot Word Detection.
    Back End Processing, in the cloud ("Ecosystem Play"): Voice Recognition, Voice Assistant, Information and/or Action.
    Implementation labels: "MPU Today: DSP Companion OR Integrated SW"; "MCU Today: DSP Companion PLUS".]

    AEC = Acoustic Echo Cancellation. Important distinction: far field (across the room) vs. near field (i.e., on a headset)

  • PUBLIC 13

    Audio Play Back: NXP Smart Amplifier Makes a Real Difference

    Real-time sensing of speaker temperature and excursion enables superior audio quality and loudness management capability while providing critical speaker safety information

    [Diagram: NXP Smart Amplifier (TFA9894) and NXP Host Software Codec. Labels: Hardware, 10V Boost Converter, Class-D Amp, Current Sensing, Audio In, Embedded Processing, Loudspeaker Model, Speaker Protect and Audio Boost, Spkr-As-Mic (SaM), Microphone Processing, SaM Motion Filter, Tap Tap Sensing, Audio Capture, High SPL Noise Suppression, Front / Back Audio Zoom, Music Enhancement software]

  • PUBLIC 14

    NXP Leading Tech Solutions for

    Voice and Audio Markets

  • PUBLIC 15

    IoT Market Offering

    Premium solutions to improve audio and voice end-user experiences for a wide range of application domains

    Automotive

    • Software

    − Enable premium quality calling/speech UI

    − ITU-T and Apple CarPlay compliant

    • Easy Deployable

    − Pre-integrated in NXP Radio and i.MX

    − Tools for easy and automated acoustic tuning

    AI Voice Nodes

    • Software

    − Enable premium quality calling/speech UI

    − Skype Premium and Cortana Premium compliance

    • Smart Amplifier: TFA9862, TFA9892, TFA9894

    • Supported Platforms

    − Intel ISST, AMD

    − Microsoft Windows APO

    • Tools for easy and automated acoustic tuning

    AI Speaker & Smart Home

    • Software

    − Enable premium quality calling and speech UI incl. far-field support (roadmap)

    • Smart Amplifier: TFA9892, TFA9894/74, TFA9862

    • Easy Deployable

    − Pre-integrated in i.MX8M / LPC next

    − Tools for easy and automated acoustic tuning

  • PUBLIC 16

    Automotive

  • PUBLIC 17

    Automotive: What are the Problems to Solve?

    Echo from multiple speakers

    Engine noise

    Passenger noise/voice

    Fan noise

    Exhaust noise

    Wind noise from window

    Road noise from tyres

    Echo: Echo makes it difficult to be understood

    Noise: Interior and exterior noises make it difficult to be understood and to understand

  • PUBLIC 18

    NXP Audio Solutions – Automotive Offering

    NXP Voice Audio Solutions offers premium software solutions to improve end-user voice experiences for automotive

    Communicate

    • Premium call quality in car

    • Providing multi-mic echo cancellation and noise suppression

    • Automotive certified (ITU-T 1100, 1110)

    • CarPlay compliant

    Control

    Improves the robustness of the speech recognition engine:

    • Barge-In recognition robustness

    • Background noise robustness

    Easy Deployable

    Pre-Integrated

    • NXP Dirana/Dione/Mercury

    • NXP i.MX

    Tuning Tools

    Roadmap

    ICC: Improve the intelligibility for In Car Communication

    ESE: Enhance in-cabin engine sound

  • PUBLIC 19

    DIONE-ECNR – SAF775C Feature Overview

    Radio

    • Highly sensitive AM/FM tuner with best-in-class performance

    • Integrated FM antenna buffer

    • RDS and RBDS demodulation and decoding

    • Support of WX

    • Digital radio support (HD/DRM) in combined use of Saturn via baseband I2S

    • Support of advanced radio processing (iMS, CEQ etc.)

    Audio

    • 4 x ADCs, 4 x DACs

    • Versatile IO ports (I2S, S/PDIF, TDM)

    • Two integrated DSPs dedicated for core audio control (SRC, volume, tone, equalization etc.)

    • One open audio core (Tensilica HiFi2) dedicated for advanced audio processing (ASE, ANC, ECNR, ESE) with max. frequency of 300MHz

    IC Package

    • HLQFP176

    • HVQFN184

    [Block diagram: DIONE (SAF775D/C). Labels: PDC, INFRA, Radio Processing, ARM M0 µP, Audio Processing, Audio ADCs, Audio DACs, Digital Audio Interfaces, E7a DSP (several instances), HiFi-2 DSP, AM/FM/HD/DRM Tuner, FM RF Buffer (optional)]

  • PUBLIC 20

    Echo Cancellation and Noise Reduction (ECNR)

    Echo from multiple speakers

    Engine noise

    Passenger noise/voice

    Fan noise

    Exhaust noise

    Wind noise from window

    Road noise from tyres

    Echo: Echo makes it difficult to be understood

    Noise: Interior and exterior noise make it difficult to be understood and to understand

  • PUBLIC 21

    CarVoice 1.0: 1-mic ECNR for Automotive

    Description

    1-mic voice enhancement software suite for cars, including HD voice, super-wideband (EVS, VoIP) and full-band (VoIP)

    Key Features:

    • Uplink 1-mic Acoustic Echo Cancellation and Noise Suppression

    • Downlink Loudness Maximizer and Noise Suppression

    • Full-band support: from 8 to 48kHz rates to fully support EVS and VoIP

    • Automotive certified

    Tool:

    • Parameter tool with automated tuning for AEC

    Certifications:

    Type | Certification | Current State (August 18th 2017)
    Voice Call | ITU-T P1100, ITU-T P1110 | Qualified. ITU-T P.1110: fulfilled with class 1 double talk at nominal level
    Voice Call | Apple CarPlay | Qualified

    Platforms:

    • NXP i.MX8

    • NXP DiRaNA 3 / DIONE-ECNR

  • PUBLIC 22

    CarVoice 1.0

    Description

    1-mic voice enhancement software suite for cars, including HD voice, super-wideband (EVS, VoIP) and full-band (VoIP)

    Key Features:

    • Uplink 1-mic Acoustic Echo Cancellation and Noise Suppression

    • Downlink Loudness Maximizer and Noise Suppression

    • Full-band support: from 8 to 48kHz rates to fully support EVS and VoIP

    • Automotive certified

    Tool:

    • Parameter tool with automated tuning for AEC

    Certifications:

    Type | Certification | Current State (August 18th 2017)
    Voice Call | ITU-T P1100, ITU-T P1110 | Qualified. ITU-T P.1110: fulfilled with class 1 double talk at nominal level
    Voice Call | Apple CarPlay | Qualified

    Platforms:

    • NXP i.MX8

    • NXP Dirana / Dione

  • PUBLIC 23

    NXP CarVoice – Compliance Status

    Application area | Spec / Operator | Result
    Mobile / Smartphone | AT&T (2.6.1) | Passed
    Mobile / Smartphone | T-Mobile US | Passed
    Mobile / Smartphone | Vodafone | Passed
    Mobile / Smartphone | Orange | Passed
    Mobile / Smartphone | CMCC | Passed
    Car | VDA 1.6 | Passed VDA 1.5
    Car | ITU-T 1100/1110 | Passed
    Car | ITU-T 1140 | Passed
    Car | CarPlay (ECNR part) | Passed

  • PUBLIC 24

    Sample Rate Matters

    • Narrowband: 8 kHz

    • Wideband: 16 kHz

    • Ultra-wideband: 24 kHz

    • Full band: 48 kHz
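
    To make the four rates on this slide concrete, the short sketch below computes the audio bandwidth each one can carry (half the sample rate, by the Nyquist limit) and the raw PCM data rate. The 16-bit mono format is an assumption chosen for illustration; the slide does not specify a sample format.

```python
# Sample rates named on the slide, in Hz.
SAMPLE_RATES_HZ = {
    "Narrowband": 8_000,
    "Wideband": 16_000,
    "Ultra-wideband": 24_000,
    "Full band": 48_000,
}


def audio_bandwidth_hz(sample_rate_hz: int) -> float:
    """Highest frequency representable at this sample rate (Nyquist limit)."""
    return sample_rate_hz / 2


def pcm_bytes_per_second(sample_rate_hz: int, bits_per_sample: int = 16,
                         channels: int = 1) -> int:
    """Raw PCM data rate. 16-bit mono is an assumption, not from the slide."""
    return sample_rate_hz * channels * bits_per_sample // 8


if __name__ == "__main__":
    for name, fs in SAMPLE_RATES_HZ.items():
        print(f"{name}: {fs} Hz sampling, "
              f"up to {audio_bandwidth_hz(fs):.0f} Hz audio, "
              f"{pcm_bytes_per_second(fs)} bytes/s")
```

    Moving from narrowband to full band multiplies both the usable audio bandwidth and the data the processing chain must handle by six.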

  • PUBLIC 25

    Getting ITU-T P1000, P1100 and ECNR CarPlay Certification

    NXP has contacted and can refer the following labs:

    Lab | Location | Contact person | Estimated cost to receive ITU-T certification
    Shenzhen Academy of Metrology & Quality Inspection (深圳市计量质量检测研究院), http://www.smq.com.cn/Web/index.aspx | Shenzhen | Mr. Li (李主任, Director Li), 13760491339 | 2 days @ 3000 RMB/hour or 20 kRMB/day => 40 kRMB
    China Telecommunication Technology Labs (中国泰尔实验室), http://www.chinattl.com/ | Beijing | Xinglong Zhao (赵兴龙), 18311191852 | 2 days @ 3000 RMB/hour or 20 kRMB/day => 40 kRMB
    MicroTest (深圳微测), http://www.mtitest.com/, http://www.mtitest.com/Projects/Projects-08391546581.html | Shenzhen | Nancy Le (乐丽琴), 15019402635, Mail: [email protected] | 2 days @ 2000 RMB/hour => 32 kRMB

  • PUBLIC 26

    AI Speaker and Smart Home

  • PUBLIC 27

    AI Speaker Far Field Reference Design

    [Diagram: far-field signal chain.
    Hardware: 7-microphone array, Audio Codec/ADC.
    Front End (for "far-field"): Beamforming, Dereverberation, AEC, "XXXX" Trigger Phrase/Hot Word Detection (ASR).
    Back End Processing, in the cloud: Voice Assistant, Information and/or Action.
    Other labels: NXP i.MX 7D / 8M Mini; Codec and DSP.]

    AEC = Acoustic Echo Cancellation. Important distinction: far field (across the room) vs. near field (i.e., on a headset)

  • PUBLIC 28

    AI Speaker Ecosystem Components

    [Diagram, first configuration (numbered 1 to 4): Microphone Array (1 to 8 mics); Audio Codec with DSP (hardware DSP); i.MX 7D/8M Mini with Yocto Linux (wake word royalty); System Integrator, Production]

    [Diagram, second configuration (numbered 5): Microphone Array; i.MX 7D/8M Mini with Yocto Linux. All far-field audio runs on i.MX. Wake word and audio algorithms royalty.]

  • PUBLIC 29

    Microphone Arrays

    Circular, Rectangular, or Triangular Arrays:

    • Best for 360° capture field – center room locations

    Line Array:

    • Best for up to 180° capture field

    • Suitable for close to, and on wall locations
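
    One way to see why a line array covers only about 180° while a circular array covers 360° is to compute the per-microphone arrival delays of a far-field sound. The sketch below is illustrative only: the array sizes (4 mics at 3 cm spacing, 6 mics on a 4 cm radius plus a center mic), the helper names, and the 2-D plane-wave model are assumptions, not values from the slides.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # approximate speed of sound in air at room temperature


def circular_array(num_mics, radius_m, with_center=False):
    """(x, y) positions of mics evenly spaced on a circle, plus an optional center mic."""
    positions = [
        (radius_m * math.cos(2 * math.pi * k / num_mics),
         radius_m * math.sin(2 * math.pi * k / num_mics))
        for k in range(num_mics)
    ]
    if with_center:
        positions.append((0.0, 0.0))
    return positions


def line_array(num_mics, spacing_m):
    """(x, y) positions of mics along the x axis, centered on the origin."""
    offset = (num_mics - 1) * spacing_m / 2
    return [(k * spacing_m - offset, 0.0) for k in range(num_mics)]


def arrival_delays_s(positions, azimuth_deg):
    """Delay of a far-field plane wave at each mic, relative to the array origin.

    Mics closer to the source hear the sound earlier, so their delay is negative.
    """
    az = math.radians(azimuth_deg)
    ux, uy = math.cos(az), math.sin(az)
    return [-(x * ux + y * uy) / SPEED_OF_SOUND_M_S for x, y in positions]


if __name__ == "__main__":
    line = line_array(4, 0.03)
    ring = circular_array(6, 0.04, with_center=True)
    for az in (60, -60):
        print("line", az, [f"{d * 1e6:.1f}us" for d in arrival_delays_s(line, az)])
        print("ring", az, [f"{d * 1e6:.1f}us" for d in arrival_delays_s(ring, az)])
```

    For the line array, a source at +60° and its mirror image at -60° produce identical delays, so the array cannot tell front from back. The circular array produces different delay patterns for the two directions, which is what lets it resolve a full 360° field.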

  • PUBLIC 30

    Voice Front End Solutions Overview

    Partner (alphabetical order) | Microphones | Front End Processing | Trigger Phrase Detection | Cloud | Interfaces | Demo Availability | Mass Market Support | Device Part No.
    ADI | 7 (6 in circle, 1 in center) | Yes (Amazon restricted) | Yes (Alexa) | Amazon | I2S, SPI | From Amazon | No | ADN8080
    ADI | 4 (3 in triangle, 1 in center) | Yes | Yes (Sensory) | Amazon | I2S, SPI | Limited | No | ADN8080
    Synaptics | 2 or 4 | Yes | Yes (Sensory) | Amazon | USB, I2S, SPI | Now | Arrow | CX20921 (2 mics), CX20924 (4 mics)
    Microsemi | 2 or 3 | Yes | Yes (Sensory) | Amazon | I2S | Yes | Arrow, Future | -
    DSP Concepts | 2 to 7 | Yes, software | Yes (Sensory) | Amazon | API | 3Q2017 | Planned | -
    Rokid | 2 to 6 | Yes, software | Yes (Rokid) | Rokid | API | 2Q2018 | Yes | i.MX8M/8M Mini
    iFlytek | 2 to 6 | Yes, software | Yes (iFlytek) | iFlytek | API | 2Q2018 | Yes | i.MX8M/8M Mini

  • PUBLIC 31

    Voice HAT Solution: i.MX + Daughterboard

    • Daughterboard that plugs directly into PicoPi baseboard

    • HAT comes with 2x I2S microphones and 2x NXP TFA9892 speaker amplifiers, suitable for voice assistant applications such as AVS and Google Voice Assistant

    • Co-designed, manufactured and sold by TechNexion

    [Board diagram labels: 40-pin Header, TFA9892, Header, Mic A, Mic B, Power Mgmt, I2S, I2C, SPKR A, SPKR B]

  • PUBLIC 32

    Creoir Far Field Voice HAT Solution

    • Daughterboard plugs directly into PicoPi baseboard

    • AVS for i.MX6UL, i.MX7D, i.MX8M, i.MX8M Mini

    • Utilizes same Yocto image

    • Short design cycle

    • Creoir is a certified AVS System Integrator with experience in audio design/tuning and in working with customers through the AVS product certification process

    • Creating companion setup app

    [Board diagram labels: 40-pin Header, CX20291, Header, Mic A, Mic B, Power Mgmt, I2S, I2C]

  • PUBLIC 33

    https://developer.amazon.com/alexa-voice-service/dev-kits/amazon-7-mic

  • PUBLIC 34

    Synaptics Solution: 2Mic + NXP PICO-PI-IMX7

    Kit includes:

    • PICO-PI-IMX7 i.MX7D development board

    − Pre-flashed with Yocto Linux BSP

    − AVS Device SDK

    − Sensory wake word

    • CX20921 evaluation board, pre-flashed with firmware

    • Microphone module with two omnidirectional mics

    • Microphone holder board

    This kit can be ordered through Arrow: https://www.arrow.com/en/products/conexantnxp2micforavs/arrow-development-tools

    [Photo labels: NXP PICO-PI-IMX7; Conexant 2-Microphone Module; Conexant CX20921 Evaluation Board]

  • PUBLIC 35

    Amazon Alexa & Google Assistant Reference Platform

  • PUBLIC 36

    Cloud Service Support

  • PUBLIC 37

    Audio Lab Support by NXP Lab

    Reproduce complex acoustic test cases

  • PUBLIC 38

    3rd Audio Lab: Digital Benchmark

    • The curves below show Sensory Hot Word Detection without SpeechAssist on Win32

    • Eventually, with the lab, we will have curves with and without SpeechAssist 4.0 at different SNRs/distances on i.MX RT1050, as well as competitors' curves

    • We're moving from subjective and unverified claims (often by competition) to objective measurements

    [Plots: true positive percentage vs. SNR; true positive percentage vs. reverb at SNR = 12dB]

  • PUBLIC 39

    AI Voice Nodes

  • PUBLIC 40

    The Low Cost Voice Control Project

    1. Cost effective robust MCU based voice control

    2. Seamless WiFi and Bluetooth Connectivity

    3. Robust to best-in-class Security options

    4. Confidence to Production experience

    “We want to add voice control to our existing application but we still need to be within our target cost range. MCU voice would be a great fit for us if it doesn’t delay our timeline, or add a bunch of risk to the project.”

  • PUBLIC 41

    Easily/inexpensively add voice control capabilities to your existing application…

    Targeted Product Areas Include

  • PUBLIC 42

    Use Case – Cloud Based Voice Enablement

    Enablement Functionality: Key Provisioning | Cloud Onboarding | Mobile Control | Discovery | BLE Streaming | Skill Enablement

    • Enable voice in all products that have specific target skills to control.

    • Build a home regarded as a voice repeater (removing voice dead zones).

    • Allow consumers to multi-task, unlike single-task mobile use. For example:

    − Prepare dinner.

    − Listen to your favourite podcast.

    − Turn on the kettle.

    − Control home equipment.

    − Set reminders.

    • Products are controlled devices instead of controllers

  • PUBLIC 43

    Low Cost MCU Voice Control Hardware System Diagram

    [Diagram blocks: APPLICATION MICROCONTROLLER; POWER SUPPLY; MEMORY; AUDIO OUTPUT (Analog Amplifier, Analog Speaker); AUDIO INPUT (Analog Front End, Analog or Digital Microphones); CONNECTIVITY (Wireless Interfaces, WiFi and Bluetooth radio)]

    Voice control as an add-on or integrate existing application and voice on same MCU?

  • PUBLIC 44

    HW Block Diagram Preview for the Voice Control Solution

    DESIGN FOCUSED ON OPTIMIZED BOM TO ACHIEVE OPTIMIZED COST SOLUTION

    [Block diagram: "i.MX RT Control module" and "VOICE Application board", joined by 40-pin BB connectors.
    Components: MIMXRT1051DVL06A main MCU (196 MAPBGA, 10x10mm, 0.65); 256Mbit QSPI flash memory; CYW4343W WiFi and BLE wireless SoC; A71CH authentication (security); KW21Z Zigbee/Thread wireless SoC; TFA9894DB audio amplifier; 3x SPH0641LM4H-1 digital MIC; KSZ8081 with Ethernet connector (Ethernet on proto version only); 5V to 3.3V regulator; 1.8V regulator; 3.3V regulator.
    Interfaces: QSPI, SDIO, UART, I2C, SPI, I2S, USB.
    Connectors and controls: u.FL connector, MIC IN connector, USB connector, AUDIO OUT, serial port header, i.MX RT JTAG, KW21Z JTAG, 5V DC, RGB LED, WiFi LED, Power LED, SW1, SW2.]

    Note: Default 180 is 2 Mics

  • PUBLIC 45

    Layout Preview for the i.MX RT Connected Module

    [Layout drawing with BACK VIEW and FRONT VIEW. Dimension labels: 22mm, 30mm, 30mm, 40mm; captions: THEORETICAL DIMENSIONS, MAX DIMENSIONS. Outline labels: RT1052DVL06 (196 MAPBGA, 10x10mm, 0.65); 1DX WiFi and BLE.]

    Components:

    • MOLEX 501745-0401: board-to-board connector, 40-pin, 10x2.3mm, 0.4

    • NXP MIMXRT1052DVL6B: application microcontroller, 196-ball MAPBGA, 10x10mm, 0.65

    • MURATA LBEE5KL1DX: WiFi and BLE module, LGA 6.95x5.15mm

    • MOLEX 0734120110: u.FL micro antenna connector, SMT 1.25mm

    • ABRACON ABM8G-24.000MHZ-B4Y-T: 24MHz crystal, 30ppm, 3.2x2.5mm

    • ABRACON ABS07-32.768KHz-7-T: 32.768KHz crystal, 30ppm, 3.2x2.5mm

    • IS25LP256DJ: 256Mbit QSPI NOR Flash

    • ISSI IS26KL256S-DABLI00: 256Mbit HYPER Flash, 24-ball TBGA 6x8mm, 5x5

  • PUBLIC 46

    Layout Preview for the Voice Control Board

    [Layout drawing with BACK VIEW and FRONT VIEW. Dimension labels: 22mm, 30mm. Other labels: SPEAKER connector, user RGB LED, SW1, SW2, pin-holes for I2S signals, power LED, WiFi LED.]

    Components:

    • MOLEX 051338-0474: board-to-board connector, 40-pin, 10x2.3mm, 0.4

    • NXP TFA9894DB: audio amplifier, I2S interface, 48 WLCSP 3.55x2.5mm

    • KNOWLES SPH0641LM4H-1: MEMS microphone, PDM output, bottom port, LGA 3.5x2.65mm

    • MOLEX 201267-0005: USB Type-C connector, right angle, top/surface mount

  • PUBLIC 47

    Layout Preview for the Overall Voice Control Solution

    75mm x 75mm x 45mm

    Connected System Module with Certified Radio

    Design based on two boards stacked together:

    • Connected Module embedding Power, Digital, Radio

    • Audio Board embedding Sensors and User interfaces

    [Layout labels: SPEAKER connector; RT1052DVL06 (196 MAPBGA, 10x10mm, 0.65); 1DX WiFi and BLE; TBD 256Mbit Octal SPI NOR Flash; A71CH; KW21Z (48 QFN, 7x7mm, 0.5)]

  • PUBLIC 48

    Software Block Diagram for the Voice Control Solution

    [Block diagram. Legend: NXP SW, Customer SW, Partner SW*, Later Phase* (*Pre-integrated by NXP).
    Driver Layer: ADC Driver, WiFi Driver, XMOS/I2C/PWM, Additional Drivers.
    Audio Framework: Media Codec, Streamer, Audio Input Processor.
    Audio Processing: Beamforming, Barge-in, Echo Cancellation, Noise Suppression.
    Wake Word Engine: Enable Processing Callback; Audio Input Streaming when wake word triggers.
    Local Commands.
    AVS Client SDK, LWIP, mBedTLS, MQTT, HTTP/2.
    Customer/Solutions Application: Key Provisioning*, Onboarding*, WiFi Access Point Mode, Companion App communication, Callback.]

  • PUBLIC 49

    Resource Estimates

    [Block diagram. Legend: NXP SW, Customer SW, Partner SW*, Later Phase* (*Pre-integrated by NXP).
    Driver Layer: ADC Driver, WiFi Driver, XMOS/I2C/PWM, Additional Drivers.
    Audio Framework: Media Codec, Streamer, Audio Input Processor.
    Audio Processing: Beamforming, Barge-in, Echo Cancellation, Noise Suppression.
    Wake Word Engine: Enable Processing Callback; Audio Input Streaming when wake word triggers.
    Local Commands.
    AVS Client SDK, LWIP, mBedTLS, MQTT, HTTP/2.
    Customer/Solutions Application: Key Provisioning*, Onboarding*, WiFi Access Point Mode, Companion App communication, Callback.]

    Resource annotations on the diagram:

    • Init: ~300MHz; CPU: ~80MHz; Flash: 233Kb; RAM: 216Kb

    • Estimated: CPU: ~150 MHz; Flash: 192Kb; RAM: 160Kb. Includes the model, the algorithm, and the WWE. Does not include the decimation, which is currently being worked on.

  • PUBLIC 50

    Audio Processing Overview

    [Block diagram. Legend: NXP SW, Partner SW* (*Pre-integrated by NXP).
    i.MX-RT Connected MODULE: MIMXRT1052DVL06A main MCU (196 MAPBGA, 10x10mm, 0.65); Cypress CYW4343W WiFi+BT4.1 wireless SoC (SPI); QSPI NOR flash memory (QSPI); regulator/protection (power); uFL connector.
    Driver Layer: I2S 3 in, I2S 2 in, I2S 1 out; Audio Pre Processing with decimation on each input.
    Audio Framework: Media Codec, Streamer, Audio Input Processor.
    Audio Processing: Beamforming, Barge-in, Echo Cancellation, Noise Suppression.
    Wake Word Engine: Enable Processing Callback; Audio Input Streaming when wake word triggers.]

  • PUBLIC 51

    Driver and Decimation

    [Block diagram. Legend: NXP SW, Partner SW* (*Pre-integrated by NXP).
    i.MX-RT Connected MODULE: MIMXRT1052DVL06A main MCU (196 MAPBGA, 10x10mm, 0.65); Cypress CYW4343W WiFi+BT4.1 wireless SoC (SPI); QSPI NOR flash memory (QSPI); regulator/protection (power); uFL connector.
    Driver Layer: I2S 3 in, I2S 2 in, I2S 1 out; Audio Pre Processing with decimation on each input.]

    • Two I2S ports are used for the individual microphones

    • The last I2S port is used for the audio amplifier

    • Decimation reduces the sample rate to the estimated rate
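
    The decimation step can be illustrated with a very small sketch: low-pass the input, then keep one output sample per group of input samples. The block-averaging filter below is a deliberate simplification chosen for readability; it is not NXP's implementation, and the function name and the 48 kHz to 16 kHz example rates are assumptions, since the slides do not state the actual rates or filter design.

```python
def decimate(samples, factor):
    """Reduce the sample rate by an integer factor.

    Each group of `factor` input samples is averaged (a crude boxcar
    low-pass filter that limits aliasing) and emitted as one output sample.
    Trailing samples that do not fill a whole group are dropped.
    """
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    usable = len(samples) - len(samples) % factor
    return [
        sum(samples[i:i + factor]) / factor
        for i in range(0, usable, factor)
    ]


if __name__ == "__main__":
    # Hypothetical example: 48 kHz input reduced to 16 kHz (factor 3).
    one_millisecond_at_48k = [float(n % 6) for n in range(48)]
    reduced = decimate(one_millisecond_at_48k, 3)
    print(len(one_millisecond_at_48k), "samples in,", len(reduced), "samples out")
```

    A production design would normally use a properly designed FIR or CIC filter stage instead of a plain average, but the structure (filter, then discard samples) is the same.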

  • PUBLIC 52

    Audio Processing Software

    [Block diagram. Legend: NXP SW, Partner SW* (*Pre-integrated by NXP).
    Audio Framework: Media Codec, Streamer, Audio Input Processor.
    Audio Processing: Beamforming, Barge-in, Echo Cancellation, Noise Suppression.
    Wake Word Engine: Enable Processing Callback; Audio Input Streaming when wake word triggers.
    Local Commands.]

    The audio processing software is a library that is optimized for the product and its microphone placement/orientation. It cleans up the microphone audio in preparation for handing it off to the Wake Word engine or Audio streamer. It also supports Far Field implementations greater than eight feet.

    Barge-in: the process of taking the audio being output and removing it from the microphone input. This is normally needed when playing music while also trying to issue voice commands.

    Echo Cancellation: the process of removing the reflected sounds from floors, walls and ceilings.

    Beamforming: the process of taking multiple microphone inputs and focusing on the sounds coming from the direction of the user.

    Noise Suppression: the process of reducing the environmental noise in the microphone input.
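
    Barge-in, as described on this slide, means removing the device's own playback from the microphone signal. A common textbook way to do this is an adaptive filter that learns how the playback appears at the microphone and subtracts its estimate. The sketch below uses a normalized LMS (NLMS) filter; it is a generic illustration, not the algorithm used in the NXP library, and the function name, tap count, and step size are assumptions.

```python
import random


def nlms_cancel(mic, reference, taps=8, mu=0.5, eps=1e-8):
    """Subtract an adaptive estimate of the playback (reference) from the mic signal.

    mic:       microphone samples (user speech plus playback picked up by the mic)
    reference: the samples being played out through the loudspeaker
    Returns the residual signal after the estimated playback is removed.
    """
    weights = [0.0] * taps   # adaptive model of the speaker-to-mic path
    history = [0.0] * taps   # most recent reference samples, newest first
    residual = []
    for d, x in zip(mic, reference):
        history = [x] + history[:-1]
        estimate = sum(w * h for w, h in zip(weights, history))
        error = d - estimate
        norm = sum(h * h for h in history) + eps
        # NLMS update: step size normalized by the reference signal energy.
        weights = [w + mu * error * h / norm for w, h in zip(weights, history)]
        residual.append(error)
    return residual


if __name__ == "__main__":
    rng = random.Random(0)
    playback = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
    # Hypothetical acoustic path: playback reaches the mic 2 samples late at 60% level.
    mic = [0.0, 0.0] + [0.6 * x for x in playback[:-2]]
    cleaned = nlms_cancel(mic, playback)
    before = sum(v * v for v in mic[-200:])
    after = sum(v * v for v in cleaned[-200:])
    print(f"playback energy at mic: {before:.3f}, after cancellation: {after:.3e}")
```

    In this toy setup the microphone hears only playback, so the residual should shrink toward zero once the filter has adapted. With a talker present, the residual would instead contain the talker's speech, which is what gets passed on to the wake word engine.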

  • PUBLIC 53

    Wake Word Engine

    [Block diagram. Legend: NXP SW, Partner SW* (*Pre-integrated by NXP).
    Audio Framework: Media Codec, Streamer, Audio Input Processor.
    Audio Processing: Beamforming, Barge-in, Echo Cancellation, Noise Suppression.
    Wake Word Engine: Enable Processing Callback; Audio Input Streaming when wake word triggers.]

    After the audio has been processed, it is fed into the wake word engine. This engine determines if the Alexa wake word was uttered.

    What happens when the wake word is said:

    • The engine will inform the Audio Framework to start streaming.

    • The streaming data will be sent to AVS.

    • The engine will inform the Audio Framework to stop streaming.
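
    The three steps above amount to a small two-state machine: listen until the wake word is detected, then stream until told to stop. The sketch below models that flow with placeholder names. `AudioFramework`, `run_pipeline`, the detector callbacks, and the end-of-utterance condition are all hypothetical stand-ins; the slides do not say what triggers the stop, and this is not the AVS SDK or NXP API.

```python
from enum import Enum, auto


class State(Enum):
    LISTENING = auto()   # waiting for the wake word
    STREAMING = auto()   # forwarding audio to the cloud service


class AudioFramework:
    """Hypothetical stand-in for the audio framework's streaming control."""

    def __init__(self, send_to_cloud):
        self.state = State.LISTENING
        self._send_to_cloud = send_to_cloud

    def start_streaming(self):
        self.state = State.STREAMING

    def stop_streaming(self):
        self.state = State.LISTENING

    def on_audio_frame(self, frame):
        # Only frames captured while streaming are sent to the cloud.
        if self.state is State.STREAMING:
            self._send_to_cloud(frame)


def run_pipeline(frames, detect_wake_word, end_of_utterance, framework):
    """Feed processed audio frames through the wake word flow."""
    for frame in frames:
        if framework.state is State.LISTENING:
            if detect_wake_word(frame):
                framework.start_streaming()
        elif end_of_utterance(frame):   # assumed stop condition
            framework.stop_streaming()
        else:
            framework.on_audio_frame(frame)


if __name__ == "__main__":
    sent = []
    framework = AudioFramework(sent.append)
    frames = ["noise", "alexa", "turn", "on", "<end>", "noise"]
    run_pipeline(frames, lambda f: f == "alexa", lambda f: f == "<end>", framework)
    print(sent)
```

    Strings stand in for audio frames here so the control flow is easy to follow; in a real system the frames would be PCM buffers and the callbacks would wrap the wake word engine and an endpoint detector.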

  • PUBLIC 54

    Audio Framework

    [Block diagram: Common Audio Framework API; Media Player; Media Browser; Media Indexing; Playlist / Play Queue; Media Device Support; Device Manager; Input Manager]

    Audio Framework provided by NXP

    • Provides Audio Media Player

    • Supports Industry Codecs

    – MP3

    – WAV

    – AAC

    • Support for multiple audio interfaces.

    • A lightweight NXP IP version of GStreamer.

  • PUBLIC 55

    Opportunity to Start with Kit Plus Rich Set of IDEx IP

    OOB HW/SW

    [Diagram labels: BOMs; RF Certification Artifacts; Layouts; Schematics; Software Source; Training Videos; Documentation; Design Flows]

    NXP IoT Solutions Provide

    • Coherency

    • Pre-Integration

    • Testing

    • Production Readiness

  • PUBLIC 56

    Project: NXP Hotword

    Voice User Interface Solution: Long term goal

    [Diagram labels: microphones; I.MXRT; EC / NR (NXP); Wake up word (NXP); TFA Amplifier; NXP HW Machine Learning Acceleration; Audio framework. Legend: NXP HW, NXP SW.]

    • IMXRT: MICR

    • TFA Amplifier: SIP

    • EC / NR: SIP

    • Wake-up word: SIP

    • Audio Framework: SIP / MICR

    • ML SDK: MICR

  • PUBLIC 57

    NXP – Production Grade SoC

  • PUBLIC 58

    i.MX Series: 3 Processor Families with Targeted Features

    [Family overview diagram.
    Families shown: i.MX 8 family, i.MX 8M family, i.MX 8X family, i.MX 7 family, i.MX 7ULP family, and the i.MX 6 parts listed below.
    Taglines shown: Advanced Graphics & Performance; Advanced Audio, Voice & Video; Safety Certifiable & Efficient Performance; Flexible Efficient Connectivity; Ultra Low Power with Graphics.
    Architecture labels: ARM® v8-A (32-bit/64-bit); ARM® v7-A (32-bit).
    Core labels: A72, A53, A35, A9, A7, M4.
    Annotations: Pin-to-pin Compatible; Software Compatible.]

    i.MX 6 parts: i.MX 6QuadPlus, i.MX 6Quad, i.MX 6DualPlus, i.MX 6Dual, i.MX 6DualLite, i.MX 6Solo, i.MX 6SoloLite, i.MX 6SoloX, i.MX 6UltraLite, i.MX 6SLL, i.MX 6ULL

  • PUBLIC 59

    i.MX 8M ‘Mini’ Family

    Cost-effective Applications Processors in 3rd generation 14LPC FinFET process

    For Consumer and Industrial Applications

    Target applications: Streaming Media, Voice Assistants, Industrial IoT, Edge Compute, Video Conferencing, AI, Machine Learning

    Feature | i.MX 8M Mini | i.MX 8M Mini Lite
    ARM CPU | (Quad/Dual/Solo) 1x, 2x, or 4x Cortex-A53, plus Cortex-M4; up to 18,000 DMIPS | (Quad/Dual/Solo) 1x, 2x, or 4x Cortex-A53, plus Cortex-M4; up to 18,000 DMIPS
    Video | Decode: 1080p60 H.265, VP9, H.264, VP8. Encode: 1080p60 H.264, VP8 | No video acceleration
    2D + 3D GPU | Separate 2D/3D; 1 Vec4 shader; up to 6.4 GFLOPS; OpenGL ES 2.0; OpenVG 1.1; up to 40 MTri/s; up to 400 MPix/s | Same as i.MX 8M Mini
    Display / Camera | MIPI-DSI 4-lane; MIPI-CSI 4-lane | Same as i.MX 8M Mini
    Audio I/O | 20 channels; 32 bits @ 384KHz; DSD512, TDM; SPDIF Tx & Rx; 8-ch PDM MIC | Same as i.MX 8M Mini
    Connectivity I/O | 2x USB 2.0; 1x PCIe; 3x SDIO; 1x GbE | Same as i.MX 8M Mini

    Annotations: Software Compatibility; Pin Compatibility; 14x14 0.5mm or 17x17 0.75mm (*TBD)

  • PUBLIC 60

    Quad/Dual/Solo ARM Cortex-A53 @ 1.6-2.0 GHz (up to 18,400 DMIPS)

    − ARM v8 Fully 64-bit capable

    ARM Cortex-M4 @ 400+ MHz for Low Power, Security

    Package: FCBGA 14x14mm, 0.5mm pitch de-pop array; FCBGA 17x17mm, 0.75mm pitch (*TBD)

    Operating System targets: Linux OS, Android OS, FreeRTOS

    Qualification for Consumer and Industrial applications

    Feature Highlights:

    − Security: DRM support for RSA, AES, 3DES, DES

    − GC NanoUltra 3D Graphics GPU, 1 shader core, OpenGL ES 2.0, 6.4 GFLOPS, 400M Pix/s, 40M Tri/s

    − GC328 2D Graphics GPU

    − 1080p60 H.265/HEVC, VP9, H.264, VP8 decoder

    − 1080p60 H.264, VP8 encoder

    − x16, x32 LPDDR4/DDR4/DDR3(L) (up to 3200 Mtps) LPDDR4: 1600bps @ 800MHz

    − High quality image resizing and graphics overlay

    − Audio: S/PDIF Rx & Tx, 20x I2S (up to 20ch 32bit @ 384Khz support)

    − Display Interfaces: 1x MIPI DSI (4-lane) with PHY

    − Camera Interfaces: 1x MIPI CSI2 input (4-lane each) with PHY

    − 2x USB 2.0 OTG with PHY

    − 1x Gb Ethernet (MAC): AVB & IEEE 1588 for sync, and EEE for low power

    − 1x PCIe 2.0 (1-lane) with L1 substates (low power, fast wakeup)

    − 4x UART, 4x I2C, 3x SPI

    − 3x SDIO3.0 / eMMC5.0 / SD4

    − Raw NAND controller (BCH62)

    − Quad-SPI for fast boot from SPI NOR, with Execute in Place (XIP); single-bit, dual-bit, quad-bit and octal-bit access are supported.

    [Block diagram: i.MX 8M Mini.
    Section headings: Main CPU Platform; Low Power, Security CPU; Multimedia; External Memory; Connectivity & I/O; System Control; Security.
    Main CPU Platform: Quad/Dual Cortex-A53, each core with NEON, FPU, 32KB I-cache and 32KB D-cache; 512KB L2 Cache.
    Low Power, Security CPU: Cortex-M4 with 16KB I-cache, 16KB D-cache, 256KB TCM (SRAM).
    Other blocks: 3D Graphics: GC NanoUltra; 2D Graphics: GC328; 1080p60 VP8, VP9, H.264, H.265 decoder; 1080p60 VP8, H.264 encoder; MIPI-CSI 4-lane with PHY; MIPI-DSI 4-lane with PHY; S/PDIF Rx & Tx, 20x I2S/SAI; 8ch PDM Input; x16-x32 LPDDR4/DDR4/DDR3(L) up to 3200 Mbps; 3x SDIO3.0/MMC5.1/SD4; NAND Controller (BCH62); Dual-ch QuadSPI (XIP); 4x UART 5Mbps; 4x I2C, 3x SPI; 1x PCIe 2.0, 1-lane; 1Gb Ethernet (IEEE 1588, EEE & AVB); 2x USB2.0 OTG + PHY; Secure JTAG; XTAL; PLLs; Smart DMA x3; Timer x6; PWM x4; Watchdog x3; Temperature Sensor; Random Number; TrustZone; Secure Clock; eFuse Key Storage; DRM Ciphers; 32KB Secure RAM.]

  • PUBLIC 61

    i.MX 8M Mini Key Features

    • System Design Optimization

    - 14x14 0.5mm package designed for maximum feature enablement with 6-8 layer board design and no microvias

    - Pin compatibility with the i.MX 8M Nano provides drop-in scalable product performance

    - 17x17 0.75mm package for the broad market (*TBD)

    - 8ch DMIC support for direct connection of PDM microphones (no CODEC) enables low system cost

    - Enabling software such as Linux/Android BSP and solutions software (e.g. Voice, Machine Learning, Audio Framework)

    • Triple-Play Audio/Voice/Video

    − Up to 1080p60 video decoding (H.265, H.264, VP8/9)

    − Up to 1080p60 video encoding (H.264, VP8); the parallel VPU engine enables video transcode applications (video calling)

    − 2D and 3D GPU to enable 1080p media UI

    − Advanced audio capabilities including 8ch DMIC support, 32-bit @ 384 kHz audio interfaces, multiple audio channels

    • Broad System Connectivity

    − MIPI-DSI (4-lanes) for display

    − MIPI-CSI (4-lanes) for camera input

    − Multiple SDIO interfaces to enable flexibility in supporting boot, expansion and connectivity (Wi-Fi)

    − PCIe with L1 low power substates enables a range of high-performing Wi-Fi/BT solutions and other connectivity

    − Gigabit Ethernet and USB 2.0

    • Scalable Performance at Low Power

    - Advanced process technology node delivers much lower leakage than standard technology

    - Single-, dual- or quad-core Cortex-A53 cores up to 2.0 GHz; scalable performance in a pin-compatible package

    - Heterogeneous multi-core processing with Cortex-M4 running at 400+ MHz; offload tasks, optimize power

    - Power efficient 3D GPU and VPU enable 1080p video transcode and display

    - DDR3L, DDR4, and LPDDR4 Support

    Preliminary, subject to change
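    To put the audio interface figures in perspective, the raw PCM payload rate is just channels x sample width x sample rate. The sketch below uses the 20-channel, 32-bit, 384 kHz maximum quoted earlier in the deck. The second case (8 microphones at 16-bit, 16 kHz) is an assumed far-field voice capture format chosen purely for comparison; it does not come from the slides.

```python
def pcm_bit_rate(channels: int, bits_per_sample: int, sample_rate_hz: int) -> int:
    """Raw PCM payload rate in bit/s, ignoring framing and protocol overhead."""
    return channels * bits_per_sample * sample_rate_hz


# Maximum audio configuration quoted for the i.MX 8M Mini.
max_config = pcm_bit_rate(channels=20, bits_per_sample=32, sample_rate_hz=384_000)

# Assumed voice front-end format (hypothetical, for comparison only).
voice_config = pcm_bit_rate(channels=8, bits_per_sample=16, sample_rate_hz=16_000)

if __name__ == "__main__":
    print(f"20ch x 32-bit x 384 kHz: {max_config / 1e6:.2f} Mbit/s")
    print(f"8ch x 16-bit x 16 kHz:   {voice_config / 1e6:.3f} Mbit/s")
```

    The maximum configuration is about 245.76 Mbit/s of payload, roughly two orders of magnitude above the assumed voice-capture case, which is why the same interfaces can serve both high-resolution audio and microphone arrays.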

  • PUBLIC 62

    Scalability of Embedded Processing, the New Normal

    Ultra-low Power

    Dynamic & Static

    ARM v8/v8m + GPU/DSP

    ARM v7/v7m + 2D/3D

    ARM v7m + Audio

    i.MX 6UL/ULL

    i.MX RT

    i.MX 7ULP

  • PUBLIC 63

    i.MX RT1050: Key Differentiators

    High Performance Real-Time Processing

    • Cortex-M7 up to 600MHz (50% faster than existing M7 products)

    • 20ns interrupt latency

    • Up to 512KB Tightly Coupled Memory

    High level of Integration

    • High Security enabled by AES-128, HAB and On-the-fly QSPI Flash Decryption

    • 2D graphics acceleration engine

    • Parallel camera sensor interface

    • LCD display controller up to WXGA (1366x768)

    • Audio interface with three I2S for multichannel high performance audio

    Low BOM Cost

    • Competitive Pricing – starting @ $2.98 10k RSL

    • Fully integrated PMIC with DC-DC

    • Low cost package, 10x10 BGA, enabling 4 Layer PCB design

    • SDRAM interface

    Easy to Use

    • MCU customers can leverage their current toolchain (MCUXpresso, IAR, Keil)

    • Rapid and easy prototyping and development with NXP FreeRTOS, SDK, ARM mbed and the global ARM ecosystem

    • Single voltage input simplifies power circuit design

    • Scalability to Kinetis & i.MX products

  • PUBLIC 64

    Specifications

    • Package: MAPBGA196 | 10x10mm^2, 0.65mm pitch (130 GPIOs)

    • Temp / Qual: -40 to 105°C (Tj) Industrial / 0 to 95°C (Tj) Consumer

    High Performance Real Time system

    • Cortex-M7 up to 600MHz, 50% faster than any other existing M7 product

    • 20ns interrupt latency, a TRUE Real time processor

    • 512KB SRAM, configurable to 512KB TCM
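    The two performance claims above can be related with a little arithmetic. The sketch below converts the quoted 20 ns interrupt latency into core clock cycles at 600 MHz, and derives the clock of the comparison part implied by "50% faster". The 400 MHz baseline is an inference from the slide's own numbers, not a figure stated in the deck, and the function name is illustrative.

```python
def latency_cycles(latency_ns: float, clock_mhz: float) -> float:
    """Convert a latency in nanoseconds to clock cycles at the given frequency."""
    return latency_ns * clock_mhz / 1000.0


CLOCK_MHZ = 600     # "Cortex-M7 up to 600MHz"
LATENCY_NS = 20     # "20ns interrupt latency"
SPEEDUP = 1.5       # "50% faster" than other M7 products

cycles = latency_cycles(LATENCY_NS, CLOCK_MHZ)

# Inferred, not quoted: the baseline clock that makes 600 MHz "50% faster".
implied_baseline_mhz = CLOCK_MHZ / SPEEDUP

if __name__ == "__main__":
    print(f"{LATENCY_NS} ns at {CLOCK_MHZ} MHz = {cycles:.0f} cycles")
    print(f"Implied baseline clock: {implied_baseline_mhz:.0f} MHz")
```

    At 600 MHz, 20 ns corresponds to 12 cycles, which is consistent with the 12-cycle interrupt latency commonly quoted for Cortex-M7 when code runs from zero-wait-state memory such as TCM.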

    Rich Peripheral

    • Motor Control: Flex PWM X 4, Quad Timer X 4, ENC X 4

    • 2x USB, 2x SDIO, 2x CAN, 1x ENET with 1588, 8x UART, 4x SPI, 4x I2C

    • 8/16-bit CSI interface and 8/16/24-bit LCD interface

    • Quad-SPI interface, with Bus Encryption Engine

    • Audio interface: 3x SAI/ SPDIF RX & TX/ 1x ESAI

    Security

    • TRNG & PRNG (NIST SP 800-90 Certified)

    • AES-128 cryptography

    • Bus Encryption Engine: Protect QSPI Flash Content

    Ease of Use

    • FreeRTOS with SDK

    • MCUXpresso

    • Comprehensive ecosystem

    Low BOM Cost

    • Competitive Price

    • Fully integrated PMIC with DC-DC

    • Low cost package, 10x10 BGA with 0.65mm Pitch

    • SDRAM interface

    i.MX RT1050 Series Block Diagram

    Key Features and Benefits

  • PUBLIC 65

    i.MX RT1050: Enablement Overview

    Runtime Software: Comprehensive frameworks and solutions for low-power, connected, and secure embedded systems

    Software Development Tools: Industry leading IDE support and intuitive software configuration tools to accelerate application development

    Hardware Development Tools: Low cost hardware platforms for evaluation and application development. Partner solutions for hardware debugging

    Application Specific: Software frameworks and development tools for targeted applications and certified connectivity solutions

    • Graphics

    • Touch HMI

    • Camera interface

    • Motor Control

    • Voice activation

    • Audio

    • Sensor Fusion

    • Cloud Connectivity

    Support: Get started quickly and get the support you need, when you need it

    Broad Market:

    • NXP Community

    • Solution Designs

    • Application Notes

    • Schematics

    High Touch:

    • Professional Support

    • Professional Services

    Sub-headings: NXP Solutions; RTOS, Middleware Partners; IDE / Toolchains; Evaluation Kits; Partner Solutions; Connectivity Solutions; 802.15.4

  • PUBLIC 66

    NXP Smart Amplifier for IoT Device

  • PUBLIC 67

    What is NXP Smart SPK Amplifier?

    • 3-4 dB higher perceived loudness (with vs. without the amplifier)

    • Enhanced audio fidelity with dynamic EQ and MBDRC

    • Protection against excessive speaker excursion and temperature

    • DSP embedded – Ease of use/Integration

    • Proven solution in the market

    Get the Best possible sound from small form factor

    Speaker Boost

    Tuning Tool

    Loudspeaker Modeling

    X-limit / T-max

    Voltage + Current Sensing

    Class-DG

    Boosted DC-DC

    Integrated

    Embedded Processing

    Analog Hardware

    Audio In

    Digital Interface

    Managed Audio Out

    Bring your product to the next level

  • PUBLIC 68

    Mobile Audio Solution

    • One-Stop-Shop for Audio Solutions in Mobile Devices

    • Creative and Innovative teams with many Ideas

    • Knowledgeable Local Support teams solving customer’s problems

    • Successful deployments with key players providing good references

    • Extending scope to Adjacent Markets

    Feature Phones

    Smartphones / Tablets

    Wearables / Hearables

    Smart Home / Internet of Things

  • PUBLIC 69

    Smart Amplifier Product Overview

    Features | TFA9892 | TFA9894 | TFA9874 | TFA9896

    Embedded DSP | Yes | Yes | No | Yes

    Package size (mm²) | 10 / 11 | 8 | 6 | 6

    Customer Sample | Available | 1Q18 | 4Q17 | Available

    Audio input interfaces | I2S/PDM | I2S/PDM | I2S/TDM | I2S/TDM

    Boost Voltage | 12V | 9.5V | 9.5V | 6.1V

    RMS Output Power (8 Ω) | 6.3W | 4W | 4W | 2.1W

    RMS Output Power (4 Ω) | >7W | 5.5W | 5.5W | 3.2W

    Speaker Channels | Mono | Mono | Mono | Mono

    Typical Speaker Load (Ω) | 4-8-32 | 4-8-32 | 4-8-32 | 4-8-32

    SNR (dB) | 115 | 100 | 100 | 102

    MB-DRC | Yes (3-band) | None (On Host) | None

    Noise (µV) / Receiver mode | 12 | 12 | 12 | 30/18
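    The relationship between boost voltage and output power in the table can be checked against a simple upper bound. The sketch below assumes an ideal, lossless output stage driving an undistorted sine wave whose peak equals the boost voltage, so P = (V/√2)² / R. That model is an assumption made for illustration; real amplifiers have current limits, switch losses and headroom requirements, so the quoted figures are expected to sit below this bound.

```python
import math

# (part, boost voltage in V, quoted RMS output power into 8 ohm in W), from the table.
QUOTED_8_OHM = [
    ("TFA9892", 12.0, 6.3),
    ("TFA9894", 9.5, 4.0),
    ("TFA9874", 9.5, 4.0),
    ("TFA9896", 6.1, 2.1),
]


def ideal_sine_power(v_peak: float, load_ohm: float) -> float:
    """Power of an undistorted sine with the given peak voltage into a resistive load."""
    v_rms = v_peak / math.sqrt(2)
    return v_rms ** 2 / load_ohm


if __name__ == "__main__":
    for part, boost_v, quoted_w in QUOTED_8_OHM:
        bound_w = ideal_sine_power(boost_v, 8.0)
        print(f"{part}: ideal bound {bound_w:.2f} W, quoted {quoted_w} W "
              f"({quoted_w / bound_w:.0%} of bound)")
```

    Under this model every quoted 8 Ω figure falls below its ideal bound, and parts with a higher boost voltage deliver more power, which is the trend the table shows.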

  • PUBLIC 70

    Schedule NXP schedule

    Customer Sample May 2017

    RAMP UP Dec 2017

    Schedule NXP schedule

    Customer Sample June 2017

    RAMP UP Dec 2017

    TFA9874 layout figure: dimensions 4.87, 4.4, 5 and 2.65, 2.5, 5; labels CVDDP, CBST, CVBAT, LBST, CSENSE, RSENSE, CVDDD

    TFA9894 layout figure: dimensions 5.27, 4.4, 5 and 3.13, 2.5, 5; labels CVDDP, CBST, CVBAT, LBST, CSENSE, RSENSE, CVDDD, DSP

    Key Improvement versus TFA9872

    • WCSP package

  • PUBLIC 71

    New Layout of TFA9894 – Supporting IoT Device

    TFA9894 – Easy Routing

    TFA9894 Original

    The PODs (Package Outline Dimensions) will be the same: the physical location of the bumps does not change; only their function changes

  • PUBLIC 72

    Innovations

  • PUBLIC 73

    Neural Network Applied to: Audio to Haptic

    Sound Effect detector

    AUDIO STREAM

    • Voice

    • Sound Effects

    • Music

    • Ambient

    SPEECH

    NON-SPEECH

    Audio routed to LoudSpeaker

    Audio routed to Haptic

    • Voice

    • Music

    • Sound Effects

    https://youtu.be/4DZxE77QPcY?t=165
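    The routing idea on this slide can be sketched in a few lines. The example below is a toy illustration, not NXP's implementation: the classifier is a placeholder energy threshold standing in for the neural network, and the routing policy (every frame goes to the loudspeaker, frames classified as non-speech are additionally sent to the haptic path) is an assumption read from the diagram. All names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List

SPEECH = "speech"
NON_SPEECH = "non-speech"

Frame = List[float]


def energy_stub(frame: Frame, threshold: float = 0.25) -> str:
    """Placeholder detector: NOT a neural network, just a mean-energy threshold."""
    energy = sum(x * x for x in frame) / len(frame)
    return NON_SPEECH if energy > threshold else SPEECH


@dataclass
class AudioRouter:
    """Routes frames to loudspeaker and haptic sinks based on a classifier label."""
    classify: Callable[[Frame], str]
    loudspeaker: List[Frame] = field(default_factory=list)
    haptic: List[Frame] = field(default_factory=list)

    def push(self, frame: Frame) -> None:
        # Assumed policy: all audio is still played on the loudspeaker.
        self.loudspeaker.append(frame)
        # Assumed policy: only non-speech content drives the haptic actuator.
        if self.classify(frame) == NON_SPEECH:
            self.haptic.append(frame)


if __name__ == "__main__":
    router = AudioRouter(classify=energy_stub)
    router.push([0.1, -0.1, 0.1, -0.1])   # low energy, labeled speech by the stub
    router.push([0.9, -0.9, 0.9, -0.9])   # high energy, labeled non-speech by the stub
    print(len(router.loudspeaker), "frames to loudspeaker")
    print(len(router.haptic), "frames to haptic")
```

    In a real system the `classify` callable would wrap a trained model running on the DSP or application core; keeping it as an injected function makes the routing logic independent of the detector.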


  • PUBLIC 74

    AI Based Speech Detection – Ex: Speaker Authentication

  • PUBLIC 75

    Example: Speech vs. Music

  • PUBLIC 76

    NXP MCU/MPU ML/DL Support

    At the Customer

    At the Edge

    Training in the cloud

    Customer board

    CPU, GPU, DSP, MLa

    Firmware

    Storage

    Vision SDK

    Sensor SDK

    Machine Learning SDK

    eIQ-Engine

    Ready for inference

    Voice

    Vision

  • PUBLIC 77

    Q & A

  • www.nxp.com