CREATING VOICE EXPERIENCES WITH AMAZON ALEXA


VOICE WILL BE EVERYWHERE

The age of touch could soon come to an end. From smartphones and smartwatches, to home devices, to in-car infotainment systems, touch is no longer the primary user interface.

Source: Design News

Although voice technology is still in its relative infancy, the future is not as far off as you think.

ASR: Automated Speech Recognition

What is the user actually saying?

NLP: Natural Language Processing

What is the intent of the user?

Machine Learning and Intelligence

Provide extraordinary and advanced customer experiences

RAPID ADVANCEMENT

AI / CLOUD: Scalable, Reliable & Secure

Ability to introduce features at scale that continuously add value over time

ACCESS: Democratizing Voice Technology

Better ASR and NLP have led to better access and adoption

[Chart: machine ASR accuracy by decade (1970, 1980, 1990, 2000, 2010, 2020) compared with human accuracy. Data labels: 50%, 55%, 60%, 62%, 70%, 95%. Source: MindMeld]

ASR accuracy has dramatically increased in the last 4-5 years. This inflection point has created sustained momentum in consumer adoption of voice technology.

“As speech recognition accuracy goes from 95% to 99%, all of us in the room will go from barely using it today to using it all the time. Most people underestimate the difference between 95% and 99% accuracy. 99% is a game changer.”

Andrew Ng, Chief Scientist at Baidu
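One way to see why the jump from 95% to 99% matters is to look at whole utterances rather than single words. The sketch below is only an illustration: it assumes the quoted figures are per-word accuracies and that word errors are independent, neither of which the slides state, and the function name is made up for this example.

```python
def utterance_success_rate(word_accuracy: float, num_words: int) -> float:
    """Probability that every word in an utterance is recognized correctly.

    Assumption (not from the slides): word errors are independent, so the
    per-word accuracy is simply multiplied once per word.
    """
    return word_accuracy ** num_words


if __name__ == "__main__":
    for accuracy in (0.95, 0.99):
        rate = utterance_success_rate(accuracy, num_words=10)
        print(f"{accuracy:.0%} word accuracy: {rate:.1%} of 10-word utterances are error-free")
```

Under these assumptions, roughly 60% of ten-word requests come through clean at 95% accuracy, against roughly 90% at 99%. Put differently, the error rate drops from 5% to 1%, a fivefold reduction.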

[Chart: performance over time. Curves: computer performance & machine learning; machine learning & intelligence; human performance. Marker: "WE ARE HERE"]

“Soon it will seem almost quaint there was a time we looked at voice assistants as virtual friends who lived in our pockets and answered our questions.”

Source: The Drum News, "How Voice Tech Will Change Our Lives Forever"

INTERFACE EVOLUTION EVENTS

It took generations and several major technological advancements for touchscreens, GUI and VUI to achieve critical adoption.

Following non-commercial GUI milestones, the advances of the early 80s and 90s (Windows 95, Apple's OS, the Internet) changed the trajectory of the GUI.

It wasn't until the Palm Pilot of the late '90s and the smartphones of the mid-2000s that Touch emerged as a key interaction modality.

Human-To-Human VUI was brought in with the dawn of the telephone, but Human-To-Machine VUIs have just recently become viable.

TOUCH vs. VOICE

Touch:
• Can handle more info
• More familiar
• Harder to get lost
• Provides flexibility

Voice:
• Faster
• Less cumbersome
• Universal
• Removes noise

VOICE AS A KEY MODALITY

Where it's going is limited only by one's imagination, but voice WILL play a key role in how we control our homes and outdoor spaces and how we access information… Why?

ACCESS: Billions of devices, from phones, watches and cars to Alexa-powered devices such as the Amazon Echo, Echo Dot, Amazon Tap and Amazon Fire TV, are making access to voice ubiquitous.

ACCURACY: Advances in the ability to understand user intention are changing the game in adoption as interactions become faster and more reliable.

EFFICIENCY: Ease of use will make it a powerful choice for quick access to anything inside and outside our environment.

SECURITY: Voice is a highly unique signature. As advances in biometrics are integrated with voice, our individuality will become a key to further personalization and security.

DEVELOPING FOR VOICE

THE ALEXA SERVICE

Supported by two powerful SDKs

ALEXA SKILLS KIT (ASK)

Create Great Content: ASK is how you connect to your consumer

ALEXA VOICE SERVICE (AVS)

Unparalleled Distribution: AVS allows your content to be everywhere

• Lives In The Cloud
• Automated Speech Recognition (ASR)
• Natural Language Understanding (NLU)
• Always Learning

ALEXA SKILLS KIT

EUROPEAN ALEXA SKILLS PARTNERS

ALEXA SKILLS KIT ARCHITECTURE

A closer look at how the Alexa Skills Kit processes a request and returns an appropriate response:

• User Makes a Request
• Audio Stream is sent up to Alexa
• Alexa Identifies Skill & Recognizes Intent Through ASR & NLU
• Alexa sends Customer Intent to Your Service
• Your Service processes Request
• You Pass Back a Textual or Audio Response
• You Pass Back a Graphical Response
• Alexa Converts Text-to-Speech (TTS) & Renders Graphical Component
• Respond to Intent through Text & Visual
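The "Your Service" part of this flow can be sketched as a small request handler. This is a minimal illustration rather than a complete skill. It assumes the JSON request and response envelope used by custom skills (an `IntentRequest` carrying an intent name, and a response with `outputSpeech` and an optional `card`). The intent name `GetWeatherIntent` and the canned weather text are hypothetical placeholders.

```python
import json


def build_response(speech_text: str, card_title: str) -> dict:
    """Build a response with a textual (spoken) part and a simple graphical card."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": speech_text},
            "card": {"type": "Simple", "title": card_title, "content": speech_text},
            "shouldEndSession": True,
        },
    }


def handler(event: dict, context=None) -> dict:
    """Process the customer intent that Alexa sends to the service."""
    request = event.get("request", {})
    if request.get("type") == "IntentRequest":
        intent_name = request.get("intent", {}).get("name")
        # "GetWeatherIntent" is a hypothetical intent name used for illustration.
        if intent_name == "GetWeatherIntent":
            # Placeholder text; a real service would look up actual weather data.
            return build_response("It is sunny today.", "Weather")
    return build_response("Sorry, I did not understand that.", "Help")


if __name__ == "__main__":
    sample_event = {
        "version": "1.0",
        "request": {"type": "IntentRequest", "intent": {"name": "GetWeatherIntent"}},
    }
    print(json.dumps(handler(sample_event), indent=2))
```

The service only ever sees structured intent data and returns text. Speech recognition before the call and text-to-speech after it stay on Alexa's side, as the steps above describe.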

[Diagram: the Speech Platform (ASR, NLU, TTS) alongside Skills (example: Weather). Labels: user's utterance, recognize, recognition result, intent, text/SSML, "speak" directive, Alexa's voice. Example utterance: “Alexa, what's the weather?”]
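The text/SSML label refers to the text a skill hands back for speech synthesis. As a hedged sketch, the helper below wraps plain text in a `<speak>` element and escapes XML special characters. The helper names and the optional trailing pause are illustrative choices, not taken from the slides.

```python
from xml.sax.saxutils import escape


def to_ssml(text: str, pause_ms: int = 0) -> str:
    """Wrap plain text in a <speak> element, escaping XML special characters.

    pause_ms optionally appends a <break> element after the text
    (an illustrative extra, not something the slides describe).
    """
    body = escape(text)
    if pause_ms > 0:
        body += f'<break time="{pause_ms}ms"/>'
    return f"<speak>{body}</speak>"


def ssml_output_speech(text: str, pause_ms: int = 0) -> dict:
    """Build an outputSpeech object that carries SSML instead of plain text."""
    return {"type": "SSML", "ssml": to_ssml(text, pause_ms)}


if __name__ == "__main__":
    print(ssml_output_speech("Rain & wind this afternoon", pause_ms=300))
```

Escaping matters because characters such as `&` or `<` in generated text would otherwise produce malformed markup.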

WAKE WORD DETECTION

SPEECH CAPTURE

TEXT TO SPEECH OUTPUT

ALEXA VOICE SERVICE

ALEXA FOR MANY KINDS OF DEVICES

ALEXA VOICE SERVICE ARCHITECTURE

A closer look at how the Alexa Voice Service streams audio to, and receives audio from, the AVS API

[Diagram: YOUR DEVICE runs YOUR CODE, which handles Audio Capture and Audio Playback. A REST REQUEST carries the captured audio (“What time is it?”) to the ALEXA VOICE SERVICE REST API, and the REST RESPONSE returns audio (”It's 8 PM”) for playback.]
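The device-side half of this exchange can be sketched as code that packages captured audio into a request body. The slides only say that audio travels in a REST request and comes back in the REST response. The multipart layout, the part names `metadata` and `audio`, and the metadata fields below are assumptions made for illustration. The exact wire format should be taken from the AVS API documentation.

```python
import json
import uuid
from typing import Dict, Tuple


def build_recognize_request(audio_bytes: bytes, metadata: dict) -> Tuple[Dict[str, str], bytes]:
    """Package captured audio plus JSON metadata as a multipart/form-data body.

    Assumption: the part names "metadata" and "audio" are illustrative only.
    """
    boundary = uuid.uuid4().hex
    metadata_part = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="metadata"\r\n'
        "Content-Type: application/json; charset=UTF-8\r\n\r\n"
        f"{json.dumps(metadata)}\r\n"
    ).encode("utf-8")
    audio_part = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="audio"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8") + audio_bytes + b"\r\n"
    closing = f"--{boundary}--\r\n".encode("utf-8")
    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return headers, metadata_part + audio_part + closing


if __name__ == "__main__":
    # Two bytes of fake audio; a real device would pass microphone samples.
    headers, body = build_recognize_request(b"\x00\x01", {"format": "hypothetical-pcm"})
    print(headers["Content-Type"])
    print(len(body), "bytes ready to send")
```

Sending the body and playing back the audio in the response are left out, since they depend on the endpoint, authentication and audio stack of the specific device.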

WHAT NEXT?

SOME USEFUL RESOURCES

http://developer.amazon.com/ask
http://developer.amazon.com/blog
http://developer.amazon.com/alexa-fund
http://bit.ly/alexadevchat
http://bit.ly/alexaforums
http://bit.ly/alexacerthelp
http://bit.ly/alexadevevents

THANK YOU

QUESTIONS?
