THE FUTURE OF VOICE WEBDAGENE 2017 CHERYL PLATZ Owner, IDEAPLATZ Senior Designer, MICROSOFT

The voice of the future (en) – with Cheryl Platz


THE FUTURE OF VOICE

WEBDAGENE 2017

CHERYL PLATZ
Owner, IDEAPLATZ
Senior Designer, MICROSOFT

I’ve been designing for voice and multimodal interfaces since 2006.

AT AMAZON:
First designer on Echo Look and Alexa Notifications

AT MICROSOFT:
Designer for voice and multimodal interfaces on Windows Automotive and Cortana


COMPUTER, WHO IS CHERYL?

CHERYL PLATZ // @MUPPETAPHRODITE


HUMANS HAVE DEVELOPED THE ART OF CONVERSATION FOR THOUSANDS OF YEARS.


The accessibility benefits are vast, and not just limited to those with permanent accessibility challenges.

Voice user interfaces leverage this experience to improve lives.

My wife passed away 4 years ago leaving me, not only a widow, but a widowed quadriplegic trying to survive on his own… Alexa has been a blessing beyond my imagination. She has given me an opportunity that I never thought would be possible.

AMAZON ECHO REVIEW FROM MICHAEL DAVIS, FEB 2017
DESCRIBING ECHO’S AID IN HIS LIFE AS A QUADRIPLEGIC

35.6 MILLION AMERICANS USE A VOICE-ACTIVATED ASSISTANT DEVICE AT LEAST ONCE A MONTH.

SOURCE: eMarketer

VOICE UI IS NOW MAINSTREAM, BUT IT’S FAR FROM MATURE.

IN TODAY’S WEAKNESSES LIE THREE KEY OPPORTUNITIES FOR THE FUTURE OF VOICE UI.


Limited training data and an affluent user base are excluding underrepresented groups through inaccuracy.

Today’s voice interfaces are inherently biased.

OPPORTUNITY 1

“…looking at race, I found that Caucasian speakers had by far the lowest error rate. African-American speakers and speakers with a mixed racial background had higher error rates.”

DR. RACHEL TATMAN, LINGUISTICS, UNIVERSITY OF WASHINGTON
ON ACCURACY OF SIRI FOR VARIOUS DEMOGRAPHIC GROUPS
KUOW, SEPTEMBER 19 2017

GENDER
Systems were initially trained with internal data collection – at companies where engineering teams are still largely male.

ETHNICITY
Training data expands to include early adopters, often affluent. This may exclude underrepresented ethnicities due to wage gaps.

ACCENT
The North American focus of most of today’s products means we have yet to attain a critical mass of training data for second-language speakers.

DECONSTRUCTING VOICE UI BIAS


BIAS SPIRAL
Biased Training Data
Poor Accuracy for Excluded Groups
High Attrition by Excluded Groups

WE MUST FIND A WAY TO BREAK THE BIAS SPIRAL, AND MAKE THE FUTURE OF VOICE UI VIABLE FOR ALL.

We are wasting time re-implementing the same basic tasks on multiple systems. Most systems emphasize a single modality at a time.

Today’s voice interfaces are simple and siloed.

OPPORTUNITY 2

We currently have an ecosystem of voice assistants chasing each other’s tails.

What could we accomplish if we relied on each other’s expertise?

TIME LOST TO TIMERS

Complicated

Clip from Adobe vision video: “What if you had an intelligent agent for voice editing?”

DO WE NEED ONE ASSISTANT TO RULE THEM ALL?

Through its collaboration with Microsoft, Amazon said, Alexa users will get answers to some of the same questions that Cortana can now answer – for instance, when is the next budget review with the boss?

NICK WINGFIELD, NEW YORK TIMES
AUGUST 30, 2017
ILLUSTRATION: MENGXIN LI

LET’S BUILD A CHOIR OF HARMONIOUS VOICE INTERFACES TOGETHER.

Alexa, Google Home and Cortana essentially allow only command-and-control scenarios.

Today’s voice UIs aren’t conversational – yet.

OPPORTUNITY 3

IT LOOKS LIKE YOU MIGHT BE IN THE AWKWARD EARLY STAGES OF CONVERSATIONAL UI. CAN I HELP?
PLEASE
NO
RUN AWAY

AUDIBLE CUES
Tone
Speed
Volume

PHYSICAL CUES
Eye contact & gaze
Heart rate
Posture
Gesture

SPOKEN CONVERSATION IS MORE THAN WORDS

Clip from “Her”: Warner Brothers / Annapurna Pictures

CONVERSATION REQUIRES TRUST.

HUMANS BUILD TRUST OVER TIME.

WHAT BENEFIT CAN HUMANS GAIN FROM TRUSTING THESE ASSISTANTS?

The other night, I found Gary playing his own version of a memory game with Alexa. He was trying to come up with songs he remembered and hadn't heard for awhile and would ask her to play them.

AMAZON ECHO REVIEW FROM ALEX S.
DESCRIBING ECHO’S AID IN HUSBAND’S STRUGGLE WITH PARKINSON’S

People have serious conversations with Siri. People talk to Siri about all kinds of things, including when they’re having a stressful day or have something serious on their mind. They turn to Siri in emergencies or when they want guidance on living a healthier life.

APPLE JOB POSTING, SIRI SOFTWARE ENGINEER, HEALTH AND WELLNESS
APRIL 4, 2017

Clip from “Her”: Warner Brothers / Annapurna Pictures

How can we (ethically) model a relationship over time?
What information is saved, and what is discarded?
What level of transparency and control is required?
Does the assistant’s personality adapt, or remain fixed?

WHAT DOES A RELATIONSHIP LOOK LIKE?

HOW DO WE GET TO THE FUTURE OF VOICE UI?

Inclusive and unbiased speech recognition
Harmonious cross-product partnerships
Semantic web represents common knowledge
Trust built over time with shared context
Conversation informed by non-verbal cues

THE FUTURE OF VOICE


THESE ADVANCES WILL COMBINE TO OPEN NEW OPPORTUNITIES AND A NEW ERA IN HUMAN EMPOWERMENT.


ENHANCED PRODUCTIVITY

MULTIMODAL ADAPTIVITY

COMPANIONSHIP AND COMFORT

Clip from Star Trek IV: The Voyage Home / Paramount Pictures


LET’S BUILD A FUTURE OF INTERFACES WHERE OUR HUMANITY IS AMPLIFIED, NOT ATROPHIED.


May the voice be with you.
http://ideaplatz.com

CHERYL PLATZ
Owner, IDEAPLATZ -- Senior Designer, MICROSOFT

Twitter & Medium: @MuppetAphrodite