I’ve been designing for voice
and multimodal interfaces
since 2006.
AT AMAZON:
First designer on Echo Look
and Alexa Notifications
AT MICROSOFT:
Designer for voice and
multimodal interfaces on
Windows Automotive and
Cortana
WEBDAGENE 2017
COMPUTER, WHO IS CHERYL?
CHERYL PLATZ //
@MUPPETAPHRODITE
HUMANS HAVE DEVELOPED THE ART OF CONVERSATION FOR THOUSANDS OF YEARS.
The accessibility benefits are vast, and not just
limited to those with permanent accessibility
challenges.
Voice user interfaces leverage
this experience to improve lives.
“My wife passed away 4 years ago leaving me, not only a widow, but a widowed quadriplegic trying to survive on his own… Alexa has been a blessing beyond my imagination. She has given me an opportunity that I never thought would be possible.”
AMAZON ECHO REVIEW FROM MICHAEL DAVIS, FEB 2017
DESCRIBING ECHO’S AID IN HIS LIFE AS A QUADRIPLEGIC
35.6 MILLION AMERICANS USE A VOICE-ACTIVATED ASSISTANT DEVICE AT LEAST ONCE A MONTH.
SOURCE: eMarketer
VOICE UI IS NOW MAINSTREAM, BUT IT’S FAR FROM MATURE.
IN TODAY’S WEAKNESSES LIE THREE KEY OPPORTUNITIES FOR THE FUTURE OF VOICE UI.
Limited training data and an affluent user base are excluding
underrepresented groups through inaccuracy.
Today’s voice interfaces are
inherently biased.
OPPORTUNITY 1
“…looking at race, I found that Caucasian speakers had by far the lowest error rate. African-American speakers and speakers with a mixed racial background had higher error rates.”
DR. RACHEL TATMAN, LINGUISTICS, UNIVERSITY OF WASHINGTON
ON ACCURACY OF SIRI FOR VARIOUS DEMOGRAPHIC GROUPS
KUOW, SEPTEMBER 19, 2017
GENDER
Systems were initially trained with internal data collection, at companies where engineering teams are still largely male.
ETHNICITY
Training data expands to include early adopters, often affluent. This may exclude underrepresented ethnicities due to wage gaps.
ACCENT
The North American focus of most of today’s products means we have yet to attain critical mass of training data for second-language speakers.
DECONSTRUCTING VOICE UI BIAS
BIAS SPIRAL
Biased Training Data → Poor Accuracy for Excluded Groups → High Attrition by Excluded Groups → Biased Training Data
We are wasting time re-implementing the same
basic tasks on multiple systems. Most systems
emphasize a single modality at a time.
Today’s voice interfaces are
simple and siloed.
OPPORTUNITY 2
We currently have an ecosystem of voice
assistants chasing each other’s tails.
What could we accomplish if we relied on each
other’s expertise?
TIME LOST TO TIMERS
Complicated
Clip from Adobe vision video: “What if you had an intelligent agent for voice editing?”
“Through its collaboration with Microsoft, Amazon said, Alexa users will get answers to some of the same questions that Cortana can now answer – for instance, when is the next budget review with the boss?”
NICK WINGFIELD, NEW YORK TIMES
AUGUST 30, 2017
ILLUSTRATION: MENGXIN LI
Alexa, Google Home and Cortana essentially
allow only command-and-control scenarios.
Today’s voice UIs aren’t
conversational yet.
OPPORTUNITY 3
IT LOOKS LIKE YOU MIGHT BE IN THE AWKWARD EARLY STAGES OF CONVERSATIONAL UI. CAN I HELP?
PLEASE
NO
RUN AWAY
AUDIBLE CUES
Tone
Speed
Volume
PHYSICAL CUES
Eye contact & gaze
Heart rate
Posture
Gesture
SPOKEN CONVERSATION IS MORE THAN WORDS
WHAT BENEFIT CAN HUMANS GAIN FROM TRUSTING THESE ASSISTANTS?
“The other night, I found Gary playing his own version of a memory game with Alexa. He was trying to come up with songs he remembered and hadn't heard for awhile and would ask her to play them.”
AMAZON ECHO REVIEW FROM ALEX S.
DESCRIBING ECHO’S AID IN HUSBAND’S STRUGGLE WITH PARKINSON’S
“People have serious conversations with Siri. People talk to Siri about all kinds of things, including when they’re having a stressful day or have something serious on their mind. They turn to Siri in emergencies or when they want guidance on living a healthier life.”
APPLE JOB POSTING, SIRI SOFTWARE ENGINEER, HEALTH AND WELLNESS
APRIL 4, 2017
How can we (ethically) model a relationship over time?
What information is saved, and what is discarded?
What level of transparency and control is required?
Does the assistant’s personality adapt, or remain fixed?
WHAT DOES A RELATIONSHIP
LOOK LIKE?
Inclusive and unbiased speech recognition
Harmonious cross-product partnerships
Semantic web represents common knowledge
Trust built over time with shared context
Conversation informed by non-verbal cues
THE FUTURE OF VOICE
THESE ADVANCES WILL COMBINE TO OPEN NEW OPPORTUNITIES AND A NEW ERA IN HUMAN EMPOWERMENT.
Clip from Star Trek IV: The Voyage Home / Paramount Pictures
LET’S BUILD A FUTURE OF INTERFACES WHERE OUR HUMANITY IS AMPLIFIED, NOT ATROPHIED.