Slide 1
High Resolution Statistical Natural Language Understanding:
Tools, Processes, and Issues
Roberto Pieraccini
SpeechCycle
Slide 2: Directed Dialog vs. Open Prompt
DIRECTED DIALOG (context-free grammars):
  System: "Please choose one of the following: account balance, fund transfer, payments, mortgage rates."
  User: "Account balance."

OPEN PROMPT (statistical spoken language understanding):
  System: "Please tell me what you are calling about."
  User: "I want to buy a house and I would like to know how much it would cost to borrow money from the bank."
Slide 3: Context-Free Grammars
[Figure: the context-free grammar maps in-grammar utterances ("account balance", "fund transfer", "payments", "mortgage rates") to the categories BALANCE, TRANSFER, PAYMENT, and MORTGAGE; anything else is out of grammar and yields NO-MATCH.]
A handcrafted SRGS grammar:

<rule id="bank">
  <one-of>
    <item>account balance</item>
    <item>fund transfer</item>
    <item>payments</item>
    <item>mortgage rates</item>
  </one-of>
</rule>
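Functionally, the rule above is just an exact-match table: an utterance either matches one of the listed phrases (in grammar) or it does not (out of grammar, NO-MATCH). A minimal sketch in Python, where the phrase-to-category mapping is an assumption for illustration and not part of the SRGS file itself:

```python
# Exact-match lookup emulating the handcrafted CFG rule above.
# The category names mirror the slide; the mapping is illustrative.
GRAMMAR = {
    "account balance": "BALANCE",
    "fund transfer": "TRANSFER",
    "payments": "PAYMENT",
    "mortgage rates": "MORTGAGE",
}

def cfg_classify(utterance: str) -> str:
    """Return the category for an in-grammar utterance, else NO-MATCH."""
    return GRAMMAR.get(utterance.strip().lower(), "NO-MATCH")
```

Anything outside the four phrases, such as the open-prompt example from slide 2, falls through to NO-MATCH.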
The CFG returns the N-best semantic categories.
Slide 4: Statistical Spoken Language Understanding (SSLU)
[Figure: all possible natural language expressions feed the SSLU, which returns the N-best semantic categories: BALANCE, TRANSFER, PAYMENT, MORTGAGE, or OTHER (anything else). Inside the SSLU:
  – the Statistical Language Model (SLM) provides a probabilistic constraint to the speech recognition engine, which emits the N-best word strings;
  – the Statistical Semantic Model (SSM) classifies a word string into a number of predefined categories.]
Bi-gram language model: P(w_t | w_{t-1})

Statistical classifier: Ĉ = argmax over C_i of P(C_i | w_1 … w_T)
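The classifier equation can be realized in many ways. As an illustrative sketch (not necessarily the model used in the system described here), a naive Bayes classifier applies Bayes' rule to pick Ĉ = argmax over C_i of P(C_i) · Π_t P(w_t | C_i), with the probabilities estimated from bag-of-words counts over annotated transcriptions:

```python
import math
from collections import Counter, defaultdict

class NaiveBayesSSM:
    """Toy statistical semantic model: naive Bayes over word counts.
    An illustrative stand-in for the SSM, trained on (text, category)
    pairs like the annotated transcriptions on the next slide."""

    def __init__(self, examples):
        self.cat_counts = Counter(cat for _, cat in examples)
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for text, cat in examples:
            for w in text.lower().split():
                self.word_counts[cat][w] += 1
                self.vocab.add(w)

    def classify(self, utterance):
        """Return argmax over categories of log P(C) + sum log P(w|C)."""
        total = sum(self.cat_counts.values())
        best, best_lp = None, float("-inf")
        for cat, n in self.cat_counts.items():
            lp = math.log(n / total)  # log prior P(C_i)
            denom = sum(self.word_counts[cat].values()) + len(self.vocab)
            for w in utterance.lower().split():
                # add-one smoothing so unseen words do not zero out a category
                lp += math.log((self.word_counts[cat][w] + 1) / denom)
            if lp > best_lp:
                best, best_lp = cat, lp
        return best
```

This is only the SSM half; the SLM half (the bi-gram model) would constrain the recognizer upstream and produce the N-best word strings the SSM scores.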
Slide 5: Building SSLUs

[Figure: annotated transcriptions feed SSLU training, which produces the Statistical Language Model (SLM) and the Statistical Semantic Model (SSM).]

Annotated transcriptions:
  I need to know how much money I have → BALANCE
  I need to move funds from checking to savings → TRANSFER
  How much would it cost to borrow money to buy a house → MORTGAGE
  I need to pay my utility bills → PAYMENT
  I dialed the wrong number → OTHER
  …
Slide 6: The Accuracy Point of View

[Figure: directed dialog deals mostly with expected responses; open prompts with SSLU must also cover unexpected responses. Directed dialog is improved by increasing grammar coverage and tightening the prompt; SSLU is improved with more training data.]

• Directed dialog achieves high accuracy by limiting unexpected responses and by controlling vocabulary and word confusability.
• Unlimited input and an uncontrolled vocabulary result in lower accuracy than directed dialog.
• When unexpected responses and the user's vocabulary can be controlled, directed dialog typically provides higher accuracy.
Slide 7: Why SSLU?

• Number of options too large for directed dialog.
  – Please choose one of the following: clothing, automotive, hardware, appliances, …, gardening, …, bedding, …
• Options make little sense to users.
  – Do you have a hardware, software, or configuration problem?
• User may choose the wrong option.
  – Hmm… hardware?
• Unexpected responses and user's vocabulary hard to control.
  – I need to buy a car CD player that plays MP3s.

In all these situations, open prompts with SSLU can outperform directed dialog.
Slide 8: Low and High Resolution SLUs

Low resolution
• APPLICATION: Call routing
  – 10s of broad semantic categories

High resolution
• APPLICATION: Technical support
  – 100s of semantic categories
  – Different degrees of specificity
  – Detailed confirmation
• User model differs from underlying model
• Users don't know the underlying model
Slide 9: Hierarchical SSLU

[Figure: a hierarchical symptom tree rooted at TV Symptoms, with branches On Demand and Pay-per-view and lower-level nodes such as Ordering, No Picture, Error, PIN, and Other. Example utterances map to nodes at different levels of the hierarchy:
  "I have a problem with my TV service" → TV Symptoms
  "My movie on demand does not work" → On Demand
  "I could not order a show" → Ordering
  "I ordered a pay-per-view event but all I see is an error code on the display." → Pay-per-view / Error]
Slide 10: Hierarchical SSLU (continued)

[Figure: the same symptom tree. When an utterance such as "I could not order a show" lands on a node that is ambiguous between branches, the system disambiguates:
  "I understand you have a problem with ordering. Is it on demand or pay-per-view?"]
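The disambiguation step can be sketched as a lookup over the symptom tree: find every branch under which the recognized symptom appears, and ask a follow-up question only when there is more than one. The tree literal and the prompt wording below are assumptions for illustration:

```python
# Hypothetical symptom tree: branch -> leaf symptoms. "Ordering"
# deliberately appears under both branches, so it is ambiguous.
TREE = {
    "On Demand": ["Ordering", "No Picture", "Error"],
    "Pay-per-view": ["Ordering", "PIN", "Error"],
}

def branches_for(symptom):
    """Branches of the tree under which a symptom appears."""
    return [b for b, leaves in TREE.items() if symptom in leaves]

def next_prompt(symptom):
    """Ask a disambiguation question if the symptom is ambiguous,
    otherwise confirm the single matching branch."""
    hits = branches_for(symptom)
    if len(hits) > 1:
        return ("I understand you have a problem with %s. Is it %s?"
                % (symptom.lower(), " or ".join(b.lower() for b in hits)))
    return "So you have a problem with %s %s, correct?" % (
        hits[0].lower(), symptom.lower())
```

With this tree, "Ordering" triggers the on-demand vs. pay-per-view question from the slide, while "No Picture" goes straight to confirmation.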
Slide 11

• In order to build a good SSLU it is important to establish a repeatable process:
  – Transcription management
  – Creation of the annotation guide
  – Measurement of annotation consistency
  – Revision of the annotation guide
  – Creation of the VUI
Slide 12: Development Cycle

[Figure: the development cycle.
  Transcription → Normalization (remove artifacts, acronyms, misspellings)
  → Linguistic Annotation (annotate according to what the user says, guided by the Symptom and Annotation Guide)
  → SSLU Training → SSLU Test (measure annotation consistency)
  → Review Annotation Guide (merge or split categories for better SLU performance)
  → Develop disambiguation VUI, and back to Transcription.]

Tens of thousands of utterances are needed to create high-performance SLUs.
Slide 13: Confusion Matrix

[Figure: a confusion matrix with the semantic truth on one axis and the SLU results on the other.]
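Building such a matrix from annotated test results is a matter of counting (truth, result) pairs: diagonal entries are correct classifications, off-diagonal entries are confusions. The sample data below is invented for illustration:

```python
from collections import Counter

def confusion_matrix(truths, predictions):
    """Count (semantic truth, SLU result) pairs."""
    return Counter(zip(truths, predictions))

# Invented example data, not the deck's actual test set.
truth = ["BALANCE", "TRANSFER", "BALANCE", "MORTGAGE"]
pred  = ["BALANCE", "BALANCE",  "BALANCE", "MORTGAGE"]
cm = confusion_matrix(truth, pred)
# cm[("BALANCE", "BALANCE")] == 2   (correct, on the diagonal)
# cm[("TRANSFER", "BALANCE")] == 1  (a confusion, off the diagonal)
```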
Slide 14: Performance Analysis

SSLU accuracy analysis:

  Utterances                           10,332   100.00%
  In domain                             9,322    90.22%
  Correct in-domain                     7,591    81.43%
  Out of domain                         1,010     9.78%
  Correct rejection, out-of-domain        249    24.65%

(The "correct" percentages are relative to their own subset: 7,591 of 9,322 in-domain utterances, and 249 of 1,010 out-of-domain utterances.)
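The table's percentages can be re-derived from the raw counts, which makes the normalization explicit: the domain split is relative to all 10,332 utterances, while the "correct" rates are relative to their own domain subset:

```python
# Sanity-checking the slide's accuracy table from its raw counts.
utterances, in_domain, correct_in = 10332, 9322, 7591
out_of_domain, correct_rejection = 1010, 249

def pct(n, d):
    """Percentage of n over d, rounded to two decimals."""
    return round(100.0 * n / d, 2)

assert pct(in_domain, utterances) == 90.22          # in-domain share
assert pct(out_of_domain, utterances) == 9.78       # out-of-domain share
assert pct(correct_in, in_domain) == 81.43          # of in-domain only
assert pct(correct_rejection, out_of_domain) == 24.65  # of out-of-domain only
```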
Slide 15: Confirmation Analysis

• As a result of the confirmation prompt, users can:
  – Accept a correct hypothesis
  – Accept a wrong hypothesis
  – Deny a correct hypothesis
  – Deny a wrong hypothesis
  – Not confirm at all

  Accepted correct    535    59.8%
  Accepted wrong      118    13.2%
  Denied correct       22     2.5%
  Denied wrong         63     7.0%
  Unconfirmed          57     6.4%
  No result           100    11.2%
  TOTAL               895   100.0%
Slide 16: Experimental VUI, the effect of the prompt

[Figure: two time series over the period 10/26 through 11/7. The first plots cumulative average deflection (14% to 21%) for three prompt variants: ORIGINAL, OFFER CHOICES, and WITH EXAMPLES. The second plots the number of calls (0 to 25,000) over the same period.]
Slide 17: Conclusions

• Understand the choice between SSLU and directed dialog.
• Use high-resolution SSLU for applications with hundreds of semantic categories.
• Follow a repeatable SSLU development process.
• Data, data, data: assessment of performance is key to success.