Slide 1
High Resolution Statistical Natural Language Understanding:
Tools, Processes, and Issues
Roberto Pieraccini
SpeechCycle
Slide 2: Directed Dialog vs. Open Prompt
DIRECTED DIALOG (context-free grammars):
  System: "Please choose one of the following: account balance, fund transfer, payments, mortgage rates."
  User: "Account balance."

OPEN PROMPT (statistical spoken language understanding):
  System: "Please tell me what you are calling about."
  User: "I want to buy a house and I would like to know how much it would cost to borrow money from the bank."
Slide 3: Context-Free Grammars
[Figure: the context-free grammar maps in-grammar utterances ("account balance", "fund transfer", "payments", "mortgage rates") to the categories BALANCE, TRANSFER, PAYMENT, and MORTGAGE; anything else is out of grammar and yields NO-MATCH.]
A handcrafted SRGS grammar:

<rule id="bank">
  <one-of>
    <item>account balance</item>
    <item>fund transfer</item>
    <item>payments</item>
    <item>mortgage rates</item>
  </one-of>
</rule>
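Functionally, the rule above is just an exact-match table: an utterance either matches one of the listed phrases (in grammar) or it does not (out of grammar, NO-MATCH). A minimal sketch in Python, where the phrase-to-category mapping is an assumption for illustration and not part of the SRGS file itself:

```python
# Exact-match lookup emulating the handcrafted CFG rule above.
# The category names mirror the slide; the mapping is illustrative.
GRAMMAR = {
    "account balance": "BALANCE",
    "fund transfer": "TRANSFER",
    "payments": "PAYMENT",
    "mortgage rates": "MORTGAGE",
}

def cfg_classify(utterance: str) -> str:
    """Return the category for an in-grammar utterance, else NO-MATCH."""
    return GRAMMAR.get(utterance.strip().lower(), "NO-MATCH")
```

Anything outside the four phrases, such as the open-prompt example from slide 2, falls through to NO-MATCH.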
The CFG returns the N-best semantic categories.
Slide 4: Statistical Spoken Language Understanding (SSLU)
[Figure: all possible natural language expressions feed the SSLU, which returns the N-best semantic categories: BALANCE, TRANSFER, PAYMENT, MORTGAGE, or OTHER (anything else). Inside the SSLU:
  – the Statistical Language Model (SLM) provides a probabilistic constraint to the speech recognition engine, which emits the N-best word strings;
  – the Statistical Semantic Model (SSM) classifies a word string into a number of predefined categories.]
Bi-gram language model: P(w_t | w_{t-1})

Statistical classifier: Ĉ = argmax over C_i of P(C_i | w_1 … w_T)
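The classifier equation can be realized in many ways. As an illustrative sketch (not necessarily the model used in the system described here), a naive Bayes classifier applies Bayes' rule to pick Ĉ = argmax over C_i of P(C_i) · Π_t P(w_t | C_i), with the probabilities estimated from bag-of-words counts over annotated transcriptions:

```python
import math
from collections import Counter, defaultdict

class NaiveBayesSSM:
    """Toy statistical semantic model: naive Bayes over word counts.
    An illustrative stand-in for the SSM, trained on (text, category)
    pairs like the annotated transcriptions on the next slide."""

    def __init__(self, examples):
        self.cat_counts = Counter(cat for _, cat in examples)
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for text, cat in examples:
            for w in text.lower().split():
                self.word_counts[cat][w] += 1
                self.vocab.add(w)

    def classify(self, utterance):
        """Return argmax over categories of log P(C) + sum log P(w|C)."""
        total = sum(self.cat_counts.values())
        best, best_lp = None, float("-inf")
        for cat, n in self.cat_counts.items():
            lp = math.log(n / total)  # log prior P(C_i)
            denom = sum(self.word_counts[cat].values()) + len(self.vocab)
            for w in utterance.lower().split():
                # add-one smoothing so unseen words do not zero out a category
                lp += math.log((self.word_counts[cat][w] + 1) / denom)
            if lp > best_lp:
                best, best_lp = cat, lp
        return best
```

This is only the SSM half; the SLM half (the bi-gram model) would constrain the recognizer upstream and produce the N-best word strings the SSM scores.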
Slide 5: Building SSLUs

[Figure: annotated transcriptions feed SSLU training, which produces the Statistical Language Model (SLM) and the Statistical Semantic Model (SSM).]

Annotated transcriptions:
  I need to know how much money I have → BALANCE
  I need to move funds from checking to savings → TRANSFER
  How much would it cost to borrow money to buy a house → MORTGAGE
  I need to pay my utility bills → PAYMENT
  I dialed the wrong number → OTHER
  …
Slide 6: The Accuracy Point of View

[Figure: directed dialog deals mostly with expected responses; open prompts with SSLU must also cover unexpected responses. Directed dialog is improved by increasing grammar coverage and tightening the prompt; SSLU is improved with more training data.]

• Directed dialog achieves high accuracy by limiting unexpected responses and by controlling vocabulary and word confusability.
• Unlimited input and an uncontrolled vocabulary result in lower accuracy than directed dialog.
• When unexpected responses and the user's vocabulary can be controlled, directed dialog typically provides higher accuracy.
Slide 7: Why SSLU?

• Number of options too large for directed dialog.
  – Please choose one of the following: clothing, automotive, hardware, appliances, …, gardening, …, bedding, …
• Options make little sense to users.
  – Do you have a hardware, software, or configuration problem?
• User may choose the wrong option.
  – Hmm… hardware?
• Unexpected responses and user's vocabulary hard to control.
  – I need to buy a car CD player that plays MP3s.

In all these situations, open prompts with SSLU can outperform directed dialog.
Slide 8: Low and High Resolution SLUs

Low resolution
• APPLICATION: Call routing
  – 10s of broad semantic categories

High resolution
• APPLICATION: Technical support
  – 100s of semantic categories
  – Different degrees of specificity
  – Detailed confirmation
• User model differs from underlying model
• Users don't know the underlying model
Slide 9: Hierarchical SSLU

[Figure: a hierarchical symptom tree rooted at TV Symptoms, with branches On Demand and Pay-per-view and lower-level nodes such as Ordering, No Picture, Error, PIN, and Other. Example utterances map to nodes at different levels of the hierarchy:
  "I have a problem with my TV service" → TV Symptoms
  "My movie on demand does not work" → On Demand
  "I could not order a show" → Ordering
  "I ordered a pay-per-view event but all I see is an error code on the display." → Pay-per-view / Error]
Slide 10: Hierarchical SSLU (continued)

[Figure: the same symptom tree. When an utterance such as "I could not order a show" lands on a node that is ambiguous between branches, the system disambiguates:
  "I understand you have a problem with ordering. Is it on demand or pay-per-view?"]
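The disambiguation step can be sketched as a lookup over the symptom tree: find every branch under which the recognized symptom appears, and ask a follow-up question only when there is more than one. The tree literal and the prompt wording below are assumptions for illustration:

```python
# Hypothetical symptom tree: branch -> leaf symptoms. "Ordering"
# deliberately appears under both branches, so it is ambiguous.
TREE = {
    "On Demand": ["Ordering", "No Picture", "Error"],
    "Pay-per-view": ["Ordering", "PIN", "Error"],
}

def branches_for(symptom):
    """Branches of the tree under which a symptom appears."""
    return [b for b, leaves in TREE.items() if symptom in leaves]

def next_prompt(symptom):
    """Ask a disambiguation question if the symptom is ambiguous,
    otherwise confirm the single matching branch."""
    hits = branches_for(symptom)
    if len(hits) > 1:
        return ("I understand you have a problem with %s. Is it %s?"
                % (symptom.lower(), " or ".join(b.lower() for b in hits)))
    return "So you have a problem with %s %s, correct?" % (
        hits[0].lower(), symptom.lower())
```

With this tree, "Ordering" triggers the on-demand vs. pay-per-view question from the slide, while "No Picture" goes straight to confirmation.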
Slide 11

• In order to build a good SSLU it is important to establish a repeatable process:
  – Transcription management
  – Creation of the annotation guide
  – Measurement of annotation consistency
  – Revision of the annotation guide
  – Creation of the VUI
Slide 12: Development Cycle

[Figure: the development cycle.
  Transcription → Normalization (remove artifacts, acronyms, misspellings)
  → Linguistic Annotation (annotate according to what the user says, guided by the Symptom and Annotation Guide)
  → SSLU Training → SSLU Test (measure annotation consistency)
  → Review Annotation Guide (merge or split categories for better SLU performance)
  → Develop disambiguation VUI, and back to Transcription.]

Tens of thousands of utterances are needed to create high-performance SLUs.
Slide 13: Confusion Matrix

[Figure: a confusion matrix with the semantic truth on one axis and the SLU results on the other.]
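Building such a matrix from annotated test results is a matter of counting (truth, result) pairs: diagonal entries are correct classifications, off-diagonal entries are confusions. The sample data below is invented for illustration:

```python
from collections import Counter

def confusion_matrix(truths, predictions):
    """Count (semantic truth, SLU result) pairs."""
    return Counter(zip(truths, predictions))

# Invented example data, not the deck's actual test set.
truth = ["BALANCE", "TRANSFER", "BALANCE", "MORTGAGE"]
pred  = ["BALANCE", "BALANCE",  "BALANCE", "MORTGAGE"]
cm = confusion_matrix(truth, pred)
# cm[("BALANCE", "BALANCE")] == 2   (correct, on the diagonal)
# cm[("TRANSFER", "BALANCE")] == 1  (a confusion, off the diagonal)
```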
Slide 14: Performance Analysis

SSLU accuracy analysis:

  Utterances                           10,332   100.00%
  In domain                             9,322    90.22%
  Correct in-domain                     7,591    81.43%
  Out of domain                         1,010     9.78%
  Correct rejection, out-of-domain        249    24.65%

(The "correct" percentages are relative to their own subset: 7,591 of 9,322 in-domain utterances, and 249 of 1,010 out-of-domain utterances.)
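The table's percentages can be re-derived from the raw counts, which makes the normalization explicit: the domain split is relative to all 10,332 utterances, while the "correct" rates are relative to their own domain subset:

```python
# Sanity-checking the slide's accuracy table from its raw counts.
utterances, in_domain, correct_in = 10332, 9322, 7591
out_of_domain, correct_rejection = 1010, 249

def pct(n, d):
    """Percentage of n over d, rounded to two decimals."""
    return round(100.0 * n / d, 2)

assert pct(in_domain, utterances) == 90.22          # in-domain share
assert pct(out_of_domain, utterances) == 9.78       # out-of-domain share
assert pct(correct_in, in_domain) == 81.43          # of in-domain only
assert pct(correct_rejection, out_of_domain) == 24.65  # of out-of-domain only
```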
Slide 15: Confirmation Analysis

• As a result of the confirmation prompt, users can:
  – Accept a correct hypothesis
  – Accept a wrong hypothesis
  – Deny a correct hypothesis
  – Deny a wrong hypothesis
  – Not confirm at all

  Accepted correct    535    59.8%
  Accepted wrong      118    13.2%
  Denied correct       22     2.5%
  Denied wrong         63     7.0%
  Unconfirmed          57     6.4%
  No result           100    11.2%
  TOTAL               895   100.0%
Slide 16: Experimental VUI, the effect of the prompt

[Figure: two time series over the period 10/26 through 11/7. The first plots cumulative average deflection (14% to 21%) for three prompt variants: ORIGINAL, OFFER CHOICES, and WITH EXAMPLES. The second plots the number of calls (0 to 25,000) over the same period.]
Slide 17: Conclusions

• Understand the choice between SSLU and directed dialog.
• Use high-resolution SSLU for applications with hundreds of semantic categories.
• Follow a repeatable SSLU development process.
• Data, data, data: assessment of performance is key to success.