DN 2017 | Attention models for chatbot technology | Igor Mikhalev | Firmshift

ROLLEN 2017

reason.ai — attention models for chatbot technology


Attention models for chatbot technology

HOW WOULD YOU PREFER ORDERING

“a medium extra hot latte macchiato with skimmed milk and a double shot of espresso?”

#1

#2

Chose #1?

Congratulations: You seem to be human.

Humans prefer expressing what they feel and desire in their natural language.

But during the industrial age we had to learn to use machine interfaces for driving cars, washing clothes, ordering things online and getting coffee.

We’re nearing the moment when we can start using our own language again.

And give machines the responsibility of learning to understand us.

Conversational Commerce

A brand gets to have a hyper-relevant, goal-oriented, personal 1:1 conversation with customers through a channel they like, at any time they like.

How is this different from “just a chatbot”?


End-to-end learnability

From bad intents to resource logic

To achieve goals in a conversational dialog, users need to cooperate.

But they usually don’t.

Usual pattern:

testing the limits and trying to break the toy

In a goal-oriented system, failed dialogs abound.

“Resilient toy” approach

• User’s primary goal is to break the system

• Users employ a set of strategies to achieve that

Users’ strategies for breaking the toy

Forgetfulness

User: I have a red car.
Bot: OK.
User: What color is my car?
Bot: Maybe it is green?

Contradiction

User: I have two brothers.
Bot: OK.
User: I don’t have brothers.
Bot: OK.
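The two failure modes above can be guarded against with a simple dialog memory. A minimal sketch (class and method names are illustrative, not from any particular framework): store user premises as key–value facts, flag a new assertion that conflicts with a stored one (contradiction), and answer questions from memory instead of guessing (forgetfulness).

```python
# Toy dialog memory guarding against forgetfulness and contradiction.
# All names here are hypothetical, for illustration only.

class DialogMemory:
    def __init__(self):
        self.facts = {}  # key -> value, e.g. ("car", "color") -> "red"

    def assert_fact(self, key, value):
        """Store a user premise; report a contradiction if it conflicts."""
        old = self.facts.get(key)
        self.facts[key] = value
        if old is not None and old != value:
            return f"contradiction: {key} was {old!r}, now {value!r}"
        return "OK"

    def recall(self, key):
        """Answer from memory instead of guessing (no forgetfulness)."""
        return self.facts.get(key, "unknown")

memory = DialogMemory()
memory.assert_fact(("car", "color"), "red")    # -> "OK"
memory.recall(("car", "color"))                # -> "red", not "green"
memory.assert_fact(("car", "color"), "green")  # contradiction is flagged
```

A real system would of course need to extract such facts from free text; the point is only that contradiction and forgetfulness become checkable once premises are tracked as resources.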

“Bad intent” detection

• Contradiction
• Forgetfulness
• Topic irrelevance
• Word sense misinterpretation
• Syntactic misinterpretation
• Named entity confusion
• Word play
• Repetition

But also:

• Phatic
• Cooperative

Assumption: no background knowledge    

“What’s the radius of the Earth?”

Not allowed when ordering a pizza

Building on this assumption

• Premises are introduced by the user and get discharged (utilized) during the dialog

• We can use natural deduction and resource logic to model that pattern
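The resource-logic view can be made concrete with a toy context: in linear logic (and hence in the Lambek fragment discussed below), each premise is a resource that must be consumed exactly once. The class and premise names here are illustrative, not part of any proof assistant.

```python
# Toy resource-sensitive context: premises are introduced, then discharged
# exactly once, and a complete derivation leaves no premise unused.

class LinearContext:
    def __init__(self, premises):
        self.available = list(premises)  # resources introduced by the user
        self.used = []

    def consume(self, premise):
        """Discharge a premise; each one can be used only once."""
        if premise not in self.available:
            raise ValueError(f"premise {premise!r} not available (already discharged?)")
        self.available.remove(premise)
        self.used.append(premise)

    def complete(self):
        """A linear derivation must consume every premise."""
        return not self.available

ctx = LinearContext(["wants_pizza", "address_given"])
ctx.consume("wants_pizza")
ctx.consume("address_given")
ctx.complete()  # True: every premise introduced was discharged
```

Trying to consume "wants_pizza" a second time raises an error, which is exactly the discipline that rules out both forgetting a premise and reusing one.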

Some (old) theory behind it

Prawitz-style proofs

Rules discharge premises

Gentzen-style proofs

Sequents internalise context

Syntactic categories represent classes of context, which should be glued into a coherent whole

Lambek calculus

Functions consume resources to produce other resources

Lambek calculus

Lambek Categorial Grammar

Prove that “John loves Mary” is a sentence, given the lexicon
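The standard derivation can be sketched with only the application rules of the Lambek calculus (i.e. AB categorial grammar). The encoding below is illustrative: ('/', a, b) stands for a/b (looks right for b), ('\\', a, b) for a\b (looks left for a), with the textbook lexicon John: np, loves: (np\s)/np, Mary: np.

```python
# Minimal AB categorial grammar reducer: forward application (a/b, b => a)
# and backward application (b, b\a => a). The tuple encoding is our own.

NP = 'np'
S = 's'
LOVES = ('/', ('\\', NP, S), NP)  # (np\s)/np: first an np on the right, then an np on the left

LEXICON = {'John': NP, 'loves': LOVES, 'Mary': NP}

def reduce_once(cats):
    # One step: forward application a/b, b => a; backward application b, b\a => a.
    for i in range(len(cats) - 1):
        left, right = cats[i], cats[i + 1]
        if isinstance(left, tuple) and left[0] == '/' and left[2] == right:
            return cats[:i] + [left[1]] + cats[i + 2:]   # forward
        if isinstance(right, tuple) and right[0] == '\\' and right[1] == left:
            return cats[:i] + [right[2]] + cats[i + 2:]  # backward
    return None

def derives(words, goal=S):
    """Check whether the lexical categories of the words reduce to the goal."""
    cats = [LEXICON[w] for w in words]
    while len(cats) > 1:
        cats = reduce_once(cats)
        if cats is None:
            return False
    return cats[0] == goal

derives(['John', 'loves', 'Mary'])  # True: np, (np\s)/np, np => s
derives(['loves', 'John', 'Mary'])  # False: word order matters (non-commutative)
```

Because the calculus is non-commutative, permuting the words breaks the derivation, which is exactly the resource-and-order sensitivity the slides appeal to.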

The logic of resources

Lambek calculus is a fragment of linear logic

Decomposition of a sequent

A proof net for LCG

A proof net for LCG is a frame with linkage, satisfying a set of well-formedness constraints (acyclicity, coloring, depending on the formulation).

Attention Matrix

syntactic categories ↔ embeddings
sequents ↔ sequences
sequent decomposition ↔ seq2seq
linkage ↔ attention
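The right-hand side of the correspondence can be sketched in a few lines: for each decoder position, attention computes a softmax over similarities with the encoder states, a soft analogue of the discrete proof-net linkage. A pure-Python toy with illustrative vectors, not any library's API:

```python
# Toy scaled dot-product attention: rows of the attention matrix are
# decoder steps (queries), columns are encoder steps (keys).

import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def attention_matrix(queries, keys):
    """Each row is a probability distribution over the encoder positions."""
    scale = math.sqrt(len(keys[0]))
    return [softmax([dot(q, k) / scale for k in keys]) for q in queries]

def attend(queries, keys, values):
    """Weighted sum of the values under the attention weights."""
    A = attention_matrix(queries, keys)
    return [[sum(w * v[j] for w, v in zip(row, values))
             for j in range(len(values[0]))] for row in A]

keys = [[1.0, 0.0], [0.0, 1.0]]  # two encoder states
queries = [[4.0, 0.0]]           # one decoder state, aligned with the first key
A = attention_matrix(queries, keys)  # A[0][0] is close to 1: a near-hard link
```

When a weight approaches 1 the matrix degenerates into a discrete linkage, which is what makes attention matrices a plausible differentiable stand-in for proof nets.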

Smarter dialog management

• Spot the user’s utterance that the current response violates or misinterprets
• Assign a score (probability) that a hypothetical answer breaks the dialog
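A hedged sketch of those two bullets, assuming an attention matrix for each candidate response and a (hypothetical) upstream classifier that supplies a dialog-breaking probability; both function names and scores below are illustrative:

```python
# Toy dialog management: locate the most-attended user utterance for a
# candidate response, and rank candidates by their break probability.

def spot_violated_utterance(attention_row, utterances):
    """Return (index, text) of the user utterance the candidate attends to most."""
    i = max(range(len(attention_row)), key=lambda j: attention_row[j])
    return i, utterances[i]

def rank_candidates(candidates, break_scores):
    """Prefer candidates with the lowest probability of breaking the dialog."""
    ranked = sorted(zip(candidates, break_scores), key=lambda cb: cb[1])
    return [c for c, _ in ranked]

utterances = ["I have a red car.", "What color is my car?"]
row = [0.8, 0.2]  # attention weights of a candidate answer over the utterances
spot_violated_utterance(row, utterances)  # points at "I have a red car."
rank_candidates(["Maybe green?", "It is red."], [0.9, 0.1])  # safe answer first
```

The attention row tells us *which* premise a bad answer is misusing; the break score tells us *how likely* it is to derail the dialog, and together they give the agents a signal to act on.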

Boosting quality of goal-oriented systems

• Re-estimate the set of hypotheses from competing agents before producing a response

• Agents directly use that information for decision making

• Use reinforcement learning to make future dialogs better