
Page 1:

Interactive and Interpretable Machine Learning Models for Human-Machine Collaboration

Been Kim, Nov 2015

Page 2:

Vision: harness the relative strengths of humans and machine learning models

[Figure: Human and Machine Learning Models; image: http://blogs.teradata.com/]

Page 3:

Research objectives: develop machine learning models, inspired by how humans think, that can…

Page 4:

Research objectives: develop machine learning models, inspired by how humans think, that can…

• infer decisions of humans

Page 5:

Research objectives: develop machine learning models, inspired by how humans think, that can…

• infer decisions of humans
• make sense to humans

Page 6:

Research objectives: develop machine learning models, inspired by how humans think, that can…

• infer decisions of humans
• make sense to humans
• interact with humans

Page 7:

Research objectives: develop machine learning models, inspired by how humans think, that can…

1. Infer human team decisions from team planning conversation (infer decisions of humans)
2. Communication from machine to human: provide intuitive explanations (make sense to humans)
3. Communication from human to machine: incorporate feedback (interact with humans)

Page 8:

Road map

1. Infer human team decisions from team planning conversation (infer decisions of humans)
2. Communication from machine to human: provide intuitive explanations (make sense to humans)
3. Communication from human to machine: incorporate feedback (interact with humans)

Page 9:

Road map

1. Infer human team decisions from team planning conversation (infer decisions of humans)
2. Communication from machine to human: provide intuitive explanations (make sense to humans)
3. Communication from human to machine: incorporate feedback (interact with humans)

Page 10:

Mirror the way humans think

• Humans' tactical decisions are based on exemplar-based reasoning (matching and prototyping) [Cohen 96, Newell 72]
• Skilled firefighters use recognition-primed decision making: a situation is matched to typical cases [Klein 89]
• Machines can better support people's decision-making by representing data in the same way

Page 11:

Case-based reasoning and interpretable models

Case-based reasoning
• Applied to various applications thanks to its intuitive power [Aamodt 94, Slade 91, Bekkerman 06]
Limitations:
• Always requires labels (supervised)
• Does not scale to complex problems
• Does not leverage global patterns in the data

Interpretable models
• Decision trees [De'ath 00]
• Sparse linear classifiers [Tibshirani 96, Ustun 14]
• Prototype-based [Graf 09]
Limitations:
• Sparsity is not enough [Freitas 14]
• Linear models or supervised

Page 12:

Our approach: Bayesian Case Model (BCM)

Bayesian generative models + case-based reasoning = Bayesian Case Model (BCM)

• Leverages the power of examples (prototypes) and subspaces (sets of important features) to explain machine learning results
• Explains complicated concepts using examples

[Kim, Rudin, Shah NIPS 2014]

Page 13:

Bayesian Case Model (BCM)

Page 14:

Bayesian Case Model (BCM)

• A general framework for Bayesian case-based reasoning
• Joint inference on prototypes, subspaces, and cluster labels

[Figure: Clusters A, B, and C, each summarized by prototypes, subspaces, and cluster labels]
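As a rough sketch of what that joint inference returns (the names and values below are illustrative, not from the paper), each cluster is summarized by a real data point from the dataset (its prototype) and a binary feature mask (its subspace):

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class BCMCluster:
    """One learned cluster, summarized the way BCM explains it."""
    prototype_idx: int    # index of an actual data point in the dataset
    subspace: np.ndarray  # binary mask, 1 for features that characterize the cluster

# Hypothetical output for the recipe example on the next slide: cluster A
# ("taco") is explained by a real taco recipe and three defining ingredients.
feature_names = ["salsa", "sour cream", "avocado", "flour", "egg", "chocolate"]
cluster_a = BCMCluster(prototype_idx=17,
                       subspace=np.array([1, 1, 1, 0, 0, 0]))

print([f for f, on in zip(feature_names, cluster_a.subspace) if on])
# ['salsa', 'sour cream', 'avocado']
```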

Page 15:

Explanations provided by the Bayesian Case Model (BCM)

Cluster A, prototype "Taco": subspace = salsa, sour cream, avocado; remaining prototype ingredients: salt, pepper, taco shell, lettuce, oil
Cluster B, prototype "Basic crepe": subspace = flour, egg; remaining prototype ingredients: water, salt, milk, butter
Cluster C, prototype "Chocolate berry tart": subspace = chocolate, strawberry; remaining prototype ingredients: pie crust, whipping cream, kirsch, almonds

Page 16:

Bayesian Case Model (BCM)

• A general framework for Bayesian case-based reasoning
• Joint inference on cluster labels, prototypes, and subspaces

Prototype: the quintessential observation that best represents the cluster
Subspace: the set of important features for characterizing the cluster

Explaining cluster A = 1. clustering + 2. learning explanation; e.g., cluster A (prototype "Taco") is explained by salsa, sour cream, avocado.

Page 17: Been Kim - Interpretable machine learning, Nov 2015

It is a crepe, since it has flour and egg. It is inspired by Mexican food, because it has avocado, salsa and sour cream.

Cluster labels:

• Admixture model for modeling the underlying distributionsCluster A Cluster B Cluster C

= [A, B, A]mexican_crepe

Bayesian Case Model (BCM)

1. Clustering part

17

Page 18:

Bayesian Case Model (BCM): 1. Clustering part

• Admixture model for modeling the underlying distributions (clusters A, B, C)
• Cluster labels: chocolate_crepe = [B, C, C]

"It is a crepe, since it has flour and egg. It is a sweet crepe that is like a chocolate and berry dessert."

Page 19:

Bayesian Case Model (BCM): 1. Clustering part

• The cluster distribution plus supervised classification methods can be used to evaluate clustering performance [1]
• The concentration hyperparameter can be used to control how many different cluster labels appear within one data point (the cluster distribution of the data point)

[1] D. Blei, A. Ng, M. Jordan 2003
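A minimal illustration of that concentration parameter's effect, using NumPy's Dirichlet sampler (the 3-cluster setup is made up for the demo):

```python
import numpy as np

rng = np.random.default_rng(0)

# Per-data-point cluster mixtures drawn from a Dirichlet prior over 3 clusters.
# Small alpha -> near-one-hot mixtures (few distinct labels per data point);
# large alpha -> even mixtures (many distinct labels per data point).
for alpha in (0.1, 10.0):
    pi = rng.dirichlet(alpha * np.ones(3), size=3)
    print(f"alpha={alpha}:\n{np.round(pi, 2)}")
```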

Page 20:

Bayesian Case Model (BCM): 2. Learning explanation part

• Each cluster is characterized by a prototype and a subspace
• Subspace: a binary variable per feature, 1 for important features

Page 21:

Bayesian Case Model (BCM): 2. Learning explanation part

• Prototype: the quintessential observation that best represents the cluster
• A prototype is an actual data point that exists in the dataset

Page 22:

Bayesian Case Model (BCM): 2. Learning explanation part

• Subspace: the set of important features for characterizing the cluster (a binary variable per feature, 1 for important features)

Page 23:

Bayesian Case Model (BCM): 2. Learning explanation part

• Subspace: the set of important features for characterizing the cluster (a binary variable per feature, 1 for important features)
• Any similarity measure can be used. For example, with any loss function: feature j of cluster s is an important feature (i.e., in the subspace) when the value of feature j is identical to the value of the prototype of cluster s.
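A toy sketch of that rule (the majority-match criterion and all names here are illustrative; the paper's actual model encodes this preference in the prior rather than as a hard rule):

```python
import numpy as np

def subspace_mask(cluster_points: np.ndarray, prototype: np.ndarray,
                  threshold: float = 0.8) -> np.ndarray:
    """Mark feature j important when most cluster members match the
    prototype's value on j (a simple 0/1 loss; any similarity works)."""
    match_rate = (cluster_points == prototype).mean(axis=0)
    return (match_rate >= threshold).astype(int)

# Toy binary ingredient matrix for one cluster (rows are recipes).
cluster = np.array([[1, 1, 0, 1],
                    [1, 1, 0, 0],
                    [1, 1, 1, 1]])
print(subspace_mask(cluster, prototype=cluster[2]))  # [1 1 0 0]
```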

Page 24:

Results: challenges of interpretable models

1. Do the learned prototypes and subspaces make sense?
2. Are we sacrificing performance for interpretability?
3. Do the learned prototypes and subspaces help humans' understanding?

Page 25:

1. Do the learned prototypes and subspaces make sense? BCM on recipe data

• Unsupervised clustering on a subset of recipe data
• Data from the computer cooking contest: liris.cnrs.fr/ccc/ccc2014

[Figure: learned recipe prototypes and subspaces; one subspace feature shown is "sesame"]

Page 26:

1. Do the learned prototypes and subspaces make sense? BCM on digit data

Data: http://www.cs.nyu.edu/~roweis/data.html

Page 27:

1. Do the learned prototypes and subspaces make sense? BCM on digit data

[Figure: learned cluster D across Gibbs sampling iterations]

Page 28:

2. Are we sacrificing anything for interpretability? Maintaining accuracy

[Figure: BCM accuracy and sensitivity analysis on the handwritten digit and 20 Newsgroups datasets]

Page 29:

2. Are we sacrificing anything for interpretability?

Joint inference on prototypes, subspaces, and cluster labels is the key.

[Figure: a level set of the posterior distribution. One solution clusters the data well; another solution clusters the data equally well and has better interpretability. BCM gives the latter a higher score.]

Page 30:

Collapsed Gibbs sampling for inference

• Observed to converge quickly in admixture models
• Integrating out the mixture weights and cluster feature distributions for efficient inference

[Kim, Rudin, Shah NIPS 2014]
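A compact sketch of what a collapsed update looks like, in the spirit of LDA-style samplers (the counts, symbols, and toy sizes below are illustrative, not the paper's exact conditional):

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy setup: N data points, J features taking one of V values, K clusters.
N, J, V, K, alpha, q = 4, 5, 3, 2, 0.5, 0.1
X = rng.integers(V, size=(N, J))   # observed feature values
z = rng.integers(K, size=(N, J))   # per-feature cluster labels
n_ik = np.zeros((N, K))            # label counts per data point
n_kv = np.zeros((K, V))            # value counts per cluster
for i in range(N):
    for j in range(J):
        n_ik[i, z[i, j]] += 1
        n_kv[z[i, j], X[i, j]] += 1

def resample(i, j):
    """One collapsed Gibbs step: resample z[i, j] with the mixture weights
    and cluster feature distributions integrated out (count-based)."""
    k, v = z[i, j], X[i, j]
    n_ik[i, k] -= 1
    n_kv[k, v] -= 1
    p = (n_ik[i] + alpha) * (n_kv[:, v] + q) / (n_kv.sum(axis=1) + V * q)
    p /= p.sum()
    z[i, j] = rng.choice(K, p=p)
    n_ik[i, z[i, j]] += 1
    n_kv[z[i, j], v] += 1

for _ in range(100):               # full Gibbs sweeps
    for i in range(N):
        for j in range(J):
            resample(i, j)
```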

Page 31:

3. Does the model make sense to humans? An objective measure of human understanding: the accuracy of the human classifier

• Participants' task: assign the ingredients of a specific dish (a new data point) to a cluster
• Each cluster is explained using either BCM or LDA

Page 32:

3. Does the model make sense to humans? An objective measure of human understanding: the accuracy of the human classifier

• 384 classification questions asked of 24 people
• Clusters explained using: 1. BCM: ingredients of the prototype recipe; 2. LDA: representative ingredients of each cluster
• Statistically significantly better performance with BCM (85.9% vs. 71.3%)

[Kim, Rudin, Shah NIPS 2014]
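As a quick sanity check on a gap of that size (assuming, purely for illustration, an even 192-question split per condition; the paper's own test may differ):

```python
from scipy.stats import chi2_contingency

n = 192                               # assumed questions per condition
correct_bcm = round(0.859 * n)        # 165 correct answers
correct_lda = round(0.713 * n)        # 137 correct answers
table = [[correct_bcm, n - correct_bcm],
         [correct_lda, n - correct_lda]]
chi2, p, _, _ = chi2_contingency(table)
print(f"chi2={chi2:.1f}, p={p:.4f}")  # p is well below 0.05 for this split
```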

Page 33:

Road map

1. Infer human team decisions from team planning conversation (infer decisions of humans)
2. Communication from machine to human: provide intuitive explanations (make sense to humans)
3. Communication from human to machine: incorporate feedback (interact with humans)

Page 34:

Why interactive?

Page 35:

Why interactive?

Page 36:

Why interactive?

Page 37:

Related work on interactive machine learning

• Interact via multiple model parameter settings [Patel 10, Amershi 15]
• Design smart interfaces [Amershi 11] and visualizations [Chaney 12, Gou 03]
• Interact via a simplified medium of interaction [Kapoor 10, Ware 01]

Our medium: prototypes and subspaces!

Page 38:

Interactive BCM (iBCM)

[Figure: graphical models of BCM and iBCM. Double-circled nodes represent interacted latent variables: nodes that receive information both from user feedback and from the data.]

Page 39:

Interactive BCM (iBCM)

[Figure: graphical models of BCM and iBCM, as on the previous slide]

Page 40:

Interactive BCM (iBCM): internal mechanism

1. Listen to users
2. Propagate user feedback to accelerate inference
3. Listen to data

Key: balance between what the data indicates and what makes the most sense to the user.

Our approach: decompose the Gibbs sampling steps to 1) adjust feedback propagation depending on the user's confidence, and 2) accelerate inference by rearranging latent variables. A sketch of the first idea follows below.
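One way to picture that balance (entirely illustrative; the paper's actual decomposition of the sampler is more involved than a linear blend): mix the data-driven Gibbs conditional with the user's preferred assignment, weighted by the user's confidence.

```python
import numpy as np

def feedback_weighted_probs(p_data: np.ndarray, p_user: np.ndarray,
                            confidence: float) -> np.ndarray:
    """Blend the data-driven conditional with the user's preference;
    confidence in [0, 1] sets how strongly feedback propagates."""
    p = (1.0 - confidence) * p_data + confidence * p_user
    return p / p.sum()

p_data = np.array([0.7, 0.2, 0.1])  # what the data indicates
p_user = np.array([0.0, 1.0, 0.0])  # user pinned this item to cluster 1
for c in (0.0, 0.5, 0.9):
    print(c, feedback_weighted_probs(p_data, p_user, c))
```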

Page 41:

User's workflow with iBCM (abstract domain)

• Click to change a feature's value
• Click to promote any item to be a prototype

Page 42:

Experiment procedure (24 participants, 192 questions)

1. Subjects are asked how they want to group items
2. Subjects view results from BCM (essentially one of the optimal clusterings)
3. Subjects indicate how well the results matched their preferred clustering
4. Subjects interact with iBCM
5. Subjects indicate how well the results match what they want

Page 43:

Experiment results (24 participants, 192 questions; procedure as on the previous slide)

Participants agreed more strongly that the final clusters matched their preferences compared to the initial clusters (Wilcoxon signed-rank test).
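For reference, this kind of paired before/after comparison can be run with SciPy's signed-rank test (the Likert ratings below are made up):

```python
from scipy.stats import wilcoxon

# Hypothetical paired agreement ratings: before (BCM) and after (iBCM).
before = [3, 2, 4, 3, 2, 3, 4, 2, 3, 3]
after  = [5, 4, 4, 5, 3, 4, 5, 4, 4, 5]
stat, p = wilcoxon(before, after)
print(f"W={stat}, p={p:.4f}")
```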

Page 44:

iBCM for introductory programming education

Why education?
• Teachers' current workflow for creating a grading rubric: randomly pick 4-5 assignments, then hodgepodge grading [Cross 99]
• Understanding the variation among submissions is important for providing appropriate, tailored feedback to students [Basu 13, Huang 13]

What are the challenges?
• Extracting the right features: OverCode [Glassman 15]

Page 45:

iBCM + OverCode system

Data: submissions from MIT introductory Python classes

Page 46:

iBCM + OverCode system

• Select/unselect subspaces (keywords)
• Promote/demote prototypes

Page 47:

iBCM experiment with domain experts

Task: explore the full spectrum of students' submissions and write down a 'discovery list' for a recitation.

[Figure: the iBCM interface vs. a baseline with a "Click here to get a new grouping" button]


Page 49:

Experiment with domain experts: results

• 48 problems explored by 12 subjects who had previously taught an introductory Python class
• Compared to BCM, participants agreed more strongly that with iBCM they (Wilcoxon signed-rank test, p < 0.001):
  • were more satisfied
  • better explored the full spectrum of students' submissions
  • better identified important features to expand the discovery list
  • found the important features and prototypes useful

Page 50:

Experiment with domain experts: results (setup and findings as on the previous slide)

Participant quotes:
• "[iBCM enabled me to] go in depth as to how students could do"
• "[iBCM] is useful with large datasets where brute-force would not be practical."

Page 51:

Summary

1. Infer decisions of humans. Inspiration: how humans make decisions. [Kim, Chacha, Shah AAAI 13] [Kim, Chacha, Shah JAIR 15]

2. Communication from machine to human: provide intuitive explanations (make sense to humans). Approach: a case-based Bayesian model. Results: provided intuitive explanations while maintaining performance. [Kim, Rudin, Shah NIPS 2014] [Kim, Patel, Rostamizadeh, Shah AAAI 2015]

3. Communication from human to machine: incorporate feedback (interact with humans). Approach: enable interaction by decomposing the sampling inference steps. Results: implemented and validated the approach in the education domain. [Kim, Glassman, Johnson, Shah submitted*]

Page 52:

Next steps

• Interpretability for data exploration: visualization
• Domain-specific interpretability: learning features that distinguish clusters
• Interactive machine learning for debugging models or exploring hyperparameters

[Figure: a misclassified data point. Doc id #24, predicted: politics, true label: medicine]

[Kim, Patel, Rostamizadeh, Shah AAAI 2015] [Kim, Doshi-Velez, Shah NIPS 2015]

Page 53:

Next steps at AI2

• Extend interpretability to initially uninterpretable features (neural nets)

[Figure: a 4th-grade science exam question]

Page 54:

Q&A

Summary:
1. Infer decisions of humans. Inspiration: how humans make decisions. [Kim, Chacha, Shah AAAI 13] [Kim, Chacha, Shah JAIR 15]
2. Communication from machine to human: provide intuitive explanations (make sense to humans). Approach: a case-based Bayesian model. Results: provided intuitive explanations while maintaining performance. [Kim, Rudin, Shah NIPS 2014] [Kim, Patel, Rostamizadeh, Shah AAAI 2015] [Kim, Doshi-Velez, Shah NIPS 2015]
3. Communication from human to machine: incorporate feedback (interact with humans). Approach: enable interaction by decomposing the sampling inference steps. Results: implemented and validated the approach in the education domain. [Kim, Glassman, Johnson, Shah submitted*]

AI2 is hiring research interns any time of the year. Shoot me an email if interested! [email protected]