Carolyn Penstein Rosé, Language Technologies Institute, Human-Computer Interaction Institute, School of Computer Science


With funding from the National Science Foundation and the Office of Naval Research

LightSIDE


lightsidelabs.com/research/


Click here to load a file


Select Heteroglossia as the predicted category


Make sure the text field is selected to extract text features from

Punctuation can be a “stand in” for mood: “you think the answer is 9?” versus “you think the answer is 9.”

Bigrams capture simple lexical patterns: “common denominator” versus “common multiple”

Trigrams (just like bigrams, but with 3 words next to each other): “Carnegie Mellon University”

POS bigrams capture syntactic or stylistic information: “the answer which is …” versus “which is the answer”

Line length can be a proxy for explanation depth
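The feature types above can be sketched in a few lines of Python. This is a minimal illustration, not LightSIDE's own extractor: the function names (`tokenize`, `ngrams`, `extract_features`) and the feature-name prefixes are hypothetical, and line length is assumed to be counted in tokens, punctuation included. POS bigrams are left out because they need a part-of-speech tagger.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into words and single punctuation marks."""
    return re.findall(r"[a-z0-9']+|[?.!,;:]", text.lower())

def ngrams(tokens, n):
    """Return every run of n adjacent tokens, joined with underscores."""
    return ["_".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def extract_features(text):
    """Binary unigram, bigram, and trigram features plus a line-length value."""
    tokens = tokenize(text)
    features = {}
    for n in (1, 2, 3):
        for gram in ngrams(tokens, n):
            features[f"{n}gram:{gram}"] = 1
    # Assumption: line length is measured in tokens, not characters.
    features["line_length"] = len(tokens)
    return features

question = extract_features("you think the answer is 9?")
statement = extract_features("you think the answer is 9.")
# Features that appear only in the question version of the line.
print(sorted(set(question) - set(statement)))
```

The two example lines differ only in the features that touch the final punctuation mark, which is the sense in which punctuation stands in for mood.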

Feature Space Customizations

“Contains non-stop word” can be a predictor of whether a conversational contribution is contentful: “ok sure” versus “the common denominator”

Removing stop words removes some distracting features

Stemming allows some generalization: multiple, multiply, multiplication

Removing rare features is a cheap form of feature selection: features that only occur once or twice in the corpus won’t generalize, so they are a waste of time to include in the vector space
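A rough sketch of these customizations follows. Everything in it is illustrative: the stop word list is a tiny hypothetical one (it assumes "ok" and "sure" count as stop words), `toy_stem` is a crude suffix stripper rather than a real stemming algorithm, and the rare-feature threshold of 3 documents is an assumed value.

```python
from collections import Counter

# Hypothetical stop word list, far smaller than any real one.
STOP_WORDS = {"ok", "sure", "the", "a", "an", "is", "of", "to", "and"}

def contains_non_stop_word(tokens):
    """True if at least one token is not a stop word."""
    return any(token not in STOP_WORDS for token in tokens)

def toy_stem(word):
    """Strip one of a few suffixes; a toy stand-in for a real stemmer."""
    for suffix in ("ication", "e", "y"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 4:
            return word[: -len(suffix)]
    return word

def prune_rare(docs, min_count=3):
    """Drop tokens that occur in fewer than min_count documents."""
    doc_freq = Counter(token for doc in docs for token in set(doc))
    keep = {token for token, count in doc_freq.items() if count >= min_count}
    return [[token for token in doc if token in keep] for doc in docs]

print(contains_non_stop_word(["ok", "sure"]))
print(contains_non_stop_word(["the", "common", "denominator"]))
print({toy_stem(w) for w in ("multiple", "multiply", "multiplication")})
```

Pruning by document frequency is only one possible reading of "occur once or twice in the corpus"; counting raw occurrences would be an equally reasonable choice.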



Think like a computer!

Machine learning algorithms look for features that are good predictors, not features that are necessarily meaningful

Look for approximations

If you want to find questions, you don’t need to do a complete syntactic analysis

Look for question marks

Look for wh-terms that occur immediately before an auxiliary verb
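The question heuristic above might look like the following. The word lists are assumptions chosen for illustration and are not exhaustive, and `looks_like_question` is a hypothetical name.

```python
import re

# Assumed, incomplete word lists for the approximation.
WH_TERMS = r"(?:who|what|when|where|why|which|how)"
AUXILIARIES = r"(?:is|are|was|were|do|does|did|can|could|will|would|should|has|have|had)"

# A wh-term immediately followed by an auxiliary verb.
WH_BEFORE_AUX = re.compile(rf"\b{WH_TERMS}\s+{AUXILIARIES}\b", re.IGNORECASE)

def looks_like_question(text):
    """Cheap question detector: a question mark, or a wh-term before an auxiliary."""
    return "?" in text or WH_BEFORE_AUX.search(text) is not None

for line in ("you think the answer is 9?",
             "what is the common denominator",
             "you think the answer is 9."):
    print(line, "->", looks_like_question(line))
```

Being an approximation, the pattern will also fire on some non-questions such as relative clauses, which is acceptable when the goal is a predictive feature rather than a correct parse.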


Click to extract text features


Select Logistic Regression as the Learner


Evaluate the result by cross-validation over sessions
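Cross-validation over sessions means that all lines from one session land in the same fold, so a model is never tested on a session it partly saw in training. The sketch below shows one way to build such folds; it is not how LightSIDE does it internally, and `session_folds` and the sample session IDs are hypothetical.

```python
from collections import defaultdict

def session_folds(session_ids, n_folds):
    """Yield (train_indices, test_indices) pairs, keeping each session in one fold."""
    by_session = defaultdict(list)
    for index, session in enumerate(session_ids):
        by_session[session].append(index)
    ordered = sorted(by_session)
    for fold in range(n_folds):
        # Every n_folds-th session, starting at this fold's offset, is held out.
        held_out = set(ordered[fold::n_folds])
        test = [i for s in ordered if s in held_out for i in by_session[s]]
        train = [i for s in ordered if s not in held_out for i in by_session[s]]
        yield train, test

# One session ID per instance (made-up values).
sessions = ["s1", "s1", "s2", "s3", "s3", "s2"]
for train, test in session_folds(sessions, n_folds=3):
    print("train:", train, "test:", test)
```

Any learner, logistic regression included, can then be trained on the `train` indices and scored on the `test` indices of each fold.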


Run the experiment


Stretchy Patterns (Gianfortoni, Adamson, & Rosé, 2011)

A sequence of 1 to 6 categories

May include GAPs

A GAP can cover any symbol

GAP+ may cover any number of symbols

Must not begin or end with a GAP
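The rules above can be turned into a small matcher. This is a sketch under stated assumptions, not the implementation from Gianfortoni, Adamson, & Rosé (2011): GAP is assumed to cover exactly one symbol, GAP+ is assumed to cover one or more symbols, the 1 to 6 length limit is assumed to count gaps as well as categories, and plain words stand in for categories. All names are hypothetical.

```python
GAP, GAP_PLUS = "GAP", "GAP+"

def is_valid_pattern(pattern):
    """1 to 6 elements (assumed to include gaps), no gap at either end."""
    gaps = (GAP, GAP_PLUS)
    return (1 <= len(pattern) <= 6
            and pattern[0] not in gaps
            and pattern[-1] not in gaps)

def matches_at(pattern, symbols, start):
    """True if the pattern matches the symbols beginning exactly at index start."""
    if not pattern:
        return True
    if start >= len(symbols):
        return False
    head, rest = pattern[0], pattern[1:]
    if head == GAP:
        # Assumption: GAP skips exactly one symbol.
        return matches_at(rest, symbols, start + 1)
    if head == GAP_PLUS:
        # Assumption: GAP+ skips one or more symbols.
        return any(matches_at(rest, symbols, nxt)
                   for nxt in range(start + 1, len(symbols) + 1))
    return symbols[start] == head and matches_at(rest, symbols, start + 1)

def contains_pattern(pattern, symbols):
    """True if a valid pattern matches anywhere in the symbol sequence."""
    if not is_valid_pattern(pattern):
        raise ValueError("invalid stretchy pattern")
    return any(matches_at(pattern, symbols, i) for i in range(len(symbols)))

words = "would have been more interesting".split()
print(contains_pattern(["would", GAP_PLUS, "interesting"], words))
```

A single pattern with a GAP+ covers many surface strings that would otherwise need separate n-gram features, which is the appeal of the representation.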


Now it’s your turn!

We’ll explore some advanced features and error analysis after the break!

Error Analysis Process

Identify large error cells

Make comparisons

Ask yourself how an error is similar to the instances that were correctly classified with the same class (vertical comparison)

Ask how it is different from the correctly classified instances of the class it should have been assigned (horizontal comparison)
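The process above can be sketched as code. The sketch assumes a confusion matrix with actual labels as rows and predicted labels as columns, so a vertical comparison stays in the predicted-label column and a horizontal comparison stays in the actual-label row. The function names and the sample labels are hypothetical.

```python
from collections import Counter

def confusion_counts(actual, predicted):
    """Count instances per (actual, predicted) cell."""
    return Counter(zip(actual, predicted))

def largest_error_cell(actual, predicted):
    """Return the off-diagonal cell with the most instances, or None."""
    errors = {cell: n for cell, n in confusion_counts(actual, predicted).items()
              if cell[0] != cell[1]}
    return max(errors, key=errors.get) if errors else None

def comparison_sets(actual, predicted, cell):
    """Instance indices for an error cell and its two comparison cells."""
    true_label, wrong_label = cell
    pairs = list(zip(actual, predicted))

    def pick(a, p):
        return [i for i, pair in enumerate(pairs) if pair == (a, p)]

    return {
        "errors": pick(true_label, wrong_label),
        # Same column: instances correctly given the label the errors received.
        "vertical": pick(wrong_label, wrong_label),
        # Same row: instances of the true class that were classified correctly.
        "horizontal": pick(true_label, true_label),
    }

actual = ["pos", "pos", "pos", "neg", "neg", "neg"]
predicted = ["pos", "neg", "neg", "neg", "neg", "pos"]
cell = largest_error_cell(actual, predicted)
print(cell, comparison_sets(actual, predicted, cell))
```

Reading the texts behind the three index lists side by side is what suggests new features, for example a pattern that separates the errors from their vertical neighbors.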

Positive, Negative

Error Analysis on Development Set


Positive: is interesting, an interesting scene

Negative: would have been more interesting, potentially interesting, etc.

What’s different?


* Note that in this case we get no benefit if we use feature selection over the original feature space.

Feature Splitting (Daumé III, 2007)


[Figure: the General feature space is split into Domain A, Domain B, and General copies]

Why is this nonlinear?

It represents the interaction between each feature and the Domain variable

Now that the feature space represents the nonlinearity, the algorithm to train the weights can be linear.
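A minimal sketch of feature splitting, in the spirit of Daumé III (2007), is shown below. The function name `split_features` and the `general:` and domain prefixes are hypothetical naming choices. Each feature gets a general copy and a copy tagged with the instance's domain; the tagged copy equals the feature value multiplied by a domain indicator, which is the interaction term.

```python
def split_features(features, domain):
    """Return a general copy and a domain-specific copy of every feature."""
    augmented = {}
    for name, value in features.items():
        augmented[f"general:{name}"] = value
        # Nonzero only for instances from this domain: feature x domain indicator.
        augmented[f"{domain}:{name}"] = value
    return augmented

instance_a = split_features({"interesting": 1}, domain="A")
instance_b = split_features({"interesting": 1}, domain="B")
print(instance_a)
print(instance_b)
```

A linear learner trained on the augmented space can give "interesting" one weight shared across domains and a separate adjustment per domain, without the learner itself being nonlinear.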


Healthcare Bill Dataset
