We Are Humor Beings: Understanding and Predicting Visual Humor

We Are Humor Beings: Understanding andPredicting Visual Humor

Shuai Wang

University of Toronto

March 29, 2016

1 / 31

Intro

I An integral part but not understood in detail

I An adult laughs 18 times a dayI A good sense humor

I is related to communication competenceI helps raise an individual’s social status & popularityI even helps attract compatible matesI makes yourself happier :)

2 / 31

Intro


I An adult laughs 18 times a day

I A good sense humorI is related to communication competenceI helps raise an individual’s social status & popularityI even helps attract compatible matesI makes yourself happier :)

3 / 31

Intro


I An adult laughs 18 times a dayI A good sense humor

I is related to communication competenceI helps raise an individual’s social status & popularityI even helps attract compatible matesI makes yourself happier :)

4 / 31

What makes an image funny?

5 / 31

Humor Techniques

I Animal doing something unusual

I Person doing something unusual

I Somebody getting hurt

I Somebody getting scared

6 / 31

Animal doing something unusual

7 / 31

Person doing something unusual

8 / 31

Somebody getting hurt

9 / 31

Somebody getting scared

10 / 31

Changing objects can alter the funniness of a scene

11 / 31

Removing Incongruities

An elderly person kicking afootball while skateboarding isincongruous, but a young girldoing so is not

12 / 31

Adding Incongruities

Add incongruities (and humor)by replacing the expected withthe unexpected

13 / 31

Two Tasks to Understand Visual Humor

I Predicting how funny a given scene is (scene-level)

I Changing the funniness of a scene (object-level)

14 / 31

Object-level Features

I Object embedding (150-d): captures the context in whichan object usually occurs

I Local embedding (150-d): weighted sum of objectembeddings of all other instances

15 / 31

Scene-level Features

I Cardinality (150-d): bag-of-words representation of howmany instances of each object are in the scene

I Location (300-d): horizontal and vertical coordinates of everyobject (closest to the center if multiple instance)

I Scene embedding (150-d): sum of object embeddings of allobjects in the scene

16 / 31

Predicting Funniness Score

I Dataset: 6,400 scenes, with funny score from 1-5 labelled byworkers from Amazon Mechanical Turk

I Support Vector Regressor (SVR) on scene-level features

I Metric: average relative error

17 / 31





18 / 31





19 / 31

Predicting Funniness Score: Ablation Analysis

Different feature subsets perform about the same: slightly betterthan baseline (average score of the training scenes)

20 / 31

Alter Funniness of a Scene

I Detect the objects that do (or do not) contribute to humor

I Identify which objects should replace the objects from step 1

21 / 31

Predicting Objects to be Replaced

I On average, the model replaces 3.67 objects (2.54 groundtruth) → this bias towards replace ensures a large ‘margin’

I Animate objects like humans and animals are more likelysources of humor → tends to replace these objects

22 / 31

Predicting Objects to be Replaced

I On average, the model replaces 3.67 objects (2.54 groundtruth) → this bias towards replace ensures a large ‘margin’

I Animate objects like humans and animals are more likelysources of humor → tends to replace these objects

23 / 31

Funny → Unfunny

Old man dancing → young boy dancingHawk stealing meat → baseball

24 / 31

Funny → Unfunny

Cute puppy → InsectWatermelon → Ax

25 / 31

Unfunny → Funny

Couple having dinner at the table → Puppies having dinner at thetable

26 / 31

Unfunny → Funny

Cating playing around → Racoon driving motorcycle

27 / 31

Discussion

I Style/genre of an image or painting can make a difference

I Dataset is small: 6,400 images

I Feature representation can be improved

28 / 31

Discussion




29 / 31

Discussion




30 / 31

31 / 31

Documents

We Are Humor Beings: Understanding and Predicting Visual Humor