Cross-Domain Scruffy Inferencealumni.media.mit.edu/~kcarnold/scruffy-inference/... · Incorporating...

Cross-Domain Scruffy Inference

Kenneth C. Arnold Henry Lieberman

MIT Mind Machine ProjectSoftware Agents Group, MIT Media Laboratory

AAAI Fall Symposium on Common Sense KnowledgeNovember 2010

Kenneth C. Arnold, Henry Lieberman (MIT) Cross-Domain Scruffy Inference CSK 2010 1 / 16

Vision

Informal (“scruffy”) inductive reasoning over non-formalizedknowledgeUse multiple knowledge bases without tedious alignment.

We’ve Done This. . .

ConceptNet and WordNet (Havasi et al. 2009)Topics and Opinions in Text (Speer et al. 2010)Code and Descriptions of Purpose (Arnold and Lieberman 2010)

but how does it work?

This Talk

BackgroundBlending is Collective Matrix Factorization.Singular vectors rotate.Other blending layouts work too.

Background

Matrix Representations of Knowledge

Background

Factored Inference

Filling in missing values is inference.

Background

Factored Inference

Represent each concept i and each feature j by k -dimensionalvectors ~ci and ~fj such that when A(i , j) is known,

A(i , j) ≈ ~ci ·~fj .

If A(i , j) is unknown, infer ~ci ·~fj .Equivalently, stack each ~ci in rows of C, same for F , then

A ≈ CF T .

Collective Factorization

Quantifying factorization quality

Quantify the “≈” in A ≈ CF T as a divergence:

D(XY T |A)

Minimizing loss ensures that the factorization fits the dataMany functions possible, e.g., SVD minimizes squared error:

Dx2(A|A) =∑

(aij − aij)2.

Collective Matrix Factorization

An analogy. . .Let people p rate restaurants r , represented by positive ornegative values in ‖p‖ × ‖r‖ matrix A.Restaurants also have characteristics c (e.g., “serves vegetarianfood”, “takes reservations”, etc.), represented by matrix B.Incorporating characteristics may improve rating prediction.Use the same restaurant vector to factor preferences andcharacteristics:

A ≈ PRT B ≈ RCT

Collective Matrix Factorization

A ≈ PRT B ≈ RCT

(A is person by restaurant, B is restaurant by characteristics)

Collective Matrix Factorization (Singh and Gordon 2008) gives aframework for solving this type of problemSpread out the approximation loss:

αD(PRT |A) + (1− α)D(RCT |B)

At α = 1, factors as if characteristics were just patterns of ratings.At α = 0, factors as if only qualities, not individual restaurants,mattered for ratings.

Blending is a CMF

A ≈ PRT B ≈ RCT

(A is person by restaurant, B is restaurant by characteristics)

Can also solve with Blending:

(1− α)B

]≈ R

If decomposition is SVD, loss is seperable by component:

= D(RPT |αAT ) + D(RCT |(1− α)B)

⇒ Blending is a kind of Collective Matrix Factorization

Blended Data Rotates the Factorization

Veering

What happens at an intersection point?Consider you’re blending X and Y . Start with X ≈ ABT ; whathappens as you add in Y?First add in the new space that only Y covered.Now data is off-axis, so rotate the axes to align with the data.

−1.5 −1.0 −0.5 0.0 0.5 1.0 1.5

−1.0

−0.5

1.0a = 1, θ = 0π

a = 1, θ = 0.25π

a = 1, θ = 0.5π

a = 0, θ = 0.5π

Veering

“Veering” is caused by singular vectors of the blend rotating betweencorresponding singular vectors of the source matrices.

0.0 0.5 1.0 1.5 2.0α

θU = 1.1

θU = 0.26

θU = 1.3

Layout Tricks

Bridge Blending

EnglishConceptNet

FrenchConceptNet

English Features French Concepts

En ↔ Fr

Dictionary

General bridge blend:[X Y

]≈[UXYU0Z

] [VX0VYZ

[UXY V T

X0 UXY V TYZ

U0Z V TX0 U0Z V T

]Again, loss factors:

D(A|A) =D(UXY V TX0|X ) + D(UXY V T

YZ |Y )+

D(U0Z V TX0|0) + D(U0Z V T

YZ |Z )

VYZ ties factorization of X and Z togetherthrough bridge data Y .Could use weighted loss in empty corner.

Summary

Blending is a Collective Matrix Factorization“Veering” indicates singular vectors rotating between datasets

What’s next?CMF permits many objective functions, even different ones fordifferent input data. What’s appropriate for commonsenseinference?Incremental?Can CMF do things we thought we needed 3rd-order for?

Cross-Domain Scruffy Inferencealumni.media.mit.edu/~kcarnold/scruffy-inference/... · Incorporating...

Documents

I Know What You Want to Express: Sentence Element ... · I Know What You Want to Express: Sentence Element Inference by Incorporating External Knowledge Base Xiaochi Wei, Heyan Huang,

Variational Inference - Marc DeisenrothVariational Inference Variational inference is the mostscalable inference method available (at the moment) Can handle (arbitrarily) large datasets

file · Web viewBaggy Smart Stylish Trendy Old – fashioned Preppy. Scruffy Tight Sporty

Unit 5: Inference for categorical variables Lecture 1: Inference ...tjl13/s101/slides/unit5lec1H.pdfUnit 5: Inference for categorical variables Lecture 1: Inference for proportions

ZORRO : A masking program for incorporating Alignment Accuracy in Phylogenetic Inference Sourav Chatterji Martin Wu

4 : Exact Inference: Variable Elimination 1 Probabilistic Inference

Envisioning a Robust, Scalable Metacognitive Architecture ...web.media.mit.edu/~kcarnold/scruffy-metacog/alonso-2010-AAAI10-MetacogSlides.pdfEnvisioning a Robust, Scalable Metacognitive

Scruffy Dog Concept Design

Incorporating Structural Breaks in GARCH Models · Introduction Bayesian Inference Marginal Likelihood Simulations Empirics Varia Extensions - p. 3/44 Motivation "Simple" GARCH models,

Unit 5: Inference for categorical variables Lecture 1: Inference ...Unit 5: Inference for categorical variables Lecture 1: Inference for proportions - theoretical Statistics 101 Mine

Direct Inference and Inverse Inference Teddy … Inference and Inverse Inference Teddy Seidenfeld The Journal of ... section 11 I consider Ian Hacking's novel ... Direct Inference

Let’s get scruffy with some of Dublin’s horrible histories!

Variational Inference and Mean Field · Variational Inference “Variational inference”: Formulate inference problem as constrained optimization. Approximate the function or constraintsto

robynbegalkeportfolio.weebly.comrobynbegalkeportfolio.weebly.com/.../nursing_research… · Web viewMale. 78 years old. English speaking. Grey hair, scruffy face, thin, with a newspaper

Statistical Inference Two Statistical Tasks 1. Description 2. Inference

My Dog Looks Really Scruffy – Why? Dog Skin Problems

Incorporating activelearningintoyourclassroom

STATS 200: Introduction to Statistical Inference 200: Introduction to Statistical Inference ... Statistical inference Statistical inference = Probability 1 ... STATS 200: Introduction

PINK StoryGreenWordCards 01 · Pink Storybook 1 Scruffy Ted Story Green Words Pink Storybook 1 Scruffy Ted Story Green Words Pink Storybook 1 Scruffy Ted Story Green Words FOLDAnn

Inference. Overview The MC-SAT algorithm Knowledge-based model construction Lazy inference Lifted inference