Hierarchies in representations: Latent structure discovery and Dimensionality reduction. Amitabha Mukerjee, IIT Kanpur, India

Page 1: Hierarchies in representations

Hierarchies in representations

Latent structure discovery and Dimensionality reduction

Amitabha Mukerjee

IIT Kanpur, India

Page 2: Hierarchies in representations

Tacit Knowing (Thinking fast)

Latent Relations

Page 3: Hierarchies in representations

images: 100 x 100 pixels

Ack: A. Efros; original images from Hormel Corp.

Latent Representation

Page 4: Hierarchies in representations


Manifolds in language

• grounded syntax:

[Nayak and Mukerjee AAAI-12]

Page 5: Hierarchies in representations

Visuo-Motor expertise

[A. van der Meer, 1997: Keeping the arm in the limelight]

Newborns (10-24 days)

Small weights tied to wrists

Will resist weights to move the arm they can see

Will let it droop if they can't see it

In a darkened room, works hard to position the arm in a narrow beam of light

Page 6: Hierarchies in representations

Simulation

Page 7: Hierarchies in representations
Page 8: Hierarchies in representations
Page 9: Hierarchies in representations
Page 10: Hierarchies in representations

Robot self-discovery

How does a robot brain discover the world?

Page 11: Hierarchies in representations

Camera Motion and Shape via Factorization

Page 12: Hierarchies in representations

Homogeneous Coordinates

• 2-D point (x,y) represented as (x1,x2,x3) in homogeneous system

x = x1/x3, y = x2/x3

• (x1,x2,x3) same as (kx1,kx2,kx3) same as (x1/x3,x2/x3,1)

• Similarly for 3-D point – (x1,x2,x3,x4)
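
The scale invariance above can be checked numerically. The following is a minimal sketch, assuming NumPy; the helper names `to_homogeneous` and `from_homogeneous` are illustrative and not from the slides.

```python
import numpy as np

def to_homogeneous(p):
    """Append a 1 to a Cartesian point (works for 2-D or 3-D)."""
    return np.append(np.asarray(p, dtype=float), 1.0)

def from_homogeneous(h):
    """Divide by the last coordinate and drop it."""
    h = np.asarray(h, dtype=float)
    return h[:-1] / h[-1]

p = to_homogeneous([3.0, 4.0])   # (3, 4, 1)
scaled = 2.5 * p                 # (k*x1, k*x2, k*x3) with k = 2.5
recovered = from_homogeneous(scaled)
```

Both `p` and `scaled` map back to the same 2-D point, which is the equivalence stated in the second bullet.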

Page 13: Hierarchies in representations

Camera Models

• Orthographic

• Weak perspective/Scaled orthographic

• Para-perspective

• Perspective

• Affine

• Projective

Page 14: Hierarchies in representations

Camera Models

• x = (u, v, 1)^T or (x1, x2, x3)^T => homogeneous image coordinates

• X = (x, y, z, 1)^T or (X1, X2, X3, X4)^T => homogeneous 3D coordinates

x (3x1) = T (3x4) X (4x1)

Page 15: Hierarchies in representations

Orthographic Camera Model

u = x, v = y

Page 16: Hierarchies in representations

Orthographic Camera Model

Page 17: Hierarchies in representations

Perspective Camera Model

u = fx/Z, v = fy/Z
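
The orthographic and perspective models can both be written as a 3x4 matrix T applied to a homogeneous 3-D point. The sketch below assumes NumPy and a camera at the origin looking along the depth axis; the matrices and the helper `project` are illustrative choices, not taken from the slides.

```python
import numpy as np

# Orthographic camera: drops depth, so u = x and v = y.
T_ortho = np.array([[1.0, 0.0, 0.0, 0.0],
                    [0.0, 1.0, 0.0, 0.0],
                    [0.0, 0.0, 0.0, 1.0]])

def perspective_matrix(f):
    """Perspective camera with focal length f: u = f x / z, v = f y / z."""
    return np.array([[f, 0.0, 0.0, 0.0],
                     [0.0, f, 0.0, 0.0],
                     [0.0, 0.0, 1.0, 0.0]])

def project(T, X):
    """Apply x = T X, then convert homogeneous image coordinates to (u, v)."""
    x = T @ X
    return x[:2] / x[2]

X = np.array([2.0, 1.0, 4.0, 1.0])                 # 3-D point (2, 1, 4), homogeneous
uv_ortho = project(T_ortho, X)                     # depth is ignored
uv_persp = project(perspective_matrix(2.0), X)     # coordinates scaled by f / depth
```

The only difference between the two models here is the last row of T: it picks out the constant 1 for orthography and the depth for perspective.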

Page 18: Hierarchies in representations

Perspective Camera Model

Page 19: Hierarchies in representations

Weak Perspective or Scaled Orthographic

Page 20: Hierarchies in representations

Affine Camera Model

Page 21: Hierarchies in representations

Projective Camera

Page 22: Hierarchies in representations

Structure from Motion by Factorization

Tomasi, Carlo, and Takeo Kanade. 1992. "Shape and motion from image streams under orthography: a factorization method."

Page 23: Hierarchies in representations

Factorization Method

• Given: Image stream, where P points have been tracked over F frames

• Image coordinates of p-th point in f-th frame = (u_fp, v_fp)

• W = 2F x P measurement matrix

• Column => observations for one point

• Row => observations for one frame

Page 24: Hierarchies in representations

Factorization Method : Orthography

• W = M (2Fx3) S (3xP) + t (2Fx1) [1 … 1] (1xP)

• Translation vector t: element f = mean of row f in W

• Registered measurement matrix W*: W* = W - t [1 … 1] = M S (rank 3)

• M (2Fx3): row f = camera horiz / vert axes in frame f

• S (3xP): column p = (x, y, z) of tracked point p

OBJECTIVE: obtain M and S from W* via SVD
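
The rank-3 claim can be verified on synthetic data. This is a minimal sketch, assuming NumPy, random 3-D points, and random orthogonal matrices as camera orientations. The row layout (u-coordinates in rows 0..F-1, v-coordinates in rows F..2F-1) is an assumption of this example.

```python
import numpy as np

rng = np.random.default_rng(0)
F, P = 6, 20                                  # frames, tracked points (toy sizes)

S_true = rng.normal(size=(3, P))              # 3-D points, one column per point
W = np.zeros((2 * F, P))
for f in range(F):
    # Random orthogonal matrix via QR; its first two rows act as the camera axes.
    R, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    offset = rng.normal(size=(2, 1))          # image-plane translation for frame f
    uv = R[:2] @ S_true + offset              # orthographic projection, shape (2, P)
    W[f] = uv[0]                              # u-coordinates in row f
    W[F + f] = uv[1]                          # v-coordinates in row F + f

t = W.mean(axis=1, keepdims=True)             # element f = mean of row f of W
W_star = W - t                                # W* = W - t [1 ... 1]
rank_W_star = np.linalg.matrix_rank(W_star)
```

Subtracting the row means removes the translation term, so what remains is a product of a 2F x 3 matrix and a 3 x P matrix.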

Page 25: Hierarchies in representations

Hotel Stream Data

frame 1

Page 26: Hierarchies in representations

frame 50

Hotel Stream Data

Page 27: Hierarchies in representations

frame 100

Hotel Stream Data

Page 28: Hierarchies in representations

Tracked Features

Page 29: Hierarchies in representations

Tracking

Page 30: Hierarchies in representations

Factorization Method : Orthography

• Creation of measurement matrix W

• Obtaining registered measurement matrix W*

• Performing SVD to obtain W* = U Σ V^T = M S

• M = U Σ^0.5, S = Σ^0.5 V^T … not a unique solution

• M S or (M A)(A^-1 S) … both are solutions, for any invertible 3x3 matrix A

• Constraints to obtain unique A

• Directions can be aligned to camera directions in first frame
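
The SVD step and the ambiguity it leaves can be sketched as follows, assuming NumPy. The registered matrix here is a random rank-3 product standing in for real tracks, and the matrix `A` is an arbitrary invertible example. The sketch stops before the constraint step that selects a unique A.

```python
import numpy as np

def factorize(W_star):
    """Rank-3 factorization W* = M S via SVD (defined only up to a 3x3 matrix A)."""
    U, s, Vt = np.linalg.svd(W_star, full_matrices=False)
    root = np.sqrt(s[:3])                # square roots of the top 3 singular values
    M = U[:, :3] * root                  # scales the columns of U, shape (2F, 3)
    S = root[:, None] * Vt[:3]           # scales the rows of V^T, shape (3, P)
    return M, S

# Toy registered matrix: random rank-3 product (hypothetical data).
rng = np.random.default_rng(1)
W_star = rng.normal(size=(12, 3)) @ rng.normal(size=(3, 20))

M, S = factorize(W_star)

# Any invertible A yields another valid factorization (M A)(A^-1 S).
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 3.0]])
M_alt, S_alt = M @ A, np.linalg.inv(A) @ S
```

Both `M @ S` and `M_alt @ S_alt` reproduce `W_star`, which is why extra constraints on the camera axes are needed to fix A.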

Page 31: Hierarchies in representations

SVD

Page 32: Hierarchies in representations

Camera Motion Estimation [M]

error < 0.3 degrees

Page 33: Hierarchies in representations

error < 0.35 degrees

Camera Motion Estimation [M]

Page 34: Hierarchies in representations

3D Shape Estimation [S]

Page 35: Hierarchies in representations

Multi-body Factorization Method

• Multiple objects in a video sequence moving independently

• Earlier approach => from object frame (as if object stationary and camera moving)

• Common representation required for all objects in multi-object scenario

• Why not look from the camera frame?

• (Moving camera + static object) is equivalent to (static camera + moving object)

• Unique for all objects for a frame

Page 36: Hierarchies in representations

Multi-body Factorization Method

• Points from all objects collectively taken

• Expected rank <= 4 * number of objects

• Shape Interaction Matrix

Q = V V^T

• Q independent of camera orientation, coordinate transformation

• Permuting columns of V does not change values of Q matrix (only row/column number changed in same way as V permutation)

• Only interactions between points from the same object are non-zero.

• Block-diagonalization to obtain the points corresponding to same object
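
The block structure of Q can be illustrated on synthetic tracks. This is a sketch under strong assumptions: NumPy, noise-free data, and a random 2F x 4 matrix per object standing in for its rotation and translation over the frames (not real camera geometry).

```python
import numpy as np

rng = np.random.default_rng(2)
F, P1, P2 = 8, 6, 5                           # frames and points per object (toy sizes)

def object_tracks(P):
    """Tracks of one independently moving object: (2F x 4) motion times (4 x P) shape."""
    motion = rng.normal(size=(2 * F, 4))      # stand-in for rotation + translation
    shape = np.vstack([rng.normal(size=(3, P)), np.ones((1, P))])  # homogeneous points
    return motion @ shape

# Columns 0..P1-1 belong to object 1, the remaining columns to object 2.
W = np.hstack([object_tracks(P1), object_tracks(P2)])

r = np.linalg.matrix_rank(W)                  # expected <= 4 * number of objects
_, _, Vt = np.linalg.svd(W, full_matrices=False)
V = Vt[:r].T                                  # P x r matrix of right singular vectors
Q = V @ V.T                                   # shape interaction matrix

cross = np.abs(Q[:P1, P1:]).max()             # interaction between different objects
within = np.abs(Q[:P1, :P1]).max()            # interaction inside object 1
```

Because the columns are already grouped by object here, Q is block diagonal as built. With shuffled columns, the same zero pattern appears after a row/column permutation, which is what the block-diagonalization step searches for.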

Page 37: Hierarchies in representations

Multi-body Factorization Method

Page 38: Hierarchies in representations

Aspects of Factorization

• All images treated uniformly

• Alternative approaches depend on initialization and may do poorly if it is improper.

• Convergence guaranteed by numerical approach

• In multi-object factorization, can determine number of objects and point clusters

Page 39: Hierarchies in representations

Unsupervised Action Learning and Anomaly Detection

Page 40: Hierarchies in representations

3.2. Topic Modelling

• Given: Document and Vocabulary

* Document is histogram over vocabulary

• Goal: Identify topics in a given set of Documents

Idea: Topics are latent variables

Alternate view :

• Clustering in topic space

• Dimensionality reduction

Page 41: Hierarchies in representations

3.3. Models in practice

• LSA: Non-parametric clustering into topics using SVD.

• pLSA: Learns probability distribution over fixed number of topics; graphical model based approach.

• LDA: Extension of pLSA with Dirichlet prior for topic distribution. Fully generative model.
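
As an illustration of the pLSA entry, the following is a minimal EM sketch, assuming NumPy and a small dense count matrix. The function name `plsa`, the iteration count, and the toy corpus are assumptions of this example, not the configuration used in the talk.

```python
import numpy as np

def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit pLSA by EM. counts is a (documents x words) matrix of word counts.

    Returns P(z|d) with shape (D, K) and P(w|z) with shape (K, V).
    """
    rng = np.random.default_rng(seed)
    D, V = counts.shape
    p_z_d = rng.random((D, n_topics))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    p_w_z = rng.random((n_topics, V))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        # E-step: responsibilities P(z|d,w), shape (D, K, V).
        joint = p_z_d[:, :, None] * p_w_z[None, :, :]
        resp = joint / (joint.sum(axis=1, keepdims=True) + 1e-12)
        # M-step: re-estimate both distributions from expected counts.
        expected = counts[:, None, :] * resp
        p_w_z = expected.sum(axis=0)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + 1e-12
        p_z_d = expected.sum(axis=2)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + 1e-12
    return p_z_d, p_w_z

# Toy corpus (hypothetical): two documents use words 0-1, two use words 2-3.
counts = np.array([[5, 4, 0, 0],
                   [6, 3, 0, 0],
                   [0, 0, 4, 5],
                   [0, 0, 3, 6]], dtype=float)
p_z_d, p_w_z = plsa(counts, n_topics=2)
```

The number of topics is fixed in advance, as the pLSA bullet notes. EM only finds a local optimum, so results can depend on the random initialization.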

Page 42: Hierarchies in representations

4. Vision to NLP : Notations

Text Analysis        | Video Analysis

Vocabulary of words  | Vocabulary of visual words

Text documents       | Video clips

Topics               | Actions/Events

Page 43: Hierarchies in representations

5.1. Video Clips

• 45 minute video footage of traffic available

• 25 frames per second

• 4 kinds of anomaly

• Divided into fixed-length clips of, say, 4-6 seconds
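
For a sense of scale, the frame and clip counts follow from the numbers above. The 5-second clip length below is an assumed midpoint of the stated 4-6 second range.

```python
FPS = 25                         # frames per second, from the slide
footage_seconds = 45 * 60        # 45 minutes of footage
clip_seconds = 5                 # assumed; the slide says 4-6 seconds

total_frames = footage_seconds * FPS
frames_per_clip = clip_seconds * FPS
num_clips = total_frames // frames_per_clip
```

Under this assumption the footage yields 67,500 frames and 540 clips of 125 frames each.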

Page 44: Hierarchies in representations

5.2. Visual Words

• Each frame is 288 x 360

• Frame is divided into 15 x 18 parts, each part containing 400 pixels

• Features

– Optical flow

– Object size

• Background subtraction is performed on each frame to obtain the objects in foreground. Features are computed only for these objects

• Foreground objects consist of vehicles, pedestrians and cyclists

Page 45: Hierarchies in representations

5.2. Visual Words (contd.)

• Foreground pixels then divided into “big” and “small” blobs (connected components)

• Optical flow computed on foreground

• Flow vector quantised into 5 values :

– Static

– Dynamic: up, down, left and right

• 15 x 18 x 5 x 2 = 2700 different “words” obtained (grid cells x flow values x blob sizes)
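
One way to index such a vocabulary is sketched below, assuming NumPy and the 15 x 18 cell grid stated on the previous slide. The flow threshold, the bin ordering, and the function names are hypothetical choices made for illustration.

```python
import numpy as np

GRID_ROWS, GRID_COLS = 15, 18        # cell grid from the previous slide
N_FLOW, N_SIZE = 5, 2                # static/up/down/left/right; small/big
VOCAB_SIZE = GRID_ROWS * GRID_COLS * N_FLOW * N_SIZE

def quantise_flow(dx, dy, static_thresh=0.5):
    """Map a flow vector to 0=static, 1=up, 2=down, 3=left, 4=right.

    Image convention: y grows downward, so dy < 0 means upward motion.
    static_thresh is a hypothetical magnitude cutoff.
    """
    if np.hypot(dx, dy) < static_thresh:
        return 0
    if abs(dy) >= abs(dx):
        return 1 if dy < 0 else 2
    return 3 if dx < 0 else 4

def word_id(row, col, flow, size):
    """Flatten (cell row, cell col, flow bin, size bin) into one word index."""
    return ((row * GRID_COLS + col) * N_FLOW + flow) * N_SIZE + size

def clip_histogram(words):
    """A clip (the 'document') is a histogram of word counts over the vocabulary."""
    return np.bincount(np.asarray(words, dtype=int), minlength=VOCAB_SIZE)

example = word_id(row=3, col=7, flow=quantise_flow(2.0, 0.3), size=1)
hist = clip_histogram([example, example, word_id(0, 0, 0, 0)])
```

Each clip then becomes a fixed-length count vector, which is the document representation a topic model expects.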

Page 46: Hierarchies in representations

5.3. Foreground Extraction

Page 47: Hierarchies in representations

5.4. Optical Flow: Heat Map

Page 48: Hierarchies in representations

6. Modelling pLSA

Page 49: Hierarchies in representations

7. Results Demo

• 3 clips

• 3 different types of anomalies

Page 50: Hierarchies in representations

8. Results (ROC plot)

Page 51: Hierarchies in representations

8. Results (PR curve)

Page 52: Hierarchies in representations

Motion Planning

Page 53: Hierarchies in representations

Motion planning

Given start / goal image, map to manifold using local interpolation

Use k-nn connectivity in manifold as “roadmap” for motion planning

Page 54: Hierarchies in representations

Obstacle modeling by node deletion

If obstacle intersects robot in image space → delete corresponding nodes from “visual roadmap”
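
The roadmap idea can be sketched with a k-nearest-neighbour graph and a shortest-path search. This is a minimal illustration, assuming NumPy and a toy 4 x 4 grid of points in place of real manifold coordinates. Dijkstra's algorithm is a choice made here, since the slides do not name the search method, and the blocked node ids are hypothetical.

```python
import heapq
import numpy as np

def knn_roadmap(points, k):
    """Connect each node to its k nearest neighbours (undirected, weighted by distance)."""
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    graph = {i: {} for i in range(len(points))}
    for i in range(len(points)):
        for j in np.argsort(dist[i])[1:k + 1]:      # index 0 is the node itself
            graph[i][int(j)] = float(dist[i, j])
            graph[int(j)][i] = float(dist[i, j])
    return graph

def delete_nodes(graph, blocked):
    """Model an obstacle by removing the nodes it collides with."""
    blocked = set(blocked)
    return {u: {v: w for v, w in nbrs.items() if v not in blocked}
            for u, nbrs in graph.items() if u not in blocked}

def shortest_path(graph, start, goal):
    """Dijkstra search; returns the node sequence, or None if goal is unreachable."""
    queue, parent, best = [(0.0, start)], {start: None}, {start: 0.0}
    while queue:
        d, u = heapq.heappop(queue)
        if u == goal:
            path = []
            while u is not None:
                path.append(u)
                u = parent[u]
            return path[::-1]
        if d > best[u]:
            continue
        for v, w in graph[u].items():
            if d + w < best.get(v, float("inf")):
                best[v], parent[v] = d + w, u
                heapq.heappush(queue, (d + w, v))
    return None

# Toy "manifold": a 4 x 4 grid of low-dimensional coordinates (hypothetical data).
points = np.array([[x, y] for y in range(4) for x in range(4)], dtype=float)
roadmap = knn_roadmap(points, k=4)
free_path = shortest_path(roadmap, 0, 3)
detour = shortest_path(delete_nodes(roadmap, [1, 2]), 0, 3)
```

Deleting nodes 1 and 2 forces the path between nodes 0 and 3 to leave the bottom row, which mirrors replanning when an obstacle appears in image space.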

Page 55: Hierarchies in representations

Path planning as obstacle moves

Page 56: Hierarchies in representations

Mobile robots

Page 57: Hierarchies in representations
Page 58: Hierarchies in representations

Residual error : disk robot

Page 59: Hierarchies in representations

Robot Motion Planning

Source

Destination

Page 60: Hierarchies in representations

Path Planning Interface

Page 61: Hierarchies in representations
Page 62: Hierarchies in representations
Page 63: Hierarchies in representations

Real robots

Page 64: Hierarchies in representations

SCARA arm

Page 65: Hierarchies in representations

SCARA arm : degrees of freedom

Page 66: Hierarchies in representations

Background Subtraction : Robot image

learned background

foreground (moving part)
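
A simple form of background subtraction is sketched below, assuming NumPy. The slides do not say how the background is learned, so the per-pixel median and the threshold value are assumptions of this example.

```python
import numpy as np

def learn_background(frames):
    """Per-pixel median over a stack of frames (T x H x W) as the background model."""
    return np.median(frames, axis=0)

def foreground_mask(frame, background, threshold=30.0):
    """Pixels that differ from the background by more than threshold (assumed value)."""
    return np.abs(frame.astype(float) - background) > threshold

# Toy data: flat grey frames with a bright 2 x 2 "moving part" that shifts each frame.
frames = np.full((5, 8, 8), 100.0)
for t in range(5):
    frames[t, t:t + 2, t:t + 2] = 200.0

background = learn_background(frames)
mask = foreground_mask(frames[0], background)
```

Because the moving part covers any given pixel in only a minority of frames, the median recovers the static background and the mask isolates the moving part.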

Page 67: Hierarchies in representations

Background Subtraction : Obstacle

Page 68: Hierarchies in representations

Visual Configuration Space

Page 69: Hierarchies in representations
Page 70: Hierarchies in representations

Application to Graphics

Page 71: Hierarchies in representations

Head Motion

[M. S. Ramaiah, Ankit Vijay, Geetika Sharma, Amitabha Mukerjee IEEE VR-13]

Page 72: Hierarchies in representations

Minimal Commitment Language Acquisition

Page 73: Hierarchies in representations

Previous Work: Unsupervised Semantics

• Single word or phrase learning [no grammar]

• Hand-coded propositional (T/F) semantics

– [plunkett etal 92] [siskind, 94/03] (phrases)

– [regier 96] (prepositions)

– [steels 03] [roy/reiter 05] [caza/knott 12]

• Supervised learning of semantics

– [kate/mooney 06] : set of predicates is known

– [yu/ballard 07] : semantics = scene-region

• Unsupervised Semantic Acquisition : “right” granularity for concepts; dynamic predicates

Page 74: Hierarchies in representations


What is a Symbol?

• grounded lexicon:

[langacker 87]

english lexicon hindi lexicon

Page 75: Hierarchies in representations


Lexicon

• grounded lexicon:

• semantic pole : perceptual patterns (image schemas) probabilistic predicate + arguments

Page 76: Hierarchies in representations


Grammar

• grounded syntax:

Page 77: Hierarchies in representations


Minimal Commitment

• minimize prior knowledge in agent:

• preference: minimize description lengths

• inventory of machine learning algorithms

• no knowledge of grammar – no POS tags, no syntactic structure

• no knowledge of domain

• bootstrapping stage:

• semantic schemas come first

• language regularities later

Page 78: Hierarchies in representations

ADIOS [solan ruppin edelman 05]

POS categories are discovered

Page 79: Hierarchies in representations

Acquisition Domains

Page 81: Hierarchies in representations

Previous Work: Grammar

• Grammar learning:

• Grammatical categories:

– [redington etal 98] (RNN)

– [wang / mintz 07] (frequent frame)

• Grammar induction : Structure is known

• No semantics:

– [marino etal 07] [solan edelman 05]

• Propositional semantics

– [dominey / boucher 05]

– [kwiatkowski zettlemoyer 10] (SVM)

– [kim/mooney 12] (altered visual input)

Page 82: Hierarchies in representations

Language Acquisition : Domain 1

• Perceptual input

[heider/simmel 1944] [hard/tversky 2003]

• Discovery Targets:

– semantics: objects, 2-agent actions, relations

– lexicon : nominal, transitive verbs, preposition

– lexical categories: N VT P Adj

– constructions: PP VP S

– sense extension (metaphor) [nayak/mukerjee (AAAI-12)]

Page 83: Hierarchies in representations

Language Acquisition : Domain 2

• Perceptual input

[mukerjee / joshi RANLP 11]

• Discovery Targets:

– semantics: object categories, motion categories

Page 84: Hierarchies in representations

Language Acquisition : Domain 2

• object categories

• Discovery Targets:

– semantics: object categories, motion categories

– lexicon : word boundaries, nominals, intransitive verbs

– construction: intransitive VP

[mukerjee / joshi RANLP 11]

Page 85: Hierarchies in representations


Video Fragment

Page 86: Hierarchies in representations

Linguistic input

• Unconstrained description by different subjects:

– the little square hit the big square

– they're hitting each other

– the big square hit the little square

– circle and square in [unintelligible stammer]

– the two squares stopped fighting

– छोटा बक्सा बड़ा बक्सा में कुछ बातचीत होती है (gloss: little box big box between some talk happens)

input = description commentaries transcribed into text

• 48 descriptions in English

[hard/tversky 03] [satish/mukerjee 08]

• 10 descriptions in Hindi

Page 87: Hierarchies in representations

Discovering Language

• Perceptual structure discovery:

– Given perceptual space W, discover set of structures Γ that partition it into patterns relevant to the agent's goals.

– Elements γ ∈ Γ constitute a hierarchy; structures learned earlier are used for more complex patterns

• Linguistic structure discovery:

– Given set of sentences formed from words w ∈ L, discover set of subsequences Λ that result in a more compact description of the structure

– Elements λ ∈ Λ constitute a hierarchy; leaf nodes (POS) are subsets of L

Page 88: Hierarchies in representations

Semantics First: Objects / Nominals

Page 89: Hierarchies in representations

Language Grounding: Entity/Object

object = coherent salient region in perceptual space

object view schema [white maruti 800 from camera 1]

object schema [white maruti 800]

object category schema [car]

bottom-up dynamic attention

[singh maji mukerjee 06]

Page 90: Hierarchies in representations

Language – Meaning Association

Relative Association (bayesian)

Mutual association (contribution to M.I.)
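
The two measures named above can be illustrated from word/concept co-occurrence counts. The slide gives no formulas, so this sketch assumes that relative association means the conditional probability P(concept | word) and that mutual association means each pair's term in the mutual-information sum. It uses NumPy, and the counts are hypothetical.

```python
import numpy as np

def association_scores(cooc):
    """cooc[w, c] = times word w was uttered while concept c was in attention.

    Returns P(c|w) and each pair's contribution to the mutual information,
    p(w,c) * log2(p(w,c) / (p(w) p(c))).
    """
    cooc = np.asarray(cooc, dtype=float)
    p_wc = cooc / cooc.sum()
    p_w = p_wc.sum(axis=1, keepdims=True)
    p_c = p_wc.sum(axis=0, keepdims=True)
    relative = p_wc / p_w                          # P(c | w), rows sum to 1
    with np.errstate(divide="ignore", invalid="ignore"):
        mi_terms = np.where(p_wc > 0, p_wc * np.log2(p_wc / (p_w * p_c)), 0.0)
    return relative, mi_terms

# Hypothetical counts: rows = words ("square", "circle", "the"), columns = two concepts.
cooc = [[8, 1],
        [1, 8],
        [5, 5]]
relative, mi_terms = association_scores(cooc)
```

In this toy table a function word such as "the" co-occurs equally with both concepts, so its mutual-information terms are zero, while "square" and "circle" each contribute positively for one concept.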

Page 91: Hierarchies in representations

Language Grounding: Nominals

Page 92: Hierarchies in representations

Perceptual Discovery : Actions : Verbs

Page 93: Hierarchies in representations

Perceptual Discovery: 2-agent actions

Consider every pair of objects A, B

A : attended-to object (trajector, tr); B : other object (landmark, lm).

2 features suffice:

Page 94: Hierarchies in representations

Static time-shots of feature space trajectories

Perceptual Discovery: 2-agent actions

Page 95: Hierarchies in representations

Switching the in-Focus agent

Human labels (CC, MA, Chase): ground-truth label vs. cluster assigned

CC: Come-Closer (C1), MA: Move-Away (C2), C3 & C4: Chase

Chase sub-categories:

– Chase_RO-chases-LO: C3

– Chase_LO-chases-RO: C4

Number of clusters from MNG = 4 when Edge Aging = 30 (0.9 prob)

Page 96: Hierarchies in representations

Hierarchy in Concept Space

More clusters reveal category hierarchy:

Number of clusters from MNG = 8 when Edge Aging = 16

Come-Closer: Come-Closer-RO-static, Come-Closer-both-moving, Come-Closer-LO-static

C1, C5, C6 : sub-classes of Come-Closer; C2, C7, C8 : sub-classes of Move-Away

Similarly: Move-Away : 3 subclasses

Page 97: Hierarchies in representations

Two-agent action ontology

Two-agent actions

• CC: CC-RO-static, CC-both-moving, CC-LO-static

• MA: MA-RO-static, MA-both-moving, MA-LO-static

• CH: CH (LO, RO), CH (RO, LO)

Page 98: Hierarchies in representations

Learning verbs

Page 99: Hierarchies in representations

Discovering Containment Relations : Prepositions

Page 100: Hierarchies in representations

Clustering spatial relations

[Singh et al CRV 2006]

Match object under gaze focus with words in narrative

Narrative: the little square hit the big square

Feature commitment: visual angle subtended at trajector by landmark

[Figure: histogram of subtended visual angle for the 3 shapes, with mean-shift clusters on subtended visual angle for different shapes]

[Sarkar/Mukerjee 07; Nayak/Mukerjee 12]
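
Mean-shift clustering of a single feature such as the subtended angle can be sketched as follows. This is a minimal flat-kernel version, assuming NumPy. The bandwidth, the merge distance, and the sample angles are hypothetical, and angular wrap-around is ignored.

```python
import numpy as np

def mean_shift_1d(x, bandwidth, n_iter=100, tol=1e-6):
    """Flat-kernel mean shift on 1-D data; returns the mode each sample converges to."""
    x = np.asarray(x, dtype=float)
    modes = x.copy()
    for _ in range(n_iter):
        # Move every estimate to the mean of the samples within `bandwidth` of it.
        near = np.abs(modes[:, None] - x[None, :]) <= bandwidth
        shifted = (near * x[None, :]).sum(axis=1) / near.sum(axis=1)
        done = np.abs(shifted - modes).max() < tol
        modes = shifted
        if done:
            break
    return modes

def label_modes(modes, merge_dist):
    """Group converged modes that lie within merge_dist of their sorted neighbour."""
    order = np.argsort(modes)
    labels = np.empty(len(modes), dtype=int)
    current = 0
    for rank, idx in enumerate(order):
        if rank > 0 and modes[idx] - modes[order[rank - 1]] > merge_dist:
            current += 1
        labels[idx] = current
    return labels

# Hypothetical subtended angles in degrees: one group near 60, one near 350.
angles = np.array([55.0, 58.0, 62.0, 65.0, 340.0, 348.0, 352.0, 356.0])
modes = mean_shift_1d(angles, bandwidth=30.0)
labels = label_modes(modes, merge_dist=15.0)
```

The number of clusters is not specified in advance; it falls out of the bandwidth, which is one reason mean shift suits discovering an unanticipated category.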

Page 101: Hierarchies in representations

Clustering spatial relations

Narrative: the little square hit the big square

[Figure: histogram of subtended visual angle for the 3 shapes; IN cluster (emergent)]

[Sarkar/Mukerjee 07; Nayak/Mukerjee 12]

Page 102: Hierarchies in representations

Words for motions ending in / out

Page 103: Hierarchies in representations

Syntax discovery and Semantic Association

Page 104: Hierarchies in representations

Syntax Discovery

ADIOS [solan / edelman 05]

• Syntactic discovery:

• Given input text, attempt to find the graph that minimizes the description length

• Relational Graph RDS: patterns as nodes; edges as transitions

• Attempt to edit RDS to detect significant patterns

• Equivalence classes emerge at the nodes

Page 105: Hierarchies in representations

Computing the Image Schema

Our reflective baby has discovered:

“in” = label corresponding to this image schema

Hence: symbol for [IN] is

(note: this is an early, very basic, low-confidence characterization)

Page 106: Hierarchies in representations

Language Structures : Verbs

ADIOS [solan / edelman 05]

Page 107: Hierarchies in representations

Hindi Acquisition: Word learning

Page 108: Hierarchies in representations

Incipient Syntax

Page 109: Hierarchies in representations

Scaling up?

• Domain-specific grammar

• After learning several such grammars for different domains, how to merge?

Page 110: Hierarchies in representations

Conclusion

Page 111: Hierarchies in representations

The power of Latent Discovery

• Data is not randomly drawn

• Discovering implicit structure in data

• Tacit Knowledge

• Much work remains in scaling up

• How to merge diverse domains?

• Learning to plan motions

– Learn low-dimensional representations of motion

• Learning Language / Vision

Page 112: Hierarchies in representations

Humans and Robots

madhur ambastha cs665 2002