Semantic Kernel Forests from Multiple Taxonomies

Sung Ju Hwang (University of Texas at Austin), Fei Sha (University of Southern California), and Kristen Grauman (University of Texas at Austin)

Limitation of status quo recognition

Until recently, most categorization methods relied solely on category labels, treating each instance as an isolated entity.

[Figure: visual world vs. semantic space. Images of Cat, Dog, Wolf, and Zebra map to the isolated labels 1, 2, 3, and 4.]

Limitation of status quo recognition

[Figure: semantic space vs. visual world. Cat, Dog, Wolf, and Zebra are related through groupings such as Canine, Pet, and Wild, with pairs marked Similar or Dissimilar.]

However, semantic entities exist in relation to others.

Larger and finer-grained datasets → more meaningful relations

How can we exploit such relations for improved categorization?

[Fergus10] Semantic Label Sharing for Learning with Many Categories, R. Fergus, H. Bernal, Y. Weiss, A. Torralba, ECCV 2010

[Zhao11] Large Scale Category Structure Aware Image Classification, B. Zhao, L. Fei-Fei, E. P. Xing, NIPS 2011

Motivation

Our focus: a semantic taxonomy

- But, potentially two snags:

1) Partial alignment between the taxonomy and visual distribution

2) No single ‘optimal’ taxonomy

[Figure: three taxonomies over Dalmatian, Wolf, Siam. Cat, and Leopard. Biological: Animal splits into Canine and Feline. Appearance: Texture splits into Spotted and Pointy Corner. Habitat: Tameness splits into Domestic and Wild.]

What information to exploit from multiple taxonomies and how to leverage it?

Idea

Exploit multiple semantic taxonomies for visual feature learning

[Figure: the Biological, Appearance, and Habitat taxonomies over Dalmatian, Wolf, Siam. Cat, and Leopard, each annotated with example features: dog-like shape and cat-face (Biological); spot and pointy corner (Appearance); indoor setting, person, and woods (Habitat).]

- Taxonomies provide human merge/split criteria
- Each taxonomy provides complementary information

How do we then,

1. Learn granularity- and view-specific features on each taxonomy, and

2. Combine learned features across taxonomies for object recognition?

Overview

Goal: Learn and combine features across multiple taxonomies

[Figure: three taxonomies (rooted at Tameness, Texture, and Animal) with their example features (dog-like shape, cat-face, spot, pointy corner, indoor setting, person, woods), summed (+) into a categorization model.]

1. Learn view- and granularity-specific features at each taxonomy

2. Optimally combine learned features for categorization

[Hwang11] Learning a Tree of Metrics with Disjoint Visual Features, S. J. Hwang, F. Sha, and K. Grauman, NIPS 2011

Tree of Metrics

How to learn granularity- and view-specific features?

[Figure: Carnivore taxonomy. Canine: Dalmatian, Wolf. Feline: Domestic Cat (Siamese cat, Persian cat), Big cat.]

Intuition: Features useful for discriminating the superclasses are less useful for subcategory discrimination

- Exploit parent-child relationship to isolate features at each node

[Hwang11] Learning a Tree of Metrics with Disjoint Visual Features, S. J. Hwang, F. Sha, and K. Grauman, NIPS 2011

Tree of Metrics

Approach the feature learning problem as hierarchical metric learning with disjoint regularization.

Given a taxonomy, we learn a metric for each internal (superclass) node n to discriminate between its subclasses.

[Figure: points x_i, x_j, and x_l from the Feline and Canine subclasses, separated by a margin.]

[Figure: Carnivore taxonomy (Canine: Dalmatian, Wolf; Feline: Domestic Cat with Siamese cat and Persian cat, Big cat) with a metric shown at each internal node. Lighter element has higher value.]

[Hwang11] Learning a Tree of Metrics with Disjoint Visual Features, S. J. Hwang, F. Sha, and K. Grauman, NIPS 2011
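To make the per-node metric concrete, here is a minimal sketch of a margin-based constraint at a single node. It assumes a diagonal metric and a hinge loss on triplets (x_i, x_j, x_l); the function names, the toy points, and the margin value are illustrative assumptions, not the formulation from [Hwang11].

```python
import numpy as np

def mahalanobis_sq(x, y, m_diag):
    """Squared distance (x - y)^T diag(m_diag) (x - y) under a diagonal metric."""
    d = x - y
    return float(np.sum(m_diag * d * d))

def triplet_hinge(x_i, x_j, x_l, m_diag, margin=1.0):
    """Hinge loss: zero once x_l is farther from x_i than x_j is, by at least `margin`."""
    same = mahalanobis_sq(x_i, x_j, m_diag)
    diff = mahalanobis_sq(x_i, x_l, m_diag)
    return max(0.0, margin + same - diff)

# Toy data (hypothetical): feature 0 separates the subclasses, feature 1 is noise.
x_i = np.array([0.0, 0.0])   # a Feline sample
x_j = np.array([0.1, 2.0])   # another Feline sample
x_l = np.array([2.0, 0.1])   # a Canine sample

uniform = np.array([1.0, 1.0])    # weights both features equally
selective = np.array([1.0, 0.0])  # keeps only the discriminative feature

loss_uniform = triplet_hinge(x_i, x_j, x_l, uniform)
loss_selective = triplet_hinge(x_i, x_j, x_l, selective)
```

With the uniform metric the margin is violated, while the metric that keeps only feature 0 satisfies it. Selecting features of this kind is what a node's metric is meant to do.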

Tree of Metrics

Further, we learn all metrics simultaneously with two regularizations:

- A sparsity-based regularization to identify informative features.
- A disjoint regularization to learn features exclusive to each granularity.

[Figure: Carnivore taxonomy. Canine: Dalmatian, Wolf. Feline: Domestic Cat (Siamese cat, Persian cat), Big cat.]

[Hwang11] Learning a Tree of Metrics with Disjoint Visual Features, S. J. Hwang, F. Sha, and K. Grauman, NIPS 2011

Regularization Terms to Learn Compact, Discriminative Metrics

Sparsity regularization

How can we select few informative features at each node?

Minimize the sum of the diagonal entries.

→ Competition between features in a single metric

[Hwang11] Learning a Tree of Metrics with Disjoint Visual Features, S. J. Hwang, F. Sha, and K. Grauman, NIPS 2011

Regularization Terms to Learn Compact, Discriminative Metrics

Disjoint regularization

How can we regularize each metric to use features disjoint from its ancestors?

Enforce that two metrics (ancestor and descendant) do not have a large value at the same time for the same feature.

→ Competition between ancestors and descendants

Both regularizers are convex

[Hwang11] Learning a Tree of Metrics with Disjoint Visual Features, S. J. Hwang, F. Sha, and K. Grauman, NIPS 2011
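The two regularizers can be sketched numerically. The sparsity term below is the sum of the diagonal entries, as the slide states. The disjoint term is written as the squared L2 norm of the summed diagonals of an ancestor and a descendant metric. That is one convex form matching the description above, but the slide's own equation did not survive extraction, so treat it as an assumption. The function names and toy metrics are illustrative.

```python
import numpy as np

def sparsity_reg(m):
    """Sum of the diagonal entries (trace) of a metric matrix."""
    return float(np.trace(m))

def disjoint_reg(m_ancestor, m_descendant):
    """Squared L2 norm of the summed diagonals (assumed form).

    The value grows when both metrics put weight on the same feature.
    """
    s = np.diag(m_ancestor) + np.diag(m_descendant)
    return float(s @ s)

# Toy diagonal metrics over three features (hypothetical values).
parent = np.diag([1.0, 0.0, 0.0])
child_overlap = np.diag([1.0, 0.0, 0.0])   # reuses the parent's feature
child_disjoint = np.diag([0.0, 1.0, 0.0])  # uses a different feature

overlap_penalty = disjoint_reg(parent, child_overlap)
disjoint_penalty = disjoint_reg(parent, child_disjoint)
```

Both children have the same trace, so the sparsity term cannot tell them apart. Only the disjoint term penalizes the child that reuses its parent's feature more heavily.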

Overview

Goal: Learn and combine features across multiple taxonomies

[Figure: three taxonomies (rooted at Tameness, Texture, and Animal) with their example features (dog-like shape, cat-face, spot, pointy corner, indoor setting, person, woods), summed (+) into a categorization model.]

1. Learn view- and granularity-specific features at each taxonomy

2. Optimally combine learned features for categorization

[Hwang12] S. J. Hwang, F. Sha, K. Grauman, Semantic Kernel Forest from Multiple Taxonomies, NIPS 2012

Semantic Kernel Forest

[Figure: the Biological (Animal: Canine, Feline), Appearance (Texture: Spotted, Pointy Corner), and Habitat (Tameness: Domestic, Wild) taxonomies over Dalmatian, Wolf, Siam. Cat, and Leopard.]

From multiple ToMs, we obtain a semantic kernel forest, a set of non-linear view- and granularity-specific feature spaces

Compute an RBF kernel on the distance computed using the learned metric

[Hwang12] S. J. Hwang, F. Sha, K. Grauman, Semantic Kernel Forest from Multiple Taxonomies, NIPS 2012
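As an illustration of the kernel computation, the sketch below builds an RBF kernel from squared distances under a given metric matrix M. The bandwidth parameter `gamma`, the function name, and the toy data are assumptions made for the example. M is assumed to be symmetric positive semidefinite, as a learned metric would be.

```python
import numpy as np

def metric_rbf_kernel(X, M, gamma=1.0):
    """K[a, b] = exp(-gamma * (x_a - x_b)^T M (x_a - x_b)) for the rows of X."""
    XM = X @ M
    sq = np.sum(XM * X, axis=1)                        # x_a^T M x_a for each row
    d2 = sq[:, None] + sq[None, :] - 2.0 * (XM @ X.T)  # pairwise squared distances
    d2 = np.maximum(d2, 0.0)                           # guard against round-off
    return np.exp(-gamma * d2)

# Toy example: a metric that only looks at feature 0.
X = np.array([[0.0, 0.0],
              [1.0, 0.0],
              [0.0, 1.0]])
M = np.diag([1.0, 0.0])
K = metric_rbf_kernel(X, M)
```

Points 0 and 2 differ only in the feature that the metric ignores, so their kernel value is 1. Points 0 and 1 differ in the selected feature and get exp(-1).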

Semantic Kernel Forest

How to combine the learned kernel forest for optimal discrimination?

[Figure: the Biological (Animal: Canine, Feline), Appearance (Texture: Spotted, Pointy Corner), and Habitat (Tameness: Domestic, Wild) taxonomies over Dalmatian, Wolf, Siam. Cat, and Leopard.]

Obtain a class-specific kernel by linearly combining kernels on the tree paths.

- Multiple kernel learning
- Consider only a small fraction of relevant kernels: O(T log N)
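The path-based combination can be sketched as follows. Each taxonomy is stored as a child-to-parent dictionary, and the kernel for a class is the weighted sum of the kernels at the internal nodes on its root-to-leaf path in every taxonomy. The data layout, the function names, the constant toy kernels, and the uniform weights are all assumptions for illustration. In the talk the weights are learned by multiple kernel learning.

```python
import numpy as np

def path_to_root(parent, leaf):
    """Internal nodes from the leaf's parent up to the root."""
    path, node = [], parent[leaf]
    while node is not None:
        path.append(node)
        node = parent.get(node)
    return path

def class_kernel(leaf, taxonomies, kernels, beta):
    """Weighted sum of kernels at the nodes on the leaf's path in each taxonomy."""
    total = None
    for name, parent in taxonomies.items():
        for node in path_to_root(parent, leaf):
            term = beta[(name, node)] * kernels[(name, node)]
            total = term if total is None else total + term
    return total

# Two toy taxonomies, listing only the entries needed for one class.
taxonomies = {
    "Biological": {"Dalmatian": "Canine", "Canine": "Animal", "Animal": None},
    "Habitat": {"Dalmatian": "Domestic", "Domestic": "Tameness", "Tameness": None},
}
keys = [("Biological", "Canine"), ("Biological", "Animal"),
        ("Habitat", "Domestic"), ("Habitat", "Tameness")]
# Placeholder 2x2 kernels with constant entries 1, 2, 3, 4 (hypothetical).
kernels = {k: np.full((2, 2), float(i + 1)) for i, k in enumerate(keys)}
beta = {k: 0.25 for k in keys}

K_dalmatian = class_kernel("Dalmatian", taxonomies, kernels, beta)
```

Only the kernels on a class's own paths enter its combination. With T taxonomies and roughly balanced trees over N classes, the path lengths add up to a count on the order of T log N, which is one way to read the figure on the slide.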

Proposed Sparse Hierarchical Regularization

[Figure: the Biological (Animal: Canine, Feline) and Habitat (Tameness: Domestic, Wild) taxonomies over Dalmatian, Wolf, Siam. Cat, and Leopard.]

Usual L1 regularization: selects few useful kernels

Multiple taxonomies may provide some redundant kernels

- Interleaved selection of kernels

Are all kernels equal?

Proposed Sparse Hierarchical Regularization

Multiple taxonomies provide redundant kernels

- Higher-level kernels discriminate among more categories

Hierarchical regularization

- The weight of a node must be larger than its children’s
- Implicitly enforces hierarchical structure among kernels

[Figure: the Biological (Animal: Canine, Feline) and Habitat (Tameness: Domestic, Wild) taxonomies over Dalmatian, Wolf, Siam. Cat, and Leopard, with "<" marking that a child's weight is smaller than its parent's.]
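One way to express the preference "a node's weight should be larger than its children's" is a hinge penalty on each child-parent pair. The sketch below assumes that form. The function name, the node names, and the weights are illustrative, and the exact term used in the talk is not recoverable from this text.

```python
def hierarchical_penalty(beta, parent):
    """Sum of max(0, beta[child] - beta[parent]) over all non-root nodes."""
    total = 0.0
    for node, par in parent.items():
        if par is not None:
            total += max(0.0, beta[node] - beta[par])
    return total

# Kernel nodes of one toy taxonomy (hypothetical weights).
parent = {"Animal": None, "Canine": "Animal", "Feline": "Animal"}
beta_ordered = {"Animal": 0.6, "Canine": 0.3, "Feline": 0.1}
beta_violating = {"Animal": 0.1, "Canine": 0.5, "Feline": 0.05}

penalty_ordered = hierarchical_penalty(beta_ordered, parent)
penalty_violating = hierarchical_penalty(beta_violating, parent)
```

The penalty is zero when every child weight is at most its parent's weight, and it grows linearly with each violation. The kink at equality is what makes an objective containing this term nonsmooth.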

Optimization for Semantic Kernel Forest

We minimize the sum of the MKL objective and the regularization terms.

[Equation: MKL objective + sparsity regularization + hierarchical regularization]

- Nonsmooth due to the hierarchical regularization term
- Use projected subgradient to optimize

[Hwang12] S. J. Hwang, F. Sha, K. Grauman, Semantic Kernel Forest from Multiple Taxonomies, NIPS 2012
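The projected subgradient idea can be illustrated on a toy problem. In this sketch a simple quadratic stands in for the MKL objective, which is an assumption made so that the example is self-contained. The sparsity term is a sum of the nonnegative weights, and the hierarchical term is a hinge on each child-parent pair. The step-size schedule, the parameter names `lam` and `mu`, and the toy target are also assumptions.

```python
import numpy as np

def objective(beta, target, parent_idx, lam, mu):
    """Stand-in loss + sparsity term + hierarchical hinge term."""
    hinge = sum(max(0.0, beta[i] - beta[p])
                for i, p in enumerate(parent_idx) if p >= 0)
    return 0.5 * np.sum((beta - target) ** 2) + lam * np.sum(beta) + mu * hinge

def subgradient(beta, target, parent_idx, lam, mu):
    """One valid subgradient of the objective at beta."""
    g = (beta - target) + lam
    for i, p in enumerate(parent_idx):
        if p >= 0 and beta[i] > beta[p]:  # active hinge: child above parent
            g[i] += mu
            g[p] -= mu
    return g

def projected_subgradient(target, parent_idx, lam=0.1, mu=1.0, steps=2000):
    """Diminishing-step subgradient descent with projection onto beta >= 0."""
    beta = np.full(len(target), 0.5)
    best, best_val = beta.copy(), objective(beta, target, parent_idx, lam, mu)
    for t in range(1, steps + 1):
        g = subgradient(beta, target, parent_idx, lam, mu)
        beta = np.maximum(beta - g / (t + 1.0), 0.0)  # step, then project
        val = objective(beta, target, parent_idx, lam, mu)
        if val < best_val:  # subgradient steps are not monotone, so keep the best
            best, best_val = beta.copy(), val
    return best, best_val

# Node 0 is the root; nodes 1 and 2 are its children (parent index -1 = no parent).
target = np.array([0.2, 0.9, 0.1])
parent_idx = [-1, 0, 0]
beta_hat, value = projected_subgradient(target, parent_idx)
```

In this toy problem the unregularized fit would give child 1 a much larger weight than the root. The hinge term pulls the two weights together, and the sparsity term drives the weakly supported child 2 toward zero.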

Datasets

AWA-10
- 6,180 images
- 10 animal classes
- Fine-grained
- Taxonomies: (a) Wordnet (b) Appearance (c) Behavior (d) Habitat

Imagenet-20
- 28,957 images
- 20 non-animal classes
- Coarser-grained
- Taxonomies: (a) Wordnet (b) Visual (c) Attributes

Constructed on different attribute groups

Multiclass Classification Results

Method                          AWA-4        AWA-10       Imagenet-20
Raw feature kernel              47.67±2.22   30.80±1.36   28.20±1.45
Raw feature kernel + MKL        48.50±1.89   31.13±2.81   27.67±1.50
Perturbed semantic kernel tree  N/A          31.53±2.07   28.20±2.02

We compare to three baselines

- Raw feature kernel: RBF kernel computed on the original image features
- Raw feature kernel + MKL: MKL with RBF kernels with different bandwidths
- Perturbed semantic kernel tree: Semantic kernel forest on a randomly permuted taxonomy
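For the perturbed baseline, one plausible reading of "randomly permuted taxonomy" is to keep the tree shape and shuffle which class sits at which leaf. The sketch below assumes that reading. The function name, the child-to-parent layout, and the seed are illustrative, and the talk does not specify the exact procedure.

```python
import random

def permute_taxonomy_leaves(leaf_to_parent, seed=0):
    """Reassign classes to leaf positions at random while keeping the tree shape."""
    rng = random.Random(seed)
    leaves = sorted(leaf_to_parent)
    shuffled = leaves[:]
    rng.shuffle(shuffled)
    # The class `new` takes over the leaf position previously held by `old`.
    return {new: leaf_to_parent[old] for old, new in zip(leaves, shuffled)}

biological = {"Dalmatian": "Canine", "Wolf": "Canine",
              "Siam. Cat": "Feline", "Leopard": "Feline"}
perturbed = permute_taxonomy_leaves(biological, seed=1)
```

The perturbed tree has the same classes and the same group sizes, but the groupings no longer need to carry semantic meaning.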

Multiclass Classification Results

Method                          AWA-4        AWA-10       Imagenet-20
Raw feature kernel              47.67±2.22   30.80±1.36   28.20±1.45
Raw feature kernel + MKL        48.50±1.89   31.13±2.81   27.67±1.50
Perturbed semantic kernel tree  N/A          31.53±2.07   28.20±2.02
Semantic kernel tree + Avg      47.17±2.40   31.92±1.21   28.97±1.61
Semantic kernel tree + MKL      48.89±1.06   32.43±1.93   29.74±1.26
Semantic kernel tree + MKL-H    50.06±1.12   32.68±1.79   29.90±0.70

- Semantic kernel tree + Avg: Averaged semantic kernel tree on a single taxonomy
- Semantic kernel tree + MKL: MKL on a single taxonomy, with only the sparsity regularization
- Semantic kernel tree + MKL-H: with both sparsity and hierarchical regularization

Semantic kernel tree (ToM) > perturbed kernel tree

- More meaningful grouping/splits for object categorization

Multiclass Classification Results

Method                          AWA-4        AWA-10       Imagenet-20
Raw feature kernel              47.67±2.22   30.80±1.36   28.20±1.45
Raw feature kernel + MKL        48.50±1.89   31.13±2.81   27.67±1.50
Perturbed semantic kernel tree  N/A          31.53±2.07   28.20±2.02
Semantic kernel tree + Avg      47.17±2.40   31.92±1.21   28.97±1.61
Semantic kernel tree + MKL      48.89±1.06   32.43±1.93   29.74±1.26
Semantic kernel tree + MKL-H    50.06±1.12   32.68±1.79   29.90±0.70
Semantic kernel forest + MKL    49.67±1.11   34.60±1.78   30.97±1.14
Semantic kernel forest + MKL-H  52.83±1.68   35.87±1.22   32.30±1.00

- Semantic kernel forest + MKL: MKL with kernels learned on multiple taxonomies, with only the sparsity regularization
- Semantic kernel forest + MKL-H: with both sparsity and hierarchical regularization

Multiple taxonomies > a single taxonomy

- Each taxonomy provides complementary information

Multiclass Classification Results

Hierarchical regularizer > Standard L1 regularization

- Regularizer’s effect is minimal on the semantic kernel tree, which lacks redundancy

- Good to consider the structure of the feature spaces

Confusion matrices on 4 animal classes

[Figure: confusion matrices for three taxonomies over Dalmatian, Wolf, Siam. Cat, and Leopard: Biological (Animal: Canine, Feline), Appearance (Animal: Spotted, Pointy Ear), and Habitat (Animal: Domestic, Wild). Blue: low confusion. Red: high confusion.]

Each taxonomy is suboptimal, but provides complementary information which could be optimally leveraged with MKL

Effect of hierarchical regularization

[Figure: four taxonomies shown on a color scale from Lower to Higher. Wordnet nodes: placental, carnivore, aquatic, even-toed, feline, procyonid. Appearance nodes: ~panda, hairless, cat/rat. Behavior nodes: predator/prey, aquatic, land, racoon/rat. Habitat nodes: land, aquatic, nonjungle, jungle.]

Sparsity regularization only: 34.33

Sparsity + Hierarchical: 35.67

Hierarchical regularizer avoids overfitting with implicit structure enforced among kernels

Summary

Key message: semantic taxonomies for visual feature learning

- Exploits disjoint sparsity between parent and child classes in a taxonomy: Tree of Metrics
- Leverages complementary information from multiple semantic taxonomies: Semantic Kernel Forest
- Novel regularizers that exploit category relations
  - Disjoint regularizer that exploits parent-child relationship to learn disjoint features.
  - Hierarchical regularizer that favors upper-level kernels.

Intuition
- Competing features between parent and child
- Complementary across different semantic views

Learning methods
- Disjoint and hierarchical regularizer for competing features
- MKL with hierarchical regularizer for complementary features

[Hwang12] S. J. Hwang, F. Sha, K. Grauman, Semantic Kernel Forest from Multiple Taxonomies, NIPS 2012

[Hwang11] Learning a Tree of Metrics with Disjoint Visual Features, S. J. Hwang, F. Sha, and K. Grauman, NIPS 2011

Summary

Key message: Semantic taxonomies for visual feature learning

Intuition:
- Competing features between parent and child
- Complementary across different semantic views

Learning methods:
- Disjoint regularizer for competing features
- MKL with hierarchical regularizer for complementary features

[Hwang12] S. J. Hwang, F. Sha, K. Grauman, Semantic Kernel Forest from Multiple Taxonomies, NIPS 2012

[Hwang11] Learning a Tree of Metrics with Disjoint Visual Features, S. J. Hwang, F. Sha, and K. Grauman, NIPS 2011

Per-class results

A single taxonomy often improves performance on some classes, at the expense of others.
- Individual taxonomy suboptimal.

Habitat
- Better for h. whale
- Worse for panda

Wordnet
- Better for panda
- Worse for h. whale

All
- Better for both

Semantic kernel forest takes the best of both through learned combination.

Idea

Learn a non-linear feature space for each view and granularity that splits the categories according to each merge/split criterion

[Figure: a Canine vs. Feline feature space (dog-like shape, cat-face), shown next to the Habitat (Tameness: Domestic, Wild) and Appearance (Texture: Spotted, Pointy Corner) taxonomies with their example features (indoor setting, person, woods; spot, pointy corner).]

Idea

Learn a non-linear feature space for each view and granularity that splits the categories according to each merge/split criterion

[Figure: three feature spaces. Canine vs. Feline: dog-like shape, cat-face. Spot vs. Pointy corner: spot, pointy corner. Domestic vs. Wild: indoor setting, person, woods.]

Idea

[Figure: the Canine vs. Feline, Spot vs. Pointy corner, and Domestic vs. Wild feature spaces merged into a combined feature space.]

Then, combine the feature spaces to obtain an optimally discriminative space for categorization.

How do we then
- learn such features, and
- optimally combine them?