
Zero-Shot Learning - The Good, the Bad and the Ugly

Yongqin Xian¹, Bernt Schiele¹, Zeynep Akata¹,²

¹ Max-Planck Institute for Informatics   ² University of Amsterdam

Zero-Shot Evaluation Protocol, Evaluated Methods and Evaluation Metrics

[Figure: Zero-shot evaluation protocol. At training time only the seen classes Ytr are available, each described by an attribute vector (e.g. tiger - black: yes, white: yes, brown: no, stripes: yes, water: no, eats fish: no). At test time, zero-shot learning predicts among the unseen classes Yts only, while generalized zero-shot learning predicts among Yts ∪ Ytr (e.g. polar bear, zebra, otter, tiger).]
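The two test-time label spaces can be sketched in code. The attribute vectors, class assignments, and the `predict` helper below are illustrative stand-ins, not the paper's actual data:

```python
import numpy as np

# Illustrative binary attribute signatures per class (made up for this sketch).
attrs = {
    "zebra":      np.array([1., 1., 0., 1., 0.]),  # seen class (in Ytr)
    "tiger":      np.array([1., 0., 1., 1., 0.]),  # seen class (in Ytr)
    "polar bear": np.array([0., 1., 0., 0., 1.]),  # unseen class (in Yts)
    "otter":      np.array([0., 0., 1., 0., 1.]),  # unseen class (in Yts)
}
Ytr = ["zebra", "tiger"]
Yts = ["polar bear", "otter"]

def predict(x, label_space):
    """Score an image embedding x against each candidate class's
    attribute vector and return the best-scoring class name."""
    scores = {y: float(x @ attrs[y]) for y in label_space}
    return max(scores, key=scores.get)

x = np.array([0.1, 0.9, 0.0, 0.1, 0.8])  # an image resembling a polar bear

zsl_pred  = predict(x, Yts)        # zero-shot: unseen classes only
gzsl_pred = predict(x, Yts + Ytr)  # generalized: unseen and seen classes
```

The only difference between the two settings is the label space the argmax runs over; in the generalized setting the seen classes compete with the unseen ones.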

• The Good: zero-shot learning attracts lots of attention
• The Bad: no agreed evaluation protocol
• The Ugly: test classes overlap with ImageNet 1K

Dataset    Size   |Y|   |Ytr|      |Yts|
SUN        14K    717   580 + 65   72
CUB        11K    200   100 + 50   50
AWA1       30K    50    27 + 13    10
AWA2 [11]  37K    50    27 + 13    10
aPY        1.5K   32    15 + 5     12

ImageNet Split              |Yts|
ImageNet 21K - Ytr          20345
Within 2/3 hops from Ytr    1509 / 7678
Most populated classes      500 / 1K / 5K
Least populated classes     500 / 1K / 5K

Group                     Method       Main Idea
Linear compatibility      ALE [1]      Learn linear embedding by weighted ranking loss
                          DEVISE [4]   Learn linear embedding by ranking loss
                          SJE [2]      Learn linear embedding by multi-class SVM loss
                          ESZSL [8]    Apply square loss and regularize the embedding
                          SAE [5]      Learn linear embedding with autoencoder
Non-linear compatibility  LATEM [10]   Learn piece-wise linear embedding
                          CMT [9]      Learn non-linear embedding by neural network
Two-stage inference       DAP [6]      Predict attributes → unseen class
                          IAP [6]      Predict seen class → attributes → unseen class
                          CONSE [7]    Predict seen class → embedding → unseen class
Hybrid                    SSE [12]     Learn embedding by seen class proportions
                          SYNC [3]     Learn base classifiers → unseen class classifiers
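The compatibility methods above share a bilinear score F(x, y) = θ(x)ᵀ W φ(y) between an image embedding and a class embedding. A minimal sketch of a DeViSE-style hinge ranking update follows; the toy data, margin, and learning rate are my own choices, not any paper's exact formulation:

```python
import numpy as np

def ranking_step(W, x, phi, y_true, margin=1.0, lr=0.1):
    """One SGD step on the hinge ranking loss
    sum_{y != y_true} max(0, margin - F(x, y_true) + F(x, y)),
    where F(x, y) = x^T W phi_y is the bilinear compatibility."""
    scores = x @ W @ phi.T  # F(x, y) for every candidate class
    grad = np.zeros_like(W)
    for y in range(len(phi)):
        if y != y_true and margin - scores[y_true] + scores[y] > 0:
            # gradient of (-F(x, y_true) + F(x, y)) with respect to W
            grad += np.outer(x, phi[y] - phi[y_true])
    return W - lr * grad

# Toy example: 2-D image features and 2 classes with orthogonal
# attribute embeddings (all values illustrative).
phi = np.array([[1.0, 0.0], [0.0, 1.0]])   # class embeddings
x, y_true = np.array([1.0, 0.0]), 0
W = np.zeros((2, 2))
for _ in range(10):
    W = ranking_step(W, x, phi, y_true)
# The true class now outscores the other by at least the margin.
```

The same scoring function transfers to unseen classes at test time: only φ(y) changes, W stays fixed, which is what makes compatibility learning zero-shot capable.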

Per-class Top-1 Accuracy:

acc_Y = \frac{1}{\|Y\|} \sum_{c=1}^{\|Y\|} \frac{\#\,\text{correct in } c}{\#\,\text{in } c}

Harmonic Mean:

H = \frac{2 \cdot acc_{Y^{tr}} \cdot acc_{Y^{ts}}}{acc_{Y^{tr}} + acc_{Y^{ts}}}
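Both metrics translate directly to code. The function names and toy labels below are illustrative:

```python
import numpy as np

def per_class_top1(y_true, y_pred, classes):
    """Per-class averaged top-1 accuracy: accuracy is computed within
    each class first, then averaged over classes, so sparsely populated
    classes weigh as much as heavily populated ones."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    accs = [(y_pred[y_true == c] == c).mean() for c in classes]
    return float(np.mean(accs))

def harmonic_mean(acc_tr, acc_ts):
    """Harmonic mean of seen (acc_tr) and unseen (acc_ts) accuracy;
    it is high only when BOTH accuracies are high."""
    if acc_tr + acc_ts == 0:
        return 0.0
    return 2 * acc_tr * acc_ts / (acc_tr + acc_ts)

# Class 0 has 4 samples, class 1 only 1; a predictor that always says
# class 0 gets 80% sample accuracy but only 50% per-class accuracy.
y_true = [0, 0, 0, 0, 1]
y_pred = [0, 0, 0, 0, 0]
acc = per_class_top1(y_true, y_pred, classes=[0, 1])  # (1.0 + 0.0) / 2
```

Note that a method with acc_Ytr = 0.8 but acc_Yts = 0 scores H = 0: unlike the arithmetic mean, H cannot be gamed by ignoring unseen classes.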

• Propose a unified evaluation protocol and data splits
• Compare and analyze a significant number of state-of-the-art methods in depth
• Evaluate both the zero-shot setting and the more realistic generalized zero-shot setting

Zero-Shot Learning Setting

[Figure: Top-1 accuracy (in %) of all 12 methods on AWA1 under the Standard Split (SS) vs the Proposed Split (PS).]

• Proposed Split (PS) vs Standard Split (SS) on AWA1
• 6 out of 10 test classes of SS appear in ImageNet 1K
• This violates the zero-shot setting: feature learning becomes part of training
• PS test sets are not a part of ImageNet 1K
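The overlap check behind a clean split is mechanical: intersect the zero-shot test classes with the classes the image features were pre-trained on. The class lists below are hypothetical placeholders, not the actual splits:

```python
# Any class appearing in both sets leaks test-class supervision into
# feature learning, which violates the zero-shot assumption.
# All three lists are illustrative placeholders, not the real splits.
imagenet_1k = {"tiger", "zebra", "otter"}        # feature-training classes
ss_test     = {"tiger", "zebra", "polar bear"}   # hypothetical SS test set
ps_test     = {"polar bear", "walrus"}           # hypothetical PS test set

ss_leak = ss_test & imagenet_1k  # non-empty -> split is contaminated
ps_leak = ps_test & imagenet_1k  # empty -> split respects the setting
```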

[Figure: Top-1 accuracy (in %) of the 12 models on SUN, CUB, AWA1, AWA2 and aPY under the Proposed Split.]

[Figure: Rank matrix of the 12 models across all 5 datasets. Mean ranks: ALE [2.4], DEVISE [3.3], SJE [3.9], SSE [4.7], ESZSL [4.7], LATEM [5.2], SYNC [6.1], DAP [8.4], SAE [9.6], CMT [9.7], CONSE [9.8], IAP [10.3].]

• Top-1 accuracy of PS on SUN, CUB, AWA1 and AWA2 (above)
• Method ranking of PS on all 5 datasets (left); element (i, j) indicates the number of times model i ranks jth

• Compatibility learning methods perform better

[Figure: Top-1 accuracy (in %) on ImageNet for the test splits 2H, 3H, M500, M1K, M5K, L500, L1K, L5K and All (CONSE, CMT, LATEM, ALE, DEVISE, SJE, ESZSL, SYNC, SAE).]

• Top-1 accuracy on ImageNet with different test class splits
• Hybrid models perform the best for highly populated classes
• All methods perform similarly for sparsely populated classes
• Results on the least populated classes are lower than on the most populated classes

Generalized Zero-Shot Learning Setting

[Figure: Generalized zero-shot top-1 accuracy (in %) on CUB and AWA1, reporting unseen-class (u), seen-class (s) and harmonic mean (H) accuracy for all 12 methods.]

Ranking of methods w.r.t. u (left), s (right) and H (bottom)

[Figure: Rank matrices in the generalized zero-shot setting; element (i, j) indicates the number of times model i ranks jth.]

Mean ranks w.r.t. u: ALE [2.3], DEVISE [2.3], SJE [4.8], LATEM [5.4], ESZSL [5.5], CMT* [5.8], SYNC [5.9], SSE [8.1], SAE [9.0], CMT [9.7], IAP [10.0], DAP [10.8], CONSE [11.5]

Mean ranks w.r.t. s: CONSE [1.7], IAP [5.2], SYNC [6.0], DAP [6.1], CMT* [6.1], CMT [6.2], ALE [6.2], SSE [8.1], ESZSL [8.4], DEVISE [8.8], LATEM [9.0], SAE [9.0], SJE [10.1]

Mean ranks w.r.t. H: ALE [2.1], DEVISE [2.6], SJE [4.7], SYNC [5.2], LATEM [5.4], ESZSL [5.5], CMT* [5.7], SSE [8.1], SAE [9.3], CMT [9.9], IAP [10.0], DAP [10.8], CONSE [11.7]

• Methods with very high seen class accuracy have low unseen class accuracy
• The harmonic mean is a good measure because it balances seen and unseen class accuracy

Top-K accuracy on ImageNet

[Figure: Top-1, Top-5 and Top-10 accuracy (in %) on the ImageNet test splits (2H, 3H, M500, M1K, M5K, L500, L1K, L5K, All) for CONSE, CMT, LATEM, ALE, DEVISE, SJE, ESZSL, SYNC and SAE.]

[1] Z. Akata, F. Perronnin, Z. Harchaoui, and C. Schmid. Label-embedding for image classification. TPAMI, 2016.

[2] Z. Akata, S. Reed, D. Walter, H. Lee, and B. Schiele. Evaluation of output embeddings for fine-grained image classification. In CVPR, 2015.

[3] S. Changpinyo, W.-L. Chao, B. Gong, and F. Sha. Synthesized classifiers for zero-shot learning. In CVPR, 2016.

[4] A. Frome, G. S. Corrado, J. Shlens, S. Bengio, J. Dean, M. A. Ranzato, and T. Mikolov. Devise: A deep visual-semantic embedding model. In NIPS, 2013.

[5] E. Kodirov, T. Xiang, and S. Gong. Semantic autoencoder for zero-shot learning. In CVPR, 2017.

[6] C. Lampert, H. Nickisch, and S. Harmeling. Attribute-based classification for zero-shot visual object categorization. TPAMI, 2013.

[7] M. Norouzi, T. Mikolov, S. Bengio, Y. Singer, J. Shlens, A. Frome, G. Corrado, and J. Dean. Zero-shot learning by convex combination of semantic embeddings. In ICLR, 2014.

[8] B. Romera-Paredes and P. H. Torr. An embarrassingly simple approach to zero-shot learning. In ICML, 2015.

[9] R. Socher, M. Ganjoo, C. D. Manning, and A. Ng. Zero-shot learning through cross-modal transfer. In NIPS, 2013.

[10] Y. Xian, Z. Akata, G. Sharma, Q. Nguyen, M. Hein, and B. Schiele. Latent embeddings for zero-shot classification. In CVPR, 2016.

[11] Y. Xian, C. H. Lampert, B. Schiele, and Z. Akata. Zero-shot learning - a comprehensive evaluation of the good, the bad and the ugly. arXiv preprint arXiv:1707.00600, 2017.

[12] Z. Zhang and V. Saligrama. Zero-shot learning via semantic similarity embedding. In ICCV, 2015.
