Hierarchical Classification

Rongcheng Lin

Computer Science Department

Contents

Motivation, Definition & Problem

Review of SVM

Hierarchical Classification

Path-based Approaches

Regularization-based Approaches

Motivation

Classes in the real world are structured, and are often hierarchically related.

Examples: gene function prediction, document categorization, image search, …

Hierarchies or taxonomies offer a clear advantage in supporting tasks such as browsing, searching, or visualization. Examples: the International Patent Classification scheme, Yahoo! Web catalogs, …

Prior knowledge about class relationships can improve classification performance, especially for tasks with a large number of classes.


Definition and Problem

Automatically categorize data into predefined topic hierarchies or taxonomies.

Supervised learning

Structured output

DAG and Tree Structure


Problem and solution?

Definition and Problem

Incorporate the inter-class relationship (the hierarchy) into classification.

Redefine the problem

Lower-level categories are more detailed, while upper-level categories are more general: redefine the margin.

Different classification mistakes have different severity: redefine the loss function.

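Review: Binary SVM

Binary classification: $f(x) = \operatorname{sign}(w^T x + b)$, with separating hyperplane $w^T x + b = 0$, $w^T x + b > 0$ on one side and $w^T x + b < 0$ on the other.

Margin: the distance from example $x_i$ to the hyperplane is $r = \dfrac{w^T x_i + b}{\|w\|}$.

Loss function: $L(f(x), y)$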

Review: Binary SVM

General form:

$J(w) = R(w) + \sum_{i=1}^{n} L(w, x_i, y_i)$
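As one concrete instance of the general form, the standard soft-margin SVM takes $R(w) = \tfrac{1}{2}\|w\|^2$ and a hinge loss scaled by a constant $C$. The sketch below evaluates that objective in plain Python; the function names, the constant `C`, and the toy data are illustrative assumptions, not taken from the slides.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def svm_objective(w, b, X, y, C=1.0):
    """J(w) = 0.5 * ||w||^2 + C * sum_i max(0, 1 - y_i * (w^T x_i + b))."""
    regularizer = 0.5 * dot(w, w)
    loss = sum(max(0.0, 1.0 - yi * (dot(w, xi) + b)) for xi, yi in zip(X, y))
    return regularizer + C * loss

# Toy data (hypothetical): the third point lies on the wrong side.
X = [[2.0, 0.0], [-2.0, 0.0], [0.5, 0.0]]
y = [1.0, -1.0, -1.0]
w = [1.0, 0.0]
objective = svm_objective(w, 0.0, X, y)
print(objective)
```

Only the third example contributes to the loss term here; the first two satisfy the margin and add nothing.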

Review: Multiclass SVM

1) One-vs-the-rest

2) Crammer & Singer (pairwise)

Review: Multiclass SVM

Dedicated loss function

Margin: $\gamma_i(w) = w_{y_i}^T x_i - w_k^T x_i$ for $k \neq y_i$

Review: Hinge Loss Function

The more you violate the margin, the higher the penalty.

Loss Function
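A minimal sketch of a multiclass hinge loss, assuming the margin is measured against the highest-scoring wrong class and the required margin is 1. The function name and the example scores are hypothetical.

```python
def multiclass_hinge(scores, y):
    """Hinge loss for one example.

    scores[k] stands for w_k^T x; y is the index of the correct class.
    The margin is the correct score minus the best rival score.
    """
    best_rival = max(s for k, s in enumerate(scores) if k != y)
    margin = scores[y] - best_rival
    return max(0.0, 1.0 - margin)

satisfied = multiclass_hinge([2.0, 0.5, -1.0], 0)   # margin 1.5, no penalty
inside = multiclass_hinge([1.0, 0.5, 0.0], 0)       # margin 0.5, small penalty
wrong = multiclass_hinge([1.0, 0.5, 2.0], 0)        # margin -1.0, larger penalty
print(satisfied, inside, wrong)
```

The penalty grows linearly as the margin shrinks below 1, which is the behaviour the slide describes.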

Hierarchical Classifiers

Path-based Approaches

Large Margin Hierarchical Classification

Hierarchical Document Categorization with Support Vector Machine

On Large Margin Hierarchical Classification with Multiple Paths

Regularization-based Approaches

Tree-Guided Group Lasso for Multi-task Regression

Hierarchical Multitask Structured Output Learning for Large-Scale Segmentation

Tree Distance

A given hierarchy induces a metric over the set of classes: the tree distance, or tree-induced error.

$\gamma(y, \hat{y})$ is defined to be the number of edges along the (unique) path from $y$ to $\hat{y}$.

[Figure: example tree in which $\gamma(y, \hat{y}) = 4$]
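The tree distance can be computed by walking both labels up to their lowest common ancestor and adding the two path lengths. The sketch below assumes the hierarchy is stored as a child-to-parent dictionary; the taxonomy and node names are made up for illustration.

```python
def path_to_root(node, parent):
    """List of nodes from `node` up to and including the root."""
    path = [node]
    while parent[node] is not None:
        node = parent[node]
        path.append(node)
    return path

def tree_distance(y, y_hat, parent):
    """Number of edges on the unique path between y and y_hat."""
    depth_from_y = {v: d for d, v in enumerate(path_to_root(y, parent))}
    for d_hat, v in enumerate(path_to_root(y_hat, parent)):
        if v in depth_from_y:              # first shared ancestor
            return depth_from_y[v] + d_hat
    raise ValueError("nodes are not in the same tree")

# Hypothetical taxonomy: root -> {a, b}, a -> {a1, a2}, b -> {b1}
parent = {"root": None, "a": "root", "b": "root",
          "a1": "a", "a2": "a", "b1": "b"}
print(tree_distance("a1", "b1", parent))
print(tree_distance("a1", "a2", parent))
```

Confusing two siblings costs less than confusing two leaves from different subtrees, which is what makes the distance useful as a measure of mistake severity.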

Tree Distance

[Figure: example tree with nodes 5, 6, $\hat{y}$, 8, 9]

$D(y, \hat{y}) = f_4 C_4 + f_1 C_1 + f_3 C_3$

Loss Functions

Zero-one loss

Hinge loss

Hierarchical hinge loss

[Figure: the three losses plotted against $f_y(x) - f_{\hat{y}}(x)$, with marks at 1 and $D(\hat{y}, y)$]
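The three losses can be compared as functions of the score difference $m = f_y(x) - f_{\hat{y}}(x)$. The sketch below is one plausible reading: it assumes the hierarchical hinge loss simply replaces the required margin of 1 with the distance $D(\hat{y}, y)$. The function names and that exact form are assumptions, not something the slides state.

```python
def zero_one_loss(m):
    """1 if the wrong label scores at least as high as the correct one."""
    return 1.0 if m <= 0 else 0.0   # ties counted as errors (a convention)

def hinge_loss(m):
    """Linear penalty whenever the score difference is below 1."""
    return max(0.0, 1.0 - m)

def hierarchical_hinge_loss(m, d):
    """Assumed form: the required margin is the distance d = D(y_hat, y)."""
    return max(0.0, d - m)

for m in (-1.0, 0.5, 5.0):
    print(m, zero_one_loss(m), hinge_loss(m), hierarchical_hinge_loss(m, 4.0))
```

Under this reading, a prediction that is far from the correct label in the hierarchy must be beaten by a larger score difference before the loss reaches zero.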

Path-based Approaches

Path-based approaches try to find the most likely path from the root.

Only the parameters of misclassified nodes in the tree need to be updated.

Large margin hierarchical classifier

$f_y(x) - f_{\hat{y}}(x) \ge \sqrt{\gamma(y, \hat{y})}$

Note: $y$ is the correct label and $\hat{y} \neq y$.

Training Algorithm

[Equation residue: $f_y(x) - f_{\hat{y}}(x)$, the constant 1, and $\Delta(y_i, y)$]
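A minimal sketch of an online, path-based update, offered as an illustration rather than the exact algorithm from the cited papers. It assumes each non-root node holds a weight vector, a label's score is the sum of node scores along its root-to-label path, and the step size is chosen so that the margin $\sqrt{\gamma(y, \hat{y})}$ is met exactly on the current example. Only nodes on which the two paths differ are touched. All names and the toy taxonomy are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def path_nodes(label, parent):
    """Non-root nodes on the path from the root to `label`."""
    nodes = set()
    while parent[label] is not None:
        nodes.add(label)
        label = parent[label]
    return nodes

def score(label, x, w, parent):
    """f_label(x): sum of node scores along the root-to-label path."""
    return sum(dot(w[v], x) for v in path_nodes(label, parent))

def predict(x, w, parent, labels):
    return max(labels, key=lambda c: score(c, x, w, parent))

def update(x, y, w, parent, labels):
    """One online step; assumes x is not the zero vector."""
    y_hat = predict(x, w, parent, labels)
    if y_hat == y:
        return y_hat                      # correct prediction: no update
    p_y, p_hat = path_nodes(y, parent), path_nodes(y_hat, parent)
    gamma = len(p_y ^ p_hat)              # tree distance between y and y_hat
    loss = max(0.0, score(y_hat, x, w, parent)
               - score(y, x, w, parent) + math.sqrt(gamma))
    tau = loss / (gamma * dot(x, x))      # assumed step size
    for v in p_y - p_hat:                 # raise nodes only on the true path
        w[v] = [wi + tau * xi for wi, xi in zip(w[v], x)]
    for v in p_hat - p_y:                 # lower nodes only on the wrong path
        w[v] = [wi - tau * xi for wi, xi in zip(w[v], x)]
    return y_hat

parent = {"root": None, "a": "root", "b": "root",
          "a1": "a", "a2": "a", "b1": "b"}
labels = ["a1", "a2", "b1"]
w = {v: [0.0, 0.0] for v in parent if parent[v] is not None}
x = [1.0, 0.0]
first_guess = update(x, "b1", w, parent, labels)
print(first_guess, predict(x, w, parent, labels))
```

Nodes shared by both paths cancel out of the score difference, so leaving them unchanged loses nothing. In the example, node `a2` is on neither path and keeps its initial weights.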

Regularization-based Approaches

K individual classification tasks

Use an additional regularization term to penalize the disagreement between the individual models.

Multitask Learning

Multiple tasks are learned simultaneously to capture their intrinsic relatedness.

L1-Norm, L2-Norm

Penalize model complexity to avoid overfitting

The L1 norm gives sparser estimates than the L2 norm.
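The sparsity difference shows up already in one dimension. Minimizing $\tfrac{1}{2}(w - z)^2 + \lambda |w|$ gives the soft-thresholding rule, which sets small coefficients exactly to zero, while minimizing $\tfrac{1}{2}(w - z)^2 + \tfrac{\lambda}{2} w^2$ only rescales them. The sketch below applies both closed-form solutions to a made-up coefficient vector; the names and values are illustrative.

```python
def l1_shrink(z, lam):
    """Minimizer of 0.5*(w - z)^2 + lam*|w| (soft thresholding)."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def l2_shrink(z, lam):
    """Minimizer of 0.5*(w - z)^2 + 0.5*lam*w^2 (uniform rescaling)."""
    return z / (1.0 + lam)

z = [3.0, 0.4, -0.2, -1.5]            # hypothetical unregularized estimates
l1 = [l1_shrink(v, 0.5) for v in z]
l2 = [l2_shrink(v, 0.5) for v in z]
print(l1)
print(l2)
```

The two small entries vanish under the L1 penalty but remain nonzero under the L2 penalty.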

Group Lasso and Sparse Group Lasso
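For reference, the two penalties in their commonly used forms, where $\mathcal{G}$ is a partition of the coefficients into groups and $w_g$ is the sub-vector for group $g$. The symbols $\lambda$, $\lambda_1$, $\lambda_2$ are notation assumed here, and per-group weights are omitted for brevity.

```latex
\Omega_{\text{group}}(w) = \lambda \sum_{g \in \mathcal{G}} \| w_g \|_2
\qquad
\Omega_{\text{sparse group}}(w) = \lambda_1 \sum_{g \in \mathcal{G}} \| w_g \|_2 + \lambda_2 \| w \|_1
```

The unsquared $\ell_2$ norm drives whole groups to zero together; the extra $\ell_1$ term also allows individual zeros inside the groups that survive.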

HMTL: Hierarchical Multitask Learning

A trade-off parameter determines the contribution of regularization from the origin vs. the parent node's parameters (i.e., the strength of coupling between the node and its parent).
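One way such a trade-off could be written is as a convex combination of two squared distances: from the node's weights to the origin and from the node's weights to its parent's weights. The sketch below assumes that form; the function name, the parameter name `coupling`, and the exact weighting are assumptions, since the slide's formula is not available.

```python
def hmtl_regularizer(w, w_parent, coupling):
    """Assumed form: 0.5*(1 - c)*||w||^2 + 0.5*c*||w - w_parent||^2.

    coupling = 0 regularizes toward the origin only (independent node);
    coupling = 1 regularizes toward the parent only (fully coupled).
    """
    to_origin = sum(a * a for a in w)
    to_parent = sum((a - b) ** 2 for a, b in zip(w, w_parent))
    return 0.5 * (1.0 - coupling) * to_origin + 0.5 * coupling * to_parent

w_node = [1.0, 2.0]        # hypothetical node weights
w_par = [1.0, 0.0]         # hypothetical parent weights
for c in (0.0, 0.5, 1.0):
    print(c, hmtl_regularizer(w_node, w_par, c))
```

With a top-down training order, the parent's weights would be fixed before each child is fitted, so every node solves an ordinary regularized problem with a shifted center.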

Tree-Guided Group Lasso for Multi-Task Regression with Structured Sparsity

Original Approach:

New Approach:

Tree-Guided Group Lasso for Multi-Task Regression with Structured Sparsity

Each leaf node is a class.

Each inner node is a group of classes.

Tree-Guided Group Lasso
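A minimal sketch of a tree-structured group penalty for a single feature, assuming every tree node contributes a weighted $\ell_2$ norm over the coefficients of the classes beneath it. The tree, the node weights, and the coefficient values are made up; the published method uses its own weighting scheme, which is not reproduced here.

```python
import math

def tree_group_penalty(beta, groups):
    """Sum over tree nodes of weight * L2 norm of that node's coefficients.

    beta:   dict mapping class name -> coefficient of one feature
    groups: list of (weight, [class names under the node])
    """
    return sum(weight * math.sqrt(sum(beta[c] ** 2 for c in members))
               for weight, members in groups)

# Hypothetical tree: root -> {inner, c3}, inner -> {c1, c2}
groups = [
    (1.0, ["c1"]), (1.0, ["c2"]), (1.0, ["c3"]),   # leaves: single classes
    (0.5, ["c1", "c2"]),                           # inner node: class group
    (0.5, ["c1", "c2", "c3"]),                     # root: all classes
]
beta = {"c1": 3.0, "c2": 4.0, "c3": 0.0}
penalty = tree_group_penalty(beta, groups)
print(penalty)
```

Because overlapping groups share coefficients, classes that sit under the same inner node are encouraged to select or drop a feature together.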


Advantages and Drawbacks

Assume the children are good: Tree-Guided Group Lasso

Assume the parent is good: HMTL

Assume neither is good: path-based

It depends!
