
Contextual models for object detection using boosted random fields

by Antonio Torralba, Kevin P. Murphy and William T. Freeman

Quick Introduction

What is this?

Now can you tell?

Belief Propagation (BP)

Network (pairwise Markov random field): observed nodes ($y_i$) and hidden nodes ($x_i$).

Statistical dependency between a hidden node and its observation, called the local evidence:

$\phi_i(x_i, y_i)$, written $\phi_i(x_i)$ for short.

Statistical dependency between neighboring hidden nodes, called the compatibility function:

$\psi_{ij}(x_i, x_j)$

Belief Propagation (BP)

Joint probability over the hidden nodes $x = (x_1, x_2, \ldots)$:

$p(x) = \frac{1}{Z} \prod_i \phi_i(x_i) \prod_{\{ij\}} \psi_{ij}(x_i, x_j)$

[Figure: graph of hidden nodes $x_i$ with their observed nodes $y_i$.]
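To make the factorization concrete, here is a minimal Python sketch (assuming numpy) that evaluates $p(x)$ on a toy three-node binary chain; the particular potentials, states, and edges are made up for illustration.

```python
import numpy as np
from itertools import product

# Toy pairwise MRF: chain x1 - x2 - x3, binary states {0, 1}.
# phi[i] is the local evidence vector for node i (it already folds in y_i);
# psi is a shared 2x2 compatibility table used on every edge.
phi = [np.array([0.7, 0.3]), np.array([0.4, 0.6]), np.array([0.5, 0.5])]
psi = np.array([[1.0, 0.5],
                [0.5, 1.0]])   # favours neighbours agreeing
edges = [(0, 1), (1, 2)]

def unnormalized_p(x):
    """prod_i phi_i(x_i) * prod_{ij} psi_ij(x_i, x_j)."""
    p = 1.0
    for i, xi in enumerate(x):
        p *= phi[i][xi]
    for i, j in edges:
        p *= psi[x[i], x[j]]
    return p

# Partition function Z by brute-force enumeration (fine for 3 nodes).
Z = sum(unnormalized_p(x) for x in product([0, 1], repeat=3))
print("p(x=(0,1,0)) =", unnormalized_p((0, 1, 0)) / Z)
```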

Belief Propagation (BP)

The belief $b$ at a node $i$ is the product of the local evidence of the node and all the messages coming in from its neighbors $N_i$:

$b_i(x_i) = k\,\phi_i(x_i) \prod_{j \in N_i} m_{ji}(x_i)$

where $k$ is a normalization constant and $b_i(x_i) \approx p(x_i \mid y)$.

Belief Propagation (BP)

Messages $m$ between hidden nodes: $m_{ji}(x_i)$ expresses how likely node $j$ thinks it is that node $i$ is in the corresponding state:

$m_{ji}(x_i) = \sum_{x_j} \phi_j(x_j)\,\psi_{ji}(x_j, x_i) \prod_{k \in N_j \setminus i} m_{kj}(x_j)$
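A minimal sketch of synchronous BP on the same kind of toy chain: messages are passed for a fixed number of rounds and then combined into beliefs. The potentials, the uniform message initialization, and the 10 iterations are illustrative choices, not from the paper.

```python
import numpy as np

phi = [np.array([0.7, 0.3]), np.array([0.4, 0.6]), np.array([0.5, 0.5])]
psi = np.array([[1.0, 0.5],
                [0.5, 1.0]])
neighbors = {0: [1], 1: [0, 2], 2: [1]}

# messages[(j, i)] holds m_ji(x_i), initialized uniformly.
messages = {(j, i): np.ones(2)
            for j in neighbors for i in neighbors[j]}

for _ in range(10):                      # plenty for a tiny chain to settle
    new = {}
    for (j, i) in messages:
        # m_ji(x_i) = sum_{x_j} phi_j(x_j) psi(x_j, x_i)
        #             * prod_{k in N(j)\i} m_kj(x_j)
        prod_in = np.ones(2)
        for k in neighbors[j]:
            if k != i:
                prod_in *= messages[(k, j)]
        m = psi.T @ (phi[j] * prod_in)
        new[(j, i)] = m / m.sum()        # normalize for numerical stability
    messages = new

for i in neighbors:
    b = phi[i].copy()
    for j in neighbors[i]:
        b *= messages[(j, i)]
    b /= b.sum()                         # k is the normalization constant
    print("node", i, "belief b_i(x) ~ p(x_i|y):", b)
```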

Conditional Random Field

Distribution of the form:

$p(x \mid y) = \frac{1}{Z} \prod_i \phi_i(x_i, y) \prod_{j \in N_i} \psi_{ij}(x_i, x_j)$

Boosted Random Field

Basic idea:

Use BP to estimate $P(x \mid y)$.

Use boosting to maximize the log likelihood of each node with respect to $\phi_i(x_i)$.

Algorithm: BP

Minimize the negative log likelihood of the training data ($y_i$). Loss function to minimize:

$J^t = \prod_i J_i^t = \prod_m \prod_i \big(b_{i,m}^t\big)^{-x^*_{i,m}} \big(1 - b_{i,m}^t\big)^{-(1 - x^*_{i,m})}$

where $x^*_{i,m} = (1 + x_{i,m})/2$ and $x_{i,m} \in \{-1, +1\}$, so $\log J^t$ is the cross-entropy of the beliefs against the training labels.
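A small numeric illustration of this objective, assuming $b$ stores $p(x_i = +1)$ per node and training image; the belief and label arrays are invented.

```python
import numpy as np

# Made-up beliefs b_{i,m} = p(x_i = +1) for 2 nodes x 3 training images,
# and ground-truth labels x_{i,m} in {-1, +1}.
b = np.array([[0.9, 0.8, 0.3],
              [0.2, 0.6, 0.7]])
x = np.array([[+1, +1, -1],
              [-1, +1, +1]])

x_star = (1 + x) / 2                     # map {-1, +1} -> {0, 1}
J = np.prod(b**(-x_star) * (1 - b)**(-(1 - x_star)))
log_J = -(x_star * np.log(b) + (1 - x_star) * np.log(1 - b)).sum()
print(J, log_J, np.log(J))               # log_J == np.log(J): cross-entropy
```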

Algorithm: BP

The belief combines the local evidence with the product of incoming messages:

$b_i^t(x_i) = k\,\phi_i^t(x_i) \prod_{j \in N_i} m_{ji}^{t-1}(x_i)$

Writing $M_i^t(x_i) = \prod_{j \in N_i} m_{ji}^{t-1}(x_i)$, the incoming messages are summarized by $M_i^t(+1)$ and $M_i^t(-1)$.

Algorithm: BP

Messages are updated from the current beliefs, for $x_i, x_j \in \{-1, +1\}$:

$m_{ji}^t(x_i) = \sum_{x_j} \psi_{ji}(x_j, x_i)\,\frac{b_j^t(x_j)}{m_{ij}^{t-1}(x_j)}$
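The division by the previous reverse message is what removes node $i$'s own contribution from $b_j$, so the full neighbor product never has to be rebuilt. A minimal sketch of one such update, with made-up inputs:

```python
import numpy as np

psi = np.array([[1.0, 0.5],
                [0.5, 1.0]])

def update_message(b_j, m_ij_prev):
    """m_ji^t(x_i) = sum_{x_j} psi(x_j, x_i) * b_j^t(x_j) / m_ij^{t-1}(x_j).

    Dividing the belief by the previous message i -> j strips out i's own
    contribution, leaving the product of all other incoming messages.
    """
    m = psi.T @ (b_j / m_ij_prev)
    return m / m.sum()

b_j = np.array([0.7, 0.3])          # current belief at node j (made up)
m_ij_prev = np.array([0.55, 0.45])  # previous message i -> j (made up)
print(update_message(b_j, m_ij_prev))
```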

Algorithm: BP

The local evidence is parameterized by a function $F$ of the input data:

$\phi_i^t(x_i) = [\,e^{F_i^t/2};\; e^{-F_i^t/2}\,]$

$F$: a function of the input data $y_i$.

Algorithm: BP

With this parameterization the belief takes a logistic form:

$b_i^t = \sigma(F_i^t + G_i^t), \qquad \sigma(u) = \frac{1}{1 + e^{-u}}$

with

$G_i^t = \log M_i^t(+1) - \log M_i^t(-1)$

so the loss becomes

$\log J_i^t = \sum_m \log\left(1 + e^{-x_{i,m}\,(F_{i,m}^t + G_{i,m}^t)}\right)$
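A sketch of this logistic form, assuming $F$ and $G$ are kept as real-valued score arrays; np.logaddexp(0, z) computes log(1 + e^z) stably. All values are invented.

```python
import numpy as np

def belief(F, G):
    """b_i^t = sigma(F_i^t + G_i^t)."""
    return 1.0 / (1.0 + np.exp(-(F + G)))

def log_J(F, G, x):
    """log J_i^t = sum_m log(1 + exp(-x_{i,m} (F + G)))."""
    return np.logaddexp(0.0, -x * (F + G)).sum()

F = np.array([2.0, -1.0, 0.5])   # boosted local-evidence scores (made up)
G = np.array([0.3, -0.2, -0.8])  # incoming-message scores (made up)
x = np.array([+1, -1, -1])       # labels
print(belief(F, G), log_J(F, G, x))
```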

Function F

$F$ is updated additively at each round:

$F_i^t(y_{i,m}) = F_i^{t-1}(y_{i,m}) + f_i^t(y_{i,m})$

Boosting! $f$ is the weak learner: weighted decision stumps,

$f_i(y) = a\,h(y) + b$

Minimization of loss L

$\log J_i^t = \sum_m \log\left(1 + e^{-x_{i,m}\,(F_{i,m}^t + G_{i,m}^t)}\right)$

Minimizing this loss over the weak learner reduces to a weighted least-squares problem:

$f^t = \arg\min_f \log J^t \simeq \arg\min_f \sum_i \sum_m w_{i,m}^t\,\big(Y_{i,m}^t - f(y_{i,m})\big)^2$

where

$w_{i,m}^t = b_{i,m}^t(+1)\,b_{i,m}^t(-1), \qquad Y_{i,m}^t = x_{i,m}\,e^{-x_{i,m}\,(F_{i,m}^{t-1} + G_{i,m}^t)}$
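A sketch of this weighted least-squares step for a single scalar feature, with the stump $h(y) = [y > \theta]$ searched by brute force over thresholds; the closed-form $(a, b)$ comes from the weighted normal equations. The data and weights are made up, and fit_stump is a hypothetical helper, not the authors' code.

```python
import numpy as np

def fit_stump(y, Y, w):
    """Fit f(y) = a * h(y) + b with h(y) = [y > theta], minimizing
    sum_m w_m (Y_m - f(y_m))^2 over theta, a, b."""
    best = None
    for theta in np.unique(y):
        h = (y > theta).astype(float)
        # Closed-form weighted least squares for (a, b) given this h:
        W, Wh = np.sum(w), np.sum(w * h)
        Swy, Swhy = np.sum(w * Y), np.sum(w * h * Y)
        denom = Wh * (W - Wh)
        if denom == 0:                     # degenerate split, skip it
            continue
        a = (W * Swhy - Wh * Swy) / denom
        b = (Swy - a * Wh) / W
        err = np.sum(w * (Y - (a * h + b)) ** 2)
        if best is None or err < best[0]:
            best = (err, theta, a, b)
    return best

rng = np.random.default_rng(0)
y = rng.normal(size=100)                            # scalar input feature
Y = np.sign(y - 0.2) + rng.normal(scale=0.1, size=100)  # working targets Y
w = np.full(100, 0.01)    # in the slides w_{i,m} = b(+1) b(-1); made up here
err, theta, a, b = fit_stump(y, Y, w)
print(theta, a, b)
```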

Local Evidence: algorithm

For t = 1..T:
  Iterate Nboost times:
    find the best basis function $h$
    update the local evidence with $F_i^t = F_i^{t-1} + f_i^t$
    update the beliefs $b_i(x_i)$
    update the weights $w_{i,m}^t = b_i^t(+1)\,b_i^t(-1)$
  Iterate NBP times:
    update the messages
    update the beliefs $b_i(x_i)$, $b_j(x_j)$
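A skeleton of this alternation between boosting rounds and BP rounds. The weak-learner fit and the message update are stubbed out with placeholders (fit_weak_learner and update_messages are hypothetical names); only the bookkeeping of F, G, b, and w mirrors the slide.

```python
import numpy as np

def fit_weak_learner(y, x, F, G):
    # Placeholder for "find the best basis function h": returns a constant
    # step toward the labels instead of a fitted stump.
    return lambda y_: 0.1 * x

def update_messages(b, G):
    # Placeholder for real message passing: nudges G toward the log-odds of
    # the mean belief.
    target = np.log(b.mean() / (1.0 - b.mean()))
    return G + 0.1 * (target - G)

def local_evidence_algorithm(y, x, T=5, n_boost=3, n_bp=3):
    F = np.zeros(len(y))                    # boosted local-evidence scores
    G = np.zeros(len(y))                    # incoming-message scores
    b = 1 / (1 + np.exp(-(F + G)))
    for t in range(T):
        for _ in range(n_boost):
            f = fit_weak_learner(y, x, F, G)   # find best basis function h
            F = F + f(y)                       # F^t = F^{t-1} + f^t
            b = 1 / (1 + np.exp(-(F + G)))     # update the beliefs
            w = b * (1 - b)                    # w = b(+1) b(-1), would weight
                                               # the next stump fit
        for _ in range(n_bp):
            G = update_messages(b, G)          # update the messages
            b = 1 / (1 + np.exp(-(F + G)))     # update the beliefs
    return b

y = np.linspace(-1, 1, 8)                  # made-up 1-D "images"
x = np.sign(y)                             # made-up labels in {-1, +1}
print(local_evidence_algorithm(y, x).round(2))
```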

Function G

By assuming that the graph is densely connected we can make the approximation that any single message has a negligible effect on a belief:

$\frac{b_j^{t-1}(x_j)}{m_{ij}^{t-1}(x_j)} \approx b_j^{t-1}(x_j)$

Now $G$ is a non-linear additive function of the beliefs:

$G_i^t = G_i(b^{t-1})$

Instead of learning this function directly, it can be learnt with an additive model of weighted regression stumps:

$G_i^t = \sum_{n=1}^{t} g_{n,i}(b^{n-1}), \qquad g_{n,i}(b_m) = a\,(b_m \cdot w) + b$

Function G

The weak learner $g$ is chosen by minimizing the loss:

$\log J_i^t(g) = \sum_m \log\left(1 + e^{-x_{i,m}\left(F_{i,m}^{t-1} + G_{i,m}^{t-1} + g^t(b_m^{t-1})\right)}\right)$

The Boosted Random Field Algorithm

For t = 1..T:
  find the best basis function $h$ for $f$
  find the best basis function $g_{n,i}^t(b_m^{t-1})$ for $G$
  compute the local evidence
  compute the compatibilities
  update the beliefs
  update the weights

Final classifier

For t = 1..T: update the local evidences $F$, update the compatibilities $G$, and compute the current beliefs.

Output classification: $x_{i,m} = \delta(b_{i,m}^t > 0.5)$

Multiclass Detection

U: dictionary of ~2000 image patches. V: the same number of image masks.

At each round t, for each class c and for each dictionary entry d there is a weak learner:

$v^d(I) = \big[\,(I \otimes U^d) \otimes V^d > 0\,\big]$

Function f

To take different object sizes into account, we first downsample the image, apply the weak learner, and then upsample and OR across the scales:

$f_{x,y,c}(I) = \bigvee_s \big[\,v^d(I_s)\,\big] \uparrow s$

which is our function for computing the local evidence.
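A sketch of the patch-and-mask weak learner together with the multiscale OR, assuming scipy's convolve2d. The patch contents, the zero threshold, the power-of-two scales, and the crude nearest-neighbor down/upsampling are all illustrative simplifications.

```python
import numpy as np
from scipy.signal import convolve2d

def weak_response(I, U_d, V_d):
    """v^d(I) = [(I (x) U_d) (x) V_d > 0]: convolve with patch U_d,
    spread the response with mask V_d, then threshold at zero."""
    r = convolve2d(I, U_d, mode="same")
    r = convolve2d(r, V_d, mode="same")
    return (r > 0).astype(float)

def f_multiscale(I, U_d, V_d, scales=(1, 2, 4)):
    """Downsample, run the weak learner, upsample, and OR the binary
    maps across scales."""
    out = np.zeros_like(I)
    for s in scales:
        Is = I[::s, ::s]                          # crude downsampling
        v = weak_response(Is, U_d, V_d)
        up = np.kron(v, np.ones((s, s)))[:I.shape[0], :I.shape[1]]
        out = np.maximum(out, up)                 # OR of {0, 1} maps
    return out

rng = np.random.default_rng(0)
I = rng.normal(size=(32, 32))
U_d = rng.normal(size=(5, 5))    # dictionary patch (made up)
V_d = np.ones((7, 7)) / 49       # mask spreading the response (made up)
print(f_multiscale(I, U_d, V_d).shape)
```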

Function g

The compatibility function has a similar form:

$g_{x,y,c}^d(b) = a^d \sum_{c'=1}^{C} \big(b_{\cdot,\cdot,c'}^{t-1} \otimes W_{c',c}^d\big)_{x,y} + b^d$

where $W$ represents a kernel gathering all the messages directed to node $(x, y, c)$.

Kernels W

Example of incoming messages. [figure]

Function G

The overall incoming-messages function is given by:

$G_{x,y,c}^t = \sum_{n=1}^{t} \sum_{c'=1}^{C} \big(b_{\cdot,\cdot,c'}^{n} \otimes W_{c',c}^{n}\big)_{x,y}, \qquad b \otimes W \overset{\text{def}}{=} \sum_{c'=1}^{C} b_{\cdot,\cdot,c'} \otimes W_{c',c}$
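A sketch of one round of this sum of belief-map convolutions, assuming scipy's convolve2d; the belief maps, kernel shapes, and class count are invented.

```python
import numpy as np
from scipy.signal import convolve2d

def G_update(beliefs, kernels):
    """G_{x,y,c} = sum_{c'} (b_{.,.,c'} (x) W_{c',c}): each class's belief
    map is convolved with a learned kernel and summed over source classes.
    Shapes: beliefs (H, W, C); kernels[c_src][c_dst] is a 2-D kernel."""
    H, Wd, C = beliefs.shape
    G = np.zeros((H, Wd, C))
    for c_dst in range(C):
        for c_src in range(C):
            G[:, :, c_dst] += convolve2d(
                beliefs[:, :, c_src], kernels[c_src][c_dst], mode="same")
    return G

rng = np.random.default_rng(0)
beliefs = rng.uniform(size=(16, 16, 3))          # 3 object classes (made up)
kernels = [[rng.normal(size=(5, 5)) for _ in range(3)] for _ in range(3)]
print(G_update(beliefs, kernels).shape)
```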

Learning…

Labeled dataset of office and street scenes, with roughly 100 images of each.
In the first 5 rounds, only the local evidence is updated.
After the 5th iteration, the compatibility functions are updated as well.
At each round, only the F and G of the single object class that most reduces the multiclass cost are updated.

Learning…

The biggest objects are detected first, because they reduce the error of all classes the fastest.

The End

Introduction

Observed: picture. Dictionary: Dog.

P(Dog | Pic)

Introduction

P(Head | Pic)
P(Tail | Pic)
P(Front Legs | Pic)
P(Back Legs | Pic)

Introduction

Comp(Head, Legs)
Comp(Head, Tail)
Comp(F. Legs, B. Legs)
Comp(Tail, Legs)

Dog!

Introduction

P(Piranha | Pic)
Comp(Piranha, Legs)

Graphical Models

Observation nodes $y_i$: each $y_i$ can be a pixel or a patch.

Graphical Models

Hidden nodes $x_i$, with states drawn from a dictionary X. Local evidence: $\phi_i(x_i, y_i)$, written $\phi_i(x_i)$ for short.

Graphical Models

Compatibility function: $\psi_{ij}(x_i, x_j)$
