Human-Computer Interaction
Tracking

Hanyang University

Jong-Il Park

Tracking in computer vision

Definition
  The problem of generating an inference about the motion of an object, given a sequence of images
Applications
  Motion capture
  Recognition from motion
  Surveillance
    Who is doing what? E.g. security, HCI (Kinect…)

Tracking

Establish the state of an object using a time sequence
  The state could be: position; position + velocity; position + velocity + acceleration;
  or something more complex, e.g. all joint angles for a person
Biggest problem: data association
  Which image pixels are informative, and which are not?
Key ideas
  Tracking by detection
    If we know what an object looks like, that selects the pixels to use
  Tracking through flow
    If we know how an object moves, that selects the pixels to use

Appearance vs. Flow

Tracking by Detection

Assume
  a very reliable detector (e.g. faces; backs of heads)
  detections that are well spaced in images (or have distinctive properties)
    e.g. news anchors; heads in public
Link detections across time
  only one: easy
  multiple: weighted bipartite matching
  but what if one is missing?
Better: create abstract tracks
  link detections to tracks
  create tracks and reap tracks as required
  clean up spacetime paths
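
The linking step for multiple detections can be sketched as a small assignment problem. This is a minimal illustration, not the method used in the lecture: the function name `link_detections`, the Euclidean cost, and the `max_dist` penalty for leaving a detection unmatched are all assumptions made for the example. It uses brute force over assignments, so it only suits a handful of detections; a real system would use a proper weighted bipartite matching algorithm.

```python
from itertools import permutations
from math import hypot


def link_detections(prev, curr, max_dist=50.0):
    """Match previous-frame detections to current-frame detections.

    prev, curr: lists of (x, y) detection centres.
    Returns {prev_index: curr_index} for the cheapest assignment.
    Leaving a previous detection unmatched costs max_dist (an assumed penalty),
    which is how a missing detection is tolerated.
    """
    n_prev, n_curr = len(prev), len(curr)
    # Pad with None so that any previous detection may stay unmatched.
    slots = list(range(n_curr)) + [None] * n_prev
    best_cost, best = float("inf"), {}
    for perm in permutations(slots, n_prev):
        cost = 0.0
        for i, j in enumerate(perm):
            if j is None:
                cost += max_dist
            else:
                cost += hypot(prev[i][0] - curr[j][0], prev[i][1] - curr[j][1])
        if cost < best_cost:
            best_cost = cost
            best = {i: j for i, j in enumerate(perm) if j is not None}
    return best
```

Current detections that end up unmatched would start new tracks, and previous detections that stay unmatched for several frames would be reaped, which is the bookkeeping that the abstract-track idea adds.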

Tracking by Known Appearance

Even if we don't have a detector
  Know the rectangle in image (n)
  Want to find the corresponding rectangle in image (n+1)
  Search over nearby rectangles to find the one that minimizes the SSD (sum of squared differences) error
    $$\sum_{i,j} \left( R^{(n)}_{ij} - R^{(n+1)}_{ij} \right)^2$$
  where the sum is over the pixels in the rectangle
Application
  stabilize players in TV sport
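
The search over nearby rectangles can be sketched as follows. This is a minimal example under stated assumptions: images are plain lists of rows of grey values, and the names `ssd`, `track_rectangle` and the `radius` parameter are hypothetical, chosen only for the illustration.

```python
def ssd(image, template, top, left):
    """Sum of squared differences between template and the image patch at (top, left)."""
    total = 0
    for r, row in enumerate(template):
        for c, value in enumerate(row):
            diff = image[top + r][left + c] - value
            total += diff * diff
    return total


def track_rectangle(next_image, template, top, left, radius=2):
    """Search within `radius` pixels of (top, left) for the lowest-SSD position.

    template is the rectangle cut from image (n); next_image is image (n+1).
    Returns (ssd_error, top, left) of the best candidate rectangle.
    """
    h, w = len(template), len(template[0])
    rows, cols = len(next_image), len(next_image[0])
    best = None
    # Clamp the search window so every candidate rectangle lies inside the image.
    for t in range(max(0, top - radius), min(rows - h, top + radius) + 1):
        for l in range(max(0, left - radius), min(cols - w, left + radius) + 1):
            score = ssd(next_image, template, t, l)
            if best is None or score < best[0]:
                best = (score, t, l)
    return best
```

Calling `track_rectangle` once per frame, with the previous result as the new starting position, gives a simple frame-to-frame tracker.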

Building Tracks

Start at scattered points in image 1
  perhaps corner detector responses
For each point in image (n), compute its position in image (n+1)
  as in the previous slide
Now check the tracks
  the patch in image (n+1) should look like an affine transform of the patch in image 1
Prune bad tracks

What if the patch deforms?

E.g. a football player's jersey
  Colors are "similar", but SSD won't work
Idea: the patch histogram is stable
To track, repeat:
  predict the location of the new patch
  search nearby for the patch whose histogram best matches the original
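
The histogram search can be sketched as below. This is only an illustration under assumptions: grey-level images stored as lists of rows, a coarse 4-bin histogram, and histogram intersection as the similarity score. The slides do not say which similarity measure is used, and all function names are hypothetical.

```python
def histogram(image, top, left, height, width, bins=4, max_value=256):
    """Normalised intensity histogram of the patch at (top, left)."""
    counts = [0] * bins
    for r in range(top, top + height):
        for c in range(left, left + width):
            counts[image[r][c] * bins // max_value] += 1
    total = height * width
    return [n / total for n in counts]


def intersection(h1, h2):
    """Histogram intersection: 1.0 for identical normalised histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def track_by_histogram(image, reference, predicted, size, radius=2, bins=4):
    """Search near the predicted (top, left) for the best-matching histogram.

    reference is the histogram of the original patch.
    Returns (score, top, left) of the best candidate.
    """
    top0, left0 = predicted
    height, width = size
    rows, cols = len(image), len(image[0])
    best = None
    for t in range(max(0, top0 - radius), min(rows - height, top0 + radius) + 1):
        for l in range(max(0, left0 - radius), min(cols - width, left0 + radius) + 1):
            score = intersection(reference, histogram(image, t, l, height, width, bins))
            if best is None or score > best[0]:
                best = (score, t, l)
    return best
```

Because the histogram ignores where each pixel sits inside the patch, a patch whose pixels have been rearranged still matches perfectly, whereas its SSD against the original would be large.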

When are motions easy?

Current procedure
  predict the state
  obtain a measurement by searching near the prediction
  correct the state
Easy when the object is close to where you expect it to be
  e.g. the object is guaranteed to move only a little
Large motions can be easy when they are "predictable"
  e.g. ballistic motion
  e.g. constant velocity
We need a theory to fuse this procedure with a motion model

General Model

We assume there are moving objects, which have an underlying state X
There are measurements Y, some of which are functions of this state
There is a clock
  at each tick, the state changes
  at each tick, we get a new observation
Examples
  the object is a ball, the state is 3D position + velocity, the measurements are stereo pairs
  the object is a person, the state is the body configuration, the measurements are frames, the clock is in the camera (30 fps)

Tracking as an Abstract Inference Problem

Internal state: X
Measurement: Y
Major steps:

1. Prediction

2. Data association: determining which data are informative

3. Correction

Independence Assumption

Only the immediate past matters
  * X: Markov process
Measurements depend only on the current state
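
These two assumptions are commonly written as follows. The indexing, with $X_i$ the state and $Y_i$ the measurement at tick $i$, is an assumed notation rather than one taken from the slides.

```latex
% Only the immediate past matters (X is a Markov process)
P(X_i \mid X_1, \ldots, X_{i-1}) = P(X_i \mid X_{i-1})

% Given the current state, its measurement is independent of the other measurements
P(Y_i, Y_j, \ldots, Y_k \mid X_i) = P(Y_i \mid X_i)\, P(Y_j, \ldots, Y_k \mid X_i)
```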

Bayes Rule

Prediction
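
Under the Markov assumption, the prediction step is usually written as the integral below. The notation ($X_i$ for the state, $y_0, \ldots, y_{i-1}$ for the measurements seen so far) is assumed for this sketch.

```latex
P(X_i \mid y_0, \ldots, y_{i-1})
  = \int P(X_i \mid X_{i-1})\, P(X_{i-1} \mid y_0, \ldots, y_{i-1})\, dX_{i-1}
```

The first factor is the motion model and the second is the corrected estimate from the previous tick.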

Correction
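
Applying Bayes' rule to the predicted distribution gives the usual form of the correction step. As before, the notation is an assumption made for this sketch.

```latex
P(X_i \mid y_0, \ldots, y_i)
  = \frac{P(y_i \mid X_i)\, P(X_i \mid y_0, \ldots, y_{i-1})}
         {\int P(y_i \mid X_i)\, P(X_i \mid y_0, \ldots, y_{i-1})\, dX_i}
```

The numerator multiplies the measurement likelihood by the prediction; the denominator only normalises the result.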

Linear Dynamic Model

Model
  D: transition matrix
  M: measurement matrix
The Kalman filter is well suited to this type of tracking
Read Ch. 11.3, Forsyth & Ponce, Computer Vision, 2012.
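
A linear dynamic model with Gaussian noise is typically written as below, using D and M as defined above. The noise covariance symbols $\Sigma_d$ and $\Sigma_m$ and the constant-velocity example are assumptions added for illustration.

```latex
% State evolution and measurement, both linear with Gaussian noise
x_i \sim N(D\, x_{i-1},\ \Sigma_d), \qquad y_i \sim N(M\, x_i,\ \Sigma_m)

% Example: constant velocity, state x = (position, velocity)^T, position measured
D = \begin{pmatrix} 1 & \Delta t \\ 0 & 1 \end{pmatrix}, \qquad
M = \begin{pmatrix} 1 & 0 \end{pmatrix}
```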

Kalman filter
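
One predict-and-correct cycle of a Kalman filter can be sketched for a scalar state. This is a minimal example, not the derivation from the lecture: `d` and `m` play the roles of the transition and measurement matrices in one dimension, and the noise variances `process_var` and `meas_var` are assumed parameters.

```python
def kalman_step(mean, var, measurement, d=1.0, m=1.0,
                process_var=0.01, meas_var=1.0):
    """One Kalman filter cycle for a scalar state.

    mean, var: current estimate of the state and its variance.
    Returns the corrected (mean, var) after seeing `measurement`.
    """
    # Prediction: push the estimate through the linear dynamics.
    pred_mean = d * mean
    pred_var = d * d * var + process_var
    # Correction: blend prediction and measurement using the Kalman gain.
    gain = pred_var * m / (m * m * pred_var + meas_var)
    new_mean = pred_mean + gain * (measurement - m * pred_mean)
    new_var = (1.0 - gain * m) * pred_var
    return new_mean, new_var
```

Calling `kalman_step` once per measurement, feeding each result into the next call, tracks the state over time; the variance shrinks as measurements accumulate and grows with `process_var`.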

E.g. Kalman filter

[Figure legend] *: predicted; x: measured; +: estimated; o: true; bar: 3

Nonlinear Dynamics

Problems
  the distributions tend not to be normal
  they may not be Gaussian (quite common in vision problems)
  they may have multiple, well-separated modes

E.g. Nonlinear model

E.g. Nonlinear model: Time Evolution & pdf

Difficulties in Complicated Models

In order to maintain the pdf, we need to
  handle multiple peaks
  handle a multi-dimensional state vector
Particle filtering is a useful approach.
How?

Sampled Representation

A representation of a pdf is
  NOT for representing the pdf itself
  BUT for computing some expectation
Computing an expectation using a sampled representation

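Monte Carlo Integration

$$\int f(u)\, p(u)\, du = \int f(u)\, \frac{p(u)}{q(u)}\, q(u)\, du \approx \frac{1}{N} \sum_{i=1}^{N} f(u^i)\, w^i, \qquad u^i \sim q(u)$$
where
  $q(u)$: sampling distribution
  $w^i = p(u^i) / q(u^i)$: weight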
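
Weighted Monte Carlo integration of this kind (importance sampling) can be sketched in a few lines. The example below is a minimal illustration with assumed names: `p` is the target density, `q_sample` and `q_pdf` describe the sampling distribution, and the estimate is divided by the sum of the weights (a self-normalised variant) so that `p` need not be normalised.

```python
import math
import random


def standard_normal_pdf(u):
    """Density of N(0, 1), used here as an example target distribution."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def importance_expectation(f, p, q_sample, q_pdf, n=10000, rng=random):
    """Estimate the expectation of f under p using samples drawn from q.

    Each sample u gets the weight w = p(u) / q(u).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for _ in range(n):
        u = q_sample(rng)
        w = p(u) / q_pdf(u)
        weighted_sum += w * f(u)
        weight_total += w
    return weighted_sum / weight_total
```

For example, with `q` uniform on [-5, 5] (so `q_pdf` is the constant 0.1) and `f(u) = u * u`, the estimate approaches the variance of the standard normal, which is 1.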

Obtaining a sampled representation of a probability distribution

Computing an expectation using a set of samples

Transformation

Transforming a sampled representation of a prior into that of a posterior

Naive particle filter

E.g. Naive particle filter

Poor results due to sample impoverishment!
  Most of the weights get small very fast
The way to get accurate estimates is to have samples lie where p is likely to be large

Overcoming Sample Impoverishment

Equivalent to maintaining a set of good particles
Then, how?
Resampling the prior
  1. Expand the sample set non-uniformly using the weights
     Form a new set of samples consisting of a union of N_k copies of (s_k, 1) for each k.
  2. Subsample the sample set uniformly

Practical particle filter

Initialization, Prediction, Correction (these three steps alone form the naive particle filter)
Resampling:
  1. Normalise the weights so that they sum to 1
  2. Compute the variance of the normalised weights.
     If (var > Th), construct a new set of samples by drawing, with replacement, N samples from the old set, using the weights as the probability that a sample will be drawn.
     The weight of each sample is now 1/N.
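
One cycle of a particle filter with resampling can be sketched for a scalar state. This is an illustration under assumptions, not the lecture's implementation: the motion model is a random walk with standard deviation `drift_std`, the likelihood is Gaussian with standard deviation `meas_std`, and `var_threshold` stands in for the threshold Th. The sketch assumes at least one particle has non-zero likelihood.

```python
import math
import random


def particle_filter_step(particles, weights, measurement, rng,
                         drift_std=1.0, meas_std=1.0, var_threshold=0.0):
    """Prediction, correction and conditional resampling for a scalar state."""
    n = len(particles)
    # Prediction: move each particle with the (assumed) random-walk motion model.
    particles = [x + rng.gauss(0.0, drift_std) for x in particles]
    # Correction: reweight each particle by the (assumed) Gaussian likelihood.
    weights = [w * math.exp(-0.5 * ((measurement - x) / meas_std) ** 2)
               for w, x in zip(weights, particles)]
    # Resampling, step 1: normalise the weights so that they sum to 1.
    total = sum(weights)
    weights = [w / total for w in weights]
    # Resampling, step 2: resample if the weight variance exceeds the threshold.
    mean_w = 1.0 / n
    var = sum((w - mean_w) ** 2 for w in weights) / n
    if var > var_threshold:
        particles = rng.choices(particles, weights=weights, k=n)
        weights = [1.0 / n] * n
    return particles, weights
```

The state estimate after each step is the weighted mean `sum(w * x for w, x in zip(weights, particles))`. Setting `var_threshold` to a very large value disables resampling and reduces the sketch to the naive filter.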

Consequence of resampling

Particles that reflect the state rather well usually reappear in the resampled set
Many particles lie within one standard deviation of the mean of the posterior

Particle filters

Different community -> different name
  statistics: particle filter
  AI: survival of the fittest
  computer vision: condensation

Tracking people

Essential components
  Motion model
  Likelihood model
    P(image features | person present at given configuration)
Motion model
  Strong motion model: markers, angles,…
  Weak motion model: drift model
Likelihood model
  SSD, edges, other features,…

Likelihood computation

[Figure labels] Boundary information; non-background information; sample points

Annealed particle filter

To overcome the problems of
  High dimensionality of the state
  Many local peaks in the likelihood
Annealing
  Starts from smooth approximations to the likelihood and moves to less smooth approximations
  Repeats weighting and resampling
[Deutscher, Blake, Reid, CVPR 2000]
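
The annealing idea can be sketched in the spirit of the description above, without claiming to reproduce the cited algorithm. In this assumed version, the likelihood is smoothed by raising it to a power `beta` < 1, the exponents in `betas` increase towards 1, and each layer weights, resamples, and then diffuses the particles with shrinking noise. The schedules and the example likelihood are hypothetical.

```python
import math
import random


def gaussian_likelihood(x, centre=4.0, width=0.5):
    """Hypothetical example likelihood with a single peak at `centre`."""
    return math.exp(-0.5 * ((x - centre) / width) ** 2)


def annealed_update(particles, likelihood, rng,
                    betas=(0.1, 0.3, 1.0), noise_stds=(1.0, 0.5, 0.25)):
    """Run several weighting-and-resampling layers on one frame.

    Early layers use a small exponent, which flattens the likelihood;
    later layers use exponents closer to 1, which sharpen it again.
    Assumes at least one particle has non-zero likelihood in each layer.
    """
    n = len(particles)
    for beta, noise_std in zip(betas, noise_stds):
        # Weight with the smoothed likelihood for this layer.
        weights = [likelihood(x) ** beta for x in particles]
        # Resample in proportion to the weights.
        particles = rng.choices(particles, weights=weights, k=n)
        # Diffuse so the next layer can explore around the survivors.
        particles = [x + rng.gauss(0.0, noise_std) for x in particles]
    return particles
```

Starting from particles spread widely over the state space, the early layers keep broad support so that a narrow peak is not missed, and the later layers concentrate the particles around it.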

Finding people

As an initialization of a people tracker
  There is no person tracker that represents the configuration of the body and can start automatically

Known approaches

1. Template matching

2. Finding faces

3. Search over correspondence

* Challenging topic: learning models from data