Object Detection and Tracking Mike Knowles 11th January 2005

http://postgrad.eee.bham.ac.uk/knowlm

Introduction

Goal: to detect and track objects moving independently of the background

Two situations to be considered:
    Static Background
    Moving Background

Applications of Motion Tracking

Control Applications
    Object Avoidance
    Automatic Guidance
    Head Tracking for Video Conferencing

Surveillance/Monitoring Applications
    Security Cameras
    Traffic Monitoring
    People Counting

My Work

Started by tracking moving objects in a static scene

Develop a statistical model of the background

Mark all regions that do not conform to the model as moving objects

My Work

Now working on object detection and classification from a moving camera

Current focus is motion compensated background filtering

Determine the motion of the background and apply it to the model.

Detecting moving objects in a static scene

Simplest method:
    Subtract consecutive frames.
    Ideally this will leave only moving objects.
    This is not an ideal world…
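
A minimal sketch of consecutive-frame subtraction, assuming grey-level frames stored as nested lists and a hypothetical change threshold of 25 (neither is specified in the slides). The toy example moves a uniformly bright square one pixel to the right, which hints at why the ideal case rarely holds: only the leading and trailing edges of the square change value.

```python
def frame_difference_mask(prev_frame, curr_frame, threshold=25):
    """Return a binary mask: 1 where the grey level changed by more than threshold."""
    return [
        [1 if abs(c - p) > threshold else 0 for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev_frame, curr_frame)
    ]


def make_frame(left):
    """6x8 dark frame (value 10) with a uniform 4x4 bright square (value 200)."""
    return [
        [200 if 1 <= r <= 4 and left <= c < left + 4 else 10 for c in range(8)]
        for r in range(6)
    ]


prev_frame = make_frame(1)   # square occupies columns 1..4
curr_frame = make_frame(2)   # square has moved one pixel right
mask = frame_difference_mask(prev_frame, curr_frame)

for row in mask:
    print("".join("#" if m else "." for m in row))
```

The interior of the square is unchanged between the two frames, so it is missing from the mask even though it belongs to the moving object.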

Using a background model

Lack of texture in objects means that incomplete object masks are produced.

In order to obtain complete object masks we must have a model of the background as a whole.

Adapting to variable backgrounds

In order to cope with varying backgrounds it is necessary to make the model dynamic

A statistical system is used to update the model over time

Background Filtering

My algorithm is based on:
    “Learning Patterns of Activity using Real-Time Tracking”
    C. Stauffer and W.E.L. Grimson. IEEE Trans. on Pattern Analysis and Machine Intelligence, August 2000.

The history of each pixel is modelled by a sequence of Gaussian distributions

Multi-dimensional Gaussian Distributions

Described mathematically as:

\[
\eta(X_t, \mu_t, \Sigma) = \frac{1}{(2\pi)^{n/2}\,|\Sigma|^{1/2}} \; e^{-\frac{1}{2}(X_t - \mu_t)^T \Sigma^{-1} (X_t - \mu_t)}
\]

More easily visualised as:

(2-Dimensional)
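
As a worked illustration, the multi-dimensional Gaussian density can be evaluated directly. This sketch assumes a diagonal covariance matrix (one variance per dimension), which keeps the determinant and the inverse trivial; the function name and the example values are illustrative only.

```python
import math


def gaussian_density(x, mean, variances):
    """Density of an n-dimensional Gaussian with a diagonal covariance matrix.

    x, mean and variances are equal-length sequences; variances holds the
    diagonal entries of the covariance matrix.
    """
    n = len(x)
    det = math.prod(variances)                       # determinant of a diagonal matrix
    mahalanobis_sq = sum((xi - mi) ** 2 / vi
                         for xi, mi, vi in zip(x, mean, variances))
    normaliser = (2.0 * math.pi) ** (n / 2.0) * math.sqrt(det)
    return math.exp(-0.5 * mahalanobis_sq) / normaliser


# 2-dimensional example: the density is highest at the mean and falls away from it.
at_mean = gaussian_density([0.0, 0.0], [0.0, 0.0], [1.0, 1.0])
off_mean = gaussian_density([2.0, 0.0], [0.0, 0.0], [1.0, 1.0])
print(at_mean, off_mean)
```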

Simplifying…

Calculating the full Gaussian for every pixel in the frame is very, very slow

Therefore I use a linear approximation

How do we use this to represent a pixel?

Stauffer and Grimson suggest using a fixed number of Gaussians for each pixel

This was found to be inefficient, so the number of Gaussians used to represent each pixel is variable

Weights

Each Gaussian carries a weight value

This weight is a measure of how well the Gaussian represents the history of the pixel

If a pixel is found to match a Gaussian then the weight is increased, and vice versa

If the weight drops below a threshold then that Gaussian is eliminated

Matching

Each incoming pixel value must be checked against all the Gaussians at that location

If a match is found then the value of that Gaussian is updated

If there is no match then a new Gaussian is created with a low weight

Updating

If a Gaussian matches a pixel, then the value of that Gaussian is updated using the current value

The rate of learning is greater in the early stages when the model is being formed
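
The weighting, matching and updating rules can be sketched for a single grey-level pixel as follows. This is an illustration under stated assumptions, not the implementation used in this work: the 2.5 standard deviation match test follows Stauffer and Grimson, while the class name `PixelModel`, every parameter value, the variance floor, and the use of a 1/count learning rate to learn faster early on are hypothetical choices. The linear approximation mentioned earlier is not reproduced.

```python
class PixelModel:
    """Variable-length set of 1-D Gaussians describing one grey-level pixel."""

    def __init__(self, alpha=0.05, match_sigmas=2.5, init_var=100.0,
                 init_weight=0.05, min_weight=0.01, min_var=4.0):
        self.alpha = alpha                # long-term learning rate
        self.match_sigmas = match_sigmas  # match if within this many std devs
        self.init_var = init_var          # variance given to a new Gaussian
        self.init_weight = init_weight    # new Gaussians start with a low weight
        self.min_weight = min_weight      # Gaussians below this are eliminated
        self.min_var = min_var            # floor so the model never gets too narrow
        self.gaussians = []

    def update(self, value):
        """Feed one pixel value; return True if it matched an existing Gaussian."""
        matched = None
        for g in self.gaussians:
            if abs(value - g["mean"]) <= self.match_sigmas * g["var"] ** 0.5:
                matched = g
                break
        # Weights rise for the matched Gaussian and decay for all the others.
        for g in self.gaussians:
            target = 1.0 if g is matched else 0.0
            g["weight"] += self.alpha * (target - g["weight"])
        if matched is None:
            self.gaussians.append({"mean": float(value), "var": self.init_var,
                                   "weight": self.init_weight, "count": 1})
        else:
            matched["count"] += 1
            # Assumed scheme: learn quickly while the Gaussian is young.
            rate = max(self.alpha, 1.0 / matched["count"])
            diff = value - matched["mean"]
            matched["mean"] += rate * diff
            matched["var"] = max(self.min_var,
                                 matched["var"] + rate * (diff * diff - matched["var"]))
        self.gaussians = [g for g in self.gaussians if g["weight"] >= self.min_weight]
        return matched is not None


model = PixelModel()
history = [100] * 50 + [200] + [100]   # steady background, one object frame, background
matches = [model.update(v) for v in history]
print(matches[-3:], len(model.gaussians))
```

In this sketch an unmatched value is simply treated as a candidate moving object; a fuller system would also take the weight of the matched Gaussian into account before calling a pixel background.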

Static Scene Object Detection and Tracking

Model the background and subtract to obtain the object mask

Filter to remove noise

Group adjacent pixels to obtain objects

Track objects between frames to develop trajectories
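
The grouping and tracking steps might look like the sketch below. It assumes a binary object mask stored as nested lists, uses 4-connectivity, discards blobs smaller than a hypothetical `min_size` as a crude noise filter, and links objects between frames by nearest centroid. The slides do not say which filter or association rule is used, so all of these are illustrative choices.

```python
from collections import deque


def connected_components(mask, min_size=2):
    """Group 4-connected foreground pixels; drop blobs smaller than min_size."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    blobs = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            queue, pixels = deque([(r, c)]), []
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(pixels) >= min_size:
                blobs.append(pixels)
    return blobs


def centroid(pixels):
    """Mean (row, column) position of a blob."""
    n = len(pixels)
    return (sum(y for y, _ in pixels) / n, sum(x for _, x in pixels) / n)


def match_nearest(prev_centroids, curr_centroids, max_dist=3.0):
    """Greedily link each previous centroid to the closest unused current one."""
    links = {}
    for i, (py, px) in enumerate(prev_centroids):
        best, best_dist = None, max_dist
        for j, (cy, cx) in enumerate(curr_centroids):
            dist = ((py - cy) ** 2 + (px - cx) ** 2) ** 0.5
            if dist <= best_dist and j not in links.values():
                best, best_dist = j, dist
        if best is not None:
            links[i] = best
    return links


def make_mask(pixels, rows=6, cols=8):
    mask = [[0] * cols for _ in range(rows)]
    for y, x in pixels:
        mask[y][x] = 1
    return mask


# Frame A: a 2x2 object plus one isolated noise pixel. Frame B: object moved right.
mask_a = make_mask([(1, 1), (1, 2), (2, 1), (2, 2), (4, 6)])
mask_b = make_mask([(1, 2), (1, 3), (2, 2), (2, 3)])
cents_a = [centroid(b) for b in connected_components(mask_a)]
cents_b = [centroid(b) for b in connected_components(mask_b)]
links = match_nearest(cents_a, cents_b)
print(cents_a, cents_b, links)
```

Chaining the links from frame to frame gives one trajectory per object.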

Moving Camera Sequences

Basic idea is the same as before

Detect and track objects moving within a scene

BUT this time the camera is not stationary, so everything is moving

Motion Segmentation

Use a motion estimation algorithm on the whole frame

Iteratively apply the same algorithm to areas that do not conform to this motion to find all motions present

Problem: this is very, very slow

Motion Compensated Background Filtering

Basic Principle
    Develop and maintain the background model as previously
    Determine the global motion and use this to update the model between frames
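
A minimal sketch of applying a global motion to the background model, under the strong simplifying assumption that the motion is a pure integer translation (`dx` pixels right, `dy` pixels down). The function name is hypothetical and the model is reduced to one value per pixel. Cells that enter the frame have no history and are marked with `fill`.

```python
def shift_model(model, dx, dy, fill=None):
    """Move a per-pixel model by an integer translation (dx right, dy down)."""
    rows, cols = len(model), len(model[0])
    shifted = [[fill] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            src_r, src_c = r - dy, c - dx   # where this cell's content came from
            if 0 <= src_r < rows and 0 <= src_c < cols:
                shifted[r][c] = model[src_r][src_c]
    return shifted


background = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
moved = shift_model(background, dx=1, dy=0)
for row in moved:
    print(row)
```

After the shift, each incoming pixel is compared against the model entry that describes the same piece of background, so the static-scene test can be reused unchanged.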

Advantages

Only one motion model has to be found
    This is therefore much faster

Estimating motion for small regions can be unreliable

Not as easy as it sounds though…

Motion Models

Trying to determine the exact optical flow at every point in the frame would be ridiculously slow

Therefore we try to fit a parametric model to the motion

Affine Motion Model

\[
\begin{bmatrix} u \\ v \end{bmatrix}
=
\begin{bmatrix} a_0 \\ a_3 \end{bmatrix}
+
\begin{bmatrix} a_1 & a_2 \\ a_4 & a_5 \end{bmatrix}
\begin{bmatrix} x \\ y \end{bmatrix}
\]

The affine model describes the motion vector (u, v) at each point (x, y) in the image

Need to find values for the parameters a_0 … a_5 that best fit the motion present
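
To make the parameter fitting concrete, the sketch below evaluates the affine model u = a0 + a1·x + a2·y, v = a3 + a4·x + a5·y and recovers the six parameters from sampled flow vectors by ordinary least squares. Ordinary least squares is a stand-in chosen for brevity: it is not a robust estimator, and it assumes the flow samples are already available. The function names are illustrative.

```python
def affine_flow(params, x, y):
    """Motion vector (u, v) predicted by the affine model at point (x, y)."""
    a0, a1, a2, a3, a4, a5 = params
    return a0 + a1 * x + a2 * y, a3 + a4 * x + a5 * y


def solve3(A, b):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with pivoting."""
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(3):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [mr - factor * mc for mr, mc in zip(M[r], M[col])]
    return [M[i][3] / M[i][i] for i in range(3)]


def fit_affine(points, flows):
    """Least-squares estimate of (a0..a5) from points (x, y) and flows (u, v)."""
    # u and v share the same 3x3 normal matrix built from the basis (1, x, y).
    A = [[0.0] * 3 for _ in range(3)]
    bu, bv = [0.0] * 3, [0.0] * 3
    for (x, y), (u, v) in zip(points, flows):
        basis = (1.0, x, y)
        for i in range(3):
            bu[i] += basis[i] * u
            bv[i] += basis[i] * v
            for j in range(3):
                A[i][j] += basis[i] * basis[j]
    return solve3(A, bu) + solve3(A, bv)


true_params = (1.0, 0.01, -0.02, -0.5, 0.03, 0.0)
points = [(float(x), float(y)) for x in range(5) for y in range(5)]
flows = [affine_flow(true_params, x, y) for x, y in points]
estimated = fit_affine(points, flows)
print([round(p, 6) for p in estimated])
```

With real data, independently moving objects produce flow vectors that disagree with the background motion, which is why a robust fit is preferable to plain least squares in practice.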

Background Motion Estimation

Uses a framework developed by Black and Anandan
    Black M.J. and Anandan P. The robust estimation of multiple motions: Parametric and piecewise-smooth flow fields, Computer Vision and Image Understanding, Vol. 63, No. 1, pp. 75-104, January 1996.

For more details see my talk from last year

Examples

Other approaches to Tracking

Many approaches use active contours, a.k.a. snakes
    Parameterised curves
    Fitted to the image by minimising some cost function, often based on fitting the contour to edges

Constraining shape

To avoid the snake being influenced by points we aren’t interested in, use a model to constrain its shape.

CONDENSATION

No discussion on tracking can omit the CONDENSATION algorithm developed by Isard and Blake.

CONditional DENSity propagATION

Non-Gaussian substitute for the Kalman Filter

Uses factored sampling to model non-Gaussian probability densities and propagate them through time.

CONDENSATION

Thus we can take a set of parameters and estimate them from frame to frame, using current information from the frames

These parameters may be positions or shape parameters from a snake.

CONDENSATION - Algorithm

Randomly take samples from the previous distribution.

Apply a deterministic drift and a random diffusion to the samples, based on a model of how the parameters behave.

Weight each sample on the basis of the current information.

The estimate of the actual value can be either a weighted average or a peak value from the distribution
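
One iteration of these steps can be sketched for a single scalar parameter, such as an object's x position. The constant-velocity drift, the Gaussian diffusion and the Gaussian observation likelihood are all assumptions made for this illustration, as are the function name and every numeric value.

```python
import math
import random


def condensation_step(samples, weights, measurement, rng,
                      drift=1.0, diffusion_std=0.5, meas_std=1.0):
    """One select / predict / measure cycle on a 1-D sample set."""
    n = len(samples)
    # 1. Randomly take samples from the previous distribution (by weight).
    selected = rng.choices(samples, weights=weights, k=n)
    # 2. Predict: deterministic drift plus random diffusion.
    predicted = [s + drift + rng.gauss(0.0, diffusion_std) for s in selected]
    # 3. Weight each sample using the current observation.
    raw = [math.exp(-0.5 * ((measurement - s) / meas_std) ** 2) for s in predicted]
    total = sum(raw)
    if total > 0.0:
        new_weights = [w / total for w in raw]
    else:                                   # all likelihoods underflowed
        new_weights = [1.0 / n] * n
    # 4. Estimate: here the weighted average of the sample set.
    estimate = sum(s * w for s, w in zip(predicted, new_weights))
    return predicted, new_weights, estimate


rng = random.Random(0)
n_samples = 200
samples = [rng.uniform(0.0, 20.0) for _ in range(n_samples)]  # vague initial belief
weights = [1.0 / n_samples] * n_samples

true_position = 5.0
for _ in range(10):
    true_position += 1.0                                # object moves 1 unit per frame
    measurement = true_position + rng.gauss(0.0, 0.3)   # noisy observation
    samples, weights, estimate = condensation_step(samples, weights, measurement, rng)

print(true_position, estimate)
```

Because the distribution is carried by the samples themselves, it can have several peaks, which is what allows the method to keep competing hypotheses alive where a Kalman filter cannot.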

Summary

Static-scene background subtraction methods

Extensions to moving camera systems

Use of model-constrained active contour systems

CONDENSATION
