
Project 35

Visual Surveillance of Urban Scenes

PROJECT 35: VISUAL SURVEILLANCE OF URBAN SCENES

Principal Investigators

• David Clausi, Waterloo

• Geoffrey Edwards, Laval

• James Elder, York (Project Leader)

• Frank Ferrie, McGill (Deputy Leader)

• James Little, UBC

PROJECT 35: VISUAL SURVEILLANCE OF URBAN SCENES

Partners

• Honeywell (Jeremy Wilson)

• CAE (Ronald Kruk)

• Aimetis (Mike Janzen)


PROJECT 35: VISUAL SURVEILLANCE OF URBAN SCENES

Participants

Postdoctoral Fellows

Francisco J. Estrada (York)

Bruce Yang (Waterloo)

Students

Eyhab Al-Masri (Waterloo)

Kurtis McBride (Waterloo)

Natalie Nabbout (Waterloo)

Isabelle Begin (McGill)

Albert Law (McGill)

Prasun Lala (McGill)

John Harrison (McGill)

Antoine Noel de Tilly (Laval)

Samir Fertas (Laval)

Michael Yurick (UBC)

Wei-Lwun Lu (UBC)

Patrick Denis (York)

PROJECT 35: VISUAL SURVEILLANCE OF URBAN SCENES

Goals

• Visual surveillance of urban scenes can potentially be used to enhance human safety and security, to detect emergency events, and to respond appropriately to these events.

• Our project investigates the development of intelligent systems for detecting, identifying, tracking and modeling dynamic events in an urban scene, as well as automatic methods for inferring the three-dimensional static or slowly-changing context in which these events take place.

PROJECT 35: VISUAL SURVEILLANCE OF URBAN SCENES

Results

• Here we demonstrate new results in the automatic estimation of 3D context and automatic tracking of human traffic from urban surveillance video.

• The CAE S-Mission real-time distributed computing environment is used as a substrate to integrate these intelligent algorithms into a comprehensive urban awareness network.

PROJECT 35: VISUAL SURVEILLANCE OF URBAN SCENES

CAE STRIVE Architecture

[Architecture diagram: STRIVE-SFX federates communicate over an HLA EMS-FOM at the facility level. Components include GPS and camera (CAM) feeds, TerraVizUI clients, the EMS-PAServer and EMS-EnvServer, the STRIVE-TFX terrain server, the Actenum scheduler system server (drawing on GIS data, historical calls, historical traffic data, post lists and constraints), the McGill video-camera traffic analyser server, dispatchers using the Actenum protocol, AppSpy bridges feeding legacy systems over legacy protocols, and a SAR & LOG store of logs and historic data.]

Note: To provide a failure-safe architecture, all database disks need to be duplicated and provide dual access (or a RAID system could be used). The four servers have to be duplicated as backup servers and share the dual-access databases with the main system. The backup servers monitor the status of the main systems; when a failure of the main system is detected, they reinitialize their internal states from the last SAR & LOG of the main system and resume operations.

CAE Professional Services / CAE Inc / McGill University. Actenum Proprietary, CAE Inc 2007.

3D Urban Awareness from Single-View Surveillance Video

PROJECT 35: VISUAL SURVEILLANCE OF URBAN SCENES

3D Urban Awareness

• 3D scene context (e.g., ground plane information) is crucial for the accurate identification and tracking of human and vehicular traffic in urban scenes.

• 3D scene context is also important for human interpretation of urban surveillance data

• Limited static 3D scene context can be estimated manually, but this is time-consuming, and cannot be adapted to slowly-changing scenes.

PROJECT 35: VISUAL SURVEILLANCE OF URBAN SCENES


Ultimate Goal

• Our ultimate goal is to automate this process!

PROJECT 35: VISUAL SURVEILLANCE OF URBAN SCENES

Immediate Goal

• Automatic estimation of the three vanishing points corresponding to the “Manhattan directions”.

PROJECT 35: VISUAL SURVEILLANCE OF URBAN SCENES

Manhattan Frame Geometry

• An edge is aligned to a vanishing point if its interpretation plane normal is orthogonal to the vanishing point vector on the Gaussian sphere (i.e., their dot product is zero)

[Figure: the optical centre, image plane, oriented edges, an interpretation plane and its normal, and a vanishing point vector on the Gaussian sphere.]
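As a rough illustration of this geometric test, the sketch below (Python, assuming a pinhole camera with known focal length and principal point; the function names are ours) back-projects an edge's endpoints onto the Gaussian sphere, forms the interpretation plane normal, and checks near-orthogonality against a candidate vanishing point direction.

```python
import numpy as np

def interpretation_plane_normal(p1, p2, focal, pp):
    """Normal of the plane through the optical centre and an image edge.

    p1, p2: edge endpoints in pixel coordinates; focal: focal length in pixels;
    pp: principal point (cx, cy). The endpoints are back-projected to unit rays
    on the Gaussian sphere, and the plane normal is their cross product.
    """
    def ray(p):
        v = np.array([p[0] - pp[0], p[1] - pp[1], focal], dtype=float)
        return v / np.linalg.norm(v)
    n = np.cross(ray(p1), ray(p2))
    return n / np.linalg.norm(n)

def is_aligned(edge_normal, vp_direction, tol_deg=2.0):
    """An edge is aligned with a vanishing point when its interpretation plane
    normal is (near-)orthogonal to the vanishing point direction, i.e. their
    dot product is close to zero."""
    c = abs(np.dot(edge_normal, vp_direction / np.linalg.norm(vp_direction)))
    return np.degrees(np.arcsin(min(c, 1.0))) < tol_deg
```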

PROJECT 35: VISUAL SURVEILLANCE OF URBAN SCENES

Mixture Model

• Each edge Eij in the image is generated by one of four possible kinds of scene structure:

– m1-3: a line in one of the three Manhattan directions

– m4: non-Manhattan structure

• The observable properties of each edge Eij are:

– position

– angle

• The likelihoods of these observations are co-determined by:

– The causal process (m1-4)

– The rotation Ψ of the Manhattan frame relative to the camera

[Figure: graphical model in which the Manhattan frame rotation Ψ and the per-edge causes mi generate the observed edges E11, E12, E21, E22 in the image.]

PROJECT 35: VISUAL SURVEILLANCE OF URBAN SCENES

Mixture Model

• Our goal is to estimate the Manhattan frame Ψ from the observable data Eij.


PROJECT 35: VISUAL SURVEILLANCE OF URBAN SCENES

E-M Algorithm

• E Step

– Given an estimate of the Manhattan coordinate frame, calculate the mixture probabilities over the four components m1–m4 for each edge

PROJECT 35: VISUAL SURVEILLANCE OF URBAN SCENES

E-M Algorithm

• M Step

– Given estimates of the mixture probabilities for each edge, update our estimate of the Manhattan coordinate frame
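A compact sketch of this E-M loop is given below, assuming each edge is represented by its interpretation-plane normal. The Gaussian-plus-uniform error model, the fixed mixing weights, and the Nelder-Mead M-step are illustrative simplifications, not the project's actual likelihoods or optimizer.

```python
import numpy as np
from scipy.spatial.transform import Rotation
from scipy.optimize import minimize

# Minimal E-M sketch. `normals` holds interpretation-plane normals (N x 3).
SIGMA, PRIOR_OUTLIER = np.radians(2.0), 0.2

def component_likelihoods(normals, rotvec):
    vps = Rotation.from_rotvec(rotvec).as_matrix()     # columns = Manhattan directions
    err = np.abs(normals @ vps)                        # |sin| of alignment error, N x 3
    lik = np.exp(-0.5 * (err / SIGMA) ** 2) / (SIGMA * np.sqrt(2 * np.pi))
    outlier = np.full((normals.shape[0], 1), 1.0 / np.pi)   # uniform non-Manhattan model (m4)
    return np.hstack([(1 - PRIOR_OUTLIER) / 3 * lik, PRIOR_OUTLIER * outlier])

def em_manhattan(normals, rotvec0, iters=20):
    rotvec = np.asarray(rotvec0, float)
    for _ in range(iters):
        lik = component_likelihoods(normals, rotvec)
        resp = lik / lik.sum(axis=1, keepdims=True)    # E step: mixture probabilities per edge
        # M step: re-estimate the frame by minimizing responsibility-weighted negative log-likelihood
        def neg_q(rv):
            return -np.sum(resp * np.log(component_likelihoods(normals, rv) + 1e-12))
        rotvec = minimize(neg_q, rotvec, method="Nelder-Mead").x
    return rotvec, resp
```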

PROJECT 35: VISUAL SURVEILLANCE OF URBAN SCENES

Results

[Bar chart: absolute angular deviation for each Manhattan direction (X, Y, Z); mean error over the entire test image database.]

PROJECT 35: VISUAL SURVEILLANCE OF URBAN SCENES

Results

• Convergence of the E-M algorithm for example image

[Plot: absolute angular deviation of vanishing points X, Y, and Z versus E-M iteration for an example test image.]

PROJECT 35: VISUAL SURVEILLANCE OF URBAN SCENES

Results

• Example: lines through top 10 edges in each Manhattan direction

Tracking Human Activity

Single-Camera Tracking

PROJECT 35: VISUAL SURVEILLANCE OF URBAN SCENES

Tracking Using Only Colour / Grey Scale

• Tracking using only grey scale or colour features can lead to errors

PROJECT 35: VISUAL SURVEILLANCE OF URBAN SCENES

Tracking Using Dynamic Information

• Incorporating dynamic information enables successful tracking

Tracking over Multi-Camera Network

PROJECT 35: VISUAL SURVEILLANCE OF URBAN SCENES

Goal

• Integrate tracking of human activity from multiple cameras into a world-centred activity map

PROJECT 35: VISUAL SURVEILLANCE OF URBAN SCENES

Input left and right sequences

PROJECT 35: VISUAL SURVEILLANCE OF URBAN SCENES

Independent tracking

• Each person tracked independently in each camera using Boosted Particle Filters.

– Background subtraction identifies possible detections of people which are then tracked with a particle filter using brightness histograms as the observation model.

• Tracks are projected via a homography to the street map and then Kalman filtered independently based on the error model (a sketch of this step follows below).
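The sketch below illustrates the projection-and-smoothing step with OpenCV: image-plane track points are mapped to the street map through a homography H, and each projected track is smoothed by a constant-velocity Kalman filter. H and the noise covariances are placeholders, not the calibrated values used in the project.

```python
import numpy as np
import cv2

def project_to_map(points_px, H):
    """Map image-plane points (pixels) onto the street map via homography H."""
    pts = np.asarray(points_px, np.float32).reshape(-1, 1, 2)
    return cv2.perspectiveTransform(pts, H).reshape(-1, 2)

def smooth_track(map_points, dt=1.0):
    """Smooth one projected track with a constant-velocity Kalman filter."""
    kf = cv2.KalmanFilter(4, 2)                        # state: x, y, vx, vy; measurement: x, y
    kf.transitionMatrix = np.array([[1, 0, dt, 0],
                                    [0, 1, 0, dt],
                                    [0, 0, 1, 0],
                                    [0, 0, 0, 1]], np.float32)
    kf.measurementMatrix = np.eye(2, 4, dtype=np.float32)
    kf.processNoiseCov = 1e-2 * np.eye(4, dtype=np.float32)      # illustrative noise levels
    kf.measurementNoiseCov = 1e-1 * np.eye(2, dtype=np.float32)
    kf.statePost = np.array([[map_points[0][0]], [map_points[0][1]], [0], [0]], np.float32)
    smoothed = []
    for x, y in map_points:
        kf.predict()
        est = kf.correct(np.array([[x], [y]], np.float32))
        smoothed.append(est[:2].ravel())
    return np.array(smoothed)
```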

PROJECT 35: VISUAL SURVEILLANCE OF URBAN SCENES

Independent tracks

PROJECT 35: VISUAL SURVEILLANCE OF URBAN SCENES

Integration

• Tracks are averaged to approximate joint estimation of composite errors

PROJECT 35: VISUAL SURVEILLANCE OF URBAN SCENES

Merged trajectories

PROJECT 35: VISUAL SURVEILLANCE OF URBAN SCENES

Future Work

• Integrated multi-camera background subtraction

• Integrated particle filter in world coordinates using joint observation model over all sensors in network.

Tracking in Dynamic Background Settings

PROJECT 35: VISUAL SURVEILLANCE OF URBAN SCENES

Foreground Extraction and Tracking in Dynamic Background Settings

• Extracting objects from dynamic backgrounds is challenging

• Numerous applications:

– Human Surveillance

– Customer Counting

– Human Safety

– Event Detection

• In this example, the problem is to extract people from surveillance video as they enter a store through a dynamic sliding door

PROJECT 35: VISUAL SURVEILLANCE OF URBAN SCENES

Methodology Overview

• Video sequences are pre-processed and corner feature points are extracted

• Corners are tracked to obtain trajectories of the moving background

• Background trajectories are learned and a classifier is formed

• Trajectories of all moving objects in the test image sequences are classified, based on the learned model, into either background or foreground trajectories

• Foreground trajectories are kept in the image sequence, and the objects corresponding to those trajectories are tagged as foreground (a classification sketch follows below)
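As a hedged illustration of the classification step (the slides do not specify the descriptor or learning method), the sketch below summarizes each corner trajectory with a simple motion descriptor and labels a test trajectory as foreground when it lies far from every learned background descriptor.

```python
import numpy as np

def trajectory_descriptor(traj):
    """Fixed-length motion descriptor for a corner trajectory (T x 2 positions)."""
    traj = np.asarray(traj, float)
    steps = np.diff(traj, axis=0)                      # per-frame displacements
    return np.hstack([steps.mean(axis=0), steps.std(axis=0),
                      np.linalg.norm(steps, axis=1).mean()])

def classify(traj, background_descs, threshold=2.0):
    """Label a trajectory foreground if far from all learned background descriptors."""
    d = np.linalg.norm(background_descs - trajectory_descriptor(traj), axis=1)
    return "foreground" if d.min() > threshold else "background"

# background_descs = np.vstack([trajectory_descriptor(t) for t in training_background_trajs])
```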

PROJECT 35: VISUAL SURVEILLANCE OF URBAN SCENES

Demo 1: Successful Tracking and Classification

• This demo illustrates a case of successful tracking and classification of an entering person.

• The person is classified as foreground based on the extracted trajectories.

PROJECT 35: VISUAL SURVEILLANCE OF URBAN SCENES

Demo 2: Failed Tracking but Successful Classification

• Demo 2 shows a case in which the tracker loses track of the person after a few frames.

• However, the classification is still correct since only a small number of frames are required to identify the trajectory.

Recognizing Actions using the Boosted Particle Filter

PROJECT 35: VISUAL SURVEILLANCE OF URBAN SCENES

Motivation

[Figure: example input frames (682 and 814) and the corresponding output.]

PROJECT 35: VISUAL SURVEILLANCE OF URBAN SCENES

System Diagram

[Diagram: each new frame is passed to the BPF tracker; the tracking results give Output 1, the locations/sizes of the players; extracted image patches update the SPPCA template updater, which predicts new templates for the tracker; the action recognizer produces Output 2, the action labels of the players.]
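Read as code, the per-frame flow in the diagram might look like the sketch below. The object names (tracker, updater, recognizer) and their methods are hypothetical stand-ins for the project's components, not a published API.

```python
def process_frame(frame, tracker, updater, recognizer):
    """One pass of the (hypothetical) tracking/recognition pipeline for a new frame."""
    templates = updater.predict_templates()            # predicted appearance templates
    tracks = tracker.track(frame, templates)           # Output 1: locations/sizes of the players
    patches = [frame[t.box] for t in tracks]           # extracted image patches (t.box assumed to be a slice)
    updater.update(patches)                            # refine the SPPCA template model
    actions = recognizer.recognize(patches)            # Output 2: action labels of the players
    return tracks, actions
```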

PROJECT 35: VISUAL SURVEILLANCE OF URBAN SCENES

HSV Color Histogram

• The HSV color histogram is composed of:

– 2D histogram of Hue and Saturation

– 1D histogram of Value

[Figure: the 2D hue-saturation histogram combined with the 1D value histogram.]
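A minimal version of this descriptor, using OpenCV's calcHist and illustrative bin counts, could look like:

```python
import cv2
import numpy as np

def hsv_histogram(patch_bgr, hs_bins=(10, 10), v_bins=10):
    """2D hue-saturation histogram concatenated with a 1D value histogram."""
    hsv = cv2.cvtColor(patch_bgr, cv2.COLOR_BGR2HSV)
    hs = cv2.calcHist([hsv], [0, 1], None, list(hs_bins), [0, 180, 0, 256])
    v = cv2.calcHist([hsv], [2], None, [v_bins], [0, 256])
    hist = np.hstack([hs.ravel(), v.ravel()])
    return hist / (hist.sum() + 1e-9)                  # normalize so histograms are comparable
```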

PROJECT 35: VISUAL SURVEILLANCE OF URBAN SCENES

The HOG descriptor

[Figure: image gradients pooled into SIFT-style descriptor grids to form the HOG descriptor.]
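For reference, a standard HOG computation via scikit-image is sketched below; the orientation and cell/block parameters are illustrative defaults, since the slides do not give the project's exact layout.

```python
from skimage.feature import hog
from skimage.color import rgb2gray

def hog_descriptor(image_rgb):
    """Histogram-of-oriented-gradients descriptor for an RGB image patch."""
    gray = rgb2gray(image_rgb)
    return hog(gray, orientations=8, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2), block_norm="L2-Hys")
```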

PROJECT 35: VISUAL SURVEILLANCE OF URBAN SCENES


Template Updating: Motivation

• Tracking: search for the location in the image whose image patch is similar to a reference image patch – the template.

• Template Updating: Templates should be updated because the players change their pose.

[Figure: template matching between frames 677 and 687.]

PROJECT 35: VISUAL SURVEILLANCE OF URBAN SCENES

Template Updating: Operations

• Offline

– Learning: learn the template model from training data

• Online

– Prediction: predict the new template used in the next frame

– Updating: update the template model using the current observation

PROJECT 35: VISUAL SURVEILLANCE OF URBAN SCENES

SPPCA Template Updater

[Diagram: the tracker processes each new frame and produces tracking results; the extracted image patches are used to update the SPPCA template updater, which predicts new templates for the tracker.]

PROJECT 35: VISUAL SURVEILLANCE OF URBAN SCENES

Graphical Model of SPPCA

[Graphical model: a discrete switch variable selects an eigenspace; a continuous coordinate on that eigenspace generates the continuous observation.]
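The prediction step implied by this model can be sketched as below: the discrete switch selects one learned eigenspace (mean plus basis), and the continuous coordinate is the projection of the current patch onto that space. Choosing the switch by reconstruction error and omitting the temporal dynamics are simplifications of the full SPPCA model.

```python
import numpy as np

def predict_template(patch, eigenspaces):
    """eigenspaces: list of (mean, basis) pairs, with basis of shape (D, k)."""
    x = patch.ravel().astype(float)
    best = None
    for mean, basis in eigenspaces:                    # evaluate each candidate eigenspace
        coord = basis.T @ (x - mean)                   # continuous coordinate on the eigenspace
        recon = mean + basis @ coord                   # reconstruction in image space
        err = np.linalg.norm(x - recon)
        if best is None or err < best[0]:              # discrete switch: pick the best-fitting space
            best = (err, recon)
    return best[1].reshape(patch.shape)                # predicted template for the next frame
```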

PROJECT 35: VISUAL SURVEILLANCE OF URBAN SCENES

Action Recognizer

• Input: a sequence of image patches

• Output: action labels

[Diagram: the action recognizer maps the sequence of image patches to one of the action labels: skating down, skating left, skating right, or skating up.]

PROJECT 35: VISUAL SURVEILLANCE OF URBAN SCENES

Action Recognizer

• Summary:

– Features: the HOG descriptor

– Classifier: the SMLR classifier

– Weights: learned by MAP estimation with a sparsity-promoting Laplacian prior

– Basis functions: motion similarity between the testing and training data

PROJECT 35: VISUAL SURVEILLANCE OF URBAN SCENES

Action Recognizer: Framework

[Framework diagram: HOG descriptors are computed for the testing and training data; the frame-to-frame similarity between them is computed and convolved with a weighting matrix to obtain the motion similarity, which is passed to the SMLR classifier to produce action labels.]
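The similarity computation in this framework might be sketched as below; the temporal weighting matrix is an illustrative identity band, and the SMLR classifier itself is not implemented here.

```python
import numpy as np
from scipy.signal import convolve2d

def motion_similarity(test_hogs, train_hogs, window=5):
    """Frame-to-frame similarity convolved with a temporal weighting matrix."""
    test = np.asarray(test_hogs)                       # T_test x D HOG descriptors
    train = np.asarray(train_hogs)                     # T_train x D HOG descriptors
    frame_sim = test @ train.T                         # frame-to-frame similarity matrix
    weight = np.eye(window)                            # diagonal band rewarding consistent motion
    return convolve2d(frame_sim, weight, mode="same")  # motion similarity (basis functions)

# The resulting motion similarities would be fed to the SMLR classifier,
# whose weights are learned by MAP estimation with a Laplacian prior.
```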

PROJECT 35: VISUAL SURVEILLANCE OF URBAN SCENES

Tracking & Action Recognition

[Figure: tracking and action recognition results for frames 97, 116, 682, 710, 773, and 814.]

PROJECT 35: VISUAL SURVEILLANCE OF URBAN SCENES

Vehicle Tracking