Intelligent Video Surveillance Systems download.e-bookshelf.de/download/0000/7531/74/L-G... First published 2013 in Great Britain and the United States by ISTE Ltd and John Wiley & Sons, Inc. Apart from any fair dealing for the purposes of research or private study, or criticism or review, as

Intelligent Video Surveillance Systems

Intelligent Video Surveillance Systems

Edited by

Jean-Yves Dufour

First published 2013 in Great Britain and the United States by ISTE Ltd and John Wiley & Sons, Inc.

Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act 1988, this publication may only be reproduced, stored or transmitted, in any form or by any means, with the prior permission in writing of the publishers, or in the case of reprographic reproduction in accordance with the terms and licenses issued by the CLA. Enquiries concerning reproduction outside these terms should be sent to the publishers at the undermentioned address:

ISTE Ltd, 27-37 St George’s Road, London SW19 4EU, UK
John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA

www.iste.co.uk www.wiley.com

© ISTE Ltd 2013

The rights of Jean-Yves Dufour to be identified as the author of this work have been asserted by him in accordance with the Copyright, Designs and Patents Act 1988.

Library of Congress Control Number: 2012946584

British Library Cataloguing-in-Publication Data
A CIP record for this book is available from the British Library
ISBN: 978-1-84821-433-0

Printed and bound in Great Britain by CPI Group (UK) Ltd., Croydon, Surrey CR0 4YY

Table of Contents

Introduction
Jean-Yves DUFOUR and Philippe MOUTTOU

Chapter 1. Image Processing: Overview and Perspectives
Henri MAÎTRE

1.1. Half a century ago
1.2. The use of images
1.3. Strengths and weaknesses of image processing
1.3.1. What are these theoretical problems that image processing has been unable to overcome?
1.3.2. What are the problems that image processing has overcome?
1.4. What is left for the future?
1.5. Bibliography

Chapter 2. Focus on Railway Transport
Sébastien AMBELLOUIS and Jean-Luc BRUYELLE

2.1. Introduction
2.2. Surveillance of railway infrastructures
2.2.1. Needs analysis
2.2.2. Which architectures?
2.2.3. Detection and analysis of complex events
2.2.4. Surveillance of outside infrastructures
2.3. Onboard surveillance
2.3.1. Surveillance of buses
2.3.2. Applications to railway transport
2.4. Conclusion
2.5. Bibliography

Chapter 3. A Posteriori Analysis for Investigative Purposes
Denis MARRAUD, Benjamin CÉPAS, Jean-François SULZER, Christianne MULAT and Florence SÈDES

3.1. Introduction
3.2. Requirements in tools for assisted investigation
3.2.1. Prevention and security
3.2.2. Information gathering
3.2.3. Inquiry
3.3. Collection and storage of data
3.3.1. Requirements in terms of standardization
3.3.2. Attempts at standardization (AFNOR and ISO)
3.4. Exploitation of the data
3.4.1. Content-based indexing
3.4.2. Assisted investigation tools
3.5. Conclusion
3.6. Bibliography

Chapter 4. Video Surveillance Cameras
Cédric LEBARZ and Thierry LAMARQUE

4.1. Introduction
4.2. Constraints
4.2.1. Financial constraints
4.2.2. Environmental constraints
4.3. Nature of the information captured
4.3.1. Spectral bands
4.3.2. 3D or “2D + Z” imaging
4.4. Video formats
4.5. Technologies
4.6. Interfaces: from analog to IP
4.6.1. From analog to digital
4.6.2. The advent of IP
4.6.3. Standards
4.7. Smart cameras
4.8. Conclusion
4.9. Bibliography

Chapter 5. Video Compression Formats
Marc LENY and Didier NICHOLSON

5.1. Introduction
5.2. Video formats
5.2.1. Analog video signals
5.2.2. Digital video: standard definition

5.2.3. High definition
5.2.4. The CIF group of formats
5.3. Principles of video compression
5.3.1. Spatial redundancy
5.3.2. Temporal redundancy
5.4. Compression standards
5.4.1. MPEG-2
5.4.2. MPEG-4 Part 2
5.4.3. MPEG-4 Part 10/H.264 AVC
5.4.4. MPEG-4 Part 10/H.264 SVC
5.4.5. Motion JPEG 2000
5.4.6. Summary of the formats used in video surveillance
5.5. Conclusion
5.6. Bibliography

Chapter 6. Compressed Domain Analysis for Fast Activity Detection
Marc LENY

6.1. Introduction
6.2. Processing methods
6.2.1. Use of transformed coefficients in the frequency domain
6.2.2. Use of motion estimation
6.2.3. Hybrid approaches
6.3. Uses of analysis of the compressed domain
6.3.1. General architecture
6.3.2. Functions for which compressed domain analysis is reliable
6.3.3. Limitations
6.4. Conclusion
6.5. Acronyms
6.6. Bibliography

Chapter 7. Detection of Objects of Interest
Yoann DHOME, Bertrand LUVISON, Thierry CHESNAIS, Rachid BELAROUSSI, Laurent LUCAT, Mohamed CHAOUCH and Patrick SAYD

7.1. Introduction
7.2. Moving object detection
7.2.1. Object detection using background modeling
7.2.2. Motion-based detection of objects of interest
7.3. Detection by modeling of the objects of interest
7.3.1. Detection by geometric modeling
7.3.2. Detection by visual modeling
7.4. Conclusion
7.5. Bibliography

Chapter 8. Tracking of Objects of Interest in a Sequence of Images
Simona MAGGIO, Jean-Emmanuel HAUGEARD, Boris MEDEN, Bertrand LUVISON, Romaric AUDIGIER, Brice BURGER and Quoc Cuong PHAM

8.1. Introduction
8.2. Representation of objects of interest and their associated visual features
8.2.1. Geometry
8.2.2. Characteristics of appearance
8.3. Geometric workspaces
8.4. Object-tracking algorithms
8.4.1. Deterministic approaches
8.4.2. Probabilistic approaches
8.5. Updating of the appearance models
8.6. Multi-target tracking
8.6.1. MHT and JPDAF
8.6.2. MCMC and RJMCMC sampling techniques
8.6.3. Interactive filters, track graph
8.7. Object tracking using a PTZ camera
8.7.1. Object tracking using a single PTZ camera only
8.7.2. Object tracking using a PTZ camera coupled with a static camera
8.8. Conclusion
8.9. Bibliography

Chapter 9. Tracking Objects of Interest Through a Camera Network
Catherine ACHARD, Sébastien AMBELLOUIS, Boris MEDEN, Sébastien LEFEBVRE and Dung Nghi TRUONG CONG

9.1. Introduction
9.2. Tracking in a network of cameras whose fields of view overlap
9.2.1. Introduction and applications
9.2.2. Calibration and synchronization of a camera network
9.2.3. Description of the scene by multi-camera aggregation
9.3. Tracking through a network of cameras with non-overlapping fields of view
9.3.1. Issues and applications
9.3.2. Geometric and/or photometric calibration of a camera network
9.3.3. Reidentification of objects of interest in a camera network
9.3.4. Activity recognition/event detection in a camera network
9.4. Conclusion
9.5. Bibliography

Chapter 10. Biometric Techniques Applied to Video Surveillance
Bernadette DORIZZI and Samuel VINSON

10.1. Introduction
10.2. The databases used for evaluation
10.2.1. NIST-Multiple Biometrics Grand Challenge (NIST-MBGC)
10.2.2. Databases of faces
10.3. Facial recognition
10.3.1. Face detection
10.3.2. Face recognition in biometrics
10.3.3. Application to video surveillance
10.4. Iris recognition
10.4.1. Methods developed for biometrics
10.4.2. Application to video surveillance
10.4.3. Systems for iris capture in videos
10.4.4. Summary and perspectives
10.5. Research projects
10.6. Conclusion
10.7. Bibliography

Chapter 11. Vehicle Recognition in Video Surveillance
Stéphane HERBIN

11.1. Introduction
11.2. Specificity of the context
11.2.1. Particular objects
11.2.2. Complex integrated chains
11.3. Vehicle modeling
11.3.1. Wire models
11.3.2. Global textured models
11.3.3. Structured models
11.4. Exploitation of object models
11.4.1. A conventional sequential chain with limited performance
11.4.2. Improving shape extraction
11.4.3. Inferring 3D information
11.4.4. Recognition without form extraction
11.4.5. Toward a finer description of vehicles
11.5. Increasing observability
11.5.1. Moving observer
11.5.2. Multiple observers
11.6. Performances
11.7. Conclusion
11.8. Bibliography

Chapter 12. Activity Recognition
Bernard BOULAY and François BRÉMOND

12.1. Introduction
12.2. State of the art
12.2.1. Levels of abstraction
12.2.2. Modeling and recognition of activities
12.2.3. Overview of the state of the art
12.3. Ontology
12.3.1. Objects of interest
12.3.2. Scenario models
12.3.3. Operators
12.3.4. Summary
12.4. Suggested approach: the ScReK system
12.5. Illustrations
12.5.1. Application at an airport
12.5.2. Modeling the behavior of elderly people
12.6. Conclusion
12.7. Bibliography

Chapter 13. Unsupervised Methods for Activity Analysis and Detection of Abnormal Events
Rémi EMONET and Jean-Marc ODOBEZ

13.1. Introduction
13.2. An example of a topic model: PLSA
13.2.1. Introduction
13.2.2. The PLSA model
13.2.3. PLSA applied to videos
13.3. PLSM and temporal models
13.3.1. PLSM model
13.3.2. Motifs extracted by PLSM
13.4. Applications: counting, anomaly detection
13.4.1. Counting
13.4.2. Anomaly detection
13.4.3. Sensor selection
13.4.4. Prediction and statistics
13.5. Conclusion
13.6. Bibliography

Chapter 14. Data Mining in a Video Database
Luis PATINO, Hamid BENHADDA and François BRÉMOND

14.1. Introduction
14.2. State of the art

14.3. Pre-processing of the data
14.4. Activity analysis and automatic classification
14.4.1. Unsupervised learning of zones of activity
14.4.2. Definition of behaviors
14.4.3. Relational analysis
14.5. Results and evaluations
14.6. Conclusion
14.7. Bibliography

Chapter 15. Analysis of Crowded Scenes in Video
Mikel RODRIGUEZ, Josef SIVIC and Ivan LAPTEV

15.1. Introduction
15.2. Literature review
15.2.1. Crowd motion modeling and segmentation
15.2.2. Estimating density of people in a crowded scene
15.2.3. Crowd event modeling and recognition
15.2.4. Detecting and tracking in a crowded scene
15.3. Data-driven crowd analysis in videos
15.3.1. Off-line analysis of crowd video database
15.3.2. Matching
15.3.3. Transferring learned crowd behaviors
15.3.4. Experiments and results
15.4. Density-aware person detection and tracking in crowds
15.4.1. Crowd model
15.4.2. Tracking detections
15.4.3. Evaluation
15.5. Conclusions and directions for future research
15.6. Acknowledgments
15.7. Bibliography

Chapter 16. Detection of Visual Context
Hervé LEBORGNE and Aymen SHABOU

16.1. Introduction
16.2. State of the art of visual context detection
16.2.1. Overview
16.2.2. Visual description
16.2.3. Multiclass learning
16.3. Fast shared boosting
16.4. Experiments
16.4.1. Detection of boats in the Panama Canal
16.4.2. Detection of the visual context in video surveillance
16.5. Conclusion
16.6. Bibliography

Chapter 17. Example of an Operational Evaluation Platform: PPSL
Stéphane BRAUDEL

17.1. Introduction
17.2. Use of video surveillance: approach and findings
17.3. Current use contexts and new operational concepts
17.4. Requirements in smart video processing
17.5. Conclusion

Chapter 18. Qualification and Evaluation of Performances
Bernard BOULAY, Jean-François GOUDOU and François BRÉMOND

18.1. Introduction
18.2. State of the art
18.2.1. Applications
18.2.2. Process
18.3. An evaluation program: ETISEO
18.3.1. Methodology
18.3.2. Metrics
18.3.3. Summary
18.4. Toward a more generic evaluation
18.4.1. Contrast
18.4.2. Shadows
18.5. The Quasper project
18.6. Conclusion
18.7. Bibliography

List of Authors

Index

Introduction

I.1. General presentation

Video surveillance consists of remotely watching public or private spaces using cameras. The images captured by these cameras are usually transmitted to a control center and immediately viewed by operators (real-time exploitation) and/or recorded and then analyzed on request (a posteriori exploitation) following a particular event (an accident, an assault, a robbery, an attack, etc.), for the purposes of investigation and/or evidence gathering.

Convenience stores, railways and air transport sectors are, in fact, the largest users of video surveillance. These three sectors alone account for over 60% of the cameras installed worldwide. Today, even the smallest sales points have four cameras per 80 m² of the shop floor. Surveillance of traffic areas to help ensure the smooth flow of the traffic and the capacity for swift intervention in case of an accident brings the figure up to 80%, in terms of the number of installations. The protection of other critical infrastructures accounts for a further 10% of installations. The proliferation of cameras in pedestrian urban areas is a more recent phenomenon, and is responsible for the rest of the distribution.

Over the past 30+ years, we have seen a constant increase in the number of cameras in urban areas. In many people’s minds, the reason behind this trend is a concern for personal protection, sparked first by a rise in crime (a steady increase in assaults in public areas) and then by the increase in terrorism over the past 10 years. However, this aspect cannot mask the multiplication of cameras in train stations, airports and shopping centers.

The defense of people and assets, which states are so eager to guarantee, has benefited greatly from two major technological breakthroughs: first, the advent of very high capacity digital video recorders (DVRs) and, second, the development of Internet protocol (IP) networks and so-called IP cameras. The latter breakthrough enables the images delivered by cameras to be distributed to various processing centers. This facilitates the (re)configuration of the system and the transmission of all the data (images, metadata, commands, etc.) over the same channel.

Today, we are reaping the benefits of these technological advances for the protection of critical infrastructures. Indeed, it is becoming easier to ensure interoperability with other protection or security systems (access monitoring, barriers, fire alarms, etc.). This ease of integration is, however, often accompanied by poorer image quality than that delivered by CCTV cameras.

Currently, the evolution of the urban security market is leading to the worldwide deployment of very extensive systems, consisting of hundreds or even thousands of cameras. While such systems, operated in clusters, have long been the preserve of transport operators, they have become unavoidable in urban areas.

All these systems generate enormous quantities of video data, which make real-time exploitation by humans alone near-impossible, and exploitation after the fact extremely long and very costly in terms of human resources. These systems have now come to be used essentially as operational aids. They are a tool for planning and support in the intervention of a protective force, be it in an urban area or in major transport centers.

“Video analytics”¹ is intended to solve the problem of the inability to exploit video streams in real time for the purposes of detection or anticipation. It involves having the videos analyzed by algorithms that detect and track objects of interest (usually people or vehicles) over time, and that indicate the presence of events or suspect behavior involving these objects. The aim is to be able to alert operators to suspicious situations in real time, to economize on bandwidth by transmitting only the data that are pertinent for surveillance, and to improve searching capabilities in the archived sequences by adding data relating to the content (metadata) to the videos.
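The chain described above (detect, decide, then transmit only metadata and alerts) can be illustrated with a deliberately simple sketch. It assumes grayscale frames given as 2D lists and uses a running-average background model, which is only one of many possible detection methods. The function name `analyze_stream` and the values of `alpha`, `pixel_threshold` and `area_threshold` are hypothetical and chosen purely for illustration.

```python
def analyze_stream(frames, alpha=0.1, pixel_threshold=30, area_threshold=3):
    """Yield one metadata record per frame of a grayscale stream.

    frames: iterable of equally sized 2D lists of intensities (0-255).
    alpha, pixel_threshold, area_threshold: illustrative values, not tuned.
    """
    background = None
    for index, frame in enumerate(frames):
        if background is None:
            # Bootstrap the background model from the first frame.
            background = [[float(v) for v in row] for row in frame]
            yield {"frame": index, "foreground_pixels": 0, "alert": False}
            continue
        foreground = 0
        for r, row in enumerate(frame):
            for c, value in enumerate(row):
                if abs(value - background[r][c]) > pixel_threshold:
                    # Pixel differs strongly from the model: count it as foreground.
                    foreground += 1
                else:
                    # Only pixels judged to be background update the model.
                    background[r][c] += alpha * (value - background[r][c])
        yield {"frame": index, "foreground_pixels": foreground,
               "alert": foreground >= area_threshold}
```

Only the small records produced by the generator would need to be sent to the control center, which is the bandwidth argument made above. A real system would add tracking and event recognition on top of this detection step.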

The “Holy Grail” of video analytics can be summed up as three main automatic functions: real-time detection of expected or unexpected events, the capability to replay, in real time, the events leading up to the observed situation, and the capacity to analyze the video a posteriori and retrace the root of an event.

Belonging to the wider academic domain of computer vision, video analytics has aroused a phenomenal surge of interest since the early 2000s, resulting – in concrete terms – in the proliferation of companies developing video analytics software worldwide and the setting up of a large number of collaborative projects (e.g. SERKET, CROMATICA, PRISMATICA, ADVISOR, CARETAKER, VIEWS, BOSS, LINDO, VANAHEIM and VICOMO, all funded by the European Union).

1 Literature on the topic usually uses the term video analytics, but we may also come across the terms video content analysis, intelligent video surveillance or smart video surveillance.

Video analytics is also the topic of various academic gatherings. For instance, on a near-yearly basis since 1998, the Institute of Electrical and Electronics Engineers (IEEE) has organized an international conference, Advanced Video and Signal-based Surveillance (AVSS), which has become a reference point in the domain and regularly brings together people from research, industry and governmental agencies.

Although motion detection, object detection and tracking or license plate recognition technologies have now been shown to be effective in controlled environments, very few systems are, as yet, sufficiently robust to the changing environment and the complexity of urban scenes. Furthermore, the recognition of objects and individuals in complex scenes, along with the recognition of complex or “unusual” behavior, is one of the greatest challenges faced by researchers in this domain.

In addition, new applications, such as consumer behavior analysis and the search for target videos on the Internet, could accelerate the rise of video analytics.

I.2. Objectives of the book

The aims of this book are to highlight the operational expectations of video analytics, to identify possible driving forces behind potential evolutions in years to come and, above all, to present the state of the art and the technological hurdles that have yet to be overcome. This book is intended for an audience of students and young researchers in the field of computer vision, and for engineers involved in large-scale video surveillance projects.

I.3. Organization of the book

In Chapter 1, Henri Maître, a pioneer and an eminent actor in the domain of image analysis today, provides an overview of the major issues that have been addressed and the advances that have been achieved since the advent of this discipline in the 1970s–1980s. The new challenges that have arisen today are also presented, along with the most promising technical approaches to overcome these challenges. These approaches will be illustrated in certain chapters of the book.

The subsequent chapters have been sequenced so as to successively deal with the applications of video analytics and the nature of the data processed, before going into detail about the technical aspects, which constitute the core of this book, and finishing with the subject of performance evaluation.

Chapters 2 and 3 deal with the applications of video analytics and present two important examples: the security of rail transport, which tops the list of users of video surveillance (both chronologically and in terms of the volume of activity generated), and an a posteriori investigation using video data. These chapters list the requirements in terms of video analytics functions, as well as the constraints and main characteristics identified for these two applications. Chapter 2 also discusses the research programs conducted in France and Europe in the field of transport, which have enabled significant advances in this domain.

Chapters 4 and 5 present the characteristics of the videos considered, by way of the sensors used to generate them and issues of transport and storage that, in particular, give rise to the need for compression. In Chapter 4, the recent evolutions in video surveillance cameras are presented, as are the new modes of imaging that could, in the future, enhance the perception of the scenes. Chapter 5 presents the formats of video images and the principles of video compression used in video surveillance.

Chapters 6–11 present the problems related to the analysis of objects of interest (people or vehicles) observed in a video, based on a chain of processing that is classic in image analysis: detection, tracking and recognition of these objects. Each chapter deals with one function, presenting the main characteristics and constraints, as well as the problems that need to be solved and the state of the art of the methods proposed to tackle these problems. Chapter 6 presents the approaches to detection and tracking based on the direct analysis of the information contained in the compressed video, so as to reduce, as far as possible, the computation time for “low-level” video analysis operations. Object detection is presented in Chapter 7, which describes the various approaches used today (background subtraction, estimation and exploitation of the motion apparent in the images, detection based on models that can be either explicit or estimated by way of automatic learning). Object tracking is dealt with in Chapter 8 (tracking within the field of view of a camera) and in Chapter 9, which extends the problem to the case of observation by a network of cameras and considers two distinct configurations: (1) a single object is perceived at the same time by several cameras and (2) a single object is seen at different times by different cameras. In the latter case, the particular problem of “re-identification” of the object arises. Chapter 10 presents the application and adaptation to video surveillance of two functions used for biometrics: facial recognition and iris recognition. Chapter 11 focuses on the function of automatic vehicle recognition.

Chapters 12–16 deal with the “higher level” analysis of the video, aimed at lending semantic content to the scenes observed. Such an analysis might relate to the actions or behaviors of individuals (Chapters 12–14) or crowds (Chapter 15), or indeed to the overall characteristics of the scene being observed (Chapter 16). Chapter 12 examines the approaches that use a description of the activities in the form of scenarios, with a particular emphasis on representation of knowledge, modeling of the scenarios by the users and automatic recognition of these scenarios. Chapters 13 and 14 relate to the characterization of the activities observed by a camera over long periods of observation, and to the use of that characterization to detect “abnormal” activity, using two different approaches: the first (Chapter 13) operates on “visual words”, constructed from simple features of the video such as position in the image, apparent motion and indicators of size or shape; and the second (Chapter 14) uses data-mining techniques to analyze trajectories constituted by prior detection and tracking of objects of interest. Chapter 15 gives an overview of the recent projects that have dealt with the various issues associated with crowd scene analysis, and presents two specific contributions: one relating to the creation of a crowd analysis algorithm using information previously acquired on a large database of crowd videos and the other touching on the problem of detection and tracking of people in crowd scenes, in the form of optimization of an energy function combining the estimation of the crowd density and the location of individuals. Finally, Chapter 16 relates to the determination of the visual context (or “scene recognition”), which consists of detecting the presence or absence of pre-established visual concepts in a given image, providing information about the general atmosphere in the image (indoor or outdoor scene; photo taken at night, during the day or at sunrise/sunset; an urban or suburban scene; the presence of vegetation, buildings, etc.). A visual concept may also refer to the technical characteristics of an image (level of blur, quality of the image) or to a more subjective impression of a photograph (amusing, worrying, aesthetically pleasing, etc.).

The final two chapters (Chapters 17 and 18) deal with performance evaluation. Chapter 17 presents the aims of a structure called Pôle Pilote de Sécurité Locale (PPSL) – Pilot Center for Urban Security, set up to create and implement quasi-real-world tests for new technologies for local and urban security, involving both the end users (police, firefighters, ambulance, etc.) and the designers. Chapter 18 discusses the issue of performance evaluation of the algorithms. It first presents the main initiatives that have seen the light of day, with a view to comparing systems on shared functional requirements with evaluation protocols and shared data. Then it focuses on the ETISEO² competition, which has enabled significant advances to be made, offering – besides annotated video sequences – metrics meant for a particular task and tools to facilitate the evaluation. The objective qualification of an algorithmic solution in relation to measurable factors (such as the contrast of the object) remains an unsolved problem on which there has been little work done to date. An approach is put forward to make progress in this area, and the chapter closes with a brief presentation of the research program QUASPER R&D, which aims to define the scientific and technical knowledge required for the implementation of a platform for qualification and certification of perception systems.

2 Evaluation du Traitement et de l’Interprétation de SEquences vidEO (Evaluation for video processing and understanding).

Chapter 1

Image Processing: Overview and Perspectives

“Puissance de l’image, dit-on? il s’agit bel et bien, plutôt, de l’extrême richesse du plus évolué de nos sens : la vue – ou, pour mieux dire, de la plus remarquable de nos fonctions de contact avec l’environnement: la vision, œil et cerveau. De fait, en termes de quantité d’information véhiculée et de complexité de son traitement, il n’y a guère, pour l’être humain, que la fonction de reproduction qui puisse soutenir la comparaison avec la fonction de vision.”

D. Estournet¹

1.1. Half a century ago

In a forward-looking exercise, it is always helpful to look back toward the foundation of the domain in question, examine the context of its emergence and then that of its evolutions, to identify the reasons for its setbacks or – conversely – the avenues of its progress. Above all, the greatest advantage can be found in revisiting the promises made by the discipline, comparing them with what has actually been achieved and measuring the differences.

Chapter written by Henri MAÎTRE.

1 “The power of the image, they say? It is a question, rather, of the extreme richness of the most highly evolved of our senses: sight – or, better put, the most remarkable of our functions of interfacing with the environment: vision, the eye and the brain. In fact, in terms of the amount of information channeled and the complexity of the processing applied, for humans only the reproductive function is even remotely comparable to the function of vision”. D. Estournet, “Informations d’Images, analyse, transmission, traitement”, ENSTA, 1970.

Today, the field of image processing is a little over 50 years old. Indeed, it was in the 1960s that elementary techniques began to emerge – in parallel but often independently of one another – which gradually came together to form image processing as we now know it, which is partly the subject of this book.

Of these techniques, we will begin by discussing the extension to two or three dimensions (2D or 3D) of signal processing methods. In this exercise, among other great names, the following have distinguished themselves: R.M. Mersereau, L.R. Rabiner, J.H. McClellan, T.S. Huang, J.L. Shanks, B.R. Hunt, H.C. Andrews, A. Bijaoui, etc., recognized for their contribution both to 1D and 2D. The aim of their work was to enable images to benefit from all the modeling, prediction, filtering and restoration tools that were becoming established at the time in acoustics, radar and speech. Based on the discovery of fast transforms and their extension to 2D, these works naturally gave rise to spectral analysis of images – a technique that is still very much in use today. However, this route is pockmarked by insightful but unfulfilled, abandoned projects that have hitherto not been widely exploited – relating, for example, to the stability of multidimensional filters or 2D recursive processes – because the principle of causality that governs temporal signals had long thwarted image-processing researchers, who expected to find it in the television signal, for instance. From then on, this field of signal processing became particularly fertile. It is directly at the root of the extremely fruitful approaches of tomographic reconstruction, which nowadays is indispensable in medical diagnostics or physical experimentation, and wavelet theory, which is useful in image analysis or compression. More recently, it is to be found at the heart of the sparse approaches, which harbor many hopes of producing the next “great leap forward” in image processing.

A second domain, which also developed in the 1960s, was based on discrete – and often binary – representation of images. Using completely different tools, the pioneers of this domain turned their attention to other properties of images: the connectivity, the morphology, the topology of forms and spatial meshes that are a major component of an image. Turning away from continuous faithful representation of the signal, they set about identifying abstract properties: the relative position, the inside and outside, contact and inclusion, thereby opening the way to shape semantics on the one hand, and to a verbal description of the space, which naturally gave way to scene analysis, on the other hand. In this discipline as well, a number of great names can be cited: A. Rosenfeld, T. Pavlidis, M. Eden, M.J.E. Golay, A. Guzman, H. Freeman, G. Matheron and J. Serra.

The third field of activities, which was crucial in the foundation of image processing as we know it, is that of pattern recognition. This accompanied the emergence of artificial intelligence (AI) and automatic learning. Both statistical and structural classification methods emerged during these years, following the works of F. Rosenblatt, S. Watanabe, T. Pavlidis, E. Diday, R.O. Duda, M. Levine, P.E. Hart, M. Pavel, K.S. Fu, J.C. Simon, etc. In image processing, they found a field with exceptional development and progression, because it offers an infinite base for experimentation, where each programmer is also the expert who verifies the quality and solidity of the results.

1.2. The use of images

In the 1960s and in the particular context of the Western world, in a society deeply scarred by the Cold War, highly open to mass consumption and marked by social welfare, we may wonder what applications these various techniques were developed for. Three fields of application largely dominated the academic scene: biological and medical imaging, document processing and television (today, we would speak of “multimedia”). Other domains also emerged, but in a less structured way, e.g. around sensors in physics or the nascent space applications.

In medical imaging, to begin with, efforts were concentrated around radiology, with the aim of dealing with a very high demand for mass preventive health screening. Around radiography, algorithms were constructed for filtering, detection, recognition, contour tracking, density evaluation, etc. The requirements in terms of memory, display, networking and archiving also became apparent, as did the concepts of interaction and annotation. The notions of calibration, registration and change detection also emerged. For a long time, radiologists retained control of the technical platforms and their costly imaging systems. However, at the other side of the hospital, far from the huge instruments of in vivo inspection, another research activity was rapidly emerging in the specialist services (cytology, hematology, histology, etc.), with a view to acquiring biological samples and processing them quickly and safely. This led to the development of imaging to determine form and carry out cell counting, classification and quantification. The notion of texture came into existence. Mathematical morphology found very fertile soil in this domain.

In the domain of television, all work was – unsurprisingly – aimed at compression of the images with a view to reducing the bandwidth of the transmission channels. Very fortunately, these works were accompanied by research that went far beyond this exact objective, which produced a great many results that are still being drawn upon even today, about the quality of the image, its statistical properties, whether it is static or animated, and on the psycho-physiological properties of the human observer, or the social expectations of the viewers. These results have greatly enriched the other domains of application, lending them an exceptional basis of founding principles that have been used in the processing algorithms and the hardware developed to date.

Today, it could be said that document processing has lost its place as the driving force behind image processing; however, it was the object of the most noteworthy efforts in the early 1960s, to help with postal sorting and the archiving of plans and books, and it accompanied the explosion of telecommunications, laying the groundwork for the emergence of “paper-free” office automation. It contributed greatly to the development of cheap analysis equipment (scanners, printers and graphics tablets) and thereby brought about the demise of photographic film and photographic paper. To a large extent, it was because of the requirements of document processing that theories and low-level processing techniques, discrete representation, detection, recognition, filtering and tracking were developed. It stimulated the emergence of original methods for pattern recognition and drove forward the development of syntactic and structural descriptions, grammars, pattern description languages, etc.

To conclude this brief review of the past, let us cite a few phrases taken from old texts that illuminate this particular context, and reread them in the light of our contemporary society. It is striking to note their ongoing pertinence, even if certain words seem very quaint:

“The demand for picture transmission (picturephone images, space pictures, weather maps, newspapers, etc.) has been ever increasing recently, which makes it desirable if not necessary for us to consider the possibility of picture bandwidth compression”. [HUA 72]

Or indeed:

“The rapid proliferation of computers during the past two decades has barely kept pace with the explosive increase in the amount of information that needs to be processed”. [ROS 76]

Compression and processing, the problems facing society half a century ago, are obviously still facing us today, expressed in more or less the same words. We might, therefore, be forgiven for wondering: so what has image processing been doing all this time?

1.3. Strengths and weaknesses of image processing

Let us answer this provocative question with a two-pronged witticism:

– Image processing has solved none of the theoretical problems it set out to solve, but it has solved many of its practical problems.

– Image processing, by solving a handful of problems, has created an armful for itself.

1.3.1. What are these theoretical problems that image processing has been unable to overcome?

To begin with, segmentation remains an unsolved problem after half a century of effort and thousands of articles and communications. We still do not know how to properly deal with this issue without an explicit reference to a human observer who serves simultaneously as worker, reference point and referee. Certainly, the methods have been greatly improved; they are easier to reproduce, more reliable, more easily controllable (see, for example, [KUM 10, GRO 09, LAR 10]), but they are also still just as blind to the object that they are processing, and ignorant of the intentions of their user.

Then, we turn to contour detection – an ambiguous abstraction in itself but commonly shared, necessary at numerous stages but too often unpredictable and with disappointing results (in spite of highly interesting works such as [ARB 11]). Along with segmentation, contours have the great privilege of having mobilized legions of image-processing researchers and witnessed the advent of cohorts of “optimal” detectors that sit in tool boxes, awaiting a user who will likely never come.

Finally, texture detection and recognition still pose a problem: the practical importance of textures is proven in all fields of application, but they do not as yet have a commonly held definition, and far less a robust, reliable and transferable methodology (the recent works [GAL 11, XIA 10] would be of great interest to an inquisitive reader).

1.3.2. What are the problems that image processing has overcome?

To begin with, we might cite the problem of compression that, by successive stages, has enabled the establishment of standards of which the user may not even know the name (or the principles, for that matter), but which enable him to carry, on a USB stick, enough movies for a flight from Paris to New York – or which, at the other end of the scale, compress an artistic photo to a tenth of its original size, without adversely affecting the quality, even for an exacting photographer. Yet it is through these generations of researchers – who have worked on the Hadamard transforms, then on discrete cosine transforms (DCTs) and then on wavelets; who have optimized coefficients, truncations and scans; who have developed motion prediction, interframe coding, visual masking and chromatic quantification – that we have witnessed the emergence of the successive representations, ever more powerful and yet ever more supple in order to be able to adapt to the image, making use of efficient and clever algorithms, capable of meeting the real-time constraints of increasingly demanding applications [CHE 09, COS 10]. An account of the evolutions of compression in the domain of video is given in Chapter 5 of this book. In connection with this topic, Chapter 6 presents an approach for detecting moving objects in a compressed video, which exploits the mode of video compression in the MPEGx standards.
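The "transform, then truncate coefficients" principle mentioned above for DCT-based coding can be sketched in a few lines. This is only an illustration on a one-dimensional block, assuming an orthonormal DCT-II and a crude truncation rule that keeps the first `keep` coefficients. The function names are hypothetical, and real codecs add quantization, scanning and entropy coding on 2D blocks.

```python
import math

def dct(x):
    """Orthonormal DCT-II of a list of samples."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n)
                               for i in range(n)))
    return out

def idct(c):
    """Inverse of dct() above (an orthonormal DCT-III)."""
    n = len(c)
    out = []
    for i in range(n):
        total = c[0] * math.sqrt(1.0 / n)
        total += sum(math.sqrt(2.0 / n) * c[k] * math.cos(math.pi * (i + 0.5) * k / n)
                     for k in range(1, n))
        out.append(total)
    return out

def compress_block(x, keep):
    """Keep only the first `keep` DCT coefficients, then reconstruct the block."""
    c = dct(x)
    truncated = c[:keep] + [0.0] * (len(c) - keep)
    return idct(truncated)
```

For a smooth block, most of the energy sits in the first few coefficients, so `compress_block(block, keep)` with a small `keep` already gives a close approximation. With `keep = len(block)` the block is recovered up to rounding error.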

Enormous leaps forward have been made in the field of pattern recognition: witness the exceptional capacity of face detection and recognition systems, no matter what the size or the type of face, within complex scenes, in crowds, and on varying media [PAR 10]. This function is now routinely available not only in databases of images distributed as free products, but also on all compact photo cameras, mobile telephones and video cameras, where it governs the focusing function, and perhaps in the future will govern the framing and the next stages as well. In this book, applications of these techniques in video analytics are presented for detection (Chapter 7), tracking (Chapters 8 and 9) and recognition of people by facial or iris scans (Chapter 10), as well as for vehicle recognition (Chapter 11).

Next, we can cite the capacity to restore degraded documents [DEL 11], by way of linear or nonlinear filters, identifying the defects either blindly or under supervision, dealing with the non-homogeneities [DEL 06, RAB 11], and supplementing the missing parts with inpainting techniques [AUJ 10]. Here, the available quick and precise focusing systems, astutely combining optical principles and image processing [YAS 10, ZHO 11], are in competition with techniques that – on the contrary – ignore the focusing issues and reconstruct a scene with depth based on a plethora of views, all out of focus.

Finally, a major step forward lies in the management of large image bases, the search for specific objects, the detection of identical elements [MOR 09, SIV 05], determination of overlap and possibly automatic mosaicing of complex scenes.

1.4. What is left for the future?

The progress made by the efforts of researchers creates new requirements and new aspirations in turn. The availability of on-line digital resources has given rise to a universal demand, which at present the communication channels and the archiving media have a limited capacity to satisfy. Attempts are being made to deliver even greater compression than that achieved by wavelets. Technical progress made by developing the classic methods will certainly yield further gains but, among the decisive steps, sparse representation approaches hold out the hope of greater progress [BAR 07, HOR 10]. This progress will probably come at the expense of a great deal of computation, both at the source and at the receiving end, but it seems that, today, the resources to perform these computations are available – particularly in users’ homes, where the workload demanded of the resources is often well below their capacity, but also (why not?) on the cloud. The domain of image or video compression has always progressed through stages. Over the years, a number of techniques have become well established: differential pulse code modulations (DPCMs), DCTs and wavelets. Their competitors, even those better equipped, appear unable to rival their performances, which have gradually been achieved by way of painstaking – and collaborative – optimization of all the parameters. Hence, the superiority of the most powerful approaches can be seen in their performances, but initially at the expense of software or hardware so complex that it will still need years of maturation before it can be affordably implemented in silicon or in algorithms. To date, we have not yet reached a point where the performances of techniques based on redundant dictionaries, model selection and statistical learning can surpass the Daubechies 9/7 wavelets or the Le Gall 5/3 wavelets, but numerous examples are appearing today which suggest that we could soon reach that point.
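To give a sense of why the Le Gall 5/3 wavelet is such an inexpensive benchmark to beat, here is a minimal sketch of one decomposition level written as integer lifting steps (a predict step followed by an update step). It assumes an even-length list of integer samples and mirrored borders. The function names are hypothetical, and this is an illustration rather than a codec-conformant implementation.

```python
def legall53_forward(x):
    """One level of the reversible 5/3 lifting transform on integer samples."""
    if len(x) < 2 or len(x) % 2:
        raise ValueError("this sketch expects an even number of samples")
    even, odd = x[0::2], x[1::2]
    half = len(even)
    # Predict: detail = odd sample minus the floor-average of its two even
    # neighbours (the right neighbour is mirrored at the border).
    d = [odd[i] - ((even[i] + even[min(i + 1, half - 1)]) >> 1)
         for i in range(half)]
    # Update: approximation = even sample plus a rounded quarter of the
    # two neighbouring details (the left detail is mirrored at the border).
    s = [even[i] + ((d[max(i - 1, 0)] + d[i] + 2) >> 2)
         for i in range(half)]
    return s, d

def legall53_inverse(s, d):
    """Undo legall53_forward exactly by reversing the lifting steps."""
    half = len(s)
    even = [s[i] - ((d[max(i - 1, 0)] + d[i] + 2) >> 2) for i in range(half)]
    odd = [d[i] + ((even[i] + even[min(i + 1, half - 1)]) >> 1)
           for i in range(half)]
    x = [0] * (2 * half)
    x[0::2], x[1::2] = even, odd
    return x
```

The transform uses only additions and shifts, and the inverse recovers the input exactly because each lifting step is undone with the same integer arithmetic. On smooth signals the detail list `d` is mostly zeros, which is what makes the representation compressible.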

In the domain of restoration and filtering, groups of techniques are gradually emerging that could offer rapid progress [ABE 97, GAS 07, YAN 10]. They relate to restoration by the use of a large number of potentially heterogeneous images. Successfully employed in satellite imaging to reconstitute multispectral images in high resolution using low-resolution multispectral images and high-resolution panchromatic images, they have also been used to reconstitute images with improved resolution from a series of images with lower resolution, but always in somewhat canonical configurations, which are difficult to generalize. In the next few years we should see the emergence of techniques that exploit the diversity of resolution of the sensors, different angles of observation, varied lighting conditions, differing sensitivities or depths of field in 3D scenes to reconstitute references of the scenes observed, based on the classical work in matching, stereovision and signal processing.

Yet above all, it is in the extracting of information (data mining) and specifically in the mining of semantic data, that this progress is expected (Figure 1.1). The Internet has, in recent years, become very specialized in the use of keywords to access information, and has more or less imposed this on society. Search engines make exclusive use of them, and attempt to associate images and words with indexing operations before their archiving. In spite of the remarkable progress made, the automation of these operations is still in its infancy [MAR 08, HOA 10, THO 06]. This book presents applications for analysis of actions of moving objects (Chapters 13, 14 and 15) or crowds (Chapter 16) or for semantic classification of images and scenes (Chapter 17).

Figure 1.1. Semantic information and image. These photos are all unearthed by the same search request, with the expression “Charles de Gaulle”. Four different categories clearly emerge: the man, the aircraft carrier, the square in Paris and the airport. These concepts are fairly well separated by image processing. We can go a little further: for instance, within the category “Man”, de Gaulle in his role as a politician and in his role as a military general can be distinguished based on the image

While pattern recognition has seen considerable advances, which we have highlighted above, it is essentially an aid to indexing in supervised protocols where the user plays an important part [SAH 11]. This solves a great many specific problems – particularly in specialized professional applications that use precise categories of images. It is now clear that this reliance on a human expert is a significant limitation to the generalization of learning techniques. Recourse to “community or social computing” does not facilitate the implementation of lasting and robust solutions. Automatic extraction of the meaning has as yet eluded researchers [AYT 11, DAT 08].

Reasoning and deduction require elaborate and reliable information [BLO 05]. Ontologies seem to provide the essential references, but their employment in image processing proves limited because of the diversity of representations, the uncertainty of their detection and the difficulties in inferring the morphological or spatial properties based on the variability of the aspects. Inductive or abductive inference is even more difficult to implement. Analogy is another means of deduction that is too infrequently used [ATI 11]. It draws on pre-existing corpora containing explicitly annotated maps, diagrams and drawings; yet because of these limitations, it can only be applied to very specific target domains: cartography, anatomy, biology, etc. It is remarkable that in spite of the enormous progress made in image processing over the decades, we find ourselves still facing the great challenges that were being targeted even in the very earliest days of image processing [UHR 73].

Alongside these “academic” advances, which steer image processing toward abstract developments, toward applied mathematics and theoretical computing, toward models of perception and reasoning, other advances are driven by applications that sit squarely in the crosshairs. In the field of audiovisual technology, progress not only relates to signal compression, but is also expected in signal exploitation to offer the audience – beyond the current transmissions of 2D scenes and beyond the 3D images that are gradually emerging – a true sensory immersion, whereby the content would be released from the small field of the screen by multiview retransmission, facilitating multiple reconstructions, by the capacity to incorporate elements of augmented reality (AR), chosen by the user and, perhaps, produced locally by the users themselves, incorporating effects from their environment, from their favorite movies or games. Interactivity, free of joysticks and other controls, would complete this immersion by desired (or inhibited) reactivity to the observer’s actions, to accompany them in their entertainments – or possibly in their education, because similar techniques will certainly be at the heart of modern learning techniques. Image processing will thus join forces with image synthesis and human/machine interaction in a single field of AR. These advances, when realized, will have truly brought the image into a new era. After the fixed image created by photography, after the animated image of the cinema, relayed by television, the image in immersive AR will constitute a new stage – just as revolutionary as any that have gone before.

1.5. Bibliography

[ABE 97] ABED-MERAIM K., HUA Y., “Blind identification of multi-input multi-output system using minimum noise subspace”, IEEE Transactions on Signal Processing, vol. 45, no. 1, pp. 254–258, 1997.

[ARB 11] ARBELÁEZ P., MAIRE M., FOWLKES C., MALIK J., “Contour detection and hierarchical image segmentation”, IEEE Transactions on PAMI, vol. 33, no. 5, pp. 898–916, 2011.

[ATI 11] ATIF J., HUDELOT C., BLOCH I., “Abduction in description logics using formal concept analysis and mathematical morphology: application to image interpretation”, 8th International Conference on Concept Lattices and Their Applications (CLA2011), Nancy, France, pp. 405–408, October 2011.

[AUJ 10] AUJOL J.F., LADJAL S., MASNOU S., “Exemplar-based inpainting from a variational point of view”, SIAM Journal on Mathematical Analysis, vol. 42, no. 3, pp. 1246–1285, 2010.

[AYT 11] AYTAR Y., ZISSERMAN A., “Tabula rasa: model transfer for object category detection”, ICCV, Barcelona, Spain, 2011.

[BAR 07] BARANIUK R.G., “Compressive sensing”, IEEE Signal Processing Magazine, vol. 24, no. 4, pp. 118–121, 2007.

[BLO 05] BLOCH I., “Fuzzy spatial relationships for image processing and interpretation: a review”, Image and Vision Computing, vol. 23, pp. 89–110, 2005.

[CHE 09] CHEN Y., WANG Y.K., UGUR K., HANNUKSELA M., LAINEMA J., GABBOUJ M., “The emerging MVC standard for 3D video services”, EURASIP Journal on Advances in Signal Processing, vol. 2009, article no. 8, 2009.

[COS 10] COSSALTER M., VALENZISE G., TAGLIASACCHI M., TUBARO S., “Joint compressive video coding and analysis”, IEEE Transactions on Multimedia, vol. 12, no. 3, pp. 168–183, 2010.

[DAT 08] DATTA R., JOSHI D., LI J., WANG J.Z., “Image retrieval: ideas, influences and trends of the new age”, ACM Computing Surveys, vol. 40, no. 2, article no. 5, 2008.

[DEL 11] DELEDALLE C.A., DUVAL V., SALMON J., “Non-local methods with shape-adaptive patches (NLM-SAP)”, Journal of Mathematical Imaging and Vision, 2011.

[DEL 06] DELON J., “Movie and video scale-time equalization: application to flicker reduction”, IEEE Transactions on Image Processing, vol. 15, no. 1, pp. 241–248, 2006.

[GAL 11] GALERNE B., GOUSSEAU Y., MOREL J.M., “Random phase textures: theory and synthesis”, IEEE Transactions on Image Processing, vol. 20, no. 1, pp. 257–267, 2011.

[GAS 07] GASTAUD M., LADJAL S., MAÎTRE H., “Blind filter identification and image superresolution using subspace methods”, Eusipco, Poznan, Poland, September 2007.

[GRO 09] GROSJEAN B., MOISAN L., “A-contrario detectability of spots in textured backgrounds”, Journal of Mathematical Imaging and Vision, vol. 33, no. 3, pp. 313–337, 2009.

[HOA 10] HOANG N.V., GOUET-BRUNET V., RUKOZ M., MANOUVRIER M., “Embedding spatial information into image content description for scene retrieval”, Pattern Recognition Journal, vol. 43, no. 9, pp. 3013–3024, 2010.

[HOR 10] HORMATI A., ROY O., LU Y.M., VETTERLI M., “Distributed sampling of signals linked by sparse filtering: theory and applications”, IEEE Transactions on Signal Processing, vol. 58, no. 3, pp. 1095–1109, 2010.