Proposal for an INRIA Project-Team

PULSAR
Perception Understanding and Learning Systems for Activity Recognition

30 August 2007

Theme: CogC (Multimedia data: interpretation and man-machine interaction)

Abstract

This research project-team will be focused on Activity Recognition. More precisely, we are interested in the real-time semantic interpretation of dynamic scenes observed by sensors. We will thus study long-term spatio-temporal activities performed by human beings, animals or vehicles in the physical world. The major issue in semantic interpretation of dynamic scenes is the gap between the subjective interpretation of data and the objective measures provided by the sensors. Our approach to address this problem is to keep a clear boundary between the application-dependent subjective interpretations and the objective general analysis of the videos. Pulsar will propose new techniques in the field of cognitive vision and cognitive systems for physical object recognition, activity understanding, activity learning, system design and evaluation, and will focus on two main application domains: safety/security and healthcare.

Keywords

Activity Recognition, Cognitive Vision, Cognitive Systems, Scene Understanding, Software Architecture, Computer Vision, Knowledge-based Systems, Software Engineering

Note: A pulsar is a rapidly rotating neutron star. The radiation from such a star appears to come in a series of regular pulses (one per revolution), which explains the name.


Contents

1 The Team
2 Context and Overall Goal
3 From Orion to Pulsar
4 Research Directions
  4.1 Scene Understanding for Activity Recognition
    4.1.1 Related Work
    4.1.2 Proposed Approach
  4.2 Software Architecture for Activity Recognition
    4.2.1 Software Engineering Aspects of Activity Recognition
    4.2.2 Related Work
    4.2.3 Proposed Approach
5 Objectives for the next 4 years
  5.1 Scene Understanding for Activity Recognition
  5.2 Software Architecture for Activity Recognition
6 Evaluation
7 Positioning and Collaborations
  7.1 Positioning within INRIA
  7.2 Academic collaborations and positioning
    7.2.1 National
    7.2.2 International
    7.2.3 General
  7.3 Industrial Collaborations and Funding
8 References
9 Bibliography of the Orion Team (from 1998)
A The Lama Platform
  A.1 Motivation and Approach
  A.2 Lama Contents
  A.3 Lama Status
B The VSIP Platform
  B.1 Motivation and Approach
  B.2 Detailed Contents
  B.3 VSIP Status
C Performances of Orion Videosurveillance Systems
  C.1 Introduction
  C.2 Evaluation Results
    C.2.1 European Project ADVISOR
    C.2.2 Industrial Project Cassiopee
    C.2.3 Industrial Project Videa
  C.3 Conclusion


1 The Team

The permanent scientific members are INRIA scientists with different backgrounds and skills. François Brémond and Monique Thonnat are cognitive vision scientists with a large experience in artificial intelligence and computer vision. Sabine Moisan and Annie Ressouche are computer scientists with, respectively, a large experience in artificial intelligence software design and in formal techniques for software verification. Guillaume Charpiat is a new junior scientist specialized in computer vision (PhD with O. Faugeras) and statistics (Post-doc with B. Schölkopf), http://www.kyb.mpg.de/~charpiat/.

Permanent scientific members
• Team Leader: Monique Thonnat, DR1, HDR
• Research scientists INRIA:
  François Brémond, CR1, HDR
  Guillaume Charpiat, CR2 (from December 2007)
  Sabine Moisan, CR1, HDR
  Annie Ressouche, CR1
• Team Assistant: Catherine Martin (part time), INRIA

Non-permanent team members
• External collaborator: Prof. Jean-Paul Rigault, UNSA (University of Nice Sophia Antipolis)
• Post-doc: Sundaram Suresh, PhD
• PhD Students:
  Bui Binh (Cifre RATP/UNSA) [2nd year]
  Mohamed Becha Kaaniche (UNSA) [1st year]
  Naoufel Khayati (STIC Tunisie Tunis/UNSA) [3rd year]
  Vincent Martin (UNSA) [3rd year]
  Anh Tuan Nghiem (UNSA) [1st year]
  Lan Le Thi (AUF Hanoi/UNSA) [2nd year]
  Nadia Zouba (UNSA) [1st year]
  Marcos Zúniga (Conicyt/UNSA) [2nd year]
• Research engineers INRIA:
  Bernard Boulay, PhD (PACAlab STMicroelectronics)
  Etienne Corvée, PhD (European project CARETAKER)
  Ruihua Ma, PhD (European project SERKET)
  Luis Patino Vilchis, PhD (European project CARETAKER)
  Valery Valentin (System@tic project SIC)

HDR stands for "Habilité à Diriger des Recherches", the diploma enabling its holder to supervise PhD theses. CR2 stands for Chargé de Recherches second class (junior scientist), CR1 for Chargé de Recherches first class, and DR1 for Directeur de Recherches (senior scientist).


2 Context and Overall Goal

PULSAR (Perception Understanding and Learning Systems for Activity Recognition) is the continuation of the Orion INRIA project-team in the Cognitive Systems theme "Multimedia data: interpretation and man-machine interaction", CogC in short.

This new research project-team will be focused on Activity Recognition. More precisely, we are interested in the real-time semantic interpretation of dynamic scenes observed by sensors. We will thus study long-term spatio-temporal activities performed by several interacting human beings, animals or vehicles in the physical world. A typical example of the complex activities in which we are interested is shown in figure 1 for aircraft monitoring in apron areas. In this example the duration of the servicing activities around the aircraft is about twenty minutes and the activities involve interactions between several ground vehicles and human operators. The goal is to recognize these activities through formal activity models (as shown in figure 2) and data captured by a network of video cameras (such as the ones shown in figure 3). For more details, refer to the related European project website http://www.avitrack.net/. Activity recognition should not be confused with action recognition. Action recognition seeks to detect short-term sequences of configurations of an individual articulated form (typically human) that are characteristic of specific human actions such as grasping or touching an object. In the computer vision community the term video event is often used as a general term for both a short-term action and a long-term activity.

The major issue in semantic interpretation of dynamic scenes is the gap between the subjective interpretation of data and the objective measures provided by the sensors. The interpretation of a video sequence is not unique; it depends on the a priori knowledge of the observer and on his/her goal. For instance, a video showing apron scenes will have different interpretations from the point of view of a security officer (looking for dangerous situations) and from that of an airport manager (looking at servicing operation scheduling).

Our approach to address this problem is to keep a clear boundary between the application-dependent subjective interpretations and the objective analysis of the videos. We thus define a set of objective measures which can be extracted in real-time from the videos, we propose formal models to enable users to express their activities of interest, and we build matching techniques to bridge the gap between the objective measures and the activity models.

Pulsar will propose new techniques in the field of cognitive vision and cognitive systems for physical object recognition, activity understanding and learning, and dependable activity recognition system design and evaluation. We will have a multidisciplinary and pragmatic approach. Like Orion, Pulsar will be a combination of research in computer vision, software engineering and knowledge engineering. More precisely, we will propose new computer vision techniques for mobile object perception with a focus on real-time algorithms and 4D analysis (e.g. 3D models and long-term tracking). We will deepen our research work on knowledge representation and symbolic reasoning for activity recognition. We will study how statistical techniques and machine learning can in general complement our a priori knowledge techniques for activity recognition. Pulsar will deepen our research work on software engineering for the design and the evaluation of effective and efficient activity recognition systems. For instance, we will continue our work done in ETISEO [217] (http://www.etiseo.net) on designing new evaluation techniques taking into account the diversity of video sequences. Pulsar will also take advantage of a pragmatic approach, working on concrete problems of activity recognition to propose new cognitive systems techniques inspired by and validated on applications, in a virtuous cycle. Thus Pulsar will provide both theoretical results for activity recognition (e.g. formal models of activity, methodology for activity recognition system design, etc.) and concrete results (e.g. real-time algorithms, performance evaluation metrics, prototypes of integrated activity recognition systems, etc.).


Figure 1: Activity recognition problem in an airport: a) overview of the main servicing operations and of the location of the video cameras (in blue); b) front view of the apron area showing the video cameras.

Figure 2: Activity recognition problem in an airport: example of an activity model describing an unloading operation with a high-level language.

Activity recognition is a hot topic in the academic field, not only due to scientific motivations but also due to strong demands coming from industry and society, in particular for videosurveillance and homecare. In fact, there is an increasing need to automate the recognition of activities observed by visual sensors (usually CCD cameras, omnidirectional cameras, infrared cameras), but also by microphones and other sensors (e.g. optical cells, physiological sensors) [237]. Pulsar will focus its work on two main application domains: safety/security and healthcare. Our experience in videosurveillance (see figure 4) for safety/security [149, 139, 132, 133, 134] is a strong basis which ensures both a precise view of the research topics to develop and a network of industrial partners (end-users, integrators and software vendors) providing data, objectives, evaluations and funding. Pulsar will develop new research activities for activity monitoring in healthcare (in particular assistance to the elderly). In this domain, global trajectory analysis and short-term temporal analysis are not sufficient. An issue will be to develop detailed human shape and long-term temporal analyses. We have initiated new collaborations with several hospitals (CHU Nice, Prof. Chauvel in Marseille La Timone) to understand their needs. We have set up an experimental laboratory (Ger'home) at Sophia Antipolis together with CSTB (the French scientific and technical centre for building, http://international.cstb.fr/) to test new sensors and new activity recognition techniques (see figure 5).
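To give a flavour of the kind of formal activity model referred to above (figure 2), the following sketch encodes a composite activity as plain data: physical objects playing roles, sub-events (components) and constraints between them. This is only an illustration in Python with hypothetical names and values; it is not the actual Orion/VSIP scenario description language.

    # Hypothetical encoding of a composite activity model (illustration only).
    # Physical objects play roles; components are sub-events; constraints relate them.
    UNLOADING_OPERATION = {
        "name": "Unloading_Operation",
        "physical_objects": {
            "aircraft": "Aircraft",
            "loader": "Loader_Vehicle",
            "worker": "Person",
            "unloading_zone": "Zone",      # a priori 3D scene knowledge
        },
        "components": [
            ("c1", "Vehicle_Enters_Zone", ["loader", "unloading_zone"]),
            ("c2", "Vehicle_Stops_Near",  ["loader", "aircraft"]),
            ("c3", "Person_Close_To",     ["worker", "loader"]),
        ],
        "constraints": [
            ("c1", "before", "c2"),        # temporal ordering between sub-events
            ("c2", "before", "c3"),
            ("duration_max", "c2", 1200),  # servicing step bounded to 20 minutes
        ],
        "alert": "Unloading operation recognized",
    }

The point of such a declarative form is that end-users can express their activities of interest without touching the perception code, while the matching techniques bridge the gap with the objective measures extracted from the videos.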


Figure 3: different views of the apron area captured by the video cameras.

3 From Orion to Pulsar

Among the five permanent research scientists who are members of Pulsar, four belonged to the Orion INRIA project-team; G. Charpiat has recently been hired and he will join the team in December 2007. Orion was created in July 1995 in the theme Data bases, Knowledge bases and Cognitive Systems (3A in short). Orion was, in the same way as Pulsar is now, a multi-disciplinary team at the frontier of computer vision, knowledge-based systems, and software engineering. In 1995 the two research axes were Image Understanding and Program Supervision. These research axes have evolved in several ways during the last years. The image understanding axis had two main evolutions: first the increasing importance of research in video understanding and second the emergence of a cognitive vision approach mixing a priori knowledge [145, 238] with ontologies [153], learning [152] and even program supervision [150]. The program supervision axis has evolved from a knowledge-based approach for reusing a library of programs to a research axis on object-oriented frameworks for knowledge-based systems, which led to the Lama platform (1).

The two main research directions of Pulsar are respectively scene understanding for activity recognition and software architecture for activity recognition. The major differences between Orion and Pulsar are described below.

First, a focus on spatio-temporal activities of physical objects, which will be studied not only from the scene understanding point of view but also from the software engineering point of view.

Second, the new application domain in healthcare, especially assistance to the elderly, which will bring interesting and challenging new problems.

(1) Lama [254] is a software platform for designing not only knowledge bases, but also inference engines and additional tools. It offers toolkits to build and to adapt all the software elements that compose a knowledge-based system or a cognitive system (for more details see Annex A).


Figure 4: Typical videosurveillance problems addressed by Orion: unruly behaviors detection on trains, apron monitoring at an airport, lock chamber monitoring, bank agency access control.

Figure 5: Examples of activity monitoring for healthcare: elderly assistance in the Ger'home project at CSTB premises; epileptic patient monitoring (courtesy of Prof. Chauvel, Inserm/La Timone hospital).


Third, a new research area in machine learning, with the objective of proposing complete cognitive systems mixing perception, understanding and learning techniques.

Finally, the enrichment of the visual data with other sensors.

Pulsar can rely on several results obtained by Orion in the last years. Among them we can cite the following ones:

• an original 4D semantic approach for video understanding including mobile object detection, classification, tracking and complex scenario recognition [144, 146, 147, 149, 255, 256, 258],
• an original program supervision approach for software reuse including knowledge description languages for operators and planning techniques [154, 158, 159, 207, 210, 231, 256],
• a visual concept ontology [152, 153, 252] and a video event ontology [258],
• the VSIP (2) platform for real-time video understanding [139, 150, 250]; VSIP led to several products and to the creation of Keeneo (www.keeneo.com) in July 2005 for its industrialisation (see Annex B for more details),
• the Lama platform that provides a unified environment to design knowledge bases and inference engines [156, 157, 254] (see Annex A for more details).

Annex C summarizes the performances obtained by Orion in videosurveillance using our 4D semantic approach and our VSIP real-time video understanding library. This annex presents the results of three evaluation campaigns led by industrial partners and end-users in the European project ADVISOR, the industrial project Cassiopee and the industrial project Videa. In conclusion, these performance evaluations show that our current systems are very robust for simple activities mainly based on the 3D localisation and tracking of a small number of persons inside a metro station, a bank agency or a building. Overall there are more than 90% true positive (correct) detections and fewer than 2 false alarms over several hours. The results for more complex activities such as fighting are promising (95% true positives in the case of isolated persons inside a metro station and about 60% in the case of a far field of view of a busy urban environment) but these results are fragile and need to be improved. In particular, new research needs to be launched to improve the precision of object description and to combine a priori knowledge and learning techniques for setting the various models of an activity recognition system.

4 Research Directions

We propose to follow two main research directions: scene understanding for activity recognition and software architecture for activity recognition. These two research directions are interleaved: the software architecture direction will provide new methodologies for building safe activity recognition systems and the scene understanding direction will provide new activity recognition techniques which will be designed and validated for concrete videosurveillance and healthcare applications. Conversely, these concrete systems will raise new software issues that will enrich the software architecture research direction.

(2) VSIP is a real-time Intelligent Videosurveillance Software Platform composed of (1) image processing algorithms in charge of mobile object detection, classification and frame-to-frame tracking; (2) multi-camera algorithms for the spatial and temporal analysis (4D) of the detected mobile objects; (3) high-level scenario recognition algorithms. VSIP is oriented to help developers build activity monitoring systems and describe their own scenarios dedicated to specific applications.
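As a rough illustration of the layering described in footnote (2), the following sketch wires the three stages of such a platform together: per-camera detection, multi-camera fusion into tracks, and scenario recognition against activity models. All function names and data structures are hypothetical placeholders, not VSIP code.

    from typing import Dict, List

    def detect_objects(frame) -> List[Dict]:
        """Image-processing layer: mobile object detection and classification
        (placeholder returning an empty list)."""
        return []

    def fuse_and_track(per_camera: Dict[str, List[Dict]]) -> List[Dict]:
        """Multi-camera layer: combine per-camera detections into spatio-temporal
        (4D) tracks. Here it simply tags detections with their camera of origin."""
        return [dict(d, camera=cam) for cam, dets in per_camera.items() for d in dets]

    def recognize_scenarios(tracks: List[Dict], models: List[Dict]) -> List[str]:
        """Scenario-recognition layer: trivially report the models whose required
        object classes are all present among the tracked objects."""
        present = {t.get("class") for t in tracks}
        return [m["name"] for m in models
                if set(m.get("required_classes", [])) <= present]

    def process(frames: Dict[str, object], models: List[Dict]) -> List[str]:
        detections = {cam: detect_objects(f) for cam, f in frames.items()}
        return recognize_scenarios(fuse_and_track(detections), models)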


4.1 Scene Understanding for Activity Recognition

Participants (3): François Brémond, Guillaume Charpiat, Monique Thonnat and Sabine Moisan.
(3) The main permanent research scientists working in this research direction are in bold font; permanent research scientists working only partly in this research direction are in regular font.

Despite a few success stories, such as traffic monitoring (e.g. http://www.citilog.fr), swimming pool monitoring (e.g. http://www.poseidon-tech.com) and intrusion detection (e.g. http://www.objectvideo.com), scene understanding systems remain brittle and can function only under restrictive conditions (e.g. daylight, diffuse lighting conditions, no shadows), with poor performance over time [139]. Moreover, these systems are very specific and need to be developed from scratch for each new application. To address these issues, most researchers have tried to develop new vision algorithms, robust enough to handle real-life conditions. Up to now no vision algorithm has been able to address the large variety of conditions characterizing real-world scenes in terms of sensor conditions, hardware requirements, lighting conditions, physical objects, and application objectives.

4.1.1 Related Work

The state of the art in scene understanding embraces several decades of productive work. It recently led to a new research theme: Cognitive Vision.

Scene understanding is an ambitious research topic which aims at solving the complete interpretation problem, ranging from low-level signal analysis to the semantic description of what is happening in the scene viewed by video cameras and possibly other sensors. This issue implies solving several problems grouped in three major categories: perception, understanding and learning.

Video Processing for Activity Recognition. Several teams (e.g. University of Leeds, University of Hamburg, University of Southern California, Georgia Tech, University of Central Florida, Kingston University upon Thames, National Cheng Kung University and the Prima team at INRIA Grenoble) work on video processing for activity recognition, and a huge literature describes video processing techniques. Among the classical approaches, we can cite motion estimation [126, 115, 36], object detection from a background image [82, 106], feature-based object detection [65, 85], template-based object detection [48], crowd detection [130, 6], human-body detection [116, 59, 114, 29, 38, 2], gait analysis [96, 76], vision systems [60, 34, 44] and video processing performance evaluation [13, 64, 131]. Several surveys [3, 118, 90] categorize these techniques in a more or less exhaustive way.

We highlight one major issue related to perception for activity recognition: object tracking and multi-sensor analysis. Much work has studied tracking algorithms (mostly based on feature tracking [54, 86] or template tracking [18, 28], using for instance probabilistic techniques [43, 55, 24], 3D techniques [37, 113], active-vision techniques [49], long-term techniques [128, 10] or graph-based techniques [112, 58]) and information fusion techniques (mostly with multiple cameras [103, 42, 17] and, less frequently, by combining video cameras with other sensors [40, 31, 51]) to improve results and robustness. In particular, an interesting work [89] has addressed the problem of multi-view tracking using synchronized cameras. This system is able to combine information coming from multiple camera pairs (i.e., up to 16 synchronized cameras were used in the experiments) in order to handle occlusions and correctly track densely located objects in a cluttered scene. However, due to its complexity, the system is not yet able to work in real time (i.e., it currently takes approximately 5 seconds per processing step).
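For readers unfamiliar with the tracking problem discussed above, the sketch below shows the simplest possible frame-to-frame association scheme: each detection is greedily linked to the previous track whose bounding box overlaps it most. It is deliberately naive (no exclusivity, no occlusion handling, single camera) and is only meant to illustrate the baseline on which the cited multi-view and long-term techniques improve; the threshold and names are ours, not Orion's algorithm.

    from itertools import count

    def iou(a, b):
        """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
        ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
        ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
        inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
        area_a = (a[2] - a[0]) * (a[3] - a[1])
        area_b = (b[2] - b[0]) * (b[3] - b[1])
        return inter / float(area_a + area_b - inter) if inter else 0.0

    def track(frames, threshold=0.3):
        """Greedy frame-to-frame association: a detection joins the best-overlapping
        previous track, otherwise a new track starts. Real trackers add exclusivity,
        motion models and long-term coherence, which this toy version ignores."""
        next_id = count()
        tracks = {}                       # track id -> last known box
        history = []                      # per frame: list of (track id, box)
        for boxes in frames:              # frames = list of lists of boxes
            assigned = []
            for box in boxes:
                best = max(tracks.items(), key=lambda kv: iou(kv[1], box), default=None)
                tid = best[0] if best and iou(best[1], box) >= threshold else next(next_id)
                tracks[tid] = box
                assigned.append((tid, box))
            history.append(assigned)
        return history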


In the same way, recent work has been done to combine different types of media (video, audio and others) [237] to move from video understanding to scene understanding. However, a general framework is needed to perform this information fusion seamlessly. In comparison, the Pulsar team will design a global framework for 4D spatio-temporal reasoning.

Understanding. Concerning related work in high-level understanding for activity recognition, we focus on two aspects: event recognition and ontology-based knowledge acquisition.

An increasing effort is devoted to recognizing events from videos [9, 19, 122], in particular using motion-based techniques [27, 99, 80, 97]. For instance, several studies have addressed homecare applications [71, 30]. Along with the work achieved in event recognition, two main categories of approaches are used to recognize temporal scenarios, based on (1) a probabilistic network [110, 73, 124] (especially using Bayesian techniques [109, 83] or Hidden Markov Models, HMMs [52, 7]) or a neural network [16, 72] combining potentially recognized scenarios, or (2) a symbolic network that stores all previously recognized scenarios [94, 101, 41]. For instance, among the probabilistic techniques, Hongeng et al. [70] propose a scenario recognition method that uses concurrent Bayesian threads to estimate the likelihood of potential scenarios. Most of these techniques allow an efficient recognition of scenarios, but there are still some temporal constraints which cannot be processed. Moreover, most of these approaches require that the scenarios be bounded in time. To address this issue and to handle uncertainty, the Pulsar team will extend Orion's original approach to scenario recognition based on temporal constraint checking [238, 258].

Knowledge acquisition is still a complex and time-consuming task but it is required to obtain an effective system to recognize complex objects. Most approaches use dedicated knowledge dependent on the application domain. For instance, in [117] object description is based on domain-related annotations. We propose to use a perceptual concept ontology to ease the knowledge acquisition task, in the same way as we have done in [153, 152, 252] for static 2D visual concepts.
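As an illustration of what temporal constraint checking between sub-events can look like (the actual formalism of [238, 258] is richer and is being extended to handle uncertainty), here is a minimal sketch using a small subset of interval relations; the event names reuse the hypothetical model sketched in section 2.

    def satisfies(events, constraints):
        """events: name -> (start, end) intervals in seconds.
        constraints: list of (a, relation, b) with relation in
        {'before', 'meets', 'during', 'overlaps'} (a small subset of Allen's relations).
        Returns True when every constraint holds; a recognizer would call this on
        each candidate combination of detected primitive events."""
        checks = {
            "before":   lambda a, b: a[1] <  b[0],
            "meets":    lambda a, b: a[1] == b[0],
            "during":   lambda a, b: b[0] <= a[0] and a[1] <= b[1],
            "overlaps": lambda a, b: a[0] < b[0] < a[1] < b[1],
        }
        return all(checks[rel](events[a], events[b]) for a, rel, b in constraints)

    # Example: the loader must enter the zone before it stops near the aircraft.
    detected = {"Vehicle_Enters_Zone": (10, 25), "Vehicle_Stops_Near": (40, 900)}
    ok = satisfies(detected, [("Vehicle_Enters_Zone", "before", "Vehicle_Stops_Near")])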
Learning. Concerning related work in learning for activity recognition, we focus on two aspects: learning techniques for object and event recognition and learning techniques for program parameter tuning.

Regarding event recognition, most approaches learn event patterns using statistical techniques [125, 79, 68, 50] and several approaches can deduce abnormalities from the learnt events [121, 127]. Regarding object recognition, a great effort has also been devoted to representing objects via their 2D characteristics computed with statistical techniques [92, 33, 81]. For instance, the Lear project of INRIA is working on object recognition and scene interpretation from static images and video sequences. They use shape and affine interest point information to describe objects in images. This numerical content is learned to build visual models of object classes. Therefore, visual description and object models are mainly based on low-level features. They do not use symbolic information to describe the objects.

Little work has been done to design learning techniques for system tuning and for setting program parameters [63, 105, 88]. For instance, in the Prima team at INRIA Grenoble, Hall [62] proposes an automatic parameter regulation scheme for a tracking system. An auto-critical function is able to detect a drop in system performance with respect to a scene reference model. In such a situation, an automatic regulation module is triggered to provide a new parameter setting with better performance. The comparison with a manually tuned system shows that the controlled system is not able to reach the same performance. This is due to the fact that the controlled system does not use a priori knowledge of the programs, whereas the expert takes advantage of this kind of knowledge and is rapidly able to understand which parameters are related to which tracking problems. This work shows that this research domain is new and that more work is still needed. To tackle this problem, the Pulsar team proposes to use program supervision techniques to configure and control system execution, following the promising results we have obtained in [150, 250]. This involves adding a reasoning level which combines a priori knowledge of the program to tune with learned parameter values as a function of the contextual class.
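The following sketch illustrates, with invented names and values, the kind of reasoning level mentioned above: learned parameter settings are indexed by contextual class (e.g. an illumination context), but remain constrained by the a priori knowledge an expert declared about the program. It is neither the Prima regulation scheme nor our program supervision engine, only a toy illustration of combining both sources of knowledge.

    # A priori knowledge about one program: valid parameter ranges and defaults.
    A_PRIORI = {
        "detection_threshold": {"range": (0.05, 0.95), "default": 0.5},
        "min_object_size":     {"range": (10, 500),    "default": 80},
    }

    # Parameter values learned off-line for each contextual class (for instance
    # obtained by clustering intensity histograms into illumination contexts).
    LEARNED = {
        "daylight":  {"detection_threshold": 0.40, "min_object_size": 60},
        "low_light": {"detection_threshold": 0.25, "min_object_size": 120},
    }

    def select_parameters(context_class):
        """Combine learned values with a priori ranges: learned settings are used
        when available but always clipped to the ranges declared valid by the expert."""
        params = {}
        for name, knowledge in A_PRIORI.items():
            lo, hi = knowledge["range"]
            value = LEARNED.get(context_class, {}).get(name, knowledge["default"])
            params[name] = min(max(value, lo), hi)
        return params

    print(select_parameters("low_light"))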


4.1.2 Proposed Approach

Our goal is to propose innovative techniques for autonomous and effective scene understanding systems for long-term activity recognition [150, 250]. This objective is very ambitious; however, the current state-of-the-art techniques in cognitive vision have already led to partial solutions [23, 95, 26, 123].

The major issue in semantic interpretation of dynamic scenes is the gap between the subjective interpretation of data and the objective measures provided by the sensors. Our approach to solve this problem is to keep a clear boundary between the application-dependent subjective interpretations and the objective analysis of the videos. We thus define a set of objective measures which can be extracted in real-time from the videos, we also propose formal models to enable users to define their activities of interest, and we build matching techniques to bridge the gap between the objective measures and the activity models.

Scene understanding is a complex process where information is abstracted through four levels: signal (e.g. pixel, sound), perceptual features, physical objects and activities. The signal and feature levels are characterized by strong noise, ambiguous, corrupted and missing data. The whole process of scene understanding consists in analysing this information to bring forth pertinent insights into the scene and its dynamics. Moreover, to reach a semantic abstraction level, activity models are the crucial point. A still open issue consists in determining whether these models should be given a priori or learned. The whole challenge consists in organizing this knowledge in order to capitalize experience, share it with others and update it along with experimentation. To face this challenge, tools from knowledge engineering such as ontologies are needed.

More precisely, we plan to work along the following research axes: perception (how to detect physical objects), understanding (how to recognize their activities based on high-level activity models) and learning (how to learn the models needed for activity recognition). The results of this work will be integrated as perception, understanding and learning components in the PULSAR platform for activity recognition, which will be described in section 4.2.3.

Perception for Activity Recognition. Our contribution in perception will be twofold: first to propose new techniques for physical object recognition, and second to deepen our work on the management of a library of video processing programs.

A first issue is to recognize physical objects in real time from perceptual features and predefined 3D models. It requires finding a good balance between efficient methods and precise spatio-temporal models. We will not work on hardware solutions (such as GPUs, Graphics Processing Units, or FPGAs, Field-Programmable Gate Arrays) to reach video frame rate performance, but we will design software solutions, either by adapting existing algorithms or by proposing new ones. We plan to work on information fusion to handle perceptual features coming from various sensors (several cameras covering a large-scale area or heterogeneous sensors capturing more or less precise and rich information). Currently we are working with CSTB (Centre Scientifique et Technique du Bâtiment) to set up Ger'home (http://gerhome.cstb.fr/), an experimental laboratory to study different sensor configurations.
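As a toy example of fusing heterogeneous sensor data of the kind collected in Ger'home, the sketch below confirms a video-based detection with an environmental sensor by simple temporal alignment. The sensor names, labels and tolerance are invented; a real system would handle uncertainty and much richer event models.

    def fuse(video_events, contact_events, max_gap=5.0):
        """Very simple fusion by temporal alignment: a video-based 'person in kitchen'
        detection is confirmed when a kitchen sensor (e.g. cupboard contact) fires
        within max_gap seconds. Both inputs are lists of (timestamp, label)."""
        fused = []
        for t_video, label in video_events:
            confirmed = any(abs(t_video - t_sensor) <= max_gap
                            for t_sensor, _ in contact_events)
            fused.append((t_video, label, "confirmed" if confirmed else "video-only"))
        return fused

    video = [(12.0, "person_in_kitchen"), (70.0, "person_in_kitchen")]
    cupboard = [(14.5, "cupboard_open")]
    print(fuse(video, cupboard))   # first event confirmed, second not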


For activity recognition we need robust and coherent object tracking over long periods of time (often several hours in videosurveillance and several days in healthcare). Despite all the work done in object detection and tracking, state-of-the-art algorithms remain brittle. To guarantee the long-term coherence of tracked objects, spatio-temporal reasoning is required. Modeling and managing the uncertainty of these processes is also an open issue. In Pulsar we propose to add a reasoning layer to a classical Bayesian framework modelling the uncertainty of the tracked objects. This reasoning layer will take into account the a priori knowledge of the scene for outlier elimination and long-term coherency checking. Moreover, a lot of research work is needed to provide fine and accurate models of human shape and gesture. Currently the VSIP library is limited to coarse motion detection and global 3D features. We plan to deepen the work we have done on human posture recognition with B. Boulay [144, 176, 177] to match 3D models and 2D silhouettes. We also plan to work on gesture recognition based on feature tracking; for instance M. B. Kaaniche is starting a PhD on this topic. These kinds of detailed 3D features are required, for instance, for patient monitoring in healthcare.

A second issue is to manage a library of video processing programs. One contribution in this domain will be to build a perception library in the same way VSIP was developed in Orion. More precisely, we will not only select robust algorithms for feature extraction but also ensure they work efficiently under real-time constraints and formalize their conditions of use within program supervision models. In the case of video cameras at least two problems are still open: robust image segmentation and meaningful feature extraction. For these issues we plan to use supervised learning techniques (see the paragraph on learning below).

Understanding for Activity Recognition. A second research axis is to recognize subjective activities of physical objects (i.e. human beings, animals, vehicles) based on a priori models and on objective perceptual measures (e.g. robust and coherent object tracks).

We propose to define new activity recognition algorithms and activity models. Activity recognition includes the computation of relationships between physical objects. The real challenge is to explore efficiently all the possible spatio-temporal relationships of the objects that may correspond to activities of interest. The variety of these activities, generally called video events, is huge and depends on their spatial and temporal granularity, on the number of physical objects involved in the events, and on the event complexity (number of components constituting the event). One challenge is to explore and analyse this large event space without getting lost in combinatorial searches during the on-line processing of the video streams.

Concerning the modeling of activities, there are two kinds of open problems: the introduction of uncertainty in the formalism for expressing a priori knowledge, and the development of knowledge acquisition facilities based on ontological engineering techniques. For the first problem we will investigate either classical statistical techniques and fuzzy logics or new approaches mixing the two kinds of techniques, such as Markov Logic Networks [39].
We need to extend our current visual concept ontology (currently limited to colour, texture and spatial concepts) with temporal concepts (motion, trajectories, events, ...) and other perceptual concepts (such as audio concepts or physiological sensor concepts). A short-term goal will be to add audio concepts, respectively in the framework of the European SERKET project for security and in the framework of the European project CARETAKER [179] for activity monitoring.


Learning for Activity Recognition. The third research axis is to study which machine learning techniques could help to learn the models needed for activity recognition. Since it is still very difficult to build an activity recognition system for a new application, we propose to study how machine learning techniques can automate model building or model enrichment at the perception level and at the understanding level.

At the perception level, a remaining open problem is low-level detection and in particular image segmentation. A possible approach to improve image segmentation is to use learning techniques for image segmentation method selection and for image segmentation parameter optimization. A PhD student (V. Martin [208]) is currently working on this topic. For an image sampling set, a user provides ground-truth data (manual region boundaries and semantic labels). An evaluation metric, together with an optimization scheme based on the simplex algorithm or a genetic algorithm, is applied to find the best parameters for each segmentation algorithm. Other work we have done with B. Georis [150, 250] has shown, in a specific example of illumination change, how clustering techniques applied to intensity histograms can help in learning the different classes of illumination context for dynamic parameter setting. This research work can be generalized, for example, to learn parameter initialization criteria in the program supervision formalism. More generally there is a need for learning techniques for program supervision knowledge base refinement.

At the understanding level two main research topics are planned. A first topic is the learning of primitive event detectors. This can be done in the same way as in Nicolas Maillot's PhD [252, 153, 152], by learning visual concept detectors using SVMs (Support Vector Machines) with perceptual feature samples. This work was restricted to colour, texture and simple shapes. An open question is how far we can go in weakly supervised learning for each type of perceptual concept (i.e. lowering the human annotation task). A second topic is the learning of typical composite event models for frequent activities [233] using data mining techniques. We call a composite event a particular combination of several primitive events. This topic is very important for healthcare activity monitoring, where large amounts of data are available but little a priori knowledge is formalized. Coupling learning techniques with a priori knowledge techniques is very promising but still at a very preliminary stage for activity recognition and, in particular, for recognizing meaningful semantic activities.

Since Pulsar is interested in proposing new techniques for activity recognition systems, the results of the research performed in scene understanding (first research direction) will contribute to specifying the needs for new software architectures (second research direction).
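The parameter optimization loop described above can be pictured with the following minimal sketch: segmentation is reduced to a single intensity threshold, and the best value is chosen by maximizing the mean F-measure against user-provided ground truth. The actual work relies on simplex or genetic optimization over real segmentation algorithms; this grid search over made-up data is only an illustration of the principle.

    def f_measure(pred, truth):
        """Pixel-wise F-measure between a predicted and a ground-truth binary mask."""
        tp = sum(p and t for p, t in zip(pred, truth))
        fp = sum(p and not t for p, t in zip(pred, truth))
        fn = sum((not p) and t for p, t in zip(pred, truth))
        if tp == 0:
            return 0.0
        precision, recall = tp / (tp + fp), tp / (tp + fn)
        return 2 * precision * recall / (precision + recall)

    def tune_threshold(samples, candidates):
        """samples: list of (pixel_values, ground_truth_mask); segmentation is reduced
        here to thresholding. Return the candidate threshold maximizing the mean
        F-measure over the annotated samples."""
        def score(th):
            return sum(f_measure([x > th for x in img], gt)
                       for img, gt in samples) / len(samples)
        return max(candidates, key=score)

    samples = [([10, 200, 30, 220], [0, 1, 0, 1])]          # toy annotated image
    best = tune_threshold(samples, candidates=range(0, 256, 16))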
4.2 Software Architecture for Activity Recognition

Participants (4): Sabine Moisan, Annie Ressouche, Jean-Paul Rigault, François Brémond, Monique Thonnat.
(4) The main permanent research scientists working in this research direction are in bold font; other permanent research scientists working only partly in this research direction are in regular font.

In Pulsar, our first aim is to design a new platform dedicated to activity recognition that integrates and extends the functionalities of both Lama and VSIP; the architecture of this platform will follow the same software engineering approach that led to Lama. Second, we wish to contribute to setting up, in the long term, a complete software engineering methodology for the development of activity recognition systems. This methodology should be based on Model Engineering and formal techniques.


4.2.1 Software Engineering Aspects of Activity Recognition

Activity recognition systems are complex to design due to a number of characteristic properties. First, their algorithmic structure is complex: these systems encompass many abstraction levels, from pixel-based processing to high-level knowledge and data handling; each level, as well as their interactions, impacts the quality of the overall system. Second, their architecture is inherently distributed and often heterogeneous at almost all levels: many sensors of various kinds, local processing, distributed resources, distributed knowledge and data, etc. Third, these systems have to be efficient since they increasingly have to work in real time. Fourth, the target applications are demanding in terms of dependability, safety and privacy (e.g., transport, medical domains). Fifth and finally, for all the previous reasons, these systems have requirements that are difficult to express and validate.

A single characteristic property in the previous list makes software development an arduous task. Combining them constitutes a real challenge which calls for sophisticated Software Engineering (SE) techniques to set up a clear methodology and develop adequate supporting tools. It is not sufficient, however, to adopt well-proven traditional SE methods; one needs to invent new ones, possibly picking, adapting, and developing state-of-the-art research in this domain. Hence, our goal is not only to provide operational software tools, but also to draw up new methodologies to build these tools.

In particular, two bodies of SE techniques, both of which have drawn considerable interest in recent years, may be fruitfully applied to the design of activity recognition systems: genericity for organizing design and code, and model engineering for methodological guidance.

Developing generic entities has always been a major task of software engineering, since its creation in 1969. Indeed such entities can be highly reusable, sparing both development and testing/validation time. Consequently, they also promote extensibility, dependability, and maintainability. Subprograms, modules, components, objects, frameworks: all of these contribute to the domain. There is a danger, though: genericity implies raising the abstraction level, possibly at the expense of specific application requirements or demanded performance. But software engineering has always striven to produce techniques and tools that try to reconcile reusability with adaptability to a specific task. Nowadays, the key studies in genericity involve the use of patterns [53] and object-oriented or component-based frameworks [45], as well as the numerous techniques hidden behind the general term of generative programming [32] (metaprogramming, aspect-oriented programming, etc.) [1, 77].

A few years ago, the Object Management Group (OMG) launched an initiative to develop a so-called Model-Driven Architecture (MDA) [12, 87]. In the strictest sense, MDA is an attempt to favor a seamless development of "business" software using models (based on UML, another production of OMG) and automatic transformations of these models, while establishing a clear separation between the high-level application requirements and the middle- and low-level implementations, which are platform dependent. Current research broadens the scope of MDA, envisioning the design of systems by an automatic (or semi-automatic, but controlled and validated) sequence of model transformations. With this meaning, the approach is best designated as Model Engineering or Model-Driven Development (MDD).
Also, we do not underestimate the important work of the Software Architecture community [5] and think that both approaches are worth merging.

The Software Architecture research direction proposed by Pulsar, as a continuation of previous Orion work, concerns the study of generic systems for activity recognition as well as the definition of a methodology for their design.


Figure 6: The two Pulsar research axes consolidate each other to produce activity recognition (AR) platforms and systems.

We wish to ensure genericity, modularity, reusability, extensibility, dependability, and maintainability, while still ensuring adequate (real-time) performance. To tackle this challenge, we rely on state-of-the-art Software Engineering practice. In particular, we shall explore the new techniques mentioned above, those related to genericity on the architecture side as well as model engineering approaches on the methodological side.

It is almost impossible to achieve genericity from scratch, hence our approach is pragmatic: it is indeed fed by the problems that Pulsar has to face in image and video understanding and more generally in activity recognition. Now that we have reached some maturity in image and scene understanding, we can consider building more general solutions and tools. Therefore, we need to abstract from our experience with operational systems to provide more generic solutions and user-friendly tools. Once the corresponding generic entities are available, they can be applied to new problems. This new experience will in turn be abstracted, enriching generic models and the platform architecture. Therefore, the two Pulsar research axes (scene understanding and software architecture) consolidate each other in a "virtuous cycle": as shown in figure 6, an operationalization flow produces effective systems from the generic platform, whereas an abstraction flow enriches the platform with the lessons learned from concrete applications.

Another important characteristic of this research direction is that we strive to develop models with formally correct foundations. This makes it possible to build dependable systems relying on formal verification techniques, since traditional testing is not sufficient in these domains.

This approach proved feasible with the development of the Lama platform (see Appendix A) in the Orion project-team. This set of object-oriented component frameworks provides generic components (as C++ classes) for generating knowledge-based system generators. Hence, it is a meta-meta-platform (thus rather high on the genericity scale!). Its genericity has been put to the test by using Lama to develop systems requiring very different types of tasks: model calibration, image classification, or video understanding. Yet, its high genericity level is not at the expense of efficiency: for instance, in a video understanding application, the planning engine generated with Lama provides high versatility but accounts for less than 4% of the overall execution time.

4.2.2 Related Work

Models. Historically, the first Artificial Intelligence task that we addressed in Orion was Program Supervision. Program Supervision has two major aspects: setting up models to represent knowledge about the use of programs, and controlling their use through inference engines. Hence, it requires knowledge representation [210] as well as (task) planning techniques. Thus it draws on previous work in these two domains [102, 93].


However, most of these studies were generally motivated by specific application domain needs (for instance, image processing, signal processing, or scientific computing). By contrast, our approach relies on knowledge-based system techniques.

Similar work has been developed, especially in the knowledge acquisition community: their main objective is not to reuse code and programs (as we do) but knowledge itself [108, 46]. For instance we can cite Protege from Stanford University [56], which is oriented toward ontology editing, knowledge representation, and the reuse of abstract methods for problem solving. Program supervision suits the image processing area well: we can cite the ExTI system (IRIT, Toulouse) or the Hermes and Panthéon projects [104]. In other domains, the work of Chien and his collaborators at the Jet Propulsion Laboratory (NASA) [20] has an approach close to Lama, although the task and domain are different. They propose two generic component systems based on artificial intelligence techniques, ASPEN and CASPER. Close to program supervision, the AROM platform also relies on a generic approach [57].

Some issues in Semantic Web and Web Services research [8, 35] are also related to this topic as far as knowledge representation is concerned. On the other hand there exists, in the Semantic Web, a dynamic control aspect that could be a matter for program supervision techniques (see for instance the efforts towards OWL-S and Jena (5)).

Platform. There are several types of cooperation between artificial intelligence and software engineering. Much work deals with applying AI techniques for better software development. We subscribe to the opposite viewpoint: we rather use SE techniques to improve AI system design, a less common line of work.

Lama is composed of object-oriented frameworks (written in C++ and Java) [156]. Roughly speaking, a framework is a generic program (and associated libraries) bringing solutions to recurrent problems in a particular (but usually wide) domain [74]. Because of its generic nature, a framework has to be adapted to a specific application in its problem domain. For a long time now, frameworks have been considered as a powerful tool for designing and building complex software systems in a rather economical way. As of today, many domains have dedicated frameworks (e.g., graphic interfaces, concurrent systems, Web applications, compilation, etc.) whereas we focus on KBS software (i.e., inference engines or knowledge managers).

In artificial intelligence, many frameworks have been developed for, e.g., agents and learning. These approaches rely on component engineering, sharing the same ideas about reusability and genericity [98]. Yet, in the domain of knowledge-based systems, Lama appears rather original. Indeed one may find "expert system generators" dedicated to particular tasks, but Lama raises the genericity level one step higher, since it is a generator of knowledge-based system generators. In particular, the range of Lama extends beyond Program Supervision itself and can encompass other artificial intelligence tasks, such as model calibration in scientific computing [160] or classification [251, 252] (to name just the ones with which we had direct experience). Obviously, Lama must evolve to support distributed components. Some aspects of the previously mentioned Semantic Web clearly relate to this issue. We will study how the corresponding techniques can apply to our case.
Moreover, the multi-agent community has also been active in developing frameworks dedicated to the distributed aspect of agents (the "middleware"): for instance Aglets [78], Jade (6), the Oasis team in Paris [15], or ProActive developed in the Oasis project at INRIA Sophia Antipolis (7). We have already started to use agent techniques (Aglets) for distributing Program Supervision systems [151].

(5) http://www.daml.org/services/owl-s and http://jena.sourceforge.net
(6) http://jade.tilab.com/
(7) http://www-sop.inria.fr/oasis/ProActive/
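To make the notion of program supervision knowledge more concrete (declarative models of programs plus an engine that selects and chains them), here is a toy sketch: each operator declares the data it requires and produces, and a greedy engine chains operators until a requested result is available. The operator names are invented and the sketch ignores parameter setting, evaluation and repair, which a real supervision engine such as the ones built with Lama must handle.

    # Toy knowledge base about video processing operators (illustration only):
    # each entry declares what the program needs and what it produces.
    OPERATORS = [
        {"name": "acquire",  "requires": set(),       "produces": {"frame"}},
        {"name": "segment",  "requires": {"frame"},   "produces": {"regions"}},
        {"name": "classify", "requires": {"regions"}, "produces": {"objects"}},
        {"name": "track",    "requires": {"objects"}, "produces": {"tracks"}},
    ]

    def plan(goal, available=frozenset()):
        """Greedy forward chaining: repeatedly apply any operator whose inputs are
        available until the goal data is produced."""
        state, chain = set(available), []
        while goal not in state:
            applicable = [op for op in OPERATORS
                          if op["requires"] <= state and not op["produces"] <= state]
            if not applicable:
                raise ValueError("goal %r cannot be reached" % goal)
            op = applicable[0]
            chain.append(op["name"])
            state |= op["produces"]
        return chain

    print(plan("tracks"))   # ['acquire', 'segment', 'classify', 'track']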


Safeness. In this matter, our goal is not to develop original research as INRIA projects such as Mimosa or Aoste do. Rather, we intend to rely on well-known formal techniques (synchronous models [11, 61], model-checking [22, 75]) in the line of our previous work with the Blocks framework [157, 213].

4.2.3 Proposed Approach

As indicated before (see 4.2.1), the research direction we propose here consists in studying generic systems for activity recognition and in elaborating a methodology for their design. We wish to ensure genericity, modularity, reusability, extensibility, dependability, and maintainability.

We plan to work on the following three research axes: models (adapted to the activity recognition domain), platform architecture (to cope with deployment constraints such as real time or distribution), and system safeness (to generate dependable systems). For all these tasks we shall follow state-of-the-art Software Engineering practices and, if needed, we shall attempt to set up new ones.

Models for Activity Recognition. This evolution toward activity recognition requires theoretical studies combined with validation based on the concrete experiments conducted together with the scene understanding research direction. Thus the evolution will certainly not be instantaneous but rather incremental over a long period. Altogether it is a long-term goal, although some of its aspects can be addressed in the near future.

First, the incorporation of a model of time [107, 4], both physical and logical, is needed to deal with temporal activity recognition, especially scenario recognition. We first intend to consider models of logical time based on the Synchronous Paradigm. Indeed this paradigm is well suited to deterministic systems and has many nice properties, such as being amenable to formal verification.

Second, handling uncertainty is a major theme of Pulsar and we want to introduce it into the platform (8). This might well shake up the synchronous paradigm and requires deep theoretical studies; thus it is a long-term goal.
(8) Lama already provides a fuzziness module, but probabilistic approaches are also needed.

Third, the incorporation of an abstract model of scenarios is not only mandatory to describe and recognize activities. It also makes it possible to use scenarios as a means to define new complex operators in Program Supervision. Indeed scenarios could constitute both a specification and a sort of "scripting" mechanism (this second point appears rather speculative and is clearly a long-term goal) [164, 66].

Platform Architecture for Activity Recognition. Figure 7 details the intended architecture of the AR platform.

Figure 7: Architecture of the Pulsar activity recognition platform.


It consists of three levels (a minimal structural sketch is given after the list):

• The Component Level, the lowest one, offers software components providing elementary operations and data for Perception, Understanding, and Learning.
  - Perception components contain algorithms for sensor management, image and signal analysis, image and video processing (segmentation, tracking, ...), etc.
  - Understanding components provide the building blocks for Knowledge-based Systems: knowledge representation and management, elements for controlling inference engine strategies, etc.
  - Learning components implement different learning strategies, such as Support Vector Machines (SVM), Case-based Learning (CBL), clustering, etc.
  An Activity Recognition system is likely to pick components from these three packages. Hence, tools must be provided to select, assemble, simulate and verify the resulting component combination. Other support tools help to generate task- or application-dedicated languages or graphic interfaces.

• The Task Level, the middle one, contains executable realizations of individual tasks that will collaborate for a particular final application. Of course, the code of these tasks is built on top of the components from the previous level. We have already identified several of these important tasks: Program Supervision to control data processing, Object Recognition and Tracking to extract interesting features from images, Scenario Recognition to identify interesting activities. In the future, other tasks will probably enrich this level.
  For these tasks to collaborate nicely, communication and interaction facilities are needed, providing a sort of "software bus" to exchange data and control.

• The Application Level integrates several of these tasks to build a system for a particular type of application, e.g., vandalism detection, patient monitoring, aircraft loading/unloading follow-up, etc. Each system is parameterized to adapt to the local context of its instances (number, type, location of sensors, scene geometry, visual parameters, number of objects of interest...). Thus configuration and deployment facilities are required.
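The sketch announced above gives a minimal structural picture of this three-level organization: elementary components, tasks built on top of them, and an application assembling tasks for one deployment. The class names are hypothetical placeholders, not the future platform's API.

    # Component level: elementary perception / understanding operations (placeholders).
    class MotionDetector:
        def detect(self, frame): return []

    class ScenarioEngine:
        def recognize(self, tracks, models): return []

    # Task level: executable tasks built on top of components.
    class ObjectTrackingTask:
        def __init__(self, detector): self.detector = detector
        def run(self, frame): return self.detector.detect(frame)

    class ScenarioRecognitionTask:
        def __init__(self, engine, models): self.engine, self.models = engine, models
        def run(self, tracks): return self.engine.recognize(tracks, self.models)

    # Application level: a system for one type of application, parameterized by its
    # local context (sensors, scene geometry, activity models of interest).
    class PatientMonitoringApplication:
        def __init__(self, models):
            self.tracking = ObjectTrackingTask(MotionDetector())
            self.recognition = ScenarioRecognitionTask(ScenarioEngine(), models)
        def process(self, frame):
            tracks = self.tracking.run(frame)
            return self.recognition.run(tracks)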


widest possible generi ity and the maximum e�e tive reusability, in parti ular at the odelevel.An intrinsi feature of most a tivity re ognition appli ations is to rely on distributedhardware/software and to work in real time. Therefore, the new platform should supportdistributed ar hite ture and ensure real time performan e. Another issue is that the endusers are generally not omputer experts; thus user friendliness and graphi al supports areimportant. That is why the platform will omply with the following design orientations:• Con urrent implementation (multi-threading) is known to favor both real time re-sponse and distribution. First it allows to take advantage of parallel ar hite tures(multipro essor, Grid omputing...), hen e improving real time performan e. Se ond,by enfor ing modularity, it makes distribution easier. In parti ular, we have alreadydesigned distributed knowledge-based systems that demonstrated the advantages of on urren y.• Real time is not just a matter of performan e. The new platform should also omplywith real time modeling. We shall rely on our experien e in this domain [163, 165℄.Moreover, real time also impa ts the s ene understanding algorithms as well as thereasoning me hanisms. In parti ular, for Program Supervision, the ausality of timerequires revisiting the repair strategy, sin e real time hardly allows ba ktra king.• Supporting distributed systems also requires a model of distribution. We hie�y onsider multi-agent systems sin e they appear to be well-suited to arti� ial intelli-gen e tasks [119, 120, 47℄. We shall also work on the possibility of embedding �light�knowledge-based systems (together with asso iated knowledge) into mobile agents.• Last, but not least, the new platform should be easier to use than the urrent Lamaand VSIP. We should thus de�ne and implement tools to support modeling, de-sign and veri� ation. Another important issue deals with graphi al user interfa es.It should be possible to plug existing (domain or appli ation dependent) graphi alinterfa es into the platform. This requires de�ning an extra generi layer to a om-modate various sorts of interfa es. This is learly a medium/long term goal, in itsfull generality at least.Con erning implementation issues, we use when possible existing standard tools su has bison and �ex for parser generators, NuSMV for model- he king, Protege for ontologymanagement, E lipse for graphi interfa es or model engineering support, et . Note that,in �gure 7, some of the boxes an be naturally adapted from Lama and VSIP existingelements (many per eption and understanding omponents, program supervision, s enariore ognition...) whereas others are to be developed, ompletely or partially (learning om-ponents, most support and on�guration tools).Safeness of Systems for A tivity Re ognition In most a tivity re ognition systems,safeness is a ru ial issue. It is a very general notion dealing with person and goodsprote tion, respe t of priva y, or even legal onstraints. However, when designing softwaresystems it will end up with software se urity. Here we only onsider software se urityissues. Our previous work on veri� ation and validation has to be pursued; in parti ular,we need to he k its s alability and to develop asso iated tools.Generation of dependable systems is an important hallenge. System validation is a ru ial phase in any development y le. Testing, although required in the �rst phase ofvalidation, appears to be too weak for the system to be ompletely trusted. 
Safeness of Systems for Activity Recognition
In most activity recognition systems, safeness is a crucial issue. It is a very general notion dealing with the protection of people and goods, respect of privacy, or even legal constraints. When designing software systems, however, it ultimately comes down to software security, and we only consider software security issues here. Our previous work on verification and validation has to be pursued; in particular, we need to check its scalability and to develop the associated tools.

Generation of dependable systems is an important challenge. System validation is a crucial phase in any development cycle. Testing, although required as a first phase of validation, is too weak for the system to be completely trusted. An exhaustive approach to validation using formal methods is clearly needed.


Model-checking is an appealing technique since it can be automated [84, 69, 100, 67, 21] and helps produce code that has been formally proved.

We base verification on the component structure of our frameworks. This makes it possible to follow a compositional approach [213], a well-known technique to cope with scalability problems in model-checking.

In the long term we wish to take time and uncertainty into account. To formally describe probabilistic timed systems, the most "popular" approach proposes probabilistic extensions of timed automata (such as real-time probabilistic processes, probabilistic timed automata and continuous probabilistic timed automata). The semantics of these formalisms relies on infinite-state structures, but equivalence relations can be applied to obtain finite-state systems. These systems can then be used for model-checking [111], provided that suitable dedicated logics are defined (such as branching-time temporal logic [129]).

Nevertheless, software dependability cannot be proved by relying on a single technique. Indeed, model-checking works on a model of the system and makes it possible to automate the verification of decidable properties. By contrast, some undecidable properties require working directly on the code, for instance by applying non-exhaustive methods such as abstract interpretation [25]. Verification must therefore combine several complementary techniques. This is indeed a major issue and a rather challenging one. Altogether, this is a long term goal, with an incremental evolution.

5 Objectives for the next 4 years

The general objectives presented in the previous sections are long term goals. Depending on the human resources and funding, we will focus our activity on the following priorities:

5.1 Scene Understanding for Activity Recognition

1. Perception:
• Gesture recognition: currently the VSIP library is limited to coarse motion detection and global 3D features. We plan to refine people description based on recent work achieved with B. Boulay [144] on real-time posture detection. Moreover, a PhD student (M. B. Kaâniche) is currently working on gesture recognition. These kinds of detailed 3D features are required, for instance, for patient monitoring.
• Multi-sensor perception: we plan to extend our 4D semantic approach to heterogeneous sensor data interpretation. A PhD thesis (N. Zouba) is starting on this topic in the context of healthcare monitoring.

2. Understanding:
• Ontology-based activity recognition: we need to extend our current visual concept ontology [152, 153, 252] (currently limited to color, texture and geometry) with temporal concepts (motion, trajectories, events, ...) and other perceptual concepts (such as audio events or physiological sensor events). A short-term goal will be to add audio concepts, in the context of the European SERKET project for security and of the European CARETAKER project for activity monitoring.
• Uncertainty management: concerning the modeling of activities, we plan to extend our formal activity model [238] to take into account the uncertainty of the detected physical objects. An example of use of the current formalism is shown in figure 2.


A major issue is to maintain the real-time performance of the recognition process.

3. Learning:
• Primitive event detectors: an important research topic is the learning of primitive event detectors. This can be done by learning visual concept detectors from samples of perceptual features. An open question is, for each type of perceptual concept, how far we can go with weakly supervised learning. Two PhD theses (M. Zúniga and L. Le Thi [227]) are starting on this topic.

5.2 Software Architecture for Activity Recognition

1. Models and methods:
• Model of time and scenarios: a model of time, both physical and logical, is needed to deal with temporal activity recognition. We shall propose a PhD to explore this topic. Moreover, an extension with an abstract model of scenarios is unavoidable for activity recognition and might also constitute a PhD work (an illustrative sketch of such a scenario model is given below, after the platform items).
• Real time support: we intend to make software generated with the Pulsar platform able to work in real time. In particular, for Program Supervision [150, 250], this requires revisiting the repair strategy, since real time hardly allows backtracking. A PhD is planned to start in 2008.
• Methodology: we shall start to explore the applicability of model-driven development (MDD) to activity recognition systems. Of course, setting up a complete methodology is a long term objective and within a period of 4 years we only expect feasibility results.

2. Platform:
• We intend to propose in the next 4 years a first version of the Pulsar platform. This platform will integrate most of the perception and understanding components belonging to VSIP and LAMA. Moreover, we plan to develop new support tools, mainly for component assembly.
• Concurrent implementation (multi-threading) is underway. It will ensure better modularity and evolvability as well as faster real-time response. It will also make it easier to distribute knowledge-based systems.
• Distributed architecture: a PhD (N. Khayati) is already underway on this topic; it uses multi-agent technology. The objectives are to provide broader access and collaboration facilities, and to take advantage of scattered computing resources or data. Distribution also demands incorporating mechanisms to enforce privacy and security within the systems generated with the platform. Beyond the well-known system issues, we have to cope with semantic and knowledge representations. Hence we will propose ontologies for describing, deploying, and using distributed knowledge-based systems.
• User interface and tools: to support the methodological aspect, tools and user-friendly interfaces are necessary. This is indeed a time-consuming activity and development in these domains will depend on the available manpower.
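To illustrate the kind of abstract scenario model mentioned in the "Model of time and scenarios" item above, the following sketch encodes a composite scenario as named sub-events linked by simple "before" temporal constraints (in the spirit of Allen's interval relations) and checks a set of recognized, timestamped events against it. All types and names are hypothetical simplifications chosen for the example; this is neither the formalism of [238] nor the future platform API.

// Illustrative sketch only: a composite scenario described as sub-events with
// pairwise "before" temporal constraints, checked against recognized events.
// The formalism and all names are simplified assumptions.
#include <iostream>
#include <map>
#include <string>
#include <vector>

struct Interval { double start, end; };                        // time interval of a recognized event

struct BeforeConstraint { std::string first, second; };        // "first" must end before "second" starts

struct Scenario {
    std::string name;
    std::vector<std::string> subEvents;                        // required sub-events
    std::vector<BeforeConstraint> constraints;                 // temporal constraints between them
};

// Returns true if every required sub-event was recognized and all
// "before" constraints hold on their time intervals.
bool recognized(const Scenario& s, const std::map<std::string, Interval>& events) {
    for (const auto& e : s.subEvents)
        if (events.find(e) == events.end()) return false;      // missing sub-event
    for (const auto& c : s.constraints) {
        const Interval& a = events.at(c.first);
        const Interval& b = events.at(c.second);
        if (!(a.end < b.start)) return false;                   // Allen "before" relation violated
    }
    return true;
}

int main() {
    // Toy scenario: a visitor enters the safe area after the staff opens it.
    Scenario scenario{"access_to_safe",
                      {"staff_opens_safe", "visitor_enters_safe"},
                      {{"staff_opens_safe", "visitor_enters_safe"}}};

    std::map<std::string, Interval> detected{
        {"staff_opens_safe",    {10.0, 12.0}},
        {"visitor_enters_safe", {15.0, 18.0}}};

    std::cout << scenario.name
              << (recognized(scenario, detected) ? " recognized\n" : " not recognized\n");
}

A real scenario model would of course need the full set of temporal relations, uncertainty on the detected events, and on-line (real-time) recognition rather than an a posteriori check; the sketch only fixes the basic vocabulary (sub-events, constraints, recognition test).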


Short term (4 years) concrete goals in terms of activity recognition systems are:

• to design manually a real-time recognition system for complex activities involving interactions between a person and his/her environment, based on postures, with fixed large field-of-view cameras.
• to automate the set-up (parametrization) of an activity recognition system (such as an intrusion detection system) for a new site.

Long term concrete goals in terms of activity recognition systems are:

• to generate semi-automatically new activity recognition systems using a tool box and high-level specifications (in terms of the activities to recognize and of constraints).
• to quantify the performance of a new activity recognition system based on benchmarks.

6 Evaluation

We are aware that Pulsar research directions are very ambitious. Our concrete goal in terms of academic impact is to publish both in computer vision (in particular the CVIU and IJCV international journals) and in artificial intelligence (IJCAI conference, AI Journal). We also consider conferences and journals establishing links between Software Engineering and Artificial Intelligence (such as SEKE or Data and Knowledge Engineering) as well as those more specifically devoted to software engineering (ICSE), with a particular emphasis on those promoting model-driven approaches (MoDELS conference, SoSyM journal). We also intend to be actors in the major conferences in vision and cognitive systems (such as ICVS) and in visual surveillance (such as AVSS).

Activity recognition systems are difficult to evaluate, since their performance depends directly on the test videos and on the system configuration phase. A robust algorithm tested in new environmental conditions or with inadequate parameters can yield poor performance. Thus, we will continue the work performed in the evaluation program ETISEO [217] (http://www.etiseo.net) to design novel evaluation metrics taking into account the dependencies on video characteristics (e.g. contrasted shadows, illumination changes) and on configuration phases (e.g. parameter tuning, object model learning). We will also continue to participate, through industrial cooperations, in performance evaluation campaigns made with data taken from existing operational video cameras on real sites (see annex A for some recent experimentations).
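As a simplified illustration of event-level evaluation, the following sketch matches detected events against ground-truth events using temporal overlap and derives precision and recall. The overlap criterion, threshold and data are arbitrary assumptions chosen for the example; they are not the ETISEO metric definitions.

// Illustrative sketch only: event-level precision/recall for an activity
// recognition output, matching detections to ground truth by temporal overlap.
// The matching rule and threshold are assumptions, not ETISEO's rules.
#include <algorithm>
#include <iostream>
#include <string>
#include <vector>

struct Event { std::string label; double start, end; };

// A detection matches a ground-truth event if labels agree and the intervals
// overlap by at least minOverlap seconds.
bool matches(const Event& a, const Event& b, double minOverlap) {
    double overlap = std::min(a.end, b.end) - std::max(a.start, b.start);
    return a.label == b.label && overlap >= minOverlap;
}

int main() {
    std::vector<Event> groundTruth{{"falling", 10, 14}, {"entering_zone", 30, 32}};
    std::vector<Event> detected   {{"falling", 11, 15}, {"entering_zone", 50, 52}};
    const double minOverlap = 1.0;

    int truePositives = 0;
    std::vector<bool> used(groundTruth.size(), false);          // each ground-truth event matched at most once
    for (const auto& d : detected)
        for (std::size_t i = 0; i < groundTruth.size(); ++i)
            if (!used[i] && matches(d, groundTruth[i], minOverlap)) {
                used[i] = true;
                ++truePositives;
                break;
            }

    double precision = detected.empty()    ? 0.0 : double(truePositives) / detected.size();
    double recall    = groundTruth.empty() ? 0.0 : double(truePositives) / groundTruth.size();
    std::cout << "precision = " << precision << ", recall = " << recall << "\n";
}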


7 Positioning and Collaborations

7.1 Positioning within INRIA

PULSAR has an original multidisciplinary approach mixing techniques from the domains of perception, cognitive systems and software engineering for activity recognition.

Perception through video streams
Several INRIA teams are working on video analysis for human activity description and are thus close to PULSAR objectives. Currently we do not have effective collaborations with these teams, but we intend to develop new collaborations on certain topics.

• PRIMA The PRIMA research team, which is also in CogC, is very close to PULSAR as it shares with PULSAR a cognitive system approach. They mainly work on 2D appearance-based approaches whereas Pulsar will design 4D spatio-temporal video understanding approaches.
• VISTA The VISTA research team has a strong background in 2D motion analysis. More recently they have been interested in motion event recognition (cf. I. Laptev's work on repetitive motion detectors). Pulsar's work is complementary to these approaches since we are using 3D models and human expertise. A new cooperation with VISTA on learning new motion event detectors, for instance between I. Laptev and G. Charpiat, would certainly be fruitful.
• LEAR The LEAR research team is working on object recognition for static images and video sequences. They focus on learning low-level visual features. In particular they use shape and affine interest point information to describe objects in images. This numerical content is learned to build visual models of object classes. A new cooperation with LEAR is certainly interesting for learning new shape and texture descriptors.
• MOVI/Perception The MOVI research team is currently creating a new project named Perception. Their background is oriented towards 3D geometric models. Depending on their new orientations, we could have fruitful cooperations.

Perception through other sensors
For audio sensor understanding, new cooperations with the projects of CogC are possible and interesting, in particular with PAROLE, METISS and/or CORDIAL. A new contact is starting with DEMAR (Bio) for the study of wearable physiological sensors for healthcare monitoring.

Cognitive Systems
We share common interests in knowledge representation, knowledge acquisition, ontologies, machine learning and reasoning with several INRIA project-teams, mainly in CogA. Among them we can cite the following ones:

• DREAM This project works on diagnosis and surveillance systems using artificial intelligence techniques. We will continue our informal cooperations on temporal activity modeling and reasoning.
• MAIA This project is focused on autonomous real-time robotic systems. We share common objectives on real-time artificial intelligence techniques for signal understanding and on healthcare applications.
• ACACIA/EDELWEISS This project develops knowledge management and ontology techniques, mainly for Semantic Web applications. We have had informal cooperations with ACACIA on ontology languages, which are worth continuing.
• HELIX This is a project with which we have already collaborated [91]; it works on components, platforms, and problem solving environments, all concerns which are close to ours.
• AXIS This multi-disciplinary project-team (Artificial Intelligence, Data Analysis and Software Engineering) studies the area of Information Systems, particularly Web sites. They develop data mining techniques for modeling web user activity profiles. Informal cooperations with AXIS are interesting on data mining techniques for modeling long-term people activities observed by sensors, in particular for healthcare applications.


• CORTEX This project studies neuronal connectionist models developed through two sources of inspiration, namely computational neurosciences and machine learning. Their goal is to build intelligent systems, able to extract knowledge from data and to manipulate that knowledge to solve problems. A cooperation on machine learning techniques, and more precisely connectionist ones, would be useful in the future for learning event detectors.

Software engineering
Several INRIA project-teams within ComA or ComC deal with component-based programming. We share the synchronous foundations with AOSTE (a project with which a long collaboration has existed) and ESPRESSO, and the concern about formal proof for component assembly with TRISKELL and the two former projects. The techniques and tools developed within OBASCO and JACQUARD could also be interesting for us in the future.
In the domain of distributed and multi-agent systems, the models and tools that OASIS is developing are certainly interesting for distributing knowledge-based systems.

7.2 Academic collaborations and positioning

7.2.1 National

We share common interests with several research teams in France, outside INRIA.

Video Analysis
• We cooperate with the Mathematical Morphology laboratory of Ecole des Mines de Paris at Fontainebleau (Michel Bilodeau), together with STMicroelectronics, in a project on intelligent video cameras. In particular, we combine their robust image segmentation algorithms with our 3D human posture recognition algorithm [177, 144].
• We regularly cooperate with the research team directed by Patrick Sayd at CEA in Saclay in the context of videosurveillance [237]. Currently, we work together in the context of the SIC project. They focus on 2D video analysis for people tracking over networks of video cameras while we are in charge of high-level scenario understanding.

Software engineering
• We plan to continue our cooperation with the Rainbow team (I3S-UNSA-CNRS) at Sophia Antipolis on the component-based approach and on healthcare monitoring, which has produced joint papers [189, 148, 221, 222].
• We already cooperate with the Mosarts team (I3S-UNSA-CNRS) and with the CMA team (Ecole des Mines) at Sophia Antipolis to begin the development of a special purpose language dedicated to activity recognition applications.

Healthcare and Life Sciences
We have initiated a series of new cooperations in the domain of healthcare. These cooperations are a first step which needs to be enriched and supported.
• We have started a collaboration with CSTB (Centre Scientifique et Technique du Bâtiment) and the Nice City Hospital (Groupe de Recherche sur la Trophicité et le Vieillissement) in the framework of the GER'HOME project, funded by the PACA region.


The GER'HOME project is devoted to experimenting with and developing techniques that allow long-term monitoring of people at home. In this project an experimental home has been built in Sophia Antipolis, near the Pulsar premises [244].
• We have begun a new collaboration with the INSERM neurophysiology group headed by Prof. Chauvel at La Timone hospital in Marseille for epileptic patient observation with video cameras and EEG signals.

We have a long-standing collaboration with INRA at Sophia Antipolis, in the domain of life sciences, for the early detection of insects in crops and for the theoretical study of their behavior. This biological domain raises interesting issues in image and video interpretation [199].

7.2.2 International

We have a precise view of the main groups working in our domains of interest through international networks gathering our main competitors and collaborators.

Europe
We are in competition with several European research teams on activity recognition (e.g. Prof. A. Cohn and Prof. D. Hogg from the University of Leeds, UK, and Prof. Neumann from the University of Hamburg) and on videosurveillance (e.g. Prof. R. Cucchiara from the University of Modena, Italy, and Prof. G. Jones from Kingston University, UK). We are also involved with some of these European research teams (e.g. University of Leeds, University of Hamburg) in a new network of excellence, EuCognition (see http://www.eucognition.org), for the Advancement of Artificial Cognitive Systems. This network is a follow-up and an extension of the Cognitive Vision network of excellence ECVision (for more details see http://www.ECVision.info/home/Home.htm). We also cooperate with several European research teams through European projects (see the next section on industrial collaborations and funding).

US
We are competing with US academic research teams (e.g., Prof. Nevatia from the University of Southern California, Prof. Bobick from Georgia Tech, and Prof. Larry Davis from the University of Maryland) on the topic of video event recognition. We also cooperate with these teams for defining a standard video event ontology (e.g., Prof. Nevatia from the University of Southern California, Prof. R. Chellappa from the University of Maryland) and for comparing video understanding algorithm performances (e.g. Prof. M. Shah from the University of Florida and Prof. Larry Davis from the University of Maryland).

Asia
Our team is a member of the STIC-Asie Inter-media Semantic Extraction and Reasoning (ISERE) action. The ISERE action gathers six research centers from Asia (I2R A-STAR and NUS for Singapore, MICA for Vietnam, NII Tokyo for Japan, and NCKU and NTU for Taiwan) and three French teams. It concerns both the development of research on semantic analysis, reasoning and multimedia data, and the application of these results in the domains of e-learning, automatic surveillance and medical issues.

Africa
Pulsar will continue to cooperate with ENSI Tunis (Tunisia) in the framework of PAI cooperations; this work deals with software reuse for distributed applications, more particularly in medicine; we co-direct several PhD and Master theses.



7.2.3 General

Our team has launched ETISEO (http://www.etiseo.net), an international academic and industrial competition for the evaluation of videosurveillance techniques, currently gathering 25 international teams.

7.3 Industrial Collaborations and Funding

Pulsar will rely on a strong network of industrial partners in the domain of videosurveillance (see below). Several other industrial partners are interested in healthcare monitoring, such as France Telecom, Accenture and STMicroelectronics.

National Initiatives
• SYSTEM@TIC SIC project: Pulsar is strongly involved in the new "pôle de compétitivité" SYSTEM@TIC, which is a strategic initiative in security. More precisely, a new project (SIC) has been accepted for funding for 42 months on the theme of perimeter security. The industrial partners include Thales, EADS, BULL, SAGEM, Bertin, and Trusted Logic.

European Projects
• SERKET is a European ITEA project in collaboration with Thales R&T FR, Thales Security Systems, CEA, EADS and Bull (France); Atos Origin, INDRA and Universidad de Murcia (Spain); XT-I, Capvidia, Multitel ABSL, FPMs, ACIC, BARCO, VUB-STRO and VUB-ETRO (Belgium). It began at the end of November 2005 and will last 2 years. The main objective of this project is to develop techniques to analyze crowd behaviors and to help in the prevention of terrorist acts.
• CARETAKER is a new STREP European project that began in March 2006. Its duration is planned for thirty months. The main objective of this project is to discover information in multimedia data. The prime partner is Thales Communications (France) and the other partners are: Multitel (Belgium), Kingston University (UK), IDIAP (Switzerland), the Roma ATAC Transport Agency (Italy), SOLID, a software editor for multimedia databases (Finland), and Brno University of Technology (Czech Republic). Our team is in charge of modeling, recognizing and learning scenarios for frequent or unexpected human activities using both video and audio events.

Industrial Cooperations
• Bull: we have had a research collaboration with Bull since 1998 through a sequence of contracts. Within three successive industrial contracts (TELESCOPE 1, 2 and 3) from 1998 to March 2006, we developed together a toolkit in the domain of cognitive video interpretation for videosurveillance applications. We are currently continuing this strategic cooperation through 2 projects in safety/security: the European project SERKET and the SIC project within the SYSTEM@TIC security "pôle de compétitivité".
• RATP: we have had, since 2001, a close cooperation with RATP on multi-sensor access control systems. A PhD student is currently working on learning techniques for passenger classification.


• STMicroelectronics: Orion also cooperates with STMicroelectronics and Ecole des Mines de Paris at Fontainebleau on the design of intelligent cameras including image analysis and interpretation capabilities. In particular, one post-doc and one PhD student are currently working on new algorithms for real-time 3D human posture recognition for video cameras.
• Keeneo: the start-up Keeneo was created in July 2005 for the industrialization and exploitation of Orion results in videosurveillance (the VSIP library). Pulsar will maintain a close cooperation with Keeneo for impact analysis of VSIP and for the exploitation of new results.

8 References

[1] D. Abrahams and A. Gurtovoy. Template Metaprogramming: Concepts, Tools, and Techniques from Boost and Beyond. Addison Wesley, 2005.
[2] A. Agarwal and B. Triggs. A local basis representation for estimating human pose from cluttered images. In Asian Conference on Computer Vision, January 2006.
[3] J. Aggarwal and Q. Cai. Human motion analysis: a review. In Proceedings of the IEEE Workshop on Motion of Non-Rigid and Articulated Objects, pages 90-102, Puerto Rico, USA, June 1997.
[4] J. F. Allen. Time and time again: The many ways to represent time. International Journal of Intelligent Systems, 6(4):341-356, July 1991.
[5] R. Allen and D. Garlan. A formal basis for architectural connection. ACM Transactions on Software Engineering and Methodology, 6(3):213-249, 1997.
[6] E. Andrade, S. Blunsden, and R. Fisher. Performance analysis of event detection models in crowded scenes. In Workshop on Towards Robust Visual Surveillance Techniques and Systems at Visual Information Engineering 2006, pages 427-432, Bangalore, India, September 2006.
[7] E. L. Andrade, S. Blunsden, and R. B. Fisher. Hidden markov models for optical flow analysis in crowds. In International Conference on Pattern Recognition, Hong Kong, August 2006.
[8] G. Antoniou and F. van Harmelen. A Semantic Web Primer. MIT Press, 2004.
[9] D. Ayers and M. Shah. Monitoring human behavior from video taken in an office environment. Image and Vision Computing, 19:833-846, October 2001.
[10] J. Berclaz, F. Fleuret, and P. Fua. Robust people tracking with global trajectory optimization. In Conference on Computer Vision and Pattern Recognition, 2006.
[11] G. Berry. The foundations of Esterel. In G. Plotkin, C. Stirling, and M. Tofte, editors, Proof, Language, and Interaction, Essays in Honor of Robin Milner. MIT Press, 2000.
[12] Sami Beydeda and Matthias Book, editors. Model-Driven Software Development. Springer, 2005.


[13] M. Borg, D. J. Thirde, J. M. Ferryman, K. D. Baker, J. Aguilera, and M. Kampel. Evaluation of object tracking for aircraft activity surveillance. In The Second Joint IEEE International Workshop on Visual Surveillance and Performance Evaluation of Tracking and Surveillance (VS-PETS 2005), Beijing, China, October 2005.
[14] B. Boulay, F. Bremond, and M. Thonnat. Applying 3d human model in a posture recognition system. Pattern Recognition Letters, Special Issue on Vision for Crime Detection and Prevention, 27(15):1788-1796, 2006.
[15] J.-P. Briot. Actalk: A framework for object-oriented concurrent programming - design and experience. In Jean-Paul Bahsoun, Takanobu Baba, Jean-Pierre Briot, and Akinori Yonezawa, editors, Object-Based Parallel and Distributed Computing II - Proceedings of the 2nd OBPDC'97 France-Japan Workshop. Hermès Science Publications, Paris, 1999.
[16] H. Buxton. Learning and understanding dynamic scene activity. In ECCV Generative Model Based Vision Workshop, Copenhagen, 2002.
[17] A. Cavallaro. Event detection in underground stations using multiple heterogeneous surveillance cameras. In International Symposium on Visual Computing, Intelligent Vehicles and Autonomous Navigation, Lake Tahoe, Nevada, December 2005.
[18] A. Cavallaro, O. Steiger, and T. Ebrahimi. Tracking video objects in cluttered background. IEEE Transactions on Circuits and Systems for Video Technology, 15(4):575-584, April 2005.
[19] R. Chellappa, A. R. Chowdhury, and S. Zhou. Recognition of Humans and Their Activities Using Video. Morgan Claypool, 2005.
[20] S. Chien, R. Knight, A. Stechert, R. Sherwood, and G. Rabideau. Using iterative repair to improve the responsiveness of planning and scheduling. In Proc. 5th International Conference on Artificial Intelligence Planning and Scheduling (AIPS2000), Breckenridge, CO, April 2000.
[21] A. Cimatti, E. Clarke, E. Giunchiglia, F. Giunchiglia, M. Pistore, M. Roveri, R. Sebastiani, and A. Tacchella. NuSMV 2: an OpenSource Tool for Symbolic Model Checking. In Ed Brinksma and Kim Guldstrand Larsen, editors, Proceedings CAV, number 2404 in LNCS, pages 359-364, Copenhagen, Denmark, July 2002. Springer-Verlag.
[22] E. M. Clarke, O. Grumberg, and D. Long. Verification tools for finite-state concurrent systems. LNCS: A Decade of Concurrency, Proc. REX School/Symposium, 803:124-175, 1994.
[23] A. G. Cohn, D. C. Hogg, B. Bennett, V. Devin, A. Galata, D. R. Magee, C. Needham, and P. Santos. Cognitive vision: Integrating symbolic qualitative representations with computer vision. Cognitive Vision Systems: Sampling the Spectrum of Approaches, Lecture Notes in Computer Science, pages 221-246, 2006.
[24] C. Coué, C. Pradalier, C. Laugier, T. Fraichard, and P. Bessière. Bayesian occupancy filtering for multitarget tracking: an automotive application. Journal of Robotics Research, 25(1):19-30, January 2006.


[25] P. Cousot and R. Cousot. On Abstraction in Software Verification. In Ed Brinksma and Kim Guldstrand Larsen, editors, Proceedings CAV, number 2404 in LNCS, pages 37-56, Copenhagen, Denmark, July 2002. Springer-Verlag.
[26] J. L. Crowley. Cognitive Vision Systems, chapter Things that See: Context-Aware Multi-modal Interaction. Springer Verlag, March 2006.
[27] J. L. Crowley. Situation models for observing human activity. ACM Queue Magazine, May 2006.
[28] R. Cucchiara, C. Grana, G. Neri, M. Piccardi, and A. Prati. The Sakbot system for moving object detection and tracking. Video-based Surveillance Systems: Computer Vision and Distributed Processing (Part II - Detection and Tracking), pages 145-158, 2001.
[29] R. Cucchiara, C. Grana, A. Prati, and R. Vezzani. Probabilistic posture classification for human behaviour analysis. IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans, 35(1):42-54, 2005.
[30] R. Cucchiara, A. Prati, and R. Vezzani. Domotics for disability: smart surveillance and smart video server. In 8th Conference of the Italian Association of Artificial Intelligence - Workshop on Ambient Intelligence, pages 46-57, Pisa, Italy, September 2003.
[31] R. Cucchiara, A. Prati, R. Vezzani, L. Benini, E. Farella, and P. Zappi. An integrated multi-modal sensor network for video surveillance. Journal of Ubiquitous Computing and Intelligence (JUCI), 2006.
[32] K. Czarnecki and U. W. Eisenecker. Generative Programming: Methods, Tools, and Applications. Addison Wesley, 2000.
[33] N. Dalal, B. Triggs, and C. Schmid. Human detection using oriented histograms of flow and appearance. In European Conference on Computer Vision, 2006.
[34] A. C. Davies and S. A. Velastin. Progress in computational intelligence to support CCTV surveillance systems. International Scientific Journal of Computing, 4(3):76-84, 2005.
[35] J. Davies, D. Fensel, and F. van Harmelen. Towards the Semantic Web: Ontology-Driven Knowledge Management. John Wiley & Sons, 2003.
[36] X. Desurmont, J. B. Hayet, J. F. Delaigle, J. Piater, and B. Macq. Trictrac video dataset: Public HDTV synthetic soccer video sequences with ground truth. In Workshop on Computer Vision Based Analysis in Sport Environments (CVBASE), pages 92-100, 2006.
[37] X. Desurmont, I. Ponte, J. Meessen, and J. F. Delaigle. Nonintrusive viewpoint tracking for 3d for perception in smart video conference. In Three-Dimensional Image Capture and Applications VI, San Jose, CA, USA, January 2006.
[38] M. Dimitrijevic, V. Lepetit, and P. Fua. Human body pose recognition using spatio-temporal templates. In ICCV Workshop on Modeling People and Human Interaction, Beijing, China, October 2005.
[39] P. Domingos and M. Richardson. Markov logic networks. Machine Learning, 62(1-2):107-136, February 2006.


[40] W. Dong and A. Pentland. Multi-sensor data fusion using the influence model. In Body Sensor Networks Workshop, Boston, MA, April 2006.
[41] C. Dousson, P. Gaborit, and M. Ghallab. Situation recognition: Representation and algorithms. In The 13th IJCAI, pages 166-172, August 1993.
[42] W. Du, J. B. Hayet, J. Piater, and J. Verly. Collaborative multi-camera tracking of athletes in team sports. In Workshop on Computer Vision Based Analysis in Sport Environments (CVBASE), pages 2-13, 2006.
[43] W. Du and J. Piater. Multi-view tracking using sequential belief propagation. In Asian Conference on Computer Vision, pages 684-693, Hyderabad, India, 2006. Springer-Verlag.
[44] A. Enficiaud, B. Lienard, N. Allezard, R. Sebbe, S. Beucher, X. Desurmont, P. Sayd, and J. F. Delaigle. Clovis - a generic framework for general purpose visual surveillance applications. In IEEE International Workshop on Visual Surveillance (VS), 2006.
[45] M. Fayad and D. Schmidt, editors. Object-Oriented Application Frameworks, volume 40 of Special Issue of the Communications of the ACM, October 1997.
[46] D. Fensel, V. R. Benjamins, E. Motta, and B. Wielinga. UPML: A Framework for Knowledge System Reuse. In IJCAI-99, 1999.
[47] J. Ferber. Multi-Agent Systems: An Introduction to Distributed Artificial Intelligence. Addison Wesley, 1st edition, 1999.
[48] J. M. Ferryman, S. Maybank, and A. Worrall. Visual surveillance for moving vehicles. International Journal of Computer Vision, 37(2):187-197, June 2000.
[49] G. L. Foresti and C. Micheloni. A robust feature tracker for active surveillance of outdoor scenes. Electronic Letters on Computer Vision and Image Analysis, 1(1):21-34, 2003.
[50] G. L. Foresti, C. Piciarelli, and L. Snidaro. Trajectory clustering and its applications for video surveillance. In Advanced Video and Signal-Based Surveillance (AVSS), Como, Italy, September 2005.
[51] G. L. Foresti and C. S. Regazzoni. Multisensor data fusion for driving autonomous vehicles in risky environments. IEEE Transactions on Vehicular Technology, 51(5):1165-1185, September 2002.
[52] A. Galata, N. Johnson, and D. Hogg. Learning variable length markov models of behaviour. Computer Vision and Image Understanding, 81(3):398-413, 2001.
[53] E. Gamma, R. Helm, R. Johnson, and J. Vlissides. Design Patterns: Elements of Reusable Object-Oriented Software. Addison Wesley, 1995.
[54] D. M. Gavrila and S. Munder. Multi-cue pedestrian detection and tracking from a moving vehicle. International Journal of Computer Vision (est.: February-March 2007 issue), 2007.
[55] M. Gelgon, P. Bouthemy, and J. P. Le-Cadre. Recovery of the trajectories of multiple moving objects in an image sequence with a PMHT approach. Image and Vision Computing Journal, 23(1):19-31, 2005.


[56] J. Gennari, M. A. Musen, R. W. Fergerson, W. E. Grosso, M. Crubezy, H. Eriksson, N. F. Noy, and S. W. Tu. The evolution of Protégé: An environment for knowledge-based systems development. In SMI-2002, 2002.
[57] P. Genoud, V. Dupierris, M. Page, C. Bruley, D. Ziébelin, J. Gensel, and D. Bardou. From AROM, a new object based knowledge representation system, to WebAROM, a knowledge bases server. In Workshop on Application of Advanced Information Technologies to Medicine (AAITM), AIMSA 2000, September 2000.
[58] C. Gomila and F. Meyer. Tracking objects by graph matching of image partition sequences. In 3rd IAPR-TC15 Workshop on Graph-Based Representations in Pattern Recognition, pages 1-11, Ischia, Italy, May 2001.
[59] N. Gourier, J. Maisonnasse, D. Hall, and J. L. Crowley. Head pose estimation on low resolution images. In CLEAR 2006, Southampton, UK, April 2006.
[60] D. Greenhill, P. Remagnino, and G. A. Jones. Video Based Surveillance Systems - Computer Vision and Distributed Processing, chapter VIGILANT: Content-Querying of Video Surveillance Streams, pages 193-204. Kluwer Academic Publishers, 2002.
[61] N. Halbwachs. Synchronous Programming of Reactive Systems. Kluwer Academic, 1993.
[62] D. Hall. Automatic parameter regulation for a tracking system with an auto-critical function. In Proceedings of the IEEE International Workshop on Computer Architecture for Machine Perception (CAMP05), pages 39-45, Palermo, Italy, July 2005.
[63] D. Hall, R. Emonet, and J. L. Crowley. An automatic approach for parameter selection in self-adaptive tracking. In International Conference on Computer Vision Theory and Applications (VISAPP), Setubal, Portugal, February 2006.
[64] D. Hall, J. Nascimento, P. Ribeiro, E. Andrade, P. Moreno, S. Pesnel, T. List, R. Emonet, R. B. Fisher, J. Santos-Victor, and J. L. Crowley. Comparison of target detection algorithms using adaptive background models. In 2nd Joint IEEE Int. Workshop on Visual Surveillance and Performance Evaluation of Tracking and Surveillance (VS-PETS), pages 113-120, Beijing, October 2005.
[65] D. Hall, F. Pelisson, O. Riff, and J. L. Crowley. Brand identification using gaussian derivative histograms. Machine Vision and Applications, 16(1):41-46, 2004.
[66] D. Harel and R. Marelly. Come, Let's Play: Scenario-Based Programming Using LSCs and the Play-Engine. Springer, 1st edition, 2003.
[67] T. A. Henzinger, P.-H. Ho, and H. Wong-Toi. HyTech: A Model Checker for Hybrid Systems. Software Tools for Technology Transfer, 1:110-122, 1997.
[68] D. C. Hogg, A. G. Cohn, V. Devin, D. Magee, C. Needham, and P. Santos. Learning about objects and activities. In 24th Leeds Annual Statistical Research (LASR) Workshop, Leeds, UK, 2005.
[69] G. J. Holzmann. The Spin Model Checker. IEEE Transactions on Software Engineering, 23:279-295, 1997.


[70] S. Hongeng, F. Brémond, and R. Nevatia. Bayesian framework for video surveillance application. In Proceedings of the International Conference on Pattern Recognition (ICPR'00), pages 1164-1170, Barcelona, Spain, September 2000.
[71] C. L. Huang, E. L. Chen, and P. C. Chung. Fall detection using modular neural networks and back-projected optical flow. In The 12th International Conference on Neural Information Processing (ICONIP 2005), 2005.
[72] C. L. Huang, E. L. Chen, and P. C. Chung. Using modular neural networks for falling detection. In The 10th Conference on Artificial Intelligence and Applications (TAAI2005), 2005.
[73] Y. Ivanov and A. Bobick. Recognition of visual activities and interactions by stochastic parsing. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(8):852-872, 2000.
[74] R. E. Johnson. Frameworks = (Components + Patterns). CACM, 10(40):39-42, 1997.
[75] E. M. Clarke Jr, O. Grumberg, and D. Peled. Model Checking. MIT Press, 2000.
[76] A. Kale, N. Cuntoor, B. Yegnanarayana, A. N. Rajagopalan, and R. Chellappa. Using Appearance Matching in Optical and Digital Techniques of Information Security, chapter Gait-based Human Identification. Springer, 2005.
[77] G. Kiczales, J. Lamping, A. Mendhekar, C. Maeda, C. V. Lopes, J. M. Loingtier, and J. Irwin. Aspect-oriented programming. In Proc. ECOOP 1997, number 1241 in LNCS. Springer Verlag, June 1997.
[78] Danny B. Lange and Mitsuru Oshima. Programming and Deploying Java(TM) Mobile Agents with Aglets(TM). Addison-Wesley, 1998.
[79] I. Laptev, B. Caputo, and T. Lindeberg. Local velocity-adapted motion events for spatio-temporal recognition. Computer Vision and Image Understanding, 2007.
[80] I. Laptev and T. Lindeberg. Velocity-adaptation of spatio-temporal receptive fields for direct recognition of activities: An experimental study. Image and Vision Computing, 22:105-116, 2004.
[81] D. Larlus and F. Jurie. Latent mixture vocabularies for object categorization. In British Machine Vision Conference, 2006.
[82] S. N. Lim, A. Mittal, L. S. Davis, and N. Paragios. Fast illumination-invariant background subtraction using two views: Error analysis, sensor placement and applications. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), San Diego, CA, June 2005.
[83] F. Lv, X. Song, B. Wu, V. K. Singh, and R. Nevatia. Left luggage detection using bayesian inference. In PETS, 2006.
[84] K. L. McMillan. The SMV Model Checker. Available from http://www-cad.eecs.berkeley.edu/~kenmcmil/smv/, 2001.
[85] W. Mark and D. M. Gavrila. Real-time dense stereo for intelligent vehicles. IEEE Transactions on Intelligent Transportation Systems, 7(1):38-50, March 2006.


[86] T. Mathes and J. Piater. Robust non-rigid object tracking using point distribution manifolds. In 28th Annual Symposium of the German Association for Pattern Recognition (DAGM), 2006.
[87] S. J. Mellor and M. J. Balcer. Executable UML: A Foundation for Model-Driven Architecture. Addison-Wesley, 2002.
[88] C. Micheloni and G. L. Foresti. Fast good feature selection for wide area monitoring. In IEEE Conference on Advanced Video and Signal Based Surveillance, pages 271-276, Miami Beach, Florida, July 2003.
[89] A. Mittal and L. Davis. M2Tracker: A multi-view approach to segmenting and tracking people in a cluttered scene. International Journal of Computer Vision, 51(3):189-203, 2003.
[90] T. Moeslund, A. Hilton, and V. Krüger. A survey of advances in vision-based human motion capture and analysis. Computer Vision and Image Understanding, 104(2-3):90-126, 2006.
[91] S. Moisan and D. Ziébelin. Résolution de problèmes en pilotage de programmes. In RFIA'2000, 12ème Congrès Reconnaissance des Formes et Intelligence Artificielle, pages 387-395, Paris, France, February 2000.
[92] S. Munder and D. M. Gavrila. An experimental study on pedestrian classification. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(11):1863-1868, 2006.
[93] D. Nau, M. Ghallab, and P. Traverso. Automated Planning: Theory and Practice. Morgan Kaufmann, 2004.
[94] B. Neumann and T. Weiss. Navigating through logic-based scene models for high-level scene interpretations. In Springer, editor, 3rd International Conference on Computer Vision Systems - ICVS 2003, pages 212-222, 2003.
[95] R. Nevatia, J. Hobbs, and B. Bolles. An ontology for video event representation. In IEEE Workshop on Event Detection and Recognition, June 2004.
[96] M. S. Nixon, T. N. Tan, and R. Chellappa. Human Identification Based on Gait. Springer, December 2005.
[97] T. Ogata, W. Christmas, J. Kittler, and S. Ishikawa. Improving human activity detection by combining multi-dimensional motion descriptors with boosting. In Proceedings of the 18th International Conference on Pattern Recognition, volume 1, pages 295-298. IEEE Computer Society, August 2006.
[98] Mourad Oussalah and Collectif. Ingénierie des composants : Concepts, techniques et outils. Vuibert, 2005.
[99] V. Parameswaran and R. Chellappa. Human action-recognition using mutual invariants. Computer Vision and Image Understanding, 98:295-325, September 2005.
[100] P. Pettersson and K. Larsen. Uppaal2k. Bulletin of the European Association for Theoretical Computer Science, 70:40-44, 2000.
[101] C. Pinhanez and A. Bobick. Interval scripts: a programming paradigm for interactive environments and agents. Pervasive and Ubiquitous Computing, 7(1):1-21, 2003.


[102] R. Brachman and H. Levesque. Knowledge Representation and Reasoning. Morgan Kaufmann Series in Artificial Intelligence, 2004.
[103] P. Remagnino, A. I. Shihab, and G. A. Jones. Distributed intelligence for multi-camera visual surveillance. Pattern Recognition, Special Issue on Agent-based Computer Vision, 37(4):675-689, April 2004.
[104] A. Renouf, R. Clouard, and M. Revenu. Un modèle de formulation d'applications de traitement d'images. In ORASIS'05, May 2005.
[105] K. Sage and H. Buxton. Joint spatial and temporal structure learning for task based control. In International Conference on Pattern Recognition, 2004.
[106] E. Salvador, A. Cavallaro, and T. Ebrahimi. Cast shadow segmentation using invariant colour features. Computer Vision and Image Understanding, 95(2):238-259, August 2004.
[107] F. A. Schreiber. Is Time a Real Time? An Overview of Time Ontology in Informatics, volume F127, pages 283-307. Springer Verlag NATO-ASI, 1994.
[108] G. Schreiber, B. Wielinga, R. de Hoog, H. Akkermans, and W. van de Velde. CommonKADS: A Comprehensive Methodology for KBS Development. IEEE Expert, 9(6):28-37, 1994.
[109] Y. Sheikh and M. Shah. Bayesian modelling of dynamic scenes for object detection. IEEE Transactions on PAMI, October 2005.
[110] Y. Shi, Y. Huang, D. Minnen, A. Bobick, and I. Essa. Propagation networks for recognition of partially ordered sequential action. In Proceedings of IEEE Computer Vision and Pattern Recognition (CVPR 2004), Washington DC, June 2004.
[111] J. Sproston. Validation of Stochastic Systems, volume 2925 of Lecture Notes in Computer Science, chapter Model Checking for Probabilistic Timed Systems, pages 189-229. Springer Berlin / Heidelberg, 2004.
[112] A. M. Taj, E. Maggio, and A. Cavallaro. Multi-feature graph-based object tracking. In Classification of Events, Activities and Relationships (CLEAR) Workshop, Southampton, UK, April 2006. Springer.
[113] R. Urtasun, D. Fleet, and P. Fua. 3d people tracking with gaussian process dynamical models. In Conference on Computer Vision and Pattern Recognition, June 2006.
[114] A. Veeraraghavan, A. K. RoyChowdhury, and R. Chellappa. Matching shape sequences in video with applications in human movement analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 27, December 2005.
[115] S. A. Velastin, B. A. Boghossian, and M. A. Vicencio-Silva. A motion-based image processing system for detecting potentially dangerous situations in underground railway stations. Transportation Research Part C: Emerging Technologies, 14(2):96-113, 2006.
[116] P. Viola and M. J. Jones. Robust real-time face detection. International Journal of Computer Vision, 57(2):137-154, 2004.


[117] S. Von-Wum, L. Chen-Yu, Y. Jaw-Jium, and C. Ching-Chi. Using sharable ontology to retrieve historical images. In Proceedings of the 2nd ACM/IEEE-CS Joint Conference on Digital Libraries, pages 197-198, 2002.
[118] L. Wang, W. Hu, and T. Tan. Recent developments in human motion analysis. Pattern Recognition, 36(3):585-601, 2003.
[119] G. Weiss. Multiagent Systems, A Modern Approach to Distributed Artificial Intelligence. MIT Press, 1999.
[120] M. Wooldridge. An Introduction to MultiAgent Systems. John Wiley & Sons, 2001.
[121] T. Xiang and S. Gong. Video behaviour profiling and abnormality detection without manual labelling. In IEEE International Conference on Computer Vision, pages 1238-1245, Beijing, October 2005.
[122] T. Xiang and S. Gong. Beyond tracking: Modelling activity and understanding behaviour. International Journal of Computer Vision, 67(1):21-51, 2006.
[123] T. Xiang and S. Gong. Incremental visual behaviour modelling. In IEEE Visual Surveillance Workshop, pages 65-72, Graz, May 2006.
[124] T. Xiang and S. Gong. Optimal dynamic graphs for video content analysis. In British Machine Vision Conference, volume 1, pages 177-186, Edinburgh, September 2006.
[125] S. Zaidenberg, O. Brdiczka, P. Reignier, and J. L. Crowley. Learning context models for the recognition of scenarios. In 3rd IFIP Conference on Artificial Intelligence Applications and Innovations (AIAI) 2006, Athens, Greece, June 2006.
[126] Y. Zhai and M. Shah. Visual attention detection in video sequences using spatiotemporal cues. In ACM MM, Santa Barbara, CA, USA, 2006.
[127] D. Zhang, D. Gatica-Perez, S. Bengio, and I. McCowan. Semi-supervised adapted HMMs for unusual event detection. In IEEE CVPR, 2005.
[128] J. Zhang and S. Gong. Beyond static detectors: A bayesian approach to fusing long-term motion with appearance for robust people detection in highly cluttered scenes. In IEEE Visual Surveillance Workshop, pages 121-128, Graz, May 2006.
[129] L. Zhang, H. Hermanns, and D. Jansen. Logic and model checking for hidden markov models. In F. Wang, editor, 25th International Conference on Formal Techniques for Networked and Distributed Systems, Lecture Notes in Computer Science, pages 98-112, National Taiwan University, October 2005.
[130] T. Zhao and R. Nevatia. Tracking multiple humans in crowded environment. Computer Vision and Pattern Recognition, 2:406-413, 2004.
[131] F. Ziliani, S. Velastin, F. Porikli, L. Marcenaro, T. Kelliher, A. Cavallaro, and P. Bruneaut. Performance evaluation of event detection solutions: the CREDS experience. In IEEE International Conference on Advanced Video and Signal based Surveillance (AVSS 2005), Como, Italy, September 2005.


9 Bibliography of the Orion Team (from 1998)

Book Chapters
[132] N. Chleq, F. Bremond, and M. Thonnat. Advanced Video-based Surveillance Systems, chapter Image Understanding for Prevention of Vandalism in Metro Stations, pages 108-118. Kluwer A.P., Hingham, MA, USA, C. S. Regazzoni, G. Fabri, and G. Vernazza edition, November 1998.
[133] N. Chleq, C. Regazzoni, A. Teschioni, and M. Thonnat. Advances in Information Technologies: the Business Challenge, chapter A visual surveillance system for the prevention of vandalism in metro stations, pages 287-294. IOS Press, J. Y. Roger, B. Stanford-Smith and P. T. Kidd edition, 1998.
[134] F. Cupillard, F. Brémond, and M. Thonnat. Tracking groups of people for video surveillance. In P. Remagnino, G. A. Jones, N. Paragios, and C. Regazzoni, editors, Video-Based Surveillance Systems, pages 89-100. Kluwer Academic, 2002.
[135] J. Malenfant, S. Moisan, and A. Moreira, editors. Object-Oriented Technology - ECOOP'2000 Workshop Reader, number 1964 in Lecture Notes in Computer Science, Sophia Antipolis and Cannes, France, June 2000. Springer.
[136] V. Martin and M. Thonnat. Scene Reconstruction, Pose Estimation and Tracking, chapter A Learning Approach for Adaptive Image Segmentation, pages 431-454. ISBN 978-3-902613-06-6, Rustam Stolkin edition, 2007.
[137] S. Moisan and J.-L. Ermine. Ingénierie des connaissances, chapter Gestion opérationnelle des connaissances sur les codes, pages 297-319. L'Harmattan, 2005.
[138] Monique Thonnat. Knowledge-based techniques for image processing and for image understanding. In A. Heck and F. Murtagh, editors, Voies nouvelles pour l'Analyse de Données en Sciences de l'Univers, volume 12 of Journal Phys. IV France, pages 189-236. EDP Sciences, Les Ulis, 2002.

Articles in Refereed Journals
[139] A. Avanzi, F. Bremond, C. Tornieri, and M. Thonnat. Design and assessment of an intelligent activity monitoring platform. EURASIP Journal on Applied Signal Processing, Special Issue on "Advances in Intelligent Vision Systems: Methods and Applications", 2005(14):2359-2374, August 2005.
[140] P. Boissard, C. Hudelot, G. Pérez, P. Béarez, P. Pyrrha, M. Thonnat, and J. Lévy. La détection précoce in situ des bioagresseurs : application à l'oïdium du rosier de serre. Revue Horticole, September 2004.
[141] P. Bonton, A. Boucher, M. Thonnat, R. Tomczak, P. J. Hidalgo, J. Belmonte, and C. Galan. Colour image in 2d and 3d microscopy for the automation of pollen rate measurement. Image Analysis and Stereology, 20(Suppl 1):527-532, 2001. Also Proc. 8th European Congress for Stereology and Image Analysis (ECSIA).
[142] A. Boucher, P. J. Hidalgo, M. Thonnat, J. Belmonte, C. Galan, P. Bonton, and R. Tomczak. Development of a Semi-Automatic System for Pollen Recognition. Aerobiologia, International Journal of Aerobiology, 2001.


[143] A. Boucher, P. J. Hidalgo, M. Thonnat, J. Belmonte, C. Galan, P. Bonton, and R. Tomczak. Development of a Semi-Automatic System for Pollen Recognition. Aerobiologia, International Journal of Aerobiology, 18(3-4), 2002.
[144] B. Boulay, F. Bremond, and M. Thonnat. Applying 3d human model in a posture recognition system. Pattern Recognition Letters, Special Issue on Vision for Crime Detection and Prevention, 27(15):1788-1796, 2006.
[145] F. Brémond and M. Thonnat. Issues of representing context illustrated by video-surveillance applications. International Journal of Human-Computer Studies, Special Issue on Context, 48:375-391, 1998.
[146] F. Brémond and M. Thonnat. Tracking multiple non-rigid objects in video sequences. IEEE Transactions on Circuits and Systems for Video Technology Journal, 8(5), September 1998.
[147] F. Bremond, M. Thonnat, and M. Zúniga. Video Understanding Framework For Automatic Behavior Recognition. Behavior Research Methods Journal, 38(3):416-426, 2006.
[148] A-M. Dery, M. Blay-Fornarino, and S. Moisan. Apport des interactions pour la distribution des connaissances. L'OBJET, Special Issue "Systèmes distribués et Connaissances", 8(4), 2002.
[149] Florent Fusier, Valery Valentin, François Bremond, Monique Thonnat, Mark Borg, David Thirde, and James Ferryman. Video understanding for complex activity recognition. Machine Vision and Applications Journal, 18:167-188, 2007.
[150] Benoit Georis, François Bremond, and Monique Thonnat. Real-time control of video surveillance systems with program supervision techniques. Machine Vision and Applications Journal, 18:189-205, 2007.
[151] N. Khayati, W. Lejouad-Chaari, S. Moisan, and J.-P. Rigault. Distributing knowledge-based systems using mobile agents. WSEAS Transactions on Computers, 5(1):22-29, January 2006.
[152] N. Maillot and M. Thonnat. Ontology based complex object recognition. Image and Vision Computing Journal, Special Issue on Cognitive Computer Vision, 2006. To appear, available online July 2006.
[153] N. Maillot, M. Thonnat, and A. Boucher. Towards ontology based cognitive vision. Machine Vision and Applications (MVA), 16(1):33-40, December 2004.
[154] M. Marcos, S. Moisan, and A. P. del Pobil. Knowledge modeling of program supervision task and its application to knowledge base verification. Applied Intelligence, 10(2/3):181-192, March/April 1999.
[155] G. Medioni, I. Cohen, F. Brémond, S. Hongeng, and G. Nevatia. Activity Analysis in Video. Pattern Analysis and Machine Intelligence (PAMI), 23(8):873-889, 2001.
[156] S. Moisan. Réutilisation et générateurs de systèmes à base de connaissances : le framework Blocks. TSI, 20(4):529-553, 2001.
[157] S. Moisan, A. Ressouche, and J.-P. Rigault. Blocks, a component framework with checking facilities for knowledge-based systems. Informatica, Special Issue on Component Based Software Development, 25(4):501-507, November 2001.


[158] C. Shekhar, S. Moisan, R. Vincent, P. Burlina, and R. Chellappa. Knowledge-based control of vision systems. Image and Vision Computing, 17(8):667-683, May 1999.
[159] M. Thonnat and S. Moisan. What can Program Supervision do for Software Reuse? IEE Proceedings - Software, Special Issue on Knowledge Modelling for Software Components Reuse, 147(5), 2000.
[160] J.-P. Vidal, S. Moisan, J.-B. Faure, and D. Dartus. Towards a reasoned 1D river model calibration. Journal of Hydroinformatics, 7(2):79-90, April 2005.
[161] J.-P. Vidal, S. Moisan, J.-B. Faure, and D. Dartus. River model calibration, from guidelines to operational support tools. Environmental Modelling and Software, 22(11):1628-1640, November 2007.
[162] T. Vu, F. Brémond, and M. Thonnat. Human behaviour visualisation and simulation for automatic video understanding. Journal of WSCG, 10(1-3):485-492, 2002.

Publications in International Refereed Conferences
[163] C. André and J.-P. Rigault. Variations on the semantics of graphical models for reactive systems. In SMC'02, ISBN: 2-9512309-4-x, Hammamet, Tunisia, October 2002. IEEE Press.
[164] Charles André, Marie-Agnès Peraldi-Frati, and Jean-Paul Rigault. Scenario and Property Checking of Real-Time Systems Using a Synchronous Approach. In 4th International Symposium on Object-Oriented Real-Time Distributed Computing (ISORC 2001), Magdeburg, Germany, May 2001. IEEE.
[165] Charles André, Marie-Agnès Peraldi, and Jean-Paul Rigault. Integrating the synchronous paradigm into UML: Application to control-dominated systems. In 5th UML Conference (UML 2002), number 2460 in LNCS, Dresden, Germany, 2002. Springer Verlag.
[166] A. Avanzi, F. Brémond, and M. Thonnat. Tracking multiple individuals for video communication. In International Conference on Image Processing ICIP'01, Thessaloniki, Greece, October 2001. IEEE Signal Processing Society.
[167] J. Bannour, B. Georis, F. Bremond, and M. Thonnat. Generation of 3d animations from scenario models. In Proceedings of VIIP'04 - 4th IASTED International Conference on Visualization, Imaging and Image Processing, September 6-8, 2004.
[168] H. Benhadda, J. L. Patino, E. Corvee, F. Bremond, and M. Thonnat. Data mining on large video recordings. In Veille Stratégique Scientifique et Technologique VSST 2007, Marrakech, Morocco, 21st-25th October 2007.
[169] P. Boissard, C. Hudelot, M. Thonnat, G. Perez, P. Pyrrha, and F. Bertaux. Automated approach to monitor the sanitary status and to detect early biological attacks on plants in greenhouse. In IEEE, editor, International Symposium on Greenhouse Tomato, Avignon. CTIFL, 2003.
[170] Mark Borg, David Thirde, James Ferryman, Florent Fusier, Francois Bremond, and Monique Thonnat. An integrated vision system for aircraft activity monitoring. In Proceedings of IEEE International Workshop on Performance Evaluation of Tracking and Surveillance (PETS), Breckenridge, Colorado, USA, January 2005. IEEE Computer Society.


[171] Mark Borg, David Thirde, James Ferryman, Florent Fusier, Valery Valentin, Francois Bremond, and Monique Thonnat. Video surveillance for aircraft activity monitoring. In Proceedings of IEEE International Conference on Advanced Video and Signal-Based Surveillance, Como, Italy, September 2005. IEEE Computer Society.
[172] Mark Borg, David Thirde, James Ferryman, Florent Fusier, Valery Valentin, Francois Bremond, Monique Thonnat, Josep Aguilera, and Martin Kampel. Visual surveillance for aircraft activity monitoring. In Proceedings of the Second Joint IEEE International Workshop on Visual Surveillance and Performance Evaluation of Tracking and Surveillance, Beijing, China, October 2005. IEEE Computer Society.
[173] A. Boucher, P. Hidalgo, M. Thonnat, J. Belmonte, and C. Galan. 3D Pollen Image Recognition based on Palynological Knowledge. In Second European Symposium on Aerobiology, Vienna, Austria, September 2000.
[174] A. Boucher and M. Thonnat. Object recognition from 3d blurred images. In International Conference on Pattern Recognition, ICPR2002, Quebec, Canada, August 2002.
[175] A. Boucher, P. Tomczak, P. Hidalgo, M. Thonnat, J. Belmonte, C. Galan, and P. Bonton. A Semi-Automatic System for Pollen Recognition. In Second European Symposium on Aerobiology, Vienna, Austria, September 2000.
[176] B. Boulay, F. Bremond, and M. Thonnat. Human posture recognition in video sequence. In Proceedings of the Joint IEEE International Workshop on VS-PETS, Visual Surveillance and Performance Evaluation of Tracking and Surveillance, pages 23-29, October 11-12, 2003.
[177] Bernard Boulay, Francois Bremond, and Monique Thonnat. Posture recognition with a 3d human model. In Proceedings of the IEE International Symposium on Imaging for Crime Detection and Prevention, London, UK, 2005. Institution of Electrical Engineers.
[178] Binh Bui, Francois Brémond, Monique Thonnat, and J. C. Faure. Shape recognition based on a video and multi-sensor system. In IEEE International Conference on Advanced Video and Signal-Based Surveillance (AVSS 2005), Como, Italy, September 2005.
[179] C. Carincotte, X. Desurmont, B. Ravera, F. Brémond, J. Orwell, S. Velastin, J.-M. Odobez, B. Corbucci, J. Palo, and J. Cernocky. Toward generic intelligent knowledge extraction from video and audio: the EU-funded CARETAKER project. In IET Conference on Imaging for Crime Detection and Prevention (ICDP 2006), London, Great Britain, June 2006. http://www-sop.inria.fr/orion/Publications/Articles/ICDP06.html.
[180] J. Cavarroc, S. Moisan, and J.-P. Rigault. Simplifying an Extensible Class Library Interface with OpenC++. In OOPSLA'98, Workshop on Reflective Programming in C++ and Java (UTCCP Report 98-4). University of Tsukuba, October 1998.
[181] M. Crubézy, M. Marcos, and S. Moisan. Experiments in Building Program Supervision Engines from Reusable Components. In 13th European Conference on Artificial Intelligence - ECAI98, Workshop on Applications of Ontologies and Problem-Solving Methods. John Wiley & Sons, Ltd., August 1998.

Page 40: August - Inria · Pulsar will b e a com bination of researc h in computer vision, soft w are engineering and kno wledge engineering. More precisely e will prop ose new computer vision

[182] F. Cupillard, A. Avanzi, F. Brémond, and M. Thonnat. Video understanding for metro surveillance. In IEEE ICNSC 2004, special session on Intelligent Transportation Systems, Taipei (Taiwan), March 2004.
[183] F. Cupillard, F. Brémond, and M. Thonnat. Tracking groups of people for video surveillance. In 2nd European Workshop on Advanced Video-based Surveillance Systems AVBS'01, Kingston, UK, September 2001.
[184] F. Cupillard, F. Brémond, and M. Thonnat. Group behavior recognition with multiple cameras. In IEEE Proceedings of the Workshop on Applications of Computer Vision - ACV 2002, Orlando, USA, December 2002.
[185] F. Cupillard, F. Brémond, and M. Thonnat. Behaviour recognition for individuals, groups of people and crowd. In IEE Proceedings of the Intelligent Distributed Surveillance Systems Symposium - IDSS 2003, London, UK, February 2003.
[186] Frederic Cupillard, François Brémond, and Monique Thonnat. Automatic visual recognition for metro surveillance. In Measuring Behavior, Wageningen, The Netherlands, September 2005.
[187] S. Delaitre, A. Giboin, and S. Moisan. The AEX Method and its Instrumentation. In Kluwer, editor, ICEIS, International Conference on Enterprise Information Systems, Ciudad Real, Spain, April 2002.
[188] S. Delaître and S. Moisan. Knowledge Management by Reusing Experience. In Rose Dieng and Olivier Corby, editors, EKAW'2000, 12th International Conference on Knowledge Engineering and Knowledge Management, Lecture Notes in Artificial Intelligence 1935, pages 304-311, Juan les Pins, France, October 2000. Springer.
[189] A-M. Dery, M. Blay-Fornarino, S. Moisan, B. Arcier, and L. Mule. Distributed access knowledge-based system: Reified interaction service for trace and control. In 3rd International Symposium on Distributed Objects & Applications, DOA'01, pages 76-84, Rome, Italy, September 2001.
[190] N. Dey, A. Boucher, and M. Thonnat. Image formation model of a 3D translucent object observed in light microscopy. In International Conference on Image Processing ICIP 2002, Rochester, USA, September 2002.
[191] James Ferryman, Mark Borg, David Thirde, Florent Fusier, Valery Valentin, François Brémond, Monique Thonnat, Josep Aguilera, and Martin Kampel. Automated scene understanding for airport aprons. In Proceedings of the 18th Australian Joint Conference on Artificial Intelligence, Sydney, Australia, December 2005. Springer-Verlag.
[192] B. Georis, F. Brémond, M. Thonnat, and B. Macq. Use of an evaluation and diagnosis method to improve tracking performances. In Proceedings of VIIP'03 - 3rd IASTED International Conference on Visualization, Imaging and Image Processing, Benalmadena, Spain, September 8-10, 2003.
[193] B. Georis, M. Maziere, F. Brémond, and M. Thonnat. A video interpretation platform applied to bank agency monitoring. In Proceedings of IDSS'04 - 2nd Workshop on Intelligent Distributed Surveillance Systems, February 23, 2004.
[194] B. Georis, M. Mazière, F. Brémond, and M. Thonnat. Evaluation and knowledge representation formalisms to improve video understanding. In Proceedings of the
International Conference on Computer Vision Systems (ICVS'06), New York, NY, USA, January 2006.
[195] S. Hongeng, F. Brémond, and R. Nevatia. Bayesian Framework for Video Surveillance Application. In 15th International Conference on Pattern Recognition, Barcelona, Spain, September 2000.
[196] S. Hongeng, F. Brémond, and R. Nevatia. Representation and Optimal Recognition of Human Activities. In IEEE Conference on Computer Vision and Pattern Recognition, USA, June 2000.
[197] C. Hudelot and M. Thonnat. An architecture for knowledge based image interpretation. In Workshop on Computer Vision System Control Architectures, VSCA, Graz, 2003.
[198] C. Hudelot and M. Thonnat. A cognitive vision platform for automatic recognition of natural complex objects. In IEEE, editor, International Conference on Tools with Artificial Intelligence, ICTAI, Sacramento, 2003.
[199] M. B. Kaâniche, F. Brémond, and M. Thonnat. Monitoring trichogramma activities from videos: An adaptation of cognitive vision system to biology field. In International Cognitive Vision Workshop (ICVW'2006), in conjunction with the 9th European Conference on Computer Vision (ECCV'2006), Graz, Austria, 2006. http://www-sop.inria.fr/orion/Publications/Articles/ICVW06.html.
[200] Naoufel Khayati, Wided Lejouad-Chaari, and Sabine Moisan. Systèmes distribués de pilotage de programmes : application à l'imagerie médicale. In Proc. Méthodes et Heuristiques pour l'Optimisation des Systèmes industriels (MHOSI'2005), Hammamet, Tunisia, April 2005.
[201] Naoufel Khayati, Wided Lejouad-Chaari, Sabine Moisan, and Jean-Paul Rigault. A mobile agent-based architecture for distributed program supervision systems. In Proc. 4th WSEAS Int. Conf. on Information Security, Communications and Computers, Tenerife, Spain, December 2005.
[202] A. Lux. Tools for automatic interface generation in Scheme. In 2nd Workshop on Scheme and Functional Programming, Colloquium "Principles, Logics, and Implementations of high-level programming languages", Florence, Italy, September 2001.
[203] N. Maillot, M. Thonnat, and A. Boucher. Towards ontology based cognitive vision. In James L. Crowley, Justus H. Piater, Markus Vincze, and Lucas Paletta, editors, Computer Vision Systems, Third International Conference, ICVS, volume 2626 of Lecture Notes in Computer Science. Springer, 2003.
[204] N. Maillot, M. Thonnat, and A. Boucher. A visual concept ontology for automatic image understanding. In Poster Session of the 2nd International Semantic Web Conference, ISWC, 2003.
[205] Nicolas Maillot and Monique Thonnat. A weakly supervised approach for semantic image indexing and retrieval. In International Conference on Image and Video Retrieval, CIVR, pages 629-638, 2005.
[206] Nicolas Maillot, Monique Thonnat, and Celine Hudelot. Ontology based object learning and recognition: Application to image retrieval. In Proceedings of the 16th IEEE International Conference on Tools with Artificial Intelligence, Boca Raton, USA, 2004. IEEE Computer Society Press.
[207] M. Marcos, S. Moisan, and A. Del Pobil. Knowledge modeling of program supervision task. In Jose Mira, Angel Pasqual del Pobil, and Moonis Ali, editors, Proceedings of the 11th International Conference on Industrial Applications of Artificial Intelligence and Expert Systems, Lecture Notes in Computer Science no 1415, Benicassim, Spain, June 1998. Springer-Verlag.
[208] Vincent Martin, Nicolas Maillot, and Monique Thonnat. A learning approach for adaptive image segmentation. In Proceedings of the International Conference on Computer Vision Systems (ICVS'06), New York, NY, USA, January 2006.
[209] N. Moenne-Loccoz, F. Brémond, and M. Thonnat. Recurrent Bayesian network for the recognition of human behaviors from video. In Third International Conference on Computer Vision Systems (ICVS 2003), volume LNCS 2626, pages 44-53, Graz, Austria, 2003. Springer.
[210] S. Moisan. Knowledge representation for program reuse. In Proc. European Conference on Artificial Intelligence (ECAI 2002), Lyon, France, July 2002.
[211] S. Moisan, M. Boër, C. Thiebaut, F. Tricoire, and M. Thonnat. A versatile scheduler for automatic telescopes. In SPIE Conference on Astronomical Telescopes and Instrumentation, Hawaii, USA, August 2002.
[212] S. Moisan and W. Lejouad-Chaari. Réutilisation intelligente de programmes de vision en environnement distribué. In 2èmes Ateliers Traitement et Analyse d'Images : Méthodes et Applications, TAIMA01, Hammamet, Tunisia, October 2001.
[213] S. Moisan, A. Ressouche, and J.-P. Rigault. Towards Formalizing Behavioral Substitutability in Component Frameworks. In 2nd International Conference on Software Engineering and Formal Methods, pages 122-131, Beijing, China, 2004. IEEE Computer Society Press.
[214] S. Moisan, A. Ressouche, and J-P. Rigault. Synchronous Formalism and Behavioral Substitutability in Component Frameworks. In F. Maraninchi, editor, Synchronous Languages Applications and Programming, ENTCS. Elsevier Science, 2005.
[215] S. Moisan and M. Thonnat. Program Supervision under Resource Constraints. In 13th European Conference on Artificial Intelligence - ECAI98, Workshop on Monitoring and Control of Real-Time Intelligent Systems. John Wiley & Sons, Ltd., August 1998.
[216] Sabine Moisan, Annie Ressouche, and Jean-Paul Rigault. Behavioral substitutability in component frameworks: a formal approach. In Proc. Specification and Verification of Component-based Systems (SAVCBS'2003), pages 22-28, Helsinki, Finland, 2003. IEEE Computer Society Press.
[217] A. T. Nghiem, F. Brémond, M. Thonnat, and R. Ma. A new evaluation approach for video processing algorithms. In Proceedings of WMVC 2007, IEEE Workshop on Motion and Video Computing, Austin, Texas, USA, February 2007.
[218] A. T. Nghiem, F. Brémond, M. Thonnat, and V. Valentin. ETISEO: an evaluation project for video surveillance systems. In Proceedings of AVSS 2007, IEEE International Conference on Advanced Video and Signal-based Surveillance, London, UK, September 2007.
[219] J.L. Patino, H. Benhadda, E. Corvee, F. Brémond, and M. Thonnat. Video-data modelling and discovery. In International Conference on Visual Information Engineering VIE 2007, London, UK, 25th-27th July 2007.
[220] J.L. Patino, E. Corvee, F. Brémond, and M. Thonnat. Management of large video recordings. In Ambient Intelligence Developments AmI.d 2007, Sophia Antipolis, France, 17th-19th September 2007.
[221] Pascal Rapicault and Jean-Paul Rigault. Open implementation of UML meta-model(s): Making meta-modeling and meta-programming meet. In Reflection 2001, number 2192 in LNCS, pages 276-277, Kyoto, Japan, 2001. Springer Verlag.
[222] Pascal Rapicault, Jean-Paul Rigault, and Luc Bourlier. Model, notation, and tools for verification of protocol-based components assembly. In Judith Bishop, editor, Component Deployment, LNCS 2370, pages 257-268, Berlin, Germany, 2002. Springer-Verlag.
[223] A. Ressouche, V. Roy, J.Y. Tigli, and D. Cheung. SAS architecture: Verification oriented formal modeling of concrete critical systems. In 2003 IEEE International Conference on Systems, Man and Cybernetics, SMC'03, Washington, USA, October 2003.
[224] N. Rota, R. Stahr, and M. Thonnat. Tracking for Visual Surveillance in VSIS. In First IEEE International Workshop on Performance Evaluation of Tracking and Surveillance PETS 2000, Grenoble, France, March 2000.
[225] N. Rota and M. Thonnat. Activity Recognition from Video Sequences using Declarative Models. In 14th European Conference on Artificial Intelligence, ECAI 2000, Berlin, Germany, August 2000.
[226] N. Rota and M. Thonnat. Video Sequence Interpretation for Visual Surveillance. In Workshop on Visual Surveillance, VS'00, pages 59-67, Dublin, Ireland, July 2000.
[227] Lan Le Thi, Alain Boucher, and Monique Thonnat. Subtrajectory-based video indexing and retrieval. In 13th International MultiMedia Modeling Conference (MMM), Singapore, 2007.
[228] David Thirde, Mark Borg, James Ferryman, Florent Fusier, Valery Valentin, François Brémond, and Monique Thonnat. Video event recognition for aircraft activity monitoring. In Proceedings of the 8th IEEE International Conference on Intelligent Transportation Systems, Vienna, Austria, September 2005. IEEE Computer Society.
[229] M. Thonnat, A. Avanzi, and F. Brémond. Analyse de scènes de bureaux pour la communication video. In 2èmes Ateliers Traitement et Analyse d'Images : Méthodes et Applications, TAIMA01, Hammamet, Tunisia, October 2001.
[230] M. Thonnat and S. Moisan. What can program supervision do for program re-use? In Angel Pasqual del Pobil, Jose Mira, and Moonis Ali, editors, Proceedings of the 11th International Conference on Industrial Applications of Artificial Intelligence and Expert Systems, Lecture Notes in Computer Science no 1416, Benicassim, Spain, June 1998. Springer-Verlag.
[231] M. Thonnat, S. Moisan, and M. Crubézy. Experience in Integrating Image Processing Programs. In International Conference on Computer Vision Systems, Las Palmas, Canary Islands, January 1999.
[232] M. Thonnat and N. Rota. Image Understanding for Visual Surveillance Applications. In Third International Workshop on Cooperative Distributed Vision, Kyoto, Japan, November 1999.
[233] A. Toshev, F. Brémond, and M. Thonnat. An a priori-based method for frequent composite event discovery in videos. In Proceedings of the 2006 IEEE International Conference on Computer Vision Systems, New York, USA, January 2006.
[234] J.-P. Vidal, J.-B. Faure, S. Moisan, and D. Dartus. Decision Support System for Calibration of 1-D River Models. In S. Y. Liong, K.-K. Phoon, and Vladan Babovic, editors, 6th International Conference on Hydroinformatics, volume 2, pages 1067-1074, Singapore, June 2004. World Scientific Publishing.
[235] J.-P. Vidal and S. Moisan. Opérationalisation de connaissances pour le calage de modèles numériques - Application en hydraulique fluviale. In IC'2004, 15èmes Journées Francophones d'Ingénierie des Connaissances, pages 261-272, Lyon, France, 2004. Presses Universitaires de Grenoble.
[236] J-P. Vidal, S. Moisan, and J-B. Faure. Knowledge-based hydraulic model calibration. In V. Palade, R.J. Howlett, and L. Jain, editors, Knowledge-Based Intelligent Information and Engineering Systems (KES), volume I of LNAI, pages 676-683, Oxford, UK, September 2003. Springer.
[237] T. Vu, F. Brémond, G. Davini, M. Thonnat, QC. Pham, N. Allezard, P. Sayd, JL. Rouas, S. Ambellouis, and A. Flancquart. Audio video event recognition system for public transport security. In IET Conference on Imaging for Crime Detection and Prevention (ICDP 2006), London, Great Britain, June 2006. http://www-sop.inria.fr/orion/Publications/Articles/ICDP06.html.
[238] V-T. Vu, F. Brémond, and M. Thonnat. Automatic video interpretation: A novel algorithm for temporal scenario recognition. In The Eighteenth International Joint Conference on Artificial Intelligence (IJCAI'03), Acapulco, Mexico, 2003.
[239] V-T. Vu, F. Brémond, and M. Thonnat. Automatic video interpretation: A recognition algorithm for temporal scenarios based on pre-compiled scenario models. In The 3rd International Conference on Vision Systems (ICVS'03), Graz, Austria, 2003.
[240] V.T. Vu, F. Brémond, and M. Thonnat. Human behaviour visualisation and simulation for automatic video understanding. In 10th International Conference in Central Europe on Computer Graphics, Visualization and Computer Vision 2002, WSCG 2002, Plzen, Czech Republic, February 2002.
[241] V.T. Vu, F. Brémond, and M. Thonnat. Temporal constraints for video interpretation. In 15th European Conference on Artificial Intelligence, Workshop on Modelling and Solving Problems with Constraints, Lyon, France, July 2002.
[242] V.T. Vu, F. Brémond, and M. Thonnat. Video surveillance: human behaviour representation and on-line recognition. In Sixth International Conference on Knowledge-Based Intelligent Information and Engineering Systems, KES 2002, Crema, Italy, September 2002.
[243] M. Zhan, P. Remagnino, S. Velastin, F. Brémond, and M. Thonnat. Matching gradient descriptors with topological constraints to characterise the crowd dynamics. In The 3rd International Conference on Visual Information Engineering (VIE 2006), pages 441-446, Bangalore, India, September 26-28, 2006.
[244] N. Zouba, F. Brémond, M. Thonnat, and V. T. Vu. Multi-sensors analysis for everyday elderly activity monitoring. In Proceedings of SETIT 2007, the 4th International Conference: Sciences of Electronics, Technologies of Information and Telecommunications, Tunis, Tunisia, March 2007.
[245] M. Zúniga, F. Brémond, and M. Thonnat. Fast and reliable object classification in video based on a 3D generic model. In The 3rd International Conference on Visual Information Engineering (VIE 2006), pages 433-441, Bangalore, India, September 26-28, 2006.

HDR and PhD Theses

[246] Bernard Boulay. Human Posture Recognition for Behavior Understanding. PhD thesis, Nice-Sophia Antipolis University, January 2007.
[247] François Brémond. Scene Understanding: perception, multi-sensor fusion, spatio-temporal reasoning and activity recognition. Habilitation à diriger les recherches, Université de Nice-Sophia Antipolis, July 2007.
[248] Monica Crubézy. Pilotage de programmes pour le traitement d'images médicales. PhD thesis, Université de Nice-Sophia Antipolis, February 1999.
[249] Nicolas Dey. Etude de la formation de l'image d'un objet microscopique 3D translucide - Application à la microscopie optique. PhD thesis, Université de Nice-Sophia Antipolis, December 2002.
[250] Benoit Georis. Program Supervision Techniques for Easy Configuration of Video Understanding Systems. PhD thesis, Louvain Catholic University, January 2006.
[251] Céline Hudelot. Towards a Cognitive Vision Platform for Semantic Image Interpretation; Application to the Recognition of Biological Organisms. PhD thesis, Nice-Sophia Antipolis University, 2005.
[252] Nicolas Maillot. Ontology Based Object Learning and Recognition. PhD thesis, Nice-Sophia Antipolis University, December 2005.
[253] Maria Marcos. Verification and Validation of Program Supervision Systems. PhD thesis, Université de Jaume I (Castellón, Spain), February 1999.
[254] Sabine Moisan. Une plate-forme pour une programmation par composants de systèmes à base de connaissances. Habilitation à diriger les recherches, Université de Nice-Sophia Antipolis, April 1998.
[255] Nathanael Rota. Contribution à la reconnaissance de comportements humains à partir de séquences videos. PhD thesis, Université de Nice-Sophia Antipolis, October 2001.
[256] Monique Thonnat. Vers une vision cognitive : mise en oeuvre de connaissances et de raisonnements pour l'analyse et l'interprétation d'images. Habilitation à diriger les recherches, Université de Nice-Sophia Antipolis, October 2003.
[257] Jean-Philippe Vidal. Assistance au calage de modèles numériques en hydraulique fluviale : apports de l'intelligence artificielle. PhD thesis, INP Toulouse, March 2005.
[258] Van-Thinh Vu. Temporal Scenario for Automatic Video Interpretation. PhD thesis, Nice-Sophia Antipolis University, 2004.

A The Lama Platform

Lama is a software platform for knowledge-based system design. It is written in Java and C++ and contains toolkits to build and to adapt all the software elements that compose a knowledge-based system, such as knowledge bases, inference engines and verification tools.

A.1 Motivation and Approach

A knowledge-based system (KBS) is a program that performs complex processing, relying on declarative knowledge. Its architecture is composed of three main parts: a knowledge base storing expert knowledge on a particular domain, a fact base containing facts about an end-user problem in this domain, and an engine, written by a software designer, that performs inferences to solve the end-user problem based on the expert knowledge. From a software engineering point of view, building and maintaining a KBS is a difficult task because there is a huge semantic gap between the contents of the expert's mind and the capabilities of the regular programming languages on which the implementation relies. The lack of software engineering support is especially noticeable when reengineering existing systems to cope with evolutions and adaptations. Designing a knowledge-based system implies developing and maintaining several pieces of software: those necessary for running the KBS itself (i.e. the inference engine and the knowledge manager), and all the utilities for both experts and end-users to communicate with the system; for instance, knowledge editors and verification tools are needed by the experts during knowledge base development. These developments may be long, complex and error-prone. The notion of KBS generators emerged in the 80's to share, to a certain extent, a panel of common elements from which a KBS can be designed, such as an inference engine, a knowledge representation manager, verification tools, and various editors. A generator may yield many systems across many application domains (e.g. classification of pollens, of galaxies, of insects, of flowers...). However, throughout their rather long life time (typically more than 10 years), reconfigurations of generator elements are unavoidable.

A.2 Lama Contents

To support these changes, we propose a "meta generator" approach to go one step further and to generate or adapt the elements proposed by generators, i.e. inference engines, interfaces, knowledge base description languages, knowledge verification tools, etc. This approach is implemented in a software platform, named Lama (see Figure 8), which gathers several generic extensible toolkits to design, test, and modify these software elements. The platform relies on up-to-date software engineering approaches, mainly reusable components and frameworks. For instance, in order to (re)engineer engines, we propose a C++ component framework (library and usage configuration facilities) named Blocks. The platform also includes a C++ library for verifying knowledge bases, composed of functions to perform basic knowledge verifications on usual knowledge representation schemes (structured objects, or frames, and rules). Next, a framework for parsers and compilers of knowledge description languages allows expert-oriented languages to be created and translated into executable (C++) code. Finally, to customize graphic interfaces, a Java framework supports the development of editors for creating and browsing knowledge bases.

A typical example of a Lama toolkit is the Blocks framework, which implements generic data structures and methods for designing knowledge-based system engines without extensive code rewriting.
It is rooted in our extensive experience of the design of various KBS generators for computer aided design, classification, or planning, and in domains as different as civil engineering, astronomy, medicine or biology. Briefly, components in Blocks
correspond to interrelated classes, and more precisely to the first levels of class hierarchies. The structure of these (abstract) classes forms patterns for describing the concepts involved in an engine algorithm; generic functions or (abstract) methods of these classes constitute a kernel of basic instructions that can be redefined to implement a reasoning strategy. Designers can thus extend both the set of concepts and the algorithmic capabilities. Blocks provides a common layer that implements generic and abstract features useful for a large range of knowledge-based systems. Specific layers can be specialised for application purposes. The common layer supplies concrete and abstract classes that can be derived or composed, relationships among classes that can be extended, generic functions that can be parametrized by other functions, and (abstract) methods that can be redefined. It also provides a global organization of the classes, relationships and functions that must be respected by designers when developing a specialisation for a particular application. Respecting these constraints, that is enforcing the correct use of components, is an important issue addressed by model-checking techniques. Specialisations mainly lead to concrete classes, the instances of which will populate the knowledge bases, and methods or functions that will constitute reasoning steps in the algorithm of an engine strategy.

The general layer of Blocks consists of about 75 classes and 5,000 lines of C++ code, not including comments and utility classes that implement usual data structures (lists, sets, maps...). Except for these utilities, the remaining classes implement common concepts that can be found in KBSs. For instance, the classical artificial intelligence notions of frame and of inference rule are implemented as classes in the framework. Class interfaces are complete enough to cover most designer needs without modifications, but some points of flexibility (hooks) have been foreseen, in particular in methods. Specialization, composition and hooks allow designers to fine-tune an engine behavior.

Figure 8: Lama architecture and tools for engine design, knowledge base description, verification, and visualization.

A.3 Lama Status

We have used Lama to design several generators and thus several corresponding inference engines: three variants of planning engines (used e.g. at ENSI Tunis), two classification engines (used at INRA Sophia), and one model calibration engine (used at Cemagref Lyon).
A typical engine code introduces around 15 derived classes (specialised from Blocks ones), and 2 to 5 new ones. The number of extra code lines to write for the engine algorithm itself ranges from 250 to 800, which is a tractable size when it comes to modifications. This of course depends on the more or less sophisticated mechanisms that are to be implemented in the engine (e.g. an engine with history management and a backtracking mechanism requires more code lines than an engine that does not need to memorise its past actions).
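To give a concrete, if simplified, picture of this specialisation style, the sketch below shows a hypothetical engine skeleton in the spirit of Blocks: an abstract class fixes the control loop and exposes hook methods, and a small derived classification engine redefines those hooks. The names (Frame, Engine, ClassificationEngine) and the code are illustrative assumptions, not the actual Blocks interface.

    // Illustrative sketch only: hypothetical names, not the actual Blocks API.
    #include <iostream>
    #include <string>
    #include <utility>
    #include <vector>

    // A frame-like knowledge unit: attribute/value pairs attached to a named concept.
    struct Frame {
        std::string name;
        std::vector<std::pair<std::string, std::string>> attributes;
    };

    // Common layer: the control loop is fixed here, while the individual
    // reasoning steps are hooks (virtual methods) that a designer redefines.
    class Engine {
    public:
        virtual ~Engine() = default;

        // Fixed skeleton of the reasoning strategy.
        void run(const std::vector<Frame>& knowledgeBase, const Frame& fact) {
            initialize();
            for (const auto& frame : knowledgeBase)
                if (matches(frame, fact))            // hook
                    applyReasoningStep(frame, fact); // hook
            conclude();
        }

    protected:
        // "Points of flexibility": default or pure virtual, to be specialised.
        virtual void initialize() {}
        virtual bool matches(const Frame& frame, const Frame& fact) const = 0;
        virtual void applyReasoningStep(const Frame& frame, const Frame& fact) = 0;
        virtual void conclude() {}
    };

    // A designer's specialisation: a toy classification engine that labels the
    // fact with the name of every frame sharing an attribute/value pair with it.
    class ClassificationEngine : public Engine {
    protected:
        bool matches(const Frame& frame, const Frame& fact) const override {
            for (const auto& a : frame.attributes)
                for (const auto& b : fact.attributes)
                    if (a == b) return true;
            return false;
        }
        void applyReasoningStep(const Frame& frame, const Frame& fact) override {
            std::cout << fact.name << " classified as " << frame.name << "\n";
        }
    };

    int main() {
        std::vector<Frame> kb = {
            {"pine pollen",  {{"shape", "saccate"}}},
            {"grass pollen", {{"shape", "spherical"}}}};
        Frame observation{"sample 42", {{"shape", "spherical"}}};

        ClassificationEngine engine;
        engine.run(kb, observation);  // prints: sample 42 classified as grass pollen
        return 0;
    }

In a real Lama-based engine, the knowledge base instances and the redefined reasoning steps would of course be far richer; the point of the sketch is only the division of labour between a fixed common layer and designer-supplied specialisations.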


B The VSIP Platform

VSIP, the VideoSurveillance Intelligent Platform, is a software platform written in C and C++ for video sequence analysis.

B.1 Motivation and Approach

There is an increasing demand in the safety and security domains for intelligent videosurveillance systems. These systems are complicated to build from scratch and difficult to configure due to the large variety of scenes to process and to the number of processing steps involved in video sequence analysis. Most of these videosurveillance systems share common functionalities (such as motion detection or object tracking). We thus propose a platform, named VSIP, to help developers build videosurveillance systems.

This platform is composed of reusable components which can be combined for different applications. VSIP mainly contains: (1) image processing algorithms in charge of mobile object detection, classification and frame to frame tracking; (2) multi-camera algorithms for the spatial and temporal analysis (4D) of the detected mobile objects; (3) high level scenario recognition algorithms.

Figure 9 shows the global structure of a videosurveillance system built with VSIP. First, a motion detection step followed by frame to frame tracking is performed for each video camera. Then the tracked mobile objects coming from different video cameras with overlapping fields of view are fused into a unique 4D representation for the whole scene. Depending on the chosen application, a combination of one or more of the available trackers (individual, group and crowd trackers) is used. Scenario recognition is then performed by a combination of one or more of the available recognition algorithms (automaton based, Bayesian-network based, AND/OR tree based and temporal-constraints based). Finally the system generates the alerts corresponding to the predefined recognized scenarios.
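To make this composition more concrete, here is a minimal C++ sketch of such a pipeline, assuming hypothetical types (CameraChain, LongTermTracker, ScenarioRecognizer, SurveillancePipeline); it mirrors the structure of Figure 9 but is not the actual VSIP code, whose real module names are listed in Section B.2.

    // Illustrative sketch only: hypothetical types, not the actual VSIP modules.
    #include <cstddef>
    #include <string>
    #include <vector>

    struct Frame {};                      // one image from one camera
    struct MobileObject { int id; };      // a detected and tracked physical object
    struct Alert { std::string scenario; };

    // Per-camera chain: motion detection followed by frame-to-frame tracking.
    struct CameraChain {
        std::vector<MobileObject> process(const Frame& frame) {
            return trackFrameToFrame(detectMotion(frame));
        }
        std::vector<MobileObject> detectMotion(const Frame&) { return {}; }
        std::vector<MobileObject> trackFrameToFrame(std::vector<MobileObject> objects) {
            return objects;
        }
    };

    // Interchangeable long-term trackers (individuals, groups, crowd) and
    // scenario recognizers (automaton, Bayesian network, AND/OR tree,
    // temporal constraints), combined as the application requires.
    struct LongTermTracker {
        virtual ~LongTermTracker() = default;
        virtual void update(const std::vector<MobileObject>& fusedObjects) = 0;
    };
    struct ScenarioRecognizer {
        virtual ~ScenarioRecognizer() = default;
        virtual std::vector<Alert> recognize(const std::vector<MobileObject>& fusedObjects) = 0;
    };

    // Whole-scene pipeline: fuse the per-camera results into a single
    // representation, then run the configured trackers and recognizers.
    struct SurveillancePipeline {
        std::vector<CameraChain> cameras;
        std::vector<LongTermTracker*> trackers;
        std::vector<ScenarioRecognizer*> recognizers;

        std::vector<Alert> step(const std::vector<Frame>& framesPerCamera) {
            std::vector<MobileObject> fused;
            for (std::size_t i = 0; i < cameras.size(); ++i) {
                auto objects = cameras[i].process(framesPerCamera[i]);
                fused.insert(fused.end(), objects.begin(), objects.end()); // naive fusion
            }
            std::vector<Alert> alerts;
            for (auto* tracker : trackers) tracker->update(fused);
            for (auto* recognizer : recognizers) {
                auto recognized = recognizer->recognize(fused);
                alerts.insert(alerts.end(), recognized.begin(), recognized.end());
            }
            return alerts;
        }
    };

    int main() {
        SurveillancePipeline pipeline;
        pipeline.cameras.resize(2);           // two cameras observing the scene
        std::vector<Frame> frames(2);
        auto alerts = pipeline.step(frames);  // no trackers or recognizers configured yet
        return static_cast<int>(alerts.size());
    }

The design point illustrated here is the one stated above: per-camera processing, fusion into a single 4D scene representation, and a free combination of trackers and recognizers chosen per application.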

Figure 9: Global structure of an activity monitoring system built with VSIP.

B.2 Detailed Contents

The VSIP platform currently contains 26 modules for real-time video sequence analysis and a set of 19 tools for off-line processing and for building an application in videosurveillance.

The real-time modules are: Video Acquisition, Context Loading, Background Initialisation, Changes Adaptation, Door Detection, Image Segmentation, 2D Classification, 3D Classification, Blob Merging, Blob Separation, Blob Position Correction, Blob Filtering, Blob to Mobile Conversion, Frame to Frame Tracking, Ghost Removal, Person Detection, Noise Tracking, Noise Features, Background Updating, Camera Synchronisation, Fusion
Tracking, Long-term Tracking, Global Fusion of Distant Cameras, Global Tracking, Event Recognition based on Finite State Automata, and Temporal Event Recognition.

The off-line tools are: Datapool Manager, Application Controller, Makefiles, Video Recording, Video Header Insertion, Video Conversion, Module Parameters Tuning, Planar and Radially Distorted Camera Calibration, Scene Context Visualization, Performance Evaluation, Histogram Creation, 3D Scene Visualisation, Video Visualisation, Temporal Scenario Parser, Temporal Coherence Checking for Scenario, Trajectory Clustering, People and Event Statistics Calculation, Event Graphs Browser, and Unsupervised Scenario Learning.

B.3 VSIP Status

The modules have been registered at APP (the French agency for software patrimony protection) and transferred to Keeneo (http://www.keeneo.com) in 2005 for their industrialisation. Since then, these modules have evolved slightly, in particular with the addition of a 3D classification module. The main recent changes concern the tools for off-line analysis (e.g. Trajectory Clustering and Unsupervised Scenario Learning).


C Performances of Orion Videosurveillance Systems

C.1 Introduction

In the last three years Orion has obtained concrete results on intelligent videosurveillance systems [139]. In this annex we present a summary of the performances measured on real-world problems through several challenging evaluations led by industrials and end-users in the framework of the European project ADVISOR and two industrial projects, Cassiopee and Videa.

C.2 Evaluation Results

C.2.1 European Project ADVISOR

The objective of this European project was to show the feasibility of intelligent videosurveillance in metro stations. For more details see the ADVISOR site http://www-sop.inria.fr/orion/ADVISOR. The end-users (i.e. the metro security operators) have defined five main types of activities of interest, involving one person and a static object (jumping over a barrier and vandalism against a ticket vending machine), several interacting persons (fighting people), a group of persons and the physical environment (a group blocking a subway access), and a crowd (overcrowding of the metro platform).

The validation of our scenario recognition technique has been performed on videos taken with the operational CCTV visual surveillance cameras of two metro stations in Barcelona (Spain) and in Brussels (Belgium), with either real passengers or actors of dramatic schools acting scenarios defined by the end-users. Table 1 shows a summary of the results. One can see that the average true positive rate is very high (90%) and that the false alarm rate is low (i.e. 1.2%). Our approach analyses the activities in 4D and is not based on 2D data. Another measure, the accuracy, is also given: it shows the accuracy in time of the recognition, that is, what percentage of the duration of the shown activity is "covered" by the generation of the corresponding alert by the system. This value varies a lot with respect to the type of activity. In particular for fighting people, it is still very difficult to detect a precise beginning and ending of this type of behavior.

  Activity type     Nb of activities   True positives   Accuracy   False alarms
  fighting                 21               95 %          61 %     0 (over 16 hours)
  blocking                  9               78 %          60 %     1 (over 16 hours)
  vandalism                 2              100 %          71 %     0 (over 16 hours)
  jumping o.t.b.           42               88 %         100 %     0 (over 16 hours)
  overcrowding              7              100 %          80 %     0 (over 16 hours)
  TOTAL                    81               90 %          85 %     1 (over 16 hours)

Table 1: Results of the technical validation of our intelligent videosurveillance system in subways performed during the European project ADVISOR. For each scenario, we report in particular the percentage of recognized behaviors of this scenario (third column) and the accuracy in time of the recognition, that is, what percentage of the duration of the shown behavior is "covered" by the generation of the corresponding alert by the system; this value is an average over all the scenario instances (fourth column).
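The following C++ sketch shows one simple way to compute such a true positive rate and a temporal coverage "accuracy" from annotated activity intervals and alert intervals. It is a hypothetical illustration of the metric definitions given above, not the evaluation code used in ADVISOR, and the exact averaging convention over instances is an assumption.

    // Hypothetical illustration of the two headline figures above (not the
    // actual ADVISOR evaluation tooling): the true positive rate over annotated
    // activities, and the temporal "accuracy", i.e. the fraction of an
    // activity's duration covered by the corresponding alert.
    #include <algorithm>
    #include <cstddef>
    #include <iostream>
    #include <vector>

    struct Interval { double start, end; };  // in seconds

    // Fraction of the ground-truth interval covered by the alert interval.
    double coverage(const Interval& truth, const Interval& alert) {
        double overlap = std::max(0.0, std::min(truth.end, alert.end) -
                                       std::max(truth.start, alert.start));
        return overlap / (truth.end - truth.start);
    }

    int main() {
        // Annotated activities and the alerts the system produced for them;
        // an empty alert ({0, 0}) stands for a missed activity.
        std::vector<Interval> groundTruth = {{10.0, 30.0}, {50.0, 60.0}, {80.0, 95.0}};
        std::vector<Interval> alerts      = {{14.0, 30.0}, { 0.0,  0.0}, {82.0, 95.0}};

        int truePositives = 0;
        double coverageSum = 0.0;
        for (std::size_t i = 0; i < groundTruth.size(); ++i) {
            double c = coverage(groundTruth[i], alerts[i]);
            if (c > 0.0) { ++truePositives; coverageSum += c; }
        }
        std::cout << "true positive rate: "
                  << 100.0 * truePositives / groundTruth.size() << " %\n";
        // Averaging the coverage over recognized activities only is an assumption.
        if (truePositives > 0)
            std::cout << "average temporal accuracy: "
                      << 100.0 * coverageSum / truePositives << " %\n";
        return 0;
    }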


C.2.2 Industrial Project Cassiopee

The objective of this industrial project was to show the feasibility of intelligent surveillance in bank agencies [139]. In this project a first set of activities, bank attacks inside the public area of the bank agency, has been defined by the security managers. As these events are rare, the evaluation has been made on videos acted by bank agency employees. The videos were taken in real environments in three bank agencies to test different configurations (different sizes, geometries and illuminations). A second type of activity has been defined by the bank agency security managers: unauthorized access to a secure zone (access limited to one employee). In this case it has been possible to evaluate our approach live during a total of 10 days. The results (see Table 2) show that the false alarm rate is again very low (only 2 false alarms during 10 days) thanks to the 4D analysis of videos taken by synchronized cameras.

  Activity type                         Nb of activities    True positives   False negatives   False alarms
  bank attack with 2 persons                   10             10 (100 %)        0 (0 %)        0 (over 1 hour)
  bank attack with 3 persons                   16             15 (93.75 %)      1 (6.25 %)     0 (over 1 hour)
  unauthorized access to secure zone    20 (over 10 days)     20 (100 %)        0 (0 %)        2 (over 10 days)

Table 2: Validation results for a live installation of the bank agency monitoring system (industrial project Cassiopee).

C.2.3 Industrial Project Videa

The objective of this cooperation was to use our videosurveillance platform VSIP to build two products: one for access control of a building and one for urban violence detection.

The access control functionality was defined as counting the persons in the lock-chamber of a building and determining where they are coming from and where they are going to (origin and destination), using only one video camera. The main difficulties were to locate the persons precisely, because the entrances of the building have transparent doors, and to estimate the number of persons precisely, because lock-chambers are very narrow. The evaluation has been made on a real site with the existing video cameras. The results (shown in Table 3) are very high: between 92.9 % and 96 % true positive rates and from 0 % to 2 % false alarm rates.

The urban violence functionality has been defined as violent behavior between several persons in a group in a street or in an open area. The evaluation has been made on two real sites in Namur, Belgium, with data taken by the existing videosurveillance cameras. The behaviors have been played by professional actors under the control of police officers. The main difficulties are the fact that the activity involves a group of persons (two or more) and that it happens outside, in a busy environment, along with the normal vehicle and pedestrian traffic, and is observed with a far field of view video camera (80 meters). The results (shown in Table 3) range between 58 % and 61 %, which is considered promising by the end-user assessment but which is far from the rate obtained when monitoring a lock-chamber of a building.


  Activity type                                        Nb of activities   True positives   False negatives   False alarms
  small lock-chamber, 1 person passing alone                  17             94.10 %           5.90 %        1 (over 4 hours)
  small lock-chamber, >= 2 persons passing together           25             96.00 %           4.00 %        1 (over 4 hours)
  large lock-chamber, 1 person passing alone                  72             94.50 %           5.50 %        4 (over 4 hours)
  large lock-chamber, >= 2 persons passing together           28             92.90 %           7.10 %        2 (over 4 hours)
  violence in urban site 1                                    26             58 %              42 %          1 (over 4 hours)
  violence in urban site 2                                    26             61 %              39 %          2 (over 2 hours)

Table 3: Validation results for a lock-chamber access monitoring system and an urban violence videosurveillance system (industrial project Videa).

C.3 Conclusion

In conclusion, these performance evaluations show that our current systems are very robust for simple activities mainly based on the 3D localisation of a small number of persons inside a metro station, a bank agency or a building. Globally there are more than 90% true positive (or correct) detections and fewer than 2 false alarms over several hours. The results for more complex activities such as fighting are promising (95% true positives in the case of isolated persons inside a metro station and about 60% in the case of a far field of view of a busy urban environment), but these results are fragile and need to be improved.
