This project has received funding from Horizon 2020, European Union’s
Framework Programme for Research and Innovation, under grant agreement
No. 653355
Deliverable D1.4 Data Management Plan
Work Package 1 Project Management
FORENSOR Project
Grant Agreement No. 653355
Call H2020-FCT-2014-2015 “Fight against Crime and terrorism”
Topic FCT-05-2014 “Law enforcement capabilities topic 1: Develop novel monitoring systems and
miniaturised sensors that improve Law Enforcement Agencies' evidence-gathering abilities”
Start date of the project: 1 September 2015
Duration of the project: 36 months
D1.4 Data Management Plan
FORENSOR Project Page 2 of 26
Disclaimer
This document contains material, which is the copyright of certain FORENSOR contractors, and
may not be reproduced or copied without permission. All FORENSOR consortium partners have
agreed to the full publication of this document. The commercial use of any information contained
in this document may require a license from the proprietor of that information. The reproduction
of this document or of parts of it requires an agreement with the proprietor of that information.
The document must be referenced if used in a publication.
The FORENSOR consortium consists of the following partners.
No. | Name | Short Name | Country
1 | ETHNIKO KENTRO EREVNAS KAI TECHNOLOGIKIS ANAPTYXIS | CERTH | GR
2 | JCP-CONNECT SAS | JCP-C | FR
3 | STMICROELECTRONICS SRL | STM | IT
4 | FONDAZIONE BRUNO KESSLER | FBK | IT
5 | EMZA VISUAL SENSE LTD | EMZA | IL
6 | SYNELIXIS LYSEIS PLIROFORIKIS AUTOMATISMOU & TILEPIKOINONION MONOPROSOPI EPE | SYNELIXIS | GR
7 | VRIJE UNIVERSITEIT BRUSSEL | VUB | BE
8 | ALMAVIVA - THE ITALIAN INNOVATION COMPANY SPA | ALMAVIVA | IT
9 | VISIONWARE-SISTEMAS DE INFORMACAO SA | VISIONWARE | PT
10 | VALENCIA LOCAL POLICE | PLV | ES
11 | POLÍCIA JUDICIÁRIA (MINISTÉRIO DA JUSTIÇA) | MJ | PT
Document Information
Project short name and number | FORENSOR (653355)
Work package | WP1
Number | D1.4
Title | Data Management Plan
Responsible beneficiary | CERTH
Involved beneficiaries | JCP-C
Type [1] | R
Dissemination level [2] | PU
Contractual date of delivery | 30/11/2015
Last update | 30/11/2015
[1] Types. R: Document, report (excluding the periodic and final reports); DEM: Demonstrator, pilot, prototype, plan designs; DEC: Websites, patents filing, press & media actions, videos, etc.; OTHER: Software, technical diagram, etc.
[2] Dissemination levels. PU: Public, fully open, e.g. web; CO: Confidential, restricted under conditions set out in Model Grant Agreement; CI: Classified, information as referred to in Commission Decision 2001/844/EC.
Document History
Version | Date | Status | Authors, Reviewers | Description
v 0.01 | 05/10/2015 | Template | CERTH | Project deliverable template.
v 0.02 | 11/11/2015 | Draft | CERTH | Adjustment of the project deliverable template.
v 0.03 | 12/11/2015 | Draft | CERTH | Document structure, ToC, sections content in bulleted form with brief explanation, writing allocation. Note: This version was submitted to the WPL, PTM, PSC, and PMC for internal review.
v 0.04 | 17/11/2015 | Draft | WPL, PTM, PSC, PMC, CERTH | Feedback to v0.03 of the document by WPL, PTM, PSC, and PMC. Provision of a full description of a dataset as template for partner input. Introduction content in bulleted form.
v 0.05 | 20/11/2015 | Final draft | VUB, EMZA, ALMAVIVA, CERTH | Inputs and feedback to v0.04 of the document by VUB (Section 2), EMZA (Section 3), and ALMAVIVA (Section 4). Final development of all sections including Introduction and Executive Summary. Alphabetical ordering of acronyms/ abbreviations. Note: This version was submitted to the QM, PSC, and PMC. The QM forwarded this version to VIS and STM for internal review.
v 1.00 | 30/11/2015 | Final | VIS, STM, VUB, CERTH | Internal review feedback to v0.05 of the document by VIS, STM, VUB (Sections 2.4, 2.5, 3.4, 3.5, 4.4, 4.5, 5.4, 5.5), and CERTH. Final document for submission.
Acronyms and Abbreviations
Acronym/Abbreviation Description
CC0 Creative Commons “No Rights Reserved” license
DaPPECL Data Protection, Privacy, Ethical and Criminal Law
DMP Data Management Plan
DOI Digital Object Identifier
FORENSOR FOREnsic evidence gathering autonomous seNSOR
FT Field Test
GA Grant Agreement
HTTP Hypertext Transfer Protocol
HW Hardware
IPR Intellectual Property Rights
LEA Law Enforcement Authority
OAI-PMH Open Archives Initiative Protocol for Metadata Harvesting
PMC Project Management Committee
PSC Project Steering Committee
PTM Project Technical Manager
QVGA Quarter Video Graphics Array
SW Software
ToC Table of Contents
TX.Y Task X.Y
UC Use Case
VGA Video Graphics Array
WPL Work Package Leader
WPX Work Package X
Contents
1 Introduction
2 Project Datasets
2.1 DaPPECL Impact Assessment Survey Data (FORENSOR DaPPECL Impact Assessment Data)
2.1.1 Dataset reference and name
2.1.2 Dataset description
2.2 Staged Surveillance Videos and FORENSOR Emulated Staged Surveillance Videos (FORENSOR Development and Testing Benchmark)
2.2.1 Dataset reference and name
2.2.2 Dataset description
2.3 Real Life Surveillance Videos (FORENSOR Pilots/ Field Tests Content)
2.3.1 Dataset reference and name
2.3.2 Dataset description
2.4 Automatically Obtained Image and Video Forensic Evidence (FORENSOR Pilots/ Field Tests Testbed Data)
2.4.1 Dataset reference and name
2.4.2 Dataset description
3 Standards and Metadata
3.1 General note on dataset structure
3.2 FORENSOR DaPPECL IA SD Dataset
3.3 Other datasets
4 Data Sharing
4.1 FORENSOR DaPPECL IA SD Dataset
4.2 Other datasets
4.2.1 Access and licensing
4.2.2 Re-use
4.2.3 Reasons for not sharing
5 Archiving and Preservation (Including Storage and Backup)
5.1 FORENSOR DaPPECL IA SD Dataset
5.2 Other datasets
6 Appendix I: Relevant ZENODO Archiving and Preservation Policies
6.1 Retention period
6.2 Functional preservation
6.3 File preservation
6.4 Fixity and authenticity
6.5 Succession plans
6.6 What does it cost?
List of Tables
Table 1: Overview of the research data of the FORENSOR project
Table 2: DaPPECL Impact Assessment Survey Data (FORENSOR DaPPECL Impact Assessment Data)
Table 3: FORENSOR DaPPECL IA SD Dataset description
Table 4: Staged Surveillance Videos and FORENSOR Emulated Staged Surveillance Videos (FORENSOR Development and Testing Benchmark)
Table 5: FORENSOR DTB and FORENSOR DTB-E Datasets description
Table 6: Real Life Surveillance Videos (FORENSOR Pilots/ Field Tests Content)
Table 7: FORENSOR PFT Dataset description
Table 8: Automatically Obtained Image and Video Forensic Evidence (FORENSOR Pilots/ Field Tests Testbed Data)
Table 9: FORENSOR Evidence Dataset description
Executive Summary
A further new element in Horizon 2020 is the use of Data Management Plans (DMPs) detailing
what data the project will generate, whether and how it will be exploited or made accessible for
verification and re-use, and how it will be curated and preserved.
The present deliverable constitutes the DMP of the FORENSOR project, i.e. a short, general outline
of the project policy for data management. The policy described herein reflects the current state
of consortium agreements regarding data management and is consistent with those referring to
exploitation and protection of results.
This deliverable can also be considered as a checklist for the future. It is a living document that is
expected to mature during the project lifetime and will be updated accordingly. The overall purpose
of the DMP is to support the data management life cycle for all data that will be collected,
processed or generated by the project.
Five different datasets have been identified at this early stage of the project. In Sections 2-5 we
provide a detailed presentation of all the data that are expected to be collected, processed or
generated by the FORENSOR project along with respective data management policies.
The first project dataset (FORENSOR DaPPECL IA SD Dataset) will consist of all filled-in questionnaires
that will be collected within the context of the project Data Protection, Privacy, Ethical and
Criminal Law (DaPPECL) Impact Assessment.
The second (FORENSOR DTB Dataset) and third (FORENSOR DTB-E Dataset) project datasets will
be the Development and Testing Benchmark of the FORENSOR project.
The fourth project dataset (FORENSOR PFT Dataset) will consist of all real life surveillance videos
that will be captured within the context of the project pilots and field tests.
Finally, the fifth project dataset (FORENSOR Evidence Dataset) will consist of all parts of the
FORENSOR PFT Dataset that will be automatically identified by the FORENSOR system as forensic
evidence.
The five datasets mentioned above are specified in this document and details are given regarding
their standards, metadata, sharing, archiving and preservation.
1 Introduction
The present deliverable constitutes the Data Management Plan (DMP) of the FORENSOR project,
i.e. a short, general outline of the project policy for data management, including the following
issues:
• What types of data will the project collect/generate?
• What standards will be used?
• How will this data be exploited and/or shared/made accessible for verification and re-use? Reasons why the data cannot be made available in some cases.
• How will this data be curated and preserved?
In other words, the DMP is a document outlining how research data will be handled during the
FORENSOR project, and after it is completed.
The overall purpose of the DMP is to support the data management life cycle
for all data that will be collected, processed or generated by the project.
The policy described herein reflects the current state of consortium agreements regarding data
management and is consistent with those referring to exploitation and protection of results.
This deliverable can also be considered as a checklist for the future. It is a living document that is
expected to mature during the project lifetime and will be updated accordingly.
Five different datasets have been identified at this early stage of the project. Table 1 gives an
overview of all the data that are expected to be collected, processed or generated by the
FORENSOR project.
Table 1: Overview of the research data of the FORENSOR project
Dataset ID | Dataset Name
1. FORENSOR DaPPECL IA SD Dataset | DaPPECL Impact Assessment Survey Data
2. FORENSOR DTB Dataset | Staged Surveillance Videos
3. FORENSOR DTB-E Dataset | FORENSOR Emulated Staged Surveillance Videos
4. FORENSOR PFT Dataset | Real life surveillance videos
5. FORENSOR Evidence Dataset | Automatically Obtained Image and Video Forensic Evidence
The five datasets listed in Table 1 are described in Section 2 of this deliverable while extensive
details regarding standards, metadata, sharing, archiving and preservation of the datasets are
provided in Sections 3-6.
2 Project Datasets
2.1 DaPPECL Impact Assessment Survey Data (FORENSOR DaPPECL Impact Assessment Data)
2.1.1 Dataset reference and name
Dataset and Subset IDs, names and references are shown in Table 2.
Table 2: DaPPECL Impact Assessment Survey Data (FORENSOR DaPPECL Impact Assessment Data)
Dataset ID | Dataset Name | Dataset Reference
FORENSOR DaPPECL IA SD Dataset | DaPPECL Impact Assessment Survey Data | Link will be provided at a later stage of the FORENSOR project.
1st Level Subset ID | 1st Level Subset Name | 1st Level Subset Reference
FORENSOR DaPP IA SD Subset | Data Protection and Privacy Impact Assessment Survey Data | Same as above.
FORENSOR E IA SD Subset | Ethical Impact Assessment Survey Data | Same as above.
FORENSOR CL IA SD Subset | Criminal Law Impact Assessment Survey Data | Same as above.
2.1.2 Dataset description
The FORENSOR DaPPECL IA SD Dataset will consist of all filled-in questionnaires that will be collected
within the context of the project Data Protection, Privacy, Ethical and Criminal Law (DaPPECL)
Impact Assessment and specifically within T2.2 “Impact assessment against DaPPECL requirements”
of the project (M7-M12 or March to August 2016). In detail, the collected data will comprise
written answers concerning the perceived implications of DaPPECL issues for the FORENSOR
project and any FORENSOR technology that may be developed.
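The project-month notation used throughout this deliverable (e.g. M7-M12) can be checked against calendar dates with a few lines of arithmetic. The following is a minimal sketch, assuming that M1 is the calendar month of the project start date given on the cover page (1 September 2015), which agrees with the "M7-M12 or March to August 2016" pairing above. The function and constant names are illustrative and not part of the project.

```python
from datetime import date
from typing import Tuple

# Project start date as stated on the cover page of this deliverable.
PROJECT_START = date(2015, 9, 1)

MONTH_NAMES = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]


def project_month(m: int, start: date = PROJECT_START) -> Tuple[int, int]:
    """Return (year, month) of project month M<m>, assuming M1 is the start month."""
    if m < 1:
        raise ValueError("project months are numbered from M1")
    # Count months from January of the start year, then split into year and month.
    offset = (start.month - 1) + (m - 1)
    return start.year + offset // 12, offset % 12 + 1


def describe(first: int, last: int) -> str:
    """Render a task span such as M7-M12 as calendar months."""
    y1, m1 = project_month(first)
    y2, m2 = project_month(last)
    return (f"M{first}-M{last}: {MONTH_NAMES[m1 - 1]} {y1} "
            f"to {MONTH_NAMES[m2 - 1]} {y2}")


if __name__ == "__main__":
    # Task spans quoted in Section 2 of this deliverable.
    for span in [(7, 12), (7, 36), (1, 20), (7, 18), (16, 32)]:
        print(describe(*span))
```

The same helper can be applied to any task span quoted in Section 2 to cross-check the calendar months given next to it.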
FORENSOR DaPPECL IA SD Dataset will be processed within the same task (T2.2) in order to
facilitate the assessment of the FORENSOR system:
• upon personal data protection and privacy, and formulate recommendations on how to improve the observance of data protection and privacy in the FORENSOR system;
• against a specially developed ethical framework for the impact of the FORENSOR system upon the applicable ethical principles, taking into account the specific features of that system, and formulate recommendations on how to improve the observance of applicable ethical principles in the FORENSOR system.
FORENSOR DaPPECL IA SD Dataset will also be used within T2.3 “Continuous monitoring of the
impacts of the FORENSOR upon personal data protection, privacy, ethical and criminal law
(DaPPECL) requirements” (M7-M36 or March 2016 to August 2018) in order to monitor the
impacts of FORENSOR on the requirements identified in DaPPECL, as the system is integrated,
tested and evaluated in the field trials.
Finally, another purpose of the collection of this information (FORENSOR DaPPECL IA SD Dataset)
is to alert all partners to potential issues concerning DaPPECL matters that may not be
immediately obvious, especially partners who do not have expertise in the area concerned.
The FORENSOR DaPPECL IA SD Dataset will be structured as shown in Table 2.
Key attributes, characteristics and other information regarding FORENSOR DaPPECL IA SD Dataset
are presented in Table 3.
Table 3: FORENSOR DaPPECL IA SD Dataset description
Description of the data that will be generated or collected | Quantitative and qualitative survey data from filled-in DaPPECL Impact Assessment questionnaires.
Data origin (in case it is collected) | Content acquired in the context of the DaPPECL Impact Assessment of the FORENSOR EU funded project (GA No. 653355).
Data nature | Filled-in DaPPECL Impact Assessment questionnaires.
Data scale | Tens of filled-in questionnaires with fewer than 100 answered questions each.
To whom could the data be useful | Researchers involved in projects similar to FORENSOR, i.e. concerning the use of surveillance technologies in the area of criminal investigations.
Do the data underpin a scientific publication | To be determined in due time. [3]
Information on the existence (or not) of similar data | Data protection and privacy Impact Assessment questionnaires are often used for a variety of reasons (social sciences research, internal or external evaluation of security systems, etc.). However, proper public archiving of the respective datasets is much less common. To the best of our knowledge, ethical and criminal law Impact Assessment questionnaires have not been used by other researchers in the past.
Information on the possibilities for integration and reuse of the data | The data could be reused for comparison purposes or subjected to different analyses in projects concerning the potential use of surveillance technologies for investigation of criminal activity.
[3] Any developments will be duly reported in subsequent versions of the present deliverable.
2.2 Staged Surveillance Videos and FORENSOR Emulated Staged Surveillance Videos (FORENSOR Development and Testing Benchmark)
2.2.1 Dataset reference and name
Dataset and Subset IDs, names and references are shown in Table 4.
Table 4: Staged Surveillance Videos and FORENSOR Emulated Staged Surveillance Videos (FORENSOR Development and Testing Benchmark)
Dataset ID | Dataset Name | Dataset Reference
FORENSOR DTB Dataset | Staged Surveillance Videos | Link will be provided at a later stage of the FORENSOR project.
FORENSOR DTB-E Dataset | FORENSOR Emulated Staged Surveillance Videos | Link will be provided at a later stage of the FORENSOR project.
1st Level Subset ID | 1st Level Subset Name | 1st Level Subset Reference
FORENSOR DTB1 Subset | Unfrequented provincial road surveillance | Same as above.
FORENSOR DTB2 Subset | Illegal trafficking in desolate coastal area | Same as above.
FORENSOR DTB3 Subset | Burglary at private house located in hamlet/ village | Same as above.
FORENSOR DTB4 Subset | Pedestrian and bicycle path surveillance | Same as above.
FORENSOR DTB-E1 Subset | Unfrequented provincial road surveillance | Same as above.
FORENSOR DTB-E2 Subset | Illegal trafficking in desolate coastal area | Same as above.
FORENSOR DTB-E3 Subset | Burglary at private house located in hamlet/ village | Same as above.
FORENSOR DTB-E4 Subset | Pedestrian and bicycle path surveillance | Same as above.
2nd Level Subset ID | 2nd Level Subset Name | 2nd Level Subset Reference
FORENSOR DTB1.1 Subset | Car crash in unfrequented provincial road | Same as above.
FORENSOR DTB1.2 Subset | Illegal garbage disposal in unfrequented provincial road adjacent to a protected/ conservation area | Same as above.
FORENSOR DTB1.3 Subset | Zigzagging/ Driving under the influence of substances in unfrequented provincial road | Same as above.
FORENSOR DTB2.1 Subset | Vehicles passing by through desolate coast access road | Same as above.
FORENSOR DTB2.2 Subset | People passing by through desolate coast access road | Same as above.
FORENSOR DTB2.3 Subset | Vessel approaching desolate coast | Same as above.
FORENSOR DTB2.4 Subset | People walking in desolate beach | Same as above.
FORENSOR DTB3.1 Subset | Vehicle stationary (suspicious activity) nearby private house located in hamlet/ village | Same as above.
FORENSOR DTB3.2 Subset | Person stationary (suspicious activity) nearby private house located in hamlet/ village | Same as above.
FORENSOR DTB3.3 Subset | People trespassing gate or fence of private house located in hamlet/ village | Same as above.
FORENSOR DTB4.1 Subset | Vehicle that is not a bicycle enters pedestrian and bicycle path | Same as above.
FORENSOR DTB4.2 Subset | People entering pedestrian and bicycle path | Same as above.
FORENSOR DTB-E1.1 Subset | Car crash in unfrequented provincial road | Same as above.
FORENSOR DTB-E1.2 Subset | Illegal garbage disposal in unfrequented provincial road adjacent to a protected/ conservation area | Same as above.
FORENSOR DTB-E1.3 Subset | Zigzagging/ Driving under the influence of substances in unfrequented provincial road | Same as above.
FORENSOR DTB-E2.1 Subset | Vehicles passing by through desolate coast access road | Same as above.
FORENSOR DTB-E2.2 Subset | People passing by through desolate coast access road | Same as above.
FORENSOR DTB-E2.3 Subset | Vessel approaching desolate coast | Same as above.
FORENSOR DTB-E2.4 Subset | People walking in desolate beach | Same as above.
FORENSOR DTB-E3.1 Subset | Vehicle stationary (suspicious activity) nearby private house located in hamlet/ village | Same as above.
FORENSOR DTB-E3.2 Subset | Person stationary (suspicious activity) nearby private house located in hamlet/ village | Same as above.
FORENSOR DTB-E3.3 Subset | People trespassing gate or fence of private house located in hamlet/ village | Same as above.
FORENSOR DTB-E4.1 Subset | Vehicle that is not a bicycle enters pedestrian and bicycle path | Same as above.
FORENSOR DTB-E4.2 Subset | People entering pedestrian and bicycle path | Same as above.
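The two-level naming scheme of Table 4 (dataset, 1st level subset per use case, 2nd level subset per event) is regular enough to be generated rather than typed by hand. The sketch below is one possible in-memory representation. The class and function names are hypothetical and are not taken from the project. Only the ID pattern and the subset names come from Table 4.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Subset:
    """One row of the subset tables: an ID, a name, and optional child subsets."""
    subset_id: str
    name: str
    children: List["Subset"] = field(default_factory=list)


# Use cases (1st level) and their events (2nd level), in the order of Table 4.
USE_CASES: Dict[str, List[str]] = {
    "Unfrequented provincial road surveillance": [
        "Car crash in unfrequented provincial road",
        "Illegal garbage disposal in unfrequented provincial road "
        "adjacent to a protected/ conservation area",
        "Zigzagging/ Driving under the influence of substances "
        "in unfrequented provincial road",
    ],
    "Illegal trafficking in desolate coastal area": [
        "Vehicles passing by through desolate coast access road",
        "People passing by through desolate coast access road",
        "Vessel approaching desolate coast",
        "People walking in desolate beach",
    ],
    "Burglary at private house located in hamlet/ village": [
        "Vehicle stationary (suspicious activity) nearby private house "
        "located in hamlet/ village",
        "Person stationary (suspicious activity) nearby private house "
        "located in hamlet/ village",
        "People trespassing gate or fence of private house "
        "located in hamlet/ village",
    ],
    "Pedestrian and bicycle path surveillance": [
        "Vehicle that is not a bicycle enters pedestrian and bicycle path",
        "People entering pedestrian and bicycle path",
    ],
}


def build_subsets(prefix: str) -> List[Subset]:
    """Build the 1st and 2nd level subsets for a dataset prefix such as 'DTB'."""
    first_level = []
    for i, (use_case, events) in enumerate(USE_CASES.items(), start=1):
        children = [
            Subset(f"FORENSOR {prefix}{i}.{j} Subset", event)
            for j, event in enumerate(events, start=1)
        ]
        first_level.append(Subset(f"FORENSOR {prefix}{i} Subset", use_case, children))
    return first_level


if __name__ == "__main__":
    for subset in build_subsets("DTB-E"):
        print(subset.subset_id, "|", subset.name)
        for child in subset.children:
            print("  ", child.subset_id, "|", child.name)
```

Calling `build_subsets("DTB")` and `build_subsets("DTB-E")` reproduces the subset IDs listed in Table 4. The use-case names are hard-coded here only to keep the sketch self-contained.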
2.2.2 Dataset description
The FORENSOR DTB Dataset will consist of all staged surveillance videos that will be captured within
the context of project development and testing and specifically (but not exclusively) within T5.1
“Embedded low-level algorithms” of the project (M1-M20 or September 2015 to April 2017). Because
the FORENSOR project foresees extensive hardware development towards the creation of the
sensor of the same name, and because the project benchmark is required as early as possible for
developing algorithms and designing hardware and software, the FORENSOR DTB Dataset will
initially be captured using a generic, low resolution, grey-level consumer camera, i.e. a vision
system whose characteristics (e.g. resolution, sensitivity, etc.) are as similar as possible to those
of the novel FORENSOR Low-Power Imager.
Subsequently, the FORENSOR DTB Dataset will be processed (within T5.1) by a software emulator
that will be developed specifically for this purpose and will emulate the behaviour of the
Low-Power Imager, i.e. the operations of the FORENSOR chip. The full set of videos that will be
generated in this way, i.e. the output of the emulator (low resolution, binary or grey-level videos),
will be the FORENSOR DTB-E Dataset.
The two datasets (the original and the emulated one) will be the Development and Testing Benchmark of
the FORENSOR project and will subsequently be used within WP5 “Ultra-low-power vision algorithms”
(and potentially other technical WPs, e.g. WP6 and WP7) in order to test the developed low- and
high-level algorithms and compare their performance to that of current state-of-the-art algorithms.
The two datasets will include both training and testing videos; the training videos will depict
both ‘positive’ and ‘negative’ events.
The FORENSOR DTB and FORENSOR DTB-E Datasets will be structured as shown in Table 4.
Key attributes, characteristics and other information regarding the FORENSOR DTB and
FORENSOR DTB-E Datasets are presented in Table 5.
Table 5: FORENSOR DTB and FORENSOR DTB-E Datasets description
Description of the data that will be generated or collected | FORENSOR DTB Dataset | Low resolution (VGA [i]), grey-level (256 grey levels) videos captured by a generic consumer camera from a stable viewpoint.
Description of the data that will be generated or collected | FORENSOR DTB-E Dataset | Low resolution (VGA [i] or QVGA [ii]), binary or grey-level (256 grey levels) videos produced by a SW emulator of the FORENSOR chip.
Data origin (in case it is collected) | FORENSOR DTB Dataset | Content acquired in the context of project development and testing of the FORENSOR EU funded project (GA No. 653355).
Data origin (in case it is collected) | FORENSOR DTB-E Dataset | Content generated in the context of project development and testing of the FORENSOR EU funded project (GA No. 653355).
Data nature | Both datasets | Videos.
Data scale | FORENSOR DTB Dataset | Several hours of video (30 mins to 10 hours estimate) or video data in the order of tens or hundreds of gigabytes (> 15 GB, < 350 GB estimate).
Data scale | FORENSOR DTB-E Dataset | Several hours of video (30 mins to 10 hours estimate) or video data in the order of gigabytes (> 1 GB, < 350 GB estimate).
To whom could the data be useful | Both datasets | Image and video processing community.
Do the data underpin a scientific publication | Both datasets | To be determined in due time. [4]
Information on the existence (or not) of similar data | Both datasets | Similar datasets or even databases of scientific or other origin do exist.
Information on the possibilities for integration and reuse of the data | Both datasets | The data could be merged with existing and/ or under development staged surveillance video datasets/ databases. The data could be easily reused for scientific research or for testing and validation of commercial products, in the field of image and video processing.
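The data-scale figures in Table 5 can be sanity-checked with a back-of-the-envelope calculation. The sketch below estimates the size of uncompressed video from resolution, bit depth, frame rate and duration. The VGA and QVGA frame sizes are the standard 640×480 and 320×240. The frame rate of 30 frames per second and the absence of compression are assumptions made here for illustration only, since the deliverable does not state how its estimates were derived.

```python
# Standard frame sizes (width, height) in pixels.
VGA = (640, 480)
QVGA = (320, 240)

# Assumption: the deliverable does not state a frame rate; 30 fps is illustrative.
ASSUMED_FPS = 30


def raw_video_bytes(resolution, bits_per_pixel, seconds, fps=ASSUMED_FPS):
    """Size in bytes of uncompressed video, ignoring container and codec overhead."""
    width, height = resolution
    return width * height * bits_per_pixel * fps * seconds // 8


def to_gb(n_bytes):
    """Convert bytes to decimal gigabytes (1 GB = 10**9 bytes)."""
    return n_bytes / 1e9


if __name__ == "__main__":
    durations = [("30 min", 30 * 60), ("10 h", 10 * 3600)]
    formats = [
        ("VGA, 8-bit grey", VGA, 8),
        ("QVGA, 8-bit grey", QVGA, 8),
        ("QVGA, binary", QVGA, 1),
    ]
    for label, resolution, bits in formats:
        for duration_label, seconds in durations:
            size = to_gb(raw_video_bytes(resolution, bits, seconds))
            print(f"{label}, {duration_label}: {size:.1f} GB")
```

Under these assumptions, 30 minutes of VGA grey-level video comes to roughly 16.6 GB and 10 hours to roughly 331.8 GB, which falls within the range quoted for the FORENSOR DTB Dataset. Any compression, or a different frame rate, would change these numbers.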
2.3 Real Life Surveillance Videos (FORENSOR Pilots/ Field Tests Content)
2.3.1 Dataset reference and name
Dataset and Subset IDs, names and references are shown in Table 6.
Table 6: Real Life Surveillance Videos (FORENSOR Pilots/ Field Tests Content)
Dataset ID | Dataset Name | Dataset Reference
FORENSOR PFT Dataset | Real life surveillance videos | Link will be provided at a later stage of the FORENSOR project.
1st Level Subset ID | 1st Level Subset Name | 1st Level Subset Reference
FORENSOR PFT1 Subset | Unfrequented provincial road surveillance | Same as above.
FORENSOR PFT2 Subset | Illegal trafficking in desolate coastal area | Same as above.
FORENSOR PFT3 Subset | Burglary at private house located in hamlet/ village | Same as above.
FORENSOR PFT4 Subset | Pedestrian and bicycle path surveillance | Same as above.
2nd Level Subset ID | 2nd Level Subset Name | 2nd Level Subset Reference
FORENSOR PFT1.1 Subset | Car crash in unfrequented provincial road | Same as above.
FORENSOR PFT1.2 Subset [5] | Illegal garbage disposal in unfrequented provincial road adjacent to a protected/ conservation area | Same as above.
FORENSOR PFT1.3 Subset | Zigzagging/ Driving under the influence of substances in unfrequented provincial road | Same as above.
FORENSOR PFT2.1 Subset | Vehicles passing by through desolate coast access road | Same as above.
FORENSOR PFT2.2 Subset | People passing by through desolate coast access road | Same as above.
FORENSOR PFT2.3 Subset | Vessel approaching desolate coast | Same as above.
FORENSOR PFT2.4 Subset | People walking in desolate beach | Same as above.
FORENSOR PFT3.1 Subset | Vehicle stationary (suspicious activity) nearby private house located in hamlet/ village | Same as above.
FORENSOR PFT3.2 Subset | Person stationary (suspicious activity) nearby private house located in hamlet/ village | Same as above.
FORENSOR PFT3.3 Subset | People trespassing gate or fence of private house located in hamlet/ village | Same as above.
FORENSOR PFT4.1 Subset | Vehicle that is not a bicycle enters pedestrian and bicycle path | Same as above.
FORENSOR PFT4.2 Subset [6] | People entering pedestrian and bicycle path | Same as above.
[4] Any developments will be duly reported in subsequent versions of the present deliverable.
[5] FORENSOR PFT1.2 Subset is under consideration.
2.3.2 Dataset description
The FORENSOR PFT Dataset will consist of all real life surveillance videos that will be captured within
the context of the project pilots and field tests and specifically within T8.1 “Content acquisition”
of the project (M7-M18 or March 2016 to February 2017); it will be the full content of the project pilots
and field tests. The FORENSOR PFT Dataset will then be processed within T8.2 “Pilots” (M16-M32 or
December 2016 to April 2018) in order to:
a. drive the demonstration and highlight the innovative functionalities of FORENSOR, and
b. validate the project in real applications in three European countries: Spain, Portugal
and Italy.
The FORENSOR PFT Dataset will be structured as shown in Table 6.
Key attributes, characteristics and other information regarding FORENSOR PFT Dataset are
presented in Table 7.
Table 7: FORENSOR PFT Dataset description
Description of the data that will be generated or collected | Low resolution (VGA [i]), binary or grey-level (256 grey levels) videos captured by a low power static rigged visual sensor.
Data origin (in case it is collected) | Content acquired in the context of the pilots and field tests of the FORENSOR EU funded project (GA No. 653355).
Data nature | Videos.
Data scale | Several hours of video (30 mins to 10 hours estimate) or video data in the order of gigabytes (> 4 GB, < 350 GB estimate).
To whom could the data be useful | Image and video processing community.
Do the data underpin a scientific publication | To be determined in due time. [7]
Information on the existence (or not) of similar data | Similar datasets or even databases of scientific or other origin do exist.
Information on the possibilities for integration and reuse of the data | The data could be merged with existing and/ or under development real life surveillance video datasets/ databases. The data could be easily reused for scientific research or for testing and validation of commercial products, in the field of image and video processing.
[6] FORENSOR PFT4.2 Subset is under consideration.
2.4 Automatically Obtained Image and Video Forensic Evidence (FORENSOR Pilots/ Field Tests Testbed Data)
2.4.1 Dataset reference and name
Dataset and Subset IDs, names and references are shown in Table 8.
Table 8: Automatically Obtained Image and Video Forensic Evidence (FORENSOR Pilots/Field Tests Testbed Data)
Dataset ID | Dataset Name | Dataset Reference
FORENSOR Evidence Dataset | Automatically Obtained Image and Video Forensic Evidence | Link will be provided at a later stage of the FORENSOR project.
1st Level Subset ID | 1st Level Subset Name | 1st Level Subset Reference
FORENSOR Image Evidence Dataset | Automatically Obtained Image Forensic Evidence | Same as above.
FORENSOR Video Evidence Dataset | Automatically Obtained Video Forensic Evidence | Same as above.
2.4.2 Dataset description
The FORENSOR Evidence Dataset will consist of all parts of the FORENSOR PFT Dataset that are
automatically identified by the FORENSOR system as forensic evidence, i.e. as evidence regarding
a specific event that has itself been automatically identified by the system. The FORENSOR
Evidence Dataset will consist of still images (video frames) and limited-length videos (video
segments).
The FORENSOR Evidence Dataset will be structured as shown in Table 8.
Key attributes, characteristics and other information regarding FORENSOR Evidence Dataset are
presented in Table 9.
Table 9: FORENSOR Evidence Dataset description
Description of the data that will be generated or collected | Low resolution (VGA, see endnote i), grey-level (256 grey levels) images or videos captured by a low power static rigged visual sensor.
Data origin (in case it is collected) | Content acquired in the context of the pilots and field tests of the FORENSOR EU funded project (GA No. 653355).
Data nature | Images and videos.
Data scale | Several hundred images (10 to 1,000 estimate) or image data in the order of megabytes (> 3 MB, < 300 MB estimate). Several minutes of video (10 to 120 minutes estimate) or video data in the order of gigabytes (> 5 GB, < 65 GB estimate).
To whom could the data be useful | Social sciences researchers, judicial/legal experts, digital forensic experts, LEAs or LEA related organisations, image and video processing community.
Do the data underpin a scientific publication | To be determined in due time. [8]
Information on the existence (or not) of similar data | Similar datasets or even databases of scientific or other origin do exist but are rarely shared publicly.
Information on the possibilities for integration and reuse of the data | The data could be merged with existing and/or under-development similar datasets/databases. The data could be easily reused for scientific research in the fields of social sciences and image and video processing.
[7] Any developments will be duly reported in subsequent versions of the present deliverable.
[8] Any developments will be duly reported in subsequent versions of the present deliverable.
3 Standards and Metadata
3.1 General note on dataset structure
The structure of the FORENSOR DTB and FORENSOR DTB-E Datasets and of the FORENSOR PFT Dataset,
shown in Table 4 and Table 6 respectively, corresponds to the three project UCs and
one project FT and the respective events to be detected (as they are currently identified):
UC1: Unfrequented provincial road surveillance
  o Car crash in unfrequented provincial road
  o Illegal garbage disposal in unfrequented provincial road adjacent to a protected/conservation area
  o Zigzagging/Driving under the influence of substances in unfrequented provincial road
UC2: Illegal trafficking in desolate coastal area
  o Vehicles passing by through desolate coast access road
  o People passing by through desolate coast access road
  o Vessel approaching desolate coast
  o People walking in desolate beach
UC3: Burglary at private house located in hamlet/village
  o Vehicle stationary (suspicious activity) nearby private house located in hamlet/village
  o Person stationary (suspicious activity) nearby private house located in hamlet/village
  o People trespassing gate or fence of private house located in hamlet/village
FT1: Pedestrian and bicycle path surveillance
  o Vehicle that is not a bicycle enters pedestrian and bicycle path
  o People entering pedestrian and bicycle path
Note: The final project UCs and FTs will be defined in D3.1 “Use case analysis and user scenarios”
(M6, February 2016), as contractually agreed. Any needed changes to the description of the
FORENSOR DTB and the FORENSOR DTB-E Datasets will be reported in subsequent versions of the
present document, which will be a “living document”, updated accordingly until the end of the
project.
3.2 FORENSOR DaPPECL IA SD Dataset
The FORENSOR DaPPECL IA SD Dataset will concern potential DaPPECL issues that partners see
in the FORENSOR project. It will include opinions from varying perspectives, both technical
and law-enforcement based. Answers will be given in the light of the varying expertise of each
concerned partner. The surveys will be conducted in a way that allows the information generated
to be evaluated objectively. This will be achieved through questions designed to draw out issues
that exist but which may not be obvious given the particular background of the concerned
interviewee. The resulting data will be stored in the form of a set of PDF files.
3.3 Other datasets
The FORENSOR DTB, FORENSOR DTB-E, and FORENSOR PFT Datasets will consist of uncompressed
low resolution, binary or grey-level videos while the FORENSOR Evidence Dataset will consist of
uncompressed low resolution, grey-level images and videos.
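Because the videos are uncompressed, their raw size follows directly from resolution, bit depth, frame rate and duration. The sketch below is illustrative arithmetic only: the VGA resolution (640x480) comes from endnote i, while the 30 frames-per-second rate and the function name are assumptions made for this example and are not stated in this deliverable.

```python
def uncompressed_video_bytes(width, height, bits_per_pixel, fps, seconds):
    """Return the raw size in bytes of an uncompressed video stream."""
    bits_per_frame = width * height * bits_per_pixel
    total_bits = bits_per_frame * fps * seconds
    return total_bits // 8


VGA = (640, 480)   # resolution given in endnote i
ASSUMED_FPS = 30   # assumption: the frame rate is not specified in this document
TEN_HOURS = 10 * 3600

# Grey-level video uses 8 bits per pixel (256 grey levels); binary video uses 1 bit.
grey_10h = uncompressed_video_bytes(*VGA, bits_per_pixel=8, fps=ASSUMED_FPS, seconds=TEN_HOURS)
binary_10h = uncompressed_video_bytes(*VGA, bits_per_pixel=1, fps=ASSUMED_FPS, seconds=TEN_HOURS)

print(f"grey-level, 10 h: {grey_10h / 1e9:.1f} GB")
print(f"binary, 10 h: {binary_10h / 1e9:.1f} GB")
```

Under these assumptions, ten hours of grey-level VGA video comes to roughly 332 GB, which is of the same order as the upper estimate quoted for the FORENSOR PFT Dataset. A different frame rate would scale the result proportionally.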
The metadata of all four aforementioned datasets will consist of selected object, event or other
system attributes. The metadata will be generated either automatically by the system or through
manual content annotation. The metadata will be associated with specific frames, sequences or
whole videos.
Metadata structure is expected to be the same for all four aforementioned datasets and common
for all project UCs and the FT (as those are currently defined); each UC and the FT will use a proper
database to store metadata. Metadata will be framed using XML and the respective structure will
include at least the following:
Metadata that describe objects/events
  o Existing/detected object type (from a predefined list of object types)
  o Occurring/detected event type (from a predefined list of event types)
  o Identification of frames where an event of a certain type occurs
  o Identification of locations within a frame where an object of a certain type appears
  o Text-based descriptions of actions in video and/or image sequences (e.g. a car passing, a person running, two people walking etc.)
Metadata that trace the path/process (workflow) of gathering metadata for an object/event, i.e. chain of custody of evidence, in the system
  o Time information (e.g. timestamps)
  o Location information (e.g. imaging device georeferenced coordinates)
  o Identification of the algorithms that have been used
  o Identification of algorithmic input and output
Metadata to monitor/control the use of the metadata
  o DaPPECL requirements
  o Logs
More detailed descriptions of the metadata will become available as part of the work in various
project WPs – i.e. DaPPECL constraints (WP2), system specification (WP3), algorithm development
(WP5), secure communications infrastructure (WP6), system integration (WP7), project pilots
(WP8) – and will be duly reported in subsequent versions of the present deliverable.
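As an illustration of how the metadata categories listed above might be framed in XML, the following minimal sketch builds one record with Python's standard library. The element and attribute names (metadata, description, custody, objectType and so on), the helper function and all sample values are hypothetical; the actual schema is to be defined in later versions of this deliverable.

```python
import xml.etree.ElementTree as ET


def build_event_metadata(object_type, event_type, first_frame, last_frame,
                         bbox, timestamp, algorithm):
    """Build a hypothetical XML metadata record for one detected event."""
    root = ET.Element("metadata")

    # Metadata that describe the object/event.
    desc = ET.SubElement(root, "description")
    ET.SubElement(desc, "objectType").text = object_type
    ET.SubElement(desc, "eventType").text = event_type
    ET.SubElement(desc, "frames", first=str(first_frame), last=str(last_frame))
    x, y, w, h = bbox
    ET.SubElement(desc, "location", x=str(x), y=str(y), width=str(w), height=str(h))

    # Metadata that trace how the record was produced (chain of custody).
    custody = ET.SubElement(root, "custody")
    ET.SubElement(custody, "timestamp").text = timestamp
    ET.SubElement(custody, "algorithm").text = algorithm
    return root


# All values below are made-up sample data.
record = build_event_metadata(
    object_type="vehicle",
    event_type="vehicle_stationary",
    first_frame=120,
    last_frame=480,
    bbox=(200, 150, 64, 48),
    timestamp="2017-03-01T22:15:00Z",
    algorithm="hypothetical-detector-v0",
)
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Keeping descriptive and custody information in separate branches mirrors the grouping used in the list above; other layouts would serve equally well.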
4 Data Sharing
4.1 FORENSOR DaPPECL IA SD Dataset
The decision on publicly sharing parts or the entirety of the FORENSOR DaPPECL IA SD Dataset will
be taken in due time – when the dataset is available – and will be duly reported in subsequent
versions of the present deliverable. Besides their scientific value, the data contained in this
dataset will also give away the full profile of the FORENSOR system in terms of privacy protection,
ethics and the criminal law (e.g. preservation of the chain of custody of evidence). Therefore, the
FORENSOR consortium will carefully examine to what extent the FORENSOR DaPPECL IA SD Dataset
can be publicly shared.
In case it is decided to publicly share parts of the FORENSOR DaPPECL IA SD Dataset this will be
realised through their inclusion as an annex to the public version of D2.3 “The impact assessment
report” (which will be available for download from the project website) and/or separately, as PDF
files, through the project website.
4.2 Other datasets
Due to the large estimated size of the FORENSOR DTB, FORENSOR DTB-E, and FORENSOR PFT
Datasets (currently estimated in the order of gigabytes), the FORENSOR consortium decided that
the only appropriate way of sharing them would be through a well organised open access
repository. Since no consortium partner maintains an appropriate institutional repository and no
relevant subject repository was found, it was decided that if the three aforementioned datasets
were to be shared they would be deposited in ZENODO [9], a repository hosted by CERN and
available to all.
The FORENSOR consortium will carefully consider if any part of the FORENSOR Evidence Dataset
can be shared with specific groups of individuals (e.g. LEAs) or with the public and for what reason;
all ethical and legal aspects will be thoroughly examined by the Project Ethical Manager, Data
Controller and Security Officer. In case of a positive decision, careful anonymization will take place
in order to remove any personal data depicted in the images or videos (e.g. blurring of faces). In
such case, the FORENSOR Evidence Dataset will also be deposited in ZENODO.
ZENODO supports Closed, Open and Embargoed [10] Access. However, only Open Access uploads
are displayed on the front page of the ZENODO website. Closed Access uploads are still
discoverable through search queries, their DOI [11], and any community collections where they are
included. Metadata is licensed under CC0 [12], except for email addresses. All metadata is exported
via OAI-PMH [13] and can be harvested. Access to metadata and data files is provided over standard
protocols such as HTTP and OAI-PMH.
[9] https://zenodo.org/.
[10] Users may deposit content under an embargo status and provide an end date for the embargo. The repository will restrict access to the data until the end of the embargo period, at which time the content will become publicly available automatically.
[11] A Digital Object Identifier (DOI) is a serial code used to uniquely identify objects. Further information can be found at https://en.wikipedia.org/wiki/Digital_object_identifier.
[12] Creative Commons “No Rights Reserved” license (CC0) enables scientists, educators, artists and other creators and owners of copyright- or database-protected content to waive those interests in their works and thereby place them as completely as possible in the public domain, so that others may freely build upon, enhance and reuse the works for any purposes without restriction under copyright or database law. Further information can be found at https://creativecommons.org/about/cc0.
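To make the harvesting route more concrete, the sketch below only constructs an OAI-PMH ListRecords request URL; it does not contact any server. The endpoint address is an assumption based on the OAI-PMH base URL that ZENODO has published and may change over time, and the set name "user-forensor" is purely hypothetical, since no FORENSOR community collection is defined in this document.

```python
from urllib.parse import urlencode

# Assumption: OAI-PMH base URL as published by ZENODO; verify before use.
ZENODO_OAI_ENDPOINT = "https://zenodo.org/oai2d"


def list_records_url(metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL (no network access)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        # Optional OAI-PMH "set" argument used for selective harvesting.
        params["set"] = set_spec
    return f"{ZENODO_OAI_ENDPOINT}?{urlencode(params)}"


# "user-forensor" is a hypothetical set name used for illustration only.
url = list_records_url(set_spec="user-forensor")
print(url)
```

ListRecords and the oai_dc (Dublin Core) metadata prefix are defined by the OAI-PMH specification itself, so the same request shape applies to any compliant repository.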
ZENODO accepts data under a variety of licenses, but extra benefits, in terms of visibility and
credit, and additional services and upload quotas are offered to data deposited under the most
open licenses.
Note: The full set of ZENODO “Terms of Use” can be found online at https://zenodo.org/terms
while the full set of ZENODO Policies can be found online at https://zenodo.org/policies.
4.2.1 Access and licensing
The FORENSOR consortium will take its final decisions regarding the sharing, access, and licensing
strategies for the FORENSOR DTB, FORENSOR DTB-E, FORENSOR PFT, and FORENSOR Evidence
Datasets in due time, when the datasets are available. All relevant details will be duly reported in
subsequent versions of the present deliverable.
4.2.2 Re-use
The FORENSOR DTB, FORENSOR DTB-E, and FORENSOR PFT Datasets will consist of videos in standard
video formats (easily readable by any standard video player), thus enabling easy re-use. The
FORENSOR Evidence Dataset will consist of images and videos in standard formats, thus easily
readable. In all four aforementioned cases, no specific SW or HW will be needed in order to view or
re-use the datasets.
Note: FORENSOR will not provide open access to the software emulator that will be developed
specifically for emulating the behaviour of the Low-Power Imager, i.e. the operations of the
FORENSOR chip, and which will be used for processing the FORENSOR DTB Dataset in order to acquire
the FORENSOR DTB-E Dataset. Nor will it provide open access to any other algorithm developed
during the project, as none of them will be needed in order to access or re-use any of the datasets
described in this document.
4.2.3 Reasons for not sharing
Note: FORENSOR consortium declares that it reserves the right to exclude any video(s) belonging
to the datasets described in the present deliverable (or later versions of it) from the dataset they
belong to and/or from Open Access in general in case it interferes with ethical rules, or contains
personal data, or raises IPR, commercial, privacy-related or security issues.
4.2.3.1 FORENSOR DTB and FORENSOR DTB-E Datasets
FORENSOR DTB and FORENSOR DTB-E Datasets might contain videos showing personal data (such
as faces) of the actors who voluntarily participated in the creation of the videos.
However, all participants in the research will be requested, prior to their participation,
to sign an informed consent form in which they will: (a) acknowledge the right of the FORENSOR
consortium to provide Open Access to the project datasets (as these are described and under the
terms described in the present deliverable or in later versions of this deliverable), and (b)
authorise the FORENSOR consortium to provide Open Access to all the videos depicting them. These
actions reflect the FORENSOR Consortium’s commitment to conduct responsible research and
fulfill respective legal and regulatory requirements in all the countries where videos will be shot.
[13] The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) is a low-barrier mechanism for repository interoperability. Further information can be found at: https://www.openarchives.org/pmh/.
4.2.3.2 FORENSOR PFT Dataset
The FORENSOR PFT Dataset might contain videos showing personal data (such as faces) of
passers-by. For this reason, we have foreseen the placement of appropriate informative signs
in all pilot and field test locations where the various videos of the FORENSOR PFT Dataset
will be captured. The signs will inform people – prior to their entering the area of the pilot/
field test – about the experimental sensor trial, the fact that they may be captured on video, and
that their personal data may be shared amongst the research partners of the FORENSOR consortium
for research purposes. Individual data subjects may object to this if they wish. These actions
reflect the FORENSOR Consortium’s commitment to conduct responsible research and fulfill
respective legal and regulatory requirements in all the countries where videos will be shot.
4.2.3.3 FORENSOR Evidence Dataset
FORENSOR Evidence Dataset will contain videos where personal data (such as the faces for exam-
ple) of people are shown. The FORENSOR consortium will carefully consider if any part of the
FORENSOR Evidence Dataset can be shared. Such data will only be shared where consent has been
secured from the relevant data subject or where the data in question has been anonymized.
5 Archiving and Preservation (Including Storage and Backup)
5.1 FORENSOR DaPPECL IA SD Dataset
The FORENSOR DaPPECL IA SD Dataset will be preserved for an additional two years after the
end of the project. The cost for the preservation of the project website for the whole lifetime of
the FORENSOR project plus an additional two years has already been included in the project
budget. The approximate end volume of the dataset is tens of filled-in questionnaires with
less than 100 answered questions each.
5.2 Other datasets
We believe that the FORENSOR DTB, FORENSOR DTB-E, and FORENSOR PFT Datasets should be
preserved for at least seven years after the end of the FORENSOR project. This would be ensured
through their deposit in the ZENODO repository. Archiving and long-term preservation of the
FORENSOR Evidence Dataset could also be ensured (in case it is deemed necessary) through its
deposit in the ZENODO repository.
Approximated end volume is in the order of gigabytes for FORENSOR DTB, FORENSOR DTB-E, and
FORENSOR PFT Datasets (more details are provided in Table 5 and Table 7 respectively) and in the
order of megabytes for the images and gigabytes for the videos in the case of the FORENSOR
Evidence Dataset (more details are provided in Table 9).
For further details regarding archiving and preservation of all four aforementioned datasets
please refer to ‘Appendix I: Relevant ZENODO Archiving and Preservation Policies’.
6 Appendix I: Relevant ZENODO Archiving and Preservation Policies
6.1 Retention period
Items will be retained for the lifetime of the repository. This is currently the lifetime of the host
laboratory CERN, which currently has an experimental programme defined for the next 20 years
at least.
6.2 Functional preservation
ZENODO makes no promises of usability and understandability of deposited objects over time.
6.3 File preservation
Data files and metadata are backed up nightly and replicated into multiple copies in the online
system.
6.4 Fixity and authenticity
All data files are stored along with an MD5 checksum of the file content. Files are regularly checked
against their checksums to assure that file content remains constant.
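A fixity check of this kind can also be reproduced locally by anyone who downloads a file. The following is a minimal sketch using Python's standard hashlib module; the function names are hypothetical and it does not describe ZENODO's internal implementation.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large video files need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unchanged(path, recorded_md5):
    """Return True if the file content still matches the recorded checksum."""
    return md5_of_file(path) == recorded_md5.lower()
```

A caller would record the digest at deposit time and later pass it to is_unchanged; a False result indicates that the file content has been altered or corrupted.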
6.5 Succession plans
In case of closure of the repository, best efforts will be made to integrate all content into suitable
alternative institutional and/or subject based repositories.
6.6 What does it cost?
ZENODO is free for the long tail of Science. In order to offer services to the more resource hungry
research, we will introduce a ceiling to the free slice and offer paid-for slices above, according to
the business model developed within the sustainability plan.
i. Video Graphics Array (VGA) refers to a 640x480 pixels resolution.
ii. Quarter Video Graphics Array (QVGA) refers to a 320x240 pixels resolution.