Global fisheries “catch” statistics: structures, standards and formats for data interoperability and dissemination

Version Log

Version | Date | Author | Changes
V 1.0 | 06-09-2017 | Aymen Charef and Anton Ellenbroek |
V 2.0 | | |

Table of contents

1. Preamble
2. Scope
3. Data structure
  3.1 For reporting purpose
  3.2 For dissemination purpose
4. Exchange and dissemination formats
  4.1 ‘Unstructured’ Comma Separated Values CSV
  4.2 Statistical Data and Metadata eXchange SDMX/SDMX-ML
  4.3 Fisheries Language for Universal eXchange FLUX
    4.3.1 Introduction
    4.3.2 Scope
    4.3.3 General principles
    4.3.4 Use of the FLUX standard
    4.3.4 Implementation by EU DG MARE

1. Preamble

The general objective of the Fisheries Data Interoperability Working Group (FDIWG) is to devise a global data exchange and integration framework to support scientific advice on stock status and exploitation that builds on fisheries data. The framework concerns the various fisheries data domains used in such scientific processes, including data collected for statistical or status and trends reporting, data from monitoring, control and surveillance, and scientific fisheries data.

More specifically, the FDIWG has two main objectives: first, to foster better interoperability of statistical data, and second, to better represent fisheries data in geospatial data formats. This will improve the reporting, access and analysis of fisheries data based on open data services.

As presented in its work plan, the FDIWG will address the minimum data requirements to describe the fisheries data needed to support stock assessment and fisheries management, in particular fisheries statistical data and geospatial data. These objectives are in line with the activities of the Coordinating Working Party for Fisheries Statistics (CWP), a forum composed of intergovernmental organizations which have a competence in fishery statistics. At its 2017 inter-sessional meeting, the CWP recognized the FDIWG’s efforts to examine standards and alternatives to improve data harmonization and interoperability. The CWP also asked the working group to present its assessment output, proposals and recommendations, with a remit covering formats for data exchange and dissemination, in order to provide guidance to the CWP parties, including Regional Fisheries Management Organizations (RFMOs).

2. Scope

This document focuses on the data structures used for fisheries data reporting and dissemination, which are exchanged at different levels: country level, regional level (e.g. RFMOs) and global level (e.g. FAO).

In the second part, three electronic data formats are assessed for their capacity to support interoperable standard data structures and the exchange of fisheries statistical data and the associated metadata.

3. Data structure

Fisheries statistical data are used nationally, principally for reporting purposes, and internationally (e.g. by ICES and Eurostat) for analytical and dissemination purposes. The data also serve as a reference for other international organizations and as a means of crosschecking and reconciling information from national sources. At the global level, FAO is the only intergovernmental organization formally mandated by its constitution to undertake the worldwide collection, compilation, analysis and dissemination of data and information in fisheries.

By integrating and coordinating the statistical programmes among organizations, the CWP has standardized statistical reporting systems. This has resulted in the adoption of a reduced number of FAO questionnaires, which has helped reduce the burden on national fisheries statistical offices.

3.1 For reporting purpose

Catch and effort data are the main data schemes. They have been collected periodically for the different major fishing areas by means of forms and questionnaires that apply common classifications and definitions but are tailored to the particular needs of the relevant regional fisheries organization.

The catch and effort questionnaires developed by the CWP are called STATLANT and they are sent by FAO on behalf of the regional fishery organizations to the relevant national authorities.

- STATLANT A questionnaires are used for reporting annual nominal catch by species and by statistical sub-area, division or sub-division.

- STATLANT B questionnaires are used for reporting fishing effort by month, vessel size class, gear and statistical sub-area, division or sub-division, together with the associated catch by species. In some cases the species sought (target species) are also specified.
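The two questionnaire types can be pictured as two record structures. The following is a minimal sketch in Python, not an official STATLANT schema: the field names, the code values and the helper `annual_catch_from_b` are assumptions chosen for illustration. It also shows how monthly STATLANT B catches could be summed to the annual granularity of STATLANT A.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StatlantARecord:
    """Annual nominal catch by species and statistical area (STATLANT A)."""
    country: str          # reporting country code (hypothetical coding)
    year: int
    species: str          # species code
    area: str             # statistical sub-area, division or sub-division
    catch_tonnes: float   # nominal catch


@dataclass(frozen=True)
class StatlantBRecord:
    """Monthly fishing effort with associated catch by species (STATLANT B)."""
    country: str
    year: int
    month: int
    vessel_size_class: str
    gear: str
    area: str
    effort: float                          # unit depends on the gear; not fixed here
    species: str
    catch_tonnes: float
    target_species: Optional[str] = None   # only specified in some cases


def annual_catch_from_b(records):
    """Sum monthly STATLANT B catches into annual totals keyed like STATLANT A."""
    totals = {}
    for r in records:
        key = (r.country, r.year, r.species, r.area)
        totals[key] = totals.get(key, 0.0) + r.catch_tonnes
    return totals
```

Such a roll-up is one way to crosscheck the two questionnaires against each other, since both describe the same catches at different resolutions.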

The FishSTAT reporting system is used by FAO to collate global statistics on catch and production from over 240 countries for over 1200 species of aquatic organisms of significant commercial importance in all inland and marine fishing areas. It is run in parallel with the STATLANT system in areas where the latter is operated.

The other main data reporting scheme is the logbook. Logbooks are widely used as a method of collecting statistical information on commercial activities. They provide a means of recording data at the source, and they play an important role as vehicles for data used by several rather different groups of users. Two distinct major groups of users are identified:

(a) biologists and economists (fishery activity data, catch and effort data);
(b) enforcement authorities (especially under licensed fisheries schemes).

3.2 For dissemination purpose

Regional and international organisations with responsibilities in fishery statistics disseminate aggregated data such as catches and landings. For instance, FAO compiles, analyses and disseminates fishery data, structured within data collections. The capture production database contains statistics

In general, the data structure for dissemination purposes contains information on the volume and value of landings or catches by date, country or territory, species item, and reporting or fishing area (e.g. ICES area, FAO Major Fishing Area).

4. Exchange and dissemination formats

The data format provides the structure of the data sent over the network. Several formats and standards for dissemination and exchange can be used to implement the interoperability of fisheries statistical data and the related reference metadata. The following criteria need to be considered when scoring a file format:

- It should be widespread among fisheries data hosts, bodies and stakeholders, to minimise compatibility issues;
- It needs to be readable by both humans and machines: complexity should therefore be kept at an acceptable level, while the data remain structured;
- Structure standardisation needs to be possible: the format structure must be standardised and the file format must support an open standard;
- Mature (free) tools and platforms should be available to support the generation and exchange of data.

As a best practice, a writing convention or format could be recommended when exchanging data that use a specific coding system (e.g. the FAO areas breakdown). In this case, the ease of digitizing the codes should be considered, in order to facilitate data interoperability and exchange.

4.1 ‘Unstructured’ Comma Separated Values CSV

CSV files are widely used by national statistical agencies and international organizations to disseminate datasets and metadata. The readability of CSV files is acceptable and facilitates interaction with human users. CSV could be a good candidate for exchanging reference data and metadata.

In FAO, an immediate focus is on the publication of the Data Structure Definitions (DSD) directly relevant to the disseminated statistical data collections. The DSD of Capture production has been made available in a packaged format comprising codelists in CSV files and related metadata in a text file.
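To illustrate how a CSV codelist and a CSV dataset work together, the following sketch reads a small codelist and uses it to label data rows and to flag codes that the codelist does not define. The column names, file contents and function names are hypothetical and inlined so that the example is self-contained; they do not reproduce the layout of the FAO package.

```python
import csv
import io

# Hypothetical codelist and data files, inlined so the sketch is self-contained.
SPECIES_CODELIST = """code,name_en
COD,Atlantic cod
HER,Atlantic herring
"""

CAPTURE_DATA = """country,year,species,area,quantity_t
XXX,2015,COD,27,100.5
XXX,2015,HER,27,40
XXX,2015,ZZZ,27,3
"""


def load_codelist(text):
    """Return a {code: label} mapping from a two-column CSV codelist."""
    return {row["code"]: row["name_en"] for row in csv.DictReader(io.StringIO(text))}


def label_rows(data_text, codelist):
    """Attach species labels to data rows and collect codes missing from the codelist."""
    labelled, unknown = [], []
    for row in csv.DictReader(io.StringIO(data_text)):
        code = row["species"]
        if code in codelist:
            labelled.append({**row, "species_name": codelist[code]})
        else:
            unknown.append(code)
    return labelled, unknown
```

Because CSV itself carries no schema, the check against the codelist has to be done by the consumer, which is the main limitation of an 'unstructured' format compared with the XML-based options discussed next.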

Tools to manage csv files …. E.g. The BlueBRIDGE project has

4.2 Statistical Data and Metadata eXchange SDMX/SDMX-ML

SDMX, which stands for Statistical Data and Metadata eXchange, is an international initiative that aims at standardising and modernising the mechanisms and processes for the exchange of statistical data and metadata among international organisations and their member countries. The organizations involved in the SDMX initiative have developed guidelines applicable to all statistical domains. Furthermore, the community has made available software tools and a registry to host reusable SDMX artefacts.

SDMX is not just a technical standard: it also offers guidelines, such as a Checklist for Design Projects and Modelling Guidelines, which are highly relevant for establishing an SDMX project for a data domain. For a specific data domain (e.g. capture data for dissemination purposes), an SDMX project starts by creating a concept scheme that describes this domain and the data flows (e.g. a country sends a dataset to an organization). The design and creation of SDMX artefacts and the management of such a project are detailed in this standard project workflow. The structure of this checklist is based, to the largest extent possible, on the UNECE Generic Statistical Business Process Model.

The SDMX principles have been applied to fisheries statistics, and in particular to the fisheries catch DSD for the collection of data, in the context of the joint project SEIF, which stands for SDMX for Eurostat, ICES and FAO. The initiative aimed at the alignment and exchange of SDMX artefacts between the three organizations.

SDMX is being adopted as the data collection format for fisheries data in Eurostat, in line with the policy for all statistical domains covered by the European Statistical System. FAO is making progress in the implementation of SDMX principles and the acquisition of related tools.

The SDMX standard offers an information model which describes statistical data sets and the structural metadata needed to exchange them in a standard fashion. The content of an SDMX file has a visible structure that indicates what is stored where in the file. The usual format for the SDMX information model is XML (SDMX-ML), which makes it a good option for the exchange of fisheries statistical data sets and the accompanying metadata.
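The idea of a visible structure can be sketched with a small XML fragment in which dimensions are carried on a series and each observation holds a time period and a value. The fragment below is a simplified, SDMX-like layout written for illustration only: real SDMX-ML messages use namespaced elements defined by the official schemas, and the dimension names and the `structureRef` value here are assumptions.

```python
import xml.etree.ElementTree as ET

# Simplified, SDMX-like layout for illustration only; real SDMX-ML uses
# namespaced elements defined by the official schemas.
SAMPLE = """<DataSet structureRef="HYPOTHETICAL_CATCH_DSD">
  <Series COUNTRY="XXX" SPECIES="COD" AREA="27">
    <Obs TIME_PERIOD="2014" OBS_VALUE="120.0"/>
    <Obs TIME_PERIOD="2015" OBS_VALUE="100.5"/>
  </Series>
</DataSet>"""


def read_observations(xml_text):
    """Flatten each observation into a dict of series dimensions plus time and value."""
    root = ET.fromstring(xml_text)
    rows = []
    for series in root.findall("Series"):
        dims = dict(series.attrib)  # dimensions shared by all observations of the series
        for obs in series.findall("Obs"):
            rows.append({
                **dims,
                "TIME_PERIOD": obs.get("TIME_PERIOD"),
                "OBS_VALUE": float(obs.get("OBS_VALUE")),
            })
    return rows
```

In an actual SDMX exchange, the list of allowed dimensions and their codelists would come from the DSD rather than being implied by the attribute names.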

4.3 Fisheries Language for Universal eXchange FLUX

4.3.1 Introduction

The Fisheries Language for Universal Exchange (FLUX) standard, developed by the United Nations Centre for Trade Facilitation and Electronic Business (UN/CEFACT), provides a harmonized message standard allowing Fishery Management Organizations (FMOs) to automatically access the electronic data needed for stock management, such as vessel and trip identification, fishing operations (daily catch or haul-by-haul), fishing data (catch area, species and quantity, date and time, and gear used), landing and sales information, license information and inspection data. With this standard, FMOs around the world have for the first time a communication tool to automate the collection and dissemination of the fishery catch data needed for sustainable fishery management and for detecting and combatting illegal, unreported and unregulated fishing.

FLUX can be used in any fishing industry and fishing operation. Fishing vessels are the primary source of data that needs to be collected during fishing. Based on this electronic messaging standard, data can be shared with other stakeholders. The standard focuses on all messages sent with an electronic reporting system. Therefore it is not restricted to any specific business process or data model.

4.3.2 Scope

FLUX is developed, maintained and promoted by UN/CEFACT, an intergovernmental body of the United Nations Economic Commission for Europe (UNECE). The FLUX standard is directly linked to Sustainable Development Goal (SDG) 14 of the 2030 Agenda for Sustainable Development, and particularly to Target 14.4, which focuses on illegal and unreported fishing and overfishing, and aims at sustainable fishing practices. It also contributes to ensuring sustainable production (SDG 12) in the fishing industry, through fisheries management based on a reliable database of fishing data, thus preserving biodiversity, conserving fish stocks and improving overall fishing practices.

Furthermore, by preserving fish stocks and promoting sustainable fisheries management, FLUX helps provide animal protein for current and future generations, and hence contributes to ending hunger and achieving food security and improved nutrition (SDG 2).

At the twenty-seventh UN/CEFACT Forum on 27 April 2016 in Geneva, the Agricultural Domain of UN/CEFACT supported the decision to create a Sustainable Fisheries User Community as a Team of Specialists. The objective of the Team of Specialists is to promote, facilitate and support the implementation of the FLUX standard or other sustainable fisheries standards on a global scale.

4.3.3 General principles

FLUX contains two distinct but related parts:

- The FLUX business layer
- The FLUX transportation layer

The core of the FLUX business layer is:

- the detailed and standardised description of each and any data element needed;
- the standardised grouping of those data elements in messages required by the business for exchanging data between parties.

For the FLUX business layer, standardisation of the data elements and formats is based upon the UN/CEFACT approach. This allows for the description of the typical business processes. Technically speaking, UN/CEFACT standardization provides a standardized schema for business process (XSD) and a standardized content (Core Components).

- The practical outcome of a UN/CEFACT standardisation project is a technical file called XSD (XML Schema Definition) for the business processes and requirements subject to the project. This XSD can be used for all data exchanges and processes described by the project.

- The data exchanged are also harmonized and published in a standardized library (UN/CEFACT Core Component Library).

The FLUX transportation layer describes:

- The FLUX Envelope, one single yet universal message format that can encapsulate any business-specific message or structured data in a predictable way, whatever the business system and associated data types and formats, using industry standard data representation techniques;
- The FLUX Protocol, a mechanism describing how to reliably deliver the FLUX Envelopes to their destination without human intervention, leveraging existing state-of-the-art technologies (SOAP Web Services) in a sensible manner so as to avoid, as much as possible, interoperability issues between FLUX implementations based on different vendors’ solutions.

The UN/CEFACT Modeling Methodology (UMM) approach and the Unified Modeling Language are detailed in this document (Figure 1). The structure is based on the structure of the UN/CEFACT Business Requirements Specification (BRS) document, reference CEFACT/ICG/005.

FLUX offers a protocol to create a secure and configurable network between the IT systems of different parties. The standard includes a protocol for the exchange of a request for information, the exchange of the information itself, and the acknowledgement or rejection of the information exchanged. Built over SOAP and WSDL, this mechanism provides an envelope that can contain a business message, and software which serves as infrastructure to transport the envelopes. FLUX is strongly tied to XML as a data format and, more specifically, to UN/CEFACT standardized XML Schemas.
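The separation between the envelope and the business message can be sketched as follows. This is a minimal illustration of the wrapping idea only: the element and attribute names (`Envelope`, `Payload`, `from`, `to`, `dataflow`) and both functions are hypothetical and do not reproduce the actual FLUX Envelope schema or its SOAP binding.

```python
import uuid
import xml.etree.ElementTree as ET


def wrap_in_envelope(business_xml, sender, recipient, dataflow):
    """Wrap a business message in a transport envelope (hypothetical layout)."""
    envelope = ET.Element("Envelope", {
        "id": str(uuid.uuid4()),   # lets an acknowledgement refer back to this envelope
        "from": sender,
        "to": recipient,
        "dataflow": dataflow,      # tells the receiver which business domain to route to
    })
    payload = ET.SubElement(envelope, "Payload")
    payload.append(ET.fromstring(business_xml))  # business content is carried unchanged
    return envelope


def unwrap(envelope):
    """Return the routing metadata and the business message carried by the envelope."""
    payload = envelope.find("Payload")
    message = list(payload)[0]
    return dict(envelope.attrib), message
```

The point of this design is that the transport software only needs to read the envelope attributes; it can route and acknowledge a message without understanding the business content inside.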

The data exchanged are also harmonized and published in a standardized library (UN/CEFACT Core Component Library, UN/CCL).

For instance, the UN/CCL contains the different concepts used by the FLUX messages. Each concept has a unique identifier for approved library objects (e.g. ABIE, BBIE, ASBIE). Selecting the value “FLUX” for the business process returns all the unique UN IDs assigned to FLUX.

Figure 1. FLUX message activity diagram.

Figure 2. Overview diagram of Entities used in General Principles document.

Source: BUSINESS REQUIREMENTS SPECIFICATION (BRS) FLUX General Principles (GP) domain - FLUX P1000 – 1, Version: 2.1.3. UN/CEFACT International Trade and Business Processes Group

4.3.4 Use of the FLUX standard

Standard implementations have been defined for the following data exchange domains:

- Vessel Domain: aims to standardize the exchange of fishing fleet data, and more specifically the information directly related to fishing vessels and vessels supporting fishing operations.

- Fishing Activities and Sales Domain: is related to data exchanges in the context of fishing activities performed by vessels during a fishing voyage. Fishing activities include all activities of vessels, related to a fishing trip. This includes catching activity, but also transshipments, relocations and landings, etc. The data exchange contains reports related to the fishing trip: departure, arrival, entry and exit from zones, etc.

- Vessel positions domain: provides a standard for the communication of vessel position information (e.g. VMS or AIS) between monitoring centers.

- Fishing licenses, authorizations and permits: to standardize the exchange of data between stakeholders in the context of request for fishing license, authorization or permit.

- Aggregated Catch Data (ACDR): provides a standard to exchange aggregated catch data between stakeholders.

- Master Data Management (MDM): encompasses exchanges from a Master Data Register to any requester of Fisheries information registered in it.

FLUX offers several advantages. It is a free, open and global standard to automate the collection and dissemination of the fishery catch data needed for sustainable fishery management. It provides a common approach to electronic logbooks for fishing vessels, interoperability between IT systems, and easy exchange of data between parties.

Notwithstanding these benefits, the implementation of FLUX needs to be explored further with respect to the interoperability of standard data structure formats. It would be necessary to examine whether FLUX can accommodate a generic catch DSD, the reference metadata and, in particular, the various levels of hierarchy of the codelists used in data reporting. A positive conclusion on FLUX’s ability to accommodate fisheries statistics would make it a good exchange format for the global catch data structure.

4.3.4 Implementation by EU DG MARE

In the context of European Union usage, with the purpose of recording and reporting fishing activities, the European Commission’s Directorate-General for Maritime Affairs and Fisheries (EU DG-MARE) implemented UN/CEFACT FLUX for the following data domains:

4.3.4.1 Master Data Register

Implementation documents for each data domain are available on the Master Data Register (MDR), which contains the data structures and lists of fisheries codes to be used in electronic information recording and exchanges among Member States and for Member States' communications with Norway. All EU regulations defining or referencing a code list are also stored in the MDR.

The Master Data Register is accessible here: https://ec.europa.eu/fisheries/cfp/control/codes/

Business messages are exchanged through the FLUX Transportation Layer. Each message must be validated by the receiver upon receipt. Validation takes place on two levels:

1. XML Validation level: Based on the definition in the XSD, the parser validates the structure and cardinality as well as compliance for mandatory elements of the XML provided.

2. Business Rules Validation level: a process validates the content of XML according to, firstly, the General Principles Business Rules definition and, secondly, to all other specific business rules defined in the domain.

For the validation, business rules are classified in two data sets:

1. Validation data set: rules that are applied immediately when a party receives a message. The following types of rules fall under this category:
  • Structural validations: the message must be valid according to the standard UN/CEFACT XSD for the data domain (figure 3);
  • Mandatory fields: they must be present in the message. There is no need for the recipient of the message to validate entities or attributes that are not defined in this implementation document;
  • Business validations.

2. Verification data set: rules that could be applied to the data after forwarding the message. There are no such rules for this implementation document.

The Response message returned to the requester has a unique structure which contains, depending on the results of the validation, either a message with information and data elements belonging to the requested list of code(s) or a message containing the Error code and description.
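The two validation levels and the resulting response can be sketched as a small pipeline. This is an illustration under stated assumptions, not the DG MARE implementation: Python's standard library does not validate against an XSD, so the first level is approximated here by a well-formedness check plus a check for mandatory elements, and the element names, the single business rule and the response fields are hypothetical.

```python
import xml.etree.ElementTree as ET

MANDATORY = ("Gear", "Area", "Species", "Quantity")  # hypothetical mandatory elements


def validate_structure(xml_text):
    """Level 1 stand-in: well-formedness plus presence of mandatory elements."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return None, [f"not well-formed: {exc}"]
    missing = [name for name in MANDATORY if root.find(name) is None]
    return root, [f"missing mandatory element: {name}" for name in missing]


def validate_business_rules(root):
    """Level 2: content checks, run only after the structure has been accepted."""
    errors = []
    try:
        if float(root.findtext("Quantity")) < 0:
            errors.append("Quantity must not be negative")
    except ValueError:
        errors.append("Quantity must be numeric")
    return errors


def respond(xml_text):
    """Build a response: accepted, or rejected with the errors that were found."""
    root, errors = validate_structure(xml_text)
    if not errors:
        errors = validate_business_rules(root)
    return {"status": "REJECTED" if errors else "ACCEPTED", "errors": errors}
```

Running the business rules only after the structural check has passed mirrors the ordering described above, and it means business rules can safely assume that the mandatory elements exist.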

Figure 3. Example of UN/CEFACT XSD messages.

4.3.4.2 Aggregated Catch Data Reporting

The Aggregated Catch Data Reporting (ACDR) is a data structure implemented by EU DG-MARE following the FLUX standard.

The FLUX ACDR Message data model (figure 4) is used to send catch data from the flag State to the DG MARE Data warehouse (DWH). The FLUX Response Message (figure 5) is sent back from DG MARE to the flag State and reports on the correctness of the data and the results of the validation performed.

A set of rules is defined for the FLUX ACDR data model (XSD), including:

- Business rules for report validation
- Periodicity and submission
- Mandatory data elements (e.g. gear, vessel, area, species)
- Constraints and values that can be used for each field in the FLUX ACDR message, following general constraints at the level of XSD element attributes.

The functional documentation of implementation is published on the MDR.
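The notion of aggregated catch data can be illustrated by grouping detailed catch lines on the kind of elements listed above and summing the quantities. The sketch below is only an illustration of that aggregation step: the tuple layout, the code values, the use of a vessel size class for the vessel element and the monthly period are assumptions, not the ACDR data model.

```python
from collections import defaultdict

# Hypothetical per-trip catch lines:
# (month, gear, vessel size class, area, species, quantity in kg)
TRIP_LINES = [
    ("2016-01", "OTB", "VL1218", "27.4.a", "COD", 800.0),
    ("2016-01", "OTB", "VL1218", "27.4.a", "COD", 450.0),
    ("2016-01", "OTB", "VL1218", "27.4.a", "HER", 300.0),
    ("2016-02", "OTB", "VL1218", "27.4.a", "COD", 250.0),
]


def aggregate(lines):
    """Sum quantities per (month, gear, vessel class, area, species) combination."""
    totals = defaultdict(float)
    for month, gear, vessel_class, area, species, kg in lines:
        totals[(month, gear, vessel_class, area, species)] += kg
    return dict(totals)
```

Each key of the result corresponds to one aggregated record, which is why every element of the key has to be mandatory in the message: a missing gear or area would make the record impossible to place in the aggregation.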

Figure 4. FLUX ACDR Message data model.

Figure 5. FLUX General Principles Response XSD data model.

4.3.4.3 Master Data Management

The implementation of the Master Data Management UN/CEFACT international standard in the context of European Union usage is detailed in this document. MDM Messages are submitted through the FLUX Transportation Layer, for which technical and functional documentation is published on the MDR.

The following activity diagram describes the normal procedure defined for submitting MDM Query Messages from a Requester to the European Commission. This procedure respects the FLUX General Principles transmission procedure.

Figure 6. Query Message Transmission activity diagram (Source)