OPERATIONAL METADATA FOR FEDERATING STATISTICAL REFERENCE SYSTEMS AT EUROSTAT G. Pongas, F. Vernadat EC Eurostat B2


Page 1

OPERATIONAL METADATA FOR FEDERATING STATISTICAL REFERENCE SYSTEMS AT EUROSTAT

G. Pongas, F. Vernadat

EC Eurostat B2

Page 2

Overview of the talk

• Introduction

• CVD (Cycle de Vie des Données)

• REFIN: Internal Reference

• Eurostat Dissemination Portal (Site 3)

• Conclusion

Page 3

Introduction

Metadata in statistical information:

• define some of the semantics of data
• are needed for proper production and usage of data
• make data comparable
• ensure some level of data quality
• are required for efficient search

Page 4

CVD (Cycle de Vie des Données)

[Figure: the data life cycle (CVD). Labels: Data Providers; Data Collection; Validation; Correction; Imputation; Internal Reference; Operations; External Reference; Dissemination; Statistical Metadata; Production. Data states: Raw data; Validated data; Derived data; Public data.]

Page 5

Current Situation at Eurostat

[Figure: current situation at Eurostat. Environments: Collection Environment; Production Environment; Reference Environment; Dissemination Environment. Other labels: Suppliers; Customers; NSI; Stadium; Statel; PS; Comext; NewCronos; Public. Office, Web Site, Info. relays; DG; ECB; OECD; External users.]

Page 6

EUROSTAT INTERNAL REFERENCE: The problem

Too many different systems at EUROSTAT for handling data:

– FAME
– Oracle Express
– Oracle RDBMS
– SAM
– SAS

Page 7

REFIN: The problem (Cont’d)

• Some of them are general purpose (e.g. Oracle RDBMS), whereas others may include special features (for data validation or computation). All of them have their own access methods and user interfaces (Express Analyser, FAME...).

• Major drawbacks:
– High complexity for users
– Data comparison between different systems is not easy

Page 8

What is REFIN?

The REFIN system specifically addresses these issues:

– Gives access to heterogeneous systems

– Provides the users with a common interface
• Data location and source system are hidden
• Data is not duplicated; access is to the original data

– Uses a unique exchange format (PIVOT)

– Implements specific security rules
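
The common-interface idea can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: `PivotRecord`, `SourceAdaptor`, `InMemoryAdaptor`, and `CommonInterface` are hypothetical names, the record layout is not the actual PIVOT format, and the security rules are left out.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class PivotRecord:
    """Hypothetical stand-in for one observation in a common exchange format."""
    series: str
    period: str
    value: float


class SourceAdaptor(Protocol):
    """Interface each back-end adaptor is assumed to implement."""
    def fetch(self, series: str) -> list[PivotRecord]: ...


class InMemoryAdaptor:
    """Toy adaptor standing in for a real back end (FAME, Oracle, ...)."""
    def __init__(self, rows: dict[str, dict[str, float]]) -> None:
        self._rows = rows

    def fetch(self, series: str) -> list[PivotRecord]:
        # Read from the underlying store on every call; nothing is copied ahead of time.
        return [PivotRecord(series, p, v) for p, v in sorted(self._rows[series].items())]


class CommonInterface:
    """Routes a request to whichever system holds the series, hiding its location."""
    def __init__(self, catalogue: dict[str, SourceAdaptor]) -> None:
        self._catalogue = catalogue  # series name -> adaptor that holds it

    def get(self, series: str) -> list[PivotRecord]:
        return self._catalogue[series].fetch(series)


fame_like = InMemoryAdaptor({"gdp": {"2001": 1.5, "2000": 1.2}})
oracle_like = InMemoryAdaptor({"trade": {"2000": 7.0}})
refin = CommonInterface({"gdp": fame_like, "trade": oracle_like})
```

A caller asks `refin.get("gdp")` without knowing which system answers, and receives records in the same shape whatever the source.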

Page 9

REFIN architecture

[Figure: REFIN architecture. Source systems: FAME (databases + metadata); Oracle DBMS (databases + metadata); Microsoft Access; SAM (+ metadata); Oracle Express (databases + metadata). Access interfaces: HLI; SNAPI; OCI; DAO/ODA/ODBC. REFIN Internal Reference: REFIN adaptor; security layer; metadata (metadata + localisation data + procedures). Communication: RPC/DCE or XML.]

Page 10

REFIN architecture

[Figure: REFIN metadata generation. Source systems: SAM; EXPRESS; ORACLE; FAME. Components: Metabase Builder; REFIN Particular Metabases; Metabase Converter; REFIN Common Metabase.]

1) Generation of REFIN metadata

2) Mapping to Common Metadata
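
The two steps can be sketched in Python as follows. This is an illustration under stated assumptions, not the REFIN implementation: the native catalogue entries, the field names, `FIELD_MAPS`, and both function names are hypothetical.

```python
# Hypothetical native catalogues; the field names differ per system on purpose.
FAME_CATALOGUE = [{"name": "GDP.A", "desc": "Gross domestic product", "freq": "A"}]
ORACLE_CATALOGUE = [{"TABLE_NAME": "TRADE", "COMMENTS": "External trade", "PERIODICITY": "M"}]

# Assumed per-system mapping from native field names to common field names.
FIELD_MAPS = {
    "FAME": {"name": "code", "desc": "label", "freq": "frequency"},
    "ORACLE": {"TABLE_NAME": "code", "COMMENTS": "label", "PERIODICITY": "frequency"},
}


def build_particular_metabase(system, catalogue):
    """Step 1: record each native entry together with the system it came from."""
    return [{"system": system, "native": dict(entry)} for entry in catalogue]


def convert_to_common(particular):
    """Step 2: rename native fields to the common vocabulary, keeping the location."""
    common = []
    for item in particular:
        mapping = FIELD_MAPS[item["system"]]
        row = {mapping[k]: v for k, v in item["native"].items() if k in mapping}
        row["location"] = item["system"]
        common.append(row)
    return common


particular = (build_particular_metabase("FAME", FAME_CATALOGUE)
              + build_particular_metabase("ORACLE", ORACLE_CATALOGUE))
common_metabase = convert_to_common(particular)
```

Keeping the source system in a `location` field is what would let a common interface find the data later without the user having to know where it lives.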

Page 11

REFIN architecture

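[Figure: REFIN architecture, driver layer. Source systems: SAM; EXPRESS; ORACLE; FAME. Drivers, in the order shown: FAME Driver; SAM Driver; EXPRESS Driver; ORACLE Driver. Native interfaces, in the order shown: HLI; ODBC; SNAPI; OCI. Top level: REFIN API.]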

Page 12

New possibilities provided by REFIN

To build heterogeneous data sets by mixing data from different origins and systems
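
As a rough illustration of mixing data from different origins, the Python sketch below merges series held in two systems into one table keyed by period. The function name `merge_series`, the input layout, and the sample figures are all hypothetical, and series names are assumed to be unique across systems.

```python
def merge_series(sources):
    """Combine {system: {series: {period: value}}} into one table keyed by
    period; an observation that a series lacks is stored as None."""
    lookup = {name: obs for series in sources.values() for name, obs in series.items()}
    periods = sorted({p for obs in lookup.values() for p in obs})
    return {p: {name: obs.get(p) for name, obs in lookup.items()} for p in periods}


sources = {
    "FAME": {"gdp": {"2000": 1.2, "2001": 1.5}},
    "ORACLE": {"trade": {"2001": 7.0}},
}
table = merge_series(sources)
```

Each row of `table` then holds values from both systems side by side, with `None` marking a period that one of the series does not cover.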

Page 13

Eurostat Dissemination Portal (Site 3)

[Figure: ESTAT Portal Platform. Layers: Presentation Layer; Business Layer; Back-office Layer. Users: Professional user; Internet user; User Groups (EC, Journalists, Students, Citizens, ...). Front end: Internet; Portal Web Server; Web Cache; LDAP; XML/XSL; JSP. Content: Publications; Datasets; Dedicated sections; virtual Publi/Datasets (URLs) + metadata; Statistical Metadata. Application Server functions: Quick/Adv. Search; Subscription; Alert/Info push; Content Download; Content Import; Print; E-commerce. Integration: Application Integration; Local DB/File server; WSDL/SOAP; Service call; Web services; API. Back-office systems: NewCronos (Num. Data + metadata); SUITE; Datasets (fixed); EVA/EVALight; Datasets (open); Open Datasets; Comext DB (Ref. Data + statistical metadata); Comext Client; Comext Server; RAMON; CODED; CIRCA; STATPUB; DOUCEUR; EU-Bookshop (OPOCE); EU-DOR; EU off. Publi.]

Page 14

Site 3 Metadata

Site3 Attribute | Description | Dublin | Mandatory | Domain
product_code | Unique identifier of the content object | | | string
ISBN_ISSN | Official ISBN or ISSN publication code | | | string
author | Author's name(s) of the content object | | | string
responsible_unit | Identification of Unit responsible for the availability of the content object | | | LOV
coeditor | Name of co-editing organisation, if any | | | string
creator | Name of user who uploaded the content object | | | string
approved_by | Name of person who approved content object upload/creation on the Website | | | string
current_version | Current version of the content object | | | string
release_date | Issue date of content object by authoring unit | | | date
creation_date | Date of creation/upload of the content object | | | date
start_effectivity_date | Date and time at which the content object becomes visible on the Website | | | full date
end_effectivity_date | Date and time at which the content object becomes invisible on the Website | | | full date
expiration_date | Date at which the content object must be deleted/purged from the Website | | | date
theme | Theme name of content object | | | LOV
collection | Collection name of content object | | | LOV
language | Default language of the content object | | | LOV
other_languages | List of other languages in which a version of the content object is available | | | LOV

Page 15

Site 3 Metadata

Site3 Attribute | Description | Dublin | Mandatory | Domain
table_of_contents | Name of file containing the table of contents | | | string
title | Official title of content object | | | string
summary | Content object summary or official abstract | | | string
keywords | Unordered list of keywords (maximum 10) | | | string
freetext | Free text to add comments if needed. Not visible on the Website | | | string
graphs | Indicate if there is any graph attached to the content object | | | Boolean
tables | Indicate if there is any data table in the content object | | | Boolean
maps | Indicate if there is any map attached to the content object | | | Boolean
cover_image | Name of file containing data for the cover image, if any | | | string
filename_url | URL or file name of the content object (if not physically stored on the Website) | | | string
related_products | Name(s) of related products/datasets | | | string
type | Type of content object (publication, dataset, metadata, link) | | | LOV
support_format | Medium type (electronic or paper) | | | LOV
other_formats | List of other available mediums | | | LOV
layout_size | Size of publication format (e.g. A4, A5…) | | | string
page_nb | Number of pages of the content object | | | number
table_number | Dataset identifier (logical name) | | | string
price | Selling price of the content object in Euro | | | number
out_of_stock | Indicate if publication is out of stock | | | Boolean
update_frequency | Update frequency (e.g. daily, weekly, monthly, quarterly, yearly, biannual) | | | LOV
coverage | Time period covered by publication or dataset | | | string
status | Content object status (visible, embargoed…) | | | LOV
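
The domains in the attribute list lend themselves to a simple record check, sketched below in Python. The sketch covers only a handful of the Site 3 attributes and makes assumptions throughout: `DOMAINS` and `validate` are hypothetical names, "number" is narrowed to `int` for `page_nb`, and the lists of values contain only what the descriptions spell out. Which attributes are mandatory is not assumed.

```python
from datetime import date

# Subset of the Site 3 attributes; a set models an LOV (list of values),
# a Python type models the string/date/number/Boolean domains.
DOMAINS = {
    "product_code": str,
    "title": str,
    "release_date": date,
    "page_nb": int,          # assumption: page counts are whole numbers
    "out_of_stock": bool,
    "type": {"publication", "dataset", "metadata", "link"},
    "support_format": {"electronic", "paper"},
}


def validate(record):
    """Return a list of problems found in a metadata record (empty if none)."""
    problems = []
    for name, value in record.items():
        domain = DOMAINS.get(name)
        if domain is None:
            problems.append(f"{name}: unknown attribute")
        elif isinstance(domain, set):
            if value not in domain:
                problems.append(f"{name}: {value!r} not in list of values")
        elif not isinstance(value, domain):
            problems.append(f"{name}: expected {domain.__name__}")
    return problems
```

For instance, `validate({"type": "poster"})` reports that the value is not in the list of values, while a record whose fields all match their domains yields an empty list.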

Page 16

Conclusion

• Importance of linking data and metadata

• Importance of having an integrated metadata environment

• Clear distinction between:
– Statistical metadata
– IT metadata
– Dissemination metadata