NOMENCLA: a server to manage, display and disseminate metadata by Emile Bruneau (INSEE – France) Joint UNECE/Eurostat/OECD work session on statistical metadata (METIS) Geneva, 9-11 February 2004


NOMENCLA: a server to manage, display and disseminate metadata

by Emile Bruneau (INSEE – France)

Joint UNECE/Eurostat/OECD work session on statistical metadata (METIS)

Geneva, 9-11 February 2004


A set (or a subset) of targets

• To build up a reference DB
– classifications, then tree structures and lists, plus tables
• To consistently manage the metadata
– precise process under control
• To link them to statistical data (possibly)
• To view any data and related ones
– definitions or contents, links, any attributes
• To extract precisely sized subsets
• To download structured files
– XML files (uploading flagged DBs), but also text files


NOMENCLA

Consequences…


Management DB / Dissemination DB

Screen

Messages

Management functions:
- creation, update, stopping, deletion
- of tree structures, lists and tables
- online or by batch files
- controls, help for consistency

Working DB / Reference DB: same data model

Dissemination functions:
- displaying, extraction, navigation
- of photos or updates
- of trees, lists, tables

Exports in text or XML:
- to view or for publishing
- to directly upload DBs

NLP functions:
- dictionaries, semantic concepts, indexes
- online searching and coding
- batch coding and help to build semantically based tables

download


The core

Datamodel (simplified)


[Diagram: entities Family, Sub-family, Tree structure, Level, Item, Relation, Table, Heading, Description, Comment, Definitions, Explanatory notes]


Navigation in the DB


[Diagram: a central Item with links labelled Before, After, Above, Below, Same concept and Other concept; example terms: « active », « employee », « activity », « occupied »]


Navigation in the DB


[Diagram: a central Item with links labelled Before, After, Above, Below, Same concept and Other concept; example terms: « active », « employee », « occupied »]

Result: a network of metadata


Particularities

• No repeated textual data
– descriptions, headings, explanatory notes
– recorded once, with dated links
• Every entity has its own validity period
• Relations are « weighted »
– a lot of options but no convergence
– so, two possible ways: manual or automatic weighting
– present choice: automatic
  • based on the number of relations
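One way to picture "recorded once, with dated links" is a store in which each text exists a single time and items point to it through links that carry a validity period. The sketch below is only an illustration under that reading: the class `DatedLink`, the function `heading_at`, the item codes `X1` and `X2`, the text and the dates are all hypothetical and are not taken from NOMENCLA.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class DatedLink:
    """Hypothetical link between an item and a text, with its own validity period."""
    item_code: str
    text_id: int
    valid_from: date
    valid_to: Optional[date] = None  # None means the link is still valid

    def covers(self, day: date) -> bool:
        # The period is treated as half-open: valid_from included, valid_to excluded.
        return self.valid_from <= day and (self.valid_to is None or day < self.valid_to)


# Each text is stored exactly once (hypothetical content).
texts = {1: "Example heading"}

# Two items reuse the same text over different periods (hypothetical data).
links = [
    DatedLink("X1", 1, date(1993, 1, 1), date(2003, 1, 1)),
    DatedLink("X2", 1, date(2003, 1, 1)),
]


def heading_at(item_code: str, day: date) -> Optional[str]:
    """Return the text linked to item_code on the given day, or None."""
    for link in links:
        if link.item_code == item_code and link.covers(day):
            return texts[link.text_id]
    return None


print(heading_at("X1", date(2000, 6, 1)))  # Example heading
print(heading_at("X1", date(2004, 1, 1)))  # None
print(heading_at("X2", date(2004, 1, 1)))  # Example heading
```

Storing the period on the link rather than on the text lets the same wording be reused by successive items without copying it.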


The automatic weighting method for classifications

What are the weights of the indirect table A → D, given the tables A → B, B → C and C → D?

A → B
1   11   0.333   1.000
1   12   0.333   1.000
1   20   0.333   0.500
2   20   1.000   0.500
3   31   0.333   1.000
3   32   0.333   1.000
3   33   0.333   1.000

B → C
11   aa   0.500   1.000
11   bb   0.500   0.500
12   bb   0.500   0.500
12   cc   0.500   1.000
20   dd   1.000   0.333
31   dd   1.000   0.333
32   dd   0.500   0.333
32   ee   0.500   0.500
33   ee   1.000   0.500

C → D
aa   a   1.000   0.333
bb   a   1.000   0.333
cc   a   0.500   0.333
cc   b   0.500   1.000
dd   c   1.000   0.500
ee   c   1.000   0.500

A → B (direct table)

1   11
1   12
1   20
2   20
3   31
3   32
3   33

A → B (calculated weights)

1 11 0.333 1.000

1 12 0.333 1.000

1 20 0.333 0.500

2 20 1.000 0.500

3 31 0.333 1.000

3 32 0.333 1.000

3 33 0.333 1.000
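The figures in the calculated-weights table are consistent with a simple counting rule: the first weight is 1 divided by the number of relations leaving the source item, and the second is 1 divided by the number of relations reaching the target item. The sketch below reproduces the table under that assumption. The rule is inferred from the numbers on the slide rather than stated there, and the function name `weight_table` is hypothetical.

```python
from collections import Counter


def weight_table(pairs):
    """Return rows (source, target, forward_weight, backward_weight).

    Assumed rule, inferred from the slide's figures:
    forward  = 1 / number of relations leaving the source item
    backward = 1 / number of relations reaching the target item
    """
    out_degree = Counter(source for source, _ in pairs)
    in_degree = Counter(target for _, target in pairs)
    return [(s, t, 1 / out_degree[s], 1 / in_degree[t]) for s, t in pairs]


# Direct table A -> B, as listed on the slide.
a_to_b = [
    ("1", "11"), ("1", "12"), ("1", "20"),
    ("2", "20"),
    ("3", "31"), ("3", "32"), ("3", "33"),
]

for source, target, forward, backward in weight_table(a_to_b):
    print(f"{source} {target} {forward:.3f} {backward:.3f}")
# 1 11 0.333 1.000
# 1 12 0.333 1.000
# 1 20 0.333 0.500
# 2 20 1.000 0.500
# 3 31 0.333 1.000
# 3 32 0.333 1.000
# 3 33 0.333 1.000
```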


The system provides, for A → D, the relations as well as the weights:

1   a   0.583   1.000
1   b   0.083   1.000
1   c   0.333   0.083
2   c   1.000   0.083
3   c   1.000   0.833

Same process with manually weighted tables
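The A → D figures can be reproduced by chaining the three tables: for each pair of end items, multiply the weights along every path through the intermediate items and add the products, separately for each of the two weights. The sketch below is an assumption inferred from the slide's numbers, not a description of NOMENCLA's actual code. The function names `weight_table` and `compose` are hypothetical, and the counting rule used to weight the direct tables (1 divided by the number of relations on each side) is likewise inferred.

```python
from collections import Counter, defaultdict


def weight_table(pairs):
    """Map (source, target) to (forward, backward) weights, using the assumed
    counting rule: 1 / relations leaving the source, 1 / relations reaching the target."""
    out_degree = Counter(s for s, _ in pairs)
    in_degree = Counter(t for _, t in pairs)
    return {(s, t): (1 / out_degree[s], 1 / in_degree[t]) for s, t in pairs}


def compose(first, second):
    """Chain two weighted tables, summing the products of weights over every
    shared intermediate item."""
    result = defaultdict(lambda: [0.0, 0.0])
    for (source, middle), (f1, b1) in first.items():
        for (middle2, target), (f2, b2) in second.items():
            if middle == middle2:
                result[(source, target)][0] += f1 * f2
                result[(source, target)][1] += b1 * b2
    return {key: tuple(value) for key, value in result.items()}


# Direct tables as listed on the slide.
a_b = [("1", "11"), ("1", "12"), ("1", "20"), ("2", "20"),
       ("3", "31"), ("3", "32"), ("3", "33")]
b_c = [("11", "aa"), ("11", "bb"), ("12", "bb"), ("12", "cc"), ("20", "dd"),
       ("31", "dd"), ("32", "dd"), ("32", "ee"), ("33", "ee")]
c_d = [("aa", "a"), ("bb", "a"), ("cc", "a"), ("cc", "b"), ("dd", "c"), ("ee", "c")]

a_d = compose(compose(weight_table(a_b), weight_table(b_c)), weight_table(c_d))

for (source, target), (forward, backward) in sorted(a_d.items()):
    print(f"{source} {target} {forward:.3f} {backward:.3f}")
# 1 a 0.583 1.000
# 1 b 0.083 1.000
# 1 c 0.333 0.083
# 2 c 1.000 0.083
# 3 c 1.000 0.833
```

For manually weighted tables, the same `compose` step would apply to weights supplied by hand instead of those returned by `weight_table`.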


Exchanges: the message CLASET

• Developed by EU statisticians and Eurostat
– for tree structures, lists and correlation tables
– with all the detailed attributes they can support
– for photos (state at a date) as well as updates (between two dates)
• Already standardized in EDIFACT format
• Also in XML, SGML and DB
– HTML for browsing
– NOMENCLA choice: XML
• A CLASET toolbox
– to translate into any format
– to extract TXT files
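The difference between a photo (state at a date) and an update (changes between two dates) can be sketched with items that each carry a validity period. The example below is illustrative only: the item data, the function names `photo`, `updates` and `to_xml`, and the XML element names `classification` and `item` are hypothetical placeholders and do not follow the actual CLASET message structure.

```python
import xml.etree.ElementTree as ET
from datetime import date

# (code, heading, valid_from, valid_to); valid_to None means still valid.
# Hypothetical data for illustration.
items = [
    ("A", "First item", date(2000, 1, 1), None),
    ("B", "Second item", date(2000, 1, 1), date(2003, 1, 1)),
    ("C", "Third item", date(2003, 1, 1), None),
]


def photo(day):
    """Items valid on the given day (state at a date)."""
    return [it for it in items if it[2] <= day and (it[3] is None or day < it[3])]


def updates(start, end):
    """Creations and closures that fall after start and up to end."""
    changes = []
    for code, _heading, valid_from, valid_to in items:
        if start < valid_from <= end:
            changes.append(("created", code))
        if valid_to is not None and start < valid_to <= end:
            changes.append(("closed", code))
    return changes


def to_xml(rows):
    """Serialize items with placeholder element names (not the CLASET schema)."""
    root = ET.Element("classification")
    for code, heading, *_ in rows:
        element = ET.SubElement(root, "item", code=code)
        element.text = heading
    return ET.tostring(root, encoding="unicode")


print(to_xml(photo(date(2002, 6, 1))))
# <classification><item code="A">First item</item><item code="B">Second item</item></classification>
print(updates(date(2002, 1, 1), date(2004, 1, 1)))
# [('closed', 'B'), ('created', 'C')]
```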


Thank you for your attention