News from the Publications Office
Norbert Hohn
Publications Office
Eurolib Plenary Meeting, Lisbon, 19-20 May 2011
News about…
• Virtua OPac
• EUBookshop
• Cellar
• Eurovoc
• Metadata Registry (MDR)
News from the cataloguing service
virtua : an off-the-shelf cataloguing tool for OP
Agenda
• Project background
• Going into production
• Challenges and benefits
• What next?
virtua : project background
What were we looking for?
An off-the-shelf ILMS – cataloguing module and OPAC module
• To enable cataloguing to be done in-house
A web OPAC
• To allow external users to search and download OP bibliographical records (replacement for LIBCO)
virtua: project background
25/02/2009 Launch of Call for Tender AO 10021 for an integrated library management system
13/07/2009 Award of contract to VTLS Europe, S.L. (virtua)
01/09/2009 Kick-off meeting
27/04/2010 Initial projected start date
14/12/2010 Start of production
virtua: project background
What caused the delays?
Requirement to communicate with several proprietary systems
Complex migration scenario
Two additional projects required in order to go live:
• Codes
• Punctuation
virtua: codes project
Example notice from virtua with codes
041 0 $a eng $1 EN
044 $c eu
084 $a M11 $2 LU-LuOPE
245 1 0 $a CEMP, the creation of European management practice : $b final report.
260 $a {LUXB} : $b OPL, $c 2004.
300 $a III, 127 NPAG : $b NFIG, NTAB ; $c A4 $d BR.
440 $a EUR_SER_C ; $v 20968, $x 1018-5593
504 $a UA_BIB : NPAG 90-97.
540 $a REPRO1.
650 7 $a 003656. $2 EUROVOC
…
710 2 $a CEU. $b RTD.
773 1 8 $t EUR_SER_C $q 2004, NPER 20968
…
910 $a GR
920 $a 702
Codes and their translations
• M11 = Theme (Social Sciences Research)
• {LUXB} = Luxembourg
• OPL = Publications Office
• NPAG = p.
• NFIG = ill.
• NTAB = tab.
• BR = softcover
• UA_BIB = Bibl.
• REPRO1 = Reproduction is authorised provided the source is acknowledged
• 003656 = Community research policy
• CEU = European Commission
• RTD = Directorate-General for Research
• EUR_SER_C = EUR. EU socio-economic research
• NPER = No
• GR = Free
• 702 = Specialised
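The translation of codes into display values can be sketched in a few lines of Python. This is a minimal illustration, not the virtua implementation: the lookup table contains only some of the codes listed above, and the names `CODE_TABLE`, `TOKEN` and `decode_field` are assumptions made for this example.

```python
import re

# English translations for a few of the codes listed above.
CODE_TABLE = {
    "{LUXB}": "Luxembourg",
    "OPL": "Publications Office",
    "NPAG": "p.",
    "NFIG": "ill.",
    "NTAB": "tab.",
    "BR": "softcover",
    "UA_BIB": "Bibl.",
}

# A token is a run of characters containing no whitespace and no punctuation mark.
TOKEN = re.compile(r"[^\s,;:.]+")


def decode_field(value: str, table: dict[str, str] = CODE_TABLE) -> str:
    """Replace every token that is a known code; leave all other text untouched."""
    return TOKEN.sub(lambda m: table.get(m.group(0), m.group(0)), value)


print(decode_field("$a {LUXB} : $b OPL, $c 2004."))
print(decode_field("$a III, 127 NPAG : $b NFIG, NTAB ; $c A4 $d BR."))
```

Keeping the table separate from the function means that a table per language could be swapped in without touching the record itself, which is the stated reason for storing codes rather than text.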
virtua: punctuation project
Example notice from virtua with automatically added punctuation
…
245 1 0 $a CEMP, the creation of European management practice : $b final report.
260 $a {LUXB} : $b OPL, $c 2004.
300 $a III, 127 NPAG : $b NFIG, NTAB ; $c A4 $d BR.
440 $a EUR_SER_C ; $v 20968, $x 1018-5593
499 $a Project SOE1-CT97-1072
504 $a UA_BIB : NPAG 90-97.
…
540 $a REPRO1.
650 7 $a 003656. $2 EUROVOC
…
700 1 $a Engwall, Lars, $e ED.
710 2 $a CEU. $b RTD.
773 1 8 $t EUR_SER_C $q 2004, NPER 20968
…
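How punctuation might be added automatically can be sketched as follows. The rule table is inferred only from fields 260 and 300 of the example notice and is an assumption for illustration; the names `PUNCTUATION_RULES` and `punctuate` are hypothetical and do not describe how virtua does this.

```python
# Punctuation placed BEFORE a subfield, keyed by (tag, subfield code).
# Inferred from the example notice; not an exhaustive rule set.
PUNCTUATION_RULES = {
    ("260", "b"): " :",
    ("260", "c"): ",",
    ("300", "b"): " :",
    ("300", "c"): " ;",
}


def punctuate(tag: str, subfields: list[tuple[str, str]]) -> str:
    """Join subfields, adding the mark that precedes each one and a final full stop."""
    parts = []
    for index, (code, value) in enumerate(subfields):
        if index > 0:
            # Attach the separator to the end of the previous subfield.
            parts[-1] += PUNCTUATION_RULES.get((tag, code), "")
        parts.append(f"${code} {value}")
    text = " ".join(parts)
    return text if text.endswith(".") else text + "."


print(punctuate("260", [("a", "{LUXB}"), ("b", "OPL"), ("c", "2004")]))
```

With rules of this kind the cataloguer enters only the subfield values, and the separators are generated consistently for every record.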
virtua: going into production
Migration of >200 000 records from PROCATX (OP’s database for legal and general publications metadata) to virtua
Re-import of these records from virtua to PROCATX (synchronisation of both systems)
Parallel running with external cataloguing contractor until 18.01.2011
virtua: challenges and benefits
Challenges:
• Learning a new system
• Creating bibliographical records (as opposed to controlling them)
• Adapting our workflows
• Indexing records using EUROVOC
Benefits:
• Reduced time delays for cataloguing publications (3 days reduced to 24 hours)
• Automated validation checks to ensure quality and consistency
• Autonomy, enabling rapid intervention in records when requested
• And not least, increased team spirit
virtua: what next?
• Opening of OPac to current LIBCO users – March 2011
  – Possibility for users to export notices in MARC, CSV and EndNote
• Deep-linking to EUBookshop of all records held in Virtua (MARC21 field 856)
• Production of prepublication records
• Automatic activation of DOIs via an export from virtua
  – Reduction of delays from the moment a publication is on EUB to activation of the DOI
OPac - the OP online public access catalogue
• Out-of-the-box OPAC of Virtua (Chamo): http://opac.publications.europa.eu/
• The interface does not require a specific login, but we do not publicise it and give the address only to 'approved' users.
• Lets users discover materials quickly, using familiar search methods such as Quick Search and faceted result links.
• Refining a search is as easy as picking a facet from a list or typing additional terms in the search box and letting OPac add them to the original search string.
• Advanced search gives users the advantage of applying multiple filters simultaneously.
• Users can export references to EU general publications in formats designed specifically for the library world (e.g. MARC21) as well as in EndNote or CSV format.
OPac - the OP online public access catalogue
The tabs of the menu bar
• Login: for administrators only.
• Heading: to search by author, subject, title, and PUB_ID/workflow (catalogue number).
• Cart: to store all records selected by the user and to export them.
• Clear session: resets all searches done during the current session, empties the cart and returns the user to the first page.
Caveat: one peculiarity of the OPac service.
OP uses a system of codes in records (e.g. 260 $a {LUXB} :) in order to produce each record in the language of the publication catalogued. Although the facets display the translated values of these codes for the end-user, the MARC records themselves are displayed on the screen still coded. However, when adding the records to the cart and then downloading them (by selecting 'Export records to MARC'), the codes are automatically translated and you will receive decoded notices in the resulting file (e.g. 260 $a Luxembourg :) for import into your system.
Feedback
As this is a new service, we welcome any feedback from our users, including ways in which we can improve it. If you need any further help or would like to propose any improvements, please contact our team using the following address: [email protected]
Deep-linking to EUbookshop of all records held in Virtua (MARC21 field 856)
Purpose:
• Only records since February 2011 systematically have a deep-link to EUBookshop.
• By adding a deep-link to the bibliographical records for General Publications, anyone using these records will automatically be able to redirect their end-users to the Publication Details page on EUB, where the user can order or download the publication they are looking for, even if the library/information centre displaying the record does not hold a copy of the publication itself.
Actions:
• Retrospectively adding a deep-link in field 856 to all existing bibliographical records (250 000). Multiple assignment is possible, i.e. in addition to the DOI (link to resolver).
• Updating the import workflows into virtua so that all new records are given this deep-link by default.
Proposed date of putting into production: June 2011
Customers:
• Current LIBCO clients
• Pilot project with the British Library, which would result in some 50 000 records being made available in the UK through a syndication of libraries
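The retrospective action can be pictured with a small Python sketch. It is an illustration under stated assumptions: records are modelled as plain dictionaries rather than real MARC21 files, the URL template, catalogue number and DOI are placeholders (the actual EU Bookshop deep-link format is not given here), and `add_deep_link` is a hypothetical helper. The only MARC21 facts relied on are that field 856 is repeatable and that subfield $u carries the URI.

```python
# Placeholder URL pattern: the real EU Bookshop deep-link format is not given in the slides.
DEEP_LINK_TEMPLATE = "https://example.org/eubookshop/{catalogue_number}"


def add_deep_link(record: dict, catalogue_number: str) -> dict:
    """Append an 856 field with the deep link unless the record already carries it."""
    url = DEEP_LINK_TEMPLATE.format(catalogue_number=catalogue_number)
    fields = record.setdefault("fields", [])
    already_linked = any(
        f["tag"] == "856" and f["subfields"].get("u") == url for f in fields
    )
    if not already_linked:
        # Field 856 is repeatable, so an existing DOI link stays alongside the new one.
        fields.append({"tag": "856", "ind1": "4", "ind2": "0", "subfields": {"u": url}})
    return record


# A record that already has a (placeholder) DOI link in field 856.
record = {
    "fields": [
        {"tag": "856", "ind1": "4", "ind2": "0",
         "subfields": {"u": "https://doi.org/10.0000/example"}},
    ]
}
add_deep_link(record, "XX-00-00-000-EN-C")
add_deep_link(record, "XX-00-00-000-EN-C")  # second call changes nothing
print(len(record["fields"]))
```

Making the helper idempotent matters for a batch run over all existing records, because the same job can then be repeated safely after an interruption.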
Pre-publication records
• We plan to create a preliminary record (prenotice) before the publication is finalised
• Eurolib members might be interested in being alerted to new publications
Automatic activation of DOIs via an export from virtua
• Automatic mapping from MARC21 to ONIX for DOI
• Reduction of delays from the moment a publication is on EUB to activation of the DOI
News from the EUBookshop
• Metadata added to the publication detail page: target audience and EuroVoc descriptors. These keywords are browsable.
• New "discover" section: a menu through which users can access thematic collections of publications that cannot be easily retrieved by site search or browsing. The compilation is often informed by frequently searched terms, such as "map" or "comics".
• A "just published" section: recently published titles.
News from the CELLAR
Common Access to EU Information
Vision
To make available at a single place all metadata and digital content managed by the Publications Office in a harmonised and standardised way in order:
• To guarantee citizens better access to the law and publications of the European Union;
• To encourage and facilitate reuse of content and metadata by professionals and experts;
• To preserve content and metadata, and access to them, over time.
Common access to EU information
Present: silos = independent solutions
[Diagram labels: Authors; Production; Dissemination through specialised portals (EUR-Lex, TED, EU Bookshop, CORDIS); Citizens/Professionals; 24/7]
Future: harmonised architecture = common and shared solutions
[Diagram labels: Authors; Production; Dissemination through a common portal and specialised portals; Citizens/Professionals; 24/7]
Target architecture
[Diagram. Layers: dissemination layer, data layer, definition layer. Data flows: official publications, tendering documents, general publications, CORDIS, external sources (Court of Justice…). Other labels: Production, Post-Production, Validation, Reference, Publishing, Archive, Long-term preservation.]
CELLAR – Functional architecture 1/3
Reception, technical validation and storage of content and metadata.
CELLAR – Functional architecture 2/3
Repository models (CCR and CMR), business rules (for uploading, archiving and dissemination), transformation rules, EuroVoc dissemination, authority tables including translations.
CELLAR – Functional architecture 3/3
Access to and provision of content and metadata in the requested format and/or presentation.
CELLAR – Based on standards
• METS
• FRBR
• OAIS reference model
• XML
CELLAR – Web 3.0, semantic technology
• RDF
• SKOS
• OWL
• SPARQL endpoint
Digital archive of the EU: content
• Complete collection of EU legal documents, including
  – Treaties
  – Official Journal
  – Case-law
  – Preparatory acts
  – Consolidated acts
  – …
• General publications
• Research reports
• Merger taskforce decisions
CELLAR – A service enabler
• On-line access: provide on-line access through the Internet portals of the Publications Office.
• Automated access: provide suitable interfaces for access by automated agents.
• External indexing: enable indexing by Internet search engines.
• Notification: provide configurable notification services (RSS feeds…).
• Downloading: support sporadic and regular downloading of resources (subscription). Regular downloading should be configurable.
• Strategic formats: PDF, in particular PDF/A-1a and PDF/A-1b; XML; TIFF.
• Specific formats: provide formats that are not natively available in the CELLAR (LegisWrite, ONIX notices…), i.e. transformation services.
• Deep linking: enable external referencing of resources and guarantee persistence of links over time.
CELLAR – ROADMAP
• 2010/2011: development (ongoing)
• 2011: data migration and upload (ongoing)
• 2012: online (planned)
News from Eurovoc
EuroVoc – Next release: 4.4
• Next release in summer 2011 (EuroVoc 4.4)
• Update linked to the new “Lisbon Treaty”
  – EC → EU
  – European Community → European Union
• You can contribute via the website
• Permanent URI and ID for thesaurus terms and concepts (LOD, Linked Open Data)
• No deletion of concepts; instead they are marked as
  – obsolete (use instead)
  – deprecated (moved as non-preferred term of a new concept)
EuroVoc – TAE Project – Purpose
• TAE = Thesaurus Alignment Environment, an initiative of the Publications Office
• Mapping = matching: create semantic correspondences between concepts of two thesauri
• Objective: map EuroVoc to
  – ETT: European Vocational Training Thesaurus (Cedefop)
  – GEMET: General Multilingual Environmental Thesaurus (European Environment Agency)
  – Directory of European Legislation in force (EUR-Lex)
[Diagram labels: EuroVoc 4.2, Taxonomy EUB, ETT]
EuroVoc – TAE Project – Approach
Project participants
• Mondeca (Paris): alignment tools
• Inria (Grenoble): matching algorithms
• Publications Office: reviewer and validator
When? May 2010 - May 2011
How?
• Using advanced semantic technologies
• An interface that makes it possible to:
  – Review matches
  – Import/export any vocabulary in SKOS (Simple Knowledge Organization System)/RDF
  – Import any matching algorithm
  – Import/export any mapping results
EuroVoc – TAE Project – Examples of automated alignments
Types of correspondences generated by algorithms (T1 = GEMET, T2 = EuroVoc):
• ExactMatch: concept T1 = concept T2
  – T1 acid rain exact match T2 acid rain
• BroadMatch: concept T1 has a generic concept in T2
  – T1 animal genetics broad match T2 genetics
• NarrowMatch: concept T1 has a specific concept in T2
  – T1 mammal narrow match T2 wild mammal
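The three correspondence types can be modelled with a small data structure. The sketch below is an illustration only: the class name `Correspondence` and the label-based representation are assumptions (TAE works with SKOS concepts, not bare strings). It relies on the SKOS definitions in which broadMatch and narrowMatch are inverses of each other and exactMatch is symmetric.

```python
from dataclasses import dataclass

# SKOS defines broadMatch and narrowMatch as inverses; exactMatch is symmetric.
INVERSE = {
    "exactMatch": "exactMatch",
    "broadMatch": "narrowMatch",
    "narrowMatch": "broadMatch",
}


@dataclass(frozen=True)
class Correspondence:
    source: str    # concept label in thesaurus T1
    relation: str  # one of the keys of INVERSE
    target: str    # concept label in thesaurus T2

    def inverted(self) -> "Correspondence":
        """Return the same correspondence read from T2 to T1."""
        return Correspondence(self.target, INVERSE[self.relation], self.source)


examples = [
    Correspondence("acid rain", "exactMatch", "acid rain"),
    Correspondence("animal genetics", "broadMatch", "genetics"),
    Correspondence("mammal", "narrowMatch", "wild mammal"),
]
for c in examples:
    print(c, "<->", c.inverted())
```

Storing one direction and deriving the other avoids keeping two mapping sets in step when a reviewer corrects a match.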
EuroVoc – TAE Project – Practical use (overview)
Indexing
• Detailed and enriched indexing
• Automatic indexing and re-indexing
• Double annotation
Retrieving - semantic extension
• Integration of results into search engines
• Facilitate users’ searches: “Did you mean …?”
• Refinement of the search: extend or narrow the search results
Results stored in CELLAR
• A unique storage and dissemination platform of the PO to access European law and publications
• SKOS web services and a SPARQL endpoint for accessing and querying the mapping results
EuroVoc – TAE Project – Practical use: Help with indexing
Annotation of a document by indexing with a specialised thesaurus
• “whaling” is not represented in EuroVoc, but GEMET contains “whaling”
Example in EUR-Lex
EuroVoc – TAE Project – Practical use: Help with indexing
Correspondences (GEMET – EuroVoc) proposed in TAE
• Compound mapping: “whaling” exactMatch “whale” AND “hunting regulation”
EuroVoc – TAE Project – Practical use: Help with information retrieval
Search engines: “Did you mean …?” Automatic query expansion or restriction
Search for: Whale
• Did you mean …? Whaling
  – Restrict the search results towards a more specific concept in the target thesaurus
• Whale or Marine mammal
  – Expand the search results towards a more generic concept in the source thesaurus
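A “Did you mean …?” helper built on alignment results could look roughly like this. It is a minimal sketch with toy data taken from the whale example; the dictionaries `NARROWER` and `BROADER` and the function `suggest` are hypothetical names, and the relations are illustrative rather than actual TAE output.

```python
# Toy alignment data; the relations are illustrative, not actual TAE output.
NARROWER = {"whale": ["whaling"]}
BROADER = {"whale": ["marine mammal"]}


def suggest(term: str) -> dict[str, list[str]]:
    """Propose restricted and expanded queries for a search term."""
    key = term.lower()
    return {
        # Restriction: offer each more specific concept as a replacement query.
        "restrict": list(NARROWER.get(key, [])),
        # Expansion: OR the original term with each more generic concept.
        "expand": [f"{key} OR {broader}" for broader in BROADER.get(key, [])],
    }


print(suggest("Whale"))
```

A term with no mapping simply yields two empty lists, so the search engine can fall back to the original query unchanged.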
EuroVoc – Future actions
MetaThesaurus Working Group
Main purpose
• Set up a network of specialised, multilingual thesauri around EuroVoc
• Meeting foreseen in June 2011
Advantages
• Use the same standards and formats
• Delegate the maintenance of specific domains
• Share candidates and translations
Participants:
• EU institutions, European agencies
• International institutions (FAO, Unesco)
• Other multilingual thesauri (EINIRAS)
• First approach made during the EuroVoc Conference (Luxembourg, November 2010)
EuroVoc – Refresher of its benefits
Enterprise Content Categorization
• Developing from scratch
  – Building a taxonomy or controlled vocabulary is time-consuming
• Use “starter” metadata to speed up development
  – Import external metadata, taxonomies or controlled vocabularies into your ECM system
  – Avoid duplicated effort
  – Minimise the cost of adding and managing metadata
EuroVoc = a building block of your ECM application
• A high-level controlled vocabulary
• Cost benefit: maintained by the Publications Office
• Offers different levels of specificity (TAE, thesauri collaboration network)
EuroVoc within the OP Cellar
In the repository will be stored:
• EuroVoc, the thesaurus
• The mapping or alignment results
On the Cellar service layer, EuroVoc will be implemented as web services and a SPARQL endpoint, e.g. for:
• Linked Open Data
• Crosswalk between EuroVoc and Semantic Web applications
• Dereferenceable URIs
Examples:
• Search for a term (expression or URI) and retrieve its alignments
• Search for a term (expression or URI) and retrieve its relations (Broader Term, Specific Term, Related Terms)
• Search for a microthesaurus and retrieve all its terms
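The second example, retrieving the relations of a term, might translate into a SPARQL query along these lines. This sketch only builds the query text and does not contact any endpoint, since no endpoint address is given here. It assumes the thesaurus is published with the standard SKOS properties `skos:prefLabel`, `skos:broader`, `skos:narrower` and `skos:related`; the function name `relations_query` is hypothetical.

```python
SKOS_PREFIX = "PREFIX skos: <http://www.w3.org/2004/02/skos/core#>"


def relations_query(label: str, lang: str = "en") -> str:
    """Build a SPARQL query returning broader, narrower and related terms for a preferred label."""
    # Escape backslashes and double quotes so the label is a valid SPARQL string literal.
    escaped = label.replace("\\", "\\\\").replace('"', '\\"')
    return f"""{SKOS_PREFIX}
SELECT ?relation ?relatedLabel WHERE {{
  ?concept skos:prefLabel "{escaped}"@{lang} .
  VALUES ?relation {{ skos:broader skos:narrower skos:related }}
  ?concept ?relation ?other .
  ?other skos:prefLabel ?relatedLabel .
  FILTER (lang(?relatedLabel) = "{lang}")
}}"""


print(relations_query("whale"))
```

The same pattern, with the SKOS mapping properties in place of the hierarchical ones, would cover the first example (retrieving alignments).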
EuroVoc – Licensing policy
Free of charge (4 years)
• Email: [email protected]
• Information on the website under “legal notice”
• Login and password to download the SKOS or XML
• Alert once a new release is available
405 licences (64 for 2010, 64 for 2009)
Types of licence
• Indexing
  – Text mining and extraction, automatic indexing and categorisation
  – Library information system, knowledge management and ECM
• Translation (Albanian)
• Academic, project, research
  – Semantic technologies
  – Term matching
Eurovoc mappings
News from the Metadata Registry
What is the Metadata Registry (MDR)?
• A central reference point for the registration and maintenance of metadata definitions and related authority data used by
  – the interinstitutional systems supporting the decision-making process
  – the production and dissemination systems of the Publications Office
• A framework for the harmonisation and standardisation of the metadata used in this context
  – Documentation
  – Organisation
  – Procedures
• Provides the reference metadata for reuse and validation purposes to internal and external clients/client systems in human- and machine-readable format
Metadata Register – Scope
Core metadata
• Limited set of metadata, which needs to be adopted by every institution to enable interoperability, in particular in the context of the decision-making process
• Common part of the Metadata register
• Management on interinstitutional level (IMMC)
Specific metadata
• Metadata dedicated to the specific internal needs of each institution
• Out of scope for the common part of the Metadata register
• A private workspace inside the Metadata register could be provided to facilitate management by the owner
Metadata Registry – Expected benefits
• Central reference location for metadata definitions and authority data
• Reference source for consultation/validation purposes
• Stimulates reuse of metadata and increases interoperability
• Framework for harmonisation and standardisation
• Platform for collaboration and knowledge exchange in the metadata domain on interinstitutional level
Metadata Register - Architecture
Back-end application
• Maintenance of metadata definitions and authority data
• Access limited to a restricted number of expert users
• Based on the same tool as used for the Eurovoc back-end (ITM)
• Possibility to create individual workspaces
Registration workflow (JIRA)
Metadata Registry website (front-end)
• Browse MDR content (read access)
• Detailed information about registered items
• Possibility to submit proposals for registration/feedback (e.g. by Eurolib members)
Metadata Register – Workflow overview
Metadata Registry – Organisation (proposal) 1/2
Publications Office level
• Management of changes in MDR by the Metadata Register Team (MRT)
Interinstitutional level
• Proposals for registration by the Interinstitutional Metadata Maintenance Committee (IMMC) (2 members per institution)
• Submission of relevant proposals by MRT to IMMC for approval
• Technical support/evaluation by MRT on request
• Management of changes in MDR by MRT
• Supervision by the Interinstitutional Metadata Steering Committee (IMSC), composed of the alternate members (suppléants) of the management board of the Publications Office
Metadata Registry – Organisation 2/2
Common Authority Tables (CAT) – April 2011

Common Authority Table                          Source
Languages (ISO 639/1, 639/2B|T, 639/3)          ISO
Countries (ISO 3166/1-α2 and α3, 3166/3)        ISO
NTU (incl. NUTS and ISO 3166-2)                 ISO + UNO + Eurostat
Currencies (ISO 4217)                           ISO
Corporate Bodies                                Various
Roles                                           LC + EurLex + Prelex
Places (locations, towns)                       UN-LOCODE
Resource format (incl. dimensions)              ONIX + IANA
Resource type (categories of resources)         Internal sources
Target Audience                                 ONIX
Procedures                                      PreLex
Events                                          PreLex
Etc.

Status legend: stable version / in progress / to be started
Metadata Registry - Roadmap
• Project kick-off: 20/12/2010
• Phase 1: Implementation of back-end application (management of ontology, authority tables, export). Target date: June 2011
• Phase 2: Implementation of front-end application. Target date: August 2011
MDR project contacts
Metadata Registry team: Holger BAGOLA, Corinne FRAPPART, Madeleine KISS, Martin SCHERBAUM, Willem VAN GEMERT
Contact: [email protected]
Thank you for your attention!
We appreciate your questions and suggestions.