Upload
melanie-walker
View
214
Download
1
Tags:
Embed Size (px)
Citation preview
IZA Data Service CenterDDI/SDMX Workshop
Wiesbaden, Germany, June 18th 2008
The Data Documentation Initiative (DDI)
Arofan Gregory / Pascal Heus
[email protected] / [email protected]
Open Data Foundation
[email protected]://www.opendatafoundation.org
Content
• Background on metadata and XML
• Metadata and Microdata
• XML and Microdata: the DDI
• DDI 2.0
• DDI 3.0
• DDI 2.0 vs 3.0
• Major stakeholders / initiatives
Metadata / XML
[email protected]://www.opendatafoundation.org
What is metadata?
• Common definition: Data about Data
Unlabeled stuff
Labeled stuff
The bean example is taken from: A Manager’sIntroduction to Adobe eXtensible Metadata Platform, http://www.adobe.com/products/xmp/pdfs/whitepaper.pdf
[email protected]://www.opendatafoundation.org
What is XML?• Today's Universal language on the web• Purpose is to facilitate sharing of structured information
across information systems• XML stands for eXtensible Markup Language
– eXtensibe can be customized– Markup tags, marks, attach attributes to things– Language syntax (grammatical rules)
• HTML (HyperText Markup Language) is a markup language but not extensible! It is also concerned about presentation, not content.
• XML is a text format (not a binary black box)• XML is a also a collection of technologies (built on the
XML language)• It is platform independent and is understood by modern
programming languages (C++, Java, .NET, pHp, perl, etc.)
• It is both machine and human readable
[email protected]://www.opendatafoundation.org
Simple XML example
<catalog> <book isbn=”0385504209”> <title>Da Vinci Code</title> <author>Dan Brown</author> </book> <book isbn=”0553294385” pages=”352”> <title>I, robot</title> <author>Isaac Asimov</author> <language>English</language> </book></catalog>
Elements
Text content
Opening andClosing tags
Attributes
[email protected]://www.opendatafoundation.org
XML Technology overview
StructureDTD
XSchema
TransformXSL, XSLT
XSL-FO
DiscoverRegistriesDatabases
ExchangeWeb Services
SOAPREST
SearchXPath
XQuery
ManageSoftwareXForms
CaptureXML
Document Type Definition (DTD) and XSchema are use to validate an XML document by defining namespaces,
elements, rulesSpecialized software and
database systems can be used to create and edit XML
documents. In the future the XForm standard will be used
Very much like a database system, XML documents can be searched and queried through
the use of XPath oe XQuery.There is no need to create tables,
indexes or define relationships
XML separates the metadata storage from its presentation.
XML documents can be transformed into something else, like HTML, PDF, XML, other) through the use of the
eXtensible Stylesheet Language, XSL
Transformations (XSLT) and XSL Formatting Objects
(XSL-FO)
XML Documents can be sent like regular files but are typically
exchanged between applications through Web Services using the SOAP
and other protocols
XML metadata or data can be published in “smart”
catalogs often referred to as registries than can be used for discovery of information.
[email protected]://www.opendatafoundation.org
What is an XML Schema?• Exchange / sharing / harmonization implies
agreement on structure– We need a specification that describes the structure
and rules Schema• A schema is a set of rules to which an XML
document must conform in order to be considered 'valid' – XML Schema was also designed with the intent that
determination of a document's validity would produce a collection of information adhering to specific data types
– Similar to relational databases structural definition• Many schemas exists for different purposes• Examples
– DDI, SDMX ,Dublin Core, RSS, XHTML, etc.
Metadata, XML and Microdata
[email protected]://www.opendatafoundation.org
What is a survey?• More than just data….
– A complex process to produce data for the purpose of statistical analysis
– Beyond this, a tool to support evidence based policy making and results monitoring
• The data is surrounded by a large body of documentation
• Survey data often come with limited documention
• Note that microdata is intended for experts– Statisticians / researchers– Represents a single point in time and space– Need to be aggregated to produce meaningful results– It is the beginning of the story
[email protected]://www.opendatafoundation.org
What is survey metadata?
• Survey documentation can be broken down into structured metadata and documents– Structured metadata can be captured using XML– Documents can be described in structured metadata
• Example of metadata:– Survey level: Title, country, year, abstract, sampling,
agencies, access policy, etc.– Variable level: filename, label, code, questions,
instructions, derivation, etc.– Related materials: report, questionnaire, papers,
manuals, scripts/programs, photos– Cross-surveys: catalogs, longitudinal, concepts,
comparability, etc.
[email protected]://www.opendatafoundation.org
Importance of survey metadata• Data Quality:
– Usefulness = accessibility + coherence + completeness + relevance + timeliness + …
– Undocumented data is useless– Partially documented data is risky (misuse)
• Data discovery and access• Preservation• Replication standard (Gary King)• Information exchange• Reduce need to access sensitive data• Maintain coherence / linkages across the
complete life cycle (from respondent to policy maker)
• Reuse
[email protected]://www.opendatafoundation.org
The Data Documentation Initiative
• The Data Documentation Initiative is an XML specification to capture structured metadata about “microdata” (broad sense)
• First generation DDI 1.0…2.1 (2000-2008)– focus on single archived instance
• Second generation DDI 3.0 (2008) – focus on life cycle– go beyond the single survey concept– mutli-purpose
[email protected]://www.opendatafoundation.org
DDI Timeline / Status• Pre-DDI 1.0
– 70’s / 80’s OSIRIS Codebook– 1993: IASSIST Codebook Action
Group– 1996 SGML DTD– 1997 DDI XML– 1999 Draft DDI DTD
• 2000 – DDI 1.0– Simple survey– Archival data formats– Microdata only
• 2003 – DDI 2.0– Aggregate data (based on matrix
structure)– Added geographic material to aid
geographic search systems and GIS users
• 2003 - Establishment of DDI Alliance• 2004 – Acceptance of a new DDI
paradigm– Lifecycle model– Shift from the codebook centric /
variable centric model to capturing the lifecycle of data
– Agreement on expanded areas of coverage
• 2005– Presentation of schema structure– Focus on points of metadata creation
and reuse• 2006
– Presentation of first complete 3.0 model
– Internal and public review• 2007
– Vote to move to Candidate Version (CR)
– Establishment of a set of use cases to test application and implementation
– October 3.0 CR2• 2008
– February 3.0 CR3– March 3.0 CR3 update– April 3.0 CR3 final– April 28th 3.0 Approved by DDI Alliance– May 21st DDI 3.0 Officially announced– Initial presentations at IASSIST 2008
• 2009– DDI 3.1 and beyond
DDI 1/2.x
[email protected]://www.opendatafoundation.org
The archive perspective
• Focus on preservation of a survey
• Often see survey as collection of data files accompanied by documentation– Code book centric– report, questionnaire, methodologies, scripts,
etc.
• Result in a static event: the archive
• Maintained by a single agency
• Is typically documentation after the facts
• This is the initial DDI perspective (DDI 2.0)
[email protected]://www.opendatafoundation.org
DDI 2.0 Technical Overview
• Based on a single structure (DTD)• 1 codeBook, 5 sections
– docDscr: describes the DDI document• The preparation of the metadata
– stdyDscr: describes the study• Title, abstract, methodologies, agencies, access policy
– fileDscr: describes each file in the dataset– dataDscr: describes the data in the files
• Variables (name, code, )• Variable groups• Cubes
– othMat: other related materials• Basic document citation
[email protected]://www.opendatafoundation.org
Characteristics of DDI 1.0/2.0
• Focuses on the static object of a codebook• Designed for limited uses
– End user data discovery via the variable or high level study identification (bibliographic)
– Only heavily structured content relates to information used to drive statistical analysis
• Coverage is focused on single study, single data file, simple survey and aggregate data files
• Variable contains majority of information (question, categories, data typing, physical storage information, statistics)
[email protected]://www.opendatafoundation.org
Impact of these limitations
• Treated as an “add on” to the data collection process
• Focus is on the data end product and end users (static)
• Limited tools for creation or exploitation• The Variable must exist before metadata can be
created• Producers hesitant to take up DDI creation
because it is a cost and does not support their development or collection process
[email protected]://www.opendatafoundation.org
DDI 1/2.x Tools
• Nesstar– Nesstar Publisher, Nesstar Server
• IHSN– Microdata Management Toolkit– NADA (online catalog for national data archive)– Archivist / Reviewer Guidelines
• Other tools– SDA, Harvard/MIT Virtual Data Center (Dataverse)– UKDA DExT, ODaF DeXtris– http://tools.ddialliance.org
[email protected]://www.opendatafoundation.org
DDI 2.0 perspective
Producers
Archivists
Users
General Public
Policy Makers
Sponsors
Media/PressAcademic
Business
Government
DDI 2Survey
DDI 2Survey
DDI 2Survey
DDI 2Survey
DDI 2Survey
DDI 2Survey
DDI 2Survey
DDI 2Survey
DDI 2Survey
DDI 2Survey
DDI 2Survey
DDI 2Survey
DDI 2Survey
DDI 2Survey
DDI 3.0
The life cycle
[email protected]://www.opendatafoundation.org
When to capture metadata?
• Metadata must be captured at the time the event occurs!• Documenting after the facts leads to considerable loss of
information• Multiple contributors are typically involved in this process (not
only the archivist)• This is true for producers and researchers
[email protected]://www.opendatafoundation.org
DDI 3.0 and the Survey Life Cycle
• A survey is not a static process: It dynamically evolved across time and involves many agencies/individuals
• DDI 2.x is about archiving, DDI 3.0 across the entire “life cycle”• 3.0 focus on metadata reuse (minimizes redundancies/discrepancies,
support comparison)• Also supports multilingual, grouping, geography, and others• 3.0 is extensible
[email protected]://www.opendatafoundation.org
Requirements for 3.0• Improve and expand the machine-actionable aspects of
the DDI to support programming and software systems• Support CAI instruments through expanded description
of the questionnaire (content and question flow)• Support the description of data series (longitudinal
surveys, panel studies, recurring waves, etc.)• Support comparison, in particular comparison by design
but also comparison-after-the fact (harmonization)• Improve support for describing complex data files
(record and file linkages)• Provide improved support for geographic content to
facilitate linking to geographic files (shape files, boundary files, etc.)
[email protected]://www.opendatafoundation.org
Approach• Shift from the codebook centric model of early versions
of DDI to a lifecycle model, providing metadata support from data study conception through analysis and repurposing of data
• Shift from an XML Data Type Definition (DTD) to an XML Schema model to support the lifecycle model, reuse of content and increased controls to support programming needs
• Redefine a “single DDI instance” to include a “simple instance” similar to DDI 1/2 which covered a single study and “complex instances” covering groups of related studies. Allow a single study description to contain multiple data products (for example, a microdata file and aggregate products created from the same data collection).
• Incorporate the requested functionality in the first published edition
[email protected]://www.opendatafoundation.org
Designing to support registries
• Resource package– structure to publish non-study-specific materials for
reuse
• Extracting specified types of information in to schemes – Universe, Concept, Category, Code, Question,
Instrument, Variable, etc.
• Allowing for either internal or external references– Can include other schemes by reference and select
only desired items
• Providing Comparison Mapping– Target can be external harmonized structure
[email protected]://www.opendatafoundation.org
Technical Overview
• DDI 3 is composed of several schemas– Use only what you need!– Schemas represent modules, sub-modules
(substitutions), reusable, external schemas
• archive• comparative• conceptualcomponent• datacollection• dataset• dcelements• DDIprofile• ddi-xhtml11• ddi-xhtml11-model-1• ddi-xhtml11-modules-1• group• inline_ncube_recordlayout
• instance• logicalproduct• ncube_recordlayout• physicaldataproduct• physicalinstance• proprietary_record_layout (beta)• reusable• simpledc20021212• studyunit• tabular_ncube_recordlayout• xml• set of xml schemas to support xhtml
[email protected]://www.opendatafoundation.org
• Any element that can be referenced is globally uniquely identified– Maintainable (by an agency)– Versionable (can change across time)– Identifiable (within a maintainable scheme)
• Modules– Reflect closely related sets of information similar to the sections
of DDI 1/2.* DTD– Modules can be held as separate XML instances and be
included in a large instance by either inclusion or reference– All modules are maintainable (but not all maintainables are
modules)
Technical Overview
[email protected]://www.opendatafoundation.org
Technical Overview: Maintainable Schemes
(that’s with an ‘e’ not an ‘a’)• Category Scheme• Code Scheme• Concept Scheme• Control Construct Scheme• GeographicStructureScheme• GeographicLocationScheme• InterviewerInstructionScheme• Question Scheme• NCubeScheme• Organization Scheme• Physical Structure Scheme• Record Layout Scheme• Universe Scheme• Variable Scheme
Packages of reusable metadata maintained by a single agency
[email protected]://www.opendatafoundation.org
DDI 3.0 Use Cases• Study design/survey instrumentation• Questionnaire generation/data collection and procesing• Data recoding, aggregation and other processing• Data dissemination/discovery• Archival ingestion/metadata value-add• Question/concept/variable banks• DDI for use within a research project• Capture of metadata regarding data use• Metadata mining for comparison, etc.• Generating instruction packages/presentations
[email protected]://www.opendatafoundation.org
Study Design/Survey Instrumentation
• This use case concerns how DDI 3.0 can support the design of studies and survey instrumentation – Without benefit of a question or concept bank
[email protected]://www.opendatafoundation.org
<DDI 3.0>ConceptsUniverses
<DDI 3.0>ConceptsUniverses
<DDI 3.0>QuestionsFlow Logic
<DDI 3.0>ConceptsUniversesQuestionsFlow Logic
+Drafting/Review/Revision
Final
Drafting/Testing/Revision
Final
Types of Metadata:• Concepts (conceptual module)• Universe (conceptual module)• Questions (datacollection module)• Flow Logic (datacollection module)
As the survey instrument is tested, all revisions and history can be tracked and preserved. This would include question translationand internationalization.
[email protected]://www.opendatafoundation.org
Questionnaire Generation, Data Collection, and Processing
• This use case concerns how DDI 3.0 can support the creation of various types of questionnaires/CAI, and the collection and processing of raw data into microdata.
[email protected]://www.opendatafoundation.org
Paper Questionnaire
<DDI 3.0>ConceptsUniversesQuestionsFlow Logic
Types of Metadata:• Concepts (conceptual module)• Universe (conceptual module)• Questions (datacollection module)• Flow Logic (datacollection module)• Variables (logicalproduct module)• Categories/Codes (logicalproduct module)• Coding (datacollection module)
<DDI 3.0>ConceptsUniversesQuestionsFlow Logic
Final
Online SurveyInstrument
CAIInstrument
<DDI 3.0>VariablesCoding
+
Raw Data
+
<DDI 3.0>Categories
CodesPhysical Data ProductPhysical Data Instance
Microdata
DDI capturesthe content – XMLallows for each application to do its own presentation
[email protected]://www.opendatafoundation.org
Data Recoding, Aggregation, etc.
• This use case concerns how DDI 3.0 can describe recodes, aggregation, and similar types of data processing.
[email protected]://www.opendatafoundation.org
<DDI 3.0>Conceptual
DatacollectionVariables
CategoriesCodes
+
MicrodataMicrodata/Aggregates
<DDI 3.0>Codings
Variables (new)Categories (new)
Codes (new)NCubes
Could be a recode, an aggregation, or other process.
Initial microdata has:• Concepts (conceptual module)• Universes (conceptual module)• Questions (datacollection module)• Flow Logic (datacollection module)• Variables (logicalproduct module)• Coding (datacollection module)• Categories (logicalproduct module)• Codes (logicalproduct module)• Physical Data Product• Physical Data Instance
Recode adds:• More codings (datacollection module)• New variables• New categories• New codes• NCubes (for aggregation)
[email protected]://www.opendatafoundation.org
Data Dissemination/Data Discovery
• This use case concerns how DDI 3.0 can support the discovery and dissemination of data.
[email protected]://www.opendatafoundation.org
Microdata/Aggregates
<DDI 3.0>[Full meta-data set]
Codebooks
+
RegistriesCatalogues
Question/Concept/Variable Banks
Databases,repositories
Websites
Research Data Centers
Data-Specific Info Access
Systems
<DDI 3.0>Can add archivalevents meta-data
Rich metadata supports auto-generation of websites and other delivery formats
[email protected]://www.opendatafoundation.org
Archival Ingestion and Metadata Value-Add
• This use case concerns how DDI 3.0 can support the ingest and migration functions of data archives and data libraries.
[email protected]://www.opendatafoundation.org
Microdata/Aggregates
<DDI 3.0>[Full meta-data set]
(?)
+ Data ArchiveData Library
Ingest Processing
<DDI 3.0>[Full or
additional metadata]
Archival events
Supports automation of processing if good
DDI metadata is captured upstream
Provides good format &foundation for value-added metadata by archive
Provides a neutral format for data
migration as analysis packages are
versioned
[email protected]://www.opendatafoundation.org
Question/Concept/Variable Banks
• This use case describes how DDI 3.0 can support question, concept, and variable banks. These are often termed “registries” or “metadata repositories” because they contain only metadata – links to the data are optional, but provide implied comparability. The focus is metadata reuse.
[email protected]://www.opendatafoundation.org
<DDI 3.0>QuestionsFlow Logic
Codings
<DDI 3.0>Variables
CategoriesCodes
<DDI 3.0>Concepts
Question Bank
VariableBank
ConceptBank
<DDI 3.0>QuestionsFlow Logic
Codings
<DDI 3.0>Variables
CategoriesCodes
<DDI 3.0>Concepts
Users and
Applications
Users and
Applications
Users and
Applications
Supports butdoes not requireISO 11179
Because DDI has links, each type of bank functions in a modular, complementary way.
[email protected]://www.opendatafoundation.org
DDI For Use within a Research Project
• This use case concerns how DDI 3.0 can support various functions within a research project, from the conception of the study through collection and publication of the resulting data.
[email protected]://www.opendatafoundation.org
<DDI 3.0>ConceptsUniverseMethodsPurposePeople/Orgs
<DDI 3.0>QuestionsInstrument
<DDI 3.0>Data CollectionData Processing
<DDI 3.0>Funding Revisions
SubmittedProposal
$€ £
PresentationsArchive/
RepositoryPublication
+++
+
+
<DDI 3.0>VariablesPhysical Stores
PrinicpalInvestigator
Collaborators
Research Staff
Data
[email protected]://www.opendatafoundation.org
Capture of Metadata Regarding Data Use
• This use case concerns how DDI 3.0 can capture information about how researchers use data, which can then be added to the overall metadata set about the data sources they have accessed.
[email protected]://www.opendatafoundation.org
Types of Metadata•Recodes (datacollection module)•Record subsets (physicalinstance module)•Variable subsets (logicalproduct module)•Comparison (comparative module)
<DDI 3.0>StudyUnitDataCollectionLogicalProductPhysicalDataProductPhysicalInstance
<DDI 3.0>•Recodes•Case Selection•Variable Selection•Comparison to original study•Resulting physical file descriptions
+
+
Data AnalysisData Analysis
Data Sets
Data
[email protected]://www.opendatafoundation.org
Metadata Mining for Comparison, etc.
• This use case concerns how collections of DDI 3.0 metadata can act as a resource to be explored, providing further insight into the comparability and other features of a collection of data.
[email protected]://www.opendatafoundation.org
Types of Metadata•Universe (comparative module)•Concept (comparative module)•Question (datacollection module)•Variable (logicalproduct module)
MetadataRepositories/
Registries
<DDI 3.0>Instances
Questions Variable
Concepts
Universe
?
<DDI 3.0>Comparison•Questions•Categories•Codes•Variables•Universe•ConceptsRecodesHarmonizationsData Sets
[email protected]://www.opendatafoundation.org
Generating Instruction Packages/Presentations
• This use case concerns how DDI 3.0 can support automation around the instruction of students and others.
[email protected]://www.opendatafoundation.org
Types of Metadata• Individual studies (studyunit module)• Grouping purpose (group module)• Linking information (comparative module)• Processing assistance (group module)<DDI 3.0>
StudyUnit 1
<DDI 3.0>StudyUnit 3 <DDI 3.0>
StudyUnit 4
<DDI 3.0>StudyUnit 2
<DDI 3.0>StudyUnit 1StudyUnit 2StudyUnit 3StudyUnit 4
<DDI 3.0>StudyUnit 1StudyUnit 2StudyUnit 3StudyUnit 4ComparativeOtherMaterials
• Topically related studies selected• Group is made with description of the intended use for the group• Comparative information is added indicating matching fields for linking and mapping between similar variables • Other materials such as SAS/SPSS recode command are referenced from the group
Instructional Package
[email protected]://www.opendatafoundation.org
DDI 3.0 Tools• Under developments• DDI Foundation Tools Program
– Road Map– XML Beans, validation, – DDI DExT, DDI2StatsProgs
• Other tools– R SPSS Export, Algenta SurveyViz, others presented
at IASSIST• DDI Editing Suite
– Proposed as extension of DDI-FTP– Plan for generic editor in 6-9 months
• DDI 3.0 related projects / initiatives– RDC Canada, Germany RDC / EURASI, DANS
MIXED, NORC
[email protected]://www.opendatafoundation.org
DDI 3 Relationship to Other Standards• SDMX (from microdata to indicators / time series)
– Completely mapping to and from DDI NCubes• Dublin Core (surveys and documents gets cited)
– Mapping of citation elements– Option for DC namespace basic entry
• ISO 19115 – Geography (microdata gets mapped)– Search requirements– Support for GIS users
• METS – Designed to support profile development
• OAIS (alignment of archiving standards)– Reference model for the archival lifecycle
• ISO/IEC 11179 (metadata mining through concepts)– Variable linking representation to concept and universe– Optional data element construct in ConceptualComponent that
allows for complete ISO/IEC 11179 structure as a maintained item
[email protected]://www.opendatafoundation.org
DDI 3.0 perspective
Producers
Archivists
Users
General Public
Policy Makers
Sponsors
Media/Press
Academic
Business
Government
DDI 2.0 and DDI 3.0
[email protected]://www.opendatafoundation.org
DDI 2 / DDI 3
• Single survey• Focus on the archive• Non-reusable
metadata• Maintained by single
agency• Loose validation
– DTD based– Sparse documentation
• Designed by archivists
• Some tools are available
• Multiple surveys• Focus on life cycle• Highly reusable
metadata• Maintained by many
agencies• Tied validation
– Schema based– Extensive guide
• Designed by expert groups
• Tools are beginning to emerge
[email protected]://www.opendatafoundation.org
What 3.0 can do for you
• Manage multi-surveys• Support multiple contributors• Support many different perspectives• Support many different use cases• Maintain metadata integrity across the life
cycle• Connect to other metadata spaces• Metadata reuse • Publication in registries• Backward compatibility with 2.0
DDI Community
[email protected]://www.opendatafoundation.org
DDI Organizations/ Agencies• DDI Alliance (http://www.ddialliance.org)
• Interuniversity Consortium for Political and Social Research (ICPSR) (http://icpsr.umich.edu)
• International Association for Social Science Infromation Service & Technology (IASSIST) (http://www.iassistdata.org)
• International Household Survey Network (IHSN) (http://www.surveynetwork.org)
• Open Data Foundation (ODaF) (http://www.opendatafoundation.org)
• National Opinion Research Center Data Enclave (NORC) (http://dataenclave.norc.org)
• Metadata Technology (http://www.metadatatechnology.com)
IZA Data Service CenterDDI/SDMX Workshop
Wiesbaden, Germany, June 18th 2008
The Statistical Data and Metadata Exchange Standard (SDMX):
An IntroductionArofan Gregory / Pascal Heus
[email protected] / [email protected]
Open Data Foundation
[email protected]://www.opendatafoundation.org
Overview of the Session
• SDMX Background and Goals
• SDMX and Data
• SDMX and Metadata
• SDMX and Best Practices: The Content-Oriented Guidelines
• The SDMX Information Model
• SDMX and Web Services– The SDMX Registry– SDMX Data Services
• Tools and Resources
SDMX Background and Goals
[email protected]://www.opendatafoundation.org
What is SDMX?
• The problem space:– Statistical collection, processing, and
exchange is time-consuming and resource-intensive
– Focus on aggregate data (esp. time series)– Various international and national
organisations have individual approaches for their constituencies
– Uncertainties about how to proceed with new technologies (XML, web services …)
[email protected]://www.opendatafoundation.org
What is SDMX?
The Statistical Data and Metadata Exchange (SDMX) initiative is taking steps to address these challenges and opportunities that have just been mentioned:– By focusing on business practices in the field
of statistical information– By identifying more efficient processes for
exchange and sharing of data and metadata using modern technology and open standards
[email protected]://www.opendatafoundation.org
Who is SDMX?
• SDMX is an initiative made up of seven international organizations:– Bank for International Settlements– European Central Bank– Eurostat – International Monetary Fund– Organisation for Economic Cooperation and
Development– United Nations– World Bank
• The initiative was launched in 2002
[email protected]://www.opendatafoundation.org
International OrganisationsRegional Organisations
accountsstatistics
Banks, CorporatesIndividual Households
trans-actions,
micro-data,accounts
National Statistical Organisations
accountsstatistics
180
+ C
ount
ries
180
+ C
ount
ries
Inte
rnet
, S
earc
h, N
avig
atio
nIn
tern
et,
Sea
rch,
Nav
igat
ionwww.z.org
www.hub.org
www.x.org
www.y.org
[email protected]://www.opendatafoundation.org
SDMX Products
• Technical standards for the formatting and exchange of aggregate statistics:– SDMX Technical Specifications version 1.0
(now ISO/TS 17369 SDMX – TC 154 WG2)– SDMX Technical Specifications version 2.0
(soon to be submitted to ISO – TC 154 WG2)
• Content-Oriented Guidelines (in draft)– Common Metadata Vocabulary– Cross-Domain Statistical Concepts– Statistical Subject-Matter Domains
[email protected]://www.opendatafoundation.org
Major Features of SDMX
• Structure and formats (XML, EDIFACT) for aggregate data
• Structure and formats (XML) for metadata• Formal information model (UML) for
managing statistical exchange and sourcing
• Web-services guidelines and registry services specification for use of modern technologies
• Content-oriented guidelines to recommend best practices
[email protected]://www.opendatafoundation.org
Recent Events
• Jan 2007 – Launch meeting at the World Bank for SDMX 2.0 Technical Specifications
• February 2007 – Endorsement of SDMX by EU’s Statistical Programme Committee
• March 2008 – SDMX becomes the preferred standard for data and metadata of the UN Statistical Commission– Other standards were mentioned – DDI and
XBRL specifically
[email protected]://www.opendatafoundation.org
Adopters/Interest• The following are known adopters (or planning to adopt):
– US Federal Reserve Board and Bank of New York– European Central Bank– Joint External Debt Hub (WB, IMF, OECD, BIS)– UN/TRADECOM at UN Statistical Division– NAAWE (National Accounts from OECD/Eurostat)– SODI (Eurostat and European Governments)– Mexican Federal System– Vietnamese Ministry of Planning and Investment– Qatar Information Exchange– IMF (BOP, SNA, SDDS/GDDS)– Food and Agriculture Organization– Millenium Development Goals (UN System, others)– International Labor Organization– Bank for International Settlements– OECD– World Bank– Marchioness Islands (Spanish/Portugese Statistical Region)– UNESCO (Education)– Australian Bureau of Statistics– Statistics Canada– There are many others not listed or which we are not aware of
[email protected]://www.opendatafoundation.org
Rate of Adoption
• Between January 2007 and January 2008, adoption has doubled
• We anticipate a similar rate of growth for the coming year– Tools are becoming available– UNSC recommendation makes it a safe
course to follow for risk-averse institutions– Training courses are in increasing demand
(Eurostat, Metadata Technology)– Standard data and metadata structures for
many domains are being developed
SDMX and Data
[email protected]://www.opendatafoundation.org
SDMX and Data Formats• SDMX provides a format for describing the structure of
data (“structural metadata”)– EDIFACT (was GESMES/TS, now SDMX-EDI)– XML (SDMX-ML)
• SDMX provides formats for transmission and processing of data– EDIFACT (1 message)– XML (4 different equivalent flavors for different functions)
• Data is tabulated, aggregate data (eg, multi-dimensional/OLAP cubes)– Can be any aggregate data!
• Most data formats are derived from the structural metadata (eg, XML schemas are generated for each type of structure according to the business rules)
[email protected]://www.opendatafoundation.org
Data Set: Structure
[email protected]://www.opendatafoundation.org
First: Identify the Concepts• A statistical concept is a characteristic of a time
series or an observation (MCV)
• A concept is a unit of knowledge created by a unique combination of characteristics (SDMX Information Model)
• Whatever the definition, statistical concepts are the DNA of the key family– Their usage (type, structure, sequence) define the
structure of the data
[email protected]://www.opendatafoundation.org
Computers need structure of data
•Concepts
•Code lists
•Data values
•How these fit together
Unit Multiplier
Unit
Topic
Time/Frequency
CountryStock/Flow
Data Set Structure:Concepts
[email protected]://www.opendatafoundation.org
Data Set Structure: Code Lists
Code Lists
TOPIC
A Brady Bonds
B Bank Loans
C Debt Securities
AR Argentina
MX Mexico
ZA South Africa
COUNTRY STOCK/FLOW
1 Stock
2 Flow
CONCEPTS
Topic
Country
Flow
Concepts
[email protected]://www.opendatafoundation.org
16457
Q,ZA,B,1,1999-06-30=16547
Data Makes Sense
Quarterly, South Africa, Bank Loans, Stocks, for 30 June 1999
[email protected]://www.opendatafoundation.org
Data Set Structure: Defining Multi-Dimensional Structures
• Comprises– Concepts that identify the observation value– Concepts that add additional metadata about the
observation value– Concept that is the observation value– Any of these may be
• coded• text• date/time• number• etc.
Dimensions
Attributes
Measure
Representation
[email protected]://www.opendatafoundation.org
Data Set Structure: Concept Usage
Unit Multiplier
Unit
Topic
Time/Frequency
CountryStock/Flow
Observation
(Dimension)(Dimension)
(Dimension)
(Attribute)
(Dimension)
(Dimension)
(Attribute)
(Measure)
SDMX and Metadata
[email protected]://www.opendatafoundation.org
SDMX and Metadata
• SDMX provides for several types of metadata– Structural (describes structures of data sets
and metadata sets and related items)– Provisioning (describes the sourcing of data
between departments and organizations)– “Reference” metadata – all other types of
metadata (footnotes, methodology, quality, etc. Can be specified by the user!)
• Reference metadata is the most important one – it is what we typically think of as metadata
[email protected]://www.opendatafoundation.org
SDMX Metadata Sets• Version 2.0 of the SDMX Technical
Specifications provides XML formats for metadata sets (SDMX-ML)– To describe their structure– To exchange metadata in XML
• This is based on concepts (similar to the data formats)– SDMX supports any metadata concepts the users
wishes to report/exchange/process– May be flat lists or hierarchical– Definitions provided by users, but recommendations
exist for many common concepts• Metadata sets are attached to a formal object in
the information model (an organization, a data set, a codelist, etc.)
[email protected]://www.opendatafoundation.org
SDMX and Metadata
• This is a very powerful feature of SDMX– It can be used to integrate/mimic other
metadata standards!
• Provides very good support for standard exchange of metadata which cannot be anticipated by the designers of systems/standards– Must be based on common agreements about
the meaning of metadata concepts– Often, concepts are taken from other
metadata models/standards such as DDI, Dublin Core, etc.
The SDMX Information Model
[email protected]://www.opendatafoundation.org
The SDMX Information Model
• A formal, documented conceptual model of statistical exchange, management, and sourcing
• Expressed as a UML model• Used as the basis of all SDMX implementation
– XML– EDIFACT– Any other programming language/platform
• Provides consistency between implementations• Based on analysis of many statistical processing
systems– Describes existing business practices in a generic
way
[email protected]://www.opendatafoundation.org
URL, registration date etc.
can provide data/metadata for many data/metadata flows using agreed data/metadata structure
conforms to business rules of the data/metadata flow Data or
Metadata FlowData or
Metadata Flow
Data ProviderData ProviderProvision
AgreementProvision
Agreement
can get data/metadata from multiple data/metadata providers
Data or Metadata Set
Data or Metadata Set
publishes/reports data/metadata sets
uses specific data/metadata structure
Data or Metadata Structure DefinitionData or Metadata
Structure Definition
can have child categories
comprises subject or reporting categoriescan be linked to
categories in multiple category schemes CategoryCategory
Category Scheme
Category Scheme
Registration of Data or
Metadata Set
Registration of Data or
Metadata Set
Information Model: High-Level Schematic
registers existence of data and metadata
Structure Maps
Structure Maps
structure and code list maps
SDMX and Best Practices: The Content-Oriented Guidelines
[email protected]://www.opendatafoundation.org
SDMX Content-Oriented Guidelines
• There is a long history of discussion about what is best practice in the collection of statistics
• SDMX decided to define the technical basis for statistical exchange, and then engage in this debate– It makes reaching agreements between organizations
easier!
• These documents build on many years of work defining statistical concepts, terms, and classifications
• Although described as “statistical”, much of what is here also applies to social science (and other) research
[email protected]://www.opendatafoundation.org
SDMX Content-Oriented Guidelines
• Four main documents:– Overview– Metadata Common Vocabulary (annex)– Cross-Domain Concepts (2 annexes)– Statistical Subject-Matter Domains (annex)
• These will not become ISO specifications, but will evolve as publications of the SDMX Initiative
• They are now available in their first official release at www.sdmx.org
[email protected]://www.opendatafoundation.org
Common Metadata Vocabulary
• A set of terms and definitions for the different parts of the SDMX technical standards, and many common concepts used in data and metadata structures
• Does not replace other major vocabularies in this space (such as the OECD glossary) but references these other works
[email protected]://www.opendatafoundation.org
Cross-Domain Concepts
• Includes concepts which are common across many statistical domains– Names & Definitions– Representations
• Approximately 130 concepts, some with recommended representations (codelists)
• These are concepts which support both data and metadata structures– Emphasis on quality frameworks for reference
metadata concepts
[email protected]://www.opendatafoundation.org
Statistical Subject-Matter Domains
• Based on the UN/ECE classification of statistical activities
• Provides a classification system for use in exchanging statistics across domain boundaries
• Provides a breakdown of the various domains within official statistics
SDMX and Web Services
[email protected]://www.opendatafoundation.org
Web-Services Components of SDMX
• Web-Services Guidelines– Part of the Technical Specifications package
• SDMX Query message– Part of SDMX-ML
• SDMX Registry Services– Part of version 2.0 Technical Specifications– Interfaces are in SDMX-ML– Document describes implementation rules
[email protected]://www.opendatafoundation.org
Web Services Guidelines
• Recommends use of WSS 1.1 for web services which use SOAP, WSDL
• Provides standard function names for many typical web-services functions– Querying for data– Querying for metadata– Querying for structural information
[email protected]://www.opendatafoundation.org
SDMX Query Message
• An XML Query to support two-way web-services calls using XML messages
• Designed to support:– Queries for structural information from online
databases/repositories– Queries for data from online databases– Queries for metadata from online databases
• Part of SDMX-ML– Very similar to the SQL query language
supported by all database packages– Specific to SDMX objects
[email protected]://www.opendatafoundation.org
SDMX Registry Services• A “registry” is a common type of technology
– Every Windows machine has a “Windows registry” to let applications know what other applications are on that machine, and where they are located
– Web services registries do the same thing on a network
– Functions like a card catalogue in a print library – you can look up resources and find out how to obtain them
• A registry provides a single place on the Internet where everyone can discover the data, metadata, and structures that other organizations use/publish– They do not contain the data and metadata – it just
indexes it and links to it
[email protected]://www.opendatafoundation.org
SDMX Registry Services (cont.)
• SDMX Registry Services are based on generic, standard web-services registry technology– ISO 15000 ebXML Registry/Repository – OASIS UDDI Registry (part of .NET, etc.)
• SDMX Registry Services are not generic– They are specific to SDMX exchanges of data
and metadata, etc.
• There is not one central “SDMX Registry”– Each domain will have its own registry for its
members– The registries can be linked (“federated”)
[email protected]://www.opendatafoundation.org
REPOSITORY Provisioning
Metadata
REGISTRY Data Set/
Metadata Set
REPOSITORY Structural Metadata
Register
Query
Submit
Query
Submit
Query
SDMX Registry/Repository
Describes data and metadata structures
Describes data and metadata sources and reporting processes
Indexes data and metadata
SDMX Registry Interfaces
[email protected]://www.opendatafoundation.org
REPOSITORY Provisioning
Metadata
REGISTRY Data Set/
Metadata Set
REPOSITORY Structural Metadata
Subscription/Notification
Applications can subscribe to notification of new or changed objects
Register
Query
Submit
Query
Submit
Query
SDMX Registry/Repository
Describes data and metadata structures
Indexes data and metadata
SDMX Registry Interfaces
[email protected]://www.opendatafoundation.org
The Old JEDH Site
BIS
IMF
OECD
WorldBank
WEBSITE
(VariousFormats) (3-month production cycle)
[email protected]://www.opendatafoundation.org
JEDH with SDMX
BIS
IMF
OECD
WorldBank
SDMX-ML
SDMX-ML
SDMX-ML
SDMX-ML
SDMX-ML(Debtor database)
[Info about data is registered]
SDMX“Agent”
SDMXRegistry
Discover data and URLs
Retrieves data from sites
JEDH Site
Data providedin real timeto site
SDMX-MLLoaded into
JEDH DB
[email protected]://www.opendatafoundation.org
Recent and On-Going Developments
• Many organizations using SDMX have been implementing web services
• There is growing interest in forming a working group to further extend the specification for use with web-services technology– Standard error messages– Expanded function calls– Standard WSDLs
• If you are interested in this, please tell me!
Tools and Resources
[email protected]://www.opendatafoundation.org
SDMX Tools• There are now several sources for SDMX tools
– All are free or open-source• Eurostat – complete package of tools for data, metadata,
and registry services• Metadata Technology Ltd – similar package of tools• Data editors are usually based on Excel• Some other tools
– Open Data Foundation “SDMX Browser” for data visualization– OECD, ECB, and UN/Statistical Division provide some other
tools for specific applications– Integration with PC-Axis has been prototyped, to be available
this summer– DevInfo has SDMX support– FAME is developing SDMX support
• Commercial vendors provide good support through web-services functionality– Eg, Oracle 11, .NET, etc.
[email protected]://www.opendatafoundation.org
Resources
• The SDMX Initiative Site: http://www.sdmx.org
• The SDMX Toolkit and Forums:http://www.metadatatechnology.com
• Various papers and (soon) open-source tools:http://www.opendatafoundation.org
IZA Data Service CenterDDI/SDMX Workshop
Wiesbaden, Germany, June 18th 2008
SDMX, DDI, and Other Standards
Arofan Gregory / Pascal Heus
[email protected] / [email protected]
Open Data Foundation
[email protected]://www.opendatafoundation.org
Overview of the Session
• DDI/SDMX: Philosophy and Timing of Standards Development
• DDI/SDMX: Points of Functional Overlap
• DDI/SDMX: Direct Mappings
• DDI/SDMX: Integration Approaches
• Other Related Standards and On-Going Work
DDI/SDMX: Philosophy and Timing of Development
[email protected]://www.opendatafoundation.org
Development Philosophies/Timing• Unlike many standards bodies, both the SDMX
Initiative and the DDI Alliance have attempted to create standards which do not duplicate existing efforts– There is an awareness that users need to deal with
several different standards– DDI (3.0) and SDMX were both intentionally aligned
with other, related standards• DDI 1.*/2.* existed before SDMX
– It was largely self-contained• SDMX was created before DDI 3.0 existed
– Created with an awareness of DDI 1.*/2.* • DDI 3.0 benefited from having SDMX as a
published specification– Actively aligned with SDMX and many other standards
[email protected]://www.opendatafoundation.org
SDMX Design
• SDMX was intentionally designed to accommodate integration of standards which are used with the inputs to aggregate data– This included DDI and XBRL– Mechanism for integration is generic
• The key point for this integration is the SDMX Registry– It provides links between aggregate (SDMX)
data sets, and also to source data and metadata
DDI/SDMX: Points of Functional Overlap
[email protected]://www.opendatafoundation.org
SDMX and DDI as Complementary
• DDI is designed to document micro-data– 1.*/2.* versions were archival, after-the-fact
documentation– 3.0 version covers entire life cycle, but still has an
after-the fact function
• SDMX is designed as a standard for processing and automation– It is not documentary, but is aimed at automation of
statistical systems and exchanges
• These purposes are related, but not duplicative– SDMX and DDI can both do useful things within a
single system
[email protected]://www.opendatafoundation.org
Examples• DDI could be used to document SDMX-based
aggregates more completely for archival purposes
• DDI could be used to document the micro-data on which aggregates are based– As soon as tabulation occurs, SDMX can be used to
describe and format the data– SDMX can describe micro-data, but it is not very
useful– DDI can be used to automate processing of multi-
dimensional data cubes, but it is more difficult than with SDMX
• SDMX can be used to link DDI instances with other types of standard data and metadata (including both SDMX and DDI)
[email protected]://www.opendatafoundation.org
DDI and SDMX
SDMXAggregated data
Indicators, Time SeriesAcross time
Across geographyOpen AccessEasy to use
DDIMicrodata
Low level observationsSingle time period Single geographyControlled accessExpert Audience
• Microdata data is a important source of aggregated data• Crucial overlap and mappings exists between both
worlds (but commonly undocumented)• Interoperability provides users with a full picture of the
production process
[email protected]://www.opendatafoundation.org
Generic Process Example
Survey/Register
Raw Data SetRaw Data Set
Anonymization, cleaning, Anonymization, cleaning, recoding, etc.recoding, etc.
Micro-Data Set/Micro-Data Set/Public Use FilesPublic Use Files
Tabulation, processing,
Tabulation, processing,
case selection, etc.
case selection, etc.
Aggregation,
Aggregation,
harmonizatio
n
harmonizatio
n
Aggregation, Aggregation, harmonizationharmonization
Aggregate Data SetAggregate Data Set(Lower level)(Lower level)
Aggregate Data SetAggregate Data Set(Higher Level)(Higher Level)
DDIDDI
SDMXSDMX
IndicatorsIndicators
[email protected]://www.opendatafoundation.org
DDI + SDMX?
• When you have data which has been tabulated/aggregated, it may be useful to have both SDMX and DDI– SDMX for processing and exchanging the
data– DDI for documenting these processes, in case
they are of interest to researchers
• DDI has a much richer descriptive capability for addressing the exact processes used in statistical packages
• SDMX is easier to process
DDI/SDMX: Direct Mappings
[email protected]://www.opendatafoundation.org
Direct Mappings: DDI & SDMX
• IDs and referencing use the same approach (identifiable – versionable - maintainable; structured URN syntax)
• Both are organized around schemes– Reusable packages of data, similar to relational tables
in databases
• Both describe multi-dimensional data– A “clean” cube in DDI maps directly to/from SDMX
• Both have concepts and codelists– DDI has much less emphasis on concepts– SDMX emphasizes concepts because they are
needed for comparison
• Both contain mappings (“comparison”) for codes and concepts
[email protected]://www.opendatafoundation.org
Formal Mapping
• There is on-going work to describe a formal mapping between SDMX and DDI– It will cover these direct correspondences– They are quite obvious: a code maps to a
code; a concept to a concept; etc.– There are currently no tools, because generic
tools such as XSLT will work for this transformation
– Drafts of this work are expected this summer, as part of the SDMX submission to ISO for the version 2.0 Technical Specifications
• The direct mappings are the easy part!
[email protected]://www.opendatafoundation.org
Issues with Direct Mapping• It is possible to describe everything in the DDI as an SDMX
Metadata Set– This is probably not the best way to use SDMX with DDI!– It is usually better to select the important fields, and keep the rest in
native DDI format
• When you map from DDI to SDMX, you typically will not carry much of the descriptive metadata, question text, etc.– Mostly structural (codelists, dimensions, attributes, concepts)– You must have concepts for SDMX which are not always present in
DDI
• Going from SDMX to DDI, it is not always possible to map all the data– Especially for SDMX Metadata Sets, which may have user-
configured concepts that don’t always exist in DDI
• Note that SDMX-DDI mappings refer to all versions of DDI
DDI/SDMX: Integration Approaches
[email protected]://www.opendatafoundation.org
Integration Use Cases
• The most important aspect of DDI – SDMX integration is understanding what the use cases are– This defines what mapping/transformation is needed– It also defines what links need to be stored between
data and metadata files
• There are some common use cases– DDI used to describe and link microdata inputs to
SDMX aggregates– DDI used to more fully document SDMX aggregates
for dissemination to users– Using the SDMX Registry as a lifecycle management
tool for DDI, SDMX, etc.
[email protected]://www.opendatafoundation.org
Linking Source Data and Aggregates
• DDI provides a wealth of information about the micro-data which serves as an input to SDMX aggregates– It is possible to capture these links in SDMX, at the cell level or
higher, to provide automated access to source data– An SDMX registry can be used to provide easy access to these links– The user/collector of aggregate data can access the rich DDI
metadata, and possibly the data (if they have access rights)
• It is possible to automatically generate SDMX output from the DDI metadata describing tabulation of micro-data– This may not be useful if the desired SDMX target is a standard cube
structure described by another organization– It may make transformation to the standard cube easier, however
• The SDMX Registry provides a good tool for managing links– Links between SDMX and DDI files are stored as Metadata Reports
[email protected]://www.opendatafoundation.org
Demo: SDMX – DDI Source Links
[email protected]://www.opendatafoundation.org
DDI + SDMX for Dissemination
• Typically, the full DDI documentation is not provided on web-sites which publish aggregates/indicators
• SDMX is becoming a popular dissemination format for these data– It has been shown to increase the use of data on the
Web
• If the DDI documentation is available, this could also be delivered as additional documentation– Especially useful at study level– Links could be directly embedded in SDMX data files
as attributes or stored in an SDMX Registry, or both
[email protected]://www.opendatafoundation.org
The SDMX Registry for Lifecycle Management
• The SDMX Registry provides a tool for tracking the sources of data for aggregates
• It can also track the transformation of versions of DDI as the data moves through the lifecycle
• There is an SDMX model for processes– This can be used to describe the DDI lifecycle model– SDMX Metadata Reports can be used to link DDI
metadata to specific stages of the DDI lifecycle, and to each other
• Applications could query the SDMX Registry to discover all of the DDI metadata produced upstream, as micro-data is collected and processed
[email protected]://www.opendatafoundation.org
Demos
• SDMX Metadata Report used to express DDI metadata
• SDMX Metadata Report used to link DDI instances
Other Related Standards and On-Going Work
[email protected]://www.opendatafoundation.org
Many Related Standards
• DDI• SDMX• ISO/IEC 11179 – concept management and
semantic modelling• ISO 19115 – Geographical metadata• METS – packaging/archiving of digital objects• PREMIS – Archival lifecycle metadata• XBRL – business reporting• Dublin Core – citation metadata• Standard mappings are being defined by people
from many different organizations (see presentation from METIS 2008 in Luxembourg)
[email protected]://www.opendatafoundation.org
ISO/IEC 11179
• ISO/IEC 11179 is used to describe the meanings and representations of terms and concepts
• Both SDMX and DDI are aligned with ISO/IEC 11179– SDMX and DDI concepts can be defined using the
ISO/IEC 11179 attributes– Codelists and categories can be directly mapped (and
other representations)– ISO/IEC 11179 can be implemented with DDI
(directly, for concepts) and/or with SDMX (as a Metadata Report)
– ISO/IEC 11179 has no standard expression in XML – it is just a model
[email protected]://www.opendatafoundation.org
ISO 19115 Geographical Metadata
• ISO 19115 describes geographies (bounding boxes for countries, etc.)
• DDI uses the ISO 19115 model in its own XML– It does not use the standard ISO 19115 XML
format, but there is a 1-to-1 mapping
• SDMX could model ISO 19115 if desired– Linking to DDI or ISO 19115 XML is probably
more useful, using the standard SDMX mechanism
– Most geographies in SDMX aggregate data sets are coded, not directly described
[email protected]://www.opendatafoundation.org
METS
• METS is used to package a set of files which work together as a digital object
• Both DDI and SDMX metadata could be placed inside a METS wrapper– They would be “metadata sections”– The primary use case would be for archiving
of a set of related data and metadata files, possibly with other related materials such as research publications
[email protected]://www.opendatafoundation.org
PREMIS
• PREMIS allows for the capture of administrative metadata as a collection is placed and managed within the archive
• DDI and SDMX files would be treated like any other files forming part of the collection– Both may contain metadata which can be
extracted and used to populate PREMIS instances (access levels, confidentiality, etc.)
[email protected]://www.opendatafoundation.org
XBRL
• XBRL is used by business to report required information to national supervisory bodies– This includes banking supervision and other
economic data– XBRL is a source format for some aggregate
statistics
• XBRL International and the SDMX Sponsors are working together to define a cross-walk between the two standards
[email protected]://www.opendatafoundation.org
Dublin Core
• Dublin Core is used to capture citation-type metadata for resources on the Internet and elsewhere – It is widely used in digital repositories for
research papers
• DDI has the basic Dublin Core XML format as an integral part of the DDI 3.0 specification
• Dublin Core can be easily mimicked as an SDMX Metadata Report [Demo]
[email protected]://www.opendatafoundation.org
High-Level Vision – Standards Mappings
Federated Registries (Based on SDMX, ebXML, web services)
Aggregated Data/Metadata
(SDMX)
XBRLBusiness Reports
DDI Microdata
Sets
ISO 19115Geographies
Dublin CoreCitations
Used in
registered
References to source data
Standard classifications
Organizedusing
ISO 11179
Semantic definitions
METS/PREMIS