Using ISO/IEC 11179 to Help with Metadata Management Problems
Graeme Oakley
Australian Bureau of Statistics

Content of Presentation
- ABS Strategy for Management of Metadata
- Examples of Problems to Address
- How are we using ISO 11179
- Further work

Strategy for Management of ABS metadata
- Background - senior management project
- Problems to address
- Principles
- Metadata Model
- Strategies

Metadata Management
"Metadata Management refers to the content, structure, and designs necessary to manage the vocabulary and other metadata that describes statistical data, designs and processes. ... includes the development of metadata models ..., building metadata registries to organise the metadata ..., developing statistical terminologies which define and organise terms ..." (Bargmeyer and Gillman, METIS 2000)

Examples of ABS metadata
- Data Item definitions, eg person, sales
- Classification description, category values and hierarchy, eg Industry, Household type
- Collection - who, how, when, where
- Quality, eg response rate, survey error measures
- Provider load - how long to complete forms
- Collection form development
- Process - edits, imputation
- Glossary, Themes, Products

Problems (not an exhaustive list)
- SMAs have to re-enter metadata
- Metadata is not reused
- What is 'standard'?
- Terminology
- Existing business processes not widely used
- Gaps in Infrastructure

Principles for Metadata Management
- Metadata should be managed as part of a 'life-cycle'
- One, authoritative source for each metadata type
- Registration process for each type of metadata
- New metadata not created until designer has determined that no useable metadata already exists
- Business processes have a metadata flow described

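The "reuse before create" principle can be illustrated with a minimal sketch. The `MetadataRegistry` class, its method names and the sample definitions are hypothetical and exist only for illustration; they are not an ABS system or an ISO 11179 API.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class MetadataRegistry:
    """Hypothetical single authoritative store for one metadata type."""
    metadata_type: str
    _items: dict = field(default_factory=dict)  # normalised name -> definition

    @staticmethod
    def _key(name: str) -> str:
        # Normalise so '  sales ' and 'Sales' are treated as the same item.
        return " ".join(name.lower().split())

    def find(self, name: str):
        """Return the existing definition, or None if nothing usable exists."""
        return self._items.get(self._key(name))

    def register(self, name: str, definition: str) -> tuple[str, bool]:
        """Reuse an existing entry if there is one; otherwise create it.

        Returns (definition, created) so the caller can see whether new
        metadata was actually added.
        """
        existing = self.find(name)
        if existing is not None:
            return existing, False
        self._items[self._key(name)] = definition
        return definition, True


# Illustrative definitions only (assumed, not taken from ABS standards).
registry = MetadataRegistry("data item definition")
first = registry.register("Sales", "Income from goods and services sold")
second = registry.register("  sales ", "Turnover")  # duplicate attempt is reused
```

The second call returns the definition already held rather than storing a competing one, which is the behaviour the registration principle asks for.
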
Principles (ctd)
- Metadata gathering activities should be minimised so that no metadata be entered more than once, and derivable metadata be automatically generated
- Cost/benefit mechanism to ensure benefits to users justify costs to producers of metadata
- Single approved standard description for each metadata type. Variants may be permitted but must always be linked to the 'standard'

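One way to read the last principle is as a rule that can be checked mechanically: exactly one approved standard per metadata type, and every variant pointing back to it. The sketch below assumes a hypothetical `Description` record with `approved_standard` and `variant_of` fields; the names and sample entries are illustrative only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Description:
    """Hypothetical description record for one metadata type."""
    name: str
    approved_standard: bool = False
    variant_of: Optional[str] = None  # name of the standard this varies


def validate(descriptions):
    """Return a list of rule violations (an empty list means all rules hold)."""
    problems = []
    standards = [d.name for d in descriptions if d.approved_standard]
    if len(standards) != 1:
        problems.append(f"expected exactly one standard, found {len(standards)}")
    for d in descriptions:
        if not d.approved_standard and d.variant_of not in standards:
            problems.append(f"variant '{d.name}' is not linked to the standard")
    return problems


# Illustrative entries (assumed names).
industry = [
    Description("Industry (standard)", approved_standard=True),
    Description("Industry (survey grouping)", variant_of="Industry (standard)"),
]
with_orphan = industry + [Description("Industry (ad hoc)")]
```

Here `validate(industry)` reports no problems, while `validate(with_orphan)` flags the unlinked variant.
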
Strategies
- Communication and education
- Allocation of 'registration' tasks to appropriate organisational units
- Enhance metadata model to cover quality and process metadata
- Development projects to fill infrastructure gaps
- Use opportunities in other development projects to promote metadata mgt

Examples of Problems to address
- What is the IDW and its relationship to metadata?
- Definition of large number of data elements
- Linking CMR and Input Data Warehouse
- "Useful Metadata Structures" to consider
  - concordances
  - definition of legal and illegal combinations
  - ETL descriptions

Input Data Warehouse
- Underpins Business Statistics Infrastructure Project
- Repository for all business statistics unit record data
- Star schema architecture
- Use of shared definitional metadata from CMR
- Process and quality metadata to be accessed and/or created
- "Gatekeeper role" by Standards group

[Figure: IDW v1.0 dimensional model. Diagram labels: Fact, Data source, Change reason, Business location, Organisation type, Data item, Processing stage, Industry, Reference period, Provider, Change module.]

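A star schema of this kind can be sketched with an in-memory SQLite database. Only four of the dimensions named in the model are shown, and all table names, column names and sample rows are assumptions made for illustration; the actual IDW design is not described in enough detail here to reproduce it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables (a subset of those named in the model).
CREATE TABLE dim_data_source (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_industry (id INTEGER PRIMARY KEY, code TEXT, label TEXT);
CREATE TABLE dim_reference_period (id INTEGER PRIMARY KEY, label TEXT);
CREATE TABLE dim_data_item (id INTEGER PRIMARY KEY, name TEXT);

-- Central fact table: one row per reported value, keyed by the dimensions.
CREATE TABLE fact (
    data_source_id INTEGER REFERENCES dim_data_source(id),
    industry_id INTEGER REFERENCES dim_industry(id),
    reference_period_id INTEGER REFERENCES dim_reference_period(id),
    data_item_id INTEGER REFERENCES dim_data_item(id),
    value REAL
);

-- Made-up sample rows.
INSERT INTO dim_data_source VALUES (1, 'Example survey');
INSERT INTO dim_industry VALUES (1, 'A', 'Example industry A'),
                                (2, 'B', 'Example industry B');
INSERT INTO dim_reference_period VALUES (1, 'Example period 1');
INSERT INTO dim_data_item VALUES (1, 'sales');
INSERT INTO fact VALUES (1, 1, 1, 1, 100.0),
                        (1, 1, 1, 1, 50.0),
                        (1, 2, 1, 1, 25.0);
""")

# Aggregate the 'sales' data item by industry by joining fact to dimensions.
totals = conn.execute("""
    SELECT i.code, SUM(f.value)
    FROM fact AS f
    JOIN dim_industry AS i ON i.id = f.industry_id
    JOIN dim_data_item AS d ON d.id = f.data_item_id
    WHERE d.name = 'sales'
    GROUP BY i.code
    ORDER BY i.code
""").fetchall()
```

Because the dimension rows carry the definitional metadata, the same join pattern works for any data item or classification without changing the fact table.
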
Definition of data elements
- Need for framework of concepts, properties, classifications
- Potentially thousands of data elements to define and approve
- Concepts and terminology of ISO 11179
- Links to question wording and question modules needed

Linking IDW and CMR
- 'Star schema' elements should be based on metadata in CMR
  - data sources (surveys and their cycles)
  - key classifications eg industry
  - 'data items'
- 'Snapshot' rather than dynamic access
- Only use 'approved' metadata
- More services in 2004 eg form information, better searching, transfer from IDW to CMR

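The 'snapshot' and 'approved only' points above can be sketched as a copy-and-filter step. The record layout, the `status` field and its values are assumptions for illustration; the real CMR interface is not described here.

```python
import copy

# Hypothetical CMR content; the 'status' field and its values are assumptions.
cmr = [
    {"type": "data item", "name": "sales", "status": "approved"},
    {"type": "data item", "name": "draft item", "status": "draft"},
    {"type": "classification", "name": "industry", "status": "approved"},
]


def take_snapshot(records):
    """Copy only approved records so later CMR edits do not affect the IDW."""
    return [copy.deepcopy(r) for r in records if r["status"] == "approved"]


idw_metadata = take_snapshot(cmr)

# A change made in the CMR after the snapshot is not seen by the IDW copy.
cmr[0]["name"] = "sales (renamed later)"
snapshot_names = [r["name"] for r in idw_metadata]
```

The trade-off of a snapshot is that the warehouse stays stable between loads but has to be refreshed deliberately when approved metadata changes.
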
Useful Metadata Structures
- Definition of legal, illegal and questionable combinations
- Concordances between an IDW value domain and underlying 'standard' classification
- Specialised transformations - related to ETL processes

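The first two structures in this list can be held as simple lookup tables. The sketch below is a minimal illustration: every code, organisation type and rule in it is made up, and treating unlisted combinations as legal is an assumption rather than a stated ABS rule.

```python
# Hypothetical concordance: IDW value-domain code -> 'standard' classification code.
CONCORDANCE = {"RET": "S02", "WHL": "S01", "MFG": "S03"}

# Hypothetical combination rules keyed by (organisation type, industry code).
# Assumption: any pair not listed is treated as legal.
COMBINATION_STATUS = {
    ("government", "RET"): "illegal",
    ("non-profit", "MFG"): "questionable",
}


def to_standard(idw_code):
    """Map an IDW code to the standard classification, failing loudly on gaps."""
    try:
        return CONCORDANCE[idw_code]
    except KeyError:
        raise ValueError(f"no concordance entry for {idw_code!r}") from None


def check_combination(org_type, industry_code):
    """Classify a combination as 'legal', 'illegal' or 'questionable'."""
    return COMBINATION_STATUS.get((org_type, industry_code), "legal")
```

Keeping both structures as data rather than code means an ETL step can load them from the metadata store instead of hard-coding the rules.
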
How are we using ISO 11179?
- 11179 provides advice about registration
- Concepts of properties, object classes, qualifiers and value domains (classifications)
- 11179 is a general standard, so wanted framework to link to existing statistical standards and to link to metadata objects beyond boundaries of 11179

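In ISO/IEC 11179 a data element concept pairs an object class with a property, and a data element adds a value domain to that concept. The sketch below shows that composition in miniature. The class layout, the placement of the qualifier at the front of the name, and the sample values are assumptions for illustration, not the standard's metamodel or naming rules.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ValueDomain:
    """A named set of permitted values, eg the categories of a classification."""
    name: str
    permitted_values: frozenset


@dataclass(frozen=True)
class DataElementConcept:
    """Object class plus property, optionally narrowed by a qualifier."""
    object_class: str
    property_name: str
    qualifier: Optional[str] = None


@dataclass(frozen=True)
class DataElement:
    """A data element concept represented by a value domain."""
    concept: DataElementConcept
    value_domain: ValueDomain

    @property
    def name(self):
        # Assumed naming convention: qualifier, object class, property.
        c = self.concept
        parts = [c.qualifier, c.object_class, c.property_name]
        return " ".join(p for p in parts if p)

    def accepts(self, value):
        """True if the value belongs to the element's value domain."""
        return value in self.value_domain.permitted_values


# Illustrative values only.
codes = ValueDomain("Example industry codes", frozenset({"A", "B"}))
element = DataElement(DataElementConcept("Business", "Industry"), codes)
qualified = DataElement(
    DataElementConcept("Business", "Industry", qualifier="Main"), codes
)
```

Separating the concept from the value domain is what lets the same classification be reused across many data elements.
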
Further Work
- ABS senior management endorsed metadata mgt strategy
- Now to look at implementation
  - new metadata stores
  - enhance existing metadata stores & services
  - deployment of 'registration authority' concept along with education & communication to staff

Investigations related to 11179
- Definition of table or matrix, ie the result of the 'aggregation' process
- When to use qualifiers
- Expression of 11179 model in an XML schema
- Defining value domains and rules to drive 'editing engines', ETL process

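The last investigation, value domains driving an 'editing engine', might look roughly like the sketch below: the domains are held as data and a generic function applies them to each unit record. The domain layout, field names and limits are all hypothetical.

```python
# Hypothetical value domains, expressed as data rather than code.
VALUE_DOMAINS = {
    "industry": {"kind": "enumerated", "values": {"A", "B", "C"}},
    "employees": {"kind": "range", "min": 0, "max": 100000},
}


def edit_record(record, domains=VALUE_DOMAINS):
    """Return a list of (field, message) edit failures for one unit record."""
    failures = []
    for field_name, domain in domains.items():
        if field_name not in record:
            failures.append((field_name, "missing"))
            continue
        value = record[field_name]
        if domain["kind"] == "enumerated":
            if value not in domain["values"]:
                failures.append((field_name, f"{value!r} not in value domain"))
        elif domain["kind"] == "range":
            if not domain["min"] <= value <= domain["max"]:
                failures.append(
                    (field_name, f"{value!r} outside {domain['min']}..{domain['max']}")
                )
    return failures


clean = edit_record({"industry": "A", "employees": 12})
dirty = edit_record({"industry": "Z", "employees": -1})
```

With this arrangement a new edit is added by registering another value domain, not by changing the engine.
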
QUESTIONS AND DISCUSSION