© Keith G Jeffery, Anne Asserson
Keith G Jeffery, Consultant
Anne Asserson, University Library, University of Bergen
Auditing Grey in a CRIS Environment
2-3 Dec 2013, Bratislava
Keith G Jeffery Consultants
Prologue
• Metadata and data
• Real world
• ‘library’ metadata: MARC, DC etc
• Key dependencies
  – Functional
  – Referential
• No AUDIT without QUALITY METADATA
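The two key dependencies named above can be checked mechanically. The following is a minimal Python sketch, not a method taken from the slides: the record layout, the field names (`report_id`, `title`, `project_id`) and both helper functions are hypothetical, chosen only to show what a functional and a referential check look like on metadata records.

```python
from collections import defaultdict


def functional_dependency_violations(rows, determinant, dependent):
    """Return determinant values that map to more than one dependent value."""
    seen = defaultdict(set)
    for row in rows:
        seen[row[determinant]].add(row[dependent])
    return {key: values for key, values in seen.items() if len(values) > 1}


def referential_violations(rows, foreign_key, valid_keys):
    """Return rows whose foreign key does not match any known key."""
    return [row for row in rows if row[foreign_key] not in valid_keys]


# Hypothetical records: grey reports that each name a project.
projects = {"P1": "Project A", "P2": "Project B"}
reports = [
    {"report_id": "R1", "title": "Interim report", "project_id": "P1"},
    {"report_id": "R1", "title": "Interim rpt (draft)", "project_id": "P1"},
    {"report_id": "R2", "title": "Field notes", "project_id": "P9"},
]

# report_id should determine title (functional dependency).
fd_problems = functional_dependency_violations(reports, "report_id", "title")
# project_id should refer to an existing project (referential dependency).
ref_problems = referential_violations(reports, "project_id", set(projects))

print(fd_problems)
print(ref_problems)
```

In this sketch, `R1` is flagged because one identifier carries two titles, and `R2` is flagged because it points to a project that does not exist. An audit can only report such findings when the metadata records the keys in the first place.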
Structure
• Introduction
• Reliable Information
• Open Data
• ENGAGE
• Conclusion
Introduction
• The vast majority of (research) information is grey
  – It is not peer reviewed scholarly publications
• We use information object to mean any digital grey object encoded in any format on any medium
  – Document, data file, video, software…
• Mechanisms are required to audit grey to assure quality
• We assert that audit of grey requires high quality metadata
Reliable Information
• Quality
  – Represents accurately world of interest
• Context
  – Environment within which collected
  – Related entities
    • Persons, organisations, projects, funding, equipment, publications…
• Availability
  – Persistence (preservation / curation)
  – Conditions of use (open access)
We have to encode this as metadata for audit
Reliable Information: Quality
• Data integrity
  – Schema
  – Constraints
• Accuracy, precision
• Incomplete and inconsistent information
• Temporal validity
• Independent validation
  – Quality rating
(With acknowledgements to FINETIK)
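As an illustration of how these quality aspects could be encoded for audit, here is a small Python sketch. Everything in it is an assumption made for the example: the `SCHEMA` fields, the finding messages and, in particular, the `quality_rating` formula, which is an arbitrary placeholder rather than a rating scheme from the presentation.

```python
from datetime import date

# Hypothetical schema: field name -> (required type, mandatory?)
SCHEMA = {
    "title": (str, True),
    "creator": (str, True),
    "valid_from": (date, True),
    "valid_to": (date, False),
}


def audit_record(record, today):
    """Return a list of human-readable findings for one metadata record."""
    findings = []
    # Data integrity and completeness: check every field against the schema.
    for field, (expected_type, mandatory) in SCHEMA.items():
        value = record.get(field)
        if value is None:
            if mandatory:
                findings.append(f"incomplete: missing {field}")
        elif not isinstance(value, expected_type):
            findings.append(f"schema: {field} is not {expected_type.__name__}")
    # Consistency and temporal validity: compare the validity interval.
    start, end = record.get("valid_from"), record.get("valid_to")
    if isinstance(start, date) and isinstance(end, date):
        if end < start:
            findings.append("inconsistent: valid_to precedes valid_from")
        elif end < today:
            findings.append("temporal: record is no longer valid")
    return findings


def quality_rating(findings, checks=5):
    """Placeholder rating in [0, 1]: fewer findings give a higher rating."""
    return max(0.0, 1.0 - len(findings) / checks)


audit_day = date(2013, 12, 2)
record_ok = {"title": "Survey data", "creator": "A. Author",
             "valid_from": date(2012, 1, 1)}
record_bad = {"title": "Old report", "valid_from": date(2012, 5, 1),
              "valid_to": date(2011, 5, 1)}

findings_ok = audit_record(record_ok, audit_day)
findings_bad = audit_record(record_bad, audit_day)
print(findings_ok, quality_rating(findings_ok))
print(findings_bad, quality_rating(findings_bad))
```

Accuracy, precision and independent validation need external evidence and are deliberately left out; the sketch covers only what can be checked from the metadata record itself.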
Reliable Information: Context
• Related entities that give confidence that the information of interest is understood in context
• CERIF (Common European Research Information Format)
• EU Recommendation to member states
• Used in 42 countries
• National standard in 10
• Maintained, developed, promoted by euroCRIS (not for profit) www.eurocris.org
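The idea of context as related entities can be sketched in a few lines of Python. This is a simplified illustration in the spirit of CERIF, which connects base entities through link relations carrying a role and a time interval; the class names, fields and sample data below are hypothetical and do not reproduce the actual CERIF schema.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Entity:
    kind: str    # e.g. "person", "project", "publication"
    ident: str
    label: str


@dataclass(frozen=True)
class Link:
    source: str                 # ident of the first entity
    target: str                 # ident of the second entity
    role: str                   # meaning of the relationship
    start: date
    end: Optional[date] = None  # None means the link is still open


def context_of(ident, entities, links, on):
    """List (role, related entity label) pairs valid on a given date."""
    by_id = {e.ident: e for e in entities}
    result = []
    for link in links:
        if link.start <= on and (link.end is None or on <= link.end):
            if link.source == ident:
                result.append((link.role, by_id[link.target].label))
            elif link.target == ident:
                result.append((link.role, by_id[link.source].label))
    return result


# Hypothetical sample data.
entities = [
    Entity("publication", "pub1", "Technical report"),
    Entity("person", "per1", "A. Author"),
    Entity("project", "prj1", "Example project"),
]
links = [
    Link("per1", "pub1", "author", date(2012, 1, 1)),
    Link("pub1", "prj1", "output of", date(2012, 1, 1), date(2012, 12, 31)),
]

print(context_of("pub1", entities, links, date(2012, 6, 1)))
print(context_of("pub1", entities, links, date(2013, 6, 1)))
```

Because each link carries its own role and dates, the same report shows a different context depending on the date asked about, which is the kind of information an auditor needs to judge whether a grey object is understood correctly.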
CERIF
[Figure: CERIF diagram with two callouts, "Dataset is here" and "Document is here"]
Reliable Information: Availability
• Persistence
  – Media migration
    • Who can read a 7 inch floppy disk? Or a 3420 IBM tape?
  – Declared syntax and semantics
    • Machine readable AND machine understandable
  – Preservation of related software
    • Changing languages, compilers / interpreters
    • Changing operating environment (sequential, parallel, distributed, data dependencies)
    • Specifications
• Access
  – Open
  – Toll-free (conditions, licences)
Open Data
• Semantic Web
• LOD: Linked Open Data
• RDF
  – Triples
  – Expressed as XML
• Metadata
  – DC
  – CKAN
• Most portals clickable lists of datasets
• Most datasets pdf or xls
  – Essentially documents
• Very little metadata
• Metadata ‘flat’ and poor
• Not linked to underlying research datasets
Open data implies open access to any digital information object
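To make "RDF triples expressed as XML" concrete, the sketch below serialises a few triples with Dublin Core predicates as RDF/XML using only the Python standard library. The namespace URIs are the standard RDF and DC ones; the function name `triples_to_rdfxml`, the example.org subject URI and the sample values are assumptions made for the example, and only literal objects are handled.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dc", DC_NS)


def triples_to_rdfxml(triples):
    """Serialise (subject URI, DC element name, literal) triples as RDF/XML."""
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    descriptions = {}
    for subject, predicate, obj in triples:
        node = descriptions.get(subject)
        if node is None:
            # One rdf:Description element per distinct subject.
            node = ET.SubElement(root, f"{{{RDF_NS}}}Description",
                                 {f"{{{RDF_NS}}}about": subject})
            descriptions[subject] = node
        # Each triple becomes a child element named after its predicate.
        ET.SubElement(node, f"{{{DC_NS}}}{predicate}").text = obj
    return ET.tostring(root, encoding="unicode")


# Hypothetical dataset description.
triples = [
    ("http://example.org/dataset/1", "title", "Survey results"),
    ("http://example.org/dataset/1", "creator", "A. Author"),
]
rdf_xml = triples_to_rdfxml(triples)
print(rdf_xml)
```

The result is a flat description: a subject with a handful of property values. Nothing in it says who the creator is affiliated with or which project produced the dataset, unless further linked triples are supplied.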
Open Data
An Opportunity
• Semantic Web
• LOD: Linked Open Data
• RDF
  – Triples
  – Expressed as XML
• Metadata
  – DC
  – CKAN
A Problem
• Most portals clickable lists of datasets
• Most datasets pdf or xls
  – Essentially documents
• Very little metadata
• Metadata ‘flat’ and poor
• Not linked to underlying research datasets
The Vision: Metadata for Data Model
• DISCOVERY (DC, eGMS…)
• CONTEXT (CERIF)
• DETAIL (SUBJECT OR TOPIC SPECIFIC)
[Figure labels: Generate; Point to; Linked open data; Formal Information Systems]
Open Data and The worlds of information processing
• LOD, Semantic Web, RDF: browsing, ease of use
• Relational (Links): integrity, performance
[Figure labels: generate; provide access to]
Example: summary data in semantic web/LOD environment (RDF) with associated processing
Example: research datasets in Relational DB environment with associated analysis, visualisation, data mining…
• Manual download, manual connection to software, manual integration
• Automated download, automatic connection to software, automated integration
The Vision: The Models
[Figure labels: Complete cohort of researchers, research managers, innovators, media; Complete ICT environment for research]
• Models: Processing Model; User Model; Data Model; Resource Model
• Annotations: interaction with data, processing, persons; providing what the user requires; representing research; representing ICT
• Callout: We are talking about this
Conclusion
• Architecture underpinning open data with quality research information
• CERIF provides formality and assurance
• Metadata interconvertors: CERIF superset generating the less rich metadata formats: DC, CKAN…
The provision of quality metadata assures quality to be confirmed by audit
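A metadata interconvertor of the kind mentioned in the conclusion might look like the following Python sketch, which flattens a richer, linked record into simple Dublin Core elements. The input layout (`links` with `kind`, `role`, `label`) and the function `to_dublin_core` are hypothetical stand-ins for a CERIF-like record, not the real CERIF schema; the output keys are genuine Dublin Core element names.

```python
def to_dublin_core(record):
    """Flatten a rich, linked record into simple Dublin Core elements."""
    dc = {
        "title": record["title"],
        "date": record["date"],
        "creator": [],
        "contributor": [],
        "relation": [],
    }
    for link in record["links"]:
        if link["kind"] == "person" and link["role"] == "author":
            dc["creator"].append(link["label"])
        elif link["kind"] == "person":
            # Other person roles collapse into one generic element.
            dc["contributor"].append(link["label"])
        else:
            # Non-person links keep only a label; the role is lost.
            dc["relation"].append(link["label"])
    return dc


# Hypothetical rich record with typed, role-bearing links.
record = {
    "title": "Technical report on sampling",
    "date": "2012-06-01",
    "links": [
        {"kind": "person", "role": "author", "label": "A. Author"},
        {"kind": "person", "role": "editor", "label": "B. Editor"},
        {"kind": "project", "role": "output of", "label": "Example project"},
    ],
}

dc = to_dublin_core(record)
print(dc)
```

The conversion is lossy by design: roles and link types disappear in the flat output, so the richer format can generate the poorer one but not the other way round. That asymmetry is why the sketch treats the richer record as the source of truth.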