UK DATA ARCHIVE
Louise Corti, ODAF April 2008
UK Data Archive
an internationally renowned centre of expertise in data acquisition, preservation, dissemination and promotion
curator of the largest collection of digital data in the social sciences and humanities in the UK
provides resource discovery and support for the secondary use of quantitative and qualitative data in research, learning and teaching
a lead partner of the Economic and Social Data Service (ESDS)
provides preservation services for other data organisations
facilitates international data exchange
UKDA holdings
Data for research and teaching purposes and used in all sectors and for many different disciplines
official agencies - mainly central government
individual academics - research grants
market research agencies
public records/historical sources
links to UK census data
qualitative and quantitative
international statistical time series
access to international data via links with other data archives worldwide
history data service in-house (AHDS)
5,000+ datasets in the collection
250+ new datasets are added each year
60,000+ datasets distributed worldwide p.a.
Preservation
UKDA currently preserve approximately 4,600 studies
occupying about 650GB but with capacity for more than 3TBytes on main system
266,000 files, 56,000 directories (average file size 2.6MBytes)
growing by about 100GB per year
more than 40 years of electronic data preservation
have (so far) not lost any data!
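The preservation figures above allow a rough headroom estimate. The sketch below is illustrative only and is not part of the original slides: it assumes growth stays linear at the quoted rate, reads "3TBytes" as 3000 GB (decimal units), and uses hypothetical function names.

```python
# Approximate figures quoted on the Preservation slide.
HOLDINGS_GB = 650           # "about 650GB"
CAPACITY_GB = 3000          # assumption: "3TBytes" taken as 3000 GB (decimal)
GROWTH_GB_PER_YEAR = 100    # "about 100GB per year"
FILE_COUNT = 266_000        # "266,000 files"


def years_until_full(holdings_gb, capacity_gb, growth_gb_per_year):
    """Years of remaining capacity, assuming constant linear growth."""
    if growth_gb_per_year <= 0:
        raise ValueError("growth must be positive")
    return max(capacity_gb - holdings_gb, 0) / growth_gb_per_year


def mean_file_size_mb(holdings_gb, file_count):
    """Mean file size in MB (decimal), derived from total size and file count."""
    return holdings_gb * 1000 / file_count


headroom_years = years_until_full(HOLDINGS_GB, CAPACITY_GB, GROWTH_GB_PER_YEAR)
mean_mb = mean_file_size_mb(HOLDINGS_GB, FILE_COUNT)

print(f"Headroom at current growth: {headroom_years:.1f} years")
print(f"Mean file size from rounded totals: {mean_mb:.2f} MB")
```

Because the inputs are rounded, the derived mean file size only roughly matches the 2.6MBytes quoted on the slide, and the headroom figure should be read as an order-of-magnitude estimate.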
ESDS structure
ESDS Management
central help desk service; coherent and flexible collections development policy; central registration service; links to other ESRC resources
ESDS Access and Preservation
collections development strategy; ingest activities - including data and documentation processing; metadata creation; data dissemination services; long-term preservation
Specialist data services
ESDS Government
ESDS International
ESDS Longitudinal
ESDS Qualidata
• dedicated web sites
• data and documentation enhancements
• tailored user support
• outreach and training
Data support services (DSS)
Run ESDS advisory service for researchers
data creation, data management and sharing
Run environmental data support
new kinds of data
Bidding for MRC DSS
Finding data
catalogue of holdings – some 4600 collections
limited and basic DDI 2.0 TO Describes study, methods and data collection
records all study related publications (voluntary)
lists variables for SPSS datasets
can download user guide free (pdf)
Data sharing and access
registration using Athens including agreement to an End User Licence, fine-grained access control
download service (SPSS, STATA, ASCII, RTF etc)
online data browsing
Nesstar - simple data analysis, visualisation, downloading and subsetting of survey and aggregate data (XML)
ESDS Qualidata online – exploring qualitative data (XML)
Beyond 20/20 – tabulating and graphing international macro databanks
UKDA R&D
data management – advice & training
consent and confidentiality – advice
access and authentication systems – Shibboleth
Secure data service (bid)
Data exchange standards & tools – survey and qualitative
Preservation metadata and METS
thesaurus development
Self-archiving FEDORA system
text mining applications for textual data
Web 2.0 & social networking tools – self tagging; feedback; Facebook
survey question bank (bid)
E-science – discussions on grid-enabling data
What we’d like to do if we had money
more of last slide
data visualisation – numbers and words and beyond NESSTAR
based on open source tools!
intelligent resource discovery – text mining capacity plus linking catalogues in different domains
more ‘harmonised’ data – across series
legacy work to bring collections up to scratch
digitisation of paper/analogue sources