Community Capability Model for Data-Intensive Research

An archaeological case-study

Graeme Earl

Archaeological Computing Research Group & Digital Economy University Strategic Research Group


Archaeology: data-intensive research

• Big Data research: capture (e.g. CT) & processing (e.g. HPC)

• Workflows and provenance

• Citation of data

• Interlinking data and publication

• Retention and migration

• Deposit/ Self-deposit

• Exposure: Semantic Web


AHRC Portus Project

• 2007-11 AHRC Portus Project

• 2011-2014 AHRC Portus in the Roman Mediterranean Project

• Excavation, Survey, Geophysics, Finds Analysis

• Computer graphic simulation; field data capture and processing; data management


Portus and the Tiber Valley


Institutional Data Management Blueprint

• Produce framework for managing research data for an HEI

• Scope and evaluate a pilot implementation plan for an institution-wide data model


Policy Best Practice

Pilot Projects

Training & Workshops


Key Findings

1. Group research practice is embedded and unified

2. Group data management capabilities vary widely

3. Data management is carried out on an ad-hoc basis in many cases

4. Researchers’ demand for storage is significant

5. Users want more support for backup, particularly for large quantities of data

6. Researchers resort to their own best efforts in many cases, where central support does not meet their needs


Key Findings

• Researchers want to keep their data for a long time

• There is a need from researchers to share data, both locally and globally

• Data curation and preservation support needs to be improved

• Data can be BIG

• Project management is part of the data cycle


Portus data requirements analysis

• Data integration

• Data volume

• Data variety

• Data management (versioning, migration, linking)

• Fit to archaeological practice

– Context

– Finds

– Field practice

– Technological practice


Portus Data Guidance

• Archaeology Data Service – Guides to Good Practice; Deposit

• CIDOC CRM

• The English Heritage STAR project

• The Forum for Information Standards in Heritage (FISH)

• MIDAS XML schema

• JISC INCREMENTAL and SUDAMIH

• Temporal period metadata: English Heritage Timelines Thesaurus; COMMONERAS

• Asset management: JISCDIGITALMEDIA; EXIF, XMP or IPTC data

• Getty Art and Architecture Thesaurus

• Collections Trust’s Archaeological Objects Thesaurus

• Archaeological geospatial data: UK GEMINI (Geo-spatial Metadata Interoperability Initiative)

• Geophysical data: English Heritage Geophysical Survey Database

• Laser scanning data: Laser Scanning Addendum to the Metric Survey Specification (Heritage3d.org); 3D-COFORM

• Reflectance Transformation Imaging: CHI and AHRC RTISAD (Empirical Provenance)


How to address human factors holding research back?

• Skills: Broad range of archaeologically specific courseware, exemplars and best practice guidance

• Legal: Remains problematic but increasing support and expectation of data sharing from administration

• Collaboration: archaeology is collaborative from the ground up

• Closed/ Open: Archaeology is at a crossroads; funding means that openness is likely to be the only option but disciplinary cultures need to be addressed; strong open community

• E.g. research by Leif Isaksen on Roman Port Networks


How to address economic factors holding research back?

• Sustainability: Research council funding and mandate: institutional and disciplinary repositories; user-focussed tools

• Operational costs: complex issue, being addressed at Institutional level e.g. policy re: disciplinary curation

• Transactional costs: limited attempts to operate a pay model for data transactions in archaeology: some museum and media related; open model dominating current work


How to address technical factors holding research back?

• Standards: CIDOC CRM/ CRM-EH, FISH etc. – embedding in technology and data uptake key

• Access and exposure: Linked Open Data; green or gold open access

• Platforms and tools: broad range, but increasing trend towards standardisation


Portus in SharePoint 2010

• Archaeology SharePoint Pilot (Metadata and Visualisation Component)


Virtual Research Environment (VRE) Toolkit for SharePoint

• SES SharePoint Pilot (Project Management)


DepositMO Project Overview

• Embed deposit culture into everyday work of researchers

• Turn repositories into an extension of the desktop

• Extend Microsoft Office

• Target EPrints and DSpace

• Enable repository deposit as part of everyday workflow


Future for Portus Data

• Planned for deposit from the start of the AHRC Portus Project

• Complete migration to partnership: SharePoint archive and ARK database

• Strategies for big data

• Strategies for integrated workflows

Scientific Workflows (e.g. I2S2; MyExperiment) and Blogs (e.g. Blog3)

e.g. RTI SharePoint 2010:

(Big) Data

Metadata

Project pages, blogs, wikis, finances

www.southampton.ac.uk/muvis


Future for Portus Data

• The latest AHRC funding (2011-2014) will take the project from analysis of data through to linking of publications to the data upon which they are based

– Expose Linked Open Data

– Repository integration is core to our on-going work, including self-deposit (SWORD-ARM & DataPool)

– VRE developments to enable SharePoint -> Repository

– Experimental Data-Publication linking e.g. RIN