A centre of expertise in digital information management
www.ukoln.ac.uk
UKOLN is supported by:
Monica Duke
Project Manager/Researcher
29th March 2011 Aston Business School
SageCite Project
http://blogs.ukoln.ac.uk/sagecite/
#sagecite
Citation in the domain of disease network modelling
[Diagram labels: Data, Process, Publication; Research Object; Citation Chains; Credit and Attribution; Data, Process, Publication, ?]
Sage data and processes
• The idealised Sage modelling process can be divided into 7 stages
• A combination of phenotypic, genetic, and expression data is processed to determine a list of genes associated with diseases
• Different people are responsible for different stages of the modelling process, though one person oversees the whole process
Stage 2: Statistical QC
• Actual values in data sets are validated for quality to check for experimental artifacts
• The checks made depend on the type of data set and involve the use of R scripts and tools like Plink
• The output is a normalised data set
[Diagram: Curated data sets → Statistical QC → Validated & curated data sets]
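As a rough illustration of the kind of check this stage describes, the sketch below flags values that sit far from the median as possible experimental artifacts and then standardises what remains. It is written in Python for brevity and is not the R scripts or Plink commands used in the Sage process; the function names, the threshold of 3.5 and the sample values are all assumptions made for illustration.

```python
from statistics import mean, median, pstdev

def flag_artifacts(values, threshold=3.5):
    """Return indices whose modified z-score (based on the median
    absolute deviation) exceeds the threshold. Hypothetical helper."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return []  # no spread, so nothing can be flagged
    return [i for i, v in enumerate(values)
            if 0.6745 * abs(v - med) / mad > threshold]

def normalise(values):
    """Standardise values to zero mean and unit (population) variance."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

# Made-up measurements; the fifth value imitates an artifact.
raw = [5.1, 4.9, 5.3, 5.0, 42.0, 5.2]
bad = set(flag_artifacts(raw))
clean = [v for i, v in enumerate(raw) if i not in bad]
normalised = normalise(clean)
```

A real QC step would choose its checks per data type, as the slide notes; the median-based rule here is only one common option.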
Domain Complexity
• Multistage process
 – Each stage is specialised
• Several people involved
• Size/specialisation
Unpacking Citation
• I cite others
 – I need to give attribution to others
• I make my work citable
 – Make it easy to cite my work
• Others cite me
 – Get credit when others attribute me
I cite others
• Challenges
 – Tracking what data I have used
 – Some information may be confidential
 – Some data may be restricted access
 – What if I have modified the data?
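One possible way to approach the tracking and modification questions is to log a fingerprint of each data set at the moment it is used, so a later copy can be compared against it. The sketch below assumes this approach; the `DataUse` record, the helper names and the identifier `example-dataset-001` are hypothetical and not part of any SageCite tooling.

```python
import hashlib
from dataclasses import dataclass

@dataclass(frozen=True)
class DataUse:
    identifier: str           # e.g. a DOI or accession number
    sha256: str               # fingerprint of the content as used
    restricted: bool = False  # marks restricted-access data

def fingerprint(content: bytes) -> str:
    """Hex SHA-256 digest of the data set content."""
    return hashlib.sha256(content).hexdigest()

def record_use(log, identifier, content, restricted=False):
    """Append a usage record to the log and return it."""
    entry = DataUse(identifier, fingerprint(content), restricted)
    log.append(entry)
    return entry

def was_modified(entry, content):
    """True if the content no longer matches the logged fingerprint."""
    return fingerprint(content) != entry.sha256

# Illustrative data only.
log = []
original = b"gene,expression\nA,1.2\nB,0.7\n"
entry = record_use(log, "example-dataset-001", original)
edited = original.replace(b"0.7", b"0.8")
```

Here `was_modified(entry, edited)` signals that the edited copy differs from what was logged, which is the point at which a new citation or identifier might be considered.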
I make my work citable
Support others to: Discover, Access, Re-use
[Diagram labels (steps 1 to 6): Taverna workflow; Produces citable data; Register, submit metadata; Mint DOIs; DataCite API; Google API; DataCite; sagecite demo repository; Generate landing page for data and store; Resolve to landing page; Workflow metadata; Additional steps for citing data; For referring to data reported in the provenance]
The relationships between data (identified by DataCite DOIs) and tools are captured by the provenance (OPM) produced by Taverna
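The register, mint and resolve steps named in the diagram can be pictured with a small in-memory stand-in. This sketch does not call the DataCite API or any repository service; the class `DemoRepository`, the base URL, the suffix scheme and the use of `10.5072` as a placeholder prefix are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class DemoRepository:
    """Hypothetical in-memory stand-in for a DOI registry plus landing pages."""
    prefix: str = "10.5072"  # placeholder prefix, not a real allocation
    base_url: str = "https://repository.example.org/landing/"
    records: dict = field(default_factory=dict)

    def mint(self, title, creator):
        """Register minimal metadata and return a new DOI string."""
        suffix = f"sagecite-demo-{len(self.records) + 1}"
        doi = f"{self.prefix}/{suffix}"
        self.records[doi] = {
            "title": title,
            "creator": creator,
            "landing_page": self.base_url + suffix,
        }
        return doi

    def resolve(self, doi):
        """Return the landing page stored for a DOI."""
        return self.records[doi]["landing_page"]

repo = DemoRepository()
doi = repo.mint("Normalised expression data", "Example Creator")
landing = repo.resolve(doi)
```

In the workflow on the slide these steps run inside Taverna, so the minted DOI is available to be reported in the provenance.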
I make my work citable
• Challenges
 – Making my work re-usable
 – Granularity of credit
 – When to assign a new identifier
   • What type of identifier
 – What represents intellectual input: which contributions deserve to be cited?
Others cite me
• Recognising contributions other than publications
• Granularity of roles and contribution
• Will added value be recognised?
• What metrics to use
• Linking all my contributions together
• What constitutes “publication”?
[Diagram labels: Sage bionetwork model (Co-expn, Bayesian); Workflow; Provenance; Gene URI; myExperiment URI; Data sets in GEO database; DataCite (Register, Submit); Open Provenance Model; W3C Provenance Incubator; RDF; Data creator (ORCID); Workflow user (ORCID); Scientific publication (DOI, Pubmed Id); Publication?]
G. A. Thorisson, University of Leicester
Identity Workshop prep-meeting, Helsinki, January 27 2011
Publishing a journal article
Publishing a dataset
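To make the linking in the diagram concrete, the sketch below stores a few provenance statements as plain triples and walks them to assemble a citation chain for one output. The edge names `used`, `wasGeneratedBy` and `wasControlledBy` come from the Open Provenance Model; everything else (the identifiers, the `citation_chain` helper, the tuple representation instead of RDF) is an assumption for illustration.

```python
# OPM edge names used in this sketch.
USED = "used"                          # process -> artifact
WAS_GENERATED_BY = "wasGeneratedBy"    # artifact -> process
WAS_CONTROLLED_BY = "wasControlledBy"  # process -> agent

# Hypothetical identifiers standing in for DOIs, workflow runs and ORCIDs.
triples = [
    ("run:1", USED, "doi:10.5072/input-data"),
    ("doi:10.5072/model-output", WAS_GENERATED_BY, "run:1"),
    ("run:1", WAS_CONTROLLED_BY, "orcid:workflow-user"),
]

def citation_chain(output, triples):
    """Collect the processes, source data and agents behind an output."""
    processes = [o for s, p, o in triples
                 if s == output and p == WAS_GENERATED_BY]
    sources = [o for s, p, o in triples
               if s in processes and p == USED]
    agents = [o for s, p, o in triples
              if s in processes and p == WAS_CONTROLLED_BY]
    return {"processes": processes, "sources": sources, "agents": agents}

chain = citation_chain("doi:10.5072/model-output", triples)
```

The resulting dictionary names the data to cite and the person to credit for the run, which is the information the diagram ties together through DOIs and ORCIDs.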
Different forms of publication
• As support for an article
• Publish to a repository/archive
• Blogs or other social networking sites
• Micro-attribution (nano-publication)
Working with ORCID
• Contributor ID
G. A. Thorisson, University of Leicester
Identity Workshop prep-meeting, Helsinki, January 27 2011
Centrally-managed informatics infrastructure:
i) for researchers to manage & use profile
ii) for tracking author-to-publication attribution links
iii) interaction with other systems (e.g. publishers, digital libraries)
ORCID ID: G-1442-2009 (J. Smith, Univ. North Pole)
ORCID ID: D-2400-2010 (J. Smith, Luthor Corporation)
ORCID ID: B-1242-2010 (G. Thorisson, Univ. Leicester; G. A. Thorisson, Univ. Leicester; G. A. Thorisson, Cold Spring Harbor Lab.)
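The disambiguation idea in this example can be sketched as a lookup from contributor ID to the name and affiliation variants it covers. The IDs and names below are the illustrative ones from the slide; the `profiles` structure and the `ids_for_name` helper are hypothetical and do not reflect the ORCID service or its API.

```python
# Contributor ID -> list of (name, affiliation) variants, as on the slide.
profiles = {
    "G-1442-2009": [("J. Smith", "Univ. North Pole")],
    "D-2400-2010": [("J. Smith", "Luthor Corporation")],
    "B-1242-2010": [
        ("G. Thorisson", "Univ. Leicester"),
        ("G. A. Thorisson", "Univ. Leicester"),
        ("G. A. Thorisson", "Cold Spring Harbor Lab."),
    ],
}

def ids_for_name(name, profiles):
    """Return the sorted contributor IDs that list this exact name."""
    return sorted(
        cid for cid, variants in profiles.items()
        if any(n == name for n, _ in variants)
    )

same_name = ids_for_name("J. Smith", profiles)
one_person = ids_for_name("G. A. Thorisson", profiles)
```

The name "J. Smith" maps to two IDs (two different people), while the several Thorisson variants all sit under one ID, which is why attribution by name alone is unreliable.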
Special issue
• New Models of Semantic Publishing in Science
• http://www.semantic-web-journal.net/blog/special-issue-semantic-web-journal-new-models-semantic-publishing-science
• Deadline: 1st May
Acknowledgements
• University of Manchester
 – Carole Goble
 – Peter Li
• British Library
 – Max Wilkinson
 – Tom Pollard
• Sage Bionetworks
• UKOLN
 – Liz Lyon
 – Monica Duke
• Nature Genetics
 – Myles Axton
• PLoS Comp Bio
 – Phil Bourne