
Who’s who on the Web? - Digital identity for tracking contributions and managing data access


DESCRIPTION

10-min presentation on the author name problem, the ORCID initiative and two key applications of digital identity for researchers: i) attribution for data publication and ii) access control for sensitive research data


Page 1: Who’s who on the Web? - Digital identity for tracking contributions and managing data access

G. A. Thorisson www.gen2phen.org

3rd Human Variome Project Meeting, Paris, 10-14 May, 2010

Who’s who on the Web? Digital identity for tracking contributions and managing data access

Gudmundur A. Thorisson <[email protected]>
Brookes lab
Department of Genetics
University of Leicester, UK

-- Outline --

• The author name problem and the challenge of attribution

• Unique identifiers for authors - the ORCID initiative

• A digital identity on the Internet - IDs for researchers?

• Introduce several identity-based projects in our group

• Attribution for data publications

• Access control for sensitive data

Page 2

Non-unique names are a major problem in the scholarly literature

Are these authors all the same person?
G. Thorisson, Univ. Leicester
G. A. Thorisson, Univ. Leicester
G. A. Thorisson, Cold Spring Harbor Laboratory

How about these?
J. Smith
J. Smith
J. Smith
J. Smith
J. Smith [etc.]

Or these?

∼2/3 of the ∼6 million authors in MEDLINE share a last name and first initial with at least one other author, and an ambiguous name refers to ∼8 persons on average.

Torvik and Smalheiser. Author name disambiguation in MEDLINE. ACM Transactions on Knowledge Discovery from Data (2009) vol. 3 (3)
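The "last name plus first initial" collision described on this slide can be sketched in a few lines of Python. This is only an illustration of the problem, not the disambiguation method of Torvik and Smalheiser; the function names and the `person-N` labels are hypothetical.

```python
from collections import defaultdict


def name_key(full_name: str) -> str:
    """Build a 'lastname_firstinitial' key, e.g. 'J. Smith' -> 'smith_j'."""
    parts = full_name.replace(".", " ").split()
    return f"{parts[-1].lower()}_{parts[0][0].lower()}"


def ambiguous_keys(authors):
    """Map each key shared by more than one distinct person to those persons.

    `authors` is an iterable of (person_id, full_name) pairs; the person_id
    values are hypothetical ground-truth labels used only for illustration.
    """
    people_by_key = defaultdict(set)
    for person_id, full_name in authors:
        people_by_key[name_key(full_name)].add(person_id)
    return {key: ids for key, ids in people_by_key.items() if len(ids) > 1}


# Hypothetical records: one person under two name variants,
# and three distinct people who all publish as "J. Smith".
authors = [
    ("person-1", "G. Thorisson"),
    ("person-1", "G. A. Thorisson"),
    ("person-2", "J. Smith"),
    ("person-3", "J. Smith"),
    ("person-4", "J. Smith"),
]
shared = ambiguous_keys(authors)
```

Without the ground-truth labels, which real bibliographic records lack, the key alone cannot tell the single Thorisson from the three Smiths. That gap is the case for a unique contributor identifier.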

Page 3

Unique identifiers for authors/contributors

Dec’09: launch of the Open Researcher Contributor Identification Initiative - ORCID

ORCID ID: B-1242-2010
G. Thorisson, Univ. Leicester
G. A. Thorisson, Univ. Leicester
G. A. Thorisson, Cold Spring Harbor Lab.

automated author disambiguation + author involvement

ORCID ID: G-1442-2009
J. Smith, Univ. North Pole

ORCID ID: D-2400-2010
J. Smith, Luthor Corporation

Informatics infrastructure:
i) for researchers to manage their profile
ii) interaction with other systems


Page 5

Digital identity - an online ‘presence’


Problem: the Web ‘me’ is fragmented across many identity silos

Self-asserted

Identity = The collective aspect of the set of characteristics by which a thing is definitively recognizable or known (dictionary.com)

Many usernames/passwords
- password fatigue -

Asserted by others

Nature Rev Genet G. A. Thorisson <authored> Nat Biotechnol (2009) vol. 27 (11)
Nature Biotech G. A. Thorisson <authored> Nat Rev Genet (2009) vol. 10 (1)
[...]

Nature Rev Genet B-1242-2010 <authored> Nat Biotechnol (2009) vol. 27 (11)
Nature Biotech B-1242-2010 <authored> Nat Rev Genet (2009) vol. 10 (1)
[...]
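Statements of the form "ID <authored> publication" can be treated as subject-predicate-object triples and queried by identifier. The sketch below assumes a simple in-memory list; the `Triple` type and `objects_for` helper are hypothetical names, and the third entry is invented purely to show that a different ID stays separate.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str    # contributor identifier, as written on the slide
    predicate: str  # relationship, e.g. "authored"
    obj: str        # the publication the statement is about


def objects_for(subject: str, predicate: str, triples: list) -> list:
    """Return every object linked to `subject` by `predicate`, in input order."""
    return [t.obj for t in triples if t.subject == subject and t.predicate == predicate]


triples = [
    Triple("B-1242-2010", "authored", "Nat Biotechnol (2009) vol. 27 (11)"),
    Triple("B-1242-2010", "authored", "Nat Rev Genet (2009) vol. 10 (1)"),
    # Hypothetical entry for a different contributor, for illustration only.
    Triple("G-1442-2009", "authored", "hypothetical article by J. Smith"),
]

my_papers = objects_for("B-1242-2010", "authored", triples)
```

Keying on the identifier rather than the name string means statements asserted by different publishers can be merged without any name matching.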

Page 6

Bridging the silos - distributed identity systems


Open, lightweight, decentralised authentication protocol

Web 2.0 social networking

- Federated authentication -

Closed-world: Single sign-on (SSO) across the federation

- reuse profile information
- username/pwd problem

Institution-centric identity: user gt50 at University of Leicester

User-centric identity

Open-world: SSO across the entire Web

Page 7

>50,000 websites now accepting 3rd party IDs (from www.janrain.com)

Page 8

ORCID ID: B-1242-2010
G. Thorisson, Univ. Leicester
G. A. Thorisson, Univ. Leicester
G. A. Thorisson, Cold Spring Harbor Lab.

http://mummi.myopenid.com

A digital identity for researchers centred on scholarly profile?

Page 9

Data-related applications for digital identity (a.k.a. ‘researcher IDs’)

• Identity-enabled access management
– Controlling access to protected resources on the Web

• Tracking contributions
– data submissions to central repositories
– data curation / micro-attribution
– crucial to bio-resource impact factor + nanopublications

Page 10

Cafe RouGE for mutation data exchange


1. Diagnostic laboratories

2. Central mutation depot

3. End-users (e.g. LSDB curators)

Publish data

Owen Lancaster, Session 10, Thu 10:10am

Retrieve RSS feeds

• Security
– Log in via OpenID and other 3rd party identity (a.k.a. ‘outsourced’ security)
– Identity-based management of access to non-open data

• Attribution
– Link ORCID ID with variation data submissions => publication credit
– B-1242-2010 <authored> doi:10.9354/caferouge0005 (Cafe RouGE entry 0005, 19/12/2009)
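The two roles of identity listed above, access control and attribution, can be sketched with one record type. This is a minimal illustration under stated assumptions, not the Cafe RouGE implementation: the `Submission` class, its fields, and both helper functions are hypothetical, and the caller's identity is assumed to have already been verified by a third-party login such as OpenID.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class Submission:
    doi: str                     # dataset identifier, e.g. as minted for the entry
    submitter_id: str            # contributor ID of the submitting researcher
    is_open: bool = True         # open data is readable by anyone
    allowed_ids: Set[str] = field(default_factory=set)  # extra IDs granted access


def can_read(identity: Optional[str], sub: Submission) -> bool:
    """Decide read access from an already-authenticated identity (or None)."""
    if sub.is_open:
        return True
    return identity is not None and (
        identity == sub.submitter_id or identity in sub.allowed_ids
    )


def attribution(sub: Submission) -> str:
    """Render the credit statement in the slide's '<authored>' notation."""
    return f"{sub.submitter_id} <authored> {sub.doi}"


# Entry from the slide, marked non-open here purely for illustration.
entry = Submission(
    doi="doi:10.9354/caferouge0005",
    submitter_id="B-1242-2010",
    is_open=False,
    allowed_ids={"G-1442-2009"},
)
```

The same identifier that gates access to non-open data also carries the publication credit, so a single login serves both purposes.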

DOIs for scholarly publications vs DOIs for datasets

http://www.datacite.org

“Our long term vision is to support researchers by providing methods for them to locate, identify, and cite research datasets with confidence.”

Page 11

HGVbaseG2P - a central genetic association database

• Controlled access to aggregate GWAS datasets

http://www.hgvbaseg2p.org Robert Free, Session 10, Thu 11:30am

Page 12

Molgenis-based distributed G2P data infrastructure

• Identity-based access control + attribution

http://www.molgenis.org Morris Swertz, Session 10, Thu 9:40am

Page 13

The GEN2PHEN Strategy

A federated network of databases, not a central database

Data import systems

Data standards

Database solutions

Locus-specific databases

‘Genomics’ G2P databases

Data exchange mechanisms

The GEN2PHEN Knowledge Centre will fully leverage this network

GEN2PHEN Knowledge Centre

http://www.gen2phen.org Adam Webb, Session 10, Thu 9:40am

Page 14

GEN2PHEN Consortium
http://www.gen2phen.org/about-gen2phen/partners

University of Leicester

-Anthony J. Brookes Bioinformatics Group

Tim Beck

Rob Free

Sirisha Gollapudi

Rob Hastings

Owen Lancaster

Adam Webb

-Raymond Dalgleish

This work has received funding from the European Community's Seventh Framework Programme (FP7/2007-2013) under grant agreement number 200754 - the GEN2PHEN project.

Acknowledgements

Contact me! Gudmundur A. Thorisson <[email protected]>