TAIR Workshop

Model Organism Databases and Community Annotation

Plant and Animal Genome XVI Conference, San Diego, January 13, 2008

Curator-User collaborations in various databases

Karen Yook

Issak Yosief Tecle

Donghui Li, Philippe Lamesch

Community Annotation at TAIR

Philippe Lamesch, Donghui Li

curator@arabidopsis.org

Outline

• Introduction to TAIR

• Community annotation at TAIR: an overview

• Community annotation: process and examples (gene function annotation)

• New initiative on encouraging community annotation: collaborations between Publisher and MODs

• Phil: community involvement in genome annotation

Who we are

• The Arabidopsis Information Resource (TAIR) maintains a database of genetic and molecular biology data for the model plant Arabidopsis

• Funded by NSF

TAIR website: www.arabidopsis.org

TAIR is used worldwide

Visits per month (source: Google Analytics)

What we do: (1) Arabidopsis genome annotation

Swarbreck et al., Nucleic Acids Research 2007

What we do: (2) manual literature curation

• Controlled vocabulary annotations (Gene Ontology, Plant Ontology)

• Gene name, symbol

• Allele, phenotype

• Summary statement composition

What we do: (3) metabolic pathway curation

AraCyc:

A metabolic pathway database for Arabidopsis thaliana that contains information about both predicted and experimentally determined pathways, reactions, compounds, genes and enzymes.

What we do: (4) work with ABRC to distribute research materials

Why encourage community annotation?

[Figure: Arabidopsis literature, number of articles per year, 1999 to 2006; y-axis from 0 to 2500]

We need help!

Benefits for the community

• Increased data accessibility: unpublished data, and data from publications that are not curated by us

• Up-to-date information

• Improved accuracy

Big issues in community annotation

(Diagram: Curators and Community)

How to get the community involved?

• Meetings, workshops

• Contact authors, project PIs

• Invite experts for on-site annotation jamboree

Tools and methods for community annotation

• Submission forms (gene function)

• Post comments on TAIR website (comments on individual gene/seed stock)

• Customized submission for large datasets

• Email helpdesk curator@arabidopsis.org (questions, errors, omissions)

How to submit data to TAIR?

| Submission method | Example |
|---|---|
| Submission form | Gene functional annotation |
| Direct submission on TAIR website | Add comments on individual gene |
| Customized submission process | Large-scale datasets (e.g. Arabidopsis 2010 Project data submission) |
| By email (curator@arabidopsis.org) | Errors, corrections, omissions |

Community annotation process: an example

Submission page

1. User: download, fill in and send back the submission form (example submission: 61 gene products, 204 annotations)

2. Curator: review and curate data (e.g. phloem = PO:0005417)

3. Curator: data loading
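
The curation step above includes mapping terms from a submission to controlled-vocabulary identifiers, as in the slide's example phloem = PO:0005417. The following is a minimal sketch of how such a lookup could be organized, not a description of TAIR's actual tooling. The table `PO_TERMS` and the functions `map_term` and `review_queue` are hypothetical names introduced for illustration; only the phloem entry comes from the slide.

```python
# Hypothetical lookup table. Only the phloem entry is taken from the slide;
# a real table would be loaded from the Plant Ontology itself.
PO_TERMS = {"phloem": "PO:0005417"}


def map_term(free_text, table=PO_TERMS):
    """Normalize a submitted term and look up its ontology ID.

    Returns (normalized_term, ontology_id); ontology_id is None when
    the term is not in the table.
    """
    key = free_text.strip().lower()
    return key, table.get(key)


def review_queue(terms, table=PO_TERMS):
    """Split submitted terms into mapped terms and terms needing manual review."""
    mapped, unresolved = {}, []
    for term in terms:
        key, ontology_id = map_term(term, table)
        if ontology_id is None:
            unresolved.append(key)  # left for a curator to resolve by hand
        else:
            mapped[key] = ontology_id
    return mapped, unresolved
```

In this sketch, anything that does not match the table exactly is deliberately left unresolved rather than guessed, so the curator stays in the loop for ambiguous terms.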

Community annotation on TAIR

User comments on TAIR website

NSF Arabidopsis 2010 Project

To determine the function of each Arabidopsis gene by 2010

| Year funded | Funded projects (*) | Contacted | Data submitted | Data promised | Contacted but did not reply |
|---|---|---|---|---|---|
| 2005 | 16 | 15 | 8 | 6 | 1 |
| 2004 | 14 | 14 | 5 | 9 | - |
| 2003 | 14 | 13 | 7 | 5 | - |
| 2002 | 16 | 15 | 6 | 6 | 2 |
| 2001 | 27 | 25 | 13 | 9 | 2 |
| Total | 87 | 80 | 39 | 35 | 5 |

*Source: National Science Foundation; numbers updated 01/08

Processing time

From TAIR users:

“Fabulous news! I should have thought of submitting it earlier. And thank you for such an informative and organized website. It was THE mainstay of my thesis research, which featured 22k microarrays.” [Donna Lindsay, University of Saskatchewan, Canada]

“Its perfect! Thank you very much for you rapidity.” [Fabienne Granierg, Institut Jean-Pierre Bourgin - INRA, France]

“Thanks for your rapid response.” [Qunfeng Dong, China]

“Thanks for your good work.” [Sabeeha Merchant, UCLA]

2006 high priority journals

CELL, CURRENT BIOLOGY, DEVELOPMENT, GENES AND DEVELOPMENT, NATURE, NATURE CELL BIOLOGY, NATURE GENETICS, NUCLEIC ACIDS RESEARCH, PLoS BIOLOGY, PNAS, SCIENCE, THE EMBO JOURNAL, THE PLANT CELL, THE PLANT JOURNAL, TRENDS IN PLANT SCIENCE

FEBS LETTERS, GENETICS, JOURNAL OF BIOLOGICAL CHEMISTRY, MOLECULAR PLANT-MICROBE INTERACTIONS, PLANT MOLECULAR BIOLOGY, PLANT PHYSIOLOGY

Collaboration between Publishers and TAIR

Editorial board meeting in July 2007

Our proposal: request the following additional information from authors after the paper is accepted:

• AGI locus identifier(s) (e.g. At1g01040): provides clarity and avoids nomenclature conflicts

• Gene function annotation linked to AGI loci, with method

Up-to-date information about Arabidopsis genes from Plant Physiology is available at TAIR
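
The proposal above asks authors for AGI locus identifiers and for annotations that carry a method. The following is a minimal sketch of how such a submission could be pre-checked before it reaches a curator, not TAIR's or any publisher's actual validation. The pattern is an assumption based on the slide's example (At1g01040) and the usual AGI convention of "At", a chromosome code (1 to 5, or C/M for the organelle genomes), "g", and five digits. The row layout `(locus, annotation, method)` and the function names are hypothetical.

```python
import re

# Assumed AGI locus pattern: At + chromosome (1-5, C or M) + g + five digits.
AGI_LOCUS = re.compile(r"AT[1-5CM]G\d{5}", re.IGNORECASE)


def is_agi_locus(identifier):
    """Return True if the whole string looks like an AGI locus identifier."""
    return AGI_LOCUS.fullmatch(identifier.strip()) is not None


def check_submission(rows):
    """Check (locus, annotation, method) rows; return a list of (row_number, problem)."""
    problems = []
    for number, (locus, annotation, method) in enumerate(rows, start=1):
        if not is_agi_locus(locus):
            problems.append((number, f"invalid AGI locus: {locus!r}"))
        if not method.strip():
            problems.append((number, "missing method"))
    return problems
```

A check of this kind only catches malformed identifiers; whether a well-formed locus is the right gene for the annotation still needs a curator's judgment.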

• Launch of new feature via Benchpress

• Authors fill in an online form in TAIR-suggested format

• Data sent to TAIR for processing
