TAIR Workshop
Model Organism Databases and Community Annotation
Plant and Animal Genome XVI Conference, San Diego, January 13, 2008
2
Curator-User collaborations in various databases
Karen Yook
Issak Yosief Tecle
Donghui Li
Philippe Lamesch
4
Outline
• Introduction to TAIR
• Community annotation at TAIR: an overview
• Community annotation: process and examples (gene function annotation)
• New initiative on encouraging community annotation: collaborations between Publisher and MODs
• Phil: community involvement in genome annotation
6
Who we are
• The Arabidopsis Information Resource (TAIR) maintains a database of genetic and molecular biology data for the model plant Arabidopsis
• Funded by NSF
7
TAIR website: www.arabidopsis.org
8
TAIR is used worldwide
Visits per month (source: Google Analytics)
9
What we do: (1) Arabidopsis genome annotation
Swarbreck et al., Nucleic Acids Research 2007
10
What we do: (2) manual literature curation
• Controlled vocabulary annotations
Gene Ontology
Plant Ontology
• Gene name, symbol
• Allele, phenotype
• Summary statement composition
11
What we do: (3) metabolic pathway curation
AraCyc:
A metabolic pathway database for Arabidopsis thaliana that contains information about both predicted and experimentally determined pathways, reactions, compounds, genes and enzymes.
12
What we do: (4) work with ABRC to distribute research materials
14
Why encourage community annotation?
[Figure: Arabidopsis literature. Number of articles per year, 1999 to 2006; y-axis from 0 to 2500]
15
Why encourage community annotation?
We need help!
16
Why encourage community annotation?
Benefits for the community
• Increased data accessibility
unpublished data
data from publications that are not curated by us
• Up to date information
• Improved accuracy
17
Big issues in community annotation
How to get the community involved?
Curators
Community
• Meetings, workshops
• Contact authors, project PIs
• Invite experts for on-site annotation jamboree
18
Big issues in community annotation
Tools and methods for community annotation
Curators
Community
• Submission forms (gene function)
• Post comments on TAIR website (comments on individual gene/seed stock)
• Customized submission for large datasets
• Email helpdesk [email protected] (questions, errors, omissions)
20
How to submit data to TAIR?
Submission method | Example
Submission form | Gene functional annotation
Direct submission on TAIR website | Add Comments on individual gene
Customized submission process | Large-scale datasets (e.g. Arabidopsis 2010 Project data submission)
By email ([email protected]) | Errors, corrections, omissions
21
Community annotation process: an example
22
Submission page
23
1. User: download, fill in and send back the submission form
61 gene products, 204 annotations
24
2. Curator: review and curate data
phloem = PO:0005417
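The curation step above maps a submitted free-text term to a controlled vocabulary ID. A minimal sketch of that kind of lookup follows. The table `PO_TERMS` and the function `to_ontology_id` are hypothetical names introduced for illustration; only the phloem entry comes from the slide, and this is not a description of TAIR's actual curation tools.

```python
from typing import Optional

# Hypothetical lookup table. Only the phloem mapping is taken from the slide;
# a real table would be built from the Plant Ontology itself.
PO_TERMS = {"phloem": "PO:0005417"}


def to_ontology_id(term: str, lookup: dict = PO_TERMS) -> Optional[str]:
    """Return the ontology ID for a free-text term, or None if it is unknown."""
    key = term.strip().lower()  # tolerate case and surrounding whitespace
    return lookup.get(key)
```

Terms that return `None` would be the ones a curator needs to review by hand.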
25
3. Curator: data loading
26
Community annotation on TAIR
28
User comments on TAIR website
29
User comments on TAIR website
31
NSF Arabidopsis 2010 Project
To determine the function of each Arabidopsis gene by 2010
Year funded | Funded projects (*) | Contacted | Data submitted | Data promised | Contacted but did not reply
2005 | 16 | 15 | 8 | 6 | 1
2004 | 14 | 14 | 5 | 9 | -
2003 | 14 | 13 | 7 | 5 | -
2002 | 16 | 15 | 6 | 6 | 2
2001 | 27 | 25 | 13 | 9 | 2
Total | 87 | 80 | 39 | 35 | 5
*Source: National Science Foundation; numbers updated 01/08
32
Processing time
From TAIR users:
“Fabulous news! I should have thought of submitting it earlier. And thank you for such an informative and organized website. It was THE mainstay of my thesis research, which featured 22k microarrays.” [Donna Lindsay, University of Saskatchewan, Canada]
“Its perfect! Thank you very much for you rapidity.” [Fabienne Granierg, Institut Jean-Pierre Bourgin - INRA, France]
“Thanks for your rapid response.” [Qunfeng Dong, China]
“Thanks for your good work.” [Sabeeha Merchant, UCLA]
34
2006 high priority journals
CELL, CURRENT BIOLOGY, DEVELOPMENT, GENES AND DEVELOPMENT, NATURE, NATURE CELL BIOLOGY, NATURE GENETICS, NUCLEIC ACIDS RESEARCH, PLoS biology, PNAS, SCIENCE, THE EMBO JOURNAL, THE PLANT CELL, THE PLANT JOURNAL, TRENDS IN PLANT SCIENCE
FEBS LETTERS, GENETICS, JOURNAL OF BIOLOGICAL CHEMISTRY, MOLECULAR PLANT-MICROBE INTERACTIONS, PLANT MOLECULAR BIOLOGY, PLANT PHYSIOLOGY
35
Collaboration between Publishers and TAIR
Editorial board meeting in July 2007
Our proposal: request the following additional information from authors after a paper is accepted:
• AGI locus identifier(s) (e.g. At1g01040)
Provides clarity, avoids nomenclature conflicts
• Gene function annotation linked to AGI loci, with method
Up-to-date information about Arabidopsis genes from Plant Physiology is available at TAIR
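Since the proposal asks authors for AGI locus identifiers, a submission form could check their shape before the data reaches a curator. The sketch below is an assumption-laden illustration: the pattern ("At", a chromosome code of 1 to 5, C or M, "g", then five digits) is inferred from the example At1g01040 and common usage, and the function names are hypothetical, not part of any TAIR or publisher system.

```python
import re

# Assumed shape of an AGI locus identifier, inferred from "At1g01040":
# "At" + chromosome (1-5, C = chloroplast, M = mitochondrion) + "g" + 5 digits.
AGI_LOCUS = re.compile(r"AT[1-5CM]G\d{5}", re.IGNORECASE)


def is_agi_locus(identifier: str) -> bool:
    """Return True if the whole string matches the assumed AGI locus pattern."""
    return AGI_LOCUS.fullmatch(identifier.strip()) is not None


def normalize_agi_locus(identifier: str) -> str:
    """Return the identifier in upper case, or raise ValueError if malformed."""
    if not is_agi_locus(identifier):
        raise ValueError(f"not an AGI locus identifier: {identifier!r}")
    return identifier.strip().upper()
```

Normalizing to one case lets differently capitalized spellings of the same locus be treated as a single gene. This is one way to address the nomenclature conflicts the slide mentions.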
36
Collaboration between Publishers and TAIR
• Launch of new feature via Benchpress
• Authors fill in an online form in TAIR-suggested format
• Data sent to TAIR for processing