Curation of the EcoCyc Database:
The EcoCyc Update Project
Martha Arnaud
Scientific Database Curator
Bioinformatics Research Group
SRI International
http://www.ecocyc.org
http://www.biocyc.org
SRI International Bioinformatics
EcoCyc Organization
EcoCyc collects information about multiple types of database objects:
Pathway *
Reaction *
Compound *
Protein
Gene *
Transcription Unit
* hierarchies
Diagram labels: Proteins, Compounds, Genes, Pathway, Reactions
EcoCyc Statistics
176 pathways
992 enzymes
1006 enzymatic reactions
169 transporters
828 transcription units
1929 proteins have a comment (598 of them longer than 300 characters)
EcoCyc Pathway Information
http://biocyc.org:1555/ECOLI/new-image?type=PATHWAY&object=ALANINE-VALINESYN-PWY&detail-level=2
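The pathway page address above follows a simple query-string pattern (type, object, detail-level). As a minimal sketch, assuming only the host, port, path, and parameter names shown on the slide, the URL can be assembled and taken apart with the Python standard library. The helper name pathway_url is hypothetical, and the server may no longer answer at this address.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Host, port, and path are copied from the slide; they may no longer be live.
BASE = "http://biocyc.org:1555/ECOLI/new-image"


def pathway_url(object_id, detail_level=2):
    """Build a pathway display URL (hypothetical helper, not part of any EcoCyc API)."""
    query = urlencode(
        {"type": "PATHWAY", "object": object_id, "detail-level": detail_level}
    )
    return f"{BASE}?{query}"


# Rebuild the URL shown on the slide, then split it back into its parameters.
url = pathway_url("ALANINE-VALINESYN-PWY")
params = parse_qs(urlsplit(url).query)

print(url)
print(params["object"][0], params["detail-level"][0])
```

Changing the object argument to another pathway frame ID would, under the same assumption about the URL pattern, address a different pathway page.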
EcoCyc Metabolic Overview
http://biocyc.org/ov-expr.shtml
Static or animated views of expression data
EcoCyc Curation
names and synonyms
gene classes
subunit composition of protein complexes
location of gene product
protein or complex molecular weight
enzyme activity name
enzyme properties (activators, inhibitors, cofactors)
comment fields
evidence
citations
reactions catalyzed
pathway information
Build a new MOD or add a “Pathway Module”!
Pathway Tools Software
- Takes annotated genome
- Generates database, including pathway predictions
Freely available (academics/non-profits)
http://bioinformatics.ai.sri.com/ptools/
Pathway Tools software environment for creation, curation, analysis, and Web publishing of MODs
Current Pathway Tools Users
Saccharomyces cerevisiae, SGD, Stanford University
Arabidopsis thaliana, Carnegie Institution of Washington
Plasmodium falciparum, Stanford University
Mycobacterium tuberculosis, Stanford University
Synechocystis, Carnegie Institution of Washington
Methanococcus jannaschii, EBI
EcoCyc into the Future:
“EcoCyc is not just metabolism anymore!”
…an integrated, review-level information resource on E. coli genomics and biochemistry…
The EcoCyc Update Project:
What do we need to do? Goals
Can we possibly get it done? Quantification
Where do we start? Priorities
How is it going? Progress
EcoCyc Update: Curation Goals
Expand database scope beyond metabolism, transporters, and transcription
Curate associated reactions and pathways
Stay current with the latest papers
Curate every gene product:
- literature-based descriptions
- comprehensive reference lists
EcoCyc Update: Quantification
4405 genes - 175 transcription factors - 168 transporters = 4062 genes to curate
Full-time curator: 4 days/week on curation
+ Part-time curator (70%), years 2-4
Year 1: 1600 hours
Year 2: 3000 hours
Year 3: 3000 hours
Year 4: 3000 hours
Total: 10,600 hours / 4062 genes = 2.6 hours per gene
Curation of abstracts
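The quantification above is plain arithmetic, and a short sketch makes each step checkable. All figures are taken from the slide; the variable names are illustrative only.

```python
# Figures copied from the Quantification slide.
GENES_TOTAL = 4405
TRANSCRIPTION_FACTORS = 175
TRANSPORTERS = 168
HOURS_PER_YEAR = {1: 1600, 2: 3000, 3: 3000, 4: 3000}

# Genes left after setting aside transcription factors and transporters.
genes_to_curate = GENES_TOTAL - TRANSCRIPTION_FACTORS - TRANSPORTERS

# Curation hours summed over the four-year plan.
total_hours = sum(HOURS_PER_YEAR.values())

# Average time budget per gene.
hours_per_gene = total_hours / genes_to_curate

print(genes_to_curate)           # 4062
print(total_hours)               # 10600
print(round(hours_per_gene, 1))  # 2.6
```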
EcoCyc Update: Priorities
1. Problems raised by users and advisors
2. Gene products that have new characterizations published in the literature
3. Gene products that have not yet been thoroughly curated
4. Gene products that have been curated, but have not been updated lately
Where are we now?
807 gene products curated.
807/4062 = 19.9% of the total
(excluding transport and transcription factors)
4-year plan: Curate 615 genes in Year 1
We are meeting our goal!
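The progress figures can be checked the same way. The sketch below uses only numbers stated on the slides. The reading that the Year 1 goal of 615 genes comes from dividing the 1600 Year 1 hours by 2.6 hours per gene is an assumption; the slides do not state how the goal was derived.

```python
# Figures copied from the slides.
GENES_TO_CURATE = 4062
CURATED = 807
YEAR1_GOAL = 615
YEAR1_HOURS = 1600
HOURS_PER_GENE = 2.6

# Share of the curation target completed so far.
fraction_done = CURATED / GENES_TO_CURATE
print(f"{fraction_done:.1%}")  # 19.9%

# Assumed derivation of the Year 1 goal: available hours / hours per gene.
implied_goal = round(YEAR1_HOURS / HOURS_PER_GENE)
print(implied_goal)  # 615

# Progress relative to the stated Year 1 goal.
goal_met = CURATED >= YEAR1_GOAL
print(goal_met)  # True
```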
The EcoCyc Collaboration
SRI
Peter Karp, PI
Suzanne Paley, Software Engineer
John Pick, Software Engineer
Martha Arnaud, Curator
UCD
John Ingraham, Project Leader
MBL
Monica Riley, Editor Emerita
UNAM
Julio Collado-Vides, Project Leader
Socorro Gama-Castro, Curator
Martin Peralta, Curator
TIGR
Ian Paulsen, Project Leader
Mark Hance, Curator
UCSD
Milton Saier, Project Leader
Can Tran, Curator
Funding: NIH National Center for Research Resources
Pathway/Genome DBs Created by External Users
Saccharomyces cerevisiae, Stanford University
pathway.yeastgenome.org/biocyc/
Plasmodium falciparum, Stanford University
plasmocyc.stanford.edu
Mycobacterium tuberculosis, Stanford University
BioCyc.org
Arabidopsis thaliana and Synechocystis, Carnegie Institution of Washington
Arabidopsis.org:1555
Methanococcus jannaschii, EBI
Maine.ebi.ac.uk:1555
Other PGDBs in progress by 40 other users
Software freely available
Each PGDB owned by its creator