PATO, December 2006

WormBase -- one Web site, many roles

Page 1: WormBase -- one Web site, many roles

Page 2: WormBase -- one Web site, many roles

Caenorhabditis elegans

Page 3: WormBase -- one Web site, many roles

Diverse data (from 3,739 papers)

[Bar chart, vertical axis 0 to 1200. Data types: Expression data; RNAi; Gene function; Sequence change; Transgene; Gene product interactions; Antibody; Gene-gene interactions; Gene-seq, gene name, synonym; Structure correction; Mutant phenotype; New allele; Overexpression; Site of action analysis; Sequence features; Mapping data; Cell (name, function, ablation); Mosaic analysis; Structural info; Protein functions in vitro; Microarray; Covalent modification; SNPs; Functional complementation. Bar values: 1129, 834, 610, 598, 529, 503, 479, 419, 351, 344, 326, 278, 193, 150, 130, 124, 85, 58, 58, 57, 43, 26, 20, 12.]

Page 4: WormBase -- one Web site, many roles

Prior to July, 2006:

• 127 phenotype objects in WormBase.
• three-tiered organization (specialization_of or generalization_of)
• redundancy existed between terms
• no phenotype term definitions, references
• many RNAi experiments annotated to ‘Unclassified’ phenotype term
• ‘Not’ phenotype associations were not captured
• Phenotype vocabulary was not used for annotation of alleles and transgene objects

Page 5: WormBase -- one Web site, many roles

A controlled and structured vocabulary for phenotypes:

• allows complex data queries, and expedites analysis of genes that act in the same processes or pathways.
• helps to integrate a massive array of data from many different sources into a common body of knowledge.
• provides the option of linking phenotype data with other data in WormBase or with data from other databases.
• facilitates communication within and outside of the C. elegans community
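As an illustration of the kind of query a structured vocabulary makes possible, here is a minimal sketch that collects genes annotated to a term or to any more specific term. The term names are taken from later slides, but the parent/child links, the gene names, and the function names are all hypothetical and are not WormBase's actual data model.

```python
# Hypothetical parent -> children links. The term names appear in the slides,
# but their placement in the hierarchy here is an assumption.
CHILDREN = {
    "early_embryonic_lethal": [
        "pleiotropic_defects_severe_early_emb",
        "complex_phenotype_early_emb",
    ],
}

# Hypothetical gene -> phenotype annotations (placeholder gene names).
ANNOTATIONS = [
    ("gene-a", "early_embryonic_lethal"),
    ("gene-b", "pleiotropic_defects_severe_early_emb"),
    ("gene-c", "complex_phenotype_early_emb"),
    ("gene-d", "transgene_expression_abnormal"),
]


def descendants(term, children=CHILDREN):
    """Return the term plus every term below it in the hierarchy."""
    found = {term}
    stack = [term]
    while stack:
        current = stack.pop()
        for child in children.get(current, []):
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


def genes_with_phenotype(term, annotations=ANNOTATIONS):
    """Genes annotated to the term or to any more specific term."""
    wanted = descendants(term)
    return sorted({gene for gene, pheno in annotations if pheno in wanted})


print(genes_with_phenotype("early_embryonic_lethal"))
```

With free-text descriptions, the same question would require matching many different wordings; with a hierarchy, one query on a high-level term also picks up the more specific annotations.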

Page 6: WormBase -- one Web site, many roles

Expansion of the phenotype ontology, source for term names:

• text descriptions in WormBase
• free text phenotype descriptions associated with alleles
• text associated with RNAi objects annotated to ‘Unclassified’ phenotype
• prior phenotype terms in WormBase
• GO ontology
• WormBase anatomy ontology
• Life stage ontology

Term names and synonyms reflect the language of researchers.

Page 7: WormBase -- one Web site, many roles

The WormBase phenotype ontology is a pre-coordinated ontology:

1,348 terms; ~20% of terms are defined

Page 8: WormBase -- one Web site, many roles

Current term usage:

• 40% used for annotation
• 60% not associated with an annotation

Page 9: WormBase -- one Web site, many roles

RNAi-phenotype data:

• 272,759 total RNAi-phenotype connections
• 63,439 RNAi experiments
• 19,692 genes associated with phenotypes via RNAi experiments:
  • 19,185 genes connected via “Not” associations
  • 4,577 genes connected directly
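The counts on this slide can be related to each other with a little arithmetic. The sketch below assumes that the 19,692 genes are the union of the “Not” group and the directly connected group, which the slide does not state explicitly; under that assumption, inclusion-exclusion gives the number of genes that fall into both groups.

```python
# Counts copied from the slide.
connections = 272_759
experiments = 63_439
genes_total = 19_692
genes_not = 19_185
genes_direct = 4_577

# Average number of phenotype connections per RNAi experiment.
per_experiment = connections / experiments

# Assumption: genes_total is the union of the two groups, so
# inclusion-exclusion gives the number of genes present in both.
genes_in_both = genes_not + genes_direct - genes_total
genes_only_not = genes_not - genes_in_both
genes_only_direct = genes_direct - genes_in_both

print(f"{per_experiment:.2f} connections per experiment")
print(genes_in_both, genes_only_not, genes_only_direct)
```

Under that assumption, most genes with any RNAi phenotype connection also carry at least one “Not” association, and comparatively few have only a direct connection.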

Page 10: WormBase -- one Web site, many roles

Allele-phenotype data:

• Most phenotype connections are to knockout alleles (NBP).
• Ongoing:
  • Continuing to collect phenotype data from the community.
  • Starting to annotate early papers describing large collections of mutants -> many high-level phenotype annotations.
  • Starting to annotate new papers.

Currently, 4,401 total allele-phenotype connections to 2,585 alleles, defining 1,296 genes.

Page 11: WormBase -- one Web site, many roles

Lots of RNAi data -> dense early_embryonic_lethal node:

Page 12: WormBase -- one Web site, many roles

Vague collections of phenotypes present challenges for ontology/annotation:

pleiotropic_defects_severe_early_emb: “Often multiple pronuclei, aberrant cytoplasmic texture, drop in overall pace of development, osmotic sensitivity.”

complex_phenotype_early_emb: “Complex combination of defects that does not match other class definitions.”

Page 13: WormBase -- one Web site, many roles

Looking ahead to an entity-quality compatible schema:

• Within OBO-Edit we store relevant GO term names within primary names or synonym names (GO ID stored in relevant dbxref field)
• Phenotype ontology is developed using existing anatomy and life stage term names
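One way to picture why embedding existing term names helps is the sketch below: if a pre-coordinated phenotype name or synonym ends in a quality word and begins with a known entity name, it can be split into an entity-quality pair. Everything here is hypothetical: the class names, the example term "cytokinesis_abnormal", the placeholder dbxref, and the splitting rule are illustrative assumptions, not the WormBase or OBO-Edit schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class PhenotypeTerm:
    """Hypothetical record for a pre-coordinated phenotype term."""
    name: str
    synonyms: List[str] = field(default_factory=list)
    dbxrefs: List[str] = field(default_factory=list)  # e.g. a GO ID


@dataclass
class EQStatement:
    entity: str   # a GO, anatomy, or life stage term name
    quality: str  # e.g. "abnormal"


def to_entity_quality(term: PhenotypeTerm,
                      known_entities: Set[str]) -> Optional[EQStatement]:
    """Split an 'entity_quality' style name when the entity part is known."""
    for candidate in [term.name, *term.synonyms]:
        entity, sep, quality = candidate.rpartition("_")
        if sep and entity in known_entities:
            return EQStatement(entity=entity, quality=quality)
    return None


# Placeholder term and placeholder ID, for illustration only.
term = PhenotypeTerm("cytokinesis_abnormal", dbxrefs=["GO:XXXXXXX"])
print(to_entity_quality(term, {"cytokinesis"}))
```

Terms whose names do not embed a known entity name, such as the vague early-embryo classes on the previous slide, simply return `None` in this sketch and would need manual decomposition.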

Page 14: WormBase -- one Web site, many roles

Phenotype data integration:

• Phenotype annotations are associated with molecular information for alleles, transgenes, and RNAi objects that permits mapping these objects to the genome.
• High-level phenotype annotations associated with RNAi objects are automatically converted to GO terms (RNAi2GO) and associated with gene objects.
• Phenotype annotations describing gene regulation (‘transgene_expression_abnormal’) are linked with detailed gene regulation information.
• Phenotypes are linked to life stage and anatomy terms.
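The conversion of RNAi phenotype annotations to GO terms can be pictured roughly as a lookup followed by a transfer to the targeted gene. The sketch below is not the RNAi2GO pipeline itself: the mapping table, the placeholder GO IDs, the record layout, and the choice to skip ‘Not’ associations are all assumptions made for illustration.

```python
# Hypothetical lookup from high-level phenotype terms to GO terms.
# The GO IDs are placeholders, not real identifiers.
PHENOTYPE_TO_GO = {
    "early_embryonic_lethal": "GO:PLACEHOLDER_1",
}

# Hypothetical RNAi records: (RNAi id, targeted gene, phenotype term, is_not).
RNAI_ANNOTATIONS = [
    ("rnai-1", "gene-a", "early_embryonic_lethal", False),
    ("rnai-2", "gene-b", "early_embryonic_lethal", True),
    ("rnai-3", "gene-c", "complex_phenotype_early_emb", False),
]


def rnai_to_go(rnai_annotations, phenotype_to_go):
    """Transfer mapped GO terms from RNAi objects to their target genes."""
    gene_go = {}
    for _rnai_id, gene, phenotype, is_not in rnai_annotations:
        if is_not:
            continue  # assumption: 'Not' associations yield no GO annotation
        go_term = phenotype_to_go.get(phenotype)
        if go_term is None:
            continue  # phenotype has no mapped GO term in this sketch
        gene_go.setdefault(gene, set()).add(go_term)
    return gene_go


print(rnai_to_go(RNAI_ANNOTATIONS, PHENOTYPE_TO_GO))
```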

Page 15: WormBase -- one Web site, many roles

RNAi summary on gene page:

Page 16: WormBase -- one Web site, many roles

Sample detailed RNAi report:

Page 17: WormBase -- one Web site, many roles

Sample allele report:

Page 18: WormBase -- one Web site, many roles

Immediate future plans:

• Ontology:
  • Define terms, further refine ontology (expansion will be dictated by community feedback and curation needs)
  • Solicit more expert community feedback
• Web site:
  • Enhance phenotype search tools

Page 20: WormBase -- one Web site, many roles

WormBase = ~30 people, 4 centers

Cold Spring Harbor Laboratory

Payan Canaran, Jack Chen, Tristan Fiedler, Todd Harris, Sheldon McKay, Will Spooner, Lincoln Stein

California Institute of Technology

Igor Antoshechkin, Carol Bastiani, Juancarlos Chan, Wen Chen, Ranjana Kishore, Raymond Lee, Hans-Michael Müller, Cecilia Nakamura, Andrei Petcherski, Gary Schindelman, Erich Schwarz, Paul Sternberg, Kimberly Van Auken, Daniel Wang, Xiaodong Wang

Washington University at St. Louis

Tamberlyn Bieri, Darin Blasiar, Phil Ozersky, John Spieth

Wellcome Trust Sanger Institute

Paul Davis, Richard Durbin, Michael Han, Anthony Rogers, Mary Ann Tuli, Gary Williams

Page 21: WormBase -- one Web site, many roles

Other acknowledgements:

• NIH/NHGRI
• C. elegans research community