Ontologies and vocabularies supporting data integration: emphasis on mouse phenotypes and disease model

Janan T. Eppig, PATO Meeting, Dec. 2006


Page 1: Ontologies and vocabularies  supporting data integration: emphasis on mouse phenotypes

Ontologies and vocabularies supporting data integration: emphasis on mouse phenotypes and disease model

Control: C3H/HeJ

Homozygous: Fasl<gld>/Fasl<gld>

The mouse generalized lymphoproliferative disease (gld) mutation in the FAS ligand (TNF superfamily, member 6) gene.

These mice model human Autoimmune Lymphoproliferative Syndrome (ALPS), type IB.

Janan T. Eppig

PATO Meeting, Dec. 2006

Page 2: Ontologies and vocabularies  supporting data integration: emphasis on mouse phenotypes

The genetic tools for mouse provide an ideal platform for experimentation:

• genetic engineering techniques to specifically manipulate the genome

• sequenced genome

• inbred strains

• high resolution maps

• mammal: small, easy to breed and maintain, short lifespan

• similar to human genetically & physiologically

• human disease model

• ES cell lines

Page 3: Ontologies and vocabularies  supporting data integration: emphasis on mouse phenotypes

• short domed skull

• short-limbed dwarfism

• malocclusion

• bulging abdomen as adults

• respiratory problems

• shortened lifespan

Achondroplasia

Homozygous achondroplasia mouse mutant and control

…facilitating the use of the mouse as a model for human biology by providing integrated access to data on the genetics, genomics, and biology of the laboratory mouse.

www.informatics.jax.org

Page 4: Ontologies and vocabularies  supporting data integration: emphasis on mouse phenotypes

Objective

…make phenotype and disease model data robust and accessible to researchers and computational biologists

• semantically consistent search methods

• integrated access to all phenotypic variation sources (single-gene and genomic mutations, QTLs, strains)

• ability to query across sequence, orthology, expression, function, phenotype, disease

• data on human disease correlation

• access to mouse models from various approaches
  - Genetic
  - Phenotypic
  - Computational

Page 5: Ontologies and vocabularies  supporting data integration: emphasis on mouse phenotypes

Existing Wealth of Mouse Phenotype Data in MGI

• >16,800 phenotypic alleles representing ≈6,830 unique genes

• >71,000 annotations associating MP terms to genotypes

• >6,550 phenotype records for 3,210 QTL

• >9,000 strains catalogued

Page 6: Ontologies and vocabularies  supporting data integration: emphasis on mouse phenotypes

A few of the challenges

• alleles can produce pleiotropic phenotypic effects

• non-allelic mutations can produce indistinguishable phenotypes

• modifiers and epistasis can influence mutant phenotypes

• alleles of different genes can interact to produce unique phenotypes

• genetic background can greatly influence mutant phenotypes

• imprinted genes/alleles influence phenotype

• quantitative trait loci (QTLs) can contribute unequally to phenotypes

• genomic mutations can delete or disrupt multiple genes

• strains (“whole-genome”) have characteristic phenotypes

• complex genetically engineered and multiple mutation stocks are often developed for disease models

• environmental influences and age can dramatically affect phenotype

Page 7: Ontologies and vocabularies  supporting data integration: emphasis on mouse phenotypes

Data Challenge

Mouse phenotype data from

• publications

• electronic submissions

• mutagenesis (ENU centers)

(≈ 300 new alleles; ≈ 700 publications per month on phenotypes)

New initiatives to knock-out every gene in the mouse in next 5 years…

Need for efficiency, accuracy, full description of complex observations, storage/analysis of individual and population data

Page 8: Ontologies and vocabularies  supporting data integration: emphasis on mouse phenotypes

Making semantic sense

Controlled vocabularies/nomenclatures

• Strains

• Genes

• Alleles (phenotypic or variant)

• Classes of genetic markers

• Types of mutations

• Types of assays

• Developmental stages

• Tissues

• Clone libraries

• ES cell lines

….. organized as lists or simple hierarchies

Page 9: Ontologies and vocabularies  supporting data integration: emphasis on mouse phenotypes

Clone Library Names

Inbred Strain Names

Gene Symbols

Page 10: Ontologies and vocabularies  supporting data integration: emphasis on mouse phenotypes

Hbp1 (high mobility group box transcription factor 1) gene expression differences in Kit<W-e>/Kit<W-e> homozygotes vs wild-type

Assay

Gene nomenclature

Results

Specimen

Page 11: Ontologies and vocabularies  supporting data integration: emphasis on mouse phenotypes

Semantics plus relationship data

Ontologies/structured vocabularies

• Gene Ontology (GO)
  - Molecular function
  - Biological process
  - Cellular component

• Mouse Anatomy (MA)
  - Embryonic
  - Adult

• Mammalian Phenotype (MP)

• Sequence Ontology (SO)

….. organized as directed acyclic graphs (DAGs)

Page 12: Ontologies and vocabularies  supporting data integration: emphasis on mouse phenotypes

Navigating the views of phenotypes & disease

1. Gene Page

2. Phenotype Query

3. MP Ontology

4. Disease vocabulary

5. Sequence (GBrowse)

Summary: phenotype classes & human disease associated

Summary: genotype, MP term, & ref

Phenotype detail, including genotypes for mouse models of human diseases

Human/mouse disease relationships

Page 13: Ontologies and vocabularies  supporting data integration: emphasis on mouse phenotypes

enlarged brain ventricles: L1cam<tm1Mtei>/Y

• 129/SvEv: none affected

• C57BL/6J: high percentage affected

postnatal death: Gnas<tm1Kel-pat>/Gnas<+>

• 129/Sv * C57BL/6J: most die by P2; all by P9

• 129/Sv * C57BL/6J * CD-1: most die by P9; 10-20% survive past P21

TMEV viral susceptibility: Cd8a<tm1Mak>/Cd8a<tm1Mak>

• C57BL/6: inflammation after infection resolves by 45 days; disease is absent by 10 mo.

• PL/J: viral infection persists

Genotype = allele combinations carried in the context of a specific genetic background (strain)
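
The definition above can be sketched as a small data structure. This is only an illustration, not MGI's schema: the class name `Genotype`, its fields, and the helper `backgrounds_for` are hypothetical, and the allele symbols are written with the superscript part in angle brackets, which is an assumption about the notation lost from the slide. The observations are copied from the examples on this slide.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Genotype:
    """Allele combination plus genetic background (hypothetical model)."""
    allele_pair: Tuple[str, str]
    background: str


# Keying on the full genotype keeps the same allele pair on two
# backgrounds as two separate records, each with its own observation.
observations: Dict[Genotype, Dict[str, str]] = {
    Genotype(("L1cam<tm1Mtei>", "Y"), "129/SvEv"): {
        "enlarged brain ventricles": "none affected",
    },
    Genotype(("L1cam<tm1Mtei>", "Y"), "C57BL/6J"): {
        "enlarged brain ventricles": "high percentage affected",
    },
    Genotype(("Cd8a<tm1Mak>", "Cd8a<tm1Mak>"), "C57BL/6"): {
        "TMEV viral susceptibility": "inflammation resolves by 45 days",
    },
    Genotype(("Cd8a<tm1Mak>", "Cd8a<tm1Mak>"), "PL/J"): {
        "TMEV viral susceptibility": "viral infection persists",
    },
}


def backgrounds_for(allele_pair: Tuple[str, str]) -> List[str]:
    """Return every background on which this allele pair was recorded."""
    return sorted(g.background for g in observations if g.allele_pair == allele_pair)
```

Looking up `backgrounds_for(("L1cam<tm1Mtei>", "Y"))` returns both strains, which is the point of the slide: the phenotype cannot be attached to the allele pair alone.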

Page 14: Ontologies and vocabularies  supporting data integration: emphasis on mouse phenotypes

Mammalian Phenotype Ontology

• Structured as DAG

• Over 4,500 terms covering physiological systems, behavior, development and survival

• Available in browser and OBO formats from MGI ftp and OBO sites

• Each term linked to all annotations to the term or its children
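
The last bullet, linking a term to annotations made to it or to its children, can be sketched as a traversal of the DAG. This is a minimal sketch under stated assumptions: the term IDs and genotype labels are invented placeholders, "children" is read here as all descendants, and the function names are hypothetical rather than anything MGI exposes.

```python
from collections import defaultdict

# Toy DAG: child term -> parent terms (placeholder IDs, not real MP IDs).
# A term may have more than one parent, which is what makes this a DAG
# rather than a tree.
parents = {
    "MP:B": ["MP:A"],
    "MP:C": ["MP:A"],
    "MP:D": ["MP:B", "MP:C"],
}

# Annotations made directly to each term (placeholder genotype labels).
direct_annotations = {
    "MP:A": {"genotype-1"},
    "MP:B": {"genotype-2"},
    "MP:D": {"genotype-3"},
}


def build_children(parent_map):
    """Invert the child -> parents map into parent -> children."""
    children = defaultdict(set)
    for child, term_parents in parent_map.items():
        for parent in term_parents:
            children[parent].add(child)
    return children


def annotations_with_descendants(term, children, direct):
    """Collect annotations to a term and to every term below it."""
    seen = set()
    stack = [term]
    collected = set()
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # reached via a second parent; already counted
        seen.add(current)
        collected |= direct.get(current, set())
        stack.extend(children.get(current, ()))
    return collected


children = build_children(parents)
```

The `seen` set matters because a term with two parents would otherwise be visited twice; the result is a set, so the annotations themselves are never double-counted.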

Page 15: Ontologies and vocabularies  supporting data integration: emphasis on mouse phenotypes

Summary Results

• Genotypes that are annotated to a term or children of the term

• References supporting annotation

• Links to allele detail pages for full mutant phenotype

Page 16: Ontologies and vocabularies  supporting data integration: emphasis on mouse phenotypes

Allele Detail Page

• full phenotype annotations (MP) for each genotype

• specific detail for MP terms

• each MP annotation referenced

• human diseases for which genotype is used as a model

Page 17: Ontologies and vocabularies  supporting data integration: emphasis on mouse phenotypes
Page 18: Ontologies and vocabularies  supporting data integration: emphasis on mouse phenotypes

Mouse model genotypes linked to phenotype details

Genes associated with phenotypes characteristic of a disease in human, mouse, or both

Page 19: Ontologies and vocabularies  supporting data integration: emphasis on mouse phenotypes

osteopetrosis

Human-mouse disease relationships

• OMIM terms: 6,113

• Genotypes associated w/ OMIM: 1,847

• OMIM associated w/ genotypes: 720

to Human Disease and Mouse Model Page

Page 20: Ontologies and vocabularies  supporting data integration: emphasis on mouse phenotypes

Vocabularies in MGI

DAGs

Definition

Synonyms

MP:1956

Strain: AEJ

Alleles: bd/bd

Genotype

Strain: C57BL/6

Alleles: Ppp1r3a<tm1Adpt>/Ppp1r3a<tm1Adpt>

Terms

Respiratory failure

Postnatal lethality

Dilated renal tubules

Growth retardation

VocabularyNote

J:65378 TAS

J:62648 IDA

J:65322 EE

Annotations

Page 21: Ontologies and vocabularies  supporting data integration: emphasis on mouse phenotypes

Making Mammalian Phenotype Ontology Work

DAG

• accommodate bio-specific terms

• computationally useful

• human accessible

• practical for curation

• cross-reference to other ontologies

Page 22: Ontologies and vocabularies  supporting data integration: emphasis on mouse phenotypes

Terms in MP

MP term | Entity | Quality | Other info
microphthalmia | eye | small size |
hydrocephaly | cerebro-spinal fluid | increased, excessive |
hydrocephaly | brain | large size (dilated) | trauma observed; brain
increased blood pressure | ? | increased |
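
The entity/quality breakdown of MP terms on this slide can be sketched as a lookup table. This is only an illustration: the `EQ` type and `terms_affecting` helper are hypothetical names, and the pairing of entities with qualities is one reading of the flattened slide table, not a confirmed mapping.

```python
from typing import Dict, List, NamedTuple, Optional


class EQ(NamedTuple):
    """One entity/quality pair (hypothetical structure for illustration)."""
    entity: Optional[str]  # None where the slide shows "?"
    quality: str


# One MP term may decompose into several entity/quality pairs.
eq_decomposition: Dict[str, List[EQ]] = {
    "microphthalmia": [EQ("eye", "small size")],
    "hydrocephaly": [
        EQ("cerebro-spinal fluid", "increased"),
        EQ("brain", "large size"),
    ],
    # The slide leaves the entity for this term as an open question.
    "increased blood pressure": [EQ(None, "increased")],
}


def terms_affecting(entity: str) -> List[str]:
    """Return MP terms whose decomposition mentions the given entity."""
    return sorted(
        term
        for term, pairs in eq_decomposition.items()
        if any(pair.entity == entity for pair in pairs)
    )
```

Keeping the unknown entity as `None` rather than guessing preserves the open question the slide raises about terms such as increased blood pressure.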

Page 23: Ontologies and vocabularies  supporting data integration: emphasis on mouse phenotypes

Future MP Ontology Development

• New terms from ongoing curation process

• Collaborative community efforts
  - identify new terms
  - revise organization of existing terms within particular branches

• Recruit domain experts for systematic review

• Cross-ref and comparison to other relevant ontologies (GO, Anatomy, Cell Type, Mpath, etc.)

[Chart: MP Ontology Growth, 2003 to 2006; vertical axis from 0 to 4,500]

Page 24: Ontologies and vocabularies  supporting data integration: emphasis on mouse phenotypes

Collaborators

…currently annotating with MP and contributing to MP development

• Rat Genome Database (RGD)

• Mouse Mutagenesis Centers

• Human (NCBI/dbSNP)

• Online Mendelian Inheritance in Animals (OMIA)

…under discussion

• Teratology Society

• Animal Traits

Page 25: Ontologies and vocabularies  supporting data integration: emphasis on mouse phenotypes

Summary

• Structured vocabularies and ontologies support semantic integration for the MGI system and promote broader integration of mouse knowledge

• To meet community needs, practical implementations parallel formal ontological development

• MGI has implemented a generalized structure for vocabularies and ontologies

• The Mouse Genome Informatics group continues its strong interest and participation in community bio-ontology efforts

Page 26: Ontologies and vocabularies  supporting data integration: emphasis on mouse phenotypes

Human FOXN1 (forkhead box N1)

T-CELL IMMUNODEFICIENCY, CONGENITAL ALOPECIA, AND NAIL DYSTROPHY

Frank J, et al. Nature 398, 473-474 (1999)

Mouse Foxn1

Homozygous “nude” mouse. One of eight known phenotypic mutations in mouse (6 spontaneous; 2 engineered) for the forkhead box N1 gene.

www.informatics.jax.org

Page 27: Ontologies and vocabularies  supporting data integration: emphasis on mouse phenotypes