UniProt & Ontologies

Brief overview of the different "ontologies" used by the UniProt protein sequence database.

UniProt & Ontologies: How ontologies are used in the context of a large life sciences database

Eric Jain, Swiss Institute of Bioinformatics, Geneva, April 2007

What is UniProt?

UniMES

UniRef

UniParc

...

UniProtKB

UniParc: 8.9M sequences

UniRef: 8.9M clusters (50%: 1.5M, 90%: 2.9M, 100%: 4.5M)

UniProtKB: 4.5M entries (reviewed: 0.3M)

Species & organelles

Description of function etc

Keywords & GO

Description of sequence features

Sequence(s)

Literature Citations

Cross-References

Protein & gene names

What is an Ontology?

unique identifier

names and synonyms

relationships within the ontology

stable!

mapped to other ontologies

human-readable definitions

machine-readable definitions
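
The properties listed above can be read as the fields of a simple record. The following is a minimal sketch, not UniProt's actual data model: the `Term` class, its field names, and the `EX:` and `OTHER:` identifiers are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    """One ontology term. All field names are illustrative assumptions."""
    id: str                 # unique, stable identifier
    name: str               # preferred name
    synonyms: tuple = ()    # alternative names
    definition: str = ""    # human-readable definition
    is_a: tuple = ()        # relationships within the ontology (parent ids)
    xrefs: tuple = ()       # mappings to terms in other ontologies


# Hypothetical terms; the identifiers do not belong to any real ontology.
TERMS = {
    t.id: t
    for t in [
        Term("EX:0001", "cellular component"),
        Term("EX:0002", "organelle", is_a=("EX:0001",)),
        Term(
            "EX:0003",
            "nucleus",
            synonyms=("cell nucleus",),
            definition="Organelle that contains the chromosomes.",
            is_a=("EX:0002",),
            xrefs=("OTHER:42",),
        ),
    ]
}


def ancestors(term_id, terms=TERMS):
    """Return the ids of all terms reachable by following is_a links."""
    seen = set()
    stack = list(terms[term_id].is_a)
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(terms[parent].is_a)
    return seen
```

Because identifiers are stable, annotations can store only the id and still pick up later changes to names, synonyms, or definitions.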

Why?

Practical: navigation, auto-completion, etc.

Consistency! More than one way to say one thing...

Aggregate, i.e. set-oriented views

Automate?
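
To illustrate the consistency and aggregation points, here is a small sketch under stated assumptions: the synonym table, the parent links, the `EX:` identifiers, and the entry names `P1` to `P3` are invented for the example and do not come from UniProt.

```python
# Hypothetical synonym table: several spellings map to one term id.
SYNONYMS = {
    "nucleus": "EX:0003",
    "cell nucleus": "EX:0003",
    "mitochondrion": "EX:0004",
    "mitochondria": "EX:0004",
}

# Hypothetical single-parent hierarchy (child id -> parent id).
PARENTS = {"EX:0003": "EX:0002", "EX:0004": "EX:0002", "EX:0002": None}

# Free-text annotations as they might appear before normalization.
ANNOTATIONS = {"P1": "Nucleus", "P2": "cell nucleus", "P3": "Mitochondria"}


def normalize(label):
    """Consistency: map any known spelling to its term id (None if unknown)."""
    return SYNONYMS.get(label.strip().lower())


def rolls_up_to(term_id, ancestor_id):
    """True if term_id equals ancestor_id or has it as an ancestor."""
    while term_id is not None:
        if term_id == ancestor_id:
            return True
        term_id = PARENTS.get(term_id)
    return False


def entries_under(ancestor_id):
    """Aggregation: all entries annotated with the term or any descendant."""
    result = set()
    for entry, label in ANNOTATIONS.items():
        term_id = normalize(label)
        if term_id is not None and rolls_up_to(term_id, ancestor_id):
            result.add(entry)
    return result
```

With these made-up tables, a query for the organelle term `EX:0002` collects all three entries, even though none of them is annotated with that term directly.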

What

Keywords

Taxonomy

Enzyme

Pathways

Tissues

Subcellular Locations / Cellular Components

Gene Ontology

Summary

Increased use of ontologies is inevitable as data volumes grow.

UniProt has (or is in the process of introducing) several ontologies.

What data will be "ontologized" and how detailed the ontologies are depends on your feedback!

swiss-prot@expasy.org

beta.uniprot.org

login: guest/amazing
