Brief overview of the different "ontologies" used by the UniProt protein sequence database.
UniProt & Ontologies
How ontologies are used in the context of a large life sciences database
Eric Jain
Swiss Institute of Bioinformatics, Geneva
April 2007
What is UniProt?
UniMES
UniRef
UniParc
...
UniProtKB
UniParc: 8.9M sequences
UniRef: 8.9M clusters
100%: 4.5M
90%: 2.9M
50%: 1.5M
UniProtKB: 4.5M entries
reviewed: 0.3M
Species & organelles
Description of function, etc.
Keywords & GO
Description of sequence features
Sequence(s)
Literature Citations
Cross-References
Protein & gene names
What is an Ontology?
unique identifier
names and synonyms
relationships within the ontology
stable!
mapped to other ontologies
human-readable definitions
machine-readable definitions
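The properties listed above can be sketched as a small data structure. This is only an illustration: the field names, the `EX:` and `OTHER:` identifiers, and the `resolve` helper are assumptions made for the example, not UniProt's actual schema or API.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Term:
    """One ontology term; field names are illustrative, not a UniProt schema."""
    id: str                           # unique, stable identifier
    name: str                         # preferred name
    synonyms: Tuple[str, ...] = ()    # alternative names
    definition: str = ""              # human-readable definition
    is_a: Tuple[str, ...] = ()        # relationships within the ontology
    xrefs: Tuple[str, ...] = ()       # mappings to terms in other ontologies


# Hypothetical terms with made-up identifiers.
TERMS = [
    Term("EX:0001", "organelle", definition="A compartment within a cell."),
    Term("EX:0002", "nucleus", synonyms=("cell nucleus",),
         definition="The organelle that holds the chromosomes.",
         is_a=("EX:0001",), xrefs=("OTHER:42",)),
]

# Index every name and synonym, lowercased, to its identifier.
INDEX: Dict[str, str] = {}
for term in TERMS:
    for label in (term.name, *term.synonyms):
        INDEX[label.lower()] = term.id


def resolve(label: str) -> Optional[str]:
    """Return the identifier for a name or synonym, or None if unknown."""
    return INDEX.get(label.strip().lower())
```

Because names and synonyms all resolve to one identifier, `resolve("Cell Nucleus")` and `resolve("nucleus")` return the same term, and the identifier can stay stable even if the preferred name changes.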
Why?
Practical. Navigation, auto-completion, etc.
Consistency! More than one way to say one thing...
Aggregate, i.e. set-oriented views
Automate?
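The "aggregate" point can be illustrated with a minimal sketch: once terms are linked by relationships, a query for a general term can also return entries annotated with its more specific terms. The term identifiers, entry names, and function names below are made up for the example.

```python
from typing import Dict, Set

# Hypothetical "is a" links: child term -> parent terms.
PARENTS: Dict[str, Set[str]] = {
    "EX:nucleus": {"EX:organelle"},
    "EX:mitochondrion": {"EX:organelle"},
    "EX:organelle": set(),
}

# Hypothetical annotations: entry -> directly assigned terms.
ANNOTATIONS: Dict[str, Set[str]] = {
    "entry1": {"EX:nucleus"},
    "entry2": {"EX:mitochondrion"},
    "entry3": {"EX:organelle"},
    "entry4": set(),
}


def ancestors(term: str) -> Set[str]:
    """Return the term itself plus every term reachable via parent links."""
    seen: Set[str] = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS.get(current, ()))
    return seen


def entries_under(term: str) -> Set[str]:
    """Return entries annotated with the term or any of its descendants."""
    return {
        entry
        for entry, terms in ANNOTATIONS.items()
        if any(term in ancestors(t) for t in terms)
    }
```

Here `entries_under("EX:organelle")` collects entries 1 to 3 as one set, while `entries_under("EX:nucleus")` returns only `entry1`. With free-text annotations, the same view would require matching every spelling variant by hand.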
What?
Keywords
Taxonomy
Enzyme
Pathways
Tissues
Subcellular Locations / Cellular Components
Gene Ontology
Summary
Increased use of ontologies is inevitable as data volumes grow.
UniProt has (or is in the process of introducing) several ontologies.
What data will be "ontologized" and how detailed the ontologies are depends on your feedback!
beta.uniprot.org
login: guest/amazing