24th Feb 2006 Jane Lomax GO Further



GO annotations

• Where do the links between genes and GO terms come from?


GO annotations

• Contributing databases:
– Berkeley Drosophila Genome Project (BDGP)
– dictyBase (Dictyostelium discoideum)
– FlyBase (Drosophila melanogaster)
– GeneDB (Schizosaccharomyces pombe, Plasmodium falciparum, Leishmania major and Trypanosoma brucei)
– UniProt Knowledgebase (Swiss-Prot/TrEMBL/PIR-PSD) and InterPro databases
– Gramene (grains, including rice, Oryza)
– Mouse Genome Database (MGD) and Gene Expression Database (GXD) (Mus musculus)
– Rat Genome Database (RGD) (Rattus norvegicus)
– Reactome
– Saccharomyces Genome Database (SGD) (Saccharomyces cerevisiae)
– The Arabidopsis Information Resource (TAIR) (Arabidopsis thaliana)
– The Institute for Genomic Research (TIGR): databases on several bacterial species
– WormBase (Caenorhabditis elegans)
– Zebrafish Information Network (ZFIN) (Danio rerio)


Species coverage

• All major eukaryotic model organism species

• Human via GOA group at UniProt

• Several bacterial and parasite species through TIGR and GeneDB at Sanger
– many more in pipeline


Annotation coverage


Anatomy of a GO annotation

• Three key parts:
– gene name/id
– GO term(s)
– evidence for association
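
The three parts listed above can be pictured as a small record type. This is only a minimal sketch: the class name `GOAnnotation`, its field names, and the gene id `example_gene_1` are hypothetical and do not come from any GO file format. The GO identifier reused here (GO:0003989) appears elsewhere in these slides, and the check assumes GO ids have the form `GO:` followed by seven digits.

```python
import re
from dataclasses import dataclass

# Assumed shape of a GO identifier: "GO:" followed by seven digits.
GO_ID_PATTERN = re.compile(r"^GO:\d{7}$")


@dataclass(frozen=True)
class GOAnnotation:
    """One annotation: a gene, a GO term, and the evidence linking them."""
    gene_id: str   # gene name or database id
    go_id: str     # GO term identifier
    evidence: str  # evidence code, e.g. "IDA" or "IEA"

    def __post_init__(self):
        # Reject records that are missing any of the three key parts.
        if not self.gene_id:
            raise ValueError("gene_id must not be empty")
        if not GO_ID_PATTERN.match(self.go_id):
            raise ValueError(f"not a GO identifier: {self.go_id!r}")
        if not self.evidence:
            raise ValueError("evidence code must not be empty")


# Hypothetical gene id; the pairing is for illustration only.
example = GOAnnotation(gene_id="example_gene_1",
                       go_id="GO:0003989",
                       evidence="IDA")
```

A gene with several GO terms would simply have several such records, one per term and evidence code.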


Example annotation

• Breast cancer type 1 susceptibility protein gene in humans


Types of GO annotation:

Electronic Annotation

Manual Annotation


Manual annotation

• Created by scientific curators

• High quality

• Small number


Manual annotation

In this study, we report the isolation and molecular characterization of the B. napus PERK1 cDNA, that is predicted to encode a novel receptor-like kinase. We have shown that like other plant RLKs, the kinase domain of PERK1 has serine/threonine kinase activity. In addition, the location of a PERK1-GFP fusion protein to the plasma membrane supports the prediction that PERK1 is an integral membrane protein…these kinases have been implicated in early stages of wound response…


Manual annotation


Electronic Annotation

• Annotation derived without human validation
– mappings files, e.g. interpro2go, ec2go
– BLAST search ‘hits’

• Lower ‘quality’ than experimental codes


Mappings files

• Fatty acid biosynthesis (Swiss-Prot keyword) → GO: fatty acid biosynthesis (GO:0006633)

• EC:6.4.1.2 (EC number) → GO: acetyl-CoA carboxylase activity (GO:0003989)

• IPR000438: Acetyl-CoA carboxylase carboxyl transferase beta subunit (InterPro entry) → GO: acetyl-CoA carboxylase activity (GO:0003989)
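
A mappings file can be read as a lookup from external identifiers to GO terms. The sketch below is a hedged illustration and not the format of the real interpro2go or ec2go files: the dictionary `EXTERNAL_TO_GO`, the key prefixes `SP_KW:` and `InterPro:`, the function name, and the gene id are all hypothetical. The pairing of each identifier with a GO term follows the order in which they appear on this slide, which is an assumption about the original diagram.

```python
# Hypothetical in-memory mapping built from the three examples on the slide.
EXTERNAL_TO_GO = {
    "SP_KW:Fatty acid biosynthesis": ["GO:0006633"],  # Swiss-Prot keyword
    "EC:6.4.1.2": ["GO:0003989"],                     # EC number
    "InterPro:IPR000438": ["GO:0003989"],             # InterPro entry
}


def electronic_annotations(gene_id, external_ids):
    """Return (gene_id, go_id, "IEA") tuples implied by the mapping.

    Duplicate GO terms reached through different external ids are
    reported once, in order of first appearance.
    """
    seen = set()
    annotations = []
    for external_id in external_ids:
        # Identifiers with no mapping contribute nothing.
        for go_id in EXTERNAL_TO_GO.get(external_id, []):
            if go_id not in seen:
                seen.add(go_id)
                annotations.append((gene_id, go_id, "IEA"))
    return annotations


# Hypothetical gene carrying an EC number and an InterPro match.
result = electronic_annotations(
    "example_gene_1", ["EC:6.4.1.2", "InterPro:IPR000438"]
)
```

Because no curator checks the result, every annotation produced this way carries the IEA evidence code.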


Evidence types

• ISS: Inferred from Sequence/structural Similarity
• IDA: Inferred from Direct Assay
• IPI: Inferred from Physical Interaction
• IMP: Inferred from Mutant Phenotype
• IGI: Inferred from Genetic Interaction
• IEP: Inferred from Expression Pattern
• TAS: Traceable Author Statement
• NAS: Non-traceable Author Statement
• IC: Inferred by Curator
• ND: No Data available

• IEA: Inferred from electronic annotation
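
The evidence codes above lend themselves to a simple lookup table, which also makes it easy to separate electronic from manual annotations. This is a sketch under one stated assumption: every code other than IEA is treated as curator-assigned. The names `EVIDENCE_CODES`, `is_electronic` and `manual_only` are hypothetical, and annotations are represented as plain `(gene_id, go_id, evidence)` tuples.

```python
# The eleven evidence codes listed on the slide.
EVIDENCE_CODES = {
    "ISS": "Inferred from Sequence/structural Similarity",
    "IDA": "Inferred from Direct Assay",
    "IPI": "Inferred from Physical Interaction",
    "IMP": "Inferred from Mutant Phenotype",
    "IGI": "Inferred from Genetic Interaction",
    "IEP": "Inferred from Expression Pattern",
    "TAS": "Traceable Author Statement",
    "NAS": "Non-traceable Author Statement",
    "IC": "Inferred by Curator",
    "ND": "No Data available",
    "IEA": "Inferred from electronic annotation",
}


def is_electronic(code):
    """True only for IEA; unknown codes are rejected."""
    if code not in EVIDENCE_CODES:
        raise ValueError(f"unknown evidence code: {code!r}")
    return code == "IEA"


def manual_only(annotations):
    """Keep annotations whose evidence code is not IEA.

    Assumption: all non-IEA codes count as manual (curator-assigned).
    """
    return [a for a in annotations if not is_electronic(a[2])]


# Hypothetical annotations for illustration.
sample = [
    ("example_gene_1", "GO:0003989", "IEA"),
    ("example_gene_1", "GO:0006633", "IDA"),
]
curated = manual_only(sample)
```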


GO terms

• Where do GO terms come from?
– most GO terms are added by the GO editorial office at EBI
– new terms are usually only added when they are asked for by annotators
– GO editors work with experts to make major ontology developments
  • metabolism
  • pathogenesis
  • cell cycle


GO stats

• almost 20,000 GO terms
– 10452 biological_process
– 1687 cellular_component
– 7393 molecular_function
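
As a quick check on "almost 20,000", the three per-ontology counts can be summed and expressed as shares of the total. The variable names below are illustrative; the counts are the ones given on this slide.

```python
# Term counts per ontology, as stated on the slide (February 2006).
term_counts = {
    "biological_process": 10452,
    "cellular_component": 1687,
    "molecular_function": 7393,
}

total_terms = sum(term_counts.values())  # 19532

# Percentage of all terms in each ontology, rounded to one decimal place.
shares = {
    ontology: round(100 * count / total_terms, 1)
    for ontology, count in term_counts.items()
}

for ontology, share in shares.items():
    print(f"{ontology}: {term_counts[ontology]} terms ({share}%)")
print(f"total: {total_terms}")
```

Biological process terms make up just over half of the ontology at this point, and cellular component terms are the smallest group.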


Growth of GO


No GO Areas

• GO covers ‘normal’ functions and processes
– No pathological processes
– No experimental conditions

• NO evolutionary relationships
• NO gene products
• NOT a system of nomenclature


Open Biomedical Ontologies (OBO)

• A repository for well-structured controlled vocabularies for shared use across different biological and medical domains: http://obo.sourceforge.net/


Open Biomedical Ontologies (OBO)

• Requirements for inclusion: http://obo.sourceforge.net/crit.html


AmiGO exercise


Annotation exercise

• We have provided a Nature paper (PMID: 14961121) for you to annotate with GO terms
– This will help you to understand how the information is extracted from papers and how GO terms are applied by the curators
– It will also give you the opportunity to use another GO browser developed at EBI: QuickGO


Annotation exercise

• The gene you are annotating is VG5Q
– To make it easier we’ve highlighted some of the most relevant passages in the text

• Use the GO browser QuickGO to look for the most appropriate GO terms:
– http://www.ebi.ac.uk/ego/


Annotation exercise

• In QuickGO, you search for the GO terms by name

http://www.ebi.ac.uk/ego/


Annotation exercise

• Remember, as well as the GO term, you also need to assign an evidence code
– to remind you, we’ve included a list of the evidence codes at the back of the paper


Annotation exercise

• To see how your annotations compare to those made by the GO curator, search QuickGO for Q8N302
– This is the UniProt ID for the gene VG5Q

• Click ‘show only manual’ and this will show you the annotations the curator made
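
If you want to tally the comparison rather than eyeball it, one possible approach is to treat each annotation as a (GO term, evidence code) pair and compare the two sets. The sketch below is hypothetical throughout: the function name is invented, and the pairs are placeholders reusing GO ids from earlier slides. They are NOT the real VG5Q annotations, which you should copy from QuickGO yourself.

```python
def compare_annotations(mine, curated):
    """Compare two collections of (go_id, evidence_code) pairs.

    Returns the pairs both sides agree on, the curator's pairs that
    are missing from mine, and the pairs only I assigned.
    """
    mine, curated = set(mine), set(curated)
    return {
        "agreed": sorted(mine & curated),
        "missed": sorted(curated - mine),
        "extra": sorted(mine - curated),
    }


# Placeholder data only; replace with your own and the curator's annotations.
my_annotations = [("GO:0003989", "IDA"), ("GO:0006633", "TAS")]
curator_annotations = [("GO:0003989", "IDA")]

report = compare_annotations(my_annotations, curator_annotations)
for label, pairs in report.items():
    print(label, pairs)
```

Comparing pairs rather than GO ids alone means that choosing the right term with a different evidence code shows up as a disagreement, which matches the exercise's emphasis on assigning both.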