From Functional Genomics to Physiological Model: Using the Gene Ontology Fiona McCarthy, Shane Burgess, Susan Bridges The AgBase Databases, Institute of Digital Biology, Mississippi State University


From Functional Genomics to Physiological Model:

Using the Gene Ontology

Fiona McCarthy, Shane Burgess, Susan Bridges
The AgBase Databases, Institute of Digital Biology, Mississippi State University


From Functional Genomics to Physiological Model

1. A user’s guide to the Gene Ontology (GO)

2. Finding GO for farm animal species

3. Adding GO to your dataset

4. GO based tools for biological modeling

5. Examples: using GO for biological modeling


• Presentation available at AgBase
• Websites available as handout


1. A User’s Guide to GO


What is the Gene Ontology?

Emily Dimmer, GOA EBI: “a controlled vocabulary that can be applied to all organisms even as knowledge of gene and protein roles in cells is accumulating and changing”

• assigns functions to gene products at different levels, depending on how much is known about a gene product
• is used for a diverse range of species
• is structured to be queried at different levels, e.g. find all the chicken gene products in the genome that are involved in signal transduction, then zoom in on all the receptor tyrosine kinases
• each human-readable GO function has a digital tag to allow computational analysis of large datasets
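Querying "at different levels" can be illustrated with a small sketch. The term names are real GO terms, but the three-term hierarchy is heavily simplified and the gene products (`geneA`, `geneB`, `geneC`) are hypothetical; a real analysis would load the full ontology rather than a hand-written dictionary.

```python
# Toy GO fragment: child term -> parent terms.
# Simplified for illustration; the real ontology has intermediate terms.
PARENTS = {
    "signal transduction": [],
    "cell surface receptor signaling pathway": ["signal transduction"],
    "transmembrane receptor protein tyrosine kinase signaling pathway": [
        "cell surface receptor signaling pathway"
    ],
}

# Hypothetical annotations: gene product -> most specific term known for it.
ANNOTATIONS = {
    "geneA": "transmembrane receptor protein tyrosine kinase signaling pathway",
    "geneB": "cell surface receptor signaling pathway",
    "geneC": "signal transduction",
}


def ancestors(term):
    """Return the term itself plus every term reachable through parent links."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS.get(current, []))
    return seen


def products_for(term):
    """Gene products annotated to `term` or to any more specific term below it."""
    return sorted(gp for gp, t in ANNOTATIONS.items() if term in ancestors(t))


# Broad query, then "zoom in" on a more specific term.
broad = products_for("signal transduction")
narrow = products_for(
    "transmembrane receptor protein tyrosine kinase signaling pathway"
)
```

The broad query returns every gene product, because an annotation to a specific term also counts for each of its parent terms; the narrow query returns only the gene product annotated at that level.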


GO Mapping Example

NDUFAB1 (UniProt P52505)
Bovine NADH dehydrogenase (ubiquinone) 1, alpha/beta subcomplex, 1, 8kDa

Biological Process (BP or P)
GO:0006633 fatty acid biosynthetic process TAS
GO:0006120 mitochondrial electron transport, NADH to ubiquinone TAS
GO:0008610 lipid biosynthetic process IEA

Cellular Component (CC or C)
GO:0005759 mitochondrial matrix IDA
GO:0005747 mitochondrial respiratory chain complex I IDA
GO:0005739 mitochondrion IEA

Molecular Function (MF or F)
GO:0005504 fatty acid binding IDA
GO:0008137 NADH dehydrogenase (ubiquinone) activity TAS
GO:0016491 oxidoreductase activity TAS
GO:0000036 acyl carrier activity IEA


Parts of an annotation: aspect or ontology, GO:ID (unique), GO term name, GO evidence code


GO EVIDENCE CODES

Direct Evidence Codes
IDA - inferred from direct assay
IEP - inferred from expression pattern
IGI - inferred from genetic interaction
IMP - inferred from mutant phenotype
IPI - inferred from physical interaction

Indirect Evidence Codes
Inferred from literature:
IGC - inferred from genomic context
TAS - traceable author statement
NAS - non-traceable author statement
IC - inferred by curator
Inferred by computational analysis:
RCA - inferred from reviewed computational analysis
ISS - inferred from sequence or structural similarity
IEA - inferred from electronic annotation

Other
NR - not recorded (historical)
ND - no biological data available
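Because every annotation carries an evidence code, a dataset can be filtered by how each annotation was made. The sketch below uses the NDUFAB1 annotations from the GO Mapping Example and the "direct" grouping from the evidence code list; the tuple layout and function names are illustrative choices, not part of any AgBase tool.

```python
# NDUFAB1 (UniProt P52505) annotations from the GO Mapping Example:
# (aspect, GO ID, term name, evidence code)
NDUFAB1 = [
    ("P", "GO:0006633", "fatty acid biosynthetic process", "TAS"),
    ("P", "GO:0006120", "mitochondrial electron transport, NADH to ubiquinone", "TAS"),
    ("P", "GO:0008610", "lipid biosynthetic process", "IEA"),
    ("C", "GO:0005759", "mitochondrial matrix", "IDA"),
    ("C", "GO:0005747", "mitochondrial respiratory chain complex I", "IDA"),
    ("C", "GO:0005739", "mitochondrion", "IEA"),
    ("F", "GO:0005504", "fatty acid binding", "IDA"),
    ("F", "GO:0008137", "NADH dehydrogenase (ubiquinone) activity", "TAS"),
    ("F", "GO:0016491", "oxidoreductase activity", "TAS"),
    ("F", "GO:0000036", "acyl carrier activity", "IEA"),
]

# Direct evidence codes as grouped on the evidence code slide.
DIRECT = {"IDA", "IEP", "IGI", "IMP", "IPI"}


def by_evidence(annotations, codes):
    """Keep only annotations whose evidence code is in `codes`."""
    return [a for a in annotations if a[3] in codes]


def without_iea(annotations):
    """Drop electronically inferred (IEA) annotations."""
    return [a for a in annotations if a[3] != "IEA"]


direct = by_evidence(NDUFAB1, DIRECT)
curated = without_iea(NDUFAB1)
```

Here `direct` keeps the three IDA annotations and `curated` keeps the seven annotations that are not IEA.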


Unknown Function vs No GO

ND: no data
• Biocurators have tried to add GO but there is no functional data available
• Previously annotated to: “process_unknown”, “function_unknown”, “component_unknown”
• Now annotated to the root terms: “biological process”, “molecular function”, “cellular component”

No annotations (including no “ND”): biocurators have not yet annotated the gene product
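The distinction between "curated but no data" and "never curated" can be made explicit when summarizing a dataset. A minimal sketch, assuming each gene product comes with a list of (GO ID, evidence code) pairs; the function name and the gene product keys are hypothetical. The three GO IDs are the root terms for biological process, molecular function and cellular component.

```python
# Root terms that carry ND annotations.
ROOT_TERMS = {"GO:0008150", "GO:0003674", "GO:0005575"}


def annotation_status(annotations):
    """Classify one gene product from its (GO ID, evidence code) pairs."""
    if not annotations:
        # Nobody has looked at this gene product yet.
        return "not annotated"
    if all(code == "ND" for _, code in annotations):
        # A biocurator looked and found no functional data.
        return "no data (ND)"
    return "annotated"


# Hypothetical gene products for illustration.
examples = {
    "product_1": [("GO:0005739", "IEA"), ("GO:0008150", "ND")],
    "product_2": [(go_id, "ND") for go_id in sorted(ROOT_TERMS)],
    "product_3": [],
}
statuses = {name: annotation_status(a) for name, a in examples.items()}
```

Keeping the two "empty" cases apart matters for modeling: the first says function is unknown after review, the second says nothing at all.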


2. Finding GO for Farm Animals


GO Browsers

QuickGO Browser (EBI GOA Project)
http://www.ebi.ac.uk/ego/
• Can search by GO Term or by UniProt ID
• Includes IEA annotations

AmiGO Browser (GO Consortium Project)
http://amigo.geneontology.org/cgi-bin/amigo/go.cgi
• Can search by GO Term or by UniProt ID
• Does not include IEA annotations


Getting GO http://www.ebi.ac.uk/GOA/downloads.html

includes farm animals


Getting GO http://www.geneontology.org/GO.current.annotations.shtml#filter


Getting GO http://www.agbase.msstate.edu/
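Annotation downloads from these sites are typically distributed as tab-delimited gene association files. The sketch below reads the first 15 columns of that format from an in-memory sample; the column names are informal labels chosen here, and the reference, date and assigned-by values in the sample line are placeholders, not real data. Only the accession, symbol, GO ID, evidence code and aspect come from the GO Mapping Example.

```python
import csv
import io

# Informal labels for the first 15 gene association file columns.
GAF_COLUMNS = [
    "db", "db_object_id", "db_object_symbol", "qualifier", "go_id",
    "db_reference", "evidence_code", "with_from", "aspect",
    "db_object_name", "db_object_synonym", "db_object_type",
    "taxon", "date", "assigned_by",
]

# One illustrative line; REF, date and source fields are placeholders.
SAMPLE = (
    "!example header comment\n"
    "UniProtKB\tP52505\tNDUFAB1\t\tGO:0005759\tREF:placeholder\tIDA\t\tC"
    "\t\t\tprotein\ttaxon:9913\t00000000\tplaceholder\n"
)


def read_gaf(handle):
    """Yield one dict per annotation line, skipping '!' comment lines."""
    rows = []
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for fields in reader:
        if not fields or fields[0].startswith("!"):
            continue
        rows.append(dict(zip(GAF_COLUMNS, fields)))
    return rows


records = read_gaf(io.StringIO(SAMPLE))
```

For a downloaded file, the same function can be passed an open file handle instead of the `io.StringIO` sample.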


3. Adding GO to your dataset


GO analysis of array data

• Probe data is linked to gene product data (gene, cDNA, EST IDs)
• For some arrays, gene product data has corresponding GO data available from the vendor (but is it up to date?)
• Not all gene products will have GO annotation; those without GO will not be included in modeling
• Need to get the maximum amount of GO data to do biological modeling
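A quick way to see how much of an array can enter the modeling step is to measure GO coverage. The following is a sketch under simple assumptions: one gene product per probe, plain dictionaries for the two mappings, and hypothetical probe and product identifiers apart from P52505.

```python
def go_coverage(probe_to_product, product_to_go):
    """Split probes into those with GO terms and those without.

    Returns (annotated, missing, fraction_annotated).
    """
    annotated = {}
    missing = []
    for probe, product in probe_to_product.items():
        terms = product_to_go.get(product, [])
        if terms:
            annotated[probe] = terms
        else:
            missing.append(probe)
    total = len(probe_to_product)
    fraction = len(annotated) / total if total else 0.0
    return annotated, sorted(missing), fraction


# Hypothetical array: four probes, two of which map to products with GO.
probe_to_product = {
    "probe_1": "P52505",
    "probe_2": "product_X",
    "probe_3": "product_Y",
    "probe_4": "product_Z",
}
product_to_go = {
    "P52505": ["GO:0005739", "GO:0005504"],
    "product_X": ["GO:0005739"],
    "product_Y": [],
}

annotated, missing, fraction = go_coverage(probe_to_product, product_to_go)
```

The `missing` list is the set of probes that would silently drop out of any GO-based model, which is the list to take to additional annotation sources.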


Example: Netaffx


Secondary source of GO annotation


GORetriever

+ many more


GORetriever Results


save as text file for GOSlimViewer


But what about IDs not supported by GORetriever?


GOanna


GOanna Results


query IDs are hyperlinked to BLAST data (files must be in the same directory)


If there is a good alignment* to a protein with GO: transfer GO to your record

If there is not a good alignment, or the record doesn’t have GO: go to the literature

*WHAT IS A GOOD ALIGNMENT?


good alignment: add to GO summary file (tab-delimited text file containing ID, GO:ID, aspect)
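The transfer step can be sketched as a filter over BLAST hits followed by writing summary lines. What counts as a "good alignment" is left open in the slides, so the identity and E-value cutoffs below are purely hypothetical assumptions, as are the hit dictionary layout and the query IDs; they should be replaced by criteria chosen for the dataset at hand.

```python
# Hypothetical cutoffs; the slides do not define a "good alignment".
MIN_IDENTITY = 70.0   # percent identity
MAX_EVALUE = 1e-20


def is_good_alignment(hit):
    """Apply the assumed cutoffs to one BLAST hit."""
    return hit["percent_identity"] >= MIN_IDENTITY and hit["evalue"] <= MAX_EVALUE


def summary_lines(hits, subject_go):
    """Build tab-delimited 'ID, GO:ID, aspect' lines for good alignments."""
    lines = []
    for hit in hits:
        if not is_good_alignment(hit):
            continue
        for go_id, aspect in subject_go.get(hit["subject_id"], []):
            lines.append("\t".join([hit["query_id"], go_id, aspect]))
    return lines


# Hypothetical hits against a protein that has GO annotation.
hits = [
    {"query_id": "EST_0001", "subject_id": "P52505",
     "percent_identity": 95.0, "evalue": 1e-50},
    {"query_id": "EST_0002", "subject_id": "P52505",
     "percent_identity": 40.0, "evalue": 1e-3},
]
subject_go = {"P52505": [("GO:0005759", "C"), ("GO:0005504", "F")]}

lines = summary_lines(hits, subject_go)
summary_text = "\n".join(lines) + "\n"
```

`summary_text` can then be written to a text file; queries that fail the cutoffs (here `EST_0002`) are the ones that need a literature search instead.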


Contact AgBase to request GO annotation of specific gene products.


GOSlimViewer: summarizing results


GOSlimViewer results


[Figure: GOSlimViewer summary of biological process categories: response to stimulus; amino acid and derivative metabolic process; transport; behavior; cell differentiation; metabolic process; regulation of biological process; cell communication; nucleobase, nucleoside, nucleotide and nucleic acid metabolic process; cell death; cell motility; macromolecule metabolic process; multicellular organismal development; catabolic process; biological_process]


“process unknown”, “function unknown”, “component unknown” ??


Looking at function, not genes

Pie graphs: relative proportions

[Figure: pie graphs for B-cells and stroma; labelled categories: immune response, apoptosis, cell-cell signaling]
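The relative proportions behind such pie graphs are simple counts per category divided by the total. A minimal sketch, assuming each gene product has already been mapped to one summary category; the counts for the two cell types are invented for illustration and are not results from the presentation.

```python
from collections import Counter


def proportions(categories):
    """Return the fraction of entries falling in each category."""
    counts = Counter(categories)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {term: n / total for term, n in counts.items()}


# Invented category assignments for two cell types.
b_cells = ["immune response"] * 5 + ["apoptosis"] * 3 + ["cell-cell signaling"] * 2
stroma = ["immune response"] * 1 + ["apoptosis"] * 2 + ["cell-cell signaling"] * 7

b_cell_props = proportions(b_cells)
stroma_props = proportions(stroma)
```

Comparing `b_cell_props` with `stroma_props` category by category gives the same picture as placing the two pie graphs side by side.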


GOModeler: quantitative, hypothesis-driven modeling.
Coming soon (contact AgBase)


McCarthy et al “AgBase: a functional genomics resource for agriculture.” BMC Genomics. 2006 Sep 8;7:229.


4. GO based tools for biological modeling


http://www.geneontology.org/


However…
• many of these tools do not support farm animal species
• the tools have different computing requirements
• it may be difficult to determine how up-to-date the GO annotations are…

Need to evaluate tools for your system.


Evaluating GO tools

Some criteria for evaluating GO Tools:
1. Does it include my species of interest (or do I have to “humanize” my list)?
2. What does it require to set up (computer usage/online)?
3. What was the source for the GO (primary or secondary) and when was it last updated?
4. Does it report the GO evidence codes (and is IEA included)?
5. Does it report which of my gene products has no GO?
6. Does it report both over/under represented GO groups and how does it evaluate this?
7. Does it allow me to add my own GO annotations?
8. Does it allow me to represent my results in a way that facilitates discovery?
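For criterion 6, one common way to evaluate over- or under-representation is a one-sided hypergeometric test; individual tools may use a different statistic, so this is an illustration of the idea rather than a description of any particular tool. The sketch needs Python 3.8 or later for `math.comb`, and the numbers in the example are invented.

```python
from math import comb


def hypergeom_pmf(k, N, K, n):
    """P(exactly k annotated genes in a list of n), given K of N annotated."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


def enrichment_p(k, N, K, n):
    """Over-representation: P(X >= k)."""
    return sum(hypergeom_pmf(i, N, K, n) for i in range(k, min(K, n) + 1))


def depletion_p(k, N, K, n):
    """Under-representation: P(X <= k)."""
    lowest = max(0, n - (N - K))  # smallest count that is possible at all
    return sum(hypergeom_pmf(i, N, K, n) for i in range(lowest, k + 1))


# Invented example: 1000 gene products on the array, 50 annotated to a
# GO group, 40 in the differentially expressed list, 8 of those annotated.
p_over = enrichment_p(8, 1000, 50, 40)
p_under = depletion_p(8, 1000, 50, 40)
```

Because many GO groups are tested at once, a tool should also state how it corrects for multiple testing; that correction is not shown here.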


5. Using GO for biological modeling


Using GO for biological modeling:
• hypothesis generating
• hypothesis driven
