Functional Genomics with an emphasis on YEAST
Genomics, Jef Boeke
November 2006
“Beer is living proof that God loves us and wants us to be happy” - Benjamin Franklin
Fermentation of mashed grains was probably considered a magical property of a properly cared-for vessel. We have been carrying around these vessels ever since and thus the cultivation of yeast has always been closely linked with human culture. It was not until Louis Pasteur's time that yeast was colony-purified. Saccharomyces cerevisiae was purified from European beers. Schizosaccharomyces pombe was purified from African millet beer and palm wine. Source: Charles Brenner web site, Dartmouth
What’s functional genomics?
• Inferring gene function from genome-wide screens, analyses, comparisons, etc.
• Today: focus on experimental (not purely bioinformatic) screens and profiling
• Mutational analyses - how to make/analyze mutants
• Transcript/Protein profiling
• Protein interactions
• Genetic interactions
• Compound interactions
• Complementation by metazoan orthologs
A two-part view of functional genomics
• Perturbing gene function
– Knocking down/out gene function: mutants and RNAi
– Overexpressing genes
– Adding compounds
• Analyzing phenotypes and interactions
– Growth and color; phenotypic reporters
– Transcript/protein profiling
– Protein interactions
– Genetic interactions
– Phenotype profiling
Saccharomyces cerevisiae life cycle
99% of yeast lab work is done with MATa and MATα haploids
Perturbing gene function genome wide
• Nonessential genes: systematic deletion/gene replacement strategies
• Overexpression
• Strategies for essential genes
– haploinsufficiency
– promoter shutoffs
– Ts mutants
• siRNA/shRNA
• Compound libraries (subject for another day)
The systematic yeast knockout project
Disrupted all ~6000 ORFs in the yeast genome – the YKO (yeast knock out) collection
Non-essential genes provided as:
• MATa mating type
• MATα mating type
• diploid heterozygous for each YKO
• diploid homozygous for each YKO
Essential genes (~15%):
• diploid heterozygous for each YKO
Distributes YKO mutants to the academic community for phenotypic analyses at a low cost
[Figure: YAL068c-YAL069w locus before replacement, and YAL068c-KanR after KanR replaces YAL069w]
Overall strategy: Gene Replacement by Homologous Recombination
YKO strain for YAL069w yal069::kanMX
One tag One Deletion Strain
List of 20mer tags (one per deletion strain):
1. GATTCGATAGCCGGCAAGG
2. CGATTTAGGAATGTCATAG
3. AGCTCATACCTAGTAACTA
...
6,200. AGCTCATACCTAGTAACTA
However, each deletion strain is assigned two tags to increase robustness of data
[Figure: wild-type YFG+ strain (YFGL, YFG+, YFGR) and mutant yfgΔ::kanMX strain (YFGL, UPTAG, kanMX, DOWNTAG, YFGR)]
Detail of UPTAG and DOWNTAG structure
The yeast knockouts (= “YKOs”) are tagged for array detection
Application of the YKO mutant set: manipulating populations of mutants
[Figure: unselected population → selection imposed (genetic or environmental) → selected population; one mutant is marked X]
How this applies to humans
[Figure: people before stress, stressed people, and survivors. One individual is marked X: “What genetic/environmental interaction did this guy in?” The survivors are each labeled “winner!”]
Scanned image of a TAG array or “function chip”: Control cell TAGs red; experimental cell TAGs green
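The red/green readout above can be turned into numbers. The following is a minimal sketch, assuming each strain has a control (red) and an experimental (green) intensity for each of its two tags, and that the two tags are simply averaged. The strain names, the intensities, and the 2-fold cutoff are illustrative assumptions, not values from the YKO project.

```python
import math

def log2_ratio(control, experimental, floor=1.0):
    """log2(control / experimental); positive means the strain was depleted.

    `floor` (an assumed value) keeps very dim spots from producing log(0).
    """
    return math.log2(max(control, floor) / max(experimental, floor))

def depleted_strains(intensities, cutoff=1.0):
    """Return {strain: mean log2 ratio} for strains at or above `cutoff`.

    `intensities` maps a strain name to a list of (control, experimental)
    pairs, one pair per tag (UPTAG and DOWNTAG).
    """
    hits = {}
    for strain, pairs in intensities.items():
        ratios = [log2_ratio(c, e) for c, e in pairs]
        mean_ratio = sum(ratios) / len(ratios)
        if mean_ratio >= cutoff:
            hits[strain] = mean_ratio
    return hits

# Made-up intensities: (control, experimental) for UPTAG and DOWNTAG.
example = {
    "yfg1": [(2000.0, 250.0), (1600.0, 200.0)],    # 8-fold lower after selection
    "yfg2": [(1500.0, 1500.0), (1200.0, 1100.0)],  # essentially unchanged
}
print(depleted_strains(example))
```

Averaging the two tags is one simple way to use the redundancy that the paired UPTAG/DOWNTAG design provides; other summaries (for example, requiring both tags to agree) would be equally reasonable.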
Essential genes
• Knockouts - limited to haploinsufficiency analysis
• Promoter shutoff approach
• DAmP alleles: Schuldiner et al. 2005 Cell 123:507-19
• Temperature sensitive mutants; now done for about 300 genes
• Yeast has ~1000 essential genes; genome-wide reagents are needed
Tet promoter alleles
• Tet-off system
• Promoter is substituted for native promoter
• About half of these “behave”
• Mnaimneh et al. 2004 Cell 118:31-44
Ts alleles
• The classic genetic approach
• Systematic PCR-based methods worked out; these incorporate barcodes compatible with KO mutants
• Ts mutant alleles collected for ~500 of 1000 genes (C. Boone et al. unpublished)
• Technology developed for systematic Ts mutant generation with barcodes (Hieter/Boeke labs)
• Status: about 300 genes finished, remainder underway
Knockdown of gene function in metazoans - focus on mammals
• Knockdown ≠ Knockout!
• Gene traps
• Transposons and retrotransposons
• RNAi etc.
• Zinc finger nuclease
Genetraps 101
• In mammals, biggest target is introns
• Typical gene-traps are designed to prematurely truncate mRNAs; SA = splice acceptor; pA = polyadenylation signal; in theory, 1/2 chance it will truncate gene
• They can incorporate a reporter
• Can be delivered by random transfection or by a retrovirus
• Retroviruses tend to show hotspots; limits usefulness
Gene-trap libraries
• There are now multiple gene trap libraries in ES cells
• You can order the ES cell line and differentiate it or …
• Better yet… turn it into a knockout mouse
• http://www.genetrap.org/
• Exercise: pick 3 genes and see if you can find existing ES cell line(s) for the major collections: Sanger Gene trap consortium; German Gene trap consortium. Then figure out where in your gene it inserted and the structure of the “mutant” transcript
Experimental Hematology, Volume 33, Pages 845-856; A. Forrai, L. Robb
Transposon/retrotransposon gene traps
• DNA transposons “Sleeping Beauty” and “PiggyBac” engineered to knock out genes, find promoters and more in mammalian systems
• 2 component system: “Transposase” expression cassette and mini-Transposon
• Retrotransposons L1 and ORFeus can be used similarly; 1-component system
• Can incorporate gene traps; greater randomness may allow better genomic coverage; also, can do mutagenesis in whole animal, bypassing the time-consuming ES cell derivation process
RFP labeled gene-trap in DNA transposon in vivo
Red mice carry piggyBac::RFP genetraps; simple screen for hops into genes. Ding et al. Cell 2005; 122:473-83
Genome-wide RNAi methods
•Basic strategy – apply double-stranded RNA of YFG to cells or organism of choice – you (usually) eliminate or reduce transcripts of YFG
•Very easy in C. elegans, which EATS E. coli – engineer E. coli to make dsRNA (plasmid vector with two convergent T7 promoters). Fraser et al. Nature 408:325
•Also works with mammalian cells – provided the dsRNA is supplied as ~21-22 bp fragments so as to avoid provoking a non-specific interferon response. Effect is transient (a few days) = “siRNA”
•Newer version is shRNA (short hairpin RNA), which can be delivered and expressed via a DNA (plasmid or integrated lentivirus) vector and thus provoke a longer-term effect
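One way to picture the ~21 bp constraint is to enumerate every 21-nt window of a target transcript as a candidate siRNA. This is only an illustrative sketch: the GC-content window (30-55%) and the example sequence are assumptions made here, and real siRNA design uses additional rules that these slides do not cover.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a DNA-alphabet sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)

def candidate_sirnas(mrna, length=21, gc_min=0.30, gc_max=0.55):
    """List (start, window) for every `length`-nt window passing the GC filter.

    The GC bounds are assumed values for illustration only.
    """
    mrna = mrna.upper().replace("U", "T")  # accept RNA or DNA spelling
    candidates = []
    for start in range(len(mrna) - length + 1):
        window = mrna[start:start + length]
        if gc_min <= gc_fraction(window) <= gc_max:
            candidates.append((start, window))
    return candidates

# Hypothetical 25-nt stretch of YFG transcript.
yfg_fragment = "ATGC" * 6 + "A"
for start, window in candidate_sirnas(yfg_fragment):
    print(start, window)
```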
RNA interference (RNAi)
[Figure: RNAi pathway showing Dicer, Argonaute (Slicer), and gene silencing]
shRNAs provide a means for genome-wide “mutagenesis”
[Figure: plasmid or lentiviral vector with an snRNA promoter driving the shRNA coding region; Dicer action converts the shRNA to siRNA; RISC/slicer action on the polyadenylated (AAAAA) target mRNA]
RNAi libraries as “forward-genetic” tools
Resource: TRC (The RNAi Consortium) shRNA clones
# of clones (species): >100,000 clones planned (human, mouse & rat)
Vector: lentivirus, plasmid
Notes: nearly complete; U6 promoter; we have it at JHU (human and mouse) http://hitcores.bs.jhmi.edu/

Resource: Hannon/Elledge Consortium shRNA clones
# of clones (species): >100,000 clones planned (human, mouse & rat)
Vector: lentivirus, plasmid
Notes: molecular barcodes; has GFP; miRNA promoter/5’ and 3’ UTRs

Resource: Many companies, siRNAs
Notes: nonrenewable resource! More like a compound library. Works very well in TC cells. Can buy and use today (no viral packaging). Doesn’t work well in primary cells, in vivo (yet…)
Zn finger nuclease (ZFN) technology
Seminar this Friday at 4 PM from Sigma Aldrich, Darner Conference Room, BRB G07; technology invented at JHU SOPH! (Chandrasegaran)
Great site: http://bindr.gdcb.iastate.edu/ZiFiT/
Homework/lab assignment: design a zinc finger for YFG
Figure above shows how six different fingers collaborate to recognize two 9 bp sequences; nucleases (green) create double strand break (DSB). Figure to right A) shows how homologous recombination is used to repair DSBs in normal cells. B) shows how an engineered, cloned mutant version of the sequence (red) is recombined in at site of DSB created by ZFN
Analyzing phenotypes and interactions
• Growth and color; phenotypic reporters
• Transcript/Protein profiling
• Protein interactions
• Genetic interactions
• Phenotype microarrays
Yeast as tool - the awesome power of growth and color
• Yeast can grow
• Yeast can make colored colonies
• Yeast can make colored (or fluorescent) cells
Counterselectable markers
• URA3 - select for the gene on minimal medium lacking Ura
• Select against it on 5-FOA
• Wide use in reporter assays, etc.
• LYS2 - select for the gene on minimal medium lacking Lys
• Select against it on α-aminoadipate
• Used a bit less, as LYS2 is ~3.5 kb, URA3 is <1 kb
Most important colony color markers
• ADE2 (also ADE1)
• Mutant (ade2) colonies are red on low adenine media like YPD
• Mutants accumulate an intermediate “AIR”, which produces a red derivative
• SUP11/ade2-101 system and others
• MET15 (also MET2)
• Mutant (met15) colonies are black on media containing lead ions
• Mutant produces H2S which generates insoluble black lead sulfide
How to make/use reporters
• Choose your reporter: color, fluorescent, selectable, and/or counterselectable
• Fuse it to a “control element” - could be promoter, UTR, protein segment conferring instability, anything you think might be important
• Go wild screening for things that turn reporter on or off - it will help dissect the biology
[Figure: reporter construct: YFG promoter + GFP + generic 3’ UTR]
Copyright ©1999 by the National Academy of Sciences
Edskes, Herman K. et al. (1999) Proc. Natl. Acad. Sci. USA 96, 1498-1503
URE2/GFP
“prion domain”
A reporter for prion state, assayed microscopically
The “macro-array” approach of phenotyping large mutant collections
Ross-MacDonald et al. Nature 402:413
See also TRIPLES database for data on gene expression, localization, and insertion mutant phenotypes: http://ygac.med.yale.edu
Phenotypic microarrays (PMs)
www.biolog.com
~2000 different “conditions” (96 shown here)
NB - Instrument is available for use in our microarray facility
Transcript and protein profiling
•The idea: measure the abundance of mRNAs or proteins in cells
•Compare the abundance of gene products under different conditions
•Examine the pattern or profile of abundances
•Premise: Genes that are in similar pathways may be coregulated and thus may have related functions
•There are many profiling methods available and more under development
•These profiles have many other purposes; e.g. diagnostics
•Most commonly used method is microarray analysis in which transcript levels are assayed by hybridization; essentially a “reverse Northern” of the entire genome
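The coregulation premise above can be made concrete by correlating expression profiles across conditions. The sketch below uses Pearson correlation as one possible similarity measure; the gene names, the expression values, and the 0.9 threshold are made up for illustration and are not taken from any dataset cited in these slides.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length profiles.

    Assumes neither profile is constant (a constant profile would give
    a zero denominator).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def coregulated_pairs(profiles, threshold=0.9):
    """Return (gene1, gene2, r) for every pair with r >= threshold."""
    genes = sorted(profiles)
    pairs = []
    for i, g1 in enumerate(genes):
        for g2 in genes[i + 1:]:
            r = pearson(profiles[g1], profiles[g2])
            if r >= threshold:
                pairs.append((g1, g2, round(r, 3)))
    return pairs

# Hypothetical abundances across four conditions.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.1, 5.9, 8.0],
    "geneC": [4.0, 1.0, 3.5, 0.5],
}
print(coregulated_pairs(profiles))
```

Here only geneA and geneB track each other across conditions, so they would be the pair nominated as possibly sharing a pathway.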
Pharmacogenomics: Comparing the effects of drugs and mutants; 500 microarray experiments on different YKO mutants or drug treatments of wild-type cells
[Figure: clustered heat map. One axis (clustered profile index): drug treatments OR mutants in genes. Other axis (clustered transcript response index): transcripts of genes involved in the indicated process/pathway. Color scale from 10-fold repression to 10-fold induction (-10, -5, -2, 1, 2, 5, 10). Cluster labels include ergosterol, histone deacetylase, mating, MAPK signaling, ribosome/translation, tup1/ssn6, HU/MMS/rnr1, isw1/isw2, sir2/sir3, cell wall, cup5/vma8, mitochondria, PAU, RNR2,3,4, amino acid biosynthesis (AA), mitochondrial function, calcineurin/PKC, and S/C.]
T. Hughes et al Cell 102:109
Even more profiling approaches
Yeast promoter strength (transcriptional frequency) and transcript stability dissected: http://web.wi.mit.edu/young/expression/
SAGE (serial analysis of gene expression): http://genome-www.stanford.edu/Saccharomyces/SAGE/AdvancedQuery.html
Protein profiling via 2-D gel or other separation/mass spectrometry methods: Link et al. Nature Biotechnol 17:676
6000 promoter fusions to GFP; measure promoter strength rather than steady state RNA level: Dimster-Denk et al. J. Lipid Research 40:850
Genome wide search for enzyme functions: GST fusions to every ORF (Science 286:1153)
[Figure: GST-YFG fusion × 60 YFGs per pool]
Make 96 GST fusion protein pools
Run 96 enzymatic assays
Find positives
Deconvolute positive pools
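The pooling logic above can be sketched as follows. This is a hypothetical illustration: the round-robin pool assignment, the ORF names, and the two stand-in assay functions are assumptions, not the published protocol. The point is that pools are assayed first, and only members of positive pools are assayed individually.

```python
def make_pools(orfs, n_pools=96):
    """Deal ORFs into `n_pools` pools round-robin (about 60 per pool for 6000 ORFs)."""
    pools = [[] for _ in range(n_pools)]
    for i, orf in enumerate(orfs):
        pools[i % n_pools].append(orf)
    return pools

def deconvolute(pools, pool_assay, single_assay):
    """Assay each pool; re-assay individual members of positive pools only.

    Returns (hits, number_of_assays_run).
    """
    hits, n_assays = [], 0
    for pool in pools:
        n_assays += 1
        if pool_assay(pool):
            for orf in pool:
                n_assays += 1
                if single_assay(orf):
                    hits.append(orf)
    return hits, n_assays

# Hypothetical strain names and a made-up set of enzymatically active fusions.
orfs = [f"ORF{i:04d}" for i in range(6000)]
active = {"ORF0042", "ORF3137"}

pools = make_pools(orfs)
hits, n_assays = deconvolute(
    pools,
    pool_assay=lambda pool: any(orf in active for orf in pool),
    single_assay=lambda orf: orf in active,
)
print(sorted(hits), n_assays)
```

With two active fusions in different pools, this takes 96 pool assays plus roughly 125 individual assays, compared with 6000 if every fusion were tested alone.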
Or put all 6000 GST-His6 fusion proteins down as spots on chips… Zhu et al. Science 293:2101
Search for phospholipid or calmodulin binding proteins
Calmodulin binding motif
Two hybrid and mass spec analysis across the genome; cataloging protein-protein interactions
http://portal.curagen.com and Uetz et al. Nature 403:623; Ito et al. PNAS 98:4569; Ho et al. Nature 415:180; Gavin et al. Nature 415:141 and 440:631; Krogan et al. Nature 440:637
Concept: identify all possible pairwise protein/protein interactions through yeast two-hybrid assay
Implementation: cross 6000 DB domain fusions (representing all 6000 full-length ORFs) by 6000 AD domain fusions = 36 × 10^6 analyses
Problems: many false positives and negatives
A complementary approach: affinity pulldown of protein complexes, with complex members identified by mass spectrometry
The biggest yeast MS dataset
d, Graphical representation of the complexes. This Cytoscape/GenePro screenshot displays patterns of evolutionary conservation of complex subunits. Each pie chart represents an individual complex, its relative size indicating the number of proteins in the complex. The thicknesses of the 429 edges connecting complexes are proportional to the number of protein–protein interactions between connected nodes. Complexes lacking connections shown at the bottom of this figure have <2 interactions with any other complex. Sector colors (see panel f) indicate the proportion of subunits sharing significant sequence similarity to various taxonomic groups (see Methods). Insets provide views of two selected complexes (the kinetochore machinery and a previously uncharacterized, highly conserved fructose-1,6-bisphosphatase-degrading complex; see text for details), detailing specific interactions between proteins identified within the complex (purple borders) and with other proteins that interact with at least one member of the complex (blue borders). Colors indicate taxonomic similarity.
Krogan et al. 2006 Nature 440:637-643
Interaction databases
• BioGRID http://www.thebiogrid.org/
• DIP Database of Interacting Proteins dip.doe-mbi.ucla.edu
• BIND Biomolecular interaction database www.bind.ca (recently privatized)
• These databases warehouse interaction data, and represent small pieces of interaction webs to make them more digestible
The yeast protein and genetic interaction maps: Dissecting the hairball of interactions
Protein-protein interactions tend to define linear pathways or “series” circuits
A complementary approach is needed to identify parallel pathways, and branchpoints or “parallel” circuits in the network of life
Synthetic lethality: what is it and what does it tell us?
yfg1 mutant – viable
yfg2 mutant – viable
yfg1 yfg2 double mutant – inviable
If the nature of the yfg mutants is unknown, many possible interpretations… BUT, if they are both null alleles, simplest interpretation is they are in redundant, parallel, or branched pathways
[Figure: pathway diagram running from input to output through YFG1, YFG2, YFG3, YFG4, YFG5, YFG6, YFG7, and YFG8]
Thus, the patterns of lethality help deduce pathway architecture
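To see why lethality patterns reveal architecture, consider a toy model in which a cell is viable as long as some route connects input to output. The network below is a hypothetical two-branch example chosen for illustration; it is not the architecture drawn on the slide, and treating all deletions as nulls is an assumption.

```python
from itertools import combinations

# Hypothetical network: two parallel branches from input to output.
EDGES = {
    "input": ["YFG1", "YFG3"],
    "YFG1": ["YFG2"],
    "YFG2": ["output"],
    "YFG3": ["YFG4"],
    "YFG4": ["output"],
}

def viable(deleted):
    """True if 'output' is still reachable from 'input' without deleted genes."""
    stack, seen = ["input"], set()
    while stack:
        node = stack.pop()
        if node == "output":
            return True
        if node in seen or node in deleted:
            continue
        seen.add(node)
        stack.extend(EDGES.get(node, []))
    return False

def synthetic_lethal_pairs(genes):
    """Pairs where each single null is viable but the double null is not."""
    pairs = []
    for a, b in combinations(genes, 2):
        if viable({a}) and viable({b}) and not viable({a, b}):
            pairs.append((a, b))
    return pairs

GENES = ["YFG1", "YFG2", "YFG3", "YFG4"]
print(synthetic_lethal_pairs(GENES))
```

In this toy network every synthetic lethal pair spans the two branches and no pair falls within a single branch, which is the pattern that points to parallel pathways.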
Genetic interactions; two methods SGA, SLAM
• SGA: Synthetic Genetic Array
– Uses robots to make diploids, sporulate them, and select for double mutant haploids (approx 384 at one time)
– Some double mutants don’t grow or grow slowly
• SLAM: Synthetic lethality analyzed by microarray
– Uses pools or mixtures of mutants, “query gene” knocked out by transformation
– Readout is microarray/molecular barcodes or TAGs
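The slides do not say how "don't grow or grow slowly" is scored. One common convention, used here as an assumption rather than as the SGA or SLAM scoring method, compares the double-mutant fitness with the product of the two single-mutant fitnesses. The cutoffs and fitness values below are illustrative only.

```python
def interaction_score(w_a, w_b, w_ab):
    """Deviation of double-mutant fitness from the multiplicative expectation."""
    return w_ab - w_a * w_b

def classify(w_a, w_b, w_ab, lethal_below=0.1, tol=0.1):
    """Label a gene pair; `lethal_below` and `tol` are assumed cutoffs.

    Fitness values are relative to wild type (wild type = 1.0).
    """
    eps = interaction_score(w_a, w_b, w_ab)
    if eps < -tol:
        return "synthetic lethal" if w_ab < lethal_below else "synthetic sick"
    if eps > tol:
        return "alleviating"
    return "no interaction"

# Made-up relative fitness values.
print(classify(0.9, 0.9, 0.0))   # double mutant does not grow
print(classify(0.9, 0.8, 0.4))   # double mutant grows slowly
print(classify(0.9, 0.8, 0.72))  # double mutant grows as expected
```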
Genetic analysis of YKOs as a population: SLAM
[Figure: control pool and experimental pool (genetic perturbation or physiological stress) → genomic DNA → PCR of the uptags and downtags flanking kanMX4 → Cy5 and Cy3 labeling → tag-array hybridization]
A role of the CTF18m module in DNA replication checkpoint signaling
[Figure: growth of wt, rad9, ctf8, ctf8 rad9, dcc1, dcc1 rad9, tof1, and tof1 rad9 strains on control, HU, and MMS plates]
[Figure: checkpoint pathway diagram. DNA damaging assaults; Mec1/Ddc2 with Drc1, Dpb11, Pol, Rfc2,5, Mrc1, Tof1/Csm3; Mec1/Ddc2 with Rad24, Ddc1, Mec3, Rad17, Rad9; Rad53, Chk1, Dun1; cell cycle arrest & DNA repair. Legend: SFL interaction; physical interaction]
[Figure: interaction network of RAD9, SOD1, TSA1, CDC9, RAD27, POL32, ELG1, RNR4, TOF1, CSM3, CTF18, CTF8, DCC1, MRC1, DUN1, DDC1, MEC3, RAD17, RAD24, and CHK1, with modules labeled DRC signaling, Ctf18/Ctf8/Dcc1, oxidative stress response, DNA replication, DDC, and DRC]
Pan et al Cell 2006 124:1069-81
So much info, so little QC
• Every method produces false negatives and false positives
• All of the methods seem to work well with “knowns” but work much less well with unknown genes
• Reasons may include functional redundancy, complex or multiple functions, or functions not evident under lab conditions
• Combinatorial informatic approaches need weighting to help evaluate strength of “links” between genes. Also, any single set of gene “links” is incomplete
• A better success rate at functional prediction requires fewer links of low quality and more links of high quality
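One simple way to weight links, shown purely as a sketch, is a noisy-OR combination: each method gets a reliability (the chance that a link it reports is real), and a link supported by several independent methods scores higher than a link seen only once. The reliabilities, method names, and gene pairs below are invented for illustration, and independence between methods is an assumption.

```python
def combined_confidence(methods, reliability):
    """Noisy-OR score: 1 - product of (1 - reliability) over supporting methods."""
    p_all_wrong = 1.0
    for method in methods:
        p_all_wrong *= 1.0 - reliability[method]
    return 1.0 - p_all_wrong

# Made-up reliabilities; real values would have to be calibrated on known links.
RELIABILITY = {
    "two_hybrid": 0.3,
    "affinity_ms": 0.5,
    "synthetic_lethal": 0.4,
    "coexpression": 0.2,
}

# Hypothetical links and the methods that reported them.
links = {
    ("YFG1", "YFG2"): ["two_hybrid"],
    ("YFG3", "YFG4"): ["two_hybrid", "affinity_ms", "coexpression"],
}

ranked = sorted(
    links,
    key=lambda pair: combined_confidence(links[pair], RELIABILITY),
    reverse=True,
)
for pair in ranked:
    print(pair, round(combined_confidence(links[pair], RELIABILITY), 2))
```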