Like Illumina, but immobilized templates are SS DNA molecules (~200 nt)

1

Like Illumina, but the immobilized templates are single-stranded (SS) DNA molecules (~200 nt). Each cycle adds one base, records the image, and then cleaves the fluorescent group and washes it away. Several billion single-molecule “spots” per slide.

http://www.helicosbio.com/Technology/TrueSingleMoleculeSequencing/tabid/64/Default.aspx

Last updated 10/26/11

2

3

Helicos paired end sequencing

4

Helicos virtual terminator

Inhibits DNA Pol once incorporated (so 1 base at a time)

Cleavable via the S-S bond (reduce it)

dUTP dU-3’P,5’P

Free 3’ OH never blocked

Fluorescent tag

5

Quantification of the yeast transcriptome by single-molecule sequencing

Lipson et al. NATURE BIOTECHNOLOGY 27: 652, 2009

Make cDNA via oligo dT

Tail 3’ end with A via terminal transferase, adding dT to terminate

Hybridize to surface-linked oligo dTs

Add Cy5-labeled special nucleotide tri-Ps + DNA Pol. Wash. Record image.

Cleave dye from incorporated nt. Wash.

Add next Cy5-labeled special nucleotide tri-Ps (A) + DNA Pol. Wash. Record image.

Note: no amplifications or ligations
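
The add/wash/image/cleave cycle can be sketched as a toy simulation of one template molecule. This is only an illustration: the flow order "CTAG", perfect incorporation efficiency, and the function name are assumptions, not details from the slides.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def sequence_by_synthesis(template, flow_order="CTAG", max_cycles=200):
    """Toy model of one-base-per-cycle extension on a single template.

    template   -- bases in the order the polymerase meets them
    flow_order -- hypothetical order in which labeled nucleotides are offered
    Returns (read, cycles_used).
    """
    read = []
    pos = 0
    cycles_used = 0
    while pos < len(template) and cycles_used < max_cycles:
        offered = flow_order[cycles_used % len(flow_order)]
        cycles_used += 1
        if COMPLEMENT[template[pos]] == offered:
            # A signal is recorded for this spot; the terminator limits
            # extension to a single base even inside a homopolymer run.
            read.append(offered)
            pos += 1
        # The dye is then cleaved and washed away before the next cycle.
    return "".join(read), cycles_used

print(sequence_by_synthesis("GATTC"))  # ('CTAAG', 8)
```

Because only one base is added per cycle, the two Ts in the template are read in two separate A cycles rather than as one stronger signal.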

6

smsDGE = digital gene expression via Helicos sequencing and counting

MA = microarray data

7

QPCR = quantitative PCR, real time PCR

Exponential phase

CT value

Non-exponential plateau phase

Threshold line

Bio-rad
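
As a minimal sketch of how a CT value is read off an amplification curve, the function below finds the fractional cycle at which fluorescence first crosses the threshold line. The linear interpolation between cycles and the example numbers are simplifying assumptions; instrument software typically works on log-scaled, baseline-corrected data.

```python
def ct_value(fluorescence, threshold):
    """Return the fractional cycle where the curve first crosses threshold.

    fluorescence[0] is the reading after cycle 1, fluorescence[1] after
    cycle 2, and so on. Returns None if the threshold is never crossed.
    """
    for i in range(1, len(fluorescence)):
        lo, hi = fluorescence[i - 1], fluorescence[i]
        if lo < threshold <= hi:
            # lo belongs to cycle i, hi to cycle i + 1
            return i + (threshold - lo) / (hi - lo)
    return None

# Hypothetical curve that doubles every cycle (exponential phase only)
curve = [1, 2, 4, 8, 16, 32]
print(ct_value(curve, 12))  # 4.5
```

A sample with more starting template crosses the threshold earlier, so a lower CT means a higher initial amount.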

8

QPCR (Quantitative PCR)

Q-RT-PCR (Quantitative reverse transcription-PCR)

Run 96 samples simultaneously

9

Some data produced: Distribution of yeast transcripts

TSS = transcription start site

t.p.m. = transcripts per million

Est. copies/cell: 0.5 5 50 500

mRNA

TSS position relative to ATG
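
The t.p.m. unit can be illustrated with a short calculation. This sketch assumes each sequenced read counts as one transcript molecule, so no length normalization is applied; the gene names and read counts are hypothetical.

```python
def transcripts_per_million(read_counts):
    """Scale per-gene read counts so that they sum to one million."""
    total = sum(read_counts.values())
    return {gene: 1e6 * n / total for gene, n in read_counts.items()}

# Hypothetical counts for three genes
counts = {"geneA": 500, "geneB": 1500, "geneC": 8000}
for gene, tpm in transcripts_per_million(counts).items():
    print(gene, tpm)
```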

10

(DNA nanoballs)

AcuI: a type IIS restriction enzyme

Complete Genomics

RCR = rolling circle replication

11

Rolling circle DNA synthesis (Φ29 polymerase)

12

Complete Genomics

13

Probes degenerate at all but one position, colored for the base at that position.

5 probe sets for positions +1 to +5 relative to anchor end

Hybridize, wash, ligate, wash, image.

Second anchor set extends 5 nt (degenerate reach). Repeat: 10 nt sequenced.

Repeat with anchors on the other side of the adaptor.

Repeat for the other 3 adaptors.

Total 70 nt sequenced (theor. = 80)

Complete Genomics cPAL (combinatorial probe-anchor ligation)
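
The read-length arithmetic on this slide can be sketched as follows. The function is a toy model in which every probe set reports its position without error; its name and the example flank are assumptions for illustration.

```python
ADAPTORS = 4
SIDES_PER_ADAPTOR = 2
ANCHOR_SETS = 2            # plain anchor, then anchor extended by 5 degenerate nt
POSITIONS_PER_ANCHOR = 5   # probe sets for positions +1 to +5

def cpal_read(flank):
    """Return the bases read next to one anchor end (nearest base first)."""
    calls = []
    for anchor in range(ANCHOR_SETS):
        offset = anchor * POSITIONS_PER_ANCHOR
        for pos in range(1, POSITIONS_PER_ANCHOR + 1):
            # One hybridize/ligate/image round per position: the probe
            # colour reports the base at that single position.
            calls.append(flank[offset + pos - 1])
    return "".join(calls)

theoretical = ADAPTORS * SIDES_PER_ADAPTOR * ANCHOR_SETS * POSITIONS_PER_ANCHOR
print(cpal_read("ACGTACGTACGT"))  # ACGTACGTAC (10 nt per anchor side)
print(theoretical)                # 80
```

The slide reports 70 nt actually sequenced against this theoretical 80.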

14

Complete Genomics

Est. 1 billion spots (reads) per slide

Lower cost

200 human genomes sequenced

Business plan: sell sequencing service, not machines

15

http://www.pacificbiosciences.com

16

10 zl volume seen

(1 zeptoliter = 10^-21 L)

ZMW = zero mode waveguide

One DNA Pol molecule per ZMW

Add template and special phospho nucleotides.
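
A rough calculation shows why such a small observation volume matters. The 1 µM nucleotide concentration below is an assumed value chosen for illustration, not a figure from the slides.

```python
AVOGADRO = 6.02214076e23  # molecules per mole

def molecules_in_volume(conc_molar, volume_litres):
    """Average number of molecules present in a volume at a given molarity."""
    return conc_molar * volume_litres * AVOGADRO

zmw_volume = 10e-21        # 10 zL, the volume seen in one ZMW
assumed_conc = 1e-6        # 1 micromolar labeled nucleotide (assumption)
print(molecules_in_volume(assumed_conc, zmw_volume))  # about 0.006
```

Under that assumption a free labeled nucleotide is inside the observed volume well under 1% of the time, so background from unincorporated nucleotides stays low.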

17

Phospho-linked fluorescently-labeled nucleoside triphosphates

Other technologies

Cleaved when incorporated

18

Excitation Emission

19

20

Use a circular template to get redundant reads and so more accuracy.
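
A minimal sketch of how redundant passes improve accuracy is a per-position majority vote. It assumes the passes are already aligned and of equal length with substitution errors only, which is a simplification; the example reads are hypothetical.

```python
from collections import Counter

def consensus(passes):
    """Majority vote at each position across aligned, equal-length passes."""
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*passes)
    )

# Three hypothetical passes over the same circular insert, each with one error
passes = ["ACGTTA", "ACCTTA", "ACGTAA"]
print(consensus(passes))  # ACGTTA
```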

21

Pacific Biosciences

• 50,000 ZMWs (Aug., 2011), and density may climb

• Long reads (e.g., full molecule analysis for splicing isoform)

• Direct RNA sequencing possible.

• DNA methylation detectable

22

DNA methylation detection by bisulfite conversion

23

Agilent SureSelect RNA Target Enrichment

Capture a subgenomic region of interest for economy and speed of sequencing:

E.g.,

the entire exome (all exons w/o introns or intergenic regions)

hundreds of cancer genes

a particular genomic locus

Alternative: hybridize to a custom microarray.

Agilent

24

Applications of “deep” sequencing

Also: definition and discovery of cis-acting regulatory motifs in DNA and RNA

25

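Unmethylated strand: ----CpG---- -> (Na bisulfite, heat: cytosine -> uracil) -> ----UpG---- -> (PCR) -> ----TpG---- paired with ----ApC----

Methylated strand: ----CmpG---- -> (Na bisulfite, heat: no change) -> ----CmpG---- -> (PCR) -> ----CpG---- paired with ----GpC----

(In the original DNA a methylated site carries Cm on both strands: ----CmpG---- paired with ----GpCm----.)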

All NON-methylated Cs changed to T

Detection of methylated C (~all in CpG dinucleotides)

DS DNA
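
The conversion logic can be sketched in silico. This is a simplified model of one strand with complete conversion; the function names, example sequence and methylated position are hypothetical.

```python
def bisulfite_convert(seq, methylated_positions=frozenset()):
    """Sequence read after bisulfite treatment and PCR.

    Every C becomes T (via U) unless its index is in methylated_positions.
    """
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq)
    )

def call_methylation(reference, converted):
    """Indices where the reference has C and the converted read still has C."""
    return [
        i for i, (ref, conv) in enumerate(zip(reference, converted))
        if ref == "C" and conv == "C"
    ]

reference = "ACGTCGAC"
converted = bisulfite_convert(reference, methylated_positions={4})
print(converted)                               # ATGTCGAT
print(call_methylation(reference, converted))  # [4]
```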

26

DEEP SEQUENCING (Next generation sequencing, High throughput sequencing, Massively parallel sequencing) applications:

Human genome re-sequencing (mutations, SNPs, haplotypes, disease associations, personalized medicine)

Tumor genome sequencing

Microbial flora sequencing (microbiome)

Metagenomic sequencing (without cell culturing)

RNA sequencing (RNAseq; gene expression levels, miRNAs, lncRNAs, splicing isoforms)

Chromatin structure (ChIP-seq; histone modifications, nucleosome positioning)

Epigenetic modifications (DNA CpG methylation and hydroxymethylation)

Transcription kinetics (GROseq; nascent RNA, pulse labeled RNA)

High throughput genetics (QUEPASA; cis-acting regulatory motif discovery)

Drug discovery (bar-coded organic molecule libraries)

27

Ke et al. and Chasin, Quantitative evaluation of all hexamers as exonic splicing elements. Genome Res. 2011. 21: 1360-1374.

Order an equal mixture of all 4 bases at these 6 positions
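
An equal mixture of 4 bases at 6 positions gives 4^6 = 4096 possible hexamers, which can be enumerated directly. This is a small sketch using only the Python standard library.

```python
from itertools import product

# Every possible sequence at the 6 randomized positions
hexamers = ["".join(bases) for bases in product("ACGT", repeat=6)]

print(len(hexamers))            # 4096
print(hexamers[0], hexamers[-1])  # AAAAAA TTTTTT
```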

28

29

Rank   6-mer    ESRseq score (~ -1 to +1)
1      AGAAGA    1.0339
2      GAAGAT    0.9918
3      GACGTC    0.9836
4      GAAGAC    0.9642
5      TCGTCG    0.9517
6      TGAAGA    0.9434
7      CAAGAA    0.9219
8      CGTCGA    0.8853
:      :
4086   TAGATA   -0.8609
4087   AGGTAG   -0.8713
4088   CGTCGC    0.8850
4089   CTTAAA   -0.8786
4090   CCTTTA   -0.8812
4091   GCAAGA    0.8911
4092   TAGTTA   -0.8933
4093   TCGCCG    0.9113
4094   CCAGCA   -0.8942
4093   CTAGTA   -0.9251
4094   TAGTAG   -0.9383
4095   TAGGTA   -0.9965
4096   CTTTTA   -1.0610

Best exonic splicing enhancers

Worst exonic splicing enhancers = best exonic splicing silencers

30

Composite exon (from ~100,000)

Constitutive exons

Alternative exons

Pseudo exons

31

Experiment: 1 1 1 2 2 1+2 2 2 1 2

Sequence of 36                         Quality code
CGCACTGTGCTGGAGCTCCCGGGGTTAACTCTAGAA   abU^Vaa`a\aaa]aWaTNZ`aa`Q][TE[UaP_U]
TACACTGTGCTGGAGCTCCCAACGGCAACTCTAGAA   a`P^Wa`[`Wa^`X_X_XWVa^NSP]_]S^X_T\X^
CGCACTGTGCTGGAGCTCCCATGGAGAACTCTAGAA   aTa`^b``baaaa^aab^YaTQLOHIa`^a``TX]]
TACACTGTGCTGGAGCTCCCCTCCCAAACTCTAGAA   I_`aaaa`aaaaaaa_a_^[KZIGIGZ`U`\^P^^`
CGCACTGTGCTGGAGCTCCCAATAGTAACTTTAGAA   aY_\abb[T\abaaa`a`bZ[HXXIZa_`_LGMS[`
TATACTGTGCTGGAGCTCCCGACGTAAACTCTAGAA   aba]^aa_a]`aa]_]`XWSMFGGIPX[P]X`V_Y^
TACACTGTGCTGGAGCTCCCTGGTAAAACTCTAGAA   a_^a^aa`aYaaa_aY`Y_^[I]VY\`]V]R\W]VV
TACACTGTGCTGGAGCTCCCAATAAAAACTCTAGAA   XZababa`aZaaaaaYaYXX`baa``\\TaUa\aW`

2 nt barcode (TA or CG)

Constant regions (peculiar to our expt.)

Variable region
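
The read layout (2 nt barcode, constant region, variable region, constant region) can be parsed with simple slicing. In this sketch the region boundaries (2 + 18 + 6 + 10 nt) and the constant sequences are inferred from the reads shown on the slide, and the function names are hypothetical.

```python
CONST_5P = "CACTGTGCTGGAGCTCCC"  # 18 nt following the barcode (inferred)
CONST_3P = "AACTCTAGAA"          # final 10 nt (inferred)

def mismatches(a, b):
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

def parse_read(read):
    """Split a 36 nt read into (barcode, variable 6-mer, constant-region errors)."""
    barcode = read[:2]
    variable = read[20:26]
    errors = mismatches(read[2:20], CONST_5P) + mismatches(read[26:], CONST_3P)
    return barcode, variable, errors

reads = [
    "CGCACTGTGCTGGAGCTCCCGGGGTTAACTCTAGAA",
    "CGCACTGTGCTGGAGCTCCCAATAGTAACTTTAGAA",
    "TATACTGTGCTGGAGCTCCCGACGTAAACTCTAGAA",
]
for read in reads:
    print(parse_read(read))
```

Counting errors in the constant regions gives a simple filter: the second and third reads above each carry one mismatch and could be kept or discarded depending on the tolerance chosen.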

32

Next generation method:

Use custom oligo libraries to construct minigene libraries (40,000, up to 60 nt long):

E.g., for saturation mutagenesis to identify all exonic bases contributing to splicing (or transcription or polyadenylation, …)

Use bar codes to detect sequences missing from the selected molecules

E.g., Nat Biotechnol. 2009 27:1173-5. High-resolution analysis of DNA regulatory elements by synthetic saturation mutagenesis. Patwardhan RP, Lee C, Litvin O, Young DL, Pe'er D, Shendure J.

Long (200-mer) synthetic oligo library

33

OUTLINE OF NEXT LECTURE TOPICS

Expression and manipulation of transgenes in the laboratory

• In vitro mutagenesis to isolate variants of your protein/gene with desirable properties
– Single base mutations
– Deletions
– Overlap extension PCR
– Cassette mutagenesis

• To study the protein: Express your transgene
– Usually in E. coli, for speed, economy
– Expression in eukaryotic hosts
– Drive it with a promoter/enhancer
– Purify it via a protein tag
– Cleave it to get the pure protein

• Explore protein-protein interaction
– Co-immunoprecipitation (co-IP) from extracts
– 2-hybrid formation
– Surface plasmon resonance
– FRET (Fluorescence resonance energy transfer)
– Complementation readout

34

PCR fragment

Subsequent cloning in a plasmid

Cut with RE 1 and 2

Ligate into similarly cut vector

RS1 RS2

RS1 RS2

Site-directed mutagenesis by overlap extension PCR

1 2

35

Original sequence coding for, e.g., a transcription enhancer region

Cassette mutagenesis = random mutagenesis but in a limited region:

1) by error-prone PCR

------*--------*--*-**---------------*-----------*--*-------*------------------------*-*-*------------*------------*--

----------------------------------------------------------------------------------------------------------------------

Cut in primer sites and clone upstream of a reporter protein sequence.

Pick colonies

Analyze phenotypes

Sequence

PCR fragment with high Taq polymerase and Mn2+ instead of Mg2+ to introduce errors (*)

36

Original enhancer sequence

-*------------------------*-*-*------------*------------*--------*--------*--*-**---------------*-----------*--*------

----------------------------------------------------------------------------------------------------------------------

Buy 2 doped oligos; anneal

OK for up to ~80 nt.

Clone upstream of a reporter.

Doping = e.g., 90% G, 3.3% A, 3.3% C, 3.3% T at each position (where G is the original base)

Pick colonies

Analyze phenotypes

Sequence

Cassette mutagenesis = random mutagenesis but in a limited region:

2) by “doped” synthesis

Target = e.g., an enhancer element
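
Doped synthesis can be sketched as a per-position random draw: keep the original base 90% of the time, otherwise pick one of the other three bases with equal probability (about 3.3% each). The function names, the random seed and the example target are assumptions for illustration.

```python
import random

def doped_oligo(wild_type, wt_fraction=0.90, rng=random):
    """Return one oligo synthesized with doping at every position."""
    out = []
    for wt_base in wild_type:
        if rng.random() < wt_fraction:
            out.append(wt_base)
        else:
            # Remaining probability is split equally among the other 3 bases
            out.append(rng.choice([b for b in "ACGT" if b != wt_base]))
    return "".join(out)

def expected_mutations(length, wt_fraction=0.90):
    """Average number of changed positions per oligo."""
    return length * (1 - wt_fraction)

def fraction_unmutated(length, wt_fraction=0.90):
    """Fraction of oligos expected to carry no change at all."""
    return wt_fraction ** length

target = "GATTACAGATTACAGATTAC"          # hypothetical 20 nt element
print(doped_oligo(target, rng=random.Random(0)))
print(expected_mutations(80))            # about 8 changes in an 80-mer
print(fraction_unmutated(80))            # very few 80-mers stay wild type
```

The last two numbers show the trade-off behind the doping level: at 90% original base an 80-mer carries about 8 changes on average, so a lower doping rate would be needed if mostly single mutants were wanted.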

37

Got this far

38

E. coli as a host

• PROs: Easy, flexible, high tech, fast, cheap; but problems

• CONs
– Folding (can misfold)
– Sorting -> can form inclusion bodies
– Purification -- endotoxins
– Modification -- not done (glycosylation, phosphorylation, etc.)

• Modifications:
– Glycoproteins
– Acylation: acetylation, myristoylation
– Methylation (arg, lys)
– Phosphorylation (ser, thr, tyr)
– Sulfation (tyr)
– Prenylation (farnesyl, geranylgeranyl on cys)
– Vitamin C-Dependent Modifications (hydroxylation of proline and lysine)
– Vitamin K-Dependent Modifications (gamma carboxylation of glu)
– Selenoproteins (seleno-cys tRNA at UGA stop)

39

Some alternative hosts

• Yeasts (Saccharomyces, Pichia)
• Insect cells with baculovirus vectors
• Mammalian cells in culture (later)
• Whole organisms (mice, goats, corn)

(not discussed)

• In vitro (cell-free), for analysis only (good for radiolabeled proteins)

40

Yeast Expression Vector (example)

2 micron plasmid

2 mu seq: yeast ori
oriE = bacterial ori
Ampr = bacterial selection
LEU2, e.g. = Leu biosynthesis for yeast selection

Saccharomyces cerevisiae (baker’s yeast)

oriE

Your favorite gene (Yfg)

LEU2

Ampr

GAPD term

GAPD prom

Complementation of an auxotrophy can be used instead of drug-resistance

Auxotrophy = state of a mutant in a biosynthetic pathway resulting in a requirement for a nutrient

GAPD = the enzyme glyceraldehyde-3-phosphate dehydrogenase

41

Genomic DNA

HIS4 mutation-

Yeast - genomic integration via homologous recombination

HIS4

gfY

pt Vector DNA

Functional HIS4 gene

Defective HIS4 gene

Yfg

tp

Genomic DNA

42

Double recombination Yeast (integration in Pichia pastoris)

AOX1 gene (~ 30% of total protein)

Genomic DNA

AOX1p

Yfg

AOX1t HIS4 3’AOX1

Genomic DNA

HIS4

Yfg

AOX1p

AOX1t

3’AOX1

Vector DNA

P. pastoris
- tight control
- methanol induced (AOX1)
- large scale production (gram quantities)

Alcohol oxidase gene

43

BD = (DNA) binding domain AD = activation domain

PROTEIN-PROTEIN INTERACTIONS

Yeast 2-hybrid system to discover proteins that interact with each other

Or to test for interaction based on a hypothesis for a specific protein.

?

http://www.mblab.gla.ac.uk/~maria/Y2H/Y2H.html

(bait)

(prey)

Y = e.g., a candidate protein being tested for possible interaction with X

Or: Y = e.g., a cDNA library used to discover a protein that interacts with X

?

44

Y = e.g., a cDNA library used to discover a protein that interacts with X

Recover the Y sequence from reporter+ colonies by PCR to identify protein Y

No interaction between X and Y: no reporter expression

Yes, interaction between X and Y: reporter protein is expressed:

45

Fusion library

Two different assays help, as there are often many false positives.

http://www.mblab.gla.ac.uk/~maria/Y2H/Y2H.html

=“prey”

Bait protein is the known target protein for which partners are sought

BD = DNA binding domain; TA = transactivating domain

and/or

46

3-HYBRID: select for protein domains that bind a particular RNA sequence

Bait

Prey

Prey could be proteins from a cDNA library

47

Yeast one-hybrid:

Insert a DNA sequence upstream of the selectable or reporter

Transform with candidate DNA-binding proteins (e.g., cDNA library)fused to an activator domain.

Each T = one copy of a DNA target sequence

48

Directed Evolution of a Glycosynthase via Chemical Complementation

Hening Lin, Haiyan Tao, and Virginia W. Cornish. J. AM. CHEM. SOC. 2004, 126, 15051-15059

Turning a glycosidase into a glyco-synthase

Glycosidase: Glucose-Glucose (e.g., maltose) + H2O -> 2 Glucose

Indirect selection using a yeast 3-hybrid system: a more efficient glycosynthase enzyme

49

Indirect selection using the yeast 3-hybrid system (one of the hybrid molecules here is a small molecule)

e.g., from a mutated library of enzyme glycosynthase genes

glucose

DHFR = dihydrofolate reductase
GR = glucocorticoid receptor (transcription factor)
MTX = methotrexate (enzyme inhibitor of DHFR)
DEX = dexamethasone, a glucocorticoid agonist, binds to GR
AD = activation domain, DBD = DNA binding domain

Leu2 gene

Transform a yeast leucine auxotroph. Provide synthetic chimeric substrate molecules. Select in leucine-free medium.

50

URA-3 (toxic)

Library of cellulase mutant genes (one per cell)

x x x x

cellulase

Survivors are enriched for cellulase genes that will cleave cellulose with greater efficiency (kcat / Km)

Yeast cell

Directed Evolution of Cellulases via Chemical Complementation. P. Peralta-Yahya, B. T. Carter, H. Lin, H. Tao, V. W. Cornish. JACS 2008, 130, 17446–17452

Selection of improved cellulases via the yeast 2-hybrid system

Cellobiose (disaccharide)

51

Substrate

52

Pathway to pyrimidine nucleotides:

URA-3 = gene for orotidine phosphate (OMP) decarboxylase

5-fluoroorotic acid

5-Fluoro-OMP

5-Fluoro-UMP

RNA

URA-3 decarboxylation (pyr-4)

Thymidylate synthetase inhibition -> Death

How does the URA-3 system work?

Exogenous uridine

Uridine kinase

analog

Ura3+ is FOA sensitive; ura3- is FOA resistant

53

Measuring protein-protein interactions in vitro

X = one protein; Y = another protein

Pull-downs:

Binding between defined proteins, at least one being purified. Tag each protein differently.

Examples:

His6-X + HA-Y; Bind to nickel ion column, elute (his), Western with HA Ab

GST-X + HA-Y; Bind to glutathione column, elute (glutathione), Western with HA Ab

His6-X + 35S-Y (made in vitro); Bind to Ni column, elute (his), gel + autoradiography. No antibody needed.

(HA = influenza virus hemagglutinin)

glutathione = Gamma-glutamyl-cysteinyl-glycine.

54

Example of a result of a pull-down experiment

Antibody used in Western

Total protein: no antibody or Western (stained with Coomassie blue or silver stain)

Compare pulled-down fraction (eluted) with loaded

Also identify by MW (or mass spec)

55

Western blotting

To detect the antibody use a secondary antibody against the primary antibody.

The secondary antibody is a fusion protein with an enzyme activity (e.g., alkaline phosphatase).

The enzyme activity is detected by its catalysis of a reaction producing a luminescent compound.

http://www.bio.davidson.edu/courses/genomics/method/Westernblot.html

56

Non-luminescent substrate-PO4= -> Luminescent product + PO4=

Protein band on membrane

Alkaline phosphatase fusion

Secondary antibody-enzyme fusion(e.g., goat anti-rabbit IgG)

Antibody to protein on membrane

Detect by exposing to film

Detection of antibody binding in western blots

57

Far western blotting to detect specific protein-protein interactions. Use a specific purified protein as a probe instead of the primary antibody.

To detect the protein probe use an antibody against it.

Then a secondary antibody, a fusion protein with an enzyme activity.

The enzyme activity is detected by its catalysis of a reaction producing a luminescent compound.

http://www.bio.davidson.edu/courses/genomics/method/Westernblot.html

protein protein

58

Expression via in vitro transcription followed by in vitro translation

cDNA

T7 RNA polymerase binding site (17-21 nt)

….ACCATGG…..

VECTOR

1. Transcription to mRNA via the T7 promoter + T7 polymerase

2. Add to translation system: rabbit reticulocyte lysate or wheat germ lysate

Or:

E. coli lysate (combined transcription + translation)

All commercially available as kits

Add ATP, GTP, tRNAs, amino acids, label (35S-met)

May need to add RNase (Ca++-dependent) to remove endogenous mRNA in lysate

Radioactively labeled protein

NOTE: Protein is NOT at all pure (100s of lysate proteins present), just “radio-pure”

59

Co-immunoprecipitation

• Most times not true precipitation, which requires about equivalent concentrations of antigen and antibody
• Use protein A immobilized on beads (e.g., agarose beads)
• Protein A is from Staphylococcus aureus: binds tightly to Immunoglobulin G (IgG) from many species.

[Figure: proteins X, Y, B, C and D in solution; incubate + anti-X IgG; + Protein A beads (A); X and Y end up on the beads.]

Wash by centrifugation (or magnet)

Elute with SDS

Detect X, Y in eluate by Western blotting

Or cell extract

Does X interact with Y in the cell or in vitro?

60

Surface plasmon resonance (SPR)

The binding events are monitored in real-time and it is not necessary to label the interacting biomolecules.

http://home.hccnet.nl/ja.marquart/BasicSPR/BasicSpr01.htm

glass plate

61

Expression in mammalian cells

Lab examples:
HEK293   Human embryonic kidney (high transfection efficiency)
HeLa     Human cervical carcinoma (historical, low RNase)
CHO      Chinese hamster ovary (hardy, diploid DNA content, mutants)
Cos      Monkey cells with SV40 replication proteins (-> high transgene copies)
3T3      Mouse or human exhibiting ~regulated (normal-like) growth

+ various others, many differentiated to different degrees, e.g.:
BHK      Baby hamster kidney
HepG2    Human hepatoma
GH3      Rat pituitary cells
PC12     Rat neuronal-like tumor cells
MCF7     Human breast cancer
HT1080   Human with near diploid karyotype
IPS      Induced pluripotent stem cells

and:
Primary cells cultured with a limited lifetime. E.g., MEF = mouse embryonic fibroblasts, HDF = human diploid fibroblasts

Common in industry:
NS1      Mabs                               Mouse plasma cell tumor cells
Vero     Vaccines                           African green monkey cells
CHO      Mabs, other therapeutic proteins   Chinese hamster ovary cells
PER.C6   Mabs, other therapeutic proteins   Human retinal cells

62
