Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1


Page 1: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

Regulatory Genomics

Lecture 2 November 2012

Yitzhak (Tzachi) Pilpel


Page 2: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

Course requirements

• Attendance and participation

• Two reading assignments

• A final take-home exam based on reading papers

• website

No meeting next week on Nov 15th


Page 3: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

Expression regulation of genes determines complex spatio-temporal patterns


Page 4: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

Monitor expression during cell cycle

[Figure: mRNA expression level (y-axis, -2 to 4) versus time (x-axis, 0 to 15) over two cell cycles (G1, S, G2, M, G1, S, G2, M)]

Page 5: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

Genes can be clustered based on time-dependent expression profiles

[Figure: genes plotted in the space of Time-point 1, Time-point 2 and Time-point 3, alongside three panels of normalized expression versus time-point (1, 2, 3)]

Page 6: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

The K-means algorithm

• Start with random positions of centroids.

Iteration = 0


Page 7: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

K-means

• Start with random positions of centroids.

• Assign data points to centroids

Iteration = 1


Page 8: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

K-means

• Start with random positions of centroids.

• Assign data points to centroids.

• Move centroids to center of assigned points.

Iteration = 1


Page 9: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

K-means

• Start with random positions of centroids.

• Assign data points to centroids.

• Move centroids to center of assigned points.

• Iterate until the cost stops decreasing (the assignments no longer change).

Iteration = 3
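The four steps listed on the K-means slides can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's own implementation: it assumes squared Euclidean distance, initializes the centroids at randomly chosen data points, and uses hypothetical names (`kmeans`, `profiles`) and made-up three-time-point profiles.

```python
import random

def kmeans(points, k, max_iter=100, seed=0):
    """Cluster points (tuples of floats) into k groups; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Step 1: start with random positions of centroids (assumption: k random data points).
    centroids = [tuple(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        # Step 2: assign each data point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Step 4: stop once the assignments no longer change.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 3: move each centroid to the center (mean) of its assigned points.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, labels

# Hypothetical normalized expression profiles over three time points.
profiles = [(-1.0, 0.0, 1.0), (-0.9, 0.1, 1.1), (1.0, 0.0, -1.0), (1.1, -0.1, -0.9)]
centroids, labels = kmeans(profiles, k=2)
print(labels)
```

Because the starting centroids are random, K-means can end in different local minima on harder data; running it from several seeds and keeping the lowest-cost result is a common precaution.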

Page 10: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

An expression cluster

Page 11: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

1D and 2D clustering of gene expression data

Page 12: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

Hierarchical clustering

Page 13: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

How to join sets?

[Figure: six points labeled a, b, c, d, e and f, to be joined into nested sets]
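The question of how to join sets comes down to choosing a linkage rule, that is, how the distance between two clusters is derived from the distances between their members. The sketch below compares three standard rules (single, complete and average linkage) in a naive agglomerative loop. The rule names are standard, but the function names and the four 2D points are hypothetical and are not taken from the slide.

```python
import math
from itertools import combinations

def linkage_distance(a, b, how):
    """Distance between two clusters (lists of points) under a linkage rule."""
    d = [math.dist(p, q) for p in a for q in b]
    if how == "single":      # closest pair of members
        return min(d)
    if how == "complete":    # farthest pair of members
        return max(d)
    if how == "average":     # mean over all member pairs
        return sum(d) / len(d)
    raise ValueError(f"unknown linkage: {how}")

def agglomerate(points, how="average"):
    """Repeatedly join the two closest clusters; returns [(cluster_a, cluster_b, distance), ...]."""
    clusters = [[p] for p in points]
    merges = []
    while len(clusters) > 1:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: linkage_distance(clusters[ij[0]], clusters[ij[1]], how),
        )
        dist = linkage_distance(clusters[i], clusters[j], how)
        merges.append((clusters[i], clusters[j], dist))
        merged = clusters[i] + clusters[j]
        clusters = [c for n, c in enumerate(clusters) if n not in (i, j)] + [merged]
    return merges

# Hypothetical points: two tight pairs that are far from each other.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.5)]
for how in ("single", "complete", "average"):
    print(how, [round(d, 3) for _, _, d in agglomerate(points, how)])
```

The merge order is the same here for all three rules, but the height of the final join differs, which is what changes the shape of the resulting tree.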

Page 14: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

How to measure a distance between expression profiles?

[Figure: expression of Gene y plotted against expression of Gene x, with points for the time points t1, t2, t3, t4 and t5]

Page 15: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

Clustering the data

http://home.dei.polimi.it/matteucc/Clustering/tutorial_html/AppletH.html

http://home.dei.polimi.it/matteucc/Clustering/tutorial_html/AppletKM.html

Try these two applets at home (requires Java)

Page 16: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

The common distance metrics
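The formulas on this slide did not survive extraction. As an illustration only, the sketch below implements two measures that are widely used for expression profiles: Euclidean distance and one minus the Pearson correlation. Whether these are the ones shown on the slide is an assumption, and the two five-time-point profiles are invented.

```python
import math

def euclidean(x, y):
    """Straight-line distance; sensitive to the absolute expression level."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def pearson_distance(x, y):
    """1 - Pearson correlation; compares the shape of the profiles, not their scale."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return 1.0 - cov / (sx * sy)

# Hypothetical profiles over t1..t5: same shape, tenfold different amplitude.
gene_x = [1.0, 2.0, 3.0, 2.0, 1.0]
gene_y = [10.0, 20.0, 30.0, 20.0, 10.0]

print(euclidean(gene_x, gene_y))         # large: the levels differ
print(pearson_distance(gene_x, gene_y))  # close to 0: the shapes match
```

The choice matters for clustering: a correlation-based distance groups co-varying genes regardless of amplitude, while Euclidean distance on unnormalized data groups genes by expression level.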

Page 17: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

Promoter motifs and expression profiles

[Figure: example promoter motifs: CGGCCCCGCGGA, CTCCTCCCCCCCTTC, TGGCCAATCA, ATGTACGGGTG]

Page 18: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

ChIP - chromatin immunoprecipitation

• Inside the yeast nucleus, the TF of interest (carrying an epitope tag) is bound to its binding sites.

• Formaldehyde crosslinks living yeast cells.

• Harvest and sonicate; results in DNA fragments (some of which are bound to proteins).

• Reversal of the crosslinks to separate DNA segments from proteins, and fluorescence labeling of each pool separately (enriched DNA and unenriched DNA).

• Hybridization to DNA array of all yeast intergenic sequences.

Page 19: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

P-value, or confidence level, for each spot in array

The total number of protein-DNA interactions in the location analysis data set, using a range of P-value thresholds:

P-value threshold    Interactions
0.05                 35,365
0.01                 12,040
0.005                8,190
0.001                3,985

A P-value threshold of 0.001 was selected, which minimizes false positives at the expense of more false negatives.

Page 20: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

Genome-wide Distribution of Transcriptional Regulators

• Promoter regions of 2343 of 6270 yeast genes (37%) were bound by 1 or more of the 106 transcriptional regulators (P=.001)

On average, a regulator binds 38 promoter regions

At P = 0.001, significantly more intergenic regions bind 4 or more regulators than expected by chance
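The summary figures on this slide follow from the interaction counts on the previous one. The short check below redoes the arithmetic; the variable names are hypothetical, and all numbers are copied from the two slides.

```python
# Numbers as given on the slides (P-value threshold 0.001).
regulators = 106
genes_total = 6270
genes_bound = 2343
interactions_at_p001 = 3985

# Fraction of yeast genes whose promoter is bound by at least one regulator.
fraction_bound = genes_bound / genes_total
# Average number of promoter regions bound per regulator.
mean_targets = interactions_at_p001 / regulators

print(f"{fraction_bound:.1%} of genes bound")
print(f"{mean_targets:.1f} promoter regions per regulator")
```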

Page 21: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

Network Motifs


Page 22: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

Network Motifs


Page 23: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

Network Motifs in the Yeast Regulatory Network

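Based on algorithmic analyses performed in Matlab; http://jura.wi.mit.edu/cgi-bin/young_public/navframe.cgi?s=17&f=networkmotif

[Figure: network motif types with their counts in the yeast regulatory network (labels as extracted: 103, 49, 90 81, 188); legend: Protein, Gene]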
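To give a feel for what searching a regulatory network for motifs involves, the sketch below finds two simple motif types, autoregulation and feed-forward loops, in a directed regulator-to-target graph. It is not the Matlab analysis cited on the slide; the function names and the toy network (A, B, C, D) are hypothetical.

```python
def find_autoregulation(edges):
    """Regulators that bind their own promoter (X -> X)."""
    return sorted({src for src, dst in edges if src == dst})

def find_feedforward_loops(edges):
    """Triples (X, Y, Z) with X -> Y, Y -> Z and X -> Z, all three distinct."""
    targets = {}
    for src, dst in edges:
        targets.setdefault(src, set()).add(dst)
    loops = []
    for x, x_targets in targets.items():
        for y in x_targets:
            if y == x:
                continue
            for z in targets.get(y, ()):
                if z not in (x, y) and z in x_targets:
                    loops.append((x, y, z))
    return sorted(loops)

# Hypothetical network: each pair is (regulator, target gene).
edges = {("A", "B"), ("B", "C"), ("A", "C"), ("C", "C"), ("A", "D")}

print(find_autoregulation(edges))     # regulators with a self-edge
print(find_feedforward_loops(edges))  # (X, Y, Z) triples
```

Counting motifs is only half of the analysis: to call a motif enriched, the counts have to be compared with those in randomized networks that preserve the number of edges per node.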

Page 24: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

The Cell Cycle Transcriptional Regulatory Network:

Various stages of cell cycle

Blue boxes represent sets of genes bound by a common set of regulators.

Each box is positioned according to the time of peak expression levels for the genes represented by the box.

Ovals represent regulators, connected to the genes they regulate.

The length of each arc defines the period of activity of that regulator.

Page 25: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

Network of Transcriptional Regulators Binding to Genes Encoding Other Transcriptional Regulators


Page 26: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

Network of Transcriptional Regulators Binding to Genes Encoding Other Transcriptional Regulators


Page 27: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

Network of Transcriptional Regulators Binding to Genes Encoding Other Transcriptional Regulators


Page 28: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

The Central Dogma of Molecular Biology

Expressing the genome

[Diagram: DNA → mRNA → Protein; additional labels: Inactive DNA, RNA]

Page 29: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

Translation consists of initiation, elongation and termination

[Figure: an mRNA drawn 5’ to 3’ and ending in a STOP codon, with a codon and its matching anti-codon labeled]

Page 30: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

The ribosome attachment site determines initiation rate

E. coli

Yeast


Page 31: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

A consensus for S. cerevisiae ribosome attachment sites?

[Figure: nucleotide frequencies (0% to 100%) at each position relative to ATG. For a given sequence: how good is it as a “ribosomal attachment site”? The answer is its ribosomal attachment site score.]
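The slide does not say how the ribosomal attachment site score is computed. One common way to turn per-position nucleotide frequencies into a score is a position weight matrix (PWM) of log-odds values, sketched below. The PWM approach, the uniform background, the pseudocount and the four aligned example sites are all assumptions made for illustration; they are not the actual S. cerevisiae data.

```python
import math

def build_pwm(sites, pseudocount=1.0, alphabet="ACGT"):
    """Per-position log2 odds of each base relative to a uniform background."""
    background = 1.0 / len(alphabet)
    total = len(sites) + pseudocount * len(alphabet)
    pwm = []
    for pos in range(len(sites[0])):
        column = [site[pos] for site in sites]
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in alphabet
        })
    return pwm

def score(pwm, seq):
    """Sum of the log-odds values of seq; higher means closer to the consensus."""
    return sum(column[base] for column, base in zip(pwm, seq))

# Hypothetical aligned sequences upstream of ATG (not real yeast sites).
sites = ["AAAAAA", "AAAACA", "AACAAA", "AAAAAA"]
pwm = build_pwm(sites)

print(score(pwm, "AAAAAA"))  # matches the consensus at every position
print(score(pwm, "GGGGGG"))  # mismatches everywhere
```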

Page 32: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

[Figure: an mRNA drawn 5’ to 3’, annotated with the triplets CTG, CGC, GCG and CAG]

Page 33: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

The sequence adaptation score of proteins in yeast

[Figure: ribosomal attachment site score versus rank, from good score to bad score, with CRP marked]

Page 34: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

Multiple codons for the same amino acid

            C1   C2   C3   C4   C5   C6
Serine:     UCU  UCC  UCA  UCG  AGC  AGU
Cysteine:   UGU  UGC
Methionine: AUG
STOP:       UAA  UAG  UGA

Page 35: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

[Figure: the amino acid sequence G T R Y E C Q A S F D, with one of two alternative codons (C1 or C2) chosen at each position in several variants]

For a hypothetical protein of 300 amino acids, each encoded by two codons, there are 2^300 possible nucleotide sequences.

These variants code for the same protein and are thus considered “synonymous”.

Indeed, evolution would easily exchange between them.

But are they all really equivalent?
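The count of synonymous sequences is easy to check numerically. The sketch below reproduces the 2^300 figure for the hypothetical two-codon protein and, going one step beyond the slide, counts the synonymous sequences of the peptide in the figure using the codon degeneracy of the standard genetic code. The names `DEGENERACY` and `synonymous_count` are hypothetical.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code.
DEGENERACY = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4, "H": 2, "I": 3,
    "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6, "T": 4, "W": 1, "Y": 2, "V": 4,
}

def synonymous_count(protein):
    """Number of distinct coding sequences that encode the given protein."""
    return prod(DEGENERACY[aa] for aa in protein)

# The slide's thought experiment: 300 residues with two codons each.
two_codon_protein = 2 ** 300
print(len(str(two_codon_protein)), "decimal digits")

# The 11-residue peptide shown in the figure.
print(synonymous_count("GTRYECQASFD"))
```

Even for an 11-residue peptide the number of synonymous encodings is in the hundreds of thousands, which is why codon choice leaves so much room for selection on anything other than the protein sequence.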

Page 36: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

Selection of codons might affect:

• Accuracy

• Throughput

• Costs

• Folding

• RNA-structure

Page 37: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

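The tRNA Adaptation Index (tAI)

A simple model for translation efficiency

\[ W_i = \sum_{j=1}^{n_i} (1 - s_{ij})\, \mathrm{tRNA}_{ij} \]

\[ w_i = \begin{cases} W_i / W_{\max} & \text{if } W_i \neq 0 \\ w_{\mathrm{mean}} & \text{else} \end{cases} \]

\[ \mathrm{tAI}_g = \left( \prod_{k=1}^{\ell_g} w_{i_k} \right)^{1/\ell_g} \]

Here n_i is the number of tRNA species that read codon i, tRNA_ij is the gene copy number of the j-th of them, s_ij is the penalty for a wobble interaction, and l_g is the length of gene g in codons.

dos Reis et al. NAR 2004

[Figure: a coding sequence (ATC CCA AAA TCG AAT ...) read codon by codon, with the wobble interaction indicated]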
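The tAI calculation can be sketched as follows. The structure follows the definition of dos Reis et al. (a weighted tRNA supply per codon, normalization by the maximum, and a geometric mean over the codons of a gene), but every number below is invented for illustration: the tRNA gene copy numbers, the wobble penalties and the codon-to-anticodon pairings are hypothetical, and `w_mean` is taken here as the geometric mean of the nonzero weights.

```python
import math

def codon_weights(trna_copies, pairing):
    """Relative adaptiveness w_i of each codon.

    trna_copies: anticodon -> tRNA gene copy number.
    pairing: codon -> list of (anticodon, s), where s is the wobble penalty
             (0.0 for a perfect codon-anticodon match).
    """
    W = {
        codon: sum((1.0 - s) * trna_copies.get(anticodon, 0) for anticodon, s in pairs)
        for codon, pairs in pairing.items()
    }
    w_max = max(W.values())
    w = {codon: value / w_max for codon, value in W.items()}
    nonzero = [value for value in w.values() if value > 0]
    w_mean = math.exp(sum(math.log(value) for value in nonzero) / len(nonzero))
    # Codons with no tRNA supply get the mean weight instead of zero.
    return {codon: (value if value > 0 else w_mean) for codon, value in w.items()}

def tai(cds, w):
    """Geometric mean of the codon weights along a coding sequence."""
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    return math.exp(sum(math.log(w[codon]) for codon in codons) / len(codons))

# Hypothetical supply and pairing tables (DNA alphabet, made-up values).
trna_copies = {"GAT": 8, "TGG": 4, "TTT": 8, "CGA": 1, "TGA": 2, "GTT": 4}
pairing = {
    "ATC": [("GAT", 0.0)],
    "CCA": [("TGG", 0.0)],
    "AAA": [("TTT", 0.0)],
    "TCG": [("CGA", 0.0), ("TGA", 0.5)],  # second pairing is a wobble
    "AAT": [("GTT", 0.5)],                # read only through a wobble
}

weights = codon_weights(trna_copies, pairing)
print(tai("ATCCCAAAATCGAAT", weights))
```

The geometric mean makes the index sensitive to rare codons: a single poorly supplied codon lowers the score more than it would under an arithmetic mean.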

Page 38: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

Supply, demand and charging

Page 39: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

How does RNA structure influence translation?

Page 40: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

Conclusions from synthetic library

No correlation between CAI and protein expression

Positive correlation between structure’s energy and expression

The 5’ window needs to be unfolded for high expression

[Figure: two panels with protein abundance on the y-axis]

Page 41: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

ChIP - chromatin immunoprecipitation

• Inside the yeast nucleus, the TF of interest (carrying an epitope tag) is bound to its binding sites.

• Formaldehyde crosslinks living yeast cells.

• Harvest and sonicate; results in DNA fragments (some of which are bound to proteins).

• Reversal of the crosslinks to separate DNA segments from proteins, and fluorescence labeling of each pool separately (enriched DNA and unenriched DNA).

• Hybridization to DNA array of all yeast intergenic sequences.

Page 42: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

A genome-wide method to measure translation efficiency

(Ingolia Science 2009)


Page 43: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

Translational response to starvation


Page 44: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

The Central Dogma of Molecular Biology

Expressing the genome

[Diagram: DNA → mRNA → Protein; additional labels: Inactive DNA, RNA]

Page 45: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

mRNA abundance

[Figure: four options (Option 1 to Option 4) combining production and degradation]
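One way to see how production and degradation jointly set mRNA abundance is a first-order kinetic model. The model is an assumption made here for illustration (the slide does not state one), with a constant production rate $k_p$ and a degradation rate constant $k_d$ as hypothetical symbols:

```latex
\[
\frac{dm}{dt} = k_p - k_d\, m
\quad\Longrightarrow\quad
m^{*} = \frac{k_p}{k_d},
\qquad
m(t) = m^{*} + \bigl(m(0) - m^{*}\bigr)\, e^{-k_d t}
\]
```

Under this model the same steady-state level $m^{*}$ can be reached with high production and high degradation or with low production and low degradation. The two differ in response time, which is set by $k_d$ alone (half-time $\ln 2 / k_d$).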

Page 46: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

Relationship between gene expression levels and mRNA decay rates across genes.

A study in a human population examined variation in mRNA decay rates and steady-state mRNA levels across people. It found strong negative or positive correlations between mRNA level and decay rate. Fast-responding genes show a “discordant” relation, suggesting that increased expression is often accompanied by an increased decay rate.

Page 47: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

The various phases are coupled


Page 48: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

At the hardware level (post-transcription: RNA binding proteins)

G1 1 1 1 0

G2 1 0 0 1

G3 0 1 1 1

Page 49: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

At the hardware level (post-transcription: microRNA)

G1 1 1 1 0

G2 1 0 0 1

G3 0 1 1 1

RISC RISC RISC RISC


Page 50: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

Yang CGFR 16:397, 2005


Page 51: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

Computational approaches to find microRNA genes

• MiRscan (Lim et al. 2003)

– Scan to find conserved hairpin structures in both C. elegans and C. briggsae.

– Using known microRNA genes (50) as training set.

Page 52: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

What is the effect of over expression of a miR?


Page 53: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

Non-coding RNAs are often co-targeted with their own targets for various cellular needs

Page 54: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

miR-124 similarly decreases the abundance and the translation of its mRNA targets

Page 55: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

microRNA expression profiles classify human cancers

Lu et al. Nature 435: 834, 2005

[Figure: expression heat map with axes labeled miRs and Samples (patients)]

Page 56: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

Gene expression is noisy


Page 57: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

Fluorescence distribution shapes


Page 58: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

The cell intrinsic and extrinsic contributions to noise


Page 59: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

The actual intrinsic and extrinsic sources of noise:

Extrinsic: variation in copy numbers of molecules among cells

Intrinsic: stochastic events

[Diagram: DNA → RNA → Protein → Φ (protein degradation), annotated with regulation by transcription factors, chromatin remodeling, RNA polymerase, the transcription process, the ribosome and the translation process, each marked as an extrinsic or intrinsic source]

Page 60: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

A theoretical approach

[Diagram: DNA → mRNA → Protein]

Page 61: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

The ratio of transcription to translation should affect noise
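A hedged sketch of why this ratio matters, using the standard two-stage model of gene expression. The model and all symbols are assumptions introduced here, not taken from the slide: mRNA is produced at rate $k_m$ and degraded at rate $\gamma_m$, each mRNA is translated at rate $k_p$, and protein is degraded at rate $\gamma_p$.

```latex
\[
\langle p \rangle = \frac{k_m\, k_p}{\gamma_m\, \gamma_p},
\qquad
\frac{\sigma_p^{2}}{\langle p \rangle} = 1 + \frac{k_p}{\gamma_m + \gamma_p}
\;\approx\; 1 + b
\quad (\gamma_m \gg \gamma_p),
\qquad
b = \frac{k_p}{\gamma_m}
\]
```

Here $b$ is the average number of proteins made per mRNA lifetime. Two genes with the same mean protein level can therefore differ in noise: few transcripts each translated many times (large $b$) give a broader distribution than many transcripts each translated a few times (small $b$).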


Page 62: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

Transcription bursts should affect noise


Page 63: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

Can noise be useful?

Page 64: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

The native network shows longer competence periods with more diverse durations

Page 65: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

The native network does better over a wider range of extracellular [DNA]

The trade-off: high competence allows finding solutions, but reduces growth rate

Page 66: Regulatory Genomics Lecture 2 November 2012 Yitzhak (Tzachi) Pilpel 1

Questions about noise

• What are the sources of noise?

• How is noise regulated in cells?

• How is it tolerated by the biological systems that need to be noise free?

• When is noise advantageous, deleterious or neutral?