ISMB 2012 talk by Barbara Bryant, Li Chenhao and Greg Tucker-Kellogg, "Modeling chromatin as a computer"
Modeling Chromatin as a Computer
Barbara Bryant, Constellation Pharmaceuticals
Li Chenhao, National University of Singapore
Greg Tucker-Kellogg, National University of Singapore
ISMB 2012
Chromatin is a computer
• Protein complexes read and write chemical symbols on chromatin.
• Chromatin modification is computationally universal.
• Hard problems can be solved on biologically-sized chromatin computers.
• It is useful to model chromatin as a computer.
Reference: Barbara Bryant, "Chromatin Computation", PLoS ONE 2012 PMID 22567109
DNA wraps around histone octamers
Each nucleosome has many possible modifications
from Allis et al, Epigenetics, 2006
Enzymes modify chromatin
Kosi Gramatikoff via Wikimedia Commons
Readers and writers are joined together in protein complexes
DNA reader
histone tail reader
scaffolding protein
histone tail writer
chromatin remodeler
Chromatin-modifying complex
Examples of chromatin-modifying complexes from PINDB
Complex | Erasers (HDMs, HDACs) | Writers (HMTs, DNMTs, HATs) | Readers (bromodomains, PHD, PWWP, chromodomains, MBD)
NUMAC | - | CARM1 | SMARCA4
GST-Smad2 | - | CREBBP, NCOA3 | CREBBP, SMARCA4, TRIM33
DNMT3B | HDAC1 | DNMT3B | DNMT3B
AF9.com | HDAC1, HDAC2 | DOT1L, TAF1, TAF5 | CBX8, MLLT10, TAF1, TAF3
CtBP | HDAC1, HDAC2, KDM1A | EHMT1, EHMT2 | CBX4, CDYL
DMAP1 | - | EPC1, ING3, SRCAP | BRD8, ING3
PCAF | - | KAT2B, SUPT3H, TADA2A, TADA3, TAF10, TAF12, TAF5L, TAF6L, TAF9 | KAT2B
ING5-TAP | - | KAT6A, KAT6B, KAT7 | BRD1, BRPF1, BRPF3, ING5, KAT6A, KAT6B, PHF15, PHF16, PHF17
MLL1-WDR5 | - | KAT8, MLL, TAF1, TAF9 | CHD8, MLL, PHF20, TAF1
HDAC2 | HDAC1, HDAC2, KDM1A, MTA2 | - | CHD3, CHD4, PHF21A
http://pin.mskcc.org/
Luc & Tempst, Bioinformatics 2004 20(9):1413
Next: the CC model
Chromatin Computational Model
DNA reader
histone tail reader
scaffolding protein
histone tail writer
chromatin remodeler
BBXX
BB** XX--
Chromatin modification sites
Chromatin tape
Chromatin-modifying complexes
Read/write rules
Chromatin Computer
Chromatin Computer
BBXX XXXX BBXX BBBB
XX** BB** --> ---- XX--
BBXX XXXX XXXX BBBB
See Bryant 2012, PLoS ONE
Next: an example CC program
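The read/write rule on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' simulator: it assumes that `*` in the read pattern matches any mark and that `-` in the write pattern leaves a mark unchanged, which is how the rules in this deck appear to be used. The function names are made up for the example.

```python
def matches(read, nucleosome):
    """Assumed semantics: '*' in a read pattern matches any mark."""
    return all(r == "*" or r == m for r, m in zip(read, nucleosome))

def write(pattern, nucleosome):
    """Assumed semantics: '-' in a write pattern leaves the mark unchanged."""
    return "".join(m if w == "-" else w for w, m in zip(pattern, nucleosome))

def apply_once(tape, reads, writes):
    """Apply the rule at the leftmost window it changes; return the new tape or None."""
    k = len(reads)
    for i in range(len(tape) - k + 1):
        window = tape[i:i + k]
        if all(matches(r, n) for r, n in zip(reads, window)):
            new = [write(w, n) for w, n in zip(writes, window)]
            if new != window:
                return tape[:i] + new + tape[i + k:]
    return None

tape = "BBXX XXXX BBXX BBBB".split()
after = apply_once(tape, "XX** BB**".split(), "---- XX--".split())
print(" ".join(after))  # BBXX XXXX XXXX BBBB
```

A single application reproduces the tape shown on the slide; applying the rule again would keep spreading the XX mark to the right.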
Hamiltonian Path Problem
Find a path from vertex 0 to vertex 6 visiting each vertex exactly once.
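For reference, the problem statement can be written as a small backtracking search. This is a conventional-computer sketch only. The edge list below is a hypothetical directed graph invented for the example, not the graph shown on the slide.

```python
def hamiltonian_path(edges, start, end, n_vertices):
    """Return one path from start to end that visits every vertex once, or None."""
    successors = {}
    for u, v in edges:
        successors.setdefault(u, []).append(v)

    def extend(path, visited):
        last = path[-1]
        if len(path) == n_vertices:
            return path if last == end else None
        for nxt in successors.get(last, []):
            if nxt not in visited:
                found = extend(path + [nxt], visited | {nxt})
                if found:
                    return found
        return None  # dead end: the caller backtracks

    return extend([start], {start})

# Hypothetical directed graph on vertices 0..6 (an assumption for illustration).
EDGES = [(0, 1), (0, 3), (1, 2), (2, 3), (3, 4), (3, 2),
         (4, 5), (4, 1), (5, 6), (2, 6)]
path = hamiltonian_path(EDGES, 0, 6, 7)
print(path)
```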
DNA computer solution
Adleman 1994, Science
Edges
DNA computer solution
Adleman 1994, Science
Edges Paths
hybridization
DNA computer solution
Adleman 1994, Science
Edges Paths
hybridization
Correct path
Sizing, probe hybridization
Chromatin Computer Solution
000000 BBBBBB BBBBBB BBBBBB BBBBBB BBBBBB BBBBBB
2***** B***** --> ------ 3-----
Initial chromatin state
Rule encoding traversal of edge from vertex #2 to vertex #3
(10,6,2)-CC
10 marks: {B,0,1,2,3,4,5,6,F,S}
6 modification sites in each nucleosome
2-nucleosome rules
Building the path
000000 BBBBBB BBBBBB BBBBBB BBBBBB BBBBBB BBBBBB
0***** B***** --> ------ 1-----
000000 1BBBBB BBBBBB BBBBBB BBBBBB BBBBBB BBBBBB
1***** B***** --> ------ 2-----
000000 1BBBBB 2BBBBB BBBBBB BBBBBB BBBBBB BBBBBB
2***** B***** --> ------ 3-----
000000 1BBBBB 2BBBBB 3BBBBB BBBBBB BBBBBB BBBBBB
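The path-building trace can be reproduced with a small sketch that generates one rule per graph edge and applies rules until none fits. It assumes `*` means "any mark" and `-` means "leave unchanged". It also picks the first matching rule deterministically, a simplification, since a chromatin computer would pick among matching complexes non-deterministically. `edge_rule`, `step` and `run` are names made up for this example.

```python
def matches(read, nucleosome):
    return all(r == "*" or r == m for r, m in zip(read, nucleosome))

def write(pattern, nucleosome):
    return "".join(m if w == "-" else w for w, m in zip(pattern, nucleosome))

def step(tape, rules):
    """Apply the first rule that matches and changes the tape; None if none does."""
    for reads, writes in rules:
        k = len(reads)
        for i in range(len(tape) - k + 1):
            window = tape[i:i + k]
            if all(matches(r, n) for r, n in zip(reads, window)):
                new = [write(w, n) for w, n in zip(writes, window)]
                if new != window:
                    return tape[:i] + new + tape[i + k:]
    return None

def run(tape, rules, max_steps=1000):
    for _ in range(max_steps):
        nxt = step(tape, rules)
        if nxt is None:
            return tape
        tape = nxt
    raise RuntimeError("did not halt within max_steps")

def edge_rule(u, v, n_sites=6):
    """Rule for traversing edge u -> v: 'u***** B***** --> ------ v-----'."""
    reads = [f"{u}" + "*" * (n_sites - 1), "B" + "*" * (n_sites - 1)]
    writes = ["-" * n_sites, f"{v}" + "-" * (n_sites - 1)]
    return reads, writes

rules = [edge_rule(u, v) for u, v in [(0, 1), (1, 2), (2, 3)]]
start = ["000000"] + ["BBBBBB"] * 6
final = run(start, rules)
print(" ".join(final))
```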
But, how do you make sure you visit every vertex once?
Rules enforcing single visit to each vertex
## Check for one and only one visit to vertex 1
*0**** 1B**** --> ------ -1----
*0**** 2B**** --> ------ -0----
*0**** 3B**** --> ------ -0----
*0**** 4B**** --> ------ -0----
*0**** 5B**** --> ------ -0----
*0**** 6B**** --> ------ -0----
*1**** 1B**** --> ------ -F----
*1**** 2B**** --> ------ -1----
*1**** 3B**** --> ------ -1----
*1**** 4B**** --> ------ -1----
*1**** 5B**** --> ------ -1----
*1**** 6B**** --> ------ -1----
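The twelve rules above follow a regular pattern: the second site of each nucleosome carries a running count of visits to vertex 1, copied from the left neighbor and bumped to 1 (or to F on a second visit). The sketch below generates that rule set and runs it on a short example path. The `site` parameter and the helper names are assumptions for illustration; the slide only shows the vertex-1 case.

```python
def matches(read, nucleosome):
    return all(r == "*" or r == m for r, m in zip(read, nucleosome))

def write(pattern, nucleosome):
    return "".join(m if w == "-" else w for w, m in zip(pattern, nucleosome))

def step(tape, rules):
    """Apply the first rule that matches and changes the tape; None if none does."""
    for reads, writes in rules:
        for i in range(len(tape) - len(reads) + 1):
            window = tape[i:i + len(reads)]
            if all(matches(r, n) for r, n in zip(reads, window)):
                new = [write(w, n) for w, n in zip(writes, window)]
                if new != window:
                    return tape[:i] + new + tape[i + len(reads):]
    return None

def visit_rules(vertex, site, vertices=range(1, 7), n_sites=6):
    """Rules that count visits to `vertex` in position `site` of each nucleosome."""
    rules = []
    for count in "01":
        for v in vertices:
            left = ["*"] * n_sites
            left[site] = count
            right = ["*"] * n_sites
            right[0], right[site] = str(v), "B"
            out = ["-"] * n_sites
            if v == vertex:
                out[site] = "1" if count == "0" else "F"  # F marks a repeat visit
            else:
                out[site] = count                          # carry the count along
            rules.append((["".join(left), "".join(right)],
                          ["-" * n_sites, "".join(out)]))
    return rules

rules = visit_rules(vertex=1, site=1)
tape = ["000000", "1BBBBB", "2BBBBB", "1BBBBB"]  # example path 0 -> 1 -> 2 -> 1
while True:
    nxt = step(tape, rules)
    if nxt is None:
        break
    tape = nxt
print(" ".join(tape))
```

In this example the path returns to vertex 1, so the last nucleosome ends up carrying the F mark.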
But, what if it chooses the wrong path and fails?
(Show visualization of running CC)
Animated gif of this CC
Hamiltonian Path II: with backtracking
• Click to run Ham Path backtracking animation
000000 BBBBBB BBBBBB BBBBBB BBBBBB BBBBBB BBBBBB IIIIII
IIIIII = Insulator
## If stuck at 6, erase
6***** BBBBBB --> X----- ------
## If hit wall, turn around
1***** IIIIII --> H----- ------
2***** IIIIII --> H----- ------
3***** IIIIII --> H----- ------
4***** IIIIII --> H----- ------
5***** IIIIII --> H----- ------
## Walk failure right to the I wall.
X***** 1***** --> G----- X-----
X***** 2***** --> G----- X-----
X***** 3***** --> G----- X-----
X***** 4***** --> G----- X-----
X***** 5***** --> G----- X-----
X***** 6***** --> G----- X-----
## Bounce back if hit right end
X***** IIIIII --> H----- ------
X***** BBBBBB --> H----- ------
## Walk left to 0, erasing as you go
G***** H***** --> H----- BBBBBB
X***** H***** --> H----- BBBBBB
1***** H***** --> H----- BBBBBB
2***** H***** --> H----- BBBBBB
3***** H***** --> H----- BBBBBB
4***** H***** --> H----- BBBBBB
5***** H***** --> H----- BBBBBB
6***** H***** --> H----- BBBBBB
0***** H***** --> 000000 BBBBBB
Turing Machine
B y x B ^ 1
Turing Machine
B y x B ^ 2
Turing Machine
B z x B ^ 3
Turing Machine
B z z B ^ 4
Chromatin computers are computationally universal
See Bryant 2012, PLoS ONE
Proof: simulation of a Turing Machine using a Chromatin Computer
CC programs can handle input of any size
BB* H00 BB* --> --- BB- H11
BB* H11 BB* --> H1- BB- ---
BB* H10 BB* --> H2- BB- ---
BB* H20 BB* --> H2- BB- ---
BB* H21 BB* --> --- H30 ---
BB* H30 BB* --> --- BB- H3-
BB* H31 BB* --> --- H4- ---
BB* H41 BB* --> --- BB- H4-
BB* H4B BB* --> --- H11 ---
Start: B 1 1 1 0 B B B B B
End: B 0 0 0 0 1 1 1 1 B
Rules for destructive-copy-and-add-one
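The nine rules can be checked by running them on the Start tape. The sketch below is an illustrative re-implementation, not the authors' simulator. It assumes each nucleosome has three sites (head marker, state, symbol), that `*` matches anything and `-` leaves a mark unchanged, and that the head starts on the 0 in state 0, as the first trace line suggests.

```python
def matches(read, nucleosome):
    return all(r == "*" or r == m for r, m in zip(read, nucleosome))

def write(pattern, nucleosome):
    return "".join(m if w == "-" else w for w, m in zip(pattern, nucleosome))

def step(tape, rules):
    """Apply the first rule that matches and changes the tape; None if none does."""
    for reads, writes in rules:
        for i in range(len(tape) - len(reads) + 1):
            window = tape[i:i + len(reads)]
            if all(matches(r, n) for r, n in zip(reads, window)):
                new = [write(w, n) for w, n in zip(writes, window)]
                if new != window:
                    return tape[:i] + new + tape[i + len(reads):]
    return None

def parse(rule):
    left, right = rule.split("-->")
    return left.split(), right.split()

RULES = [parse(r) for r in """
BB* H00 BB* --> --- BB- H11
BB* H11 BB* --> H1- BB- ---
BB* H10 BB* --> H2- BB- ---
BB* H20 BB* --> H2- BB- ---
BB* H21 BB* --> --- H30 ---
BB* H30 BB* --> --- BB- H3-
BB* H31 BB* --> --- H4- ---
BB* H41 BB* --> --- BB- H4-
BB* H4B BB* --> --- H11 ---
""".strip().splitlines()]

# Start: B 1 1 1 0 B B B B B, one nucleosome per cell; "BB" means no head here.
tape = ["BB" + s for s in "B1110BBBBB"]
tape[4] = "H00"  # assumed initial head position and state

steps = 0
while True:
    nxt = step(tape, RULES)
    if nxt is None:
        break
    tape, steps = nxt, steps + 1

result = " ".join(n[2] for n in tape)
print(result)  # B 0 0 0 0 1 1 1 1 B
```

The run halts with the head on the leftmost blank in state 2 (`H2B`), for which no rule is defined.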
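Tape before: 1 1 1 H00; rule H00: head moves right as H11; tape after: 1 1 1 0 H11
Tape before: 1 1 1 0 H11; rule H11: head moves left as H1; tape after: 1 1 1 H10 1
Tape before: 1 1 1 H10 1; rule H10: head moves left as H2; tape after: 1 1 H21 0 1
Tape before: 1 1 H21 0 1; rule H21: head stays and becomes H30; tape after: 1 1 H30 0 1
Tape before: 1 1 H30 0 1; rule H30: head moves right as H3; tape after: 1 1 0 H30 1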
...
Almost done
Tape before: 0 0 H20 0 1 1 1 1; rule H20: head moves left as H2; tape after: 0 H20 0 0 1 1 1 1
Almost done
Tape before: 0 H20 0 0 1 1 1 1; rule H20: head moves left as H2; tape after: H20 0 0 0 1 1 1 1
Done
Tape before: H20 0 0 0 1 1 1 1; rule H20: head moves left as H2; tape after: H2 0 0 0 0 1 1 1 1
Biological chromatin is a massively parallel, random access, self-modifying stored procedure computer
• Non-deterministic / parallel computation
• DNA methylation
• DNA sequence readers (transcription factors)
• Gene expression: program output, and self-modifying programs
• Recognition of multiple marks by one reader
• Different rate constants or affinities
• Move left/right along the chromatin (remodeler)
• Remove/replace histone (remodeler)
• RNA processing systems operating on nascent RNA transcripts
Enhancer
enhancer TSS
Inspirations:
• Okazaki fragments
• Active chromatin hubs
• Organizer factory
Inspirations: ACH, Okazaki
• Show replication video from Drew Berry
Enhancer
enhancer TSS
Enhancer
TSS
Copying
1111
Copying
1111
1111
Multiplying
1111
Hamiltonian Path III: arbitrary graph as input on chromatin tape
State
Last vertex
Graph (edges)
Vertices
Path
Blank
Input: Initial state, Graph
Working memory: State, Vertices, Path
Output: Path
L VG P
Sketch of HamPath III
• Set up the vertex-visited section of tape.
• Add the next valid edge; check the vertex has not yet been used.
• Check whether we found a Hamiltonian path (visits all vertices and ends at the last vertex)
• If there are no next valid edges, backtrack by removing the last edge.
• Repeat
1 LS G V P
5 30 2 2 2 2
Encoding: 2^12 combinations
Unary representation for vertex number.
Program design: state transition graph
Hamiltonian Path III: graph as input
LS
V
G
P
State
Last vertex
Graph (edges)
Vertices
Path
Blank
L VG P
Click on the complex for the animation…
What does biological chromatin actually compute?
• State, subroutines, variables, recursion?
• Self-modifying program (through changed expression of chromatin modifying complex components)
• How do CC programs evolve?
Silencing in S. cerevisiae
• Silencing in the Saccharomyces cerevisiae heterochromatin domain has been well studied, and a model involving sequential histone modification is well defined in the literature.
• Here we map this model onto a piece of chromatin and a set of rules that implement the propagation of heterochromatin in our Chromatin Computer.
Proposed Nucleosome
• H4K16(A/B) Core(I/B) Protein(s/C/B) State(H/E)
• A: acetylation, B: no modification
• I: end of heterochromatin (H3K56ac or H3K79me), B: no modification
• s: Sir3p, C: SIR complex (2-3-4), B: none
• H: heterochromatin, E: euchromatin
• N: nucleation center
Silencing steps
• (Nucleation) A start signal triggers a series of protein interactions, and finally the SIR complex binds to the silencer.
• (Propagation)
1. Sir2 deacetylates H4K16ac on the adjacent nucleosome.
2. Sir3 binds to the deacetylated nucleosome.
3. Sir3 recruits Sir4-Sir2 complex.
4. Goto step 2.
• (Termination) Boundary forms when H3K56ac or H3K79me is encountered
Initial chromatin & rules
... - NNNN - ABBE - ABBE - ABBE - ... - AIBE - IIII - ...
NNNN ABBE --> NNNN BBBE # Nucleation with the first SIR
BBBE **** --> BBsE ---- # Recruit Sir3p
**sE **** --> --CH ---- # Recruit Sir2-4 complex & deactivation
**C* A*** --> ---- B--- # Deacetylation
(BIBE **** --| BIsE ----) # Inhibition not required in simulation
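A minimal sketch of how these four rules spread heterochromatin along the tape and stop at the boundary. It is not the simulator used for the sample run. It assumes `*` matches any mark, `-` leaves a mark unchanged, and that the elided stretch of tape is just three ABBE nucleosomes. The parenthesized inhibition rule is left out, as the slide says it is not required.

```python
def matches(read, nucleosome):
    return all(r == "*" or r == m for r, m in zip(read, nucleosome))

def write(pattern, nucleosome):
    return "".join(m if w == "-" else w for w, m in zip(pattern, nucleosome))

def step(tape, rules):
    """Apply the first rule that matches and changes the tape; None if none does."""
    for reads, writes in rules:
        for i in range(len(tape) - len(reads) + 1):
            window = tape[i:i + len(reads)]
            if all(matches(r, n) for r, n in zip(reads, window)):
                new = [write(w, n) for w, n in zip(writes, window)]
                if new != window:
                    return tape[:i] + new + tape[i + len(reads):]
    return None

def parse(rule):
    left, right = rule.split("-->")
    return left.split(), right.split()

RULES = [parse(r) for r in [
    "NNNN ABBE --> NNNN BBBE",  # nucleation with the first SIR
    "BBBE **** --> BBsE ----",  # recruit Sir3p
    "**sE **** --> --CH ----",  # recruit Sir2-4 complex, switch state to H
    "**C* A*** --> ---- B---",  # deacetylate the next nucleosome
]]

# Assumed example tape: nucleation center, three euchromatic nucleosomes,
# a boundary nucleosome (core mark I), and an insulator.
tape = "NNNN ABBE ABBE ABBE AIBE IIII".split()
while True:
    nxt = step(tape, RULES)
    if nxt is None:
        break
    tape = nxt
print(" ".join(tape))  # NNNN BBCH BBCH BBCH BIBE IIII
```

The boundary nucleosome is deacetylated but never recruits Sir3p, because `BBBE` does not match its `I` core mark, so propagation terminates there.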
Sample run with simulator
Can we build a computer from molecular parts?
• To build a biological chromatin computer, we need to create the writable tape and the read/write rules.
Making the tape
+
Ligation of ssDNA ends
Proof of concept for CC rule engineering
From Fig. 1 of Haynes and Silver 2011, “Synthetic reversal of gene silencing” JBC
Engineering of read/write rules
• Start with list of existing parts: existing reader, writer and scaffolding/adaptor protein domains across organisms
• Protein engineering for additional functionality.
• Apply combinatorial remixing to engineer chromatin-modifying complexes.
• Generate a physical library of complexes, then test on nucleosome arrays with known modifications to determine read/write functionality.
• Write CC programs using these components; test with simulator.
CC software engineering
• State, subroutines, variables, recursion
• Self-modifying program (through changed expression of chromatin modifying complex components)
• Compiler.
• Correctness proofs for CC programs.
• What kinds of CC programs are easy to evolve?
What we’ve done
• Defined chromatin computing closely modeled on biology
• Built a simulator
• Wrote CC programs
• Proved CC is computationally universal
• Expanded the CC instruction set for more efficient computing
Related Work
Prohaska, Stadler and Krakauer, “Innovation in gene regulation: The case of chromatin computation” Journal of Theor Biol 2010 265(1):27-44.
Benecke, "Chromatin code, local non-equilibrium dynamics, and the emergence of transcription regulatory programs," Eur Phys J E Soft Matter. 2006 Mar;19(3):353-66.
Rohlf et al, "Modeling the dynamic epigenome: from histone modifications towards self-organizing chromatin," Epigenomics. 2012 Apr;4(2):205-19.
Why it’s important to model chromatin as a computer
• Understand chromatin biology
• Fix chromatin when it’s broken
• Engineer chromatin-based computers
Thanks
• David Yee
• Rich Ferrante
• David Allis
• Keith Robison
• Sebastian Hoersch
• Jim Audia
• Mark Goldsmith
• Bob Tepper
• Yang Shi
• Adam Rudner
• Lisa Tucker-Kellogg
• Larry Hunter
• Phil Bourne
• Constellation
Extra slides
• Different modifications have different time scales. Acetylation marks have a lifetime of minutes, phosphorylation hours, whereas methylation can last for days and even through generations of cell cycle. (Pubmed ID 21818411) The time scale is quite different from that of DNA sequence changes: chromatin modifications are written and erased many times during the life of a cell, whereas DNA sequence is mostly static, varying just slightly over generations.
How much writable chromatin memory is there in each human cell?
Size | Units | Item
3,000,000,000 | base pairs | Size of genome
300 | base pairs | Length of region covered by a nucleosome, allowing for some nucleosome-free regions
10,000,000 | nucleosomes | Therefore, number of nucleosomes in genome
64 | locations | Number of modifiable locations on each nucleosome
2 | marks | Number of possible marks at each position (marked or not marked)
2^64 | mark combinations | Number of possible different values (mark combinations) taken by one nucleosome
64 | bits | Number of bits per nucleosome
8 | bytes | Number of bytes per nucleosome
80,000,000 | bytes | Number of bytes per human cell
152,000 | bytes | Amount of onboard memory in the Apollo mission that got astronauts to the moon (www.doneyles.com/LM/Tales.html)
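The arithmetic behind this estimate can be reproduced directly. The snippet uses only the figures given on the slide; the variable names are made up for the example.

```python
GENOME_BP = 3_000_000_000      # size of genome, base pairs
BP_PER_NUCLEOSOME = 300        # region covered by one nucleosome
SITES_PER_NUCLEOSOME = 64      # modifiable locations per nucleosome
MARKS_PER_SITE = 2             # marked or not marked
APOLLO_BYTES = 152_000         # onboard memory figure quoted on the slide

nucleosomes = GENOME_BP // BP_PER_NUCLEOSOME
combinations = MARKS_PER_SITE ** SITES_PER_NUCLEOSOME   # 2^64 values per nucleosome
bits_per_nucleosome = (combinations - 1).bit_length()   # bits needed to store one value
bytes_per_nucleosome = bits_per_nucleosome // 8
bytes_per_cell = nucleosomes * bytes_per_nucleosome

print(nucleosomes)            # 10000000
print(bits_per_nucleosome)    # 64
print(bytes_per_cell)         # 80000000
print(round(bytes_per_cell / APOLLO_BYTES))  # about 526 times the Apollo figure
```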
Size of program compared to biological chromatin
• Hamiltonian Path program allowing any sized input graph
  • Number of rules: 150
  • Upper bound on required number of nucleosome states: 2^12 (12 modification sites, 2 states each)
• Biological chromatin
  • 100s-1000s of rules (or more)
  • > 64 nucleosome states