Cartoon modeling of proteins
Fred Howell and Dan Mossop
ANC
Informatics
Overview
Why / how to model intracellular processes? Examples: MCell, Stochsim, Virtual Cell
Cartoon models
Where's the data on structure / interactions?
A new 3D protein interaction simulator: post synaptic density self-assembly, vesicle formation, vesicle transport
Futures & speculations
Why / how to model intracellular processes?
Ordered soup of ~1,000,000 different types of macromolecules
Complex and specific network of interactions
Ion channels and complexes the tip of the iceberg (croutons?)
Much work on gene networks / intracellular pathways
Mostly ignores spatial effects (well mixed pool / kinetics)
Hypotheses about mechanisms typically involve cartoon descriptions / precise shapes / jigsaw-like interactions of proteins
Computer models typically don't
Intracellular pathway modeling
Single mixed pool: rate equations / kinetics (as differential equations); stochastic simulators (Stochsim)
A number of connected compartments: Virtual Cell
Individual molecules / Brownian motion: MCell
... but none of them take into account the actual shapes of proteins!
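The two "single mixed pool" styles above can be contrasted on one toy reaction, A + B <-> AB. This is a minimal sketch under stated assumptions: the rate constants and molecule counts are invented for illustration, the deterministic version uses plain forward-Euler integration, and the stochastic version uses Gillespie's direct method, which is a swapped-in stand-in and not the algorithm Stochsim itself uses. Neither version knows anything about position or shape.

```python
import random

def simulate_ode(a0, b0, k_on, k_off, dt, steps):
    """Rate-equation view: forward-Euler integration of
    d[AB]/dt = k_on*[A]*[B] - k_off*[AB] on continuous amounts."""
    a, b, ab = float(a0), float(b0), 0.0
    for _ in range(steps):
        flux = (k_on * a * b - k_off * ab) * dt
        a, b, ab = a - flux, b - flux, ab + flux
    return a, b, ab

def simulate_gillespie(a0, b0, k_on, k_off, t_end, rng):
    """Stochastic view: Gillespie direct method on integer molecule counts."""
    a, b, ab, t = a0, b0, 0, 0.0
    while True:
        r_bind = k_on * a * b      # propensity of A + B -> AB
        r_unbind = k_off * ab      # propensity of AB -> A + B
        total = r_bind + r_unbind
        if total == 0.0:
            break
        t += rng.expovariate(total)  # waiting time to the next event
        if t > t_end:
            break
        if rng.random() * total < r_bind:
            a, b, ab = a - 1, b - 1, ab + 1
        else:
            a, b, ab = a + 1, b + 1, ab - 1
    return a, b, ab

# Illustrative parameters only (assumed, not taken from any real pathway).
print(simulate_ode(100, 100, 0.01, 0.1, dt=0.01, steps=5000))
print(simulate_gillespie(100, 100, 0.01, 0.1, t_end=50.0, rng=random.Random(42)))
```

Both runs should settle near the same bound fraction; the stochastic run fluctuates around it from seed to seed.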
Single protein modeling
The great protein folding problem - what shapes can the sequence form?
Uses molecular dynamics (motion of each atom in the molecule) to try and predict low energy folding conformations of primary sequence: hard, not there yet
Intermediate protein modeling - recognise characteristic subsequences of amino acids, guess substructures like alpha helices, beta sheets: promising, not there yet
Timescales of femto- and pico- seconds
... data available from crystallography on some proteins (PDB)
... predicting binding sites is very hard
Cartoon models
Typically used to hypothesise mechanisms
Getting data on protein shapes
PDB: coordinates of each atom in protein
One possibility: cluster analysis to reduce to a number of subunits
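One way the cluster-analysis idea could look in practice is plain k-means on the atom coordinates, with each cluster centre becoming one "cartoon" subunit. This is only a sketch under assumptions: k-means is one possible clustering method among many, the function name `kmeans` and the synthetic two-blob "protein" are made up for illustration, and real input would be coordinates read from a PDB entry.

```python
import math
import random

def kmeans(points, k, iterations=20, seed=0):
    """Plain k-means on 3D points; returns (centres, groups)."""
    rng = random.Random(seed)
    centres = rng.sample(points, k)  # start from k distinct atoms
    groups = [[] for _ in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centres[i]))
            groups[nearest].append(p)
        # Move each centre to the mean of its atoms (keep it if it has none).
        centres = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centres[i]
            for i, g in enumerate(groups)
        ]
    return centres, groups

# Synthetic stand-in for atom coordinates: two blobs of 20 "atoms" each.
_rng = random.Random(0)

def blob(cx, n):
    return [(cx + _rng.uniform(-1, 1), _rng.uniform(-1, 1), _rng.uniform(-1, 1))
            for _ in range(n)]

atoms = blob(0.0, 20) + blob(10.0, 20)
centres, groups = kmeans(atoms, 2)
# One sphere per subunit: radius is the distance to its farthest atom.
radii = [max((math.dist(c, p) for p in g), default=0.0)
         for c, g in zip(centres, groups)]
print(centres, radii)
```

Each (centre, radius) pair is then a candidate subunit for a simplified protein shape; choosing k is left to the modeler.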
Getting data on protein interactions
This is harder
Ideally would like binding sites, bond angles, bond strengths
Typically get "A does / does not interact with B (probably)"
... but the situation is set to improve as more data becomes available in databases
So, how to build models?
Cheat - use a mixture of real and hypothesised model proteins
A new protein interaction simulator
proteins modeled as simplified 3D structures including a number of subunits / binding sites / conformational states
water not modeled explicitly
proteins moved by Brownian motion
bonding / state transition probabilities set as parameters
collision detection in version 1
protein complexes modeled as rigid structures
membranes modeled as a restriction to 2D diffusion of membrane bound proteins (still free to rotate)
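The Brownian motion and membrane rules listed above can be sketched as a single position update. Assumptions, all hypothetical: the membrane is the plane of constant z, `diffusion` and `dt` are in arbitrary consistent units, and rotation is left out. The per-axis displacement uses the standard result that free diffusion over time dt has standard deviation sqrt(2 * D * dt) along each axis.

```python
import math
import random

def brownian_step(pos, diffusion, dt, rng, membrane_bound=False):
    """Return the new (x, y, z) after one Brownian step.

    Water is implicit: its effect appears only as random displacement.
    A membrane bound protein keeps its z coordinate, so it diffuses in 2D.
    """
    sigma = math.sqrt(2.0 * diffusion * dt)  # per-axis standard deviation
    x, y, z = pos
    x += rng.gauss(0.0, sigma)
    y += rng.gauss(0.0, sigma)
    if not membrane_bound:
        z += rng.gauss(0.0, sigma)
    return (x, y, z)

rng = random.Random(0)
free = brownian_step((0.0, 0.0, 0.0), diffusion=1.0, dt=0.01, rng=rng)
bound = brownian_step((0.0, 0.0, 5.0), diffusion=1.0, dt=0.01, rng=rng,
                      membrane_bound=True)
print(free, bound)
```

A fuller version would also apply a random rotation each step, since membrane bound proteins are still free to rotate.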
Example models
(1) Formation of the post synaptic density - a model of recruitment of AMPA receptors to the vicinity of activated NMDA receptors
(2) Self assembly of clathrin coated vesicles
(3) Transport of vesicles using kinesin
The common theme
Throw together an unordered collection of proteins, with specific binding sites, interactions and probabilities
Evolve the system through time
See if complex shapes and processes emerge
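The three steps above can be sketched as a toy loop. This is a deliberately crude illustration, not the simulator described in these slides: proteins are points with no shape or specific binding sites, any two that come within `contact` of each other may bind with probability `p_bind`, bonds never break, there are no walls, and a complex moves as one rigid body without rotating. All names and parameter values are hypothetical.

```python
import math
import random

def find(parent, i):
    """Union-find lookup: which complex does protein i belong to?"""
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i

def run(n=30, box=10.0, contact=1.0, p_bind=0.5, sigma=0.3, steps=200, seed=1):
    rng = random.Random(seed)
    # 1. Throw together an unordered collection of proteins.
    pos = [[rng.uniform(0.0, box) for _ in range(3)] for _ in range(n)]
    parent = list(range(n))
    # 2. Evolve the system through time.
    for _ in range(steps):
        shift = {}  # one random displacement per complex (rigid motion)
        for i in range(n):
            root = find(parent, i)
            if root not in shift:
                shift[root] = [rng.gauss(0.0, sigma) for _ in range(3)]
            pos[i] = [c + d for c, d in zip(pos[i], shift[root])]
        for i in range(n):
            for j in range(i + 1, n):
                ri, rj = find(parent, i), find(parent, j)
                close = math.dist(pos[i], pos[j]) < contact
                if ri != rj and close and rng.random() < p_bind:
                    parent[ri] = rj  # merge the two complexes
    # 3. See what emerged: sizes of the complexes, largest first.
    sizes = {}
    for i in range(n):
        root = find(parent, i)
        sizes[root] = sizes.get(root, 0) + 1
    return sorted(sizes.values(), reverse=True)

print(run())
```

Replacing the single contact rule with per-site binding rules and probabilities is where the specific shapes of the example models would come in.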
Example 1 - post synaptic density
Figure labels: CaMKII, glue, AMPAr, NMDAr
Example 2 - Vesicle formation
Clathrin:-
Example 3 - Kinesin
Input - a motor protein model, stable states / transitions / binding cause it to walk up microtubules carrying its payload
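A minimal sketch of how stable states plus probabilistic transitions can produce walking, assuming a two-headed motor on a one-dimensional track of numbered binding sites. The state names, the hand-over-hand rule and the probabilities `p_release` and `p_rebind` are illustrative assumptions, not the model used in the simulator.

```python
import random

def walk(ticks, p_release, p_rebind, rng):
    """Return the site index of the leading head after `ticks` time steps.

    Two stable states:
      both heads bound       -> trailing head releases with p_release
      trailing head detached -> it rebinds one site ahead of the leading
                                head with p_rebind (one step forward)
    The payload is assumed to travel with the motor.
    """
    front = 1           # leading head starts at site 1, trailing head at 0
    detached = False    # True while the trailing head is off the track
    for _ in range(ticks):
        if not detached:
            if rng.random() < p_release:
                detached = True
        elif rng.random() < p_rebind:
            front += 1  # old trailing head becomes the new leading head
            detached = False
    return front

print(walk(1000, p_release=0.2, p_rebind=0.5, rng=random.Random(0)))
```

Because the detached head can only rebind ahead, the motion is strictly one-directional; a less idealised sketch would also allow back-steps and full detachment from the microtubule.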
Details of simulator (and approaches tried)
Fluid dynamics? DPD? MD? Monte Carlo?
Simulator design:
XML model description (protein shapes, initial state, binding sites and probabilities)
Java simulation engine for state updates
Java3D visualisation
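To give a feel for what an XML model description of this kind might contain, here is a sketch with an invented schema. Every element name, attribute name and number below is hypothetical; the simulator's real format is not shown in these slides. The example is in Python for brevity, although the engine itself is written in Java.

```python
import xml.etree.ElementTree as ET

# Hypothetical model file: protein shapes, initial counts, binding sites
# and per-step binding / unbinding probabilities. All values are invented.
MODEL_XML = """
<model>
  <protein name="AMPAr" radius="2.0" count="50" membrane="true">
    <site name="tail" partner="glue" p_bind="0.3" p_unbind="0.01"/>
  </protein>
  <protein name="glue" radius="1.0" count="100" membrane="false">
    <site name="a" partner="AMPAr" p_bind="0.3" p_unbind="0.01"/>
    <site name="b" partner="CaMKII" p_bind="0.2" p_unbind="0.02"/>
  </protein>
</model>
"""

def load_model(text):
    """Parse the hypothetical schema into plain dictionaries."""
    root = ET.fromstring(text.strip())
    model = {}
    for protein in root.findall("protein"):
        model[protein.get("name")] = {
            "radius": float(protein.get("radius")),
            "count": int(protein.get("count")),
            "membrane": protein.get("membrane") == "true",
            "sites": [
                {
                    "name": site.get("name"),
                    "partner": site.get("partner"),
                    "p_bind": float(site.get("p_bind")),
                    "p_unbind": float(site.get("p_unbind")),
                }
                for site in protein.findall("site")
            ],
        }
    return model

model = load_model(MODEL_XML)
print(model["glue"]["sites"])
```

Keeping the model in a separate description file means the same engine can run the post synaptic density, vesicle and kinesin examples without code changes.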
Futures: modeling technology
Add spring constants to bonds (rather than completely rigid)
More sophisticated models of membranes (rather than a 2D restriction on diffusion)
Efficient cytoskeleton models?
Explicit water? Small ions?
Auto generation from databases of protein shapes and interactions?
Futures: applications
DNA replication machinery (helicase / polymerase)
SNAREs / vesicle docking / budding (a model of Golgi apparatus?)
Full molecular model of a dendritic spine receiving a burst of transmitter
Ribosome operation
Entire process of cell division (DNA replication + microtubule formation + motor protein separation + control sequences)
Self assembly of viruses from their coat proteins
A model of parallel processing?
How does this ordered soup of proteins maintain such a large number of tightly synchronised feedback control systems?
Could it be a useful model of computation in its own right?
The well mixed case is:-
we have a memory of 1,000,000 different variables (one per protein)
we have specific probabilities of transitions between these
we have mechanisms for synthesising and destroying proteins
Adding 3D structure we also get:-
some combinations of these variables form substructures with specific properties
interactions depend on where the proteins are
Conclusions
We can build 3D models of protein systems to test and visualise hypotheses about how structures can form
We still don't have a good way to model all the intracellular complexity
Perhaps we should focus on molecular models of viruses and bacteria before attempting eukaryotic cells?
Thanks to Dan Mossop for doing all the work