CS273 Algorithms for Structure and Motion in Biology
Instructors: Serafim Batzoglou and Jean-Claude Latombe
Teaching Assistant: Sam Gross
Email: | serafim | latombe | ssgross | @ cs.stanford.edu
Spring 2006 – http://www.stanford.edu/class/cs273/


Page 1: CS273 Algorithms for Structure and Motion in Biology

CS273
Algorithms for Structure and Motion in Biology

Instructors: Serafim Batzoglou and Jean-Claude Latombe

Teaching Assistant: Sam Gross

| serafim | latombe | ssgross | @ cs.stanford.edu

Spring 2006 – http://www.stanford.edu/class/cs273/

Page 2: CS273 Algorithms for Structure and Motion in Biology

Need a Scribe!!

Page 3: CS273 Algorithms for Structure and Motion in Biology

Range of Bio-CS Interaction

Enormous range over space and time:

• Gene: sequence alignment
• Molecules: molecular structures, similarities and motions
• Cells: simulation of cell interaction
• Tissue/Organs: soft-tissue simulation and surgical training
• Body system: robotic surgery

CS273 covers the gene and molecule end of this range.

Page 4: CS273 Algorithms for Structure and Motion in Biology

Focus on Proteins

Proteins are the workhorses of all living organisms

They perform many vital functions, e.g.:

• Catalysis of reactions
• Transport of molecules
• Building blocks of muscles
• Storage of energy
• Transmission of signals
• Defense against intruders

Page 5: CS273 Algorithms for Structure and Motion in Biology

Proteins are also of great interest from a computational viewpoint

• They are large molecules (a few hundred to several thousand atoms)
• They are made of building blocks (amino acids) drawn from a small “library” of 20 amino acids
• They have an unusual kinematic structure: a long serial linkage (backbone) with short side-chains
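The "long serial linkage" view can be made concrete with a toy forward-kinematics routine. This is only a sketch under strong assumptions: the chain is planar, the links are rigid rods, and the names `chain_positions`, `link_lengths` and `joint_angles` are invented for illustration. A real backbone is three-dimensional and its joints are torsional rotations about bonds.

```python
import math


def chain_positions(link_lengths, joint_angles):
    """Return the joint positions of a planar serial linkage.

    joint_angles[i] is the turn (in radians) of link i relative to the
    previous link; the first angle is measured from the x-axis.
    """
    if len(link_lengths) != len(joint_angles):
        raise ValueError("need exactly one joint angle per link")
    x, y, heading = 0.0, 0.0, 0.0
    positions = [(x, y)]
    for length, angle in zip(link_lengths, joint_angles):
        heading += angle                  # angles accumulate along the chain
        x += length * math.cos(heading)
        y += length * math.sin(heading)
        positions.append((x, y))
    return positions


# Three unit links: straight, then a 90-degree turn, then straight again.
backbone = chain_positions([1.0, 1.0, 1.0], [0.0, math.pi / 2, 0.0])
```

Because each angle is added to the running heading, changing one joint near the start of the chain moves every link after it. That coupling is what makes a long serial linkage unusual to compute with.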

Page 6: CS273 Algorithms for Structure and Motion in Biology

Proteins are associated with many challenging problems

• Predict folded structures and motion pathways
• Understand why some proteins misfold or partially fold, causing such diseases as cystic fibrosis, Parkinson's disease, and Creutzfeldt-Jakob disease (mad cow)
• Find structural similarities among proteins and classify proteins
• Find functional structural motifs in proteins
• Predict how proteins bind to other proteins and smaller molecules
• Design new drugs
• Engineer and design proteins and protein-like structures (polymers)

Page 7: CS273 Algorithms for Structure and Motion in Biology

Central Dogma of Molecular Biology

Page 8: CS273 Algorithms for Structure and Motion in Biology

Central Dogma of Molecular Biology

transcription

translation
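The two steps named on this slide can be sketched in a few lines of Python. The function names `transcribe` and `translate` are illustrative assumptions. The codon table is the standard genetic code written in the usual compact TCAG ordering. The sketch assumes the input is the coding strand, already in reading frame, with no ambiguous bases.

```python
BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def transcribe(coding_dna):
    """DNA coding strand -> mRNA: same sequence with U in place of T."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna):
    """mRNA -> one-letter amino-acid sequence, stopping at the first stop codon."""
    protein = []
    for start in range(0, len(mrna) - 2, 3):
        codon = mrna[start:start + 3].replace("U", "T")
        residue = CODON_TABLE[codon]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


mrna = transcribe("ATGGCCTAA")
protein = translate(mrna)
```

Here `ATG` codes for methionine (M), `GCC` for alanine (A), and `TAA` is a stop codon, so the toy gene yields a two-residue peptide.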

Page 9: CS273 Algorithms for Structure and Motion in Biology

Protein Sequence

[Figure: chemical structure of the polypeptide backbone, with residue i-1 labeled]

Long sequence of amino acids (dozens to thousands), also called residues

Dictionary of 20 amino acids (several billion years old)

Page 10: CS273 Algorithms for Structure and Motion in Biology

Protein Sequence

[Figure: polypeptide backbone with the peptide bond highlighted]

Peptide bond (partial double bond character)

Page 11: CS273 Algorithms for Structure and Motion in Biology

Central Dogma of Molecular Biology

Physiological conditions: aqueous solution, 37°C, pH 7, atmospheric pressure

Page 12: CS273 Algorithms for Structure and Motion in Biology

Levels of Protein Structures

hemoglobin (4 polypeptide chains)

Quaternary

Page 13: CS273 Algorithms for Structure and Motion in Biology

Mostly α-helices

Mostly β-sheets

Mixed

Page 14: CS273 Algorithms for Structure and Motion in Biology

Folding

Unfolded (denatured) state

Intermediate states

Folded (native) state

Many pathways

Page 15: CS273 Algorithms for Structure and Motion in Biology

http://www-shakh.harvard.edu/ProFold2.html

How (we think) a protein folds ...

G = H - TS
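The relation on the slide is the definition of the Gibbs free energy. For the folding transition it is commonly used in its difference form. The subscripts below are notation introduced here for illustration, not taken from the slides. Constant temperature and pressure are assumed, as in the physiological conditions listed earlier.

```latex
\Delta G_{\mathrm{fold}}
  = G_{\mathrm{folded}} - G_{\mathrm{unfolded}}
  = \Delta H - T\,\Delta S,
\qquad
\Delta G_{\mathrm{fold}} < 0 \ \text{for folding to be thermodynamically favorable}
```

Here \(H\) is enthalpy, \(T\) is absolute temperature and \(S\) is entropy. The chain loses conformational entropy as it folds, so the native state is favored only when the enthalpy and solvent-entropy contributions outweigh that loss.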


Page 20: CS273 Algorithms for Structure and Motion in Biology

Motion of Proteins in Folded State

HIV-1 protease

Page 21: CS273 Algorithms for Structure and Motion in Biology

Structural variability of the overall ensemble of native ubiquitin structures

[Shehu, Kavraki, Clementi, 2005]

Page 22: CS273 Algorithms for Structure and Motion in Biology

Amylosucrase

Flexible loop (Loop 7)

Page 23: CS273 Algorithms for Structure and Motion in Biology

Central Dogma of Molecular Biology

Page 24: CS273 Algorithms for Structure and Motion in Biology

Binding

Inhibitor binding to HIV protease

Protein-protein binding

Ligand-protein binding

Page 25: CS273 Algorithms for Structure and Motion in Biology

Binding of Pyruvate to LDH (lactate dehydrogenase)

(reduction of pyruvate to lactate)

[Figure: lactate dehydrogenase environment around the bound pyruvate and the coenzyme NADH (nicotinamide adenine dinucleotide). Labeled residues: ASP-195, HIS-193, ASP-166, ARG-169, THR-245, GLN-101, ARG-106; the loop is also labeled.]

Page 26: CS273 Algorithms for Structure and Motion in Biology

What is CS273 about?

Algorithms and computational schemes for molecular biology problems

Molecular biology seen by computer scientists

Page 27: CS273 Algorithms for Structure and Motion in Biology

The Shock of Two Cultures

y = f(x)

Biologists like experiments, specifics and classifications

They would rather know many (x_i, y_i), i.e., facts, and classify them, than know f

Computer scientists like simulation, abstractions, and general algorithms

They want to know f, the explanation of the facts, and efficient ways to compute it, but rarely care about any particular (x_i, y_i)

One challenge of Computational Biology is to fuse these two cultures

Page 28: CS273 Algorithms for Structure and Motion in Biology

Two Views of a BioComputation Class

Where IT resources for biology are available and how to use them

How to design efficient data structures and algorithms for biology

Page 29: CS273 Algorithms for Structure and Motion in Biology

Main Ideas Behind CS273

1. The information is in the sequence
   • Sequence → Structure (shape) → Function
   • Sequence similarity → Structural/functional similarity
   • Sequences are related by evolution

Page 30: CS273 Algorithms for Structure and Motion in Biology

Main Ideas Behind CS273

1. The information is in the sequence
   • Sequence → Structure (shape) → Function
   • Sequence similarity → Structural/functional similarity
   • Sequences are related by evolution

2. Biomolecules move and bind to achieve their functions
   • Deformation → folded structures of proteins
   • Motion + deformation → multi-molecule complexes
   • One cannot just “jump” from sequence to function

Protein folding

Ligand-protein binding

Page 31: CS273 Algorithms for Structure and Motion in Biology

Sequence → Structure → Function

sequence similarity

structure similarity
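One standard way to quantify sequence similarity is a global alignment score computed by dynamic programming (Needleman-Wunsch). The sketch below is a minimal version, not the course's method. The function name and the scoring values (`match`, `mismatch`, `gap`) are illustrative assumptions. Real protein comparisons normally use a substitution matrix and more refined gap penalties.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment score with a linear gap penalty.

    Only the previous row of the dynamic-programming table is kept, so
    memory is O(len(b)) while time is O(len(a) * len(b)).
    """
    prev = [j * gap for j in range(len(b) + 1)]    # empty prefix of a vs. prefixes of b
    for i in range(1, len(a) + 1):
        curr = [i * gap]                           # prefix of a vs. empty prefix of b
        for j in range(1, len(b) + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(
                prev[j - 1] + substitution,        # align a[i-1] with b[j-1]
                prev[j] + gap,                     # a[i-1] aligned to a gap
                curr[j - 1] + gap,                 # b[j-1] aligned to a gap
            ))
        prev = curr
    return prev[-1]


identical = global_alignment_score("GATTACA", "GATTACA")
one_gap = global_alignment_score("ACGT", "AGT")
```

With these parameters identical sequences score their length. Aligning `ACGT` with `AGT` needs one gap, giving three matches and one gap penalty.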

Page 32: CS273 Algorithms for Structure and Motion in Biology

Main Ideas Behind CS273

1. The information is in the sequence
   • Sequence → Structure (shape) → Function
   • Sequence similarity → Structural/functional similarity
   • Sequences are related by evolution

2. Biomolecules move and bind to achieve their functions
   • Deformation → folded structures of proteins
   • Motion + deformation → multi-molecule complexes
   • One cannot just “jump” from sequence to function

CS273 is about algorithms for sequence, structure and motion
- Finding sequence and shape similarities
- Relating structure to function
- Extracting structure from experimental data
- Computing and analyzing motion pathways
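For shape similarity, one simple measure is the distance RMSD (dRMSD), which compares the internal distance matrices of two structures and so needs no superposition. The sketch below assumes both structures list the same atoms in the same order. The name `distance_rmsd` is an illustrative choice. It is not presented as the measure used in the course, and it cannot distinguish a structure from its mirror image.

```python
import math
from itertools import combinations


def distance_rmsd(coords_a, coords_b):
    """Root-mean-square difference between corresponding interatomic distances.

    coords_a and coords_b are equal-length lists of (x, y, z) tuples with
    atoms in one-to-one correspondence. The value is unchanged by rigid
    translations and rotations of either structure.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("structures must have the same number of atoms")
    if len(coords_a) < 2:
        raise ValueError("need at least two atoms")
    squared_diffs = [
        (math.dist(coords_a[i], coords_a[j]) - math.dist(coords_b[i], coords_b[j])) ** 2
        for i, j in combinations(range(len(coords_a)), 2)
    ]
    return math.sqrt(sum(squared_diffs) / len(squared_diffs))


triangle = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
shifted = [(x + 5.0, y + 5.0, z + 5.0) for x, y, z in triangle]
same_shape = distance_rmsd(triangle, shifted)
```

The translated copy has the same internal distances, so its dRMSD to the original is zero up to floating-point error. The all-pairs loop is quadratic in the number of atoms, which is acceptable for a sketch but is one of the costs that more careful algorithms try to avoid.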

Page 33: CS273 Algorithms for Structure and Motion in Biology

Vision Underlying CS273

Goal of computational biology: low-cost, high-bandwidth in-silico biology

Requirements:
• Reliable models
• Efficient algorithms

Algorithmic efficiency by exploiting properties of molecules and processes:
• Proteins are long kinematic chains
• Atoms cannot bunch up together
• Forces have relatively short ranges

Computational Biology is more than applying computers to biological problems or mimicking nature (e.g., performing MD simulation)
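A classic illustration of exploiting these properties is a uniform grid (cell list) for finding interacting atom pairs. Forces have short range, so only atoms within a cutoff matter. Atoms cannot bunch up, so each grid cell holds a bounded number of atoms, and the search runs in roughly linear time instead of testing all pairs. The sketch below is a generic technique, not necessarily the one taught in CS273. The name `close_pairs` and its parameters are illustrative assumptions.

```python
import math
from collections import defaultdict
from itertools import product


def close_pairs(atoms, cutoff):
    """Return sorted index pairs (i, j), i < j, of atoms at most `cutoff` apart.

    atoms is a list of (x, y, z) tuples. Space is cut into cubic cells of
    side `cutoff`, so any pair within the cutoff lies in the same cell or
    in two adjacent cells.
    """
    grid = defaultdict(list)
    for index, (x, y, z) in enumerate(atoms):
        cell = (math.floor(x / cutoff), math.floor(y / cutoff), math.floor(z / cutoff))
        grid[cell].append(index)

    pairs = set()
    for (cx, cy, cz), members in grid.items():
        # Check the cell itself and its 26 neighbors.
        for dx, dy, dz in product((-1, 0, 1), repeat=3):
            neighbors = grid.get((cx + dx, cy + dy, cz + dz), ())
            for i in members:
                for j in neighbors:
                    if i < j and math.dist(atoms[i], atoms[j]) <= cutoff:
                        pairs.add((i, j))
    return sorted(pairs)


atoms = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 5.0, 5.0), (5.5, 5.0, 5.0)]
contacts = close_pairs(atoms, cutoff=1.5)
```

In this toy input the first two atoms are 1.0 apart and the last two are 0.5 apart, while the two groups are far from each other, so only two pairs are reported.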

Page 34: CS273 Algorithms for Structure and Motion in Biology

Tentative Schedule

 1  April 5   Introduction
 2  April 10  Protein geometric and kinematic models
 3  April 12  Conformational space
 4  April 17  Inverse kinematics and applications
 5  April 19  Sequence similarity
 6  April 24  Sequence similarity
 7  April 26  Sequence similarity
 8  May 1     Structure comparison
 9  May 3     Structure comparison
10  May 8     Protein phylogeny, clustering, and classification
11  May 10    Protein phylogeny, clustering, and classification
12  May 15    Energy maintenance
13  May 17    Energy maintenance
14  May 22    Structure prediction
15  May 24    Roadmap methods
16  May 31    Structure prediction
17  June 5    Structure prediction
18  June 7    TBA
19  June 12   Project presentations (2 hours)

Page 35: CS273 Algorithms for Structure and Motion in Biology

Instructors and TAs

Instructors:
– Serafim Batzoglou
– Jean-Claude Latombe

TA:
– Sam Gross

Emails: | serafim | latombe | ssgross | @ cs.stanford.edu

Class website: http://cs273.stanford.edu

Page 36: CS273 Algorithms for Structure and Motion in Biology

Expected Work

• Regular attendance at lectures and active participation
• Class scribing (assignments will depend on # of students)
• Exciting programming project: http://www.stanford.edu/class/cs273/project/project.html
  - Structure prediction
  - Clustering and distance metrics
  - Protein design
  - Something else

Page 37: CS273 Algorithms for Structure and Motion in Biology

Questions?