Mating, Mutation, DNA, Migration, A Cell/Organism, Other Cells/Organisms, Other Environmental Factors, Cell-cell signals, Change genome, Change environment, Produce, Remediate, Sequester, Support, Cure, Ecosystem/Tissue, Measure/manipulate here


Page 1

[Diagram labels: Mating; Mutation; Migration; DNA; A Cell/Organism; Other Cells/Organisms; Other Environmental Factors; Cell-cell signals; Change genome; Change environment; Produce; Remediate; Sequester; Support; Cure; Ecosystem/Tissue; Measure/manipulate here]

Page 2

Goals and Outcomes

[Diagram relating Scientific Goals to Technologies. Labels:]

•  Improved Annotation
•  High-throughput Physiology/Genetics
•  From macromolecular structure
•  Nonlinear Optimization
•  Multivariate Sensitivity Analysis
•  Statistical Graph Modeling Algorithms
•  Semi-automated model generation
•  Scalable methods for hybrid/multiscale sim.
•  Bifurcation analysis for large/multiphysics sys.
•  Integrated software systems
•  Experimental Design Support
•  Exascale Resources
•  Cloud Computing
•  Transparent resource use
•  Rapid Determination of Genomic Potential
•  Environmental change, growth, and evolution
•  Remediate, sequester, produce, cure
•  Automated model reduction (abstraction)

Page 3

Grand Challenges

•  Modeling Phenotype from Genotype Across the Tree of Life.
   –  Rapid reconstruction of cellular networks across multiple time and space scales.
      •  Predictive models of 10,000 organisms
      •  Mechanistic understanding of single cell dynamics
   –  Predict environmental “inputs” and outputs: “crack the signaling code”*
   –  Predict optimal growth conditions from genotype
   –  Optimize natural system activity of individuals and consortia
   –  Individualized genomics
•  Discover and Engineer Units of Function at all Network Scales
   –  Discover the evolutionary principles of function generation and reuse
   –  Design new function from composition of such modules

*Thanks, Mark

Page 4

Example Stories: Chris Henry (and Costas Maranas)

[Workflow diagram; boxes in reading order:]

•  Assemble metabolic/regulatory network for B. subtilis based on current annotations/data
•  Perform “[?]itin”-style experiments to further elucidate regulatory network
•  Construct an FBA model of B. subtilis that integrates boolean regulation
•  Add thermodynamic and kinetic constraints to the FBA model with parameter uncertainty
•  Use this model to identify the minimal deletions required to turn off every nonessential pathway
•  Enumerate the linearly independent solutions to this problem to produce a list of functionally equivalent pathway modules
•  Exploit B. subtilis natural competence to experimentally implement one solution as much as possible w/o viability loss ([count illegible] deletions)
•  Use model to design a set of phenotypic tests for our “minimal” strain that will maximize the opportunity to invalidate our model
•  Run phenotypic tests on “minimal” strain and compare results with model predictions
•  Modify model to maximize agreement between predictions and experimental results
•  Repeat this cycle, implementing a new minimal strain each time and assembling a set of pathway modules that may be reactivated at will later

Note: we are NOT minimizing genome but knocking out “branch” points in the pathways

Color key: desktop-scale computational steps in red; massive-scale computations in green; experimental steps in blue; steps requiring new algorithms in black.

Scientific objective: Develop a “complete” understanding of the metabolism of B. subtilis including identification of all transporters, the regulation of all metabolic pathways, and the pathway response to the environment.
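The "identify the minimal deletions" step of the workflow can be illustrated on a toy network. The sketch below is only a stand-in under stated assumptions: the six reactions and their names are hypothetical, and FBA is replaced by a much cruder reachability rule (a reaction fires once all of its substrates are available). It searches for the smallest knockout sets that block a target metabolite while biomass stays producible.

```python
from itertools import combinations

# Hypothetical toy network: reaction -> (substrates, products).
REACTIONS = {
    "uptake": (set(), {"A"}),
    "r1": ({"A"}, {"B"}),
    "r2": ({"B"}, {"biomass"}),
    "alt1": ({"A"}, {"C"}),
    "alt2": ({"C"}, {"biomass"}),
    "side": ({"B"}, {"D"}),
}

def producible(active):
    """Return the metabolites reachable using only the active reactions."""
    have = set()
    changed = True
    while changed:
        changed = False
        for name in active:
            subs, prods = REACTIONS[name]
            # A reaction fires when all substrates exist and it adds something new.
            if subs <= have and not prods <= have:
                have |= prods
                changed = True
    return have

def minimal_deletions(target, keep="biomass"):
    """Smallest knockout sets that block `target` while `keep` stays producible."""
    names = sorted(REACTIONS)
    for size in range(1, len(names) + 1):
        hits = []
        for ko in combinations(names, size):
            have = producible([n for n in names if n not in ko])
            if target not in have and keep in have:
                hits.append(set(ko))
        if hits:
            return hits  # stop at the first (smallest) size with any solution
    return []
```

Here `minimal_deletions("D")` returns two alternative single knockouts, which loosely mirrors the idea of functionally equivalent solutions; a real analysis would use flux constraints rather than reachability.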

Page 5

What switches on spores?

[Figure: two panels with axes AR (proteins/mRNA-s) and k3 (mRNA/s); regions labeled pulse, oscillation, graded, and bistable. Remaining labels are not recoverable.]

Page 6

Induction of Spo0B

[Figure panels: WT levels of Spo0B; Slight induction; Slightly larger induction]

Page 7

Impact

•  We have an opportunity to truly understand a whole life form and its interaction with its environment.
•  We can place it in evolutionary context and begin to understand how it and its parts arose from originally inorganic components.
•  We can discover the motifs of function that we can exploit for human purposes.

Page 8

Overall Conclusions

•  Data is still the main bottleneck, but there are badly scaling computational problems in its analysis.
•  Behavior is responsive to complex combinations of inputs and combinatorially affected by large numbers of genes; thus exploration of models or testing against data remains difficult.
•  For this effort to have a profound impact on biology we need high throughput, easy use, and fast, concise presentation of data, model, and comparison.
•  While there are fundamental theory and algorithms that need to be developed (and this is critical), transparent access to large computation resources for a large number of users is the key. Most tools will be embarrassingly parallel or close, but will still require outrageous numbers of processors for each project. And there will be many users.
•  Sadly, experiments seem to scale alongside computation: the larger and more complex the model, the more experiments are needed to test it.
   –  The scaling may not have near the same exponents, but it is far from trivial in time and cost.
   –  The data is “fragile” and needs to be tightly quality controlled, instantly accessible, and well annotated.
   –  Computation should be an everyday adjutant to experiment, but the computational throughput is currently too slow and complicated to be as generally useful as a knockout or microarray.

Page 9

Annotation

•  Sequencing is scaling such that 10k’s of genomes and even more metagenomic sequence will be sequenced per unit time.
•  New families are plateauing, but still 30-50% unannotated function. Need structure prediction help. Experiments are key.
   –  Investments
      •  New algorithms for phylogenetic annotation: N^2 Log(N)
         –  Deep theory of evolution for trees.
      •  New algorithms for structural annotation (see macromolecular group)
      •  New algorithms for “guilt-by-association” (M*N^2)
      •  Experiments to broaden functional assignments (NOrganism*nCond*Nreplic)
      •  Manual annotation/curation

Page 10

High-throughput Physiology/Genetics

•  Problems of scale
   –  From single molecule to populations of cells
   –  Multiplexed machines allow some whole-genome assays, but biochemistry is still a problem, imaging is still a problem (~N^3), and tech like mass spec itself has computational problems in spectra matching (~N^2)
   –  Genetics is still case-by-case
   –  Quality and reproducibility
   –  Centralization/dissemination of data and standards
•  Investments
   –  Determining methods for quality control of data collection with technologies highly distributed in individual laboratories (rather than a central resource).
   –  Investment in creating higher throughput/high content resources for biochemistry, single cell and single molecule (and in situ) imaging/measurement
   –  Investments in new culturing technologies, from microfluidics to bioreactors
   –  Massively parallel computation MUST be matched by massively parallel experiment.
   –  Increasing computation-in-the-loop design of experiments
      •  Mathematics and CS for Design of experiments
   –  Transparent integration of supercomputing and the HT experiment labs.
      •  Embedded systems engineering for placing controls on the instruments
   –  Visualization of complex data sets derived from these.
   –  Mathematics for reduction of nonlinear multivariate data sets.
   –  Image processing/segmentation
   –  Capture reagents (to target molecules for measurement)
   –  Experiments that seek to discover the communication mechanisms and molecules between cells.

Page 11

Statistical Graph Modeling Algorithms

•  Theory for determining best array of experiments to input into the model
   –  Which modalities
   –  Which conditions and their combinations
   –  Which time scales
   –  How many replicates
•  Currently, biclustering goes as N^2 to N^3, and propagation of uncertainty and branch/bound model reduction go as Exp(c Nedges), where c is a small constant.
•  Investments
   –  New theory for data integration and statistical modeling
   –  Validated test sets for algorithm comparison
   –  Mechanisms for data availability
   –  Detection of weak signals in high background noise (not subgraph isomorphism): sequences, correlations, quantifications
   –  Network motif detection
   –  Automated text mining/natural language processing
   –  Improve the feedback loop between statistical associations and annotation
      •  Better use of molecular interaction data and sequence/TF information

Page 12

Semi-automated model generation

•  Data Driven to Dynamic Models
   –  How do we use bioinformatic analysis of new data sets to build a model?
   –  How do we force “annotation” of more biochemical features that would aid modelers?
   –  Metabolic reconstruction through annotation
      •  Validation
      •  Feedback through inconsistency and holes
   –  Rapid update of models when data updates
      •  Link to papers
•  A taxonomy of model types that would allow you to traverse what kind of model to generate from data.
•  Establishment of standards for different types of model
   –  What is a “complete” model?
•  How do you integrate models of different parts of a larger system?
   –  Tools for making this easier.
   –  Controlled vocab, semantics and agreement with the bioinformatics folks on naming
      •  We’d like to, but we have no clue?
      •  Maybe accessibility to the Statistical models

Page 13

Scalable methods for hybrid/multiscale sim.

•  To understand how the 1D genome is transformed into 3D space.
•  Current PDE simulations scale as O(N) but…
•  Need for usable library and modeling frameworks that biologists can use easily.
•  Do this with high heterogeneity in the physics (stochastic/deterministic) and space.
•  Do this under uncertainty in data, parameters and (gasp) mechanism
•  Languages for model representations: Formal languages? (Languages for simulation control).
•  Investments
   –  How do we smoothly transition among levels of physical abstraction, from fully rendered spatial/molecular level simulation, through mesoscale stochastics, all the way to smoothed ODE?
   –  How do you compare spatial/temporal data and models in phylogenetic context?
   –  How do you put together genetic/genomic data and spatial modeling?
   –  How do you build complex heterogeneous models?
   –  How do you build hybrid stochastic/deterministic models/statistical
   –  New integrators (like Magnus Expansions, spectral) to support accurate
   –  Measurement technology of biochemical activity (and forces, etc.) in live cells.

Page 14

Automated model reduction

•  Formal coarse graining, scale separation, and “balanced truncation”
   –  Ronald Coifman?
•  Automated nondimensionalization?
•  Linear algebraic solutions for the above.
   –  Pseudospectral methods?
   –  Regularization
•  Response surface methods and functional approximation.

Page 15

Nonlinear Optimization

•  Parameter estimation and Model Testing
   –  Determination of the error bounds on the model
   –  Parameter feasibility regions
      •  Changes in regions upon compositing of submodels
   –  Theory of module impedance and retroactivity
   –  Model Invalidation
      •  Inconsistency
   –  Data invalidation
   –  Algorithms for global optimization by clever parameter motion
      •  Quasirandom search
   –  How do we get to the tails of the distributions of our “stochastic” models?
•  Measures of parameter conservation between organisms as a result and proxy for evolutionary pressure.
•  Model Selection/Model Modification: minimal number of moves to explain the maximum amount of data. (Exp(N) but practically N^2 to N^3).
•  Multiobjective optimization
•  Integer Optimization
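As an illustration of the "quasirandom search" bullet, here is a minimal sketch that scans a parameter box with a Halton low-discrepancy sequence and keeps the best point. The `misfit` objective, its optimum, and the parameter bounds are hypothetical; in practice the best point would usually seed a local optimizer.

```python
def halton(index, base):
    """Radical-inverse (van der Corput) value of `index` in the given base."""
    result, f = 0.0, 1.0 / base
    while index > 0:
        result += f * (index % base)
        index //= base
        f /= base
    return result

def quasirandom_search(objective, bounds, n_points=512):
    """Evaluate `objective` on a Halton point set and return the best point found."""
    primes = [2, 3, 5, 7, 11, 13]  # one base per dimension; up to 6 parameters
    best_x, best_val = None, float("inf")
    for i in range(1, n_points + 1):
        x = [lo + (hi - lo) * halton(i, primes[d])
             for d, (lo, hi) in enumerate(bounds)]
        val = objective(x)
        if val < best_val:
            best_x, best_val = x, val
    return best_x, best_val

def misfit(p):
    """Hypothetical two-parameter misfit with its minimum at (0.3, 2.0)."""
    return (p[0] - 0.3) ** 2 + 0.1 * (p[1] - 2.0) ** 2

best_point, best_value = quasirandom_search(misfit, [(0.0, 1.0), (0.0, 4.0)])
```

Unlike pseudorandom sampling, the Halton points fill the box evenly, so coverage improves steadily as `n_points` grows.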

Page 16

Multivariate Sensitivity Analysis

•  Difference methods fail for stochastic systems
   –  New algorithm
   –  Petzold & Doyle?
•  High dimensional sensitivity
   –  Sethna and sloppiness.
      •  Eigendirections.
   –  David Rand (Warwick)
   –  FAST (Fourier Amplitude Sensitivity Test), Saltelli
   –  Denise Kirchner
•  Understanding where to do your experiments
•  How do we link sensitivity to evolvability?
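The first bullet can be made concrete with a toy example. The sketch below, which assumes a hypothetical constant-rate birth process, estimates the sensitivity of an event count to its rate by forward differences. With independent runs the estimate is swamped by sampling noise; reusing the random stream (common random numbers, a standard variance-reduction trick and not necessarily the "new algorithm" the slide refers to) gives a far tighter estimate.

```python
import random
import statistics

def count_events(rate, t_end, rng):
    """Simulate a constant-rate birth process; return the event count by t_end."""
    t, n = 0.0, 0
    while True:
        t += rng.expovariate(1.0) / rate  # unit-rate waiting time rescaled by rate
        if t > t_end:
            return n
        n += 1

def fd_sensitivity(rate, h, t_end, seed, common):
    """Forward-difference estimate of d(count)/d(rate) from one pair of runs."""
    rng_a = random.Random(seed)
    # Common random numbers reuse the seed; otherwise the second run is independent.
    rng_b = random.Random(seed if common else seed + 10_000)
    perturbed = count_events(rate + h, t_end, rng_b)
    nominal = count_events(rate, t_end, rng_a)
    return (perturbed - nominal) / h

def estimator_variance(common, n_rep=200, rate=10.0, h=0.5, t_end=1.0):
    """Sample variance of the finite-difference estimate over n_rep seeds."""
    estimates = [fd_sensitivity(rate, h, t_end, s, common) for s in range(n_rep)]
    return statistics.variance(estimates)
```

Comparing `estimator_variance(False)` with `estimator_variance(True)` shows the gap; for this process the expected count is `rate * t_end`, so the true sensitivity is `t_end`.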

Page 17

Bifurcation analysis for large/multiphysics sys.

•  Auto, Oscill8, MatLab (CONTENT)
   –  Crashes around 25 differential equations
•  How does bifurcation change with composed models?
   –  High dimensional (co-dim 3 and higher)
•  Classification of dynamics of a model
   –  Rapid inference of possibility for bifurcation from models.
   –  Chemical Reaction Network Theory (Feinberg)
   –  Different physical model classes.
•  Fully discrete and stochastic methods
   –  Do deterministic bifurcations survive realistic noise?
   –  Do new bifurcations arise due to noise?
•  Map parameters to genotype to determine which genotypes lead to bifurcation.
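For a one-variable model, the "classification of dynamics" idea can be sketched by counting steady states as a parameter changes. The example assumes a hypothetical self-activating gene, dx/dt = a + x^2/(1 + x^2) - r x, and brackets steady states by sign changes on a grid. This brute-force scan is for illustration only and is not a substitute for continuation tools such as those listed above.

```python
def rate(x, r, a=0.02):
    """dx/dt for a hypothetical self-activating gene (basal rate a, decay r)."""
    return a + x * x / (1.0 + x * x) - r * x

def count_steady_states(r, x_max=10.0, n=2000):
    """Count sign changes of dx/dt on a grid; each one brackets a steady state."""
    count = 0
    prev = rate(0.0, r)
    for i in range(1, n + 1):
        cur = rate(x_max * i / n, r)
        if prev * cur < 0:
            count += 1
        prev = cur
    return count

def scan(r_values):
    """Label each decay rate as bistable (3 steady states) or monostable."""
    return {
        r: ("bistable" if count_steady_states(r) == 3 else "monostable")
        for r in r_values
    }
```

Three steady states (stable, unstable, stable) indicate bistability; the decay rate at which the count drops from three to one marks a saddle-node bifurcation of this toy model.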

Page 18

Experimental Design Support

•  Optimize choice of data to answer question based on model
   –  E.g. best data to estimate parameters
   –  E.g. best data to discriminate between models
   –  E.g. best data to discover missing pieces of model
   –  E.g. best data to reduce the uncertainty in model prediction.
   –  Tidor
   –  Arkin/Flaherty
•  The automated “Adam”: the geneticist.
•  Issues of resolution
   –  Time resolution
   –  Relative vs. absolute measurement
   –  Measures of constraint?
   –  What data do you need to deal with multiscale models
•  Closed loop control of Biology
   –  Think Herschel Rabitz
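The "best data to estimate parameters" case can be sketched as a D-optimal design problem. The example assumes a hypothetical decay model y = amp * exp(-k t) with unit-variance measurement noise and picks the sampling times that maximize the determinant of the Fisher information matrix. The candidate times and nominal parameter values are illustrative only.

```python
import math
from itertools import combinations

def sensitivities(t, amp, k):
    """Partial derivatives of y = amp * exp(-k t) with respect to (amp, k)."""
    e = math.exp(-k * t)
    return (e, -amp * t * e)

def fisher_det(times, amp, k):
    """Determinant of the 2x2 Fisher information matrix (unit-variance noise)."""
    f_aa = f_ak = f_kk = 0.0
    for t in times:
        s_a, s_k = sensitivities(t, amp, k)
        f_aa += s_a * s_a
        f_ak += s_a * s_k
        f_kk += s_k * s_k
    return f_aa * f_kk - f_ak * f_ak

def best_design(candidates, n_points, amp, k):
    """Exhaustively pick the n_points sampling times with the largest determinant."""
    return max(
        combinations(candidates, n_points),
        key=lambda ts: fisher_det(ts, amp, k),
    )

design = best_design([0.0, 0.5, 1.0, 2.0, 4.0, 8.0], 2, amp=1.0, k=0.5)
```

With these nominal values the chosen pair is t = 0 and t = 1/k. Because the design depends on the parameter guess, it is typically recomputed as estimates improve.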

Page 19

It all depends on what the meaning of “Mission” is.

Yes, we can!