A Biological Solution to a Fundamental Distributed Computing Problem

Yehuda Afek,1* Noga Alon,1,2* Omer Barad,3* Eran Hornstein,3 Naama Barkai,3 Ziv Bar-Joseph4

Computational and biological systems are often distributed so that processors (cells) jointly solve a task, without any of them receiving all inputs or observing all outputs. Maximal independent set (MIS) selection is a fundamental distributed computing procedure that seeks to elect a set of local leaders in a network. A variant of this problem is solved during the development of the fly's nervous system, when sensory organ precursor (SOP) cells are chosen. By studying SOP selection, we derived a fast algorithm for MIS selection that combines two attractive features. First, processors do not need to know their degree; second, it has an optimal message complexity while only using one-bit messages. Our findings suggest that simple and efficient algorithms can be developed on the basis of biologically derived insights.

Computational and mathematical methods are extensively used to analyze and model biological systems (1–3). We provide an example of the reverse of this strategy, in which a biological process is used to derive a solution to a long-standing computational problem.

In distributed computing, a large number of processors jointly and distributively solve a task, without any of the processors getting all of the inputs or observing all of the outputs (4). All large-scale computing efforts, from web search to airplane control systems, use distributed computing algorithms to reach agreement, overcome failures, and decrease response times. Biological processes are also distributed. Parallel pathways are used to transform environmental signals to gene expression programs, and several tasks are jointly performed by independent cells without clear central control.

A long-standing distributed computing problem is that of electing a set of local leaders [the maximal independent set (MIS)] in a network of connected processors (4).
The MIS is used to determine a backbone for wireless networks, for routing, and in several other network protocols (5). Formally, a MIS is defined as a set of processors (nodes) A so that every node in the network is either in A or directly connected to a node in A, and no two nodes in A are connected (Fig. 1A). Distributively electing a MIS has been considered a challenging problem for three decades (6). In particular, when all nodes are initially identical, constructing a MIS by using deterministic algorithms is impossible (7), necessitating probabilistic approaches. Luby (8) and Alon et al. (9) presented fast probabilistic algorithms for electing a MIS. In these algorithms, nodes change their probability of being elected based on the number of active neighbors they have (nodes that are not yet connected to nodes in A), and they require processors to send messages the sizes of which are a function of the number of nodes in the network. Recent methods were proposed that partially remove either of these assumptions (10, 11), but to date, no method has been able to efficiently reduce message complexity without assuming knowledge of the number of neighbors. These are important requirements for deployment of large, ad hoc sensor networks.

The selection of neural precursors during the development of the nervous system resembles the MIS election problem. The precursors of the fly's sensory bristles [sensory organ precursors (SOPs)] are selected during larvae and pupae development from clusters of equivalent cells. The selection of the small bristles' precursors (microchaetes) (Fig. 1B) is initiated 8 to 10 hours after pupae formation, when several elongated clusters of proneural cells, containing between 20 and 30 cells each, appear at specific positions in the imaginal discs, which will later become the fly's wings and notum. During the next 3 hours, SOPs begin to appear within these clusters.
A cell that is selected as a SOP inhibits its neighbors by expressing high levels of the membrane-bound protein Delta, which binds and activates the transmembrane receptor protein Notch on adjacent cells (12). This lateral-inhibition process is highly accurate (13), resulting in a regularly spaced pattern in which each cell is either selected as SOP or is inhibited by a neighboring SOP (Fig. 1C). Thus, as in the MIS problem, every proneural cluster must elect a set of SOPs (A) so that every cell in the cluster is either in A or connected to a SOP in A, and no two SOPs in A are adjacent.

Extensive studies and mathematical modeling were used to define the molecular components mediating SOP selection and the mechanism underlying selection. These studies suggest several similarities between the mechanism underlying SOP selection and current algorithms for MIS election (14). First, the selection of a particular cell as a SOP is a random event governed by an underlying stochastic process (15, 16). Second, similar to computational requirements, SOP selection is probably constrained in time because the default of all cluster cells is to become SOPs unless they are inhibited (17). Lastly, in computational algorithms (8, 9) processors send messages only when they propose their candidacy to become leaders, thus reducing communication complexity. Mathematical modeling by us and others suggests that this might also be the case during SOP selection: Before being selected, the ability of Delta to activate Notch on adjacent cells is inhibited by their interaction within the same cell, enabling communication between cells only after selection (18–20).

Fig. 1. Computational and biological overview. (A) Illustration of a MIS. Edges represent communication channels. (Left) Processors in a network are initially identical. (Right) Following a MIS selection algorithm, a set of local leaders (blue computers) is elected so that each computer is either a local leader or connected to a local leader. No two local leaders can be neighbors in the network. (B) The notum of an adult fly, presenting the typical pattern of small and large bristles (microchaetes and macrochaetes, respectively). Microchaetes are surrounded by a dashed line. (C) Illustration of SOPs in flies. (Left) Cells in a cluster are initially equivalent. (Right) Following a SOP selection process, selected SOPs (blue cells) inhibit their physical neighbors (red cells), and so for the cluster depicted in this figure, no more SOPs can be selected.

1 Blavatnik School of Computer Science and Sackler School of Mathematics, Tel Aviv University, Tel Aviv 69978, Israel.
2 Institute for Advanced Study, Princeton, NJ 08544, USA.
3 Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 76100, Israel.
4 School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA.
*These authors contributed equally to this work.
To whom correspondence should be addressed. E-mail: naama.barkai@weizmann.ac.il (N.B.); zivbj@cs.cmu.edu (Z.B.-J.)

REPORTS, www.sciencemag.org, SCIENCE, VOL 331, 14 JANUARY 2011, p. 183

Although similar, the biological solution differs from computational algorithms in at least two aspects. First, SOP selection is probably performed without relying on knowledge of the number of neighbors that are not yet selected. Second, mathematical analysis demonstrated that SOP selection requires nonlinear inhibition that in effect reduces communication to the simplest set of possible messages (binary) (21, 22).

These last two aspects of the biological process are attractive because they could greatly simplify applications of MIS selection. We thus examined SOP selection more closely in order to determine whether better understanding of this process can lead to an efficient algorithm for MIS selection.
To verify the stochastic nature of this process, we first monitored the in vivo selection using a fluorescence reporter for Notch activity (17). We focused on the selection of SOPs in two symmetrical rows of proneural clusters [the fifth rows of notum microchaetes (Fig. 2A)], which occur before head eversion. These clusters consist of 10 rows with two cells each, giving rise to four SOPs per cluster. Measuring the SOP selection times in 10 different pupae (20 distinct clusters) revealed a bias for early selection of the lowest SOP, probably reflecting an earlier initiation of this part of the cluster. However, the upper three SOPs appeared in seemingly random order (Fig. 2B), supporting previous evidence for stochastic selection (23).

A defining aspect of algorithms for MIS selection is the per-round probability that a node joins the MIS. Current algorithms (8, 9) optimize this rate by dynamically increasing it when the number of active neighbors a node has decreases. During SOP selection, cells do not know the number of nonselected neighbors. However, the temporal selection rate may still be optimized by cell-autonomous mechanisms, for example by stochastically accumulating a protein (such as Delta) until it passes some threshold. Characterizing the stochastic accumulation rate is thus a key for understanding the biological selection strategy. To determine this rate, we compared statistics derived from the observed SOP selection times with several in silico stochastic accumulation models. The models differ in the way by which stochasticity is introduced (14). Results of two of these models were consistent with our experimental data (Fig. 3). The first consistent model assumed a fixed rate of accumulation over time, and we concluded that it is not appropriate for computer networks (14). In contrast, the second model assumed a burst-like protein production in which the likelihood of bursting increases in time, resembling a computational algorithm for MIS in which the selection probability also increases in time as the number of active processors decreases. We thus asked whether we can develop an algorithm for MIS selection on the basis of this stochastic rate change model that would not require knowledge about the number of active neighbors and would only use threshold (binary) communication.

Fig. 2. Time-lapse imaging of notum microchaetes SOPs selection in a live pupa. (A) (Bottom) Time-lapse imaging of notum microchaetes SOPs selection in a live pupa of hsflp; ma-dsRED;FRT80B,ubi-NLS-GFP strain after the selection of the microchaetes SOPs at the fifth row of the left and right clusters (area surrounded by a dashed white curve). (Top) Annotated image highlighting the proneural, inhibited, and selected cells in the fifth row of the bottom panels. Proneural clusters are marked with gray, SOPs with blue, and non-SOPs with red. SOPs are identified by the up-regulation of ma-dsRED in adjacent cells. We followed the selection process from 145 min before head eversion (HE) to HE, corresponding to ~9.5 to 12 hours after pupa formation. (B and C) Statistics of SOP selection order from time-lapse imaging of ten pupae. The y axis represents the movie number. The x axis corresponds to the four SOPs selected in each row on the left and right sides (L1 to L4 and R1 to R4) ordered from bottom to top. Color in each (x, y) coordinate represents the order (1 to 4) in which this SOP was selected (see color legend on the right). (C) Average and SD of SOP selection order for L1 to L4 and R1 to R4; x axis is the same as in (B).

Fig. 3. Comparison of experimental and simulated results. [Bar chart titled "Ratio between selection time differences"; x axis: Accumulation, Fixed accumulation, Rate change, Fixed rate; y axis: mean ratio, 0 to 3.] We computed the average ratio for the difference between the selection times of SOPs in the movies we analyzed. The dashed line represents the average observed in our experiments (1.98). We simulated four stochastic models. In the accumulation model, cells accumulate random amounts of Delta at each step until reaching a threshold. The fixed accumulation and rate change models are described in the paper. In the fixed rate model, cells use the same burst distribution throughout the process (14). Values are based on 20,000 runs for each model. Error bars represent SD.

We assumed a collection of identical processors placed at nodes of an arbitrary synchronous communication network. Nodes can only broadcast one-bit messages. A message broadcasted by a node reaches all of its neighbors that are still active in the algorithm. In each round, a processor can only tell whether or not a message was sent to it. When a processor receives a message, it cannot tell which of its neighboring processors sent it, and it cannot count the number of messages received in a round. Hence, our model is appropriate for radio networks with collision detection. We assumed that nodes receive as input an upper bound on the number of nodes in the network (n) and an upper bound, D, on the number of neighbors any node can have (if no such bound is known, we set D to n). We further assumed that no failures occur. The algorithm, presented in Table 1, is synchronously executed by all nodes.

The algorithm proceeds in log D phases, each consisting of M log n steps, where M is a constant; its value is given in (14). Initially, all nodes are active. Each step in each phase i consists of two exchanges. In the first exchange, each active node broadcasts a message to its neighbors with probability $p_i$. As in the biological model, the probability $p_i$ increases with i. In the second exchange, a node that has broadcasted a message in the first exchange joins the MIS if none of its neighbors had broadcasted at the first exchange.
Such a node again broadcasts a message to its neighbors, telling them to become inactive, and exits the algorithm.

We proved that when the algorithm terminates, no two neighboring nodes are in A (the MIS set), and that every node that has become inactive has a neighbor in A [the proof can be found in (14)]. We thus conclude that the only way the algorithm may err is by terminating while leaving some nodes that are not in A and are also not connected to nodes in A. Next, we show that when the algorithm terminates all nodes are, with high probability, either in A or connected to a node in A, which solves the MIS problem.

The proof and the complete analysis are provided in (14). Briefly, the proof relies on an inductive argument to show that with high probability, by the time phase i ends (14) there are no active nodes with more than $D/2^i$ active neighbors. Thus, nodes with many neighbors either leave the algorithm (joining A or being eliminated when a neighbor joins A) or lose many of their neighbors at each phase as these neighbors exit the algorithm. By the time the algorithm ends, i equals log D, and so all nodes that have not joined A are, with high probability, not connected to any active node (and are also not connected to any node in A) and thus can join A with no collisions.

The running time of the algorithm is $O(\log n \log D)$, which is the number of rounds required to execute the two nested loops. The worst-case running time is $O(\log^2 n)$. All messages in the algorithm are one bit. We prove in (14) that the expected number of messages sent to active nodes in our algorithm is linear in the number of nodes of the network, which is optimal because each node is required to at least receive a message from its local leader.

Taken together, by studying a developmental process in flies we devised a solution to an important distributed computing problem. The new algorithm does not require knowledge of the degree of individual processors, uses one-bit messages, and has an optimal message complexity.
These features are useful for many applications, including wireless communication systems and ad hoc sensor networks.

Biologists are increasingly relying on advanced modeling techniques. The other direction, using insights from biology to advance computational systems, has mainly focused on optimization techniques inspired by biological observations, including neural networks, genetic algorithms, and routing (24). We have shown that areas of computer science that require strict, provable guarantees can also benefit from knowledge regarding how biological systems operate. Better understanding of these biological systems can lead to further improvement in the design of complex distributed computing systems.

References and Notes
1. P. Flicek, E. Birney, Nat. Methods 6 (11s), S6 (2009).
2. S. F. Altschul et al., Nucleic Acids Res. 25, 3389 (1997).
3. U. Alon, An Introduction to Systems Biology: Design Principles of Biological Circuits (Chapman & Hall/CRC, Boca Raton, FL, 2006).
4. N. Lynch, Distributed Algorithms (Morgan Kaufmann, San Francisco, 1996).
5. P. J. Wan, K. M. Alzoubi, O. Frieder, Mobile Netw. Appl. 9, 141 (2004).
6. L. G. Valiant, Proceedings of the 7th IBM Symposium on Mathematical Foundations of Computer Science (IBM, New York, 1982).
7. S. Cohen, D. Lehmann, A. Pnueli, Theor. Comput. Sci. 34, 215 (1984).
8. M. Luby, SIAM J. Comput. 15, 1036 (1986).
9. N. Alon, L. Babai, A. Itai, J. Algorithms 7, 567 (1986).
10. Y. Métivier, J. M. Robson, N. Saheb-Djahromi, A. Zemmari, An optimal bit complexity randomized distributed MIS algorithm, in Proceedings of the 16th International Colloquium on Structural Information and Communication Complexity (SIROCCO), pp. 323–337 (2009).
11. T. Moscibroda, R. Wattenhofer, Maximal independent sets in radio networks, in Proceedings of the Twenty-Fourth Annual ACM SIGACT-SIGOPS Symposium on Principles of Distributed Computing (PODC), pp. 148–157 (2005).
12. S. J. Bray, Nat. Rev. Mol. Cell Biol. 7, 678 (2006).
13. J. M. Rendel, B. L. Sheldon, D. E. Finlay, Genetics 52, 1137 (1965).
14. Materials and methods are available as supporting material on Science Online.
15. P. Simpson, Curr. Opin. Genet. Dev. 7, 537 (1997).
16. M. E. Fortini, Dev. Cell 16, 633 (2009).
17. B. Castro et al., Development 132, 3333 (2005).
18. A. C. Miller, E. L. Lyons, T. G. Herman, Curr. Biol. 19, 1378 (2009).
19. O. Sprinzak et al., Nature 465, 86 (2010).
20. O. Barad, D. Rosin, E. Hornstein, N. Barkai, Sci. Signal. 3, ra51 (2010).
21. J. R. Collier, N. A. Monk, P. K. Maini, J. H. Lewis, J. Theor. Biol. 183, 429 (1996).
22. E. Plahte, J. Math. Biol. 43, 411 (2001).
23. K. Usui, K.-i. Kimura, Roux's Arch. Dev. Biol. 203, 151 (1993).
24. A. Tero et al., Science 327, 439 (2010).
25. We thank J. W. Posakony for fly stock and D. Rosin for help in experiments. This work was supported by a European Research Council (ERC) Advanced grant, the Oswald Veblen Fund, and the Bell Companies Fellowship (N.A.), the Azrieli Foundation (O.B.), the Israel Science Foundation (E.H. and N.B.), ERC (IDEA), the Hellen and Martin Kimmel award for innovative investigations (N.B.), NIH grant 1RO1 GM085022, and NSF CAREER award 0448453 (Z.B.-J.).

Supporting Online Material
www.sciencemag.org/cgi/content/full/331/6014/183/DC1
Materials and Methods
SOM Text
Figs. S1 to S7
Table S1
References
Movies S1 and S2

10.1126/science.1193210

Table 1. MIS algorithm.

Algorithm: MIS (n, D) at node u
For i = 0: log D
    For j = 0: M log n    // M is constant derived below
        * exchange 1 *
        v = 0
        With probability $1/2^{\log D - i}$ broadcast B to neighbors and set v = 1    // B is one bit
        If received message from neighbor, then v = 0
        * exchange 2 *
        If v = 1 then
            Broadcast B; join MIS; exit the algorithm
        Else
            If received message B in this exchange, then mark node u inactive; exit the algorithm
        End
    End
End
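The two exchanges of Table 1 can be tried out with a small centralized simulation. The following Python sketch is illustrative only and is not the authors' implementation: the names `one_bit_mis` and `is_mis` are hypothetical, the default `m_const=4` is an arbitrary stand-in for the constant M (whose value the paper defers to the supporting material), and taking logarithms base 2 and rounding them up is an assumption. The broadcast probability follows Table 1, so it is 1/D in the first phase and 1 in the last. The loop also stops early once no node is active, which does not change the result.

```python
import math
import random


def one_bit_mis(adj, n_bound=None, d_bound=None, m_const=4, seed=None):
    """Simulate Table 1 synchronously on an undirected graph.

    adj maps each node to the set of its neighbours (must be symmetric).
    Returns (mis, active): the elected set A and any nodes that are still
    active when the nested loops end (empty with high probability).
    """
    rng = random.Random(seed)
    n = max(2, n_bound or len(adj))
    d = max(1, d_bound or max((len(nb) for nb in adj.values()), default=1))
    phases = math.ceil(math.log2(d))                   # "log D" (rounded up: assumption)
    steps = max(1, math.ceil(m_const * math.log2(n)))  # "M log n" (M is a placeholder)
    active, mis = set(adj), set()
    for i in range(phases + 1):                        # i = 0 .. log D
        p = 2.0 ** (i - phases)                        # 1 / 2^(log D - i)
        for _ in range(steps):
            # Exchange 1: every active node sends one bit with probability p.
            sent = {u for u in sorted(active) if rng.random() < p}
            # v stays 1 only for senders that heard nothing from a neighbour.
            winners = {u for u in sent if not (adj[u] & sent)}
            # Exchange 2: winners join the MIS; their active neighbours go inactive.
            silenced = {w for u in winners for w in adj[u]} & active
            mis |= winners
            active -= winners | silenced
            if not active:
                return mis, active
    return mis, active


def is_mis(adj, chosen):
    """Check the definition: no two chosen nodes adjacent, every node covered."""
    independent = all(not (adj[u] & chosen) for u in chosen)
    maximal = all(u in chosen or (adj[u] & chosen) for u in adj)
    return independent and maximal


if __name__ == "__main__":
    # A 12-node ring, used here purely as a toy network.
    ring = {k: {(k - 1) % 12, (k + 1) % 12} for k in range(12)}
    mis, leftover = one_bit_mis(ring, seed=1)
    print(sorted(mis), sorted(leftover), is_mis(ring, mis))
```

In this sketch the elected set is independent by construction, because two neighboring senders silence each other in exchange 1. Maximality holds only when `leftover` is empty, which mirrors the paper's statement that the algorithm succeeds with high probability rather than always.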