


Neural Networks 24 (2011) 918–926

Contents lists available at SciVerse ScienceDirect

Neural Networks

journal homepage: www.elsevier.com/locate/neunet

2011 Special Issue

PLATO: Data-oriented approach to collaborative large-scale brain system modeling

Takayuki Kannon a,∗, Keiichiro Inagaki b, Nilton L. Kamiji a, Kouji Makimura a, Shiro Usui a,b

a Laboratory for Neuroinformatics, RIKEN Brain Science Institute, Hirosawa, 2-1, Wako, Saitama 351-0198, Japan
b Computational Science Research Program, RIKEN, Hirosawa, 2-1, Wako, Saitama 351-0198, Japan

Article info

Keywords: Neuroinformatics, Large-scale model, Common data format, Data-oriented, Modeling and simulation technique

Abstract

The brain is a complex information processing system, which can be divided into sub-systems, such as the sensory organs, functional areas in the cortex, and motor control systems. In this sense, most of the mathematical models developed in the field of neuroscience have mainly targeted a specific sub-system. In order to understand the details of the brain as a whole, such sub-system models need to be integrated toward the development of a neurophysiologically plausible large-scale system model. In the present work, we propose a model integration library where models can be connected by means of a common data format. Here, the common data format should be portable so that models written in any programming language, computer architecture, and operating system can be connected. Moreover, the library should be simple so that models can be adapted to use the common data format without requiring any detailed knowledge on its use. Using this library, we have successfully connected existing models reproducing certain features of the visual system, toward the development of a large-scale visual system model. This library will enable users to reuse and integrate existing and newly developed models toward the development and simulation of a large-scale brain system model. The resulting model can also be executed on high performance computers using Message Passing Interface (MPI).

© 2011 Elsevier Ltd. All rights reserved.

1. Introduction

The brain can be considered as a large-scale information processing system, which flexibly performs functions such as recognition, perception, learning, memory, and motor control. It receives information from the external world through sensory systems, processes it based on learned memory, and generates output motor commands through motor control systems. The elucidation of the complicated information processing underlying those functions is extremely important. In order to grasp the mechanisms underlying such functions, numerous neurophysiological studies have been conducted, and models describing individual features have been developed. To uncover the complicated information processing in the whole brain system, detailed models of each sub-system should be constructed and integrated. To this end, several projects, such as those reviewed by de Garis, Shuo, Goertzel, and Ruiting (2010) and Goertzel, Ruiting, Arel, de Garis, and Shuo (2010), have attempted to develop large-scale models that target specific brain functions and system models that integrate sub-system models developed by different research groups. However, in practice, models are described at different levels (e.g., functional, computational, realistic), in different programming languages, and with different data structures. Even when dealing with the same function or object, the models’ inputs, outputs, and parameters may have different formats. For this reason, even if two models are described using the same simulator or language, their integration becomes very complicated, and much effort may be required to adapt the codes to each other.

∗ Corresponding author. Fax: +81 48 467 7498. E-mail address: [email protected] (T. Kannon).

0893-6080/$ – see front matter © 2011 Elsevier Ltd. All rights reserved. doi:10.1016/j.neunet.2011.06.011

As an approach for large-scale mathematical modeling and simulation studies in the field of neuroscience, one could use systematic modeling languages and simulators such as NEURON (Hines & Carnevale, 1997), GENESIS (Bower & Beeman, 1997), and NEST (Gewaltig & Diesmann, 2007), where neural simulations are executed by simply describing the structure of the network and model parameters. On the other hand, XML¹ document type modeling languages such as CellML (Hedley, Nelson, Bellivant, & Nielsen, 2001), SBML (Hucka et al., 2003), NeuroML (Gleeson et al., 2010), InsilicoML (Asai et al., 2008), and NineML (Gorchetchnikov & INCF Multiscale Modeling Taskforce, 2010; Raikov & INCF Multiscale Modeling Taskforce, 2010) are specialized in describing the formulas, parameters, and structure of a network, and are separated from simulation techniques and algorithms. In this case, simulation programs interpret the XML document and either simulate the model directly or translate it to run on the simulators described above. Models are ported to simulators using the same XML document; thus, integration can be carried out only within the same simulator.

¹ Extensible Markup Language, http://www.w3.org/TR/xml/.

Based on the aforementioned complications involved in integrating large-scale models, we have recently developed a novel modeling framework called PLATO (platform for a collaborative brain system modeling) (Usui, Inagaki et al., 2009; Usui, 2010). The objective of PLATO is to enable the reuse and integration of existing models stored in neuroinformatics databases (Hines, Morse, Migliore, Carnevale, & Shepherd, 2004; Migliore et al., 2003; Usui et al., 2008; Usui & Okumura, 2008) in order to construct and simulate a large-scale model. Newly developed models can also be integrated into an existing large-scale model. As an example, we are developing a large-scale model of the whole visual system to understand the visual processing underlying perception, illusion, learning, and memory.

In this paper, we describe a data-oriented model integration method in PLATO. A common data format (see below) is utilized as the interface of each model and/or sub-model (sub-system model) for exchanging data between models. An agent process manages the entire simulation by controlling the timing of model execution and data exchange. In the simulation, our approach successfully connected different types of models written in C++ and Python with only small changes in program code.

2. Data-oriented model integration

The main idea underlying the data-oriented model integration method is to commonize the models’ input and output (I/O) data so that different models can be connected to each other. This provides a framework for model connection toward the development of a large-scale model from existing and newly developed sub-models. It also allows sub-models to be replaced so that the large-scale model can be improved as more detailed sub-models are developed. In other words, sub-models can be plugged into the large-scale model, much like LEGO blocks, to create and/or remodel objects. Therefore, we considered the following essential issues for the data description:

Common data format: Every sub-model should share the same data format. This will allow sub-models developed in any programming language to be connected.

Self-describing: The data file should describe not only series of numerical data but also important information (metadata), such as descriptions and units of variables, and simulation conditions. This will provide essential information to interpret the data without requiring detailed knowledge of the model that created the data.

Portability: The data file should be independent of the operating system (OS) and computer architecture. This will allow sub-models developed in different programming languages, computer architectures, and OSs to be connected.

As a data format that satisfies the above requirements, we have focused on netCDF² (Rew & Davis, 1990). netCDF is a digital data format for scientific data, widely used in the field of geophysics. This format can also describe metadata, such as units, data descriptions, conditions, creator, related articles, etc. It also provides a feature for basic unit conversion, which supports data exchange between models dealing with different units. For example, models dealing with time in seconds can be connected with models dealing with time in another unit without either model needing to know the unit used by the other. However, the library available for the use of netCDF requires too many procedures, making it difficult for a general neuroscientist to construct a mathematical model that can cope with netCDF. Therefore, in order to provide a framework for inexpensive and environment-independent model integration, we have developed a data-oriented model integration library named PLATO Network Interface Class Library³ (PLATONIC), which includes the following features:

Simplicity: The library should be simple and have few procedures so that users can exploit netCDF without any detailed knowledge on its use.

Separability: The library should be separated from important parts of the model, such as calculations and formulas, so that it does not interfere with the model program. As a result, existing models can be reused with minimum effort.

Time management: The library should manage the data I/O timing between sub-models so that a sub-model can be constructed without considering the simulation step size in the other sub-models.

Independent execution: The library should allow independent execution of each sub-model. This will simplify the testing and debugging process.

² Network Common Data Form, http://www.unidata.ucar.edu/software/netcdf/.

3. Overview of PLATONIC

3.1. Data convention

To create a pluggable model without deep knowledge of the models to be connected, the configuration of the model input and output data should be designed ahead of the development of the model. The description of the contents of a data file in netCDF uses an XML document based on the NcML Schema⁴ and is called “Data Convention”. This document describes three types of information: “Dimension”, “Variable”, and “Attribute”. Dimension denotes the length of a data array. If the length is set to “Unlimited” (defined in netCDF), that dimension can expand during calculation, which is useful for the time dimension. Variable indicates the type and structure of the data array, where structure is a combination of Dimensions. Attribute describes information about a Variable (Variable Attribute) or the whole dataset (Global Attribute). A Variable Attribute is used to describe the notation of a variable, such as units, valid range, and description, whereas a Global Attribute is used to describe details of the data, author, experimental condition, literature, etc. Fig. 1 shows an example of a Data Convention file. There are three dimensions (“time”, “x”, and “y”) and four variables (“image”, “time”, “x”, and “y”), where dimension sizes for “time”, “x”, and “y” are “Unlimited”, “256”, and “256”, respectively. In this example, the variable “image” means “retinal image”, which has a time axis, an x-axis, and a y-axis. The three remaining variables are the coordinate variables, defining physical coordinates corresponding to each dimension. Since units and types are described, the risk of unit errors during model integration is reduced. Users need only this file to build a program for reading and/or writing data. Conversely, users need to start by defining the Data Convention to build a model; hence the name “data-oriented approach”.
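The Dimension/Variable structure described for Fig. 1 can be mirrored in a few lines of C++. The following is only an illustrative sketch: the struct names, the helper functions, and the choice of "double" as the element type are assumptions made here, and none of them belong to PLATONIC, netCDF, or NcML. It shows how the shape of a variable follows from the dimensions it combines, and why an “Unlimited” dimension does not contribute to the size of a single record.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical in-memory mirror of a Data Convention (not a PLATONIC or netCDF type).
struct Dimension {
    std::string name;
    std::size_t length;  // ignored when unlimited is true
    bool unlimited;      // "Unlimited": the dimension may grow during calculation
};

struct Variable {
    std::string name;
    std::string type;                // element type; "double" is an assumption here
    std::vector<std::string> shape;  // ordered names of the dimensions it combines
};

struct DataConvention {
    std::vector<Dimension> dimensions;
    std::vector<Variable> variables;
};

// Builds the layout described in the text: three dimensions and four variables.
inline DataConvention makeRetinalImageConvention() {
    DataConvention c;
    c.dimensions = {{"time", 0, true}, {"x", 256, false}, {"y", 256, false}};
    c.variables = {
        {"image", "double", {"time", "x", "y"}},  // the retinal image
        {"time", "double", {"time"}},             // coordinate variable
        {"x", "double", {"x"}},                   // coordinate variable
        {"y", "double", {"y"}},                   // coordinate variable
    };
    return c;
}

// Number of elements in one record, i.e. one step along the unlimited dimension.
// Returns 0 when the variable is not declared in the convention.
inline std::size_t elementsPerRecord(const DataConvention& c, const std::string& varName) {
    for (const Variable& v : c.variables) {
        if (v.name != varName) continue;
        std::size_t n = 1;
        for (const std::string& dimName : v.shape) {
            for (const Dimension& d : c.dimensions) {
                if (d.name == dimName && !d.unlimited) n *= d.length;
            }
        }
        return n;
    }
    return 0;
}
```

With this layout, one time step of “image” holds 256 × 256 values, while each time step adds a single value to the “time” coordinate variable.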

3.2. Agent-Interface-Model System (AIM System)

Model execution in PLATONIC is managed by the AIM System. Fig. 2 shows an example of a runtime process relationship diagram of the AIM System. At simulation start-up, three kinds of processes, Agent, Interface, and Model, are launched. The Agent process manages simulation time, progress of Models, and Interface processes. It also monitors Model progress, providing features to kill the Model and Interface processes as necessary. Models and Interfaces report the elapsed time to the Agent over the TCP⁵ network, allowing each process to be executed on a different node when using parallel computing systems. The Interface process exchanges data between models based on the Data Convention file using MPI⁶ (Gropp, Lusk, & Skjellum, 1994; Message Passing Interface Forum, 1994) and can be connected to multiple Models. The models’ I/O data are read from or written to file via the Interface process. Since file I/O has the disadvantage of being slow, the Interface compensates for this by caching the data. When a model requests input data from the Interface (Fig. 2, Model B), the Interface reads and transfers data according to the model’s simulation time. If the connected models (Fig. 2, Models A and B) have different time steps, the Interface can interpolate the data for transfer according to the user’s choice. Currently, it provides only zeroth-order hold and linear interpolation algorithms. By default, it transfers data using the zeroth-order hold interpolation method. With the linear interpolation method, it transfers data only when the data for the next simulation step are available. If no data are ready to be transferred, the Interface keeps the model waiting until the data become available. This may lock part or all of the simulation. To deal with this problem, PLATONIC has a feature for detecting and reporting such situations so that users can redesign the data transfer (e.g., by selecting the zeroth-order hold interpolation method or by synchronizing the time steps). Users may also change the default behavior, for example to switch automatically to zeroth-order hold when a transfer lock is detected. For independent execution of each sub-model, data files are accessed directly, without the AIM System.

Fig. 1. Example of Data Convention file (NcML document) describing three dimensions and four variables. This file has a global attribute that indicates the title of the data and variable attributes that declare the unit and description of the data. The variables that have the same name as a dimension are the coordinate variables defining the physical coordinate corresponding to each dimension.

³ PLATO Network Interface Class Library, http://platonic.dev.neuroinf.jp/.
⁴ The netCDF Markup Language, http://www.unidata.ucar.edu/software/netcdf/ncml/.
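The two interpolation modes that Section 3.2 attributes to the Interface, zeroth-order hold and linear interpolation, can be illustrated with a small self-contained sketch. This is not the PLATONIC implementation: the `Sample` struct and both function names are hypothetical, and the sketch assumes scalar samples sorted by strictly increasing time. It does show why linear interpolation may have to wait: a value at time t can be computed only once a sample at or after t exists, whereas zeroth-order hold needs only past data.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical record of one value written by an upstream model.
struct Sample {
    double time;
    double value;
};

// Zeroth-order hold: return the most recent sample value at or before time t.
// Precondition: samples are sorted by time and the first sample is not later than t.
inline double zeroOrderHold(const std::vector<Sample>& samples, double t) {
    assert(!samples.empty() && samples.front().time <= t);
    double held = samples.front().value;
    for (const Sample& s : samples) {
        if (s.time > t) break;
        held = s.value;
    }
    return held;
}

// Linear interpolation between the two samples that bracket time t.
// Returns false when no sample at or after t is available yet, which corresponds
// to the situation in which a reader would have to wait for more data.
// Precondition: sample times are strictly increasing.
inline bool linearInterpolate(const std::vector<Sample>& samples, double t, double& out) {
    if (!samples.empty() && samples.back().time == t) {
        out = samples.back().value;
        return true;
    }
    for (std::size_t i = 0; i + 1 < samples.size(); ++i) {
        const Sample& a = samples[i];
        const Sample& b = samples[i + 1];
        if (a.time <= t && t <= b.time) {
            const double w = (t - a.time) / (b.time - a.time);
            out = a.value + w * (b.value - a.value);
            return true;
        }
    }
    return false;
}
```

For example, with samples (0, 1) and (2, 3), a reader at t = 1 obtains 1 under zeroth-order hold and 2 under linear interpolation, while a request at t = 3 cannot be served by linear interpolation until a later sample is written.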

3.3. Simulation configure script

The AIM System is executed according to a simulation configuration file written as an XML document in the PLATONIC Markup Language (PLML). PLML describes runtime conditions and the connections between models and interfaces for each runtime condition. A similar approach has been adopted in SED-ML (Köhn & Le Novère, 2008) for simulating models described by SBML and CellML.

Fig. 2. Schematic diagram of the Agent-Interface-Model System. The Agent process launches the Model processes and the Interface process. The data file is accessed by the Interface process. Models communicate with each other via the Interface process.

Fig. 3. Example of simulation configuration file (PLML document). Model elements specify the executable binary file, launching command, arguments, and steptime of the mathematical model used in the simulation. Interface elements specify the Data Convention file and netCDF file used by the Interface process during simulation. Condition elements specify the simulation condition, such as the starttime and stoptime, and override settings specified in the Model and Interface elements.

⁵ Transmission Control Protocol, http://www.rfc-editor.org/rfc/rfc793.txt.
⁶ Message Passing Interface, http://www.mpi-forum.org/.

The PLML begins with the root element ⟨simulation⟩, which can have three top-level elements (⟨model⟩, ⟨interface⟩, and ⟨condition⟩) as child elements. Fig. 3 shows an example of a PLML document describing the connections in Fig. 2 and is detailed below.

3.3.1. Model element

The ⟨model⟩ element specifies the executable file of the mathematical model used in the simulation and can have ⟨steptime⟩, ⟨launch⟩, ⟨args⟩, ⟨output⟩, and ⟨input⟩ as child elements. The element ⟨steptime⟩ specifies the calculation time interval of the model and its unit, such as microseconds (us, usec, microsec), milliseconds (ms, msec, millisec), seconds (s, sec, second), minutes (min, minute), or hours (h, hour, hr). The element ⟨launch⟩ specifies the launcher program (e.g., mpirun, mpiexec, qsub) required to execute the model, and ⟨args⟩ specifies the command line arguments of the model. The elements ⟨output⟩ and ⟨input⟩ specify the Interface process for the output and input data of the mathematical model. In Fig. 3, there are two model elements where the attribute value indicates the executable file. Here, both are executed with two MPI processes (mpirun -np 2), and Model A is executed with “init” as argument. Both have different steptime values but share the same Interface.

Fig. 4. Process flow diagram of a simulation using the configuration script shown in Fig. 3. At the simulation start-up, the Agent process launches the Interface and Models A and B. The Models have to call the PNIC::Initialize and PNIC::Finalize methods at the beginning and end of a program, respectively, for using the PLATONIC library. The PNIC::GetInterface method requests a connection to the Interface and creates an instance of the PNICMPI class for communication. The PNICMPI::AttachVariable method declares the variable used for data exchange. Models call the PNICMPI::Transmit method to send data and the PNICMPI::Receive method to receive data. PNIC::Start, PNIC::Next, and PNIC::Stop report the starting, proceeding, and stopping of the calculation loop, respectively, to the Agent.
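Because each Model may declare its ⟨steptime⟩ in a different unit, comparing two models’ time steps requires a common base. The sketch below is a hypothetical helper, not part of PLML or PLATONIC: it maps the unit spellings listed in Section 3.3.1 to seconds. The function name and the decision to throw on an unknown unit are assumptions of this example.

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Converts a steptime value to seconds using the unit spellings listed in the text.
// Hypothetical helper for illustration; how PLATONIC itself parses units is not
// described in the paper.
inline double stepTimeToSeconds(double value, const std::string& unit) {
    static const std::map<std::string, double> kSecondsPerUnit = {
        {"us", 1e-6},  {"usec", 1e-6},   {"microsec", 1e-6},
        {"ms", 1e-3},  {"msec", 1e-3},   {"millisec", 1e-3},
        {"s", 1.0},    {"sec", 1.0},     {"second", 1.0},
        {"min", 60.0}, {"minute", 60.0},
        {"h", 3600.0}, {"hour", 3600.0}, {"hr", 3600.0},
    };
    const auto it = kSecondsPerUnit.find(unit);
    if (it == kSecondsPerUnit.end()) {
        throw std::invalid_argument("unknown steptime unit: " + unit);
    }
    return value * it->second;
}
```

A call such as `stepTimeToSeconds(2.0, "min")` yields 120 seconds, so a model stepping in minutes and one stepping in milliseconds can be placed on the same time axis.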

3.3.2. Interface element

The ⟨interface⟩ element specifies the Data Convention file used by the Interface process during simulation and can have ⟨data⟩ as child elements. The element ⟨data⟩ specifies the input or output data file. In Fig. 3, a Data Convention file (Section 3.1 and Fig. 1) is used to generate a data file “datafile.nc”.

3.3.3. Condition element

The ⟨condition⟩ element defines the simulation condition. At least one such element is required. The elements ⟨starttime⟩ and ⟨stoptime⟩ specify the start and end times for the simulation. Moreover, some elements specified as child elements in ⟨interface⟩ and ⟨model⟩ can be overridden in this section. The simulation condition is chosen at runtime. In Fig. 3, there are two conditions. The first condition (exec1) is executed by default, and it sets starttime and stoptime to 0.0 s and 4.0 s, respectively. Condition exec2, aside from setting the starttime and stoptime values to 0.0 s and 2.0 s, respectively, also overrides element attributes of some child elements in ⟨model⟩ and ⟨interface⟩. Model A is executed without argument and its steptime is changed to 0.05, whereas Model B is executed with four MPI processes (mpirun -np 4). Moreover, the data file utilized by the Interface is changed to “datafile2.nc”.

3.4. Application Programming Interface (API)

This section explains how to describe programs, using the C++ API as an example. There are two classes declared in the C++ API. PNIC is the class for communication with the Agent process. PNICMPI is the class for communication with the Interface process using MPI. Fig. 4 shows the process flow diagram for simulation using the configuration script shown in Fig. 3 and is detailed below.

3.4.1. Initialize and finalize

Each model needs to call the PNIC::Initialize method for communicating with the Agent process before using the library. This function retrieves the settings from the configuration file and negotiates with the Agent process. The initializing method for MPI is also called in this function, but it can be skipped when users employ their preferred MPI library. At the end of the program, the PNIC::Finalize method must be called to deallocate memory.

3.4.2. Communicating interface

To communicate with an Interface process, the Model has to call the PNIC::GetInterface method with the name specified by the ⟨output⟩ or ⟨input⟩ element in the configuration file. This creates an instance of the PNICMPI class. Then, the PNICMPI::AttachVariable method links variables used in the model program to variables declared in the corresponding Data Convention. It is possible to link variables to not only the entire array but also a partial (subsampled) array. It can also allocate memory with the requested size. This method should be called for each variable that requires linking. To exchange data through the Interface, the PNICMPI::Receive method copies the data at the current calculation time from the Interface into the attached variable, and the PNICMPI::Transmit method sends the data of the attached variable to the Interface process.

3.4.3. Time management

The models should report the elapsed time to the Agent process in the calculation loop. First, the PNIC::Start method tells the Agent process that the Model has completed preparation and has just started calculation. Next, the PNIC::Next method notifies the Agent that the Model will proceed to the next calculation step. Unless otherwise designated, the PNIC::Next method advances the time by the value of the ⟨steptime⟩ element in the configuration file. The PNIC::GetCurrentTime method, in turn, returns the simulation time as of the Model’s last PNIC::Next call. Finally, the PNIC::Stop method informs the Agent process that the calculation loop has terminated.
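The Start/Next/Stop protocol of Section 3.4.3 can be tried out without the library by using a stand-in clock. The class below is a hypothetical mock written for this example. It borrows the method names from the text but is not the PNIC class, it does not talk to an Agent, and its constructor arguments stand in for values that PLATONIC would read from the configuration file. The sketch only illustrates how the simulation time advances by one steptime per Next call until the stop time is reached.

```cpp
#include <cstddef>

// Hypothetical stand-in for the time-management calls; NOT the PLATONIC PNIC class.
class MockSimulationClock {
public:
    MockSimulationClock(double startTime, double stopTime, double stepTime)
        : start_(startTime), stop_(stopTime), step_(stepTime) {}

    void Start() { steps_ = 0; running_ = true; }  // calculation begins
    void Next() { ++steps_; }                      // advance by one steptime
    void Stop() { running_ = false; }              // calculation loop finished

    // Simulation time as of the last Next() call.
    double GetCurrentTime() const { return start_ + static_cast<double>(steps_) * step_; }
    double GetStopTime() const { return stop_; }
    bool IsRunning() const { return running_; }

private:
    double start_;
    double stop_;
    double step_;
    std::size_t steps_ = 0;
    bool running_ = false;
};

// Runs an empty calculation loop and returns how many steps were executed.
inline std::size_t runCalculationLoop(MockSimulationClock& clock) {
    std::size_t iterations = 0;
    clock.Start();
    while (clock.GetCurrentTime() < clock.GetStopTime()) {
        // One step of the model calculation would go here.
        ++iterations;
        clock.Next();
    }
    clock.Stop();
    return iterations;
}
```

With a start time of 0.0, a stop time of 4.0, and a step of 0.5 (all in the same unit), the loop body runs eight times. Counting steps as an integer and multiplying, rather than repeatedly adding the step, is a design choice of this sketch that avoids accumulating floating-point error.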

4. Model integration using PLATONIC

4.1. An example of visual system

In order to demonstrate the usefulness of the proposed data-oriented model integration library, we constructed a visual system model composed of an eye movement model (Inagaki, Hirata, & Usui, 2011), an eye optics model (unpublished), a pupil model, and a retinal network model named “VirtualRetina” (Wohrer & Kornprobst, 2009). Fig. 5(A) illustrates a schematic diagram of the visual system model constructed using PLATONIC. The eye optics model was constructed based on Artal’s model (Artal, 1990) and reproduces a retinal image of about ±10° (300 × 300 pixels). In the model, the retinal image is calculated from spectral images covering the range from 380 nm to 700 nm at 10-nm steps. The pupil model calculates the pupil diameter from the average luminance of the retinal image (Sakai, Hirata, & Usui, 2007). The eye movement model reproduces various characteristics of saccades and microsaccades. The retinal network model describes the outer plexiform layer, involving photoreceptors and horizontal cells, by a linear spatiotemporal filter (300 × 300 pixels); the bipolar cell layer, modeled through dynamic adaptation conductance, by a nonlinear dynamic spatiotemporal filter (300 × 300 pixels); and the ganglion cell layer by a noisy leaky integrate-and-fire model. The ganglion cell layer is divided into four layers consisting of the ON- and OFF-midget and ON- and OFF-parasol ganglion cells (39,092 cells).

4.2. Procedures for incorporating models into PLATONIC

The following is an example of the procedures required to incorporate an existing model into PLATONIC. The retinal network model VirtualRetina consists of about 9000 lines of program code. The program code that configures the simulation, “Retina_main.cc”, consists of about 600 lines, and calculations and other tasks are described in other files. This is a typical configuration for clearly separating calculation and data management; therefore, PLATONIC was easily applied to VirtualRetina, and Retina_main.cc was the only file that needed to be modified. Statements for processing data I/O without PLATONIC were deleted, and the control statements for initialize/finalize, data definition/exchange, and time management were replaced by those of the PLATONIC API. As a result of this modification, about 30 lines were added, and about 90 lines were deleted. Essential modifications are described below. Numbers appearing at the beginning of the lines correspond to the line numbers of the modified model program.⁷

Initializing and finalizing:
First, the header files of the PLATONIC library should be included.

    1 #include "PNIC.h"
    2 #include "PNICMPI.h"

At the beginning of the main function, the PLATONIC library should be initialized as follows (described in 3.4.1):

    27 PNIC::Initialize(&argc, &argv);

The Finalize statement appears at the end of the main function and is given as follows:

    554 PNIC::Finalize();

Data definition and exchange:
Because data I/O of the model is carried out via the Interface process, an instance of the PNICMPI class should be defined for communicating with the Interface process (explained in 3.4.2). In the case of VirtualRetina, the original program uses an image file as the input data. A common data format is used to connect to the eye optics model in PLATONIC, as follows:

    75 PNICMPI *retinaimage = PNIC::GetInterface("input");

Then, the variable name described in the Data Convention for the input data is attached to the variable used in the model program; for example, in the case of VirtualRetina, the variable is “image_L”.

    85 retinaimagenc = (double*)retinaimage->AttachVariable("image_L");

The following statement is added to actually receive the data in the calculation loop.

    357 retinaimage->Receive();

Time management:
Control sequences of the calculation loop of the model are replaced by the PLATONIC API (described in 3.4.3). In the case of VirtualRetina, the calculation part has a nested double loop, and thus, the instruction PNIC::Next() is included in the inner loop.

    351 PNIC::Start();
    352 for(int t=0; PNIC::GetCurrentTime() < PNIC::GetStopTime(); t++)
    363 for(int i=0; i < N_rep; i++, PNIC::Next())
    553 PNIC::Stop();

4.3. Simulation results

Fig. 5B summarizes the simulation results produced by the integrated visual system model. A rotating snakes stimulus⁸ shown in Fig. 5B(a) was used as the model input. Horizontal and vertical eye movements are traced in solid and dotted lines, respectively, in Fig. 5B(b), and the pupil diameter is shown in Fig. 5B(c). Snapshots of the model outputs, viz., the retinal image and the ON- and OFF-midget and ON- and OFF-parasol ganglion cell activities, are shown in Fig. 5B(d, e). Here, ganglion cell activities are represented as the firing rate, with the spot center representing the cell position and the spot size representing the corresponding cell’s receptive field size. These results reproduce the signal processing at each stage of the visual system, demonstrating that our proposed library successfully integrated and simulated the models and that the integrated model reproduced the whole-system behavior well.

⁷ Details of program code modifications are provided in the supplementary material.
⁸ Rotating snakes, http://www.ritsumei.ac.jp/~akitaoka/.


Fig. 5. A: Schematic diagram of an integrated visual system model consisting of eye movement, eye optics, pupil, and retina sub-system models constructed using PLATONIC. Squares represent the sub-system models and parallelograms indicate the data described by the common data format. B: Simulation results of the visual system model. (a) A rotating snakes stimulus used for the model input. (b) Horizontal and vertical eye movements. (c) Pupil diameter. (d) Snapshot of retinal image and (e) ON- and OFF-midget, and ON- and OFF-parasol ganglion cell activities. Ganglion cell activities are represented as white spots, with the spot center representing the cell position and the spot size the corresponding cell’s receptive field size.

5. Discussion

5.1. Contributions to large-scale modeling

In large-scale simulations, specifications and limitations are often incorporated to optimize efficiency. The development of a large-scale model may be hindered by such specifications and limitations. PLATONIC instead builds a large-scale system model by loosely coupling existing sub-system models. That is, the principal objective of PLATONIC is to connect the models, rather than to optimize the performance of the large-scale system.

When users develop models adapted to PLATONIC, the models’ I/O data must be standardized. In other words, the I/O data of models using PLATONIC are always in a common format. This is a merit of the data-oriented approach. In the field of computational neuroscience, mathematical models are often developed individually, each with an independently designed data format. If PLATONIC is utilized from an early stage according to the Data Convention of an existing large-scale model, the researcher will be able to evaluate the overall impact of the developed model as a part of that large-scale model. If many researchers use PLATONIC, reusable/pluggable models would be gathered on the PLATO project in the future. This would increase the number of available pluggable sub-system models for the development of a large-scale brain system model using PLATONIC, which is the ultimate goal of the PLATO project, and is expected to provide a thorough understanding of the whole brain system.

5.2. File-based data exchange

PLATONIC uses netCDF for data transfer between models. This approach appears to contradict the concepts of high performance computing. However, as previously stated, our major goal is not to develop high performance large-scale models but to effectively construct large-scale models by providing Data Conventions and libraries to support model connection. The advantages of using a file transfer method based on a common data format are given below.

• It is possible to develop and simulate pluggable models even if the model to be connected is not available. For instance, once the experimental data is adapted to the Data Convention, the data can be integrated into the large-scale model in a manner similar to how sub-system models are integrated.

• It is possible to suspend or resume the simulation if the data is suitably stored.

• A common data format allows the generation and collection of reusable data files.
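The first two advantages can be illustrated with a minimal C++ sketch. It is not PLATONIC code: `ExchangeFile`, `run_upstream_model`, `load_recorded_data`, and `run_downstream` are hypothetical names, and an in-memory struct stands in for a netCDF file written according to a Data Convention. The sketch only shows that a downstream model can read frames in the same way whether they were produced by a model or by adapted experimental data, and that it can restart from any stored time step.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for one file in the common data format:
// a named variable with units and one data frame per time step.
struct ExchangeFile {
    std::string variable;
    std::string units;
    std::vector<std::vector<double>> frames;  // frames[t] holds step t
};

// Source 1: an upstream sub-system model writes its output
// (toy model: a pupil diameter that shrinks linearly over time).
ExchangeFile run_upstream_model(std::size_t n_steps) {
    ExchangeFile f{"pupil_diameter", "mm", {}};
    for (std::size_t t = 0; t < n_steps; ++t) {
        f.frames.push_back({6.0 - 0.5 * static_cast<double>(t)});
    }
    return f;
}

// Source 2: experimental recordings adapted to the same layout.
ExchangeFile load_recorded_data(const std::vector<double>& samples) {
    ExchangeFile f{"pupil_diameter", "mm", {}};
    for (double s : samples) {
        f.frames.push_back({s});
    }
    return f;
}

// Downstream model: consumes frames from first_step onward, so a
// suspended run can resume from the stored data. It does not know
// which source produced the file. Returns the sum of the values read.
double run_downstream(const ExchangeFile& in, std::size_t first_step) {
    assert(in.variable == "pupil_diameter" && in.units == "mm");
    assert(first_step <= in.frames.size());
    double total = 0.0;
    for (std::size_t t = first_step; t < in.frames.size(); ++t) {
        total += in.frames[t][0];
    }
    return total;
}
```

For example, `run_downstream(run_upstream_model(4), 2)` processes only steps 2 and 3, as a resumed run would, while `run_downstream(load_recorded_data({5.0, 4.0}), 0)` feeds measured values through the same code path.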

For these reasons, PLATONIC utilizes file-based data exchange. The simulation performance is lower than that without file I/O, but we think that it is important to first develop a large-scale system model by integrating sub-models before improving the performance of its simulation. In order to improve the performance of data exchange without affecting the ease of model connection, features such as determining what data must actually be exchanged, and storing data in memory without writing to files, are being developed. These features will allow switching between the full data output and selected data output modes without having to make any changes to the program code. Therefore, the model can be utilized in the stand-alone mode, where all data has to be output, or as part of a large-scale model, where only the necessary data is exchanged.
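One way such mode switching could look is sketched below in C++. This is an assumption about a possible design, not the PLATONIC implementation: the `Exchange` class, its `put` method, and the variable names are hypothetical. The model calls `put` for every output in both modes, and the exchange layer alone decides whether the data is kept (which stands in here for writing it to a file).

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical exchange layer; not the PLATONIC API.
class Exchange {
public:
    // Stand-alone mode: every variable the model offers is stored.
    Exchange() : selective_(false) {}

    // Integrated mode: only variables a connected model needs are stored.
    explicit Exchange(std::set<std::string> needed)
        : selective_(true), needed_(std::move(needed)) {}

    // The model always calls put() for every output; the layer decides
    // whether the data is actually kept.
    void put(const std::string& name, const std::vector<double>& data) {
        if (!selective_ || needed_.count(name) > 0) {
            stored_[name] = data;
        }
    }

    bool has(const std::string& name) const { return stored_.count(name) > 0; }
    std::size_t size() const { return stored_.size(); }

private:
    bool selective_;
    std::set<std::string> needed_;
    std::map<std::string, std::vector<double>> stored_;
};

// Model code is identical in both modes (toy values).
void retina_step(Exchange& ex) {
    ex.put("retinal_image", {0.1, 0.2, 0.3});
    ex.put("on_midget_rate", {12.0, 8.5});
    ex.put("off_parasol_rate", {3.0});
}
```

Constructing `Exchange` with no arguments keeps all three variables, whereas constructing it with the set `{"on_midget_rate"}` keeps only that one; `retina_step` itself is unchanged in either case.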

5.3. Comparison with another large-scale modeling library

A few large-scale modeling libraries comparable to PLATONIC have already been published. For example, MUSIC (Ekeberg & Djurfeldt, 2008) is similar to PLATONIC in the concepts of reusing existing models and data exchange using MPI. This section compares the features of each library.

Model standardization: MUSIC provides a C++ API that enables existing models to adapt to MUSIC, and it already supports several useful simulators, such as NEST and NEURON. PLATONIC also provides a C++ API, described in Section 3.4. Thus, PLATONIC supports existing models and these simulators in the same manner. With both PLATONIC and MUSIC, the models communicate data at every time step. Therefore, the program structure and coding style of models utilizing either library are similar. Models that have a program structure in which data processing is isolated from calculation can be adapted to PLATONIC. When adapting a model to PLATONIC, the data structure is standardized by the Data Convention, and thus, model connection is straightforward.

Furthermore, by obeying the Data Convention, models using MUSIC can be connected to other sub-system models using PLATONIC. However, the opposite does not apply because of the methodology adopted by each library when creating MPI processes (see below).

Data standardization: The data handled by MUSIC depend on the simulators or authors of the particular models; on the other hand, all the data handled by PLATONIC are standardized by the Data Convention.

Data handling and distribution among multiple processes: In MUSIC, because models communicate and exchange data directly, the data of the sender and receiver models must be adjusted for details such as array size and units of physical quantities. If there is a difference between the number of parallel processes in different models, all multidimensional data are automatically mapped and distributed at runtime. In PLATONIC, the models exchange all data via the Interface process. The details of the data of the sender and receiver models are adjusted by the Data Convention. The difference in the number of parallel processes is handled by the Interface because all data from the sender model is gathered by a single Interface process, which then splits the data and sends it to the receiver model. In addition, PLATONIC handles automatic unit conversion using Udunits, which is part of the netCDF library. If the units are convertible, PLATONIC automatically converts the units of the data exchanged by models at runtime. However, in the case of tightly coupled data exchange, such as neural spike-train data, the Interface process through which PLATONIC exchanges data might become a communication bottleneck. MUSIC is more efficient in such cases because, in MUSIC, data exchange is performed directly between two models.

Procedure for setting MPI environment: In simulations using MPI, the MPI communicator controls the communication range of the model. MUSIC launches all models as a single MPI job. The global communicator of that MPI job (MPI_COMM_WORLD) is split and assigned to each model at the start of a simulation (MPI_Comm_split). In contrast, PLATONIC launches each model as an independent MPI job, and connects these models using an intercommunicator (MPI_Comm_connect, MPI_Comm_accept). Therefore, PLATONIC can execute simulations of large-scale models on several different computer clusters simultaneously through the Internet. Thus, for example, a large-scale simulation carried out using a high performance computing infrastructure,⁹ which is formed by interconnecting many individual supercomputer centers over a network, can be considered. In addition, because PLATONIC does not change the MPI communicator of each model, models using MUSIC can be sub-system models of PLATONIC. However, models using PLATONIC cannot be sub-system models of MUSIC because of the aforementioned mechanism of MPI communicator generation.
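The gather-then-split handling of different process counts, together with runtime unit conversion, can be sketched in plain C++ as follows. This is an illustration under stated assumptions rather than PLATONIC code: it uses no MPI (vectors stand in for the per-process data), the names `gather`, `split`, `to_metres`, and `convert` are hypothetical, the even contiguous split is one possible distribution policy, and a small hand-written table stands in for a units library.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

using Chunk = std::vector<double>;

// Gather: concatenate the chunks held by the sender's processes,
// in rank order, into one array on the single Interface process.
Chunk gather(const std::vector<Chunk>& sender_chunks) {
    Chunk all;
    for (const Chunk& c : sender_chunks) {
        all.insert(all.end(), c.begin(), c.end());
    }
    return all;
}

// Split: cut the gathered array into n_receivers contiguous chunks
// whose sizes differ by at most one element.
std::vector<Chunk> split(const Chunk& all, std::size_t n_receivers) {
    assert(n_receivers > 0);
    std::vector<Chunk> out(n_receivers);
    const std::size_t base = all.size() / n_receivers;
    const std::size_t extra = all.size() % n_receivers;
    std::size_t pos = 0;
    for (std::size_t r = 0; r < n_receivers; ++r) {
        const std::size_t len = base + (r < extra ? 1 : 0);
        const auto first = all.begin() + static_cast<std::ptrdiff_t>(pos);
        out[r].assign(first, first + static_cast<std::ptrdiff_t>(len));
        pos += len;
    }
    return out;
}

// Toy conversion factors for a few length units (assumed table).
// Throws if the unit is unknown, i.e. not convertible.
double to_metres(const std::string& unit) {
    if (unit == "m") return 1.0;
    if (unit == "mm") return 1e-3;
    if (unit == "um") return 1e-6;
    throw std::invalid_argument("unit not convertible: " + unit);
}

// Convert data from the sender's unit to the receiver's unit.
Chunk convert(const Chunk& data, const std::string& from,
              const std::string& to) {
    const double factor = to_metres(from) / to_metres(to);
    Chunk out;
    for (double v : data) {
        out.push_back(v * factor);
    }
    return out;
}
```

With three sender chunks of sizes 3, 3, and 1 and two receiver processes, `split(gather(chunks), 2)` yields chunks of sizes 4 and 3. Because every value passes through one gathering step, the sketch also makes the bottleneck noted above easy to see.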

Compared with MUSIC, PLATONIC focuses on connecting models and independently developing models that can be interconnected on the basis of a common data format. In a case where the development policy of a large-scale model is determined beforehand, either library may be used. However, PLATONIC allows straightforward connection of models on the basis of the Data Convention; therefore, the connection structure of a large-scale model can be easily changed.

When considering supercomputer architecture, MUSIC has an advantage in simulations in terms of communication efficiency. On the other hand, PLATONIC is suitable for distributed computing, where sub-system models are simulated on individual high performance computers and interconnected by loose coupling. As mentioned above, PLATONIC can connect models that use MUSIC. In other words, these two systems can coexist. Collecting and sharing models developed using the features of the various simulators, not only MUSIC, is the objective of the PLATONIC and PLATO projects.

9 High Performance Computing Infrastructure, http://hpcic.riken.jp/.


Fig. 6. Framework of PLATO environment. This environment enables large-scale mathematical model development by integrating/developing models based on a shared Data Convention and executing the simulation on a Supercomputer, or the Sim-PF that can directly run a model published on Japan-Node PFs or ModelDB through the Internet. Users can also perform a full research lifecycle, such as developing mathematical models using PLATONIC, managing experimental data, searching articles from PubMed, and authoring papers, on this environment through Concierge and other plug-ins available in the PLATO IDE.

5.4. Environment for facilitating large-scale modeling

The key feature in model integration using PLATONIC is the definition of Data Conventions. Based on this concept, model integration is possible by knowing only the Data Convention. It contains information on the data structure a model provides or should provide, as well as basic information on the model itself. Therefore, users can know how to retrieve or provide data without analyzing the connecting model’s source code or the data itself. Currently, we are building a service called the Data Convention center for managing and publishing Data Conventions on the Web (Fig. 6 bottom left).
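The idea that two models can be checked for connectability from their Data Conventions alone can be sketched in C++ as follows. The structures here are hypothetical simplifications: real Data Conventions are XML documents with more detail, the field names are assumptions, and this sketch demands an exact unit match instead of testing whether units are convertible.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified view of one variable described in a
// Data Convention.
struct VariableSpec {
    std::string name;
    std::string units;
    std::vector<std::size_t> shape;  // e.g. {rows, cols}
};

// What a model declares: the variables it provides and those it
// requires (trailing underscore because `requires` is a C++20 keyword).
struct ModelConvention {
    std::string model;
    std::vector<VariableSpec> provides;
    std::vector<VariableSpec> requires_;
};

// Returns the names of required variables that `upstream` cannot
// supply with matching name, units, and shape. An empty result means
// the two models are connectable as far as this check can tell.
std::vector<std::string> missing_inputs(const ModelConvention& upstream,
                                        const ModelConvention& downstream) {
    std::vector<std::string> missing;
    for (const VariableSpec& need : downstream.requires_) {
        bool found = false;
        for (const VariableSpec& have : upstream.provides) {
            if (have.name == need.name && have.units == need.units &&
                have.shape == need.shape) {
                found = true;
                break;
            }
        }
        if (!found) {
            missing.push_back(need.name);
        }
    }
    return missing;
}
```

For instance, if an eye optics model provides only a retinal image while a retina model requires both a retinal image and a pupil diameter, `missing_inputs` reports the pupil diameter as the input that another sub-system model, or adapted experimental data, still has to supply. Neither model's source code is inspected.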

For developing and simulating models using PLATONIC, XML documents, such as Data Convention and PLML, are utilized. However, XML documents are difficult to write using an ordinary text editor. We are planning to develop a PLATO IDE (integrated development environment) (Fig. 6 bottom) based on Eclipse¹⁰ technology, including GUI tools to simplify the creation of these documents. Eclipse provides features for convenient installation, high scalability, and flexibility. The user can extend and customize its usability by installing plug-ins. It is also a powerful tool supporting construction of programs using not only Java but also C/C++, Python, and others. Therefore, the PLATO IDE will inherit all such features, and users can conduct a full simulation cycle, from code writing to model integration to simulation. In addition, the PLATO IDE also supports a personal database for managing digital research files called Concierge¹¹ (Sakai, Aoyama, Yamaji, & Usui, 2007) and typesetting tools, such as TeX and OpenDocument.¹² Accordingly, users can perform a full research lifecycle, such as developing mathematical models, managing experimental data, searching articles, and authoring papers.

10 Eclipse, http://www.eclipse.org/.
11 Concierge: Personal Database Software for Digital Research Resources, http://concierge.sourceforge.jp/.

Furthermore, we have developed Simulation Platform (Sim-PF) (Usui, Yamazaki et al., 2009, 2010; Yamazaki et al., 2011) as a cloud-computing environment. Sim-PF is a web-based platform that can directly run a model published on Japan-Node PFs or ModelDB¹³ (Fig. 6 right). Under these frameworks, users can test the models in the Sim-PF, adapt them to PLATONIC, and integrate them into a large-scale model.

6. Conclusions

This paper presented a data-oriented model integration library as part of the PLATO framework that aims to integrate and develop a large-scale mathematical model of the brain. The proposed data-oriented model integration library, called PLATONIC, employs TCP and MPI for communication in the model simulation. It provides a framework for environment-independent, flexible, and scalable model integration by isolating the functions of data I/O, management of simulation time, and calculations. This has allowed the integration of existing models without much modification of their code to develop a working integrated system model.

Under the PLATO framework, several researchers can contribute toward the development of a large-scale whole brain system model. Moreover, because of the flexibility and scalability of the PLATONIC library introduced in this paper, models can be plugged in to test their effect on the large-scale model. It is also designed using basic tools utilized on high performance computer systems, such as MPI, although tuning is necessary to achieve high performance. Rather than targeting such performance directly, we hope PLATO and PLATONIC will provide a test bed for a large-scale model integrated by plugging in existing or newly developed sub-models, prior to the design and implementation of high performance large-scale models specifically tuned for simulation on such high performance computers, where performance is crucial.

12 OpenDocument, http://docs.oasis-open.org/office/v1.0/.
13 ModelDB, http://senselab.med.yale.edu/modeldb/.

Acknowledgments

We thank Drs. Shunji Satoh, Yoshimi Kamiyama, Yutaka Hirata, Akito Ishihara, and Hayaru Shouno for valuable discussion during development and Yoshihiro Okumura for technical support. This work is partially funded by ‘‘The Next-Generation Integrated Simulation of Living Matter’’ project, part of the Development and Use of the Next-Generation Supercomputer Project of the Ministry of Education, Culture, Sports, Science and Technology of Japan.

Appendix. Supplementary data

Supplementary material related to this article can be found online at doi:10.1016/j.neunet.2011.06.011.

References

Artal, P. (1990). Calculations of two-dimensional foveal retinal images in real eyes. Journal of the Optical Society of America A, 7(8), 1374–1381.

Asai, Y., Suzuki, Y., Kido, Y., Oka, H., Heien, E., Nakanishi, M., et al. (2008). Specifications of insilicoML 1.0: a multilevel biophysical model description language. The Journal of Physiological Sciences, 58(7), 447–458.

Bower, J. M., & Beeman, D. (1997). The book of GENESIS: exploring realistic neural models with the general neural simulation system. New York: Springer.

Ekeberg, Ö., & Djurfeldt, M. (2008). MUSIC – MUltisimulation coordinator: request for comments. In Nature Precedings. PDF file is available at: http://dx.doi.org/10.1038/npre.2008.1830.1.

de Garis, H., Shuo, C., Goertzel, B., & Ruiting, L. (2010). A world survey of artificial brain projects, part I: large-scale brain simulations. Neurocomputing, 74, 3–29.

Gewaltig, M.-O., & Diesmann, M. (2007). NEST (neural simulation tool). Scholarpedia, 2(4), 1430.

Gleeson, P., Crook, S., Cannon, R. C., Hines, M. L., Billings, G. O., Farinella, M., et al. (2010). NeuroML: a language for describing data driven models of neurons and networks with a high degree of biological detail. PLoS Computational Biology, 6(6), e1000815.

Goertzel, B., Ruiting, L., Arel, I., de Garis, H., & Shuo, C. (2010). A world survey of artificial brain projects, part II: biologically inspired cognitive architectures. Neurocomputing, 74, 30–49.

Gorchetchnikov, A., & INCF Multiscale Modeling Taskforce (2010). NineML – a description language for spiking neuron network modeling: the user layer. BMC Neuroscience, 11(Suppl. 1), 71.

Gropp, W., Lusk, E., & Skjellum, A. (1994). Using MPI: portable parallel programming with the message-passing interface. Cambridge: MIT Press.

Hedley, W. J., Nelson, M. R., Bullivant, D. P., & Nielsen, P. F. (2001). A short introduction to CellML. Philosophical Transactions of the Royal Society A, 359, 1073–1089.

Hines, M. L., & Carnevale, N. T. (1997). The NEURON simulation environment. Neural Computation, 9, 1179–1209.

Hines, M. L., Morse, T., Migliore, M., Carnevale, N. T., & Shepherd, G. M. (2004). ModelDB: a database to support computational neuroscience. Journal of Computational Neuroscience, 17(1), 7–11.

Hucka, M., Finney, A., Sauro, H. M., Bolouri, H., Doyle, J. C., & Kitano, H. (2003). The systems biology markup language (SBML): a medium for representation and exchange of biochemical network models. Bioinformatics, 19(4), 524–531.

Inagaki, K., Hirata, Y., & Usui, S. (2011). A model-based theory on the signal transformation for the microsaccade generation. Neural Networks, 24(9), 990–997.

Köhn, D., & Le Novère, N. (2008). SED-ML – an XML format for the implementation of the MIASE guidelines. In Lecture notes in computer science: Vol. 5307. Computational methods in systems biology (pp. 176–190).

Message Passing Interface Forum (1994). MPI: a message-passing interface standard. Technical report UT-CS-94-230.

Migliore, M., Morse, T. M., Davison, A. P., Marenco, L., Shepherd, G. M., & Hines, M. L. (2003). ModelDB: making models publicly accessible to support computational neuroscience. Neuroinformatics, 1(1), 135–139.

Raikov, I., & INCF Multiscale Modeling Taskforce (2010). NineML – a description language for spiking neuron network modeling: the abstraction layer. BMC Neuroscience, 11(Suppl. 1), 66.

Rew, R. K., & Davis, G. P. (1990). NetCDF: an interface for scientific data access. IEEE Computer Graphics and Applications, 10(4), 76–82.

Sakai, H., Aoyama, T., Yamaji, K., & Usui, S. (2007). Concierge: personal database software for managing digital research resources. Frontiers in Neuroinformatics, 1, 5.

Sakai, H., Hirata, Y., & Usui, S. (2007). Relationship between residual aberration and light-adapted pupil size. Optometry & Vision Science, 84, 517–521.

Usui, S. (2010). PLATO: platform for collaborative brain system modeling. In Proceedings of 2010 IEEE World Congress on Computational Intelligence.

Usui, S., Furuichi, T., Miyakawa, H., Ikeno, H., Nagao, S., Iijima, T., et al. (2008). Japanese neuroinformatics node and platforms. In Lecture notes in computer science: Vol. 4985. Proceedings of the 14th international conference on neural information processing (pp. 884–894).

Usui, S., Inagaki, K., Kannon, T., Kamiyama, Y., Satoh, S., Kamiji, N. L., et al. (2009). A next generation modeling environment PLATO: platform for collaborative brain system modeling. In Lecture notes in computer science: Vol. 5863. Proceedings of the 16th international conference on neural information processing (pp. 84–90).

Usui, S., & Okumura, Y. (2008). Basic scheme of neuroinformatics platform: XooNIps. In Lecture notes in computer science: Vol. 5050. Proceedings of 2008 IEEE world congress on computational intelligence (pp. 102–116).

Usui, S., Yamazaki, T., Ikeno, H., Okumura, Y., Satoh, S., Kamiyama, Y., et al. (2009). Simulation platform: a test environment of computational models via web. In Frontiers in neuroinformatics. Conference abstract: 2nd INCF congress of neuroinformatics 2009. [PDF file is available at: http://dx.doi.org/10.3389/conf.neuro.11.2009.08.047].

Usui, S., Yamazaki, T., Ikeno, H., Okumura, Y., Satoh, S., Kamiyama, Y., et al. (2010). Simulation platform: model simulation on the cloud. In Frontiers in neuroinformatics. Conference abstract: 3rd INCF congress of neuroinformatics 2010. [PDF file is available at: http://dx.doi.org/10.3389/conf.fnins.2010.13.00100].

Wohrer, A., & Kornprobst, P. (2009). VirtualRetina: a biological retina model and simulator, with contrast gain control. Journal of Computational Neuroscience, 26(2), 219–249.

Yamazaki, T., Ikeno, H., Okumura, Y., Satoh, S., Kamiyama, Y., Hirata, Y., et al. (2011). Simulation platform beta: a cloud-based online simulation environment. Neural Networks, 24(9), 927–932.