CARMEN: Code Analysis, Repository and Modelling for e-Neuroscience
Research Challenge
Understanding the brain may be the greatest
informatics challenge of the 21st century
Worldwide, more than 100,000 neuroscientists (~5,000 in the UK) are generating vast amounts of data
Principal experimental data formats:
molecular (genomic/proteomic)
neurophysiological (time-series electrical measures of activity)
anatomical (spatial)
behavioural
Neuroinformatics concerns how these data are handled and integrated, including the application of computational modelling
In recent years new technological opportunities for data sharing have emerged with faster networks, improved database technologies, and affordable massive data storage capabilities
Neuroinformatics is increasingly exploiting these opportunities to enable data sharing, re-use of data and novel analysis based on new combinations of data that can be performed via database systems
Neuroinformatics
Need for Cooperation
The OECD identified a need to work cooperatively in order to achieve major advances and has established the International Neuroinformatics Coordinating Facility
Cooperation will permit:
development of common processes
best value from data – long term curation
‘mega-analysis’ of large data sets
integration of data sets across different scales and different approaches
interdisciplinary research
Technical
Multiple proprietary data formats
Need for detailed, standardised and evolvable metadata
Volume of the data to be analysed
Cultural
Multiple communities each acting independently
Concerns about the consequences of sharing data
Difficulty in appreciating how the science could be moved forwards by e-Science
Potential Barriers to Cooperation
CARMEN – Focus on Neural Activity
resolving the ‘neural code’ from the timing of action potential activity
[Figure with panels labelled neurone 1, neurone 2 and neurone 3]
Raw voltage signal data are collected using:
single or multi-electrode array recording
novel optical recording, particularly of the activity dynamics of large networks
Much current knowledge about brain function is based on analysis of firing patterns of individual neurones.
New computer-based data acquisition systems and techniques for recording simultaneously from many neurones mean that data are amassing rapidly.
Neural modelling generates massive simulated data sets that need to be processed, analysed and compared with experimental data.
Neuronal recordings can be intra- or extra-cellular recordings of single spikes, ensembles of neurones, or field potentials. All of these data are types of time-series data which require a specialised information handling system.
Electrophysiological Data
To demonstrate and sustain advances in neuroscience enabled by e-Science technology
To create a grid-enabled, real time ‘virtual laboratory’ environment for neurophysiological data
To develop an extensible, client-defined ‘toolkit’ for data extraction, analysis and modelling
To provide a repository for archiving, sharing, integration and discovery of data
To achieve wide community and commercial engagement in developing and using CARMEN
CARMEN Objectives
Project Exemplar
Recording from brain tissue removed from epileptic patients (scarce tissue and data rates up to 20 GB/h)
On-line analysis by distributed collaborators will enable the experiment to be defined during data collection
Repository will enable integration of rare case types from different laboratories
New knowledge will lead to advances in treatment
CARMEN Consortium
Newcastle: Colin Ingram, Paul Watson, Stuart Baker, Marcus Kaiser, Phil Lord, Evelyne Sernagor, Tom Smulders, Miles Whittington
York: Jim Austin, Tom Jackson
Stirling: Leslie Smith
Plymouth: Roman Borisyuk
Cambridge: Stephen Eglen
Warwick: Jianfeng Feng
Sheffield: Kevin Gurney, Paul Overton
Manchester: Stefano Panzeri
Leicester: Rodrigo Quian Quiroga
Imperial: Simon Schultz
St. Andrews: Anne Smith
CARMEN Consortium
Commercial Partners
- applications in the pharmaceutical sector
- interfacing of data acquisition software
- application of database infrastructure
- commercialisation of analysis tools
Work Packages
Data Storage & Analysis
WP1 Spike Detection & Sorting
WP2 Information Theoretic Analysis of Derived Signals
WP3 Data-Driven Parameter Determination in Conductance-Based Models
WP4 Intelligent Database Querying
WP5 Measurement and Visualisation of Spike Synchronisation
WP6 Multilevel Analysis and Modelling in Networks
Hub and Spoke Project
Hub: A “CAIRN” repository for the storage and analysis of neuroscience data
Spokes: A set of neuroscience projects that will produce data and analysis services for the hub, and use it to address key neuroscience questions
CARMEN Structure
Managing vast amounts of data
> 50 TB primary data
Extracting value from the data
discovery & interpretation
analysis – harnessing compute resources
curation of services as well as data
Controlling access to the data & services
e-Science Challenges
[Architecture diagram. Components: Data; Metadata; Compute Cluster on which Services are Dynamically Deployed; Web Portals; Rich Clients; Security; Workflow Enactment Engine; Registry; Service Repository]
CARMEN Active Information Repository Node
OMII: Grimoire
DAME: Signal Data Explorer
OMII / myGrid: Taverna / BPEL
OGSA-DAI & SRB
Gold: Role & Task based Security
myGrid & Gold: Feta, Provenance
Dynasoar, White Rose Grid, Newcastle Grid
• Data Collection from Electrode Array
• Spike Detection
• with User Defined Threshold
• Spike Sorting
• Analysis
• Visualisation
Currently, this is a semi-manual process
We have an initial prototype for automating this…
A Typical Scenario we want to Support
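The spike detection step with a user-defined threshold can be illustrated with a minimal sketch. This is not the CARMEN prototype: the function name `detect_spikes`, the `refractory` parameter and the sample trace are all hypothetical, and simple upward threshold crossing is assumed here as one common way to detect spikes.

```python
from typing import List, Optional, Sequence


def detect_spikes(signal: Sequence[float], threshold: float,
                  refractory: int = 0) -> List[int]:
    """Return sample indices where the signal crosses `threshold` upward.

    Hypothetical helper for illustration only. `refractory` is the minimum
    gap (in samples) that must be exceeded between two accepted spikes.
    """
    spikes: List[int] = []
    last: Optional[int] = None
    for i in range(1, len(signal)):
        # An upward crossing: previous sample below, current at or above.
        crossed = signal[i - 1] < threshold <= signal[i]
        if crossed and (last is None or i - last > refractory):
            spikes.append(i)
            last = i
    return spikes


# Made-up voltage trace (arbitrary units), not real recording data.
trace = [0.0, 0.1, 0.9, 0.2, 0.0, 1.2, 1.1, 0.1, 0.8, 0.0]
spike_indices = detect_spikes(trace, threshold=0.7)
print(spike_indices)
```

Changing `threshold` changes which events count as spikes, which is why the scenario leaves that choice to the user. The detected indices would then feed the spike sorting, analysis and visualisation steps.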
Signal Data Explorer
Example Workflow
[Workflow enactment diagram. Labels: SRB File System; RDBMS; External Client; Spike Sorting Service; Reporting; Dynamically Deployed Services in Dynasoar; BPEL / Taverna; Registry; Input Data; Output Metadata; Available Services; Repository; Security; Workflow Engine; Query]
Example Workflow Enactment
Example Graph Output
Example Movie Output
Extensible, standardised metadata for neuroscience
data formats (timing, data channels, etc.)
experimental design (e.g. stimuli or drug treatments)
concurrent data (e.g. behaviour, physiological measures)
experimental idiosyncrasies (e.g. artifacts)
experimental conditions (animals, temperature, treatments etc.)
Some Remaining Challenges
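As a rough illustration of what an extensible metadata record covering the categories above might look like, here is a minimal sketch. The class name `RecordingMetadata`, every field name and all example values are assumptions made for illustration; they are not a CARMEN schema.

```python
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class RecordingMetadata:
    """Hypothetical metadata record; field names are illustrative only."""
    sampling_rate_hz: float                  # data format: timing
    channels: List[str]                      # data format: data channels
    experimental_design: str                 # e.g. stimuli or drug treatments
    concurrent_data: List[str] = field(default_factory=list)   # e.g. behaviour
    idiosyncrasies: List[str] = field(default_factory=list)    # e.g. artifacts
    conditions: Dict[str, str] = field(default_factory=dict)   # animals, temperature
    # Free-form key/value pairs let a lab add fields without changing the schema.
    extensions: Dict[str, str] = field(default_factory=dict)


# Made-up example values, not taken from any real experiment.
example = RecordingMetadata(
    sampling_rate_hz=25000.0,
    channels=["ch0", "ch1"],
    experimental_design="visual stimulus",
    conditions={"temperature": "32 C"},
)
record = asdict(example)  # plain dict, ready to serialise or store
print(record["channels"])
```

Keeping a fixed core of fields alongside an open `extensions` mapping is one way to balance standardisation against extensibility; other designs are equally plausible.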
Locating patterns in time-series data across multiple levels of abstraction
Reproducible e-Science
curating services as well as data
public repositories of deployable services
dynamic service deployment
Real-time expert collaboration
Some Remaining Challenges (cont.)
CARMEN
CARMEN is delivering an e-Science infrastructure that can be applied across a range of diverse and challenging applications (not only neuroscience)
CARMEN enables cooperation and interdisciplinary working in ways currently not possible
CARMEN will deliver new results in neuroscience, computer science and medicine