IEEE CDC Workshop, Dec. 8, 2008

Modeling Neuronal Multi-Unit Activity and Neural Dynamics

Zhe (Sage) Chen

[email protected] Statistics Research Laboratory

MGH/HMS/MIT

Acknowledgment

•Emery N. Brown (MIT/Harvard/MGH)

•Matt A. Wilson (MIT)

•Sujith Vijayan (Harvard/BU)

•Riccardo Barbieri (MGH/HMS)

Outline

•Brief overview of extracellular recordings: SUA, MUA, LFP

•MUA: properties and examples

•Neural dynamics and the “state” of network

•Case study: probabilistic models for estimating neuronal “UP/DOWN” states with MUA

[Figure: signal types from extracellular recordings. LFP (<100~150 Hz); MUA (300~6000 Hz); SUA (spike sorting); MSP (spike detection)]

MUA Properties

•What is MUA? : weighted sum of N single-unit activity (N can be either known or unknown)

•a weighted average of neural activity around the electrode tip in a region (140~300 μm) smaller than LFP but larger than SUA

•easy to process, recordings stable, information rich

•with spike detection-->might contain noisy spikes (false positives)

•mutual info of MUA is greatest when SUA are independent

How can MUA be used?

•Simultaneously measuring brain activity during behavior task (Buchwald et al., Nature, 1965)

•Analyzing neuronal dynamics during sleep (Ji and Wilson, Nature Neurosci., 2007)

•Analyzing coherence between LFP, SUA, MUA (Zeitler, Fries, and Gielen, Neural Comput, 2006)

•Neuronal decoding (Stark and Abeles, J. Neurosci, 2007; Won and Wolf, Network, 2004)

Examples

Example: MUA across different whisker intensity levels

[Figure: MUA vs. time (s), somatosensory cortex (rat); from Dr. Anna Devor (MGH/UCSD)]

Example: Motor cortex

[Figure: raw signal, LFP, MUA, and SUA traces; neuronal decoding (movement prediction). Stark and Abeles, "Predicting movement from multiunit activity," J. Neurosci., 2007]

Neural (State) Dynamics

•The neural systems (neurons) can shift between different computational modes or between ranges of operation (e.g., change in response gain) at different timescales.

•externally-driven dynamics: results from changes in external stimuli factors (e.g., sensory adaptation, encoding of sensory percepts)

• internally-driven dynamics: results from changes in internal factors (e.g., attention shift, change of conductance or membrane potential)

State of Neurons/Network

•Modified by many factors, such as membrane potentials, synaptic activities, conductance, top-down attention, etc.

•change at different levels of anesthesia or sleep stages

•spontaneous self-organization by a balance of excitatory and inhibitory neurons in recurrent network

•measurements of neuronal state (MUA, LFP)

Haider et al., J. Neurosci., 2006; Fontanini & Katz, J. Neurophysiol., 2008

Buzsáki/Kopell/Wilson/McCormick/Donald Katz

A case study: Probabilistic models for estimating neuronal UP/DOWN states

UP/DOWN states

• the periodic fluctuations between increased/decreased spiking activity of a neuronal population (Sirota et al., PNAS, 2003; Hahn et al., PNAS, 2007)

• intracellular: membrane potential fluctuations define two states of neocortex (membrane potential greater/smaller than the resting threshold, -70~-80 mV)

•extracellular: LFP, MUA

•observed from anesthetized/behaving animals (during sleep) in a variety of brain regions (visual, somatosensory, hippocampus, basal ganglia ...)

[Figure from Ji & Wilson, Nature Neurosci., 2007; panel label: visual]

Threshold-based method

Ji & Wilson, 2007

Objectives

•Building statistical probabilistic (generative) models to represent the MUA spike trains data

•Representing uncertainties with prob.

•Modeling the transition prob. (survival prob.) of two states

Data Recording

•Pre-RUN sleep --- RUN --- Post-RUN sleep

•Simultaneous recordings of primary somatosensory cortex (SI) & hippocampus from several behaving rats

•EEG/EMG/MUA (6-9 tetrodes)

•Sleep classification (REM, SWS, transitory)

[Figure: monitoring location of rat using LED; labels: Reward Site, Texture; by Sujith Vijayan]

Modeling & Estimation

•UP and DOWN states, S(t), are treated as discrete r.v. (1/0) drawn from a hidden Markov process

•Observations: MUA spike trains (modulated by the hidden state) --- doubly stochastic process

•Estimation task: infer the hidden states (when and how many transitions occur) and the transition prob.
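The generative picture in the bullets above can be sketched as a small simulation. This is a minimal illustration, not the model used in the talk: it assumes a discrete-time, first-order Markov chain for S(t) and Poisson spike counts per bin, and every name and parameter value (`simulate_up_down`, `p_stay_up`, `rate_up`, and so on) is hypothetical.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw a Poisson count with Knuth's multiplication method (fine for small rates)."""
    limit = math.exp(-rate)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def simulate_up_down(n_bins, p_stay_down, p_stay_up, rate_down, rate_up, seed=0):
    """Simulate hidden states S(t) in {0, 1} and spike counts y(t), one per bin."""
    rng = random.Random(seed)
    states, counts = [], []
    state = 0  # assumption: the chain starts in the DOWN state
    for _ in range(n_bins):
        states.append(state)
        # The observation is modulated by the hidden state (doubly stochastic).
        rate = rate_up if state == 1 else rate_down
        counts.append(sample_poisson(rate, rng))
        # First-order Markov transition: stay with probability p_stay, else flip.
        p_stay = p_stay_up if state == 1 else p_stay_down
        if rng.random() > p_stay:
            state = 1 - state
    return states, counts


# Illustrative values only: expected counts per bin of 0.2 (DOWN) and 3.0 (UP).
states, counts = simulate_up_down(500, 0.95, 0.97, 0.2, 3.0)
```

The estimation task is the inverse problem: given only `counts`, recover `states` and the transition probabilities.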

[Figure: graphical model with hidden states S(t-1), S(t), S(t+T) and observations y(t-1), y(t), y(t+T)]

Comparison of discrete- & continuous-time models

Discrete-time model                   Continuous-time model
spike counts                          spikes (0/1)
bin size (10-20 ms)                   arbitrarily small (1 ms)
transition constrained by bin size    any time
first-order Markovian                 semi-Markovian

Hidden Markov model (HMM)

•stationary probabilistic model with three elements: initial prob., transition prob., emission prob.

[Equations not recovered. Surviving annotations on the emission model: inhomogeneous Poisson, with terms labeled baseline, state, and spiking history]

EM algorithm

•Missing data problems (statistics, speech recognition, signal processing, & communication)

•principle: maximum likelihood estimation

•expectation maximization (EM) algorithm
  E-step: estimate the hidden state sufficient stat.
  M-step: estimate the unknown parameters
  Iterate until convergence

•HMM --- Baum-Welch (forward-backward) algorithm & Viterbi algorithm
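The E-step/M-step loop above can be sketched for a small HMM. This is a simplified illustration under stated assumptions, not the talk's estimator: emissions are plain Poisson counts with one rate per state (no spiking-history term), the forward-backward pass uses per-step scaling, and the function name, toy data, and starting values are hypothetical.

```python
import math


def poisson_pmf(k, rate):
    return math.exp(-rate) * rate ** k / math.factorial(k)


def baum_welch_poisson(counts, rates, trans, init, n_iter=20):
    """EM for an HMM with Poisson emissions. Returns (rates, trans, init, gamma)."""
    n, m = len(counts), len(rates)
    for _ in range(n_iter):
        emis = [[poisson_pmf(y, r) for r in rates] for y in counts]
        # E-step, forward pass: alpha is rescaled at each step to avoid underflow.
        alpha, scale = [], []
        for t in range(n):
            if t == 0:
                unnorm = [init[i] * emis[0][i] for i in range(m)]
            else:
                unnorm = [sum(alpha[t - 1][i] * trans[i][j] for i in range(m)) * emis[t][j]
                          for j in range(m)]
            c = sum(unnorm)
            scale.append(c)
            alpha.append([a / c for a in unnorm])
        # E-step, backward pass with the same scaling constants.
        beta = [[1.0] * m for _ in range(n)]
        for t in range(n - 2, -1, -1):
            beta[t] = [sum(trans[i][j] * emis[t + 1][j] * beta[t + 1][j] for j in range(m))
                       / scale[t + 1] for i in range(m)]
        # Sufficient statistics: state posteriors (gamma) and expected transition counts.
        gamma = [[alpha[t][i] * beta[t][i] for i in range(m)] for t in range(n)]
        xi_sum = [[0.0] * m for _ in range(m)]
        for t in range(n - 1):
            for i in range(m):
                for j in range(m):
                    xi_sum[i][j] += (alpha[t][i] * trans[i][j] * emis[t + 1][j]
                                     * beta[t + 1][j] / scale[t + 1])
        # M-step: closed-form re-estimation of all three HMM elements.
        init = gamma[0][:]
        trans = [[xi_sum[i][j] / sum(xi_sum[i]) for j in range(m)] for i in range(m)]
        rates = [sum(gamma[t][i] * counts[t] for t in range(n))
                 / sum(gamma[t][i] for t in range(n)) for i in range(m)]
    # gamma holds the posteriors computed in the final E-step.
    return rates, trans, init, gamma


# Toy counts alternating between low-rate and high-rate stretches (made up for illustration).
counts = [0, 0, 1, 0, 0, 4, 3, 5, 2, 4, 0, 1, 0, 0, 0, 3, 4, 2, 5, 3] * 5
rates, trans, init, gamma = baum_welch_poisson(
    counts, [0.5, 2.0], [[0.9, 0.1], [0.1, 0.9]], [0.5, 0.5])
```

Thresholding `gamma[t][1]` at 0.5 gives a per-bin UP/DOWN labeling; the Viterbi algorithm would instead return the single most probable state path.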

EM (continued)

•Q-function

•sufficient stat.

EM (continued)

• forward-backward

•M-step (reestimation)

•Newton (iterative fixed-point optimization)

Continuous-time model

•"single-step transition" is no longer valid

•state transition prob. time-varying (depending on time), modeled either parametrically or nonparametrically

•survival function

• Choices of F(z): Markov or semi-Markov
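To make the Markov vs. semi-Markov choice concrete, here is a minimal sketch of the survival function 1 - F(z) = Pr(Z > z) for a state duration Z. It assumes an exponential duration for the Markov case and, purely as an illustrative two-parameter alternative, a Weibull duration for the semi-Markov case; the slides do not say which parametric family was used, and all function names are hypothetical.

```python
import math


def survival_exponential(z, rate):
    """Pr(Z > z) for an exponential sojourn time (memoryless, Markov case)."""
    return math.exp(-rate * z)


def survival_weibull(z, scale, shape):
    """Pr(Z > z) for a Weibull sojourn time (one possible semi-Markov choice)."""
    return math.exp(-((z / scale) ** shape))


def hazard(survival, z, dz=1e-6, **params):
    """Numerical hazard -d/dz log S(z): the instantaneous rate of leaving the state."""
    s0 = survival(z, **params)
    s1 = survival(z + dz, **params)
    return -(math.log(s1) - math.log(s0)) / dz


# Exponential: the hazard does not depend on how long the state has lasted.
h_exp_early = hazard(survival_exponential, 0.1, rate=2.0)
h_exp_late = hazard(survival_exponential, 1.0, rate=2.0)

# Weibull with shape > 1: the hazard grows with the time already spent in the state.
h_wei_early = hazard(survival_weibull, 0.1, scale=1.0, shape=2.0)
h_wei_late = hazard(survival_weibull, 1.0, scale=1.0, shape=2.0)
```

A constant hazard is what makes the exponential case Markov; any duration-dependent hazard makes the transition probability depend on the elapsed sojourn time, which is the semi-Markov case.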

[Figure: UP and DOWN state duration histograms (Ji & Wilson, 2007). Non-exponential!]

Examples

• In our case, we fit the histogram with 2-parameter parametric model for UP/DOWN state

•To be more precise, the model pdf is a censored version of the original pdf
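One way to read the "censored" pdf is sketched here. This is an assumption, not the formula from the slide (which did not survive extraction): durations are taken to be restricted to an interval [a, b] and the density renormalized over that interval. The symbols a, b, and the two-parameter vector \theta are introduced only for illustration.

```latex
% Assumption: the model pdf is the original two-parameter pdf p(z | \theta)
% restricted to a <= z <= b and renormalized; F is the cdf of p.
\[
  \tilde{p}(z \mid \theta)
  = \frac{p(z \mid \theta)}{\int_a^b p(u \mid \theta)\,du}
  = \frac{p(z \mid \theta)}{F(b \mid \theta) - F(a \mid \theta)},
  \qquad a \le z \le b .
\]
```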

Details

•Conditional intensity function (generalized "rate")

•Observed likelihood

• Joint log-likelihood

$\lambda(t \mid H_t)\,\Delta \approx \Pr(\text{spike in } [t, t+\Delta) \mid H_t)$
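The relation above leads to the standard discretized point-process log-likelihood, sum over bins of dN_k log(lambda_k Delta) - lambda_k Delta. The sketch here evaluates it for a toy spike train. The log-linear form of the intensity (`mu` as baseline, `alpha` as state effect) is an assumption that drops the spiking-history term, and all names and numbers are hypothetical.

```python
import math


def conditional_intensity(state, mu, alpha):
    """Hypothetical state-modulated rate in Hz: exp(baseline + state effect)."""
    return math.exp(mu + alpha * state)


def point_process_loglik(spikes, rates, dt):
    """Discretized log-likelihood: sum of dN * log(rate * dt) - rate * dt over bins."""
    total = 0.0
    for dn, lam in zip(spikes, rates):
        total += dn * math.log(lam * dt) - lam * dt
    return total


dt = 0.001                                 # 1 ms bins
states = [0, 0, 1, 1, 1, 1, 0, 0]          # toy UP/DOWN labels
spikes = [0, 0, 1, 0, 1, 1, 0, 0]          # 0/1 spike indicators per bin

# State-dependent rates: 5 Hz in DOWN, 500 Hz in UP (illustrative values).
rates_state = [conditional_intensity(s, math.log(5.0), math.log(100.0)) for s in states]
# Constant-rate alternative fitted to the same data: 3 spikes in 8 ms = 375 Hz.
rates_const = [375.0] * len(spikes)

ll_state = point_process_loglik(spikes, rates_state, dt)
ll_const = point_process_loglik(spikes, rates_const, dt)
```

On this toy example the state-modulated intensity attains the higher log-likelihood, which is the quantity the EM procedures aim to maximize.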

Monte Carlo EM

•Continuous-time model: E-step is difficult to compute analytically (since the total number & the locations of state transitions are both unknown)

• Idea: replace E-step with Monte Carlo E-step, M-step remains unchanged (if the M-step uses Gibbs sampling, then it is a fully Bayesian estimation)

•Use HMM-EM estimate as the initial condition

•Derive a reversible jump MCMC algorithm to allow dimensionality change (the number of transitions)

Reversible-jump MCMC

•A Metropolis-Hastings type MCMC with trans-dimensional proposal (Green, 1995)

•Allow the Markov chain to jump between two parameter spaces with different dim.

•example: change point detection, mixture modeling, object detection, etc.

•Design of efficient proposal distribution (non-unique, prior knowledge helps, likelihood-based data-driven) --- improve the acceptance rate
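The accept/reject step of a trans-dimensional Metropolis-Hastings move can be sketched in log space. This follows the general form in Green (1995): likelihood ratio times prior ratio times proposal ratio times the Jacobian of the dimension-matching map. It is not the talk's specific proposal design, and the function and argument names are hypothetical.

```python
import math
import random


def rj_acceptance_prob(log_lik_new, log_lik_old,
                       log_prior_new, log_prior_old,
                       log_q_reverse, log_q_forward,
                       log_abs_jacobian=0.0):
    """min(1, likelihood ratio * prior ratio * proposal ratio * |Jacobian|), in log space."""
    log_ratio = ((log_lik_new - log_lik_old)
                 + (log_prior_new - log_prior_old)
                 + (log_q_reverse - log_q_forward)
                 + log_abs_jacobian)
    return 1.0 if log_ratio >= 0.0 else math.exp(log_ratio)


def accept_move(prob, rng):
    """Accept the proposed configuration with the given probability."""
    return rng.random() < prob


rng = random.Random(0)
p_uphill = rj_acceptance_prob(-10.0, -12.0, 0.0, 0.0, 0.0, 0.0)    # better likelihood
p_downhill = rj_acceptance_prob(-12.0, -10.0, 0.0, 0.0, 0.0, 0.0)  # worse likelihood
moved = accept_move(p_uphill, rng)
```

For a move that keeps the dimension fixed, the Jacobian term is zero in log space and this reduces to ordinary Metropolis-Hastings.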

A bit of detail

•Consider seven proposals, 6 of them change dim.

•1) move the boundary; 2) merge two neighboring states; 3) delete one intermediate sojourn; 4-5) insert/delete the first sojourn; 6-7) insert/delete the last sojourn

•Acceptance prob.

Algorithmic procedure

Convergence diagnosis: single chain, log-likelihood

Posterior dist. of S(t): burn-in period & thinning
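Burn-in and thinning can be sketched as follows, assuming the sampler's output is a list of state paths already discretized on a common time grid. That representation is a simplification of the continuous-time samples, and the function name and toy chain are hypothetical.

```python
def posterior_up_probability(samples, burn_in, thin):
    """Estimate Pr(S(t) = 1 | data) per time bin from MCMC state paths.

    Discards the first `burn_in` samples, then keeps every `thin`-th one.
    """
    kept = samples[burn_in::thin]
    n_bins = len(kept[0])
    return [sum(path[t] for path in kept) / len(kept) for t in range(n_bins)]


# Toy chain: two burn-in samples followed by five post-burn-in samples.
samples = [
    [0, 0, 0], [0, 0, 0],              # discarded as burn-in
    [1, 1, 0], [0, 0, 0], [1, 0, 0],
    [0, 0, 0], [1, 1, 0],
]
posterior = posterior_up_probability(samples, burn_in=2, thin=2)
```

With `burn_in=2` and `thin=2` the kept paths are the third, fifth, and seventh samples, so the estimate for each bin is the fraction of those three paths that are in the UP state.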

Experimental results

•Validate the models and algorithms via synthetic spike train data

•Apply the models to real-world 15-min SWS cortical spike train data

•Plot EEG-triggered average

•Goodness-of-fit tests for the model

[Figure: synthetic data]

[Figure: real data, with K-complex annotated]

Method comparison

[Table residue: no. jumps reported as 2986, 3223, and 2576 (occurrence rate: 82 /min); the method labels for each count were not recovered]

[Figure: KS plot]

Goodness-of-fit: Kolmogorov-Smirnov (KS) test from the time-rescaling theorem (Brown et al., 2002)
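The KS check can be sketched as follows. Under the time-rescaling theorem, the integrated intensities between successive spikes are independent Exp(1) variables if the model is correct, so their transforms u = 1 - exp(-tau) should be uniform on (0, 1). The code compares sorted u values against the uniform quantiles (k - 0.5)/n and uses the usual approximate 95% band of 1.36/sqrt(n). The helper names and the homogeneous Poisson test data are hypothetical.

```python
import math
import random


def rescale_intervals(spike_times, cumulative_intensity):
    """Integrated intensity between successive spikes (time starts at 0)."""
    times = [0.0] + list(spike_times)
    return [cumulative_intensity(b) - cumulative_intensity(a)
            for a, b in zip(times, times[1:])]


def ks_from_rescaled(taus):
    """Return (KS statistic, approximate 95% bound) for rescaled intervals."""
    u = sorted(1.0 - math.exp(-tau) for tau in taus)
    n = len(u)
    stat = max(abs(u[k] - (k + 0.5) / n) for k in range(n))
    return stat, 1.36 / math.sqrt(n)


# Synthetic spike train: homogeneous Poisson process at 20 Hz.
rng = random.Random(1)
t, spike_times = 0.0, []
for _ in range(500):
    t += rng.expovariate(20.0)
    spike_times.append(t)

# Correct model (20 Hz) versus a deliberately wrong one (60 Hz).
stat_good, bound = ks_from_rescaled(rescale_intervals(spike_times, lambda s: 20.0 * s))
stat_bad, _ = ks_from_rescaled(rescale_intervals(spike_times, lambda s: 60.0 * s))
```

A model passes when its statistic stays within the bound, which on a KS plot corresponds to the curve staying inside the 95% band around the diagonal.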

Possible model extension

• Incorporation of continuous-time LFP observations into the likelihood model

•Fully Bayesian method (imposing priors to the model parameters)

•discrete prob. model for the state duration distribution

Take home message

•Probabilistic generative models are useful for representing the uncertainties of data (in contrast, threshold-based method is ad hoc and very data-dependent)

•“All models are wrong, but some are useful.” --- George Box

•Continuous-time model is more accurate in characterizing & segmenting the UP/DOWN states, but it is more computationally demanding

•Model limitation: currently assumed to be stationary and model is identical across all spike trains (independent)
