
Measuring the signal-to-noise ratio of a neuron

Gabriela Czanner a,1, Sridevi V. Sarma b, Demba Ba c, Uri T. Eden d, Wei Wu e, Emad Eskandar f, Hubert H. Lim g, Simona Temereanca h, Wendy A. Suzuki i, and Emery N. Brown c,j,k,1

a Department of Biostatistics and Department of Eye and Vision Science, Faculty of Health and Life Sciences, University of Liverpool, Liverpool L69 3GA, United Kingdom; b Department of Biomedical Engineering, Institute for Computational Medicine, Johns Hopkins University, Baltimore, MD 21218; c Department of Brain and Cognitive Sciences and j Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA 02139; d Department of Mathematics and Statistics, Boston University, Boston, MA 02215; e School of Automation Science and Engineering, South China University of Technology, Guangzhou 510640, China; Departments of f Neurosurgery and k Anesthesia, Critical Care and Pain Medicine, Massachusetts General Hospital, Harvard Medical School, Boston, MA 02114; g Department of Biomedical Engineering, Institute for Translational Neuroscience, University of Minnesota, Minneapolis, MN 55455; h Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Charlestown, MA 02129; and i Center for Neural Science, New York University, New York, NY 10003

Contributed by Emery N. Brown, March 27, 2015 (sent for review December 5, 2014; reviewed by Mingzhou Ding, Satish Iyengar, and Hualou Liang)

The signal-to-noise ratio (SNR), a commonly used measure of fidelity in physical systems, is defined as the ratio of the squared amplitude or variance of a signal relative to the variance of the noise. This definition is not appropriate for neural systems in which spiking activity is more accurately represented as point processes. We show that the SNR estimates a ratio of expected prediction errors and extend the standard definition to one appropriate for single neurons by representing neural spiking activity using point process generalized linear models (PP-GLM). We estimate the prediction errors using the residual deviances from the PP-GLM fits. Because the deviance is an approximate χ2 random variable, we compute a bias-corrected SNR estimate appropriate for single-neuron analysis and use the bootstrap to assess its uncertainty. In the analyses of four systems neuroscience experiments, we show that the SNRs are −10 dB to −3 dB for guinea pig auditory cortex neurons, −18 dB to −7 dB for rat thalamic neurons, −28 dB to −14 dB for monkey hippocampal neurons, and −29 dB to −20 dB for human subthalamic neurons. The new SNR definition makes explicit in the measure commonly used for physical systems the often-quoted observation that single neurons have low SNRs. The neuron's spiking history is frequently a more informative covariate for predicting spiking propensity than the applied stimulus. Our new SNR definition extends to any GLM system in which the factors modulating the response can be expressed as separate components of a likelihood function.

SNR | signal-to-noise ratio | neuron | simulation | point processes

The signal-to-noise ratio (SNR), defined as the amplitude squared of a signal or the signal variance divided by the variance of the system noise, is a widely applied measure for quantifying system fidelity and for comparing performance among different systems (1–4). Commonly expressed in decibels as 10 log10(SNR), the higher the SNR, the stronger the signal or information in the signal relative to the noise or distortion. Use of the SNR is most appropriate for systems defined as deterministic or stochastic signals plus Gaussian noise (2, 4). For the latter, the SNR can be computed in the time or frequency domain.

Use of the SNR to characterize the fidelity of neural systems is appealing because information transmission by neurons is a noisy stochastic process. However, the standard concept of SNR cannot be applied in neuronal analyses because neurons transmit both signal and noise primarily in their action potentials, which are binary electrical discharges also known as spikes (5–8). Defining what is the signal and what is the noise in neural spiking activity is a challenge because the putative signals or stimuli for neurons differ appreciably among brain regions and experiments. For example, neurons in the visual cortex and in the auditory cortex respond respectively to features of light (9) and sound stimuli (10), while neurons in the somatosensory thalamus respond to tactile stimuli (11). In contrast, neurons in the rodent hippocampus respond robustly to the animal's position in its environment (11, 12), whereas monkey hippocampal neurons respond to the process of task learning (13). As part of responding to a putative stimulus, a neuron's spiking activity is also modulated by biophysical factors such as its absolute and relative refractory periods, its bursting propensity, and local network and rhythm dynamics (14, 15). Hence, the definition of SNR must account for the extent to which a neuron's spiking responses are due to the applied stimulus or signal and to these intrinsic biophysical properties.

Formulations of the SNR for neural systems have been studied. Rieke et al. (16) adapted information theory measures to define Gaussian upper bounds on the SNR for individual neurons. Coefficients of variation and Fano factors based on spike counts (17–19) have been used as measures of SNR. Similarly, Gaussian approximations have been used to derive upper bounds on neuronal SNR (16). These approaches do not consider the point process nature of neural spiking activity. Moreover, these measures and the Gaussian approximations are less accurate for neurons with low spike rates or when information is contained in precise spike times.

Lyamzin et al. (20) developed an SNR measure for neural systems using time-dependent Bernoulli processes to model the neural spiking activity. Their SNR estimates, based on variance formulae, do not consider the biophysical properties of the neuron and are more appropriate for Gaussian systems (16, 21, 22). The Poisson regression model used widely in statistics to relate count observations to covariates provides a framework for studying the SNR for non-Gaussian systems because it provides an analog of the square of the multiple correlation coefficient (R2) used to measure goodness of fit in linear regression analyses

Significance

Neurons represent both signal and noise in binary electrical discharges termed action potentials. Hence, the standard signal-to-noise ratio (SNR) definition of signal amplitude squared and divided by the noise variance does not apply. We show that the SNR estimates a ratio of expected prediction errors. Using point process generalized linear models, we extend the standard definition to one appropriate for single neurons. In analyses of four neural systems, we show that single neuron SNRs range from −29 dB to −3 dB and that spiking history is often a more informative predictor of spiking propensity than the signal or stimulus activating the neuron. By generalizing the standard SNR metric, we make explicit the well-known fact that individual neurons are highly noisy information transmitters.

Author contributions: G.C. and E.N.B. designed research; G.C., S.V.S., D.B., U.T.E., W.W., E.E., H.H.L., S.T., and W.A.S. performed research; and G.C. and E.N.B. wrote the paper.

Reviewers: M.D., University of Florida; S.I., University of Pittsburgh; and H.L., Drexel University.

The authors declare no conflict of interest.

Freely available online through the PNAS open access option.

1 To whom correspondence may be addressed. Email: [email protected] or [email protected].

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1505545112/-/DCSupplemental.

www.pnas.org/cgi/doi/10.1073/pnas.1505545112 PNAS | June 9, 2015 | vol. 112 | no. 23 | 7141–7146

STATISTICS


(23). The SNR can be expressed in terms of the R2 for linear and Poisson regression models. However, this relationship has not been exploited to construct an SNR estimate for neural systems or point process models. Finally, the SNR is a commonly computed statistic in science and engineering. Extending this concept to non-Gaussian systems would be greatly aided by a precise statement of the theoretical quantity that this statistic estimates (24, 25).

We show that the SNR estimates a ratio of expected prediction errors (EPEs). Using point process generalized linear models (PP-GLM), we extend the standard definition to one appropriate for single neurons recorded in stimulus−response experiments. In analyses of four neural systems, we show that single-neuron SNRs range from −29 dB to −3 dB and that spiking history is often a more informative predictor of spiking propensity than the signal being represented. Our new SNR definition generalizes to any problem in which the modulatory components of a system's output can be expressed as separate components of a GLM.

Theory

A standard way to define the SNR is as the ratio

$$\mathrm{SNR} = \frac{\sigma^2_{\text{signal}}}{\sigma^2_{\text{noise}}}, \qquad [1]$$

where $\sigma^2_{\text{signal}}$ is structure in the data induced by the signal and $\sigma^2_{\text{noise}}$ is the variability due to the noise. To adapt this definition to the analysis of neural spike train recordings from a single neuron, we have to: (i) define precisely what the SNR estimates; (ii) extend the definition and its estimate to account for covariates that, along with the applied stimulus or signal input, also affect the neural response; and (iii) extend the SNR definition and its estimate so that it applies to point process models of neural spiking activity.

By analyzing the linear Gaussian signal plus noise model (Supporting Information), we show that standard SNR computations (Eq. S5) provide an estimator of a ratio of EPEs (Eq. S4). For the linear Gaussian model with covariates, this ratio of EPEs is also well defined (Eq. S6) and can be estimated as a ratio of sums of squares of residuals (Eq. S7). The SNR definition further extends to the GLM with covariates (Eq. S8). To estimate the SNR for the GLM, we replace the sums of squares by the residual deviances, their extensions in the GLM framework (Eqs. S9 and S10). The residual deviance is a constant multiple of the Kullback−Leibler (KL) divergence between the data and the model. Due to the Pythagorean property of the KL divergence of GLM models with canonical link functions (26–28) evaluated at the maximum likelihood estimates, the SNR estimator can be conveniently interpreted as the ratio of the explained KL divergence of the signal relative to the noise. We propose an approximate bias correction for the GLM SNR estimate with covariates (Eq. S11), which gives the estimator better performance in low signal-to-noise problems such as single-neuron recordings. The GLM framework formulated with point process models has been used to analyze neural spiking activity (5–7, 29). Therefore, we derive a point process GLM (PP-GLM) SNR estimate for single-neuron spiking activity recorded in stimulus−response experiments.
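As a concrete illustration of Eq. 1 and its decibel form, the ratio can be computed directly for a simulated signal-plus-Gaussian-noise system. This is a minimal sketch; the sinusoidal signal, its amplitude, the noise variance, and the sample count are arbitrary illustrative choices, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical example: a sinusoidal signal in additive Gaussian noise.
t = np.linspace(0.0, 1.0, 1000)
signal = 0.5 * np.sin(2.0 * np.pi * 5.0 * t)   # signal variance ~ amplitude^2 / 2
noise = rng.normal(0.0, 1.0, t.size)           # noise variance ~ 1

snr = signal.var() / noise.var()               # Eq. 1: sigma^2_signal / sigma^2_noise
snr_db = 10.0 * np.log10(snr)                  # decibel form used throughout the paper
print(snr_db)
```

With these choices the signal variance is about 0.125, so the SNR lands near −9 dB, already in the "low SNR" regime the paper discusses for single neurons.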

A Volterra Series Expansion of the Conditional Intensity Function of a Spiking Neuron. Volterra series are widely used to model biological systems (30), including neural spiking activity (16). We develop a Volterra series expansion of the log of the conditional intensity function to define the PP-GLM for single-neuron spiking activity (31). We then apply the GLM framework outlined in Supporting Information to derive the SNR estimate.

We assume that on an observation interval $(0, T]$, we record spikes at times $0 < u_1 < u_2 < \cdots < u_J < T$. If we model the spike events as a point process, then the conditional intensity function of the spike train is defined by (5)

$$\lim_{\Delta \to 0} \frac{\Pr\left(N(t+\Delta) - N(t) = 1 \mid H_t\right)}{\Delta} = \lambda(t \mid H_t), \qquad [2]$$

where $N(t)$ is the number of spikes in the interval $(0, t]$ for $t \in (0, T]$ and $H_t$ is the relevant history at $t$. It follows that for $\Delta$ small,

$$\Pr\left(\text{spike in } (t, t+\Delta] \mid H_t\right) \approx \lambda(t \mid H_t)\,\Delta. \qquad [3]$$
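Eq. 3 suggests a simple way to simulate a spike train: partition $(0, T]$ into bins of width $\Delta$ and draw one Bernoulli event per bin with probability $\lambda(t \mid H_t)\Delta$. A minimal sketch with a hypothetical constant intensity (no history dependence; the rate, bin width, and interval length are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical constant-rate example: lambda(t | H_t) = 40 spikes/s.
rate = 40.0      # spikes per second
delta = 0.001    # 1-ms bins, small enough that rate * delta << 1
T = 10.0         # observation interval (0, T] in seconds

n_bins = int(T / delta)
p_spike = rate * delta                 # Eq. 3: Pr(spike in bin) ~ lambda * delta
spikes = rng.random(n_bins) < p_spike  # one Bernoulli draw per bin
print(spikes.sum())                    # roughly rate * T = 400 spikes
```

A history-dependent intensity would make `p_spike` a function of the recent entries of `spikes`, which is exactly the structure the PP-GLM below parameterizes.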

We assume that the neuron receives a stimulus or signal input and that its spiking activity depends on this input and its biophysical properties. The biophysical properties may include absolute and relative refractory periods, bursting propensity, and network dynamics. We assume that we can express log λ(t|Ht) in a Volterra series expansion as a function of the signal and the biophysical properties (31). The first-order and second-order terms in the expansion are

$$\begin{aligned}
\log \lambda(t \mid H_t) = {} & \int_0^t s(t-u)\,\beta_S(u)\,du + \int_0^t \beta_H(u)\,dN(t-u) \\
& + \int_0^t\!\int_0^t s(t-u)\,s(t-v)\,h_1(u,v)\,du\,dv \\
& + \int_0^t\!\int_0^t h_2(u,v)\,dN(t-u)\,dN(t-v) \\
& + \int_0^t\!\int_0^t h_3(u,v)\,s(t-u)\,dN(t-v) + \cdots, \qquad [4]
\end{aligned}$$

where $s(t)$ is the signal at time $t$, $dN(t)$ is the increment in the counting process, $\beta_S(u)$ is the one-dimensional signal kernel, $\beta_H(t)$ is the one-dimensional temporal or spike history kernel, $h_1(u,v)$ is the 2D signal kernel, $h_2(u,v)$ is the 2D temporal kernel, and $h_3(u,v)$ is the 2D signal−temporal kernel.

Eq. 4 shows that up to first order, the stimulus effect on the spiking activity and the effect of the biophysical properties of the neuron, defined in terms of the neuron's spiking history, can be expressed as separate components of the conditional intensity function. Assuming that the second-order effects are not strong, the approximate separation of these two components makes it possible to define the SNR for the signal, taking account of the effect of the biophysical properties as an additional covariate, and vice versa. We expand the log of the conditional intensity function in the Volterra series, rather than the conditional intensity function itself, to ensure that the conditional intensity function is positive. In addition, using the log of the conditional intensity function simplifies the GLM formulation by using the canonical link function for the local Poisson model.

Likelihood Analysis Using a PP-GLM. We define the likelihood model for the spike train using the PP-GLM framework (5). We assume the stimulus−response experiment consists of R independent trials, which we index as r = 1, ..., R. We discretize time within a trial by choosing L large and defining the L subintervals of width Δ = T L−1. We choose L large so that each subinterval contains at most one spike. We index the subintervals ℓ = 1, ..., L and define n_{r,ℓ} to be 1 if, on trial r, there is a spike in the subinterval ((ℓ−1)Δ, ℓΔ] and 0 otherwise. We let n_r = (n_{r,1}, ..., n_{r,L}) be the set of spikes recorded on trial r in (0, T]. Let H_{r,ℓ} = {n_{r,ℓ−J}, ..., n_{r,ℓ−1}} be the relevant history of the spiking activity at time ℓΔ. We define the discrete form of the Volterra expansion by using the first two terms of Eq. 4 to obtain

$$\log \lambda_r(\ell\Delta \mid H_{r,\ell}, \beta) \approx \beta_0 + \sum_{k=0}^{K} \beta_{S,k}\, s_{\ell-k} + \sum_{j=1}^{J} \beta_{H,j}\, n_{r,\ell-j}, \qquad [5]$$



where $\beta = (\beta_0, \beta_S, \beta_H)'$, $\beta_S = (\beta_{S,0}, ..., \beta_{S,K})'$, and $\beta_H = (\beta_{H,1}, ..., \beta_{H,J})'$; hence the dependence on the stimulus goes back a period of KΔ, whereas the dependence on spiking history goes back a period of JΔ. Exponentiating both sides of Eq. 5 yields

$$\lambda_r(\ell\Delta \mid H_{r,\ell}, \beta) \approx \exp\!\left\{\beta_0 + \sum_{k=0}^{K} \beta_{S,k}\, s_{\ell-k} + \sum_{j=1}^{J} \beta_{H,j}\, n_{r,\ell-j}\right\}. \qquad [6]$$

The first and third terms on the right side of Eq. 6 measure the intrinsic spiking propensity of the neuron, whereas the second term measures the effect of the stimulus or signal on the neuron's spiking propensity.

The likelihood function for β given the recorded spike train is (5)

$$L(n, \beta) = \exp\!\left\{\sum_{r=1}^{R}\left[\sum_{\ell=1}^{L} n_{r,\ell} \log \lambda\left(\ell\Delta \mid \beta, H_{r,\ell}\right)\Delta - \sum_{\ell=1}^{L} \lambda\left(\ell\Delta \mid \beta, H_{r,\ell}\right)\Delta\right]\right\}. \qquad [7]$$
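The discretized model of Eqs. 5−7 can be sketched in code: each bin ℓ gets a covariate row holding an intercept, the K + 1 stimulus lags, and the J history lags, and the log likelihood is a sum over bins. This is an illustrative single-trial sketch with synthetic inputs, not the authors' implementation; the bin count, lag orders, simulated stimulus, and spike indicators are all arbitrary:

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical single-trial setup: L 1-ms bins, K+1 stimulus lags, J history lags.
L, K, J = 200, 3, 5
delta = 0.001
s = rng.normal(size=L)                      # discretized stimulus s_ell
n = (rng.random(L) < 0.05).astype(float)    # binary spike indicators n_ell

def design_row(ell, s, n, K, J):
    """Covariates for bin ell: intercept, stimulus lags s_{ell-k}, history n_{ell-j}."""
    stim = [s[ell - k] if ell - k >= 0 else 0.0 for k in range(K + 1)]
    hist = [n[ell - j] if ell - j >= 0 else 0.0 for j in range(1, J + 1)]
    return np.array([1.0] + stim + hist)

X = np.array([design_row(ell, s, n, K, J) for ell in range(L)])

def log_likelihood(beta, X, n, delta):
    """Discrete point-process log likelihood (Eq. 7, one trial): sum of
    n_ell * log(lambda_ell * delta) - lambda_ell * delta over bins."""
    lam = np.exp(X @ beta)   # Eq. 6: conditional intensity per bin
    return np.sum(n * np.log(lam * delta) - lam * delta)

beta = np.zeros(X.shape[1])  # intercept + (K+1) stimulus + J history coefficients
print(log_likelihood(beta, X, n, delta))
```

Stacking such rows over the R trials gives the full design matrix to which the GLM machinery below is applied.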

Likelihood formulations with between-trial dependence (32) are also possible but are not considered here.

The maximum likelihood estimate of β can be computed by maximizing Eq. 7 or, equivalently, by minimizing the residual deviance defined as

$$\mathrm{Dev}(n, \beta) = -2\left[\log L(n, \beta) - \log L(n, n)\right], \qquad [8]$$

where n = (n_1, ..., n_R) and L(n, n) is the saturated model, or the highest possible value of the maximized log likelihood (26). Maximizing log L(n, β) to compute the maximum likelihood estimate of β is equivalent to minimizing the deviance, because L(n, n) is a constant. The deviance is the generalization to the GLM of the sum of squares from the linear Gaussian model (33).

As in the standard GLM framework, these computations are carried out efficiently using iteratively reweighted least squares. In our PP-GLM likelihood analyses, we use Akaike's Information Criterion (AIC) to help choose the order of the discrete kernels β_H and β_S (34). We use the time-rescaling theorem and Kolmogorov−Smirnov (KS) plots (35), along with analyses of the Gaussian transformed interspike intervals, to assess model goodness of fit (36). We perform the AIC and time-rescaling goodness-of-fit analyses using cross-validation, fitting the model to half of the trials in the experiments (training data set) and then evaluating the AIC and KS plots on the second half of the trials (test data set). The model selection and goodness-of-fit assessments are crucial parts of the SNR analyses. They allow us to evaluate whether our key assumption is valid, that is, that the conditional intensity function can be represented as a finite-order Volterra series whose second-order terms can be neglected. Significant lack of fit could suggest that this assumption does not hold and would thereby weaken, if not invalidate, any subsequent inferences and analyses.
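The iteratively reweighted least squares fit and the residual deviance of Eq. 8 can be sketched for a generic Poisson GLM with log link, the same machinery the PP-GLM applies per bin. The synthetic data, helper functions, and parameter values here are illustrative assumptions, not the authors' code:

```python
import numpy as np

rng = np.random.default_rng(4)

# Synthetic binned counts: one covariate drives mu = exp(X @ beta).
N = 500
X = np.column_stack([np.ones(N), rng.normal(size=N)])
beta_true = np.array([-3.0, 0.8])
y = rng.poisson(np.exp(X @ beta_true))   # mostly 0s and 1s, like small-bin spike counts

def irls_poisson(X, y, n_iter=25):
    """Iteratively reweighted least squares for a Poisson GLM with log link."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = np.exp(X @ beta)
        W = mu                               # Poisson working weights
        z = X @ beta + (y - mu) / mu         # working response for the log link
        WX = X * W[:, None]
        beta = np.linalg.solve(X.T @ WX, WX.T @ z)
    return beta

def residual_deviance(X, y, beta):
    """Dev = -2[logL(y, beta) - logL(y, y)]; the saturated Poisson model sets mu = y."""
    mu = np.exp(X @ beta)
    term = np.where(y > 0, y * np.log(y / mu), 0.0)
    return 2.0 * np.sum(term - (y - mu))

beta_hat = irls_poisson(X, y)
dev = residual_deviance(X, y, beta_hat)
print(beta_hat, dev)
```

Fitting the model twice, once with and once without the stimulus (or history) columns of the design matrix, yields the pair of deviances that enter the SNR estimates below.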

SNR Estimates for a Single Neuron. Applying Eq. S11, we have that, for a single neuron, the SNR estimate for the signal given the spike history (biophysical properties), with the approximate bias corrections, is

$$\mathrm{SNR}_S = \frac{\mathrm{Dev}(n, \beta_0, \beta_H) - \mathrm{Dev}(n, \beta) - \dim(\beta_0) - \dim(\beta_H) + \dim(\beta)}{\mathrm{Dev}(n, \beta) + \dim(\beta)}, \qquad [9]$$

and that, for a single neuron, the SNR estimate of the spiking propensity given the signal is

$$\mathrm{SNR}_H = \frac{\mathrm{Dev}(n, \beta_S) - \mathrm{Dev}(n, \beta) - \dim(\beta_S) + \dim(\beta)}{\mathrm{Dev}(n, \beta) + \dim(\beta)}, \qquad [10]$$

where dim(β) is the dimension of, or the number of parameters in, β. Application of the stimulus activates the biophysical properties of the neuron. Therefore, to measure the effect of the stimulus, we fit the GLM with and without the stimulus and use the difference between the deviances to estimate SNR_S (Eq. 9). Similarly, to measure the effect of the spiking history, we fit the GLM with and without the spike history and use the difference between the deviances to estimate SNR_H (Eq. 10). Expressed in decibels, the SNR estimates become

$$\mathrm{SNR}^{\mathrm{dB}}_S = 10\,\log_{10}\!\left(\mathrm{SNR}_S\right) \qquad [11]$$

$$\mathrm{SNR}^{\mathrm{dB}}_H = 10\,\log_{10}\!\left(\mathrm{SNR}_H\right). \qquad [12]$$
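Given the residual deviances of the nested fits, Eqs. 9 and 11 reduce to a few lines of arithmetic. The deviance values and parameter dimensions below are made-up illustrative numbers, not results from the paper:

```python
import math

# Hypothetical deviances from nested PP-GLM fits (illustrative numbers only):
dev_without_stim = 1500.0   # Dev(n, beta_0, beta_H): history-only model
dev_full = 1450.0           # Dev(n, beta): stimulus + history model
dim_b0, dim_bH, dim_b = 1, 5, 10   # dim_b = dim_b0 + dim_bS + dim_bH

def snr_s(dev_reduced, dev_full, dim_b0, dim_bH, dim_b):
    """Bias-corrected SNR of the stimulus given spike history (Eq. 9)."""
    num = dev_reduced - dev_full - dim_b0 - dim_bH + dim_b
    return num / (dev_full + dim_b)

snr = snr_s(dev_without_stim, dev_full, dim_b0, dim_bH, dim_b)
snr_db = 10.0 * math.log10(snr)   # Eq. 11
print(round(snr_db, 1))           # → -14.3
```

SNR_H (Eqs. 10 and 12) follows the same pattern with the deviance of the model fit without the spike history in place of `dev_without_stim`.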

Applications

Stimulus−Response Neurophysiological Experiments. To illustrate our method, we analyzed neural spiking activity data from stimulus−response experiments in four neural systems. The stimulus applied in each experiment is a standard one for the neural system being studied. The animal protocols executed in experiments 1–3 were approved by the Institutional Animal Care and Use Committees at the University of Michigan for the guinea pig studies, the University of Pittsburgh for the rat studies, and New York University for the monkey studies. The human studies in experiment 4 were approved by the Human Research Committee at Massachusetts General Hospital.

Fig. 1. Raster plots of neural spiking activity. (A) Forty trials of spiking activity recorded from a neuron in the primary auditory cortex of an anesthetized guinea pig in response to a 200 μs/phase biphasic electrical pulse applied in the inferior colliculus at time 0. (B) Fifty trials of spiking activity from a rat thalamic neuron recorded in response to a 50 mm/s whisker deflection repeated eight times per second. (C) Twenty-five trials of spiking activity from a monkey hippocampal neuron recorded while executing a location scene association task. (D) Forty trials of spiking activity recorded from a subthalamic nucleus neuron in a Parkinson's disease patient before and after a hand movement in each of four directions (dir.): up (dir. U), right (dir. R), down (dir. D), and left (dir. L).




In experiment 1 (Fig. 1A), neural spike trains were recorded from 12 neurons in the primary auditory cortex of anesthetized guinea pigs in response to a 200 μs/phase biphasic electrical pulse of 44.7 μA applied in the inferior colliculus (10). Note that the neural recordings were generally multiunit responses recorded at 12 sites, but we refer to them as neurons in this paper. The stimulus was applied at time 0, and spiking activity was recorded from 10 ms before the stimulus to 50 ms after the stimulus during 40 trials. In experiment 2, neural spiking activity was recorded in 12 neurons from the ventral posteromedial (VPm) nucleus of the thalamus (VPm thalamus) in rats in response to whisker stimulation (Fig. 1B) (11). The stimulus was deflection of the whisker at a velocity of 50 mm/s at a repetition rate of eight deflections per second. Each deflection was 1 mm in amplitude and began from the whiskers' neutral position as the trough of a single sine wave and ended smoothly at the same neutral position. Neural spiking activity was recorded for 3,000 ms across 51 trials.

In experiment 3 (Fig. 1C), neural spiking activity was recorded in 13 neurons in the hippocampus of a monkey executing a location scene association task (13). During the experiment, two to four novel scenes were presented along with two to four well-learned scenes in an interleaved random order. Each scene was presented for between 25 and 60 trials. In experiment 4, the data were recorded from 10 neurons in the subthalamic nucleus of human Parkinson's disease patients (Fig. 1D) executing a directed movement task (15). The four movement directions were up, down, left, and right. The neural spike trains were recorded in 10 trials per direction, beginning 200 ms before the movement cue and continuing to 200 ms after the cue.

The PP-GLM was fit to the spike trains of each neuron using likelihood analyses as described above. Examples of the model goodness of fit for a neuron from each system are shown in Supporting Information. Examples of the model estimates of the stimulus and history effects for a neuron from each system are shown in Fig. 2.

SNR of Single Neurons. We found that the SNR^{dB}_S estimates (Eq. 11) of the stimulus, controlling for the effect of the biophysical model properties, were (median [minimum, maximum]): −6 dB [−10 dB, −3 dB] for guinea pig auditory cortex neurons; −9 dB [−18 dB, −7 dB] for rat thalamic neurons; −20 dB [−28 dB, −14 dB] for the monkey hippocampus; and −23 dB [−29 dB, −20 dB] for human subthalamic neurons (Fig. 3, black bars). The higher SNRs (from Eq. 11) in experiments 1 and 2 (Fig. 3 A and B) are consistent with the fact that the stimuli are explicit, i.e., an electrical current and mechanical displacement of the whisker, respectively, and that the recording sites are only two synapses away from the stimulus. It is also understandable that the SNRs are smaller for the hippocampal and subthalamic systems, in which the stimuli are implicit, i.e., behavioral tasks (Fig. 3 C and D).

We found that the SNR^{dB}_H estimates (from Eq. 12) of the biophysical properties, controlling for the stimulus effect, were: 2 dB [−9 dB, 7 dB] for guinea pig auditory cortex; −13 dB [−22 dB, −8 dB] for rat thalamic neurons; −15 dB [−24 dB, −11 dB] for the monkey hippocampal neurons; and −12 dB [−16 dB, −5 dB] for human subthalamic neurons (Fig. 3, gray bars). They were greater than SNR^{dB}_S for the guinea pig auditory cortex (Fig. 3A), the monkey hippocampus (Fig. 3C), and the human subthalamic experiments (Fig. 3D), suggesting that the intrinsic spiking propensities of neurons are often greater than the spiking propensity induced by applying a putatively relevant stimulus.

A Simulation Study of Single-Neuron SNR Estimation. To analyze the performance of our SNR estimation paradigm, we studied simulated spiking responses of monkey hippocampal neurons with specified stimulus and history dynamics. We assumed four known SNRs of −8.3 dB, −17.4 dB, −28.7 dB, and −∞ dB corresponding, respectively, to stimulus effects on spike rates with ranges of 500, 60, 10, and 0 spikes per second (Fig. 4, row 1). For each of the stimulus SNRs, we assumed spike history dependence (Fig. 4, row 2) similar to that of the neuron in Fig. 1C. For each of the four stimulus effects, we simulated 300 experiments, each consisting of 25 trials (Fig. 4, row 3). To each of the 300 simulated data sets at each SNR level, we applied our SNR estimation paradigm: model

Fig. 2. Stimulus and history component estimates from the PP-GLM analyses of the spiking activity in Fig. 1. (A) Guinea pig primary auditory cortex neuron. (B) Rat thalamic neuron. (C) Monkey hippocampal neuron. (D) Human subthalamic nucleus neuron. The stimulus component (Upper) is the estimated stimulus-induced effect on the spike rate in A, C, and D and the impulse response function of the stimulus in B. The history components (Lower) show the modulation constant of the spike firing rate.

Fig. 3. KL-based SNR for (A) 12 guinea pig auditory cortex neurons, (B) 12 rat thalamus neurons, (C) 13 monkey hippocampal neurons, and (D) 10 subthalamic nucleus neurons from a Parkinson's disease patient. The black dots are SNR_dB^S, the SNR estimates due to the stimulus correcting for the spiking history. The black bars are the 95% bootstrap confidence intervals for SNR_dB^S. The gray dots are SNR_dB^H, the SNR estimates due to the intrinsic biophysics of the neuron correcting for the stimulus. The gray bars are the 95% bootstrap confidence intervals for SNR_dB^H. The encircled points are the SNR and 95% confidence intervals for the neural spike train raster plots in Fig. 1.

7144 | www.pnas.org/cgi/doi/10.1073/pnas.1505545112 Czanner et al.


fitting, model order selection, goodness-of-fit assessment, and estimation of SNR_dB^S (Fig. 4, row 4) and SNR_dB^H (Fig. 4, row 5).
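The simulation protocol above, Bernoulli spiking in 1-ms bins whose log conditional intensity sums a background, a stimulus, and a spike-history term, can be sketched as follows. This is a minimal illustration only: the base rate, the stimulus bump, and the history weights are invented stand-ins, not the values used in the paper's simulations.

```python
# Hypothetical sketch of the simulation protocol: Bernoulli spiking in 1-ms
# bins, with log conditional intensity = background + stimulus + history.
# All parameter values are invented for illustration.
import math
import random

random.seed(1)

DT = 0.001        # bin width: 1 ms
N_BINS = 1000     # one 1-s trial
N_TRIALS = 25     # trials per simulated experiment

def stimulus_effect(t):
    """Smooth stimulus-driven bump in the log rate (invented shape)."""
    return 1.5 * math.exp(-((t - 500) ** 2) / (2 * 80.0 ** 2))

# History weights for lags 1..5 ms: strong refractoriness at short lags,
# then a mild rebound (invented values).
HISTORY = [-5.0, -2.0, -1.0, 0.3, 0.2]

def simulate_trial(base_rate=10.0):
    """Draw one spike train from the discrete-time conditional intensity."""
    spikes, spike_bins = [], set()
    for t in range(N_BINS):
        log_lam = math.log(base_rate) + stimulus_effect(t)
        log_lam += sum(w for lag, w in enumerate(HISTORY, 1)
                       if (t - lag) in spike_bins)
        p_spike = min(1.0, math.exp(log_lam) * DT)   # P(spike in bin) ~ lambda*dt
        if random.random() < p_spike:
            spikes.append(t)
            spike_bins.add(t)
    return spikes

experiment = [simulate_trial() for _ in range(N_TRIALS)]
total_spikes = sum(len(trial) for trial in experiment)
```

Repeating this over 300 experiments per SNR level and fitting the PP-GLM to each data set reproduces the structure, though not the exact parameters, of the study described above.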

The bias-corrected SNR estimates show symmetric spread around their true SNRs, suggesting that the approximate bias correction performed as postulated (Fig. 4, rows 4 and 5). The exception is the case in which the true SNR was −∞ and our paradigm estimates SNR_dB^S as large negative numbers (Fig. 4D, row 4). The SNR_dB^S estimates are of similar magnitude to the SNR estimates in actual neurons (e.g., SNR = −18.1 dB for the third neuron in Fig. 3C versus −17.4 dB for the simulated neuron in Fig. 4B).

A Simulation Study of SNR Estimation for Single Neurons with No History Effect. We repeated the simulation study with no spike history dependence for the true SNR values of −1.5 dB, −16.9 dB, −27.9 dB, and −∞ dB, with 25 trials per experiment and 300 realizations per experiment (Fig. 5). Removing the history dependence makes the simulated data within and between trials independent realizations from an inhomogeneous Poisson process. The spike counts across trials within a 1-ms bin obey a binomial model with n = 25 and the probability of a spike defined by the values of the true conditional intensity function times 1 ms. Hence, it is possible to compute analytically the SNR and the bias in the estimates. We used our paradigm to compute SNR_dB^S. For comparison, we also computed the variance-based SNR proposed by Lyamzin et al. (20). Both SNR_dB^S and the variance-based estimates were computed from the parameters obtained from the same GLM fits (see Eq. S16). For each simulation in Fig. 5, the true SNR value based on our paradigm is shown (vertical lines).

The histograms of SNR_dB^S (Fig. 5, row 3) are spread symmetrically about the true expected SNR. The variance-based SNR estimate overestimates the true SNR in Fig. 5A and underestimates the true SNR in Fig. 5 B and C. These simulations illustrate that the variance-based SNR is a less refined measure of uncertainty, as it is based on only the first two moments of the spiking data, whereas our estimate is based on the likelihood, which uses information from all of the moments. At best, the variance-based SNR estimate can provide a lower bound for the information content in non-Gaussian systems (16). Variance-based SNR estimators can be improved by using information from higher-order moments (37), which is, effectively, what our likelihood-based SNR estimators do.

Discussion

Measuring the SNR of Single Neurons. Characterizing the reliability with which neurons represent and transmit information is an important question in computational neuroscience. Using the PP-GLM framework, we have developed a paradigm for estimating the SNR of single neurons recorded in stimulus-response experiments. To formulate the GLM, we expanded the log of the conditional intensity function in a Volterra series (Eq. 4) to represent, simultaneously, background spiking activity, the stimulus or signal effect, and the intrinsic dynamics of the neuron. In the application of the methods to four neural systems, we found that the SNRs of neuronal responses (Eq. 11) to putative stimuli (signals) ranged from −29 dB to −3 dB (Fig. 1). In addition, we showed that the SNR of the intrinsic dynamics of the neuron (Eq. 12) was frequently higher than the SNR of the stimulus (Eq. 11). These results are consistent with the well-known observation that, in general, neurons respond weakly to putative stimuli (16, 20).

Our approach derives a definition of the SNR appropriate for neural spiking activity modeled as a point process. Therefore, it offers important improvements over previous work in which the SNR estimates have been defined as upper bounds derived from Gaussian approximations or using Fano factors and coefficients of variation applied to spike counts. Our SNR estimates are straightforward to compute using the PP-GLM framework (5) and public domain software that is readily available (38). Therefore, they can be computed as part of standard PP-GLM analyses.

The simulation study (Fig. 5) showed that our SNR methods provide a more accurate SNR estimate than the recently reported variance-based SNR estimate derived from a local Bernoulli model (20). In making the comparison between the two SNR estimates, we derived the exact prediction error ratios analytically, and we used the same GLM fit to the simulated data to construct the SNR estimates. As a consequence, the differences are due only to differences in the definitions of the SNR. The more accurate

Fig. 4. KL-based SNR of simulated neurons with stimulus and history components. The stimulus components were set at four different SNRs: (A) −8.3 dB, (B) −17.4 dB, (C) −28.7 dB, and (D) −∞ dB, where the same spike history component was used in each simulation. For each SNR level, 300 25-trial simulations were performed. Shown are (row 1) the true signal; (row 2) the true spike history component; (row 3) a raster plot of a representative simulated experiment; (row 4) histogram of the 300 SNR_dB^S, the SNR estimates due to the stimulus correcting for the spiking history; and (row 5) histogram of the 300 SNR_dB^H, the SNR estimates due to the intrinsic biophysics of the neuron correcting for the stimulus. The vertical lines in rows 4 and 5 are the true SNRs.

Fig. 5. A comparison of SNR estimation in simulated neurons. The stimulus components were set at three different SNRs: (A) −1.5 dB, (B) −16.9 dB, and (C) −27.9 dB, with no history component. For each SNR level, 300 25-trial simulations were performed. Shown are (row 1) the true signal; (row 2) a raster plot of a representative simulated experiment; (row 3) histogram of the 300 KL-based SNR estimates, SNR_dB^S; and (row 4) histogram of the 300 squared-error-based SNR estimates, SNR_dB^SE (20). The vertical lines in rows 3 and 4 are the true SNRs.

Czanner et al. PNAS | June 9, 2015 | vol. 112 | no. 23 | 7145



performance of our SNR estimate is attributable to the fact that it is based on the likelihood, whereas the variance-based SNR estimate uses only the first two sample moments of the data. This improvement is no surprise, as it is well known that likelihood-based estimates offer the best information summary in a sample given an accurate or approximately correct statistical model (34). We showed that, for each of the four neural systems, the PP-GLM accurately described the spike train data in terms of goodness-of-fit assessments.

A General Paradigm for SNR Estimation. Our SNR estimation paradigm generalizes the approach commonly used to analyze SNRs in linear Gaussian systems. We derived the generalization by showing that the commonly computed SNR statistic estimates a ratio of EPEs (Supporting Information): the expected prediction error of the signal representing the data, corrected for the nonsignal covariates, relative to the EPE of the system noise. With this insight, we used the work of ref. 26 to extend the SNR definition to systems that can be modeled using the GLM framework, in which the signal and relevant covariates can be expressed as separate components of the likelihood function. The linear Gaussian model is a special case of a GLM. In the GLM paradigm, the sum of squares from the standard linear Gaussian model is replaced by the residual deviance (Eq. S10). The residual deviance may be viewed as an estimated KL divergence between data and model (26). To improve the accuracy of our SNR estimator, particularly given the low SNRs of single neurons, we devised an approximate bias correction, which adjusts separately the numerator and the denominator (Eqs. 9 and 10). The bias-corrected estimator performed well in the limited simulation study we reported (Figs. 4 and 5). In future work, we will replace the separate bias corrections for the numerator and denominator with a single bias correction for the ratio, and extend our paradigm to characterize the SNR of neuronal ensembles and those of other non-Gaussian systems.

In Supporting Information, we describe the relationship between our SNR estimate and several commonly used quantities in statistics, namely R2, the coefficient of determination; the F statistic; the likelihood ratio (LR) test statistic; and f2, Cohen's effect size. Our SNR analysis offers an interpretation of the F statistic that is not, to our knowledge, commonly stated: the F statistic may be viewed as a scaled estimate of the SNR for the linear Gaussian model, where the scale factor is the ratio of the degrees of freedom (Eq. S21). The numerator of our GLM SNR estimate (Eq. S9) is an LR test statistic for assessing the strength of the association between data Y and covariates X2. The generalized SNR estimator can be seen as a generalized effect size. This observation is especially important because it can be further developed for planning neurophysiological experiments, and thus may offer a way to enhance experimental reproducibility in systems neuroscience research (39).

In summary, our analysis provides a straightforward way of assessing the SNR of single neurons. By generalizing the standard SNR metric, we make explicit the well-known fact that individual neurons are noisy transmitters of information.

ACKNOWLEDGMENTS. This research was supported in part by the Clinical Eye Research Centre, St. Paul's Eye Unit, Royal Liverpool and Broadgreen University Hospitals National Health Service Trust, United Kingdom (G.C.); the US National Institutes of Health (NIH) Biomedical Research Engineering Partnership Award R01-DA015644 (to E.N.B. and W.A.S.), Pioneer Award DP1 OD003646 (to E.N.B.), and Transformative Research Award GM104948 (to E.N.B.); and NIH grants that supported the guinea pig experiments, P41 EB2030 and T32 DC00011 (to H.H.L.).

1. Chen Y, Beaulieu N (2007) Maximum likelihood estimation of SNR using digitally modulated signals. IEEE Trans Wirel Comm 6(1):210–219.
2. Kay S (1993) Fundamentals of Statistical Signal Processing (Prentice-Hall, Upper Saddle River, NJ).
3. Welvaert M, Rosseel Y (2013) On the definition of signal-to-noise ratio and contrast-to-noise ratio for FMRI data. PLoS ONE 8(11):e77089.
4. Mendel J (1995) Lessons in Estimation Theory for Signal Processing, Communications, and Control (Prentice-Hall, Upper Saddle River, NJ), 2nd Ed.
5. Truccolo W, Eden UT, Fellows MR, Donoghue JP, Brown EN (2005) A point process framework for relating neural spiking activity to spiking history, neural ensemble, and extrinsic covariate effects. J Neurophysiol 93(2):1074–1089.
6. Brillinger DR (1988) Maximum likelihood analysis of spike trains of interacting nerve cells. Biol Cybern 59(3):189–200.
7. Brown EN, Barbieri R, Eden UT, Frank LM (2003) Likelihood methods for neural data analysis. Computational Neuroscience: A Comprehensive Approach, ed Feng J (CRC, London), pp 253–286.
8. Brown EN (2005) Theory of point processes for neural systems. Methods and Models in Neurophysics, eds Chow CC, Gutkin B, Hansel D, Meunier C, Dalibard J (Elsevier, Paris), pp 691–726.
9. MacEvoy SP, Hanks TD, Paradiso MA (2008) Macaque V1 activity during natural vision: Effects of natural scenes and saccades. J Neurophysiol 99(2):460–472.
10. Lim HH, Anderson DJ (2006) Auditory cortical responses to electrical stimulation of the inferior colliculus: Implications for an auditory midbrain implant. J Neurophysiol 96(3):975–988.
11. Temereanca S, Brown EN, Simons DJ (2008) Rapid changes in thalamic firing synchrony during repetitive whisker stimulation. J Neurosci 28(44):11153–11164.
12. Wilson MA, McNaughton BL (1993) Dynamics of the hippocampal ensemble code for space. Science 261(5124):1055–1058.
13. Wirth S, et al. (2003) Single neurons in the monkey hippocampus and learning of new associations. Science 300(5625):1578–1581.
14. Dayan P, Abbott LF (2001) Theoretical Neuroscience (Oxford Univ Press, London).
15. Sarma SV, et al. (2010) Using point process models to compare neural spiking activity in the subthalamic nucleus of Parkinson's patients and a healthy primate. IEEE Trans Biomed Eng 57(6):1297–1305.
16. Rieke F, Warland D, de Ruyter van Steveninck RR, Bialek W (1997) Spikes: Exploring the Neural Code (MIT Press, Cambridge, MA).
17. Optican LM, Richmond BJ (1987) Temporal encoding of two-dimensional patterns by single units in primate inferior temporal cortex. III. Information theoretic analysis. J Neurophysiol 57(1):162–178.
18. Softky WR, Koch C (1993) The highly irregular firing of cortical cells is inconsistent with temporal integration of random EPSPs. J Neurosci 13(1):334–350.
19. Shadlen MN, Newsome WT (1998) The variable discharge of cortical neurons: Implications for connectivity, computation, and information coding. J Neurosci 18(10):3870–3896.
20. Lyamzin DR, Macke JH, Lesica NA (2010) Modeling population spike trains with specified time-varying spike rates, trial-to-trial variability, and pairwise signal and noise correlations. Front Comput Neurosci 4(144):144.
21. Soofi ES (2000) Principal information theoretic approaches. Am Stat 95(452):1349–1353.
22. Erdogmus D, Larsson EG, Yan R, Principe JC, Fitzsimmons JR (2004) Measuring the signal-to-noise ratio in magnetic resonance imaging: A caveat. Signal Process 84(6):1035–1040.
23. Mittlböck M, Waldhör T (2000) Adjustments for R2-measures for Poisson regression models. Comput Stat Data Anal 34(4):461–472.
24. Kent JT (1983) Information gain and a general measure of correlation. Biometrika 70(1):163–173.
25. Alonso A, et al. (2004) Prentice's approach and the meta-analytic paradigm: A reflection on the role of statistics in the evaluation of surrogate endpoints. Biometrics 60(3):724–728.
26. Hastie T (1987) A closer look at the deviance. Am Stat 41(1):16–20.
27. Simon G (1973) Additivity of information in exponential family probability laws. J Am Stat Assoc 68(342):478–482.
28. Cameron AC, Windmeijer FAG (1997) An R-squared measure of goodness of fit for some common nonlinear regression models. J Econom 77(2):329–342.
29. Paninski L (2004) Maximum likelihood estimation of cascade point-process neural encoding models. Network 15(4):243–262.
30. Marmarelis VZ (2004) Nonlinear Dynamic Modeling of Physiological Systems (John Wiley, Hoboken, NJ).
31. Plourde E, Delgutte B, Brown EN (2011) A point process model for auditory neurons considering both their intrinsic dynamics and the spectrotemporal properties of an extrinsic signal. IEEE Trans Biomed Eng 58(6):1507–1510.
32. Czanner G, Sarma SV, Eden UT, Brown EN (2008) A signal-to-noise ratio estimator for generalized linear model systems. Lect Notes Eng Comput Sci 2171(1):1063–1069.
33. McCullagh P, Nelder JA (1989) Generalized Linear Models (Chapman and Hall, New York).
34. Pawitan Y (2013) In All Likelihood: Statistical Modelling and Inference Using Likelihood (Oxford Univ Press, London).
35. Brown EN, Barbieri R, Ventura V, Kass RE, Frank LM (2002) The time-rescaling theorem and its application to neural spike train data analysis. Neural Comput 14(2):325–346.
36. Czanner G, et al. (2008) Analysis of between-trial and within-trial neural spiking dynamics. J Neurophysiol 99(5):2672–2693.
37. Sekhar SC, Sreenivas TV (2006) Signal-to-noise ratio estimation using higher-order moments. Signal Process 86(4):716–732.
38. Cajigas I, Malik WQ, Brown EN (2012) nSTAT: Open-source neural spike train analysis toolbox for Matlab. J Neurosci Methods 211(2):245–264.
39. Collins FS, Tabak LA (2014) Policy: NIH plans to enhance reproducibility. Nature 505(7485):612–613.


Supporting Information

Czanner et al. 10.1073/pnas.1505545112

What Does the SNR Estimate?

The SNR has been most studied for linear model systems Y = Xβ + ε, in which one has observations y = (y1, ..., yn)' of a random vector Y = (Y1, ..., Yn)'; Xβ is the signal; X = (x1, ..., xp) is the n × p design matrix; the xk are fixed, known vectors of covariates (k = 1, ..., p); β is a p × 1 vector of unknown coefficients; and ε is an n × 1 vector of independent, identically distributed Gaussian random errors with zero mean and variance σ²_ε. The first column of X is an n × 1 vector of 1s, denoted 1n. The unconditional mean of the random vector Y can be defined as EY = 1nβ0, where the scalar parameter β0 is typically unknown. A standard way to define the SNR is as a ratio of variances,

SNR_X = σ²_signal / σ²_noise,   [S1]

where σ²_signal is the variance of the signal, representing the expected variability in the data induced by the signal,

σ²_signal = (Xβ − 1nβ0)'(Xβ − 1nβ0),

and σ²_noise = nσ²_ε is the variance of the noise. In other words, the SNR is the true expected proportion of variance in the data due to the signal divided by the variance due to the noise.

We can obtain an alternative interpretation of the SNR (Eq. S1) if we view the two variances in terms of EPEs in the squared-error sense. The variance of the noise can be viewed as the expected error of predicting Y when using covariates X (1). That is,

σ²_noise = EPE(Y, Xβ) = E[(Y − Xβ)'(Y − Xβ)].   [S2]

Analogously, the expected error of predicting Y when using 1nβ0 is

EPE(Y, 1nβ0) = E[(Y − 1nβ0)'(Y − 1nβ0)].   [S3]

Due to the Pythagorean property of the EPE in a linear Gaussian system (1−3), at the parameter values β0 and β that minimize the EPE, the variance of the signal can be expressed as

σ²_signal = EPE(Y, 1nβ0) − EPE(Y, Xβ),

the EPE of predicting the values of Y with the overall mean, 1nβ0, minus the EPE of predicting Y with the approximating model, Xβ (4). Hence, σ²_signal is the reduction in the EPE achieved by using the covariates X. This leads to an alternative definition of the SNR as

SNR_X = [EPE(Y, 1nβ0) − EPE(Y, Xβ)] / EPE(Y, Xβ),   [S4]

which is the reduction in EPE due to the signal, divided by the EPE due to the noise, in the squared-error sense. For this reason, we refer to Eqs. S1 and S4 as a variance-based or squared-error-based SNR.

The variance-based SNR_X is the true expected SNR obtained if the parameters β and β0 that give the minimum EPEs are known. In practice, however, SNR_X is estimated by replacing the parameters β and β0 with their least-squares estimates β̂ and ȳ, respectively. This leads to the estimate of SNR_X (Eq. S4)

SNR̂_X = [SSResidual(y, 1nȳ) − SSResidual(y, Xβ̂)] / SSResidual(y, Xβ̂),   [S5]

where

SSResidual(y, 1nȳ) = (y − 1nȳ)'(y − 1nȳ),
SSResidual(y, Xβ̂) = (y − Xβ̂)'(y − Xβ̂).

In linear model analyses, SSResidual(y, 1nȳ) is the variance of the data around their estimated overall mean, and SSResidual(y, Xβ̂) is the estimated variability in the data around the estimated signal Xβ̂, i.e., the variability that is not explained by the covariates X.
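The estimator in Eq. S5 can be illustrated with a small least-squares sketch; the data below are made up, and the fit is the closed-form solution for an intercept plus a single covariate.

```python
# Hypothetical numeric sketch of Eq. S5: estimating the squared-error SNR
# from residual sums of squares in a simple linear Gaussian model.
# The data are invented for illustration.
import math

def ols_fit(x, y):
    """Closed-form least squares for y = b0 + b1 * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b1 = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
          / sum((xi - mx) ** 2 for xi in x))
    return my - b1 * mx, b1

x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [0.1, 1.2, 1.9, 3.2, 3.8, 5.1]   # roughly y = x plus noise

b0, b1 = ols_fit(x, y)
ybar = sum(y) / len(y)

ss_total = sum((yi - ybar) ** 2 for yi in y)          # SSResidual(y, 1n*ybar)
ss_resid = sum((yi - (b0 + b1 * xi)) ** 2
               for xi, yi in zip(x, y))               # SSResidual(y, X*beta-hat)

snr_hat = (ss_total - ss_resid) / ss_resid            # Eq. S5
snr_db = 10 * math.log10(snr_hat)
```

Because these invented data lie close to a line, the estimate is large and positive in decibels; weakly modulated data, such as single-neuron spike counts, give the negative decibel values reported in the main text.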

Defining the SNR for a Linear Gaussian Signal Plus Covariates Plus Noise System

If the system is driven by a signal and a nonsignal component, and if the two components can be separated by an approximate linear additive model, then the SNR definition and estimate must be modified. We assume the covariate component of the linear model Y = Xβ + ε can be partitioned as Xβ = X1β1 + X2β2, where the first component, X1β1, is a covariate not related to the signal, and the second component, X2β2, is the signal. There exist values of the vectors β, β1, and β2 that give the minimum EPEs for describing Y, in the sense of minimizing EPE(Y, Xβ), EPE(Y, X1β1), and EPE(Y, X2β2), respectively. For this case, in which only a part of the variability in the random vector Y is attributed to the signal, we can extend the SNR definition in Eq. S1 by replacing EPE(Y, 1nβ0) with EPE(Y, X1β1) to obtain

SNR_X2 = [EPE(Y, X1β1) − EPE(Y, Xβ)] / EPE(Y, Xβ),   [S6]

where the first column of X1 and of X is the vector 1n. Eq. S6 gives the expected SNR in Y about the signal, X2β2, while controlling for the effect of the nonsignal component, X1β1. The numerator in Eq. S6 is the reduction in the EPE due to the signal, X2β2, when controlling for X1β1, the systematic changes in the random vector Y unrelated to the signal, whereas the denominator is the EPE due to the noise. By analogy with Eq. S5, we can estimate the squared-error-based SNR_X2 (Eq. S6) as

SNR̂_X2 = [SSResidual(y, X1β̂1) − SSResidual(y, Xβ̂)] / SSResidual(y, Xβ̂),   [S7]

where we replace SSResidual(y, 1nȳ) with SSResidual(y, X1β̂1) in Eq. S5.

Defining the SNR for GLM Systems

The SNR definition and estimate in Eqs. S6 and S7 extend to the GLM framework, the established statistical paradigm for conducting regression analyses when data from the exponential family are observed with covariates (5). We extend the SNR to GLM systems in which the covariates may be partitioned into signal and nonsignal components by replacing the squared-error EPE in Eq. S6 with the KL EPE of Y from the approximating model, and by replacing the residual sums of squares in Eq. S7 with the residual deviances (1, 3, 5). This leads to the following KL generalization of the true SNR in Y about the signal X2, while taking into account the nonsignal effects X1, for the system approximated by the GLM,

SNR_X2 = [EPE(Y, X1β1) − EPE(Y, Xβ)] / EPE(Y, Xβ),   [S8]

and its deviance-based estimate

SNR̂_X2 = [Dev(y, X1β̂1) − Dev(y, Xβ̂)] / Dev(y, Xβ̂),   [S9]

where

EPE(Y, X1β1) = E[−2 log f(Y | X1β1)],

the expectation being taken with respect to the true generating probability distribution of the random vector Y; the "2" in the definition makes the log-likelihood loss for the Gaussian distribution match squared-error loss. For this reason, we refer to Eq. S8 as a KL-based SNR and to Eq. S9 as its KL- or deviance-based SNR estimator. The deviance is

Dev(y, Xβ̂) = −2 log [L(y, Xβ̂) / L(y, y)],   [S10]

where L(y, Xβ̂) is the likelihood evaluated at the maximum likelihood estimate β̂ of the model parameter β, and L(y, y) is the saturated likelihood, defined as the highest value of the likelihood (5).

By the Pythagorean property of the KL divergence estimate in a GLM with canonical link (1−3), the numerator in Eq. S8 is the reduction in KL EPE due to the signal, X2β2, while controlling for the effect of the nonsignal component, X1β1. The KL-based SNR_X2 has the squared-error-based SNR_X2 as a special case in which the exponential family model has the Gaussian distribution. The numerator of the SNR estimate (Eq. S9) gives the reduction in deviance due to the signal, X2β2, while controlling for the nonsignal component, X1β1. The estimates β̂ and β̂1 are computed from two separate maximum-likelihood fits of the two models to the data y (6).

We define a bias correction for the SNR estimator (Eq. S9), as this problem is especially prevalent in data with a weak signal (4, 7). By definition, the SNR estimate is always positive. Under regularity conditions, the asymptotic biases of the numerator and denominator in Eq. S9 are, respectively, dim(β1) − dim(β) and dim(β), suggesting the approximate bias-corrected SNR estimate

SNR̂_X2 = [Dev(y, X1β̂1) − Dev(y, Xβ̂) + dim(β1) − dim(β)] / [Dev(y, Xβ̂) + dim(β)].   [S11]

This SNR estimate remains biased because a ratio of unbiased estimators is not necessarily an unbiased estimator of the ratio. Our simulation studies in Fig. 4 (rows 4 and 5) suggest that the bias is small for neural spike trains (4).
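A minimal numeric sketch of Eqs. S9−S11 for Poisson counts is given below. The example is chosen so that both nested models have closed-form MLEs (the reduced, nonsignal model is intercept-only; the full model adds one condition indicator, so the fitted means are group means) and no iterative GLM fitting is needed. The counts and the two-condition design are invented.

```python
# Hypothetical sketch of Eqs. S9-S11 for Poisson count data.
# Reduced (nonsignal) model: one common mean, dim(beta1) = 1.
# Full model: one mean per condition, dim(beta) = 2.
# The counts below are made up for illustration.
import math

def poisson_deviance(y, mu):
    """Dev(y, mu) = -2 log [L(y, mu) / L(y, y)] for Poisson data (Eq. S10);
    the saturated model sets each fitted mean to the observation itself."""
    dev = 0.0
    for yi, mi in zip(y, mu):
        term = yi * math.log(yi / mi) if yi > 0 else 0.0
        dev += 2.0 * (term - (yi - mi))
    return dev

# Spike counts on "baseline" and "stimulus" trials (invented numbers).
y_base = [2, 3, 1, 2, 4, 2]
y_stim = [6, 8, 7, 5, 9, 7]
y = y_base + y_stim

mu_reduced = [sum(y) / len(y)] * len(y)                      # common mean
mu_full = ([sum(y_base) / len(y_base)] * len(y_base)
           + [sum(y_stim) / len(y_stim)] * len(y_stim))      # group means

dev_reduced = poisson_deviance(y, mu_reduced)
dev_full = poisson_deviance(y, mu_full)

dim_b1, dim_b = 1, 2
snr_raw = (dev_reduced - dev_full) / dev_full                # Eq. S9
snr_bc = ((dev_reduced - dev_full + dim_b1 - dim_b)
          / (dev_full + dim_b))                              # Eq. S11
snr_bc_db = 10 * math.log10(snr_bc)
```

The bias correction shrinks the raw estimate, as expected when the denominator deviance is small relative to the number of fitted parameters.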

Variance-Based and KL-Based SNR Are the Same in Linear Systems with Independent and Additive Gaussian Noise

We assume that y1, ..., yn is a realization of independent random variables Y1, ..., Yn from a linear regression model with means E[Yi | Xi] = Xiβ, zero covariances, and a common random error variance, σ²_ε. We also assume an overall (unconditional) mean E[Yi] = β0. Furthermore, we assume a reduced model with parameter β1, i.e., β1 ⊂ β. An example of a reduced model is a model with the overall mean, β0, representing the background firing constant (see Eq. 6), or a model with parameter vector β1 for the background firing constant and for nonsignal covariates. The full model is always the generating model or a good approximating model.

Then, under the above assumptions, the divergence between the data y1, ..., yn and the model Xβ1 is

KL(y1..yn, Xβ1) = (y − Xβ1)'(y − Xβ1),   [S12]

and, assuming the vector value β1 that minimizes the EPE, the mean is equal to

EPE_KL(Y1..Yn, X1β1) = E[KL(Y1..Yn, Xβ1)] = Σi E(Yi − Xiβ1)² = EPE_SE(Y1..Yn, X1β1),   [S13]

i.e., the KL-based EPE reduces to the squared-error-based EPE for the Gaussian linear system with independent noise. Furthermore,

Σi E(Yi − Xiβ1)² = Σi E(Yi − Xiβ)² + Σi (Xiβ − Xiβ1)² = nσ²_ε + Σi (Xiβ − Xiβ1)²,   [S14]

where Yi and Xi are the ith component of Y and the ith row of X, respectively. Hence, for a linear Gaussian system, we have EPE_KL(Y1..Yn, X1β1) = nσ²_ε + Σi (Xiβ − Xiβ1)², with the special case β1 = β giving EPE_KL(Y1..Yn, Xβ) = nσ²_ε = σ²_noise. If we substitute this into Eq. S6, we obtain

SNR_X1 = [EPE_KL(Y, X1β1) − EPE_KL(Y, Xβ)] / EPE_KL(Y, Xβ)
       = [nσ²_ε + Σi (Xiβ − Xiβ1)² − nσ²_ε] / (nσ²_ε)
       = (Xβ − 1nβ0)'(Xβ − 1nβ0) / σ²_noise

when the reduced model is the overall mean, X1β1 = 1nβ0. That is,

SNR_KL,X1 = SNR_SE,X1   [S15]

in systems that are linear with additive, independent, and Gaussian noise. Lastly, for completeness, we note here that the scale parameter of a linear Gaussian system is φ = σ²_ε.
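The equality in Eq. S15 can be checked numerically: for a linear Gaussian model with known parameters, the KL-based expected prediction errors evaluated via Eq. S14 reduce to the squared-error quantities, so the two SNR definitions agree exactly. The design, coefficients, and noise variance below are invented for illustration.

```python
# Hypothetical numeric check of Eq. S15 (SNR_KL = SNR_SE in a linear
# Gaussian system), with all parameter values invented.
n = 8
x = [float(i) for i in range(n)]          # single covariate plus intercept
beta0_full, beta1_full = 1.0, 0.5         # true full-model coefficients
sigma2 = 0.25                             # per-observation noise variance

mu_full = [beta0_full + beta1_full * xi for xi in x]     # X*beta
mu_reduced = [sum(mu_full) / n] * n                      # 1n*beta0 (overall mean)

# Squared-error SNR (Eq. S1): signal variance over noise variance.
sig2_signal = sum((mf - mr) ** 2 for mf, mr in zip(mu_full, mu_reduced))
sig2_noise = n * sigma2
snr_se = sig2_signal / sig2_noise

# KL-based SNR evaluated analytically via Eq. S14:
# EPE_KL(Y, 1n*beta0) = n*sigma2 + sum_i (X_i*beta - beta0)^2,
# EPE_KL(Y, X*beta)  = n*sigma2.
epe_reduced = n * sigma2 + sig2_signal
epe_full = n * sigma2
snr_kl = (epe_reduced - epe_full) / epe_full

assert abs(snr_se - snr_kl) < 1e-12       # Eq. S15 holds exactly
```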

Variance-Based and KL-Based SNR Are Not the Same for Independent Binomial Observations

We assume that the data y1, ..., yL are recorded at 1-ms resolution and that they are realizations of independent random variables Y1, ..., YL from a binomial distribution with parameters K and pl, l = 1, ..., L, i.e., their means are K × pl and their variances are K × pl × (1 − pl). Then the overall expected probability of an event (such as a spike) is p̄ = L⁻¹ Σl pl, and the total variance is Σl Var(Yl) = Σl K × pl × (1 − pl); hence the squared-error-based SNR (Eqs. S1 and S4) can be shown to be

SNR = Σl (K × pl − K × p̄)² / Σl K × pl × (1 − pl).   [S16]

The formula in Eq. S16 can be used to calculate the SNR for spike trains when the spike trains are independent across trials and the times of spikes are independent within each trial, such as when there is no spike history dependence. Then, one can summarize the data in a 1-ms peristimulus time histogram, which can be seen as a realization of independent binomial random variables. The SNR numerator in Eq. S16 is then the variance of the signal, and the denominator contains the sum of the variances of the binomial random variables across the L bins. This idea was used in ref. 8, and it was extended to incorporate the spike history, which was estimated in a sequential manner rather than in one single analysis. Nevertheless, our simulations in Fig. 5 indicate that the expected variance-based SNR (Eq. S16) is smaller than the KL-based expected SNR (Eq. S8).
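Eq. S16 can be sketched directly for a peristimulus time histogram. The per-bin spike probabilities and the trial count below are invented: a smooth bump on a low baseline stands in for a stimulus-driven rate modulation.

```python
# Hypothetical sketch of Eq. S16: the variance-based SNR for a PSTH
# treated as independent binomial counts. The probabilities p_l and the
# trial count K are invented for illustration.
import math

K = 25                                    # trials per experiment
# Per-bin spike probabilities over a 100-bin (100-ms) window.
p = [0.01 + 0.08 * math.exp(-((l - 50) ** 2) / (2 * 10.0 ** 2))
     for l in range(100)]

p_bar = sum(p) / len(p)                   # overall spike probability
signal_var = sum((K * pl - K * p_bar) ** 2 for pl in p)   # Eq. S16 numerator
noise_var = sum(K * pl * (1 - pl) for pl in p)            # Eq. S16 denominator

snr = signal_var / noise_var              # Eq. S16
snr_db = 10 * math.log10(snr)
```

Even this fairly strong invented modulation yields an SNR below one, i.e., a negative decibel value, consistent with the low single-neuron SNRs reported in the main text.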

Variance-Based SNR and the Coefficient-of-DeterminationIn linear models with Gaussian noise, the coefficient of deter-mination, R2, is a commonly used measure of the fit of the modelto the data. The coefficient of determination ranges from 0 to 1,with 1 indicating perfect fit. Specifically,

R2 =

�1ny−X β

�T�1ny−X β

�ðy− 1nyÞTðy− 1nyÞ

=SSResidualðy, 1nyÞ− SSResidual

�y,X β

�SSResidualðy, 1nyÞ , [S17]

i.e., the numerators of SNR estimator (see Eq. S5) and of R2 arethe same, and it is the sum-of-squares explained by the model(i.e., the signal), and it is often referred to as SSModel orSSRegression in statistical software output. The denominators of R2

and SNR are different. The denominator in R2 is the sum of squaresaround the grand mean, SSResidualðy, 1nyÞ, representing the totalvariability in the data and hence often referred to as SSTotal. Onthe other hand, the variability of the data around the estimated linearfunction is summarized in the term SSResidualðy,X βÞ, which is oftenreferred to in the statistical software as SSResidual. In summary, theR2 can be written as

R2 =SSModelSSTotal

=SSModel

SSModel+ SSResidual, [S18]

and we have that

SNR=SSModelSSResidual

.

It follows that

1�R2 =

SSModel+ SSResidualSSModel

= 1+SSResidualSSModel

= 1+ 1�SNR

and that

SNR=

8>>>><>>>>:

R2

1−R2   if   R2 ≠ 1

Inf   if   R2 = 1

0  if   R2 = 0

[S19]

SNRdB =

8>>>>>>><>>>>>>>:

10log10

�R2

1−R2

�  if   R2 ≠ 1

Inf   if   R2 = 1

−Inf   if   R2 = 0

0  if   R2 = 0.5

. [S20]

Hence, by Eq. S20, the squared-error-based SNR is an increasing function of R² (Fig. S1). Furthermore, both quantities, R² and SNR, decrease with increasing levels of noise (Fig. S2).

A well-known problem with R² is that it always increases, even if unimportant covariates are added to the model. Hence an adjusted R² was proposed (6, 9) that adjusts for the number of explanatory terms in a model. Unlike R², the adjusted R² increases only if the new term improves the model more than would be expected by chance. The adjusted R² can be negative, just like the bias-adjusted SNR, and will always be less than or equal to R². Whereas R² is a measure of fit, the adjusted R² is used for comparison of nested models and for feature (i.e., variable) selection in model building and machine learning. By analogy, the adjusted SNR can also be used for feature selection in biological systems to quantify the amount of information in features.

There are many generalizations of R² for GLM models (called pseudo-R²). Some generalizations are based on likelihoods (9, 10). Their bias-adjusted versions for independent data are known and implemented in statistical software (e.g., the statistical software R). These bias-adjusted pseudo-R² measures can be used directly to obtain the bias-adjusted SNR via Eq. S20. However, even if an unbiased R² estimate is used in Eq. S20 under the assumption that the data are independent, the SNR estimate can still be biased, because a ratio of unbiased estimates is not necessarily an unbiased estimate.
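The R²-to-SNR mapping of Eqs. S19 and S20 can be sketched in a few lines of Python (an illustration, not code from the paper; the function names `r2_to_snr` and `r2_to_snr_db` are ours):

```python
import math

def r2_to_snr(r2):
    """Variance-based SNR from R-squared (Eq. S19)."""
    if r2 == 1.0:
        return math.inf
    return r2 / (1.0 - r2)

def r2_to_snr_db(r2):
    """SNR in decibels from R-squared (Eq. S20)."""
    if r2 == 1.0:
        return math.inf
    if r2 == 0.0:
        return -math.inf
    return 10.0 * math.log10(r2 / (1.0 - r2))

# R-squared of 0.5 splits the variance evenly between signal and noise,
# so SNR = 1, i.e., 0 dB.
print(r2_to_snr(0.5), r2_to_snr_db(0.5))   # 1.0 0.0
# R-squared of 0.09, as in Fig. S2D, maps to about -10 dB.
print(round(r2_to_snr_db(0.09), 1))        # -10.0
```

This makes the monotone relationship of Fig. S1 easy to verify numerically at any R² value.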

Variance-Based SNR and F-Test Statistic

In linear regression models with independent Gaussian errors, the F test is commonly used to evaluate the importance of a set of covariates, X, in explaining the variability of the dependent variable, Y. The F-test statistic has the form

\[
F = \frac{\mathrm{SSModel}/df(\mathrm{Model})}{\mathrm{SSResidual}/df(\mathrm{Residual})},
\]

where $df(\mathrm{Model}) = k - 1$ and $df(\mathrm{Residual}) = n - k - 1$ are the degrees of freedom of the model and of the residuals, and k is the number of covariates (i.e., the number of columns of X). Hence, using Eq. S5,

\[
F = \widehat{\mathrm{SNR}} \times \frac{df(\mathrm{Residual})}{df(\mathrm{Model})},
\tag{S21}
\]

i.e., the bias-unadjusted SNR estimate (Eq. S5) is a multiple of the F statistic.

If there is no signal, then σ²_signal/σ²_noise = 0, i.e., SNR = 0 (in Eq. S1). In this case, none of the covariates in the matrix X is related to Y; in other words, the true generating model is a model with a constant only. The F statistic then has a central Fisher distribution with degrees of freedom df(Model) and df(Residual). It is easy to see that the mean of the F statistic (if df(Residual) > 2) is

\[
E(F) = \frac{df(\mathrm{Residual})}{df(\mathrm{Residual}) - 2},
\]

and hence, when there is no signal, it follows from Eq. S21 and the properties of the central F distribution that the mean of the variance-based SNR estimate is

\[
E\left(\widehat{\mathrm{SNR}}\right) = \frac{df(\mathrm{Residual})}{df(\mathrm{Residual}) - 2} \times \frac{df(\mathrm{Model})}{df(\mathrm{Residual})} = \frac{df(\mathrm{Model})}{df(\mathrm{Residual}) - 2},
\]

while the true SNR = 0; hence the bias of the SNR estimate is $df(\mathrm{Model})/[df(\mathrm{Residual}) - 2]$, which converges to zero as the ratio of the data size to the number of parameters becomes large.

In the general case, when the true variance-based SNR ≠ 0, the associated F statistic (Eq. S21) has a noncentral Fisher distribution with degrees of freedom df(Model) and df(Residual) and with noncentrality parameter σ²_signal/σ²_ε = n × SNR. In such a case, it can be shown that

Czanner et al. www.pnas.org/cgi/content/short/1505545112 3 of 5


\[
E\left(\widehat{\mathrm{SNR}}\right) = E(F) \times \frac{df(\mathrm{Model})}{df(\mathrm{Residual})} = \frac{df(\mathrm{Model})}{df(\mathrm{Residual}) - 2} + \mathrm{SNR} \times \frac{n}{df(\mathrm{Residual}) - 2},
\]

and confidence intervals for the SNR can be constructed using quantiles of the noncentral Fisher distribution (11).

The equivalent theory for bias correction and confidence intervals of the SNR estimate is not available in GLM models with history dependence. Therefore, here we offered a simple bias correction (Eq. S11) that removes some of the bias, and we showed that it can work well in simulations. Moreover, it can be proved that our bias correction is asymptotically equivalent to the bias correction above for independent data from a linear Gaussian model.
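The null-case bias $df(\mathrm{Model})/[df(\mathrm{Residual}) - 2]$ can be checked with a small Monte Carlo sketch (our illustration, not the paper's simulation code; `snr_hat` is a hypothetical helper implementing the Eq. S5 ratio for a simple linear regression with intercept and one covariate, so df(Model) = 1 and df(Residual) = n − 2):

```python
import random

def snr_hat(x, y):
    """Bias-unadjusted variance-based SNR estimate (Eq. S5):
    SSModel / SSResidual for a least-squares fit of y on x."""
    n = len(y)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    ss_total = sum((yi - ybar) ** 2 for yi in y)
    ss_model = sxy ** 2 / sxx              # explained sum of squares
    return ss_model / (ss_total - ss_model)

random.seed(1)
n, reps = 50, 5000
x = [random.gauss(0, 1) for _ in range(n)]

# No signal: y is pure noise, so the true SNR is 0 and any positive
# average of the estimates is bias.
mean_snr = sum(snr_hat(x, [random.gauss(0, 1) for _ in range(n)])
               for _ in range(reps)) / reps

# Predicted bias: df(Model) / (df(Residual) - 2), with df(Model) = 1
# and df(Residual) = n - 2 (intercept plus one covariate).
predicted = 1.0 / ((n - 2) - 2)
print(mean_snr, predicted)   # both should be close to 1/46, about 0.022
```

The Monte Carlo mean tracks the predicted bias, and shrinking it by growing n relative to the number of parameters mirrors the convergence statement above.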

SNR and LR Test

The concept of SNR is also related to the concept of the likelihood ratio (LR) test (5). Specifically, the scaled numerator of the generalized SNR estimate (Eq. S9) is an LR test statistic for testing the association between the covariates X and the variable Y in GLMs. Under independence of the observations, the LR test statistic has asymptotically a χ² distribution with degrees of freedom equal to the number of estimated parameters associated with the covariates. Hence, low values of the LR statistic lead to the conclusion that there is not enough evidence for the association, which corresponds to low values of the SNR estimate (Eq. S9).
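For a Gaussian linear model, the link between the LR statistic and the SNR estimate can be illustrated with a minimal sketch (our example, not the paper's PP-GLM code; it assumes the standard Gaussian LR statistic n·log(SS0/SS1) for nested least-squares fits):

```python
import math
import random

random.seed(2)
n = 100
x = [random.gauss(0, 1) for _ in range(n)]
y = [1.0 * xi + random.gauss(0, 1) for xi in x]   # strong simulated signal

# Null model (constant only): sum of squares around the grand mean.
ybar = sum(y) / n
ss0 = sum((yi - ybar) ** 2 for yi in y)

# Alternative model: least-squares fit of y on x.
xbar = sum(x) / n
beta = (sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
        / sum((xi - xbar) ** 2 for xi in x))
ss1 = sum((yi - ybar - beta * (xi - xbar)) ** 2 for xi, yi in zip(x, y))

# Gaussian LR statistic for the nested models; asymptotically chi-squared
# with 1 degree of freedom under the null (95% critical value about 3.84).
lr = n * math.log(ss0 / ss1)
snr_estimate = (ss0 - ss1) / ss1   # a large LR goes with a large SNR estimate

print(lr, snr_estimate)
```

With a strong simulated signal, both the LR statistic and the SNR estimate are large; with no signal, both shrink together, which is the correspondence described above.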

Variance-Based SNR and Effect Size for Linear Regression

Another related measure is effect size. Cohen's effect size for linear regression models (6, 12), defined as f² = R²/(1 − R²), is the same as the squared-error-based SNR in Eq. S5. Cohen's f² is not typically reported in studies, but it is often used for sample size calculations in linear regression. For linear regressions, effect sizes of 0.02, 0.15, and 0.35 are considered small, medium, and large, respectively. These three effect sizes correspond to R² values of 0.02, 0.13, and 0.26 and to SNRs of −17 dB, −8.2 dB, and −4.6 dB, which are consistent with the SNR values that we reported for some of the neurons.
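The correspondence between Cohen's benchmarks and the decibel scale can be verified directly (a sketch; since f² equals the squared-error-based SNR, the conversion is just Eq. S20 applied to f²):

```python
import math

# Cohen's f^2 = R^2/(1 - R^2) coincides with the squared-error-based SNR
# (Eq. S5), so his small/medium/large benchmarks convert directly to dB.
for f2 in (0.02, 0.15, 0.35):
    r2 = f2 / (1.0 + f2)               # invert f^2 = R^2/(1 - R^2)
    snr_db = 10.0 * math.log10(f2)     # Eq. S20 applied to f^2
    print(f"f2={f2:.2f}  R2={r2:.2f}  SNR={snr_db:.1f} dB")
```

Running this reproduces the triples quoted in the text: R² of 0.02, 0.13, 0.26 and SNRs of −17.0, −8.2, −4.6 dB.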

1. Hastie T (1987) A closer look at the deviance. Am Stat 41(1):16–20.
2. Simon G (1973) Additivity of information in exponential family probability laws. J Am Stat Assoc 68(342):478–482.
3. Cameron AC, Windmeijer FAG (1997) An R-squared measure of goodness of fit for some common nonlinear regression models. J Econom 77(2):329–342.
4. Czanner G, Sarma SV, Eden UT, Brown EN (2008) A signal-to-noise ratio estimator for generalized linear model systems. Lect Notes Eng Comput Sci 2171(1):1063–1069.
5. McCullagh P, Nelder JA (1989) Generalized Linear Models (Chapman and Hall, New York).
6. Cohen J (1992) A power primer. Psychol Bull 112(1):155–159.
7. Chen Y, Beaulieu N (2007) Maximum likelihood estimation of SNR using digitally modulated signals. IEEE Trans Wirel Comm 6(1):210–219.
8. Lyamzin DR, Macke JH, Lesica NA (2010) Modeling population spike trains with specified time-varying spike rates, trial-to-trial variability, and pairwise signal and noise correlations. Front Comput Neurosci 4(144):144.
9. Agresti A, Caffo B (2002) Measures of relative model fit. Comput Stat Data Anal 39:127–136.
10. Nagelkerke NJD (1991) A note on a general definition of the coefficient of determination. Biometrika 78(3):691–692.
11. Kelley K (2007) Confidence intervals for standardized effect sizes: Theory, application, and implementation. J Stat Softw 20(8):1–24.
12. Fritz CO, Morris PE, Richler JJ (2012) Effect size estimates: Current use, calculations, and interpretation. J Exp Psychol Gen 141(1):2–18.

Fig. S1. Relationship between SNR and R2. Both plots are created for R2 values between 0.05 and 0.95. (Left) Eq. S19 and (Right) Eq. S20.


Fig. S2. Simulation analysis of the relationship between SNR and R². One hundred observations were simulated from the linear model Y = 0.3X + ε, where ε, the errors, are independent Gaussian with zero mean and SDs of 2, 5, 10, and 30. These models give (A) SD = 2, R² = 0.95, SNR = 13 dB; (B) SD = 5, R² = 0.76, SNR = 5.1 dB; (C) SD = 10, R² = 0.40, SNR = −1.7 dB; and (D) SD = 30, R² = 0.09, SNR = −10 dB.

Fig. S3. Examples of goodness-of-fit analysis of the GLM for a single neuron from the (A) primary auditory cortex of an anesthetized guinea pig, (B) rat thalamus, (C) monkey hippocampus, and (D) human subthalamic nucleus. (Left) The KS plot of the time-rescaled interspike intervals. The parallel 45° lines are the 95% confidence interval. The KS plot (dark curve) lies within the 95% confidence intervals, suggesting agreement between the GLM and the data. (Right) The partial autocorrelation function of the interspike intervals transformed into Gaussian random variables. The horizontal parallel lines are the 95% confidence intervals. The Gaussian-transformed interspike intervals falling within the 95% confidence intervals suggest a lack of correlation up to lag 100. Lack of correlation is consistent with the transformed times being independent and further supports the goodness of fit of the GLM.
