Statistical methods for understanding neural codes


  1. Statistical methods for understanding neural codes. Liam Paninski, Department of Statistics and Center for Theoretical Neuroscience, Columbia University. http://www.stat.columbia.edu/~liam, liam@stat.columbia.edu. December 9, 2005

  2. The neural code Input-output relationship between • External observables x (sensory stimuli, motor responses...) • Neural variables y (spike trains, population activity...) Probabilistic formulation: p ( y | x )

  3. Example: neural prosthetic design. [Figures: Donoghue, Cyberkinetics, Inc. '04; Nicolelis, Nature '01] (Paninski et al., 1999; Serruya et al., 2002; Shoham et al., 2005)

  4. Basic goal: learning the neural code. Fundamental question: how to estimate p(y | x) from experimental data? The general problem is too hard — not enough data, too many possible inputs x and spike trains y.

  5. Avoiding the curse of insufficient data. Many approaches make the problem tractable: 1: estimate some functional f(p) instead, e.g., information-theoretic quantities (Nemenman et al., 2002; Paninski, 2003b); 2: select stimuli as efficiently as possible, e.g., (Foldiak, 2001; Machens, 2002; Paninski, 2003a); 3: fit a model with a small number of parameters.

  6. Part 1: Neural encoding models. "Encoding model": p_θ(y | x) — fit the parameter θ instead of the full p(y | x). Main theme: want the model to be flexible but not overly so. Flexibility vs. "fittability".

  7. Multiparameter Hodgkin-Huxley-type model — highly biophysically plausible, flexible — but very difficult to estimate parameters given spike times alone (figure adapted from (Fohlmeister and Miller, 1997)).

  8. Cascade ("LNP": linear-nonlinear-Poisson) model — easy to estimate via the spike-triggered average (STA) and spike-triggered covariance (STC) (Simoncelli et al., 2004) — but not biophysically plausible (fails to capture spike-timing details: refractoriness, burstiness, adaptation, etc.).
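
A toy illustration of the STA fit for such a cascade model; the filter, nonlinearity, and simulation settings below are hypothetical, chosen only to show the estimator:

```python
import numpy as np

rng = np.random.default_rng(0)
T, n_lags = 50_000, 20

# Toy LNP neuron: white-noise stimulus -> linear filter -> exp nonlinearity -> Poisson.
x = rng.standard_normal(T)
k_true = np.exp(-np.arange(n_lags)[::-1] / 5.0)   # hypothetical filter (oldest lag first)
drive = np.convolve(x, k_true[::-1])[:T]          # drive[t] = sum_i k[i] * x[t - n_lags + 1 + i]
y = rng.poisson(np.exp(0.3 * drive - 2.0))        # spike counts per bin

# Spike-triggered average: spike-count-weighted mean of the stimulus
# window feeding each bin.
windows = np.stack([x[t - n_lags + 1:t + 1] for t in range(n_lags - 1, T)])
sta = windows.T @ y[n_lags - 1:] / y[n_lags - 1:].sum()
# For Gaussian white noise, sta is proportional to k_true (up to sampling noise).
```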

  9. Two key ideas. 1. Use likelihood-based methods for fitting — well-justified statistically; easy to incorporate prior knowledge, explicit noise models, etc. 2. Use models that are easy to fit via maximum likelihood — concave (downward-curving) functions have no non-global local maxima ⇒ concave functions are easy to maximize by gradient ascent. Recurring theme: find flexible models whose loglikelihoods are guaranteed to be concave.
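
As a stand-in illustration of idea 2 (not the slides' integrate-and-fire model), here is a Poisson model with an exponential nonlinearity, whose loglikelihood is concave in the filter parameters, so a generic ascent method recovers the unique global maximum:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
X = rng.standard_normal((5000, 10))          # toy design matrix of stimuli
theta_true = 0.3 * rng.standard_normal(10)
y = rng.poisson(np.exp(X @ theta_true))      # observed spike counts

# Negative loglikelihood sum(exp(X @ th)) - y.(X @ th) is convex in th,
# i.e. the loglikelihood is concave: no non-global local maxima.
def negloglik(th):
    eta = X @ th
    return np.sum(np.exp(eta)) - y @ eta

def grad(th):
    return X.T @ (np.exp(X @ th) - y)

theta_hat = minimize(negloglik, np.zeros(10), jac=grad, method="BFGS").x
# Any ascent method converges to the same (global) ML estimate, near theta_true.
```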

  10. Filtered integrate-and-fire model: dV(t) = ( −g(t) V(t) + I_DC + k · x(t) + Σ_{t_j < t} h(t − t_j) ) dt + σ dN_t (Gerstner and Kistler, 2002; Paninski et al., 2004b)
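
A minimal Euler-Maruyama simulation of this model, assuming a constant leak g for simplicity and taking the already-filtered stimulus drive k · x(t) as an input array; all parameter values are illustrative:

```python
import numpy as np

def simulate_if(drive, h, g=0.5, I_dc=0.2, sigma=0.3,
                V_th=1.0, V_reset=0.0, dt=1e-3, seed=0):
    """Euler-Maruyama simulation of
        dV = (-g*V + I_dc + drive(t) + sum_j h(t - t_j)) dt + sigma dN_t,
    where drive[t] stands for the filtered stimulus k . x(t) and h is the
    after-spike current waveform."""
    rng = np.random.default_rng(seed)
    T = len(drive)
    V = np.full(T, V_reset)
    h_current = np.zeros(T)              # summed after-spike currents
    spike_times = []
    for t in range(1, T):
        dV = (-g * V[t-1] + I_dc + drive[t-1] + h_current[t-1]) * dt
        V[t] = V[t-1] + dV + sigma * np.sqrt(dt) * rng.standard_normal()
        if V[t] >= V_th:                 # threshold crossing: spike and reset
            spike_times.append(t * dt)
            V[t] = V_reset
            n = min(T - t, len(h))       # inject after-spike current h
            h_current[t:t+n] += h[:n]
    return V, np.array(spike_times)
```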

  11. Model flexibility: Adaptation

  12. The estimation problem (Paninski et al., 2004b)

  13. First passage time likelihood. P(spike at t_i) = fraction of paths crossing threshold for the first time at t_i (computed numerically via Fokker-Planck or integral-equation methods)

  14. Integral equation method. Let p(t) = P(spike at time t). Then p(t) solves a Volterra integral equation (Plesser and Tanaka, 1997) (via the "method of images"; goes back to Schrodinger): G_θ(V_th, t | V_th, 0) = ∫_0^t G_θ(V_th, t | V_th, s) p(s) ds, where G_θ(y, t | x, s) = probability of V(t) = y given V(s) = x (analytically computable). Lower-triangular linear system: O(d²) (Paninski et al., 2005). Can compute the gradient of p(t) w.r.t. θ via matrix perturbation: efficient maximization.
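
A sketch of the discretized solve, with the free transition density G supplied as a function; the midpoint collocation used here is one reasonable discretization, not necessarily the paper's:

```python
import numpy as np
from scipy.linalg import solve_triangular

def first_passage_density(G, T, n):
    """Solve the discretized Volterra equation
        G(t_i, 0) = sum_{j <= i} G(t_i, s_j) * p_j * dt
    for the first-passage density p. G(t, s) is the free (no-threshold)
    transition density evaluated at V_th, assumed analytically computable.
    Midpoint collocation keeps every evaluation at t > s, avoiding the
    zero-lag singularity of the transition density."""
    dt = T / n
    s = (np.arange(n) + 0.5) * dt        # where the unknown density lives
    t = (np.arange(n) + 1.0) * dt        # where the equation is enforced
    A = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1):
            A[i, j] = G(t[i], s[j]) * dt
    b = np.array([G(ti, 0.0) for ti in t])
    return s, solve_triangular(A, b, lower=True)   # O(d^2) forward substitution
```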

  15. Maximizing likelihood. Maximization seems difficult, even intractable: — high-dimensional parameter space — likelihood is a complex nonlinear function of the parameters. Main result: the loglikelihood is concave in the parameters, no matter what data {x(t), t_i} are observed ⇒ no non-global local maxima ⇒ maximization is easy by ascent techniques.

  16. Proof of log-concavity theorem. Based on a probability integral representation of the likelihood: L(θ) = ∫ 1(V ∈ C) dG_{x,θ}(V), where G_{x,θ}(V) = OU measure on voltage paths V, and C = set of voltage paths V(t) consistent with the spike data: V(t) ≤ V_th; V(t_i) = V_th; V(t_i⁺) = V_reset. Now use the fact that marginalizing preserves log-concavity (Prekopa, 1973): if f(x, y) is jointly log-concave, then so is f_0(x) ≡ ∫ f(x, y) dy.
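
The two facts being combined, written out (standard statements; the application follows the slide's outline):

```latex
\textbf{Prekopa (1973).} If $f(x, y)$ is jointly log-concave in $(x, y)$,
then the marginal $f_0(x) \equiv \int f(x, y)\, dy$ is log-concave in $x$.

\textbf{Application.} The constraint set $C$ is defined by linear
equalities and inequalities in the path $V$, so the indicator
$\mathbf{1}(V \in C)$ is log-concave in $(\theta, V)$; the Gaussian (OU)
path density is jointly log-concave; products of log-concave functions
are log-concave. Marginalizing over the path $V$,
\[
  L(\theta) = \int \mathbf{1}(V \in C)\, dG_{x,\theta}(V)
\]
is therefore log-concave in $\theta$.
```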

  17. Application: retinal ganglion cells Preparation: dissociated salamander and macaque retina — extracellularly-recorded responses of populations of RGCs Stimulus: random “flicker” visual stimuli (Chander and Chichilnisky, 2001)

  18. Spike timing precision in retina. [Figure: spike rasters and time-varying rate (sp/sec) and variance (sp²/bin) for the real RGC vs. the fitted LNP and IF models] (Pillow et al., 2005)

  19. Linking spike reliability and subthreshold noise (Pillow et al., 2005)

  20. Likelihood-based discrimination. Given spike data, the optimal decoder chooses the stimulus x according to likelihood: p(spikes | x_1) vs. p(spikes | x_2). Using the correct model is essential (Pillow et al., 2005).
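
A sketch of this decision rule for a Poisson-type encoding model; the rate predictions rate_x1 and rate_x2 would come from whatever fitted model is in play (LNP or IF) and are placeholders here:

```python
import numpy as np

def poisson_loglik(spike_counts, rate, dt=1e-3):
    """Loglikelihood of binned spike counts given a predicted rate(t) > 0,
    dropping terms that do not depend on the stimulus."""
    return np.sum(spike_counts * np.log(rate * dt) - rate * dt)

def discriminate(spike_counts, rate_x1, rate_x2, dt=1e-3):
    """Optimal two-alternative decoder: pick the stimulus whose
    model-predicted rate makes the observed spikes more likely."""
    ll1 = poisson_loglik(spike_counts, rate_x1, dt)
    ll2 = poisson_loglik(spike_counts, rate_x2, dt)
    return 1 if ll1 >= ll2 else 2
```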

  21. Generalization: population responses Pillow et al., SFN ’05

  22. Population retinal recordings Pillow et al., SFN ’05

  23. Part 2: Decoding subthreshold activity. Given extracellular spikes, what is the most likely intracellular V(t)? [Figure: V (mV) vs. time (sec); observed spike times, with the subthreshold voltage path between them unknown]

  24. Computing V_ML(t). Loglikelihood of V(t) (given LIF parameters, white noise N_t): L({V(t)}_{0≤t≤T}) = −(1/2σ²) ∫_0^T ( dV(t)/dt − ( −g V(t) + I(t) ) )² dt. Constraints: • reset at t = 0: V(0) = V_reset • spike at t = T: V(T) = V_th • no spike for 0 < t < T: V(t) < V_th. Quadratic programming problem: optimize a quadratic function under linear constraints. Concave: unique global optimum.
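
Discretized, this becomes bounded linear least squares: fix the endpoint samples, bound the interior ones by V_th, and minimize the squared residual of the dynamics. A sketch (the solver and discretization choices are mine, not the paper's):

```python
import numpy as np
from scipy.optimize import lsq_linear

def most_likely_voltage(I, g, V_reset, V_th, dt):
    """Discretized V_ML problem: minimize
        sum_i ((V[i+1] - V[i]) / dt + g * V[i] - I[i])**2
    subject to V[0] = V_reset, V[-1] = V_th, and V[i] <= V_th in between
    (the strict inequality is relaxed to <= for the solver)."""
    n = len(I) + 1                        # number of voltage samples
    M = np.zeros((n - 1, n))
    idx = np.arange(n - 1)
    M[idx, idx] = -1.0 / dt + g
    M[idx, idx + 1] = 1.0 / dt
    # Fix the endpoint samples; solve for the bounded interior ones.
    A = M[:, 1:-1]
    b = I - M[:, 0] * V_reset - M[:, -1] * V_th
    res = lsq_linear(A, b, bounds=(-np.inf, V_th))
    return np.concatenate(([V_reset], res.x, [V_th]))
```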

  25. Most likely vs. average V(t). [Figure: most likely and average voltage paths between spikes, for noise levels σ = 0.4, 0.2, 0.1] (Applications to the spike-triggered average (Paninski, 2005a; Paninski, 2005b))

  26. Application: in vitro data. Recordings: rat sensorimotor cortical slice; dual-electrode whole-cell. Stimulus: Gaussian white-noise current I(t). Analysis: fit IF model parameters {g, k, h(.), V_th, σ} by maximum likelihood (Paninski et al., 2003; Paninski et al., 2004a), then compute V_ML(t).

  27. Application: in vitro data. [Figure: true V(t) overlaid with V_ML(t); V (mV) vs. time (sec)] P(V(t) | {t_i}, θ̂_ML, x) computed via a forward-backward hidden Markov model method (Paninski, 2005a).
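
A generic sketch of such a forward-backward smoother on a grid of voltage bins; the state discretization and constraint encoding are assumptions, and the paper's implementation may differ:

```python
import numpy as np

def forward_backward(trans, allowed, init):
    """Forward-backward smoothing over discretized voltage states.
    trans[i, j] = P(state j at t+1 | state i at t), e.g. a binned OU
    transition kernel; allowed[t, j] = 1 if voltage bin j is consistent
    with the spike data at time t (below threshold between spikes, at
    threshold at spike times, at reset just after), else 0; init is the
    prior over states at t = 0. Returns P(V_t = j | all spike data),
    assuming the constraints are satisfiable at every step."""
    T, S = allowed.shape
    alpha = np.zeros((T, S))
    beta = np.ones((T, S))
    a = init * allowed[0]
    alpha[0] = a / a.sum()
    for t in range(1, T):                       # forward pass
        a = (alpha[t - 1] @ trans) * allowed[t]
        alpha[t] = a / a.sum()
    for t in range(T - 2, -1, -1):              # backward pass
        b = trans @ (beta[t + 1] * allowed[t + 1])
        beta[t] = b / b.sum()
    post = alpha * beta
    return post / post.sum(axis=1, keepdims=True)
```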

  28. Part 3: Back to detailed models Can we recover detailed biophysical properties? • Active: membrane channel densities • Passive: axial resistances, “leakiness” of membranes • Dynamic: spatiotemporal synaptic input

  29. Conductance-based models. Key point: if we observe the full V_i(t), the cell geometry and channel kinetics are known, and the current noise is log-concave, then the loglikelihood of the unknown parameters is concave. Gaussian noise ⇒ standard nonnegative regression (albeit high-dimensional).
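
Concretely, with Gaussian current noise the ML problem reduces to nonnegative linear regression of dV/dt on per-channel currents. A sketch, assuming the kinetics matrix K has been precomputed from the observed voltage and the known channel models:

```python
import numpy as np
from scipy.optimize import nnls

def estimate_channel_densities(V, K, dt):
    """Nonnegative ML regression for channel densities g_c, assuming
        dV/dt = sum_c g_c * K[t, c] + Gaussian noise,
    where K[t, c] = m_c(t) * (E_c - V(t)) is the unit-density current of
    channel type c, computed from the observed voltage and known kinetics
    (membrane capacitance folded into the units)."""
    dVdt = np.diff(V) / dt                 # forward-difference derivative
    g_hat, residual = nnls(K[:-1], dVdt)   # align K rows with dVdt
    return g_hat
```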

  30. Estimating channel densities from V(t). Ahrens, Huys, Paninski, NIPS '05

  31. [Figure: simulated voltage trace V and summed currents dV/dt over time (ms); true vs. inferred channel conductances (mS/cm²) for NaHH, KHH, Leak, NaM, KM, NaS, KAS] Ahrens, Huys, Paninski, NIPS '05

  32. Estimating non-homogeneous channel densities and axial resistances from spatiotemporal voltage recordings Ahrens, Huys, Paninski, COSYNE ’05

  33. Estimating synaptic inputs given V(t). [Figure: (A) voltage trace with true vs. inferred (MAP) excitatory and inhibitory synaptic spike trains; (B) true vs. inferred (ML) channel densities (mS/cm²) for HHNa, HHK, Leak, MNa, MK, SNa, SKA, SKDR] Ahrens, Huys, Paninski, NIPS '05

  34. Collaborators. Theory and numerical methods — J. Pillow, E. Simoncelli (NYU); S. Shoham (Princeton); A. Haith, C. Williams (Edinburgh); M. Ahrens, Q. Huys (Gatsby). Motor cortex physiology — M. Fellows, J. Donoghue (Brown); N. Hatsopoulos (U. Chicago); B. Townsend, R. Lemon (University College London). Retinal physiology — V. Uzzell, J. Shlens, E.J. Chichilnisky (UCSD). Cortical in vitro physiology — B. Lau, A. Reyes (NYU).
