Generalized Linear Models (GLMs)

  1. Statistical modeling and analysis of neural data. NEU 560, Spring 2018, Lecture 9: Generalized Linear Models (GLMs). Jonathan Pillow.

  2. Example 3: unknown neuron. [Scatter plot: spike count (0 to 100) vs. contrast (-25 to 25).] Be the computational neuroscientist: what model would you use?

  3. Example 3: unknown neuron. [Same scatter plot: spike count vs. contrast.] More general setup: $y = f(\vec\theta \cdot \vec x) + \text{noise}$ for some nonlinear function $f$.

  4. Quick quiz: the distribution $P(y \mid x, \theta)$ can be considered as a function of $y$ (spikes), $x$ (stimulus), or $\theta$ (parameters). What is $P(y \mid x, \theta)$: 1. as a function of $y$? Answer: the encoding distribution, a probability distribution over spike counts. 2. as a function of $\theta$? Answer: the likelihood function, the probability of the data given the model parameters. 3. as a function of $x$? Answer: the stimulus likelihood function, useful for ML stimulus decoding! (All three readings are sketched in code below.)
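A minimal sketch of these three readings of $P(y \mid x, \theta)$, assuming a hypothetical Poisson encoding model with an exponential nonlinearity (this model choice, and all the numbers, are illustrative, not from the slides):

```python
import numpy as np
from scipy.stats import poisson

def p_y_given_x_theta(y, x, theta):
    """P(y | x, theta) under the assumed model: y ~ Poisson(exp(theta * x))."""
    return poisson.pmf(y, np.exp(theta * x))

x_obs, theta_hat, y_obs = 10.0, 0.3, 12  # hypothetical stimulus, parameter, spike count

# 1. As a function of y: the encoding distribution over spike counts.
encoding_dist = [p_y_given_x_theta(y, x_obs, theta_hat) for y in range(60)]

# 2. As a function of theta: the likelihood function for the parameter.
thetas = np.linspace(0.1, 0.5, 100)
param_likelihood = [p_y_given_x_theta(y_obs, x_obs, th) for th in thetas]

# 3. As a function of x: the stimulus likelihood, used for ML decoding.
xs = np.linspace(0.0, 20.0, 200)
stim_likelihood = [p_y_given_x_theta(y_obs, x, theta_hat) for x in xs]
x_decoded = xs[np.argmax(stim_likelihood)]  # maximum-likelihood decoded stimulus
```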

  5. Stimulus decoding. [Plot: likelihood as a function of stimulus, axes spike count (0 to 60) vs. contrast (0 to 40).]

  6. What is this? [Two panels: spike count (0 to 60) vs. contrast (0 to 40).] Answer: the stimulus likelihood function (for decoding).

  7. GLMs. Be careful about terminology: GLM ≠ GLM. "General Linear Model" and "Generalized Linear Model" (Nelder 1972) name different models, though each contains a linear stage.

  8. From a 2003 interview with John Nelder. Stephen Senn: I must confess to having some confusion when I was a young statistician between general linear models and generalized linear models. Do you regret the terminology? John Nelder: I think probably I do. I suspect we should have found some more fancy name for it that would have stuck and not been confused with the general linear model, although general and generalized are not quite the same. I can see why it might have been better to have thought of something else. (Senn, 2003, Statistical Science.)

  9. Moral: be careful when naming your model!

  10. 1. General Linear Model: a linear stage ("dimensionality reduction") plus noise from an exponential family: $y = \vec\theta \cdot \vec x + \epsilon$. Example noise distributions: 1. Gaussian, 2. Poisson.

  11. 2. Generalized Linear Model: a linear stage, then a nonlinearity $f$, then noise from an exponential family: $y = f(\vec\theta \cdot \vec x) + \epsilon$. Example noise distributions: 1. Gaussian, 2. Poisson. (A simulation sketch of both cases follows below.)
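A minimal simulation sketch of the two models on slides 10 and 11, using $f = \exp$ with Poisson noise for the generalized case (the choice of $f$, the dimensions, and the noise scales are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
D, T = 5, 1000
theta = 0.5 * rng.normal(size=D)    # true weight vector (scaled to keep rates moderate)
X = rng.normal(size=(T, D))         # one stimulus vector per row
linpred = X @ theta                 # linear stage: theta . x

# General linear model (slide 10): Gaussian noise on the linear stage
y_gauss = linpred + rng.normal(scale=1.0, size=T)

# Generalized linear model (slide 11): nonlinearity f = exp, Poisson noise
y_poiss = rng.poisson(np.exp(linpred))
```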

  12. 2. Generalized Linear Model, terminology: the exponential-family noise is described by a "distribution function"; the nonlinearity $f$ maps the linear stage to the distribution's parameter, and its inverse $f^{-1}$ is called the "link function".

  13. From spike counts to spike trains. First idea: a linear-Gaussian model. The response at time bin $t$ is $y_t = \vec k \cdot \vec x_t + \epsilon_t$, with linear filter $\vec k$, stimulus vector $\vec x_t$, and noise $\epsilon_t \sim N(0, \sigma^2)$. [Figure: stimulus trace and binary spike-train response over time.]

  14.–19. [Animation over six slides: the same model, $y_t = \vec k \cdot \vec x_t + \epsilon_t$ with $\epsilon_t \sim N(0, \sigma^2)$, walking through the data one time bin at a time for $t = 1, 2, \ldots, 6$; at each step the current stimulus vector $\vec x_t$ and response $y_t$ are highlighted.]

  20. Build up to the following matrix version: $Y = X \vec k + \text{noise}$, where the design matrix $X$ stacks a stimulus vector $\vec x_t$ in each row (one row per time bin) and $Y$ stacks the corresponding responses $y_t$. (A sketch of building $X$ follows below.)
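A minimal sketch of building the design matrix from a one-dimensional stimulus, where row $t$ holds the most recent $d$ stimulus samples; the window length $d$ and the zero-padding at the start are illustrative choices:

```python
import numpy as np

def build_design_matrix(stim, d):
    """Stack length-d stimulus windows, one row per time bin, zero-padded at the start."""
    padded = np.concatenate([np.zeros(d - 1), stim])
    return np.stack([padded[t:t + d] for t in range(len(stim))])

stim = np.random.default_rng(1).normal(size=200)
X = build_design_matrix(stim, d=10)   # shape (200, 10)
```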

  21. With the matrix version $Y = X \vec k + \text{noise}$, the least-squares solution is $\hat k = (X^\top X)^{-1} X^\top Y$, where $X^\top X$ is the stimulus covariance and $X^\top Y$ is the spike-triggered average (STA). This is the maximum-likelihood estimate for the "linear-Gaussian" GLM (see the sketch below).
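A minimal sketch of this estimate; in practice a linear solver is preferable to forming $(X^\top X)^{-1}$ explicitly:

```python
import numpy as np

def fit_linear_gaussian(X, Y):
    """ML / least-squares filter estimate for the linear-Gaussian GLM."""
    stim_cov = X.T @ X   # (unnormalized) stimulus covariance
    sta = X.T @ Y        # (unnormalized) spike-triggered average
    return np.linalg.solve(stim_cov, sta)

# Numerically safer equivalent:
# k_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
```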

  22. Formal treatment, scalar version. Model: $y_t = \vec k \cdot \vec x_t + \epsilon_t$, with Gaussian noise $\epsilon_t \sim N(0, \sigma^2)$ of variance $\sigma^2$. Equivalent ways of writing this: $y_t \mid \vec x_t, \vec k \sim N(\vec x_t \cdot \vec k, \sigma^2)$, or $p(y_t \mid \vec x_t, \vec k) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(y_t - \vec x_t \cdot \vec k)^2 / 2\sigma^2}$. For the entire dataset (independence across time bins): $p(Y \mid X, \vec k) = \prod_{t=1}^T p(y_t \mid \vec x_t, \vec k) = (2\pi\sigma^2)^{-T/2} \exp\!\left(-\sum_{t=1}^T \frac{(y_t - \vec x_t \cdot \vec k)^2}{2\sigma^2}\right)$. Log-likelihood: $\log P(Y \mid X, \vec k) = -\sum_{t=1}^T \frac{(y_t - \vec x_t \cdot \vec k)^2}{2\sigma^2} + \text{const}$.
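A minimal sketch of this log-likelihood, dropping the additive constant that does not depend on $\vec k$:

```python
import numpy as np

def loglik_gaussian(k, X, Y, sigma2=1.0):
    """log P(Y | X, k) for the linear-Gaussian model, up to an additive constant."""
    resid = Y - X @ k                    # y_t - x_t . k for every time bin
    return -np.sum(resid ** 2) / (2.0 * sigma2)
```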

  23. Formal treatment, vector version: $Y = X \vec k + \vec\epsilon$, with iid Gaussian noise vector $\vec\epsilon \sim N(0, \sigma^2 I)$. Equivalent ways of writing this: $Y \mid X, \vec k \sim N(X \vec k, \sigma^2 I)$, or $P(Y \mid X, \vec k) = \frac{1}{|2\pi\sigma^2 I|^{1/2}} \exp\!\left(-\frac{1}{2\sigma^2}(Y - X\vec k)^\top (Y - X\vec k)\right)$. Take the log, differentiate, and set to zero to recover the least-squares solution. (A numerical check follows below.)
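A quick numerical check (an illustration on simulated data, not from the slides) that maximizing this log-likelihood recovers the closed-form least-squares estimate:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)
X = rng.normal(size=(500, 8))                      # simulated design matrix
k_true = rng.normal(size=8)
Y = X @ k_true + rng.normal(scale=0.5, size=500)   # linear-Gaussian responses

# Negative log-likelihood, up to constants and the 1/(2 sigma^2) factor
neg_loglik = lambda k: np.sum((Y - X @ k) ** 2)

k_numeric = minimize(neg_loglik, np.zeros(8)).x
k_closed = np.linalg.lstsq(X, Y, rcond=None)[0]
assert np.allclose(k_numeric, k_closed, atol=1e-3)
```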

  24. But the noise is not Gaussian! For a binary spike train, pass the filter output through a nonlinearity: [figure: $Y \approx f(X\vec k)$]. Bernoulli GLM (a coin-flipping model, $y = 0$ or $1$): the probability of a spike at bin $t$ is $p_t = f(\vec x_t \cdot \vec k)$, i.e. $p(y_t = 1 \mid \vec x_t) = p_t$. Equivalent ways of writing this: $y_t \mid \vec x_t, \vec k \sim \mathrm{Ber}(f(\vec x_t \cdot \vec k))$, or $p(y_t \mid \vec x_t, \vec k) = f(\vec x_t \cdot \vec k)^{y_t}\,(1 - f(\vec x_t \cdot \vec k))^{1 - y_t}$. Log-likelihood: $L = \sum_{t=1}^T \big[\, y_t \log f(\vec x_t \cdot \vec k) + (1 - y_t) \log(1 - f(\vec x_t \cdot \vec k)) \,\big]$ (see the sketch below).
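A minimal sketch of this Bernoulli log-likelihood; the clipping constant `eps` is a numerical guard for the logs at $p = 0$ or $1$, not part of the model:

```python
import numpy as np

def loglik_bernoulli(k, X, Y, f, eps=1e-12):
    """Bernoulli GLM log-likelihood L for binary responses Y (0 or 1)."""
    p = np.clip(f(X @ k), eps, 1.0 - eps)   # p_t = f(x_t . k), guarded for the logs
    return np.sum(Y * np.log(p) + (1 - Y) * np.log(1.0 - p))
```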

  25. Logistic regression: the same Bernoulli GLM setup ($p_t = f(\vec x_t \cdot \vec k)$, $p(y_t = 1 \mid \vec x_t) = p_t$) with the logistic function $f(x) = \frac{1}{1 + e^{-x}}$ as the nonlinearity. So logistic regression is a special case of a Bernoulli GLM.
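A minimal sketch of fitting this special case by numerically maximizing the Bernoulli log-likelihood; the simulated data and the use of `scipy.optimize.minimize` are illustrative choices:

```python
import numpy as np
from scipy.optimize import minimize

logistic = lambda u: 1.0 / (1.0 + np.exp(-u))

def fit_logistic(X, Y, eps=1e-12):
    """Maximum-likelihood filter estimate for the logistic/Bernoulli GLM."""
    def nll(k):
        p = np.clip(logistic(X @ k), eps, 1.0 - eps)
        return -np.sum(Y * np.log(p) + (1 - Y) * np.log(1.0 - p))
    return minimize(nll, np.zeros(X.shape[1])).x

# Simulated check: binary spikes from a known filter
rng = np.random.default_rng(3)
X = rng.normal(size=(2000, 6))
k_true = rng.normal(size=6)
Y = rng.binomial(1, logistic(X @ k_true))
k_hat = fit_logistic(X, Y)   # should land close to k_true
```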
