Real-time adaptive information-theoretic optimization of neurophysiology experiments


  1. Real-time adaptive information-theoretic optimization of neurophysiology experiments
     Presented by Alex Roper, March 5, 2009

  2. Goals
     ◮ How do neurons react to stimuli?
     ◮ What is a neuron’s preferred stimulus?
     ◮ Minimize the number of trials.
     ◮ Speed: the method must run in real time.
     ◮ Emphasis on dimensional scalability (e.g., vision).

  3. Challenges
     ◮ Typically high-dimensional:
       ◮ Model complexity (memory)
       ◮ Stimulus complexity (e.g., a visual bitmap)
     ◮ The Bayesian approach is expensive:
       ◮ Estimation
       ◮ Integration
       ◮ Multivariate optimization
     ◮ Limited firing capacity of a neuron (exhaustion)
     ◮ Essential issues:
       ◮ Update a posteriori beliefs quickly given new data
       ◮ Find the optimal stimulus quickly

  4. Neuron Model: p(r_t | {x_t, x_{t−1}, ..., x_{t−t_k}}, {r_{t−1}, ..., r_{t−t_a}})
     ◮ The response r_t to stimulus x_t depends on x_t itself, as well as on the history of stimuli and responses within a constant sliding window.
     ◮ This is needed to capture exhaustion, depletion, etc.
     ◮ The expected response is
       λ_t = E(r_t) = f( Σ_i Σ_{l=1}^{t_k} k_{i,t−l} x_{i,t−l} + Σ_{j=1}^{t_a} a_j r_{t−j} )
     ◮ The filter coefficients k_{i,t−l} represent dependence on the input itself.
     ◮ The coefficients a_j model dependence on observed recent activity.
     ◮ We summarize all unknown parameters as θ; this is what we’re trying to learn.

  5. Generalized Linear Models
     ◮ Distribution function (multivariate Gaussian).
     ◮ Linear predictor, θ.
     ◮ Link function (exponential).
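To make the neuron model and its GLM structure concrete, here is a minimal NumPy sketch of the conditional-intensity computation. The array shapes, variable names, the exponential nonlinearity f, and the Poisson draw for the spike count are illustrative assumptions; the presentation does not specify an implementation.

```python
import numpy as np

def conditional_intensity(k, a, x_hist, r_hist, f=np.exp):
    """Conditional intensity lambda_t = E(r_t) for the GLM neuron model.

    k      : (d, t_k) filter coefficients k_{i, t-l} over stimulus dimensions i
             and lags l = 1..t_k (shape convention is an assumption).
    a      : (t_a,) spike-history coefficients a_j.
    x_hist : (d, t_k) recent stimuli, column l-1 holding x_{t-l}.
    r_hist : (t_a,) recent responses, entry j-1 holding r_{t-j}.
    f      : pointwise nonlinearity; the exponential link is assumed here.
    """
    drive = np.sum(k * x_hist) + np.dot(a, r_hist)  # the GLM linear predictor
    return f(drive)

# Tiny usage example with made-up numbers.
rng = np.random.default_rng(0)
d, t_k, t_a = 4, 3, 2
k = rng.normal(scale=0.1, size=(d, t_k))
a = np.array([-0.5, -0.2])                 # negative weights mimic refractoriness
x_hist = rng.normal(size=(d, t_k))
r_hist = np.array([1.0, 0.0])

lam = conditional_intensity(k, a, x_hist, r_hist)
r_t = rng.poisson(lam)                     # Poisson spike count (assumed emission model)
print(lam, r_t)
```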

  6. Updating the Posterior
     ◮ Ideally, this runs in real time.
     ◮ Approximate the posterior as Gaussian.
       ◮ The posterior is the product of two smooth, log-concave terms (the GLM likelihood and the Gaussian prior).
     ◮ Use the Laplace approximation to construct a Gaussian approximation of the posterior:
       ◮ Set µ_t to the peak of the posterior.
       ◮ Set the covariance matrix C_t to the negative inverse of the Hessian of the log posterior at µ_t.
     ◮ Compute this directly?
       ◮ Complexity is O(td² + d³): O(td²) for the product of t likelihood terms, and O(d³) for inverting the Hessian.
     ◮ Instead, approximate p(θ_{t−1} | x_{t−1}, r_{t−1}) as Gaussian.
       ◮ The new likelihood term then depends on θ only through a single projection of the input, so Bayes’ rule reduces each update to a one-dimensional problem: O(d²).
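The O(d²) claim can be illustrated with a short sketch of the one-dimensional Laplace update, assuming a Poisson response with exponential link and an augmented input vector x_tilde that stacks the stimulus window and spike history, so the likelihood depends on θ only through one projection. All names and the specific solver below are assumptions, not the authors' code.

```python
import numpy as np

def update_posterior(mu, C, x_tilde, r):
    """One-step Gaussian (Laplace) posterior update in O(d^2).

    mu, C   : mean and covariance of the Gaussian approximation to
              p(theta | data up to t-1).
    x_tilde : (d,) augmented input (stimulus window and spike history stacked),
              so the likelihood depends on theta only through rho = x_tilde . theta.
    r       : observed spike count at time t.

    Assumes a Poisson response with exponential link, so the new log-likelihood
    term is r * rho - exp(rho) up to constants.
    """
    Cx = C @ x_tilde               # O(d^2)
    b = float(x_tilde @ mu)        # rho evaluated at the old mean
    q = float(x_tilde @ Cx)        # x_tilde^T C x_tilde > 0

    # The new mode lies on the line mu + s * C x_tilde; find s by solving the
    # scalar equation s = r - exp(b + s*q) with a few Newton steps.
    s = 0.0
    for _ in range(50):
        g = s - r + np.exp(b + s * q)        # monotone increasing in s
        dg = 1.0 + q * np.exp(b + s * q)
        step = g / dg
        s -= step
        if abs(step) < 1e-10:
            break

    mu_new = mu + s * Cx
    D = np.exp(b + s * q)          # -(second derivative of log-likelihood) at the mode

    # Rank-one (Woodbury) covariance update: (C^-1 + D x x^T)^-1, still O(d^2).
    C_new = C - np.outer(Cx, Cx) * (D / (1.0 + D * q))
    return mu_new, C_new

# Example: one update step with made-up values.
d = 10
mu, C = np.zeros(d), np.eye(d)
x_tilde = np.random.default_rng(2).normal(size=d) / np.sqrt(d)
mu, C = update_posterior(mu, C, x_tilde, r=3)
```

Because the new likelihood term is one-dimensional in θ, the mode search is a scalar root find and the covariance correction is rank one, which is what avoids the O(td² + d³) cost of recomputing the Laplace approximation from scratch.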

  7. Deriving the optimal stimulus
     ◮ Main idea: choose the next stimulus to maximize the conditional mutual information
       I(θ; r_{t+1} | x_{t+1}, x_t, r_t) = H(θ | x_t, r_t) − H(θ | x_{t+1}, r_{t+1}).
     ◮ Since the first term does not depend on the choice of x_{t+1}, this is equivalent to minimizing the conditional entropy H(θ | x_{t+1}, r_{t+1}).
     ◮ This yields an equation for the posterior covariance in terms of the observed Fisher information, J_obs.
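The slides stop at this high-level statement; as an illustration of where it leads in the Poisson/exponential-link case, the sketch below scores candidate augmented stimuli by a Monte Carlo estimate of the expected entropy reduction, which under the Gaussian posterior depends on each candidate only through the scalars x·µ and xᵀCx. The Monte Carlo average, the power constraint in the example, and all names here are assumptions for illustration, not the original procedure, which needs faster approximations than plain Monte Carlo to keep the search real-time.

```python
import numpy as np

def expected_info_gain(mu, C, x_tilde, n_samples=2000, rng=None):
    """Monte Carlo estimate of the expected entropy reduction for one candidate.

    Under the Gaussian posterior N(mu, C) and a Poisson/exponential-link GLM,
    the observed Fisher information J_obs = exp(rho) * x x^T is rank one, so the
    gain depends on the candidate only through m = x . mu and v = x^T C x.
    """
    rng = np.random.default_rng() if rng is None else rng
    m = float(x_tilde @ mu)
    v = float(x_tilde @ (C @ x_tilde))
    rho = rng.normal(m, np.sqrt(v), size=n_samples)   # predictive draws of the linear drive
    # Matrix determinant lemma: new entropy = old entropy - 0.5 * log(1 + exp(rho) * v).
    return 0.5 * np.mean(np.log1p(np.exp(rho) * v))

def pick_stimulus(mu, C, candidates, rng=None):
    """Greedy selection: score each candidate augmented input and take the best."""
    scores = [expected_info_gain(mu, C, x, rng=rng) for x in candidates]
    return candidates[int(np.argmax(scores))], scores

# Example: choose among random unit-norm candidates (power constraint assumed).
rng = np.random.default_rng(1)
d = 10
mu, C = np.zeros(d), np.eye(d)
cands = [c / np.linalg.norm(c) for c in rng.normal(size=(50, d))]
best, scores = pick_stimulus(mu, C, cands, rng=rng)
```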
