

  1. Applied Machine Learning in Biomedicine Enrico Grisan enrico.grisan@dei.unipd.it

  2. Neuron basics

  3. Neuron: real and simulated

  4. A bit of history

  5. From biology to models

  6. Biological models? Careful with brain analogies: there are many different types of neurons; dendrites can perform complex non-linear computations; synapses are not single weights but complex dynamical systems; and the rate code may not be adequate.

  7. Single neuron classifier

  8. Neuron and logistic classifier. A single neuron computes the same function as a logistic classifier: $\sigma(\mathbf{w}^T\mathbf{x}) = \frac{1}{1 + e^{-(w_0 + \sum_j w_j x_j)}}$. In the diagram, each input $x_j$ arrives along the axon from a previous neuron, is scaled by the synapse weight $w_j$, and the dendrites carry $w_j x_j$ to the cell body, which sums $\sum_j w_j x_j + b$ and applies the activation function $g$ on the output axon (forward flow).
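As an illustration of the forward flow (my own sketch, not from the slides), a minimal NumPy single sigmoid neuron; the weight, bias, and input values are made up:

```python
import numpy as np

def sigmoid(a):
    """Logistic sigmoid: sigma(a) = 1 / (1 + exp(-a))."""
    return 1.0 / (1.0 + np.exp(-a))

def neuron_forward(w, b, x):
    """Single-neuron forward pass: activation a = w.x + b, output sigma(a)."""
    a = np.dot(w, x) + b
    return sigmoid(a)

# Toy values, chosen arbitrarily for illustration
w = np.array([0.5, -1.2, 0.3])
x = np.array([1.0, 0.5, 2.0])
print(neuron_forward(w, 0.1, x))  # a single output in (0, 1)
```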

  9. Linking output to input. How does a change in the output (loss) affect the weights? With a squared loss $L(\mathbf{w}, y) \approx \big(y - g(\sum_j w_j x_j + b)\big)^2$, the neuron diagram of the previous slide is traversed in reverse (backward flow).

  10. Linking output to input. How does a change in the output (loss) affect the weights? The backward flow chains local derivatives through the diagram: the loss contributes $\partial L/\partial g$, the activation function $\partial g/\partial a$, and each synapse $\partial a/\partial w_j$; by the chain rule, $\frac{\partial L}{\partial w_j} = \frac{\partial L}{\partial g}\,\frac{\partial g}{\partial a}\,\frac{\partial a}{\partial w_j}$.
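To make the chain rule concrete, a small sketch (again my own illustration, using the squared loss from the previous slide) computing the gradient for a single sigmoid neuron:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def neuron_backward(w, b, x, y):
    """Gradient of the squared loss L = (y - sigmoid(w.x + b))^2 via the
    chain rule: dL/dw_j = dL/dz * dz/da * da/dw_j."""
    a = np.dot(w, x) + b
    z = sigmoid(a)
    dL_dz = -2.0 * (y - z)   # derivative of the squared loss w.r.t. the output
    dz_da = z * (1.0 - z)    # sigmoid derivative: sigma'(a) = sigma(a)(1 - sigma(a))
    dL_da = dL_dz * dz_da
    return dL_da * x, dL_da  # gradients w.r.t. the weights w and the bias b

w, x = np.array([0.5, -1.2, 0.3]), np.array([1.0, 0.5, 2.0])
grad_w, grad_b = neuron_backward(w, 0.1, x, y=1.0)
```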

  11. Activation function: sigmoid. $\sigma(\mathbf{w}^T\mathbf{x}) = \frac{1}{1 + e^{-\mathbf{w}^T\mathbf{x}}} = \sigma(a)$. Ups: 1) easy analytical derivative; 2) squashes numbers to the range [0,1]; 3) biological interpretation as the saturating «firing rate» of a neuron. Downs: 1) saturated neurons kill the gradients; 2) sigmoid outputs are not zero-centered.

  12. Sigmoid backpropagation. Assume the input of a neuron is always positive: what happens to the gradient on $\mathbf{w}$? For $g(\sum_j w_j x_j + b)$ with update $\mathbf{w}^{t+1} = \mathbf{w}^{t} - \alpha\,\nabla_{\mathbf{w}} g$, each component of the gradient is proportional to $x_j > 0$, so the gradient is all positive or all negative!
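A small numeric illustration of this effect (values chosen arbitrarily):

```python
import numpy as np

# If every input x_j > 0, then dL/dw_j = (dL/da) * x_j has the same sign as
# the scalar dL/da for every j: the gradient is all-positive or all-negative,
# which forces zig-zagging weight updates.
x = np.array([0.7, 1.3, 2.1])  # strictly positive inputs
dL_da = -0.4                   # some upstream scalar gradient
grad_w = dL_da * x
print(np.sign(grad_w))         # [-1. -1. -1.]: every entry shares one sign
```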

  13. Improving the activation function: tanh. $\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$, with derivative $\frac{d \tanh(x)}{dx} = 1 - \tanh^2(x)$. Ups: 1) still analytical derivatives; 2) squashes numbers to the range [-1,1]; 3) zero-centered! Downs: 1) saturated neurons still kill the gradients.

  14. Activation function 2: Rectified Linear Unit (ReLU). $g(x) = \max(0, x)$, with derivative $\frac{dg}{dx} = 1$ for $x > 0$ and $0$ for $x < 0$. Ups: 1) does not saturate; 2) computationally efficient; 3) converges faster in practice. Downs: 1) what happens for $x < 0$?

  15. ReLU neuron killing

  16. Activation function 3: Leaky ReLU. $g(x) = \mathbb{1}(x < 0)\,\alpha x + \mathbb{1}(x \ge 0)\,x$, with derivative $\frac{dg}{dx} = 1$ for $x > 0$ and $\alpha$ for $x < 0$. Ups: 1) does not saturate; 2) computationally efficient; 3) converges faster in practice; 4) keeps neurons alive!

  17. Activation function 4: Maxout. $g(\mathbf{x}) = \max(\mathbf{w}_1^T\mathbf{x}, \mathbf{w}_2^T\mathbf{x})$. Ups: 1) does not saturate; 2) computationally efficient; 3) linear regime; 4) keeps neurons alive! 5) generalizes ReLU and leaky ReLU. Downs: 1) it is not a dot product; 2) it doubles the parameters. The activation functions of slides 11–17 are collected in the sketch below.
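As a summary (my own sketch, not from the slides), the activation functions above in NumPy; the leaky-ReLU slope alpha=0.01 is an assumed default, not a value from the slides:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))       # saturates; outputs in (0, 1)

def tanh_grad(a):
    return 1.0 - np.tanh(a) ** 2          # derivative of tanh; zero-centered output

def relu(a):
    return np.maximum(0.0, a)             # gradient 1 for a > 0, 0 for a < 0

def leaky_relu(a, alpha=0.01):
    return np.where(a > 0, a, alpha * a)  # slope alpha keeps neurons alive for a < 0

def maxout(w1, w2, x):
    # Maxout takes the max of two linear responses: it is not a single dot
    # product, and it doubles the number of weight vectors per neuron.
    return max(np.dot(w1, x), np.dot(w2, x))
```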

  18. Neural Networks: architecture

  19. Neural Networks: architecture. A 2-layer Neural Network (1 hidden layer) and a 3-layer Neural Network (2 hidden layers).

  20. Neural Networks: architecture Number of neurons? Number of weights? Number of parameters?

  21. Neural Networks: architecture. Number of neurons: 4+2=6. Number of weights: 4×3+2×4=20. Number of parameters: 20+6=26.

  22. Neural Networks: architecture. 2-layer network: neurons 4+2=6, weights 4×3+2×4=20, parameters 20+6=26. 3-layer network: neurons 4+4+1=9, weights 4×3+4×4+1×4=32, parameters 32+9=41.
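The counting generalizes to any fully connected architecture; a short sketch (assuming, as drawn on the slides, networks with 3 inputs):

```python
def count_parameters(layer_sizes):
    """Weights and biases of a fully connected network with the given sizes."""
    weights = sum(n_in * n_out for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))
    biases = sum(layer_sizes[1:])  # one bias per non-input neuron
    return weights, biases

print(count_parameters([3, 4, 2]))     # (20, 6)  -> 26 parameters
print(count_parameters([3, 4, 4, 1]))  # (32, 9)  -> 41 parameters
```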

  23. Neural Networks: architecture. Modern CNNs: ~10 million artificial neurons. Human visual cortex: ~5 billion neurons.

  24. ANN representation. The weights feeding each neuron are collected into vectors $\mathbf{w}_{l,k}$ (layer $l$, neuron $k$), and the $N$ inputs are stacked, each with a leading 1 for the bias, into the design matrix $X = \begin{pmatrix} 1 & x_{11} & \cdots & x_{13} \\ 1 & x_{21} & \cdots & x_{23} \\ \vdots & & & \vdots \\ 1 & x_{N1} & \cdots & x_{N3} \end{pmatrix} = \begin{pmatrix} \mathbf{x}_1^T \\ \vdots \\ \mathbf{x}_N^T \end{pmatrix}$.

  25. ANN representation. Collecting the weights of layer $l$ into a matrix $W_l$ (one row per neuron), each layer becomes a matrix product followed by an elementwise nonlinearity: $\mathbf{a}_1 = g_1(W_1 \mathbf{x})$ and $\hat{\mathbf{y}} = g_2(W_2 \mathbf{a}_1) = g_2\big(W_2\, g_1(W_1 \mathbf{x})\big)$.
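A minimal sketch of this matrix form (random weights and input for illustration; biases are omitted here, or equivalently absorbed via the leading-1 convention of the previous slide):

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 3))  # first-layer weights: 4 hidden units, 3 inputs
W2 = rng.normal(size=(2, 4))  # second-layer weights: 2 outputs, 4 hidden units
x = rng.normal(size=3)

# Whole layers as matrix products: y_hat = g2(W2 @ g1(W1 @ x)),
# here with g1 = tanh and g2 = identity.
a1 = np.tanh(W1 @ x)
y_hat = W2 @ a1
```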

  26. ANN becoming popular

  27. ANN training: forward flow. Define a loss function $L(\mathbf{w}; y)$. Each neuron computes its activation $a_k = \sum_j w_{kj} z_j$ and passes $z_k = g(a_k)$ to the following layer.

  28. ANN training: backward flow. We need to compute $\frac{\partial L}{\partial w_{kj}} = \frac{\partial L}{\partial a_k}\frac{\partial a_k}{\partial w_{kj}} = \delta_k z_j$. Considering that for the output neurons $\delta_l = \hat{y}_l - y_l$, we get for the hidden neurons $\delta_k = g'(a_k) \sum_l w_{lk}\, \delta_l$.

  29. ANN training: backpropagation. Updating all weights by gradient descent: $w_{kj}^{t+1} = w_{kj}^{t} - \eta\, \delta_k z_j$.

  30. ANN training example: forward. With $g(a) = \tanh(a)$ and the loss $L = \frac{1}{2}\sum_{l=1}^{K}(\hat{y}_l - y_l)^2$, the forward pass is $a_k = \sum_{j=0}^{D} w_{kj}^{(1)} x_j$, $z_k = \tanh(a_k)$, $\hat{y}_l = \sum_{k=1}^{M} w_{lk}^{(2)} z_k$.

  31. ANN training example: backward. $\delta_l = \hat{y}_l - y_l$, $\delta_k = (1 - z_k^2)\sum_{l=1}^{K} w_{lk}^{(2)} \delta_l$, $\frac{\partial L}{\partial w_{kj}^{(1)}} = \delta_k x_j$, $\frac{\partial L}{\partial w_{lk}^{(2)}} = \delta_l z_k$. A sketch of the full example follows below.
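Putting slides 30–31 together, a sketch under assumed layer sizes and random data (biases omitted for brevity), with a finite-difference check of one gradient entry:

```python
import numpy as np

def forward(W1, W2, x):
    """Forward pass of the example: z_k = tanh(a_k), y_hat = W2 z."""
    z = np.tanh(W1 @ x)
    return z, W2 @ z

def backward(W1, W2, x, y):
    """Deltas and weight gradients of slide 31 for the squared loss."""
    z, y_hat = forward(W1, W2, x)
    delta_out = y_hat - y                          # delta_l = y_hat_l - y_l
    delta_hid = (1 - z ** 2) * (W2.T @ delta_out)  # (1 - z_k^2) sum_l w_lk delta_l
    grad_W2 = np.outer(delta_out, z)               # dL/dw_lk = delta_l z_k
    grad_W1 = np.outer(delta_hid, x)               # dL/dw_kj = delta_k x_j
    return grad_W1, grad_W2

rng = np.random.default_rng(1)
W1, W2 = rng.normal(size=(4, 3)), rng.normal(size=(2, 4))
x, y = rng.normal(size=3), rng.normal(size=2)
gW1, gW2 = backward(W1, W2, x, y)

# Finite-difference check on one weight confirms the analytic gradient.
loss = lambda A, B: 0.5 * np.sum((forward(A, B, x)[1] - y) ** 2)
eps = 1e-6
W1p = W1.copy(); W1p[0, 0] += eps
print(gW1[0, 0], (loss(W1p, W2) - loss(W1, W2)) / eps)  # should nearly match
```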

  32. What can an ANN represent?

  33. What can an ANN classify?

  34. Regularization
