Musings on Continual Learning
Pulkit Agrawal


  1. Musings on Continual Learning Pulkit Agrawal

  2. [Figure: object-detection output on a cluttered dining scene, with per-instance confidence scores such as chair .99, dining table .99, wine glass 1.00, bottle .99, bowl .85, fork .95, knife .83]

  3. What is a zebra?

  4. What is a zebra?

  5. Success in Reinforcement Learning: ATARI games, ~10-50 million interactions (21 million games!). Simulation, Closed World, Known Model

  6. Impressive Specialists

  7. Today’s AI: task-specific. The AI we want: generalists ???

  8. Core Characteristic: Reuse past knowledge to solve new tasks. Learn to perform N tasks, then solve the (N+1)-th task faster, or solve a more complex task.

  9. Success on Imagenet

  10. Training on N tasks —> knowledge for object classification

  11. Training on N tasks —> knowledge for object classification

  12. Reuse knowledge by fine-tuning. (Apple? Orange?)

  13. ImageNet: 1000 examples/class. New task: ~100 examples/class. (Apple? Orange?)
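
To make slides 12-13 concrete, here is a minimal sketch of reusing ImageNet knowledge by fine-tuning, assuming PyTorch/torchvision; the tiny random batch stands in for the ~100 labelled examples per class of the new task.

```python
# A minimal fine-tuning sketch (assumes PyTorch + torchvision are installed;
# the "new task" data below is a random stand-in, not real apples/oranges).
import torch
import torch.nn as nn
import torchvision.models as models

model = models.resnet18(weights="IMAGENET1K_V1")  # features learned on ImageNet (downloads weights)
model.fc = nn.Linear(model.fc.in_features, 2)     # new head: Apple vs Orange

# Small learning rate so the pretrained features change only slightly.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
loss_fn = nn.CrossEntropyLoss()

# Hypothetical stand-in for the small labelled set of the new task.
images = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, 2, (8,))

for _ in range(10):                               # a few fine-tuning steps
    optimizer.zero_grad()
    loss = loss_fn(model(images), labels)
    loss.backward()
    optimizer.step()
```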

  14. Still need hundreds of “labelled” data points! Fine-tuning with very few data points won’t be effective!

  15. Problem Setup. Training set: Apple, Orange.

  16. Problem Setup. Training set: Apple, Orange. Test: Apple or Orange?

  17. Use Nearest Neighbors. Training set: Apple, Orange. Test: Apple or Orange?

  18. Use Nearest Neighbors. Training set: Apple, Orange. Test: Apple or Orange?
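
A minimal sketch of the nearest-neighbour approach on slides 17-18, assuming some feature extractor phi() is already available (e.g. a pretrained CNN); the feature extractor and data below are toy assumptions.

```python
# Nearest-neighbour classification in feature space (toy sketch, PyTorch).
import torch
import torch.nn.functional as F

def phi(x):                                   # stand-in feature extractor (assumption)
    return x.flatten(1)

train_x = torch.randn(4, 3, 32, 32)           # e.g. 2 apples + 2 oranges
train_y = torch.tensor([0, 0, 1, 1])          # 0 = apple, 1 = orange
test_x  = torch.randn(1, 3, 32, 32)

# The test image gets the label of the most similar training image.
sims = F.cosine_similarity(phi(test_x), phi(train_x))   # one similarity per training image
pred = train_y[sims.argmax()]
print("predicted class:", pred.item())
```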

  19. What does the performance depend on?? Training set: Apple, Orange. Test: Apple or Orange?

  20. What does the performance depend on?? The features might not be optimized for matching! Training set: Apple, Orange. Test: Apple or Orange?

  21. Metric Learning via Siamese Networks*: instead of one-vs-all classification. (*Hadsell et al. 2006)

  22. Metric Learning via Siamese Networks* (*Hadsell et al. 2006)

  23. Metric Learning via Siamese Networks*. Same class: Output = 1. (*Hadsell et al. 2006)

  24. Metric Learning via Siamese Networks*. Same class: Output = 1. (*Hadsell et al. 2006)

  25. Metric Learning via Siamese Networks*. Same class: Output = 1. Different class: Output = 0. (*Hadsell et al. 2006)

  26. Metric Learning via Siamese Networks*. Same class: Output = 1. Different class: Output = 0. (*Hadsell et al. 2006)
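
Slides 21-26 describe a shared-weight network trained to output 1 for same-class pairs and 0 for different-class pairs. One common concrete instantiation, and the formulation in Hadsell et al. 2006, is a margin-based contrastive loss on the distance between the two embeddings; the toy embedding net below is an assumption for illustration.

```python
# Contrastive loss for Siamese metric learning (toy sketch, PyTorch).
import torch
import torch.nn as nn

embed = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 64))   # shared weights for both inputs

def contrastive_loss(x1, x2, same, margin=1.0):
    d = torch.norm(embed(x1) - embed(x2), dim=1)                  # distance between embeddings
    return (same * d.pow(2) +                                      # same class: pull together
            (1 - same) * torch.clamp(margin - d, min=0).pow(2)     # different class: push apart
           ).mean()

x1, x2 = torch.randn(8, 3, 32, 32), torch.randn(8, 3, 32, 32)
same = torch.randint(0, 2, (8,)).float()                           # 1 = same class, 0 = different
loss = contrastive_loss(x1, x2, same)
loss.backward()
```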

  27. Solving using a Siamese Network. Training set: Apple, Orange. Test: Apple or Orange?

  28. Solving using a Siamese Network. Training set: Apple, Orange. Siamese Net(test, Apple) = 0.1.

  29. Solving using a Siamese Network. Siamese Net(test, Apple) = 0.1; Siamese Net(test, Orange) = 0.8.

  30. Solving using a Siamese Network. Siamese Net(test, Apple) = 0.1; Siamese Net(test, Orange) = 0.8. Also look at Matching Networks, Vinyals et al. 2016.
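
The test-time procedure on slides 27-30 amounts to scoring the test image against each training example with the trained similarity function and taking the best match; `siamese_score` below is a hypothetical stand-in for that trained network.

```python
# Test-time use of a Siamese network (toy sketch, PyTorch).
import torch

def siamese_score(a, b):
    # Placeholder: a trained Siamese net would return a "same class" score here.
    return torch.rand(())

train_set = {"apple": torch.randn(3, 32, 32), "orange": torch.randn(3, 32, 32)}
test_img = torch.randn(3, 32, 32)

scores = {label: siamese_score(test_img, img).item() for label, img in train_set.items()}
prediction = max(scores, key=scores.get)   # e.g. apple: 0.1, orange: 0.8 -> "orange"
print(scores, "->", prediction)
```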

  31. Another perspective: view the parameters after training on, say, ImageNet as a point in parameter space.

  32. Another perspective. Task 1: Apple vs Orange; start from the parameters after training on, say, ImageNet.

  33. Another perspective. Task 1: Apple vs Orange; fine-tuning moves the parameters away from the ImageNet solution.

  34. Another perspective. Task 1: Apple vs Orange. Task 2: Dog vs Cat; both start from the parameters after training on, say, ImageNet.

  35. Another perspective. Task 1: Apple vs Orange. Task 2: Dog vs Cat; both start from the parameters after training on, say, ImageNet.

  36. Another perspective. Task 1: Apple vs Orange. Task 2: Dog vs Cat. Amount of fine-tuning: how far the parameters move for each task.

  37. What if? Task 1: Apple vs Orange. Task 2: Dog vs Cat. If the starting point required less movement, fine-tuning would be faster! Can we optimize the starting point to make fine-tuning easier?

  38. How to do it? Task 1: Apple vs Orange. (Hariharan et al. 2016, Finn et al. 2017)

  39. How to do it? Task 1: Apple vs Orange. (Hariharan et al. 2016, Finn et al. 2017)

  40. How to do it? Task 1: Apple vs Orange (i.e., train for fast fine-tuning!). (Hariharan et al. 2016, Finn et al. 2017)

  41. Generalizing to N tasks. Task 1: Apple vs Orange. (Hariharan et al. 2016, Finn et al. 2017)

  42. More Details. Low-Shot Visual Recognition (Hariharan et al. 2016); Model-Agnostic Meta-Learning (Finn et al. 2017).
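
A compact sketch of "train for fast fine-tuning" in the spirit of MAML (Finn et al. 2017): the inner loop fine-tunes a copy of the shared parameters on each task, and the outer loop updates the shared initialization so that the fine-tuned copy does well. The tiny linear-regression tasks below are purely illustrative assumptions.

```python
# MAML-style meta-learning of an initialization (toy sketch, PyTorch).
import torch

theta = torch.randn(5, requires_grad=True)            # shared initialization
meta_opt = torch.optim.SGD([theta], lr=1e-2)
inner_lr = 0.1

def task_loss(params, task_seed):
    torch.manual_seed(task_seed)                       # toy task: regress onto a random target
    x, w_true = torch.randn(16, 5), torch.randn(5)
    return ((x @ params - x @ w_true) ** 2).mean()

for step in range(100):
    meta_opt.zero_grad()
    meta_loss = 0.0
    for task in range(4):                              # sample a small batch of tasks
        # Inner loop: one gradient step of "fine-tuning" from theta on this task.
        g = torch.autograd.grad(task_loss(theta, task), theta, create_graph=True)[0]
        adapted = theta - inner_lr * g
        # Outer loop: how well does the fine-tuned copy do? Backprop into theta.
        # (Real MAML evaluates the adapted parameters on held-out task data.)
        meta_loss = meta_loss + task_loss(adapted, task)
    meta_loss.backward()
    meta_opt.step()
```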

  43. Until Now: fine-tuning; nearest-neighbor matching; Siamese-network-based metric learning; meta-learning (training for fine-tuning). Better features —> better transfer!

  44. In practice, how good are these features? Dog from ImageNet: accuracy ~80%. Dog (from a different distribution): accuracy ~20%.

  45. Consider the task of identifying cars … (positive and negative training examples)

  46. Testing the model ???

  47. Learning Spurious Correlations. (Unbiased Look at Dataset Bias, Torralba et al. 2011)

  48. More parameters in the network —> more chances of learning spurious correlations!! Maybe this problem can be avoided if we first learn simple tasks and then more complex ones??

  49. Sequential/Continual Task Learning: Catastrophic Forgetting!!! Fine-tuning on a new task —> poor performance on Task 1!!!

  50. Catastrophic forgetting in closely related tasks: training on rotating MNIST. [Figure: test accuracy per rotation, ranging from high to low.]

  51. In machine learning, we generally assume IID* data: sample batches of data, so each batch contains a uniform distribution of rotations. (*IID: Independently and Identically Distributed)

  52. In machine learning, we generally assume IID* data: sample batches of data, so each batch contains a uniform distribution of rotations. In the real world, data is often not batched :) (*IID: Independently and Identically Distributed)
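
The distinction on slides 50-52 can be made concrete with rotation angles standing in for rotating-MNIST tasks: IID training shuffles all rotations together, while a continual stream presents one rotation after another, which is where forgetting shows up.

```python
# IID batches vs. a continual (non-IID) stream (toy sketch, plain Python).
import random

angles = [0, 30, 60, 90]
data = [(angle, example) for angle in angles for example in range(1000)]

# IID assumption: shuffle everything, so each batch mixes all rotations uniformly.
iid_stream = random.sample(data, len(data))

# Continual / non-IID: the learner sees all of 0 deg, then all of 30 deg, and so on.
continual_stream = sorted(data, key=lambda pair: pair[0])
```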

  53. Continual learning is natural …

  54. In the context of reinforcement learning

  55. Investigating Human Priors for Playing Video Games. Rachit Dubey, Pulkit Agrawal, Deepak Pathak, Alyosha Efros, Tom Griffiths (ICML 2018)

  56. Humans make use of prior knowledge for exploration. (Investigating Human Priors for Playing Video Games, Dubey R., Agrawal P., Pathak D., Efros A., Griffiths T., ICML 2018)

  57. Humans make use of prior knowledge for exploration. (Investigating Human Priors for Playing Video Games, Dubey R., Agrawal P., Pathak D., Efros A., Griffiths T., ICML 2018)

  58. What about Reinforcement Learning Agents?

  59. In a simpler version of the game … (Investigating Human Priors for Playing Video Games, Dubey R., Agrawal P., Pathak D., Efros A., Griffiths T., ICML 2018)

  60. For RL agents, both games are the same! (Investigating Human Priors for Playing Video Games, Dubey R., Agrawal P., Pathak D., Efros A., Griffiths T., ICML 2018)

  61. Equip Reinforcement Learning Agents with prior knowledge?

  62. Common-Sense/Prior Knowledge: hand-design.

  63. Common-Sense/Prior Knowledge: hand-design, or learn from experience. Transfer in Reinforcement Learning —> very limited success. A good solution to continual learning is required!

  64. How to deal with catastrophic forgetting? Just remember the weights for each task!

  65. Progressive Networks (Rusu et al. 2016)
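
A very small sketch of the Progressive Networks idea (Rusu et al. 2016): the column trained on task 1 is frozen (so task 1 is never forgotten), a fresh column is added for task 2, and lateral connections let the new column reuse the frozen features. The two-column setup and layer sizes are illustrative assumptions.

```python
# Progressive-network-style column growth (toy sketch, PyTorch).
import torch
import torch.nn as nn

class ProgressiveTwoColumns(nn.Module):
    def __init__(self, in_dim=10, hidden=32, out_dim=2):
        super().__init__()
        self.col1 = nn.Linear(in_dim, hidden)             # trained on task 1, then frozen
        self.col2 = nn.Linear(in_dim, hidden)             # new column for task 2
        self.lateral = nn.Linear(hidden, hidden)          # lateral connection: col1 -> col2
        self.head2 = nn.Linear(hidden, out_dim)
        for p in self.col1.parameters():                  # task-1 weights never change,
            p.requires_grad_(False)                       # so task 1 cannot be forgotten

    def forward(self, x):
        h1 = torch.relu(self.col1(x))                     # frozen task-1 features
        h2 = torch.relu(self.col2(x) + self.lateral(h1))  # new column reuses them
        return self.head2(h2)

model = ProgressiveTwoColumns()
out = model(torch.randn(4, 10))
```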

  66. Can we do something smarter than storing all the weights?

  67. Overcoming Catastrophic Forgetting (Kirkpatrick et al. 2017). EWC: Elastic Weight Consolidation. Don’t change weights that are informative of task A (as measured by the Fisher Information).

  68. Overcoming Catastrophic Forgetting (Kirkpatrick et al. 2017)
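
A rough sketch of the EWC objective from slide 67: estimate a diagonal Fisher information for the weights after task A, then train on task B with a quadratic penalty that discourages moving weights that were informative for task A. The toy model, data, and the crude single-batch Fisher estimate are simplifying assumptions.

```python
# Elastic Weight Consolidation penalty (toy sketch, PyTorch).
import torch
import torch.nn as nn

model = nn.Linear(10, 2)
loss_fn = nn.CrossEntropyLoss()

# --- After training on task A: crude diagonal Fisher estimate from squared gradients ---
xA, yA = torch.randn(64, 10), torch.randint(0, 2, (64,))
model.zero_grad()
loss_fn(model(xA), yA).backward()
fisher = {n: p.grad.detach() ** 2 for n, p in model.named_parameters()}
theta_A = {n: p.detach().clone() for n, p in model.named_parameters()}

# --- Training on task B: task-B loss + penalty keeping important weights near theta_A ---
lam = 100.0
xB, yB = torch.randn(64, 10), torch.randint(0, 2, (64,))
opt = torch.optim.SGD(model.parameters(), lr=1e-2)
for _ in range(50):
    opt.zero_grad()
    penalty = sum((fisher[n] * (p - theta_A[n]) ** 2).sum()
                  for n, p in model.named_parameters())
    loss = loss_fn(model(xB), yB) + (lam / 2) * penalty
    loss.backward()
    opt.step()
```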

  69. Eventually we will run out of capacity! Is there a better way to make use of the neural network capacity?

  70. Neural Networks are compressible post-training. (Slide adapted from Brian Cheung; Han et al. 2015)

  71. Neural Networks are compressible post-training. (Slide adapted from Brian Cheung; Han et al. 2015)

  72. Negligible performance change after pruning —> neural networks are over-parameterized. Can we make use of this over-parameterization? We will have to make use of the “excess” capacity during training.
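
The compressibility observation on slides 70-72 can be illustrated with magnitude pruning in the spirit of Han et al. 2015: zero out the smallest-magnitude weights after training. The 90% sparsity level and the toy layer below are assumptions.

```python
# Post-training magnitude pruning (toy sketch, PyTorch).
import torch
import torch.nn as nn

layer = nn.Linear(256, 256)
with torch.no_grad():
    w = layer.weight
    threshold = w.abs().flatten().kthvalue(int(0.9 * w.numel())).values
    mask = (w.abs() > threshold).float()      # keep only the largest ~10% of weights
    w.mul_(mask)

sparsity = 1.0 - mask.mean().item()
print(f"pruned {sparsity:.0%} of the weights")
```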

  73. Superposition of many models into one (Cheung et al., 2019). One model: W(1). Superposition: a single W that stores W(1), W(2), W(3). Implementation: refer to the paper for details.
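
A very rough sketch of the storage-and-retrieval principle behind superposition (Cheung et al. 2019): each weight matrix W(k) is bound with a task-specific random ±1 context and the bound matrices are summed into a single W; unbinding with the right context recovers that model up to interference from the others, which over-parameterized networks can tolerate. This only illustrates the principle; the paper's actual training procedure differs, so refer to it for details.

```python
# Storing several weight matrices in superposition and reading one back (toy sketch, PyTorch).
import torch

torch.manual_seed(0)
d = 512
W_tasks = [torch.randn(d, d) / d ** 0.5 for _ in range(3)]     # W(1), W(2), W(3)
contexts = [torch.sign(torch.randn(d)) for _ in range(3)]      # random +/-1 context per task

# Store: W = sum_k W(k) * diag(c_k)  (bind each model with its context, then add them up).
W = sum(Wk * ck for Wk, ck in zip(W_tasks, contexts))

# Retrieve model 1 by unbinding with its context; the other models remain as interference noise.
W1_hat = W * contexts[0]
x = torch.randn(d)
print(torch.nn.functional.cosine_similarity(W1_hat @ x, W_tasks[0] @ x, dim=0))
```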
