  1. Episodic Memory in Lifelong Language Learning (NeurIPS 2019) • Cyprien de Masson d’Autume, Sebastian Ruder, Lingpeng Kong, Dani Yogatama (DeepMind) • Presenter: Xiachong Feng

  2. Outline • Author • Background • Task • Model • Experiment • Result

  3. Authors • Cyprien de Masson d’Autume (DeepMind) • Sebastian Ruder (DeepMind) • Lingpeng Kong (孔令鹏, DeepMind) • Dani Yogatama (DeepMind)

  4. Background • Lifelong learning: the model learns from a stream of tasks presented sequentially, without revisiting all earlier data

  5. Background • Catastrophic forgetting: performance on earlier tasks degrades as the model is trained on new ones

  6. Task • Text classification • Question answering

  7. Model • Example encoder • Task decoder • Episodic memory module.

  8. Example encoder & Task decoder

  9. Episodic Memory • Text classification: the input x_t is a document to be classified • Question answering: the input x_t is a concatenation of a context paragraph and a question, separated by [SEP] • Key-value memory block, with keys computed by a pretrained BERT model (frozen): for text classification the key is the [CLS] representation and the value is the label; for question answering the key is the representation of the first token of the question
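
A minimal sketch, in plain Python/NumPy, of the key-value episodic memory described on this slide. The class name `EpisodicMemory` and its methods are illustrative, not the authors' code; the frozen BERT key network is abstracted away, with the key vector passed in by the caller.

```python
import numpy as np

class EpisodicMemory:
    """Key-value store: key = frozen-encoder vector, value = the stored example and its label."""

    def __init__(self):
        self.keys = []    # key vectors (1-D float arrays)
        self.values = []  # (example, label) pairs

    def write(self, key_vec, example, label):
        # Training examples are written together with their key vectors.
        self.keys.append(np.asarray(key_vec, dtype=np.float32))
        self.values.append((example, label))

    def sample(self, n, rng):
        # Uniform random sampling, used for sparse experience replay.
        idx = rng.choice(len(self.values), size=min(n, len(self.values)), replace=False)
        return [self.values[i] for i in idx]

    def neighbours(self, query_vec, k):
        # K nearest neighbours under Euclidean distance, used for local adaptation.
        keys = np.stack(self.keys)
        dists = np.linalg.norm(keys - np.asarray(query_vec, dtype=np.float32), axis=1)
        return [self.values[i] for i in np.argsort(dists)[:k]]
```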

  10. Episodic Memory • Sparse experience replay (used during training) • Local adaptation (used at inference)

  11. Model - Training • Write: based on random write • Read: sparse experience replay, i.e. uniformly sample stored examples at random and perform gradient updates on the retrieved examples • Sparsity: randomly retrieve 100 examples for every 10,000 new examples
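
Below is a sketch of a training loop with sparse experience replay at the rate stated on the slide (100 retrieved examples per 10,000 new ones). `train_step` and `encode_key` are hypothetical placeholders for the model update and the frozen BERT key network; `memory` is an object like the `EpisodicMemory` sketch above.

```python
import numpy as np

REPLAY_EVERY = 10_000  # new examples between replay phases
REPLAY_SIZE = 100      # examples retrieved per replay phase

def train(stream, model, memory, encode_key, train_step, seed=0):
    """Train on a stream of (example, label) pairs with sparse experience replay."""
    rng = np.random.default_rng(seed)
    for t, (example, label) in enumerate(stream, start=1):
        train_step(model, [(example, label)])              # ordinary update on the new example
        memory.write(encode_key(example), example, label)  # write the example to memory
        if t % REPLAY_EVERY == 0 and memory.values:
            replayed = memory.sample(REPLAY_SIZE, rng)     # uniformly sampled at random
            train_step(model, replayed)                    # gradient update on replayed examples
```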

  12. Model - Inference • Read: local adaptation • The key network maps the test example to a query vector • Retrieve the K nearest neighbours from memory using the Euclidean distance function • Perform a small number (L) of gradient steps on the retrieved examples to locally adapt the parameters for this test example, keeping them close to the trained parameters
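
A sketch of inference with local adaptation, assuming the parameters can be treated as a flat NumPy vector and that `loss_grad` returns the gradient of the loss on the retrieved neighbours. The hyperparameters (`k`, `steps`, `lr`, `lam`) and helper names are illustrative, not the paper's exact settings.

```python
import numpy as np

def locally_adapted_predict(x, base_params, memory, encode_key, loss_grad, predict,
                            k=32, steps=20, lr=1e-3, lam=1e-3):
    """Adapt a copy of the parameters on the K nearest memory entries, then predict for x."""
    query = encode_key(x)                           # key network -> query vector
    neighbours = memory.neighbours(query, k)        # K-NN under Euclidean distance
    base = np.asarray(base_params, dtype=np.float32)
    adapted = base.copy()
    for _ in range(steps):                          # a few local gradient steps
        grad = loss_grad(adapted, neighbours)       # gradient of the loss on the neighbours
        grad = grad + 2.0 * lam * (adapted - base)  # L2 pull back toward the base parameters
        adapted = adapted - lr * grad
    return predict(adapted, x)                      # adapted parameters are used only for this example
```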

  13. Experiments • Text classification • News classification (AGNews), sentiment analysis (Yelp, Amazon), Wikipedia article classification (DBPedia), and question-and-answer categorization (Yahoo) • AGNews (4 classes), Yelp (5 classes), DBPedia (14 classes), Amazon (5 classes), and Yahoo (10 classes) • Since the Yelp and Amazon datasets have similar semantics (product ratings), their classes are merged • Question answering • SQuAD 1.1, TriviaQA, QuAC • A balanced version of all datasets is created

  14. Results • Text classification and QA results, compared against a multitask model and against using randomly retrieved examples for local adaptation

  15. Result

  16. Result • Store only 50% and 10% of training examples
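
One simple way to realise this reduced-memory setting is to write each incoming example with a fixed probability (0.5 or 0.1). This is a sketch of that idea, not necessarily the exact selection rule used in the paper; `memory` and `encode_key` follow the earlier sketches.

```python
import numpy as np

def maybe_write(memory, encode_key, example, label, store_prob, rng):
    """Write the example to memory only with probability `store_prob` (e.g. 0.5 or 0.1)."""
    if rng.random() < store_prob:
        memory.write(encode_key(example), example, label)

# Usage: rng = np.random.default_rng(0); maybe_write(memory, encode_key, x, y, 0.1, rng)
```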

  17. Result

  18. Thanks!
