
COMP 762: Paper Presentation (Technical Details) Alexander



  1. COMP 762: Paper Presentation (Technical Details) Alexander Nicholson Are Deep Neural Networks the Best Choice for Modeling Source Code? Authors: Vincent J. Hellendoorn, Premkumar Devanbu

  2. Overview: ▪ Big question in the title ▪ Creates a robust baseline model (Important!) ▪ Optimizations for a specific, practical task

  3. Scope Language specific scoping? Static vs. Dynamic

  4. Why online training/dynamism with (R)NN is hard: E.g. 50000 * 512 = 25600000 parameters between first and second layer. Image From: Towards Deep Learning Software Repositories, White et al.
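The arithmetic on this slide can be checked with a short sketch. It assumes that 50000 is a vocabulary size and 512 is the width of the next layer (the slide states only the product); `vocab_size` and `hidden_size` are illustrative names.

```python
# Assumption: 50000 is the vocabulary size (one input unit per token) and
# 512 is the width of the following layer; the slide gives only the product.
vocab_size = 50_000
hidden_size = 512

# A dense weight matrix between two layers has one weight per (input, output) pair.
num_weights = vocab_size * hidden_size
print(num_weights)  # 25600000

# Under the same assumption, each newly seen token needs a further row of weights.
extra_per_new_token = hidden_size
print(extra_per_new_token)  # 512
```

Under this reading, every identifier that appears for the first time adds weights that have to be trained, which is one way to see why updating such a model on the fly is costly.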

  5. Smoothing (Discounting/Correction and Interpolation) Discounting/Correction: Subtract/Add a number δ from the counts. Interpolation: “Add information from known distributions”; weight additional distributions using λ. Resources: https://www.youtube.com/watch?v=FUS7XkhYBLo&list=PLBv09BD7ez_7Ke6U7yGBvfP4_Hau3ZGj2&index=5 https://nlp.stanford.edu/~wcmac/papers/20050421-smoothing-tutorial.pdf

  6. Laplace/Lidstone Correction Add and re-normalize Image From: Stanford Smoothing Tutorial - https://nlp.stanford.edu/~wcmac/papers/20050421-smoothing-tutorial.pdf
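"Add and re-normalize" can be sketched in a few lines. This is a minimal illustration on toy counts, not the paper's implementation; the function name `lidstone` and the parameter `lam` are assumptions made for the example.

```python
from collections import Counter

def lidstone(counts, vocab, lam=1.0):
    """Add `lam` to every count, then re-normalize.

    lam=1.0 gives Laplace (add-one) correction; other positive values
    give the more general Lidstone correction.
    """
    total = sum(counts.get(w, 0) for w in vocab) + lam * len(vocab)
    return {w: (counts.get(w, 0) + lam) / total for w in vocab}

counts = Counter(["i", "i", "j", "i"])  # toy token counts: i=3, j=1
vocab = ["i", "j", "k"]                  # "k" was never observed
probs = lidstone(counts, vocab, lam=1.0)
print(probs)  # "k" now has a non-zero probability
```

With these toy counts the unseen token `k` receives 1/7 of the probability mass, taken proportionally from the observed tokens.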

  7. Absolute Discounting Subtract and re-normalize Paper’s modification uses three values of δ Image From: Stanford Smoothing Tutorial - https://nlp.stanford.edu/~wcmac/papers/20050421-smoothing-tutorial.pdf
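A minimal sketch of "subtract and re-normalize", assuming a single discount and a uniform split of the freed mass over unseen tokens. It does not reproduce the paper's three-value modification; `absolute_discount` and `delta` are illustrative names.

```python
def absolute_discount(counts, vocab, delta=0.5):
    """Subtract a fixed discount from every non-zero count and hand the
    freed probability mass to the unseen tokens (uniformly, as an assumption).

    Assumes 0 < delta <= 1 and that every key of `counts` is in `vocab`.
    """
    total = sum(counts.values())
    seen = [w for w in vocab if counts.get(w, 0) > 0]
    unseen = [w for w in vocab if counts.get(w, 0) == 0]
    if not unseen:
        # Nothing to give mass to: fall back to plain relative frequencies.
        return {w: counts[w] / total for w in seen}
    probs = {w: (counts[w] - delta) / total for w in seen}
    freed = delta * len(seen) / total  # mass removed from the seen tokens
    for w in unseen:
        probs[w] = freed / len(unseen)
    return probs

probs_ad = absolute_discount({"i": 3, "j": 1}, ["i", "j", "k"], delta=0.5)
print(probs_ad)  # {'i': 0.625, 'j': 0.125, 'k': 0.25}
```

Unlike additive correction, the discount removes the same absolute amount from every observed count, so frequent tokens lose proportionally less.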

  8. Kneser-Ney Smoothing Paper’s modification uses three values of δ Image From: Stanford Smoothing Tutorial - https://nlp.stanford.edu/~wcmac/papers/20050421-smoothing-tutorial.pdf

  9. Jelinek-Mercer Smoothing General Interpolation: λ P(X) + (1 - λ) P(Z) In J-M: λ is a constant.
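The interpolation formula on this slide translates directly into code. The sketch below assumes `p_x` is a higher-order estimate and `p_z` a lower-order one; that pairing, and the names `jelinek_mercer` and `lam`, are illustrative rather than taken from the slide.

```python
def jelinek_mercer(p_x, p_z, lam=0.5):
    """Interpolate two probability estimates with a constant weight `lam`."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * p_x + (1.0 - lam) * p_z

# A sequence unseen at the higher order (probability 0.0) still gets mass
# from the lower-order estimate.
p = jelinek_mercer(p_x=0.0, p_z=0.2, lam=0.5)
print(p)  # 0.1
```

Because the weight is a constant, the same share of the lower-order estimate is mixed in regardless of how much evidence the higher-order context has.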

  10. Witten-Bell Smoothing Image From: Stanford Smoothing Tutorial - https://nlp.stanford.edu/~wcmac/papers/20050421-smoothing-tutorial.pdf

  11. Trie Data-structure From Wikipedia: Children have common prefix.

  12. Trie Data-structure Each scope has its own trie
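A counting trie over token sequences, with one trie per scope, might look like the sketch below. The class names, the scope labels `"global"` and `"local"`, and the choice to count every prefix on insertion are assumptions for illustration, not the paper's implementation.

```python
class TrieNode:
    def __init__(self):
        self.count = 0      # how often the sequence ending at this node was seen
        self.children = {}  # next token -> TrieNode

class CountTrie:
    def __init__(self):
        self.root = TrieNode()

    def add(self, tokens):
        """Insert a token sequence, incrementing the count of every prefix."""
        node = self.root
        for tok in tokens:
            node = node.children.setdefault(tok, TrieNode())
            node.count += 1

    def count(self, tokens):
        """Return how often `tokens` was seen as a prefix (0 if never)."""
        node = self.root
        for tok in tokens:
            node = node.children.get(tok)
            if node is None:
                return 0
        return node.count

# One trie per scope (hypothetical scope names).
tries = {"global": CountTrie(), "local": CountTrie()}
tries["global"].add(["for", "(", "int"])
tries["global"].add(["for", "(", "var"])

print(tries["global"].count(["for", "("]))         # 2: shared prefix
print(tries["global"].count(["for", "(", "int"]))  # 1
print(tries["local"].count(["for"]))               # 0: scopes are independent
```

Sequences with a common prefix share nodes, so the counts for shorter contexts come for free when longer sequences are stored.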

  13. Zipf’s Law Image From: https://phys.org/news/2017-08-unzipping-zipf-law-solution-century-old.html

  14. Memoization - Optimization technique commonly used in dynamic programming - Cache-(ish) to avoid multiple recalculations. http://cs.mcgill.ca/~jcheung/teaching/fall-2017/comp550/index.html
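A generic illustration of memoization using Python's standard `functools.lru_cache`. The Fibonacci function is only a stand-in for an expensive repeated computation and is not taken from the paper; the `calls` counter is there to show how many real evaluations happen.

```python
from functools import lru_cache

calls = 0  # counts how many times the function body actually runs

@lru_cache(maxsize=None)
def fib(n):
    """Fibonacci number; results are cached, so each n is computed only once."""
    global calls
    calls += 1
    return n if n < 2 else fib(n - 1) + fib(n - 2)

print(fib(30))  # 832040
print(calls)    # 31: one evaluation for each n in 0..30
```

Without the cache, the same call would evaluate the function body more than a million times.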

  15. Dependency Models (Trees) https://en.wikibooks.org/wiki/LaTeX/Linguistics

  16. Dropout http://cs.mcgill.ca/~hvanho2/comp551/

  17. Evaluation Terms - Mean Reciprocal Rank: explained well in Sec. 4.1 - Two-tailed t-test: statistical significance test; tests if target is higher OR lower than reference - Cohen’s d: effect size; used with t-test
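Two of these terms are simple enough to sketch with the standard library. The example assumes the usual textbook definitions: reciprocal rank is 1/rank of the correct answer (0 when it is absent), and Cohen's d is taken here as the difference of means over the pooled standard deviation of two independent samples. The paper's exact variant may differ, and the data below is made up.

```python
from math import sqrt
from statistics import mean, stdev

def mean_reciprocal_rank(ranks):
    """ranks: 1-based rank of the correct answer per query, or None if absent."""
    return mean(0.0 if r is None else 1.0 / r for r in ranks)

def cohens_d(a, b):
    """Effect size: difference of means divided by the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled = sqrt(((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2)
                  / (na + nb - 2))
    return (mean(a) - mean(b)) / pooled

mrr = mean_reciprocal_rank([1, 2, None, 4])  # (1 + 0.5 + 0 + 0.25) / 4
d = cohens_d([2, 4, 6], [1, 3, 5])           # means 4 vs 3, pooled sd 2
print(mrr)  # 0.4375
print(d)    # 0.5
```

The t-test says whether a difference is likely to be real; the effect size says how large it is, which is why the two are reported together.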

  18. Thanks!
