COMP 762: Paper Presentation (Technical Details)
Alexander Nicholson
Are Deep Neural Networks the Best Choice for Modeling Source Code?
Authors: Vincent J. Hellendoorn, Premkumar Devanbu
Overview:
▪ Big question in the title
▪ Creates a robust baseline model (Important!)
▪ Optimizations for a specific, practical task
Scope Language specific scoping? Static vs. Dynamic
Why online training/dynamism with (R)NN is hard: E.g. 50000 * 512 = 25600000 parameters between first and second layer. Image From: Towards Deep Learning Software Repositories, White et al.
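The arithmetic on this slide can be checked with a short sketch. The figures 50000 and 512 come from the slide; reading them as vocabulary size and hidden-layer width, and the helper name dense_layer_weights, are assumptions made for illustration.

```python
def dense_layer_weights(n_inputs: int, n_outputs: int) -> int:
    """Number of weights in a fully connected layer (biases ignored)."""
    return n_inputs * n_outputs


vocab_size = 50_000   # assumed: one input unit per vocabulary token
hidden_size = 512     # assumed: width of the layer that follows

n_weights = dense_layer_weights(vocab_size, hidden_size)
print(n_weights)  # 25600000

# Each token added to the vocabulary adds another full row of weights,
# none of which have been trained yet.
extra = dense_layer_weights(vocab_size + 1, hidden_size) - n_weights
print(extra)  # 512
```

Under these assumptions, every new identifier met at test time would need 512 fresh, untrained weights, which is one way to see why dynamic vocabulary updates are awkward for such a model.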
Smoothing (Discounting/Correction and Interpolation)
Discounting/Correction: Subtract a number δ from, or add it to, the counts.
Interpolation: “Add information from known distributions”. Weight additional distributions using λ.
Resources:
https://www.youtube.com/watch?v=FUS7XkhYBLo&list=PLBv09BD7ez_7Ke6U7yGBvfP4_Hau3ZGj2&index=5
https://nlp.stanford.edu/~wcmac/papers/20050421-smoothing-tutorial.pdf
Laplace/Lidstone Correction Add and re-normalize Image From: Stanford Smoothing Tutorial - https://nlp.stanford.edu/~wcmac/papers/20050421-smoothing-tutorial.pdf
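A minimal sketch of "add and re-normalize" for a unigram model. The function name, the toy counts, and the three-token vocabulary are assumptions for illustration; delta=1 gives Laplace (add-one) smoothing, and any other positive delta gives the general Lidstone form.

```python
from collections import Counter


def lidstone_prob(token, counts, vocab_size, delta=1.0):
    """Add `delta` to every count, then divide by the enlarged total."""
    total = sum(counts.values())
    return (counts.get(token, 0) + delta) / (total + delta * vocab_size)


counts = Counter(["x", "x", "y"])   # toy corpus: "z" is never observed
vocab = {"x", "y", "z"}

p = {t: lidstone_prob(t, counts, len(vocab)) for t in sorted(vocab)}
print(p["x"])           # 0.5
print(p["z"] > 0)       # True: the unseen token gets non-zero probability
```

The denominator grows by delta times the vocabulary size, which is the re-normalization step: the probabilities over the whole vocabulary still sum to one.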
Absolute Discounting Subtract and re-normalize Paper’s modification uses three values of δ Image From: Stanford Smoothing Tutorial - https://nlp.stanford.edu/~wcmac/papers/20050421-smoothing-tutorial.pdf
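A minimal sketch of "subtract and re-normalize", assuming a unigram model, a single discount value, and a uniform distribution to receive the freed probability mass. It is not the three-valued modification mentioned on the slide; the function name and the toy counts are hypothetical.

```python
def absolute_discount_prob(token, counts, vocab_size, discount=0.75):
    """Subtract a fixed discount from each seen count and spread the
    freed mass uniformly over the vocabulary.

    Assumes `counts` is non-empty, holds positive integers, and that
    0 < discount < 1.
    """
    total = sum(counts.values())
    seen_types = sum(1 for c in counts.values() if c > 0)
    discounted = max(counts.get(token, 0) - discount, 0.0) / total
    freed_mass = discount * seen_types / total   # mass removed from seen tokens
    return discounted + freed_mass / vocab_size


counts = {"x": 2, "y": 1}            # toy counts; "z" is unseen
vocab = ["x", "y", "z"]
probs = {t: absolute_discount_prob(t, counts, len(vocab)) for t in vocab}
print(probs)
```

Unlike add-delta correction, the amount taken from each seen token is the same regardless of its count, so frequent tokens lose proportionally less.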
Kneser-Ney Smoothing Paper’s modification uses three values of δ Image From: Stanford Smoothing Tutorial - https://nlp.stanford.edu/~wcmac/papers/20050421-smoothing-tutorial.pdf
Jelinek-Mercer Smoothing
General Interpolation: λ P(X) + (1 - λ) P(Z)
In J-M: λ is a constant.
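The interpolation formula on this slide can be sketched directly. The function name and the example probabilities are assumptions; p_x stands for the higher-order estimate P(X) and p_z for the lower-order estimate P(Z).

```python
def jelinek_mercer(p_x: float, p_z: float, lam: float = 0.5) -> float:
    """Mix two probability estimates with a constant weight `lam`."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * p_x + (1.0 - lam) * p_z


# Hypothetical case: the higher-order model has never seen the event
# (probability 0.0), while the lower-order model gives it 0.2.
mixed = jelinek_mercer(p_x=0.0, p_z=0.2, lam=0.7)
print(mixed)  # roughly 0.06
```

Because the weight is fixed, the same share of trust goes to the higher-order model whether its context was seen once or a thousand times.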
Witten-Bell Smoothing Image From: Stanford Smoothing Tutorial - https://nlp.stanford.edu/~wcmac/papers/20050421-smoothing-tutorial.pdf
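A minimal sketch of Witten-Bell smoothing as it is commonly formulated: the interpolation weight is not a constant but depends on how many distinct tokens have followed the context. The function name, the toy follower counts, and the uniform lower-order model are assumptions for illustration.

```python
def witten_bell_prob(token, followers, lower_order_prob):
    """`followers` maps each token seen after a context to its count;
    `lower_order_prob` is a function giving the back-off estimate."""
    total = sum(followers.values())
    if total == 0:
        return lower_order_prob(token)       # context never seen: back off fully
    distinct = sum(1 for c in followers.values() if c > 0)
    lam = total / (total + distinct)         # many distinct followers -> lower lam
    p_ml = followers.get(token, 0) / total   # maximum-likelihood estimate
    return lam * p_ml + (1.0 - lam) * lower_order_prob(token)


vocab = ["(", ".", ";", "="]
followers = {"(": 3, ".": 1}                 # toy counts after one context


def uniform(_token):
    return 1.0 / len(vocab)


wb = {t: witten_bell_prob(t, followers, uniform) for t in vocab}
print(wb)
```

A context followed by many different tokens is treated as unpredictable, so more weight shifts to the lower-order estimate.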
Trie Data-structure From Wikipedia: Children have common prefix.
Trie Data-structure Each scope has its own trie
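A minimal sketch of a counting trie over token sequences, with one trie per scope. The class names, the scope names "global" and "local", and the choice to count every prefix of an added sequence are assumptions for illustration, not the paper's implementation.

```python
class TrieNode:
    def __init__(self):
        self.count = 0        # number of added sequences passing through here
        self.children = {}    # next token -> TrieNode


class CountTrie:
    def __init__(self):
        self.root = TrieNode()

    def add(self, tokens, delta=1):
        """Add `delta` to the count of every prefix of `tokens`."""
        node = self.root
        for tok in tokens:
            node = node.children.setdefault(tok, TrieNode())
            node.count += delta

    def count(self, tokens):
        """Count of sequences added with `tokens` as a prefix (0 if unseen)."""
        node = self.root
        for tok in tokens:
            node = node.children.get(tok)
            if node is None:
                return 0
        return node.count


# One trie per scope (scope names are hypothetical).
tries = {"global": CountTrie(), "local": CountTrie()}
tries["global"].add(["for", "(", "int"])
tries["global"].add(["for", "(", "String"])
tries["local"].add(["for", "(", "int"])

print(tries["global"].count(["for", "("]))   # 2
print(tries["local"].count(["for", "("]))    # 1
print(tries["local"].count(["while"]))       # 0
```

Sequences that share a prefix share the nodes for that prefix, so the count of a context and the counts of its continuations sit next to each other.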
Zipf’s Law Image From: https://phys.org/news/2017-08-unzipping-zipf-law-solution-century-old.html
Memoization
- Optimization technique commonly used in dynamic programming
- Acts like a cache: stores results to avoid recomputing them.
http://cs.mcgill.ca/~jcheung/teaching/fall-2017/comp550/index.html
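A minimal sketch of memoization using Python's standard functools.lru_cache. The Fibonacci function is only a stand-in for any expensive recursive computation; it is not taken from the paper.

```python
from functools import lru_cache


@lru_cache(maxsize=None)      # unbounded cache keyed on the argument
def fib(n: int) -> int:
    """Naive recursion; the decorator stores each result after first use."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


result = fib(30)
print(result)                       # 832040
print(fib.cache_info().misses)      # 31: each n from 0 to 30 computed once
```

Without the cache the same call would recompute the small cases over a million times; with it, each distinct argument is evaluated once and later requests are lookups.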
Dependency Models (Trees) https://en.wikibooks.org/wiki/LaTeX/Linguistics
Dropout http://cs.mcgill.ca/~hvanho2/comp551/
Evaluation Terms
- Mean Reciprocal Rank
  - Explained well in Sec. 4.1
- Two-tailed t-test
  - Statistical significance test
  - Tests if target is higher OR lower than reference.
- Cohen’s d
  - Effect Size
  - Used with t-test
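Two of the terms above can be sketched with the standard library. The function names, the convention of scoring a missing answer as 0, the pooled-standard-deviation form of Cohen's d, and the sample numbers are all assumptions for illustration; the t-test itself is not shown.

```python
from math import sqrt
from statistics import mean, stdev


def mean_reciprocal_rank(ranks):
    """`ranks` holds the 1-based rank of the correct answer for each query,
    or None when the correct answer was not in the suggestion list."""
    return mean(0.0 if r is None else 1.0 / r for r in ranks)


def cohens_d(a, b):
    """Difference of means divided by the pooled sample standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)


mrr = mean_reciprocal_rank([1, 2, None, 4])
print(mrr)                                   # 0.4375

d = cohens_d([2, 4, 6], [1, 3, 5])
print(d)                                     # about 0.5
```

The t-test says whether a difference is likely to be real; Cohen's d says how large it is in units of standard deviation, which is why the two are reported together.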
Thanks!