
Tracking the Flow of Ideas through the Programming Languages Literature


  1. Tracking the Flow of Ideas through the Programming Languages Literature. Michael Greenberg, Kathleen Fisher, and David Walker

  2. How can we understand the PL literature? (Alexandre Duret-Lutz)

  3. Is there more related work I should cite? Is my work a better fit for PLDI or POPL? Who should I invite to this PC? Who should review this paper? Was this a typical year for ICFP? How has OOPSLA changed over the years?

  4. Types, Optimization, Verification, Synthesis, Abstract Interpretation

  5. What is a ‘topic’ in a document? Word counts: type 120, system 83, check 34, static 21.
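A minimal sketch of how word counts like the ones on this slide could be produced. The tokenizer (lowercase alphabetic runs) and the sample sentence are assumptions made for illustration; the slides do not say how the authors tokenized their documents.

```python
import re
from collections import Counter

def word_counts(text):
    """Lowercase the text, split it into alphabetic tokens, and count them."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(tokens)

# Hypothetical abstract fragment, invented for illustration only.
abstract = "A static type system can check types. The type checker is static."
counts = word_counts(abstract)

# Most frequent words first, as in the slide's Word/Count table.
for word, count in counts.most_common(3):
    print(word, count)
```

A document reduced to such counts is a "bag of words": word order is discarded and only frequencies remain.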

  6. Topics are distributions of words. “Parsing” topic (word: log likelihood): grammar: -3.905040; language: -4.206531; structure: -4.308618; parser: -4.513348; …

  7. Documents are a mix of topics. .6 type systems (word counts: type 120, system 83, check 34, static 21); .28 operational semantics (semantics 90, step 45, reduce 38, evaluate 19); .22 object-orientation (object 88, class 13, instance 12, method 7).

  8. Documents are a mix of topics: <.6, .28, .22> over type systems, operational semantics, and object-orientation.

  9. Generative LDA topic model. Takikawa, Strickland, Dimoulas, Tobin-Hochstadt, and Felleisen. Gradual typing for first-class classes. OOPSLA 2012.
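The generative story behind LDA can be sketched in a few lines: draw a topic mixture for the document, then for each word position pick a topic from the mixture and a word from that topic. The sketch below is illustrative only. The topic names and words come from the earlier slides, but the uniform word probabilities, the `alpha` values, and the function names are assumptions, not the authors' model.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw a probability vector from Dirichlet(alpha) via normalized gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def sample_document(topics, alpha, length, rng):
    """Generate `length` words following the LDA generative process."""
    names = list(topics)
    mix = sample_dirichlet(alpha, rng)  # per-document topic mixture
    words = []
    for _ in range(length):
        topic = rng.choices(names, weights=mix, k=1)[0]  # pick a topic
        vocab = list(topics[topic])
        probs = [topics[topic][w] for w in vocab]
        words.append(rng.choices(vocab, weights=probs, k=1)[0])  # pick a word
    return mix, words

# Assumption: uniform word probabilities, chosen only to keep the example small.
TOPICS = {
    "type systems": {"type": 0.25, "system": 0.25, "check": 0.25, "static": 0.25},
    "operational semantics": {"semantics": 0.25, "step": 0.25, "reduce": 0.25, "evaluate": 0.25},
    "object-orientation": {"object": 0.25, "class": 0.25, "instance": 0.25, "method": 0.25},
}

rng = random.Random(0)
mix, words = sample_document(TOPICS, alpha=[1.0, 1.0, 1.0], length=50, rng=rng)
print([round(m, 2) for m in mix], words[:8])
```

Inference runs this story in reverse: given only the words, it estimates the topics and each document's mixture.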

  10. Inference with LDA: N bags of words ➞ LDA-C* ➞ k topics and N vectors v_1 … v_N in k-dimensional space. *http://www.cs.princeton.edu/~blei/lda-c/
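As a rough illustration of the "N bags of words" input, the sketch below writes bags in the sparse line format that, to the best of my reading of the LDA-C README, the tool expects: `[M] [term_1]:[count] … [term_N]:[count]`, where M is the number of unique terms in the document and each term is an integer index. The helper names, the zero-based sorted indexing, and the sample bags are assumptions for illustration; check the README before relying on the details.

```python
from collections import Counter

def build_vocabulary(bags):
    """Assign each distinct term an integer index, in sorted order."""
    terms = sorted({term for bag in bags for term in bag})
    return {term: i for i, term in enumerate(terms)}

def to_ldac_line(bag, vocab):
    """Format one bag as '[M] [term]:[count] ...', with M = number of unique terms."""
    pairs = sorted((vocab[term], count) for term, count in bag.items())
    return " ".join([str(len(pairs))] + [f"{i}:{c}" for i, c in pairs])

# Hypothetical bags of words, invented for illustration only.
bags = [Counter({"type": 3, "system": 2}), Counter({"heap": 1, "type": 1})]
vocab = build_vocabulary(bags)
lines = [to_ldac_line(bag, vocab) for bag in bags]
print("\n".join(lines))
```

The combined vocabulary has to be kept alongside the data file so that the integer indices in the fitted topics can be mapped back to words.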

  11. Pipeline: corpus (N docs) ➞ parse ➞ N bags of words and a combined vocabulary ➞ LDA-C ➞ k topics and N vectors v_1 … v_N ➞ post ➞ k top words and k top papers; aggregate vectors by year and by conference; k topic names assigned by hand.
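The "aggregate vectors by year, by conference" step might look like the sketch below, which sums the topic vectors of all papers sharing a conference and year. Summing (rather than averaging) is an assumption, as are the function name and the sample data; the slides do not specify the aggregation.

```python
from collections import defaultdict

def aggregate(papers, k):
    """Sum k-dimensional topic vectors per (conference, year) group."""
    totals = defaultdict(lambda: [0.0] * k)
    for conference, year, vector in papers:
        group = totals[(conference, year)]
        for i, weight in enumerate(vector):
            group[i] += weight
    return dict(totals)

# Hypothetical papers: (conference, year, topic vector with k = 2).
papers = [
    ("POPL", 2010, [0.6, 0.4]),
    ("POPL", 2010, [0.1, 0.9]),
    ("PLDI", 2010, [0.5, 0.5]),
]
totals = aggregate(papers, k=2)
for key in sorted(totals):
    print(key, totals[key])
```

Plotting one component of these totals against the year, with one line per conference, gives a per-topic trend chart.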

  12. Parsing • Parsing drops standard stopwords (a, about, above, after, again, against, …) • Added some extra ones with TF-IDF • Stemmed words using nltk* (calculi ➞ calculus, goes ➞ go) • Removes plurals, etc. *http://www.nltk.org/
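The parsing steps above can be sketched with the standard library alone. This is a toy stand-in, not the authors' code: the stopword set is truncated, `NORMALIZE` is a hand-written lookup that takes the place of nltk's word normalization, and the use of inverse document frequency to flag extra stopwords is a guess at how TF-IDF was applied.

```python
import math
import re
from collections import Counter

# Assumption: a tiny stopword set; the first six entries come from the slide.
STOPWORDS = {"a", "about", "above", "after", "again", "against", "the", "of"}
# Assumption: a hand-written lookup standing in for nltk's normalization.
NORMALIZE = {"calculi": "calculus", "goes": "go"}

def parse(text):
    """Tokenize, drop stopwords, and normalize the remaining words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    kept = [t for t in tokens if t not in STOPWORDS]
    return [NORMALIZE.get(t, t) for t in kept]

def inverse_document_frequency(docs):
    """Return log(N / df) per word; words found in every document score 0."""
    n = len(docs)
    df = Counter(word for doc in docs for word in set(doc))
    return {word: math.log(n / count) for word, count in df.items()}

tokens = parse("The semantics of calculi goes above types")
print(tokens)

# Hypothetical parsed documents; "paper" appears in all of them.
docs = [["paper", "type"], ["paper", "heap"], ["paper", "parser"]]
idf = inverse_document_frequency(docs)
extra_stopwords = [w for w, score in idf.items() if score == 0.0]
print(extra_stopwords)
```

Words that appear in nearly every document carry little topical signal, which is why a low inverse document frequency is a reasonable cue for treating them as extra stopwords.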

  13. Our corpora • Abstracts: ICFP, OOPSLA, PLDI, POPL • 4,355 documents • Imperfect data in the ACM Digital Library • Fulltext: PLDI, POPL • 2,257 documents • Imperfect PDF-to-text conversion

  14. Let’s name a topic! Top words: object, heap, region, memory, pointer, collector, garbage, collection, allocation, reference. Top papers: Space overhead bounds for dynamic memory management with partial compaction; Schism: fragmentation-tolerant real-time garbage collection; Portable, unobtrusive garbage collection for multiprocessor systems; Limitations of partial compaction: towards practical bounds; Correctness-preserving derivation of concurrent garbage collection algorithms; The ramifications of sharing in data structures; A general framework for certifying garbage collectors and their mutators; Beltway: getting around garbage collection gridlock; On bounding time and space for multiprocessor garbage collection; Garbage collection without paging. Name: Garbage collection!

  15. Topic names for k=20, abstracts: Compiler optimization; Array Processing; Verification; Program Logics; Resource management; Garbage Collection; Test generation; Parallelism; Parsing; Components and APIs; Object-Oriented Programming; Language Design; Low-level compiler optimizations; Program Analysis; Analysis of Concurrent Programs; Models and Modeling; Semantics of concurrent programs; Type Systems; Applications; Object-oriented software development.

  17. How has OOPSLA changed over the years? Did changing the CfP change things? What about becoming part of SPLASH!? [Figure: topic weight (y-axis, 0 to 30) by year (x-axis, 1980 to 2010) for each conference (ICFP, OOPSLA, PLDI, POPL), one panel per abstract topic: Compiler optimization; Resource management; Parsing; Low-level compiler optimizations; Semantics of concurrent programs; Array Processing; Garbage Collection; Components and APIs; Program Analysis; Type Systems; Verification; Test generation; Object-Oriented Programming; Analysis of Concurrent Programs; Applications; Program Logics; Parallelism; Language Design; Models and Modeling; Object-oriented software development.]

  18. OOPSLA Call for Papers. 2006: all aspects of object and related technologies. 2007: paradigms beyond the traditional concept of object-oriented programming. 2010: foundations of programming languages and software engineering, broadly construed.

  19. [Figure annotations: CfP, SPLASH!]

  20. What trends are visible in program verification across the decades? [Figure: topic weight (0 to 30) by year (1980 to 2010) for ICFP, OOPSLA, PLDI, and POPL, one panel per abstract topic, as on slide 17.]

  21. [Figure: Program Logics. Topic weight (0 to 30) by year (1980 to 2010) for ICFP, OOPSLA, PLDI, and POPL.]

  22. How has PLDI changed over time? Per “Future of PLDI” session in Edinburgh, what is the state of the community? [Figure: topic weight (0 to 30) by year (1980 to 2010) for ICFP, OOPSLA, PLDI, and POPL, one panel per abstract topic, as on slide 17.]

  23. [Figure: Low-level compiler optimizations. Topic weight (0 to 30) by year (1980 to 2010) for ICFP, OOPSLA, PLDI, and POPL.]

  24. Topic names for k=20, full text: Data-driven optimization; Abstract interpretation; Object-orientation; Code generation; Data-structure correctness; Languages and control; Security and bugfinding; Processes and message passing; Garbage collection; Parallelization; Program transformation; Dynamic analysis; Low-level systems; Design; Program analysis; Proofs and models; Register allocation; Types; Concurrency; Parsing.

  25. How has PLDI changed over time? Let’s compare PLDI and POPL, using our fulltext corpus. [Figure: topic weight (y-axis, 0 to 1000) by year (x-axis, 1980 to 2010) for PLDI and POPL, one panel per full-text topic: Data-driven optimization; Data-structure correctness; Garbage collection; Low-level systems; Register allocation; Abstract interpretation; Languages and control; Parallelization; Design; Types; Object-orientation; Security and bugfinding; Program transformation; Program analysis; Concurrency; Code generation; Processes and message passing; Dynamic analysis; Proofs and models; Parsing.]

  27. Are there topics that used to be well represented in POPL? [Figure: topic weight (0 to 1000) by year (1980 to 2010) for PLDI and POPL, one panel per full-text topic, as on slide 25.]

  29. What topics are in POPL but not really in PLDI? [Figure: topic weight (0 to 1000) by year (1980 to 2010) for PLDI and POPL, one panel per full-text topic, as on slide 25.]

  31. Comparing documents: d is the distance between topic vectors v_1 and v_2. Are papers with close topic vectors related? Measure distance using symmetrized KL divergence, which gives less weight to dimensions with small magnitude.
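A minimal sketch of a symmetrized KL distance between two topic vectors, computed as KL(p‖q) + KL(q‖p). Several details are assumptions rather than anything stated on the slide: the vectors are renormalized to sum to 1, a small `eps` is added to avoid division by zero, and the two directions are summed (some definitions halve the sum). The sample vectors are invented.

```python
import math

def normalize(vector, eps):
    """Add a small constant to each entry and rescale so the entries sum to 1."""
    shifted = [v + eps for v in vector]
    total = sum(shifted)
    return [v / total for v in shifted]

def kl(p, q):
    """Kullback-Leibler divergence KL(p || q) for strictly positive vectors."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def symmetrized_kl(p, q, eps=1e-12):
    """Symmetrized KL divergence: KL(p || q) + KL(q || p)."""
    p = normalize(p, eps)
    q = normalize(q, eps)
    return kl(p, q) + kl(q, p)

# Hypothetical topic vectors for three papers (k = 3).
paper_a = [0.70, 0.20, 0.10]
paper_b = [0.65, 0.25, 0.10]
paper_c = [0.05, 0.15, 0.80]

print(symmetrized_kl(paper_a, paper_b))  # small: similar topic mixtures
print(symmetrized_kl(paper_a, paper_c))  # larger: different topic mixtures
```

Each term is weighted by the probability mass in that dimension, which is one way to read the slide's remark that dimensions with small magnitude count for less.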

  32. [Figure: distance (y-axis, 0 to 15) by paper (x-axis: CDRS, PCC, SEMC, TAL), comparing paper sets: Citations, Random 1, Random 2, Random 3, Random 4, Random 5.]

  33. http://tmpl.weaselhat.com

  34. Ideas and plans. Beginning of a new project. What do you think we should do? Models for researchers: v_1 … v_N.

  35. Limitations/problems • ACM DL is missing data • No programmatic access • Unclear choices about models • Abstracts or fulltext? k=20? k=30? k=200? • Which documents should ‘seed’ LDA?
