

1. Learning Execution through Neural Code Fusion
Zhan Shi, Kevin Swersky, Danny Tarlow, Parthasarathy Ranganathan, Milad Hashemi

2. Overview
● Motivation
● Background
● Neural Code Fusion
● Experimental Results
● Conclusion

3. Motivation

4. 2% Performance/Year is the New Normal
Source: Parthasarathy Ranganathan, More Moore: Thinking Outside the (Server) Box

5. Motivation
● Dynamic speculative execution
○ Branch prediction, value prediction, cache replacement, prefetching...

6. Motivation
● Dynamic speculative execution
○ Branch prediction, value prediction, cache replacement, prefetching...
● Static source code
○ Variable naming, finding bugs, algorithm classification, program synthesis...
○ Performance-related tasks: device mapping, thread coarsening, throughput prediction...

7. Motivation
● Dynamic speculative execution
○ Branch prediction, value prediction, cache replacement, prefetching...
● Static source code
○ Variable naming, finding bugs, algorithm classification, program synthesis...
○ Performance-related tasks: device mapping, thread coarsening, throughput prediction...
● Both views provide useful features

8. Example: a “Simple” Case for Branch Prediction
for (i = 0; i < k; i++) { }

9. Example: a “Simple” Case for Branch Prediction
for (i = 0; i < k; i++) { }    // highly biased

10. Example: a “Simple” Case for Branch Prediction
for (i = 0; i < k; i++) { }    // highly biased; branch history doesn't help

11. Example: a “Simple” Case for Branch Prediction
while (...) {
    generate k;
    for (i = 0; i < k; i++) {    // highly biased; branch history doesn't help
    }
}

12. Example: a “Simple” Case for Branch Prediction
while (...) {
    generate k;
    for (i = 0; i < k; i++) {    // highly biased; branch history doesn't help
    }
}
● The loop branch jumps out when i gets “close enough” to k
● Predictable if we knew the relation (see the sketch below):
○ [Static] i and k are compared
○ [Dynamic] the values of i and k
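To make the intuition concrete, here is a minimal sketch (an illustration, not the paper's model): a predictor that sees both the static fact "this branch compares i against k" and the dynamic values of i and k predicts the loop exit perfectly, while a purely history-based predictor keeps guessing "taken" and misses every exit. All names here are hypothetical.

def history_predictor(history):
    # Majority vote over past outcomes: right inside the loop, wrong at the exit.
    return history.count(True) >= history.count(False) if history else True

def fused_predictor(i, k):
    # Static view: the branch tests i < k. Dynamic view: the current values.
    return i < k                          # True means the loop branch is taken again

history, hist_hits, fused_hits, total = [], 0, 0, 0
for k in (5, 2, 7):                       # k regenerated by the outer while loop
    for i in range(k + 1):                # last iteration (i == k) falls through
        taken = i < k
        hist_hits += history_predictor(history) == taken
        fused_hits += fused_predictor(i, k) == taken
        history.append(taken)
        total += 1
print(f"history-based: {hist_hits}/{total}, fused: {fused_hits}/{total}")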

13. Background: Graph Neural Networks

14. Background: Graph Neural Networks
● Typical deep learning operates on IID data points.

15. Background: Graph Neural Networks
● What if the data points had relational information?
(Battaglia et al., 2018)

16.-20. Background: Graph Neural Networks
● Message passing
[Figure, built up across slides 16-20: an input graph; node states after steps 0, 1, and 2 of message passing; GRU units update each node's state from its incoming messages.]
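A minimal message-passing sketch with a GRU-style update, to pin down what the figure shows. This is illustrative: the weights are random and the shapes are assumptions, not the paper's gated graph network code.

import numpy as np

rng = np.random.default_rng(0)
n, d = 4, 8                                 # 4 nodes, 8-dimensional states
A = np.array([[0, 1, 0, 0],                 # adjacency matrix of the input graph
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
h = rng.normal(size=(n, d))                 # step-0 node states
Wz, Wr, Wh = (rng.normal(size=(2 * d, d)) for _ in range(3))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

for step in range(2):                       # steps 1 and 2 from the figure
    m = A @ h                               # message: sum of neighbors' states
    x = np.concatenate([m, h], axis=1)
    z = sigmoid(x @ Wz)                     # update gate
    r = sigmoid(x @ Wr)                     # reset gate
    h_new = np.tanh(np.concatenate([m, r * h], axis=1) @ Wh)
    h = (1 - z) * h + z * h_new             # GRU-style state update
print(h.shape)                              # (4, 8): one representation per node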

21. Programs as Graphs
(Allamanis et al., 2017)

22. Representing Static and Dynamic Information
● Graphs are an effective representation for static code
● How do we generally represent dynamic information in a model?

23. Neural Code Fusion

24. Full System

25. Assembly vs Source Code
● Highly structured

27. Assembly vs Source Code
● Highly structured
● Directly relates data to program semantics

28. Assembly vs Source Code
● Highly structured
● Directly relates data to program semantics
● Easy to use for architecture tasks

29. Code Fusion Graph Representation

30. Dynamic Tasks: Control Flow and Data Flow
● Control flow (branch prediction)
○ Predict whether a branch statement will be taken or not taken
○ Set the branch instruction node to be the target node
○ Binary classification

31. Dynamic Tasks: Control Flow and Data Flow
● Control flow (branch prediction)
○ Predict whether a branch statement will be taken or not taken
○ Set the branch instruction node to be the target node
○ Binary classification
● Data flow (prefetching)
○ Predict which address will be accessed next
○ Set the src node to be the target node
○ Predict a 64-bit address
(A sketch of both readouts follows this list.)
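A sketch of the two task readouts. Names, shapes, and the address-as-64-bits head are assumptions drawn from the slide, not the paper's code: after message passing, each task reads out the final state of its target node.

import numpy as np

rng = np.random.default_rng(0)
d = 8
h_branch = rng.normal(size=d)          # final state of the branch instruction node
h_src = rng.normal(size=d)             # final state of the src operand node

w_branch = rng.normal(size=d)          # binary head: taken vs. not taken
W_addr = rng.normal(size=(d, 64))      # address head: one logit per address bit

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

p_taken = sigmoid(h_branch @ w_branch)           # P(branch taken)
bits = sigmoid(h_src @ W_addr) > 0.5             # predicted 64 address bits
addr = sum(int(b) << i for i, b in enumerate(bits))
print(round(float(p_taken), 3), hex(addr))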

32. Multi-Task Representation
● Many other static/dynamic tasks can be defined on the graph simultaneously
○ Value prediction, indirect branch prediction, memory disambiguation, caching...

33. Dynamic Snapshots
● Snapshots
○ The values of the set of variable nodes
○ Captured during program execution
● Used to initialize the graph neural network
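One way to picture the initialization, as a sketch (the exact mechanism and shapes are assumptions): each variable node's captured value becomes a bit vector that seeds its initial state, and nodes without a snapshot value start from zeros.

import numpy as np

def value_to_bits(v, width=32):
    return np.array([(v >> i) & 1 for i in range(width)], dtype=float)

snapshot = {"i": 3, "k": 7}                    # values captured during execution
node_ids = {"i": 0, "k": 1, "branch": 2}       # hypothetical node indexing
h0 = np.zeros((len(node_ids), 32))             # step-0 node states
for name, value in snapshot.items():
    h0[node_ids[name]] = value_to_bits(value)
print(h0[:2, :4])                              # low-order bits of i and k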

34. Representation Study
● Number “3” in different representations
○ Categorical: [1, 0, 0, 0]
○ Scalar: 3
○ Binary: 11
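The slide's three encodings of the number 3, made concrete. The vocabulary and bit width below are illustrative assumptions:

import numpy as np

def categorical(v, vocab):                 # one-hot over a fixed vocabulary
    e = np.zeros(len(vocab))
    e[vocab.index(v)] = 1.0
    return e

def scalar(v):                             # raw magnitude as a single feature
    return np.array([float(v)])

def binary(v, width=8):                    # one feature per bit, LSB first
    return np.array([(v >> i) & 1 for i in range(width)], dtype=float)

print(categorical(3, vocab=[3, 5, 7, 9]))  # [1. 0. 0. 0.]
print(scalar(3))                           # [3.]
print(binary(3))                           # [1. 1. 0. 0. 0. 0. 0. 0.]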

35. Representation Study
● Task: correctly predict when the inner loop jumps out
● Sample k values as training data
for (k = 0; k < n; k += 3) {
    for (i = 0; i < k; i++) { }
}

36. Representation Study: Results
● Binary > scalar > categorical

37. Experimental Results

38. Experimental Setup
● Benchmarks
○ SPEC06 INT
● Tasks
○ Dynamic: control flow (branch prediction) and data flow (prefetching)
○ Static: algorithm classification
● Offline evaluation for both NCF and baselines
○ 70% training
○ 30% testing

39. Control-Flow (Branch Prediction) and Data-Flow (Prefetching)

40. Algorithm Classification
● Tests the usefulness of the learned representation
● We pre-train our GNN on the control-flow task
● A simple linear SVM model on the learned representations
● We get 96% vs 95.3% (trained on 50M lines of LLVM IR) using 200k lines of assembly, with no external data sources
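A sketch of the transfer setup, using random stand-ins for the real data: graph embeddings from a GNN pre-trained on the control-flow task feed a simple linear SVM that classifies algorithms. The dimensions and class count are arbitrary placeholders.

import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 32))            # stand-in pre-trained graph embeddings
y = rng.integers(0, 4, size=200)          # stand-in algorithm-class labels

split = int(0.7 * len(X))                 # 70% training / 30% testing
clf = LinearSVC().fit(X[:split], y[:split])
print("test accuracy:", clf.score(X[split:], y[split:]))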

41. Summary
● NCF combines static and dynamic information
○ Creates useful representations
● Different from traditional dynamic models in architecture
○ Data is usually purely dynamic
○ Model is history-based
● Enhances static models with dynamic program behavior
○ The learned representation can also transfer to an unseen static task

  42. Thank you! Questions?
