PUBLIC Sony Interactive Entertainment TOY PROGRAMMING “Demo of a repository for statically compiled programs” 2016 US LLVM Developers’ Meeting Paul Bowen-Huggett paul.huggett@sony.com
Agenda Background
RFC
• Is the idea generally sound? Obvious improvements?
• Is it something we should think about for LLVM?
• There are several potentially related projects (C++ modules IFC, compilation database, ThinLTO, etc.). Views from respective owners?
Chromium Browser Build Ratios
[Chart: front-end and back-end share of compile time (0% to 100%) per source file, for Debug and Release builds.]
Median front-/back-end time ratio: Release 19% / 81%, Debug 40% / 60%
Chromium Browser COMDAT Groups
[Chart: COMDAT group size in bytes (log scale, 10 to 10,000) against number of instances (100 to 500).]
Number of instances: Generated 577,397, Discarded 576,233 (99.8%)
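The percentage on this slide follows from the two instance counts. A minimal check, assuming "Discarded" counts the COMDAT group instances thrown away out of those "Generated" (the variable names are illustrative, not from the slides):

```python
# Counts taken from the slide; variable names are illustrative.
generated = 577_397  # COMDAT group instances generated
discarded = 576_233  # instances discarded (assumed to be a subset of generated)

kept = generated - discarded
discard_ratio = discarded / generated

print(f"kept: {kept}")
print(f"discarded: {discard_ratio:.1%}")
```

Under that reading, only a small remainder of the generated instances survives into the final output.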
Toy Tools
• Toy programming language
• Available on GitHub: https://github.com/SNSystems/Toy-tools
• Command line tools:

Role               Name
Compiler           toycc
Linker             toyld
Debugger           toydb
Runtime            toyvm
Garbage Collector  toygc
Strip              toystrip
Merge              toymerge
[Diagram: tool flow. a.toy and b.toy → toycc → a.o and b.o → toyld → main.x → toyvm / toydb, with the repository (🔘) and toygc alongside.]
Limitations
1. It’s just a toy!
2. Written in Python (3.5)
3. Output files are YAML
4. No concurrency
5. No backward compatibility
6. Says nothing about performance
7. The Toy language is nothing like C++:
   • VM has no registers, 3 stacks
   • Dynamic language, no user-defined types, no vague linkage…
Demo 1. “Hello, World” 2. “Modules” 3. “Distributed”
entry point

“ticket” files: main.o → UUID 1, sieve.o → UUID 2, factorial.o → UUID 3, factorial.oʹ → UUID 4

“tickets” table
Key     Value (name → digest)
UUID 1  “main” → d(f1)
UUID 2  “sieve” → d(f2)
UUID 3  “factorial” → d(f3), “fact3” → d(f4)
UUID 4  “factorial” → d(f3), “fact3” → d(f4)

“fragments” table
Key    type       binary                   internal fixups          external fixups
d(f1)  .text      55 48 89 e5 48 83        []                       “sieve”+0, “fact3”+0
       .eh_frame  01 7a 52 00 01 78 10 01  .text+0x19, .text+0-x2f  []
d(f2)  …
d(f3)  …
d(f4)  .text      55 48 89 e5 48 83        []                       “factorial”+0
d(f5)  .text      66 4e 89 e5 48 83        []                       “factorial”+0
entry point

“fragments” table
Key    type       binary                   internal fixups          external fixups
d(f1)  .text      55 48 89 e5 48 83        []                       “sieve”+0, “fact3”+0
       .eh_frame  01 7a 52 00 01 78 10 01  .text+0x19, .text+0-x2f  []
d(f2)  …
d(f3)  …
d(f4)  .text      55 48 89 e5 48 83        []                       “factorial”+0
d(f5)  .text      66 4e 89 e5 48 83        []                       “factorial”+0
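The two tables can be read as a content-addressed store: fragments are keyed by a digest of their contents, and each ticket maps names to digests. The sketch below is a minimal illustration of that idea, not the Toy tools' code. The `Repository` class, its method names, and the use of SHA-256 over `repr()` as the digest are all assumptions, and the section bytes are illustrative.

```python
import hashlib
import uuid


class Repository:
    """Hypothetical in-memory store with a fragments and a tickets table."""

    def __init__(self):
        self.fragments = {}  # digest -> list of sections
        self.tickets = {}    # ticket UUID -> {name: digest}

    def add_fragment(self, sections):
        """Store sections under the digest of their contents; return the digest.

        Each section is (type, binary, internal_fixups, external_fixups).
        SHA-256 over repr() is a stand-in for whatever digest a real tool uses.
        """
        digest = hashlib.sha256(repr(sections).encode()).hexdigest()
        # If an identical fragment is already stored, nothing new is added.
        self.fragments.setdefault(digest, sections)
        return digest

    def add_ticket(self, definitions):
        """Record one compilation; `definitions` maps name -> sections."""
        ticket_id = str(uuid.uuid4())
        self.tickets[ticket_id] = {
            name: self.add_fragment(sections)
            for name, sections in definitions.items()
        }
        return ticket_id


# Illustrative data: one function stays the same, one changes between builds.
repo = Repository()
factorial = [(".text", bytes.fromhex("554889e54883"), [], [])]
fact3_v1 = [(".text", bytes.fromhex("554889e54883"), [], ["factorial+0"])]
fact3_v2 = [(".text", bytes.fromhex("664e89e54883"), [], ["factorial+0"])]

t_old = repo.add_ticket({"factorial": factorial, "fact3": fact3_v1})
t_new = repo.add_ticket({"factorial": factorial, "fact3": fact3_v2})

print(len(repo.fragments))  # three fragments stored for four definitions
```

In this sketch the second ticket reuses the stored fragment for the unchanged definition and adds only one new fragment, which is the property the tables are meant to illustrate.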
Distributed Builds Should the repository be a network resource?
Distributed Build
[Diagram: Agent 1 compiles S1 and S2 into T1 and T2; Agent 2 compiles S3 and S4 into T3 and T4. The results are merged into the repository (🔘), then linked and stripped to produce the binary.]
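One way to picture the merge step is as a union of content-addressed tables, where a fragment whose digest is already present is not copied again. The slides do not describe how toymerge works, so the sketch below is an assumption throughout: the `merge` function, the dict layout, and the digests and ticket ids are all made up for illustration.

```python
def merge(target, source):
    """Copy into `target` every fragment and ticket that it lacks.

    Both arguments are dicts with a "fragments" table (digest -> contents)
    and a "tickets" table (ticket id -> {name: digest}).
    Returns the number of fragments actually copied.
    """
    copied = 0
    for digest, contents in source["fragments"].items():
        if digest not in target["fragments"]:
            target["fragments"][digest] = contents
            copied += 1
    for ticket_id, names in source["tickets"].items():
        target["tickets"].setdefault(ticket_id, dict(names))
    return copied


# Two build agents, each with a local repository (hypothetical contents).
agent1 = {
    "fragments": {"d1": b"\x01", "d2": b"\x02"},
    "tickets": {"T1": {"a": "d1"}, "T2": {"b": "d2"}},
}
agent2 = {
    "fragments": {"d2": b"\x02", "d3": b"\x03"},
    "tickets": {"T3": {"b": "d2"}, "T4": {"c": "d3"}},
}

central = {"fragments": {}, "tickets": {}}
copied_from_1 = merge(central, agent1)
copied_from_2 = merge(central, agent2)

print(copied_from_1, copied_from_2)
```

Because the fragment keyed `d2` arrives from both agents, the second merge copies only one fragment, while all four tickets are retained for the link step.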
Challenges?
• Remember, it’s just a toy… Need a production-ready C++ implementation
• Understand real-world growth rates and GC strategy
• Doesn’t show solutions to:
   • Fast storage with efficient indices
   • LLVM IR hashing
   • Efficient debug type references
Conclusion
• Potential to reduce re-compile times by ~60% (“speed-of-light” based on Chrome Debug)
• Small code changes benefit the most
• No source code changes
• Eradicate duplication and redundancy at source: minimize link-time processing and copying
• (Almost) No change to workflows
• Next steps:
   • Data store (in-process, memory-mapped hash tables)
   • Prototype:
      • IR hashing
      • MC back-end
      • Repository-based linker
Q & A