
Performance, Correctness, Exceptions: Pick Three (Andrea Gussoni et al.)

Andrea Gussoni, Alessandro Di Federico, Pietro Fezzardi, Giovanni Agosta. Politecnico di Milano, 24 February 2019.


  1. Performance, Correctness, Exceptions: Pick Three. Andrea Gussoni, Alessandro Di Federico, Pietro Fezzardi, Giovanni Agosta. Politecnico di Milano, 24 February 2019.

  2. Table of Contents: 1. Motivations; 2. rev.ng; 3. Design; 4. Experimental Results; 5. Conclusions.

  3. Motivations. Static binary translation has a variety of possible uses:
     - Support for legacy code.
     - Performance improvement for legacy architectures.
     - Instrumentation of code.

  4. Motivations: Goals.
     - Improve the performance of the translated binaries.
     - Do not reinvent the wheel: use off-the-shelf components as much as possible.
     - Be architecture independent, as is the whole rev.ng framework.

  5. Section 2: rev.ng.

  6. rev.ng: rev.ng EMU.

  7. rev.ng: input.elf -> Lift to QEMU IR -> Translate to LLVM IR -> Recompile -> output.elf

  8. rev.ng: input.elf -> Lift to QEMU IR -> Translate to LLVM IR -> Function Isolation -> Recompile -> output.elf

  9. rev.ng: The root Function. At present, the lifting phase places all the code recovered from the binary in a single (and often large) LLVM function, which we call root.

  10. rev.ng: The root Function. [Figure: control-flow graph of the root function. Its nodes are the basic blocks of every recovered function (for example bb._start, bb.main, bb.printf_core, bb.fmt_fp, bb.vfprintf), all placed in one graph.]

  11. rev.ng: The Dispatcher. What about indirect branches or indirect function calls (e.g. jmp rax)? We need the dispatcher.

  12. rev.ng: The Dispatcher.

          switch i64 @pc, label %dispatcher.default [
            i64 4194536, label %bb._init
            i64 4194542, label %bb._init.0x6
            i64 4194547, label %bb._init.0xb
            i64 4194560, label %bb._start
            i64 4194582, label %bb._start_c
            i64 4194614, label %bb._start.0x36
            i64 4194624, label %bb.deregister_tm_clones
            i64 4194645, label %bb.deregister_tm_clones.0x15
            i64 4194655, label %bb.deregister_tm_clones.0x1f
            i64 4194672, label %bb.deregister_tm_clones.0x30
            i64 4194688, label %bb.register_tm_clones
            i64 4194723, label %bb.register_tm_clones.0x23
            i64 4194733, label %bb.register_tm_clones.0x2d
            i64 4194744, label %bb.register_tm_clones.0x38
          ]
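
The role of the switch above can be illustrated with a toy model. The following Python sketch is not rev.ng's implementation: it assumes each basic block is a function that returns the next program counter, and the block bodies, the `UnexpectedPC` exception, and the `DISPATCH_TABLE` name are hypothetical. Only the three addresses are taken from the slide.

```python
# Toy model of a dispatcher. Each "basic block" is a Python function that
# updates the state and returns the next program counter (None means stop).
# Block bodies are hypothetical; only the addresses come from the slide.

class UnexpectedPC(Exception):
    """Raised when the program counter matches no known basic block."""


def bb_init(state):
    state["trace"].append("bb._init")
    return 4194542  # continue at bb._init.0x6


def bb_init_0x6(state):
    state["trace"].append("bb._init.0x6")
    return 4194547  # continue at bb._init.0xb


def bb_init_0xb(state):
    state["trace"].append("bb._init.0xb")
    return None  # hypothetical: the toy program ends here


# Plays the role of the switch: program counter -> basic block.
DISPATCH_TABLE = {
    4194536: bb_init,
    4194542: bb_init_0x6,
    4194547: bb_init_0xb,
}


def dispatcher(state, pc):
    """Look up the block for pc, run it, and repeat until a block stops."""
    while pc is not None:
        block = DISPATCH_TABLE.get(pc)
        if block is None:
            # Loosely corresponds to the default label of the switch.
            raise UnexpectedPC(hex(pc))
        pc = block(state)
    return state


if __name__ == "__main__":
    print(dispatcher({"trace": []}, 4194536)["trace"])
```

In this model every transfer of control pays for a table lookup, which is a rough analogue of the cost of passing through the dispatcher.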

  13. rev.ng: Current Limitations.
     - One major problem of the dispatcher is that every time we need to pass through it, we pay a high cost in terms of performance.
     - The CFG of the root function contains a lot of unnecessary edges, which leads to a convoluted topology.
     - This topology prevents many of the optimizations performed by opt.

  14. rev.ng: Current Limitations.

          bb.dispatcher:
            switch ...
          bb.main:
            store 0 @rax
            br %bb.main.0x8
          bb.main.0x8:
            %1 = load @rax
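
One plausible reading of the snippet above is that the load in bb.main.0x8 cannot be replaced by the constant 0, because the dispatcher is also a predecessor of that block and may carry a different value of rax. The Python sketch below illustrates this with a very small constant-propagation step. It assumes a simplified model in which the value at block entry is the meet over all predecessors; the `UNKNOWN` marker and the `rax_at_entry` helper are hypothetical names, not part of rev.ng or LLVM.

```python
# Minimal constant-propagation step for a single register (rax).
# Simplified model, for illustration only.

UNKNOWN = object()  # the value of rax is not known at compile time


def rax_at_entry(block, preds, rax_at_exit):
    """Return the value of rax on entry to `block`.

    The value is a known constant only if every predecessor leaves the
    same constant in rax; otherwise it is UNKNOWN.
    """
    values = {rax_at_exit[p] for p in preds[block]}
    if len(values) == 1:
        return values.pop()
    return UNKNOWN


# bb.main stores 0 to rax; nothing is known about rax after the dispatcher.
rax_at_exit = {"bb.main": 0, "bb.dispatcher": UNKNOWN}

# CFG as in the root function: the dispatcher can jump to any block.
preds_with_dispatcher = {"bb.main.0x8": ["bb.main", "bb.dispatcher"]}

# CFG without the extra dispatcher edge.
preds_without_dispatcher = {"bb.main.0x8": ["bb.main"]}

with_edge = rax_at_entry("bb.main.0x8", preds_with_dispatcher, rax_at_exit)
without_edge = rax_at_entry("bb.main.0x8", preds_without_dispatcher, rax_at_exit)
```

With the dispatcher edge, `with_edge` is `UNKNOWN` and the load must stay; without it, `without_edge` is `0` and the load could be folded into a constant.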

  15. Section 3: Design.

  16. Design: A naive approach. The natural thing to do is to try to reconstruct (with some approximations) the original function layout. Will things break? (Spoiler: yes, they will.)

  17. Design: Bird's-Eye View. The blocks of root are regrouped into one function per original function:

          def root():
            bb.foo:
            bb.foo.0x8:
            bb.main:
            bb.main.0xa:
            bb.bar:
            bb.bar.0x4:

          def foo():
            bb.foo:
            bb.foo.0x8:
              br %bb.bar ?

          def main():
            bb.main:
            bb.main.0xa:

          def bar():
            bb.bar:
            bb.bar.0x4:
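
The open question on this slide (what happens to `br %bb.bar` once foo is a separate function) can be made concrete with a small sketch. The Python code below is only an illustration: it assumes that a block's owning function can be read from its label, which is a simplification and not how rev.ng recovers function boundaries, and the edge list is hypothetical apart from the branch from bb.foo.0x8 to bb.bar. The names `owner` and `isolate` are also hypothetical.

```python
from collections import defaultdict


def owner(block):
    """Function a block belongs to, read from its label (an assumption).

    "bb.foo.0x8" -> "foo", "bb.foo" -> "foo"
    """
    return block.split(".")[1]


def isolate(blocks, edges):
    """Group blocks by function and list the edges that cross functions."""
    functions = defaultdict(list)
    for block in blocks:
        functions[owner(block)].append(block)
    crossing = [(src, dst) for src, dst in edges if owner(src) != owner(dst)]
    return dict(functions), crossing


blocks = ["bb.foo", "bb.foo.0x8", "bb.main", "bb.main.0xa", "bb.bar", "bb.bar.0x4"]

# Hypothetical edges; only bb.foo.0x8 -> bb.bar appears on the slide.
edges = [
    ("bb.foo", "bb.foo.0x8"),
    ("bb.foo.0x8", "bb.bar"),
    ("bb.main", "bb.main.0xa"),
    ("bb.bar", "bb.bar.0x4"),
]

functions, crossing = isolate(blocks, edges)
```

Here `crossing` contains exactly the branch from bb.foo.0x8 to bb.bar: inside root it is an ordinary jump, but after isolation its target lives in another function, so it can no longer be expressed as a plain branch.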

