RevEngE is a dish served cold: Debug-Oriented Malware Decompilation and Reassembly


  1. Introduction | RevEngE | Evaluation | Final Remarks. RevEngE is a dish served cold: Debug-Oriented Malware Decompilation and Reassembly. Marcus Botacin (1), Lucas Galante (2), Paulo Lício de Geus (2), André Grégio (1). (1) Federal University of Paraná (UFPR-BR), {mfbotacin, gregio}@inf.ufpr.br; (2) University of Campinas (UNICAMP-BR), {galante, paulo}@lasca.ic.unicamp.br. 1 / 44, ROOTS’19.

  2. Who Am I? Background: Computer Engineer (University of Campinas, Brazil); CS Master (University of Campinas, Brazil); CS PhD Student (Federal University of Paraná, Brazil); Malware Analyst (since 2012). Research interests: Malware Analysis & Detection; Hardware-Assisted Security.

  3. Topics: (1) Introduction: The Problem, Background; (2) RevEngE: Overview, Architecture; (3) Evaluation: Malware Decompilation, Malware Reassembly; (4) Final Remarks: Limitations, Conclusion, Questions?

  4. The Problem. Malware: hard to understand at a low level (e.g., assembly). Decompilers: lift low-level constructs to high-level semantics; allow API and/or source code analyses. Decompilation challenges: malware is not well-behaved; malware implements anti-analysis tricks; malware binaries contain dead code.

  5. Insights & Proposal (1/2). Current decompilers: they perform reasonably well with small pieces of code; they do not perform well with static disassembly. Current debuggers: they can perform dynamic disassembly and/or inspection.

  6. Insights & Proposal (2/2). Current analysts’ tasks: analysts already debug binaries in a sliced manner; analysts perform their own anti-anti-analysis routines. What if? Could we combine the analysts’ manual work with a decompiler? And decompile the small pieces debugged by the analyst? And let the analysts overcome anti-analysis by themselves?

  7. Topics (current section: Introduction, Background).

  8. Background. Compiler: parsing, pre-processing, assembling, optimization, and code generation. Decompiler: disassembly, lifting, data type recovery, and code generation. Notice: these are not the same code generation routines. A decompiler is an inverse compiler. There are cross-platform compilers and decompilers.

  9. The Challenges (1/2). Disassembly: opaque constants; overlapping instructions; data and code are mixed. Lifting: a typical ISA is VERY large. Have you ever executed VFMADDSUBPS? And there is O.S. support to consider as well... Do you know what NUMA is?

  10. The Challenges (2/2). Data type reconstruction: what is the difference between an array ( int a[2]; ) and consecutive variables ( int a,b; )? Is 0x77FF... an integer or a pointer? Code generation: how to implement it? Which optimizations? How to name variables? Evaluation: is recovered code a good metric for malware decompilation?
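
The array-versus-variables question can be made concrete with a small sketch. It uses Python's `ctypes` to model `int a[2];` and two adjacent `int` fields, which stand in for `int a,b;` under the assumption that the compiler places the two variables next to each other. The names `IntArray2`, `TwoInts`, and `same_layout` are hypothetical and not part of RevEngE.

```python
import ctypes

# Model of `int a[2];`
IntArray2 = ctypes.c_int * 2


class TwoInts(ctypes.Structure):
    """Model of `int a, b;` assuming the two ints end up adjacent in memory."""
    _fields_ = [("a", ctypes.c_int), ("b", ctypes.c_int)]


def same_layout(first: int, second: int) -> bool:
    """Return True when both models produce identical raw bytes."""
    arr = IntArray2(first, second)
    pair = TwoInts(first, second)
    return bytes(arr) == bytes(pair)


if __name__ == "__main__":
    # Both objects occupy the same number of bytes with the same contents,
    # so the raw memory alone does not reveal which declaration was used.
    print(ctypes.sizeof(IntArray2), ctypes.sizeof(TwoInts))
    print(same_layout(1, 2))
```

Because the bytes are identical, a decompiler has to rely on access patterns (for example, indexed accesses in a loop) rather than on memory contents to guess the original declaration.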

  11. Topics (current section: RevEngE, Overview).

  12. Reverse Engineering Engine Overview. PoC decompiler focused on malware analysis. GDB-powered (no reimplementation). Dynamic inspection (no static analysis constraints). Trace-oriented (decompile what is debugged). Reassembler (merge the decompiled pieces into a new piece of software).

  13. Topics (current section: RevEngE, Architecture).

  14. RevEngE-GDB Integration. Figure: RevEngE Architecture. GDB provides the basic debugging capabilities and was armored to handle malware anti-analysis techniques. The RevEngE decompiler is developed on top of the armored GDB.

  15. GDB Armoring.

      __libc_start_main (main=<value>, argc=<value>, ubp_av=<value>, init=<value>, fini=<value>, rtld_fini=<value>, stack_end=<value>

      Code Snippet 1: Libc Entry Point. The first argument points to the application entry point.

      output = gdb.execute("set $eflags |= 0x%x" % self.flag_map[flag], to_string=True)

      Code Snippet 2: Invert Branch Direction. The flags register is changed according to a map of the possible flags for this command.
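
The flag manipulation behind Code Snippet 2 can be sketched as follows. The bit masks are the standard x86 EFLAGS positions; the `FLAG_MAP` name, the `build_flag_command` and `apply_flag` helpers, and the optional toggle (`^=`) form are assumptions made for illustration, since the slide shows only the `|=` form. The `gdb` module exists only inside a GDB Python session, so the sketch builds the command string separately from executing it.

```python
# Standard x86 EFLAGS bit masks (status flags only).
FLAG_MAP = {
    "CF": 0x0001,  # carry
    "PF": 0x0004,  # parity
    "AF": 0x0010,  # auxiliary carry
    "ZF": 0x0040,  # zero
    "SF": 0x0080,  # sign
    "OF": 0x0800,  # overflow
}


def build_flag_command(flag: str, toggle: bool = False) -> str:
    """Build the GDB command that sets (default) or toggles one flag bit."""
    operator = "^=" if toggle else "|="
    return "set $eflags %s 0x%x" % (operator, FLAG_MAP[flag])


def apply_flag(flag: str, toggle: bool = False) -> str:
    """Execute the command; only works inside GDB's embedded Python."""
    import gdb  # provided by GDB itself, not installable from PyPI
    return gdb.execute(build_flag_command(flag, toggle), to_string=True)


if __name__ == "__main__":
    print(build_flag_command("ZF"))
    print(build_flag_command("ZF", toggle=True))
```

Setting a bit forces a condition to hold, while toggling it flips whatever the previous comparison produced; which one actually inverts a given branch depends on the jump instruction and the current flag state.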

  16. Instruction Representation. Figure: Instruction Representation. RevEngE benefits from Python’s polymorphism to model instruction behaviors and overloads method declarations to support the multiple argument types that each x86 instruction may take.

  17. Instruction Factory.

      class IFactory(...):
          def get(self, args):
              newclass = globals()[name](args)
              return newclass

      Code Snippet 3: Instruction Factory. The Factory design pattern allows instantiating objects of the proper class by exploiting Python OOP capabilities.

      self.classes['div'] = "IDiv"
      self.classes['divl'] = "IDiv"
      self.classes['idiv'] = "IDiv"
      self.classes['idivl'] = "IDiv"

      Code Snippet 4: Instruction Lifting. RevEngE assumes only signed integer operations so that all of these instructions are handled by the same high-level class.
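
A minimal, self-contained sketch of the factory idea in Code Snippets 3 and 4 is given below. Only `IFactory`, `get`, `self.classes`, and `IDiv` come from the slides; the `Instruction` base class, the `emit` method, the `mnemonic` parameter, and the simplified two-operand division are hypothetical (a real x86 `div` works on an implicit register pair).

```python
class Instruction:
    """Hypothetical base class for lifted instructions."""

    def __init__(self, args):
        self.args = args

    def emit(self) -> str:
        raise NotImplementedError


class IDiv(Instruction):
    """Single high-level class shared by all division mnemonics."""

    def emit(self) -> str:
        # Simplified: treats the instruction as `dividend /= divisor`.
        dividend, divisor = self.args
        return "%s = %s / %s;" % (dividend, dividend, divisor)


class IFactory:
    def __init__(self):
        # Several mnemonics map to the same class name, as on the slide.
        self.classes = {m: "IDiv" for m in ("div", "divl", "idiv", "idivl")}

    def get(self, mnemonic, args):
        # Look the class up by name and instantiate it.
        name = self.classes[mnemonic]
        return globals()[name](args)


if __name__ == "__main__":
    factory = IFactory()
    print(factory.get("idivl", ("a", "b")).emit())
```

Looking classes up by name keeps the mnemonic table as plain strings; storing the class objects directly in the dictionary would be an equally valid design and avoids the `globals()` lookup.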

  18. Lifting Complex Instructions.

      0x4004eb cmp -0x8(%rbp),%eax
      0x4004ee jle 4004fb <main+0x25>

      Code Snippet 5: Low-level representation of a conditional decision. IF statements are composed of multiple assembly instructions.

      class HighLevelCompare():
          def __init__(self, cmp, set):
              self.op1 = cmp.op1
              self.op2 = cmp.op2
              self.op3 = set.op3

      Code Snippet 6: High-level representation of a conditional decision. Assembly instructions are promoted to a single class that represents a high-level conditional structure (e.g., IFs).
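
The promotion of a `cmp`/`jle` pair into one conditional can be sketched as below. Only the `HighLevelCompare` name and its `op1`/`op2` fields come from the slides; the `Cmp` and `Jcc` records, the `JUMP_OPERATORS` table, the `taken_condition` method, and the operand names `var_8` and `eax` are hypothetical. The sketch assumes AT&T operand order and signed comparisons.

```python
from dataclasses import dataclass

# Conditional jump mnemonic -> C operator under which the jump is taken.
JUMP_OPERATORS = {
    "je": "==", "jne": "!=",
    "jl": "<", "jle": "<=",
    "jg": ">", "jge": ">=",
}


@dataclass
class Cmp:
    op1: str  # first AT&T operand
    op2: str  # second AT&T operand


@dataclass
class Jcc:
    mnemonic: str
    target: str


class HighLevelCompare:
    """Merges a compare and the conditional jump that follows it."""

    def __init__(self, cmp: Cmp, jump: Jcc):
        self.op1 = cmp.op1
        self.op2 = cmp.op2
        self.operator = JUMP_OPERATORS[jump.mnemonic]
        self.target = jump.target

    def taken_condition(self) -> str:
        # AT&T `cmp a, b` computes b - a, so `jle` is taken when b <= a.
        return "%s %s %s" % (self.op2, self.operator, self.op1)


if __name__ == "__main__":
    # Mirrors Code Snippet 5 with symbolic operand names.
    cond = HighLevelCompare(Cmp("var_8", "eax"), Jcc("jle", "0x4004fb"))
    print("jump to %s if (%s)" % (cond.target, cond.taken_condition()))
```

The fall-through path runs when the condition is false, so an emitted `if` body that corresponds to the fall-through code would use the negated condition.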

  19. Handling Variables.

      self.vars = VariableManager()
      self.vars.remove_registers(reg=arg1.get_operand())
      self.vars.check_is_pointer(var.get_value())

      Code Snippet 7: Variable Management. RevEngE does not handle variables directly but via a centralized manager to keep the context consistent.

      self.var = self.vars.new_var(reg="%eax")
      self.var = self.vars.new_var(reg=arg1.get_operand(), value=val)
      self.var = self.vars.new_var(value=arg1.get_value(), mem=arg2.get_operand())

      Code Snippet 8: Variable Manager. Context complexity is encapsulated by the manager, thus freeing RevEngE to focus on the decompilation logic.
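
One way a centralized manager like the one in Code Snippets 7 and 8 could work is sketched below. The method names `new_var`, `remove_registers`, and `check_is_pointer` and their keyword arguments come from the slides; everything else (the `Variable` record, the `lookup` helper, the naming scheme, and the address-range pointer test) is an assumption for illustration and not RevEngE's actual behavior.

```python
import itertools


class Variable:
    """Hypothetical record for one recovered variable."""

    def __init__(self, name, reg=None, mem=None, value=None):
        self.name = name
        self.reg = reg
        self.mem = mem
        self.value = value


class VariableManager:
    def __init__(self, pointer_ranges=()):
        self._counter = itertools.count()
        self._by_reg = {}  # register name -> Variable currently held there
        self._by_mem = {}  # memory operand -> Variable stored there
        # Assumed: [start, end) address ranges considered valid memory.
        self._pointer_ranges = list(pointer_ranges)

    def new_var(self, reg=None, mem=None, value=None):
        """Create a fresh variable and bind it to a register and/or memory."""
        var = Variable("var%d" % next(self._counter), reg, mem, value)
        if reg is not None:
            self._by_reg[reg] = var
        if mem is not None:
            self._by_mem[mem] = var
        return var

    def remove_registers(self, reg):
        """Forget the variable bound to a register, e.g. once overwritten."""
        self._by_reg.pop(reg, None)

    def lookup(self, reg):
        """Return the variable bound to a register, or None."""
        return self._by_reg.get(reg)

    def check_is_pointer(self, value):
        """Heuristic: a value inside a known address range is a pointer."""
        return any(lo <= value < hi for lo, hi in self._pointer_ranges)


if __name__ == "__main__":
    manager = VariableManager(pointer_ranges=[(0x400000, 0x401000)])
    var = manager.new_var(reg="%eax", value=0x4004EB)
    print(var.name, manager.check_is_pointer(var.value))
```

Keeping the register and memory bindings in one place means each instruction class only asks the manager for the current variable instead of tracking state itself. The range test is only a hint: an integer that happens to fall inside a mapped range would be misclassified as a pointer.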
