Evolving the Process Injection: Injecting Python Bytecode
WhoAmI ● Red teamer at Sberbank of Russia ● OSCP/OSCE/SLAE
Process Injection purposes ● Accessing/modifying in-memory data or program execution flow ● AV evasion & anti-forensics ● Post-exploitation & maintaining access
Traditional Process Injection ● Linux ○ strace ○ ptrace ○ SO Injection ● Windows ○ DLL Injection ○ Process Hollowing
Software is Evolving ● Interpreted (and VM-based) languages keep growing in popularity, even in non-traditional areas ● New paradigms: microservices, serverless technologies
Evolving & Attacker Benefits ● Injecting into VM-based (or other interpreted) languages potentially gives our payloads all the benefits of these languages: portability, the ability to use “high-level” interfaces instead of “low-level” functions, and more ● It is also very hard to investigate: how many people can detect and successfully reverse engineer Java or Python bytecode payloads?
Primary Goal ● Find a way to manipulate a running Python process to achieve the required impact on the target system (user-mode persistence, internal logic modification, and so on) ● The approach should work at least on x86_64 Linux with CPython 3.5.3
Python ● Interpreted, high-level programming language ● Object-oriented, mostly ● Strong community, great standard library, and excellent extensibility Note: Python itself is just a language reference; you can build your own Python engine and decide how it handles Python code: compile it to a special bytecode, interpret it as is, or even translate it to Java bytecode and execute it on the JVM
CPython ● The reference implementation of Python ● Written in C (and Python) ● Compiles source code to bytecode and interprets it with Python Virtual Machine (PVM)
CPython — Compilation 1. Parse source code into a parse tree 2. Transform parse tree into an Abstract Syntax Tree 3. Transform AST into a Control Flow Graph 4. Emit bytecode based on the Control Flow Graph
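The compilation steps above can be observed from Python itself. This is a minimal sketch, assuming any recent CPython 3 interpreter; the `<example>` filename and the variable names are illustrative only, and the Control Flow Graph stage stays internal to `compile()`, so it is not visible here.

```python
import ast
import dis

source = 'def hello():\n    print("Hello!")\n'

# Steps 1-2: parse the source text and obtain the Abstract Syntax Tree.
tree = ast.parse(source)
first_node = type(tree.body[0]).__name__

# Steps 3-4: compile() builds the control flow graph internally and emits bytecode.
module_code = compile(tree, filename="<example>", mode="exec")

# The function body is stored as a nested code object among the module constants.
hello_code = next(c for c in module_code.co_consts if hasattr(c, "co_code"))

print(first_node, hello_code.co_name, hello_code.co_code.hex())
dis.dis(hello_code)
```

The exact bytes printed depend on the interpreter version, because the bytecode format changes between CPython releases.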
CPython — Compilation Example
Compilation produces a set of Objects (e.g., CodeObject for handling bytecode) and prepares them for Interpretation in the PVM

    >>> import dis
    >>> def hello():
    ...     print("Hello!")
    ...
    >>> hello
    <function hello at 0x10bd210d0>
    >>> hello()
    Hello!
    >>> hello.__code__.co_code.hex()
    '740064018301010064005300'
    >>> dis.dis(hello.__code__)
      2           0 LOAD_GLOBAL              0 (print)
                  2 LOAD_CONST               1 ('Hello!')
                  4 CALL_FUNCTION            1
                  6 POP_TOP
                  8 LOAD_CONST               0 (None)
                 10 RETURN_VALUE
CPython — Python Virtual Machine (PVM) ● Virtual Stack Machine ○ Value Stack, Call Stack, and Block Stack ○ No registers ⇨ Short instruction list ● Custom memory management ○ Huge space mapped as MAP_ANONYMOUS | MAP_PRIVATE ○ Arenas, Pools, and Blocks — internal memory management primitives ⇨ Small number of system malloc/free calls ● Operates on Objects, not raw memory values ⇨ Keeps abstraction level high
CPython — A Short Guide to Objects ● Compilation and Interpretation produce a wide set of memory primitives named Objects ● C is not Object-Oriented, so all Objects are described with corresponding structs ● The PVM works with these structs, so we have to explore some of them to move forward
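To illustrate what "stack machine, no registers" means, here is a toy model. It is not CPython's implementation: the opcode names `PUSH`, `ADD` and `MUL` and the `run` function are hypothetical and exist only for this sketch.

```python
def run(program):
    """Execute a list of (opcode, argument) pairs on a single value stack."""
    stack = []
    for op, arg in program:
        if op == "PUSH":
            stack.append(arg)
        elif op == "ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == "MUL":
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        else:
            raise ValueError("unknown opcode: " + op)
    # The result of the whole program is whatever is left on top of the stack.
    return stack.pop()


# (2 + 3) * 4 expressed without any registers: operands live on the stack only.
result = run([("PUSH", 2), ("PUSH", 3), ("ADD", None), ("PUSH", 4), ("MUL", None)])
print(result)
```

Because every instruction takes its operands from the stack, instructions need at most one argument, which is why the instruction list stays short.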
CPython — PyObject & PyVarObject ● Universal headers, the Basis for all other Objects ● Every pointer to a CPython Object can be cast to a PyObject* — inheritance built by hand ● PyVarObject is just a PyObject extension to describe Objects with variable-sized part
PyObject — Structure

    typedef struct _object {
        _PyObject_HEAD_EXTRA
        Py_ssize_t ob_refcnt;
        struct _typeobject *ob_type;
    } PyObject;

PyVarObject — Structure

    typedef struct {
        PyObject ob_base;
        Py_ssize_t ob_size;
    } PyVarObject;
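The two headers can be inspected from inside a running interpreter with `ctypes`. This is a sketch under stated assumptions: a standard (non-debug, non-free-threaded) CPython build, where `_PyObject_HEAD_EXTRA` expands to nothing and `id(obj)` is the object's memory address. The class names `PyObjectHeader` and `PyVarObjectHeader` are made up for this example.

```python
import ctypes


class PyObjectHeader(ctypes.Structure):
    # Mirrors PyObject on a release build: reference count, then type pointer.
    _fields_ = [
        ("ob_refcnt", ctypes.c_ssize_t),
        ("ob_type", ctypes.c_void_p),
    ]


class PyVarObjectHeader(ctypes.Structure):
    # Mirrors PyVarObject: the PyObject header followed by the item count.
    _fields_ = [
        ("ob_base", PyObjectHeader),
        ("ob_size", ctypes.c_ssize_t),
    ]


payload = b"example"

# Assumption: in CPython, id() returns the address of the underlying struct.
header = PyVarObjectHeader.from_address(id(payload))

type_matches = header.ob_base.ob_type == id(bytes)
size_matches = header.ob_size == len(payload)
print(hex(header.ob_base.ob_type), header.ob_size, type_matches, size_matches)
```

The same header layout is what an external tool would expect to find when it reads the target process memory instead of its own.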
CPython — Types ● Every Object in CPython has its own Type, specified by the PyObject.ob_type field ● Types are Objects too: instances of the PyTypeObject struct ● Type Objects are a fundamental part of CPython, describing the functionality and behavior of Objects ● Some Examples: ○ PyUnicodeObject.ob_type → PyUnicode_Type ○ PyBytesObject.ob_type → PyBytes_Type ○ PyCodeObject.ob_type → PyCode_Type
CPython — The Type for Types ● Every Type Object is a PyObject, therefore it has the ob_type field: ○ PyUnicode_Type.ob_type → PyType_Type ○ PyBytes_Type.ob_type → PyType_Type ○ PyCode_Type.ob_type → PyType_Type ● But PyType_Type is a PyObject too: ○ PyType_Type.ob_type → PyType_Type
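The same chain is visible at the Python level, where `type(x)` follows `ob_type`. A short sketch; the variable names are illustrative.

```python
# type(obj) corresponds to following obj->ob_type.
code_type = type((lambda: None).__code__)   # PyCode_Type
first_level = [type("text"), type(b"raw"), code_type]

# Following ob_type once more from each Type Object lands on PyType_Type.
second_level = [type(t) for t in first_level]
all_are_type = all(t is type for t in second_level)

# PyType_Type points to itself, which terminates the chain.
self_referential = type(type) is type

print(first_level, all_are_type, self_referential)
```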
PyTypeObject — Structure

    typedef struct _typeobject {
        PyObject_VAR_HEAD
        const char *tp_name;
        Py_ssize_t tp_basicsize, tp_itemsize;
        ...
    } PyTypeObject;

A Short Guide to Objects — Subtotals ● We already know the structure of PyObject and PyTypeObject instances, which are comparatively low-level structures. ● There is still no place to inject CPython bytecode. ● Let’s check the Object that works with CPython bytecode itself — PyCodeObject.
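The first fields of a Type Object can be read back with `ctypes` and compared with what Python reports. A minimal sketch, assuming a standard 64-bit release build of CPython; `PyTypeObjectHead` is a hypothetical name and covers only the fields shown on the slide.

```python
import ctypes


class PyTypeObjectHead(ctypes.Structure):
    # PyObject_VAR_HEAD (three fields) followed by the first PyTypeObject members.
    _fields_ = [
        ("ob_refcnt", ctypes.c_ssize_t),
        ("ob_type", ctypes.c_void_p),
        ("ob_size", ctypes.c_ssize_t),
        ("tp_name", ctypes.c_char_p),
        ("tp_basicsize", ctypes.c_ssize_t),
        ("tp_itemsize", ctypes.c_ssize_t),
    ]


# Assumption: id(bytes) is the address of the PyBytes_Type struct.
head = PyTypeObjectHead.from_address(id(bytes))

print(head.tp_name, head.tp_basicsize, head.tp_itemsize)
```

The `tp_name` string is what makes a Type Object recognizable in a raw memory dump, which matters later when searching for `PyType_Type`.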
CPython — PyCodeObject ● PyCodeObject — PyObject extension to describe pieces of Static Code. ● PyCodeObject.ob_type → PyCode_Type ● PyCodeObject is NOT a run-time primitive, it stores only static information about bytecode: ○ PyUnicodeObject* co_name — specifies code name (e.g., function name, <stdin>) ○ PyBytesObject* co_code — opcode sequence ○ PyTupleObject* co_consts — constants used
PyCodeObject — Example

    >>> def hello():
    ...     print("Hello!")
    ...
    >>> hello.__code__.co_name
    'hello'
    >>> hello.__code__.co_code
    b't\x00d\x01\x83\x01\x01\x00d\x00S\x00'
    >>> hello.__code__.co_consts
    (None, 'Hello!')

PyCodeObject — Structure

    typedef struct {
        PyObject_HEAD
        ...
        PyObject *co_code;
        PyObject *co_filename;
        PyObject *co_name;
        ...
    } PyCodeObject;
PyCodeObject — Points of Interest ● Controlling PyCodeObject allows us to play with member fields and pointers like co_code and co_consts ● Changing the co_code (ob_type → PyBytes_Type) field gives us the ability to change existing bytecode or inject new bytecode ● Playing with the co_consts (ob_type → PyTuple_Type) field allows us to add data to our injection
CPython — PyBytesObject ● PyBytesObject — PyVarObject extension to describe byte sequences ● PyBytesObject.ob_type → PyBytes_Type ● Just a container for a byte sequence
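The effect of swapping `co_consts` can be previewed from inside the interpreter, without touching raw memory. This sketch assumes Python 3.8 or newer, where `CodeType.replace()` is available; the `greet` function and the strings are illustrative, and an external injector would have to rewrite the pointer in process memory instead.

```python
def greet():
    return "Hello!"


original = greet()

code = greet.__code__

# Build a new constants tuple with one value swapped; bytecode stays untouched.
new_consts = tuple("Injected!" if c == "Hello!" else c for c in code.co_consts)

# Attach a code object that differs only in co_consts.
greet.__code__ = code.replace(co_consts=new_consts)

patched = greet()
print(original, "->", patched)
```

The bytecode still says "load constant N and return it"; only the data behind index N changed, which is the same idea as adding data to an injection through `co_consts`.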
PyBytesObject — Structure

    typedef struct {
        PyObject_VAR_HEAD
        Py_hash_t ob_shash;
        char ob_sval[1];
    } PyBytesObject;
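The inline `ob_sval` buffer can be located by offset. A sketch, assuming CPython's convention that `bytes.__basicsize__` counts the header plus the single declared byte of `ob_sval[1]`, so the buffer starts one byte before that size; the variable names are illustrative.

```python
import ctypes

data = b"\x74\x00\x64\x01"

# Assumption: tp_basicsize of bytes is offsetof(PyBytesObject, ob_sval) + 1.
sval_offset = bytes.__basicsize__ - 1

# Read len(data) raw bytes starting at the ob_sval field of this object.
raw = ctypes.string_at(id(data) + sval_offset, len(data))

print(sval_offset, raw.hex())
```

Since the bytes live inside the object itself, patching `co_code` in place means overwriting memory at this offset from the `PyBytesObject` address.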
CPython — PyTupleObject ● PyTupleObject — PyVarObject extension to describe immutable arrays of object references ● PyTupleObject.ob_type → PyTuple_Type ● Just a container for object references
PyTupleObject — Structure

    typedef struct {
        PyObject_VAR_HEAD
        PyObject *ob_item[1];
    } PyTupleObject;

A Short Guide to Objects — Conclusion ● Gaining control over PyCodeObject and the corresponding low-level structures allows us to patch bytecode, inject values and do other things in the Virtual Memory ● The main question here: how can we find the necessary PyCodeObject?
Finding PyCodeObject The main approach: unraveling pointer chains ● with a known code name — targeted impact ○ Code name ⇨ PyUnicodeObject ⇨ PyCodeObject.co_name ● with a Symbol Table lookup — requires access to the Python binary ○ PyCode_Type (from Symbol Table) ⇨ PyCodeObject.ob_type ● with PyType_Type — potentially gives us access to all Objects ○ “type\x00” ⇨ PyType_Type ⇨ PyCode_Type ⇨ PyCodeObject.ob_type
PyCodeObject.co_code Patching ● When PyCodeObject is located, we can continue our work ● Let’s try to patch existing bytecode and see how we can use that
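The `ob_item` array holds plain pointers to the contained Objects. A sketch, assuming that `tuple.__basicsize__` equals the offset of `ob_item` (CPython excludes the one declared slot from the basic size); names are illustrative.

```python
import ctypes

consts = (None, "Hello!")

# Assumption: the pointer array starts right after the fixed-size part of the tuple.
items_offset = tuple.__basicsize__

# View the inline array as len(consts) pointer-sized slots.
slots = (ctypes.c_void_p * len(consts)).from_address(id(consts) + items_offset)
addresses = [slots[i] for i in range(len(consts))]

expected = [id(item) for item in consts]
print([hex(a) for a in addresses], addresses == expected)
```

Redirecting one of these slots to another Object is how a constant can be replaced without rebuilding the tuple.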
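The first chain (code name ⇨ PyUnicodeObject ⇨ PyCodeObject.co_name) can be rehearsed inside a single interpreter before attempting it against a foreign process. This is only an in-process sketch, assuming a standard release build where `ob_type` is the second pointer-sized word of every Object; a real injector would scan the target's memory rather than use `id()`.

```python
import ctypes
import types


def check_password(password):
    return password == "P@ssw0rd"


code = check_password.__code__
name_obj = code.co_name          # the PyUnicodeObject holding "check_password"

word = ctypes.sizeof(ctypes.c_void_p)
n_words = types.CodeType.__basicsize__ // word

# View the fixed-size part of the PyCodeObject as an array of pointer-sized words.
words = (ctypes.c_void_p * n_words).from_address(id(code))

# A candidate is a PyCodeObject only if its ob_type points to PyCode_Type.
type_ok = words[1] == id(types.CodeType)

# Offsets inside the struct that hold a pointer to the name string.
name_offsets = [i * word for i in range(n_words) if words[i] == id(name_obj)]

print(type_ok, name_offsets)
```

On newer CPython versions more than one offset may match, because the qualified name can be the same string Object as the plain name.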
PyCodeObject.co_code Patching — Example ● An old-school example — patching the “if-then-else” construction

    def check_password(password):
        if password == "P@ssw0rd":
            return True
        else:
            return False

    # bytecode: 7c00006401006b0200721000640200536403005364000053

PyCodeObject.co_code Patching — Example ● We overwrite the conditional jump at offsets 9 to 11 with three NOP instructions (opcode 0x09), so execution always falls through to "return True"

    # bytecode[9:12] = b"\x09\x09\x09"

PyCodeObject.co_code Patching — The Problem ● Patching bytecode while it is being executed can crash the application ● PyCodeObject is not a run-time primitive, so it has no flag showing whether the Object is currently executing ● But such a flag exists in PyFrameObject
CPython — PyFrameObject ● PyFrameObject — PyVarObject extension to describe the Call Stack Frame ● PyFrameObject.ob_type → PyFrame_Type ● PyFrameObject is a dynamic object created during Interpretation; it stores arguments during a function call and otherwise behaves like a traditional Stack Frame
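The patch can be checked offline on the raw bytes from the slide. The sketch below assumes the CPython 3.5 bytecode format, where opcodes of 90 and above carry a two-byte little-endian argument; the `decode` helper and the `NAMES` table are written for this example and cover only the opcodes that appear here.

```python
HAVE_ARGUMENT = 90  # CPython 3.5: opcodes >= 90 are followed by a 2-byte argument

# Only the opcodes used by check_password, with their CPython 3.5 values.
NAMES = {
    0x09: "NOP",
    0x53: "RETURN_VALUE",
    0x64: "LOAD_CONST",
    0x6B: "COMPARE_OP",
    0x72: "POP_JUMP_IF_FALSE",
    0x7C: "LOAD_FAST",
}


def decode(bytecode):
    """Return a list of (offset, opcode name, argument or None) tuples."""
    instructions = []
    offset = 0
    while offset < len(bytecode):
        opcode = bytecode[offset]
        if opcode >= HAVE_ARGUMENT:
            arg = int.from_bytes(bytecode[offset + 1:offset + 3], "little")
            instructions.append((offset, NAMES[opcode], arg))
            offset += 3
        else:
            instructions.append((offset, NAMES[opcode], None))
            offset += 1
    return instructions


original = bytes.fromhex("7c00006401006b0200721000640200536403005364000053")

# Overwrite the 3-byte POP_JUMP_IF_FALSE instruction with three 1-byte NOPs.
patched = bytearray(original)
patched[9:12] = b"\x09\x09\x09"

for row in decode(patched):
    print(row)
```

The patch keeps the total length unchanged, so no other jump target or offset in the function has to be fixed up.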
PyFrameObject — Structure

    typedef struct _frame {
        PyObject_VAR_HEAD
        struct _frame *f_back;        /* previous frame, or NULL */
        PyCodeObject *f_code;         /* code segment */
        ...
        PyObject *f_locals;           /* local symbol table (any mapping) */
        ...
        PyObject *f_localsplus[1];    /* locals+stack, dynamically sized */
    } PyFrameObject;
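The `f_back` and `f_code` links are also exposed at the Python level, which makes it possible to sketch the "is this code running right now?" check. This is an in-process illustration for the current thread only; the `is_executing` helper is hypothetical, and an external tool would walk the same pointers in the target's memory.

```python
import sys


def is_executing(code):
    """Return True if any frame on the current call stack runs the given code object."""
    frame = sys._getframe(1)  # start from the caller's frame
    while frame is not None:
        if frame.f_code is code:
            return True
        frame = frame.f_back  # follow the chain toward the outermost frame
    return False


def check_password(password):
    # While this function runs, a frame referencing its code object is on the stack.
    return is_executing(check_password.__code__)


inside = check_password("anything")
outside = is_executing(check_password.__code__)
print(inside, outside)
```

Patching is safest when no live frame references the code object being modified.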