HFL: Hybrid Fuzzing on the Linux Kernel
Kyungtae Kim*, Dae R. Jeong°, Chung Hwan Kim¶, Yeongjin Jang§, Insik Shin°, Byoungyoung Lee˥*
* Purdue University, ° KAIST, ¶ NEC Labs America, § Oregon State University, ˥ Seoul National University

Software Security Analysis
• Random fuzzing
  • Pros: Fast path exploration
  • Cons: Hard to pass strong branch conditions, e.g., if (i == 0xdeadbeef)
• Symbolic/concolic execution
  • Pros: Generates concrete inputs for strong branch conditions
  • Cons: State explosion

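The trade-off on this slide can be made concrete with a small sketch. The names `check_magic`, `random_fuzz`, and `solve_branch` are hypothetical and appear nowhere in HFL; `solve_branch` only stands in for what a constraint solver would return, it does not call one.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical target: the guarded path is reached for exactly one
 * 32-bit value, so a uniformly random guess succeeds with
 * probability 2^-32. */
int check_magic(uint32_t i)
{
    if (i == 0xdeadbeefu)
        return 1;   /* path behind the strong branch */
    return 0;       /* shallow path */
}

/* Naive random fuzzing: counts how many of `tries` random inputs
 * get past the branch. */
unsigned random_fuzz(unsigned tries, unsigned seed)
{
    unsigned hits = 0;
    srand(seed);
    for (unsigned n = 0; n < tries; n++) {
        /* rand() guarantees only 15 random bits, so mix three calls. */
        uint32_t input = ((uint32_t)rand() << 30) ^
                         ((uint32_t)rand() << 15) ^
                         (uint32_t)rand();
        hits += (unsigned)check_magic(input);
    }
    return hits;
}

/* Stand-in for concolic execution: the branch constraint
 * i == 0xdeadbeef has a single solution, which a solver would emit. */
uint32_t solve_branch(void)
{
    return 0xdeadbeefu;
}
```

Calling `check_magic(solve_branch())` takes the guarded path immediately, whereas `random_fuzz` with any realistic budget is expected to report no hits.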
Hybrid Fuzzing in General
• Combines traditional fuzzing and concolic execution
  • Fast exploration with fuzzing (no state explosion)
  • Strong branches are handled with concolic execution
• State of the art
  • Intriguer [CCS’19], DigFuzz [NDSS’19], QSYM [Sec’18], etc.
  • All are application-level hybrid fuzzers

Kernel Testing with Hybrid Fuzzing
• Software vulnerabilities are critical threats to OS kernels
  • 1,018 Linux kernel vulnerabilities reported in CVE over the last 4 years
• Hybrid fuzzing can help improve coverage and find more bugs in kernels
  • Kernels contain a huge number of specific branches (see, e.g., CAB-Fuzz [ATC’17], DIFUZE [CCS’17])
Q. Is hybrid fuzzing good enough for kernel testing?

Challenge 1: Indirect Control Transfer

<function pointer table>
    ioctl_fn _ioctls[] = {
        ioctl_version,        /* targets to be hit */
        ioctl_protover,
        ...
        ioctl_ismountpoint,
    };

<indirect function call>
    idx = cmd - INFO_FIRST;   /* derived from syscall arguments */
    ...
    funp = _ioctls[idx];
    ...
    funp(sbi, param);         /* indirect control transfer */

Q. Can this be fuzzed enough to explore all functions?

Challenge 2: System Call Dependencies
Explicit syscall dependencies (the fd returned by open is passed to later calls):
    int open(const char *pathname, int flags, mode_t mode)
    ssize_t write(int fd, void *buf, size_t count)
    ioctl(int fd, unsigned long req, void *argp)
    ioctl(int fd, unsigned long req, void *argp)
Q. What dependency lies behind the two ioctl calls?

Example: System Call Dependencies

    fd = open(…)
    ioctl(fd, D_ALLOC, arg1)    /* first ioctl: ❶ Write */
    ioctl(fd, D_BIND, arg2)     /* second ioctl: ❸ Read */

    struct d_alloc { s32 x; s32 y; s32 ID; };
    struct d_bind  { s32 ID; };

    ioctl(fd, cmd, arg):
        …
        switch (cmd) {
        case D_ALLOC: d_alloc(arg);
        case D_BIND:  d_bind(arg);

    d_alloc(struct d_alloc *arg):
        …
        arg->ID = g_var;            /* ❷ copy_to_user */

    d_bind(struct d_bind *arg):
        …
        if (g_var != arg->ID)       /* ❹ Check ID with g_var */
            return -EINVAL;
        /* main functionality */
        ...

Q. Can this dependency be inferred exactly?

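The write-then-read dependency in this example can be reproduced in user space with a toy model. This is only a sketch: `toy_ioctl`, the command numbers, and the constant ID value are hypothetical, and plain pointer access replaces copy_to_user/copy_from_user.

```c
#include <errno.h>
#include <stdint.h>

struct d_alloc { int32_t x; int32_t y; int32_t ID; };
struct d_bind  { int32_t ID; };

enum { D_ALLOC = 1, D_BIND = 2 };   /* hypothetical command numbers */

static int32_t g_var;               /* state shared by both handlers */

static int d_alloc(struct d_alloc *arg)
{
    g_var = 0x1234;                 /* hypothetical ID generation */
    arg->ID = g_var;                /* write: ID is handed back to the caller */
    return 0;
}

static int d_bind(const struct d_bind *arg)
{
    if (g_var != arg->ID)           /* read: ID must match the earlier write */
        return -EINVAL;
    return 0;                       /* main functionality would run here */
}

/* Toy stand-in for the ioctl entry point on the slide. */
int toy_ioctl(int cmd, void *arg)
{
    switch (cmd) {
    case D_ALLOC: return d_alloc(arg);
    case D_BIND:  return d_bind(arg);
    default:      return -ENOTTY;
    }
}
```

A D_BIND call with an arbitrary ID fails with -EINVAL; it succeeds only when the ID field is filled with the value that an earlier D_ALLOC call wrote into its own argument. A fuzzer that does not know this ordering and data flow rarely gets past the check.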
Challenge 3: Complex Argument Structure
    ioctl(int fd, unsigned long cmd, void *argp)    /* argp: unknown type */
    write(int fd, void *buf, size_t count)          /* buf: unknown type */

Example: Nested Argument Structure

    ioctl(fd, USB_X, arg)

    struct usbdev_ctrl {
        void *data;
        unsigned len;
    };

    struct usbdev_ctrl ctrl;
    uchar *tbuf;
    …
    copy_from_user(&ctrl, arg, sizeof(ctrl))
    …
    copy_from_user(tbuf, ctrl.data, ctrl.len)
    /* do main functionality */
    …

Memory view: the syscall argument arg points to a struct usbdev_ctrl (fields data and len); data is the source address of a second buffer of arg.len bytes, which is copied to the destination address tbuf.
Q. Can this layout be inferred exactly?

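The two-level copy can be sketched in user space as follows. This is an illustration under assumptions: `copy_from_user_model` is a memcpy-based stand-in for the kernel's copy_from_user, and `usb_x_handler` with its bounds check is a hypothetical handler, not code from any driver.

```c
#include <stddef.h>
#include <string.h>

struct usbdev_ctrl {
    void *data;      /* pointer to a second, nested buffer */
    unsigned len;    /* length of that buffer in bytes */
};

/* Stand-in for copy_from_user(): "user" and "kernel" memory share one
 * address space here. Returns 0 on success, like the kernel helper. */
static unsigned long copy_from_user_model(void *dst, const void *src,
                                          unsigned long n)
{
    memcpy(dst, src, n);
    return 0;
}

/* Hypothetical handler: returns the number of payload bytes copied,
 * or -1 if the argument cannot be used. */
int usb_x_handler(const void *arg, unsigned char *tbuf, unsigned tbuf_size)
{
    struct usbdev_ctrl ctrl;

    /* First copy: the top-level structure the syscall argument points to. */
    if (copy_from_user_model(&ctrl, arg, sizeof(ctrl)))
        return -1;
    if (ctrl.data == NULL || ctrl.len > tbuf_size)
        return -1;

    /* Second copy: the nested buffer named by ctrl.data and ctrl.len. */
    if (copy_from_user_model(tbuf, ctrl.data, ctrl.len))
        return -1;
    return (int)ctrl.len;
}
```

The handler does useful work only if `arg` points to a correctly laid out structure whose `data` field itself points to `len` valid bytes, which is exactly the information a fuzzer lacks when the argument type is `void *`.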
HFL: Hybrid Fuzzing on the Linux Kernel
• The first hybrid kernel fuzzer
• Coverage-guided system call fuzzer
• Hybrid fuzzing
  • Combines a fuzzer and a symbolic analyzer
  • An agent acts as the glue between the two components
• Handling the challenges
  1. Indirect control transfer
     • Convert to direct control-flow
  2. System call dependencies
     • Infer system call dependencies
  3. Complex argument structure
     • Infer nested argument structure
[Figure: HFL architecture with Fuzzer, Agent, Symbolic Analyzer, static analysis, and the converted Linux kernel (Linux Kernel*). Edge labels: unsolved conds, solved inputs, on-demand exec, feedback, infer argument retrieval, calling orders, candidate dependency pairs, convert.]

1. Conversion to Direct Control-flow
Compile-time conversion to direct control transfer.

<Before>
    idx = cmd - INFO_FIRST;
    ...
    funp = _ioctls[idx];
    ...
    funp(sbi, param);

<After>
    idx = cmd - INFO_FIRST;
    ...
    funp = _ioctls[idx];
    ...
    if (cmd == IOCTL_VERSION)
        ioctl_version(sbi, param);
    else if (cmd == IOCTL_PROTO)
        ioctl_protover(sbi, param);
    …
        ioctl_ismountpoint(sbi, param);

<function pointer table>
    ioctl_fn _ioctls[] = {
        ioctl_version,
        ioctl_protover,
        ...
        ioctl_ismountpoint,
    };

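A self-contained sketch of the before and after forms is given below. The command values, the stub handlers, the table name `ioctl_table`, and the bounds check are assumptions made so the example compiles on its own; the slide's `IOCTL_*` names are reused, and `IOCTL_ISMOUNTPOINT` is a guessed name for the last command.

```c
#include <stddef.h>

/* Hypothetical command numbers. */
enum {
    INFO_FIRST = 0x10,
    IOCTL_VERSION = INFO_FIRST,
    IOCTL_PROTO,
    IOCTL_ISMOUNTPOINT
};

typedef int (*ioctl_fn)(void *sbi, void *param);

/* Stub handlers that return distinct values so calls can be told apart. */
static int ioctl_version(void *sbi, void *param)      { (void)sbi; (void)param; return 1; }
static int ioctl_protover(void *sbi, void *param)     { (void)sbi; (void)param; return 2; }
static int ioctl_ismountpoint(void *sbi, void *param) { (void)sbi; (void)param; return 3; }

static const ioctl_fn ioctl_table[] = {
    ioctl_version, ioctl_protover, ioctl_ismountpoint,
};

/* Before: the target is selected through a function pointer table. */
int dispatch_indirect(unsigned cmd, void *sbi, void *param)
{
    unsigned idx = cmd - INFO_FIRST;
    if (idx >= sizeof(ioctl_table) / sizeof(ioctl_table[0]))
        return -1;
    return ioctl_table[idx](sbi, param);
}

/* After: each target sits behind an explicit comparison on cmd, so
 * every handler corresponds to a branch condition that can be solved. */
int dispatch_direct(unsigned cmd, void *sbi, void *param)
{
    if (cmd == IOCTL_VERSION)
        return ioctl_version(sbi, param);
    else if (cmd == IOCTL_PROTO)
        return ioctl_protover(sbi, param);
    else if (cmd == IOCTL_ISMOUNTPOINT)
        return ioctl_ismountpoint(sbi, param);
    return -1;
}
```

Both functions behave identically for every `cmd`. The difference is only in how the choice is expressed: a table lookup hides the relation between `cmd` and the callee, while the comparison chain exposes one solvable condition per handler.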
2. Syscall Dependency Inference

❶ Collecting W-R pairs (static analysis of the Linux kernel)
    <instruction dependency pair>
        W: g_var =    (in d_alloc)
        R: g_var      (in d_bind)
❷ Runtime validation
    • Hit: both instructions of the pair are executed
    • Address checking: if the write and the read touch the same address, the pair is a true dependency
❸ Parameter dependency
    • Symbolic checking on the symbolized (symbolically tainted) arguments
    • Offsets: W at offset 0x8 of {struct d_alloc}, R at offset 0x0 of {struct d_bind}

    d_alloc(struct d_alloc *arg):
        g_var = gen();
        arg->ID = g_var;          /* symbolically tainted write */

    d_bind(struct d_bind *arg):
        ...
        if (g_var == arg->ID)     /* symbolically tainted read */
        ...

Input syscalls:
    fd = open(…)
    ioctl(fd, D_ALLOC, {struct d_alloc})
    ioctl(fd, D_BIND, {struct d_bind})

Inferred syscall sequence:
    prio1: ioctl(fd, D_ALLOC, {*_1})
    prio2: ioctl(fd, D_BIND, {*_2})
    struct _1 { u64 x; u32 ID; }
    struct _2 { u32 ID; u64 x; }

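The validation steps can be pictured as a check over two observed accesses. The sketch below is an assumption-laden simplification, not HFL's implementation: `struct access`, `struct dependency`, and `validate_pair` are hypothetical names, and a boolean `tainted` flag stands in for the symbolic check.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One observed access at a candidate instruction (hypothetical record). */
struct access {
    bool      hit;         /* was the instruction executed? */
    uintptr_t addr;        /* address that was written or read */
    bool      tainted;     /* does the value flow to/from a syscall argument? */
    size_t    arg_offset;  /* offset inside that syscall argument */
};

/* Result: whether the pair is a real dependency and, if so, which
 * argument offsets carry the value between the two syscalls. */
struct dependency {
    bool   valid;
    size_t write_offset;
    size_t read_offset;
};

struct dependency validate_pair(const struct access *w, const struct access *r)
{
    struct dependency d = { false, 0, 0 };

    if (!w->hit || !r->hit)          /* both instructions must be reached */
        return d;
    if (w->addr != r->addr)          /* they must touch the same address */
        return d;
    if (!w->tainted || !r->tainted)  /* both must involve syscall arguments */
        return d;

    d.valid = true;
    d.write_offset = w->arg_offset;
    d.read_offset = r->arg_offset;
    return d;
}
```

With the slide's numbers, a write observed at argument offset 0x8 and a read at offset 0x0 on the same address would yield a valid dependency, telling the fuzzer to call D_ALLOC first and to copy the value at offset 0x8 of its argument into offset 0x0 of the D_BIND argument.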
3. Nested Argument Format Retrieval

    ioctl(fd, USB_X, arg)       /* arg: symbolically tainted */

    struct usbdev_ctrl ctrl;
    uchar *tbuf;
    …
    copy_from_user(&ctrl, arg, sizeof(ctrl));      /* ❶ hit */
    …
    copy_from_user(tbuf, ctrl.data, ctrl.len);     /* ❷ hit, symbolic check */
    …

Memory view:
    upper buffer (ctrl): size 0x14, nested pointer (data) at offset 0x8
    lower buffer (ctrl.data): size 0x10

Inferred syscall interface:
    ioctl(fd, USB_X, {*_1})
    struct _1: u64 x; {*_2} y; u64 z;
    struct _2: u64 x; u64 y;

Implementation
❶ Fuzzer: Syzkaller
❷ Symbolic analyzer: S2E
   - constraint solving
   - symbolic checking
❸ GCC
   - convert to direct control-flow
❹ SVF/LLVMLINUX (static analysis)
   - collect dependency set
❺ Agent: Python-based
   - transfer data
   - send unsolved conds
   - process solved conditions

Vulnerability Discovery
• Discovered new vulnerabilities
  • 24 new vulnerabilities found in the Linux kernel
  • 17 confirmed by the Linux kernel community
  • UAF, integer overflow, uninitialized variable access, etc.
• Efficiency of bug finding
  • 13 known bugs used to compare HFL and Syzkaller
  • HFL found all of them, 3x faster than Syzkaller

Code Coverage Enhancement
• Compared with state-of-the-art kernel fuzzers
  • Moonshine [Sec’18], kAFL [Sec’17], etc.
  • KCOV-based coverage measurement
• HFL improves coverage over the others
  • Improvement ranges from 15% to 4x
[Figure: coverage comparison of HFL, kAFL, S2E, Syzkaller, TriforceAFL, and Moonshine]

Conclusion
• HFL is the first hybrid kernel fuzzer.
• HFL addresses the crucial challenges of hybrid fuzzing on the Linux kernel.
• HFL found 24 new vulnerabilities and achieved better code coverage than state-of-the-art kernel fuzzers.

Thank you