Making C Less Dangerous in the Linux Kernel | Kees Cook | @keescook | LINUX.CONF.AU, 21-25 January 2019, Christchurch, NZ | The Linux of Things | #LCA2019 | @linuxconfau
Making C Less Dangerous in the Linux Kernel Linux Conf AU January 25, 2019 Christchurch, New Zealand Kees (“Case”) Cook keescook@chromium.org @kees_cook https://outflux.net/slides/2019/lca/danger.pdf
Agenda
● Background
– Kernel Self Protection Project
– C as a fancy assembler
● Towards less dangerous C
– Variable Length Arrays are bad and slow
– Explicit switch case fall-through
– Always-initialized automatic variables
– Arithmetic overflow detection
– Hope for bounds checking
– Control Flow Integrity: forward edges
– Control Flow Integrity: backward edges
– Where are we now?
– How you can help
@Rob_Russell
Kernel Self Protection Project
https://kernsec.org/wiki/index.php/Kernel_Self_Protection_Project
● KSPP focuses on the kernel protecting the kernel from attack (e.g. refcount overflow) rather than the kernel protecting userspace from attack (e.g. execve brute force detection), but any area of related development is welcome
● Currently ~12 organizations and ~10 individuals working on ~20 technologies
● Slow and steady
C as a fancy assembler: almost machine code
● The kernel wants to be as fast and small as possible
● At the core, the kernel wants to do very architecture-specific things for memory management, interrupt handling, scheduling, ...
● No C API for setting up page tables, switching to 64-bit mode …
C as a fancy assembler: undefined behavior
● The C language comes with some operational baggage, and weak “standard” libraries
– What are the contents of “uninitialized” variables?
  ● … whatever was in memory before!
– void pointers have no type, yet we can call typed functions through them?
  ● … assembly doesn’t care: everything can be an address to call!
– Why does memcpy() have no “max destination length” argument?
  ● … just do what I say; memory areas are all the same!
● “With undefined behavior, anything is possible!”
– https://raphlinus.github.io/programming/rust/2018/08/17/undefined-behavior.html
Variable Length Arrays (and alloca()) are bad
● Exhaust stack, linear overflow: write to things following it
● Jump over guard pages and write to things following it
● Easy to find with compiler flag: -Wvla
● But if you must (in userspace) please use gcc’s stack probing feature: -fstack-clash-protection
[Figure: two stack diagrams, each with size = 8192. Left: char buf[size]; followed by strcpy(buf, src, size); writes linearly from stack 1 into stack 2. Right: u8 array[size]; followed by array[big] = foo; jumps over the guard page between stack 1 and stack 2.]
Variable Length Arrays are slow
● This seems conceptually sound: more instructions to change stack size, but it seems like it would be hard to measure.
● It is quite measurable … 13% speed up measured during lib/bch.c VLA removal:
https://git.kernel.org/linus/02361bc77888 (Ivan Djelic)

Buffer allocation | Encoding throughput (Mbit/s)
------------------|-----------------------------
on-stack, VLA     | 3988
on-stack, fixed   | 4494
kmalloc           | 1967
Variable Length Arrays: stop it
[Figure: side-by-side comparison of a fixed-size array and a variable length array]
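A minimal userspace sketch of the usual VLA replacement: a fixed-size array plus an explicit bounds check. The names checksum_name() and MAX_NAME_LEN are hypothetical and the bound of 64 is an assumption made only for this example; build with -Wvla to confirm no VLA remains.

```c
#include <stddef.h>
#include <string.h>

#define MAX_NAME_LEN 64 /* hypothetical upper bound chosen for this example */

/* Sums the bytes of name[0..len). Returns 0 on success, -1 if len is too large. */
int checksum_name(const char *name, size_t len, unsigned int *sum)
{
	/*
	 * A VLA would be "unsigned char buf[len];", sized by the caller.
	 * A fixed-size array keeps the stack frame constant; the explicit
	 * check below rejects anything that does not fit.
	 */
	unsigned char buf[MAX_NAME_LEN];
	size_t i;

	if (len > sizeof(buf))
		return -1;

	memcpy(buf, name, len);
	*sum = 0;
	for (i = 0; i < len; i++)
		*sum += buf[i];
	return 0;
}
```

The size check happens before any copy, so an oversized request never touches the stack buffer.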
Switch case fall-through: did I mean it?
● CWE-484 “Omitted Break Statement in Switch”
● Semantic weakness in C (“switch” is just assembly test/jump...)
● Commit logs with “missing break statement”: 67
Did they mean to leave out “break;” ??
Switch case fall-through: new “statement”
● Use -Wimplicit-fallthrough to add a new switch “statement”
– Actually a comment, but is parsed by compilers now, following the lead of static checkers
● Mark all non-breaks with a “fall through” comment, for example https://git.kernel.org/linus/4597b62f7a60 (Gustavo A. R. Silva)
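A small sketch of the marking style described above. The function cleanup_steps() is hypothetical. The assumption here is a gcc build with -Wimplicit-fallthrough, which accepts the comment form; other compilers may want an attribute instead.

```c
/* Hypothetical example: count how many cleanup steps run for a given level. */
int cleanup_steps(int level)
{
	int steps = 0;

	switch (level) {
	case 2:
		steps++;
		/* fall through */
	case 1:
		steps++;
		/* fall through */
	case 0:
		steps++;
		break;
	default:
		/* Unknown level: report an error instead of guessing. */
		steps = -1;
		break;
	}
	return steps;
}
```

Every case now ends in either break or an explicit marker, so a missing break stands out to both the compiler and the reviewer.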
Always-initialized local variables: just do it
● CWE-200 “Information Exposure”, CWE-457 “Use of Uninitialized Variable”
● gcc -finit-local-vars not upstream
● Clang -fsanitize=init-local not upstream
● CONFIG_GCC_PLUGIN_...
– STRUCTLEAK (for structs with __user pointers)
– STRUCTLEAK_BYREF (when passed into funcs)
– Soon, plugin to mimic -finit-local-vars too
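Without compiler support, the manual equivalent is to clear an object before filling it. The sketch below is a userspace illustration only; struct reply and fill_reply() are hypothetical names, not kernel code.

```c
#include <string.h>

/* Hypothetical reply structure used only for this example. */
struct reply {
	char tag;
	int id;
	char name[16];
};

/* Fill a reply that will later be copied out to a less-trusted reader. */
void fill_reply(struct reply *r, int id, const char *name)
{
	/*
	 * Clear the whole object first, so unused bytes of name[] (and, in
	 * practice, any padding) do not carry stale memory contents.
	 */
	memset(r, 0, sizeof(*r));

	r->tag = 'R';
	r->id = id;
	/* Copy at most 15 bytes; the final byte stays NUL from the memset. */
	strncpy(r->name, name, sizeof(r->name) - 1);
}
```

An always-initialize compiler option would make the memset() implicit for every local.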
Always-initialized local variables: switch gotcha
warning: statement will never be executed [-Wswitch-unreachable]
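A sketch of what that warning is about, assuming the usual cause: a declaration with an initializer placed between "switch (...) {" and the first case label. The variable is in scope for every case, but control jumps straight to a label, so the initializer never runs. The function scale() is hypothetical and shows the hoisted form.

```c
/* Hypothetical example: the declaration is hoisted out of the switch body. */
int scale(int kind, int value)
{
	int factor = 1; /* initialized on every path */

	switch (kind) {
		/*
		 * Writing "int factor = 1;" here instead would compile, but the
		 * initializer would be skipped by the jump to a case label,
		 * leaving factor uninitialized.
		 */
	case 1:
		factor = 2;
		break;
	case 2:
		factor = 4;
		break;
	}
	return value * factor;
}
```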
Arithmetic overflow detection: gcc?
● gcc’s -fsanitize=signed-integer-overflow (CONFIG_UBSAN)
– Only signed. Fast: in the noise. Big: warnings grow kernel image by 6% (aborts grow it by 0.1%)
● But we can use explicit single-operation helpers. To quote Rasmus Villemoes:
Arithmetic overflow detection: Clang :)
● Clang can do signed and unsigned instrumentation:
-fsanitize=signed-integer-overflow
-fsanitize=unsigned-integer-overflow
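A userspace sketch of an explicit single-operation helper, built on the __builtin_mul_overflow() builtin that both gcc and Clang provide. The names mul_overflows() and alloc_array() are hypothetical; the kernel's own helpers live in include/linux/overflow.h and are not reproduced here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: stores a * b in *result, returns true on overflow. */
static bool mul_overflows(size_t a, size_t b, size_t *result)
{
	return __builtin_mul_overflow(a, b, result);
}

/* Hypothetical allocator: refuses sizes that would wrap around. */
void *alloc_array(size_t count, size_t elem_size)
{
	size_t bytes;

	if (mul_overflows(count, elem_size, &bytes))
		return NULL;
	return malloc(bytes);
}
```

The check costs one operation at the call site, which is why it suits places where whole-kernel instrumentation is too large.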
Bounds checking: explicit checking is slow :(
● Explicit checks for linear overflows of SLAB objects, stack, etc
– copy_{to,from}_user() checking: <~1% performance hit
– strcpy()-family checking: ~2% performance hit
– memcpy()-family checking: ~1% performance hit
● Can we get better APIs?
– strcpy() is terrible
– sprintf() is bad
– memcpy() is weak
Instead of strcpy(): strscpy()
● strcpy() no bounds checking on destination nor source!
● strncpy() doesn’t always NUL terminate (good for non-C-strings, does NUL pad destination)
    char dest[4];
    strncpy(dest, "ohai!", sizeof(dest)); /* unhelpfully returns dest */
    dest: “o”, “h”, “a”, “i” … no trailing NUL byte :(
● strlcpy() reads source beyond max destination size (returns length of source!)
● strscpy() safest (returns bytes copied, not including NUL, or -E2BIG)
    ssize_t count = strscpy(dest, "ohai!", sizeof(dest)); /* returns -E2BIG */
    dest: “o”, “h”, “a”, NUL
– Does not NUL-pad destination … if desired, add explicit memset() (kernel needs a helper for this...)
    if (count > 0 && count + 1 < sizeof(dest))
        memset(dest + count + 1, 0, sizeof(dest) - count - 1);
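To make the return-value contract above concrete, here is a userspace approximation. sketch_strscpy() is a hypothetical name and this is not the kernel implementation; it only mirrors the behavior listed on the slide (bytes copied without the NUL, or -E2BIG on truncation).

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h> /* ssize_t */

/* Hypothetical userspace stand-in for the behavior described above. */
ssize_t sketch_strscpy(char *dest, const char *src, size_t size)
{
	const char *nul;
	size_t len;

	if (size == 0)
		return -E2BIG;

	/* Look for the terminator within the first size bytes of src only. */
	nul = memchr(src, '\0', size);
	if (nul == NULL) {
		/* Source does not fit: copy what fits and always terminate. */
		memcpy(dest, src, size - 1);
		dest[size - 1] = '\0';
		return -E2BIG;
	}

	len = (size_t)(nul - src);
	memcpy(dest, src, len + 1); /* includes the NUL */
	return (ssize_t)len;
}
```

Unlike strlcpy(), the source is never scanned beyond the destination size, and truncation is reported as an error the caller cannot mistake for a length.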
Instead of sprintf(): scnprintf()
● sprintf() no bounds checking on destination!
● snprintf() always NUL-terminates, but returns how much it WOULD have written :(
    int count = snprintf(buf, sizeof(buf), fmt…, …);
    for (i = 0; i < something; i++)
        count += snprintf(buf + count, sizeof(buf) - count, fmt…, …);
    copy_to_user(user, buf, count);
● scnprintf() always NUL-terminates, returns count of bytes copied
– Replace in above code!
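The difference in return values can be sketched in userspace by wrapping vsnprintf(). sketch_scnprintf() is a hypothetical name, not the kernel function; it clamps the result to what actually landed in the buffer, so accumulating counts can never run past the end.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical userspace stand-in: returns bytes actually written, excluding NUL. */
int sketch_scnprintf(char *buf, size_t size, const char *fmt, ...)
{
	va_list args;
	int would_write;

	if (size == 0)
		return 0;

	va_start(args, fmt);
	would_write = vsnprintf(buf, size, fmt, args);
	va_end(args);

	if (would_write < 0) {
		/* Formatting error: leave an empty string behind. */
		buf[0] = '\0';
		return 0;
	}
	if ((size_t)would_write >= size)
		return (int)(size - 1); /* output was truncated */
	return would_write;
}
```

With this return value, "count += ..." in a loop stays at or below sizeof(buf) - 1, so the later copy uses a length that matches the buffer contents.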
Instead of memcpy(): uhhh … be … careful?
● memcpy() has no sense of destination size :(
    uint8_t bytes[128];
    size_t wanted, copied = 0;
    for (i = 0; i < something && copied < sizeof(bytes); i++) {
        wanted = ...;
        if (wanted > sizeof(bytes) - copied)
            wanted = sizeof(bytes) - copied;
        memcpy(bytes + copied, source, wanted);
        copied += wanted;
    }
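The clamping pattern above can be wrapped in a small helper so the destination size travels with the copy. append_clamped() is a hypothetical name for illustration, not an existing kernel or libc API.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: append up to wanted bytes of src at dest + used,
 * never writing past dest_size. Returns the number of bytes copied.
 */
size_t append_clamped(uint8_t *dest, size_t dest_size, size_t used,
		      const void *src, size_t wanted)
{
	if (used >= dest_size)
		return 0; /* buffer already full */

	if (wanted > dest_size - used)
		wanted = dest_size - used; /* clamp to the remaining space */

	memcpy(dest + used, src, wanted);
	return wanted;
}
```

A caller keeps a running total with "copied += append_clamped(bytes, sizeof(bytes), copied, source, wanted);" and can stop once it returns 0.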