Step 3: replay non-attack inputs
[Figure: a virtual machine's inputs and outputs along a time axis; the attack input is crossed out (X).]
Problems with VM replay
● VM replay is expensive
– Repairing a week-old attack takes a week of replay
● Past inputs are meaningless to the new system
– Non-determinism: new SSH crypto keys ...
– Deterministic replay won't work
Retro's approach: Action history graph
● Represent fine-grained history
– Includes kernel objects, system calls, function calls, …
– Assume tamper-proof kernel, storage
● Roll back objects directly affected by the attack
– Avoid the false positives of taint tracking
● Selectively re-execute indirectly affected actions
– Avoid the expensive VM replay
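A minimal sketch of how such a graph could be represented, to make the objects, actions, and dependencies concrete. Every name here (`Obj`, `Action`, `ActionHistoryGraph`, `affected_by`) is hypothetical and chosen for illustration; this is not Retro's actual data structure or API.

```python
from dataclasses import dataclass, field


@dataclass
class Obj:
    """A node such as a file or a process, with checkpoints of its state."""
    name: str
    checkpoints: dict = field(default_factory=dict)  # time -> saved state


@dataclass
class Action:
    """One recorded execution step (e.g. a system call) at a given time."""
    time: int
    actor: str
    name: str
    reads: tuple = ()   # objects this action depends on
    writes: tuple = ()  # objects this action modifies


class ActionHistoryGraph:
    def __init__(self):
        self.objects = {}
        self.actions = []

    def record(self, action):
        # Register every object the action touches, then log the action.
        for n in (action.actor, *action.reads, *action.writes):
            self.objects.setdefault(n, Obj(n))
        self.actions.append(action)
        self.actions.sort(key=lambda a: a.time)

    def affected_by(self, obj_name, since):
        """Actions after time `since` that read the named object."""
        return [a for a in self.actions
                if a.time > since and obj_name in a.reads]


g = ActionHistoryGraph()
g.record(Action(1, "attacker", "write", writes=("passwd",)))
g.record(Action(2, "admin_shell", "exec", writes=("adduser",)))
g.record(Action(3, "adduser", "read", reads=("passwd",)))
g.record(Action(4, "adduser", "write", writes=("passwd",)))
g.record(Action(5, "adduser", "exit", writes=("admin_shell",)))
```

Here `g.affected_by("passwd", since=1)` returns only adduser's read, which is the action that would need to be considered for re-execution once the attacker's write is undone.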
Action history graph: Objects represent files, processes
[Figure: objects laid out along a time axis: the attacker's process, the password file, the adduser Alice process, and the admin's shell.]
Action history graph: Actions represent execution (syscall)
[Figure: same objects, with actions shown on the timeline.]
Action history graph: Actions have dependencies
[Figure: same objects, with labeled actions added step by step: write(offset, data), exec(prog, args, ..), read(offset, data), write(offset, data), exit(status).]
Action history graph: Objects have checkpoints
[Figure: same graph, with checkpoints on the objects.]
Step 1: find attack action
[Figure: the action history graph (attacker's process, password file, adduser Alice, admin's shell).]
Step 2: roll back affected objects
[Figure: same graph.]
Step 3: skip attack action
[Figure: same graph; the attacker's write(offset, data) is crossed out (X).]
Step 4: redo non-attack actions
[Figure: same graph.]
Repeat step 2: roll back objects
[Figure: same graph.]
Repeat step 4: redo actions
[Figure: same graph; the attacker's write(offset, data) is crossed out (X).]
● Key advantage over VM replay: re-run only adduser, not the entire VM.
● Key advantage over taint tracking: attacker removed, Alice's account preserved.
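The rollback, skip, and redo steps can be sketched as a small loop. This is a toy model under stated assumptions: the password file is a Python list, each logged action is a function that mutates it, and `repair`, `attacker_write`, and `adduser_alice` are hypothetical names. In this toy every logged action touches the rolled-back file, so all non-attack actions are redone; a real system would redo only the affected ones.

```python
def repair(checkpoint, log, attack_index):
    """Roll back to the checkpoint, then redo every action except the attack."""
    state = list(checkpoint)      # roll back: restore the pre-attack state
    for i, action in enumerate(log):
        if i == attack_index:
            continue              # skip the attack action
        action(state)             # redo a legitimate action
    return state


def attacker_write(passwd):
    passwd.append("attacker")


def adduser_alice(passwd):
    passwd.append("alice")


checkpoint = ["root"]                  # state saved before the attack
log = [attacker_write, adduser_alice]  # recorded history, in time order
repaired = repair(checkpoint, log, attack_index=0)
```

After the repair, the attacker's account is gone while Alice's account, created after the attack, is preserved.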
Challenge: how to avoid re-executing everything?
[Figure: same graph; the attacker's write(offset, data) is crossed out (X).]
● Exit status affects shell, which affects sshd, and so on…
● Naïve process-level re-execution still re-executes the entire system!
Observation: Admin's shell was not affected
● “adduser alice” succeeds as before
– This is what Admin wanted to do
– If it failed, need to re-execute Admin's shell
Example 1: exit status to shell unchanged
[Figure: same graph; the attacker's write(offset, data) is crossed out (X).]
Predicates: avoid equivalent re-execution
[Figure: same graph.]
● Check: does adduser succeed as before?
● If so, skip the re-run of admin's shell
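A rough sketch of the predicate idea: compare what a re-executed action hands to its dependents with what was recorded, and only propagate re-execution when they differ. The toy `adduser` and the helper `needs_rerun` are assumptions made for illustration, not part of Retro.

```python
def needs_rerun(recorded_output, new_output):
    """Predicate: dependents must be re-run only if the output changed."""
    return recorded_output != new_output


def adduser(passwd, name):
    """Toy adduser: returns exit status 0 on success, 1 if the user exists."""
    if name in passwd:
        return 1
    passwd.append(name)
    return 0


# Original run: the password file already contained the attacker's entry.
recorded_status = adduser(["root", "attacker"], "alice")

# Re-execution on the rolled-back file, without the attacker's entry.
repaired = ["root"]
new_status = adduser(repaired, "alice")

rerun_shell = needs_rerun(recorded_status, new_status)
```

Both runs exit with the same status, so the admin's shell sees nothing new and is not re-executed.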
Example 2: user's password unchanged
[Figure: the attacker's process, the password file, and Alice's SSHD along a time axis; the attacker's write(offset, data) is crossed out (X), and SSHD does read(offset, data) on the password file.]
Observation: Alice's SSHD was not affected
● Alice's SSHD checked only Alice's account
– This is what Alice's SSHD wanted to do
– If Alice's account changed, need to re-execute SSHD
Refinement: exploits high-level semantics
[Figure: the attacker's process, the password file, the getpwnam() function, and Alice's SSHD along a time axis. The attacker's write(offset, data) is crossed out (X). SSHD does call(“alice”) on getpwnam(), which does read(offset, data) on the password file and does return(Alice's password).]
● getpwnam(): get username, return passwd entry
● Re-run getpwnam() instead of SSHD
● Predicate: check whether it returns the same passwd entry for Alice
– If so, skip the re-run of Alice's SSHD
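A minimal sketch of refinement at the function level: instead of re-running SSHD because the password file changed, re-run only the lookup and compare its return value. The `getpwnam` below is a toy stand-in written for this example (it parses a string, not the real system database), and the file contents are made up.

```python
def getpwnam(passwd_text, user):
    """Toy lookup: return the passwd line for `user`, or None if absent."""
    for line in passwd_text.splitlines():
        if line.split(":", 1)[0] == user:
            return line
    return None


# Hypothetical file contents before and after the repair.
before = "root:x:0:0\nattacker:x:0:0\nalice:x:1000:1000\n"
after = "root:x:0:0\nalice:x:1000:1000\n"


def sshd_needs_rerun(user):
    """Re-run SSHD only if the lookup it depended on returns something new."""
    return getpwnam(before, user) != getpwnam(after, user)
```

`sshd_needs_rerun("alice")` is false even though the file as a whole changed, which is the gain over tracking the dependency at file granularity.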
Quick summary: Retro's approach
● Action history graph: represent history in detail
● Two techniques to minimize re-execution:
– Predicates: skip equivalent computations
– Refinement: re-execute fine-grained actions
Challenge: external dependencies
● What if the attack was externally visible?
– Spam sent out ...
– Hard in the general case → ask for the user's decision
● Help users understand the repaired state
– (e.g.) notify the user that spam email was sent out ...
Compensating action: notify changes in terminal output
...
[redo] cat ~/.ssh/authorized_keys
...
! --- old
! +++ new
! @@ -1,3 +1,2 @@
!  ssh-rsa AAAAB3NzaC1yc2EAAAABIw... vagrant
! -ssh-rsa AAAAB3NzaC1yc2EAAAADAQ... attacker
!  ssh-rsa AAAAB3NzaC1yc2EAAAAAao... new pubkey
...
You should not have seen this output!
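One way such a notification could be produced is by diffing the terminal output recorded in the original run against the output of the re-executed command. The sketch below uses Python's standard `difflib`; the function name `terminal_diff` and the placeholder key strings are assumptions, and nothing here is taken from Retro's terminal manager.

```python
import difflib


def terminal_diff(old_output, new_output):
    """Unified diff between the recorded and the re-executed output lines."""
    return list(difflib.unified_diff(
        old_output, new_output, fromfile="old", tofile="new", lineterm=""))


# Placeholder keys; real entries would carry full key material.
old = [
    "ssh-rsa KEY1 vagrant",
    "ssh-rsa KEY2 attacker",
    "ssh-rsa KEY3 new pubkey",
]
new = [old[0], old[2]]  # after repair, the attacker's key is gone

diff = terminal_diff(old, new)
for line in diff:
    print("! " + line)
```

The diff marks the attacker's key as removed, which tells the user which part of the output they saw earlier was a product of the attack.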
Retro implementation
[Figure: architecture. Userspace: processes, action history graph, repair controller, repair managers (e.g., fs, terminal ..). Kernel: Linux kernel, Retro module, file system (checkpoints).]
● Runtime: record the action history graph
● Recovery: repair logic/managers
● Application-specific managers, using a well-defined API
Demo: recovering from an inadvertently installed virus
● Backtracking tool
● Selective re-execution
● Compensating action
Problem: detecting the entry point of an attack is hard
● How to find a one-month-old attack?
● Too much information
– Manual analysis is time-consuming
Observation: security patch renders attack harmless
● Escape URL arguments for firefox
// slider.c
- sprintf(cmd, "firefox %s", evt->uri);
+ sprintf(cmd, "firefox %s", escape(evt->uri));
[Figure: Unpatched vs Patched process chains (slider, sh, firefox, virus); in the patched run the virus is crossed out (x).]
Approach: comparing both histories to detect past attacks
● How can we get the history of the patched ('secure') execution?
– Replay inputs once more after applying security patches
– Different history → potential threats
● Turn the manual effort of the auditing process into a computational problem! (patch-based auditing)
[Figure: Unpatched vs Patched process chains (slider, sh, firefox, virus); in the patched run the virus is crossed out (x).]
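The comparison itself can be sketched as follows, assuming each past request's history is summarized as the list of processes it spawned. The request ids (`evt1`, `evt2`) and the function `differing_requests` are hypothetical; a real auditor would compare much richer histories.

```python
def differing_requests(unpatched, patched):
    """Return ids of requests whose recorded history changes under the patch."""
    return sorted(r for r in unpatched if unpatched[r] != patched.get(r))


# History recorded at runtime, with the vulnerable code.
unpatched = {
    "evt1": ["slider", "sh", "firefox"],
    "evt2": ["slider", "sh", "firefox", "virus"],
}

# History obtained by replaying the same inputs with the patch applied.
patched = {
    "evt1": ["slider", "sh", "firefox"],
    "evt2": ["slider", "sh", "firefox"],
}

suspects = differing_requests(unpatched, patched)
```

Only `evt2` behaves differently once the patch is applied, so it is the request flagged as a potential past attack.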
Challenge: performance
● Re-executing is costly for a busy computer
– Auditing requests → re-executes all requests again
– Auditing one month → takes another month!
Three techniques developed for partial re-execution
● Control flow filtering
– Audit only possibly affected executions
● Function-level auditing
– Compare function-level executions
● Memoized re-execution
– Avoid duplicated executions while replaying
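A rough sketch of how two of these techniques might combine, under loud assumptions: control flow filtering is modeled as "skip requests that never executed a patched function", and memoized re-execution as "cache re-execution results keyed by input". The slides do not give the actual mechanisms, so `audit`, its parameters, and the sample data are all hypothetical.

```python
def audit(requests, executed_functions, patched_functions, rerun):
    """Re-execute only requests that touched patched code, caching by input."""
    cache = {}
    rerun_count = 0
    results = {}
    for rid, inp in requests.items():
        # Control flow filtering: the patch cannot change this request.
        if not (executed_functions[rid] & patched_functions):
            continue
        # Memoized re-execution: identical inputs are replayed only once.
        if inp not in cache:
            cache[inp] = rerun(inp)
            rerun_count += 1
        results[rid] = cache[inp]
    return results, rerun_count


requests = {"r1": "a", "r2": "b", "r3": "a"}
executed = {"r1": {"parse", "render"}, "r2": {"render"}, "r3": {"parse"}}
results, rerun_count = audit(requests, executed, {"parse"}, str.upper)
```

Of three recorded requests, one is filtered out and the other two share a single re-execution, so only one replay is actually performed.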
Putting all together: fixing our past & future with patch
[Figure: timeline marked Aug. ??, Aug. 28th, Sept. 1st, and Oct. 3rd, 2011, with the patch from upstream (fixing a bug in SSHD); annotation "(a month!)".]
Without patch-based repair:
1. Manual and time consuming
2. Lost changes
3. No guarantees (safe to rollback?)
With it:
● Automatic detection
● Preserve changes
● Strong guarantees
Whenever new patches are released, not only prevent future attacks, but also detect and repair past attacks for free!
Summary of our approach: building real systems
● Existing systems are not designed for history
– Implicit dependencies and timeline
→ Action history graph & re-execution techniques
● Attacks can be anywhere in the history
– Attacks are often detected days or weeks later
→ Patch-based auditing
● History cannot be changed in some cases
– External dependencies: spam sent out
→ (Not solved) compensating actions in some cases (see our recent work, Aire [SOSP'13], in this direction of research)
Evaluation questions
● Automatic intrusion recovery
– How much better than manual repair?
– How much runtime overhead?
● Patch-based auditing
– What attacks can be detected?
– How fast is re-execution?
Experimental setup for Retro (automatic recovery)
● 2.8 GHz Intel Core i7, 8 GB RAM
● 64-bit Linux 2.6.35
● Tested with
– 2 real-world attacks from a honeypot
– 8 synthetic attacks