Detecting and Surviving Intrusions: Exploring New Host-Based Intrusion Detection, Recovery, and Response Approaches
Ph.D. Thesis Defense, December 17th, 2019
Affiliations: (1) HP Labs (ronny.chevalier@hp.com); (2) CIDRE Team, CentraleSupélec/Inria/CNRS/IRISA


  1. Thesis and Problems Addressed
     Surviving Intrusions at the Operating System Level
     • How to design an OS so that its services can survive ongoing intrusions while maintaining availability?
     • Contribution published at RESSI’18 [6] and ACSAC’19 [7]
     Detecting Intrusions at the Firmware Level
     • How to detect intrusions at the firmware level without impacting the quality of service to the rest of the platform?
     • Contribution published at ACSAC’17 [8]
     [6] Chevalier, Plaquin, and Hiet, “Intrusion Survivability for Commodity Operating Systems and Services: A Work in Progress”.
     [7] Chevalier, Plaquin, Dalton, et al., “Survivor: A Fine-Grained Intrusion Response and Recovery Approach for Commodity Operating Systems”.
     [8] Chevalier, Villatel, et al., “Co-processor-based Behavior Monitoring: Application to the Detection of Attacks Against the System Management Mode”.

  2. Agenda
     • Introduction: Preventing, Detecting, and Surviving Intrusions
     • Surviving Intrusions at the Operating System Level
        • State of the Art
        • Approach and Prototype
        • Evaluation
        • Conclusion
     • Detecting Intrusions at the Firmware Level
     • Conclusion and Perspectives

  3. Running Example
     Service: Gitea, a Git self-hosting server
     • Open source clone of GitHub (git repositories, bug tracking, ...)
     Intrusion: ransomware
     • It compromises data availability

  6. State of the Art: Intrusion Survivability, Recovery, and Response
     • Intrusion survivability [9]: trade-off between the availability and the security risk
     • Intrusion recovery [10]: restore the system in a safe state when an intrusion is detected
     • Intrusion response [11]: limit the impact of an intrusion on the system
     [9] Knight and Strunk, “Achieving Critical System Survivability Through Software Architectures”; Ellison et al., Survivable Network Systems: An emerging discipline.
     [10] Goel et al., “The Taser Intrusion Recovery System”; Xiong, Jia, and P. Liu, “SHELF: Preserving Business Continuity and Availability in an Intrusion Recovery System”.
     [11] Balepin et al., “Using Specification-Based Intrusion Detection for Automated Response”; Shameli-Sendi, Cheriet, and Hamou-Lhadj, “Taxonomy of Intrusion Risk Assessment and Response System”.

  10. State of the Art: Limitations we are addressing
     • Intrusion survivability: lack of focus on commodity OSs
     • Intrusion recovery:
        • The system is still vulnerable and can be reinfected
        • Lack of integration between intrusion recovery and response
     • Intrusion response: coarse-grained responses and few host-based solutions
     Commodity OSs are lacking solutions to make them survive while waiting for the patches to be available

  12. Approach Overview: Illustrative Example
     Running example: Gitea infected by some ransomware
     When detected:
     • Recovery: we restore the service and the encrypted files to a previous state
     • Apply restrictions: we remove the ability to write on the file system
     Positive impact: if the ransomware reinfects the service → it cannot compromise the files
     Degraded mode: users can no longer push to repositories → trade-off between availability and security risk

  16. Approach Overview: Checkpoint & Log
     During the normal operation of the system:
     1. Periodic checkpointing
     2. Log file write accesses
     [Figure: diagram with the labels Monitor, Checkpoint, Log, Store, States, Logs, and Intrusion Detection; services Apache, Gitea, ..., Service n; operating system resources Filesystem, Network, Devices]

  17. Approach Overview: Recovery & Response
     How does our approach allow the system to survive intrusions after their detection?
     1. Restore infected objects
     2. Withstand reinfection
        • Per-service responses to prevent attackers from achieving their goals
        • Remove privileges and decrease resource quotas
     3. Maintain core functions
        • Potential degraded mode: the degraded mode maintains core functions while waiting for patches
     Policies: we select responses that minimize the availability impact on the service while maximizing the security
     [Figure: diagram with the labels Intrusion Detection, Alert, Monitor, Restore service, Restore files, Apply restrictions, States, Logs, and Policies; services Apache, Gitea, ..., Service n; operating system resources Filesystem, Network, Devices]

  24. Cost-Sensitive Response Selection
     Inputs:
     • Intrusion detection → initial alert with a likelihood (example: very likely)
     • Threat intelligence → additional information: malicious behaviors (example: ransomware) and responses (example: read-only FS, disable syscall, ...)
     Steps: understand the intrusion → find possible responses → assign costs → select a response
     Assigned values: response costs, response efficiency, malicious behaviors costs, risk matrix
     Optimization:
     1. Pareto-optimal set
     2. Weighted sum
     Output: selected response

  28. Cost-Sensitive Response Selection: Malicious Behavior Costs
     Costs: very low, low, moderate, high, very high, critical
     Example of a non-exhaustive malicious behavior hierarchy (source: MAEC of the STIX project), with a cost assigned to one category:
     • Availability violation = moderate
        • Consume system resources = moderate
        • Crack passwords = moderate
        • Mine for cryptocurrency = moderate
        • Compromise data availability = moderate
        • Compromise access to information assets = moderate
        • ...
     • Command and Control
        • Determine C2 server
        • Generate C2 domain name(s)
        • Receive data from C2 server
        • Control malware via remote command
        • Update configuration
        • ...
     • ...
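
The cost assignment shown on this slide (a cost set on "Availability violation" also appears on the behaviors below it) could be sketched as a lookup that walks up the hierarchy. This is only a sketch under assumptions: the `Behavior` struct, the `COST_UNSET` marker, and `effective_cost` are hypothetical names, and "inherit from the nearest ancestor that has an explicit cost" is one possible reading of the slide, not a rule stated in the deck.

```c
#include <assert.h>
#include <stddef.h>

/* Qualitative cost levels listed on the slide. COST_UNSET is a hypothetical
 * marker meaning "no cost was assigned to this node explicitly". */
typedef enum {
    COST_UNSET = -1,
    COST_VERY_LOW, COST_LOW, COST_MODERATE,
    COST_HIGH, COST_VERY_HIGH, COST_CRITICAL
} Cost;

/* One node of the malicious behavior hierarchy (hypothetical layout). */
typedef struct Behavior {
    const char *name;
    const struct Behavior *parent; /* NULL for a top-level category */
    Cost cost;                     /* explicit cost, or COST_UNSET */
} Behavior;

/* Return the node's own cost if it has one, otherwise the cost of the
 * nearest ancestor that has one. Returns COST_UNSET if none is found. */
Cost effective_cost(const Behavior *b)
{
    assert(b != NULL);
    for (; b != NULL; b = b->parent) {
        if (b->cost != COST_UNSET)
            return b->cost;
    }
    return COST_UNSET;
}

/* Fragment of the hierarchy from the slide: only the category carries
 * an explicit cost, the behavior below it inherits that cost. */
const Behavior availability_violation = {
    "Availability violation", NULL, COST_MODERATE
};
const Behavior compromise_data_availability = {
    "Compromise data availability", &availability_violation, COST_UNSET
};
const Behavior command_and_control = {
    "Command and Control", NULL, COST_UNSET
};
```

With this fragment, `effective_cost(&compromise_data_availability)` yields `COST_MODERATE`, while `effective_cost(&command_and_control)` stays `COST_UNSET` because no cost was assigned on that branch.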

  30. Cost-Sensitive Response Selection: Per-Service Responses
     Example of a non-exhaustive per-service response hierarchy:
     • File system
        • Read-only file system
        • Read-only path
        • Inaccessible path
     • System calls
        • Blacklist any system call
        • Blacklist a list or a category of system calls
     • Network
        • Disable network
        • Blacklist IP addresses
        • Blacklist ports
        • ...
     • Resources
        • CPU quota
        • ...
     • ...
     Responses may be provided via the exchange format STIX (e.g., the course of action field)

  32. Cost-Sensitive Response Selection: Risk Matrix
     The risk matrix is defined by the organization. It combines two inputs into a risk level:
     • Confidence (likelihood): very unlikely (0 – 0.2), unlikely (0.2 – 0.4), probable (0.4 – 0.6), likely (0.6 – 0.8), very likely (0.8 – 1)
     • Malicious behavior cost: very low (0 – 0.2), low (0.2 – 0.4), moderate (0.4 – 0.6), high (0.6 – 0.8), very high (0.8 – 1)
     [Table: 5 × 5 risk matrix over the confidence and malicious behavior cost levels; each cell holds a risk level L, M, or H]
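
A lookup in such a matrix might be sketched as follows. The five 0.2-wide levels come from the slide; everything else is an assumption made for illustration: the names `level_of` and `risk_of` are hypothetical, boundary values are assigned to the upper level, and the cell values in `RISK_MATRIX` are illustrative placeholders, not the values of the matrix on the slide.

```c
#include <assert.h>

/* Risk levels used in the cells of the matrix. */
typedef enum { L, M, H } Risk;

/* ILLUSTRATIVE PLACEHOLDER VALUES (assumption): the real cell values are
 * defined by the organization. Rows are likelihood levels, columns are
 * malicious behavior cost levels, both ordered from lowest to highest. */
static const Risk RISK_MATRIX[5][5] = {
    /* cost:          very low, low, moderate, high, very high */
    /* very unlikely */ { L, L, L, L, L },
    /* unlikely      */ { L, L, L, M, M },
    /* probable      */ { L, L, M, M, H },
    /* likely        */ { L, M, M, H, H },
    /* very likely   */ { L, M, H, H, H },
};

/* Map a value in [0, 1] to one of the five 0.2-wide levels
 * (0 = lowest level, 4 = highest level). */
int level_of(double x)
{
    assert(x >= 0.0 && x <= 1.0);
    int level = (int)(x * 5.0);
    return level > 4 ? 4 : level; /* x == 1.0 belongs to the top level */
}

/* Combine the alert likelihood and the malicious behavior cost. */
Risk risk_of(double likelihood, double behavior_cost)
{
    return RISK_MATRIX[level_of(likelihood)][level_of(behavior_cost)];
}
```

Keeping the matrix as data rather than code reflects the slide's point that the organization defines it: changing the risk appetite only means changing the table.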

  34. Cost-Sensitive Response Selection: Summary
     We rely on:
     • Possible responses
     • Malicious behaviors
     • Likelihood
     We assign:
     • Response costs
     • Malicious behavior costs
     • Risk matrix
     We select responses based on:
     • Response cost
     • Risk
     • Response efficiency
     These inputs are defined by the administrator/developer, by threat intelligence, and by the organization.
     Cost vs efficiency: the selection prioritizes efficiency if the risk is high, and cost if the risk is low:
     max(Risk × Efficiency + (1 − Risk) × (1 − Cost))
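
The two optimization steps named in the deck (Pareto-optimal set, then weighted sum) could look like the sketch below. It rests on assumptions: cost, efficiency, and risk are taken as numbers in [0, 1], the `Response` struct and the function names are hypothetical, and the three candidates and their values are made up for illustration, not measured results.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    const char *name;
    double cost;       /* availability cost of the response, in [0, 1] */
    double efficiency; /* efficiency against the behavior, in [0, 1] */
} Response;

/* Weighted sum from the slide:
 * Risk * Efficiency + (1 - Risk) * (1 - Cost). */
double score(const Response *r, double risk)
{
    return risk * r->efficiency + (1.0 - risk) * (1.0 - r->cost);
}

/* A response is dominated when another one is at least as good on both
 * criteria (lower cost, higher efficiency) and strictly better on one. */
int is_dominated(const Response *all, size_t n, size_t i)
{
    for (size_t j = 0; j < n; j++) {
        if (j == i)
            continue;
        int no_worse = all[j].cost <= all[i].cost &&
                       all[j].efficiency >= all[i].efficiency;
        int better = all[j].cost < all[i].cost ||
                     all[j].efficiency > all[i].efficiency;
        if (no_worse && better)
            return 1;
    }
    return 0;
}

/* Step 1: keep the Pareto-optimal responses. Step 2: among them, return
 * the index of the response with the highest weighted sum. */
size_t select_response(const Response *all, size_t n, double risk)
{
    assert(n > 0 && risk >= 0.0 && risk <= 1.0);
    size_t best = n; /* n means "nothing selected yet" */
    for (size_t i = 0; i < n; i++) {
        if (is_dominated(all, n, i))
            continue;
        if (best == n || score(&all[i], risk) > score(&all[best], risk))
            best = i;
    }
    return best;
}

/* Hypothetical candidates with made-up values. */
const Response candidates[] = {
    { "read-only path",       0.2, 0.5 },
    { "read-only filesystem", 0.6, 0.9 },
    { "disable network",      0.9, 0.4 }, /* dominated by the previous one */
};
```

With these made-up values, a risk of 0.9 selects the read-only filesystem (index 1) and a risk of 0.1 selects the read-only path (index 0), which illustrates the "efficiency if the risk is high, cost if the risk is low" behavior of the formula.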

  35. Prototype Implementation for Linux-Based Systems
     Projects used or modified:
     Project | What is it? What does it do? | Why do we use/modify it? | Lines of C code added
     systemd | system and service manager | Orchestration | 2639
     CRIU | checkpoint & restore processes | Restoration | 383
     snapper | manage snapshots of file systems | Restoration | 0
     Linux kernel | cgroups: set of processes bound to a set of limits; seccomp: filter system calls; namespaces: partition kernel resources; audit: record security relevant events; [...] | Logging & Responses | 460

  37. Evaluation Setup
     What do we evaluate?
     • Responses effectiveness
     • Cost-sensitive response selection
     • Availability cost and performance impact
     • Stability of degraded services
     Malware and attacks:
     • Different types of malicious behaviors (botnet, ransomware, cryptominer, ...)
     • Linux.BitCoinMiner, Linux.Rex.1, Hakai, Linux.Encoder.1, GoAhead exploit
     Performance evaluation setup:
     • Various types of services (Apache, nginx, mariadb, beanstalkd, mosquitto, gitea)
     • Both synthetic and real-world benchmarks using Phoronix test suite

  39. Security Evaluation: Restoration and Responses Effectiveness
     Attack Scenario | Malicious Behavior | Per-service Response Policy
     Linux.BitCoinMiner | Mine for cryptocurrency | Ban mining pool IPs
     Linux.BitCoinMiner | Mine for cryptocurrency | Reduce CPU quota
     Linux.Rex.1 | Join P2P botnet | Ban bootstrapping IPs
     Hakai | Communicate with C&C | Ban C&C servers’ IPs
     Linux.Encoder.1 | Encrypt data | Read-only filesystem
     GoAhead exploit | Open reverse shell | Forbid connect syscall
     GoAhead exploit | Data theft | Render paths inaccessible
     Results:
     • The service is restored
     • The service can withstand the reinfection

  40. Security Evaluation: Cost-Sensitive Response Selection
     Goal: evaluate the impact of the IDS accuracy when selecting responses
     Scenario: survive ransomware that compromised Gitea → accurate likelihood (1), inaccurate likelihood (2), false positive (3)
     Results:
     • High risk: read-only filesystem (1, 3)
        • Ransomware failed to reinfect
        • Gitea still usable (can access all repositories, clone them, log in)
     • Low risk: read-only paths of important git repositories (2)
        • Ransomware could not encrypt important repositories
        • Gitea still usable (can access important repositories, clone them)

  41. Performance Evaluation
     Availability cost:
     • Less than 300 ms to checkpoint
     • Less than 325 ms to restore
     Monitoring cost:
     • Overhead present only on applications that write to the file system
     • Small overhead in general (0.6 % - 4.5 %)
     • Worst case (28.7 % overhead): writing small files asynchronously in burst
        • e.g., SHELF [12] has 8 % and 67 % overhead
     [Figure (a): MB/s score with the Compilebench benchmark (more is better); parameters: Initial Create, Compile, Read Compiled Tree]
     [Figure (b): time (in seconds) to build the Linux kernel (less is better); parameters: Linux 4.13]
     [Figure (c): time (in seconds) to extract the archive (.tar.gz) of the Linux kernel source code (less is better); parameters: Linux kernel 4.15 with tar 1.30]
     Legend for the three figures: no monitoring (baseline); monitoring rule enabled, but service not monitored; monitoring rule enabled and service monitored
     [12] Xiong, Jia, and P. Liu, “SHELF: Preserving Business Continuity and Availability in an Intrusion Recovery System”.

  47. Scientific Contributions and Future Work
     What were the challenges?
     • The system survives while waiting for the patches
     • Realistic use cases
     • Maintain availability while maximizing security
     Future work:
     • Checkpointing limitations (e.g., with CRIU)
     • Models input
     Publications:
     • RESSI’18: Ronny Chevalier, David Plaquin, and Guillaume Hiet. “Intrusion Survivability for Commodity Operating Systems and Services: A Work in Progress”. May 2018
     • ACSAC’19: Ronny Chevalier, David Plaquin, Chris Dalton, and Guillaume Hiet. “Survivor: A Fine-Grained Intrusion Response and Recovery Approach for Commodity Operating Systems”. Dec. 2019

  48. Agenda
     • Introduction: Preventing, Detecting, and Surviving Intrusions
     • Surviving Intrusions at the Operating System Level
     • Detecting Intrusions at the Firmware Level
        • Background, Use Case, and State of the Art
        • Approach and Prototype
        • Evaluation
        • Conclusion
     • Conclusion and Perspectives

  49. Computers rely on firmware
     What is it?
     • Low-level software
     • Tightly linked to hardware
     • Stored in a flash
     Where can we find firmware?
     • Motherboards (e.g., BIOS), hard disks, network cards, ...
     • Here, we focus on BIOS/UEFI-compliant firmware
     Boot time vs runtime:
     • Early execution and configuration
     • Highly privileged runtime software
     [Figure: privilege stack, from less to more privileges: Applications, Operating System, BIOS, Hardware]

  50. What is the problem?
     BIOSs are not exempt from vulnerabilities [13]
     • BIOSs are often written in unsafe languages (i.e., C & assembly)
     • Memory safety errors (e.g., use after free or buffer overflow)
     Why compromise a BIOS?
     • Malware can be hard to detect (stealth)
     • Malware can be persistent (survives even if the HDD/SSD is changed) and costly to remove
     What do we want?
     • Boot time integrity
     • Runtime integrity → some platforms are rarely rebooted
     [13] Kallenberg et al., “Defeating Signed BIOS Enforcement”; Bazhaniuk et al., “A new class of vulnerabilities in SMI handlers”; Researchers, LoJax: First UEFI rootkit found in the wild, courtesy of the Sednit group.

  52. What are the currently used solutions?
     Boot time:
     • Immutable hardware root of trust
     • Measurements and reporting to a TPM chip
     • Signature verification before executing
     • Signed updates
     Runtime:
     • Isolation of critical services available while the OS is running → our focus is on the System Management Mode (SMM)
     [Figure: boot chain Immutable Root of Trust, UEFI Firmware, Bootloader, Operating System, with the labels Measure & Report, Verify, and Signed Updates]

  53. Introducing the System Management Mode (SMM)
     Highly privileged execution mode for x86 processors
     Runtime services: BIOS update, power management, UEFI variables handling, etc.
     How to enter the SMM?
     • Trigger a System Management Interrupt (SMI) → needs kernel privileges
     • SMIs code & data are stored in a protected memory region: System Management RAM (SMRAM)
     Why is it interesting for an attacker?
     • Only mode that can write to the flash containing the BIOS
     • Arbitrary code execution in SMM gives full control of the platform
     BIOS code is not exempt from vulnerabilities affecting SMM [14]
     [14] Bazhaniuk et al., “A new class of vulnerabilities in SMI handlers”; Bulygin, Bazhaniuk, et al., “BARing the System: New vulnerabilities in Coreboot & UEFI based systems”; Pujos, SMM unchecked pointer vulnerability; Researchers, LoJax: First UEFI rootkit found in the wild, courtesy of the Sednit group.

  55. State of the Art: Runtime Intrusion Detection for Low-Level Components
     Few solutions were designed to monitor the SMM at runtime
     Snapshot-based approaches [15]
     • Periodic snapshot of the target’s state
     • Limitations: transient attacks
     Event-based approaches [16]
     • Observe events generated by the target
     • Limitations: performance issues, lack of flexibility, or semantic gap
     How can computing platforms be designed to detect intrusions modifying the runtime behavior of the SMM?
     [15] Petroni et al., “Copilot - a Coprocessor-based Kernel Runtime Integrity Monitor”; Bulygin and Samyde, “Chipset based approach to detect virtualization malware”.
     [16] Lee et al., “KI-Mon: A Hardware-assisted Event-triggered Monitoring Platform for Mutable Kernel Object”; Z. Liu et al., “CPU Transparent Protection of OS Kernel and Hypervisor Integrity with Programmable DRAM”.

  57. Our objective
     Our goal is to detect attacks that modify the expected behavior of the SMM by monitoring its behavior at runtime.
     Such a goal raises the following questions:
     • How to monitor?
     • How to define a correct behavior?
     • How to ensure the integrity of the monitor?
     [Figure: Firmware, Runtime Behavior Monitoring, Monitor, Response (raise alert or stop execution or ...)]

  62. Approach overview
     • How to define a correct behavior?
     • How to monitor?
     • How to ensure the integrity of the monitor?
     • Semantic gap? → bridging the semantic gap
     [Figure: BIOS source code and SMM source code go through an LLVM-based compiler, which produces the instrumented SMM code and the expected target behavior; the target (processor, RAM) is connected to the monitor (co-processor, co-processor RAM) through a unidirectional FIFO]

  63. How to define a correct behavior?
      Our use case: SMM code
      • Written in unsafe languages (i.e., C & assembly)
        → Such languages are often targeted by attacks hijacking the control flow
        → Control Flow Integrity (CFI)
      • Tightly coupled to hardware
        → Its behavior relies on hardware configuration registers
        → CPU registers integrity
      Control Flow Graph (CFG): defines the control flow that the software is expected to follow
      Invariants on CPU registers: define rules that registers are expected to satisfy

  66. How to define a correct behavior?
      Control Flow Integrity (CFI): principle
      Goal: constrain the execution path to follow a control-flow graph (CFG)
      Example:
          void auth(int a, int b) {
              char buffer[512];
              [...vuln...]
              verification(buffer);
          }
          void verification(char *input) {
              if (strcmp(input, "secret") == 0)
                  authenticated();
              else
                  non_authenticated();
          }
      Simplified graph: auth → verification → authenticated or non_authenticated
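
      The CFI principle on this slide can be sketched in a few lines of C. This is only a minimal illustration, assuming the simplified graph of the example; the names `cfg_edge`, `cfg_transfer_allowed`, and `cfg_path_allowed` are hypothetical and do not come from the thesis.

```c
#include <stdbool.h>
#include <stddef.h>

/* Nodes of the simplified graph from the slide's example. */
enum node { AUTH, VERIFICATION, AUTHENTICATED, NON_AUTHENTICATED, NODE_COUNT };

/* Allowed edges: auth -> verification -> {authenticated, non_authenticated}.
 * Every entry that is not listed defaults to false. */
static const bool cfg_edge[NODE_COUNT][NODE_COUNT] = {
    [AUTH][VERIFICATION] = true,
    [VERIFICATION][AUTHENTICATED] = true,
    [VERIFICATION][NON_AUTHENTICATED] = true,
};

/* A single control transfer is allowed only if the CFG contains that edge. */
bool cfg_transfer_allowed(enum node from, enum node to)
{
    if (from >= NODE_COUNT || to >= NODE_COUNT)
        return false;
    return cfg_edge[from][to];
}

/* An execution path is allowed only if every consecutive transfer is allowed. */
bool cfg_path_allowed(const enum node *path, size_t len)
{
    for (size_t i = 1; i < len; i++) {
        if (!cfg_transfer_allowed(path[i - 1], path[i]))
            return false;
    }
    return true;
}
```

      With this table, the path auth → verification → non_authenticated is accepted, while a hijacked jump from auth straight to authenticated is rejected because the graph has no such edge.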

  73. How to define a correct behavior?
      Control Flow Integrity (CFI): type-based verification
      We focus on the integrity of indirect branches.
      Type-based verification: ensures the integrity of indirect calls.
      Compile time: compilation of the SMM source code produces the instrumented SMM code (target) and the monitor's tables:
      • Call Site ID → Type: 1561 → i8(i32), 4852 → ..., ...
      • Function Address → Type: i8(i32), i32(), ...
      • Other values shown in the figure: type i32(i8); addresses 0x0fffb804 and 0x0befca04
      Runtime: the target sends a message (Call Site ID, Target Address), e.g. (1561, 0x0fffb804), and the monitor checks: valid?
      Example:
          typedef struct SomeStruct {
              [...]
              char (*foo)(int);
          } SomeStruct;
          int bar(SomeStruct *s) {
              char c;
              [...]
              [SendMessage(1561, s->foo)]
              c = s->foo(31); /* Call Site ID = 1561 */
              [...]
          }
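
      On the monitor side, the check described on this slide could look roughly like the sketch below. It is an assumption-laden illustration: the struct and function names are hypothetical, types are compared as strings only for readability, and the pairing of each function address with a type is assumed, since the flattened figure does not make it explicit.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Tables generated at compile time (contents are illustrative). */
struct call_site { uint32_t id;      const char *type; };
struct function  { uint32_t address; const char *type; };

static const struct call_site call_sites[] = {
    { 1561, "i8(i32)" },            /* c = s->foo(31); */
};

static const struct function functions[] = {
    { 0x0fffb804u, "i8(i32)" },     /* assumed pairing of address and type */
    { 0x0befca04u, "i32()" },       /* assumed pairing of address and type */
};

/* Look up the type expected at a call site, or NULL if the ID is unknown. */
static const char *call_site_type(uint32_t id)
{
    for (size_t i = 0; i < sizeof call_sites / sizeof call_sites[0]; i++)
        if (call_sites[i].id == id)
            return call_sites[i].type;
    return NULL;
}

/* Look up the type of the function at an address, or NULL if none starts there. */
static const char *function_type(uint32_t address)
{
    for (size_t i = 0; i < sizeof functions / sizeof functions[0]; i++)
        if (functions[i].address == address)
            return functions[i].type;
    return NULL;
}

/* A message (call site ID, target address) is valid only if both are known
 * and the target's type equals the type expected at the call site. */
bool indirect_call_valid(uint32_t call_site_id, uint32_t target_address)
{
    const char *expected = call_site_type(call_site_id);
    const char *actual = function_type(target_address);
    return expected != NULL && actual != NULL && strcmp(expected, actual) == 0;
}
```

      Under these assumptions, the message (1561, 0x0fffb804) is accepted, whereas a target of 0x0befca04 or any address that is not a known function entry is reported as invalid.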

  74. How to define a correct behavior?
      Control Flow Integrity (CFI): shadow call stack
      Shadow call stack: ensures integrity of the return address on the stack
      Compile time: compilation of the SMM source code produces the instrumented SMM code (target)
      Runtime: the target sends messages containing the return address; the monitor maintains the shadow call stack, pops the stored return address, and checks: valid?
      Addresses shown in the figure: 0x0f8a520c, 0x0f8522d0
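
      A minimal sketch of the monitor's shadow call stack, assuming one message per call and one per return; the names `shadow_stack`, `shadow_on_call`, and `shadow_on_return` and the fixed depth are hypothetical choices made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SHADOW_DEPTH 64   /* arbitrary capacity chosen for this sketch */

struct shadow_stack {
    uint32_t addr[SHADOW_DEPTH];
    size_t top;           /* number of return addresses currently stored */
};

/* Message sent on a call: remember where the callee is expected to return. */
bool shadow_on_call(struct shadow_stack *s, uint32_t return_address)
{
    if (s->top == SHADOW_DEPTH)
        return false;     /* overflow: treated as an anomaly in this sketch */
    s->addr[s->top++] = return_address;
    return true;
}

/* Message sent before a return: pop the stored address and compare it with
 * the one the target is about to use. */
bool shadow_on_return(struct shadow_stack *s, uint32_t return_address)
{
    if (s->top == 0)
        return false;     /* return without a matching call */
    return s->addr[--s->top] == return_address;
}
```

      If the target pushes 0x0f8a520c on a call and later reports 0x0f8522d0 on the matching return, `shadow_on_return` yields false, which is the discrepancy the monitor is looking for.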

  75. How to define a correct behavior?
      CPU registers integrity
      SMM code is tightly coupled to hardware
      • Generic detection methods (e.g., CFI) are not aware of hardware specificities
      • Ad hoc detection methods are needed
      Some interesting registers for an attacker
      • SMBASE: defines the SMM entry point
      • CR3: physical address of the page directory
      → Their value is stored in memory and is not supposed to change at runtime
      How to protect such registers?
      • Send the expected values at boot time
      • Send messages at runtime containing these values to detect any discrepancy
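
      The two protection steps above (expected values at boot time, comparison at runtime) can be sketched as follows. The struct layout, the function names, and the register values used in the usage note are hypothetical; this is not the prototype's actual message format.

```c
#include <stdbool.h>
#include <stdint.h>

/* Register values reported by the target (layout assumed for this sketch). */
struct smm_registers {
    uint64_t smbase;   /* defines the SMM entry point */
    uint64_t cr3;      /* physical address of the page directory */
};

struct register_monitor {
    struct smm_registers expected;
    bool initialized;
};

/* Boot time: record the values the registers are expected to keep. */
void monitor_set_expected(struct register_monitor *m, struct smm_registers boot)
{
    m->expected = boot;
    m->initialized = true;
}

/* Runtime: compare the values sent by the target with the expected ones.
 * Returns true when they match, false on any discrepancy. */
bool monitor_check(const struct register_monitor *m, struct smm_registers now)
{
    if (!m->initialized)
        return false;
    return now.smbase == m->expected.smbase && now.cr3 == m->expected.cr3;
}
```

      For example, after `monitor_set_expected` records an SMBASE of 0xa0000 and a CR3 of 0x1000 (made-up values), a runtime message reporting a different SMBASE makes `monitor_check` return false.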

  76. How to monitor?
      Communication channel constraints
      Security constraints
      • Message integrity
      • Chronological order
      • Exclusive access
      Performance constraints
      • Acceptable latency of an SMI as defined by Intel BIOS Test Suite: 150 µs
      • More than 150 µs per SMI handler leads to degradation of performance or user experience

  77. How to monitor?
      Communication channel design
      Additional hardware component
      • Chronological order → FIFO (queue)
      • Message integrity → Restricted FIFO
      • Exclusive access → Check if CPU is in SMM (SMIACT# signal)
      • Performance → Use a low latency interconnect
      [Figure: Processor (target) → push → Restricted FIFO (In SMM? SMIACT#) → pop → Co-processor (monitor)]
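
      The restricted FIFO is a hardware component, but its intended behavior can be modeled in software. The sketch below is such a model under stated assumptions: `in_smm` stands in for the SMIACT# signal, the capacity and message width are arbitrary, and all names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FIFO_CAPACITY 8   /* arbitrary capacity chosen for this sketch */

struct restricted_fifo {
    uint32_t slot[FIFO_CAPACITY];
    size_t head;          /* index of the oldest message */
    size_t count;         /* number of queued messages */
};

/* Target side: a push is accepted only while the CPU is in SMM
 * (exclusive access). There is no operation to rewrite or reorder queued
 * messages (message integrity, chronological order). */
bool fifo_push(struct restricted_fifo *f, bool in_smm, uint32_t message)
{
    if (!in_smm || f->count == FIFO_CAPACITY)
        return false;
    f->slot[(f->head + f->count) % FIFO_CAPACITY] = message;
    f->count++;
    return true;
}

/* Monitor side: messages come out in the order they were pushed. */
bool fifo_pop(struct restricted_fifo *f, uint32_t *message)
{
    if (f->count == 0)
        return false;
    *message = f->slot[f->head];
    f->head = (f->head + 1) % FIFO_CAPACITY;
    f->count--;
    return true;
}
```

      In this model a push attempted outside SMM is dropped, and the monitor pops messages in exactly the order the SMM code sent them.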

  78. Agenda
      • Introduction: Preventing, Detecting, and Surviving Intrusions
      • Surviving Intrusions at the Operating System Level
      • Detecting Intrusions at the Firmware Level
        • Background, Use Case, and State of the Art
        • Approach and Prototype
        • Evaluation
        • Conclusion
      • Conclusion and Perspectives
