Dependable Intrusion Tolerance, Alfonso Valdes (valdes@sdl.sri.com)


  1. Dependable Intrusion Tolerance
     Alfonso Valdes (valdes@sdl.sri.com), Magnus Almgren, Dan Andersson, Steve Cheung, Bruno Dutertre, Yves Deswarte, Josh Levy, Hassen Saïdi, Tomas Uribe
     October 2001
     Acknowledgements: Research sponsored under DARPA Contract N66001-00-C-8058. Views presented are those of the authors and do not represent the views of DARPA or the Space and Naval Warfare Systems Center.

  2. Dependable Intrusion Tolerance
     - Intrusion Detection to Date
       - Seeks to detect an arbitrary number of attacks in progress
       - Relies on signature analysis and probabilistic (including Bayes) techniques
       - Response components immature
       - No concept of intrusion tolerance
     - New Emphasis
       - Detection, damage assessment, and recovery
       - Finite number of attacks or deviations from expected system behavior
       - Seek a synthesis of intrusion detection, unsupervised learning, and proof-based methods for the detection aspect
       - Concepts from fault tolerance are adapted to ensure delivery of service (possibly degraded)

  3. Outline
     - Architecture overview
     - Sensor subsystem
     - Proxy functionality
     - Stopping Code Red
     - Selecting optimal response
     - Summary

  4. Architecture
     [Figure: architecture diagram. Labels: External traffic, Proxy, App Server (four instances), EMERALD Network Appliance, Sensor Subnet, Proxy-AS Subnet]

  5. Architecture (2)
     [Figure: architecture diagram with multiple proxies. Labels: External traffic, Proxy (four instances), App Server (four instances), EMERALD Network Appliance, Sensor Subnet, Proxy-AS Subnet]

  6. Sensor Subsystem for Situational Awareness
     - EMERALD host and network sensors detect a variety of known attacks.
     - EMERALD probabilistic sensors potentially detect novel attacks, but perhaps symptomatically (and therefore after the fact).
     - Content agreement and challenge/response protocols detect corrupted content, regardless of the mechanism by which it became corrupted.
     - On-line verifiers check overall system compliance with the system specification by formal means at run time.

  7. The Sensor Picture
     Note:
     - The Net appliance has a passive interface for the network traffic.
     - Net appliance and app servers have write-only access to sensor subnet (for alert reporting).
     - Proxies use sensor subnet for alert reporting and Proxy function management.
     [Figure: sensor diagram. Labels: Tolerance Proxy, Application Server (four instances), On-Line Verifier, Critical APP, Proof Based Trigger, EMERALD APP Monitor, EMERALD Host Monitor, AMI, IDS Network Appliance, EMERALD eBayes-TCP, Blue Sensor, Net Experts, Correlation]

  8. Proxy Implementation
     - Basic functionality:
       - Accept HTTP connection
       - Read client HTTP request
       - Check ACLs
       - Load balancing
       - Send reply to client
     - Functionality to implement intrusion tolerance:
       - Effect change of policy if needed
       - Check content agreement (depends on dynamic policy)
       - Challenge/response protocol monitors file system integrity
       - Alert the sensor subsystem if required

  9. Ensuring Correct Content
     - In agreement modes, we compare content from more than one APP server
     - For efficiency and bandwidth, we actually check MD5 checksums for all polled servers
     - If these agree, we obtain content from one of the servers and actually verify the MD5 at the proxy
     - If this agrees with the previous MD5 check, the content is forwarded to the client
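
The two-stage check on this slide can be sketched roughly as follows. This is a minimal illustration, not the project's proxy code: `md5_hex`, `fetch_agreed_content`, and the `fetch_body` callback are hypothetical names, and the checksums are assumed to arrive as hex strings already collected from the polled servers.

```python
import hashlib
from typing import Callable, Optional, Sequence


def md5_hex(data: bytes) -> str:
    """MD5 digest of a payload as a hex string."""
    return hashlib.md5(data).hexdigest()


def fetch_agreed_content(
    polled_md5s: Sequence[str],
    fetch_body: Callable[[], bytes],
) -> Optional[bytes]:
    """Return content only if every polled checksum agrees and the
    fetched body hashes to that same checksum; otherwise return None.

    polled_md5s: checksums reported by the polled servers (assumed input).
    fetch_body:  hypothetical callback that retrieves the full content
                 from one of the servers.
    """
    # Stage 1: all polled servers must report the same checksum.
    if not polled_md5s or len(set(polled_md5s)) != 1:
        return None
    # Stage 2: fetch the content once and verify it at the proxy.
    body = fetch_body()
    if md5_hex(body) != polled_md5s[0]:
        return None
    return body
```

Comparing digests first means the full content crosses the proxy-AS subnet only once per request, which is the efficiency argument the slide makes.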

  10. Four policy levels
      - Benign: 1 GET request
      - Duplex (default regime at system start): 1 HEAD (get MD5 only) and 1 GET (MD5 plus content).
        - If MD5 agree, send content to client
        - Otherwise, go to Triplex
      - Triplex: 2 HEAD and 1 GET request.
        - If MD5 all agree, send content to client.
        - If majority obtained, consider the minority AS compromised. Send content to client, rebuild AS, continue Triplex
      - Full Agreement
      - Transition to a more permissive regime after some time of normal activity
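
A rough sketch of the duplex-to-triplex escalation and the triplex majority vote is given below. The names `Policy`, `triplex_vote`, and `next_policy` are hypothetical, and the slide does not say what happens when triplex finds no majority, so the sketch only reports that case and leaves the reaction to the caller.

```python
from collections import Counter
from enum import IntEnum
from typing import List, Optional, Sequence, Tuple


class Policy(IntEnum):
    """The four regimes named on the slide, ordered by strictness."""
    BENIGN = 1
    DUPLEX = 2
    TRIPLEX = 3
    FULL_AGREEMENT = 4


def triplex_vote(md5s: Sequence[str]) -> Tuple[Optional[str], List[int]]:
    """Majority vote over the checksums returned by the polled servers.

    Returns (winning checksum, indices of minority servers). The minority
    servers are the ones the slide says to treat as compromised and rebuild.
    Returns (None, []) when there is no strict majority.
    """
    if not md5s:
        return None, []
    digest, count = Counter(md5s).most_common(1)[0]
    if count * 2 <= len(md5s):
        return None, []
    suspects = [i for i, d in enumerate(md5s) if d != digest]
    return digest, suspects


def next_policy(policy: Policy, md5s: Sequence[str]) -> Policy:
    """Duplex escalates to triplex on any disagreement; otherwise unchanged."""
    if policy == Policy.DUPLEX and len(set(md5s)) > 1:
        return Policy.TRIPLEX
    return policy
```

For example, `triplex_vote(["a", "b", "a"])` identifies server index 1 as the minority while still yielding a checksum the proxy can serve against.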

  11. Stopping Code Red (and NIMDA)
      [Figure: Distributed Proxy Bank, IDS Appliance, IIS]
      1. 3/4 of Code Red attempts miss the IIS server
      2. IDS detects attempt. System invokes agreement mode
      3. In case of a successful infection, corrupt content is detected and reinfection attempts are blocked
      4. Clients get valid content while compromised server is rebuilt

  12. Selecting the Optimal Response
      - System responses include increased agreement modes, restarting servers, or restarting proxies
      - These responses can all temporarily degrade system performance
      - Responses are invoked based on imperfect or delayed situational awareness
      - Approach:
        - Define objective functions for the system (percent of dropped requests and percent invalid replies)
        - Estimate degree to which system state optimizes the objective
        - Consider present state and likely evolution under the available responses. Response actions have a cost with respect to the objective.
        - Select response that best optimizes the objective over time
        - Elements of dynamic programming and Markov Decision Processes
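
One simplified way to picture the selection step is a one-step expected-loss comparison under an uncertain belief about the true system state. This is a myopic stand-in for the dynamic-programming/MDP treatment the slide mentions, not the authors' method; `select_response`, the state names, and every number in the tables are made-up illustrations.

```python
from typing import Dict


def select_response(
    belief: Dict[str, float],
    loss: Dict[str, Dict[str, float]],
) -> str:
    """Pick the response with the lowest expected loss.

    belief: probability of each true state, as inferred from sensor reports.
    loss:   loss[response][state], a combined penalty for dropped requests
            and invalid replies (hypothetical units).
    """
    def expected_loss(response: str) -> float:
        return sum(p * loss[response][state] for state, p in belief.items())

    return min(loss, key=expected_loss)


# Hypothetical numbers: every response costs something when the system is
# healthy, but doing nothing is the most costly choice under compromise.
LOSS = {
    "none":    {"ok": 0.0, "compromised": 20.0},
    "triplex": {"ok": 2.0, "compromised": 3.0},
    "reboot":  {"ok": 5.0, "compromised": 1.0},
}
```

With these made-up losses, a belief of 70% "ok" and 30% "compromised" favors the intermediate agreement regime over both inaction and a reboot, which mirrors the slide's point that responding and not responding both carry a cost.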

  13. Simulation Analysis of MDP
      - System is modeled as 14 Poisson processes
        - Processes include client requests, server replies, challenge/response requests (from proxy, to assess content validity), random failures, attacks (which make transitions between attack states), IDS false alarms, IDS detections, ...
        - Process rates are state dependent
        - Requests, attacks, failures always ON. Response process is ON if there are active requests. False alarms are always ON, detections are ON if there are active attacks in a detectable state.
      - System performance is based on true state. Tolerance response is based on sensor reports
        - Responses include various levels of content agreement as well as server reboot
      - Objective: Minimize dropped requests and requests with invalid replies (the latter come from a root-compromised app server)
        - All tolerance responses have a cost with respect to these objectives, but not responding can also cost

  14. Initial Results
      - Requests arrive at 1000/unit time. Total reply capacity is 4000/unit time. Attack rate is 50/unit time.
      - Redundancy is beneficial, but diminishing returns beyond 2 App Servers (total server capacity is 4000/unit time)

        App Servers         % Drop    % Invalid
        1@4000/time         3.62      2.78
        2@2000/time each    0.04      1.26
        3@1333/time each    0.16      0.59
        4@1000/time each    0.99      0.51

      - Frequent challenge/response requests improve system objectives

        λ Challenge         % Drop    % Invalid
        0                   0         26.21
        100                 0.43      1.89
        500                 0.99      0.51
        1000                0.31      0.33

  15. Summary
      - Developing an intrusion tolerant server architecture
      - Key feature is redundant capability provided by diverse implementation
      - A variety of IDS, symptom detectors, and on-line verifiers provide situational awareness
      - Stepped policy response enforces content agreement in suspicious situations

  16. (Backup) Poisson Processes
      - Poisson process: Event stream where inter-event times have an exponential distribution. The parameter is referred to as the process rate, typically denoted $\lambda$.
      - Mathematical properties of multiple simultaneous Poisson processes lead to tractable implementation:
        - Overall process is Poisson, with overall rate equal to the sum of the rates of the individual processes: $\lambda_{\text{overall}} = \sum_i \lambda_i$
        - Next event is of a given class with the following probability: $P(\text{next event is of class } i) = \lambda_i / \lambda_{\text{overall}}$
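
These two properties are what make an event-driven simulation cheap: draw one exponential waiting time at the summed rate, then pick the event class in proportion to its rate. A minimal sketch follows, assuming a hypothetical `next_event` helper and using three of the rates from the deck's rate table (request 1000, challenge/response 500, probe attack 50) purely as sample inputs.

```python
import random
from typing import Dict, Tuple


def next_event(rates: Dict[str, float], rng: random.Random) -> Tuple[float, str]:
    """Sample the next event of a superposition of Poisson processes.

    rates: current rate of each process; a rate of 0 means the process
           is switched off in the present state.
    Returns (waiting time until the next event, class of that event).
    """
    active = {name: rate for name, rate in rates.items() if rate > 0}
    total = sum(active.values())
    if total <= 0:
        raise ValueError("at least one process must be active")
    # Inter-event time of the merged process is exponential with the summed rate.
    wait = rng.expovariate(total)
    # The event belongs to class i with probability rate_i / total.
    kind = rng.choices(list(active), weights=list(active.values()))[0]
    return wait, kind


RATES = {"request": 1000.0, "challenge": 500.0, "probe_attack": 50.0}
```

Because the rates are passed in on every call, state-dependent rates can be handled by recomputing the dictionary after each event before sampling the next one.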

  17. Proxy Capabilities Simulated
      - IDS detect probes and root compromises, but occasionally fail to detect or are too slow, or generate false alerts
      - Asset distress monitor (blue sensor) can detect a “down” server by rate of failed requests
      - Proxy detects AHBL when request queue overflows
      - Challenge/Response: Periodically issues a request to all servers, for which the reply is known
        - Can detect compromised server if reply is invalid
        - Can detect a “down” server
        - These detections are typically much later than from an IDS
      - Available responses are:
        - Invoke a content agreement regime for client requests with 2..n servers
        - Reboot a server

  18. Processes and Rates
      Note: Time units not specified. These rates should be viewed as relative.

      Process               Rate per unit time    Comment
      Request               1000
      Reply                 4000 total            Active if there are active requests
      Challenge/Response    500                   Compete with client requests for server bandwidth
      Non-malicious crash   1
      Reboot                100                   So E(reboot time)=0.01
      Probe attack          50
      Probe_to_root         10
      Probe_to_crash        5
      Probe_to_term         5
      Root_to_crash         5
      Root_to_term          5                     Attack in this state compromises host
      Probe_detect          10
      Root detect           50                    Must detect before root_to_term
      False Detect          5
