Learning Rules for Anomaly Detection (LERAD) of Hostile Network Traffic

  1. Learning Rules for Anomaly Detection (LERAD) of Hostile Network Traffic
     Matt Mahoney

  2. Overview
     - Prior Work in Network Anomaly Detection
     - The 1999 DARPA Intrusion Detection Evaluation
     - Packet Header Anomaly Detection (PHAD)
     - Application Layer Anomaly Detection (ALAD)
     - Learning Rules for Anomaly Detection (LERAD)

  3. An anomaly detector models “normal” behavior. A deviation may indicate an attack.
     Host-based: models sequences of system calls made by a server or operating system program.
     - N-grams (Forrest, 1996)
     - State machine, neural networks (RST, Ghosh 1999)
     Network-based: usually models IP addresses and ports.
     - User-programmed rules: firewalls, SNORT, Bro
     - Learned rules: ADAM, NIDES, SPADE

  4. The 1999 DARPA Evaluation Data Set
     [Figure: test network diagram. Labels: SunOS, Solaris, Linux, Windows NT, Sniffer, Ethernet, Cisco Router, Internet, Attacks]
     - Weeks 1 and 3: training: no attacks
     - Week 2: training: 138 labeled instances of 32 attacks
     - Weeks 4 and 5: test: 201 unlabeled instances (190 actual) of 53 attacks
       - No data for 12 attacks (week 4, day 2)
       - One mislabeled attack (apache2)

  5. 1999 DARPA IDS Attacks
     Abuse of Legitimate Bug Exploit Configuration Error Service Exploit Probe: illegalsniffer, Probe: queso Probe: mscan, ntinfoscan, ipsweep, ls, ntfsdos, satan portsweep DOS: apache2, back, dosnuke, land, pod, R2L: dict, ftpwrite, guest, DOS: arppoison, selfping, syslogd, teardrop snmpget, xsnoop mailbomb, neptune, processtable, resetscan, R2L: framespoofer, imap, smurf, tcpreset, udpstorm, named, ncftp, phf, warezclient, warezmaster sendmail R2L: httptunnel, netbus, U2R: anypw, casesen, netcat, ppmacro, sshtrojan, eject, fdformat, ffbconfig, xlock loadmodule, perl, ps, sechole, sqlattack, xterm, U2R/Data: secret yaga

  6. 1999 Evaluation Results - Best 4 of 18 Systems (Lippmann, 2000)
     - An IDS must identify the IP address of the attacker or target, and the time within 60 seconds.
     - Evaluated at a threshold of 100 false alarms (10 per day).

       System      Detections
       Expert 1    85/169 (50%)
       Expert 2    81/173 (47%)
       Dmine       41/102 (40%)
       Forensics   15/27 (55%)

     - Blind (developers had no access to test data)
     - Evaluated by DARPA
     - May use both signature and anomaly detection
     - May use both host and network based methods
     - May restrict attacks by category, data type, or target

  7. Packet Header Anomaly Detection (PHAD)
     - Examines Ethernet, IP, TCP, UDP, ICMP protocols
     - 34 learned rules (trained on week 3)
       - TOS = 0, 8, 16, or 192
       - IP source = 12.2.169.104-12.20.180.101, ...
       - TCP flags = x2, x4, x10, ...
     - Score = tn/r summed over the anomalous fields of a packet
       - t = time since last anomaly (values never seen in training)
       - n = number of training packets
       - r = number of allowed values
     - Detects 72/189 attacks (54 without TTL)
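The tn/r score can be illustrated with a small sketch. It is not PHAD itself: the slide shows ranges of allowed values, while this sketch assumes a plain set per field, and it assumes that a novel value seen during training also resets the "time since last anomaly" clock. The names `FieldModel` and `packet_score` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class FieldModel:
    """Allowed values for one header field, learned from attack-free traffic."""
    allowed: set = field(default_factory=set)
    n: int = 0                  # training packets seen for this field
    last_anomaly: float = 0.0   # time of the most recent novel value

    def train(self, value, now):
        self.n += 1
        if value not in self.allowed:
            # Assumption: a novel value during training resets the anomaly clock.
            self.allowed.add(value)
            self.last_anomaly = now

    def score(self, value, now):
        """Return t*n/r for a value never seen in training, else 0."""
        if value in self.allowed or not self.allowed:
            return 0.0
        t = now - self.last_anomaly
        self.last_anomaly = now
        return t * self.n / len(self.allowed)


def packet_score(models, packet, now):
    """Sum the field scores over one packet (dict: field name -> value)."""
    return sum(models[f].score(v, now) for f, v in packet.items() if f in models)


# Train on six packets, one per second, then score an unseen TOS value.
tos = FieldModel()
for second, value in enumerate([0, 8, 16, 192, 0, 0]):
    tos.train(value, float(second))
print(packet_score({"TOS": tos}, {"TOS": 1}, now=13.0))  # t=10, n=6, r=4 -> 15.0
```

A field that takes few values over many packets (large n, small r) and has been quiet for a long time (large t) yields the highest score when it finally shows a new value.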

  8. Application Layer Anomaly Detection (ALAD)
     - Models incoming server TCP connections
     - Conditional rules (5 forms, selected ad-hoc)
       - If dest. port = 80 then keyword = GET, Host, Accept...
       - If TCP flags = S/AF/A then dest. port = 23, 25, 80
       - If source IP = x.x.x.x then dest. IP = y.y.y.y, ...
       - If source IP = x.x.x.x then dest. IP/port = y.y.y.y:p, ...
       - Dest. IP/port = x.x.x.x:p, x.x.x.x:p, ...
     - Score = tn/r
     - Detects 59/189 attacks (70 with PHAD w/o TTL)

  9. Learning Rules for Anomaly Detection (LERAD)
     - Like ALAD, but rule forms are derived from a sample of training data.
     - If A_1 = V_1 and A_2 = V_2 and ... then A_m = V_1, V_2, ... or V_r
     - 23 attributes (A_i)
       - Date, time
       - Source, destination IP address (4 bytes), and ports
       - TCP flags (first, next to last, last)
       - Duration in seconds
       - Length in bytes
       - First 8 words of application data
     - Score = tn/r
     - Detects 112-118/190 attacks (average 114.8, or 60%)
     - No improvement when merged with PHAD or ALAD.

  10. Rule Learning Algorithm
      1. Select random sample of 20-100 tuples (TCP connections) of training data.
      2. Generate 1000-5000 rules satisfying randomly selected pairs of tuples.
      3. Sort rules by decreasing n/r on sample.
      4. Coverage test: remove rules that predict no additional values in sample (leaving 80-120 rules).
      5. Train on full input (35,455 tuples in week 3).
      6. Remove rules that generate anomalies in last 10% of training (leaving 55-85 rules).
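Steps 5 and 6 can be sketched as follows. This is one possible reading, not the author's code: it assumes the allowed values are collected from the first 90% of the training tuples and that a rule is dropped if a matching tuple in the last 10% shows a value outside that set. The function names and the tuple-as-dict representation are hypothetical.

```python
def satisfies(antecedent, tup):
    """True if the tuple meets every condition of the antecedent."""
    return all(tup.get(attr) == value for attr, value in antecedent.items())


def train_and_prune(rules, tuples, holdout=0.1):
    """rules: list of (antecedent dict, consequent attribute) pairs.

    Returns (antecedent, consequent, allowed values, n) for each surviving rule.
    """
    cut = len(tuples) - max(1, int(len(tuples) * holdout))
    head, tail = tuples[:cut], tuples[cut:]
    kept = []
    for antecedent, consequent in rules:
        allowed = {t[consequent] for t in head if satisfies(antecedent, t)}
        tail_hits = [t for t in tail if satisfies(antecedent, t)]
        if any(t[consequent] not in allowed for t in tail_hits):
            continue  # a new value this late in training suggests false alarms
        n = sum(satisfies(antecedent, t) for t in tuples)
        kept.append((antecedent, consequent, allowed, n))
    return kept


# Ten toy connections: the source address changes only in the last one.
connections = [{"DP": 80, "W1": "GET", "SA3": 172}] * 9 + [{"DP": 80, "W1": "GET", "SA3": 196}]
candidates = [({"DP": 80}, "W1"), ({}, "SA3")]
for rule in train_and_prune(candidates, connections):
    print(rule)
```

In the toy data the unconditional rule on SA3 is removed because the last tuple introduces a new source address, while the rule on W1 survives with n = 10 and r = 1.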

  11. Rule Generation
      1. Pick random pair of tuples (from sample or full input).
      2. Select up to 4 matching attributes in random order.
      3. First match is the consequent.
      4. Subsequent matches are conditions in the antecedent.

        A B C D E
        1 2 3 4 5
        1 2 3 4 6

        C = 3
        If A = 1 then C = 3
        If D = 4 and A = 1 then C = 3
        If B = 2 and D = 4 and A = 1 then C = 3
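The four steps above can be sketched in a few lines. The sketch assumes, based on the slide's example listing four rules for one pair, that each added condition yields a further candidate rule. The names `generate_rules` and `show`, and the representation of a rule as (antecedent, consequent, allowed values), are hypothetical.

```python
import random


def generate_rules(t1, t2, max_attrs=4, rng=random):
    """Propose rules from two tuples (dicts mapping attribute -> value)."""
    # Attributes on which the two tuples agree.
    matching = [a for a in t1 if a in t2 and t1[a] == t2[a]]
    rng.shuffle(matching)
    chosen = matching[:max_attrs]
    if not chosen:
        return []
    # The first match is the consequent, the rest become conditions.
    consequent, conditions = chosen[0], chosen[1:]
    rules = []
    for k in range(len(conditions) + 1):
        antecedent = {a: t1[a] for a in conditions[:k]}
        rules.append((antecedent, consequent, {t1[consequent]}))
    return rules


def show(rule):
    """Render a rule in the slide's 'If ... then ...' notation."""
    antecedent, consequent, values = rule
    cond = " and ".join(f"{a} = {v}" for a, v in antecedent.items())
    head = f"If {cond} then " if cond else ""
    return head + f"{consequent} = {' or '.join(map(str, sorted(values)))}"


a = dict(A=1, B=2, C=3, D=4, E=5)
b = dict(A=1, B=2, C=3, D=4, E=6)
for rule in generate_rules(a, b, rng=random.Random(0)):
    print(show(rule))
```

Attribute E never appears in a rule because the two tuples disagree on it. Which of A, B, C, D becomes the consequent depends on the random order.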

  12. Coverage Test
      1. For each rule by decreasing n/r on the sample
      2. Mark each unmarked sample value predicted.
      3. If no values can be marked, remove the rule.

        A       B   C
        1 (R1)  2   4 (R3)
        1 (R1)  2   5 (R3)
        1 (R1)  3   5

        R1. A = 1 (n/r = 3/1)
        R2. If B = 2 then A = 1 (n/r = 2/1) removed
        R3. If B = 2 then C = 4 or 5 (n/r = 2/2)
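A minimal sketch of the coverage test, run on the three-tuple example from the slide. It assumes a rule is a triple (antecedent dict, consequent attribute, set of allowed values) and that n counts the sample tuples satisfying the antecedent. The function names are hypothetical.

```python
def matches(rule, tup):
    """True if the tuple satisfies every condition in the rule's antecedent."""
    antecedent, _, _ = rule
    return all(tup.get(a) == v for a, v in antecedent.items())


def n_over_r(rule, sample):
    n = sum(matches(rule, t) for t in sample)
    return n / len(rule[2])


def coverage_test(rules, sample):
    """Keep only rules that mark at least one still-unmarked sample value."""
    marked = set()  # (tuple index, attribute) pairs already predicted
    kept = []
    for rule in sorted(rules, key=lambda r: n_over_r(r, sample), reverse=True):
        _, consequent, allowed = rule
        newly = {
            (i, consequent)
            for i, t in enumerate(sample)
            if matches(rule, t) and t[consequent] in allowed
            and (i, consequent) not in marked
        }
        if newly:
            marked |= newly
            kept.append(rule)
    return kept


sample = [dict(A=1, B=2, C=4), dict(A=1, B=2, C=5), dict(A=1, B=3, C=5)]
r1 = ({}, "A", {1})            # A = 1                    (n/r = 3/1)
r2 = ({"B": 2}, "A", {1})      # If B = 2 then A = 1      (n/r = 2/1)
r3 = ({"B": 2}, "C", {4, 5})   # If B = 2 then C = 4 or 5 (n/r = 2/2)
print(coverage_test([r1, r2, r3], sample) == [r1, r3])  # True: R2 is removed
```

R2 is dropped because every value of A it predicts was already marked by R1, which ranks higher by n/r.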

  13. LERAD Sample Input

      Date       Time     DA1 DA0 DP SA3 SA2 SA1 SA0 SP   DUR  F1 F2  F3  Len W1      W2
      03/15/1999 08:00:57 112 050 25 196 037 037 158 1111 0    .S .AP .AF 857 .^@EHLO .ju
      03/15/1999 08:00:57 113 050 25 196 037 037 158 1113 0    .S .AP .AF 880 .^@EHLO .ju
      03/15/1999 08:01:13 114 050 80 172 016 016 100 2971 4489 .S .AP .AP 872 .^@GET  .
      03/15/1999 08:01:13 114 050 80 172 016 016 100 2972 5693 .S .AP .AF 595 .^@GET  .
      03/15/1999 08:01:13 114 050 80 172 016 016 100 2973 12   .S .AP .AF 318 .^@GET  ./w
      03/15/1999 08:01:13 114 050 80 172 016 016 100 2974 118  .S .AP .AP 610 .^@GET  ./

  14. Sample Rules (sorted by n/r)

      1  28882/2   if F2=.AP then F1 = .S .AS
      2  14236/1   if DA0=100 then DA1 = 112
      3  12854/1   if W3=.HTTP/1.0^M^ then W1 = .^@GET
      4  12854/1   if W3=.HTTP/1.0^M^ then DP = 80
      5  35455/3   if then DA1 = 113 112 114
      6  34602/3   if F3=.AF then F1 = .S .AF .AS
      7  10857/1   if SA3=172 then SA2 = 016
      8  10857/1   if SA2=016 then SA1 = 016
      9  10857/1   if SA2=016 then SA3 = 172
      10 10642/1   if F1=.S F2=.AP W1=.^@EHLO then DP = 25
      11 9914/1    if W3=.HELO then W7 = .RCPT
      12 9914/1    if W5=.MAIL then W3 = .HELO
      13 9914/1    if W3=.HELO then W1 = .^@EHLO
      14 28882/3   if F2=.AP then F3 = .AP .AF .R
      15 35455/4   if then F1 = .S .AF .AS .R
      16 34602/4   if F3=.AF then F2 = .S .AP . .AS
      17 7656/1    if W7=. then W8 = .
      18 7645/1    if W5=. then W6 = .
      19 7645/1    if W4=. then W7 = .
      20 7596/1    if W3=. then W4 = .
      21 7566/1    if DA1=114 W3=.HTTP/1.0^M^ then DA0 = 050
      22 29549/4   if F1=.S then F2 = .S .AP . .A
      23 35455/5   if then F2 = .S .AP . .AS .A
      24 35455/5   if then F3 = .S .AP .AF .AS .R
      25 12867/2   if W1=.^@GET then W3 = .HTTP/1.0^M^ .align=
      26 12854/2   if W3=.HTTP/1.0^M^ then DA0 = 050 100
      27 10105/2   if W7=.RCPT then W5 = .MAIL .RCPT
      28 35455/8   if then SA3 = 196 172 197 194 195 135 192 152
      29 12838/3   if DP=25 then W1 = .^@EHLO . .^@HELO
      30 3992/1    if W3=.HTTP/1.0^M^ W7=.text/htm then W8 = .text/pla
      31 7647/2    if W6=. then W5 = . .QUIT^M^
      32 7279/2    if SA0=050 then SA1 = 016 073
      33 3521/1    if DA1=112 W3=.HTTP/1.0^M^ W6=.User-Age then W7 = .Mozilla/
      34 6824/2    if W6=.User-Age then W4 = .Connection: .Referer:
      35 6823/2    if F2=.AP W6=.User-Age then W8 = .[en] .(X11;
      36 18807/6   if DA1=112 then DA0 = 050 100 194 207 149 020
      37 2998/1    if SA1=037 then SA0 = 158
      38 29549/10  if F1=.S then DP = 113 25 23 80 135 21 79 22 515 139
      39 35455/12  if then DA0 = 105 050 204 084 168 148 169 100 194 207 149 020
      40 34602/12  if F3=.AF then DP = 113 25 23 80 21 20 79 22 1022 515 1023 139
      41 35455/13  if then SA2 = 037 016 182 168 169 115 027 008 227 073 007 218 013
      42 35455/13  if then DP = 113 25 23 80 135 21 20 79 22 1022 515 1023 139
      43 35455/13  if then SA1 = 037 016 182 168 169 115 027 008 227 073 007 218 013
      44 2695/1    if SA1=007 then SA2 = 007
      45 2695/1    if SA3=194 SA2=007 then SA0 = 153
      46 5223/2    if SA3=194 then SA0 = 021 153
      47 7656/3    if W7=. then W3 = . .PASS .6667^M^
      48 6852/3    if W4=.Referer: then W5 = .http://w .http://m .http://h
      49 2083/1    if SA1=013 then SA0 = 191
      50 1888/1    if SA1=227 F1=.S then SA0 = 189
      51 12885/7   if DP=80 then W4 = .HTTP/1.0^M^ .Connection: .Referer: . .Host:
      52
      53 35455/24  if then SA0 = 105 158 050 204 084 182 233 168 148 169 100 194 108
      54 12854/10  if W3=.HTTP/1.0^M^ then W8 = .User-Age .[en] .text/pla .(X11; .I;
      55 7109/6    if DA1=112 SA2=016 F3=.AF then DA0 = 050 100 194 207 149 020
      56 12867/13  if W1=.^@GET then W6 = .User-Age .[en] .Connecti .Accept: .(X11;
      57 10857/12  if SA2=016 then DA0 = 105 050 204 084 168 148 169 100 194 207 149
      58 1805/2    if F1=.S W6=." then W2 = .^C .^@^@^@
      59 1798/2    if DP=23 F3=.AF then W4 = .^_ .#
      60 5827/9    if DP=20 W5=. then DUR = 0 1 4 6 7 2 3 5 36
      61 7656/13   if W8=. then W2 = . ., .anonhmous^M^ .anonymMus^M^ .anonyxous^M^
      62 7647/32   if W6=. then DUR = 0 23 1 12 108 4 30 6 9 21 24 7 14 22 2 3 11 15 27
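Each rule line in the listing reads as: rank, n/r, the conditions after "if", and the consequent attribute with its allowed values after "then". The sketch below parses one such line. The regular expression and the dictionary keys are assumptions made for illustration, not a format defined by the slides.

```python
import re

# rank  n/r  if <conditions> then <attribute> = <values>
RULE = re.compile(r"^\s*(\d+)\s+(\d+)/(\d+)\s+if\s+(.*?)\s*then\s+(\S+)\s*=\s*(.*)$")


def parse_rule(line):
    """Split one line of the rule listing into its parts."""
    m = RULE.match(line)
    if m is None:
        raise ValueError(f"not a rule line: {line!r}")
    rank, n, r, conds, consequent, values = m.groups()
    # Conditions are whitespace-separated ATTR=VALUE pairs; the list may be empty.
    antecedent = dict(c.split("=", 1) for c in conds.split())
    return {
        "rank": int(rank),
        "n": int(n),
        "r": int(r),
        "if": antecedent,
        "then": consequent,
        "values": values.split(),
    }


rule = parse_rule("10 10642/1 if F1=.S F2=.AP W1=.^@EHLO then DP = 25")
print(rule["if"], rule["then"], rule["values"], rule["n"] / rule["r"])
```

Some lines list fewer values than r (rule 53 shows 13 values for r = 24), so `len(values)` should not be taken as r.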
