
Challenges in Experimenting with Botnet Detection Systems (PowerPoint presentation)



  1. Challenges in Experimenting with Botnet Detection Systems. Adam J. Aviv, Andreas Haeberlen, University of Pennsylvania. August 8th, 2011, CSET-2011

  2. Alice has developed a new botnet detector! What should the evaluation show? [diagram: Alice's Detector]

  3. Ideal: Alice deploys her detector live on her local network. Alice is provided with a list of hosts that are botnet-infected. Alice deploys her detector on various other networks (academic, residential, corporate, etc.). Alice records traces of each deployment, improves the detector in the lab, and the traces are readily available to other researchers.

  4. Realities: Production-ready deployment? Ground truth of botnet infections? Deployment on various networks? Record trace and replay experiment? Traces available to other researchers?

  5. Taking a Step Back

  6. Many Challenges: Multiple Administrative Domains; Focus on Academic Networks; Network Heterogeneity; Scale; Multimorbidity; Mixing Artifacts; Privacy; False Positives & Negatives; Controlled Environments; Artifact Overfitting; Botnet Overfitting; Repeatability; Comparability; Lack of Verification

  7. privacy

  8. We have to worry about privacy, but the botnet authors don't!

  9. Can we do better together?

  10. Discussion/Topics/Questions: Experimental Ideals vs. Realities (not just botnet detectors...); Raw Materials of the Experiment; Sharing and Obtaining Traces; Botnet and Background Traces; Can we do better via collaboration?

  11. Presentation Outline: Ideal vs. Reality; Experimental Challenges; Overlay Methodology Pitfalls; Obtaining Traces; Sharing Traces; What can be done?

  12. Alice has developed a new botnet detector! What should the evaluation show? [diagram: Alice's Detector]

  13. Ideal vs. Reality:
      Ideal: Alice deploys her detector live on her local network. Reality: Production-ready deployment?
      Ideal: Alice is provided with a list of hosts that are botnet-infected. Reality: Ground truth of botnet infections?
      Ideal: Alice deploys her detector on various other networks (academic, residential, corporate, etc.). Reality: Deployment on various networks?
      Ideal: Alice records traces of each deployment and improves the detector in the lab. Reality: Record trace and replay experiment?
      Ideal: Traces readily available to other researchers. Reality: Traces available to other researchers?

  14. Evaluation Realities: Performance; Realistic Settings; Network Heterogeneity; Multiple Administrative Domains; Modernity; Lack of Ground Truth; Comparability & Repeatability; Overfitting; Privacy

  15. Pitfalls (outline: Experimental Challenges; Overlay Methodology Pitfalls; Obtaining Traces; Sharing Traces; What can be done?)

  16. Overlay Methodology [diagram: hosts send traffic through the Internet; an Anonymizer produces the Network Trace]
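The overlay methodology on this slide can be sketched in a few lines: a separately collected botnet trace is rebased onto the background trace's clock and its bot addresses are remapped onto background hosts. This is a minimal illustration, not the authors' tooling; packets are modeled as `(timestamp, src_ip, label)` tuples standing in for real pcap records, and the `overlay` function and host choices are hypothetical.

```python
# Sketch of the overlay methodology: merge a botnet trace into a
# background trace. Packets are (timestamp, src_ip, label) tuples;
# in a real experiment these would be pcap records.

def overlay(background, botnet, bot_hosts):
    """Shift botnet timestamps onto the background clock and remap
    bot source IPs onto chosen background hosts (hypothetical helper)."""
    if not background or not botnet:
        return sorted(background + botnet)
    # Align the start of both traces.
    shift = background[0][0] - botnet[0][0]
    ip_map = dict(zip(sorted({p[1] for p in botnet}), bot_hosts))
    merged = list(background)
    for ts, src, label in botnet:
        merged.append((ts + shift, ip_map.get(src, src), label))
    merged.sort()
    return merged

background = [(100.0, "10.0.0.1", "bg"), (101.0, "10.0.0.2", "bg")]
botnet     = [(5.0, "6.6.6.6", "bot"), (7.0, "6.6.6.6", "bot")]
trace = overlay(background, botnet, bot_hosts=["10.0.0.2"])
# Bot traffic now appears to come from background host 10.0.0.2,
# interleaved with legitimate traffic; the labels preserve ground truth.
```

The labels are what make the methodology attractive: the merged trace can be replayed against a detector while the experimenter still knows exactly which packets were malicious.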

  17. Replay and Evaluate [diagram: the network trace is replayed through the detector, which reports "Detected 2 Bots!"]. The background trace is collected independently, and the trace is sensitive.

  18. Prevalence in the Literature:
      Overlay Methodology: [13] [49] [15] [36] [46] [47] [41] [23] [6] [7] [28] [25] [24] [14]
      Other Methodology: [20] [14] [45] [36] [11] [5]
      * See paper for references.

  19. Advantages of Overlay Methodology [diagram: bot hosts overlaid among background hosts]: Ground Truth

  20. Pitfalls (outline: Experimental Challenges; Overlay Methodology Pitfalls; Obtaining Traces; Sharing Traces; What can be done?)

  21. Obtaining Traces. Realism: merging of the botnet and background trace should be realistic.

  22. Collecting Botnet Traces [diagram]

  23. Realistic Embedding [diagram: a bot embedded on a residential ISP host sends SPAM; is the embedding plausible?]

  24. Mixing Artifacts [diagram: DHCP reassigns addresses among hosts, mixing botnet and background traffic]
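The DHCP mixing artifact illustrated here can be made concrete with a small sketch: under DHCP churn, one IP address maps to different physical hosts over time, so labeling a merged trace by IP alone attributes traffic to the wrong machine. The lease table and `host_for` helper below are hypothetical, purely to show the lease-aware lookup an experimenter would need.

```python
# Sketch of the DHCP mixing artifact: the same IP belongs to
# different hosts in different time windows, so IP-only ground
# truth mislabels traffic. Leases are (ip, host, start, end).

leases = [
    ("10.0.0.5", "laptop-A", 0, 100),
    ("10.0.0.5", "laptop-B", 100, 200),  # same IP, reassigned later
]

def host_for(ip, ts):
    """Resolve an IP at a given time to the host that actually held it."""
    for lease_ip, host, start, end in leases:
        if lease_ip == ip and start <= ts < end:
            return host
    return None

# A bot packet overlaid at t=150 from 10.0.0.5 belongs to laptop-B,
# not laptop-A: a static IP-to-host ground-truth list would blame
# the wrong machine across the lease boundary.
```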

  25. Multimorbidity [diagram: hosts carrying multiple infections at once]

  26. Obtaining Traces. Realism: merging of the botnet and background trace should be realistic. Representativeness: reflect diversity in network scenarios.

  27. Focus on Academic Networks [diagram: other network types, labeled Corporate, Business, and State University]

  28. Prevalence in the Literature:
      Overlay Methodology, academic traces: [13] [49] [15] [36] [46] [47] [41] [23] [6] [7]
      Overlay Methodology, at least one other trace: [28] [25] [24] [14]
      Other Methodology, academic traces: [20] [14] [45] [36] [11] [5]
      * See paper for references.

  29. Scale [figure: a field of hundreds of host icons, illustrating the scale of real networks]

  30. Obtaining Traces. Realism: merging of the botnet and background trace should be realistic. Representativeness: reflect diversity in network scenarios. Performance: false positives and negatives.
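The performance criterion above is where the overlay's ground truth pays off: with the set of overlaid bot hosts known, detector output can be scored directly for false positives and false negatives. The following is an illustrative sketch; the host names and `score` helper are made up for the example.

```python
# Sketch: scoring a detector against the overlay's ground truth.
# 'bots' is known exactly because the botnet trace was overlaid.

def score(detected, ground_truth_bots, all_hosts):
    tp = detected & ground_truth_bots          # correctly flagged bots
    fp = detected - ground_truth_bots          # clean hosts flagged
    fn = ground_truth_bots - detected          # bots that were missed
    tn = all_hosts - detected - ground_truth_bots
    return {"true_pos": len(tp), "false_pos": len(fp),
            "false_neg": len(fn), "true_neg": len(tn)}

hosts = {"h1", "h2", "h3", "h4", "h5"}
bots = {"h2", "h4"}            # known from the overlaid botnet trace
detected = {"h2", "h3"}        # what the detector flagged
result = score(detected, bots, hosts)
```

Without ground truth, none of these four counts can be computed reliably, which is exactly the "lack of verification" pitfall the next slides discuss.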

  31. Lack of Verification [diagram]

  32. Example From the Literature (TaMD): "We suspect that the reason not every bot in the botnet was detected is due to the randomness in our choice of selected internal hosts to which the malware traffic was assigned, such that a selected internal host that was also contacting other suspicious subnets (not relevant to the botnet) is likely to bias the dimension reduction and clustering algorithm."

  33. privacy

  34. Sharing Traces [diagram]: Is the experiment independently repeatable? Can we do an apples-to-apples comparison?

  35. What can be done? (outline: Experimental Challenges; Overlay Methodology Pitfalls; Obtaining Traces; Sharing Traces; What can be done?)

  36. Observations: Many of these challenges stem from difficulties in sharing and obtaining realistic data sets. This is similar to the problems faced by researchers studying large-scale distributed systems -> PlanetLab.

  37. Can we do better together? A PlanetLab for Botnet Detection?

  38. Strawman: Distributed Evaluation.
      PlanetLab-like nodes on participating networks; network traces cannot be communicated outside of the network.
      Researchers deploy detector code on the nodes; reports are reviewed and declassified by sys-admins; a researcher can test and debug on a local node.
      Incentives: sys-admins gain access to bleeding-edge detectors, for FREE! Researchers gain insight into the usefulness of reports, or "ground truth".
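The strawman's data flow can be sketched in a few lines. Everything here is hypothetical scaffolding, not a proposed API: the point is only that researcher code runs inside the network boundary, and that raw results pass through an admin-controlled declassification step before anything leaves.

```python
# Minimal sketch of the strawman distributed-evaluation flow.
# Raw traces never leave the network; only admin-approved report
# fields are released to the researcher.

def run_on_node(detector, local_trace):
    """Run researcher-supplied detector code inside the network."""
    return detector(local_trace)

def declassify(report, approved_fields):
    """Sys-admin review step: release only whitelisted fields."""
    return {k: v for k, v in report.items() if k in approved_fields}

# A toy detector producing both a count and sensitive host addresses.
detector = lambda trace: {"num_bots": 2, "bot_ips": ["10.0.0.7", "10.0.0.9"]}

report = run_on_node(detector, local_trace=None)
public = declassify(report, approved_fields={"num_bots"})
# The IPs stay inside the network; the researcher sees only the count.
```

Even this toy version shows why the deployment challenges on the next slides center on privacy and accountability: the declassification step is manual, and its correctness is what the whole scheme rests on.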

  39. Address Challenges: Performance; Realistic Settings; Network Heterogeneity; Lack of Ground Truth; Multiple Administrative Domains; Modernity; Comparability & Repeatability; Overfitting; Privacy

  40. Huge Deployment Challenges: Privacy; Accountability

  41. Conclusions: Taking a step back; Ideal vs. Reality, and the ideal is hard; Overlay Methodology and its pitfalls; literature review; Privacy!; sharing and obtaining realistic traces; Can we do better together? A PlanetLab for botnet detectors?

  42. Backup
