AV-Meter: An Evaluation of Antivirus Scans and Labels


  1. AV-Meter: An Evaluation of Antivirus Scans and Labels Omar Alrawi (Qatar Computing Research Institute) Joint work with Aziz Mohaisen (VeriSign Labs)

  2. Overview • Introduction to the problem • Evaluation metrics • Dataset gathering and use • Measurements and findings • Implications • Conclusion and questions

  3. Example of labels • ZeroAccess is known by several vendor and community labels: - Zeroaccess, Zaccess, 0access, Sirefef, Reon

  4. Applications • Anti-virus (AV) independent labeling and inconsistency - Heuristics, generic labels, etc. • Machine learning (ground-truth training sets and verification for classification) • Incident response, mitigation strategies • "Elephant in the room" - Symantec finally admits it!

  5. Approach • Contribution - Provide metrics for evaluating AV detection and labeling systems - Use a highly accurate, manually vetted dataset for evaluation - Provide several directions to address the problem • Limitations - Cannot be used to benchmark AV engines - Cannot be generalized to a given malware family

  6. Metrics (4Cs) • Completeness (detection rate) • Correctness (correct label) • Consistency (agreement among AV engines) • Coverage

  7. Completeness (detection rate) • Given a set of malware, how many samples are detected by a given AV engine • Normalized by the dataset size; value in [0, 1] [Venn diagram: the detected set as a subset of the malware set]
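A minimal sketch of the completeness computation in Python, assuming each engine's scan results are available as a dict mapping sample hashes to labels (None for a miss); the function and variable names are illustrative, not from the paper's tooling:

```python
def completeness(scan_results: dict, malware_set: set) -> float:
    """Fraction of the malware set the engine detected; value in [0, 1]."""
    detected = {h for h in malware_set if scan_results.get(h) is not None}
    return len(detected) / len(malware_set)
```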

  8. Correctness • Score based on the correct label returned by a given AV engine; normalized by the set size [Venn diagram: the correct-label set within the detected set, within the malware set]
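Correctness can be sketched the same way, here assuming a manually vetted set of family aliases (e.g. {"zeroaccess", "zaccess", "sirefef"}) and simple case-insensitive substring matching; the matching rule is an assumption for illustration, not the paper's exact procedure:

```python
def correctness(scan_results: dict, malware_set: set, family_aliases: set) -> float:
    """Fraction of the malware set given a correct family label."""
    correct = {
        h for h in malware_set
        if scan_results.get(h)  # detected with some label
        and any(alias in scan_results[h].lower() for alias in family_aliases)
    }
    return len(correct) / len(malware_set)
```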

  9. Consistency • Agreement of labels (detections) among vendors - Completeness consistency - Correctness consistency - |S' ∩ S''| / |S' ∪ S''| (the Jaccard index) for both measures • Normalized by the size of the union of S' and S'' [Venn diagram: two engines' sets S' and S'' with their intersection S' ∩ S'']
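The agreement expression above is the Jaccard index; a minimal sketch, where the two arguments are the engines' detected (or correctly labeled) sample sets:

```python
def consistency(s1: set, s2: set) -> float:
    """Jaccard agreement |S' ∩ S''| / |S' ∪ S''|; value in [0, 1]."""
    union = s1 | s2
    return len(s1 & s2) / len(union) if union else 1.0  # two empty sets agree trivially
```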

  10. Coverage • Minimal number of AV engines required to detect a given complete set of malware • Normalized by the set size; value in [0, 1] [Venn diagram: engines AV1 through AV6 jointly covering the malware set]
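Finding the true minimal set of engines is an instance of set cover, which is NP-hard; the slide does not specify a procedure, so the sketch below uses the standard greedy approximation (repeatedly pick the engine that detects the most still-uncovered samples):

```python
def greedy_coverage(engine_detections: dict, malware_set: set) -> list:
    """Greedily pick engines until the malware set is covered (or no
    engine adds new detections); engine_detections maps name -> set."""
    remaining, chosen = set(malware_set), []
    while remaining:
        best = max(engine_detections,
                   key=lambda e: len(engine_detections[e] & remaining))
        gained = engine_detections[best] & remaining
        if not gained:
            break  # leftover samples missed by every engine
        chosen.append(best)
        remaining -= gained
    return chosen
```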

  11. Data • Eleven malware families - Zeus, ZeroAccess, Getkys, Lurid, DNSCalc, ShadyRat, N0ise, JKDDos, Ddoser, Darkness, Avzhan - Total of about 12k malware samples • Three types of malware - Trojans - DDoS - Targeted

  12. Data Vetting • Operational environment - Incident response - Collected over 1.5 years (2011-2013) • Malware labels - Industry, community, and malware-author-given labels (Zbot, Zaccess, cosmu, etc.) • Virus scans - VirusTotal - When a vendor appears multiple times, use its best result
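A sketch of collapsing duplicate vendor entries in a VirusTotal report, keeping the most informative result per vendor (any detection beats a miss); the "result" field name and the split-on-dot vendor normalization (e.g. TrendMicro.HC folded into TrendMicro) are assumptions for illustration:

```python
def best_per_vendor(vt_scans: dict) -> dict:
    """Map base vendor names to their best result; vt_scans maps an
    engine name to a per-engine dict with a "result" label (or None)."""
    merged = {}
    for engine, result in vt_scans.items():
        base = engine.split(".")[0]   # normalize duplicate vendor variants
        label = result.get("result")  # None when the engine missed
        if merged.get(base) is None:  # keep the first detection seen
            merged[base] = label
    return merged
```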

  13. Experiment - Completeness • More than half of the AV engines detect our pool of samples (a positive outcome!) • These samples contribute to the high detection rates seen across AV engines [Bar chart: number of scanners (0-40) detecting each family: zeus, zaccess, lurid, n0ise, oldcarp, jkddos, dnscalc, ddoser, darkness, bfox, avzhan]

  14. Experiment - Completeness • Average completeness is 59.1% • Maximum completeness is 99.7% • Completeness score computed for each AV engine over all 12k samples [Dot plot: per-engine completeness in [0, 1], from eTrust.Vet and eSafe at the low end to Kaspersky, BitDefender, and GData at the high end]

  15. Experiment - Completeness • Completeness versus number of labels - On average, each scanner has 139 unique labels per family, with a median of 69 • Completeness versus largest label - On average, the largest label accounts for 20% o Example: if the largest label counts 100, the average AV has 20 labels per family - AVs with smaller labels can be deceiving regarding correctness o Example: Norman has a generic label (ServStart) for the Avzhan family covering 96.7% of the sample set

  16. Experiment - Correctness • The highest correct labeling is for Jkddos (labeled jackydos or jukbot) by: - Symantec (86.8%), Microsoft (85.3%), PCTools (80.3%), with completeness close to 98% • Others - Blackenergy (64%) - Zaccess (38.6%) - Zbot (73.9%)

  17. Experiment - Correctness • Correctness for Zeus and JKDDoS - Incorrect labels (unique label): red - Behavior labels (Trojan, generic, etc.): blue - Static scan labels: green [Dot plots: per-engine correctness in [0, 1] for each engine, eTrust.Vet through GData]

  18. Experiment - Consistency • Consistency of detection - Pairwise comparison of sample detections across two vendors • On average, 50% agreement • 24 vendors have almost perfect consistency - AV information sharing is a potential explanation - AV vendor 1 may depend on vendor 2's detections (piggybacking) • Example for one family (Zeus) [Plot: pairwise consistency in [0, 1] across the antivirus scanners]

  19. Experiment - Coverage • Coverage for JKDDoS and Zeus o Completeness: reached with 3-10 AV engines, depending on the family o Correctness: never reached, even with all 48 engines o Highest score observed for correctness is 97.6% [Line plot: coverage (0.7-1.0) versus number of antivirus scanners (5-25), for completeness and correctness of Zeus and JKDDoS]

  20. Implications • Relying on AV labels to evaluate proposed approaches seems problematic at best - Machine learning, classification, and clustering • Rapid incident response based on AV labels - Applying the wrong remediation based on an incorrect label may cause long-lasting harm • Tracking and attribution of malicious code (law enforcement) - Tracking inaccurate indicators due to incorrect labels

  21. Conclusion • Proposed remedies - Data/indicator sharing - Label unification - Consolidation of existing labels - Defining a label by behavior, class, purpose, etc. • Future work - Methods and techniques to tolerate inconsistencies and incompleteness of labels/detection • Full paper - http://goo.gl/1xFv93

  22. Omar Alrawi oalrawi@qf.org.qa +974 4544 2955
