A Human Factors Approach to Spam Filtering

  1. A Human Factors Approach to Spam Filtering (short title: Spam & HCI)
     Robert Beverly, MIT CSAIL, rbeverly@csail.mit.edu
     July 27, 2009
     Conference on Email and Anti-Spam (CEAS) 2009
     Outline: The Problem, A Human Factors Approach, SpamGUI, Parting Thoughts, Summary

  2. The Problem
     - No spam classifier is perfect
     - Okay in other ML fields, e.g. handwriting recognition, search engines, music recommendation, etc.
     - But with spam:
       - Adaptable, adversarial inputs
       - Complexion of dataset severely unbalanced
       - High cost of false positives
       - Getting from 99.9% to 99.999%
     - Fighting a losing battle?

  5. The Problem
     [Figure: cumulative fraction of emails (y-axis, 0 to 1) versus SpamAssassin score (x-axis, -15 to 45), with separate curves for spam and ham]
     - TREC 2007 dataset (~75k messages)
     - Classified with SpamAssassin
     - How close are mails to the threshold (5)?

  6. The Problem
     [Figure: cumulative fraction of emails (y-axis, 0 to 1) versus SpamAssassin score (x-axis, -15 to 45), with separate curves for spam and ham]
     - How close are mails to the threshold (5)?
     - 99.72% of ham below threshold... good?

  7. The Problem
     [Figure: complementary cumulative fraction of emails (y-axis, log scale, 1 down to 1e-05) versus SpamAssassin score (x-axis, -15 to 60), with separate curves for spam and ham]
     - No threshold gives zero FP/FN (well-known compromise)
     - Deluge of spam implies this compromise is flawed
     - 0.28% of ham above threshold → 71 false positives
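The threshold trade-off on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the author's tooling: the function name `confusion_at` and the toy scores are hypothetical, and the rule "score >= threshold means spam" is an assumption modeled on the threshold of 5 quoted on the slides. The final lines only check the slide's arithmetic: if 0.28% of ham corresponds to 71 messages, the ham set holds roughly 71 / 0.0028 messages.

```python
def confusion_at(scored, threshold):
    """Count false positives and false negatives at a score threshold.

    scored: list of (score, is_spam) pairs.
    Assumption: a message is flagged as spam when score >= threshold.
    """
    fp = sum(1 for score, is_spam in scored if not is_spam and score >= threshold)
    fn = sum(1 for score, is_spam in scored if is_spam and score < threshold)
    return fp, fn


# Hypothetical toy scores, NOT taken from the TREC 2007 data.
toy = [(-2.1, False), (0.4, False), (5.6, False),  # one ham scores above 5
       (3.9, True), (7.2, True), (18.0, True)]     # one spam scores below 5

# Moving the threshold trades one kind of error for the other.
for t in (0, 5, 10):
    fp, fn = confusion_at(toy, t)
    print(f"threshold={t:>2}: false positives={fp}, false negatives={fn}")

# Slide arithmetic: 0.28% of ham above the threshold corresponds to 71 messages,
# which implies a ham set of about 71 / 0.0028 messages.
implied_ham = round(71 / 0.0028)
print(implied_ham)
```

On the toy data, a threshold of 10 removes the false positive but lets two spam messages through, while a threshold of 0 catches all spam at the cost of two false positives; no setting yields zero of both.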

  8. A Human Factors Approach
     Approaching from a different direction...
     The User Agent:
     - Users interact with their email via a Mail User Agent (MUA), e.g. Outlook, Hotmail, etc.
     - Note that besides going graphical, MUAs have changed little over past ~30 years
     - Better incorporate human factors into a MUA

  9. A Human Factors Approach
     Human Factors Approach – Potential:
     1. Make email more useful to the user
        - How are emails presented?
     2. Humans ultimate arbiter of any mail’s importance
        - How to better include, scale their decision process?
     3. Remove burden of perfect classification from classifier
        - “good enough” filtering
     4. Eliminate false positives
        - Innovate in the user agent

  11. SpamGUI
      Position:
      - Separate classification from filtering
      The inbox:
      - Rethink the inbox: use a single mail folder, don’t attempt to filter into spam, ham “folders”
      - Use color, size, shade, order, and other human factors to present the inbox
      - Presentation of email a function of importance
      - Proof-of-concept: SpamGUI Thunderbird extension...
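One way to read "presentation of email a function of importance" is as a mapping from classifier score to display attributes within a single folder. The Python sketch below is a hypothetical illustration, not the SpamGUI extension's code: the names `Message`, `display_style`, and `render_inbox`, the gray and font-size ranges, the `span` parameter, and the ascending sort order are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Message:
    subject: str
    score: float  # classifier score; higher means more spam-like


def display_style(score, threshold=5.0, span=10.0):
    """Map a spam score to hypothetical presentation attributes.

    Returns (gray, font_pt): gray runs from 0 (black) to 200 (light gray),
    font_pt runs from 12 down to 8 as the score climbs past the threshold.
    """
    # Clamp how far the score sits above the threshold into [0, 1].
    excess = min(max((score - threshold) / span, 0.0), 1.0)
    gray = int(200 * excess)
    font_pt = 12 - int(4 * excess)
    return gray, font_pt


def render_inbox(messages):
    """Single folder: order by score (assumed: least spam-like first); nothing is hidden."""
    return [(m.subject, *display_style(m.score))
            for m in sorted(messages, key=lambda m: m.score)]


# Hypothetical inbox; the scores are made up for illustration.
inbox = [Message("Cheap pills", 22.0), Message("Lunch?", -1.5),
         Message("Invoice attached", 10.0)]
for row in render_inbox(inbox):
    print(row)
```

Every message stays in the one list; a high score only fades and shrinks its row, so a misclassified message remains visible rather than being moved to a separate folder.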

  13. SpamGUI
      [no text extracted from this slide]

  14. SpamGUI
      A Few Observations:
      - A demarcation “line” naturally emerges to the eye, above which user (or UI) can ignore messages
      - User part of filtering process, but only burdened by making spam decisions on a small number of emails around line
      - Easy to scan for formerly false positive emails on the threshold border
      Lots of work remains:
      - No user studies performed yet
      - Experimenting with several approaches
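The observation that the user only decides on a small number of emails around the line can be illustrated with a short Python sketch. It is an assumption-laden example rather than anything from SpamGUI itself: the function name `borderline`, the `margin` parameter, and the sample scores are hypothetical.

```python
def borderline(scored_subjects, threshold=5.0, margin=2.0):
    """Return subjects whose score lies within `margin` of the threshold.

    scored_subjects: iterable of (subject, score) pairs.
    Both the margin and the threshold default are illustrative assumptions.
    """
    return [subject for subject, score in scored_subjects
            if abs(score - threshold) <= margin]


# Hypothetical inbox; the scores are made up for illustration.
inbox = [("Lunch?", -1.5), ("Invoice attached", 6.0),
         ("Conference CFP", 4.2), ("Cheap pills", 22.0)]
print(borderline(inbox))
```

Only the two messages scoring near 5 need a human decision here; the clear ham and the clear spam sit far from the line on either side.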

  15. Parting Thoughts
      More generally:
      - Users inundated with information, how can UI help?
      - Spam is just one class of very unimportant information
      - Lots of unused input “features;” systems designers should use them
      - Learn best way to present email to user
      - Recognize that innovation is possible in the user agent

  16. Summary
      - We’re fighting a losing battle trying to make spam classifiers perfect
      - Separate act of classification from filtering
      - As a community, think more about how HCI / human factors methods can help
      Thanks! http://www.rbeverly.net/spamgui/
      Questions?
