field studies and ecological validity

Field studies and ecological validity, Michelle Mazurek



  1. Field studies and ecological validity
     Michelle Mazurek

  2. Today’s class
     • Field studies (pluses and minuses)
     • Ecological validity
       – Ethics
     • Crowdsourced studies (MTurk and friends)
     • Project pitches: Next week

  3. FIELD STUDIES

  4. Why (not) a field study?
     • Better ecological validity
       – Validate a lab study result
     • Because you can’t get the data any other way
     • Logistically difficult
     • Limited piloting / not easy to adjust
       – One shot at your participant pool
     • Expensive (money and time)
     Plan extremely carefully!

  5. PhishGuru in the real world
     • Anti-phishing training delivered when users follow a phishing link
     • Training, phishing, and legitimate emails delivered to 300 employees in a Portuguese company

  6. PhishGuru in the real world
     • Was a field study necessary here? Why?
       – How could it have been designed differently?
     • What logistical problems were encountered?
       – Design choices the authors later regretted?
       – How did they threaten the study’s outcome?

  7. ECOLOGICAL VALIDITY
     Case study: Measuring password strength

  8. Passwords research is everywhere
     • Comm ACM, 1979
     • Computers and Security, 1989
     • ACIS 2004 (Campbell and Bryant)
     • CCS 2005 (Narayanan and Shmatikov)
     • WWW 2007 (Florencio and Herley)
     • CCS 2010 (Weir et al.)
     • CHI 2011 (Komanduri et al.)
     • IEEE S&P 2012 (Bonneau)
     • NDSS 2012 (Castelluccia et al.)

  9. … but good data is hard to find
     • Small data sets
     • Experimental rather than field data
     • Self-reported surveys
     • Leaked data of questionable validity
     • Minimal-value accounts
     • No access to plaintext passwords
     Are the results generalizable?

  10. Fahl et al.: Password study validity
      • Goal: Compare lab study, online study, real passwords
      • Methods:
        – Several thousand passwords (plaintext, anonymized)
        – Invite same pool to online or lab study
        – Security priming, or not
        – Manual analysis for similarity
      • 583 online, 63 lab participants

  11. Results: Validity
      %                 Online   Lab   Priming   Non-priming   Total
      Highly valid          46    49        47            44      46
      Somewhat valid        23    32        24            24      24
      Invalid               31    18        29            32      30
      • Overall, experimental data can be useful
        – Self-reporting of realistic behavior can help
      • No significant difference due to priming
      • Lab slightly but significantly better than online

  12. Critique the study design
      • What was measured?
        – Comments about the manual analysis approach?
      • Priming vs. non-priming
        – How are the instructions different in the 2 cases?
        – Would you have given different instructions?
        – Are there other conditions you would test?

  13. Implications of the results
      • Do these findings apply to other studies in the security/privacy area? How?

  14. Passwords for an entire university
      • 25,000 real, high-value passwords from CMU
      • Contextual data
        – Logs, demographics, survey
      • What factors correlate with password strength?
        – New (to passwords) statistical methods
        – Find new results, confirm prior results
      • What to do when you don’t have field data?
        – Comparison with leaked and study data

  15. What are CMU passwords?
      • 25,459 accounts for faculty, staff, and students
        – Plus 17,104 deactivated accounts
      • Single-sign-on for email, financial, grades, registration, health, etc.
      • Password requirements:
        – Minimum 8 characters
        – Upper, lower, digit, symbol
        – Dictionary check (241,497 words)
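The password requirements listed above can be sketched as a simple check. This is a minimal illustration, not CMU's actual implementation: the function name `meets_policy`, the use of ASCII punctuation as the symbol class, and the letters-only, case-insensitive dictionary comparison are all assumptions, since the slide does not say how the dictionary check is applied.

```python
import string


def meets_policy(password, dictionary):
    """Return True if the password satisfies the three listed requirements.

    `dictionary` is a set of lowercase words (hypothetical stand-in for the
    241,497-word list mentioned on the slide).
    """
    # Requirement 1: minimum 8 characters.
    if len(password) < 8:
        return False

    # Requirement 2: at least one upper, lower, digit, and symbol.
    # Assumption: "symbol" means ASCII punctuation.
    has_upper = any(c.isupper() for c in password)
    has_lower = any(c.islower() for c in password)
    has_digit = any(c.isdigit() for c in password)
    has_symbol = any(c in string.punctuation for c in password)
    if not (has_upper and has_lower and has_digit and has_symbol):
        return False

    # Requirement 3: dictionary check.
    # Assumption: compare only the letters, case-insensitively.
    letters_only = "".join(c for c in password if c.isalpha()).lower()
    return letters_only not in dictionary


toy_dictionary = {"password", "letmein"}
print(meets_policy("Password1!", toy_dictionary))     # fails the dictionary check
print(meets_policy("jn%fKXsl!8@Df", toy_dictionary))  # passes all three checks
```

A check like this is also how a "conforming subset" of a leaked password set could be selected for comparison against a set created under the policy.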

  16. Strength metric: Guessability
      • How many guesses to reach each password?
        – Subject to guessing algorithm and training data
      • Result: guess number or beyond the cutoff
        – Cutoff = 380 trillion guesses (runs in about 1 day)
      Example:
      Password          Guess number
      12345678          4
      Password178       1.4 x 10^6
      jn%fKXsl!8@Df     Beyond cutoff
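The idea of a guess number with a cutoff can be illustrated with a toy sketch. It assumes the guessing algorithm is represented as an ordered stream of candidate passwords; the function name `guess_number` and the five-entry guess list are hypothetical, and the guessers used in practice generate candidates from trained models rather than a fixed list.

```python
from itertools import islice


def guess_number(password, guesses, cutoff):
    """Return the 1-based position at which `password` is guessed.

    `guesses` is any iterable of candidate passwords in guessing order.
    Returns None when the password is not reached within `cutoff` guesses,
    i.e. it is "beyond cutoff".
    """
    for position, guess in enumerate(islice(guesses, cutoff), start=1):
        if guess == password:
            return position
    return None


# Toy guess order (hypothetical); a real guesser is trained on password data.
toy_guesses = ["password", "123456", "qwerty", "12345678", "letmein"]

print(guess_number("12345678", toy_guesses, cutoff=5))       # 4
print(guess_number("jn%fKXsl!8@Df", toy_guesses, cutoff=5))  # None
```

Because the result depends on the guess order, the same password can receive very different guess numbers under different algorithms or training data, which is the caveat the slide raises.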

  17. Comparing password sets
      • Examining CMU password policy
        – Use conforming subset for all leaked data
      • Online studies
        – MTsim: Closest match to real CMU experience
        – MTcomp8: Similar password requirements
      • Leaked: plaintext
        – RockYou, Yahoo!, CSDN
      • Leaked: hashed and cracked
        – Gawker, StratFor

  18. Comparing sets – Guessability
      [Figure: percent of passwords guessed (0% to 60%) vs. guess number (1E4 to 1E13), panel "Extensive-knowledge"; curves for CMUactive, MTsim, MTcomp8, RockYou, RYcomp8, Yahoo, Ycomp8, CSDN, CSDNcomp8, Stratfor, SFcomp8, Gawker, Gcomp8]
      Leaked hashed/cracked: Very easy to guess

  19. Comparing sets – Guessability
      [Figure: percent of passwords guessed (0% to 60%) vs. guess number (1E4 to 1E13), panel "Extensive-knowledge"; curves for CMUactive, MTsim, MTcomp8, RockYou, RYcomp8, Yahoo, Ycomp8, CSDN, CSDNcomp8, Stratfor, SFcomp8, Gawker, Gcomp8]
      Leaked plaintext: RockYou close to CMU, others much tougher

  20. Comparing sets – Guessability
      [Figure: percent of passwords guessed (0% to 60%) vs. guess number (1E4 to 1E13), panel "Extensive-knowledge"; curves for CMUactive, MTsim, MTcomp8, RockYou, RYcomp8, Yahoo, Ycomp8, CSDN, CSDNcomp8, Stratfor, SFcomp8, Gawker, Gcomp8]
      Online studies: Both close, MTcomp8 closer
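The "percent guessed" curves compared on these slides can be derived from per-password guess numbers. The sketch below assumes each password set is a list of guess numbers, with None marking passwords beyond the cutoff; the function name `percent_guessed` and the sample values are illustrative, not data from the study.

```python
from bisect import bisect_right


def percent_guessed(guess_numbers, thresholds):
    """Return {threshold: percent of passwords guessed within that many guesses}.

    `guess_numbers` holds one entry per password: an int guess number, or
    None if the password was not guessed before the cutoff.
    """
    total = len(guess_numbers)
    if total == 0:
        raise ValueError("need at least one password")
    # Only guessed passwords contribute to the numerator; unguessed ones
    # still count in the denominator.
    guessed = sorted(g for g in guess_numbers if g is not None)
    return {t: 100.0 * bisect_right(guessed, t) / total for t in thresholds}


# Illustrative values only (hypothetical, not from the study).
sample = [4, 1_400_000, None, 50_000, None]
curve = percent_guessed(sample, [10**4, 10**7, 10**10, 10**13])
for threshold, pct in curve.items():
    print(f"{threshold:.0e} guesses: {pct:.1f}% guessed")
```

Computing one such curve per password set, at the same thresholds, gives the kind of side-by-side comparison the figure shows.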

  21. Other metrics for comparison
      • Composition: length, character classes
      • Structures
      • Entropy (Shay et al., SOUPS 2010)
      • Frequency distribution
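Two of these metrics are easy to illustrate. The sketch below computes simple composition statistics for one password and the Shannon entropy of a set's frequency distribution. The function names are hypothetical, and the entropy shown is the generic Shannon formula over observed password frequencies, which is an assumption for illustration and not necessarily the estimator used by Shay et al.

```python
import math
import string
from collections import Counter


def composition(password):
    """Return length, digit count, and number of character classes used."""
    classes = [
        any(c.isupper() for c in password),
        any(c.islower() for c in password),
        any(c.isdigit() for c in password),
        any(c in string.punctuation for c in password),  # assumed symbol set
    ]
    return {
        "length": len(password),
        "digits": sum(c.isdigit() for c in password),
        "classes": sum(classes),
    }


def frequency_entropy(passwords):
    """Shannon entropy (bits) of the empirical password frequency distribution."""
    counts = Counter(passwords)
    total = len(passwords)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


print(composition("Password178"))
print(frequency_entropy(["a", "a", "b", "b"]))
```

A set in which many users pick the same few passwords has low frequency entropy, while a set with all-distinct passwords reaches the maximum of log2 of the set size.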

  22. Comparing sets – Length
      [Figure: two panels, "Length" (number of characters) and "Digits" (number of digits), one bar per password set, including CMUactive, MTsim, MTcomp8, the leaked sets and their comp8 subsets, SVcomp8, MTbasic8, MTdictionary8, and MTbasic16]
      Overall: Online studies closest across metrics (Full results in the paper)

  23. Discussion
      • Critique the study design
        – Challenges of field studies
      • Are there lessons for other HFSP studies?

  24. Quick note on ethics
      • All three studies we discussed today have significant ethical implications
      • We’ll revisit this in a couple of weeks
        – Any comments/questions in the meantime?

  25. Homework 2
      • Suggesting study designs
        – We’ll talk about more options Tuesday
      • Deploy and analyze an MTurk survey
        – Part, but only part, can be done with partners
        – There are 11 of you … potentially one triple
      • Read directions carefully!

  26. Project pitches
      • 5 min each; slides optional
        – What is the research question?
        – Preliminary high-level methodology
        – Ideally: Quick overview of related work / why it’s novel
      • We’ll vote to narrow down and then form teams
      • Final teams by 2/24
      • Proposals due 3/3
        – Two pages; details posted on course website
