
From Very Weak to Very Strong: Analyzing Password-Strength Meters

NDSS 2014 Presentation, Feb 25, 2014. From Very Weak to Very Strong: Analyzing Password-Strength Meters. Xavier de Carné de Carnavalet and Mohammad Mannan, Concordia University, Montreal, Canada.


  1. NDSS 2014 Presentation, Feb 25, 2014 From Very Weak to Very Strong : Analyzing Password-Strength Meters Xavier de Carné de Carnavalet Mohammad Mannan Concordia University, Montreal, Canada X. de Carné de Carnavalet NDSS’14: Analyzing Password-Strength Meters 1 / 20

  2. Password-strength meter/checker

  3. What is this work about? We analyzed why this is:

  4. What is this work about? And why that is (same password):

  5. Our motivations: (1) Recent studies: meters really guide users to choose better passwords [Ur et al., USENIX Security’12] and [Egelman et al., CHI’13]. (2) Deployed meters impact hundreds of millions of users. (3) Built by up-to-billion-dollar IT companies. (4) They don’t seem reliable...

  6. Tested 11 web services/applications

  7. Analysis setup (1/3): (1) 11 dictionaries: 3,895,247 unique passwords.

  8. Analysis setup (1/3): (1) 11 dictionaries: 3,895,247 unique passwords. (2) Top500, cracking tools (e.g., JtR), worm dictionaries, database leaks (e.g., RockYou).

  9. Analysis setup (1/3): (1) 11 dictionaries: 3,895,247 unique passwords. (2) Top500, cracking tools (e.g., JtR), worm dictionaries, database leaks (e.g., RockYou). (3) Mangling & leet transformations: password → Password1+ or p@5$w0rd.
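
The two transformations named on this slide can be sketched in a few lines of Python. This is only an illustration: the `LEET` substitution table, the `"1+"` suffix default, and the function names `mangle` and `leet_variants` are assumptions chosen to reproduce the slide's two examples, not the rule set actually used in the study.

```python
from itertools import product

# Hypothetical substitution table: each letter maps to the symbols that may replace it.
LEET = {"a": "@4", "e": "3", "i": "1!", "o": "0", "s": "5$", "t": "7"}


def mangle(word, suffix="1+"):
    """Capitalize the first letter and append a fixed suffix (assumed rule)."""
    return word.capitalize() + suffix


def leet_variants(word):
    """Return every spelling obtained by optionally substituting each letter."""
    # For each character, the candidates are the character itself plus its leet forms.
    choices = [ch + LEET.get(ch.lower(), "") for ch in word]
    return {"".join(combo) for combo in product(*choices)}


if __name__ == "__main__":
    print(mangle("password"))
    print("p@5$w0rd" in leet_variants("password"))
```

Enumerating all substitution combinations grows exponentially with word length, so a real dictionary would likely apply a fixed subset of rules per word instead.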

  10. Analysis setup (2/3): (1) Understanding of functionalities (involves some reverse engineering). (2) JavaScript (whitebox) and/or server-side (blackbox). (3) 52+ million tests.

  11. Analysis setup (3/3): (1) Analyze results. (2) Understand each checker's profile. (3) Find common weaknesses.

  12. In theory: Designing password-strength meters (PSMs) is non-trivial: no straightforward academic literature to follow; failure of NIST recommendations; how to deal with password leaks and cultural references?

  13. In practice: Custom “entropy” based on: perceived complexity; password length; number of charsets used; known patterns; comparison with a dictionary of common passwords (blacklist). More entropy ≃ more secure password. Everyone invents their own algorithm.
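
To make the ingredients on this slide concrete, here is a minimal sketch of an ad-hoc meter built from length, charset count, and a blacklist. It is not the algorithm of any tested service: the `BLACKLIST` contents, the bit thresholds in `label`, and all function names are assumptions made up for illustration.

```python
import math
import string

# Hypothetical, tiny blacklist of common passwords (compared case-insensitively).
BLACKLIST = {"password", "password1", "123456", "qwerty"}


def charset_size(pw):
    """Sum the sizes of the character classes that appear in the password."""
    size = 0
    if any(c in string.ascii_lowercase for c in pw):
        size += 26
    if any(c in string.ascii_uppercase for c in pw):
        size += 26
    if any(c in string.digits for c in pw):
        size += 10
    if any(c in string.punctuation for c in pw):
        size += len(string.punctuation)
    # Characters outside these classes are ignored; never return less than 1.
    return max(size, 1)


def naive_entropy_bits(pw):
    """Length times log2(charset size); blacklisted passwords score zero."""
    if not pw or pw.lower() in BLACKLIST:
        return 0.0
    return len(pw) * math.log2(charset_size(pw))


def label(pw):
    """Map the score to a label using arbitrary, assumed thresholds."""
    bits = naive_entropy_bits(pw)
    if bits < 28:
        return "very weak"
    if bits < 36:
        return "weak"
    if bits < 60:
        return "medium"
    if bits < 128:
        return "strong"
    return "very strong"


if __name__ == "__main__":
    for pw in ("password", "testtest", "Password1+"):
        print(pw, label(pw))
```

A score like this rewards added character classes regardless of how predictable the password is, which is why a lightly mangled dictionary word can jump several levels.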

  14. Meters heterogeneity: (1) Each meter reacts differently to our dictionaries. (2) Strength results vary widely from one meter to another. Example: Password1 is rated Obvious, Very weak, Weak (x3), Poor, Moderate (blacklisted), Medium (x2), Strong (x3), Very strong. By Microsoft itself (3 versions): strong, weak and medium! (3) Some simple dictionaries score significantly higher than others.

  15. Stringency bypass: Simple mangling rules/leet transformations allow bypassing password requirements. Example: consider {Top500, C&A, Cfkr and JtR}. How many passwords are medium or better?

      Web service   Regular   Mangled
      Skype         10.5%     78%
      Google        0.002%    26.8%
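
The "medium or better" share in this comparison is a simple ratio, sketched below in Python. The `toy_meter` function, the `LEVELS` scale, and the sample word lists are hypothetical stand-ins; they do not model Skype's or Google's checker and will not reproduce the percentages above.

```python
# Assumed five-level scale, ordered from weakest to strongest.
LEVELS = ["very weak", "weak", "medium", "strong", "very strong"]


def toy_meter(pw):
    """Hypothetical meter: short passwords fail, otherwise count character classes."""
    if len(pw) < 6:
        return "very weak"
    classes = sum([
        any(c.islower() for c in pw),
        any(c.isupper() for c in pw),
        any(c.isdigit() for c in pw),
        any(not c.isalnum() for c in pw),
    ])
    return LEVELS[classes]


def share_at_least(passwords, meter, threshold="medium"):
    """Fraction of passwords that the meter rates at or above the threshold."""
    if not passwords:
        raise ValueError("empty dictionary")
    cutoff = LEVELS.index(threshold)
    hits = sum(LEVELS.index(meter(pw)) >= cutoff for pw in passwords)
    return hits / len(passwords)


if __name__ == "__main__":
    regular = ["password", "dragon", "monkey"]
    mangled = [w.capitalize() + "1+" for w in regular]
    print(share_at_least(regular, toy_meter))
    print(share_at_least(mangled, toy_meter))
```

Running the same function over a base dictionary and its mangled counterpart gives the two columns of a table like the one above for any meter that can be called as a function.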

  16. Password policies: (1) Password policies are not often explicitly stated. (2) Rules for measuring strength are unexplained to users. (3) Differences in policies: very stringent, assigning strengths only for 3+ charsets (FedEx); promotion of single-charset passphrases (Dropbox). (4) Google and Yahoo!: lots of personal info, but lenient policy...

  17. Google checker: some results. Password strength distribution: [Bar chart: percentage (0 to 100) of each dictionary (T5, CF, JR, CA, RY, PB, TM, CM, JM, RM, LT) rated Too short, Weak, Fair, Good, Strong.] Inconsistencies: (1) testtest is weak; (2) testtest0 is strong; (3) testtest1 is fair; (4) testtest2 is good; (5) testtest3 is strong...; (6) Strength is time-dependent.

  18. One checker to rule them all

  19. Summary (1/2). Facts: passwords are not going to disappear anytime soon; users will continue to choose weak passwords. Current solutions: stringent policies (user resentment?); influence users in choosing better passwords, willingly; provide feedback on the quality of chosen passwords (should be consistent and avoid confusion).

  20. Summary (2/2). Reality: (1) Commonly-used meters are highly inconsistent. (2) They fail to provide coherent feedback and are sometimes blatantly misleading. (3) They often have a very ad-hoc design. (4) Simple transformations are not taken into account.

  21. What can be done? (1) Common API to reduce confusion (e.g., Dropbox with zxcvbn). (2) Real-time cracking with state-of-the-art techniques to assess passwords? (3) Passphrases (be careful with simple structures). (4) Password popularity, Markov models, PCFG, semantics?

  22. Thanks. To recap: (1) Meters are less robust than expected from such large companies. (2) Companies should stop misleading users. (3) Opportunities for academic research.
      Contact: x_decarn@ciise.concordia.ca
      Project URL: http://goo.gl/0E5Ieu
      O,u3$T1()|\|5?

  23. Additional slides

  24. Percentage of each dictionary assigned “good” or better. Base dictionaries: Top500, Cfkr, JtR, C&A, RY5, phpBB. [Bar chart, 0 to 80%, with services on the x-axis: Google, Drupal, Yahoo!, Dropbox, Microsoft, PayPal, FedEx, Twitter, Skype, eBay, Apple.] “Advanced” dictionaries: Top500+M, Cfkr+M, JtR+M, RY5+M, Leet. [Bar chart, 0 to 100%, with services on the x-axis: Drupal, Yahoo!, Google, PayPal, FedEx, eBay, Twitter, Dropbox, Skype, Apple, Microsoft.]

  25. FedEx: Password strength distribution. [Bar chart: percentage (0 to 100) of each dictionary (T5, CF, JR, CA, RY, PB, TM, CM, JM, RM, LT) rated Very Weak, Weak, Medium, Strong, Very Strong.]

  26. FedEx: Password strength distribution. Very weak? Fine...

  27. FedEx: Targeted dictionary. Refined mangling rules: (1) capitalize, append a digit and a symbol; (2) capitalize, append a symbol and a digit; (3) capitalize, append a symbol and two digits; (4) capitalize, append a symbol and a digit, and prefix with a digit. Gives 121,792 words from {Top500, JtR, Cfkr}: (1) 60.9% is now very strong; (2) 9.0% is strong; (3) 29.7% is medium; (4) 0.4% is very weak.
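
The four refined rules can be expressed as a small generator, sketched here in Python. The `SYMBOLS` set is an assumption, since the slides do not say which digits and symbols were tried, and the function name `targeted_variants` is hypothetical; this sketch enumerates every digit/symbol combination and therefore does not reproduce the 121,792-word count.

```python
import string
from itertools import product

DIGITS = string.digits
SYMBOLS = "!@#$%"  # assumed symbol set; the slides do not specify one


def targeted_variants(word):
    """Apply the four capitalize-and-decorate rules to a single word."""
    base = word.capitalize()
    out = set()
    for d, s in product(DIGITS, SYMBOLS):
        out.add(base + d + s)        # rule 1: append a digit and a symbol
        out.add(base + s + d)        # rule 2: append a symbol and a digit
    for s, d1, d2 in product(SYMBOLS, DIGITS, DIGITS):
        out.add(base + s + d1 + d2)  # rule 3: append a symbol and two digits
    for p, s, d in product(DIGITS, SYMBOLS, DIGITS):
        out.add(p + base + s + d)    # rule 4: rule 2 plus a digit prefix
    return out


if __name__ == "__main__":
    variants = targeted_variants("password")
    print(len(variants))
    print(sorted(variants)[:5])
```

Each rule yields at least three character classes, which is the property the FedEx meter rewards according to the policy slide, so every candidate clears that bar by construction.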
