THE SCIENCE OF GUESSING
analyzing an anonymized corpus of 70 million passwords
Joseph Bonneau, jcb82@cl.cam.ac.uk, Computer Laboratory
IEEE Symposium on Security & Privacy, Oakland, CA, USA, May 23, 2012
Joseph Bonneau (University of Cambridge) The science of guessing May 23, 2012 1 / 33
Why do password research in 2012?
Compatible Time-Sharing System, MIT 1961
Research goal
Precisely compute the guessing difficulty of a given population's password distribution
Research goal
Compare the guessing difficulty of password distributions chosen by different populations
Approach #1: Semantic password evaluation
- How long are the passwords?
- Do they look like English words?
- What kind of characters do they contain?
Approach #1: Semantic password evaluation
NIST "entropy" formula (estimated bits by password length)

         User chosen                                            Randomly chosen
         94-character alphabet                      10-char.    94-char.    10-char.
Length   No checks   Dict. rule   Dict. & comp.     alphabet    alphabet    alphabet
   1         4           -             -                3          6.6         3.3
   2         6           -             -                5         13.2         6.7
   3         8           -             -                7         19.8        10.0
   4        10          14            16                9         26.3        13.3
   5        12          17            20               10         32.9        16.7
   6        14          20            23               11         39.5        20.0
   7        16          22            27               12         46.1        23.3
   8        18          24            30               13         52.7        26.6
  10        21          26            32               15         65.9        33.3
  12        24          28            34               17         79.0        40.0
  14        27          30            36               19         92.2        46.6
  16        30          32            38               21        105.4        53.3
  18        33          34            40               23        118.5        59.9
  20        36          36            42               25        131.7        66.6
  22        38          38            44               27        144.7        73.3
  24        40          40            46               29        158.0        79.9
  30        46          46            52               35        197.2        99.9
  40        56          56            62               45        263.4       133.2
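The "No Checks" column of the NIST table can be reproduced with a short per-character rule. The sketch below assumes the scoring heuristic from NIST SP 800-63 Appendix A (4 bits for the first character, 2 bits each for characters 2 to 8, 1.5 bits each for characters 9 to 20, 1 bit for each character after that). The function name is hypothetical, and the dictionary and composition bonuses are not modelled.

```python
def nist_entropy_no_checks(length: int) -> float:
    """Estimated bits for a user-chosen password of the given length,
    with no dictionary or composition checks (assumed NIST heuristic)."""
    bits = 0.0
    for position in range(1, length + 1):
        if position == 1:
            bits += 4.0      # first character
        elif position <= 8:
            bits += 2.0      # characters 2 to 8
        elif position <= 20:
            bits += 1.5      # characters 9 to 20
        else:
            bits += 1.0      # every further character
    return bits


if __name__ == "__main__":
    for n in (1, 8, 14, 20, 40):
        print(n, nist_entropy_no_checks(n))
```

The estimate depends only on length, which is why two very different passwords of the same length score identically under this approach.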
Approach #2: Cracking experiments
[Figure: results of published cracking experiments. x-axis: α = proportion of passwords guessed (0.0 to 0.9); y-axis: µ = lg(dictionary size) (0 to 35). Series: Morris and Thompson [1979], Klein [1990], Spafford [1992], Wu [1999], Kuo [2006], Schneier [2006], Dell'Amico (it) [2010], Dell'Amico (fi) [2010], Dell'Amico (en) [2010].]
Methodological problems with password analysis
The semantic and cracking approaches are compared on five criteria: external validity, no operator bias, no demographic bias, repeatable, easy.
My approach
1. Collect password data on a huge scale
2. Compare populations as probability distributions
3. Test hypotheses using different populations
Goal #1: collect a massive data set
- with cooperation from Yahoo!
- privacy-preserving collection
- histograms only
- demographic splits collected
Collecting large-scale data at Yahoo!
[Diagram, built up over six stages: Internet -> Collection Proxy -> Login Server]
- Stage 1: a login (user: joe, pass: 12345) passes through the collection proxy, which sees 12345.
- Stage 2: the proxy records H(12345) instead of the password.
- Stage 3: the proxy holds a key K and records H(K||12345).
- Stage 4: the proxy looks up demographics in the user database (SELECT gender, lang, age FROM users WHERE user = joe), which returns m, en, 21-34.
- Stage 5: the proxy checks H(joe) against a set of seen users and adds H(K||12345) to the histograms for gender=m, lang=en and age=21-34.
- Stage 6: only the per-predicate histograms (gender=m, lang=en, age=21-34) remain.
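The collection pipeline in the diagram can be sketched roughly as follows. This is an illustration only, not the code that ran at Yahoo!: the class and method names are hypothetical, SHA-256 stands in for the unspecified hash H, and the key K is simply drawn from `os.urandom`.

```python
import hashlib
import os
from collections import Counter, defaultdict


class CollectionProxy:
    """Toy model of the proxy: keeps keyed-hash histograms per predicate."""

    def __init__(self) -> None:
        self.key = os.urandom(16)               # secret key K (assumed random)
        self.seen_users = set()                 # hashes of users already counted
        self.histograms = defaultdict(Counter)  # predicate -> Counter of H(K||pw)

    @staticmethod
    def _h(data: bytes) -> str:
        # SHA-256 is an assumption; the slides only name a hash H.
        return hashlib.sha256(data).hexdigest()

    def observe(self, user: str, password: str, demographics: dict) -> bool:
        """Record one login. Returns False if the user was already counted."""
        user_tag = self._h(user.encode())       # H(joe) in the diagram
        if user_tag in self.seen_users:
            return False
        self.seen_users.add(user_tag)
        pw_tag = self._h(self.key + password.encode())  # H(K||12345)
        for name, value in demographics.items():
            self.histograms[f"{name}={value}"][pw_tag] += 1
        return True


if __name__ == "__main__":
    proxy = CollectionProxy()
    proxy.observe("joe", "12345", {"gender": "m", "lang": "en", "age": "21-34"})
    proxy.observe("joe", "12345", {"gender": "m", "lang": "en", "age": "21-34"})
    proxy.observe("ann", "12345", {"gender": "f", "lang": "en", "age": "21-34"})
    print(sorted(proxy.histograms["lang=en"].values()))
```

Because every password is hashed under the same key, equal passwords still collide in the histogram, so frequencies survive while the plaintext does not.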
Collecting large-scale data at Yahoo!
- Experiment run May 23–25, 2011
- 69,301,337 unique users
- 42.5% unique
- 328 different predicate functions
Goal #2: model guessing as a probability problem
- Assume perfect knowledge of the distribution $\mathcal{X}$
- $\mathcal{X}$ has $N$ events (passwords) $x_1, x_2, \ldots$
- Events have probability $p_1 \ge p_2 \ge \ldots \ge p_N \ge 0$
- Each user chooses at random $X \xleftarrow{R} \mathcal{X}$
- Question: How hard is it to guess $X$?
Shannon entropy
\[ H_1(\mathcal{X}) = -\sum_{i=1}^{N} p_i \lg p_i \]
Interpretation: Expected number of queries "Is $X \in S$?" for arbitrary subsets $S \subseteq \mathcal{X}$ needed to guess $X$ (Source-Coding Theorem).
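As a small illustration of the formula, the sketch below computes $H_1$ for a list of probabilities. The function name and the example distributions are made up for this illustration.

```python
from math import log2


def shannon_entropy(probs):
    """H_1 = -sum(p_i * lg p_i), skipping zero-probability events."""
    return -sum(p * log2(p) for p in probs if p > 0)


if __name__ == "__main__":
    print(shannon_entropy([1 / 8] * 8))          # uniform over 8 values
    print(shannon_entropy([0.5, 0.25, 0.25]))    # skewed toy distribution
```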
Guesswork (guessing entropy)
\[ G_1(\mathcal{X}) = E\left[\#\text{guesses}\right] = \sum_{i=1}^{N} p_i \cdot i \]
Interpretation: Expected number of queries "Is $X = x_i$?" for $i = 1, 2, \ldots, N$ (optimal sequential guessing).
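A minimal sketch of guesswork, assuming the attacker guesses in decreasing order of probability. The function name and toy distributions are illustrative only.

```python
def guesswork(probs):
    """G_1 = sum(p_i * i) with events ranked from most to least likely."""
    ordered = sorted(probs, reverse=True)
    return sum(p * rank for rank, p in enumerate(ordered, start=1))


if __name__ == "__main__":
    print(guesswork([0.25] * 4))           # uniform over 4 values
    print(guesswork([0.25, 0.5, 0.25]))    # input order does not matter
```

For a uniform distribution over $N$ values this gives $(N + 1)/2$, the familiar "half the key space on average".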
$G_1$ fails badly for real password distributions
Random 128-bit passwords in the wild at RockYou ($\sim 2^{-20}$):
  ed65e09b98bdc70576d6c5f5e2ee38a9
  e54d409c55499851aeb25713c1358484
  dee489981220f2646eb8b3f412c456d9
  c4df8d8e225232227c84d0ed8439428a
  bd9059497b4af2bb913a8522747af2de
  b25d6118ffc44b12b014feb81ea68e49
  aac71eb7307f4c54b12c92d9bd45575f
  9475d62e1f8b13676deab3824492367a
  92965710534a9ec4b30f27b1e7f6062a
  80f5a0267920942a73693596fe181fb7
  76882fb85a1a8c6a83486aba03c031c9
  6a60e0e51a3eb2e9fed6a546705de1bf
  ...
$\Rightarrow G_1(\text{RockYou}) > 2^{107}$
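One way to see where the bound comes from, assuming the $\sim 2^{-20}$ figure is the share of accounts whose passwords are uniformly random 128-bit values: those values are spread over $2^{128}$ possibilities, all ranked after the more common passwords, so their mean rank is at least $(2^{128} + 1)/2$. Their contribution to the sum alone already gives:

```latex
\[
G_1(\text{RockYou})
  \;\ge\; 2^{-20} \cdot \frac{2^{128} + 1}{2}
  \;>\; 2^{-20} \cdot 2^{127}
  \;=\; 2^{107}
\]
```

A tiny fraction of unguessable passwords therefore dominates $G_1$, even though it says nothing about how easily the remaining accounts fall.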
Attackers might be happy ignoring the hard values
α-work-factor
\[ \mu_\alpha(\mathcal{X}) = \min\left\{ \mu \in [1, N] \;\middle|\; \sum_{i=1}^{\mu} p_i \ge \alpha \right\} \]
Interpretation: Minimal dictionary size to succeed with probability $\alpha$.
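The definition translates directly into a running sum over the sorted probabilities. The sketch below is illustrative: the function name is hypothetical, and the small tolerance `1e-12` is an assumption added to absorb floating-point rounding in the cumulative sum.

```python
from itertools import accumulate


def alpha_work_factor(probs, alpha):
    """Smallest mu such that the mu most likely events cover at least alpha."""
    ordered = sorted(probs, reverse=True)
    for mu, covered in enumerate(accumulate(ordered), start=1):
        if covered >= alpha - 1e-12:   # tolerance for float rounding (assumption)
            return mu
    return len(ordered)                # alpha not reachable: whole dictionary


if __name__ == "__main__":
    toy = [0.4, 0.3, 0.2, 0.1]
    for a in (0.4, 0.5, 0.9, 1.0):
        print(a, alpha_work_factor(toy, a))
```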
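α-guesswork
\[ G_\alpha(\mathcal{X}) = (1 - \lceil\alpha\rceil) \cdot \mu_\alpha(\mathcal{X}) + \sum_{i=1}^{\mu_\alpha(\mathcal{X})} p_i \cdot i \]
Here $\lceil\alpha\rceil = \sum_{i=1}^{\mu_\alpha(\mathcal{X})} p_i$ is $\alpha$ rounded up to the success rate actually reached after $\mu_\alpha(\mathcal{X})$ guesses.
Interpretation: Mean number of guesses to succeed with probability $\alpha$.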
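A sketch of α-guesswork follows, reading $\lceil\alpha\rceil$ as the success rate actually reached after $\mu_\alpha$ guesses (an interpretation assumed here, since the slide does not spell it out). Accounts that are cracked cost their rank in guesses; the remaining $1 - \lceil\alpha\rceil$ cost the full $\mu_\alpha$ guesses before the attacker gives up. The function name and tolerance are hypothetical.

```python
def alpha_guesswork(probs, alpha):
    """G_alpha = (1 - reached) * mu_alpha + sum_{i <= mu_alpha} p_i * i."""
    ordered = sorted(probs, reverse=True)
    reached = 0.0     # cumulative probability covered so far
    weighted = 0.0    # running sum of p_i * i
    mu = 0
    for rank, p in enumerate(ordered, start=1):
        reached += p
        weighted += p * rank
        mu = rank
        if reached >= alpha - 1e-12:   # tolerance for float rounding (assumption)
            break
    return (1.0 - reached) * mu + weighted


if __name__ == "__main__":
    toy = [0.4, 0.3, 0.2, 0.1]
    print(alpha_guesswork(toy, 0.5))   # attacker stops after 2 guesses
    print(alpha_guesswork(toy, 1.0))   # equals plain guesswork G_1
```

At α = 1 the first term vanishes and the result coincides with $G_1$.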
Guessing curves visualise all possible attacks
[Figure: guessing curves for $\mu_\alpha(U_{10^4})$, $\mu_\alpha(U_{10^3})$, $\mu_\alpha(\text{PIN})$ and $G_\alpha(\text{PIN})$. x-axis: success rate $\alpha$ (0.0 to 1.0); y-axis: dictionary size/number of guesses (0 to 10000).]
More intuitive after converting to bits
[Figure: converted guessing curves $\tilde{\mu}_\alpha(U_{10^4})$ / $\tilde{G}_\alpha(U_{10^4})$, $\tilde{\mu}_\alpha(U_{10^3})$ / $\tilde{G}_\alpha(U_{10^3})$, $\tilde{\mu}_\alpha(\text{PIN})$ and $\tilde{G}_\alpha(\text{PIN})$, with $H_0$, $\tilde{G}_1$, $H_1$, $H_2$ and $H_\infty$ marked. x-axis: success rate $\alpha$ (0.0 to 1.0); y-axes: bits (0 to 14) and dits (0.0 to 4.0).]
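The slide does not state the conversion formulas, so the sketch below assumes one natural choice: scale each metric so that a uniform distribution over $N$ values scores exactly $\lg N$ bits at every success rate. Concretely it uses $\tilde{\mu}_\alpha = \lg(\mu_\alpha / \lceil\alpha\rceil)$ and $\tilde{G}_\alpha = \lg(2 G_\alpha / \lceil\alpha\rceil - 1) + \lg\frac{1}{2 - \lceil\alpha\rceil}$, with $\lceil\alpha\rceil$ taken as the success rate actually reached. The function name is hypothetical.

```python
from math import log2


def effective_key_lengths(probs, alpha):
    """Return (mu_alpha in bits, G_alpha in bits) under the assumed conversion."""
    ordered = sorted(probs, reverse=True)
    reached = 0.0     # success rate actually reached
    weighted = 0.0    # running sum of p_i * i
    mu = 0
    for rank, p in enumerate(ordered, start=1):
        reached += p
        weighted += p * rank
        mu = rank
        if reached >= alpha - 1e-12:   # tolerance for float rounding (assumption)
            break
    g_alpha = (1.0 - reached) * mu + weighted
    mu_bits = log2(mu / reached)
    g_bits = log2(2.0 * g_alpha / reached - 1.0) + log2(1.0 / (2.0 - reached))
    return mu_bits, g_bits


if __name__ == "__main__":
    uniform = [1 / 1024] * 1024
    for a in (0.25, 0.5, 1.0):
        print(a, effective_key_lengths(uniform, a))
```

With this scaling a uniform distribution plots as a flat line, so any dip in a real distribution's curve shows directly how many bits are lost to skew at that success rate.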