How to Design a Blacklist (for a Password Meter)
L. Jacquin – A. Kumar – C. Lauradoux
December 4, 2014

Blacklist in practice
◮ The backbone of security!
◮ Traffic analysis:
  ⊲ Firewall: iptables, netfilter
  ⊲ IDS: l7-filter
  ⊲ DPI: OpenDPI, nDPI
◮ Malware signatures:
  ⊲ Anti-virus: ClamAV, SplitScreen
  ⊲ Anti-phishing: Google Safe Browsing
◮ And password meters!

Application to password meters
◮ Password meter: a tool that prevents users from choosing weak (or strong) passwords.
◮ Examples:
  • LUDS: Lower, Upper, Digit, Symbol
  • cracklib, apg and apgbfm (Linux)
  • KeePass
◮ de Carné de Carnavalet and Mannan [NDSS 2014] report that password blacklists are used by at least:
  • Google
  • Apple

Are password meters useful?
◮ Can you trust a password meter?
  • Yes, when the password fails the test
  • No otherwise
  The NIST test suite for pseudorandom generators suffers from the same flaw: only a failure is trustworthy.
◮ Does it give an advantage to a cracker? Can I speed up the cracking by including the meter?
◮ How difficult is it to exploit a meter during cracking?

Exploit a meter during cracking
[Figure: cracking pipelines built from Dict., Rules, Hash and Eng. blocks, shown with and without a Meter block.]

Implementation of a blacklist
◮ Which data structure to store the data?
  • Raw list
  • Hash table
  • Bloom filter
  • Count-min sketch
◮ Which data?
  • Leaked passwords
  • Personal data
◮ What result is expected?

Blacklist Data Structures
An algorithmic problem
◮ Exact/approximate answer: the value returned by the blacklist is correct with probability $\epsilon$.
  • $\epsilon = 1$: exact answer
  • $\epsilon < 1$: approximate answer
◮ Exact/approximate query: the goal is to emulate mangling as it is done in cracking engines. For example, passwZrd is close to password.

Approximate query Blacklist
◮ Definition. Let $B$ be a blacklist and $s \in B$. For $s' \notin B$ with $d(s, s') \leq \delta$, we want
    $\mathrm{query}(s', B) = \mathrm{true}$,
  where $d(\cdot, \cdot)$ is a distance and $\delta$ a given threshold. If $d(s, s') > \delta$, then $\mathrm{query}(s', B) = \mathrm{false}$.
◮ Implementations:
  • transformation on the password,
  • special hash functions such that $h(s) = h(s')$,
  • approximate pattern matching.

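The definition above can be sketched directly in Python. This is a minimal illustration, assuming that $d$ is the Levenshtein (edit) distance and that the blacklist is stored as a raw list; the slides do not fix either choice, and the names `edit_distance` and `query` are hypothetical.

```python
from typing import Iterable


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def query(candidate: str, blacklist: Iterable[str], delta: int) -> bool:
    """True if some blacklisted entry is within distance delta of candidate."""
    return any(edit_distance(candidate, s) <= delta for s in blacklist)


blacklist = ["password", "letmein"]
print(query("passwZrd", blacklist, delta=1))  # one substitution away from "password"
```

The scan is linear in the size of the blacklist, which is the cost that the hashing-based implementations listed above try to avoid.
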
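Approximate answer Blacklist
◮ Different types of errors:
  • false positive: $s' \notin B$ but $\mathrm{query}(s', B) = \mathrm{true}$
  • false negative: $s' \in B$ but $\mathrm{query}(s', B) = \mathrm{false}$
  False positives are acceptable, but false negatives are not.
◮ Example: Bloom filter
  • $s' \in B$: $\Pr(\mathrm{query}(s', B) = \mathrm{false}) = 0$
  • $s' \notin B$: $\Pr(\mathrm{query}(s', B) = \mathrm{true}) = \epsilon$, with
    $\epsilon \approx \left(1 - e^{-kn/m}\right)^{k}$
    for $k$ hash functions, $n$ inserted entries and a vector of $m$ bits.
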
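To get a feel for the false-positive formula, the sketch below evaluates it numerically. The function names are hypothetical, and `optimal_parameters` uses the standard textbook sizing ($m = -n \ln \epsilon / (\ln 2)^2$, $k = (m/n) \ln 2$), which is an assumption here and not something stated on the slides.

```python
import math


def bloom_false_positive_rate(k: int, n: int, m: int) -> float:
    """Approximate false-positive rate (1 - e^{-kn/m})^k of a Bloom filter."""
    return (1.0 - math.exp(-k * n / m)) ** k


def optimal_parameters(n: int, eps: float) -> tuple:
    """Textbook sizing (assumption): bits m and hash count k for target rate eps."""
    m = math.ceil(-n * math.log(eps) / (math.log(2) ** 2))
    k = max(1, round(m / n * math.log(2)))
    return m, k


n = 182                 # number of entries to insert
m, k = optimal_parameters(n, 2 ** -16)
print(m, k, bloom_false_positive_rate(k, n, m))
```

Under this sizing, a target of $\epsilon = 2^{-16}$ costs about $16 / \ln 2 \approx 23$ bits per entry, independent of the length of the stored passwords.
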
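Bloom filter
Bloomcrastination
◮ Vector of bits $\vec{z}$ of size $m$, initially set to $\vec{0}$:
  • $k$ hash functions $h_i : \{0, 1\}^* \to [0, m - 1]$
  • Insert(x): set the bits of $\vec{z}$ at positions $h_i(x)$.
  • Query(y): check whether the bits at positions $h_i(y)$ are all set.
[Figure: x1, x2, x3 are inserted into the bit vector 0 0 1 1 0 1 1 0 1 0; queries y1, y2 = x1 and y3 are then looked up.]
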
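The Insert and Query operations above can be sketched in a few lines of Python. This is an illustrative sketch, not the apgbfm implementation: the class name `BloomFilter` is hypothetical, and deriving the $k$ hash functions from SHA-256 with an index prefix is an assumption made only to keep the example self-contained.

```python
import hashlib
from typing import Iterator


class BloomFilter:
    def __init__(self, m: int, k: int) -> None:
        self.m = m                            # number of bits in the vector
        self.k = k                            # number of hash functions
        self.bits = bytearray((m + 7) // 8)   # all bits start at 0

    def _positions(self, item: str) -> Iterator[int]:
        """Yield h_1(item), ..., h_k(item), each in [0, m - 1]."""
        data = item.encode("utf-8")
        for i in range(self.k):
            # Assumption: h_i is SHA-256 of a 4-byte index followed by the item.
            digest = hashlib.sha256(i.to_bytes(4, "big") + data).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def insert(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def query(self, item: str) -> bool:
        """True if all k bits are set; never False for an inserted item."""
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


bf = BloomFilter(m=1024, k=4)
bf.insert("password")
print(bf.query("password"))  # an inserted entry is always reported
```

A query for a string that was never inserted returns true only if all of its $k$ positions happen to collide with bits that are already set, which is the false-positive event.
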
Approximate answer Blacklist
Advantages ($\epsilon = 2^{-16}$)

Dictionary   # entries    Size (Mb)   Bloom (Mb)
Conficker    182          1.4K        252
JtR          3,107        22K         4.2K
phpbb        184,389      1.6M        256K
C&A          306,706      3.1M        415K
RockYou      14,344,391   134M        19M

◮ apgbfm is based on a Bloom filter.
◮ Good compression, but it supports exact queries only!

APM vs apgbfm
Manber and Wu
◮ Raw data can be used with approximate pattern matching (APM):
  • agrep
  • tre-agrep

Password    Raw Data + APM   apgbfm
password    weak             weak
passwZrd    weak             strong

◮ Exact-query solutions are space-efficient but limited.

Approximate query/Approximate answer
Manber and Wu

Dictionary   Size   # entries after mangling   Mangling factor
Conficker    1.4K   489                        2.6
JtR          22K    7,602                      2.4
phpbb        1.6M   545,643                    2.9
C&A          3.1M   1,091,563                  3.5
RockYou      134M   48,493,095                 3.8

◮ Mangling + Bloom is inefficient: the dictionary grows by a factor of 2.4 to 3.8 before it is inserted into the filter.

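The growth in the table comes from inserting every mangled variant of every entry. The sketch below illustrates the mechanism with a handful of hypothetical mangling rules (capitalise, upper-case, reverse, a small leet substitution, append "1"); these are assumptions for illustration and not the rules used to produce the figures above.

```python
from typing import Iterable, Set

# Hypothetical leet substitutions, chosen only for this example.
LEET = str.maketrans({"a": "@", "o": "0", "e": "3"})


def mangle(word: str) -> Set[str]:
    """Return the word together with a few mangled variants."""
    return {
        word,
        word.capitalize(),
        word.upper(),
        word[::-1],
        word.translate(LEET),
        word + "1",
    }


def expand(dictionary: Iterable[str]) -> Set[str]:
    """Apply the mangling rules to every entry of the dictionary."""
    expanded: Set[str] = set()
    for word in dictionary:
        expanded |= mangle(word)
    return expanded


words = ["password", "letmein"]
expanded = expand(words)
print(len(expanded) / len(words))  # expansion factor under these rules
print("passwZrd" in expanded)      # not produced by any of the rules
```

Every variant in `expanded` would need its own insertion into the Bloom filter, and a candidate such as passwZrd is still missed unless some rule happens to generate it.
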
Approximate query/Approximate answer
Locality-sensitive hashing
◮ Definition (with $c_1 > c_2$):
  • if $d(s, s') < \delta$ then $\Pr(h(s) = h(s')) = c_1$
  • if $d(s, s') \geq \delta$ then $\Pr(h(s) = h(s')) = c_2$
◮ Example: tabulation hashing
    $h(s) = H_1(s_1) \oplus H_2(s_2) \oplus \cdots \oplus H_\ell(s_\ell)$
◮ Famous applications: MinHash, SimHash (Google)
◮ Better coverage of mangling.
◮ The false-positive rate is difficult to control.

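The tabulation hashing formula can be sketched as follows. This is a minimal illustration, assuming one table of random values per byte position, filled from a seeded pseudorandom generator; the class name `TabulationHash` and its parameters are hypothetical, and the sketch makes no claim about how well such a hash covers mangling in practice.

```python
import random


class TabulationHash:
    def __init__(self, max_len: int, out_bits: int = 32, seed: int = 0) -> None:
        rng = random.Random(seed)
        # H_1, ..., H_max_len: one table of 256 random values per byte position.
        self.tables = [[rng.getrandbits(out_bits) for _ in range(256)]
                       for _ in range(max_len)]

    def __call__(self, s: str) -> int:
        data = s.encode("utf-8")
        if len(data) > len(self.tables):
            raise ValueError("input longer than max_len")
        h = 0
        for table, byte in zip(self.tables, data):
            h ^= table[byte]   # h(s) = H_1(s_1) xor ... xor H_l(s_l)
        return h


h = TabulationHash(max_len=16)
a, b = h("password"), h("passwZrd")
# The two inputs differ only at index 5 ('o' vs 'Z'), so their hashes differ
# exactly by the XOR of the two table entries at that position.
print(a ^ b == h.tables[5][ord("o")] ^ h.tables[5][ord("Z")])
```

Because each position contributes independently, a single-character substitution changes only one term of the XOR, which is the structure a locality-sensitive design for passwords would try to exploit.
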
Conclusion
◮ We are not trying to solve the password problem! This war is lost unless you use randomly generated 15-character passwords.
◮ Our goal is to find out how to implement a blacklist efficiently. The only literature available is related to password meters.
◮ It is not easy to use memory-efficient solutions. There is an ongoing need for good locality-sensitive hash functions for passwords.