Password classification
Tiko Huizinga
Supervisor: Zeno Geradts, Nederlands Forensisch Instituut (NFI)

Example case
● Police confiscate hard drives
● Fast (automatic) analysis of the data is needed
● Saved plain-text passwords can be very useful

Hansken
● Search engine for the Dutch police and forensic institute
● Machine learning and image classification
● No password classification yet
    ○ This is where my research comes in

Research question
● How can software be used to classify whether a string is a password or a “normal” word?

Scope
● The input for the tool is text files containing one or multiple words
● A word is the string between a starting and ending space or newline
● As a result, the tool does not classify passwords containing a space
● English language is used for training the tool
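The word definition above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual code: the function name `extract_words` and the sample text are made up, and `str.split()` also splits on tabs, which the scope does not mention.

```python
def extract_words(text: str) -> list[str]:
    """Split text into words delimited by spaces or newlines."""
    # str.split() without arguments splits on any run of whitespace
    # and drops empty strings. Assumption: tabs count as separators too.
    return text.split()


# Hypothetical file content; the passphrase "correct horse" contains a space.
sample = "login: admin\npass: Tr0ub4dor&3 correct horse"
words = extract_words(sample)
print(words)
```

The passphrase is split into two separate words, which shows why passwords containing a space fall outside the scope.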

Method
● Gather data
    ○ Password list
    ○ Word list
● Generate statistics
    ○ Length, #Digits, #Special characters, …
● Create naive probabilistic classification tool
● Use machine learning to create classification tool
    ○ Support Vector Machine (SVM)
● Evaluate both tools
    ○ Precision, Accuracy, F1-Score

Data gathering
Started with
● Common passwords and English wordlist
    ○ Common credential list
    ○ English dictionary wordlist
● Too ‘boring’
    ○ Not a lot of special characters and no unique passwords
● New password list
    ○ Breach compilation
    ○ Unique passwords
● New word list
    ○ Partial Wikipedia dump
    ○ Represents text files on computers

Examples of common passwords: 123456, password, 12345678, qwerty
Examples of English wordlist entries: abac, abaca, abacay, abacas

Generate statistics
● Gather characteristics for all words
    ○ Length
    ○ # Special characters
    ○ # Digits
    ○ # Capital letters
    ○ # Small letters
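A minimal Python sketch of how these five characteristics could be computed for a single word. The slides do not define "special character"; treating ASCII punctuation (`string.punctuation`) as special is an assumption, and the function name `characteristics` is illustrative.

```python
import string


def characteristics(word: str) -> dict[str, int]:
    """Count the five characteristics listed above for one word."""
    return {
        "length": len(word),
        # Assumption: "special" means ASCII punctuation.
        "special": sum(1 for ch in word if ch in string.punctuation),
        "digits": sum(1 for ch in word if ch.isdigit()),
        "capitals": sum(1 for ch in word if ch.isupper()),
        "small": sum(1 for ch in word if ch.islower()),
    }


stats = characteristics("Tr0ub4dor&3")
print(stats)
```

Running this over every entry in the password list and the word list would give the per-class distributions that the statistics are built from.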

Length of passwords and words (figure)

Number of digits (figure; panels: Passwords, Words)

Naive probabilistic classifier
Class C = {Password, Word}
Characteristics X = {Length, #Special characters, #Digits, #Capital letters, #Small letters}
pw(x) = number of passwords with characteristic x / total number of passwords
w(x) = number of words with characteristic x / total number of words

Naive probabilistic classifier
● If result >= 0.5
    ○ Classify as password
● Else
    ○ Classify as word
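The formula that produces `result` is not preserved in this text. The sketch below therefore rests on an assumption: it multiplies pw(x) over the five characteristics to get P, multiplies w(x) to get W, and uses result = P / (P + W). The helper names, the tiny training lists, and the fallback for unseen values are all illustrative.

```python
import string
from collections import Counter


def characteristics(word):
    """Return (length, #special, #digits, #capitals, #small) for a word."""
    return (
        len(word),
        sum(ch in string.punctuation for ch in word),  # assumed definition
        sum(ch.isdigit() for ch in word),
        sum(ch.isupper() for ch in word),
        sum(ch.islower() for ch in word),
    )


def fit(items):
    """Count how often each characteristic value occurs in one class."""
    counts = [Counter() for _ in range(5)]
    for item in items:
        for counter, value in zip(counts, characteristics(item)):
            counter[value] += 1
    return counts, len(items)


def likelihood(word, model):
    """Product of the relative frequencies pw(x) or w(x) for this word."""
    counts, total = model
    prob = 1.0
    for counter, value in zip(counts, characteristics(word)):
        prob *= counter[value] / total
    return prob


def classify(word, pw_model, w_model):
    p = likelihood(word, pw_model)
    w = likelihood(word, w_model)
    if p + w == 0:
        return "word"  # assumption: values unseen in both classes default to word
    result = p / (p + w)  # assumed combination rule, not taken from the slides
    return "password" if result >= 0.5 else "word"


# Toy training data for illustration only.
pw_model = fit(["123456", "qwerty1!", "Passw0rd", "12345678"])
w_model = fit(["abac", "abaca", "abacay", "abacas"])
print(classify("12345678", pw_model, w_model))
print(classify("abaca", pw_model, w_model))
```

With lists this small, most unseen strings hit a zero frequency in at least one class; a realistic setup would need large lists or some smoothing.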

Support Vector Machine (SVM)
● Machine learning classification
● Divide data in two classes
● Find hyperplane with largest margin
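To make the largest-margin idea concrete, here is a small linear SVM trained with subgradient descent on the regularized hinge loss. It is only a sketch: the slides do not say which SVM implementation, kernel, or parameters were used, and the two features (#digits, #special characters), the toy data, and the hyperparameters are assumptions.

```python
def train_linear_svm(samples, labels, epochs=200, learning_rate=0.01, reg=0.01):
    """Fit weights and bias of a linear SVM; labels must be +1 or -1."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            score = sum(w * xi for w, xi in zip(weights, x)) + bias
            # L2 regularization shrinks the weights, which widens the margin.
            weights = [w * (1 - learning_rate * reg) for w in weights]
            # Hinge loss: only points inside the margin push the hyperplane.
            if y * score < 1:
                weights = [w + learning_rate * y * xi for w, xi in zip(weights, x)]
                bias += learning_rate * y
    return weights, bias


def predict(weights, bias, x):
    """Return +1 (password) or -1 (word) depending on the side of the hyperplane."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score >= 0 else -1


# Toy features (#digits, #special characters); +1 = password, -1 = word.
samples = [(5, 2), (0, 0), (4, 3), (0, 0), (6, 2), (1, 0)]
labels = [1, -1, 1, -1, 1, -1]
weights, bias = train_linear_svm(samples, labels)
print(predict(weights, bias, (5, 2)), predict(weights, bias, (0, 0)))
```

The toy data is linearly separable, so a separating hyperplane exists; real password and word features overlap, which makes the separation much harder.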

Metrics and evaluation of classifiers
● Confusion matrix

Metrics and evaluation of classifiers
● F1 score
    ○ The harmonic mean of Precision and Recall
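The standard definitions of these metrics can be written out as a short Python sketch. The function names and the confusion-matrix counts are illustrative; the precision and recall pair passed to `f1_score` is the Word row reported for the naive probabilistic classifier.

```python
def precision(tp: int, fp: int) -> float:
    """Fraction of predicted positives that are truly positive."""
    return tp / (tp + fp)


def recall(tp: int, fn: int) -> float:
    """Fraction of actual positives that were found."""
    return tp / (tp + fn)


def f1_score(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)


# Hypothetical confusion-matrix counts: 93 true positives, 7 false negatives.
print(recall(93, 7))
# Precision 0.93 and recall 0.89 give an F1 score that rounds to 0.91.
print(round(f1_score(0.93, 0.89), 2))
```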

Evaluation of classifiers

Naive probabilistic classifier
Class      Precision  Recall  F1-score
Word       0.93       0.89    0.91
Password   0.89       0.93    0.91

SVM
Class      Precision  Recall  F1-score
Word       0.79       0.91    0.85
Password   0.89       0.74    0.80

Conclusion
● How can software be used to classify whether a string is a password or a “normal” word?
    ○ A naive probabilistic classifier achieves good results, with an F1 score of 0.91 for both classes
    ○ A Support Vector Machine trains more slowly and achieves lower F1 scores: 0.80 for passwords and 0.85 for words

Discussion
● The results are very dependent on the training set and test set
● SVM probably scores worse because there is no clear line separating passwords from words
● The lists I used contain only unique words, all with the same weight
    ○ Giving more frequent words a higher weight might bring the model closer to reality

Future work
● Use more characteristics
    ○ Place of special characters in string
● Use different (machine learning) classification algorithms
    ○ Decision trees
    ○ Bayesian networks
    ○ SVM with different parameters

Thank you!
