

  1. No Training Hurdles: Fast Training-Agnostic Attacks to Infer Your Typing. Song Fang*, Ian Markwood†, Yao Liu†, Shangqing Zhao†, Zhuo Lu†, Haojin Zhu‡ (*University of Oklahoma, †University of South Florida, ‡Shanghai Jiaotong University)

  2. Background • Typing via a keyboard plays a very important role in our daily life, and an eavesdropper may well want to know what you are typing.

  3. Existing Non-invasive Attacks • Unlike software- or hardware-based keyloggers, non-invasive attacks need no access to the target machine. • General principle: pressing a key causes subtle environmental impacts unique to that key.

  4. Example Attacks • Exploited disturbances: vibration patterns, acoustic features, wireless signal distortions. • Two phases: a training phase builds a model from labeled training data; an attack phase checks unknown environmental disturbances against the trained model to recover keystrokes.

  5. Why Is Training a Hurdle • Training requires knowledge of the pressed keys, yet the attacker has no physical control of the keyboard. • A user may also change typing behaviors over time.

  6. Statistical Methods • Frequency analysis: match the frequencies of observed disturbances against the letter frequency distribution in English (e, t, a, o, i, n, ...). • Drawback: it requires a large amount of text. [Figure: letter frequency distribution in English]

  7. Question: Is it possible to develop a non-invasive keystroke eavesdropping attack that succeeds within a shorter time? • Idea: the self-contained probabilistic structures of words let the attacker make statistical sense of typing disturbances without per-key training.

  8. Wireless Signal Based Attacks • Advantages: ubiquitous deployment of wireless infrastructures; the invisible nature of radio signals; elimination of the line-of-sight requirement. • CSI (channel state information) quantifies the disturbances: H(f, t) = Y(f, t) / X(f, t), where X(f, t) is the signal sent by a public transmitter (Tx) and Y(f, t) is the signal observed at the receiver (Rx). A minimal sketch of this estimate follows.
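As a hedged illustration (not the paper's implementation), the per-subcarrier CSI estimate from a known pilot can be written in a few lines of NumPy; the array names are hypothetical:

```python
import numpy as np

# Hypothetical sketch: per-subcarrier CSI estimate from a known pilot.
# x_pilot:    known transmitted pilot symbols, shape (subcarriers,)
# y_received: symbols observed at the receiver, shape (subcarriers,)
def estimate_csi(x_pilot: np.ndarray, y_received: np.ndarray) -> np.ndarray:
    """One snapshot of H(f) = Y(f) / X(f)."""
    return y_received / x_pilot

# Stacking snapshots over time yields the CSI time series H(f, t), whose
# fluctuations reflect keystroke-induced channel disturbances.
```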

  9. Outline • Motivation • Attack Design • Experiment Results • Conclusion

  10. System Overview • Pipeline: signal → channel estimation → CSI time series → pre-processing (noise removal, signal reduction) → segmentation → CSI samples → CSI word group generation → dictionary demodulation → alphabet matching → keystrokes. • A CSI sample refers to an individual segment corresponding to the action of pressing a key.

  11. CSI Word Group Generation • Pipeline: CSI samples → classification → sorting → segmentation → CSI word groups. • A CSI word group refers to a group of CSI samples comprising each typed word.

  12. Word Classification (classification → sorting → segmentation) • Classification: compute pairwise similarity between CSI samples and place similar samples into the same set (Set 1, Set 2, ...).

  13. Word Classification (classification → sorting → segmentation) • Sorting: sort the resulting sets (Set 1, ..., Set i, ..., Set N) by size, as sketched below.
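A hedged sketch of the classification and sorting steps; `similarity` is a placeholder for the CSI-sample similarity measure the slides assume:

```python
# Greedily group samples whose similarity to a set's representative exceeds
# a threshold, then sort the sets by size (largest first).
def classify_and_sort(samples, similarity, threshold=0.9):
    sets = []
    for s in samples:
        for group in sets:
            if similarity(group[0], s) >= threshold:  # join a similar set
                group.append(s)
                break
        else:
            sets.append([s])                          # start a new set
    return sorted(sets, key=len, reverse=True)

# Toy usage with exact-match "similarity" over characters:
print(classify_and_sort("abcabca", lambda x, y: float(x == y)))
# [['a', 'a', 'a'], ['b', 'b'], ['c', 'c']]
```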

  14. Word Classification (classification → sorting → segmentation) • Segmentation: space-associated CSI samples split the sample stream over time into CSI word groups, which are passed on to dictionary demodulation; the non-space-associated samples in between form the words themselves (see the sketch below).
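A hedged sketch of segmentation over a keystroke sample stream; `is_space` stands in for however space-associated samples are identified:

```python
# Split the per-keystroke sample stream into CSI word groups at
# space-associated samples.
def split_into_word_groups(samples, is_space):
    groups, current = [], []
    for s in samples:
        if is_space(s):
            if current:
                groups.append(current)
            current = []
        else:
            current.append(s)
    if current:
        groups.append(current)
    return groups

print(split_into_word_groups(list("apple pen"), lambda s: s == " "))
# [['a', 'p', 'p', 'l', 'e'], ['p', 'e', 'n']]
```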

  15. Dictionary Demodulation (DD) • DD takes CSI word groups and a list of English words, and comprises feature extraction, joint demodulation, error tolerance (e.g., typos), and handling of non-alphabetical impact.

  16. Feature Extraction • Length L: the number of constituent letters. • Repetition {L, (t_1, ..., t_r)}: r is the number of distinct letters that repeat, and t_i denotes how many times the corresponding letter repeats. • Inter-element relationship matrix M: M_{ij} = 1 if x_i and x_j are the same or similar, and M_{ij} = 0 otherwise.
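To make the three features concrete, here is a hedged sketch computing them on plain words, as one would for the dictionary side; on the CSI side, character equality would be replaced by a sample-similarity test. The function names are illustrative:

```python
from collections import Counter

def length_feature(word: str) -> int:
    return len(word)

def repetition_feature(word: str):
    # (L, sorted tuple of repeat counts t_i for each repeating letter)
    counts = Counter(word)
    repeats = tuple(sorted(t for t in counts.values() if t > 1))
    return (len(word), repeats)

def relationship_matrix(word: str):
    # M[i][j] = 1 if the i-th and j-th letters are the same, else 0.
    return tuple(tuple(int(a == b) for b in word) for a in word)

print(repetition_feature("apple"))   # (5, (2,)): 'p' repeats twice
print(relationship_matrix("pen"))    # identity pattern: no repeated letters
```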

  17. Feature Extraction (Cont’d) • Dictionary: the top 1,500 most frequently used words [1]. • Each selected English word is mapped to a feature (length, repetition, or relationship matrix), partitioning the dictionary into sets (Set 1, Set 2, ...). [1] Mark Davies. “Word frequency data from the Corpus of Contemporary American English (COCA),” http://www.wordfrequency.info/free.asp.

  18. Feature Extraction (Cont’d) • Uniqueness rate = p / T, where p is the number of sets obtained and T is the number of considered words; a higher uniqueness rate means better partitioning (distinguishability). • Results on the dictionary (feature: uniqueness rate, average set cardinality): Length: 0.009, 107. Repetition: 0.042, 24. Relationship matrix: 0.225, 4.
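A hedged sketch of how such numbers could be reproduced for a given word list; `feature` must return a hashable key (e.g., `len`, or a tuple-based relationship matrix):

```python
from collections import defaultdict

# Partition words by a feature; return (uniqueness rate p/T,
# average set cardinality T/p).
def partition_stats(dictionary, feature):
    sets = defaultdict(list)
    for w in dictionary:
        sets[feature(w)].append(w)
    p, T = len(sets), len(dictionary)
    return p / T, T / p

rate, avg = partition_stats(["pen", "hat", "apple", "offer"], len)
print(rate, avg)  # 0.5 2.0 -- two length-sets over four words
```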

  19. Joint Demodulation • Example: o A dictionary W = {‘among’, ‘apple’, ‘are’, ‘hat’, ‘honey’, ‘hope’, ‘old’, ‘offer’, ‘pen’}. o The user types two words: “apple” and “pen”. 1) Compute R_1, the relationship matrix of the first CSI word group. 2) Compute the relationship matrix for each word in W and compare each with R_1. Candidates: “apple” and “offer”.

  20. Joint Demodulation (Cont’d) 3) Likewise, the candidates for the second CSI word group are {“hat”, “old”, “are”, “pen”}. 4) Concatenate the two CSI word groups and compute the relationship matrix R_new of the combined sequence. 5) The candidate set T of the two-word sequence is {“apple||hat”, “apple||old”, “apple||are”, “apple||pen”, “offer||hat”, “offer||old”, “offer||are”, “offer||pen”}. 6) Generate the relationship matrix for each new candidate in T and compare it with R_new. Final result: “apple||pen”.
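A hedged, self-contained reconstruction of this worked example over plain strings; on real CSI word groups, the equality test would be a sample-similarity test rather than character equality:

```python
from itertools import product

def rel_matrix(word: str):
    return tuple(tuple(int(a == b) for b in word) for a in word)

W = ["among", "apple", "are", "hat", "honey", "hope", "old", "offer", "pen"]
typed = ["apple", "pen"]  # stands in for the two observed CSI word groups

# Steps 1-3: per-word candidates by relationship matrix.
cands = [[w for w in W if rel_matrix(w) == rel_matrix(t)] for t in typed]
print(cands)   # [['apple', 'offer'], ['are', 'hat', 'old', 'pen']]

# Steps 4-6: joint matching on the concatenated sequence.
target = rel_matrix("".join(typed))
final = [pair for pair in product(*cands)
         if rel_matrix("".join(pair)) == target]
print(final)   # [('apple', 'pen')]
```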

  21. Joint Demodulation (Cont’d) • Input: m CSI word groups S = {S_1, S_2, ..., S_m}; a dictionary with q words W = {W_1, W_2, ..., W_q}. • Output: a corresponding phrase of m words. • Observation: each CSI word group maps to multiple candidate words, and each candidate yields <CSI sample, letter> mapping information.

  22. Joint Demodulation (Cont’d) • Step 1: find initial candidate words for each CSI word group. Compare the relationship matrix R of the CSI word group with that of each dictionary word: on a match, add the word as a candidate; if no word matches, add the CSI word group to the “undemodulated set” U.

  23. Joint Demodulation (Cont’d) • Step 2 (iteratively), as sketched below: (a) T_i: the concatenation of the first i-1 demodulated CSI word groups, with candidates {T_i1, T_i2, ..., T_ip}. (b) S_i: the i-th CSI word group, with candidates {S_i1, S_i2, ..., S_iq} (from Step 1). (c) Find new candidates for the concatenated CSI word groups: compare the relationship matrix of T_i || S_i with that of each T_ij || S_ik (1 ≤ j ≤ p, 1 ≤ k ≤ q). On a match, add T_ij || S_ik as a candidate for T_{i+1}; if nothing matches, add S_i to U and skip to S_{i+1}.
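A hedged sketch of Steps 1 and 2 as one loop, again over plain strings standing in for CSI word groups (real matching compares CSI-sample similarity, not characters):

```python
from itertools import product

def rel_matrix(seq: str):
    return tuple(tuple(int(a == b) for b in seq) for a in seq)

def joint_demodulate(word_groups, dictionary):
    prefix = ""          # concatenation of demodulated groups so far (T_i)
    candidates = [""]    # joint candidates for T_i
    undemodulated = []   # the set U
    for group in word_groups:
        # Step 1: per-group candidates by relationship matrix.
        step1 = [w for w in dictionary if rel_matrix(w) == rel_matrix(group)]
        # Step 2(c): keep concatenations whose joint matrix still matches.
        joint = [c + w for c, w in product(candidates, step1)
                 if rel_matrix(c + w) == rel_matrix(prefix + group)]
        if joint:
            prefix, candidates = prefix + group, joint
        else:
            undemodulated.append(group)  # add S_i to U, skip to S_{i+1}
    return candidates, undemodulated

print(joint_demodulate(["apple", "pen"],
                       ["among", "apple", "are", "hat", "honey",
                        "hope", "old", "offer", "pen"]))
# (['applepen'], [])
```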

  24. Joint Demodulation (Cont’d) • Alphabet matching: the <CSI sample, letter> mapping established during demodulation can be applied to the remaining CSI word groups and to those in U. • Example: if the user types “deed” || “would” after the mapping is established, the already-mapped samples reveal the letters directly.

  25. Error / Non-Alphabetical Character Tolerance • Abnormal situations: o CSI classification errors: a CSI sample for one letter is wrongly placed in the set of CSI samples for another letter. o Typos and non-alphabetical characters: the affected CSI word groups match invalid words or have no candidates at all. • Consequence: such a CSI word group is added to the set U; left unhandled, this causes cascading discovery failures.

  26. Outline • Motivation • Attack Design • Experiment Results • Conclusion

  27. Experiment Results • Attack system: o a wireless transmitter + a receiver (each a USRP connected to a PC); o the channel estimation algorithm runs at the receiver to extract the CSI for key inference; o dictionary: the top 1,500 most frequently used words. • Target user: o a desktop computer with a Dell SK-8115 USB wired standard keyboard.

  28. Example Recovery Process • Randomly select 5 sentences from the representative English sentences in the Harvard sentences [2]. Input paragraph: The boy was there when the sun rose. A rod is used to catch pink salmon. The source of the huge river is the clear spring. Kick the ball straight and follow through. Help the woman get back to her feet. Step 1, searching results: The boy/box was there when the sun rose. A *** is used to catch **** *****. The source of the huge river is the clear spring. **** the ball straight and follow through. Help the woman get back to her ****. Step 2, recovering words not in the dictionary: (1) rod; (2) pink; (3) salmon; (4) Kick; (5) feet. [2] IEEE Subcommittee on Subjective Measurements. “IEEE Recommended Practice for Speech Quality Measurements,” IEEE Transactions on Audio and Electroacoustics, vol. 17, no. 3 (Sep 1969), pp. 227–246.

  29. Eavesdropping Accuracy • Word recovery ratio = (# of successfully recovered words) / (total # of input words). • Single article recovery (typing a piece of CNN news): the word recovery ratio climbs toward 1 as the number of typed words grows. [Figure: word recovery ratio vs. number of typed words (0–100)]

  30. Impact of CSI Sample Classification Errors • We artificially introduce errors into the groupings. [Figure: word recovery ratio vs. classification success rate (0.4–1.0) for 1500-, 1000-, and 500-word dictionaries]

  31. Overall Recovery Accuracy • L_{WRR>x} denotes the required number of typed words from each article to satisfy the recovery ratio x. [Figure: empirical CDFs of P(L_{WRR>0.8} < L) and P(L_{WRR>0.9} < L) vs. the number L of typed words (0–60)]

  32. Time Complexity Analysis • The comparison of relationship matrices is the dominant part of the demodulation phase. [Figure: number of new matrix comparisons (log scale, 10^0 to 10^5) vs. number of words (0–50) for 1500-, 1000-, and 500-word dictionaries]

  33. Password Entropy Reduction • The higher the entropy, the more the randomness. • 2012 Yahoo! Voices hack [3]: of 342,508 leaked passwords, 98.42% are 12 characters or fewer. [Figure: ratio of letters by key length (6–12 characters)] [3] 2012 Yahoo! Voices hack. https://en.wikipedia.org/wiki/2012_Yahoo!_Voices_hack

  34. Password Entropy Reduction (Cont’d) • Breaking a 9-character password is reduced to guessing 1–5 non-letter characters.
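As a back-of-the-envelope illustration with our own assumed numbers (not from the paper): assume passwords are drawn uniformly from the 95 printable ASCII characters, 52 of which are letters, leaving 43 non-letter characters. If the attack recovers every letter position, only the k unknown non-letter positions contribute entropy:

```latex
% Illustrative figures under the stated assumptions (95 printable ASCII
% characters, 43 of them non-letters), not taken from the paper.
H_{\text{full}} = 9 \log_2 95 \approx 59.1 \text{ bits}, \qquad
H_{\text{residual}} = k \log_2 43 \approx 5.4\,k \text{ bits}, \quad 1 \le k \le 5.
```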

  35. Outline • Motivation • Attack Design • Experiment Results • Conclusion
