Something about audio CAPTCHAs
Elie Bursztein, Romain Bauxis, Daniele Perito, Hristo Paskov, Celine Fabry, John Mitchell
Elie Bursztein (@elie) http://ly.tl/p18 The Failure of Noise-Based Non-Continuous Audio Captchas
CAPTCHAS
Audio captchas
Outline
• Audio captchas background
• Breaking audio captchas
• Evaluation results
• Demo
Creating audio captcha
[Diagram: voices and noises go into a captcha maker, which produces a "super secure" captcha.]
Type of noise
• Additive noise, e.g. white noise
• Convolutive noise, e.g. echo
• Semantic noise, e.g. music
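The difference between the first two noise types can be sketched in a few lines. This is a minimal illustration, not the generator used by any real captcha scheme: the function names, the `sigma`, `delay` and `gain` parameters, and the toy `click` signal are all assumptions made for the example. Additive noise adds an independent random value to each sample, while an echo is a convolution of the signal with a short impulse response.

```python
import random

def add_white_noise(signal, sigma, seed=0):
    """Additive noise: each output sample is the input plus a Gaussian draw."""
    rng = random.Random(seed)
    return [s + rng.gauss(0.0, sigma) for s in signal]

def add_echo(signal, delay, gain):
    """Convolutive noise: convolve with the impulse response [1, 0, ..., 0, gain]."""
    out = list(signal) + [0.0] * delay
    for i, s in enumerate(signal):
        out[i + delay] += gain * s  # delayed, attenuated copy of the input
    return out

# Hypothetical toy input: a single click followed by silence.
click = [1.0, 0.0, 0.0, 0.0]
echoed = add_echo(click, delay=2, gain=0.5)
noisy = add_white_noise(click, sigma=0.1)
```

Semantic noise (music, speech) is additive as well; what changes is that the added signal is itself meaningful sound rather than a random draw.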
Noise intensity (RMS/SNR)
[Figure: example captcha waveforms with their characters labelled: J 5 H K (Authorize), J A K (Digg), 2 9 0 0 (Microsoft).]
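RMS amplitude and SNR are standard measures, and a short sketch shows how they relate. The helper names below and the 440 Hz test tone are assumptions for illustration; the 8000 Hz sample rate is taken from the captcha features slide. The SNR in dB is 20·log10 of the ratio between signal RMS and noise RMS, so noise can be rescaled to hit any target SNR.

```python
import math
import random

def rms(x):
    """Root-mean-square amplitude of a list of samples."""
    return math.sqrt(sum(v * v for v in x) / len(x))

def snr_db(signal, noise):
    """Signal-to-noise ratio in decibels, computed from RMS amplitudes."""
    return 20.0 * math.log10(rms(signal) / rms(noise))

def scale_noise_to_snr(signal, noise, target_db):
    """Rescale `noise` so that snr_db(signal, result) equals target_db."""
    gain = rms(signal) / (rms(noise) * 10 ** (target_db / 20.0))
    return [gain * n for n in noise]

# Hypothetical one-second test tone and white Gaussian noise at 8000 Hz.
rng = random.Random(0)
signal = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(8000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(8000)]

scaled = scale_noise_to_snr(signal, noise, target_db=5.0)
mixed = [s + n for s, n in zip(signal, scaled)]  # noisy signal at 5 dB SNR
```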
Sound representation
[Diagram: WAV, DFT, Cep, TFR, TCR, TDC]
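Three of these representations can be sketched with the standard library alone, assuming that "Cep" stands for the cepstrum and "TFR" for a time-frequency representation (a spectrogram); the slide does not expand the abbreviations, and TCR and TDC are not covered here. The function names, frame sizes and test tone are hypothetical, and the naive O(n²) DFT is only meant for tiny inputs.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform of a list of real samples."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def real_cepstrum(x, eps=1e-12):
    """Inverse DFT of the log magnitude spectrum (eps avoids log(0))."""
    n = len(x)
    log_mag = [math.log(abs(c) + eps) for c in dft(x)]
    return [sum(log_mag[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]

def spectrogram(x, frame, hop):
    """Magnitude spectrum of successive frames: one row per frame."""
    return [[abs(c) for c in dft(x[s:s + frame])[:frame // 2 + 1]]
            for s in range(0, len(x) - frame + 1, hop)]

# Hypothetical test tone: 4 cycles over 32 samples, so the energy sits in bin 4.
tone = [math.cos(2 * math.pi * 4 * t / 32) for t in range(32)]
mags = [abs(c) for c in dft(tone)]
peak_bin = max(range(17), key=lambda k: mags[k])
cep = real_cepstrum(tone)
frames = spectrogram(tone * 4, frame=32, hop=16)
```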
Breaking audio captchas
Solving an audio captcha
[Figure: captcha waveform, with the labels C T A 2 T R R A F S added step by step.]
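In a non-continuous captcha the characters are spoken separately, so a solver can first cut the recording into candidate segments and then recognize each one. The slide does not say how the cutting is done; the sketch below assumes a simple energy threshold over fixed-size frames, and the function names, frame size, threshold and toy signal are all hypothetical.

```python
def frame_energy(x, frame):
    """Mean squared amplitude of consecutive non-overlapping frames."""
    return [sum(v * v for v in x[i:i + frame]) / frame
            for i in range(0, len(x) - frame + 1, frame)]

def segment(x, frame, threshold):
    """Return (start, end) sample indices of runs of frames above threshold."""
    energies = frame_energy(x, frame)
    segments, start = [], None
    for i, e in enumerate(energies):
        if e > threshold and start is None:
            start = i * frame                      # a loud run begins
        elif e <= threshold and start is not None:
            segments.append((start, i * frame))    # the loud run ends
            start = None
    if start is not None:                          # signal ended while loud
        segments.append((start, len(energies) * frame))
    return segments

# Hypothetical toy signal: two loud bursts separated by near-silence.
quiet, loud = [0.01, -0.01], [0.5, -0.5]
x = quiet * 50 + loud * 100 + quiet * 50 + loud * 50 + quiet * 25
found = segment(x, frame=50, threshold=0.01)
```

Each `(start, end)` pair would then be handed to the recognition step. Background noise raises the energy floor, which is one way noise can hurt segmentation.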
Dealing with random noise
• Statistical learning
• Supervised learning
• RLS (regularized least squares) classifier
[Figure: waveforms of "5" from Authorize, eBay and Recaptcha, and of "J" from Authorize and Digg.]
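A regularized least squares classifier can be sketched as ridge regression against one-vs-all targets: solve (XᵀX + λI)W = XᵀY and pick the class with the highest score. This is a generic textbook formulation under assumed settings, not necessarily the exact variant used in the talk; the two-dimensional features, the `lam` value and the helper names are hypothetical.

```python
def solve(a, b):
    """Solve a @ x = b (square a, matrix b) by Gauss-Jordan elimination."""
    n = len(a)
    m = [ra[:] + rb[:] for ra, rb in zip(a, b)]    # augmented matrix [a | b]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]        # partial pivoting
        m[col] = [v / m[col][col] for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [row[n:] for row in m]

def train_rls(xs, labels, classes, lam=1e-3):
    """Weights W solving (X^T X + lam I) W = X^T Y with +1/-1 targets."""
    d = len(xs[0])
    gram = [[sum(x[i] * x[j] for x in xs) + (lam if i == j else 0.0)
             for j in range(d)] for i in range(d)]
    xty = [[sum(x[i] * (1.0 if y == c else -1.0) for x, y in zip(xs, labels))
            for c in classes] for i in range(d)]
    return solve(gram, xty)

def predict(w, x, classes):
    """Return the class whose linear score is highest."""
    scores = [sum(x[i] * w[i][k] for i in range(len(x)))
              for k in range(len(classes))]
    return classes[scores.index(max(scores))]

# Hypothetical 2-D features standing in for real sound representations.
classes = ["5", "J"]
xs = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
labels = ["5", "5", "J", "J"]
w = train_rls(xs, labels, classes)
guess_a = predict(w, [0.85, 0.15], classes)
guess_b = predict(w, [0.15, 0.85], classes)
```

Training is supervised: each segment needs a human-provided label, which is why labelled captchas are required before the classifier can be used.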
Solver efficiency
Solver accuracy = Coverage * Precision^length
• Coverage: segmentation
• Precision: recognition rate
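The formula can be evaluated directly. The sketch below plugs in the Authorize figures from the Results slide (length 5, coverage 100%, digit precision 97%); the function name is hypothetical. Raising the precision to the power of the length treats recognition errors on individual characters as independent, so the result is an estimate and need not equal a measured per-captcha rate.

```python
def solver_accuracy(coverage, precision, length):
    """Estimated fraction of captchas solved: coverage * precision ** length."""
    return coverage * precision ** length

# Authorize figures from the Results slide, expressed as fractions.
estimate = solver_accuracy(coverage=1.00, precision=0.97, length=5)
print(f"estimated solver accuracy: {estimate:.1%}")
```

Because the precision is raised to the captcha length, a small drop in per-character recognition is amplified quickly on long captchas.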
Decaptcha
Decaptcha overview
[Diagram components: web site, captcha scraping, sound processing, discretized and segmented captcha, Mechanical Turk users, captcha labels, classifier, answers.]
Testing corpus
Family | Name | Description
Constant noise | white | White Gaussian noise.
Constant noise | buzz | Sine waves at 700 Hz, 2100 Hz and 3500 Hz.
Regular noise | pow | 10 ms bursts of white Gaussian noise repeated every 100 ms.
Regular noise | rnoise | Every 100 ms, a section of the signal is replaced by white noise of the same RMS amplitude.
Regular noise | lofi | Add distortion, cracks, bandwidth limiting and compression. Simulates old audio equipment.
Regular noise | echo | The signal starts to echo at 0.6, 1.32, and 1.92 seconds.
Regular noise | disintegrator | Amplifies random half-cycles of the signal by a multiplier. Simulates a bad audio channel.
Semantic noise | chopin | Chopin Polonaise for Piano No. 6, Op. 53.
Semantic noise | gregorian | Gregorian chant.
Semantic noise | nina | "Just in time" by Nina Simone.
Table III
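Two of the noises in Table III are simple enough to sketch from their descriptions. The code below is an illustration under assumptions, not the generator used to build the corpus: the 8000 Hz sample rate, unit amplitudes, seed and function names are all hypothetical, and `pow_noise` is so named only to avoid shadowing Python's built-in `pow`.

```python
import math
import random

SAMPLE_RATE = 8000  # Hz; assumed, matches most schemes on the features slide

def buzz(n_samples, freqs=(700, 2100, 3500), rate=SAMPLE_RATE):
    """Sum of sine waves at the three frequencies listed for 'buzz'."""
    return [sum(math.sin(2 * math.pi * f * t / rate) for f in freqs)
            for t in range(n_samples)]

def pow_noise(n_samples, burst_ms=10, period_ms=100, rate=SAMPLE_RATE, seed=0):
    """White Gaussian bursts of burst_ms repeated every period_ms, else silence."""
    rng = random.Random(seed)
    burst = rate * burst_ms // 1000     # samples per burst
    period = rate * period_ms // 1000   # samples between burst starts
    return [rng.gauss(0.0, 1.0) if t % period < burst else 0.0
            for t in range(n_samples)]

one_second_buzz = buzz(SAMPLE_RATE)
one_second_pow = pow_noise(SAMPLE_RATE)
burst_samples = sum(1 for v in one_second_pow if v != 0.0)
```

Either noise would then be scaled to the desired SNR and added to the clean speech before testing the solver.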
Synthetic evaluation
[Figure: per-captcha precision (%) versus SNR (dB, from 20 down to -5), one curve per noise: white, buzz, gregorian, nina, chopin, pow, echo, lofi, rnoise, disintegrator.]
Semantic noise
Captcha features
Scheme | Authorize | Digg | eBay | Microsoft | Recaptcha | Yahoo
Length | 5 | 5 | 6 | 10 | 8 | 7
Type of voice | Female | Female | Various | Various | Various | Child
Background noise | None | Constant (random) | Constant (random) | Constant (random) | Constant (random) | None
Intermediate noise | None | None | Regular (speech) | Regular (speech) | Regular (speech) | Regular (speech)
Charset | 0-9a-z | a-z | 0-9 | 0-9 | 0-9 | 0-9
Avg. duration | 5.0 | 6.8 | 4.4 | 7.1 | 25.3 | 18.0
Sample rate | 8000 8000 8000 8000 8000 8000 22050
Beep | no | no | no | no | no | yes
Results
Scheme | Length | Coverage (%) | Digit (%) | Captcha (%)
Authorize | 5 | 100 | 97 | 89.2
Digg | 5 | 100 | 76 | 41.4
eBay | 6 | 85.6 | 92.5 | 82.9
Microsoft | 10 | 80.6 | 89.6 | 48.9
Recaptcha | 8 | 99.9 | 40.5 | 1.5
Yahoo | 7 | 99.1 | 74.7 | 45.4
Recaptcha semantic noise
[Figure: level in dB (0 to -70) versus time (axis from 0 to 200, labelled "Time in seconds") for a Recaptcha sample, annotated with the digit labels 0 3 4 7 9 2 1 5 and two segments marked N.]
Confusion matrices
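A confusion matrix counts how often each true character is recognized as each possible output, which shows which characters the classifier mixes up. The sketch below is generic; the function name and the label sequences are made up for illustration and are not results from the talk.

```python
from collections import Counter

def confusion_matrix(truth, predicted):
    """Count (true label, predicted label) pairs."""
    return Counter(zip(truth, predicted))

# Hypothetical labels, not measured data.
truth = list("5599JJ")
predicted = list("5959JJ")
cm = confusion_matrix(truth, predicted)

for (t, p), count in sorted(cm.items()):
    print(f"true {t} -> predicted {p}: {count}")
```

Off-diagonal entries such as `("5", "9")` are the confusions; a perfect recognizer would only have entries where both labels match.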
How many captchas do you need?
[Figure: per-captcha precision (%) versus corpus size in digits (log scale, 10^2 to 10^4), one curve per scheme: Authorize, Digg, Ebay, MSLive, Recaptcha, Yahoo.]
Conclusion
• Noise-based non-continuous audio captchas are broken
• There is an urgent need to come up with the next generation of audio captchas
Questions? Thanks
http://ly.tl/p18
Twitter: @elie