CS682 - Advanced Security Topics Instructor: Elias Athanasopoulos APTCHA I am Andreas Charalampous, April 2020
Contents 1. Introduction to Captcha 2. Paper 1: Re: Captchas – Understanding Captcha-Solving Services in an economic context 3. Paper 2: I am Robot: (DEEP) Learning to break Semantic Image Captchas
1. Introduction to Captcha i. Motivation ii. Definition iii. Type of Captcha Challenges iv. reCaptcha
Motivation • By automating fraud with bots, attackers can operate at scale. • Fake registrations: creating multiple accounts automatically. • Comment and posting spam. • Purchase of tickets. • Any resource that has to be guarded. • A defense mechanism is needed to tell computers and humans apart, letting humans in and keeping spammers away from resources.
Definition of Captcha • Captcha: Completely Automated Public Turing test to tell Computers and Humans Apart. • Reverse Turing Test. • Term coined by Luis von Ahn, Manuel Blum, Nicholas J. Hopper, and John Langford in 2003. • Captchas protect open Web resources from being exploited at scale. • Challenge-response test to determine whether the user is human or not. • A Captcha challenge must make bots fail while remaining easy for humans to solve. • A human needs approximately 10 seconds to solve a typical Captcha.
Type of Captcha Challenges • First version of Captcha (v.1) is the “twisted text”, made in 1997. • Earliest commercial use by idrive.com and PayPal in 2000 and 2001 respectively. • Math problem captchas. • Audio captchas. • Picture captchas.
Type of Captcha Challenges • Drag-And-Drop Captcha • Trivial Captcha • Advertisement Captcha • SlideLock Captcha • Game Captcha
reCaptcha • Was developed by Luis von Ahn, David Abraham, Manuel Blum, Michael Crawford, Ben Maurer, Colin McMillen, and Edison in May 2007. • It was acquired by Google in September 2009. • Used for digitization of The New York Times archives and books from Google Books. • Two of the reCaptcha challenges are image and distorted text identification.
No Captcha ReCaptcha • Developed in 2014. • Consists of a checkbox where the user is asked to just click it. • Performs behavioral analysis on the browser predicting if the user is human or not. • Easier for humans. • “Harder” for bots.
Evolution and Variety in Captchas • Captchas have been evolving for more than 20 years and will keep doing so. • There are many different kinds of captcha challenges. • They keep being improved to be easier for users and more difficult for bots. • They provide accessibility options for users with impairments. • Captchas keep being bypassed by automation software or solver services, creating an arms race between solvers and providers.
2. Paper 1: Re: Captchas – Understanding Captcha-Solving Services in an economic context i. Introduction ii. What is examined in this paper iii. Automated Software Solvers iv. Human Solver Services v. Conclusion
Introduction • Captchas have attached economic value to the problem of solving them, creating an industrial market in which captcha providers and solvers compete. • Providers face two types of solvers: • Automated solving technology. • Real-time human labor. • Captchas are evaluated in economic terms.
What is examined in the paper? • How this new market works: • Service quality relative to price. • Solving capacity of the market leaders. • Details about solving services. • How the two categories of solvers work: • Automated solving: • How it evolved. • How the arms race favors the providers (defenders). • Human labor: • Why it surpassed automated solving. • How its cost dropped significantly. • Which Captchas are targeted most.
To further support the study • Interviewed Mr. E., owner of a successful CAPTCHA-solving service. He provided validation of, and insight into, the underlying business processes. • Studied the whole market from all angles. • Purchased solving services from both categories and tested them. • Became part of the human labor pool.
Automated Software Solvers • Use segmentation algorithms and Optical Character Recognition (OCR). • Complex. • Fail to replicate human accuracy. • Advantages: • Near-zero cost per solve; the only cost is in creating the solver. • Near-infinite capacity. • Tested Xrumer and reCaptchaOCR.
Xrumer • Software for spamming, mostly forums and comment sections. • Integrated support for bypassing many different anti-spam mechanisms, including Captcha. • Available since 2006; in 2010 it cost $540. The authors purchased it for evaluation. • In 2008 it was capable of solving the Captchas of major message boards.
Xrumer Tests • Tested on a netbook with a 1.6 GHz Intel Atom processor. • On all but one captcha type it scored 100% accuracy, requiring 1 second or less per Captcha. • Only on phpBB, which uses the GD Captcha generator with foreground noise, did it score 35% accuracy, requiring 6-7 seconds per captcha. • Even though the scores are impressive, a couple of months later these captchas were updated, defeating Xrumer.
reCaptchaOCR • Created in December 2009. • Focused on reCaptcha. • Developed to defeat early 2008 reCaptchas. • Was able to defeat late 2009 reCaptchas. • In early 2010 reCaptcha was updated and reCaptchaOCR was unable to defeat it. [Figure: sample reCaptchas from (a) early 2008, (b) late 2009, (c) early 2010]
reCaptchaOCR Tests • Tested on a netbook with a 2.13 GHz Intel Core 2 Duo processor. • Uses iteration to improve accuracy. • With 613 iterations: • 100 (a) captchas scoring 30%. • 100 (b) captchas scoring 18%. • Average 105 seconds per challenge. • With 75 iterations: • 100 (a) captchas scoring 29%. • 100 (b) captchas scoring 17%. • Average 12 seconds per challenge.
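The two settings trade accuracy against time. One rough way to compare them is the expected time per correctly solved challenge, that is, average seconds per attempt divided by accuracy. The sketch below applies this to the (a) captcha figures listed above. The function name is illustrative, and the calculation assumes independent attempts, which the slides do not state.

```python
def seconds_per_correct_solve(avg_seconds: float, accuracy: float) -> float:
    """Expected seconds spent per correct answer, assuming independent attempts."""
    if not 0 < accuracy <= 1:
        raise ValueError("accuracy must be in (0, 1]")
    return avg_seconds / accuracy


# Figures for the (a) captchas, taken from the test results above.
full = seconds_per_correct_solve(105, 0.30)  # 613 iterations
fast = seconds_per_correct_solve(12, 0.29)   # 75 iterations

print(f"613 iterations: {full:.0f} s per correct solve")
print(f"75 iterations:  {fast:.1f} s per correct solve")
```

Under these assumptions the 75-iteration setting is roughly eight times faster per correct answer, at the cost of about one percentage point of accuracy.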
Conclusion • Arms races traditionally favor the attacker. Here attackers have the more challenging recognition problem, while providers can be agile. • Economics of automated solving are driven by several factors: • Cost of developing new solvers. • Accuracy of those solvers. • Responsiveness of the sites whose captchas are attacked.
Human Solver Services • Instead of using automated solving software, the captcha workload is given to humans to solve. • Opportunistically. • On a “for hire” basis.
Opportunistic Solving • An individual solves a Captcha as part of some other task. • An attacker controlling a popular Website might use its visitors to solve third-party Captchas by presenting them as the visitor’s own challenge. • Did not play a major role in the market.
Paid Solving • Core of the CAPTCHA-solving ecosystem. • Services are paying individuals to solve captchas. • Price is calculated as $X/1000, where X is the amount paid for solving 1000 Captchas. • An advertisement in 2006 was looking for a full-time CAPTCHA solver for $10/1000.
Workers all around the world [Figure: items numbered 1 to 8; visible labels: demenoba, decaptcher.com, PixProfit, DeCaptcher, Pictures are life]
Paid Solving Evolution • From 2007 to 2010 the market expanded while wages declined. • 2007: $10/1000. • Mid-2008: $1.5/1000. • Mid-2009: $1/1000. • 2010: $0.75/1000 – $0.5/1000. • Solving is an unskilled activity. • Services preferred labor from Eastern Europe, Bangladesh, China, India, Vietnam. • Competition drove wages down even further.
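To make the $X/1000 convention concrete, the sketch below converts the listed prices into a per-captcha amount and the relative drop since 2007. The dictionary keys and the helper name are illustrative, and the 2010 entry takes the lower end of the quoted range as an assumption.

```python
# Prices in dollars per 1000 solved captchas, as listed above.
# Assumption: the 2010 entry uses the lower end of the $0.75 to $0.5 range.
price_per_1000 = {
    "2007": 10.0,
    "mid-2008": 1.5,
    "mid-2009": 1.0,
    "2010": 0.5,
}


def per_captcha(price: float) -> float:
    """Convert a $X/1000 price into dollars per single captcha."""
    return price / 1000


def drop_since(baseline: float, price: float) -> float:
    """Percentage decline of `price` relative to `baseline`."""
    return (1 - price / baseline) * 100


baseline = price_per_1000["2007"]
for period, price in price_per_1000.items():
    print(f"{period}: ${per_captcha(price):.5f} per captcha, "
          f"{drop_since(baseline, price):.0f}% below 2007")
```

With these inputs, a single solve goes from one cent in 2007 to a twentieth of a cent in 2010.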
Solver Service Quality • Evaluated 8 paid services: • Antigate https://anti-captcha.com/ • BeatCaptchas https://beatcaptchas.com.cutestat.com/ • BypassCaptcha http://bypasscaptcha.com/ • CaptchaBot http://www.captchabot.com/ • CaptchaBypass – ceased operation during evaluation • CaptchaGateway – ceased operation during evaluation • DeCaptcher https://de-captcher.com/ • ImageToText – ceased operation • Based on: 1. Customer interface 2. Solution accuracy 3. Response time 4. Capacity 5. Load and availability
Verifying Results • For each captcha, the most frequent solution from the solvers is taken as correct. • If several solutions tie for most frequent, all answers are treated as incorrect. • Heuristic evaluation: • 1025 randomly selected captchas that had at least one solution were checked manually. • 1009 correct. • 16 incorrect. • 6 of them because of character similarities (zero versus letter O (0 – o), six versus letter B (6 – b)).
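The verification rule above amounts to plurality voting with ties rejected. A minimal sketch follows; the function name is hypothetical, and lowercasing and trimming answers before counting is an assumption made for illustration, not something the slides specify.

```python
from collections import Counter
from typing import Optional


def plurality_solution(answers: list[str]) -> Optional[str]:
    """Return the single most frequent answer, or None on a tie or empty input."""
    # Assumption: answers are compared case-insensitively after trimming.
    counts = Counter(a.strip().lower() for a in answers if a.strip())
    if not counts:
        return None
    ranked = counts.most_common(2)
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None  # tie for most frequent: no answer is accepted
    return ranked[0][0]


print(plurality_solution(["x7kq", "x7kq", "x7kg"]))  # clear plurality
print(plurality_solution(["b0at", "boat"]))          # tie between two answers
```

Returning None for a tie mirrors the rule that no answer is accepted when there is no unique most frequent solution.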
Customer Account Creation • All of them required prepayment. • Antigate and DeCaptcher offer bidding systems for higher-priority access when load is high. • For most services, account registration is accomplished via Web and email. • Some of them presented obstacles during registration: • CaptchaBot and Antigate required third-party invitation codes. • Antigate guards against Western users and required entering the name of the Prime Minister in Cyrillic. • Some of them, like ImageToText, required a live phone call.
Evaluation Details • Tested as a customer for about five months using captchas from 25 popular sites, including PayPal, eBay, and Google. • Submitted a single Captcha every five minutes to all services, recording the time submitted.