ReCAPTCHA.net Now Vulnerable to Algorithmic Attack

Posted by timothy
from the bless-you! dept.
n3ond4x writes "reCAPTCHA.net algorithms have been developed to solve the current CAPTCHA at an efficacy of 30%. The algorithms were disclosed at DEFCON 18 over the weekend and have since been made available online. Also available is a video demonstration of random reCAPTCHA.net CAPTCHAs being subjected to the algorithms." There's probably an excellent Firefox plugin to render this page's color scheme more bearable. Note: the PowerPoint presentation linked opens fine in OpenOffice, and the video speaks for itself.
This discussion has been archived. No new comments can be posted.

  • Human Success? (Score:5, Insightful)

    by Anonymous Coward on Thursday August 05, 2010 @05:03PM (#33154776)

    So what is the average human success rate? I think mine is only about 50%

  • Bad Hacking (Score:5, Insightful)

    by pz (113803) on Thursday August 05, 2010 @05:16PM (#33154910) Journal

    Why would anyone want to do this? It's like attacking the UN peacekeeping troops or the Red Cross. reCAPTCHA is doing good work, digitizing scanned printed books so that the text can be made available for online searching. Breaking reCAPTCHA is like defecating in the village well, ensuring that everyone suffers. No one benefits from reCAPTCHA being broken. No one.

  • by Anonymous Coward on Thursday August 05, 2010 @05:23PM (#33154974)

    This won't happen. Many current CAPTCHAs are already hard to solve for humans, and increasing the computational cost to solve a CAPTCHA will also make it harder to solve for humans.

    Now, the problem is, computers are getting more powerful every day, while humans aren't. Sooner or later, this simple fact will render CAPTCHAs useless.

  • Re:Bad Hacking (Score:5, Insightful)

    by Dhalka226 (559740) on Thursday August 05, 2010 @05:31PM (#33155094)

    No one benefits from reCAPTCHA being broken. No one.

    Spammers.

  • Re:Bad Hacking (Score:5, Insightful)

    by maxume (22995) on Thursday August 05, 2010 @05:32PM (#33155106)

    Actually, it could be of use to reCAPTCHA: they can just pass their test words through this system before they make them public and then use the output to help prevent similar attacks.

  • Re:Bad Hacking (Score:4, Insightful)

    by Flyne (1082975) on Thursday August 05, 2010 @05:42PM (#33155220)
    The problem of breaking reCAPTCHA is precisely the same problem as increasing computer OCR abilities, since reCAPTCHA by design uses words which current OCR abilities are inadequate for. This is a good thing for AI and computer vision and text digitization.
  • Re:Bad Hacking (Score:5, Insightful)

    by sbayless (1310131) on Thursday August 05, 2010 @05:58PM (#33155364)

    No one benefits from reCAPTCHA being broken. No one

    You couldn't be more wrong. Sure, breaking reCAPTCHA would create a headache for website admins (including me, for example), but in order to break reCAPTCHA someone has to devise a better text recognition program. And that's great news! This is an example of a general side effect of the cat-and-mouse game that is captchas. Captchas are a simple form of Turing Test, where website admins are trying to determine who is a computer and who is a real human being. Every time a captcha gets broken, we get a sophisticated new algorithm for doing something that previously only humans could do (or only humans could do well, at least).

  • by mwvdlee (775178) on Thursday August 05, 2010 @06:02PM (#33155406) Homepage
    When it is claimed to be 30% accurate, I'd expect some 30% of all captchas to be correctly guessed. Watching the video, I noticed the algorithm gives itself 30-40% scores for getting just one of the two words right, or sometimes even for getting the right length and a few correct letters. Didn't watch it to the end, but in the few minutes I watched, ZERO entire captchas were solved. So that's ZERO% accurate in my book. For instance, actual captcha text "ware readiness", guessed captcha "votarry rehabbed", reported accuracy 38.24%... how the hell is that over 38% accurate? If you had that level of accuracy when trying to get past a captcha (which is pretty much the definition of it being vulnerable, right?), you wouldn't get past a single captcha. It's 30% accurate if it correctly guesses about 3 out of every 10 captchas, not if it fails every single captcha.
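
The distinction drawn above, between a per-guess similarity score and the fraction of captchas actually solved, can be sketched as follows. This is only an illustration: the scoring formula used in the video is not given in the thread, so `similarity` here uses Python's `difflib` ratio as a stand-in, and both function names are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(actual: str, guess: str) -> float:
    """Character-level similarity in [0, 1].

    Assumption: a stand-in metric, NOT the formula from the DEFCON demo.
    """
    return SequenceMatcher(None, actual, guess).ratio()


def solve_rate(pairs: list[tuple[str, str]]) -> float:
    """Fraction of captchas where the whole guess matches exactly."""
    if not pairs:
        return 0.0
    solved = sum(1 for actual, guess in pairs if actual == guess)
    return solved / len(pairs)


# The example pair quoted in the comment above.
pairs = [("ware readiness", "votarry rehabbed")]

for actual, guess in pairs:
    # A partial-credit score can be well above zero for a failed guess.
    print(f"{actual!r} vs {guess!r}: similarity {similarity(actual, guess):.2%}")

# Only exact matches would get past the captcha.
print(f"captchas actually solved: {solve_rate(pairs):.0%}")
```

A guess can earn partial credit under a similarity metric while the solve rate, which is what decides whether a bot gets through, stays at zero.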
  • Re:Hmm (Score:3, Insightful)

    by Monkeedude1212 (1560403) on Thursday August 05, 2010 @06:06PM (#33155430) Journal

    I'm glad YOUR common sense kicked in before hundreds of others.

  • Re:far from it (Score:3, Insightful)

    by retchdog (1319261) on Thursday August 05, 2010 @06:15PM (#33155498) Journal

    Interesting. If this is true as stated, and one knew/modeled OCR performance, you could use this information in some cases to pick out the plum and boost the crack...

  • Re:Bad Hacking (Score:2, Insightful)

    by mysidia (191772) on Thursday August 05, 2010 @07:09PM (#33155904)

    reCaptcha, and indeed all Captchas, have a fundamental flaw.... advances in computer vision will eventually render them all obsolete.

    Most of the CS knowledge is already around to totally defeat captchas of this sort... it's only an Engineering question. They will most likely get broken when sufficiently unethical engineers are hired by sufficiently wealthy spammers.

    It's basically a known fact that spammers will eventually break conventional captchas totally, by developing algorithms to guess captcha answers. It's only a question of when, and how long it will take them to figure out all the systems that matter.

    This does not mean it is a respectable thing for people to specifically target Captcha and attempt to hasten its demise.

    reCaptcha is a big one... but there are other Captcha systems that matter (like Google's).

    And there are other ways around them besides software algorithms... Amazon-style mech turk, for example... find a few thousand folks in certain countries to pay $0.05/hour for breaking captchas, and suddenly reCaptcha is no longer a boundary.

  • Re:Bad Hacking (Score:4, Insightful)

    by Timmmm (636430) on Thursday August 05, 2010 @07:10PM (#33155906)

    The problem of breaking reCAPTCHA is precisely the same problem as increasing computer OCR abilities

    No it isn't. Well, not unless you read books with wavy crossed-out words and don't mind 30% accuracy.

  • Re:Bad Hacking (Score:2, Insightful)

    by mysidia (191772) on Thursday August 05, 2010 @07:14PM (#33155946)

    Except the algorithm doesn't really do that... to defeat the captcha, it only needs to get it right about 10 or 20% of the time, to give the malicious script a "good enough guess" to brute-force the Captcha with 5 or 6 retries.

    As long as the number of retries is less than what a fair percentage of humans require....
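
The retry argument above can be made concrete with a little arithmetic. This is a rough sketch under a stated assumption: each attempt is an independent try at a fresh captcha with the same success probability `p`, which the thread does not establish. The function name `success_within` is hypothetical.

```python
def success_within(p: float, attempts: int) -> float:
    """Probability of at least one success in `attempts` independent tries.

    Assumption: every try succeeds with the same probability p,
    independently of the others.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if attempts < 0:
        raise ValueError("attempts must be non-negative")
    # Complement of failing every single attempt.
    return 1.0 - (1.0 - p) ** attempts


# The per-attempt rates and retry counts mentioned in the comment above.
for p in (0.10, 0.20):
    for attempts in (5, 6):
        chance = success_within(p, attempts)
        print(f"p={p:.0%}, {attempts} tries -> {chance:.1%} chance of getting in")
```

Under these assumptions, a 20% per-attempt solver gets through within 6 tries roughly 74% of the time, and a 10% solver gets through within 5 tries about 41% of the time.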

  • by martin-boundary (547041) on Thursday August 05, 2010 @08:08PM (#33156492)
    Google books isn't really public, though. You can only view a small number of pages of each book, which is pretty useless from the point of view of public uses that come to mind.
  • Re:My eye's... (Score:3, Insightful)

    by Peach Rings (1782482) on Thursday August 05, 2010 @09:08PM (#33157060) Homepage

    By the way, that wasn't just a facetious comment. TFA isn't a serious paper. It's not even typeset, just typed into Microsoft Word. And god knows why I'm being warned about VBScript macros when I try to open it.

    And this isn't a case where the little guy is making real scientific progress right under the nose of the obsolete establishment. The author doesn't even have a freshman understanding of big-O notation; it's completely juvenile.

  • by bill_mcgonigle (4333) * on Thursday August 05, 2010 @11:16PM (#33157860) Homepage Journal

    So it's for-profit work for the biggest advertising firm in the world.
    Sort of expected project gutenberg or something.

    Google's digitizing hundreds of thousands of historic books from some of the great university libraries. What's the problem here, that they won't lose money on the effort?

    The NYT archive has been done for at least a year; it made reCAPTCHA a feasible company.

  • by mrnobo1024 (464702) on Thursday August 05, 2010 @11:38PM (#33157958)

    The spammers can just choose a random option until they get in. All that will do is slow them down a bit.

  • Re:Human Success? (Score:4, Insightful)

    by Kalriath (849904) on Thursday August 05, 2010 @11:41PM (#33157972)

    Yeah, I agree with this. Recaptcha is one of the easiest out there.

    Admittedly though, I have around about 3% success rate with vBulletin captchas. Hear that forum owners? I'm not joining your forum because I can't read your captcha!

  • by Sparr0 (451780) <sparr0@gmail.com> on Friday August 06, 2010 @03:44AM (#33158838) Homepage Journal

    The problem is that since you are *probably* solving the verification words with higher accuracy to begin with, you are actually poisoning the data being gathered regarding the book words. So, while a book word becoming a verification word based on your "solutions" will keep your solution rate constant, it actually damages the system when it comes time for humans to solve the CAPTCHA, or worse when the solutions are used as OCR corrections.

    To clarify, given a classically OCR-able "foo" and a non-OCR-able-but-human-readable "bar", a human is expected to recognize the slightly-deformed-by-reCAPTCHA "foo" and is trusted to get "bar" right more often than OCR would. This attack only defeats the deformation applied by reCAPTCHA; it doesn't actually improve the OCR on the non-deformed words, which means you are going to submit an answer of "foo ban" every time this pair is encountered (or "blah ban" for a different scenario), and the reCAPTCHA system is eventually going to decide that the book word really is "ban".
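
The poisoning effect described above can be illustrated with a toy vote count. This is a sketch under a loud assumption: it models the book-word decision as a plain majority vote among submissions that passed the verification word, which is a simplification and not a description of reCAPTCHA's actual mechanism. The function name `consensus` is hypothetical.

```python
from collections import Counter
from typing import Optional


def consensus(votes: list[str]) -> Optional[str]:
    """Most common transcription of the book word, or None with no votes.

    Assumption: simple majority voting, used here purely for illustration.
    """
    if not votes:
        return None
    return Counter(votes).most_common(1)[0][0]


# Humans read the book word correctly as "bar".
human_votes = ["bar"] * 3

# The automated solver passes the verification word "foo" but
# misreads the book word the same way every time.
bot_votes = ["ban"] * 10

print("humans only:     ", consensus(human_votes))
print("humans plus bots:", consensus(human_votes + bot_votes))
```

Because the solver repeats the same wrong reading every time, a modest volume of automated submissions is enough to outvote the human readings in this simplified model.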

  • by IBBoard (1128019) on Friday August 06, 2010 @04:09AM (#33158890) Homepage

    Remember, iPads and touch-screens can't do hover. Plus there's the whole disability accessibility aspect as well ;)
