
ReCAPTCHA.net Now Vulnerable to Algorithmic Attack 251

Posted by timothy
from the bless-you! dept.
n3ond4x writes "reCAPTCHA.net algorithms have been developed to solve the current CAPTCHA at an efficacy of 30%. The algorithms were disclosed at DEFCON 18 over the weekend and have since been made available online. Also available is a video demonstration of random reCAPTCHA.net CAPTCHAs being subjected to the algorithms." There's probably an excellent Firefox plugin to render this page's color scheme more bearable. Note: the PowerPoint presentation linked opens fine in OpenOffice, and the video speaks for itself.

  • by imsabbel (611519) on Thursday August 05, 2010 @05:08PM (#33154816)

    I recently went to their homepage and looked _really_ hard for any statistics about which books are transcribed. I read their Science paper. Tried all sections.
    It's all about the captcha part, and _nothing_ about the RE.
    The way they state how it works ("We are using 100.000 unique words") sounds like they have given up on that part long ago and just recycle their old database again and again...

  • far from it (Score:4, Informative)

    by MagicM (85041) on Thursday August 05, 2010 @05:12PM (#33154858)

    I'm watching the video, and the end result is "b:1/78 1.28% s:27/78 34.62%" indicating that out of 78 tests of two words per test it got a single word right 35% of the time, and both words right only once or 1% of the time.

    Since both words need to be correct "solve the current CAPTCHA at an efficacy of 1%" would be closer to the truth.
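
    The percentages on that status line can be re-derived from the raw counts. A minimal sketch, assuming (as the comment above does) that "b" counts tests where both words were right and "s" counts tests where a single word was right; the variable names are made up for illustration:

```python
# Counts read off the video's final status line: "b:1/78 1.28% s:27/78 34.62%".
# Reading "b" as "both words" and "s" as "single word" is the commenter's
# interpretation, not something the tool documents.
TESTS = 78
BOTH_CORRECT = 1
SINGLE_CORRECT = 27


def percent(hits: int, total: int) -> float:
    """Return hits/total as a percentage rounded to two decimals."""
    return round(100 * hits / total, 2)


both_rate = percent(BOTH_CORRECT, TESTS)
single_rate = percent(SINGLE_CORRECT, TESTS)

print(f"both words:  {both_rate}%")
print(f"single word: {single_rate}%")
```

    The two rounded rates match the 1.28% and 34.62% shown in the video.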

  • Plugin not needed... (Score:4, Informative)

    by knarf (34928) on Thursday August 05, 2010 @05:13PM (#33154874) Homepage

    There's probably an excellent Firefox plugin to render this page's color scheme more bearable

    No plugin needed:

    View->Use Style->None

    That is what it looks like in Seamonkey, Firefox will be similar. This more or less always works.

  • by icebraining (1313345) on Thursday August 05, 2010 @05:14PM (#33154892) Homepage

    Currently, we are helping to digitize old editions of the New York Times and books from Google Books.

    http://www.google.com/recaptcha/learnmore [google.com]

  • Re:colours (Score:5, Informative)

    by electrostatic (1185487) on Thursday August 05, 2010 @05:17PM (#33154916)
    "...an excellent Firefox plugin to render this page's color scheme more bearable."

    Yep. Color Toggle

    https://addons.mozilla.org/en-US/firefox/addon/9408/ [mozilla.org]

    I have it set so Ctrl-Shift-Z sets a light yellow background, black text, and blue links.
  • Re:far from it (Score:3, Informative)

    by NegativeK (547688) <(moc.liamtoh) (ta) (neiraket)> on Thursday August 05, 2010 @05:18PM (#33154928) Homepage
    35% * 35% ~ 12%. And that ignores that one word is a known control, while the other is a word they're trying to OCR.
  • Re:far from it (Score:3, Informative)

    by BarryJacobsen (526926) on Thursday August 05, 2010 @05:21PM (#33154956) Homepage

    I'm watching the video, and the end result is "b:1/78 1.28% s:27/78 34.62%" indicating that out of 78 tests of two words per test it got a single word right 35% of the time, and both words right only once or 1% of the time.

    Since both words need to be correct "solve the current CAPTCHA at an efficacy of 1%" would be closer to the truth.

    My understanding is that only one of the words needs to be correct, but it has to be the "right" one: reCAPTCHA presents two words, one that it is very certain of and one that it is less certain of, and you have to get the one it is very certain of in order to pass.

  • Re:Bad Hacking (Score:3, Informative)

    by kyrio (1091003) <slashdot.lurkmore@com> on Thursday August 05, 2010 @05:28PM (#33155054) Homepage
    4chan already broke it.
  • Re:Offtopic (Score:4, Informative)

    by Anonymous Coward on Thursday August 05, 2010 @05:29PM (#33155078)

    No, Firefox addons used to be called extensions; plugins are still plugins.

  • Re:far from it (Score:3, Informative)

    by rm999 (775449) on Thursday August 05, 2010 @05:45PM (#33155250)

    You are right, there is no need to get both words right.

    But, your 35% * 35% calculation assumes the recognition difficulty of the words is independent, which is a bad assumption in this case; the OCR word is one that is known to be hard to guess. It is probably more like 35% * 5% or something.
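
    The two estimates in this sub-thread can be compared side by side. A rough sketch: the 35% figure is the rounded single-word rate from the video, the 5% figure is only the guess made above, and `joint_rate` is a hypothetical helper that simply multiplies the two per-word rates:

```python
def joint_rate(p_control: float, p_unknown: float) -> float:
    """Chance of getting both words right if the two per-word rates simply multiply."""
    return p_control * p_unknown


# Same 35% success rate assumed for both words (the "35% * 35%" estimate).
same_difficulty = joint_rate(0.35, 0.35)

# Lower rate for the word the OCR already failed on; 5% is a guess, not a measurement.
harder_unknown = joint_rate(0.35, 0.05)

print(f"same difficulty:     {same_difficulty:.2%}")
print(f"harder unknown word: {harder_unknown:.2%}")
```

    Either way the product only matters if both words must be right; if only the control word is checked, the pass rate is the control-word rate on its own.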

  • by JesseMcDonald (536341) on Thursday August 05, 2010 @05:49PM (#33155304) Homepage

    There is ZERO reason to use worthless tests like these as opposed to using real identification. That is, instead of using a computer-generated difficult test, use actual pictures of actual 'difficult text' that an OCR agent failed to identify. Each person is given one already tested sample and one unknown sample. If you get the already tested sample right, then your answer is accepted as 'probably' correct for the unknown sample.

    Congratulations, you've just described ReCAPTCHA! This is exactly how the current system works.

  • Re:Human Success? (Score:2, Informative)

    by Anonymous Coward on Thursday August 05, 2010 @05:52PM (#33155324)
    Mine is 100%. Recaptcha is probably one of the easiest captchas I've ever had to deal with; something is wrong with you, sorry.
  • Re:colours (Score:1, Informative)

    by Anonymous Coward on Thursday August 05, 2010 @05:53PM (#33155328)

    Neat, I also use yellow background, black text and bluish links. It is very relaxing.
    The color codes are #FFFF00 for the background, #000000 for the text and #00EFFF for the links.

  • Re:far from it (Score:5, Informative)

    by hydrofix (1253498) on Thursday August 05, 2010 @05:53PM (#33155330)

    Since both words need to be correct "solve the current CAPTCHA at an efficacy of 1%" would be closer to the truth.

    Actually, that is incorrect. One word is already positively known by the OCR and serves as a control, while the other is the one that the OCR could not read. The system of course only checks the one that it knows, and assumes the other one is then correct as well. So, if you get one of the words correct AND it is the same word that their OCR identified correctly (which is very likely the case), then you pass, but most of the time (99%) you give a bad answer for the harder, non-OCR word. Sadly, this leads to pollution of their database in the long run.
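
    The control-word check described here can be sketched in a few lines. This is an illustration of the idea only, not reCAPTCHA's actual code: the `Challenge` class, the `check` function, and the case-insensitive comparison are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Challenge:
    control_word: str   # word whose answer the system already knows
    control_index: int  # position (0 or 1) of the control word in the image


def normalize(word: str) -> str:
    """Assumed comparison rule: ignore surrounding whitespace and letter case."""
    return word.strip().lower()


def check(challenge: Challenge, typed: Tuple[str, str]) -> Tuple[bool, Optional[str]]:
    """Grade only the control word; return (passed, candidate for the unknown word)."""
    control_typed = typed[challenge.control_index]
    unknown_typed = typed[1 - challenge.control_index]
    if normalize(control_typed) != normalize(challenge.control_word):
        return False, None
    # The unknown word is never graded, so whatever was typed is recorded as a candidate.
    return True, normalize(unknown_typed)


# A solver that reads only the control word still passes, and its wrong
# guess for the other word is stored as a candidate transcription.
print(check(Challenge("morning", 0), ("Morning", "xyzzy")))
print(check(Challenge("morning", 1), ("morning", "wrong")))
```

    The first call shows the pollution problem: the challenge is passed even though the second word is garbage.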

  • Re:Offtopic (Score:3, Informative)

    by Cougar Town (1669754) on Thursday August 05, 2010 @05:59PM (#33155372)

    Wrong. Plugins have been around since Netscape and are still called plugins. They have a different function than an extension (and an extension is what we would want in this case to fix the site's colours).

    Both plugins and extensions, along with themes, are collectively referred to as "addons." "Plugin" is the wrong word in the summary. "Extension" or "addon" would have been acceptable.

  • Re:colours (Score:1, Informative)

    by Anonymous Coward on Thursday August 05, 2010 @05:59PM (#33155374)

    View>page style>no style

    easy.

  • Re:OCR improvements? (Score:3, Informative)

    by AusIV (950840) on Thursday August 05, 2010 @06:28PM (#33155582)
    They're not. I saw the presentation these guys gave at DefCon (their presentation was about as painful as their website), and they're only getting the test word correct with about 30% accuracy. They're not completely sure about their success rates on book words, but they believe it to be considerably lower than the test words.
  • Re:far from it (Score:4, Informative)

    by Jorl17 (1716772) on Thursday August 05, 2010 @09:20PM (#33157152)
    This is not informative, as many have said. If you read http://www.google.com/recaptcha/learnmore [google.com], you'll get it.

    Here is the deal: reCAPTCHA presents two words. One is picked by it and is previously known. The other one is a word from a book that has been scanned. Said word is unknown to the reCAPTCHA system. When the user enters both words, reCAPTCHA checks to see if the known word has been properly recognized. If that is the case, then reCAPTCHA can assume that a human is answering. Given that a human is answering, the second, unknown word given by the human is most likely correct, because he/she will be able to recognize it as well. Using this system, reCAPTCHA works as a CAPTCHA (spam prevention) mechanism and also helps transform old books and papers, such as the New York Times, into digital format.

    So, in practice, only one word has to be correct -- the word that reCAPTCHA knows. What's sad is that bots may contribute incorrect second words...

    Next time, get informed before going all crazy.

    And here is the relevant info, quoted from the aforementioned website:

    reCAPTCHA improves the process of digitizing books by sending words that cannot be read by computers to the Web in the form of CAPTCHAs for humans to decipher. More specifically, each word that cannot be read correctly by OCR is placed on an image and used as a CAPTCHA. This is possible because most OCR programs alert you when a word cannot be read correctly. But if a computer can't read such a CAPTCHA, how does the system know the correct answer to the puzzle? Here's how: Each new word that cannot be read correctly by OCR is given to a user in conjunction with another word for which the answer is already known. The user is then asked to read both words. If they solve the one for which the answer is known, the system assumes their answer is correct for the new one. The system then gives the new image to a number of other people to determine, with higher confidence, whether the original answer was correct.
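
    The last step in the quoted text, giving the same image to several people to gain confidence, could look roughly like the vote count below. The quote does not say how agreement is measured, so the function name and both thresholds (`min_votes`, `min_share`) are invented for illustration.

```python
from collections import Counter
from typing import Iterable, Optional


def agreed_transcription(
    answers: Iterable[str],
    min_votes: int = 3,      # hypothetical: minimum number of answers collected
    min_share: float = 0.6,  # hypothetical: share the top answer must reach
) -> Optional[str]:
    """Return the most common answer if enough users agree on it, else None."""
    counts = Counter(a.strip().lower() for a in answers if a.strip())
    total = sum(counts.values())
    if total < min_votes:
        return None
    word, votes = counts.most_common(1)[0]
    return word if votes / total >= min_share else None


# Three of four answers agree, so a single bad answer is outvoted.
print(agreed_transcription(["upon", "upon", "Upon", "uporn"]))
# Too few answers to decide yet.
print(agreed_transcription(["upon", "xyzzy"]))
```

    Under a scheme like this, an occasional wrong answer from a bot is outvoted, but a bot that submits the same wrong answer many times could still win the vote.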
