Audio CAPTCHAs Cracked; ReCAPTCHA Remains Strong 157

Falkkin writes "Ars Technica reports that audio CAPTCHAs consisting of only distorted digits or letters can be easy to crack using machine learning techniques. This includes most of the audio CAPTCHAs currently in use on the Web. The reCAPTCHA team has discussed their new audio CAPTCHA, which is resistant to this attack."
This discussion has been archived. No new comments can be posted.

  • by ouder ( 1080019 ) on Monday December 08, 2008 @11:39AM (#26033299)
    Isn't this just an advertisement for ReCAPTCHA disguised as a news item?
  • RECAPTCHA (Score:5, Insightful)

    by EddyPearson ( 901263 ) on Monday December 08, 2008 @11:43AM (#26033357) Homepage

    People crack CAPTCHAs for profit. They either sell the algorithms to spammers or spam themselves.

    The thing is, if you manage to reliably crack RECAPTCHA, then you've succeeded where all the best OCR software on the market has failed (all RECAPTCHA words are ones that existing software couldn't decipher). At which point there's big bucks to be made legally selling the software.

  • by compro01 ( 777531 ) on Monday December 08, 2008 @11:46AM (#26033415)

    Banning that way doesn't work real well when you consider dynamic IPs, distributed attacks (bot nets), proxies, etc.

    Unless you're willing to ban at least a third of the world, you're not going to get much out of that.

  • by X0563511 ( 793323 ) on Monday December 08, 2008 @11:52AM (#26033555) Homepage Journal

    Well, kudos for using CSS instead of javascript to hide it.

  • by greatgregg ( 1106739 ) on Monday December 08, 2008 @11:54AM (#26033585)
    This only works for small sites. Certainly the Yahoos and Googles of the world can't rely on something that can be broken with 2 minutes of hacking.
  • by Ron Bennett ( 14590 ) on Monday December 08, 2008 @11:55AM (#26033607) Homepage

    Captchas are user unfriendly and relatively ineffective.

    A more effective route is to require a new user to submit their postal address and a phone number. Then the service mails a post card containing a verification code to the postal address and/or calls the phone number. Google does this for AdSense publishers.

    Ron

  • by X0563511 ( 793323 ) on Monday December 08, 2008 @11:55AM (#26033609) Homepage Journal

    Just let the spam flow and crap up everything. When everything is useless, perhaps they will give up.

    Right now, they push tons of shit with the hope that the peak of it might show through. If all of it is seen, the volume might backfire.

    It sure will suck for everyone though.

  • by Waffle Iron ( 339739 ) on Monday December 08, 2008 @11:58AM (#26033651)

    By law, camera phones must make the click noise when operated within some countries to help fight voyeurism.

    That's a great idea. However, we need a law for video cameras, too.

    I propose that by law, each video camera must be equipped with a prominent hand crank, and shall only record while the crank is being turned. Furthermore, as added protection, people with video cameras must wear a beret and carry a conical megaphone at all times while operating said device.

  • by fuzzyfuzzyfungus ( 1223518 ) on Monday December 08, 2008 @12:21PM (#26034053) Journal
    The tricky bit with CAPTCHA is not just asking questions that are easy for humans and hard for AI. There is a huge field of well known stuff, common sense, basic knowledge, etc, etc. that would work. The problem is asking questions that are easy for AI to ask, easy for humans to answer and hard for AI to answer.

    If you have to manually populate your CAPTCHA, you have a problem. It costs just about as much (in money and time) to manually document a set of CAPTCHA questions as it would to build the set. If you can't generate questions automatically, your CAPTCHA will be expensive, or useless, or both. RECAPTCHA is interesting in that it is something of a hybrid. It makes use of real-world complexity from scanned documents, but largely automates the conversion of that complexity into CAPTCHAs, which makes it fairly practical to use at a large scale.
  • Re:hell (Score:5, Insightful)

    by numbsafari ( 139135 ) <swilson@bsd4us. o r g> on Monday December 08, 2008 @12:21PM (#26034061)

    You're probably a bot.

  • by fuzzyfuzzyfungus ( 1223518 ) on Monday December 08, 2008 @02:12PM (#26036185) Journal
    Oh, the other thing that I forgot: certain sorts of natural language questions would actually be trivially easy to answer, and thus would have to be avoided. Consider your "how many?" examples.

    Obviously there can't be fewer than 0 of something in a picture, and you can assume that (for the sake of not pissing people off) you won't make your customers count more than 20 of something. Thus, if I am trying to crack your CAPTCHA and my script sees "how many...?", it will just pick a number between 0 and 20, inclusive. That is ~5% accuracy (1 in 21) without anything cleverer than one line of regex. Since you can tell whether or not you solved a given CAPTCHA, your script could even, with some additional logic, choose future guesses based on past success.
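
    A rough Python sketch of that guessing attack, to put a number on it. The 0 to 20 range is the assumption from the paragraph above; the names `blind_guess`, `baseline_success_rate`, and `AdaptiveGuesser` are made up for illustration and do not come from any real CAPTCHA-breaking tool.

```python
import random
import re

# The "one line of regex": spot a counting question.
HOW_MANY = re.compile(r"\bhow many\b", re.IGNORECASE)
MAX_COUNT = 20  # assumption: nobody is asked to count more than 20 things


def blind_guess(question, rng=random):
    """Return a uniform guess in 0..MAX_COUNT for counting questions, else None."""
    if HOW_MANY.search(question):
        return rng.randint(0, MAX_COUNT)  # randint is inclusive at both ends
    return None


def baseline_success_rate(max_count=MAX_COUNT):
    """Chance of a uniform guess being right if answers are uniform over 0..max_count."""
    return 1 / (max_count + 1)


class AdaptiveGuesser:
    """Weights future guesses by which answers have worked before."""

    def __init__(self, max_count=MAX_COUNT):
        # Start every answer at weight 1 so none is ever ruled out.
        self.wins = [1] * (max_count + 1)

    def guess(self, rng=random):
        return rng.choices(range(len(self.wins)), weights=self.wins)[0]

    def record(self, answer, solved):
        # The site tells us whether the CAPTCHA was solved; reward answers that were.
        if solved:
            self.wins[answer] += 1
```

    With 21 possible answers the uniform baseline is 1/21, or about 4.8%. The adaptive version only helps if the real answers are skewed (say, mostly small counts), which is itself an assumption.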

    Questions about colors and animals and things have some similar vulnerabilities. How many colors can you reasonably expect your average viewer to verbally distinguish between? Maybe 30, tops? A fairly basic image processing heuristic (say, have a human identify a bunch of visually distinct color groups and name them, then have your script identify all color groups that make up more than 10% of the target image, and make a guess from among those) could thus achieve decent success on any "what color?" questions. Animals are trickier, because you start to get into nontrivial identification of shape; but there also aren't that many plausible choices. I suspect that you couldn't presume the ability to distinguish more than 100 or so animals, which makes even naive guessing a functional strategy, with basic image processing tightening up considerably from there.
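
    The color heuristic could look something like this in Python. This is a minimal sketch: the `PALETTE` values are invented stand-ins for the hand-named color groups, pixels are assumed to arrive as plain (R, G, B) tuples, and `nearest_name` and `candidate_colors` are hypothetical helper names.

```python
from collections import Counter

# Hypothetical hand-labelled palette: a human names a few visually distinct
# color groups once, each with a representative RGB value.
PALETTE = {
    "red": (220, 30, 30),
    "green": (30, 160, 60),
    "blue": (40, 70, 200),
    "yellow": (240, 220, 40),
    "black": (10, 10, 10),
    "white": (245, 245, 245),
}


def nearest_name(pixel):
    """Name of the palette color closest to the pixel (squared RGB distance)."""
    return min(
        PALETTE,
        key=lambda name: sum((a - b) ** 2 for a, b in zip(pixel, PALETTE[name])),
    )


def candidate_colors(pixels, threshold=0.10):
    """Named color groups covering more than `threshold` of the image, largest first."""
    counts = Counter(nearest_name(p) for p in pixels)
    total = len(pixels)
    return [name for name, n in counts.most_common() if n / total > threshold]
```

    For an image that is 60% reddish, 35% bluish and 5% black, `candidate_colors` returns `["red", "blue"]`, so a script answering "what color?" guesses between two options instead of thirty.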
