Spam The Internet

Why the CAPTCHA Approach Is Doomed

Posted by timothy
from the how-do-you-feel-about-are-you-a-human dept.
TechnoBabble Pro writes "The CAPTCHA idea sounds simple: prevent bots from massively abusing a website (e.g. to get many email or social network accounts, and send spam) by giving users a test which is easy for humans but impossible for computers. Is there really such a thing as a well-balanced CAPTCHA, easy on human eyes but tough on bots? TechnoBabble Pro has a piece on 3 CAPTCHA gotchas which show why any puzzle that isn't a nuisance to legitimate users won't be much of a hindrance to abusers, either. It looks like we need a different approach to stop the bots."
This discussion has been archived. No new comments can be posted.


  • So what next? (Score:2, Insightful)

    So if the CAPTCHA is doomed, what is the next approach? Letting spam bots go rampant over a site is not an acceptable alternative.

    • Re: (Score:3, Insightful)

      by Anonymous Coward
      R'ing TFA would be a start :P (he has solutions at the bottom)
      • Re: (Score:3, Interesting)

        All except the money solution seem to rely on being able to pin an identity to a particular user (or bot). For example, GMail's rate limiting assumes that each bot has exactly one GMail address.

        It falls apart when the bot registers a few hundred thousand GMail addresses.

        What prevents bots from doing that now? CAPTCHAS.

        I agree with the article that CAPTCHA is doomed and that other approaches are needed. I don't agree that either of those solutions work, by themselves.

    • by Hojima (1228978) on Wednesday April 08, 2009 @03:47PM (#27508231)

      So if the CAPTCHA is doomed, what is the next approach?

      Torture

    • Re:So what next? (Score:5, Interesting)

      by Trepidity (597) <delirium-slashdot@hacki s h . o rg> on Wednesday April 08, 2009 @03:49PM (#27508279)

      Spam-filters analogous to those applied to email seem to be increasingly used as plugins to various blog engines.

    • Re: (Score:3, Insightful)

      by ion++ (134665)

      So if the CAPTCHA is doomed, what is the next approach? Letting spam bots go rampant over a site is not an acceptable alternative.

      The next thing to do is to close the services that need (CAPTCHA) spam protection. This means no more free email. Get used to paying.

      • Re: (Score:3, Interesting)

        The next thing to do is to close the services that need (CAPTCHA) spam protection. This means no more free email. Get used to paying.

        Why is this bullshit non-solution always brought up by some greed-monkeys who salivate at the idea of charging billions in "micro-payments" ... oh wait.

        I will make it as simple as possible for you: pay-to-play posting + botnet = spam unabated + billions in charges to hapless consumers. And no, securing PCs air-tight is not a practical solution in a situation where the average u

    • Re:So what next? (Score:5, Interesting)

      by Ralph Spoilsport (673134) on Wednesday April 08, 2009 @03:57PM (#27508421) Journal
      Making people pay for posts. Making people pay for email. That will stop spam dead in its tracks.

      Now, I didn't say you'd LIKE what 's next...

      RS

      • by arth1 (260657)

        There are other alternatives, like better blocking at the client side.
        For this to be more feasible, blogs and e-mail sites need to come up with published and preferably common standards for their output. Which would be another win for the consumer.

      • Re: (Score:3, Insightful)

        by syousef (465911)

        Making people pay for posts. Making people pay for email. That will stop spam dead in its tracks.

        No it won't, and once we introduce it we'll be stuck with it.

        Now, I didn't say you'd LIKE what 's next...

        You're right, I don't like the idea of killing off the Internet as we know it over a misguided attempt to stop something that can only be limited, not stopped. Sometimes the cure is much much worse than the disease and in that case the cure should be rejected.

      • Not really (Score:5, Informative)

        by willy_me (212994) on Wednesday April 08, 2009 @05:01PM (#27509523)
        SPAM is sent from compromised computers. If you make people pay for posts then the owners of compromised computers will be billed - not the real senders of SPAM. Billing would help minimize the problem, but we would still receive a pile of SPAM. And a pile of people who only use their computer once a week would have to foot the bill.
        • Re: (Score:3, Insightful)

          by DragonWriter (970822)

          SPAM is sent from compromised computers. If you make people pay for posts then the owners of compromised computers will be billed - not the real senders of SPAM.

          If the computer was so compromised that the spambot was able to log in to secure websites (which any site that used a pay-to-post system would need to be) as if it were the legitimate operator of the computer, it makes sense to charge the operator of the computer. This will also, very quickly, encourage adoption of good security practices, as when th

    • by Mordok-DestroyerOfWo (1000167) on Wednesday April 08, 2009 @03:59PM (#27508455)
      Maybe a different type of system? Show a series of animals and ask which one is a pet. Show a series of letters and ask which one is the vowel. A series of types of food and ask which one would go best with Natalie Portman. Show an action shot and a series of similar actions, ask which one would occur in Soviet Russia.
    • by arth1 (260657)

      I'd rather see a hundred spams getting through than one legitimate user being blocked.

    • by Dare nMc (468959)

      The end of free speech on the web? (i.e. single/shared logins across the web.) Maybe require excellent karma on Slashdot before you can get a digg/youtube/reddit/myspace/craigslist login.

    • Re:So what next? (Score:5, Interesting)

      by zippthorne (748122) on Wednesday April 08, 2009 @04:19PM (#27508831) Journal

      Charge a fee. It doesn't have to be money. It could be cycles.

      Have the client append some random characters to the end of the message and hash it. Have it vary the characters until the hash matches some pre-defined pattern before sending. Cheap to verify on the incoming machine (just one hash), arbitrarily expensive on the sending machine. Your requirement can be for a certain number of characters or a specific sequence of bits, all the way up to the bit length of the hash.

      It doesn't answer the question of "is the sender a human" but it does answer the question of "how much is this message worth to the sender." The beauty of it is that that is sufficient.

      If the spammer is using a dedicated server, you can limit the amount of spam they can send arbitrarily. Imagine how profitable a spam server would be if it cost $3k to send 86,400 messages per day? If the spammer is using a botnet, that scales a little better for them, but since it chews up cycles, it's going to make their operation noticeable to users.

      There are probably better ways even than that, and someone will eventually find one that is more deterministic (it's unlikely, but there's a chance that someone could be unlucky enough to never chance on the right sequence using a pseudorandom perturbation approach).

      I didn't think of this though, so there might be some patents. Google for message digest spam control or something like that to see some papers.
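      The scheme described above is a proof-of-work check. A minimal sketch in Python, assuming SHA-256 as the hash and "leading zero bits" as the pre-defined pattern (function names are illustrative, not from any particular library):

      ```python
      import hashlib

      def mint(message: str, difficulty_bits: int = 20) -> str:
          """Sender side: search for a nonce so that sha256(message + nonce)
          falls below a target, i.e. starts with `difficulty_bits` zero bits.
          Cost grows roughly as 2**difficulty_bits hash attempts."""
          target = 1 << (256 - difficulty_bits)
          counter = 0
          while True:
              nonce = format(counter, "x")
              digest = hashlib.sha256((message + nonce).encode()).digest()
              if int.from_bytes(digest, "big") < target:
                  return nonce
              counter += 1

      def verify(message: str, nonce: str, difficulty_bits: int = 20) -> bool:
          """Receiver side: a single hash, so checking is cheap."""
          digest = hashlib.sha256((message + nonce).encode()).digest()
          return int.from_bytes(digest, "big") < (1 << (256 - difficulty_bits))
      ```

      The asymmetry is the whole point: the sender pays ~2**difficulty_bits hashes per message, while the receiver pays one, so the cost per message can be tuned without any shared secret.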

      • Re:So what next? (Score:5, Informative)

        by uhoreg (583723) on Wednesday April 08, 2009 @05:00PM (#27509513) Homepage
        This is known as hashcash [hashcash.org]. One big reason that it doesn't work on the web is that, currently, users will be stuck with some slow JavaScript version of the algorithm, while a sufficiently determined spammer can use a fast C version, and end up with much less work required to post. So it's nearly impossible to set a cost that is cheap enough for valid visitors, that will be a sufficient deterrent against spammers.
  • by ivan256 (17499) on Wednesday April 08, 2009 @03:36PM (#27508065)

    ...is the point going right over the author's head.

    A CAPTCHA works well enough for the same reason greylisting works well enough. They may be trivial to bypass (for some definition of 'trivial'), but many applications only need a tiny speed bump to make a huge difference in undesirable traffic.

    • by geekoid (135745)

      I think the point here is it won't even be a speed bump soon.

    • Re: (Score:3, Informative)

      by qoncept (599709)
      I think you're missing the point. CAPTCHA isn't a speed bump. Anyone who is going to take the time to make a bot to spam your site is going to take an extra minute to add a hack for your CAPTCHA or cat picture or sound or simple question. And saying you have to make a CAPTCHA difficult for humans to read for it to be effective is a pretty major understatement. It should read "Computers are better at it than people."
      • by ivan256 (17499) on Wednesday April 08, 2009 @04:51PM (#27509383)

        Almost nobody takes the time to make a spam-bot.

        Some 90% brain-dead excuse for human life takes something off the shelf and points it at whatever software you're running. Unless you're one of the most visited sites on the net, a minor modification to the code and a manually integrated CAPTCHA are going to stop practically everybody from spamming your site.

      • Re: (Score:3, Insightful)

        by relguj9 (1313593)

        Errm... on a small scale, CAPTCHAs work brilliantly. For instance, if you've ever installed and administrated a phpBB forum, the CAPTCHA that comes with it has been broken to hell, such that as soon as your site is indexed, it's going to be spammed. Adding retardedly simple changes to the CAPTCHA will immediately stop all the spamming until someone specifically re-writes the bot for your site, which is doubtful in most cases.

        I didn't specifically do this, but you could change the code to say "Add these 2 number

    • Well I think you make a good point: for many sites, it's not particularly worth the effort to break the CAPTCHA. On the other hand, it may be worth the effort for some sites, and it will be broken for the sake of those sites.

      Once they've figured out how to break those, they might (possibly) be able to apply the same technique to everyone else with little overhead. But really, that's not even the point. If spammers can hack verification on major sites and get access to millions of free email addresses,

    • by RobertB-DC (622190) * on Wednesday April 08, 2009 @04:02PM (#27508521) Homepage Journal

      They may be trivial to bypass (for some definition of 'trivial'), but many applications only need a tiny speed bump to make a huge difference in undesirable traffic.

      Plus, if you're using ReCaptcha [recaptcha.net], you're making the spammers do a little bit of good for the world. If they can develop software that reliably cracks ReCaptcha, then they've solved a lot tougher problem than just pushing v1@g@r@.

    • by Lord Ender (156273) on Wednesday April 08, 2009 @04:10PM (#27508647) Homepage

      CAPTCHAs have moved far past "tiny speed bumps" for me. Many are case sensitive yet vary letter size greatly; they use fonts which make the number 1 and the letter l identical; and they smash things together, making, for example, "m" and "n n" identical.

      Implementers also suck royally. Sites often require a long list of information be typed, including redundant passwords. Then they lose ALL that information when you get the CAPTCHA wrong. Some get caching all screwed up. It's a mess.

      CAPTCHAs today are so much worse than "speed bumps" for regular users, that I'm beginning to wonder whether I, myself, am a bot. The internet is becoming unusable to me.

    • by speedtux (1307149) on Wednesday April 08, 2009 @04:20PM (#27508853)

      Greylisting only works because many sites don't use it; if everybody used it, it would stop working.

      The economics of CAPTCHAs are even less favorable, since the cost of breaking a CAPTCHA is small compared to the cost of what the bot actually does after it has broken it.

  • by get quad (917331) on Wednesday April 08, 2009 @03:36PM (#27508077)
    ...until AI gets smart enough to answer questions intuitively.
  • Annoyance (Score:5, Insightful)

    by Renraku (518261) on Wednesday April 08, 2009 @03:41PM (#27508153) Homepage

    That's where the issue is.

    I've been a nerd since I was born. Grew up with early computers. Watched them evolve until now. But nothing makes me feel dumber than trying a CAPTCHA 5 or 6 times and failing every time. It's a serious annoyance, and I've seen WORSE that I haven't even attempted.

  • After three tries (Score:3, Interesting)

    by geekoid (135745) <dadinportland&yahoo,com> on Wednesday April 08, 2009 @03:43PM (#27508177) Homepage Journal

    block the IP address for 10 minutes, then an hour, then a day.
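    A sketch of the escalating lockout described above (the durations come from the post; the three-strikes grouping and function names are assumptions):

    ```python
    # Escalating block durations: 10 minutes, 1 hour, 1 day, in seconds.
    LOCKOUTS = [600, 3600, 86400]

    def lockout_seconds(failed_rounds: int) -> int:
        """Seconds to block an IP after `failed_rounds` rounds of three
        consecutive CAPTCHA failures; escalates per round, caps at a day."""
        if failed_rounds <= 0:
            return 0
        return LOCKOUTS[min(failed_rounds, len(LOCKOUTS)) - 1]
    ```

    The cap matters: without it, a shared IP (an office NAT, say) could be locked out indefinitely by one bad actor behind it.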

  • by Anita Coney (648748) on Wednesday April 08, 2009 @03:44PM (#27508197) Homepage

    ... which is another way of saying they really don't work at all. Both annoy legitimate customers and users while still allowing those with nefarious motives to do whatever they wanted to do in the first place.

    • That's complete bullshit. How did you get modded insightful?

      There have been MAYBE half a dozen CAPTCHAs in my life that I have failed to get through. The "annoyance" is what... 5 seconds spent on an extra text field? Maybe 30 seconds if your eyesight sucks _really badly_?

      DRM, on the other hand, can keep users from actually installing programs that they paid for. It will often disable these programs outright if certain conditions are not met. It can keep users tied to services, keep users tied to the int
  • by Anonymous Coward on Wednesday April 08, 2009 @03:47PM (#27508245)

    Everyone seems to think that the answer to this is to challenge the user somehow. Why isn't a technical solution possible that doesn't require any interaction from a person?

    On my own contact forms, I use a really simple obfuscation technique, it doesn't require any user interaction, and I don't get any spam. I've chosen to name my form elements with meaningless names, because obviously automated spammers rely on field names to fill in the blanks. If they see a form like this:

    <input type="text" name="email">
    <input type="text" name="subject">
    <input type="text" name="message">

    Obviously it's pretty easy to fill out. If they see this instead:

    <input type="text" name="sj38d74j">
    <input type="text" name="9sk2i84h">
    <input type="text" name="m29s784j">

    Then they probably won't even make it past the email validation part, unless they catch the error that my page is printing and try all combinations (or get lucky).

    It makes it even more effective when you use fields with good names, but hide them from users with either CSS or Javascript:

    <input type="text" name="email" style="display: none;">

    That's a honeypot, if it's filled out then it's a robot. You can use the same CSS or Javascript techniques to also print messages informing users not to fill those out if their browser decides to not run my code and instead shows them.

    Really simple solution, requiring no user interaction, and it's at least as effective as, if not more effective than, a challenge-and-response type of solution. I don't know why everyone is hung up on a visual challenge when it's a lot easier to distinguish between a real web browser and a scraper that doesn't bother to execute Javascript or apply CSS. I've been saying this for years though, so I don't really expect anyone to start paying attention now... at least my own inbox is spam-free.
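    The server side of the two techniques above can be sketched in a few lines. This is a minimal illustration, not the poster's actual code; the field names and the FIELD_MAP are hypothetical (a real form might even generate them per session):

    ```python
    # Obscure form-field names mapped back to their real meanings server-side.
    FIELD_MAP = {"sj38d74j": "email", "9sk2i84h": "subject", "m29s784j": "message"}

    def is_probably_bot(form: dict) -> bool:
        # The CSS-hidden "email" field is the honeypot: real users never see
        # it, so any non-empty value flags an automated submission.
        return bool(form.get("email", "").strip())

    def decode_form(form: dict) -> dict:
        # Translate the meaningless names into the fields the app expects.
        return {real: form.get(obscured, "") for obscured, real in FIELD_MAP.items()}
    ```

    A submission is accepted only when `is_probably_bot` is false; `decode_form` then recovers the usual email/subject/message fields for processing.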

    • I like the general idea, however a problem I see is that mechanisms that auto-fill forms for you (like your name and email address) may not work on your page - and even worse might populate that honey pot field the same way a bot would.

      • Auto-fill tools work by remembering the previous values of the fields. As long as the field names weren't changed from visit to visit, it should work fine.

        If you're talking about the robo-form fillers that try to fill out forms that you've never visited before, it'd be easy enough to clear the honeypot inputs using Javascript after the page was loaded. A robot most likely wouldn't execute the Javascript.

    • by Eternauta3k (680157) on Wednesday April 08, 2009 @06:05PM (#27510507) Homepage Journal
      If your site gained any popularity, they would make bots specifically to register in your website.
  • by smooth wombat (796938) on Wednesday April 08, 2009 @03:50PM (#27508287) Homepage Journal

    has a different take on the subject. Rather than trying to obscure the image with lines or similar measures, it uses a series of letters, some of which are in color. You are then asked to type in the colored letters to proceed.

    I don't know if these are static images or generated each time but the owner claims his site has almost no spammers (i.e. people have to do it, not machines).

    • Srly - great. :)

    • His site probably also doesn't have many colourblind users.

      • His site has hundreds of thousands of registered users so I am presuming he has a few. He does have an alternative method for color blind people to use.

    • by Kimos (859729) <<moc.liamg> <ta> <todhsals.somik>> on Wednesday April 08, 2009 @04:10PM (#27508663) Homepage
      There are a few flaws with this idea. Primarily, it blocks colorblind individuals from registering for the site, and there are many more colorblind internet users than visually or hearing impaired ones.

      This is also not very difficult to break. Assuming that the letters and numbers aren't obfuscated the same way CAPTCHA images are (if they are then this is just another CAPTCHA), a bot would be able to parse the characters out of the image. It could then classify the characters into groups of colors, pick one group randomly, and guess. There couldn't be more than four or five colors in the image since asking to differentiate between aqua/navy/royal/pale blue is unreasonable for a human (but interestingly enough, not difficult for a computer). That would give you a bot with a ~20-25% accuracy rate.

      Beyond that, you could parse the question as well, looking for the words red, blue, green, black, etc. and classify ranges of hex colors into associated color names. That would greatly increase success rate of guesses.

      This is not a reliable CAPTCHA replacement and in fact seems not very difficult to break.
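      The classification step described above really is trivial for a machine. A sketch, assuming the attacker has already segmented the characters and sampled one pixel color per character (the palette here is an assumption; a real scheme would have its own four or five colors):

      ```python
      # A handful of reference colors a scheme like this could plausibly use.
      PALETTE = {
          "red":   (255, 0, 0),
          "green": (0, 128, 0),
          "blue":  (0, 0, 255),
          "black": (0, 0, 0),
      }

      def nearest_color_name(rgb):
          # Nearest neighbour in RGB space: crude, but with only four or five
          # candidate colors it maps any sampled pixel to a color word reliably.
          def sq_dist(a, b):
              return sum((x - y) ** 2 for x, y in zip(a, b))
          return min(PALETTE, key=lambda name: sq_dist(rgb, PALETTE[name]))
      ```

      Parse the question for a color word, run every character's sampled color through `nearest_color_name`, and keep the matches. As the parent says, the small palette that makes the test humane is exactly what makes it easy to break.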
  • Wrong implementation (Score:4, Informative)

    by js3 (319268) on Wednesday April 08, 2009 @03:54PM (#27508351)

    Most CAPTCHAs are hacked because their implementation is amateurish. They are hacked by reusing session IDs or by dictionary attacks, which have nothing to do with the actual image itself. Long story short, CAPTCHAs reduce the amount of spam by more than 50% simply because it's not worth the effort for a spambot to break them; after all, they have the entire internet to spam.

    Some are good, some are bad, and most are downright horrible, but you wouldn't want your favorite forum to be trolled by spambots, would ya? Might as well live with it. Nothing works 100%; you should know that by now.

    • but you wouldn't want your favorite forum to be trolled by spambots would ya?

      My favorite site is /. It's already trolled by spambots, you insensitive clod.

  • It looks like we need a different approach to stop the bots.

    Nuke the sites from orbit; it's the only way to be sure.

  • by davidwr (791652) on Wednesday April 08, 2009 @04:00PM (#27508475) Homepage Journal

    The more effort someone is willing to put out to prove they are human or are backed by a human willing to be responsible for problems, the more abuse-able services you give them.

    For example, e-mail service providers could offer several tiers:

    Simple signup/new accounts:
    Limited number and size of incoming and outgoing messages.

    Verified signup/driver's license with confirmation by paper mail:
    Nearly-full, with shutoff or limitations imposed at first sign of abuse.

    Verified signup/credit card with confirmation:
    Nearly-full, with shutoff or limitations imposed at first sign of abuse.

    Established account, with a pattern of usage indicative of a human over a period of several weeks:
    Nearly-full, with shutoff or limitations imposed at first sign of abuse.

    Credentialed user, backed by a substantial bond or deposit and an explanation of why suspicious behavior really is legitimate:
    Full access plus a free pass on "legitimate" suspicious behavior until someone complains, but if it's abused then throttle him and take the costs out of his deposit.

  • It's doomed because it's fundamentally flawed. When you can hire someone in India to crack them by the thousands (per day) for cheap wages, it's all moot. It doesn't matter what you do for lettering and whatnot when you have an intelligent human perfectly willing to solve them. They just happen to be in the employ of spammers. They make CAPTCHAs on the assumption it isn't worth someone's time to crack them; the problem is they are placing value on time/labor expenditure at local rates and not those in Indi
  • by MrBippers (1091791) on Wednesday April 08, 2009 @04:03PM (#27508533)
    Solve the following math problem to continue:
    1/0 = ?
  • CAPTCHAs are simple Turing tests. As computers get faster and software gets smarter, it will become harder and harder to tell them apart. Also, since humans have a broad spectrum of ability, there will be an increasing percentage of humans who can not pass the tests.

    For example, math students who can not tell a Rembrandt from a Picasso, and art students who can't determine the roots of a simple quadratic. (See, I'm not picking on anyone in particular - we are all ignorant in most fields.)

    In future we wil

  • by Binty (1411197) on Wednesday April 08, 2009 @04:18PM (#27508801)

    Most posts on this topic have been along the lines of, "Maybe CAPTCHAs as they are implemented now don't work, but here is a method that is trivial for people but hard for computers."

    TFA's best argument, in my opinion, was that it is trivially inexpensive for a spammer to simply hire people to break CAPTCHAs. So a method that doesn't annoy people but is hard for computers still won't work, because the spammer will just use people. This is not a topic I know a lot about (not being a spammer, I don't know what kind of revenue they generate), but I would like to hear a response to this. Is TFA off its gourd, and will better technology really solve this problem? Or is gate-keeping for free services essentially pointless?

  • by rAiNsT0rm (877553) on Wednesday April 08, 2009 @04:29PM (#27509015) Homepage

    I watched an amazing mini-documentary about Re-Captcha and really like the concept and the end goal. Basically Re-Captcha uses two words, one known word and one of the words is unknown and comes from book digitization efforts. The known word gets you into the site for whatever you are doing, the unknown one comes from a literary work that OCR couldn't figure out. After a large sampling of people have typed the unknown word the majority answer becomes the text entered in the digitization effort.

    My contention is that people like myself who think it is a great cause would happily spend some free/bored time just entering the unknown words on a website without the whole captcha bit. If anyone here is a part or knows anyone on the team please bring this idea up.

    • Re: (Score:3, Informative)

      by TheRaven64 (641858)
      You can do this already, just go to the 'about' page on the site. When I first heard about ReCaptcha, I spent a little while filling them in to see how hard they were.
  • by QuoteMstr (55051) <dan.colascione@gmail.com> on Wednesday April 08, 2009 @04:35PM (#27509103)

    Everyone has a great idea for a CAPTCHA, but very few people know what the hell is really going on. Remember that the machine doesn't need to solve the CAPTCHA every time, that machines are infinitely patient and have huge memories, and that another machine needs to make sure the human gave the right answer!

    Ideas that won't work:

    1. Make clients identify an object from a picture. Machines can't describe objects in pictures: if machines can't describe the picture, how the hell is the CAPTCHA server supposed to verify that the client gave the correct answer? If a human being manually inputs the pictures and acceptable descriptions for each, then another human can program his attacking machine to do the same thing! Having a large, but finite set of pictures doesn't help either since a machine doesn't need to solve the CAPTCHA every time. It can just learn the correct responses without actually understanding the image. ANY APPROACH BASED ON IDENTIFYING A MEMBER OF A FINITE SET DOES NOT WORK AS A CAPTCHA.
    2. As a special case of #1, QUIZZES DO NOT WORK: either the questions are finite and subject to attacker memorization, or the number of patterns for the question is finite, and these patterns can be detected by a machine. (Consider "A train is coming from Denver at X miles per hour..." --- same problem, different coefficients.)
    3. Send the client a special program that verifies he's real: if it doesn't work for DRM, it won't work for CAPTCHAs. An attacker can just program his machine to simulate slow typing, slow thinking, or a cross-eyed human being. YOU CANNOT CONTROL THE EXECUTION ENVIRONMENT. No amount of Javascript obfuscation, encryption, or header-checking will make the slightest bit of difference for a determined hacker.
    4. As a special case of #3, TIMING ANALYSIS DOES NOT WORK. Machines can simulate arbitrary delays.
    5. Limiting CAPTCHA-solving attempts by cookie/IP address/etc.: that doesn't work. Attackers don't obey web standards, and have botnets

    Really, it's very easy to think you've come up with a very clever CAPTCHA. When you think that, all you've done is stoked your ego and screwed yourself over. It's the same reason why we don't roll our own cryptography: CAPTCHA-making is a very hard problem, mainly because your problem space must be infinite (to avoid an attacking machine simply memorizing answers), the answers verifiable by a machine, but the problems not solvable by a machine.

    How many questions can be checked by machines but not answered by them?

    Not many; fewer every day. There are no questions that can't be answered by a computer (and which can be answered by a human mind). The Church-Turing thesis [wikipedia.org] has some validity: the human mind is no more powerful than a Turing machine, and ultimately, computers and our brains are computationally equivalent. There's nothing a computer can't solve; there are just things we haven't figured out yet.

  • Here's what I use... (Score:3, Interesting)

    by X86Daddy (446356) on Wednesday April 08, 2009 @05:08PM (#27509647) Journal

    When the PHPBB2 CAPTCHA became completely useless and I was seeing hundreds of bot registrations on a forum I ran, I built something else. I added a simple extra text field to the registration form. I ask a plain English question, giving away the answer, and require the user to write it in the blank.

    e.g. What is the common name for a domesticated feline? (Starts with "c" and ends with "at". This is an anti-spam measure.)

    The field is checked for the right answer in post-processing. This stopped 100% of the fake registrations. I ended up doing this on practically every web-accessible form I have built since then, and I've seen the method pop up on other people's websites as well (certainly parallel evolution rather than "they got it from me").
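    The check described above fits in a few lines server-side. A minimal sketch, with the question from the post; the normalization and function name are assumptions:

    ```python
    # The answer the plain-English question gives away ("cat").
    EXPECTED_ANSWER = "cat"

    def passes_antispam_check(answer: str) -> bool:
        # Forgive case and stray whitespace so legitimate users aren't blocked.
        # A generic off-the-shelf spambot leaves the field blank or wrong,
        # since it has no idea what the question is asking.
        return answer.strip().lower() == EXPECTED_ANSWER
    ```

    The value here isn't cryptographic strength; it's that every site's question is different, so an off-the-shelf bot fails until someone hand-writes a rule for that one site.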

  • by IamGarageGuy 2 (687655) on Wednesday April 08, 2009 @06:49PM (#27511123) Journal
    We all bloody well know how to get rid of spam, but nobody ever talks about the real culprits: the credit card companies, the ones who facilitate the way for spammers to make money. Unfortunately the CC companies make money from it, so they don't care. But let's face it: if the CC companies decided to get rid of spam and lose the income, it could be wiped out in a week. All they would have to do is deny any payments to somebody suspected of spam. Problem solved. I never hear anybody bitch about the root of the problem, which is the ability to receive payments.
  • UN solution (Score:3, Insightful)

    by Max_W (812974) on Thursday April 09, 2009 @01:15AM (#27514173)
    It is a task for the United Nations. Spam is causing major damage to the world economy via lost work time, traffic, etc. We need international, enforceable laws which would make spam illegal and inevitably punishable worldwide.

    It is a big problem and requires a big solution.

    Our leaders shall overcome their cultural shock, phase out activities in local organizations like the EU, NATO, CIS, etc., and begin to work in a global setup: the UN, the ITU (the world telecommunication union), Interpol, UNICEF, etc.

    What is the point of fighting spam in, say, the USA, if it will continue to pour in from, say, Indonesia?
