Security Technology

A Vision For a World Free of CAPTCHAs

An anonymous reader writes "Slate argues that we're going about verifying humans on the Web all wrong: 'As Alan Turing laid out in the 1950 paper that postulated his test, the goal is to determine whether a computer can behave like a human, not perform tasks that a human can. The reason CAPTCHAs have a term limit is that they measure ability, not behavior. ... the random, circuitous way that people interact with Web pages — the scrolling and highlighting and typing and retyping — would be very difficult for a bot to mimic. A system that could capture the way humans interact with forms algorithmically could eventually relieve humans of the need to prove anything altogether.' Seems smart, if an algorithm could actually do that."
This discussion has been archived. No new comments can be posted.

  • Just a Thought... (Score:5, Insightful)

    by ryanleary ( 805532 ) on Saturday April 25, 2009 @01:14AM (#27710327)

    It seems to me that if you can design an algorithm to verify how humans interact with a computer, it should be relatively trivial to engineer an algorithm that mimics this interaction?

    Maybe someone smarter than I could clarify?

    • by Nazlfrag ( 1035012 ) on Saturday April 25, 2009 @01:17AM (#27710339) Journal

      Using anything other than a human to judge the behaviour puts it outside of the Turing test. So not only does their proposed solution not match the goal they set, it should indeed be defeatable by another algorithm.

      • Re: (Score:3, Insightful)

        by Anonymous Coward

        So if I have an algorithm that can verify an integer factorization quickly, it means there must be an algorithm that can factor any integer quickly? How would that work?

        • Re: (Score:3, Informative)

          Factoring an integer has only one answer, so trial and error doesn't work. Scrolling and clicking tempos have many acceptable answers, so trial and error does work.

          • What do multiple possible answers have to do with anything? The correct riposte to ryanleary is to point him to NP [wikipedia.org], a whole class of decision problems defined by the fact that they're simple to verify but hard to solve.

            Honestly, did he even use his head? How does he think his computer can verify an SSL cert in a fraction of a second when it's common knowledge that they take a long time to crack?

            Also the article's idea is awful. Hey here's a bot that could defeat the algorithm: record one huma
        • by 1 a bee ( 817783 ) on Saturday April 25, 2009 @02:49AM (#27710675)

          So if I have an algorithm that can verify an integer factorization quickly, it means there must be an algorithm that can factor any integer quickly? How would that work?

          The anonymous poster makes a good counterargument against the idea that the algorithm must be easily defeatable: just because you have an algorithm that detects human behavior does not imply you have an algorithm that emulates the human behavior detected by the original algorithm.

          In fact, there are many so-called one-way (correct terminology?) algorithms. So, for example, for a given file it's easy to compute its MD5; harder to compute a file for a given MD5 (though doable). And of course there's the AC's better example, which is impossibly hard in reverse for composite numbers made from very large prime factors.

          So no. Labeling the idea flawed-by-design is jumping the gun, logically speaking.
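The hash example can be made concrete in a few lines of Python (a toy sketch; MD5 is used purely as an illustration, and the brute-force bound is tiny so the demo terminates):

```python
import hashlib

def digest(data):
    """Computing a hash is cheap: one pass over the data."""
    return hashlib.md5(data).hexdigest()

def verify(data, expected):
    """Verifying is just as cheap: hash and compare."""
    return digest(data) == expected

def find_preimage(expected, max_tries=100_000):
    """Going the other way is brute force: try candidates until one
    happens to hash to the target. Hopeless for a 128-bit digest;
    bounded here so the demo finishes."""
    for i in range(max_tries):
        candidate = str(i).encode()
        if digest(candidate) == expected:
            return candidate
    return None

target = digest(b"hello")
print(verify(b"hello", target))   # True, instantly
print(find_preimage(target))      # None, after 100,000 wasted tries
```

The same verify-easy/produce-hard asymmetry is what the AC's factoring example relies on, only with a (believed) hard reverse direction rather than a merely expensive one.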

          • Re:Just a Thought... (Score:5, Interesting)

            by Joce640k ( 829181 ) on Saturday April 25, 2009 @03:10AM (#27710751) Homepage

            I disagree. I don't think there's anything terribly un-mimicable about the way humans interact with web pages.

            Besides, have you considered the effect of false positives (which will be many)?

            With a captcha it's a black/white decision and people know why they passed/failed.

            In the world being proposed in the article people will have to sit dejectedly wiggling their mouse while a web page decides if they're human or not based on some unknown criteria. Pass or fail? It's up to the machine.

            After two or three sessions of this people will be running away screaming from your web pages.

            • by 1 a bee ( 817783 )

              I disagree. I don't think there's anything terribly un-mimicable about the way humans interact with web pages.

              Maybe, maybe not. The point was that claiming

              it should indeed be defeatable by another algorithm

              is not a logical slam-dunk.

              • Re: (Score:3, Insightful)

                by Joce640k ( 829181 )

                I'd say it's a lot more of a slam-dunk than this:

                "Read heavily distorted text on random patterned backgrounds with added noise and geometric figures drawn across it"

                My real problem with the proposal is with the false positives. There's no clear feedback to let a user know *why* he's not being allowed into the system, it's just that the machine doesn't like the look of him.

            • Personally I don't know why they don't just use pictures of things: randomly circle some item in the picture that has a string tagged to it, and have the user type that in.

              Right now they are using distorted letters and numbers. It seems to me that using pictures and asking questions about the nature of the objects ('randomly') circled would be a lot better,

              because algorithms would have a tough time deciphering what object might be circled or pointed out in the picture.

              Or you could use pictures to "suggest"

              • The reason that won't work is that you need a computer to dynamically generate these puzzles. The trick is to find something that is easy for a computer to create but not easy for it to reverse. Most common-sense AI-type puzzles require as much AI to create as they need to solve.

              • If there's a limited number of question/answer pairs then it can be broken by having a human solve each one once (or have a computer trial and error it) until they know all the answers.

                It could work for a small low-profile site that isn't a big enough target to have that effort directed at breaking its Captcha, but the big players need something more dynamic.

          • Re: (Score:3, Interesting)

            by cskrat ( 921721 )

            The anonymous poster that you're responding to was actually the one to introduce the word "quickly" to the discussion.

            That being said, I think the method proposed at the end of the article is flawed in that the algorithm is reversible and facing the wrong direction.

            Assuming that the website in question only has access to the message information passed to the GUI window of the browser by the OS, (I'm sure as hell never installing a browser with ring 0 access to my system) it would be fairly trivial to produc

          • by Endo13 ( 1000782 )

            The reason it's never going to work is that, unlike passwords/encryption/captcha methods, it's not something that can be continuously changed or updated when it gets compromised. Even if just one large company uses it, you still know that eventually the algorithm will get out into the wild. If everyone uses it, the algorithm will be trivially easy for anyone to get their hands on almost immediately. And just as it's trivially easy for a computer to "crack" an encryption if it has the key, it's also going

          • Re: (Score:3, Informative)

            by dcollins ( 135727 )

            The anonymous poster makes a good counterargument against the idea that the algorithm must be easily defeatable: just because you have an algorithm that detects human behavior does not imply you have an algorithm that emulates the human behavior detected by the original algorithm.

            That's vaguely clever, but it doesn't really pass the sniff test. While "one-way" or "trapdoor" functions may or may not exist, they appear to be pretty rare. That's why it's such a big deal when computer scientists identify a new

          • In fact, there are many so-called one-way (correct terminology?) algorithms.

            Background: I'm doing my PhD in crypto. I use terms like one-way function (and one-way {,trapdoor} permutation).

      • by mcrbids ( 148650 ) on Saturday April 25, 2009 @02:01AM (#27710521) Journal

        It's a lot tougher to define what a human is than it may seem on the surface, and the difference between man and machine will, by definition, become more and more blurred until there is no effective difference.

        It's an idea that I've become familiar with, especially after reading 'The Singularity is Near' by Ray Kurzweil. As our technology advances, we'll find that our capabilities beyond our technology diminish. Machines long ago surpassed our running speed (cars/planes/trains), our ability to farm and grow food (tractors), our ability to hurl objects (guns), and our ability to swim (boats), but we've always had the ability to out-think our machines.

        Increasingly, this isn't true.

        We've already shown that spam filters are good enough to be more accurate than the people who read the messages. Machines have long been better than people at math-related stuff, keeping track of things, and the like, but now we're getting close to the threshold for image processing and character recognition. It's already true for voice recognition. CAPTCHA is therefore doomed to fall eventually as we approach the singularity, and is already pretty weakened. The next question is therefore simple: what does it mean to be human?

        Remember Lt. Commander Data on Star Trek, trying to be human? It's quaint largely because he/it was a minority on the show, but in reality machines will outnumber us by a wide margin (they already do!).

        So what does it mean to be human?

        If you have a prosthetic leg, are you still human?

        If the leg has a CPU in it, are you still human?

        If the CPU is more powerful than your mind, are you still human?

        If the chip is wired into your mind, are you still human?

        If you use the CPU as though it were part of your mind, are you still human?

        If you have transferred most of your thinking to the CPU, are you still human?

        If you transferred all your thinking to the CPU and rarely use your 'wet' brain, are you still human?

        If you find th

        • Re: (Score:3, Interesting)

          I might recommend http://en.wikipedia.org/wiki/Homosapien [wikipedia.org] for further reading on this topic. Clearly, if you're a computer you are not a human, no matter how smart you are. Are you a person? Well, that depends how you define 'person'.

        • So what does it mean to be human?

          Born of a human mother. Take that, Mister Data!

      • The human brain works on an algorithm that is Turing complete. It is also unlikely that the human brain has any algorithmic capability that a computer does not have, so it is reasonable to say that

        Any captcha that can be solved by a human, eventually will also be solvable by a computer.
      • it should indeed be defeatable by another algorithm.

        True. Let's say you have a test T in mind. This test will have some inputs I1,...,In which represent some observations coming from the keyboard and the mouse input obtained from some websurfer. If a computer tries to pass the test T, all it has to do is know the observations I1,...,In that are being looked for and simulate plausible values.

        What are plausible values? To obtain them, all you have to do, before the test T goes live, is ask some humans to
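A sketch of what that simulation might look like (everything here is hypothetical: the recorded delays, the jitter, the field name):

```python
import random

# Hypothetical inter-keystroke delays (seconds) recorded from real
# users before the test T went live.
human_delays = [0.12, 0.18, 0.09, 0.31, 0.15, 0.22, 0.11, 0.27]

def plausible_delay():
    """A bot needn't model typing from first principles; resampling
    recorded human measurements with a little jitter matches any
    statistical test that was fit to the same observations."""
    base = random.choice(human_delays)
    return max(0.01, base + random.gauss(0, 0.02))

# Timing plan for "typing" a form field with human-looking rhythm.
keystroke_schedule = [plausible_delay() for _ in "username"]
print(keystroke_schedule)
```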

      • Using anything other than a human to judge the behaviour puts it outside of the Turing test. So not only does their proposed solution not match the goal they set, it should indeed be defeatable by another algorithm.

        I imagine there will have to be a new job description for the webmaster ...

    • by l3prador ( 700532 ) <wkankla@gmaTOKYOil.com minus city> on Saturday April 25, 2009 @01:25AM (#27710381) Homepage
      Yep. If you can characterize the behavior pattern enough to automatically determine that it's "human-like," then you can automatically generate "human-like" behavior. The only way around it that I can see is if there is some sort of asymmetrical information involved, such as the invisible form honeypot mentioned in TFA--the website's creator (and thus the bot-detection script) knows that there is an invisible form present, but it's difficult for a script to see without rendering the site in standards compliant CSS.
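For what it's worth, the honeypot check from TFA is only a few lines server-side (the field name and markup here are illustrative): the form carries an input hidden by CSS, so humans leave it empty while naive bots fill it.

```python
# The signup form would contain something like:
#   <input name="website" style="display:none">
# A human never sees the field; a bot that fills every input it
# parses out of the HTML gives itself away.

def looks_like_bot(form_data):
    """Flag submissions where the hidden 'website' field was filled."""
    return bool(form_data.get("website", "").strip())

print(looks_like_bot({"username": "alice", "website": ""}))            # False
print(looks_like_bot({"username": "bot", "website": "spam.example"}))  # True
```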
      • but it's difficult for a script to see without rendering the site in standards compliant CSS.

        But with many open-source web browsers, would it be that hard to work out what is rendered and what is not? It seems that bots could even run a hidden tab of Firefox/Chrome on a victim's computer if they had to. I suppose it does make cracking captchas computationally more difficult, but isn't OCR much more intensive than rendering a page (wait, why not just put captchas in terribly coded Flash apps)?

      • Honeypots are the Answer! You simply have pages and options which are just distasteful to humans, the reasons for which are not comprehensible to machines! The machines will give themselves away because they cannot distinguish the distasteful options.

        Example: A page of Markov-chain nonsense in an otherwise informative website.

        This page would be generated using the same technology that spammers use to get past spam filters. Only a real human being or an AI that can achieve some sort of comprehension will

      • by dissy ( 172727 )

        The only way around it that I can see is if there is some sort of asymmetrical information involved, such as the invisible form honeypot mentioned in TFA--the website's creator (and thus the bot-detection script) knows that there is an invisible form present, but it's difficult for a script to see without rendering the site in standards compliant CSS.

        The one website I had to make where this mattered had a signup page (thus the bot-attacked form) with quite a few text input boxes, at the owner's request.

        To handle the auto-tag-reading bots, which will either look for standard text element names like 'username' or have their human owner pull one copy of the page to see what you personally named that field and program the bot accordingly, I set up a session system with happy randomness in it.

        Basically the only one single tag that is the same on every load of the page is a hi

    • by Z00L00K ( 682162 )

      Aren't many of those things like captchas circumvented by a trial and error methodology?

      What if you get three tries and then a blacklisted IP address? Not that the poster will realize it's blacklisted, just that further tries to crack the captcha won't work, even with the correct answer.
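A minimal sketch of that three-strikes scheme, including the silent failure the parent describes (thresholds and in-memory storage are illustrative; a real site would expire entries):

```python
from collections import defaultdict

MAX_TRIES = 3
failures = defaultdict(int)   # failed captcha attempts per source IP
blacklist = set()

def check_captcha(ip, answer, correct):
    """Reject blacklisted IPs even on the right answer, so the
    attacker can't tell a wrong guess from a ban."""
    if ip in blacklist:
        return False
    if answer == correct:
        failures[ip] = 0
        return True
    failures[ip] += 1
    if failures[ip] >= MAX_TRIES:
        blacklist.add(ip)
    return False

for guess in ("aaa", "bbb", "ccc"):
    check_captcha("203.0.113.7", guess, "xyz")
print(check_captcha("203.0.113.7", "xyz", "xyz"))  # False: banned, answer ignored
```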

      • by RiotingPacifist ( 1228016 ) on Saturday April 25, 2009 @02:01AM (#27710519)

        If you have a botnet, then a single computer probably doesn't need to try a site more often than a human would.

        • by Z00L00K ( 682162 )

          That's assuming the botnet is targeting a single site or only a few sites.

          • No, it assumes the opposite: it assumes a botnet is targeting many sites!
            If you have 1000 computers, then you can attack every site you want 1000 times before a single computer has to attack the same site twice. If you only attack a few sites, most of the time the bots will be inactive; it's only if you're attacking many sites that you're not wasting time.

            • by Z00L00K ( 682162 )

              The usual method is to use the same bot against a single site for several repeated attempts, so that reasoning doesn't hold.

              There may be another bot in the net that targets that site, but then that bot may run the same or similar logic, since the nodes in the botnet don't communicate with each other (otherwise the control traffic would be horrible).

              So your reasoning doesn't hold.

              And I didn't claim perfection, just another spanner in the works of spam bots that will slow t

    • Re: (Score:3, Funny)

      by cjfs ( 1253208 )

      It seems to me that if you can design an algorithm to verify how humans interact with a computer, it should be relatively trivial to engineer an algorithm that mimics this interaction?

      Maybe someone smarter than I could clarify?

      You're looking at this all backwards. This isn't the humans attempting to prevent access to the bots. It's the bots getting the humans to speed up their evolutionary arms race.

      Think of it, bots trying to determine bot from non-bot. Bots honing their human-infiltration skills vs the best of the bots. It'll be the greatest leap since spam filtering. We'll^WThey'll be getting +5s again on Slashdot in no time!

    • by julesh ( 229690 ) on Saturday April 25, 2009 @02:16AM (#27710567)

      It seems to me that if you can design an algorithm to verify how humans interact with a computer, it should be relatively trivial to engineer an algorithm that mimics this interaction?

      Maybe someone smarter than I could clarify?

      Sometimes it's easier to write an algorithm that checks that something is correct than to generate that something in the first place. An example: if you have a public key, checking that a message is signed with the corresponding private key is fairly easy; forging such a signature is hard, because it effectively requires you to factor the key.

      I see no evidence that "human behaviour" is such an algorithm. It might be, but we're way too far off understanding it to be able to make any sensible guesses in this field.

      A simplified approach is doomed to failure; simplified human behaviour is much more likely to behave like you suggest than like public keys, I think. Also, because different people interact with their browser in different ways, how do you cope with that? I tend to navigate via keyboard, so would the script reject me because I tabbed to the form field (thus jumping directly to it) rather than scrolling circuitously to reach it? I also make far fewer typos than average and type faster than the average user, so is this going to count against me?

    • by major_fault ( 1384069 ) on Saturday April 25, 2009 @02:19AM (#27710575) Homepage
      No algorithm will do. Ultimately the question that must be solved is whether the user is malicious or not. The best possibilities so far are the tried-and-true invitation system and excluding malicious users from the system. Malicious users include users who keep inviting other malicious users. That's easily detectable with a proper moderation system that needn't be gotten into right here and now.
      • I think this makes sense, though that has to mean that the legitimate user has to find someone they know that is part of a community. I think this is going to keep out a lot of good users.

        Anyways, I thought CAPTCHAS usually aren't solved by machines, so trying to deliver a Turing-like test isn't going to solve the problem.

    • by bytesex ( 112972 )

      The only thing I can think of that could break this, is lack of efficiency on the human's part. That is, if the test, or the judgement takes time, then this is time that automated algorithms usually do not have. They want to inject, mass-mail, or do whatever they maliciously want to do, quickly. But then again, they might not.

    • Perhaps the verification algorithm could reject any"one" who behaves too much like the algorithm expects a human to.

      Seriously though, this sort of verification method seems like it would be easy to defeat.

    • Consider that IPv6 takes care of the details; no Turing test needed. At some point, you won't be able to spoof an address without setting off a last-mile or even core-located router alarm. Once you kill NAT, we're all exposed like worms after a shovel turn in the garden. The need for CAPTCHA and other algorithms that authenticate humanity will be reduced to simply partitioning off your machine if it turns out to be a spam or other bot. Get your machine clean, or you don't log on. At least that's the concept unti

      • So, what you are saying is that you would IP ban those who spam. OK. Why is IPv6 necessary? Oh, you don't want to ban entire networks that are behind NAT? OK, but IIRC, with IPv6 you can change the IP of the computer at will (well, part of the IP anyway), so you would still need to ban entire networks (using the part that does not change) or the bot will just change IP of the machine...

        Why not just actually give up the misbelief that you're anonymous on the Internet?

        Because even if the government knows who I am and where I live, the other internet users do not (or I hope so). There is on

    • Using javascript to record certain events: random clicks on the page, scroll actions, and snapshots of the mouse x/y position every 5 seconds or so.

      Using xmlhttprequest to send this data to a server that determines whether the behavior fits, within a margin of error, to a markov model built via previous human interaction in the page.

      Of course, if the automated blogspam bot ever got ahold of the markov model, it would be able to generate 'believable' interaction with the page by creating a markov chain.
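A toy version of that check, as a first-order Markov model over event types (the event vocabulary, training data, and smoothing below are all invented for illustration):

```python
from collections import defaultdict
import math

def train(sessions):
    """Count event-to-event transitions across recorded human sessions,
    then convert to add-one-smoothed log-probabilities."""
    counts = defaultdict(lambda: defaultdict(int))
    for events in sessions:
        for a, b in zip(events, events[1:]):
            counts[a][b] += 1
    model = {}
    for a, nexts in counts.items():
        total = sum(nexts.values())
        model[a] = {b: math.log((n + 1) / (total + len(nexts)))
                    for b, n in nexts.items()}
    return model

def score(model, events, floor=math.log(1e-3)):
    """Average transition log-likelihood; unseen transitions get a floor."""
    pairs = list(zip(events, events[1:]))
    ll = sum(model.get(a, {}).get(b, floor) for a, b in pairs)
    return ll / max(1, len(pairs))

human_sessions = [
    ["scroll", "move", "move", "click", "type", "type", "move"],
    ["move", "scroll", "move", "click", "type", "type", "type"],
]
model = train(human_sessions)
human_like = ["move", "scroll", "move", "click", "type"]
bot_like = ["click", "click", "click", "click", "click"]
print(score(model, human_like) > score(model, bot_like))  # True
```

And, as the parent notes, a bot holding the same model can just sample from it, which is exactly the reversibility problem discussed upthread.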

  • Not so sure (Score:5, Insightful)

    by Misanthrope ( 49269 ) on Saturday April 25, 2009 @01:19AM (#27710345)

    Assuming you could write an algorithm to determine humanistic behavior, it stands to reason that you could write a bot to fool the initial algorithm.

    • Re:Not so sure (Score:4, Insightful)

      by TheRaven64 ( 641858 ) on Saturday April 25, 2009 @05:55AM (#27711229) Journal

      Not true. For example, any NP-complete problem can be solved in polynomial time on a nondeterministic Turing machine, but a solution can be verified in polynomial time on a deterministic Turing machine. There are lots of examples of this kind of problem, for example factoring the product of two primes or the travelling salesman problem. In a vast number of cases, it is easier to test whether a solution is correct than it is to produce the solution. Even division is an example of this; it is easier to find c in a*b = c than it is to find a in c/b = a.

      Of course, as the other poster said, there is no evidence that 'seeming human' is in this category, and it's a very woolly description of a problem, so it is probably not even possible to prove one way or the other.
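The verify/solve gap is easy to see with factoring at toy sizes (trial division stands in for "hard" here; real moduli are hundreds of digits, where it becomes hopeless):

```python
def verify_factorization(n, p, q):
    """Checking a claimed factorization is one multiplication."""
    return p * q == n and p > 1 and q > 1

def factor(n):
    """Producing the factorization is trial division: roughly
    sqrt(n) candidate divisors in the worst case, which is why
    it's infeasible for, say, a 2048-bit RSA modulus."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d, n // d
        d += 1
    raise ValueError("prime input")

n = 3_233  # 61 * 53
print(verify_factorization(n, 61, 53))  # instant: True
print(factor(n))                        # (53, 61), after many trial divisions
```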

      • Even division is an example of this; it is easier to find c in a*b = c than it is to find a in c/b = a.

        That would be quite hard to prove... ;)

        • No it's not; it's set as an exercise for first-year computer science students to prove (or was when I was an undergrad).
  • by gcnaddict ( 841664 ) on Saturday April 25, 2009 @01:21AM (#27710351)
    I remember reading... I can't remember if it was a post about an algorithm already written or a proposal for an algorithm which would run alongside a CAPTCHA through the entire registration process, but the basic premise was just that: measure the entropy and fluidity of human movement and determine whether or not the user is a bot based on whether or not the user fits typical random human usage patterns.

    I also remember the writer of the post noting that this kind of system would basically stretch the human-unwittingly-answers-CAPTCHA out such that humans would have to do the entire setup process manually instead of just the CAPTCHA, thus defeating the point of automated setup.

    Does anyone have this article? I can remember reading it but I can't find it.
    • ...algorithm ... which would run alongside a CAPTCHA through the entire registration process, ... measure the entropy and fluidity of human movement and determine whether or not the user is a bot based on whether or not the user fits typical random human usage patterns.

      Ya. I don't think I'll be whitelisting *that* in NoScript... :-)

    • by abolitiontheory ( 1138999 ) on Saturday April 25, 2009 @02:25AM (#27710595)

      In addition to this, what about those humans who just happen to fall into the seemingly 'mechanical pattern' that a computer registrant would? I know some parents of friends who very meticulously and methodically fill out forms, reading every box and explanation to ensure that they're inputting the right data.

      Any computer judgment of what is authentically human is in a way a reverse Turing test. It's a computer judging if humans are behaving enough like humans. The problem here is too many degrees of separation: a very specific type of human [engineer] designs a computer to assess the 'humanness' of other humans actions. Any such assessment would be based on certain assumptions and biases about how humans act. It sounds like putting a document through Google translator into another language and then back again, before turning it in for a final grade.

      • In addition to this, what about those humans who just happen to fall into the seemingly 'mechanical pattern' that a computer registrant would? I know some parents of friends who very meticulously and methodically fill out forms, reading every box and explanation to ensure that they're inputting the right data.

        Even the most "mechanical" of your friends wouldn't download the page, parse it in its entirety without scrolling the page in their browser, then enter all form fields in a fraction of a second, be

        • Re: (Score:3, Insightful)

          by TheRaven64 ( 641858 )
          It's a nice idea, but unfortunately it's easy for a computer to work around. How does the client-side JavaScript know how much the page has been scrolled? Because the browser tells it. There is nothing stopping a bot from downloading the page and then submitting the same HTTP requests that the client-side JavaScript would (or even running it in a VM and injecting DOM events into it with some random wait events). Once you know the algorithm used on the server to determine whether something is human, it's
      • by Atraxen ( 790188 )

        Plus, there are hardware based differences in interaction that modify your reading/interaction behavior. Analyzing mouse cursor movements for a trackball, mouse, and touchpad will likely give very different results - and that's assuming they're being moved the same way. When I'm reading with a mouse, I tend to 'follow along' on the page - with a trackball, I park the cursor to the side - with a touchpad, I tend to move in blocks. Add enough variables, and you can model any behavior (at the risk of losing

        • Don't forget tablet PCs with their touch screens. In that case the mouse pointer is jumping from point to point.

    • I can't find the article itself, but there's a short summary of it here [slashdot.org].
    • by caramelcarrot ( 778148 ) on Saturday April 25, 2009 @07:23AM (#27711505)
      Last time this came up, I suggested the idea of constant Bayesian analysis on HTTP logs to determine the likelihood of the current user being a bot.

      It could take things into account like if the user bothered to visit previous pages, request images, the time between requests etc. You could then either just make the webserver kill the connection, or you could add a function to your preferred web language (e.g. PHP) that returned the probability that the current user is a bot, and so redirect them to a more annoying turing test or block them.

      This'd also work pretty effectively if people wanted to stop scrapers and bots in browser games. Of course a bot could mimic all this, but it'd raise the cost of entry significantly, and it might end up that the bot is no more effective than a human working 24/7, though even then you'd need to be changing IPs constantly.

      I was thinking of trying to implement this over the summer, based on comment spam bots on my website, all without any need for client-side spying
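That probability function could be sketched as a naive Bayes over boolean request features (the features and per-class probabilities below are invented; real ones would be estimated from labelled logs):

```python
import math

# Hypothetical per-class likelihoods estimated from labelled logs:
# P(feature is true | human) and P(feature is true | bot).
FEATURE_PROBS = {
    #  feature               P(.|human)  P(.|bot)
    "visited_prior_pages":  (0.90,       0.10),
    "requested_images":     (0.95,       0.05),
    "subsecond_requests":   (0.05,       0.80),
}
PRIOR_BOT = 0.5

def p_bot(features):
    """Naive Bayes: combine per-feature likelihoods assuming independence,
    then convert the log-odds back to a probability."""
    log_h = math.log(1 - PRIOR_BOT)
    log_b = math.log(PRIOR_BOT)
    for name, (ph, pb) in FEATURE_PROBS.items():
        x = features.get(name, False)
        log_h += math.log(ph if x else 1 - ph)
        log_b += math.log(pb if x else 1 - pb)
    return 1 / (1 + math.exp(log_h - log_b))

browser = {"visited_prior_pages": True, "requested_images": True,
           "subsecond_requests": False}
scraper = {"visited_prior_pages": False, "requested_images": False,
           "subsecond_requests": True}
print(p_bot(browser) < 0.5 < p_bot(scraper))  # True
```

The web server (or a PHP-level function, as the parent suggests) would then redirect high-probability bots to a harder test rather than blocking outright.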
  • A system that could capture the way humans interact with forms algorithmically could eventually relieve humans of the need to prove anything altogether.'

    This system could also reproduce human interactions. So it's only a matter of time until this behavioural approach stops working.

    BTW: I don't want slashdot to check how I scroll the page, nor is my typing and retyping anybody's business but mine. Imagine you can't comment anywhere because you block Google Analytics.

  • doesn't that just mean a computer can also feed the correct data in, defeating it?

    Anyway, the little tests these days are stupid and annoying, and for some people perhaps getting impossible to do. Perhaps instead of the test being administered at the point of registration, new accounts should be automatically monitored for type of activity.

    For instance, if the first post at a forum has any links to blacklisted ad sites (could be EasyList USA, whatever), it's probably safe to just kick it out aut

    • doesn't that just mean a computer can also feed the correct data in, defeating it?

      Unless P == NP, checking a solution can sometimes be a lot easier than actually generating one. Consider, for instance, a hash like SHA-1. The whole point of a secure cryptographic hash is that checking whether a hash matches a document is very easy, but crafting a document that matches an already-specified hash is very hard.
  • by cjfs ( 1253208 ) on Saturday April 25, 2009 @01:53AM (#27710491) Homepage Journal

    I can see it now: "have you tried moving your mouse around randomly?", "how about clicking on a few different parts of the page then making coffee?", "still not working? Try slamming the mouse down several times", "okay, as a last resort click on the tabloid pop-up."

  • The tricky part of an alternative solution seems to be modelling human behaviour: in order to detect whether something is human or not, you need a pretty good model of what humans do. I suspect there would be a lot of variation in the way people interact; if I'm feeling sleepy, I present a very different profile of use to when I'm on task and in flow. A program to do this will probably have to be statistical in nature, with some sort of confidence interval of humanness. Maybe it will ne
  • You mean I didn't need a new pair of glasses every time I couldn't read one of those CAPTCHAs? I want my money back.
  • Seems some things should be easy. There's a certain minimum amount of time that it takes a human to tab from one field to another as they fill in data, even if they're pasting info in. Even just slowing bots down to the speed at which a human could reasonably do the task would put a dent in the problem =\

    • The problem is already easily parallelised. If it takes you 10s to fill in a form, and it isn't using any CPU (you're sleeping), then run a couple of thousand attempts in parallel. You get the _exact_ same throughput as you do if they are all run serially.

      For batch processes, latency isn't really an issue, it just means you need to do more transactions at once.

      • by rdnetto ( 955205 )

        Then limit it to one attempt per IP address to prevent the parallelization. The only downside would be that this would also block people behind NAT, since they would have the same address.

        • Re: (Score:3, Informative)

          These guys have botnets, and with networks like Tor you can't limit access to one IP. Besides, if you've got a captcha that is being attacked, to limit them by IP you need to send them all through a single location to perform the detection, completely breaking your load balancing. It becomes a DoS target.

          Basically, the attacker has more machines, more IP addresses and more time than the target.

          Even if I only have one machine, that's fine, I attack 10 or 100 sites instead of just yours. Or, I use a network

  • A system that can determine whether or not a user is human would have built-in characteristics as to what a human would do in such a situation. What's keeping someone from taking that same algorithm and adapting it for means other than their intended purpose?

    If a machine knows what to do, another machine can take advantage of that.

    Obligatory: import skynet; blah

  • If the judge of the test is a computer, then the test will always be passable by a computer.

    • >If the judge of the test is a computer, then the test will always be passable by a computer.

      You are missing the point. It is not about making a foolproof system, and you are stating the obvious. Any conceivable system could be fooled, with either human or computer judges. But the issue here is finding less obtrusive ways to detect spambots, screenscraper scripts, etc... without giving up too much detection efficiency.

  • Everyone has been focusing on how easy/difficult it would be to reverse this hypothetical algorithm that would determine, based on your use of a webpage, whether you're human or not... ...I see a more fundamental problem. This is on the internet, so there are basically 3 options for implementing it.
    1) Server side. The only variable you could track is time between page requests. Don't see how that could possibly be enough information
    2) Client side JS. Simple, just modify the JS to return &isHuman=true
    3
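Option 2 fails because the client controls the request: a bot never has to run the page's JavaScript at all, it just sends whatever the script would have sent. A sketch (the `isHuman` parameter name is hypothetical, standard library only):

```python
from urllib.parse import urlencode

# A bot skips the page's JavaScript entirely and forges the "verified"
# request body directly. The isHuman flag is a made-up example.
form_fields = {"comment": "Buy cheap pills!", "isHuman": "true"}
body = urlencode(form_fields)
# The server receives exactly what a legitimate, script-running
# browser would have sent.
```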

  • The problem with a lot of sites dealing with spam is that they are using the same software that tries to solve everything at the top. Uniformity doesn't help.

    But leaving people to their own devices to create or adapt their own forum/blogging/wiki software is not a good solution either. Uncoordinated diversity leaves a lot of people to fend for themselves.

    Having unity-in-diversity (a common strength across systems and organisms), however, might well solve the problem.

    If forum/blogging/wiki software creators

    • Uncoordinated diversity leaves a lot of people to fend for themselves. Having unity-in-diversity (a common strength across systems and organisms), however, might well solve the problem.

      Now you wouldn't happen to hold a degree in politics or economics, would you?

  • The idea that behavior is a better judge of identity than "biometrics" is old, very old. I wish I could remember the name of the program, but there was a GNU/Unix utility that measured word frequency, letter frequency, the amount of delay between pressing any two-letter combination on the keyboard, and more... all put together to verify identity. And it worked quite well. I think that program is close to 20 years old.

    Biometrics fails for the same reason it always has... as soon as someone comes up with a h
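The keystroke-timing idea described above can be sketched as comparing inter-key delays against a stored profile. This is a crude illustration; the tolerance is invented, and real systems model per-digraph timings rather than one overall mean:

```python
import statistics

def interkey_delays(timestamps):
    """Delays between consecutive key presses, given press times in seconds."""
    return [b - a for a, b in zip(timestamps, timestamps[1:])]

def matches_profile(timestamps, profile_mean, profile_stdev, tolerance=2.0):
    """Crude check: is the mean inter-key delay within `tolerance`
    standard deviations of the user's stored profile?"""
    delays = interkey_delays(timestamps)
    mean = statistics.mean(delays)
    return abs(mean - profile_mean) <= tolerance * profile_stdev
```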
  • Wouldn't the ability to collect biometric information require a fairly potent piece of spyware to be loaded on the client system? How would a user, or even a security professional, easily tell the difference between a keylogger that reads our actual strokes, and one that is just timing the key presses?

    Sounds like a kernel-mode device that would have to be part of the input drivers. It's an attack surface, IMO. I would think it's safer to have a separate input device for biometric authentication only than atte

    • I think the deadline for making meta-April Fools jokes must have also passed. And yes, there's a deadline. "April Fools was last year!" so he says.

  • Spam Karma? (Score:3, Informative)

    by nilbog ( 732352 ) on Saturday April 25, 2009 @03:43AM (#27710843) Homepage Journal

    It seems like the old Spam Karma module for Wordpress did this. It calculated how long they were on the page vs. how much they had typed, how fast they typed, and a bunch of other factors before it ever showed a captcha. Back when I used Wordpress I remember it being pretty accurate, too.
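The time-on-page-versus-typing heuristic is easy to sketch. The rate threshold below is invented for illustration; Spam Karma combined many such factors, not just this one:

```python
def looks_automated(seconds_on_page, chars_typed, max_cps=15.0):
    """Flag a submission whose implied typing rate is superhuman.
    max_cps (characters per second) is an invented threshold."""
    if seconds_on_page <= 0:
        return True  # submitted before the page could plausibly be read
    return chars_typed / seconds_on_page > max_cps
```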

  • voice recording (Score:2, Insightful)

    by Ofloo ( 1378781 )
    Think of every behavior as a voice recording: record and replay! And there you go, bots are able to mimic it.
  • CAPTCHAs etc. won't work perfectly. Ever. There are always bot(net)s that are able to defeat them. If you use software to make the lettering difficult to read, you can still write software to read it. Like the algorithms, we detect the order in the chaos.

    So let's just face it:

    The internets need a unified authentication system if we are to kill spam. If there was a unified authentication system, you wouldn't need to store your passwords around the internet, and your mails would be traceable to you.

    So, let th

  • Not a great idea (Score:4, Interesting)

    by jgoemat ( 565882 ) on Saturday April 25, 2009 @05:03AM (#27711101)
    The article did have links to some interesting topics, such as Google experimenting with image orientation as a test. The premise of using how a user interacts with a page is deeply flawed, though. There's not even a need for an algorithm or program to 'figure out' the captcha: just record how an actual user interacts once, and you can send the exact same thing every time to pass the test. The reason this works is that the 'question' doesn't change. This would be like showing the same text captcha every time. If they ignore identical values being sent, the values can just be fudged a bit.
  • When I posted a question to the Turbo Tax community forum it asked a simple question as a CAPTCHA. Seems like an easy enough solution, and it changes each time to foil a persistent brute-force attack.

    Of course I'm sure it's only a matter of time before someone has an algorithm smart enough to answer questions. And I suppose that a botnet with enough time would work too. Still an interesting approach, I thought.
  • by Arancaytar ( 966377 ) <arancaytar.ilyaran@gmail.com> on Saturday April 25, 2009 @05:21AM (#27711153) Homepage

    The user's local behavior before form submission is detectable only via a client-side script. There are therefore two ways this can go.

    1.) You maintain accessibility standards and make the client-side script optional. The effectiveness of this approach is comparable to xkcd's [xkcd.com] "When Littlefoot's mother died in /Land before Time/, did you feel sad? (Bots: NO LYING!)"

    2.) You require client-side script execution in order to submit the form. The effect is a lot of pissed-off users with NoScript or non-compatible Javascript interpreters (IE or the rest, depending on which one you support).

    This idea is basically like visual captchas, but instead of the visually impaired, you're screwing everyone without Javascript.

    There is one aspect of user behavior that can be detected, however, and that is the time passed between the user requesting the form and submitting it. From an AI perspective, humans spend an eternity typing, so setting a minimum delay between request and submission will slow the bot right down - especially with a flood control that requires a delay before submitting the next form. Slashdot does both of these things already, by the way.
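The request-to-submission delay check can live entirely on the server, e.g. by timestamping each form when it is issued. A minimal sketch (the threshold is illustrative, and the token store here is a plain dict rather than a real cache):

```python
import time
import uuid

MIN_FILL_SECONDS = 5  # illustrative; humans rarely submit faster
_issued = {}  # form token -> time the form was served

def issue_form(now=None):
    """Serve the form with a one-time token, remembering when."""
    token = uuid.uuid4().hex
    _issued[token] = time.time() if now is None else now
    return token

def accept_submission(token, now=None):
    """Reject unknown/reused tokens and submissions that arrive too quickly."""
    now = time.time() if now is None else now
    issued_at = _issued.pop(token, None)
    if issued_at is None:
        return False  # no such form was served, or token already used
    return now - issued_at >= MIN_FILL_SECONDS
```

Consuming the token on use also gives flood control for free, since each served form can be submitted at most once.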

  • The captcha is entered into a field and submitted to the web server. However, our random highlights, backspacing, scrolling etc. all happen in the browser on our system. The web server (thank ______ ) doesn't know about any of that; it just sees the end result. So it doesn't have access to any of that data to make any kind of determination. Currently only malware would be collecting this data and sending it somewhere. So the proposal here is to be human-verified by malware.

    There are other flaws that o

  • Not really seeing a difference between behavior and ability.

    Any action that you perform is behavior, and, obviously, if you perform an action you are also capable of performing it. A behavior is therefore an ability. Any algorithm that tries to distinguish between human behavior and computer behavior is still a reverse Turing test.

    Given that, testing the quirky way humans navigate the web is arguably an even flimsier test than the captcha. There is a certain degree of randomness, but nothing that rand(

  • by monoqlith ( 610041 ) on Saturday April 25, 2009 @09:37AM (#27712643)

    Can Slate stop writing articles about shit it doesn't know about?

    • Can Slate stop writing articles about shit it doesn't know about?

      Right.

      First, most of the things Slate suggests have been tried. Timing human input behavior is in use already, and attacks already do some randomization there.

      Second, despite what the Slate article quotes, the CAPTCHA for Gmail has been cracked. [pcworld.com] The success rate is only 20%, but because the cracker is embedded in a botnet, that's good enough to survive IP blacklisting. MessageLabs says Gmail spam went from 1.3 percent of all spam e-ma

  • Some notes in no particular order. . .

    1. I kind of like winning the Turing Test. It makes me feel human. Some days, before the coffee kicks in, this is a plus.

    2. It's funny when I can't read the secret warped word. It throws me in existential questioning for about half a second.

    3. I like the new idea of having to describe a randomly rotated 3D image. That's a cool system which I'd like to see implemented, though I can't imagine it will be very long before it too is solved.

    4. I find it funny that proving

  • What stops someone from recording a human looking at the page, and then replaying that behavior from a bot?

    Also, will humans actually want to send the information needed for this to remote websites? I don't really want a website to know what part of the page I'm looking at.

  • Regardless of the Turing Test aspects of this, forms are filled in on the client. This hypothetical algorithm would also be running on the client. The server can't trust any "Yes, this is a human" that comes from the client. So even if you could make this algorithm, it would not solve the intended problem.

  • This whole thing is a moving target.

    Anything your algorithm can do, my algorithm can do too.

    Might work for a while, though, but then again, so did CAPTCHAs.

    Wait, did I just say "so did CAPTCHAs"? What I meant was, so are CAPTCHAs, because everyone is still using them, even though they don't work.

    Which is the real problem ... not only is the whole thing a moving target, but tackling the problem only works when everyone actually moves.

    Remember, it's measure --> countermeasure.

    All this really means is now e

  • 20 or more of the top-level posts on this page are all "Well yeah, but if a computer can test it, then a computer can emulate it." I'd ask if anybody bothered reading other comments before they posted, but I already know the answer (this /is/ slashdot after all).

    On to the topic at hand: this is impractical for another, less complex reason. From what I've been seeing, most of the "bot" registrations these days are not bots, they're people. If those who wish to can pay someone a couple dollars a day to
