Slashdot
Making CAPTCHAs Even Harder With 3-D Models
Posted by
timothy
on Mon Jan 31, 2005 07:01 PM
from the now-what-is-the-man-doing-with-the-rabbit dept.
Michael G. Kaplan writes "CAPTCHAs (Completely Automated Public Turing Test to Tell Computers and Humans Apart) are commonly used to prevent computers from filling out web forms. Computer vision experts have been able to design programs to foil CAPTCHAs with a high degree of success. I have designed a CAPTCHA that is based on the identification of attributes contained in an image generated by the grouping of easily recognized 3-D objects. I call this the Virtual Photographic CAPTCHA and it is likely to remain invulnerable to automated attack for many years to come. A novel anti-spam system necessitated its development."
Related Stories
Carnegie Mellon CAPTCHA Digitization Project Now Underway 119 comments
tomandlu writes "The BBC is reporting that Carnegie Mellon University has found a novel use for CAPTCHAs: deciphering old texts. We've discussed this project before, but it was prior to it getting off the ground. Users entering text act as a sort of distributed computing project. Basically, the CAPTCHA is made up of two words, one of which is known to Carnegie, and one of which isn't. If the user correctly deciphers the known word, then the unknown word is assumed to be correct. Well, almost. Two different users must give the same answer to the same unknown CAPTCHA before it is taken off the list. 'Using the reCAPTCHA system von Ahn's team is digitizing documents and manuscripts as fast as the Internet Archive can supply them, and the good news for book lovers (and bad news for spammers) is that the supply of reCAPTCHAs is not likely to dry up any time soon.'"
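The two-word scheme in that summary can be sketched roughly as follows. This is only an illustration of the rule as described (pass on the known word, accept the unknown word once two different users agree); the class name, method names, and word ids are made up for the example and are not the project's actual code.

```python
from collections import defaultdict

AGREEMENT_THRESHOLD = 2  # per the summary: two different users must agree


class WordPairChecker:
    """Hypothetical bookkeeping for a known-word / unknown-word CAPTCHA."""

    def __init__(self):
        # unknown-word id -> {normalized guess -> set of user ids who gave it}
        self.votes = defaultdict(lambda: defaultdict(set))
        # unknown-word id -> accepted transcription
        self.resolved = {}

    def submit(self, user_id, known_answer, known_guess, unknown_id, unknown_guess):
        """Return True if the user passes; record their reading of the unknown word."""
        if known_guess.strip().lower() != known_answer.lower():
            return False  # failed the control word, so the other guess is ignored
        if unknown_id not in self.resolved:
            guess = unknown_guess.strip().lower()
            voters = self.votes[unknown_id][guess]
            voters.add(user_id)  # a set, so one user cannot vote twice
            if len(voters) >= AGREEMENT_THRESHOLD:
                self.resolved[unknown_id] = guess
        return True
```

Keeping a set of user ids per guess is what enforces the "two different users" part; a plain counter would let one person confirm their own answer.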
This discussion has been archived.
No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
Famous last words. (Score:5, Funny)
Like so many, he obviously doesn't think anyone can (Score:3, Funny)
spare us the modesty!
Re:This is a good thing! Not!! (Score:5, Funny)
Re:This is a good thing! Not!! (Score:3, Funny)
This is a bad thing for the blind. (Score:3, Interesting)
Re:This is a bad thing for the blind. (Score:3, Insightful)
Re:This is a bad thing for the blind. (Score:4, Insightful)
Implementing CAPTCHAs with PHP (Score:5, Informative)
PHP developers might find this article useful:
http://phpsec.org/articles/2005/text-captcha.html [phpsec.org]
Captcha's have already been cracked (Score:5, Interesting)
Vision-recognition systems be damned, all a spammer needs to do is use the inherent need of apparently most of the male race to look at pictures of naked women to get what he needs. I don't know if a counter was ever found to this method either...
Re:Captcha's have already been cracked (Score:4, Insightful)
Yes, I first heard this from an engineer at Yahoo. They were, as far as I know, the first site to have to deal with this technique on a major scale. Fortunately, this attack requires that the attacker's system communicate with your server, playing the role of a typical user.
So, although the "answer" to the CAPTCHA is provided by an actual human, you can still pinpoint mass registrations and the like to a single group of IP addresses in most cases, because the users are not the ones interacting with your application. This becomes a network problem rather than an application problem.
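One way to picture the "network problem" framing is to bucket registrations by source network and flag the crowded buckets. The sketch below is an assumption about how that might look, not anything Yahoo is known to have run; the function name, the /24 grouping, and the threshold are all arbitrary choices for illustration.

```python
import ipaddress
from collections import Counter


def suspicious_networks(client_ips, prefix_len=24, threshold=3):
    """Group registration source IPs by network and return the busy networks.

    client_ips: iterable of IPv4 address strings seen on sign-up requests.
    Returns {network: count} for networks with at least `threshold` sign-ups.
    """
    counts = Counter(
        # strict=False masks off the host bits, e.g. 203.0.113.5/24 -> 203.0.113.0/24
        ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
        for ip in client_ips
    )
    return {str(net): n for net, n in counts.items() if n >= threshold}
```

For example, `suspicious_networks(["203.0.113.5", "203.0.113.9", "203.0.113.77", "198.51.100.1"])` flags only the 203.0.113.0/24 block. An attacker who spreads requests across many hosts stays under any such threshold.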
Re:Captcha's have already been cracked (Score:2)
Though if I did this on a small scale and didn't get too greedy I might be able to stay off the radar. Couple that with changing hosts frequently and/or finding hosts with badly enforced TOS's and I can give a headache to any Captcha test.
So the game continues...
*blows whistle* Five-minute major... (Score:4, Informative)
Although it's tangential to the topic, you can't "ban by MAC addresses". Not unless you're on the same ethernet segment as the attacker. Try it the next time you've got access to a few machines separated by at least one router. Ping from two different machines to a third on another network and run tcpdump to inspect the MAC addresses on the packets. Let me know how it turns out. (hint: they'll have the MAC address of the router)
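For anyone without spare machines and a router handy, here is a toy model of why the tcpdump experiment turns out that way. It is a simplified sketch, not a packet capture: the `Frame` type, the `route` helper, and all addresses are invented for the example. The only real-world behaviour it encodes is that a router builds a fresh Ethernet header for each hop while leaving the IP addresses alone.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Frame:
    src_mac: str
    dst_mac: str
    src_ip: str
    dst_ip: str


def route(frame, router_mac, next_hop_mac):
    """Forward a frame across a router.

    The Ethernet (MAC) header is rewritten for the next segment;
    the IP source and destination are carried through unchanged.
    """
    return replace(frame, src_mac=router_mac, dst_mac=next_hop_mac)


# Made-up addresses: two hosts behind one router, one server beyond it.
ROUTER_LAN = "02:00:00:00:00:01"
ROUTER_WAN = "02:00:00:00:00:02"
SERVER = "02:00:00:00:00:99"

host_a = Frame("02:00:00:00:00:0a", ROUTER_LAN, "192.0.2.10", "198.51.100.7")
host_b = Frame("02:00:00:00:00:0b", ROUTER_LAN, "192.0.2.11", "198.51.100.7")

seen_a = route(host_a, ROUTER_WAN, SERVER)
seen_b = route(host_b, ROUTER_WAN, SERVER)
```

After routing, `seen_a.src_mac` and `seen_b.src_mac` are both the router's address, while the source IPs still differ, so the server has no per-host MAC to ban.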
Re:Heh... (Score:3, Informative)
Re:Heh... (Score:3, Insightful)
Re:Heh... (Score:3, Insightful)
Re:Captcha's have already been cracked (Score:2)
Which isn't to say that no-one is u
Counter to this method (Score:3, Funny)
I don't like it already (Score:5, Informative)
"Patents pending."
Tyvm, but no.
Here's another test... (Score:5, Funny)
Re:Here's another test... (Score:3, Funny)
Of course that's not the way it currently is done. Glitzy marketing folks tend to generate the acronym first, and then come up with humongous phrases that retrofit into the acronym.
Popular CAPTCHA implementation beaten (Score:5, Interesting)
Kinda scary... (Score:5, Funny)
The logical conclusion is that I'm not actually human. My girlfriend will be very upset when I tell her.
Re:Kinda scary... (Score:2, Funny)
Took a long time (Score:5, Insightful)
It seems a very good idea, but all that flicking back-and-forth of the eyes is too compute-intensive for my grey matter.
Re:Took a long time (Score:3, Informative)
I need a program to identify them (Score:2, Interesting)
Anti-spam system (Score:2)
And thus you have effectively blocked that email address permanently for the 70% of the population who doesn't understand the above, and who - more importantly - doesn't have the time or interest to make the effort to understand (and that would include people like my mother), or who don't read English well enough to understand it, interest or
Someone already cracked it... (Score:2, Redundant)
In the end, it is only a deterrent. But it is definitely not close to foolproof.
(note that this technique
Does it scale? (Score:3, Insightful)
Just some hypotheticals.
Let me be the first to say it (Score:2, Insightful)
Why graphics? (Score:5, Insightful)
Solomon Chang
Re:Why graphics? (Score:3, Interesting)
already been done (Score:5, Funny)
Rachael: Is this testing whether I'm a replicant or a lesbian Mr Deckard?
Deckard: Just answer the questions please.
Prediction... (Score:3, Insightful)
CAPTCHAs are useless with cheap labor now (Score:3, Insightful)
I had a conversation with a senior executive at a former employer.
He told me that, just as companies were outsourcing tech support to India/China/etc, companies which handled mass-emailing were also outsourcing work to have people sit there and recognize CAPTCHAs as well as respond to those stupid validation things some people try with their email (ie, you have to respond back to some silly email from their server saying "yes, I do ACTUALLY want to email you"). The mass-emailing companies would forward all the responses they got to a mailing to the company, and rooms of people would go through them all.
Very little training was required for the CAPTCHAs, and only rudimentary English for the email-response things.
Don't invest time in these things yet. (Score:3, Interesting)
I work at a school for the deaf and blind, and CAPTCHAs make it impossible for the blind or many of the vision impaired to do many things on the Internet without having help from someone with good vision. Even I, with my cheap LCD monitor and 73-year-old eyes, have trouble reading the Yahoo ones.
Re:Don't invest time in these things yet. (Score:4, Insightful)
To allow governments to actually control the content of websites on such a fine level seems rather draconian to me. Also, while they're typically buried, some websites provide an audio-based alternative; I know that Hotmail offers this. It seems to me that you should rather lobby websites which offer no alternative for blind or vision-impaired users to change their policies.
Finally, I'd like to note that with relatively young eyes and a surplus CAD-workstation monitor, I also find the Yahoo CAPTCHAs difficult to see. The problem is not your eyes, it is rather that in trying to make graphics illegible to computers the algorithm has managed to make the graphics illegible to humans as well.
I Cannot believe (Score:3, Funny)
Re:Don't invest time in these things yet. (Score:3, Insightful)
A Simple Improvement? (Score:2, Interesting)
Now, I'm not suggesting that it is easy for a computer to read these words; but, wouldn't this darker text colour make it easier for a learning algorithm to "dissect" two letters that intersect slightly?
I can't imagine that re
solving the handwriting problem (Score:4, Interesting)
Obligatory checklist (Score:4, Funny)
(X) technical ( ) legislative ( ) market-based ( ) vigilante
approach to fighting spam. Your idea will not work. Here is why it won't work. (One or more of the following may apply to your particular idea, and it may have other flaws which used to vary from state to state before a bad federal law was passed.)
( ) Spammers can easily use it to harvest email addresses
(X) Mailing lists and other legitimate email uses would be affected
( ) No one will be able to find the guy or collect the money
( ) It is defenseless against brute force attacks
( ) It will stop spam for two weeks and then we'll be stuck with it
(X) Users of email will not put up with it
( ) Microsoft will not put up with it
( ) The police will not put up with it
( ) Requires too much cooperation from spammers
( ) Requires immediate total cooperation from everybody at once
(X) Many email users cannot afford to lose business or alienate potential employers
( ) Spammers don't care about invalid addresses in their lists
( ) Anyone could anonymously destroy anyone else's career or business
Specifically, your plan fails to account for
( ) Laws expressly prohibiting it
( ) Lack of centrally controlling authority for email
( ) Open relays in foreign countries
( ) Ease of searching tiny alphanumeric address space of all email addresses
( ) Asshats
( ) Jurisdictional problems
( ) Unpopularity of weird new taxes
( ) Public reluctance to accept weird new forms of money
( ) Huge existing software investment in SMTP
( ) Susceptibility of protocols other than SMTP to attack
( ) Willingness of users to install OS patches received by email
( ) Armies of worm riddled broadband-connected Windows boxes
( ) Eternal arms race involved in all filtering approaches
( ) Extreme profitability of spam
( ) Joe jobs and/or identity theft
( ) Technically illiterate politicians
( ) Extreme stupidity on the part of people who do business with spammers
( ) Dishonesty on the part of spammers themselves
( ) Bandwidth costs that are unaffected by client filtering
( ) Outlook
and the following philosophical objections may also apply:
(X) Ideas similar to yours are easy to come up with, yet none have ever
been shown practical
( ) Any scheme based on opt-out is unacceptable
( ) SMTP headers should not be the subject of legislation
( ) Blacklists suck
(X) Whitelists suck
( ) We should be able to talk about Viagra without being censored
( ) Countermeasures should not involve wire fraud or credit card fraud
( ) Countermeasures should not involve sabotage of public networks
( ) Countermeasures must work if phased in gradually
( ) Sending email should be free
( ) Why should we have to trust you and your servers?
( ) Incompatibility with open source or open source licenses
( ) Feel-good measures do nothing to solve the problem
(X) Temporary/one-time email addresses are cumbersome
( ) I don't want the government reading my email
( ) Killing them that way is not slow and painful enough
Furthermore, this is what I think about you:
(X) Sorry dude, but I don't think it would work.
( ) This is a stupid idea, and you're a stupid person for suggesting it.
( ) Nice try, assh0le! I'm going to find out where you live and burn your
house down!
(From http://www.craphound.com/spamsolutions.txt)
Re:Obligatory checklist (Score:3, Interesting)
(X) Mailing lists and other legitimate email uses would be affected
You shouldn't sign up for the mailing list with your non-subaddress account.
(X) Users of email will not put up with it
Why? It should be automatic. If done on a massive scale (de-facto industry standard), people can believe that it'll take two weeks to convert, and then spam will be gone. They will put up with
This sucks. (Score:4, Insightful)
US govt contractors won't be able to use it (Score:4, Insightful)
Many companies that do business in the United States of America are subject to regulations [ada.gov] that forbid them from discriminating against people with disabilities; companies that have significant contracts with the United States Government are subject to the stricter guidelines of Section 508 of the Rehabilitation Act [section508.gov]. Anything that discriminates so flagrantly against people with vision or cognitive disabilities may get companies in trouble with the law.
why? (Score:3, Insightful)
Spam is a problem, but for me at least, this ain't the solution! I'm not about to jump through these hoops. If you want to exchange e-mails with me, fine. This system tells me you don't.
A lot of people won't understand it, and a lot of people who do are going to ignore it and move on to the next message in the inbox.
Problems with This System (Score:5, Insightful)
- It uses a whitelist as a means of solving spam. The system claims to allow strangers to effectively email each other, but only after first forcing the user to jump through several hoops. Correspondence will be slowed, and many people may give up in irritation before they bother to send the mail a second time. Imagine a prospective employer who decides that it's not worth tracking down Joe Blow because the email didn't get through, or a university attempting to contact a student by email. This particular method of foiling spam eliminates one of the key benefits of email: easy correspondence with a fast response time.
- Users have to maintain a database of trusted senders, as well as another database of recipients who trust them. This means extra data and the possibility of users accidentally falling off of each other's whitelists whenever somebody loses their address book.
- It will generate too many bounced messages, thus increasing network overhead to a point where it really may not be much better than spam. It also requires transmission of graphics, which again increase system overhead, as well as extra computational time to generate said images and to register and process the responses.
- The system claims it will benefit from server-side cooperation, instead of keeping the method purely client-side. This means that users have to rely on the benevolence of their ISP to keep the system updated and maintained.
- The graphical images contain a fixed number of very easily discerned letters that can be combined to form "easily-remembered" words. Once the letters are extracted, they can be recombined into known sequences, first of common English words, then popular web slang, then even transcribed into 1337 for the heck of it. Shouldn't take long to hack that.
- Sub-addresses? So you want to explain this one to my parents? "I know you picked out one, simple email address that you really like and will never have to change, but now I want you to pick out a new one. It might be a good idea to change it once every few months or so, too." The whole purpose of an address is to allow someone to have a unique identity that can be easily found.
Honestly, this particular system sounds like it relies more on sheer grunt work and the wasted time of its users to make it work, rather than any innovative computer programming.
Re:Problems with This System (Score:3, Interesting)
1. If you emailed an employer your resume, he would automatically be whitelisted. His reply would go through to your inbox, and he would be sent a valid subaddress in plaintext that could be automa
CAPTCHA problems resolved (Score:3, Interesting)
I find the classification of these measures as "abusive" to be flawed at best, and misleading at worst. CAPTCHAs are a desperate response to an immoral group of people who will stop at nothing to make money with absolutely no regard for the problems, cost, and distress they cause their targets, who hide behind the first amendment when possible, or use illegal techniques when not. I hate having to deal with them myself, but I understand the necessity of their existence, however unpleasant, and will continue to deal with them as long as is necessary, as such.
Below are several problems mentioned with CAPTCHAs, as well as some possible solutions:
1] Accessibility
Problem: Blind/visually impaired users cannot reliably read the altered text.
Solution: Audio file accompanies every graphic, to be read on command. (However, still crackable with speech recognition.)
2] Referring test to 3rd parties
Problem: Spammers have other membership-based site users (i.e. porn sites) do the test.
Solution 1: Image is generated randomly, based on a user session, requiring an actual visit to your site; copying will be less effective unless the images are compared later... which may be quite some time if there are a large number of images and/or if the images are generated live on the server, rather than being stored files.
Solution 2: Include text embedded in the image (and audio file) specifically referencing the site it is to be utilized with exclusively, requesting that the user report violations of duplication/unauthorized usage, and possibly offering a small reward for information leading to the arrest/conviction/judgment against the violator.
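A rough sketch of how Solutions 1 and 2 might fit together, assuming a server that keeps a secret key and can render text into an image (rendering is left out here). Everything named below (`SERVER_KEY`, `issue_challenge`, `verify`, the session ids) is hypothetical; the point is only that the expected answer is tied to one session, so an answer harvested elsewhere does not validate under a different session.

```python
import hashlib
import hmac
import secrets

SERVER_KEY = secrets.token_bytes(32)  # hypothetical per-server secret, never sent out


def _tag(session_id, answer):
    """HMAC binding an answer to one session."""
    message = f"{session_id}:{answer}".encode()
    return hmac.new(SERVER_KEY, message, hashlib.sha256).hexdigest()


def issue_challenge(session_id, site_name):
    """Create a fresh random challenge for this session.

    Returns (answer, caption, tag). The answer and caption are meant to be
    drawn into the image only; the tag can be stored or sent with the form.
    """
    answer = secrets.token_hex(3)  # six lowercase hex characters
    caption = f"For use on {site_name} only. Please report it if seen elsewhere."
    return answer, caption, _tag(session_id, answer)


def verify(session_id, typed, tag):
    """Check the typed text against the tag issued for this session."""
    expected = _tag(session_id, typed.strip().lower())
    return hmac.compare_digest(expected, tag)
```

Note that this does nothing against a relay where the attacker keeps the original session alive and simply forwards the image to a human; it only stops stored or copied images from being reused. A real deployment would also want to expire each tag after one use.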
3] AI text processing
Problem: AI can be complex enough to identify letters, no matter how obfuscated, until such characters must be so distorted that even a human cannot decipher them.
Solution: Ask a logic question, present a photograph, or require another means of challenge/response than simple text recognition.
Example 1: Present a photograph of an apple or otherwise easily-spelled object, and ask the user to type the name into a field, or allow the user to select from a group of mildly distorted text, to avoid spelling issues. (However, this raises the accessibility issue again.)
Example 2: Present a short list of slightly distorted words (with audio files available for each word), and ask a short logic/history/other question. (One | Two | Three | Four | Orange - Of these words, one does not match. Please type the number of letters in this word, in numeric format. (Example: Apple = 5) This test is to be used exclusively by abc123.org. Please let us know if you see this elsewhere, as this means it was stolen.)
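Example 2 could be generated along these lines. This is a minimal sketch under the assumption that the word lists are supplied by the site; `make_challenge` and `check` are made-up names, and the distortion and audio parts are not shown.

```python
import random


def make_challenge(matching, odd_one, rng=random):
    """Build an odd-one-out prompt and the expected numeric answer.

    matching: words that belong together (e.g. number words).
    odd_one:  the word that does not match.
    rng:      any object with a shuffle() method; defaults to the random module.
    """
    words = list(matching) + [odd_one]
    rng.shuffle(words)  # so the odd word is not always in the same position
    prompt = " | ".join(words)
    expected = str(len(odd_one))  # answer is the letter count, in digits
    return prompt, expected


def check(response, expected):
    """Accept only the numeric form, ignoring surrounding whitespace."""
    return response.strip() == expected
```

With the words from the example, `make_challenge(["One", "Two", "Three", "Four"], "Orange")` expects the answer "6". The answer space is tiny, so a bot guessing single digits at random would get through fairly often; this only raises the bar if the question types vary a lot.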
Until it is financially infeasible for a spammer to continue to do business, we will all be forced to deal with the messes they make. This is a challenge/response system, not an attempt to abuse the users of the internet. If there were a better way to solve this problem than hitting "delete" (which must happen hundreds if not thousands of times per day, for some of us), or using filters (which ALL give false positives, eventually), you can be sure that millions of semi-knowledgeable or better computer users would have chosen this path. To claim that such measures, which attempt to HELP people are abuse... perhaps you would like to re-evaluate your claim.
Won't be cracked in ten years? Ha! (Score:5, Insightful)
Let's look at his "LUCKY" example to see why. So he has a picture of the standing man, the flower, and the sitting man, and all over the picture, he has a series of glyphs. As these glyphs are not distorted, they are easily extracted -- the whole point of this system is that distortion based CAPTCHAs are relatively easy to defeat, so he doesn't bother. In his example, he has 26 glyphs, corresponding to A-Z, but in practice, it isn't important what the set is -- only that it is small and finite.
Once this set is extracted, we know that the "password" is some permutation of this set. Because the set of possible characters in an e-mail address is much smaller than the set of possible characters in an actual password (in particular, e-mail addresses are case insensitive), brute-force cracking of this password is much simpler than brute force cracking of a UNIX password, for example. But luckily for us, it's even easier than that.
In the e-mail, he includes this "decoder" list.
Of course, it should be clear at this point that this list would be relatively easy to extract from the e-mail, and further, that it tells you the exact length of the password, reducing the number of permutations to check to (in this case) 11,881,376.
Furthermore, a little bit of extra logic could reduce this number still further by noticing repetitive patterns in the list. So if "The Leaf of the Flower" appears twice, we know that the letters in those two slots are the same. And if the glyph set is unique (ie, no glyph appears twice), then we can reduce the number of permutations to at most 7,893,600.
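The two figures above are 26^5 and 26*25*24*23*22, which is easy to check. The sketch below (Python, and not the program the poster ran) recomputes them and also shows the repeated-descriptor trick: slots with the same decoder entry share one letter, so only the distinct entries need to be enumerated. The descriptor strings and the `candidates_for` helper are invented for illustration.

```python
from itertools import product
from math import perm
from string import ascii_lowercase

ALPHABET_SIZE = 26
LENGTH = 5  # length of the "LUCKY" example

# Any letter may appear in any slot: 26^5
with_repeats = ALPHABET_SIZE ** LENGTH
# No glyph used twice: 26 * 25 * 24 * 23 * 22
without_repeats = perm(ALPHABET_SIZE, LENGTH)


def candidates_for(slots, alphabet=ascii_lowercase):
    """Lazily yield every candidate string consistent with a decoder list.

    slots: one descriptor per character position; equal descriptors
    must decode to the same letter, so they are assigned together.
    """
    distinct = list(dict.fromkeys(slots))  # unique descriptors, original order
    for letters in product(alphabet, repeat=len(distinct)):
        mapping = dict(zip(distinct, letters))
        yield "".join(mapping[s] for s in slots)
```

A five-slot list with one repeated descriptor has only four distinct entries, so the search drops from 26^5 to 26^4 = 456,976 candidates before any dictionary filtering.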
Now, that's still a fairly large number of permutations to check, and at one point, it probably would have been enough. However, computational power is free now, at least for spammers. And it doesn't take much. Here's a sample perl (!) program I ran on my Debian GNU/Linux laptop (1.2GHz Pentium M).
This just prints out all the permutations; of course they still would need to be checked.
Not very long on a modern computer, eh? And written in perl, too, not exactly the fastest programming language in the world. Now consider that spammers have access to just about infinite CPU and bandwidth, thanks to their army of zombie bots, and that both CPU power and bandwidth are likely to increase at a rather rapid rate in the next decade. Furthermore, this is a worst case scenario -- success in a brute force attack tends to occur somewhere in the middle, not towards the end, reducing the necessity to actually go through all the permutations.
You don't think they'd try to crack it?
Plus, by his own admission, e-mail addresses can be shared. What does this mean in this context? I don't even need to get the e-mail address encoded in the CAPTCHA! If I can get any working e-mail address, even one, I get through! So the more active he is, e-mail wise, the more likely I can randomly strike a hit in the first hundred or so tries.
On top of