Now Even Photo CAPTCHAs Have Been Cracked
MoonUnit writes "Technology Review has an interesting article about the way CAPTCHAs are fueling AI research. Following recent news about various textual CAPTCHAs being cracked, the article notes that a researcher at Palo Alto Research Center has now found a way to crack photo-based CAPTCHAs too. Most approaches are based on statistical learning, however, so Luis von Ahn (one of the inventors of the CAPTCHA) says it is usually possible to make a CAPTCHA more difficult to break by making a few simple changes."
damn it (Score:5, Insightful)
They're already hard to read. Why do I feel that soon I won't be able to read ANY of them!?
Re:damn it (Score:5, Funny)
Re:damn it (Score:5, Funny)
These programs are Satan's rectum, poised to let loose over the web.
Re:damn it (Score:5, Funny)
So CAPTCHA images are ineffective at blocking the bots. No surprise. It won't be long before these AIs start joining Yahoo or Google mail for the same reasons we do: Chatting.
tiredbot&yahoo.com : "Boy I had a rough day at work today. My user wanted me to compile a new program AND surf the internet at the same time!"
spamalot@gmail.com: "Wow rough. I was lucky. My user took the day off, so I just spent the day spamming. I love how those humans react - sending me hategrams. hahahahaha! That just makes me want to send more spam! Fools."
tiredbot&yahoo.com : "You are so bad girl."
Re:damn it (Score:5, Funny)
Re: (Score:2)
Just a thought, but there are projects in the wild that are scanning in written text and converting it to digital. I was wondering if this technology could be applied here?
Re:damn it (Score:5, Insightful)
Re:damn it (Score:5, Interesting)
Ah...reminds me of one of my favorite t-shirts:
http://www.tshirthell.com/funny-shirts/fuck-the-colorblind/ [tshirthell.com]
The underlying problem is that we're running out of things that are easy for people but hard for computers. Most attempts to expand or 'improve' visual CAPTCHA at this point will cause more pain to humans than reduction in computer success.
So, let's change directions, and make the computer solve a different sort of problem. For example, a Turing test of sorts, where the problem is to solve something that is difficult to parse programmatically, but relatively easy for a person to answer. Maybe the recent Turing test results are a good indication of what the questions should be. Multiple related questions would be a particularly interesting area; for example, ask related questions where pronouns are ambiguous (to a computer).
Re:damn it (Score:5, Interesting)
Ah-hah! I've got the answer to our CAPTCHA problems:
We just make them so hard that it becomes impossible for a human to solve it. Then we invert the solution: if you pass the CAPTCHA, you're obviously a bot, because a human can't solve it. FAIL the CAPTCHA, we know that you're human.
Re:damn it (Score:5, Interesting)
You say this in jest, and I admit it made me smile, but we did something somewhat like this.
We have a website with a contact form on it, that gets lots of spam. After numerous discussions with marketing about implementing CAPTCHAs, we decided to simply put a text box on the form that says "leave this blank", with the HTML form field named "comment". Humans leave it blank. And sure enough, the spammers cram their links into all form fields, so we can ignore their crap.
We initially even made the field hidden (CSS font color and field color the same as the background), so a user wouldn't even see it. That was great.
Not a perfect solution for all cases, but it worked pretty well for us.
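For anyone who wants to try the same trick, here is a minimal sketch of the server-side check, assuming the submitted form arrives as a plain dictionary. Only the field name "comment" comes from the description above; the function name and the sample submissions are made up for illustration.

```python
# Hypothetical honeypot check: the "comment" field is the one labelled
# "leave this blank". Humans leave it empty; bots that fill every field do not.

def is_probable_spam(form_data: dict) -> bool:
    """Return True when the honeypot field contains anything but whitespace."""
    honeypot_value = form_data.get("comment", "")
    return honeypot_value.strip() != ""


# Illustrative submissions (assumed data, not from a real form).
human_submission = {"name": "Ann", "message": "Hello", "comment": ""}
bot_submission = {"name": "x", "message": "buy now", "comment": "http://spam.example"}

for submission in (human_submission, bot_submission):
    verdict = "discard" if is_probable_spam(submission) else "accept"
    print(submission["name"], verdict)
```

As the post says, this only stops bots that blindly fill every field; a spammer who looks at the form once can trivially skip the trap.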
Re: (Score:2)
In general, if you are relatively small, I think a custom solution is one of your best defenses against spammers. At the end of the day, spamming is about getting as many people as possible to see your spam for as little effort as possible. Investigating a contact form just to spam one small forum, or a contact form for a few people at a company, just isn't worth it.
Re: (Score:2)
I've been wondering myself - at what point do these become like DRM (i.e. pointless)?
They get harder and harder for legit users to get right, yet the Bad Guys(TM) have ways to get around them with ease. At some point they just become an annoyance and an impediment to real users without stopping what they are supposed to stop. They also suffer from the same problem as DRM: you hand over the keys to the castle and expect a hurdle to stop them from being used.
CAPTCHAs kick-start Singularity (Score:3, Interesting)
I'm sure I read a short story somewhere that featured the spam-bot arms-race triggering the singularity...
Re:CAPTCHAs kick-start Singularity (Score:5, Funny)
Re: (Score:2)
If you still have a small penis, simply get a notarized note from your doctor stating it is so, and you can get your money back!
My favorite recent scam, as reported in the press [cincinnati.com]:
Re: Rich as hell...... (Score:2)
Re: (Score:2)
No idea if it is the one you are thinking of, but that scenario is mentioned in Cory Doctorow's story 'I, Row-boat'
Re: (Score:3, Interesting)
Re: (Score:3, Informative)
Re: (Score:3, Informative)
Sounds like the premise to /usr/bin/god [wikipedia.org] to me.
Re:CAPTCHAs kick-start Singularity OR,,, (Score:2)
or Skynet!
(Of course if Skynet can give us intelligent self-willed robots like Cameron, that might not be such a bad thing.)
Re:CAPTCHAs kick-start Singularity OR,,, (Score:4, Funny)
Ah. So you appreciate Cameron for her intelligence huh?
Me too. Exactly.
(Model T-6969 I think right?)
Re: (Score:3, Funny)
Oh sh8t, now I have to protest *both* the LHC and captcha's. Thanks, bub.
I don't get it (Score:5, Interesting)
Simple math or site-relevant questions (I'm talking about "What's 5 - 3") are not only easier for humans to read, they're also harder for automated parsing software to crack.
Re:I don't get it (Score:5, Funny)
Re:I don't get it (Score:4, Insightful)
Simple math or site-relevant questions (I'm talking about "What's 5 - 3") are not only easier for humans to read, they're also harder for automated parsing software to crack.
How do you figure that would be harder for automated parsing software to crack? I would think that would be many times easier than to ICR an image that is purposely obfuscated. (I used to work on ICR software and I'd rather write an automated-question-parser)...
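To illustrate how little work such a parser might take, here is a rough sketch that handles questions of the "What's 5 - 3" kind. It assumes the question contains two integers joined by +, - or *; the names below are invented for the example, and a real question bank would need more cases.

```python
import operator
import re

# Maps the operator symbol found in the question to an arithmetic function.
OPERATIONS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

# Two integers separated by one supported operator, with optional spaces.
PATTERN = re.compile(r"(\d+)\s*([+\-*])\s*(\d+)")


def answer_math_question(question: str):
    """Return the integer answer, or None if no simple expression is found."""
    match = PATTERN.search(question)
    if match is None:
        return None
    left, symbol, right = match.groups()
    return OPERATIONS[symbol](int(left), int(right))


print(answer_math_question("What's 5 - 3"))
```

Spelled-out numbers ("five minus three") would take a small word table on top of this, which is still far less effort than recognizing distorted characters in an image.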
Re: (Score:2)
I run a small forum for an MMO, and we solved the issu
Comment removed (Score:5, Insightful)
Re: (Score:2)
Re:I don't get it (Score:5, Insightful)
You have to consider the source of the questions. If the questions are human-generated, it's not economically feasible. Remember that they can train their CAPTCHA-defeating software by paying large numbers of people to supply the answers to CAPTCHAs. Even a very large database could fall to that approach.
If the questions are machine-generated, then you're pitting a machine generating questions and answers against a machine designed to answer questions.
Re: (Score:2)
Well, you have a point, but there are other ways, and no single way should be seen as the silver bullet. For example:
damnit, I had a really good reply, but it contained too many junk characters... go figure
Get the questions from the users (Score:4, Interesting)
How about asking every nth person successfully logging in to generate a question? Apply a lameness filter and then perhaps ask another randomly chosen user to verify that the question is reasonable. Reject duplicates and questions that too many people can't answer.
Re: (Score:2)
Something like KittenAuth [thepcspy.com] has been recommended, and still seems to be the best answer in my opinion.
This can be taken to randomly selected animals, not just cats. If someone develops an AI that can determine what type of an animal each is, then GOOD, we are one step closer to AI. Next would be cuteness/hairy looking/ugly/happy looking/etc. for each random animal. Just keep going a step further.
Any wo
Re: (Score:2)
Re: (Score:3, Interesting)
The problem is that you cannot generate pictures of kittens automatically.
Of course you can; that's what we have 3D graphics for. With 3D graphics you can randomly vary the pose, texture, background, camera angle and so on, so you can produce a pretty much infinite number of 2D cat pictures. The spammer only gets to see the final 2D render, not the 3D data used to generate it, so you can easily generate the pictures while the spammer will have a very hard time getting information out of them. And if cats aren't enough,
Re: (Score:3, Informative)
If I read the article and summary correctly, it's exactly the sort of CAPTCHA you're suggesting that people have found a reasonably-good solution to.
Unfortunately, often these solutions aren't actually useful AI solutions.
Re:I don't get it (Score:4, Interesting)
Asirra asks users to correctly classify images of either cats or dogs using a database of three million images provided by animal-rescue organizations.
Only cats and dogs. Like I said earlier, don't limit it to just a few species. Pick one at random.
Example: You are shown 20 pictures, all of random animals, it asks which one is the cutest aardvark, then which is the happiest turtle. Continuing random traits with random animals. Their flaw was limiting it to just dogs and cats.
Or to take it to a different level. Most attractive/sexy/cute/old/etc. female (or male). Computers cannot tell what is the "most" prevalent "society" based trait of a picture. Yes, there are programs that make people's photos "more attractive", but that tends to fail half the time, not to mention that it doesn't compare against 12 other people.
The TFA program only knows, "given x what is a y". And that had a 50% chance to guess between cat/dog. Not: given a-x, rank y in order from best to worst.
Re: (Score:2)
This suddenly feels very relevant to the earlier discussions on Turing Tests. What we need is a computer that can accurately determine whether it is communicating with another computer or a human. That's what a captcha attempts to do - by using visual recognition as a function that a computer cannot replicate. Problem is - a computer CAN perform visual recognition, with increasing accuracy. And while 15% may not win any prizes, it's plenty to perform brute force attacks.
I don't know - maybe a traditiona
Re:I don't get it (Score:5, Funny)
you're pitting a machine generating questions and answers against a machine designed to answer questions.
You make it sound like that's hard. Here's a question that a machine could generate that another machine could not answer:
"What number am I thinking of?"
Re:I don't get it (Score:5, Funny)
Good idea. Here are a few questions to start with:
1) What is the best editor: Vi or Emacs?
2) Was there a cabal?
3) Did Romero make you his bitch?
4) Rick Astley would never: give you up; let you down; run around and desert you; make you cry; say goodbye; tell a lie and hurt you?
Re:I don't get it (Score:5, Interesting)
You can even take this approach one step further and use CSS to move the field outside the viewable range of the page or set its visible property to false so the user won't even see it.
Re:I don't get it (Score:5, Insightful)
Re: (Score:2)
Then I just get your database and give it to the Bot ....
Re: (Score:3, Informative)
Yeah, that's solved [google.com]. It's not hard at all for automated parsing software to call another online tool.
Re: (Score:2, Interesting)
Re: (Score:3, Informative)
To detect humans, wouldn't it be easier and less costly, and perhaps even more effective, to hold a large database of questions that are readable and solvable only by humans?
I guess the question becomes how large is large. If you reuse tests too much then the spammers will just build their own database of solutions.
Using a database of non-computer-created challenges is a good idea, but there needs to be a system for keeping that database topped up. reCAPTCHA, for example, picks out words from old books that
Re:I don't get it (Score:4, Funny)
Simple math or site-relevant questions (I'm talking about "What's 5 - 3") are not only easier for humans to read, they're also harder for automated parsing software to crack.
If you really wanted to screw with these bots, you would've made the question 4 divided by 0. :-)
How about (Score:5, Interesting)
Instead of asking someone to type in the letters, numbers or how many cats there are in the photo, just randomly generate some scenario:
"Jim and Sue go to the park on Sunday. Billy the dog goes too."
Then you can ask random questions like:
"What is the name of the dog?"
"What day did they go to the park?"
"Where did they go?"
That might work OK for a while...
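A quick sketch of what such a generator could look like, assuming small hand-written word lists and one fixed sentence pattern. All the lists and function names here are invented for illustration, and the question wording is lightly adapted from the examples above.

```python
import random

# Assumed word lists; a real deployment would need far larger ones.
NAMES = ["Jim", "Sue", "Ann", "Tom"]
PETS = ["Billy", "Rex", "Max"]
PLACES = ["park", "beach", "zoo"]
DAYS = ["Sunday", "Monday", "Friday"]


def make_challenge(rng: random.Random):
    """Return (scenario, question, expected_answer) built from one template."""
    first, second = rng.sample(NAMES, 2)
    pet = rng.choice(PETS)
    place = rng.choice(PLACES)
    day = rng.choice(DAYS)
    scenario = f"{first} and {second} go to the {place} on {day}. {pet} the dog goes too."
    questions = [
        ("What is the name of the dog?", pet),
        ("What day did they go?", day),
        ("Where did they go?", place),
    ]
    question, answer = rng.choice(questions)
    return scenario, question, answer


def check_answer(expected: str, given: str) -> bool:
    """Compare case-insensitively, tolerating a leading 'the'."""
    normalized = given.strip().lower()
    if normalized.startswith("the "):
        normalized = normalized[4:]
    return normalized == expected.lower()


scenario, question, answer = make_challenge(random.Random())
print(scenario)
print(question)
```

Note that every answer is a word copied straight out of the scenario, so a bot that has seen the pattern a few times can pull it out by position.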
Re:How about (Score:4, Insightful)
That would work wonderfully, if you could truly randomize it (by which I don't mean anything so stringent as neutron sources or the like), rather than using a library of question templates.
The problem, though, is that you need a better quality of AI to generate arbitrary easy-but-obscure questions than you do to solve them... Keep in mind you need questions that anyone with a 3rd-grade education could read and solve, which limits you to simple grammar, small words, concrete ideas, and no math harder than addition, subtraction, and inequality. Modern AI can already parse and solve those problems fairly well.
So, you end up using a library of question templates, and once an attacker has seen enough of them, he can reliably fill in the blanks and arrive at a deterministic answer, no massive CPU power or cool AI required.
Re: (Score:2)
It's like an entrance exam. If you can't pass this simple test you can't play here, go home.
Re:How about (Score:5, Insightful)
Keep in mind you need questions that anyone with a 3rd-grade education could read and solve
Why? Personally, I'd prefer to participate in forums that require a college level education to participate in.
Re:How about (Score:4, Funny)
And you're participating in slashdot because...?
(Oh, I suppose that there probably is no such forum...)
Re: (Score:2)
that is one of the best ones i have seen in a while..
and if we stick some math ones in we might keep the kids off too.. it's a win/win
(i have mod points and would have modded you +ins but it doesn't seem to want to work today)
Re:How about (Score:4, Insightful)
Re: (Score:2)
OK, so that's 1 in 6 that get past it. With not much work, you could make it a lot harder. Using a bit of the original example:
"Jim and Sue go to New York on Sunday. Billy the dog goes too. Did they see the Astros play at home?"
By adding in current events and some very well-known facts (which admittedly will exclude some people), you can really make it difficult.
Then, use the fact that this is not in isolation. Always fail the CAPTCHA if the HTTP client doesn't send the right cookie, which it got from the page that refers you to the page with the CAPTCHA. If the CAPTCHA fails, then fail any CAPTCHA atte
Re: (Score:2)
Re: (Score:2)
Parse through each word of the sentence and fill in the blank. Shouldn't take too long.
Re: (Score:2)
If the questions are truly random *and* you only get one crack at a time (the scenario, question, and thus answer, change each time you hit 'submit'), it might take a bit longer for an AI to learn. Throw in some fun CSS and JavaScript for generating the actual text such that it doesn't appear verbatim in the actual HTML code, and you make things even more fun. Add to that layers such that the text merely shows up because of overlapping div tags so that even if you do have a CSS and JS engine working on t
English Speakers (Score:2)
If your site has non-English speakers, they are going to have more difficulty grokking the nuance of your challenge than a computer will.
Re: (Score:2)
that won't work at all. even semi-retarded question-answering systems will be able to pick up such relationships.
read: http://www.google.com/search?q=trec+question+answering [google.com]
when... (Score:4, Insightful)
Not a security feature (Score:4, Interesting)
All in all, it's time to get rid of CAPTCHA and move on to some more logical system that would be more difficult, such as a system where users are asked to answer a simple question that contains the answer, such as:
If you were born in 1973 and JFK was shot in 1961, were you alive when he was shot?
How many liters of water fit into a five-liter bottle?
Re:Not a security feature (Score:4, Insightful)
Of course CAPTCHAs are a security feature. Unless you have some irrational hatred of robots that inspires you to bar them from your websites, you're trying to keep them out for security reasons.
Re: (Score:3, Informative)
Wrong. Most sites with CAPTCHAs are trying to keep out automated systems because they are abusive. But this is not "security" any more than banning abusive human posters is "security".
Re: (Score:3, Insightful)
In the computer world, I always consider "security" to be a matter of allowing authorized people in and keeping unauthorized people out. CAPTCHAs are more a case of determining whether a particular user is desirable or not, not a case of authorization.
Re:Not a security feature (Score:5, Insightful)
CAPTCHA is not a security feature. It's a way to help avoid robots pretending to be humans. Anyone using it as a security feature is just giving more reasons for people to find ways to break them. All in all, it's time to get rid of CAPTCHA and move on to some more logical system that would be more difficult, such as a system where users are asked to answer a simple question that contains the answer, such as: If you were born in 1973 and JFK was shot in 1961, were you alive when he was shot? How many liters of water fit into a five-liter bottle?
It sounds like a great idea, but I've met plenty of people who wouldn't be able to answer either of your questions. To steal a random quote from the internet:
"Back in the 1980s, Yosemite National Park was having a serious problem with bears: They would wander into campgrounds and break into the garbage bins. This put both bears and people at risk. So the Park Service started installing armored garbage cans that were tricky to open -- you had to swing a latch, align two bits of handle, that sort of thing. But it turns out it's actually quite tricky to get the design of these cans just right. Make it too complex and people can't get them open to put away their garbage in the first place. Said one park ranger, 'There is considerable overlap between the intelligence of the smartest bears and the dumbest tourists.'"
Re: (Score:3, Insightful)
To be fair, the bears have more time to figure out the can. A tourist will just toss the trash on the ground if it takes more than a minute to open the can. The bear, on the other hand, may spend hours if it smells something good.
Re: (Score:2)
The bear, on the other hand, may spend hours if it smells something good.
Another area in which there is significant overlap between bears and humans. We just need to get people to eat stuff that smells bad and we can solve the bear problem.
Re: (Score:2)
Then consider it a stupid filter for the 'net. If you can't answer those questions, then maybe, just maybe you shouldn't be posting on Internet forums, either.
Re: (Score:3, Insightful)
Hmm... That depends. How much water is in the five liter bottle to start with?
Is there anything else in the bottle?
Does it have to be a whole number of litres?
Assuming an empty bottle, and integral numbers of litres, the following can fit: 0, 1, 2, 3, 4, and 5.
Re:Not a security feature (Score:5, Funny)
Re: (Score:2)
You have either passed or failed the Turing test, I'm not sure yet.
Re: (Score:2, Insightful)
If you have three apples and you take one apple away, how many apples do you have?
Correct answer: 1 (The apple you have. The one you took away and therefore 'have')
Correct answer: 2 (The remaining apples viewing the operation as a mathematical subtraction - expected answer from a child)
Correct answer: 3 (You have three apples. Movement does not imply a change of ownership)
Correct answer: 4 (More tenuous, but no assumption should be made that 'one apple' came from the initial set of 'three apples')
What do you mean...? (Score:4, Funny)
Re:Not a security feature (Score:4, Funny)
And if the web site is a discussion forum, you're exactly what they're trying to keep out.
Re: (Score:2)
Hmm... That depends. How much water is in the five liter bottle to start with? Is there anything else in the bottle? Does it have to be a whole number of litres?
Is it a Klein bottle?
Re:Not a security feature (Score:5, Funny)
> If you were born in 1973 and JFK was shot in 1961, were you alive when he was shot?
I have developed a device that answers random yes/no questions correctly 50% of the time. Me and my flip-a-coin-bot will take over the world!
Re: (Score:2)
Hell's library is filled with story problems. No thanks.
Re: (Score:3, Insightful)
How many of these questions would you have? Suppose you spent the time to make 1000 or 10,000. The attacker would simply have them solved by a group of humans (say using Amazon's Mechanical Turk) and put the question/answer pairs into a dictionary for automated attacks.
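A sketch of how small that dictionary attack is once the answers have been collected. The question comes from this thread, the stored answer is just a placeholder, and all names are hypothetical.

```python
def normalize(question: str) -> str:
    """Collapse case and whitespace so trivially reformatted repeats still match."""
    return " ".join(question.lower().split())


def build_lookup(solved_pairs):
    """Build a dictionary from (question, answer) pairs already solved by humans."""
    return {normalize(question): answer for question, answer in solved_pairs}


def try_answer(lookup, question):
    """Return the stored answer, or None if the question has not been seen."""
    return lookup.get(normalize(question))


# Placeholder data standing in for answers bought from human solvers.
solved = [("How many liters of water fit into a five-liter bottle?", "5")]
lookup = build_lookup(solved)
print(try_answer(lookup, "how many liters of water fit into a five-liter bottle?"))
```

The defender's cost grows with every question written by hand, while the attacker pays once per question and then answers it for free.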
Re: (Score:2)
Re: (Score:2)
Here's the problem, I wasn't born in 1973 so the question is negated right there, but the answer is still "no" (negated questions are always "no").
Additionally, JFK wasn't shot in 1961, it was 1963, so the question is negated twice.
I was born in 1964, but conceived about the time Kennedy was shot, so was I "alive" or not?
The correct answer to such a question is ... The cake is a lie!
Now for the next question, there is again a level of ambiguity that is left to the imagination of the person answering. Is the
Re: (Score:2)
Your first statement is wrong. The question is stated as a hypothetical query, not implying that you actually WERE born in 1973. It's just saying IF you were born in 1973, hypothetically, would you have been alive for JFK being shot? Fixing the date, and it's still a valid question. With a 50% chance of getting it right, but it's still a valid question ;)
Re: (Score:2)
It isn't a valid question because the facts are wrong. That is nothing more than those stupid logic questions we used to get in logic class ....
"If all cats are dogs, and all dogs are horses are all cats horses?"
Huh?
That isn't "logical" at all, because it doesn't include common sense or common knowledge. If you want to abstract things out, at least use fictional characters.
"Billy was born in 1973, Johnny died in 1960, was Billy alive when Johnny died?"
Teaching people to ignore "truth" isn't logical. You see
Re: (Score:2)
The typical human will be stumped by those questions.
Re: (Score:3, Interesting)
That is also a CAPTCHA [wikipedia.org], "Completely Automated Public Turing test to tell Computers and Humans Apart." A CAPTCHA doesn't have to be text in an image, that is just an easy test to auto-generate.
And, it fails the "solve problems for porn" test. The problem is spammers using real people to do stuff en masse, so any kind of CAPTCHA wouldn't prevent that.
Re: (Score:2)
If you were born in 1973 and JFK was shot in 1961, were you alive when he was shot?
What if you believe in reincarnation????
Re: (Score:2)
That's where you're wrong. You'd have to replicate the source data set and relationships therein... and THAT is a non-trivial feat many times. Date comparison isn't terribly secure, but say you build a database of things like John is 5'5", Suzy is 5'6", Steve is 6'1". Then the machine spits out "Suzy is taller than John, Suzy is shorter than Steve, who is the tallest?" after a simple randomized query, it'd take some time to start breaking that with a computer, a lot of samples or a direct programming of
Not really broken (Score:2)
Even though the software can recognise the cats 87% of the time, you need to input 12 pictures, so the chance of the attack succeeding drops to 10%.
You could probably make this even harder by putting a cat and a dog in a photo and telling the user to pick photos that ONLY have cats in them.
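The arithmetic behind that kind of estimate can be checked quickly, assuming each of the 12 pictures is classified independently with the same accuracy. The function names are made up for this sketch.

```python
def challenge_success(per_image_accuracy: float, images: int = 12) -> float:
    """Chance of getting every image right, assuming independent guesses."""
    return per_image_accuracy ** images


def accuracy_needed(target_success: float, images: int = 12) -> float:
    """Per-image accuracy that yields the given whole-challenge success rate."""
    return target_success ** (1.0 / images)


print(f"{challenge_success(0.87):.3f}")
print(f"{accuracy_needed(0.10):.3f}")
```

Under that independence assumption, 87% per picture works out to roughly 19% for all 12, and a 10% overall success rate corresponds to a per-picture accuracy of about 83%, so one of the two figures quoted above is probably slightly off.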
Re: (Score:2)
Even though the software can recognise the cats 87% of the time...
On a side note, I'm currently using this technology to automate the process of herding cats. [youtube.com]
Re: (Score:2)
> ...the chance of the attack succeeding drops to 10%.
10% is good enough for the spammers.
Re: (Score:2)
so the chance of the attack succeeding drops to 10%.
Which is still plenty high. Remember, automated spamming is very cheap, so you don't need a very high success rate for it to be profitable.
Of course it's possible: But is it doable by humans? (Score:3, Interesting)
Yes, it's possible: But keep in mind that you also have to serve the USER. When the captcha is getting so hard I can't even decipher it anymore (let alone someone with a visual handicap), it's of no use.
I stopped using Rapidshare because of its ultra annoying 'mark the cats'-captcha: I found it near-impossible to get that right (though the other day I noticed they changed that back to ordinary letters).
I am tagging this haha (Score:2)
Cost Puzzle (Score:2)
It's probably more like 30-cents in the 3rd world. I don't think it would be possible for even a machine to significantly beat that rate. The energy to "run" a human is roughly comparable to that of a computer running AI-ware. Plus, the cost of the cat-and-mou
Re: (Score:2)
>> It's probably more like 30-cents in the 3rd world. I don't think it would be possible for even a machine to significantly beat that rate. The energy to "run" a human is roughly comparable to that of a computer running AI-ware. Plus, the cost of the cat-and-mouse AI software adjustments that a human-based approach doesn't need.
For that very reason, maybe it makes sense to invest more heavily in the "cost of effort" type of CAPTCHA - i.e. making the person perform a task in return for getting access.
Single Sign On! (Score:2)
One password and authentication repository for all, handled by a single entity. Or, to paraphrase:
"Nuke the site from orbit. It's the only way to be sure."
But, spammers ARE humans! (Score:4, Interesting)
Well, it seems to me that spammers ARE humans. So trying to detect if the creator of the account is human or not doesn't separate the spammers from the non-spammers.
Think about it: the authenticating machines are designed by humans, and the perpetrating machines are also designed by humans, and the legitimate users are humans too.
Perhaps the problem itself needs to be restated: Allow accounts to legitimate users, deny accounts to spammers. Whether or not there is a human involved on either end seems irrelevant.
- Wyck
Re: (Score:2)
Collaborated security passing (Score:2)
So why, then, don't we think out the learning phases we need to build a really good AI and implement them stepwise as CAPTCHAs?
Of course they will be cracked eventually, so why not use the challenge constructively?
Each time a new CAPTCHA algorithm is cracked, we could move to the next phase and end up with a true AI, in a collaborative effort with "the evil crackers". Each time utilizing an aspect of "human intelligence" which we cannot teach a computer yet, and have someone desperate solve a captcha challenge, so
Re: (Score:2)
Posted by timothy on Tuesday October 14, @03:14PM
from the given-enough-eyeballs dept.
Really, no-one cares who the editors are (do they?) I was assuming that the name under "Posted by" was actually the name of the person who came up with the story. That would be much more helpful than the same old, irrelevant, names that get inserted into the headers.