Baffling the Spam Bots 350
dumpster_dave writes "Scientific American is running an article, Baffling the Bots, on techniques to outsmart and subvert spam bots and their chat-room cousins via CAPTCHA. You have probably seen this in the form of images containing text as gate-keepers to various on-line services. The latest evolution is using non-words and distorting the text such that even the best AI systems cannot decipher them, yet humans cannot help but do so [cf., Gestalt Psychology]."
Blind Users (Score:5, Insightful)
I've often wondered how these types of systems can be made handicapped accessible
Re:Blind Users (Score:3, Interesting)
Re:Blind Users (Score:2)
Re:Blind Users (Score:4, Insightful)
"Then you have to worry about those with poor or no hearing, as well as those with poor or no sound equipment. Why not have someone solve a riddle or puzzle"
Because then you'd be discriminating against stupid people, and keeping them off the internet.
Oh, wait...
Re:Blind Users (Score:5, Funny)
1) Obfuscate e-mail addresses
2) Stop spammers from getting to places containing real email addresses
3) Keep stupid people off the internet so the revenue stream of spammers is cut off.
Dumb Users? (Score:2)
Re:Blind Users (Score:2)
Re:Blind Users (Score:2)
Re:Blind Users (Score:2)
Re:Blind Users (Score:3, Informative)
How about using a *picture*? (Score:2)
Re:Blind Users (Score:4, Interesting)
Most blind users are running Windows with JAWS [synapseadaptive.com] or similar screen-reading software, and sites like ACB [acb.org] release a lot of their content as MP3s already, so I'd assume that most are well equipped to handle web audio.
Re:Blind Users (Score:2)
I'm on 56k, you insensitive clod!
Re:Blind Users (Score:2)
Re:Blind Users (Score:2)
Re:Blind Users (Score:2)
For example:
Of "book", "cat", "tree", and "silver", which is an animal?
Of course, since a bot could try all the permutations here, one try would be all that should be allowed, but that should be enough for a human. I'm sure there's a form that couldn't be brute-forced, but I'd have to think about that a bit more.
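The category question above, with its one-try rule, can be sketched roughly as follows. This is only an illustration: the word lists, `make_question`, and `OneShotChallenge` are hypothetical names, and a real deployment would need far larger tables than the handful of words assumed here.

```python
import random

# Hypothetical word lists; each word belongs to exactly one category.
CATEGORIES = {
    "animal": ["cat", "dog", "horse"],
    "plant": ["tree", "fern", "moss"],
    "metal": ["silver", "iron", "tin"],
    "object": ["book", "chair", "lamp"],
}

def make_question(rng):
    """Return (question_text, answer) with one word drawn from each category."""
    names = sorted(CATEGORIES)
    target = rng.choice(names)
    words = [rng.choice(CATEGORIES[name]) for name in names]
    # Exactly one of the chosen words belongs to the target category.
    answer = next(w for w in words if w in CATEGORIES[target])
    rng.shuffle(words)
    quoted = ", ".join('"%s"' % w for w in words)
    return "Of %s, which is a(n) %s?" % (quoted, target), answer

class OneShotChallenge:
    """Accepts a single guess; any later guess fails, right or wrong."""

    def __init__(self, answer):
        self.answer = answer
        self.used = False

    def check(self, guess):
        if self.used:
            return False
        self.used = True
        return guess.strip().lower() == self.answer
```

With four candidate words a blind guess still succeeds one time in four, so the single attempt limits brute force but does not remove it.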
Re:Blind Users (Score:2)
Having two tables of nouns and categories, and from those generating a challenge of the type:
Put "silver", "oak", "water", and "cat" in the order of liquid, tree, animal, metal.
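A minimal sketch of the two-table idea, assuming a small hypothetical `NOUNS` table that maps each category to a few nouns (the function names are also made up for illustration):

```python
import random

# Hypothetical table: category -> nouns. Nouns are unique across categories.
NOUNS = {
    "liquid": ["water", "milk", "oil"],
    "tree": ["oak", "pine", "birch"],
    "animal": ["cat", "dog", "horse"],
    "metal": ["silver", "iron", "tin"],
}

def make_ordering_challenge(rng, size=4):
    """Return (prompt, solution); solution is the nouns in the requested order."""
    categories = rng.sample(sorted(NOUNS), size)          # order the user must produce
    solution = [rng.choice(NOUNS[c]) for c in categories]
    shown = solution[:]
    rng.shuffle(shown)                                    # order the nouns are displayed in
    prompt = "Put %s in the order of %s." % (
        ", ".join('"%s"' % w for w in shown),
        ", ".join(categories),
    )
    return prompt, solution

def check_ordering(solution, response):
    """Compare the user's list of words against the expected order."""
    return [w.strip().lower() for w in response] == solution
```

Four items allow 24 possible orderings, so a single blind guess succeeds about 4% of the time; more items make guessing harder at the cost of more work for the human.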
Re:Blind Users (Score:3, Insightful)
Another hole in this defense - human traitors (Score:2)
Re:Blind Users (Score:2)
Re:Blind Users (Score:2)
Re:Blind Users (Score:3, Funny)
A couple of simple math/logic problems such as these should be suitable:
Simple puzzles like this should be able to be figured out by almost a
Well (Score:2)
Re:Blind Users (Score:2)
For the second you're raising the bar by complicating the parsing, but the question is: How would you generate the problems? If the
Re:Blind Users (Score:2)
Re:Blind Users (Score:2)
(I know toll free US numbers aren't toll free outside the US, but I believe there is also a toll free international exchange or "countr
Re:Blind Users (Score:2)
I've always thought (Score:3, Insightful)
Re:I've always thought (Score:5, Interesting)
These same people if I were verbally giving them the url to slashdot would end up at http://www.slash..org/ (god I wish I were trying to make a joke but seriously I've had this happen).
Because of this my email is plainly visible on our web site, and in my forums, and on a few other forums and on an occasional usenet message. With a combination of RBLs, Bayesian filtering, procmail soup and other goodies my spam count per day is kept to a low roar (double figures in spam number rather than four figures, again I wish this were joking).
Re:I've always thought (Score:2, Funny)
Re:I've always thought (Score:2)
Re:I've always thought (Score:3, Informative)
Re:I've always thought (Score:2)
Unmunging addresses that have been munged like that is a trivial matter, but nonetheless is left as an exercise for the reader. You don't even need a full JS interpreter. Just parse anything that looks like a bunch of escapes on the basis that someone probably did that because they don't want you to see it, and that assumption will be valid more of
Re:I've always thought (Score:2)
The solution to SPAM:
1. Educate consumers not to respond to spam or its enticing advertisements.
2. Modify SMTP so that we guarantee we
Re:I've always thought (Score:2)
Of course, this goes completely out the window if enough people use it since the spammer would just use a rendering engine to pull the content and parse the DOM for mailto: links or anything looking like an e-mail address.
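To illustrate how little effort that takes, here is a rough sketch using only Python's standard `html.parser`. The class name and regular expression are assumptions for the example. It does not execute JavaScript, so script-built addresses would need an actual rendering engine, but entity-escaped addresses are decoded by the parser for free.

```python
from html.parser import HTMLParser
import re

# Deliberately loose pattern; good enough to show the idea.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

class AddressHarvester(HTMLParser):
    """Collects addresses from mailto: links and from visible text."""

    def __init__(self):
        super().__init__()  # character references are decoded by default
        self.found = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "href" and value and value.lower().startswith("mailto:"):
                # Drop any ?subject=... suffix after the address.
                self.found.add(value[len("mailto:"):].split("?")[0])

    def handle_data(self, data):
        self.found.update(EMAIL_RE.findall(data))

harvester = AddressHarvester()
harvester.feed('<a href="mailto:&#106;oe&#64;example.com">write to me</a>')
harvester.close()
```

The point is that escaping only hides an address from tools that never parse the page; anything that does parse it sees the same address a browser would.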
Re: (Score:2)
Hotmail? (Score:2)
Huh? For any other domain than hotmail.com, perhaps. :-)
Re:I've always thought (Score:2)
I'm
I don't receive any spam (Score:3, Informative)
I'm not one to go about shouting the praises of Microsoft, but someone over there's got their head out of their asses.
Re:I don't receive any spam (Score:2)
Re:I don't receive any spam (Score:2)
Re:I don't receive any spam (Score:3, Funny)
Losing battle against false error (Score:2, Interesting)
Smart humans will outsmart computers for quite a while. The average human is already discomforted by such a test (what's the middle word in the second image?!).
But those systems should work for the dumbest (within reason) humans. They're trying to design a test that's passed by the dumbest of six million, yet makes the smartest of a few (bots) fail.
I give in.
*comment about spambot overlords*
Keep tabs on where your address goes (Score:5, Insightful)
The address I use to post to USENET is completely disposable. The 'swen' worm in fact picked up my USENET addy and spammed it with about 40,000 emails. The address is now dead, but I saw that coming.
I have a public address which I give to casual contacts (who may not be totally trustworthy). This address changes yearly, and this keeps it spam free.
My well guarded private address, which I only give to my closest friends, has gotten no spam for 5 years. I receive about 20 emails per day at that private address and there is 0 spam.
Re:Keep tabs on where your address goes (Score:3, Insightful)
Re:Keep tabs on where your address goes (Score:2)
He didn't say he hides it, he simply uses different addresses for different purposes. A professor and other students would get the throw-away account specifically created for them. If you start receiving spam at that address you can be sure someone in that group of people signed you up.
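The bookkeeping behind that approach can be sketched as below. `AliasBook` and its methods are hypothetical, and the sketch assumes you control a domain whose mail server accepts whatever local parts you hand out.

```python
import secrets

class AliasBook:
    """Tracks which throw-away address was handed to which group of people."""

    def __init__(self, domain):
        self.domain = domain
        self.issued = {}  # alias address -> label of whoever received it

    def issue(self, label):
        """Create a fresh, hard-to-guess address for one purpose."""
        alias = "%s-%s@%s" % (label, secrets.token_hex(3), self.domain)
        self.issued[alias] = label
        return alias

    def leaked_by(self, alias):
        """If spam arrives at this alias, report who it was given to."""
        return self.issued.get(alias)

    def retire(self, alias):
        """Stop tracking (and, on the mail server, stop accepting) an alias."""
        self.issued.pop(alias, None)
```

The random suffix is a design choice for the sketch: it keeps a spammer who has seen one alias from guessing the others.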
Re: (Score:2)
Re:Keep tabs on where your address goes (Score:2)
What about a web page on which you want to publish your contact information? What about mailing lists? Yeah, you could have hundreds of different email addresses which you cut off and add as you see fit, but the overhead, hassles and lost email are more difficult than dealing with the spam. What if you post something to a mailing list, then a year late
Re:Keep tabs on where your address goes (Score:2)
I have the same policy and managed to keep my real email address hidden for about a year. Then one of my 'friends' decided to send me an e-card using my private address. A short time later I started receiving my first spam on that address. Years later, now I get about a dozen a day
Re:Keep tabs on where your address goes (Score:2)
Doesn't always work! (Score:2)
I have young children who each have two email addresses. One address is the name of the kid @ our family domain. This address is only for close relatives and trusted friends. Spammers have not picked up these addresses.
But I don't run a real SMTP server, being on a less than completely reliable connection to the net. So I ha
Instead of Text? (Score:2, Informative)
Re:Instead of Text? (Score:2)
CAPTCHAs are not the answer (Score:4, Interesting)
Unfortunately, the system does not work very well. My dad sells on eBay, and a buyer of one of his auctions had an Earthlink account, which blocked the message that told how much the shipping would be, where to send payment, etc. When my dad went to the specified URL and entered the CAPTCHA text as requested, he would simply get an error message that he had entered it incorrectly. He forwarded me the Earthlink email and asked me if it was just him; it wasn't; I couldn't get it to work either. The random string of numbers and letters was very distorted, and there were four possible meanings; I tried those plus at least ten more with no success. The message never got through.
There are many problems with this type of system. Consider: what if both parties have CAPTCHA-enabled accounts, from different providers? The confirmation messages from both parties get blocked. Smarter systems whitelist people as messages are sent to them, but as in the eBay case, the recipient had no way of knowing my dad's email until AFTER a message from him was received. It's a Catch-22.
And for people who are visually impaired, universal deployment of this system makes email essentially impossible. Earthlink's page had a link "if you cannot see the picture, click here" and when you got to that they said to call their 1-800 number if you have any problems. Right.
Adding CAPTCHAs to everyone's email systems is NOT the way to solve the spam problem. We need a more realistic, permanent solution. For example, the originating ISP could cryptographically authenticate the sender (the "From" field), rejecting messages from senders it cannot authenticate (by password or whatever means), and each relay in turn could authenticate the previous relay if it trusts it. Headers can be inserted in the emails, signing the previous headers with private encryption keys with their public counterparts obtainable from the ISPs by simple DNS lookups. This will build a chain of trust, which stops when a message gets outside of the sender's network, and therefore allows the original sender to be properly identified back through their ISP. Once we know who messages are from, people can be held responsible. And at that point, anti-spam laws can handle the rest.
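The header-chaining part of that proposal can be sketched as follows. Note the substitution: the proposal calls for public-key signatures with keys published in DNS, but Python's standard library has no public-key signing, so this sketch stands in HMAC with a per-hop secret. The `X-Hop` and `X-Hop-Sig` header names and both functions are hypothetical.

```python
import hashlib
import hmac

def sign_hop(key, prior_headers, hop_name):
    """Append a hop header plus a signature over every header before it."""
    hop_line = "X-Hop: " + hop_name
    payload = "\n".join(prior_headers + [hop_line]).encode()
    sig = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return prior_headers + [hop_line, "X-Hop-Sig: " + sig]

def verify_chain(headers, keys):
    """Re-check each hop's signature in order; keys maps hop name -> key."""
    seen = []
    i = 0
    while i < len(headers):
        line = headers[i]
        if not line.startswith("X-Hop: "):
            seen.append(line)
            i += 1
            continue
        hop = line[len("X-Hop: "):]
        has_sig = i + 1 < len(headers) and headers[i + 1].startswith("X-Hop-Sig: ")
        if hop not in keys or not has_sig:
            return False
        sig = headers[i + 1][len("X-Hop-Sig: "):]
        payload = "\n".join(seen + [line]).encode()
        expected = hmac.new(keys[hop], payload, hashlib.sha256).hexdigest()
        if not hmac.compare_digest(sig, expected):
            return False
        seen += [line, headers[i + 1]]
        i += 2
    return True
```

Because each signature covers all earlier headers, including earlier signatures, changing the "From" line after the first hop invalidates every later link. In the scheme as proposed, the verifier would fetch a public key by DNS lookup instead of sharing a secret with each hop.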
Re:CAPTCHAs are not the answer (Score:2)
And, did you call that 1-800 number? I'm sure they would have been able to solve your problem. And what's more, your call would have cost Earthlink a couple of cents, and if lots of people who experienced problems would have
Big problem (Score:3, Insightful)
One solution might be to offer multiple ways of deciphering. Such as an audio clip that could play a distorted version of the phrase that you could then type in. Or even ask simple questions, such as "What color is the background?".
Then there's the other issue of the code not being visible simply because I'm using Mozilla....but that's a whole different can of worms.
Re:Big problem (Score:2)
verge (I think)
obvious
churches
It took me a while to figure out the first one, and I'm still not sure whether my answer is correct.
(It could also be "energy".)
If I had to respond to this type of thing to get into a site, I would probably go elsewhere.
Re:Big problem (Score:2)
Daniel
Could baffletext be used here ? (Score:3, Insightful)
Re:Could baffletext be used here ? (Score:2, Insightful)
A better way to do this... (Score:2, Interesting)
What is really needed for a *good* CAPTCHA is not pure
Re:A better way to do this... (Score:5, Funny)
Re:A better way to do this... (Score:2)
And if you think you can computer-generate the quizzes, well, then, I'm betting a computer could guess the answers, if it used the same knowledge web for the word associations. The text-based CAPTCHAs work because you can computer-generate them but
Re:A better way to do this... (Score:2)
Re:A better way to do this... (Score:2)
For example, show 4 pictures; three of them of the same animal (say, a tiger) and the fourth of a random animal (say, a rhino). Ask the user to pick the odd one out. Make them grayscale, so that a color histogramming technique can't be used.
Another example: show an analog clock, and ask the user to enter the time shown.
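For the clock idea, the server-side logic might look like the sketch below. It only picks a time, computes where the hands would point, and checks the typed answer; actually drawing the clock face is left out, and all function names are hypothetical.

```python
import random

def hand_angles(hour, minute):
    """Degrees clockwise from 12 o'clock for (hour hand, minute hand)."""
    minute_angle = minute * 6.0                      # 360 degrees / 60 minutes
    hour_angle = (hour % 12) * 30.0 + minute * 0.5   # hour hand creeps between numerals
    return hour_angle, minute_angle

def make_clock_challenge(rng):
    """Pick a time on a five-minute mark and return it with its hand angles."""
    hour = rng.randint(1, 12)
    minute = rng.randrange(0, 60, 5)
    return (hour, minute), hand_angles(hour, minute)

def check_time(expected, typed):
    """Accept an answer typed as H:MM."""
    try:
        hour_text, minute_text = typed.strip().split(":")
        answer = (int(hour_text) % 12, int(minute_text))
    except ValueError:
        return False
    return answer == (expected[0] % 12, expected[1])
```

Restricting the minutes to five-minute marks is an assumption made here so that a human can read the clock reliably; it also shrinks the answer space to 144 possibilities, which matters if guesses are not limited.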
By deploying 100s of su
Re:A better way to do this... (Score:2)
You have 10 seconds.
Aren't they trying too hard? (Score:4, Insightful)
Re:Aren't they trying too hard? (Score:2)
The answer is "obvious".
So, what's the first word, "verge" or "energy"?
Re:Aren't they trying too hard? (Score:3, Funny)
It says:
NVIRGIE
OBVIOUSE
HURCHES
I'm not sure what the hell that means, but if they're expecting someone to come up with other words in place of those then they're really expecting too much. Anything this complicated isn't worth it.
Re:Aren't they trying too hard? (Score:3, Funny)
And I thought the eye tests were hard enough... (Score:4, Insightful)
Sound is better, but even that sometimes can be difficult to understand - also, I don't have speakers hooked up on some machines I use; some folks disable sound due to obnoxious websites/ads that blast sound unexpectedly.
Anyways, many of my relatives and friends can't get into sites that use distorted numbers, etc at all and are basically locked out; sometimes they get lucky and find a similar site (likely a competitor) to the site they desired, which doesn't use such nonsense...
Seems to me a better way is to use geotracking (too many inbound connections from similar sources [IP ranges, routes, browser config, etc.]), email verification, etc...
sites, etc.
With good heuristics (really the key to stopping automated bots in my view), any decent website should be able to filter out much of the bots and other junk - it's no accident really that many of the largest sites don't use distorted numbers, pictures, etc - how do they do without them?...perhaps be a good Ask Slashdot item
Ron
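One of the heuristics mentioned above, too many inbound connections from a similar IP range, could be sketched like this. The class name, the limits, and the choice of grouping IPv4 addresses by /24 are all assumptions for illustration.

```python
import ipaddress
from collections import defaultdict, deque

class RangeThrottle:
    """Refuses requests when one IPv4 range connects too often in a time window."""

    def __init__(self, limit=20, window=60.0, prefix=24):
        self.limit = limit      # max accepted hits per range per window
        self.window = window    # window length in seconds
        self.prefix = prefix    # addresses sharing this prefix count together
        self.hits = defaultdict(deque)

    def allow(self, ip, now):
        """Record a hit at time `now` (seconds) and say whether to accept it."""
        network = ipaddress.ip_network("%s/%d" % (ip, self.prefix), strict=False)
        recent = self.hits[network]
        while recent and now - recent[0] > self.window:
            recent.popleft()    # forget hits that fell out of the window
        if len(recent) >= self.limit:
            return False
        recent.append(now)
        return True
```

A site would call `allow(client_ip, time.time())` before serving a signup form. Grouping by range rather than by single address is what catches a bot that hops between neighbouring addresses, though it can also penalise many real users behind one provider.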
Spam isn't that much of a problem ... (Score:3, Insightful)
This is convenient, I don't have to care where my email address goes, I just use it.
Re:Spam isn't that much of a problem ... (Score:3, Interesting)
I used the same method, and my own mailserver with aggressive filters, and it worked very well until... a Russian spammer started to send out spam with my mail address as the sender address. He did this via hacked systems (open proxies) so it was not possible to do any blocking.
The load of crap that came in was just unbelievable, and all attempts to contact his spamvertized site or their providers just had no result.
In the end the only thing I could do was remove the MX
Re:Spam isn't that much of a problem ... (Score:2)
One of my addresses uses SpamAssassin for protection, for example. In its configuration it lets me give it a number called "hits", whose interpretation is as follows:
"Set the number of hits required before a mail is considered spam. n.nn can be an integer or a real number. 5.0
type what you see: (Score:4, Funny)
heh dumb bot
The real problem with CAPTCHAs.. (Score:3, Interesting)
Suppose that a human can solve your CAPTCHA in an average of five seconds. Suppose unskilled labor costs $6/hour. Then it costs a bit under a cent to find the solution to your CAPTCHA, assuming that I want to solve at least a few thousand a day. As a result it is impractical to protect a service worth more than a penny with a CAPTCHA.
Actually unskilled labor costs far less than $6/hour in some parts of the world, so if CAPTCHAs see wide use the value of the services they can protect is even lower. A tenth of a cent?
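The arithmetic behind those figures, worked through in a small sketch (the function names are made up; the wage and solve time are the ones assumed in the comment):

```python
def cost_per_solve(hourly_wage_dollars, seconds_per_solve):
    """Labor cost, in dollars, of one human solving one CAPTCHA."""
    return hourly_wage_dollars * seconds_per_solve / 3600.0

def wage_for_cost(cost_dollars, seconds_per_solve):
    """Hourly wage at which one solve costs the given amount."""
    return cost_dollars * 3600.0 / seconds_per_solve

# $6/hour at 5 seconds each: 6 * 5 / 3600 dollars, roughly 0.83 cents per solve.
cents_at_six_dollars = 100 * cost_per_solve(6.0, 5.0)

# A tenth of a cent per solve corresponds to a wage of about $0.72/hour.
wage_for_tenth_of_cent = wage_for_cost(0.001, 5.0)

# Solves per worker in an 8-hour day (the 8 hours is an assumption): 5760.
solves_per_day = 8 * 3600 // 5
```

So "a bit under a cent" checks out, and one worker covers the "few thousand a day" mentioned above.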
CAPTCHAs should be seen as a proof-of-work mechanism, like "hash cash", not as an oracle that can determine whether a transaction was initiated by a human or a machine. Unlike proof-of-work schemes that burn CPU time, the value of a CAPTCHA won't be inevitably halved every 18 months by Moore's law; on the other hand, it could be suddenly reduced to zero by breakthroughs in image processing.
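For comparison, a CPU-burning proof of work in the spirit of hash cash can be sketched as below. This is a simplified illustration, not the real hashcash stamp format: the `resource:nonce` layout and the function names are assumptions.

```python
import hashlib
from itertools import count

def leading_zero_bits(digest):
    """Number of zero bits at the start of a bytes digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits

def mint(resource, bits):
    """Search for a stamp whose SHA-1 hash starts with `bits` zero bits."""
    for nonce in count():
        stamp = "%s:%d" % (resource, nonce)
        if leading_zero_bits(hashlib.sha1(stamp.encode()).digest()) >= bits:
            return stamp

def verify(stamp, resource, bits):
    """Checking costs one hash; minting costs about 2**bits hashes on average."""
    if not stamp.startswith(resource + ":"):
        return False
    return leading_zero_bits(hashlib.sha1(stamp.encode()).digest()) >= bits
```

The asymmetry is the whole mechanism: each extra bit doubles the sender's expected work while the receiver's check stays constant, which is also why faster hardware steadily erodes the price, as the comment points out.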
Re:The real problem with CAPTCHAs.. (Score:2)
Re:The real problem with CAPTCHAs.. (Score:2)
The first example is a bit stupid (Score:2)
Re:The first example is a bit stupid (Score:2)
Stupid? Well, then...go ahead: Provide an algorithm that not only correctly extracts this antialiased text out of three-channel color (hint: filtering out the wavy background is not mathematically easy), and then also can do OCR recognition on the remaining distorted bitmap.
Can it be done? Sure -- but it certainly isn't trivial. Coming up with a mathematical method (and hence
Re:The first example is a bit stupid (Score:2)
The result: A near-clear, black and white representation of the letters remained. If wavy backgrounds can't defeat even the simplest of image software programs, how do you expect the same backgrounds to prove any challenge to custom-designed software?
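The comment does not say which operations were applied, so the following is only a guess at the simplest version: convert each pixel to a brightness value and cut at a threshold. The function name and the threshold of 128 are assumptions, and the input is a plain list of rows of (r, g, b) tuples rather than a real image file.

```python
def to_black_and_white(pixels, threshold=128):
    """Map each (r, g, b) pixel, 0-255 per channel, to 0 (ink) or 1 (paper)."""
    result = []
    for row in pixels:
        result.append([0 if (r + g + b) / 3 < threshold else 1 for r, g, b in row])
    return result

# Dark letter pixels survive as 0; a lighter wavy background washes out to 1.
sample = [
    [(10, 10, 10), (200, 180, 220)],
    [(120, 130, 140), (90, 100, 110)],
]
flattened = to_black_and_white(sample)
```

This only works when the text is consistently darker than the background, which is the case the comment describes; a background that overlaps the text in brightness would need something smarter.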
Re:The first example is a bit stupid (Score:2)
Regardless of what you may think, CAPTCHA defeating programs are difficult to write and are never 100% effective.
Thus, it hardly seems appropriate to label this sample "stupid."
Stupid for you, but hard for a program. That's the whole point!
What's wrong with this picture (Score:3, Insightful)
Baffling the spam-bots are easy... (Score:2)
The spambots will never bother trying to run javascript, especially if it means downloading an external file. And using, for example, mozilla's command-line js-engine will not help, because without an attached browser most of the scripts will reference objects that do not exist (like windows and such).
Dynamically generated documents are a pain in the ass for web-spiders. I know. I have
Re:Baffling the spam-bots are easy... (Score:2, Insightful)
Simplified cleartext password management (Score:2)
Socially, people like to pick dumb passwords. Tell them what makes a good password...and they will nod and pick a dumb password...then lose it. So, demanding that people follow good practices is not possible (unless you make fools of people with poor passwords by sending out funny but embarrassing email using the person
But spammers can use this too! (Score:2)
Easy (Score:4, Funny)
how about tons of fake emails on webpages? (Score:2)
The goal would be to feed the bots so many fakes that they choke on the bounced undeliverables, or, they ma
Re:how about tons of fake emails on webpages? (Score:3, Informative)
Re:how about tons of fake emails on webpages? (Score:2)
Simple... (Score:2)
If a mail reader can be customized to decode the message, why couldn't a bot?
That would depend.. (Score:2)
Would it have a bank account? If so, yes.
Re:So, The Philosophical Question Is (Score:2)
Re:This is stupid. (Score:2, Insightful)
Agreed, this is an immensely useful measure; HTML e-mail simply isn't too useful. This'll also kill all the tracking bugs.
2. Institute a block all email except where you have whitelisted the sender...
Powerful, but a huge sacrifice. Feels like throwing in the towel to me.
3. Allow the sender to get prioritized by requiring them the first time to respond to an email and identify who they are a
Re:This is stupid. (Score:2)
But it's also very easy to lose important e-mails if your inbox is filled with spam.
Except the important e-mail might well be in H
Re: (Score:2)
Re:As a record store owner... (Score:2)
And you wonder why you have no customers!
"That's it. What's your name? You're blacklisted. Now take yourself and your little bitch friend out of my store - and don't come back." I barked. Cravenly, they complied and scampered off.
So you're telling your customers to "scamper off", and then act surprised when your business is no longer profitable.
Hint: if you want to run a profitable business, don't
Re:Fool bots, fool humans with NaturalNames(TM) (Score:2)
It is the bot-trap of the past. It's called 'wpoison'.
Your version creates allegedly fake addresses in real domains. That's impolite, to say the least. You're poisoning domains that don't belong to you and causing them spam problems. Just like the really swell fellow who has created fake email addresses using randomword@domain.name for two domains I own.
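One way to avoid poisoning other people's domains is to generate the decoy addresses under a reserved name that can never belong to anyone, such as the `.invalid` top-level domain. A rough sketch, with a hypothetical function name:

```python
import random
import string

def fake_addresses(rng, n, domain="example.invalid"):
    """Generate n decoy addresses under a domain that cannot receive mail."""
    addresses = []
    for _ in range(n):
        length = rng.randint(5, 10)
        local = "".join(rng.choice(string.ascii_lowercase) for _ in range(length))
        addresses.append("%s@%s" % (local, domain))
    return addresses
```

The trade-off is that a harvester can drop reserved domains with a one-line filter, so this is polite but easy to defeat; pointing the decoys at a domain you own yourself is the other option that harms nobody else.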
Re:Computer Vision Breakthrough Put Forth By Spamm (Score:2)
The only kind of recognition the spammers would ever care about is the kind that gets their spambots past the test.