Turing Tests to Stop Spam
cexy writes "The Register has a story about how Hotmail and Yahoo! are using Carnegie Mellon developed captcha technology (completely automated public Turing tests to tell computers and humans apart) to stop spammers from automating signups for accounts from which they can send spam. These guys are using captcha too, but to stop incoming spam."
Hasn't this been around a while? (Score:2, Insightful)
Why? (Score:0, Insightful)
I don't have much personal experience with SpamAssassin, but from what I heard it does a fine job already.
The first step is stopping it from getting there (Score:5, Insightful)
MsgTo.Com used images to thwart spammers (Score:4, Insightful)
Kind of offensive though; a lot of people took offence at having to click a link to send me email.
MsgTo.Com disappeared some time ago during the
Hedley
Ok here we go (Score:3, Insightful)
And now, back to our regular show.
Re:What a ripoff (Score:2, Insightful)
Re:Hotmail is more popular (Score:4, Insightful)
This kinda defeats the object of email - for people who barely know you, if at all, to contact you. Email is excellent at bringing together people from all over the world - what's the point if only people you already know can contact you using it? Wasn't the Internet supposed to surpass the letter and the stamp?
I'd rather put up with the spam. But if you really need to avoid it, do what I do and use two accounts: one for online publishing on the Web and sites like Slashdot, and the other for people I know. You get the best of both worlds.
The /. posting title is misleading (Score:5, Insightful)
But I don't think that translates into 60 times the cost. The Turing tests are interesting, but I don't think that the creation of the accounts was ever a bottleneck in the process of sending spam. You could get a high school kid to create all the accounts you would need for a month in about an hour, and pay him in pr0n.
If the truth were known, Hotmail and Yahoo are just trying to decrease server loads. I bet that when bots create accounts they create hundreds or thousands more than are used, which take up server resources during creation and later as the accounts eat up storage. With Turing tests it is more likely that not too many will be lying around waiting to be used.
Re:CAPTCHA project (Score:4, Insightful)
So, while I commend their effort, I wish CMU would work harder to make their tools available not just to commercial sites but to the Open Source community and projects like Slashcode. This would help the captcha project actually accomplish its mission of protecting users from abuse, instead of leaving sites like Slashdot vulnerable to any 13-year-old Visual Basic programmer with a grudge and a clue.
Accessibility (Score:2, Insightful)
"[...] humans can read distorted text as the one shown below but current computer programs can't:"
I think they mean "non-blind humans". How exactly will they ever solve that problem? If a blind man's OCR program can read the text, so can the spammer's.
inherent imperfections (Score:4, Insightful)
Re:Yahoo works, hotmail not (Score:5, Insightful)
Instead of just experimenting by setting up a Hotmail account, has anybody ever tried the other way around? That is, pose as an advertiser and approach Hotmail about e-mailing their users?
Re:Yahoo works, hotmail not (Score:3, Insightful)
However, I gave my email account to one site and went from 0 to a full 2MB quota in less than a day, which is much less than 2 months. It's all about who or what you're in contact with... not about the service itself.
not only mail spam, sms too (Score:4, Insightful)
Next they'll patent the phone call (Score:4, Insightful)
What do you get if you eliminate the human from the above? Why, a protocol link. Might as well require me to type in TCP/IP packets and consider me human if I make too many errors :-)
Re:Ok here we go (Score:5, Insightful)
Bayesian techniques depend on predicting which elements (usually, which words) are likely to indicate spam, and which are likely to indicate non-spam messages. This can vary highly from user to user, and so it should be done on a per-user basis.
For instance, I am a security administrator and receive a lot of legitimate mail about "antivirus software", and very little legitimate mail about "teenage lesbians." However, my girlfriend's crush, who is an activist lesbian, may well receive a lot of legitimate mail about "teenage lesbians" and only spam about "antivirus software." If we are on the same ISP, then it would be erroneous behavior for my reporting "teenage lesbians" as spam and "antivirus software" as nonspam to throw her spam-filtering out of whack, or vice versa. And yet it is a potential privacy violation for the ISP to be gathering statistics on which one of us gets virus bulletins, and which one is the lesbian.
(Moreover, there also isn't yet any standard mechanism for users to report spamminess or nonspamminess back to normal IMAP or POP mail hosts -- and Bayesian algorithms require sampling both spam and non-spam mail, not just spam reported to an abuse address.)
The filtering mechanisms that should be implemented on the server are general ones -- ones that do not rely on deep inspection into the content of the message. I don't really want ISPs to gather stats on common keywords in users' incoming mail -- do you? It is one thing to examine structural elements of the message, such as the IP address which sent it, or the presence of normal headers; or to statelessly scan the message for static patterns, such as virus signatures or "DISCOUNT HERBAL VIAGRA !!!" It would be quite another thing to gather the kind of data that Bayesian filters involve, for every user on a large end-user system.
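To make the per-user point above concrete, here is a minimal sketch of a naive Bayes filter that keeps separate token statistics for each mailbox. It is an illustration under stated assumptions, not how any particular mail host or filter works: the `UserFilter` class, the tokenizer, the add-one smoothing, and the mailbox names are all hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class UserFilter:
    """Token statistics for a single user; nothing is shared across users."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.messages = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Record one message the user has marked as 'spam' or 'ham'."""
        self.counts[label].update(set(tokenize(text)))
        self.messages[label] += 1

    def spam_probability(self, text):
        """Naive Bayes estimate with add-one smoothing, computed in log space."""
        log_odds = math.log((self.messages["spam"] + 1) / (self.messages["ham"] + 1))
        for token in set(tokenize(text)):
            p_spam = (self.counts["spam"][token] + 1) / (self.messages["spam"] + 2)
            p_ham = (self.counts["ham"][token] + 1) / (self.messages["ham"] + 2)
            log_odds += math.log(p_spam / p_ham)
        # Clamp so math.exp cannot overflow on very long messages.
        log_odds = max(-50.0, min(50.0, log_odds))
        return 1 / (1 + math.exp(-log_odds))


# One independent filter per mailbox (mailbox names are made up).
filters = defaultdict(UserFilter)
filters["admin"].train("new antivirus software bulletin", "ham")
filters["admin"].train("teenage lesbians click here", "spam")
filters["activist"].train("teenage lesbians support group", "ham")
filters["activist"].train("buy antivirus software now", "spam")

admin_score = filters["admin"].spam_probability("antivirus software")
activist_score = filters["activist"].spam_probability("antivirus software")
```

The same two words score as probable non-spam for one mailbox and probable spam for the other, which is why pooling the training data across users would throw both filters off. Note that both spam and non-spam samples are needed for training.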
Re:Yahoo works, hotmail not (Score:4, Insightful)
I've had my Hotmail account for nearly three years, and I typically get about 5-10 spam messages per day - not a lot. I have custom filters that catch all emails with "mortgage, viagra, debt" - this catches most of the spam I get (I actually don't filter porn spam, well I haven't really tried, as at least they are creative with their subject lines - "Knob Gobblers" was a favourite - I've had some other funny ones too)
My username is 11 characters long with an underscore - this is probably a bit out of range for your typical "brute force"/random sign up name spammers.
So - if you want to use popular free email services, perhaps follow the same guidelines for creating secure passwords? Numbers, special characters (although this is a bit more limited with email), and more importantly length of name!
Re:I failed the Turing test! (Score:3, Insightful)
It's easy to throw such ideas around, but implementation becomes an issue of rights quickly. I guess you want to force everyone to use their ISP's mail server and pay their ISP the amount. Fine. You have to block outgoing port 25, which fucks over anyone running their own mail server. Spammers will just buy T1s and be their own "ISP", and sell a flat rate email sending fee to other spammers. (They already do that).
What about people like myself that maintain announcement lists for my web sites? That's something like 2000 emails each time I send an update. It's all completely opt-in, and has a real return address, from which I personally handle unsubscribe requests from the people that can't figure out how to use the web site to unsubscribe. It's nothing like spam.
What about all the thousands of other email lists? The owners of the Linux kernel mailing list would have to pay thousands a month in your email fees, even if it was only a couple cents an email.
Anyway, every time someone comes up with these "change the infrastructure" silver bullet solutions to spam, they are always half-baked.
Playing BOTH ends (Score:2, Insightful)
Don't think that'll work? Your phone company is already doing it with telemarketers.
Automated Turing test? (Score:4, Insightful)
The Turing test is where a human talks to a computer and tries to decide if the backend that's answering him is a human or a computer program.
This is more of a reverse Turing test, where the computer asks questions to try and find out if it's interacting with a person or a program.
It would be possible to write a program to beat this system, but it would not qualify as having passed the Turing test, because it would have only fooled another computer program, not a real person. Of course maybe said program could go on to pass the Turing test.
Wouldn't it be weird if spam was the driving force behind the creation of the first real AI?
Skynet began learning at a geometric rate.......by 1800 hours every mailbox in the world was jammed with unfilterable spam.
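For what it's worth, the "reverse" direction described above (a program issuing the challenge and checking the answer) can be sketched roughly like this. It is only an assumption about how such a signup gate might be wired up, not how Hotmail, Yahoo, or the CMU code does it. The image distortion step is left out entirely, and `SERVER_KEY`, `new_challenge`, and `verify` are hypothetical names.

```python
import hashlib
import hmac
import secrets
import string

SERVER_KEY = secrets.token_bytes(32)  # hypothetical per-server secret
ALPHABET = string.ascii_uppercase + string.digits


def new_challenge(length=6):
    """Return (answer, token).

    The answer is what would be rendered as distorted text for the human.
    The token travels with the signup form, so the server can check the
    response without storing the answer.
    """
    answer = "".join(secrets.choice(ALPHABET) for _ in range(length))
    token = hmac.new(SERVER_KEY, answer.encode(), hashlib.sha256).hexdigest()
    return answer, token


def verify(response, token):
    """Check the typed response against the token, ignoring case and padding."""
    normalized = response.strip().upper()
    expected = hmac.new(SERVER_KEY, normalized.encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, token)


answer, token = new_challenge()
accepted = verify(answer.lower(), token)
rejected = verify(answer + "X", token)
```

The judge here is a few lines of string comparison, so a bot that gets through has fooled a program and not a person. A real deployment would also need to expire tokens so that one solved challenge cannot be replayed; this sketch does not do that.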
captcha stops blind people too (Score:5, Insightful)
Re:Yahoo works, hotmail not (Score:2, Insightful)
I use my hotmail address for pretty much everything and it's very clean. Instead of just deleting spam I use the block feature. Lately I've just been getting a lot of e-mail viruses.
Yahoo has a limit on the number of blocked addresses you can have. I ran into it with those 100 spams in my inbox. I've yet to run into a limit with hotmail except on keywords.
So yeah, I'm sticking with hotmail for free accounts.
Ben
Re:Ok here we go (Score:3, Insightful)
I strongly suspect that Bayesian filtering would turn mail processing into a CPU-bound activity. You're converting words into known tokens, looking up coefficients associated with each distinct token, and then manipulating them. If anything, it resembles compiling as a workload.
To prove the issue either way, of course, I'd have to get off my tail and actually build an efficient filter and test it. As an O(n log n) problem, it _might_ not be CPU bound, for low enough disk/network throughput.
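A rough way to start on the test described above is to time the tokenize-and-look-up step on messages already in memory, then compare the resulting bytes per second against whatever the disk or network can deliver. This is a sketch under assumptions, not a real filter: the `weights` table and function names are hypothetical, the lookup uses a hash table (a Python dict) instead of a sorted structure, and the per-token combination is reduced to a plain sum.

```python
import re
import time


def score(message, weights):
    """Tokenize a message and sum per-token weights (one dict lookup each)."""
    tokens = re.findall(r"[a-z0-9']+", message.lower())
    return sum(weights.get(tok, 0.0) for tok in tokens)


def bytes_per_second(messages, weights, repeats=100):
    """Time in-memory scoring only, so disk and network are excluded."""
    total_bytes = sum(len(m.encode()) for m in messages) * repeats
    start = time.perf_counter()
    for _ in range(repeats):
        for m in messages:
            score(m, weights)
    elapsed = time.perf_counter() - start
    return total_bytes / max(elapsed, 1e-9)  # guard against a zero reading


# Hypothetical per-token weights; a trained filter would supply these.
weights = {"viagra": 2.0, "meeting": -1.5}
sample_score = score("Viagra viagra meeting", weights)
throughput = bytes_per_second(["DISCOUNT HERBAL VIAGRA !!!"], weights, repeats=10)
```

If the measured throughput comes out well above what the mail spool or network link can feed, the filter is not the bottleneck; if it comes out below, mail processing would indeed be CPU bound. The numbers will depend heavily on the machine and on how realistic the sample messages are.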
Re:Illogical. (Score:3, Insightful)