Distributed Spam Detection
A reader writes "There's an interesting project at SourceForge called "Vipul's Razor" that uses a Gnutella-like
system to let users exchange spam "signatures" to filter spam. I work at an ISP in Ottawa; we have been using it for the last two weeks to stop the bulk of the spam coming to our POP3 accounts. More impressively, it hasn't tagged any valid mail as spam yet.
Here's the scoop from its webpage:
"Vipul's Razor is a distributed, collaborative, spam detection and
filtering network. Razor establishes a distributed and constantly updating
catalogue of spam in propagation. This catalogue is used by clients to
filter out known spam. On receiving a spam, a Razor Reporting Agent (run
by an end-user or a troll box) calculates and submits a 20-character
unique identification of the spam (a SHA Digest) to its closest Razor
Catalogue Server. The Catalogue Server echoes this signature to other
trusted servers after storing it in its database. Prior to manual
processing or transport-level reception, Razor Filtering Agents (end-users
and MTAs) check their incoming mail against a Catalogue Server and filter
out or deny transport in case of a signature match.""
Cool idea. I'm up around 80% spam a day on my main mail account. Might be worth a try.
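The report-then-filter flow in the quoted description can be sketched roughly as follows. This is only an illustration, not Razor's actual protocol or code: it assumes the "20-character" signature is a raw SHA-1 digest (SHA-1 produces 20 bytes) taken over the whole message body, and `spam_signature` and `CatalogueServer` are hypothetical names with an in-memory set standing in for a real server.

```python
import hashlib


def spam_signature(message_body: str) -> bytes:
    """Return a 20-byte SHA-1 digest of the message body (assumed hash choice)."""
    return hashlib.sha1(message_body.encode("utf-8")).digest()


class CatalogueServer:
    """Hypothetical in-memory stand-in for a Razor Catalogue Server."""

    def __init__(self) -> None:
        self._signatures = set()

    def report(self, signature: bytes) -> None:
        # A Reporting Agent submits the signature of a message it saw as spam.
        self._signatures.add(signature)

    def is_known_spam(self, signature: bytes) -> bool:
        # A Filtering Agent asks whether the signature is in the catalogue.
        return signature in self._signatures


catalogue = CatalogueServer()
catalogue.report(spam_signature("Buy now! Limited offer."))

incoming = ["Buy now! Limited offer.", "Lunch at noon?"]
# Keep only the messages whose digest is not in the catalogue.
delivered = [m for m in incoming if not catalogue.is_known_spam(spam_signature(m))]
print(delivered)
```

Only the digest leaves the reporting machine in this sketch, so the server never sees the message text itself.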
Idiotic (Score:1, Insightful)
"New pill reduces debt! 513456"
So, a message digest won't work.
Anyone know where these people live? (Score:1, Insightful)
Great use of p2p (Score:5, Insightful)
Are there any other innovative non-piracy p2p apps out there that we should know about?
Authentication with servers? (Score:5, Insightful)
How about a server frontend approach? (Score:3, Insightful)
Nothing truly insightful here, just speculation from a convenience freak.
idea won't work if reaches critical mass (Score:4, Insightful)
To get around this, all a spammer has to do is change or add at least one character in each spam. This would make every hash unique, and no spam would be detected.
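The objection is easy to demonstrate. The sketch below assumes SHA-1 over the full message text (an assumption; the thread only says "SHA Digest") and reuses the "New pill reduces debt!" example from an earlier comment, appending a different number to each copy. The `digest` helper is a hypothetical name.

```python
import hashlib


def digest(text: str) -> str:
    """Hex SHA-1 digest of the text (assumed to be the signature scheme)."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


base = "New pill reduces debt!"
copy_a = base + " 513456"  # same pitch, one random suffix
copy_b = base + " 513457"  # same pitch, a different random suffix

print(digest(copy_a))
print(digest(copy_b))
# The two digests differ, so a catalogue entry for copy_a never matches copy_b.
print(digest(copy_a) == digest(copy_b))
```

An exact-match catalogue would therefore need some form of normalisation or fuzzy matching to catch such variants, which is outside what the quoted description covers.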
Open for abuse? (Score:2, Insightful)
Re: Distributed spam filter (Score:3, Insightful)
You can tell if the same email has been sent to hundreds of people (and if you use hashes, you can do that without revealing the email).
You can click a "this is spam" button when you read an email, and anyone who trusts you (i.e. has your public key in their "trusted filtering friends" list) can look for similar messages and filter them.
But, there do seem to be a load of problems:
- Personalised email, as someone already mentioned
- Privacy problems with letting others into the secrets of your mailbox
- If you have the original of a message, you can calculate the hash, then see who else got the message (i.e. works for personal mail as well as spam)
- Relatively easy for malicious users to wrongly label someone as a spammer
Well worth investigating, though...
One way around potential abuse. (Score:5, Insightful)
The thought goes like this.
A person submits a signature of "identified" spam mail to a "supernode", for example, and the submission gets a ranking of 1. Each additional submission (by other users) increases the score by some increment.
This way, there are several classifications which could be used to filter incoming mail. For the mail providers, they could opt for only removing mail matching signatures with a very high score (thus very likely these will be actual spam) or they could filter anything reported.
The purpose of allowing classifications is that it takes longer to reach higher scores, since more people have to report the specific spam mail. Some people wish to eliminate anything the least bit suspect, but mileage may vary.
Do you see a resemblance with the
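The scoring idea in this comment could look something like the sketch below. It is a guess at one reasonable reading: each distinct reporter adds 1 to a signature's score (the comment only says the score goes up "by a number"), and each mail provider picks its own threshold. `ScoredCatalogue` and its methods are hypothetical names.

```python
from collections import defaultdict


class ScoredCatalogue:
    """Hypothetical supernode that counts distinct reporters per signature."""

    def __init__(self) -> None:
        self._reporters = defaultdict(set)

    def report(self, signature: str, reporter: str) -> None:
        # Using a set means repeat reports from one user do not inflate the score.
        self._reporters[signature].add(reporter)

    def score(self, signature: str) -> int:
        # .get avoids creating an empty entry for signatures never reported.
        return len(self._reporters.get(signature, ()))

    def should_filter(self, signature: str, threshold: int) -> bool:
        # A cautious provider uses a high threshold; an aggressive one uses 1.
        return self.score(signature) >= threshold


catalogue = ScoredCatalogue()
for user in ["alice", "bob", "carol", "alice"]:
    catalogue.report("sig-pills", user)
catalogue.report("sig-newsletter", "mallory")

print(catalogue.score("sig-pills"))                           # 3 distinct reporters
print(catalogue.should_filter("sig-newsletter", threshold=3))  # False
print(catalogue.should_filter("sig-newsletter", threshold=1))  # True
```

A single malicious report then only affects providers who chose to filter on a score of 1.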
Re:I've managed to filter most spam (Score:2, Insightful)
Another effective spam-stopping method? (Score:3, Insightful)
I receive too many real messages to try this out, and I think most spammers won't bother to actually remove an email address from their database if it doesn't exist. But has anyone else tried this with any luck?
This p2p spam filter sounds really nice and I'm going to give it a try ASAP. I already "lost" another mail account in the flood of spam I got on it, so now it forwards all messages to msnbill@microsoft.com (microsoft domain billing address).
Re:So... (Score:4, Insightful)
Why not a histogram filter? (Score:1, Insightful)
Re:So... (Score:4, Insightful)
well, i would have to disagree with you on this point.. i work at a web hosting company as the technical support manager, and handling abuse complaints falls into my realm of responsibility... and i have found that a significant number of first time spammers do not KNOW that spam is "wrong", and get quite upset that they were "taken" by companies that send bulk messages on their behalf. i had one gentleman send me an apology letter that actually made me feel sorry for him. he, and many other people on our network, have never been repeat spammers.
i know that there are many people out there who don't care, but we can't automatically assume that all spammers are evil. some of them are just ignorant.