Using Email Networks as P2P Spam Filters 108
Oscar Boykin writes "New Scientist is running a story on using the social network in email as a P2P network.
The idea is that email networks have structure that is conducive to a type of search called percolation search. This means email clients could query the social network of email users to filter spam.
This story is based on a preprint available."
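For readers who want a feel for the idea, here is a rough sketch of the query-spreading step only. It is a simplification, not the preprint's algorithm: the query starts at one mailbox and is forwarded along each contact link with probability p, and any contact who has already flagged the message reports a hit. The graph, the `has_hit` callback, and every name below are assumptions made up for illustration.

```python
import random
from collections import deque

def percolation_query(graph, start, has_hit, p, rng=random):
    """Spread a query from `start`, crossing each candidate link with probability p.

    graph:   dict mapping a user to a list of that user's contacts (hypothetical data)
    has_hit: callable returning True if a user has already flagged the message
    Returns the set of reached users that reported a hit.
    """
    visited = {start}
    queue = deque([start])
    hits = set()
    while queue:
        node = queue.popleft()
        if has_hit(node):
            hits.add(node)
        for neighbor in graph[node]:
            # Forward the query over this link only with probability p.
            if neighbor not in visited and rng.random() < p:
                visited.add(neighbor)
                queue.append(neighbor)
    return hits

# Toy contact graph; "hub" stands in for a well-connected user.
graph = {
    "me": ["ann", "bob"],
    "ann": ["me", "hub"],
    "bob": ["me"],
    "hub": ["ann", "cat"],
    "cat": ["hub"],
}
flagged = {"cat"}  # users who already marked this message as spam

hits_flood = percolation_query(graph, "me", lambda u: u in flagged, p=1.0)
hits_none = percolation_query(graph, "me", lambda u: u in flagged, p=0.0)
```

With p=1.0 the query floods the whole connected network, and with p=0.0 it never leaves the sender. The interesting regime is in between, where only a fraction of the links carry the query.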
Secure? (Score:5, Interesting)
Am I missing something in this analysis?
Re:Secure? (Score:5, Interesting)
Re:Secure? (Score:4, Funny)
Re:Secure? (Score:2)
Re:Secure? (Score:2)
True. I guess my concern would have been whether their proposed system could be mined for information regarding frequency of connection between two emailers.
Re:Secure?partially... (Score:4, Interesting)
Re:Secure? (Score:2, Interesting)
Also, you could not receive email from anyone whose address you haven't already approved; would they have to phone you first?
For example:
People who change email address (Gmail, dropping a spammed email)
People legitimately contacting you (Old friend, people wanting to know more about your website etc.)
etc.
It would be like setting your telephone to only accept certain phone numbers and scrapping the phonebook. Bad for people, terrible for business.
Though I suppose spam is worse because
Nice...but not necessary (Score:5, Insightful)
Re:Nice...but not necessary (Score:3, Interesting)
Re:Nice...but not necessary (Score:1)
Re:Nice...but not necessary (Score:5, Insightful)
But I think this could even be a step back. Like the parent says, I think most informed people have solved the issue of filtering spam pretty effectively (Thunderbird, Yahoo, Gmail, Bayesian filters, etc.) and so we don't generally *see* much spam.
The *REAL* problem with spam is traffic and network pollution. Spam wastes a ridiculous amount of bandwidth and (through spyware) hijacks our systems' cycles to do something that is (with filters) ultimately to no end. This seemingly won't solve the bandwidth consumption issue and might worsen the problem by polling all your friends over the network and then using your personal cycles to scan said email against all the known spam on your friends' computers.
People forget that the true detriment of spam these days is the traffic it causes, not cluttering your inbox (if you're smart).
Re:Nice...but not necessary (Score:1, Insightful)
I use gmail, which does an excellent job at filtering spam.
I see this stated here on
A couple months ago the majority of my spam was actually legitimate email from my mailing lists. As of this moment, I don't see any legitimate mail in my Spam folder.
However, about 20% of the actual spam I get ends up in my Inb
Re:Nice...but not necessary (Score:1)
if nobody pays for spam, nobody sends spam.
Re:Nice...but not necessary (Score:3, Interesting)
Re:Nice...but not necessary (Score:2)
Re:Nice...but not necessary (Score:2)
Re:Nice...but not necessary (Score:2)
Re:Nice...but not necessary (Score:2)
Spam degrades the reliability of the medium (Score:2)
You've got to be kidding. Spam is text, or very nearly so (HTML). Unless you are using floppies and a 2400bps modem, the bandwidth/storage costs are irrelevant.
What is relevant is that it forces people to either use spam filters that randomly throw good messages away or they miss good messages because they can't be seen among all that spam. In either case, the loss rate goes fr
2004 spam for one user costs $0.55 (Score:2)
No, actually it isn't. I run my own mail server. I keep all mail, including spam. I get something like 200 spams/day. All spam for 2004 amounts to a bit more than 100MB. At the somewhat inflated price of $0.50/GB, that is about 5 cents to store all the spam for 2004. You may quibble over the exact number but you would be hard pressed to come up with storage c
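The storage arithmetic in the comment above can be checked in a few lines. The figures are the poster's own (200 spams/day, about 100MB for the year, $0.50/GB); treating 1 GB as 1000 MB and 2004 as 366 days are assumptions of this sketch.

```python
# Figures quoted in the comment; decimal units (1 GB = 1000 MB) are an assumption.
spam_per_day = 200
days_in_2004 = 366          # 2004 was a leap year
total_mb = 100              # "a bit more than 100MB", rounded down
price_per_gb = 0.50         # dollars

# Cost of keeping the whole year's spam on disk.
storage_cost = total_mb / 1000 * price_per_gb

# Implied average size of one spam message, in KB.
avg_kb = total_mb * 1000 / (spam_per_day * days_in_2004)

print(f"storage cost: ${storage_cost:.2f}")
print(f"average spam size: {avg_kb:.1f} KB")
```

This only covers the storage half of the poster's figure; the rest of the $0.55 in the subject line is not recoverable from the truncated comment.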
Re:2004 spam for one user costs $0.55 (Score:1)
over $500 million dollars!
Spam worldwide still wastes $$$....
Re:Nice...but not necessary (Score:1)
Re:Nice...but not necessary (Score:1)
Re:Nice...but not necessary (Score:2)
Re:Nice...but not necessary (Score:1)
Re:Nice...but not necessary (Score:2, Interesting)
Gentoo implementation of courier-imap in SSL mode.
I have had no problems with it.
NOTE: 3 users, all me. NOT in production environs
Any suggestions for a better replacement?
Re:Nice...but not necessary (Score:2)
Re:Nice...but not necessary (Score:1)
Re:Nice...but not necessary (Score:2)
Great... (Score:5, Funny)
Re:Great... (Score:1)
Potential for harm (Score:4, Insightful)
Social network-based spam-detection is a part of, not a total, solution, and its limits need to be recognized.
Re:Potential for harm (Score:2)
The potentials for abuse that I see are if you don't keep a spam database at all, and so you will not flag any query as spam (even if it clearly is), or if you try to keep an "Anti-Spam" database (I dunno, a database of legitimate emails). The only problem with these abuses is that if you have no database it shouldn't matter be
Re:Potential for harm (Score:1)
Wondering if this works for mailinglists (Score:1)
Re:Wondering if this works for mailinglists (Score:4, Informative)
A well-designed opt-in list won't have any fake addresses on it (although messages to invalid addresses may bounce if once-valid accounts stop working), because anyone with half a brain designing an opt-in list would require the addresses it's mailing to be validated by the recipients of the messages before sending them anything.
Re:Wondering if this works for mailinglists (Score:1)
Re:Wondering if this works for mailinglists (Score:2, Informative)
If you get enough trash back because of users, the nicest way is to let the mailserver handle it. A CPU can do the dumping a lot faster than a person can look up an account and take the person off the mailing list
The spamassassin side of the story: We do not like to send out a plain text message, but nice HTML formatted messages. We take care that this requested e-mail is not mistaken for spam by already routing it through a filter to preven
Re:Wondering if this works for mailinglists (Score:1)
Again you state that your mail server is "dumping" the bounces as they come in. Does your definition of "dumping" mean it's removing the bad email addresses from your list?
If you are removing the bad addresses when bounces come in, whether manually or using an automatic process, then you need
Isn't this basically how Razor works? (Score:5, Insightful)
Re:Isn't this basically how Razor works? (Score:2)
Re:Hate to burst your bubble.. (Score:1)
FTA: "our large-scale simulations show that the system achieves a spam detection rate close to 100%"
That must have been some easy spam to stop, because in the real world it's more complex. Random generators changing subjects, spoofing random sender addresses, and changing content.
How can it stop the spam from getting to inboxes when it HAS TO GET TO AN INBOX TO EVEN BE MARKED AS SPAM!!! Let's say SPAM A hits server B at 11:00pm. Server then relays it
Isn't this how Yahoo works (Score:4, Interesting)
Reduces to a standard spam filter (Score:4, Insightful)
So it can't deal with spam that includes a unique random ID and would tag emails from a mailing list as spam. Once more: nice try, but it won't work in the real world.
Re:Reduces to a standard spam filter (Score:2)
Ob (Score:5, Interesting)
Actually, I think we should find a way to attach the same stigma to spam customers that we do to the spammers. Why do spam customers not have to go to jail? They're as much the problem as the spammers.
I can see something like having all the spam customers' names published online, so you google for "spam" and "lheal" and up pops my list of purchases. The other spammers then get a very clean list of people to spam. Over time, the net would be segregated into those who like spam and those who don't.
Yeah, unworkable idea, but so are all the others.
Yet Another Insane Proposal (Score:2)
Nah... Congress can't make stupidity illegal; they'd lose too many votes. The universe, not being elected, can... but tends to be in favor of capital punishment as a way of preventing repeated behavior.
An utterly illegal and unethical solution would be to start up a V1AGRA spam outfit, and taint the supply so that one pill in twenty was actually a disguised lethal dose of cyanide. This would cut into dem
Re:Yet Another Insane Proposal (Score:1)
Re:Ob (Score:2)
Hmmm... (Score:3, Funny)
Re:Hmmm... (Score:2)
Re:Percolation search (Score:1)
YahooMail, GMail and Hotmail Do This Already (Score:4, Informative)
Re:YahooMail, GMail and Hotmail Do This Already (Score:2)
Don't reject that idea so quickly, as I think you're on to something. The protocol would encapsulate the information that "userx@foo.com marked this message as spam". What the email provider does with that information is something else.
Not only that, but Google and Yahoo! could team up against Hotmail, or AOL, or whoever. Maybe
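To make the "userx@foo.com marked this message as spam" idea concrete, here is one way such a report could be packaged for exchange between providers. This is only a sketch: no such protocol is described in the thread, and the field names, the SHA-256 digest, and the JSON encoding are all assumptions.

```python
import hashlib
import json
from dataclasses import dataclass, asdict

@dataclass(frozen=True)
class SpamReport:
    """Hypothetical wire record: who flagged which message."""
    reporter: str         # address of the user who pressed "this is spam"
    message_digest: str   # hash identifying the message without shipping its content
    verdict: str = "spam"

def make_report(reporter, raw_message):
    """Build a report from the raw message bytes."""
    digest = hashlib.sha256(raw_message).hexdigest()
    return SpamReport(reporter=reporter, message_digest=digest)

def encode(report):
    """Serialize a report to a JSON string."""
    return json.dumps(asdict(report), sort_keys=True)

def decode(payload):
    """Parse a JSON string back into a report."""
    return SpamReport(**json.loads(payload))

report = make_report("userx@foo.com", b"Subject: hi\r\n\r\nBuy now")
wire = encode(report)
```

The record only states the fact that a user flagged a message. What a receiving provider does with it (weighting, thresholds, ignoring reporters it doesn't trust) stays a local policy decision, which is the separation the comment argues for.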
NEWS BULLETIN (Score:1)
would that really be good? (Score:3, Insightful)
I'd modify an email client to report every message from certain folks I don't like as spam, to poison the social network. It would only take a couple of clients out there saying "yup, I've already got a message like that here, and my user marked it as spam".
think globally, act locally, right?
Re:would that really be good? (Score:2)
Not a particularly new idea... (Score:4, Insightful)
For one thing, it would block mailing list messages, which are messages that you probably do share with your contacts.
For another, it does not consider that most spam has random keywords seeded into every copy sent, so those would have to be ignored somehow, which introduces a fuzzy match algorithm, which means the possibility of false matches exists, and since you're asking others (probably all using the same algorithm against their databases) you have increased the chances of a false match being found.
In any case, collaborative networks already exist in a better form. Users mark messages as spam when they get them, a flag is created and sent to some central place that all users check against for matches. The algorithm for fuzzy matching resides in one place and is only used as an indicator in SpamAssassin in any case, not as the sole indicator.
Large scale systems like Google's GMail can use people flagging messages as spam to filter similar enough messages from other users, sort of thing. I'm pretty sure they do something like this, in fact, as my GMail account has *never* made a mistake in its spam detection.
And so forth. There are better ways than relying on a random query of your contacts to see what they think.
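The fuzzy-matching point above can be illustrated with a toy example. This is not how Razor, DCC, or GMail actually match messages; it is a minimal sketch using word shingles and Jaccard similarity, and the threshold and sample messages are invented for illustration.

```python
import re

def shingles(text, k=3):
    """Return the set of k-word shingles of a lowercased message."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + k]) for i in range(max(len(words) - k + 1, 0))}

def similarity(a, b):
    """Jaccard similarity between the shingle sets of two messages."""
    sa, sb = shingles(a), shingles(b)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)

THRESHOLD = 0.5  # hypothetical cut-off for "same spam campaign"

# Two copies of one spam that differ only in a random trailing keyword.
spam_a = "Buy cheap pills now at our online store for the lowest price today xkqz"
spam_b = "Buy cheap pills now at our online store for the lowest price today wvbn"
list_mail = "Minutes of the weekly project meeting are attached for your review"

same_campaign = similarity(spam_a, spam_b) > THRESHOLD
false_match = similarity(spam_a, list_mail) > THRESHOLD
```

An exact hash of the two spam copies would differ, while the shingle overlap stays high. The trade-off the comment describes shows up in `THRESHOLD`: lower it to catch more heavily randomized copies and the odds of a legitimate message matching go up.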
Mmmm, buzzwordie (Score:3, Funny)
Re:Mmmm, buzzwordie (Score:1, Funny)
Darn... (Score:1)
Create a distributed spam filter that fingerprints incoming mail based on a number of criteria, have the user mark spam with a certain 'undesiredness factor', blacklist email fingerprinted as spam and propagate this information to other people using the same system... This way it should be possible to create some kind of 'network' that classifies email much more reliably than a simple content filter or address blacklist
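A bare-bones sketch of the scheme described above, under loud assumptions: the fingerprint here is just a SHA-256 of the normalized body (the comment says "a number of criteria" without naming them), the 'undesiredness factor' is taken to be a number between 0 and 1, and propagation is a plain copy of reports between peers. The `Peer` class and its methods are hypothetical.

```python
import hashlib
from collections import defaultdict

def fingerprint(body):
    """Hypothetical fingerprint: SHA-256 of the lowercased, whitespace-normalized body."""
    normalized = " ".join(body.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

class Peer:
    """One user's filter: stores local and received reports per fingerprint."""

    def __init__(self, name):
        self.name = name
        self.reports = defaultdict(dict)  # fingerprint -> {reporter: undesiredness}

    def mark(self, body, undesiredness):
        """Record this user's own undesiredness rating (0.0 to 1.0) for a message."""
        self.reports[fingerprint(body)][self.name] = undesiredness

    def sync_from(self, other):
        """Propagation step: copy another peer's reports into our own store."""
        for fp, votes in other.reports.items():
            self.reports[fp].update(votes)

    def score(self, body):
        """Average undesiredness over all known reports; 0.0 if nobody has reported it."""
        votes = self.reports.get(fingerprint(body), {})
        return sum(votes.values()) / len(votes) if votes else 0.0

alice, bob, carol = Peer("alice"), Peer("bob"), Peer("carol")
alice.mark("CHEAP   pills now", 0.9)
bob.mark("cheap pills NOW", 0.7)
carol.sync_from(alice)
carol.sync_from(bob)
carol_score = carol.score("Cheap pills now")
```

Carol never saw the message herself, yet her filter scores it from Alice's and Bob's ratings. Note that a plain hash like this breaks as soon as the spammer randomizes the body, so a real fingerprint would need to be fuzzier.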
spam filters should reduce network load (Score:3, Insightful)
Re:spam filters should reduce network load (Score:2)
All other things being equal, reducing network load is better than not reducing it.
But all other things are NOT equal.
If it's a tradeoff between reducing the load on the human (by reducing the amount of spam they must deal with) and reducing the load on the network, I'll pick "reducing the load on the hum
Re:spam filters should reduce network load (Score:2)
Re:spam filters should reduce network ld-mine does (Score:1)
My mailserver [cf13.com] does everything it can to prevent spammers from using the SMTP DATA command to send their spam.
Probably easy to bypass (Score:2)
Spammers can randomly generate content for their spam to bypass this. Have the actual spam message text as a JPEG image followed by random, gibberish text in both the same background and foreground colour so it is invisible. If the system looks for messages that are identical by comparing the text, then spam messag
Re:Probably easy to bypass (Score:1)
First, the random text doesn't look like legitimate e-mail, so it will be completely ignored by most spam filters. That leaves a JPEG attachment and bogus headers, which ought to look pretty 'spammy' to a mail filter. How often do total strangers send you legitimate email that contains nothing but a JPEG and text that doesn't have any 'important' words in common with your other mail?
Second, if you're like me, then your mail client i
Sounds Like SpamNet (Score:3, Informative)
Cloudmark.
I signed up for the free beta and was told that it would be free forever (they were going to charge businesses, IIRC). Then they changed their mind but said that early adopters/beta users would get it free for life. Then it left beta and they offered me a $5 discount (one time) for their subscription service (or some other pointless trinket offer like that). As far as I'm concerned they ripped me off.
That set me off trying other things, and I eventually found POPFile, which I use to this day (great software). I've posted this to Slashdot before (a long time ago). Some nice guy from an anti-spam company gave me a code for a free version of their product to be nice (I never used it, I had found something by then and didn't feel like switching again).
The point of all this is that it is a nice method that really works. If there was an open source project that did the same thing, I would use it. Until then, I've got a solution that works fine.
But this isn't new (if I'm right about what it is, the article is down).
Standard Form Letter (Score:4, Funny)
(X) technical ( ) legislative ( ) market-based ( ) vigilante
approach to fighting spam. Your idea will not work. Here is why it won't work. (One or more of the following may apply to your particular idea, and it may have other flaws which used to vary from state to state before a bad federal law was passed.)
( ) Spammers can easily use it to harvest email addresses
(X) Mailing lists and other legitimate email uses would be affected
( ) No one will be able to find the guy or collect the money
( ) It is defenseless against brute force attacks
(X) It will stop spam for two weeks and then we'll be stuck with it
( ) Users of email will not put up with it
( ) Microsoft will not put up with it
( ) The police will not put up with it
( ) Requires too much cooperation from spammers
( ) Requires immediate total cooperation from everybody at once
( ) Many email users cannot afford to lose business or alienate potential employers
( ) Spammers don't care about invalid addresses in their lists
(X) Anyone could anonymously destroy anyone else's career or business
Specifically, your plan fails to account for
( ) Laws expressly prohibiting it
( ) Lack of centrally controlling authority for email
( ) Open relays in foreign countries
( ) Ease of searching tiny alphanumeric address space of all email addresses
( ) Asshats
( ) Jurisdictional problems
( ) Unpopularity of weird new taxes
( ) Public reluctance to accept weird new forms of money
( ) Huge existing software investment in SMTP
( ) Susceptibility of protocols other than SMTP to attack
( ) Willingness of users to install OS patches received by email
( ) Armies of worm riddled broadband-connected Windows boxes
(X) Eternal arms race involved in all filtering approaches
( ) Extreme profitability of spam
(X) Joe jobs and/or identity theft
( ) Technically illiterate politicians
( ) Extreme stupidity on the part of people who do business with spammers
( ) Dishonesty on the part of spammers themselves
(X) Bandwidth costs that are unaffected by client filtering
( ) Outlook
and the following philosophical objections may also apply:
( ) Ideas similar to yours are easy to come up with, yet none have ever been shown practical
( ) Any scheme based on opt-out is unacceptable
( ) SMTP headers should not be the subject of legislation
( ) Blacklists suck
( ) Whitelists suck
( ) We should be able to talk about Viagra without being censored
( ) Countermeasures should not involve wire fraud or credit card fraud
( ) Countermeasures should not involve sabotage of public networks
( ) Countermeasures must work if phased in gradually
( ) Sending email should be free
( ) Why should we have to trust you and your servers?
( ) Incompatibility with open source or open source licenses
(X) Feel-good measures do nothing to solve the problem
( ) Temporary/one-time email addresses are cumbersome
( ) I don't want the government reading my email
( ) Killing them that way is not slow and painful enough
Furthermore, this is what I think about you:
(X) Sorry dude, but I don't think it would work.
( ) This is a stupid idea, and you're a stupid person for suggesting it.
( ) Nice try, assh0le! I'm going to find out where you live and burn your house down!
Re:Standard Form Letter (Score:3, Insightful)
(X) Similar to DCC and Razor, but far less bandwidth efficient than either
You should also have checked:
(X) Users of email will not put up with it
(X) Requires immediate total cooperation from everybody at once
Re:Standard Form Letter (Score:1)
I didn't know if users of email would put up with it or not, so I didn't check that one.
I definitely should have added my own option for consuming tons of extra bandwidth per spam, though - this thing would make the existing bandwidth use of spam look like a raindrop compared to the ocean...
Bigger problem... (Score:3, Insightful)
There is a DCC filter that does this. (Score:1)
I have no friends.. (Score:2)
Betcha GMail is doing something like this already (Score:1)
This type of searching (i.e., efficiently searching through a long-tailed distribution) of my contacts and archived mail is probably just one part of the equation - only about 25% of my email is from other gmail users. But nearly all of my legit email i
Spam Archive (Score:1)
missing the point (Score:2)
A P2P approach and querying of other people's address books has huge privacy and compatibility problems without any obvious advantages.
NEW(?) PHISHING TRICK TO AVOID: PLEASE READ! (Score:1)
I entered a bogus but properly formatted CC# but it appeared to reject it. Oh well. Enjoy the relevant information and use it to avoid being duped....
The phish was sent from 80.247.227.76 in France through a redirect page at href=http://projekt.ig-immobilien.com/signin.html in Germany to the phish site itself at:
href=http://61.190.66.139/ws/index2.php?MfcISAPICommand=SignInFPP
in China via the phish email link:
href=https://signin.ebay.com/ws/eBayISAPI.dll?S