
Comparison of Bayesian POP3 Spam Filters
kreide writes "Spam e-mail has become an ever-increasing problem, and these days it is next to impossible to use e-mail without receiving it in large amounts. Although various techniques exist to combat the problem, spammers seemed to be winning the war - until a new, powerful weapon appeared on the scene: Bayesian filters, our last, best hope for spam-free inboxes. In this review I compare POP3-based Bayesian spam filters." We did an Ask Slashdot on this a few weeks ago.
Nitpick... (Score:2, Insightful)
Re:Nitpick... (Score:5, Interesting)
Re:Nitpick... (Score:5, Informative)
Re:Nitpick... (Score:3, Informative)
Re:Nitpick... (Score:3, Informative)
It's quite possible that it goes back further to a version of the Bible or Shakespeare. (Always the two to bet on when finding the source of a phrase in one fell swoop.)
Bayesian filters are useful, but... (Score:5, Funny)
Re:Bayesian filters are useful, but... (Score:5, Insightful)
No, it should be longer, if not all year long.
Re:Bayesian filters are useful, but... (Score:5, Insightful)
Re:Bayesian filters are useful, but... (Score:5, Funny)
Re:Bayesian filters are useful, but... (Score:3, Insightful)
You know, computer crimes are considered terrorism under the USA PATRIOT Act. Until that silly law gets repealed, let's hunt down those terrorists for their, umm, denial of service ...
An immoral law is no less immoral just because you can find a practical use for it. If you don't like the PATRIOT Act, don't support it, period.
A new poll is required (Score:4, Interesting)
I'd personally go for the last option... Maybe the next-to-last if their suit takes place in a really democratic place (there are 278 million American citizens and 2.2 million of them are in jail, this is a *lot*).
Re:A new poll is required (Score:2)
Re:A new poll is required (Score:2, Interesting)
Under the existing technology, a spammer is like the royal pest on a city bus which takes advantage of the captive audience. The analogy here is that we have to download our POP box, we have no way of arranging our affairs to where the signals exist, but we deliberately choose not to tap into them.
I believe the technology must chang
Re:A new poll is required (Score:4, Informative)
Without major ISP deployments, the response rates to spam will not go down, since the clued-up individuals who deploy filtering themselves would never have responded to spam anyway.
Your RF analogy is interesting but it breaks down for people with wireless mobile phone links, dialup when travelling, and so on. The best thing is to make spam unprofitable so it goes away.
Re:A new *law* is required (Score:5, Insightful)
Wouldn't it be grand? [mjlegal.org]
PS: Sorry about the OT, but things like this need to be said whenever the opportunity presents itself.
Re:Bayesian filters are useful, but... (Score:5, Funny)
You: Spammer season!
Spammer: Duck season!
You: Duck season!
Spammer: Spammer season! Fire!
*bang*
You just don't get it (Score:5, Insightful)
Spam is effective because it reaches millions of people who are not installing these filters on their systems. Until ISPs start applying these filters to all mail by default, the spam filters will have no effect at all: exactly the same number of marks will be reached and will respond, whether or not the people who know better than to respond to spam filter their e-mail!
Re:You just don't get it (Score:2, Funny)
Just stay off herbal Viagra and penis enlargement pills, man! :)
Re:You just don't get it (Score:3, Funny)
The problem is there were no instructions on how to find a partner.
Re:You just don't get it (Score:4, Funny)
I do not know how many times I have to tell people this.
They do not work. They just make your hand smaller.
Re:You just don't get it (Score:5, Insightful)
You cannot automatically filter spam. Bayesian filtering works because it works on your own personal items only, and you have a method of manually removing false positives. There is nothing worse than the possibility that an ISP will filter out a real email in their spam system. That simple fact makes server side spam filtering impossible for most situations. You can filter spam into
Until Hotmail et al. start offering Bayesian filtering with a separate 'spam' mailbox, consider server-side filtering worthless.
I am smart and don't get any spam. A lot of people I see in my line of work, aren't. These people are going to get something like Outclass [vargonsoft.com] (an Outlook plugin for POPfile), and then they are going to see the problem go away, and they're not going to lose any email in the process.
I'd rather use SpamBayes, but the Outlook plugin [python.net] has an annoying bug [sourceforge.net] that renders autocompleting addresses in Outlook useless.
You really just don't get it (Score:5, Insightful)
But you still do get spam. Exactly as much, if not more, because you use Bayesian filtering. Spam still wastes your bandwidth, since you have to download it before it can be filtered. Spam still eats into any inbox size limit your ISP might impose. Spam cuts into any quota a forwarding service might now or in the future impose on your account, or it could take you to a higher charge level if you pay for a forwarding service. It costs your ISP money, costs that one way or another are eventually paid by you. Even the Bayesian filtering itself costs you CPU cycles, while having no negative effect on the spammers whatsoever.
While you might not think you care how much spam I get, you might care if dozens, hundreds or thousands of other users at your work also get tons of spam, particularly when all of that spam significantly cuts into your bandwidth. And you will care when overload from spam on your mail server is so bad that it causes failures, effectively causing a D.O.S. situation.
And as long as geeks happily play with their little Bayesian filters, they stop seeing spam and so stop complaining to the providers that are letting spam get through. They stop doing other things that might make spammers' lives difficult. Heck, I fully expect some spam haters with an attitude like yours to say within earshot of a congressman or senator something like "Oh, I never get any spam. Spam can be filtered easily and nothing should be done about it". The spammers should love Bayesian filtering: it takes the pressure off them while allowing them to reach exactly the same number of marks with a mailing.
Re:You really just don't get it (Score:5, Informative)
The other thing you can do is impose a microcost for mailing - at 1c/mail, spamming isn't economical any more. But then that is going to penalise the people who have legitimate reasons to send a million emails at a time - you'd have to have a very good micropayment system working on the Internet to do this.
However, those things need widespread change, and they need people in positions of power. Joe User at home can push for it, but he still gets spam and still wants a short-term solution. I suggest that even if users are filtering, having to check their spam filter will make them irate enough. I see it as being like IPv6 - everyone would really have to change at once for the system to be most effective. (I use Freenet6, do you?)
Now that viruses are public, caught quickly, and Microsoft are being a lot less lax with security (I am in no way commending their effort, but they at least mostly fixed the Outlooks), you don't see people writing them nearly as often. I feel spam will get the same.
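To put rough numbers on the 1c/mail idea above, here is a small sketch of the arithmetic. The one-cent price comes from the comment; the $20 profit per sale is purely a hypothetical figure chosen for illustration, not data from anywhere.

```python
def mailing_cost_cents(messages, cents_per_message=1):
    """Total cost of a mailing, in whole cents."""
    return messages * cents_per_message


def break_even_sales(messages, cents_per_message, profit_cents_per_sale):
    """Smallest number of sales whose profit covers the mailing cost."""
    cost = mailing_cost_cents(messages, cents_per_message)
    return -(-cost // profit_cents_per_sale)  # ceiling division on integers


# One million messages at 1 cent each is 1,000,000 cents, i.e. $10,000.
cost = mailing_cost_cents(1_000_000)
# Hypothetical assumption: the spammer clears $20 (2,000 cents) per sale.
sales = break_even_sales(1_000_000, 1, 2_000)
print(f"cost: ${cost / 100:,.2f}, sales needed to break even: {sales}")
```

Under those assumptions a run of a million messages would need hundreds of paying customers just to break even, which is the sense in which a microcost might make spamming uneconomical. It would hit a legitimate million-message mailing list with exactly the same bill.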
No, I don't use freenet6 (Score:2)
Aug 11 03:19:02 traminer pppoe[19276]: Sent PADT
Aug 11 03:19:02 traminer pppd[12690]: Serial connection established.
Aug 11 03:19:02 traminer pppd[12690]: Using interface ppp0
Aug 11 03:19:02 traminer pppd[12690]: Connect: ppp0 <-->
Aug 11 03:19:08 traminer pppoe[12694]: PADS: Service-Name: ''
Aug 11 03:19:08 traminer pppoe[12694]: PPP session is 4029
Aug 11 03:19:12 traminer pppd[12690]: local
Re:You really just don't get it (Score:5, Interesting)
I'm afraid you've made the cardinal mistake of thinking that spammers follow logic.
First question: Why do people install filters on their mailboxes?
Answer: To stop spam.
Now, take a look at any interview with any spammer.. you'll note that when they're asked, the spammer will say "I don't send it to people who don't want it."
They'll also say "we're always coming up with ways to bypass filters."
Now, you'd think that with the two statements, that one of them is false - however (besides the fact that spammers lie), any sociologist will tell you that the spammer actually believes he's telling the truth in each of these statements..
How he justifies it in his mind is that he believes that even though someone has installed a spam filter, that this person only wants to filter spam from other spammers - that his spam is somehow "special".
Spammers are sociopaths, and like all sociopaths, they believe the rules do not apply to them.
If spammers weren't sociopaths, and were capable of applied logic, then they'd realize that any filter (not just Bayesian) would benefit them.. but then, if they weren't sociopaths, they wouldn't be spammers in the first place.
Re:You just don't get it (Score:4, Interesting)
Even after implementing all the postfix UCE rules and adding in the RBLs - and using spamassassin... I still saw some spam slipping in...
So I hacked together a tiny little Perl script that monitors my mail log... after any IP address gets more than 3 "554" messages (generated by the RBLs) the source IP gets a lovely little teergrube.
I waste their resources and prevent them from trying to deliver any other shit that might get through spamassassin...
The script can be found here [jasonjordan.com.au] but is only good for postfix/linux/iptables people.
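For anyone curious what the counting half of such a script might look like, here is a minimal Python sketch. It is not the linked Perl script. The log pattern is an assumption based on typical Postfix reject lines, and the step that actually tarpits the address (iptables or similar) is deliberately left out.

```python
import re
from collections import Counter

# Assumed Postfix-style reject line, e.g.
#   ... NOQUEUE: reject: RCPT from unknown[192.0.2.7]: 554 5.7.1 ...
# Adjust the pattern to whatever your own mail log really contains.
REJECT_554 = re.compile(r"reject: \w+ from \S*\[([0-9a-fA-F.:]+)\]: 554 ")


def offenders(log_lines, threshold=3):
    """Return the set of client IPs with more than `threshold` 554 rejections."""
    counts = Counter()
    for line in log_lines:
        match = REJECT_554.search(line)
        if match:
            counts[match.group(1)] += 1
    return {ip for ip, seen in counts.items() if seen > threshold}
```

The returned set is what you would hand to your firewall or tarpit of choice. Reading the log incrementally and expiring old entries are left as an exercise.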
Re:You just don't get it (Score:5, Funny)
Lots of them. They're called 'girls' and Slashdot should encourage communication with them wherever possible.
Re:You just don't get it (Score:2)
Re:You just don't get it (Score:2, Insightful)
Still, there are plenty of people who hate spam but don't know how to handle it. At our department, quite a few people receive over 30 spams per day and hate it, but no one has installed a spam filter better than the subject/sender filter built into their (Windows) mail clients. One has stopped reading e-mail from his university ac
Other filters (Score:5, Informative)
I have used PopFile in the past on both Windows and Linux, but found K9 to be better suited for environments where Windows is an option. It's very easy to use, having a windowed interface, and it seemed to learn much faster than PopFile did.
I haven't used SpamBayes. I'll have to give it a shot.
SpamPal (Score:3, Informative)
SpamPal with the add-on Bayesian filter (search Google for it) came out top. It works as a proxy and also provides blacklist/whitelist/known Spammer list checking.
Spamprobe (Score:5, Informative)
Re:Spamprobe (Score:3, Interesting)
Re:Spamprobe (Score:3, Insightful)
Only useful to a point (Score:5, Interesting)
I wish the government would somehow make the practice illegal, but I doubt they'll ever get anything to stick. The far better option at this point is to have a class action suit of server owners (who provide mail accounts) against developers of spamming software and spammers. I've gotten enough warnings from my university to know that bandwidth costs money. Millions of spams a year sent into any one e-mail server can account for a serious chunk of bandwidth, at significant cost to the provider. It won't stop spam altogether, but it will bankrupt anybody that has been doing it.
Re:Only useful to a point (Score:2)
http://spamassassin.org/
Re:Only useful to a point (Score:4, Informative)
Re:Only useful to a point (Score:3, Interesting)
This is why a real anti-spam legal reform would clearly equate circumvention of an anti-spam filter with circumvention of a password prompt. Both are attempts to crack into someone else's computer without permission -- indeed, against an express prohibition -- and the former ought to carry the same penalties as the latter.
Filtering (Score:4, Interesting)
If you are feeling clever you can even use addresses that expire after a week. So something like epochseconds@domain.com
Just my 0.02p
Rus
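A rough Python sketch of the expiring-address idea above: the local part is just the issue time in epoch seconds, and anything older than a week is treated as dead. The function names and the one-week default are illustrative assumptions. A real setup would do this check in the mail server's delivery rules.

```python
WEEK = 7 * 24 * 60 * 60  # lifetime of an address, in seconds


def make_address(now, domain="domain.com"):
    """Build an address whose local part is the time it was issued."""
    return f"{int(now)}@{domain}"


def is_live(address, now, lifetime=WEEK):
    """True if the address was issued no more than `lifetime` seconds before `now`."""
    local_part, _, _ = address.partition("@")
    if not local_part.isdigit():
        return False  # not one of our timestamped addresses
    issued = int(local_part)
    return 0 <= now - issued <= lifetime
```

In use, you would call `make_address(time.time())` when posting and `is_live(recipient, time.time())` when mail arrives. Since a bare timestamp is trivially guessable, a harvester who understood the scheme could mint fresh addresses, so this only defeats dumb scraping.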
Re:Filtering (Score:3, Informative)
hmm, if you really are so clever (Score:2)
you could reduce the flow to 0 by putting
From: not_real@naimod.moc
and to be honest if I was an email harvester I might have noticed "user at domain dot com" and be harvesting those too
Re:hmm, if you really are so clever (Score:5, Interesting)
Speaking from experience, I know for a fact that many of the harvesting programs (written in Perl, running on Linux, written by geeks) are very robust at deciphering most email obfuscation methods. You all sit and shake your fists, and the spamware writers are laughing their asses off.
You have the easy answer: don't obfuscate your email, don't even bother putting it on your posts.
Missing the point? (Score:5, Insightful)
However, I think that ultimately this sort of thing misses the point. Spam needs to be fought in the courts, not in the battlefield. I'm afraid that the success of these filters will cause spam NOT to become illegal, and thus lead to a world where we have a constant trickle of spam, albeit in small amounts.
I think we all agree that we want spam to be gone entirely, as is evidenced by the first post being labeled as "troll".
Re:Missing the point? (Score:3, Insightful)
Yes, absolutely does - just like any other sociopathic behaviour. We need clearly defined rules of what is and is not acceptable. Perhaps you haven't noticed, but "the market" is not working anything out - spam is getting worse, not better, and things such as filters make it worse, by hiding the problem (hint: even though your filters hide your spam from you, you're still
What about features other than text? (Score:2)
Does SpamBayes do anything similar?
Re:What about features other than text? (Score:3, Interesting)
Filters do not stop spam... (Score:5, Insightful)
Your server and its hard drives still end up being a storage bin for it, and the spammers will continue to send as long as your machine allows it to be received. Always remember that spam differs from postal junk mail, in that the -receiver- pays for it. Unsolicited postage due mail.
Spam must be -blocked- and the ISPs that allow/encourage its continued spread must be re-educated, or be put out of business. Only when spam becomes costly to send will it diminish.
The proposed laws on the subject currently focus on content rather than consent. They don't mind if you get spammed with hundreds of ads, provided what is being advertised isn't fraudulent. They overlook the fact that the claim that you 'opted in' to the spam is itself the lie and the fraud.
--Teh
Re:Filters do not stop spam... (Score:2)
Re:Filters do not stop spam... (Score:2, Insightful)
I changed my mind. Simpler is better. (Score:5, Interesting)
I encountered a very simple but unique spam system which works entirely on the sender's address. Simply, you create a small database with the domains/addresses you want to whitelist. Then, a program screens your mail, and if the sender is not in your whitelist, it sends an e-mail BACK to the sender with a simple URL (or even an actual link for HTML e-mail clients) which states that they REALLY want to send the e-mail to its destination. When this is done, they are added to the whitelist. Therefore, mails from forged remote addresses are no longer a problem, and neither are mails from trusted sources. And, better than SPEWS or similar blacklists, the sender gets a SECOND CHANCE to send their mail to you.
There's a commercial solution using this system right now, although the URL escapes me.
Of course, one could encounter problems when ordering online, say. Droids at Amazon will not be clicking your links to make sure your order receipt got through. One could argue that you'd put things like Amazon.com in the whitelist, but what if someone used amazon.com as a spoofed e-mail domain/address? Ay, there's the rub. But if this system were tied in with a Bayesian system, it'd be pretty unbeatable. What's more the Bayesian system would have extra data for negative matches, in the form of e-mails that were never 'approved', and positive data in the form of those that were.
So, I'd be more interested in producing a homebrew system that used MULTIPLE weaker systems, than one supposed 'sure fire' method.. as I feel no one method is perfect, whereas multiple systems can approach this nirvana.
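The whitelist-plus-confirmation flow described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the commercial product mentioned. The class and method names are made up, and actually sending the challenge e-mail with its confirmation URL is left out.

```python
import secrets


class ChallengeFilter:
    """Hold mail from unknown senders until they confirm via a one-time token."""

    def __init__(self, whitelist=()):
        self.whitelist = {address.lower() for address in whitelist}
        self.pending = {}  # token -> sender awaiting confirmation
        self.held = {}     # sender -> messages held back so far

    def receive(self, sender, message):
        """Return ("deliver", None) or ("challenge", token) for this message."""
        sender = sender.lower()
        if sender in self.whitelist:
            return ("deliver", None)
        self.held.setdefault(sender, []).append(message)
        token = secrets.token_urlsafe(16)  # would be embedded in the URL sent back
        self.pending[token] = sender
        return ("challenge", token)

    def confirm(self, token):
        """Whitelist the sender behind `token` and release their held mail."""
        sender = self.pending.pop(token, None)
        if sender is None:
            return []  # unknown or already-used token
        self.whitelist.add(sender)
        return self.held.pop(sender, [])
```

Mail that is never confirmed stays in `held`, which is the pile of "never approved" messages that could be fed to a Bayesian filter as negative examples. The spoofed-sender problem raised above is untouched: anything claiming to come from a whitelisted address is delivered.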
Re:I changed my mind. Simpler is better. (Score:5, Interesting)
I do agree with you that we need multiple layers of safeguards in order to solve spam - or at least to hide it away so nobody has to look at it - but I don't think your specific example is very good.
Re:I changed my mind. Simpler is better. (Score:2)
Re:I changed my mind. Simpler is better. (Score:5, Interesting)
This system has the following benefits:
Re:I changed my mind. Simpler is better. (Score:3, Insightful)
"Bayesian" (Score:4, Insightful)
If the spam disaster had struck fifteen years ago, we'd all be talking about "neural spam filtering" (using artificial neural networks, ANNs) and basking in the warm fuzzy feeling imparted by the term "neural". But ANNs and Bayesian classifiers have the same interface: both are trained on labeled data and can be used to classify unlabeled data. The implementation details are not of primary importance, and if you think they are, I'd encourage you to look into large margin classifiers instead of Naive Bayes or ANNs.
Re:"Bayesian" (Score:5, Informative)
P(mail is spam | words X, Y, Z,
The computation is then done using Bayes' rule (P(A|B)=P(B|A)*P(A)/P(B)) under certain independence assumptions, which make it tractable.
So this is actually Bayesian filtering.
My favorite filter is spamoracle [inria.fr]
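To make the Bayes' rule computation above concrete, here is a minimal naive Bayes sketch in Python. It is not how spamoracle, SpamBayes or POPFile actually score mail. The per-word probabilities are assumed to come from training counts that are not shown, and words missing from either table are simply ignored.

```python
import math


def spam_probability(words, p_word_given_spam, p_word_given_ham, p_spam=0.5):
    """P(spam | words) by Bayes' rule, treating words as independent given the class."""
    log_spam = math.log(p_spam)
    log_ham = math.log(1.0 - p_spam)
    for word in words:
        if word in p_word_given_spam and word in p_word_given_ham:
            log_spam += math.log(p_word_given_spam[word])
            log_ham += math.log(p_word_given_ham[word])
    # P(spam | words) = e^log_spam / (e^log_spam + e^log_ham), rearranged
    # so that only one exponential is needed.
    diff = log_ham - log_spam
    if diff > 700:  # math.exp would overflow; the probability is effectively 0
        return 0.0
    return 1.0 / (1.0 + math.exp(diff))
```

Working in log space avoids underflow when a message contains many words. The P(B) term of Bayes' rule never has to be computed separately, because it is the same for both classes and falls out in the normalisation.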
wtf (Score:2, Insightful)
filtering is no solution as long as there's no way to stop the spammers!
Or would you say that ignoring the corpses in the gutters would be a solution to the problem of violence on the streets?
bye
[L]
Re:wtf (Score:2)
This can be compared to filtering.
Of course it is better to get rid of the problem, but just as with violence this is not realistic.
No matter how many laws, there will always be people or countries who just don't care.
Re:wtf (Score:3, Funny)
filtering is no solution as long as there's no way to stop the spammers!
Or would you say that ignoring the corpses in the gutters would be a solution to the problem of violence on the streets?
Your analogy is slightly flawed. In the case of spam, it would be correct if:
On my system, SpamAssassin kills 99% of the Spam, carries it outside, buries the remains in the sp
And the winner is... (Score:2)
Which only works out of the box with Outlook 2000/Express. Woopy doo.
Are there any recommendations for those of us who aren't forced to use Outlook? I use Eudora myself, and have for years, so I'm not looking for a new email client recommendation.
Re:And the winner is... (Score:3, Informative)
SpamBayes also has a very well done and integrated Outlook plugin which leads to the common misconception that SpamBayes will only work with Outlook.
Also note the review mentioned that both SpamBayes and POPFile work on multiple pl
YFI list (Score:2, Informative)
Re:YFI list (Score:2, Informative)
You'd be surprised how many DNS servers are completely misconfigured for this, but I think that a simple ping to the address given could actually show if it _existed_.
Personally I've found that I can reduce my spam by a huge amount by never viewing HTML...which brings a thought about tracking and tracing the webbugs in any given piece of HTML email...
Re:YFI list (Score:2, Interesting)
That alone kills off about 70% (IMO) of the spam that comes through servers that I administer, and as far as I know, only 2 emails (over the last 4 years or so) that weren't meant to be rejected were rejected because they had invalid sender envelopes.
HTH
cya
Andrew
Authentication of senders (Score:2, Insightful)
Re:Authentication of senders (Score:4, Interesting)
I agree with everything that you said about filters being ineffective. But I strongly disagree with your "only thing" statement. Particularly if you mean it as any of the systems I've ever heard about, such as "If it's not in the address book, the sender must acknowledge a challenge message" type of approaches. The problem with such systems is that many of us get quite a bit of e-mail each day from people who are not in our regular address books, some of it quite important to us. We do not want that mail lost because the system at the other end was not in our address book and did not waste their time responding to a challenge and response type system. For example, say I purchased something on-line from a vendor I had never dealt with before. Their e-mail system may automatically kick out an e-mail that informs me the product was shipped and give me an important FedEx or UPS tracking number. I'm glad they do such things with their shipping systems, and I don't expect them to manually respond to every challenge they get back; realistically they will send any such challenges to the bit bucket and people who want e-mail that is important to them will end up never getting it.
So I do not believe that Authentication of senders, at least in any of the traditionally suggested ways, is the correct approach. Much of the spam problem we have is due to what I consider flaws in SMTP. I would very much like to see a replacement for SMTP that considered the spam problems (as well as other problems inherent in SMTP). As an example, another post here mentioned a system where the mail is held, not on your ISP or upstream provider's system until you download it, but rather is held on the sender's or sender's ISP's system. The recipient would presumably receive only a very short indicator of where they have mail waiting, and would fetch it themselves when they are ready to receive it. This puts the burden of storage on the sender or the service provider for the sender, and avoids considerable bandwidth wasted by senders who supposedly send out e-mail with addresses generated to match all combinations of up to x characters (the excuse Mindspring gave to me when addresses that I created but never gave out or used started getting spam, not that I believe them). In addition to putting this burden on the sender, it would ensure that there was a good address in the e-mail to fetch the mail from, so spammers would have a much harder time injecting their spam into the system and would be much more traceable. And while I'm not foolish enough to think that laws could completely stop spam, we've seen how laws did drastically curtail fax spam, and some fax spammers have recently been made to pay serious fines. I do think laws would have a big effect on spammers; there are a lot of spammers who just don't want to have to move out of the country to keep up spamming, and those of us who hate spam will track the spam back to US sources if we have a law with teeth in it to impose fines (or worse) on them when we do.
Of course, any change to or replacement of SMTP must be phased in over time. It's not a short term solution to spam. But I expect SMTP would quickly go the way of gopher or archie or the rest if a viable new protocol was presented that addressed these problems effectively, and this is where I think our greatest chances for success are.
Re:Authentication of senders (Score:3, Interesting)
Using TMDA [tmda.net], you would generate a "keyword" address: a unique address, identified by a keyword embedded in the address, which would allow your vendor to bypass the C/R system. If the keyword address starts being abused then (1) you can easily disable it, and (2) you know not to do business with that vendor again.
As an example, another post here mentioned a system w
Why not stop the sellers? (Score:5, Insightful)
Instead of hunting down the spammers, why do we not hunt down the people who are selling and advertising their junk via the spammers?
The spammers purposely make themselves difficult to find, but surely it must be easier to track down a company that is collecting money and sending out products? Why not make using spammers' services illegal, and fine and punish those doing so?
I think I'm correct in saying, and please tell me if I'm wrong, that here in the UK a similar situation is people "fly-posting". In these cases, if advertising posters are put somewhere illegal or unwanted, it is not the person who put the poster up that is fined, but the club, record label, or whoever is being advertised that takes the rap.
Just my 0.02p
Mozilla - filters on client not server (Score:4, Interesting)
Re:Mozilla - filters on client not server (Score:4, Informative)
However, that means a change to the server, and a change to the POP3 protocol. The ISP would have to install a filtering plugin or a modified version of the server, and the client would subscribe to this service and train it (every client would have his own dictionary). With the first few messages there would be some special POP3 report back to the server indicating that you consider it spam, and from then on the server would filter on its own.
However, that would be difficult/impractical to roll out, so you will have to live with clientside filtering like in Mozilla.
Re:Mozilla - filters on client not server (Score:2)
In related news (Score:4, Informative)
Links to the PDFs you need to print and mail in are included.
"A little-known Federal law allows individuals to send a Prohibitory Order against companies that are sending unsolicited sexually provocative or erotically arousing mail. The Supreme Court went one step further, allowing individuals to decide what constitutes "erotically arousing" mail. The law makes it illegal for a company to send mail to an individual within thirty days of receiving the Order."
"Postmasters may not refuse to accept a Form 1500 because the advertisment in question does not appear to be sexually oriented. Only the addressee may make that determination."
Everyone? (Score:2, Insightful)
"The first requirement is because I wanted the results to be applicable to everyone"
My how the definition of everyone has changed. So it's bad luck Mac, Solaris, *BSD, HP-UX, VMS users...
Something he misses about popfile. (Score:4, Interesting)
Why filtering isn't the solution (Score:5, Insightful)
I'm saying, why not focus instead on technology which puts a bigger dent in spammers' ability to operate, like how to secure against proxy hijacking [uoregon.edu].
POPFile is more than just a spam tool (Score:5, Interesting)
Re:POPFile is more than just a spam tool (Score:3, Informative)
POPFile and Outclass rock.
Re:POPFile is more than just a spam tool (Score:3, Informative)
Works great. My father, who gets far more spam than the average person (why, I don't know), has a virtually 100% success rate.
It's virtually impossible to not get spam? (Score:5, Informative)
I get spam at the rate of 1 spam mail per 6 months or so. Or maybe even less. I can't remember getting a single spam email on my actual email address for about a year.
If you have an account on a crapless domain (i.e. not hotmail.com, msn.com, aol.com and the likes),
it all comes down to this very simple rule:
Do not, under any circumstance, have your email address posted publicly accessible ANYWHERE on the web.
It WILL get trawled. And then it will be spammed relentlessly.
If you have an existing address you don't want to give up, or an address at hotmail.com or a similar place, dump it.
Then exercise a bit of common sense about where you use your actual address.
I have a domain which catches email to unknown addresses and puts it in my regular mailbox.
Whenever I have to give an email address to some place on the web, I use *domain-i-am-currently-visiting*@mydomain.com. So if I am visiting foobar.com, I would put in foobar.com@mydomain.com.
I have been doing this for years. It enables me to see what was the source of the leak when I get spam on one of the addresses.
It has taught me one thing: I have never, ever, ever, in all my years of online shopping, forum posting etc, come across a single website that has ignored its own privacy statement. Ever. Even the slightly sketchy sites (like divx subtitle sites) don't leak addresses.
I was surprised to realize this.
The only addresses I ever get spam on are the ones I know to be publicly displayed on the web.
So it's that easy to avoid spam.
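The per-site address trick above is easy to mechanise. A tiny Python sketch, assuming a catch-all domain (the mydomain.com from the example) and with function names invented for illustration:

```python
def address_for(site, my_domain="mydomain.com"):
    """Address to hand out to `site`; the catch-all domain delivers it regardless."""
    return f"{site.lower()}@{my_domain}"


def leak_source(recipient):
    """Given the address a spam arrived on, return the site it was handed to."""
    local_part, _, _ = recipient.rpartition("@")
    return local_part


print(address_for("foobar.com"))               # address to type into the form
print(leak_source("foobar.com@mydomain.com"))  # who leaked it, if spam shows up
```

The obvious weakness is that the scheme is guessable: anyone who sees one such address can make up another, so it tells you where an address was harvested but proves nothing by itself.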
Blame the idiots that respond to SPAM. (Score:3, Insightful)
I think the only reasonable way to rid the world of SPAM is to get the foolish folk who respond to it to stop. The reason there is so much of it now is that it seems to work; there are people who actually respond to it. If these people stopped responding to it the use of SPAM would most likely diminish.
Sending SPAM costs money. No sense spending that money if no profit is made.
The real reason SpamBayes wins... (Score:4, Interesting)
You've all seen it work; the Spammers don't just send you the same spam once, they send you it 5 to 20 times, and they include a clipping from the headlines or something under their pitch.
They're not doing it to get that one mail past to you. They're actually HOPING that you classify all 20 mails as spam.
Why?
Because every time you classify that mail as spam, EVERY SINGLE WORD of that news clipping is "poisoned" inside the filter, and becomes an indicator of a spam. Then you turn around, and get an email from someone legitimate using those common words... and it gets wrongly classified too.
Enough false positives, and the spammers win, because they'll get you to turn the filter back off.
Enough is enough -- time to establish open hunting season on Spammers.
SpamBayes Testimonial (Score:4, Interesting)
After a week or two of this, I installed SpamBayes in the form of its Outlook plugin. I showed it my email archive as my "good" messages, and a bunch of spam gleaned from my deleted folder as "bad". My mailbox is now perfectly clean. I have received at least 15,000 spam messages since installing SpamBayes, and I have probably had to hit the "Delete As Spam" button about 10 times for ones that it missed, most of those being variations on the Nigerian scheme. It has never grabbed a real message, and the "Unsure" feature localizes everything that I really need to look at in one place.
If you have a spam problem, get SpamBayes. It is that simple. There is no need to speculate about that better method that you thought up, or how it really won't work because of XYZ theory... it works almost perfectly, and it lets you know about anything that it is not sure about with the "Unsure" folder, so it never throws the baby out with the bathwater. In short, this is almost the perfect Spam filter. It even caught the emails that were using GIFs to avoid being filtered on content, placing them in unsure until I said "this is spam", after which I never saw another one. Pretty darned cool!
It is actually kind of fun to watch this thing work. I came in this morning to find 568 new messages in my spam folder, 3 in unsure, all of which were spam. No spam anywhere to be found in my inbox, just 15 unread messages that were correctly left alone by SpamBayes. Just imagine having to flip through 600 emails to find 15 real messages! Now I just hit "CTRL-A DEL" in my spam folder and it is all gone! 5 seconds a day to deal with spam, I can live with that....
MIMEDefang + SpamAssassin + Razor (Score:4, Informative)
Setting that score at 8 has resulted in no false positives over a week (I log From and Subject information - it's all obvious spam). Then stuff that scores between 5 and 8 I divert to a separate mail box, which I comb through every day or two. There have been two false positives that ended up in that over the week. This is with hundreds of e-mails for a half-dozen users coming in a day. I also end up, with this setup, with 2-4 spams making it through to my own mailbox (the busiest on the system). These are, because of the filtering, the least obnoxious, and easily enough reported to Razor to spare others. Meanwhile, I like to keep a window open to the mail server running "tail -f mail.info | grep REJECT" and watch a dozen or so attempted spams an hour refused acceptance with a message like "554 5.7.1 SpamAssassin score of 15, rejected" back to the origin, which is enough that if it wasn't spam any good mail daemon will inform the sender, and they can find another way to get through.
Even if this gives spammers a clue about ducking SpamAssassin, the spams that can get by it are by far the least obnoxious. I look forward to seeing if the Bayesian feature helps (it feeds itself anything it scores at over 15 by default). But it's a pretty good system short of that. If it became standard for ISPs to reject all mail with a SpamAssassin score of 8 or higher, the loss of legitimate communications would be exceedingly rare, and politeness standards would be encouraged.
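The three-tier policy described above (reject at 8 or higher, divert 5 to 8 for review, deliver the rest) can be sketched roughly like this. This is not MIMEDefang or SpamAssassin code; the function names are made up for illustration, and only the thresholds and the wording of the 554 reply come from the post.

```python
REJECT_AT = 8.0      # refuse at SMTP time at this score or higher
QUARANTINE_AT = 5.0  # 5 up to 8 goes to a separate mailbox for review
AUTOLEARN_AT = 15.0  # per the post, anything scoring over 15 is fed back to Bayes


def disposition(score):
    """Return (action, smtp_reply) for a numeric spam score."""
    if score >= REJECT_AT:
        reply = f"554 5.7.1 SpamAssassin score of {score:g}, rejected"
        return "reject", reply
    if score >= QUARANTINE_AT:
        return "quarantine", None
    return "deliver", None


def feeds_bayes(score):
    """True when a message scores high enough to be learned as spam."""
    return score > AUTOLEARN_AT
```

Rejecting during the SMTP conversation rather than silently dropping is what makes the occasional false positive survivable: the sending server gets the 554 and can tell the sender the mail did not arrive.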
Re:great (Score:3, Insightful)
It's harassment.
Re:great (Score:3, Insightful)
Personally, I get around 100 of these a day, but only 3 get in my inbox instead of one of my specific mail directories, so this is not *that* disturbing.
I just wish these spams were better targeted: getting some penis-enlargement, ultra-fast-diet, university-diploma or cheap-herbal alternative to viagra is somehow repetitive and boring.
Re:great (Score:3, Insightful)
Some wanker spammer got my email address and within two days my spam volume went from zero (seriously) to 30+ a day. All for the same fucking thing. These shits should be legal to hunt and kill.
In response to the original troll, it's a bogus analogy. We PAY for our internet access. We get bombarded with ads on damn near every site... The revenue generated from these s
Re:great (Score:3, Insightful)
So, spam is junk, indeed, but I dispose of it almost instantaneously.
I won't make spamfighting my Holy War...
I have more interesting and valuable things to deal with IRL and I am naturally optimistic.
Let the sp
Re:great (Score:2, Insightful)
But on the other side
Re:great (Score:3, Informative)
Re:great (Score:4, Insightful)
For people who have to pay for their online time (England for example), these scumbags are essentially stealing money from people. Filtering only works once you've downloaded the mail. You still have to download their worthless drivel. Sure, it may be pennies a week in costs for a user, but you tally that up over a year or two of dealing with these idiots, and you've got a sizeable chunk of change. Certainly enough for a nice pizza.
Let's not forget the TIME these shits waste as well. All this work invested in stopping spam. Who knows what cool stuff may have come from the minds who instead are working on ways of dealing with the email cancer.
As I said, these scumbags should be legal to hunt and kill.
Re:great (Score:2, Interesting)
From a sysadmin's POV, this doesn't halt the issue of spam eating bandwidth or disk space. I'll address that next.
Disk space depends on what kind
Spam is not the same as commercial (Score:4, Insightful)
I'd be happy to.
I don't know about you but for me e-mail is an important part of my work - not something comparable to watching cable TV.
Spam clogs my mailbox and I have lost several important e-mails from clients when deleting the spam which, by the way, is often disguised as legitimate non-commercial mail and comes with forged headers. Add the fraudulent products it pushes, and these facts make spam a completely different beast from cable TV and its legitimate, controlled ads which eat up only my free time - not my emails or work efficiency.
Re:great (Score:4, Insightful)
When you watch cable TV, you know that for an hour of content, you are going to see up to 12 minutes of advertising. The advertising is controlled by the cable company, and no-one can advertise on the channel without going through that 'filter'.
Spam, on the other hand, is not restricted. If I receive 100 e-mails a day, anywhere from 0 to 100 of them could be spam. None of those spams are sanctioned (or controlled) by my service-provider, and they were not part of the package I signed up for.
Re:great (Score:3, Insightful)
Re:great (Score:5, Insightful)
NEVER?....Try the BBC [bbc.co.uk]?
No ads, quality programming, small fee.
Re:great (Score:2, Informative)
No ads??? No, it's stuffed to the brim with promos for their own stuff though... (Gardening magazine, History magazine, Nature magazine, Radio Times, Teletubbies toys, Fimbles stuff, trailers for upcoming programmes and series)
Quality programming??? it's gone really downmarket in the last few years..
Small fee??? That fee is your license for receiving _all_ television programs, even cable and satellite... not just the BBC. Although that license money goes to the BBC,
Re:great (Score:2, Informative)
You used to get a free satellite viewing card for your licence fee giving access to all the "terrestrial" public channels on satellite, which was great if you had a spare decoder and crappy terres
Re:popfile accuracy (Score:2)
I was getting 99.87% (Score:2)
It's no good at more subtle classification, but spam/not spam is highly useful.
If you make a mistake while filtering you don't have to restart. You just keep training it, and eventually your mistake will be drowned out as statistical noise.
I've since been moved to Notes so no more spam filtering.
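The "drowned out as statistical noise" point is easy to see with a toy word-count classifier. The sketch below is a bare-bones naive Bayes with add-one smoothing and equal priors; the class, the bucket names, and the training messages are all invented for illustration and are not how any particular filter is implemented.

```python
import math
from collections import Counter


class TinyBayes:
    """Toy two-bucket naive Bayes over whitespace-separated words."""

    def __init__(self):
        self.counts = {"spam": Counter(), "good": Counter()}
        self.totals = {"spam": 0, "good": 0}

    def train(self, bucket, text):
        words = text.lower().split()
        self.counts[bucket].update(words)
        self.totals[bucket] += len(words)

    def spam_probability(self, text):
        # Vocabulary size is used for add-one (Laplace) smoothing.
        vocab = len(set(self.counts["spam"]) | set(self.counts["good"])) or 1
        log_odds = 0.0
        for word in text.lower().split():
            p_spam = (self.counts["spam"][word] + 1) / (self.totals["spam"] + vocab)
            p_good = (self.counts["good"][word] + 1) / (self.totals["good"] + vocab)
            log_odds += math.log(p_spam / p_good)
        # Clamp so math.exp cannot overflow on long messages.
        log_odds = max(-50.0, min(50.0, log_odds))
        return 1.0 / (1.0 + math.exp(-log_odds))


f = TinyBayes()
f.train("good", "meeting notes attached")
f.train("spam", "win money fast")
f.train("good", "cheap pills")           # the mistake: spam filed as good
before = f.spam_probability("cheap pills")

for _ in range(10):                      # keep training correctly afterwards
    f.train("spam", "cheap pills")
after = f.spam_probability("cheap pills")
```

Right after the mistake the filter leans toward "good" for that message; after ten correct examples the single wrong count barely matters and the score swings back toward spam.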
Re:Spammers will just use HTML with images.. (Score:2)
My POPFile says this:
Lookup result for html:imgremotesrc
good 0.2185471262
spam 0.7814528738
html:imgremotesrc is most likely to appear in spam
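Note that the two numbers in that lookup add up to 1, so they read as relative weights across the two buckets. I don't know POPFile's exact internals, but one plausible way to get figures of that shape is to take how often a token turns up per word in each bucket and normalize. The sketch below does just that; the function names and the counts are hypothetical, picked only to land in the same ballpark.

```python
def lookup(word, counts, totals):
    """Return per-bucket weights for `word`, normalized to sum to 1.

    counts: {bucket: {word: occurrences}}
    totals: {bucket: total words trained into that bucket}
    """
    likelihood = {b: counts[b].get(word, 0) / totals[b] for b in totals}
    norm = sum(likelihood.values())
    if norm == 0:
        # Word never seen in any bucket: no evidence either way.
        return {b: 0.0 for b in totals}
    return {b: value / norm for b, value in likelihood.items()}


def most_likely(weights):
    """Name of the bucket with the largest weight."""
    return max(weights, key=weights.get)


# Hypothetical training data: the pseudo-word appears 28 times per
# 1000 words of good mail and 100 times per 1000 words of spam.
counts = {"good": {"html:imgremotesrc": 28}, "spam": {"html:imgremotesrc": 100}}
totals = {"good": 1000, "spam": 1000}

weights = lookup("html:imgremotesrc", counts, totals)
print(f"html:imgremotesrc is most likely to appear in {most_likely(weights)}")
```

The takeaway for the image-spam worry is that "has a remote image" is itself just another token, and it ends up counting against the message.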