Distributed Spam Detection

Posted by CmdrTaco on Saturday December 01, 2001 @01:20PM from the interesting-ideas dept.

A reader writes "There's an interesting project at SourceForge called "Vipul's Razor" that uses a Gnutella-like system to let users exchange spam "signatures" to filter spam. I work at an ISP in Ottawa, and we have been using it for the last two weeks to stop the bulk of the spam coming to our POP3 accounts. More impressively, it hasn't tagged any valid mail as spam yet. Here's the scoop from its webpage: "Vipul's Razor is a distributed, collaborative, spam detection and filtering network. Razor establishes a distributed and constantly updating catalogue of spam in propagation. This catalogue is used by clients to filter out known spam. On receiving a spam, a Razor Reporting Agent (run by an end-user or a troll box) calculates and submits a 20-character unique identification of the spam (a SHA Digest) to its closest Razor Catalogue Server. The Catalogue Server echoes this signature to other trusted servers after storing it in its database. Prior to manual processing or transport-level reception, Razor Filtering Agents (end-users and MTAs) check their incoming mail against a Catalogue Server and filter out or deny transport in case of a signature match."" Cool idea. I'm up around 80% spam a day on my main mail account. Might be worth a try.

This discussion has been archived. No new comments can be posted.

SpamBouncer (Score:5, Informative)

by joib ( 70841 ) writes: on Saturday December 01, 2001 @01:27PM (#2641281)

I'm personally using SpamBouncer [spambouncer.org], a procmail-based spam filter. Works fine for me.

Great use of p2p (Score:5, Insightful)

by astrashe ( 7452 ) writes: on Saturday December 01, 2001 @01:29PM (#2641293) Journal

This is a great use of p2p -- something that doesn't involve piracy. I wish I had heard of it before.

Are there any other innovative non-piracy p2p apps out there that we should know about?

- Re:Great use of p2p -- Wont work. (Score:2, Interesting)
  
  by VC ( 89143 ) writes:
  
  This won't work. All that will happen is that the spammers will just modify their spam programs to slightly modify each message they send out. This will result in each message having a COMPLETELY different SHA signature.
  Cool idea but won't work. Sorry. Maybe some kind of AI algorithm.
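
A minimal sketch of the objection above, assuming the "SHA Digest" is SHA-1 (its raw digest is 20 bytes, which fits the "20-character" description, but the thread does not confirm the exact variant). The message text and the `sha1_hex` helper are made up for illustration.

```python
import hashlib


def sha1_hex(message: str) -> str:
    """Return the SHA-1 digest of a message as 40 hex characters."""
    return hashlib.sha1(message.encode("utf-8")).hexdigest()


# Hypothetical spam body and a copy with a few junk characters appended.
original = "Act now! Limited offer."
tweaked = original + " x7f3"

digest_a = sha1_hex(original)
digest_b = sha1_hex(tweaked)

# Count the hex positions where the two digests disagree.
differing = sum(1 for x, y in zip(digest_a, digest_b) if x != y)

print(digest_a)
print(digest_b)
print(f"{differing} of {len(digest_a)} hex characters differ")
```

An exact-match catalogue keyed on `digest_a` would not recognize the tweaked copy, which is the weakness being described.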
  - Re:Great use of p2p -- Wont work. (Score:5, Interesting)
    
    by DLG ( 14172 ) writes: on Saturday December 01, 2001 @02:40PM (#2641459)
    
    >> This won't work. All that will happen is that the spammers will just modify their spam programs to slightly modify each message they send out.
    
    It will however require them to send each specific message separately rather than sending large cc's or using some sort of relay. That alone is a big step since right now most spammers can get away with sending a single email message and relying on an open relay to retransmit to a larger group.
    
    Furthermore I have doubts that for the time being this project will concern spammers. In fact I am pretty sure spammers are not really interested in wasting their own time trying to spam people who consider spam a violation. It is more convenient to ignore those people (which is why they don't bother to check if you want spam or not before they send it to you).
    
    DLG
    
    - Re:Great use of p2p -- Wont work. (Score:2, Informative)
      
      by Idolatre ( 197068 ) writes:
      
      It will however require them to send each specific message separately rather than sending large cc's or using some sort of relay. That alone is a big step since right now most spammers can get away with sending a single email message and relying on an open relay to retransmit to a larger group
      
      Most spam I get has my real name somewhere in the body of the message, so it doesn't seem like a problem for spammers :(
      - Re:Great use of p2p -- Wont work. (Score:2)
        
        by DLG ( 14172 ) writes:
        
        I personally have never seen a single spam that has my real name. If I sign up on some website for something then certainly I can't be really surprised if folks from that website opt me in. Giving them my email and name and such is an invitation to receive email unless they specifically state they will not send anything.

        Much of the spam I do receive is of the type where they are sending mail to all the DLG's out there for instance.

        Also much of the spam I get comes through the email addresses that are on webpages... In fact I will receive the same spam several times a day. The only thing that might change is the subject line. (I have never understood why someone thinks that sending me 20 of the same exact advertisement overnight is wise..)
        
        In any case, I don't know if this process will reduce all spam for all people, but considering that even with blackholes I still get a sizeable amount of spam, anything is worth trying...
        
        DLG
      - Re:Great use of p2p -- Wont work. (Score:2, Funny)
        
        by linzeal ( 197905 ) writes:
        
        I get a lot of email for Ass Hole, Fuck You, Die Spammer, and other such people I've never heard of.
  - Re:Great use of p2p -- Wont work. (Score:2)
    
    by sporty ( 27564 ) writes:
    
    To a degree, this can work, if the signature were of the text itself. If it was based on long sentences being present within a mail, plus the origin of the mail (based on the connecting IP), this might have a chance.

    Think of it: to get around this, spammers would have to start hitting multiple mail servers, which creates a lot of overhead and is just silly. That, and spammers would have to use very very generic text to get by it. Like "Act now. We sell. Porn!. Natalie Portman!" vs "Come see our barely of age teens do really bad stuff."
    - Re:Great use of p2p -- Wont work. (Score:2)
      
      by sporty ( 27564 ) writes:
      
      Let me make an amendment: a common IP, not necessarily the IP of origin. Someone could be behind NAT :) But then again, the software to figure out the common IP shouldn't be hard...
  - Some positivism and less bitching please... (Score:3, Funny)
    
    by tcc ( 140386 ) writes:
    
    Well at least it *WILL* filter some of the bad content while leaving the good content clean. Right now I receive 20 spam mails a day in my Hotmail inbox, and the Hotmail filter killed *VALID* messages! They keep junk for 2 weeks; I found that out 3 months later because my girlfriend's posts would never reach me for the last few days... and she's far from being a spammer.

    There's no perfect solution for spam (aside from killing every single individual who dares to spam people, which unfortunately is still illegal :) ).
    
    Legislation is too busy removing our civil rights right now than to make our lives better (as they should do). So right now, I'd say, ANY technology helping us to reduce spam should be welcomed and helped in a productive way instead of bashing on it without even giving it a try. It's an open project and it means that if you can contribute in a POSITIVE way, you should. Else, people, please don't discourage programmers working on something that could eventually come out as being a very good solution.
    - - Re:Some positivism and less bitching please... (Score:2)
        
        by Klaruz ( 734 ) writes:
        
        Sounds like they use ospam:
        
        http://omail.omnis.ch/ospam/ [omnis.ch]
        
        It's qmail only though...
      - Re:Some positivism and less bitching please... (Score:3, Interesting)
        
        by Kris_J ( 10111 ) writes:
        
        What it needs is for someone to set up free email accounts with "nospam" in the domain. myemail@nospam.com, or myemail@yahoo.nospam.com, etc -- then all these new harvest-bots that trim out "nospam" will either get it wrong or discount it completely.
        Just a random thought early on a Sunday morning...
  - Re:Great use of p2p -- Wont work. (Score:4, Interesting)
    
    by friscolr ( 124774 ) writes: on Saturday December 01, 2001 @03:31PM (#2641557) Homepage
    
    Maybe some kind of AI algorithm
    every time spam gets mentioned on slashdot, someone says this, and every time i respond with the work i've been doing-
    pattern matching spam [blackant.net]
    uses word counts and phrase counts from known spam and known good mail to match against incoming mail. requires a certain amount of known spam/not spam, but otherwise it has a good rate of matching spam/not spam and doesn't require the incoming mail to be known at all beforehand.
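
A rough sketch of the kind of phrase counting described above, not the blackant.net code itself. It assumes word trigrams as the "phrases" and gives every trigram the same weight; the function names and the neutral 0.5 fallback are choices made for this example.

```python
from collections import Counter


def trigrams(text):
    """Split text into lowercase word trigrams."""
    words = text.lower().split()
    return [tuple(words[i:i + 3]) for i in range(len(words) - 2)]


def train(messages):
    """Count how often each trigram appears across a set of messages."""
    counts = Counter()
    for message in messages:
        counts.update(trigrams(message))
    return counts


def spam_score(message, spam_counts, ham_counts):
    """Return the share of recognized trigrams that lean towards spam.

    Every trigram counts equally; a message with no recognized
    trigrams gets a neutral 0.5.
    """
    grams = trigrams(message)
    spam_hits = sum(1 for g in grams if spam_counts[g] > ham_counts[g])
    ham_hits = sum(1 for g in grams if ham_counts[g] > spam_counts[g])
    total = spam_hits + ham_hits
    return spam_hits / total if total else 0.5


spam_counts = train(["buy cheap pills now", "buy cheap pills today"])
ham_counts = train(["see you at lunch tomorrow"])
print(spam_score("please buy cheap pills now", spam_counts, ham_counts))
```

Unlike an exact hash match, this only needs the incoming mail to share phrases with known spam, at the cost of needing a training set of both kinds of mail.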
    
    - Re:Great use of p2p -- Wont work. (Score:5, Interesting)
      
      by kevinank ( 87560 ) writes: on Saturday December 01, 2001 @04:49PM (#2641773) Homepage
      
      Interesting work, but I notice that you are only examining trigrams, and you are using an even weight factor. To improve selection you probably at least need to use variable weights (a fuzzy logic neural network rather than binary logic) and train the network with more sample spam.
      I've been working on a similar project but using additional factors that help identify spam such as violations of the mail RFC's, and other header indicators, in addition to NLP. I have a prototype that I'm using to score all of my inbox e-mail and am using that to tune the weight factors and add in new factors as I encounter them. It would be interesting to combine your approach with mine I think, since I hadn't thought of analyzing trigrams.
      Anyway, if you are interested send me an e-mail and I'll give you my current perl code.
      
      - I think you may have missed the point. (Score:2)
        
        by MarkusQ ( 450076 ) writes:
        
        Interesting work, but I notice that you are only examining trigrams, and you are using an even weight factor. To improve selection you probably at least need to use variable weights (a fuzzy logic neural network rather than binary logic) and train the network with more sample spam.
        They aren't trying to answer the question "should this particular piece of e-mail be considered spam," but rather "is this particular piece of mail identical (to within some factor) to one that some human considers spam." So they don't need to train anything, they just store the hash-signatures of the spam that is currently making the rounds.
        Even if someone mistakenly identifies a piece of mail as spam, it won't hurt anything; the odds are very low that it will ever match another piece of mail in the entire history of the cosmos.
        -- MarkusQ
- Re:Great use of p2p (Score:2, Interesting)
  
  by __aawsxp7741 ( 78632 ) writes:
  
  How about Freenet [freenetproject.org]? It can be (ab)used for piracy, of course, but that is not its purpose, nor does it seem to be its current main use.
- Re:Great use of p2p (Score:5, Informative)
  
  by Sarcasmooo! ( 267601 ) writes: on Saturday December 01, 2001 @02:36PM (#2641456)
  
  Just because most people on a P2P network use it for piracy, it doesn't become a pirate-app. I can, and have, used programs that are under attack by the RIAA to download speeches, text documents, etc. At the early point of the 2000 Nader campaign, when he couldn't get 30 seconds of time on M$NBC (much less a place in the debates later on), I used Napster and Scour to find speeches he'd given. And when the Department of Commerce kicked off its 'Safe Harbor' privacy program by failing to put the confidential information provided by the companies involved on a secure site, I downloaded the pages in a zip file despite the site being closed for a fix. Using programs like Scour, I found reading material on scientology [chalmers.se], COINTELPRO [icdc.com], and more, all the way up until the day that lawsuits shut them down.
  
- Re:Great use of p2p (Score:2)
  
  by LionKimbro ( 200000 ) writes:
  
  Not yet, but there will be relatively soon.
  
  I anticipate that P2P networks will be good as a Free Software server publishing mechanism.
  
  For example, you download a game, and it uses some popular publishing mechanism for finding or publishing where a game server is.
  
  I'd REALLY like to see a game construction kit that allows you to easily share your sprites and sounds with others around the world.
  
  I mean, just think about anything that you can create and share with others...
So... (Score:5, Interesting)

by DagSverre ( 223837 ) writes: on Saturday December 01, 2001 @01:29PM (#2641294) Homepage

...what stops this from being abused? Say I set up a box that automatically reports all mails on the most popular mailing lists as spam, effectively making the ISPs around the world start to filter out the mailing lists...

It's a great initiative, I really hope no troll out there takes my word on this and actually does this.

- One fix . . . (Score:2)
  
  by tmoertel ( 38456 ) writes:
  
  for abusers who report bogus signatures is to count the number of times each signature is reported and only consider a report valid after the count exceeds a threshold value. Real spam mailings would be reported many times each from distinct nodes and would be easy to distinguish from bogus signatures, which wouldn't be as widely reported.
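
A minimal sketch of that counting idea, assuming each reporting node has some stable identifier. The `ReportCounter` class, its method names, and the threshold value are invented for this example and are not part of Razor.

```python
from collections import defaultdict


class ReportCounter:
    """Track which distinct nodes have reported each signature."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self._reporters = defaultdict(set)

    def report(self, signature, reporter_id):
        # A set ignores repeat reports from the same node.
        self._reporters[signature].add(reporter_id)

    def is_spam(self, signature):
        # Valid only once the distinct-reporter count exceeds the threshold.
        return len(self._reporters.get(signature, ())) > self.threshold


catalogue = ReportCounter(threshold=2)
for _ in range(100):
    catalogue.report("sig-mailing-list", "abuser-box")  # one noisy node
for node in ("isp-a", "isp-b", "isp-c"):
    catalogue.report("sig-real-spam", node)

print(catalogue.is_spam("sig-mailing-list"))
print(catalogue.is_spam("sig-real-spam"))
```

Counting distinct reporters rather than raw reports is what keeps a single abusive box from pushing a signature over the line; an abuser with many identities would still need a separate defence.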
- Apply it late (Score:2)
  
  by Webmonger ( 24302 ) writes:
  
  I don't know how an ISP would accomplish this, but when a user sets it up, it's easy: filter your mailing lists first.
  
  THEN filter the remaining mail.
  The remaining mail SHOULD NOT contain any mailing lists, or other generic mail, just personal stuff.
  
  Wait-- here's how an ISP sets it up: don't delete the suspected spam, just add a header. The user's client can filter it, hopefully after it handles mailing-list mail.
- - Re:So... (Score:4, Insightful)
    
    by Greyfox ( 87712 ) writes: on Saturday December 01, 2001 @04:05PM (#2641639) Homepage Journal
    
    Spammers themselves are generally interested in ways to disrupt those lines of defense. If this project grows in popularity and shows itself to effectively block spam, they'll start gunning for it. Considering potential holes in the system before that starts happening really isn't a bad idea.
    
  - Re:So... (Score:4, Insightful)
    
    by dev0n ( 313063 ) writes: on Saturday December 01, 2001 @04:51PM (#2641781) Homepage
    
    Seems like everyone hates spam with a passion, except maybe the spammers themselves
    
    well, i would have to disagree with you on this point.. i work at a web hosting company as the technical support manager, and handling abuse complaints falls into my realm of responsibility... and i have found that a significant number of first time spammers do not KNOW that spam is "wrong", and get quite upset that they were "taken" by companies that send bulk messages on their behalf. i had one gentleman send me an apology letter that actually made me feel sorry for him. he, and many other people on our network, have never been repeat spammers.
    
    i know that there are many people out there who don't care, but we can't automatically assume that all spammers are evil. some of them are just ignorant.
    
- - Re:So... (Score:2)
    
    by Suidae ( 162977 ) writes:
    
    Seems like it would be easier to set up a superserver or central server setup similar to Kazaa that requires multiple matching reports from many different sources. That would eliminate the difficulties with trust models (like having to pay certificate providers, and people that obtain certs specifically to poison the data).

    Either way, you need to have spam sigs verified by multiple sources before accepting them.
Authentication with servers? (Score:5, Insightful)

by GlassUser ( 190787 ) writes: <slashdot&glassuser,net> on Saturday December 01, 2001 @01:30PM (#2641296) Homepage Journal

I read some of the documentation, but I can't find details on a couple of questions. Do the servers authenticate with each other? It was implied, but how deep is it? Are the SHA signatures signed to the originating server (or client/trollbox) too? I think this kind of model is great, but if you don't have some nifty authentication/accountability, it can be wide open for abuse. I'm sure anyone reading slashdot can imagine a vengeful spammer flooding the network with bogus or malicious hashes.

- Bogus hashes won't tag valid mail (Score:4, Informative)
  
  by morzel ( 62033 ) writes: on Saturday December 01, 2001 @01:57PM (#2641383)
  
  The beauty of a cryptographic hash function is that it's purely one-way: it is very easy to check if two messages are the same (they calculate to the same hash), but it is nearly impossible (or at least very very very hard) to calculate the message for any given hash.
  
  Injecting random hashes into the network won't result in valid emails being tagged, but can flood/DOS the catalogue machines.
  
  It would be possible to create hashes for a number of "probable" emails, but diversity in messages is so big, the chances are quite slim to actually stop a legitimate mail.
  
  - - Re:Bogus hashes won't tag valid mail (Score:2)
      
      by morzel ( 62033 ) writes:
      
      Hehe... I actually meant that it's very hard to derive a message from the hash. (not the message).
      
      You are absolutely correct.
Fabulous Idea! (Score:3, Interesting)

by under_score ( 65824 ) writes: <mishkin AT berteig DOT com> on Saturday December 01, 2001 @01:30PM (#2641298) Homepage

The people who came up with this idea deserve to be considered heroes! This is one of the coolest uses of technology I have seen. (Not to be too gushing: SPAM is a rich man's problem - I hope someone comes up with some cool technological solutions to some of humanity's more basic problems.) I run a server which hosts mail for a number of domains. I haven't yet, 'cause I just heard of it, but this will be used! There might be some interesting extensions based on possible problems: certain kinds of spam interest certain people. Perhaps a categorization system would be useful so that spam can be filtered based on these categories (for example, some people might like receiving 100 MLM spam messages a day :-P ). Also, there is an (extremely) slim chance that a legit mail might be blocked based on matching hashes. Although this is extremely unlikely, could it be fixed somehow? Finally, some spam comes with very slight differences but is essentially the same spam instance. Chain letters are in a grey area. It would be good to have some heuristic methods of filtering based on content too. I don't know the characteristics of the hashing algorithm used, but perhaps by doing three hashes (start of message, middle of message, and end of message) it may be possible to identify spam even if a small part has been changed. Anyway, just some random thoughts. Kudos again to those who have built this!

- Not necessarily such a Fabulous Idea! (Score:3, Interesting)
  
  by marxmarv ( 30295 ) writes:
  
  The people who came up with this idea deserve to be considered heroes!
  
  Wouldn't that be BrightLight?
  
  I don't know the characteristics of the hashing algorithm used, but perhaps by doing three hashes: start of message, middle of message, and end of message, it may be possible to identify spam even if a small part has been changed.
  
  HTML email provides too many places to hide garbage. Comment tags and unused X- attributes are the obvious ones; finely (or grossly) tweaking COLOR elements, or any number of things done to inlined images, provide an effectively infinite number of variations which will pass any filter based on the usual message digest algorithms.
  Many such tricks can be defeated by only hashing words that appear in some standard dictionary and discarding all else, such that
  
  <FONT COLOR="#FEFDFA"><BLINK X-515322451412135135>LIVE CO--ED NAKED DRESSED GIRLS, =46REE</BLINK></FONT>
  
  gets reduced to LIVE NAKED DRESSED GIRLS before hashing. Even then, the smart thing to do is not to block matching mail but to blackhole the sources of matching mail, preferably permanently.
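
A small sketch of that dictionary-only reduction, run on the example line above. The four-word `DICTIONARY` stands in for "some standard dictionary", the regular expressions are a crude way to drop tags, and SHA-1 is assumed as the digest; all three are choices made for this example.

```python
import hashlib
import re

# Hypothetical stand-in for a real dictionary word list.
DICTIONARY = {"live", "naked", "dressed", "girls"}


def dictionary_signature(html, dictionary=DICTIONARY):
    """Keep only dictionary words, then hash what is left."""
    text = re.sub(r"<[^>]*>", " ", html)      # drop tags and their attributes
    tokens = re.findall(r"[A-Za-z]+", text)   # alphabetic runs only
    kept = [t.upper() for t in tokens if t.lower() in dictionary]
    normalized = " ".join(kept)
    digest = hashlib.sha1(normalized.encode("ascii")).hexdigest()
    return normalized, digest


sample = ('<FONT COLOR="#FEFDFA"><BLINK X-515322451412135135>'
          'LIVE CO--ED NAKED DRESSED GIRLS, =46REE</BLINK></FONT>')
normalized, digest = dictionary_signature(sample)
print(normalized)
print(digest)
```

Changing the COLOR value or the X- attribute leaves the digest unchanged, because neither survives the reduction step.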
  
  (Not to be too gushing: SPAM is a rich man's problem - I hope someone comes up with some cool technological solutions to some of humanity's more basic problems.)
  
  Humanity's more basic problems are the inability to cope with the concept of a world without scarcity. Would that technology fix that instead of providing the powerful with more ways to create unnatural scarcity.
  -jhp
- Re:Fabulous Idea! (Score:2, Interesting)
  
  by mmol_6453 ( 231450 ) writes:
  
  I own and operate an ISP, and I will not install this software on my servers, because I refuse to withhold my customers' mail.

  However, I will recommend this software to my customers, so they can use it at their option. That way, they can do what they want. (And I don't get hit with a lawsuit on the off chance a very vital email gets blocked.)
How about a server frontend approach? (Score:3, Insightful)

by serial frame ( 236591 ) writes: on Saturday December 01, 2001 @01:32PM (#2641306)

It would be very neat if this were provided as a free service that acts as a front-end to an existing POP3 account. Simply sign up, provide info like your username, POP3 host (but not password; that can be passed from the service to your POP3 server on log-in for safety reasons). Then, point your favourite mail client at the service's POP3 server, and...voila. Same e-mail, minus the spam.
Nothing truly insightful here, just speculation from a convenience freak.

- Re:How about a server frontend approach? (Score:2)
  
  by crisco ( 4669 ) writes:
  
  Or how about an email client program that logs into your POP mailboxes, downloads mail (without removing it from the mailbox), compares spam signatures and then proceeds to remove spam from the mailbox. Very useful for those of us who don't yet run our own mail servers.
  Might be a little slower for those dial-up users, especially if they are being charged for connect time. But for people with a shell account (I'd love to set a cron job for every hour or so) and an ISP that is unwilling to run a filter, or someone with inexpensive connectivity who would like to reduce spam, it would be a beautiful solution.
- Re:How about a server frontend approach? (Score:2)
  
  by budgenator ( 254554 ) writes:
  
  For it to be widely used as a server front end, you would have to convince a lot of network types that it's both effective and secure. Because right now they would view it as another piece of software to configure and patch, in an area where they had no software before and no precedent for having any software. Also the legal types tend to worry about bogus claims like Email = free speech, and liability over mistakenly blocked Emails, etc.; it's easier for them to consider it a user problem.

  About 80% of the spam to our domain gets forwarded to user bitbucket anyways. This is because our domain name is poiuyt.com and a lot of people use it as a FAKE Email domain instead of using example.com; qwerty@poiuyt.com gets tons of spam. Life would be a lot simpler for me if I got off my duff and learned enough about the POP3 protocol to write a script that just found out how many spams to delete and deleted them w/o downloading. Oh well, such is the cost of laziness.
Fighting spam (Score:5, Informative)

by Brian Kendig ( 1959 ) writes: on Saturday December 01, 2001 @01:36PM (#2641317)

I'll post my usual public service announcements here:

SpamCop [spamcop.net] is a great service for reporting spam; just paste the spam message into the web form, and it'll automatically figure out where the spam came from and send complaints off to the appropriate people.

The Spam Bouncer [spambouncer.org] is a procmail-based personal spam screening tool. It's got some interesting features, but I haven't used it in a long while.

The way I avoid spam is to have my mail client screen out any email which contains any of these phrases:

to be removed
to be permanently removed
to get removed
to get off the list
to get off this list
to be taken off
to remove yourself
removal instructions
remove in subject line
"remove" in subject line
remove in the subject
"remove" in the subject
'remove' in the subject
S.1618
S. 1618

This list by itself catches about 80% of the spam I get.
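
For anyone whose mail client cannot do this, the same phrase screen can be sketched in a few lines. This is only an illustration: it assumes a case-insensitive substring match on the message body with whitespace collapsed, and the `looks_like_spam` name is made up.

```python
# The phrases from the post above, lowercased for case-insensitive matching.
BLOCKED_PHRASES = [
    "to be removed",
    "to be permanently removed",
    "to get removed",
    "to get off the list",
    "to get off this list",
    "to be taken off",
    "to remove yourself",
    "removal instructions",
    "remove in subject line",
    '"remove" in subject line',
    "remove in the subject",
    '"remove" in the subject',
    "'remove' in the subject",
    "s.1618",
    "s. 1618",
]


def looks_like_spam(body, phrases=BLOCKED_PHRASES):
    """Return True if the body contains any blocked phrase."""
    # Lowercase and collapse whitespace so line wraps do not hide a phrase.
    text = " ".join(body.lower().split())
    return any(phrase in text for phrase in phrases)


print(looks_like_spam("To be removed from our list, reply to this mail."))
print(looks_like_spam("Lunch at noon?"))
```

A plain substring match like this is easy to dodge by spacing out letters, and it will also catch legitimate mail that quotes the phrases, so it works best after mailing-list mail has already been sorted out.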

- Re:Fighting spam (Score:2, Informative)
  
  by sqlrob ( 173498 ) writes:
  
  don't forget:
  
  one time mailing
- Re:Fighting spam (Score:2, Informative)
  
  by invenustus ( 56481 ) writes:
  
  The way I avoid spam is to have my mail client screen out any email which contains any of these phrases:
  
  Um, are you on any legitimate mailing lists? Don't those get filtered out? I'd imagine half of Slashdot's readership is on one or more of the Linux development lists. I'm on Yahoo! Groups mailing lists for any number of different interests....
  - Re:Fighting spam (Score:2, Interesting)
    
    by Thanatopsis ( 29786 ) writes:
    
    Not really, you simply change the order in which your filters get checked and filter out legitimate mailing list traffic from SPAM. For example I am a member of various ZDNet lists and development lists. I filter those based on the sender or the from address into my mailbox for them and then I can read them at my leisure.
- Re:Fighting spam (Score:2, Informative)
  
  by suwain_2 ( 260792 ) writes:
  
  I think there's a potential problem with this... Not sure if you'll ever have any actual problems with it, but...
  Suppose you send me mail with the exact text in your post. Now, I don't actually get any spam, but it's not a problem. But let's say I reply, and leave the original text. Suddenly, my mail meets every single criterion that you're filtering on.
- Re:Fighting spam (Score:2)
  
  by FattMattP ( 86246 ) writes:
  
  Also try JunkFilter [zer0.org]
- Foreign spam removal (Score:5, Informative)
  
  by wideangle ( 169366 ) writes: on Saturday December 01, 2001 @03:23PM (#2641548) Homepage
  
  For the many /.ers who:
  a. Use Outlook secretly
  b. Receive loads of foreign spam
  c. Don't know any foreign languages
  d. Don't have any foreign friends
  e. Don't have any friends
  
  This Outlook rule is for you!
  Apply this rule after the message arrives with Ô or ¾ or Ç or or É or ½ or Í or ò or Ë or ® or Ä or ã or Ï or Ö or Ô in the subject or body delete it and stop processing more rules.
  This blocks 99% of foreign spam [spamhaus.org]. Sue Mosher wrote about other effective methods [slipstick.com] for killing spam in Outlook. Finally, before you reply saying "You dummy, that filter works in any client!" -- You're right.
  
- Add one for this: (Score:2)
  
  by TomatoMan ( 93630 ) writes:
  
  This ad is produced and sent out by: AdAd Systems, NY, NY 1 1 2 2 2. To be r e m o v e d from our mailing list please email us at
  harold02@musiclover.com.au with r e m o v e in the subject.
  
  Note the spacing with the word "remove". I wonder if these guys read your post.
- Re:Fighting spam (Score:2)
  
  by Restil ( 31903 ) writes:
  
  Ya.. the S.1618 catches a lot. Or just filtering "this is not spam" or "you requested more information" would get a bunch too. :)
  
  Sad.. I know.
  
  -Restil
- Re:Fighting spam (Score:2)
  
  by csbruce ( 39509 ) writes:
  
  I find that setting aside e-mail that's not actually addressed to me catches a lot of spam.
idea won't work if reaches critical mass (Score:4, Insightful)

by intuition ( 74209 ) writes: on Saturday December 01, 2001 @01:37PM (#2641321) Homepage

Razor catalogs spam by hashing the entire text of the message. Later potential spam is "detected" by hashing entire texts of messages to see if the hash matches any of the existing hashes in the spam catalog.

To get around this all a spammer has to do is change/add at least one character to each spam. This would make all the hashes unique and no spams would be detected.

- Re:idea won't work if reaches critical mass (Score:2)
  
  by morzel ( 62033 ) writes:
  
  Technically, it would be possible to create hashes for different pieces of the message, which can be combined in one single "signature" to detect potential matches. It would be more complicated for the catalogue server to execute searches, and the answers won't always be absolute (e.g. partial match).
  - Re:idea won't work if reaches critical mass (Score:2)
    
    by intuition ( 74209 ) writes:
    
    Technically, it would be possible to create hashes for different pieces of the message, which can be combined in one single "signature" to detect potential matches. It would be more complicated for the catalogue server to execute searches, and the answers won't always be absolute (e.g. partial match).
    
    You would have to define in advance what a "piece" of a message would consist of. Then the spammer simply puts the extra space, unique character, etc. in each "piece" of the message. Then, curiously, morzel is still receiving spams despite his/her modified spam blocking approach.
    
    The central problem is whatever heuristic they use to define what a spam is, it has to be predefined and well known. This would imply the spammer would have knowledge of said heuristic and would be able to form his emails in such configuration as to avoid detection.
    
    An AC has replied to your post as well, suggesting an incomprehensible replacement which at one point says "doing preprocessing on both the spam and the mail". OK, buddy, and you are going to force the spammers to properly preprocess their mail so that it will get blocked by the mail server filter... right.

    If you can force people to do preprocessing, a much better (and comprehensible) solution is Hash cash [cypherspace.org], wherein you force each client to precompute a special value that is costly enough in terms of CPU cycles to deter spamming. This value can be instantly verified by your client, mailserver, etc., and the email will be summarily dropped if the value is not of the costly variety. Even if this value had to be checked by the receiver's client itself, if a significant number of clients were configured not to display the email until the value was verified, incentives for sending spam would drop (hopefully to the point where the effort to send the spam outweighs the return to the spammer).
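
A toy sketch of the proof-of-work idea behind hash cash, not the actual stamp format. It assumes SHA-1, a made-up `recipient:nonce` stamp, and a deliberately small difficulty of 12 leading zero bits so it runs quickly; `mint`, `verify`, and `leading_zero_bits` are names invented here.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the front of a digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def stamp_bits(stamp: str) -> int:
    return leading_zero_bits(hashlib.sha1(stamp.encode("utf-8")).digest())


def mint(recipient: str, bits: int = 12) -> str:
    """Sender side: try nonces until the hash has enough zero bits (costly)."""
    for nonce in count():
        stamp = f"{recipient}:{nonce}"
        if stamp_bits(stamp) >= bits:
            return stamp


def verify(stamp: str, recipient: str, bits: int = 12) -> bool:
    """Receiver side: a single hash checks the sender's work (cheap)."""
    return stamp.startswith(recipient + ":") and stamp_bits(stamp) >= bits


stamp = mint("user@example.com")
print(stamp, verify(stamp, "user@example.com"))
```

Each extra bit of difficulty doubles the expected minting work while verification stays at one hash, which is the asymmetry the scheme relies on. Binding the stamp to the recipient means a spammer cannot reuse one stamp across a whole mailing list.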
  - - Re:idea won't work if reaches critical mass (Score:3, Interesting)
      
      by morzel ( 62033 ) writes:
      
      It is true that it is not always trivial to pick the pieces in such a way that the fragments being hashed start at the same offset, but it isn't always necessary to add extra complexity. Due to the sheer number of copies of the same message being sent by the spammers, it would be quite difficult and time-consuming for them to create a lot of "slight variants" of the same message. Add to that that spammers aren't the only resourceful people on this planet: we can make it difficult for them as well.
      
      This is how I would do it:
      
      Strip HTML/markup language, so that we get plain text of the message.
      
      Strip all "meaningless" characters from the text, keep only alphabetic (or alphanumeric) characters, no spaces or punctuation.
      
      Uppercase everything.
      
      We now have one string, with all the meaningful characters of the email, which makes it quite hard for spammers to vary much without mutilating the message they're trying to convey.
      
      Pick 8 entry points in this string based on the occurrence of a number of well-chosen, predefined two-character combinations that are likely to be found in English text(*) - these need to be defined upfront. There are lots of texts available in Project Gutenberg to analyze to get to such a set.
      
      This is hard: we need to find a good balance between physical location in the string and the occurrence of the combinations we have defined, so that we can take a broad "sample" of the text. Luckily for us, spammers tend to send long messages :-)
      
      Now we compute the hash of the fragments, defined by our entry-points and a fixed length. These hashes combined provide a "real big signature" of the spam message. Pick the last two bytes of every hash, and stick them together for a "small signature" that can be used for searching/matching. We need to define our protocol for searching the catalogue in such a way that when a partial match is found using the small signature, we can retrieve the full signature to check further.
      
      Based on this we have a rating from 0/8 -> 8/8 for the probability of a mail being a spam message. End user settings can define what is destined for the bitbucket, and what goes in your mailbox.
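The steps above can be sketched roughly like this. It is a simplification under stated assumptions: the digram list `DIGRAMS`, the fragment length `FRAGMENT_LEN`, and all function names are hypothetical, and each entry point is simply the first occurrence of a digram, which ignores the harder problem of balancing positions across the text.

```python
import hashlib
import re

DIGRAMS = ["TH", "HE", "IN", "ER", "AN", "RE", "ON", "AT"]  # assumed set, one entry point each
FRAGMENT_LEN = 32  # assumed fixed fragment length

def normalize(body: str) -> str:
    """Strip markup, keep only letters and digits, uppercase everything."""
    text = re.sub(r"<[^>]*>", " ", body)
    return re.sub(r"[^A-Za-z0-9]", "", text).upper()

def fragment_hashes(body: str) -> list[bytes]:
    """Hash one fixed-length fragment per digram, starting at its first occurrence."""
    text = normalize(body)
    hashes = []
    for digram in DIGRAMS:
        start = text.find(digram)
        if start >= 0:
            fragment = text[start:start + FRAGMENT_LEN]
            hashes.append(hashlib.sha1(fragment.encode()).digest())
    return hashes

def small_signature(hashes: list[bytes]) -> bytes:
    """Last two bytes of every fragment hash, stuck together."""
    return b"".join(h[-2:] for h in hashes)

def score(candidate: list[bytes], known: list[bytes]) -> int:
    """Number of fragment hashes shared with a known spam (the 0/8 to 8/8 rating)."""
    return len(set(candidate) & set(known))
```

Two copies of a message that differ only in markup, punctuation, spacing, or case normalize to the same string and therefore score a full match.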
      
      In the end, spammers can (and will) try to circumvent these measures, but it would be hard and (hopefully) time-consuming, and it will require them to mutilate their messages to be undetected. Of course, this system only works properly when people are willing to submit spam fingerprints to the catalogue servers.
      
      Anyway, that's my 0.02 EURO...
      
      (*) Of course, English isn't the only language being used in spam, but I guess it's the most prevalent here. You can of course apply the same principle to any language. Heck, if you really want to push the envelope, you can try to detect the language (character frequency analysis and checking for very common words).
- Re:idea won't work if reaches critical mass (Score:2)
  
  by DaSyonic ( 238637 ) writes:
  
  Spammers already do this. Both to the subject line and in the email you will often find a series of 6-8 random numbers attached. This does not make it impossible for this plan to work however.
Yes I've posted this before but (Score:3, Interesting)

by 4444444 ( 444444 ) writes: <4444444444444444 ... 444444@lenny.com> on Saturday December 01, 2001 @01:41PM (#2641331) Homepage

I love costing spammers real money. Just go to
http://goto.com
and do a search for "bulk email". Each link you click will cost the scumbags that sell spam software or spamming services several dollars.
Also, I love this new technology. I wish all ISPs would use it.

and for more spam fighting ideas please check out
http://www.lenny.com/spam

- Re:Yes I've posted this before but (Score:2)
  
  by TMB ( 70166 ) writes:
  
  That goto.com (though it looks like they've changed their name to Overture) link is damn cool... over $8 per click?! Though that only hurts the companies that make the software, not the ones that use it. Still worthwhile though...
  
  Now I wonder whether they have any limitations for hits from a given IP address? One little perl script could put some of those companies out of business otherwise.... :-)=
  
  [TMB]
- Re:Yes I've posted this before but (Score:2, Interesting)
  
  by bleeeeck ( 190906 ) writes:
  
  I love costing spammers real money. Just go to http://goto.com and do a search for "bulk email". Each link you click will cost the scumbags that sell spam software or spamming services several dollars.
  Here's the link [overture.com] for you lazy people.
  The top few listings are more than $8 each.
- - there are some scripts (Score:3, Informative)
    
    by 4444444 ( 444444 ) writes:
    
    you can find some scripts here
    
    http://www.lenny.com/spam
How do you compute a signature? (Score:5, Informative)

by cperciva ( 102828 ) writes: on Saturday December 01, 2001 @01:41PM (#2641336) Homepage

As far as I can tell from a quick glance at this, it looks like the entire message body is being used to compute the signature. This isn't going to work very well -- over half of the spam I receive is "personalized", and that fraction is growing every day.

This could work very well, but we need some way of computing signatures which will be invariant across different copies of personalized spam for this to be effective.

- Re:How do you compute a signature? (Score:2)
  
  by Chagrin ( 128939 ) writes:
  
  If, when creating the signature, you make sure to only use words that are common to spam, or dictionary words, you'd be able to avoid the majority of any personalization present.
  - Re:How do you compute a signature? (Score:2)
    
    by FFFish ( 7567 ) writes:
    
    For instance, they could use a Markov chain algorithm to parse their ever-increasing collection of sample spam, and use that to determine the "spamness" of email.
Open for abuse? (Score:2, Insightful)

by robstah ( 537647 ) writes:

Although I marvel at the theory and the innovative use of peer-to-peer technology to achieve exemplary aims, I have some concerns about the possibilities of abuse. AFAIK the submission system for spam is not moderated in any way. In fact, only the hash is sent to the server and not a copy of the spam. I am therefore concerned that the system could be abused by someone submitting the hash of a legitimate mail, which would then prevent that email from being received by the other hosts. This could be done by a malicious user to prevent the circulation of Bugtraq items, for instance. And as everyone has different personal opinions about spam and what constitutes it, I think a set of clear guidelines is required, and when submissions are made a copy of the mail should be associated with the hash and a human being should moderate the hashes being submitted. Although I have my doubts about the system, if these were put to rest I would have no hesitation in implementing a system like this.
Re: Distributed spam filter (Score:3, Insightful)

by blibbleblobble ( 526872 ) writes: on Saturday December 01, 2001 @01:45PM (#2641347)

It does seem like a remarkably sensible system, just getting email clients to talk to each other about the emails they get.

You can tell if the same email has been sent to hundreds of people (and if you use hashes, you can do that without revealing the email)

You can click a "this is spam" button when you read an email, and anyone who trusts you (i.e. has your public key in their "trusted filtering friends" list) can look for similar messages and filter them.

But, there do seem to be a load of problems:
- Personalised email, as someone already mentioned
- Privacy problems with letting others into the secrets of your mailbox
- If you have the original of a message, you can calculate the hash, then see who else got the message (i.e. works for personal mail as well as spam)
- Relatively easy for malicious users to wrongly label someone as a spammer

Well worth investigating, though...

SpamAssassin uses Razor (Score:5, Informative)

by wideangle ( 169366 ) writes: on Saturday December 01, 2001 @01:49PM (#2641360) Homepage
From http://spamassassin.taint.org/ [taint.org]:
SpamAssassin is a mail filter to identify spam.
Using its rule base [taint.org], it uses a wide range of heuristic tests on mail headers and body text to identify "spam", also known as unsolicited commercial email.
The spam-identification tactics used include:

header analysis: spammers use a number of tricks to mask their identities, fool you into thinking they've sent a valid mail, or fool you into thinking you must have subscribed at some stage. SpamAssassin tries to spot these.

text analysis: again, spam mails often have a characteristic style (to put it politely), and some characteristic disclaimers and CYA text. SpamAssassin can spot these, too.

blacklists: SpamAssassin supports many useful existing blacklists, such as mail-abuse.org [mail-abuse.org], ordb.org [ordb.org] or others.

Razor: Vipul's Razor [sf.net] is a collaborative spam-tracking database, which works by taking a signature of spam messages. Since spam typically operates by sending an identical message to hundreds of people, Razor short-circuits this by allowing the first person to receive a spam to add it to the database -- at which point everyone else will automatically block it.

Once identified, the mail can then be optionally tagged as spam for later filtering using the user's own mail user-agent application.
SpamAssassin requires very little configuration; you do not need to continually update it with details of your mail accounts, mailing list memberships, etc. It accomplishes filtering without this knowledge, as much as possible.
Call your ISP [google.com] and ask if they use it.
Sounds tres cool (Score:2)

by Saint Aardvark ( 159009 ) writes:

I came across an ad recently for a commercial system that worked in a similar way; they had a bunch of different pop accounts set up to catch spam, and then created signatures of those messages in real time. You subscribe to their service, and you get an updated list every . Can't remember the name of the company, but I do remember them saying that new spam messages were typically sent out to clients w/in 15 minutes.
One question about this system that I hope the poster (or someone else using this system) will answer: what's it like on server load? Right now, at the ISP I work at, we're using procmail to filter for spam (check the graphs here: http://selenium.dowco.com/spam/spam.html [dowco.com]). It's a good way of doing things, but there are some shortcomings: basically, since it runs on our mailserver, I can't run all the body searches I want; in fact, we had to cut out body searches recently because the load was getting too high and/or email was taking too long to get through. There's some workarounds that I haven't got around to putting in yet (body scanning only when 3k in size, etc), but you can see my point. Anyone?
- brightmail? (Score:2)
  
  by autopr0n ( 534291 ) writes:
  
  You might be thinking of Brightmail. I think that's what they do (too lazy to look it up).
This is just a temporary solution. (Score:5, Informative)

by mrsam ( 12205 ) writes: on Saturday December 01, 2001 @01:52PM (#2641366) Homepage

Spam generators have been trying to hash-bust these kinds of filters for years now. A four-year-old spam generator automatically appends random junk at the end of the Subject header or at the tail end of the message, in order to defeat the early hash-based spam filters.

This is probably a 'fuzzy' hash function that should ignore minute variations. However, it goes without saying that if this hash-based spam filter becomes widespread, then the spammers will simply figure out how to hash-bust their way past it.

To have any hope of working over the long term, this kind of an approach must include the ability to distribute not just the hashes themselves, but the hash function as well, so that the hash function itself can be adjusted, when needed.
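To make the hash-busting problem concrete, here is a small sketch, assuming a plain SHA-1 over the message body. The registry `SIGNATURE_FUNCTIONS`, its version names, and the tail-stripping variant are hypothetical; they only illustrate the idea of swapping the signature function, not how any real system distributes one.

```python
import hashlib
import random
import string

def exact_signature(body: str) -> str:
    """SHA-1 over the whole body: any change yields a different digest."""
    return hashlib.sha1(body.encode()).hexdigest()

def tail_stripped_signature(body: str) -> str:
    """Naive adjustment: ignore the last line, where the junk was appended."""
    return hashlib.sha1(body.rsplit("\n", 1)[0].encode()).hexdigest()

# Hypothetical registry: clients look a function up by version name, so a
# server could announce a new name once the old function has been busted.
SIGNATURE_FUNCTIONS = {
    "sha1-exact": exact_signature,
    "sha1-strip-tail": tail_stripped_signature,
}

def hash_bust(body: str, rng: random.Random) -> str:
    """Append a line of random characters, as a hash-busting generator would."""
    junk = "".join(rng.choice(string.ascii_lowercase + string.digits) for _ in range(8))
    return f"{body}\n{junk}"

rng = random.Random(0)
spam = "Make money fast! Visit our site today."
copies = [hash_bust(spam, rng) for _ in range(3)]

for name, function in SIGNATURE_FUNCTIONS.items():
    print(name, len({function(copy) for copy in copies}), "distinct signature(s)")
```

The tail-stripping variant is trivially defeated in turn (junk can go anywhere), which is exactly why the function itself would need to be replaceable.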

- Heh, interesting idea (Score:2)
  
  by autopr0n ( 534291 ) writes:
  
  I always figured that the major problem with a system like this was randomized messages. I figured a way around it would be try to make a 'conceptual' hash of the contents, that try to analyze the meaning of the text, not just the data.
  
  The big problem with that is, well, it's not easy :). But redistributing the hash function when spammers figure out the old one is an interesting idea as well. The big problem is with the more technically savvy spammers (yeah, I'm sure there are some out there, unfortunately) who could reverse engineer the hash to figure out what makes it tick.
- +1 Hackerly on the MQR standard (Score:2)
  
  by MarkusQ ( 450076 ) writes:
  
  To have any hope of working over the long term, this kind of an approach must include the ability to distribute not just the hashes themselves, but the hash function as well, so that the hash function itself can be adjusted, when needed.
  Yes! In fact, why not have many fuzzy hash functions floating around at once? That way, their task would be to come up with something that yielded a different hash against all of the hash functions at once, a much harder problem. If some spammer figures out a way to do it, an anti-spammer can devise a function (looking at lots of copies of the spam, which shouldn't be hard to come by) that would catch it, and now that trick won't work any more.
  Distributing the functions with the hash (with a few safeguards, e.g. re: the halting problem) would make this darned near impossible to beat.
  -- MarkusQ
One way around potential abuse. (Score:5, Insightful)

by chris_7d0h ( 216090 ) writes: on Saturday December 01, 2001 @01:56PM (#2641381) Journal

To eliminate the situation where one person posts a lot of "incorrect" signatures, a ranking system could be applied.
The thought goes like this.
A person submits a signature of "identified" spam mail to a "supernode" for ex. and the submission gets a ranking of 1. Each additional submission (by other users) increases the score by a number.

This way, there are several classifications which could be used to filter incoming mail. For the mail providers, they could opt for only removing mail matching signatures with a very high score (thus very likely these will be actual spam) or they could filter anything reported.

The purpose of allowing the use of classifications is that it will take longer to get higher scores, since more people have to report the specific spam mail. Some people wish to eliminate anything the least bit suspect, but mileage may vary.
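A minimal sketch of such a ranking scheme, assuming each distinct reporter adds one point to a signature's score. The class `SignatureCatalogue` and its methods are hypothetical names for illustration, not part of Razor.

```python
from collections import defaultdict

class SignatureCatalogue:
    """Tracks which reporters have submitted each signature (hypothetical structure)."""

    def __init__(self):
        self._reporters = defaultdict(set)

    def report(self, signature: str, reporter: str) -> int:
        """Record a report; repeat reports from the same reporter do not raise the score."""
        self._reporters[signature].add(reporter)
        return len(self._reporters[signature])

    def score(self, signature: str) -> int:
        """Number of distinct reporters for this signature."""
        return len(self._reporters.get(signature, ()))

    def is_spam(self, signature: str, threshold: int) -> bool:
        """Each site picks its own threshold: 1 filters anything reported, higher is stricter."""
        return self.score(signature) >= threshold
```

Counting distinct reporters rather than raw submissions means a single person posting the same "incorrect" signature many times still only contributes a score of 1.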

Do you see a resemblance with the /. moderation?

- Re:One way around potential abuse. (Score:3, Informative)
  
  by MindStalker ( 22827 ) writes:
  
  Why bother? A hash is only going to affect a very specific mail. How often do you get a mail that many other people also receive, identically, if it isn't spam? Listservs might be a problem. But I'm sure you could filter for each of your subscribed lists so that they don't get deleted.
  - Re:One way around potential abuse. (Score:2)
    
    by Suidae ( 162977 ) writes:
    
    It doesn't have to be an MD5/SHA/whatever hash; it can be a signature based on a fuzzy match. The point is, whatever it is, it needs to be submitted by a number of unrelated sites before it's accepted as valid data. Each site can set its own threshold for messages, depending on how much it wants to filter.
Mailwasher (Score:3, Informative)

by Heem ( 448667 ) writes: on Saturday December 01, 2001 @02:06PM (#2641399) Homepage Journal

I'm using Mailwasher [mailwasher.net]; it works well for me. It allows you to preview your message headers and delete, blacklist, and 'bounce' anything you don't want to receive. Works well on spam as well as email from your ex-girlfriend.

X-YahooFilteredBulk (Score:4, Informative)

by Malc ( 1751 ) writes: on Saturday December 01, 2001 @02:24PM (#2641436)

I noticed that a lot of spam coming through my Yahoo account had been tagged with the header "X-YahooFilteredBulk". I added this to my Exim system filter and I've gone from 20+ spams a day in my inbox to 2 in a week. Thank you Yahoo!
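The header check itself is simple. As a rough stand-in for the Exim system filter rule (whose exact syntax is not shown here), this sketch uses Python's standard `email` package; the function name `is_yahoo_bulk` is made up for the example.

```python
from email import message_from_string

def is_yahoo_bulk(raw_message: str) -> bool:
    """True if the message carries an X-YahooFilteredBulk header (name match is case-insensitive)."""
    msg = message_from_string(raw_message)
    return msg.get("X-YahooFilteredBulk") is not None
```

Only the presence of the header is tested, since the post does not say what value Yahoo puts in it.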

Unfortunately, a lot of anti-spam measures (including Exim 3's system filters) only take place after a message has been accepted for delivery. For me, this results in a lot of bounce messages frozen in the queue as they cannot be returned (Hotmail mailbox full, etc). I've switched on features like verifying the sender and the headers, but this doesn't catch them all, and in some cases might even stop some legitimate mail (one of my mailing lists uses incorrect syntax for the "RCPT TO:").

More effective anti-spam systems need to filter before the message has been accepted. If you wait until afterwards, it is already too late and it is on your system. Refusing to accept delivery is much more effective IMHO, and forces the MTAs further up the chain to deal with it. They shouldn't have accepted it in the first place! When you get spam, return 550 (or whatever the code is) and let the SMTP client deal with it. In an ideal world, every provider (ISP, or free service like Yahoo) will implement stricter MTAs. If the spam rejection can be pushed far enough up the chain, life for everyone will be easier.

BTW, according to Philip Hazel (in a reply I received to a question I posed on the Exim mailing list), Exim 4 will offer much more functionality along these lines, including the invocation of C functions after the DATA phase of the SMTP input. I guess this would be the spot to plug in Vipul's Razor, although I don't know what kind of performance hit that would lead to. Mr. Hazel also pointed out that some stupid clients are in contravention of the RFC and will continue to try to deliver a message if they receive a 5xx after the DATA phase... oh well: they'll be using my bandwidth but they won't be putting any crap on my server.

a good idea, but... (Score:3, Interesting)

by deander2 ( 26173 ) writes: <public.kered@org> on Saturday December 01, 2001 @02:29PM (#2641444) Homepage

What stops the spammer from including a unique identifier in each e-mail (such as a count variable), changing the SHA for each e-mail that goes out?

Just a thought...

- Re:a good idea, but... (Score:2)
  
  by Animats ( 122034 ) writes:
  
  What stops the spammer from including a unique identifier in each e-mail...
  That's a serious problem with a signature-based spam recognizer. There are spam generators that already make each spam unique. Some just personalize the message. Some add text composed of random phrases to the message. Some append a number to the subject line. Just hashing the text of the message won't work for long.
I've managed to filter most spam (Score:3, Interesting)

by Rikardon ( 116190 ) writes: on Saturday December 01, 2001 @02:34PM (#2641453)

I found a clever way to defeat most spam on the webpage of an avid cyclist; unfortunately I can't remember his name or enough information about him to run a Google search and give this method proper attribution. But here goes anyway:

The key to this method is to realize that most spam has a spoofed "To" address -- RARELY is it addressed directly to you. If you dig in the message headers, you'll usually find it was mailed (or CC'd) to a whole bunch of people at once, for obvious reasons. So you set up your mail filters thusly:

First, set up a filter allowing any "legal" mailing lists you're on to go to your Inbox.

Next, a filter to allow any mail sent directly to you (i.e. you@domain.com is in the To or CC lines) to go to your Inbox.

Finally, a filter that deletes everything else.
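The three rules can be sketched like this, using Python's standard `email` package rather than any particular mail client's filter language. `MY_ADDRESSES` and `MAILING_LISTS` are placeholder sets, and the sketch assumes list mail is addressed to the list's own address, which is not true of every list.

```python
from email import message_from_string
from email.utils import getaddresses

MY_ADDRESSES = {"you@domain.com"}            # placeholder address from the post
MAILING_LISTS = {"cycling-list@example.org"}  # hypothetical "legal" list address

def classify(raw_message: str) -> str:
    """Apply the three rules in order and return 'inbox' or 'delete'."""
    msg = message_from_string(raw_message)
    headers = msg.get_all("To", []) + msg.get_all("Cc", [])
    recipients = {address.lower() for _, address in getaddresses(headers)}
    if recipients & MAILING_LISTS:   # rule 1: known mailing lists
        return "inbox"
    if recipients & MY_ADDRESSES:    # rule 2: addressed directly to you in To or CC
        return "inbox"
    return "delete"                  # rule 3: everything else
```

Rule order only matters if the first two rules file mail into different folders; here both deliver to the Inbox.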

You'd be amazed how effective this is. Since setting this up, I only get maybe one spam message past this system every three or four months.

Mind you, I also have my email come in via Bigfoot, which has a pretty good spam filter itself. But this has nonetheless proven quite effective.

- Re:I've managed to filter most spam (Score:2, Insightful)
  
  by LiteForce ( 102751 ) writes:
  
  This won't work if somebody has sent you a message by way of BCC (Blind Carbon Copy).
- Good method, but why use the Inbox at all? (Score:2)
  
  by wideangle ( 169366 ) writes:
  Set a filter that sends "legal" mailing lists to your mailing list folder.
  
  Set another filter that sends friends/family/work/etc to their own folders.
  
  Anything else (spam) gets dumped in the Inbox.
  
  ------
  If you have O2002, you can do something similar by whitelisting. [win2000mag.com] "Whitelisting is the opposite of blacklisting. Whereas the latter bans messages from certain senders, whitelisting accepts mail from specific senders."
  "The new feature is an additional Rules Wizard condition: "sender is in Address Book," where you choose the address book--I've chosen my Contacts folder. For a message from a sender found in my Contacts folder, the rule applies a "known sender" category and stops processing the message. The "stop processing" action ensures that the message stays in my Inbox. Another rule at the bottom of the list moves everything that previous rules didn't handle into my Junk Mail folder for later review."
  How do you do this with PINE/procmail? I'd like to stop using Outlook.
- - Re:I've managed to filter most spam (Score:2)
    
    by psamuels ( 64397 ) writes:
    
    I'd also like to setup on my mail server a check where if the reply-to address != the from address, deny the message at the server.
    
    Don't do this! There are many legitimate uses for Reply-To. Think about it. If Reply-To should always be the same as From, why did the standards even bother to define it?
    
    Most commonly, some mailing lists set the Reply-To to the list address. This is Considered Harmful, partly because some users have other legit uses for the same field, but some list servers do it anyway.
Virus Detection (Score:5, Interesting)

by doorbot.com ( 184378 ) writes: on Saturday December 01, 2001 @02:42PM (#2641465) Journal

This seems like it would be a great method for virus detection on a non-Windows machine. For those of you who run *nix mail servers whose mail eventually filters down to Windows clients, it would be nice to have a mail tagged as viral be immediately denied at the server. So I'm assuming all it would take is a smart admin to tag the email as spam, and then it will propagate around to the other servers (less than 1k would transfer!).

One flaw, depending on your perspective... (Score:4, Interesting)

by wirefarm ( 18470 ) writes: <jim.mmdc@net> on Saturday December 01, 2001 @02:51PM (#2641483) Homepage

I spent the last few days hacking together a bulk mailer in perl. I did so with a lot of sensitivity and a bit of trepidation and a lot of social engineering to my employer who wanted to put together a way to send invitations to a party via email, rather than the very expensive snail mail method that we had been using.

This was emailed to our real customers - our 'A list'. These are the people who get invited to these parties each time - people who come and enjoy the food and drinks, no strings attached.

But yet, technically, it *is* bulk email and, this first time, unsolicited. A very large percentage of the people responded enthusiastically that they want to remain on the list for this, but a few (8 out of 3500) asked to be removed from the list. One guy seemed annoyed and I typed him a personal apology. (In fact, I doubt that this guy read the email before sending off his remove request.)
What if that guy had submitted the email as spam to this system?
In that case, the rest would miss out on coming to a good party.

I hate spam as much as anyone on slashdot. I was asked to set up a bulk email and found that it could be done in a way that was not offensive in this case. Had it conflicted with my conscience, I would have refused.

Maybe the system needs some sort of moderation as a filter, too. At least that would allow valid bulk email to survive one trigger-happy end-user.

Ok, go ahead and tell me that I'm wrong in this...
Cheers,
Jim in Tokyo

- - Intentions matter (Score:2)
    
    by mgkimsal2 ( 200677 ) writes:
    
    Wow - you're taking that to the extreme. Do you shut out people who approach you in a room to talk to you because you didn't give them permission first? Pretty much the same concept.
    
    If I email my bills to clients, but they didn't request them first, does that mean it's 'spam' and they don't have to pay it?
    
    This company had legitimate relationships with current customers. If you can't email a current customer with information about something about your current relationship, then there's something seriously wrong with that definition of 'spam'.
    
    Hmmm... I guess I'm not allowed to send anyone ANY email ever unless the intended recipient requested it first. If the 'real world' operated like some people want email to operate, the world would be a mighty dull place...
  - Re:One flaw, depending on your perspective... (Score:2)
    
    by Fatal0E ( 230910 ) writes:
    
    Reading your reply, I wonder if you have ever run a business. Believe it or not (are you sitting down?) email is a very effective way of keeping your customers in touch with what your company is up to. If you're clever (you're still sitting, right?) those people might even be interested in the products that can supplement the things they already bought! On the other hand, it's my responsibility to take them off those lists at their request. That's a business plan that even predates the internet. Shocking, ain't it? I know!
    
    Sarcasm aside, it's not much of a leap in logic to assume that people who bought things from you in the past might also be interested in your new products. Most of my vendors that I trust with my money and investment I also trust with my email address. Most, not all. Besides, I like keeping up with their new products. I stay informed that way.
    
    But isn't that spam, you ask? The answer is no. When Cisco sends me (unsolicited) specs on their firewall after I bought their VOIP gateway, I don't take that as an intrusion on my space. When someone from Palm Beach tells me that for 10 grand I could get rich even quicker, that is SPAM. See the difference?
    
    Obviously this is all subjective, but I think you're being harsh when you label a party invitation as spam. If I sent you a nice, fancy invitation to my New Year's party via snail mail, would you call me up and yell at me for sending you junk mail?
    
    I would never go to a party announced via spam, even if it were at the Playboy mansion with hot and cold running blondes.
    
    we call this "talkin out (of) your ass" in my neighborhood.
    - - Re:Opting out (Score:2)
        
        by Fatal0E ( 230910 ) writes:
        
        firstly, I would have emailed you cuz I didnt want to discuss this on /. but since I dont troll I can sacrifice some karma :)
        
        anyway, I guess the biggest gap in our opinions is over good spam. UCE from respectable, reputable people to me is a good thing. They are the companies I send my money to.
        
        If someone on the bugtraq list came up with a commercial app that he wanted people on the list to beta test for him I would consider doing it. If he later offered it to subscribers (as before, via the list) at a substantially deep discount I wouldnt mind that either.
        
        My two examples up there are figurative, as opposed to the literal ones I gave you the first time.
        
        Victoria's Secret sends catalogs to my g/f, Thinkgeek sends me pamphlets and Cisco sends product announcements and specs. I like em all. I hold them in the same regard. Spam can be good...but not often :)
        
        Opting-out is not a valid mechanism, because as you know many spammers use opt-out responses as a way to maintain their lists of valid email addresses.
        
        You are correct, but my point is that for those of us who don't rely on spamming as our sole source of income, it's done responsibly. IOW there isn't a vulture waiting for all those opt-out emails to come in so he can sell them at a higher premium. I like to think that for most private corps that do these things (like mine), opt out means opt out.
        
        from the original post that got us both going...
        This was emailed to our real customers - our 'A list'. These are the people who get invited to these parties each time - people who come and enjoy the food and drinks, no strings attached.
OH NO!! (Score:2, Funny)

by evilpaul13 ( 181626 ) writes:

I'll never get another "funny email" from my Mom again!
how I filter spam (Score:2)

by scrytch ( 9198 ) writes:

By filtering out mails that contain the phrase "this is not spam"
- Re:how I filter spam (Score:2)
  
  by psychosis ( 2579 ) writes:
  
  excellent!!! I'd never thought of that. Bravo!
List of server-based spam filter systems (Score:5, Funny)

by tgeller ( 10260 ) writes: on Saturday December 01, 2001 @03:20PM (#2641542) Homepage

A canonical list of server-based spam filtering systems [spamcon.org] is on the SpamCon Foundation site, along with other sysadmin resources [spamcon.org].

another effective spam stopping method? (Score:3, Insightful)

by Sarin ( 112173 ) writes: on Saturday December 01, 2001 @03:24PM (#2641551) Homepage Journal

I receive about 40 spam messages in my mail account each day and I run my own mail server (qmail). Someone told me about a very basic spam stopping method. Just remove the mail account for a couple of weeks and then reconnect it again; you should get less or no spam after that period.

I receive too many real messages to try this out, and I think most spammers won't bother to actually remove an email address from their database if it doesn't exist. But has someone else tried this with any luck?

This p2p spam filtering sounds really nice and I'm going to give it a try asap. I already "lost" another mail account in the flood of spam I got on it, so now it forwards all messages to msnbill@microsoft.com (microsoft domain billing address).

Similar to DCC (Score:2, Informative)

by bedessen ( 411686 ) writes:

See also DCC, the distributed checksum clearinghouse [rhyolite.com]. It uses a fuzzy hash so that bulk emails with minor differences are caught. I think the details differ a lot but the idea is more or less the same.
The death of SpamCop (Score:3, Informative)

by Animats ( 122034 ) writes: on Saturday December 01, 2001 @05:01PM (#2641812) Homepage

I use SpamCop [spamcop.net] to filter the mail for four domains. SpamCop used to be quite effective, because it used a challenge/response system, sending new mail sources an autoreply E-mail with a URL that had to be visited before the mail was forwarded. While that's a pain for the sender, it's been 100% effective in stopping spam.
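For readers unfamiliar with challenge/response, here is a rough sketch of the mechanism as described: mail from unknown senders is held, a one-time token (which would be embedded in the URL of the autoreply) is issued, and confirming the token releases the held mail. The class and method names are hypothetical, and this is not a description of how SpamCop implemented it.

```python
import secrets
from typing import Optional

class ChallengeResponseFilter:
    def __init__(self):
        self.approved = set()   # senders who have answered a challenge
        self.pending = {}       # sender -> list of held messages
        self.tokens = {}        # challenge token -> sender
        self.inbox = []         # delivered messages

    def receive(self, sender: str, message: str) -> Optional[str]:
        """Deliver mail from approved senders and hold the rest.

        Returns a new challenge token on first contact from an unknown
        sender, otherwise None.
        """
        if sender in self.approved:
            self.inbox.append(message)
            return None
        first_contact = sender not in self.pending
        self.pending.setdefault(sender, []).append(message)
        if not first_contact:
            return None
        token = secrets.token_urlsafe(16)
        self.tokens[token] = sender
        return token

    def confirm(self, token: str) -> bool:
        """Called when the challenge URL is visited: approve the sender, release held mail."""
        sender = self.tokens.pop(token, None)
        if sender is None:
            return False
        self.approved.add(sender)
        self.inbox.extend(self.pending.pop(sender, []))
        return True
```

A sender is challenged only once; later mail from an approved address goes straight through.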
Recently, though, SpamCop switched to a heuristic spam-filter, which is quite leaky. Not only does spam get through, messages from well-known viruses come through. It stops maybe half the spam now.
So SpamCop is now no more effective than typical procmail filters. So there's no point in paying for SpamCop service any more.
Anyone know of a good challenge/response alternative to SpamCop?

- - Re:The death of SpamCop (Score:2)
    
    by PigleT ( 28894 ) writes:
    
    "It challenges unknown senders and holds their mail in a pending queue."
    
    FWIW I find this system pretty stupid. The amount of work that *you* have to do resulting from spam is that you have to press `delete' or deal with it. It is highly unfair to multiply that work off onto all senders of legitimate email - you could reasonably say that that means the spammers have won.
    - Re:The death of SpamCop (Score:2)
      
      by Animats ( 122034 ) writes:
      
      The amount of work that *you* have to do resulting from spam is that you have to press `delete' or deal with it. It is highly unfair to multiply that work off onto all senders of legitimate email....
      If that's too much work for a sender, I probably don't want to hear from them anyway. After all, I'm going to have to compose a reply. It's only required on the first e-mail from a new source, so it doesn't bother anyone I hear from regularly.
      Another advantage of the challenge/response system is that it validates the source address. This validates the source of incoming threats.
Answers to some questions raised on slashdot. (Score:5, Informative)

by vipul_ved_prakash ( 540517 ) writes: on Saturday December 01, 2001 @05:48PM (#2641930) Homepage

Hi,
Some of you point out that Razor's use of SHA-1 signatures can be defeated by introducing randomness in the message. This is true; SHA-1 will eventually be phased out and replaced by a fuzzy hashing mechanism like nilsimsa. [http://lexx.shinn.net/cmeclax/nilsimsa.html] [http://www.geocrawler.com/archives/3/2539/2001/7/0/6173567/] The protocol is structured so that hashing algorithms can be changed seamlessly, without breaking the existing system.

Regarding the possibility of poisoning the database, we are working on a reputation system that will assign credit to honest reporters. Once we have a critical mass of users, it would be hard for dishonest reporters to even join the reporting network, much less be able to mount a DOS attack.

Some of these issues have been discussed on the razor-users mailing list. The list archives are located at [http://www.geocrawler.com/archives/3/2539/2001/]

best,
vipul.
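To make the randomness problem concrete, here is a minimal sketch. The SHA-1 part uses Python's standard `hashlib`. The "fuzzy" part is NOT nilsimsa: it is a much simpler stand-in (Jaccard similarity over character trigrams), chosen only to show the general idea that a similarity measure can survive small mutations where an exact digest cannot. The function names and sample messages are made up for the example.

```python
import hashlib


def sha1_signature(body: str) -> str:
    """Exact signature: any change to the body gives a different digest."""
    return hashlib.sha1(body.encode("utf-8")).hexdigest()


def trigrams(body: str) -> set:
    """Set of 3-character windows over the lowercased, whitespace-normalized body."""
    text = " ".join(body.lower().split())
    return {text[i:i + 3] for i in range(len(text) - 2)}


def similarity(a: str, b: str) -> float:
    """Jaccard similarity of trigram sets: 1.0 is identical, 0.0 is disjoint."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


original = "Make money fast! Visit our site today for a limited offer."
mutated = original + " xq7zk2"      # spammer appends random characters
unrelated = "Meeting moved to 3pm, see you in the conference room."

# The exact digests no longer match after the mutation...
print(sha1_signature(original) == sha1_signature(mutated))
# ...while the similarity score stays high for the mutated copy
# and low for an unrelated message.
print(round(similarity(original, mutated), 2))
print(round(similarity(original, unrelated), 2))
```

A real fuzzy hash would also need a compact, comparable signature that can be exchanged between servers, which this sketch does not attempt.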

Not Gnutella-like at all; it's Napster-like. (Score:2, Interesting)

by jordan ( 17131 ) writes:

The comment made in the submission states that Razor is gnutella-like. That is BS too; if anything, it's Napster-like. Razor is a centralized, collaborative filtering system. One could argue that Razor's master servers are distributed and that the entire system is therefore not fully centralized, but this will change shortly to a master/slave model, which will allow the introduction of a reputation management system.

Keep your eyes peeled.

--jordan
How about PGP signatures? (Score:2)

by Corrado ( 64013 ) writes:

I had a thought about this a little while ago. What if you only accepted mail from people that included their PGP fingerprint, and then only the particular people that you want to accept mail from?

This turns your mailbox into an Opt-In situation. I realize that this would be hard to do, and that you would have to swap fingerprints off-line, but wouldn't you have to do that anyway? This would also require mail clients to allow you to set up a new X-Header (most will, won't they?) like PGP-Fingerprint or something.

It certainly would keep unwanted mail out of your mailbox. And if you decided that you didn't want any more mail from a particular person, just remove their fingerprint. This also gets around the problem of someone sending email from different addresses. I personally have 4 or 5 different email addresses that I use for various purposes.
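A rough sketch of the opt-in check described above, using Python's standard `email` package. The `PGP-Fingerprint` header is the hypothetical header proposed in this comment, not an existing standard, and the function names and the sample fingerprint are invented for the example. Note the sketch only compares the header value against an accepted list; since anyone can copy a header, a real setup would presumably also verify a PGP signature over the message.

```python
from email import message_from_string


def normalize(fingerprint: str) -> str:
    """Strip whitespace and uppercase so formatting differences don't matter."""
    return "".join(fingerprint.split()).upper()


def accept(raw_message: str, accepted_fingerprints: set) -> bool:
    """Accept mail only if it carries a fingerprint we have opted in to."""
    msg = message_from_string(raw_message)
    fingerprint = msg.get("PGP-Fingerprint")  # hypothetical header name
    if fingerprint is None:
        return False
    return normalize(fingerprint) in accepted_fingerprints


# Fingerprints swapped off-line with people we want to hear from (made up).
accepted = {normalize("ABCD 1234 EF56 7890 ABCD 1234 EF56 7890 ABCD 1234")}

# Removing a fingerprint from `accepted` stops mail from that person,
# whichever of their addresses they send from.
```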
- Re:Stopping bogus entries? (Score:2, Informative)
  
  by cwebster ( 100824 ) writes:
  
  search google for SHA digest, read how it works, then take a good look at your question
- Re:Stopping bogus entries? (Score:3, Informative)
  
  by Anonymous Coward writes:
  
  You don't seem to understand the concept of a hash. A hash function splits a message into fixed-size blocks, after padding the message to a multiple of the block size using one of several techniques. Each block is then mixed into a fixed-size internal state by a compression function that's difficult to reverse. For instance, MD5 takes 512-bit blocks and keeps a 128-bit state. The state after each block is carried into the processing of the next one, and the state left after the last block is the message digest, or hash. The chance of two random messages having the same hash falls exponentially with the length of the hash (about 1 in 2^n for an n-bit hash). The ability of an attacker to find two messages with the same hash depends on the strength of the hash. Hope this clarifies everything.
  .derf
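The properties described above can be checked with Python's standard `hashlib`. This is just a small illustration; the helper names are made up for the example. It reports the block and digest sizes, shows that feeding the input in pieces gives the same digest as hashing it in one go (the running state carries over between blocks), and counts how many output bits differ between two digests.

```python
import hashlib


def describe(algorithm: str, data: bytes) -> dict:
    """Return the block size, digest size and hex digest for one input."""
    h = hashlib.new(algorithm, data)
    return {
        "block_bits": h.block_size * 8,
        "digest_bits": h.digest_size * 8,
        "hex": h.hexdigest(),
    }


def hash_in_pieces(algorithm: str, pieces) -> str:
    """Feed the input piece by piece; the internal state carries over."""
    h = hashlib.new(algorithm)
    for piece in pieces:
        h.update(piece)
    return h.hexdigest()


def differing_bits(a_hex: str, b_hex: str) -> int:
    """Number of bit positions in which two hex digests differ."""
    return bin(int(a_hex, 16) ^ int(b_hex, 16)).count("1")


md5_info = describe("md5", b"abc")      # 512-bit blocks, 128-bit digest
sha1_info = describe("sha1", b"abc")    # 512-bit blocks, 160-bit digest
```

The 160-bit SHA-1 digest is 20 bytes long, which is where the "20-character" signature in the Razor description comes from.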
- Re:Stopping bogus entries? (Score:2, Informative)
  
  by cheebie ( 459397 ) writes:
  
  In the first place, it's not 20 words, it's 20 characters. In the second place, those 20 characters are simply the SHA signature of the offending message. I assume they key on some of the more constant headers and (possibly part of) the body of the text. By the very nature of digital sigs, it would be difficult (impossible?) to key on something like "any post with the word 'carroway' in it".
- Re:Why this wont work. (Score:2, Informative)
  
  by glomph ( 2644 ) writes:
  
  Spammers -have- been doing this for a long time, appending some randomly generated crap characters to the subject line, to avoid hash-recognition.
- - Re:Have you looked at Hotmail's new spam filter? (Score:2, Interesting)
    
    by Anonymous Coward writes:
    
    I wonder if Hotmail is using the same kind of logic. I mean, they allow the user to label which emails the user sees as spam. Then they can set some kind of threshold based on how many labels a signature has received.
    
    When the threshold is crossed, then the signature will be categorized as spam to all other users.
    
    It will work beautifully considering how many users they have.
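The threshold idea speculated about above could look something like this. It is purely a sketch of the logic in the comment, not a description of what Hotmail actually does; the class name, the default threshold of 3, and the choice to count each user only once per signature are all assumptions.

```python
from collections import defaultdict


class ReportCounter:
    """Hypothetical tally of spam labels per message signature."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold
        # signature -> set of user ids that labeled it as spam
        self.reporters = defaultdict(set)

    def report(self, signature: str, user_id: str) -> None:
        """Record one user's label; repeat labels by the same user don't stack."""
        self.reporters[signature].add(user_id)

    def is_spam(self, signature: str) -> bool:
        """True once enough distinct users have labeled this signature."""
        return len(self.reporters.get(signature, ())) >= self.threshold
```

Counting distinct users rather than raw labels is one simple way to keep a single account from pushing a signature over the threshold on its own.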
  - - - Re:Have you looked at Hotmail's new spam filter? (Score:2)
        
        by wideangle ( 169366 ) writes:
        
        Yes you need to scan your junk folder occasionally. Think of it like dumpster diving -- sometimes you discover good stuff in there.
        Seriously, the trick is making legit mail easier to find. So in addition to the whitelist, you need more filters:
        
        1. Filter mail sent from [family addresses] to family folder
        2. Filter mail sent from [friend addresses] to friends
        3. Filter mail sent from [*@work.com] to work
        4. Filter mail with subj [ebay] to ebay
        5. Filter mail with [foreign strings] to junk AND color it Gray
        6. Filter mail with [spam criteria] to junk AND color it Gray
        7. Anything else goes to junk for review
        
        Key rules are #5 and 6, which color spam appropriately. Now it's easier to review your junk folder, because real mail is most likely colored Black.
        Add one more rule, to be checked after you send a _reply_ to a legit msg:
        
        Is recipient in address book? (whitelist) If not, add it.
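The rule list above maps naturally onto an ordered chain of checks where the first match wins. The sketch below is one possible reading of it, not any particular mail client's filter language. The addresses, the spam phrases, and the "non-ASCII subject" test for foreign strings are placeholder assumptions, as is sending whitelisted senders to an "inbox" folder.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Verdict:
    folder: str
    color: str = "black"


# Placeholder data; a real setup would load these from the address book.
FAMILY = {"mom@example.org"}
FRIENDS = {"pal@example.net"}
SPAM_PHRASES = ("free money", "act now")


def is_foreign(text: str) -> bool:
    """Crude stand-in for [foreign strings]: any non-ASCII character."""
    return any(ord(ch) > 127 for ch in text)


def looks_like_spam(subject: str) -> bool:
    """Crude stand-in for [spam criteria]: known phrases in the subject."""
    lowered = subject.lower()
    return any(phrase in lowered for phrase in SPAM_PHRASES)


def classify(sender: str, subject: str, whitelist: set) -> Verdict:
    """Apply the rules in order; the first matching rule decides."""
    sender = sender.lower()
    if sender in whitelist:
        return Verdict("inbox")
    if sender in FAMILY:                      # rule 1
        return Verdict("family")
    if sender in FRIENDS:                     # rule 2
        return Verdict("friends")
    if sender.endswith("@work.com"):          # rule 3
        return Verdict("work")
    if "ebay" in subject.lower():             # rule 4
        return Verdict("ebay")
    if is_foreign(subject):                   # rule 5: junk, colored gray
        return Verdict("junk", "gray")
    if looks_like_spam(subject):              # rule 6: junk, colored gray
        return Verdict("junk", "gray")
    return Verdict("junk")                    # rule 7: junk, left black


def after_reply(recipient: str, whitelist: set) -> None:
    """The extra rule: anyone you reply to gets added to the whitelist."""
    whitelist.add(recipient.lower())
```

With this ordering, unknown but plausible mail lands in junk still colored black, so it stands out from the gray spam during review.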
- Re:Anyone know where these people live? (Score:2)
  
  by Malc ( 1751 ) writes:
  
  And what good will that be? Are you planning some vigilante action?
- spammer said stopping spam is un-american. (Score:2)
  
  by www.sorehands.com ( 142825 ) writes:
  
  I got a spam from one spammer, who gave an 800 number.
  
  I called him, and he said that it was un-American to stop spam. In my case, he got the email address from the prairielaw.com website, but it was too expensive for him to pay for advertising.
  
  Maybe you'd like to discuss it with him.
  Locators, Inc
  
  888-595-9131 Toll Free
  - Re:spammer said stopping spam is un-american. (Score:3, Informative)
    
    by zulux ( 112259 ) writes:
    
    Watch out! In some cases an 888 or 800 number can act like a 900 number - It can cost you money!
    
    http://www.bbbsouthland.org/topic110.html [bbbsouthland.org]
    for more information.

