Live spam-catching contest at CEAS

Posted by CmdrTaco on Wednesday April 11, 2007 @12:19PM

noodleburglar writes "The 2007 Conference on Email and Anti-Spam (CEAS) will feature a live spam-catching contest. Entrants will be treated to a torrent of spam and must use their spam filtering technique to filter out as much as possible, while also letting legitimate messages through. My money's on SpamAssassin." This ought to be a sweeps week television spectacular.

This discussion has been archived. No new comments can be posted.

CRM114 (Score:4, Informative)

by sageFool ( 36961 ) writes: on Wednesday April 11, 2007 @12:22PM (#18690681) Homepage

http://crm114.sourceforge.net/ [sourceforge.net] using hyperspace! It's been working better than SpamAssassin for me.

- Re: (Score:1, Funny)
  
  by Anonymous Coward writes:
  
  Unlike many other "filters", CRM114's default action is to read all of input, and put NOTHING onto output.
  
  This is either:
  
  1) "automatic" white-listing?
  2) Not healthy and you should eat more fibre.
- - Agile and evolutionary versus ergodic spam (Score:3, Insightful)
    
    by goombah99 ( 560566 ) writes:
    
    The trouble I can see with a test like this is that it's a static test. It assumes a key feature of spam which is not true, namely that the spam signature is constant over time, or at least it makes an ergodic assumption. The thing about spam is that it is evolutionary. Not only does its signature vary, but the spammers learn what is getting through and shift to sending more of that flavor.
    
    To see why this matters, consider two hypothetical spam programs. One blocks 99% of the test set spam but lets a
    - Re: (Score:2)
      
      by gvc ( 167165 ) writes:
      
      The trouble I can see with a test like this is that it's a static test.
      
      No it isn't. Hence the name Live Spam Challenge.
      - Re: (Score:2)
        
        by goombah99 ( 560566 ) writes:
        
        No, you are mistaken, I believe. The term "live" is meant in the sense of real-time, sequentially delivered spam. An on-line test. Not a test where one has the entire corpus of spam to train and filter. But the spam signature waveform is, unless I'm wrong, not going to be reactive to the filters. I'd even bet that all filters will be delivered the same message sets for ease of comparison. I doubt the spam will evolve its signature in an intelligent reactive manner to evade the filter. But that's the
        
        Re: (Score:2)
        
        by gvc ( 167165 ) writes:
        
        I meant live to mean that the spam was captured and delivered in real time. If one or more spam filters adds the spam to Razor, or an RBL, or whatever, that'll be observable -- by spammers and filters alike.
    - Re: (Score:2)
      
      by martin-boundary ( 547041 ) writes:
      
      This contest is testing filters on a live short window of time. What you want has already been done many times in the past (look up the work done by NIST [nist.gov] for example).
      In the past, filters have been tested on spam data collected over literally a year or more, which captures the natural variation of the spam stream. Note that in these tests, filters aren't given the full dataset immediately, they have to learn the new spam patterns as the test progresses. That's what you're talking about, and it's been done
      - Re: (Score:2)
        
        by goombah99 ( 560566 ) writes:
        
        Everything you say is completely wrong.
        
        This contest is testing filters on a live short window of time. What you want has already been done many times in the past (look up the work done by NIST [nist.gov] for example).
        I'm sorry but you have utterly misunderstood what I was saying or you don't understand the reference you linked to. The reference you link to is an on-line tracking filter for spam. The spam itself can vary or not, but it is not co-evolving in response to the filter itself which is what real spam does.
        In the past, filters have been tested on spam data collected over literally a year or more, which captures the natural variation of the spam stream.
        Now I'm certain you don't understand the difference between spam varying and spam co-evolving. In simple terms the first is game theory when your opponent does not c
        
        Re: (Score:2)
        
        by martin-boundary ( 547041 ) writes:
        
        I understand perfectly your point and simply disagree.
        The spam itself can vary or not, but it is not co-evolving in response to the filter itself which is what real spam does.
        There is no such thing as realtime coevolving spam in response to the filter. A filter doesn't give feedback to a spammer. There is no direct information path from the decision taken by a filter and the subsequent decisions taken by spammers on future spam campaigns. To believe there is is like believing in the tooth fairy.
  - Re: (Score:2)
    
    by TFGeditor ( 737839 ) writes:
    
    I was wondering how the test-spam generator will handle headers, especially origin IP address. That alone is often 75 percent accurate in determining spam. If it sources from an IP in Korea, South America, or Europe and is destined for a North American inbox, odds are it is spam.
    
    Not flaming, just an observation based on my own experiences.
    - Re: (Score:2)
      
      by HomelessInLaJolla ( 1026842 ) * writes:
      
      especially origin IP address
      Gmail, and possibly other new webmail services, no longer include the X-Originating-IP field and actually go the opposite route--all e-mail I receive from gmail accounts appears to originate from an internal 10.x.x.x IP address.
      
      I cannot possibly come up with any viable justification for this. I can think of plenty of excuses and all of them rely on idiotic fallacies.
    - Exploiting This (Score:2)
      
      by Slashdot Parent ( 995749 ) writes:
      
      I agree, and was pondering how to exploit that fact. I couldn't think of a good answer, so I decided to just let the Bayesian classifier figure it out for me.
      
      I use a routine that can quickly determine the origin country of an IP address and just insert that origin country into the headers of the message in an X- header. Then, it's just one more thing for the Bayesian classifier to decide what to do with. It realizes that I don't get much ham from Latvia, so when it sees X-Origin-Country: Latvia, that spa
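
For illustration, the header-tagging idea in the comment above can be sketched in Python. This is a minimal sketch, not the poster's actual routine: the lookup table, the function names, and the example address ranges are all hypothetical stand-ins for a real IP-to-country database.

```python
import ipaddress
from email import message_from_string
from email.message import Message

# Hypothetical lookup table standing in for a real IP-to-country database.
# The networks are documentation-only ranges; the country mapping is invented.
COUNTRY_BY_NETWORK = {
    ipaddress.ip_network("192.0.2.0/24"): "Latvia",
    ipaddress.ip_network("198.51.100.0/24"): "United States",
}


def lookup_country(ip: str) -> str:
    """Return the country for an IP, or "Unknown" if no network matches."""
    addr = ipaddress.ip_address(ip)
    for network, country in COUNTRY_BY_NETWORK.items():
        if addr in network:
            return country
    return "Unknown"


def tag_origin_country(raw_message: str, origin_ip: str) -> Message:
    """Parse a message and add an X-Origin-Country header for the classifier."""
    msg = message_from_string(raw_message)
    # Remove any copy of the header the sender may have forged.
    del msg["X-Origin-Country"]
    msg["X-Origin-Country"] = lookup_country(origin_ip)
    return msg
```

The tagged message would then be handed to the Bayesian classifier unchanged, so the header value simply becomes one more token for it to weigh.
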
My money (Score:2)

by Mateo_LeFou ( 859634 ) writes:

is on whatever Gmail uses. I've not yet seen a spam message in my inbox, nor have I missed any mail, even from auto-mailing scripts at websites I'm building...
- Re: (Score:3, Funny)
  
  by rodney dill ( 631059 ) writes:
  
  Well let's just find out, just what is your gmail address, hmmmm?
  
  ;)
  - - Re: (Score:3, Funny)
      
      by Zephyros ( 966835 ) writes:
      
      Translation: "You have no chance to survive. Make your time."
- Group spam detection (Score:5, Informative)
  
  by Animats ( 122034 ) writes: on Wednesday April 11, 2007 @12:32PM (#18690811) Homepage
  
  Gmail, like SpamCop, has a group spam filter system. It looks at mail sent to a large number of recipients. The defining characteristic of spam is that it's sent to a large number of recipients, after all. If you're in a position to watch the incoming mail of a few million mailboxes, detecting spam is easy.
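
A rough sketch of the "same message to many mailboxes" signal described above, in Python. How Gmail or SpamCop actually implement this is not stated in the thread; the class name, the threshold, and the whitespace/case normalization are assumptions made for illustration only.

```python
import hashlib
from collections import defaultdict


def fingerprint(body: str) -> str:
    """Hash the body after collapsing whitespace and case (an assumed normalization)."""
    normalized = " ".join(body.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class BulkDetector:
    """Flags a message once its fingerprint has reached `threshold` distinct mailboxes."""

    def __init__(self, threshold: int = 1000):
        self.threshold = threshold
        self.recipients = defaultdict(set)  # fingerprint -> set of recipients seen

    def observe(self, recipient: str, body: str) -> bool:
        key = fingerprint(body)
        self.recipients[key].add(recipient)
        return len(self.recipients[key]) >= self.threshold
```

An exact hash like this is defeated by the per-message randomization that replies further down mention, so a real system would presumably need a fuzzier fingerprint.
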
  
  - Re: (Score:2)
    
    by ProfessionalCookie ( 673314 ) writes:
    
    Yeah- I'm waiting to see algorithmically generated spam where no two messages are alike. Bleh! That being said gmail does a tremendous job of letting through legitimate messages (which is no doubt the hardest part of making a spam filter these days).
    - Re: (Score:2)
      
      by Animats ( 122034 ) writes:
      
      Yeah- I'm waiting to see algorithmically generated spam where no two messages are alike.
      We've had that for years. The latest variant is in those Viagra spams with a faint pattern of background noise in the images, different for each spam.
      - Re: (Score:2)
        
        by ProfessionalCookie ( 673314 ) writes:
        
        Botnets. Now they're programmable!
  - Re:Group spam detection (Score:5, Interesting)
    
    by kebes ( 861706 ) writes: on Wednesday April 11, 2007 @12:43PM (#18691017) Journal
    
    You're right--but the size of Gmail gives them another advantage. In those marginal cases where the spam filter isn't sure about an email (is this spam or a mailing list?) it has the advantage of having a huge number of people checking all the emails. That is, the users do the final check.
    
    I have received a spam to my gmail account exactly once. And when I did, shocked, I clicked the "mark as spam" button. The point is that this spam was probably sent to millions of Gmail users, and the algorithm wasn't sure how to categorize it. But because I clicked "spam" (and probably a few other people did, too), it was marked as spam for everyone. So most users never saw it in their inbox. Thus only a dozen out of the million recipients were ever bothered by the spam. Conversely, an email list would receive no (or very few) "mark as spam" clicks, and would be allowed to pass. So basically the Gmail userbase acts as the workforce to continually train the spam filter, and moreover to detect new spam within minutes of it being sent.
    
    It's hard to beat a system like that. But the point is that it relies on the large number of users who are all (effectively) sharing their spam training sets with each other in realtime.
    
    This is not to say that the baseline algorithm that Gmail implements isn't quite effective, but the point is that Gmail can use the users to resolve those tricky false-positive and false-negative situations.
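
To make the crowd-reporting idea above concrete, here is a minimal Python sketch. It is a guess at the general shape of such a system, not Gmail's method: the class, both thresholds, and the decision rule are hypothetical.

```python
from collections import defaultdict


class SpamVotes:
    """Treats a message as spam once enough distinct recipients have reported it."""

    def __init__(self, min_reports: int = 10, min_ratio: float = 0.001):
        self.min_reports = min_reports  # absolute number of distinct reporters needed
        self.min_ratio = min_ratio      # reporters as a fraction of deliveries
        self.delivered = defaultdict(int)
        self.reporters = defaultdict(set)

    def record_delivery(self, message_id: str, count: int = 1) -> None:
        self.delivered[message_id] += count

    def record_report(self, message_id: str, user: str) -> None:
        # A set means repeated clicks by one user count only once.
        self.reporters[message_id].add(user)

    def is_spam(self, message_id: str) -> bool:
        delivered = self.delivered[message_id]
        reports = len(self.reporters[message_id])
        if delivered == 0:
            return False
        return reports >= self.min_reports and reports / delivered >= self.min_ratio
```

Requiring both an absolute count and a ratio is one way to keep a handful of users from blacklisting a large legitimate mailing list, which is the concern raised in the replies.
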
    
    - Re: (Score:1)
      
      by iminplaya ( 723125 ) writes:
      
      Doesn't this lead to the possibility that a group of users could mark a legitimate sender as a spammer? I think this is an old question, but I don't remember the answer. And if it is possible, how do you defend against it?
    - I wonder how they deal with pseudo-spam (Score:2)
      
      by grahamsz ( 150076 ) writes:
      
      I know I've removed myself from a few mailing lists by simply having gmail count them as spam.
      
      These aren't really spam, they are companies that I did business with once and can't be bothered to find my username and password to change my email subscription settings. But gmail seems to happily block everything else from that sender without my interaction.
      
      Surely other users do want these particular emails so there must be some kind of per user dynamic as well.
    - Re: (Score:2)
      
      by asninn ( 1071320 ) writes:
      
      Thus only a dozen out of the million recipients were ever bothered by the spam. Conversely, an email list would receive no (or very few) "mark as spam" clicks, and would be allowed to pass. So basically the Gmail userbase acts as the workforce to continually train the spam filter, and moreover to detect new spam within minutes of it being sent.
      This probably plays a role, but it will not be the only thing GMail relies on (and probably not even the most important factor), and it will likely require more than
    - Re: (Score:2)
      
      by Matt Perry ( 793115 ) writes:
      
      I have received a spam to my gmail account exactly once.
      I wish my Gmail account was like that. Maybe you're new to Gmail. I get several spams in my inbox per week. Mostly these are spam messages in Russian and Chinese but I still get a lot of spam in English as well. I always use the button to mark them as spam, but Gmail doesn't seem to get the message that I don't want anything written in Russian. It's also disappointing that I can't create a filter to mark messages as spam. The best I can do is cat
- Re: (Score:3, Informative)
  
  by 0100010001010011 ( 652467 ) writes:
  
  Set up a catchall on your domain. You'll start getting stuff through. Especially the image ones. Some of the newer "make it look like a real e-mail" ones get through.
  
  Every website I have gets its own e-mail account, e.g. slashdot@myhost.com.
  One day I started getting spam to site@myhost.com. So I set up Dreamhost to bounce everything sent to that e-mail address.
  
  Then I started getting flooded with:
  otehoenut-site@myhost.com
  cgjwbmkh-site@myhost.com
  
  Google has, thankfully, let me delete *site@myhost.com, but for a
- Gmail's filtering is not that great (Score:2)
  
  by winkydink ( 650484 ) * writes:
  
  Try slutting your address around a bit. Mine is only publicly readable here on /. and I get plenty of spam in my gmail inbox. Yahoo seems to do a better job based on my experience.
  - Re: (Score:2)
    
    by jfengel ( 409917 ) writes:
    
    Huh. I'm using GMail to host my domain. My email addresses are pretty slutty (a combination of supporting the catchall, some public "info@" addresses that get forwarded to me, and a few mailing lists with lousy privacy or security policies.)
    
    I do see perhaps three spams a day that actually make it into the inbox, and about 300 or so that are shunted to the spam folder.
    
    There may be false positives in there, but with 300 per day I'm not going to find out. I've never noticed one in there, or had a friend tel
- Re: (Score:2)
  
  by hpavc ( 129350 ) writes:
  
  The Google Gmail newsgroup says otherwise for many other people; for me, the filtering seems practically non-existent.
  - Re: (Score:2)
    
    by martin-boundary ( 547041 ) writes:
    
    That's not surprising. It's mathematically impossible for a single filter to classify emails correctly for a large group of people, because any large group is inconsistent. Someone believes X is spam, but another one truly believes X is not spam. Whatever the filter does, it's going to be wrong on one group of people. You're part of that crowd of Gmail's users.
    You'll be much better off with a personal filter, that learns what you like, not what the majority of Gmail users like.
- Re: (Score:2)
  
  by thePowerOfGrayskull ( 905905 ) writes:
  
  is on whatever Gmail uses. I've not yet seen a spam message in my inbox, nor have I missed any mail, even from auto-mailing scripts at websites I'm building...
  I will agree that it's great for spam; but when it comes to 419 emails, it sucks. Badly. I'm not sure how I got on the 419ers lists, but I get at least 10-12 of them a day, none of which are caught by gmail filters. On the other hand, the 50-60 regular spam emails are correctly filtered. If only I could perform regex filtering in gmail, I could catch the 419 emails myself very easily, as they all have very common attributes.
- Re: (Score:2)
  
  by gvc ( 167165 ) writes:
  
  You're welcome to use Gmail -- or any other filter you like, animal, vegetable, or mineral -- to participate in the Live Challenge.
- Re: (Score:2)
  
  by SL Baur ( 19540 ) writes:
  
  My bet would be on the gmail filter too. I've had my old xemacs.org email address (which has been harvested to death) forwarded through there for some months now. It's not perfect, but it still only lets through about as much spam as my old handcrafted .procmailrc did 8 or 9 years ago. Which is really good considering how much more spam there is today.
  
  If I could tell it to junk everything except text in certain languages it would work even better. It seems to miss a lot of Korean and Russian spam.
Sweeps (Score:3, Funny)

by cyphercell ( 843398 ) writes: on Wednesday April 11, 2007 @12:25PM (#18690725) Homepage Journal

This ought to be a sweeps week television spectacular.

I think I've seen people catching spam on tv, just not the kind you're talkin' 'bout. http://www.spam.com/ [spam.com]

- Re: (Score:1)
  
  by session_start ( 1086203 ) writes:
  
  The trick is to try to catch the spam in a net with such velocity that the spam "squishes" through the net to fall on the ground, leaving you with only valid "message" hidden amongst the spam.
My money (Score:2)

by TodMinuit ( 1026042 ) writes:

My money is on whoever rigs up an Amazon Mechanical Turk-based system fast enough.
- Re: (Score:2)
  
  by Afecks ( 899057 ) writes:
  
  My money is on whoever rigs up an Amazon Mechanical Turk-based system fast enough.
  
  Because you'd really want thousands of random people reading your emails looking for spam?
Damn. (Score:1)

by daeg ( 828071 ) writes:

Damn. I was hoping they'd be launching phone-book sized printed copies of spam at the contestants, complete with blood, with each week adding a few pounds. Add some half naked chicks and dudes (cater to multiple markets) dancing around, maybe some buckets of slime and you've got yourself a show worthy of running on Fox.
Curious: When urologists email each other... (Score:5, Interesting)

by dpbsmith ( 263124 ) writes: on Wednesday April 11, 2007 @12:33PM (#18690847) Homepage

... are they able to refer to Pfizer's brand name for sildenafil, Lilly's name for tadalafil, or Bayer's brand name for vardenafil without getting caught in the spam filters?

- Re:Curious: When urologists email each other... (Score:4, Informative)
  
  by kebes ( 861706 ) writes: on Wednesday April 11, 2007 @12:53PM (#18691185) Journal
  
  Suffice it to say that a doctor is likely to write an email like:
  
  "Ted, I just read the news about Viagra in the New England Journal of Medicine. Very interesting results, though the error bars are a bit large to draw any major conclusions just yet. What do you think?"
  
  Whereas a doctor rarely writes email like:
  
  "NoW ava ilable is generic V1AGRA at low price! Generic, quality, all low price now!"
  
  The point is that modern spam filters don't just look for "bad words" but consider relative word frequencies, the sender and receiver fields, word correlations, formatting elements, URLs, etc. Spam filters in your email client will be trained against email you typically send/receive, and so can be even more precise. Spammers of course try to make their emails include words so that they end up looking like real email, but if the filter is good enough, then the only way to get past it is to send an email that now lacks those critical spam elements (like the link you're supposed to click to buy the generic drug or whatever)...
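
As an illustration of the "relative word frequencies" point above, here is a tiny naive Bayes sketch in Python. It is a toy under stated assumptions, not the algorithm of any particular filter: the tokenizer, the add-one smoothing, and the class name are all choices made for this example, and real filters also weigh the headers, URLs, and formatting mentioned in the comment.

```python
import math
import re
from collections import Counter

TOKEN_RE = re.compile(r"[a-z0-9']+")


def tokenize(text: str) -> list:
    """Lowercase the text and split it into simple word tokens."""
    return TOKEN_RE.findall(text.lower())


class NaiveBayesFilter:
    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text: str, label: str) -> None:
        self.word_counts[label].update(tokenize(text))
        self.message_counts[label] += 1

    def spam_log_odds(self, text: str) -> float:
        """Log of P(spam | text) / P(ham | text), with add-one smoothing."""
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        v = len(vocab) or 1
        totals = {label: sum(c.values()) for label, c in self.word_counts.items()}
        score = math.log(
            (self.message_counts["spam"] + 1) / (self.message_counts["ham"] + 1)
        )
        for word in tokenize(text):
            p_spam = (self.word_counts["spam"][word] + 1) / (totals["spam"] + v)
            p_ham = (self.word_counts["ham"][word] + 1) / (totals["ham"] + v)
            score += math.log(p_spam / p_ham)
        return score

    def is_spam(self, text: str) -> bool:
        return self.spam_log_odds(text) > 0
```

Trained on the two example emails above, the word "Viagra" on its own pulls toward ham, because the only training message containing it in that spelling is the doctor's.
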
  
- That depends upon the method used. (Score:2)
  
  by khasim ( 1285 ) writes:
  
  Pure content scanning would probably trigger those ... unless you had previously manually approved similar messages.
  
  Other approaches use multiple tests such as checking whether the sending server's IP address is on a blacklist or whether any of the links in the message (should it contain links) were on blacklists.
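
The IP blacklist test mentioned above is usually done as a DNS lookup. A minimal Python sketch, assuming the common DNSBL convention of querying the reversed IPv4 octets under the list's zone; the zone name `dnsbl.example.org` used in the usage note is a placeholder, not a real list.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNSBL query name: reversed IPv4 octets followed by the list's zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip: str, zone: str) -> bool:
    """True if the query name resolves; any resolution failure is treated as not listed."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        return False
```

For example, `dnsbl_query_name("192.0.2.99", "dnsbl.example.org")` gives `99.2.0.192.dnsbl.example.org`. Treating lookup failures as "not listed" is a design choice that favors delivering mail when the list is unreachable.
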
- Re: (Score:2)
  
  by misleb ( 129952 ) writes:
  
  Only if they write things like:
  
  Hey, I just pre sc ribed V.1.4.G.R.A to a patient today.
  
  The monk said to the fox, why don't the squirrels to be or not to be, that is my answer. The fog was as thick as umbrellas in the wind thought the old maid.
- Re: (Score:1)
  
  by cgrayson ( 22160 ) * writes:
  
  See, er, listen to this hilarious Onion Radio News story from Feb. 8: Brilliant Scientist Trying To Get Word Out About Penis-Enlargement Breakthrough [theonion.com] (warning: page may auto-play audio).
- Re: (Score:2)
  
  by mutterc ( 828335 ) writes:
  
  Happened with a lame spam filter my company used to have. This was a year or so ago.
  
  I emailed my wife "can you stop by and pick up the Strattera and Effexor from the pharmacy?" once. Her reply, containing my message, got plonked by the filters.
- Re: (Score:2)
  
  by Atario ( 673917 ) writes:
  
  ... are they able to refer to Pfizer's brand name for sildenafil, Lilly's name for tadalafil, or Bayer's brand name for vardenafil without getting caught in the spam filters?
  
  I would hope they use the real names and not the brand names.
- Spamassassin scored -1.3 (Score:2)
  
  by Slashdot Parent ( 995749 ) writes:
  
  I thought your question was intriguing, so I composed the following message:
  Subject: Interesting phenomenon related to Viagra use
  
  Hi, Dr. Smith-
  
  I just wanted to write you to let you know that I really enjoyed the article you wrote in the New England Journal of Medicine about the side effects of Cialis, Viagra, and Levitra. It turns out a patient of mine experienced debilitating nausea while on Levitra, so I prescribed Viagra in its place, as you recommend. In addition, I thought you might be interested to know
I wish the contest was.... (Score:2, Interesting)

by ruffnsc ( 895839 ) writes:

physically catching the spammers! (your imagination can do the rest)
- The First Annual Greased Spammer Contest! (Score:5, Funny)
  
  by Penguinisto ( 415985 ) writes: on Wednesday April 11, 2007 @12:47PM (#18691101) Journal
  
  (cue Monster Truck Rally announcer guy voice...) THIS SATURDAY AT THE EXPO CENTER! The Best admins and the worst spammers come together in a throwdown-showdown-lowdown Greased Spammer Contest! We kidnap, strip, and grease down every known spammer we can find on Planet Earth! We bring 'em here, then we give our lucky mail server admins (as determined by lottery) a chance to catch 'em! The spammers will be released into a large pit, where the admins may use any method to catch and immobilize spammers (firearms and other projectile weapons are excluded). Points will be given for the number of spammers caught, the methods of capture, and the level of eye-rattling violence applied to each spammer after their capture! Watch as the winning admin gets to publicly execute the dreaded Sanford Wallace by any method that he or she can dream up! Any method at all! You'll buy a ticket for the whole seat, but you will only need the edge! Get your tickets at the Mondotix - DON'T MISS IT!(/voice)
  /P
  
  - Re: (Score:2)
    
    by NewbieV ( 568310 ) writes:
    
    You forgot to mention that it's being held on
    
    SUNDAY! SUNDAY! SUNDAY!
    
    Be There!
- Re: (Score:2)
  
  by HTH NE1 ( 675604 ) writes:
  
  I wish the contest was physically catching the spammers!
  Only as long as it is not catch-and-release.
Will SMTP server settings count as well? (Score:2)

by Penguinisto ( 415985 ) writes:

...or just the filter software/daemon performance/stats alone? There's lots you can do to the MTA itself to stop spam before it even has to be examined by the filters (mostly by monkeying w/ the SMTP session handling and timeouts).
It'd be interesting to see a solid setup that handles a combination of the two, then publish the results (yes, spammers can read those results/settings to try to foil the setup, but many settings would make it patently unprofitable for them to do so).
/P
- I can't tell from the write up. (Score:2)
  
  by khasim ( 1285 ) writes:
  
  But I doubt that they have a hundred thousand systems that they'll be using to send the test spam.
  
  A big part of the system I use at work is based upon IP addresses and rDNS. I block a HUGE amount of spam just by rejecting all connections from Comcast that aren't from their SMTP servers.
  
  I know, some people want to run SMTP servers at home. But so far none of them have attempted to send email to my system.
  
  So it really depends upon how they configure the test spam servers. Personally, I don't see this as being
- Re: (Score:2)
  
  by gvc ( 167165 ) writes:
  
  Envelope information will be preserved, so you can determine the purported sender, multiple recipients, HELO IP, actual IP, etc. But you can't play interactive games with the SMTP protocol because the same email must be delivered to all participants.
- Re: (Score:2)
  
  by pe1chl ( 90186 ) writes:
  
  I agree. I filter the majority of spam by just doing strict RFC compliance testing in the SMTP engine. It rejects almost everything sent via botnets. What comes through is mostly 419 scamming, because that is sent via bona fide mailservers. But that is easily filtered with SpamAssassin.
- Re: (Score:2)
  
  by SCHecklerX ( 229973 ) writes:
  
  That's my plan (I want to see how well my stuff works without customizing it too much just for the contest). Let's hope more details arrive soon...
The prize list :) (Score:5, Funny)

by davidwr ( 791652 ) writes: on Wednesday April 11, 2007 @12:37PM (#18690895) Homepage Journal

1st prize: Job offer from a security-software vendor
2nd prize: Lifetime supply of Hormel meat products
3rd prize: Commemorative tin of SPAM meat product
Last place: Inheritance from Nigerian Prince

- Re: (Score:2)
  
  by LearnToSpell ( 694184 ) writes:
  
  2nd prize: Lifetime supply of Hormel meat products
  
  Which is about 4 1/2 days if that's all you eat.
that's easy. Yahoo mail! (Score:3, Funny)

by number6x ( 626555 ) writes: on Wednesday April 11, 2007 @12:38PM (#18690921)

Just open a Yahoo mail account, and start posting with the e-mail address all over the internet.
You'll catch more spam than anyone else!
Oh, you want me to filter out spam, not just get spam. Never mind.

Still, it might be the fastest way to build a database of spam.

- Re: (Score:2)
  
  by CrazyTalk ( 662055 ) writes:
  
  Actually that's not a bad idea - have a contest to see how much spam you can ATTRACT with a fresh email account in a given time period. My Verizon account would win hands down. (And to you spammers out there - no, my email address is NOT CrazyTalk@verizon.net)
  - Re: (Score:2)
    
    by Kozar_The_Malignant ( 738483 ) writes:
    
    Actually that's not a bad idea - have a contest to see how much spam you can ATTRACT with a fresh email account in a given time period. My Verizon account would win hands down. (And to you spammers out there - no, my email address is NOT CrazyTalk@verizon.net)
    
    The poor bastard who actually does have CrazyTalk@verizon.net is really, really pissed about now.
Professional spammers in attendance? (Score:5, Interesting)

by MobyDisk ( 75490 ) writes: on Wednesday April 11, 2007 @12:40PM (#18690957) Homepage

I wonder if professional spammers will attend the conference to learn how to get through the next generation of filters. Maybe it would be like playing spot the Fed at the hacker's conferences.

SpamAssassin? (Score:4, Interesting)

by raddan ( 519638 ) writes: on Wednesday April 11, 2007 @12:41PM (#18690993)

Ha ha, silly admin. My money's on greylisting [wikipedia.org].

We use both SpamAssassin and OpenBSD's spamd, to great effect. spamd does most of the work, though. Daniel Hartmeier [benzedrine.cx] (site down ATM, unfortunately) has an example of how to tie SA scores back into spamd for blacklisting, which is just awesome. I'd implement it here, but our current setup is effective enough as to not make it worth my time.
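
For readers unfamiliar with greylisting, here is a minimal Python sketch of the general technique: temporarily reject the first delivery attempt for an unseen (client IP, sender, recipient) triplet and accept a retry after a delay. It does not describe how OpenBSD's spamd is implemented; the class name, the 300-second default, and the reply strings are assumptions for illustration.

```python
class Greylist:
    """Temporarily rejects unseen (client IP, sender, recipient) triplets."""

    def __init__(self, min_delay: float = 300.0):
        self.min_delay = min_delay  # seconds a new triplet must wait before acceptance
        self.first_seen = {}        # triplet -> time of its first delivery attempt

    def check(self, client_ip: str, sender: str, recipient: str, now: float) -> str:
        """Return the SMTP reply for this attempt; `now` is a timestamp in seconds."""
        key = (client_ip, sender.lower(), recipient.lower())
        first = self.first_seen.setdefault(key, now)
        if now - first >= self.min_delay:
            return "250 OK"
        return "451 4.7.1 Greylisted, please try again later"
```

A standards-following mail server queues the message and retries after the temporary failure, so legitimate mail is delayed rather than lost, while senders that never retry are dropped without any content scanning.
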

- Greylisting no longer works (Score:1)
  
  by Tipa ( 881911 ) writes:
  
  Greylisting was designed on the single proposition that spam mailers wouldn't "call back" if they got a "call back later" code from the site they were spamming. And maybe that was true for awhile. In my last job I had to add spam filtering to our email and greylisting was one of the first things I tried.
  
  The spammers just kept trying until they got through.
  
  Spamming has evolved past greylisting and it is now worthless.
  
  Bayesian keyword filtering is decent, but is constantly attacked by images or hiding the spa
  - Re: (Score:2)
    
    by LurkerXXX ( 667952 ) writes:
    
    Graylisting is worthless? Umm, no.
    
    It's certainly not perfect, but it reduces the load on my spam-filter. A *lot*. More than 90% of SMTP connections don't make it through spamd here. I hardly call that worthless.
    
    Last year it was more like 99+%. Here's some stats from someone else last year: http://undeadly.org/cgi?action=article&sid=20060217105149 [undeadly.org]
  - Re: (Score:3, Interesting)
    
    by raddan ( 519638 ) writes:
    
    It doesn't work? Maybe you should tell that to my 300-strong userbase!
    
    I'm certain that there are differences in implementation between different greylisters. I've never tried Postfix's, for example, because OpenBSD's works fine for me. A small point with respect to OpenBSD's spamd: you actually need to try thrice. The first time you're rejected. The second time you're marked as OK, but still rejected. The third time you get through. Maybe it's the third time, or some of the time limits, or some other thing
    - Re: (Score:2)
      
      by Slashdot Parent ( 995749 ) writes:
      
      I made my own greylisting implementation because none of the ones I found did exactly what I wanted.
      
      Mine is time-based, not rejection count based. In other words, if your IP isn't whitelisted, I do some tests on your IP to see how long you have to wait to get through.
      
      First, I try to do a reverse DNS lookup on your IP. No result means I don't like your IP.
      
      Then, I look to see if I can find your IP address anywhere in the reverse-DNS result (indicating a dynamic IP). If I find it forwards or backwards, I do
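
The reverse-DNS check described above (the comment is cut off in the archive) might look roughly like the following Python sketch. Only the "IP appears forwards or backwards in the hostname" test comes from the comment; the function name, the choice of separators, and the example hostnames are hypothetical.

```python
import ipaddress


def looks_dynamic(ip: str, rdns_name: str) -> bool:
    """True if the IPv4 octets appear, forwards or backwards, in the reverse-DNS name."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    name = rdns_name.lower()
    for sep in (".", "-"):  # assumed separators, e.g. 203-0-113-7 or 7.113.0.203
        forwards = sep.join(octets)
        backwards = sep.join(reversed(octets))
        if forwards in name or backwards in name:
            return True
    return False
```

For example, `looks_dynamic("203.0.113.7", "c-203-0-113-7.hsd1.example.net")` is true, while a name such as `mail.example.net` is not flagged.
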
- Flawed (Score:3, Informative)
  
  by lazarus ( 2879 ) writes:
  
  "This ought to be a sweeps week television spectacular."
  This ought to be ignored as the contest is flawed.
  
  "Ha ha, silly admin. My money's on greylisting."
  They're sending a stream of spam from where? Sounds like a real mail server...
  
  From TFA: "Live email stream, delivered by standard protocols (SMTP, IMAP, POP)"
  [One wonders how else they would deliver e-mail if it was not from standard protocols. I also wonder how they plan on delivering e-mail using POP... The mind boggles...]
  
  In any case if I read this
  - Re:Flawed (Score:4, Interesting)
    
    by gvc ( 167165 ) writes: on Wednesday April 11, 2007 @02:30PM (#18692743)
    
    So here's the issue. If you are going to try to discriminate among filters using several thousand messages, you have to send them all the same messages. To send them the same messages you have to capture and redistribute them. You can pass on all the info from the capture, including all SMTP commands, but you can't do intrusive protocol probes. And since this is *real spam* you can't very well ask the sender to act in an obliging way by repeating its message and behavior for each participant.
    
    I'd be very interested to hear of a design that would allow greylisting to be tested. The best I can come up with is to fail the message after transmission, then to try to simulate the behavior of the sender in response to this failure. But that would be catering to one very specific method of perturbing the protocol. And it would be necessary to do a fair amount of work to spoof the IP address presented to the participant filters.
    
    For this reason, we chose to exclude all SMTP interactions, and simulate a second-in-the-chain filter appliance application. The reasons are practical, not policy.
    
    - Re: (Score:2)
      
      by lazarus ( 2879 ) writes:
      
      Gordon,
      
      Thanks for your response. I just sent your counterpart at IBM a lengthy probing e-mail about this which I can summarize as:
      
      1. Real stream or fake stream?
      2. Points for cost effectiveness?
      3. Points for scalable/redundant architecture?
      
      I applaud what you are doing and I wish you the best success (contests like this are good at stimulating inventiveness). I've been racking my brain trying to figure out how you could do this in a way that wouldn't be discriminatory. The best I could come up with woul
  - Re: (Score:2)
    
    by Thundersnatch ( 671481 ) writes:
    
    GENERAL JUNK E-MAIL FILTERING RANT (You've been warned): If you're using an anti-spam technique which takes more cpu cycles to execute than it takes for the spammer to send the damn spam in the first place, you've already lost this war. In other words, as long as it's costing you more than it is costing him/her you will always be on the losing end of the deal.
    Yes, the spammer will always win, since his CPU cycles and bandwidth are free. But those costs don't matter at all.
    Bayesian and other resource-inten
  - Re: (Score:2)
    
    by richi ( 74551 ) writes:
    
    Quite.
    
    Fair comparative testing of spam control technologies is extremely difficult -- by some measures, it's impossible. Because some promising filter techniques rely on examining the real-time behaviour of the sending machine, it proves tricky to provide the exact same stream of email to all the filters at the same time.
    
    For example, some filters attempt to fingerprint the sending machine's operating system -- the idea being that, say, a Windows 98 PC has no business submitting email direct-to-MX.
    
    M
  - Re: (Score:2)
    
    by raddan ( 519638 ) writes:
    
    OK, so for the purposes of this contest, which does not model the real world, greylisting does not work. So what's the purpose of the contest, then?
    
    Here's why greylisting will continue to work in the real world:
    
    1. If a spammer adopts RFC-compliant mailers, greylisting will prevent them from pumping out huge numbers of mails. They will have to burn CPU cycles on their end in order to push mail through. This increases the cost of sending mail, and reduces their margins since they will be hitting few
    - Re: (Score:2)
      
      by gvc ( 167165 ) writes:
      
      So what's the purpose of the contest, then?
      
      First, the contest will establish a baseline against which greylisting may be compared. It is much more difficult to measure false positive and false negative rates for intrusive techniques like greylisting and challenge-response. Too difficult to be done in an open competition. But the open competition can show what other techniques can do, and then there will be some onus on the greylisters and challenge-responders to show that their techniques really are a va
- Re: (Score:2)
  
  by Sentry21 ( 8183 ) writes:
  
  I'll second greylisting. I set up a new mail system on our mail server last year to replace our crufty and pathetic qmail installation. I started with RBLs in postfix and spamassassin/clamav via amavisd, and that was all well and good. A week after adding in greylisting, however, I took out spamassassin filtering by default (users can still enable it on a per-account basis). The reason? RBLs block out the most prolific hosts, and greylisting blocks the vast majority of everything else. The only mail that wa
- Re: (Score:2)
  
  by tacocat ( 527354 ) writes:
  
  I'm not that impressed with SpamAssassin. Too much overhead in trying to keep all the static filtering rules up to date. Eventually, it gets dumb.
  
  The best spam filters I've seen in terms of effectiveness are bogofilter and dspam. Both of these are extensions of Bayesian statistical filtering.
  bogofilter is awesome but it can't manage tokens from a database. Hence you can't have multiple machines very easily and users cannot share a database. Virtual hosting makes it harder and eventually you kind of
- Why Not Use Both? (Score:2)
  
  by Slashdot Parent ( 995749 ) writes:
  
  Ha ha, silly admin. My money's on greylisting.
  Why not use both?
  
  I use both, and I have to say that greylisting catches a metric boatload of spam. On the other hand, spammers have wised up and many are now retrying.
  
  Sure does take a lot of load off of spamassassin, though.
West Virginia (Score:1)

by ehaggis ( 879721 ) writes:

Back in West Virginia we'all used to go spam catchin' every weekend while they was in season! Them spam made good eatin'.
- Re: (Score:3, Funny)
  
  by UnknowingFool ( 672806 ) writes:
  
  Back in West Virginia we'all used to go spam catchin' every weekend while they was in season! Them spam made good eatin'.
  
  Don't lie. You and your buddies got drunk and would go spam tipping. There was no hunting involved.
My entry: Human computers (Score:1)

by davidwr ( 791652 ) writes:

I'm going to take a page from the Veruca Salt [wikipedia.org] needle-in-a-haystack problem and outsource this to a million peasants in India.

To pay for it I'll be spamming the world with my stock pump-and-dump scheme.

This just in: DAVI (OTC) NOW $0.02 TARGET $0.25!
New packaging? (Score:3, Funny)

by davmoo ( 63521 ) writes: on Wednesday April 11, 2007 @12:44PM (#18691045)

A torrent of spam? It doesn't come in cans anymore?!

The cans were so much easier to catch, too.

Spam Rage Rampage (Score:2)

by Dekortage ( 697532 ) writes:

A couple of years ago, I wrote a prototype for a video game called "Spam Rage Rampage" -- a first-person shooter where you roamed a Tron-like world, killing spam zombies and rescuing real people (== legitimate mail) while you searched for clues to the location of the nefarious spam kingpin, Ospama Bin Sendin. Each zombie represented a different class of spam... prostitute zombies for porn, business-suited zombies for stocks, pharmacist zombies for pill ads, etc.
Upon seeing a demo, one of my friends commen
Greylisting? (Score:3, Insightful)

by schmiddy ( 599730 ) writes: on Wednesday April 11, 2007 @12:49PM (#18691131) Homepage Journal

I can't help but wonder how realistic this scenario is. They're basically going to have a single server dumping a whole ton of spam at your filtering package, and you're supposed to be able to filter on... what, just the content of the messages? Real-world techniques use many more subtle hacks, such as greylisting, or actually looking at the domains the messages are coming from. If their server is going to be dumping millions of messages at you in a short amount of time, I don't think they'll let you use greylisting or similar techniques.
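
For anyone unfamiliar with the technique, here is a minimal Python sketch of greylisting, assuming the usual (client IP, envelope sender, recipient) triplet and a fixed retry delay. The class name `Greylist`, the 300-second delay, and the injectable `clock` are illustrative assumptions, not anything taken from the contest rules or from a particular mail server. What it shows is that the decision depends on the sender coming back later, which a one-shot bulk replay cannot reproduce.

```python
import time


class Greylist:
    """Toy greylist keyed on the (client IP, envelope sender, recipient) triplet."""

    def __init__(self, min_delay=300, clock=time.time):
        self.min_delay = min_delay  # seconds a new triplet must wait (assumed value)
        self.clock = clock          # injectable so the sketch can be tried without sleeping
        self.first_seen = {}        # triplet -> timestamp of the first delivery attempt

    def check(self, client_ip, mail_from, rcpt_to):
        """Return an SMTP-style reply code: 451 to defer, 250 to accept."""
        triplet = (client_ip, mail_from.lower(), rcpt_to.lower())
        now = self.clock()
        # Record the first attempt; later attempts reuse the stored timestamp.
        first = self.first_seen.setdefault(triplet, now)
        if now - first < self.min_delay:
            return 451  # temporary failure: a compliant sender will retry later
        return 250      # retried after the delay, so let it through
```

A first attempt gets 451 and only a retry after `min_delay` gets 250, so the filter needs a live sender that actually retries.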

- Re: (Score:1)
  
  by blhack ( 921171 ) writes:
  
  No. They give the nerds an email address, then reverse the web filter so that it ONLY allows them to go to porn sites.
  
  After a few minutes their email servers should reach critical mass.
- Re: (Score:2)
  
  by martin-boundary ( 547041 ) writes:
  
  Read the rules. You can use any technique you like, you're getting each message delivered to you in real time transparently as if you were hooked to the net yourself. If you need POP, you get it, if you need SMTP, you get it. You can use external RBLs if you like, you can use a commercial filter from work (just pipe the data you receive through the work filter and report the result, assuming you have permission of course) etc. Even greylisting shouldn't be an issue in principle.
  Just pretend you're an admi
Boring. (Score:3, Funny)

by bmo ( 77928 ) writes: on Wednesday April 11, 2007 @01:16PM (#18691501)

Couldn't we just have a contest where actual live spammers are fed to lions?

To quote Bill Mattocks...

"My sense of personal integrity is none of your concern."
-thus spake Walt "Pickle Jar" Rines
"I'm going to pound your balls flat with a wooden mallet."
-thus respondeth Bill Mattocks

- Re: (Score:2)
  
  by Anne Thwacks ( 531696 ) writes:
  
  Mod parent up +10 Wonderful Idea
Kobayashi Maru (Score:3, Funny)

by Kozar_The_Malignant ( 738483 ) writes: on Wednesday April 11, 2007 @01:40PM (#18691919)
Find a creative and unique solution (cheat):
- Hunt through CEAS conference hall
- Find contest spammers
- Drag spammers back to contest area
- Spammers are beaten to death by audience
- Win!!!
- ...Oh, wait, they weren't real spammers?
- Sorry
CEAS Call for Participation (Score:2)

by gvc ( 167165 ) writes:

Many of the questions asked here are answered in the Challenge Call for Participation [www.ceas.cc]

Or the overview talk [youtube.com] that Rich Segal gave at the MIT Spam Conference.

The guidelines are scheduled to be finalized May 1.
- Re: (Score:2)
  
  by SL Baur ( 19540 ) writes:
  
  Participants will compete in filtering a live 24-hour e-mail stream
  Looks like greylisting is acceptable.
  Simulated user-feedback will be provided to train learning-based filters.
  And it looks like gmail-type filters are acceptable.
  
  Good job guys. The results will be interesting to read.
On ESPN... (Score:2)

by vjmurphy ( 190266 ) writes:

"This ought to be a sweeps week television spectacular."

Is there an ESPN 6 or 7 cable channel? I'm thinking this is below Cheerleading and Dog Agility, but perhaps above Lumberjack competitions.
Isn't this already on TV? (Score:3, Funny)

by Minwee ( 522556 ) writes: <dcr@neverwhen.org> on Wednesday April 11, 2007 @01:43PM (#18691981) Homepage

"This ought to be a sweeps week television spectacular."
I think that it already is, but it's only on in Japan and uses real SPAM.

Visions of tennis ball machine gone.... (Score:1)

by zippoiii ( 887540 ) writes:

Sigh. And I had such hopes. Pictures of a team of people, with a spam and tennis ball loaded tennis ball launcher at the other end of a court. When something gets fired at you, determine if you should let the ball go by, or whack the spam from the air. Alas, it's not to be. Dan
I got a better idea (Score:2)

by Indy1 ( 99447 ) writes:

Issue hunting permits for the spammers themselves. Whoever wastes the most spammers, wins.

Evidence of wasted spammers can be in the form of complete heads, or ears.
From the not-from-a-dept dept. (Score:2)

by etherlad ( 410990 ) writes:

Relevant to nothing, but this is the first time I can remember seeing an article on /. without the requisite department tag in the story header.

Anyone want to try their hand at making up their own?
How to test against spam that isn't REAL spam? (Score:2)

by necro2607 ( 771790 ) writes:

Okay, here's the first question I have, and I'm sure many others wonder the same. How will spam be combatted when it's not real spam? For example, Spam Assassin checks actual mail server names and addresses to see if they are on known spammer lists and so on. Won't extremely useful/effective features like these be overridden by the fact that these spam emails are intentionally sent and won't be from any known spam-relaying mail servers??
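
As a rough illustration of the blocklist check mentioned above, this Python sketch builds the hostname a DNSBL client would query for an IPv4 address: the octets are reversed and the list's zone is appended. The function name `dnsbl_query_name` and the zone `dnsbl.example.org` are made up for the example; this is the general DNSBL convention, not SpamAssassin's actual code. Note that the lookup needs only an IP address, wherever that address is obtained from.

```python
import ipaddress


def dnsbl_query_name(ip, zone):
    """Build the hostname a DNSBL client would look up for an IPv4 address."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    # DNSBL convention: reverse the octets, then append the blocklist zone.
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


# Hypothetical zone, used purely for illustration.
print(dnsbl_query_name("192.0.2.99", "dnsbl.example.org"))
```

A real client would then resolve that name; a listed address typically resolves to something in 127.0.0.0/8, and an unlisted one does not resolve at all.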
- Re: (Score:2)
  
  by gvc ( 167165 ) writes:
  
  The mail messages will contain header information from which the sending IP may be derived. Of course, spammers try to forge this info, but the most recent header is guaranteed to be correct.
  - Re: (Score:2)
    
    by necro2607 ( 771790 ) writes:
    
    That's exactly what I'm saying. Since these "contest" spams will be from the contest organization (as in, not from actual spammers), I would imagine they won't have the headers that indicate the mails were from spam-relaying servers out there on the net. So how are contestants supposed to use filtering-based-on-host-IP measures in their spam filtering application??
    - Re: (Score:2)
      
      by gvc ( 167165 ) writes:
      
      I don't think you understood the parent. The messages are from actual spammers, not from the contest organization. The spams are merely relayed, and they are relayed accurately.
Just dump unsolicited email with URLs in them. (Score:2)

by iamcf13 ( 736250 ) writes:

Problem solved.

Now get people and free email services like Hotmail and Gmail to turn off their URL signatures in the bottom of their outgoing emails and you will stamp the spam email menace out in one bold stroke.

Moves the spam back to USENET, which is already spammed out.... :P

If people you don't know want to start a meaningful email conversation with you, they WON'T try to get you to visit the URL of some 'paysite' contained in their email.

Then something has to be done about spammers bouncing their
Error rate (false positives) isn't the whole story (Score:3, Insightful)

by InakaBoyJoe ( 687694 ) writes: on Wednesday April 11, 2007 @10:04PM (#18697615)

From TFCFP (call for participation):
Filters will be evaluated based on a weighted combination of the percentage of spam blocked and its false positive percentage.
From a theoretical standpoint, a low false positive average over an entire set (like <1%) might seem okay, but that doesn't take into account what's important to users.
Take, for example, a message from a long-lost friend, whose current address isn't yet in your whitelist, and who would have no other way of contacting you should the message get spamboxed. Here's an example of a message that's important to a user but gets lost among the everyday messages when simply talking about the percentage of false positives.
There are lots of other examples, too -- if you run your own domain, your messages are likely to be spamboxed, etc. Furthermore, the lower the false-positive rate, the less likely a user is to actually *check* their spambox, thus making a single false-positive even worse.
Microsoft's own Hotmail, of course, is notorious for spamboxing messages like that. And yet the conference is being held at Microsoft, and Microsoft's own spam researchers proudly touted their system in the February 2007 Communications of the ACM [acm.org].
Something tells me the leaders in the field are sort of missing the point. Simply bringing down the aggregate false positive rate is *not* enough. The measure needs to take into account how often the user actually misses information that's important to them.
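
To make the objection concrete, here is a small Python sketch of one possible weighted score. The CFP quoted above only says "weighted combination"; the formula, the function name `filter_score`, and the `fp_weight` of 10 are assumptions for illustration, not the contest's actual metric.

```python
def filter_score(spam_total, spam_blocked, ham_total, ham_blocked, fp_weight=10.0):
    """Hypothetical score: spam caught (%) minus a weighted false positive rate (%)."""
    spam_caught_pct = 100.0 * spam_blocked / spam_total
    false_positive_pct = 100.0 * ham_blocked / ham_total
    # The weight is an assumption; it only expresses that lost ham costs more than missed spam.
    return spam_caught_pct - fp_weight * false_positive_pct


# Blocks 990 of 1000 spam and wrongly blocks 2 of 1000 legitimate messages.
print(filter_score(1000, 990, 1000, 2))
```

Two filters that each lose 2 legitimate messages out of 1000 score the same here, whether those two were newsletters or the only mail from a long-lost friend, which is exactly the information an aggregate rate throws away.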

- Re:Error rate (false positives) isn't the whole st (Score:2)
  
  by gvc ( 167165 ) writes:
  
  a low false positive average over an entire set (like <1%) might seem okay, but that doesn't take into account what's important to users.
  
  A 1% false positive rate is not OK. The good systems will misclassify at most a couple of good emails per thousand, and the vast majority of those will lie in the grey area between ham and spam. A few will be internet transactions -- sign-up messages, receipts, and the like -- and a vanishingly small number will be personal communications.
  Microsoft's own Hotmail, of
