Forgot your password?

Live spam-catching contest at CEAS 126

Posted by CmdrTaco
noodleburglar writes "The 2007 Conference on Email and Anti-Spam (CEAS) will feature a live spam-catching contest. Entrants will be treated to a torrent of spam and must use their spam filtering technique to filter out as much as possible, while also letting legitimate messages. My money's on Spam Assassin." This ought to be a sweeps week television spectacular.
This discussion has been archived. No new comments can be posted.

Live spam-catching contest at CEAS

Comments Filter:
  • by dpbsmith (263124) on Wednesday April 11, 2007 @12:33PM (#18690847) Homepage
    ... are they able to refer to Pfizer's brand name for sildenafil, Lilly's name for tadalafil, or Bayer's brand name for vardenafil without getting caught in the spam filters?
  • by ruffnsc (895839) on Wednesday April 11, 2007 @12:35PM (#18690857) Journal
    physically catching the spammers! (your imagination can do the rest)
  • by MobyDisk (75490) on Wednesday April 11, 2007 @12:40PM (#18690957) Homepage
    I wonder if professional spammers will attend the conference to learn how to get through the next generation of filters. Maybe it would be like playing spot the Fed at the hacker's conferences.
  • SpamAssassin? (Score:4, Interesting)

    by raddan (519638) on Wednesday April 11, 2007 @12:41PM (#18690993)
    Ha ha, silly admin. My money's on greylisting [].

    We use both SpamAssassin and OpenBSD's spamd, to great effect. spamd does most of the work, though. Daniel Hartmeier [] (site down ATM, unfortunately) has an example of how to tie SA scores back into spamd for blacklisting, which is just awesome. I'd implement it here, but our current setup is effective enough as to not make it worth my time.
  • by kebes (861706) on Wednesday April 11, 2007 @12:43PM (#18691017) Journal
    You're right--but the size of Gmail gives them another advantage. In those marginal cases where the spam filter isn't sure about an email (is this spam or a mailing list?) it has the advantage of having a huge number of people checking all the emails. That is, the users do the final check.

    I have received a spam to my gmail account exactly once. And when I did, shocked, I clicked the "mark as spam" button. The point is that this spam was probably sent to millions of Gmail users, and the algorithm wasn't sure how to categorize it. But because I clicked "spam" (and probably a few other people did, too), it was marked as spam for everyone. So most users never say it in their inbox. Thus only a dozen out of the million recipients was ever bothered by the spam. Conversely, an email list would receive no (or very few) "mark as spam" clicks, and would be allowed to pass. So basically the Gmail userbase acts the workforce to continually train the spam filter, and moreover to detect new spam within minutes of it being sent.

    It's hard to beat a system like that. But the point is that it relies on the large number of users who are all (effectively) sharing their spam training sets with each other in realtime.

    This is not to say that the baseline algorithm that Gmail implements isn't quite effective, but the point is that Gmail can use the users to resolve those tricky false-positive and false-negative situations.
  • by raddan (519638) on Wednesday April 11, 2007 @01:44PM (#18691993)
    It doesn't work? Maybe you should tell that to my 300-strong userbase!

    I'm certain that there are differences in implementation between different greylisters. I've never tried Postfix's, for example, because OpenBSD's works fine for me. A small point wrt to OpenBSD's spamd: you actually need to try thrice. The first time you're rejected. The second time you're marked as OK, but still rejected. The third time you get through. Maybe it's the third time, or some of the time limits, or some other things that spamd is doing (BTW, we do not use *any* blacklists), but it works great. I probably see a spam in my inbox once a month, maybe. The rest of my users who complain about the "spam" they're still getting are really getting email they've signed up for (listservs aren't spam, people!), in which case, it's usually just a simple matter of education.

    I don't know where your greylisting system failed, but it works wonders for us. When I implemented it, I was a sysadmin rock star for a week. Who knew there were anti-spam groupies? Now it's back to picking the crud out of the VP's keybord ;^)

    (You're spot-on about one thing though: defense in depth. That principle is in effect for EVERYTHING, which is why I want to administer electric shocks to our Mac users when they try to call the Help Desk.)
  • Re:Flawed (Score:4, Interesting)

    by gvc (167165) on Wednesday April 11, 2007 @02:30PM (#18692743)
    So here's the issue. If you are going to try to discriminate among filters using several thousand messages, you have to send them all the same messages. To send them the same messages you have to capture and redistribute them. You can pass on all the info from the capture, including all SMTP commands, but you can't do intrusive protocol probes. And since this is *real spam* you can't very well ask the sender to act in an obliging way by repeating its message and behavior for each participant.

    I'd be very interested to hear of a design that would allow greylisting to be tested. The best I can come up with is to fail the message after transmission, then to try to simulate the behavior of the sender in response to this failure. But that would be catering to one very specific method of perturbing the protocol. And it would be necessary to do a fair amount of work to spoof the IP address presented to the participant filters.

    For this reason, we chose to exclude all SMTP interactions, and simulate a second-in-the-chain filter appliance application. The reasons are practical, not policy.