Live spam-catching contest at CEAS

noodleburglar writes "The 2007 Conference on Email and Anti-Spam (CEAS) will feature a live spam-catching contest. Entrants will be treated to a torrent of spam and must use their spam filtering technique to filter out as much as possible, while also letting legitimate messages through. My money's on Spam Assassin." This ought to be a sweeps week television spectacular.
This discussion has been archived. No new comments can be posted.

  • CRM114 (Score:4, Informative)

    by sageFool ( 36961 ) on Wednesday April 11, 2007 @12:22PM (#18690681) Homepage [] using hyperspace! It's been working better than spam assassin for me.
    • Re: (Score:1, Funny)

      by Anonymous Coward
      Unlike many other "filters", CRM114's default action is to read all of input, and put NOTHING onto output.

      This is either:

      1) "automatic" white-listing?
      2) Not healthy and you should eat more fibre.
  • is on whatever Gmail uses. I've not yet seen a spam message in my inbox, nor have I missed any mail, even from auto-mailing scripts at websites I'm building...
    • Re: (Score:3, Funny)

      by rodney dill ( 631059 )
      Well let's just find out, just what is your gmail address, hmmmm?

    • Group spam detection (Score:5, Informative)

      by Animats ( 122034 ) on Wednesday April 11, 2007 @12:32PM (#18690811) Homepage

      Gmail, like SpamCop, has a group spam filter system. It looks at mail sent to a large number of recipients. The defining characteristic of spam is that it's sent to a large number of recipients, after all. If you're in a position to watch the incoming mail of a few million mailboxes, detecting spam is easy.
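A minimal sketch of the group-detection idea described above, not how Gmail or SpamCop actually do it. The normalization step, the `BulkDetector` class, and the threshold of 3 are assumptions made up for illustration: a message is flagged once the same (normalized) body has been seen in enough distinct mailboxes.

```python
import hashlib
from collections import defaultdict


def fingerprint(body: str) -> str:
    """Hash a case- and whitespace-normalized body (hypothetical normalization)."""
    normalized = " ".join(body.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class BulkDetector:
    """Flags a body once it has reached `threshold` distinct recipients."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold
        self.recipients = defaultdict(set)  # fingerprint -> set of mailboxes

    def observe(self, recipient: str, body: str) -> bool:
        seen = self.recipients[fingerprint(body)]
        seen.add(recipient)
        return len(seen) >= self.threshold


detector = BulkDetector(threshold=3)
flags = [
    detector.observe(f"user{i}@example.com", "Buy  NOW, cheap pills!")
    for i in range(4)
]
print(flags)
```

An exact hash like this is defeated by any per-message variation, and it also flags legitimate mailing lists, so it could only ever be one signal among several.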

      • Yeah- I'm waiting to see algorithmically generated spam where no two messages are alike. Bleh! That being said gmail does a tremendous job of letting through legitimate messages (which is no doubt the hardest part of making a spam filter these days).
        • by Animats ( 122034 )

          Yeah- I'm waiting to see algorithmically generated spam where no two messages are alike.

          We've had that for years. The latest variant is in those Viagra spams with a faint pattern of background noise in the images, different for each spam.

      • by kebes ( 861706 ) on Wednesday April 11, 2007 @12:43PM (#18691017) Journal
        You're right--but the size of Gmail gives them another advantage. In those marginal cases where the spam filter isn't sure about an email (is this spam or a mailing list?) it has the advantage of having a huge number of people checking all the emails. That is, the users do the final check.

        I have received a spam to my gmail account exactly once. And when I did, shocked, I clicked the "mark as spam" button. The point is that this spam was probably sent to millions of Gmail users, and the algorithm wasn't sure how to categorize it. But because I clicked "spam" (and probably a few other people did, too), it was marked as spam for everyone. So most users never saw it in their inbox. Thus only a dozen out of the million recipients were ever bothered by the spam. Conversely, an email list would receive no (or very few) "mark as spam" clicks, and would be allowed to pass. So basically the Gmail userbase acts as the workforce to continually train the spam filter, and moreover to detect new spam within minutes of it being sent.

        It's hard to beat a system like that. But the point is that it relies on the large number of users who are all (effectively) sharing their spam training sets with each other in realtime.

        This is not to say that the baseline algorithm that Gmail implements isn't quite effective, but the point is that Gmail can use the users to resolve those tricky false-positive and false-negative situations.
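The feedback loop described above can be sketched roughly as follows. This is an illustration only, not Gmail's algorithm: the `CrowdVotes` class and both thresholds (`min_reports`, `min_ratio`) are hypothetical. A message is treated as spam for everyone once enough distinct users have reported it, relative to how many copies were delivered.

```python
from collections import defaultdict


class CrowdVotes:
    """Tracks deliveries and distinct 'mark as spam' reports per message."""

    def __init__(self, min_reports: int = 3, min_ratio: float = 0.01):
        self.min_reports = min_reports  # hypothetical absolute floor
        self.min_ratio = min_ratio      # hypothetical reports/deliveries floor
        self.delivered = defaultdict(int)
        self.reporters = defaultdict(set)

    def record_delivery(self, msg_id: str, count: int = 1) -> None:
        self.delivered[msg_id] += count

    def report_spam(self, msg_id: str, user: str) -> None:
        # A set means one user cannot vote twice on the same message.
        self.reporters[msg_id].add(user)

    def is_spam(self, msg_id: str) -> bool:
        delivered = self.delivered[msg_id]
        reports = len(self.reporters[msg_id])
        if delivered == 0 or reports < self.min_reports:
            return False
        return reports / delivered >= self.min_ratio


votes = CrowdVotes()
votes.record_delivery("pill-campaign", 1000)
for n in range(12):
    votes.report_spam("pill-campaign", f"user{n}")

votes.record_delivery("weekly-newsletter", 1000)
votes.report_spam("weekly-newsletter", "grumpy-user")

print(votes.is_spam("pill-campaign"), votes.is_spam("weekly-newsletter"))
```

The absolute floor is there so that a single annoyed subscriber cannot block a list for everyone; it does nothing against a coordinated group of reporters.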
        • Doesn't this lead to the possibility that a group of users could mark a legitimate sender as a spammer? I think this is an old question, but I don't remember the answer. And if it is possible, how do you defend against it?
        • I know I've removed myself from a few mailing lists by simply having gmail count them as spam.

          These aren't really spam, they are companies that I did business with once and can't be bothered to find my username and password to change my email subscription settings. But gmail seems to happily block everything else from that sender without my interaction.

          Surely other users do want these particular emails so there must be some kind of per user dynamic as well.
        • by asninn ( 1071320 )

          Thus only a dozen out of the million recipients were ever bothered by the spam. Conversely, an email list would receive no (or very few) "mark as spam" clicks, and would be allowed to pass. So basically the Gmail userbase acts as the workforce to continually train the spam filter, and moreover to detect new spam within minutes of it being sent.

          This probably plays a role, but it will not be the only thing GMail relies on (and probably not even the most important factor), and it will likely require more than

        • I have received a spam to my gmail account exactly once.

          I wish my Gmail account was like that. Maybe you're new to Gmail. I get several spams in my inbox per week. Mostly these are spam messages in Russian and Chinese but I still get a lot of spam in English as well. I always use the button to mark them as spam, but Gmail doesn't seem to get the message that I don't want anything written in Russian. It's also disappointing that I can't create a filter to mark messages as spam. The best I can do is cat

    • Re: (Score:3, Informative)

      Set up a catchall on your domain. You'll start getting stuff through. Especially the image ones. Some of the newer "make it look like a real e-mail" spam gets through.

      Every website I have gets its own e-mail account, e.g.
      One day I started getting spam to So I setup in dreamhost to bounce everything to that e-mail address.

      Then I started getting flooded with:

      Google has, thankfully, let me do delete of *, but for a
    • Try slutting your address around a bit. Mine is only publicly readable here on /. and I get plenty of spam in my gmail inbox. Yahoo seems to do a better job based on my experience.
      • by jfengel ( 409917 )
        Huh. I'm using GMail to host my domain. My email addresses are pretty slutty (a combination of supporting the catchall, some public "info@" addresses that get forwarded to me, and a few mailing lists with lousy privacy or security policies.)

        I do see perhaps three spams a day that actually make it into the inbox, and about 300 or so that are shunted to the spam folder.

        There may be false positives in there, but with 300 per day I'm not going to find out. I've never noticed one in there, or had a friend tel
    • by hpavc ( 129350 )
      The Google Gmail newsgroup says otherwise for many other people; for me, the filtering seems practically non-existent.
      • That's not surprising. It's mathematically impossible for a single filter to classify emails correctly for a large group of people, because any large group is inconsistent. Someone believes X is spam, but another one truly believes X is not spam. Whatever the filter does, it's going to be wrong on one group of people. You're part of that crowd of Gmail's users.

        You'll be much better off with a personal filter, that learns what you like, not what the majority of Gmail users like.

    • is on whatever Gmail uses. I've not yet seen a spam message in my inbox, nor have I missed any mail, even from auto-mailing scripts at websites I'm building...

      I will agree that it's great for spam; but when it comes to 419 emails, it sucks. Badly. I'm not sure how I got on the 419ers lists, but I get at least 10-12 of them a day, none of which are caught by gmail filters. On the other hand, the 50-60 regular spam emails are correctly filtered. If only I could perform regex filtering in gmail, I could catch the 419 emails myself very easily, as they all have very common attributes.

    • by gvc ( 167165 )
      You're welcome to use Gmail -- or any other filter you like, animal, vegetable, or mineral -- to participate in the Live Challenge.
    • by SL Baur ( 19540 )
      My bet would be on the gmail filter too. I've had my old email address (which has been harvested to death) forwarded through there for some months now. It's not perfect, but it still only lets through about as much spam as my old handcrafted .procmailrc did 8 or 9 years ago. Which is really good considering how much more spam there is today.

      If I could tell it to junk everything except text in certain languages it would work even better. It seems to miss a lot of Korean and Russian spam.
  • Sweeps (Score:3, Funny)

    by cyphercell ( 843398 ) on Wednesday April 11, 2007 @12:25PM (#18690725) Homepage Journal

    This ought to be a sweeps week television spectacular.

    I think I've seen people catching spam on tv, just not the kind you're talkin' 'bout. []

    • The trick is to try to catch the spam in a net with such velocity that the spam "squishes" through the net to fall on the ground, leaving you with only valid "message" hidden amongst the spam.
  • My money is on whoever rigs up an Amazon Mechanical Turk-based system fast enough.
    • by Afecks ( 899057 )
      My money is on whoever rigs up an Amazon Mechanical Turk-based system fast enough.

      Because you'd really want thousands of random people reading your emails looking for spam?
  • by daeg ( 828071 )
    Damn. I was hoping they'd be launching phone-book sized printed copies of spam at the contestants, complete with blood, with each week adding a few pounds. Add some half naked chicks and dudes (cater to multiple markets) dancing around, maybe some buckets of slime and you've got yourself a show worthy of running on Fox.
  • by dpbsmith ( 263124 ) on Wednesday April 11, 2007 @12:33PM (#18690847) Homepage
    ... are they able to refer to Pfizer's brand name for sildenafil, Lilly's name for tadalafil, or Bayer's brand name for vardenafil without getting caught in the spam filters?
    • by kebes ( 861706 ) on Wednesday April 11, 2007 @12:53PM (#18691185) Journal
      Suffice it to say that a doctor is likely to write an email like:

      "Ted, I just read the news about Viagra in the New England Journal of Medicine. Very interesting results, though the error bars are a bit large to draw any major conclusions just yet. What do you think?"

      Whereas a doctor rarely writes email like:

      "NoW ava ilable is generic V1AGRA at low price! Generic, quality, all low price now!"

      The point is that modern spam filters don't just look for "bad words" but consider relative word frequencies, the sender and receiver fields, word correlations, formatting elements, URLs, etc. Spam filters in your email client will be trained against email you typically send/receive, and so can be even more precise. Spammers of course try to make their emails include words so that they end up looking like real email, but if the filter is good enough, then the only way to get past it is to send an email that now lacks those critical spam elements (like the link you're supposed to click to buy the generic drug or whatever)...
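To make the "relative word frequencies" point concrete, here is a toy naive Bayes scorer. It is a sketch under stated assumptions, not any particular filter's implementation: the four training messages are invented (loosely based on the two examples above), and real filters also use headers, URLs, and formatting features that are ignored here.

```python
import math
import re
from collections import Counter

TOKEN = re.compile(r"[a-z0-9']+")


def tokens(text: str) -> list:
    return TOKEN.findall(text.lower())


class NaiveBayes:
    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.docs = {"spam": 0, "ham": 0}

    def train(self, label: str, text: str) -> None:
        self.counts[label].update(tokens(text))
        self.docs[label] += 1

    def spam_log_odds(self, text: str) -> float:
        """Positive means 'more like spam', negative 'more like ham'."""
        vocab = set(self.counts["spam"]) | set(self.counts["ham"])
        totals = {k: sum(c.values()) for k, c in self.counts.items()}
        # Prior from message counts, with add-one smoothing.
        score = math.log((self.docs["spam"] + 1) / (self.docs["ham"] + 1))
        for tok in tokens(text):
            # Add-one smoothing so unseen words never give a zero probability.
            p_spam = (self.counts["spam"][tok] + 1) / (totals["spam"] + len(vocab) + 1)
            p_ham = (self.counts["ham"][tok] + 1) / (totals["ham"] + len(vocab) + 1)
            score += math.log(p_spam / p_ham)
        return score


nb = NaiveBayes()
nb.train("spam", "generic viagra at low price now")
nb.train("spam", "low price generic quality now")
nb.train("ham", "I read the news about viagra in the journal of medicine")
nb.train("ham", "the error bars are large what do you think")

doctor = nb.spam_log_odds("interesting viagra results in the journal")
pitch = nb.spam_log_odds("generic viagra low price")
print(doctor < 0, pitch > 0)
```

Both test messages contain "viagra", but the surrounding words pull the scores in opposite directions, which is the behaviour the comment describes.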
    • Pure content scanning would probably trigger those ... unless you had previously manually approved similar messages.

      Other approaches use multiple tests such as checking whether the sending server's IP address is on a blacklist or whether any of the links in the message (should it contain links) were on blacklists.
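For the IP blacklist test, DNS-based lists are conventionally queried by reversing the IPv4 octets and appending the list's zone; an answer means "listed". A minimal sketch, where `dnsbl.example.org` is a placeholder zone and not a real blacklist:

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the lookup name: 192.0.2.99 in zone Z becomes 99.2.0.192.Z."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip: str, zone: str) -> bool:
    """True if the zone returns an address record for this IP.

    Any resolver error is treated as 'not listed' here, which also
    hides network outages; a real deployment would distinguish the two.
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except OSError:
        return False


print(dnsbl_query_name("192.0.2.99", "dnsbl.example.org"))
```

Only `dnsbl_query_name` is exercised above; `is_listed` needs network access and a real zone name to be useful.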
    • by misleb ( 129952 )
      Only if they write things like:

      Hey, I just pre sc ribed V.1.4.G.R.A to a patient today.

      The monk said to the fox, why don't the squirrels to be or not to be, that is my answer. The fog was as thick as umbrellas in the wind thought the old maid.

    • by cgrayson ( 22160 ) *
      See, er, listen to this hilarious Onion Radio News story from Feb. 8: Brilliant Scientist Trying To Get Word Out About Penis-Enlargement Breakthrough [] (warning: page may auto-play audio).
    • by mutterc ( 828335 )

      Happened with a lame spam filter my company used to have. This was a year or so ago.

      I emailed my wife "can you stop by and pick up the Strattera and Effexor from the pharmacy?" once. Her reply, containing my message, got plonked by the filters.

    • by Atario ( 673917 )

      ... are they able to refer to Pfizer's brand name for sildenafil, Lilly's name for tadalafil, or Bayer's brand name for vardenafil without getting caught in the spam filters?
      I would hope they use the real names and not the brand names.
    • I thought your question was intriguing, so I composed the following message:
      Subject: Interesting phenomenon related to Viagra use

      Hi, Dr. Smith-

      I just wanted to write you to let you know that I really enjoyed the article you wrote in the New England Journal of Medicine about the side effects of Cialis, Viagra, and Levitra. It turns out a patient of mine experienced debilitating nausea while on Levitra, so I prescribed Viagra in its place, as you recommend.

      In addition, I thought you might be interested to know
  • physically catching the spammers! (your imagination can do the rest)
    • by Penguinisto ( 415985 ) on Wednesday April 11, 2007 @12:47PM (#18691101) Journal
      (cue Monster Truck Rally announcer guy voice...) THIS SATURDAY AT THE EXPO CENTER! The Best admins and the worst spammers come together in a throwdown-showdown-lowdown Greased Spammer Contest! We kidnap, strip, and grease down every known spammer we can find on Planet Earth! We bring 'em here, then we give our lucky mail server admins (as determined by lottery) a chance to catch 'em! The spammers will be released into a large pit, where the admins may use any method to catch and immobilize spammers (firearms and other projectile weapons are excluded). Points will be given for the number of spammers caught, the methods of capture, and the level of eye-rattling violence applied to each spammer after their capture! Watch as the winning admin gets to publicly execute the dreaded Sanford Wallace by any method that he or she can dream up! Any method at all! You'll buy a ticket for the whole seat, but you will only need the edge! Get your tickets at the Mondotix - DON'T MISS IT!(/voice)


    • by HTH NE1 ( 675604 )

      I wish the contest was physically catching the spammers!
      Only as long as it is not catch-and-release.
  • ...or just the filter software/daemon performance/stats alone? There's lots you can do to the MTA itself to stop spam before it even has to be examined by the filters (mostly by monkeying w/ the SMTP session handling and timeouts).

    It'd be interesting to see a solid setup that handles a combination of the two, then publish the results (yes, spammers can read those results/settings to try to foil the setup, but many settings would make it patently unprofitable for them to do so).


    • But I doubt that they have a hundred thousand systems that they'll be using to send the test spam.

      A big part of the system I use at work is based upon IP addresses and rDNS. I block a HUGE amount of spam just by rejecting all connections from Comcast that aren't from their SMTP servers.

      I know, some people want to run SMTP servers at home. But so far none of them have attempted to send email to my system.

      So it really depends upon how they configure the test spam servers. Personally, I don't see this as being
    • by gvc ( 167165 )
      Envelope information will be preserved, so you can determine the purported sender, multiple recipients, HELO IP, actual IP, etc. But you can't play interactive games with the SMTP protocol because the same email must be delivered to all participants.
    • by pe1chl ( 90186 )
      I agree. I filter the majority of spam by just doing strict RFC compliance testing in the SMTP engine. It rejects almost everything sent via botnets. What comes through is mostly 419 scamming, because that is sent via bona fide mailservers. But that is easily filtered with SpamAssassin.
    • That's my plan (I want to see how well my stuff works without customizing it too much just for the contest). Let's hope more details arrive soon...
  • by davidwr ( 791652 ) on Wednesday April 11, 2007 @12:37PM (#18690895) Homepage Journal
    1st prize: Job offer from a security-software vendor
    2nd prize: Lifetime supply of Hormel meat products
    3rd prize: Commemorative tin of SPAM meat product
    Last place: Inheritance from Nigerian Prince
  • by number6x ( 626555 ) on Wednesday April 11, 2007 @12:38PM (#18690921)

    Just open a yahoo mail account, and start posting with the e-mail address all over the internet.

    You'll catch more spam than anyone else!

    Oh, you want me to filter out spam, not just get spam, nevermind.

    Still, it might be the fastest way to build a database of spam.

    • Actually that's not a bad idea - have a contest to see how much spam you can ATTRACT with a fresh email account in a given time period. My Verizon account would win hands down. (And to you spammers out there - no, my email address is NOT
      • Actually that's not a bad idea - have a contest to see how much spam you can ATTRACT with a fresh email account in a given time period. My Verizon account would win hands down. (And to you spammers out there - no, my email address is NOT

        The poor bastard who actually does have is really, really pissed about now.

  • by MobyDisk ( 75490 ) on Wednesday April 11, 2007 @12:40PM (#18690957) Homepage
    I wonder if professional spammers will attend the conference to learn how to get through the next generation of filters. Maybe it would be like playing spot the Fed at the hacker's conferences.
  • SpamAssassin? (Score:4, Interesting)

    by raddan ( 519638 ) on Wednesday April 11, 2007 @12:41PM (#18690993)
    Ha ha, silly admin. My money's on greylisting [].

    We use both SpamAssassin and OpenBSD's spamd, to great effect. spamd does most of the work, though. Daniel Hartmeier [] (site down ATM, unfortunately) has an example of how to tie SA scores back into spamd for blacklisting, which is just awesome. I'd implement it here, but our current setup is effective enough as to not make it worth my time.
    • Greylisting was designed on the single proposition that spam mailers wouldn't "call back" if they got a "call back later" code from the site they were spamming. And maybe that was true for awhile. In my last job I had to add spam filtering to our email and greylisting was one of the first things I tried.

      The spammers just kept trying until they got through.

      Spamming has evolved past greylisting and it is now worthless.

      Bayesian keyword filtering is decent, but is constantly attacked by images or hiding the spa
      • Graylisting is worthless? Umm, no.

        It's certainly not perfect, but it reduces the load on my spam-filter. A *lot*. More than 90+% of smtp connections don't make it through spamd here. I hardly call that worthless.

        Last year it was more like 99+%. Here's some stats from someone else last year: 7105149 []

      • Re: (Score:3, Interesting)

        by raddan ( 519638 )
        It doesn't work? Maybe you should tell that to my 300-strong userbase!

        I'm certain that there are differences in implementation between different greylisters. I've never tried Postfix's, for example, because OpenBSD's works fine for me. A small point wrt to OpenBSD's spamd: you actually need to try thrice. The first time you're rejected. The second time you're marked as OK, but still rejected. The third time you get through. Maybe it's the third time, or some of the time limits, or some other thing
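For readers who have not met greylisting, here is a minimal sketch of the classic two-attempt form: temp-fail the first delivery attempt for an unknown (IP, sender, recipient) triplet and accept a retry after a minimum delay. It is not OpenBSD spamd's logic or anyone's implementation from this thread; the `Greylist` class and both timing defaults are assumptions chosen for illustration.

```python
class Greylist:
    """Returns an SMTP-style code: 451 = try again later, 250 = accept."""

    def __init__(self, min_delay: int = 300, max_age: int = 4 * 3600):
        self.min_delay = min_delay  # seconds a sender must wait before retrying
        self.max_age = max_age      # seconds after which an entry is forgotten
        self.first_seen = {}        # triplet -> time of first attempt

    def check(self, ip: str, sender: str, recipient: str, now: float) -> int:
        key = (ip, sender.lower(), recipient.lower())
        first = self.first_seen.get(key)
        if first is None or now - first > self.max_age:
            # Unknown or expired triplet: remember it and temp-fail.
            self.first_seen[key] = now
            return 451
        if now - first < self.min_delay:
            # Retried too quickly: still temp-fail.
            return 451
        return 250


g = Greylist()
triplet = ("192.0.2.10", "a@example.com", "b@example.org")
codes = [g.check(*triplet, now=t) for t in (0, 60, 600)]
print(codes)
```

A sender that never retries never gets past the first 451, which is the whole bet; as the posts above argue, that bet weakens once spammers retry like a normal mail server.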
        • I made my own greylisting implementation because none of the ones I found did exactly what I wanted.

          Mine is time-based, not rejection count based. In other words, if your IP isn't whitelisted, I do some tests on your IP to see how long you have to wait to get through.

          First, I try to do a reverse DNS lookup on your IP. No result means I don't like your IP.

          Then, I look to see if I can find your IP address anywhere in the reverse-DNS result (indicating a dynamic IP). If I find it forwards or backwards, I do
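The "IP address inside the reverse-DNS name" test can be sketched like this. It is a guess at the general heuristic, not the poster's code: the function name and the decision to accept any non-digit separator between octets are assumptions, and the hostnames are made-up examples.

```python
import re


def looks_dynamic(ip: str, rdns_name: str) -> bool:
    """True if the IPv4 octets appear in the hostname, forwards or backwards."""
    octets = ip.split(".")
    host = rdns_name.lower()
    for order in (octets, octets[::-1]):
        # Octets in sequence, separated by any run of non-digits ("-", ".", ...),
        # and not glued onto a longer number at either end.
        pattern = r"(?<!\d)" + r"\D+".join(re.escape(o) for o in order) + r"(?!\d)"
        if re.search(pattern, host):
            return True
    return False


print(looks_dynamic("203.0.113.45", "c-203-0-113-45.hsd1.example.net"))
print(looks_dynamic("203.0.113.45", "45.113.0.203.dyn.example.net"))
print(looks_dynamic("203.0.113.45", "mail.example.com"))
```

Hostnames that zero-pad or concatenate the octets (for instance `203000113045`) would slip past this pattern.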
    • Flawed (Score:3, Informative)

      by lazarus ( 2879 )
      "This ought to be a sweeps week television spectacular."
      This ought to be ignored as the contest is flawed.

      "Ha ha, silly admin. My money's on greylisting."
      They're sending a stream of spam from where? Sounds like a real mail server...

      From TFA: "Live email stream, delivered by standard protocols (SMTP, IMAP, POP)"
      [One wonders how else they would deliver e-mail if it was not from standard protocols. I also wonder how they plan on delivering e-mail using POP... The mind boggles...]

      In any case if I read this
      • Re:Flawed (Score:4, Interesting)

        by gvc ( 167165 ) on Wednesday April 11, 2007 @02:30PM (#18692743)
        So here's the issue. If you are going to try to discriminate among filters using several thousand messages, you have to send them all the same messages. To send them the same messages you have to capture and redistribute them. You can pass on all the info from the capture, including all SMTP commands, but you can't do intrusive protocol probes. And since this is *real spam* you can't very well ask the sender to act in an obliging way by repeating its message and behavior for each participant.

        I'd be very interested to hear of a design that would allow greylisting to be tested. The best I can come up with is to fail the message after transmission, then to try to simulate the behavior of the sender in response to this failure. But that would be catering to one very specific method of perturbing the protocol. And it would be necessary to do a fair amount of work to spoof the IP address presented to the participant filters.

        For this reason, we chose to exclude all SMTP interactions, and simulate a second-in-the-chain filter appliance application. The reasons are practical, not policy.
        • by lazarus ( 2879 )

          Thanks for your response. I just sent your counterpart at IBM a lengthy probing e-mail about this which I can summarize as:

          1. Real stream or fake stream?
          2. Points for cost effectiveness?
          3. Points for scalable/redundant architecture?

          I applaud what you are doing and I wish you the best success (contests like this are good at stimulating inventiveness). I've been racking my brain trying to figure out how you could do this in a way that wouldn't be discriminatory. The best I could come up with woul
      • GENERAL JUNK E-MAIL FILTERING RANT (You've been warned): If you're using an anti-spam technique which takes more cpu cycles to execute than it takes for the spammer to send the damn spam in the first place, you've already lost this war. In other words, as long as it's costing you more than it is costing him/her you will always be on the losing end of the deal.

        Yes, the spammer will always win, since his CPU cycles and bandwidth are free. But those costs don't matter at all.

        Bayesian and other resource-inten

      • by richi ( 74551 )

        Fair comparative testing of spam control technologies is extremely difficult -- by some measures, it's impossible. Because some promising filter techniques rely on examining the real-time behaviour of the sending machine, it proves tricky to provide the exact same stream of email to all the filters at the same time.

        For example, some filters attempt to fingerprint the sending machine's operating system -- the idea being that, say, a Windows 98 PC has no business submitting email direct-to-MX.

      • by raddan ( 519638 )
        OK, so for the purposes of this contest, which does not model the real world, greylisting does not work. So what's the purpose of the contest, then?

        Here's why greylisting will continue to work in the real world:

        1. If a spammer adopts RFC-compliant mailers, greylisting will prevent them from pumping out huge numbers of mails. They will have to burn CPU cycles on their end in order to push mail through. This increases the cost of sending mail, and reduces their margins since they will be hitting few
        • by gvc ( 167165 )
          So what's the purpose of the contest, then?

          First, the contest will establish a baseline against which greylisting may be compared. It is much more difficult to measure false positive and false negative rates for intrusive techniques like greylisting and challenge-response. Too difficult to be done in an open competition. But the open competition can show what other techniques can do, and then there will be some onus on the greylisters and challenge-responders to show that their techniques really are a va
    • by Sentry21 ( 8183 )
      I'll second greylisting. I set up a new mail system on our mail server last year to replace our crufty and pathetic qmail installation. I started with RBLs in postfix and spamassassin/clamav via amavisd, and that was all well and good. A week after adding in greylisting, however, I took out spamassassin filtering by default (users can still enable it on a per-account basis). The reason? RBLs block out the most prolific hosts, and greylisting blocks the vast majority of everything else. The only mail that wa
    • by tacocat ( 527354 )

      I'm not that impressed with SpamAssassin. Too much overhead in trying to keep all the static filtering rules up to date. Eventually, it gets dumb

      The best spam filters I've seen in terms of effectiveness are bogofilter and dspam. Both of these are extensions of the Bayes statistical filtering.

      bogofilter is awesome but it can't manage tokens from a database. Hence you can't have multiple machines very easily and users cannot share a database. Virtual hosting makes it harder and eventually you kind of

    • Ha ha, silly admin. My money's on greylisting.
      Why not use both?

      I use both, and I have to say that greylisting catches a metric boatload of spam. On the other hand, spammers have wised up and many are now retrying.

      Sure does take a lot of load off of spamassassin, though.
  • Back in West Virginia we'all used to go spam catchin' every weekend while they was in season! Them spam made good eatin'.
    • Re: (Score:3, Funny)

      Back in West Virginia we'all used to go spam catchin' every weekend while they was in season! Them spam made good eatin'.

      Don't lie. You and your buddies got drunk and would go spam tipping. There was no hunting involved.

  • I'm going to take a page from the Veruca Salt [] needle-in-a-haystack problem and outsource this to a million peasants in India.

    To pay for it I'll be spamming the world with my stock pump-and-dump scheme.

    This just in: DAVI (OTC) NOW $0.02 TARGET $0.25!
  • by davmoo ( 63521 ) on Wednesday April 11, 2007 @12:44PM (#18691045)
    A torrent of spam? It doesn't come in cans anymore?!

    The cans were so much easier to catch, too.
  • A couple of years ago, I wrote a prototype for a video game called "Spam Rage Rampage" -- a first-person shooter where you roamed a Tron-like world, killing spam zombies and rescuing real people (== legitimate mail) while you searched for clues to the location of the nefarious spam kingpin, Ospama Bin Sendin. Each zombie represented a different class of spam... prostitute zombies for porn, business-suited zombies for stocks, pharmacist zombies for pill ads, etc.

    Upon seeing a demo, one of my friends commen

  • Greylisting? (Score:3, Insightful)

    by schmiddy ( 599730 ) on Wednesday April 11, 2007 @12:49PM (#18691131) Homepage Journal
    I can't help but wonder how realistic this scenario is.. They're basically going to have a single server dumping a whole ton of spam at your filtering package, and you're supposed to be able to filter on.. what, just the content of the messages? Real world techniques use many more subtle hacks, such as greylisting, or actually looking at the domains the messages are coming from. If their server is going to be dumping millions of messages at you in a short amount of time, I don't think they'll let you use greylisting or similar techniques.
    • by blhack ( 921171 )
      No. they give the nerds of an email address, then reverse the web filter so that it ONLY allows them to go to porn sites.

      after a few minutes their email servers should reach critical mass.
    • Read the rules. You can use any technique you like, you're getting each message delivered to you in real time transparently as if you were hooked to the net yourself. If you need POP, you get it, if you need SMTP, you get it. You can use external RBLs if you like, you can use a commercial filter from work (just pipe the data you receive through the work filter and report the result, assuming you have permission of course) etc. Even greylisting shouldn't be an issue in principle.

      Just pretend you're an admi

  • Boring. (Score:3, Funny)

    by bmo ( 77928 ) on Wednesday April 11, 2007 @01:16PM (#18691501)
    Couldn't we just have a contest where actual live spammers are fed to lions?

    To quote Bill Mattocks...

    "My sense of personal integrity is none of your concern."
                                                    -thus spake Walt "Pickle Jar" Rines
    "I'm going to pound your balls flat with a wooden mallet."
                                                    -thus respondeth Bill Mattocks
  • by Kozar_The_Malignant ( 738483 ) on Wednesday April 11, 2007 @01:40PM (#18691919)

    Find a creative and unique solution (cheat):

    • Hunt through CEAS conference hall
    • Find contest spammers
    • Drag spammers back to contest area
    • Spammers are beaten to death by audience
    • Win!!!
    • ...Oh, wait, they weren't real spammers?
    • Sorry
  • Many of the questions asked here are answered in the Challenge Call for Participation []

    Or the overview talk [] that Rich Segal gave at the MIT Spam Conference.

    The guidelines are scheduled to be finalized May 1.
    • by SL Baur ( 19540 )

      Participants will compete in filtering a live 24-hour e-mail stream
      Looks like greylisting is acceptable.

      Simulated user-feedback will be provided to train learning-based filters.
      And it looks like gmail-type filters are acceptable.

      Good job guys. The results will be interesting to read.
  • "This ought to be a sweeps week television spectacular."

    Is there an ESPN 6 or 7 cable channel? I'm thinking this is below Cheerleading and Dog Agility, but perhaps above Lumberjack competitions.
  • by Minwee ( 522556 ) <> on Wednesday April 11, 2007 @01:43PM (#18691981) Homepage

    "This ought to be a sweeps week television spectacular."

    I think that it already is, but it's only on in Japan and uses real SPAM.

  • Sigh. And I had such hopes. Pictures of a team of people, with a spam and tennis ball loaded tennis ball launcher at the other end of a court. When something gets fired at you, determine if you should let the ball go by, or whack the spam from the air. Alas, it's not to be. Dan
  • Issue hunting permits for the spammers themselves. Whoever wastes the most spammers, wins.

    Evidence of wasted spammers can be in the form of complete heads, or ears.
  • Relevant to nothing, but this is the first time I can remember seeing an article on /. without the requisite department tag in the story header.

    Anyone want to try their hand at making up their own?
  • Okay, here's the first question I have, and I'm sure many others wonder the same. How will spam be combatted when it's not real spam? For example, Spam Assassin checks actual mail server names and addresses to see if they are on known spammer lists and so on. Won't extremely useful/effective features like these be overridden by the fact that these spam emails are intentionally sent and won't be from any known spam-relaying mail servers??
    • by gvc ( 167165 )
      The mail messages will contain header information from which the sending IP may be derived. Of course, spammers try to forge this info, but the most recent header is guaranteed to be correct.
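As a rough illustration of deriving the sending IP from headers: each relay prepends its own `Received:` line, so the topmost one is the most recent hop. The sketch below uses Python's standard `email` module; the sample message, hostnames, and addresses are invented, and the bracketed-IP pattern is an assumption about how the relaying server formats its header.

```python
import re
from email import message_from_string

RAW = """\
Received: from mail.example.net (mail.example.net [198.51.100.7])
\tby mx.contest.example with ESMTP; Wed, 11 Apr 2007 12:00:00 -0400
Received: from forged.example.com ([10.9.8.7])
\tby mail.example.net with SMTP; Wed, 11 Apr 2007 11:59:00 -0400
From: someone@example.com
Subject: test

body
"""


def last_hop_ip(raw: str):
    """Return the bracketed IPv4 address from the topmost Received header."""
    msg = message_from_string(raw)
    received = msg.get_all("Received") or []
    if not received:
        return None
    # Headers are returned in file order, so index 0 is the newest hop.
    match = re.search(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]", received[0])
    return match.group(1) if match else None


print(last_hop_ip(RAW))
```

Anything below the topmost header was written by machines the recipient does not control, which is why only the most recent hop can be trusted.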
      • That's exactly what I'm saying. Since these "contest" spams will be from the contest organization (as in, not from actual spammers), I would imagine they won't have the headers that indicate the mails were from spam-relaying servers out there on the net. So how are contestants supposed to use filtering-based-on-host-IP measures in their spam filtering application??
        • by gvc ( 167165 )
          I don't think you understood the parent. The messages are from actual spammers, not from the contest organization. The spams are merely relayed, and they are relayed accurately.
  • Problem solved.

    Now get people and free email services like Hotmail and Gmail to turn off their URL signatures in the bottom of their outgoing emails and you will stamp the spam email menace out in one bold stroke.

    Moves the spam back to USENET which is spammed-out already.... :P

    If people you don't know want to start a meaningful email conversation with you, they WON'T try to get you to visit the URL of some 'paysite' contained in their email.

    Then something has to be done about spammers bouncing their
  • by InakaBoyJoe ( 687694 ) on Wednesday April 11, 2007 @10:04PM (#18697615)

    From TFCFP (call for participation):
    Filters will be evaluated based on a weighted combination of the percentage of spam blocked and its false positive percentage.

    From a theoretical standpoint, a low false positive average over an entire set (like <1%) might seem okay, but that doesn't take into account what's important to users.

    Take, for example, a message from a long-lost friend, whose current address isn't yet in your whitelist, and who would have no other way of contacting you should the message get spamboxed. Here's an example of a message that's important to a user but gets lost among the everyday messages when simply talking about the percentage of false positives.

    There's lots of other examples, too -- if you run your own domain, your messages are likely to be spamboxed, etc. Furthermore, the lower the false-positive rate, the less likely a user is to actually *check* their spambox, thus making a single false-positive even worse.

    Microsoft's own Hotmail, of course, is notorious for spamboxing messages like that. And yet the conference is being held at Microsoft, and Microsoft's own spam researchers proudly touted their system in the February 2007 Communications of the ACM [].

    Something tells me the leaders in the field are sort of missing the point. Simply bringing down the aggregate false positive rate is *not* enough. The measure needs to take into account how often the user actually misses information that's important to them.
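For reference, the two aggregate numbers under discussion can be computed as below. The call for participation does not give the weighting in the text quoted here, so `fp_weight` and the linear formula in `weighted_score` are purely hypothetical stand-ins, and the tallies are invented.

```python
def rates(results):
    """results: iterable of (is_spam, was_blocked) pairs of booleans.

    Returns (percent of spam blocked, percent of good mail blocked).
    """
    results = list(results)
    spam = [blocked for is_spam, blocked in results if is_spam]
    ham = [blocked for is_spam, blocked in results if not is_spam]
    spam_blocked = 100.0 * sum(spam) / len(spam) if spam else 0.0
    false_pos = 100.0 * sum(ham) / len(ham) if ham else 0.0
    return spam_blocked, false_pos


def weighted_score(spam_blocked: float, false_pos: float, fp_weight: float = 10.0) -> float:
    """Hypothetical combination: each point of false positives costs fp_weight points."""
    return spam_blocked - fp_weight * false_pos


# Invented tallies: 100 spam (95 blocked) and 200 good messages (1 blocked).
sample = (
    [(True, True)] * 95 + [(True, False)] * 5
    + [(False, True)] * 1 + [(False, False)] * 199
)
spam_blocked, false_pos = rates(sample)
print(spam_blocked, false_pos, weighted_score(spam_blocked, false_pos))
```

Note that every false positive counts the same in this kind of aggregate, whether it is a receipt or a message from a long-lost friend, which is exactly the objection raised above.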

    • a low false positive average over an entire set (like <1%) might seem okay, but that doesn't take into account what's important to users.

      A 1% false positive rate is not OK. The good systems will misclassify at most a couple of good emails per thousand, and the vast majority of those will lie in the grey area between ham and spam. A few will be internet transactions -- sign-up messages, receipts, and the like -- and a vanishingly small number will be personal communications.

      Microsoft's own Hotmail, of

Testing can show the presence of bugs, but not their absence. -- Dijkstra