Live spam-catching contest at CEAS
noodleburglar writes "The 2007 Conference on Email and Anti-Spam (CEAS) will feature a live spam-catching contest. Entrants will be treated to a torrent of spam and must use their spam-filtering technique to filter out as much as possible, while also letting legitimate messages through. My money's on SpamAssassin." This ought to be a sweeps week television spectacular.
Agile and evolutionary versus ergodic spam (Score:3, Insightful)
To see why this matters, consider two hypothetical spam filters. One blocks 99% of the test-set spam but lets through a particular form of spam that makes up only 1% of the test set. Contrast this with an adaptive filter that, to avoid false positives, has to err on the side of letting through 20% of the spam (making it only 80% effective).
While the former would smoke the latter in a static trial, in the real world spammers would just shift to exclusively sending the kind of spam that gets through the first filter.
To make this a real contest they should make it adversarial: give the spam script a feedback signal on which spam is getting through, and let it adjust its mix of spam and chaff to try to maximize the rate at which it can push spam through (or bust the filter by chaffing to minimize the number of legitimate e-mails that survive).
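A minimal sketch of what that adversarial loop might look like. Everything here is invented for illustration -- the category names, the bandit-style weight update, and the toy filter -- not anything from the actual contest:

```python
import random

# Hypothetical spam categories the generator can mix between -- the
# names are made up for illustration.
CATEGORIES = ["pharma", "stock_pump", "phish", "image_spam"]

def adversarial_spammer(filter_fn, rounds=1000, step=0.1):
    """Shift the sending mix toward whatever the filter misses.

    filter_fn(category) -> True if a message of that category is blocked.
    Returns (messages delivered, final category weights).
    """
    weights = {c: 1.0 for c in CATEGORIES}
    delivered = 0
    for _ in range(rounds):
        # Sample a category proportionally to its current weight.
        r = random.uniform(0, sum(weights.values()))
        for cat in CATEGORIES:
            r -= weights[cat]
            if r <= 0:
                break
        if filter_fn(cat):
            weights[cat] = max(0.01, weights[cat] - step)  # blocked: back off
        else:
            weights[cat] += step  # delivered: exploit the hole in the filter
            delivered += 1
    return delivered, weights

random.seed(0)
# A static filter with a blind spot: it never catches image spam.
delivered, mix = adversarial_spammer(lambda c: c != "image_spam")
# The mix ends up heavily skewed toward the one unblocked category.
```

The point of the sketch is the feedback arrow: as soon as the spammer can observe which messages survive, a fixed rule set with one blind spot loses to an adaptive sender, no matter how well it did on a static corpus.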
Error rate (false positives) isn't the whole story (Score:3, Insightful)
From TFCFP (call for participation):
Filters will be evaluated based on a weighted combination of the percentage of spam blocked and its false positive percentage.
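The CFP doesn't spell out the weights, but a scoring function of that shape presumably looks something like this (the weight value here is purely an assumption for illustration):

```python
def contest_score(spam_blocked_pct, false_positive_pct, fp_weight=10.0):
    """Hypothetical weighted score: each false-positive percentage point
    costs fp_weight times as much as a point of missed spam. The real
    CEAS weights are not given in the call for participation."""
    return spam_blocked_pct - fp_weight * false_positive_pct

# A filter that blocks 99% of spam with 0.5% false positives...
a = contest_score(99.0, 0.5)   # 99.0 - 10 * 0.5 = 94.0
# ...beats one that blocks 99.9% of spam but with 1% false positives.
b = contest_score(99.9, 1.0)   # 99.9 - 10 * 1.0 = 89.9
```

With any weighting of that form, the whole contest hinges on the chosen fp_weight -- which is exactly where the "aggregate rate" complaint below comes in.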
From a theoretical standpoint, a low false-positive rate averaged over an entire test set (like <1%) might seem okay, but that doesn't take into account what's important to users.
Take, for example, a message from a long-lost friend whose current address isn't yet in your whitelist, and who would have no other way of contacting you should the message get spamboxed. That's a message that matters enormously to the user, but its loss disappears into the noise when you only talk about the overall percentage of false positives.
There are lots of other examples, too -- if you run your own domain, your messages are likely to be spamboxed, etc. Furthermore, the lower the false-positive rate, the less likely a user is to actually *check* their spambox, thus making a single false positive even worse.
Microsoft's own Hotmail, of course, is notorious for spamboxing messages like that. And yet the conference is being held at Microsoft, and Microsoft's own spam researchers proudly touted their system in the February 2007 Communications of the ACM [acm.org].
Something tells me the leaders in the field are sort of missing the point. Simply bringing down the aggregate false positive rate is *not* enough. The measure needs to take into account how often the user actually misses information that's important to them.
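One way to capture that: weight each false positive by how costly its loss is to the user, instead of counting them all equally. A toy sketch -- the importance values and the message tuples are invented for illustration, not any published metric:

```python
def importance_weighted_fp(messages):
    """messages: list of (was_flagged_as_spam, is_legit, importance).

    importance is a user-assigned cost in [0, 1]: a note from a
    long-lost friend might be near 1.0, a routine newsletter near 0.1.
    Returns the total cost of false positives, not just their count.
    """
    return sum(importance
               for flagged, legit, importance in messages
               if flagged and legit)

inbox = [
    (True,  True,  1.0),   # long-lost friend, spamboxed: catastrophic
    (True,  True,  0.1),   # newsletter, spamboxed: shrug
    (True,  False, 0.0),   # actual spam, correctly blocked
    (False, True,  0.9),   # important mail, delivered fine
]
# Two false positives either way, but the weighted cost shows that one
# of them mattered far more than a raw 2-out-of-4 rate would suggest.
```

Two filters with identical aggregate false-positive rates can score very differently under a measure like this, which is the whole point: the per-message stakes, not the average, are what the user experiences.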