Live spam-catching contest at CEAS
noodleburglar writes "The 2007 Conference on Email and Anti-Spam (CEAS) will feature a live spam-catching contest. Entrants will be treated to a torrent of spam and must use their spam-filtering technique to filter out as much as possible, while also letting legitimate messages through. My money's on SpamAssassin." This ought to be a sweeps week television spectacular.
Agile and evolutionary versus ergodic spam (Score:3, Insightful)
To see why this matters, consider two hypothetical spam filters. One blocks 99% of the test-set spam but lets through a particular form of spam that makes up only 1% of the test set. Contrast this with an adaptive filter that, to avoid false positives, has to err on the side of letting through 20% of the spam (making it only 80% effective).
While the former would smoke the latter in a static trial, in the real world spammers would just shift to exclusively sending the kind of spam that gets through the first filter.
To make this a real contest they should make it adversarial: give the spam script a feedback signal on which spam is getting through, and let it adjust its mix of spam and chaff to try to maximize the rate at which it can push spam through (or bust the filter by chaffing to minimize the number of legitimate e-mails that survive).
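A minimal sketch of what that adversarial loop might look like. Everything here is invented for illustration -- the category names, the bandit-style weight update, and the toy filter -- not anything from the actual contest:

```python
import random

# Hypothetical spam categories the generator can mix between -- the
# names are made up for illustration.
CATEGORIES = ["pharma", "stock_pump", "phish", "image_spam"]

def adversarial_spammer(filter_fn, rounds=1000, step=0.1):
    """Shift the sending mix toward whatever the filter misses.

    filter_fn(category) -> True if a message of that category is blocked.
    Returns (messages delivered, final category weights).
    """
    weights = {c: 1.0 for c in CATEGORIES}
    delivered = 0
    for _ in range(rounds):
        # Sample a category proportionally to its current weight.
        r = random.uniform(0, sum(weights.values()))
        for cat in CATEGORIES:
            r -= weights[cat]
            if r <= 0:
                break
        if filter_fn(cat):
            weights[cat] = max(0.01, weights[cat] - step)  # blocked: back off
        else:
            weights[cat] += step  # delivered: exploit the hole in the filter
            delivered += 1
    return delivered, weights

random.seed(0)
# A static filter with a blind spot: it never catches image spam.
delivered, mix = adversarial_spammer(lambda c: c != "image_spam")
# The mix ends up heavily skewed toward the one unblocked category.
```

The point of the sketch is the feedback arrow: as soon as the spammer can observe which messages survive, a fixed rule set with one blind spot loses to an adaptive sender, no matter how well it did on a static corpus.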
Error rate (false positives) isn't the whole story (Score:3, Insightful)
From TFCFP (call for participation):
Filters will be evaluated based on a weighted combination of the percentage of spam blocked and its false positive percentage.
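The CFP doesn't spell out the weights, but a scoring function of that shape presumably looks something like this (the weight value here is purely an assumption for illustration):

```python
def contest_score(spam_blocked_pct, false_positive_pct, fp_weight=10.0):
    """Hypothetical weighted score: each false-positive percentage point
    costs fp_weight times as much as a point of missed spam. The real
    CEAS weights are not given in the call for participation."""
    return spam_blocked_pct - fp_weight * false_positive_pct

# A filter that blocks 99% of spam with 0.5% false positives...
a = contest_score(99.0, 0.5)   # 99.0 - 10 * 0.5 = 94.0
# ...beats one that blocks 99.9% of spam but with 1% false positives.
b = contest_score(99.9, 1.0)   # 99.9 - 10 * 1.0 = 89.9
```

With any weighting of that form, the whole contest hinges on the chosen fp_weight -- which is exactly where the "aggregate rate" complaint below comes in.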
From a theoretical standpoint, a low false-positive rate averaged over an entire test set (like <1%) might seem okay, but that doesn't take into account what's important to users.
Take, for example, a message from a long-lost friend whose current address isn't yet in your whitelist, and who would have no other way of contacting you should the message get spamboxed. That's a message that matters enormously to the user, but its loss disappears into the noise when you only talk about the overall percentage of false positives.
There are lots of other examples, too -- if you run your own domain, your messages are likely to be spamboxed, etc. Furthermore, the lower the false-positive rate, the less likely a user is to actually *check* their spambox, thus making a single false positive even worse.
Microsoft's own Hotmail, of course, is notorious for spamboxing messages like that. And yet the conference is being held at Microsoft, and Microsoft's own spam researchers proudly touted their system in the February 2007 Communications of the ACM [acm.org].
Something tells me the leaders in the field are sort of missing the point. Simply bringing down the aggregate false positive rate is *not* enough. The measure needs to take into account how often the user actually misses information that's important to them.
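One way to capture that: weight each false positive by how costly its loss is to the user, instead of counting them all equally. A toy sketch -- the importance values and the message tuples are invented for illustration, not any published metric:

```python
def importance_weighted_fp(messages):
    """messages: list of (was_flagged_as_spam, is_legit, importance).

    importance is a user-assigned cost in [0, 1]: a note from a
    long-lost friend might be near 1.0, a routine newsletter near 0.1.
    Returns the total cost of false positives, not just their count.
    """
    return sum(importance
               for flagged, legit, importance in messages
               if flagged and legit)

inbox = [
    (True,  True,  1.0),   # long-lost friend, spamboxed: catastrophic
    (True,  True,  0.1),   # newsletter, spamboxed: shrug
    (True,  False, 0.0),   # actual spam, correctly blocked
    (False, True,  0.9),   # important mail, delivered fine
]
# Two false positives either way, but the weighted cost shows that one
# of them mattered far more than a raw 2-out-of-4 rate would suggest.
```

Two filters with identical aggregate false-positive rates can score very differently under a measure like this, which is the whole point: the per-message stakes, not the average, are what the user experiences.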