Spam Trap Claims 10x-100x Accuracy Gain

SpiritGod21 writes in with a NYTimes article on a new approach to spam detection that claims out-of-the-box improvement of 1 or 2 orders of magnitude over existing approaches. The article wanders off into human-interest territory as the inventor, Steven T. Kirsch, has an incurable disease and an engineer's approach to fighting it. But a description of the anti-spam tech, based on the reputation of the receiver and not the sender, is worth a read.

  • Ummmm.... (Score:3, Insightful)

    by rustalot42684 ( 1055008 ) <fake@acDEGAScount.com minus painter> on Monday December 03, 2007 @11:36PM (#21567711)
    I read part of TFA, and it seems to be saying that you can id spam mails because they are being sent to a person who gets lots of spam. But that still doesn't take into account the fact that that person also receives legit mail, AND the fact that what is spam to one person isn't spam to another.

    Also, this seems like a bit of a slashvertisement for an as-yet unproven technology - the only benchmarks we have are the ones they provide.
  • by damn_registrars ( 1103043 ) <damn.registrars@gmail.com> on Monday December 03, 2007 @11:41PM (#21567753) Homepage Journal
    At least once a week there seems to be another flashy technique to filter or block spam. Great.

    Except that this ignores the truth behind the spam problem, that many people don't seem to care about. Spam is, at its root, an economic problem. Spam is sent by people who are making money helping someone sell something. The spam you got this afternoon for discount v!@gra or 0EM software is making money for someone. And as long as someone can still make money off of it, they'll keep doing it.

    If you want to stop spam, you need to take away the economic incentive. We've already seen how many spam filtering / blocking programs produced in the past 5 years? But yet the spam problem just keeps growing as the number of "solutions" grows. This tells us that the spammers are more than willing to work on ways to circumvent these reactive techniques, so that they can continue to make money off their deeds.

    Once we can stop spam from being profitable, we will finally see it go away. But no sooner.
  • by ender- ( 42944 ) on Monday December 03, 2007 @11:51PM (#21567827) Homepage Journal

    If you want to stop spam, you need to take away the economic incentive. We've already seen how many spam filtering / blocking programs produced in the past 5 years? But yet the spam problem just keeps growing as the number of "solutions" grows. This tells us that the spammers are more than willing to work on ways to circumvent these reactive techniques, so that they can continue to make money off their deeds.

    Once we can stop spam from being profitable, we will finally see it go away. But no sooner.
    But why would the anti-spam software companies want that? If they succeed in actually eliminating spam, they'd also go out of business. It may be profitable for the spammers, but I suspect it's even more profitable for the anti-spam companies.

  • Is it a joke? (Score:3, Insightful)

    by jmv ( 93421 ) on Monday December 03, 2007 @11:51PM (#21567829) Homepage
    Seriously, I don't see how anything working remotely as described can work. First, it guarantees that any OSS mailing list will be flagged as spam because our emails tend to be on the web and we all receive lots of spam. Then how the hell is someone going to know what percentage of spam I receive (or do they expect everyone to give them access to their inbox?)? Even if that were to work, all the spammers would have to do is let the zombies send one email at a time, at which point either they block all my email or they let it all through. Dumb idea or dumb reporting?
  • by sonikbeach ( 939185 ) on Monday December 03, 2007 @11:52PM (#21567835) Homepage
    How does one initialize this system? Spam is determined by user reputation, yet user reputation is determined by quantity of spam received. Am I missing something? The logic seems circular.
  • by ucblockhead ( 63650 ) on Monday December 03, 2007 @11:53PM (#21567841) Homepage Journal
    Yes, and once we can stop drugs from being profitable, we will see them go away too.

    Oh, and prostitution, too. And identity theft. And insurance fraud. Yup, it's simple to fix. Just make it unprofitable! Simplicity itself!
  • by pclminion ( 145572 ) on Monday December 03, 2007 @11:58PM (#21567875)

    At least once a week there seems to be another flashy technique to filter or block spam. Great.

    It's not "flashy." It's called information theory and statistics. It is an extremely powerful concept that has far more important potential uses than simply filtering spam email. Every new advancement in automated classification and knowledge extraction is VITALLY IMPORTANT to our ability to cope in a world which has suddenly been flooded with SO MUCH information. This power tool is being applied to what some might see as a "silly" problem, but the fact remains that spam is a powerful motivation to researchers to push further limits in the fields of pattern recognition, information and natural language processing.

    If you're against the advancement of information processing techniques, then... uh, okay, I guess. If you can't see beyond spam, you are terribly short sighted.

  • by Anonymous Coward on Tuesday December 04, 2007 @12:06AM (#21567919)
    Isn't making spam less profitable what they're attempting to do by blocking it? Doesn't that defeat the initiative in its own way?

    I mean, I'd imagine inventing new ways of blocking spam would be a lot easier than standing the economy on its head.
  • by explosivejared ( 1186049 ) <hagan@jared.gmail@com> on Tuesday December 04, 2007 @12:07AM (#21567937)
    Exactly! The system lacks a way of defining what exactly it's blocking. How does one determine that one receives, say, 25% spam? Does Abaca do the analysis or are you just supposed to guess? While the equation obviously works on paper, when implementation comes it is clearly missing a major element, i.e. a definition of spam.
  • Re:KInda flawed (Score:2, Insightful)

    by swillden ( 191260 ) <shawn-ds@willden.org> on Tuesday December 04, 2007 @12:17AM (#21568001) Journal

    Does that make better sense?

    Not much.

    Two issues: First, how does the system know that Jane's e-mail is mostly spam? Who tells it? Does it use some other filters to identify the spam in order to determine her spam rate?

    Second, how does the system know that the message you received and the message Jane received are the same? Spammers have long been randomizing parts of messages in order to block older spam filters.

  • by MightyYar ( 622222 ) on Tuesday December 04, 2007 @12:22AM (#21568039)
    In TFA, the example is:

    "At 99.8 percent you miss two out of 1000," said Mr. Kirsch. "At 95 percent you miss 50 out of 1,000. So other systems give you 25 times as much spam. Who wants that? Nobody we know."
    He then goes on to claim that more users will improve the system to where it is 100x better than 95%, or 99.95% effective.
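The arithmetic in the quote can be checked with a few lines of Python. This is only a sketch of the numbers quoted above; the function name `missed_spam` is made up for illustration, and exact fractions are used to avoid floating-point noise.

```python
from fractions import Fraction

def missed_spam(accuracy_percent: str, total: int = 1000) -> Fraction:
    """Spam messages that slip through out of `total` at a given catch rate."""
    return (1 - Fraction(accuracy_percent) / 100) * total

baseline = missed_spam("95")    # 50 missed per 1,000
claimed = missed_spam("99.8")   # 2 missed per 1,000
print(baseline, claimed, baseline / claimed)  # 50 2 25

# "100x better than 95%" read as 100 times fewer misses than the baseline
target_missed = baseline / 100
target_accuracy = 100 - target_missed / 1000 * 100
print(float(target_accuracy))  # 99.95
```

So "25 times as much spam" and "99.95% effective" are both consistent with the quoted figures, provided "better" means fewer missed messages.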
  • by OzRoy ( 602691 ) on Tuesday December 04, 2007 @12:32AM (#21568111)
    On the Internet, yes. Because, ya know, the spammers won't just move to where spam isn't illegal, like Nigeria or something.

    Wake up, they are already committing fraud, and already breaking the law. The agencies already exist that fight fraud, and yet how many spammers have actually been caught and charged with fraud? How much of this spam has actually been stopped?
  • by choongiri ( 840652 ) on Tuesday December 04, 2007 @12:38AM (#21568135) Homepage Journal

    No, if you are harvesting email addresses and sending unsolicited commercial messages to them, it is quite simple:

    You are a spammer.

  • by courseofhumanevents ( 1168415 ) on Tuesday December 04, 2007 @12:41AM (#21568159)
    "MightyYar" --> "him gay, try!"
  • by CustomDesigned ( 250089 ) <stuart@gathman.org> on Tuesday December 04, 2007 @12:41AM (#21568169) Homepage Journal
    Honeypots have been a published anti-spam technique for a decade. The idea is to publish bogus mailboxes that are not close to any legit mailbox. Any message with a honeypot as any recipient is spam. 100% accurate. (And I blacklist the IP for a week for good measure.) I use a variation, where any message with 3 or more invalid recipients is spam (blacklist IP). That is a little risky since someone may legitimately be trying various mailboxes manually with a telnet session because they forgot the exact name. This technique gives each recipient a score between 0 and 1 that reflects how close to a honeypot that recipient is, with actual honeypots (100% spam) being 1.0.
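The two rules described above (any honeypot recipient means spam, and three or more invalid recipients means spam, with a one-week IP blacklist) might be sketched roughly like this. The mailbox names, the in-memory `blacklist` dict and the function `check_message` are all assumptions for illustration, not the poster's actual setup.

```python
WEEK = 7 * 24 * 3600  # blacklist duration in seconds

# Hypothetical mailboxes, for illustration only.
HONEYPOTS = {"trap@example.invalid"}
VALID = {"alice@example.invalid", "bob@example.invalid"}

blacklist = {}  # sender IP -> expiry time (seconds)

def check_message(ip, recipients, now, max_invalid=3):
    """Return an (is_spam, reason) pair and blacklist the IP on a hit."""
    if blacklist.get(ip, 0) > now:
        return True, "blacklisted"
    rcpts = [r.lower() for r in recipients]
    if any(r in HONEYPOTS for r in rcpts):
        reason = "honeypot recipient"
    elif sum(r not in VALID for r in rcpts) >= max_invalid:
        reason = "too many invalid recipients"
    else:
        return False, "ok"
    blacklist[ip] = now + WEEK  # block this IP for a week
    return True, reason
```

Passing `now` explicitly keeps the sketch deterministic; a real mail server would take it from the clock and would hook this check into the SMTP RCPT stage.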
  • Re:Ummmm.... (Score:3, Insightful)

    by MechaStreisand ( 585905 ) on Tuesday December 04, 2007 @12:47AM (#21568205)
    ... AND the fact that what is spam to one person isn't spam to another...

    That's not true though. Spam is defined as bulk, unsolicited e-mail. Even if some retard actually likes to read their spam e-mails and buy things they advertise, that doesn't change the fact that the message was sent in bulk (to many other people as well), and that it was unsolicited by at least the vast, overwhelming majority of them.
  • by Kadin2048 ( 468275 ) * <slashdot.kadin@xox y . net> on Tuesday December 04, 2007 @01:22AM (#21568419) Homepage Journal

    get very few opt-outs
    Might this be because nobody with two neurons to rub together actually uses an opt-out link? (After all, if you're scummy enough to send me unsolicited email, you're probably scummy enough to use that "opt out" as a test to determine whether my address is real, and thus to be sold to other scum for more profit.)

    You may be a nice person and run a respectable enterprise in all other respects, but if you're sending out unsolicited emails on anything more than an individual basis, you're a spammer.

    Furthermore, "This should definitely be legal, it's a great marketing tool and helps my business very well," is not a legitimate justification. It would really help my business if I could hunt down my competitors and kill them, but somehow I doubt that's going to go over very well at the inevitable murder trial. Why? Because nobody cares what's good for you or me, what matters is what's good for society as a whole. And both murder and spam are (admittedly varying degrees of) harmful.
  • by mcrbids ( 148650 ) on Tuesday December 04, 2007 @01:25AM (#21568437) Journal
    Once we can stop spam from being profitable, we will finally see it go away. But no sooner.

    Way to go, Captain Obvious!

    This goes down in history with other sayings of similar caliber, such as

    1) "Once we can stop scams from being profitable, we will finally see them go away. But no sooner."

    2) "Once we can stop prostitution from being profitable, we will finally see it go away. But no sooner."

    3) "Once we can stop theft from being profitable, we will finally see it go away. But no sooner."

    Somehow, despite having 4,000 years of civilization to work on these ills, the appropriate technology to eradicate these plagues has never been concocted. I'd wager that spam is not a technical problem, it's a human problem. And so long as we have A) money and B) an Internet, there will be spam.

    See, there is no clear definition of spam. If I send you a direct, personal, business email that you are expecting while we're on the phone when you ask me for a quote, that's clearly not spam. And if I write a program to send out 100,000 "P3niz Pil1z" emails, that's clearly spam. But there are a MILLION shades of grey in between the two.

    A) I could personalize the Peniz pil1z so that they have your name at the top.

    B) I could randomize the text in the Peniz pil1z email. I could restrict the list of recipients to only those who have, at some point in the distant past, looked at a porn site.

    C) I could send emails to clients of email lists in clear areas of interest to my email. EG: Send an email pronouncing my new electronic pilot gadget only to registered pilots and/or plane owners.

    With each modification, we move further away from "pure" spam, towards "legitimate" commercial email.

    D) I could send a quote to people who have called or contacted people in my business, even though they didn't ask for anything like my quote.

    E) I could send the quote to people who have contacted my business, who didn't ask for the current quote, but have asked about something similar.

    F) I could send the quote to you pursuant to a conversation, even though you didn't ask for it, if/when you have asked about something similar.

    G) Finally, we're over to the other extreme. You are a pilot, you want my gadget, and you are asking me for a quote, which I send you.

    And there's no sharp line between the two extremes. I don't personally mind emails from G down to around D too much. I get annoyed at C, and anything below that is below my line. But there are plenty of people who get offended at anything below G!

    It's entirely a personal, subjective decision.
  • by jhol13 ( 1087781 ) on Tuesday December 04, 2007 @01:25AM (#21568441)

    solution is metered billing and micropayments.
    As long as most of the spam is generated by zombie machines this will not help at all.
  • by halcyon1234 ( 834388 ) <halcyon1234@hotmail.com> on Tuesday December 04, 2007 @01:32AM (#21568489) Journal

    how do you propose we remove the economic incentive for spam?

    Easy enough. Remove the customers. Set up a spam operation selling drugs. Except instead of sending what's advertised, send arsenic. Once all the customers have died, there won't be anyone left to buy spam-stuff. And, as a bonus, you help the genepool.

  • Re:Ummmm.... (Score:5, Insightful)

    by Mundocani ( 99058 ) on Tuesday December 04, 2007 @02:04AM (#21568703)
    The main problem I can see is that even if this system works it is easily circumvented. The big assumption is that you can identify the recipients of a particular message, but spammers can easily ensure that information isn't easily obtained.

    First they can ensure that the message itself doesn't contain any recipient info (a big bcc basically).

    Then they avoid batching recipients based on their domain so the SMTP server can't tell who else is receiving the message.

    The only way to derive the recipients now is to compare all messages against all others in order
    to match them up. So they hash every message and combine those with identical hashes.

    But putting a little unique text in each message during transmission foils that.

    Spammers: 1 New weapons: 0
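The hash-matching step and the way a little unique text defeats it can be shown in a short sketch. It assumes exact-match hashing with SHA-256 over the message body; the addresses, message texts and the name `group_by_hash` are made up for illustration, and nothing here is claimed to be what the product actually does.

```python
import hashlib
from collections import defaultdict

def group_by_hash(messages):
    """Map body digest -> recipients, for an iterable of (recipient, body)."""
    groups = defaultdict(list)
    for rcpt, body in messages:
        digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
        groups[digest].append(rcpt)
    return dict(groups)

same = [("a@x.invalid", "Buy now"), ("b@x.invalid", "Buy now")]
salted = [("a@x.invalid", "Buy now qzx81"), ("b@x.invalid", "Buy now plm27")]

print(len(group_by_hash(same)))    # 1: both recipients share one digest
print(len(group_by_hash(salted)))  # 2: the unique suffix splits them apart
```

A fuzzy or locality-sensitive hash would tolerate small differences better than an exact digest, but that is a different technique from the one described in the comment.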
  • Re:KInda flawed (Score:2, Insightful)

    by letxa2000 ( 215841 ) on Tuesday December 04, 2007 @02:05AM (#21568713)

    Now, let's say that BOTH YOU AND JANE receive the same message M.

    That's the problem I have with this. Spam stopped being truly mass produced years ago. Each spam is now normally sent to each user with a different mix of nonsense. The probability of two different people receiving the same message is virtually zero.

  • by Valdrax ( 32670 ) on Tuesday December 04, 2007 @02:10AM (#21568741)
    Over 99 percent spam blocking means fewer than one mistake in every 100 messages processed. That's 10 to 100 times fewer mistakes than any other available systems.

    That still means that the best other systems make a mistake on 1 out of every 10 messages, and the worst ones make a mistake on every single message. That's still ridiculous hyperbole.

    (Personally, I'll take the system that makes 100% mistakes, and I'll use the Spam folder as my Inbox.)

    Now if you said that it has 1/10 to 1/100 the error rate of normal clients (which is what they're actually claiming, I think), THAT would make mathematical sense AND be an achievement. The Slashdot title of the story is just bad no matter how you spin it.
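The two readings of the claim can be put side by side in a small sketch. The function names are made up, and the 5-per-100 baseline in the second reading is an assumption borrowed from the 95% figure quoted elsewhere in this thread.

```python
from fractions import Fraction

def literal_reading(new_mistakes_per_100, factor):
    """Rival's mistakes per 100 if the new system makes `factor` times fewer."""
    return min(100, new_mistakes_per_100 * factor)

def relative_reading(old_mistakes_per_100, factor):
    """New system's mistakes per 100 at 1/factor of a rival's error rate."""
    return Fraction(old_mistakes_per_100, factor)

print(literal_reading(1, 10), literal_reading(1, 100))    # 10 100
print(relative_reading(5, 10), relative_reading(5, 100))  # 1/2 1/20
```

The literal reading forces rivals to get 10 to 100 messages wrong out of every 100, which is the hyperbole complained about above. The relative reading only requires the new system to get between 0.5 and 0.05 messages wrong per 100.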
  • by Kadin2048 ( 468275 ) * <slashdot.kadin@xox y . net> on Tuesday December 04, 2007 @02:22AM (#21568799) Homepage Journal
    There's all sorts of commercial mail that's not spam. If I order something from you, and you send a reply back confirming my order, that's both commercial and definitely not spam. As is any other reply to an inquiry.

    Where it crosses the line and becomes spam is when it's unsolicited. That's the key. Unsolicited commercial email is the very definition of spam, and no amount of hand-waving about opt-outs or the selectivity of the lists is going to change that.

    Businesses that have relied on cold-calling via any medium to drum up sales have always been sleazy in my book, but when you do it via email, you're pushing the cost out onto the recipient and onto uninvolved third parties. That's at best unethical, and at worst flat-out theft.
  • Re:No (Score:4, Insightful)

    by arth1 ( 260657 ) on Tuesday December 04, 2007 @04:17AM (#21569327) Homepage Journal

    Ironically, you are completely wrong also - RTFA again. It isn't at all about senders, it's about recipients.

    You didn't RTFA well enough. That it's about recipients is the selling point.
    That's a truth with modifications, though. Look at the quote from the web site I put in my parent post to yours, which clearly shows that it's a block based on who the sender has sent an email to. I'll repeat it, in case you missed it:

    "Because ratings are based on the most recent 25 emails for each sender, the system reacts instantly to spam attacks, usually within just a few messages."

    Yes, it's a recipient based system in that it assigns a score to the sender based on what the recipients of the emails are. But the blocking occurs due to the score of the sender, based on previous emails, not on the recipient of the current email.

    Just think -- if blocking were based on the recipient only, it would either block all or no e-mail to an inbox with a single recipient. It would then only be effective for e-mails with multiple recipients, which doesn't match the claims made.
    Again, think, and read the article (and that goes for the moderators too).
  • by arth1 ( 260657 ) on Tuesday December 04, 2007 @04:30AM (#21569385) Homepage Journal

    You have got the system completely BACKWARDS.
    Sorry for AC but i've already moderated in this discussion.

    (Ah, that explains the completely asshat moderation here, then.)

    No, I didn't get it backwards -- RTFA. It's called a recipient verification system, but when you look at their own description on how it operates, you'll find that:

    - It looks at the recipients of a message, and based on how much spam each of the recipient accounts gets, assigns a score to the sender.

    - This score is accumulated over the last 25 emails.
    (The reason for this is rather obvious, if you think about it -- if it based its score on just the last e-mail, if you sent an e-mail to someone who receives a lot of spam, it'd be automatically blocked, and that person would not get any e-mail at all.)

    Say a sender sends e-mails to foo@foo.invalid, bar@bar.invalid, a bunch more people, and finally baz@baz.invalid. If foo@foo.invalid receives 30% spam, and the overall average is 80%, that means that the e-mail is unlikely to be spam. So a score is saved in a table for the sender. Then it goes to bar@bar.invalid, who also has a low 40% spam rate, and another "good" score is saved for sender. When the sender then after a while sends an email to baz@baz.invalid, who has a spam rate of 95%, the fact that he sent an e-mail to foo and bar earlier will increase the likelihood of his email to baz going through.
    Conversely, if foo and bar received more spam than average, an e-mail sent to baz would be scored as more likely to be spam, even if baz received a record low 10% spam.

    Yes, in a way, it's receiver based, because it builds the score based on the receivers' ratio of spam to valid e-mails. But the score is applied to the sender, and they state this in clear text on the web site itself. You only have to read past the sales pitch and down to the technical details.
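The mechanism as described above can be sketched in a few lines. This is a guess at the shape of it, not the vendor's algorithm: the scoring formula (a plain mean of recent recipients' spam rates, compared against the overall average) and the names `SenderReputation`, `score` and `deliver` are assumptions. Only the 25-email window and the example percentages come from the thread.

```python
from collections import defaultdict, deque

WINDOW = 25               # "most recent 25 emails", per the quoted site text
OVERALL_SPAM_RATE = 0.80  # overall average used in the example above

class SenderReputation:
    def __init__(self, recipient_spam_rate):
        # recipient address -> fraction of that mailbox's mail that is spam
        self.recipient_spam_rate = recipient_spam_rate
        self.history = defaultdict(lambda: deque(maxlen=WINDOW))

    def score(self, sender):
        """Mean spam rate of the sender's recent recipients (assumed formula)."""
        past = self.history[sender]
        return sum(past) / len(past) if past else OVERALL_SPAM_RATE

    def deliver(self, sender, recipient):
        """Judge on the sender's prior history, then record this recipient."""
        is_spam = self.score(sender) > OVERALL_SPAM_RATE
        self.history[sender].append(self.recipient_spam_rate[recipient])
        return is_spam

rates = {"foo@foo.invalid": 0.30, "bar@bar.invalid": 0.40, "baz@baz.invalid": 0.95}
rep = SenderReputation(rates)
for rcpt in ("foo@foo.invalid", "bar@bar.invalid"):
    rep.deliver("sender@example.invalid", rcpt)
print(rep.deliver("sender@example.invalid", "baz@baz.invalid"))  # False
```

The mail to baz goes through because the verdict depends on the sender's earlier recipients (foo and bar), not on baz's own 95% spam rate. A sender whose earlier recipients were mostly spam-heavy mailboxes would be flagged even when writing to a clean inbox.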
  • by MightyYar ( 622222 ) on Tuesday December 04, 2007 @10:25AM (#21571189)
    Right back atcha:

    courseofhumanevents -> "Must Fence A Nervous Ho"
  • by Eudial ( 590661 ) on Tuesday December 04, 2007 @10:53AM (#21571523)
    Linux is not gay, homosexuals are gay.
  • by damn_registrars ( 1103043 ) <damn.registrars@gmail.com> on Tuesday December 04, 2007 @10:59AM (#21571601) Homepage Journal

    It's called information theory and statistics.

    I agree with you on the significance of information theory. There are plenty of important applications of it, but I don't think that spam filtering is one of them. As I said before, you can filter all the email you want, and in the end you'll just find that the spammers will find a way past your filters and you'll again be bombarded with offers for penis pills.

    further limits in the fields of pattern recognition, information and natural language processing.

    If someone wants to use spam to train their algorithms for work in those areas, I certainly do not oppose it. But if they think that it will somehow solve the spam problem, I stand by my statement that they are dead wrong. On the other hand, if they want to apply it to something like indexing research journal articles, or some other application that is for the greater good, then I applaud their work.
  • by nuzak ( 959558 ) on Tuesday December 04, 2007 @12:13PM (#21572561) Journal
    > My point is, the "powers" that be, in the particular case, are likely incompetent - incapable of successfully pulling off such a conspiracy.

    They're the ones creating the successful antispam systems -- you know, the ones that actually scale up on the gateway. The popular vision of bumbling PHB buffoons everywhere is just another stupid slashdot stereotype, fostered by insecure social retards who have to foist their apparent superiority over everyone by scoffing at everything. Sure, they exist, but long-term successful tech companies generally have -- get ready for it -- smart people working for them.

    Anyway, the antispam companies don't have the leverage to pull off an end to spam. Symantec and Cloudmark and Ironport and so forth could stand up and scream and rant and rave at ISPs and yell about the need to secure email infrastructure, to block outbound port 25 from residential ranges, to deploy SPF, or hell just to stop bouncing (I'm looking at you Barracuda), but as long as the ISPs run their ranges as open sewers, and just slap in a few boxes to stop everyone else's spam, the spam problem will continue. And they don't like having vendors telling them how to run their business. The people with the power to stop the spam problem, who won't, are not the antispam vendors; they are the ISPs sending spam. So perhaps I was too harsh about the assessment of the PHB problem -- they certainly do seem to be the norm at ISPs (with notable exceptions like AOL and parts of Roadrunner).
