Stopping Spam Before It Hits the Mail Server (157 comments)

Posted by Soulskill on Wednesday July 29, @11:32AM
from the napalm-would-catch-it-even-earlier dept.
spam
networking
internet
Al writes "A team of researchers at the Georgia Institute of Technology say they have developed a way to catch spam before it even arrives on the mail server. Instead of bothering to analyze the contents of a spam message, their software, called SNARE (Spatio-temporal Network-level Automatic Reputation Engine), examines key aspects of individual packets of data to determine whether a message might be spam. The team, led by assistant professor Nick Feamster, analyzed 2.5 million emails collected by McAfee in order to determine the key packet characteristics of spam. These include the geodesic proximity of end mail servers and the number of ports open on the sending machine. The approach catches spam 70 percent of the time, with a 0.3 false positive rate. Of course, revealing these characteristics could also allow spammers to fake their packets to avoid filtering."
This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • I'll go first.

    All spammers have to do is change the characteristics of the message. It's always going to be a cat and mouse game, just like antivirus and antispyware, so saying that they've found THE solution to blocking spam from hitting the server is slightly irresponsible.

    • Re: (Score:3, Interesting)

      Unless they use a truly novel approach of stopping spam before it hits the server.

      I suggest an AK-47.

      • C4 on the outside of the firewall. That might remove more than expected...but it works!

        • by gnick (1211984) on Wednesday July 29, @11:56AM (#28868865) Homepage

          I realize that you're kidding, but removing more than expected is something that I consider unacceptable. If it hits the mail server and gets shuffled off into a spam folder with 100 pieces of trash, that's fine. But if it's not even going to make it to the mail server, 0.3% is too high a false positive rate.

    • by jammindice (786569) on Wednesday July 29, @12:11PM (#28869137) Homepage
      Your post advocates a

      ( X ) technical ( ) legislative ( ) market-based ( ) vigilante

      approach to fighting spam. Your idea will not work. Here is why it won't work. (One or more of the following may apply to your particular idea, and it may have other flaws which used to vary from state to state before a bad federal law was passed.)

      ( ) Spammers can easily use it to harvest email addresses
      ( ) Mailing lists and other legitimate email uses would be affected
      ( ) No one will be able to find the guy or collect the money
      ( ) It is defenseless against brute force attacks
      ( X ) It will stop spam for two weeks and then we'll be stuck with it
      ( ) Users of email will not put up with it
      ( ) Microsoft will not put up with it
      ( ) The police will not put up with it
      ( ) Requires too much cooperation from spammers
      ( ) Requires immediate total cooperation from everybody at once
      ( ) Many email users cannot afford to lose business or alienate potential employers
      ( ) Spammers don't care about invalid addresses in their lists
      ( ) Anyone could anonymously destroy anyone else's career or business

      Specifically, your plan fails to account for

      ( ) Laws expressly prohibiting it
      ( ) Lack of centrally controlling authority for email
      ( ) Open relays in foreign countries
      ( ) Ease of searching tiny alphanumeric address space of all email addresses
      ( X ) Asshats
      ( ) Jurisdictional problems
      ( ) Unpopularity of weird new taxes
      ( ) Public reluctance to accept weird new forms of money
      ( ) Huge existing software investment in SMTP
      ( ) Susceptibility of protocols other than SMTP to attack
      ( ) Willingness of users to install OS patches received by email
      ( ) Armies of worm riddled broadband-connected Windows boxes
      ( X ) Eternal arms race involved in all filtering approaches
      ( ) Extreme profitability of spam
      ( ) Joe jobs and/or identity theft
      ( ) Technically illiterate politicians
      ( ) Extreme stupidity on the part of people who do business with spammers
      ( X ) Dishonesty on the part of spammers themselves
      ( ) Bandwidth costs that are unaffected by client filtering
      ( ) Outlook

      and the following philosophical objections may also apply:

      ( X ) Ideas similar to yours are easy to come up with, yet none have ever
      been shown practical
      ( ) Any scheme based on opt-out is unacceptable
      ( ) SMTP headers should not be the subject of legislation
      ( ) Blacklists suck
      ( ) Whitelists suck
      ( ) We should be able to talk about Viagra without being censored
      ( ) Countermeasures should not involve wire fraud or credit card fraud
      ( ) Countermeasures should not involve sabotage of public networks
      ( ) Countermeasures must work if phased in gradually
      ( ) Sending email should be free
      ( ) Why should we have to trust you and your servers?
      ( ) Incompatibility with open source or open source licenses
      ( ) Feel-good measures do nothing to solve the problem
      ( ) Temporary/one-time email addresses are cumbersome
      ( ) I don't want the government reading my email
      ( X ) Killing them that way is not slow and painful enough

      Furthermore, this is what I think about you:

      ( X ) Sorry dude, but I don't think it would work.
      ( ) This is a stupid idea, and you're a stupid person for suggesting it.
      ( ) Nice try, assh0le! I'm going to find out where you live and burn your house down!
      • Ideas similar to yours are easy to come up with, yet none have ever
        been shown practical

        This is partially true. SpamAssassin already uses a few of the things described in the article.

  • RFC 3514 (Score:4, Funny)

    by Anonymous Coward on Wednesday July 29, @11:35AM (#28868407)

    Problem already solved back in 2003, I don't get any spam now.

    • Re:RFC 3514 (Score:5, Informative)

      by darpo (5213) on Wednesday July 29, @11:46AM (#28868639)
      For those who don't feel inclined to Google for it:

      "The evil bit is a fictional IPv4 packet header field proposed in RFC 3514, a humorous April Fools' Day RFC from 2003 authored by Steve Bellovin. The RFC recommended that the last remaining unused bit in the IPv4 packet header be used to indicate whether a packet had been sent with malicious intent, thus making computer security engineering an easy problem."
  • by pearl298 (1585049) <mikewatersaz.gmail@com> on Wednesday July 29, @11:39AM (#28868483)

    Just like other criminals, spammers must quickly respond to what actually works. In essence this is the flaw in any "security by obscurity" scheme: the bad guys simply respond to whatever works. If you get to try several billion times a day then you can try a whole lot of combinations.

  • That means that in my office of 50 people, with an average of 50 emails per day (a very very low estimate), we'd get 7-8 false positives daily. I'd hear bloody murder if that was the case.

    We get a lot more mail than that per day, and our SpamAssassin without autolearning (simply flag anything higher than 5.0) does a hell of a lot better job than that... down in the range of 1-2 false positives a month. Assuming a low daily average of emails (like my example), that's .002% false positives.

    • And of course, if you want to actually spot the false positives, you have to let all the spam hit the mail server anyway. Unless you're willing to just ignore all the spam packets and put up with all those false positives being lost to the ether, this won't reduce your mail processing load at all.

    • It is somewhat ambiguous, but I had read it as 0.3%, not 3%, which implies that you'd lose 0-1 emails/day if you were averaging 50 total a day. Still higher that way than your current method, but nowhere near as bad as 7-8 daily.
      • 50 a day * 50 people = 2500 messages, 2500 messages * 0.3% = 7.5 emails.

      • Right, you read it wrong, like you were supposed to. 70% = 0.7, 30% = 0.3. Ergo, if it isn't catching spam correctly, it's marking the rest as spam, that way you catch all the spam! I wonder at what point in time it'd be better to reject everything and just deal with escalated messages (to phone calls, txts, tweets, etc). Then you can ignore email altogether.
        • Re: (Score:3, Informative)

          From the article, "The end result was a system capable of detecting spam 70 percent of the time, with a 0.3 percent false positive rate." The summary dropped an instance of the word "percent". I wasn't sure how to read it either so I specifically looked for the source of the 0.3 in the original.
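The office arithmetic in this sub-thread can be checked with a minimal Python sketch. It follows the thread's simplification of applying the rate to every message (strictly, a false positive rate applies only to legitimate mail), and the function and variable names are hypothetical, chosen for illustration.

```python
def expected_false_positives(messages_per_day: float, false_positive_rate: float) -> float:
    """Expected number of messages wrongly flagged as spam per day.

    false_positive_rate is a fraction between 0 and 1, so 0.3 percent is 0.003.
    Simplification taken from the thread: every message is treated as legitimate.
    """
    if not 0.0 <= false_positive_rate <= 1.0:
        raise ValueError("false_positive_rate must be a fraction between 0 and 1")
    return messages_per_day * false_positive_rate


office_volume = 50 * 50  # 50 people x 50 messages each = 2500 messages per day

read_as_percent = expected_false_positives(office_volume, 0.3 / 100)  # "0.3 percent"
read_as_fraction = expected_false_positives(office_volume, 0.3)       # bare "0.3"

print(f"0.3 percent: {read_as_percent:.1f} false positives per day")
print(f"0.3 as a fraction: {read_as_fraction:.0f} false positives per day")
```

Read as a percentage, the rate gives about 7.5 misfiled messages a day for that office; read as a bare fraction, it gives 750, which is why the dropped word "percent" in the summary matters so much.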
  • The original is "The end result was a system capable of detecting spam 70 percent of the time, with a 0.3 percent false positive rate."

    • Oh yeah. I was thinking a rate of 0.3 was huge. 0.3 percent is much better but still not acceptable.
  • by johndiii (229824) * <johndiii&amilost,com> on Wednesday July 29, @11:43AM (#28868577) Journal

    0.3 would be terrible - three out of ten false positives. 0.3 percent - what the article actually says - is not too bad. But current techniques allow me to check the spam bin for such messages. This technique would pretty much preclude that capability, since the mail would never arrive at the server. I'm not sure that a rate of 0.003 would be acceptable under those circumstances.

    • 0.3 percent false positive

      They predicted something around 97 billion e-mails per day sent in 2007. I wouldn't want to guess what it's at today, but it's probably higher. Regardless, 0.3% of the emails equates to about 291 million legitimate emails per day black holing. No errors. No "marked return to sender". It just vanishes, eaten by the shub internet. Oops. And we can be pretty sure those numbers are higher -- this is a back of the envelope analysis.

      • No errors. No "marked return to sender".

        If the box just dumps the packets on the floor, the sender will eventually get an error message from their mail server. Of course the mail server will have tried uselessly quite a lot of times (for days, usually) before giving up.

      • Personally I would think that if 10 is 100%

        10 isn't 100%. 1 is 100%. That's how % is defined.

      • Re: (Score:3, Interesting)

        Help me here... Personally I would think that if 10 is 100% 0.3 is less than 1 mail. And not 3 out of 10.

        .3 is 300 out of 1000.

        .3% is 3 out of 1000.

        It's similar to the confusion created when idiots write "It only costs me .25 cents to make a phone call" when they really mean "$.25" or "25 cents".

      • by vux984 (928602) on Wednesday July 29, @12:18PM (#28869287)

        And when my mail filter blocks spam, it sends out a message with redirections to an alternative GSM number telling them to call me so I can whitelist the address.

        That's called backscatter, and it's as bad as spam.

        Think about it, my mail servers block about 35,000 spam messages per day. If they sent a message to each failed recipient with alternative instructions, that would be 35,000 messages I sent out. Some 34,990 of those messages would either be undeliverable or would get delivered to people who had nothing to do with the original message. You are effectively clogging up a bunch of innocent people's mail systems with your messages.

        Put it another way, suppose some spammer sends 1,000,000 messages with your email address spoofed as the sender. If everyone else did what you do, you would then receive 1,000,000 messages back to your inbox giving you alternate instructions to contact these people.

        You wouldn't want that. Nobody else does either. So please stop.

          • Re: (Score:3, Insightful)

            I do get your point really. But my dad (read: the boss) would not be happy if he missed a deal cause a million people who got spoofed got 1 mail from us telling them to call us if their message wasn't spam.

            Read that over a few times. You are saying it's OK to send out a MILLION unsolicited and annoying email messages (aka SPAM) to people who have never heard of you, so that your father won't miss a single deal?

            How is that any different from rationalizing sending out a million direct marketing spam in the hop

  • IP addresses, he notes, are easy to fake.

    Sure, you can fake your IP address so you get past this filtering, because it just looks at the first packet. It won't help you though, because you can't complete a TCP 3-way handshake from a fake address, and without doing that you can't actually send spam.
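The handshake argument above can be illustrated with a toy Python model. This is a sketch under stated assumptions, not a real network stack: `ToyServer`, `connect`, `isn_source`, and `blind_guess` are hypothetical names, and the addresses come from the ranges reserved for documentation. The only real TCP behaviour it models is that the SYN-ACK goes to the claimed source address and that the final ACK must echo the server's initial sequence number plus one.

```python
import secrets


class ToyServer:
    """Toy model of the server side of a TCP three-way handshake."""

    def __init__(self, isn_source=None):
        # isn_source is a hypothetical hook so the sequence number can be fixed in tests.
        self.isn_source = isn_source or (lambda: secrets.randbits(32))
        self.pending = {}

    def on_syn(self, claimed_src):
        # The SYN-ACK carries a fresh initial sequence number and is sent to the
        # *claimed* source address, wherever the SYN really came from.
        isn = self.isn_source()
        self.pending[claimed_src] = isn
        return claimed_src, isn

    def on_ack(self, claimed_src, ack):
        # The connection is established only if the ACK echoes isn + 1.
        expected = self.pending.pop(claimed_src, None)
        return expected is not None and ack == (expected + 1) % 2**32


def connect(server, real_addr, claimed_addr, blind_guess=0):
    """Return True if the handshake completes for this sender."""
    delivered_to, isn = server.on_syn(claimed_addr)
    if delivered_to == real_addr:
        # Honest sender: the SYN-ACK reaches us, so we can acknowledge it.
        ack = (isn + 1) % 2**32
    else:
        # Spoofer: the SYN-ACK went to someone else, so the ACK is a blind guess.
        ack = blind_guess
    return server.on_ack(claimed_addr, ack)


server = ToyServer()
print(connect(server, "203.0.113.5", "203.0.113.5"))   # honest sender
print(connect(server, "203.0.113.5", "198.51.100.7"))  # spoofed source address
```

With a random 32-bit sequence number the blind guess succeeds only about once in four billion attempts, which is the commenter's point: a forged first packet may pass a filter, but no SMTP session follows from it.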

  • Isn't this just pushing the processing back a level, but still arriving at its destination? I guess you could implement bandwidth-provider-level (i.e. before the customer even gets their packets) spam filtering this way, but I'm sure most organizations would prefer to retain control by doing their own filtering.
  • by CopaceticOpus (965603) on Wednesday July 29, @11:48AM (#28868711)

    So this software functions in both space AND time? Fascinating.

    It's good that they specified that in the name, to avoid questions such as "Will this software work in the universe which we inhabit?"

  • by damn_registrars (1103043) on Wednesday July 29, @11:59AM (#28868921) Journal
    It sounds like this approach would be fairly CPU intensive; analyzing the characteristics of packets, comparing them to other packets, looking for information on their originating systems, etc... It seems like they are throwing a non-trivial amount of computational time at the problem in order to spare the storage space that would be otherwise taken up by spam.

    And of course as others have already pointed out, this just starts another round of whac-a-mole by pursuing this avenue.
  • Wrong approach (Score:5, Insightful)

    by Animats (122034) on Wednesday July 29, @12:02PM (#28868993) Homepage

    The fundamental property of spam is that it involves many similar messages going to a large number of destinations. That's what to look for. Google can do that, because they manage a very large number of mailboxes with a single system. SpamCop used to do that, but they had to be in the mail-forwarding business to do it and that was too expensive.

    Trying to detect spam by looking only at the mail for a single account is inherently a form of guessing. The existing technologies are reasonably good, but not good enough that the spammers give up.
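One way to picture "many similar messages going to a large number of destinations" is word-shingle overlap across mailboxes. The Python sketch below is an assumption-laden illustration, not the method Google or SpamCop actually used; `shingles`, `jaccard`, `similar_mailboxes`, the 0.5 threshold, and the sample messages are all hypothetical.

```python
def shingles(text, k=3):
    """Return the set of k-word sequences (shingles) in a message."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similar_mailboxes(messages_by_mailbox, probe, threshold=0.5, k=3):
    """List the mailboxes that hold a message resembling the probe message."""
    probe_set = shingles(probe, k)
    return sorted(
        box
        for box, message in messages_by_mailbox.items()
        if jaccard(probe_set, shingles(message, k)) >= threshold
    )


mailboxes = {
    "alice": "Cheap meds available now click here to order today",
    "bob": "Cheap meds available now click here to order tonight",
    "carol": "Minutes from the budget meeting are attached for review",
}
print(similar_mailboxes(mailboxes, mailboxes["alice"]))
```

A single mailbox sees only one copy, so the count of matching mailboxes is exactly the signal that needs a provider-wide view. Heavier per-recipient randomization would lower the overlap and weaken this particular measure.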

    • The fundamental property of spam is that it involves many similar messages going to a large number of destinations.

      It won't be long until the zombies create individual spams for each recipient. Just scramble the catch words, add some random stuff to the gifs so they message-digest differently etc..., and there's not enough similarity in the messages anymore to be statistically detectable. If at all, traffic analysis would help, but here too, botnets are extremely flexible and could spread batch runs in I

  • by crymeph0 (682581) on Wednesday July 29, @12:10PM (#28869125)

    Your post advocates a

    (x) technical ( ) legislative ( ) market-based ( ) vigilante

    approach to fighting spam. Your idea will not work. Here is why it won't work. (One or more of the following may apply to your particular idea, and it may have other flaws which used to vary from state to state before a bad federal law was passed.)

    ( ) Spammers can easily use it to harvest email addresses
    ( ) Mailing lists and other legitimate email uses would be affected
    ( ) No one will be able to find the guy or collect the money
    ( ) It is defenseless against brute force attacks
    (x) It will stop spam for two weeks and then we'll be stuck with it
    (x) Users of email will not put up with it
    ( ) Microsoft will not put up with it
    ( ) The police will not put up with it
    ( ) Requires too much cooperation from spammers
    ( ) Requires immediate total cooperation from everybody at once
    (x) Many email users cannot afford to lose business or alienate potential employers
    ( ) Spammers don't care about invalid addresses in their lists
    ( ) Anyone could anonymously destroy anyone else's career or business

    Specifically, your plan fails to account for

    ( ) Laws expressly prohibiting it
    ( ) Lack of centrally controlling authority for email
    ( ) Open relays in foreign countries
    ( ) Ease of searching tiny alphanumeric address space of all email addresses
    ( ) Asshats
    ( ) Jurisdictional problems
    ( ) Unpopularity of weird new taxes
    ( ) Public reluctance to accept weird new forms of money
    ( ) Huge existing software investment in SMTP
    ( ) Susceptibility of protocols other than SMTP to attack
    ( ) Willingness of users to install OS patches received by email
    (x) Armies of worm riddled broadband-connected Windows boxes
    (x) Eternal arms race involved in all filtering approaches
    ( ) Extreme profitability of spam
    ( ) Joe jobs and/or identity theft
    ( ) Technically illiterate politicians
    ( ) Extreme stupidity on the part of people who do business with spammers
    ( ) Dishonesty on the part of spammers themselves
    ( ) Bandwidth costs that are unaffected by client filtering
    ( ) Outlook

    and the following philosophical objections may also apply:

    ( ) Ideas similar to yours are easy to come up with, yet none have ever
    been shown practical
    ( ) Any scheme based on opt-out is unacceptable
    ( ) SMTP headers should not be the subject of legislation
    ( ) Blacklists suck
    ( ) Whitelists suck
    (x) We should be able to talk about Viagra without being censored
    ( ) Countermeasures should not involve wire fraud or credit card fraud
    ( ) Countermeasures should not involve sabotage of public networks
    ( ) Countermeasures must work if phased in gradually
    ( ) Sending email should be free
    (x) Why should we have to trust you and your servers?
    ( ) Incompatibility with open source or open source licenses
    ( ) Feel-good measures do nothing to solve the problem
    ( ) Temporary/one-time email addresses are cumbersome
    ( ) I don't want the government reading my email
    ( ) Killing them that way is not slow and painful enough

    Furthermore, this is what I think about you:

    (x) Sorry dude, but I don't think it would work.
    ( ) This is a stupid idea, and you're a stupid person for suggesting it.
    ( ) Nice try, assh0le! I'm going to find out where you live and burn your
    house down!

    • I think you missed a few:
      (X) Bandwidth costs that are unaffected by client filtering

      (X) Ideas similar to yours are easy to come up with, yet none have ever been shown practical.

  • First: I do not want others to decide what's spam for me.
    Second: I got graylisting, amavisd with spamd & co, and more. Why exactly would I put such a system on every other node of the net too? To throw away resources?

  • What exactly does this mean? A rate is usually a comparison of two values. What two values were compared to get 0.3?
  • It's become a source of unending comedy as spammers who aren't very good at English in the first place use a dictionary and thesaurus to get past the filtering software, resulting in extremely entertaining subject lines. For example:

    YOU REMEMBER WHEN SEX WAS THE LAST TIME? REFRESH THE MEMORY OF VIA GRA!

    No more hair Rogaining medicine.

    GIRLS DO ANYTHING FOR A BIG HOSE

    It boosts your rod!

    Make two days nailing marathon

    for your delicate advantage

    And all that is just from the most recent page in my spa

  • by cenc (1310167) on Wednesday July 29, @03:33PM (#28872919)

    Why does it seem everyone ignores the real source of the majority of spam: Microsoft Windows computers infected by viruses running botnets that send spam. Yes, spam is generated by other systems too, but not nearly in the amounts generated by MS-based botnets.

    How about everyone just send their frigen spam bill to MS. How about a class action for everyone to collect for the damage that MS does to networks around the world. Better yet, let's just forward all the spam we get to MS. Let them sort it out.

    • Re: (Score:3, Insightful)

      Many spam messages are propagated by botnets, spoofed IPs, etc., so that isn't a perfect solution. Really, we need to combine different approaches, instead of trying to find a holy grail.
    • Spam is almost exclusively produced by botnets. Vulnerable computers exist all over the world, so it shouldn't be surprising that more spam comes from outside your country (wherever you live) than inside. You, personally, have no one in China or Russia that you correspond with, but a debtor nation like the US is in a rather poor position to f*ck with the legitimate mail traffic of its main creditor. The most effective way to kill spam would be to aggressively eliminate botnets, wherever they are. A machi
      • Re: (Score:3, Interesting)

        Many have found that, if you're outside the US, blocking the US is much more effective than blocking China and Russia.

    • Because many spam emails are generated from open relay servers.
    • I hear this suggestion a lot. However, many of us work for global companies that deal with legitimate email from these countries. We can't just reject IP blocks for countries when we have dealings in them. China and Russia are huge for international companies.

    • Good plan, block the countries sending the most spam. Currently, most spam is sent from the USA. I notice that your mail server is in the USA, so unfortunately this means you won't be able to contact anyone adopting this plan, but I don't think it's too high a price to pay for reducing the total amount of spam.
      • by oldspewey (1303305) on Wednesday July 29, @11:51AM (#28868769)

        what happens when someone tries to contact me out of the blue before I have a chance to white list them?

        Absolutely nothing happens ... at least from your perspective.

        • Slightly off-topic, sorry, but I think it's abysmal enough to post and interest a few (or just make you thankful you're not here.)

          "Absolutely nothing" is my company's solution to filtering out large or suspect attachments. If somebody sends me an attachment and my company's filters don't like it, the e-mail is dropped. I don't get a notice saying, "This e-mail contains suspicious attachments and has been removed." My customer doesn't get a reply saying, "This e-mail could not be delivered to the recipien
