Security Businesses Google The Internet

Google Purges Thousands of Malware Sites 133

Posted by kdawson
from the just-in-time-for-holiday-shoppers dept.
Stony Stevenson sends in word on the most massive "SEO poisoning" seen to date. The attack was directed at Google in particular and resulted in tens of thousands of Web pages hosting exploits showing up on the first page of Google searches for thousands of common terms (PDF). Sunbelt Software blogged about the attack on Monday after investigating it for months. By Wednesday Google had removed tens of thousands of malware-hosting pages from its index.
This discussion has been archived. No new comments can be posted.

  • BBC News piece (Score:5, Insightful)

    by MLCT (1148749) on Thursday November 29, 2007 @08:35AM (#21517019)
    http://news.bbc.co.uk/1/hi/technology/7118452.stm [bbc.co.uk]

    The sites were targeting IE exploits.
    • Re:BBC News piece (Score:5, Informative)

      by TubeSteak (669689) on Thursday November 29, 2007 @08:48AM (#21517137) Journal
      FTF Summary:

      Sunbelt Software blogged about the attack on Monday after investigating it for months.
      From Your BBC:

      "This was fairly epic," said Alex Eckelberry, head of Sunbelt Software - one of the firms that uncovered the attack.

      Mr Eckelberry said tens of thousands of domains, many based in China and only a couple of days old, were used in the vanguard of the attack.
      ...
      The booby-trapped websites were thought to be in operation for about 24 hours before Google began stripping them out of its search index.
      So which was it?
      Months of Google poisoning or just day(s)?
      • Re:BBC News piece (Score:5, Insightful)

        by Mike89 (1006497) on Thursday November 29, 2007 @09:14AM (#21517391)
        They could've 'poisoned' Google for months (linked to domains that didn't exist yet), then set the domains up and waited a few days for Google to recrawl. Then again, I'd have thought pagerank would be age-based too. Those search requests are the kind that show up weird dodgy sites anyway (who searches any of those exact terms anyway?!)
      • Re:BBC News piece (Score:5, Informative)

        by Alexeck (864216) on Thursday November 29, 2007 @10:05AM (#21518029)
        So which was it? Months of Google poisoning or just day(s)?

        It wasn't "months". I think that confusion came from a subsequent blog post we made where we talked about having tracked _comment spam_ bots for months. This attack was only a matter of days. A number of the domains involved, for example, were registered on the 24th or 25th of November.

        Alex Eckelberry
        Sunbelt
    • by Tarlus (1000874)

      The sites were targeting IE exploits.
      Well, if we're talking about exploits by the thousands, they'd have to be targeting IE.
    • Still waiting for the day when Slashdot stops posting articles about exploits that have no mention of the OS in the summary...
  • Sounds good, I'm glad someone is actively trying to make the internet a safer place for people in general, as well as cleaning up search pages for people who can spot malware sites from the search engine. This is also good for Google, thanks to their fantastic business model: "the more people who use the internet on a regular basis, the more money we make".
    • Re:Sounds Good To Me (Score:5, Informative)

      by Andrew Nagy (985144) on Thursday November 29, 2007 @09:51AM (#21517841) Homepage Journal
      I'm probably too late on this discussion, but I thought something needed to be said. I work in online marketing (no, that doesn't mean I am a spammer) and I think this speaks volumes about what Google is hard-pressed to admit. The system can still be gamed. And it seems to me that no matter what Google does to improve their algorithm, the system will still be vulnerable to gaming.

      In part, I think this has to do with the oddness that is their ranking strategy. They want to find the most relevant sites for any given query. So they study online behavior and adjust their algorithm to reflect that behavior. At the same time, they publish "guidelines" on how webmasters should design their sites and link out/in. It seems like they're trying to influence how websites behave online and then say that they're picking up on the organic trends. But in the end, they generate the trends. And then they tell everyone how to do it. Because of this, the system will always be vulnerable.

      Until, that is, PigeonRank(TM) [google.com] is launched.
      • Re: (Score:2, Interesting)

        by mikew03 (186778)
        If this is the best spammers can do against Google, I think we should be more impressed than concerned. Apparently most of these sites were up only a few days before being removed. And although they did manage to get on page 1, did anyone else notice how bad the site summaries looked? You'd have to be a total idiot to click on any of those results even if they were on page one.
        • Well that's what scares me. If a bunch of morons can game the system for a few days with horrible meta info, what could some serious SEO-ers do? What have they already done? Can I really trust most of Google's results?

          I tend to browse Google results with McAfee SiteAdvisor installed as a plugin. I don't particularly like McAfee, but I do like being able to see reputations of sites before I click on them. Of course, if McAfee hasn't tested the site yet, I accept the risk.
          • by rahvin112 (446269)
            How often does this happen? Do you routinely see phishing sites in the first 5 pages when googling? If not, I would say Google is doing their job admirably. Not to mention that although this went on for a few days (as in, sites continued to be added over a period of days), the sites were out of the rankings within 24 hours. That's impressive: for such a large-scale attack on Google's ranking system, they only managed to get onto the top page for less than 24 hours. I remember some of the old search engi
    • by Hatta (162192)
      This just makes me wonder where the news item was when Google indexed all these sites in the first place.
    • by noldrin (635339)
      Seeing as Google has been warned of poisoning before and refuses action, or that Google was long warned about proxy hacking and refused action, I'm not about to pat them on the back for not actually fixing their stuff and instead just trying to clean up the major incidents after a couple of days. If Microsoft acted like that, people would go into hysterics, let alone pat them on the back. And this comes from someone who loves Google and hates Microsoft.
      • Seeing as Google has been warned of poisoning before and refuses action
        How do you know Google "refuses" to do anything about it? Because you don't see something doesn't mean work isn't being done behind the scenes. Do you work at Google and know first-hand that it intentionally isn't working on the problem? If not, then STFU and go peddle your Microsoft apologies elsewhere.
        • by noldrin (635339)
          I know because the problem has been given to them over a year ago and still exists. Perhaps you should go learn something before you flame. Then perhaps your posts might mean something.
          • Just because it exists doesn't mean it isn't being worked on. I'm still waiting for your proof that Google "refuses" to do anything about it.

            And as for flames, you're the one throwing around baseless accusations. Get some proof, or get off.
            • by noldrin (635339)
              So if Google doesn't comment on a problem, then it's completely absolved of all responsibility because perhaps somewhere somebody might be working to fix it? I was hoping that you might go out and learn on your own the history of Google ignoring problems. So here is an overview. Google bombing started in 2000 or before. This is the core of what makes attacks like we saw this week work. Google's response to this problem had been consistently that it wasn't their problem.

              "We don't condone the practice

              • I see nothing in there from Google stating that they "refuse" to fix the problem. It's nice that you posted a bunch of blog links to other problems, but none of them back up your original flame.
                • by noldrin (635339)
                  If you say so, that is your opinion, but you have given nothing to back it up.
                  • I have made no assertions. I have nothing to back up. I'm still waiting for you to back up your claim. Do you "refuse" to back them up because you can't because they're untrue?
    • Sounds good, I'm glad someone is actively trying to make the internet a safer place for people in general...

      This is actually another scary example of Google being more and more evil. Think, if the US government had the DNS servers point that domain to a "This is a known malware" site, slashdot would be up in arms. But when a private corporation removes it from their index that's a good thing?

      I believe in net neutrality, and I believe in search engine neutrality as well. That is, just as AT&T should

  • all your base (Score:3, Interesting)

    by Kranfer (620510) on Thursday November 29, 2007 @08:40AM (#21517061) Homepage Journal
    Yay! No more Malware, I always hated getting horrible search results that hosted these things. I am glad that Google said to them, "All your base are belong to us" or maybe, "Resistance is Futile" is more along the lines I am looking for. When will their crawlers automatically disqualify ALL sites that contain malware though? That would be nifty.
    • Re:all your base (Score:5, Interesting)

      by sm62704 (957197) on Thursday November 29, 2007 @08:56AM (#21517223) Journal
      When will their crawlers automatically disqualify ALL sites that contain malware though? That would be nifty.

      I don't think it would be possible. I linked to a turing test program I wrote called "art.exe" from my Artificial Insanity [mcgrew.info] page that I hosted on another site I owned (which I since have let lapse). The only way a crawler would know that this program was benign was because it isn't listed in any of the antivirus lists of viral signatures.

      What would be nice is if Google would have its crawlers automatically check pages as they crawled. If there were any known malware, the page would be blacklisted. But there's no way I can think of to flag malware that hasn't been identified as such by humans.

      -mcgrew

      PS: :) The downside would be that you couldn't find microsoft.com (Foghorn Leghorn says...)
      PPS: I've been mulling over rewriting the Artificial Insanity program in javascript. But I'm having a hard time finding the time.
      • by Nossie (753694)
        So google decides that one of its competitors is malware and purges all existence of them...

        I'm thinking an independent body would be better deciding what is and what is not malware.
        • Re:all your base (Score:5, Insightful)

          by darthflo (1095225) on Thursday November 29, 2007 @09:58AM (#21517931)
          Nothing (except antitrust law, maybe) stops Google from "forgetting to include" live.com in its indexes now and this situation is quite unlikely to change in the near future. The only two reasons I think of as relevant to leave competitors in are the outrage from both the internet community and the "forgotten" competitor (perhaps culminating in lawsuits for anti-competitive behaviour, IANAL) and the desire for the own index to be perceived as fair and complete.

          An independent body deciding about the malness of any ware is, if a certain responsiveness could be guaranteed, a creepy idea. Forming such a committee would very surely be a huge leap in the direction of an often-mentioned TCPA (Palladium, NGSCB, Donkey poop)-secured blacklist society. A small aristocracy of people in this decision committee would become the target of a trillion-dollar industry and be able to decide exactly what piece of software is run by anybody. On the other hand, allowing anybody to participate in these votes would guarantee this operation not to be effective because of the huge delay this would cause. The same goes for adding legal ways to fight a decision by this body - having one would cause the system to become as slow as many legal systems throughout the world are today, not having one would be a surefire way to cause dissatisfaction with lots and lots of developers (both natural and legal persons).
          Also, don't forget to take into account the current legal trouble e.g. encryption software is going through. I'm certain an independent body would decide similarly to lawmakers throughout the world. Essentially, you could probably forget about running Linux (Open Source? That could run anything, including highly illegal tools like decss without any way to stop it), any cd/dvd copying software (It's fun to break the D-M-C-A (sung to the tune of YMCA)), nmap (Remember Germany banning "Hacker tools"?) or anything else.

          Sorry for painting such a dystopian future, but letting any (independent, governmental or profit-oriented) body whatsoever decide what software's good and what's bad just isn't what you, me or most anybody else wants.
          • Re:all your base (Score:4, Interesting)

            by Nossie (753694) <IanHarvie@4[ ]el ... t ['Dev' in gap]> on Thursday November 29, 2007 @10:15AM (#21518169)
            I do agree... and maybe an independent body would just become corrupt like the rest of them BUT.

            In Google's interest, they are a search engine and not a publisher and for that reason are not subject to the indexes of child porn and other illegal activity. Once Google starts going down the road of blocking spam and other malicious sites it could be suggested they lose the right of being an automatic aggregation engine.

            All The Pirate Bay does is index pointer links, all Google does is index pointer links -- one of them has a safe harbour in the US and the other does not. How long before Google itself loses its 'safe harbour'?
        • by toleraen (831634)
          Well, google does provide the blacklisted phishing sites for Firefox, but no one seems to be complaining...
      • Re: (Score:3, Interesting)

        by halcyon1234 (834388)
        Easy enough. Google has access to a massive number of IP addresses and computer resources. All they need to do is set up a whole bunch of virtual machines that have no protection on them at all. Those virtual machines can start visiting indexed pages (using a rotating set of IP addresses so the target website doesn't know they're being "tested"). If a machine gets infected, it will be very easy to spot. Something will have installed on that machine. A rootkit or an adware install is fairly obvious, eve
        • by sm62704 (957197)
          A rootkit or adware would be obvious, but what of trojans? A trojan has to be explicitly installed by a user; there would be no way for the bot to tell if the suspected trojan was doing what it was advertised to do or not, even if it did something egregious. I mean, make a batch file with the single DOS command "deltree /y C:\*.*" named "NakedLady.jpg.BAT" and you have a simple trojan that will delete every file and directory on a user's machine, provided the user leaves Microsoft's stupid default "hide exten
          • I don't think Google really can protect against Trojans. There isn't a security system advanced enough to fix a chair-to-keyboard interface error. But they can detect flybys and sites that intend to exploit the user with some sort of auto-run code, or buffer overflow, or something that no legitimate site would do to a user (without warning or proof of concept).
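
The honeypot idea in this subthread (visit a page from an unprotected virtual machine, then see whether anything was installed) comes down to comparing the machine's state before and after the visit. A minimal sketch of that comparison follows. It assumes the VM's files can be read into a dictionary of path to bytes; the function names and paths are hypothetical and are not taken from any poster or from Google. As the replies point out, a check like this would only catch drive-by installs, not trojans the user runs deliberately.

```python
import hashlib


def snapshot(files):
    """Map each path to the SHA-256 digest of its contents.

    `files` is a dict of path -> bytes, a stand-in for reading a VM disk.
    """
    return {path: hashlib.sha256(data).hexdigest() for path, data in files.items()}


def diff_snapshots(before, after):
    """List the paths that were added, removed, or changed between snapshots."""
    added = sorted(p for p in after if p not in before)
    removed = sorted(p for p in before if p not in after)
    changed = sorted(p for p in after if p in before and after[p] != before[p])
    return {"added": added, "removed": removed, "changed": changed}


def looks_infected(before, after):
    """Flag the visit if anything at all changed on the pristine machine."""
    return any(diff_snapshots(before, after).values())


# Hypothetical usage: one file appears after visiting a page.
clean = snapshot({"C:/windows/notepad.exe": b"original"})
dirty = snapshot({"C:/windows/notepad.exe": b"original", "C:/windows/evil.dll": b"payload"})
report = diff_snapshots(clean, dirty)
```

A real system would need to ignore expected changes such as browser cache files, which this sketch does not attempt.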
    • Re: (Score:2, Interesting)

      by Mathinker (909784)
      > When will their crawlers automatically disqualify ALL sites that contain malware though?

      Not possible; even disregarding the problem that other posters have raised, that the automatic recognition of novel malware is more or less impossible, most of the black hats setting up these sites have started to get really sophisticated and the servers can return different web pages based on IP addresses, and often never serve up exploits more than once to any given IP address.

      Like everything in the security game,
  • by garcia (6573) on Thursday November 29, 2007 @08:40AM (#21517063) Homepage
    Recently (end of October) Google reordered some of their sites and dropped the PageRank on many (mine included); there was a blog post about it here [dailyblogtips.com]. My PageRank suffered immensely, dropping from an overall high of 6/10 to the current 3/10. The most noticeable difference for me was that for the next two weeks (and for the first time ever) I was no longer the #1 hit for: Bill Roehl, "Bill Roehl", or any variation thereof. Not only that, but the first result from Google wasn't even for my root page, it was for some post I had underneath. I found that to be very odd.

    Now, while I was digging through the Google results to find out why this could have possibly happened (prior to reading the blog post linked above) I found tons of SEO spam sites that my site had been linked from. I had never seen that many junk results returned before and was surprised they were getting through. I was seriously concerned that they had something to do w/my ranking drop.

    At least Google is getting back on track dumping those bastards. While most people probably don't change their default settings to see anything more than the first 10 results, I am constantly looking through the first 100 on various searches and have seen more and more of that. I was wondering if some of the claims of Google's drop from #1 would be imminent if something didn't change.
    • At least Google is getting back on track dumping those bastards. While most people probably don't change their default settings to see anything more than the first 10 results, I am constantly looking through the first 100 on various searches and have seen more and more of that. I was wondering if some of the claims of Google's drop from #1 would be imminent if something didn't change.

      Well, they may be getting back at them, but...

      Ironically, Google itself refused to confirm or deny that it had cleansed its index of the more than 40,000 malware hosting sites, or even that they had existed. "Google takes the security of our users very seriously, especially when it comes to malware," a company spokeswoman said today in an e-mail. "In our search results, we try to warn users of potentially dangerous sites when we know of them. Sites that clearly exploit browser security holes to install software, such as malware, spyware, viruses, adware and Trojan horses, are in violation of the Google quality guidelines and may be removed from Google's index."

      What is Google afraid of? That their stock price will plunge if everyone finds out they were manipulated by malware sites?

    • by Rob T Firefly (844560) on Thursday November 29, 2007 @09:19AM (#21517455) Homepage Journal

      I was no longer the #1 hit for: Bill Roehl, "Bill Roehl", or any variation thereof.
      Perhaps there is simply someone else who is better at being Bill Roehl than you. Don't fret, though. You can always go back to Bill Roehl School and brush up with some post-graduate Bill Roehl stuff.

      Personally, I'm comfortable with the fact that I'm only the second-best me [google.com] out there. Let that other fella have his glory, because I'm never going back to the Rob Vincent Academy. I'm not going into it here, but those bastards Rob, Rob, and Rob know why.
      • by garcia (6573)
        It had nothing to do with that. The two sites that outranked mine were pointing back to me. That's why it made no sense.
    • dropped the PageRank on many (mine included)

      They also removed your /. ''homepage'', as they did with mine (for whatever reason).

      search [google.de]

      CC.
      • by garcia (6573)
        I never noticed that in my results before.
        • by foobsr (693224)
          They seem to consider the link from there as 'spam' as they seem to have removed all those who link to a page, even a fellow who links to debian. Twenty years down the road they consider which words are appropriate and which are to be avoided (of course based on an objective, sophisticated semantic weighting scheme(tm)) to get indexed.

          CC.
    • have no fear, you'll soon be back and better than ever! Bill Roehl is now being searched more than ever thanks to slashdotters.
      • by garcia (6573)
        Actually, I've already regained the top spot within a few weeks of that PageRank drop. My post was just talking about the general weirdness that was occurring around that time.

        There have only been 12 Google Searches for [B|b]ill [R|r]oehl today though. Not nearly enough to stroke my ego ;)

  • by Anonymous Coward
    ...welcome any move towards private pwnership of IE users.
  • Google had removed tens of thousands of malware - hosting pages from its index.

    Weird, usually it's the pages that are hosting malware, rather than the other way around. OW! Stop hitting me!
  • The keywords .. (Score:2, Interesting)

    by ninjeratu (794457)
    .. do not look like random words from a generator. They look targeted too with all the references to Microsoft software, Cisco, VPN. But then .. "train a dog to fetch" and "go go go go go go go go go go go"? Anyone have any ideas as to why and how they made that list?
    • They seem to be targeting Accountants and DBAs who work from home today and will go back inside the corporate firewall tomorrow. Oh and dog trainers for some reason
    • Re: (Score:2, Funny)

      by gzerphey (1006177)
      I remember hearing something about the Windows random number generator...
  • And what's SEO? (Score:3, Informative)

    by allcar (1111567) on Thursday November 29, 2007 @08:49AM (#21517151)
    For those of you, like me, who did not immediately recognise this TLA [wikipedia.org], it stands for Search Engine Optimization [wikipedia.org].
  • Censoring (Score:5, Funny)

    by Fredtalk (1105765) * on Thursday November 29, 2007 @08:50AM (#21517161)
    Sounds like net censorship to me! What if I wanted to visit those malware sites?
  • A hidden gem (Score:5, Interesting)

    by dotancohen (1015143) on Thursday November 29, 2007 @09:01AM (#21517255) Homepage
    The pdf contains a list of 2161 popular Google search terms. This is an SEO wet dream. Thanks!
  • by peipas (809350) on Thursday November 29, 2007 @09:05AM (#21517289)
    Is it just me or do the first five pages of "common terms" in the PDF contain the term Excel, and then the next four pages contain the term vpn? It seems to me there were two common terms in these first nine pages with random words tacked on.
  • "if u a dog go fetch"
  • by Anonymous Coward
    Google employees are quick to jump on Slashdot stories and get their spin and mods in. The "Go Google!" posts are coming in quick. The fact is that the first page of Google results has as much spam today as an AOL inbox back in 1995. The results have turned to junk.

  • tech support. Now what're we supposed to do over the holiday season? Boxshift?
  • From the summary: tens of thousands of Web pages hosting exploits showing up on the first page of Google searches for thousands of common terms

    So, how do you tell the difference between this and any normal Google results page?
  • ...if my eyes and brain RTFA correctly. I recognize Google is the big(gest) player, but it's not like the purveyors of fine malware focused exclusively on Google and Google alone. It's in TFA if you're willing to take a look-see.
  • Stalinism (Score:1, Funny)

    by Anonymous Coward
    What about the rights of those spammers? They're living in an impoverished third world country (Russia) and are just trying for a better life. They're no different than the home shopping network or eBay.

    And you won't tolerate them. You deny them their civil rights. You deny them their FREEDOM OF SPEECH!

    This is outright Stalinism. It's not their fault fat, stupid, bored, lonely Americans will buy products geared toward the intelligence of a labrador. They're just trying to feed their families... to be part o
  • For many months I have been using "Site Advisor", still free from McAfee. It works perfectly with Firefox. I searched for "Advisor" and did not find mention of it in these articles, but I would be surprised if any of these sites earned that nice green dot which I find so reassuring. Am I wrong to be so reassured?
  • by Anonymous Coward
    Let me create a blacklist of domains that are never shown on search results.

    This would then include the sites: *.cn
    which would include:

    bucket.rabbitexothermicsoup.cn
    flight.othersittingport.cn
    aggressive.xeroxmaneshop.cn

    Also the top 40 search result domains for 'geforce 8800gt review' or any other product, the content of which is typically:

    Reviews for Geforce 8800GT: (0)
    Click here to write your review for Geforce 8800GT

  • That's my advice as part of the solution to cut down on malware. Of course there are millions of .com malware sites, but you can't just cut out .com. On the other hand with rare exception, most people can without penalty stay away from .cn sites.
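
The personal blacklist asked for above (never show results from *.cn or from a list of named domains) amounts to a suffix match on each result's hostname. Here is a rough sketch, assuming the results are available as plain URLs. `is_hidden` and `filter_results` are made-up names for illustration and do not correspond to any search engine feature.

```python
from urllib.parse import urlsplit


def is_hidden(url, blocked_suffixes):
    """True if the URL's host equals a blocked entry or is a subdomain of it."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == s or host.endswith("." + s) for s in blocked_suffixes)


def filter_results(urls, blocked_suffixes):
    """Drop every result whose host falls under a blocked suffix."""
    return [u for u in urls if not is_hidden(u, blocked_suffixes)]


# Hypothetical result list: one host from the comment above, two placeholders.
results = [
    "http://bucket.rabbitexothermicsoup.cn/",
    "http://example.com/cn",
    "http://notcn.example/",
]
kept = filter_results(results, {"cn"})
```

Matching on "." plus the suffix, rather than on the bare string, keeps a host such as notcn.example from being hidden by a "cn" entry.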
  • by Animats (122034) on Thursday November 29, 2007 @12:57PM (#21520805) Homepage

    After reading this, I immediately checked to see if Google had fixed their open redirector. [google.com] No, they haven't, and there are six exploits of it listed in PhishTank. Google needs to turn that off. If they absolutely insist on having an open redirector, it needs its own subdomain, which is what Yahoo does. Then the subdomain can be blacklisted without collateral damage.
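
The point about collateral damage can be made concrete with a host-based blacklist. The sketch below uses made-up hostnames under the reserved .example domain and a hypothetical `is_blocked` helper; it shows how a filter behaves, not how any real blacklist is implemented. When the redirector shares a hostname with the main site, blocking it blocks the whole site. When it has its own subdomain, only the redirector is blocked.

```python
from urllib.parse import urlsplit


def is_blocked(url, blocked_hosts):
    """Return True if the URL's host is on the blacklist (exact host match)."""
    host = urlsplit(url).hostname
    return host is not None and host in blocked_hosts


# Hypothetical layouts: redirector on the main host vs. on its own subdomain.
SHARED = "http://www.search.example/url?q=http://phish.example/"
SEPARATE = "http://rd.search.example/url?q=http://phish.example/"
HOMEPAGE = "http://www.search.example/"

# Blocking the shared host also blocks the homepage (collateral damage).
shared_case = (is_blocked(SHARED, {"www.search.example"}),
               is_blocked(HOMEPAGE, {"www.search.example"}))

# Blocking the dedicated subdomain leaves the homepage reachable.
separate_case = (is_blocked(SEPARATE, {"rd.search.example"}),
                 is_blocked(HOMEPAGE, {"rd.search.example"}))
```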

    Phishing via exploits of major sites is a big problem, but involves a small number of major sites. 168 major sites today. [sitetruth.com] The usual exploits are:

    • Phishing site web servers on DSL lines. Some ISPs are good at kicking these off, and some aren't as good. "bellsouth.net" has more entries in PhishTank than any other domain.
    • "Open redirectors", URLs that can be exploited to redirect to another site, like the Google URL above.
    • Web hosting services, especially free ones, sometimes find themselves hosting phishing sites.
    • "Web 2.0" sites which allow uploading of user content but don't check it for exploits. Photobucket is used by some phishers, who upload hostile ".swf" files.
    • Break-ins on legitimate sites, where, typically, some obscure page is hosting hostile content. When an ".edu" site shows up in our list, that's usually what happened.

    Out of 1.6 million domains in DMOZ, and over 10,000 phishes in PhishTank, only 168 domains are in both. So the number of sites that need to be fixed is small. In fact, some of those sites are already fixed, but the entries haven't been removed from PhishTank yet. (Hint: if you kill a hostile page on your domain, make it a 404 error; that gets the page out of PhishTank's "active and online" list automatically. Don't just change the content or redirect it somewhere else, or it stays in the tank until somebody rechecks it manually, which can take weeks.)
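
    The overlap described here is a set intersection: take the host of every phishing URL, then check whether it or one of its parent domains appears in the directory. A minimal sketch under stated assumptions follows. The input lists are invented placeholders under the reserved .example domain, `exploited_domains` is a hypothetical name, and nothing here reflects the actual PhishTank or DMOZ data formats or how the list above is really produced.

```python
from urllib.parse import urlsplit


def exploited_domains(directory_domains, phish_urls):
    """Return directory domains that host at least one reported phishing URL."""
    directory = {d.lower() for d in directory_domains}
    hits = set()
    for url in phish_urls:
        host = (urlsplit(url).hostname or "").lower()
        labels = host.split(".")
        # Try the full host, then each parent domain, stopping before the bare TLD.
        for i in range(len(labels) - 1):
            candidate = ".".join(labels[i:])
            if candidate in directory:
                hits.add(candidate)
    return sorted(hits)


# Hypothetical inputs standing in for a site directory and a phishing feed.
directory = ["university.example", "freehost.example", "clean.example"]
phishes = [
    "http://www.university.example/~student/login.html",
    "http://someuser.freehost.example/ebay/signin",
    "http://198.51.100.7/paypal/",
]
overlap = exploited_domains(directory, phishes)
```

    Phishing pages served from a bare IP address never match a directory entry, which fits the observation that only a small number of listed sites are involved.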

    For every site in the list, there's some competitor in the same business who isn't on the list. "Everybody has this problem" isn't a valid excuse any more. This is a useful point to make with management if you find your own company on the list.

    This list of 168 exploited sites is updated automatically every three hours. There's also a list of sites recently removed from PhishTank. "n-insanity.com", "tropmet.res.in", "wsjob.com" were dropped from the list today; they no longer have active, online entries in PhishTank. "gentlesource.com", "t35.com" (an eBay phish), "tilapia.com" (another eBay phish), and "uic.edu" (already fixed) were added; they just appeared in PhishTank. If you have any responsibility for a site on the list, please take steps to fix the problem. If you're not part of the solution, you're part of the problem.

  • ... thousands of malware sites abandon Google and take their business to MSN Search.
  • Can they get rid of Swik.net while they're at it? I loathe that damn site.

Maybe Computer Science should be in the College of Theology. -- R. S. Barton
