Google's Research on Malware Distribution
GSGKT writes "Google's Anti-Malware Team has made available some of their research data on malware distribution mechanisms while the research paper[PDF] is under peer review. Among their conclusions are that the majority of malware distribution sites are hosted in China, and that 1.3% of Google searches return at least one link to a malicious site. The lead author, Niels Provos, wrote, 'It has been over a year and a half since we started to identify web pages that infect vulnerable hosts via drive-by downloads, i.e. web pages that attempt to exploit their visitors by installing and running malware automatically. During that time we have investigated billions of URLs and found more than three million unique URLs on over 180,000 web sites automatically installing malware. During the course of our research, we have investigated not only the prevalence of drive-by downloads but also how users are being exposed to malware and how it is being distributed.'"
Re:They already show a warning. (Score:1, Informative)
Re:And what platform does the malware run on? (Score:3, Informative)
Actually they do add a warning for infected sites (Score:5, Informative)
I just wonder how it is that hightstats.net can still be in existence when it contains known malicious stuff that hackers are inserting into unwary websites?!
Key points to take from the paper (Score:5, Informative)
The next worst offender is the US with 1/6.
About 3.5M websites attempt to send you to exploits from 180K distribution sites.
63% of the 180K malicious sites are IIS, 33% are Apache, and a handful are other.
80% of malware not delivered through ads (e.g. via iframes) was within 4 redirects of the malware distributor.
80% of malware from ads was more than 4 redirects from the distributor.
3/4 of distribution sites and 1/2 of landing sites are in 2 blocks occupying 6.5% of the IPv4 address space.
Among drive-by downloads, 1/2 alter your startup, 1/3 attack your security, 1/4 corrupt your preferences, and 7% install BHOs.
87% of outbound connections the malware initiates are HTTP, 8.3% are IRC.
The three AV engines tested against malware retrieved by the study had detection rates of about 35, 50, and 70%.
The part I find scariest is the 3.5M malware fronts. I mean, there are only about 70M active hosts on the entire Internet - that's 5 percent! Since I think that trying to make programmers these days write secure code is a lost cause, we should focus on breaking up the software monoculture. This kind of shit really starts to lose its efficacy if only 1 in 4 or 5 attempts even attacks the right browser...
This can be fixed, but impacts ad revenue model (Score:4, Informative)
The paper points out that most of the attacks involve redirection of some portion of page content. That's a useful piece of information, because, other than for advertising purposes, redirection of IFRAME items and images is quite rare. A useful blocking strategy would be to block all redirects below the top level page. Many ads will disappear; no great loss.
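A rough sketch of that rule in Python, just to make the idea concrete. Everything here is hypothetical: the Fetch record, its field names, and the allow() function are made up for illustration and are not taken from the paper or from any real browser or filter API. It assumes the client can tell a top-level navigation from an embedded fetch and can see the chain of redirects each request went through.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Fetch:
    """One resource request as seen by a hypothetical filtering client."""
    url: str                      # URL originally requested
    top_level: bool               # True for the page itself, False for iframes, images, scripts
    redirects: List[str] = field(default_factory=list)  # URLs the request was bounced through


def allow(fetch: Fetch) -> bool:
    """Allow top-level pages unconditionally; allow embedded content
    only if it was served without any redirect."""
    if fetch.top_level:
        return True
    return len(fetch.redirects) == 0


if __name__ == "__main__":
    page = Fetch("http://example.com/", top_level=True,
                 redirects=["http://www.example.com/"])
    image = Fetch("http://example.com/logo.png", top_level=False)
    framed_ad = Fetch("http://ads.example.net/slot1", top_level=False,
                      redirects=["http://broker.example.org/r?id=1"])
    for f in (page, image, framed_ad):
        print(f.url, "allowed" if allow(f) else "blocked")
```

This deliberately blocks even a single redirect on embedded content, which is the blunt version of the rule; a real filter would probably need exceptions, such as same-host redirects.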
Checking for hostile full web pages is already being done. McAfee SiteAdvisor was the first to do that, then Google copied them. Our "bottom feeder filter", SiteTruth [sitetruth.com], does some of that too, although it throws out far more sites than McAfee or Google do, just by insisting that some identifiable business stand behind any page that looks commercial.
Google's revenue model depends, to some extent, on those "bottom feeder" sites: all those anonymous "landing pages", "directory pages", "made for AdWords pages", and similar junk. Those things bring in substantial AdWords revenue, although they don't usually generate much in the way of sales for advertisers. Throwing them out of the "Google Content Network" would cut Google's ad income. This is where "don't be evil" collides with Google's profitability.
This looks like a solvable problem, but the solution will come from the security companies, not the search companies. The search companies can't afford to fix it.
Re:Search engine ranking (Score:4, Informative)
Also, what's your problem with JavaScript? If you ever used the Google front page (instead of your browser's quick search function or