Spam Sites Infesting Google Search Results 207

Posted by CmdrTaco on Monday October 01, 2007 @09:16AM from the hate-when-that-happens dept.

The Google Watchdog blog is reporting that "Spam and virus sites infesting the Google SERPs in several categories" and speculates, ...Google's own index has been hacked. The circumvention of a guideline normally picked up by the Googlebot quickly is worrisome. The fact that none of the sites have real content and don't appear to even be hosted anywhere is even more scary. How did millions of sites get indexed if they don't exist?

This discussion has been archived. No new comments can be posted.

Spam Sites Infesting Google Search Results

Load All Comments

Search 207 Comments Log In/Create an Account

Comments Filter:

It's the Rand Corporation (Score:3, Funny)

by OptimusPaul ( 940627 ) writes: on Monday October 01, 2007 @09:22AM (#20809189)

in conjunction with the saucer people under the supervision of the reverse vampires are forcing our parents to go to bed early in a fiendish plot to eliminate the meal of dinner. We're through the looking glass, here, people...

Share
twitter facebook
Google index hacked? (Score:5, Funny)

by InvisblePinkUnicorn ( 1126837 ) writes: on Monday October 01, 2007 @09:22AM (#20809197)

Hacking of Google databases might explain why Google Translator used to translate the Russian name for "Ivan the Terrible" as "Abraham Lincoln" [blognewschannel.com].

Share
twitter facebook
- Re: (Score:2)
  
  by AmIAnAi ( 975049 ) * writes:
  
  The Google translation service gives the option to suggest a better translation. It's more likely that this service operates automatically and it just takes enough people suggesting the same translation to force the change through.
  
  Might be interesting to try. But I would hope that they have monitoring in place to spot a sudden surge in alternative translations.
- Re: (Score:2)
  
  by rumith ( 983060 ) writes:
  
  Besides, it used to translate 'Peter Norton' to Russian as 'Eugene Kaspersky'. :) This trick has been taken down already.
SEOs (Score:5, Informative)

by Chilled_Fuser ( 463582 ) writes: on Monday October 01, 2007 @09:23AM (#20809201)

Using one page of information for Google's spider and then using a redirect for a non-spider user. It's an SEO tactic.

Share
twitter facebook
- Re:SEOs (Score:5, Interesting)
  
  by glindsey ( 73730 ) writes: on Monday October 01, 2007 @09:36AM (#20809329)
  
  Which raises the question: Why not have GoogleBot do a check also as a normal user-agent (IE/Firefox/etc.) and see if the page is significantly different than when it identifies itself? At the very least GoogleBot could check if there are common blacklist words ("viagra" et al) on the website when identifying itself as IE or Firefox.
  
  Parent Share
  twitter facebook
  - Re:SEOs (Score:4, Interesting)
    
    by dschuetz ( 10924 ) writes: <david&dasnet,org> on Monday October 01, 2007 @09:41AM (#20809381)
    
    I was pretty sure that Google already did some kind of checking for this sort of dodge. It could be that the sites in question have found some way to dodge the dodge -- maybe they figured out when a google revisit (with a different user agent) would occur, or maybe they recognize google IP addresses and always give the scammed page regardless of user agent, or some other similar trick.
    
    That's what makes this scary -- as I said, I thought google was already on the lookout for such scams, and if they're being beat on such a large scale it might mean a major shift in google's strategy is in order...
    
    Parent Share
    twitter facebook
    - Re:SEOs (Score:5, Informative)
      
      by Billosaur ( 927319 ) * writes: <wgrother AT optonline DOT net> on Monday October 01, 2007 @09:54AM (#20809541) Journal
      
      It's more than likely related to IP address than user agent. I used to work in web site metrics, and the number of fouled up user agents and spoofs was always staggering, but IP was a pretty good indicator of who was doing something. No doubt the bad guys have tracked the Google bot's IP over a long period of time and perhaps made some correlations to give them a pretty good idea if the site is being revisited by Google under an assumed user agent. I'm not sure, but it would seem to me that Google would have thought of spoofing it's IPs long ago, to avoid people being able to track them, though I can't say how you'd go about that.
      
      Parent Share
      twitter facebook
      - Re: (Score:2)
        
        by Shimmer ( 3036 ) writes:
        
        Google would have thought of spoofing it's IPs long ago, to avoid people being able to track them, though I can't say how you'd go about that.
        
        Easy: Hire a relatively unknown 3rd party to perform the comparison for you.
        
        Re: (Score:2)
        
        by nahdude812 ( 88157 ) * writes:
        
        Or set up relay points with different ISP's - buy a rack or a few U's all around the world running nothing but off the shelf proxy software that only proxies for Google's IP addresses.
      - Re: (Score:2)
        
        by glindsey ( 73730 ) writes:
        
        Yeah, spoofing an IP is easy if you're not looking for a response... but if you're spoofing a request (as a GoogleBot would be doing), where does the response go?
        
        Perhaps Google should create a browser extension -- completely voluntary, of course -- that essentially turns everybody's browsers into a distributed GoogleBot. Of course then they have to deal with malicious nodes poisoning the data, but that could be resolved by having a dozen or so random systems checking the same website and sending their res
      - Re: (Score:2)
        
        by mollymoo ( 202721 ) writes:
        
        I'm not sure, but it would seem to me that Google would have thought of spoofing it's IPs long ago, to avoid people being able to track them, though I can't say how you'd go about that.
        The fundamental problem with spoofing IPs for this kind of work is that you need to use the right IP to get any data back. You need to have real IPs which are 'disposable'. A botnet, in other words. Google could, if they were evil, create the world's largest botnet by getting JavaScript embedded in search results pages or
        
        Re: (Score:2)
        
        by rthille ( 8526 ) writes:
        
        Why not just have the google toolbar compare the page it sees in the end users' browser with that google found when spidering. Very similar to that botnet, but without the nefariousness...
      - Re: (Score:2)
        
        by tknd ( 979052 ) writes:
        
        I'm not sure, but it would seem to me that Google would have thought of spoofing it's IPs long ago, to avoid people being able to track them, though I can't say how you'd go about that.
        
        That's so simple!
        
        Create free "accelerator" application/browser plug-in to gather web site stats.
        
        Distribute application as a beta.
        
        ???
        
        Profit!
  - Re:SEOs (Score:5, Interesting)
    
    by jmagar.com ( 67146 ) writes: on Monday October 01, 2007 @09:42AM (#20809397) Homepage
    
    Google does this already [bbc.co.uk], perhaps not with spiders, or in the way you described. But they do seek out and destroy sites that are caught faking keyword densities and other SEO tactics on crawl pages vs human pages.
    
    Parent Share
    twitter facebook
  - Re:SEOs (Score:5, Insightful)
    
    by Tim C ( 15259 ) writes: on Monday October 01, 2007 @09:44AM (#20809413)
    
    At the very least GoogleBot could check if there are common blacklist words ("viagra" et al) on the website when identifying itself as IE or Firefox.
    
    So medical supply or information websites shouldn't be indexed by Google?
    
    I know what you're trying to do, but no word is 100% inappropriate. What if someone is actually looking for information on Viagra, or replica Swiss watches, or cheap stocks? What if someone is looking for information on spam?
    
    Check for significant differences in content with different user-agents yes, but banned words? That really doesn't seem like a good idea to me.
    
    Parent Share
    twitter facebook
    - Re: (Score:2)
      
      by Bender0x7D1 ( 536254 ) writes:
      
      What if someone is looking for information on spam?
      
      Which spam would that be:
      spam: Unsolicited bulk email.
      Spam: A spiced pork and ham product from Hormel.
      - Re: (Score:2)
        
        by nuzak ( 959558 ) writes:
        
        Gary Theurk's spam was email. Usenet didn't even exist then.
        
        Re: (Score:2)
        
        by Jeremiah Cornelius ( 137 ) * writes:
        
        Academic point.
        
        Gary sent the first unsolicited email in 1978 - to about 400 recipients. It was on a topic that - given the size and constitution of the ArpaNet community - could be reasonably assumed of some interest to the "audience".
        
        The famous Green Card Lottery spam was sent to every available Usenet newsgroup. This was quickly termed "Spam" by the Usenet at large - in reference to its ubiquity, like the Spam in the Python sketch ("Bloody vikings!") This was the first wide-scale, flagrant dismissal of
    - Re:SEOs (Score:4, Insightful)
      
      by glindsey ( 73730 ) writes: on Monday October 01, 2007 @11:27AM (#20810745)
      
      What if someone is actually looking for information on Viagra, or replica Swiss watches, or cheap stocks? What if someone is looking for information on spam?
      That's a good point. But perhaps combinations of keywords would work -- it's pretty unlikely that you'd see "viagra" and "mortgage" on the same site, for example. If you partner this with checking for significant user-agent differences it could become a pretty good tool, I think.
      
      Parent Share
      twitter facebook
      - Re: (Score:2)
        
        by barakn ( 641218 ) writes:
        
        Results 1 - 10 of about 3,010,000 for viagra mortgage. (0.28 seconds)
        
        Re: (Score:2)
        
        by glindsey ( 73730 ) writes:
        
        And browsing through the top results, I see almost every one is either (a) about spam or (b) a spam page of its own. This seems to strengthen my theory, not weaken it. Now, combine that with checking to see if the page hides details when User-Agent = Googlebot (as the pages talking about spam should remain relatively unchanged), and you have a fairly aggressive filtering system.
    - Re: (Score:2)
      
      by PhilHibbs ( 4537 ) writes:
      
      A legitimate medical supplied web site would contain that information both when crawled by Google and when browsed by Firefox or IE. The GP is suggesting that the site might appear innocuous to Google's crawler but be a spam site to any other visitor. Therefore Google should try faking a browser ID and checking the contents produced. However, the problem with this is that some sites allow Google through but require registration from anyone else.
    - Re: (Score:2)
      
      by TeamSPAM ( 166583 ) writes:
      
      Context should matter, but that didn't stop Beaver College [google.com] from changing their name because of porn/child safety filters.
  - Re:SEOs (Score:5, Insightful)
    
    by suv4x4 ( 956391 ) writes: on Monday October 01, 2007 @09:49AM (#20809493)
    
    Which raises the question: Why not have GoogleBot do a check also as a normal user-agent (IE/Firefox/etc.) and see if the page is significantly different than when it identifies itself? At the very least GoogleBot could check if there are common blacklist words ("viagra" et al) on the website when identifying itself as IE or Firefox.
    
    It does. It also detects landing pages mentioned above. Apparently it's something more subtle than what one could think of in few mins on Slashdot, and we'll learn soon enough.
    
    Parent Share
    twitter facebook
    - Re:SEOs (Score:5, Funny)
      
      by colourmyeyes ( 1028804 ) writes: on Monday October 01, 2007 @10:15AM (#20809803)
      
      Apparently it's something more subtle than what one could think of in few mins on Slashdot
      Blasphemy! In my relatively short time lurking on Slashdot, I've seen nearly all the world's problems, including hideously complicated questions of physics, SOLVED in posts no more than a few paragraphs long.
      
      It's amazing, really.
      
      Parent Share
      twitter facebook
    - Re: (Score:2)
      
      by garett_spencley ( 193892 ) writes:
      
      It's a sticky situation / tactic for both Google and it's webmasters.
      
      For example, I have a web site that displays the most recent content for returning visitors and the most popular content for visitors who are visiting my site for the very first time. It's also possible for each user to chose which page to see. This is done to increase productivity on the site and to to increase the likelihood of a new visitor becoming a repeat visitor.
      
      When googlebot visits my page I give it the page with the freshest cont
    - Re: (Score:2)
      
      by glindsey ( 73730 ) writes:
      
      Apparently it's something more subtle than what one could think of in few mins on Slashdot, and we'll learn soon enough.
      Damn. So much for my applying to Google with the bullet point "Solved PageRank spamming problems by posting on Slashdot after thinking for about thirty seconds" on my resumé.
  - Re: (Score:2)
    
    by walt-sjc ( 145127 ) writes:
    
    They should. Google already has guidelines [google.com] that cover this type of behavior. They should enforce them. It's amazing how many sites (including well known sites) violate these guidelines all the time. You would think that Google, with all it's cash (meaning that it can afford to devote the manpower,) would want to improve the quality of their search results, delisting this crap. If they fail to do so, they will start to lose their user base.
    - Re: (Score:2)
      
      by nuzak ( 959558 ) writes:
      
      It's not that Google can't delist the crap when they run across it. It's just much harder to keep it from getting re-indexed immediately after, unless they fix the fundamental weakness that the spammers are exploiting. And the effects of jiggering the ranking algorithm are *very* widespread, and not taken lightly. Google can and has delisted high-profile offenders before (BMW and Ricoh come to mind) but they don't want to have to fight their own processes in playing whack-a-mole with every chickenbone sp
      - Re: (Score:2)
        
        by walt-sjc ( 145127 ) writes:
        
        Delisting means removing. It should NOT be re-indexed EVER, until the site owner agrees in writing to stop the bad behavior. It can work off IP addresses. Domain names are free, but IP addresses are MUCH more limited.
  - Re: (Score:2)
    
    by dargaud ( 518470 ) writes:
    
    I suggested better than this a long time ago: use the IE/Firefox rendering engine completely, and feed the resulting image to an OCR program. This way, anything written on white_on_white, font=1, display:none and other tricks get ignored. Then compare the results. Ditch the site if there's too much difference.
    - Re: (Score:2)
      
      by Solra Bizna ( 716281 ) writes:
      
      The theory is good, but the execution would be horribly complicated, and computationally intensive, and have a very high margin for error. (Computers don't intuit flow as well as humans, for a relatively minor example.)
      
      -:sigma.SB
- Re: (Score:2)
  
  by IBBoard ( 1128019 ) writes:
  
  That's not SEO, that's SEM (Search Engine Manipulation - I've patented that version of the acronym). SEO involves optimising a site rather than making it completely different for normal users is manipulation and 'blackhat' tactics. It would be interesting, if a little off-putting, if someone has successfully scammed Google to such a great extent through simple cloaking.
  
  As for the suggestion of a different user agent, I guess it'd be simple enough to either do a reverse lookup and see if it contains "google"
  - Re: (Score:2)
    
    by Pope ( 17780 ) writes:
    
    BMW did, and they got delisted from Google: http://news.bbc.co.uk/1/hi/technology/4685750.stm [bbc.co.uk]
  - - Re: (Score:2)
      
      by Jeremiah Cornelius ( 137 ) * writes:
      
      Children is learning.
- Re: (Score:2)
  
  by seanyboy ( 587819 ) writes:
  
  I've renamed my user agent to be googlebot.
  Hopefully (don't know if it works), sites like this will give me the correctly indexed information.
Google hacked, sites don't exist, um ... (Score:4, Insightful)

by icepick72 ( 834363 ) writes: on Monday October 01, 2007 @09:28AM (#20809255)

Submitter says Google's index has been hacked which could imply the severe case: direct security threat and entry to it, or more likely: managing to get it to index something Google would not want it to index.
Submitter asks: How did millions of sites get indexed if they don't exist?
Okay, I call this an idiot story. Millions of sites come into being and go out of being all the time. What does this statement have to do with anything? It seems like submitter has a lack of understanding how basic Google and the web work, but the story has made it to Slashdot. I think the Slashdot IQ level is dropping because this is a Digg story.

Share
twitter facebook
- Re: (Score:3, Informative)
  
  by Clandestine_Blaze ( 1019274 ) writes:
  Millions of sites come into being and go out of being all the time. What does this statement have to do with anything? It seems like submitter has a lack of understanding how basic Google and the web work, but the story has made it to Slashdot.
  If you had bothered reading the article, you would have seen:
  
  The .cn sites don't appear to be hosted ANYWHERE. They are simply redirected domain names. How they got ranked in Google in such a short period of time for fairly competitive keywords is a mystery. Google's index even shows legitimate content for the .cn sites.
  It appears that the faked sites are redirecting the Googlebot to a location where content can be indexed, while at the same time recognizing normal users and redirecting them to a site that includes the malware mentioned earlier. This is an obvious violation of Google's guidelines, but the spammers have found ways to circumvent the rule and hide it from the Googlebot.
  Yes, millions of sites do come into being all the time. Had Google indexed a site, and had said-site disappeared before the index was updated, you would simply either hit a landing page (if that domain was purchased but not set-up) or you would get an error message [carrotsticksareyummy.com]
  
  The submitter was referring to instances when a fake redirector is being set-up and tricking the googlebot by sending it to websites with content and keywords while sending normal use
- Re: (Score:2)
  
  by Spy der Mann ( 805235 ) writes:
  
  I think the Slashdot IQ level is dropping because this is a Digg story.
  
  [some guy] [scary] At least you should thank it's not a Fark.com story!
Not hosted anywhere? (Score:3, Informative)

by Vicegrip ( 82853 ) writes: on Monday October 01, 2007 @09:29AM (#20809263) Journal

The article makes the claim that the "hijacked keywords" are going to redirection websites that do not "appear to be hosted anywhere".

That seems a little incredible to me. :)

Invisible, IPless, Chinese web-servers are taking over Google! Personally, I'll just let Google worry about trying to protect its search engines. :)

Share
twitter facebook
- Re:Not hosted anywhere? (Score:5, Interesting)
  
  by IBBoard ( 1128019 ) writes: on Monday October 01, 2007 @09:43AM (#20809411) Homepage
  
  Yeah, I think "not hosted anywhere" is somewhat of a simplification for "actually hosted somewhere but never show any content to a normal user because they redirect you to another domain instead". While it might fly for a complete non-techy, I wouldn't have thought /. would have too many people believing in responses from machines that don't exist.
  
  Parent Share
  twitter facebook
  - Where do all the calculators go when they die? (Score:4, Funny)
    
    by Scrameustache ( 459504 ) writes: on Monday October 01, 2007 @10:32AM (#20810017) Homepage Journal
    
    I wouldn't have thought /. would have too many people believing in responses from machines that don't exist.
    Were getting phantom pings from the ghosts of the still-smoldering servers we slashdotted in our folly!
    I'm scared...
    
    Parent Share
    twitter facebook
    - Re: (Score:2)
      
      by Jonathan_S ( 25407 ) writes:
      
      | I wouldn't have thought /. would have too many people believing in responses from machines that don't exist.
      
      Were getting phantom pings from the ghosts of the still-smoldering servers we slashdotted in our folly!
      I'm scared...
      But the good news is that you aren't getting them anymore.
      - Re: (Score:2)
        
        by Scrameustache ( 459504 ) writes:
        
        Were getting phantom pings from the ghosts of the still-smoldering servers we slashdotted in our folly!
        I'm scared...
        But the good news is that you aren't getting them anymore. You grammar nazis should give us a break on monday mornings : )
  - Re: (Score:2)
    
    by rk ( 6314 ) writes:
    
    Maybe someone dropped a logic bomb through the trap door.
- Re:Not hosted anywhere? (Score:5, Funny)
  
  by TheRaven64 ( 641858 ) writes: on Monday October 01, 2007 @10:04AM (#20809653) Journal
  
  Those of us on Internet 3.0, Quantum Edition, have this problem all the time. Quoogle indexes sites without collapsing their wave functions. When you click on a link, the waveform collapses and the server may or may not exist. Web spiders are therefore being replaced by cats [thecheezbu...actory.com].
  
  Parent Share
  twitter facebook
  - Re: (Score:2)
    
    by mgblst ( 80109 ) writes:
    
    I know you are trying to be funny, but how can google index a site without collapsing its wave function? That would go against all quantum theory, wouldn't it?
    - Re: (Score:2)
      
      by John Hasler ( 414242 ) writes:
      
      > I know you are trying to be funny, but how can google index a site without collapsing its
      > wave function?
      
      The Googlebot is not an "observer".
      
      > That would go against all quantum theory, wouldn't it?
      
      It would "go against" the Copenhagen interpretation.
specific phrases? (Score:5, Interesting)

by rubberglove ( 1066394 ) writes: on Monday October 01, 2007 @09:43AM (#20809399)

The story would be more interesting if it included an example hijacked search phrase.
I'd like to check it out myself.

Share
twitter facebook
- Drivers (Score:2)
  
  by headkase ( 533448 ) writes:
  
  Try to search for a driver - any driver! I've run into many pages that require 'registration to download' them. And of course registering costs bucks so its a scam.
- Re: (Score:3, Informative)
  
  by wbean ( 222522 ) writes:
  
  There's a sample search phrase posted in the comments to the original blog entry. It produced a lot of funny .cn results for me. Here it is:
  
  Bayesian networks and decision graphs Finn rapidshare
Google's Algorithm (Score:2)

by Midnight Thunder ( 17205 ) writes:

Two problems I see are:
- Sites offering one content to Google and another to users. This is indeed something that Google frowns on, but not something that seems to be in place to be tested by the spider.
- Google's fame comes from their PageRank algorithm and unfortunately people now know how to game the results. If Google were to implement multiple algorithms then users could indicate which search type the wish to use. While it certainly makes thing more complicated for Google, it also makes
Wait and see. (Score:5, Insightful)

by eniac42 ( 1144799 ) writes: on Monday October 01, 2007 @09:48AM (#20809469) Journal

People, its just a blog. If someone has really hacked Google, we will hear soon enough. Otherwise scamming and spoofing the ratings with rubbish sites is a sport thats been going on a long, long time..

Share
twitter facebook
- Re:Wait and see. (Score:5, Insightful)
  
  by tbannist ( 230135 ) writes: on Monday October 01, 2007 @10:37AM (#20810079)
  
  Actually, it's worse than that. It's a blog that can't provide any actual evidence that anything they claim is true. As far as we know, the entire story is bogus because the blogger has provided nothing to prove that any of his claims are true.
  
  Parent Share
  twitter facebook
Ironic side link (Score:2)

by IBBoard ( 1128019 ) writes:

Oh, the irony. We have a /. story talking about spammers exploiting Google, and what side link do we get?
Compare prices on Spam Software

I wonder whether some of the software lets you spam Google's listings easily? Perhaps that's how it was achieved?
- - Re: (Score:2)
    
    by IBBoard ( 1128019 ) writes:
    
    It's an irony because the article is going "oh no, bad spammers" and the advert is going "buy your cheap software and become a big bad spammer here!"
Horrible solution... (Score:2)

by SanityInAnarchy ( 655584 ) writes:

TFA suggests that if you want to search actual Chinese sites, you should use google.cn, not google.com.

Erm... no, bad idea. Maybe google.cn won't have the same spam, maybe it will, but it most certainly is censored for other reasons as well. (Unless they've stopped doing this and I've completely missed the news -- there is one tank man on the first page of a google.cn image search for "tiananmen square", compared with almost the entire first page being tank men on google.com.)

And maybe a good suggestion to
Let me tell you how it happened (Score:2)

by unity100 ( 970058 ) writes:

Spam sites had been indexed before the provider learned about spamming and pulled the plug on the sites.
- Re: (Score:2)
  
  by walt-sjc ( 145127 ) writes:
  
  However, anything with a high pagerank (early in the results) should have more scrutiny by google, and be de-listed quickly. Frankly, I find search engine spam worse than email spam. I can easily filter email spam, but search engine spam is MUCH more difficult since you frequently can't tell if a result is spam without visiting the spam site.
Nutcase conspiracy theory adopters web2.0 version (Score:2, Insightful)

by georgeb ( 472989 ) writes:

Quotes:

"Some searches (very specific phrases, and I won't list any of them right now - Google knows which they are) return results with a large number of .cn (Chinese) sites."

"The .cn sites don't appear to be hosted ANYWHERE." (wow!)

"[...] the Word-Confirm on all of their sites, including the one I will have to use to post this, generate a large number of rogue responses, and the HELPDESK facilities with thousands of consoles and employees each all over the planet watch the responses and other traffic chara
Sure it's not his browser that's porked? (Score:2, Interesting)

by AskChopper ( 1077519 ) * writes:

I think he needs to run AdAware. Seriously.. I've entered a bunch of the usual suspects into google trying to find these hordes of .cn sites that pop up. No joy yet.. Anyone else found one?
Google is working on this ... (Score:4, Informative)

by miller60 ( 554835 ) writes: on Monday October 01, 2007 @10:06AM (#20809671) Homepage

Back in May Google launched on online security blog [blogspot.com] as part of a broader effort to detect malware sites, presumably to exclude them from the SERP results. They're clearly behind the curve. But this post [blogspot.com] offers an overview of Google's efforts and ambitions in this area.

Share
twitter facebook
Simple way to eliminate pharmaceutical spam (Score:3, Funny)

by Alzheimers ( 467217 ) writes: on Monday October 01, 2007 @10:08AM (#20809725)

Free universal health care

Share
twitter facebook
- Re: (Score:2)
  
  by p0tat03 ( 985078 ) writes:
  
  Funny, I live in Canada and I still get lots of pharma spam. That being said, it's usually in the viagra/cialis category...
- - Re: (Score:2)
    
    by cdrguru ( 88047 ) writes:
    
    The elderly would suffer the most from a universal healthcare system in the US. Right now, most healthcare spending is in the last year or so of life. There is also a spike for newborns, mostly preemies and drug-addicted babies.
    
    If we get universal healthcare in the US, we're going to have to have an age-based cutoff like other countries do. Sorry, no treatment if you are over 70 or something like that. I don't see the AARP and similar groups going for that and they are a pretty substantial voting block.
What hijacked phrases? Not seeing this. (Score:5, Informative)

by Animats ( 122034 ) writes: on Monday October 01, 2007 @10:52AM (#20810289) Homepage
I'm not seeing any of this. I'm trying commonly spammed phrases in Google, and seeing nothing unusual.
- "digital camera" - OK
- "ink cartridge" - OK
- "flat screen TV" - PCworld at the top
- "auto parts" - OK
- "london hotels" - usual results
- "britney spears" - usual results
- "viagra" - Pfizer, Wikipedia, etc.
- "rebelde" (the Mexican telenovela, one of the top ten searches) - normal
Not one .cn site in the top 10 for any of these.
Share
twitter facebook
Search Engine Pessimisation (Score:2, Insightful)

by ajs318 ( 655362 ) writes:

Worse, I think, is the act of spamming blogs with links. The theory is that, the more links there are pointing to a website, the more popular it must be; so, by using commonly-available, spam-advertised commercial software to pollute blogs with links unrelated to the subject matter, webmasters imagine they can improve their ranking without paying baksheesh to the search engine companies.

I have had an idea for a hack to WordPress, which will make all links invisible to GoogleBot (and maybe the other sear
meta refresh (Score:2)

by fyoder ( 857358 ) writes:

I read the story with interest as something like this happened to me the other day. It didn't even occur to me that Google had been hacked. I figured the original site had been compromised. A hacked web site can be defaced for shits and giggles, obviously, but it could also have a meta refresh tag added to send the browser off to wherever the defacer wants. With the security hole history of most CMS systems out there, I'm surprised that doesn't happen more often.

It looks like Firefox 3 will allow disab [diveintomark.org]
lot's of dead .cn domains (Score:2)

by cyberworm ( 710231 ) writes:

I was noticing something similar to this earlier. There were quite a few domain names ending in .cn. Seemed mostly like junk domain names, but were very odd for ending in .cn
What is up with images? They being abused too? (Score:3)

by hurfy ( 735314 ) writes: on Monday October 01, 2007 @07:10PM (#20817359)

I just did an image search and forgot a space. I got a lot of bizarre results, a large number of odd ones come from .hu

I searched on Opel Manta but forgot the space. With it i got many matches very little junk in 1st 10 pages. Without a space i got weird results starting on 1st page. What does a car name have to do with a naked chick with a Nokia phone? Mud wrestlers? Homer Simpson? Paris Hilton? Dozens and dozens of unrelated pictures it seems.

Spyware is off ATM so i didn't get any farther than that.

Share
twitter facebook
- I call Bullshit!!! (Score:4, Insightful)
  
  by Jennifer York ( 1021509 ) writes: on Monday October 01, 2007 @09:37AM (#20809343) Homepage
  
  Any evidence to back that up? I seriously doubt that a single individual has the ability to make a change on production boxes without a committee of senior managers approving the change.
  Google will adjust, find the method of manipulating the page ranks, and close the hole.
  
  Parent Share
  twitter facebook
  - Re:I call Bullshit!!! (Score:5, Insightful)
    
    by Billosaur ( 927319 ) * writes: <wgrother AT optonline DOT net> on Monday October 01, 2007 @09:46AM (#20809443) Journal
    
    It may not be a question of a single developer making changes, as much as a single developer (or group of them -- safety in numbers) divulging to certain third parties how the algorithms work in the page ranking system. It's very rare any company gives anyone production access to make changes, but then again I've seen that happen too, where something breaks, they give a developer access to patch it in a hurry before the hew and outcry set in, then forget to revoke his/her access. Of course Google is global, so any change would have to propagate through the system vis source control, so tracking it wouldn't be that hard. I doubt any developer, no matter how nefarious, would take the risk.
    
    Parent Share
    twitter facebook
    - Re: (Score:3, Interesting)
      
      by Metasquares ( 555685 ) writes:
      
      They've had people working on their algorithms for quite some time now. I doubt it's in the state where it's something you can just give away all at once... or precisely target, for that matter. It's probably hundreds of thousands of lines of code by now, if not more. They should have systems in place to notify them when that much data is copied at once.
      
      Still waiting for them to allow weighting of search terms, though :)
  - Re: (Score:3, Insightful)
    
    by zymano ( 581466 ) writes:
    
    No it's not. Whenever you ask just a computer program to weed out spam , it will always be outwitted by average human intelligence.
    
    There are websites strictly devoted to google ranking.
    
    Let me add this about Google. The google corporation really isn't 100% innovative. Their search uses common links to rank. This has led to evolution of the spammers. They load their pages with links to spam. So my point to slashdot is......
    
    If google is so damn loaded with money and that their search tech uses common user l
- I Bet It's a Simpler Explanation (Score:5, Interesting)
  
  by eldavojohn ( 898314 ) * writes: <eldavojohn&gmail,com> on Monday October 01, 2007 @09:37AM (#20809345) Journal
  
  Google is susceptible to an erosion of moral tenacity, just like any other corporation.
  This would be far more interesting but the sad fact is that it's probably the simplest explanation: spammers are merely more sophisticated. I mean, a while ago a few people teamed up to Google bomb Bush as a "miserable failure" [wikipedia.org] and it worked. They exploited Google's page ranking system. It's pretty easy to exploit because they patented it so you merely need to read the patent [uspto.gov]. From there you get an idea of how to exploit it.
  
  I imagine that spammers could band together or simply get botnets 'clicking' as independent IP addresses links that boost their page rank. That's how it worked with Bush, they simply linked his homepage as "miserable failure" and suddenly he was the number one result from that query in Google.
  
  I find this more likely an explanation than someone changing the data or values in the database. There's going to be plenty of evidence left in the logs & it's not like nobody's going to notice. This is Google's bread & butter, no amount of money in the world could entice a worker to mess with it. They would have to be exceptionally stupid as the lawsuits that follow would be in the billions.
  
  Parent Share
  twitter facebook
  - Re: (Score:3, Informative)
    
    by suv4x4 ( 956391 ) writes:
    
    I imagine that spammers could band together or simply get botnets 'clicking' as independent IP addresses links that boost their page rank. That's how it worked with Bush, they simply linked his homepage as "miserable failure" and suddenly he was the number one result from that query in Google.
    
    I like your post, but Google can't detect if you "click" a link. It doesn't need botnets to click links from different IP addresses.
    
    It just needs the mere *presence* of those links, with the same text, to the same page
    - Re: (Score:2)
      
      by Arthur B. ( 806360 ) writes:
      
      Hum, google can and does detect if you click a link, link in the google result page are redirections, look closer.
      - Re: (Score:3, Informative)
        
        by suv4x4 ( 956391 ) writes:
        
        We're not talking about the results page, but just links. In sites separate from Google.
        
        Re: (Score:3, Insightful)
        
        by Arthur B. ( 806360 ) writes:
        
        Unless the sites happen to have google ads...
        
        Re:I Bet It's a Simpler Explanation (Score:4, Interesting)
        
        by nahdude812 ( 88157 ) * writes: on Monday October 01, 2007 @12:07PM (#20811269) Homepage
        
        Or Google Analytics.
        
        Parent Share
        twitter facebook
      - google-analytics.com (Score:3, Insightful)
        
        by TFGeditor ( 737839 ) writes:
        
        Has anyone ever looked into how google-analytics.com (formerly Urchin) works? This blogger http://labnol.blogspot.com/2005/11/prevent-google-analytics-from-tracking.html [blogspot.com] gives a bit of info--and it does not appear to comply with the Google "do no evil" mantra.
        
        Re: (Score:2)
        
        by ajs318 ( 655362 ) writes:
        
        For me, it doesn't work; because google-analytics.com is in my /etc/hosts file pointing to 127.0.0.1.
        
        Re: (Score:2, Insightful)
        
        by nicolastheadept ( 930317 ) writes:
        
        And how is a webmaster simply monitoring what people click on evil? Just because you may be paranoid, it doesn't make Google evil.
  - Re: (Score:2)
    
    by xeoron ( 639412 ) writes:
    
    I know Google has pointed me to various spam filled project pages on sourceforge... hopefully Slashdot's parent company is doing something about it.
    - Re: (Score:2)
      
      by sconeu ( 64226 ) writes:
      
      There's spam/ads on sourceforge? Never seen it... Oh wait... I use AdBlock and FlashBlock.
      - Re: (Score:2)
        
        by PingXao ( 153057 ) writes:
        
        They recently teamed up with DoubleClick, which is still evil and I don't care who owns it. They also made significant changes to their privacy policy which I don't like. Don't think it was ever covered here at /. which is not surprising given the ownership situation.
  - Re: (Score:2)
    
    by gaspyy ( 514539 ) writes:
    
    Please RTFA. The cases mentioned violate the normal guidelines so blatantly that it's hard to imagine how they got through. Anyone who's done a bit of SEO knows that a stunt like this is nearly impossible to pull off.
  - Re: (Score:2)
    
    by mcguyver ( 589810 ) writes:
    
    FYI, the miserable failure google bomb, and others like it, were killed a few months ago. More spam will invade the system and Google will again figure out a way to combat it. It's an arms race between search engines and spammers that's never going to go away and fortunately Google does a decent job at fixing spam.
- Re: (Score:2)
  
  by Nintendork ( 411169 ) writes:
  
  Uh-hu. Right. My theory is that Bill Gates, George Bush Sr. and Hitler (Now undead) teamed up to thwart Google. Anyone else have a fun theory to throw out there and get modded up for? Apparently, any theory that's entertaining will do!
  - Re: (Score:2)
    
    by nuzak ( 959558 ) writes:
    
    > My theory is that Bill Gates, George Bush Sr. and Hitler (Now undead) teamed up to thwart Google.
    
    Clearly the Hitler they're in league with is none other than the vile Space Hitler. Someone call in Good Hitler!
- Re: (Score:2)
  
  by ajs ( 35943 ) writes:
  
  Google is susceptible to an erosion of moral tenacity, just like any other corporation.
  Vague and unsubstantiated claim noted.
  Someone from within has given the keys to someone who has paid a lot of money to get them.
  Why would this kind of left-field claim with absolutely no facts to back it up be modded up? This is pure rumor mongering, and even for Slashdot this is a rather crude bit of slander.
- - Re: (Score:2, Insightful)
    
    by onepoint ( 301486 ) writes:
    
    well for those of us whom deal with Google as their lively hood ( I currently run PPC campaigns and do some SEO work on my web sites ), this was a problem.
    
    I spent the better part of a afternoon about 2 weeks ago, submitting my searches to Google asking them too look at these sites.
    
    they were under my key word group and it was driving me nut's.
- Re: (Score:2)
  
  by Dachannien ( 617929 ) writes:
  
  This should be fairly easy for Google to get around, by re-requesting pages within a short time frame using, say, the IE user-agent string, perhaps from a different IP address. If the pages come up hugely different, toss the page out of the index altogether.
  - Re: (Score:2)
    
    by account_deleted ( 4530225 ) writes:
    
    Comment removed based on user account deletion
  - Re: (Score:2)
    
    by tbannist ( 230135 ) writes:
    
    As someone pointed out above, they actually do that, but it seems someone has managed to figure out the alternate IP addresses that they use to verify the search engine results and spoofed those as well.
    
    It could be an ex-employee (either fired, quit, or possibly a contractor) who's sold the information to some black hats, or it could be any number of other things. There's money to be made by subverting Google's index, so you have to know that there are people working on ways to do so all the time.
- Re: (Score:2)
  
  by FooAtWFU ( 699187 ) writes:
  
  The sites could show one content to Googlebot and another to normal visitors.
  Or it could be tricky. Offer the same text/html content, but make part of the content User-Obvious / Bot-Invisible content (images or something thrown together with JavaScript) and downplay or hide the Bot-Obvious content with tricky style sheets or more JavaScript (or just put in a bunch of newlines so it's way down the page). Ultimately it becomes some sort of weird Turing test for Google to be able to detect this sort of stuff.
- Re: (Score:2)
  
  by TractorBarry ( 788340 ) writes:
  
  Try this
  
  http://www.givemebackmygoogle.com/ [givemebackmygoogle.com]
  
  It's not perfect as you can't customise the block list but it's a start. Even better make your own version to run on localhost so you can have your own block list etc.
- Re: (Score:2)
  
  by VGPowerlord ( 621254 ) writes:
  
  One trick that works a lot of the time is to not visit a page that has no link to a Google Cache in the index results.
  
  Most legitimate sites don't put the code to disable Google's Cache option, but most of the spam sites do for some reason...
- This Finding was Validated (Score:3, Interesting)
  
  by Jeremiah Cornelius ( 137 ) * writes:
  
  and commented on by Dvorak. (God, did I just say that he confirmed anything!?!)
  http://www.pcmag.com/article2/0,1895,2188281,00.asp [pcmag.com]
  
  Also, the Reg noticed - after my Slashdot posting, for once - so they are chasing this tail!
  http://www.theregister.co.uk/2007/10/01/google_spam_infiltration/ [theregister.co.uk]
  
  Wheee!

There may be more comments in this discussion. Without JavaScript enabled, you might want to turn on Classic Discussion System in your preferences instead.

It's the Rand Corporation (Score:3, Funny)

Google index hacked? (Score:5, Funny)

Re: (Score:2)

Re: (Score:2)

SEOs (Score:5, Informative)

Re:SEOs (Score:5, Interesting)

Re:SEOs (Score:4, Interesting)

Re:SEOs (Score:5, Informative)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re:SEOs (Score:5, Interesting)

Re:SEOs (Score:5, Insightful)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re:SEOs (Score:4, Insightful)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re:SEOs (Score:5, Insightful)

Re:SEOs (Score:5, Funny)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Google hacked, sites don't exist, um ... (Score:4, Insightful)

Re: (Score:3, Informative)

Re: (Score:2)

Not hosted anywhere? (Score:3, Informative)

Re:Not hosted anywhere? (Score:5, Interesting)

Where do all the calculators go when they die? (Score:4, Funny)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re:Not hosted anywhere? (Score:5, Funny)

Re: (Score:2)

Re: (Score:2)

specific phrases? (Score:5, Interesting)

Drivers (Score:2)

Re: (Score:3, Informative)

Google's Algorithm (Score:2)

Wait and see. (Score:5, Insightful)

Re:Wait and see. (Score:5, Insightful)

Ironic side link (Score:2)

Re: (Score:2)

Horrible solution... (Score:2)

Let me tell you how it happened (Score:2)

Re: (Score:2)

Nutcase conspiracy theory adopters web2.0 version (Score:2, Insightful)

Sure it's not his browser that's porked? (Score:2, Interesting)

Google is working on this ... (Score:4, Informative)

Simple way to eliminate pharmaceutical spam (Score:3, Funny)

Re: (Score:2)

Re: (Score:2)

What hijacked phrases? Not seeing this. (Score:5, Informative)

Search Engine Pessimisation (Score:2, Insightful)

meta refresh (Score:2)

lot's of dead .cn domains (Score:2)

What is up with images? They being abused too? (Score:3)

I call Bullshit!!! (Score:4, Insightful)

Re:I call Bullshit!!! (Score:5, Insightful)

Re: (Score:3, Interesting)

Re: (Score:3, Insightful)

I Bet It's a Simpler Explanation (Score:5, Interesting)

Re: (Score:3, Informative)

Re: (Score:2)

Re: (Score:3, Informative)

Re: (Score:3, Insightful)