El Reg Says Google Choking on Spam Sites
Grubby Games writes "The Register is reporting that Google is full, and in trouble." From the article: "Recently, we featured a software tool that can create 100 Blogger weblogs in 24 minutes, called Blog Mass Installer. A subterranean industry of sites providing 'private label articles,' or PLAs exists to flesh out 'content' for these freshly minted sites. And as a result, legitimate sites are often caught in the cross fire. But the new algorithms may not be solely to blame. Google's chief executive Eric Schmidt has hinted at another reason for the recent chaos. In Google's earnings conference call last month, Schmidt was frank about the extent of the problem. 'Those machines are full,' he said. 'We have a huge machine crisis.'" James Robertson points out that's a fairly selective bit of quoting.
Everyone - Attention (Score:5, Funny)
Thanks!
Re:Everyone - Attention (Score:5, Funny)
- Andrew
Re:Everyone - Attention (Score:2)
Re:Everyone - Attention (Score:3, Insightful)
Re:Everyone - Attention (Score:5, Funny)
myspace.com
Problem solved.
Re:Everyone - Attention (Score:2, Informative)
Maybe you should let your own little personal prejudices slide a bit. MySpace isn't the great Internet evil, you know.
Re:Everyone - Attention (Score:5, Insightful)
Re:Everyone - Attention (Score:2)
Maybe you should let your own little personal prejudices slide a bit. MySpace isn't the great Internet evil, you know.
joe-baldwin.net's MySpace profile [myspace.com] complete with auto-playing song.
We all have bias, you know. :)
Re:Everyone - Attention (Score:2)
Re:Everyone - Attention (Score:2)
Link?
Re:Everyone - Attention (Score:2)
Link?
Surprisingly enough it's http://www.myspace.com/sp0rk173 [myspace.com].
Re:Everyone - Attention (Score:2)
Re:Everyone - Attention (Score:2)
Reported the issue and blammo... content erased...
Which means someone out there does listen.
Re:Everyone - Attention (Score:2)
MySpace is a useful way of tracking all 9738 of your online friends. Really, would you be able to do that on your own?
Re:Everyone - Attention (Score:2)
And what exactly is so "poorly configured" about allowing anonymous commenting? Some of the best comments come from anonymous users. (Not that anyone in WordPress is really validated.) WordPress not only makes anon commenting easy, but it automatically catches many types of spam posts (especially link-spam) and requires that they be approved before going live.
Re:Everyone - Attention (Score:2)
Yes, Blogger and about.com are among those I would filter out indefinitely.
Re:Everyone - Attention (Score:2)
Re:Everyone - Attention (Score:3, Funny)
Emo sucks.
If you use Myspace you are emo.
If you use Myspace you suck.
Re:Everyone - Attention (Score:2)
Re:Everyone - Attention (Score:2, Funny)
Re:Everyone - Attention (Score:5, Funny)
Could someone do a quick backup first? There might be something on the internet that I might need later. I think you can just use Ghost or whatever you IT guys do. Also, please burn it to CD and have it on my desk by COB today.
-Executive Chief Officer SydBarrett
SQL Solution (Score:3, Funny)
where lower(page_text) like '% beastiality%'
or lower(page_text) like '% lose weight%'
or lower(page_text) like '% refinance%'
or lower(page_text) like '% ebay%'
or lower(page_text) like '% make money fast%'
or lower(page_text) like '% enlarge your%'
or lower(page_text) like '% teens%';
commit;
Re:Everyone - Attention (Score:3, Funny)
Time for a format (Score:2)
Meanwhile we can decide which websites are no longer needed and not bother reinstalling them, because they're crap anyway, take up space, and are hard to remove.
Re:Time for a format (Score:2)
How accurate is the Register Article? (Score:5, Informative)
With hardware (and bandwidth) getting cheaper, I find it hard to believe that Google has actually run out of space. But certainly the explosion in the number of web pages is an issue, especially with auto-generated pages. One current example is the V7ndotcom Elursrebmem SEO contest [watching-paint-dry.com] (white-hat celiac charity site I'm supporting) - that nonsense phrase returned zero results on January 15th, 2006 ... but now returns almost 5,000,000 ... of which I gotta believe the vast majority were NOT typed in by humans.
So maybe it's more that the techniques/algorithms used to spider and index are struggling with the bazillions of web pages out there. Or it could just be disgruntled webmasters PO'ed that their web site isn't listed!
Re:How accurate is the Register Article? (Score:2)
Re:How accurate is the Register Article? (Score:5, Informative)
Be warned.
Re:How accurate is the Register Article? (Score:3, Informative)
I don't know what his problem is, perhaps he just needs pageviews for the advertisers. So: write knocking article about popular website, fans of the website look, pageviews escalate.
Google -- check.
Wikipedia -- Check
Slashdot -- ?
(The captcha word for this submission was "referral". How do they do that?)
Re:How accurate is the Register Article? (Score:2)
Yes, that's him. I didn't mention it because I couldn't remember off-hand what his other phobia was (it's late in my time zone).
He's okay when he's not doing opinion pieces, though.
Re:How accurate is the Register Article? (Score:2)
With Orlowski, every piece is an opinion piece.
The guy's shameless. For pete's sake, he links to the source of the quote he twisted, which makes it clear it has been twisted.
He's either got no concern for truth, or has no ability to discern it.
Wait a minute! (Score:2)
Re:How accurate is the Register Article? (Score:2)
I thank my lucky stars every day that we have a news reporting medium where people who are spewing bullshit are swiftly called out on it.
Damn right.
Are you using Google? (Score:2)
Let's not even talk about the spam pages. I've emailed suggestions, for instance banning domains that use JavaScript redirects -- you know, you see an SEO page with JavaScript off and the porno page with it on. No legit site shunts visitors off to third-party sites with zero delay.
I've also suggested a Slashdot ty
Re:How accurate is the Register Article? (Score:5, Informative)
Re:How accurate is the Register Article? (Score:2)
Re:How accurate is the Register Article? (Score:2)
They wear their bias on their sleeves, which, in my opinion, is a good thing, because you know the type of slant that's on what you're reading, and nobody claims to be "Fair and Balanced" when they're anything but.
Re:How accurate is the Register Article? (Score:2)
Anyway, the point I wanted to make is that bias is one thing, but distortion is quite another. When one's bias leads one to gross distortions, then there's a problem. I think that's what the poster was getting at.
Re:How accurate is the Register Article? (Score:2)
Re:How accurate is the Register Article? (Score:2)
Is this in competition with the guy whose girlfriend will have a threesome with him if his blog gets a million hits? If so, I've got to throw in with him. Sorry, sick kids but when you get older, you'll understand.
Anyway, now that the link has finally loaded while I was writing the above -- OK, the quote is out of context but it's not that out of context. (Certainly not by /. standards for "out of co
Cheap hardware works both ways (Score:2)
Surely, Google isn't the only one to take advantage of cheap hardware? According to Netcraft [netcraft.com] the internet doubled in size in the last three years, increasing by 3.1 million new hostnames in April 2006 alone.
Re:How accurate is the Register Article? (Score:2)
No, but I think it has run out of people to install and manage all that space. Apparently hires into "Reliability Systems Engineering" (or whatever Google calls their system admin group) are one of the hottest areas for Google right now.
Re:How accurate is the Register Article? (Score:2)
Google is Full!? (Score:5, Funny)
Are you sure? (Score:2)
Re:Are you sure? (Score:3, Funny)
Re:Are you sure? (Score:2)
Spammer jokes (Score:3, Funny)
So what do you have when you push 50% of all the spammers in the world into a hole and bury them? A good start.
Did you know that if you took all the spammers in the world and lined them up end to end around the equator of the earth that two thirds of them would drown?
Re:Spammer jokes (Score:5, Funny)
more internet space (Score:5, Funny)
I saw one at bestbuy.com that looks pretty good.
Adsense is to blame (Score:5, Insightful)
There are people who are literally making $10,000 or more per month just putting up junk content sites that are auto generated for the purpose of creating adsense revenue.
Don't get me wrong, I think adsense is a good thing, but Google's allowance of spam sites is giving adsense a bad name.
Re:Adsense is to blame (Score:4, Interesting)
Banner ads were taking the same path. If anything, we should thank google for making internet advertising less intrusive.
Re:Adsense is to blame (Score:2)
Re:Adsense is to blame (Score:5, Funny)
Re:Adsense is to blame (Score:2)
Google is fighting that war fairly well with their new smart pricing system (in AdSense) but I would much prefer to see an option for publishers to opt-in to a better AdSense program that offers possibly better income if new content
Re:Adsense is to blame (Score:3, Insightful)
I believe this is all an unintentional consequence of AdSense. I'm sure the people at Google knew some of this would happen, but probably not to this extent.
The Reg MIght Be On To Something (Score:3, Informative)
I know the GoogleBot indexes the site almost every day. Yet, while one of my sites is completely out of date (the Cache is from 2005), another is almost completely up to date.
Google's got problems.
PageRank is everything to Google... (Score:2)
So the site that gets updated has links to it that Google thinks are good, and the site that doesn't get updated doesn't have good linkage. That is to say, if it would come up at the top of the list in a Google search, it gets scanned more often, but if it would come up on page 32 of 32, it gets scanned very very rarely.
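To make that guess concrete, here is a rough sketch of rank-driven crawl scheduling. Everything in it is an assumption for illustration: the function name, the 0-to-1 score, and the 1-to-90-day range are made up, and nothing here is known about how Google actually schedules its crawler.

```python
def recrawl_interval_days(rank_score, min_days=1.0, max_days=90.0):
    """Map a link-based score in [0, 1] to a revisit interval in days.

    Hypothetical model: a score of 1.0 means revisit every min_days,
    a score of 0.0 means revisit every max_days.
    """
    # Clamp the score so out-of-range inputs cannot produce odd intervals.
    score = min(max(rank_score, 0.0), 1.0)
    # Linear interpolation between the two extremes.
    return max_days - score * (max_days - min_days)


# Two made-up sites: one with good inbound links, one buried on "page 32".
sites = {"well-linked.example": 0.9, "page-32.example": 0.05}

# Sites with the shortest interval get crawled first.
schedule = sorted(sites, key=lambda name: recrawl_interval_days(sites[name]))
```

Under this toy model the well-linked site comes back around every ten days or so, while the poorly linked one waits close to three months, which would look a lot like a cache stuck in last year.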
How Google crawls a site (Score:5, Interesting)
Re:How Google crawls a site (Score:2)
Before you get too scared.... (Score:3, Funny)
I've heard of the user being ignorant... (Score:3, Interesting)
Eh, or I could be completely off my rocker, and just not competent enough to see a simple and effective method of combating these guys.
Re:I've heard of the user being ignorant... (Score:2)
Fud Light (Score:2, Interesting)
Google Indexing (Score:5, Funny)
Re:Google Indexing (Score:2)
Re:Google Indexing (Score:2)
Don't forget the modifiers NEAR, FAR, and HUGE.
Enjoy,
Re:Google Indexing (Score:2)
I wonder how many people even here on Slashdot remember the real mode memory "model"
Anyway, 40Gi pages ought to be enough for everyone
(40Gi is to 64Gi what 640Ki is to 1Mi if anyone wonders...)
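For anyone who wants to check the ratio, a quick sketch of the arithmetic (the variable names are just for illustration):

```python
# Binary prefixes: Ki = 2**10, Mi = 2**20, Gi = 2**30.
KI, MI, GI = 2**10, 2**20, 2**30

# 640Ki of conventional memory out of a 1Mi address space.
old_ratio = (640 * KI) / (1 * MI)

# 40Gi pages out of a 64Gi space.
new_ratio = (40 * GI) / (64 * GI)

print(old_ratio, new_ratio)
```

Both come out to five eighths.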
Re:Google Indexing (Score:3, Funny)
Re:Google Indexing (Score:2)
No, not shifted farther to the left -- it's used as an index into a table of base addresses. Silly!
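Assuming this is about x86 real mode versus protected mode, the difference can be sketched roughly as follows. The function names are made up, and the descriptor table is simplified to a plain list of base addresses; real descriptors also carry limits and access bits, which are ignored here.

```python
def real_mode_address(segment, offset):
    """Real mode: shift the 16-bit segment left by 4 bits and add the offset.

    The result is masked to 20 bits, which models the wraparound of the
    original 8086 (an assumption; later CPUs with A20 enabled do not wrap).
    """
    return ((segment << 4) + offset) & 0xFFFFF


def protected_mode_address(selector, offset, descriptor_table):
    """Protected mode, simplified: the selector indexes a table of bases.

    The low 3 bits of a selector hold the table indicator and privilege
    level, so the table index is the selector shifted right by 3.
    """
    index = selector >> 3
    base = descriptor_table[index]
    return base + offset


# Classic text-mode video memory: segment 0xB800, offset 0.
video = real_mode_address(0xB800, 0x0000)

# Hypothetical table: entry 0 is the null entry, entry 1 has base 0x100000.
table = [0x0, 0x100000]
linear = protected_mode_address(0x08, 0x10, table)
```

So in real mode the segment value is part of the arithmetic, while in protected mode it is only a lookup key.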
Right (Score:3)
Re:Right (Score:2)
Really, though, slashdot is addicted to trolls and flamebaits.
There is an obvious solution (Score:3, Funny)
(Slightly OT): Scout sign/salute? (Score:2)
One idea? (Score:5, Insightful)
Well given that a human would have a hard time deciding if the page was autogen'ed if the text was in their second language, this *is* quite an issue.
So it sounds like Google needs to *shudder* have a user feedback system where humans with logins add moderation metadata to the search results and in return get results based on this moderation en masse.
I know what you're thinking,
It would withstand abuse since a massive amount of human-inputted data would keep spambots from trying to exploit the moderation system. What's more, their toolbar could incorporate the control to flag a page as autogen'ed garbage.
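A minimal sketch of what tallying those flags might look like, assuming one vote per logged-in user per URL. The function name, the vote format, and the thresholds are all hypothetical, and this does nothing about the harder problem of fake accounts.

```python
from collections import defaultdict


def flagged_pages(votes, min_votes=3, threshold=0.7):
    """Return the URLs that enough distinct users marked as garbage.

    votes: iterable of (user, url, is_garbage) tuples (hypothetical format).
    A URL is flagged when at least min_votes distinct users voted on it and
    at least `threshold` of those votes say it is auto-generated garbage.
    """
    # Keep only the latest vote per (user, url) so repeat clicks do not count.
    latest = {}
    for user, url, is_garbage in votes:
        latest[(user, url)] = is_garbage

    # url -> [garbage votes, total votes]
    tallies = defaultdict(lambda: [0, 0])
    for (_user, url), is_garbage in latest.items():
        tallies[url][1] += 1
        if is_garbage:
            tallies[url][0] += 1

    return {
        url
        for url, (garbage, total) in tallies.items()
        if total >= min_votes and garbage / total >= threshold
    }


sample_votes = [
    ("a", "spam.example", True),
    ("b", "spam.example", True),
    ("c", "spam.example", True),
    ("a", "spam.example", True),   # repeat vote from the same user, ignored
    ("a", "good.example", False),
    ("b", "good.example", True),
    ("c", "good.example", False),
]
result = flagged_pages(sample_votes)
```

Requiring a minimum number of distinct voters is the only abuse resistance in this sketch; it keeps a single angry user from burying a page but would not stop a coordinated effort.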
Re:One idea? (Score:3, Interesting)
I foresee a time when to access large parts of the net you will be required to use some central "proof of life" system. The current mish-mash of captchas isn't working. We have custom English captchas on a forum I admin and it doesn't seem to stop the bots: presumably when they get stuck they call for help.
It's hard to believe a third of Google's index is auto-generated crap, but then I couldn't really believe the "50% of net traffi
Re:One idea? (Score:2, Informative)
Hey, looks like they are:
http://googleblog.blogspot.com/2006/04/this-is-test-this-is-only-test.html [blogspot.com] The Googleblog shows that they have a cookie-based "block this site from results" feature in general beta test to random people on the site.
Re:One idea? (Score:2)
Now, we do know that some people, mostly persons looking to maximize their click through advertising, will make a page appear to be useful for a certain search result
Re:One idea? (Score:2)
It sounds noble in theory, but in practice it doesn't work so well.
A bunch of phony moderations will boost the pages of ads. Only allowing users with logins to rate results won't save you; the spammers will simply create millions (yes, millions) of bogus accounts, farm them to improve their "karma" the
If google and the spammers have an arms race... (Score:5, Interesting)
The root of all evil (Score:2)
Then money came. dot-com came. And the turd started hitting the fans.
Now, I'm not saying to "outlaw" making money on the net. As much as I'd enjoy the "free and open" net of the old days, without people making (or hoping to make) money from the internet, we would still be hanging on dialup and paying inane amounts of cash for it. But it's time for some radical changes.
1. E
Careful... (Score:3, Informative)
Careful, that linked page is 99.9% likely to be a legitimate user's hacked hosting account. What's faaaaaar more effectiv
Re:Careful... (Score:2)
Let's put it that way: I'm pretty sure the hoster gets the idea fairly quickly.
BlueFrog extended? (Score:2)
Re:BlueFrog extended? (Score:2)
Gmail (and others) have a "Mark as SPAM" button, and now that button should be extended to the entire web.
But it can only be in the google toolbar, and similar toolbars, and not many users have them installed, or do they?
And then there is the whole "I will report as SPAM my competition's home page" issue.
A disappointing change (Score:2)
Viagra and Cialis (Score:2)
Okay, so the people who actually want information on viagra or cialis will have to resort to the old-fashioned way, watching TV, but at least that fixes the internet.
Answer is user rating system (Score:2)
When the Google toolbar is installed and you are logged into Google (Gmail or anything), put a little button there:
Rate this site: search engine spam, good info, mediocre
Yes, I would click on it (if it is a function that does not take me to 30 other sites and require me to log in).
It is time we start using our customers'/visitors'/human feedback. ANYONE can generate content from other sites. Just wget whatever, HTML-strip it, mix w
User feedback is king (Score:2)
When new users signed up, old users rated them on a question form and decided: stay or go.
That worked with 100 people or so (small BBS).
Now the net and Google are one BIG BBS.
User interaction is good, bots are dumb and if you have 100+ sites with different
Re:Not so sure... (Score:2)
Re:Not so sure... (Score:2)
Re:Not so sure... (Score:2)
Re:Not so sure... (Score:2)
Re:Finally, an explanation (Score:4, Interesting)
Re:Obvious (Score:2)
Re:Google is full. Try this... (Score:5, Informative)
Go to yahoo and search for "slashdot poneys". This will bring up a bunch of results, all approximately 1 month old.
Now do the same search on google. Notice how many of the results from yahoo do not appear in the google results at all.
Google has such a big backlog that they don't get around to spidering new sites for several months. While google does give priority to certain high-profile sites like slashdot and visits those frequently, most other sites do not get indexed for several months.
Okay, so I tried this, just for kicks. You can verify, by a single click:
Yahoo: http://search.yahoo.com/search?p=slashdot+ponies [yahoo.com]
Google: http://www.google.com/search?hl=en&q=slashdot+pon
Since when does 44900 results on Yahoo mean that they have more than 92100 results on Google? As far as what's appearing, I was able to find most every one I saw on Yahoo on the first 2 or so pages of Google's results. I also see more results on Google that look like they'll show me more of what I'm looking for (since I am probably looking for the April 1st joke, screenshots especially).
Works alright for me. Looks like I don't have a reason to switch again yet.
Re:Google is full. Try this... (Score:5, Funny)
44 on yahoo, 229 on google.
Wait, what was I saying?
Re:Google is full. Try this... (Score:2, Insightful)
Re:Google is full. Try this... (Score:2, Interesting)
Top 10 results for "slashdot poneys" on yahoo:
1. slashdot.cuteness.org (not on google)
2. jfaughnan.blogspot.com (#1 on google)
3. jfaughnan.blogspot.com (#1 on google)
4. index.cristal-trace.com (not on google, outdated link)
5. mfrost.typepad.com (#22 on google)
6. pcdq.blogspot.com (not on google)
7. www.ninme.com (#15 on google)
8. www.firstworld.biz (not on google, spam)
9. musicindustry.firsindustry.com (not on google, spam)
10. girls-having-sex-with-h
Hey Taco... (Score:2)
Did you have to massage that? Or do you have a gift?