Webmasters Pounce On Wiki Sandboxes
Yacoubean writes "Wiki sandboxes are normally used to learn the syntax of wiki posts. But webmasters may soon deluge these handy tools with links back to their site, not to get clicks, but to increase Google page rank. One such webmaster recently demonstrated this successfully. Isn't it time for Google finally to put some work into refining their results to exclude tricks like this? I know all the bloggers and wiki maintainers would sure appreciate it."
Oh well (Score:5, Informative)
google works (Score:4, Informative)
I've seen this (Score:4, Informative)
This may become a big problem for sites like this. The only solution might be one of those annoying "write down the letters in this generated gif" humanity tests.
Not a big deal (Score:5, Informative)
Re:Cyberneighborhood Not-Watch? (Score:5, Informative)
http://www.robotstxt.org/wc/meta-user.html
Re:Cyberneighborhood Not-Watch? (Score:2, Informative)
Re:Naughty behaviour (Score:3, Informative)
Any suggestions?
The only big one I know of right now is Nutch. It is an open source search engine that is in the later stages of development, but hasn't produced a large, usable site yet.
nutch.org [nutch.org]
Since it will be open source, you will be able to read the ranking algorithms and change/abuse them as you see fit.
This one http://search.mnogo.ru/ [mnogo.ru] is also available.
Re:Why just wikis? (Score:5, Informative)
My wiki got hit by this stupid link, but not in the sandbox. Of course, recovering the previous version of the page is easy... it's wiping out any trace of the lameness that gets trickier. I suppose the easiest way to defeat this would be to require simple registration in order to edit Wiki pages.
What else can we do? Alter the names of the submit buttons and some of the other key strings involved in Editing?
visual security code for sign-up (Score:5, Informative)
Re:"Finally"?? (Score:3, Informative)
I checked, and I've got documented evidence of this. On April 25 last year, I reported that earthlink.net was showing up as the top search result [perl.org] for queries involving various religious words, including "Bear Valley Bible Institute." The Church of Scientology (which owns Earthlink) was clearly engaging in something to distort the page rank of earthlink. I had noticed this for a long time before I recorded it.
On that same day, I reported the problem to Google via their feedback mechanism. I note today that the problem is gone.
Now if I can just do something about the "Church Of Christ at eBay Low Priced Church Of Christ. Huge Selection! (aff)" ads I keep getting on Google, I'll be happy... ;)
Re:Why just wikis? (Score:2, Informative)
Re:Cyberneighborhood Not-Watch? (Score:2, Informative)
Re:Sure, that will work (Score:2, Informative)
Re:Why just wikis? (Score:4, Informative)
It has probably already been done in any wiki software worth its salt. Here's what MoinMoin [wikiwikiweb.de] does for example:
* It has a regexp of HTTP_USER_AGENTS which should receive a FORBIDDEN for anything except viewing a page. The default setting includes many known bots (including Google) and utilities such as wget.
* Most pages contain the appropriate robot meta tag, with the relevant noindex and/or nofollow settings.
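For anyone who wants to see roughly what those two measures look like, here is a minimal sketch in Python. This is not MoinMoin's actual code: the pattern, the action name "show", and both function names are assumptions made up for illustration, and the real default bot list is longer.

```python
import re

# Hypothetical pattern for illustration; not MoinMoin's real default list.
BOT_UA_PATTERN = re.compile(r"googlebot|wget|slurp|crawler|spider", re.IGNORECASE)

# Assumed name for the plain "view a page" action.
READ_ONLY_ACTIONS = {"show"}


def status_for(user_agent, action):
    """Return 403 when a known bot tries anything except viewing, else 200."""
    if action not in READ_ONLY_ACTIONS and BOT_UA_PATTERN.search(user_agent or ""):
        return 403
    return 200


def robots_meta(indexable, follow_links):
    """Build the robots meta tag for a page from its index/follow settings."""
    index_part = "index" if indexable else "noindex"
    follow_part = "follow" if follow_links else "nofollow"
    return f'<meta name="robots" content="{index_part},{follow_part}">'
```

A sandbox page would presumably be served with robots_meta(False, False), so that well-behaved crawlers neither index it nor follow the links dumped into it.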
In addition to that, the webmaster can of course set up a robots.txt file, and actually should do so. Some tools out there don't understand the robot meta tags (or don't want to take the performance hit of parsing them), and some let the user change the user agent easily... wget comes to mind.
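A rough sketch of what such a robots.txt might contain, checked with Python's standard urllib.robotparser. The paths /WikiSandBox and /action/ and the agent name ExampleBot are assumptions for illustration; use whatever your wiki actually calls its sandbox.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a wiki; the paths are illustrative only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /WikiSandBox
Disallow: /action/
"""


def can_fetch(path, agent="ExampleBot"):
    """Report whether a crawler honouring ROBOTS_TXT may fetch the given path."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, path)
```

Keep in mind this only helps with crawlers that choose to honour robots.txt. It keeps the sandbox out of the index, which removes the page rank incentive, but it does nothing to stop the spammer's script from posting.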
Of course, it shouldn't be too hard to add regexps that prevent certain links from being posted, or that keep certain hostnames or IPs from altering the site (editing pages, reverting them, deleting them).
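Something along these lines could do it. This is only a sketch under assumptions: the banned host pattern, the banned IP, and the function names are all invented for the example, and a real wiki would load its lists from configuration.

```python
import re
from urllib.parse import urlparse

# Hypothetical blocklists; replace with the hosts and IPs you actually see abusing the wiki.
BANNED_HOST_PATTERNS = [re.compile(r"(^|\.)spam-example\.test$", re.IGNORECASE)]
BANNED_IPS = {"192.0.2.10"}

# Loose URL matcher: stops at whitespace, angle brackets and square brackets.
URL_RE = re.compile(r"https?://[^\s<>\[\]]+", re.IGNORECASE)


def banned_links(text):
    """Return every URL in the submitted text whose host matches a banned pattern."""
    hits = []
    for url in URL_RE.findall(text):
        host = urlparse(url).hostname or ""
        if any(pattern.search(host) for pattern in BANNED_HOST_PATTERNS):
            hits.append(url)
    return hits


def may_save(text, client_ip):
    """Allow the edit only if the client IP is not banned and no banned link appears."""
    return client_ip not in BANNED_IPS and not banned_links(text)
```

The same may_save check would presumably be called before reverts and deletes as well as edits, so a banned host can't vandalize by rolling pages back.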
It's already been invented. (Score:4, Informative)
Here's Google's stance on the subject (boils down to: if you don't want it indexed, put in a damn robots.txt file) [google.com]
Hell, even Google News uses robots.txt [google.com]
Clean sandbox daily. (Score:3, Informative)
Chip H.
Re:Why just wikis? (Score:5, Informative)
Then again maybe that mostly says something about their popularity.
Re:visual security code for sign-up (Score:3, Informative)
You'd be pretty lucky to hit the exact same image twice.
Re:Which is why I thought it was real time (Score:4, Informative)
Re:mod parent up (Score:1, Informative)
Comment removed (Score:3, Informative)