Webmasters Pounce On Wiki Sandboxes
Yacoubean writes "Wiki sandboxes are normally used to learn the syntax of wiki posts. But
webmasters may soon deluge these handy tools with links back to their site, not to get clicks, but to increase Google page rank. One such webmaster recently demonstrated this successfully. Isn't it time for Google finally to put some work into refining their results to exclude tricks like this? I know all the bloggers and wiki maintainers would sure appreciate it."
Why just wikis? (Score:5, Insightful)
Yes... PLEASE... (Score:5, Insightful)
What happened to the nice internet we had in 1996?
You know... (Score:3, Insightful)
Some people ... (Score:2, Insightful)
Yes, it's a sandbox; no, it's not your personal playground.
Whose fault is that? (Score:5, Insightful)
Some search engines accept any old site. Others accept sites based on human approval and categorization. Google is a nice combination of the two: by counting outside references (how often a site is linked), it assumes that a frequently linked site is more relevant, because other people have chosen to put links to it on their own sites. That's a human factor, without directly using human beings to review and categorize the sites and rankings.
Sure it can be abused, but it's not Google's fault; perhaps these areas of abuse (blogs, wikis, etc.) should address the problems from their end.
ROBOTS.TXT (Score:5, Insightful)
As a sidenote, I think that with recent Wiki abuse, the issue of open wikis will become a similar one to open proxies and mail relays.
Same site, a few days later: Don't do it. (Score:2, Insightful)
I decided to stop posting backlinks in Wiki sandboxes, the SEO strategy previously explained. [...] In the meantime I'm asking developers and those hosting Wikis of their own to please exclude sandboxes from search engine results (via the robots.txt file). Doing so would shield the sandbox from backlink-postings, and there is no need for it to turn up in search results in the first place.
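For wiki hosts who want to try this, here is a minimal sketch of what such an exclusion could look like and how to check it with Python's standard `urllib.robotparser`. The path `/wiki/SandBox` and the host `wiki.example.org` are assumptions made up for illustration; the real sandbox URL depends on the wiki engine.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a wiki whose sandbox lives at /wiki/SandBox.
# The path is an assumption; adjust it to wherever your engine puts the sandbox.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wiki/SandBox
"""

def is_crawlable(robots_txt, url, agent="*"):
    """Return True if a rule-abiding crawler may fetch the given URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)

if __name__ == "__main__":
    for page in ("SandBox", "FrontPage"):
        url = "http://wiki.example.org/wiki/" + page
        print(page, "crawlable:", is_crawlable(ROBOTS_TXT, url))
```

Note that this only keeps compliant crawlers from indexing the sandbox, which removes the ranking payoff. It does nothing to stop a bot from posting there.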
This sure makes sense, and who knows, maybe future wiki distributions do it by default. (If
would work universally...)
Well, it's about time this gets some attention (Score:5, Insightful)
From what I can see, it looks like those "search ranking professionals" who "guarantee to raise your Google rank in 30 days" are using blog spamming, and perhaps wiki spamming, as a way to increase their clients' ratings.
It's not about meta tags, or submitting anymore... it's spamming.
Perhaps it's time for people to finally be wary of these services. After all, can a third party really guarantee a position in another company's search index?
IMHO those services are pure evil. They either do nothing, or they do something to increase page rank... what is that "something"? How many options do they have?
If they are going to use my blog... why can't I get a cut in that business?
Re:You know... (Score:1, Insightful)
Re:You know... (Score:1, Insightful)
Except spambots can also work to make sure that the most helpful links are the ones linking to spam sites.
Re:Cyberneighborhood Not-Watch? (Score:3, Insightful)
apache + search + p2p = distributed search engine (Score:2, Insightful)
This way all the modified web servers would make a giant distributed search engine.
Some nice algorithms like Koorde or Kademlia could be used.
Anyone thought about starting something like this?
David
Re:You know... (Score:4, Insightful)
Tomorrow today yesterday (Score:5, Insightful)
The Arch Wiki [gnuarch.org] has suffered several times from such vandals in the past few months. I'm sure other wikis have, too. They create links over single spaces or dots, so that casual readers don't notice them. Attentively watching the RecentChanges page is the most effective way to find and fight them, but this is tiresome. I guess many wikis will require posters to be authenticated soon, which is a blow to the wiki ideal, but not such a major blow. Alternatively, maybe someone will develop heuristics to fight the most common abuses (e.g. external link over a single space).
So, this is not new, but this is now news.
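A heuristic of the kind suggested above (flagging external links whose visible text is just a space or a dot) might be sketched like this. It assumes the page is available as rendered HTML with plain `<a href="http://...">` anchors; the function name `hidden_links` is made up for this example, and a real wiki would more likely check its own markup instead.

```python
import re

# Matches anchors that point off-site. Assumes links appear as ordinary
# <a href="http(s)://..."> tags (an assumption made for this sketch).
ANCHOR_RE = re.compile(
    r'<a\s[^>]*href=["\'](https?://[^"\']+)["\'][^>]*>(.*?)</a>',
    re.IGNORECASE | re.DOTALL,
)

def hidden_links(html):
    """Return external URLs whose link text has no letters or digits."""
    suspicious = []
    for url, text in ANCHOR_RE.findall(html):
        visible = re.sub(r"<[^>]+>", "", text)    # drop nested tags
        visible = visible.replace("&nbsp;", " ")  # treat nbsp as blank
        if not re.search(r"\w", visible):         # only spaces/punctuation
            suspicious.append(url)
    return suspicious

if __name__ == "__main__":
    page = ('Intro<a href="http://spam.example/"> </a> text'
            '<a href="http://spam2.example/x">.</a> and '
            '<a href="http://gnuarch.org">Arch Wiki</a>')
    print(hidden_links(page))
```

This would only catch the trick described here; a spammer who switches to one-letter link text or tiny fonts would slip past it, so it complements watching RecentChanges without replacing it.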
Re:Well, it's about time this gets some attention (Score:5, Insightful)
Re:You know... (Score:4, Insightful)
Re:Yes... PLEASE... (Score:3, Insightful)
Re:Why just wikis? (Score:3, Insightful)
I'm not sure this will make you feel better, but this strategy has a limited lifetime.
The contribution of your page to another page's page rank depends on two factors: first, the page rank of your page, and second, the number of links coming from your page.
As more people take up this tactic, the return everyone gets from it gets smaller. E.g. when there are hundreds of links on that page, they cease to have any real value. Eventually people should give up on this one.
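To put rough numbers on the dilution argument, here is a toy calculation. It assumes the simplified model described above, where a page's rank is split evenly among its outbound links. It ignores the damping factor in the published PageRank formula, and Google's actual ranking is not public. The rank value of 6.0 is invented.

```python
def link_share(page_rank, outbound_links):
    """Rank passed to each linked page under a simple even-split model."""
    if outbound_links < 1:
        raise ValueError("page must have at least one outbound link")
    return page_rank / outbound_links

if __name__ == "__main__":
    sandbox_rank = 6.0  # made-up rank for a popular wiki sandbox
    for n in (1, 10, 100, 1000):
        print(n, "links ->", link_share(sandbox_rank, n), "per link")
```

Each tenfold increase in links on the sandbox cuts every spammer's share by a factor of ten, which is the "limited lifetime" point in a nutshell.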
Sandbox persistence (Score:3, Insightful)
But if the problem is that websites have areas where visitors (even unregistered ones) can post arbitrary text and links, then even Slashdot is a potential target (maybe there should be a "Spam" mod score?), as is any site where unregistered visitors can store content in one way or another, wiki or not.
Re:visual security code for sign-up (Score:5, Insightful)
There was a story about defeating this system on /. a while back.
Rather than using OCR or anything, people would merely harvest a load of images from a signup site - possible when there is only a finite number of images, or when there is a consistent naming policy.
Then once the images were collected they would merely set up an online porn site, asking people to join for free by proving they were human by decoding the very images they had downloaded.
Human lust for porn meant that they could decode a large number of these images in a very short space of time, then return and mount a dictionary attack...
Quite clever really, sidestepping all the tricky obfuscation/OCR problems by tricking humans into doing their work for them ..
Easy solution (Score:3, Insightful)
Re:Cyberneighborhood Not-Watch? (Score:3, Insightful)
That is, even if you make your links useless (easy with a no-follow meta tag), it won't help: the majority of this spam is AUTOMATED, and will spam your wiki/blog/guestbook based on simple page cues.
Your best personal defense is to manually remove any page or HTML cues that a spammer would pick up on as being common to a certain type of postable web page or element.
Bloggers have been creating blacklists (banning both poster IPs and destination URLs) with some degree of success. This is a deterrent: once a spammer shows up on a blacklist, webmasters can use a distributed file to 'clean' their blogs automatically.
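As a rough sketch of how such a blacklist check could work, the following flags a comment if the poster's IP is banned or if any URL in the text points at a banned domain. The function name, the list contents, and the idea of passing the lists in as plain sets are all assumptions for illustration; they do not describe any particular blog tool's format.

```python
import re
from urllib.parse import urlparse

# Crude URL matcher; good enough for a sketch, not a full URL grammar.
URL_RE = re.compile(r'https?://[^\s"\'<>]+', re.IGNORECASE)

def is_spam(comment_text, poster_ip, banned_ips, banned_domains):
    """Return True if the poster IP or any linked domain is blacklisted."""
    if poster_ip in banned_ips:
        return True
    for url in URL_RE.findall(comment_text):
        host = (urlparse(url).hostname or "").lower()
        # Match the banned domain itself and any of its subdomains.
        if any(host == d or host.endswith("." + d) for d in banned_domains):
            return True
    return False

if __name__ == "__main__":
    banned_ips = {"203.0.113.7"}        # hypothetical shared blacklist
    banned_domains = {"spam.example"}   # hypothetical shared blacklist
    print(is_spam("see http://www.spam.example/pills", "198.51.100.1",
                  banned_ips, banned_domains))
```

Matching on the destination domain rather than the poster's IP is the part that holds up when bots rotate addresses, since the site being promoted has to stay the same.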
YHBT. YHL. HAND. [Was: Re:Well, ...] (Score:2, Insightful)
Overuse of absolutes can lead to their deterioration. As an American I couldn't feel more turgid: now when the Europeans get ready to yell HITLER!!!! in IRC, I can just pre-emptively yell 9/11!!!!!!! and lose/end the conversation.
To be fair, the difference between these 'blog-abusing 'minor annoyances' and the large scale deaths/destruction of 9/11 can be seen as just a matter of scale. To some people I know, the economic impact of terrorism keeps them awake at night: the value of human life be damned, watch that bottom line! (Not the most civically minded people, IMHO.)
Being respected members of polite business society, these people and their defective outlook are just as dangerous to you and me as the wiki and 'blog abusers and 9/11 baby killers. To them, you are either a customer, employee or garbage to be taken out by security.
This, by the way, is how we treat anybody who we have successfully alienated. Look at these 'blog spammers. Would anyone have cried if Al Qaeda had blown up a spammer's house?
Both sides of this argument stand at the top of a moral mountain with a very slippery slope and are trying to make the other fall off as far and as fast as possible. I'm waiting to see who tumbles first.
Like they say on bash.org: I will become rich and famous when I invent a device to punch people in the face through the Internet.
Re:You know... (Score:3, Insightful)
Re:Well, it's about time this gets some attention (Score:2, Insightful)
OK, so it's not really fair to get into relative levels of "evil", but let's also not minimize the "evil" that search optimizers do. It's not just a bunch of extra comments on blogs or wikis.
Their fundamental business model is CONTRARY to my interests as a consumer trying to get product information. They don't wish to let me find the product or the review or the site that MOST PEOPLE FOUND USEFUL, they only want me to find the one that PAID THEM THE MOST MONEY.
I realize that's just the way things are, but that's obviously counter to my whole purpose for using a search engine like Google. They are intentionally polluting the search results. It's not the methods I find "evil" (although blog comment and wiki spamming are pretty shady) as much as the end result - the loss of helpful web searches.
Time to reconsider Wikis. (Score:1, Insightful)
> Isn't it time for Google finally to put some work into refining their results...
Isn't it time to also reconsider the Wiki paradigm? More sites (like this [docbook.org]) are requiring logins. "Golden Prose" [wikipedia.org] indeed! IMHO, Wikis are evolving into crude Content Management Systems.
Re:image based spam control (Score:3, Insightful)
Maybe it wasn't obvious to blog and wiki programmers that the ability to post a comment or edit a wiki page was worth money. It isn't worth a lot per post, but because these are online systems, they are very susceptible to bots that can post in huge volume. All of those posts together can alter a site's placement in Google search results, and that's definitely worth money.
Instead of whining about Google being influenced by attacks that use your Wiki or blog, how about making it hard for bots to post in the first place? Is that really an important feature that you can't live without?
Re:image based spam control (Score:3, Insightful)
Why not just show the picture of an object, like an apple or something, and ask the user to type in what it is? I mean, you could have a few hundred of these and it would be nearly impossible for an automated system to guess. (You have a few hundred different items, and like 5-10 images of each item.) I dunno, seems easier to me, but I don't write web software.
just like spam (Score:3, Insightful)
Re:Cyberneighborhood Not-Watch? (Score:3, Insightful)
Re:Grow up (Score:5, Insightful)
Re:Cyberneighborhood Not-Watch? (Score:2, Insightful)
Like it is, it's hell to try to get decent robotic behaviour out of anything other than HTML pages.
Re:Grow up (Score:1, Insightful)