The Internet / Spam

Webmasters Pounce On Wiki Sandboxes 324

Yacoubean writes "Wiki sandboxes are normally used to learn the syntax of wiki posts. But webmasters may soon deluge these handy tools with links back to their site, not to get clicks, but to increase Google page rank. One such webmaster recently demonstrated this successfully. Isn't it time for Google finally to put some work into refining their results to exclude tricks like this? I know all the bloggers and wiki maintainers would sure appreciate it."
This discussion has been archived. No new comments can be posted.

Webmasters Pounce On Wiki Sandboxes

Comments Filter:
  • by raehl ( 609729 ) * <(moc.oohay) (ta) (113lhear)> on Monday June 07, 2004 @12:56PM (#9357478) Homepage
    In the real world, there are neighborhood watch signs to "deter" criminals.

    Perhaps there could be a command in the robots.txt file which says "Browse my site, but don't count any links here for page ranking"? That would make your site less of a target for spammers, but not prevent you from being ranked at all.
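No such directive exists in the real robots.txt standard; purely as a sketch of the proposal above (the directive name is invented), it might look like:

```
User-agent: *
# Hypothetical -- not part of the actual robots exclusion standard.
# Meaning: index these pages, but ignore their outgoing links for ranking.
Discount-Links: /wiki/
```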
  • like porn (Score:5, Interesting)

    by millahtime ( 710421 ) on Monday June 07, 2004 @12:58PM (#9357498) Homepage Journal
    This seems similar to the system all those porn sites used to get such a high rank in Google.

    Kind of playing the system, with the content not being quite as desirable.
  • < jab jab > (Score:2, Interesting)

    by jx100 ( 453615 ) on Monday June 07, 2004 @12:59PM (#9357512)
    Well, couldn't have been that successful, for he didn't win [searchguild.com].
  • Complacency (Score:5, Interesting)

    by faust2097 ( 137829 ) on Monday June 07, 2004 @01:01PM (#9357534)
    Isn't it time for Google finally to put some work into refining their results to exclude tricks like this?

    It was time to do that at least a year ago. It's pretty much impossible to find good information on any popular consumer product and this is a problem that's been around for a long time.

    But they're too busy making an email application with 9 frames and 200k of JavaScript to pay attention to the reason people use them in the first place. It's a little disappointing; I'm an AltaVista alumnus and I got to watch them forget about search, do a bunch of useless crap instead, then die. I was hoping Google would be different.

  • This happened on the POPFile Wiki [sourceforge.net]. Eventually I solved it by changing the code of the Wiki itself to have an allowed list of URLs (actually a set of regexps). If someone adds a page that uses a new URL that isn't covered, it won't show up when the page is displayed, and the user has to email me to get that specific URL added.

    It's a bit of an administrative burden, but stopped people messing up our Wiki with irrelevant links to some site in China.

    John.
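A minimal sketch of that allowlist approach in Python (the patterns here are placeholders, not POPFile's actual list):

```python
import re

# Assumption: these patterns are illustrative, not the real allowlist.
ALLOWED_URL_PATTERNS = [
    re.compile(r"^https?://([a-z0-9-]+\.)?sourceforge\.net/"),
]

URL_RE = re.compile(r"https?://\S+")

def render_links(wikitext):
    """Replace any URL not matching the allowlist with a placeholder."""
    def check(match):
        url = match.group(0)
        if any(p.match(url) for p in ALLOWED_URL_PATTERNS):
            return url
        return "[link removed - email the admin to have it added]"
    return URL_RE.sub(check, wikitext)
```

Unlisted links still live in the page source, so an admin can approve them later by adding one regexp rather than re-editing every page.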
  • Google. (Score:4, Interesting)

    by Rick and Roll ( 672077 ) on Monday June 07, 2004 @01:05PM (#9357576)
    When I search on Google, half the time I am looking for one of the best sites in a category, like perhaps "OpenGL programming". Other times, however, I am looking for something very specific that may only be referenced about twenty times, if at all.

    When I do search in the first category, especially for things such as wallpaper, or simpsons audio clips, the sites that usually turn up are the least coherent ones with dozens of ads. I usually have to dig four or five pages to find a relevant one.

    The people with these sites are playing hardball. Google wants them on their side, though, because they often display Google text ads.

    Right now, my domain of choice is owned by a squatter that says "here are the results for your search" with a bunch of Google text ads. I was going to/may still put a site there that is very interesting, and the name was a key part of it.

    I firmly believe that advertisements are the plague of the Internet. I would like to see sites selling their own products to fund themselves. Google doesn't really help in this regard. The text ads are less annoying than banner ads, but only slightly less annoying.

    Don't get me wrong, I like Google. It's an invaluable tool when I'm doing research. I would just like to see them come out in full force against squatters.

  • Re:Yes... PLEASE... (Score:3, Interesting)

    by lukewarmfusion ( 726141 ) on Monday June 07, 2004 @01:06PM (#9357587) Homepage Journal
    As my site grows, I'm thinking about adding a mechanism to address those issues: when the user requests a page for the first time, he'll get a session value that says he's a valid visitor to the site. When he submits a comment, he has to have that value, or comments aren't allowed. I don't know how you'd write a script to circumvent that. (If someone can tell me, I'd love to know so I can try to prevent it!)
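A sketch of that token scheme (SECRET and the token format are assumptions). Note that a determined script could still fetch a page first to harvest a valid token, so this raises the bar rather than closing the door:

```python
import hmac
import hashlib
import secrets

# Assumption: a private key held only by the server.
SECRET = b"server-side-secret"

def issue_token():
    """Handed out with the first page view; stored in the visitor's session."""
    sid = secrets.token_hex(16)
    sig = hmac.new(SECRET, sid.encode(), hashlib.sha256).hexdigest()
    return f"{sid}:{sig}"

def is_valid(token):
    """Comment submissions without a correctly signed token are rejected."""
    try:
        sid, sig = token.split(":")
    except ValueError:
        return False
    expected = hmac.new(SECRET, sid.encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(sig, expected)
```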
  • Hmm (Score:4, Interesting)

    by Julian Morrison ( 5575 ) on Monday June 07, 2004 @01:11PM (#9357628)
    Leave the links, edit the text to read something like "worthless scumbag, scamming git, googlebomb, please die, low quality, boring" - and lock the page.
  • Wait a minute - a way to spoof Google to get your page ranked better through WiKi? OMFG! Call the internet police, call Dr. Eric E. Schmidt, call out the Google Gorilla goons! I'm sure the good Dr. has a fix like the ones he used at Novell...

    The problem with the whole Google model is that it's biased to begin with. If I'm looking for granny-smith apples, chances are some internet chimp has bought that space from Google's goons with bananas. It becomes obvious when you see a chimp site near the top that has no business being there. To the experienced googler, it's just an annoying fly on the screen and you move further down.

    I'm hoping that Google doesn't get too bogged down in becoming that big Ape like Micro$oft and be a little more proactive in protecting their business property. It's bad enough that they're selling top space to companies willing to pay, but here's hoping they don't slip on their own banana peels.
  • Re:Yes... PLEASE... (Score:5, Interesting)

    by n-baxley ( 103975 ) <nate@NosPAm.baxleys.org> on Monday June 07, 2004 @01:12PM (#9357638) Homepage Journal
    The system was even easier to rig back then. Back in '96ish, I created a web page with the title "Not Sexy Naked Women", repeated that phrase several times, and then added a message telling people to click the link below for more Hot Sexy Naked Women, which took them to a page that admonished them for looking for such trash. I added a banner ad to the top of both of these pages, submitted them to a search engine, and made $500 in a month! Things are better today, but they're still not perfect.
  • Re:Why just wikis? (Score:3, Interesting)

    by abscondment ( 672321 ) on Monday June 07, 2004 @01:12PM (#9357640) Homepage

    posting on Wikis doesn't screw up your own blog.

    posts on message boards will be deleted quickly, unless the board is expressly google bombing (as in the current Nigritude Ultramarine 1st placer [google.com]) / people are stupid

    i think the idea is that wikis make it easier in general for your post to stay up and not affect your blog.

  • Re:Why just wikis? (Score:5, Interesting)

    by nautical9 ( 469723 ) on Monday June 07, 2004 @01:12PM (#9357641) Homepage
    I host my own little phpBB boards for friends and family, but it is open to the world. Recently I've noticed spammers registering users for the sole purpose of being included in the "member list", with a corresponding link back to whatever site they wish to promote. They'll never actually post anything, but they've obviously automated the sign-up procedure as I get a new member every day or so, and google will eventually find the member list link.

    And of course there are still sites that list EVERY referer in their logs somewhere on their site, so spammers have been adding their site URLs to their bots' user agent strings. It's amazing the lengths these people will go to in order to spam Google.

    Sure hope they can find a nice, elegant solution to this.

  • "Finally"?? (Score:5, Interesting)

    by jdavidb ( 449077 ) on Monday June 07, 2004 @01:30PM (#9357802) Homepage Journal

    Isn't it time for Google finally to put some work into refining their results to exclude tricks like this?

    I take extreme issue with that statement, and I'm surprised no one else has challenged it. Google does in fact put quite a bit of work into making themselves less vulnerable to these kinds of stunts. They even have a link on every results page where you can tell them if you got results you didn't expect, so they can hunt down the cause and refine their algorithm.

    The system will never be perfect, and this is the latest issue that has not (yet) been dealt with. Quit your griping.

  • by MaximusTheGreat ( 248770 ) on Monday June 07, 2004 @01:38PM (#9357870) Homepage
    What about using random-image-based spam control like the one Yahoo uses on its new mail signup?
    So, every time you edit or post a comment, you would be presented with an image of random distorted text, which you would have to type in to be able to edit or post. That should take care of automated systems.
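A sketch of the round-trip for such a challenge (the actual image rendering and distortion are omitted; SECRET is a server-side assumption):

```python
import random
import string
import hmac
import hashlib

# Assumption: a private key held only by the server.
SECRET = b"server-side-secret"

def new_challenge(length=6):
    """Generate a random code and a token binding the code to this form.

    The code would be rendered into a distorted image; the token goes
    into a hidden form field, so the server needn't store the code."""
    code = "".join(random.choice(string.ascii_uppercase) for _ in range(length))
    token = hmac.new(SECRET, code.encode(), hashlib.sha256).hexdigest()
    return code, token

def verify(user_input, token):
    """Accept the post only if the typed code matches the token."""
    expected = hmac.new(SECRET, user_input.upper().encode(),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, token)
```

A real deployment would also expire tokens and reject replays; this only shows the challenge/response shape.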
  • Re:Yes... PLEASE... (Score:3, Interesting)

    by joggle ( 594025 ) on Monday June 07, 2004 @01:51PM (#9358011) Homepage Journal
    Why not generate an image containing modified text like yahoo and others? Using a little PHP magic, it shouldn't be too hard (see here [resourceindex.com] to get a start).
  • by bcrowell ( 177657 ) on Monday June 07, 2004 @01:55PM (#9358053) Homepage
    Google's algorithm isn't the problem. The problem is the availability of easily abused areas such as these "sandboxes."
    I'm not even convinced Google's algorithm has a problem. One thing a lot of people don't realize about the page rank algorithm is that your page rank goes down if you have lots of outgoing links that aren't reciprocated with links coming back from the site you linked to. It may be that this technique simply leads to a reduction in the page rank of the sandbox, which, after all, is appropriate, since the sandbox isn't something the sandbox's owner even wants people to find by Google searching.

    Sure it can be abused, but it's not Google's fault; perhaps these areas of abuse (blogs, wikis, etc.) should address the problems from their end.
    Yeah, the simplest thing would be for the sandbox's owner to use the robots.txt file to forbid indexing of the sandbox page. That keeps the rest of the web site's page rank from being adversely affected, deters spammers from abusing the sandbox, and does Google's users a service by not directing them to the sandbox, which they don't want to find.

    Spammers aren't stupid -- if I was an Evil Spammer(tm), I'd certainly make sure my script checked the robots.txt and didn't waste time spamming sandboxes that weren't going to be indexed.
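Assuming the sandbox lives at /wiki/Sandbox, the robots.txt entry would be as simple as:

```
User-agent: *
Disallow: /wiki/Sandbox
```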

  • by phutureboy ( 70690 ) on Monday June 07, 2004 @02:05PM (#9358146)
    You can also express robots directives as meta tags in the <head> portion of the document. So the wiki authors could just put them in the sandbox template, and individual site owners would not even have to know about (or monkey with) robots.txt to be protected.
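The tag a sandbox template would carry looks like this:

```html
<meta name="robots" content="noindex, nofollow">
```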
  • Disallow weblinks (Score:2, Interesting)

    by Will2k_is_here ( 675262 ) on Monday June 07, 2004 @02:22PM (#9358308)
    Regarding the sandbox, which nobody monitors anyway: why not just include a rule to deny adding URLs? There is no conceivable reason to allow a user to add a URL in the sandbox.

    And if you're thinking "I want to practise adding links with the required syntax", it's not hard. The only thing you need the sandbox for, beyond learning how the other basic syntax works (and you can apply that to links without practising), is structuring.
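A sketch of such a deny rule, assuming a simple pattern match is strict enough for a sandbox page:

```python
import re

# Anything resembling a link: bare scheme or a www. prefix.
URL_RE = re.compile(r"(https?://|www\.)", re.IGNORECASE)

def sandbox_edit_allowed(text):
    """Reject sandbox edits that contain anything resembling a URL."""
    return not URL_RE.search(text)
```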
  • by wamatt ( 782485 ) on Monday June 07, 2004 @02:30PM (#9358376)
    Spammers are going there because you have a high PR. So cut the PR supply and you're in business: http://www.site.com/~url=http://www.link.com and voila, URL rewriting. No more PR for Mr. Spammer.
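One way to read that suggestion: route every outbound link through a local redirect script, then block the redirect path in robots.txt so crawlers never credit the target. A hypothetical sketch (the /redirect endpoint is an assumption, not a real API):

```python
import re
from urllib.parse import quote

URL_RE = re.compile(r"https?://\S+")

def rewrite_outbound(text, redirect_base="/redirect?url="):
    """Send every outbound link through a local redirect page.

    Human visitors still reach the target via the redirect; with the
    redirect path disallowed for crawlers, the spammer gets the click
    but no PageRank."""
    return URL_RE.sub(lambda m: redirect_base + quote(m.group(0), safe=""),
                      text)
```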
  • by swb ( 14022 ) on Monday June 07, 2004 @02:58PM (#9358693)
    I thought it was a real-time thing, where the account creation bots passed the image that loaded during the signup process to a porn site and the images were decoded by a real person, and the result passed back to the bot who then signed up for the account.

    To avoid the timing problems with porn signons needing to happen concurrent with account signups, the account generation process was actually initiated by a porn signon. It limits your account generation ability, but only to the extent that you have porn traffic.

    Did I just imagine this, or does it work that way?
  • Re:Google. (Score:2, Interesting)

    by nsingapu ( 658028 ) on Monday June 07, 2004 @11:34PM (#9362519) Homepage
    Don't get me wrong, I like Google. It's an invaluable tool when I'm doing research. I would just like to see them come out in full force against squatters.

    Google owns oingo.com - perhaps the largest collection of squatter sites out there.
