
Comment Spams Straining Servers Running MT (186 comments)

Posted by michael
from the critical-mass dept.
dJ phuturecybersonique writes "Netcraft reports that 'Comment spam attacks on Movable Type weblogs are straining servers at web hosting companies, leading some providers to disable comments on the popular blogging tool. The issues are caused by bugs in MT, forcing publisher Six Apart to recommend configuration changes while it prepares fixes.' More..."
This discussion has been archived. No new comments can be posted.

  • Wow (Score:3, Funny)

    by Anonymous Coward on Saturday December 18, 2004 @05:09PM (#11126175)
    It's [changedetection.com] a [apple.com] good [networkforgood.org] thing [thing.net] Slashdot [slashdot.org] doesn't [wired.com] have [mecca.org] this [historychannel.com] problem [mathforum.org].
  • by miyako (632510)
    So...Netcraft confirms it, blogging is dead?
  • Why don't bloggers just disable HTML in comment posts? The spammers are looking for Google PR, aren't they?
    • Re:Easy Solution (Score:2, Interesting)

      by Anonymous Coward
      Or make an in-between page for every URL linked. So, someone leaves a link, it gets made into http://www.example.com/linkout.php?linkid=23890 (or whatever), then linkout.php just SHOWS the link (not a redirect) with a noindex,nofollow tag (for Google) and robots.txt entry. No PR, yet a user can still click. Another alternative would be to use JavaScript, since Googlebot doesn't seem to parse it yet.
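A minimal sketch of the in-between page described above, written in Python rather than the PHP the poster mentions. The `LINKS` store and the `render_linkout` name are hypothetical, and the page only illustrates the idea of showing the link behind a `noindex,nofollow` robots meta tag instead of redirecting.

```python
from html import escape
from urllib.parse import urlparse

# Hypothetical store mapping link IDs to URLs left in comments.
LINKS = {23890: "http://www.example.com/some/page"}


def render_linkout(link_id, links=LINKS):
    """Return an HTML page that shows the link rather than redirecting to it."""
    url = links.get(link_id)
    # Refuse unknown IDs and anything that is not a plain web URL.
    if url is None or urlparse(url).scheme not in ("http", "https"):
        return "<html><body>Unknown link.</body></html>"
    safe = escape(url, quote=True)
    return (
        "<html><head>"
        # Ask crawlers not to index this page or follow its links.
        '<meta name="robots" content="noindex,nofollow">'
        "</head><body>"
        f'You are leaving this site: <a href="{safe}">{safe}</a>'
        "</body></html>"
    )
```

The robots.txt entry the poster also suggests would be a separate, static rule excluding the linkout path.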
    • Hmm... semi off-topic, but it would be neat if search engines like Google could be trained to ignore negative score Slashdot comments. On systems where there's built-in feedback, that would be one way to combat the spam, just train the search engine crawlers to ignore comments with poor scores.

      Eric
      See your HTTP headers [ericgiguere.com]
      • "On systems where there's built-in feedback, that would be one way to combat the spam, just train the search engine crawlers to ignore comments with poor scores." 1. Google should punish URLs with negative feedback! 2. Or Google should ignore URLs in comments. Dang, I'm still shaking - Steelers 33, Giants 30. Great game.
      • Re:Easy Solution (Score:3, Informative)

        by tepples (727027)

        it would be neat if search engines like Google could be trained to ignore negative score Slashdot comments

        Given that the static page is written at a Score:1 threshold, and that Google obeys Slashdot's suggestion in robots.txt not to index the dynamic pages, this is already the case.

    • Well, referral spam has been going on for ages (I list mine [idunno.org], but don't link to the URLs) and people still publish web logs.

      Ease of use is going to win every time.

    • I disabled HTML in comment posts a long time ago. Spammers don't care; their spambots keep spamming blindly. Statistically, they will find lots of sites that allow HTML.
  • by cybrthng (22291) on Saturday December 18, 2004 @05:12PM (#11126199) Journal
    But DoS attacks as well. Running several political blogs I often get "freeped"

    The best solution for me:

    1. User email address verification
    2. server generated images to verify real user for registration
    3. Regular cookie expiration after x amount of time
    4. host filtering (referrer filtering usually gets rid of "freepers" unless they open a new window)

    However - nothing beats good moderators, quality users and sticking to your niche. Don't go pissing people off tossing your blog around the world yourself and not expect to get anything in return.

    It's a jungle out there :)
    • by doormat (63648) on Saturday December 18, 2004 @05:20PM (#11126241) Homepage Journal
      Some context: This is a "freeper" [wikipedia.org]. They have also been known to use militant mob-style tactics to bother/silence those who don't agree with them, as parent has dealt with. Kinda ironic ya know... they are freepers yet they work hard to silence those who don't agree with them.
      • While it started from FreeRepublic users, the verb "to freep" now can refer to hordes of people from any political blog, whether right- or left-leaning. The two most common sources of freepers are FreeRepublic itself (right-wing) and DailyKos (left-wing).
      • hmm how is this any different from slashdot? freeping is just another name for the slashdot effect.
        • As I understand it, "freeping" a site means intentionally manipulating something like a poll so it swings in your political favor. For example, sites on both sides were encouraging their users to "freep" the CNN/MSNBC/etc. polls after the Presidential debates this year.

          The Slashdot effect is more mindless.
    • sage advice :)

      The worst part of being a slashdot member is watching people devastate and ruin a server because of childish acts of vandalism.

      Take for instance whenever slash points towards wikipedia, within minutes the page will be modified to some trolls' agenda.
      Having to wade through the crapflood of comments on blogs and forums after slash has been there is almost embarrassing sometimes.
      The servers can generally cope with a slashdotting and work perfectly just hours or days after the initial hit, howeve
    • 2. server generated images to verify real user for registration

      I don't know if something like that has already been done but there was a paper on neural networks used to crack captchas. It was very efficient on basic text (even with a medium amount of distortion) and showed that intelligent spam bots could be written in the future (not that I want to scare you though ;)
      • Yeah, but the amount of power it takes to decode them at least limits the amount of posts it allows.

        The question becomes one of spam. Whether it's in your email box, or the comments of your blog, it's the same.

        You want it to be easy to filter out the spam and still make it easy for legitimate readers to make comments.

        Looking at the slashdot system, a mail-verified registration system seems to be mostly sufficient.

        On my blog the spambot was putting porn weblinks into the webfield, and a generic 'dude th
    • IANAB (I am not a blogger) but it seems to me that TrackBack is at least a partial solution. Perhaps assumed negative on the automatic TrackBack post until it is activated by the author. http://www.movabletype.org/trackback/beginners/
    • 3. Regular cookie expiration after x amount of time

      I really hate it when web sites do that. Does anyone know of a Mozilla plug-in or something that will let me edit the expiration date of any cookie, preferably when the cookie is being set?

      • yeah, i hate it too - but it works. Keeps those "one timers" who come in just to hammer the board with crap and then get re-prompted for a login they most likely forgot before and have to go through a registration process again and usually just give up..

        Of course you could also just regenerate the cookies based upon post scoring - for example if people get modded up lengthen cookie time and such because there is some trust being given.

        Give a reward for participation of sorts
    • server generated images to verify real user for registration

      Use a visual CAPTCHA and completely disrespect readers with impaired vision.



      • Hey, those people can still read your blog. They just can't post comments to it. In the context of all the other shit they're prevented from doing because of blindness, it's not such a big deal.
  • Old news. (Score:3, Insightful)

    by 1_interest_1 (805383) on Saturday December 18, 2004 @05:12PM (#11126201)
    This has been going on for quite awhile now, and still no official fixes from SixApart?

    Shame on them.
  • by IO ERROR (128968) * <{error} {at} {ioerror.us}> on Saturday December 18, 2004 @05:14PM (#11126206) Homepage Journal
    There are many reasons to use WordPress [wordpress.org] instead of Movable Type [wordpress.org].

    First and foremost, it's free (speech and beer) and distributed under the GPL.

    Second, the actual developers of the software actually participate in the support forums [wordpress.org], so if you do have a question, it's likely to be answered very fast by someone intimately familiar with the software.

    Third, it's a lot less susceptible to comment spam, especially after applying a few plugins and hacks [wordpress.org]. I've never received a single one, and that's not for lack of spammers trying.

    Fourth, it's very easy to customize the look and feel of the site without knowing any PHP. HTML and CSS is about all you need to know. Knowing PHP helps a lot if you want to really customize it, but it isn't a requirement.

    Finally, they've already included a Movable Type import utility [carthik.net], so those of you who are sick of MT for this and many other reasons [cafefort.com] can move over with little hassle.

    Signed,
    A very happy WordPress user and occasional contributor.

    • I've been using MT for 2 years now, and the comment spam is actually making a significant bump in the traffic to my server (I doubt anyone else actually reads my stuff...). I had looked at Wordpress a while back and didn't think it was quite "on par" with Movable Type, but MT has done its best to alienate even myself.

      I share my MT installation with my brother. Not surprisingly, we like having our own weblogs. MT now charges for something that simple.

      The fact that Wordpress is released under the GPL an
      • You can take a look at my blog [ioerror.us] to get some idea of what is available, but be aware that I run nightly builds [wordpress.org] (don't try this at home, kids!) so a few things you see might not be available. And the Google search box at my site [ioerror.us] definitely is not part of WordPress, and might never be; I developed that bit myself. I can't imagine anything you can do with MT that you can't with WP.
    • The down side to WordPress is that it's really very immature code. Not only does it handle UTF-8 characters poorly, but even casual usage turns up a number of bugs in various different parts. This suggests to me that the developers fixed it in one section but didn't fix it in other parts of the code - not exactly thorough. I ran into all this stuff inside my first three hours of usage.

      Of course, all of this is fixable, and just calls for more people to jump in and get involved. I learned a bit of PHP and

    • Do they support multiple blogs with a single installation yet? That was the big reason I didn't move to Wordpress a while back...
      • Re:multiple blogs (Score:3, Informative)

        by IO ERROR (128968) *
        Multiple blogs are partially supported in 1.2, and 1.3 will have much better support for this type of installation (e.g. web hosting, etc.)
      • This was also a showstopper for me; I passed on Textpattern for the same reason.

        (As an aside, solid multiple blog and multiple user support is one of Movable Type's best features, and it irks me that so many MT plugin developers write their code under the assumption that every MT installation only has a single user.)
    • Wordpress has its pros, but the support forum is a ghost town. Maybe when more people migrate over to it this will change, but I think only a small percentage of my questions even had some kind of reply. The wiki is out-dated and full of tips that don't even apply to the current version.

      The current version is buggy (password reset, no way to link to user's profile, etc), but runs well enough and now that MT costs money I'm sure there will be more WP users out there soon. Then again, blogger is great for t
    • I use wordpress. It's nice but comment spam is a real problem. Or at least it was. I had the same online poker guy spamming me 5 times a day until I changed the name of the php file that comments get submitted to. That seems to have done the trick, at least as far as automated spamming.
    • Second here. I got sick of MT when I tried to upgrade from 2.6 to 3.01, and while I was at it, switch from Berkeley DB files to MySQL. The upgrade alone took me 6 hours or so (over a number of days), I posted questions on the forums and got no answers. This is for a *paid* product. The BerkDB->MySQL switch simply did not work. They have a script that supposedly does the conversion, but it doesn't work with all versions of BerkDB files, even though MT pretty much does.

      I posted this problem to the fo

  • by SethJohnson (112166) on Saturday December 18, 2004 @05:15PM (#11126215) Homepage Journal


    I had to ditch Movable Type explicitly due to comment spam. The real problem with it was that there was no way to delete more than one at a time. The web app only displays the last five comments and then you have to go digging through every article to find the other spams. Real pain in the ass. I switched to Wordpress, which is also besieged by comment spam from Online Poker outfits. In Wordpress [wordpress.org], however, you can mass-edit with all comments listed with checkboxes to delete whichever are spams.

    In Movable Type and Wordpress, you can pretty much eliminate the script-driven spambots by renaming the comment cgi handler and then editing all other files that reference it. I didn't think of this till after I switched to Wordpress, though.
    • That looks a lot more robust than MT (mind you I'm still using 2.65). When this whole comments thing started getting out of hand, I actually edited every damn post since last year to be comments-closed.

      Maybe I'll switch too. I was planning to do a redesign during the break. Does it have pretty versatile templating?

      • MT 3.x has a Comments page that lets you review 20, 50, etc., comments at a time, select them all to delete, etc.

        Much improved and appreciated. I also turn on comment moderation and this fixed the problems I had with comment spam.
    • Sorry, but renaming mt-comments.cgi to something else takes a spammer all of two seconds to bypass. They just sniff for the text field names in the comment form, and find out the name of the comment handler that way.

      I'm a user at TextDrive, and a bunch of users and admins there have a mailing list where we are VERY aggressive in defeating spam. mod_security is great for blocking based on the contents of a POST payload ("contains texas holdem? Sorry, you get an Error 412.") and mod_dosevasive, which is grea
    • I tried renaming the comments script and it worked for a while, but spammers are smart enough to work around that. Lately I had been getting spam even a few minutes after renaming the script.

      I installed mt-blacklist [jayallen.org], which pretty much solved the problem for me. It allows you to search by regular expression and massively de-spam and blacklist the URLs they point to. All subsequent comments containing those URLs or other known spam expressions get trashed automatically.
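The regular-expression blacklisting described above can be sketched roughly as follows. This is not the plugin's actual code: the patterns and function names are made up for illustration, and a real list would need constant maintenance.

```python
import re

# Hypothetical blacklist entries (spam phrases and a spam domain).
BLACKLIST_PATTERNS = [
    r"texas[-\s]?holdem",
    r"online[-\s]?poker",
    r"spam-example\.invalid",
]
COMPILED = [re.compile(p, re.IGNORECASE) for p in BLACKLIST_PATTERNS]


def is_blacklisted(comment_text, patterns=COMPILED):
    """True if any blacklist expression matches the comment text."""
    return any(p.search(comment_text) for p in patterns)


def despam(comments, patterns=COMPILED):
    """Split a list of comment strings into (kept, trashed)."""
    kept, trashed = [], []
    for text in comments:
        (trashed if is_blacklisted(text, patterns) else kept).append(text)
    return kept, trashed
```

Running `despam` over existing comments corresponds to the mass de-spam step; calling `is_blacklisted` on each incoming comment corresponds to trashing new spam automatically.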
    • I just implemented their TypeKey service on my MT blog when it came out. I used to get comment spam nearly daily, but in the five months since I turned on TypeKey I haven't had a single instance of it. I don't know why more blogs aren't using it, since it is free, and it works quite well for me...
      • Blog spammers are starting by pursuing the low-hanging fruit. As more and more weblogs switch to central authentication systems like TypeKey, I expect that spammers will find it worthwhile to figure out how to spam using TypeKey accounts. If I'm wrong in thinking this, I still haven't heard a good reason from Six Apart or anyone else why that would be the case. I would be happy to be wrong about this, though.
    • Perhaps this was added in version 3.x, but you certainly can delete more than one comment at a time in Movable Type, and there is no need to "dig through" each post to find the latest comments, whatever the number. I believe that the comments page displays 20 comments at a time by default. It's unfortunate, though, that Six Apart pissed everyone off by licensing 3.x as they did, or more people would be taking advantage of 3.x's small but worthwhile improvements.

      I agree with other posters that renaming

    • Perhaps it defeats the purpose of a web-driven administration tool, but the times when I've had to purge spam comments I've simply done it through the database.

      A few "delete from mt_comment where...", and one rebuild later (back in the web admin tool) it was all done. Very little fuss.

      Of course, this talk of alternatives has me interested anyway...

  • by happyemoticon (543015) on Saturday December 18, 2004 @05:17PM (#11126223) Homepage

    If your case is like mine, where mt is stored in a directory just off of your public web site, do this: use a .htaccess to put a password on your whole MT directory. They can't access comments.cgi (assuming it's just a bot doing the spamming), they can't post comments. I don't really like the idea of people touching my CGIs anyway. Make sure your robots.txt excludes the MT directory as well.

    That is, assuming you don't give a damn about people's comments.

    • That is, assuming you don't give a damn about people's comments.

      Who posts comments on websites anyway? It's not like anyone reads them.
  • How long until we have content/poster filtering for blogs like we have for e-mail? If someone got coding right now, they might make a pretty penny off of this...
  • You are all pretentious twats [kuro5hin.org]

    Every last one of you. You're all latte-sipping, iMac-using, suburban-living tertiary-industry-working WASPs who offer absolutely no new insights on anything whatsoever apart from maybe one specialist field if we're lucky.

    Quite an enjoyable rant.

    xox,
    Dead Nancy
    • I live in the urbs, I drink cappuccinos, and I work for an academic research unit. My computer is not an iMac, but a PC with XP and Slackware. I'm a euromutt of catholic derivation, and I have pretty broad interests.

      But that's pretty damn funny, I'll admit. They forgot, though, that they're all writing dark fantasy novels which will never be published.

      There are far too many weblog addicts out there who are excessively vain, and are under some kind of bizarre pretense that they matter, and they seem to e

    • The link above was funny as hell and explained the MT load issue in far more plain language than the original article! Somebody waste some points and get that back up out of the negatives . . .
  • besides WP, Nucleus [nucleuscms.org] is also a good blogging tool, easy to use and it's secure. I use this and WP, both are nice. Also I was getting a lot of comment spam using WP, but I turned off letting other sites know when I update and the online casino spam stopped.
    • but I turned off letting other sites know when I update and the online casino spam stopped.

      I've seen this observation mentioned once before, and I'd like to see this explored further. It seems that spammers are harvesting URLs from sites like weblogs.com [weblogs.com] and blo.gs [blo.gs]. I don't doubt that they're finding blogs via Google searches, though, so turning off update notifications is probably a temporary solution at best.

  • challenge the user (Score:5, Informative)

    by lseltzer (311306) on Saturday December 18, 2004 @05:24PM (#11126268)
    We had a similar problem on our ziffdavis.com blogs (like my security blog [ziffdavis.com]) and we think we have solved it with one of those graphic field challenges to the user (enter the value in the nearby graphic).
    • by jacobito (95519)

      Captchas are currently great for weeding out automated spammers; unfortunately, they're also great at weeding out people who cannot see. This unnecessarily renders your site inaccessible to a portion of your audience. From a geekier perspective, this sort of assumption-laden web design runs completely contrary to the accessible, device-independent spirit of the original WWW.

      Of course, since the blog you linked doesn't even work at all as I write this, maybe you're not concerned with accessibility for

      • Works fine for me, sorry if you have a problem.

        I have seen this sort of challenge with an audio option for the sight-impaired. I'll see if that's an option for us.

        In the meantime, if my choice were between having the spam and this accessibility problem, I'll put up with the accessibility problem for now and look for a solution to it. The spam was intolerable and the only thing blind users are denied is the ability to post.
        • My apologies for jumping on the temporary issue with your web site, which was occurring for me on Firefox 1.0 for Windows and Mac OS X, but which righted itself shortly after I made the comment.
  • Call me untrendy, but I still like dotcomments [yahoo.com].
  • They hired Jay Allen, creator of MovableType blacklist, as project manager, but MT BL is not part of the standard distribution. It's not a standard feature, nor is there anything designed in house that provides the same functionality if God-forbid Jay Allen won't let them bundle it as a standard feature. The worst part is that it is having major problems working with MT 3.121, the latest release.

    Personally I think MT needs to just scrap the entire comment system and start over again. They need to implement
    • >This is why we need something like the Child Online Protection Act.

      This is exactly why we DON'T need "won't someone think of the children" legislation. You're going to put up with massive censorship because of some blog spam that can be easily fixed with typekey, blacklists, etc? For some useless blog comments we're going to censor the web? Wow. Amazing, how Americans can even suggest such a thing. So much for the land of the free, eh?

      Like all mediums, parents should be making sure their children ar
    • You're a little behind the curve. MT hired Jay Allen specifically so he could integrate his antispam tools into the standard MT distribution. He's only worked there a short time, do you seriously expect quality software to appear overnight?
  • I am entirely unfamiliar with the issue of spam as it pertains to blogs. Are spammers placing ads (as in, posting their URLs) to random people's blogs? Or is the problem that they are just polluting the comment list with random garbage?

    If the issue is posting of URLs, then it should be a simple matter of the blog site checking any URLs against SURBL [surbl.org], a spam URL blocklist.

    What am I missing here? When did this become such a huge issue?
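A rough sketch of the SURBL check suggested above, assuming the usual DNS blocklist convention: the host is prepended to the list's zone and looked up, and an answer in 127.0.0.0/8 means "listed". The zone name `multi.surbl.org`, the function names, and the injectable `resolve` parameter are assumptions for illustration. A real client would also reduce each host to its registered domain before querying, which this sketch skips.

```python
import re
import socket
from urllib.parse import urlparse

URL_RE = re.compile(r"https?://[^\s\"'<>\]]+", re.IGNORECASE)


def extract_hosts(text):
    """Collect the lowercased host names of all http(s) URLs in a comment."""
    hosts = set()
    for url in URL_RE.findall(text):
        host = urlparse(url).hostname
        if host:
            hosts.add(host.lower())
    return hosts


def is_listed(host, zone="multi.surbl.org", resolve=socket.gethostbyname):
    """Query the blocklist zone; a 127.x.x.x answer is treated as a listing."""
    try:
        answer = resolve(f"{host}.{zone}")
    except OSError:  # lookup failed (e.g. NXDOMAIN): treat as not listed
        return False
    return answer.startswith("127.")


def comment_has_listed_url(text, resolve=socket.gethostbyname):
    """True if any URL in the comment points at a listed host."""
    return any(is_listed(h, resolve=resolve) for h in extract_hosts(text))
```

Passing a custom `resolve` function makes it possible to exercise the logic without touching the network.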
    • by crayz (1056) on Saturday December 18, 2004 @06:04PM (#11126491) Homepage
      A few problems, as a Wordpress user and as someone who's run into problems w/ other people's MT blogs:
      - spam bots attack WP and MT through various means, one of the most common being to simply POST to the mt-comments.cgi or wp-comments-post.php URLs on people's sites
      - the bots mainly post huge amounts of links to stupid websites, like viagra or poker strategy. the goal is to get a higher google ranking by having links from many different sites
      - the biggest problem for WP users is that you get flooded with literally hundreds of comments per day. if you have good filtering you'll at worst just have to sit around and delete some manually
      - the biggest problem for MT users(or that MT users cause) is that because of the poor design of MT, the comments script takes up a huge amount of CPU time. apparently it actually goes through the process of rebuilding the static post pages even when comments are moderated or auto-deleted. now imagine you have 500 posts and they all get hit at the same time - it's something close to a forkbomb on the server

      The best solution to all of this is to find a way to prevent the stuff from ever getting posted. Once it's submitted you're going to have to analyze it in some way and decide if it's SPAM or it's good. There are some simple solutions like renaming the comment post scripts, and some more complicated ones like using a verification number or requiring users to register. In any case, it's a very major problem for almost anyone with a blog.
      • How would requiring users to register help? Spammers can register more easily than legitimate users can.
        • Registering is a good way of filtering people. You can force them to do things like get approved first, or get them to provide a valid e-mail address and receive an e-mail and click on a link within it, etc. Also makes it easier to then kill anyone who's spamming not just by IP or URL but by username

          All this prevents the simplistic SPAM bots from just POSTing to your cgi scripts and forces them to jump through hoops
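One way the "click a link in an e-mail" step above could be sketched is with a signed token: the site mails the user a link containing a token derived from their details, and only activates the account when the same token comes back. The `SECRET_KEY` value and function names below are hypothetical; this is an illustration, not how any particular blog package does it.

```python
import hashlib
import hmac

# Hypothetical server-side secret; it must never be sent to the user.
SECRET_KEY = b"change-me"


def make_token(username, email, key=SECRET_KEY):
    """Token to embed in the confirmation link mailed to the user."""
    msg = f"{username}\n{email}".encode("utf-8")
    return hmac.new(key, msg, hashlib.sha256).hexdigest()


def verify_token(username, email, token, key=SECRET_KEY):
    """True only if the token was issued for this username and address."""
    expected = make_token(username, email, key)
    # Constant-time comparison on bytes, so odd input cannot raise.
    return hmac.compare_digest(expected.encode("utf-8"), token.encode("utf-8"))
```

A bot that just POSTs to the comment script never sees the mailed token, so its account stays unverified.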
    • Yes, they post comments which are basically just a list of URLs with lots of links to their sites. The theory being that this will increase their page rank. Luckily, MT already has a blacklist to filter those out but it has to be updated constantly.

      The funny thing is that we (another weblog system, but suffering from the same problem) are seeing a lot of spam posts recently where they put the link text into the href attribute and the actual URL as the link text. Not sure what they're trying to accomplish w
  • Bla bla bla bugs yada yada proprietary yatta yatta use open source!

    There, HAND.
  • by crayz (1056)
    I work for a web host and we've had this issue. 744 on mt-comments.cgi. Sorry guys.
  • NoIndex HTML Tag (Score:3, Insightful)

    by beebware (149208) on Saturday December 18, 2004 @05:54PM (#11126439) Homepage
    At the start of this year (Jan 2004), I actually proposed a possible solution to avoid this sort of thing [beebware.co.uk]. Basically, Google et al starts recognising:
    <!-- robots:noindex --> / <!-- /robots:noindex -->
    And then bloggers can put the comments section of their sites inside the HTML "no index" markup and hence if they are hit by comment spam, Google and the other search engines ignore that content.
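To illustrate the proposal above, here is a small sketch of both sides: a blog wrapping its comments section in the proposed markers, and an indexer dropping everything between them. The marker syntax is the poster's own proposal, not something search engines are known to honor, and the function names are hypothetical.

```python
import re

# Matches a region delimited by the proposed comment markers.
NOINDEX_RE = re.compile(
    r"<!--\s*robots:noindex\s*-->.*?<!--\s*/robots:noindex\s*-->",
    re.DOTALL | re.IGNORECASE,
)


def wrap_noindex(html):
    """Blog side: mark a fragment (e.g. the comments section) as not to be indexed."""
    return f"<!-- robots:noindex -->{html}<!-- /robots:noindex -->"


def strip_noindex(page):
    """Crawler side: remove marked regions before indexing the page."""
    return NOINDEX_RE.sub("", page)
```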
    • It might help, but I would rather have Google be searching the comments as well as the main post! Even if comment spam is a problem, you don't want to lose all the other comments that might have value.

      Perhaps Google could recognize a Movable Type site and just ignore comments from them.
  • by yerdaddie (313155) on Saturday December 18, 2004 @06:03PM (#11126483) Homepage
    I myself run an MT blog and have been contemplating moving to wordpress to dodge the spam bullet, however temporarily.

    It occurred to me, though, that what would really fix this is to push the load onto the spammers by building a Reusable Proofs of Work (RPOW) [cryptome.org] system.

    For those who are unfamiliar, RPOW is a proposal to stop mail spam by asking the sender to do a little "work" that would make sending a lot of emails computationally too expensive.

    As I'm in the last throws of my PhD I'll have to delay on this one, but maybe the lazy web can help out on this one, so the same thing doesn't happen to wordpress or whatever blogging monocultures exist.
    • That's what the WordPress plugin Spam Stopgap Extreme [elliottback.com] does.
    • As I'm in the last throws of my PhD...

      What's the saving throw vs. dissertation committee?
  • "Blog" software predates the existence of a separate category of "blog software", and most of the older stuff works better. SlashCode, I hear, has been known to run several high-traffic sites. There is also Scoop, which was developed for kuro5hin.org, and used at a few other places (like dailykos.org). Both are also much more full-featured than your average "blog software", especially in that they include threaded comments.
  • I've used Wordpress ever since it branched off from b2. Unfortunately, its success has made it a good target for comment spam. The available plugins, such as Farook's WPBlacklist, work really well. However, the amount of incoming spam attempts is sort of like a DDOS attack on us little guys who have servers running on their home cable lines. It's just disappointing that we have to put up with this.
  • The solution is to implement authentication images, much like paypal or the like use when you register. It generates some odd-looking image with a few characters and digits in it, and you as the user have to type it in.

    There is a system like this for wordpress called wp-authimage [gudlyf.com] that works quite well. You do have to know a bit of php and it requires GD on your webserver, but neither of those things are super-difficult. I used it on a blog I run [thisisagang.org] with some friends and it works quite well. Our comment s
  • Netcraft confirms it; Movable Type is dying!

    Sorry, had to plug that one. I run Drupal for my CMS, and lately I've been getting some 'free poker' spams in my comments. I've installed the Spam module and am holding my breath. Do modules like that work in MT?

    Time for me to go check my friends MT sites...

    CB
  • I run WordPress and used to get hit by many casino/cialis spams. I found that I get no comment spam after using a WP hack (http://www.gudlyf.com/index.php?p=376) called AuthImage, which is a CAPTCHA (basic Turing test based on character recog.) I strongly recommend it, and would be grateful to any OSS vigilante who could port it to a proper WP plug-in.
  • I'm kind of excited, kind of disappointed. I run a blog [n1zyy.com] with ten different posters running MT. We've been getting slammed with comment spam lately. I just assumed it was in relation to Google starting to move my site up a bit in the ranks. Apparently not. :(

    At first, most of the spam was from obviously-fictitious domains. I earned myself weeks of absolute lack of spam by throwing this into /lib/MT/App/Comments.pm -- I started mine a few lines after line 150 in my case:

    # If an e-mail address is given... ma
  • (Not minorities. Don't even start.)

    Just takes a few assholes to ruin a public resource. They're like the people who steal and/or vandalize phonebooks in the public phone booths.

    Bring punch to the party, and somebody will want to piss in it.

  • The problems are somewhat bigger than they mention. MT performs some very heavy database activity to even get to the point of finding that comments have been disabled completely. Even without triggering the page rebuilds, several hundred requests coming in will grind the server to a halt. The problem is compounded if you're running a flat database backend like sqlite, which does huge memory allocations and can launch you into a swapfest.

    Given that instances of mt-comments.cgi are expensive even when they n
