Vendors Take Blame For Most Data Center Incidents

dcblogs writes "External parties that work on a customer's data center or supply equipment to it, including manufacturers, vendors, factory representatives, installers, integrators, and other third parties, were responsible for 50% to 60% of the abnormal incidents reported in data centers, according to the Uptime Institute, which has been collecting such data since 1994. Operations staff, by contrast, were blamed for 34% of abnormal incidents in 2009, 41% in 2010, and 40% last year. Another 5% to 8% of incidents each year were tied to things like sabotage, outside fires, and other tenants in a shared facility. But when an abnormal incident leads to a major outage that causes a data center failure, internal staff get the majority of the blame. 'It's the design, manufacturing, installation processes that leave banana peels behind and the operators who slip and fall on them,' said Hank Seader, managing principal of research and education at Uptime."
  • I think it's time to switch to Gamemaker. Have to face the music some day, yes?

  • by Karmashock ( 2415832 ) on Wednesday February 29, 2012 @04:52PM (#39202019)

    I'm sure outside forces installing things are disruptive. But then, are they the primary forces doing installations in general? If that's the case, it would be more appropriate to call these simply installation-related issues... and those are both common and to be expected.

    Install anything new and teething issues tend to crop up.

    • by Anonymous Coward

      I think you are 100% correct. I would also consider it the responsibility of the operational staff of a datacenter to properly monitor any external vendors granted access to the datacenter and ensure they don't break anything. I mean, as a customer of Datacenter A, I don't really give a crap whether the person who took down my network works for Datacenter A or for Vendor B that Datacenter A hired. The fact is, my SLA is with Datacenter A, and they are 100% responsible for maintaining the integrity of that datacenter.

      • by Imrik ( 148191 )

        What happens when the person who took down the network works for Vendor C that you hired?

    • Speaking only anecdotally, I've found, on many occasions when installing server applications, that the vendor's installation mechanism breaks the system in some way. These products can't be trusted in their default configuration, yet the nature of software installation entails an elevated degree of trust.

      A characteristic example would be a network service which inserts its own startup behavior into one of the standard Unix scripts in /etc/init.d rather than providing its own standalone script. There's
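
      That /etc/init.d pattern is the kind of change a simple before/after check will catch. Below is a minimal sketch of such a check; the paths, function names, and the Debian-style /etc/init.d layout are assumptions for illustration, not anything the commenter described.

```python
# Minimal sketch (hypothetical helpers): snapshot the standard init scripts
# before running a vendor installer, then diff the checksums afterwards to
# spot installers that edit shared startup scripts instead of shipping a
# standalone one.
import hashlib
from pathlib import Path

INIT_DIR = Path("/etc/init.d")

def snapshot(init_dir: Path = INIT_DIR) -> dict[str, str]:
    """Map each init script name to the SHA-256 of its contents."""
    return {
        p.name: hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(init_dir.iterdir())
        if p.is_file()
    }

def diff_snapshots(before: dict[str, str], after: dict[str, str]) -> None:
    """Report scripts the vendor installer added, removed, or modified."""
    for name in sorted(before.keys() | after.keys()):
        if name not in before:
            print(f"ADDED    {name}")   # a standalone script: the polite case
        elif name not in after:
            print(f"REMOVED  {name}")
        elif before[name] != after[name]:
            print(f"MODIFIED {name}")   # the 'banana peel' case described above

# Usage: before = snapshot(); <run vendor installer>; after = snapshot();
#        diff_snapshots(before, after)
```
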
      • It's also very, very easy to blame the guy who was there for a day and then isn't there the next to defend himself.

        I've run into a few situations where a coworker blamed mistakes they made on someone who had recently been fired or who only came to the office occasionally.

        Why not, after all? What are they going to say in their defense?

  • They need to monitor and control their vendors/contracts/etc better.

    • but, but, but, THAT'S TOO HARD when passing the buck is so much easier!
    • Is there a category for warning the PHB that it won't work and being told "I don't need excuses! Just get it done!" as that might be the missing 102%...
      • by Archangel Michael ( 180766 ) on Wednesday February 29, 2012 @05:10PM (#39202231) Journal

        Overworked, understaffed, handed three new projects this month while only closing one that was already in the works. It isn't that it's too hard or that it can't be done; it's that we don't have the time to do it right, because we're still cleaning up the mess from the last three projects that were "critical" and came in over budget and late. We'd be outsourced, but the cost of hiring an outside vendor is about 10x what in-house staff costs, and they would charge more for each project added.

        Which is why I no longer try to do things on a "low budget" and why everything I look at is Enterprise level. Enterprise level allows me to blame the vendor, because THEY are the ones selling this shit to the PHB who doesn't know how ridiculously oversimplified the vendor makes it sound.

    • by sociocapitalist ( 2471722 ) on Wednesday February 29, 2012 @05:19PM (#39202319)

      Nice try -

      The reality is that whenever something goes wrong, the vendors/contractors are almost always blamed regardless of who is at fault. It's standard business practice for the customer to bring in a vendor for just this reason: if something goes wrong, they can point at the vendor. The bigger the vendor's name, the better this works. If you can bring the manufacturer(s) in, that's best of all. Who can blame you then?

      The 'abnormal' incidents where an internal employee is blamed are probably instances where there was absolutely no way for that employee to escape responsibility (i.e., the syslog shows that user logging in with a one-time password token in his possession, so there's no chance of the "the vendor has my username and password" bullshit, and then entering the command 'reboot').

      I'm not saying that vendors, contractors, and manufacturers don't make mistakes -- they're human, and from the manufacturer's standpoint there are always bugs that are going to cause problems. I'm just saying that this internal-versus-external blame dynamic should be taken into account, and these statistics taken with a very large grain of salt.
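
      To make the attribution angle concrete: evidence of the kind described above usually comes straight out of the auth log. Below is a minimal sketch of that sort of log correlation; the log path, line formats, and function name are assumptions for illustration, not anything from the comment.

```python
# Minimal sketch (assumed syslog/sudo line formats): scan an auth log for
# interactive logins and for sudo invocations of 'reboot', and report which
# logged-in users issued one.
import re

LOGIN_RE = re.compile(r"session opened for user (?P<user>\w+)")
REBOOT_RE = re.compile(r"sudo:\s+(?P<user>\w+)\s+:.*COMMAND=\S*reboot")

def attribute_reboots(log_path: str = "/var/log/auth.log") -> list[str]:
    """Return users who both logged in and issued a reboot, in log order."""
    logged_in: set[str] = set()
    culprits: list[str] = []
    with open(log_path, errors="replace") as fh:
        for line in fh:
            if (m := LOGIN_RE.search(line)):
                logged_in.add(m.group("user"))
            elif (m := REBOOT_RE.search(line)) and m.group("user") in logged_in:
                culprits.append(m.group("user"))
    return culprits

if __name__ == "__main__":
    print(attribute_reboots())
```
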

      • As a vendor, I will attest to this. I have 'fallen on the sword' more than once for employees to save face.

    • by s73v3r ( 963317 ) <s73v3r@gSLACKWAREmail.com minus distro> on Wednesday February 29, 2012 @05:24PM (#39202359)

      I honestly wonder how many of these incidents blamed on outside vendors are actually the result of something the outside vendor did, and not the result of some manager yelling and screaming loud enough to make the vendor do something to shut him up and not lose business.

  • Blame game? (Score:3, Insightful)

    by swb ( 14022 ) on Wednesday February 29, 2012 @04:54PM (#39202041)

    It sounds like this is just some kind of tool to show that "it's not our fault, really" -- but at the end of the day, aren't the internal staff responsible for managing the "outside forces," up to and including setting standards, providing supervision, etc.?

    Or is this one of those deals where so much is outsourced that it's easy for everyone to deny culpability?

    • Depends -- if you bring in HP or IBM to provide a solution for you all the way from business requirements through applications development, systems, networking, security, etc., and then at implementation you have IBM, Juniper, Cisco, etc. on site to support it and something happens... you've covered yourself perfectly. No one can blame you, because you got 'the best in the business'.

  • by Above ( 100351 ) on Wednesday February 29, 2012 @04:58PM (#39202091)

    Corporate America loves to outsource. Not because it's efficient or cheap, but because it provides someone to blame!

    Outsource the network to one firm, the generator to another, the HVAC to a third. Hire temp contract lackeys to staff the place, and rent-a-cops to "guard" it. Then, when something goes wrong, blame them. If it's a big enough issue fire them and replace them with the next batch of people who won't be trained, won't care, and will eventually screw up.

    This article isn't illuminating, it's simply restating the design parameters of the system!

    • by swb ( 14022 )

      This is what I really wanted to post.

      And it's not like it's even done "intentionally" to find someone to blame; it's just that there is SO much outsourcing that the buck stops... nowhere. Everybody does the least amount they possibly can to keep something from going wrong, because they (well, hell, *I*) know it will, because there's inadequate training, documentation, and testing, PHBs and Suits screaming about how late projects are, nobody bought enough storage/CPU/bandwidth/amperage, and the aforementioned suits/

    • by dkf ( 304284 )

      Corporate America loves to outsource. Not because it's efficient or cheap, but because it provides someone to blame!

      Outsource the network to one firm, the generator to another, the HVAC to a third. Hire temp contract lackeys to staff the place, and rent-a-cops to "guard" it. Then, when something goes wrong, blame them. If it's a big enough issue fire them and replace them with the next batch of people who won't be trained, won't care, and will eventually screw up.

      They're forgetting that the one thing they cannot outsource is the overall responsibility for having things work well enough to support their business; if they get rid of that, then they've eliminated the need for them to exist at all (and their supplier will simply cut them out of the equation with no ill consequences). If things keep failing horribly because the people they're outsourcing to suck, it's Corporate America's fault for outsourcing to the wrong people (or outsourcing at all).

      Mind you, it migh

  • by Anonymous Coward

    80-90% of the abnormal incidents caused by vendors were the previous vendor's fault.

  • by Chemisor ( 97276 ) on Wednesday February 29, 2012 @05:05PM (#39202171)

    It's the design, manufacturing, installation processes that leave banana peels behind and the operators who slip and fall on them

    When a company tries to get around minimum wage laws by hiring low-paid monkeys to do their design, manufacturing, and installation, they get exactly what they deserve.

  • Comment removed (Score:4, Insightful)

    by account_deleted ( 4530225 ) on Wednesday February 29, 2012 @05:07PM (#39202195)
    Comment removed based on user account deletion
  • by L3370 ( 1421413 ) on Wednesday February 29, 2012 @05:08PM (#39202213)
    If you let them in your datacenter, it's your fault if anything goes wrong in there.
    If your vendor botches a deployment or delivers a functionally useless product, it's your fault for buying into their marketing campaign and not understanding what you just got yourself into.

    But mostly, I think the blame system was by design here...Hire someone else to do the job for everything possible. Fire them/drop contracts when they don't work for you, then file insurance claims to compensate (plus extra if you do it right) for the damages. The trick is to keep the damages rolling as expected--enough to keep insurance revenues up, but not enough so that your premiums adjust to make it unprofitable.
  • I don't know, but I have been taking these pills, my wife is happy, and they were recommended to me by the Uptime Institute as well, so this study must be close.

  • by Joe_Dragon ( 2206452 ) on Wednesday February 29, 2012 @05:32PM (#39202467)

    Contractors and subcontractors add middlemen and overhead.

    Sometimes to the point where a sub may get a job with little to no documentation, or a job with poor or bad documentation.

    Or a sub may hit an issue and have to work through a lot of middleman off-site managers to get things fixed, or just be told to do as the documentation says and "we will have to get another contract to fix it."

    • by Anonymous Coward

      As a contractor I experienced this exact thing in a datacenter. During installation of fiber infrastructure I noticed some anomalies in the scope of work. The on-site contact pretty much said to do exactly what it said (which was wrong/incomplete) because it was designed by the corporate level-1 engineers. Of course, that contact then retired, and the replacement came in and blamed me for an unfinished job.

  • by Joe_Dragon ( 2206452 ) on Wednesday February 29, 2012 @05:37PM (#39202515)

    How about when data center staff are working on another company's server rather than the one they should be working on?

    http://thedailywtf.com/Comments/Remotely-Incompetent.aspx [thedailywtf.com]

  • by billybob_jcv ( 967047 ) on Wednesday February 29, 2012 @06:44PM (#39203207)

    Back in the prehistoric days a group of us were sitting in a bull-pen outside the datacenter. There were big windows on the datacenter wall so we could all ooh & ahh at the blinky lights on the servers and switches. Suddenly, my workstation froze - and when I (and every other person in the bullpen) yelled and looked up, we saw our network admin standing in the datacenter looking back at us with a "What?" look on his face. In his hand was the Ethernet cable he had just pulled out of a core switch...

       

  • by hawguy ( 1600213 ) on Wednesday February 29, 2012 @08:00PM (#39203803)

    Is this surprising? The vendors/contractors do more of the risky work. When it comes time for UPS maintenance, our vendor comes in to take the UPS offline and do the work. If they screw up the bypass, they can take down the datacenter. Likewise, when it comes time to add a new disk tray to the storage system or replace a failed controller board, instead of having our own staff do it (who might add one tray a year, if that), we have the vendor do it -- so when something does go wrong, it's the vendor's hands on the gear, even though the vendor's engineer, who does this twice a week, is less likely to cause a problem than our own staff would be.
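
    To put hypothetical numbers on that point (none of these figures come from the comment): even if the vendor's per-task error rate is much lower, doing far more of the risky tasks can still leave them attached to more incidents per year.

```python
# Hypothetical numbers, just to make the parent's point concrete: the vendor
# touches the risky gear far more often, so even with a lower per-task error
# rate they end up tied to more incidents per year than in-house staff.
vendor_tasks_per_year, vendor_error_rate = 100, 0.01   # practiced, frequent
staff_tasks_per_year, staff_error_rate = 2, 0.05       # rusty, rare

print("expected vendor-caused incidents:", vendor_tasks_per_year * vendor_error_rate)  # 1.0
print("expected staff-caused incidents: ", staff_tasks_per_year * staff_error_rate)    # 0.1
```
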

  • Quality in a data center, or any facility for that matter, depends on controlling the processes within that facility. If vendors have signed on to working within the procedures developed by the data center operators, fine. There should be minimal problems. But if vendors are allowed on the property to do work not covered by these plans and controls, antics will ensue.

    There is nothing inherently wrong with bringing in outside vendors. As long as their function has been planned for. And there is some means

  • But with the possible exception of a meteor strike, there's always someone to blame for a data center problem.

    I always blame Anonymous Coward. He's the one that failed to order the meteor shields.

  • Several years ago, I was working a support case with a major bank. Their remote storage mirroring between BFE, [Southwest State Here] and BFE, [Flyover Country State Here] failed, and they wanted to know why. I obtained SAN switch logs from both fabrics and attempted to troubleshoot the issue. The logs revealed that the network ports dropped offline one by one, about 5-7 seconds apart, and then the problem hit the other switch. They came back online one by one about three minutes later. The ports were scat
