How Do You Evaluate a Data Center?

Posted by ScuttleMonkey
from the check-for-major-fault-lines dept.
mpapet writes to ask about the ins and outs of datacenter evaluation. Beyond the simpler questions of physical access control, connectivity, and power redundancy/capacity and SLA review, what other questions are important to ask when evaluating a data center? What data centers have people been happy with? What horror stories have people lived through with those that didn't make the cut?
  • by Critical Facilities (850111) * on Monday November 09, 2009 @05:29PM (#30038552) Homepage

    Beyond the simpler questions of physical access control, connectivity, and power redundancy/capacity and SLA review

    Well first of all, I don't know that I'd write any of those things off as "simple". But some other points worth looking into would be:

    - Raised Floor Height
    - Cable Management (over or under floor)
    - Cooling Capacity and Redundancy
    - Power Quality (not just redundancy)
    - Age and Condition of Electrical Hardware (ATSs, STSs, UPSs, Generators)
    - Outage/Uptime History
    - Fire Suppression System and Smoke Detection System
    - Maintenance records
    - Maintenance records
    - Maintenance records
    • by jeffmeden (135043) on Monday November 09, 2009 @05:39PM (#30038686) Homepage Journal

      Add to that:

      -KW deliverable to each rack

      -Ambient temperature in the cold aisle and how closely it's held (and possibly make it part of SLA)

      -On site technicians (and/or security) and their hours

      -Customer access policy and applicable hours (are you going to be happy, AND are threats going to be kept out?)
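On the kW-deliverable-per-rack point: a rough sanity check is breaker voltage times amperage, derated for continuous load (the NEC-style 80% figure), summed over the rack's circuits. A minimal sketch; the circuit count and derating below are illustrative assumptions, not any particular facility's numbers:

```python
def usable_kw_per_rack(volts, amps, circuits=2, derate=0.8):
    """Rough usable power per rack: breaker rating derated for
    continuous load (NEC-style 80%), summed over all circuits."""
    return volts * amps * derate * circuits / 1000.0

# Two 208 V / 30 A feeds -> roughly 10 kW usable
print(usable_kw_per_rack(208, 30))
```

Compare that figure against your actual per-rack draw before signing anything; the derating and circuit mix should come from the facility's own electrical drawings.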

      • Re: (Score:2, Interesting)

        by NervousWreck (1399445)
        Maintenance records Maintenance records Maintenance records are Moses and all the prophets
      • by rijrunner (263757)

        Do they have remote console capability?

        Onsite or offsite support?

        Level of skills of onsite support.

        How many people are onsite during offhours?

        Is your equipment secure?

        Can you come out and service your equipment? Do you have to be escorted?

        How quickly can equipment go from being ordered to being deployed?

        What are their change procedures? Are there change windows when all work needs to be done?

    • by whoever57 (658626) on Monday November 09, 2009 @05:40PM (#30038706) Journal
      That's interesting, but the OP really needs to know what is good or not. For example, you state "Raised Floor Height". What is good? Newer datacenters don't have raised floors because it is more energy efficient to have concrete floors. "Cooling Capacity" -- what's good and what is bad? How is this measured? Some datacenters may talk about how cool they keep the ambient air, but there isn't much evidence that this actually makes a noticeable difference to the lifetime or any other factor related to the equipment.
      • Good question - I would expect it's mostly relevant for those DCs that do underfloor cabling. Cooling capacity is measured in kW, I believe - the ability to remove heat. Since air has a low specific heat, it shouldn't matter much, so long as the temp is stable and it isn't very humid.
      • by JWSmythe (446288) <jwsmytheNO@SPAMjwsmythe.com> on Monday November 09, 2009 @05:58PM (#30038964) Homepage Journal

            I noticed something when touring one datacenter. They had a neat conference room that overlooked the whole datacenter. You could see the heat rising off of one area (Google's room). They went on and on about the wonders of their cooling, and how they had so much capacity.

            We later took the guided tour. The person I was with was talking to our guide, and I was paying careful attention to our environment. There were tremendous hotspots on the floor. We're not talking about 78 degrees. It was closer to the 90s. Other spots were downright cold. Why? Because they had all this capacity, and no real planning. The circulation was insufficient, even though the capacity was available. A well-populated rack will always be hot at the back, but it's expected that they will draw the air off of that area rather quickly. I've even seen datacenters that enforce their hot/cold aisles, but then there isn't much of a reason for it: there is no air return on the hot side, and it's just blowing at another aisle's cold side.

            Sometimes it's good to just walk the floor with a tech (not a salesman), and ask questions about the operation. What kind of fiber do you have coming in? How many providers? How good are your generators really? Do you test them on a regular basis? I've found a sales minion will say there are a dozen providers coming in, but it will turn out that only one has substantial fiber, and the others are sharing that. {sigh} Sometimes they will have generators, but they've never test-fired them. Sometimes the tech is just frustrated at the nonsense at that datacenter, and that's indicative of how it's going to be to work with them.

               

        • Re: (Score:2, Interesting)

          by outlander (140799)

          One thing not mentioned: a rigorous procedure for handling of decommissioned equipment. Failure to have proper audit mechanisms in place for hw removal is asking for data theft.

          • by JWSmythe (446288)

                That's another whole story. :) Don't make me tell you about the Catalyst 5000 switch I bought on eBay, just to find it was still configured for the classified network it was on.

        • by atrus (73476)

          You want contained hot aisles or contained cold aisles to maintain maximum efficiency. You want managed airflow.

          It's perfectly OK for the hot aisle to be at 100+F. It's also perfectly OK for the cold aisle to be in the mid 70s, as long as there is no stratification or leakage (the top of the rack should be within limits). What you want to see is offline CRAHs, or VFDs installed in the CRAHs throttling back their airflow.
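One way to sanity-check whether the airflow matches the heat load is the standard sensible-heat rule of thumb for air, CFM ≈ BTU/hr ÷ (1.08 × ΔT°F). A sketch; the 10 kW rack and 25 °F cold-to-hot rise below are made-up numbers for illustration:

```python
def required_cfm(kw_load, delta_t_f):
    """Airflow (CFM) needed to carry away kw_load of heat at a given
    cold-to-hot aisle temperature rise, using the standard
    sensible-heat rule of thumb for air: BTU/hr / (1.08 * dT)."""
    btu_per_hr = kw_load * 3412.0  # 1 kW ~= 3412 BTU/hr
    return btu_per_hr / (1.08 * delta_t_f)

# A 10 kW rack with a 25 F rise needs roughly 1264 CFM
print(round(required_cfm(10, 25)))
```

Note the trade-off the formula makes explicit: a wider hot/cold split (bigger ΔT, i.e. contained aisles with no mixing) means the same load needs much less airflow, which is exactly why throttled-back CRAHs are a good sign.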

          I work at a company which specializes in monitoring and helping customers improve their cap

          • by JWSmythe (446288)

            I'm really happy to see when people know what they're talking about. :)

            At one datacenter, I worked closely with the site engineer. Our cage was fairly stuffed, and he used ours as an example to newbie customers on how to do some things. The center of our cage was our cold "aisle". The outside was our hot side (just two rows of servers). He'd bring his infrared temperature sensor along, and point out that the back of our rack was only about 10 degrees hotter than the front, because we did account for good air f

      • Re: (Score:3, Informative)

        you state "Raised Floor Height". What is good?

        24" is good, 36" is better. I once had a place with 8'0".

        Newer datacenters don't have raised floors because it is more energy efficient to have concrete floors.

        Hogwash.

        Cooling Capacity" -- what's good and what is bad? How is this measured?

        Capacity is measured in BTUs per hour, or more commonly in tons (12,000 BTU/hr to the ton). What's most important is the relationship between BTUs and kW consumption. In a nutshell: how much heat can you remove from the building vs. how much are you putting in?
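Tying those units together: one ton of cooling is 12,000 BTU/hr, and 1 kW of electrical load dissipates roughly 3,412 BTU/hr, so the heat-removed vs. heat-added comparison can be sketched like this (the 20% safety factor is an illustrative assumption, not a sizing rule):

```python
BTU_PER_KW = 3412.0    # 1 kW of electrical load ~= 3412 BTU/hr of heat
BTU_PER_TON = 12000.0  # 1 ton of cooling = 12,000 BTU/hr

def tons_needed(it_load_kw, safety_factor=1.2):
    """Cooling tons required to remove the heat of a given IT load,
    padded by a safety factor for fans, lights, and losses."""
    return it_load_kw * BTU_PER_KW / BTU_PER_TON * safety_factor

# 100 kW of IT load -> ~28.4 tons raw, ~34.1 tons with 20% headroom
print(round(tons_needed(100), 1))
```

If the facility's installed tonnage divided by its contracted kW comes out near (or below) the raw ratio of about 0.28 tons per kW, there is no redundancy headroom at all, which is worth asking about.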

        • "Newer datacenters don't have raised floors because it is more energy efficient to have concrete floors.

          Hogwash."

          Oh, well, thank goodness we got that out of the way.

          In all seriousness though, do you have any verifiable information about this? I was under the assumption that every datacenter had a raised floor but I really don't have any idea where this idea came from.
        • Re: (Score:3, Informative)

          by whoever57 (658626)

          Newer datacenters don't have raised floors because it is more energy efficient to have concrete floors.

          Hogwash.

          Yeah, what do I know about the subject? I'm just quoting from a recent talk given by Subodh Bapat, Vice President, Energy Efficiency and Distinguished Engineer, Sun Microsystems.

          Oh, and there are some articles about this [greentechmedia.com]

          But please, continue to refute my statement with clear, unsupported, single-word denials. They carry so much weight in an argument.

          • Let's be clear - if you have hot-aisle cold-aisle with proper plenum separation, i.e. either walls or curtains blocking airflow from cold to hot other than via the racks and systems - then you can run all the air from the ceiling fine and it doesn't matter.

            If you do not do that, you have to pump up the cold air pressure to jet it down so that it goes down into the cold aisles to floor level, and does not get sucked right back up into the return vents in the hot aisles in a loop up near the ceiling.

            Once you

          • But please, continue to refute my statement with clear, unsupported, single-word denials. They carry so much weight in an argument.

            Yeah [datacenterknowledge.com], you're probably right. I mean, no one [datacenterknowledge.com] is putting in raised floor environments anymore. I don't know what I was thinking [datacenterknowledge.com].

            Quote all you want. I run an enterprise data center, and I can tell you that raised floor is certainly NOT dead.

            • A 12" slab with mesh (what mine is) does have a higher load-bearing capacity than a raised floor, though. Equipment weight and load distribution are completely a non-issue.

              • While that's true, it's not really too much of an issue. The really, really heavy equipment (CRAC Units, PDUs, RDCs, etc) typically have their own stands, which are placed directly on the concrete slab below the raised floor. The advantages in flexibility with regard to airflow and cable management that exist with a raised floor environment are numerous.

                Mind you, I'm not suggesting that one cannot have a successful, efficient data center without a raised floor, I am merely responding to an earl
    • by Sandbags (964742) on Monday November 09, 2009 @06:00PM (#30038992) Journal

      - Raised floor is certainly important, and a given. Check.
      - Cable management above AND below the floor. This is not an either-or... Check.
      - Cooling capacity is hard to judge, and should be scalable. Redundancy is often overlooked but is often even more important than capacity... Check.
      - Power quality: never seen a big datacenter without a Liebert, or at least a UPS in every rack. Power does not have to be conditioned except between the UPS and the machines/devices. A whole-datacenter power conditioner is often more efficient, but unnecessary for the little guys. Either way: check.
      - Age is irrelevant as long as it's under support. If it's not, replace it. Generators need to be run several times a year to validate their condition, and also to grease the innards... Seen too many good generators get kicked on and fail an hour later because the oil hadn't been changed in 3 years...
      - Outages should be tracked by system, rack row, and power distro. When systems seem to be going down more frequently in one area, there's usually an underlying reason... As Google recently proved for us all, do not ASSUME all is well; routine diagnostics including memory scans should be performed on ALL hardware. Even ECC RAM deteriorates with age (rapidly) and needs to be part of a maintenance testing and replacement policy. Check.
      - Fire suppression is usually part of your building codes, and a given, as are the routine checks (at least annually) required by law.

      In addition, we deploy:
      - Man traps on all entrances to data centers. You go in one door, it closes, then you authenticate to a second door. A pressure plate ensures only one person goes in/out at a time (and if it's tripped, a security guy looking at a screen has to override).
      - Full 24x7 video surveillance of the data centers.
      - In/out logs for all equipment. Taking a device in or out of a datacenter requires it being logged in a book (by a designated person). This applies to anything the size of a disk/tape and larger. All drive bays are audited nightly by security, and if drives go missing, security reviews the access logs and server room security footage to see who might have taken them.
      - Clear and consistent labeling systems for racks, shelves, cables, and systems.
      - Pre-cable just about everything to row-level redundant switches, and have no cabling from server to other servers that doesn't pass through a rack/row switch first. Row switches connect to distro switches. This ensures cabling is simple and predictable.
      - Color-coded cabling: we use one color for redundant cabling (indicating there should be 2 of these connected to the server at all times, to separate cards in the backplane and separate switches to boot), a separate color for generic gigabit connections, another color for DS View, another color for the out-of-band management network(s), another color for heartbeat cables, and yet another for non-Ethernet (T1/PRI/etc). Other colors are used in some areas to designate 100m connections, special connectivity, security enclave barriers, and non-fiber switch-to-switch connections. Every cable is labeled at both ends and every 6-8 feet in between.
      - FULLY REDUNDANT POWER. It's not enough to have clean power, a good UPS, and a generator. In a large datacenter (more than a few rows, or anything truly mission critical), you should have 2 separate power companies, 2 separate generators, and 2 fully segregated power systems at the datacenter, room, row, and rack levels. In each datacenter we use 2 Liebert mains, each row has a separate distribution unit connected to a different main, and each rack has 4 PDUs (2 to each distro). Every server is connected to 2 separate PDUs, run all the way back to 2 completely independent power grids. For a deployment of 50 servers or so this is big-time overkill. We have over 3500 servers; we need this... We cannot risk a PSU failure taking out racks at a time, each of which may serve dozens of other systems.
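That A/B feed discipline is easy to audit mechanically if a topology inventory is kept: every server's two feeds should trace back to different mains. A minimal sketch; the inventory layout and all names here are hypothetical, not any real facility's schema:

```python
# Hypothetical inventory: PDU name -> (distribution unit, main)
TOPOLOGY = {
    "pdu-a1": ("distro-a", "main-a"),
    "pdu-a2": ("distro-a", "main-a"),
    "pdu-b1": ("distro-b", "main-b"),
}

def is_fully_redundant(feeds):
    """True if a server's feeds trace back to at least 2 distinct mains,
    i.e. no single main (or anything upstream of it) is a shared
    point of failure for that server."""
    mains = {TOPOLOGY[pdu][1] for pdu in feeds}
    return len(mains) >= 2

print(is_fully_redundant(["pdu-a1", "pdu-b1"]))  # True: main-a and main-b
print(is_fully_redundant(["pdu-a1", "pdu-a2"]))  # False: both land on main-a
```

The second case is the one that bites in practice: two power cords plugged in, but both strips hang off the same distro, so the "redundancy" evaporates upstream.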

      • by EdIII (1114411) *

        A pressure plate ensures only one person goes in/out at a time (and if it's tripped, a security guy looking at a screen has to override).

        Uh huh. A big F&U to the guys who came up with that one. Unfortunately I always show up as "more than one person". I suspect that more than a couple people on Slashdot are similarly challenged.

        Oh yeah... the man traps. If there are going to be situations in which somebody might be trapped in there for an extended period of time have an emergency box that can be brok

        • Re: (Score:3, Funny)

          what, it's not like you're going to starve...
        • by Ash Vince (602485)

          On a complete tangent, I have actually been trapped in a datacenter. We did not expect a repair to take as long as it did, and our access expired at midnight. We tried to take a break and discovered we were locked in until someone could come and open the door to renew our access cards. It only took half an hour for an engineer to attend (called out at about 1am), but it was quite amusing when we joked about it after the system was up and running again.

          Certainly not a night I want to repeat in a hurry, but I had only w

          • by Sandbags (964742)

            I had to be locked in one over an entire weekend VOLUNTARILY.

            Major server overhauls were going to take all weekend to do full backups, replace drive cages, rebuild, and then restore.

            The client operated the top 3 floors of a 15-story building, but because of the design, had I not been locked in a secure area (the datacenter), I would have had access to several other floors I was not supposed to be on that were wide open from the elevator (no locked doors). Security locked up the whole place Fridays at 8PM and

      • by Trogre (513942)

        You forgot the pony.

      • - Raised floor is certainly important, and a given. Check

        Not so fast. I've been very happy at the Switch SuperNAP, which is on concrete with all cabling run overhead. And for very good reason. The typical (though changing) datacenter has mixed hot and cold air - typically cold air pumped up from the bottom (?!? kind of fighting nature, there) then allowed to rise into the ceiling. The alternative at Switch is strict hot/cold-aisle isolation. Cold air drops down as per nature on the cold (intake) side and is contained on the hot side, where it rises and is pulled fro

        • by Gorobei (127755)

          The typical (though changing) datacenter has mixed hot and cold air - typically cold air pumped up from the bottom (?!? kind of fighting nature, there) then allowed to rise into the ceiling.

          It's not fighting nature, it's using it right. If you push cold air in from the top, it warms up, and wants to go back up. Push it in from the bottom, and it does its natural thing and wants to exit via your top exhaust.

          • by Sandbags (964742)

            In a dual-row arrangement, cold air will never "warm up" on its row. It has to be pulled through a server from the cold side to the hot side, where it then rises.

            2 of our datacenters are being remodeled to go a step further, and not cool the space between racks, only what's in them, in a similar design (sealed racks). So the additional several thousand square feet of the datacenter need only be maintained at a nice comfortable room temp.

        • by Sandbags (964742)

          2 of our 6 datacenters are built using the hot/cold aisle method. 2 others are migrating to a newer design that still uses hot/cold, but it's all intra-rack. The datacenter itself will be a nice 76 degrees, but inside the racks will be 65 or lower. Cold, moisture-free air is pumped in from the top front of the rack. Servers pull it through to the back, and hot air is pulled up and out. The racks are sealed for both airflow and noise reduction. Each rack chassis is about 6-8" deeper than a normal r

      • I'm definitely not arguing with you here, and I agree with all the supplemental points you added. But, in the interest of clarifying my points:

        Power quality: never seen a big datacenter without a Liebert, or at least a UPS in every rack. Power does not have to be conditioned except between the UPS and the machines/devices. A whole-datacenter power conditioner is often more efficient, but unnecessary for the little guys. Either way: check.

        I would argue that incoming Power Quality from utility is still a key factor worth looking into. While it's true that your UPS(s) are going to correct any power quality problems, this also means that your UPS(s) have to work harder to correct the problem(s), and particularly in the case of low Power Factor [wikipedia.org], you will get less actual power available (as you will
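To put a number on the power-factor point: real (usable) power is apparent power (volts × amps) scaled by the power factor, so a feed with poor PF delivers proportionally less to the load. A single-phase sketch with made-up numbers:

```python
def real_power_kw(volts, amps, power_factor):
    """Real power (kW) available from a single-phase AC feed:
    apparent power (kVA) scaled by the power factor."""
    return volts * amps * power_factor / 1000.0

# Same 208 V / 30 A feed at two different power factors:
print(real_power_kw(208, 30, 0.95))  # ~5.9 kW real
print(real_power_kw(208, 30, 0.80))  # ~5.0 kW real: same circuit, less usable power
```

That roughly 15% gap is power the UPS and upstream plant still have to carry as current, which is why incoming power quality remains worth checking even when the UPS will mask it from the servers.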

    • You missed a few (Score:4, Insightful)

      by syousef (465911) on Monday November 09, 2009 @06:01PM (#30039004) Journal

      You forgot a few:

      - Enough qualified *on site* staff 24x7 to deal with all clients including yourself

      - 24x7 phone support, with people who understand English and have immediate access to the techies

      - Company financial records and history (You don't want someone almost broke or a new startup with no backing)

      - These days availability of virtualisation solution and supporting hardware (depending on your application, if virtualisation is an option)

      Oh and your emphasis on maintenance records may be a little misplaced. They can be faked. They also may not be available due to security concerns (of their other clients). *IF* you can get hold of them they should be complete. Hardware service level should be part of the agreement and service schedule should be part of that.

    • by icebike (68054) on Monday November 09, 2009 @06:06PM (#30039066)

      Presumably the OP is looking for a hosting site, or processing center, rather than looking at purchasing the facility.

      If so, very few of the items mentioned in the parent post are germane, other than Outage/Uptime History. What is under the floor is not your problem in a hosting arrangement.

      You might be interested in location (flood plain, quake zone), but if the place has been in business for more than 10 years, it all boils down to Outage/Uptime History.

      The cost, and ease of migration should the relationship sour and the names of the last big customers to exit the facility would be nice to know.

      • by aaarrrgggh (9205)

        Even if you are going co-lo, you are best off hiring a pro to help you. (Disclaimer: I am a pro.) The checklist approach is a good starting point, but far too often you will miss site-specific issues.

        How old are the UPS batteries? Where are the preventative maintenance logs? What happens if the facility is expanded? Are there any single points of failure you need to understand? Is it 6N/5 or N+1? Are there lease terms that you need to be careful with?

        We have done big deals with large customers, and d

    • Re: (Score:3, Interesting)

      by mcrbids (148650)

      As you indicate, these are hardly simple questions!

      While I would not endorse them today, for years I hosted at GNI, part of 365 Main. Things generally worked well, even if their staff were terse and often unfriendly, so I had no particular complaints until they had a power problem that cost us about 2000 in direct cost and about two business days to finally, fully resolve. The amount of terse double-speak that came out of them left a very bad taste in my mouth, and I left as soon as I could. Stay clear o

      • I liked the Herakles center in Sacramento too. We also normally need SAS 70 Type II facilities. Those address mostly audit and security concerns, but a data center must be in good operational shape to stay current with a Type II. The ones I normally run away from are central offices turned colocation. "Yeah, we are just finishing up our audit." Uh huh.
    • by tempest69 (572798)
      IMHO, check it for the standard disaster scenarios.
      Flood: is this place in New Orleans, below sea level, or in Cheyenne, a mile up with 12 inches of moisture a year?
      Fire, looting, snow (it can collapse roofs if not shoveled, and leave power outages for days), tornadoes, hurricanes, earthquakes, landslides, sinkholes, train derailment, proximity to a place someone would like to destroy (WTC, federal buildings, the killdozer targets). Buildings that can be affected by highway closures. Close proximity to
    • * N+N redundancy in main supply *and* UPS *and* generators *and* cooling
      * security - access to the site, protection of the installed equipment, storing equipment being delivered
      * truly diverse fibre as well as sufficient bandwidth with low/no contention
      * good peering policies to avoid poor latency to your customers' providers
      * good value remote hands - some providers are very rigid about charging for every second you ask them for support, most are helpful with free basic cover
      * hidden charges for telep
    • by kilodelta (843627)
      Also - data centers do move. I know, been through a couple myself. Make sure the movers are bonded, check references for the movers, and insure both the equipment and costs to recover lost data.

      One move was government systems. One of the horror stories I heard when checking with other state agencies was that in one case the movers dropped an entire rack of servers, destroying all of them.

      Our moving company un-racked every server, then wrapped each in padding and blankets and placed it in a stable rolling cart. Only 3
    • by Col. Panic (90528)

      Also: water detection. It's pretty handy to know as soon as possible when your cooling system is leaking under the raised floor.

  • History (Score:3, Insightful)

    by micksam7 (1026240) * on Monday November 09, 2009 @05:30PM (#30038568)

    Look at a datacenter's history [recent and past]: outages, maintenance issues, customer support, management, etc., in conjunction with their listed redundancies and capacities.

    Just because they have two electrical feeds going to each server doesn't mean a random maintenance tech won't flip the wrong switch. :)

    • by marbike (35297) *

      Nor does it mean that when they lose a leg of power that it will cut over nice and neatly. Got bit by that a while back. Their power bump put the air handlers offline. Ten minutes later all of our servers went into thermal shutdown.

    • And of course, don't forget the simple visual inspection. Sometimes the cabling infrastructure [darkroastedblend.com] may not quite be up to spec.
  • attack it (Score:2, Insightful)

    by Anonymous Coward

    set it on fire, throw floods at it, generate tornadoes, then top it off with a nice earthquake.

  • by digitalsushi (137809) <slashdot@digitalsushi.com> on Monday November 09, 2009 @05:37PM (#30038668) Journal

    I ran a data center long, long ago. My sales guy knew it wasn't going to pan out and threw me to the wolves. He asked me to start the tour, and then he took a long lunch to miss it.

    The guys I gave the tour to seemed very intelligent. They only spent about 60 seconds on our data center. The instant they saw the carpet, their eyebrows were up. When I admitted that there was no diesel generator on the other side of the (secretly dead) batteries, they did exactly what they should have and stormed out without saying thanks.

    • by Nefarious Wheel (628136) on Monday November 09, 2009 @06:13PM (#30039148) Journal

      A smattering of basic physics helps.

      Long ago in a distribution centre far, far away - well, east SF bay, anyway - we had a custom mini doing a bit of work for a major retail store chain's logistics business. In the warehouse they built a little room for the mini upstairs, everything cheap but per spec, they insisted. They used one of their domestic air conditioners for the cooling, as it had the right thermal rating to match the heat dissipation we required for our gear. Cool, we said - no problem, cheap is ok as long as it's specced correctly.

      It wasn't long before we had a service call for a hardware failure. Sent the engineer out, and it was about 110 in the computer room. They'd installed the air intake and air outflow of the air conditioner in the same tiny room.

      • by Zerth (26112)

        They'd installed the air intake and air outflow of the air conditioner in the same tiny room.

        Wow... I bet they sometimes leave the fridge open to cool the break room.

      • by upuv (1201447)

        I have a phrase I use.

        "You can never out smart an idiot!"

        You think you have every angle covered. You are proud of how tight the whole design is. Then along comes some moron and BOOOMMMMM. They defeat your planning in 1 second.

        Happens all the time.

        I've had a lot of managers in companies look at me when I say that. First off, the idiots get very offended. The smart ones get it. The phrase acts like a pre-filter before any work is actually done.

        As a result over the years I have learned that design and implem

    • Re: (Score:3, Funny)

      by Anonymous Coward
      I think "data center carpet" should be a new slashdot meme. I can not stop laughing at how ridiculous that "data center" must have looked with that carpet. Please tell me that it was the baby poo green shag carpet from the 70's. That would really make it feature complete.
  • by Astrobirdr (560760) on Monday November 09, 2009 @05:37PM (#30038670)
    I'd also ask:

    Number of years in business.
    Involvement of the owner in the current business.
    Number of years the current owner has been in this business.
    Also do a check with the Better Business Bureau to see what, if any, complaints had been filed.

    And, as always, Google is your friend -- definitely do a search for the business you are considering along with the word(s) problem, issue, complaint, praise, etc!
    • by djweis (4792)

      I'm not sure BBB involvement is relevant for a data center. It's not an auto repair shop with thousands of customers per year. Searching the online court records would be a more appropriate resource.

  • Pull floor tiles and compare the amount of obsolete technology-- Thicknet cables, VAX cluster interconnects, water chiller hookups, FDDI cables, etc. with the amount of space remaining.

    Anything less than 4 inches of obsolete crud isn't worth excavating. Leave it a few more years.

    --Joe

  • Word of mouth (Score:5, Insightful)

    by tomhudson (43916) <barbara.hudsonNO@SPAMbarbara-hudson.com> on Monday November 09, 2009 @05:45PM (#30038776) Journal

    Find someone you trust who's already a customer. Word of mouth beats any number of white papers or studies or guarantees.

  • by swordgeek (112599) on Monday November 09, 2009 @05:46PM (#30038780) Journal

    I'm assuming this is evaluating for co-location purposes. Here are some things I'd ask.

    1) How quickly can I get a new server deployed into it? How do I do it?
    2) Can I get a tour? Now? (Note that this not only lets you see the data centre, but also will give you an idea of security. Look for procedures on getting in, notice if they ask you to sign a release form, etc.)
    3) How close to capacity are you? (The answer should include space, floor weight, power, cooling, and network. If it doesn't, why not?)
    4) What are your racking/networking/cabling standards? (They should have some, at least where you connect to them, but they shouldn't be onerous).
    5) How many people manage the data centre? You don't want to be one car accident away from loss of access or service.
    6) How about power management? Is the centre on a UPS, redundant UPSes, or nothing? Can you get charts of the power going to the servers? Can you get DC for telecom servers, or only AC? Is it on a generator for long-term outages? (Note that you may not need this--in which case you shouldn't pay for it. Alternatively, if you need it, make sure it's there!)
    7) Is it manned 24/7? (Ditto!)

    If you can, ask them to pull a tile so you can see under the raised floor. Underfloor cabling (and suspended ceiling cabling for that matter) should be neat, tied, and labelled. Dead cables should be pulled, not left to rot. There has to be sufficient clearance for unrestricted airflow. Cages are better than lying on the floor.

    Most of what makes a good data centre comes down to organization. If it's a rats nest, then even if there's one guy who knows "everything," it will be less reliable, less consistent, and less predictable. Procedures should be written down, printed, filed in labeled binders, and regularly updated. (Note: Online copies should be canonical, but they also need to be accessible offline when shit --> fan.)

    Fire suppressant mechanisms (wet vs. dry, live pipes, etc.) need to be considered, as does emergency lighting. If the operators need to start digging around for a flashlight to read what they should be doing, then things aren't happening the way they should.

    Be picky. If they're leasing space to you, then their data centre design and maintenance is their BUSINESS, and they had better get it right! Look for a neat, well-organized, well-documented, well-planned data centre. Also make sure that it fits your needs.

    • by Red Flayer (890720) on Monday November 09, 2009 @06:09PM (#30039094) Journal

      If you can, ask them to pull a tile so you can see under the raised floor. Underfloor cabling (and suspended ceiling cabling for that matter) should be neat, tied, and labelled. Dead cables should be pulled, not left to rot. There has to be sufficient clearance for unrestricted airflow. Cages are better than lying on the floor.

      Just want to add... Don't let them pick the tile. They probably get this request frequently enough that they have a "show" tile or two if they are a shoddy organization. Pick one on your tour, as an offhand request that you had "forgotten" until then. If they try to steer you to a specific tile, that tells you they have something to hide, and you need to question everything else they've shown you samples of.

      [paranoid and loving it]

      • by vlm (69642) on Monday November 09, 2009 @06:46PM (#30039564)

        Just want to add... Don't let them pick the tile. They probably get this request frequently enough that they have a "show" tile or two if they are a shoddy organization.

        If you pull this stunt, please understand that a tech's hidden stockpile of magazines and canned soda does not necessarily indicate a shoddy organization; it merely means they have employees who like reading certain magazines for the interviews, and prefer to store their drinks in a nice clean spot underneath the chiller rather than the proverbially filthy employee refrigerator. On the plus side, this is a strong indication they don't have an under-the-floor rodent infestation.

        Strangest thing I ever found under the floor was a vast amount of one employee's (clean) clothing. He was kind of stuck in the process of moving and needed a temporary place to stash stuff. Apparently no one found it unusual that he was hauling bags of clothing in and out.

  • by chris.knowles (1109139) on Monday November 09, 2009 @05:46PM (#30038788)
    There are basically 3 perspectives from which to evaluate the datacenter, and they're pretty well universal to any IT eval: People, Process, and Technology. The datacenter facility itself is only one piece of the puzzle (Facility = Technology, which only accounts for a fraction of the total cost of operating a datacenter). There are also the people running the datacenter and how they are organized and interact with the technology, one another, and their customers (internal and external).

    From a people/process standpoint, if you want to give them a general "score," you can assess them against the SLM maturity scale. (Read about the Gartner Maturity Model for Infrastructure and Operations.)

    Evaluating a datacenter is going to be a balance between the cost of operating the datacenter and the level of service you require from it. There really isn't enough information in the question to give you a good answer. Are you evaluating the acquisition of a datacenter to grow into? Are you looking for a managed-services DC to host your gear with operational support? Are you looking for rack space with pipe and power? If you give more details in your inquiry, I'm sure the community can provide you with some great answers.
  • by Jailbrekr (73837) <jailbrekr@digitaladdiction.net> on Monday November 09, 2009 @05:52PM (#30038868) Homepage

    Regardless of how well they are decked out, always start with a "pilot project". Start small for a short period to evaluate real world performance of both their equipment and their tech support. We currently have a pilot project in place to evaluate a datacentre for outsourcing our compute requirements. We have learned that while they have exceptionally good equipment in place, their responsiveness and ability to provision is highly questionable.

    • by rijrunner (263757)

      Honestly...

      Have you *ever* seen a pilot project where they had not already made the decision to deploy regardless of the outcome?

      It always plays out the same way..

      Client admin: we had problem A, Problem B, Problem C
      Sales dude: Hey.. we worked out the bugs...
      Client CIO: Cool. Let's move the rest in now that all the problems are fixed.

      Honestly.. pilot programs are supposed to have problems. You can't get a good feel for the situation because almost invariably you are working with project support and not stea

      • If you have the resources, start pilot projects in a few data centres and then go ahead on one or two after you've evaluated them as a customer for a bit. This works best if you don't tell them that's the plan. See how they treat you as a small customer; they'll treat you at least that well as a large one.
  • What do you need? (Score:2, Insightful)

    by Tdawgless (1000974)
    What does your company _NEED_? How much bandwidth do you need? What kind of servers do you need? Are you looking for Co-Lo or Dedicated? If you're doing Co-Lo, how much power and space do you need? If you're doing dedicated, do you need managed or unmanaged? PCI compliance? HIPAA compliance? Do you want to pay for certain redundancies? Do you need an Uptime Institute Tier certified facility? I could go on and on. The one thing that you need consistently is good customer service. The rest depends on what
  • Vending machines. (Score:3, Insightful)

    by Kenja (541830) on Monday November 09, 2009 @05:55PM (#30038918)
    Since the odds are I'm going to be spending the night there at some point, good vending machines or a cafeteria are a must.
  • Power from the ceiling, data under the floor.

    The reason is that data centre floods don't occur very often, but when they do, the DC can tolerate data cables being in water; when power comes in contact with water, circuit breakers trip and don't work again until they are dry.

    I encountered it when the AC water feed burst and co-incidentally the drain for the data centre had been blocked. If your power and data are through the floor then I would suggest that you invest in a good wet and dry vacuum c

    • It was a data room, not a data center, and it was 25 years ago...

      On the third floor was a data room with the computer and phone switch, next to the main A/C flume running from the basement to the 20th floor (a walk-in closet without floor or ceiling). The external auditors came through and wrote up the location because there was no raised floor in case of flooding.

      Site's response: HA HA HA - Flood - 3rd floor?

      The next week, the water main at the 5th floor broke, came down the A/C flume, and flooded the third floor with 2 feet of water!

      Oops

      • by MrKaos (858439)

        Next week the water main at the 5th floor broke and came down the A/C flume and flooded the third floor in 2 feet of water!

        Exactly, mine was on the fifth floor!!!

  • by petes_PoV (912422) on Monday November 09, 2009 @05:59PM (#30038974)
    Such as street access. Is there more than one way in? If the access road were closed off (police incident, subsidence, civil unrest, depending on where it's sited), what would happen? Could staff get to work, or leave for home?
    Ease of recruiting/retaining sufficiently qualified staff in the locale, or persuading your existing staff to commute or relocate.
    Is the on-site restaurant/canteen or the local eateries likely to give everyone food poisoning? (This could be a single point of failure.)
    Local crime rate: the number of times the facility has been broken into; even the amount of graffiti on the walls could be a negative indicator.
  • an outside air duct (Score:4, Informative)

    by spywhere (824072) on Monday November 09, 2009 @06:03PM (#30039038)
    When I worked at a corporate office in Maryland, they used the building's air conditioning to cool the server room.
    This worked well until the outside temperature got down to about 15 degrees Fahrenheit, but then it failed miserably: the outdoor condensers no longer functioned, the AC shut down, and the entire IT department went into a panic.
    The first time this happened, I (a lowly Help Desk tech) suggested to the CIO that he run a duct into the room from the outside: a simple fan would bring in enough sub-freezing air to cool the servers.
    The second time it happened, the look on his face told me he hadn't taken my suggestion seriously enough.
    The third time, he flipped a switch and the fan cooled his server room just fine.
    • by SuperQ (431) *

      Yup, that was done at the University of Minnesota CS department a few years ago. They had to take A/C offline in the middle of the winter for a few days and the datacenter room was down 3 stories below ground. We ended up propping open 2 emergency exit stairwells on 2 ends of the building, covering the two hallways with plastic tarps to prevent air flow. Then the doors at either end of the datacenter were propped open and a gigantic 6' fan was turned on. It pulled a ton of cold air down one stairwell and pus

  • Personnel (Score:3, Funny)

    by girlintraining (1395911) on Monday November 09, 2009 @06:06PM (#30039058)

    More important than the technology is the policies and training of the personnel running the operation. It will fail, eventually: it always does, no matter how well it's designed or what promises of infinite uptime were made. So walk into the data center and count the number of people wearing hiking boots, divide by the number of racks, and there you go. The most grizzled-looking guy wearing hiking boots usually knows everything. He also usually has a lighter and a screwdriver if you ask.

    I don't know why this is...

  • by HockeyPuck (141947) on Monday November 09, 2009 @06:11PM (#30039120)

    I used to have a large cage in an Exodus colocation facility. Turns out that if we wanted to put in an EMC Symm5 (these are three tiles wide), we would have to rent a fork lift and put it through an open rollup door on the second floor. Their "freight elevator" was barely big enough for two people and a dolly.

    One of my other cages was housed in a Global Crossing facility; when they started to run out of cooling, they would hook up huge external A/C units in the parking lot and run 2ft diameter ducting to a hole in the wall. If you happened to walk near one of these openings you'd be greeted by freezing 50mph winds.

    Anybody find it odd that Exodus bought Global Crossing, who then went out of business?

    • I used to have a large cage in an Exodus colocation facility. Turns out that if we wanted to put in an EMC Symm5 (these are three tiles wide), we would have to rent a fork lift and put it through an open rollup door on the second floor. Their "freight elevator" was barely big enough for two people and a dolly.

      I bet I know what facility you're talking about.

      I put the first Sun E10Ks in one of those, when they were pretty new. It took a while to communicate "No, we're serious, it's that many inches wide, that many inches deep, and weighs 1,600 pounds. Show me your engineering drawings on the raised floor, and I need to measure the elevator again."

  • by NoNsense (6950) on Monday November 09, 2009 @06:12PM (#30039136)

    I am the Director of Operations for our DC. When we give tours, I explain the following (pseudo order of the tour):

    - Begin with the history of the building: when it was built (1995), why it was built (a result of Hurricane Andrew in 1992), and how it is constructed (twin T, poured tilt wall).

    Infrastructure:
    - Take you through the gen room, show you it is internal to the building, show you the roofing structure from the inside, explain the N+1 redundancy, the hours on the gens, when they are ready for maintenance, how they are maintained, by whom (the vendor), how the diesel is stored and supplied, and the duration of fuel at max and current loads. Explain conduct before a hurricane or lockdown, how we go off-grid 24 hours ahead of a storm, and mention our various contracts for after-storm refill and our straining/refill schedule.
    - Take you to the switch gear room, explain the dual feeds from the power company, how the switch gear works, show you the three main bus breakers, show you the numerous other breakers for various sub panels, etc. Explain and show you the spare breakers we have in case replacement is needed.
    - Take you to the cooling tower area, explain the piping, the amount of water flowing, the number of pumps, how many are needed, the switching schedule, explain the N+1 capacity and overall capability of the towers, explain maintenance, show you the replacement pumps in stock, explain the concept of condensed water cooling if needed.
    - Take you through the UPS and battery rooms, explain the needed KW capacity, what the UPSs back up and what they do not. Show the various distribution breakers out to floor, their capacity, the static switches, bypass, explain the battery capacity, type of cells, number of cells, number of strings, last time the jars were replaced and how they are maintained. Explain max capacity of the load vs time. Answer questions relevant to switching from utility->UPS->generator and back.

    Raised floor:
    - Take walk on raised floor, explain connectivity, vendors, path diversity we have, how the circuits are protected. Show them network gear, dual everything, how we protect from a LAN or WAN outage, and specific network devices we have for DDoS, Load Balancing, Distribution, Aggregation. Explain how telco and others deliver DS0 to OC-12 capacity, offer information on cross connections regarding copper, fiber, coax. Explain our offerings (dedicated servers up to 5K sq ft cages) and ask what they are interested in.
    - Explain below the floor: the height of the raise, that power and network are delivered under it, what is on the level-one trays, the level-two trays, and the piping for cooling. Show the PDU units and how they relate to the breakers in the previous rooms. Show them the cooling panel and leads out to CRAC units, explain the cooling capacity, plans for future cooling, explain hot/cold aisle fundamentals, and temperature goals. At this point, there are usually more questions about vented tiles, power types available and overall floor density in watts/sq ft.
    - Explain the fire detection / mitigation system, monitoring of PDU's, CRAC units, and FM200. Explain the maintenance of the fire system, show them the fire marshal inspection logs and the panels that alert the police and fire departments (both on floor and in our security office in front).
    - While finishing the walk on the floor, show cameras, explain process to bring in and remove equipment, tell them the retention on the video, explain the rounds the guards make, the access list updates and changes.

    NOC:
    - At this point we're back to the front of the building, go into the NOC, explain what we are monitoring (connectivity, weather, scheduled jobs, etc). Introduce NOC and security staff, explain they will always get a person if they call, submit a test ticket from an e-mail on my phone, they will see the alerts light up and the pager for the NOC will signal. The final steps are to introduce them to security and then I'll lead the customer(s) to the conference room so they can continue the conversation
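    The fuel-duration and UPS-bridge answers in a tour like this reduce to arithmetic you can check on the spot. A rough sketch; every figure below is invented for illustration, so get the real numbers from the facility:

```python
# Back-of-the-envelope generator fuel duration and UPS bridge margin.
# All figures are invented assumptions, not any real facility's numbers.
tank_gallons = 10_000
burn_rate_full_load_gph = 70.0    # gallons/hour at the generator's rated load
current_load_fraction = 0.55      # diesel burn scales very roughly with load

burn_now_gph = burn_rate_full_load_gph * current_load_fraction
print(f"Fuel duration at current load: {tank_gallons / burn_now_gph:.0f} hours")

# The UPS only has to bridge the utility->generator transfer, plus margin.
ups_runtime_min = 12.0
generator_start_sec = 30.0
margin = (ups_runtime_min * 60) / generator_start_sec
print(f"UPS covers {margin:.0f}x the generator start time")
```

    If the quoted fuel duration and the math from the tank size and burn rate don't line up, that's worth a follow-up question.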

    • by kilodelta (843627)
      Nice job! I had the responsibility for redundancy planning too. Went with an APC Symmetra in the room, and a 125kW natural-gas-fired generator on the back end with auto-transfer, regular generator exercising, and regular maintenance and monitoring as part of the whole APC package. Did a 10-year contract on that one. They noticed a bad battery in the Symmetra before we did. Also had a 480VAC dedicated service to the NOC.

      Dual redundant AC systems. One fails the other one runs.

      Raised floor (8") - this
  • I had a machine colocated in a very nice, secured facility right in the middle of a major city where all the telco wiring runs. It was awful for these reasons:
    - they advertised 24/7 access to your equipment on the web site, then the smarmy salesperson explained how that's actually not going to happen. That should have been it right there, but I was dumb.
    - later, they had a brief power outage due to a contractor f-ing up one day, and I was never notified. This in turn disabled my traffic shaping configs, which

  • - data redundancy, offsite specifically
    - ability to cut over? ie what happens if there's an earthquake, are your services to the world down until everything is replaced and backups are restored?
    - what do you have on hand for hot spares in the event of equipment failure?
    - when you are in failover mode for whatever reason, how does it impact your performance? ie does webmail just crawl until the mirror finishes rebuilding?
    - how are your external resources? got a plan to truck in gas for the genny if a torn

  • Many good comments, but nobody is asking what PUE a datacenter gets. A bad PUE turns into lower deliverable rack power and more expensive power when you do get it. I would have a hard time picking a datacenter that didn't have tight closed-loop hot-aisle cooling.
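    For reference, PUE is just total facility power divided by power delivered to the IT equipment. A quick sketch with made-up numbers showing why it matters to your bill:

```python
# PUE = total facility power / IT equipment power (closer to 1.0 is better).
# The numbers below are illustrative assumptions, not any real facility's.
total_facility_kw = 1200.0   # utility feed: IT load + cooling + losses + lighting
it_equipment_kw = 750.0      # power actually delivered to the racks

pue = total_facility_kw / it_equipment_kw
print(f"PUE: {pue:.2f}")  # PUE: 1.60

# A bad PUE inflates the effective price of every kWh at the rack:
utility_rate = 0.10          # $/kWh, assumed
print(f"Effective rate at the rack: ${utility_rate * pue:.3f}/kWh")
```

    If the provider passes power costs through, every point of PUE overhead is money you pay for their cooling design.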

  • The quality of a datacenter has less to do with the equipment (although that's important), and more to do with who designed and is running the equipment.

    Most of the datacenter outages I have been a part of in one way or another (Customer, or Provider) have been caused by:

    Poor planning
    Human Error
    Poor design

    As a normal customer, there is no way to know if any of these problems exist. The solution? Ask for references that utilize that datacenter. Make sure they don't give you a customer that utilizes another da

  • Nuke it from orbit and then see how soon their backup site with your backup data has you back online.

  • Unless your servers absolutely must be local, one of the most important factors should be local climate and environmental risk. I've worked in a couple of datacenters in Michigan and it's really ideal:

    * No state-wide forest fires
    * No flooding if you're above the flood plain
    * No hurricanes
    * Very few tornadoes

    On top of that, if the AC units should spontaneously fail all at once, 99% of the time you can just open up all the doors and run a couple of large fans to keep things cool enough to run.
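    The "open the doors and run fans" fallback can be sanity-checked with the standard sensible-heat rule of thumb, CFM = (watts x 3.412) / (1.08 x delta-T in F). A sketch with assumed numbers:

```python
# How much outside air does it take to carry away the room's heat?
# Rule of thumb: CFM = (watts * 3.412) / (1.08 * delta_T_F).
# All inputs below are assumptions for illustration.
it_load_watts = 50_000       # total heat the room dissipates
outside_temp_f = 40.0        # cool Michigan air
max_inlet_f = 80.0           # hottest acceptable server inlet temperature

delta_t = max_inlet_f - outside_temp_f
cfm_required = (it_load_watts * 3.412) / (1.08 * delta_t)
print(f"Airflow needed: {cfm_required:,.0f} CFM")  # Airflow needed: 3,949 CFM

# A large drum fan moves very roughly 10,000-20,000 CFM (vendor specs vary),
# so a couple of big fans really can cover a modest room -- until summer,
# when delta_t collapses and the required airflow blows up.
```
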

  • Depending on your uptime needs and size, also consider whether you want your Internet access included as part of your data center or whether you want a carrier-neutral facility. Many places just lump data access in with the colocation space and you get an IP.

    Other places, sometimes called a "hotel operator" simply rent you space and power, after which you can connect to one of usually a couple hundred ISPs that are cross-connecting in their meet-me room.

    Also, don't know if you are starting or moving. If y

  • Seriously. Use AWS for your custom apps. Outsource your email and other G&A aspects of running your company. Data centers are for dinosaurs. All of the cool kids are in the cloud.
  • Besides the obvious backup power, cooling, environmental stuff mentioned above....

    1. Expandability: how much rack/cage space is available nearby? Get a right of refusal on any empty space near your stuff if you can.
    2. Power max per rack: can you get enough for SANs, blades, etc.?
    3. Remote hands availability/skills/costs (do you really want to make a trip to the datacenter to replace dead hard drives? No. Do the employees know enough to do *limited* work for you?)
    4. 24/7 access, and proximity to employees if a physical presence is needed

  • by lanner (107308) on Tuesday November 10, 2009 @12:54AM (#30042396)

    I'd guess 90% of projects fail at step #1: Define your needs. What's the objective here? Why are we doing this, and what are the benchmarks required for success? Does this sound familiar?

    First, define your needs, then evaluate possible solutions that might meet your needs.

    If you don't know what you need, you don't know what the hell you are doing. Hire someone who does, like a consultant.

  • If you have commercial information that you absolutely cannot allow to fall into the wrong hands (or be accidentally deleted, corrupted, not backed up [emc.com], whatever), is storing that data in a data center ever really acceptable? I would think not, but I'd like to hear someone else's opinion. Has anyone here done things DIY for this very reason?
  • I wrote an extensive article on choosing a datacenter/colocation facility several months back. The full post can be found on my blog, but I will paste it below for your Slashdot reading convenience:

    http://www.bitplumber.net/2009/04/how-to-choose-a-colocation-facility/

    How to choose a colocation facility

    Choosing a colocation facility is one of the most important decisions an IT professional can make. It will have repercussions for years down the road, as there is generally a contract term associated, and it

  • Working Conditions (Score:3, Informative)

    by Deal-a-Neil (166508) on Tuesday November 10, 2009 @03:33AM (#30043078) Homepage Journal

    Is there a good desk working area? Is there a landline/PBX for you to make calls from? Is there decent mobile phone reception in the work area and by your cabinet? Can you eat food or bring drinks into the work area or around your cabinet? Is it in a shady neighborhood, where you might feel a little intimidated bringing in tens of thousands of dollars of emergency IT equipment @ 3 AM? In the event that your credentials aren't working (i.e. hand scanner, ID card swipe), can they let you in remotely, or is it manned 24/7? Is it carrier neutral and are there other backbone providers that you can connect with? Do they charge for running cables between cabinets, especially in cases where the cabinets are not adjacent? What is the max amperage that they'll provide per cabinet? Do the rack cabinet doors remove easily? Are there chairs available, and damn it, are they comfortable?

  • Your question is a little ambiguous. Are you looking to buy a data center of your own or are you renting rackspace?

    If you are buying the Data Center
    1.) Normal title, lien, and structural due diligence, as for any RE purchase
    2.) Is it on a flood plain
    3.) Seismically active site?
    4.) Serviced by multiple communication providers from multiple CO's
    5.) Power available from two different substations.
    6.) Physical security / susceptibility to civil unrest
    7.) Physical access driveways, parking, loading docks, hal

  • White Mountain datacenter in downtown Stockholm, Sweden. It is located in a bunker 30 meters under solid bedrock. It was a cold war bunker that was converted into this datacenter and is said to be able to withstand a Hydrogen bomb blast. http://www.youtube.com/watch?v=qwlATf9xse4 [youtube.com]
  • I am sorry, but all these people going on and on and on about what you want in a data center are missing the unforeseeable. And the only way to do that is redundancy. What you want is two or more different data centers, in two or more distinct regions (ideally of the World).

    Ask yourself: what would you want running if the entire city or region was nuked, had an earthquake, was hit by a tsunami, or an asteroid dropped on it? How about if an airplane flies into it?
    What happens if the regional power grid or n
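    The argument for two independent regions is easy to quantify. Assuming failures are independent (which a truly regional disaster can violate), availabilities combine multiplicatively; a sketch with assumed SLA numbers:

```python
# Why two mediocre sites beat one "perfect" one, if failures are independent.
# The availability figures are assumptions -- plug in your providers' SLAs.
site_a = 0.999   # "three nines": ~8.8 hours of downtime per year
site_b = 0.999

# A total outage requires both sites to be down at the same time.
combined = 1 - (1 - site_a) * (1 - site_b)

hours_per_year = 24 * 365
print(f"One site:  {(1 - site_a) * hours_per_year:.2f} h/yr down")
print(f"Two sites: {(1 - combined) * hours_per_year:.4f} h/yr down")
```

    The independence assumption is exactly why the two sites should be in distinct regions in the first place.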

  • This summer Equinix Paris had a major failure of their cooling system. Of course, they had a backup, but as might be expected, the backup was identical to the primary system, and therefore failed identically. Temperature rose above 55C AFAICT. We didn't experience hardware failure since all our servers shut down automagically at 45C. We also had all our systems clustered over a gigabit MAN to another DC, so we suffered only a minor outage.

    Shit happens. You always have to keep that in mind. But two things could
