America Online IT

AOL Creates Fully Automated Data Center 123

miller60 writes with an excerpt from a Data Center Knowledge article: "AOL has begun operations at a new data center that will be completely unmanned, with all monitoring and management being handled remotely. The new 'lights out' facility is part of a broader updating of AOL infrastructure that leverages virtualization and modular design to quickly deploy and manage server capacity. 'These changes have not been easy,' AOL's Mike Manos writes in a blog post about the new facility. 'It's always culturally tough being open to fundamentally changing business as usual.'" Mike Manos's weblog post provides a look into AOL's internal infrastructure. It's easy to forget that AOL had to tackle scaling to tens of thousands of servers over a decade before the term Cloud was even coined.
This discussion has been archived. No new comments can be posted.

  • Who? (Score:4, Insightful)

    by Jailbrekr ( 73837 ) <jailbrekr@digitaladdiction.net> on Tuesday October 11, 2011 @06:27PM (#37684624) Homepage

    Seriously though, most telecom operations operate like this. Their switching centers are all fully automated and unmanned, and usually in the basement of some nondescript building. This is nothing new.

  • Two points. (Score:4, Insightful)

    by rickb928 ( 945187 ) on Tuesday October 11, 2011 @06:58PM (#37684962) Homepage Journal

    One - If there is redundancy and virtualization, AOL can certainly keep services running while a tech goes in, maybe once a week, and swaps out the failed blades that have already been remotely disabled and had their usual services relocated. This is not a problem. Our outfit here has a lights-out facility that sees a tech maybe every few weeks; other than that, a janitor keeps the dust bunnies at bay and makes sure the locks work daily. And yes, they've asked him to flip power switches and tell them what color the lights were. He's gotten used to this. That center doesn't have state-of-the-art stuff in it, either.

    Two - Didn't AOL run on a mainframe (or more than one) in the 90s? It predated anything useful, even the Web, I think. Netscape launched in 1994, Berners-Lee was building a NeXT browser in 1990, and AOL's desktop client already existed in 1991. Mosaic and Lynx were out by 1993. AOL sure didn't need any PC infrastructure; it predated even Trumpet Winsock, I think, and Linux. I don't think I could have surfed the Web in 1991 with a Windows machine, but I could use AOL.

  • by aix tom ( 902140 ) on Tuesday October 11, 2011 @07:00PM (#37684984)

    The software still needs to be written. The programs still need to be run somewhere.

    Technically not much has changed. The "Cloud" is still made up of servers that have to be administered. The main effect is that the IT and network admins will have to keep up with technology, especially the new virtualization layers between the hardware and the running application. But keeping up to date has always been a part of working in IT.

  • by Zocalo ( 252965 ) on Tuesday October 11, 2011 @07:29PM (#37685228) Homepage
    Who cares? I'm guessing you don't have much experience with server clusters, but generally, long before you get to the kind of scale we are talking about here, you start treating servers the same way you might treat HDDs in a RAID array. When one fails, other servers in the cluster pick up the slack until you can either repair the broken unit or simply remote-install the appropriate image onto a standby server and bring that up until an engineer physically goes to site. Handling of the data is somewhat critical, though; should a server die, you ideally need to be able to resume what it was working on seamlessly and without causing any data corruption; think transaction-based DB queries with timeout/retry.
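
    A minimal sketch of that timeout/retry-with-failover idea, assuming a PostgreSQL-style setup driven through psycopg2; the host names, database name, and retry policy are purely illustrative, not anything from the comment above:

        # Illustrative only: retry a transactional query, failing over to a standby
        # host when the primary is unreachable. Host names, database name, and the
        # retry policy are assumptions for the sketch.
        import time

        import psycopg2

        HOSTS = ["db-primary.example.net", "db-standby.example.net"]  # hypothetical

        def run_transaction(sql, params, retries_per_host=3, backoff_s=2.0):
            last_error = None
            for host in HOSTS:                              # fail over host by host
                for attempt in range(retries_per_host):
                    try:
                        conn = psycopg2.connect(host=host, dbname="app",
                                                connect_timeout=5)
                        try:
                            with conn:                      # commits on success, rolls back on error
                                with conn.cursor() as cur:
                                    cur.execute(sql, params)
                                    return cur.rowcount
                        finally:
                            conn.close()
                    except psycopg2.Error as exc:           # host down, connection dropped, ...
                        last_error = exc
                        time.sleep(backoff_s * (attempt + 1))   # simple linear backoff
            raise RuntimeError("all database hosts failed") from last_error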

    If you have enough spare servers, you can easily get by with engineers only needing to go on site once a month or so, assuming you get your MTBF calculations right. There's a good white paper [google.com] by Google on how a 200,000-hour MTBF per hard drive works out to a drive failure every few hours once you have a few hundred thousand drives.
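
    The back-of-envelope math behind that claim, with the fleet size assumed at 100,000 drives for illustration:

        # Expected fleet-wide failure rate from a per-drive MTBF.
        mtbf_hours = 200_000      # per-drive mean time between failures
        fleet_size = 100_000      # drives in the fleet (assumed figure)

        failures_per_hour = fleet_size / mtbf_hours          # 0.5 failures per hour
        hours_between_failures = mtbf_hours / fleet_size     # one failure every ~2 hours

        print(f"~{failures_per_hour:.1f} failures/hour, "
              f"one roughly every {hours_between_failures:.1f} hours")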
  • by EdIII ( 1114411 ) on Tuesday October 11, 2011 @08:34PM (#37685646)

    The whole idea is that you shouldn't need to get to the hardware quickly at all.

    If you:

    1) Are completely virtualized.
    2) Use power circuits that are monitored for load, with battery backup, power conditioners, and diesel generators as backup for the local utility.
    3) Use management devices to control all your bare metal as if you were standing there, complete with USB-connected storage per device so you can swap out the ISO remotely.
    4) Have redundancy in your virtualization setup that gives you high availability, live migration, automated backups, etc.

    What you get is an infrastructure that lets you route around failures and schedule hardware swap-outs on your own timetable, which can be far more economical.
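
    As a rough illustration of the "swap on your own timetable" part, here is a hypothetical sketch that drains a hypervisor by live-migrating its guests to a spare host before the blade gets pulled. The host URIs and the single fixed target are assumptions; a real setup would pick targets based on capacity and placement policy:

        # Hypothetical drain script: live-migrate every running guest off a blade
        # scheduled for replacement, so the swap can happen during business hours
        # with no downtime. Host URIs below are placeholders.
        import libvirt

        SOURCE_URI = "qemu+ssh://failing-blade.example.net/system"   # assumed name
        TARGET_URI = "qemu+ssh://spare-blade.example.net/system"     # assumed name

        def drain_host(source_uri=SOURCE_URI, target_uri=TARGET_URI):
            src = libvirt.open(source_uri)
            dst = libvirt.open(target_uri)
            try:
                for dom in src.listAllDomains(libvirt.VIR_CONNECT_LIST_DOMAINS_ACTIVE):
                    print(f"live-migrating {dom.name()} ...")
                    # VIR_MIGRATE_LIVE keeps the guest running while memory is copied,
                    # so services never notice the underlying blade being retired.
                    dom.migrate(dst, libvirt.VIR_MIGRATE_LIVE, None, None, 0)
            finally:
                src.close()
                dst.close()

        if __name__ == "__main__":
            drain_host()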

    If you don't have that, then it does involve costly emergency response at 2am to replace a bare-metal server that went down. Either you pay somebody you have on retainer locally to do it, or you are the one driving down to the datacenter at 2am to do the replacement yourself, with no idea how long it will take, uptime monitoring solutions sending out emails like crazy to the rest of the admin staff, and, heaven help you, some execs who demanded to be in the loop from now on due to an "incident".

    Don't know about you... but I would rather be able to relax at 10pm and have a few beers once in a while (to the point where I can't drive) without worrying about bare-metal servers going down all the time, or who is on call, etc.
