Facebook IT

Making Facebook Self Healing 74

New submitter djeps writes "I used to achieve some degree of automated problem resolution with Nagios Event Handler scripts and RabbitMQ, but Facebook has done it on a far larger scale than my old days of sysadmin. Quoting: 'When your infrastructure is the size of Facebook's, there are always broken servers and pieces of software that have gone down or are generally misbehaving. In most cases, our systems are engineered such that these issues cause little or no impact to people using the site. But sometimes small outages can become bigger outages, causing errors or poor performance on the site. If a piece of broken software or hardware does impact the site, then it's important that we fix it or replace it as quickly as possible. ... We had to find an automated way to handle these sorts of issues so that the human engineers could focus on solving and preventing the larger, more complex outages. So, I started writing scripts when I had time to automate the fixes for various types of broken servers and pieces of software.'"
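The summary describes scripted fixes for known failure types, with humans kept for the harder outages. A minimal sketch of that general pattern follows. It is not Facebook's actual system or the submitter's Nagios setup; the alert kinds, hostnames, and function names (`restart_service`, `reimage_host`, `handle_alert`) are all hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict

# A remediation takes a hostname and returns True if the fix worked.
Remediation = Callable[[str], bool]


def restart_service(host: str) -> bool:
    # Placeholder: a real script would restart the service on `host`.
    return True


def reimage_host(host: str) -> bool:
    # Placeholder that pretends the automated fix did not work.
    return False


# Hypothetical mapping from alert kind to its scripted fix.
REMEDIATIONS: Dict[str, Remediation] = {
    "service_down": restart_service,
    "disk_failure": reimage_host,
}


def handle_alert(kind: str, host: str,
                 remediations: Dict[str, Remediation] = REMEDIATIONS) -> str:
    """Try the scripted fix for a known alert kind; otherwise hand off to a human."""
    fix = remediations.get(kind)
    if fix is None:
        return "escalate"  # no script exists for this kind of problem yet
    try:
        ok = fix(host)
    except Exception:
        ok = False  # a crashing fix counts as a failed fix
    return "fixed" if ok else "escalate"


if __name__ == "__main__":
    for kind, host in [("service_down", "web01"), ("kernel_panic", "db07")]:
        print(kind, host, handle_alert(kind, host))
```

Anything the table does not cover, or any fix that fails, falls through to "escalate", which is where a person would be paged.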

  • by Maow ( 620678 ) on Sunday September 18, 2011 @12:43AM (#37432272) Journal

    Facebook is an amazing place to work for many reasons but I think my favorite part of the job is that engineers like me are encouraged to come up with our own ideas and implement them. Management here is very technical and there is very little bureaucracy, so when someone builds something that works, it gets adopted quickly. Even though Facebook is one of the biggest websites in the world it still feels like a start-up work environment because there's so much room for individual employees to have a huge impact.

    Like building infrastructure? Facebook is hiring infrastructure engineers. Apply here.

    Damn, if I weren't so averse to soul-crushing rejection, I'd apply.

    This guy was insightful and informative, so I believe what is quoted above.

    And I'm surprised: I figured Facebook would be either more bureaucratic (like MS) or kinda dickishly autocratic (like Zuckerberg is rumoured to be).
