The Setup Behind Microsoft.com 412
Toreo asesino writes "Jeff Alexander gives an insight into how Microsoft runs its main sites. Interesting details include having no firewall, having to manage 650 GB of IIS logs every day, and the use of their as-yet-unreleased Windows Server 2008 in a production environment."
Re:Beta in production environment. (Score:5, Informative)
Re:Firewall Schmirewall (Score:5, Informative)
Re:Firewall Schmirewall (Score:5, Informative)
"...At this point we still don't use firewalls for MS.COM..."
and then
"Router ACLs are in place to block unnecessary ports"
Blocking unnecessary ports is a firewall feature (IMHO?).
Anyway, it looks quite impressive. I still don't understand how they handle 650 GB of logs a day.
Re:Beta in production environment. (Score:5, Informative)
Re:Microsoft brainwashing (Score:5, Informative)
Uh, didn't I read an article not too long ago about how the update.microsoft.com site was broken into?
Link, please?
Re:Beta in production environment. (Score:3, Informative)
You can debate the drawbacks and benefits of having so many versions, but XP was never intended to be a substantial server.
Re:Beta in production environment. (Score:1, Informative)
But it's not intended for servers, either on Vista or XP, as the GP said.
Re:Firewall Schmirewall (Score:4, Informative)
Re:Firewall Schmirewall (Score:1, Informative)
akamai (Score:4, Informative)
It's one reason why a lookup on Microsoft servers often shows that they are running Linux. It's also another reason people point out that Linux is more scalable, because even Microsoft can't eat its own dogfood.
Misleading Summary. Total Propaganda (Score:4, Informative)
2. I get into discussions where tech guys spew traffic numbers, and I'm never impressed. Volume creates issues only if you want to actually do something with the data, which I doubt they do much beyond running the usual marketing metrics. Not until you actually shoot for 99.99% service uptime do you begin to comprehend the challenge (on any platform); the traffic itself is not the challenge.
3. I'm very interested in reading what their hardware budget is like. I get excellent performance out of Linux compared to Server 2003 boxes on similar Compaq DL380s.
Re:Supporting (Score:5, Informative)
Re:But generally.. (Score:5, Informative)
And no, I don't see any need to firewall a web farm either.
Re:Swimming in acronym soup... (Score:5, Informative)
HBI: High Business Impact (e.g. Social Security numbers).
NLB: Network Load Balancer.
AV: AntiVirus.
DoS: Denial of Service.
IIS: Internet Information Services. 'httpd' for Windows.
Re:Microsoft brainwashing (Score:3, Informative)
I can't vel (BTW, on a related note, the burden of proof is on the person who makes the claim. This follows by necessity from the impossibility of proving a negative.)
Re:Eating dogfood is good (Score:5, Informative)
Never mind that the UI for 2008 is roughly the same as 2003, only with a more extensive set of configuration utilities for roles and features (which still look clean and fairly spartan on the eye candy). Just wish I could say the same for the control panel.
As for the 'research' panel... okay, I work here at Microsoft, and I own my own copies of Office at home, and I have no idea what that is. Of course, I'm hardly an Office power user.
You can bet your bottom dollar that Office 2007 is all that's in use around most of the company. As is Vista, although it tends to be a mixture of Vista, XP and 2003/2008 in most offices, usually for a variety of legacy reasons (maintenance of older projects, testing, etc.).
I've got all but XP myself, but only because I haven't needed it to do my job.
Re:Beta in production environment. (Score:3, Informative)
Re:Microsoft brainwashing (Score:3, Informative)
But don't take my word for it; go install Server 2003 R2 yourself. IIS either isn't installed unless you specify it, or it comes locked down to serve ONLY static content. (I know the latter is the default IIS setup, because I had to go turn on everything I needed.)
Re:Beta in production environment. (Score:3, Informative)
Re:Supporting (Score:3, Informative)
What causes the confusion is that if you run nmap -O, it guesses the host operating system to be Linux even though the web server is running IIS.
Re:Perhaps the only ones who can do it "right" (Score:2, Informative)
I'm actually out of words at this point.
Re:Microsoft brainwashing (Score:2, Informative)
Re:Supporting (Score:3, Informative)
http://news.netcraft.com/archives/2003/08/17/wwwmicrosoftcom_runs_linux_up_to_a_point_.html [netcraft.com]
Re:Firewall Schmirewall (Score:1, Informative)
Re:Firewall Schmirewall (Score:5, Informative)
Logging in fixed format is not more efficient than variable format text files (unless we're talking about transactions, but we're not). Let's assume you're logging the basics: IP address, timestamp, return code, URI. We'll look at logging in fixed format, then variable format.
Every record will require 63 bytes (and we'll round up to 64 for proper word alignment). So, if we log 1000 messages, we will consume 64,000 bytes total.
Ok. Now for text logging with space delimiters. We have 3 options below, each requiring slightly less space than the previous. We'll run totals for each.
16 + 15 + 2 + 50 + 1 = 84 bytes * 1000 = 84,000 bytes
16 + 11 + 2 + 50 + 1 = 80 bytes * 1000 = 80,000 bytes
12 + 10 + 1 + 50 + 1 = 74 bytes * 1000 = 74,000 bytes
Wow. Fixed binary format kicks variable text format's ass. Wrong. This assumes the URI (or message) block will always occupy 50 bytes. It will not. Let's go right down the middle and assume it averages 25 bytes and we'll recalculate.
16 + 15 + 2 + 25 + 1 = 59 bytes * 1000 = 59,000 bytes
16 + 11 + 2 + 25 + 1 = 55 bytes * 1000 = 55,000 bytes
12 + 10 + 1 + 25 + 1 = 49 bytes * 1000 = 49,000 bytes
Variable text format almost always beats fixed binary format for logging. That's why Microsoft (and the rest of the world) stores log files as text. Plus, it's far easier to manage and debug when you can slice and dice the files with standard command line tools.
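For anyone who wants to rerun the arithmetic, here is a minimal Python sketch. The widths are copied straight from the sums above; the names (TEXT_LAYOUTS, compare, and so on) are mine, and nothing here is measured from real IIS logs.

```python
# Sizes copied from the comparison above; names are illustrative only.
FIXED_RECORD_BYTES = 64          # 63 bytes rounded up for word alignment
TEXT_LAYOUTS = [                 # non-URI widths of the three text options
    (16, 15, 2),
    (16, 11, 2),
    (12, 10, 1),
]
LINE_TERMINATOR_BYTES = 1        # the trailing "+ 1" in each sum


def text_total(layout, uri_bytes, records=1000):
    """Total bytes for `records` text lines with a URI of `uri_bytes`."""
    per_record = sum(layout) + uri_bytes + LINE_TERMINATOR_BYTES
    return per_record * records


def fixed_total(records=1000):
    """Total bytes for `records` fixed-width records."""
    return FIXED_RECORD_BYTES * records


def compare(uri_bytes, records=1000):
    """Totals for all three text layouts at a given URI length."""
    return [text_total(layout, uri_bytes, records) for layout in TEXT_LAYOUTS]


if __name__ == "__main__":
    for uri_bytes in (50, 25):
        print(uri_bytes, fixed_total(), compare(uri_bytes))
```

With a 50-byte URI every text layout is larger than the 64,000-byte fixed total; at 25 bytes every one is smaller, which is the whole point.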
One more thing. I know what you might be thinking: we're logging URLs, which will probably consume the majority of the 50-byte allotment. Most developers will calculate an average width and double it, so no matter what, we'll still be filling only about 50% of the message section.
Last point. If I were to use your example, the savings with text logging would be even greater. Two URLs would be stored, both consuming about 50% of their data block: IP address, timestamp, URI, referrer URI, return code. There are also a bunch of other little optimizations you can do, such as storing the domain, year, month, and day in the filename rather than in the data, or dropping the least significant byte in the HTTP return code.
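A rough sketch of the filename trick, assuming one log file per domain per day. The path layout, function names, and sample values below are made up for illustration; this is not how IIS names its logs.

```python
from datetime import datetime


def log_path(domain, ts):
    """Hypothetical layout: the domain and the date live in the path."""
    return f"{domain}/{ts:%Y-%m-%d}.log"


def log_line(ip, ts, status, uri):
    """Each line carries only the time of day, not the full date."""
    return f"{ip} {ts:%H:%M:%S} {status} {uri}\n"


def bytes_saved_per_line(ts):
    """Difference between a full 'date time' stamp and time of day only."""
    full = f"{ts:%Y-%m-%d %H:%M:%S}"
    short = f"{ts:%H:%M:%S}"
    return len(full) - len(short)


if __name__ == "__main__":
    ts = datetime(2007, 11, 5, 13, 4, 9)   # sample value
    print(log_path("example.com", ts))
    print(log_line("192.0.2.1", ts, 200, "/index.html"), end="")
    print(bytes_saved_per_line(ts))
```

The saving per line is small, but it is multiplied by every request in the file.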
Re:But generally.. (Score:4, Informative)
Re:Firewall Schmirewall (Score:3, Informative)
"Firewall" is not a synonym for "stateful filter", as you imply later on in this thread. To support my statement, the firewall entry at Wikipedia [wikipedia.org] says:
"A firewall is a dedicated appliance, or software running on another computer, which inspects network traffic passing through it, and denies or permits passage based on a set of rules."
It then goes on to classify firewalls into first, second and third generation (the first being what you called "port blocking").
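To make the "set of rules" part concrete, here is a toy stateless filter in Python. The allowed ports are my assumption, not the actual MS.COM router ACLs, and a real ACL also matches on addresses and direction.

```python
# Toy first-generation (stateless) filter: each packet is judged on its own,
# with no connection tracking. The rule set is an assumed example.
ALLOWED_TCP_PORTS = {80, 443}


def permit(protocol, dst_port):
    """Return True if the packet passes the rule set, False if it is dropped."""
    return protocol == "tcp" and dst_port in ALLOWED_TCP_PORTS


if __name__ == "__main__":
    for proto, port in [("tcp", 80), ("tcp", 3389), ("udp", 53)]:
        print(proto, port, "permit" if permit(proto, port) else "deny")
```

By the definition quoted above, that already counts as a firewall, whether it runs on a router or a dedicated box.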
In retrospect IPHBT. Oh well.
Re:Microsoft brainwashing (Score:4, Informative)
URL Not Working... (Score:2, Informative)