The Setup Behind Microsoft.com
Posted by
kdawson
on Thursday December 13, @11:14AM
from the matter-of-scale dept.
Toreo asesino writes "Jeff Alexander gives an insight into how Microsoft runs its main sites. Interesting details include having no firewall, having to manage 650 GB of IIS logs every day, and the use of their as-yet-unreleased Windows Server 2008 in a production environment."
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
Mostly how they run it (Score:5, Funny)
Re:Mostly how they run it (Score:4, Funny)
Beta in production environment. (Score:2, Funny)
So even MS has given up on Vista.
Re:Beta in production environment. (Score:5, Informative)
Re:Beta in production environment. (Score:5, Informative)
Re:Beta in production environment. (Score:5, Funny)
Re:Beta in production environment. (Score:5, Funny)
Gotta give credit to MS for eating their own dog food...
Allow incoming connection on port 80? Confirm/deny
Re:Beta in production environment. (Score:5, Funny)
Re:Beta in production environment. (Score:5, Interesting)
That said, the choice to use longhorn server in production isn't actually a bad one. It's really, REALLY stable. I keep hearing (from people both inside and outside the company) that it's more stable than 2003 is (and 2003 has the benefits of multiple service packs). It's also a lot more configurable about what it runs, and how much of it it enables when it's installed. I wouldn't bet the entire stable on it, but I'd be willing to put money on it getting a place.
All in all, it's pretty sweet, if you look at it from the sysadmin perspective. Also, the stuff you can set up when you couple it with Vista is really nice (from a security standpoint, particularly). That said, some of that functionality is being backported to XP with SP3 or whatever.
Re:Beta in production environment. (Score:5, Insightful)
Dude, if you can't hack that right now, how are you dealing with unix instead?
If any platform's based on a standard of bizarre naming due to space-saving stupidity, that's it. Far more so than Windows. In fact, name any mature platform that's based on reasonable standards for its underlying APIs and structure.
Didn't think you could. While it's true that things like the FHS are helping on the unix side, try telling an oldschool developer like oracle that they need to follow it. They'll laugh. and laugh.
and laugh.
Windows is in much the same position. At least
ash
Re:Beta in production environment. (Score:5, Insightful)
Because at least Unix has conventions.
Really? Ok, let's open up C:\Windows on one of our Windows servers. Hmmm, a folder named "$hf_mig$". I suppose you know what that means or what convention that follows? Or C:\Windows\adam. Kinda looks like it might be some directory tools. Maybe ADAM = Active Directory AdMinistration? What's that doing there anyway? I could keep going down the list. I suppose there is a very good reason why there are
First of all, I was only talking about superficial organization. And if you want to see something nice, have a look at OS X some time. Not only is the System (/System) well organized, but most applications are neatly self contained in
I could give fuck-all what Oracle thinks. My Debian systems are very well organized, thank you very much. I don't find desktop wallpapers in
-matthew
Re:Beta in production environment. (Score:5, Interesting)
Conventions are a nice way of saying "that's the way it's always been, so that's the way it stays." Windows has similar problems left over from legacy, going all the way back to CP/M. Yes, this sucks, but so do some conventions in unixland. Just ask a Solaris 10 admin how much it sucks when your upstream vendor breaks decades-long convention.
Really? Ok, let's open up C:\Windows on one of our Windows servers. Hmmm, a folder named "$hf_mig$". I suppose you know what that means or what convention that follows? Or C:\Windows\adam. Kinda looks like it might be some directory tools. Maybe ADAM = Active Directory AdMinistration? What's that doing there anyway? I could keep going down the list. I suppose there is a very good reason why there are
You're not looking in the right place. Microsoft, love it or hate it, worked out a long time ago that 'filename' and 'metadata' aren't necessarily the same thing. The filename and path are just handy locational indexes, and don't necessarily need to mean *anything*. Sure, a DLL can, and often, for newer stuff, IS far longer than 8.3, but it wasn't until later versions of NT (3.5/4.0, I don't remember my history too well) that support for it kicked in well enough, and there's some legacy stuff around. You don't break legacy just because it's fun. Microsoft gets this right, even if they had to tread over it a fair bit in vista, and add some nasty hacks to deal with most of the fallout.
Anyway, as I was saying, you're not looking in the right place. Case study: C:\windows\system32\apss.dll: Microsoft(r) InfoTech Storage System Library.
Problem solved. (it's not at all difficult to use something like powershell (or possibly other tools) to just print this out in a souped up version of ls with a little scripting, I might add, just like I can do a few similar scripting tricks on my debian system to tell you who owns the copyright to 90% of
Want another one?
c:\windows\System32\bitsigd.dll: Background Intelligent Transfer Service IGD Support
Oh look, another one, fully named.
Of course, this starts to fall down when the file doesn't contain metadata, but that's a problem for, say, XML schema files in
First of all, I was only talking about superficial organization. And if you want to see something nice, have a look at OS X some time. Not only is the System (/System) well organized, but most applications are neatly self contained in
Yes. I do.
I will admit that the mac platform is neatly arranged, but their QA seems to have gone to the toilet right now. A place that windows' QA has emerged from rather nicely, I should mention.
As for random stuff appearing in random places, try dealing with commercial software. Even on Linux, the developers will put shit in strange places. Open source software is a different matter: you've got enough control that you, or the maintainers, can apply the shoe-horn. Windows doesn't have this problem either. Windows software goes in where it should and, I should mention, is *legally* obliged to go away completely cleanly when the user requests. I'm not kidding about this. We do a lot of QA just making sure that 'uninstall' for our newer shit works.
We can't be responsible for what third parties do, however. Neither can apple (I just *love* dealing with adobe's software on apples, btw. Or Zend Developer Framework. mmmhm. ) Nor you. Install maya on linux sometime. Or matlab, or something else that you can't fuck with the organisational structure of, because the licensing server would crack the shits.
ash
Re:Beta in production environment. (Score:4, Insightful)
Firewall Schmirewall (Score:5, Funny)
Microsoft servers are notorious for their invulnerability.
Re:Firewall Schmirewall (Score:5, Informative)
But generally.. (Score:5, Insightful)
Cisco Guards for DoS detection and automated response
What in the world do *you* perceive the difference being between a 'firewall' and a router blocking ports based on source and destination being compared with a set of rules (aka ACLs)? Generally, firewall rules *can* get more complex than that, but mere port blocking by an intermediate router has been considered a firewall, even if it doesn't log violating or accepted packets, even if it doesn't have complex rules about connection state. Even if it doesn't have the word 'firewall' emblazoned on the chassis somewhere.
Re:But generally.. (Score:5, Informative)
And no, I don't see any need to firewall a web farm either.
Re:But generally.. (Score:4, Informative)
Re:Firewall Schmirewall (Score:5, Insightful)
Ah, the little children. Do you know what the first firewalls were? Routers with access lists. Anything that blocks anything from going to one place from another is a firewall. Port blocking is a firewall, and there exists no firewall I know of that can't be configured to do nothing other than port blocking. You don't have to inspect packets, track flows, or any of those other things to be a firewall, all you have to do is offer some means of restricting traffic. And blocking ports does that.
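To make the point above concrete, here is a minimal sketch of stateless port blocking in the style of a router access list. The Rule type, the evaluate function, and the example ACL permitting ports 80 and 443 are all hypothetical illustrations, not anything taken from the article. The only behaviour borrowed from real routers is first-match evaluation with an implicit deny at the end of the list.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional


@dataclass(frozen=True)
class Rule:
    """One hypothetical access-list entry."""
    action: str               # "permit" or "deny"
    source: str               # source network in CIDR form
    dest_port: Optional[int]  # None matches any destination port


def evaluate(rules, src_ip, dest_port):
    """Return the action of the first matching rule.

    No packet inspection and no connection state: only the source
    address and destination port are compared. Anything that matches
    no rule is denied (the implicit deny at the end of the list).
    """
    addr = ip_address(src_ip)
    for rule in rules:
        if addr in ip_network(rule.source) and rule.dest_port in (None, dest_port):
            return rule.action
    return "deny"


# Assumed example: a web farm that only accepts HTTP and HTTPS.
WEB_FARM_ACL = [
    Rule("permit", "0.0.0.0/0", 80),
    Rule("permit", "0.0.0.0/0", 443),
]

if __name__ == "__main__":
    print(evaluate(WEB_FARM_ACL, "203.0.113.7", 80))    # web traffic
    print(evaluate(WEB_FARM_ACL, "203.0.113.7", 3389))  # anything else
```

Nothing here tracks flows or inspects payloads, yet it restricts traffic, which is the commenter's whole argument.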
Re:Firewall Schmirewall (Score:5, Informative)
"...At this point we still don't use firewalls for MS.COM..."
and then
"Router ACLs are in place to block unnecessary ports"
blocking unnecessary ports is a firewall feature (IMHO ?)
Anyway it looks quite impressive. I still don't understand how to handle 650 GB of logs
Re:Firewall Schmirewall (Score:5, Funny)
Re:Firewall Schmirewall (Score:5, Funny)
Well geez.. in that case I sure hope they do regular backups of
Re:Firewall Schmirewall (Score:4, Funny)
$
Works fine for me. Are you sure you're not confusing
Re:Firewall Schmirewall (Score:4, Informative)
Re:Firewall Schmirewall (Score:5, Funny)
Re:Firewall Schmirewall (Score:5, Informative)
Logging in fixed format is not more efficient than variable format text files (unless we're talking about transactions but we're not). Let's assume you're logging the basics: IP address, Timestamp, Return code, URI and we'll look at logging in fixed format then variable format.
Every record will require 63 bytes (we'll round up to 64 for proper word alignment). So, if we log 1000 messages, we will consume 64,000 bytes total.
Ok. Now for text logging with space delimiters. We have 3 options below, each requiring slightly less space than the previous. We'll run totals for each.
16 + 15 + 2 + 50 + 1 = 84 bytes * 1000 = 84,000 bytes
16 + 11 + 2 + 50 + 1 = 80 bytes * 1000 = 80,000 bytes
12 + 10 + 1 + 50 + 1 = 74 bytes * 1000 = 74,000 bytes
Wow. Fixed binary format kicks variable text format's ass. Wrong. This assumes the URI (or message) block will always occupy 50 bytes. It will not. Let's go right down the middle and assume it averages 25 bytes and we'll recalculate.
16 + 15 + 2 + 25 + 1 = 59 bytes * 1000 = 59,000 bytes
16 + 11 + 2 + 25 + 1 = 55 bytes * 1000 = 55,000 bytes
12 + 10 + 1 + 25 + 1 = 49 bytes * 1000 = 49,000 bytes
Variable text format almost always beats fixed binary format for logging. That's why Microsoft (and the rest of the world) stores log files as text. Plus, it's far easier to manage and debug when you can slice and dice the files with standard command line tools.
One more thing. I know what you might be thinking. We're logging URLs, which will probably consume the majority of the 50 byte allotment. Most developers will calculate an average width size and double it, so no matter what we'll still be filling about 50% of the message section.
Last point. If I were to use your example, the savings with text logging would be even greater. 2 URLs would be stored, both consuming about 50% of their data block. IP address, timestamp, URI, Referrer URI, Return Code. There are also a bunch of other little optimizations you can do, such as storing the domain, year, month, and day in the filename rather than in the data, or dropping the least significant byte in the HTTP return code.
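The arithmetic in this comment can be reproduced with a short sketch. The widths come straight from the sums above. Treating the trailing 1 byte as a line terminator and the first three numbers as the non-URI fields is an assumption, since the comment gives only the numbers. The function names are made up for illustration.

```python
# 63 bytes rounded up to 64 for alignment, per the comment.
FIXED_RECORD_BYTES = 64

# Non-URI field widths for the three text options in the comment.
# What each number stands for is an assumption; only the sums are given.
TEXT_OPTIONS = [
    (16, 15, 2),
    (16, 11, 2),
    (12, 10, 1),
]

# The trailing "+ 1" in each sum, assumed to be the line terminator.
TERMINATOR_BYTES = 1


def text_record_bytes(prefix_widths, uri_bytes):
    """Size of one text record for a given URI length."""
    return sum(prefix_widths) + uri_bytes + TERMINATOR_BYTES


def totals(uri_bytes, records=1000):
    """Total bytes for each text option at a given average URI length."""
    return [text_record_bytes(w, uri_bytes) * records for w in TEXT_OPTIONS]


def break_even_uri_bytes(prefix_widths):
    """Longest average URI for which text is no larger than fixed format."""
    return FIXED_RECORD_BYTES - sum(prefix_widths) - TERMINATOR_BYTES


if __name__ == "__main__":
    print("fixed:", FIXED_RECORD_BYTES * 1000)
    print("text, 50-byte URIs:", totals(50))
    print("text, 25-byte URIs:", totals(25))
    print("break-even URI length:", [break_even_uri_bytes(w) for w in TEXT_OPTIONS])
```

Under these assumptions the text format wins whenever the average URI is shorter than roughly 30 to 40 bytes, depending on the option, which is the threshold the comment's conclusion depends on.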
Supporting (Score:1, Troll)
Re:Supporting (Score:5, Insightful)
Re:Supporting (Score:5, Insightful)
Re:Supporting (Score:5, Informative)
Re:Supporting (Score:4, Funny)
Talc [wikipedia.org] is technically a rock...
Microsoft brainwashing (Score:2, Insightful)
Re:Microsoft brainwashing (Score:5, Informative)
Uh, didn't I read an article not too long ago about how the update.microsoft.com site was broken into?
Link, please?
Re:Microsoft brainwashing (Score:4, Informative)
Re:Microsoft brainwashing (Score:5, Funny)
Link, please?
Hi, and welcome to Bizarro World... (Score:1)
wtf! (Score:1)
I wonder what platform they use... (Score:1)
Eating dogfood is good (Score:5, Insightful)
Re:Eating dogfood is good (Score:5, Informative)
Nevermind that the UI for 2008 is roughly the same as 2003, only with a more extensive (yet still looking clean and fairly spartan with the eyecandy) set of configuration utilities for roles and features. Just wish I could say the same for the control panel.
As for the 'research' panel... okay, I work here at microsoft, and I own my own copies of office at home, and I have no idea what that is. Of course, I'm hardly an office power user.
You can bet your bottom dollar that office 2007 is all that's in use around most of the company. As is vista, although it tends to be a mixture of vista, xp and 2003/2008 in most offices, usually for a variety of legacy reasons (maintenance of older projects, testing, etc)
I've got all but XP myself, but only because I haven't needed it to do my job.
No firewalls? (Score:1)
look:
In terms of how we protect the sites, we utilize (starting at the outside edge of the network and working in):
1. Cisco Guards for DoS detection and automated response
2. Router ACLs are in place to block unnecessary ports
Not a firewall, but... (Score:2, Insightful)
Priceless... (Score:4, Funny)
Server to run it on: ~$2000
Beta testing Microsoft's new Server 2008 in a production environment: Priceless
Re:Priceless... (Score:4, Insightful)
Ever tried to bookmark something on that site? (Score:2)
I wonder if it's on purpose (to avoid bookmarking) or just bad design.
HBI? (Score:2)
HBI Health and Biomedical Information
HBI Healthcare Building Ideas (magazine)
HBI Home Builders Institute
HBI Home Business Institute
HBI Horizontal Blanking Interval (television)
HBI Hot Beef Injection (band)
HBI Hot Briquetted Iron (plant or facility)
HBI Hubbard Broadcasting Inc.
Wikipedia: Page does not exist.
Re:HBI? (Score:4, Funny)
Re:HBI? (Score:4, Insightful)
Microsoft and logs do not compute (Score:1, Funny)
Re:Microsoft and logs do not compute (Score:4, Insightful)
Swimming in acronym soup... (Score:5, Funny)
HBI?
GFS (is the G for "Ghost")?
NBI?
NLB?
ACE?
TIA
Re:Swimming in acronym soup... (Score:4, Interesting)
Re:Swimming in acronym soup... (Score:5, Informative)
HBI: High Business Impact. Social Security numbers
NLB: Network Load Balancer.
AV: AntiVirus.
DoS: Denial of Service
IIS: Internet Information Services. 'httpd' for Windows.
Better response: (Score:1, Flamebait)
1. We run Linux.
What happened to Akamai Linux? (Score:3, Interesting)
Perhaps the only ones who can do it "right" (Score:5, Insightful)
That said, with their closed source and closed-doors policy on revealing details about the inner workings of the OS, _Microsoft_ may be the only company that can successfully deploy a 100% Microsoft powered solution. How many registry changes, service daemon modifications, and other tweaks have been made to get their config running this way? The world may never know. It's probably impossible for the consumer world to ever have that level of knowledge about the Windows environment, and thus run it at peak security levels. For most consumers and businesses, a Linux OS with properly implemented firewalls is much more secure than an out-of-the-box Windows deployment and router ACLs.
akamai (Score:4, Informative)
It's one reason why, when doing a lookup on Microsoft servers, it often shows that they are running Linux. It's also another reason why people point out that Linux is more scalable, because even Microsoft can't eat its own dogfood.
Ok... (Score:1)
Misleading Summary. Total Propaganda (Score:4, Informative)
2. I get into discussions where tech guys spew traffic numbers and I'm never impressed. Volume creates issues if you want to actually do something with the data, which I doubt they do much beyond running the usual marketing metrics. Only when you actually shoot for 99.99% service uptime do you begin to comprehend the challenge it is (on any platform); the traffic itself is not the challenge.
3. I'm very interested in reading what their hardware budget is like. I get excellent performance out of Linux compared to server 2003 boxes on similar compaq dl380's.
Now there's a best practice (Score:3, Funny)
Now there's a best practice that other corporations should follow - the use of test software in a production environment.
3 Free Tips (Score:1)
|In terms of how we protect the sites, we utilize (starting at the outside edge of the network and working in):
So there you have it. I think this is a good insight into how we run our own internet properties today. What do you think? Have you got any feedback for the boys over at our MSCOM Operations team?|
3 Free Tips, the rest I charge for:
1st don't advertise your network's security, especially from the outside in.
2nd don't believe your own propaganda on rock solid. There are too many issues in it to be rock solid.
3rd don't state your future migration plans on secure architectures to the public.
Cheers
--- Just because you go hunting doesn't mean you have to shoot yourself in the foot ---
No Firewalls! (Score:2)
They don't trust even Win2k8 servers to be secure enough without the *nix safety blanket.
Back In The Days (Score:1, Interesting)
perhaps microsoft will stamp one of these (Score:1)
On the Subject of Firewalls and Microsoft.com (Score:1)
He is basically clueless (Score:2)
1. From what I understand, they don't handle data that needs an audit trail in transactions and so on, so they don't need a firewall. I don't see any logic in his statement.
2. 650GB/day (of what exactly?) may seem a lot but in fact a quite regular database cluster and a proper design would handle that easily if it is well scaled.
3. He is probably just quoting somebody else. Maybe he is right here, but it is hard to judge with no knowledge of how exactly this setup is used. And what he means by firewall is another mystery to me.
4. He is stating that some form of NLB made by MS in their web server architecture is bad since it makes normal network design complex and expensive. Is that what he is stating?
5. This point also makes no sense to me. Of course application security is essential, since it has nothing to do with the firewall. A firewall merely passes traffic or not based on simple, low-level protocol parameters. A firewall does not protect against application flaws. Application flaws occur at a very different level. He is even clueless about the OSI model...
The rest is just bullshit about how it is cool to use untested software in production. Actually it is very uncool.
Also this "knowledge" of his is useless. I would love to see some insights on such large setups from somebody who is not M$ and actually did research and testing on which platform to use. Like Google for example.
And also, how does microsoft.com compare to google.com? Which is bigger in terms of traffic/application load/databases and so on?
Microsoft shill doesn't know what a firewall is (Score:2)
Also, running AV software on a web server? What? I can't think of very many situations where that would be at all defensible.
The rest of the article reads like a marketing presentation. Very enterprise.
the article itself (Score:1)
MS Xenix (Score:2)
LoB
Bad link (Score:1)
This explains the crappy service (Score:2)
Now how stupid is that? What sys admin would use an unreleased OS in a production environment?
That's like Rule No. 1, isn't it?
Hidden (Score:2, Funny)
Some of the new stuff (Score:1)
General:
This will be the last Windows Server that will have a 32-bit installation available. With the popularity of x64 based Intel and AMD processors, and the proven reliability of WOW64, this shouldn't be a problem.
You may add/remove as many roles at a time, with a single reboot required after all the roles have been installed
You can bypass entering the product code on installation (Activation still requires the code though). Setup is no longer linear - you can pick and choose what you wish to configure.
Virtualization:
Virtualization has now become a feature of the OS, rather than a separate application installation. You can enable virtualization as a server role. When this happens, a thin layer acts as the interface between the virtual hosts and the hardware (marketing term: "Hypervisor"). The parent host OS then becomes a virtual image (that can't be moved). All hosts are treated as equals.
Virtualization requires the 64-bit edition of Server 2008 installed.
Virtual machines can now have memory spaces > 4 GB and have multiple cores
Virtual machines can run any Windows version, and some Linux variants are now supported (most likely all will run; MS will actually field support calls for the supported Linux variants).
Event Log
The event log is so much better that I can't begin to explain how much better it is. You truly have to see it. Here are some of the features:
Events displayed within each subsystems management screens. Ex: if I were to open IIS management, I would see a default screen with all the events that were generated by IIS, and none that were generated by other systems.
Events from all eventlogs (Application, Security, System, etc) can be displayed in one window
You are able to see events categorized by event severity, and grouped by time frame (ex: 1 critical event in the last hour, 3 in the last day, x in the last week).
You are able to push events to a central server from multiple servers, or you can pull events from other servers to one (subscription)
You are able to execute applications or send emails when an event is fired. You set up criteria for that to happen (event ID, severity, text in body/subject, etc).
Management
The Computer Management MMC console has been replaced by the Server Management console. The Server Management console is automatically populated with links to the management windows for each installed role, thus making it the de-facto configuration window.
PowerShell is a new command line interface. It is a hybrid console/scripting environment, created to aid in systems management. You can manage either the local server or remote servers from it.
New Server 2008 Core Installation Option
Server core is an optional way to implement Windows 2008. It removes the GUI portion of the OS as well as a number of other features, thus reducing the attack surface of the OS.
Core is not a separate product; the Standard, Enterprise, and Datacenter editions can all be installed in Core mode
Managed with remote tools and command prompt (cmd)
5 available server roles
Included:
o DNS
o DHCP
o File sharing
o AD
o WSV - windows server virtualization
o Limited IIS - static content only
o Task manager
Not included:
o No GUI
URL Not Working... (Score:2, Informative)
Take it easy (Score:1)
Asked to log in before reading a blog? What? (Score:2)
Link broken? (Score:2)
Not Found: Forum Not Found
The forum you requested does not exist.
Was the article deleted?
Isn't it ironic? (Score:2)
hum ... (Score:2)
We apologize, but an error occurred and your request could not be completed.
This error has been logged. If you have additional information that you believe may have caused this error please report the problem here.
this is what I get (Score:3, Funny)
"We are currently unable to serve your request
We apologize, but an error occurred and your request could not be completed.
This error has been logged. If you have additional information that you believe may have caused this error please report the problem here.
"
I think that gives a good demonstration of how they run their site...
We are currently unable to serve your request (Score:2)
Slashdotted. Oh, the many levels of delicious irony...
We are currently unable to serve your request (Score:2)
We apologize, but an error occurred and your request could not be completed.
This error has been logged. If you have additional information that you believe may have caused this error please report the problem here.
The above is what I get when I try to RTFA. I guess that tells me all I need to know!
Post deleted (Score:1)
Re:A router can be a firewall too (Score:1)
Re:They do use firewall (Score:2)
Re:They do use firewall (Score:1, Flamebait)
Re:Router ACL= Firewall (Score:2, Flamebait)
2. Router ACLs are in place to block unnecessary ports
Right-o ! Shows what a brainwashed, single-minded dim he is. Doesn't say "(Microsoft) Firewall v.0.38.2a" on the shrink-wrapped package; and voilà, isn't (a firewall). That's how they keep the masses unwashed and in admiration. (But I digress.)
Actually, the whole thing is a disgrace, but what to expect
2. We have ~650GB/day of IIS logs [...] Just IIS logs are a challenge without trying to parse another ~650GB of firewall logs.
Why is an IIS log size just as large as a firewall log ? Makes me wonder, if he thinks they were the same ??
650GB of what ? ASCII text or gzip ?
3. 5+ years ago, there wasn't a firewall solution that would scale to our needs and this forced us to focus on network, host, and application security.
I'd never want their stuff, even for free. Because the use of the word 'forced' is absolutely wrong. Program security is the alpha and omega of security; and anyone who wants to have his software taken seriously would look into exactly these. Not into firewalls.
5. Application security is critical since a firewall is likely going to allow traffic on the correct port and protocol through to the web servers so IIS/ASP.NET/Applications must deal with these requests gracefully.
This is so right, see above. But the mentality implies he is unaware of the fact that predictable and graceful behaviour is what we want in the applications in the first place.
6. We do run AV on our servers when we can. At times product adoption means we don't install it, but we do normally run AV.
Makes one wonder what this is supposed to tell us. At times they don't get an AV running on their own boxen ? Can someone point out to me, which logic underpins non-usage of AV for 'product adoption' ? Like, on those boxen containing Vista ?
Re:No filewall? (Score:1)
Re:A router can be a firewall too (Score:2)
IIS is a web server, thus those are web server logs, which can be parsed to get statistics about page views, errors, etc...
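As a rough illustration of that kind of parsing, here is a minimal sketch that tallies status codes and page views from log lines in the W3C extended format, where lines beginning with "#" are directives and a "#Fields:" directive names the space-separated columns. The sample lines are made up for the example, and the function names are hypothetical.

```python
from collections import Counter


def parse_w3c(lines):
    """Yield one dict per log entry, keyed by the names in '#Fields:'."""
    fields = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Directive line; only '#Fields:' changes how entries are read.
            if line.startswith("#Fields:"):
                fields = line[len("#Fields:"):].split()
            continue
        yield dict(zip(fields, line.split()))


def summarize(lines):
    """Count responses per status code and hits per URI stem."""
    status = Counter()
    pages = Counter()
    for record in parse_w3c(lines):
        status[record.get("sc-status", "?")] += 1
        pages[record.get("cs-uri-stem", "?")] += 1
    return status, pages


# Invented sample data, not real microsoft.com traffic.
SAMPLE = [
    "#Software: Microsoft Internet Information Services 7.0",
    "#Fields: date time cs-method cs-uri-stem sc-status",
    "2007-12-13 11:14:00 GET /default.aspx 200",
    "2007-12-13 11:14:01 GET /missing.aspx 404",
    "2007-12-13 11:14:02 GET /default.aspx 200",
]

if __name__ == "__main__":
    status, pages = summarize(SAMPLE)
    print(status.most_common())
    print(pages.most_common(1))
```

Because the parser is a generator, it can be fed an open file object line by line, which matters once the input is hundreds of gigabytes rather than five lines.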
Re:nothing better to say? (Score:2)
And if you're referring to Linux versioning, please remember that with OSS products there is no remit to get a "finished" product into a box onto the shelves - just because it happens to be "Random Linux App v0.3" does not mean it is "not quite done".
Re:It Blows (Score:2)
Re:Bill Gates Behind a Curtain (Score:1)
But it's a setup
until you're fed up
Re:It Blows (Score:1)
We apologize, but an error occurred and your request could not be completed.
This error has been logged. If you have additional information that you believe may have caused this error please report the problem here.
How appropriate, given the article's subject.
Hah.