The Setup Behind Microsoft.com
Toreo asesino writes "Jeff Alexander gives an insight into how Microsoft runs its main sites. Interesting details include having no firewall, having to manage 650 GB of IIS logs every day, and the use of their as-yet-unreleased Windows Server 2008 in a production environment."
What happened to Akamai Linux? (Score:3, Interesting)
Re:Firewall Schmirewall (Score:3, Interesting)
My question is why are the logs in ASCII text format? When all you want is say the IP [4 bytes], time of day [4 bytes], URI, referrer and return code [do you really care about their browser strings? You are MS after all, just assume it's IE].
Storing an IP as text takes up to 15 bytes, so right there you can shave off as much as 11 bytes with a binary IP. Time of day is worse: a date+time string is like 25 chars. Doesn't seem like much, but multiply the 32 bytes per entry you save by, say, 50 million hits and that's about 1.5 GB you saved. That's not counting the white space you can remove, and a simple Huffman code you could apply to the URL/referrer.
Heck, just piping the binary IP/date and ASCII URL/referrer through gzip [or using libz's gzprintf() etc...] could make a large difference as well.
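To make the byte counting above concrete, here is a rough sketch of what such a record could look like. The layout (4-byte IP, 4-byte Unix time, 2-byte status code, then length-prefixed URI and referrer) and the sample traffic are made up for illustration. This is not the IIS log format and not anything Microsoft is known to use.

```python
import gzip
import ipaddress
import struct
import time

# Assumed layout: 4-byte IPv4 address, 4-byte Unix time, 2-byte status code.
FIXED = struct.Struct("!4sIH")
# Assumed 2-byte length prefix in front of each variable-length string.
STRLEN = struct.Struct("!H")


def pack_entry(ip, unix_time, status, uri, referrer):
    """Binary IP/time/status followed by length-prefixed ASCII URI and referrer."""
    uri_b = uri.encode("ascii")
    ref_b = referrer.encode("ascii")
    return b"".join([
        FIXED.pack(ipaddress.IPv4Address(ip).packed, unix_time, status),
        STRLEN.pack(len(uri_b)), uri_b,
        STRLEN.pack(len(ref_b)), ref_b,
    ])


def text_entry(ip, unix_time, status, uri, referrer):
    """The same fields as one space-separated ASCII line."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(unix_time))
    return f"{stamp} {ip} {uri} {referrer} {status}\n".encode("ascii")


# Made-up sample traffic, only there to have something to measure.
entries = [
    (f"192.168.{i % 256}.{(i * 7) % 256}", 1190000000 + i, 200,
     f"/en/us/page{i % 50}.aspx", "http://www.example.com/")
    for i in range(1000)
]
binary_log = b"".join(pack_entry(*e) for e in entries)
text_log = b"".join(text_entry(*e) for e in entries)

for name, blob in (("text", text_log), ("binary", binary_log)):
    print(f"{name}: {len(blob)} bytes raw, {len(gzip.compress(blob))} bytes gzipped")
```

Run it against a sample of your own logs before believing any particular ratio; the numbers depend heavily on how repetitive the URIs and referrers are.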
Point is, bragging about 650GB/day logs is not really impressive when you're "doing it wrong" (tm). That's like bragging about how much you cut your face while shaving.
Re:Swimming in acronym soup... (Score:4, Interesting)
Re:Beta in production environment. (Score:3, Interesting)
Re:Beta in production environment. (Score:5, Interesting)
That said, the choice to use Longhorn Server in production isn't actually a bad one. It's really, REALLY stable. I keep hearing (from people both inside and outside the company) that it's more stable than 2003 is (and 2003 has the benefits of multiple service packs). It's also a lot more configurable about what it runs, and how much of it it enables when it's installed. I wouldn't bet the entire stable on it, but I'd be willing to put money on it getting a place.
All in all, it's pretty sweet, if you look at it from the sysadmin perspective. Also, the stuff you can set up when you couple it with Vista is really nice (from a security standpoint, particularly). That said, some of that functionality is being backported to XP with SP3 or whatever.
Re:But generally.. (Score:3, Interesting)
Re:But generally.. (Score:2, Interesting)
Most of the low-end routers claimed "firewall" when they did nothing other than NAT. Though now someone else wrote code that runs on their Linux core, so they have firewalls they didn't have to pay for. But what you are saying is that a filter firewall is a firewall under every documented definition of the word, but wouldn't sell well because people expect stateful operation. That sounds like you are violently agreeing. The first thing that comes to one's mind isn't the only correct answer. Otherwise, horses can no longer be mustangs, since if you mention someone went out and bought a mustang, nearly all people would picture a car and not a horse. Language doesn't work that way.
Re:Beta in production environment. (Score:3, Interesting)
Re:Microsoft and logs do not compute (Score:2, Interesting)
The sad part is that despite your perfectly good retort and explanation to the gym-class idiot, he probably read a quarter of your post, mentally tagged you as an MS fanboy, and kept giggling. Makes all the non-idiotic GNU/Linux advocates look like idiots standing next to him.
Back In The Days (Score:1, Interesting)
Re:Beta in production environment. (Score:3, Interesting)
Alternately, you can think of "Home" as the successor to Windows ME, with an NT kernel. I'll try to do this schematically (WKS = Workstation, SVR = Server, and some other weird abbreviations used to make the alignment work): In reality, things are a lot more complicated, because there are other editions, Win 2K Advanced Server, x64 editions, and God knows how many variants of Vista. (Maybe "Vista Business" is a better fit than "Ultimate" above too.) In addition, a lot of people who were or would have been in the 95/98 line moved to the "Pro" line for XP. But, for most people, things probably progressed as indicated.
While that is more or less true, consider that there are really only three main OS codebases in Microsoft now. Windows NT (non-server; the current offering is various forms of Vista, as well as XP until they discontinue it). Windows Server (a very close relative to the NT series, but optimized for server environments and multi-processor usage). Those two codebases are close enough that they share binaries (when on the same architecture), and they could even be used for the opposite purposes with only minor difficulty.
However, the Windows CE codebase is a bit different. It is still distinctly Windows, but executable compatibility with the NT series is rare. (That is due in large part to the fact that most CE devices seem to be platforms other than x86.) Interestingly, it is possible to create .NET apps that run under both CE and modern NT. Since the desktop Framework is largely a superset of the Compact Framework, the desktop assemblies get used, so code using only the .NET Compact Framework and no CE-specific assemblies will run just fine on a desktop system.
Now you may notice that there are also some special sub-codebases. For example, there is the NT Embedded codebase (seen as Windows XP Embedded), and the NT PE versions
Re:Firewall Schmirewall (Score:3, Interesting)
Fixed binary
Variable text
Let's add one more variation: variable length binary records. Maybe that will offer some savings.
Variable binary format
Pretty good, some savings over variable text; however, we've now lost the ability to edit, head, tail, or do anything useful with command line tools. Not exactly worth it for a 1% gain. Oh yes, don't forget gzip will compress ASCII text better than binary because it'll drop the 8th bit on every byte, so you'll automatically pick up a built-in 12.5% gain with ASCII files, which blows away the 1% gain of variable binary format.
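For anyone wondering what "variable length binary records" means in practice, here is a minimal sketch assuming a 2-byte length prefix in front of each record (my own hypothetical framing, not any real log format). It also shows the tooling cost mentioned above: even a simple "tail" needs a purpose-built reader that walks the file from the start.

```python
import io
import struct
from collections import deque

# Assumed framing: each record is preceded by a 2-byte big-endian length.
LEN = struct.Struct("!H")


def write_record(stream, payload):
    """Append one length-prefixed record to a binary stream."""
    if len(payload) > 0xFFFF:
        raise ValueError("record too long for a 2-byte length prefix")
    stream.write(LEN.pack(len(payload)))
    stream.write(payload)


def read_records(stream):
    """Yield record payloads in order until the stream is exhausted."""
    while True:
        prefix = stream.read(LEN.size)
        if not prefix:
            return
        if len(prefix) < LEN.size:
            raise ValueError("truncated length prefix")
        (size,) = LEN.unpack(prefix)
        payload = stream.read(size)
        if len(payload) < size:
            raise ValueError("truncated record")
        yield payload


def tail(stream, n):
    """Last n records; has to scan from the start, unlike tail(1) on text."""
    return list(deque(read_records(stream), maxlen=n))


log = io.BytesIO()
for line in (b"GET /a 200", b"GET /bb 404", b"GET /ccc 200"):
    write_record(log, line)
log.seek(0)
print(tail(log, 2))
```

With newline-delimited text you get the equivalent of all three functions for free from the shell, which is the trade-off being argued about.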
Vista as a server (?) (Score:3, Interesting)
At least this is true with the version I'm testing - June 2007 CTP (Community Technology Preview). I expect in later versions this will be obscured.
Re:Beta in production environment. (Score:5, Interesting)
Conventions are a nice way of saying "that's the way it's always been, so that's the way it stays." Windows has similar problems left over from legacy, going all the way back to CP/M. Yes, this sucks, but so do some conventions in unixland. Just ask a Solaris 10 admin how much it sucks when your upstream vendor breaks decades-long convention.
Really? OK, let's open up C:\Windows on one of our Windows servers. Hmmm, a folder named "$hf_mig$". I suppose you know what that means or what convention that follows? Or C:\Windows\adam. Kinda looks like it might be some directory tools. Maybe ADAM = Active Directory AdMinistration? What's that doing there anyway? I could keep going down the list. I suppose there is a very good reason why there are
You're not looking in the right place. Microsoft, love it or hate it, worked out a long time ago that 'filename' and 'metadata' aren't necessarily the same thing. The filename and path are just handy locational indexes, and don't necessarily need to mean *anything*. Sure, a DLL name can be, and for newer stuff often IS, far longer than 8.3, but it wasn't until later versions of NT (3.5/4.0, I don't remember my history too well) that support for it kicked in well enough, and there's some legacy stuff around. You don't break legacy just because it's fun. Microsoft gets this right, even if they had to tread over it a fair bit in Vista, and add some nasty hacks to deal with most of the fallout.
Anyway, as I was saying, you're not looking in the right place. Case study: C:\windows\system32\apss.dll: Microsoft(r) InfoTech Storage System Library.
Problem solved. (It's not at all difficult to use something like PowerShell (or possibly other tools) to just print this out in a souped-up version of ls with a little scripting, I might add, just like I can do a few similar scripting tricks on my Debian system to tell you who owns the copyright to 90% of
Want another one?
c:\windows\System32\bitsigd.dll: Background Intelligent Transfer Service IGD Support
Oh look, another one, fully named.
Of course, this starts to fall down when the file doesn't contain metadata, but that's a problem for, say, XML schema files in
First of all, I was only talking about superficial organization. And if you want to see something nice, have a look at OS X some time. Not only is the System (/System) well organized, but most applications are neatly self contained in
Yes. I do.
I will admit that the Mac platform is neatly arranged, but their QA seems to have gone to the toilet right now. A place that Windows' QA has emerged from rather nicely, I should mention.
As for random stuff appearing in random places, try dealing with commercial software. Even on Linux, the developers will put shit in strange places. Open
Re:Beta in production environment. (Score:3, Interesting)
But you can have both... Metadata and reasonably named "locational indexes". Is it so strange to think that people, particularly administrators, might want to have some idea what a file does and why it is there just by noting its "locational index"? I see this as a significant flaw in the design of Windows. And then there is the Registry, of course. Who would have guessed that users might actually want/need to edit it manually? Certainly not Microsoft. That is just poor planning on their part and I won't excuse it.
You can break legacy. It isn't fun, but it doesn't have to be disastrous either. Apple did it with OS X. And then they did it again when moving from PPC to x86. The only reason Microsoft can't do it is because they've got so much inertia. And it will be their downfall. Though it would probably help if Microsoft didn't wait 4-5 years between major releases (more granular change). Even if Microsoft did want to break legacy, everyone has gotten so used to the old flaws that they can't change. Vista might well be awesome. But the reality is that many people will still be running XP even 5 years from now. Apple, on the other hand, has gotten people accustomed to significant changes.
Fortunately I don't have to much on Linux. I will admit that much of the mess in Windows is as much the fault of developers as it is with Microsoft. But that distribution of responsibility doesn't make using and administering Windows any more pleasant.
Indeed, Adobe does make a mess out of a Mac, that is for sure. Fortunately, the majority of applications I use on the Mac just drop right into
-matthew