The Story Behind a Windows Security Patch Recall
bheer writes "Raymond Chen's blog has always been popular with Win32 developers and those interested in the odd bits of history that contribute to Windows' quirks. In a recent post, he talks about how an error he committed led to the recall of a Windows security patch."
Re: (Score:3, Funny)
Re: (Score:2)
And, might I add, the spikes in said pit should be +1, evil-outsider bane.
Re: (Score:1)
Oh wait I see it... Maybe we should cut them some slack.
Re: (Score:1, Offtopic)
What the... (Score:2, Insightful)
If this happened at Apple... (Score:5, Funny)
Re: (Score:1, Funny)
The Money Quote (Score:4, Interesting)
Seriously, it's good to get a glimpse of the interactions in the dev side of MS. It's astonishing that MS even allows this to happen at all. The March 07 Wired had a feature on Channel 9 [wired.com] that humanized the MS organization quite a bit, IMO. It's not just about chair-throwing, marketing hyperbole, and world domination after all... oh wait.
Re: (Score:3, Funny)
I choose moof [wikipedia.org]!
Re: (Score:1, Offtopic)
Fascinating (Score:5, Insightful)
Re: (Score:2, Insightful)
That's quite an appropriate analogy. If you RTFA, you would know that the loop in question is designed to be bounded by a guard variable/event, but they had already terminated the thread that sets the guard to the state that allows the loop to terminate.
The root cause of the hang is that most programmers are not really aware of the states involved at process termination, so they assume invalid things about the DLL process termination event -- namely that it's okay to wai
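A minimal sketch of the hang described above, in Python rather than Win32, so every name here (make_extension, detach, the two events) is hypothetical. It only illustrates the shape of the bug: cleanup code waits on a guard that a helper thread is supposed to set, and at process termination that helper is already gone. The wait is given a timeout so the demo finishes; the real code presumably waited with no bound.

```python
import threading

def make_extension():
    """Hypothetical stand-in for an extension that owns a helper thread."""
    stop = threading.Event()      # asks the helper to exit
    finished = threading.Event()  # guard the helper sets on its way out

    def helper():
        stop.wait()       # idle until told to stop
        finished.set()    # release anyone waiting on the guard

    thread = threading.Thread(target=helper, daemon=True)
    return stop, finished, thread

def detach(stop, finished, timeout):
    """Cleanup that waits on the guard; returns True only if the guard was set."""
    stop.set()
    return finished.wait(timeout)

# Normal unload: the helper is alive, so the guard gets set.
stop, finished, thread = make_extension()
thread.start()
clean_unload = detach(stop, finished, timeout=1.0)

# Process termination: the helper no longer runs (simulated by never
# starting it), so nothing will ever set the guard.
stop2, finished2, _ = make_extension()
hung_unload = detach(stop2, finished2, timeout=0.2)

print(clean_unload, hung_unload)
```

Without the timeout argument, the second detach call would block forever, which is the failure mode the parent post is describing.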
Re:Fascinating (Score:5, Interesting)
Revealing (Score:1, Interesting)
Backwards compatibility (Score:2)
It accumulates complexity over time.
The result is that even Microsoft can't get reasonably trivial things right.
Not to mention almost all Windows software code being highly complicated compared to equivalent code on other systems.
Re: (Score:3, Insightful)
How much open source work have you actually done? I've done a lot, and this idea is one I see very often in people who haven't done any serious API development work before. The approach of attempting to patch every app when an API changes simply doesn't scale. There's a reason all the important open source APIs (gtk, glibc, alsa, X etc) have "gone stable" in the past 5 years, and it's simply a better approach.
Re: (Score:3, Interesting)
But they only have to maintain source-level compatibility. Microsoft has to maintain binary-level compatibility.
Also, when specific things are extremely and seriously broken, compatibility can be dropped altogether, and some buggy programs broken. Microsoft cannot afford to break buggy programs, even if those are few and far between - nobody can fix them.
Re: (Score:2, Insightful)
This is one of the stupidest comments I've read here in a long time. A secondary "watchdog" thread was employed to enforce a time-out on the helper program's sniffing o
Re: (Score:2)
Re: (Score:1)
And how do you write a program that hosts plug-in executable code, to check whether it will hang its host, without the "unnecessary complication" of another thread of execution, either in the program itself, or in another proces
Re: (Score:3, Interesting)
The far simpler approach is to use asynchronous programming. Never use blocking API calls. All good APIs provide non-blocking interfaces.
If long computations are required, split them up into short computations and run th
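For what it's worth, the "split long computations into short ones" idea can be sketched with Python's asyncio. This is only an illustration under assumptions: sum_squares, heartbeat, and the chunk size are made up for the example, and it says nothing about how a Win32 message loop would do it.

```python
import asyncio

async def sum_squares(n, chunk=1000):
    """Long computation done in short slices, yielding between slices."""
    total = 0
    for start in range(0, n, chunk):
        for i in range(start, min(start + chunk, n)):
            total += i * i
        await asyncio.sleep(0)  # hand control back to the event loop
    return total

async def heartbeat(ticks):
    """Stands in for everything else the host has to keep servicing."""
    while True:
        ticks.append(1)
        await asyncio.sleep(0)

async def main():
    ticks = []
    beat = asyncio.create_task(heartbeat(ticks))
    result = await sum_squares(10_000)
    beat.cancel()
    return result, len(ticks)

result, tick_count = asyncio.run(main())
print(result, tick_count)
```

The heartbeat keeps ticking while the sum is computed because each slice ends with a yield. Note that this only works if the code cooperates; a plug-in that never yields would still stall the loop.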
Re: (Score:3, Interesting)
Re: (Score:1)
Re: (Score:2)
Re: (Score:3, Interesting)
An error he committed? (Score:5, Insightful)
Okay, he made an error. Why the HELL wasn't it caught in QA? Microsoft wants us to believe that the reason that we have to wait for patches is that they are getting some kind of exhaustive QA. This patch and executable were specifically created to avoid problems with invalid shell extensions. Don't you think that given that fact the thing to do would be to test it with some invalid shell extensions?
This is the reason that Windows admins have to be so much more paranoid about patches than the rest of us. A Windows patch is highly likely to be a big pile of crap that causes your system to not work properly. I think we can all remember certain service packs that broke various versions of Windows NT pretty much completely...
If you can't have confidence that security patches will fix more than they break, how can you have sufficient confidence to even install that vendor's products, let alone count on them for mission-critical applications?
Re: (Score:1, Redundant)
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
Yes, they were running Windows NT.
But seriously, Microsoft's claim to fame is backwards compatibility... significantly changing the way the system works without even making a minor revision update (let alone a major one) is very naughty.
XPSP2 broke a mad pile of software. Before that we had windows 2000 service pack... 2? I think that was the REAL
Re: (Score:2)
I was writing a component to track files in a system, and we
were not to use a DB for this. So, we stored them in the file
system. I wrote a stress tester for this component, which
caused it to write files like mad. Long story short, after
all the activity, the machine appeared to be OK. Next reboot,
however, it would die. Repeatable. Very repeatable.
Fixed in the next SP.
Course, recently, we just decommissioned a DB server, 2003
server, MSSQL 2000, if I ran a script against th
Re: (Score:2)
Must be the "map cap" motherboard.
Thanks!
Re: (Score:1)
No love for Apple on this one.
Tried to install 10.2.8, 10.3.9, 10.4.9, or virtually any Security Update?
C'mon, admit it: you held your breath, didn't you?
Re: (Score:2)
Re: (Score:2)
Absolute control of the hardware on which your software runs can come in handy, I guess.
Re: (Score:2)
No, I just watched them install. OS X isn't some bug-free panacea (I've had grey screens, etc.) but I've never had an update blow up my computer.
Re: (Score:2)
http://blogs.msdn.com/mattev/archive/2004/06/21/1
You should read this, which I wrote a few years ago, and which upset many mac zealots (as seen from the comments)
Re: (Score:3, Funny)
Re: (Score:2)
Also keep in mind that if Windows had the same symptoms then the box would just get a reinstall.
Re: (Score:3, Insightful)
Obnoxious, sure, but not really different than any other OS. (In fact I have had brand-spanking new Windows hardware that would lose video if I applied WindowsUpdate recommended driver updates.)
I'd have to be pretty stupid to think that
Re:An error he committed? (Score:5, Informative)
As he points out in his response to the second comment on his blog post, internal testing can't possibly cover every single third party shell extension on the planet. (Nor does he try to use that as an excuse for his screw-up.)
Re:An error he committed? (Score:4, Insightful)
IT was an error hat happened all the time, under its most basic use.
While the global OS QA might be excused for some wierd bug that happens under unforseen circumstance, this wasn't even tested to see if it fixed what it wqas supposed to.
Sounds like sloppy(i.e. none) QA to me.
Re: (Score:2)
Much like the QA that went into the spelling of my post.
Re: (Score:2)
Re: (Score:2)
No, you didn't understand the explanation. This sort of understanding failure is exactly what caused the need for the verclsid program in the first place!
The problem was that this specific shell extension (for an obsolete HP printer) contained a concurrency bug - it tried to synchronize with a thread in its DLL detach function. This is never correct, because in a DLL detach you can't make any assumptions about the liveness of other threads. Because there are buggy shell extensions out there that hang, the
Re: (Score:2)
Especially if they aren't actually trying to break it!
If they were trying to break it, then they almost certainly would have discovered this flaw.
Most likely they just have a small handful of shell extensions that they would install and test with.
What this says to me is that there was no intelligence behind the test plan.
Sure, the guy made a mistake. But it is the purpose of testing to make sure that the softwa
Re: (Score:2)
I see even after people responded to you you STILL didn't RTFA. The particular shell extension was for a printer that was so old it wasn't produced at the time the patch was made. How many pieces of hardware does Windows support? Do you want them to test EVERY one of them with every single bug fix? You're batshit insane; even the entire OSS community combined couldn't pull that off.
Re: (Score:2)
The way I envision Microsoft QA is a huge warehouse full of every hardware device they could get, with computers having every version of OS that they ever shipped and a switching system to let any of that hardware be tested with any of the computers. Total cost of that warehouse would be in the million$, which means about 0.1% of total Microsoft market capitalization.
Re: (Score:2)
A million monkeys on a million computers in a big warehouse, eh? Heh, now there's a mental image. I wonder how long it would take them to do
Re:An error he committed? (Score:5, Insightful)
Seriously, though, just putting all that equipment in one building would create a zeppelin-hangar-sized building. Finding any specific router or PCI modem would be near impossible. The logistical difficulties of your plan I think would be insurmountable, not even considering the manpower question.
The real point Raymond mentions is that if MS does tons of testing on all the hardware they have available, they get bad press for being slow to release patches. If not, they get bad press for having to recall buggy patches. It's a lose/lose situation for them.
Re:An error he committed? (Score:4, Funny)
Thank you for making one of the most obvious (and thus pointless) statements of the century (did you know that things fall to the ground when you drop them? I'm completely serious). Yes, you are absolutely correct. In any relatively deterministic system, doing something bad in a predictable way will cause the same failure, predictably. Obviously, as this is deterministic, who is doing said bad thing in said predictable way is irrelevant; thus, multiple things may do the same bad thing with the same bad outcome. The blindingly obvious question this raises is exactly how many things do this. Whether 1 or 2 (or even 10) pieces of hardware do this makes little difference if there are 5,000,000 pieces of hardware to test, and you only have the manpower to test 5,000 of them. Would you call testing a patch with merely 5,000 pieces of hardware horribly negligent? If so, I suggest you go work for them, and demonstrate that it's possible to test all 5,000,000 pieces in one month (several times, actually, as there are several patches to check).
There is even a comment which raises a more detailed question about the explanation, which has not yet been answered.
That poster is correct in his last paragraph (and the preceding paragraph, which indicated the problem): it was overlooked because, if it was going to break in this patch, it would have been breaking before this patch, as well; only the timing would have changed. Do you check every morning when you get up to make sure the sky is still blue and the grass is still green (I can smell the jokes coming already)? There are a million ways to do things that MSDN tells you specifically to never ever ever do; do you expect MS to check third-party code for every single one of them?
On one last personal note: Don't try to out-asshole me. You will fail. I'm not exactly proud of that, but you need to pull your head out of your ass before you come after me.
I am hurt that you give me so little credit. I would never attempt to challenge you at something I am so totally and obviously outclassed in. I would be much more concerned if you put me on your friends list.
Re: (Score:1)
As for the sneaky, "was included with a printer driver for a printer that is no longer manufactured", typical microspeak, it doesn't say that it also wasn't included with a whole range of other drivers, just that it was specifically in that one (sounds better in marketing terms,
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
But it shouldn't have such a fragile design in the first place.
Of course, a lot of things about MSFT operating systems should be different, but aren't.
Re: (Score:2)
1) There's a lot of pressure to get a security patch out as soon as possible;
2) It's impossible to test every single case (the breaking case was a shell extension for a printer that wasn't even being manufactured anymore)
As Raymond likes to say, "you can't have everything."
Re: (Score:1, Flamebait)
Now look kid, I read the fucking article. Don't make stupid assumptions. It only makes YOU look like an ass. It doesn't do shit to me except piss me off and suggest to me that I'm dealing with an idiot.
Microsoft still typically makes us wait for them, days to weeks after they are reputed to be completed. One of two things is true in these situations. Either the
Re: (Score:2)
Ya know, you are right, you are an arrogant ass, or whatever you admitted to be - and pretty good at it, as you also admitted... but right as rain as well!!! :-)
Gotta add you to my friends list - for both reasons!!! ;-)
Oh - but here's something to add to your response... if MS knew which printer driver for which outdated printer (as they seem to be indicating they do), then why not use that driver for the test? Of course, I too agree with you and think especially with the number of responses about this
Re: (Score:2)
How to tell if somebody has only read the summary: they ask a question that was explicitly answered in the
Re: (Score:1)
Re: (Score:2)
Raymond did not answer the question, he made a statement that may or may not be unrelated.
"Is it raining out?"
"Don't you see my umbrella?"
And what does that mean? Nothing other than I brought my umbrella with me to work for reasons I haven't stated... nor have I stated it was raining.
Raymond's marketing doublespeak doesn't say much of anything other than he made a mistake (with some explanation of the mistake) and that he is upset that people complain that a patch takes too long and that people complain
Re:An error he committed? (Score:5, Insightful)
Just so we're clear:
Microsoft is not selling you products that have gone through exhaustive QA, nor are we issuing patches that have gone through exhaustive QA.
The key word here is "exhaustive".
You can imagine that as much as it costs a business when they get a hotfix from us that breaks them, it costs us _at least_ that much in real employee hours (dollars), not to mention the direct and indirect, monetary and non-monetary costs of having to admit that we screwed up a patch.
Software testing cannot tell you how good your product is, only in what ways it doesn't appear to be bad. Every release decision is a _decision_, and it's based on necessarily incomplete data put together by imperfect humans with non-infinite time.
A release decision is a culmination of many nested risk/reward tradeoffs. Sometimes, that decision gets made incorrectly, or at least gets made in a way with known or even unknown downsides.
You'll notice that the patch was an interaction problem with an antique 3rd party product. From my time doing admin work on Solaris, IRIX, and Linux machines, I can tell you the big difference between this situation and "those" situations. I never _ran_ 3rd party software on Solaris, IRIX, or Linux (well, I ran 3rd party software on Linux all the time, but I just expected it to break anytime I patched anything... it was a mandatory recompile of any dependent libraries and applications).
I also think your glasses are a little rosy. There were some IRIX patches back in the day that you couldn't back out. Or that wrecked your XFS volumes. I think in every operating system there has been at least one instance of a patch / upgrade / new version that some user opted to back out, because it hurt them and their scenarios more than it helped.
I run very little non-Microsoft software on my Windows machines and thus I rarely worry about patches from MS. If you're doing something weird, you need to be more risk averse. IIRC, Microsoft's official recommendation for businesses with critical systems is to install patches in a pre-production environment to ensure compatibility with the specific intricacies of your business. You can choose to play fast and loose, but you should be aware that you're making a risk/reward tradeoff decision, based on incomplete data.
Just like we have to do.
Re: (Score:2)
Why not?
http://finance.yahoo.com/q/bs?s=MSFT&annual [yahoo.com]
Last year, your employer earned US$12,600,000,000 profit and has US$34,000,000,000 in cash. Certainly they could pony up for a comprehensive test suite.
But... "you" don't have to. Why should MSFT create a decent product when sheeple, people who are managed by short-sighted idiots, and people trapped by vendor lock
Re: (Score:2)
So your test plan can't be "exhaustive" (he was using the definition: "treating all parts or aspects without omission"). Instead yo
Re: (Score:1)
Re: (Score:2)
Once upon a time, you could just shoot bits at the parallel port and printing would work. The HP 7150 "driver software" (which is 200 MB, by the way) doesn't ever detect USB device insertion on Vista. Why HP feels the need to have some convoluted, asinine way of doing this is beyond me, but they do.
So no, there is no HP print driver stuff on my main vista machine at home. But it's not for want of
Re: (Score:1)
Re: (Score:2)
Re: (Score:2)
Could you please clarify what you mean here? You're apparently implying that Linux even has a concept of 'third party software [wikipedia.org]'. All software is third party in Linux, because ther
Re: (Score:1)
No, he's implying that the more software you put on the machine, the more likely it is to find software on the machine that has shit code which happens to work more as a result of luck than deliberate effort.
Re: (Score:2, Interesting)
> wants us to believe that the reason that we have to wait for patches
> is that they are getting some kind of exhaustive QA.
M$ doesn't have "QA". It has QC.
Quality Assurance is the fence at the top of the cliff that at each stage prevents faults from arising, and thus from impacting on later stages of development.
Quality Control is the ambulance at the bottom of the cliff that responds to the emergency call once the fault has b
Lesson (Score:5, Insightful)
They should have simply got rid of the magic naming system in favor of something explicit, such as a Shell Extension Interface that a shell extension must fully implement.
Re: (Score:2)
It sounds like they tried to do that, but he said: "As we saw earlier, lots of people mess up IUnknown::QueryInterface". I'm not familiar with Windows or COM, but I take that to mean that the way they find out what in
Re: (Score:1)
So you mean they should have rewritten a core OS API after it had already been gold mastered? Yes, it was a stupid design to begin with, but that doesn't mean that it's possible to rewrite something like that. This is exactly the reason that Vista is having problems with compatibility: core APIs were rewritten.
Re: (Score:2, Informative)
"They should have simply got rid of the magic naming system in favor of something explicit, such as a Shell Extension Interface that a shell extension must fully implement."
Seems to me like they had; how would you implement plugins otherwise? The problem is that if explorer loads these plugins (which do adhere to an interface) and they do something stupid, explorer will hang, since it is the host process. This is bad since explorer.exe on Windows is responsible for running the shell.
Therefore they chose to make a separate process (that vert something exe) try and load the plugin and run some tests. Questionable heuristics I agree, but given those circumstances, I can't co
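A rough sketch of the "probe the plugin in a separate process" idea, written in Python with the standard subprocess module. Everything here is an assumption for illustration: probe and the plugin_source strings are made up, and the timeout is enforced by the host process, which is not necessarily where the real helper put it.

```python
import subprocess
import sys

def probe(plugin_source, timeout=10.0):
    """Run untrusted plugin code in a child process.

    Returns 'ok', 'failed', or 'hung'. The host never executes the
    plugin code itself, so a hang or crash stays in the child.
    """
    try:
        result = subprocess.run(
            [sys.executable, "-c", plugin_source],
            timeout=timeout,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        return "hung"  # subprocess.run kills the child before raising
    return "ok" if result.returncode == 0 else "failed"

good = probe("pass")
crashing = probe("raise SystemExit(3)")
hanging = probe("import time; time.sleep(60)", timeout=0.5)
print(good, crashing, hanging)
```

The catch, as the story shows, is that the host must be prepared for the probe itself to misbehave; a host that waits on the child with no time limit just moves the hang somewhere else.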
Re: (Score:2)
Do what most Unix programs do - fork() first. So if the child process crashes the parent process happily carries on regardless. AFAIK Windows processes couldn't do fork(), perhaps this is not l
Re: (Score:3, Informative)
There is no "magic naming system". Each plugin implements the shell extension interface and registers its CLSID; when explorer needs to load the plugin for a particular CLSID, it looks it up in the registry, finds the corresponding dll, loads it, and accesses the shell extension's COM interface.
And to think that your post was modded "Insightful" rather than "Arrogantly Ignorant".
Re: (Score:3, Informative)
Re: (Score:2)
Erm, how does your proposal fix this? COM has a simple negotiation mechanism where Explorer can say "Do you support FOO?" and the extension can say "Yes, here you go" or "No, I don't". If you get rid of the COM indirection in favor of a simpler plugin system, you'd have to drop this negotiation mechanism or re-implement it, and that is what would be grossly unsafe.
COM is not the problem here. The problem is buggy code. The only policy design that is worth debating here is whether it makes sense to allow 3
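The negotiation described above ("Do you support FOO?" / "Yes, here you go" / "No, I don't") can be mimicked in a few lines of Python. This is a toy model only: the class, the method names, and the use of plain strings are assumptions for the example. Real COM identifies interfaces by GUID and reports an unsupported interface with the E_NOINTERFACE result code rather than returning None.

```python
class ContextMenuExtension:
    """Hypothetical plugin that supports exactly one interface."""
    SUPPORTED = {"IContextMenu"}

    def query_interface(self, name):
        # "Yes, here you go" (an object) or "No, I don't" (None).
        return self if name in self.SUPPORTED else None

    def invoke(self):
        return "menu shown"

def host_uses(extension, name):
    """Host side: ask first, and only call into what the plugin offered."""
    iface = extension.query_interface(name)
    if iface is None:
        return "unsupported"
    return iface.invoke()

ext = ContextMenuExtension()
supported = host_uses(ext, "IContextMenu")
unsupported = host_uses(ext, "IFoo")
print(supported, unsupported)
```

The scheme is only as safe as each plugin's answer: a plugin that says "yes" to an interface it does not really implement breaks the host no matter how clean the negotiation is.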
"Magic filename" (Score:2)
Microsoft, in its stupidity and/or attempt to complicate its system so that potential compatibility is more difficult, duplicated the functionality of the "file system" with the registry. That makes registry keys such as CLSIDs filenames too. The grandparent was right in calling them magic file names.
Honesty (Score:5, Insightful)
This illustrates the kind of employee I like to have. One who can talk about his mistakes the same way he talks about anything else work-related.
Some years ago I myself made a rather expensive mistake which involved the design of an aircraft structure. The fellow I was working for at the time had one of those razor-blade intellects and I got called into his office for a chat. When he asked me what happened I had two choices, weasel or turkey. In engineering it's always possible to talk the complicated talk and hope to obfuscate your way out of a situation, but fortunately I said "I made a mistake." And you know what? That was exactly the answer he was looking for.
You see, the most important thing is not to be perfect, it's to be honest. That's what a boss, of which I am one now, wants.
If you have a boss that doesn't want that, better watch out for yourself.
education (Score:2, Insightful)
Re: (Score:3, Insightful)
Honesty, but without emotional baggage.
A stuffup is a stuffup, learn and move on.
Reading
Re: (Score:2)
This one bit a client of mine... (Score:5, Informative)
I verified that "Save" and "Save As..." were not working in Word. Word would just hang and only Task Mangler could shut it down. I carry the Sysinternals utilities on CD and USB key, so I rebooted and ran FILEMON, REGMON, and PROCEXP to see what was happening when I tried to save a doc in Word. Sure enough, Word would spawn verclsid.exe as a child process and then hang.
I googled "verclsid" and "Explorer", got nothing on the web and about a dozen Usenet posts from people having the same problem. I played a hunch and renamed verclsid.exe to verclsid.exX. I do that when I'm manually hunting malware that leaves
Problem solved. When the patch for the patch came out, a working verclsid.exe was dropped in %system% and I deleted the
Oh, and the buggy third party shell extension came with a very common HP DeskJet printer. As for Google, the next day I googled "verclsid": there were hundreds of web results and Usenet hits. The day after, tens of thousands. This one bit a lot of people in the ass.
k.
Re: (Score:2)
Re: (Score:2)
The Microsoft KB article [microsoft.com] that came out later that week mentioned that systems
Re: (Score:1)
http://www.nirsoft.net/utils/shexview.html [nirsoft.net]
Re: (Score:1)
Re: (Score:2)
On the other hand, installing a business-class HP LaserJet printer is a breeze. Just the drivers, no crapware, no hidden updaters, no Imaging Center, no Share-to-Web bullshit.
k.
A bit more background info (Score:5, Informative)
Courtesy of JSI FAQ:
You experience one or more of the following strange behaviors:
- You are unable to open special folders, like My Documents or My Pictures.
- Some 3rd party applications hang when accessing My Documents.
- Office files won't open in Microsoft Office if they are stored in My Documents.
- Entering an address into Internet Explorer's address bar does nothing.
- The Send To context menu has no effect.
- The plus (+) sign on a folder in Windows Explorer does nothing.
- Opening a file via an application's File / Open menu causes the application to hang.
This behavior is caused by a new VERCLSID.EXE binary, which validates shell extensions before Explorer.exe, the Windows Shell, can use them. VERCLSID.EXE is installed by the MS06-015 (908531) security update.
The following 3rd party applications cause VERCLSID.EXE to hang:
Hewlett-Packard's Share-to-Web Namespace Daemon ("%ProgramFiles%\hewlett-packard\hp share-to-web\Hpgs2wnd.exe"), auto-started from the Registry Run key and the Startup menu, which ships with:
HP PhotoSmart software
Any HP DeskJet printer that includes a card reader
HP Scanners
Some HP CD-DVD RWs
HP Cameras
Sunbelt Kerio Personal Firewall which has a feature that prompts when Explorer launches VERCLSID.EXE, but you can configure it not to prompt.
To work around this behavior, add the HP shell extension to the VERCLSID.EXE white list:
1. Open a CMD.EXE window.
2. Type the following command and press Enter:
REG ADD "HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\S
3. Shut down and restart your computer.
NOTE: If you find other COM controls or shell extensions that cause this behavior, you can add them to the white list.
Re: (Score:2)
Re:A bit more background info (Score:5, Funny)
It was an HP printer driver... (Score:2)
That's just insane.
Now multiply that by all the different revisions and patches of the HP drivers, and consider testing each Windows/Application patch against it (on every language, for every version).
You could deforest the planet with test pages before you hit every code path.
Obligatory Mac OS X Tangent (Score:2)
One might assume that the Finder extension framework is sprinkled with all sorts of Cocoa goodness, where objects are magically discovered, loaded, and consumed by Finder through some thoughtfully conceived Objective-C interfaces/protocols.
Nope. It's COM, complete with IUnknowns and HRESULTs, UUIDs and E_FAILs. (The headers are provided by Microsoft). Finder is, after all, just a plain old C++ appli