Data Loss Bug In OS X 10.5 Leopard
An anonymous reader writes "Leopard's Finder has a glaring bug in its directory-moving code, leading to horrendous data loss if a destination volume disappears while a move operation is in progress. This author first came across it when Samba crashed while he was moving a directory from his desktop over to a Samba mount on his FreeBSD server."
Re:Terrible bug (Score:5, Insightful)
TFA looked decent in its details, even giving a step-by-step recreation. But it's a pretty serious bug that, as you mention, *needs* to be fixed quickly, and I didn't see any other sources confirming it.
Re:Tiger has this problem as well!!! (Score:5, Insightful)
And I wouldn't exactly call this regression testing, since functions like file movement aren't usually impacted by later changes. It should be pretty basic on the design chart. Sounds to me more like "working as intended...use move at your own risk". Which I think is stupid, but I don't see how this really was *missed*, especially since some are claiming it's been this way since at least Tiger.
Every system has critical bugs (Score:2, Insightful)
I guess more bugs will be revealed as the number of users continues to rise, but they will also be fixed, so the system will become more and more stable over time.
The difference is how the media and people in general react to an error of this kind. Could you imagine how people would scream and cry if Vista happened to lose just a byte of information? Oh, Christ, I don't even want to THINK about it. Now Apple makes that mistake, and... that's OK, nobody is perfect... it will get fixed... People have double standards, but in the end, Vista was not THAT bad, and Apple's OS X was not THAT good. The truth is, as always, somewhere in the middle.
Re:Wow (Score:3, Insightful)
I was curious, so I tried this scenario with Nautilus (the file manager in GNOME). It prompted me: "A folder named 'A' already exists. Do you want to replace it?" which sounds very much like the Mac OS behavior you described. But it goes on to explain: "The folder already exists in 'B'. Replacing it will overwrite any files in the folder that conflict with the files being copied." This suggests that, contrary to the dialog heading, B/A will not be replaced; instead the two directories' files will be merged. Indeed, this is what it does.
I'd call this a bug. (The wording of the dialog, that is.)
I don't understand (Score:2, Insightful)
Target diskette failure doing a move from C: to A: on DOS: yep, your data is gone.
Network error moving from one Windows box to another: yep, your data is gone.
NFS write failure on Linux 2.4: check, your data is gone.
Maybe move should be implemented as copy completely, then delete, but it's often not. I don't think there is any convention that demands it be that way. If you care about the data at all, you should always copy, check (maybe cursory, maybe md5, depending on how much you care), then delete.
I tell my users all the time, "move it only if you can lose it."
I don't think this is really a "bug" so much as a behavior, i.e. there is no handling of media exceptions when doing a move. Now if your data sometimes went bye-bye with two working devices, that would be a bug, and that is not what is being described here.
I don't think it's fair to single Mac OS out for this either. As far as I know, most mainstream OSes seem to handle move operations with media exceptions badly. I am also not a Mac apologist. I don't have, nor have I ever had, any Apple products. Sure, maybe the OS should copy, check that at least no exception events happened during the copy, and then delete, but it's not a bug. I don't think you can blame the OS for problems when the hardware under it, be it a disk, NIC card, or memory, flakes out. If it handles it gracefully, then that is a virtue of the system; if it does not handle it, then there's room for improvement in terms of features, but it's not really a "bug".
Re:Par for the course? (Score:5, Insightful)
If you drag a folder called "Documents" into your home directory and click on "OK",
To be fair, I don't think it asks you whether it's ok to move that directory. It will warn you that it's going to replace that folder, and the buttons will either say, "Replace" or "Stop". It's not that ambiguous.
The only thing that makes it problematic is if you're accustomed to working in a file manager that will automatically merge directories, then you might think it's going to merge when it's actually going to replace. I would say that neither behavior is "wrong", but you certainly can get unhappy results if you're expecting one behavior and get another.
Honestly, it took me a little while to get used to it, but now that I expect it, it's fine. Usually, if I'm doing anything complicated with copying/moving lots of stuff recursively, I'm going to want to use a command line anyhow. In the command-line, "cp" and "mv" work in normal unix fashion.
Re:Tiger has this problem as well!!! (Score:0, Insightful)
Re:Terrible bug (Score:5, Insightful)
You're asking if a bug wherein entire folder hierarchies can go *poof* in the event a network share drops should be considered critical? Are you serious?
Re:I don't understand (Score:3, Insightful)
That said, I've never understood why move isn't implemented as copy, check, delete, only being destructive before completion if the move process figures out you don't have enough space and then prompts you.
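The "figure out you don't have enough space" part is cheap to sketch. This is a minimal Python illustration, not anything Finder is known to do; `tree_size` and `has_room_for` are hypothetical helper names, and the check is only advisory since free space can change while a copy runs.

```python
import os
import shutil


def tree_size(path):
    """Total size in bytes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def has_room_for(src_dir, dst_parent):
    """True if the volume holding dst_parent reports enough free space."""
    free = shutil.disk_usage(dst_parent).free
    return tree_size(src_dir) <= free
```

A mover could call `has_room_for` first and only fall back to a prompt (or a delete-as-you-go strategy) when it returns False.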
Apple Customer Quality Feedback (CQF) Program (Score:3, Insightful)
Many years ago, think System 7, Apple had this great Customer Quality Feedback (CQF) program. We tortured our systems between the alpha-testers and the great unwashed masses. There were Apple staff who (gasp) listened to our bug reports and got back to us reasonably quickly. It was grand.
Then someone got fired, or promoted, or whatever, and the CQF program got lost in the shuffle. Every few years I get an email from someone at Apple telling me that they're reconstituting it, but nothing ever comes of it, and - you know - it's hard to understand how they can ignore a free, highly-motivated bunch of fanboys.
Thread bug? (Score:3, Insightful)
Think about it: safe data movement has been around since filesystems existed. However, the new Finder is multi-threaded. It could be that the error handler is doing the wrong thing with the thrown exception...after all, what -do- you do with an exception in a subthread? What mechanism do you use to throw it upwards to the parent thread?
That's the joy of error handling, which is totally separate from (though completely integral to) your normal architecture issues.
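One common answer to "how do you get it to the parent thread", sketched in Python rather than whatever Finder is actually written in: hand each unit of work to a worker and collect its outcome through a future, which re-raises the worker's exception in the caller. `copy_one` and `copy_all` are made-up names, and the failure on "b.txt" is simulated.

```python
from concurrent.futures import ThreadPoolExecutor


def copy_one(name):
    """Stand-in for a per-file copy; one file fails to simulate a lost volume."""
    if name == "b.txt":
        raise OSError("destination volume disappeared")
    return name


def copy_all(names):
    """Run copies on worker threads and report (copied, errors) to the caller."""
    copied, errors = [], []
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [(name, pool.submit(copy_one, name)) for name in names]
        for name, future in futures:
            try:
                # result() re-raises here whatever the worker thread raised.
                copied.append(future.result())
            except OSError as exc:
                errors.append((name, str(exc)))
    return copied, errors
```

With this shape the parent knows exactly which files made it across, so it can refuse to delete any source whose copy is listed in `errors`.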
Re:Par for the course? (Score:4, Insightful)
Finder Has A 'Move' Now? (Score:3, Insightful)
The workaround is trivial: copy files until you're certain. In fact, I'd recommend doing that in all OSes anyway. Moves or cut-pastes are fraught with potential badness. I've lost files in Windows doing that, and always wondered what's wrong with just copying and deleting manually later on.
Re:Terrible bug (Score:1, Insightful)
Actually, without more information than the complainer posted, we know NOTHING.
Consider this, Samba crashed on his FreeBSD machine... and? when? how?
If Samba (on his FreeBSD machine) crashed after accepting the data and reporting back to the machine running Leopard that the transfer was completed but BEFORE it flushed it from cache to disk, how is Leopard at fault?
If Samba (on his FreeBSD machine) reported that the files were successfully written before it crashed (and actually finished writing the files), how is it Leopard's fault?
It is actually highly unlikely that it is Leopard's fault or many more data loss scenarios would pop up, since Finder is initiating the data transfer, AND THEN, the file delete. For this to actually happen, Finder (and the underlying code) would have to "forget" to get back a status from the FS driver indicating success on copy - and then start deleting the originals anyway. This "bug" would surface under any number of scenarios unrelated to this guy's FreeBSD setup.
Is his FreeBSD share in question on an NTFS partition with a flaky NT driver (or some other file system driver that's flaky)? Some of them will (incorrectly) report files written when they are not, which would trigger Leopard to (correctly) delete the files. I've experienced this exact same problem with some early JFS builds, which incorrectly report a file written and then never get to it before the system is restarted or crashes. I wouldn't blame that on the OS of the connecting machine... it did what it was supposed to.
The only possibility that could make it a MacOSX problem is if the Samba share was reporting back something (else) that MacOSX thought meant "all done" thus starting the deletion of the original. And even in that case, it could be because of a non-standard Samba implementation on the FreeBSD box that is sending back an erroneous code.
You have to remember, Finder doesn't actually CHECK each file... it checks the return codes from the FILE SYSTEM (whether local or network) and then handles its next steps based on that (i.e. success, disk full, write error, etc). This is the same procedure for virtually every operating system that runs on a PC (yes, there are certain file transfer methods that actually do a file verification stage, but that is NOT the default for 99% of standard, end-user file transfers).
Even in the case of using a transfer method that actually verifies the files, it can be a moot point. If the files are still in cache, or the file-system structures are cached, and those caches aren't flushed before the system or protocol/FS crashes, I'll lose data... but in the meantime, if the sending system requests "verification" of each file, the receiving system, reading what is cached, will report that the writes were successful.
I fail to see how he, or anyone else, can conclude this is a Leopard bug without more information. Yeah, it might be... but it more likely isn't.
Re:That's silly. (Score:5, Insightful)
Reeducate the user, you say. Surely you must be joking, right?
Let's ignore for a moment that Leopard may have a few bugs that will have to be ironed out. That's only to be expected with *_any_* newly released OS and the reason why no sane person would ever dare to update the OS on a mission critical machine within the first few months of the release.
However, if you can't rely on your OS to perform a simple file move without risking data corruption, then the right solution is definitely not to verify every single operation by hand. Automating tedious tasks is exactly what computers do best, and that the OS ensures the integrity of the copy before throwing away the original is definitely something you should expect.
Re:Terrible bug (Score:5, Insightful)
Only if a backup of the files was run, which is a requirement of Time Machine.
If this was Vista, then there would be a good chance to use 'previous versions' to recover the folder data, as it does not 'solely' rely on external backup for timeline file recovery. Vista uses volume-level file version snapshots (a feature of NTFS that HFS+ doesn't support), so there are backups even on the drive if it hasn't ever been backed up.
(Remember Time Machine usually only runs once an hour, and all versioning or changes made in that hour are never kept or tracked.)
-PS Not trying to troll, but this is a perfect example of the difference between Vista's previous versions and Apple's Time Machine I have tried to point out in the past.
Vista does both volume-level backups and external backups, unlike Time Machine, which does only external backups.
See why IT people prefer Vista's method? Even if a backup hasn't run that hour, users themselves can rather easily access files and folders that got deleted or changed. And since this has been around since Windows 2003 Server, in corporate environments even XP users accessing 2003 servers have had this feature for over 4 years now.
Vista's claim to fame is that it enables these features on the local hard drive, and also integrates with the Vista backup system, so file versions appear from the backups in addition to the 'snapshot versions' on the main hard drive.
Re:That's silly. (Score:1, Insightful)
Re:Terrible bug (Score:3, Insightful)
To me, that's a pretty clear bug. What stuns me is that Apple isn't using the underlying "mv" command - since it should certainly deal with this situation. They rolled their own defective version.
Never re-invent the wheel. You'll have a square wheel with 13 lug nuts of all different sizes. Just go to Goodyear and take the tried-and-true.
That's silly? You're silly. (Score:3, Insightful)
Re:A great disturbance in the Apple (Score:3, Insightful)
Correction: This'll be a problem for the Yuppie stoners. The rest of the stoner crowd moved to other OSs long ago.
This problem, as much as I like to jeer at elitists, is not Mac-only... We just read [slashdot.org] how Vista freaks around the 16,384 file number when copying. I could rant about "flaws" like this in each of the "big three", Linux, OSX, and Windows. But I'm not. Why?
They're going to happen.
Folks, ya have to remember... At present, and especially commercially, "quality assurance" means jack when you have a set-in-stone release date. Whether it's vaporware like WinFS, or trying to sneak in some code before a freeze, rushing due to financial/time concerns is what's screwing product quality far more than who's making it.
***IMPORTANT MESSAGE FOLLOWS*** (Score:1, Insightful)
Is that clear? If you don't have a backup procedure in place, stop right now, whatever you are doing and establish one.
If you have not followed this advice and experience data loss, it is your responsibility and your fault.
This may not be a bug in Leopard (Score:4, Insightful)
If this is true then the bug is in the copy of Samba running on the file server. We do not yet have enough information to know.
Re:Mod Parent Up! (Score:2, Insightful)
It's not logical at all. Why should moving one folder delete an existing one?
As for the warnings etc, how many users actually read those warning pop-ups? And if they do read them do you reckon they'd understand exactly what was meant?
Macs are usually pretty good for usability. I'm surprised they'd make such glaring errors, such as not protecting a user's work, not letting them undo, and having strange behaviour that is difficult to learn in the basic file-system interface, which in theory any user should be able to figure out how to operate on the first go.
See: http://www.asktog.com/basics/firstPrinciples.html [asktog.com] for more info on usability.
Re:Par for the course? (Score:1, Insightful)
I cannot believe you just claimed copying files in Vista is something good about Microsoft's latest operating system. Copying files in Vista is the biggest fucking pain I have ever experienced since using cassette tapes to store Atari files.
I think you, TheNetAvenger (624455), are a paid Microsoft shill.
Re:Or [possibly], go fix it. (Score:5, Insightful)
Re:I don't understand (Score:2, Insightful)
The nasty case here is that, as far as mv is concerned, the write is complete as soon as it's done writing to cache.
Your test of interrupting the mv in the middle doesn't hit this case because mv hasn't finished yet. If the cached write then fails, there's nothing it can do. Unless mv uses uncached writes or does a sync, you're going to have this sort of problem.
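To make the "does a sync" option concrete, here is a small Python sketch of a copy that asks the OS to flush before it reports success. `copy_durably` is a hypothetical name, and this is only an illustration of the idea, not what mv or Finder does.

```python
import os


def copy_durably(src, dst, chunk_size=1024 * 1024):
    """Copy src to dst and request a flush to storage before returning."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            chunk = fin.read(chunk_size)
            if not chunk:
                break
            fout.write(chunk)
        fout.flush()             # push Python's own buffer down to the OS
        os.fsync(fout.fileno())  # ask the OS to push its cache to the device
```

Even this has limits. On Mac OS X, fsync alone does not force the drive to empty its own write cache (fcntl with F_FULLFSYNC exists for that), and how far a flush reaches on a network mount depends on the protocol and the server.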
Re:Not just 10.5 (Score:3, Insightful)
I've heard of a phrase for this sort of thing -- "defective by design".
Re:This may not be a bug in Leopard (Score:4, Insightful)
The second you drop the file, it begins copying the file to the destination volume, but the original file disappears... poof, gone. If you stop the move operation, you're left with an incomplete file in the destination volume and with nothing in the original volume. This is to me a major bug.
This is why, for large files, I always copy.
Re:That's silly. (Score:3, Insightful)
On the contrary, that is exactly the thing to do. When you are working in mission critical environments and are charged with the safety of important data, it behooves you to do things the "slow way" sometimes for the simple reason that you have a safety net. In case of a disaster, it's much easier to restart a file copy than it is to pull data off of a backup tape because some of it got lost in the middle of a move operation.
Re:Ah, the "outsourcing" coding model.. (Score:3, Insightful)
Not exactly. The Leopard finder is doing something like: like in "remove $from regardless if the copy was successful or not".
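To spell out the difference between "remove regardless" and "remove only on success", here is a hedged Python sketch. It is an illustration of the pattern being described, not Finder's actual code; `careless_move` and `careful_move` are invented names, and the `copy` parameter exists only so a failing copy can be swapped in.

```python
import os
import shutil


def careless_move(src, dst, copy=shutil.copy2):
    """Delete the source whether or not the copy worked."""
    try:
        copy(src, dst)
    except OSError:
        pass  # error swallowed: nothing stops the delete below
    os.remove(src)


def careful_move(src, dst, copy=shutil.copy2):
    """Delete the source only if the copy returned without raising."""
    copy(src, dst)  # an OSError here propagates and the source survives
    os.remove(src)
```

If the destination vanishes mid-copy, `careful_move` raises and leaves the original in place, while `careless_move` leaves you with nothing on either side.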
Re:Par for the course? (Score:3, Insightful)
Of course, it's also perfectly possible to have identically-labelled pieces of paper in a folder, so you can't take the analogy too far...
I'm a Windows and Unix guy, so to me merging folders makes perfect sense. I know I'm biased, but I'd have thought that a new user would think "hang on, if folders contain files, how come I can't just put all the files from the new folder into the old one like this? Why does it replace them all?" I know you could do it manually, but then you have to manually recurse through all the subdirectories. (And I appreciate you can use the command line, but that just raises another question - how come the GUI operates on a completely different principle?)
Re:Par for the course? (Score:1, Insightful)
Doing a "replace" for that operation makes sense in a spatial system because all spatial icons are treated the same way. You wouldn't expect dragging a Word file named "happy.doc" into a folder already containing a "happy.doc" to perform a merge operation; so why would you expect that with a folder in the same situation?
Because a folder is different from a file. When working with folders, you are actually using that to group files together. The folder itself is not the entity you're concerned with, it's the files inside it. If you move a folder onto another with the same name, you are actually telling the system "move the files in this location to this other location". I think this is the more frequent user scenario, rather than a user wanting to REPLACE the contents of one directory with another. I think in these cases, the one that loses the least number of files should prevail.
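The two behaviours being argued about are easy to put side by side. A rough Python sketch, with `replace_folder` and `merge_folder` as made-up names; it models the semantics described in this thread (replace as in the Finder dialog, merge as Nautilus was reported to do), not either file manager's real code.

```python
import os
import shutil


def replace_folder(src, dst_parent):
    """Replace: an existing folder with the same name is removed first."""
    target = os.path.join(dst_parent, os.path.basename(os.path.normpath(src)))
    if os.path.isdir(target):
        shutil.rmtree(target)  # everything already in the target is lost
    shutil.move(src, target)


def merge_folder(src, dst_parent):
    """Merge: files are added; only files with clashing names are overwritten."""
    target = os.path.join(dst_parent, os.path.basename(os.path.normpath(src)))
    shutil.copytree(src, target, dirs_exist_ok=True)  # needs Python 3.8+
    shutil.rmtree(src)
```

Dragging a folder A holding new.txt onto a B/A holding old.txt leaves only new.txt under replace, and both files under merge, which is the "loses the fewest files" outcome.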
Re:That's silly. (Score:2, Insightful)
1. Copy the source file to the destination
2. Read the destination file back to ensure it's identical to the source file
3. Delete the source file
Rocket science it ain't (probably been patented though...). At what point in this process would a power failure lose your data? If your OS programs a move in any other way, run away.
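The three steps above fit in a few lines. A minimal Python sketch, assuming a single regular file and using SHA-256 for the comparison; `file_digest` and `verified_move` are names invented for the example.

```python
import hashlib
import os
import shutil


def file_digest(path, chunk_size=1024 * 1024):
    """SHA-256 of a file's contents, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verified_move(src, dst):
    """Copy, read the copy back, and delete the source only on a match."""
    shutil.copy2(src, dst)                    # step 1: copy
    if file_digest(src) != file_digest(dst):  # step 2: read back and compare
        raise OSError(f"verification failed, keeping {src}")
    os.remove(src)                            # step 3: delete the source
```

One caveat: the read-back in step 2 may be answered from a cache rather than from the disk, so a match is strong evidence but not proof that the bytes are safely stored.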
Re:Par for the course? (Score:3, Insightful)
STOP (Score:3, Insightful)
rsync is VERY powerful. VERY powerful. In order to glean benefit of that power, you have to be educated about how to use it. Could Apple have defaulted the app to use the "include apple-specific-stuff?" Sure. Should they have? Probably.
Regardless, rsync is HARD and complex for "joe sixpack" and it's not simpler for him to run
rsync -e ssh --recursive -l -t -g --compress * joe@destination:dest_dir
than it is to run
rsync -e ssh --recursive -l -t -g --compress * joe@destination:dest_dir --includeAppleSpecificStuffHere
It's the same to them - all gibberish.
Re:Or [possibly], go fix it. (Score:1, Insightful)