WinFS Beta 1 Released Early
Mouldy Punk writes "Infoworld is reporting that WinFS Beta 1 has been released. The new relational file system for Windows is posted on MSDN Subscriber Downloads. This release is designed to offer developers a preview of WinFS capabilities. WinFS will be in beta when Windows Vista ships and will RTM afterwards. WinFS, when it ships, will be available for download for Windows Vista, and possible support for Windows XP is being considered. The distribution mechanism for WinFS will be through an add-on download much like the .NET framework is today. Tom Rizzo also notes that there is a new blog dedicated to WinFS."
WinFS synchronization engine (Score:2, Interesting)
I'll bet it is based on the Unix 'file' command.
What exactly is it? (Score:3, Interesting)
But from what I've heard, WinFS sits atop NTFS and simply connects it to a SQL database for indexing. How the hell is this revolutionary? You could place all your files in a "My Documents" folder and then make a nice pretty front end to it, categorizing each file, and then hacking the file chooser to use your interface.
I really think Microsoft should have thought harder about this and made it a real filesystem with a new structure and layout on disk. It could have really been different and revolutionary, but from what I can tell, it's just a layer now and offers nothing really new or innovative.
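For anyone who wants to see what "a SQL index layered over ordinary files" might look like, here is a minimal sketch in Python with SQLite. It is not how WinFS works internally; the table names, column names, and file paths are all made up for illustration, and the files themselves would stay wherever the underlying filesystem keeps them.

```python
import sqlite3

def build_index(conn):
    # Hypothetical schema: the index lives beside the real files and
    # only records their paths plus user-assigned categories.
    conn.executescript("""
        CREATE TABLE files (id INTEGER PRIMARY KEY,
                            path TEXT UNIQUE NOT NULL);
        CREATE TABLE tags  (file_id INTEGER NOT NULL REFERENCES files(id),
                            tag TEXT NOT NULL,
                            PRIMARY KEY (file_id, tag));
    """)

def add_file(conn, path, tags):
    # Register one file and attach its categories.
    cur = conn.execute("INSERT INTO files (path) VALUES (?)", (path,))
    conn.executemany("INSERT INTO tags (file_id, tag) VALUES (?, ?)",
                     [(cur.lastrowid, t) for t in tags])

def find_by_tag(conn, tag):
    # The "pretty front end" boils down to a join like this one.
    rows = conn.execute(
        "SELECT f.path FROM files f JOIN tags t ON t.file_id = f.id "
        "WHERE t.tag = ? ORDER BY f.path", (tag,))
    return [r[0] for r in rows]

conn = sqlite3.connect(":memory:")
build_index(conn)
add_file(conn, "My Documents/budget.xls", ["finance", "2005"])
add_file(conn, "My Documents/taxes.doc", ["finance"])
add_file(conn, "My Documents/holiday.jpg", ["photos"])
print(find_by_tag(conn, "finance"))
```

The point of the sketch is only that the categorizing layer and the on-disk layout are separable, which is the poster's complaint.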
Give it a rest, OK? (Score:1, Interesting)
You are (deliberately?) misunderstanding what WinFS is designed to accomplish. But like everyone else you seem to have made up your mind. Whereas you avoid mention of the numerous limitations that traditional filesystems like ext2 and even journalled filesystems have.
Re:Umm (Score:3, Interesting)
If WinFS could do for WinAMP what BeFS allowed SoundPlay+BIYS to do, I'd be a happy camper. However, I haven't tried XP Media Center, so maybe they did better than BIYS. Who knows?
Performance? (Score:1, Interesting)
Every time I've compared filesystems even EXT2 and EXT3 spank NTFS. More modern filesystems like Reiser and XFS do even better.
My comparison is usually building a large application, so it involves a lot of small-file I/O. And I mean serious perf problems, like 30% to 40% differences in build time.
SQL for the file system doesn't sound stable to me (Score:1, Interesting)
In response to the idea that WinFS is going to get its indexing power from a custom SQL engine, I have to say that SQL Server on our XP boxes isn't reliable enough to use as an integral part of the file system. IT JUST IS NOT! Consider how many implementations of home and small business users won't have the benefit of IT support staff. Sure there are implementations of SQL on XP that are stable and blah blah blah, but we deal with SQL crashing in dev or even production environments regularly. Sometimes it is just restarting SQL that does the trick, sometimes it halts the whole server.
Point is I don't want something as critical as my OS file system relying on SQL to tell me if my files should be backed up or not...one bad worm and bad news for everyone!
Re:Is this really a file system? (Score:3, Interesting)
Re:Ever been to Cairo? (Score:2, Interesting)
http://www.windowsitpro.com/Article/ArticleID/48/
And an excerpt...
If WinFS gets out of this beta stage then I will be amazed.
Re:diff -u WinXP Vista (Score:3, Interesting)
Probably the most interesting to the Linux community is that the services for Unix (SFU) POSIX-compliancy layer is going to be running at the same level as the Win32 execution code. They aren't going to be nested, they're going to be parallel. Theoretically, it might even be possible to replace USER, GDI, and EXPLORER with your favourite X server and DE/WM. Theoretically. I won't be able to tell for sure until I get my hands on a copy, and I cancelled my subscription to MSDN years ago.
Maybe somebody else who actually has a copy can expand on it....
Sounds like an AS/400 to me (Score:5, Interesting)
Your description sounds an awful lot like what the AS400 team used to describe when I worked at companies that had good AS400 techies. It hybridized the mainframe-style contiguous file allocations with an integrated RDBMS that tracked the file information, much as the file information pages do with other file systems.
I find it interesting that so many "advances" other systems are making nowadays sound exactly like what the AS400 developers used to talk about. Using databases to store configuration information. Making the database an integral part of the OS. Virtualizing all storage so the system could shuffle files based on size changes and usage patterns to minimize head thrashing. Using wizards/forms for adding new software, changing configurations, etc.
I guess it's all considered "new" because so few people ever actually learned anything about the AS400 internals -- they just used them and counted on the system to do its job properly.
Re:Sounds like an AS/400 to me (Score:3, Interesting)
Words can't even describe
So then what is Delete (Score:5, Interesting)
What then is delete? How does a user distinguish between "remove an association from the blob of data" vs "remove this blob of data altogether". Should the blob automatically delete when you remove all metadata around it? If not, how will you find it again? If so, would you really want data vanishing just because you removed a keyword?
What does partial backup look like on a system? How can you have a combination of partial backups and know you have a whole? I can do that with a set of five directories. Let's say you tag a set of files with "project fred". But one small file, that you almost never care about, gets tagged with "project ferd". What good is the ol' Fred backup now?
At some core level these blobs of data that users place on a system need ONE meaningful location where they always "are". You need someplace where the file will always be, no matter what other associations you remove. You need somewhere you know it will be to assure yourself EVERYTHING you care about is backed up or moved between systems.
The perfection you seek can just as easily be obtained with files in directories that allow metadata on top of them and things like smart folders that are essentially queries over the user-defined and automatically extracted metadata. In fact I think that's what WinFS does anyway (just like OS X does today).
If you really like the system you describe nothing is stopping you from storing all your files in a DB and writing an explorer on top of that. Yet all this time, things like that have never taken off in the market.
Some things do not take off because the technology to make them useful has not yet arrived. But some things simply never take off because in practice they are not practical, and the filesystem as a full-fledged database with no default structure is one of those things.
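The "files in directories, metadata on top, smart folders as queries" arrangement described in this comment can be sketched roughly as follows. This is an illustration only: `FileEntry` and `smart_folder` are hypothetical names, and the paths and tags reuse the Fred/ferd example from above.

```python
from dataclasses import dataclass, field

@dataclass
class FileEntry:
    path: str                              # the one canonical location
    tags: set = field(default_factory=set)  # metadata layered on top

def smart_folder(entries, required_tag):
    # A smart folder is a saved query: it lists paths but owns nothing.
    return sorted(e.path for e in entries if required_tag in e.tags)

entries = [
    FileEntry("~/Projects/fred/spec.txt", {"project fred"}),
    FileEntry("~/Projects/fred/notes.txt", {"project ferd"}),  # mistyped tag
]

# The query misses the mistyped file.
print(smart_folder(entries, "project fred"))

# Removing the last keyword empties the query result...
entries[0].tags.discard("project fred")
print(smart_folder(entries, "project fred"))

# ...but both files are still found by their location.
in_dir = sorted(e.path for e in entries
                if e.path.startswith("~/Projects/fred/"))
print(in_dir)
```

Under this model "delete" only ever means removing the entry at its path; dropping a keyword changes which queries match and nothing else.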
Think like a programmer not like a user (Score:3, Interesting)
Is Linux Trailing? (Score:5, Interesting)
Reiser4 is technologically ahead of WinFS as a high performance storage layer, see www.namesys.com [namesys.com] for details on its design. When you do this layering the way they did it, with the metadata stored in a layer above the FS rather than integrated into it, you lose a lot of performance while gaining the advantage of successfully avoiding dealing with a host of technical issues. We are at least 5 years ahead of them technically in the storage layer.
That said, semantic enhancements matter more than performance, and it is better to do something semantically than to do nothing, and what Linux currently is doing is nothing.
The political support for adding semantic enhancements to Linux namespaces is mixed at best. I worry we will see that death by committee rules, and there will be no belief that each FS should try to innovate in its own way and compete with the others until one is proven the right solution. We are in serious danger of having MS implement bad technology, and Linux having to devote large amounts of resources to copying it in 5 years because we were late and chose to trail rather than lead. If the filesystems were free to compete in semantics, we could have one or several of the Linux filesystems leading them instead.
SQL and the relational model is fundamentally the wrong model for semi-structured data. See www.namesys.com/whitepaper.html [namesys.com] for why.
Technically, I would worry much more about Apple. Dominic Giampaolo is very bright, and well funded. His chances of delivering on a good set of semantics are high because he and Jobs are very sharp, and neither of them is afraid to go where no one has gone before. Our chances of losing technically to Giampaolo and Jobs are high, because we are frankly not well funded, and a lot of us are complacent with semantics that are still pretty much the same as their father's Unix box.
So, in summary, I would say that we are still ahead but losing speed fast.
Thanks for your kind words Hisham.
Re:Is Linux Trailing? (Score:5, Interesting)
I've been watching the fun you've had on lkml and wanted to say don't give up! The work you and your team are doing is wonderful.
If anything, I think you should stop focussing on getting Reiser4 into the kernel and instead start demonstrating the applications of your ideas on semantics. In other words - put what you've built to work outside the kernel and prove to people that they cannot live without a next-generation filing system. It may even mean doing things you have never done before, like creating a new distro derivative.
I know how emotionally draining free software politics can be; we get a lot of that in my own autopackage project. If it gets too much, rather than risk burnout, go off and do your own thing for a while. If you really do have a better way people will join your banner ;)
Re:Is this really a file system? (Score:5, Interesting)
Yes, I have seen the "My Documents" folder of my mother's account. And as you say she has like 500 documents, including MS Explorer saved files AND their corresponding folders to hold images and misc binary files.
Yes I know that for me it is really stupid, as I tend to put everything in its own subfolder. For example let me tell you how I order my music:
blah blah, you get the idea.
And, although I have heard the marvelous things that programs such as iTunes, Win.Media Player, Winamp Media Library or even MusicMatch jukebox do to order music libraries, I still can't get one that I find really useful.
Maybe for a lot of us that is THE way to do it, but see, my mother, like a lot of computer users, is just a Biology teacher. She knows the minimum required to do what she NEEDS to do on her computer (Word, Excel, PowerPoint). You just need to understand that people do not have the model in their heads, I mean, the model of the file system, that you/we automatically recall when we open Windows Explorer/Konqueror/etc...
That attitude (of the most people you are talking about) to me is just like, for instance: ``I don't want to learn about strings and notes, I just want to play the guitar!''
Now, as an example, think about WinFS like Gmail. I really found the Gmail approach useful, more if I have thousands of mail. If you see, desktop search bars have gained a lot of acceptance these days.
That is because we no longer know what each file in our computer does, and we do not have to care. We need to get exactly the file that we need when we need it, and you can do that searching.
Now before ranting about the facts I gave, just take my last paragraph and replace the word file with mail, and instead of a Microsoft technology you will have a Google technology. Is it bad? No, I really don't care where all my files go. If I need to have some files classified then a Tag would be great; otherwise I just want the OS to identify it when I ask for it.
Re:Where's the Answer? (Score:5, Interesting)
I agree, however, that it would seem people have been caught with their pants down in regards to WinFS. The usual sentiment about it among Linux peeps from what I've seen is that it either isn't doable, or that it is, but that it'd be horribly slow.
Methinks a change in attitude is called for, however. This could very well be Bill's answer to the One Ring if he gets it out, which is presumably why Microsoft are trying to get a working release ASAP. Forget the coder bias for a minute here, and think about what the implications of this could be from the perspective of ease-of-use...and then think about what a battle we'd have converting people to Linux if we still don't have it when Microsoft does.
Longhorn was intended to be a Linux killer...but of all the elements I've seen, WinFS is the only one which could truly cause us problems...Especially when you consider how difficult back-engineering compatibility with such an FS would probably be.
As I said, I'm aware WinFS hasn't been taken seriously around here so far...but somebody needs to start to.
Re:Is this really a file system? (Score:2, Interesting)
As an aside, the Windows Search function has to be the worst thing ever written - even after you remove the mutt. Many times I've used it to search for filenames (not even text within files), only to be told that there are no results. Yet I know the file is there. Sure enough, after painful manual searching that the Search function is supposed to do for me, I find the file, and every time the filename matched the spec I chose. Let's hope WinFS actually allows people to find their files - at the moment, this doesn't happen 100% of the time.
Re:Is Linux Trailing? (Score:5, Interesting)
Re:Is this really a file system? (Score:2, Interesting)
> useful, more if I have thousands of mail.
I disagree. I have a Gmail account, which I use for just a few things; it probably has a few hundred messages in it at this point, which is to say, practically nothing.
I also have a *real* mail account, and I get the mail from that in Gnus, and store it using the nnml backend. I have at this point about 2GB of mail stored that way on my system.
I have greater difficulty using and finding things in the gmail account.
Granted, it took longer to *learn* to use Gnus, but once I got past that initial learning point, it's significantly easier to use on a day-to-day basis. If I had to handle in Gmail all of the mail that I handle from my primary account, I could not do it.
Re:So then what is Delete (Score:2, Interesting)
> association from the blob of data" vs "remove this blob of data altogether".
> Should the blob automatically delete when you remove all metadata around it?
> If not, how will you find it again? If so, would you really want data
> vanishing just because you removed a keyword?
If I had my way, the user interface would not provide any way to actually delete a file. Nothing good can come from that, and *plenty* of bad comes from it on a *regular* basis. Anyone who has to work with end users knows this is true.
There should be a trash bin they could throw it in, and it should sit there until the drive dies or someone wipes out the filesystem. (Or, if the drive actually runs low on space, and the swap file is not larger than a few gigabytes, the files that had been in the trash the longest could be actually deleted, after prompting the user to check if it's okay. Four nines of end users would never encounter this. If the drive runs low on space due to an enormous swap file, then the process using the largest amount of memory should be terminated, as it's obviously runaway.)
The last time an end-user *needed* to delete a file was in 1996, when the hard drive could only hold about 2GB and it was necessary to free up space. (No, don't even talk about sensitive information. If it's *actually* sensitive, just deleting the file isn't good enough anyway, and you know it.)
Third-party shareware and freeware utilities would spring up for emptying the trash. Which would be fine, because most of the people who delete things they really still want are afraid to download and install anything anyhow.
As far as removing metadata/keywords from a file... that brings up another shortcoming of current systems. If I had my way, we'd all be using filesystems that provide automatic versioning, and the metadata would be versioned as well as the contents of the file itself. So there'd still be a record of what keywords the file _used_ to have. (Yes, an automatic versioning would need an attribute that you could set on a given file or directory to prevent versioning there, which would be important for things like swapfiles and potentially useful for things like logfiles. But normal files should be versioned. It's not like you're going to fill up that 350GB hard drive with word processing documents and PowerPoint presentations, and the really big multimedia files would only have multiple versions if you were editing them, which normal users don't do; the relative few who do video editing or whatever could turn off versioning in certain folders if they see fit.)
Re:Is this really a file system? (Score:3, Interesting)
Bah. Most people use their computer because they have to in order to do work. And, honestly, it's not such a terrible request that the computer be easier to use. Half of the things that the user is required to manage should be managed by reasonable defaults.
Seen NSS yet? (Score:2, Interesting)
Re:So then what is Delete (Score:3, Interesting)
You haven't worked with many databases then.
The one we use does exactly that -- set the value to NA (similar to NULL, but not at all the same, since an NA value implies a default which is not necessarily 0 or "no value") and the row is removed from the database. Some relational models do the same thing, or force you to do it -- go ahead, try and set a primary key column to NULL. Your only choice is to delete the row entirely or do something silly like set it to a sentinel value (presuming your key is across multiple columns).
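A small SQLite sketch of the primary key point, for the curious. The `tags` table is hypothetical, and NOT NULL is written out explicitly because SQLite, unlike most engines, does not always imply it for primary key columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical keyword table with a key across two columns.
conn.execute("""
    CREATE TABLE tags (
        file_id INTEGER NOT NULL,
        tag     TEXT    NOT NULL,
        PRIMARY KEY (file_id, tag)
    )
""")
conn.execute("INSERT INTO tags VALUES (1, 'project fred')")

# Trying to blank out part of the key is rejected by the engine.
try:
    conn.execute("UPDATE tags SET tag = NULL WHERE file_id = 1")
    nulled = True
except sqlite3.IntegrityError:
    nulled = False

# The only clean way to "unset" the keyword is to delete the row.
conn.execute("DELETE FROM tags WHERE file_id = 1 AND tag = 'project fred'")
remaining = conn.execute("SELECT COUNT(*) FROM tags").fetchone()[0]
print(nulled, remaining)
```

So in this kind of schema, removing a keyword and removing a row are the same operation, which is the behaviour being argued about.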
You can't have it both ways - does a set of data get removed when all user-defined meta-data gets removed or not?
You can have it both ways -- the metadata he referred to is not user-defined! It's system defined and you could certainly differentiate between the two. I'm not sure if this would be a good idea or not; I haven't done research into what ReiserFS and WinFS do in these situations.
If not then how is a user really going to know when it's safe to "totally destroy" a file? Perhaps it was germane to some other keyword they had forgotten.
Uh... you're not from a database background are you? The relevant concept here is foreign key. There's a price to be paid for using them, but they certainly prevent the problem you're describing.
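To make the foreign key argument concrete, here is a minimal sketch using SQLite from Python. The `blobs` and `keywords` tables and the `try_destroy` helper are invented for this example; a real relational file system would presumably look quite different.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite leaves foreign key enforcement off unless asked.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE blobs (id INTEGER PRIMARY KEY, data BLOB)")
conn.execute("""
    CREATE TABLE keywords (
        blob_id INTEGER NOT NULL REFERENCES blobs(id),
        word    TEXT    NOT NULL
    )
""")
conn.execute("INSERT INTO blobs VALUES (1, x'00')")
conn.execute("INSERT INTO keywords VALUES (1, 'project fred')")
conn.execute("INSERT INTO keywords VALUES (1, 'taxes')")  # the forgotten one

def try_destroy(blob_id):
    # Returns True if the blob was removed, False if the engine refused.
    try:
        conn.execute("DELETE FROM blobs WHERE id = ?", (blob_id,))
        return True
    except sqlite3.IntegrityError:
        return False

first = try_destroy(1)    # refused: keywords still point at the blob
conn.execute("DELETE FROM keywords WHERE blob_id = 1")
second = try_destroy(1)   # allowed: nothing references it any more
print(first, second)
```

The price mentioned above shows up here too: every delete has to check the referencing table, and the user has to clear the associations before the data can go.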
If I copy a whole directory onto a CD I know that every file I put there is on that CD. If I ask to backup all files for Project Fred I cannot *know* by keyword alone that all the files are really there except through blind faith that I have properly tagged all files for that project.
I fail to see the difference between making sure that you put all the right files in the directory and making sure that you tagged all the files correctly. They are analogous operations. Just because you're more familiar with A than B does not mean that B is less capable -- just that you're not familiar with it.
Your entire line of questions regarding backup falls into this category. Backing up a RDBMS is hardly a new thing.
The difference is saying a file's default location is really id "4784874GA" vs. "~/Pictures". Think I'll take the latter thanks!
And clearly databases are doomed to failure for the exact same reasons. Ever taken a look at the raw data in an Oracle data file? Or MySQL? Or any other relational database? How about some non-relational ones? Make any sense to you? No? Well then obviously it's useless.
For that matter, when's the last time you read any file system other than FAT in raw mode? Traced through the core structures of Ext2 or NTFS lately? Not so human readable.
We routinely put overlays on top of data in order to make it more useful to humans. And a relational file system is just another way of doing it.
And furthermore as I said, you can get all of the benefits you were looking for with the way filesystems are being enhanced.
Shrug. You go debate Hans Reiser then. Clearly he's clueless about why a relational file system is superior to a hierarchical one. There are some areas where a hierarchical FS + extensions will lag behind a relational one. The inverse is also true. The question becomes -- which areas are more important?
I don't know the answer to that, and neither do you. But your complaints about metadata and organization are about as valid as people complaining that they can't use buggy whips to make their new fangled automobiles go faster.
Re:Is Linux Trailing? (Score:4, Interesting)
Re:So then what is Delete (Score:3, Interesting)
Actually, backups are an interesting issue that I hadn't thought about with the whole file-as-DB debate.
Backing up a DB is straightforward. In fact, with journals and all that it can be made possible to do atomic backups (ie a backup that captures the state of the filesystem in an instant of time).
However, the issue here is partial backups. Doing a backup of a 400GB drive onto 800 CDs or 80 DVDs or a tape or two with a $2000 drive is simple enough already. However, when I do a backup I don't want all records that changed since the last backup. I want all important records that changed (usually my home dir). Probably the easiest way to do that is via a backup field on the database, with an easy way to control its default setting (off for OS/Software, on for data, inherit from parent metadata).
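The "backup field that inherits from parent metadata" idea could work something like the sketch below. Everything in it is an assumption made for illustration: the `Node` class, the three-state `backup` flag (True, False, or None meaning "inherit"), and the choice of "off" as the fallback when nothing up the chain is set.

```python
from typing import Optional

class Node:
    def __init__(self, name: str, parent: "Optional[Node]" = None,
                 backup: Optional[bool] = None):
        self.name = name
        self.parent = parent
        self.backup = backup  # None means "inherit from parent"

    def effective_backup(self) -> bool:
        # Walk up until some ancestor has an explicit setting.
        node = self
        while node is not None:
            if node.backup is not None:
                return node.backup
            node = node.parent
        return False  # assumed default when nothing is set anywhere

root = Node("/", backup=False)            # OS/software: off by default
home = Node("home", root, backup=True)    # user data: on
doc = Node("thesis.doc", home)            # inherits on
cache = Node(".cache", home, backup=False)
tmp = Node("blob.tmp", cache)             # inherits off from .cache
windows = Node("WINDOWS", root)
dll = Node("kernel32.dll", windows)       # inherits off from root

for n in (doc, tmp, dll):
    print(n.name, n.effective_backup())
```

A partial backup would then be "every record whose effective flag is on and which changed since the last run", without the user tagging files one by one.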
One issue with big databases is that they are only useful if your relationships are good (ie your keywords/projects/etc). Users in my experience do a lousy job picking these on their own, and often resent the work of having to choose them. In many database apps they are set silently in the background based on the context of a user's operations, but while this works in an application program that automates a particular business process, it will be harder to extend this to general practice.
I think the jury is out on this whole debate. I think nobody will really know what is easier until people start trying it and learn to love it or hate it.
I love databases in general. However, the features that make them very powerful have always been the hardest things to explain to ordinary users...