If you recall my past blog post on Windows Home Server, we were discussing a critical file corruption bug in the operating system that Microsoft had finally identified. The Drive Extender technology for handling the WHS storage pool had a particularly nasty and hard to fix race condition that resulted in data corruption in certain situations when the service was migrating data from one drive to another. Due to the nature of the bug and Microsoft’s desire to pack the fix with Power Pack 1, the delivery of the fix was not expected for 3 months, which at the time would have made the delivery date June.

June came and went, and the Power Pack was nowhere to be found. Microsoft did release a public beta of the Power Pack in June, but this is hardly the same. As any professional server administrator can tell you, you don’t test new patches on your production servers.

Finally, as July comes to a close, Microsoft has released the shipping version of Power Pack 1. Now if you haven’t quite picked up on the tone of this blog post yet, I am not particularly happy with the situation. I’m quite happy that the issue has been fixed, and I am also happy that Microsoft has deployed the rest of the Power Pack 1 features, but I am not happy with how long it has taken.

When we get right down to it, the corruption issue was first acknowledged 7 months ago back on December 21st of 2007. And while we acknowledge that the issue is a particularly complex one that took Microsoft months just to completely identify, does this change the fact that the problem has taken an extraordinarily long time to correct? No, it does not.

In what other industry and with what other product is it acceptable for a common data corruption issue to persist in a server for this long? In what other situation is it appropriate to continue selling faulty software and devices utilizing that software after the faulty condition has been found? Unless you are enough of a computer enthusiast to read sites like AnandTech, there’s a good chance you would not have even known about the issue.

While WHS is a home product and I certainly don’t expect the same kind of rapid response from Microsoft for it as I would for Windows Server products, WHS is still a server, one that until this week had a major malfunction. Call it vitriol or call it common sense, but a bug fix for an issue of this importance should not take this long. This is certainly the worst support I can recall seeing for a Microsoft operating system; I cannot remember a bug like this having ever gone unfixed for this long.

Microsoft should be ashamed of themselves (at least as much as a large multi-national corporation can be) on the matter. While I hesitate to blame the WHS development team directly because not everything is in their hands, it’s certainly appropriate to blame Microsoft as a whole. Would committing more resources to WHS have resulted in a bug fix coming sooner? Potentially. Pulling the sales of WHS devices (particularly 2+ drive servers) should definitely have happened, however, along with pulling the OEM copies of the software itself. The single greatest problem is that a bug like this existed, but the second greatest problem is that WHS continued to be sold, and this is something that could have been resolved immediately.

So here we are, 7 months after the bug was first acknowledged and just a year after WHS first shipped. At this point the data corruption issue is fixed and it’s once again safe to use a WHS with multiple drives, and knock-on-wood there are no further corruption problems in the OS. But I find myself reflecting on what I said back in March, “It also undermines a great deal of confidence in Microsoft that will take some time to recover.” I have a troubling lack of confidence in Microsoft’s ability to support Windows Home Server, and I can’t bring myself to once again recommend Windows Home Server at this time. I still have hope for the next version of WHS, but I think that Microsoft has blown it for WHS v1; it’s a v1 product that should be avoided. This is a shame for HP in particular. As Microsoft’s premier partner for WHS in the United States, they have gone above and beyond everyone else (even Microsoft) to improve the WHS experience and to produce some really good servers, and ultimately it’s what Microsoft has or has not done that makes it all a moot point.

With that said, not everything is a bad thing today. Besides fixing the data corruption bug, Power Pack 1 adds a number of features and fixes that resolve our earlier issues with WHS when we first previewed it. The long-awaited connector package for Vista x64 is here, allowing computers running that OS to be backed up, and you can now back up the server itself to an external disk. Furthermore, network I/O has been significantly improved in some cases so that WHS isn’t nearly as pokey as it once was (although we’d still like Vista’s file I/O prioritization), and the Drive Extender service behavior has changed so that it no longer engages in file balancing as much. Even lesser features, such as the ActiveX control for the web access portion, have been beefed up to better handle uploading large files and multiple files. Even without the data corruption fix, Power Pack 1 is a big update for WHS that everyone with a WHS box will likely enjoy.


12 Comments


  • PrincessNybor - Sunday, September 21, 2008 - link

    Shame on Microsoft, but shame on you as well. You dangled an Ubuntu article in front of us SEVEN MONTHS AGO.

    I am a Windows Home Server user, and I appreciate this update. But I am also a Linux user, and it would be nice to get some kind of respect on that front as well.

    Are we getting an Ubuntu article or not? If you aren't going to write it, that's fine. But please post an update. Thanks.
  • rcme - Wednesday, July 23, 2008 - link

    Ryan, This post and your previous post are "spot on". GREAT WORK!!

    I have a WHS, it is a great product, and I am glad they fixed the data corruption problem.

    However, I have to wonder if the WHS DE is really totally fixed. This is based on your initial analysis that the DE data corruption was due to a fundamental flaw in the DE design and growing comments on the WHS community forum about another WHS DE problem.

    The "buzz" that seems to be growing on the WHS forum is that the WHS PP1 fix for DE has created a new problem with DE data balancing/migration on multiple drive systems. That is, if one adds a new disk drive to an almost full WHS, DE has problems migrating data to this empty drive, resulting in several different possible "out of space" type of errors, even though one has an empty disk drive.

    The ability to properly utilize the disk space from a newly added disk drive is a very basic, fundamental, function of WHS DE that is not working.

    One has to wonder if, in order to fix the DE data corruption problem, a compromise had to be made in the DE functionality (breaking something as basic as balancing/migration). This goes back to your original comments about the WHS DE design being flawed. The result being that without a complete WHS DE re-design/re-write, the WHS DE developers are now left with making compromises in DE functionality to fix these basic problems.

    So, this begs the question, Is the current WHS DE design flawed so badly, to the point that one fix (i.e. data corruption) will break some other basic function (i.e. data migration), so that the only hope for a fully functional DE is to wait for the next version of WHS, when the WHS team has a chance to do a complete re-design/re-write on the WHS DE component?
  • Jedi2155 - Thursday, July 24, 2008 - link

    That is exactly the kind of fear I have regarding WHS right now that has prevented me from embracing it as well. Microsoft should seriously be ashamed of themselves....
  • yyrkoon - Wednesday, July 23, 2008 - link

    If you do not like it - Do not use it. If you do like it - Use it . . .

    However suggesting that Microsoft should eat shipping on a product that has already shipped is simply silly. Would it be the right thing to do ? Maybe, but it would be financially unwise.

    Now let us remember that to err is human, and since Microsoft is made up of many, many humans . . . I will also try to remember this myself when reading such a blog, and try not to question why one so technically astute is not using something else already.
  • Calin - Wednesday, July 30, 2008 - link

    Yes, to err is human.
    But corporations are NOT HUMAN - or at least are treated, and treat others, like they are inhuman.
    For Microsoft, try to search for "Mike Rowe's Soft", to see the human side of Microsoft. His site now is redirected to Microsoft's own Live Search engine. There are plenty other things, but the idea remains:
    If someone is not human, we shouldn't treat it as human.
  • AlterEgoist - Tuesday, July 22, 2008 - link

    It was a bad bug, especially given that the whole point of the product was to protect your data. Fortunately, not many people actually lost data, from what I understand. I heard that the bug was buried way down in the kernel, so the Microsoft folks had to re-work a lot of code to get everything fixed. If you took the PP1 beta and are having issues installing the RTM version from Microsoft's download site, see MediaSmartHome.com for some advice in the forums.
  • Z Ice - Tuesday, July 22, 2008 - link

    Mr. Smith -

    My respect for AnandTech and yourself leads me to post here for a couple of reasons. Primarily so that your readers can get a little more info about the severity of the bug, and secondly, because I feel that WHS is one of the very best products to come from Microsoft, or any OS developer, in a long time.

    To start, your article is somewhat misleading. Yes, 7 months is a ridiculously long time to fix a major bug. However, this particular bug had a known work around that prevented any corruption from occurring. Your article makes it sound as if the corruption was unavoidable. I'm not talking about using only 1 drive, either. The bug was avoidable generally just by not editing files while they were stored on the home server (this was particularly annoying with programs that auto-edited files, like Windows Media Player). So obviously that hurt the ease of use of WHS and prevented users from fully utilizing all of the cool features of WHS, but it did allow everyone to avoid the bug.

    Now that we have a more complete disclosure on the nature of the bug, I must say that if you overlook WHS because of it, you're missing out. The automated backups alone are worth the price of admission. The remote features, such as logging in to any of your home computers from a web browser anywhere in the world and downloading any file you have stored on your server, put WHS in to the value category as far as I'm concerned. This is before we get to the media streaming capabilities, the add-ons (like Whiist that allows you to setup a website on your server in minutes that you could share photos stored on your server from if you wanted to), and the ease of use.

    So, despite my respect for AnandTech's reviews, I'm going to need to respectfully disagree with your opinion that WHS should be avoided. I recommend it wholeheartedly.
  • Calin - Wednesday, July 30, 2008 - link

    The article on Anandtech.com that identified the issue also showed that you could witness corruption on files YOU DID NOT EVER edit - like on photos.
  • jordanclock - Tuesday, July 22, 2008 - link

    According to the DailyTech article on the same subject...

    "The situations in which corruption could occur grew as well (initially the WHS believed that files could only be corrupted when edited on the server)."

    So the workaround wasn't even a workaround. Especially when considering that using the default media player (which would normally be a completely normal decision) would result in editing of files and possible corruption.

    Seven months is an extreme amount of time for a product with "server" in its name. Especially when the whole point of the product is to provide a central storage location and that very function is fundamentally flawed.

    Automated backups? That's great and all, but if you can't edit the files (and the act of simply adding new files may result in corruption) you only need one backup. Unless you like a corrupted backup.

    I think you really need to consider the ramifications of this bug. This is a bug in one of the headlining features of the product, one of the features that really sets it apart from other products, and it was fatally flawed. It was flawed for a year, and it was officially acknowledged for SEVEN MONTHS. Imagine if for an entire year you couldn't use Vista's Aero theme, even with the proper hardware. And for seven months Microsoft said "We know there is a problem, please wait for the next Service Pack." Or if you couldn't use the Dashboard in OSX and for seven months Apple said "You can use it, just don't change any widgets in it, or you may risk losing them." And for the obligatory car analogy, it would be like the Toyota Prius being unable to recharge its batteries and to have Toyota spend seven months telling people that they should continue to buy the Prius and that owners should simply wait for a fix, while gaining no benefit from the most prominent feature of the vehicle.

    Yes, it's great they fixed the problem. Yes, the other features are wonderful and WHS is now fully functional and does a bang-up job at what it's meant for. But this severely harms the reputation of Microsoft and WHS. Many may still not adopt WHS simply because they don't want to risk a repeat. They may even avoid future WHS-based products.

    I personally would recommend WHS only to those that are fully aware of the situation that just ended. I could not, in good conscience, recommend this product to someone under any other circumstances.

    I respectfully (albeit just barely at this point) disagree and wholeheartedly agree with the opinion that WHS has a permanently blemished reputation and that the lax attitude of Microsoft on the issue has completely turned me off to the product, as it should to others. We simply cannot let a company release a product with such a large fault and sit idly while they wait to release their fix. It is completely inexcusable behavior and for you to defend it in such a way is disgusting, to say the least.
  • idomagic - Tuesday, July 22, 2008 - link

    "The bug was avoidable generally just by not editing files while they were stored on the home server"

    So basically "don't use it and it won't break"?

    Very clever, it's the work around of all work arounds!
