Anatomy of the VA's IT Meltdown
Lucas123 writes "According to a Computerworld story, a relatively simple breakdown in communications led to a day-long systems outage within the VA's medical centers. The ultimate result of the outage: the cancellation of a project to centralize IT systems at more than 150 medical facilities into four regional data processing centers. The shutdown 'left months of work to recover data to update the medical records of thousands of veterans. The procedural failure also exposed a common problem in IT transformation efforts: Fault lines appear when management reporting shifts from local to regional.'"
In other words.... (Score:5, Insightful)
Once again, the VA shows its true colors and mucks up another project funded by taxpayers for the well-being of our nation's Veterans. A more screwed up organization one will not find.
Re: (Score:1, Insightful)
Re:In other words.... (Score:5, Informative)
At the time, there seemed to be a lot of waste (think $10,000 CD burner in 1993ish, optical cards with images and data impressed on them, etc). But they really were trying to be ahead of the game - a friend of mine showed me his green card and it was almost identical to a design I was working with when I was at the DVA. They also had mechanisms for charging back to private insurance companies in the event a veteran was only partially covered for a visit.
Oh, and just about all the software that was written and in use by those hospitals is in the public domain and downloadable [worldvista.org] for free - many other hospitals use VistA as their base.
Re: (Score:1)
Re: (Score:3, Insightful)
Re:In other words.... (Score:5, Insightful)
If any one hospital or chain of hospitals performed as consistently lousy as the VA has, that hospital would have been sued into oblivion decades ago. Hundreds of thousands of vets who've used the VA's services can attest. But we can't necessarily sue the VA because they're part of the government. Go to any VA hospital in the US. Odds are that after you pass through the pretty facade they've set up you'll find patient after patient sitting in a wheelchair or bed lined along some wall waiting for some over-worked, over-stressed and under-staffed doctor and not getting the care they deserve.
The VA needs to take a lesson from the corporate world and change its face. Rename itself, start fresh. AND START DOING THEIR G-D JOB! That's the best dismal chance they've got to make things right. As it is right now there isn't a Vet in the US or abroad that thinks highly of the VA. And if there is, I'd find 100 that would refute any positive statement made about the VA.
And, yes - I'm a Vet. My Father is a Vet. My Grandfather is a Vet. My Uncle is a Vet. I don't recall them looking forward to communicating with the VA, either.
In closing, if the VA *did* do their job the homeless wouldn't consist of 25% US Veterans that couldn't re-adjust to civilian life after witnessing the horrors of war!
http://www.cnn.com/2007/US/11/08/homeless.veterans/ [cnn.com]
http://www.cnn.com/HEALTH/blogs/paging.dr.gupta/2007/05/mia-in-plain-sight.html [cnn.com]
Re: (Score:2)
Re:In other words.... People just don't care. (Score:2)
I have seen people refuse to stand for the National Anthem on Veterans Day at an airshow. Did you miss the people complaining about Google having a banner for Veterans Day?
If people will not stand, and actively complain about Google's Veterans Day banner, why should they want to fund or fix the VA? That actually costs real m
Re: (Score:2)
Re:In other words.... (Score:4, Informative)
http://www.washingtonmonthly.com/features/2005/0501.longman.html [washingtonmonthly.com]
If you think the VA is bad, you can always go to your favorite HMO and have a higher chance of death.
Did I mention that the VA is a leader in hospital IT infrastructure and is decades ahead of other hospitals?
http://en.wikipedia.org/wiki/Veterans_Health_Information_Systems_and_Technology_Architecture [wikipedia.org]
The VA is the largest hospital system in the US and its budget is decreased most years after adjusted for inflation. Given the predicament that Congress puts them in, they've done pretty well.
However, every single mistake they make is a public headline. Private hospitals have the luxury of being sued and quietly settle for $$$. Instead, the VA has to endure lots of bad publicity.
If the VA was a corporation, costs would skyrocket and even more corners would be cut. If you want to make it better, how about you ask Congress to provide adequate funding for the avalanche of people they are getting?
Re: (Score:2)
May I remind you of the Walter Reed Medical Hospital travesty that *recently* made headlines?
http://www.encyclopedia.com/doc/1G1-161076682.html [encyclopedia.com]
http://akaka.senate.gov/public/index.cfm?FuseAction=newsarticles.home&month=3&year=2007&release_id=1570 [senate.gov]
http://www.usatoday.com/news/washington/2007-03-21-va-review_N.htm [usatoday.com]
Re: (Score:3, Informative)
HOWEVER, it is a brilliant example in which a public outrage was sparked, and the government was forced to do its job, and did indeed clean things up after the horrible conditions were brought to light.
If it were a private hospital, I fear that things would have been kept hush-hush for far longer through lawsuits and settlements. Even then, the worst that the government could do to the place would be to either impose fines, or shut the
But Walter Reed isn't a VA facility (Score:2)
Put another way, would it be reasonable or appropriate to blame NIH (Dept of Health and Human Services) for security breaches at Los Alamos (Dept of Energy)? I mean, they both do basic science research, so they must be the same,
Re: (Score:2)
Re: (Score:2, Insightful)
I won't say it's perfect, but it has quite low overhead (relative to private insurance) and if there was no debate about who was allowed on and who wasn't it could be streamlined further.
Very few people want a single source of healthcare providing everything.
Re: (Score:1, Flamebait)
Oh please. That's like looking at FEMA's response to Katrina and saying "see, you can't expect the gov't to do anything right." It's so Republican to intentionally break government agencies and then use their brokenness as a reason to privatize everything.
Re: (Score:3, Informative)
Re: (Score:2)
Read the comments that were attached to that article. They don't give as glowing a review as the article.
1 Article - followed by many more jaded Veterans or the family members who had to assist their Veteran to get there.
Re: (Score:2)
Re: (Score:2)
I normally reply to these things with examples of well (or reasonably well) functioning universal medicare systems in pretty much all industrialized (and other) nations. However when faced with the continuing stream of total and utter fiascos authored by the US federal or state governments, who somehow mastered heretofore unattainable levels of incompetence, graft and general stupidity in anything even remotely relating to common good, I am beginning to lean towards another notion. It is the theory that
Re: (Score:2)
Re: (Score:2)
Apparently not. Their public institutions are becoming less "public" as time goes on, serving and defending the newly forming de facto corporate aristocracy (complete with dynastic "presidencies") and court-jester "parties". Creation of new egalitarian institutions, or even repairing of the once impressive existing ones, is apparently already out of the question. As are inheritance taxes, the last remaining obstacle between any pretense of meritocracy and the feudal order.
And so the Republic crumbles t
Oh, great. (Score:3, Funny)
(of course, it would be a first for 'em... even if it's the "wrong" Vista we're talking here).
I see the problem (Score:4, Funny)
There clearly is just not enough synergy..
Re: (Score:2)
not cancelled (Score:2)
Re: (Score:2)
Start by banning the name. (Score:2)
Re: (Score:1)
Assumption junction, what's your function? (Score:4, Insightful)
DOH! Looks like it was all just due to someone's assumption that someone else would do their job.
From my experience, you can assume things happened, but if you don't verify that they actually happened - you are DOOMED.
Re:Assumption junction, what's your function? (Score:5, Informative)
DOH! Looks like someone was making assumptions without reading the article. They considered switching to the backup, but since they didn't know whether the problem was on their end or the server's end, they were afraid that switching to the backup data center would destroy that one as well.
Re: (Score:2)
And since they didn't know what the actual problem was, they just assumed things and it got hosed. I stand by my original statement.
Re: (Score:2)
Generally, "they're stupid 'cause when u assume lol" is reserved for thoughtlessly destructive acts. The decision not to sync to peer wasn't one -- it was an informed decision to cut their losses and have merely *one* hospital down, rather than risk having N hospitals down.
Somebody explain why it doesn't work like this (Score:1, Redundant)
that brings another point to mind...
DIDN'T THEY TEST THE FREAKING THING!?
my 2 cents. (Score:5, Insightful)
Unfortunately, one of the best ways to learn how well your disaster recovery system works is to have a disaster. The problem with scheduled drills is that the scenarios themselves are planned out and typically not run system-wide, i.e. test this part of the system, then that part of the system, etc. On RTFA, it seems much of the breakdown occurred because too many people assumed. There was also no centralized decision-making entity with access to all the information. Each decision, when viewed from its maker's individual perspective, seemed to be the right one. However, sometimes when implementing a global recovery plan one system may have to be sacrificed for another.
Re: (Score:2)
awesome! (Score:4, Informative)
Instantly, technicians present began to troubleshoot the problem. "There was a lot of attention on the signs and symptoms of the problem and very little attention on what is very often the first step you have in triaging an IT incident, which is, 'What was the last thing that got changed in this environment?'" Raffin said.
p.s. I am shocked at how many junior cowboy IT people remain employed, given the supposed glut of hire-able and knowledgeable folks.
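For anyone who wants Raffin's "what was the last thing that got changed?" question made concrete, here is a minimal sketch in Python. It assumes changes are recorded somewhere as timestamped entries; the `Change` record, the `recent_changes` helper, and every name and date in the sample log are hypothetical and do not come from the article.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Change:
    """One entry in a (hypothetical) change log."""
    when: datetime
    who: str
    what: str


def recent_changes(log, incident_start, window=timedelta(hours=24)):
    """Return changes made in the window before the incident, newest first."""
    earliest = incident_start - window
    hits = [c for c in log if earliest <= c.when <= incident_start]
    return sorted(hits, key=lambda c: c.when, reverse=True)


# Made-up sample data, for illustration only.
log = [
    Change(datetime(2007, 1, 1, 9, 0), "alice", "rotated backup tapes"),
    Change(datetime(2007, 1, 2, 6, 45), "bob", "changed network port configuration"),
    Change(datetime(2006, 12, 20, 14, 0), "carol", "patched print server"),
]
incident = datetime(2007, 1, 2, 7, 30)

suspects = recent_changes(log, incident)
for c in suspects:
    print(c.when.isoformat(), c.who, c.what)
```

The newest change comes first because it is usually the first thing worth ruling out, though as others in the thread point out, the last change is a suspect and not automatically the culprit. None of this helps if the change was never logged in the first place.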
Re: (Score:3, Insightful)
Re: (Score:3, Insightful)
Like the budgie! (Score:3, Insightful)
First thing I learned in the military: your weapon was made by the lowest bidder.
Re: (Score:2)
What was the last thing that got changed in this environment?
The way most IT networks are run you would be lucky to know if anyone is updating their environment. Businesses don't care about employing good or experienced employees (not that they would know how to tell whether they have one); they are concerned with whether they are under budget so they can get their bonus for the year.
Re: (Score:2)
I should also point out that you often have to be careful of instantly blaming the last change made before a
Zonk, you retard (Score:5, Insightful)
Please, God, isn't there some kind of Editing 101 correspondence-school course we can send all these guys to? I mean, I love Slashdot to death, but please God, can you give the staff just one ounce of basic editorial skills: spelling, grammar, etc? Teach them to write for clarity, not just brevity? Maybe go for broke and touch on dupe-checking, fact-checking, changing links so they point to the original article instead of some guy's AdSense-laden blog page that says nothing more than "here's the story"?
You're EDITORS, for God's sake (even if in name only), you are indeed allowed to EDIT submissions.
Re: (Score:1)
Hey -- I didn't design my brain's pattern recognition systems.
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
Who knows.
Maybe the "the" that you refer to is a typo? who can tell in a
How hard is it to expand an acronym in its first usage?
Re: (Score:1)
For a brief second I thought it was about a VIA chip that someone overclocked and melted, and they were doing some kind of post mortem on it.
Re: (Score:2)
Furthermore, few people outside the USA are likely to know what "the VA" is, so a bit of clarification would be handy. Scroll the Slashdot front page right now--this is one of the shortest headlines on the screen. The
+1 C'mon Editors (Score:5, Funny)
1. Looks at title: omg! Slashdot's parent company had an IT meltdown! ha-ha! But waitaminute
2. Looks at icon: a
3. Looks at summary: and
4. Looks at icon: I remember that! It means
5. Looks into the inner recesses of my mind:
7. Looks at lightbulb over head: of course! There *is* no VA Linux! It's Sourceforge, Inc now! But that must mean
6. Looks at summary:
Gee thanks, Zonk, just what I needed before going to sleep. Now I'll dream of the Queen in Virginia melting down medical computers for Slashdot's open source overlords. Again.
Last thing I needed
Re: (Score:2)
Get over it. Life's too short to get so upset about this kind of stuff.
Re: (Score:2)
Which is exactly why Slashdot's editors are doing their readership such a huge disservice by not EDITING. I've read PLENTY of threads that were about some minor point instead of being about the story. I've seen stories where literally every +5 comment in the discussion was NOT about the content of the story but rather an error in the reporting or something else tangential--usu
Re: (Score:2)
When I read it, I first thought Veteran's Administration. VA Linux IT meltdown doesn't even make a lot of sense.
Of course, I've been reading about federal government IT meltdowns for a long time, so I'm conditioned.
rd
Re: (Score:2)
Exactly how obvious should that be to the 95.4% of the people in the world who aren't American?
Re: (Score:2)
While I understand that
Re: (Score:2)
if I, as a Brit, referred to the RBL would you pick up what that expanded to straight away?
Google is your friend [britishlegion.org.uk]
it doesn't seem unreasonable to ask for peculiarly American acronyms to be written out in full for the benefit of the other 5.75 billion people in the world
I'd be surprised if the number of connected people was much over a billion.
Re: (Score:2)
A quick google does bring up the "US Department of Veterans Affai
Re: (Score:2)
Though, despite the change, "the VA" (the Veterans Affairs?) has managed to stick around, even in official literature. It makes the grammar nazi in me die a little every day, so I choose to take my pent-up frustrations out on you. Sorry.
Hey, at least I don't work for the Postal Service, right?
Re: (Score:2)
You know how it is. Names stick. To those who were born at a certain time, the VA will always be the VA. Hell, even their website is still va.gov--despite the name change coming two years before TBL invented the WWW. (Though they might have had the domain name before then.)
Re: (Score:2)
treasury.gov
state.gov
ed.gov
interior.gov
hhs.gov
hud.gov
Thank god they didn't register and promote www.theva.gov (*shudder*)
VA Acronym? (Score:2, Insightful)
Re: (Score:2)
Mod parent up, not off-topic.
And tag the article "Veteran's Association" because other applicable acronyms for VA include "VA Software", which is the former name of SourceForge Inc (symb: LNUX), who own Slashdot. Also, even after reading the blurb for the article, "Virginia" is a possible expansion of VA.
Sometimes, it doesn't make sense to shorten things with acronyms. Especially within areas where confusion like this exists.
Re: (Score:2)
Re: (Score:2)
What's even worse? It's not the VA any more. It's the Department of Veterans Affairs [wikipedia.org]. But no one ever calls it the DVA.
OK, so maybe the hospital part of this Department thingie is the "VA" in question.
Nope, sorry, wrong again. That's the Veterans Health Administration [wikipedia.org], or VHA.
"Do not try to unfubar the VA; that's impossible. Instead only try to realize the truth: There is no VA."
Why always centralizing? (Score:4, Insightful)
1) Trying to centralize gives us large expensive computers that are made out of the same components as smaller ones and thus fail just as the smaller ones do, however, ever trying to cram more crap on the same machine will bring down everything at once whenever it fails.
2) Trying to centralize has the ultimate goal of eliminating jobs, but they need those people since they know all the little details and hiccups their systems have. If people know a project is going to eliminate their job, they won't be cooperative. IT not being cooperative is very bad in this world where everything is computerized.
3) Eventually the same number of people is going to have to work in the centralized system just because you also centralize the problems and more problems will bring more people, more people will bring more overhead and inefficiency, more inefficiency will bring more people (at least that's the default in today's business world, throwing more people at an IT problem doesn't make it disappear faster)
4) More people in a project that was designed to be more cost efficient means the managers will have to cut expenses. Cut expenses brings underpaid people, underpaid people bring less or no experience and higher turnover, higher turnover means more cutting expenses.
Therefore: keep your local IT guy(s) and infrastructure although you can't squeeze 100% of work/day and it will bring a little more expense. The end-users have a better relationship with the guy(s) and that makes happier people. Centralizing brings more overhead, less customer-interaction with IT and thus more inefficiency throughout the business.
Re: (Score:3, Insightful)
Obviously not all of this data needs to be centralized, but its existence should be. We don't know to what level the VA was doin
Re: (Score:3, Insightful)
If that's how you're doing it, you're doing it wrong.
On how many smaller systems can you upgrade your disk controller's firmware without having to reboot or even stop access to the disks? Not a problem on a good SAN system.
And those systems
Re: (Score:2)
Hear, hear, great post.
But I will pick at one thing, 3 & 4 are not issues with a unionized organization like VA. Given your example the 7 of the 8 leftover people would be "repurposed" to tackle another massive pile of outstanding work and the 8th one would either get promoted to manage the remaining 7 or they would be "trimmed" and someone else would be promoted to manage the 7.
My experience with government organizations is that they tend to carry a lot of overhead and they can't really get rid of
Re: (Score:2)
Standardizing IT in VA would be fantastically helpful (150-something hospitals all running different software is a nightmare), and they thought that this would be an easy way to do it. Easier, certainly, than managing a 150-site rollout and 150 different migrations all with more than a couple nines of uptime.
The folks in charge of making this decision work at the pleasure of the President, which means they're looking for work in January of '09. They need something on their re
Re: (Score:3, Informative)
Re: (Score:2)
All your medical records are belong to us (Score:1)
I'm in the outsourcing business (Score:1, Troll)
Re: (Score:2)
Do they teach people how to read these days?
They messed up everything they could mess up. (Score:1, Insightful)
Couple of reasons: First, they're running Vista. I'm not trying to be all "You must only run Linux or ur a n00b" here -- you can run Windows servers just fine, but no reasonable IT planner should ever, *ever* consider using an OS that new for a mission-critical enterprise application. If it doesn't have two or three years in the field, don't even consider it.
Second, their failover plan suck
Re: (Score:1)
Yes the article talks about Vista, but Vista the application as in "Veterans Health Information Systems and Technology Architecture". An application and acronym that predates the Microsoft OS of the same name. I know we like to blame MS for everything, but they have no involvement in this problem.
Two, their failover plan had three levels of planning. That is far better than most of the failure planning I have seen. As a result of this failure, there were degradations in service becaus
Re: (Score:3, Informative)
Vista, Veterans Health Information Systems and Technology Architecture, is the VA's system for maintaining electronic health records.
It sounds like they are running something much older. Again from TFA:
According to Director Eric Raffin, members of the technical team were at the site with staffers from Hewlett-Packard Co. conducting a review of the center's HP AlphaServer system running on Virtual Memory System and testing its performance.
"Virtual Memory System" on an Alpha would be "VMS" would it not? Note the article only states that some folks were working on VMS at the same time when the Vista system (not the Microsoft OS) went down. It doesn't say that they were the same system, but you should consider that their environment is a bit more older and complicated that you make it out to be.
The article
Re: (Score:2)
Regarding point the second: Yup. 100% right.
Regarding point the third: You are more right than you know.
Regarding point the last: They weren't middle managers.
Re: (Score:2)
First off, please read close enough to discern which VISTA they are talking about - it's kinda spelled out the
Re: (Score:2)
I've got a real problem with the way I responded to your post in the first place. It's been bugging me ever since I sent it - and here's why. I railed against your post in reaction to the numerous times I've faced the "wait until x.1" in the corporate world. Even though I get tired of it after a while, I had no reason to unload my personal baggage on your post, making it see
Re: (Score:2)
The root cause (Score:1)
Poor VMS. (Score:3, Insightful)
We hardly knew ye.
It happens (Score:4, Insightful)
Seems to me that things otherwise working well is a major accomplishment. They are still on the old system and are entering data back into that system and migrating into the new system. But it seems things went well otherwise.
Anytime you do a major shift like this, it's hard. The users hate it because they can do their job very quickly on the system they are used to, but now have to learn a new system and slow down.
Things happen.
Re: (Score:3, Interesting)
because some IT staffer changed a port # at one of their hub data centers without following proper procedure -- that's minor.
I don't know if I agree with that. "Change Control" or "Change Management" is a crucial part of any Data Center. The fact that these ports were changed without being properly "run up the flagpole" is a glaring mistake with very unfortunate results. I'll bet anyone swapping ports in the future will ask permission several times over before trying it again.
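To make "run up the flagpole" concrete, here is a minimal sketch of an approval gate in Python. It assumes a change is only applied once every required approver has signed off; `ChangeRequest`, `apply_change`, `UnapprovedChangeError`, and the approver names are all hypothetical illustrations, not the VA's actual process.

```python
from dataclasses import dataclass, field


class UnapprovedChangeError(Exception):
    """Raised when someone tries to apply a change that lacks sign-off."""


@dataclass
class ChangeRequest:
    description: str
    required_approvers: frozenset
    approvals: set = field(default_factory=set)

    def approve(self, approver):
        # Only people on the required list can sign off.
        if approver not in self.required_approvers:
            raise ValueError(f"{approver} is not an approver for this change")
        self.approvals.add(approver)

    @property
    def approved(self):
        # Approved only when every required approver has signed.
        return self.required_approvers.issubset(self.approvals)


def apply_change(request, action):
    """Run `action` only if the request is fully approved."""
    if not request.approved:
        missing = sorted(request.required_approvers - request.approvals)
        raise UnapprovedChangeError(f"missing approvals: {missing}")
    return action()


# Hypothetical example: a port change needing two sign-offs.
req = ChangeRequest(
    "change port setting on data center link",
    frozenset({"site_manager", "datacenter_manager"}),
)

blocked = False
try:
    apply_change(req, lambda: "applied")
except UnapprovedChangeError:
    blocked = True  # nobody has signed off yet

req.approve("site_manager")
req.approve("datacenter_manager")
result = apply_change(req, lambda: "applied")
```

In practice the gate is a ticketing system and a policy, not a function call, but the idea is the same: the change cannot proceed on the assumption that somebody else approved it.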
I work at the heart of this... (Score:5, Insightful)
VISTA runs on HP's VMS, and on top of that it runs Cache from Intersystems. (And yes, it costs the taxpayers a lot! But a lot less since we've been centralizing it over the last 3 or 4 years.)
It is a HUGE system.
The centralization that we're currently undergoing is massive; this problem was (IMHO) scapegoated onto a poor change control process.
I know what was changed, I know who changed it, and I know when they changed it. However, this 'melt down' has happened three times... (Not to the same drastic outcome.) It comes down to VMS locking out logons because locks aren't being released properly. (Now you could argue that the reason locks got behind was this change... But I don't think that is the real reason because of our previous problems.)
It's that simple. Ask the VISTA manager over lunch sometime. They weren't afraid of data corruption. They were afraid if they moved the systems, the other system would lock up too with too much user load.
There goes "VISTA". Everyone logged in is fine. Everyone not on... Isn't getting on.
Now comes the bad part... No procedures!
We take 32 medical centers, and throw their IT into a data center. You 'had' clear lines of who owns what, and what happens when they go down. Now you centralize all that... Who raises the flag when something bad happens? Is it the site that has the problem? Is it someone who now controls the system at the data center? Who is responsible for what?
Oh wait... OI&T only has a dozen staff... And almost NONE of those people are technical. Everyone's pay was simply moved from one appropriation to another. But what about the IT systems?!?! We moved those too, but didn't hire any permanent staff to take care of it? We just rubber banded a bunch of people together that work across the whole west coast and hand them a pager and say good luck?
Suffice it to say, we have some REALLY REALLY hard working people... And some really bad management. (Congress forcing us to do things on a time table is really annoying. Especially since they expect results, but don't expect any documentation... What do you think is going to get skipped?)
Congress: How is that data center move going!
Howard: We've moved 28 sites!
Congress: Good Job!
Howard:
Then again... Howard doesn't even know everything we skip to get things done.
Bah
Re: (Score:2)
Re: (Score:2)
This person knows what's going on and is telling you.
Cache is part of the problem (Score:2, Interesting)
Re: (Score:3, Informative)
M is what T-SQL/stored procs wants to be when it grows up. I'm pretty sure getting help from Intersystems isn't an issue at the VA.
This is a Management/Change Management issue. Not a technical issue.
Re: (Score:2)
Thanks for that excellent insight from the inside.
First of all, the soft computer press quoting of idiot bigwigs and their scapegoating for whatever they're planning and then again the scapegoating when it fails is irritating, but it's the only semi-technical info available. If it weren't for Slashdot where AC's can dish the dirt we
Oh - *that* VA (Score:2)
What meltdown? (Score:2)
I've gone to VA hospitals since 1989. I got insurance when I started teaching and started going to local doctors and hospitals. Before a year was up I was going back to the VA. Treatment that the VA doesn't provide is treatment the vet didn't request. To be fair, at the VA you need to reques
That's ok (Score:2)
And State recently suffered a MAJOR web outage. Press says it was hacked, I know better. I used to manage that web server before I was summarily laid off. The MySQL database would start going haywire because it was an ancient version. All you had to do was kill the MySQL slave and restart MySQL and all would be f
Disappointing your employees . . . (Score:1, Interesting)
A lot of the admins were unhappy about that, as I would have been. I am just curious if the failure to complete the project had to do with the lack
Re: (Score:2)
Doesn't VA own slashdot? (Score:2, Funny)
Re: (Score:1)
Re: (Score:2, Informative)
Re: (Score:2)
-Ask to do stuff I already tried? Check.
-Pretend like a download/burn failure could cause this specific problem? Check.
-Give inconsistent story about which CD is needed to fix boot errors? Check.
-Ignore information about error message? Check.
-Focus on irrelevant Windows usage? Check.
-Feigning surprise that someone would run Ubuntu in a completely anticipated, common environment? Check.
I know a lot of what I've