IT Infrastructure As a House of Cards
snydeq writes "Deep End's Paul Venezia takes up a topic many IT pros face: 'When you've attached enough Band-Aids to the corpus that it's more bandage than not, isn't it time to start over?' The constant need to apply temporary fixes that end up becoming permanent is fast pushing many IT infrastructures beyond repair. Much of the blame falls on the products IT has to deal with. 'As processors have become faster and RAM cheaper, the software vendors have opted to dress up new versions in eye candy and limited-use features rather than concentrate on the foundation of the application. To their credit, code that was written to run on a Pentium-II 300MHz CPU will fly on modern hardware, but that code was also written to interact with a completely different set of OS dependencies, problems, and libraries. Yes, it might function on modern hardware, but not without more than a few Band-Aids to attach it to modern operating systems,' Venezia writes. And yet breaking this 'vicious cycle of bad ideas and worse implementations' by wiping the slate clean is no easy task, especially when the need for kludges isn't apparent until the software is in the process of being implemented. 'Generally it's too late to change course at that point.'"
Re:All comes down to budget (Score:5, Insightful)
Budget and the lack of ability to see ahead, on the side of the decision makers.
Far too often decision makers are not the people who also have to suffer, I mean work with the tools they bought. They are often easily swayed by a nifty presentation from a guy who doesn't know too much either but promises everything, and of course the ability to cut cost in half, if not more, so they buy. Only to find out that the solution they bought is not suitable to the problem at hand. And then the bandaids start to pop up.
As a non-developer, this is what I see (Score:4, Insightful)
Maintaining code is boring.
Everyone wants to work on the latest and greatest stuff, no one wants to maintain or even release patches.
It sucks, especially since it isn't limited to just software development.
I've seen companies where their "core switch" was a Cisco 2548. This wasn't 10 years ago, this was last year! Unreal.
Re:All comes down to budget (Score:5, Insightful)
They are often easily swayed by a nifty presentation from a guy who doesn't know too much either but promises everything, and of course the ability to cut cost in half, if not more, so they buy.
If you've worked in a huge shop, you know that the big software vendors send reps out to IT managers for golf outings and the like. Screw it if the software works or not, just fluff up the guy with the budget rubber stamp.
implemented (Score:3, Insightful)
i guess it's ok that the sysadmins co-opted the word 'implemented' where one would normally
say 'installed'
but that kind of leaves the actual implementors without a word now
and in this particular usage, it's kind of odd, because usually the best time to
find and fix these problems is exactly when it's being implemented, rather than
when it's being installed
like bubblegum under a desk... (Score:5, Insightful)
Re:All comes down to budget (Score:3, Insightful)
To be blunt, most IT departments act like cost centers and don't provide any strategic value. Business units help by shorting the budgets and whining about band-aid technology instead of seeing how IT can build the business. It takes an exceptional move by IT or amazing insight from a business unit to raise IT above the slog and allow it to provide a competitive advantage to the business units. Projects that do this get firehosed with funding.
Consultants take advantage of this catch 22 situation when they sell new projects. It lets them get the new implementations and cutting edge development. This situation also causes application oriented mini-IT organizations to pop up in the business units from time to time. That, in turn, causes more headaches for central IT.
Software = untouchable mentality (Score:4, Insightful)
This happens in commercial software development, too. There's this belief (often held all the way up the management chain to the top) that software, even bad software, represents some kind of massive, utterly permanent investment that must never be thrown away and re-written.
I've worked with managers who would think nothing of throwing away a million dollar manufacturing machine to replace it because it's old, yet cling with all their might to ancient software code that represents a similar level of investment.
Re:Kludges are short-time fixes and long-time prob (Score:5, Insightful)
It doesn't look like "doing it right the first time" is an option here. RTFA. They're talking about vendor applications being crappy and crufty, and IT departments being required to support them. The IT department didn't pick the app, and isn't allowed to not support it. They can't switch to another app (usually apps like this have little or no competition, and they're probably locked-in anyway).
So there's really nothing they can do but complain as long as they're required to support some shitty application on the latest version of Windows, as these are the requirements set down by upper management.
Re:Summary (Score:4, Insightful)
You obviously do not belong here.
Nerds only have one time zone: UTC
Re:All comes down to budget (Score:3, Insightful)
Re:Solution is obvious - Linux (Score:2, Insightful)
I've been saying exactly the same thing since about 1994---since I got into the Linux thing. Every program I wrote since just "works" without changes (granted, I don't write many GUI apps; mostly data management stuff). My Windows counterparts (same corp, doing semi-related apps) have to release a "new version" every time .NET is patched---or something along those lines. Your environment shouldn't make your things break or not work right.
Re:Take responsibility and stop the magical thinki (Score:5, Insightful)
I'm going to tackle some of the conceptual problems that are hinted at above, which is usually where the difficulties lie, usually in trying to use the wrong software and expecting to somehow "make everything better" if you just make it work "my way" - the true "Magical Thinking".
I tend to agree with your conclusions, "wipe the slate clean" is a drastic action. I disagree with some of the approach you use to arrive at them:
a.) Problems are solved by people being invested in solving them, not process. This requires the antithesis of "Units" - Ownership; Ownership in the company, Ownership of the mission, and a direct heart felt connection to the success of the company. Until you have staff, from the CEO down, that own problems, from the mess in the coffee room to server down time, you will have a "business house of cards" no matter how good the process. In fact, most of the time, fixing things involves re-writing and/or reconsidering process - usually starting with asking the question - "Do we really need that?"
b.) Sometimes you really do have a train wreck on your hands. If you have mastered a.) b follows almost effortlessly, because now, you can *talk* about this behemoth that is eating your company and everybody sees the discussion for what it is, not empire building or managerial fingerprinting.
When you run into a train wreck, assess your tech problem: is the fix easily found? Are your processes using the software at cross purposes? If so, which is cheaper to fix? No amount of bug fixing will repair using the wrong software. It won't even fix using the right software in the wrong way.
In the end, reassess often; be frugal, not cheap; if it truly is a requirement to run your business, buy the most appropriate. If you've made the mistake of buying a Kenworth long hauler when you needed 3 old UPS trucks - admit it, sell it back, take your loss and get what you really need.
That's not "magical thinking", it's just common sense.
The meaning of Quality (Score:5, Insightful)
More than any other type, businesses are run by salesmen. These are people whose strongest attributes are the ability to build relationships, to communicate value, and a strong inclination to increase their personal wealth.
Increasingly, the stuff salesmen sell is based on complex technologies that, really, are beyond the reach of their comprehension. They kind of understand the products they sell, but really, they don't. If the world only had salesmen, there wouldn't be any sophisticated products.
Say hello to the engineer...a person who builds products. His strongest attributes are a desire to solve problems, a willingness to absorb the tedious but essential details needed to build a complex system, and a personality that derives gratification from doing so.
We now begin the business cycle. The salesman says, "Build me something I can sell."
The engineer says, "I will build you something that works well."
And therein begins a lifetime of the two, symbiotically, talking past each other. The engineer serves the salesman, and the salesman serves himself. But make no mistake about it: the salesman is in control.
For a salesman, QUALITY means it works well enough for him to sell more, and most importantly, to make more money for himself. For an engineer, QUALITY means it works reliably and efficiently. To be sure, QUALITY is an abstract and moving target that varies according to the eyes of the beholder. But to understand why we have the predicament described in this article, we need only understand the SIGNIFICANCE OF QUALITY TO A SALESMAN.
I would continue to expound, but then, most readers here need only reflect on their already frustrated pasts to understand the mechanics of this convenient but often vacuous relationship.
Re:All comes down to budget (Score:5, Insightful)
If a kludge works, is documented, was implemented with proper change controls, and can be repeated, is it really a kludge anymore?
Yes.
You either don't know what a kludge is, or don't have enough ability to see how fixing things or implementing something the wrong way can really be a horrible mistake that feeds on itself and creates other mistakes. Kludges aren't something you can simply document around. The rest of your post isn't really worth responding to, since it makes the false assumption that kludges are simply poorly documented behavior. If that's the worst you've seen, you're lucky.
Re:Take responsibility and stop the magical thinki (Score:3, Insightful)
Both are, IMO, essential, which is why, while I pointed at particular areas of process, my big picture message was about IT shops taking "responsibility for assuring the quality of the IT infrastructure."
Neglect of process is a symptom of people not being invested in solving problems that leads to bad results on its own, but even a good (nominal) process isn't going to work well if people aren't invested.
I prefer "responsibility"; "ownership" is, IMO, misapplied here. (Though, arguably, one of the reasons people do not take responsibility is because they don't, in fact, have ownership -- but ownership is a material relationship, and responsibility is the relevant attitude.)
But I think in substance we generally agree.
You kind of contradict yourself there: if fixing things usually requires changing the process, then "how good the process" is obviously has a fairly direct bearing on success. The key thing is that processes aren't good (or bad) in a vacuum, they are good or bad based on the effects they have in your organization, in achieving your mission; the same nominal process that is good for a group of people when considered against one mission is going to be bad for the same group of people when considered against different goals, and the same process that is good for one group of people with a given mission will suck for another group of people with the same mission, because people matter.
Re:All comes down to budget (Score:5, Insightful)
the IT department is treated as pure cost instead of something that provides strategic value.
I can't count the times I've gone in somewhere and seen major deficiencies in their IT infrastructure. I mean really bad, O-M-G size problems. And when you point them out they act like you're trying to pad your billing. Just fix whatever isn't working that day. One of them was a doctor's office.
Imagine if their patients acted that way. I don't care if I have cancer, just remove that lump in my underarm.
That's what you get when the problem is dictating the solution.
Re:All comes down to budget (Score:3, Insightful)
To be honest, though, Linux is generally very good at backwards compatibility if you statically-link everything when you compile (as is frequently the case with commercial software). The Linux system calls never change, except to add new ones once in a while, so it should be very rare that something doesn't run.
Of course, if something is compiled with dynamic links, this isn't the case, as many of the dependencies will change over the years, but that's why static linking is available, to avoid this problem. Dynamic linking is better for software that's distributed by the distro, as they can make sure all dependencies are in place. Boxed commercial software doesn't have this luxury, so it needs to stick with static linking.
The main place where people complain a lot about Linux's backwards compatibility is with drivers, but that's a design decision. In Linux, drivers are supposed to be included with the kernel. If you don't want to do that, then you'll suffer the consequences. Application software doesn't have this problem as it doesn't link directly to the kernel.
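If you want to see which camp a given binary falls into, one rough check is whether its ELF header table contains a PT_INTERP entry, i.e. whether it asks for a dynamic loader at all. The Python sketch below is only an illustration under stated assumptions: it handles 64-bit little-endian ELF images only, the function name `requests_dynamic_loader` is made up for this example, and a missing PT_INTERP is a heuristic for "statically linked", not a guarantee.

```python
import struct

PT_INTERP = 3  # program header type that names the dynamic loader


def requests_dynamic_loader(elf: bytes) -> bool:
    """Return True if a 64-bit little-endian ELF image has a PT_INTERP header.

    Hypothetical helper for illustration; other ELF flavours are rejected.
    """
    if elf[:4] != b"\x7fELF":
        raise ValueError("not an ELF image")
    if elf[4] != 2 or elf[5] != 1:
        raise ValueError("sketch only handles 64-bit little-endian ELF")

    # ELF64 header: e_phoff at byte 32, e_phentsize at 54, e_phnum at 56.
    (e_phoff,) = struct.unpack_from("<Q", elf, 32)
    e_phentsize, e_phnum = struct.unpack_from("<HH", elf, 54)

    # p_type is the first 4 bytes of every program header entry.
    for i in range(e_phnum):
        (p_type,) = struct.unpack_from("<I", elf, e_phoff + i * e_phentsize)
        if p_type == PT_INTERP:
            return True
    return False
```

Feeding it the bytes of a binary (for example `open(path, "rb").read()`, with `path` being whatever file you care about) returns True for a typical dynamically linked executable and False for one that carries everything with it.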
I was torn between modding this up and commenting. (Score:4, Insightful)
I was torn between modding this up and commenting.
I picked commenting.
This statement:
Everyone wants to work on the latest and greatest stuff, no one wants to maintain or even release patches.
is very, very true. We (Apple) have a hard time getting applicants who want to do anything other than work on the next iPhone/iPad/whatever. Mainline kernel people are difficult to hire, even though the same kernel is being used on the iDevices as is being used on the regular Macs. Everyone wants to work on the new sexy. For some positions, that works, but for most of them, you have to prove yourself elsewhere before you get your shot.
I think that, for the most part, we see the same thing in marketing for higher education (with the exception of one track, one of the universities I went to has become a diploma mill for Flash game programmers; sadly, I would not hire recent graduates from there unless they have an experience track record). There are video game classes at most universities, but while it might be sexy, you are most likely not going to be getting a job doing video games, 3D modeling for video games, or anything video game related, really, unless you get together with some friends and start your own company, and even then it's a 1 out of 100 chance of staying in business.
I don't really know how to address this, except by the people who think they are going to be the next great video game designer remaining unemployed.
-- Terry
Re:As a non-developer, this is what I see (Score:3, Insightful)
Absolutely nothing. A 24 port gigabit switch makes a great foundation for a small to medium-sized network with typical business use. It's a stretch to call it a 'core', but anybody who tells you that you need some kind of crossbar fabric chassis switch at the center of your average branch office is just trying to sell you hardware and service contracts.
Re:Don’t patch bad code - rewrite it (Score:5, Insightful)
1. Rewriting means rethinking; most legacy code is functional and is usually rebuilt in OOP. Whenever you rethink how something works it tends to change the entire behavior, to say nothing of all the new bugs you'll have to hunt down. Your customers will definitely notice this.
2. Scope creep!! Rebuilding it? Why not throw in all that cool functionality we've been talking about for the past 10 years but couldn't implement because the architecture couldn't handle it. You get the idea.
Want an example? Netscape 5 [joelonsoftware.com]
No! (Score:4, Insightful)
There is constant pressure to re-implement existing architecture. Most of the time, the people who want to do this do not have a clear understanding of the business process involved, don't realize that the existing frameworks represent years of bug fixes and are at least stable for that reason. They only think "Wow this sucks, a new one would HAVE to be better."
I'm not saying that you should never rebuild something from the ground up, but the scope of the project should be limited and the entire endeavor should be well documented and well understood from the beginning. And if the guy who's pushing for a rewrite can't demonstrate a deep and fundamental understanding of the business flows being automated, he should be taken out and shot (Or at least pummeled soundly.)
Just how much documentation can you read? (Score:5, Insightful)
The problem with the whole idea of "if we only had enough documentation and change control" is that it becomes a non-trivial event to actually read through the documentation. Let's take an imaginary system that's been in production for 5 years...assume every last drib and drab of change has been documented...now you've got a 2000 page document and several hundred change records that tell you *everything*. Except, when it comes right down to it, mastering that 2000 pages of documentation and all the changes made afterwards is a months if not years long project - hardly effective for dealing with production problems that need to be solved in minutes or hours.
The illusion being perpetrated here is that people are interchangeable, and if you just have enough documentation, you can replace Mr. Jones with 20 years of hands on experience with the system with Mr. Vishnu living in Bangalore (or even Mr. Smith in the next cube, for that matter), with a net cost savings.
Now, I'm not saying documentation is a bad thing -> lord knows, it helps to have a knowledge base you can search...but knowing what to search for is knowledge you only get by real world experience with maintaining a production system. This is not digging ditches, boys and girls, this is skilled, if not essentially artistic labor.
Simply put, people matter more than process.
Re:Take responsibility and stop the magical thinki (Score:4, Insightful)
I've found the problem to almost always be the last thing listed. It's the contractor syndrome. "If you give me $1,000,000 now, I'll save you $500,000 a year for the rest of the time you'd have used that." Well, they think you are lying. They think that you wouldn't actually save the $500,000 a year, but would take the $1,000,000 this year and add it to your budget as a permanent line item, costing them $1,500,000 a year, rather than saving $500,000.
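To put numbers on the two readings of that pitch, here is a rough Python sketch. The dollar figures come from the example above; the function names and the way the skeptic's view is modeled (the up-front sum recurring every year with no savings ever showing up) are my own assumptions about what the $1,500,000 figure means, namely the yearly gap between what was promised and what is feared.

```python
UPFRONT = 1_000_000        # one-time ask, from the pitch above
ANNUAL_SAVINGS = 500_000   # promised yearly savings, from the pitch above


def promised_net(years: int) -> int:
    """Net position after `years` if the pitch holds: pay once, save every year."""
    return ANNUAL_SAVINGS * years - UPFRONT


def feared_net(years: int) -> int:
    """Net position if the up-front sum becomes a yearly line item (assumed model)."""
    return -UPFRONT * years


def break_even_years(upfront: int, annual_savings: int) -> float:
    """Years until promised savings have paid back the up-front cost."""
    return upfront / annual_savings


# Yearly gap between the promised flow (+savings) and the feared flow (-upfront).
yearly_gap = ANNUAL_SAVINGS + UPFRONT

for year in range(1, 6):
    print(year, promised_net(year), feared_net(year))
```

Under the pitch, the project pays for itself after `break_even_years(UPFRONT, ANNUAL_SAVINGS)` years; under the skeptic's model it never does, and the two views drift apart by `yearly_gap` every year after the first.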
You can blame the IT director/manager/CIO/whatever for not being convincing enough, but there seems to be a pattern where people bid low then have massive overruns where the highest bidder would have been cheaper. As such, the people the IT person is talking to are often so jaded they don't trust anyone with price estimates.
When IT units don't take responsibility for assuring the quality of the IT infrastructure, surprisingly enough, the IT infrastructure, over time, becomes an unstable house of cards, with the IT unit pointing fingers everywhere else.
And when the IT units have the responsibility, but not the authority to fix things, what then? Most all places tie the hands of IT then complain when the solution isn't perfect.
A contrarian opinion (Score:1, Insightful)
This reads like those "articles on investing" written by mutual fund companies. After all, isn't the publisher in the business of selling software and services? If your stuff works, it will continue to work for the same requirements if you don't mess with it. Sometimes there's more risk in starting over: "Hey (insert-born-in-1990's-name here) let's rewrite the airliner's flight control software as a facebook app, it's out-of-date."
Re:I was torn between modding this up and commenti (Score:5, Insightful)
Everyone wants to work on the latest and greatest stuff, no one wants to maintain or even release patches.
I don't really know how to address this, except by the people who think they are going to be the next great video game designer remaining unemployed.
Here's how you address it: you hire one of those 9 out of 10 CS graduates who "Just got in it for the money". Had you offices in the Midwest, you'd have no problem finding programmers whose only ambition is to crunch out brain-dead code until they can move into management. Trust me, I work with these people and they're even worse than the primadonnas interested only in the "cool" things. Naturally, not everyone can be the next game programmer, or work on cool things, but you probably don't want to hire those whose only ambition is to do the grunt work.
Typically, the primadonna has to have his ego coaxed into doing the grunt work. But you can usually count on him to do it fast, and not to make a total mess of things. Granted, some people have a higher estimation of their abilities than their peers. But at least someone passionate about coding can be inspired to improve their code; they'll actually accept coding standards once reasonably explained. But here's a short list of problems with the typical "career type":
It's easier to convince a rock-star programmer that documentation is necessary than it is to convince the career-track political programmer that a race condition is a problem, that architecture matters, that maintainability and scalability are important. Just the other day, I had a department manager question the value of writing reusable code - in fact, he was so hostile as to suggest that it wasn't worth our time to make code reusable... (And not only that, but reported to my boss that my suggestion otherwise was "distracting to what we're trying to accomplish here"...)
I know the starry-eyed programmers can be a handful at times, but those indifferent to technical issues will lay a minefield in your company. Suddenly, years after they've moved on, you'll find your new hires telling you the projects they built aren't worth salvaging, that you'll have to start over, etc... I've seen these types move into management and turn an otherwise fun profession into a death march. You don't want the stupid, or the political, types of people writing code. They'll set your company up for failure every time.
Re:All comes down to budget (Score:3, Insightful)
To be honest, in all of my years as a programmer eventually becoming a full software engineer (meaning I design, implement, and maintain software solutions), doing it "The Right Way" has always led to bankruptcy. Always. Of course correlation is not causation, but for the times I've seen companies fail when "following the process" vs. "Release early and often", the latter half were the ones to stay in business.
Obsolescence (Score:1, Insightful)
There is non-trivial value in IT systems and supporting hardware and frameworks that can be expected to continue to operate for many many years perhaps as much as several decades without significant change.
This concept is contrary to the views of many "modern" software and hardware manufacturers who have a vested interest in continuing maintenance and support and therefore have no reason to invest heavily in testing as whatever bugs their customers report will be fixed next week by the next compile'n ship patch.
By letting the inmates run the asylum -- failing to insist on systems designed from the core to be maintainable, expandable and operate in a sane and correct manner you as an IT manager are only encouraging crappy products and outcomes from your vendors.
Call me a heretic but I don't see software technology improving much over the past two decades. There are a lot more choices and more opportunities to aggregate the works of others to your advantage, saving you time and money, but very little has changed in terms of underlying core design principles and fundamental understanding of the nature of the space. There are no meaningful design automation systems for general purpose applications.
Re:All comes down to budget (Score:3, Insightful)
To be honest, in all of my years as a programmer eventually becoming a full software engineer (meaning I design, implement, and maintain software solutions), doing it "The Right Way" has always led to bankruptcy. Always. Of course correlation is not causation, but for the times I've seen companies fail when "following the process" vs. "Release early and often", the latter half were the ones to stay in business.
You can do "release early, release often" within an ITIL framework, just because most places implement it poorly doesn't mean it can't be done well.
Re:All comes down to budget (Score:3, Insightful)
Odd, ain't it? Those sales, I mean, training meetings are always in a holiday resort. When your boss is at a "business meeting" at some place near the sea or high up in the mountains (Summer and Winter, respectively), you better make some room in the next few weeks in your schedule, you're gonna get some new hard- or software.
Re:All comes down to budget (Score:4, Insightful)
This is because IT is managed by managers, not engineers.
If all managers had coalface IT backgrounds at least (even to the point of just helpdesk) the problem would not be there.
As usual strategic and policy decisions are being made by people who don't understand the nuts and bolts.
Would you design a car by having a committee of non-engineers approve every major decision? No. But that is how IT infrastructure seems to be built...
Re:like bubblegum under a desk... (Score:3, Insightful)
[PHB] So you're saying that quick and dirty fixes in the past worked ... and some of them are still working? Must be good, then! [/PHB]