IT

Should Developers Have Access To Production? 402

WHiTe VaMPiRe writes "Kyle Brandt recently wrote an editorial exploring the implications of providing developers access to the production servers of a Web site. He explores the risk introduced by providing higher level access as well as potential compromise solutions."
This discussion has been archived. No new comments can be posted.

  • by mr_stinky_britches ( 926212 ) on Wednesday August 25, 2010 @12:36PM (#33370450) Homepage Journal

    And the consensus is ... NO

    Who let this question through? It doesn't even seem controversial. I am not aware of any good reason to routinely give developers access to production.

  • by SatanicPuppy ( 611928 ) * <SatanicpuppyNO@SPAMgmail.com> on Wednesday August 25, 2010 @12:45PM (#33370602) Journal

    See, to me this is more an issue of devs not obeying the rules. They damn well shouldn't be changing production code, and they damn well shouldn't be linking code from other servers.

    Either your devs are a bunch of barely trained lunatics, or they're breaking the rules in a vain attempt to get things done in a timely manner.

    Most times, when I see devs screwing with production it's either a "hero" coder who is way too good to use best practices, or a situation in which the environment is so hostile that the "best" solution seems to be breaking the rules.

    I once did some contract work for a company where the QA and testing process took a minimum of two weeks for the most trivial changes, and where the admins on the production servers refused to deploy things like security patches without a testing period that ran close to a month. The devs there had a hundred tricks for sneaking their code into production, and for linking production code to the development servers, in an attempt to meet their productivity goals.

    Fucking nightmare. Once we ironed out the QA thing, and split the admins into two groups (one that maintained, and the other that upgraded and approved changes), the whole process evened out and the devs stopped screwing around on production.

  • Re:For me (Score:3, Interesting)

    by x2A ( 858210 ) on Wednesday August 25, 2010 @12:45PM (#33370608)

    Eugh yeah I hate that. So what I try to do is code in such a way that if a bug should occur, the whole thing stops working, that way there's no point in my /not/ fixing it on the production server! I'm a freakin genius! No of course I'm joking, but a recent project has hit some problems where I've been able to explain, and the client has actually been able to understand, the challenges of trying to reproduce an intermittent undiagnosed problem without touching the production code (i.e., it's just not worth the time trying to do), and so lets me fiddle with the code. Usually tho it's enough for me to be able to add logging code where it's needed and there's no end-user-visible effects. There've also been problems that languished until I had the go-ahead to try to resolve them on the live system, at which point I resolved them quickly and without interruption, so they're getting more okay with letting me do it that way. Sometimes I'll just fix a problem and not tell them, to avoid all the hassle. At the end of the day, I know better than them (which is why they come to me) and sometimes you do just have to make a judgement call. BUT, it's not a massive project with many developers, and in those conditions obviously you need to retain more order.

    rm -rf /^H^H^H^H^H^H^H^Hoops wrong window

  • by merlinokos ( 892352 ) on Wednesday August 25, 2010 @12:45PM (#33370610)

    I work in an environment where the devs fix bugs before adding features, so the code is stable almost all the time. I have less than 1 callout a week that's caused by something a dev has done to the code.
    We hire the best devs, and work in an environment where fixing bugs is more important than adding features. The result is that our devs get full access to production, and even offer to provide support in order to ensure that they're the ones that are woken up if something they've broken falls over OOH.
    I've been at my current company long enough that I'd forgotten there were places where devs and ops didn't trust each other.

  • Re:No correct answer (Score:3, Interesting)

    by RyuuzakiTetsuya ( 195424 ) <taiki@c o x .net> on Wednesday August 25, 2010 @12:46PM (#33370636)

    Bingo, and I don't think the line is drawn at the size of the company but how mission critical the application is. Granted, in a 20 person firm versus a 20,000 person firm, the developer's probably also the administrator. But OTOH, if your business is 20 people big and one of them is a developer, it's easy to assume it's probably IT based and as such, some sort of administrative control is probably a good idea in general. Think about it. Well, sure, 20 people but how many machines? Half rack? rack? 5 racks? A whole datacenter?

  • by Todd Knarr ( 15451 ) on Wednesday August 25, 2010 @12:50PM (#33370688) Homepage

    Speaking as a developer, I want/need read-only access in production. All too often I need to dig out information while troubleshooting, and most commonly I don't know what all bits I'll need when I start. If it were easy to identify exactly what I'd need to find the problem, I'd usually already know what the problem is. The hard ones are the ones I can't replicate in development and I only have a starting point, something that won't identify the problem but might help me narrow down where to look next. In those cases the only place I can look is production (since I can't make it happen in a controlled development environment) and I can't give the admins a list of what I'll need (because I need to dig through logs and config files before I'll know what I need to look for next). And if we've gotten to this point, it's probably a priority problem impacting production so it needs to get fixed Right Bloody Now.

    OTOH, while I may need to look at production, I don't need and don't want the ability to modify production except by going through the admins. This, of course, also requires admins who can follow basic instructions like "Look at config file FOO. Find the line in section X that starts with Y. Its value should be XYZZY followed by the number 1. Change that 1 to a unique number for that machine/instance. Repeat this for every machine/instance.". But all too often the response is "That's too complicated. Can you just give us config files to install?". And of course when I ask for the current config files, so I can be sure I'm not overwriting any other modifications to them (which may have happened since the admins control them and do modify them), I get "We can't do that, they've got production passwords in them.". Now all I can do is throw up my hands and go "Whatever.".
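
    The kind of instruction quoted above ("find the line in section X that starts with Y, change the trailing 1") can also be handed over as a small script that touches only that one line, so nobody has to ship whole config files or see the passwords in them. What follows is only a sketch: it assumes an INI-style layout with `[section]` headers, which the comment does not specify, and the function name `set_instance_number` is hypothetical. FOO, X, Y and XYZZY are the placeholders from the comment.

```python
import re


def set_instance_number(config_text, section, key, instance_id):
    """Change `<key> ... XYZZY1` inside [section] to `XYZZY<instance_id>`.

    Assumes an INI-style file (an assumption, not stated in the thread).
    Every other line, including passwords, is passed through untouched.
    """
    out = []
    in_section = False
    changed = False
    for line in config_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            # Track which section we are currently inside.
            in_section = stripped[1:-1] == section
        elif in_section and stripped.startswith(key):
            match = re.search(r"XYZZY1\s*$", line)
            if match:
                line = line[:match.start()] + f"XYZZY{instance_id}"
                changed = True
        out.append(line)
    if not changed:
        # Refuse to report success if the expected line was not there.
        raise ValueError(f"no '{key}' line ending in XYZZY1 found in [{section}]")
    return "\n".join(out) + "\n"


sample = "[other]\nY = XYZZY1\n[X]\nY = XYZZY1\npassword = secret\n"
result = set_instance_number(sample, "X", "Y", 7)
```

    Because the script raises when the expected line is missing, an admin running it per machine finds out immediately if a file has drifted from what the developer assumed.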

  • by TarPitt ( 217247 ) on Wednesday August 25, 2010 @01:09PM (#33370974)

    I worked as an IT auditor for a very big public accounting firm. Reviewing IT controls was a key part of the financial audit (and more so now with Sarbanes-Oxley).

    If I found developers had access to production, it was automatically a "no reliance" finding.

    This means the financial applications are inherently untrustworthy, so the financial auditors would have to review original source documents for validation.

    "No reliance" meant the audit became much more expensive as a result.

    Also - if the auditors can't rely on the financial reports, should management?

  • by PPH ( 736903 ) on Wednesday August 25, 2010 @01:18PM (#33371112)

    Good point. Lots of people are jumping in with remarks about developers tweaking production code. But there are other sorts of access. As a 'developer' I've had very good experiences with having shell access to production systems to read logs, inspect packages and even run little test suites to verify the configuration of the production system.

    For example, at one outfit, I was one of the few non administrative users with shell access to production servers. One day, everything came to a screeching halt. Nothing worked and the admin claimed no changes to the application. So I logged in and got a message to the effect that the /tmp directory was inaccessible. It turns out that IT management was under orders to clean, sweep and get rid of all unneeded 'junk'. So this manager asks his admin what each directory structure is for and is told that /tmp is where 'junk' is written. Orders went out that 'junk' was not to be kept on production servers and /tmp was to be deleted. Now, the admins knew better. But at this company, nobody bucks the chain of management. If the boss says, "Bolt the wings on this one backwards", you do it and move on. Being outside that chain of command, I was able to get things put back the way they were supposed to be. But having shell and read access to that system was what enabled me to see the problem (of course, being a *NIX geek gave me the experience to know what /tmp was).

    There's a large debate going on in many organizations on how much information to give each employee. DoD security requirements aside, giving everyone broad read access empowers employees to handle exceptions and solve problems without having to go up through the management chain. The down sides are: When it's easy to fix instances of problems, some people settle for that. Rather than searching for the root causes and making the necessary changes to eliminate them, the decision is often made to maintain the status quo, because it's so easy to fix things when they inevitably break. It creates the image of the hero or industrious worker when one can be seen to jump in and save the day. Repeatedly. I'm motivated by sloth. I like fixing things so they don't keep bothering me. The other down side is that undocumented workarounds make processes hard to reorganize and eventually outsource. If one needs to move their IT process offshore, for example, it's much better to have everything compartmentalized and documented. This minimizes the amount of information the supplier will need in order to perform the job.

  • With Great Power... (Score:3, Interesting)

    by Sandor at the Zoo ( 98013 ) on Wednesday August 25, 2010 @01:22PM (#33371158)

    Of course developers should have some level of access to the production environment. No matter how good your test environment is, it's not going to match the live server in load, or what's in cache, or the concurrent access to some resource, etc.

    Our process was to have one person with access, investigating whatever problem via the SQL command line, or the Rails console (let the RoR jokes commence), with another person watching, to make sure they were doing select * and not update or delete. Even then we'd execute stuff in a transaction or sandbox so that we weren't making any permanent changes, although changes to memcache generally can't be rolled back so easily.
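
    The "execute stuff in a transaction so nothing is permanent" habit can be sketched as a wrapper that always rolls back, whatever was typed. This is an illustration only: it uses Python's built-in `sqlite3` with an in-memory database as a stand-in for the production database mentioned above, and the helper name `rollback_only` and the `users` table are hypothetical.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def rollback_only(conn):
    """Yield a cursor and roll back on exit, even if the block succeeded."""
    cur = conn.cursor()
    try:
        yield cur
    finally:
        conn.rollback()  # discard anything the investigation changed
        cur.close()


# Stand-in for the live database (assumption: in-memory SQLite).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES ('alice')")
conn.commit()

with rollback_only(conn) as cur:
    # An accidental write during investigation...
    cur.execute("UPDATE users SET name = 'mallory'")
    inside = cur.execute("SELECT name FROM users").fetchall()

# ...is gone once the block exits.
after = conn.execute("SELECT name FROM users").fetchall()
```

    This covers only the database session. As the comment notes, side effects outside the transaction, such as changes to memcache, are not undone by a rollback.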

    I've seen admins, who are adamant that dev not be allowed to change anything, change psql configurations at a whim, crippling DB performance. And then blame dev for poor response times. That's so not cool.

  • Re:For me (Score:3, Interesting)

    by PotatoFarmer ( 1250696 ) on Wednesday August 25, 2010 @01:32PM (#33371338)
    Use an automated process that rebuilds your test environments nightly from production backups. Test environment synchronization and backup verification rolled into one.
  • by recharged95 ( 782975 ) on Wednesday August 25, 2010 @02:01PM (#33371756) Journal
    Then again, when I interviewed with Google for youtube:

    "We develop on the main servers, it's typical here, but now, aside from that and more importantly, I have a question : can you show the result of inserting the following values into an empty AVL Tree ... in a Python context"

    Being more of a software engineer than a pure CS wonk (couldn't answer the question completely), and having worked on spacecraft control software... let's just say not working there was mutual.
  • by Bourdain ( 683477 ) on Wednesday August 25, 2010 @02:10PM (#33371838)

    If I screw up, people can't get the correct pills. It's fun to make other people live dangerously. :-p

    FTFY. Well, for certain values of "pharmacy benefit management system". If your production hacking can botch scrip fulfillment, please say what company you're working for so I can try to avoid it like the plague it is.

    I don't know if Blue Cross Blue Shield has fixed this but, as of a few weeks ago (and this probably has existed for a while), living in EST has made it impossible for scrips to be fulfilled via insurance between midnight and 3AM. This is because, according to the late night pharmacist who is familiar with the issue, the servers are in PST and won't allow fulfillment from anything but the "current day" regardless of time zone. Too bad the devs there don't understand time zone adjustments / UTC/GMT. Yet again, non-profit environments don't tend to attract the swiftest of folk in general.
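
    The failure described above is consistent with a check that compares wall-clock dates across zones. The sketch below is a guess at that class of bug, not the insurer's actual code: both function names are hypothetical, and fixed UTC offsets (-5 for EST, -8 for PST) are used only to keep the example self-contained. Real code would use a time zone database so that daylight saving is handled.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets for illustration only (assumption: no DST handling).
EASTERN = timezone(timedelta(hours=-5), "EST")
PACIFIC = timezone(timedelta(hours=-8), "PST")


def accepts_claim_naive(claim_time, server_now):
    """Buggy rule: the claim's local date must equal the server's local date."""
    return claim_time.date() == server_now.date()


def accepts_claim_tz_aware(claim_time, server_now):
    """Convert the server's clock into the claim's zone before comparing dates."""
    return claim_time.date() == server_now.astimezone(claim_time.tzinfo).date()


# 1:30 AM Eastern is still 10:30 PM the previous day on a Pacific server.
claim_time = datetime(2010, 8, 25, 1, 30, tzinfo=EASTERN)
server_now = claim_time.astimezone(PACIFIC)

naive_result = accepts_claim_naive(claim_time, server_now)
aware_result = accepts_claim_tz_aware(claim_time, server_now)

# After 3 AM Eastern the two dates line up again and the naive rule passes.
later_claim = datetime(2010, 8, 25, 3, 30, tzinfo=EASTERN)
later_naive = accepts_claim_naive(later_claim, later_claim.astimezone(PACIFIC))
```

    With these inputs the naive rule rejects the 1:30 AM claim and accepts the 3:30 AM one, which matches the midnight-to-3AM window the pharmacist described.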

  • Re:For me (Score:3, Interesting)

    by gorzek ( 647352 ) <gorzek@gmaiMENCKENl.com minus author> on Wednesday August 25, 2010 @02:12PM (#33371850) Homepage Journal

    The whole article is absurdly vague, anyway. Sometimes developers need access to production--such as on a critical system--and sometimes they don't. Had the article been written toward a narrower domain, something more specific than just "Web sites," it might actually be useful. As it is, it's too light and fluffy to have much real-world impact.

  • by ChronoFish ( 948067 ) on Wednesday August 25, 2010 @02:16PM (#33371896) Journal
    For a one man show the answer is self evident.

    For a small web company developing "brochure-ware" - probably more efficient.

    For a small team it's ideal to have individual sandboxes - with one sandbox listed as "staging". Assign the lead developer to turnover code to production. Individual developers have access but are told not to touch anything. They will typically sift through live environment making sure it matches what is in their sandbox, looking at logs, etc.

    For a mid-size team you need one person for maintenance (which includes monitoring nightly builds, responding to code turnover requests, managing automated testing). Even more critical if the code you write is compiled, fragile, or highly sensitive. - Individual developers don't have access to the live box - maybe the team lead will.

    For large teams or small team "units" part of a large production shop : Several layers of "staging and testing" will exist. Code turnovers are mostly automated. Developers don't have access. Automated rollbacks are possible from a robust code management system.

    The key is discipline. If you find yourself modifying live code - you're not disciplined. It means you're not willing to insert logging code and would rather pollute the production environment. There should never be a need to copy from production back to a sandbox (that is what version control software is for!) And version control files should never live on the production server (i.e. in Subversion you never do a checkout of code on the production server - you do an export instead).
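
    The export-not-checkout rule is easy to verify mechanically. As a rough sketch (the function name `find_vcs_metadata` and the list of directory names are assumptions, and a temporary directory stands in for a real deployment path), a post-deploy check could walk the deployed tree and flag any version control metadata left behind:

```python
import os
import tempfile

# Metadata directory names for common tools (assumed list, not exhaustive).
VCS_DIRS = {".svn", ".git", ".hg", "CVS"}


def find_vcs_metadata(root):
    """Return sorted paths of version control directories found under root."""
    found = []
    for dirpath, dirnames, _filenames in os.walk(root):
        for name in dirnames:
            if name in VCS_DIRS:
                found.append(os.path.join(dirpath, name))
        # Do not descend into the metadata directories themselves.
        dirnames[:] = [d for d in dirnames if d not in VCS_DIRS]
    return sorted(found)


# Demo: a fake deployment that was checked out rather than exported.
with tempfile.TemporaryDirectory() as deploy_root:
    os.makedirs(os.path.join(deploy_root, "app", ".svn"))
    os.makedirs(os.path.join(deploy_root, "app", "lib"))
    hits = find_vcs_metadata(deploy_root)
    expected = [os.path.join(deploy_root, "app", ".svn")]
```

    A non-empty result means someone ran a checkout on the box, and the deploy can be failed before it goes live.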

    Even with controls in place, there may be a tendency to "develop on production by proxy". Which means instead of re-creating the problem in development, the developer is saying "here try this, here try this, here try this". The team lead should recognize this and put a stop to it.

    -CF
  • Re:For me (Score:5, Interesting)

    by mwvdlee ( 775178 ) on Wednesday August 25, 2010 @03:11PM (#33372444) Homepage

    In the mainframe shop we used to have 5 stages; (production, shadow (with similar load to production), functional acceptance, system integration and development), next to that 2 well secured "emergency" stages linking to prod and shadow and a single "free for all" development area outside the control of the basic stages.

    Mainframe shops tend to be much more closed and mature than more modern environments, and hence much less goes wrong.

    I've also worked in a Java shop at the same company, where they had 3 stages and a local environment for dev, but the stages were much less controlled and you could easily skip straight to production. Obviously only the most experienced of programmers did this and only when they were absolutely certain. Obviously quite some more fixes went wrong on production.

    Currently working in an environment without stages, I try to work on test copies as much as possible but the temptation of bugfixing directly in production is quite large.

  • Re:For me (Score:4, Interesting)

    by tomhudson ( 43916 ) <barbara,hudson&barbara-hudson,com> on Wednesday August 25, 2010 @03:42PM (#33372870) Journal

    ... and really bad planning happens all the time in the real world.

    Look around you. Try to convince yourself that all this was properly planned. The real world is messy.

  • by jhughe90 ( 668051 ) on Wednesday August 25, 2010 @05:36PM (#33374350)
    I've worked for 3 different Fortune 500s over the last 11 years, a defense contractor, a telco, and a bank. Big companies, big IT departments, many many custom-written applications.

    Reading a lot of comments it looks like there's a wide variety of definitions for some of the job titles and roles people are discussing here, so I'll list how I see them:

    * System Admin - Person(s) responsible for the hardware and supporting (OS, Web service, code language and client libs, JVMs, etc) software. They do not in any way support the applications running on said system and would be incapable of debugging or supporting an *application* problem even with a gun to their head. Most can only describe 2-3 sentences of what the applications even do. They do not report to or answer directly to the application teams. They also do NOT install application code.

    * Database Admin - Only want to address roles here. At every location, the actual application data stored in the database is NOT the role or responsibility of the admin. It belongs to the application team and any changes are their job and their accountability. The DBA only deals with schemas, packages, procedures, scripts, access roles and grants, etc. DBAs should NOT MANIPULATE DATA. Asking or allowing them to do so opens up a never ending blame game and is counterproductive. If you want to create some title and role within the application team where all data manipulation funnels through, that's the way to do it.

    * Implementation Specialist (Code Migration) - Trained monkeys who are supposed to follow a set of pre-delivered instructions for deploying application changes. In my experience their technical knowledge is limited, they cannot verify copy/paste correctly, and screw up (transferring ZIP files in ASCII instead of binary) more than they succeed. I don't feel this position is even necessary. The PROCESS is necessary and it can be performed by anyone, even a developer, as long as they switch their role hats before starting and are held accountable for accurately following the deployment instructions given.

    * Production support - They act both in a technical and relationship role, being the contact point between the customer (internal or external) and the application team when issues arise. Generally have read-only access to production. They are able to debug many problems and resolve a few, but definitely not all of them. They do not participate in any part of the development lifecycle processes.

    * Developers - Not going to discuss or debate any pre-production roles here since it's irrelevant to the topic. Developers are the only ones I would be confident could debug ANY problem. They are going to need some reasonable level of access to production, logging, or information if you want to have an application that can maintain high availability and recover quickly from any type of outage.

    If your definitions to these roles differ significantly, then my answer for your company's situation would change.

    Depending on the size of the application and the team allocated to run it, I've performed up to 4/5 roles and was pretty much the 5th as well since the Sys Admin could only barely squeak by supporting Windows 2k and definitely had zero knowledge of any of the supporting software. Are you going to hire someone to do 6 hours of work per month just to separate the responsibilities? Of course not. So the OP's generalized question is open to a million different interpretations because of all the different variables that weren't specified.

    My most recent application team recently went through our production lockdown after finally migrating over an application suite purchased from another company. Developers and Prod Support have read-only access. Database passwords used by the applications have been restricted down to just a couple individuals. When changes need to be made to either the application or data, an Emergency ID is checked out to a requesting individual with the appropriate access level,
