Getting Started With DevOps

DevOps has been defined in this article by Stephen Nelson-Smith, and the executive summary is that operations and development should no longer be separate functions (and never should have been) and need to start working closely together.

Why? Without working together, failures inevitably occur. For example, at the last Boston DevOps Meetup, one of the attendees, a developer, was commenting on the disconnect between him and his sysadmin and how their relationship was unlike the devops model.

“That all sounds nice for you guys, but my sysadmin at work doesn’t seem to care about any of this. He’s not engaged.”

The developer went on to give examples of times when the code broke in production because the servers were configured incorrectly, or when the ops person didn’t help debug a problem because he felt the problem was the developer’s to deal with.

We all talked about this for a while, when I realized that in another bar in another town, that developer’s sysadmin was saying to his friends, probably over a beer, “That all sounds nice for you guys, but my developers don’t care about any of this. They’re not engaged.” The sysadmin probably went on to talk about how the developers don’t keep their configurations sane and how they never debug the problems they create.

This is why we need DevOps. (And probably, really, DevOpsQASales, but that’s another post).

How do you get started with DevOps at your work?

If you are a developer, invite your ops guys to your scrum or weekly meeting. Make sure they come, and always ask them whether what you are talking about has an impact on their work.
If you are an ops guy, invite your developers to your scrum or weekly meeting. Tell the developers about upcoming changes in each environment. Ask the developers what is going on in their world.

If you’re a small team, bring everyone; if you’re a large one, bring one developer from each project. But don’t invite the development managers or the ops managers. Invite the people who do the work. You need to have the developers who will say, “We’re implementing a feature that uses these extra libraries, can they be installed in production?” and the ops people who will say, “Oh, if you use v2.8 of that library it won’t work on the older machines because of x, can you guys use v2.9?”. You want this to happen before you go to production.

Adding meetings to your calendar always sucks, but you’ll save yourself headaches later by talking to each other, and more importantly, you’ll buy into the projects that everyone is working on. You’ll believe in the work others at your company are doing and want to help if there are issues.

The developer above should be talking to his ops guys on a daily basis. They should go for a beer and talk about technical problems at work. I guarantee one of them will say, “Oh for that I just do x y and z, and it works great.” and this will be the solution to a nagging problem.

Technology, of course, can help a lot. In the example above, the ops guy should:

Create virtual machine images of production on a regular schedule, and require the developers to run their code on those images as part of the checkin cycle.
Use a configuration management tool such as puppet or chef to keep staging and QA environments matching production environments.
Talk to the developers about the network, hardware and software that make up the production environment, including details of the resource limitations of those components.
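
To make the "staging and QA matching production" idea concrete, here is a minimal sketch in Python. It assumes each environment can give you a plain mapping of package name to installed version; the function name find_drift, the package names and the version numbers are all made up for illustration. A tool like puppet or chef does the real enforcement, this only shows the kind of check you want to be automatic.

```python
def find_drift(production, staging):
    """Compare two {package: version} mappings.

    Returns {package: (production_version, staging_version)} for every
    package whose version differs or that exists in only one environment
    (the missing side is reported as None).
    """
    drift = {}
    for name in sorted(set(production) | set(staging)):
        prod_version = production.get(name)
        staging_version = staging.get(name)
        if prod_version != staging_version:
            drift[name] = (prod_version, staging_version)
    return drift


# Hypothetical inventories; real ones would come from your package manager
# or your configuration management tool.
production_packages = {"libexample": "2.9", "webserver": "1.4"}
staging_packages = {"libexample": "2.8", "webserver": "1.4", "debugtool": "0.3"}

for package, (prod, stage) in find_drift(production_packages, staging_packages).items():
    print(f"{package}: production={prod} staging={stage}")
```

If a check like this runs on a schedule and someone gets told about the output, the "it worked in staging" conversation happens before the release instead of after it.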

While the developer should:

Write up details of the software’s operating requirements in terms of resource usage, environment configuration, and other dependencies.
Package the software properly, so the ops people can review package manifests on upgrades automatically to track changes in QA and staging.
Codify operational issues in unit tests (lossy network unit tests, disk full unit tests, out of memory unit tests, blocked port unit tests).
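
As an example of the last item, a "disk full" unit test might look something like the sketch below. It uses Python's standard unittest and unittest.mock modules to simulate the failure rather than actually filling a disk. The save_report function is a hypothetical stand-in for your own application code, and returning False on a full disk is just one possible policy, not a recommendation.

```python
import errno
import unittest
from unittest import mock


def save_report(path, text):
    """Hypothetical application code: write a report to disk.

    Returns True on success and False if the disk is full.
    Any other I/O error is re-raised.
    """
    try:
        with open(path, "w") as handle:
            handle.write(text)
    except OSError as error:
        if error.errno == errno.ENOSPC:
            return False
        raise
    return True


class DiskFullTest(unittest.TestCase):
    def test_save_report_survives_full_disk(self):
        # Simulate "No space left on device" the moment the file is opened.
        disk_full = OSError(errno.ENOSPC, "No space left on device")
        with mock.patch("builtins.open", side_effect=disk_full):
            self.assertFalse(save_report("report.txt", "data"))

    def test_other_errors_are_not_swallowed(self):
        # A permissions problem is not a disk-full problem and should surface.
        denied = PermissionError(errno.EACCES, "Permission denied")
        with mock.patch("builtins.open", side_effect=denied):
            with self.assertRaises(PermissionError):
                save_report("report.txt", "data")
```

Saved in a test file, this runs with python -m unittest. The lossy network, out of memory and blocked port cases can follow the same pattern: have ops describe the failure, then have the test inject it.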

Working together:

Use build tools such as Hudson to trigger jobs on checkin – jobs which run the full set of unit tests on a production-like environment/virtual machine.
Define hand off procedures for new features, which require checkoffs from ops, QA and development.
Push the ops deployment scripts and methods to the developers’ workstations so the developers can use them.
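
The hand off checkoffs can be enforced by a script rather than by memory. Here is a minimal sketch, assuming sign-offs are recorded somewhere as a simple mapping of team name to approved or not. The team names and the function names are illustrative assumptions, not part of any particular tool.

```python
# Teams that must approve a feature before hand off (assumed names).
REQUIRED_SIGNOFFS = frozenset({"ops", "qa", "development"})


def missing_signoffs(signoffs):
    """Return a sorted list of required teams that have not approved yet.

    `signoffs` maps a team name to True (approved) or False (not approved).
    Teams absent from the mapping count as not approved.
    """
    approved = {team for team, ok in signoffs.items() if ok}
    return sorted(REQUIRED_SIGNOFFS - approved)


def ready_for_handoff(signoffs):
    """A feature is ready only when every required team has approved."""
    return not missing_signoffs(signoffs)


# Hypothetical state of one feature: QA has not signed off yet.
feature_signoffs = {"ops": True, "development": True, "qa": False}
print(ready_for_handoff(feature_signoffs))
print(missing_signoffs(feature_signoffs))
```

A build job could run a check like this and refuse to promote a feature until the list of missing sign-offs is empty.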

Also, whenever possible, automate whatever can be automated. That’s what computers are really good at.

These are just a few examples. There’s a lot to talk about in later posts, such as more detail on the heavy use of automation, Agile practices in operations, proper storage of documentation (version control! plain text formats!) and more.

I hope this is a useful introduction to some DevOps concepts as I’ve understood them. Please comment below about your own experiences.

February 17, 2010 devops

What DevOps means to me… » Kartar.Net says:

February 18, 2010 at 22:56

[…] the last year or so a bunch of presumptuous European sysadmins and developers, joined by some of their American brethren and even a couple of us antipodeans (there are others too!) have been talking […]

Geoffrey Thomas says:

February 21, 2010 at 22:34

My experiences are different because my big projects are all for an all-volunteer student group, but I’ve always found the split between dev and ops to be rather weird, by which I think I mean that it’s not a division that occurs to the unprejudiced mind. All of our projects’ production servers are maintained by the core devs, and the development servers are accessible to the entire dev team, and they’re kept in sync as to how they’re run. The occasional time we need to interact with a separate ops team (e.g., when we need beefier hardware or VMs run by the school), it’s actually kind of odd for us as developers not to be assumed to be running and installing the whole machine. If I’m going to run this service, I want either myself or one of the other developers to have made sure the machine is the way we want it.

We also do what I guess you’d call DevOpsHelpdesk; all the service developers/maintainers receive e-mails to the support queue. This helps us be extremely responsive to users’ questions and have a very good sense of what sorts of things our users would like to see and are using our stuff for, but it’s pretty close to reaching the point where it doesn’t scale. Still, I think it’s good for developers and maintainers to be regularly watching the support queues; your help desk might just look at the system and think something isn’t possible, where the developers might have had it as an easy feature all along that they haven’t implemented because they didn’t think there was demand for it.

Adam Fletcher says:

February 22, 2010 at 15:04

Hi Geoffrey,

I agree with you on having developers see into the support queue and having them respond to support tickets (and I agree it doesn’t scale). You may find that as the organization grows and the developers are put on work that is more “important” than answering support tickets, you’ll still want to rotate developers into handling some support tickets for a week at a time, or have a way for support personnel to escalate some tickets to developers very quickly. That’s an interesting problem that I haven’t spent much time thinking about, but it fits well with the core tenets of DevOps as I see them.

The Simple Logic » Blog Archive » DevOps Documentation says:

February 23, 2010 at 22:42

[…] a previous entry I wrote that the key to removing the wall between developers and operations is communication. I […]
