I know, continuous integration/delivery (CI/CD) is not a new concept. But let's not take it for granted that it is widely adopted or even well understood, at any level. In fact, I was surprised to learn how many organizations aren't following these practices. Executives seem to disregard it - they shouldn't. I won't speculate why; I'm sure there are many reasons. Those who are responsible for their software development life cycle (SDLC) often lack the support - budget and resources - and possibly the full-scope understanding. And vendors sell it as if it were a one-size-fits-all solution. It is not.
For me, the past two years have been a gradual evolution towards continuous delivery of our applications, the final goal of which was just recently implemented. The only attention I've given to the matter this past week has been writing this blog and watching the process just work.
"That's fantastic," you say?
It is! It is fantastic.
For those less technically inclined, imagine having to merge all your team's individual edits into a single Word document - making sure that it all fits the narrative, is grammatically correct, and has the same voice. And... you have to do it several times a day on several documents. I'd give up if I had to do it once. And I bet you'd pay well for a process that does it all automatically - with contextual validation, grammar, and spell-checking... and possibly wordsmithing. CI is that solution for the same challenge that development teams face. Take that one step further, and imagine that the auto-merge solution then automatically distributes the document back to the team for review, then on to the intended recipient once approved. That would be crazy awesome, right? All of your and your team's time focused only on the product. That's continuous delivery.
Sounds simple, right? That's what the bloggers and the vendors sold me, too. A developer checks in their code, it automatically gets merged, it automatically goes.
And that's what my first iteration did, albeit with some gaps for human intervention. It was a proof of concept, granted, and though it ultimately checked off most of my boxes, I learned quickly that it was only the beginning of a journey that was not so simple.
I kept copious notes during the process, and I found these common themes in my comments.
Define the metrics up-front
CI/CD, whether you outsource or not, requires a lot of work and investment. It's a no-brainer for any tech, but it still has to be sold to those who are paying ... and buyers will want numbers. More importantly, buyers want empirical progress reports, which will be based on these findings.
Measure the entire process, as each of your targets will need a measurement.
- How much time and money is spent doing builds and deployments?
- Digging deeper, how much time is lost while developers await completion of their pull-requests, and how many defects are introduced in the conflicts that inevitably occur when merging aged pull-requests? A pull-request is akin to a team member asking you to merge in their Word document changes. A merge-conflict is what happens when more than one person has edited the same section, and you didn't pick the right edits.
- Consider, too, the Zeigarnik effect. The longer the delay between development and testing, whether QA or UAT, the harder it is for the team to keep the details of the change in focus. Changes that are introduced too long after the development phase will require that the developers and stakeholders re-learn why the change was made in the first place. While subjective, it does consume unnecessary time as it often leads to misunderstandings and scope-creep.
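The pull-request question above can be turned into a number fairly cheaply. What follows is a minimal sketch, not a prescribed method: the timestamps are made up for illustration, and the `summarize` helper and its 24-hour "stale" threshold are assumptions you would tune to your own team. It measures wall-clock hours, not working hours.

```python
from datetime import datetime

# Hypothetical pull-request records as (opened, merged) pairs.
# Illustrative values only; export real ones from your source-control tool.
pull_requests = [
    (datetime(2024, 1, 8, 9, 0), datetime(2024, 1, 8, 15, 0)),     # 6 hours
    (datetime(2024, 1, 9, 10, 0), datetime(2024, 1, 11, 10, 0)),   # 48 hours
    (datetime(2024, 1, 10, 13, 0), datetime(2024, 1, 10, 19, 0)),  # 6 hours
]


def wait_hours(opened, merged):
    """Wall-clock hours between opening and merging a pull-request."""
    return (merged - opened).total_seconds() / 3600


def summarize(prs, stale_after_hours=24):
    """Count the pull-requests, average their wait, and count the stale ones."""
    waits = [wait_hours(opened, merged) for opened, merged in prs]
    return {
        "count": len(waits),
        "mean_wait_hours": sum(waits) / len(waits),
        "stale_count": sum(1 for w in waits if w > stale_after_hours),
    }


summary = summarize(pull_requests)
print(summary)
```

Tracking the same summary sprint over sprint gives you the kind of progress report the buyers will ask for.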
Engage the execs from the gate
After all, your current position is costing them money, and the solution will save them money. Don't move on until they've bought in. The process is disruptive and will require your focus, and the execs will need to understand this.
Recognize, though, that by exposing the potential for cost improvement, you are also acknowledging a key deficiency that the execs didn't know to look for; in other words, execs assume that their experts are expertly efficient. Why wouldn't they? And if you don't have a build and/or deployment manager, which many organizations do not budget for, that means that a likely well-compensated worker is doing grunt work, which doesn't look good for anyone, and it significantly limits the ability to market the achievement. So... you may have to accept that the only pat on your back will be by your own hand.
Engage everyone else
Everyone! All the developers, sponsors, QA, and, um, ops.
I made the mistake of thinking that I knew the process and that I didn't need to have those conversations. I was wrong, and I really regretted it. I essentially had to rework the entire process because, for one example, I failed to consider that QA needed some level of stability in their environment, which meant that there needed to be a 'gate' before introducing new code.
Understanding the process from both the current and the desired perspective of all those involved will lead to a much better process. And it will help you establish the fundamental components beforehand - like a well-defined Scrum board, the columns of which define the overall process within which your solution must fit.
It is a great opportunity to break barriers and truly engage ops, as well. I needed their help often to make sure the environments were appropriately provisioned, and they, much to my surprise, were very happy to understand what all of our servers were actually doing.
You should leave this stage with a well-defined end-to-end process, without having made a single configuration change. I acted out the process with refinements several times (after my initial PoC) before settling on a tenable solution.
You need to know all of your tools, no matter what
I made the mistake of picking the CI/CD tool first. We were using Microsoft DevOps, so it seemed a natural progression. The problem was that I kept deferring to the tool because, well, Microsoft must know better than me. As I worked through each of the steps, trying to determine whether it was me or the tool that was faulty, I discovered that I wasn't escaping the need to understand what that tool was doing. I still had to understand how to version packages in CI, and how to make sure dependencies were identified and compiled in order. I still had to understand how to branch effectively, and how to work across repositories. I still had to understand how to control my load-balancers. By the time I was done, I realized that I could do it all myself without locking myself into a third-party tool.
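To make "compiled in order" concrete, here is a minimal sketch of dependency ordering using a topological sort from Python's standard library. It is not what my tooling does internally; the package names and the `build_order` helper are hypothetical, and a real pipeline would read the dependency map from project files rather than hard-coding it.

```python
from graphlib import TopologicalSorter

# Hypothetical packages mapped to the packages they depend on.
dependencies = {
    "Web.App": {"Core.Services", "Core.Models"},
    "Core.Services": {"Core.Models", "Core.Data"},
    "Core.Data": {"Core.Models"},
    "Core.Models": set(),
}


def build_order(deps):
    """Return packages ordered so each one follows everything it depends on."""
    return list(TopologicalSorter(deps).static_order())


order = build_order(dependencies)
print(order)
```

If two packages depend on each other in a loop, `TopologicalSorter` raises a `CycleError`, which is exactly the kind of problem you want the build to surface before anything gets compiled.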
Any CI/CD solution is just a methodology and a task runner to orchestrate tools that you already have at your disposal. I know this because I ultimately rolled my own. I used mchnry.flow, which is a dotnet task runner that I wrote while sitting in the hospital as my son was recovering from surgery. It wasn't as intended, but it does the job remarkably well. Beyond that, I used only the CLIs and APIs that come with tools I was already using.
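To illustrate the "task runner orchestrating tools" idea, here is a minimal sketch in Python. It is not mchnry.flow and does not reflect its API; the `run_pipeline` function and the step names are assumptions made up for this example. The commands are placeholders that run anywhere Python does; in practice each one would call a CLI you already use.

```python
import subprocess
import sys


def run_pipeline(steps):
    """Run (name, command) steps in order and stop at the first failure.

    Returns the names of the steps that completed successfully.
    """
    completed = []
    for name, command in steps:
        print(f"running: {name}")
        result = subprocess.run(command)
        if result.returncode != 0:
            print(f"failed: {name} (exit code {result.returncode})")
            break
        completed.append(name)
    return completed


# Placeholder commands (hypothetical): swap in your own build, test,
# publish, and deploy CLIs.
PY = sys.executable
steps = [
    ("build", [PY, "-c", "print('building')"]),
    ("test", [PY, "-c", "import sys; sys.exit(1)"]),  # simulated failure
    ("deploy", [PY, "-c", "print('deploying')"]),
]

completed = run_pipeline(steps)
print(completed)
```

Because the simulated test step fails, the deploy step never runs. That stop-on-failure behavior is most of what a gate amounts to.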
I'm not advocating roll-your-own. That requires a constitution that is uncommon in my experience. I'm merely saying that you shouldn't expect to escape the details of the implementation. If you reject the consultant that comes with the tool, your experience will be much like mine. If you take the consultant, but choose not to shadow them, then you will have to engage that consultant with every change. And that's not a tenable solution. Not only will you certainly go through change as you jump on the next tech wagon, but so, too, will all of the tools you are using. It never ends.
If you are hesitant to turn it on for Production, you're not alone
I was surprised to learn through many discussions how many stop at UAT, or even QA. Production is the ultimate goal, right? The process worked in QA, then again in UAT... why wouldn't it work in production? All those late nights, run-books, and support calls should be a thing of the past.
I've got nothing to assuage that fear, and I certainly understand it. I'm not going to succumb to it, but I understand it. But AI is a promising solution to many of the concerns. We are working now on solutions that can see what changed to better understand the impact of those changes. Imagine if your tools could know that two changes are interdependent, though functionally independent. With the advent of services as well as third-party APIs, these inter-dependencies are becoming increasingly difficult to track, and it is becoming increasingly difficult to manage change. We can't rely on a human to see and test those use-cases, so we always risk one change making it to production without regard to something that it impacted. But it's all still just code that can be reflected upon. There's no reason that our tools can't read the code and find those dependencies. We're not quite there yet, though some are trying.
When we started this, our metrics landed at about 40 hours per two-week sprint. Some of it was subjective, but the hours spent managing builds and deployments - the lion's share of the estimate - were spot on. Fast forward to this last sprint; while the metrics aren't formally compiled, I know that we will not have spent more than 2 hours even thinking about it. QA gates the request, it gets reviewed and accepted, the packages get built and published, the load-balancers get stopped/started, and the impacted apps get deployed. Do the math. That's awesome!
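For anyone who would rather see the math than do it, here is a small sketch using the two figures above. The 26 sprints per year is an assumption (52 weeks divided into two-week sprints), and the 2 hours is treated as an upper bound, so the savings shown are a floor.

```python
HOURS_BEFORE = 40      # per two-week sprint, from the original metrics
HOURS_AFTER = 2        # upper bound for the latest sprint
SPRINTS_PER_YEAR = 26  # assumption: 52 weeks / 2-week sprints

saved_per_sprint = HOURS_BEFORE - HOURS_AFTER
reduction_pct = 100 * saved_per_sprint / HOURS_BEFORE
saved_per_year = saved_per_sprint * SPRINTS_PER_YEAR

print(f"{saved_per_sprint} hours saved per sprint ({reduction_pct:.0f}% reduction)")
print(f"{saved_per_year} hours saved per year")
```

Multiply the yearly hours by whatever a loaded hour costs your organization and you have the number the execs asked for at the start.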
Through this adventure, I have developed a unique perspective on SDLC and CI/CD. If you want some advice, or just want to share, please feel free to reach out.