Our CI/CD pipeline and why it's awesome
The CI/CD pipeline at Medit and why it makes us better
Spotify | iTunes | Sticher | Google Play | Player.fm | MyTuner Radio
On this episode, I take you through our CI CD pipeline, what that is and why it's great.
Dave: Hello, and welcome to the podcast. I'm your host, Dave Albert. In this show, I talked about technology building a company as a CTO, and co founder and have guests to discuss their roles in technology and entrepreneurship.
So first, what is a CI CD pipeline? CI is continuous integration, which means that as your code is shipped and merged, that it's a available for deployment.So like if you are building off feature branches, and then merge them into master very regularly, as opposed to long Live branches that say, are in a more waterfall fashion of month long, multi month long branches of v1, v2, v3, etc., that makes continuous integration. Well, that's not continuous integration. But if you take features that last a few days that are then integrated in as they're merged, I consider that continuous integration. You know, as with anything technical, there's always some political debates and certain factions find one method more suited to the description or definition for them than others. But I think that's close, the shorter the time between a branch and a code review, pull request merge, the better. So I'm a fan of small changes, small commits that are constantly moving forward, as opposed to larger commits and larger branches that take much more time to integrate, see their constant integration is easier than massive integrations.
So the CD can stand for either continuous delivery, or continuous deployment. I guess, technically, we're doing continuous delivery. Now, in our test environment, we're doing continuous deployment. So I'll start at the beginning, we use that bucket for our code source repository, one of our developers creates a branch off of master makes some set of changes based on what our requirements are, then sends that pushes that to that bucket and creates a pull request. The pull request is then reviewed by someone else that then gets merged into master if it's approved. And then oh, let's let's start with the API. So the API gets merge as it gets merged into master we use the bucket pipelines which are part of Bitbucket. So I guess it's probably what Atlassian bamboo used to be, that then builds a Docker container with the API in it. We're still in the process of updating some more of our unit tests to be able to run this way. So at the moment, testing isn't done at this phase, though, it should be if you can, at all make that happen. That's probably the one of the best things you can do we then ship it to our or push that to our Azure Container Registry, push it to the Azure Container Registry, then send a web hook to a small Kubernetes pod which basically is just kind of a small little Python flask application that I wrote, that makes a system call based on just a simple couple of parameters. And then that will run a either home or Kubernetes command in the test environment, and then launch our API, our analytics API or our slack within the test environment, then the slack bot will send some notification messages to let us know that it's being run. And then after a bit of a delay, so that everything can spin up, it lets us know that it's running our I call them smoke tests, I guess it's integration tests. I don't know I there's a lot of different terms, I suppose I could look it up. But I just prefer smoke tests from post man. So if you write post man tests, you can export those, the JSON. And then there's a NPM module called Newman, which will run your tests. So then it runs our suite of post man tests against the newly deployed test environment, and you that passes or fails, sometimes it fails, even though all the code is good, timing issues, random strangeness, I don't know. So if it fails, sometimes we just rerun the test a few times. And if it continues to pass, then then we think it's acceptable. I've never had a set of tests that pass once that didn't pass most of the time. So I didn't have one that failed 90% of the time and just happen to pass once that's never happened. So it's almost always some additional Kubernetes pod was not ready at the time, or possibly system load, because our test environment has less system resources than our production environment, because obviously, it has less traffic. So then, once the tests have passed, assuming everything is ready, and it's not too late in the evening or Friday, then we'll coordinate to deploy to production, which is just telling our slack bot to ship to production. And the build number, usually, what we try to do is make sure that the topic in the chat ops channel is the current running production build, so that should something go wrong in the middle of a deployment, we can tell it to ship to production, the previous version, without having to dig through myself get a little panicky, if productions not working perfectly. I know, there are a few people that I've worked with, who give very calm when production goes down. But I'm not one of them, I get, I get a little too much adrenaline going. So I prefer to be prepared for disaster, so that it's easier to recover, then to in the heat of the moment, try to figure out what to do. But this is made shipping to production, a non event, I mean, I know it's, we're a smaller startup, and some larger organizations can be and shipping can take days weeks, and still have issues where they're trying to manually reconfigure, troubleshoot things. So what this is done for us is just made it really nice both that Well, for all the reasons of test is always available, nobody has to ask what was deployed to test when are we sure we're in the right environment, because every time there's a merge to master that's done, it's easy to run a status command to see what the version is and production.
It's easy to be sure that test and production match. If something goes wrong during a deployment, it's super simple to roll back. Now, one of the things that I need to spend a little time on soon, is ensuring that we have a process for deploying and rolling back and production should this like, not work or Slack, not work, something not new, but it you know, fell by the wayside. But this was a at a meetup, I went to who was reminded, I was reminded that not having a process for deployment when one of your tools is unavailable is not a good idea. Because if you don't practice that process, it will be broken.
So I definitely need to spend some time soon, ensuring that we have a process for doing that and practice it occasionally. So that's, that's been extremely useful. There's new thinking about test. So we never get out of sync, there's no working involved in shipping to production, I mean, to shipping the test at all, there's no, there's very little work and shipping to production. It took, you know, maybe five hours over several days, just as I had time working on it to enable all of that. And it saved me probably an hour every time we've shipped to test. So that was an invaluable amount investment in time. And for production,
it's even more and the stress of production deployments zero. Now the you must spend some time on it, it really I wish I'd done it in the first day. Obviously, I was infrastructure was different on day one versus now. But knowing I was going to be using Docker containers pretty much from the beginning could have made this so simple and definitely recommend it.
So then how we do our iOS deployment is somewhat similar. So again, still Bitbucket in the past we had been using, well we're still using Bitrise. So when it gets when a the iOS build gets merged into master we now tell bit rise to automatically send it to test flight to build and send a test flight, we had been doing a build on every creation of a pull request to master, we kind of stopped that lately, but that was because we were having some issues with testing. So I need to are not testing but with I think it is testing. Yeah, we've had some unit test issues and you will have test issues. But I think that might have been because our Fast Lane version was out of date. So I need to get back to checking on that definitely need more tests and iOS, we do what solid amount of QA but it would be better to have some more new testing to avoid a couple of accidents that have happened in the past. So it gets merged into master. Now that triggers a bit rise run, which then since the test flight, the way we handle version changes, is with a pre commit hook. So every time you commit in the iOS branch repository, it pumps up the build number. And then when we need to go a major or minor version, one of us does it manually, we could probably use fast lane or bit rise or something in our tool chain to do that for us. But right now, it was just easy to create a small little script that's in the repository. And then just a little sim link command in the read me and just make sure everybody does that of course, that means that we have occasional issues can merge conflicts. But that's actually not the worst thing because then that requires us to to a merge from master before we finish. And sometimes that can reveal issues that we didn't know we're going to happen. So I'm kind of torn on going back and back to having bit rise fast lane take care of that for us versus doing it ourselves.
I don't know, I'd love to hear if anybody's got thoughts on that. So anyway, we had been using tags to trigger the bit rise job that would ship to test flight. But we decided let's make it every time it merges. And so what we're doing now is every pull request should be shipped bubble. Now, that doesn't mean that we will ship it, it doesn't mean that every every version should be perfect, right. So if you've started to build some element of a ongoing project, so like, we're in the middle of finishing up our groups for iOS, that's really only available to some people, so the internal people, so it's not usable by the majority of our users. But it's not visible to the majority of our users. So this is what you would call feature flags, right, though, that's not necessarily exactly what this is. But that's the same principle, if you are a member of the people who've been assigned access to the group's you can't see the groups so well, that means that we know that the people in this group are able to see are ready to see things that are not quite mainstream ready. So that means that we never get in a situation well, we should never get in a situation things happen, where wherever having to consider going back to some version branching off of that and doing a hot fix, there's no need for hot fixes, really, we just roll forward. So whatever the current master branch is, we can branch off to that make the change submitted, commit, merge, test, and test flight, and then ship that. So that is that is that is saved us so much time. Now, it does cost you some overhead in some cases, because you need to make sure that if you're developing some long term element that you think that
through and ensure that there's a way to gate access, so that only the people that you want to have access to that have access. But it's, it's, I wish had, again, I wish we'd done this from the beginning, it has made our lives so much better. as developers, it's made the people who test the application in test flight happier because it's constantly up to date. They never go, you know, a week two weeks without seeing a new version. So we do have to communicate what which versions should be tested. And which first versions are just ongoing development. So basically, we just say, if you want to test stuff, that's fine, but will let you know when there's something that needs major reviewthis also make sure that we don't have massive amounts to test at the end that allows us to get our version or to ship basically, every week or every other week just depends on what's going on. We usually have bug fixes. So it's nice to get those out every week. Or we have you know, small incremental changes that you wouldn't normally go through the hassle of creating a release. But when all you have to do is just go into the iTunes Connect. And I guess it's called a hub connect now and copy the two or three fields, fill out your release notes, and then submit it this very little reason not to just do that when you've got you know, even if you've only got a handful of changes for a week, it's still nice to keep it fresh.
So get on your CI-CD pipeline. If you're just getting started. It will save you so much time in the long run. It's so much easier to automate. Very small bits, it's kind of like test driven development. I'm gonna say its delivery, delivery driven development. So prepare for delivery or deployment. Deployment driven development. Yeah, let's call it that. So you prepare for your deployment. That means that those chain the those decisions and the things that you have to set up to allow that to work you'll be doing from the beginning as opposed to refactoring. So just like if you write a test before you write code, you have to write the code to be able to test it as opposed to trying to write a test on code that already exists. So a lot of times, you actually have to refactor the code so that it can be testable. So this would ensure that you've got deliverable code, even if it's an empty application. And that's what we did for Android. So androids and it's very early stages, but it already is set up so that every time we have to check with the trigger is but we can always change the trigger to a deployment I mean, emerge, it's still you know, we're not quite ready to test much of it yet, but once we get to that stage, then we'll make sure that a merge to master the ships at to air not air hockey, I think it used to be air hockey hockey app.
So that's it for this episode. But um, yeah, definitely create a CI CD pipeline. If anybody's
interested in talking about it. Let me know you can email me at [email protected]
or on Twitter @Dave_Albert.
Until next time, remember, any sufficiently advanced technology is indistinguishable from magic.