I've been at Bluecrew for a month now and one thing I'm actively improving is their continuous integration (CI) strategy. Every team has values around their code and systems. The existential question I ask about each value is this: does CI enforce the value? If not, the value does not exist. You could think of this as an all or nothing CI strategy.
What is continuous integration?
The most common thing added to CI is a project's test suite because the team probably values having passing tests. When someone creates a pull request to integrate new code, the CI system runs the tests. If they fail, the developer sees an error and is blocked from merging the pull request. This is the basic principle of CI: running scripts to catch integration problems and halting the integration, if necessary.
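To make the principle concrete, here is a minimal sketch in Python of what a CI step boils down to: run each script, collect the exit codes, and return a non-zero status if anything failed. The function names and the example commands in the usage note are assumptions for illustration, not how any particular CI system is implemented.

```python
import subprocess
import sys
from typing import Dict, List


def run_checks(checks: Dict[str, List[str]]) -> Dict[str, int]:
    """Run every check and return a mapping of check name to exit code."""
    results = {}
    for name, command in checks.items():
        try:
            # A non-zero exit code means the check failed.
            results[name] = subprocess.run(command).returncode
        except FileNotFoundError:
            # Treat a missing executable as a failure (127 is the shell's
            # conventional "command not found" status).
            results[name] = 127
    return results


def failed_checks(results: Dict[str, int]) -> List[str]:
    """Names of the checks that should block the merge."""
    return [name for name, code in results.items() if code != 0]


def main(checks: Dict[str, List[str]]) -> int:
    """Run all checks, report every failure, and return a process exit status."""
    failures = failed_checks(run_checks(checks))
    for name in failures:
        print(f"check failed: {name}", file=sys.stderr)
    # A non-zero status is what tells the CI system to halt the integration.
    return 1 if failures else 0
```

A CI job could call something like `sys.exit(main({"tests": ["npm", "test"]}))`, where `npm test` stands in for whatever command runs your own test suite. Running every check before exiting, rather than stopping at the first failure, lets the developer see all the problems in one pass.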
All or nothing
What else besides tests should you add to CI? Everything! If your team values code reviews then you can configure GitHub to block merging until the code has been reviewed. CI would then enforce code reviews. If you want everyone to use a code formatter like prettier, the CI system could run a command like this that fails when someone forgets to format their code:
prettier --check "src/**/*.js"
If you don't add this check to CI, I guarantee you that unformatted code will creep in. Similarly, if you find value in static analysis tools such as a linter, then the CI system should fail when someone forgets to lint new code.
On one project, Bluecrew has a Jira integration where GitHub pull requests get linked to tickets, but only if the branch is named correctly. It's not the greatest integration but it works. I discovered that many resolved tickets lacked pull request links because of typos or incorrect casing in branch names. I added a simple bash script to CI that now fails when a branch is named incorrectly.
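The actual script is bash and isn't shown here, but the idea can be sketched in a few lines of Python. The naming convention below (an uppercase ticket key, a number, then an optional lowercase description, as in `BC-1234-fix-login`) and the `BC` project key are assumptions for illustration; substitute whatever your Jira integration expects.

```python
import re
import sys

# Assumed convention: "<PROJECT>-<number>" optionally followed by
# "-lowercase-words". The real pattern depends on your Jira setup.
BRANCH_PATTERN = re.compile(r"[A-Z][A-Z0-9]+-[0-9]+(-[a-z0-9]+)*")


def is_valid_branch(name: str) -> bool:
    """True when the whole branch name matches the assumed convention."""
    return BRANCH_PATTERN.fullmatch(name) is not None


def check_branch(name: str) -> int:
    """Return a process exit status: 0 to pass the check, 1 to fail it."""
    if is_valid_branch(name):
        return 0
    print(f"branch name {name!r} does not match the ticket convention",
          file=sys.stderr)
    return 1
```

On GitHub Actions, the pull request's branch name is available in the `GITHUB_HEAD_REF` environment variable, so the job could end with `sys.exit(check_branch(os.environ["GITHUB_HEAD_REF"]))` after an `import os`. Other CI systems expose the branch name differently.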
Again, if a team value isn't in CI, it doesn't exist. It might have existed once but it will fade away over time.
Incremental adoption
If you're introducing new CI checks to an existing code base it may not be feasible to fix all problems at once. For that case, I still suggest enabling the CI check immediately but making an exception for older files. For example, eslint lets you opt out of linting by putting this comment at the top:
/* eslint-disable */
This makes it immediately safe to enable a CI check for linting.
When you find time to fix an old file, you can just remove the eslint-disable comment and hack away.
If you wait until all files are lintable before enabling the CI check, your team will fail to adopt linting forever.
The CI part is all or nothing but adoption can still be incremental (most tooling provides ways to opt out).
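One way to keep incremental adoption from stalling is to list the files that still opt out. The sketch below is an assumption about how you might track that, not part of eslint itself: it walks a directory and reports every `.js` file whose first non-blank line is the blanket `/* eslint-disable */` comment.

```python
from pathlib import Path
from typing import List

OPT_OUT = "/* eslint-disable */"


def opted_out_files(root: str, pattern: str = "*.js") -> List[str]:
    """Return files under root whose first non-blank line is the opt-out comment."""
    found = []
    for path in sorted(Path(root).rglob(pattern)):
        if not path.is_file():
            continue
        with path.open(encoding="utf-8", errors="replace") as handle:
            for line in handle:
                stripped = line.strip()
                if not stripped:
                    continue  # skip leading blank lines
                if stripped == OPT_OUT:
                    found.append(path.relative_to(root).as_posix())
                break  # only the first non-blank line matters
    return found
```

Calling `opted_out_files("src")` returns the remaining to-do list. Point it at your source directory rather than the repository root so that dependencies such as `node_modules` aren't scanned.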
Many of the missing CI checks I've found at Bluecrew probably stem from incremental adoption, and maybe also from an assumption that everyone uses an editor configured in a certain way. This is a dangerous assumption.
System tests in CI
A common mistake I've seen at many companies is not including system tests in the CI of all related projects. System tests are valuable to each team because they catch regressions but they can be slow (more on that later), hard to diagnose, fragile, or flaky. If a given team doesn't want these in their CI, that either means the tests aren't valuable or they need fixing. They probably just need fixing.
Writing system tests is one of the hardest problems in computer science and I wish they were taken more seriously. I recommend having senior folks help with their architecture. Beyond CI, if the development, diagnostics, and maintenance of system tests aren't fully baked into team culture, they will cease to exist.
The story that often gets missed in the design of system tests is this: as the one who first sees a CI failure, I should be able to decipher and act upon it. System tests pose a truly unique integration challenge.
Boosting productivity
Enforcing all team values with CI may seem passive-aggressive (you could just ask your teammate to fix the thing, right?) but I see it more as helpful automation. CI is a safety net that should give developers courage to accomplish hard tasks. It's not meant to shame a developer; it's meant to make them faster and more productive.
When you don't have to worry and think about making mistakes, your mind is free to experiment on new features more quickly. This leads to a feedback loop of rapid iteration, which is the key to building great software products.
CI in your code editor
You get a real productivity boost when you can put some CI checks directly into your code editor and see results as you type code. You can only do this with fast checks like static analysis or well designed unit tests, though.
As a long time Vim user / VSCode convert, I recommend VSCode because it does so many CI things and is easy to configure. Putting CI checks in your editor definitely isn't a CI strategy (it's not integrating code) but it gives you early warnings before making a pull request.
Too much CI?
You put everything in CI. Great! Now CI is slow ... sad trombone. It happens. The sweet spot is when CI enforces all team values but doesn't slow down the actual integration part. In my experience, when CI starts taking longer than 10-15 minutes, it becomes a problem. Long running system tests are typically the culprit.
This is usually a good problem because it means you have lots of helpful automation but it's still a problem. Here are some possible solutions:
- Audit each CI check to make sure it still enforces a real team value
- Use standard profiling techniques to identify and fix bottlenecks
- Add some smarts like inspecting the pull request diff so you can figure out the bare minimum set of CI checks -- for example, running tests on changed files only
- Make long running CI checks non-blocking but alert developers on failure so they can follow up with a fix. This is risky because it's easy to ignore "alerts." I consider this to be a last resort.
- Run system tests in parallel and throw more machines at them
- Dedicate a team in your company to making CI fast! This will be the best money ever spent but it's a hard sell since you can't measure productivity.
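As a rough illustration of the diff-inspection idea from the list above, here is a minimal Python sketch that maps changed file paths to the checks worth running. The pattern-to-check table and the check names are hypothetical; a real project would define its own. When a changed file matches no rule, the sketch falls back to running everything, on the assumption that skipping a check by accident is worse than a slow build.

```python
from fnmatch import fnmatchcase
from typing import Dict, Iterable, List

# Hypothetical mapping from file patterns to the checks they require.
CHECKS_BY_PATTERN: Dict[str, List[str]] = {
    "*.js": ["lint", "format", "unit-tests"],
    "*.md": ["spellcheck"],
    "package.json": ["lint", "unit-tests", "system-tests"],
}


def all_checks(rules: Dict[str, List[str]]) -> List[str]:
    """Every check named in the rules, de-duplicated, in first-seen order."""
    ordered: List[str] = []
    for checks in rules.values():
        for check in checks:
            if check not in ordered:
                ordered.append(check)
    return ordered


def select_checks(changed_files: Iterable[str],
                  rules: Dict[str, List[str]] = CHECKS_BY_PATTERN) -> List[str]:
    """Pick the smallest set of checks that covers the changed files."""
    selected: List[str] = []
    for path in changed_files:
        matched = False
        for pattern, checks in rules.items():
            # fnmatchcase's "*" also matches "/" so "*.js" covers nested files.
            if fnmatchcase(path, pattern):
                matched = True
                for check in checks:
                    if check not in selected:
                        selected.append(check)
        if not matched:
            # Unknown kind of file: be conservative and run everything.
            return all_checks(rules)
    return selected
```

The list of changed files could come from something like `git diff --name-only origin/main...HEAD`, assuming `origin/main` is the branch the pull request targets.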