How to decide on engineering guardrails

Engineering guardrails can help you navigate complex software systems. Here are some best practices to guide you through the development process.

By Pat Kua

December 26, 2023

5 minutes

Learn how to strategically choose and implement effective engineering guardrails for streamlined development and improved team efficiency.

Guardrails help people avoid dangerous situations in real life. Similarly, in any significantly complex software system, good engineering guardrails can keep people on a safe path and reduce the cognitive complexity of day-to-day work. From coding and building through testing and release, guardrails exist as rules, standards, and best practices related to the development process.

Ask your team what their biggest pain points are

As a leader, instead of deciding what engineering guardrails to implement on your own, you should talk to your team first. Ask people what surprised them about planned or unexpected work, what was painful for them, or when they received late feedback on work or the quality of work. Once you have a collection of pain points, consider what engineering guardrails exist for those pain points. 

For example, on one team I worked with, the topic of slow continuous integration (CI) builds came up week after week. Each build took approximately 40-45 minutes, mostly because of many long-running integration and acceptance tests. Because the process was so long, people avoided running builds after small changes and ran them less frequently. This encouraged the team to take on more risk, sometimes committing what they felt wouldn't break anything and letting the central CI build, rather than a local build, fail.

CI build breakages continued, and when a build failed, it took much longer to work out what caused it. A retrospective we held put a spotlight on this issue that affected everyone, but no individual was keen to take action on it alone. We eventually agreed as a team to prioritize work to speed up the build by converting some long-running integration or acceptance tests into faster-running unit tests.

We didn’t just fix the issue; we also put in an engineering guardrail, because we knew that build time was a key indicator. We put a check in place that failed the build when the build time hit 10 minutes. Whenever the build went over 10 minutes, we went back to finding different ways to speed it up.
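A build-time check like this can be very small. The following is a minimal sketch in Python, not the check that team actually used: the function name, the constant, and the idea of passing in the elapsed time in seconds are all assumptions made for illustration.

```python
# Hypothetical budget, taken from the 10-minute threshold described above.
BUILD_TIME_BUDGET_SECONDS = 10 * 60


def check_build_time(elapsed_seconds: float,
                     budget_seconds: float = BUILD_TIME_BUDGET_SECONDS) -> int:
    """Return a process exit code: 0 if the build is within budget, 1 if not.

    Assumption: the build counts as failed only when it runs strictly
    longer than the budget.
    """
    if elapsed_seconds > budget_seconds:
        print(f"Build took {elapsed_seconds:.0f}s, "
              f"over the {budget_seconds:.0f}s budget.")
        return 1
    print(f"Build took {elapsed_seconds:.0f}s, "
          f"within the {budget_seconds:.0f}s budget.")
    return 0
```

A final CI step could measure the elapsed time however the build tool allows and exit with the returned code, so that a slow build fails just like a broken one.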

Look for repeated mistakes

The only failure is the failure to learn from failure. Mistakes happen, but it becomes an issue if different team members or the same team members make the same mistake repeatedly. Most people correct the mistake and hope they won’t do it again, but they don’t think about ways to reduce the likelihood or remove the possibility of it happening again. As a leader, consider engineering guardrails when you notice a pattern of repeated mistakes. 

Early on in my career, people on my team would add debugging statements that dumped output to the console. Sometimes they forgot to remove these statements before checking their final code in to source control, and when the application ran in production, the application console quickly filled up with distracting information that wasn’t very useful for day-to-day use. Of course, it was easy to remove them (and convert them to logging statements), but this happened once or twice every couple of weeks over several months. To remedy this, we implemented a quick engineering guardrail: a pre-commit check that scanned the code and reminded people what they should and shouldn’t commit.
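As a rough illustration of that kind of pre-commit scan, here is a minimal sketch in Python. The list of debug patterns, the function names, and the convention of receiving file paths as arguments are assumptions; the original check may have looked quite different.

```python
import re

# Hypothetical patterns for console output that should not be committed.
DEBUG_PATTERN = re.compile(r"\b(print|console\.log|System\.out\.println)\s*\(")


def find_debug_statements(source: str) -> list[tuple[int, str]]:
    """Return (line number, stripped line) for each suspicious line."""
    return [
        (number, line.strip())
        for number, line in enumerate(source.splitlines(), start=1)
        if DEBUG_PATTERN.search(line)
    ]


def main(paths: list[str]) -> int:
    """Scan the given files and return 1 if any debug output was found."""
    found = False
    for path in paths:
        with open(path, encoding="utf-8") as handle:
            hits = find_debug_statements(handle.read())
        for line_number, line in hits:
            # Remind the committer rather than silently rewriting their code.
            print(f"{path}:{line_number}: possible debug output: {line}")
            found = True
    return 1 if found else 0
```

A hook script could pass the staged file paths to `main` and exit with its return value. Whether a hit should block the commit or only print a reminder is a team decision; the sketch returns a non-zero code so either choice is possible.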

Favor automatable guardrails over manual ones 

As you think about which engineering guardrails to implement, favor those that you can support through automation, rather than manually. 

Manual guardrails sound tempting because they’re low effort, but there are a number of downsides. For instance, people can forget to run the manual processes or skip steps in the process (especially for a very complex guardrail). And if you decide to have a role that “enforces” the manual guardrail, it increases team conflict because no one likes to be told they are doing it wrong. Automation gives people faster feedback and increases their autonomy because they choose to change their work based on neutral feedback, feeling less judged as a result. Fast feedback loops also minimize the blast radius of mistakes and speed up learning.

One example of this was when I was working on an application that was localized for different languages. The designs were typically completed in English, after which we would implement the screens with translatable placeholders. Each placeholder had a translatable key, which needed an entry in a file that represented each language. The English translation file would come first, and then the other languages would be sent off to specialist translators. Because we were worried about this process, we had a manual step before each release to check that each language had an appropriate translation. This process mostly worked, but because we were changing the system all the time, we would occasionally miss copying a translation, or the translators hadn’t provided all the new and updated translations. To fix this, we ended up writing an automated test that would ingest all the translation files and make sure there was an entry for every key. This simple engineering guardrail saved us lots of work and worry, especially as the translation files grew extremely large.
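A test like this can be sketched in a few lines of Python. Everything specific here is an assumption for illustration: the article does not say what format the translation files used, so the sketch assumes one JSON file per language (for example `en.json`), treats English as the reference set of keys, and uses made-up function names.

```python
import json
from pathlib import Path


def load_translations(directory: str) -> dict[str, dict[str, str]]:
    """Load every *.json file in the directory, keyed by file stem (e.g. "en")."""
    return {
        path.stem: json.loads(path.read_text(encoding="utf-8"))
        for path in sorted(Path(directory).glob("*.json"))
    }


def missing_keys(translations: dict[str, dict[str, str]],
                 reference_language: str = "en") -> dict[str, list[str]]:
    """Map each language to the reference keys it has no entry for."""
    reference = set(translations[reference_language])
    missing = {}
    for language, entries in translations.items():
        absent = sorted(reference - set(entries))
        if absent:
            missing[language] = absent
    return missing
```

An automated test could then assert that `missing_keys(load_translations(...))` is an empty dictionary, so that a forgotten key fails the build and names the language and key involved.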

Choose your engineering guardrails wisely

Engineering guardrails improve the quality of a system and reduce the cognitive load of team members. But the wrong guardrails can slow a team down, so to choose good ones, start by asking your team members what they would find useful, look for repeated mistakes that no one notices, and favor guardrails that are automatable instead of manual.
