DORA metrics aren’t enough on their own. By focusing on pull request size, dev teams can quickly improve their cycle times and development workflow and make the leap to elite performance.
Since its inception in 2016, the DevOps Research and Assessment (DORA) program has provided dev teams with some great metrics to guide them on their journey to performing at an elite level. But DORA metrics should only be one piece of the puzzle.
Make no mistake, tracking DORA metrics is important and useful, just not for everything that an engineering team strives to do, such as showing how developers directly impact the business bottom line.
Used in tandem with LinearB’s own Engineering Benchmarks, however, DORA metrics can help dev teams power themselves toward elite workflows. Based on a study of nearly 2,000 dev teams and 847,000 code branches, these benchmarks show what the 10% of dev teams that we consider elite look like in practice.
Small but powerful PRs are best
What’s clear is that elite dev workflows start and end with small pull request (PR) sizes. In our experience, this is the best indicator of simpler merges, enhanced CI/CD, and faster cycle times.
PR size, rework rate, and deployment frequency all affect cycle times, but PR size continues to present the most significant opportunity for real organizational change.
Luckily, PR size is also the easiest metric to work on. It’s concrete, measurable, and achievable. Elite teams keep each PR under 225 code changes (additions plus removals), which makes their PRs easier to review and safer to merge.
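As a rough illustration of how a team might check this benchmark, here is a minimal Python sketch. The `PullRequest` record and its field names are hypothetical, not part of any LinearB or Git hosting API; only the 225-change threshold comes from the benchmark quoted above.

```python
from dataclasses import dataclass

# Threshold from the benchmark quoted above; treat it as a tunable default.
ELITE_MAX_CHANGES = 225


@dataclass
class PullRequest:
    """Hypothetical record; field names are illustrative, not a real API."""
    title: str
    additions: int
    deletions: int


def pr_size(pr: PullRequest) -> int:
    """PR size counts lines added plus lines removed."""
    return pr.additions + pr.deletions


def is_elite_size(pr: PullRequest, limit: int = ELITE_MAX_CHANGES) -> bool:
    """True when the PR stays under the elite threshold."""
    return pr_size(pr) < limit


if __name__ == "__main__":
    prs = [
        PullRequest("Fix typo in README", additions=3, deletions=1),
        PullRequest("Rewrite billing module", additions=900, deletions=450),
    ]
    for pr in prs:
        label = "ok" if is_elite_size(pr) else "consider splitting"
        print(f"{pr.title}: {pr_size(pr)} changes -> {label}")
```

A check like this could run in CI as a warning rather than a hard gate, since some changes (generated files, large renames) are legitimately big.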
Because small PRs get picked up and reviewed fast, they lower cycle times and positively impact other DORA metrics. There are fewer hand-offs and less idle time. Production blow-ups are smaller and teams can recover more quickly.
Beyond efficiency and moving work through the development pipeline quickly, low PR pickup and review times also tell a good story about team chemistry. Teams that have a smooth code review process also tend to have better code quality.
To help streamline pull request merges, LinearB has released gitStream. This free dev tool allows teams to decide which pull requests should be deemed low, medium, or high risk. The tool has already allowed hundreds of dev teams to deploy more frequently by systematically treating PRs differently according to their risk.
Increasing deployment frequency
If PR size represents the guts of a project, deployment frequency is the heart. Teams should always strive to plan and work in small, manageable, and quickly releasable chunks. Good scoping and planning nets out to smaller PR sizes, resulting in a team that is constantly merging and deploying. The more frequent the deployment, the better the organizational cadence and developer experience.
It’s important to note that elite deployment frequency is daily, and a team that goes more than a week between deployments needs critical focus. Daily deployment of code indicates a stable, healthy continuous delivery pipeline, which can happen quite naturally with smaller PRs.
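To make those two thresholds concrete, here is a small Python sketch that turns a list of deployment dates into an average gap and a band. The function names and the label for the middle band are assumptions made for illustration; the text only defines the daily and more-than-a-week boundaries.

```python
from datetime import date


def average_days_between_deploys(deploy_dates: list[date]) -> float:
    """Mean gap in days between consecutive deployments."""
    if len(deploy_dates) < 2:
        raise ValueError("need at least two deployments to measure a gap")
    ordered = sorted(deploy_dates)
    span_days = (ordered[-1] - ordered[0]).days
    return span_days / (len(ordered) - 1)


def frequency_band(avg_gap_days: float) -> str:
    """Map an average gap to a band using the thresholds from the text."""
    if avg_gap_days <= 1:
        return "elite"  # deploying daily or more often
    if avg_gap_days <= 7:
        return "room to improve"  # assumed label; the text does not name this band
    return "needs focus"  # more than a week between deployments


if __name__ == "__main__":
    deploys = [date(2024, 1, 1), date(2024, 1, 2), date(2024, 1, 3)]
    gap = average_days_between_deploys(deploys)
    print(f"average gap: {gap:.1f} days -> {frequency_band(gap)}")
```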
Smaller PR sizes correlate with higher test coverage and more thorough reviews (hallmarks of higher deployment frequency and code quality), reducing change failure rates (CFR). It’s also much easier to roll back and fix issues, helping to lower your mean time to restore (MTTR). Cycle times are lower, customers are happy, and so are developers.
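For readers who want to see how these two DORA metrics are typically calculated, here is a minimal Python sketch. It assumes the common definitions (failed deployments divided by total deployments, and the average time taken to restore service per incident); the function and parameter names are illustrative only.

```python
from datetime import timedelta


def change_failure_rate(total_deployments: int, failed_deployments: int) -> float:
    """Share of deployments that caused a failure in production."""
    if total_deployments <= 0:
        raise ValueError("total_deployments must be positive")
    return failed_deployments / total_deployments


def mean_time_to_restore(restore_durations: list[timedelta]) -> timedelta:
    """Average time taken to restore service across incidents."""
    if not restore_durations:
        raise ValueError("need at least one incident to average")
    return sum(restore_durations, timedelta()) / len(restore_durations)


if __name__ == "__main__":
    cfr = change_failure_rate(total_deployments=20, failed_deployments=5)
    mttr = mean_time_to_restore([timedelta(hours=1), timedelta(hours=3)])
    print(f"CFR: {cfr:.0%}, MTTR: {mttr}")
```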
Reworking and refactoring
The concept of rework rate (or code churn) can sometimes be confusing. When a dev writes code and it merges to the main “trunk” or the release, that code will almost always be refactored in time.
People assume refactoring is bad, but refactoring 6- or 12-month-old code is a good thing.
It’s important to distinguish between rework and refactoring. Refactoring is the process of improving preexisting code, such as making it more efficient. Rework is the bad kind: code that has to be changed again soon after it was written, a repeating pattern in a poorly functioning process. Rework can also point to a quality problem, or to product and engineering not being aligned on objectives.
Unless the code has just been committed, strong refactoring is a healthy sign of a well-functioning team. Teams with smaller PRs, lower rework rates, and higher deployment frequency also have more time to focus on refactoring.
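One way to picture the distinction is to look at how old the code was when it got changed. The sketch below is an illustration only: the 21-day window for "just been committed" is an assumption chosen for the example, not a figure from this article, and the function names are hypothetical.

```python
from datetime import date

# Assumption for illustration: code changed within this many days of being
# written counts as rework. The article does not specify a window.
RECENT_WINDOW_DAYS = 21


def classify_change(original_commit: date, modified_on: date,
                    recent_window_days: int = RECENT_WINDOW_DAYS) -> str:
    """Label a change as 'rework' (recent code) or 'refactor' (older code)."""
    age_days = (modified_on - original_commit).days
    if age_days < 0:
        raise ValueError("modification cannot precede the original commit")
    return "rework" if age_days <= recent_window_days else "refactor"


def rework_rate(changes: list[tuple[date, date]]) -> float:
    """Fraction of changes that touched recently written code."""
    if not changes:
        return 0.0
    reworked = sum(
        1 for original, modified in changes
        if classify_change(original, modified) == "rework"
    )
    return reworked / len(changes)


if __name__ == "__main__":
    changes = [
        (date(2024, 1, 1), date(2024, 1, 5)),   # changed after 4 days
        (date(2023, 1, 1), date(2024, 1, 1)),   # changed after a year
    ]
    print(f"rework rate: {rework_rate(changes):.0%}")
```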
How development workflow impacts other metrics
Understanding DORA metrics is important. They do matter, but they’re not enough.
PR size, deployment frequency, and rework rates all affect development workflow, impacting overall productivity and efficiency. Average or even strong dev teams can grow stronger by zeroing in on these key metrics. When they do, other metrics like planning accuracy, CFR, and MTTR often fall in line.
A crucial finding of LinearB’s Engineering Benchmarks study is that predictability stems from smaller PRs and shorter cycles. By looking at the key indicators, teams can foresee problems before they come up, and they have the time and space to plan for them. Instead of spending up to three cycles recovering, teams know exactly how long problem-solving will take, if those problems happen at all.
With proper focus and utilization, development workflow metrics can transform organizations. As useful as DORA metrics are, dev teams’ overarching goal should be using better indicators to power them into the elite performing bracket with a supercharged development workflow.
This article is based on an episode of Dev Interrupted, a podcast for dev leaders that explores different strategies and tricks for everything from managing dev teams to speeding up delivery times.