Continuous Delivery, High-performing Engineering Teams, and the Holy Grail

Jeremy Meiss

DevRel & Community professional

Open to work

So back to the tech industry…

Continuous Delivery

aws-cd-def — Source: AWS Continuous Delivery

What would you say CD is?

Continuous Delivery

the ability to get changes-—features, configuration changes, bug fixes, experiments—-into production or into the hands of users, safely and quickly in a sustainable way.

– Jez Humble. DevOpsDays Seattle 2017. Continuous Delivery Sounds Great But It Won’t Work Here

Modern Software Delivery

engineers owning their own code in production
practicing observability-driven development
testing in production
separate deploys from releases with feature flags
continuous deployment/delivery

– Charity Majors, Fintech DevCon 2023, Twitter thread

Deploy as fast as possible, as automated as possible

– Charity Majors, Fintech DevCon 2023

one size doesn’t fit all — Image: Consumer Choice Center

CI/CD Benchmarks for high-performing teams

Duration

Mean time to resolve

Success rate

Throughput

Duration

the foundation of software engineering velocity, measures the average time in minutes required to move a unit of work through your pipeline

Duration Benchmark

<=10 minute builds

"a good rule of thumb is to keep your builds to no more than ten minutes. Many developers who use CI follow the practice of not moving on to the next task until their most recent check-in integrates successfully. Therefore, builds taking longer than ten minutes can interrupt their flow."

– Paul M. Duvall (2007). Continuous Integration: Improving Software Quality and Reducing Risk

Duration: What the data shows

Benchmark: 5-10mins

Improving test coverage

Add unit, integration, UI, end-to-end testing across all app layers
Add code coverage into pipelines to identify inadequate testing
Include static and dynamic security scans to catch vulnerabilities
Incorporate TDD practices by writing tests during design phase

Optimizing your pipelines

Use test splitting & parallelism for simultaneous multiple tests
Cache dependencies & data to avoid rebuilding unchanged code
Use Docker images custom made for CI environments
Choose the right machine size for your needs

Mean time to Recovery

the average time required to go from a failed build signal to a successful pipeline run

"A key part of doing a continuous build is that if the mainline build fails, it needs to be fixed right away. The whole point of working with CI is that you're always developing on a known stable base."

– Martin Fowler (2006). “Continuous Integration.” Web blog post. MartinFowler.com

MTTR Benchmark

<=60min MTTR on default branches

MTTR: What the data shows

Benchmark: 60 mins

Treat your default branch as the lifeblood of your project

Getting to faster recovery times

Treat default branch as the lifeblood of your project
Set up instant alerts for failed builds (Slack, Pagerduty, etc.)
Write clear, informative error messages for your tests
SSH into the failed build machine to debug remote test env

Success rate

number of passing runs divided by the total number of runs over a period of time

Failed signals are not all bad

Success rate benchmark

90%+ success rate on default branches

Success rate: What the data shows

Benchmark: 90%+ on default

Throughput

average number of workflow runs that an organization completes on a given project per day

Throughput benchmark

It depends.

Throughput: What the data shows

Benchmark: at the speed of your business

Throughput is the most dependent on the other metrics

High-performing teams in 2023

2023 State of Software Delivery Report

go.jmeiss.me/SoSDR2023

Thank You.

/in/jeremymeiss

@IAmJerdog

@jerdog

@jerdog@hachyderm.io