Addressing Process Bottlenecks in Software Delivery

Theory of Constraints and Queue Theory

Hany Elemary
navalia

The modern software development process is multifaceted. It relies on sequencing streams of work such as design, development, testing, and deployment to achieve the end goal for its customers.

Often, when trying to enhance software delivery performance, organizations get sidetracked by surface-level issues while the core problems continue to fester. To help navigate this, we use two theories: the Theory of Constraints (ToC) and Queue Theory. Together, they keep our attention on the changes that have the most impact.

Introduction

At a high level, bottlenecks in software delivery stem from three main areas: People, Process and Technology. Here are a few examples:

  • People — underpowered staff, missing competencies or critical roles.
  • Process — lengthy reviews/approvals (coding, security, QA, deployment).
  • Technology — tech debt or legacy systems.

This article will focus entirely on Process as it tends to be the most common theme at the enterprise.

Theory of Constraints and Software Delivery

The Theory of Constraints presents a methodology for identifying the most significant constraint that stands in the way of achieving a goal and then systematically improving that constraint until it’s no longer the limiting factor.

In the context of software delivery, the process constraints at the enterprise often come in many flavors of “Approvals”:

  • Change Approval Boards (CAB)
  • Architecture Review Boards (ARB)
  • Code Review Approvals
  • Manual Security Approvals
  • Manual Testing Approvals

Given the influence of these sub-groups within the enterprise, development teams sometimes choose to improve areas they have more control over, either upstream or downstream of the main constraints. An admirable approach, yet ineffective.

The Theory of Constraints tells us that regardless of how efficient the upstream or downstream processes are, the overall software delivery timeline is determined by the constraint’s efficiency.

A visualization for the constraint / bottleneck

This implies that improving code or test practices upstream may yield only marginal improvements to overall software delivery performance. In fact, if these improved practices increase throughput, they’re likely to cause longer delays at the constraint, because work arrives faster than the constraint can process it (see Queue Theory below). Conversely, improvements to downstream activities do not help new units of work flow any faster, since those activities can only process what the constraint releases.
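To make this concrete, here is a minimal Python sketch of a sequential delivery pipeline. The stage names and weekly rates are hypothetical, chosen only for illustration, and the model assumes a steady state in which every stage works at its stated rate.

```python
# Hypothetical weekly capacity (work items per week) of each sequential stage.
baseline = {"develop": 10, "test": 8, "cab_approval": 3, "deploy": 12}


def find_constraint(stage_rates):
    """Return the name of the slowest stage, i.e. the constraint."""
    return min(stage_rates, key=stage_rates.get)


def pipeline_throughput(stage_rates):
    """A sequential pipeline can deliver no faster than its slowest stage."""
    return min(stage_rates.values())


def backlog_growth(stage_rates, constraint_name):
    """Items per week piling up in front of the constraint.

    Work reaches the constraint at the rate of the slowest upstream stage.
    Relies on the dict listing stages in pipeline order.
    """
    names = list(stage_rates)
    upstream = names[: names.index(constraint_name)]
    arriving = min(
        (stage_rates[name] for name in upstream),
        default=stage_rates[constraint_name],
    )
    return max(0, arriving - stage_rates[constraint_name])


# Improve the upstream stages only, then the constraint only.
faster_upstream = {**baseline, "develop": 20, "test": 16}
faster_constraint = {**baseline, "cab_approval": 6}

for label, rates in [
    ("baseline", baseline),
    ("faster upstream", faster_upstream),
    ("faster constraint", faster_constraint),
]:
    print(
        label,
        "| constraint:", find_constraint(rates),
        "| throughput:", pipeline_throughput(rates),
        "| backlog growth:", backlog_growth(rates, "cab_approval"),
    )
```

In this sketch, doubling the upstream rates leaves throughput at 3 items per week while the backlog in front of the approval stage grows faster (13 items per week instead of 5). Doubling the approval rate is the only change that raises throughput.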

Queue Theory and Software Delivery

Queue Theory provides a mathematical approach to studying waiting lines. By analyzing the arrival rate (how quickly tasks come in), the service rate (how quickly tasks are completed), and the number of servers (resources available to perform the tasks), queue theory provides insights into the average waiting time, the expected queue length, and overall system utilization.
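As a rough illustration, the sketch below computes these quantities for the simplest textbook model, the M/M/1 queue. That model is an assumption made here for brevity: it has a single server, random (Poisson) arrivals and exponentially distributed service times. Real approval processes rarely match it exactly, and the rates used are hypothetical.

```python
def mm1_metrics(arrival_rate, service_rate):
    """Steady-state metrics for an M/M/1 queue (single server).

    arrival_rate: tasks arriving per unit of time (lambda)
    service_rate: tasks the server can complete per unit of time (mu)
    """
    if arrival_rate >= service_rate:
        # The queue never stabilizes when work arrives at least as fast
        # as it can be served.
        raise ValueError("arrival rate must be below service rate")

    utilization = arrival_rate / service_rate
    # Average time a task waits in the queue before service starts.
    avg_wait = utilization / (service_rate - arrival_rate)
    # Little's law: average queue length = arrival rate * average wait.
    avg_queue_length = arrival_rate * avg_wait
    return {
        "utilization": utilization,
        "avg_wait": avg_wait,
        "avg_queue_length": avg_queue_length,
    }


# Hypothetical approval board that can handle 10 requests per week.
for requests_per_week in (5, 8, 9):
    metrics = mm1_metrics(requests_per_week, 10)
    print(
        f"{requests_per_week} requests/week -> "
        f"utilization {metrics['utilization']:.0%}, "
        f"wait {metrics['avg_wait']:.2f} weeks, "
        f"queue {metrics['avg_queue_length']:.1f} requests"
    )
```

Under these assumptions, the wait does not grow in proportion to load: moving from 50% to 90% utilization raises the average wait from 0.1 to 0.9 weeks.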

In software delivery, a queue is formed whenever tasks (such as feature development, bug fixes, delivery pipelines, process approvals) are waiting for available resources (developers, testers, compute, CAB). By analyzing lead time for changes (how quickly code gets to production) and cycle time (how quickly a task gets completed), we can get a glimpse of where optimizations are necessary.

Below is an example that showcases lead time for changes vs cycle time. The example also uses the Change Approval Board (CAB) as a proxy for the number of servers available, which makes it the constraint.

Lead time vs Cycle time

Though this is purely a visual example, it’s not uncommon for, let’s say, CAB approvals to take the same length of time it took to do the creative work itself at the enterprise. For instance, if a feature took a single week in “creative land” (develop and QA), it may take another week in “process land” (approve deploying to production).
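The one-week-plus-one-week example can be expressed as a small Python calculation. The timestamps and field names are hypothetical. The sketch also assumes that cycle time covers the creative work (develop and QA) and that lead time runs from the start of work until the change is deployed to production.

```python
from datetime import datetime

# Hypothetical timestamps for a single feature.
feature = {
    "work_started": datetime(2024, 1, 1),
    "ready_for_approval": datetime(2024, 1, 8),   # develop and QA finished
    "deployed_to_production": datetime(2024, 1, 15),
}


def days_between(start, end):
    """Elapsed time between two timestamps, in days."""
    return (end - start).total_seconds() / 86400


cycle_time_days = days_between(
    feature["work_started"], feature["ready_for_approval"]
)
lead_time_days = days_between(
    feature["work_started"], feature["deployed_to_production"]
)
# Share of the lead time spent waiting on process rather than on creative work.
process_share = (lead_time_days - cycle_time_days) / lead_time_days

print(f"cycle time: {cycle_time_days:.0f} days")
print(f"lead time:  {lead_time_days:.0f} days")
print(f"time spent in process land: {process_share:.0%}")
```

Tracking the gap between the two numbers per work item is one way to see how much of the lead time is spent waiting at the constraint.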

And while CAB is commonly seen as the bureaucratic equivalent of the DMV (Department of Motor Vehicles) in enterprise software, it’s important to note that even in “creative land”, inefficient processes can arise. And like CAB, these processes often start with the best intentions but eventually turn into obstacles, acting as barriers to progress.

To illustrate the point, practices like code reviews or pull request approvals, though inherently creative, can lead to bottlenecks when they are overly extensive or when there’s a mismatch between the expected quality and the actual requirements.

Teams with faster code reviews have 50% higher software delivery performance — DORA 2023 Report.

Premature Optimizations and Mistaken Approaches

Capacity increase

While People can also be a constraint (understaffed teams), increasing capacity (adding more developers) may not always be the right optimization upfront given process bottlenecks. For one, increasing capacity might produce work that ends up waiting in line, leading to longer queues and increased lead times, thereby negating the intended effect.

And two, arriving at the “right” capacity is much more difficult in a system that has process constraints. How do you know what investments to make? How many more staff should you add? And does the benefit outweigh the cost? If so, how do you measure it? These questions are incredibly difficult to answer if you don’t have a good view of where all the constraints lie.

Parallelizing work streams

Adding parallel streams of work, while maintaining a singular constraint, will not lead to shorter timelines either. Instead, it has the same effect as increasing capacity. Unless you also parallelize your constraint, the increased capacity will either have no impact or result in longer lead times.

When a task cannot be partitioned because of sequential constraints, the application of more effort has no effect on the schedule.

Fred Brooks, The Mythical Man-Month.
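A small simulation can illustrate why adding capacity or parallel streams does not help while the constraint stays fixed. This is a deliberately simplified sketch: the team counts, output per team and weekly approval capacity are hypothetical, and work is assumed to arrive and be approved in whole items once per week.

```python
def simulate(weeks, teams, items_per_team, approvals_per_week):
    """Simulate work flowing through a single approval constraint.

    Returns (delivered, backlog): items approved in total and items
    still waiting for approval at the end of the simulation.
    """
    backlog = 0
    delivered = 0
    for _ in range(weeks):
        # Every team finishes its work and hands it to the approval queue.
        backlog += teams * items_per_team
        # The constraint approves at most its weekly capacity.
        approved = min(backlog, approvals_per_week)
        backlog -= approved
        delivered += approved
    return delivered, backlog


scenarios = {
    "2 teams, 5 approvals/week": simulate(10, 2, 2, 5),
    "4 teams, 5 approvals/week": simulate(10, 4, 2, 5),
    "4 teams, 10 approvals/week": simulate(10, 4, 2, 10),
}

for label, (delivered, backlog) in scenarios.items():
    print(f"{label}: delivered {delivered}, still waiting {backlog}")
```

In this sketch, doubling the number of teams raises delivery over ten weeks from 40 to 50 items and leaves 30 items waiting. Delivery reaches 80 with no backlog only when the approval capacity is raised as well.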

Bundling work in large batches

To reduce the overhead of approvals, development teams may have the tendency to work in larger batches before they get to the “approval gate.” Here are some examples:

  • Bundling different changes in a single pull request/code review/architecture review.
  • Bundling multiple features together before getting QA approvals.
  • Bundling multiple changes in a single deployment.

This approach does save time spent in “process land.” But it comes with considerable drawbacks.

Let’s use deployments to better illustrate this point. A team cuts down the time spent in CAB approvals by bundling 4 features together rather than seeking separate deployment approvals for each.

Yet, the larger the change, the more challenging it becomes to diagnose and fix any ensuing problems. Simultaneously, this approach delays the delivery of value to the customer. Larger batches mean customers must wait longer to receive updates.

Graph representing both large batch vs small batch deployments while highlighting the area under the curve as deployment risk.
Risk and time to value are directly correlated with batch size.
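The trade-off in the four-feature example can be sketched with simple arithmetic. The model below is an assumption for illustration only: features finish one at a time at a fixed pace, a batch ships as soon as its last feature is done, and approval time itself is ignored.

```python
import math


def approvals_needed(total_features, batch_size):
    """Number of deployment approvals required for a set of features."""
    return math.ceil(total_features / batch_size)


def average_wait_weeks(batch_size, weeks_per_feature=1):
    """Average time a finished feature sits idle before its batch ships.

    Feature i is finished at week i * weeks_per_feature, and the batch
    ships when the last feature in it is finished.
    """
    ship_week = batch_size * weeks_per_feature
    waits = [
        ship_week - i * weeks_per_feature for i in range(1, batch_size + 1)
    ]
    return sum(waits) / batch_size


for batch_size in (1, 4):
    print(
        f"batch of {batch_size}: "
        f"{approvals_needed(4, batch_size)} approval(s), "
        f"average wait {average_wait_weeks(batch_size)} weeks, "
        f"{batch_size} change(s) to inspect if the deployment fails"
    )
```

Bundling four features cuts the approvals from four to one. Under these assumptions, each finished feature also waits 1.5 weeks on average before customers see it, and a failed deployment has four candidate causes instead of one.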

Addressing Constraints in Software Delivery

At large enterprises, removing constraints associated with different organizational boundaries can be daunting. Shifting the focus of these groups towards guidance instead of gates is often the pragmatic approach.

For instance, the Change Approval Board (CAB) can guide development teams on how best to de-risk deployments via principles and hands-on help. This implies proficiency in platform engineering, CI/CD concepts and enabling a mature DevOps culture.

The Architecture Review Board (ARB) can guide teams by publishing opinions and viewpoints on technology and architecture choices. They can also guide in complex situations with many trade-offs and help with tool assessments while creating a technology compass for the enterprise.

The Security team can enable secure coding practices by partnering with development teams to embed vulnerability and dependency scanning in CI/CD pipelines. Evangelizing secure practices, raising awareness on the latest attack vectors and holding hackathons can level up the security knowledge within the organization.

The QA team can ensure quality outcomes with the development teams by evangelizing a strategy that balances automated integration testing with manual, exploratory testing. They can guide and enable development teams by identifying the systems’ choke points. Quality is everyone’s responsibility, not just QA’s. This mindset enables joint bug bashes and fix sessions to reduce the QA “sign off” pileup.

And, lastly, the Development team can adopt pair programming as a way of reviewing code in real time. Not as a substitute for code reviews, but rather as an avenue that improves software delivery performance, collective knowledge-sharing and quality.

In all of these cases, there’s a common theme that reduces hand-offs and barriers:

Move away from high coordination, low autonomy activities to high collaboration and enablement.

As a final point of caution, some of these standalone teams contradict ideal team topologies, creating undesirable silos and power imbalances. The suggestions here are merely transitional, not the ideal state.

Though it may be unpopular, organizations should aim to dissolve isolated teams or “center-of-excellence” structures, opting instead for temporary enablement teams or integrating them into stream-aligned teams.

Conclusion

Queue Theory and the Theory of Constraints offer tools for optimizing software development and ensuring a healthy flow of customer and business value. By focusing on identifying and addressing the process constraints, leaders can effectively reduce lead times, improve software quality, and make better use of their staff and resources.

When you have constraints across People, Process and Technology, it’s often more practical to first address the Process constraints followed by the others. For one, streamlined processes will be foundational when addressing constraints across People and Technology. Processes are also less costly to change compared to altering technology infrastructures or reshaping the workforce.

Acknowledgments

Big thanks to Smitha Ajay, Luke Belliveau, Bruna Castelo, Max Francis, Bruno Furtado, Cliff Morehead, Lav Pathak, Ibrahim Taha, Justin Ramos, Gedeon Santos and Rodrigo Vasconcelos for reading drafts of this article and providing valuable feedback.
