Reading Don Reinertsen’s book “The Principles of Product Development Flow: Second Generation Lean Product Development” was a truly enlightening experience, especially when it comes to understanding queues and the problems they cause in a regular product development process. With almost five years of development behind Kanbanize, and many more before that, we have come to realize that understanding and measuring queues is one of the most important aspects of product development.
The goal of this article is to present a simplified overview of queues, how they work, how you can find them in your business and what tactics you can use to fight them off.
Why are queues in product development dangerous?
The short answer:
Queues can cost you a lot of money.
The long answer:
Let’s dissect what happens in a regular software development team whose QA function is understaffed (which is almost always the case). In a situation like this, a pile of tasks accumulates, waiting to be tested by the QA guys. This is the “Ready to be Tested” queue, which could contain a great many undiscovered issues.
Depending on the size of the “Ready to be Tested” queue, we could run into serious trouble. Suppose module X is a central one and many other modules are developed on top of it. If the QA people cannot get to module X soon enough, we risk all the other modules being built before X has been tested. What if QA discovers a major issue with X and the architecture needs to be changed? Congratulations, we’ve just lost a lot of money.
There is an almost identical scenario with QA being replaced by product management. Quite often product managers have too much work on their own “To Do” queue and fail to produce good requirements or stories on time. Having nothing to work on, developers start working on what’s left in their heads after a 10-minute phone call with the PM or what the team lead thinks the PM wants. After some time the PM takes a look at the already working software and asks for ‘a few’ changes, which require quite a bit of refactoring. Guess what? We’ve just lost a lot of money once again.
How does a queue occur?
Queues in a given process occur right before a step with limited capacity and/or high utilization (QA, Product management). Actually, capacity utilization is the single most important factor for the occurrence of queues. This is somewhat natural, because when a process is run at 100% utilization, any new work would automatically sit on the “waiting” queue until someone has free capacity to take it.
The interesting part is that queue size grows nonlinearly with capacity utilization: going from 80% to 90% utilization roughly doubles the queue size, and going from 90% to 95% roughly doubles it once again. Since the queue size also affects the cycle time of each new job, you have to be careful about the utilization at which you operate your processes. If cycle time is important, choose lower utilization, which gives important new jobs a quick turnaround.
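To get a feel for these numbers, here is a minimal sketch that assumes the simplest textbook queueing model (M/M/1: random arrivals, exponentially distributed service times, a single server). The article does not commit to a particular model, so treat the function name and the model choice as our assumptions. Under this model the average number of waiting jobs is ρ² / (1 − ρ), where ρ is the utilization.

```python
def mm1_queue_length(utilization: float) -> float:
    """Average number of jobs waiting (not yet in service) in an M/M/1 queue.

    Assumption: M/M/1 model, chosen only for illustration.
    """
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in the range [0, 1)")
    return utilization ** 2 / (1 - utilization)


for rho in (0.80, 0.90, 0.95):
    waiting = mm1_queue_length(rho)
    print(f"{rho:.0%} utilization: about {waiting:.0f} jobs waiting on average")
```

Under these assumptions each of the two steps slightly more than doubles the queue, and the queue grows without bound as utilization approaches 100%.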
Should I aim for a zero-size queue?
It is important to note that queues are not always bad, and whether you should allow a queue of a given size to occur is an economic question. If an extra developer costs you N, but the potential delay of the project costs you 1000 x N, then it is probably wise to just hire the developer and bring the development queue down to zero. However, if you need to spend a gazillion dollars to shorten the test cycle by one week and the benefit would be a pat on the back, well, you shouldn’t do it.
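The trade-off can be written down as simple arithmetic. The sketch below is only an illustration: the function name and all the figures are made up, and in practice estimating the cost of delay is the hard part.

```python
def net_benefit(capacity_cost: float, delay_cost_per_week: float, weeks_saved: float) -> float:
    """Money saved by avoiding delay minus money spent on extra capacity."""
    return delay_cost_per_week * weeks_saved - capacity_cost


N = 10_000  # hypothetical cost of one extra developer

# An extra developer who saves four weeks of a very expensive delay.
print(net_benefit(capacity_cost=N, delay_cost_per_week=250 * N, weeks_saved=4))

# A huge investment that saves one week of a nearly free delay.
print(net_benefit(capacity_cost=1_000_000, delay_cost_per_week=500, weeks_saved=1))
```

A positive result suggests that shrinking the queue pays for itself; a negative one suggests living with the queue.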
As a matter of fact, queues are sometimes necessary. If you have a machine or a person that affects the overall throughput of the system (bottleneck) you don’t want any idle time for them. To make sure they always have something to work on, you deliberately build a queue right before them. This queue takes away the variation of arriving new jobs and ensures maximum throughput for your process.
Typical places to look for queues
In reality, a company that is not trying (hard) to implement lean will have “ghost” queues all over the place. We say “ghost” queues because nobody realizes that they exist and that they greatly affect the ability of the business to execute. To make it easier for the inexperienced lean thinker to find queues in their work, we’ll present a few typical places where they occur in almost any company out there.
As described in the example above, product management tends to have a big queue of unrefined ideas collected by themselves or by the marketing team. Many of these ideas could elevate the company to a new level, as long as product management has the capacity to refine them into real business cases and work with the engineering teams to realize them. Failing to do so results in a lot of missed opportunities and sometimes in direct losses, when customers churn because their feedback was not heard.
This is the right place to say that implementing a FIFO (first in, first out) scheduling policy for product management (like the one lean people use in manufacturing) might be a really bad mistake. One way to minimize the economic losses caused by queues is to sequence the jobs in the order that makes the most economic sense. If you have a very critical job that would incur a lot of extra cost if delayed, you may want to schedule it ahead of many others that carry no such risk. If you have two such jobs, you may want to start with the shorter one.
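One common way to combine the two rules (high cost of delay first, shorter job first) is to sort jobs by cost of delay divided by duration, the “weighted shortest job first” heuristic from Reinertsen’s book. The sketch below is our own illustration: the `Job` class, the helper names, and the numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    cost_of_delay: float  # money lost per week the job is not done (made-up units)
    duration: float       # weeks of work the job needs


def sequence_jobs(jobs):
    """Order jobs by cost of delay per week of work, highest first."""
    return sorted(jobs, key=lambda j: j.cost_of_delay / j.duration, reverse=True)


def total_delay_cost(ordered_jobs):
    """Sum each job's cost of delay over the time it waits until it is finished."""
    elapsed = 0.0
    total = 0.0
    for job in ordered_jobs:
        elapsed += job.duration
        total += job.cost_of_delay * elapsed
    return total


backlog = [  # listed in arrival (FIFO) order
    Job("minor UI polish", cost_of_delay=1, duration=2),
    Job("critical fix, long", cost_of_delay=10, duration=4),
    Job("critical fix, short", cost_of_delay=10, duration=1),
]

print("FIFO cost:     ", total_delay_cost(backlog))
print("Sequenced cost:", total_delay_cost(sequence_jobs(backlog)))
```

With these made-up numbers, sequencing by economics roughly halves the total cost of delay compared with FIFO.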
Sustaining / Support teams
In a typical product development organization, a sustaining team will have a huge backlog of customer issues. If things are really bad, the backlog will grow day by day, and the only ways to fight it off would be to either defer many of the low-priority issues or bring new engineers onto the team. Both are highly ineffective as far as economics goes, because everyone knows that the sooner a bug gets fixed, the less it costs. Well, if you never fix it, it won’t cost you anything, but how many customers will be chased away by such behavior?
The only reasonable way to tackle sustaining queues is to 1) maintain their size so that they never grow out of control and 2) gradually pay off the debt you’ve accumulated. We will explore a few ways to manage queues in the last section of this article.
We’ve already discussed what happens when QA is overloaded and can’t get to important jobs on time. It’s fair to say that QA people can also benefit from a proper scheduling strategy (not FIFO). An idea that is pretty natural and usually achieves good results is assigning each job a priority; the higher the priority, the closer the job sits to the front of the queue.
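A priority-ordered test queue of this kind can be sketched with Python’s standard `heapq` module. The class name, job names, and priority values below are made up for illustration; jobs with equal priority are served in arrival order.

```python
import heapq
import itertools


class QaQueue:
    """Test queue that always hands out the highest-priority job first."""

    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()  # tie-breaker: keeps equal priorities in arrival order

    def add(self, job: str, priority: int) -> None:
        # heapq is a min-heap, so the priority is negated to pop the largest first.
        heapq.heappush(self._heap, (-priority, next(self._arrival), job))

    def next_job(self) -> str:
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)


queue = QaQueue()
queue.add("typo on settings page", priority=1)
queue.add("module X core API", priority=9)
queue.add("export to CSV", priority=5)

while queue:
    print(queue.next_job())
```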
Business reviews / management reviews
When operating in a field of great risk, it is usually required that all jobs get reviewed by a senior team member. Concrete examples might be a code review by a team lead, a sign-off by a product manager, a review of test cases by a senior QA, etc. This type of queue is quite bad not just for the economics, but also for the morale of the team. Nobody is happy to work on something, pass it on for review, start working on something else and then receive the results of the review two weeks later (by which time they have forgotten half of what was done). We will explore a potential resolution for such situations in the last section below.
Companies that work on embedded software very often suffer when they realize that their hardware will only become available towards the middle of the project. More often than not, such companies have a centralized purchasing team or department that is responsible for supplying every engineering team with the hardware it needs. Unfortunately, these teams have a lot of jobs in their queues, and they also inherit the delays caused by the vendor’s queues. Big queues in purchasing can lead to serious trouble, so keeping them under control is essential as far as economics goes.
These are by no means all the places where you can find queues in a company. Queues are basically everywhere and we, as lean thinkers, should do our best to, first, start spotting dangerous queues, second, evaluate the economics of these queues and, third, figure out how to minimize the potential losses they might cause.
When it comes to managing queues, the absolute winner is the combination of the Kanban method with a cumulative flow diagram (CFD), a graphical tool that represents data about arrivals, time in queue, quantity in queue, and departures.
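As a rough illustration of the data behind a CFD, the sketch below builds the cumulative arrival and departure series from daily counts. The function name and the numbers are made up. On the diagram, the vertical gap between the two lines is the quantity in queue, and the horizontal gap is the time in queue.

```python
from itertools import accumulate


def cfd_series(arrivals, departures):
    """Return cumulative arrivals, cumulative departures and the queue size per day."""
    cum_arrivals = list(accumulate(arrivals))
    cum_departures = list(accumulate(departures))
    in_queue = [a - d for a, d in zip(cum_arrivals, cum_departures)]
    return cum_arrivals, cum_departures, in_queue


daily_arrivals = [3, 2, 4, 1, 0]    # jobs entering the queue each day (made up)
daily_departures = [1, 2, 2, 3, 2]  # jobs leaving the queue each day (made up)

cum_in, cum_out, queue_size = cfd_series(daily_arrivals, daily_departures)
for day, (a, d, q) in enumerate(zip(cum_in, cum_out, queue_size), start=1):
    print(f"day {day}: arrived {a}, departed {d}, in queue {q}")
```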
Implementing a Kanban System
Kanban is gaining more and more popularity as a tool superior to traditional or even agile methods. It’s a pity that it has been overloaded with a lot of meanings and nowadays almost nobody knows what Kanban really is, but we urge you to go to the source of truth by reading David Anderson’s book Kanban: Successful Evolutionary Change for Your Technology Business. If you’re interested in why Kanban is not just a few sticky notes on the wall, feel free to check this blog article: Kanban Is More Than Sticky Notes on the Wall
As far as this article is concerned, we recommend using Kanban to 1) visualize queues and 2) limit the number of jobs in queues.
When queues are not buried on your hard drive but visualized on the whiteboard or a big 65” screen, then ghosts suddenly get brought to light. It’s not easy to neglect the fact that X number of ideas are pending to be refined or Y number of orders should be processed tomorrow unless you want to have the project delayed. Visualization is a really powerful tool and it is the first step to adopting Lean and Kanban. Here is how a regular Kanban board looks in Kanbanize:
The second, and even more important, thing to do is to limit the size of queues. It is as simple as saying “The maximum number of jobs ready to be tested is 5”. Once your “Ready to be Tested” queue holds 5 items, no more upstream work that would add items to that queue should be started. You basically stop the entire “production line” and focus everyone on freeing capacity in that particular queue.
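In code, such a limit is just a queue that refuses new items once it is full. The sketch below is a hypothetical illustration (the class and method names are ours, not part of any Kanban tool): upstream work checks `can_accept()` before starting anything that would land in the queue.

```python
from collections import deque


class LimitedQueue:
    """A queue with a work-in-progress limit, as on a Kanban board column."""

    def __init__(self, name: str, limit: int):
        self.name = name
        self.limit = limit
        self._items = deque()

    def can_accept(self) -> bool:
        return len(self._items) < self.limit

    def push(self, item: str) -> None:
        if not self.can_accept():
            raise RuntimeError(f"'{self.name}' is at its limit of {self.limit}")
        self._items.append(item)

    def pull(self) -> str:
        # Downstream (QA, in the article's example) takes the oldest item.
        return self._items.popleft()


ready_to_test = LimitedQueue("Ready to be Tested", limit=5)
for n in range(1, 6):
    ready_to_test.push(f"task {n}")

print(ready_to_test.can_accept())  # the queue is full, so upstream work should stop
ready_to_test.pull()               # someone tests the oldest task
print(ready_to_test.can_accept())  # capacity is free again
```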
How do you do it? The most obvious thing to do is to ask your developers to start testing. This makes sure that no upstream work is being done and that the QA queue actually gets depleted. This is what happens in a supermarket when there are more than X people waiting at each cash desk. The guys who usually put things on the shelves switch to being cashiers. When the queues are gone, they go back to what they usually do.
If switching developers to testing is not an option, then in the short term you have to ask them to do something valuable that is not going to end up in the “Ready to be Tested” queue. This might be reading books, figuring out new ways to implement scalable architecture or even getting lunch for the QA guys. In the long term, you have to figure out how to balance the system, so that you establish a smooth flow across your Kanban system. This might mean getting more QA guys, investing in test automation or moving some of the testing into the development cycle itself.
Cumulative Flow Diagram (CFD)
Since we have a dedicated article on the Cumulative Flow Diagram, it would be pointless to repeat ourselves here, so we’d urge you to read it: Cumulative Flow Diagram in Kanbanize.
It is important to remember that queues are one of the major factors that affect the economics of a business. Being an economic problem, they should be approached with an economic state of mind.
The typical places to look for queues are process steps (or people) with high utilization and limited capacity. Keeping a deliberate upstream queue in front of such a step or person can be a good move, as it shields your bottleneck from variation in arriving work.
One of the most productive ways to manage queues is to implement a Kanban system for each of the services in your business. Probably the best tool for analyzing how jobs are distributed across queues and how they affect your cycle times is the cumulative flow diagram (CFD).
Happy Kanbanizing from Kanbanize – Kanban Software for Visual Work Breakdown and Tracking