As more and more work is pushed into the system, the amount of multitasking increases exponentially. Any work that is started is put aside in order to serve some other urgent request; then that work is resumed; then it is put aside again; then resumed; and so on, and so forth. The pattern repeats over and over again, for any and all kinds of work.
It is a vicious circle that effectively makes any work spend most of its time waiting to be worked on, rather than actually being worked on.
One way to address this is to focus on any work item, and make sure it is worked on until it is done before any other work item is started. The focusing mechanism can be realized by deliberately limiting the work in process.
How can we deliberately do so?
Very simply by not starting any work until there is sufficient available capacity to take care of it. In other words, it is just a matter of deciding when to start work. Rather than starting work as soon as it arrives into the system, the starting is postponed.
Why would this make sense? First, it avoids trying to change the rate of delivery (which we know is costly and prone to failure). Second, by deliberately deciding when to start working, we can determine how much work is loaded into the system; and in particular, we can decide to start work at the same rate that we are able to deliver it.
In our simplified diagram, it would look like this:
Limiting Work in Process
The effect of not starting work is (obviously) that the amount of work effectively being worked on is limited; and because we now strive to start work at the same rate at which it can be delivered, we have the sought after effect: the two lines are parallel.
The amount of work in the system remains constant, as does the time from start to finish. The system is no longer overburdened; it has become stable and predictable.
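The effect can be sketched with a toy simulation. The numbers are hypothetical (5 items arriving and 3 items delivered per period), and the model ignores the multitasking penalty entirely; it only shows how the amount of work in process behaves under the two starting policies.

```python
def simulate_wip(arrivals, capacity, periods, limit_starts):
    """Return the work in process (started, not yet delivered) after each period.

    arrivals and capacity are hypothetical per-period rates, not data from the text.
    """
    backlog = 0   # work that has arrived but has not been started
    wip = 0       # work that has been started but not delivered
    history = []
    for _ in range(periods):
        # Deliver what the system can handle from the work already started.
        wip -= min(capacity, wip)
        backlog += arrivals
        if limit_starts:
            # Start work only at the rate at which it can be delivered.
            to_start = min(backlog, capacity)
        else:
            # Start everything as soon as it arrives.
            to_start = backlog
        backlog -= to_start
        wip += to_start
        history.append(wip)
    return history


unlimited = simulate_wip(arrivals=5, capacity=3, periods=6, limit_starts=False)
limited = simulate_wip(arrivals=5, capacity=3, periods=6, limit_starts=True)
print(unlimited)  # [5, 7, 9, 11, 13, 15]
print(limited)    # [3, 3, 3, 3, 3, 3]
```

With limited starts, the work in process stays flat; the excess demand waits in the backlog in front of the system instead of piling up inside it.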
By highlighting the difference between the rate of starting work and the rate of delivery, this constancy is clear, precisely because the two lines are parallel:
But We Cannot Wait!
The idea of deferring the start of work is hard to digest if we are used to always starting work for the sake of “serving the customer” and being responsive.
Even worse, if we believe that starting work sooner will deliver the work sooner, we might conclude that this postponement will result in an even later delivery, and hence discard the idea before thinking any further.
Yet the diagrams show otherwise.
Postponed commitment does not mean late delivery. Since we have not changed anything with respect to the slopes of the demand line and of the delivery line, the overall performance will – at least – not be worse than before. (Actually, it will be better as a consequence of the reduced multitasking.)
The effect of the postponement can be thought of as a rearrangement of the time the work item is actually worked on, and the time the work item is sitting still waiting to be worked on. It is as if (most of) the waiting time is moved in front of the work.
In other words, work is queued in a waiting line until it can be worked on and serviced with no interruptions. The situation is not much different from what happens when we stand in line for a burger at McDonald’s or for a ride at Disneyland.
It is a common experience that we wait in line until our turn comes.
Since, from the delivery perspective, there is no apparent delay, there is no reason why a client would perceive this as any worse.
Notably, we can even tell clients how long they will have to wait for their service and predict when they will receive the delivery. Much like McDonald’s and Disneyland.
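As a rough sketch of such a prediction: if the delivery rate is stable, the expected wait is the number of items already in line divided by that rate. The function name and the figures below (12 items ahead, 3 items delivered per week) are hypothetical and only serve as an illustration.

```python
def predicted_wait(items_ahead, delivery_rate):
    """Estimate how long a newly queued item waits before it can be started.

    Assumes a stable delivery rate and first-in, first-out service.
    """
    if delivery_rate <= 0:
        raise ValueError("delivery_rate must be positive")
    return items_ahead / delivery_rate


wait_weeks = predicted_wait(items_ahead=12, delivery_rate=3)
print(wait_weeks)  # 4.0
```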
While nothing changes from the client’s perspective, the positive effect is that we have gained those parallel lines that give us stability and predictability. We will not be overburdened and we can keep our promises.
At this point, when the system is stable and predictable, we can start to address the real issue of increasing performance.
Touch Time and Wait Time
Even after extracting the “Disneyland attraction wait time” from the overall service time, there is still a lot of time that is wasted. We have weeded out all wait time that is external to our process, but there typically is substantial wait time inside it.
We can reflect on this. Let’s start by distinguishing between these two states:
- When work is being worked on, which we qualify as Touch Time
- When work is waiting to be worked on, which we qualify as Wait Time
If in our simplified model, a horizontal line in our diagram represents the time it takes for a piece of work to go from start to finish, we can represent its Touch Time and Wait Time as follows:
The work timeline is split into two parts; the four segments at the top represent the Touch Time, while the three segments at the bottom represent the Wait Time.
If Touch Time and Wait Time are measured, it is not uncommon to find a large imbalance between the two, where the total amount of Wait Time is far greater than the total amount of Touch Time. (It is not coincidental that, in the illustration, the bottom segments are longer than the top segments. In real settings the disproportion would be even greater than what is illustrated, so much so that the drawing would have to extend for several pages in width!)
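To make the split concrete, here is a minimal sketch that sums a work item's timeline by state. The segment durations (in hours) are invented for illustration; they mirror the four touch segments and three wait segments of the illustration, not measured data.

```python
from collections import defaultdict

# Hypothetical timeline of one work item: (state, duration in hours).
timeline = [
    ("touch", 2), ("wait", 40),
    ("touch", 1), ("wait", 24),
    ("touch", 3), ("wait", 56),
    ("touch", 2),
]


def total_by_state(segments):
    """Sum the durations of all segments, grouped by their state."""
    totals = defaultdict(int)
    for state, hours in segments:
        totals[state] += hours
    return dict(totals)


totals = total_by_state(timeline)
print(totals)  # {'touch': 8, 'wait': 120}
```

In this made-up example the item waits fifteen times longer than it is worked on.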
Work Faster or Deliver Earlier?
Conventional improvement initiatives focus on “working faster,” because they aim at increasing the rate of delivery. We know that in order to deliver faster, investments have to be sustained, changes have to be undertaken, and the risk of failure is large.
We have observed that any complex work process is a sequence of interdependent steps, and that the execution time is necessarily divided into Touch Time and Wait Time. Working “faster” means spending less time “touching” the work; the focus is decidedly on reducing the Touch Time.
But we do have another option. We can choose not to work faster at all, and strive instead to reduce the Wait Time(s)!
If we are able to reduce the Wait Times without changing the existing work processes, it means that we do not have to sustain investments, undertake changes or incur risks.
The operation will be much cheaper – it will cost nothing! – and there will be no pain in adopting different working procedures.
The two alternatives can be visualized like this:
Everything remains as it was: same work processes, same tools, same people, same infrastructure. (Hence no extra cost and very low risk of failure.) The only thing that changes is the decision making about when to start and stop working (in other words, how to coordinate and synchronize work) so that the wait times are reduced.
But the actual working procedure is not changed at all. How brilliant is that?
Delivering Earlier Will Increase Throughput
Now there is an apparent paradox to come to grips with.
While work is not performed any faster than before – and hence there is no additional stress or burden put on the system or on the people – the time from start to finish is reduced.
In the diagram, this reduction results in the delivery line shifting to the left with respect to the demand line. So at any given point in time, the new delivery line will be higher than the old one.
The slope of the delivery line remains the same (because work is not performed any faster than before); but since the line is shifted to the left, the amount of work delivered at any point in time will be greater than before the improvement effort.
It is worth repeating: no effort is expended to try to work faster, work is still performed at the same speed as before. Instead, the total service time is reduced by reducing the Wait Time inside the process.
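The left shift can be sketched numerically. The model below is a deliberate simplification: it treats the delivery line as a straight line that starts once the first item has gone through the full service time. The rate of 3 items per week and the service times of 10 and 8 weeks are hypothetical.

```python
def delivered_by(t, delivery_rate, service_time):
    """Cumulative work delivered at time t on a straight delivery line.

    The line has slope delivery_rate and starts at t = service_time.
    """
    return max(0, delivery_rate * (t - service_time))


# Same slope (delivery rate); only the service time differs.
before = delivered_by(20, delivery_rate=3, service_time=10)
after = delivered_by(20, delivery_rate=3, service_time=8)
print(before, after)  # 30 36
```

Nothing is worked on any faster, yet at week 20 more work has been delivered, because the whole line has moved to the left.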
On the diagram it would look like this:
Is it Worth it? The Impact of Flow Efficiency and Little’s Law
The initial model has undergone a lot of elaboration. We have examined a non-conventional way to produce a performance improvement, and the last diagram shows that it is possible to achieve it by postponing commitment and striving to reduce wait times, rather than just “working harder” to increase the rate of delivery.
The question that arises: is it worth it?
To reply factually, we would need to run some experiments and collect real data, and make the decision accordingly. That’s what we typically do in TameFlow: we make a hypothesis, run experiments, collect data and decide accordingly.
When we start measuring Touch and Wait times, we will typically find surprises. The amount of Touch Time is usually a small fraction of what we might initially expect.
To contemplate how much or little Touch time there is with respect to the end-to-end time (service time), we consider the metric of Flow Efficiency.
Flow Efficiency is the ratio between the Touch Time and the end-to-end service time (i.e. the sum of the Touch and Wait Times), expressed as a percentage. If no attention has been given to these ideas, flow efficiency is usually very low, typically on the order of 3-7%.
To answer the question of whether it is worth it, let’s run a thought experiment and reason about a hypothetical improvement.
Suppose that we are given the power to effectively produce a time reduction on the order of 20%, but we have to choose where to apply it. We can choose to decrease the Touch Time (i.e. increase the delivery rate, which is the conventional improvement thinking of “work faster”); or to decrease the Wait Time (i.e. postpone commitment, and “deliver earlier”).
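As a minimal sketch, the metric can be computed directly from the definition above. The function name and the sample figures (8 hours of touch time against 120 hours of wait time) are hypothetical.

```python
def flow_efficiency(touch_time, wait_time):
    """Touch time as a percentage of end-to-end service time."""
    total = touch_time + wait_time
    if total <= 0:
        raise ValueError("total service time must be positive")
    return 100 * touch_time / total


efficiency = flow_efficiency(touch_time=8, wait_time=120)
print(efficiency)  # 6.25
```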
Suppose further, for the sake of argument, that the current flow efficiency is 5%. That means that 5% of the whole service time is Touch Time; while 95% is Wait Time.
From the structure of our diagram, which plots Work against Time, we can also quantify the throughput of our efforts in terms of the amount of work produced per unit of time. This is an (extremely simplified) version of the so-called Little’s Law, which is expressed by the equation:
TP = WIP / FT
where TP is throughput, WIP is work in process and FT is (flow) time.
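A small sketch of this relationship, using made-up numbers (12 items in process, a flow time of 4 weeks and then 3 weeks). It treats all quantities as long-run averages of a stable system, which is an assumption of this simplified form.

```python
def throughput(wip, flow_time):
    """Simplified Little's Law: TP = WIP / FT."""
    if flow_time <= 0:
        raise ValueError("flow_time must be positive")
    return wip / flow_time


tp_before = throughput(wip=12, flow_time=4)
tp_after = throughput(wip=12, flow_time=3)
print(tp_before, tp_after)  # 3.0 4.0
```

With the work in process held constant, any reduction of the flow time shows up directly as higher throughput.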
Our time reduction will be applied to the denominator, but with the distinction of the two cases. Here is a breakdown of the reasoning:
|Increase Delivery Rate|Postpone Commitment & Reduce Wait Times|
|---|---|
|Reduce touch time by 20%|Reduce wait time by 20%|
|Touch time is 5% of total time|Wait time is 95% of total time|
|Total time is reduced by 1% (i.e. 20% of 5%)|Total time is reduced by 19% (i.e. 20% of 95%)|
|Total time becomes 99% of the original (100%-1%)|Total time becomes 81% of the original (100%-19%)|
|New throughput increases by a factor of (100/99)=1.01|New throughput increases by a factor of (100/81)=1.23|
|The impact on throughput is +1%|The impact on throughput is +23%|
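The arithmetic in the table can be reproduced with a few lines of code. This is only a sketch of the thought experiment: the function name and its parameters are ours, and work in process is assumed to stay constant so that throughput scales with the inverse of total time.

```python
def throughput_factor(flow_efficiency, reduction, target):
    """Factor by which throughput grows when one time component is reduced.

    flow_efficiency: touch time as a fraction of total time (0.05 for 5%).
    reduction: fractional reduction applied to the chosen component.
    target: "touch" or "wait".
    """
    touch_share = flow_efficiency
    wait_share = 1.0 - flow_efficiency
    if target == "touch":
        new_total = 1.0 - reduction * touch_share
    elif target == "wait":
        new_total = 1.0 - reduction * wait_share
    else:
        raise ValueError("target must be 'touch' or 'wait'")
    # With constant WIP, throughput is inversely proportional to total time.
    return 1.0 / new_total


work_faster = throughput_factor(0.05, 0.20, "touch")
deliver_earlier = throughput_factor(0.05, 0.20, "wait")
print(round(work_faster, 2), round(deliver_earlier, 2))  # 1.01 1.23
```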
It is clear that a 20% reduction applied on Touch Time is a much worse choice than an equivalent 20% reduction applied on Wait Time, due to the (generally) bad flow efficiency.
The conclusion is clear: when reducing Touch Time by 20%, throughput will increase by a mere 1%; but when reducing Wait Time by the same amount, throughput will increase by 23%!
The difference is astonishing. But the situation is even worse because in the former case we have to sustain expenses, reorganizations, restructuring, retraining and accept all related risks of failure; while in the latter, we incur none of that.
If we examine the changes from a financial perspective by referring to the fundamental equation of throughput accounting (for an introduction to throughput accounting, see the blog post: Theory of Constraints and Software Engineering), the differences in favor of the second choice become even more compelling.
That deciding equation is:
ROI = (T – OE) / I
The return on investment (ROI) is the difference between the (financial) throughput T (i.e. sales minus totally variable expenses) and the operating expenses OE, divided by the investment I.
We can reasonably expect the financial throughput to increase in proportion to the increase in operational throughput; so it will be in the order (or proportion) of +1% and +23%, respectively, for the two cases.
Regarding operating expenses, they would increase substantially when trying to “work faster,” for instance, if we have to hire new people; but there would be zero difference in operating expenses when trying to “deliver earlier” because hiring or similar contingencies would never be required (remember that in the second case nothing changes except decision making).
Regarding investments, they would increase substantially when trying to “work faster,” for instance, if we have to buy new equipment (computers, offices, etc.); but there would be zero difference in investments when trying to “deliver earlier” because acquiring new assets or equipment would not be necessary.
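To see how the three terms interact, here is a sketch with entirely invented figures. Only the +1% and +23% throughput changes come from the thought experiment; the baseline amounts and the extra operating expenses and investment in the “work faster” case are assumptions chosen for illustration.

```python
def roi(throughput, operating_expenses, investment):
    """Throughput accounting: ROI = (T - OE) / I."""
    if investment <= 0:
        raise ValueError("investment must be positive")
    return (throughput - operating_expenses) / investment


# Hypothetical baseline, in arbitrary currency units.
baseline = roi(throughput=1000, operating_expenses=800, investment=2000)

# "Work faster": +1% throughput, with assumed extra expenses and investment.
work_faster = roi(throughput=1010, operating_expenses=850, investment=2100)

# "Deliver earlier": +23% throughput, expenses and investment unchanged.
deliver_earlier = roi(throughput=1230, operating_expenses=800, investment=2000)

print(f"{baseline:.3f} {work_faster:.3f} {deliver_earlier:.3f}")
```

Under these assumed figures, “work faster” ends up with a lower ROI than the baseline, while “deliver earlier” more than doubles it.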
Consider that the “working harder” option needs to be sustained by an increase in operating expenses or by additional investments (or, more likely, both). Despite the nominal improvement of 20% in the Touch Time, the overall improvement is a mere 1% reduction in the time to market, so the bottom line impact will be almost negligible, if not negative.
There is no doubt that it is much wiser to try the unconventional approach and “deliver earlier.” Time to market will be reduced by 19% and operational throughput will increase by 23%; there will be no changes in work processes and consequently no increase in operating expenses or investment. Zero changes, zero costs, zero investments, zero risks; but 23% more in throughput.
Basically, the improvement comes for “free” – except, of course, that we need to adopt the new mental model, and make decisions (about when to start and when to stop work) accordingly.
This example explains why conventional improvement initiatives that focus on “working harder” provide only marginal bottom line effects. As this hypothetical example shows, even a massive 20% reduction in touch time translates into just a 1% increase in throughput. No wonder that improvements of 2-3% are considered successful. In comparison, a 23% increase in throughput must be considered miraculous.
The Limits of “Deliver Earlier”
The model presented offers many benefits, but it has limits. When making a deliberate effort to eradicate Wait Times and improve Flow Efficiency, there comes a point of diminishing returns.
At some point, there will be no more Wait Time that can be removed effectively. That is when it will be necessary to consider the conventional option of “working harder.”
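One way to picture the diminishing returns is to apply the same relative reduction to the Wait Time over and over. The sketch below assumes, purely for illustration, that every round removes 20% of the remaining wait time, starting from 5 units of Touch Time and 95 units of Wait Time.

```python
def successive_wait_reductions(touch, wait, reduction, rounds):
    """Apply the same fractional wait-time reduction repeatedly.

    Returns one (time_saved, flow_efficiency_percent) pair per round.
    """
    results = []
    for _ in range(rounds):
        saved = wait * reduction
        wait -= saved
        results.append((saved, 100 * touch / (touch + wait)))
    return results


outcome = successive_wait_reductions(touch=5, wait=95, reduction=0.20, rounds=5)
for saved, efficiency in outcome:
    print(f"time saved: {saved:.2f}, flow efficiency: {efficiency:.1f}%")
```

Each round saves less time than the one before while flow efficiency climbs, so the Touch Time gradually becomes the larger lever.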
That is when we need the support of further mental models – those of constraints management – to learn where to focus improvement efforts and where to invest in order to have a real impact on the organization’s performance, while minimizing the risk of failure. Without knowing where to focus the effort (on the constraint), any effort will be totally vain and wasted.
Without such additional mental models, any such effort is very likely doomed to fail, or, in the most favorable conditions, produce only marginal improvements, similar to the example above of improving the Touch Time by 20% but not resulting in any significant benefits.
This post is an extract from the course material of the TameFlow Performance Leadership Training. The course is delivered online by Steve Tendon, and is aimed at team leaders and their teams (up to 16 people).
Steve Tendon is a management consultant, business advisor and author. He holds an MSc in Software Project Management from the University of Aberdeen, and an MIT Fintech Innovation: Future Commerce certificate from the Massachusetts Institute of Technology. With a background in Software Engineering, he has led the development of numerous applications in various fields, such as banking, health care, legal, human resources, and others.
As a management consultant focusing on organizational performance, he has had a number of significant assignments. For Wolters-Kluwer (a global information services company based in Amsterdam), he designed a digital transformation resulting in 40 million Euro cost savings. Steve is the creator of the TameFlow management approach. TameFlow was critical for William-Hill (a FTSE 100 iGaming company in London) to increase productivity by 240% and reduce time to market by 70%, and also win the UK Agile Awards 2014. TameFlow is used across many industries, like: aviation, automotive, financial services and others.