Stratify Data to Hone in on Special Causes of Problems

By John Hunter, founder of

We have a tendency to focus on special causes even when poor results are due to common causes within the system. When results are due to the system, trying to determine the specific problem behind each bad result and then fix that problem is an inefficient improvement strategy.

But there are times when seeking the special cause of bad results, finding the root cause of that problem, and fixing it is the best strategy. One way to help identify special causes so they can be studied and addressed is to stratify your data. Brian Joiner provides a good example in his book, Fourth Generation Management (pages 142-143).

A garage responsible for city vehicles found a spike in the need to replace fuel pumps. The supervisor asked the mechanics to look into the problem (letting those familiar with the work address the issue). Since the organization was becoming familiar with using data, they dug into the data from the past year.

They discovered that heavy equipment vehicles were responsible for most of the fuel pump replacements. With their mechanical knowledge, they knew there was no reason that should be true. So they attempted to determine what other factors could further isolate the special cause of the problem. This is a stratification strategy.

By stratifying the data you refine your view to make it easier to identify what is causing the problem. Instead of looking at all vehicles to find the cause, they stratified the data and learned they could exclude most of the processes from the investigation (those that don’t impact large vehicles). They then stratified the data again to narrow the scope of the investigation further. As you refine the scope you can discover what is common just to the population you have isolated by stratifying the data.

They discovered that vehicles of all ages had the problem, that it didn’t matter what other repairs had been performed, but that the problem only appeared in diesel-powered vehicles.
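For those who keep this kind of data in a spreadsheet or database, the narrowing described above can be sketched in a few lines of Python. This is only an illustration: the records, the field names (`vehicle_class`, `fuel`) and the `count_by` helper are invented here and are not from Joiner's book or the garage's actual data.

```python
from collections import Counter

# Hypothetical fuel pump replacement records, invented for illustration.
replacements = [
    {"vehicle_class": "heavy", "fuel": "diesel"},
    {"vehicle_class": "heavy", "fuel": "diesel"},
    {"vehicle_class": "heavy", "fuel": "diesel"},
    {"vehicle_class": "heavy", "fuel": "diesel"},
    {"vehicle_class": "heavy", "fuel": "diesel"},
    {"vehicle_class": "heavy", "fuel": "diesel"},
    {"vehicle_class": "light", "fuel": "gasoline"},
    {"vehicle_class": "light", "fuel": "gasoline"},
]


def count_by(records, factor):
    """Stratify records by one factor and count how many fall in each group."""
    return Counter(record[factor] for record in records)


# First cut: which class of vehicle accounts for most replacements?
by_class = count_by(replacements, "vehicle_class")

# Second cut: narrow to the group that stands out, then stratify again.
heavy = [r for r in replacements if r["vehicle_class"] == "heavy"]
by_fuel = count_by(heavy, "fuel")

print(by_class.most_common())
print(by_fuel.most_common())
```

Each cut shrinks the population you need to study, which is the point of the strategy: the second count only looks at the vehicles the first count singled out.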

To stratify the data sensibly you need people with the expertise to know which factors could plausibly cause the problem you are trying to address. You also need people who can use their specialized knowledge to think through results that don’t seem to make sense.

What was it about diesel fuel in heavy equipment that could lead to problems with fuel pumps? Nothing jumped out at them at first. Then they looked at where the fuel was coming from – different vendors, different fueling locations – and Aha!, discovered that the problem only appeared in vehicles that used the fueling station on the east side of town.

By stratifying the data by the potential causes that experts (on the related processes and on the problem you are investigating) think could be involved, you learn where to focus the investigation. Some ways of stratifying the data won’t help isolate the issue into a smaller set of results. But in their case, one way of stratifying the data presented a clear indication of where to focus their efforts.
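One rough way to see which stratification is informative is to check, for each candidate factor, how concentrated the problem is in a single group. The sketch below does that with invented records; the factor names (`age`, `vendor`, `station`), the values and the `top_share` helper are assumptions made for illustration, not the garage's data or a method described in the book.

```python
from collections import Counter

# Hypothetical records for heavy diesel vehicles that needed a new fuel pump.
records = [
    {"age": "0-3 yrs", "vendor": "A", "station": "east"},
    {"age": "4-7 yrs", "vendor": "B", "station": "east"},
    {"age": "8+ yrs", "vendor": "A", "station": "east"},
    {"age": "0-3 yrs", "vendor": "B", "station": "east"},
    {"age": "4-7 yrs", "vendor": "A", "station": "east"},
    {"age": "8+ yrs", "vendor": "B", "station": "east"},
]


def top_share(records, factor):
    """Return the most common value of a factor and its share of all records."""
    counts = Counter(record[factor] for record in records)
    value, count = counts.most_common(1)[0]
    return value, count / len(records)


# Compare candidate factors: a share near 1.0 means the problem is
# concentrated in one group; an even spread means that cut does not help.
shares = {factor: top_share(records, factor) for factor in ("age", "vendor", "station")}
most_concentrated = max(shares, key=lambda factor: shares[factor][1])

for factor, (value, share) in shares.items():
    print(f"{factor}: top group {value!r} holds {share:.0%} of replacements")
print("Look first at:", most_concentrated)
```

In this made-up data, age and vendor spread the replacements evenly while the fueling station does not. A concentration like that is only a pointer for where to look: it still has to be compared with how the whole fleet is distributed (if every vehicle fueled at the same station, the count would tell you nothing) and checked by people who know the process.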

With the indication that the fueling station on the east side of town was a root cause, they examined that process more closely and determined that the fuel tank at that location had sprung a leak. Water had gotten into the fuel, leading to rusted fuel pumps.

Stratification helps us identify and eliminate common cause variation by revealing patterns in the data that point to the source of the trouble.

To perform stratification analysis, we must have information on conditions related to the data… You may have some of the information you need readily at hand, but early on most organizations find that the information they need most is not available, either because it hasn’t been collected, or because it is “in the computer” and is somehow inaccessible to humans. In such cases, you may need to collect new data.

Special causes can hide in the mass of data. By using stratification you readjust your focus and can discover special causes that can be addressed.

I believe Fourth Generation Management by Brian Joiner is one of the most valuable management books, as well as one of the most useful for those interested in applying W. Edwards Deming’s management theory. The book is especially good for understanding how to use data within a management context. Brian discusses data not with formulas but in terms of how managers can and should use data to improve. He also discusses how misusing data leads to the management problems common to organizations led without an understanding of variation.

Related: Quality Comes to City Hall – Good Process Improvement Practices – How to Use Data and Avoid Being Mislead by Data – The Art of Discovery – Special Cause Signal Isn’t Proof A Special Cause Exists
