Justifying Root Cause Analysis

Make the business case with a significant calculated return on investment.

How often have we heard that we do not have time to do root cause analysis (RCA)? This is certainly the paradigm from those closest to the work, especially if they operate in a reactive culture. What about when we hear that RCA is too expensive? This is generally the paradigm from management, or those farthest from the work.

What is common about these two perspectives? Both perceptions represent reality because if that is what we believe, then our decisions will be made on that basis. So how do we overcome this hurdle of letting our paradigms prevent us from taking advantage of opportunities?

Opposing paradigms: operations vs. finance
Let’s explore this issue from two different perspectives: operational and financial. The operational people are those who are closest to the work and are responsible for maximizing the output of the organization. In this world, a reactive culture usually dominates. So whether we are making paper, processing patients, or dealing with customer complaints, we are likely dealing with the moment and handling one fire at a time.

In this world it is difficult to listen to people who advocate an activity like RCA because, as it stands now, there does not seem to be enough time in a day to do our current job. Now someone wants us to perform another task, RCA, when our plates are already full. Let’s face it: this is the reality when working at this level. We do not see RCA as a solution to our already overburdened work schedule. We see RCA as an obstacle to fighting fires in the short term.

Contrast this perspective against the financial one. Management level people are typically the ones that are charged with fiscal responsibility. So their world is one of numbers, statistics, and ultimately dollars. When people approach them about the concept of RCA, the first issue in their mind is: "How much is this going to cost?" Again, this is their world.

Usually in this world the first question is not "What value proposition does RCA bring to the table?" In the financial world we are dictated to by the budget, and no matter how attractive the opportunity, the cost in relation to the budget will be one of the major deciding factors. Sometimes our performance evaluations will reward us for staying within the budget, so there is a personal incentive to view everything from the cost standpoint versus the value standpoint.

What happens when these two worlds collide? We become risk averse in our decision-making and our operations. When this happens we hang out in the safety zone and, if we are lucky, make marginal improvements over time. Creativity is stifled and we become human robots doing nothing more or less than we are told.

The facts about RCA
RCA is not a tool that is related to any specific industry; it is specific to human beings. We all come with the same equipment; we have brains that are wired to use inductive and deductive logic to think things through and solve problems, no matter what the problem. This must be realized and accepted in order to disprove those that believe that RCA is a tool for only mechanical situations or only for an industrial plant. We as human beings will use the same mental faculties to solve why a crude unit in an oil refinery failed as to why the cable does not work on the upstairs television.

Any RCA methodology on the market today must hang its hat on the science behind cause and effect relationships. The only difference between RCA methods is the manner in which they graphically represent these cause and effect relationships and how well these hypotheses are proven to be true or false.

So how do we build a convincing argument that supports why we should be permitted (if not required) to do RCA? We all know that the decision to pursue this RCA task will come from management, as they will have to authorize the funds to train employees and allow them time to practice what they learn. Therefore we must appeal to the financial perspective in order to get the ball rolling in the operations world.

Chronic events
What is the best way to demonstrate future trends in spending? With everything else held constant, the past. We have all heard the definition of insanity: doing the same thing over and over again and expecting a different result. The same is true of spending trends. In industry, what is usually a large category in any maintenance budget? The one labeled "General" or "Routine." This is like the "Other" category: a reservoir for all the expected unexpected events.

These are the items that have fallen into the cost of doing business paradigm. They are typically small in individual consequence, and they do not hurt people or violate any regulations. They just retire into the pasture of the budget and are never questioned. From year to year we review the past spending on such items and bump it up a little for the cost of living increase.

If we can agree in concept up until this point, then let’s try to now express this in a graphical and financial manner. This is how we prepare our business case to management in an effort to sell the concept of RCA.

First, in order to convince management to invest in an RCA effort, we must present the opportunities that are available. In the RCA world, opportunities are generally expressed in terms of current losses experienced. With this in mind, let’s picture a scenario we can all relate to from past experience at some point in our careers.

Developing an RCA business case
For example, say we work in a continuous process manufacturing operation. The nature of the product is irrelevant. This operation produces a high-margin product in a sold-out market. Simply put, we can sell anything we can make. In this environment, what should be the most appropriate definition of a loss? Is it when equipment breaks down? Is it when the operation stops?

Step 1–Identifying the scope of the analysis. In making our business case, we want our presentation to have the utmost impact. Therefore, we need to seek out the area with the greatest opportunities available. This is usually the area referred to as the bottleneck of the operation.

The bottleneck is typically the weak link, and we all know that the operation can only be as strong as its weakest link. Everyone usually knows which operation is the weakest link in any organization. For our purposes, we need to identify what this operation is, where it begins, and where it ends (Fig. 1). This will be the scope of our analysis for our business case.

Step 2–Defining a loss in the current economic environment. Now with the scope of the analysis defined, we can move on to understanding what is most important to measure in this operation. Since we can sell all we can make, the most important factor to the business is reliability of the operation. This means that an hour of lost production matters far more than the failure of a spare piece of equipment.

Remember, this is under the conditions described earlier. To set our focus, we will define a loss for our facility as any event or condition that interrupts the continuity of maximum quality production.

Step 3–Mapping out the weakest link. To help identify specific events that occur within the weakest link operation, we must draw a simple process flow block diagram. A block diagram easily maps out the flow of the product through the operation.

Step 4–Determining the potential gains. Based on this weakest link, what is its design capacity versus what it is actually producing? If the system is capable of producing 1 million tons per year and we, on average, are producing only 850,000 tons per year, then the opportunity lies in the difference or 150,000 tons.

Since this is a high-margin product, we lose the margin on every ton we cannot make and sell. For example, let’s say that we make a margin of $100 per ton ($100/T). Therefore, the financial opportunity is 150,000 tons x $100/T = $15 million.
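The Step 4 arithmetic can be sketched in a few lines of Python. The function name and parameter names are illustrative assumptions; the figures are the ones used in the example above.

```python
def production_opportunity(design_capacity_tons, actual_tons, margin_per_ton):
    """Return (lost tons, lost margin in dollars) for one year."""
    lost_tons = design_capacity_tons - actual_tons
    return lost_tons, lost_tons * margin_per_ton


# Figures from the example: 1 million tons capacity, 850,000 tons actual, $100/T margin
lost_tons, lost_dollars = production_opportunity(1_000_000, 850_000, 100)
print(f"{lost_tons:,} tons -> ${lost_dollars:,}")  # 150,000 tons -> $15,000,000
```

Substituting the design capacity, actual output, and margin of our own bottleneck gives the target figure that the rest of the business case works toward.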

Step 5–Locating the losses. Now that we know there is $15 million out there for the taking, how do we identify where it is? We simply take the information we have collected and develop a spreadsheet to make our data collection efforts easier. We need to locate the events that are preventing us from reaching our potential. An appropriate spreadsheet may have column headings including Subsystem, Event, Mode, Frequency, Impact/Occurrence, and Total Annual Loss.
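As a rough sketch, one row of such a spreadsheet could be modeled as below. The class name, field names, and sample row are hypothetical, and the relationship Total Annual Loss = Frequency x Impact/Occurrence is an assumption inferred from the column headings rather than something stated outright.

```python
from dataclasses import dataclass


@dataclass
class LossEvent:
    subsystem: str
    event: str
    mode: str
    frequency_per_year: float      # occurrences per year
    impact_per_occurrence: float   # dollars lost each time the event occurs

    @property
    def total_annual_loss(self) -> float:
        # Assumed relationship: annual loss = frequency x impact per occurrence
        return self.frequency_per_year * self.impact_per_occurrence


# Hypothetical row, for illustration only
row = LossEvent("Conveying", "Belt stops", "Jammed roller", 120, 2_500)
print(row.total_annual_loss)
```

Keeping frequency and impact as separate fields makes it plain when a small but frequent event outweighs a large but rare one.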

Where does the most reliable data come from to fill out such a spreadsheet? This is up to you. If you feel that your computerized maintenance management system (CMMS) is accurate enough to reflect the true activities of the field, then you should use it as your data source. If you do not, then you should contact the source of raw data: the people.

Oftentimes we do not realize that people are the most common sources of data input into databases. When events in the field occur so often, and they take short periods of time to repair, the effort to put them in recording systems outweighs the time it took to fix them. The end result: they do not make it into the recording systems and they remain in the heads of those that fixed the problems. Such events are hidden gold and the only way to find them is to talk to those closest to the work.

Step 6–Identifying the significant few. Imagine our spreadsheet with dozens or hundreds of events listed (depending on the size of the operation). How do we know when we are done? End the list when the Total Annual Loss column totals within ±10 percent of the target identified (the difference between actual production and potential).
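The stopping rule can be written as a simple check. This is a minimal sketch: the function name is an assumption, and the loss figures in the usage lines are made up purely to exercise it against the $15 million target from Step 4.

```python
def list_is_complete(event_losses, target_loss, tolerance=0.10):
    """True when the summed annual losses fall within +/- tolerance of the target."""
    total = sum(event_losses)
    return abs(total - target_loss) <= tolerance * target_loss


target = 15_000_000  # opportunity identified in Step 4

# Hypothetical running totals while the list is being built
print(list_is_complete([6_000_000, 3_000_000], target))             # False, keep interviewing
print(list_is_complete([6_000_000, 4_500_000, 3_700_000], target))  # True, within 10 percent
```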

Now that we have this wealth of information, how do we finalize our business case? Take the total of the Total Annual Loss column and multiply it by 80 percent. Then sort the events from the highest to the lowest total annual loss and see how many events it takes to add up to 80 percent of total annual losses. Typically, 20 percent or less of the events will be accountable for 80 percent or more of the losses.
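The sort-and-accumulate procedure just described might look like this in Python. It is a sketch only: the function name, event names, and dollar figures are hypothetical.

```python
def significant_few(annual_losses, share=0.80):
    """Return the highest-loss events that together reach `share` of total losses.

    annual_losses: dict mapping event name -> total annual loss in dollars.
    """
    threshold = share * sum(annual_losses.values())
    ranked = sorted(annual_losses.items(), key=lambda item: item[1], reverse=True)

    selected, running_total = [], 0.0
    for name, loss in ranked:
        if running_total >= threshold:
            break  # the events already selected cover the target share
        selected.append(name)
        running_total += loss
    return selected


# Hypothetical events and losses, for illustration only
losses = {
    "Pump seal leak": 250_000,
    "Belt jam": 600_000,
    "Sensor drift": 80_000,
    "Valve sticking": 40_000,
    "Filter plugging": 20_000,
    "Hose rupture": 10_000,
}
print(significant_few(losses))  # ['Belt jam', 'Pump seal leak']
```

In this made-up data set, two of six events account for 85 percent of the losses, which is the kind of concentration the 80/20 pattern leads us to expect.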

Step 7–Finalizing the business case. From this exercise we can see that it is possible to pinpoint the specific events that are causing the greatest losses. Contrary to popular belief, the majority of these Significant Few events are chronic events rather than sporadic ones.

This process has the unique capability of bubbling the chronic events to the top of the list, which otherwise go unnoticed because of their seemingly insignificant individual impacts. However, when aggregated over a year’s time, this analysis shows what is truly important.

Step 8–Calculating return on investment (ROI). We now can take all these elements of the business case process and roll them into a report for our management presentation. We can prove that the cost of training a team in RCA and focusing them on the Significant Few can yield a significant predetermined result.

We can easily calculate a proposed ROI that will be astounding. We have backed up all our claims and support our findings with evidence (hard data). Average ROIs for RCA range between 600 and 1000 percent. Oftentimes this is a hard sell because the numbers are so unbelievable, but using this process supports the case.

We did not attempt to hide any of the real costs of conducting an RCA. If there are more, then they should be added. RCA will require a little education and some software to help organize the effort. It is expected that a minimum class of 15 students and a maximum of 25 would be held. This will provide the necessary skills to the team leader, various team members, and supporting management personnel. This is a one-time cost that is sunk thereafter. Then there will be varying levels of dedication to the effort, but ideally there should be a full-time driver who oversees the analyses in progress.

Of course there will be team members needed based on their expertise in the analysis at hand. The make-up of the teams will change because of this. However, with this rotating role, it is expected that only four team members at a time will be occupied on a part-time basis during an analysis.

To provide support for solid conclusions, the RCA teams may need additional engineering support to help prove their hypotheses. The funds for this anticipated function are accounted for.
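To show how these cost elements roll up into an ROI figure, here is a minimal sketch. It assumes the common definition ROI = (return - cost) / cost, expressed as a percentage; the article does not spell out its formula. Every dollar amount below is a placeholder chosen for easy arithmetic, not a figure from the example, and should be replaced with accounting department numbers.

```python
def roi_percent(expected_annual_return, costs):
    """ROI as a percentage, assuming ROI = (return - total cost) / total cost."""
    total_cost = sum(costs.values())
    return (expected_annual_return - total_cost) / total_cost * 100


# Placeholder costs only; the categories follow the cost elements described above
costs = {
    "training": 50_000,
    "software": 20_000,
    "full_time_driver": 100_000,
    "part_time_team_members": 60_000,
    "engineering_support": 20_000,
}

# Placeholder return: the portion of the Significant Few losses we expect to recover
expected_return = 2_000_000

print(f"ROI: {roi_percent(expected_return, costs):.0f} percent")  # ROI: 700 percent
```

Halving `expected_return` and rerunning the calculation is a quick way to test how well the case holds up under more conservative assumptions.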

While this is only an example, we can get the idea. There is no need to beef up costs in such a business case because conservative numbers usually make just as convincing a case. Also, conservative numbers are easy to defend because we can use the fallback position of, "…we didn’t even include… ." Accounting department figures are the most credible because if the origin of these numbers is questioned, we can point to the bean counters as the source.

Now this is such an unbelievable ROI number, even though our data supports it, that we can make the case that if we cut the opportunity in half, the ROI would still be around 3850 percent. What is an acceptable ROI for an engineering project at our facilities now?

Robert J. Latino is senior vice-president of strategic development for Reliability Center, Inc., a reliability engineering firm specializing in improving equipment, process, and human reliability, 501 Westover Ave., Hopewell, VA 23860; (804) 458-0645

