The concept of run-to-failure (RTF) is widely misunderstood. Most people, engineers included, will provide the automatic response that if a component fails and nothing happens, it is a run-to-failure component. Another prevalent, but totally incorrect, assumption is that having redundant components or redundant systems automatically means the component or system is run-to-failure. These misconceptions are recipes for disaster.
Unfortunately, RTF has become the misunderstood “orphan” in the picture of reliability. The RTF definitions that exist today do not adequately address the true meaning of RTF, whether they are found in a reliability-centered maintenance (RCM) publication, a regulatory publication, or any other publication. The standard definition for RTF usually reads something like: “The component is allowed to fail without the requirement for any type of preventive maintenance” or “Run-to-failure is a policy that permits a failure to occur without any attempt to prevent it.”
These definitions are far too shallow to prevent the mismanagement of this very important concept. The time has come for a very precise and prescriptive definition for identifying when a component can be classified as RTF. I have termed this “The Canon Law For Run-To-Failure.” See accompanying text “The Canon Law for Run-To-Failure.”
The Canon Law For RTF is very specific. It goes beyond the traditional definition of RTF, which states only that preventive maintenance (PM) is not required prior to failure. The traditional definition makes no mention that corrective maintenance is required in a timely manner after failure.
However, that is only part of the RTF story. There are several other qualifiers before a component can be classified as RTF.
RTF components are understood to:
• have no safety, operational, or economic consequences as the result of a single failure.
• fail in a way that is evident to operations personnel.
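The two qualifiers listed above can be written as a simple check. The following Python sketch is illustrative only: the names SingleFailureAssessment and is_rtf_candidate are hypothetical and are not part of the Canon Law text, and a real classification would rest on an engineering review rather than on four yes/no flags.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SingleFailureAssessment:
    """Hypothetical record of what a single failure of one component causes."""
    safety_consequence: bool
    operational_consequence: bool
    economic_consequence: bool
    evident_to_operations: bool  # is the failure itself evident to operations personnel?


def is_rtf_candidate(a: SingleFailureAssessment) -> bool:
    """Return True only if both qualifiers from the list above hold."""
    no_consequence = not (
        a.safety_consequence or a.operational_consequence or a.economic_consequence
    )
    return no_consequence and a.evident_to_operations


# Assumed example: a standby pump whose failure causes no consequence on its
# own but is NOT indicated to operators. Redundancy alone does not qualify it.
hidden_standby_pump = SingleFailureAssessment(False, False, False, False)

# Assumed example: a component whose failure is harmless and plainly visible.
evident_harmless_part = SingleFailureAssessment(False, False, False, True)

print(is_rtf_candidate(hidden_standby_pump))    # False
print(is_rtf_candidate(evident_harmless_part))  # True
```

Note that the check says nothing about what happens after the failure. Passing it removes the need for preventive maintenance, not the need for timely corrective maintenance.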
RTF components are important
RTF components have been mistakenly designated as unimportant because they have no significant consequence as the result of a single failure. However, after failure, the component is required to be restored to an operable status via corrective maintenance in a timely manner.
RTF does not imply that a component is unimportant. It is just that some components must have a preventive maintenance strategy and RTF components do not. However, all components, even RTF components, are important to reliability and must have an equivalent corrective maintenance strategy, commensurate with all other components, and prioritized accordingly, based on the plant conditions at that time.
RTF components are designated as such due to the failure being evident and having no significant consequence as the result of a single failure. If it did not matter whether a failed component was restored to an operable status in a timely manner, one would question why that component was even installed in the plant.
Similarly, if the failure was forever hidden and no one ever knew about it and it did not matter how many additional multiple failures occurred, one also would question why that component was installed in the plant. The limited exceptions to this logic would include components that are used strictly for convenience and have no pertinent function.
Another major misconception about an RTF component is that fixing it once it has failed is optional, or that timely repair is not an important consideration. This is absolutely incorrect.
So often, engineers and senior management embrace the belief that RTF components are like secondhand junk cars—not worthy of even worrying about either before or after they fail. However, that line of reasoning is tantamount to having a flat tire, putting on the spare, and throwing the flat, with the nail embedded, back into the trunk and never worrying about it again.
One reason for this misguided logic is that preventive maintenance historically, and RCM specifically, has focused only on critical components to the detriment of all others. Another reason is the RTF terminology itself. It has somewhat of an ominous connotation.
I can remember several occasions when I had to use the words that a specific component was “governed by corrective maintenance” just to avoid using the RTF terminology because the receivers of the information in the conversation were not sufficiently astute to accept an RTF component as being anything other than totally irrelevant.
Corrective maintenance strategy needed
The fact is that all components are important to reliability, even RTF components. All components must have an equivalent corrective maintenance strategy.
A total proactive maintenance program includes corrective maintenance as well as preventive maintenance as integral parts of its strategy. Preventive maintenance is a strategy to prevent component failures. Corrective maintenance is a strategy to fix components once they have failed or have become degraded. These two entities are performed integrally to prevent a failure consequence at the plant level.
Having an RTF component as part of the maintenance plan may seem startling at first, but once you think about it, it becomes quite clear. The ultimate objective of a maintenance program is to prevent a consequence of failure at the plant level. Preventive maintenance tasks are specified to prevent component failures that can have either an immediate unwanted consequence of failure or the potential for an unwanted consequence of failure at the plant level when they fail. Corrective maintenance is specified to eliminate or reduce the vulnerability of a plant consequence should an additional component fail while any component, including an RTF component, is in its failed state.
If you do not impose a proactive corrective maintenance strategy, you run the risk of an unwanted plant consequence with an additional failure. Therefore, an RTF component that has failed must be fixed in a timely manner.
After a component has failed, whether it was governed by a preventive maintenance strategy or an RTF strategy, it is prioritized for corrective maintenance with an equivalent relative importance based on the plant conditions at that time. For example: What other equipment is out of service? What equipment performance levels are in an alert state? What associated equipment is planned for replacement? This requires a decision process, usually by Operations and Engineering, that considers all pertinent factors in attempting to prevent any possibility of a failure consequence at the plant level.
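The idea that a failed RTF component competes for corrective maintenance on equal terms can be sketched in Python. Everything here is an assumption made for illustration: the FailedComponent record, the exposure_score function, its weights, and the sample backlog are hypothetical. The actual decision is a judgment made by Operations and Engineering, not a formula. The point of the sketch is only that the prior strategy (PM or RTF) is recorded but plays no part in the ranking.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailedComponent:
    """Hypothetical work-backlog entry for a component awaiting repair."""
    name: str
    prior_strategy: str          # "PM" or "RTF"; deliberately not used for ranking
    related_out_of_service: int  # related equipment currently out of service
    related_in_alert: int        # related equipment with performance in an alert state


def exposure_score(c: FailedComponent) -> int:
    """Crude stand-in for plant conditions; the weights are arbitrary assumptions."""
    return 2 * c.related_out_of_service + c.related_in_alert


def rank_for_corrective_maintenance(failed):
    """Order failed components by current plant exposure, highest first."""
    return sorted(failed, key=exposure_score, reverse=True)


backlog = [
    FailedComponent("panel lamp 7", "RTF", related_out_of_service=0, related_in_alert=0),
    FailedComponent("feed valve 3", "PM", related_out_of_service=0, related_in_alert=1),
    FailedComponent("sump pump B", "RTF", related_out_of_service=1, related_in_alert=0),
]

ranked = [c.name for c in rank_for_corrective_maintenance(backlog)]
print(ranked)  # ['sump pump B', 'feed valve 3', 'panel lamp 7']
```

In this made-up backlog, an RTF component ranks first because related equipment is already out of service, which is the situation in which one additional failure is most likely to reach the plant level.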
The traditional vision of preventive maintenance, which I refer to as the smaller picture, is to prevent failures at the component level, before a component failure results in an unwanted plant consequence. However, I like to think in terms of the bigger picture of preventive maintenance, which is to prevent an unwanted consequence of failure not only at the component level but also directly at the plant level. Doing so includes addressing and prioritizing corrective maintenance within a total proactive maintenance strategy, and RTF components are an integral part of the bigger picture.
Neil Bloom is an RCM and preventive maintenance program consultant and author of an upcoming book entitled “Classical RCM Made Simple” to be published by McGraw-Hill later this year. He has spent more than 35 years in engineering and maintenance management positions in the commercial aviation and nuclear power industries.