Using MTBF to Determine Maintenance Interval Frequency Is Wrong

Collecting failure data to calculate mean time between failures (MTBF) in order to determine accurate maintenance task intervals is wrong and should not be done. MTBF is a measure of reliability: the average time between successive failure events.

Failures fall predominantly into two categories: age-related and random. Typically, age-related failures make up less than 20 percent of all failures, while random failures make up 80 percent or more.

For age-related failures, it is not MTBF but useful life that is significant when determining maintenance task intervals to avoid failures. There is a point in a piece of equipment’s lifetime at which its conditional probability of failure increases rapidly. The time between the point when the equipment is installed and the point where the conditional probability of failure begins to increase sharply is the “useful life” of the equipment. It is different from MTBF, which is defined as the average life of the entire population of that item in that service.

If we want to prevent a failure using traditional preventive maintenance, we would intervene just prior to the end of the equipment’s “useful life,” not just prior to MTBF. Incorrectly using MTBF to set the preventive maintenance interval will result in approximately 50 percent of all failures occurring before the maintenance intervention. The remaining components, approximately 50 percent, still have life left and will receive unnecessary maintenance attention. In both cases, this is not a very effective maintenance program. Therefore, we need to use “useful life” and not MTBF when looking at age-related failures and determining the frequency of preventive maintenance tasks.
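The roughly 50 percent figure above can be checked with a short calculation. The sketch below assumes, purely for illustration, that age-related lives follow a Weibull distribution with a shape of 3.5 and a characteristic life of 10 years. It also uses the age by which 5 percent of the population has failed as a stand-in for “useful life.” None of these modeling choices or numbers come from the article.

```python
import math

def weibull_cdf(t, shape, scale):
    """Fraction of the population that has failed by age t (Weibull model)."""
    return 1.0 - math.exp(-((t / scale) ** shape))

def weibull_mean(shape, scale):
    """Mean life of the population, i.e. its MTBF under this model."""
    return scale * math.gamma(1.0 + 1.0 / shape)

# Assumed illustrative parameters (not from the article).
SHAPE = 3.5    # shape > 1 means wear-out (age-related) behavior
SCALE = 10.0   # characteristic life in years

mtbf = weibull_mean(SHAPE, SCALE)
failed_before_mtbf = weibull_cdf(mtbf, SHAPE, SCALE)

# Hypothetical stand-in for "useful life": age by which 5 percent have failed.
useful_life = SCALE * (-math.log(1.0 - 0.05)) ** (1.0 / SHAPE)
failed_before_useful_life = weibull_cdf(useful_life, SHAPE, SCALE)

print(f"MTBF: {mtbf:.2f} years, failed by then: {failed_before_mtbf:.0%}")
print(f"Useful life: {useful_life:.2f} years, "
      f"failed by then: {failed_before_useful_life:.0%}")
```

Under these assumptions, intervening at the MTBF comes after about half of the population has already failed, while intervening at the assumed useful life comes much earlier and catches nearly all of it.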

Research has shown that random failures make up the vast majority of failures on complex equipment. For example, consider the failure of a component. Assume that each time the component failed, we tracked the length of time it was in service. The first time the component is put into service it fails after 4 years, the second time after 6 years, and the third time after only 2 years. The average lifespan of the component is therefore (4 + 6 + 2) / 3 = 4 years; that is, its MTBF is 4 years.
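The same arithmetic, written as a small Python sketch using the three service lives from the example (the variable names are illustrative):

```python
# Observed service lives from the example, in years.
service_lives_years = [4, 6, 2]

# MTBF is the arithmetic mean of the observed lives.
mtbf_years = sum(service_lives_years) / len(service_lives_years)

print(f"MTBF: {mtbf_years:.1f} years")
print(f"Shortest life: {min(service_lives_years)} years, "
      f"longest life: {max(service_lives_years)} years")
```

The average hides the spread: individual lives in this small sample range from half the MTBF to one and a half times the MTBF.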

However, we do not know when the next component will fail. Therefore, we cannot successfully manage this failure with traditional time-based maintenance (scheduled overhaul or replacement). It is important to know the condition of the component and the life remaining before failure; in other words, how fast the component can go from being OK to NOT OK. This is sometimes referred to as the failure development period or the potential failure to functional failure (P-F) interval.
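One common way to model random failures is a constant failure rate, meaning exponentially distributed lives. That model is an assumption here, not something the article specifies. Under it, the chance of failing in the next year is the same no matter how long the component has already run, which is why replacing it on a schedule does not reduce the risk. The function names below are illustrative.

```python
import math

MTBF_YEARS = 4.0  # MTBF from the example in the text

def survival(t_years, mtbf=MTBF_YEARS):
    """Probability the component is still running at age t (constant failure rate)."""
    return math.exp(-t_years / mtbf)

def prob_fail_in_next_year(age_years):
    """Conditional probability of failing within the next year, given survival to age."""
    return 1.0 - survival(age_years + 1.0) / survival(age_years)

# The conditional probability does not change with age under this model.
for age in (0, 2, 4, 8):
    print(f"age {age} years: P(fail in next year) = "
          f"{prob_fail_in_next_year(age):.3f}")
```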

If the time from when the component initially develops signs of failure to the time when it fails is 4 months, then maintenance inspections must be performed at intervals of less than 4 months in order to catch the degradation of the component condition. The inspection also must be performed often enough to provide sufficient lead time to fix the equipment before it functionally fails. In this case, we might want to schedule the inspection every 2 months. This would ensure we catch the failure in the process of occurring and give us approximately 2 months to schedule and plan the repair.
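The interval reasoning above can be sketched as a small helper. It assumes a defect can start at any moment, so in the worst case it begins just after an inspection and is found one full interval later. The function name and the `inspections_per_pf` parameter are hypothetical.

```python
def inspection_plan(pf_interval_months, inspections_per_pf=2):
    """Return (inspection interval, worst-case lead time), both in months.

    Worst case: the defect starts right after an inspection, so it is
    detected one full interval later, leaving the rest of the P-F
    interval to plan and carry out the repair.
    """
    if pf_interval_months <= 0:
        raise ValueError("P-F interval must be positive")
    if inspections_per_pf <= 1:
        raise ValueError("inspect more than once per P-F interval")
    interval = pf_interval_months / inspections_per_pf
    worst_case_lead_time = pf_interval_months - interval
    return interval, worst_case_lead_time

interval, lead_time = inspection_plan(4)
print(f"Inspect every {interval:g} months; "
      f"at least {lead_time:g} months to plan the repair")
```

Raising `inspections_per_pf` shortens the interval and increases the guaranteed lead time, at the cost of more inspections.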

Failure prevention requires the use of some form of condition-based maintenance at appropriate inspection intervals (failure finding, visual inspections, and predictive technology inspections).

My experience has been that, for every $1 million in asset value, as many as 150 condition inspection points must be monitored. Gathering and analyzing condition monitoring data to identify impending failure for assets worth billions of dollars is practically impossible without the use of reliability software.
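To give a sense of scale, the rule of thumb above can be applied to a few asset values. The asset values in this sketch are arbitrary examples, and the function name is illustrative.

```python
POINTS_PER_MILLION_USD = 150  # upper figure quoted in the text ("as many as")

def max_inspection_points(asset_value_usd):
    """Upper estimate of condition inspection points for a given asset value."""
    return round(asset_value_usd / 1_000_000 * POINTS_PER_MILLION_USD)

# Arbitrary example asset values in US dollars.
for value in (1_000_000, 50_000_000, 2_000_000_000):
    print(f"${value:,}: up to {max_inspection_points(value):,} inspection points")
```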

The reliability software you choose should be able to:
• collect equipment condition data from controls, sensors, data historians, predictive maintenance technologies, and visual inspections
• use single or multiple data points to analyze the data, applying defined rules and calculations to get a true picture of equipment health
• perform the calculations and conduct the analysis automatically
• present results visually through flashing alarms and trending graphs, identifying potential failures and recommending corrective actions before the equipment fails.
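As a rough illustration of “applying defined rules” to condition data, the sketch below checks a set of readings against alarm limits and returns recommended actions. The `Rule` class, the measurement names, the limits, and the actions are all hypothetical and do not describe any particular reliability software product.

```python
from dataclasses import dataclass

@dataclass
class Rule:
    """A hypothetical alarm rule: flag a reading that exceeds its limit."""
    measurement: str
    limit: float
    action: str

# Illustrative rules and limits only; real limits depend on the equipment.
RULES = [
    Rule("vibration_mm_s", 7.1, "Schedule bearing inspection"),
    Rule("bearing_temp_c", 85.0, "Check lubrication"),
]

def evaluate(readings, rules=RULES):
    """Return the recommended action for every rule whose limit is exceeded.

    Measurements missing from the readings are treated as not in alarm.
    """
    return [
        rule.action
        for rule in rules
        if readings.get(rule.measurement, float("-inf")) > rule.limit
    ]

# Example readings for one asset (made-up values).
readings = {"vibration_mm_s": 8.3, "bearing_temp_c": 70.0}
print(evaluate(readings))
```

A real system would also trend readings over time rather than compare single values against fixed limits; this sketch covers only the simplest single-point rule.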
