Plant Reliability

“Reliability is the probability that an equipment or system does not fail for a period of time under given conditions”
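To make the definition concrete, here is a minimal sketch of reliability as a survival probability. It assumes a constant failure rate (the exponential model), which is a simplification and not something the definition itself requires. The function name and the pump figures are hypothetical, chosen only for illustration.

```python
import math


def reliability(period_hours: float, mtbf_hours: float) -> float:
    """Probability of running for period_hours without failure.

    Assumes a constant failure rate, so R(t) = exp(-t / MTBF).
    """
    if mtbf_hours <= 0:
        raise ValueError("mtbf_hours must be positive")
    if period_hours < 0:
        raise ValueError("period_hours must not be negative")
    return math.exp(-period_hours / mtbf_hours)


# Hypothetical pump: mean time between failures of 8,760 hours (one year).
one_month = reliability(period_hours=720, mtbf_hours=8_760)
one_year = reliability(period_hours=8_760, mtbf_hours=8_760)
print(f"Reliability over one month: {one_month:.3f}")
print(f"Reliability over one year:  {one_year:.3f}")
```

Under this assumption, reliability only reaches 1.0 for a period of zero hours, which is one way to see why 100% reliability is not achievable over any real operating period.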

Achieving 100% reliability is impossible. Not even oil and gas or nuclear plants achieve it, although they get very close. Improving reliability is hard work, and it is also expensive. So what is the optimum level of reliability for mining and processing plants? The answer depends on the criticality of the equipment. Before embarking on a reliability improvement project, we have to define the reliability level we want to achieve: the point where the cost of reliability is no higher than the cost or risk of equipment failure.

Source: Practical Reliability Engineering O’Connor & Kleyner (2012)
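The trade-off between the cost of reliability and the cost of failure can be sketched as a simple comparison of total annual cost. This is only an illustration under stated assumptions: the strategy names, costs and failure probabilities are hypothetical, and expected failure cost is taken as probability times consequence.

```python
from dataclasses import dataclass


@dataclass
class Strategy:
    """One candidate reliability strategy (all figures are hypothetical)."""
    name: str
    annual_reliability_cost: float     # cost of the reliability effort per year
    annual_failure_probability: float  # chance of a failure in a year
    failure_cost: float                # repair cost plus lost production

    @property
    def expected_failure_cost(self) -> float:
        # Risk expressed as probability times consequence.
        return self.annual_failure_probability * self.failure_cost

    @property
    def total_cost(self) -> float:
        return self.annual_reliability_cost + self.expected_failure_cost


def lowest_total_cost(strategies: list[Strategy]) -> Strategy:
    """Return the strategy with the lowest combined cost."""
    return min(strategies, key=lambda s: s.total_cost)


strategies = [
    Strategy("Run to failure", 0.0, 0.60, 500_000.0),
    Strategy("Condition monitoring", 40_000.0, 0.20, 500_000.0),
    Strategy("Monitoring plus full redundancy", 200_000.0, 0.02, 500_000.0),
]

best = lowest_total_cost(strategies)
for s in strategies:
    print(f"{s.name}: total annual cost {s.total_cost:,.0f}")
print(f"Lowest total cost: {best.name}")
```

With these made-up numbers the middle option wins: spending nothing leaves too much risk, while the most expensive option costs more than the risk it removes. For more critical equipment, a higher failure cost would shift the optimum towards the more reliable option.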

Plant reliability can be improved by implementing the following processes:

1. Early defect detection

Early defect detection is key to avoiding catastrophic failure and to extending the operation of the asset before it needs repair. Some of these processes include:

  • A well-implemented condition monitoring program, including methodologies such as vibration analysis, oil sampling, thermography, etc.
  • Inspections done at the right intervals. Stopping and opening up equipment too often is detrimental to the plant’s reliability, since every inspection carries a chance of introducing a failure mode.
  • Regular operator walks through the plant to identify operating issues that might cause equipment failure, such as noises, leaks, spillages, etc.

2. Risk mitigation

If equipment does fail, all the processes must be in place to carry out a quality repair in the shortest possible time and minimise losses. In other words, there must be a plan for when equipment fails. Some of these processes include:

  • Spare parts identified and available.
  • A preservation strategy to ensure the spares are available and in “as new” condition.
  • Corrective task lists with a pre-defined scope, including resources and parts needed.
  • Corrective work instructions to do the repairs.
  • Quality assurance tools, such as an Inspection and Test Plan (ITP), to ensure the critical steps are verified.
  • Rotable spares, which minimise downtime by eliminating the need to refurbish on site and also help ensure the quality of the repair.
  • Written procedures for the shutdown and start-up of the equipment, with operators well trained in them.
  • A well-defined operating strategy for standby equipment to ensure it is available when the duty unit fails, i.e. running the duty equipment exclusively and running the standby only to verify that it is available. Failing to do this increases the probability of multiple failures.
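The point about standby equipment can be illustrated with a rough sketch. It assumes the equipment wears out with running hours and models that with a Weibull distribution (shape greater than 1); this model choice, the parameter values and the function names are all hypothetical and are not taken from the article. The sketch compares the chance that the standby fails while the duty unit is being repaired, for a standby that has only done test runs and for one that has shared the duty equally.

```python
import math


def weibull_reliability(age_hours: float, scale_hours: float, shape: float) -> float:
    """Probability of surviving to age_hours under a Weibull wear-out model."""
    return math.exp(-((age_hours / scale_hours) ** shape))


def failure_probability_in_window(
    age_hours: float, window_hours: float, scale_hours: float, shape: float
) -> float:
    """Probability of failing within window_hours, given survival to age_hours."""
    survive_now = weibull_reliability(age_hours, scale_hours, shape)
    survive_later = weibull_reliability(age_hours + window_hours, scale_hours, shape)
    return 1.0 - survive_later / survive_now


# Hypothetical parameters, for illustration only.
SCALE_HOURS = 10_000.0       # Weibull characteristic life
SHAPE = 3.0                  # shape > 1 means the failure rate rises with age
REPAIR_WINDOW_HOURS = 168.0  # one week to repair the failed duty unit

# Standby with 200 running hours (test runs only) versus 8,000 hours (shared duty).
standby_tested_only = failure_probability_in_window(
    200.0, REPAIR_WINDOW_HOURS, SCALE_HOURS, SHAPE
)
standby_shared_duty = failure_probability_in_window(
    8_000.0, REPAIR_WINDOW_HOURS, SCALE_HOURS, SHAPE
)

print(f"Standby run only for testing: {standby_tested_only:.5f}")
print(f"Standby sharing the duty:     {standby_shared_duty:.5f}")
```

Under these assumptions, the standby that has accumulated the same hours as the duty unit is far more likely to fail during the repair window, which is the multiple-failure scenario the operating strategy is meant to avoid.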


3. Failure prevention

When equipment fails, it is generally because there are not enough systems and processes in place to prevent the failure or to identify it early enough, before it becomes catastrophic. It is management’s responsibility to create the conditions and put those processes in place, driving plant reliability in the same way it drives safety culture. Effective failure prevention must be a top-down approach.


To read more about the subject, follow the link below:

Preventing Equipment Failure