Maintenance Strategies for Reliability and Cost Optimization

Corrective Maintenance

Corrective maintenance, the original maintenance technique, remains widely used today. This approach involves repairing equipment after it fails. Corrective maintenance can be categorized into five main types:

  1. Fail Repair: Restoring a failed component to its operational state.
  2. Salvage: Disposing of non-repairable material and using salvaged parts from irreparable items in repair, overhaul, or rebuild programs.
  3. Rebuild: Returning a component to its original specifications by disassembling it, replacing worn or unserviceable parts, and ensuring original tolerances.
  4. Overhaul: Restoring a component to good condition using an inspect-and-repair-as-appropriate approach.
  5. Servicing: Maintenance activities required as a result of corrective action.

Effective corrective maintenance involves the following steps:

  1. Fault recognition
  2. Localization
  3. Diagnosis
  4. Repair
  5. Checkouts

Advantages

  • Full utilization of the component’s lifespan.
  • Reduced inspection and planning overhead when failures are infrequent.

Disadvantages

  • Unpredictable failures can disrupt production.
  • Requirement to stock spares for many parts to enable rapid repairs.
  • Addressing only the broken part without identifying the root cause can lead to recurring failures.
  • Limited applicability when component failure poses safety risks to employees.

Preventive Maintenance

Gaining traction in the 1940s and 1950s, preventive maintenance is currently one of the most prevalent techniques. It centers on periodic component inspections and scheduled maintenance based on time cycles or usage. Understanding the equipment's statistical failure patterns is crucial for effective preventive maintenance, as the goal is to perform maintenance before failure occurs. Task selection criteria, which vary by industry and machine type, are essential for accurate scheduling.

Advantages

  • Predictable maintenance schedules reduce the need for extensive spare part inventory.
  • Minimal production disruptions since maintenance can be scheduled during breaks or planned downtime.
  • Enhanced control and fewer unexpected issues.

Disadvantages

  • Underutilization of the machine’s lifespan.
  • Increased maintenance costs due to scheduled replacements.
  • Ineffectiveness against random failures, potentially leading to unnecessary replacements.
  • Persistent risk of failure despite preventive actions.

Predictive Maintenance

Predictive maintenance employs techniques such as visual inspections, vibration analysis, thermography, temperature monitoring, ultrasonic testing, lubricant analysis, electrical condition monitoring, and nondestructive testing. This approach aims to reduce maintenance costs relative to traditional methods by ensuring components are serviced only when indicators show deterioration.

Predictive maintenance tracks equipment status through monitoring functions, enabling timely substitutions and repairs to minimize unexpected failures. A well-defined maintenance plan and a detailed history of task duration and workforce requirements enhance program effectiveness.

Advantages

  • Significant reduction in maintenance costs.
  • Early failure detection enables timely spare part procurement, minimizing inventory needs.
  • Parameter analysis helps identify root causes of failures.

Disadvantages

  • Requires equipment with measurable indicators related to condition.
  • Difficulty establishing clear parameters that reliably indicate equipment status.
  • Continuous monitoring may not be feasible for all assets; periodic checks may be required for critical components.

Total Productive Maintenance (TPM)

TPM is a methodology focused on maximizing equipment availability by optimizing maintenance and production resources. Unlike traditional approaches, TPM integrates operators into the maintenance process. Operators receive training in basic maintenance and fault-finding and collaborate with technical experts in dedicated teams. This empowers operators to understand machinery, identify potential problems, and proactively address them, reducing downtime and production costs. TPM strives for zero errors, zero work-related accidents, and zero losses through close collaboration between production and maintenance units.

Pillars of TPM

  1. Initial phase management
  2. Health and safety
  3. Education and training
  4. Autonomous maintenance
  5. Planned maintenance
  6. Quality maintenance
  7. Focused improvement
  8. Support system

Advantages

  • Cost reduction
  • Enhanced customer satisfaction
  • Reduced accidents
  • Improved environmental control
  • Increased staff confidence
  • Cleaner work zones
  • Enhanced teamwork
  • Stronger worker-machine relationships

Disadvantages

  • Difficulty in implementing an effective TPM system
  • Requires motivated and responsible employees

Tools for TPM Implementation

5S

A workplace organization method encompassing:

  1. Sorting: Eliminating unnecessary items, prioritizing essentials, and ensuring easy accessibility.
  2. Straightening (Set in Order): Arranging work, workers, equipment, parts, and instructions for optimal workflow and division of labor.
  3. Systematic Cleaning: Regular cleaning to detect anomalies and maintain standards.
  4. Standardization: Establishing consistent procedures and visual controls.
  5. Sustaining: Enforcing adherence to rules and procedures to prevent backsliding.

Poka-Yoke

Techniques that prevent mistakes, minimizing defects in products and processes and enhancing quality and reliability.

Reliability, Maintainability & Availability

Reliability

The probability of a component performing its intended function under specified conditions for a given period. Reliability is an inherent characteristic and is expressed as a probability.

Reliability Function:

R(t) = P(T > t) = 1 - F(t)

where T is the time to failure and F(t) is the cumulative probability of failure by time t.
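
The reliability function can be evaluated numerically once a failure model is chosen. The sketch below assumes a constant failure rate (the exponential model), which the text does not prescribe here; the function name and the pump figures are hypothetical.

```python
import math


def reliability_exponential(failure_rate: float, t: float) -> float:
    """Return R(t) = exp(-failure_rate * t).

    Assumes a constant failure rate, which is only one possible model.
    """
    if failure_rate < 0 or t < 0:
        raise ValueError("failure_rate and t must be non-negative")
    return math.exp(-failure_rate * t)


# Hypothetical pump that fails on average once per 1,000 operating hours.
r_500h = reliability_exponential(failure_rate=1 / 1000, t=500)
print(f"Probability of surviving 500 h: {r_500h:.3f}")
```

Under this assumption reliability only depends on the product of failure rate and mission time, so doubling the mission time squares the survival probability.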

Maintainability

The ease and speed with which a system or equipment can be restored to operational status after a failure. Maintainability depends on factors such as equipment design, installation, personnel availability and skill levels, maintenance procedures, test equipment, and physical constraints.

Availability

The probability of a system operating satisfactorily at any given time.

Availability Formula:

A = Uptime / (Uptime + Downtime)

A = MTBF / (MTBF + MTTR)
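
As a quick numerical illustration, the sketch below uses the commonly quoted steady-state form A = MTBF / (MTBF + MTTR). The function name and the sample figures are assumptions made for this example.

```python
def inherent_availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Return steady-state availability A = MTBF / (MTBF + MTTR)."""
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("mtbf_hours must be positive and mttr_hours non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical machine: 950 h between failures, 50 h to repair on average.
availability = inherent_availability(mtbf_hours=950.0, mttr_hours=50.0)
print(f"Availability: {availability:.1%}")
```

The same availability can be raised either by failing less often (higher MTBF, a reliability improvement) or by repairing faster (lower MTTR, a maintainability improvement).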

Reliability and Maintainability Models

Several probability distributions are used in reliability and maintainability analysis, including:

  • Weibull Distribution: Commonly used for modeling lifetimes in industrial reliability, particularly for failures related to wear-out, fatigue, or aging.
  • Lognormal Distribution: Frequently used in maintainability analysis as it represents the distribution of most repair times, especially for tasks with multiple sub-tasks of varying durations.
  • Normal Distribution: Applicable to straightforward maintenance tasks with consistent completion times, such as simple replacements.
  • Exponential Distribution: Used for tasks where completion times are memoryless, like substitution methods for failure isolation.
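
To make the lognormal repair-time model concrete, the following sketch computes the probability that a repair finishes within a given time. The parameters `mu` and `sigma` (mean and standard deviation of the logarithm of repair time) and the sample values are assumptions for illustration only.

```python
import math


def lognormal_cdf(t: float, mu: float, sigma: float) -> float:
    """Return P(repair time <= t) when ln(repair time) ~ Normal(mu, sigma)."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    if t <= 0:
        return 0.0
    z = (math.log(t) - mu) / (sigma * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))


# Hypothetical task: median repair time of 2 h, log-standard deviation 0.5.
mu = math.log(2.0)
sigma = 0.5
p_within_4h = lognormal_cdf(4.0, mu, sigma)
print(f"Probability the repair takes at most 4 h: {p_within_4h:.3f}")
```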

Weibull Distribution

This distribution models failures where the failure rate is proportional to a power of time.

Reliability function: R(t) = exp(-(t/η)^K)

Failure rate: h(t) = (K/η) · (t/η)^(K-1)

where K is the shape parameter and η is the scale parameter (characteristic life).

K Values:

  • K < 1: Failure rate decreases over time (early failures).
  • K = 1: Constant failure rate (random failures).
  • K > 1: Failure rate increases over time (wear-out failures).
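
The effect of K on the failure rate can be checked numerically. This is a minimal sketch using the standard two-parameter Weibull form; the scale parameter `eta` and the sample values are assumptions introduced for the example.

```python
import math


def weibull_reliability(t: float, k: float, eta: float) -> float:
    """Return R(t) = exp(-(t / eta) ** k) for shape k and scale eta."""
    if t < 0 or k <= 0 or eta <= 0:
        raise ValueError("t must be non-negative; k and eta must be positive")
    return math.exp(-((t / eta) ** k))


def weibull_hazard(t: float, k: float, eta: float) -> float:
    """Return the failure rate h(t) = (k / eta) * (t / eta) ** (k - 1)."""
    if t <= 0 or k <= 0 or eta <= 0:
        raise ValueError("t, k and eta must be positive")
    return (k / eta) * (t / eta) ** (k - 1)


# Compare the failure rate at 100 h and 200 h for three shape values.
for k in (0.5, 1.0, 2.0):
    early = weibull_hazard(100.0, k, eta=1000.0)
    late = weibull_hazard(200.0, k, eta=1000.0)
    print(f"K = {k}: h(100) = {early:.5f}, h(200) = {late:.5f}")
```

With K = 1 the Weibull model reduces to the exponential model, which is why that case corresponds to random failures.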

Maintainability Programs: Elements

  • Design for Maintainability: Incorporating design changes to improve working conditions and the work environment.
  • Management Approach: Defining management elements, communication channels, responsibilities, inspection procedures, supervision, and operator training.
  • Analysis and Testing: Continuous review and testing to assess the effectiveness of implemented solutions.
  • Cost Analysis: Evaluating the financial viability of solutions by comparing costs and improvements.
  • Data Collection and Failure Analysis: Gathering and analyzing data on failures to identify trends and areas for improvement.
  • Design for Maintainability (Tools): Developing new tools and technologies to enhance productivity and address maintenance challenges.

Benefits of Maintainability Programs

  • Improved maintainability and reduced maintenance time.
  • Enhanced system reliability and availability.
  • Reduced maintenance costs.
  • Improved safety for maintenance personnel.
  • Simplified maintenance procedures.

Reliability Block Diagrams (RBD)

RBDs visually represent the reliability of a complex system by illustrating how component reliability contributes to overall system success or failure. Components are depicted as blocks connected in series or parallel configurations.

  • Series Configuration: All components in a series path must function for the system to operate. Failure of any component in the series leads to system failure.
  • Parallel Configuration: Redundant components are arranged in parallel. The system functions as long as at least one component in the parallel path is operational.

Series Configuration:

R_s = R_1 × R_2 × ... × R_n

Parallel Configuration:

R_p = 1 - (1 - R_1)(1 - R_2)...(1 - R_n)
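
The two configurations can be compared with a short calculation. The sketch assumes that component failures are statistically independent, which the text does not state explicitly; the function names and component values are hypothetical.

```python
from math import prod
from typing import Iterable


def series_reliability(reliabilities: Iterable[float]) -> float:
    """All blocks must work: multiply the component reliabilities."""
    return prod(reliabilities)


def parallel_reliability(reliabilities: Iterable[float]) -> float:
    """At least one block must work: 1 minus the chance that all fail."""
    return 1.0 - prod(1.0 - r for r in reliabilities)


# Two hypothetical components, each with reliability 0.9.
r_series = series_reliability([0.9, 0.9])
r_parallel = parallel_reliability([0.9, 0.9])
print(f"Series: {r_series:.2f}, parallel: {r_parallel:.2f}")
```

A series system is always less reliable than its weakest component, while a parallel system is more reliable than its best one, which is the rationale for redundancy on critical equipment.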

Root Cause Analysis (RCA) for Failures

RCA is a structured process for identifying the underlying cause of an event or failure. It involves a systematic investigation to determine the root cause and prevent recurrence.

Key Questions in RCA

  • What happened?
  • Where did it happen?
  • What changed?
  • Who was involved?
  • Why did it happen?
  • What is the impact?
  • Will it happen again?
  • How can recurrence be prevented?

Disadvantage of RCA

  • Reactive approach; analysis occurs after a failure.

Fault Tree Analysis (FTA)

FTA is a deductive, top-down approach to failure analysis using Boolean logic to determine the probability of a specific system failure or undesired event. It visually represents the logical relationships between events leading to a failure.
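
The Boolean logic of a fault tree can be turned into a probability estimate. The following is a minimal sketch for AND and OR gates, assuming independent basic events; the tree and its probabilities are invented for illustration.

```python
from math import prod
from typing import Iterable


def and_gate(probabilities: Iterable[float]) -> float:
    """Output event occurs only if every input event occurs."""
    return prod(probabilities)


def or_gate(probabilities: Iterable[float]) -> float:
    """Output event occurs if at least one input event occurs."""
    return 1.0 - prod(1.0 - p for p in probabilities)


# Hypothetical top event: the pumping system fails if power is lost
# OR both redundant pumps fail.
p_power_loss = 0.01
p_pump_a_fails = 0.05
p_pump_b_fails = 0.05

p_both_pumps_fail = and_gate([p_pump_a_fails, p_pump_b_fails])
p_top_event = or_gate([p_power_loss, p_both_pumps_fail])
print(f"Probability of the top event: {p_top_event:.6f}")
```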

Applications of FTA

  • Understanding failure logic.
  • Resource optimization.
  • Safety performance monitoring.
  • Diagnostic manual creation.

Failure Mode and Effects Analysis (FMEA)

FMEA is a systematic method for identifying potential failures in a design, manufacturing process, or assembly process. It analyzes the consequences of these failures and prioritizes them based on severity and likelihood.

When to Use FMEA

  • During design or redesign of a process, product, or service.
  • When applying an existing process in a new way.
  • Before developing control plans.
  • When planning improvements.
  • When analyzing existing failures.
  • Periodically throughout the lifecycle.

How to Conduct an FMEA

  1. List process steps.
  2. Identify potential failure modes for each step.
  3. List the effects of each failure mode.
  4. Rate the severity of each effect (1-10).
  5. Identify the causes of each failure mode and rate their likelihood of occurrence (1-10).
  6. Identify existing controls and rate detection (1-10, where a higher score means the failure is harder to detect).
  7. Calculate the Risk Priority Number (RPN) for each failure mode (Severity × Occurrence × Detection).
  8. Prioritize failure modes based on RPN.
  9. Develop actions to address high-priority failure modes.
  10. Re-evaluate Occurrence and Detection rankings after implementing actions.
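
The RPN calculation and ranking in the steps above can be sketched as follows. The class name, the failure modes, and their ratings are hypothetical examples, not data from any real analysis.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    name: str
    severity: int    # 1-10
    occurrence: int  # 1-10
    detection: int   # 1-10, higher means harder to detect

    def __post_init__(self) -> None:
        for field_name in ("severity", "occurrence", "detection"):
            value = getattr(self, field_name)
            if not 1 <= value <= 10:
                raise ValueError(f"{field_name} must be between 1 and 10")

    @property
    def rpn(self) -> int:
        """Risk Priority Number = Severity x Occurrence x Detection."""
        return self.severity * self.occurrence * self.detection


failure_modes = [
    FailureMode("Bearing seizure", severity=8, occurrence=3, detection=4),
    FailureMode("Seal leak", severity=5, occurrence=6, detection=2),
    FailureMode("Motor overheating", severity=7, occurrence=4, detection=6),
]

# Highest RPN first, so corrective actions target the riskiest modes.
ranked = sorted(failure_modes, key=lambda mode: mode.rpn, reverse=True)
for mode in ranked:
    print(f"{mode.name}: RPN = {mode.rpn}")
```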

Work Prioritization and Work Orders

Work Prioritization

Prioritize maintenance tasks based on their criticality and potential impact on operations to ensure resources are allocated effectively to the most important issues first.

Work Orders

Documented instructions for completing specific maintenance tasks. They provide technicians with the information needed to perform the work safely and effectively.

Components of a Work Order

  • Reference and Description: Location, machine details, and assigned personnel.
  • Planning Section: Description of work, required spare parts, tools, estimated time (MTTR), and required skill level.
  • Craft Feedback: Technician’s notes on work completed, any issues encountered, and confirmation of task completion.

Storeroom Management and Homogeneous Poisson Process

Storeroom Management

Efficient management of spare parts and materials to ensure availability when needed while minimizing inventory costs.

Objectives of Storeroom Management

  • High service level to users.
  • Low inventory investment.
  • Cost-effective purchasing.
  • Adequate safety stock.
  • Complete inventory identification (quantity, location).
  • Item availability and accessibility.
  • Prompt purchase orders and efficient goods receipt.

Inventory Systems

  • Non-Reparables: Items that cannot be repaired to their original state.
  • Reparables: Items that can be repaired to a functional condition.

Homogeneous Poisson Process (HPP)

A statistical model for failures over time that assumes failures occur independently and at a constant average rate (λ), so the number of failures in a given interval follows a Poisson distribution.

Using HPP for Spare Part Estimation

  1. Determine the failure rate (λ) based on historical data.
  2. Use the Poisson distribution table to determine the required stock level for a desired probability of success (e.g., 90%, 95%, 99%).
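
Instead of reading a Poisson table, the stock level can be computed directly. The sketch below finds the smallest number of spares whose cumulative Poisson probability reaches the desired level; the function names, the failure rate, and the planning period are assumptions for the example.

```python
import math


def poisson_cdf(k: int, mean: float) -> float:
    """Return P(N <= k) for a Poisson-distributed count N with the given mean."""
    return sum(math.exp(-mean) * mean**i / math.factorial(i) for i in range(k + 1))


def required_spares(failure_rate: float, period: float, confidence: float) -> int:
    """Smallest stock level that covers demand with the given probability."""
    if failure_rate < 0 or period < 0:
        raise ValueError("failure_rate and period must be non-negative")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    expected_failures = failure_rate * period
    spares = 0
    while poisson_cdf(spares, expected_failures) < confidence:
        spares += 1
    return spares


# Hypothetical part: 0.5 failures per month, 4-month replenishment period.
for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} service level: {required_spares(0.5, 4.0, level)} spares")
```

Each additional percentage point of service level costs more stock than the previous one, which is why the target probability should reflect how critical the part is.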

Five Myths of Storeroom Management

  1. More inventory is always better.
  2. Inventory management is a clerical task.
  3. Inventory is a necessary evil.
  4. Technology alone can solve inventory problems.
  5. Inventory management is a one-time project.

Economic Considerations and KPIs

Economic Considerations

Our target should be to optimize maintenance cost by finding the point where the curve of cost versus maintenance frequency reaches its minimum.

When considering money, there are three main mistakes in maintenance:

  1. Minimum Cost: It is dangerous to reduce maintenance cost without justification (doing nothing is the cheapest). Maintenance should be appropriate; a common benchmark is around 5% of acquisition cost.
  2. Continuous Maintenance: Excessive maintenance creates extra cost for unnecessary tasks and may steal resources from more important activities.
  3. Total Availability: Although TPM aims to maximize OEE (Overall Equipment Effectiveness), RCM (Reliability-Centered Maintenance) seeks a compromise or good ratio between availability and maintenance cost (typically 90-99%).

Compare equipment cost to an iceberg: the small part above the waterline (about 1/8) is the Acquisition Cost; the rest includes:

  • Operating (approx. 10%)
  • Maintenance & Spares (approx. 5%)
  • Training (usually included in acquisition)
  • Risk (related to MTBF), which increases with age

As years pass, cumulative cost increases and our ability to influence those costs decreases, so we must find the ideal moment to replace equipment.

A practical exercise is to use a spreadsheet to calculate total equipment cost per year for different service lives (considering acquisition, operation, maintenance, downtime, etc.) and identify the optimal (minimum) yearly cost.
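
The same exercise can be done in a few lines of code. This sketch averages acquisition and running costs over each candidate service life and picks the minimum; it ignores discounting and resale value, and all cost figures are invented.

```python
from typing import Sequence, Tuple


def average_annual_cost(acquisition: float, yearly_costs: Sequence[float], years: int) -> float:
    """Total cost of ownership over `years`, divided by the number of years."""
    if not 1 <= years <= len(yearly_costs):
        raise ValueError("years must be between 1 and len(yearly_costs)")
    return (acquisition + sum(yearly_costs[:years])) / years


def optimal_service_life(acquisition: float, yearly_costs: Sequence[float]) -> Tuple[int, float]:
    """Return (service life in years, average annual cost) with the lowest cost."""
    costs = {
        years: average_annual_cost(acquisition, yearly_costs, years)
        for years in range(1, len(yearly_costs) + 1)
    }
    best_years = min(costs, key=costs.get)
    return best_years, costs[best_years]


# Hypothetical machine: yearly operating, maintenance and downtime costs
# that grow as the equipment ages.
acquisition_cost = 100_000.0
running_costs = [10_000, 12_000, 15_000, 20_000, 28_000, 40_000, 55_000]

best_years, best_cost = optimal_service_life(acquisition_cost, running_costs)
print(f"Replace after {best_years} years (average {best_cost:,.0f} per year)")
```

Early on, spreading the acquisition cost over more years dominates; later, rising running costs take over, and the minimum lies where the two effects balance.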

Key Performance Indicators (KPIs)

Any maintenance plan must be evaluated using a concise set of KPIs. Too many KPIs can burden data collection; too few may be insignificant. Important KPIs include:

  • MTBF (Mean Time Between Failures)
  • MTTF (Mean Time To Failure)
  • MTTR (Mean Time To Repair)
  • Worst actors: list of frequently failed equipment
  • Maintenance cost / Unit output (%)
  • Maintenance cost / Total Sales (%)
  • Total Maintenance Cost (per plant, section, etc.), in currency units
  • Repair cost per work order
  • Unscheduled Maintenance Downtime (hours): pay attention to minor breakdowns!
  • Scheduled Maintenance Downtime (hours)
  • % of work orders assigned as “Rework Status” over the last month (the ones we have to repeat)
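
Several of these KPIs follow directly from basic maintenance records. The sketch below shows one possible way to compute MTBF, MTTR, and the rework percentage; the function names and the sample figures are hypothetical.

```python
from typing import Sequence


def mean_time_between_failures(operating_hours: float, failures: int) -> float:
    """MTBF = total operating time / number of failures."""
    if failures <= 0:
        raise ValueError("at least one failure is needed to compute MTBF")
    return operating_hours / failures


def mean_time_to_repair(repair_hours: Sequence[float]) -> float:
    """MTTR = total repair time / number of repairs."""
    if not repair_hours:
        raise ValueError("at least one repair is needed to compute MTTR")
    return sum(repair_hours) / len(repair_hours)


def rework_percentage(rework_orders: int, total_orders: int) -> float:
    """Share of work orders that had to be repeated, in percent."""
    if total_orders <= 0:
        raise ValueError("total_orders must be positive")
    return 100.0 * rework_orders / total_orders


# Hypothetical month: 2,000 operating hours, four repairs, 3 reworks in 60 orders.
repair_log_hours = [10.0, 20.0, 30.0, 40.0]
mtbf_hours = mean_time_between_failures(2000.0, len(repair_log_hours))
mttr_hours = mean_time_to_repair(repair_log_hours)
rework_pct = rework_percentage(rework_orders=3, total_orders=60)
print(f"MTBF = {mtbf_hours:.0f} h, MTTR = {mttr_hours:.0f} h, rework = {rework_pct:.1f}%")
```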