Failure Mode and Effects Analysis as a Framework for Predicting Harm Before It Happens

In healthcare, we have traditionally been better at investigating failures than preventing them. When something goes wrong, teams convene, timelines are reconstructed, contributing factors are mapped, and a corrective action plan is filed. This process, widely known as Root Cause Analysis (RCA), has served as the backbone of patient safety improvement for decades. And yet, the very structure of RCA reveals a fundamental limitation: it requires something to have already gone wrong before the work of learning begins.

Failure Mode and Effects Analysis (FMEA) challenges that assumption. Rather than waiting for an adverse event to initiate a safety review, FMEA asks a different question before anything goes wrong: What could fail here, and what happens if it does? For healthcare organizations striving to protect patients in increasingly complex care environments, this shift in perspective is not merely procedural. It is a marker of organizational maturity.

From Rear-View Mirror to Windshield: Understanding the RCA-FMEA Relationship

To appreciate what FMEA offers, it helps to first understand what RCA does well, and where it falls short.

Root Cause Analysis is a retrospective tool. It is triggered by an event, most commonly a sentinel event, and is designed to uncover the underlying systemic causes rather than assign individual blame (Abujudeh et al., 2014). The Joint Commission requires accredited facilities to conduct an RCA following sentinel events, making it both a safety tool and a compliance obligation (American Institute of Healthcare Compliance, 2025). RCA asks: what happened, why did it happen, and what can be done to prevent recurrence?

These are important questions. When conducted well, RCA builds teamwork, surfaces systemic vulnerabilities that affect far more than a single department, and creates a record of institutional learning (Hospitalist, 2021). The problem is embedded in the word “retrospective.” By definition, an RCA begins after a patient has already been placed at risk, and often after harm has already occurred.

FMEA, by contrast, is prospective. It is a forward-looking assessment of systems and processes, designed to predict the ways in which things can go wrong before they do (Kuo et al., 2022). Rather than starting with a case, it starts with a process, and asks a multidisciplinary team to map every step and every conceivable point of failure. No adverse event is required to initiate the work.

The Joint Commission has recognized FMEA as the logical follow-up to RCA, describing it as a means for “designing and implementing an action plan for improvement” (Anderson et al., 2012). In this framing, the two tools are not competitors but complements. RCA provides the institutional memory of what has gone wrong. FMEA uses that memory, along with expert judgment about system design, to anticipate what might go wrong next. Together, they form a more complete safety ecosystem than either could offer alone.

How FMEA Works: Mapping Risk Before It Materializes

FMEA is a structured, team-based method with roots in military and aerospace engineering, first used in the 1940s to analyze potential errors in military operations (Pelino et al., 2022). Its application to healthcare has grown steadily, with the Veterans Affairs National Center for Patient Safety developing a healthcare-specific adaptation called Healthcare Failure Mode and Effects Analysis (HFMEA) in the early 2000s (DeRosier et al., 2002).

The core process follows a consistent sequence. A multidisciplinary team selects a high-risk clinical process, maps every step in that process, and then identifies all the ways each step could fail. For each potential failure mode, the team scores three factors on a scale of 1 to 10:

Severity: How serious would the consequences be if this failure occurred?
Occurrence: How likely is this failure to happen?
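Detection: How likely is it that the failure would be caught before causing harm? (In conventional FMEA scoring, a higher number means the failure is harder to detect, so that all three scores rise with risk.)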

These three scores are multiplied together to produce a Risk Priority Number (RPN). The RPN allows the team to rank failure modes and direct corrective action toward those that pose the greatest combined risk (Performance Health, 2025; Anesthesia Patient Safety Foundation, 2024). After interventions are designed and implemented, the RPN is recalculated to confirm that the risk has meaningfully decreased.
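
The scoring and ranking step can be sketched in a few lines of Python. This is a minimal illustration rather than a validated tool: the failure mode names and scores are hypothetical, the class and function names are invented for this example, and it assumes the conventional scale in which a higher detection score means the failure is harder to catch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    """One potential failure in a mapped process step (all values hypothetical)."""
    name: str
    severity: int    # 1 (negligible) to 10 (catastrophic)
    occurrence: int  # 1 (remote) to 10 (almost certain)
    detection: int   # 1 (almost always caught) to 10 (almost never caught)

    def __post_init__(self) -> None:
        # Reject scores outside the 1-10 scale described in the text.
        for label in ("severity", "occurrence", "detection"):
            score = getattr(self, label)
            if not 1 <= score <= 10:
                raise ValueError(f"{label} must be between 1 and 10, got {score}")

    @property
    def rpn(self) -> int:
        """Risk Priority Number: the product of the three scores."""
        return self.severity * self.occurrence * self.detection


def rank_by_rpn(modes):
    """Return failure modes ordered from highest to lowest RPN."""
    return sorted(modes, key=lambda m: m.rpn, reverse=True)


# Hypothetical failure modes for illustration only.
failure_modes = [
    FailureMode("Wrong patient label on specimen", 9, 3, 4),
    FailureMode("Dose transcription error", 8, 4, 5),
    FailureMode("Delayed handoff note", 5, 6, 2),
]

ranked = rank_by_rpn(failure_modes)
for mode in ranked:
    print(f"{mode.rpn:>4}  {mode.name}")
```

In practice the ranked list is a starting point for team discussion, not a substitute for it.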

This methodology has been successfully applied across a wide range of healthcare settings. In hemodialysis, FMEA applied to a center performing approximately 12,000 dialysis sessions per year identified 31 distinct failure modes, with the majority concentrated at the patient connection stage. This is exactly the kind of finding that would be unlikely to surface without systematic prospective review (Scarpis et al., 2022). In pharmacy, FMEA has been used to identify and prioritize failure modes in medication dispensing processes, surfacing systemic issues like overcrowded dispensing counters as contributors to nearly 57 distinct failure modes (Sidhu & Butt, 2021). In perioperative care, FMEA has been applied to processes such as anticoagulant management, identifying incorrect timing between therapy suspension and surgery as a high-priority risk before any adverse event occurred (Pelino et al., 2022).

The Near Miss: Healthcare’s Most Underused Teacher

Between the serious adverse event that triggers an RCA and the theoretical risk that FMEA is designed to address, there exists a middle ground that healthcare organizations often squander: the near miss.

A near miss is an event that could have led to patient harm but did not, either because of a timely intervention or by chance. These incidents are not merely “close calls” to be quietly absorbed by the team involved. They are data. They represent the moment when a system’s latent failures came close enough to the surface to be visible, without requiring a patient to pay the price for that visibility.

The World Health Organization has described near-miss reporting as a hallmark of highly reliable learning organizations (Liu et al., 2022). Near misses reveal gaps in processes that might otherwise go unnoticed until a real error occurs (GoAudits, 2026). They expose the same system vulnerabilities that an adverse event would expose, but without the human cost.

Despite this value, near misses remain significantly underreported in many healthcare settings. Common barriers include fear of blame, uncertainty about what constitutes a reportable event, skepticism that reporting will lead to meaningful change, and the additional workload that formal reporting entails (U.S. Agency for Healthcare Research and Quality, 2023). Research has found that healthcare workers often “do a quick fix and not report,” effectively absorbing the learning opportunity within the individual rather than routing it to the system (Liu et al., 2022).

This is where FMEA and near-miss analysis reinforce each other. Near misses, when captured and analyzed, provide real-world evidence of which failure modes are not merely theoretical but actively occurring at low levels of harm. They can seed an FMEA process with specific, grounded data. Conversely, FMEA can help organizations prioritize which near misses deserve the most rigorous follow-up, since the RPN framework can identify failure modes that, while currently unrealized, carry high potential for serious harm if they were to fully materialize.

A telling example comes from a research setting, where a near miss in which a volunteer was nearly enrolled in the wrong clinical protocol was investigated and traced to a systemic absence of reliable participant tracking and verification systems. The near miss made this latent failure visible, and without formal review of the near miss, the failure would have persisted (NCBI Bookshelf, 2006). This is the learning loop that mature organizations aim to institutionalize.

The Maturity Question: What It Means to Anticipate Rather Than React

Healthcare organizations do not adopt proactive risk management all at once. Research on safety culture maturity describes a progression that moves from reactive, to compliance-based, to proactive, and finally to generative, a state in which safety is deeply embedded in everyday organizational behavior (Health Catalyst, 2018).

At the reactive end of this spectrum, safety actions occur only in response to incidents. Efforts are fragmented, and there is little structural engagement with risk until harm has already occurred (Safety+Health Magazine, 2025). This describes a system that is largely dependent on RCA as its primary safety signal. The knowledge generated is real and often valuable, but it arrives late and it arrives at a cost.

Most hospitals currently function in a primarily reactive mode, investigating incidents in which patients have already been harmed, conducting root cause analyses, and instituting corrective action plans to prevent future occurrences (Chassin & Loeb, 2013). Becoming safer requires a willingness and ability to recognize and act on near misses and unsafe conditions before they escalate.

FMEA represents the structural embodiment of proactive safety culture. It does not wait for an event. It sends a team into a process to hunt for failure before failure hunts them. The very act of convening a multidisciplinary team to ask “what could go wrong?” is itself a cultural signal that risk is everyone’s responsibility and that anticipation is more valuable than investigation.

In practical terms, this means that organizations that invest in FMEA are building a different kind of institutional memory than those that rely on RCA alone. RCA gives institutions a catalog of what has happened. FMEA gives them a model of what is structurally possible. The most safety-mature organizations use both, treating RCA findings as inputs into FMEA processes and using FMEA outputs to inform where near-miss reporting systems should focus attention.

It also means accepting that a given team’s collective imagination about failure may be as valuable as a database of past events. High-reliability organizations treat predictive thinking as a professional skill, not just a procedural exercise (Health Catalyst, 2018). When frontline staff are given the language and structured opportunity to say “this process worries me,” that signal, if caught and acted upon, is functionally equivalent to an adverse event report, only without the adverse event.

Limitations Worth Naming

FMEA is not without its critics, and intellectual honesty requires acknowledging the tool’s constraints. Several reviews have noted that the traditional RPN calculation can be misleading: two failure modes with very different underlying scores can produce the same RPN, and the relative weighting of severity, occurrence, and detection may not reflect the priorities of every clinical context (Liu et al., 2013, as cited in Anesthesia Patient Safety Foundation, 2024).
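
A small arithmetic example makes the first criticism concrete. The scores below are hypothetical, and the tie-breaking rule at the end is only one possible convention, assumed here for illustration rather than drawn from the cited reviews.

```python
def rpn(severity: int, occurrence: int, detection: int) -> int:
    """Risk Priority Number: the product of three 1-10 scores."""
    return severity * occurrence * detection


# Hypothetical (severity, occurrence, detection) triples.
rare_but_catastrophic = (10, 2, 3)
frequent_but_minor = (2, 10, 3)

# Both products come to 60, although the clinical pictures differ sharply.
print(rpn(*rare_but_catastrophic), rpn(*frequent_but_minor))


def priority_key(scores):
    """Assumed tie-breaker: compare by RPN first, then by severity."""
    return (rpn(*scores), scores[0])


ranked = sorted(
    [frequent_but_minor, rare_but_catastrophic],
    key=priority_key,
    reverse=True,
)
print(ranked[0])  # the higher-severity failure mode is listed first
```

Whether severity should break ties, or override the RPN entirely above some threshold, is a judgment each team would need to make for its own clinical context.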

Additionally, FMEA is resource-intensive. Assembling a multidisciplinary team, mapping processes thoroughly, and scoring failure modes thoughtfully takes time that clinical environments often cannot easily spare. The quality of an FMEA is directly tied to the quality of team composition and facilitation; a poorly conducted FMEA may produce false confidence about process safety rather than genuine risk reduction.

There is also an honest gap in the evidence base for near-miss learning. While the intuitive logic is compelling, a 2023 scoping review found that formal evidence supporting the assumption that near-miss reporting and learning improve patient safety remains limited, and that actions following near-miss reporting are commonly directed at the individual rather than the system, which is precisely the wrong level of intervention (Hewitt & Chreim, 2023). This suggests that near-miss systems are necessary but not sufficient. The system must also be designed to translate near-miss data into structural improvement.

These are reasons to implement FMEA carefully and evaluate it rigorously. They are not reasons to defer it indefinitely in favor of waiting for adverse events to generate the necessary motivation.

Practical Starting Points

For healthcare teams considering FMEA for the first time, or looking to expand its use within their organization, several practical principles are well-supported by the literature:

Choose high-risk, high-volume processes first. Processes like medication dispensing, handoffs, anticoagulant management, and procedural workflows carry the highest density of latent failure modes and offer the most return on the investment of FMEA effort.
Build multidisciplinary teams intentionally. FMEA quality depends on diversity of expertise. The team should include people who designed the process, people who use it daily, and people whose patients are downstream of it.
Use RCA findings as FMEA inputs. Every sentinel event investigation should feed into a prospective review of similar processes. What happened once is evidence of what could happen again or elsewhere.
Treat near misses as preliminary FMEA data. Create the psychological safety conditions for near-miss reporting, and build feedback mechanisms that close the loop between reporting and visible change. Without visible follow-through, reporting cultures erode.
Recalculate the RPN after intervention. The value of FMEA is not in the initial risk assessment alone but in the longitudinal tracking of whether interventions actually reduced risk. A post-intervention RPN that has not meaningfully decreased is a signal that further action is needed.
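
The last principle, recalculating the RPN after an intervention, can be sketched as follows. The scores are hypothetical, and the 50 percent threshold for a “meaningful” decrease is an assumption made for this example; the literature cited here does not prescribe a specific cutoff.

```python
def rpn(severity: int, occurrence: int, detection: int) -> int:
    """Risk Priority Number: the product of three 1-10 scores."""
    return severity * occurrence * detection


def relative_reduction(before: int, after: int) -> float:
    """Fraction by which the RPN fell (negative if it rose)."""
    return (before - after) / before


def needs_further_action(before: int, after: int, min_reduction: float = 0.5) -> bool:
    """Flag a failure mode whose RPN fell by less than the assumed threshold."""
    return relative_reduction(before, after) < min_reduction


# Hypothetical scores: the intervention lowers occurrence and improves
# detection, while severity is left unchanged.
baseline = rpn(8, 4, 5)
followup = rpn(8, 2, 3)

print(baseline, followup, needs_further_action(baseline, followup))
```

Tracking the before and after values per failure mode over successive review cycles gives the longitudinal view described above.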

Conclusion

The question is no longer whether FMEA belongs in healthcare’s patient safety toolkit. The evidence, and the weight of experience from aviation, nuclear energy, and manufacturing that preceded its adoption in clinical settings, makes clear that prospective risk analysis saves lives that reactive analysis cannot. The more important question is whether organizations are willing to invest in anticipation before they are compelled to respond to tragedy.

RCA will always have a role. When things go wrong, and in complex systems they will, understanding why is essential. But an organization whose primary safety learning mechanism requires harm to have already occurred is an organization that has accepted a preventable delay in its own improvement. FMEA offers an alternative: structured, systematic, team-based engagement with risk before the patient encounter that might have revealed it.

As the safety maturity literature makes clear, the organizations that protect patients most effectively are not those that respond best to failure. They are those that have learned, structurally and culturally, to see failure coming (Avetta, 2024). FMEA is one of the clearest tools available for developing that capability. And for healthcare workers who entered this field to prevent harm rather than document it, the case for adopting that capability should need no sentinel event to make it compelling.

References

Abujudeh, H. H., Kaewlai, R., Asfaw, B. A., & Bhatt, S. (2014). Root-cause analysis and health failure mode and effect analysis: Two leading techniques in health care quality assessment. Journal of the American College of Radiology, 11(6), 572-579. https://doi.org/10.1016/j.jacr.2013.07.029

American Institute of Healthcare Compliance. (2025). RCA is a patient safety initiative and a compliance imperative. https://aihc-assn.org/rca-is-a-patient-safety-initiative-and-a-compliance-imperative/

Anderson, J. E., Kodate, N., Walters, R., & Dodds, A. (2012). Key performance outcomes of patient safety curricula: Root cause analysis, failure mode and effects analysis, and structured communications skills. American Journal of Pharmaceutical Education, 76(9), 162. https://doi.org/10.5688/ajpe769162

Anesthesia Patient Safety Foundation. (2024). Proactive perioperative risk analysis: Use of failure mode and effects analysis (FMEA). https://www.apsf.org/article/proactive-perioperative-risk-analysis-use-of-failure-mode-and-effects-analysis-fmea/

Avetta. (2024). Advancing safety: Understanding the 5 levels of safety maturity. https://www.avetta.com/blog/advancing-safety-understanding-the-5-levels-of-safety-maturity

Chassin, M. R., & Loeb, J. M. (2013). High-reliability health care: Getting there from here. Milbank Quarterly, 91(3), 459-490. https://doi.org/10.1111/1468-0009.12023

DeRosier, J., Stalhandske, E., Bagian, J. P., & Nudell, T. (2002). Using health care failure mode and effect analysis: The VA National Center for Patient Safety’s prospective risk analysis system. Joint Commission Journal on Quality Improvement, 28(5), 248-267.

GoAudits. (2026). How to prevent and report near misses in healthcare. https://goaudits.com/blog/near-miss-in-healthcare/

Health Catalyst. (2018). A framework for high-reliability organizations in healthcare. https://www.healthcatalyst.com/learn/insights/high-reliability-organizations-in-healthcare-framework

Hewitt, T. A., & Chreim, S. (2023). The value of learning from near misses to improve patient safety: A scoping review. Journal of Patient Safety, 19(1), 1-9. https://doi.org/10.1097/PTS.0000000000001071

Liu, H. C., Liu, L., & Liu, N. (2013). Risk evaluation approaches in failure mode and effects analysis: A literature review. Expert Systems with Applications, 40(2), 828-838. https://doi.org/10.1016/j.eswa.2012.08.010

Liu, X., He, W., & Chen, Z. (2022). The effect of patient safety culture on nurses’ near-miss reporting intention: The moderating role of perceived severity of near misses. Frontiers in Medicine, 9, 839181. https://doi.org/10.3389/fmed.2022.839181

NCBI Bookshelf. (2006). Near-miss reporting system development and implications for human subjects protection. In Advances in patient safety: From research to implementation (Vol. 3). Agency for Healthcare Research and Quality. https://www.ncbi.nlm.nih.gov/books/NBK20529/

Pelino, G., Silvestri, L., Vendramini, B., Steffan, A., Sanson, G., Sartori, M., & Peratoner, A. (2022). Proactive risk assessment through failure mode and effect analysis (FMEA) for perioperative management model of oral anticoagulant therapy: A pilot project. International Journal of Environmental Research and Public Health, 19(24), 16430. https://doi.org/10.3390/ijerph192416430

Performance Health. (2025). The critical role of FMEA in healthcare risk management. https://www.performancehealthus.com/blog/role-of-fmea-in-healthcare-risk-management

Safety+Health Magazine. (2025). Safety maturity. https://www.safetyandhealthmagazine.com/articles/26924-safety-maturity

Scarpis, E., Comoretto, R. I., Gregori, D., Quattrin, R., Beux, P., & Corbella, M. (2022). Proactive risk assessment through failure mode and effect analysis (FMEA) for haemodialysis facilities: A pilot project. Frontiers in Public Health, 10, 823680. https://doi.org/10.3389/fpubh.2022.823680

Sidhu, N. S., & Butt, H. (2021). Application of failure mode and effects analysis (FMEA) to improve medication safety in the dispensing process: A study at a teaching hospital, Sri Lanka. Journal of Pharmaceutical Sciences and Research, 13(6), 361-367. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8293514/

U.S. Agency for Healthcare Research and Quality. (2023). Implementing near-miss reporting and improvement tracking in primary care practices: Lessons learned. https://www.ahrq.gov/patient-safety/reports/liability/crane.html

Van Tilburg, C. M., Leistikow, I. P., Rademaker, C. M. A., Bierings, M. B., & van Dijk, A. T. H. (2006). Health care failure mode and effect analysis: A useful proactive risk analysis in a pediatric oncology ward. Quality and Safety in Health Care, 15(1), 58-64.