Beyond Human Error: Why Your Safety Investigations Must Focus on Failed Defenses

Understanding Barrier Analysis and the Systems That Were Supposed to Protect Your Patients

When something goes wrong in a healthcare setting (a medication error, a wrong-site surgery, a missed diagnosis), the instinct is to ask: who made the mistake? While that question feels natural, it is almost always the wrong place to start. The more important question is: what protections were supposed to prevent this, and why didn’t they work?

This is the core insight of barrier analysis, a structured approach to safety investigations that shifts attention from the actions of individuals to the integrity of the systems designed to catch and correct errors before they reach patients. For healthcare workers and safety professionals, understanding barrier analysis is not just an academic exercise; it is a fundamentally different way of seeing risk.

The Event Happened Because the Defenses Failed

In patient safety, a ‘barrier’ is any control measure (a policy, a technology, a workflow, a person) that stands between a hazard and harm. Barrier analysis asks a direct question: which of those controls were in place, which were missing, and which failed to work as intended?

This reframing matters enormously. When investigators focus only on the actions that preceded an adverse event, they tend to find blame. When they focus on the barriers, they find systems. And fixing systems is what prevents the next patient from being harmed.

Most investigations should focus as much on failed protections as on failed actions. The event happened not because one person erred—but because the defenses that should have caught that error did not hold.

The World Health Organization’s patient safety curriculum describes barriers as ‘obstacles that prevent a hazard from causing harm,’ and emphasizes that robust safety systems layer these obstacles deliberately, so that no single failure can result in catastrophe (World Health Organization [WHO], 2011). When an event occurs despite the existence of barriers, investigators must ask why those barriers failed—not simply who was at fault.

Swiss Cheese and the Architecture of Risk

The most influential conceptual model in barrier analysis is James Reason’s ‘Swiss Cheese Model,’ developed in the 1990s and still widely used in healthcare safety today (Reason, 1990). The model illustrates how complex systems protect against failure through multiple defensive layers—each represented as a slice of Swiss cheese.

Each slice has holes. Those holes represent weaknesses, gaps, or temporary failures in a given barrier. Under normal conditions, the holes don’t align—a gap in one layer is covered by a solid portion of the next. But on rare, catastrophic occasions, the holes do align, and a hazard travels all the way through every layer, reaching the patient. This is when adverse events occur.

Accidents are rarely caused by a single catastrophic failure. They happen when multiple smaller failures align simultaneously—like light passing through holes in a stack of Swiss cheese.
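
The arithmetic behind the model can be sketched in a few lines. This is an illustration only: the per-layer failure chances are hypothetical values, not measured data, and the calculation assumes each layer fails independently of the others, which real systems often violate because a single latent condition (short staffing, for example) can open holes in several layers at once.

```python
from math import prod

def breach_probability(hole_probs):
    """Chance that a hazard passes every layer, assuming independent layers."""
    return prod(hole_probs)

# Hypothetical chance that each layer misses a given error.
healthy = [0.1, 0.1, 0.1]    # three layers, each missing about 1 error in 10
degraded = [0.1, 0.5, 1.0]   # one layer skipped half the time, one layer absent

print(f"healthy stack:  {breach_probability(healthy):.3f}")
print(f"degraded stack: {breach_probability(degraded):.3f}")
```

Under these assumptions the healthy stack lets through roughly 1 hazard in 1,000, while the degraded stack lets through 1 in 20. No single layer became catastrophically worse; the holes simply became easier to line up.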

For healthcare workers, this model reframes what an adverse event actually means: it is not evidence of one person’s failure. It is evidence that multiple layers of defense failed simultaneously. Reason himself emphasized that focusing blame on the individual at the ‘sharp end’ (the nurse, the physician, or the technician who was last to touch the situation) overlooks the systemic conditions that made failure more likely (Reason, 2000).

Dekker (2006) extends this thinking further, arguing that ‘human error’ is not a cause of accidents but a symptom of deeper organizational and systemic dysfunction. When investigators label an event as human error, they have stopped looking too soon. The next question must always be: what made that error possible, likely, or undetectable?

Prevention, Mitigation, and Recovery: A Three-Part Framework

Not all barriers are the same, and understanding their distinct roles is essential for both investigation and system design. In healthcare safety, barriers are commonly grouped into three functional categories: prevention barriers, mitigation barriers, and recovery barriers.

Prevention Barriers: Stopping Harm Before It Begins

Prevention barriers are designed to stop a hazardous event from occurring at all. They act upstream, before the threat has materialized. Examples in healthcare include:

  • Forcing functions—designs that make it physically impossible to commit certain errors, such as non-interchangeable connectors on IV tubing and gas lines.
  • Clinical decision support alerts that fire when a prescriber orders a contraindicated drug or dose.
  • Mandatory surgical time-outs before procedures begin.
  • Credentialing and privileging processes that restrict practice to qualified clinicians.

When prevention barriers fail or are absent, the hazard is no longer hypothetical—it has entered the system. This is when mitigation barriers become crucial.

Mitigation Barriers: Reducing the Impact of What Has Already Gone Wrong

Mitigation barriers do not prevent the hazardous event—they limit its consequences. They recognize that prevention is never perfect and build redundancy into the system to contain damage. Examples include:

  • Monitoring and surveillance systems that detect early deterioration, such as continuous pulse oximetry or rapid response teams.
  • Double-check procedures for high-alert medications like chemotherapy or anticoagulants.
  • Infection control protocols after a potential exposure has occurred.

The Institute for Healthcare Improvement (IHI) has long advocated that safety systems must include both prevention and mitigation strategies, recognizing that a system designed only to prevent errors will still experience patient harm when those prevention barriers inevitably fail (Berwick et al., 2006).

Recovery Barriers: Catching and Correcting Errors Before They Cause Harm

Recovery barriers are perhaps the most underappreciated category. These are the mechanisms that detect an error after it has occurred but before it harms the patient—and correct course in time. They are the last line of defense, and when they function well, they are also invisible: near-misses, by definition, do not become adverse events.

Examples of recovery barriers include:

  • Pharmacist review of medication orders before dispensing.
  • A nurse who questions an unusual order before administering a medication.
  • A bedside patient identity check that catches a labeling error before a transfusion.
  • Post-procedure radiographic confirmation that detects a retained surgical instrument.

Leape et al. (1995) found in their landmark study of adverse drug events in hospitals that a significant proportion of harm could have been prevented if recovery barriers, particularly medication review processes, had functioned as intended. The study underscored that healthcare systems routinely depend on human vigilance as a recovery barrier, which leaves that barrier vulnerable to fatigue, distraction, and cognitive overload.

Near-misses are not lucky escapes. They are evidence that recovery barriers worked. They deserve as much investigative attention as adverse events—because they reveal where the system is thin.

Why Barriers Are Weakened Before They Fail

One of the most important and counterintuitive findings in safety science is that barriers rarely fail suddenly. More often, they degrade gradually—through a process of normalization, resource constraint, and organizational drift—long before an adverse event occurs.

Diane Vaughan (1996), in her analysis of the Space Shuttle Challenger disaster, described this as the ‘normalization of deviance’: the process by which organizations come to accept conditions that deviate from established safety standards, treating them as normal because nothing bad has happened yet. This dynamic is disturbingly common in healthcare.

A double-check policy that is routinely skipped because staffing is short. An alert that is almost always overridden because it fires too frequently. A hand hygiene compliance rate that hovers at 60 percent but never triggers formal review. Each of these represents a barrier that exists on paper but has been functionally weakened in practice.
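
One way to make this kind of quiet erosion visible is to compare how often a barrier was actually used with how often it was required. The sketch below is a minimal illustration, not an established audit method: the `BarrierAudit` record, the 0.9 threshold, and all of the counts are assumptions chosen for the example.

```python
from dataclasses import dataclass

@dataclass
class BarrierAudit:
    name: str
    times_required: int    # occasions on which the barrier should have been used
    times_performed: int   # occasions on which it was used as designed

    def compliance(self) -> float:
        # No observations: treat the barrier as unverified rather than compliant.
        if self.times_required == 0:
            return 0.0
        return self.times_performed / self.times_required

def weakened(audits, threshold=0.9):
    """Names of barriers whose observed compliance falls below the threshold."""
    return [a.name for a in audits if a.compliance() < threshold]

# Hypothetical audit counts for illustration only.
audits = [
    BarrierAudit("independent double-check", 200, 120),
    BarrierAudit("interaction alert heeded", 500, 40),
    BarrierAudit("hand hygiene", 1000, 600),
    BarrierAudit("surgical time-out", 150, 149),
]
print(weakened(audits))
```

With these made-up numbers, three of the four barriers fall below the threshold even though every one of them still exists in policy. The useful design choice is that the review is triggered by the compliance figure itself, not by a patient being harmed.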

The Joint Commission’s sentinel event data consistently show that communication failures, leadership gaps, and inadequate staffing, rather than individual incompetence, are root causes in the majority of serious adverse events (The Joint Commission, 2023). These organizational factors are the conditions that weaken barriers over time.

Reason (2000) called these ‘latent conditions’: hidden organizational factors that lie dormant until they combine with active failures (the errors of individuals) to produce an accident. Latent conditions are particularly dangerous precisely because they are invisible until something goes wrong. Investigations that focus only on the moment of error will miss the months or years of organizational decisions that made that error likely.

By the time an adverse event occurs, the barriers protecting patients have often been eroding for months or years. The event is the symptom. The weakened defenses are the disease.

Vincent et al. (2000) developed a framework for analyzing patient safety incidents that explicitly maps contributing factors across multiple levels—patient characteristics, task factors, individual provider factors, team factors, work environment, and organizational/management factors. This multidimensional approach ensures that investigations do not stop at the individual level but continue upstream to the systemic and organizational conditions that shaped the event.
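
A simple way to apply this in practice is to tag each finding with the level it belongs to and then check which levels the investigation has not yet examined. The sketch below uses the six levels listed above; the `Level` enum, the `unexamined_levels` helper, and the example findings are hypothetical and are not part of the published protocol.

```python
from enum import Enum

class Level(Enum):
    PATIENT = "patient characteristics"
    TASK = "task factors"
    INDIVIDUAL = "individual provider factors"
    TEAM = "team factors"
    ENVIRONMENT = "work environment"
    ORGANIZATION = "organizational/management factors"

def unexamined_levels(findings):
    """Levels for which the investigation has recorded no finding yet."""
    seen = {level for level, _ in findings}
    return [level for level in Level if level not in seen]

# Hypothetical findings from a medication event, for illustration only.
findings = [
    (Level.INDIVIDUAL, "nurse selected the wrong vial"),
    (Level.TASK, "look-alike vials stored side by side"),
    (Level.ENVIRONMENT, "unit was short two nurses on the shift"),
]

for level in unexamined_levels(findings):
    print(f"not yet examined: {level.value}")
```

An empty level does not prove that nothing happened there. It is a prompt to go back and ask before the report is closed.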

Reorienting Investigations: Lessons for Healthcare Teams

Understanding barrier analysis changes the entire posture of a safety investigation. The following principles reflect best practice for healthcare organizations committed to learning from adverse events rather than assigning blame.

1. Map the Barriers First

Before asking what went wrong, ask what was supposed to prevent it. Reconstruct the full set of barriers that were in place for the type of event that occurred. Then determine, for each barrier: Was it present? Was it functioning? Did it activate? If not, why not?
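
The three questions can be captured in a small record so that no barrier is left unassessed. This is a minimal sketch under stated assumptions: the `BarrierRecord` fields, the verdict wording, and the barriers listed for a hypothetical near-miss are all illustrative, not a standard tool.

```python
from dataclasses import dataclass

@dataclass
class BarrierRecord:
    name: str
    present: bool              # Was it present?
    functioning: bool = False  # Was it functioning?
    activated: bool = False    # Did it activate?

def verdict(barrier):
    """Answer the three questions in order and stop at the first 'no'."""
    if not barrier.present:
        return "missing"
    if not barrier.functioning:
        return "present but not functioning"
    if not barrier.activated:
        return "functioning but did not activate"
    return "held"

# Hypothetical barrier map for a transfusion near-miss.
barrier_map = [
    BarrierRecord("independent double-check at labeling", present=False),
    BarrierRecord("barcode scan at the blood bank", present=True, functioning=False),
    BarrierRecord("electronic order verification", present=True, functioning=True),
    BarrierRecord("bedside identity check", present=True, functioning=True, activated=True),
]

for barrier in barrier_map:
    print(f"{barrier.name}: {verdict(barrier)}")
```

Every row that does not read "held" becomes a "why not?" question for the investigation, including in cases like this one where the final barrier caught the error.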

2. Investigate Near-Misses with the Same Rigor as Adverse Events

A near-miss means recovery barriers worked, but it also means that at least one upstream barrier failed. Near-misses are invaluable leading indicators of systemic weakness. Organizations that only investigate when a patient is harmed are always reacting; organizations that investigate near-misses can intervene before harm occurs (WHO, 2011).

3. Look for Latent Conditions, Not Just Active Failures

When an investigation identifies a human error, treat it as a starting point, not a conclusion. Ask: What conditions made this error possible? What conditions made it hard to detect? What conditions made it hard to correct? The answers will almost always point to organizational factors that existed long before the event.

4. Resist the Allure of Individual Blame

Individual accountability matters—but it is not the same as individual blame. The goal of a safety investigation is to make the system safer, not to assign punishment. When organizations default to blaming individuals, they signal to their staff that reporting errors is dangerous, which suppresses the very information needed to improve safety (Dekker, 2006).

5. Treat Every Failed Barrier as a System Design Problem

When a barrier fails, ask: Why was it designed this way? Does the design account for how people actually work under real conditions? Is the barrier realistic given workload, staffing, and cognitive demands? Barriers that depend on perfection from humans under stress are not robust barriers—they are aspirations.

Conclusion

The Swiss Cheese Model, barrier analysis, and the distinction between prevention, mitigation, and recovery controls are not just theoretical frameworks. They are practical lenses that allow healthcare teams to see safety differently: to look past the error and find the system that allowed the error to cause harm.

Most adverse events in healthcare are not the result of reckless or incompetent individuals. They are the result of systems with weakened defenses, latent organizational conditions, and barriers that looked solid on paper but had been eroding for months. Investigations that find the person who made the last mistake have not found the cause of the event; they have found the end of the story, not the beginning.

The beginning is in the barriers. It is in the policies that were not enforced, the alerts that were always overridden, the staffing ratios that left nurses stretched too thin to catch every error. When healthcare organizations commit to investigating those conditions with the same rigor they bring to investigating individual actions, they will begin to build the systems their patients deserve.

Safety is not the absence of error. It is the presence of defenses strong enough to absorb error before it becomes harm.

References

Berwick, D. M., Calkins, D. R., McCannon, C. J., & Hackbarth, A. D. (2006). The 100,000 Lives Campaign: Setting a goal and a deadline for improving health care quality. JAMA, 295(3), 324–327. https://doi.org/10.1001/jama.295.3.324

Dekker, S. (2006). The field guide to understanding human error. Ashgate Publishing.

Leape, L. L., Bates, D. W., Cullen, D. J., Cooper, J., Demonaco, H. J., Gallivan, T., Hallisey, R., Ives, J., Laird, N., Laffel, G., Nemeskal, R., Petersen, L. A., Porter, K., Servi, D., Shea, B. F., Small, S. D., Sweitzer, B. J., Thompson, B. T., & Vander Vliet, M. (1995). Systems analysis of adverse drug events. JAMA, 274(1), 35–43. https://doi.org/10.1001/jama.1995.03530010049034

Reason, J. (1990). Human error. Cambridge University Press.

Reason, J. (2000). Human error: Models and management. BMJ, 320(7237), 768–770. https://doi.org/10.1136/bmj.320.7237.768

The Joint Commission. (2023). Sentinel event data: General information. https://www.jointcommission.org/resources/sentinel-event/sentinel-event-data-general-information/

Vaughan, D. (1996). The Challenger launch decision: Risky technology, culture, and deviance at NASA. University of Chicago Press.

Vincent, C., Taylor-Adams, S., Chapman, E. J., Hewett, D., Prior, S., Strange, P., & Tizzard, A. (2000). How to investigate and analyse clinical incidents: Clinical risk unit and association of litigation and risk management protocol. BMJ, 320(7237), 777–781. https://doi.org/10.1136/bmj.320.7237.777

World Health Organization. (2011). Patient safety curriculum guide: Multi-professional edition. World Health Organization. https://www.who.int/publications/i/item/9789241501958
