When The System Fails

Buried in today's New York Times is a story titled “BP and Minerals Agency Faulted in Gulf Spill.” Why shouldn't it be buried? Is there any news here?

The newsworthy aspect of the story is a report issued (actually leaked -- how ironic) yesterday by the National Academy of Engineering and National Research Council about the Gulf oil spill. The report is only preliminary, but I expect they got the key cause-and-effect dynamics mostly right. We all witnessed events in excruciating detail, and substituted anger for our helplessness as the Gulf, a treasure to us all and a lifeline for citizens in nearby states, was imperiled.

The 28-page report criticizes BP for a lack of management discipline in the events leading up to and following the explosion. A few areas of the report (be aware: it's deadly dry stuff) stand out. For me, the most important came from the last paragraph of the summary: “Of particular concern is an apparent lack of a systems approach that would integrate the multiplicity of factors potentially affecting the safety of the well, monitor the overall margins of safety and assess the various decisions from perspectives of well integrity and safety.” It also cites instances where a more effective approach, faithfully deployed, has worked in comparable circumstances.

The many isolated failures are addressed in the body of the report (lest you think BP is the only bad actor in this story, think again), and there is plenty of culpability to go around. But it comes back to the system: if the system had been working, those isolated failures would have been caught, the risk mitigated, and the damage contained. I'm not suggesting that mistakes wouldn't have happened or that the impact would have been zero, but I am suggesting that when the system works, better outcomes result.

For me, the worst disasters occur when the management system fails, just as the greatest successes are the result of management system excellence. Looking at all the pieces is important to gain an understanding of root causes and local effects, but the light doesn't go on until we connect the dots, with certainty looking back and confidence looking forward.

Do you have examples of where the system failed? For some reassurance, we'd also like to hear of instances where it succeeded.