Archives For sensemaking

The Sydney Morning Herald published an article this morning recounting the QF72 in-flight accident from the point of view of the crew and passengers; you can find the story at this link. I’ve previously covered the technical aspects of the accident here, the underlying integrative architecture program that brought us to this point here, and the consequences here. So it was interesting to reflect on the event from the human perspective. Karl Weick points out in his influential paper on the Mann Gulch fire disaster that small organisations, for example the crew of an airliner, are vulnerable to what he termed a cosmology episode, that is, a moment in which one suddenly and deeply feels that the universe is no longer a rational, orderly system. In the case of QF72 this was initiated by the simultaneous stall and overspeed warnings, followed by the abrupt pitch-over of the aircraft as the flight protection laws engaged for no apparent reason.

Weick further posits that what makes such an episode so shattering is that both the sense of what is occurring and the means to rebuild that sense collapse together. In the Mann Gulch blaze the fire team’s organisation attenuated and finally broke down as the situation deteriorated, until at the end the crew could not comprehend the one action that would have saved their lives: building an escape fire. Air crew, for their part, implicitly rely on the aircraft’s systems to ‘make sense’ of the situation; a significant failure such as occurred on QF72 denies them both an understanding of what is happening and the ability to rebuild that understanding. Weick also noted that in such crises organisations matter because they help people find order and meaning in ill-defined and uncertain circumstances, which has interesting implications when we look at the automation in the cockpit as another member of the team.

“The plane is not communicating with me. It’s in meltdown. The systems are all vying for attention but they are not telling me anything…It’s high-risk and I don’t know what’s going to happen.”

Capt. Kevin Sullivan (QF72 flight)

From this Weickian viewpoint we see the aircraft’s automation as both part of the situation (‘what is happening?’) and as a member of the crew (‘why is it doing that, can I trust it?’). Thus the crew of QF72 were faced with both a vu jàdé moment and the allied disintegration of the human-machine partnership that could have helped them make sense of the situation. The challenge the QF72 crew faced was not to make a decision based on clear data and well-rehearsed procedures from the flight manual; instead they faced the far more unnerving loss of meaning as the situation outstripped their past experience.

“Damn it! We’re going to crash. It can’t be true!” (copilot #1)

“But what’s happening?” (copilot #2)

AF447 CVR transcript (final words)

Nor was this an isolated incident. One study of other such ‘unreliable airspeed’ events found that errors in understanding were both far more likely to occur than other error types and, when they did occur, much more likely to end in a fatal accident. In fact the study found that every accident with a fatal outcome involved an error in detection or understanding, with the majority being errors of understanding. From Weick’s perspective, then, the collapse of sensemaking is the knock-out blow in such scenarios, as the last words of the Air France AF447 crew so grimly illustrate. Luckily, in the case of QF72 the aircrew were able to contain this collapse and rebuild their sense of the situation; in other such failures, such as AF447, they were not.

 

Well, I can’t believe I’m saying this, but those happy clappers of the software development world, the proponents of Agile, Scrum and the like, might (grits teeth) actually have a point. At least when it comes to the development of novel software systems in circumstances of uncertainty, and possibly even for high assurance systems.


Deepwater Horizon (image source: NY Times)

Mindfulness and paying attention to the wrong things

As I talked about in a previous post on the Deepwater Horizon disaster, I believe one of the underlying reasons, perhaps the reason, for Deepwater’s problems escalating into a catastrophe was the attentional blindness of management to the indicators of problems on the rig, and that this blindness was due in large part to a corporate focus on individual worker injury rates at the expense of thinking about those rare but catastrophic risks that James Reason calls organisational accidents. And, in a coincidence to end all coincidences, a high-level management team was actually visiting just prior to the disaster to congratulate the crew on their seven years of injury-free operations.

So it was kind of interesting to read in James Reason’s latest work, ‘A Life in Error’, his conclusion that the road to epic organisational accidents is paved with declining or low Lost Time Injury Frequency Rates (LTIFR). He goes on to give the following examples in support:

  • Westray mining disaster (1992), Canada. 26 miners died, yet the company had received an award for reducing its LTIFR.
  • Moura mining disaster (1994), Queensland. 11 miners died. The company had halved its LTIFR in the four years preceding the accident.
  • Longford gas plant explosion (1998), Victoria. Two died and eight were injured. Safety effort was directed at reducing LTIFR rather than at identifying and fixing the major hazards posed by unrepaired equipment.
  • Texas City refinery explosion (2005), Texas. The Independent Safety Review Panel found that BP relied on injury rates to evaluate safety performance.

As Reason concludes, the causes of accidents that result in direct individual injury are very different from those that result in what he calls an organisational accident, that is, one that is both rare and truly catastrophic. Data gathered on LTIFR therefore tells you nothing about the likelihood of such a catastrophic event, and as it turns out can be quite misleading. My belief is that not only is such data misleading, its salience actively channelises management attention, thereby ensuring the organisation is effectively unable to see the indications of impending disaster.
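For readers who haven’t met the metric, here is a minimal sketch of how LTIFR is typically calculated. It assumes the common per-million-hours-worked convention, and the figures in it are purely hypothetical rather than drawn from any of the cases above; the point is simply that nothing in the calculation can register the rare, high-consequence hazards Reason is talking about.

```python
# Minimal illustrative sketch: LTIFR is commonly reported as lost-time
# injuries per one million hours worked (conventions vary by jurisdiction).
# All figures below are hypothetical.

def ltifr(lost_time_injuries: int, hours_worked: float) -> float:
    """Lost Time Injury Frequency Rate per 1,000,000 hours worked."""
    return lost_time_injuries * 1_000_000 / hours_worked

# A site can post an enviable LTIFR while unrepaired equipment, bypassed
# alarms and other organisational-accident precursors go entirely uncounted.
print(ltifr(lost_time_injuries=2, hours_worked=4_000_000))  # 0.5 -- looks excellent
```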

So if you see an organisation whose operations can go catastrophically wrong, but all you hear from management are proud pronouncements about how they’re reducing their lost time injury rate, then you might want to consider maintaining a safe, perhaps very safe, distance.

Reason’s A Life in Error is an excellent read by the way; I give it four omitted critical procedural steps out of five. 🙂

How do we give meaning to experience in the midst of crisis?

People strive to create a view of the world by establishing a common framework into which events can be fitted and made sense of, what Weick (1993) calls a process of sensemaking. And what is true for individuals is also true for the organisations they make up. In turn, people also use an organisation to make sense of what’s going on, especially in situations of uncertainty, ambiguity or contradiction.
