Icicles on the launch tower (Image source: NASA)

An uneasy truth about the Challenger disaster

The story of Challenger in the public imagination could be summed up as “‘heroic’ engineers versus ‘wicked’ managers”, which is a powerful myth but unfortunately just a myth. In reality? Well, the reality is more complex, and the causes of the decision to launch rest in part upon the failure of the engineers participating in the launch decision to clearly communicate the risks involved. Yes, that’s right, the engineers screwed up in the first instance.

In the ideal, technical risk communication is the presentation of complex results and conclusions in a clear and unambiguous fashion. Unfortunately, for the Challenger launch decision this was anything but the case. The meeting to discuss delaying the launch due to cold temperature was itself late to start and then chaotic in execution. Among the problems:

  • the use of slide charts to present complex data,
  • said charts being made up in ‘real time’ prior to the meeting,
  • the participants having to digest the information ‘in-meeting’,
  • little attention paid to how the engineers presented data, with many slides handwritten,
  • not one chart presented O-ring erosion and temperature together, and (perhaps most importantly),
  • not one chart referred to temperature data on all flights.

In contrast, below is a single chart presenting the flight temperatures that O-rings had been exposed to over the flight history of the shuttle fleet. Such a chart was not presented at the launch review; had it been, both NASA and Morton Thiokol management would have been confronted with how far outside the envelope they were taking the launch. Now I’m not going to say that the failure of the engineers to communicate risk to managers was the only reason that a launch decision might have been made, but in an environment where the risk has not been clearly articulated it is all too easy to let other organisational objectives take precedence; see Vaughan (1996) for a fuller treatment of these.

Having observed the engineering culture at length, I’ve concluded that engineers tend to fall into a déformation professionnelle: the view that the numbers and facts they work with are easily understood, universal in nature, and thus that any conclusions they might draw are self-evident truths. With such a world view it’s a short step to a belief that no greater effort is required in communication than simply showing other folk the numbers, or worse, just your conclusions. This is probably also why engineers tend to love PowerPoint, and why poor technical communication (you guessed it, using PowerPoint) was also a causal factor in the Columbia accident. The reality, most especially in the case of risk, is that the numbers are never just the numbers and that effective technical communication is an essential element of risk management.

O-ring distribution chart showing just how far the Challenger launch temperature lay outside the experience of previous launches

References

Rogers Commission, Report of the Presidential Commission on the Space Shuttle Challenger Accident, Chapter V: The Contributing Cause of the Accident, Washington, D.C., 1986.

Vaughan, D., The Challenger Launch Decision: Risky Technology, Culture, and Deviance at NASA, University of Chicago Press, 1996.

Risk managers are the historians of futures that may never be. 

I’ve rewritten my post on epistemic, aleatory and ontological risk pretty much completely, enjoy.

Qui enim teneat causas rerum futurarum, idem necesse est omnia teneat quae futura sint. Quod cum nem…

[Roughly, He who knows the causes will understand the future, except no-one but god possesses such faculty]

Cicero, De Divinatione, Liber primus, LVI, 127

Piece of wing found on La Réunion Island (Image source: reunion 1ere)


Why this bit of wreckage is unlikely to affect the outcome of the MH370 search

If this really is a flaperon from MH370 then it’s good news in a way, because we could use wind and current data for the Indian Ocean to determine where it might have gone into the water. That in turn could be used to update a probability map of where we think that MH370 went down, by adjusting our priors in the Bayesian search strategy, thereby ensuring that all the information we have is fruitfully integrated into our search strategy.

Well… perhaps it could, if the ATSB were actually applying a Bayesian search strategy, but apparently they’re not. So the ATSB is unlikely to get the most out of this piece of evidence, and the only real upside that I see to this is that it should shut down most of the conspiracy nut jobs who reckoned MH370 had been spirited away to North Korea or some such. :)
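For the curious, here is a minimal sketch of what ‘adjusting our priors’ with a piece of drift evidence could look like. Everything in it is an assumption made for illustration: the four search cells, the prior probabilities, the drift likelihoods and the function name bayes_update are all hypothetical, and none of it reflects the actual MH370 search data or any ATSB method.

```python
def bayes_update(prior, likelihood):
    """Return the posterior over search cells.

    prior[i]      : probability the aircraft went down in cell i
    likelihood[i] : probability of the observed evidence (debris washing up
                    where it did) given the aircraft went down in cell i
    """
    if len(prior) != len(likelihood):
        raise ValueError("prior and likelihood must cover the same cells")
    # Bayes' rule: posterior is proportional to prior times likelihood.
    unnormalised = [p * l for p, l in zip(prior, likelihood)]
    total = sum(unnormalised)
    if total == 0:
        raise ValueError("evidence is impossible under every cell")
    # Renormalise so the map still sums to one.
    return [u / total for u in unnormalised]


# Hypothetical probability map over four search cells (illustrative only).
prior = [0.4, 0.3, 0.2, 0.1]

# Hypothetical output of a drift model: how likely the debris find is
# if the aircraft entered the water in each cell (illustrative only).
drift_likelihood = [0.1, 0.2, 0.4, 0.8]

posterior = bayes_update(prior, drift_likelihood)

for cell, (before, after) in enumerate(zip(prior, posterior)):
    print(f"cell {cell}: prior {before:.2f} -> posterior {after:.2f}")
```

The point of the sketch is simply that cells the drift model favours gain probability and the others lose it, so the evidence shifts where you would search next rather than being filed away as a curiosity.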

We must contemplate some extremely unpleasant possibilities, just because we want to avoid them. 

As quoted in ‘The New Nuclear Age’. The Economist, 6 March 2015

Albert Wohlstetter

Jeep (Image source: Andy Greenberg/Wired)

A big shout out to the Chrysler-Jeep control systems design team: it turns out that flat and un-partitioned architectures are not so secure, as security experts Charlie Miller and Chris Valasek demonstrated by remotely taking over a Jeep via the internet and steering it into a ditch.

Chrysler has now patched the Sprint/UConnect vulnerability and subsequently issued a recall notice for 1.4 million vehicles, which requires owners to download a car security patch onto a USB stick and then plug it into their car to update the firmware. So a big well done, Chrysler-Jeep guys, you win this year’s Toyota Spaghetti Monster prize* for outstanding contributions to embedded systems design.
