NASA on the importance of being uncertain



Way, way back in 2011 NASA published the first volume of their planned two-volume epic on system safety, titled, strangely enough, “NASA System Safety Handbook Volume 1, System Safety Framework and Concepts for Implementation”. Catchy, eh?

I finally got around to reading it, which may give you an idea of the length of my reading list, and there are definitely aspects of the handbook that are new for safety standards. In particular the handbook requires that unknown, and therefore unquantified, hazards also be managed, which is something you just don’t find in other current safety standards like IEC 61508, which unfortunately seem to be mired in a frequentist view of uncertainty and risk.

The handbook identifies two key objectives in managing the unknown. Firstly, to incorporate appropriate historically-informed defenses against unknown hazards into the design, for example through the use of design heuristics, economy of mechanism and so on. Secondly, to minimise the introduction of potentially adverse conditions during system realization and operation, that is, to apply best practices to avoid (for example) installing space shuttle speed brake actuators reversed, as happened with the shuttle fleet.

Precursor analysis also gets a guernsey, as the standard addresses the management of safety risk during the operational phase of the system’s life (again something other standards don’t), requiring that the results be flowed back into the standing risk analysis for the system.

The standard also defines the concept of a safety risk reserve that should be applied to safety assessments to ensure that the reported system risk reflects both the risk associated with identified hazards and that associated with as yet unidentified hazards. Interestingly, the authors posit that the relative importance of such unknown hazards would decrease as operational experience accrues. For example:

  1. At initial transition to operations our uncertainty is high and the margin reflects this,
  2. As operational experience accrues further hazards are identified and the risk balance shifts towards identified hazards (the safety risk margin is consumed),
  3. Where design changes to reduce hazard likelihood are incorporated the risk transfers back the other way, i.e. known risk reduces but unknowns again increase (as does the allocated safety risk margin),
  4. As we add features to better manage system faults both known and unknown risk is decreased, and
  5. Finally as the system design stabilises we reach the system’s safety goal as the unknown risk washes out to some (albeit unknown) asymptote.
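
The known/unknown bookkeeping in the steps above can be sketched in a few lines of Python. This is only a toy illustration: the `RiskState` class, the function names and every number are assumptions made up here, not anything taken from the handbook, which does not prescribe this arithmetic.

```python
from dataclasses import dataclass


@dataclass
class RiskState:
    known: float    # risk attributed to identified hazards (illustrative units)
    unknown: float  # safety risk reserve held against unidentified hazards

    @property
    def reported(self) -> float:
        # Reported system risk covers identified and unidentified hazards.
        return self.known + self.unknown


def identify_hazards(s: RiskState, fraction: float) -> RiskState:
    """Step 2: experience moves a fraction of the reserve into known risk."""
    moved = s.unknown * fraction
    return RiskState(s.known + moved, s.unknown - moved)


def design_change(s: RiskState, known_cut: float, new_unknown: float) -> RiskState:
    """Step 3: a design change cuts known risk but adds fresh unknowns."""
    return RiskState(s.known * (1.0 - known_cut), s.unknown + new_unknown)


def improve_fault_management(s: RiskState, factor: float) -> RiskState:
    """Step 4: better fault management scales down both kinds of risk."""
    return RiskState(s.known * factor, s.unknown * factor)


if __name__ == "__main__":
    # Step 1: at transition to operations the reserve dominates (made-up values).
    state = RiskState(known=0.010, unknown=0.030)
    history = [("initial", state)]
    state = identify_hazards(state, fraction=0.5)
    history.append(("experience", state))
    state = design_change(state, known_cut=0.6, new_unknown=0.005)
    history.append(("design change", state))
    state = improve_fault_management(state, factor=0.5)
    history.append(("fault management", state))
    for label, s in history:
        print(f"{label:>16}: known={s.known:.4f} "
              f"unknown={s.unknown:.4f} reported={s.reported:.4f}")
```

Note that in this sketch identifying a hazard leaves the reported risk unchanged; it only moves risk from the reserve to the known column, which is one way to read "the safety risk margin is consumed".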

What’s encouraging about the handbook is that it gets the fundamental completeness problem of hazard identification onto the table, requires that NASA programs formally manage it and provides a set of tools to do so. In addressing epistemic and ontological risk in a meaningful fashion NASA gets four out of five belated Black Swans.


4 responses to NASA on the importance of being uncertain

    Gary D. Borba 06/06/2014 at 10:50 pm

    Thank you for your review of the NASA handbook. I have been very interested in how to deal with unknown risk for some years but, only having a B.S. in Nuclear Engineering, don’t have the academic background or “chops” to do proper research, so will be looking at the NASA handbook with interest. Have really looked for papers on how to deal with unknown risk and have not found much that has been either useful or practical.

    I have looked at risk as a dynamic combination of known, unknown and evolving (changing over time) risks. How I have tried to quantify unknown risk for, say, a project is based on the maturity of the project (mature, developing, new) and the design or technological newness/innovation (new or developed), applying a multiplier to the known risk based on those attributes. In many ways the actual “risk” number is not really that important, but keeping the unknown risk front and center in the minds of the designers, builders and operators is important.

    The NASA handbook approach sounds like a superior approach.



      Matthew Squair 07/06/2014 at 1:40 am
        Gary D. Borba 10/06/2014 at 10:07 pm

        Dr. Squair,

        Got the NASA document and have gone through the first section. My initial enthusiasm was greatly dampened when I saw the approach using safety cases and probabilistic arguments. I also saw the emphasis was on finding and demonstrating “safety” rather than hazard identification and control, which frames the approach in a manner that may miss important factors; if you look for safety you will find safety, if you look for hazards you will find hazards.

        I will keep reading through the document as I may still get valuable information, even if I strongly disagree with the approach.

        Have previously got your fundamentals course slides. Thank you! Good information. Hope your students know how fortunate they are.



        Matthew Squair 13/06/2014 at 10:56 am

        Ah yes, the modern predilection for safety cases... perhaps that’s why they only scored 4 out of 5 :). That being said, I give the NASA team chops for taking the uncertainty bull by the horns.