Archives For Uncategorized

The quote below is from the eminent British scientist Lord Kelvin, who also pronounced that X-rays were a hoax, that heavier-than-air flying machines would never catch on, and that radio had no future…

I often say that when you can measure what you are speaking about, and express it in numbers, then you know something about it; but when you cannot measure it, when you cannot express it in numbers, your knowledge is of a meagre and unsatisfactory kind; it may be the beginning of knowledge, but you have scarcely, in your thoughts, advanced to the stage of science, whatever the matter may be.

Lord Kelvin, 1891

I’d turn that statement about and remark that once you have a number in your grasp, your problems have only just started. And that numbers shorn of context are a meagre and entirely unsatisfactory way of expressing our understanding of the world.

When a distinguished but elderly scientist states that something is possible, he is almost certainly right. When he states that something is impossible, he is very probably wrong.

Arthur C. Clarke, Profiles of the Future (1962)

I often think that Arthur C. Clarke penned his famous laws in direct juxtaposition to the dogmatic statements of Lord Kelvin. It’s nice to think so anyway. :)

 

Just added a modified version of the venerable (and subjective) MIL-STD-882 hazard risk matrix to my useful stuff page, in which I fix a few issues that have bugged me about that particular tool; see Risk and the Matrix for a fuller discussion of the problems with risk matrices.

For those of you with a strong interest in such things, I’ve translated the matrix into cartesian coordinates, revised the risk zones and definitions to make the matrix ‘De Moivre theorem’ compliant (and a touch more conservative), added the AIAA’s combinatorial probability thresholds, introduced a calibration point and added the ALARP principle.
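To give a feel for what ‘De Moivre theorem’ compliance means in practice: De Moivre defined risk as the expectation of loss, i.e. probability multiplied by cost, so in cartesian coordinates the zone boundaries become iso-risk contours. The C sketch below illustrates the idea only; the severity losses and zone thresholds are numbers I’ve made up for illustration, not the values used in the modified matrix.

```c
/*
 * Illustrative sketch only: classifying a (severity, probability) point
 * into risk zones whose boundaries follow De Moivre's definition of
 * risk as expected loss (probability x cost). All figures invented.
 */
#include <stdio.h>

typedef enum { BROADLY_ACCEPTABLE, ALARP, INTOLERABLE } risk_zone_t;

/* Hypothetical per-event losses for the four 882 severity categories */
static const double loss[] = {
    1.0e7, /* Catastrophic */
    1.0e6, /* Critical     */
    1.0e5, /* Marginal     */
    1.0e4  /* Negligible   */
};

static risk_zone_t classify(int severity, double prob_per_year)
{
    double expected_loss = prob_per_year * loss[severity];

    /* Zone boundaries are iso-risk (constant p x c) contours, which is
     * what makes the zoning 'De Moivre compliant': a modest frequent
     * loss and a severe rare one with the same expected loss fall in
     * the same zone. Thresholds below are purely illustrative. */
    if (expected_loss > 1.0e3) return INTOLERABLE;
    if (expected_loss > 1.0e1) return ALARP; /* reduce so far as reasonably practicable */
    return BROADLY_ACCEPTABLE;
}

int main(void)
{
    /* A marginal hazard at 1e-3/yr and a catastrophic one at 1e-5/yr:
     * same expected loss, hence the same zone (both print 1, ALARP). */
    printf("%d %d\n", classify(2, 1.0e-3), classify(0, 1.0e-5));
    return 0;
}
```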

Who knows, maybe the US DoD will pick it up… but probably not. :)

MIL-STD-882 Hazard Risk Matrix (Modified).

 

I’ve put the original Def Stan 00-55 (both parts) onto my resources page for those who are interested in doing a compare and contrast between the old and the new (whenever its RFC is released). I’ll be interested to see whether the standard’s reluctance to buy into the whole ‘safety by following a process’ argument is maintained in the next iteration. The problem of arguing from fault density to safety that they allude to also remains, I believe, insurmountable.

The justification of how the SRS development process is expected to deliver SRS of the required safety integrity level, mainly on the basis of the performance of the process on previous projects, is covered in 7.4 and annex E. However, in general the process used is a very weak predictor of the safety integrity level attained in a particular case, because of the variability from project to project. Instrumentation of the process to obtain repeatable data is difficult and enormously expensive, and capturing the important human factors aspects is still an active research area. Furthermore, even very high quality processes only predict the fault density of the software, and the problem of predicting safety integrity from fault density is insurmountable at the time of writing (unless it is possible to argue for zero faults).

Def Stan 00-55 Issue 2 Part 2 Cl. 7.3.1

Just as an aside, the original release of Def Stan 00-56 is also worth a look, as it contains the methodology for the assignment of safety integrity levels. Basically, for a single function, or for N > 1 non-independent functions, the SIL assigned to the function(s) is derived from the worst credible accident severity (much like DO-178). In the case of N > 1 independent functions, one of these functions gets a SIL based on severity, but the remainder have a SIL apportioned to them based on risk criteria, as sketched below. From which you can infer that the authors, just like the aviation community, were rather mistrustful of using estimates of probability in assuring a first line of defence. :)
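Here is a minimal C sketch of the apportionment rule as I read it; the type names, the severity-to-SIL table and the risk-based step are placeholders of my own invention, not figures from the standard.

```c
/* Sketch of a Def Stan 00-56 (Issue 1) style SIL apportionment rule,
 * as described above. All names and mappings are illustrative. */
#include <stdbool.h>
#include <stdio.h>

typedef enum { SIL_1 = 1, SIL_2, SIL_3, SIL_4 } sil_t;
typedef enum { NEGLIGIBLE, MARGINAL, CRITICAL, CATASTROPHIC } severity_t;

/* Illustrative severity-to-SIL mapping (not the standard's table). */
static sil_t sil_from_severity(severity_t s)
{
    switch (s) {
    case CATASTROPHIC: return SIL_4;
    case CRITICAL:     return SIL_3;
    case MARGINAL:     return SIL_2;
    default:           return SIL_1;
    }
}

/* Placeholder for risk-based apportionment; here simply one level
 * below the severity-derived SIL, purely for illustration. */
static sil_t sil_from_risk(severity_t s)
{
    sil_t base = sil_from_severity(s);
    return base > SIL_1 ? (sil_t)(base - 1) : SIL_1;
}

static void apportion_sils(sil_t out[], int n, bool independent, severity_t worst)
{
    if (n == 1 || !independent) {
        /* One function, or N > 1 non-independent functions: all take
         * the SIL derived from worst credible accident severity. */
        for (int i = 0; i < n; i++)
            out[i] = sil_from_severity(worst);
    } else {
        /* N > 1 independent functions: one carries the severity-derived
         * SIL, the remainder are apportioned against risk criteria. */
        out[0] = sil_from_severity(worst);
        for (int i = 1; i < n; i++)
            out[i] = sil_from_risk(worst);
    }
}

int main(void)
{
    sil_t sils[3];
    apportion_sils(sils, 3, true, CATASTROPHIC);
    printf("%d %d %d\n", sils[0], sils[1], sils[2]); /* prints: 4 3 3 */
    return 0;
}
```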

Preamble

The following is a critique of a teleconference conducted on 16 March between the UK embassy in Japan and the UK Government’s senior scientific advisor and members of SAGE, a UK government crisis panel convened in the aftermath of the Japanese tsunami to advise on the Fukushima crisis. These comments pertain specifically to the 16 March (UK time) teleconference with the British embassy and the minutes of the SAGE meetings on the 15th and 16th that preceded that teleconference. Continue Reading…

I’ve just reread Peter Ladkin’s 2008 dissection of the conceptual problems of IEC 61508 here, and having just worked through a recent project in which 61508 SILs were applied, I tend to agree that SIL is still a bad idea, done badly… I’d also add that, the HSE’s opinion notwithstanding, I don’t actually see that the a priori application of a risk-derived SIL to a specific software development acquits one’s ‘so far as is reasonably practicable’ duty of care. Of course if your regulator says it does, why then smile bravely and compliment him on the wonderful cut of his new clothes. On the other hand, if you’re designing the safety system for a nuclear plant, maybe have a look at how the aviation industry does business with their Design Assurance Levels. :)

 

Cognitive biases potentially affecting judgment of global risks

iOS 7 (Image source: Apple)

What iOS 7’s SSL/TLS security patch release tells us

While the commentators, pundits and software gurus pontificate over the root cause of Apple’s SSL/TLS goto fail bug, the bug does provide an interesting perspective on Least Common Mechanism, one of the least understood of Saltzer and Schroeder’s security principles. For those interested in the detail of what actually went wrong with ‘SSLProcessServerKeyExchange()’, click over to the Sophos post on the subject.
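For context, the heart of the bug (abridged here from Apple’s published Secure Transport source) was a duplicated goto that unconditionally jumped past the final verification step with err still set to zero:

```c
/* Abridged from Apple's published Secure Transport source. The
 * surrounding function hashes the handshake parameters and then
 * verifies the server's signature over them. */
if ((err = SSLHashSHA1.update(&hashCtx, &serverRandom)) != 0)
    goto fail;
if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0)
    goto fail;
    goto fail;  /* the duplicated line: always taken, with err == 0 */
if ((err = SSLHashSHA1.final(&hashCtx, &hashOut)) != 0)
    goto fail;

err = sslRawVerify(ctx, ctx->peerPubKey, dataToSign, dataToSignLen,
                   signature, signatureLen);
/* ... */
fail:
    SSLFreeBuffer(&signedHashes);
    SSLFreeBuffer(&hashCtx);
    return err;  /* control arrives here with err == 0, so a bogus
                  * signature 'verifies' and the handshake proceeds */
```

One duplicated line in a verification mechanism shared by every TLS client on the platform was enough to disable server signature checking for all of them at once.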

Continue Reading…

The WordPress.com stats helper monkeys prepared a 2013 annual report for this blog.

Here’s an excerpt:

The concert hall at the Sydney Opera House holds 2,700 people. This blog was viewed about 24,000 times in 2013. If it were a concert at Sydney Opera House, it would take about 9 sold-out performances for that many people to see it.

Click here to see the complete report.

And I’ve just updated the philosophical principles for acquiring safety critical systems. All suggestions welcome…

Enjoy :)

Colossus: The Forbin Project (Image source: Movie still)

Risk as uncontrollability…

The venerable safety standard MIL-STD-882 introduced the concept of software hazard and risk in Revision C of that standard. Rather than using the classical definition of risk as a combination of severity and likelihood, the authors struck off down quite a different, and interesting, path.
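Without giving the whole game away, 882C pairs hazard severity with the degree of control the software exercises over the hazard, rather than with a likelihood. A minimal C sketch of that style of matrix follows; the category descriptions are paraphrased from memory and the index values are illustrative, not the standard’s actual table.

```c
/* Sketch of an 882C-style software hazard criticality matrix: severity
 * is paired with a software control category rather than likelihood.
 * Category wording and index values are illustrative only. */
#include <stdio.h>

typedef enum { CATASTROPHIC, CRITICAL, MARGINAL, NEGLIGIBLE } severity_t;

typedef enum {
    AUTONOMOUS,      /* I:   software alone controls the hazard        */
    SEMI_AUTONOMOUS, /* II:  software controls, operator can intervene */
    INFLUENCE,       /* III: software informs operator decisions       */
    NO_CONTROL       /* IV:  software has no bearing on the hazard     */
} control_t;

/* Lower index = more development rigour demanded (illustrative values). */
static const int shri[4][4] = {
    /*            CAT  CRIT  MARG  NEG */
    /* I   */   {  1,    1,    3,   5 },
    /* II  */   {  1,    2,    4,   5 },
    /* III */   {  2,    3,    5,   5 },
    /* IV  */   {  3,    4,    5,   5 },
};

static int software_hazard_risk_index(control_t c, severity_t s)
{
    return shri[c][s];
}

int main(void)
{
    /* Autonomous control of a catastrophic hazard: most demanding index. */
    printf("%d\n", software_hazard_risk_index(AUTONOMOUS, CATASTROPHIC)); /* 1 */
    return 0;
}
```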

Continue Reading…

In an earlier post I had a look at the role played by design authorities in an organisation, which can have a major effect upon both safety and project success. My focus in that post was on the authority aspect.

However, another perspective on the role that a design authority performs is that of someone who is able to understand both the operational requirements for a system (i.e. those that define a need) and the technical requirements (those that define a solution) and, most importantly, is able to translate between them.

This is a role that is well understood in architecture, but one that seems to have diminished and dwindled in engineering, where projects of any complexity are more often undertaken by large bureaucratic organisations, which also traditionally fear assigning responsibility to a single person.

Provided as part of the QR show bag for the CORE 2012 conference. The irony of a detachable cab being completely unintentional…


For somebody. :)

787 Lithium Battery (Image Source: JTSB)

But, we tested it? Didn’t we?

Earlier reports on the initial development of the Boeing 787’s lithium battery indicated that Boeing engineers had conducted tests to confirm that a single cell failure would not lead to a cascading thermal runaway amongst the remaining cells. According to these reports their tests were successful, so what went wrong?

Continue Reading…

Over on the RVS Bielefeld site Peter Ladkin has just put up a white paper entitled 61508 Weaknesses and Anomalies, which looks at the problems with the current version of the IEC 61508 functional safety standard, part 6 of which sits on my desk even as we speak. Comments are welcome.

For my own contributions to the commentary on IEC 61508 see Buncefield the alternate view , Component SIL rating memes and SILs and Safety Myths.

Dr Nancy Leveson will be teaching a week-long class on system safety this year at the Talaris Conference Center in Seattle from July 15-19.

Her focus will be on the new techniques and approaches described in her latest book Engineering a Safer World. Should be excellent.

See the class announcement for further details.

The WordPress.com stats helper monkeys prepared a 2012 annual report for this blog.

Here’s an excerpt:

4,329 films were submitted to the 2012 Cannes Film Festival. This blog had 29,000 views in 2012. If each view were a film, this blog would power 7 Film Festivals.

Click here to see the complete report.

A Working Definition and a Key Question…

One of the truisms of systems safety theory is that safety is an emergent attribute of the design. But looking at this in reverse it also implies that system accidents are emergent, if unintended, attributes of a system. So when we talk about system accidents and hazards we’re really (if we accept the base definition) talking about an emergent attribute of a system.

But what are emergent attributes? Now that is a very interesting philosophical question, and answering it might help clarify what exactly is a system accident…

Just put the final conference-ready draft of my Writing Specs for Fun and Profit up, enjoy!

As readers may have noted, I’ve changed the name of my blog to Critical Uncertainties, which I think better represents what it’s all about; well, at least that’s my intent. :-)

I’m still debating whether to go the whole hog and register a domain name; it has its advantages and disadvantages. But if I do, don’t worry, my host WordPress.com will make sure that you don’t get lost on the way here.

P.S. I did think of a few alternate names, some better than others…


Neil Armstrong died yesterday at the age of 82, and rather than celebrating his achievements as an astronaut, marvelous though they are, I’d like to pay tribute here to his work as an engineer and test pilot.

Before Apollo, Neil Armstrong was a test pilot with NACA (later NASA), flying the X-15 rocket plane, and during his test flying he came up with what came to be called the Armstrong spiral. The manoeuvre was a descending glide spiral that tightened the turn radius as the glide speed reduced. Armstrong’s manoeuvre was so widely regarded that it was later adopted by the Space Shuttle program.

Fast forward to 4 November 2010, and Richard de Crespigny, the captain of QF32, having experienced a catastrophic engine failure and faced with the potential of a glide back to Changi, remembers the Armstrong approach and uses it in his plan for an engine-out approach.

So to misquote Shakespeare, sometimes the good that men do is not interred with them.

A330 Right hand (1 & 3) AoA probes (Image source: ATSB)

In an earlier post I commented that in the QF72 incident the use of a geometric mean (1) instead of the arithmetic mean when calculating the aircraft’s angle of attack would have reduced the severity of the subsequent pitch over.

Which leads into the more general subject of what to do when the real world departs from our assumptions about the statistical ‘well-formedness’ of data.
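As a taster, here’s a toy C comparison using made-up probe values, two reading a plausible 2.1° and one spiking to 50°; the geometric mean is far less sensitive to the single wild reading, though note it assumes strictly positive data, which raw angle of attack values are not guaranteed to be.

```c
/* Toy comparison with invented AoA probe values: two probes reading a
 * plausible 2.1 deg and one spiked to 50 deg (loosely in the spirit of
 * the QF72 failure mode). The geometric mean damps the outlier. */
#include <math.h>
#include <stdio.h>

static double arithmetic_mean(const double *x, int n)
{
    double s = 0.0;
    for (int i = 0; i < n; i++) s += x[i];
    return s / n;
}

static double geometric_mean(const double *x, int n)
{
    /* Computed via logs to avoid overflow; valid for positive inputs only. */
    double s = 0.0;
    for (int i = 0; i < n; i++) s += log(x[i]);
    return exp(s / n);
}

int main(void)
{
    double aoa[] = { 2.1, 2.1, 50.0 };   /* degrees, third probe spiking */
    printf("arithmetic: %.2f deg\n", arithmetic_mean(aoa, 3)); /* ~18.07 */
    printf("geometric:  %.2f deg\n", geometric_mean(aoa, 3));  /*  ~6.04 */
    return 0;
}
```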

Continue Reading…