Archives For Uncategorized

Just added a short case study on the Patriot software timing error to the software safety micro course page. Peter Ladkin has also subjected the accident to a Why-Because Analysis.
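For the morbidly curious, the arithmetic behind the failure is easy to reconstruct. Below is a back-of-envelope Python sketch of mine (emphatically not the original assembler), showing how the truncation error from representing 0.1 in a 24-bit fixed point register accumulates over a 100 hour uptime:

```python
# Back-of-envelope reconstruction of the Patriot clock drift (my sketch,
# not the original code). Uptime was counted in 0.1 s ticks, and 0.1 has
# no exact binary representation, so the 24-bit fixed point constant used
# to convert ticks to seconds dropped a tiny amount on every tick.
FRACTION_BITS = 23                      # fractional bits of the 24-bit value
tenth_chopped = int(0.1 * 2**FRACTION_BITS) / 2**FRACTION_BITS
error_per_tick = 0.1 - tenth_chopped    # ~9.5e-8 s lost per 0.1 s tick

uptime_hours = 100                      # approximate uptime at Dhahran
ticks = uptime_hours * 3600 * 10
clock_drift = ticks * error_per_tick    # ~0.34 s after 100 hours

print(f"error per tick ≈ {error_per_tick:.2e} s")
print(f"drift at {uptime_hours} h ≈ {clock_drift:.2f} s")
# At Scud closing speeds (~1,700 m/s) a 0.34 s skew displaces the range
# gate by several hundred metres, enough to lose the incoming track.
```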

A short tutorial on the architectural principles of integrity level partitioning. I wrote this a while ago, but the fundamentals remain the same. Partitioning is a very powerful design technique, but if you apply it you also need to be aware that it can interact with all sorts of other system design attributes, scheduling and fault tolerance to name but two.

The material is drawn from many different sources which, unfortunately, at the time I didn’t reference, so all I can do is offer a general acknowledgement here. You can also find a more permanent link to the tutorial on my publications page.

Or how to avoid the secret police reading your mail

Yaay! Our glorious government of Oceania has just passed the Data Retention Act 2015 with the support of the oh so loyal opposition. The dynamic here is that both parties believe that ‘security’ is what’s called in Oceania a ‘wedge’ issue, so each strives to outdo the other in pandering to the demands of our erstwhile secret police, lest the other side gain political capital from taking a tougher position. It’s the political example of an evolutionary arms race, with each cycle of legislation becoming more and more extreme.

As a result telcos here are required to keep your metadata for three years so that the secret police can paw through the electronic equivalent of your rubbish bin any time they choose. For those who go ‘metadata huh?’, metadata is all the add-on information that goes with your communications via the interwebz, like where your email went, and where you were when you made a call to your mother at 1.33 am. So, just like your rubbish bin, it can tell the secret police an awful lot about you, especially when you knit it up with other information.  Continue Reading…
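If that still seems abstract, here’s a toy illustration of the knitting up. Every record below is invented, but the join is exactly what traffic analysis does, just at scale:

```python
# Toy traffic analysis: no message content, just call metadata joined
# with a second dataset (cell tower locations). All records are invented.
calls = [
    {"from": "you", "to": "oncology clinic", "time": "09:05", "cell": "TWR-12"},
    {"from": "you", "to": "mum",             "time": "01:33", "cell": "TWR-47"},
]
towers = {"TWR-12": "hospital district", "TWR-47": "your suburb"}

for c in calls:
    print(f'{c["time"]}: call to {c["to"]} placed near {towers[c["cell"]]}')
# Two records, no content, and someone now knows rather a lot about you.
```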

risky shift

What the?

14/02/2015

In case you’re wondering what’s going on dear reader, human factors can be a bit dry, and the occasional poster-style posts you may have noted are my attempt to hydrate the subject a little. The continuing series can be found on the page imaginatively titled Human error in pictures, and who knows, someone may find it useful…

An interesting little exposition of the current state of the practice in information risk management, using the metaphor of the bald tire, on the FAIR wiki. The authors observe that there’s much more shamanistic ritual (dressed up as ‘best practice’) than we’d like to think in risk assessment. A statement that I endorse; actually I think it’s mummery for the most part, but ehem, don’t tell the kids.

Their point is twofold. First, that while experience and intuition are vital, on their own they give little grip for critical examination. Second, that if you want to manage you must measure, and to measure you must first define.

A disclaimer: I’m neither familiar with nor a proponent of the FAIR tool, and I strongly doubt whether we can ever put risk management onto a truly scientific footing, much like engineering there’s more art than artifice, but it’s an interesting commentary nonetheless.

I give it 011 out of 101 tooled up script kiddies.

15 Minutes

11/02/2015

Matthew Squair:

What the future of high assurance may look like, DARPA’s HACMS, open source and formal from the ground up.

Originally posted on A Critical Systems Blog:

Some of the work I lead at Galois was highlighted in the initial story on 60 Minutes last night, a spot interviewing Dan Kaufman at DARPA. I’m Galois’ principal investigator for the HACMS program, focused on building more reliable software for automobiles and aircraft and other embedded systems. The piece provides a nice overview for the general public on why software security matters and what DARPA is doing about it; HACMS is one piece of that story.

I was busy getting married when filming was scheduled, but two of my colleagues (Dylan McNamee and Pat Hickey) appear in brief cameos in the segment (don’t blink!). Good work, folks! I’m proud of my team and the work we’ve accomplished so far.

You can see more details about how we have been building better programming languages for embedded systems and using them to build unpiloted air vehicle software here.


The important thing is to stop lying to yourself. A man who lies to himself, and believes his own lies, becomes unable to recognise the truth, either in himself or in anyone else.

Fyodor Dostoyevsky

The WordPress.com stats helper monkeys prepared a 2014 annual report for this blog.

Here's an excerpt:

The concert hall at the Sydney Opera House holds 2,700 people. This blog was viewed about 32,000 times in 2014. If it were a concert at Sydney Opera House, it would take about 12 sold-out performances for that many people to see it.

Click here to see the complete report.

Enigma Rotors (Image source: Harold Thimbleby)

Or getting off the password merry go round… 

I’m not sure how this happens, but there are certain months where a good proportion of my passwords rollover. Of course password rollovers are one of those entrenched security ‘good ideas’, and you’d assume they make us more secure? Well no, unfortunately they have entirely the opposite effect.

Continue Reading…

Yep that’s right, due to popular demand I’m running ZEIT 8236 System Safety as an Intensive Delivery mode course in the second session at ADFA from the 13th to 17th of July 2015. If you want a flavour, here’s the introductory module. Remember, I love this stuff. :)

A safety engineer is someone who builds castles in the air and an operator is someone who goes and lives in them. But nature is the one who collects the rent…

EECON 2014

07/11/2014

So I’ve been invited to give a talk on risk at the conference dinner. Should be interesting.

An interesting article in Forbes on human error in a very unforgiving environment, i.e. treating Ebola patients, and an excellent use of basic statistics to prove that cumulative risk tends to do just that, accumulate. As the number of patients being treated in the west is pretty low at the moment, it also gives a good indication of just how infectious Ebola is. One might also infer that the western medical establishment is not quite so smart as it thought it was, at least when it comes to treating the big E safely.
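The statistics in question are just the complement rule: if each patient contact carries a small probability p of an infection control breach, then the chance of at least one breach over n contacts is 1 − (1 − p)^n, and that number climbs faster than intuition suggests. A quick sketch, with an invented per-contact figure:

```python
# Cumulative risk: the chance of at least one infection control breach
# over n patient contacts is 1 - (1 - p)^n. The 1% per-contact figure is
# invented for illustration, not a clinical estimate.
p_breach = 0.01

for n in (1, 10, 50, 100, 250):
    p_at_least_one = 1 - (1 - p_breach) ** n
    print(f"{n:4d} contacts -> {p_at_least_one:5.1%} chance of a breach")
# At 100 contacts a 1% per-contact risk has become a ~63% cumulative risk.
```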

Of course the moment of international zen in the whole story had to be the comment by the head of the CDC, Dr Frieden, that, and I quote, “clearly there was a breach in protocol”, a perfect example of affirming the consequent. As James Reason pointed out years ago there are two ways of dealing with human error, so I guess we know where the head of the CDC stands on that question. :)

If you were wondering why the Outliers post was, ehem, a little rough: I accidentally posted an initial draft rather than the final version. I’ve now released the right one.


On Artificial Intelligence as Ethical Prosthesis

Out here in the grim meat-hook present of Reaper missions and Predator drone strikes we’re already well down track to a future in which decisions as to who lives and who dies are made less and less by human beings, and more and more by automation.

Continue Reading…

Toyota ECM (Image source: Barr testimony presentation)

Comparing and contrasting

In 2010 NASA was called in by the National Highway Traffic Safety Administration to help in figuring out the reason for reported unintended Toyota Camry accelerations. They subsequently published a report including a dedicated software annex. What’s interesting to me is the different outcomes and conclusions of the two reports regarding software.  Continue Reading…

The quote below is from the eminent British scientist Lord Kelvin, who also pronounced that x-rays were a hoax, that heavier than air flying machines would never catch on and that radio had no future…

I often say that when you can measure what you are speaking about, and express it in numbers, then you know something about it; but when you cannot measure it, when you cannot express it in numbers, your knowledge is of a meagre and unsatisfactory kind; it may be the beginning of knowledge, but you have scarcely, in your thoughts, advanced to the stage of science, whatever that may be.

Lord Kelvin, 1891

I’d turn that statement about and remark that once you have a number in your grasp, your problems have only just started. And that numbers shorn of context are a meagre and entirely unsatisfactory way of expressing our understanding of the world.

When a distinguished but elderly scientist states that something is possible, he is almost certainly right. When he states that something is impossible, he is very probably wrong.

Arthur C. Clarke,  Profiles of the Future (1962)

I often think that Arthur C. Clarke penned his famous laws in direct juxtaposition to the dogmatic statements of Lord Kelvin. It’s nice to think so anyway. :)

Just added a modified version of the venerable subjective 882 hazard risk matrix to my useful stuff page, in which I fix a few issues that have bugged me about that particular tool; see Risk and the Matrix for a fuller discussion of the problems with risk matrices.

For those of you with a strong interest in such, I’ve translated the matrix into cartesian coordinates, revised the risk zones and definitions to make the matrix ‘De Moivre theorem’ compliant (and a touch more conservative), added the AIAA’s combinatorial probability thresholds, introduced a calibration point and added the ALARP principle.
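For the curious, here’s a toy Python rendering of what ‘De Moivre theorem’ compliance buys you: put the scales on numeric (cartesian) axes and let the zones fall out of iso-risk contours, so no cell can outrank a cell with a higher severity × probability product. The scale values and thresholds below are invented for illustration; they’re not the ones in the modified matrix.

```python
# Toy 'De Moivre compliant' matrix: numeric severity and likelihood
# scales, with zones assigned by iso-risk contours (risk = severity x
# probability) rather than hand-drawn cells. All values are invented.
severities = {"Catastrophic": 1e7, "Critical": 1e6,
              "Marginal": 1e5, "Negligible": 1e4}        # loss equivalent
likelihoods = {"Frequent": 1e-1, "Probable": 1e-2, "Occasional": 1e-3,
               "Remote": 1e-5, "Improbable": 1e-6}       # events per year

def zone(risk):
    if risk >= 1e4: return "HIGH"      # intolerable
    if risk >= 1e2: return "ALARP"     # reduce so far as reasonably practicable
    return "LOW"                       # broadly acceptable

for name, s in severities.items():
    cells = " ".join(f"{zone(s * p):>5}" for p in likelihoods.values())
    print(f"{name:12s} {cells}")
```

Because the zone is a monotone function of the product, the consistency of the matrix is guaranteed by construction rather than by eyeballing the cells.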

Who knows maybe the US DoD will pick it up…but probably not. :)

MIL-STD-882 Hazard Risk Matrix (Modified).

 

I’ve put the original Def Stan 00-55 (both parts) onto my resources page for those who are interested in doing a compare and contrast between the old and the new (whenever its RFC is released). I’ll be interested to see whether the standard’s reluctance to buy into the whole ‘safety by following a process’ argument is maintained in the next iteration. The problem of arguing from fault density to safety that its authors allude to also remains, I believe, insurmountable.

The justification of how the SRS development process is expected to deliver SRS of the required safety integrity level, mainly on the basis of the performance of the process on previous projects, is covered in 7.4 and annex E. However, in general the process used is a very weak predictor of the safety integrity level attained in a particular case, because of the variability from project to project. Instrumentation of the process to obtain repeatable data is difficult and enormously expensive, and capturing the important human factors aspects is still an active research area. Furthermore, even very high quality processes only predict the fault density of the software, and the problem of predicting safety integrity from fault density is insurmountable at the time of writing (unless it is possible to argue for zero faults).

Def Stan 00-55 Issue 2 Part 2 Cl. 7.3.1

Just as an aside, the original release of Def Stan 00-56 is also worth a look as it contains the methodology for the assignment of safety integrity levels. Basically, for a single function, or N>1 non-independent functions, the SIL assigned to the function(s) is derived from the worst credible accident severity (much like DO-178). In the case of N>1 independent functions, one of these functions gets a SIL based on severity, but the remainder have a SIL apportioned to them based on risk criteria. From which you can infer that the authors, just like the aviation community, were rather mistrustful of using estimates of probability in assuring a first line of defence. :)
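Rendered as pseudo-code, the apportionment rule reads something like the sketch below. The severity-to-SIL mapping is a placeholder of mine rather than the standard’s table; it’s the shape of the rule that matters.

```python
# Sketch of the 00-56 SIL apportionment rule as I read it above; the
# severity-to-SIL mapping is an invented placeholder, not the standard's.
SEVERITY_TO_SIL = {"catastrophic": 4, "critical": 3,
                   "marginal": 2, "negligible": 1}

def assign_sils(n_functions, severity, independent, risk_based_sil=2):
    """First line of defence carries the severity-derived SIL; only with
    demonstrable independence may the rest carry a risk-derived SIL."""
    primary = SEVERITY_TO_SIL[severity]
    if not independent:
        return [primary] * n_functions      # no credit for redundancy
    return [primary] + [risk_based_sil] * (n_functions - 1)

print(assign_sils(2, "catastrophic", independent=False))  # [4, 4]
print(assign_sils(2, "catastrophic", independent=True))   # [4, 2]
```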

When Formal Systems Kill, an interesting paper by Lee Pike and Darren Abramson looking at the automatic formal system property of computers from an ethical perspective. Of course as we all know, the 9000 series has a perfect operational record…

Preamble

The following is a critique of a teleconference conducted on 16 March between the UK embassy in Japan and the UK Government’s senior scientific advisor and members of SAGE, a UK government crisis panel formed in the aftermath of the Japanese tsunami to advise on the Fukushima crisis. These comments pertain specifically to the 16 March (UK time) teleconference with the British embassy and the minutes of the SAGE meetings on the 15th and 16th that preceded that teleconference. Continue Reading…

I’ve just reread Peter Ladkin’s 2008 dissection of the conceptual problems of IEC 61508 here, and having just worked through a recent project in which 61508 SILs were applied, I tend to agree that SIL is still a bad idea, done badly… I’d also add that, the HSE’s opinion notwithstanding, I don’t actually see that the a priori application of a risk derived SIL to a specific software development acquits one’s ‘so far as is reasonably practicable’ duty of care. Of course if your regulator says it does, why then smile bravely and compliment him on the wonderful cut of his new clothes. On the other hand, if you’re designing the safety system for a nuclear plant, maybe have a look at how the aviation industry does business with its Design Assurance Levels. :)

 

Cognitive biases potentially affecting judgment of global risks

The WordPress.com stats helper monkeys prepared a 2013 annual report for this blog.

Here’s an excerpt:

The concert hall at the Sydney Opera House holds 2,700 people. This blog was viewed about 24,000 times in 2013. If it were a concert at Sydney Opera House, it would take about 9 sold-out performances for that many people to see it.

Click here to see the complete report.

And I’ve just updated the philosophical principles for acquiring safety critical systems. All suggestions welcome…

Enjoy :)

Colossus: The Forbin Project (Image source: movie still)

Risk as uncontrollability…

The venerable safety standard MIL-STD-882 introduced the concept of software hazard and risk in Revision C of that standard. Rather than using the classical definition of risk as a combination of severity and likelihood, the authors struck off down quite a different, and interesting, path.
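To give a flavour of where that path leads before you click through: Rev C indexes software risk by severity and by how much control the software exercises over the hazard, with no likelihood term at all. The sketch below is my simplified paraphrase of that scheme, not the standard’s actual table.

```python
# Sketch of the Rev C idea: index software 'risk' by accident severity
# and by the software's degree of control over the hazard, replacing the
# likelihood term. The table is a simplified paraphrase for illustration.
CONTROL = {"autonomous": 1, "semi-autonomous": 2,
           "advisory": 3, "no influence": 4}
SEVERITY = {"catastrophic": 1, "critical": 2,
            "marginal": 3, "negligible": 4}

def software_hazard_risk_index(severity, control):
    # Lower index = more development rigour demanded (illustrative only).
    return SEVERITY[severity] + CONTROL[control] - 1

print(software_hazard_risk_index("catastrophic", "autonomous"))  # 1: most rigour
print(software_hazard_risk_index("marginal", "advisory"))        # 5: modest rigour
```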

Continue Reading…

In an earlier post I had a look at the role played by design authorities in an organisation, which can have a major effect upon both safety and project success. My focus in that post was on the authority aspect.

However another perspective on the role that a design authority performs is that of someone who is able to understand both the operational requirements for a system (e.g. those that define a need) as well as the technical (those that define a solution) and most importantly be able to translate between them.

This is a role that is well understood in architecture, but one that seems to have dwindled in engineering, where projects of any complexity are more often undertaken by large bureaucratic organisations, which also traditionally fear assigning responsibility to one person.

Provided as part of the QR show bag for the CORE 2012 conference. The irony of a detachable cab being completely unintentional…


For somebody. :)

787 Lithium Battery (Image Source: JTSB)

But, we tested it? Didn’t we?

Earlier reports on the initial development of the Boeing 787 lithium battery indicated that Boeing engineers had conducted tests to confirm that a single cell failure would not lead to a cascading thermal runaway amongst the remaining cells. According to these reports their tests were successful, so what went wrong?

Continue Reading…

Over on the RVS Bielefeld site Peter Ladkin has just put up a white paper entitled 61508 Weaknesses and Anomalies which looks at the problems with the current version of the IEC 61508 functional safety standard, part 6 of which sits on my desk even as we speak. Comments are welcome.

For my own contributions to the commentary on IEC 61508 see Buncefield the alternate view, Component SIL rating memes and SILs and Safety Myths.

Dr Nancy Leveson will be teaching a week-long class on system safety this year at the Talaris Conference Center in Seattle from July 15-19.

Her focus will be on the new techniques and approaches described in her latest book Engineering a Safer World. Should be excellent.

See the class announcement for further details.

The WordPress.com stats helper monkeys prepared a 2012 annual report for this blog.

Here’s an excerpt:

4,329 films were submitted to the 2012 Cannes Film Festival. This blog had 29,000 views in 2012. If each view were a film, this blog would power 7 Film Festivals.

Click here to see the complete report.

A Working Definition and a Key Question…

One of the truisms of systems safety theory is that safety is an emergent attribute of the design. But looking at this in reverse it also implies that system accidents are emergent, if unintended, attributes of a system. So when we talk about system accidents and hazards we’re really (if we accept the base definition) talking about an emergent attribute of a system.

But what are emergent attributes? Now that is a very interesting philosophical question, and answering it might help clarify what exactly is a system accident…

Just put the final conference-ready draft of my Writing Specs for Fun and Profit up, enjoy!

As readers may have noted, I’ve changed the name of my blog to Critical Uncertainties, which I think better represents what it’s all about; well, at least that’s my intent. :-)

I’m still debating whether to go the whole hog and register a domain name, it has its advantages and disadvantages. But if I do, don’t worry, my host WordPress.com will make sure that you don’t get lost on the way here.

P.S. I did think of a few alternate names. Some better than others…


Neil Armstrong died yesterday at the age of 82, and rather than celebrating his achievements as an astronaut, marvelous though they are, I’d like to pay tribute here to his work as an engineer and test pilot.

Before Apollo, Neil Armstrong was a test pilot with NACA (later NASA), flying the X-15 rocket plane, and during his test piloting he came up with what they ended up calling the Armstrong spiral. The manoeuvre was a descending glide spiral that tightened the turn radius as the glide speed reduced. Armstrong’s manoeuvre was so widely regarded that it was later adopted by the Space Shuttle program.
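One way to see why the spiral tightens: in a coordinated turn at bank angle φ the turn radius is r = v²/(g tan φ), so at a constant bank the radius falls with the square of the decaying glide speed. A quick illustration with round numbers:

```python
# Coordinated-turn radius: r = v^2 / (g * tan(bank)). At a constant bank
# angle the radius shrinks with the square of airspeed, which is why a
# decelerating glide naturally spirals inwards. Speeds are round numbers
# chosen for illustration.
from math import radians, tan

g, bank = 9.81, radians(30)
for v in (250, 200, 150, 100):          # glide speed, m/s
    r = v**2 / (g * tan(bank))
    print(f"v = {v:3d} m/s -> turn radius ≈ {r/1000:4.1f} km")
```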

Fast forward to 4 November 2010, and Richard De Crespigny, the captain of QF32, after experiencing a catastrophic engine failure and faced with the potential of a glide back to Changi, remembers and uses the Armstrong approach in his plan for an engine-out approach.

So, to misquote Shakespeare, sometimes the good that men do is not interred with them.

How driver training problems for the M113 Armoured Personnel Carrier provide an insight into the ecology of interface design.

Continue Reading...
Tweedle Dum and Dee (Image source: Wikipedia Commons)
How do ya do and shake hands, shake hands, shake hands. How do ya do and shake hands and state your name and business…

Lewis Carroll, Through the Looking Glass

You would have thought after the Knight and Leveson experiments that the theory that independently written software would only contain independent faults was dead and buried, another beautiful theory shot down by cold hard fact. But unfortunately, like many great errors, the theory of N-versioning keeps on keeping on (1).
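The seductive arithmetic is easy to state: if each of N versions fails on a demand independently with probability p, then all N fail together with probability p^N. What Knight and Leveson found is that the hard parts of a problem trip every team up, so coincident failures occur far more often than p^N predicts. A toy simulation of the gap, with invented numbers:

```python
# Toy simulation: independent vs correlated failures in 3-version voting.
# The fault probabilities are invented; the size of the gap is the point.
import random

random.seed(1)
p_individual, p_common = 0.01, 0.002    # per-demand failure probabilities
trials = 1_000_000

naive = p_individual ** 3               # the independence assumption: 1e-6

coincident = 0
for _ in range(trials):
    common = random.random() < p_common            # a fault all versions share
    fails = [common or random.random() < p_individual for _ in range(3)]
    if all(fails):
        coincident += 1

print(f"assumed (independent):  {naive:.1e}")
print(f"simulated (correlated): {coincident / trials:.1e}")  # ~2e-3
```

Even a small shared-fault probability dominates the result, leaving the voting scheme three orders of magnitude short of its advertised reliability.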
Continue Reading…