Archives For Safety cases

The 15 commandments

04/09/2016


The 15 commandments of the god of the machine

Herewith are the 15 commandments for thine safety critical software, as spoken by the machine god unto his prophet Kopetz.

  1. Thou shalt regard the system safety case as thy tabernacle of safety and derive thine critical software failure modes and requirements from it.
  2. Thou shalt adopt a fundamentally safe architecture and define thy fault tolerance hypothesis as part of this. Even unto the definition of fault containment regions, their modes of failure and likelihood.
  3. Thine fault tolerance shall include start-up, operating and shutdown states.
  4. Thine system shall be partitioned to ‘divide and conquer’ the design. Yea, such partitioning shall include the precise specification of component interfaces by time and value, such that all manner of men shall comprehend them.
  5. Thine project team shall develop a consistent model of time and state, for even unto the concepts of state and fault recovery by voting is the definition of time important.
  6. Yea even though thou hast selected a safety architecture pleasing to the lord, yet it is but a house built upon the sand, if no ‘programming in the small’ error detection and fault recovery is provided.
  7. Thou shalt ensure that errors are contained and do not propagate through the system, for an error idly propagated to a service interface is displeasing to the lord god of safety and invalidates your righteous claims of independence.
  8. Thou shalt ensure that independent channels and components do not have common mode failures, for it is said that homogeneous redundant channels protect only from random hardware failures, neither from the common external cause such as EMI or power loss, nor from the common software design fault.
  9. Thine voting software shall follow the self-confidence principle, for it is said that if the self-confidence principle is observed then a correct FCR will always make the correct decision under the assumption of a single faulty FCR, and only a faulty FCR will make false decisions (see the first sketch after this list).
  10. Thou shalt hide and separate thy fault-tolerance mechanisms so that they do not introduce fear, doubt and further design errors unto the developers of the application code.
  11. Thou shalt design thy system for diagnosis, for it is said that even a righteously designed fault tolerant system may hide such faults from view, yet thy system’s maintainers must still find and replace the affected LRU.
  12. Thine interfaces shall be helpful and forgive the operator his errors; neither shall thine system dump the problem in the operator’s lap without prior warning of impending doom.
  13. Thine software shall record every single anomaly, for your lord god requires that every anomaly observed during operation be investigated until a root cause is defined.
  14. Thou shalt mitigate further hazards introduced by your design decisions, for better it is that you not program in C++, yet still is it righteous to prevent the dangling of thine pointers and memory leaks (see the second sketch after this list).
  15. Thou shalt develop a consistent fault recovery strategy, such that even in the face of violations of your fault hypothesis thine system shall restart and never give up.
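As a rough sketch of commandment 9 (the names and values are mine for illustration, not Kopetz's code): each fault containment region (FCR) computes its own result, exchanges it with its two peers, and under the self-confidence principle keeps its own value unless both peers agree on a different one. Assuming correct FCRs compute identical values (replica determinism) and at most one FCR is faulty, a correct FCR therefore always decides correctly.

```cpp
// Minimal triplex-voting sketch following the self-confidence principle.
// All names are illustrative; this is not production fault-tolerance code.
#include <cstdint>
#include <iostream>

// Decide the value one FCR should act on, given its own locally computed
// value and the values received from its two peer FCRs.
std::int32_t selfConfidentVote(std::int32_t own,
                               std::int32_t peerA,
                               std::int32_t peerB) {
    // Self-confidence: trust our own value while at least one peer agrees.
    if (own == peerA || own == peerB) {
        return own;
    }
    // Both peers agree with each other but not with us: assume we are the
    // single faulty FCR and adopt the majority value.
    if (peerA == peerB) {
        return peerA;
    }
    // All three disagree: the single-fault hypothesis has been violated,
    // so fall back on our own value (and, per commandment 15, recover).
    return own;
}

int main() {
    std::cout << selfConfidentVote(42, 42, 99) << '\n'; // correct FCR keeps 42
    std::cout << selfConfidentVote(99, 42, 42) << '\n'; // faulty FCR is outvoted: 42
    return 0;
}
```

And for commandment 14, a minimal sketch of the kind of mitigation intended if you do end up writing C++ (illustrative types, not a prescribed pattern): RAII and smart pointers make leaks and dangling pointers a matter of design rather than of reviewer vigilance.

```cpp
#include <memory>
#include <vector>

// Illustrative resource owned through RAII rather than raw new/delete.
struct SensorChannel {
    explicit SensorChannel(int id) : id(id) {}
    int id;
};

int main() {
    // unique_ptr gives single, explicit ownership: the channels are freed
    // automatically when the vector goes out of scope (no leak), and there
    // is no raw pointer left behind to dangle after deletion.
    std::vector<std::unique_ptr<SensorChannel>> channels;
    channels.push_back(std::make_unique<SensorChannel>(1));
    channels.push_back(std::make_unique<SensorChannel>(2));
    return 0;
}
```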

System Safety Fundamentals Concept Cloud

Have just updated the safety case module for the system safety course I teach at UNSW. Have revised it to include John Rushby’s approach to determining the soundness and strength of a safety argument (I like his simplification and separation of concerns strategy). Enjoy!


Why writing a safety case might (actually) be a good idea

Frequent readers of my blog would probably realise that I’m a little sceptical of safety cases, as Scrooge remarked to Marley’s ghost, “There’s more of gravy than of grave about you, whatever you are!” So too for safety cases, oft more gravy than gravitas about them in my opinion, regardless of what their proponents might think.

Continue Reading…


Safety cases and that room full of monkeys

Back in 1943, the French mathematician Émile Borel published a book titled Les probabilités et la vie, in which he stated what has come to be called Borel’s law, which can be paraphrased as, “Events with a sufficiently small probability never occur.” Continue Reading…
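As a rough, back-of-the-envelope illustration of the scale in question (the keyboard size and passage length are my own illustrative figures, not Borel’s): the probability of a monkey striking keys at random and reproducing a given 100-character passage on a 30-key keyboard at the first attempt is

```latex
P = \left(\frac{1}{30}\right)^{100} = 30^{-100} \approx 10^{-148}
```

which is, by the paraphrase above, an event that should ‘never occur’, however long the monkeys keep at it.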

I was cleaning out my (metaphorical) sock drawer and came across this rough guide to the workings of the Australian Defence standard on software safety DEF(AUST) 5679. The guide was written around 2006 for Issue 1 of the standard, although many of the issues it discussed persisted into Issue 2, which hit the streets in 2008.

DEF (AUST) 5679 is an interesting standard; one can see that the authors, Tony Cant amongst them, put a lot of thought into the methodology behind it. Unfortunately it’s suffered from a failure to achieve large-scale adoption and usage.

So here are my thoughts at the time on how to actually use the standard to best advantage; I also threw in some concepts on how to deal with xOTS components within the DEF (AUST) 5679 framework.

Enjoy 🙂


A report issued by the US Chemical Safety Board on Monday, entitled “Regulatory Report: Chevron Richmond Refinery Pipe Rupture and Fire”, calls on California to make changes to the way it manages process safety.

The report is worth a read as it looks at various regulatory regimes in a fairly balanced fashion. A strong, independent and competent regulator is seen by the report’s authors as a key factor for success, regardless of the regulatory mechanisms. I don’t, however, think the evidence is as strong as the report makes out that safety case/goal-based safety regimes perform ‘all that much better’ than other regulatory regimes. Would have also been nice if they’d compared and contrasted against other industries, like aviation.

Midlands hotel

A quick report from sunny Manchester, where I’m attending the IET’s annual combined conference on system safety and cyber security. Day one of the conference proper, and I got to lead off with the first keynote. I was thinking about getting everyone to do some Tai Chi to limber up (maybe next year). Thanks once again to Dr Carl Sandom for inviting me over; it was a pleasure. I just hope the audience felt the same way. 🙂

Continue Reading…

Current practices in formal safety argument notation, such as Goal Structuring Notation (GSN) or Claims Argument Evidence (CAE), rely on the practical argument model developed by the philosopher Toulmin (1958). Toulmin focused on the justification aspects of arguments rather than the inferential, and developed a model of these ‘real world’ arguments based on facts, conclusions, warrants, backing and qualifier entities.

Using Toulmin’s model, from evidence one can draw a conclusion, as long as it is warranted, said warrant possibly being supported by additional backing and possibly contingent upon some qualifying statement. Importantly, one of the qualifier elements in practical arguments is what Toulmin called a ‘rebuttal’, that is, some form of legitimate constraint that may be placed on the conclusion drawn; we’ll get to why that’s important in a second.

Toulmin Argumentation Example

You see, Toulmin developed his model so that one could actually analyse an argument, that is, argument in the verb sense of ‘we are having a safety argument’. Formal safety arguments in safety cases, however, are inherently advocacy positions, and the rebuttal part of Toulmin’s model finds no place in them. In the noun world of safety cases, argument is used in the sense of ‘there is the 12 volume safety argument on the shelf’, and if the object is to produce something rather than discuss, then there’s no need for a claim and rebuttal pairing, is there?

In fact you won’t find an explicit rebuttal form in either GSN or CAE as far as I am aware; it seems that the very ‘idea’ of rebuttal has been pruned from the language of both. Of course it’s hard to express a concept if you don’t have the word for it, a nice little example of how language form can control the conversation. Language is power, so they say.
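For what it’s worth, here’s a minimal sketch (in C++, with field names of my own choosing rather than GSN or CAE syntax) of Toulmin’s model rendered as a data structure, with the rebuttal carried as a first-class element rather than pruned away:

```cpp
#include <iostream>
#include <optional>
#include <string>
#include <vector>

// Toulmin's practical argument model, sketched as a plain data structure.
// The point of interest: the rebuttal is an explicit, first-class field.
struct ToulminArgument {
    std::vector<std::string> data;        // facts / evidence (the grounds)
    std::string warrant;                  // why the data licenses the claim
    std::optional<std::string> backing;   // support for the warrant itself
    std::optional<std::string> qualifier; // e.g. "probably", "presumably"
    std::optional<std::string> rebuttal;  // legitimate exception or constraint
    std::string claim;                    // the conclusion being argued for
};

int main() {
    // Illustrative content only.
    ToulminArgument arg;
    arg.data = {"Observed in-service failure rate data"};
    arg.warrant = "The data licenses an inference about future behaviour";
    arg.qualifier = "probably";
    arg.rebuttal = "unless the operating environment changes";
    arg.claim = "The system is acceptably safe";

    std::cout << arg.claim << " (" << *arg.qualifier << "), "
              << *arg.rebuttal << '\n';
    return 0;
}
```

Whether a notation gives the rebuttal a box of its own, as here, or quietly drops it, shapes what the resulting ‘argument’ is actually able to say.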

 

Well it sounded reasonable…

One of the things that’s concerned me for a while is the potentially malign narrative power of a published safety case. For those unfamiliar with the term, a safety case can be defined as a structured argument supported by a body of evidence that provides a compelling, comprehensible and valid case that a system is safe for a given application in a given environment. And I have not yet read a safety case that didn’t purport to be exactly that.

Continue Reading…

The development of safety cases for complex safety critical systems

So what is a safety case? The term has achieved an almost quasi-religious status amongst safety practitioners, with its fair share of true believers and heretics. But if you’ve been given the job of preparing or reviewing a safety case, what’s the next step?

Continue Reading…

Recently there’s been some robust discussion over on the Safety Critical Mail List at York regarding the utility of safety cases and performance-based safety standards (as exemplified by the UK safety case regime) versus more prescriptive design standards (as exemplified by the aerospace industry’s FAR regulations). To provide one UK regulator’s perspective, here’s a presentation by Taf Powell, Director of the Offshore Division of the Health and Safety Executive’s Hazardous Industries Directorate, UK, on the state of safety cases in the UK offshore industry circa 2005. Of course his talk was well before the 2010 Deepwater Horizon disaster.

Continue Reading…