Archives For Philosophy

The philosophical aspects of safety and risk.

One of the perennial problems we face in a system safety program is how to come up with a convincing proof of the proposition that a system is safe. Because it's hard to prove a negative (in this case the absence of future accidents) the usual approach is to pursue a proof by contradiction: develop the negative proposition that the system is unsafe, then show that this is not true, normally by showing that the set of identified specific propositions of 'un-safety' have been eliminated or controlled to an acceptable level. Enter the term 'hazard', which in this context is simply shorthand for a specific proposition about the unsafeness of a system. Interestingly, when we parse the various definitions of hazard we find the recurring use of terms like 'condition', 'state', 'situation' and 'event' that, should they occur, will inevitably lead to an 'accident' or 'mishap'. So broadly speaking a hazard is an explanation, based on a defined set of phenomena, which argues that if those phenomena are present, and given some relevant domain source (1) of hazard, an accident will occur. All of which seems to indicate that hazards belong to a class of explanatory models called covering laws. Covering law models were developed as an explanatory class by the philosophers Hempel and Popper because of what they saw as problems with an over-reliance on inductive arguments as to causality.

As a covering law explanation of unsafeness a hazard posits phenomenological facts (system states, human errors, hardware/software failures and so on) that confer what's called nomic expectability on the accident (the thing being explained). That is, the phenomenological facts, combined with some covering law (natural or logical), require the accident to happen, and this is what we call a hazard. We can see an archetypal example in Swallom's Source-Mechanism-Outcome model: if we have both a source and a set of mechanisms in that model then we may expect an accident (Ericson 2005). While logical positivism had the last nails driven into its coffin by Kuhn and others in the 1960s, and it's true, as they pointed out, that covering law explanations have their fair share of problems, so too do other methods (2). The one advantage that covering law models do possess over other explanatory models, however, is that they largely avoid the problems of causal arguments, which may well be why they persist in engineering arguments about safety.
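To make the covering law reading concrete, here's a minimal sketch of the Source-Mechanism-Outcome pattern as code; the predicates and the fuel fire example are my own invention for illustration, not drawn from Ericson or Swallom.

```python
# Illustrative sketch only: a hazard read as a covering-law explanation in the
# Source-Mechanism-Outcome style. The example values are invented, not drawn
# from Ericson (2005).

from dataclasses import dataclass

@dataclass
class Hazard:
    source: str       # the domain source of harm, e.g. stored energy
    mechanism: str    # the phenomena that release or transfer it
    outcome: str      # the accident the covering generalisation requires

    def explains(self, source_present: bool, mechanism_present: bool) -> bool:
        # Nomic expectability: if both antecedent facts hold, the covering
        # generalisation requires the outcome.
        return source_present and mechanism_present

fuel_fire = Hazard(source="fuel load",
                   mechanism="ignition during refuelling",
                   outcome="pool fire")

# Deductive form: facts + covering law => accident expected
print(fuel_fire.explains(source_present=True, mechanism_present=True))   # True
print(fuel_fire.explains(source_present=True, mechanism_present=False))  # False
```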

Notes

1. The source in this instance is the ‘covering law’.

2. Such as counterfactual, statistical relevance or causal explanations.

References

Ericson, C.A. Hazard Analysis Techniques for System Safety, page 93, John Wiley and Sons, Hoboken, New Jersey, 2005.

Risk Spectrum redux

A short article on (you guessed it) risk, uncertainty and unpleasant surprises for the 25th anniversary issue of the UK SCS Club's newsletter, in which I introduce a unified theory of risk management that brings together the management of aleatory, epistemic and ontological risk, and formalises the Rumsfeld four-quadrant risk model which I've used for a while as a teaching aid.

My thanks once again to Felix Redmill for the opportunity to contribute.  🙂

But the virtues we get by first exercising them, as also happens in the case of the arts as well. For the things we have to learn before we can do them, we learn by doing them, e.g., men become builders by building and lyreplayers by playing the lyre; so too we become just by doing just acts, temperate by doing temperate acts, brave by doing brave acts.

Aristotle

Meltwater river Greenland icecap (Image source: Ian Joughin)

Memes, media and drug dealers

In honour of our Prime Minister's use of the drug dealer's argument to justify (at least to himself) why it's OK for Australia to continue to sell coal, when we know we really have to stop, here's an update of a piece I wrote on the role of the media in propagating denialist memes. Enjoy, there's even a public health tip at the end.

PS. You can find Part I and II of the series here.

🙂

Technical debt

05/09/2015

St Briavels Castle Debtors Prison (Image source: Public domain)

Paying down the debt

Technical debt is a great term that I've just come across: a metaphor coined by Ward Cunningham to reflect on how a decision to act expediently for an immediate reason may have longer-term consequences. This is a classic problem during design and development where we have to balance various 'quality' factors against cost and schedule. The point of the metaphor is that this debt doesn't go away; the interest on that sloppy or expedient design solution keeps on getting paid every time you make a change and find that it's harder than it should be. Turning around and 'fixing' the design in effect pays back the principal that you originally incurred. Failing to pay off the principal? Well such tales can end darkly. Continue Reading…
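As a toy illustration of the 'interest' in the metaphor (my own example, nothing to do with Cunningham's original case): an expedient copy-and-paste fix means every later change to the rule is paid for twice, while refactoring pays the principal back once.

```python
# Toy illustration of technical debt (my own example).

# Expedient version: the same conversion rule is pasted in two places, so any
# change to the rule must be found and made twice -- that's the 'interest'.
def invoice_total_expedient(amount: float) -> float:
    return amount * 1.10          # 10% tax, hard-coded

def quote_total_expedient(amount: float) -> float:
    return amount * 1.10          # same rule, duplicated

# Paying back the principal: one rule, one place, one change next time.
TAX_RATE = 0.10

def total_with_tax(amount: float) -> float:
    return amount * (1 + TAX_RATE)

# Behaviour is unchanged; only the cost of future change has gone down.
assert abs(total_with_tax(100.0) - invoice_total_expedient(100.0)) < 1e-9
```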

Four black swans

Or how do we measure the unknown?

The problem is that as our understanding and control of known risks increases, the remaining risk in any system becomes increasingly dominated by the 'unknown'. The higher the integrity of our systems the more uncertainty we have over the unknown and unknowable residual risk. What we need is a way to measure, express and reason about such deep uncertainty, and I don't mean tools like Pascalian calculus or Bayesian prior belief structures, but a way to measure and judge ontological uncertainty.

Even if we can't measure ontological uncertainty directly, perhaps there are indirect measures? Perhaps there's a way to infer something from the Platonic shadow that such uncertainty casts on the wall, so to speak. Nassim Taleb would say no, the unknowability of such events being the central thesis of his Ludic Fallacy after all. But I still think it's worthwhile exploring, because while he might be right, he may also be wrong.

*With apologies to Nassim Taleb.


Just because you can, doesn’t mean you ought

An interesting article by Kaag and Kreps on the moral hazard that the use of drone strikes poses, and how the debate on their use confuses fact with value. To say that drone strikes are effective and near consequence-free, at least for the perpetrator, does not equate to the conclusion that they are ethical and that we should carry them out. Nor does the capability to safely attack with focused lethality mean that we will in fact make better ethical decisions. The moral hazard that Kaag and Kreps assert is that ease of use can all too easily end up becoming the justification for use. My further prediction is that with the increasing automation and psychological distancing of the kill chain this tendency will inevitably increase. Herman Kahn is probably smiling now, wherever he is.

Continue Reading…

 

On Artificial Intelligence as ethical prosthesis

Out here in the grim meat-hook present of Reaper missions and Predator drone strikes we're already well down the track to a future in which decisions as to who lives and who dies are made less and less by human beings, and more and more by automation. Although there's been a lot of 'sexy' discussion recently of the possibility of purely AI decision making, the current panic misses the real issue du jour, that is the question of how well current day hybrid human-automation systems make such decisions, and the potential for the incremental abdication of moral authority by the human part of this cybernetic system as the automation in the synthesis becomes progressively more sophisticated and suasive.

As Dijkstra pointed out in the context of programming, one of the problems or biases humans have in thinking about automation is that because it 'does stuff', we feel the need to imbue it with agency, and from there it's a short step to treating the automation as a partner in decision making. From this very human misunderstanding it's almost inevitable that a decision maker holding such a view will feel that the responsibility for decisions is shared, and responsibility diluted, thereby opening up the potential for choice shift in decision making. As the degree of sophistication of such automation increases this effect of course becomes stronger and stronger, even though 'on paper' we would not recognise the AI as a rational being in the Kantian sense.

Even the design of decision support system interfaces can pose tricky problems when an ethical component is present, as the dimensions of ethical problem solving (time intensiveness, consideration, uncertainty, uniqueness and reflection) directly conflict with those that make for efficient automation (brevity, formulaic response, simplification, certainty and repetition). This inherent conflict ensures that the interaction of automation and human ethical decision making becomes a tangled and conflicted mess. Technologists of course look at the way in which human beings make such decisions in the real world and believe, rightly or wrongly, that automation can do better. What we should remember is that such automation is still a proxy for the designer; if the designer has no real understanding of the needs of the user in forming such ethical decisions then, if the past is any guide, we are up for a future of poorly conceived decision support systems, with all the inevitable and unfortunate consequences that attend. In fact I feel confident in predicting that the designers of such systems will, once again, automate their biases about how humans and automation should interact, with unpleasant surprises for all.

In a broader sense what we're doing with this current debate is essentially rehashing the old argument between two world views on the proper role of automation. On one side automation is intended to supplant those messy, unreliable humans, in the current context effecting an unintentional ethical prosthetic. On the other hand we have the view that automation can and should be used to assist and augment human capabilities, that is, it should be used to support and develop people's innate ethical sense. Unfortunately in the current debate it also looks like the prosthesis school of thought is winning out. My view is that if we continue in this approach of 'automating out' moral decision making we will inevitably end up with the amputation of ethical sense in the decision maker, long before killer robots stalk the battlefield, or the high street of your home town.

Toyota ECM (Image source: Barr testimony presentation)

Comparing and contrasting

In 2010 NASA was called in by the National Highway Traffic Safety Administration to help figure out the reason for reported unintended Toyota Camry accelerations. They subsequently published a report including a dedicated software annex. What's interesting to me is the different outcomes and conclusions of the two reports regarding software. Continue Reading…

Waaay back in 2002 Chris Holloway wrote a paper that used a fictional civil court case involving the hazardous failure of software to show that much of the expertise and received wisdom of software engineering was, by the standards of the US federal judiciary, junk and at best opinion based.

Rereading the transcripts of Philip Koopman and Michael Barr in the 2013 Toyota spaghetti monster case I am struck both by how little things have changed and by how far the actual state of the industry can be from the state of the practice, let alone the state of the art. Life recapitulates art I guess, though not in a good way.


The enigmatic face of HAL

When Formal Systems Kill is an interesting paper by Lee Pike and Darren Abramson looking at the automatic formal system property of computers from an ethical perspective. Of course, as we all know, the 9000 series has a perfect operational record…

Easter 2014 bus-cycle accident (Image Source: James Brickwood)

The limits of rational-legal authority

One of the underlying and unquestioned aspects of modern western society is that the power of the state is derived from a rational-legal authority, that is, in the Weberian sense, a purposive or instrumental rationality in pursuing some end. But what if it isn't? What if the decisions of the state are based more on beliefs about how people ought to behave and how things ought to be than on reality? What, in other words, if the lunatics really are running the asylum?

Continue Reading…

On being a professional

Currently dealing with some software types who, god bless their woolly socks, are less than enthusiastic about dealing with all this 'paperwork' and 'process', which got me thinking about the nature of professionalism.

While the approach of 'Bürokratie über alles' doesn't sit well with me I confess, on the other side of the coin I see the girls-just-wanna-have-fun mentality of many software developers as symptomatic of a lack of professionalism amongst the programming class. Professionals in my book intuitively understand that the 'job' entails three parts, the preparing, the doing and the cleaning up, in a stoichiometric ratio of 4:2:4. That's right, any job worth doing is a basic mix of two parts fun to eight parts diligence, and that's true whether you're a carpenter or a heart surgeon.

Unfortunately the field of computer science seems to attract what I can only call man children, those folk who like Peter Pan want to fly around Never Land and never grow up, which is OK if you're coding Java beans for a funky hipster website, not so great if you're writing an embedded program for a pacemaker, and so in response we seem to have process*.

Now as a wise man once remarked, process really says you don't trust your people, so I draw the logical conclusion that the continuing process obsession of the software community simply reflects an endemic lack of trust, due to the aforementioned lack of professionalism, in that field. In contrast I trust my heart surgeon (or my master carpenter) because she is an avowed, experienced and skillful professional, not because she's CMMI level 4 certified.

*I guess that’s also why we have the systems engineer. 🙂

…and the value of virtuous witnesses

I have to say that I've never been terribly impressed with IEC 61508, given it purports to be so arcane that it requires a priesthood of independent safety assessors to reliably interpret and sanction its implementation. My view is that if your standard is that difficult then you need to rewrite the standard.

Which is where I would have parked my unhappiness with the general 61508 concept of an ISA, until I remembered a paper written by John Downer on how the FAA regulates the aerospace sector. Within the FAA’s regulatory framework there exists an analog to the ISA role, in the form of what are called Designated Engineering Representatives or DERs. In a similar independent sign-off role to the ISAs, DERs are paid by the company they work for to carry out a certifying function on behalf of the FAA.

Continue Reading…

Current practice in formal safety argument notations such as Goal Structuring Notation (GSN) or Claims Argument Evidence (CAE) relies on the practical argument model developed by the philosopher Toulmin (1958). Toulmin focused on the justification aspects of arguments rather than the inferential, and developed a model of these 'real world' arguments based on facts, conclusions, warrants, backing and qualifier entities.

Using Toulmin's model, from evidence one can draw a conclusion, as long as it is warranted, with the warrant possibly supported by additional backing and possibly contingent upon some qualifying statement. Importantly, one of the qualifier elements in practical arguments is what Toulmin called a 'rebuttal', that is, some form of legitimate constraint that may be placed on the conclusion drawn; we'll get to why that's important in a second.
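For concreteness, here's a minimal sketch of Toulmin's elements as a data structure; the encoding and the safety example are mine, not an official GSN or CAE metamodel, but note the rebuttal slot, which is exactly the element at issue below.

```python
# Minimal sketch of Toulmin's practical argument model (my own encoding,
# not a GSN or CAE metamodel).

from dataclasses import dataclass
from typing import List, Optional

@dataclass
class ToulminArgument:
    data: List[str]                 # the facts / evidence
    claim: str                      # the conclusion drawn from the data
    warrant: str                    # why the data licence the claim
    backing: Optional[str] = None   # support for the warrant itself
    qualifier: str = "presumably"   # strength of the inference
    rebuttal: Optional[str] = None  # conditions under which the claim fails

arg = ToulminArgument(
    data=["All identified hazards have been eliminated or controlled"],
    claim="The system is acceptably safe to operate",
    warrant="Controlling identified hazards reduces accident risk",
    backing="Hazard analysis performed to the applicable standard",
    qualifier="so long as",
    rebuttal="unless the hazard identification was incomplete",
)

# A GSN/CAE-style rendering would keep everything above except the rebuttal.
print(arg.claim, "-", arg.qualifier, "-", arg.rebuttal)
```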

Toulmin Argumentation Example

You see, Toulmin developed his model so that one could actually analyse an argument, that is, argument in the verb sense of 'we are having a safety argument'. Formal safety arguments in safety cases, however, are inherently advocacy positions, and the rebuttal part of Toulmin's model finds no place in them. In the noun world of safety cases, argument is used in the sense of 'there is the 12 volume safety argument on the shelf', and if the object is to produce something rather than discuss then there's no need for a claim and rebuttal pairing, is there?

In fact you won't find an explicit rebuttal form in either GSN or CAE as far as I am aware; it seems that the very 'idea' of rebuttal has been pruned from the language of both. Of course it's hard to express a concept if you don't have a word for it, a nice little example of how the form of a language can control the conversation. Language is power, so they say.

 

MH370 Satellite Image (Image source: AMSA)

MH370 and privileging hypotheses

The further away we move from whatever event initiated the disappearance of MH370, the less entanglement there is between circumstances and the event, and thus the more difficult it is to make legitimate inferences about what happened. In essence the signal-to-noise ratio decreases exponentially as the causal distance from the event increases; the best evidence is therefore that which is intimately entwined with what was going on onboard MH370, while evidence obtained at greater distances in time or space matters correspondingly less.

Continue Reading…

As Weick pointed out, to manage the unexpected we need to be reliably mindful, not reliably mindless. Obvious as that truism may be, those who invest heavily in plans, procedures, process and policy also end up perpetuating and reinforcing a whole raft of expectations about how the world is, thus investing in an organisational culture of mindlessness rather than mindfulness. Understanding that process inherently tends towards a state of organisational mindlessness, we can see that a process oriented risk management standard such as ISO 31000 perversely cultivates a climate of inattentiveness, right where we should be most attentive and mindful. Nor am I alone in my assessment of ISO 31000, see for example John Adams' criticism of the standard as not fit for purpose, or KaplanMike's assessment of ISO 31000 as essentially 'not relevant'. Process is no substitute for paying attention.

Don't get me wrong, there's nothing inherently wrong with a small dollop of process, just that its place is not centre stage in an international standard that purports to be about risk, not if you're looking for an effective outcome. In real life it's the unexpected, those black swans of Nassim Taleb's flying in the dark skies of ignorance, that have the most effect, and about which ISO 31000 has nothing to say.

Postscript

Also, the application of ISO 31000's classical risk management to workplace health and safety may actually be illegal in some jurisdictions (like Australia) where legislation is based on a backwards-looking principle of due diligence rather than a prospective risk-based approach to workplace health and safety.

Metaphor shear

27/11/2013

Coined by Neal Stephenson, the term metaphor shear describes that moment when unpalatable technological complexities that we've smoothed over in the name of efficiency or usability suddenly intrude, and we're left with the realisation that we've been living inside a metaphor as if it were reality. And it's in that difference, between the idea and the reality, that critical software systems can fail. For example (harrumph) we talk about the protection provided by software 'architecture' like partitions, firewalls and watchdogs as if these have a physical existence and permanence, in the same way as the real world mechanisms that they replaced, when in fact they're 'simply' a collection of algorithms and digital ones and zeroes.

The igloo of uncertainty (Image source: UNEP 2010)

Ethics, uncertainty and decision making

The name of the model made me smile, but this article, The Ethics of Uncertainty by Tannert, Elvers and Jandrig, argues that where uncertainty exists research should be considered part of an ethical approach to managing risk.

Continue Reading…

Taboo transactions and the safety dilemma

Again my thanks go to Ross Anderson over on the Light Blue Touchpaper blog for the reference, this time to a paper by Alan Fiske, an anthropologist, and Philip Tetlock, a social psychologist, on what they term taboo transactions. What they point out is that there are domains of sharing in society, each of which works on different rules: communal versus reciprocal obligations for example, or authority versus market. And within each domain we socially 'transact' trade-offs between equivalent social goods.

Continue Reading…

The safety theatre

11/09/2013

I was reading a post by Ross Anderson on his dismal experiences at John Lewis and ran across the term security theatre. I'd actually heard the term before, it was originally coined by Bruce Schneier, but this time it got me thinking about how much activity in the safety field is really nothing more than theatrical devices that give the appearance of achieving safety, but not the reality. From zero harm initiatives to hi-vis vests, from the stylised playbook of public consultation to the use of safety integrity levels that purport to show a system is safe, how much of this adds any real value? Worse yet, and as with security theatre, an entire industry has grown up around this culture of risk, which in reality amounts to a culture of risk aversion in western society. As I see it risk as a cultural concept is like fire, a dangerous tool and an even more terrible master.

An articulated guess beats an unspoken assumption

Frederick Brooks

A point that Fred Brooks makes in his recent work The Design of Design is that it's wiser to explicitly make specific assumptions, even if that entails guessing the values, rather than leave the assumption unstated and vague because 'we just don't know'. Brooks notes that while specific and explicit assumptions may be questioned, implicit and vague ones definitely won't be. If a critical aspect of your design rests upon such fuzzy unarticulated assumptions, then the results can be dire.

Continue Reading…

From Les Hatton, here’s how, in four easy steps:

  1. Insist on using R = F x C in your assessment. This will panic HR (People go into HR to avoid nasty things like multiplication.)
  2. Put “end of universe” as risk number 1 (Rationale: R = F x C. Since the end of the universe has an infinite consequence C, then no matter how small the frequency F, the Risk is also infinite)
  3. Ignore all other risks as insignificant
  4. Wait for call from HR…

A humorous note, amongst many, in an excellent presentation on the fell effect that bureaucracies can have upon the development of safety critical systems. I would add my own small corollary that when you see warning notes on microwaves and hot water services the risk assessment lunatics have taken over the asylum…
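For the pedants, step 2 really is just arithmetic; here's a two-line sketch (illustrative only):

```python
# Step 2 in two lines: with R = F x C, an infinite consequence swamps any
# arbitrarily small frequency (illustrative only).
frequency = 1e-300                 # vanishingly rare
consequence = float("inf")         # end of the universe
risk = frequency * consequence
print(risk)                        # inf -- risk number 1, as promised
```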

Battery post fire (Image source: NTSB)

The NTSB has released its interim report on the Boeing 787 JAL battery fire, and it appears that Boeing's initial safety assessment had concluded that the only way in which a battery fire would eventuate was through overcharging. Continue Reading…

787 Lithium Battery (Image Source: JTSB)

But, we tested it? Didn’t we?

Earlier reports on the initial development of the Boeing 787 lithium battery indicated that Boeing engineers had conducted tests to confirm that a single cell failure would not lead to a cascading thermal runaway amongst the remaining cells. According to these reports their tests were successful, so what went wrong?

Continue Reading…

Just updated the post Why Safety Integrity Levels Are Pseudo-science with additional reference material and links to where it’s available on the web. Oh, and they’re still pseudo-science…

Just finished reading the excellent paper A Conundrum: Logic, Mathematics and Science Are Not Enough by John Holloway on the swirling currents of politics, economics and emotion that can surround and affect any discussion of safety. The paper neatly illustrates why the canonical rational-philosophical model of expert knowledge is inherently flawed.

What I find interesting as a practicing engineer is that although everyday debates and discussions with our peers emphasise the subjectivity of engineering 'knowledge', as engineers we all still like to pretend and behave as if it were not.

“Knowledge is an unending adventure at the edge of uncertainty”

Jacob Bronowski

British mathematician, biologist, historian of science, theatre author, poet and inventor.

In June of 2011 the Australian Safety Critical Systems Association (ASCSA) published a short discussion paper on what they believed to be the philosophical principles necessary to successfully guide the development of a safety critical system. The paper identified eight management and eight technical principles, but do these principles do justice to the purported purpose of the paper?

Continue Reading…

Did the designers of the Japanese seawalls consider all the factors?

In an eerie parallel with the Blayais nuclear power plant flooding incident it appears that the designers of tsunami protection for the Japanese coastal cities and infrastructure hit by the 2011 earthquake did not consider all the combinations of environmental factors that go to set the height of a tsunami.

Continue Reading…

Thinking about the unintentional and contra-indicating stall warning signal of AF 447, I was struck by the common themes between AF 447 and the Titanic. In both cases the design teams designed a vehicle compliant with the regulations of the day, but in both cases an implicit design assumption as to how the system would be operated was invalidated.

Continue Reading...

Why more information does not automatically reduce risk

I recently re-read the article Risks and Riddles by Gregory Treverton on the difference between a puzzle and a mystery. Treverton's thesis, taken up by Malcolm Gladwell in Open Secrets, is that there is a significant difference between puzzles, in which the answer hinges on a known missing piece, and mysteries, in which the answer is contingent upon information that may be ambiguous or even in conflict. Continue Reading…

Over the years a recurring question raised about the design of FBW aircraft has been whether pilots constrained by software-embedded protection laws really have the authority to do what is necessary to avoid an accident. But this question falls into the trap of characterising the software as an entity in and of itself. The real question is: should the engineers who developed the software be the final authority?

Continue Reading...

Why we risk…

15/05/2011

Why taking risk is an inherent part of the human condition

On the 6th of May 1968 Neil Armstrong stepped aboard the Lunar Landing Training Vehicle (LLTV) for a routine training mission. During the flight the vehicle went out of control and crashed, with Armstrong ejecting to safety seconds before impact. Continue Reading…

Blayais Plant (Image source: Wikipedia Commons)

What a near miss flooding incident at a French nuclear plant in 1999 and the 2011 Fukushima disaster can tell us about fault tolerance and designing for reactor safety

Continue Reading…

A report by the AIA on engine rotor bursts and their expected severity raises questions about the levels of damage sustained by QF 32.

Continue Reading...

It appears that the underlying certification basis for aircraft safety in the event of an intermediate power turbine rotor burst is not supported by the rotor failure seen on QF 32.

Continue Reading...

The Titanic effect

27/09/2010

So why did the Titanic sink? The reason highlights the role of implicit design assumptions in complex accidents and the interaction of design with operations of safety critical systems

Continue Reading...

Why do safety critical organisations also fail to respond to sentinel events?

Continue Reading...

The IPCC issued a set of lead author guidance notes on how to describe uncertainty prior to the fourth IPCC assessment. In these notes the IPCC laid out a methodology for dealing with various classes of uncertainty. Unfortunately the IPCC guidance also fell into a fatal trap.

Continue Reading...

One of the tenets of safety engineering is that simple systems are better. Many practical reasons are advanced to justify this assertion, but I’ve always wondered what, if any, theoretical justification was there for such a position.

Continue Reading...

What is a hazard?

14/06/2009

The principle of phenotype and genotype is used to explain the variability amongst definitions of hazard.

Continue Reading...

The use of integrity levels to achieve ultra-high levels of safety has become an 'accepted wisdom' in the safety community. Yet I, and others, remain unconvinced as to their efficacy. In this post I argue that integrity levels are not scientific in any real sense of that term, which leads in turn to the reasonable question of whether they work.

Testability and disconfirmation

The basis of science is empirical observation and inductive reasoning. For example we may observe that swans are white and therefore form a theory that all swans must be white. But as Hume pointed out, inductive reasoning is inherently limited because the premises of an inductive argument support but cannot logically entail the conclusion. In our example a single black swan is sufficient to refute our theory, despite there being a thousand white swans. This does not mean that a theory cannot be useful (that is, it works), but just because a theory has worked a number of times does not mean that it is proven to be true. For example, we can build ten bridges that stay up (our theory is useful) but there is nothing to say that the eleventh will not fall down due to effects not considered in the existing theory of how to design bridges. As any test of a theory cannot prove the truth of the theory, only disprove it, when we say a theory is testable we are not saying that we can prove it, only that there exists an opportunity to disprove or falsify it. This concept of disproof is very much akin to the legal principle of finding a person 'not guilty', rather than 'innocent', of a charge. Which leaves us with a problem as to how science really works, if we presume that it's science's job to prove things.

The response of the philosopher Karl Popper to this problem of induction was to accept this inability to absolutely prove the truth, and to conclude that because we can never prove the truth of a scientific theory, science has to advance on the basis of falsifying existing theories and replacing them with theories that better explain the facts (Popper 1968). From Popper's perspective a good theory is one that offers us ample opportunity to falsify it. Conversely a theory which is not refutable by any conceivable means is non-scientific. Irrefutability is in fact not a virtue of a theory (as people often think) but a vice (Popper 1968). To achieve falsifiability, according to Popper, a theory therefore needs to be:

  1. precisely stated (i.e. unambiguous),
  2. wide ranging, and
  3. testable in practical terms.

As a corollary, if a theory does not satisfy these criteria it should not be considered scientific (Popper 1968). For example we could develop a design hypothesis as to how we could design a bridge to span the Straits of Gibraltar using as yet undeveloped hyper-strength materials, but as we have no practical way to test such a theory it should not be considered scientific. Another way to look at it is that a 'good' scientific theory is a prohibition: it forbids certain things to happen. The more a theory forbids, the better it is, because it gives us greater scope to falsify it. For example, if we build a very slender bridge using deflection theory (we do X), the design hypothesis forbids the bridge from falling down under specific deck loads (if X then not Y), which is eminently testable. Confirmations should also count only if they are the result of risky predictions. That is, if based on our original theory or understanding we expect an event that is incompatible with some new hypothesis, then observing that event would refute the new hypothesis. If on the other hand a new hypothesis predicts pretty much the same results as the accepted theory then there's not much at risk. So in our bridge example, if our new theory of bridge construction predicts that such a bridge will not fail under a known load, whereas the older theory based on traditional techniques predicts that it will, then there's a clear, and testable, difference.
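Put a little more formally (my formalisation, not Popper's own notation), the 'prohibition' reading of a theory is just modus tollens:

```latex
% My formalisation of the falsification step, not Popper's notation.
% H: the design hypothesis, X: we build the slender bridge to that design,
% Y: the bridge falls down under the specified deck loads.
% The hypothesis forbids Y given X; if we do X and nonetheless observe Y,
% then by modus tollens the hypothesis is refuted:
\[
  \frac{H \rightarrow (X \rightarrow \neg Y) \qquad X \qquad Y}{\neg H}
\]
```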

From an engineering perspective this means that the confirmation of a theory comes when it allows engineers to do something beyond the current state of the art. For example, we could use a new bridge deflection theory to design a lighter and more slender bridge span for given wind loads. If the bridge stands under the loads then the results would count as confirmation of the new hypothesis, as our old theory would have predicted failure. Confirming evidence should not count except when it is the result of a genuine test of our hypothesis, one that can be presented as a serious but unsuccessful attempt to falsify it (Popper 1968). Essentially our theories and hypotheses need to have some 'skin in the game'. Again using the bridge design example, if both the original theory and the new design hypothesis predict the survival of a bridge, this does not represent a genuine test of the new theory. But if that new bridge could only be built using the new hypothesis and the bridge subsequently falls down, then that new hypothesis will inevitably be scrutinised and either rejected or adjusted, as happened with the Tacoma Narrows disaster; the new hypothesis has, in effect, lots of epistemic 'skin in the game'.

The scientific theory of Software Integrity Levels (SILs)

As the concept of a safety integrity level is most entrenched within the software community I'll stick with software for the moment, noting that the issues raised below are just as valid for safety integrity levels when applied to hardware. The theory of safety integrity levels for software can be expressed as follows (a toy sketch of the allocation step appears after the list):

  1. Software failures are ‘systematic’ that is they result from systematic faults in the software specification, design or production processes,
  2. As such software failures are not random in nature, given the correct set of inputs or environmental conditions the failure will always occur,
  3. The requirement for ultrahigh reliability (for example 10E-9 per hour) of safety functions makes traditional reliability testing of software to demonstrate such reliability impossible,
  4. The use of specific development processes will deliver the required reliability by reducing the number of latent faults that could cause a software failure but this comes at a cost,
  5. Therefore based on an assessment of risk an ‘integrity level’ is assigned to the safety function. The higher the risk the greater the integrity level assigned,
  6. This integrity level represents the required reliability of the safety function, and
  7. To achieve the integrity level a set of processes is applied to the specification, design and production processes; these are defined as the associated software integrity level.
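To make step 5 above concrete, the following is a toy risk-matrix allocation; the severity and likelihood categories and the cell values are invented for the example and are not taken from IEC 61508, MIL-STD-882 or any other standard.

```python
# Toy SIL allocation from a risk matrix (illustrative only -- the categories
# and cell values below are invented, not taken from any standard).

SEVERITY = ["negligible", "marginal", "critical", "catastrophic"]
LIKELIHOOD = ["improbable", "remote", "occasional", "probable"]

# Rows: likelihood, columns: severity. 0 means no SIL required.
SIL_MATRIX = [
    [0, 0, 1, 2],   # improbable
    [0, 1, 2, 3],   # remote
    [1, 2, 3, 4],   # occasional
    [2, 3, 4, 4],   # probable
]

def allocate_sil(severity: str, likelihood: str) -> int:
    """Return the integrity level assigned to a safety function."""
    return SIL_MATRIX[LIKELIHOOD.index(likelihood)][SEVERITY.index(severity)]

print(allocate_sil("catastrophic", "remote"))   # 3
print(allocate_sil("marginal", "improbable"))   # 0
```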

Problems with SILs as a scientific theory

SILs are fundamentally untestable

Unfortunately even the lowest target failure rates for safety functions (e.g. 10E-5 per hour) are already beyond practical verification (Littlewood & Strigini 1993), and therefore we have no practical, independent and empirical way to demonstrate that application of a SIL (or any other posited technique) will achieve the required reliability (freedom from accident). So we end up with a circular argument where we can only demonstrate achievement of a specific SIL by the evidence of carrying out the processes that define that SIL level (McDermid 2001).
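A back-of-the-envelope sketch of why this is so (my own illustration of the standard zero-failure test-time argument, not a calculation from Littlewood and Strigini's paper): to demonstrate a failure rate λ at confidence C by failure-free operation you need roughly t = −ln(1 − C)/λ hours of testing.

```python
# Back-of-the-envelope: hours of failure-free testing needed to demonstrate a
# target failure rate at a given confidence (standard zero-failure argument;
# my own illustration, not a figure from Littlewood & Strigini 1993).
import math

def test_hours_required(target_rate_per_hour: float, confidence: float) -> float:
    # Zero failures observed in t hours gives confidence C that the true
    # rate is below the target when t >= -ln(1 - C) / target_rate.
    return -math.log(1.0 - confidence) / target_rate_per_hour

for rate in (1e-5, 1e-9):
    hours = test_hours_required(rate, confidence=0.99)
    print(f"{rate:.0e}/hr -> {hours:.2e} test hours (~{hours / 8760:.0f} years)")
# 1e-05/hr -> ~4.6e+05 hours (roughly 53 years of continuous test)
# 1e-09/hr -> ~4.6e+09 hours (roughly 525,000 years)
```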

SIL allocation is non-trivial

A number of different techniques can be used to allocate integrity level requirements, ranging from the consequence/autonomy models of DO-178B and MIL-STD-882 to the risk matrices of IEC 61508 (1). Because of these differences SIL allocation cannot be said to be a consistent and therefore precisely defined activity. This makes refutation of the theory difficult, as a failure could be argued, ad hoc, to be due to the incorrect allocation of SILs rather than to SILs themselves.

SIL Activities are inconsistent from standard to standard

The many SIL-based standards vary widely in the methods invoked and the degree of tailoring that a project can apply. DO-178B defines a basic development process but focuses upon software product testing and inspection to assure safety. Other standards such as DEF STAN 00-55 focus on the definition of safety requirements that are acquitted through evidence. Some standards, such as DEF AUST 5679, emphasise the use of formal methods to achieve the highest integrity levels while others, such as IEC 61508, invoke a broad range of techniques to deliver a safety function at a required integrity level. There is as a result no single consistent, and therefore wide ranging, 'theory of SILs'; each is specific to the project and company 'instance'.

SIL activities are applied inconsistently

The majority of SIL standards allow a degree of tailoring of process to the specific project or company. While this is understandable given the range of projects and industry contexts, it results in an inherently inconsistent application of processes across projects. As an example from aviation, within that industry's software community there has been a vigorous debate over the application of various methods of achieving the Modified Condition/Decision Coverage criteria of DO-178B (Chilenski 2001). Because of this variability of application it is impossible to say with precision that a specific standard has been fully applied. This lack of precision then makes it difficult to argue, should an accident occur, that the standard failed, because it could always be argued after the fact that the failure was due to a fault of application rather than an inherent fault in the process standard. This is what Popper calls a conventionalist twist, because it can be used to explain away inconvenient results. The problem of application is further exacerbated by the standardisation bodies expressing their requirements in terms of recommendations (IEC 61508) or guidance (DO-178) rather than requirements, thereby allowing process variance without either justification or demonstration of equivalence.
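For those unfamiliar with the criterion being debated, here's a minimal textbook-style illustration of MC/DC for a two-condition decision (my own example, not drawn from Chilenski's report): each condition must be shown to independently affect the outcome, so three test vectors suffice where exhaustive truth-table testing would need four.

```python
# Minimal illustration of MC/DC for the decision (A and B); a textbook-style
# example, not taken from Chilenski (2001).

def arm_interlock(sensor_ok: bool, command_valid: bool) -> bool:
    # The decision under test: both conditions must hold.
    return sensor_ok and command_valid

# Each condition must be shown to independently change the outcome while the
# other is held fixed. Three vectors achieve this (n + 1 for n conditions):
mcdc_tests = [
    (True,  True,  True),    # baseline: decision is True
    (False, True,  False),   # flipping sensor_ok alone flips the outcome
    (True,  False, False),   # flipping command_valid alone flips the outcome
]

for a, b, expected in mcdc_tests:
    assert arm_interlock(a, b) == expected
print("MC/DC test set passes")
```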

SIL activities are ambiguous as to outcome

While the SIL standards are intended to deliver both intermediate and final products with low defect rates, the logical argument as to how each process achieves or contributes to such an outcome is not so clear. The problem becomes worse as the process moves away from proximal activities that directly affect the final delivered product and towards the distal activities of managing the process. For example DO-178 Table A-1 requires the preparation of a plan for the software aspects of certification. While planning a process is certainly a 'good thing', the problem is that it is difficult to link the quality of overall planning to a specific and hazardous fault in a product. All that can be said about a plan is that it represents a planning activity and ensures that, if adhered to, subsequent efforts are carried out in a planned way and are auditable against the plan. Having developed a software product to the SIL requirements we then find that the end product behaves much like software that has not been developed to such a standard. In essence SILs make no risky predictions, given, as noted above, that the purported reliability of the software is not empirically testable. Even should a latent software fault exist, as long as the correct set of circumstances never arises in practice the software will operate safely.

Conclusions

Given the problems identified above we must conclude that, however much SILs have become the accepted wisdom, they do not satisfy the requirements of a scientific theory. They may have a seductive simplicity but they are, it seems, closer to astrology than to science or engineering. Unfortunately while the software community continues to cling to such concepts it stifles serious investigation into the real question of what constitutes safe software.

References

Chilenski, J. J. (2001), An Investigation of Three Forms of the Modified Condition Decision Coverage (MCDC) Criterion, FAA Tech Center Report DOT/FAA/AR-01/18.

Fowler, D., Application of IEC 61508 to Air Traffic Management and Similar Complex Critical Systems – Methods and Mythology, in Lessons in System Safety: Proceedings of the Eighth Safety-Critical Systems Symposium, Anderson, T., Redmill, F. (eds.), pp 226-245, Southampton, UK, Springer Verlag.

Littlewood, B. & Strigini, L. (1993), Validation of Ultra-High Dependability for Software-based Systems. Comm. of the ACM, 36(11):69–80.

McDermid, J.A., Pumfrey, D.J., Software Safety: Why is there no Consensus?, Proceedings of the International System Safety Conference (ISSC) 2001, Huntsville, System Safety Society, 2001.

Popper, K.R., Conjectures and Refutations, 3rd ed., Routledge, 1968.

Redmill, F., Safety Integrity Levels – Theory and Problems, in Lessons in System Safety: Proceedings of the Eighth Safety-Critical Systems Symposium, Anderson, T., Redmill, F. (eds.), pp 1-20, Southampton, UK, Springer Verlag.

Notes

1. There have been a multitude of qualitative and quantitative methods proposed for SIL assignment, so many that it sometimes seems that safety professionals take a perverse delight in propagating new techniques. Some of the more common include (with source):

  • Consequence (control loss) (MISRA),
  • Software authority (MIL-STD-882),
  • Consequence (loss severity) (DO 178),
  • Quantitative risk method (IEC 61508),
  • Risk graph and calibrated risk graph (IEC 61508/IEC 61511),
  • Hazardous event severity matrix (IEC 61508),
  • Hybrid consequence and risk matrix (DEF STAN 00-56),
  • Semi-quantitative method (IEC 61511),
  • Safety layer matrix method (IEC 61511), and
  • Layer of protection analysis (IEC 61511).

Why is the concept of a hazard so hard to pin down? Wittgenstein provides some pointers as to why there is less to this old chestnut than there appears to be.

Continue Reading...