In June of 2011 the Australian Safety Critical Systems Association (ASCSA) published a short discussion paper on what they believed to be the philosophical principles necessary to successfully guide the development of a safety critical system. The paper identified eight management and eight technical principles, but do these principles do justice to the purported purpose of the paper?
The Oxford dictionary defines a principle as “a fundamental truth or proposition that serves as the foundation for a system of belief or behaviour or for a chain of reasoning”. Clearly the authors intended the principles they enumerated to serve as the foundational concepts for the safety management and engineering performed when developing safety critical systems.
A worthy objective, but I’m afraid I didn’t really come away with the feeling that they had nailed the philosophy part. A lot of good common sense statements, but philosophy? As in the study of the fundamental nature of knowledge, nature and reality? Not so much…
So, throwing my hat in the ring, here’s my working list of philosophical principles. Herewith the (very) draft Philosophica Safety Principia for safety critical systems.
Principle 1. Operating a high consequence safety critical technological system requires a decision to accept the residual risk of operation by society.
Principle 2. Where risk exists there also exists an ethical duty on decision makers to eliminate it if practical or, if it is not, to reduce it to an acceptable level.
Principle 3. The greater the potential severity of loss associated with the system the more likely the organisational and societal focus will be on prevention of accidents rather than mitigation of consequences.
Principle 4. When evaluating risk we must be clear as to whether we are dealing with recoverable (ergodic) or non-recoverable (non-ergodic) losses, and therefore whether population and time series statistics are equivalent.
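To make the ergodic versus non-ergodic distinction a little more concrete, here is a minimal sketch. The per-exposure loss probability and the number of exposures are numbers I have made up for illustration, and the function names are mine. It compares the loss seen across a population of systems each exposed once with the chance that a single system, exposed repeatedly, suffers a loss it cannot recover from.

```python
# Illustrative numbers only: p_loss and exposures are assumed values,
# not taken from any real system.


def population_loss_fraction(p_loss: float) -> float:
    """Expected fraction lost when many independent systems are each exposed once."""
    return p_loss


def single_system_survival(p_loss: float, exposures: int) -> float:
    """Probability one system survives every exposure when a loss is unrecoverable."""
    return (1.0 - p_loss) ** exposures


p_loss = 0.01    # assumed chance of an unrecoverable loss per exposure
exposures = 100  # assumed number of exposures over the system's life

ensemble_view = population_loss_fraction(p_loss)
time_view = 1.0 - single_system_survival(p_loss, exposures)

print(f"Population view, one exposure each: {ensemble_view:.1%} lost")
print(f"Single system, {exposures} exposures: {time_view:.1%} chance of loss")
```

With these assumed numbers the population statistic is 1%, while the single system has roughly a 63% chance of being lost at some point over its life. When the loss is non-recoverable the two statistics are not interchangeable.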
Principle 5. Risk is a subjective construct and can never be evaluated in a totally objective fashion.
An assessment of the risk posed by a technological system cannot be divorced from the operational, organisational and social context of such a system.
All risk assessments contain a degree of subjectivity in their estimation.
The subjective elements of risk assessment inevitably introduce cognitive biases into the process, including into subjective expert estimations of such risks.
Principle 6. Operating risk includes known (aleatory), surmised (epistemic) and unknown (ontological) components.
Safety management and engineering must address epistemic and ontological risks not just aleatory risk.
Unknown (ontological) risks may disclose themselves during the life of the system, but some may never be identified through the life of the system.
Principle 7. Decisions as to how to control risks are also most uncertain (due to greater epistemic and ontological uncertainty) during the initial development of a new technology.
The achievement of acceptable safety risk for new technologies cannot therefore be fully validated a priori.
Waiting until experience has accrued will result in difficulty in implementing controls because the technology has become entrenched.
High consequence and highly novel technologies should apply the principle of corrigibility: where mistakes are possible, it should be possible to detect and recover from them easily, quickly and cheaply.
Principle 8. Risk for high consequence systems is dominated by epistemic and ontological risk, i.e. the risks associated with the limits of our knowledge.
The lower the required likelihood of an accident the less we can rely upon the assumption of independence of events to justify a low likelihood of occurrence.
As the severity of consequences increases the design of safety components should be skewed towards deterministically rather than probabilistically assured safety.
High consequence systems should adopt a fail safe design approach so that even in the presence of unidentified design errors the system will fail in a deterministic, visible and safe fashion, rather than in a hidden (and potentially unsafe) fashion.
Safety mechanisms within systems should exhibit economy of mechanism, that is, they should be as small and simple as possible in order to minimise the effort required to verify their properties and the likelihood of errors in their construction.
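The point under Principle 8 about not leaning on independence when the target likelihood is very low can be illustrated with a rough sketch. I have assumed a two-channel redundant design and used the standard beta-factor approximation for common cause failure. The channel failure probability and the beta value are hypothetical numbers chosen for illustration, not figures from any real system.

```python
def independent_pair_failure(q: float) -> float:
    """Probability both channels fail if their failures are assumed fully independent."""
    return q * q


def beta_factor_pair_failure(q: float, beta: float) -> float:
    """Beta-factor approximation: a fraction beta of each channel's failures
    is assumed to be common cause and to take out both channels at once."""
    independent_part = ((1.0 - beta) * q) ** 2
    common_cause_part = beta * q
    return independent_part + common_cause_part


q = 1e-3     # assumed failure probability of a single channel
beta = 0.05  # assumed common cause fraction

claimed = independent_pair_failure(q)
with_common_cause = beta_factor_pair_failure(q, beta)

print(f"Independence assumed: {claimed:.2e}")
print(f"With common cause:    {with_common_cause:.2e}")
print(f"Ratio:                {with_common_cause / claimed:.0f}x")
```

With these assumed values the common cause term dominates, and the pair is roughly fifty times more likely to fail than the independence argument claims. The smaller the number being claimed, the more of it rests on an assumption that is itself uncertain.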
Principle 9. The greater the severity of a potential accident the lower the acceptable likelihood and the greater the epistemic and ontological uncertainty in estimates of that likelihood.
The lower the estimated likelihood of occurrence for accidents the greater the proportion of accidents that will occur due to causes that were not identified or considered in the risk analysis.
Design for high consequence systems should focus upon reducing the severity of possible accidents not just their likelihood.
Design of safety components should minimise the ontological uncertainty through the use of precedented design (and technologies) and economy of mechanism (simplicity and smallness).
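The first corollary of Principle 9 is really just arithmetic, and a small sketch shows it. I have assumed that accidents arise either from causes the risk analysis identified or from causes it missed, and that the rate from missed causes stays fixed while the analysed rate is driven down. Both rates are made-up numbers and the function name is mine.

```python
def unidentified_fraction(analysed_rate: float, unidentified_rate: float) -> float:
    """Share of all accidents that stem from causes the analysis never considered."""
    return unidentified_rate / (analysed_rate + unidentified_rate)


unidentified_rate = 1e-5             # assumed, and by definition not known in practice
analysed_rates = (1e-3, 1e-5, 1e-7)  # assumed estimates as the design is improved

fractions = [unidentified_fraction(a, unidentified_rate) for a in analysed_rates]

for analysed_rate, fraction in zip(analysed_rates, fractions):
    print(f"analysed rate {analysed_rate:.0e}: {fraction:.1%} of accidents unanticipated")
```

Under these assumptions the unanticipated share climbs from about 1% to about 99% as the estimated likelihood falls, even though nothing about the unidentified causes has changed.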
Principle 10. Complicatedness breeds ontological uncertainty and risk. The more complicated a system the more likely it is that an accident will be due to the unintended interactions of components, rather than singular component failures or human errors.
Principle 11. Complex high consequence technological systems are highly optimised towards a tolerance of frequently occurring failure events. Systems that have highly optimised tolerance will be vulnerable to rare combinations of such events and their failure rate will exhibit a heavy tail distribution.
Principle 12. One can never absolutely ‘prove’ the safety of a system as such arguments are inherently inductive. Therefore a rigorous, adversarial attempt to identify flaws in the safety argument, rather than attempting to prove it in some absolute sense, should be adopted.
The above list is in no way complete, and I’d appreciate suggested additional principles. 🙂