The concept of irreversible commands has been around for a long time in the safety and aerospace communities, but why are such commands significant from a safety perspective?
Working on the safety of aerospace systems, you come across the concept of irreversible commands (1) from time to time.
“The flight hardware and software shall be designed to preclude the unintentional execution of critical or irreversible commands…”
NASA Spacecraft Specification
Somewhat obviously, the problem with irreversible commands is that the function they initiate is irreversible. If that function is potentially hazardous, then inadvertent command inputs become a hazard. To reduce the likelihood of failures in the input channel initiating an irreversible function, N redundant ‘ANDed’ command paths can be used (Squair 2006), ensuring that N-1 component failures will not initiate the function. But even with such redundant command channels, if a single operator input could still initiate the function, we retain the potential for a single point of failure (2) in the system.
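The ANDed redundant command paths can be illustrated with a minimal sketch. The class and function names here (`CommandChannel`, `fire_permitted`) are hypothetical, and real implementations would be in hardware or certified embedded software; the sketch only shows the voting logic, assuming channel failures manifest as a channel spuriously asserting.

```python
# Hypothetical sketch: an irreversible 'fire' function gated behind N
# independent command channels, ANDed so that failures of up to N-1
# channels (spuriously asserting) cannot trigger it on their own.
from dataclasses import dataclass

@dataclass
class CommandChannel:
    """One independent path carrying the operator's command."""
    name: str
    asserted: bool = False

def fire_permitted(channels):
    """Issue the irreversible command only if every channel agrees."""
    return all(ch.asserted for ch in channels)

channels = [CommandChannel(f"path_{i}") for i in range(3)]
channels[0].asserted = True          # a single failed-true channel...
assert not fire_permitted(channels)  # ...cannot initiate the function
```

Note that ANDing improves safety (resistance to spurious initiation) at the cost of availability: any single failed-false channel will also block a legitimate command, which is usually the acceptable trade for irreversible functions.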
Slips, Mistakes and Single Point Human Error
The problem with human error and irreversible commands is that humans make errors quite frequently, but they also use feedback from the environment to detect and correct them. Unfortunately, irreversible commands remove the ability to correct an error, leaving a high base rate of human error combined with the potentially dire consequences of such an error.
Using the model of human error developed by James Reason (1990), one can broadly categorise human errors as mistakes (errors in intention) and slips (errors in execution). Of particular interest in this discussion are slips that can result in a single erroneous output, because such slips are a recurring element of accident causation (3). Among these, ‘description’ errors, where correct actions are applied to the wrong things (4), are a candidate error type that could cause a single point human error.
So how common would this sort of human error be? A likely scenario is that an operator performing a routine task makes an error and throws the wrong switch. As it turns out, in systems with standardised switch layouts description errors are fairly common; for example, an operator may confuse one switch with another in a bank of switches (5) and throw the wrong one.
One approach to rectifying this is simply to code (identify) controls and/or group them functionally, reducing the ambiguity of controls and the likelihood of incorrect selection (6). But although this reduces the likelihood of mis-selection, we are still left with a potential single point of failure (SPOF) action. Another approach is to introduce a series of actions that must be completed before the irreversible command can be issued, which eliminates the SPOF action. For example, one could require an independent ‘arm’ or ‘release consent’ operator input (7) followed by a separate ‘initiate/fire’ operator command. In some circumstances the master arm input itself may be guarded and require two separate physical actions to operate. This further increases the length and difficulty of the activation task, as well as protecting against the error of inadvertently knocking the switch to the on position. In the most extreme case we may require two operators to independently input a command, or require confirmation of the command’s validity by another operator prior to execution.
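The arm-then-fire interlock can be sketched as a small state machine. This is a hypothetical illustration (the `FireInterlock` class and its method names are inventions for this sketch, not any real fire-control API): the irreversible command requires a separate ‘arm’ consent before a ‘fire’ input is honoured, so no single operator action is a SPOF.

```python
# Hypothetical sketch of the arm-then-fire interlock described above.
class FireInterlock:
    def __init__(self):
        self.armed = False

    def arm(self):
        """Independent 'arm' / release-consent input."""
        self.armed = True

    def disarm(self):
        """Returning to the safer state takes only a single action."""
        self.armed = False

    def fire(self):
        """The 'initiate' input is ignored unless consent was given."""
        if not self.armed:
            return False     # a single slip on 'fire' alone does nothing
        self.armed = False   # consume the consent; re-arm for each shot
        return True
```

A stray `fire()` with no preceding `arm()` is simply ignored, and consuming the consent on each firing means a second slip cannot ride on a stale arm state. Note the asymmetry: arming takes two steps, disarming one, which embodies the philosophy discussed below.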
Note that requesting operator confirmation of the validity of a command does not constitute a true independent actuation step. Researchers have repeatedly shown that such confirmations tend to be caught up in the same mistake of intent that initiated the command in the first place.
The overall philosophy may be summarised as ‘actions that could cause difficulties should be difficult to action’, or simply that moving the system towards an irreversible, and potentially more hazardous, state should always take more effort than moving it into a safer state. The more effort required, the more an operator must think about the need for that action and the less likely it is to be an ‘automatic’ response.
Reason, James, Human Error, Cambridge University Press, New York, 1990, ISBN 9780521314190.
Dismukes, K., Berman, B., Loukopoulos, L., The Limits of Expertise: Rethinking Pilot Error and the Causes of Airline Accidents, Ashgate Publishing, 2007.
1. Such commands can (for example) include the firing of thermal batteries, launch, stage separation, pyrotechnic firing and weapon ejection or fire commands.
2. If a command channel consists of a serial chain of components then the failure of a single component could initiate the safety critical function.
3. Slips and oversights were implicated by Dismukes, Berman and Loukopoulos (2007) in two of six overlapping error patterns for aviation accidents. These error patterns involved the conduct of highly practised tasks under both routine and challenging (high stress) conditions.
4. As an example, pilots of the Fokker F27 were prone to such an error due to the design of the brake and rudder pedal layout. The wheel brakes were situated just above the rudder pedals, and pilots intending to apply the brakes on landing frequently applied the rudder pedals instead.
5. Controls in the centre of a bank are the most difficult to distinguish from their neighbours.
6. An example of this effect is the number of pilots who selected wheels up rather than flaps down during landing because of the close location and similar feel of these control levers. This led to the FAA specifying shape coding to differentiate the controls, whereupon the rate of pilot description errors fell off dramatically.
7. This ‘arm’ command may also be AND gated with other environmental or safety signals to decrease the likelihood of it being inappropriately given.