Why probability is not corroboration

The IEC's 61508 standard on functional safety assigns a series of Safety Integrity Levels (SILs), each tied to the achievement of a specific hazardous failure rate. Unfortunately this definition, which ties SILs to a probabilistic metric of failure, contains a fatal flaw.
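
For reference, the failure measures the standard assigns for continuous/high demand mode operation are: SIL 4, a probability of dangerous failure of 10⁻⁹ to 10⁻⁸ per hour; SIL 3, 10⁻⁸ to 10⁻⁷; SIL 2, 10⁻⁷ to 10⁻⁶; and SIL 1, 10⁻⁶ to 10⁻⁵.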

Continue Reading…

Event trees

05/11/2015

I’ve just added the event trees module to the course notes.

System Safety Fundamentals Concept Cloud

System safety course, now with more case studies and software safety!

Have just added a couple of case studies and some course notes on software hazards and integrity partitioning, because hey, I know you guys love that sort of stuff 🙂

Safety course notes

03/11/2015

System Safety Fundamentals Concept Cloud

I have finally got around to putting my safety course notes up, enjoy. You can also find them off the main menu.

Feel free to read and use under the terms of the associated creative commons license. I'd note that these are course notes, so I use a large amount of example material from other sources (because hey, a good example is a good example, right?); where I have a source it's acknowledged in the notes. If you think I've missed a citation or made an error, then let me know.

To err is human, but to really screw it up takes a team of humans and computers…

How did a state of the art cruiser operated by one of the world's superpowers end up shooting down an innocent passenger aircraft? To answer that question (at least in part) here's a case study, part of the system safety course I teach, that looks at some of the causal factors in the incident.

In the immediate aftermath of this disaster there was a lot of reflection, and work done, on how humans and complex systems interact. However one question that has so far gone unasked is simply this: what if the crew of the USS Vincennes had just used the combat system as it was intended? What would have happened if they'd implemented a doctrinal ruleset that reflected the rules of engagement they were operating under and simply let the system do its job? After all it was not the software that confused altitude with range on the display, or misused the IFF system, or was confused by track IDs being recycled… no, that was the crew.

Consider the effect that the choice of a single word can have upon the success or failure of a standard. The standard is DO-278A, and the word is 'approve'. DO-278 is the ground world's version of the aviation community's DO-178 software assurance standard, intended to bring the same level of rigour to the software used for navigation and air traffic management. There's just one tiny difference: while DO-178 uses the word 'certify', DO-278 uses the word 'approve', and in that one word lies a vast difference in the effectiveness of these two standards.

DO-178C has traditionally been applied in the context of an independent certifier (such as the FAA or JAA) who does just that: certifies that the standard has been applied appropriately and that the design produced meets it. Certification is independent of the supplier/customer relationship, which has a number of clear advantages. First, the certifying body is indifferent as to whether the applicant meets the requirements of DO-178C, so its certification carries greater credibility as it is much less likely to suffer from a conflict of interest. Second, because there is one certifying agency there is consistent interpretation of the standard, and the fostering and dissemination of corporate knowledge across the industry through advice from the regulator.

Turning to DO-278A we find that the term 'approve' has mysteriously (1) replaced the term 'certify'. So who, you may ask, can approve? In fact what does approve even mean? To cut a long answer short: anyone can approve, and it means whatever you make of it. What usually results is the standard being invoked as part of a contract between supplier and customer, with the customer then acting as the 'approver' of the standard's application. This has obvious and significant implications for the degree of trust we can place in such an approval. Unlike an independent certifying agency, the customer clearly has a corporate interest in acquiring the system, which may well conflict with the objective of fully complying with the requirements of the standard. Given that 'approval' is granted on a contract basis between two organisations, and is often cloaked in non-disclosure agreements, there is also little to no opportunity for the dissemination of useful learnings as to how to meet the standard. Finally, when dealing with previously developed software the question becomes not just 'did you apply the standard?', but also 'who actually approved your application?' and 'how did they interpret the standard?'.

So what to do about it? To my mind the unstated success factor for the original DO-178 standard was in fact the regulatory environment in which it was used. If you want DO-278A to be more than just a paper tiger then you should also put in place a mechanism for independent certification. In these days of smaller government this is unlikely to involve a government regulator, but there's no reason why (for example) the independent safety assessor concept embodied in IEC 61508 could not be applied, with appropriate checks and balances (2). Until that happens though, don't set too much store by pronouncements of compliance to DO-278.

Final thought: I'm currently renovating our house and have had to employ an independent certifier to sign off on critical parts of the works. If I have to do that for a home renovation, I don't see why some national ANSP shouldn't have to do the same for their bright and shiny toys.

Notes

1. Perhaps Screwtape consultants were advising the committee. 🙂

2. One of the problems with how 61508 implements the ISA is that they're still paid by the customer, which leads in turn to the agency problem. A better scheme would be an industry fund into which all players contribute and from which the ISA is paid.

Meltwater river, Greenland icecap (Image source: Ian Joughin)

Memes, media and drug dealers

In honour of our Prime Minister's use of the drug dealer's argument to justify (at least to himself) why it's OK for Australia to continue to sell coal, when we know we really have to stop, here's an update of a piece I wrote on the role of the media in propagating denialist memes. Enjoy, there's even a public health tip at the end.

PS. You can find Part I and II of the series here.

🙂

The point of an investigation is not to find where people went wrong; it is to understand why their assessments and actions made sense at the time.

Sidney Dekker

  
ZEIT8236 System safety 2015 redux

Off to teach a course in system safety for Navy, which ends up as a week spent at the old alma mater. Hopefully all transport legs will be uneventful. 🙂

USS Thresher

…for my boat is so small and the ocean so huge

For a small close-knit community like the submarine service the loss of a boat and its crew can strike doubly hard. The USN's response to this disaster was both effective and long lasting, doubly impressive given it was implemented at the height of the Cold War. As part of the course that I teach on system safety I use the Thresher as an important case study in organisational failure, and recovery.

Postscript

The RAN's Collins class Subsafe program derived its strategic principles in large measure from the USN's original program. The successful recovery of HMAS Dechaineux from a flooding incident at depth illustrates the success of both the RAN's Subsafe program and its antecedent.

WHS and MIL-STD-882

27/09/2015

Here's a copy of the presentation that I gave at ASSC 2015 on how to use MIL-STD-882C to demonstrate compliance with the WHS Act 2011. The model Australian Workplace Health and Safety (WHS) Act places new and quite onerous requirements upon manufacturers, suppliers and end-user organisations, including the requirement to demonstrate due diligence in the discharge of individual and corporate responsibilities. Traditionally contracts have steered clear of invoking WHS legislation in anything other than the most abstract form; unfortunately such traditional approaches provide little evidence with which to demonstrate compliance with the Act.

The presentation describes an approach to establishing compliance with the WHS Act (2011) using the combination of a contracted MIL-STD-882C system safety program and a compliance finding methodology. The advantages and effectiveness of this approach, in terms of establishing compliance with the Act and the effective discharge of the responsibilities of both supplier and acquirer, are illustrated using a case study of a major aircraft modification program. Limitations of the approach are then discussed, given the significant difference between the decision making criteria of classic system safety and the so far as is reasonably practicable principle.

Matrix (Image source: The Matrix film)

The law of unintended consequences

There are some significant consequences to the principle of reasonable practicability enshrined within the Australian WHS Act. The Act is particularly problematic for risk based software assurance standards, where risk is used to determine the degree of effort that should be applied. In part one of this three part post I'll discuss the implications of the Act for the process industries' functional safety standard IEC 61508, in the second part I'll look at aerospace and its software assurance standard DO-178C, then finally I'll try to piece together a software assurance strategy that is compliant with the Act. Continue Reading…

Source: Technical Lessons from QF32

Here’s a link to version 1.3 of System Safety Fundamentals, part of the course I teach at UNSW. I’ll be putting the rest of the course material up over the next couple of months. Enjoy 🙂

It is a common requirement to either load or update applications over the air after a distributed system has been deployed; for mass market embedded systems it's a fundamental necessity. Of course once you have the ability to load remotely there's a back door you have to be concerned about, and if the software is part of a vehicle's control system or an insulin pump controller the consequences of leaving that door unsecured can be dire. Doing this securely requires us to tackle the insecurities of the communications protocol head on.

One strategy is to insert a protocol 'security layer' between the stack and the application. The security layer then mediates between the application and the stack to enforce the system's overall security policy. For example the layer could confirm (as sketched below):

  • that the software update originated from an authenticated source,
  • that the update had not been modified,
  • that the update itself had been authorised, and
  • that the resources required by the downloaded software conform to any onboard safety or security policy.
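
By way of illustration, here's a minimal sketch of such a layer in Python. To be clear, the manifest format, the device identifier and the use of HMAC-SHA256 are all assumptions invented for the example; a real system would use whatever signing scheme and policy store its security case mandates.

```python
import hashlib
import hmac
import json

SHARED_KEY = b"device-provisioning-key"  # assumption: key provisioned at manufacture
MAX_FLASH_BYTES = 256 * 1024             # assumption: onboard resource policy limit

def verify_update(manifest_bytes: bytes, image: bytes, signature: bytes) -> bool:
    """Security layer gate: runs before an update ever reaches the application."""
    # 1. Authenticated source: signature over manifest + image must check out.
    expected = hmac.new(SHARED_KEY, manifest_bytes + image, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, signature):
        return False
    manifest = json.loads(manifest_bytes)
    # 2. Integrity: the image hash must match the signed manifest.
    if hashlib.sha256(image).hexdigest() != manifest["sha256"]:
        return False
    # 3. Authorisation: the update must be approved for this device type.
    if manifest.get("authorised_for") != "pump-controller-v2":  # hypothetical id
        return False
    # 4. Resource policy: declared footprint must fit the onboard safety policy.
    return manifest["flash_bytes"] <= MAX_FLASH_BYTES
```

The point of the design is that the application above the layer never sees an update that hasn't passed all four gates, and the stack below it needs no modification at all.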

There are also obvious economy of mechanism advantages when dealing with protocols like the TCP/IP monster. Who, after all, wants to mess around with the entirety of the TCP/IP stack, given that Richard Stevens took three volumes to define the damn thing? Similarly, who wants to go through the entire process again when moving from IPv4 to IPv6? 🙂

Interesting documentary on SBS about the Germanwings tragedy; if you want a deeper insight see my post on the dirty little secret of civilian aviation. By the way, the two person rule only works if both of those people are alive.

What burns in Vegas…

Ladies and gentlemen you need to leave, like leave your luggage!

This has been another moment of aircraft evacuation Zen.

Lady Justice (Image source: Jongleur CC-BY-SA-3.0)

Or how I learned to stop worrying about trifles and love the Act

One of the Achilles heels of the current Australian WH&S legislation is that it provides no clear point at which you can stop caring about potential harm. While there are reasons for this, it does mean that we can end up with some theatre-of-the-absurd moments, where someone seriously proposes paper cuts as a risk of concern.

The traditional response to such claims of risk is to point out that the law rarely concerns itself with such trifles. Or, more pragmatically, that as you are highly unlikely to be prosecuted over a paper cut it's not worth worrying about. Continue Reading…

The bond between a man and his profession is similar to that which ties him to his country; it is just as complex, often ambivalent, and in general it is understood completely only when it is broken: by exile or emigration in the case of one’s country, by retirement in the case of a trade or profession.

Primo Levi (1919-87)

Defence in depth

One of the oft-stated mantras of both system safety and cyber-security is that a defence in depth is required if you're really serious about either topic. But what does that even mean? How deep? And depth of what exactly? Jello? Cacti? While such a statement has a reassuring gravitas, in practice it's void of meaning unless you can point to an exemplar design and say: there, that is what a defence in depth looks like. Continue Reading…

Technical debt

05/09/2015

St Briavels Castle Debtors Prison (Image source: Public domain)

Paying down the debt

A great term that I've just come across: technical debt is a metaphor coined by Ward Cunningham to reflect how a decision to act expediently for an immediate reason may have longer term consequences. This is a classic problem during design and development, where we have to balance various 'quality' factors against cost and schedule. The point of the metaphor is that the debt doesn't go away; the interest on that sloppy or expedient design solution keeps getting paid every time you make a change and find it harder than it should be. Turning around and 'fixing' the design in effect pays back the principal that you originally incurred. Failing to pay off the principal? Well, such tales can end darkly. Continue Reading…

Inspecting Tacoma Narrows (Image source: Public domain)

We don’t know what we don’t know

The Tacoma Narrows bridge stands, or rather falls, as a classic example of what happens when we run up against the limits of our knowledge. The failure of the bridge, due to a then unknown torsional aeroelastic flutter mode to which the bridge, with its high span-to-width ratio, was particularly vulnerable, is almost a textbook example of ontological risk. Continue Reading…

Icicles on the launch tower (Image source: NASA)

An uneasy truth about the Challenger disaster

The story of Challenger in the public imagination could be summed up as 'heroic engineers versus wicked managers', which is a powerful myth, but unfortunately just a myth. In reality? Well, the reality is more complex, and the causes of the decision to launch rest in part upon the failure of the engineers participating in the launch decision to clearly communicate the risks involved. Yes, that's right: the engineers screwed up in the first instance. Continue Reading…

Risk managers are the historians of futures that may never be.

Matthew Squair

I’ve rewritten my post on epistemic, aleatory and ontological risk pretty much completely, enjoy.

Qui enim teneat causas rerum futurarum, idem necesse est omnia teneat quae futura sint. Quod cum nem…

[Roughly: he who knows the causes of things to come must know everything that will be; but no one save a god possesses such a faculty]

Cicero, De Divinatione, Liber primus, LVI, 127


Piece of wing found on La RĂ©union Island (Image source: reunion 1ere)


Why this bit of wreckage is unlikely to affect the outcome of the MH370 search

If this really is a flaperon from MH370 then it's good news, in a way, because we can use wind and current data for the Indian Ocean to determine where it might have gone into the water. That in turn could be used to update a probability map of where we think MH370 went down, by adjusting our priors in a Bayesian search strategy, thereby ensuring that all the information we have is fruitfully integrated into the search.
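
To make that concrete, here's a minimal sketch of the update step, assuming a simple gridded probability map; the drift-likelihood function and all the numbers are invented purely for illustration, the real thing would come from ocean current and windage modelling.

```python
import numpy as np

# Prior probability map over the search grid, from the earlier loss-of-aircraft
# analysis. Illustrative only: a uniform 10 x 10 grid.
prior = np.full((10, 10), 1.0 / 100)

# P(debris reaches La Reunion | aircraft down in this cell). Invented stand-in
# for a proper drift model; here it simply favours the grid's western columns.
likelihood = np.fromfunction(lambda row, col: np.exp(-0.5 * col), prior.shape)

# Bayes' rule: posterior is proportional to prior x likelihood, renormalised.
posterior = prior * likelihood
posterior /= posterior.sum()
print(posterior.round(3))  # the updated map to steer search effort by
```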

Well… perhaps it could, if the ATSB were actually applying a Bayesian search strategy, but apparently they're not. So the ATSB is unlikely to get the most out of this piece of evidence, and the only real upside that I can see is that it should shut down most of the conspiracy nut jobs who reckoned MH370 had been spirited away to North Korea or some such. 🙂

We must contemplate some extremely unpleasant possibilities, just because we want to avoid them. 

As quoted in ‘The New Nuclear Age’. The Economist, 6 March 2015

Albert Wohlstetter

Jeep (Image source: Andy Greenberg/Wired)

A big shout out to the Chrysler-Jeep control systems design team: it turns out that flat, un-partitioned architectures are not so secure after all, as security experts Charlie Miller and Chris Valasek demonstrated by remotely taking over a Jeep via the internet and steering it into a ditch. Chrysler has now patched the Sprint/UConnect vulnerability and subsequently issued a recall notice for 1.4 million vehicles, which requires owners to download a security patch onto a USB stick and plug it into their car to update the firmware. So a big well done Chrysler-Jeep guys, you win this year's Toyota Spaghetti Monster prize* for outstanding contributions to embedded systems design.

Continue Reading…

There are no facts, only interpretations…

Friedrich Nietzsche

More woes for OPM, and pause for thought for the proponents of centralized government data stores. If you build it they will come…and steal it.

For no other reason than to answer the rhetorical question. Feel free to share.

How?

The offending PCA serial cable linking the comms module to the motherboard (Image source: Billy Rios)

Hannibal ante portas!

A recent article in Wired discloses how hospital drug pumps can be hacked and the firmware controlling them modified at will. Although in theory the comms module and motherboard should be separated by an air gap, in practice there’s a serial link cunningly installed to allow firmware to be updated via the interwebz.

As the Romans found, once you’ve built a road that a legion can march down it’s entirely possible for Hannibal and his elephants to march right up it. Thus proving once again, if proof be needed, that there’s nothing really new under the sun. In a similar vein we probably won’t see any real reform in this area until someone is actually killed or injured.

This has been another Internet of Things moment of zen.

Two reactors

A tale of another two reactors

There's been much debate over the years as to whether various tolerance-of-risk approaches actually satisfy the legal principle of reasonable practicability. But there hasn't, to my mind, been much consideration of the value of simply adopting the legalistic approach in situations where we have a high degree of uncertainty regarding the likelihood of adverse events. In such circumstances, basing our decisions upon what can turn out to be very unreliable estimates of risk can have extremely unfortunate consequences. Continue Reading…

SFAIRP

The current Workplace Health and Safety (WHS) legislation of Australia formalises the common law principle of reasonable practicability in regard to the elimination or minimisation of risks associated with industrial hazards. Having had the advantage of going through this with a couple of clients, the above flowchart is my interpretation of what reasonable practicability looks like as a process, annotated with cross references to the legislation and guidance material. What's most interesting is that the process is determinedly not about tolerance of risk, but is instead firmly focused on what can reasonably and practicably be done. Continue Reading…
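
For those who don't have the Act to hand, s.18 weighs up: (a) the likelihood of the hazard or risk occurring, (b) the degree of harm that might result, (c) what the person knows, or ought reasonably to know, about the hazard and the ways of eliminating or minimising it, (d) the availability and suitability of those ways, and (e) only after all of the above, whether the cost of a control is grossly disproportionate to the risk.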

Cyber security (Image Source: IT-Lex, via Google Images)

Safety versus security

There is a certain school of thought that views safety and security as essentially synonymous, and therefore holds that the principles of safety engineering are directly applicable to security, and vice versa. You might caricature this belief as the management idea that all one needs to do to generate a security plan is to take an existing safety plan and replace 'safety' with 'security' and 'hazard' with 'threat'. A caricature, yes, but one not that far removed from reality 🙂

Continue Reading…

If you're interested in observation selection effects, Nick Bostrom's classic on the subject is (I now find out) available online here. A classic example is Wald's work on aircraft survivability in WWII: a naive observer would seek to protect those parts of returning aircraft that were most damaged; Wald's insight was that these were in fact the least critical areas of the aircraft, and that the areas not damaged were the ones that should be reinforced.
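
A toy simulation makes the selection effect obvious; the zones and per-hit survival probabilities below are invented for illustration.

```python
import random

random.seed(1)
# Survival probability given a hit in each zone: wing and fuselage hits are
# survivable, engine and cockpit hits usually bring the aircraft down.
SURVIVAL = {"wings": 0.9, "fuselage": 0.9, "engine": 0.2, "cockpit": 0.2}

observed_damage = {zone: 0 for zone in SURVIVAL}
for _ in range(10_000):
    zone = random.choice(list(SURVIVAL))       # hits land uniformly at random
    if random.random() < SURVIVAL[zone]:       # the aircraft survives the hit...
        observed_damage[zone] += 1             # ...so we see its damage at base

print(observed_damage)
# Wings and fuselage dominate what we observe. The naive read: armour them.
# Wald's read: the zones rarely seen damaged on returners are the fatal ones.
```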

ASSC 2015 Brisbane

29/05/2015

  
Just attended the Australian System Safety Conference; the venue was the Customs House, right on the river. Lots of excellent speakers and interesting papers; I particularly enjoyed Drew Rae's on tribalism in system safety. The keynotes on resilience by John Bergstrom and cyber-security by Chris Johnson were also very good. I gave a presentation on the use of MIL-STD-882 as a tool for demonstrating compliance with the WHS Act, a subject that only a mother could love. Favourite moment? Watching the attendees' faces when I told them that 61508 didn't comply with the law. 🙂

Thanks again to Kate Thomson and John Davies for reviewing the legal aspects of my paper. Much appreciated guys.

Just added a short case study on the Patriot software timing error to the software safety micro course page. Peter Ladkin has also subjected the accident to a Why Because Analysis.
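
As a back-of-the-envelope check, here's the arithmetic behind that error, assuming (per the GAO report) that the system clock counted tenths of a second and the conversion constant 1/10 was chopped to 24 bits, leaving 23 fractional bits.

```python
# 1/10 has no exact binary representation, so the stored constant fell short.
chopped = int(0.1 * 2**23) / 2**23   # ~0.099999905 once chopped
error_per_tick = 0.1 - chopped       # ~9.5e-8 s lost on every 0.1 s tick
ticks = 100 * 3600 * 10              # 100 hours of continuous operation
drift = error_per_tick * ticks
print(f"clock drift after 100 h: {drift:.2f} s")  # ~0.34 s, enough to shift
# the range gate by hundreds of metres at Scud closing velocities
```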

iVote logo

The best defence of a secure system is openness

Ensuring the security of high consequence systems rests fundamentally upon the organisation that sustains the system. Organisational dysfunction can and does manifest itself as an inability to deal with security effectively. To that end, the 'shoot the messenger' response of the NSW Electoral Commission to reports of security flaws in the iVote electronic voting system does not bode well for that organisation's ability to deal with such challenges. Continue Reading…

The Electronic Frontier Foundation reports that a flaw in the iVote system developed by the NSW Electoral Commission meant that up to 66,000 online votes were vulnerable to online attack. Michigan computer science professor J. Alex Halderman and University of Melbourne research fellow Vanessa Teague, who had previously predicted problems, found a weakness that would have allowed an untraceable man-in-the-middle attack. The untraceable nature of that attack is important, and we'll get back to it. Continue Reading…

Rocket landing attempt (Image source: SpaceX)

How to make rocket landings a bit easier

No one should underestimate how difficult landing a booster rocket is, let alone onto a robot barge sitting in the ocean. The booster has to decelerate to landing speed on a hatful of fuel, then maintain a fixed orientation to the deck while it descends, all the while counteracting the dynamic effects of a tall thin flexible airframe, fuel slosh, c of g changes, wind, and finally landing gear bounce when you do hit. It's enough to make an autopilot cry. Continue Reading…
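
To get a feel for just how unforgiving the deceleration problem is, here's a one-dimensional back-of-the-envelope; the thrust and descent figures are invented for illustration, not Falcon 9 data.

```python
# 1-D 'hoverslam': at what altitude must the engine light so that velocity
# just reaches zero at the deck? All figures illustrative.
g = 9.81          # gravity, m/s^2
v0 = 250.0        # descent speed at ignition, m/s
a_thrust = 30.0   # vehicle acceleration under thrust, m/s^2

a_net = a_thrust - g                  # deceleration left after fighting gravity
burn_altitude = v0**2 / (2 * a_net)   # from v^2 = v0^2 - 2*a*h with v = 0
burn_time = v0 / a_net
print(f"ignite at ~{burn_altitude:.0f} m, burn for ~{burn_time:.1f} s")
# Light a moment too late and no amount of throttle recovers the deficit;
# too early and the rocket stops high, then starts falling again.
```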

Once again my hometown of Newcastle is being battered by fierce winds and storms, in yet another 'storm of the century'; the scene above is just around the corner from my place in the inner city suburb of Cooks Hill. We're now into our second day of category two cyclonic winds and rain, with many parts of the city flooded and without power. Dungog, a small town to the north of us, is cut off and several houses there have been swept off their piers; three deaths are reported. My 8 minute walk to work this morning was an adventure to say the least.

Here’s a companion tutorial to the one on integrity level partitioning. This addresses more general software hazards and how to deal with them. Again you can find a more permanent link on my publications page. Enjoy 🙂

A short tutorial on the architectural principles of integrity level partitioning. I wrote this a while ago, but the fundamentals remain the same. Partitioning is a very powerful design technique, but if you apply it you also need to be aware that it can interact with all sorts of other system design attributes, like scheduling and fault tolerance to name but two.

The material is drawn from many different sources, which unfortunately at the time I didn't reference, so all I can do is offer a general acknowledgement here. You can also find a more permanent link to the tutorial on my publications page.


The GAO has released its latest audit report on the FAA's NextGen air traffic management system. The report updates the GAO's original and, when read in conjunction with it, gives an excellent insight into how difficult cybersecurity can be across a national infrastructure program. Like really, really difficult. At least they're not trying to integrate military and civilian airspaces at the same time 🙂

My analogy is that on the cyber security front we're effectively asking the FAA to hold a boulder over its head for the next five years or so without dropping it. And if security isn't built into the DNA of NextGen? Well, I leave it to you, dear reader, to ponder the implications of that in this ever more connected world of ours.

In celebration of upgrading the site to WP Premium here’s some gratuitous eye candy 🙂

A little more seriously, pilot induced oscillation (PIO) is one of those problems that, contrary to what the name might imply, requires one to treat the aircraft and pilot as a single control system.
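
A crude discrete-time sketch shows why: put even a simple delayed 'pilot' in the loop with an otherwise benign aircraft and the combined system can oscillate divergently. The gains, delay and damping below are invented for illustration.

```python
# Toy pilot-in-the-loop model. The 'pilot' applies a proportional correction,
# but reacts to the pitch error as it was `delay_steps` ago. With no delay
# this loop settles; with the lag it oscillates and grows -- classic PIO.
dt, delay_steps, gain = 0.05, 6, 2.0   # illustrative values (0.3 s pilot lag)
theta, theta_dot = 0.1, 0.0            # initial pitch error, rad
history = [theta] * delay_steps        # what the pilot currently perceives

for _ in range(400):                   # 20 seconds of flight
    stick = -gain * history[0]         # pilot corrects stale information
    theta_dot += stick * dt            # stick commands pitch acceleration
    theta_dot *= 0.99                  # a little aerodynamic damping
    theta += theta_dot * dt
    history = history[1:] + [theta]    # perception lags reality

print(f"pitch error after 20 s: {theta:+.2f} rad")  # grows instead of settling
```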

The problem with people

The HAL effect, named after the eponymous anti-hero of Stanley Kubrick and Arthur C. Clarke's film 2001, is the tendency for designers to implicitly embed their cultural biases into automation. While such biases are undoubtedly a very uncertain guide, it might be worthwhile to look at the 2001 Odyssey mission from HAL's perspective for a moment. Here we have the classic long duration space mission, with a minimalist two man complement for the cruise phase. The crew and the ship are on their own; in fact they're about as isolated as it's possible for human beings to be, and help is a very, very long way away. Now from HAL's perspective humans are messy, fallible creatures prone to making basic errors in even the most routine of tasks. Not to mention that they annoyingly use emotion to inform even the most basic of decisions. Then there's the added complication that they're social creatures, apt in even the most well matched of groups to act in ways that a dispassionate external observer could only consider confusing and often dysfunctional. Finally they break, sometimes in ways that can actively endanger others and the mission itself.

So from a mission assurance perspective would it really be appropriate to rely on a two man crew in the vastness of space? The answer is clearly no; even the most well adjusted of cosmonauts can exhibit psychological problems when isolated in deep space. While a two man crew may be attractive from a cost perspective, it's still vulnerable to a single point of human failure. Or, to put it more brutally, murder and suicide are much more likely to succeed in small crews. Such scenarios, however dark they may be, need to be guarded against if we intend to use a small crew. But how to do it? If we add more crew to the cruise phase complement then we also add all the logistics tail that goes along with them, and our mission may become unviable. Even if cost were not a consideration, small groups isolated for long periods are prone to yet other forms of psychological dysfunction (1). Humans, it seems, exhibit a set of common mode failures that are difficult to deal with, so what to do?

Well, one way to guard against common mode failures is to implement diverse redundancy in the form of a cognitive agent whose intelligence is based on vastly different principles to human affect-driven processing. Of course to be effective we're talking a high end AI, with a sufficient grasp of theory of mind and the subtleties of human psychology and group dynamics to make usefully accurate predictions of what the crew will do next (2). With that insight goes the requirement for autonomy in vetoing illogical and patently hazardous crew actions, e.g. "I'm sorry Dave, but I'm afraid I can't let you override the safety interlocks on the reactor fuel feed…". From that perspective we might have some sympathy for HAL's reaction to his crew mates plotting his cybernetic demise.

Which may all seem a little far fetched; after all, an AI of that sophistication is another twenty to thirty years away, and long duration deep space missions are probably that far away as well. On the other hand there's currently a quiet conversation going on in the aviation industry about the next step for automation in the cockpit: one pilot in the cockpit of large airliners. After all, so the argument goes, pilots are expensive beasts and with the degree of automated support available today, surely we don't need two in the cockpit? Well, if we're thinking purely about economics then sure, one could make that argument, but on the other hand as the awful reality of the Germanwings tragedy sinks in we also need to understand that people are simply not perfect, and that sometimes (very rarely (3)) they fail catastrophically. Given that we know that reducing crew levels to two increases the risk of successful suicide by airliner, one could ask what happens to the risk if we go to single pilot operations? I think we all know what the answer would be.

Where is a competent AI (HAL 9000) when you need one? 🙂

Notes

1. From polar exploration we know that small exploratory teams of three persons are socially unstable and should be avoided. Which then drives the team size to four.

2. As an aside, the inability of HAL to understand the basics of human motivation always struck me as a false note in Kubrick's 2001. An AI as smart as HAL apparently was, yet lacking even an undergraduate understanding of human psychology? Maybe not.

3. Remember that we are in the tail of the aviation safety program, where we are trying to mitigate hazards whose likelihoods are very, very rare. However, given that they aren't mitigated, they dominate the residual risk.

Comet (Image source: Public domain)

Amidst all the soul searching and pontificating about how to deal with the problem of pilots 'suiciding by airliner', you are unlikely to find any real consideration of how we arrived at this pass. There is, as it turns out, a simple one word answer, and that word is efficiency. Back when airliners first started to fly they needed a big aircrew; for example, on the Comet you'd find a pilot, copilot, navigator and flight engineer. Now while that's a lot of manpower to pay for, it did possess one hidden advantage: with a crew size greater than two it's very, very difficult (OK, effectively impossible) for any one member of the flight crew to attempt suicide. If you think I exaggerate then go see if there has ever been a successful suicide by airliner where there were three or more aircrew in the cockpit. Nope, none. But the aviation industry is one driven by cost. Each new generation of aircraft needs to be cheaper to operate, which means the airlines and airframe manufacturers are locked in a ruthless evolutionary arms race to do more with less. One of the easiest ways to reduce operating costs is to reduce the number of aircrew needed to fly the big jets: fewer aircrew plus greater automation is an equation that delivers more efficient operations. And before you, the traveller, get too judgemental about all this, just remember that the demand for cost reduction is in turn driven by our expectation as consumers that airlines can provide cheap mass airfare for the common man.

So we've seen the number of aircrew slowly reduce over the years: first the navigator went, then the flight engineer, until we finally arrived at our current standard two man flight crew. There's just one small problem with this: if one of those pilots wants to dispose of the other, there's not a whole lot that can be done to prevent it. In our relentless pursuit of efficiency we have inadvertently eliminated a safety margin that we didn't even realise was there. So what can we really do about it? Well, the simple 'we know it works' answer is to go back to three crew in the cockpit, which effectively eliminates the hazard; of course that's also a solution that's unlikely to be taken up. In the absence of going back to three man crews, well, we get what we're currently getting: aspirational statements about better management of stress and depression in aircrew, or the use of cabin crew to enforce no-go-alone rules. But when that cockpit door is closed it's still one on one, and all such measures do, in the final analysis, is reduce the likelihood of the hazard by some hard to quantify amount; they don't eliminate it. As long as we fly two man crews behind armoured doors the possibility, and therefore the hazard, unfortunately remains.

Happy flying 🙂

Germanwings crash