Implications of the Knight Leveson experiment for software security

22/01/2014 — 3 Comments

The failure of NVP and the likelihood of correlated security exploits

In 1986, John Knight & Nancy Leveson conducted an experiment to empirically test the assumption of independence in N-version programming. What they found was that the hypothesis of independence of failures in N-version programs could be rejected at the 99% confidence level. While their results caused quite a stir in the software community (see their 'A reply to the critics' for a flavour), what's of interest to me is what they found when they took a closer look at the software faults.

…approximately one half of the total software faults found involved two or more programs. This is surprisingly high and implies that either programmers make a large number of similar faults or, alternatively, that the common faults are more likely to remain after debugging and testing.

Knight & Leveson 1986

To summarise the experimental results: while some common faults could be laid at the feet of what you might call difficult computations, Knight and Leveson found that most correlated failures occurred when the faulty paths had common input-domains.

We conclude that this occurs when the faulty paths have common input-domains. Correlated failures occur when the partial functions computed by the paths are identically wrong. The actual mistakes made, however, need not be similar or logically-related.

Brilliant, Knight & Leveson 1990

Now let's consider system security at the point in a system where input-driven computation occurs. Here a secure system should ideally parse an input for validity and reject invalid inputs. Exploits, that is unexpected input-driven computations due to maliciously crafted inputs, usually occur at this point as well, and rely on manipulating both latent design faults and regular features to subvert this input-parsing security function.
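
To make the parsing point concrete, here's a minimal sketch of the parse-then-compute pattern; the names and the input grammar are hypothetical, mine rather than anything from the literature. Validity is decided by an explicit recognizer up front, so the input-driven computation only ever sees inputs that passed it.

```python
import re

# Hypothetical input language for illustration: an order quantity,
# a decimal integer in the range 1..999, ASCII digits only.
VALID_QUANTITY = re.compile(r"[1-9][0-9]{0,2}")

def parse_quantity(raw: str) -> int:
    """Recognise the input language explicitly; reject everything else."""
    if VALID_QUANTITY.fullmatch(raw) is None:
        raise ValueError(f"rejected invalid input: {raw!r}")
    return int(raw)

def process_order(raw: str) -> int:
    qty = parse_quantity(raw)  # validity is decided here, up front,
    return qty * 10            # so the computation never sees an input
                               # outside the intended domain
```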

What the Knight and Leveson results imply is that security exploits on the input domain will also correlate; that is, an exploit of one system that utilises a specific set of inputs will likely work on another 'like' system, even though what actually breaks inside each will not be the same logical part, and even though the two systems may appear to be designed quite differently. To put it simply, the existence of input-domain correlated failures in software systems strongly implies that the same correlation exists for security exploits, regardless of design differences.
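
As a hedged illustration of the claim (the code and the blind spot are my own hypothetical example, not drawn from the experiment), consider two validators written to quite different designs, one using built-in character classification and one a regex whitelist, that nonetheless compute the identically wrong partial function on the same subdomain of inputs:

```python
import re

def valid_quantity_a(raw: str) -> bool:
    # Design A: built-in character classification plus a range check
    return raw.isdigit() and 0 < int(raw) < 1000

def valid_quantity_b(raw: str) -> bool:
    # Design B: regex whitelist plus a zero check
    return re.fullmatch(r"\d{1,3}", raw) is not None and int(raw) > 0

# Both reject the obvious junk...
assert not valid_quantity_a("42; rm -rf /")
assert not valid_quantity_b("")

# ...but both accept Arabic-Indic digits (the number 42 in another script),
# because str.isdigit() and the regex \d class are both Unicode-aware.
# Two dissimilar designs, one common input subdomain, one identical failure.
attack = "\u0664\u0662"
assert valid_quantity_a(attack) and valid_quantity_b(attack)
```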

From a practical perspective this means that one should not naively rely on arguments of dissimilarity when considering the applicability of security exploits.

3 responses to Implications of the Knight Leveson experiment for software security

  1. 

    Nancy provided advice to our firm on our triple redundant fault tolerant process control computer while she was at UC Irvine in the mid 80's. I'm a UC Irvine alumnus in physics, where we built accelerator detectors. These must be near 100% reliable to capture that one chance in 100 million of seeing the particle interaction.

    Our code was one version, but had "fail to safety" hardware and firmware, validated by SINTEF and TÜV. Wing Toy (AT&T ESS 4/5) (http://goo.gl/ZiiOMf) led our "safety assessment": a firmware state machine assessment of the "conflicting command" process to the digital outputs, which also verified the 3oo3 inputs. We had a 3 -> 2 -> 1 degradation mode for emergency shutdown. The current machine is at http://www.triconex.com

    One outcome was the assessment of partial fault detection for turbine controls and emergency shutdown
    http://www.slideshare.net/galleman/fault-tolerant-systems
    using "fault injection." This is how many of the flight avionics machines were built in the 80's (or their heritage). Today triple redundant flight controls have only one code base, with "fail to safe" hardware and firmware (ASICs and FPGAs).

    • 
      Matthew Squair 22/01/2014 at 9:16 am

      As a slight segue, one advantage I see in a firmware design is that it encourages a state machine design, which tends to bound the scope of what can potentially go wrong, as opposed to software, where there's the potential for Turing-complete 'weird machines' in the code.

      • 

        The state machine approach has "stopping rules" and can be exhaustively tested. FPGAs and ASICs are the next best thing. This is how flight avionics core functions are embedded.
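
To make the "stopping rules" point in the exchange above concrete, here's a minimal sketch (Python standing in for firmware logic, with hypothetical states and events): an explicit state machine's entire behaviour is a finite transition table, so every (state, event) pair can be enumerated and exhaustively tested, and any unanticipated pair can be forced to a safe state.

```python
from itertools import product

STATES = {"RUN", "DEGRADED", "SHUTDOWN"}
EVENTS = {"fault", "clear", "trip"}

# Hypothetical transition table; unlisted (state, event) pairs
# deliberately fall through to the safe state.
TRANSITIONS = {
    ("RUN", "fault"):      "DEGRADED",
    ("RUN", "trip"):       "SHUTDOWN",
    ("DEGRADED", "clear"): "RUN",
    ("DEGRADED", "fault"): "SHUTDOWN",
    ("DEGRADED", "trip"):  "SHUTDOWN",
}

def step(state: str, event: str) -> str:
    # Anything unanticipated fails to safety rather than executing
    # arbitrary 'weird machine' behaviour.
    return TRANSITIONS.get((state, event), "SHUTDOWN")

# Exhaustive test: the whole behaviour space is just |STATES| x |EVENTS|
# pairs, which is what gives testing a stopping rule.
for s, e in product(sorted(STATES), sorted(EVENTS)):
    assert step(s, e) in STATES
```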
