**Why the risk matrix?**

For new systems we generally do not have statistical data on accidents, and high consequence events are, we hope, quite rare, leaving us with a paucity of information. So we usually end up basing any risk assessment upon low base rate data and falling back upon some form of subjective (and qualitative) method of risk assessment.

Risk matrices were developed to guide such qualitative risk assessments and decision making, and the form of these matrices is based on a mix of decision and classical risk theory. The matrix is widely described in safety and risk literature and has become one of the less questioned staples of risk management.

Despite this there are plenty of poorly constructed and ill thought out risk matrices out there, in both the literature and standards, and many users remain unaware of the degree of epistemic uncertainty that the use of a risk matrix introduces. So this post attempts to establish some basic principles of construction as an aid to improving the state of practice and understanding.

**Creating the Matrix**

In 1711 Abraham de Moivre came up with the mathematical definition of risk as:

The Risk of losing any sum is the reverse of Expectation; and the true measure of it is, the product of the Sum adventured multiplied by the Probability of the Loss.

Abraham de Moivre, De Mensura Sortis, 1711 in the Ph. Trans. of the Royal Society

Based on de Moivre’s definition we can define a series of curves that represent the same risk level (termed iso-risk contours) on a two dimensional surface. While this is mathematically correct it’s difficult for people to use in making qualitative evaluations. So decision theorists took the iso-risk contours (actually the log of these curves) and zoned them into judgementally tractable cells (or bins) to form the risk matrix.

In a risk matrix each cell notionally represents a point on the iso-risk curve, and steps in the matrix define the edges of ‘risk zones’. We also usually plot the curve using log log axes, which provide straight line contours.
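As a rough illustration of why log log axes straighten the contours, the sketch below takes risk as probability multiplied by severity (per de Moivre's definition) and checks the slope between points on one contour. The risk value and the probabilities are hypothetical numbers chosen only for the example.

```python
import math

def iso_risk_severity(risk, probability):
    """Severity that yields the given risk at the given probability (risk = p * s)."""
    return risk / probability

# Hypothetical contour: a constant risk of 1.0 loss unit per exposure period.
RISK = 1.0
probabilities = [1e-1, 1e-2, 1e-3, 1e-4]
points = [(p, iso_risk_severity(RISK, p)) for p in probabilities]

# Slope between successive points in log10-log10 space.
slopes = [
    (math.log10(s2) - math.log10(s1)) / (math.log10(p2) - math.log10(p1))
    for (p1, s1), (p2, s2) in zip(points, points[1:])
]

for (p, s), slope in zip(points[1:], slopes):
    print(f"p={p:g}  s={s:g}  slope to previous point={slope:.3f}")
```

Every segment has the same slope, so on log log axes the contour is a straight line.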

This binning is intended to make qualitative decisions as to severity or likelihood more tractable as human beings find it easier to select qualitative values from amongst such bins.

But unfortunately binning also introduces ambiguity into the risk assessment. If you look at the example given above you’ll see that the iso-risk contour runs through the diagonal of the cell, so ‘off diagonal’ the cell risk is lesser or greater depending on which side of the contour you’re looking at.

So be aware that when you bin a continuous risk contour you pay for easier decision making with an increase in epistemic uncertainty over the resultant risk assessment.
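To get a feel for the size of that uncertainty, here is a minimal sketch that assumes a cell one decade wide on each axis and risk equal to probability multiplied by severity. The cell boundaries are hypothetical values, not taken from any standard.

```python
def cell_risk_range(p_low, p_high, s_low, s_high):
    """Return (min, max) risk across a cell, taking risk = probability * severity."""
    return p_low * s_low, p_high * s_high

# Hypothetical cell: one decade wide in likelihood and one decade wide in severity.
lowest, highest = cell_risk_range(p_low=1e-3, p_high=1e-2, s_low=1e5, s_high=1e6)
spread = highest / lowest

print(f"risk within the cell runs from {lowest:g} to {highest:g}")
print(f"ratio of highest to lowest: {spread:g}")
```

Under these assumptions two hazards placed in the same cell can differ in risk by roughly two orders of magnitude.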

**Scaling likelihood and severity**

The next problem that faces us in developing a risk matrix is assigning a scale to the severity and likelihood bins. If we have a linear (log log) plot then the value of each succeeding bin’s median point should go up by an order of magnitude. So in qualitative terms we would then define ‘probable’ in the example above to be 10 times as likely as ‘remote’, ‘catastrophic’ 10 times as severe as ‘critical’ and so on.

There are two good reasons to adopt such a scaling. The first is that it’s an established technique to avoid qualitative under-estimation of values, because people find it easier to discriminate between values separated by an order of magnitude than between values on a linear scale. The second is that if we have a matrix with linear iso-risk contours then (by definition) the scales must also be logarithmic to comply with de Moivre’s equation. In fact the contours should always have a slope of negative one.
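A small sketch of what decade scaling implies for the matrix itself: if bin mid-points step by a factor of ten on both axes, then moving one bin up in likelihood and one bin down in severity leaves the risk unchanged, so cells of equal risk line up along the anti-diagonals. The mid-point values below are hypothetical and the bins are deliberately left unlabelled.

```python
import math
from collections import defaultdict

# Hypothetical bin mid-points, one decade apart, ordered from low to high.
likelihood_mid = [10.0 ** -k for k in range(5, 0, -1)]  # 1e-5 ... 1e-1
severity_mid = [10.0 ** k for k in range(3, 7)]         # 1e3 ... 1e6

# Group the log10 risk of each cell by its anti-diagonal (i + j constant).
groups = defaultdict(list)
for i, p in enumerate(likelihood_mid):
    for j, s in enumerate(severity_mid):
        groups[i + j].append(round(math.log10(p * s)))

for diagonal in sorted(groups):
    print(f"anti-diagonal {diagonal}: log10(risk) values {groups[diagonal]}")
```

Each anti-diagonal contains a single log risk value, which is the matrix equivalent of a contour with a slope of negative one.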

Unfortunately, you’ll also find plenty of examples of linear risk contour matrices with non-logarithmic axes that violate this rule; the Australian standard for risk management, AS 4360, uses such an example. While such ill-formed matrices may reflect a conscious decision making strategy, for example sensitivity to extreme severities, they *don’t* reflect de Moivre’s equation and the classical definition of risk.

Another decision in designing a risk matrix is how many cells to have. Too few and the evaluation is too coarse; too many and the decision making becomes bogged down in detailed discrimination (which, as noted above, is hardly warranted). The usual happy compromise sits at around 5 to 7 bins on the vertical and horizontal, which for a log log scale will give a wide field of discrimination.

**The flatland problem and the semiotics of colour**

The problem with a matrix is that it can only represent two dimensions of information on one graph. Thus a simple matrix may allow us to communicate risk as a function of frequency (F) and severity (S) but we still need a way to graphically associate decision rules with the identified risk zones. The traditional method adopted to do this is to use colour to designate the specific risk zones and a colour key to indicate the associated action required. As the example above illustrates, the various decision zones of risk are colour coded, with the highest risks being given the most alarming colours.

While colour has inherently a strong semiotic meaning, and one intuitively understood by both expert and lay person alike, there is also a potential trap in that by setting the priorities using such a strong code we are subtly forcing an imperative. In these circumstances a colourised matrix can become a tool of persuasion rather than one of communications (Roth 2012).

One must therefore carefully consider whether the matrix is intended as a tool to assist in decision making, where a degree of persuasion may be appropriate, or whether it is one intended to assist in more open ended communication, discussion and dialogue. If the answer is the latter then other forms of visualisation may be more appropriate.

**Ensure (weak) consistency of ordering**

A properly formed matrix’s risk zones should also exhibit what is called ‘weak consistency’, that is, the qualitative ordering of risk as defined by the various (coloured) zones and their cells ranks various risks (from high to low) in roughly the same way that a quantitative analysis would (Cox 2008).

In practical terms what this means is that if you find there are areas of the matrix where the same effort will produce a greater or lesser improvement of risk when compared to another area (Clements 96) you have a problem of construction. You should also (in principle) never be able to jump across two risk decision zones in one movement.

For example, if a safety device reduces the likelihood of occurrence of a hazard by a defined factor we wouldn’t expect the risk reduction to depend upon the severity. If it does, the matrix exhibits exactly this kind of inconsistency.
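A simple structural check along these lines can be sketched as follows. It is a simplified reading of the rules above rather than Cox's formal definition: it only tests that the zone never falls as likelihood or severity increases, and that a single step never crosses more than one zone boundary. The function name and the two example grids are hypothetical.

```python
def check_zone_grid(zones):
    """Return a list of problems found in a grid of integer risk zones.

    zones[i][j] is the zone for likelihood bin i and severity bin j, with
    both indices increasing towards higher likelihood and severity, and
    larger zone numbers meaning higher risk.
    """
    problems = []
    rows, cols = len(zones), len(zones[0])
    for i in range(rows):
        for j in range(cols):
            # Compare each cell with its neighbour one step up each axis.
            for di, dj in ((1, 0), (0, 1)):
                ni, nj = i + di, j + dj
                if ni >= rows or nj >= cols:
                    continue
                step = zones[ni][nj] - zones[i][j]
                if step < 0:
                    problems.append(f"zone falls from ({i},{j}) to ({ni},{nj})")
                elif step > 1:
                    problems.append(f"zone jumps {step} levels from ({i},{j}) to ({ni},{nj})")
    return problems

good = [
    [0, 0, 1, 1],
    [0, 1, 1, 2],
    [1, 1, 2, 2],
    [1, 2, 2, 3],
]
bad = [
    [0, 0, 1, 3],  # top-right cell skips a zone and then falls again below it
    [0, 1, 1, 2],
    [1, 1, 2, 2],
    [1, 2, 2, 3],
]

print(check_zone_grid(good))
print(check_zone_grid(bad))
```

A grid that passes this check is not guaranteed to be weakly consistent in the quantitative sense, but one that fails it almost certainly has a problem of construction.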

**Dealing with boundary conditions**

In some standards, such as MIL-STD-882, an arbitrary upper bound may be placed on severity, in which case what happens when a severity exceeds the ‘allowable’ threshold? For example, should we discriminate between a risk to more than one person versus one to a single person? For specific domains this may not be a problem, but for others where mass casualty events are a possibility it may well be of concern.

In this case, if we don’t wish to add columns to the matrix, we may define a sub-cell within our matrix to reflect that this is a higher level of risk than the column nominally represents. Alternatively we could define the severity for the ‘I’ column as a range of severities whose values include, at the median point, the mandated severity. So for example the catastrophic hazard bin range would be from 1 fatality to 10 fatalities.

Looking at likelihood one should also include a likelihood of ‘impossible’ so that risks that have been removed can be recorded rather than deleted. Just because we’ve retired a hazard today, doesn’t mean a subsequent design change or design error won’t resurrect the hazard to haunt us.

**Calibrating the risk matrix**

Risk matrices are used to make decisions about risk, and their acceptability. But how do we calibrate the risk matrix to represent an understandably acceptable risk? One way is to pick a specific bin and establish a calibrating risk scenario for it, usually drawn from the real world, for which we can argue the risk is considered broadly acceptable by society (Clements 96).

So in the matrix above we could pick cell IE and equate that to an acceptable real world risk that could result in the death of a person (a catastrophic loss). For example, ‘*the risk of death in a motor vehicle accident on the way from and to your work on main arterial roads under all weather conditions cumulatively over a 25 year working career*’. This establishes the edge of the acceptable risk zone by definition and allows us to define other risk zones.

In general it’s always a good idea to provide a description of what each qualitative bin means so that people understand the meaning. If you need to, you can also include numerical ranges for likelihood and severity, such as the loss values in dollars, numbers of injuries sustained and so on.

**Define your exposure**

One should also consider and define the number of units, people or systems exposed. Clearly there is a difference between the cumulative risk posed by, say, one aircraft and a fleet of one hundred aircraft in service, or between one bus and a thousand cars.

What may be acceptable at an individual level (for example a road accident) may not be acceptable at an aggregated or societal level, and risk curves may need to be adjusted accordingly. MIL-STD-882C offers a simple example of this approach.
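The effect of exposure can be sketched with a little arithmetic. The example below assumes identical units whose losses are statistically independent, which is a strong simplification, and uses a hypothetical per-unit probability of 1 in 10,000 over some fixed period.

```python
def fleet_probability(p_single, units):
    """Probability of at least one loss across `units` independent, identical units."""
    return 1.0 - (1.0 - p_single) ** units

# Hypothetical per-unit probability of loss over the period of interest.
P_SINGLE = 1e-4

p_one = fleet_probability(P_SINGLE, 1)      # a single aircraft
p_fleet = fleet_probability(P_SINGLE, 100)  # a fleet of one hundred

print(f"one unit:  {p_one:.6f}")
print(f"100 units: {p_fleet:.6f}")
```

On these assumptions the fleet figure is close to one hundred times the single unit figure, which on a decade-scaled matrix is a shift of two likelihood bins.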

**And then define your duration**

Finally and perhaps most importantly you always need to define the duration of exposure for likelihood. Without it the statement is at best meaningless and at worst misleading as different readers will have different interpretations. A 1 in 100 probability of loss over 25 years of life is a very different risk to a 1 in 100 over a single 4 hour mission.
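To make the comparison concrete, the sketch below converts both statements to an hourly rate. It assumes a constant hazard rate over the exposure period and counts 25 years as continuous calendar hours; both are simplifying assumptions rather than anything prescribed by a standard.

```python
import math

HOURS_PER_YEAR = 365.25 * 24

def hourly_rate(probability, exposure_hours):
    """Constant hazard rate per hour that gives `probability` over `exposure_hours`."""
    return -math.log(1.0 - probability) / exposure_hours

career = hourly_rate(0.01, 25 * HOURS_PER_YEAR)  # 1 in 100 over 25 years
mission = hourly_rate(0.01, 4)                   # 1 in 100 over a 4 hour mission
ratio = mission / career

print(f"25 year exposure: {career:.3e} per hour")
print(f"4 hour mission:   {mission:.3e} per hour")
print(f"ratio:            {ratio:.1f}")
```

The same ‘1 in 100’ therefore corresponds to hourly rates that differ by more than four orders of magnitude.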

**Final thoughts**

A risk matrix is about making decisions, so it needs to support the user in that regard, but its use as part of a risk assessment should not be seen as a means of acquitting a duty of care. The principle of ‘so far as is reasonably practicable’ cares very little about risk in the first instance, asking only whether it would be considered reasonable to acquit the hazard. Risk assessments belong at the *back end* of a program when, despite our efforts, we have residual risks to consider as part of evaluating our efforts in achieving a reasonably practicable level of safety. That is a fact that modern decision makers should keep in mind.

**References**

Clements, P. Sverdrup System Safety Course Notes, 1996.

Cox, L.A. Jr., ‘What’s Wrong with Risk Matrices?’, Risk Analysis, Vol. 28, No. 2, 2008.

Cox, S., Tait, R., Safety, Reliability & Risk Management, 2nd Ed., Butterworth-Heinemann, 1998.

MIL-STD-882, System Safety Program Requirements.

Leveson, N., System Safety and Computers – A Guide to Preventing Accidents and Losses Caused by Technology, Addison Wesley, 1995.

Roth, F., Focal Report 9: Risk Analysis – Visualizing Risk: The Use of Graphical Elements in Risk Analysis and Communications, Risk and Resilience Research Group, Center for Security Studies (CSS), Zürich, 2012.

Thanks for mentioning the “calibration” for cardinal values of the scales, rather than just leaving them as ordinal. The latter is common in non-critical domains, making the results of the matrix not very useful.

Mathew

Following Glen, another common error you avoid is to implicitly assume linear scales. However, what I really liked was your suggestion about calibration and the MVA example. It was new to me and I like it very much.

As an aside, the UK Police no longer refer to MVAs, considering the term prejudicial. Now it’s a Motor Vehicle Incident. It is nice to come across precision in language as much as in other areas, like the way you recognise in your first heading that the existence of numbers in a methodology does not make it quantitative. Thank you.

Mike

I’ve always insisted that the real problem with risk theory is that human beings do a bad job of evaluating the underlying risks. Information deficits and mental heuristics often result in putting a number on an outcome to make the decision maker feel better, but that doesn’t make it any less guesswork.

This is an especially good video to watch about how people evaluate risk. I especially like the look on the face of the guy who wasn’t playing. Sheer disbelief. And this is in a poker game where every risk can be calculated to the dollar.

So the problem is that people tend to put risks that are frequent in the impossible category and impossible risks in the frequent category. And until that issue is solved it is bins without meaning or utility.

As I see it, the fundamental ‘problem’ with human evaluation of risk is that we find it very difficult to work in an analytical mindset, rather than that we’re fundamentally bad at it.

Recent work on human decision making shows that we have two clear modes of thinking if you will. The first is natural, quick, cognitively efficient, dominated by feelings, pattern matching and dependent on cognitive ‘short hands’. The second is slow, learned, very intensive in cognitive resource and ‘logic/analytical’.

Guess which mode we spend most time in and which we revert to under time pressure? And also guess which is prone to cognitive biases? :-)

Probability Impact matrices are a bad idea in most situations and not good in any. Cox (2008) makes this clear. The matrices are pseudo-scientific mechanisms that elicit distorted judgements, then do misleading calculations with them to make matters worse. They go hand in hand with the idea of making lists of ‘risks’, considered one at a time as if separate.

It makes more sense to go with mainstream management science when dealing with uncertainty. For example, (1) develop coherent mental models (whether quantified or not) rather than listing separate ‘events’, and (2) if using expected values for decision making, use the expected values of all outcomes from each alternative, not just the expected value of ‘losses’.

Risk lists quickly get complicated, messy, and tiring to deal with. A better analysis can be done in much less time with less effort if it is coherent and simple. Thinking clearly and logically is easier and less stressful than getting in a muddle, and it is not difficult to harness and improve on inputs from ‘gut’.

I’ve recently been able to show students on post-graduate risk courses that they can make most money in a general knowledge betting game by calculating expected values from their judged probabilities. Gut + Brain works well.

Hi Matthew,

I was thinking about risk matrices (again) and Bridgman’s concept of operationalism as it applies to risk. If we accept the premise that risk is a concept that is defined by a series of operations to generate a metric of risk, and further that this metric is intended to usefully express our risk perception, then I think we start to understand why there are different ways of expressing/communicating risk.

From that perspective quantitative EV is a metric we use to define specific aspects of perceived risk, modelling all outcomes where there’s both an upside and a downside. So it’s useful in that context, i.e. probabilistic cost benefit analysis.

Risk matrices (however) seem to have emerged in an environment where the ‘upside’ is either non-monetary or a given. For example if society makes a decision to commit to nuclear power, the social good has been decided beforehand and now we’re left with managing the downside (preventing reactor meltdowns for example). I often see their use in military/government projects (as an example) where the project has been initiated and there’s no avoidance choice or cost versus benefit decision to be made. They’re very prevalent in the system safety environment where there tends to be a focus on downside to the exclusion of all else.

So after all that, I agree with your comments on the flaws with risk matrices (and about risk lists as well) for more general risk management but conclude that the reason for their persistence in the system safety community is that they are a useful metric for risk in that context. Outside that context, they should be applied with considerable caution (if at all).

Many thanks, this is really what I was asking.

In addition to that, I think that after appropriate mathematical evaluations, conscious consideration should be given. The evaluations by themselves are not enough.

So business professionals should apply common sense to mathematical evaluations to determine whether the evaluations make sense.