Yet another article about risk matrices

Julian Talbot
Jul 25, 2023
Updated: Oct 9, 2023

A sure way to stir up comments and attention on LinkedIn seems to be to post anything positive about risk matrices. For reasons that I struggle to understand, risk matrices appear to be extraordinarily polarizing.

Figure: Risk matrix showing both positive and negative risks

Most people who support risk matrices are open-minded about the benefits and limitations. There are also many intensely anti-matrix individuals. They seem to stubbornly refuse to accept the possibility that risk matrices have any place in risk management.

They regularly cite some excellent articles against risk matrices. Unfortunately, when you read the articles and research in depth, most have serious flaws, such as the following.

Selecting poorly designed, fundamentally flawed matrices as the basis for analysis.
Reliance on vague risk statements. Using ill-defined risks would result in the same conclusions for any risk assessment tool. For example, analyzing risks such as "severe losses of drilling fluid, well-control issues, and blowout" as one oil and gas study attempted is a fool's errand. These are events, not risks. There are many possible sources of risk (e.g., negligence, weather, sabotage) and many consequences (e.g., reputation, death, finances), as well as countless assets at risk (e.g., inventory, infrastructure, capability). Changing even one thing (e.g., the source being weather vs. sabotage or terrorist attack) fundamentally changes the nature of the risk. Well-control issues are multiple risks. Little wonder that it can't be assessed on a risk matrix. Or any other tool for that matter.
Risk matrices are often assumed to be the only tool being used, ignoring the reality that a good operator can understand and work around their limitations.
Risk matrices can be cost-effective for time-critical, under-resourced, or low-complexity risks such as manual handling. Little discussion of this was made in any of the papers.
There is little research on the benefits, such as facilitating a discussion or presenting and contrasting risks.
Many, if not most, highly successful organizations use them. I could find no evidence correlating the long-term viability of an organization with the use of risk matrices.
The ISO31000 process involves risk evaluation after the analysis, which addresses most of the criticisms. Analyzing risk matrices without the evaluation and treatment phases of risk assessment is a fundamental flaw of the studies.
Using risk matrices to multiply likelihood x consequence sets them up to fail. And yet, this is precisely what many organizations and most papers do. Most of the papers acknowledge this, or at least expose it, as a limitation, but it is an issue that is easily fixed.
Other inputs, such as statistical analysis, historical data, the Delphi technique, threat assessments, scenario modeling, etc., can be, and often are, used to improve their accuracy and effectiveness. This is not discussed in the literature to any great extent and is a common limitation of the studies.
No analysis of the benefits of risk matrices when used with other inputs and methods as appropriate for the organization's and situation's specific needs.
Ignoring the evidence that they are widely used for mundane risk assessments and failing to suggest alternatives for less complex activities such as construction work or travel planning.
Much of the literature assumes that everyone uses risk matrices the same way. Many criticisms of risk matrices are from people who used them poorly, had poorly designed risk matrices, or used them in isolation from other inputs.
Ignoring the detail that people rarely use a risk matrix for analysis per se. A common approach, often conflated with risk matrices, is that an assessor will identify risks, document them in a risk register, and rate the likelihood and consequences of each against the organization's risk criteria. Only then will they possibly, but not necessarily, use a matrix to establish a semi-quantitative prioritization for the risks.
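
The approach described in the last point can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a recommended design: the likelihood and consequence labels, the ratings in MATRIX, and the three register entries are all hypothetical. The point to notice is that the matrix is a lookup table agreed in advance, not a multiplication of scores.

```python
from dataclasses import dataclass

# Hypothetical descriptors; a real assessment would use the
# organization's own risk criteria.
LIKELIHOODS = ["Rare", "Unlikely", "Possible", "Likely", "Almost certain"]
CONSEQUENCES = ["Insignificant", "Minor", "Moderate", "Major", "Catastrophic"]

# Hypothetical ratings. Rows follow LIKELIHOODS, columns follow CONSEQUENCES.
MATRIX = [
    ["Low",    "Low",    "Low",    "Medium",  "High"],
    ["Low",    "Low",    "Medium", "Medium",  "High"],
    ["Low",    "Medium", "Medium", "High",    "Extreme"],
    ["Low",    "Medium", "High",   "High",    "Extreme"],
    ["Medium", "Medium", "High",   "Extreme", "Extreme"],
]
PRIORITY = {"Extreme": 0, "High": 1, "Medium": 2, "Low": 3}


@dataclass(frozen=True)
class RegisterEntry:
    risk: str
    likelihood: str
    consequence: str


def rate(likelihood: str, consequence: str) -> str:
    """Look up the rating for a likelihood/consequence pair."""
    row = LIKELIHOODS.index(likelihood)
    col = CONSEQUENCES.index(consequence)
    return MATRIX[row][col]


def prioritize(register):
    """Return (rating, entry) pairs, highest rating first."""
    rated = [(rate(e.likelihood, e.consequence), e) for e in register]
    # sorted() is stable, so entries with equal ratings keep register order.
    return sorted(rated, key=lambda pair: PRIORITY[pair[0]])


# Hypothetical register entries, for illustration only.
register = [
    RegisterEntry("Manual handling injury in the warehouse", "Possible", "Minor"),
    RegisterEntry("Earthquake damages the head office", "Rare", "Catastrophic"),
    RegisterEntry("Supplier insolvency halts production", "Unlikely", "Major"),
]

for rating, entry in prioritize(register):
    print(f"{rating:8} {entry.risk}")
```

With these made-up criteria the earthquake risk is listed first, ahead of the two Medium risks. The ordering is only a starting point for discussion; evaluation and treatment still have to follow.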

It is worth noting that many criticisms of risk matrices are made by people who cite the literature but haven't read it.

A Critique of the Critiques

Note: This is not a criticism of the authors, many of whom are far more capable than I am. I have read all the works mentioned in this article and gained some excellent insights. The critique below is about the scope of their analysis, not about their work or the conclusions they draw. In most cases, I agree with what they have written. Their findings are mainly valid within the scope and content of their studies. However, a laboratory study to establish the LD50 of a drug does not reflect how people use it in the real world. Likewise, analyzing a poorly defined risk matrix in isolation from the inputs used doesn't reflect real-world results.

Some of the analysis has limitations which are only apparent if you take the time to read the articles critically. For example, Tony Cox's excellent article 'What's wrong with risk matrices' is a great starting point for understanding the limitations of risk matrices.

It should be pointed out that many people interpret Cox's article as an indictment and condemnation of risk matrices. Yet he writes,

"These limitations suggest that risk matrices should be used with caution, and only with careful explanations of embedded judgments."

I agree and have added emphasis in bold.

It's also worth pointing out, in terms of the scope of that paper, that the only consequence table in the article is an inferior example of consequence definitions. Likelihood descriptor tables are also notably lacking; the article relies instead on grouping probabilities.

Cox summarizes four limitations in his abstract.

(a) Poor Resolution. Typical risk matrices can correctly and unambiguously compare only a small fraction (e.g., less than 10%) of randomly selected pairs of hazards. They can assign identical ratings to quantitatively very different risks ("range compression").

(b) Errors. Risk matrices can mistakenly assign higher qualitative ratings to quantitatively smaller risks. For risks with negatively correlated frequencies and severities, they can be "worse than useless," leading to worse-than-random decisions.

(c) Suboptimal Resource Allocation. Effective allocation of resources to risk-reducing countermeasures cannot be based on the categories provided by risk matrices.

(d) Ambiguous Inputs and Outputs. Categorizations of severity cannot be made objectively for uncertain consequences. Inputs to risk matrices (e.g., frequency and severity categorizations) and resulting outputs (i.e., risk ratings) require subjective interpretation. Different users may obtain opposite ratings of the same quantitative risks.
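
To make points (a) and (b) concrete, here is a small Python sketch. It is not Cox's own example. It assumes a hypothetical 5 x 5 matrix with equal-width categories on both axes, severity normalized to a 0 to 1 scale, quantitative risk taken as probability times severity, and an invented banding rule in rating(). The four risks are made up to land where the effect is visible.

```python
from dataclasses import dataclass

BINS = 5  # hypothetical: five equal-width categories per axis


@dataclass(frozen=True)
class QuantifiedRisk:
    name: str
    probability: float  # between 0 and 1
    severity: float     # normalized to 0..1 (hypothetical scale)

    @property
    def quantitative_risk(self) -> float:
        # Assumption: quantitative risk = probability x severity.
        return self.probability * self.severity


def category(value: float) -> int:
    """Map a 0..1 value to a category index 0..BINS-1."""
    return min(int(value * BINS), BINS - 1)


def cell(risk: QuantifiedRisk) -> tuple:
    """Return the (probability, severity) cell the risk falls into."""
    return (category(risk.probability), category(risk.severity))


def rating(risk: QuantifiedRisk) -> str:
    """Hypothetical banding: rate a cell by the sum of its two indices."""
    total = sum(cell(risk))
    if total <= 3:
        return "Low"
    if total <= 5:
        return "Medium"
    return "High"


# Made-up risks chosen to sit near category boundaries.
a = QuantifiedRisk("A", 0.41, 0.41)
b = QuantifiedRisk("B", 0.59, 0.59)
c = QuantifiedRisk("C", 0.39, 0.39)
d = QuantifiedRisk("D", 0.81, 0.05)

for risk in (a, b, c, d):
    print(risk.name, cell(risk), rating(risk), round(risk.quantitative_risk, 4))
```

A and B fall in the same cell even though B's quantitative risk is more than double A's, which is range compression. D is rated Medium and C is rated Low, although D's quantitative risk is far smaller than C's, which is the kind of error described in (b).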

The four points noted above are valid within an academic framework evaluating poorly designed matrices without consideration of the inputs. It's also worth noting that 2 x 2 matrices, even the Stroud Matrix, are rarely, if ever, used in the real world, and the risk matrix examples in Cox's article (and most others) are a selection of real-world but poorly designed matrices.

While I agree with the conclusions Tony Cox has drawn, there are a few points worth noting to give the article some context.

Poor Resolution

If you assess more than 20 or 30 risks, you should probably consider aggregating them. A risk register of 300 risks is too long to be meaningful without significant resources.

I have occasionally come across lists of several hundred risks or even created lists of over 200 risks from scratch. Invariably, upon analysis, that list aggregates to 30 to 50 risks. From that, it is rare that more than 30 are significant enough to exceed the organization's risk tolerance. Very rare.

Experience and retrospective analysis have also shown that treating the ten highest priority risks typically addresses the next 20 risks adequately. In my experience, treating the top ten risks well is better than spreading resources too thinly by treating all the risks just adequately.

In short, plotting 20 risks on a 5 x 5 risk matrix is very doable. Plotting 200 risks is not.

Even if range compression remains, it is not a problem provided the issue is identified and addressed, the concerns are teased out, or the risks are re-evaluated.

Errors

Negatively correlated frequencies and severities are a challenge for all types of risk assessments. Low-likelihood, high-consequence events such as earthquakes and terrorist attacks or high-likelihood, low-consequence risks such as slips, trips, and falls need different management approaches. The risk matrix or any other assessment tool doesn't change that.

The essence of risk evaluation and treatment doesn't rely on whether the risk is rated low, medium, or high. These ratings serve different purposes.

The simplest possible risk matrix is the 2 x 2 table used in the article. A 2 x 2 matrix helps to illustrate the limitations of risk matrices, but it does not help separate risks for prioritization in any meaningful way. Nor is it representative of how most organizations use risk matrices.

The article notes that using risk matrices to multiply likelihood x consequence produces a false sense of accuracy and a fundamentally flawed output.
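
A short Python sketch shows one reason why multiplying ordinal scores gives a false sense of accuracy. It assumes a 5 x 5 matrix scored 1 to 5 on each axis, which is a common convention but is an assumption here rather than something taken from Cox's article.

```python
from collections import defaultdict

SCALE = range(1, 6)  # assumed ordinal scores 1..5 on each axis

# Group the 25 cells by their likelihood x consequence product.
cells_by_score = defaultdict(list)
for likelihood in SCALE:
    for consequence in SCALE:
        cells_by_score[likelihood * consequence].append((likelihood, consequence))

print(len(cells_by_score))  # 14 distinct scores for 25 cells
print(cells_by_score[5])    # [(1, 5), (5, 1)]
print(cells_by_score[4])    # [(1, 4), (2, 2), (4, 1)]
```

A rare but catastrophic risk (1, 5) and an almost certain but trivial one (5, 1) both score 5, yet they call for very different management. The numbers are also only labels for categories, so a score of 10 does not mean twice the risk of a score of 5.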

Suboptimal Resource Allocation

Allocation of resources to risk-reducing countermeasures in the ISO31000 framework isn't based on the categories provided by risk matrices. It is based on the evaluation process, which comes only after the analysis phase, i.e., not at the stage where the risk matrix is used.

So the criticism is valid if you are not familiar with the correct use of a risk matrix. Or at least not familiar with the international risk management standard, ISO31000. Resource allocation is not a function of the risk matrix, or at least not if applied correctly.

Ambiguous Inputs and Outputs

This is true if you use a risk matrix in isolation. When I conduct an enterprise security risk assessment, one of the inputs is invariably a narrative threat assessment based on extensive research.

This is then rated using a standard intent and capability rating system. Other inputs, such as the analysis of competing hypotheses (ACH) model, might also form part of the analysis, all before going anywhere near a risk matrix.

Risk identification based on the threat assessment and a couple of other steps is also crucial. Ambiguous risk statements such as 'sabotage' or 'espionage' lead to poor results regardless of the tool used. They are too open to differing interpretations because they are not risks, but are bundles of risks.

At the very least, a risk statement should include the source of risk, the risk event, the assets or resources at risk, and the consequences on objectives if that risk eventuates.
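
The four elements above can be captured in a small data structure. The sketch below is one possible shape, assuming nothing beyond those four elements: the class name RiskStatement, its field names, and both example statements are hypothetical. It shows how a bundle such as 'sabotage' splits into distinct risks once each element has to be stated.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class RiskStatement:
    source: str       # source of risk
    event: str        # risk event
    asset: str        # assets or resources at risk
    consequence: str  # consequence on objectives

    def __post_init__(self):
        # Refuse statements that leave any of the four elements blank.
        for f in fields(self):
            if not getattr(self, f.name).strip():
                raise ValueError(f"risk statement is missing its {f.name}")

    def describe(self) -> str:
        return (f"Source: {self.source}; event: {self.event}; "
                f"asset: {self.asset}; consequence: {self.consequence}")


# Two hypothetical risks that would both be filed under 'sabotage'.
insider = RiskStatement(
    source="disgruntled contractor",
    event="deliberate damage to the control system",
    asset="water treatment plant",
    consequence="loss of supply for several days",
)
outsider = RiskStatement(
    source="activist group",
    event="deliberate damage to the perimeter fence",
    asset="site security infrastructure",
    consequence="repair costs and adverse media coverage",
)

print(insider.describe())
print(outsider.describe())
```

The two statements would plausibly receive different likelihood and consequence ratings, which is why rating the single word 'sabotage' gives poor results in any tool.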

For consequence inputs, processes such as business impact analysis, business impact levels, etc., can inform the consequence ratings.

The final ratings can be plotted on a risk matrix to provide some risk prioritization. This is a vastly different process from what is typically described in the risk matrix literature.

Centering Bias

'The Risk of Using Risk Matrices,' a paper by Thomas, Bratvold, and Bickel, suffers from similar issues. The authors go on to include the problem of centering bias. This refers to the tendency of people to avoid extreme values or statements when presented with a choice. For example, if a score range is from 1 to 5, most people will select a value from 2 to 4.

In his 2009 book, The Failure of Risk Management: Why It's Broken and How to Fix It, Douglas Hubbard analyzed this in the case of information-technology projects. He found that 75% of the chosen scores were either 3 or 4. This further compacts the scale of risk matrices, exacerbating range compression.

Smith et al., in their 2009 article, 'Risk matrix input data biases,' came to the same conclusions from investigating risk management in the airline industry.
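
Anyone curious about their own risk register can measure this concentration directly. The Python sketch below computes the share of scores that fall on chosen values. The function name share_in is hypothetical, and the sample scores are invented for illustration. They are not Hubbard's or Smith's data.

```python
from collections import Counter


def share_in(scores, values):
    """Return the fraction of scores that fall on any of the given values."""
    if not scores:
        raise ValueError("no scores to analyze")
    counts = Counter(scores)
    return sum(counts[v] for v in values) / len(scores)


# Made-up scores on a 1..5 scale, for illustration only.
sample_scores = [3, 4, 3, 2, 4, 3, 5, 4, 3, 1, 4, 3, 4, 2, 3, 4, 3, 4, 3, 5]

print(share_in(sample_scores, {3, 4}))  # 0.75
print(share_in(sample_scores, {1, 5}))  # 0.15
```

A high share in the middle values does not by itself say whether assessors are avoiding the extremes or whether the risks genuinely sit there.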

There is a straightforward and plausible reason for centering bias. One that I've encountered myself in countless risk assessments in almost every industry. When you think about it from a practical perspective, one would expect most risks to be centered in the middle of a risk matrix. And this is, unsurprisingly, what the literature found.

This is not, as some might suggest, a grand expose of the flaws of risk matrices. Instead, it reflects the real world.

High-likelihood, high-consequence risks (let's call this the top right corner of a risk matrix) are existential threats. Any organization that hasn't already addressed these risks or is operating in this environment is unlikely to remain in existence long.
Low-likelihood, low-consequence risks (the lower-left corner) are rarely worth documenting in a risk assessment. They sit well within organizational risk tolerances, if documented at all. Even if documented, few turn up on the risk matrix or in the risk register. Most are quickly excluded from further analysis. Plotting paper cuts and the like on a risk matrix wastes time and isn't worth including.
High-likelihood, low-consequence risks are already likely to have management controls that reduce the likelihood of them occurring. And that, in turn, means that the risks have moved towards the center.
Low-likelihood, high-consequence risks such as terrorism, earthquake, and global financial crisis typically do appear on a risk matrix or in a risk register. But there aren't many of them as a percentage of most organizations' risks.

For these reasons, medium-likelihood, medium-consequence risks are the bread and butter of most risk assessments. It matters little if they are clustered in the middle of a risk matrix or heat map.

The essential element is that the evaluation and treatment processes address each risk's specifics.

There are many limitations of risk matrices, most of which come from the following sources:

Fundamental flaws in the risk matrix due to poor design, particularly in selecting the risk criteria.
Inappropriate, incompetent, or inadequate use of risk matrices due to a lack of competence by the operator.
Ineffective use of inputs to the risk matrix.
Use of risk matrices in isolation (e.g., missing the evaluation or treatment selection steps) due to a failure to adopt an appropriate process such as ISO31000.

There are also many benefits, and the number of successful organizations that use them daily suggests that risk matrices are, on the whole, reasonably practical.

If you want to build or modify a risk matrix, MS Excel is not bad. But if you'd like to automate the process... you can also find a free trial of a SaaS risk assessment application that will let you customize risk matrices and export your own risk reports at SECTARA.com.

JULIAN TALBOT