Revolutionizing Risk Assessment with LLMs — Part 1: The Context

This is the first episode in series. If you want to go straight to implementation, please go to: Revolutionizing Risk Assessment — Part 2: Boilerplate application.

Probably everyone who worked on securing an organisation has realised that its entirely different type of work to securing a single device or application. It’s not an isolated case of reversing a piece of malware or hunting for a crack in the attack surface. It’s also not as exciting for a security geek who finds pleasure in challenging themself. That’s because they need to assume the perspective of an organisation and the management aspect of their works starts to take priority, effectively leaving less space for any kind of cyberpunk romance.

From the organisation’s POV their work needs to be retained and usable by others. It needs to be documented, preferably in a form of processes that allows assigning responsibility and authority to designated roles. It needs to become prioritised with regards to goals, measurable, auditable, easy to communicate, etc.. There is a reason for the “Management” in Information Security Management System.

Unfortunately for us security geeks all of these tasks, while simple, come in overwhelming volumes. Soon it seems that we’re swamped by mundane tasks that are way below our skill levels. While this work is necessary and valuable, we soon start feeling like a cog in a machine. Disconnected from the goals, our work philosophy. Dreaming about violently bringing down the system that we ourselves established in some peculiar revolution. Well, let’s make this revolution a constructive one!

Automation to the rescue

Thankfully, repeatable tasks can to a large extent be automated and transformed into programmatic tasks. We can generalise over repeatable work and come up with inventive projects and implementations. However, before LLMs it was only up to a point where we deal with structured data. There was no viable solution for dealing with natural language. That meant that we still needed e.g. to go over all these vendor risk management forms and SOC2 reports and manually verify if they fulfil our requirements. Well not anymore!

Recent development in capabilities and capacities of LLMs allow us now to take course into uncharted territory of automating natural language-related tasks. Of course it has its risks and costs but it offers an incredible opportunity to transform repeatable, mind-numbing work into creative and engaging one.

In this series of articles I am planning to share my experiences with doing exactly that — bringing back the magic into ISO’s work with the use of LLMs. I will be exploring several different use-cases while building on the recent developments in the AI and IT security areas. Right now I think I will be navigating the landscape with an ISO 27001 map (because this is a standard that I am most familiar with) but I’m certain that these ideas are applicable in other standards and norms that you guys encounter.

In this series we’ll deal with something relatively easy but central to the organisational ISMS — the Risk Management and Risk Assessment processes and procedures (OMG I’m actually habitually writing it with capital letters).

Risk Management Process

Why do we need Risk Management in the first place? It helps us answering and justifying the answers to important questions from the perspective of an organisation:

Why do we spend resources on A and not on B?
What is the severity of the incident that has materialised?
What is the risk introduced by this specific vulnerability that we discovered?
Why do we need / why we don’t need this specific security control deployed?

Without RM we rely on our experience and intuition (which I like to imagine as experience distilled and trickling from my cortex into my lizard brain) which are often good advisors. But they can mislead us in more than one way:

We can get involved with details of a specific area (e.g., vulnerabilities in the attack surface) while losing sight of the big picture and other issues within it (e.g. threat insider or unintended damage scenarios)
We can have hard time justifying our decisions to the auditor, the management or even to our colleagues

RM, in addition to being an excellent foundation for a number of security-related processes, allows us to reassure ourselves and others in following a specific direction. It will always be imperfect and a compromise and will always leave some things unattended due to resources being limited, but we need to be sure it’s the right way to move forward swiftly. It allows us to understand the Why of stuff (not all of it, sadly).

My take on a ISO 27005-based Risk Assessment

Please note two things.

I didn’t write “compliant” :). It‘s very strong word that carries too much meaning for the purpose of this series.
I am focusing on the assessment part, so we are leaving stuff like Risk Treatment Plan budgeting or implementation out for now.

I referred to ISO 27005 in order to highlight one aspect that I think is different between ISO 27005 and SOC2 approach. I am no seasoned SOC2 expert but I noticed that often in context of the SOC2 the risk is defined as an independent unit, without a connection to company’s assets. For example, the risk:

Hacker exploits vulnerability in web application and exfiltrates the database

is being constructed by an analyst without referring to the organisation’s Asset Inventory (AInv). We simply assign an estimation to it, e.g. “4/5”, “High”, etc.

In contrast in ISO 27005 there is emphasis put on combining an asset and a risk scenario to define a risk:

Database + Exploitation of vulnerability in web application = Hacker exploits vulnerability in web application and exfiltrates the database

In this case the risk is defined based on the asset registered within AInv (database) and a risk scenario (hacker exploits vulnerability). Assigned risk estimation should be calculated based on asset sensitivity and the scenario likelihood with a formula that organisation chooses for itself. So assigned value could be e.g. “2+2 = 4”.

This might seem like a detail but actually it might have far reaching consequences to remaining areas of the ISMS. If we disconnect the risk estimation from asset sensitivity, we lose information that can be re-used in other processes. For example, if we are responding to an incident and we are estimating its criticality, we are not interested in the likelihood of scenario occurring because, well, it already occurred. What we can use in this situation to clearly determine the severity and decide wether to escalate or not is actually to consider affected assets and their sensitivity. If affected assets are non-critical, we might decide not to escalate and vice-versa.

In my work I have been re-using information about asset sensitivity in processes such as:

security reviews of designs / architecture of new features
security reviews of deployments
estimating risks of vulnerabilities
estimating severity of incidents
prioritising business continuity-related work
defining asset handling / asset access procedures
assigning Asset Owners that hold certain responsibilities and authorities in the security context
etc.

So having this information available and not lost during the Risk Assessment execution allowed me to re-use the work and save time for reinventing other processes from scratch.

Procedure before automation

To simplify, I will consider the following “manual” procedure for risk assessment:

Identify assets
Estimate asset sensitivity based on business strategy
Identify scenarios
Estimate scenarios’ likelihood based on available intel
Combine selected assets and scenarios and calculate risk estimations according to simple formula
Assign security controls to mitigate the inherent risks so that the residual risks are within acceptable threshold

There is potential for LLM-backed automation in all of these steps but we’ll start with step 5.

So as the inputs of step 5 we have:

asset inventory with assigned sensitivity
risk scenarios register with assigned likelihood
formula for calculating risk inherent estimation

As the output of step 5 we have:

a list of risks (combinations of assets and scenarios) with assigned estimations

The difficulty in here is to end up with a manageable risk register. It’s obviously difficult to theoretically decide when the risk register becomes unmanageable in an organisation but in practice it’s rather easy to see :).

For me (not doing RM full-time, need to deal with other costly processes such as vuln. management, incident mgmt. security reviews, etc.) this means that it has less than 100 risks and it still captures the overall risk situation of the organisation. In order to achieve this we sometimes might need to:

reduce the number of assets by generalising and grouping assets
reduce the number of risk scenarios by generalising and grouping scenarios
select only a subset of assets and a subset of scenarios that make sense as risks

This last sentence will be my point of departure in an attempt to automate the process. In the next episode of the series I will define the technical solution for automation and implement the first, basic version.

Potential risks of automation

While LLMs present an attractive opportunity for process automation, automation in general and LLM-backed automation specifically come with some significant risks that I would be amiss if I wouldn’t discuss.

Machine Learning as a black box

When we are using a traditional algorithm for solving a problem, it’s relatively easy to explain how the output has been arrived at. When we encounter an output that doesn’t meet our expectations we can often diagnose and point to the specific line of code that’s introducing misalignment.

LLMs on the other hand, as members of the broader family of Machine Learning (Neural Network) solutions, are represented in as a set of numbers with very high overall “entropy”. It’s difficult to say which number is responsible for the output and what will happen if we change a particular number.

Inability to understand the process behind producing a specific output and even inability to reproduce the same output with similar input can is a source of uncertainty and risk itself. This risk can be mitigated by looping in the operator that will approve the outputs at selected “checkpoints” in the process. Removing the operator from this loop should be gradual process that is tied closely to building trust in the solution.

But even so, in the big picture, we need to accept that clear understanding all building blocks of technical solutions that surround us is a forgone idea.

LLMs capabilities — knowledge and reasoning

In a highly specialised task such as Risk Assessment for a specific organisation we need to have reasonable confidence that the LLM is capable of producing valuable output. This means that:

We need to make sure that LLM has been trained on material sufficient for capturing meaning behind concepts such as risk, risk scenarios, TTPs, vulnerabilities and their technical aspects, assets and their roles in key processes
We need to make sure that the LLM has sufficient reasoning capacity and capability of following instructions for executing tasks
We need to be able to break down complex tasks into a set of smaller, simpler tasks
We need to understand traits of a specific LLM that we use, e.g. preferred layout of prompts, formal structuring, etc.

This confidence can be built by interacting with their LLM in a series of iterative attempts to improve produced outputs but also familiarising ourselves with important aspects of their training.

Potential vulnerabilities

Using LLM as a solution dependency means that we can introduce aditional weaknesses and vulnerabilities into it that we need to account for. And example of a potential vulnerability is prompt injection that can occur when we feed unsanitized input to the model.

In order to identify and mitigate these vulnerabilities I will be performing security tests on the implemented solution and selecting and implementing appropriate security controls.

Summary

In this episode I laid out the context in which I will be doing the automation work. I hope I managed to convey the importance of the Risk Management process as a foundational tool for selecting directions in IT security work and hint at the opportunities for automating it in context of the proliferation of LLMs.

In the episodes that will follow I will share my experiences in implementing automation solutions for specific instance of the process. Starting with creating a boilerplate gradio application.

‍

<<< Back

What is now proved
was once, only imagin'd

- William Blake

Revolutionizing Risk Assessment with LLMs — Part 1: The Context

Automation to the rescue

Risk Management Process

My take on a ISO 27005-based Risk Assessment

Procedure before automation

Potential risks of automation

Machine Learning as a black box

LLMs capabilities — knowledge and reasoning

Potential vulnerabilities

Summary

What is now proved was once, only imagin'd

- William Blake

Revolutionizing Risk Assessment with LLMs — Part 1: The Context

Automation to the rescue

Risk Management Process

My take on a ISO 27005-based Risk Assessment

Procedure before automation

Potential risks of automation

Machine Learning as a black box

LLMs capabilities — knowledge and reasoning

Potential vulnerabilities

Summary

What is now proved
was once, only imagin'd