This is the first episode in series. If you want to go straight to implementation, please go to: Revolutionizing Risk Assessment — Part 2: Boilerplate application.
Probably everyone who worked on securing an organisation has realised that its entirely different type of work to securing a single device or application. It’s not an isolated case of reversing a piece of malware or hunting for a crack in the attack surface. It’s also not as exciting for a security geek who finds pleasure in challenging themself. That’s because they need to assume the perspective of an organisation and the management aspect of their works starts to take priority, effectively leaving less space for any kind of cyberpunk romance.
From the organisation’s POV their work needs to be retained and usable by others. It needs to be documented, preferably in a form of processes that allows assigning responsibility and authority to designated roles. It needs to become prioritised with regards to goals, measurable, auditable, easy to communicate, etc.. There is a reason for the “Management” in Information Security Management System.
Unfortunately for us security geeks all of these tasks, while simple, come in overwhelming volumes. Soon it seems that we’re swamped by mundane tasks that are way below our skill levels. While this work is necessary and valuable, we soon start feeling like a cog in a machine. Disconnected from the goals, our work philosophy. Dreaming about violently bringing down the system that we ourselves established in some peculiar revolution. Well, let’s make this revolution a constructive one!
Thankfully, repeatable tasks can to a large extent be automated and transformed into programmatic tasks. We can generalise over repeatable work and come up with inventive projects and implementations. However, before LLMs it was only up to a point where we deal with structured data. There was no viable solution for dealing with natural language. That meant that we still needed e.g. to go over all these vendor risk management forms and SOC2 reports and manually verify if they fulfil our requirements. Well not anymore!
Recent development in capabilities and capacities of LLMs allow us now to take course into uncharted territory of automating natural language-related tasks. Of course it has its risks and costs but it offers an incredible opportunity to transform repeatable, mind-numbing work into creative and engaging one.
In this series of articles I am planning to share my experiences with doing exactly that — bringing back the magic into ISO’s work with the use of LLMs. I will be exploring several different use-cases while building on the recent developments in the AI and IT security areas. Right now I think I will be navigating the landscape with an ISO 27001 map (because this is a standard that I am most familiar with) but I’m certain that these ideas are applicable in other standards and norms that you guys encounter.
In this series we’ll deal with something relatively easy but central to the organisational ISMS — the Risk Management and Risk Assessment processes and procedures (OMG I’m actually habitually writing it with capital letters).
Why do we need Risk Management in the first place? It helps us answering and justifying the answers to important questions from the perspective of an organisation:
Without RM we rely on our experience and intuition (which I like to imagine as experience distilled and trickling from my cortex into my lizard brain) which are often good advisors. But they can mislead us in more than one way:
RM, in addition to being an excellent foundation for a number of security-related processes, allows us to reassure ourselves and others in following a specific direction. It will always be imperfect and a compromise and will always leave some things unattended due to resources being limited, but we need to be sure it’s the right way to move forward swiftly. It allows us to understand the Why of stuff (not all of it, sadly).
Please note two things.
I referred to ISO 27005 in order to highlight one aspect that I think is different between ISO 27005 and SOC2 approach. I am no seasoned SOC2 expert but I noticed that often in context of the SOC2 the risk is defined as an independent unit, without a connection to company’s assets. For example, the risk:
Hacker exploits vulnerability in web application and exfiltrates the database
is being constructed by an analyst without referring to the organisation’s Asset Inventory (AInv). We simply assign an estimation to it, e.g. “4/5”, “High”, etc.
In contrast in ISO 27005 there is emphasis put on combining an asset and a risk scenario to define a risk:
Database + Exploitation of vulnerability in web application = Hacker exploits vulnerability in web application and exfiltrates the database
In this case the risk is defined based on the asset registered within AInv (database) and a risk scenario (hacker exploits vulnerability). Assigned risk estimation should be calculated based on asset sensitivity and the scenario likelihood with a formula that organisation chooses for itself. So assigned value could be e.g. “2+2 = 4”.
This might seem like a detail but actually it might have far reaching consequences to remaining areas of the ISMS. If we disconnect the risk estimation from asset sensitivity, we lose information that can be re-used in other processes. For example, if we are responding to an incident and we are estimating its criticality, we are not interested in the likelihood of scenario occurring because, well, it already occurred. What we can use in this situation to clearly determine the severity and decide wether to escalate or not is actually to consider affected assets and their sensitivity. If affected assets are non-critical, we might decide not to escalate and vice-versa.
In my work I have been re-using information about asset sensitivity in processes such as:
So having this information available and not lost during the Risk Assessment execution allowed me to re-use the work and save time for reinventing other processes from scratch.
To simplify, I will consider the following “manual” procedure for risk assessment:
There is potential for LLM-backed automation in all of these steps but we’ll start with step 5.
So as the inputs of step 5 we have:
As the output of step 5 we have:
The difficulty in here is to end up with a manageable risk register. It’s obviously difficult to theoretically decide when the risk register becomes unmanageable in an organisation but in practice it’s rather easy to see :).
For me (not doing RM full-time, need to deal with other costly processes such as vuln. management, incident mgmt. security reviews, etc.) this means that it has less than 100 risks and it still captures the overall risk situation of the organisation. In order to achieve this we sometimes might need to:
This last sentence will be my point of departure in an attempt to automate the process. In the next episode of the series I will define the technical solution for automation and implement the first, basic version.
While LLMs present an attractive opportunity for process automation, automation in general and LLM-backed automation specifically come with some significant risks that I would be amiss if I wouldn’t discuss.
When we are using a traditional algorithm for solving a problem, it’s relatively easy to explain how the output has been arrived at. When we encounter an output that doesn’t meet our expectations we can often diagnose and point to the specific line of code that’s introducing misalignment.
LLMs on the other hand, as members of the broader family of Machine Learning (Neural Network) solutions, are represented in as a set of numbers with very high overall “entropy”. It’s difficult to say which number is responsible for the output and what will happen if we change a particular number.
Inability to understand the process behind producing a specific output and even inability to reproduce the same output with similar input can is a source of uncertainty and risk itself. This risk can be mitigated by looping in the operator that will approve the outputs at selected “checkpoints” in the process. Removing the operator from this loop should be gradual process that is tied closely to building trust in the solution.
But even so, in the big picture, we need to accept that clear understanding all building blocks of technical solutions that surround us is a forgone idea.
In a highly specialised task such as Risk Assessment for a specific organisation we need to have reasonable confidence that the LLM is capable of producing valuable output. This means that:
This confidence can be built by interacting with their LLM in a series of iterative attempts to improve produced outputs but also familiarising ourselves with important aspects of their training.
Using LLM as a solution dependency means that we can introduce aditional weaknesses and vulnerabilities into it that we need to account for. And example of a potential vulnerability is prompt injection that can occur when we feed unsanitized input to the model.
In order to identify and mitigate these vulnerabilities I will be performing security tests on the implemented solution and selecting and implementing appropriate security controls.
In this episode I laid out the context in which I will be doing the automation work. I hope I managed to convey the importance of the Risk Management process as a foundational tool for selecting directions in IT security work and hint at the opportunities for automating it in context of the proliferation of LLMs.
In the episodes that will follow I will share my experiences in implementing automation solutions for specific instance of the process. Starting with creating a boilerplate gradio application.