
AI Risk Registers: Beyond Traditional Risk Management

AI risk registers use the same mechanics as conventional ones — identify, assess, mitigate, monitor. What changes is the taxonomy of risks, how likelihood behaves at scale, and the fact that the register doesn't close at deployment. Here is what your register needs to add, and why.

By AIPMO
17 min read

 

PM Takeaways

       The risk management mechanics you already use — identify, assess, mitigate, monitor — apply to AI projects. What changes is the taxonomy of risks you must identify, the way likelihood behaves at scale, and the fact that the register does not close at deployment. EU AI Act Article 9 defines the risk management system for high-risk AI as a continuous iterative process throughout the entire lifecycle, requiring regular systematic review and updating.

       AI systems introduce risk categories that do not appear on conventional IT risk registers. The seven trustworthiness characteristics of NIST AI RMF map onto seven risk categories for the register: bias and fairness, safety and reliability, privacy, security and robustness, transparency and explainability, accountability, and environmental and societal impact. If your identification exercise does not systematically address all seven, you have gaps.

       NIST AI RMF distinguishes three dimensions of harm that AI systems can produce: harm to people (individual civil liberties, group discrimination, societal effects on democratic participation), harm to organisations (operations, finances, reputation), and harm to ecosystems (environmental, interconnected systems). Impact scoring that addresses only organisational harm is incomplete for AI risk.

       Likelihood for AI risks is not a simple event probability. A 1% error rate is not a rare event — at scale, it is a predictable volume of harm. Under EU AI Act Article 9(2)(b), risk assessment must cover risks arising under conditions of reasonably foreseeable misuse, not only intended use. Both dimensions — frequency at scale and misuse conditions — must appear in your likelihood assessment.

       NIST AI RMF MEASURE 3.1 requires that approaches, personnel, and documentation be in place to regularly identify and track existing, unanticipated, and emergent AI risks based on actual performance in deployed contexts. This is a post-deployment risk function, not an extension of development-phase risk management. The risk register must remain active and monitored after go-live as a standing operational requirement.

Project managers know risk registers. Identify risks, assess likelihood and impact, plan responses, monitor throughout the project. Those mechanics do not change for AI projects — but the risks do, and so does the time horizon over which you have to manage them.

AI systems introduce risk categories that do not appear on traditional IT projects. They also behave differently over time: AI risks can emerge after deployment, change as the data environment shifts, and manifest in ways that are difficult to predict or measure in advance. A risk register that was adequate for your last software project is not adequate for an AI project without extension. This article covers what needs to change, why, and how to structure the register to hold up over the full lifecycle.

 

What Makes AI Risks Different

Traditional software risks are largely about whether the system works as designed. AI risks include that question — and add several that have no equivalent in conventional software.

Traditional software does what it is programmed to do. AI systems learn patterns from data, including patterns that were not intended and may not be detected until the system has been operating in production. The behaviour of an AI system is not fully determined by code you can inspect: it is shaped by training data you may not fully control, by distributional assumptions that may not hold over time, and by interactions with a real-world environment that is more complex than the test environment.

Traditional software risks → AI-specific additions:

•       Does the system function correctly? → Does it function fairly, and for all affected groups?

•       Is it secure from external attacks? → Is it also robust to adversarial inputs, data poisoning, and model extraction?

•       Does it meet the specified requirements? → Do the requirements account for all affected parties, including those not represented in the design process?

•       Is data protected from unauthorised access? → Is training data appropriate, representative, and free of encodings of historical discrimination?

•       Behaviour is static after deployment → Performance can degrade or drift as the world changes, even without code changes

•       Failures are typically deterministic and reproducible → Failures can be probabilistic, context-dependent, and disproportionate across subgroups

This is not an argument that AI projects are uniquely dangerous. It is an argument that the risk identification phase must be broader, the assessment methodology must account for scale effects and distributional dynamics, and the monitoring phase must continue long after the system is live.

 

The Seven AI Risk Categories

NIST AI RMF 1.0 organises the trustworthiness characteristics of AI systems in a way that maps directly to the risk categories your register needs to cover. Every AI risk identification exercise should systematically address all seven. If your register has entries in some categories and none in others, that is not evidence that those categories are clean — it is evidence that they were not examined.

1. Bias and Fairness

The system may produce different outcomes for different groups in ways that are unfair, discriminatory, or disproportionate. NIST identifies three types of AI bias that must be assessed: systemic bias (present in datasets, organisational norms, and broader society), computational and statistical bias (arising from non-representative samples and algorithmic processes), and human-cognitive bias (arising from how people interpret and act on AI outputs). All three can occur without any discriminatory intent.

Example risks for the register: model performs worse for groups underrepresented in training data; proxy variables introduce discrimination even without protected attribute inputs; historical bias in training data is encoded and amplified in outputs; human reviewers selectively override the system in ways that reintroduce bias that the model had partially corrected.

The NIST framing is important: systems in which harmful biases are mitigated are not necessarily fair. Balanced accuracy across demographic groups does not resolve accessibility barriers, digital divide effects, or systemic disadvantages that pre-exist the AI system. Risk identification must be wide enough to surface these second-order effects.

2. Safety and Reliability

The system may produce outputs that cause harm, or fail to perform when it is needed. EU AI Act Article 15 requires that high-risk AI systems achieve an appropriate level of accuracy and robustness and perform consistently throughout their lifecycle — including under errors, faults, or inconsistencies arising from interaction with people or other systems.

Example risks: system fails silently without alerting operators; edge cases produce dangerous or inappropriate recommendations; system performs well on held-out test data but poorly in production conditions where the input distribution differs; feedback loops cause biased outputs to influence future training data. NIST MEASURE 2.6 requires that the AI system be demonstrated safe and that residual negative risk does not exceed the risk tolerance before deployment.

3. Privacy

The system may expose, infer, or misuse personal information. AI systems can present risks to privacy that go beyond conventional data protection: they can infer information that was never explicitly provided, and training data can become embedded in model weights in ways that are difficult or impossible to remove after training.

Example risks: training dataset contains personally identifiable information that a subject did not consent to include in AI training; model outputs enable inference of sensitive attributes (health conditions, political views, sexual orientation) from inputs that do not contain those attributes directly; membership inference attacks allow adversaries to determine whether a specific individual’s data was in the training set; data retention practices for inference logs violate applicable privacy requirements. NIST MEASURE 2.10 requires that privacy risk be examined and documented.

4. Security and Robustness

The system may be vulnerable to manipulation or adversarial attack. AI systems have attack surfaces that traditional software does not: the training data, the model weights, and the inference process are all potential vectors. NIST AI RMF identifies several ML-specific attack types that do not have direct equivalents in conventional security risk management.

Example risks: adversarial inputs — carefully crafted inputs designed to cause incorrect model outputs — cause the system to misclassify in ways an attacker can exploit; data poisoning corrupts training data to introduce systematic errors or backdoors; model extraction allows a competitor or adversary to replicate a proprietary model through repeated queries; model inversion allows an attacker to reconstruct sensitive training data from model outputs. EU AI Act Article 15(5) requires that high-risk AI systems be resilient against attempts by unauthorised third parties to alter their use, outputs, or performance by exploiting system vulnerabilities. NIST MEASURE 2.7 requires that security and resilience be evaluated and documented.

5. Transparency and Explainability

The system may make decisions that cannot be understood, explained, or audited. This is not a bug that can be patched — it is a characteristic of certain model architectures. The risk is not merely that the model is opaque internally; it is that opacity blocks the downstream functions that depend on explainability: meaningful human oversight, right-to-explanation obligations, operator understanding of unexpected behaviour, and audit.

Example risks: the system cannot produce an explanation of individual decisions that would satisfy a right-to-explanation request; operators cannot understand why the system is behaving unexpectedly and therefore cannot determine whether to override it; regulators require explainability demonstrations that the system cannot support; post-incident investigation is blocked by inability to trace a decision back to an interpretable cause. NIST MEASURE 2.8 requires that risks associated with transparency and accountability be examined and documented. NIST MEASURE 2.9 requires that the AI model be explained, validated, and documented.

6. Accountability

It may be unclear who is responsible for the system’s decisions and their consequences. In complex AI deployments — especially those involving third-party models, multi-step pipelines, or agentic systems — accountability can be distributed across providers, deployers, and operators in ways that leave no single party clearly responsible for any given outcome.

Example risks: when the system produces a harmful output, no party in the supply chain accepts responsibility; a third-party model provider’s risk profile is unknown to the deploying organisation; the system is redeployed in a context its provider did not intend, and neither party has addressed the resulting risks; incident response is delayed because it is unclear which party has authority to intervene or modify the system.

7. Environmental and Societal Impact

The system may produce harms at a scale or in dimensions that conventional project risk assessment does not address. NIST AI RMF 1.0 frames potential AI harms across three categories: harm to people (including individual civil liberties, group discrimination, and societal effects on democratic participation or educational access), harm to organisations (including operations, finances, and reputation), and harm to ecosystems (including interconnected systems, global financial systems, natural resources, and the environment). NIST MEASURE 2.12 requires that environmental impact and sustainability be assessed and documented.

Example risks: the system encodes and amplifies existing societal disparities at scale; the system produces homogenised outputs across a population in ways that reduce diversity of outcomes; energy consumption during training or inference is disproportionate relative to the system’s value; the system contributes to erosion of trust in a domain (healthcare, legal, financial) in ways that affect people who never interact with it directly.

 

Assessing AI Risks: Likelihood, Scale, and Impact

Likelihood Is Not Just Event Probability

The standard risk formula applies: Risk = Likelihood × Impact. For AI systems, the likelihood component requires additional dimensions that do not arise in conventional project risk.

First, frequency at scale. A model with 1% error rate produces errors continuously during operation. At 100,000 inferences per day, that is 1,000 errors per day — not a rare event but a predictable daily volume of harm. Likelihood assessment must account for how often an error category will occur in production volume, not only whether it can occur.
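The arithmetic behind this point is worth making explicit. A minimal sketch, with all figures illustrative:

```python
# Expected error volume at production scale (illustrative figures).
error_rate = 0.01              # 1% aggregate error rate
inferences_per_day = 100_000   # production volume

errors_per_day = error_rate * inferences_per_day
errors_per_year = errors_per_day * 365

print(f"{errors_per_day:.0f} errors/day, {errors_per_year:.0f} errors/year")
# 1000 errors/day, 365000 errors/year
```

A 'rare' per-inference event becomes a guaranteed daily workload for whatever detection and remediation process sits downstream.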

Second, distributional specificity. Risks may be near-certain for specific subgroups, inputs, or contexts even where they are rare in aggregate. A model that performs well on average may perform systematically poorly for a particular demographic group. Aggregate likelihood assessment obscures these distributional patterns. NIST MEASURE 2.11 requires that fairness and bias evaluations be documented with disaggregated results precisely because aggregate metrics conceal subgroup-level risks.
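The effect of aggregation can be shown in a few lines. The group labels and outcome counts below are invented for illustration; the pattern to look for is an acceptable aggregate rate hiding a severe subgroup rate:

```python
# Aggregate vs disaggregated error rates (hypothetical outcomes).
outcomes = (
    [("group_a", True)] * 940 + [("group_a", False)] * 10 +  # 950 cases
    [("group_b", True)] * 35 + [("group_b", False)] * 15     # 50 cases
)

def error_rate(records):
    """Fraction of records whose outcome was incorrect."""
    return sum(1 for _, ok in records if not ok) / len(records)

overall = error_rate(outcomes)
by_group = {g: error_rate([r for r in outcomes if r[0] == g])
            for g in ("group_a", "group_b")}

print(f"aggregate: {overall:.1%}")              # aggregate: 2.5%
print(f"group_a:   {by_group['group_a']:.1%}")  # group_a:   1.1%
print(f"group_b:   {by_group['group_b']:.1%}")  # group_b:   30.0%
```

The 2.5% aggregate would pass many acceptance thresholds while group_b experiences a 30% error rate.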

Third, EU AI Act Article 9(2)(b) requires that risk assessment cover risks arising under conditions of reasonably foreseeable misuse, not only intended use. What happens when a deployer uses the system beyond its intended scope? When an end user finds an unintended application? When an adversary probes for weaknesses? These conditions must be assessed, not excluded from the register on the grounds that they are not the intended use case.

Fourth, detectability. Some AI failures are not immediately observable. A model may be producing systematically biased outputs for weeks or months before the pattern is large enough to detect. ‘Will we know when this is happening?’ is a risk assessment question that has no equivalent in traditional software risk management and must be answered in the register for every identified AI risk.

Impact Across Three Harm Dimensions

NIST AI RMF 1.0 frames AI harm across three categories. Your impact assessment should use this structure to avoid the common failure of scoring only organisational impact while leaving person-level and ecosystem-level harms unweighted.

Questions to ask during impact assessment, by harm dimension:

•       Harm to people — individual: Who could be harmed by an incorrect output? How severely? Is the harm reversible? Does the harm affect a right (employment, credit, healthcare, liberty, fundamental rights under EU law)?

•       Harm to people — group and community: Could certain demographic groups be disproportionately affected? Is this a consequence of the model’s training data or its deployment context? Does disproportionate impact constitute discrimination under applicable law?

•       Harm to people — societal: At scale, does the system erode democratic participation, reduce educational access, or concentrate economic opportunity in ways that affect people who never interact with it?

•       Harm to the organisation: What are the regulatory consequences of a failure? What is the litigation exposure? What reputational damage could result? What operational disruption would follow a system failure or emergency shutdown?

•       Harm to ecosystems: Does the system interact with other AI systems or automated decision pipelines in ways that could produce cascading failures? Does it affect natural resources or produce environmental costs disproportionate to its benefit?

Severity Classification: Some Risks Are Not Acceptable at Any Probability

EU AI Act Article 9(5) requires that the relevant residual risk associated with each identified hazard, and the overall residual risk of the high-risk AI system, be judged acceptable. The Act’s structure implies a threshold question: if residual risk cannot be reduced to an acceptable level, deployment is not authorised. Your risk register should reflect this with a severity classification that distinguishes between risks that can be mitigated to an acceptable residual level and risks that require fundamental redesign or a no-go decision.

Classifications and responses:

•       Prohibitive: Residual risk cannot be reduced to acceptable levels through available mitigation measures. Requires fundamental system redesign, scope change, or a no-go deployment decision. EU AI Act Article 9(5) implies this category exists: some residual risks are not acceptable regardless of probability.

•       Major: Significant risk requiring substantial mitigation before deployment is authorised. Deployment without adequate mitigation of major risks is not acceptable. Must appear in the risk register with specific, resourced mitigation actions and residual risk assessment.

•       Moderate: Risk requiring monitoring and contingency plans. Acceptable to deploy with monitoring in place and a defined response plan for when the risk materialises. Must appear in the post-deployment risk tracking process.

•       Minor: Risk acknowledged and accepted with minimal mitigation. Should still be logged so that changes in deployment scale or context can trigger reassessment.
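The no-go logic this classification implies can be sketched as a simple deployment gate. The encoding below is an illustrative assumption, not a rule prescribed by either framework:

```python
# Deployment gate implied by the severity classification above.
# Classification labels follow the table; the encoding is illustrative.

def deployment_blocked(risks: list[tuple[str, str, bool]]) -> list[str]:
    """Return IDs of risks that block deployment.

    Each risk is (risk_id, classification, residual_risk_acceptable).
    """
    blockers = []
    for risk_id, classification, acceptable in risks:
        if classification == "prohibitive":
            blockers.append(risk_id)  # no-go regardless of probability
        elif classification == "major" and not acceptable:
            blockers.append(risk_id)  # mitigation incomplete: no go-live
    return blockers

risks = [
    ("BIAS-001", "major", True),   # mitigated to an acceptable residual level
    ("SEC-004", "major", False),   # mitigation not yet adequate
    ("PRIV-002", "minor", True),
]
print(deployment_blocked(risks))   # ['SEC-004']
```

Moderate and minor risks pass the gate but stay in the register for post-deployment tracking.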

 

Risk Register Structure for AI Projects

Extend your standard risk register with AI-specific fields. The fields that matter most for AI — and that typically do not appear in conventional IT risk registers — are detection method, review trigger, and post-deployment monitoring. These fields exist because AI risks change: they evolve with the data environment, with deployment scale, and with real-world operational experience.

•       Risk ID: Standard identifier. Use a consistent taxonomy prefix that reflects the AI risk category (e.g. BIAS-001, SEC-004) to enable category-level reporting and filtering.

•       Category: AI risk category from the seven domains above. A risk can span categories — log it under the primary category and cross-reference.

•       Description: What could happen, stated in terms of the outcome rather than the mechanism. ‘Model produces discriminatory credit decisions for applicants with non-standard income sources’ is more useful than ‘model bias risk’.

•       Affected parties: Who would experience harm if this risk materialises. Use NIST’s harm dimensions: people (individuals, groups and communities, society at large), organisations, and ecosystems. This field is what makes impact assessment concrete rather than abstract.

•       Likelihood: Probability of occurrence — but also: frequency at production scale, distributional specificity by subgroup, and whether misuse conditions are included. EU AI Act Article 9(2)(b) requires reasonably foreseeable misuse to be assessed.

•       Impact: Severity if it occurs, scored across harm dimensions. Do not score only organisational impact. A risk that carries low organisational impact but high individual or group harm should carry a high overall impact score.

•       Risk score: Likelihood × Impact. Document the scoring methodology so that scores are comparable across entries and reviewers.

•       Detection method: How would the organisation know this risk has materialised? This field is critical for AI projects because many AI failures are not immediately observable. If there is no detection method, that is itself a major risk — the system could be failing without anyone knowing.

•       Response strategy: Avoid, mitigate, transfer, or accept. For AI risks, ‘accept’ should be rare for high-severity categories. EU AI Act Article 9(5) sets a standard of acceptable residual risk, not zero residual risk — but acceptance must be conscious and documented.

•       Mitigation actions: Specific, resourced steps to reduce the risk. Who does what, by when? Mitigations should be concrete enough to be tracked to completion and verified as effective.

•       Residual risk: Risk remaining after mitigation. EU AI Act Article 9(5) requires that residual risk be judged acceptable. NIST MANAGE 1.4 requires that residual risks be documented and disclosed to downstream actors.

•       Owner: Named individual responsible for this risk entry. Not a role or a team — a person. Vacant ownership is the most common cause of unmonitored AI risk.

•       Monitoring approach: How will this risk be tracked in production? What metrics, thresholds, and review processes? NIST MEASURE 3.1 requires approaches and personnel to regularly identify and track existing, unanticipated, and emergent risks based on actual deployed-context performance.

•       Review trigger: What events should prompt reassessment of this risk entry? Candidates: model retraining, change in input data sources, significant change in deployment scale, incident involving this risk category, new regulatory guidance, post-market monitoring findings.

 

The Register Doesn’t Close at Deployment

Traditional project risk registers update at phase gates and close at project completion. AI risk registers must function as operational governance documents that remain active and maintained throughout the system’s production lifecycle.

This requirement is not optional for high-risk AI systems. EU AI Act Article 9 defines the risk management system as a continuous iterative process planned and run throughout the entire lifecycle of a high-risk AI system, requiring regular systematic review and updating. EU AI Act Article 72 requires providers to establish and document a post-market monitoring system that actively and systematically collects, documents, and analyses data on high-risk AI system performance throughout the system’s lifetime. These are lifecycle risk obligations, not project-phase risk obligations.

NIST AI RMF frames this through two mechanisms. MEASURE 3.1 requires that approaches, personnel, and documentation be in place to regularly identify and track existing, unanticipated, and emergent AI risks based on actual performance in deployed contexts — which is different from projected performance in test conditions. The MANAGE function, in turn, must continue to be applied to deployed AI systems as methods, contexts, risks, and needs evolve over time. Both are defined as continuous rather than phase-bounded.

Build Post-Deployment Risk Tracking Into the Project

•       Define the review cadence before go-live. Who reviews the risk register, how often, and against what data? Monthly performance reports, quarterly full reviews, and event-triggered reassessment are a reasonable baseline. The cadence should be specified in the project’s operational handoff documentation and resourced accordingly.

•       Connect risk monitoring to system monitoring. Performance monitoring metrics — accuracy drift, subgroup performance trends, input distribution shifts, override rates — should be linked to specific risk register entries. When a monitoring metric crosses a threshold, it should trigger a defined risk register review, not simply generate an alert that disappears into an operations queue.

•       Assign post-deployment risk ownership at project close. Who is responsible for each risk register entry after the project team disperses? Risk entries with vacant post-deployment owners are not monitored in practice. Transfer of risk ownership must be an explicit deliverable of the deployment phase, not an assumption.

•       Log incidents against risk register entries. When an AI system incident occurs, trace it back to the risk register. Did the register contain this risk category? Was the detection method adequate? Was the mitigation effective? Update the register with incident findings. A risk register that does not incorporate operational experience is progressively disconnected from the system it is supposed to govern.

•       Define the events that trigger reassessment. Model retraining, changes in input data sources, deployment scale changes, new regulatory guidance, supply chain changes (provider updates a third-party model), significant organisational changes — each should be a defined review trigger. EU AI Act Article 9(2)(c) explicitly requires that risks be evaluated based on data gathered from the post-market monitoring system.
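Connecting risk monitoring to system monitoring can be sketched as a threshold map from production metrics to register entries. The metric names, thresholds, and entry IDs below are illustrative assumptions:

```python
# Map production metrics to risk register entries. A breached threshold
# should open a register review, not just raise an ops alert.
# All names and values are illustrative.
THRESHOLDS = {
    # metric name:          (register entry, threshold, direction)
    "subgroup_error_gap":   ("BIAS-001", 0.05, "above"),
    "accuracy_7day":        ("SAFE-002", 0.92, "below"),
    "override_rate":        ("ACCT-003", 0.15, "above"),
}

def register_reviews_due(metrics: dict[str, float]) -> list[str]:
    """Return the register entries whose linked metric crossed its threshold."""
    due = []
    for name, (entry, threshold, direction) in THRESHOLDS.items():
        value = metrics.get(name)
        if value is None:
            continue  # missing telemetry is itself worth flagging in practice
        breached = value > threshold if direction == "above" else value < threshold
        if breached:
            due.append(entry)
    return due

print(register_reviews_due({"subgroup_error_gap": 0.08, "accuracy_7day": 0.95}))
# ['BIAS-001']
```

The essential design choice is that each threshold points at a specific register entry, so a breach lands with a named risk owner rather than in a generic operations queue.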

Track How Assessments Change Over Time

A risk register that only shows current assessments loses its governance value over time. The register should include fields for the initial assessment date, the most recent assessment date, a summary of what changed and why, and the events that triggered reassessment. This creates an audit trail that demonstrates that risk management is a continuous practice, not a one-time pre-deployment exercise — which is what regulators reviewing EU AI Act compliance will want to see.

 

Right-Sizing for Your Situation

The depth of your AI risk register should match the risk classification of the system and your organisation’s governance maturity. A proof-of-concept system used internally by a small team needs a lighter-weight register than a system making consequential decisions about individuals at scale. But all AI risk registers share the same structural requirements: coverage of all seven risk categories, assessment that includes harm to people and not only harm to the organisation, a detection method for every significant risk, and a post-deployment monitoring plan.

Greenfield — AI Risk Register Playbook

For PMs without formal risk management processes. A simplified template focused on the most critical AI risk categories, with worked examples for bias, safety, and privacy risks. Includes guidance on writing risk descriptions that are specific enough to drive monitoring rather than just naming a category.

Emerging — AI Risk Register Playbook

For PMs building repeatable processes. Full register template with all 14 fields, assessment guidance for each risk category, likelihood scoring that accounts for scale effects and misuse conditions, and review workflow templates including post-deployment cadence and trigger definitions.

Established — AI Risk Register Playbook

For PMs in organisations with formal risk management. How to integrate AI-specific risk categories — including the EU AI Act Article 9 lifecycle risk management requirements and NIST MEASURE 3.1 emergent risk tracking — into enterprise risk frameworks. Includes portfolio-level AI risk reporting and governance escalation workflows.

Become a member →

 

Framework References

•       EU AI Act (Official Journal, 12 July 2024) — Article 9(1) (risk management system shall be established, implemented, documented, and maintained for high-risk AI systems); Article 9(2) (risk management system is a continuous iterative process throughout the entire lifecycle, comprising: identification and analysis of known and reasonably foreseeable risks; estimation and evaluation of risks under intended use and reasonably foreseeable misuse; evaluation of risks from post-market monitoring data; adoption of targeted risk management measures); Article 9(2)(b) (risk assessment must cover risks arising under conditions of reasonably foreseeable misuse, not only intended use); Article 9(2)(c) (evaluation of risks based on data gathered from post-market monitoring); Article 9(5) (risk management measures shall be such that residual risk associated with each hazard and overall residual risk is judged acceptable; eliminating or reducing risks technically, implementing mitigation for risks that cannot be eliminated, and providing information to deployers); Article 15(1) (high-risk AI systems shall achieve appropriate levels of accuracy, robustness, and cybersecurity, and perform consistently throughout their lifecycle); Article 15(5) (high-risk AI systems shall be resilient against attempts by unauthorised third parties to alter their use, outputs, or performance by exploiting system vulnerabilities); Article 72 (providers shall establish and document a post-market monitoring system that actively and systematically collects, documents, and analyses data on high-risk AI system performance throughout the system’s lifetime)

•       NIST AI RMF 1.0 (NIST AI 100-1, 2023) — Three-tier harm framework (harm to people: individual civil liberties, physical and psychological safety, and economic opportunity; group and community discrimination; societal harm to democratic participation and educational access; harm to organisations: operations, finances, reputation; harm to ecosystems: interconnected and interdependent systems, global financial systems, natural resources and environment); Seven trustworthiness characteristics (accountable and transparent; explainable and interpretable; fair – with harmful bias managed; privacy-enhanced; safe; secure and resilient; valid and reliable); Three categories of AI bias (systemic: present in datasets, organisational norms, and broader society; computational and statistical: arising from non-representative samples and algorithmic processes; human-cognitive: arising from how people interpret and act on AI outputs); MEASURE 2.6 (AI system evaluated for safety risks; residual negative risk not to exceed risk tolerance before deployment); MEASURE 2.7 (security and resilience evaluated and documented); MEASURE 2.8 (transparency and accountability risks examined and documented); MEASURE 2.10 (privacy risk examined and documented); MEASURE 2.11 (fairness and bias evaluated with disaggregated results documented); MEASURE 3.1 (approaches, personnel, and documentation in place to regularly identify and track existing, unanticipated, and emergent AI risks based on actual performance in deployed contexts); MANAGE 1.4 (residual risks documented and disclosed to downstream AI actors); MANAGE function (continuous application to deployed systems as methods, contexts, risks, and needs evolve over time)

•       NIST AI 600-1: Generative AI Profile (2024) — Risk taxonomy for generative AI including: confabulation (hallucination); harmful bias and homogenisation; information integrity risks; data privacy risks; value chain and component integration risks. Categorisation by risk source: technical/model risks from malfunction; misuse by humans (malicious use); ecosystem and societal risks. Cross-cutting nature of data privacy risk across all three categories.

 

This article is part of AIPMO’s PM Practice series. See also: The AI Project Charter | AI Impact Assessments | The PM’s Guide to NIST AI RMF