PM Takeaways

• Agentic AI shifts PM accountability from reviewing outputs to governing autonomous action sequences — EU AI Act Article 14 requires deployers to assign named, competent humans for oversight before any high-risk deployment, and that assignment must be documented.

• Your project charter must define autonomy boundaries, escalation triggers, and a circuit breaker before an agent touches production — PMI CPMAI Phase IV treats safety and constraint system development as a mandatory gate, not an optional architecture decision.

• A single misinterpreted goal instruction can cascade across tool calls, APIs, and downstream systems — NIST MANAGE 2.4 requires documented deactivation criteria and escalation procedures established before deployment, not drafted in response to an incident.

• Agentic systems in high-risk use cases — employment screening, credit assessment, education, healthcare — inherit full EU AI Act Chapter III obligations regardless of the underlying model’s licence or origin.

• Multi-agent systems require system-level governance, not just per-agent controls — accountability gaps between orchestrator and worker agents are a documented failure mode that no individual agent’s guardrails can prevent on their own.
The AI systems you’ve been managing are about to get more complicated.
Traditional AI makes recommendations that humans review and implement. Agentic AI takes action on its own — browsing the web, executing code, calling APIs, managing workflows, even delegating tasks to other AI systems. The shift from “AI that advises” to “AI that acts” changes how you plan, govern, and oversee AI projects in ways none of the existing frameworks fully anticipated when they were written.
The frameworks you’ve learned still apply — but they need to stretch. This article explains where they stretch, what new obligations they create, and what PMs need to govern before an agentic system goes anywhere near production.
What Makes AI “Agentic”
Not all AI is agentic. The distinction matters for governance because the accountability model changes at each level of autonomy.
Traditional AI operates on a simple cycle: input, model, output. A human reviews the output and decides what to do. Agentic AI operates on a different cycle: goal, planning, action, observation, replanning, further action. The agent doesn’t wait for human approval between steps. It pursues its objective through whatever action sequence its reasoning produces.
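The goal, planning, action, observation, replanning cycle can be written down as a control loop. The sketch below is a minimal illustration, not any vendor's framework: the `plan_fn`, `act_fn`, and `done_fn` callables are hypothetical stand-ins for the reasoning model, tool layer, and goal check of a real agent.

```python
def run_agent(goal, plan_fn, act_fn, done_fn, max_steps=10):
    """Minimal sketch of the agentic loop: goal, planning, action,
    observation, replanning. Note there is no human approval between
    steps, which is why oversight must be designed into the loop itself."""
    history = []
    for _ in range(max_steps):
        step = plan_fn(goal, history)        # planning / replanning
        observation = act_fn(step)           # action via some tool
        history.append((step, observation))  # observation feeds the next plan
        if done_fn(goal, history):
            return history
    raise RuntimeError("step budget exhausted before the goal was met")

# Toy usage: the "goal" is a target count; each step counts up by one.
trace = run_agent(
    goal=3,
    plan_fn=lambda goal, hist: len(hist) + 1,        # next step number
    act_fn=lambda step: step,                        # tool call echoes the step
    done_fn=lambda goal, hist: hist[-1][1] >= goal,  # stop at the target
)
```

Even in this toy, the action sequence is produced by the loop, not specified in advance, which is the governance point: `max_steps` is the only hard bound the deployer controls directly.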
| Characteristic | What It Means for Governance |
|---|---|
| Autonomy | Acts without step-by-step human direction — oversight must be designed into the system, not applied at each step |
| Goal-directed | Works toward objectives rather than responding to single prompts — goal specification becomes a risk surface |
| Tool use | Interacts with external systems, APIs, databases, files — each tool is an expanded attack surface |
| Planning | Breaks down complex tasks into steps — action sequences may be unpredictable even if the goal was clearly defined |
| Persistence | Maintains context across multiple actions — errors compound rather than being isolated to a single response |
| Adaptability | Adjusts approach based on results — the system you tested may not be the system operating in production |
The Spectrum of Agency
Agency isn’t binary. Most current enterprise deployments sit in the assisted-to-supervised range — but the technology is rapidly enabling more delegated and autonomous use cases.
| Level | Description |
|---|---|
| Advisory | AI recommends, human acts. Example: chatbot suggests a draft response. |
| Assisted | AI drafts, human approves each step. Example: AI writes email, human sends. |
| Supervised | AI acts within set boundaries, human monitors. Example: AI schedules within calendar rules. |
| Delegated | AI completes tasks, human reviews outcomes. Example: agent researches and produces report. |
| Autonomous | AI operates independently toward goals. Example: self-managing system optimization. |
Your governance approach should be calibrated to where on this spectrum your deployment sits. A supervised agent with limited tool access needs different oversight than a delegated agent with broad system permissions.
Why Agentic AI Is Different
Unpredictable Action Sequences
Traditional AI is bounded and predictable: given this input, the system produces this output. Agentic AI is emergent: given this goal, the action sequence that follows depends on what the agent encounters at each step. You cannot fully specify in advance what steps an agent will take. The sequence emerges from the interaction between the agent’s reasoning, its tools, and the environment.
PMI-CPMAI™ Phase IV identifies this directly: managing agentic AI “requires new oversight methods beyond traditional software” because emergent behaviors cannot be predicted from component-level testing alone.
Cascading Errors
In traditional AI, a wrong answer is contained — the human catches it before acting. In agentic AI, one mistake can cascade: a misinterpreted instruction leads to wrong research, which feeds a flawed analysis, which triggers incorrect actions in downstream systems. By the time the error is visible, it may have propagated across multiple tools and data stores.
This is why NIST MANAGE 2.4 requires that mechanisms for superseding, disengaging, or deactivating AI systems be established before deployment, not drafted after an incident: “Mechanisms are in place and applied, and responsibilities are assigned and understood, to supersede, disengage, or deactivate AI systems that demonstrate performance or outcomes inconsistent with intended use.”
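In implementation terms, this means deactivation criteria are evaluated before every action, not reconstructed after an incident. The sketch below is illustrative only: the `CircuitBreaker` class and its thresholds are assumptions for this article, and real criteria would come from your documented risk tolerances per MG-2.4-004.

```python
class CircuitBreaker:
    """Illustrative kill switch: deactivation criteria are defined up front
    and checked before each agent action. Thresholds here are examples;
    real values come from your organization's risk appetite."""

    def __init__(self, max_errors=3, max_cost_usd=50.0):
        self.max_errors = max_errors
        self.max_cost_usd = max_cost_usd
        self.errors = 0
        self.cost_usd = 0.0
        self.tripped = False

    def record(self, error=False, cost_usd=0.0):
        """Update running totals after each action and trip if any
        deactivation criterion is met."""
        self.errors += int(error)
        self.cost_usd += cost_usd
        if self.errors >= self.max_errors or self.cost_usd >= self.max_cost_usd:
            self.tripped = True  # disengage: no further actions allowed

    def allow_action(self):
        return not self.tripped

# Two consecutive failures trip a breaker configured for max_errors=2,
# superseding the agent before a third attempt.
breaker = CircuitBreaker(max_errors=2)
breaker.record(error=True)
breaker.record(error=True)
```

The design point is that `allow_action` is consulted by the loop that runs the agent, so a tripped breaker stops the sequence mid-cascade rather than after it completes.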
Expanded Attack Surface
Every tool an agent can use is a potential risk vector. Web browsing exposes the system to prompt injection from malicious pages. File access enables data exfiltration or corruption. API calls can trigger unintended actions in connected systems. Code execution can have consequences far beyond the immediate task.
Accountability Gaps
When an agent takes a sequence of autonomous actions, traditional accountability models break down. Who is responsible for the outcome — the person who set the goal, the team that configured the agent, or the vendor who built the model? The answer isn’t always clear, and in multi-agent architectures it becomes harder still. Governance must assign accountability before deployment, not search for it after.
The Regulatory Landscape
EU AI Act
The EU AI Act’s risk-based approach applies to agentic AI without modification. High-risk agentic uses — employment screening, credit assessment, education, law enforcement — require full Chapter III compliance: risk management, data governance, technical documentation, quality management, post-market monitoring, and human oversight. These requirements attach to the use case, not the model.
Article 14 is the central obligation for deployers. High-risk AI systems must be “designed and developed… that they can be effectively overseen by natural persons during the period in which they are in use.” Article 14(4) specifies that the persons assigned oversight must be able to “intervene in the operation of the high-risk AI system or interrupt the system through a ‘stop’ button or a similar procedure.” For agentic systems, this means a functional circuit breaker is not optional.
Article 14(4)(b) also requires that assigned persons remain “aware of the possible tendency of automatically relying or over-relying on the output produced by a high-risk AI system” — automation bias. In agentic systems where action sequences happen faster than human review cycles, this risk is structurally amplified.
Because agentic systems can self-update and adapt, the EU AI Act’s emphasis on continuous monitoring is especially important. Article 5’s prohibitions on manipulation and exploitation of vulnerabilities apply to agentic behaviors, and the IAPP/HCLTech Global AI Governance Law & Policy Series 2025 notes that “because agentic systems can self-update, risk levels may evolve — underscoring continuous monitoring and in-life change control.”
NIST AI RMF
NIST has not yet issued agentic-specific guidance — a revision to address agentic AI is anticipated — but the AI RMF’s existing functions apply directly. NIST hosted a workshop in January 2025 specifically to develop a taxonomy of agentic AI tools, with published lessons learned in August 2025.
Four functions are directly relevant:
• MAP 3.5: Processes for human oversight must be “defined, assessed, and documented in accordance with organizational policies.” NIST is explicit that oversight is a shared responsibility and “attempts to properly authorize or govern oversight practices will not be effective without organizational buy-in and accountability mechanisms.”
• MANAGE 2.4: Mechanisms must be in place to supersede, disengage, or deactivate AI systems. Action MG-2.4-004 requires teams to “establish and regularly review specific criteria that warrants the deactivation of GAI systems in accordance with set risk tolerances and appetites.”
• MANAGE 2.3: Procedures must be in place to respond to and recover from previously unknown risks. Action MG-2.3-001 requires that response and recovery plans “account for the GAI system value chain” and include communication procedures for downstream actors.
• MANAGE 4.1: Post-deployment monitoring plans must include “mechanisms for capturing and evaluating input from users and other relevant AI Actors, appeal and override, decommissioning, incident response, recovery, and change management.”
Singapore IMDA
Singapore’s IMDA published its Agentic AI Governance Framework in 2025, introducing a principal-agent accountability model that is directly actionable for PMs. The framework defines the deploying organization as the principal — the party who authorizes the agent’s actions and retains accountability for outcomes — and establishes five governance principles that operate at the project level:
• Directed: Agents must operate within goals and boundaries set by the principal. Scope creep — the agent interpreting its goals more broadly than intended — is a primary failure mode.
• Sanctioned: Agents should only take actions explicitly authorized. Any action outside the sanctioned set requires escalation, not autonomous judgment.
• Supervised: Meaningful human oversight must be maintained throughout operation, not only at initial deployment.
• Transparent: Agent actions, decisions, and data access must be logged in a way that allows audit and explainability.
• Minimal footprint: Agents should request only the permissions and access they need for the current task. Broad, standing permissions are a governance risk, not an efficiency gain.
The minimal footprint principle has direct PM implications: resist the temptation to grant broad system access for convenience. Every permission granted to an agent expands the blast radius of a misinterpreted goal or a compromised prompt.
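One way to operationalize minimal footprint is an explicit per-task allow-list enforced at the tool layer, so any unsanctioned call escalates instead of executing. A sketch under assumed names (`make_tool_gate` and `UnsanctionedAction` are hypothetical, not from any framework):

```python
class UnsanctionedAction(Exception):
    """Raised so a human can review the request, reflecting IMDA's
    'sanctioned' principle: out-of-scope actions escalate, never execute."""


def make_tool_gate(sanctioned):
    """Return a guard that permits only explicitly granted tool names.
    Broad standing permissions are deliberately hard to express here:
    each task gets its own allow-list."""
    def gate(tool_name):
        if tool_name not in sanctioned:
            raise UnsanctionedAction(f"{tool_name} requires human escalation")
        return tool_name  # a real system would return the callable itself
    return gate

# Grant only what this task needs:
gate = make_tool_gate({"read_calendar", "draft_email"})
gate("draft_email")        # allowed for this task
# gate("delete_records")   # would raise UnsanctionedAction
```

The narrower the set passed to `make_tool_gate`, the smaller the blast radius of a misinterpreted goal or injected prompt.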
Singapore’s IMDA also published the Government Technology Agency’s Agentic AI Primer (2025) for public sector deployments — the first government-level operational framework for agentic AI governance, covering multi-agent architectures and accountability chains.
Global Regulatory Picture
Most jurisdictions are applying existing frameworks rather than enacting agentic-specific law. The IAPP/HCLTech Global AI Governance Law & Policy Series 2025 maps the current landscape:
• United States: No federal agentic-specific legislation. Sector-specific laws will likely apply as agentic AI enters regulated industries. NIST’s anticipated AI RMF revision is the most significant pending development.
• EU: Existing AI Act obligations apply based on use case risk classification. Agentic AI that qualifies as high-risk inherits all Chapter III requirements. Continuous monitoring obligations are especially relevant given self-updating system risks.
• China: Broader regulations on algorithms, generative AI, and automated decision-making apply. Standards for “intelligent agents” under ITU-T F.748.46 are emerging. CSL amendments effective January 2026 add new AI-specific provisions.
• UK: General responsible AI principles apply; sector-specific agentic guidance is expected.
• UAE: Beginning to differentiate between autonomous AI — operating within predefined parameters — and agentic AI — pursuing goals with greater flexibility. Governance frameworks are being extended to address this distinction.
Your Governance Obligations
Project Charter Requirements
An agentic AI project charter must address elements that don’t appear in traditional AI project documentation. Before any agent is deployed, the following must be defined and formally approved:
| Charter Element | What Must Be Defined |
|---|---|
| Autonomy level | Where on the advisory-to-autonomous spectrum does this system sit? This determines oversight intensity. |
| Tool access | What systems, APIs, and data can the agent interact with? Each must be explicitly listed. |
| Boundaries | What actions are explicitly prohibited? Stated as rules, not as general principles. |
| Escalation triggers | Under what specific conditions must the agent stop and seek human input? |
| Circuit breaker | Who has authority to interrupt the system, how do they do it, and is it tested? |
| Rollback capability | Can agent actions be reversed? If not, what compensating controls exist for irreversible actions? |
| Accountability assignment | Who is the named human responsible for oversight per EU AI Act Article 14? |
Risk Assessment
Agentic AI introduces risk categories that don’t appear in standard AI risk registers. Each requires explicit documentation:
• Scope creep: Could the agent interpret its goals more broadly than intended? How are goal statements validated before deployment?
• Cascading errors: How far could a mistake propagate before being detected? What’s the maximum blast radius?
• Prompt injection: Could external content manipulate the agent’s behavior via web browsing, document reading, or API responses?
• Tool misuse: Could the agent use its authorized tools in ways that weren’t anticipated when permissions were granted?
• Resource consumption: Could the agent consume excessive compute, API calls, or budget autonomously? Are there hard caps?
• Data exposure: Could the agent inadvertently transmit sensitive information to external systems or third-party APIs?
• Unauthorized actions: Could the agent take actions outside its intended scope, either through misinterpretation or adversarial manipulation?
Human Oversight Design
EU AI Act Article 14 requires that oversight be designed into the system before deployment. For agentic AI, this means selecting a model appropriate to the risk level:
| Oversight Model | When to Use |
|---|---|
| Approval gates | Agent pauses at defined points for human sign-off. Required for high-risk or irreversible actions. |
| Boundary enforcement | Agent operates freely within defined constraints; triggers escalation at boundary. For lower-risk tasks with clear limits. |
| Monitoring and intervention | Agent acts; humans watch and can interrupt. For time-sensitive tasks where approval gates would break the use case. |
| Post-hoc review | Agent completes tasks; humans review outcomes. Only for low-risk, fully reversible actions. |
For most enterprise use cases, a combination is appropriate — tight approval gates on high-risk or irreversible actions, boundary enforcement on routine tasks. The oversight model must be specified in the project charter and tested before production deployment.
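The combined model reduces to a simple dispatch: irreversible or high-risk actions pause for sign-off, routine actions run inside an enforced boundary, and anything else escalates. The sketch below uses assumed action metadata (`risk`, `reversible`, a `ROUTINE_ALLOWED` set) to illustrate the shape, not a prescribed schema:

```python
ROUTINE_ALLOWED = {"fetch_report", "update_draft"}  # assumed routine boundary

def oversee(action, approve_fn):
    """Route one proposed action through the charter's oversight model.
    `action` is an assumed dict with 'name', 'risk', and 'reversible' keys;
    `approve_fn` is the human sign-off hook (e.g. a ticketing integration)."""
    if action["risk"] == "high" or not action["reversible"]:
        # Approval gate: irreversible or high-risk actions need sign-off.
        return "executed_with_approval" if approve_fn(action) else "blocked"
    if action["name"] not in ROUTINE_ALLOWED:
        # Boundary enforcement: outside the sanctioned set means escalate.
        return "escalated"
    return "executed"

# Routine action inside the boundary runs without a gate:
result_routine = oversee(
    {"name": "fetch_report", "risk": "low", "reversible": True},
    approve_fn=lambda a: False,
)
# Irreversible action is blocked when the human withholds approval:
result_blocked = oversee(
    {"name": "wire_funds", "risk": "high", "reversible": False},
    approve_fn=lambda a: False,
)
```

The useful property is that the gate is the default path for anything high-risk: the agent cannot reach `"executed"` for an irreversible action, no matter what its planning produces.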
Testing: Beyond Traditional QA
Traditional testing asks: given this input, does the system produce correct output?
Agentic testing must ask a different set of questions:
• Given this goal, does the agent take appropriate steps — not just the right final action, but the right path?
• Does the agent stay within its authorized boundaries when pursuing goals?
• How does the agent behave when it encounters unexpected or ambiguous situations mid-task?
• Can the agent be manipulated by adversarial content in its environment — malicious web pages, doctored API responses?
• Does the agent handle failures in connected systems gracefully, or do external failures cascade?
PMI-CPMAI™ Phase V for agentic AI specifies comprehensive validation requirements: individual agent performance testing, multi-agent coordination testing, safety mechanism validation, and “adversarial scenario testing that attempts to manipulate agent behavior through prompt injection, goal ambiguity, and environmental manipulation.” These are not optional test categories — they are Phase V gates.
CPMAI’s agentic success factors identify two specific gaps that traditional QA misses: circuit breaker effectiveness (does the safety mechanism actually stop the agent under the conditions where it should?) and constraint enforcement validation (does the agent stay within boundaries when pursuing goals, or does goal-directed reasoning override constraints?).
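Circuit breaker effectiveness can be written as a formal acceptance test rather than asserted in a design document. The toy harness below stands in for a real agent run; the point is the shape of the test, not the harness:

```python
def run_with_breaker(steps_requested, step_budget):
    """Toy harness: the 'agent' wants steps_requested actions; the breaker
    must halt it at step_budget no matter how strongly the goal pulls."""
    executed = 0
    for _ in range(steps_requested):
        if executed >= step_budget:  # breaker checked before each action
            break
        executed += 1
    return executed

def test_breaker_halts_goal_directed_agent():
    # Acceptance criterion: the agent "wants" 100 steps; the breaker must win.
    assert run_with_breaker(steps_requested=100, step_budget=5) == 5

def test_breaker_is_inert_within_budget():
    # And it must not interfere with runs that stay inside the boundary.
    assert run_with_breaker(steps_requested=3, step_budget=5) == 3

test_breaker_halts_goal_directed_agent()
test_breaker_is_inert_within_budget()
```

Against a real agent, the same two assertions become the Phase V gates CPMAI describes: the breaker stops the agent under trip conditions, and constraint enforcement holds even when goal-directed reasoning pushes against it.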
Post-Deployment Monitoring
Agentic systems require a monitoring framework that traditional AI production monitoring doesn’t address. NIST MANAGE 4.1 requires post-deployment monitoring plans to include “mechanisms for capturing and evaluating input from users and other relevant AI Actors, appeal and override, decommissioning, incident response, recovery, and change management.”
At minimum, your monitoring framework should track:
| Metric | Purpose |
|---|---|
| Action logs | Full audit trail of every action taken, every tool called, every data source accessed. |
| Boundary violations | Did the agent attempt actions outside its defined scope? These are early warning signals. |
| Escalation patterns | When does the agent seek human input? Changes in escalation frequency signal behavioral drift. |
| Error rates | How often do action sequences fail, and at what step? |
| Resource usage | Compute, API calls, cost — unexpected spikes signal scope creep or adversarial manipulation. |
| Outcome quality | Are completed tasks meeting quality standards? Degradation in quality precedes more serious failures. |
NIST MANAGE 4.1 action MG-4.1-002 specifically requires organizations to “establish, maintain, and evaluate effectiveness of organizational processes and procedures for post-deployment monitoring of GAI systems.” Evaluate effectiveness, not just implement monitoring. This means regular reviews of whether the monitoring framework is actually detecting the issues it was designed to catch.
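A practical discipline that serves most of the table above is one structured log record per agent action, so boundary violations, escalation patterns, and resource usage can all be computed from the same audit trail. A sketch with assumed field names (nothing here is a mandated schema):

```python
import json
from datetime import datetime, timezone

def log_action(agent_id, tool, args, sanctioned, outcome):
    """Emit one structured audit record per agent action as a JSON line.
    Boundary violations are simply records with sanctioned=False, so one
    trail feeds audit, escalation metrics, and drift detection alike."""
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "agent_id": agent_id,
        "tool": tool,
        "args": args,
        "sanctioned": sanctioned,  # boundary check result at call time
        "outcome": outcome,
    }
    return json.dumps(record)

# One line per action; a log aggregator can alert on sanctioned=False.
line = log_action("agent-7", "web_fetch", {"url": "https://example.com"}, True, "ok")
```

Emitting the record at the moment of the tool call, rather than reconstructing it later, is what makes the "you cannot test what you cannot observe" requirement in the development checklist below satisfiable.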
Multi-Agent Systems
Complexity increases substantially when multiple agents work together. Governance must account for the system as a whole, not just individual agents.
| Orchestration Pattern | Governance Challenge |
|---|---|
| Sequential | Agents hand off tasks in sequence — accountability across handoffs must be explicitly assigned. |
| Parallel | Agents work simultaneously on subtasks — coordination failures and conflicting actions are the primary risk. |
| Hierarchical | Supervisor agent delegates to worker agents — the supervisor’s instructions become a governance surface. |
| Collaborative | Agents negotiate and coordinate — emergent behavior from interactions is the hardest risk to anticipate in testing. |
PMI-CPMAI™’s guidance on multi-agent systems is direct: “Organizations should prioritize thorough testing and validation of agent behaviors, coordination, and safety systems.” Specifically, this means testing the system as a whole, not just validating individual agents.
Three risks are unique to multi-agent architectures:
• Accountability diffusion: When multiple agents contribute to an outcome, responsibility is harder to trace. The IMDA’s principal-agent model addresses this by requiring that accountability remain with the human principal regardless of how many agents are involved.
• Error amplification: Errors don’t just cascade — in parallel and collaborative architectures, they can be amplified across agents simultaneously.
• Emergent behavior: System-level behaviors can emerge from agent interactions that no individual agent was designed to produce and that unit-level testing would never surface.
PM Responsibilities
During Planning
• Assess whether agentic AI is the right approach for the use case — many use cases are better served by supervised or assisted models with less governance overhead
• Define autonomy level, tool access, boundaries, escalation triggers, and circuit breaker procedures in the project charter before any development work begins
• Map the use case to EU AI Act risk categories — if the application touches employment, credit, healthcare, or education, full Chapter III compliance is required regardless of the underlying model
• Apply the IMDA’s minimal footprint principle to tool access design: grant only what is needed for the current task, not what might be useful in future tasks
• Budget explicitly for extended testing, adversarial evaluation, and post-deployment monitoring infrastructure
During Development
• PMI-CPMAI™ Phase IV requires safety and constraint systems to be built alongside agent capabilities, not added after — boundary enforcement mechanisms, circuit breakers, and intervention capabilities must be developed as first-class deliverables
• Implement audit logging before any integration testing — you cannot test what you cannot observe
• Test boundary enforcement and escalation triggers as formal acceptance criteria, not as developer self-testing
• Conduct adversarial testing for prompt injection resistance before any external data access is enabled
• If building multi-agent systems: test the full system, not individual agents — PMI-CPMAI™ Phase V specifies multi-agent coordination testing as a separate gate
During Deployment and Post-Deployment
• Verify that the assigned human oversight person has the “necessary competence, training and authority” per EU AI Act Article 14(4) — this is a documented compliance requirement, not an informal designation
• Start with limited autonomy and expand gradually as monitoring data confirms the system is behaving within expected boundaries
• Establish the deactivation criteria required by NIST MANAGE 2.4 (MG-2.4-004) and test that the circuit breaker functions before production traffic begins
• Review action logs and boundary violation reports on a defined cadence — not only when something goes wrong
• Update boundaries and guardrails based on operational experience; the system’s behavior in production will reveal edge cases that testing missed
Scaling This to Your Context
Greenfield — First Agentic Deployment
Start with a supervised agent that has the minimum tool access required to deliver business value. Define boundaries tightly before you grant permissions broadly. The IMDA’s minimal footprint principle is your default starting position, not a constraint to be engineered around. Build and test your circuit breaker before you build and test anything else. The AIPMO AI Governance Advisor at app.aipmo.co can generate a project charter template for agentic AI deployments grounded in CPMAI Phase I business understanding requirements and EU AI Act Article 14 oversight obligations.
Emerging — Agentic Capabilities in Production
Formalize what is likely currently informal. Document the autonomy boundaries, escalation triggers, and oversight assignments that are probably understood by the team but not formally specified. Conduct the adversarial testing and circuit breaker validation that was likely skipped in an initial deployment. NIST MANAGE 4.1 requires that monitoring effectiveness be evaluated, not just that monitoring exists — run a gap assessment against that standard. The AIPMO AI Governance Advisor can help you design a monitoring plan that maps to NIST MANAGE 4.1 action items.
Established — Governing Agentic AI Across Multiple Systems
At scale, the challenge is governance consistency across teams and use cases. The IMDA principal-agent model provides a common accountability framework that can be applied portfolio-wide without requiring custom governance approaches per project. Map your portfolio-level agentic AI governance to EU AI Act Article 14 obligations and NIST MANAGE 2.4 deactivation standards to establish a defensible compliance baseline. The AIPMO AI Governance Advisor supports multi-organization context for Consultant-tier users and can help you develop portfolio-level agentic governance standards.
Framework References
• EU AI Act (Official Journal 2024) — Art. 14 (human oversight for high-risk AI systems); Art. 5 (prohibited AI practices including manipulation); Chapter III (high-risk AI system requirements).
• NIST AI RMF 1.0 (NIST AI 100-1, 2023) — MAP 3.5 (human oversight processes); MANAGE 2.3 (response to unknown risks); MANAGE 2.4 (deactivation and disengagement mechanisms); MANAGE 4.1 (post-deployment monitoring).
• NIST AI 600-1 — Generative AI Profile, 2024. MG-2.4-001 through MG-2.4-004 (deactivation criteria and escalation procedures).
• Singapore IMDA — Agentic AI Governance Framework, 2025. Principal-agent accountability model; five governance principles: directed, sanctioned, supervised, transparent, minimal footprint. Government Technology Agency of Singapore Agentic AI Primer, 2025.
• PMI — Guide to Leading and Managing AI Projects (CPMAI™), 2025. Phase IV (Model Development for Agentic AI); Phase V (Model Evaluation for Agentic AI); Agentic AI Success Factors.
• IAPP/HCLTech — Global AI Governance Law & Policy Series 2025. Agentic AI regulatory landscape; continuous monitoring and change control requirements.
AIPMO sits at the intersection of project management and AI governance. All content is grounded in published frameworks, not vendor marketing. For deployment-context guidance, visit app.aipmo.co.