PM Takeaways

• When you deploy third-party AI, you deploy the decisions its developer made about training data, fairness testing, and risk mitigation — and under the EU AI Act and NIST AI RMF, you are accountable for the outcomes regardless of who built the system. Vendor accountability does not transfer your compliance obligations; it supplements them.

• NIST AI RMF GOVERN 6.1 requires policies that address third-party AI systems and data, including transparency into training data, algorithms, assumptions, and limitations — if a vendor cannot or will not provide this information, that opacity is itself a risk indicator that should factor into your procurement decision.

• Standard software contracts are structurally inadequate for AI procurement: they typically do not address model change notification, bias testing results, IP indemnification for training data, or audit rights — every one of these must be negotiated explicitly before contract execution, not requested after a problem emerges.

• Pre-trained models used in your system require ongoing monitoring as part of regular AI system maintenance; NIST AI RMF MANAGE 3.2 is explicit that third-party model behaviour can change after fine-tuning, drift, or vendor updates — a model that passed validation at procurement may behave differently six months into production.

• For critical AI functions, fallback procedures are a governance requirement, not an IT continuity nicety — NIST GenAI GV-6.2-006 requires policies and procedures for rollover and fallback technologies, acknowledging that fallback may include manual processing. Design for vendor failure before you need to execute it.
Not every AI project builds from scratch. Most organisations adopt pre-trained models, integrate AI APIs, license AI-powered software, or deploy vendor solutions. This approach accelerates time-to-value and reduces technical complexity — but it introduces risks that traditional vendor management frameworks were not designed to handle.
When you use third-party AI, you inherit the decisions someone else made about training data, model architecture, fairness testing, and risk mitigation. You also inherit their limitations, their biases, and their vulnerabilities. And under an increasingly mature regulatory landscape, you inherit compliance obligations that cannot be contracted away to the vendor. The PM who understands this manages procurement differently.
The AI Value Chain
AI systems frequently involve multiple third-party components, each of which contributes its own risk profile to your deployment. A single production AI system may draw on several of these simultaneously.
| Component Type | Examples |
| --- | --- |
| Foundation models | GPT, Claude, Llama, Gemini — pre-trained models used via API or fine-tuned for your use case |
| Pre-trained models | Specialised models for vision, speech, translation, sentiment, or document processing |
| Training datasets | Licensed or purchased data used to train or fine-tune custom models |
| AI platforms | Cloud ML services, MLOps platforms, feature stores, labelling services |
| Embedded AI | AI capabilities built into enterprise software — CRM, ERP, HRIS, procurement tools |
| AI APIs | Third-party services for specific functions: OCR, fraud detection, recommendations, scoring |
| Software libraries | Open-source ML frameworks, data processing tools, evaluation packages |
Each component in this chain introduces potential risks. Errors in any component can cascade downstream, and the complexity of these dependencies makes attribution difficult when something goes wrong. NIST AI RMF MAP 4.2 specifically requires that internal risk controls for third-party AI technologies are identified and documented — the value chain must be mapped before it can be managed.
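The MAP 4.2 requirement — identify and document internal risk controls for each third-party component — lends itself to a structured inventory rather than a static document. A minimal sketch in Python; the component names, fields, and vendor names are illustrative assumptions, not prescribed by NIST:

```python
from dataclasses import dataclass, field

@dataclass
class AIComponent:
    """One entry in the third-party AI value chain map (per NIST AI RMF MAP 4.2)."""
    name: str
    component_type: str  # e.g. "foundation model", "AI API", "training dataset"
    vendor: str
    risk_controls: list[str] = field(default_factory=list)  # documented internal controls
    upstream: list[str] = field(default_factory=list)       # components this one depends on

def unmanaged_components(chain: list[AIComponent]) -> list[str]:
    """Flag components with no documented risk controls — these are MAP 4.2 gaps."""
    return [c.name for c in chain if not c.risk_controls]

# Hypothetical two-component chain: one governed, one not yet assessed
chain = [
    AIComponent("summariser-llm", "foundation model", "ExampleVendor",
                risk_controls=["output filtering", "human review"]),
    AIComponent("ocr-service", "AI API", "OtherVendor"),  # no controls documented yet
]
print(unmanaged_components(chain))  # -> ['ocr-service']
```

Keeping the map as data rather than prose makes the "document before you manage" step checkable: any component that appears in production but not in the inventory, or appears with an empty control list, is a visible gap.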
Third-Party AI Risks
Transparency and Documentation Gaps
Without adequate vendor documentation, you may not know what data the model was trained on, which populations are underrepresented, what testing was performed, what known limitations exist, or how the model was optimised — and for whose use case. NIST AI RMF GOVERN 6.1 identifies transparency into third-party system functions, including knowledge about training data, training and inference algorithms, and assumptions and limitations, as a core policy requirement. Without this information, you cannot fully assess fitness for purpose or meet your own transparency obligations downstream.
Inherited Bias
A model trained on biased data will produce biased outputs regardless of your intentions. When you deploy third-party AI, you deploy the biases embedded in its training pipeline — and you are accountable for the outcomes in your deployment context. NIST AI RMF GOVERN 6.1 requires organisations to review third-party material for risks related to bias, data privacy, and security vulnerabilities before deployment. A vendor’s bias testing on their general population does not substitute for your own validation in your specific use case.
Intellectual Property Risks
Training data may include copyrighted material, personal data used without appropriate consent, or content with legal restrictions. Using AI trained on such data may expose your organisation to infringement claims. The EU AI Act places copyright compliance obligations on GPAI model providers, but downstream deployers must still assess exposure — particularly in sectors where IP sensitivity is high.
Security Vulnerabilities
Third-party components can introduce attack vectors that your own security posture does not address: malware or backdoors in software libraries, data poisoning in training datasets, adversarial vulnerabilities in model architectures, and privacy leaks through model memorisation of training data. NIST GenAI GV-6.1-009 requires vendor assessments to include evaluation against incident and vulnerability databases — a step that most standard IT procurement processes do not include.
Dependency and Lock-In
Reliance on third-party AI creates dependencies that may be difficult to unwind. Vendors discontinue services, change pricing, modify model behaviour in updates, revise API terms, or exit the market. NIST GenAI GV-6.2-007 advises explicitly reviewing vendor contracts to avoid arbitrary or capricious termination of critical AI services and non-standard terms that amplify or defer liability in unexpected ways. Dependency risk must be assessed at procurement, not discovered at renewal.
Compliance Transfer
Regulatory requirements for AI systems apply to you as the deployer regardless of who built the components. Under the EU AI Act, deployers of high-risk AI systems have obligations even when using third-party systems — including ensuring users can interpret outputs, maintaining human oversight mechanisms, and cooperating with post-market monitoring. NIST AI RMF MANAGE 3.1 echoes this: third-party AI risks and benefits must be regularly monitored and risk controls applied and documented. Compliance obligations do not transfer through the supply chain; they accumulate.
Due Diligence for AI Procurement
Traditional vendor due diligence evaluates financial stability, service reliability, and contractual terms. AI due diligence must go further, evaluating the model itself — its training, its testing, and its known failure modes. NIST GenAI GV-6.1-009 requires due diligence processes to address intellectual property, data privacy, security, and ongoing monitoring for third-party GenAI specifically.
Documentation to Request
| Document | Purpose |
| --- | --- |
| Model cards | Intended use, training data summary, performance metrics, known limitations, out-of-scope uses |
| Datasheets for datasets | Data sources, collection methods, preprocessing, potential biases, gaps in coverage |
| System cards | Full system behaviour, safety measures, human oversight integration, deployment constraints |
| Testing and validation reports | Evaluation methodology, fairness and bias testing, adversarial testing results |
| Incident history | Past problems, how they were identified, how they were addressed, recurrence rate |
If a vendor cannot or will not provide this documentation, treat that opacity as a risk signal. NIST AI RMF GOVERN 6.1 suggests tracking third parties that prevent or hamper risk mapping as indicators of increased risk. Transparency gaps at procurement rarely improve post-contract.
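The documentation checklist above can be applied mechanically during procurement. A simple sketch, assuming the five artefact names from the table (your organisation may require more or fewer):

```python
# Artefacts every AI vendor should supply (from the documentation table above)
REQUIRED_DOCS = {"model card", "datasheet", "system card",
                 "testing report", "incident history"}

def documentation_gaps(provided: set[str]) -> set[str]:
    """Return the artefacts a vendor has not supplied.

    A non-empty result is the opacity signal GOVERN 6.1 says to track:
    it should factor into the procurement decision, not be waived.
    """
    return REQUIRED_DOCS - provided

# Hypothetical vendor response: only two of five artefacts supplied
gaps = documentation_gaps({"model card", "testing report"})
print(sorted(gaps))  # -> ['datasheet', 'incident history', 'system card']
```

Recording the gap set per vendor, rather than a pass/fail flag, preserves the evidence for the periodic reviews discussed later.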
Evaluation Criteria
| Criterion | Key Questions |
| --- | --- |
| Fitness for purpose | Is this AI appropriate for your specific use case? Has it been validated in similar operational contexts? |
| Performance | What are the accuracy metrics? How were they measured? On what populations and in what conditions? |
| Fairness | Has bias testing been performed? Across which demographic groups? What were the results and accepted thresholds? |
| Transparency | Can individual decisions be explained? Is documentation sufficient for your compliance obligations? |
| Security | What security testing has been performed? What vulnerabilities are known and how have they been addressed? |
| Privacy | What data was used for training? How is your input data handled at inference? Is it used to retrain the model? |
| Regulatory support | Does the vendor understand the regulatory requirements in your operating jurisdictions? |
| Incident response | What are the vendor’s notification timelines and support obligations when the AI causes harm? |
Approved Vendor Lists
NIST GenAI GV-6.1-007 recommends maintaining an inventory of all third-party entities with access to organisational content and establishing approved AI technology and service provider lists. Define the criteria for inclusion, the process for periodic review, and the conditions that would trigger removal. This is a governance artefact that procurement, legal, and technology should own jointly.
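The approved-list mechanics — inclusion criteria, periodic review, removal triggers — can be enforced in code once the list is data. A minimal sketch; the vendor names, the annual default, and the `overdue_reviews` helper are illustrative assumptions:

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class ApprovedVendor:
    """Entry on the approved AI provider list (per NIST GenAI GV-6.1-007)."""
    name: str
    approved_services: list[str]
    last_review: date
    review_interval_days: int = 365  # assumed annual review cadence

def overdue_reviews(vendors: list[ApprovedVendor], today: date) -> list[str]:
    """Vendors whose periodic review has lapsed — candidates for suspension."""
    return [v.name for v in vendors
            if (today - v.last_review).days > v.review_interval_days]

vendors = [
    ApprovedVendor("VendorA", ["OCR"], date(2024, 1, 10)),      # reviewed >1y ago
    ApprovedVendor("VendorB", ["scoring"], date(2025, 3, 1)),   # recently reviewed
]
print(overdue_reviews(vendors, date(2025, 6, 1)))  # -> ['VendorA']
```

The useful property is that "periodic review" becomes a query, not a calendar reminder someone may miss; procurement, legal, and technology can all run it against the same record.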
Contract Considerations
Standard software contracts are structurally inadequate for AI procurement. They were designed for deterministic software with stable behaviour. AI systems change, drift, and fail in ways that standard SLAs and limitation of liability clauses do not anticipate. NIST GenAI GV-6.2-007 identifies several categories of contract terms that directly affect your ability to govern third-party AI in production.
Key Contract Elements
| Element | Why It Matters |
| --- | --- |
| Liability allocation | Who is responsible when the AI causes harm? Avoid blanket terms that defer all liability to the deployer. |
| Incident notification | Require timely notification of serious incidents arising from the vendor’s AI, with defined timelines. |
| Change notification | Require advance notice before material changes to model behaviour, architecture, or training data. |
| Performance SLAs | Define acceptable performance levels — including fairness metrics — and remedies if they are not met. |
| Audit rights | Include the right to evaluate the vendor’s AI processes, documentation, and testing results. |
| Data handling | Specify how your input data is used — particularly whether it is used to train or fine-tune the model. |
| IP indemnification | Protect against claims arising from the vendor’s training data, especially for copyright and personal data. |
| Termination rights | Ensure you can exit the contract if the AI does not meet requirements or compliance obligations. |
| Data portability | Ensure you can retrieve your data and any fine-tuned artefacts if you switch vendors. |
Terms to Avoid
Watch for contract language that allows unlimited use of your data for model training without consent, permits arbitrary changes to model behaviour without notification, disclaims all responsibility for AI outputs, creates high switching costs through proprietary data formats, or requires you to indemnify the vendor for AI-related claims. These terms are increasingly common in standard AI vendor agreements and must be actively negotiated out.
Ongoing Monitoring
Due diligence does not end at contract signing. NIST AI RMF MANAGE 3.2 is explicit that pre-trained models used in development must be monitored as part of regular AI system maintenance — model behaviour can change after fine-tuning, after vendor updates, or through drift in your production data distribution. A model that passed validation at procurement may behave differently six months into deployment.
What to Monitor
| Area | Focus |
| --- | --- |
| Performance | Is the AI still performing as expected in your environment against your baseline metrics? |
| Behaviour changes | Have vendor updates changed outputs in ways that affect your use case or compliance posture? |
| Fairness drift | Are bias metrics stable across demographic groups, or drifting as the user population or data changes? |
| Incidents | Has the vendor reported incidents? Have you experienced issues that should be reported upstream? |
| Compliance alignment | Do vendor practices still align with evolving regulatory requirements in your operating jurisdictions? |
| Security | Are there new vulnerabilities reported in vendor components, model architectures, or dependencies? |
Review Cadence
Establish a structured review cadence calibrated to the risk level of the AI system.
• Continuous: Automated performance and fairness metric monitoring with defined alert thresholds
• Monthly: Review of incidents, user feedback, and any vendor communications about model changes
• Quarterly: Vendor performance against SLAs, including incident response and change notification
• Annually: Full reassessment of vendor relationship, AI suitability, and alignment with current regulatory requirements
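The continuous tier — automated monitoring with defined alert thresholds — reduces to comparing current metrics against the baseline established at validation. A minimal sketch; the metric names, the 0.05 tolerance, and the assumption that higher values are better are all illustrative and should be set per system:

```python
def check_thresholds(metrics: dict[str, float], baseline: dict[str, float],
                     tolerance: float = 0.05) -> list[str]:
    """Return alert messages for metrics that degraded beyond tolerance vs. baseline.

    Assumes higher-is-better metrics; a missing reading is itself an alert,
    since silent telemetry gaps are how drift goes unnoticed.
    """
    alerts = []
    for name, base in baseline.items():
        current = metrics.get(name)
        if current is None:
            alerts.append(f"{name}: no current reading")
        elif base - current > tolerance:
            alerts.append(f"{name}: {current:.3f} vs baseline {base:.3f}")
    return alerts

# Hypothetical baseline from procurement-time validation
baseline = {"accuracy": 0.91, "demographic_parity": 0.97}
current = {"accuracy": 0.84, "demographic_parity": 0.96}
print(check_thresholds(current, baseline))  # -> ['accuracy: 0.840 vs baseline 0.910']
```

Note that the fairness metric is checked on exactly the same footing as accuracy — that is the point of including fairness metrics in the SLA rather than treating them as a one-off procurement test.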
Incident Response for Third-Party AI
When third-party AI causes problems, the incident response process must account for the vendor dependency. NIST GenAI GV-6.2-003 requires incident response plans for third-party AI technologies to define ownership of response functions, communicate plans to all relevant AI actors, and be rehearsed at a regular cadence — not assembled for the first time during an active incident.
| Element | Action | Vendor Dependency |
| --- | --- | --- |
| Detection | Monitor for problems in third-party components via automated tooling and user feedback channels | Define what vendor telemetry you have access to under the contract |
| Containment | Be able to disable or bypass third-party AI and activate fallback procedures immediately | Test fallback activation before you need it; manual fallback may be the only option |
| Escalation | Know exactly how to reach vendor support for urgent issues, with documented contacts and SLAs | Verify escalation paths at contract execution, not during an incident |
| Communication | Coordinate incident communication with the vendor to avoid contradictory public statements | Clarify communication ownership in the contract; align on timelines for disclosure |
| Documentation | Track all third-party AI incidents with timestamps, decisions made, and outcomes | Document vendor response times and quality for use in periodic relationship reviews |
| Review | Incorporate vendor incidents into relationship reviews; poor incident response is a termination trigger | Review contract terms after every significant incident for adequacy |
Fallback Procedures
For critical AI functions, fallback procedures are a governance requirement. NIST GenAI GV-6.2-006 requires policies and procedures to test and manage risks related to rollover and fallback technologies, explicitly acknowledging that fallback may include manual processing. Design and test your fallback before you need it. An AI system with no viable fallback is an operational dependency that has not been risk-assessed.
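The structural pattern behind GV-6.2-006 is a routing decision: if the third-party call fails, divert to the fallback path, which may be as simple as a queue for manual processing. A minimal sketch with a simulated vendor outage; the function names and payload shape are hypothetical, and a production version would add logging, alerting, and a circuit breaker rather than retrying a dead service on every call:

```python
from typing import Callable

def with_fallback(primary: Callable[[str], str],
                  fallback: Callable[[str], str]) -> Callable[[str], str]:
    """Wrap a third-party AI call so any failure routes to the fallback path."""
    def call(payload: str) -> str:
        try:
            return primary(payload)
        except Exception:
            # In production: log the incident and alert the vendor-management owner
            return fallback(payload)
    return call

def vendor_api(payload: str) -> str:
    raise ConnectionError("vendor service unavailable")  # simulated outage

def manual_queue(payload: str) -> str:
    return f"queued for manual review: {payload}"

classify = with_fallback(vendor_api, manual_queue)
print(classify("invoice-123"))  # -> queued for manual review: invoice-123
```

Exercising this wrapper in a drill, with the primary deliberately disabled, is the cheap version of "design and test your fallback before you need it"; the expensive version is discovering during an outage that the manual queue has no staffing plan.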
Regulatory Implications
EU AI Act
The EU AI Act’s value chain provisions are explicit. Deployers of high-risk AI systems have obligations even when using third-party components — including risk management, human oversight, and post-market monitoring. GPAI model providers are required to supply technical documentation and instructions for use to downstream deployers, which forms the basis of deployer compliance. Downstream deployers can lodge complaints about upstream providers’ infringements with the AI Office — a recourse that presupposes the deployer has conducted sufficient due diligence to identify the infringement.
Where a party puts its name or trademark on a third-party high-risk AI system, it assumes the obligations of a provider under the Act — regardless of who built the underlying model. This has significant implications for embedded AI in enterprise software and white-label AI products.
NIST AI RMF
The NIST AI RMF addresses value chain risk across multiple functions. GOVERN 6.1 requires policies covering third-party AI systems, data, and IP risks. GOVERN 6.2 requires contingency processes for failures in third-party systems deemed high-risk. MAP 4.2 requires internal risk controls for third-party AI components to be identified and documented. MANAGE 3.1 requires that third-party AI risks are regularly monitored with controls applied and documented. Together these constitute a comprehensive third-party AI governance programme — not a single procurement checklist.
PM Responsibilities by Phase
During Planning
• Identify all third-party AI components in scope and map the full value chain from model provider to deployment
• Include vendor due diligence as a formal project deliverable with its own timeline, owner, and acceptance criteria
• Budget for evaluation, validation in your context, and ongoing monitoring — not just procurement costs
• Define the requirements that third-party AI must meet before deployment, including performance, fairness, and documentation thresholds
During Procurement
• Conduct AI-specific due diligence using the documentation and evaluation criteria above — not standard IT vendor assessment
• Request and review model cards, datasheets, testing reports, and incident history before contract execution
• Negotiate AI-specific contract terms covering change notification, audit rights, IP indemnification, and data handling
• Establish acceptance criteria for third-party AI that must be validated in your deployment environment, not just accepted from vendor documentation
During Implementation
• Validate that third-party AI meets your requirements in your specific context — vendor test results on different populations or use cases are insufficient
• Test for bias and fairness in your deployment environment with your actual user population
• Establish monitoring infrastructure for third-party components before go-live, not after
• Document all third-party dependencies in the project’s AI system documentation for governance and examination readiness
Post-Deployment
• Monitor third-party AI performance and behaviour against your defined thresholds on the cadence appropriate to the risk level
• Track vendor updates and formally assess their impact on your deployment before accepting them into production
• Manage incidents involving third-party components using your defined response process, with vendor escalation paths tested
• Conduct periodic vendor reviews on the schedule established at procurement; use incident history and SLA performance as inputs
Questions to Ask Vendors
Use these questions across the procurement and ongoing monitoring lifecycle. A vendor’s willingness and ability to answer them is itself a governance signal.
Training and Data
• What data was used to train this model, and how was it sourced and licensed?
• Were there data quality, representativeness, or coverage gaps that affect performance for specific populations or use cases?
• How do you handle personal data in training, and have you assessed compliance with applicable data protection law?
• Is our input data used to train or fine-tune the model? If so, under what conditions and with what controls?
Testing and Validation
• What testing was performed before release, and can you provide the full test reports?
• How do you test for fairness and bias, and what were the results across demographic groups?
• What are the known limitations of this model, and in which contexts is it not appropriate to use?
• Have you tested for adversarial vulnerabilities, data poisoning risks, and model memorisation of training data?
Operations and Incident Response
• How and when do you notify customers of model changes that could affect outputs or behaviour?
• What are your defined response times for critical incidents, and what remedies are available if you miss them?
• Can you provide your incident history for this model, including how issues were identified and resolved?
• What fallback or continuity options are available if your service is disrupted?
Compliance and Audit
• How do you support customers’ obligations under the EU AI Act, including documentation for deployers?
• What audit rights do you provide, and have you been audited by third parties?
• How do you handle IP indemnification for claims arising from your training data?
• Are you registered with the EU AI Act’s AI Office or any sector regulator, and can you provide compliance documentation?
Right-Sizing for Your Situation
Vendor management depth should match the risk of the AI and its role in your operations. A foundation model API used for internal content summarisation needs substantially less governance infrastructure than a third-party scoring model used to make credit or employment decisions. AIPMO’s implementation playbooks provide practical guidance calibrated to your stage.
• Greenfield — Third-Party AI Playbook: For PMs without formal AI vendor processes. Essential due diligence checklist, key contract terms to negotiate, and a basic monitoring approach for smaller deployments or lower-risk third-party AI.
• Emerging — Third-Party AI Playbook: For PMs building repeatable processes. Comprehensive vendor evaluation framework, contract negotiation guidance, approved vendor list templates, and monitoring programme design.
• Established — Third-Party AI Playbook: For PMs in organisations with formal governance. How to integrate AI vendor management with existing procurement, vendor management, legal, and compliance frameworks — including EU AI Act value chain obligations.
Framework References
• EU AI Act (Official Journal, 12 July 2024) — Article 25 (responsibilities along the AI value chain; conditions under which distributors and deployers assume provider obligations); Article 53 (GPAI model provider obligations including technical documentation and instructions for downstream deployers); Recital 83 (clarification of operator roles along the value chain); Recital 88 (obligations of value chain suppliers to provide information to high-risk AI system providers)
• NIST AI Risk Management Framework (AI RMF 1.0, NIST AI 100-1) — GOVERN 6.1 (policies addressing third-party AI risks including transparency, testing, and IP); GOVERN 6.2 (contingency processes for third-party AI system failures); MAP 4.2 (internal risk controls for third-party AI components); MANAGE 3.1 (regular monitoring of third-party AI risks and documented controls); MANAGE 3.2 (monitoring of pre-trained models as part of regular system maintenance)
• NIST AI RMF Playbook — GOVERN 6.1 suggested actions (transparency into third-party system functions; thorough testing requirements; third-party technology policy evaluation; supply chain and full product lifecycle governance); GOVERN 6.2 suggested actions (contingency verification for mission-critical third-party AI; decommissioning processes for systems exceeding risk tolerances)
• NIST AI 600-1: Generative AI Profile (2024) — GV-6.1-005 (supplier risk assessment framework including monitoring and legal compliance); GV-6.1-007 (inventory of third-party entities and approved AI provider lists); GV-6.1-009 (updated due diligence for GenAI acquisition including IP, privacy, security, and embedded AI); GV-6.2-003 (incident response plans for third-party GenAI technologies); GV-6.2-006 (rollover and fallback technology policies including manual processing); GV-6.2-007 (vendor contract review to avoid liability amplification and secondary data use)
• AIGP Body of Knowledge v1.0.0 — Domain III (IP risks in AI training data; data provenance and licensing obligations); Domain IV (third-party AI governance programme requirements; vendor accountability frameworks; contractual protections for deployers)
• Singapore IMDA Model AI Governance Framework v2.0 — Section 3 (internal governance structures for AI including third-party procurement); Section 5 (vendor due diligence and model documentation requirements for deployed AI systems)
• PMI Guide to Leading and Managing AI Projects (CPMAI 2025) — Phase I (stakeholder and dependency mapping including third-party AI components); Phase III (procurement and vendor evaluation as project deliverables); Phase V (third-party AI validation in deployment context); Phase VI (ongoing monitoring obligations for procured AI systems)
This article is part of AIPMO’s PM Practice series. See also: AI Risk Registers | Model Cards and Datasheets | Monitoring AI Systems in Production