PM Takeaways

• When you deploy third-party AI, you deploy the decisions its developer made about training data, fairness testing, and risk mitigation — and under the EU AI Act and NIST AI RMF, you are accountable for the outcomes regardless of who built the system. Vendor accountability does not transfer your compliance obligations; it supplements them.

• NIST AI RMF GOVERN 6.1 requires policies that address third-party AI systems and data, including transparency into training data, algorithms, assumptions, and limitations — if a vendor cannot or will not provide this information, that opacity is itself a risk indicator that should factor into your procurement decision.

• Standard software contracts are structurally inadequate for AI procurement: they typically do not address model change notification, bias testing results, IP indemnification for training data, or audit rights — every one of these must be negotiated explicitly before contract execution, not requested after a problem emerges.

• Pre-trained models used in your system require ongoing monitoring as part of regular AI system maintenance; NIST AI RMF MANAGE 3.2 is explicit that third-party model behaviour can change after fine-tuning, drift, or vendor updates — a model that passed validation at procurement may behave differently six months into production.

• For critical AI functions, fallback procedures are a governance requirement, not an IT continuity nicety — NIST GenAI GV-6.2-006 requires policies and procedures for rollover and fallback technologies, acknowledging that fallback may include manual processing. Design for vendor failure before you need to execute it.
Not every AI project builds from scratch. Most organisations adopt pre-trained models, integrate AI APIs, license AI-powered software, or deploy vendor solutions. This approach accelerates time-to-value and reduces technical complexity — but it introduces risks that traditional vendor management frameworks were not designed to handle.
When you use third-party AI, you inherit the decisions someone else made about training data, model architecture, fairness testing, and risk mitigation. You also inherit their limitations, their biases, and their vulnerabilities. And under an increasingly mature regulatory landscape, you inherit compliance obligations that cannot be contracted away to the vendor. The PM who understands this manages procurement differently.
The AI Value Chain
AI systems frequently involve multiple third-party components, each of which contributes its own risk profile to your deployment. A single production AI system may draw on several of these simultaneously.
| Component Type | Examples |
| --- | --- |
| Foundation models | GPT, Claude, Llama, Gemini — pre-trained models used via API or fine-tuned for your use case |
| Pre-trained models | Specialised models for vision, speech, translation, sentiment, or document processing |
| Training datasets | Licensed or purchased data used to train or fine-tune custom models |
| AI platforms | Cloud ML services, MLOps platforms, feature stores, labelling services |
| Embedded AI | AI capabilities built into enterprise software — CRM, ERP, HRIS, procurement tools |
| AI APIs | Third-party services for specific functions: OCR, fraud detection, recommendations, scoring |
| Software libraries | Open-source ML frameworks, data processing tools, evaluation packages |
Each component in this chain introduces potential risks. Errors in any component can cascade downstream, and the complexity of these dependencies makes attribution difficult when something goes wrong. NIST AI RMF MAP 4.2 specifically requires that internal risk controls for third-party AI technologies are identified and documented — the value chain must be mapped before it can be managed.
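The MAP 4.2 requirement — identify and document internal risk controls for each third-party component — lends itself to a structured inventory rather than a static document. A minimal sketch in Python; the component names, fields, and vendor names are illustrative assumptions, not prescribed by NIST:

```python
from dataclasses import dataclass, field

@dataclass
class AIComponent:
    """One entry in the third-party AI value chain map (per NIST AI RMF MAP 4.2)."""
    name: str
    component_type: str  # e.g. "foundation model", "AI API", "training dataset"
    vendor: str
    risk_controls: list[str] = field(default_factory=list)  # documented internal controls
    upstream: list[str] = field(default_factory=list)       # components this one depends on

def unmanaged_components(chain: list[AIComponent]) -> list[str]:
    """Flag components with no documented risk controls — these are MAP 4.2 gaps."""
    return [c.name for c in chain if not c.risk_controls]

# Hypothetical two-component chain: one governed, one not yet assessed
chain = [
    AIComponent("summariser-llm", "foundation model", "ExampleVendor",
                risk_controls=["output filtering", "human review"]),
    AIComponent("ocr-service", "AI API", "OtherVendor"),  # no controls documented yet
]
print(unmanaged_components(chain))  # -> ['ocr-service']
```

Keeping the map as data rather than prose makes the "document before you manage" step checkable: any component that appears in production but not in the inventory, or appears with an empty control list, is a visible gap.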
Third-Party AI Risks
Transparency and Documentation Gaps
Without adequate vendor documentation, you may not know what data the model was trained on, which populations are underrepresented, what testing was performed, what known limitations exist, or how the model was optimised — and for whose use case. NIST AI RMF GOVERN 6.1 identifies transparency into third-party system functions, including knowledge about training data, training and inference algorithms, and assumptions and limitations, as a core policy requirement. Without this information, you cannot fully assess fitness for purpose or meet your own transparency obligations downstream.
Inherited Bias
A model trained on biased data will produce biased outputs regardless of your intentions. When you deploy third-party AI, you deploy the biases embedded in its training pipeline — and you are accountable for the outcomes in your deployment context. NIST AI RMF GOVERN 6.1 requires organisations to review third-party material for risks related to bias, data privacy, and security vulnerabilities before deployment. A vendor’s bias testing on their general population does not substitute for your own validation in your specific use case.
Intellectual Property Risks
Training data may include copyrighted material, personal data used without appropriate consent, or content with legal restrictions. Using AI trained on such data may expose your organisation to infringement claims. The EU AI Act places copyright compliance obligations on GPAI model providers, but downstream deployers must still assess exposure — particularly in sectors where IP sensitivity is high.
Security Vulnerabilities
Third-party components can introduce attack vectors that your own security posture does not address: malware or backdoors in software libraries, data poisoning in training datasets, adversarial vulnerabilities in model architectures, and privacy leaks through model memorisation of training data. NIST GenAI GV-6.1-009 requires vendor assessments to include evaluation against incident and vulnerability databases — a step that most standard IT procurement processes do not include.
Dependency and Lock-In
Reliance on third-party AI creates dependencies that may be difficult to unwind. Vendors discontinue services, change pricing, modify model behaviour in updates, revise API terms, or exit the market. NIST GenAI GV-6.2-007 advises explicitly reviewing vendor contracts to avoid arbitrary or capricious termination of critical AI services and non-standard terms that amplify or defer liability in unexpected ways. Dependency risk must be assessed at procurement, not discovered at renewal.
Compliance Transfer
Regulatory requirements for AI systems apply to you as the deployer regardless of who built the components. Under the EU AI Act, deployers of high-risk AI systems have obligations even when using third-party systems — including ensuring users can interpret outputs, maintaining human oversight mechanisms, and cooperating with post-market monitoring. NIST AI RMF MANAGE 3.1 echoes this: third-party AI risks and benefits must be regularly monitored and risk controls applied and documented. Compliance obligations do not transfer through the supply chain; they accumulate.
Due Diligence for AI Procurement
Traditional vendor due diligence evaluates financial stability, service reliability, and contractual terms. AI due diligence must go further, evaluating the model itself — its training, its testing, and its known failure modes. NIST GenAI GV-6.1-009 requires due diligence processes to address intellectual property, data privacy, security, and ongoing monitoring for third-party GenAI specifically.
Documentation to Request
| Document | Purpose |
| --- | --- |
| Model cards | Intended use, training data summary, performance metrics, known limitations, out-of-scope uses |
| Datasheets for datasets | Data sources, collection methods, preprocessing, potential biases, gaps in coverage |
| System cards | Full system behaviour, safety measures, human oversight integration, deployment constraints |
| Testing and validation reports | Evaluation methodology, fairness and bias testing, adversarial testing results |
| Incident history | Past problems, how they were identified, how they were addressed, recurrence rate |
If a vendor cannot or will not provide this documentation, treat that opacity as a risk signal. NIST AI RMF GOVERN 6.1 suggests tracking third parties that prevent or hamper risk mapping as indicators of increased risk. Transparency gaps at procurement rarely improve post-contract.
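The documentation checklist above can be applied mechanically during procurement. A simple sketch, assuming the five artefact names from the table (your organisation may require more or fewer):

```python
# Artefacts every AI vendor should supply (from the documentation table above)
REQUIRED_DOCS = {"model card", "datasheet", "system card",
                 "testing report", "incident history"}

def documentation_gaps(provided: set[str]) -> set[str]:
    """Return the artefacts a vendor has not supplied.

    A non-empty result is the opacity signal GOVERN 6.1 says to track:
    it should factor into the procurement decision, not be waived.
    """
    return REQUIRED_DOCS - provided

# Hypothetical vendor response: only two of five artefacts supplied
gaps = documentation_gaps({"model card", "testing report"})
print(sorted(gaps))  # -> ['datasheet', 'incident history', 'system card']
```

Recording the gap set per vendor, rather than a pass/fail flag, preserves the evidence for the periodic reviews discussed later.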
Evaluation Criteria
| Criterion | Key Questions |
| --- | --- |
| Fitness for purpose | Is this AI appropriate for your specific use case? Has it been validated in similar operational contexts? |
| Performance | What are the accuracy metrics? How were they measured? On what populations and in what conditions? |
| Fairness | Has bias testing been performed? Across which demographic groups? What were the results and accepted thresholds? |
| Transparency | Can individual decisions be explained? Is documentation sufficient for your compliance obligations? |
| Security | What security testing has been performed? What vulnerabilities are known and how have they been addressed? |
| Privacy | What data was used for training? How is your input data handled at inference? Is it used to retrain the model? |
| Regulatory support | Does the vendor understand the regulatory requirements in your operating jurisdictions? |
| Incident response | What are the vendor’s notification timelines and support obligations when the AI causes harm? |
Approved Vendor Lists
NIST GenAI GV-6.1-007 recommends maintaining an inventory of all third-party entities with access to organisational content and establishing approved AI technology and service provider lists. Define the criteria for inclusion, the process for periodic review, and the conditions that would trigger removal. This is a governance artefact that procurement, legal, and technology should own jointly.
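The approved-list mechanics — inclusion criteria, periodic review, removal triggers — can be enforced in code once the list is data. A minimal sketch; the vendor names, the annual default, and the `overdue_reviews` helper are illustrative assumptions:

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class ApprovedVendor:
    """Entry on the approved AI provider list (per NIST GenAI GV-6.1-007)."""
    name: str
    approved_services: list[str]
    last_review: date
    review_interval_days: int = 365  # assumed annual review cadence

def overdue_reviews(vendors: list[ApprovedVendor], today: date) -> list[str]:
    """Vendors whose periodic review has lapsed — candidates for suspension."""
    return [v.name for v in vendors
            if (today - v.last_review).days > v.review_interval_days]

vendors = [
    ApprovedVendor("VendorA", ["OCR"], date(2024, 1, 10)),      # reviewed >1y ago
    ApprovedVendor("VendorB", ["scoring"], date(2025, 3, 1)),   # recently reviewed
]
print(overdue_reviews(vendors, date(2025, 6, 1)))  # -> ['VendorA']
```

The useful property is that "periodic review" becomes a query, not a calendar reminder someone may miss; procurement, legal, and technology can all run it against the same record.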
Contract Considerations
Standard software contracts are structurally inadequate for AI procurement. They were designed for deterministic software with stable behaviour. AI systems change, drift, and fail in ways that standard SLAs and limitation of liability clauses do not anticipate. NIST GenAI GV-6.2-007 identifies several categories of contract terms that directly affect your ability to govern third-party AI in production.
Key Contract Elements
| Element | Why It Matters |
| --- | --- |
| Liability allocation | Who is responsible when the AI causes harm? Avoid blanket terms that defer all liability to the deployer. |
| Incident notification | Require timely notification of serious incidents arising from the vendor’s AI, with defined timelines. |
| Change notification | Require advance notice before material changes to model behaviour, architecture, or training data. |
| Performance SLAs | Define acceptable performance levels — including fairness metrics — and remedies if they are not met. |
| Audit rights | Include the right to evaluate the vendor’s AI processes, documentation, and testing results. |
| Data handling | Specify how your input data is used — particularly whether it is used to train or fine-tune the model. |
| IP indemnification | Protect against claims arising from the vendor’s training data, especially for copyright and personal data. |
| Termination rights | Ensure you can exit the contract if the AI does not meet requirements or compliance obligations. |
| Data portability | Ensure you can retrieve your data and any fine-tuned artefacts if you switch vendors. |
Terms to Avoid
Watch for contract language that allows unlimited use of your data for model training without consent, permits arbitrary changes to model behaviour without notification, disclaims all responsibility for AI outputs, creates high switching costs through proprietary data formats, or requires you to indemnify the vendor for AI-related claims. These terms are increasingly common in standard AI vendor agreements and must be actively negotiated out.
Ongoing Monitoring
Due diligence does not end at contract signing. NIST AI RMF MANAGE 3.2 is explicit that pre-trained models used in development must be monitored as part of regular AI system maintenance — model behaviour can change after fine-tuning, after vendor updates, or through drift in your production data distribution. A model that passed validation at procurement may behave differently six months into deployment.
What to Monitor
| Area | Focus |
| --- | --- |
| Performance | Is the AI still performing as expected in your environment against your baseline metrics? |
| Behaviour changes | Have vendor updates changed outputs in ways that affect your use case or compliance posture? |
| Fairness drift | Are bias metrics stable across demographic groups, or drifting as the user population or data changes? |
| Incidents | Has the vendor reported incidents? Have you experienced issues that should be reported upstream? |
| Compliance alignment | Do vendor practices still align with evolving regulatory requirements in your operating jurisdictions? |
| Security | Are there new vulnerabilities reported in vendor components, model architectures, or dependencies? |
Review Cadence
Establish a structured review cadence calibrated to the risk level of the AI system.
• Continuous: Automated performance and fairness metric monitoring with defined alert thresholds
• Monthly: Review of incidents, user feedback, and any vendor communications about model changes
• Quarterly: Vendor performance against SLAs, including incident response and change notification
• Annually: Full reassessment of vendor relationship, AI suitability, and alignment with current regulatory requirements
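The continuous tier — automated monitoring with defined alert thresholds — reduces to comparing current metrics against the baseline established at validation. A minimal sketch; the metric names, the 0.05 tolerance, and the assumption that higher values are better are all illustrative and should be set per system:

```python
def check_thresholds(metrics: dict[str, float], baseline: dict[str, float],
                     tolerance: float = 0.05) -> list[str]:
    """Return alert messages for metrics that degraded beyond tolerance vs. baseline.

    Assumes higher-is-better metrics; a missing reading is itself an alert,
    since silent telemetry gaps are how drift goes unnoticed.
    """
    alerts = []
    for name, base in baseline.items():
        current = metrics.get(name)
        if current is None:
            alerts.append(f"{name}: no current reading")
        elif base - current > tolerance:
            alerts.append(f"{name}: {current:.3f} vs baseline {base:.3f}")
    return alerts

# Hypothetical baseline from procurement-time validation
baseline = {"accuracy": 0.91, "demographic_parity": 0.97}
current = {"accuracy": 0.84, "demographic_parity": 0.96}
print(check_thresholds(current, baseline))  # -> ['accuracy: 0.840 vs baseline 0.910']
```

Note that the fairness metric is checked on exactly the same footing as accuracy — that is the point of including fairness metrics in the SLA rather than treating them as a one-off procurement test.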
Incident Response for Third-Party AI
When third-party AI causes problems, the incident response process must account for the vendor dependency. NIST GenAI GV-6.2-003 requires incident response plans for third-party AI technologies to define ownership of response functions, communicate plans to all relevant AI actors, and be rehearsed at a regular cadence — not assembled for the first time during an active incident.
| Element | Action | Vendor Dependency |
| --- | --- | --- |
| Detection | Monitor for problems in third-party components via automated tooling and user feedback channels | Define what vendor telemetry you have access to under the contract |
| Containment | Be able to disable or bypass third-party AI and activate fallback procedures immediately | Test fallback activation before you need it; manual fallback may be the only option |
| Escalation | Know exactly how to reach vendor support for urgent issues, with documented contacts and SLAs | Verify escalation paths at contract execution, not during an incident |
| Communication | Coordinate incident communication with the vendor to avoid contradictory public statements | Clarify communication ownership in the contract; align on timelines for disclosure |
| Documentation | Track all third-party AI incidents with timestamps, decisions made, and outcomes | Document vendor response times and quality for use in periodic relationship reviews |
| Review | Incorporate vendor incidents into relationship reviews; poor incident response is a termination trigger | Review contract terms after every significant incident for adequacy |
Fallback Procedures
For critical AI functions, fallback procedures are a governance requirement. NIST GenAI GV-6.2-006 requires policies and procedures to test and manage risks related to rollover and fallback technologies, explicitly acknowledging that fallback may include manual processing. Design and test your fallback before you need it. An AI system with no viable fallback is an operational dependency that has not been risk-assessed.
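The structural pattern behind GV-6.2-006 is a routing decision: if the third-party call fails, divert to the fallback path, which may be as simple as a queue for manual processing. A minimal sketch with a simulated vendor outage; the function names and payload shape are hypothetical, and a production version would add logging, alerting, and a circuit breaker rather than retrying a dead service on every call:

```python
from typing import Callable

def with_fallback(primary: Callable[[str], str],
                  fallback: Callable[[str], str]) -> Callable[[str], str]:
    """Wrap a third-party AI call so any failure routes to the fallback path."""
    def call(payload: str) -> str:
        try:
            return primary(payload)
        except Exception:
            # In production: log the incident and alert the vendor-management owner
            return fallback(payload)
    return call

def vendor_api(payload: str) -> str:
    raise ConnectionError("vendor service unavailable")  # simulated outage

def manual_queue(payload: str) -> str:
    return f"queued for manual review: {payload}"

classify = with_fallback(vendor_api, manual_queue)
print(classify("invoice-123"))  # -> queued for manual review: invoice-123
```

Exercising this wrapper in a drill, with the primary deliberately disabled, is the cheap version of "design and test your fallback before you need it"; the expensive version is discovering during an outage that the manual queue has no staffing plan.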
Regulatory Implications
EU AI Act
The EU AI Act’s value chain provisions are explicit. Deployers of high-risk AI systems have obligations even when using third-party components — including risk management, human oversight, and post-market monitoring. GPAI model providers are required to supply technical documentation and instructions for use to downstream deployers, which forms the basis of deployer compliance. Downstream deployers can lodge complaints about upstream providers’ infringements with the AI Office — a recourse that presupposes the deployer has conducted sufficient due diligence to identify the infringement.
Where a party puts its name or trademark on a third-party high-risk AI system, it assumes the obligations of a provider under the Act — regardless of who built the underlying model. This has significant implications for embedded AI in enterprise software and white-label AI products.
NIST AI RMF
The NIST AI RMF addresses value chain risk across multiple functions. GOVERN 6.1 requires policies covering third-party AI systems, data, and IP risks. GOVERN 6.2 requires contingency processes for failures in third-party systems deemed high-risk. MAP 4.2 requires internal risk controls for third-party AI components to be identified and documented. MANAGE 3.1 requires that third-party AI risks are regularly monitored with controls applied and documented. Together these constitute a comprehensive third-party AI governance programme — not a single procurement checklist.
PM Responsibilities by Phase
During Planning
• Identify all third-party AI components in scope and map the full value chain from model provider to deployment
• Include vendor due diligence as a formal project deliverable with its own timeline, owner, and acceptance criteria
• Budget for evaluation, validation in your context, and ongoing monitoring — not just procurement costs
• Define the requirements that third-party AI must meet before deployment, including performance, fairness, and documentation thresholds
During Procurement
• Conduct AI-specific due diligence using the documentation and evaluation criteria above — not standard IT vendor assessment
• Request and review model cards, datasheets, testing reports, and incident history before contract execution
• Negotiate AI-specific contract terms covering change notification, audit rights, IP indemnification, and data handling
• Establish acceptance criteria for third-party AI that must be validated in your deployment environment, not just accepted from vendor documentation
During Implementation
• Validate that third-party AI meets your requirements in your specific context — vendor test results on different populations or use cases are insufficient
• Test for bias and fairness in your deployment environment with your actual user population
• Establish monitoring infrastructure for third-party components before go-live, not after
• Document all third-party dependencies in the project’s AI system documentation for governance and examination readiness
Post-Deployment
• Monitor third-party AI performance and behaviour against your defined thresholds on the cadence appropriate to the risk level
• Track vendor updates and formally assess their impact on your deployment before accepting them into production
• Manage incidents involving third-party components using your defined response process, with vendor escalation paths tested
• Conduct periodic vendor reviews on the schedule established at procurement; use incident history and SLA performance as inputs
Questions to Ask Vendors
Use these questions across the procurement and ongoing monitoring lifecycle. A vendor’s willingness and ability to answer them is itself a governance signal.
Training and Data
• What data was used to train this model, and how was it sourced and licensed?
• Were there data quality, representativeness, or coverage gaps that affect performance for specific populations or use cases?
• How do you handle personal data in training, and have you assessed compliance with applicable data protection law?
• Is our input data used to train or fine-tune the model? If so, under what conditions and with what controls?
Testing and Validation
• What testing was performed before release, and can you provide the full test reports?
• How do you test for fairness and bias, and what were the results across demographic groups?
• What are the known limitations of this model, and in which contexts is it not appropriate to use?
• Have you tested for adversarial vulnerabilities, data poisoning risks, and model memorisation of training data?
Operations and Incident Response
• How and when do you notify customers of model changes that could affect outputs or behaviour?
• What are your defined response times for critical incidents, and what remedies are available if you miss them?
• Can you provide your incident history for this model, including how issues were identified and resolved?
• What fallback or continuity options are available if your service is disrupted?
Compliance and Audit
• How do you support customers’ obligations under the EU AI Act, including documentation for deployers?
• What audit rights do you provide, and have you been audited by third parties?
• How do you handle IP indemnification for claims arising from your training data?
• Are you registered with the EU AI Act’s AI Office or any sector regulator, and can you provide compliance documentation?
Right-Sizing for Your Situation
Vendor management depth should match the risk of the AI and its role in your operations. A foundation model API used for internal content summarisation needs substantially less governance infrastructure than a third-party scoring model used to make credit or employment decisions. AIPMO’s implementation playbooks provide practical guidance calibrated to your stage.
• Greenfield — Third-Party AI Playbook: For PMs without formal AI vendor processes. Essential due diligence checklist, key contract terms to negotiate, and a basic monitoring approach for smaller deployments or lower-risk third-party AI.
• Emerging — Third-Party AI Playbook: For PMs building repeatable processes. Comprehensive vendor evaluation framework, contract negotiation guidance, approved vendor list templates, and monitoring programme design.
• Established — Third-Party AI Playbook: For PMs in organisations with formal governance. How to integrate AI vendor management with existing procurement, vendor management, legal, and compliance frameworks — including EU AI Act value chain obligations.
Framework References
• EU AI Act (Official Journal, 12 July 2024) — Article 25 (responsibilities along the AI value chain; conditions under which distributors and deployers assume provider obligations); Article 53 (GPAI model provider obligations including technical documentation and instructions for downstream deployers); Recital 83 (clarification of operator roles along the value chain); Recital 88 (obligations of value chain suppliers to provide information to high-risk AI system providers)
• NIST AI Risk Management Framework (AI RMF 1.0, NIST AI 100-1) — GOVERN 6.1 (policies addressing third-party AI risks including transparency, testing, and IP); GOVERN 6.2 (contingency processes for third-party AI system failures); MAP 4.2 (internal risk controls for third-party AI components); MANAGE 3.1 (regular monitoring of third-party AI risks and documented controls); MANAGE 3.2 (monitoring of pre-trained models as part of regular system maintenance)
• NIST AI RMF Playbook — GOVERN 6.1 suggested actions (transparency into third-party system functions; thorough testing requirements; third-party technology policy evaluation; supply chain and full product lifecycle governance); GOVERN 6.2 suggested actions (contingency verification for mission-critical third-party AI; decommissioning processes for systems exceeding risk tolerances)
• NIST AI 600-1: Generative AI Profile (2024) — GV-6.1-005 (supplier risk assessment framework including monitoring and legal compliance); GV-6.1-007 (inventory of third-party entities and approved AI provider lists); GV-6.1-009 (updated due diligence for GenAI acquisition including IP, privacy, security, and embedded AI); GV-6.2-003 (incident response plans for third-party GenAI technologies); GV-6.2-006 (rollover and fallback technology policies including manual processing); GV-6.2-007 (vendor contract review to avoid liability amplification and secondary data use)
• AIGP Body of Knowledge v1.0.0 — Domain III (IP risks in AI training data; data provenance and licensing obligations); Domain IV (third-party AI governance programme requirements; vendor accountability frameworks; contractual protections for deployers)
• Singapore IMDA Model AI Governance Framework v2.0 — Section 3 (internal governance structures for AI including third-party procurement); Section 5 (vendor due diligence and model documentation requirements for deployed AI systems)
• PMI Guide to Leading and Managing AI Projects (CPMAI 2025) — Phase I (stakeholder and dependency mapping including third-party AI components); Phase III (procurement and vendor evaluation as project deliverables); Phase V (third-party AI validation in deployment context); Phase VI (ongoing monitoring obligations for procured AI systems)
This article is part of AIPMO’s PM Practice series. See also: AI Risk Registers | Model Cards and Datasheets | Monitoring AI Systems in Production