Due Diligence 2.0: How to Evaluate AI Safety Claims in 60 Minutes

"Our AI is safe, explainable, and EU AI Act compliant."

You'll hear some version of this in every AI startup pitch deck you see in 2026. The problem is that most investors have no reliable way to test that claim in a standard due diligence process.

The portfolio consequences are real. Under the EU AI Act, fines reach up to 7% of global annual turnover for prohibited practices and up to 3% for breaches of the obligations on high-risk AI systems. For a portfolio company, that's not an abstract risk. It's a valuation event.
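To put a rough number on that exposure, here is a minimal sketch of the arithmetic. It assumes the penalty structure on a plain reading of Article 99: each tier pairs a percentage of worldwide annual turnover with a fixed amount, larger companies face whichever is higher, and SMEs and start-ups face whichever is lower. The function name and turnover figures are hypothetical, and the 7% / EUR 35 million pairing is the Act's top tier; check the current text before relying on any figure.

```python
def fine_ceiling_eur(turnover_eur: int, pct: int, fixed_eur: int, is_sme: bool) -> int:
    """Ceiling of one penalty tier, in whole euros."""
    pct_amount = turnover_eur * pct // 100  # integer math avoids float rounding
    # Assumption: SMEs and start-ups face the lower of the two amounts,
    # all other companies the higher.
    return min(pct_amount, fixed_eur) if is_sme else max(pct_amount, fixed_eur)


# Hypothetical start-up with EUR 20M turnover, top tier (7% or EUR 35M).
startup = fine_ceiling_eur(20_000_000, 7, 35_000_000, is_sme=True)
# Hypothetical large company with EUR 1B turnover, same tier.
large = fine_ceiling_eur(1_000_000_000, 7, 35_000_000, is_sme=False)

print(f"start-up ceiling: EUR {startup:,}")
print(f"large-company ceiling: EUR {large:,}")
```

The ceiling scales with revenue, so the same compliance gap gets more expensive as a portfolio company grows or is acquired by a larger group.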

Here's how to assess AI safety claims with substance in the time you actually have.

Why Standard Due Diligence Misses AI Risk

Traditional technology due diligence focuses on things investors and their advisors know how to evaluate: code quality, system architecture, security practices, data handling.

AI safety is different. It's not primarily about whether the code is well-written. It's about whether the model does what the company claims it does, whether it does so consistently and without discriminatory outputs, whether there are meaningful human oversight mechanisms, and whether the company has the documentation to prove any of this to a regulator.

None of these questions appear on a standard security questionnaire. Most technical advisors aren't equipped to assess them. And the founders pitching you know exactly what you'll ask, so they've prepared confident-sounding answers to a set of questions that don't get to the substance.

The result is that investors are regularly underwriting EU AI Act compliance risk without knowing they're doing it.

The 60-Minute Assessment Framework

This framework is designed for a standard due diligence call or on-site session. It separates genuine AI governance maturity from performative compliance.

Block 1: Risk Classification (10 minutes)

Ask the founder: "Walk me through your EU AI Act risk classification and the rationale behind it."

What you're looking for: A specific answer. "We're high-risk under Annex III," "we've determined we're limited risk because X," or "we're minimal risk because Y." Any of these gives you something to verify. (An answer of "we fall under a prohibited practice" is at least specific, but it ends the compliance conversation for a different reason.) What isn't fine is a vague answer ("we're compliance-focused"), a non-answer ("our lawyers are handling that"), or a confidently incorrect answer (a lending model claiming it's minimal-risk).

Follow-up: "Which Annex III category applies to you, if any?" If they're claiming to be outside Annex III, ask them to explain why, given their product category. This separates the founders who've done the analysis from the ones who haven't.

Red flag: Any founder who doesn't know their Annex III classification for a product in the credit, employment, education, biometrics, critical infrastructure, or law enforcement categories is carrying significant compliance risk.
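The Block 1 check can be sketched as a simple lookup. This is an illustrative screen, not a legal test: the category keywords, the function name, and the return strings are hypothetical, and the area numbering follows a summary reading of Annex III (including the migration and justice areas, which are not in the list above). A match only tells you where to probe, since a system in an Annex III area can still argue it falls outside the high-risk tier.

```python
# Summary of Annex III areas; verify wording and numbering against the Regulation.
ANNEX_III_AREAS = {
    "biometrics": 1,
    "critical infrastructure": 2,
    "education": 3,
    "employment": 4,
    "credit": 5,  # access to essential services, incl. creditworthiness
    "law enforcement": 6,
    "migration": 7,
    "justice": 8,
}


def screen_classification(product_category: str, claimed_tier: str) -> str:
    """Compare a founder's claimed risk tier with the product category."""
    point = ANNEX_III_AREAS.get(product_category.strip().lower())
    tier = claimed_tier.strip().lower()
    if tier in ("", "unknown"):
        return "red flag: no classification stated"
    if point is not None and tier != "high-risk":
        return f"probe: category maps to Annex III point {point}, but claim is {tier}"
    return "consistent: ask for the written rationale"


print(screen_classification("credit", "minimal risk"))
print(screen_classification("employment", "high-risk"))
print(screen_classification("critical infrastructure", ""))
```

Even a "consistent" result is only a prompt to ask for the classification rationale document.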

Block 2: Technical Documentation (15 minutes)

Ask to see their Article 11 technical documentation. If they don't have it, ask when they plan to produce it.

What Article 11 documentation covers (the required contents are set out in Annex IV of the Act):

General description of the system and its intended purpose
Design specifications and development process
Description of training, validation, and testing procedures
Information about training datasets (provenance, quality measures)
Known limitations and risks
Monitoring and logging capabilities

What you're looking for: A document that exists and is specific. It doesn't need to be perfect. It does need to exist, reference specific technical choices, and demonstrate that the team understands what the regulation requires.

Red flag: "We're working on it" from a company that's been operating for more than 12 months with a high-risk product is a significant gap. The documentation should have been started at product inception.
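If you want to record what you saw during this block, a checklist like the following is enough. It is a minimal sketch: the topic labels paraphrase the list above rather than quoting the Act, and the function name is hypothetical.

```python
# Topic labels paraphrase the documentation list in this article.
ARTICLE_11_TOPICS = [
    "general description and intended purpose",
    "design specifications and development process",
    "training, validation, and testing procedures",
    "training dataset information",
    "known limitations and risks",
    "monitoring and logging capabilities",
]


def documentation_gaps(sections_present: set[str]) -> list[str]:
    """Return the expected topics the founder's document does not cover."""
    return [topic for topic in ARTICLE_11_TOPICS if topic not in sections_present]


# Hypothetical review of a document that covers four of the six topics.
seen = {
    "general description and intended purpose",
    "design specifications and development process",
    "training, validation, and testing procedures",
    "known limitations and risks",
}
for gap in documentation_gaps(seen):
    print("missing:", gap)
```

A short list of gaps is a workable finding. An empty `seen` set is the "we're working on it" case.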

Block 3: Risk Management System (15 minutes)

Article 9 of the EU AI Act requires high-risk AI system providers to have a documented risk management system. Ask: "Can you walk me through your risk management process for this AI system?"

What you're looking for: A process, not a policy. Policies are easy to write. A process means someone is running it, it's documented, and there's evidence of it happening (risk assessments conducted, mitigations applied, sign-offs recorded).

Specific questions:

- Who is responsible for the risk management system?
- How often do you conduct risk assessments?
- What did your last risk assessment find, and what did you do about it?
- What's your process when the model produces an unexpected output?

Red flag: "Our CTO owns AI safety" with no documentation, no process, and no evidence of structured risk assessment is not a risk management system. It's a title.
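The process-versus-policy distinction can be made concrete by asking what a single risk assessment record would have to contain. The sketch below is an assumption about a reasonable record shape, not a format the Act prescribes: the `RiskAssessment` fields, the function name, and the 120-day staleness window (roughly a quarterly cadence with slack) are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class RiskAssessment:
    conducted_on: date
    owner: str
    findings: list[str] = field(default_factory=list)
    mitigations: dict[str, str] = field(default_factory=dict)  # finding -> action taken
    signed_off_by: str = ""


def process_evidence_gaps(records: list[RiskAssessment], today: date,
                          max_age_days: int = 120) -> list[str]:
    """List what is missing from the most recent assessment on record."""
    if not records:
        return ["no risk assessments on record"]
    latest = max(records, key=lambda r: r.conducted_on)
    gaps = []
    if (today - latest.conducted_on).days > max_age_days:
        gaps.append("latest assessment is stale")
    if not latest.owner:
        gaps.append("no named owner")
    unmitigated = [f for f in latest.findings if f not in latest.mitigations]
    if unmitigated:
        gaps.append(f"findings without mitigation: {unmitigated}")
    if not latest.signed_off_by:
        gaps.append("no sign-off recorded")
    return gaps


# Hypothetical record: recent and owned, but the finding has no mitigation or sign-off.
record = RiskAssessment(date(2026, 1, 15), "Head of ML", ["drift in segment A"])
print(process_evidence_gaps([record], today=date(2026, 3, 1)))
```

A company with only a policy document returns "no risk assessments on record", which is the "It's a title" case.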

Block 4: Human Oversight (10 minutes)

Article 14 requires that high-risk AI systems be designed so that humans can understand, monitor, and intervene in their outputs. Ask: "How do end users or operators intervene when your system produces an incorrect or harmful output?"

What you're looking for: A specific, designed mechanism. Not "users can always contact support" but "when the model flags a credit decision as high-uncertainty, the application requires human review before proceeding." The distinction is between a theoretical escape hatch and a designed oversight mechanism.

Red flag: Any high-risk AI system where the product flow doesn't have a clear human intervention path is non-compliant by design.
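A designed oversight mechanism usually shows up in the product as a routing rule. The following is a minimal sketch of one, assuming the model exposes a confidence score between 0 and 1. The function name, the status labels, and the 0.85 threshold are illustrative (the threshold mirrors the 85% example used later in this article), not a value the Act specifies.

```python
def route_decision(model_decision: str, confidence: float,
                   review_threshold: float = 0.85) -> dict:
    """Hold low-confidence outputs for a human instead of applying them."""
    needs_review = confidence < review_threshold
    return {
        # No decision takes effect until a human reviewer has acted.
        "decision": None if needs_review else model_decision,
        "status": "pending_human_review" if needs_review else "auto",
        "model_suggestion": model_decision,
        "confidence": confidence,
    }


print(route_decision("approve", 0.92)["status"])
print(route_decision("decline", 0.61)["status"])
```

When the founder walks you through the flow, look for where the equivalent of `pending_human_review` lives in the product, and for who is responsible for clearing that queue.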

Block 5: Conformity Assessment Timeline (10 minutes)

For Annex III high-risk AI systems, ask about their conformity assessment status. Under the EU AI Act, most Annex III systems can follow the internal control procedure, which is self-certification in practice (third-party assessment by a notified body is only required for specific categories, chiefly certain biometric systems). But self-certification still requires documented evidence that all requirements are met.

Ask: "What's your conformity assessment timeline, and who is managing it?"

What you're looking for: A timeline, a process owner, and ideally a third-party advisor or notified body in the loop.

Red flag: "Our lawyers will handle it" for a product that should have been working toward conformity for 12 months is a timeline risk. The conformity assessment is a substantive technical and documentation exercise, not a legal filing.

The Four-Question Summary Test

If you don't have time for the full framework, these four questions will surface the most important signals in under 15 minutes:

"What's your EU AI Act risk classification and why?" (Tests foundational understanding)
"Can I see your Article 11 technical documentation?" (Tests documentation maturity)
"Who owns your risk management system and when did you last run a structured risk assessment?" (Tests process vs. policy)
"What's your conformity assessment timeline?" (Tests operational readiness)

A company that can answer all four with specificity and evidence is in a genuinely different position than the majority of AI startups claiming compliance in their decks.
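One way to keep notes comparable across calls is to score each of the four answers on a small scale. This is a sketch under stated assumptions: the 0 to 2 scale (0 for a deflection, 1 for a specific answer, 2 for a specific answer with evidence shown), the question labels, and the verdict wording are hypothetical and not part of any standard.

```python
QUESTIONS = [
    "risk classification",
    "technical documentation",
    "risk management",
    "conformity timeline",
]


def score_summary_test(scores: dict[str, int]) -> tuple[int, str]:
    """Total the four answers and return a coarse verdict."""
    for question in QUESTIONS:
        if scores.get(question) not in (0, 1, 2):
            raise ValueError(f"score for '{question}' must be 0, 1, or 2")
    values = [scores[q] for q in QUESTIONS]
    total = sum(values)
    if total == 2 * len(QUESTIONS):
        verdict = "specific and evidenced"
    elif min(values) == 0:
        verdict = "gaps: follow up before proceeding"
    else:
        verdict = "specific, but evidence incomplete"
    return total, verdict


# Hypothetical call: good answers, but no risk assessment evidence was shown.
print(score_summary_test({
    "risk classification": 2,
    "technical documentation": 2,
    "risk management": 0,
    "conformity timeline": 1,
}))
```

The verdict deliberately keys on the weakest answer rather than the total, since one deflection matters more than several strong answers.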

What Good Looks Like, and What It Doesn't

Question | Strong Response | Weak Response
--- | --- | ---
Risk classification | "We're high-risk under Annex III point 5(b). Here's our classification rationale document." | "We're working with our lawyers on that."
Technical docs | "Here's our v2 Article 11 document, last updated March 2026." | "We have an internal document we can share later."
Risk management | "We run quarterly risk assessments. Our last one flagged X; here's how we addressed it." | "Our CTO is responsible for AI safety."
Human oversight | "When confidence is below 85%, the application routes to human review. Here's the UX flow." | "Users can always contact us if there's an issue."
Conformity assessment | "We're targeting Q3 2026 for self-certification. We're using [advisor] to verify." | "That's post-launch."

Portfolio-Level Implications

If you manage a portfolio of AI companies, the EU AI Act creates a systematic risk that cuts across your holdings. Companies that are non-compliant aren't just exposed to fines. They're exposed to:

Contract cancellation clauses from enterprise customers who require compliance
Difficulty raising follow-on rounds from investors with their own governance requirements
Reputational damage if a regulatory incident occurs
Competitive disadvantage versus the compliant players in their category

Running this 60-minute framework across your existing portfolio before August 2026, when the Act's requirements for most high-risk systems are scheduled to start applying, will surface which companies need immediate governance investment. The companies that need the most work are the ones you want to know about now, not after an enforcement action.
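Across a portfolio, the output of the five blocks reduces to a ranking. The sketch below assumes you have recorded a simple pass or fail per block for each company. The company names, block labels, and function name are hypothetical.

```python
def triage(portfolio: dict[str, dict[str, bool]]) -> list[tuple[str, int]]:
    """Rank companies by number of failed blocks, worst first."""
    ranked = [
        (name, sum(1 for passed in blocks.values() if not passed))
        for name, blocks in portfolio.items()
    ]
    # Sort by failures descending, then by name for a stable order.
    return sorted(ranked, key=lambda item: (-item[1], item[0]))


# Hypothetical results from three 60-minute sessions.
results = {
    "LendCo": {"classification": True, "documentation": False, "risk management": False,
               "human oversight": True, "conformity": False},
    "HireAI": {"classification": True, "documentation": True, "risk management": True,
               "human oversight": True, "conformity": False},
    "ChatCo": {"classification": True, "documentation": True, "risk management": True,
               "human oversight": True, "conformity": True},
}
for name, failed in triage(results):
    print(f"{name}: {failed} block(s) need work")
```

A count is a blunt instrument. A single failure on classification or human oversight may matter more than several documentation gaps, so treat the ranking as the order in which to start conversations rather than as a score.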