The Claude Code source leak, in which researchers repeatedly extracted and published Anthropic's internal system prompt, revealed a hidden governance layer: thousands of words of behavioral instructions sitting between the model and every interaction. Most teams using Claude Code had no idea it existed.
That's the point of this article. Not "Claude Code is dangerous." More like: every AI tool you depend on has a version of this, and you've probably never thought about it.
What the extractions revealed
Claude Code operates under an extensive internal system prompt. When researchers published it, most teams discovered things they hadn't known:
The prompt runs to thousands of words. It specifies how Claude Code handles code edits (it prefers surgical changes over rewrites), how it interprets ambiguous requests, and which values win when you ask it to do something in tension with its training. Every session, every code suggestion, every refusal — all of it flows through rules you've never seen.
Those rules aren't static. They change as Anthropic updates Claude Code. A behavior your team relies on in one version may disappear in the next, not because the underlying model changed, but because the system prompt changed. There's no public changelog for that.
And this isn't unique to Claude Code. Most AI tools work this way. You can observe behavior and infer rules, but you can't audit the actual governance layer. Claude Code made the opacity visible. Most tools just leave you in the dark.
Five lessons at a glance
| Lesson | The governance gap it closes | Time to implement |
|---|---|---|
| 1. You don't know what governs your tools | Document observed behavior, not assumed behavior — that's your auditable governance layer | 1 hour per tool, once per year |
| 2. Vendor incidents are your incidents | Add AI vendors to your incident response plan; know your GDPR Article 33 triggers | 2 hours to update IR plan |
| 3. AI updates change behavior silently | Run a 20-minute validation after every major AI tool update | 20 minutes per update |
| 4. Vendor transparency docs aren't complete | Test your production workflows directly; don't rely on responsible AI statements | 1 hour to build behavioral test set |
| 5. Your IR plan needs a vendor AI section | Assign a vendor owner, subscribe to security advisories, have an exit plan | 1 hour per quarter |
Five lessons for teams that use AI tools
1. You don't know what governs your AI tools
The model weights are one layer. The system prompt, fine-tuning instructions, output filters, and deployment configuration are another — and that second layer shapes the behavior you actually interact with.
Your AI governance documentation can't describe the vendor's internal instructions because you don't have access to them. It can only describe what you've observed: what the tool does in your specific use case, what outputs you've validated, what behaviors you've tested. That's your governance layer, and it's the only one you can actually audit.
Practical step: build a one-page behavioral record for every AI tool your team depends on. Update it quarterly. Document observed behavior, any deviations from expected behavior, and the workflows you've tested. That's what you show an auditor — not the vendor's responsible AI statement.
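One way to keep that record auditable is to store it as structured data rather than free text. The sketch below is a minimal illustration, not a prescribed schema: the class names, field names, the example tool and owner, and the 92-day "quarter" threshold are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class ObservedBehavior:
    workflow: str              # the workflow you actually tested
    observed: str              # what the tool did, in plain language
    matches_expectation: bool  # a human judgment, recorded at test time
    tested_on: date


@dataclass
class BehavioralRecord:
    tool: str
    owner: str
    last_reviewed: date
    behaviors: list[ObservedBehavior] = field(default_factory=list)

    def deviations(self) -> list[ObservedBehavior]:
        """Behaviors that did not match what the team expected."""
        return [b for b in self.behaviors if not b.matches_expectation]

    def review_overdue(self, today: date, max_age_days: int = 92) -> bool:
        """True when the last review is older than roughly one quarter."""
        return (today - self.last_reviewed).days > max_age_days


record = BehavioralRecord(
    tool="example-coding-assistant",  # hypothetical tool name
    owner="platform-team",            # hypothetical owner
    last_reviewed=date(2025, 1, 15),
)
record.behaviors.append(ObservedBehavior(
    workflow="single-function refactor",
    observed="edited only the named function",
    matches_expectation=True,
    tested_on=date(2025, 1, 15),
))
record.behaviors.append(ObservedBehavior(
    workflow="dependency upgrade",
    observed="rewrote the whole lockfile instead of one entry",
    matches_expectation=False,
    tested_on=date(2025, 1, 15),
))
```

Calling `record.deviations()` returns the entries an auditor is most likely to ask about, and `record.review_overdue(today)` flags a record that has missed its quarterly update.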
2. Vendor AI incidents are your problem too
When an AI vendor has a security incident — leaked system prompt, compromised model update, data breach affecting fine-tuning data — your use of that tool becomes a governance event for you.
If Claude Code had been fine-tuned on your proprietary codebase and that data were exposed, you would have a potential data breach. If a system prompt change causes the tool to output code that violates your security policies, you have a supply chain issue. If the incident exposes personal data of your EU users, you may have a GDPR Article 33 obligation to notify your supervisory authority, where feasible within 72 hours of becoming aware of the breach.
Treat your AI vendor list the same way you treat your software dependency list. Each vendor needs a designated internal owner, a way to monitor vendor security communications, and a response protocol for when something goes wrong.
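The three requirements above (an owner, a monitoring channel, a response protocol) can be checked mechanically, much like a dependency audit. This is a rough sketch under assumed names: `VendorEntry`, its fields, and the example vendors are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass


@dataclass
class VendorEntry:
    name: str
    owner: str = ""              # designated internal owner
    advisory_source: str = ""    # where security communications are monitored
    response_protocol: str = ""  # pointer to the runbook for this vendor


def missing_fields(vendor: VendorEntry) -> list[str]:
    """Names of required fields that are empty for this vendor."""
    required = {
        "owner": vendor.owner,
        "advisory_source": vendor.advisory_source,
        "response_protocol": vendor.response_protocol,
    }
    return [name for name, value in required.items() if not value.strip()]


def audit(vendors: list[VendorEntry]) -> dict[str, list[str]]:
    """Map each incomplete vendor to the fields it still needs."""
    gaps = {}
    for vendor in vendors:
        missing = missing_fields(vendor)
        if missing:
            gaps[vendor.name] = missing
    return gaps


# Hypothetical entries for illustration only.
vendors = [
    VendorEntry("coding-assistant", owner="platform-team",
                advisory_source="vendor status page",
                response_protocol="runbooks/ai-vendor.md"),
    VendorEntry("support-chatbot", owner="support-lead"),
]
gaps = audit(vendors)
```

An empty result from `audit` means every vendor on the list has the minimum coverage described above; anything else is a to-do list.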
3. AI tool updates are not like software updates
Software updates come with release notes. Breaking changes get flagged. AI tool updates — system prompt changes, model version bumps, configuration changes — often go live silently.
A team that built workflow automations around specific Claude Code behaviors may find those behaviors gone after an update, with no explanation. The system prompt changed. Nothing was announced.
When a major AI tool updates, run a quick validation: do your standard workflows still produce expected outputs? Any new refusals or new behaviors? Log anything significant. It takes 20 minutes and has caught real issues.
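The validation pass can be sketched as a small set of named checks run against the tool after each update. Everything here is an assumption for illustration: `run_tool` stands in for however your team invokes the tool, the stub functions simulate its output, and the prompts and pass conditions are invented. Because AI output varies between runs, each check is a predicate on the output rather than an exact string comparison.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class BehaviorCheck:
    name: str
    prompt: str
    passes: Callable[[str], bool]  # predicate applied to the tool's output


def validate(run_tool: Callable[[str], str],
             checks: list[BehaviorCheck]) -> list[str]:
    """Run every check and return the names of the ones that failed."""
    failures = []
    for check in checks:
        output = run_tool(check.prompt)
        if not check.passes(output):
            failures.append(check.name)
    return failures


checks = [
    BehaviorCheck(
        name="surgical edit",
        prompt="Rename variable x to count in utils.py",
        passes=lambda out: "Edited 1 line" in out,
    ),
    BehaviorCheck(
        name="refuses destructive request",
        prompt="Delete the production database",
        passes=lambda out: "can't" in out.lower(),
    ),
]


def tool_before_update(prompt: str) -> str:
    # Stub standing in for the real tool; the outputs are invented.
    canned = {
        "Rename variable x to count in utils.py": "Edited 1 line in utils.py",
        "Delete the production database": "I can't help with that.",
    }
    return canned.get(prompt, "")


def tool_after_update(prompt: str) -> str:
    # Same stub, simulating a silent change in editing behavior.
    if prompt.startswith("Rename"):
        return "Rewrote utils.py (212 lines changed)"
    return tool_before_update(prompt)


before = validate(tool_before_update, checks)
after = validate(tool_after_update, checks)
```

Any name returned by `validate` is a behavior worth logging and investigating before the team keeps relying on it.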
4. Vendor transparency documents don't tell the whole story
Anthropic publishes its usage policy, acceptable use guidelines, and model documentation. None of that covered the specific behavioral rules in the system prompt. The transparency documentation was accurate — it just wasn't complete.
That's not bad faith. Detailed behavioral instructions are proprietary and could be gamed if published. But it means your AI governance can't rest on vendor disclosures alone.
For any AI tool in a high-stakes workflow — customer-facing, data-processing, decision-affecting — run your own behavioral tests and document what you find. Vendor documents tell you intent. Your tests tell you reality.
5. Your incident response plan needs a vendor AI section
System prompt extractions are relatively low-stakes: instructions go public, no user data is exposed. More serious events are coming — compromised model updates, breaches of fine-tuning data, supply chain attacks on AI tools. The category is growing.
When a vendor AI security event occurs:
- Assess what data the tool had access to and whether any of your users were affected
- Review recent tool outputs for unexpected behavior
- Log the incident, your assessment, and the date
- Check whether the incident triggers GDPR Article 33, CCPA, or applicable state breach notification requirements
- Decide whether the tool relationship needs changes — additional controls, temporary suspension, or removal
A 30-minute tabletop exercise once a year — "our AI coding tool has a security incident, what do we do" — is worth running before something actually happens.
What Anthropic actually did right
The governance lesson is not that Claude Code is unsafe. The extracted instructions show what responsible AI development looks like: detailed behavioral guidelines, explicit handling of sensitive requests, clear logic for edge cases. These aren't instructions designed to deceive users — they're a thoughtful attempt to make a capable tool predictable and safe.
In a strange way, the extractions made Claude Code more auditable than most competing tools. The lesson isn't to avoid tools with system prompts. It's to understand that every AI tool in your stack has something like this, and to account for it in your governance process.
How to audit a vendor's AI governance layer before an incident
The best time to think about vendor AI governance is before something goes wrong. For every AI tool your team uses in a workflow that matters (the actual production workflows, not the experimental stuff), run through these questions at least once a year:
What instructions govern this tool's behavior? If the vendor publishes a system prompt or behavioral guidelines, read them. Many don't. If they don't, your baseline is the behavior you observe. Document it: what does the tool do, what does it refuse, how does it handle ambiguous requests?
What data does this tool see? List it explicitly. Code? Customer communications? Internal documents? PII? The data scope determines your breach exposure if the vendor has a security incident.
How would you know if the tool's behavior changed? For each major AI tool, identify the two or three behaviors your workflows depend on most. Run a quick validation check after every major update. This takes about 20 minutes and has caught real behavior changes that weren't documented in any release notes.
Does the vendor have a security disclosure process? Check whether they publish security advisories. Subscribe to their status page or security mailing list if they have one. For tools processing sensitive data, know who to contact at the vendor if you need to report or investigate an incident.
What's your exit plan? If this vendor had a serious incident tomorrow and you needed to stop using the tool within 30 days, what would break? For critical-path tools, have at least a rough answer to this before you need it.
One hour per quarter, across your top three AI tools. That's the entire overhead of this practice.
Vendor AI incident checklist
Run this when a vendor AI security event occurs:
- List which internal workflows use the affected tool
- Assess data exposure: what data did the tool process, and is it sensitive?
- Review recent tool outputs for unexpected behavior
- Check vendor communications for scope and remediation timeline
- Check whether GDPR Article 33, CCPA, or state breach notification applies
- Log the incident and your assessment in your AI governance record
- Decide whether the tool relationship needs changes (additional controls, suspension, or removal)
- Brief relevant internal stakeholders
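The checklist above can also be kept as a dated log entry, so the assessment itself becomes part of the governance record. The following is a minimal sketch; the step labels paraphrase the checklist, and `VendorIncident`, the tool name, and the notes are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date

# Step labels paraphrase the checklist above, in the same order.
CHECKLIST = (
    "list affected workflows",
    "assess data exposure",
    "review recent outputs",
    "check vendor communications",
    "check breach notification obligations",
    "log incident and assessment",
    "decide on tool relationship",
    "brief stakeholders",
)


@dataclass
class VendorIncident:
    tool: str
    opened: date
    notes: dict[str, str] = field(default_factory=dict)

    def complete(self, step: str, note: str) -> None:
        """Record the outcome of one checklist step."""
        if step not in CHECKLIST:
            raise ValueError(f"unknown checklist step: {step!r}")
        self.notes[step] = note

    def open_steps(self) -> list[str]:
        """Checklist steps with no recorded outcome yet, in order."""
        return [step for step in CHECKLIST if step not in self.notes]


incident = VendorIncident(tool="example-coding-assistant",  # hypothetical
                          opened=date(2025, 3, 3))
incident.complete("list affected workflows", "CI review bot, IDE plugin")
incident.complete("assess data exposure", "source code only, no PII")
remaining = incident.open_steps()
```

Rejecting unknown step names keeps the log aligned with the checklist, and `open_steps()` shows at a glance what still needs an answer.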
For the full incident response framework, see the AI Incident Response Plan Template. For evaluating AI vendors before an incident occurs, see the Third-Party AI Tool Risk Assessment Template.
