When your team sends prompts to an AI API, you are typically the data controller and the provider is your data processor. GDPR and CCPA apply the moment any prompt contains information about an identifiable person. The question is not whether your API provider trains on your data; it is whether you have the right contracts, the right settings, and the right data hygiene in place.
This guide maps the major AI APIs by default data-training behavior, lists the three contract clauses that matter, and gives you a DPA checklist you can use today.
Which AI APIs Do Not Train on Your Data by Default
| Provider | Trains on API data? | EU hosting available | DPA provided |
|---|---|---|---|
| Anthropic Claude API | No | No (US-only) | Yes |
| Azure OpenAI Service | No | Yes (EU regions) | Yes (via Microsoft) |
| Google Vertex AI (Gemini) | No | Yes (EU regions) | Yes |
| OpenAI API | No (since Mar 2023) | No (US-only) | Yes |
| Mistral AI API | No | Yes (France/EU) | Yes |
| Cohere API | No (enterprise tier) | Yes (EU available) | Yes |
| ChatGPT (consumer) | Yes (by default) | N/A | Not available |
| Claude.ai (free/pro) | May be used for safety | N/A | Not available |
Key distinction: API products and consumer products have different policies. Your developers using the Claude API are in a different compliance position than employees using Claude.ai in a browser.
Anthropic Claude API
Claude API does not use prompts or completions to train models. This is stated in Anthropic's API usage policy and backed by the DPA Anthropic provides for enterprise customers. Retention: Anthropic stores API inputs and outputs for up to 30 days for abuse detection, then deletes them. Zero data retention is available on request under enterprise agreements.
GDPR gap: Anthropic processes data in the US. If you send EU personal data, you need standard contractual clauses (SCCs) in addition to the DPA. The Anthropic DPA includes SCCs.
Azure OpenAI Service
Microsoft does not train OpenAI models or its own models on customer data submitted to Azure OpenAI. Data is processed within the Azure region you select. EU customers can choose EU-based regions (West Europe, North Europe, Sweden Central) for data residency.
GDPR advantage: As part of the Microsoft cloud, Azure OpenAI is covered by the Microsoft Products and Services DPA (MSDPA), which is GDPR Article 28 compliant and includes EU SCCs and UK IDTA. This is the most mature DPA structure of the major providers.
OpenAI API (direct)
Since March 2023, OpenAI does not train on API data by default. You do not need to opt out. However, OpenAI processes data in the US, and you must sign a DPA at platform.openai.com/privacy to be GDPR compliant. The DPA includes SCCs.
Practical step: log into your OpenAI account, open the privacy and data-controls settings, and confirm that the option to share data for model improvement ("Improve the model for everyone") is disabled. It should be off for API users by default, but verify it rather than assume.
Mistral AI API
Mistral is headquartered in Paris and operates infrastructure in the EU. API data is not used for training. For EU-based small teams, Mistral is often the cleanest option from a data residency standpoint since no SCC transfer mechanism is needed for EU-to-EU data flows.
Google Vertex AI
Vertex AI (the enterprise route to Gemini models) does not train on customer data. This is separate from Google AI Studio, which has different terms. If your team is using the Gemini API, confirm they are going through Vertex AI under your Google Cloud account, not Google AI Studio with a personal Google account.
The Three Contract Clauses That Matter
When reviewing any AI API agreement for GDPR or CCPA compliance, look for these three clauses.
1. No secondary use for training
The agreement must state that the provider will not use your data to train, improve, or develop AI models. "Train" should be defined broadly to include fine-tuning, RLHF, and evaluation datasets. Generic phrases like "we may use data to improve services" are not sufficient.
Look for: "Provider will not use Customer Data to train, retrain, fine-tune, or improve foundation models."
2. Sub-processor list and notification obligation
GDPR requires you to know who your processor shares data with. The agreement must include a sub-processor list (or a link to a maintained list) and a notification period (typically 30 days) before new sub-processors are added.
Look for: "Provider will notify Customer at least 30 days before adding new sub-processors."
3. Deletion on request and at termination
You must be able to delete your data. The agreement must commit to deleting data within a reasonable period on request, and at contract termination.
Look for: "Provider will delete or return all Customer Data within 30 days of termination."
CCPA: Service Provider Agreement Requirement
Under CCPA, sending personal information to an AI API is typically classified as a disclosure to a service provider, not a sale. This avoids the "Do Not Sell" obligations. But you must have a written service provider agreement that prohibits the provider from:
- Retaining, using, or disclosing the personal information for any purpose other than performing the service
- Retaining, using, or disclosing the information for commercial purposes outside of providing the service
- Selling the personal information
Most major AI API enterprise agreements include these prohibitions. Check that your agreement is for the API product, not the consumer product.
California residents test: If any of your prompts could contain information about California residents (including your own employees or customers in California), CCPA service provider requirements apply.
DPA Checklist for AI APIs
Before sending personal data to any AI API:
- Signed DPA in place (not just accepted Terms of Service)
- DPA includes EU Standard Contractual Clauses if the provider processes data outside the EU/EEA
- Sub-processor list reviewed and acceptable
- Data retention period confirmed (preferably 30 days or less)
- Deletion on request confirmed in writing
- Training opt-out confirmed (check API settings dashboard, not just contract)
- Data minimization: are you sending only what the API needs?
- Special-category data excluded from prompts (health, biometric, political, etc.)
- CCPA service provider agreement in place if California residents are in scope
- Internal record of processing activities (ROPA) updated to include this provider
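For teams running this checklist against more than one provider, it can help to encode it as data so reviews are repeatable. A minimal Python sketch (the field names are illustrative assumptions, not tied to any provider's API):

```python
# Illustrative sketch: the DPA checklist above encoded as data, so each
# provider's status can be audited programmatically. Field names are
# made up for this example.
DPA_CHECKLIST = [
    "signed_dpa",
    "sccs_if_non_eu",
    "subprocessor_list_reviewed",
    "retention_30_days_or_less",
    "deletion_on_request_confirmed",
    "training_opt_out_verified",
    "data_minimization_reviewed",
    "special_category_data_excluded",
    "ccpa_service_provider_agreement",
    "ropa_updated",
]

def missing_items(provider_status: dict) -> list:
    """Return checklist items not yet confirmed for a provider."""
    return [item for item in DPA_CHECKLIST if not provider_status.get(item)]

# Example: a provider where only the first two items are confirmed.
status = {"signed_dpa": True, "sccs_if_non_eu": True}
todo = missing_items(status)  # everything still to be confirmed
```

The point is less the code than the discipline: an unchecked item is a blocker, not a footnote.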
What Data Should Never Go into Any AI API Prompt
Regardless of which provider you use or how strong their DPA is, some categories of data should never appear in prompts.
Always exclude:
- Social Security Numbers or national ID numbers
- Payment card numbers (PCI scope, separate obligation)
- Health information covered by HIPAA or EU health data rules
- Biometric data (voiceprints, facial recognition data)
- Data about children under 13 (COPPA in the US) or under 16 (GDPR default; member states may set the threshold as low as 13)
Handle with caution:
- Full names combined with email addresses or job titles (identifiable)
- IP addresses in system prompts (personal data under GDPR)
- Employee performance data
- Legal advice or attorney-client privileged material
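A pre-flight check can catch the most obvious "always exclude" identifiers before a prompt ever leaves your infrastructure. A simplified Python sketch; these regexes are deliberately naive examples, not production-grade PII detection:

```python
import re

# Illustrative pre-flight guard: reject prompts containing obvious
# "always exclude" identifiers before they reach any AI API.
# Patterns here are simplified examples only.
BLOCKED_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),        # US SSN shape
    "card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),       # naive card-number shape
}

def check_prompt(prompt: str) -> list:
    """Return the names of blocked patterns found in the prompt."""
    return [name for name, pat in BLOCKED_PATTERNS.items() if pat.search(prompt)]

violations = check_prompt("Customer SSN is 123-45-6789")  # → ['ssn']
```

In practice a guard like this would sit in the request path and raise or log rather than silently pass data through.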
The safest prompt engineering practice is pseudonymization: replace personal identifiers with placeholder tokens before sending the prompt to the API, then map the tokens back to the real values after receiving the response.
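The tokenize-and-map-back pattern described above can be sketched in a few lines of Python. This minimal illustration handles only email addresses; a real system would use a dedicated PII-detection library (e.g. Microsoft Presidio) rather than a hand-rolled regex:

```python
import re

# Minimal sketch of pseudonymization before an API call.
# Only emails are detected here, as an example.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def pseudonymize(text: str):
    """Replace emails with placeholder tokens; return text plus the mapping."""
    mapping = {}

    def repl(match):
        token = f"<PII_{len(mapping)}>"
        mapping[token] = match.group(0)
        return token

    return EMAIL_RE.sub(repl, text), mapping

def restore(text: str, mapping: dict) -> str:
    """Map placeholder tokens in the model's response back to real values."""
    for token, value in mapping.items():
        text = text.replace(token, value)
    return text

safe, mapping = pseudonymize("Contact alice@example.com about the invoice.")
# safe == "Contact <PII_0> about the invoice."
```

Only the tokenized text crosses the wire; the mapping never leaves your system, which is what keeps the API call out of scope for that identifier.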
Quick Decision Guide
If your team is EU-based and data residency is a hard requirement: Use Azure OpenAI (EU regions) or Mistral AI. Both process in the EU and have mature GDPR DPAs.
If your team is US-based and you want the simplest compliance path: Anthropic Claude API or OpenAI API with a signed DPA. Both have clean no-training policies and provide SCCs for EU data flows.
If you are processing health data or other special-category data: None of the standard AI API DPAs are designed for this. You need legal advice and likely a Business Associate Agreement (BAA) in the US, which only a handful of providers offer.
If you are using consumer products (ChatGPT, Claude.ai, Gemini): These are not compliant paths for processing personal data in a business context. Switch to the API or enterprise product.
Check your provider's DPA directly — terms change. Links: Anthropic DPA, OpenAI DPA, Azure DPA, Mistral DPA, Google Cloud DPA.
