When your team sends prompts to an AI API, you are typically the data controller and the provider is your data processor. GDPR and CCPA apply the moment any prompt contains information about an identifiable person. The question is not whether your API provider trains on your data; it is whether you have the right contracts, the right settings, and the right data hygiene in place.
This guide maps the major AI APIs by default data-training behavior, lists the three contract clauses that matter, and gives you a DPA checklist you can use today.
Which AI APIs Do Not Train on Your Data by Default
| Provider | Trains on API data? | EU hosting available | DPA provided |
|---|---|---|---|
| Anthropic Claude API | No | No (US-only) | Yes |
| Azure OpenAI Service | No | Yes (EU regions) | Yes (via Microsoft) |
| Google Vertex AI (Gemini) | No | Yes (EU regions) | Yes |
| OpenAI API | No (since Mar 2023) | No (US-only) | Yes |
| Mistral AI API | No | Yes (France/EU) | Yes |
| Cohere API | No (enterprise tier) | Yes (EU available) | Yes |
| ChatGPT (consumer) | Yes (by default) | N/A | Not available |
| Claude.ai (free/pro) | May be used for safety | N/A | Not available |
Key distinction: API products and consumer products have different policies. Your developers using the Claude API are in a different compliance position than employees using Claude.ai in a browser.
Anthropic Claude API
Claude API does not use prompts or completions to train models. This is stated in Anthropic's API usage policy and backed by the DPA Anthropic provides for enterprise customers. Retention: Anthropic stores API inputs and outputs for up to 30 days for abuse detection, then deletes them. Zero data retention is available on request under enterprise agreements.
GDPR gap: Anthropic processes data in the US. If you send EU personal data, you need standard contractual clauses (SCCs) in addition to the DPA. The Anthropic DPA includes SCCs.
Azure OpenAI Service
Microsoft does not train OpenAI models or its own models on customer data submitted to Azure OpenAI. Data is processed within the Azure region you select. EU customers can choose EU-based regions (West Europe, North Europe, Sweden Central) for data residency.
GDPR advantage: As part of the Microsoft cloud, Azure OpenAI is covered by the Microsoft Products and Services DPA (MSDPA), which is GDPR Article 28 compliant and includes EU SCCs and UK IDTA. This is the most mature DPA structure of the major providers.
OpenAI API (direct)
Since March 2023, OpenAI does not train on API data by default. You do not need to opt out. However, OpenAI processes data in the US, and you must sign a DPA at platform.openai.com/privacy to be GDPR compliant. The DPA includes SCCs.
Practical step: log into your OpenAI account, open the privacy and data-controls settings, and confirm that the option to share data for model improvement ("Improve the model for everyone") is disabled. It should be off for API users by default, but verify it rather than assume.
Mistral AI API
Mistral is headquartered in Paris and operates infrastructure in the EU. API data is not used for training. For EU-based small teams, Mistral is often the cleanest option from a data residency standpoint since no SCC transfer mechanism is needed for EU-to-EU data flows.
Google Vertex AI
Vertex AI (the enterprise route to Gemini models) does not train on customer data. This is separate from Google AI Studio, which has different terms. If your team is using the Gemini API, confirm they are going through Vertex AI under your Google Cloud account, not Google AI Studio with a personal Google account.
The Three Contract Clauses That Matter
When reviewing any AI API agreement for GDPR or CCPA compliance, look for these three clauses.
1. No secondary use for training
The agreement must state that the provider will not use your data to train, improve, or develop AI models. "Train" should be defined broadly to include fine-tuning, RLHF, and evaluation datasets. Generic phrases like "we may use data to improve services" are not sufficient.
Look for: "Provider will not use Customer Data to train, retrain, fine-tune, or improve foundation models."
2. Sub-processor list and notification obligation
GDPR requires you to know who your processor shares data with. The agreement must include a sub-processor list (or a link to a maintained list) and a notification period (typically 30 days) before new sub-processors are added.
Look for: "Provider will notify Customer at least 30 days before adding new sub-processors."
3. Deletion on request and at termination
You must be able to delete your data. The agreement must commit to deleting data within a reasonable period on request, and at contract termination.
Look for: "Provider will delete or return all Customer Data within 30 days of termination."
CCPA: Service Provider Agreement Requirement
Under CCPA, sending personal information to an AI API is typically classified as a disclosure to a service provider, not a sale. This avoids the "Do Not Sell" obligations. But you must have a written service provider agreement that prohibits the provider from:
- Retaining, using, or disclosing the personal information for any purpose other than performing the service
- Retaining, using, or disclosing the information for commercial purposes outside of providing the service
- Selling the personal information
Most major AI API enterprise agreements include these prohibitions. Check that your agreement is for the API product, not the consumer product.
California residents test: If any of your prompts could contain information about California residents (including your own employees or customers in California), CCPA service provider requirements apply.
DPA Checklist for AI APIs
Before sending personal data to any AI API:
- Signed DPA in place (not just accepted Terms of Service)
- DPA includes EU Standard Contractual Clauses if the provider processes data outside the EU/EEA
- Sub-processor list reviewed and acceptable
- Data retention period confirmed (preferably 30 days or less)
- Deletion on request confirmed in writing
- Training opt-out confirmed (check API settings dashboard, not just contract)
- Data minimization: are you sending only what the API needs?
- Special-category data excluded from prompts (health, biometric, political, etc.)
- CCPA service provider agreement in place if California residents are in scope
- Internal record of processing activities (ROPA) updated to include this provider
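For teams running this checklist against more than one provider, it can help to encode it as data so reviews are repeatable. A minimal Python sketch (the field names are illustrative assumptions, not tied to any provider's API):

```python
# Illustrative sketch: the DPA checklist above encoded as data, so each
# provider's status can be audited programmatically. Field names are
# made up for this example.
DPA_CHECKLIST = [
    "signed_dpa",
    "sccs_if_non_eu",
    "subprocessor_list_reviewed",
    "retention_30_days_or_less",
    "deletion_on_request_confirmed",
    "training_opt_out_verified",
    "data_minimization_reviewed",
    "special_category_data_excluded",
    "ccpa_service_provider_agreement",
    "ropa_updated",
]

def missing_items(provider_status: dict) -> list:
    """Return checklist items not yet confirmed for a provider."""
    return [item for item in DPA_CHECKLIST if not provider_status.get(item)]

# Example: a provider where only the first two items are confirmed.
status = {"signed_dpa": True, "sccs_if_non_eu": True}
todo = missing_items(status)  # everything still to be confirmed
```

The point is less the code than the discipline: an unchecked item is a blocker, not a footnote.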
What Data Should Never Go into Any AI API Prompt
Regardless of which provider you use or how strong their DPA is, some categories of data should never appear in prompts.
Always exclude:
- Social Security Numbers or national ID numbers
- Payment card numbers (PCI scope, separate obligation)
- Health information covered by HIPAA or EU health data rules
- Biometric data (voiceprints, facial recognition data)
- Data about children under 13 (COPPA in the US) or under 16 (GDPR default; member states may set the threshold as low as 13)
Handle with caution:
- Full names combined with email addresses or job titles (identifiable)
- IP addresses in system prompts (personal data under GDPR)
- Employee performance data
- Legal advice or attorney-client privileged material
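A pre-flight check can catch the most obvious "always exclude" identifiers before a prompt ever leaves your infrastructure. A simplified Python sketch; these regexes are deliberately naive examples, not production-grade PII detection:

```python
import re

# Illustrative pre-flight guard: reject prompts containing obvious
# "always exclude" identifiers before they reach any AI API.
# Patterns here are simplified examples only.
BLOCKED_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),        # US SSN shape
    "card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),       # naive card-number shape
}

def check_prompt(prompt: str) -> list:
    """Return the names of blocked patterns found in the prompt."""
    return [name for name, pat in BLOCKED_PATTERNS.items() if pat.search(prompt)]

violations = check_prompt("Customer SSN is 123-45-6789")  # → ['ssn']
```

In practice a guard like this would sit in the request path and raise or log rather than silently pass data through.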
The safest prompt engineering practice is pseudonymization: replace personal identifiers with placeholder tokens before sending the prompt to the API, then map the tokens back to the real values after receiving the response.
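The tokenize-and-map-back pattern described above can be sketched in a few lines of Python. This minimal illustration handles only email addresses; a real system would use a dedicated PII-detection library (e.g. Microsoft Presidio) rather than a hand-rolled regex:

```python
import re

# Minimal sketch of pseudonymization before an API call.
# Only emails are detected here, as an example.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def pseudonymize(text: str):
    """Replace emails with placeholder tokens; return text plus the mapping."""
    mapping = {}

    def repl(match):
        token = f"<PII_{len(mapping)}>"
        mapping[token] = match.group(0)
        return token

    return EMAIL_RE.sub(repl, text), mapping

def restore(text: str, mapping: dict) -> str:
    """Map placeholder tokens in the model's response back to real values."""
    for token, value in mapping.items():
        text = text.replace(token, value)
    return text

safe, mapping = pseudonymize("Contact alice@example.com about the invoice.")
# safe == "Contact <PII_0> about the invoice."
```

Only the tokenized text crosses the wire; the mapping never leaves your system, which is what keeps the API call out of scope for that identifier.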
Quick Decision Guide
If your team is EU-based and data residency is a hard requirement: Use Azure OpenAI (EU regions) or Mistral AI. Both process in the EU and have mature GDPR DPAs.
If your team is US-based and you want the simplest compliance path: Anthropic Claude API or OpenAI API with a signed DPA. Both have clean no-training policies and provide SCCs for EU data flows.
If you are processing health data or other special-category data: None of the standard AI API DPAs are designed for this. You need legal advice and likely a Business Associate Agreement (BAA) in the US, which only a handful of providers offer.
If you are using consumer products (ChatGPT, Claude.ai, Gemini): These are not compliant paths for processing personal data in a business context. Switch to the API or enterprise product.
Check your provider's DPA directly — terms change. Links: Anthropic DPA, OpenAI DPA, Azure DPA, Mistral DPA, Google Cloud DPA.
