Does ChatGPT train on my business data?

It depends on your plan. ChatGPT Free and Plus may use conversations for model training, though you can opt out in Settings > Data Controls. Data from ChatGPT Enterprise, Business, and Team plans is explicitly excluded from training. If your team uses ChatGPT on personal accounts, company data may be used for training unless each individual opts out.

Does Atlassian train on Confluence and Jira data?

Starting August 17, 2026, yes, unless you opt out. Atlassian will use in-app data (Confluence page content, Jira issue titles and descriptions) to train its AI models including Rovo. All plans can opt out of in-app data collection via Atlassian Administration. Only Enterprise customers can opt out of metadata (story points, SLA metrics).

Does Claude train on user conversations?

No. Anthropic does not use Claude conversations to train its models by default across all plan tiers. There is no opt-out required for Team or Enterprise plans. Claude's Constitutional AI approach relies on model-level training techniques, not live conversation data. This holds for API usage, Claude.ai personal accounts, and enterprise deployments.

Does GitHub Copilot use my code to train AI models?

It depends on your plan. GitHub Copilot Individual may use code snippets and suggestions (opt out available in settings). GitHub Copilot Business and Enterprise explicitly do not use your code for model training. Code shared in Copilot Business and Enterprise stays within your organization and is not used to improve GitHub's models.

Does Your AI Vendor Train on Your Business Data? 11 Vendors Compared (2026 Policy Guide)

TL;DR: Most major AI vendors at the business or enterprise tier do not use your data for training. The exceptions matter: consumer-tier ChatGPT and Gemini (opt-out available), GitHub Copilot Individual (opt-out available), and Atlassian, which is changing its policy on August 17, 2026 to train on Confluence page content and Jira issues unless your team actively opts out. If employees use personal accounts for business tasks, data may already be in training pipelines.

In June 2026, a Reddit thread hit the front page of r/artificial with a blunt headline: "So now scraping data without permission is bad for AI training all of sudden?" The sarcasm was pointed at a familiar dynamic: AI companies spent years building their models by scraping the internet, and now those same companies are objecting when others scrape their content. But underneath the irony sat a practical question that thousands of compliance teams are still working out: which of the AI tools your team uses right now is training on your data?

The answer varies by vendor, plan tier, and opt-out status. And as of June 2026, one major change is coming that most teams have not noticed: Atlassian is updating its terms to begin training on Confluence and Jira data starting August 17, 2026, unless customers actively opt out before the deadline.

This guide covers the training data policies for 11 AI products and plan tiers in active use on most small and mid-sized teams, along with what you can do before August 17.

Why This Question Is Harder Than It Looks

The simple version of the question is: "does this AI vendor use my data to train their model?" But in practice, there are at least three different things that could mean:

Training the base model. Using your data to improve the underlying large language model that the product is built on. This is the highest-risk scenario: your confidential data could influence what the model "knows" and potentially surface in responses to other users.

Fine-tuning for the product. Using your usage data to fine-tune the product layer on top of the base model (adjusting tone, task-specific behavior, etc.). Lower risk than base model training, but still involves your data leaving your environment.

Inference-time subprocessing. Sending your prompts and data to a third-party LLM provider to generate a response, without using that data for training. This is what most AI products do and is the least risky form of "vendor sees your data."

Most vendor statements about "not using your data for training" refer specifically to base model training. The same vendor may still send your data to a subprocessor (OpenAI, Google Vertex, AWS Bedrock) for inference. That subprocessing is normal and is disclosed in vendor DPA documents, but it is a separate question from training.

The table below covers base model and fine-tuning training, not subprocessing.

11 Vendor Comparison: Training Data Policies (June 2026)

Vendor / Plan	Trains on Your Data?	Opt-Out?	Notes
ChatGPT Free/Plus	Yes, by default	Yes (Settings > Data Controls)	Consumer tier; personal account data is used unless opted out
ChatGPT Enterprise/Business/Team	No	N/A	Business tiers explicitly excluded from training
Claude (all plans)	No	N/A	No opt-out needed; Anthropic does not use Claude conversations for training
Gemini (free / Google.com)	Yes, reviewers may read	Yes (Activity controls)	Consumer tier; human review allowed by default
Gemini Workspace (Business/Enterprise)	No	N/A	Admin and user data excluded from model training by policy
Microsoft Copilot (free)	May be used for improvement	Limited	Consumer tier; check Microsoft Privacy Dashboard
Microsoft Copilot Business (M365 Business)	No	N/A	Tenant data not used for foundation model training
GitHub Copilot Individual	Yes (code snippets)	Yes (GitHub settings)	Individual tier; opt out in GitHub > Settings > Copilot
GitHub Copilot Business/Enterprise	No	N/A	Code and prompts excluded from training by contract
Atlassian Rovo/Confluence AI	Yes (from Aug 17 2026)	Yes, until Aug 17	All plans can opt out of in-app data; metadata opt-out is Enterprise only
Notion AI	No	N/A	No opt-out needed; optional LEAP program is strictly opt-in
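
For teams that keep an inventory of the AI tools in use, the table can be encoded as data and queried. The following is a minimal Python sketch, not an official data source: the names PolicyRow, POLICIES, and needs_opt_out_review are hypothetical, "May be used for improvement" is conservatively treated as training by default, and "Limited" is treated as an available opt-out. The rows reflect the June 2026 table above and need re-checking whenever a vendor updates its terms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PolicyRow:
    product: str
    trains_by_default: bool  # True for "Yes" and, conservatively, "May be used"
    opt_out_available: bool  # True for "Yes" and "Limited"


# Rows transcribed from the comparison table (June 2026 snapshot).
POLICIES = [
    PolicyRow("ChatGPT Free/Plus", True, True),
    PolicyRow("ChatGPT Enterprise/Business/Team", False, False),
    PolicyRow("Claude (all plans)", False, False),
    PolicyRow("Gemini (free / Google.com)", True, True),
    PolicyRow("Gemini Workspace (Business/Enterprise)", False, False),
    PolicyRow("Microsoft Copilot (free)", True, True),
    PolicyRow("Microsoft Copilot Business (M365 Business)", False, False),
    PolicyRow("GitHub Copilot Individual", True, True),
    PolicyRow("GitHub Copilot Business/Enterprise", False, False),
    PolicyRow("Atlassian Rovo/Confluence AI", True, True),  # from Aug 17, 2026
    PolicyRow("Notion AI", False, False),
]


def needs_opt_out_review(products_in_use):
    """Return the products in use whose default is to train on your data."""
    in_use = set(products_in_use)
    return [
        row.product
        for row in POLICIES
        if row.product in in_use and row.trains_by_default
    ]


if __name__ == "__main__":
    inventory = ["ChatGPT Free/Plus", "Notion AI", "Atlassian Rovo/Confluence AI"]
    for product in needs_opt_out_review(inventory):
        print(product)
```

Results come back in table order, so the output can be read side by side with the rows above.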

The Atlassian August 17 Deadline

This is the most time-sensitive item in the table. Atlassian announced in mid-2026 that starting August 17, 2026, it will use in-app content from Confluence and Jira to train its AI models, including Rovo.

What counts as in-app content: Confluence page titles and page bodies; Jira issue titles, descriptions, and comments; and custom emoji and workflow names. If your team stores business strategy documents, product roadmaps, customer data, or legal notes in Confluence, that content will be used for AI training starting August 17 unless you opt out.

The policy affects all Atlassian Cloud customers, roughly 300,000 organizations globally.

How to opt out:

1. Go to Atlassian Administration (admin.atlassian.com)
2. Navigate to Security > Data contribution
3. Disable in-app data collection before August 17, 2026

Important limitation: only Enterprise plan customers can opt out of metadata collection (story points, SLA metrics, search behavior). For non-Enterprise plans, metadata opt-out is not available.

For teams with EU users: Atlassian sends data to US-based subprocessors including OpenAI (USA), Google Vertex AI, AWS Bedrock, and Databricks for AI processing. This creates data residency considerations for EU-regulated data. If you hold EU employees' data in Confluence and do not opt out, the August 17 change adds AI training as a new purpose for that cross-border transfer.

If you have not already reviewed Atlassian's updated data terms and verified your opt-out status, do it before July ends.

The Account Tier Problem

The most common error teams make is assuming that because they have an enterprise contract for their main AI tools, all AI tool usage in the organization is covered. It is not.

Employees routinely use personal-tier accounts for business tasks:

A developer with a personal GitHub account uses Copilot Individual (trains on code)
A manager pastes meeting notes into ChatGPT free via their personal email (trains on data unless opted out)
An analyst uses a personal Google account to run Gemini queries on business data (reviewers may see it)

The risk is concentrated at the boundary between personal accounts and company work. An acceptable use policy that requires employees to use company-provisioned accounts for business AI use cases is the control that addresses this most directly.

For teams that cannot enforce company-provisioned accounts everywhere, a tiered approach works: require enterprise accounts for any use case involving confidential data (customer information, legal documents, financial data, HR records), and allow personal accounts only for low-sensitivity tasks where training exposure is acceptable.
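
The tiered rule above is simple enough to express as a check that an internal tool-request form or training quiz could reuse. This is a sketch under stated assumptions: AccountType, CONFIDENTIAL, and is_use_allowed are hypothetical names, and the category strings are the four examples from the paragraph above, which your own policy would replace with its definition of "confidential".

```python
from enum import Enum


class AccountType(Enum):
    COMPANY = "company-provisioned"
    PERSONAL = "personal"


# Confidential categories named in the text; extend these for your own context.
CONFIDENTIAL = {
    "customer information",
    "legal documents",
    "financial data",
    "HR records",
}


def is_use_allowed(account: AccountType, data_category: str) -> bool:
    """Confidential data requires a company account; anything else is allowed."""
    if data_category in CONFIDENTIAL:
        return account is AccountType.COMPANY
    return True


if __name__ == "__main__":
    print(is_use_allowed(AccountType.PERSONAL, "HR records"))
    print(is_use_allowed(AccountType.PERSONAL, "public blog draft"))
```

Treating every unlisted category as low sensitivity is a permissive default. A stricter variant would deny personal accounts for any category that has not been explicitly classified.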

The Open-Source Alternative

For teams with strict data handling requirements, there is a category the table above does not include: self-hosted open-weight models.

Models like Meta Llama 3, Mistral, and the open-weight DeepSeek releases can be deployed on your own infrastructure. When you run inference on your own servers, prompts and outputs stay inside your environment. There is no vendor to train on your data because there is no vendor processing it.

The tradeoff is operational overhead: you need to manage hosting, updates, and infrastructure. For regulated industries (healthcare under HIPAA, finance under SOX or GLBA, EU companies processing personal data under GDPR), the control over data residency and training exposure that comes from self-hosting may be worth the cost.

The privacy-first AI APIs guide covers which commercial API providers have verifiable no-training commitments and how to document them for compliance purposes.

How to Update Your AI Acceptable Use Policy

If your organization has an AI acceptable use policy, the training data question should be addressed explicitly. Three provisions to add or update:

Personal account restriction. Specify that employees must use company-provisioned AI accounts for any task involving confidential company information, customer data, or regulated personal data. Define what "confidential" means in your context. Personal accounts are acceptable for learning and low-stakes experimentation.

Training data disclosure requirement. When evaluating new AI tools, require the vendor to disclose its training data policy in writing as part of procurement. Ask specifically: (a) does the vendor train base or fine-tuned models on customer data; (b) is the policy the same across all plan tiers; and (c) what is the process for requesting deletion of data that has already been used for training?

Opt-out verification. For any AI tool where training is the default and opt-out is available (Atlassian being the current example), add opt-out verification to your AI tool registration and approval process. Confirm opt-out status annually at minimum.
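
One way to make the annual check concrete is to store a verification date next to each tool in your register and flag anything stale. The sketch below assumes a hypothetical register format: RegisteredTool, REVERIFY_AFTER, and tools_needing_verification are illustrative names, and the 365-day window simply encodes "annually at minimum".

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

# "Annually at minimum": re-verify if the last check is more than a year old.
REVERIFY_AFTER = timedelta(days=365)


@dataclass
class RegisteredTool:
    name: str
    trains_by_default: bool
    opt_out_verified_on: Optional[date] = None  # None means never verified


def tools_needing_verification(tools, today):
    """Return names of train-by-default tools with a missing or stale check."""
    due = []
    for tool in tools:
        if not tool.trains_by_default:
            continue  # nothing to opt out of
        last = tool.opt_out_verified_on
        if last is None or today - last > REVERIFY_AFTER:
            due.append(tool.name)
    return due


if __name__ == "__main__":
    register = [
        RegisteredTool("Atlassian Cloud", True, date(2026, 7, 1)),
        RegisteredTool("Notion AI", False),
        RegisteredTool("ChatGPT Free", True),
    ]
    print(tools_needing_verification(register, date(2026, 7, 15)))
```

Passing today as an argument rather than calling date.today() inside the function keeps the check easy to test and to run against a future audit date.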

The Regulatory Dimension

For teams operating under GDPR, training on personal data has additional implications. Using an employee's Jira comments or Confluence pages to train an AI model may constitute processing of personal data for a new purpose (AI training) that was not disclosed at the time of collection.

GDPR Article 5(1)(b), the purpose limitation principle, requires that data collected for one purpose (project management, collaboration) is not used for a different purpose (AI model training) without either a compatible justification or fresh consent. Atlassian's approach of opting customers into training by default, with a deadline to opt out, is a notice-and-opt-out model rather than affirmative consent, so teams in EU-regulated industries should verify whether their DPO or legal team considers it adequate under their specific GDPR obligations.

The AI data privacy for small teams guide covers the GDPR analysis for common AI tool use cases, including the purpose limitation question.

Vendor Comparison: What to Ask Before You Sign

If you are in procurement for a new AI tool, ask these questions before signing:

1. At which plan tier does the vendor stop training on customer data?
2. Is there a lag between processing data and using it for training? (Can you delete it before it enters training?)
3. Does the vendor have a data deletion process for previously trained models? (Almost no vendor offers model retraining to remove specific customer data, which is worth knowing.)
4. Which subprocessors receive your data for inference, and what are their training policies?
5. If the vendor changes their training data policy after contract signing, what is the notification period and can you exit the contract?

Question 5 is the Atlassian scenario. A vendor that can change its training policy with 45 days' notice and no exit right has a different risk profile from a vendor that requires material policy changes to go through a contract amendment.
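
If you record the answers to that last question in a vendor register, the distinction can be captured in a few lines. This is an illustrative sketch only: PolicyChangeTerms, policy_change_risk, and the "lower" / "medium" / "higher" labels are hypothetical and do not come from any standard scoring scheme.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PolicyChangeTerms:
    requires_contract_amendment: bool  # policy changes need a signed amendment
    notice_days: int                   # recorded for the register, not scored
    exit_right: bool                   # customer may terminate on a policy change


def policy_change_risk(terms: PolicyChangeTerms) -> str:
    """Rough bucket for how exposed you are to a unilateral policy change."""
    if terms.requires_contract_amendment:
        return "lower"
    if terms.exit_right:
        return "medium"
    return "higher"


if __name__ == "__main__":
    # The scenario described above: 45 days' notice and no exit right.
    print(policy_change_risk(PolicyChangeTerms(False, 45, False)))
```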
