Which AI coding tools train on your code, and which have enterprise DPAs:
| Tool | Trains on your code? | Enterprise DPA? | SOC 2? | Self-serve DPA link |
|---|---|---|---|---|
| GitHub Copilot Business | ❌ No | ✅ Yes (GitHub DPA) | ✅ Yes | github.com/customer-agreement |
| GitHub Copilot Enterprise | ❌ No | ✅ Yes | ✅ Yes | Included in enterprise agreement |
| GitHub Copilot Individual | ⚠️ Yes (unless you opt out) | ❌ No | N/A | Opt-out in settings |
| Cursor for Teams | ❌ No (telemetry off) | ✅ Yes | ✅ Yes | cursor.com/privacy |
| Cursor Individual | ❌ No (telemetry off) | ❌ No | N/A | Disable telemetry in settings |
| Claude Code (Anthropic API) | ❌ No | ✅ Yes | ✅ Yes | privacy.anthropic.com/dpa |
| Amazon Q Developer (Pro) | ❌ No | ✅ Yes (AWS DPA) | ✅ Yes | AWS service terms |
| Tabnine (Team/Enterprise) | ❌ No | ✅ Yes | ✅ Yes | tabnine.com/enterprise |
| ChatGPT (free/Pro) | ⚠️ May be used | ❌ No | N/A | Not available |
| Google AI Studio | ⚠️ Yes | ❌ No | N/A | Not available |
Rule: Individual-tier AI coding tools are not compliant paths for code that touches customer data or operates under a vendor NDA. Use Business or Enterprise tiers for professional use.
AI coding tools raise three specific governance risks that general AI tool policies do not cover: secret leakage (developers accidentally paste credentials into prompts), IP contamination (AI suggestions may reproduce copyrighted code), and output quality (AI-generated code that looks correct but contains security flaws).
This guide gives you a governance policy you can copy directly into your team wiki, a data rules table, and the vendor comparison data your DPA review will need.
What AI Coding Tools Actually Send to the Vendor
Understanding what gets transmitted changes how you write policy.
GitHub Copilot sends code context — the file you are editing and surrounding files — to GitHub's servers to generate suggestions. With Copilot Business/Enterprise, this context is discarded after the suggestion is returned: it is not retained beyond the current session and is not used for training.
Cursor sends code context to its servers, which then forward it to the underlying AI provider (Anthropic Claude or OpenAI, depending on model selection). Cursor processes responses and returns suggestions. With telemetry disabled, Cursor does not log prompts or completions. With telemetry enabled (the default), usage data, including code snippets, may be retained.
Claude Code runs locally on your machine and sends only explicit user prompts and relevant code context to the Anthropic API. It does not scrape your entire codebase. What gets sent is what you explicitly ask about — usually the file contents you paste or the files you reference in a command.
Amazon Q Developer integrates with AWS services and IDEs. In the Pro tier, code is not used for training. Q may index your codebase via the CodeWhisperer customization feature — this is opt-in and disabled by default.
The Four Governance Controls Your Team Needs
1. Approved-Tools List
Define exactly which AI coding tools are allowed, under what plan, and for what purpose.
| Tool | Approved tier | Approved for | Not approved for |
|---|---|---|---|
| GitHub Copilot | Business or Enterprise | All development tasks | Repositories containing customer PII unless DPA confirmed |
| Cursor | Teams tier | Local development, code review | Sending code with embedded secrets |
| Claude Code | API with signed DPA | Development, architecture analysis | Production secrets in context |
| [Other tools] | [Tier] | [Use case] | [Restrictions] |
Not approved without security review: any AI coding tool not on this list, including free-tier tools, browser-based AI tools, and mobile apps.
2. Data Rules — What Never Goes Into an AI Coding Tool
This is the highest-risk control. Developers often paste code snippets into AI tools without thinking about what is in them.
Absolute prohibitions — no exceptions:
- Hardcoded API keys, passwords, database credentials, or tokens (even expired ones — they reveal key format and naming conventions)
- Customer personal data embedded in test fixtures (real names, real emails, real IDs)
- Private keys, certificates, or cryptographic material
- Code covered by an NDA with a client or partner that prohibits sharing
- Proprietary algorithms explicitly listed as trade secrets in vendor agreements
Handle with caution:
- Production configuration files — remove sensitive values before pasting
- Database schemas with personally identifiable column names — anonymize the schema
- Error logs — strip user IDs, session tokens, and email addresses before sharing
Practical rule for developers: before pasting any file into an AI tool, scan it for strings matching `key`, `secret`, `password`, `token`, `credential`, or `private`. If any match, scrub before sending.
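That scan can be automated with a small pre-paste helper. This is a minimal sketch, deliberately over-broad on keyword matching; the email and UUID redaction patterns for log scrubbing are illustrative assumptions, not part of any official tool:

```python
import re

# Keyword list from the rule above. No word boundaries on purpose:
# "db_password" should still match. Expect (and accept) false positives.
SECRET_WORDS = re.compile(
    r"(key|secret|password|token|credential|private)", re.IGNORECASE
)
# Illustrative scrub patterns for error logs: emails and UUID-style IDs.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
UUID = re.compile(
    r"\b[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\b",
    re.IGNORECASE,
)

def flag_secrets(text: str) -> list[str]:
    """Return lines mentioning secret-like keywords; each needs manual review."""
    return [line for line in text.splitlines() if SECRET_WORDS.search(line)]

def scrub_log(text: str) -> str:
    """Redact email addresses and UUID-style user IDs before sharing a log."""
    return UUID.sub("[REDACTED-ID]", EMAIL.sub("[REDACTED-EMAIL]", text))
```

The helper errs toward noise rather than silence: a human still decides what to scrub, but nothing secret-shaped slips past unflagged.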
3. DPA Verification
If your team's code ever touches EU personal data — customer records, employee data, user activity logs — the AI coding tool you use to work with that code is a data processor under GDPR.
DPA required for:
- Any AI coding tool used with code that processes EU personal data
- Any tool used in a repository that stores personal data, even in test fixtures
DPA not required for:
- AI tools used only with fully anonymized or synthetic test data
- AI tools used only with internal tooling that never touches personal data
Self-serve DPA confirmation:
- GitHub Copilot Business/Enterprise: DPA included in GitHub Customer Agreement
- Cursor Teams: DPA available at cursor.com/privacy
- Claude Code (Anthropic API): DPA at privacy.anthropic.com/dpa
- Amazon Q Pro: covered under AWS Data Processing Addendum
4. Output Review Requirements
AI-generated code is not production-ready by default. Your code review process must include AI-specific checks.
Mandatory review before merging AI-generated code:
- No hardcoded credentials or secrets introduced by the AI suggestion
- No SQL injection, XSS, or other OWASP Top 10 vulnerabilities in the generated code
- No copyrighted code reproduced verbatim (check for distinctive patterns from known open-source projects)
- Logic reviewed by a human — does this actually do what the prompt asked?
- Tests written for the generated code (AI tools often skip edge cases)
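The first checklist item can be partially automated with a pre-merge scan over a diff's added lines. A hedged sketch — the regex is a naive, illustrative heuristic, and a dedicated secret scanner is still recommended:

```python
import re

# Heuristic for hardcoded credentials in added lines of a unified diff.
# Pattern and length threshold are illustrative, not an official standard.
CRED_ASSIGNMENT = re.compile(
    r"""(?ix)
    (api[_-]?key|secret|password|token) \s* [:=] \s* ['"][^'"]{8,}['"]
    """
)

def added_lines(diff_text: str):
    """Yield lines added by a unified diff (skipping the +++ file header)."""
    for line in diff_text.splitlines():
        if line.startswith("+") and not line.startswith("+++"):
            yield line[1:]

def diff_credential_hits(diff_text: str) -> list[str]:
    """Return added lines that look like hardcoded credentials."""
    return [ln for ln in added_lines(diff_text) if CRED_ASSIGNMENT.search(ln)]
```

Wired into CI against `git diff` output, a non-empty result fails the build and forces a human look before the AI-generated change merges.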
The accountability rule: whoever merges AI-generated code is responsible for it — as fully responsible as if they had written it themselves. Saying "the AI wrote it" is not a defense in a security incident.
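For the verbatim-reproduction check, when a suggestion looks suspiciously like a specific open-source file you have locally, a rough comparison can be sketched. This is a naive line-run heuristic under that assumption, not a substitute for a real license-scanning tool:

```python
import difflib

def longest_shared_run(suggestion: str, reference: str) -> int:
    """Longest run of consecutive identical non-blank lines shared between
    an AI suggestion and a known open-source file (whitespace-normalized)."""
    a = [ln.strip() for ln in suggestion.splitlines() if ln.strip()]
    b = [ln.strip() for ln in reference.splitlines() if ln.strip()]
    m = difflib.SequenceMatcher(None, a, b).find_longest_match(0, len(a), 0, len(b))
    return m.size

def needs_license_review(suggestion: str, reference: str, threshold: int = 10) -> bool:
    """Flag for manual license review past the policy threshold (10-15 lines)."""
    return longest_shared_run(suggestion, reference) >= threshold
```
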
Copy-Paste Governance Policy: AI Coding Tools
Add this section to your existing AI Acceptable Use Policy or team engineering handbook.
AI Coding Tools Policy — [Company Name]
Version: 1.0 | Effective: [Date]
Approved tools:
The following AI coding tools are approved for development use. All other AI coding tools require approval from [Tech Lead / CISO] before use.
| Tool | Approved tier | Conditions |
|---|---|---|
| GitHub Copilot | Business or Enterprise | Must be signed into company GitHub org |
| Cursor | Teams | Telemetry must be disabled in settings |
| Claude Code | API via Anthropic | DPA confirmed; secrets.env excluded from context |
| [Other] | [Tier] | [Conditions] |
Data prohibitions:
Developers must not send the following to any AI coding tool:
- API keys, passwords, secrets, or credentials (active or expired)
- Customer personal data, including names, emails, user IDs, or payment data embedded in test fixtures
- Code covered by an NDA that prohibits third-party sharing
- Private cryptographic keys or certificates
Output accountability:
Code generated or suggested by an AI tool must be reviewed by a human before merging. The reviewer is fully responsible for the correctness and security of merged code, regardless of origin.
IP and licensing:
AI coding tools may suggest code that reproduces open-source code subject to licensing obligations. Before accepting a verbatim suggestion longer than 10–15 lines, developers must verify that it does not reproduce code under a copyleft license (GPL, AGPL) incompatible with the license the output will be released under.
Violation reporting:
If a developer accidentally pastes credentials or customer data into an AI tool prompt:
- Stop immediately
- Report to [security contact] within 24 hours
- Rotate the credential immediately if it was valid
- Document in the AI incident log
Review cadence: This approved-tools list is reviewed quarterly. New tools require security review before addition.
[Company Name] | [Date] | Owner: [Tech Lead / CISO]
Related Resources
- AI Acceptable Use Policy Template — the full policy covering all AI tools, not just coding tools
- Privacy-First AI APIs — which AI APIs don't train on your data, with a DPA comparison table
- AI Tool Register Template — tracking every AI tool your team uses with risk tiers
References
- GitHub — Copilot Trust Center
- Anthropic — API Privacy and Data Policy
- OWASP — Top 10 Web Application Security Risks
- National Institute of Standards and Technology — AI Risk Management Framework 1.0
