GitHub Copilot sends your current file and nearby files to Microsoft/OpenAI every time it generates a suggestion. Cursor sends multi-file codebase context to Anthropic or OpenAI. Outside of self-hosted deployments, every AI coding tool transmits code to external servers; that is how they work. Without explicit governance rules, engineering teams routinely expose proprietary algorithms, customer data embedded in test fixtures, and credentials sitting in adjacent config files. Five governance rules close the most common gaps.
At a glance: AI coding tools are not covered by your general AI use policy without explicit rules. The three highest-risk exposures: credentials in AI context windows (API keys, tokens, database strings), customer or employee PII in test data, and proprietary algorithms in codebases classified as trade secrets. The five rules in this guide close each of these gaps without slowing down your engineering team.
Why AI Code Governance Is Different
Most AI governance policies cover things like ChatGPT usage, vendor DPAs, and automated decision-making. They don't specifically address AI coding tools, because AI coding tools feel like developer utilities rather than data processing systems.
They are both. GitHub Copilot processes code context. Cursor processes codebase context. Amazon CodeWhisperer, Tabnine, and similar tools all transmit code to external inference endpoints. What gets transmitted — and under what terms — matters for IP protection, data privacy, and regulatory compliance.
The governance gaps that emerge without a code-specific policy:
Credentials in context. A developer working in a directory that contains a .env file may accidentally include live API keys in the AI context window when requesting a completion. The AI tool doesn't transmit the key as a credential — it's just text in the file being processed. But the processing happens on external servers.
PII in test data. Test fixtures built from production data frequently contain real customer names, emails, and identifiers. When an AI tool loads a test file for context, that PII travels to the AI provider's servers.
Proprietary algorithms. Core business logic — pricing algorithms, recommendation systems, fraud detection models — may be exposed to AI vendors under terms that are not equivalent to your NDAs with employees.
License compliance. AI-generated code can reproduce code from training data, including GPL/AGPL-licensed code. Without a review step, this ends up in proprietary codebases without notice.
Rule 1: Define Approved Tools by Codebase Sensitivity
Not all AI coding tools are appropriate for all codebases. Classify your repositories by sensitivity and assign allowed tools accordingly.
Tier 1 — Unrestricted. Public repositories, open-source projects, documentation sites. Any approved AI coding tool may be used.
Tier 2 — Standard. Internal tools, non-sensitive business logic. Approved AI coding tools may be used with standard credential hygiene.
Tier 3 — Restricted. Core product algorithms, customer data processing pipelines, financial and payment systems. AI coding tools allowed with explicit context configuration (see Rule 3). No AI tools with training rights.
Tier 4 — Off-limits. Healthcare records systems, regulated financial data, legally privileged code, codebases classified as trade secrets by legal. No AI coding tools permitted.
Document this classification in your codebase's README or in a central policy document. Engineers should be able to check a single page to know which AI tools they may use for a given repository.
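A worked sketch of that single page, with hypothetical repository names:

```
| Repository      | Tier             | Allowed AI tools                        |
|-----------------|------------------|-----------------------------------------|
| docs-site       | 1 (Unrestricted) | Any approved tool                        |
| internal-crm    | 2 (Standard)     | Copilot Business, Cursor (privacy mode)  |
| pricing-engine  | 3 (Restricted)   | Copilot Business with context exclusions |
| payments-ledger | 4 (Off-limits)   | None                                     |
```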
Rule 2: Configure Data Protection in Your AI Coding Tool
Each major AI coding tool has settings that affect how your code is handled. Set these before deployment, not after a data incident.
GitHub Copilot:
- Use Business or Enterprise plan — Individual plan does not provide organizational data control.
- Disable telemetry for your organization: Settings → Copilot → Allow GitHub to use my code snippets → Off.
- Enable code duplication filter: Settings → Copilot → Suggestions matching public code → Block.
- Review the GitHub Copilot data handling documentation for your plan tier.
Cursor:
- Enable Privacy Mode in Settings → General → Privacy Mode. This limits what code context is sent.
- For Business plan: configure codebase indexing settings to exclude sensitive directories (see the .cursorignore sketch after this list).
- Review Cursor's privacy policy to confirm current data retention and training terms.
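If your Cursor version supports it, a .cursorignore file at the repository root is the most auditable way to implement the indexing exclusion above; it uses .gitignore syntax, but confirm against Cursor's current docs exactly what it keeps out of indexing versus chat context. The paths here are hypothetical:

```
# .cursorignore: keep these paths out of Cursor's index and context
.env
.env.*
secrets/
fixtures/customer-data/
```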
Amazon CodeWhisperer:
- Use the Professional tier for organizational deployment — it includes additional data protection terms.
- Opt out of sharing code suggestions as training data: Settings → Data sharing → Off.
Tabnine:
- Tabnine offers a self-hosted option that keeps all code processing local. For Tier 3 and Tier 4 codebases, this is the only appropriate deployment model.
- For cloud deployments, confirm training opt-out is active at the organization level.
Rule 3: Block Credentials from AI Context Windows
This is the highest-urgency rule. Credentials in AI context windows are the most common and most immediately harmful code governance failure.
Git pre-commit hook for secret detection. Add a pre-commit hook that scans staged files for credential patterns before allowing commits. Tools like gitleaks, trufflesecurity/trufflehog, or the simpler git-secrets catch the most common formats:
```bash
#!/bin/bash
# .git/hooks/pre-commit (or wire this in via the pre-commit framework)
# gitleaks protect --staged scans only the staged changes and exits
# non-zero when it finds a potential secret.
if ! gitleaks protect --staged --redact; then
  echo "ERROR: Potential secrets detected. Review before committing."
  exit 1
fi
```
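If you already use the pre-commit framework, gitleaks ships a ready-made hook; pin rev to a release you have verified:

```yaml
# .pre-commit-config.yaml
repos:
  - repo: https://github.com/gitleaks/gitleaks
    rev: v8.18.4  # pin to a release you have verified
    hooks:
      - id: gitleaks
```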
.gitignore enforcement. Ensure .env, .env.local, secrets.yaml, and similar files are in .gitignore for every repository. AI tools use file proximity for context — a .env file in the same directory as the file being edited may be included in context even if it is not the file being edited.
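Enforcement can be automated. A minimal CI guard, assuming a shell runner and the file names above, fails the build when credential files are not covered by .gitignore:

```bash
#!/bin/bash
# git check-ignore exits 0 when a path is ignored, non-zero otherwise.
for f in .env .env.local secrets.yaml; do
  if ! git check-ignore -q "$f"; then
    echo "ERROR: $f is not covered by .gitignore"
    exit 1
  fi
done
```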
Dedicated secrets management. Production credentials should be in a secrets manager (AWS Secrets Manager, HashiCorp Vault, Doppler) accessed at runtime, not stored in any file that could enter an AI context window.
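As a sketch, assuming AWS Secrets Manager and a hypothetical secret named prod/db-password, a runtime fetch looks like this; the credential exists only in process memory, never in a file an AI tool could pull into context:

```bash
#!/bin/bash
# Fetch the credential at process start instead of reading it from a file.
# "prod/db-password" is a hypothetical secret name; substitute your own.
DB_PASSWORD="$(aws secretsmanager get-secret-value \
  --secret-id prod/db-password \
  --query SecretString --output text)"
export DB_PASSWORD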
Rule 4: Require Human Review for AI-Generated Code Before Merge
AI-generated code moves from suggestion to production faster than any code before it. The review step that slows it down is also the step that catches license compliance issues, credential injection, logic errors, and security vulnerabilities.
Define the review requirement explicitly:
- All AI-generated functions over 20 lines require a named human reviewer before merge.
- AI-generated database migrations require review by a senior engineer or technical lead.
- AI-generated authentication or authorization logic requires security review, not just functional review.
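On GitHub, one way to make the security-review requirement structural rather than procedural is a CODEOWNERS entry; the paths and team name here are hypothetical:

```
# .github/CODEOWNERS: auth changes always request a security review
/src/auth/   @your-org/security-team
/src/authz/  @your-org/security-team
```

Pair this with branch protection's "require review from code owners" setting so the review request becomes a blocking check rather than a suggestion.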
The review is not checking whether the code works — automated tests do that. The review is checking whether the code is appropriate: no license conflicts, no accidental credential exposure, no logic that circumvents intended access controls.
Track AI-generated code in your code review tooling. Most AI coding tools can be configured to add a marker comment; alternatively, use a PR label. This creates an audit trail for which code was AI-generated.
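With the GitHub CLI, for example, the label approach is two commands; the label name is this guide's suggestion, not a standard:

```bash
# Tag PR 123 as containing AI-generated code so reviews can filter on it.
gh pr edit 123 --add-label "ai-generated"
# Later, pull the audit trail of everything that carried the label.
gh pr list --label "ai-generated" --state merged
```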
Rule 5: Include AI Code Tools in Your AI Incident Log
When an AI coding tool produces code that causes a production incident — a security vulnerability, a data exposure, an incorrect calculation — that event belongs in your AI incident log.
Without a log, patterns in AI-generated code failures are invisible. With a log, you can identify that your team's AI-generated authentication code is failing more often than human-written authentication code, and take corrective action — tighter review requirements, different tools, or prohibited use in that domain.
The log entry format: date, tool, what happened, whether AI-generated code was involved, action taken.
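A minimal sketch of one entry, written as a JSON line with hypothetical values:

```json
{"date": "2026-03-14", "tool": "GitHub Copilot", "what_happened": "Suggested query omitted tenant filter; caught in staging", "ai_generated": true, "action_taken": "Added tenant-isolation check to PR review checklist"}
```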
Implementation Checklist
- Codebase sensitivity tiers documented and communicated to engineering team
- AI coding tools approved for each tier (for some codebases, that means none)
- Privacy and data protection settings configured in each approved tool
- Training opt-out enabled at organization level for each tool
- Code duplication filter enabled for Copilot (blocks suggestions matching public code)
- Pre-commit secret scanning hook active in all repositories
- .env and credential files in .gitignore for every repository
- Production credentials in a secrets manager, not in any file
- Human review requirement documented for AI-generated code in PRs
- AI incident log includes a category for AI coding tool failures
- Policy communicated to engineering team with examples of what is and is not allowed
Policy Template: AI Coding Tool Rules (Copy-Paste)
AI Coding Tool Policy — [Company Name]
Last updated: [Date]
Approved tools: [GitHub Copilot / Cursor / Tabnine / other]
Not approved: [tools not on the approved list]
Repository tiers:
- Unrestricted repos: [list] — all approved tools permitted
- Standard repos: [list] — approved tools with credential hygiene
- Restricted repos: [list] — approved tools with privacy mode enabled
- Off-limits repos: [list] — no AI coding tools permitted
Rules for all engineers:
1. No AI tool context window may include credentials, API keys, or tokens.
2. AI-generated code over 20 lines requires a named human reviewer before merge.
3. AI-generated database migrations require senior engineer approval.
4. PII must be removed from test fixtures before using AI tools in that directory.
5. Incidents involving AI-generated code must be logged in the AI incident log.
Privacy settings (enforce at organization level):
- Training opt-out: enabled
- Code duplication filter: enabled (blocks public code suggestions)
- Telemetry sharing: disabled
Questions: contact [AI governance owner name and email]
References
- AI governance for small teams — complete guide
- Hidden AI features in developer tools
- AI vendor due diligence checklist 2026
- TypeScript AI agent security incident response playbook
- GitHub Copilot for Business data handling: docs.github.com/en/copilot/overview-of-github-copilot/about-github-copilot-for-business
- Cursor privacy policy: cursor.com/privacy
