California AB 2013 Compliance Checklist 2026 — AI Training Data Transparency
California AB 2013 (the Generative AI Training Data Transparency Act) took effect January 1, 2026. If your company offers a generative AI system to California users and has not published a training data disclosure page, you are already out of compliance.
At a glance:
| Element | Detail |
|---|---|
| Effective date | January 1, 2026 |
| Who it covers | Developers offering GenAI systems to Californians |
| Core obligation | Publish training data disclosure on website |
| Update trigger | Substantial modification to the AI system |
| Enforcement | California UCL — Attorney General, local prosecutors, and private plaintiffs |
| Applies to systems released from | January 1, 2022 onward |
| Format required | None prescribed — must be publicly accessible |
Step 1: Determine if you are in scope
AB 2013 covers developers of generative AI systems made available to Californians. You are likely in scope if the system was released or substantially modified on or after January 1, 2022 and you:
- Offer a text, image, audio, video, or code generation model to the public
- Make a generative AI API available to third-party developers
- Operate a product built on a fine-tuned or custom-trained generative model
Out of scope:
- AI systems used solely for internal business operations with no consumer-facing output
- AI used for national security or defense applications
- AI systems for safety and security purposes (fraud detection, spam filtering) that do not generate open-ended content
- Pure wrappers around third-party APIs where you have not conducted any training or fine-tuning (though best practice is to disclose what you know about the underlying model)
Gray zone: If you fine-tune an existing model on your own data and deploy it to users, you are likely in scope for the fine-tuning dataset, even if the base model disclosure is handled by the upstream provider.
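The scope test above can be sketched as a screening function. This is an illustration of the checklist's logic, not legal advice; the type and field names are our own assumptions, and gray-zone cases (such as fine-tuned wrappers) still need counsel review.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class GenAISystem:
    reaches_californians: bool
    generates_open_ended_content: bool  # text/image/audio/video/code output
    internal_only: bool                 # no consumer-facing output
    security_or_defense_only: bool      # natsec, fraud detection, spam filtering
    did_any_training: bool              # pretraining or fine-tuning you conducted
    release_or_mod_date: date           # release or substantial modification

CUTOFF = date(2022, 1, 1)

def in_scope(s: GenAISystem) -> bool:
    """Rough screen encoding the in-scope / out-of-scope bullets above."""
    if not s.reaches_californians or not s.generates_open_ended_content:
        return False
    if s.internal_only or s.security_or_defense_only:
        return False
    if not s.did_any_training:  # pure wrapper around a third-party API
        return False
    return s.release_or_mod_date >= CUTOFF
```

A system that fails this screen may still warrant voluntary disclosure, per the wrapper best-practice note above.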
Step 2: What must be published
The disclosure must appear on your publicly accessible website before the system (or a substantial modification to it) is made available; for systems already on the market, the statutory deadline was January 1, 2026. No prescribed format: a dedicated webpage, a model card, or a PDF linked from your product page all work.
Required fields:
| Field | What to include |
|---|---|
| Dataset names and sources | Name each dataset (or category of dataset); identify whether it came from web scrape, licensed data purchase, first-party data, or other source |
| Data types | Specify the modalities: text, images, audio, video, code, structured data, synthetic data |
| Copyrighted material | State whether copyrighted content is included; if yes, describe the licensing basis (licensed, fair use claim, rights-reserved opt-out honored) |
| Personal information | State whether personal data was included; if yes, describe the category (public biographical data, user-consented data, etc.) |
| Synthetic data | Disclose whether any training data was synthetically generated and how it was produced |
| Data processing | Describe filtering, deduplication, quality scoring, or other processing applied |
| Intended purpose of datasets | Describe what capability each dataset was intended to develop |
What you do NOT have to disclose:
- Proprietary details of your training pipeline
- Full dataset contents or samples
- Specific file counts or token counts (though including these helps)
- Vendor contracts or pricing
Step 3: Where to publish it
The law requires the disclosure to be on your "internet website." Accepted formats:
- Dedicated `/training-data` or `/model-transparency` page on your product website
- Model card hosted on Hugging Face, GitHub, or your own documentation site with a link from your main site
- Section within your existing Terms of Service or Privacy Policy (not recommended — harder to find and update)
- Structured data card following a published schema (e.g., Hugging Face dataset card format, Croissant format)
The disclosure should be findable from your product's main page within 1-2 clicks. Burying it in a footer link to a legal page satisfies the letter but creates litigation risk.
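Since AB 2013 prescribes no format, keeping the disclosure as a machine-readable record and rendering the webpage from it makes updates auditable. A minimal sketch, loosely modeled on the required-fields table in Step 2 — the field names and product name are our own assumptions, not a prescribed schema:

```python
import json

# Hypothetical structured disclosure record; adapt field names to the
# schema you adopt (e.g. a Hugging Face dataset card or Croissant file).
disclosure = {
    "system": "ExampleWriter v3",   # hypothetical product name
    "last_updated": "2026-01-01",
    "datasets": [
        {
            "name": "Common Crawl (filtered)",
            "source": "web scrape",
            "data_types": ["text"],
            "copyrighted_content": {"included": True, "basis": "fair use claim"},
            "personal_information": {"included": False},
            "synthetic": False,
            "processing": ["deduplication", "quality filtering"],
            "intended_purpose": "general language understanding",
        },
    ],
}

# Emit the record as JSON; a site generator can render this to HTML.
print(json.dumps(disclosure, indent=2))
```

One record per dataset keeps the Step 4 update workflow simple: a new data source is a new list entry plus a new `last_updated` value.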
Step 4: Update triggers
You must update the disclosure when there is a substantial modification to the AI system. Treat these events as triggers:
| Event | Triggers update? |
|---|---|
| New base model version (e.g., switching from GPT-4 to GPT-4o) | Yes — if you conducted any fine-tuning |
| Fine-tuning on a new proprietary dataset | Yes |
| Adding a new data source to the training pipeline | Yes |
| Prompt engineering or system prompt changes only | No |
| Inference parameter changes (temperature, context window) | No |
| RAG pipeline updates (new knowledge base) | Likely yes — disclose the new retrieval corpus |
| Bug fixes or safety filter updates | No |
Build the update into your release process: every model release checklist should include a disclosure review step.
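The trigger table can be wired directly into a release gate. A sketch under our own assumptions (the event names are invented; map them to your pipeline's change categories), failing closed on anything unclassified:

```python
# Whether each release event type triggers a disclosure update,
# per the trigger table above. Event names are hypothetical.
TRIGGERS = {
    "new_base_model_with_finetuning": True,
    "finetune_new_dataset": True,
    "new_training_data_source": True,
    "prompt_change_only": False,
    "inference_param_change": False,
    "rag_corpus_update": True,  # "likely yes" per the table: disclose conservatively
    "bug_or_safety_filter_fix": False,
}

def needs_disclosure_update(events: list[str]) -> bool:
    """Return True if any event in this release requires a disclosure review."""
    unknown = [e for e in events if e not in TRIGGERS]
    if unknown:
        # Fail closed: unclassified changes get a human disclosure review.
        return True
    return any(TRIGGERS[e] for e in events)
```

Calling this from CI on each release candidate turns the checklist step into a hard gate rather than a reminder.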
Step 5: Enforcement and litigation risk
AB 2013 is enforced via the California Unfair Competition Law (UCL), which makes non-compliance an unlawful business practice. Unlike many state AI laws that limit enforcement to the Attorney General, the UCL opens three avenues:
| Enforcement actor | What they can do |
|---|---|
| California Attorney General | Civil action, injunction, civil penalties |
| Local district attorneys and city attorneys | Same powers as AG within their jurisdiction |
| Private plaintiffs (representative action) | Sue in the public interest under UCL § 17204 |
The private lawsuit risk is real. Since Proposition 64 (2004), private UCL plaintiffs must show injury in fact and loss of money or property to establish standing. But once standing is established, a plaintiff suing under UCL § 17204 can seek injunctive relief compelling compliance without quantifying individual damages, and the prospect of attorney-fee awards further incentivizes plaintiff firms to target non-compliant AI developers.
What typically triggers a lawsuit: public disclosure failures that are easy to document. A product launched without any disclosure page, or a model updated on a new training dataset with no corresponding disclosure update.
Step 6: Interaction with other laws
AB 2013 operates alongside other data transparency obligations:
| Law | Training data relevance |
|---|---|
| AB 2013 (California) | Disclosure of training datasets, publicly on website |
| EU AI Act GPAI (Chapter V) | Training data summary, published using EU AI Office template, for EU market |
| CCPA / CPRA | Training on California residents' personal data may trigger notice and opt-out obligations |
| Copyright law | Training on rights-reserved content is subject to ongoing litigation — disclosure of copyrighted training data does not grant a license |
If you are targeting both California and EU markets, the GPAI training data summary requirement and AB 2013 are structurally similar — you can produce one document that satisfies both with minor additions for each.
Minimum viable disclosure page — template structure
# [Product Name] — AI Model Training Data Disclosure
Last updated: [date]
This page describes the data used to train [product name]'s AI model,
as required by California AB 2013.
## Training Datasets
| Dataset | Source | Data type | Copyrighted content | Personal data | Synthetic data |
|---|---|---|---|---|---|
| Common Crawl (filtered) | Web scrape | Text | Yes — fair use basis | No | No |
| [Dataset 2] | ... | ... | ... | ... | ... |
## Data Processing
[Describe: deduplication approach, quality filtering, safety filtering,
PII removal if applicable]
## Intended Purpose
[Describe what each dataset was used to develop — e.g., "general
language understanding", "code generation", "instruction following"]
## Updates
This disclosure was last updated on [date] following [description of
substantial modification]. Prior versions are available at [link].
## Contact
Questions about this disclosure: [email]
Practical next steps for small teams
- Audit your GenAI systems today — list every model you train, fine-tune, or substantially configure that reaches California users
- Create the disclosure page — one page per system; link from the product homepage and footer
- Add disclosure review to your release checklist — before every model update, check whether it constitutes a substantial modification
- Document your training data internally — you cannot disclose what you haven't recorded; build data provenance tracking from the start
- Check the GPAI overlap — if you are also an EU market AI provider, a combined AB 2013/GPAI disclosure saves time
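The "document your training data internally" step is easiest if provenance is logged at ingestion time, so the disclosure page can later be generated from internal records instead of reconstructed from memory. A minimal append-only log sketch; the field names are illustrative assumptions:

```python
import csv
from dataclasses import dataclass, asdict

@dataclass
class DatasetRecord:
    name: str
    source: str            # web scrape / licensed / first-party / synthetic
    data_type: str         # text, images, audio, video, code, ...
    contains_copyrighted: bool
    contains_personal: bool
    ingested_on: str       # ISO date

def append_record(path: str, rec: DatasetRecord) -> None:
    """Append one provenance row, writing the header on first use."""
    with open(path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(asdict(rec)))
        if f.tell() == 0:  # empty file: emit the header row first
            writer.writeheader()
        writer.writerow(asdict(rec))
```

A flat CSV is deliberately low-tech: the point is that every dataset touching the training pipeline leaves a row behind that maps onto the Step 2 required fields.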
Related reading
- Texas TRAIGA compliance checklist — another in-force state AI law with documentation requirements
- GPAI enforcement August 2026 — EU parallel to AB 2013's training data summary obligation
- Federal AI preemption vs state laws 2026 — why AB 2013 applies even if you expect federal preemption
References
- California Legislature — AB 2013 Full Text
- Crowell & Moring — California's AB 2013 Requires Generative AI Data Disclosure by January 1, 2026
- Goodwin Law — California's AB 2013: Generative AI Developers Must Show Their Data
- TrustArc — California SB 942 & AB 2013: AI transparency compliance guide
