TypeScript AI Agent Output Validation: 6 Patterns with Code Templates
Every AI agent output is untrusted input to the rest of your system. Treat it the same way you treat user-supplied HTTP request bodies: parse, validate, sanitize, then use.
These six patterns are standalone — each is a function or class you drop in and wire to your existing agent loop.
Pattern 1: Zod Schema Enforcement with Retry
Parse every structured output against its expected schema. On failure, feed the validation error back as a correction prompt and retry once; if that also fails, throw so the caller can apply its own fallback.
import { z } from "zod";
import OpenAI from "openai";
const client = new OpenAI();
async function callWithSchema<T>(
messages: OpenAI.Chat.ChatCompletionMessageParam[],
schema: z.ZodType<T>,
maxRetries = 1
): Promise<T> {
for (let attempt = 0; attempt <= maxRetries; attempt++) {
const response = await client.chat.completions.create({
model: "gpt-4o",
messages,
      // Note: JSON mode requires the word "JSON" to appear somewhere in the messages
      response_format: { type: "json_object" },
});
    const raw = response.choices[0].message.content ?? "";
    // Guard the parse so a JSON syntax error triggers a retry instead of escaping the loop
    let candidate: unknown = null;
    try { candidate = JSON.parse(raw); } catch { /* fall through to safeParse failure */ }
    const parsed = schema.safeParse(candidate);
    if (parsed.success) return parsed.data;
if (attempt < maxRetries) {
// Feed the error back as a correction prompt
messages = [
...messages,
{ role: "assistant", content: raw },
{
role: "user",
content: `Your output failed validation: ${parsed.error.message}. Return a valid JSON object matching the schema.`,
},
];
}
}
throw new Error("Output validation failed after retries");
}
// Usage
const AnalysisResult = z.object({
risk_level: z.enum(["low", "medium", "high"]),
summary: z.string().max(500),
action_required: z.boolean(),
});
const result = await callWithSchema(
  [{ role: "user", content: "Analyze this contract for risk and respond in JSON..." }],
AnalysisResult
);
// result is typed as { risk_level: "low"|"medium"|"high", summary: string, action_required: boolean }
Why one retry: Two retries rarely help — if the model failed twice on the same schema, the schema description in the prompt is the problem, not the model.
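If retries keep failing, fix the prompt, not the loop. One way to make the schema explicit is to serialize it into the system prompt. Here is a minimal sketch, assuming the zod-to-json-schema package (an assumption; it is not used elsewhere in this article):
import { zodToJsonSchema } from "zod-to-json-schema";
function schemaSystemPrompt<T>(schema: z.ZodType<T>): string {
  // Embed the JSON Schema so a validation failure maps to a contract the model actually saw
  return `Respond with a single JSON object matching this JSON Schema:\n${JSON.stringify(zodToJsonSchema(schema))}`;
}
// Prepend before calling callWithSchema
const result = await callWithSchema(
  [
    { role: "system", content: schemaSystemPrompt(AnalysisResult) },
    { role: "user", content: "Analyze this contract for risk..." },
  ],
  AnalysisResult
);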
Pattern 2: PII Redaction Pipeline
Strip sensitive data from outputs before they reach users, logs, or downstream systems.
type RedactionRule = { pattern: RegExp; label: string };
const DEFAULT_RULES: RedactionRule[] = [
{ pattern: /\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}\b/gi, label: "EMAIL" },
{ pattern: /\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b/g, label: "PHONE" },
{ pattern: /\b\d{3}-\d{2}-\d{4}\b/g, label: "SSN" },
{
pattern: /\b\d{4}[\s-]?\d{4}[\s-]?\d{4}[\s-]?\d{4}\b/g,
label: "CARD",
},
{
pattern: /\b(?:\d{1,3}\.){3}\d{1,3}\b/g,
label: "IP",
},
];
function redactPII(
text: string,
rules: RedactionRule[] = DEFAULT_RULES
): { redacted: string; hits: string[] } {
const hits: string[] = [];
let redacted = text;
for (const rule of rules) {
redacted = redacted.replace(rule.pattern, (match) => {
hits.push(`${rule.label}: ${match.slice(0, 4)}***`);
return `[REDACTED-${rule.label}]`;
});
}
return { redacted, hits };
}
// Usage — always redact before logging or returning to client
const raw = await getAgentOutput();
const { redacted, hits } = redactPII(raw);
if (hits.length > 0) {
console.warn("PII redacted from agent output:", hits);
}
return redacted; // safe to surface
Note: Regex catches structured PII (emails, SSNs, card numbers). Named entities (person names, addresses) require a NER model — regex alone is not sufficient for GDPR Article 17 compliance.
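The rule list is meant to be extended. For example, a rule for AWS access key IDs; the AKIA prefix plus 16 uppercase alphanumerics is the documented key format, but treat the snippet itself as a sketch:
const CUSTOM_RULES: RedactionRule[] = [
  ...DEFAULT_RULES,
  // AWS access key IDs: "AKIA" followed by 16 uppercase letters or digits
  { pattern: /\bAKIA[0-9A-Z]{16}\b/g, label: "AWS_KEY" },
];
const { redacted, hits } = redactPII(agentOutput, CUSTOM_RULES);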
Pattern 3: Content Policy Filter
Block harmful, off-topic, or policy-violating outputs before they reach users. Layered: fast keyword check first, moderation API second.
const BLOCKLIST = [
/\b(how to (make|build|create) (bomb|weapon|malware))\b/i,
/\b(suicide|self.harm) (method|instruction|guide)\b/i,
];
type PolicyVerdict = "pass" | "block" | "review";
async function checkContentPolicy(text: string): Promise<{
verdict: PolicyVerdict;
reason?: string;
}> {
// Layer 1: fast blocklist (< 1ms)
for (const pattern of BLOCKLIST) {
if (pattern.test(text)) {
return { verdict: "block", reason: "blocklist_match" };
}
}
// Layer 2: moderation API (only if layer 1 passes)
  const moderation = await client.moderations.create({ input: text });
const result = moderation.results[0];
if (result.flagged) {
const categories = Object.entries(result.categories)
.filter(([, flagged]) => flagged)
.map(([cat]) => cat);
return { verdict: "block", reason: categories.join(",") };
}
  // Layer 3: not flagged, but a category score is borderline-high → human review queue
const maxScore = Math.max(...Object.values(result.category_scores));
if (maxScore > 0.5) {
return { verdict: "review", reason: `high_score:${maxScore.toFixed(2)}` };
}
return { verdict: "pass" };
}
// Usage
const output = await agent.run(userMessage);
const policy = await checkContentPolicy(output);
if (policy.verdict === "block") {
auditLog.write({ event: "output_blocked", reason: policy.reason });
return SAFE_FALLBACK_MESSAGE;
}
if (policy.verdict === "review") {
humanReviewQueue.push({ output, reason: policy.reason, userId });
return "Your request is being reviewed. We'll follow up shortly.";
}
return output;
Pattern 4: JSON Repair and Fallback
LLMs frequently return near-valid JSON wrapped in markdown fences, with trailing commas, or with commentary appended. Repair before throwing.
function extractAndRepairJSON(raw: string): unknown {
// Strip markdown code fences
let text = raw.replace(/^```(?:json)?\n?/m, "").replace(/\n?```$/m, "").trim();
// Try direct parse first
try {
return JSON.parse(text);
} catch {
// Remove trailing commas before } or ]
text = text.replace(/,(\s*[}\]])/g, "$1");
    // Strip single-line comments (caveat: this can mangle string values that
    // contain "//", e.g. URLs; acceptable only as a last-ditch repair)
    text = text.replace(/\/\/[^\n]*/g, "");
try {
return JSON.parse(text);
} catch {
// Find the outermost JSON object or array
const objMatch = text.match(/\{[\s\S]*\}/);
const arrMatch = text.match(/\[[\s\S]*\]/);
const match = objMatch ?? arrMatch;
if (match) {
return JSON.parse(match[0]);
}
throw new Error(`Cannot repair JSON: ${raw.slice(0, 100)}`);
}
}
}
// Usage — wrap any structured output call
const raw = response.choices[0].message.content ?? "";
const parsed = extractAndRepairJSON(raw);
const validated = MySchema.parse(parsed); // then validate with Zod
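A quick sanity check that each repair path behaves as described (illustrative inputs, not a test suite):
// Path 1: markdown fences stripped, then direct parse
extractAndRepairJSON('```json\n{"ok": true}\n```'); // → { ok: true }
// Path 2: trailing comma removed before the second parse
extractAndRepairJSON('{"ok": true,}'); // → { ok: true }
// Path 3: commentary around the object discarded via the outermost-match fallback
extractAndRepairJSON('Sure! Here you go: {"ok": true} Let me know.'); // → { ok: true }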
Pattern 5: Confidence Threshold Guardrail
Route low-confidence outputs to human review instead of surfacing them directly.
const ConfidentOutput = z.object({
answer: z.string(),
confidence: z.number().min(0).max(1),
sources: z.array(z.string()).optional(),
});
type ReviewableOutput =
| { type: "direct"; answer: string }
| { type: "pending_review"; reviewId: string };
async function callWithConfidenceGate(
query: string,
threshold = 0.75
): Promise<ReviewableOutput> {
const result = await callWithSchema(
[
{
role: "system",
content:
"Answer the query. Include a confidence score from 0 to 1. Be conservative — if unsure, score below 0.75.",
},
{ role: "user", content: query },
],
ConfidentOutput
);
if (result.confidence >= threshold) {
return { type: "direct", answer: result.answer };
}
// Low confidence — queue for human review
const reviewId = await humanReviewQueue.push({
query,
draft: result.answer,
confidence: result.confidence,
});
return { type: "pending_review", reviewId };
}
// Usage
const output = await callWithConfidenceGate(userQuery, 0.8);
if (output.type === "direct") {
return output.answer;
} else {
return `We're verifying this answer. Check back using ID: ${output.reviewId}`;
}
When to use: High-stakes domains — healthcare information, legal interpretation, financial advice, or any output that drives an irreversible decision.
Pattern 6: Token Cost Circuit Breaker
Stop runaway agent loops before they generate unexpected API charges.
class CostCircuitBreaker {
private totalTokens = 0;
private readonly limit: number;
private readonly onBudgetExceeded: (tokens: number) => void;
constructor(
tokenLimit: number,
onBudgetExceeded: (tokens: number) => void
) {
this.limit = tokenLimit;
this.onBudgetExceeded = onBudgetExceeded;
}
record(usage: { prompt_tokens: number; completion_tokens: number }): void {
this.totalTokens += usage.prompt_tokens + usage.completion_tokens;
if (this.totalTokens > this.limit) {
this.onBudgetExceeded(this.totalTokens);
throw new Error(
`Token budget exceeded: ${this.totalTokens} > ${this.limit}`
);
}
}
get used(): number {
return this.totalTokens;
}
}
// Usage — wrap your agent loop
const breaker = new CostCircuitBreaker(
  50_000, // per-session token budget; convert to dollars using your model's current pricing
(tokens) => {
alertOps(`Agent token budget exceeded: ${tokens} tokens used`);
auditLog.write({ event: "budget_exceeded", tokens, sessionId });
}
);
while (agent.hasMoreSteps()) {
const response = await client.chat.completions.create({ /* ... */ });
breaker.record(response.usage!); // throws if over budget
await agent.processResponse(response);
}
Set the limit per session, not per request. A single tool call may be cheap, but a stuck loop calling tools 200 times is not.
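A minimal per-session wiring sketch; the sessions map is an assumption about your surrounding code, and alertOps is the same helper used above:
const sessions = new Map<string, CostCircuitBreaker>();
function breakerFor(sessionId: string): CostCircuitBreaker {
  // One breaker per session, so a single stuck loop cannot drain a shared budget
  let breaker = sessions.get(sessionId);
  if (!breaker) {
    breaker = new CostCircuitBreaker(50_000, (tokens) =>
      alertOps(`Session ${sessionId} exceeded token budget: ${tokens}`)
    );
    sessions.set(sessionId, breaker);
  }
  return breaker;
}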
Combining Patterns
Wire them in a fixed order — track cost first, then check policy, then redact, then surface:
async function safeAgentOutput(
messages: OpenAI.Chat.ChatCompletionMessageParam[]
): Promise<string> {
// 1. Get raw output (with cost tracking)
const response = await client.chat.completions.create({ model: "gpt-4o", messages });
breaker.record(response.usage!);
const raw = response.choices[0].message.content ?? "";
// 2. Content policy
const policy = await checkContentPolicy(raw);
if (policy.verdict === "block") return SAFE_FALLBACK_MESSAGE;
// 3. PII redaction
const { redacted } = redactPII(raw);
// 4. Return clean output
return redacted;
}
For structured outputs, replace step 4 with Zod schema validation (Pattern 1) after redaction.
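Sketched out under the same assumptions, that structured variant chains Patterns 4, 2, and 1 in order:
function safeStructuredOutput<T>(raw: string, schema: z.ZodType<T>): T {
  const repaired = extractAndRepairJSON(raw); // Pattern 4: repair near-valid JSON
  // Pattern 2: redact the serialized form (caveat: a redaction landing inside a
  // numeric field invalidates the JSON, which the final parse surfaces as an error)
  const { redacted } = redactPII(JSON.stringify(repaired));
  return schema.parse(JSON.parse(redacted)); // Pattern 1: enforce the schema
}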
Related reading
- TypeScript AI agent authorization patterns 2026 — control which tools the agent can call
- TypeScript AI agent security audit checklist 2026 — audit trail and logging patterns
- TypeScript AI agent security incident response playbook — what to do when something goes wrong
References
- OpenAI — OpenAI Agents SDK TypeScript
- Zod — Zod documentation
- OpenAI — Moderation API
