JSON & AI/LLM — Structured Output, Function Calling & Tool Use

AI models generate text, but applications need structured data. This guide covers how to get reliable, validated JSON from LLMs — using structured output modes, function calling schemas, prompt engineering, and validation patterns that prevent hallucinated fields from reaching production.

JSON in the AI Stack

LLM → JSON → Application

1. OpenAI Structured Output

JSON Mode (Basic)

Force valid JSON output (TypeScript)
import OpenAI from 'openai';

// The client reads OPENAI_API_KEY from the environment by default
const openai = new OpenAI();

const response = await openai.chat.completions.create({
  model: 'gpt-4o',
  response_format: { type: 'json_object' },
  messages: [
    {
      role: 'system',
      content: 'Extract product details as JSON with keys: name, price, category, inStock',
    },
    {
      role: 'user',
      content: 'The Blue Widget costs $29.99 and is currently available in the Electronics section.',
    },
  ],
});

const product = JSON.parse(response.choices[0].message.content!);
// e.g. { "name": "Blue Widget", "price": 29.99, "category": "Electronics", "inStock": true }

Warning

JSON mode guarantees valid JSON but not the correct shape. The model might add extra fields, use different names, or change types. For shape guarantees, use Structured Outputs.
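
Because JSON mode only guarantees syntax, a shape check after JSON.parse is still useful. The following is a minimal sketch of a hand-written type guard for the product example; the Product interface and the isProduct name are illustrative assumptions, not part of any SDK.

```typescript
// Hypothetical shape for the extraction example; adjust to your own keys.
interface Product {
  name: string;
  price: number;
  category: string;
  inStock: boolean;
}

// Type guard: true only if every expected key has the expected type
// and no unexpected keys are present.
function isProduct(value: unknown): value is Product {
  if (typeof value !== 'object' || value === null || Array.isArray(value)) {
    return false;
  }
  const obj = value as Record<string, unknown>;
  const expectedKeys = ['name', 'price', 'category', 'inStock'];
  const hasOnlyExpectedKeys = Object.keys(obj).every(
    (key) => expectedKeys.indexOf(key) !== -1,
  );
  const { name, price, category, inStock } = obj;
  return (
    hasOnlyExpectedKeys &&
    typeof name === 'string' &&
    typeof price === 'number' &&
    isFinite(price) &&
    typeof category === 'string' &&
    typeof inStock === 'boolean'
  );
}
```

A guard like this rejects renamed keys (product_name), string prices ("29.99"), and extra fields, which are the three failure modes the warning describes.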

Structured Outputs (Schema-Enforced)

Schema-constrained JSON output (TypeScript)
const response = await openai.chat.completions.create({
  model: 'gpt-4o',
  response_format: {
    type: 'json_schema',
    json_schema: {
      name: 'product_extraction',
      strict: true,
      schema: {
        type: 'object',
        properties: {
          name: { type: 'string' },
          price: { type: 'number' },
          category: { type: 'string', enum: ['Electronics', 'Clothing', 'Home', 'Other'] },
          inStock: { type: 'boolean' },
        },
        required: ['name', 'price', 'category', 'inStock'],
        additionalProperties: false,
      },
    },
  },
  messages: [
    { role: 'user', content: 'The Blue Widget costs $29.99, available in Electronics.' },
  ],
});

// With strict: true, a completed response matches the schema.
// A refusal or a response cut off by the token limit is the exception.
const product = JSON.parse(response.choices[0].message.content!);
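
Even with a strict schema, the response can be a refusal or can be cut off at the token limit, so it helps to check for both before parsing. The sketch below assumes the Chat Completions choice shape (finish_reason, message.content, message.refusal); the ChoiceLike type and the readStructuredChoice helper are hypothetical names declared locally so the example stays self-contained.

```typescript
// Minimal structural type covering only the fields this sketch reads.
interface ChoiceLike {
  finish_reason: string | null;
  message: { content: string | null; refusal?: string | null };
}

type Extraction<T> =
  | { ok: true; value: T }
  | { ok: false; reason: 'refusal' | 'truncated' | 'empty'; detail?: string };

function readStructuredChoice<T>(choice: ChoiceLike): Extraction<T> {
  if (choice.message.refusal) {
    // The model declined to answer; there is no JSON to parse.
    return { ok: false, reason: 'refusal', detail: choice.message.refusal };
  }
  if (choice.finish_reason === 'length') {
    // Output hit the token limit, so the JSON is probably cut off.
    return { ok: false, reason: 'truncated' };
  }
  if (!choice.message.content) {
    return { ok: false, reason: 'empty' };
  }
  // The cast is unchecked; pair it with runtime validation if the
  // schema was not enforced by the provider.
  return { ok: true, value: JSON.parse(choice.message.content) as T };
}
```

Usage would look like readStructuredChoice<Product>(response.choices[0]), followed by a branch on the ok flag.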

2. Function Calling / Tool Use

Function calling lets the model decide when to call a function and what arguments to pass — all as JSON:

Defining tools with JSON Schema (TypeScript)
const tools = [
  {
    type: 'function' as const,
    function: {
      name: 'search_products',
      description: 'Search the product catalog by query and optional filters',
      parameters: {
        type: 'object',
        properties: {
          query: { type: 'string', description: 'Search query text' },
          category: {
            type: 'string',
            enum: ['electronics', 'clothing', 'home', 'all'],
            description: 'Product category filter',
          },
          maxPrice: {
            type: 'number',
            description: 'Maximum price in dollars',
          },
          inStockOnly: {
            type: 'boolean',
            description: 'Only return in-stock items',
          },
        },
        required: ['query'],
      },
    },
  },
];

const response = await openai.chat.completions.create({
  model: 'gpt-4o',
  tools,
  messages: [{ role: 'user', content: 'Find me blue shoes under $100' }],
});

// The tool call's `function` field looks like this
// (note that `arguments` is a JSON string, not an object):
// {
//   "name": "search_products",
//   "arguments": "{\"query\":\"blue shoes\",\"category\":\"clothing\",\"maxPrice\":100}"
// }

const call = response.choices[0].message.tool_calls?.[0];
if (call) {
  const args = JSON.parse(call.function.arguments);
  // searchProducts is your own catalog lookup (not shown here)
  const results = await searchProducts(args);
  // Feed results back to the model for a natural language response
}

| Provider | Feature Name | Schema Format | Strict Mode |
|---|---|---|---|
| OpenAI | Function calling / Tools | JSON Schema | Yes (strict: true) |
| Anthropic | Tool use | JSON Schema | Not enforced by default; validate tool inputs |
| Google Gemini | Function calling | OpenAPI-style | Partial |
| Open-source (Ollama) | Varies by model | JSON Schema (if supported) | No guarantees |
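
Feeding the tool result back to the model means appending two messages to the conversation: the assistant message that contained the tool call, and a tool message carrying the result under the same call id. The sketch below shows that message construction for the OpenAI-style format; the ToolCallLike and ChatMessage types and the appendToolResult helper are illustrative assumptions rather than SDK exports.

```typescript
// Local structural types mirroring the OpenAI-style chat message format.
interface ToolCallLike {
  id: string;
  type: 'function';
  function: { name: string; arguments: string };
}

type ChatMessage =
  | { role: 'system' | 'user'; content: string }
  | { role: 'assistant'; content: string | null; tool_calls?: ToolCallLike[] }
  | { role: 'tool'; tool_call_id: string; content: string };

// Returns a new history with the assistant's tool call and its result appended.
function appendToolResult(
  history: ChatMessage[],
  call: ToolCallLike,
  result: unknown,
): ChatMessage[] {
  return [
    ...history,
    // Echo the assistant turn that requested the call.
    { role: 'assistant', content: null, tool_calls: [call] },
    // Tool output is sent as a string, so the result is serialized as JSON.
    { role: 'tool', tool_call_id: call.id, content: JSON.stringify(result) },
  ];
}
```

The returned array can then be passed as messages in a second request, at which point the model typically answers in natural language using the tool output.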

3. Prompt Engineering for JSON

System Prompt Pattern

Effective system prompt for JSON output
You are a data extraction assistant. Extract information from user text
and return it as a JSON object matching this exact structure:

{
  "name": "string (product name)",
  "price": number (in dollars, e.g. 29.99),
  "features": ["string array of key features"],
  "rating": number (1-5, or null if not mentioned),
  "available": boolean
}

Rules:
- Return ONLY the JSON object, no markdown fences, no explanation
- Use null for fields not found in the text
- Price must be a number, not a string
- Features array can be empty if none mentioned

Few-Shot Examples

const messages = [
  { role: 'system', content: 'Extract event details as JSON.' },
  // Example 1
  { role: 'user', content: 'React Conf on May 15 at 2pm in Las Vegas' },
  { role: 'assistant', content: JSON.stringify({
    event: 'React Conf', date: '2026-05-15', time: '14:00', location: 'Las Vegas',
  })},
  // Example 2
  { role: 'user', content: 'Team standup tomorrow morning' },
  { role: 'assistant', content: JSON.stringify({
    event: 'Team standup', date: 'tomorrow', time: 'morning', location: null,
  })},
  // Actual request
  { role: 'user', content: userInput },
];
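
When the example set grows, building the message array by hand gets error-prone. The sketch below generates the same alternating user/assistant layout from a list of examples; FewShotExample, Message, and buildFewShotMessages are hypothetical names introduced for illustration.

```typescript
interface FewShotExample {
  input: string;
  output: Record<string, unknown>;
}

type Message = { role: 'system' | 'user' | 'assistant'; content: string };

// Builds: system prompt, then one user/assistant pair per example,
// then the real user input as the final message.
function buildFewShotMessages(
  systemPrompt: string,
  examples: FewShotExample[],
  userInput: string,
): Message[] {
  const messages: Message[] = [{ role: 'system', content: systemPrompt }];
  for (const example of examples) {
    messages.push({ role: 'user', content: example.input });
    // Serializing with JSON.stringify keeps every example in one consistent format.
    messages.push({ role: 'assistant', content: JSON.stringify(example.output) });
  }
  messages.push({ role: 'user', content: userInput });
  return messages;
}
```

Keeping the examples as data also makes it easy to check that every example uses the same keys before the prompt is sent.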

4. Validating LLM JSON Output

Robust LLM output parsing with Zod (TypeScript)
import { z } from 'zod';

const ProductSchema = z.object({
  name: z.string().min(1),
  price: z.number().positive(),
  category: z.enum(['Electronics', 'Clothing', 'Home', 'Other']),
  inStock: z.boolean(),
  features: z.array(z.string()).default([]),
});

type Product = z.infer<typeof ProductSchema>;

function parseLlmJson<T>(raw: string, schema: z.ZodSchema<T>): T {
  // Strip markdown fences if model wraps in ```json ... ```
  let cleaned = raw.trim();
  if (cleaned.startsWith('```')) {
    cleaned = cleaned.replace(/^```(?:json)?\n?/, '').replace(/\n?```$/, '');
  }

  // JSON.parse throws SyntaxError on malformed input;
  // schema.parse throws ZodError on a shape mismatch
  const parsed = JSON.parse(cleaned);
  return schema.parse(parsed);
}

// llmOutput is the raw text returned by the model
try {
  const product = parseLlmJson(llmOutput, ProductSchema);
  console.log(product.name, product.price);
} catch (error) {
  if (error instanceof z.ZodError) {
    console.error('Schema validation failed:', error.issues);
    // Retry with a more explicit prompt
  } else {
    console.error('Invalid JSON from LLM');
    // Attempt repair or retry
  }
}
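
The "retry with a more explicit prompt" step can be sketched as a small loop that feeds the validation error back to the model. This is one possible approach under stated assumptions: callModel stands in for whatever function sends a prompt and returns raw text, and Validator is a hypothetical wrapper type (a Zod schema's safeParse could be adapted to it).

```typescript
type Validator<T> = (
  value: unknown,
) => { ok: true; value: T } | { ok: false; error: string };

// Calls the model up to maxAttempts times, appending the last
// failure reason to the prompt on each retry.
async function generateValidated<T>(
  callModel: (prompt: string) => Promise<string>,
  prompt: string,
  validate: Validator<T>,
  maxAttempts = 3,
): Promise<T> {
  let currentPrompt = prompt;
  let lastError = 'no attempts made';
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const raw = await callModel(currentPrompt);
    try {
      const result = validate(JSON.parse(raw));
      if (result.ok) return result.value;
      lastError = result.error;
    } catch (err) {
      // JSON.parse failed: the reply was not valid JSON at all.
      lastError = `invalid JSON: ${(err as Error).message}`;
    }
    currentPrompt =
      `${prompt}\n\nYour previous reply was rejected (${lastError}). ` +
      'Return only the corrected JSON object.';
  }
  throw new Error(`No valid output after ${maxAttempts} attempts: ${lastError}`);
}
```

Capping the number of attempts matters here: a model that keeps producing the same invalid output would otherwise loop indefinitely and keep consuming tokens.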

5. Common LLM JSON Issues

| Issue | Example | Prevention |
|---|---|---|
| Markdown wrapping | ```json\n{...}\n``` | Strip fences before parsing |
| Trailing text | {"name":"Widget"} Hope that helps! | Extract first { ... } match |
| Hallucinated fields | {"name":"Widget","color":"blue"} (no color asked) | Use strict schema + additionalProperties: false |
| Type mismatch | "price": "$29.99" (string, not number) | Enforce types in the schema; strip currency symbols before coercing, since z.coerce.number() rejects "$29.99" |
| Incomplete JSON | {"name":"Widget","pr | Set max_tokens high enough; use a partial parser |
| Nested markdown | "description": "A **bold** widget" | Post-process or specify plain text in prompt |
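
The "extract first { ... } match" fix is easy to get wrong with a regular expression, because braces can appear inside string values. The sketch below scans for the first balanced object while tracking string and escape state; extractFirstJsonObject is a hypothetical helper name, and the approach assumes the payload is a JSON object rather than a top-level array.

```typescript
// Returns the first balanced {...} substring, or null if none is complete.
function extractFirstJsonObject(text: string): string | null {
  const start = text.indexOf('{');
  if (start === -1) return null;

  let depth = 0;
  let inString = false;
  let escaped = false;

  for (let i = start; i < text.length; i++) {
    const ch = text[i];
    if (inString) {
      // Inside a string, braces are data; only track escapes and the closing quote.
      if (escaped) escaped = false;
      else if (ch === '\\') escaped = true;
      else if (ch === '"') inString = false;
      continue;
    }
    if (ch === '"') inString = true;
    else if (ch === '{') depth++;
    else if (ch === '}') {
      depth--;
      if (depth === 0) return text.slice(start, i + 1);
    }
  }
  // Braces never balanced: the output was probably truncated.
  return null;
}
```

This covers the markdown-wrapping, trailing-text, and incomplete-JSON rows in one pass. It can still be misled by a stray { in prose that precedes the JSON, so the result should go through JSON.parse and schema validation as usual.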

Frequently Asked Questions

What is OpenAI JSON mode?
JSON mode (response_format: {type: "json_object"}) forces the model to return valid JSON. However, it does not guarantee the shape matches your expected schema. For shape guarantees, use Structured Outputs (response_format: {type: "json_schema", json_schema: {...}}), which constrains output to match an exact JSON Schema.

What is function calling in LLMs?
Function calling (also called "tool use") lets you define functions with JSON Schema parameters that the model can "call" by returning a JSON object with the function name and arguments. You then execute the function in your code and feed the result back. It is how AI agents interact with external APIs, databases, and tools.

How do I handle invalid JSON from an LLM?
Use structured output modes when available (OpenAI, Anthropic). For models without this, wrap JSON.parse in try/catch and attempt repair: strip markdown fences, fix trailing commas, close unclosed brackets. Libraries like json-repair and dirty-json can fix common LLM JSON errors automatically.

Can I stream JSON from an LLM?
Yes, but the JSON arrives token by token and is incomplete until the stream finishes. Options: (1) wait for the full response, then parse; (2) use a partial JSON parser that renders fields as they appear; (3) ask the model to output JSON Lines (one object per line) so each line is independently parseable.

How is JSON Schema used in function calling?
Each function/tool you define includes a JSON Schema for its parameters. The model uses this schema to generate the correct argument structure. For example, a "search_products" function might have parameters: {type: "object", properties: {query: {type: "string"}, category: {type: "string", enum: [...]}}}. The model then generates arguments such as {"query": "blue shoes", "category": "clothing"}, where the category value must come from the enum.
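
For the streaming question above, the JSON Lines option can be sketched as a small buffer that accepts arbitrary text chunks and emits an object each time a line completes. JsonLinesBuffer is a hypothetical class written for illustration; it assumes the model really does emit one complete JSON value per line.

```typescript
class JsonLinesBuffer {
  private pending = '';

  // Accepts the next text chunk and returns every value completed by it.
  push(chunk: string): unknown[] {
    this.pending += chunk;
    const lines = this.pending.split('\n');
    // The last element is an unfinished line ('' if the chunk ended on a newline).
    this.pending = lines.pop() ?? '';
    const parsed: unknown[] = [];
    for (const line of lines) {
      const trimmed = line.trim();
      if (trimmed !== '') parsed.push(JSON.parse(trimmed));
    }
    return parsed;
  }

  // Call once the stream ends to parse a final line that has no trailing newline.
  flush(): unknown[] {
    const rest = this.pending.trim();
    this.pending = '';
    return rest === '' ? [] : [JSON.parse(rest)];
  }
}
```

Each streamed delta is passed to push, and flush is called when the stream closes. JSON.parse still throws on a malformed line, so a production version would likely wrap each parse in try/catch and decide whether to skip or abort.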