Output Parsers
Parse and validate LLM outputs into structured data with JSON, Zod schemas, lists, XML, CSV, and auto-fixing.
Why Output Parsers?
LLMs return unstructured text. Output parsers extract structured data, validate formats, and handle errors, making LLM outputs reliable for downstream processing.
# JSONParser
Extract and parse JSON from LLM responses, even when wrapped in markdown code blocks or mixed with text.
    import { JSONParser } from 'orkajs/parsers/json';

    const parser = new JSONParser({ strict: false });

    // LLM response with JSON in markdown
    const llmOutput = `Here's the data you requested:

    \`\`\`json
    {
      "name": "Alice",
      "age": 30,
      "skills": ["TypeScript", "Python"]
    }
    \`\`\``;

    const data = parser.parse(llmOutput);
    console.log(data);
    // { name: 'Alice', age: 30, skills: ['TypeScript', 'Python'] }

    // Get format instructions for the LLM
    const instructions = parser.getFormatInstructions();
    console.log(instructions);

🎯 Smart Extraction
JSONParser automatically extracts JSON from markdown code blocks, handles both objects and arrays, and provides clear error messages when parsing fails.
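To make the extraction step concrete, here is a minimal sketch of how JSON could be pulled out of a mixed response. It is an illustration under stated assumptions, not the OrkaJS implementation: the function name `extractJson` and its error messages are hypothetical, and the real parser may use a different strategy.

```typescript
// Illustrative sketch only: not OrkaJS source code.
// Pulls the first JSON object or array out of free-form LLM text.
const FENCE = '`'.repeat(3); // Built at runtime so this sample contains no literal fence.

function extractJson(text: string): unknown {
  // Prefer the contents of a fenced code block, with or without a "json" tag.
  const fencePattern = new RegExp(FENCE + '(?:json)?\\s*([\\s\\S]*?)' + FENCE, 'i');
  const fenced = fencePattern.exec(text);
  const candidate = fenced?.[1] ?? text;

  // Locate the start of the JSON value: the first "{" or "[".
  const start = candidate.search(/[{[]/);
  if (start === -1) {
    throw new Error('No JSON object or array found in the output');
  }

  // Use the last occurrence of the matching closing bracket as the end.
  const closer = candidate.charAt(start) === '{' ? '}' : ']';
  const end = candidate.lastIndexOf(closer);
  if (end < start) {
    throw new Error('JSON value is not terminated');
  }

  try {
    return JSON.parse(candidate.slice(start, end + 1));
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    throw new Error(`Invalid JSON in LLM output: ${reason}`);
  }
}
```

The sketch handles both objects and arrays and reports a descriptive error on failure, which mirrors the behavior described above without claiming to match it exactly.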
# StructuredOutputParser
Parse and validate LLM outputs against a Zod schema for type-safe structured data.
📦 Installation Required
StructuredOutputParser requires Zod for schema validation:
    npm install zod

Then define a schema and build the parser from it:

    import { StructuredOutputParser } from 'orkajs/parsers/structured';
    import { z } from 'zod';

    // Define schema
    const schema = z.object({
      name: z.string().describe('Person name'),
      age: z.number().describe('Age in years'),
      email: z.string().email().describe('Email address'),
      skills: z.array(z.string()).describe('List of skills')
    });

    // Create parser
    const parser = StructuredOutputParser.fromZodSchema(schema);

    // Get format instructions to send to LLM
    const instructions = parser.getFormatInstructions();
    const prompt = `Extract person info from this text: "Alice is 30 years old..."

    ${instructions}`;

    // `llm` is assumed to be an already configured LLM client
    const llmResponse = await llm.generate(prompt);

    // Parse and validate
    try {
      const data = parser.parse(llmResponse.content);
      console.log(data);
      // { name: 'Alice', age: 30, email: 'alice@example.com', skills: [...] }
      // ✅ Type-safe and validated
    } catch (error) {
      // `error` is `unknown` in strict TypeScript, so narrow it before reading `.message`
      console.error('Validation failed:', error instanceof Error ? error.message : error);
    }

❌ Without Validation
    const data = JSON.parse(response);
    // No type safety
    // No validation
    // Runtime errors

✅ With StructuredOutputParser
    const data = parser.parse(response);
    // ✅ Type-safe
    // ✅ Validated
    // ✅ Clear errors

# ListParser
Parse lists from LLM outputs, automatically handling bullet points, numbers, and custom separators.
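As a rough illustration of the cleaning involved, the sketch below strips bullet and number markers and drops lines that are not list items. The function `parseList`, its `ListOptions` shape, and the rule "keep only marked lines when any line has a marker" are assumptions made for this example, not the library's documented behavior.

```typescript
// Hypothetical options shape, modeled on the separator and trim options.
interface ListOptions {
  separator?: string;
  trim?: boolean;
}

function parseList(text: string, options: ListOptions = {}): string[] {
  const { separator = '\n', trim = true } = options;
  // Matches a leading bullet ("-", "*", "•") or a number such as "1." or "2)".
  const marker = /^\s*(?:[-*•]|\d+[.)])\s+/;

  // Split and discard pieces that contain only whitespace.
  const pieces = text.split(separator).filter((piece) => piece.trim().length > 0);

  // If any piece carries a list marker, treat unmarked pieces (such as an
  // introductory sentence) as noise. Otherwise keep every piece.
  const marked = pieces.filter((piece) => marker.test(piece));
  const items = marked.length > 0 ? marked : pieces;

  return items
    .map((item) => item.replace(marker, ''))
    .map((item) => (trim ? item.trim() : item));
}
```

With this approach, a comma-separated string and a bulleted list go through the same code path; only the separator differs.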
    import { ListParser } from 'orkajs/parsers/list';

    const parser = new ListParser({
      separator: '\n', // Split by newline (default)
      trim: true       // Remove whitespace
    });

    const llmOutput = `Here are the top programming languages:
    - TypeScript
    - Python
    - Go
    - Rust`;

    const items = parser.parse(llmOutput);
    console.log(items);
    // ['TypeScript', 'Python', 'Go', 'Rust']

    // Works with numbered lists too
    const numbered = `1. First item
    2. Second item
    3. Third item`;

    const items2 = parser.parse(numbered);
    // ['First item', 'Second item', 'Third item']

    // Custom separator
    const csvParser = new ListParser({ separator: ',' });
    const csv = 'apple, banana, orange';
    console.log(csvParser.parse(csv));
    // ['apple', 'banana', 'orange']

# AutoFixParser
Wraps any parser and automatically retries with LLM correction when parsing fails.
    import { AutoFixParser } from 'orkajs/parsers/auto-fix';
    import { StructuredOutputParser } from 'orkajs/parsers/structured';
    import { z } from 'zod';

    const schema = z.object({
      name: z.string(),
      age: z.number()
    });

    const baseParser = StructuredOutputParser.fromZodSchema(schema);

    // `orka` is assumed to be an existing instance created with createOrka()
    const autoFixParser = new AutoFixParser({
      parser: baseParser,
      maxRetries: 3,
      llm: orka.getLLM()
    });

    // Malformed LLM output: "age" is a string, but the schema expects a number
    const badOutput = `{
      "name": "Alice",
      "age": "thirty"
    }`;

    // Try to parse with auto-fix
    try {
      const data = await autoFixParser.parseWithRetry(badOutput);
      console.log(data);
      // { name: 'Alice', age: 30 } ✅ Fixed automatically
    } catch (error) {
      console.error('Failed after retries:', error);
    }

🔄 How Auto-Fix Works
- Attempts to parse with base parser
- If parsing fails, sends error + original output to LLM
- LLM corrects the format
- Retries parsing with corrected output
- Repeats up to maxRetries times
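The steps above can be sketched as a small retry loop. This is a simplified model rather than the AutoFixParser source: the `Parser` interface, the `FixFn` callback that stands in for the LLM call, and the name `parseWithRetrySketch` are all hypothetical.

```typescript
// Hypothetical shapes for illustration; the real OrkaJS types may differ.
interface Parser<T> {
  parse(text: string): T;
}

// Stands in for the LLM call that receives the bad output and the error.
type FixFn = (badOutput: string, errorMessage: string) => Promise<string>;

async function parseWithRetrySketch<T>(
  parser: Parser<T>,
  fix: FixFn,
  output: string,
  maxRetries = 3,
): Promise<T> {
  let current = output;
  let lastError: unknown;

  // One initial attempt plus up to maxRetries corrected attempts.
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return parser.parse(current);
    } catch (err) {
      lastError = err;
      if (attempt === maxRetries) break;
      const message = err instanceof Error ? err.message : String(err);
      // Ask the model to repair the output, given the parser's complaint.
      current = await fix(current, message);
    }
  }

  throw new Error(`Parsing failed after ${maxRetries} retries: ${String(lastError)}`);
}
```

Each failed attempt costs one extra model call, so the worst case for a single input is `maxRetries` additional calls before the error is surfaced.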
# XMLParser
Parse XML-tagged outputs from LLMs. Useful when you need multiple named fields without JSON formatting, since some LLMs handle XML tags more naturally than JSON.
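For intuition, tag extraction of this kind can be approximated with a single regular expression. The sketch below is an assumption-laden illustration, not the XMLParser implementation: `extractTags` and its `requiredTags` parameter are hypothetical names, and it only handles flat, non-nested tags without attributes.

```typescript
// Illustrative sketch only: handles flat tags such as <summary>...</summary>.
function extractTags(text: string, requiredTags: string[] = []): Record<string, string> {
  const found: Record<string, string> = {};
  // The backreference \1 forces the closing tag to repeat the opening tag's name.
  const tagPattern = /<([A-Za-z_][\w-]*)>([\s\S]*?)<\/\1>/g;

  let match: RegExpExecArray | null;
  while ((match = tagPattern.exec(text)) !== null) {
    const name = match[1];
    const value = match[2];
    if (name !== undefined && value !== undefined) {
      found[name] = value.trim();
    }
  }

  // Strict-style check: every required tag must have been seen.
  const missing = requiredTags.filter(
    (tag) => !Object.prototype.hasOwnProperty.call(found, tag),
  );
  if (missing.length > 0) {
    throw new Error(`Missing required tags: ${missing.join(', ')}`);
  }
  return found;
}
```

Every value comes back as a string, so a numeric field such as a confidence score still needs an explicit conversion by the caller.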
    import { XMLParser } from 'orkajs/parsers/xml';

    // Basic usage: extract all XML tags
    const parser = new XMLParser();

    const llmOutput = `Here is my analysis:

    <summary>The product has strong market potential</summary>
    <sentiment>positive</sentiment>
    <confidence>0.92</confidence>
    <reasoning>Based on market trends and competitor analysis, the product fills a clear gap.</reasoning>`;

    const data = parser.parse(llmOutput);
    console.log(data);
    // {
    //   summary: 'The product has strong market potential',
    //   sentiment: 'positive',
    //   confidence: '0.92',
    //   reasoning: 'Based on market trends and competitor analysis...'
    // }

    // Strict mode: require specific tags
    const strictParser = new XMLParser({
      tags: ['summary', 'sentiment', 'confidence'],
      strict: true // Throws if any required tag is missing
    });

    const result = strictParser.parse(llmOutput);
    // ✅ Validates that all required tags are present

    // Get format instructions for the LLM
    console.log(strictParser.getFormatInstructions());
    // "Your response must use the following XML tags:
    // <summary>value</summary>
    // <sentiment>value</sentiment>
    // <confidence>value</confidence>"

# CSVParser
Parse CSV-formatted outputs into arrays of objects. Handles quoted fields, custom separators, and optional predefined headers. Ideal for tabular data extraction from LLMs.
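Quoted fields are the part of CSV parsing that a plain `split(',')` gets wrong, so here is a small sketch of one way to handle them. It is not the CSVParser source: `splitCsvLine` and `parseCsv` are hypothetical helpers, the separator is assumed to be a single character, and quoted fields are assumed not to span multiple lines.

```typescript
// Splits one CSV line into fields, honoring double-quoted fields.
// Assumes a single-character separator.
function splitCsvLine(line: string, separator = ','): string[] {
  const fields: string[] = [];
  let current = '';
  let inQuotes = false;

  for (let i = 0; i < line.length; i++) {
    const char = line.charAt(i);
    if (inQuotes) {
      if (char === '"' && line.charAt(i + 1) === '"') {
        current += '"'; // A doubled quote inside a quoted field is a literal quote.
        i++;
      } else if (char === '"') {
        inQuotes = false; // Closing quote.
      } else {
        current += char; // Separators inside quotes are ordinary characters.
      }
    } else if (char === '"') {
      inQuotes = true;
    } else if (char === separator) {
      fields.push(current.trim());
      current = '';
    } else {
      current += char;
    }
  }
  fields.push(current.trim());
  return fields;
}

// Turns CSV text with a header row into an array of string-valued records.
function parseCsv(text: string, separator = ','): Record<string, string>[] {
  const lines = text.split(/\r?\n/).filter((line) => line.trim().length > 0);
  const [headerLine, ...rows] = lines;
  if (headerLine === undefined) return [];

  const headers = splitCsvLine(headerLine, separator);
  return rows.map((row) => {
    const cells = splitCsvLine(row, separator);
    const record: Record<string, string> = {};
    headers.forEach((header, index) => {
      record[header] = cells[index] ?? ''; // Missing cells become empty strings.
    });
    return record;
  });
}
```

This sketch pads short rows with empty strings instead of rejecting them; a strict mode would compare `cells.length` with `headers.length` and throw on a mismatch.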
    import { CSVParser } from 'orkajs/parsers/csv';

    // Auto-detect headers from first row
    const parser = new CSVParser();

    const llmOutput = `name,role,experience
    Alice,Engineer,5 years
    Bob,Designer,3 years
    Charlie,Manager,8 years`;

    const data = parser.parse(llmOutput);
    console.log(data);
    // [
    //   { name: 'Alice', role: 'Engineer', experience: '5 years' },
    //   { name: 'Bob', role: 'Designer', experience: '3 years' },
    //   { name: 'Charlie', role: 'Manager', experience: '8 years' }
    // ]

    // Predefined headers (no header row in data)
    const noHeaderParser = new CSVParser({
      headers: ['product', 'price', 'stock'],
      separator: ';', // Custom separator
      strict: true    // Enforce column count
    });

    // Semicolon-separated rows, matching the custom separator above
    const semicolonData = `iPhone;999;true
    MacBook;1999;false`;

    console.log(noHeaderParser.parse(semicolonData));
    // [
    //   { product: 'iPhone', price: '999', stock: 'true' },
    //   { product: 'MacBook', price: '1999', stock: 'false' }
    // ]

    // Handles quoted fields with commas
    const quotedCSV = `name,description
    "Smith, John","Senior engineer, 10+ years"`;
    console.log(parser.parse(quotedCSV));
    // [{ name: 'Smith, John', description: 'Senior engineer, 10+ years' }]

# CommaSeparatedListParser
A specialized parser for comma-separated lists. Simpler than CSVParser when you just need a flat list of values. Supports deduplication and automatic trimming.
    import { CommaSeparatedListParser } from 'orkajs/parsers/comma-separated-list';

    const parser = new CommaSeparatedListParser({
      trim: true,              // Remove whitespace (default: true)
      removeDuplicates: false  // Keep duplicates (default: false)
    });

    const llmOutput = 'TypeScript, Python, Go, Rust, JavaScript';
    const items = parser.parse(llmOutput);
    console.log(items);
    // ['TypeScript', 'Python', 'Go', 'Rust', 'JavaScript']

    // With deduplication
    const deduper = new CommaSeparatedListParser({ removeDuplicates: true });
    const dupes = 'apple, banana, apple, orange, banana';
    console.log(deduper.parse(dupes));
    // ['apple', 'banana', 'orange']

    // Format instructions for the LLM
    console.log(parser.getFormatInstructions());
    // "Your response must be a comma-separated list of values.
    // Example: item1, item2, item3"

Complete Example
    import { createOrka, OpenAIAdapter } from 'orkajs';
    import { StructuredOutputParser } from 'orkajs/parsers/structured';
    import { AutoFixParser } from 'orkajs/parsers/auto-fix';
    import { z } from 'zod';

    const orka = createOrka({
      llm: new OpenAIAdapter({ apiKey: process.env.OPENAI_API_KEY! }),
      vectorDB: /* ... */
    });

    // Define schema
    const productSchema = z.object({
      name: z.string(),
      price: z.number(),
      category: z.enum(['electronics', 'clothing', 'food']),
      inStock: z.boolean(),
      tags: z.array(z.string())
    });

    // Create parser with auto-fix
    const baseParser = StructuredOutputParser.fromZodSchema(productSchema);
    const parser = new AutoFixParser({
      parser: baseParser,
      maxRetries: 2,
      llm: orka.getLLM()
    });

    // Generate structured output
    const prompt = `Extract product information from this description:
    "The iPhone 15 Pro costs $999 and is currently available. It's an electronics item with tags: smartphone, apple, 5g"

    ${baseParser.getFormatInstructions()}`;

    const response = await orka.generate(prompt);

    // Parse with validation and auto-fix
    const product = await parser.parseWithRetry(response);

    console.log(product);
    // {
    //   name: 'iPhone 15 Pro',
    //   price: 999,
    //   category: 'electronics',
    //   inStock: true,
    //   tags: ['smartphone', 'apple', '5g']
    // }
    // ✅ Type-safe, validated, and auto-corrected if needed

Comparison
| Parser | Use Case | Validation |
|---|---|---|
| JSONParser | Simple JSON extraction | Basic JSON syntax |
| StructuredOutputParser | Type-safe structured data | Zod schema validation |
| ListParser | Lists, arrays, enumerations | Format cleaning |
| AutoFixParser | Unreliable LLM outputs | LLM-powered correction |
| XMLParser | Multi-field extraction with XML tags | Tag presence (strict mode) |
| CSVParser | Tabular data extraction | Column count (strict mode) |
| CommaSeparatedListParser | Simple comma-separated values | Non-empty list |
Best Practices
1. Include Format Instructions
Always add parser.getFormatInstructions() to your prompts to guide the LLM.
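One way to make this habit hard to forget is a tiny prompt helper. The sketch below assumes only that a parser exposes `getFormatInstructions()`, as the parsers on this page do; the `InstructedParser` interface and the `withFormatInstructions` helper are hypothetical names introduced for the example.

```typescript
// Minimal structural type: anything with getFormatInstructions() qualifies.
interface InstructedParser {
  getFormatInstructions(): string;
}

// Appends the parser's format instructions to a task description,
// separated by a blank line so the model sees them as a distinct block.
function withFormatInstructions(task: string, parser: InstructedParser): string {
  return `${task.trim()}\n\n${parser.getFormatInstructions().trim()}`;
}
```

Building every prompt through a helper like this keeps the instructions and the parser that will read the response in sync.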
2. Use Zod for Complex Schemas
StructuredOutputParser with Zod provides type safety, validation, and clear error messages.
3. Use AutoFix Sparingly
AutoFixParser makes extra LLM calls. Use it for critical data or when LLM outputs are unreliable.
Tree-shaking Imports
    // ✅ Import only what you need
    import { StructuredOutputParser } from 'orkajs/parsers/structured';
    import { AutoFixParser } from 'orkajs/parsers/auto-fix';
    import { XMLParser } from 'orkajs/parsers/xml';
    import { CSVParser } from 'orkajs/parsers/csv';
    import { CommaSeparatedListParser } from 'orkajs/parsers/comma-separated-list';

    // ✅ Or import from index
    import { JSONParser, ListParser, XMLParser, CSVParser } from 'orkajs/parsers';