Chains
Pre-built chains for common LLM patterns: question answering over documents, conversational retrieval with memory, and multi-strategy summarization. Each chain orchestrates retrieval, context building, and generation in a single call.
Why Chains?
Building a RAG pipeline from scratch requires multiple steps: retrieve documents, build context, format prompts, call the LLM, and handle errors. Chains encapsulate these patterns into reusable, tested components with built-in observability via intermediate steps.
- **RetrievalQA**: simple Q&A over documents
- **Conversational**: chat with memory + retrieval
- **Summarization**: stuff, map-reduce, refine
- **QA Chain**: advanced QA with strategies
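To make the encapsulation concrete, here is a minimal hand-rolled version of the retrieve / build-context / format-prompt steps that a chain wraps into one call. This is an illustrative sketch, not orkajs code: `retrieve` is a naive keyword-overlap stub standing in for a real vector search, and the final LLM call is omitted.

```typescript
type Doc = { id: string; content: string; score: number };

// Stub retriever: rank docs by naive keyword overlap (stands in for vector search).
function retrieve(query: string, docs: Doc[], topK: number): Doc[] {
  const terms = query.toLowerCase().split(/\s+/);
  return docs
    .map(d => ({
      ...d,
      score: terms.filter(t => d.content.toLowerCase().includes(t)).length
    }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}

// Build a context window from retrieved docs, respecting a character budget.
function buildContext(docs: Doc[], maxChars: number): string {
  let ctx = '';
  for (const d of docs) {
    if (ctx.length + d.content.length > maxChars) break;
    ctx += d.content + '\n\n';
  }
  return ctx.trim();
}

// Format the final prompt the LLM would receive.
function formatPrompt(context: string, question: string): string {
  return `Answer based ONLY on the context.\n\nContext:\n${context}\n\nQuestion: ${question}`;
}
```

A chain bundles these steps (plus the LLM call and error handling) behind a single `call()`.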
# RetrievalQAChain
The simplest chain for question answering over documents. It retrieves relevant documents, builds a context window, and generates an answer in a single call. Perfect for straightforward knowledge base queries.
```typescript
import { RetrievalQAChain } from 'orkajs/chains/retrieval-qa';
import { VectorRetriever } from 'orkajs/retrievers/vector';
import { OpenAIAdapter } from 'orkajs/adapters/openai';

const llm = new OpenAIAdapter({ apiKey: process.env.OPENAI_API_KEY! });

const retriever = new VectorRetriever({
  llm,
  vectorDB: myVectorDB,
  topK: 5
});

const chain = new RetrievalQAChain({
  llm,
  retriever,
  collection: 'documentation',
  returnSources: true,   // Include source documents in result
  maxSourceTokens: 3000, // Max tokens for context window
  systemPrompt: 'You are a helpful assistant. Answer based ONLY on the provided context.'
});

const result = await chain.call('How do I configure authentication?');

console.log(result.answer);
// "To configure authentication, you need to..."

console.log(result.sources);
// [{ id: 'doc-1', score: 0.92, content: '...' }, ...]

console.log(result.intermediateSteps);
// [
//   { name: 'retrieve', input: '...', output: 'Found 5 relevant documents', latencyMs: 120 },
//   { name: 'generate', input: '...', output: '...', latencyMs: 850 }
// ]

console.log(result.usage);
// { promptTokens: 1200, completionTokens: 350, totalTokens: 1550 }
```

**Intermediate Steps**
Every chain returns `intermediateSteps` with the name, input, output, and latency of each step. This gives you full observability into the chain's execution, which is useful for debugging, monitoring, and optimization.
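Given that shape, a tiny helper over the step records can surface the total latency and the slowest stage. The `IntermediateStep` interface below is inferred from the output shown above; it is an illustrative sketch, not a type exported by the library:

```typescript
// Shape of the step records returned by chains (inferred from the example output).
interface IntermediateStep {
  name: string;
  input: string;
  output: string;
  latencyMs?: number;
}

// Find the slowest step and the total chain latency -- handy for spotting bottlenecks.
function profileSteps(steps: IntermediateStep[]): { slowest: string; totalMs: number } {
  let slowest = steps[0]?.name ?? '';
  let max = -1;
  let totalMs = 0;
  for (const s of steps) {
    const ms = s.latencyMs ?? 0;
    totalMs += ms;
    if (ms > max) {
      max = ms;
      slowest = s.name;
    }
  }
  return { slowest, totalMs };
}
```

For the example above, this would report `generate` as the slowest step out of a 970 ms total.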
# ConversationalRetrievalChain
Extends RetrievalQA with conversation memory. It automatically condenses follow-up questions using chat history, so the retriever gets a standalone query even when the user asks "What about the second one?" or "Can you explain more?". The chain maintains its own chat history internally.
```typescript
import { ConversationalRetrievalChain } from 'orkajs/chains/conversational-retrieval';

const chain = new ConversationalRetrievalChain({
  llm,
  retriever,
  collection: 'documentation',
  returnSources: true,
  maxHistoryLength: 10, // Keep last 10 messages in context
  // Optional: customize how follow-up questions are condensed
  condenseQuestionPrompt: `Given the following conversation and a follow-up question,
rephrase the follow-up as a standalone question.

Chat History:
{{history}}

Follow-up: {{question}}

Standalone Question:`
});

// First question: normal retrieval
const result1 = await chain.call('What authentication methods are supported?');
console.log(result1.answer);
// "Orka AI supports JWT, OAuth2, and API key authentication..."

// Follow-up: automatically condensed with history
const result2 = await chain.call('How do I configure the second one?');
// Internally condensed to: "How do I configure OAuth2 authentication?"
console.log(result2.answer);
// "To configure OAuth2, you need to set up..."

// Another follow-up
const result3 = await chain.call('What are the required scopes?');
// Condensed to: "What are the required OAuth2 scopes?"
console.log(result3.answer);

// Check intermediate steps to see the condensed question
console.log(result2.intermediateSteps);
// [
//   { name: 'condense_question', input: 'How do I configure the second one?',
//     output: 'How do I configure OAuth2 authentication?' },
//   { name: 'retrieve', ... },
//   { name: 'generate', ... }
// ]

// Manage history
console.log(chain.getChatHistory());
// [{ role: 'user', content: '...' }, { role: 'assistant', content: '...' }, ...]

chain.clearHistory(); // Reset conversation
```

**Question Condensation Flow**
1. **User asks:** "How do I configure the second one?"
2. **LLM condenses:** uses chat history to produce "How do I configure OAuth2?"
3. **Retriever searches:** with the standalone question
4. **LLM answers:** using retrieved context + full history
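Step 2 is ordinary template substitution. As an illustrative sketch (not the library's internals), filling the `{{history}}` and `{{question}}` placeholders of a `condenseQuestionPrompt` might look like:

```typescript
type ChatMessage = { role: 'user' | 'assistant'; content: string };

// Render the chat history and substitute it into the condense-question template.
function buildCondensePrompt(
  template: string,
  history: ChatMessage[],
  question: string
): string {
  const rendered = history.map(m => `${m.role}: ${m.content}`).join('\n');
  return template.replace('{{history}}', rendered).replace('{{question}}', question);
}
```

The LLM then answers this prompt with a standalone question, which is what the retriever receives.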
# SummarizationChain
Summarizes one or more documents using three different strategies, each with different trade-offs for quality, speed, and cost.
**📦 Stuff**

Concatenates all documents into one prompt. Fastest and cheapest, but limited by the context window.

*Best for: short docs*

**🗺️ Map-Reduce**

Summarizes each chunk independently (map), then combines all summaries (reduce). Handles unlimited document size.

*Best for: large docs*

**🔁 Refine**

Starts with the first chunk, then iteratively refines the summary with each subsequent chunk. Best quality, but sequential.

*Best for: quality*

**Stuff Strategy**
```typescript
import { SummarizationChain } from 'orkajs/chains/summarization';

// Stuff: all documents in one prompt
const stuffChain = new SummarizationChain({
  llm,
  strategy: 'stuff',
  systemPrompt: 'Provide a clear and concise summary.'
});

const result = await stuffChain.call([
  'Document 1 content here...',
  'Document 2 content here...',
  'Document 3 content here...'
]);

console.log(result.answer); // Combined summary
console.log(result.usage);  // Token usage (1 LLM call)
```

**Map-Reduce Strategy**
```typescript
// Map-Reduce: summarize chunks independently, then combine
const mapReduceChain = new SummarizationChain({
  llm,
  strategy: 'map-reduce',
  maxChunkSize: 3000, // Split large docs into 3000-char chunks
  combinePrompt: 'Combine the following summaries into a single coherent summary:\n\n{{summaries}}\n\nCombined Summary:'
});

const result = await mapReduceChain.call([
  veryLongDocument1, // 50,000 chars
  veryLongDocument2  // 30,000 chars
]);

console.log(result.answer);
console.log(result.intermediateSteps);
// [
//   { name: 'map_chunk_0', ... },    // Summary of chunk 1
//   { name: 'map_chunk_1', ... },    // Summary of chunk 2
//   ...
//   { name: 'reduce_combine', ... }  // Final combined summary
// ]
```

**Refine Strategy**
```typescript
// Refine: iteratively improve the summary with each chunk
const refineChain = new SummarizationChain({
  llm,
  strategy: 'refine',
  maxChunkSize: 3000,
  refinePrompt: `Here is an existing summary:
{{existingSummary}}

Refine this summary with the following additional context:
{{context}}

Refined Summary:`
});

const result = await refineChain.call([longDocument]);

console.log(result.answer);
console.log(result.intermediateSteps);
// [
//   { name: 'initial_summary', ... }, // First chunk summary
//   { name: 'refine_1', ... },        // Refined with chunk 2
//   { name: 'refine_2', ... },        // Refined with chunk 3
//   ...
// ]
```

# QAChain: Advanced Question Answering
QAChain combines retrieval with multi-strategy answering. Unlike RetrievalQAChain, which always uses the "stuff" approach, QAChain supports the stuff, map-reduce, and refine strategies for answering questions over large document sets.
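Generically, the map-reduce answering flow can be sketched with a pluggable `ask` function standing in for the LLM call. This is an illustration of the pattern, not QAChain's actual implementation:

```typescript
// `ask` stands in for a single LLM call (prompt in, completion out).
type Ask = (prompt: string) => Promise<string>;

// Map: one extraction call per document, run in parallel.
// Reduce: one final call that combines the extracts into an answer.
async function mapReduceQA(ask: Ask, question: string, docs: string[]): Promise<string> {
  const partials = await Promise.all(
    docs.map(d =>
      ask(`From this document, extract anything relevant to: "${question}"\n\n${d}`)
    )
  );
  return ask(`Combine these notes into one answer to "${question}":\n\n${partials.join('\n---\n')}`);
}
```

For N retrieved documents this costs N map calls plus one reduce call; refine instead runs N sequential calls, each building on the previous answer.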
```typescript
import { QAChain } from 'orkajs/chains/qa';

// Stuff strategy (default): fast, for small document sets
const stuffQA = new QAChain({
  llm,
  retriever,
  collection: 'docs',
  strategy: 'stuff',
  returnSources: true
});

const result1 = await stuffQA.call('What is the rate limit?');

// Map-Reduce strategy: for many retrieved documents
const mapReduceQA = new QAChain({
  llm,
  retriever,
  collection: 'docs',
  strategy: 'map-reduce',
  returnSources: true,
  systemPrompt: 'Answer precisely based on the documents. Cite sources when possible.'
});

const result2 = await mapReduceQA.call('Compare all authentication methods');
// Map: extracts relevant info from each document
// Reduce: combines into a comprehensive answer

// Refine strategy: best quality for complex questions
const refineQA = new QAChain({
  llm,
  retriever,
  collection: 'docs',
  strategy: 'refine',
  returnSources: true
});

const result3 = await refineQA.call('Explain the complete deployment process step by step');
// Starts with an answer from the first doc, refines with each additional doc
```

Complete Example: Production RAG Pipeline
```typescript
import { createOrka } from 'orkajs/core';
import { OpenAIAdapter } from 'orkajs/adapters/openai';
import { EnsembleRetriever, VectorRetriever, BM25Retriever } from 'orkajs/retrievers';
import { ConversationalRetrievalChain, SummarizationChain } from 'orkajs/chains';
import { MemoryCache } from 'orkajs/cache/memory';
import { CachedLLM } from 'orkajs/cache/llm';

// Setup with caching
const baseLLM = new OpenAIAdapter({ apiKey: process.env.OPENAI_API_KEY! });
const llm = new CachedLLM(baseLLM, new MemoryCache({ maxSize: 500 }));

const orka = createOrka({ llm, vectorDB: myVectorDB });

// Hybrid retriever (BM25 + Vector)
const hybrid = new EnsembleRetriever({
  retrievers: [
    new BM25Retriever({ documents: myDocs, topK: 10 }),
    new VectorRetriever({ llm, vectorDB: myVectorDB, topK: 10 })
  ],
  weights: [0.3, 0.7],
  topK: 5
});

// Conversational chain with hybrid retrieval
const chatChain = new ConversationalRetrievalChain({
  llm,
  retriever: hybrid,
  collection: 'documentation',
  returnSources: true,
  maxHistoryLength: 10
});

// Handle user conversation
async function handleMessage(userMessage: string) {
  const result = await chatChain.call(userMessage);

  return {
    answer: result.answer,
    sources: result.sources?.map(s => ({
      id: s.id,
      score: s.score,
      preview: s.content?.slice(0, 100)
    })),
    steps: result.intermediateSteps
  };
}

// Summarize long documents
const summarizer = new SummarizationChain({
  llm,
  strategy: 'map-reduce',
  maxChunkSize: 3000
});

async function summarizeDocuments(texts: string[]) {
  const result = await summarizer.call(texts);
  return result.answer;
}
```

Comparison
| Chain | Use Case | Memory | Strategies |
|---|---|---|---|
| RetrievalQA | Simple Q&A over docs | ❌ | stuff |
| Conversational | Chat with follow-ups | ✅ | stuff |
| Summarization | Document summarization | ❌ | stuff, map-reduce, refine |
| QAChain | Advanced Q&A with strategies | ❌ | stuff, map-reduce, refine |
Summarization Strategies Comparison
| Strategy | LLM Calls | Quality | Max Doc Size |
|---|---|---|---|
| stuff | 1 | ★★★ | Context window limit |
| map-reduce | N + 1 | ★★ | Unlimited |
| refine | N | ★★★★ | Unlimited |
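The call counts in the table follow directly from each strategy's shape; as a sanity check, they can be expressed as a small function (an illustrative sketch, not part of the library):

```typescript
type Strategy = 'stuff' | 'map-reduce' | 'refine';

// LLM call counts per summarization strategy for n chunks, matching the table above.
function llmCalls(strategy: Strategy, nChunks: number): number {
  switch (strategy) {
    case 'stuff':
      return 1; // everything in one prompt
    case 'map-reduce':
      return nChunks + 1; // one call per chunk + one combine call
    case 'refine':
      return nChunks; // initial summary + (n - 1) refinement calls
  }
}
```

Note that map-reduce's extra call buys parallelism: the map calls can run concurrently, while refine's N calls are strictly sequential.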
💡 Best Practices
1. Start with RetrievalQA
For most use cases, RetrievalQAChain is sufficient. Only upgrade to ConversationalRetrievalChain when you need multi-turn conversations.
2. Use Hybrid Retrieval
Combine BM25 + VectorRetriever in an EnsembleRetriever for the best results. Keyword matching catches exact terms that semantic search might miss.
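To illustrate why the ensemble weights matter, here is a minimal weighted score fusion over two result lists. This is a sketch of the general idea, not EnsembleRetriever's actual algorithm (real ensembles often use rank-based fusion and score normalization instead):

```typescript
type Scored = { id: string; score: number };

// Merge two retriever result lists, weighting each retriever's score,
// and return the topK documents by combined score.
function fuse(a: Scored[], b: Scored[], weights: [number, number], topK: number): Scored[] {
  const merged = new Map<string, number>();
  for (const r of a) merged.set(r.id, (merged.get(r.id) ?? 0) + weights[0] * r.score);
  for (const r of b) merged.set(r.id, (merged.get(r.id) ?? 0) + weights[1] * r.score);
  return [...merged.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}
```

Documents found by both retrievers accumulate score from both, which is what pushes genuinely relevant results to the top.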
3. Monitor Intermediate Steps
Use `intermediateSteps` to debug retrieval quality, identify slow steps, and optimize your pipeline.
4. Choose the Right Summarization Strategy
Use 'stuff' for short documents (<4K tokens), 'map-reduce' for parallel processing of large docs, and 'refine' when summary quality is critical.
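That guidance can be folded into a small chooser. The ~4 characters per token ratio is a rough heuristic assumed here for illustration, not a real tokenizer:

```typescript
// Pick a summarization strategy from rough input size, following the guidance above.
// Assumes ~4 characters per token -- a crude estimate, not a real tokenizer.
function chooseStrategy(
  texts: string[],
  qualityCritical = false
): 'stuff' | 'map-reduce' | 'refine' {
  if (qualityCritical) return 'refine';
  const approxTokens = texts.reduce((n, t) => n + t.length, 0) / 4;
  return approxTokens < 4000 ? 'stuff' : 'map-reduce';
}
```

In production you would replace the character heuristic with your model's actual tokenizer.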
Tree-shaking Imports
```typescript
// ✅ Import only what you need
import { RetrievalQAChain } from 'orkajs/chains/retrieval-qa';
import { ConversationalRetrievalChain } from 'orkajs/chains/conversational-retrieval';
import { SummarizationChain } from 'orkajs/chains/summarization';
import { QAChain } from 'orkajs/chains/qa';

// ✅ Or import from the index
import { RetrievalQAChain, ConversationalRetrievalChain, SummarizationChain, QAChain } from 'orkajs/chains';
```