Chains
Pre-built chains for common LLM patterns: question answering over documents, conversational retrieval with memory, and multi-strategy summarization. Each chain orchestrates retrieval, context building, and generation in a single call.
Why Chains?
Building a RAG pipeline from scratch requires multiple steps: retrieve documents, build context, format prompts, call the LLM, and handle errors. Chains encapsulate these patterns into reusable, tested components with built-in observability via intermediate steps.
- **RetrievalQA**: simple Q&A over documents
- **Conversational**: chat with memory + retrieval
- **Summarization**: stuff, map-reduce, refine
- **QA Chain**: advanced QA with strategies
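To make the encapsulation concrete, here is a minimal hand-rolled version of the retrieve / build-context / format-prompt steps that a chain wraps into one call. This is an illustrative sketch, not orkajs code: `retrieve` is a naive keyword-overlap stub standing in for a real vector search, and the final LLM call is omitted.

```typescript
type Doc = { id: string; content: string; score: number };

// Stub retriever: rank docs by naive keyword overlap (stands in for vector search).
function retrieve(query: string, docs: Doc[], topK: number): Doc[] {
  const terms = query.toLowerCase().split(/\s+/);
  return docs
    .map(d => ({
      ...d,
      score: terms.filter(t => d.content.toLowerCase().includes(t)).length
    }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}

// Build a context window from retrieved docs, respecting a character budget.
function buildContext(docs: Doc[], maxChars: number): string {
  let ctx = '';
  for (const d of docs) {
    if (ctx.length + d.content.length > maxChars) break;
    ctx += d.content + '\n\n';
  }
  return ctx.trim();
}

// Format the final prompt the LLM would receive.
function formatPrompt(context: string, question: string): string {
  return `Answer based ONLY on the context.\n\nContext:\n${context}\n\nQuestion: ${question}`;
}
```

A chain bundles these steps (plus the LLM call and error handling) behind a single `call()`.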
# RetrievalQAChain
The simplest chain for question answering over documents. It retrieves relevant documents, builds a context window, and generates an answer in a single call. Perfect for straightforward knowledge base queries.
```typescript
import { RetrievalQAChain } from 'orkajs/chains/retrieval-qa';
import { VectorRetriever } from 'orkajs/retrievers/vector';
import { OpenAIAdapter } from 'orkajs/adapters/openai';

const llm = new OpenAIAdapter({ apiKey: process.env.OPENAI_API_KEY! });

const retriever = new VectorRetriever({
  llm,
  vectorDB: myVectorDB,
  topK: 5
});

const chain = new RetrievalQAChain({
  llm,
  retriever,
  collection: 'documentation',
  returnSources: true,   // Include source documents in result
  maxSourceTokens: 3000, // Max tokens for context window
  systemPrompt: 'You are a helpful assistant. Answer based ONLY on the provided context.'
});

const result = await chain.call('How do I configure authentication?');

console.log(result.answer);
// "To configure authentication, you need to..."

console.log(result.sources);
// [{ id: 'doc-1', score: 0.92, content: '...' }, ...]

console.log(result.intermediateSteps);
// [
//   { name: 'retrieve', input: '...', output: 'Found 5 relevant documents', latencyMs: 120 },
//   { name: 'generate', input: '...', output: '...', latencyMs: 850 }
// ]

console.log(result.usage);
// { promptTokens: 1200, completionTokens: 350, totalTokens: 1550 }
```

**Intermediate Steps**
Every chain returns `intermediateSteps` with the name, input, output, and latency of each step. This gives you full observability into the chain's execution, which is useful for debugging, monitoring, and optimization.
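Given that shape, a tiny helper over the step records can surface the total latency and the slowest stage. The `IntermediateStep` interface below is inferred from the output shown above; it is an illustrative sketch, not a type exported by the library:

```typescript
// Shape of the step records returned by chains (inferred from the example output).
interface IntermediateStep {
  name: string;
  input: string;
  output: string;
  latencyMs?: number;
}

// Find the slowest step and the total chain latency -- handy for spotting bottlenecks.
function profileSteps(steps: IntermediateStep[]): { slowest: string; totalMs: number } {
  let slowest = steps[0]?.name ?? '';
  let max = -1;
  let totalMs = 0;
  for (const s of steps) {
    const ms = s.latencyMs ?? 0;
    totalMs += ms;
    if (ms > max) {
      max = ms;
      slowest = s.name;
    }
  }
  return { slowest, totalMs };
}
```

For the example above, this would report `generate` as the slowest step out of a 970 ms total.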
# ConversationalRetrievalChain
Extends RetrievalQA with conversation memory. It automatically condenses follow-up questions using chat history, so the retriever gets a standalone query even when the user asks "What about the second one?" or "Can you explain more?". The chain maintains its own chat history internally.
```typescript
import { ConversationalRetrievalChain } from 'orkajs/chains/conversational-retrieval';

const chain = new ConversationalRetrievalChain({
  llm,
  retriever,
  collection: 'documentation',
  returnSources: true,
  maxHistoryLength: 10, // Keep last 10 messages in context
  // Optional: customize how follow-up questions are condensed
  condenseQuestionPrompt: `Given the following conversation and a follow-up question,
rephrase the follow-up as a standalone question.

Chat History:
{{history}}

Follow-up: {{question}}

Standalone Question:`
});

// First question: normal retrieval
const result1 = await chain.call('What authentication methods are supported?');
console.log(result1.answer);
// "Orka AI supports JWT, OAuth2, and API key authentication..."

// Follow-up: automatically condensed with history
const result2 = await chain.call('How do I configure the second one?');
// Internally condensed to: "How do I configure OAuth2 authentication?"
console.log(result2.answer);
// "To configure OAuth2, you need to set up..."

// Another follow-up
const result3 = await chain.call('What are the required scopes?');
// Condensed to: "What are the required OAuth2 scopes?"
console.log(result3.answer);

// Check intermediate steps to see the condensed question
console.log(result2.intermediateSteps);
// [
//   { name: 'condense_question', input: 'How do I configure the second one?',
//     output: 'How do I configure OAuth2 authentication?' },
//   { name: 'retrieve', ... },
//   { name: 'generate', ... }
// ]

// Manage history
console.log(chain.getChatHistory());
// [{ role: 'user', content: '...' }, { role: 'assistant', content: '...' }, ...]

chain.clearHistory(); // Reset conversation
```

**Question Condensation Flow**
1. **User asks:** "How do I configure the second one?"
2. **LLM condenses:** uses chat history to produce "How do I configure OAuth2?"
3. **Retriever searches:** with the standalone question
4. **LLM answers:** using retrieved context + full history
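Step 2 is ordinary template substitution. As an illustrative sketch (not the library's internals), filling the `{{history}}` and `{{question}}` placeholders of a `condenseQuestionPrompt` might look like:

```typescript
type ChatMessage = { role: 'user' | 'assistant'; content: string };

// Render the chat history and substitute it into the condense-question template.
function buildCondensePrompt(
  template: string,
  history: ChatMessage[],
  question: string
): string {
  const rendered = history.map(m => `${m.role}: ${m.content}`).join('\n');
  return template.replace('{{history}}', rendered).replace('{{question}}', question);
}
```

The LLM then answers this prompt with a standalone question, which is what the retriever receives.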
# SummarizationChain
Summarizes one or more documents using three different strategies, each with different trade-offs for quality, speed, and cost.
**📦 Stuff**

Concatenates all documents into one prompt. Fastest and cheapest, but limited by the context window.

*Best for: short docs*

**🗺️ Map-Reduce**

Summarizes each chunk independently (map), then combines all summaries (reduce). Handles unlimited document size.

*Best for: large docs*

**🔁 Refine**

Starts with the first chunk, then iteratively refines the summary with each subsequent chunk. Best quality, but sequential.

*Best for: quality*

**Stuff Strategy**
```typescript
import { SummarizationChain } from 'orkajs/chains/summarization';

// Stuff: all documents in one prompt
const stuffChain = new SummarizationChain({
  llm,
  strategy: 'stuff',
  systemPrompt: 'Provide a clear and concise summary.'
});

const result = await stuffChain.call([
  'Document 1 content here...',
  'Document 2 content here...',
  'Document 3 content here...'
]);

console.log(result.answer); // Combined summary
console.log(result.usage);  // Token usage (1 LLM call)
```

**Map-Reduce Strategy**
```typescript
// Map-Reduce: summarize chunks independently, then combine
const mapReduceChain = new SummarizationChain({
  llm,
  strategy: 'map-reduce',
  maxChunkSize: 3000, // Split large docs into 3000-char chunks
  combinePrompt: 'Combine the following summaries into a single coherent summary:\n\n{{summaries}}\n\nCombined Summary:'
});

const result = await mapReduceChain.call([
  veryLongDocument1, // 50,000 chars
  veryLongDocument2  // 30,000 chars
]);

console.log(result.answer);
console.log(result.intermediateSteps);
// [
//   { name: 'map_chunk_0', ... },    // Summary of chunk 1
//   { name: 'map_chunk_1', ... },    // Summary of chunk 2
//   ...
//   { name: 'reduce_combine', ... }  // Final combined summary
// ]
```

**Refine Strategy**
```typescript
// Refine: iteratively improve the summary with each chunk
const refineChain = new SummarizationChain({
  llm,
  strategy: 'refine',
  maxChunkSize: 3000,
  refinePrompt: `Here is an existing summary:
{{existingSummary}}

Refine this summary with the following additional context:
{{context}}

Refined Summary:`
});

const result = await refineChain.call([longDocument]);

console.log(result.answer);
console.log(result.intermediateSteps);
// [
//   { name: 'initial_summary', ... }, // First chunk summary
//   { name: 'refine_1', ... },        // Refined with chunk 2
//   { name: 'refine_2', ... },        // Refined with chunk 3
//   ...
// ]
```

# QAChain: Advanced Question Answering
QAChain combines retrieval with multi-strategy answering. Unlike RetrievalQAChain, which always uses the "stuff" approach, QAChain supports the stuff, map-reduce, and refine strategies for answering questions over large document sets.
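Generically, the map-reduce answering flow can be sketched with a pluggable `ask` function standing in for the LLM call. This is an illustration of the pattern, not QAChain's actual implementation:

```typescript
// `ask` stands in for a single LLM call (prompt in, completion out).
type Ask = (prompt: string) => Promise<string>;

// Map: one extraction call per document, run in parallel.
// Reduce: one final call that combines the extracts into an answer.
async function mapReduceQA(ask: Ask, question: string, docs: string[]): Promise<string> {
  const partials = await Promise.all(
    docs.map(d =>
      ask(`From this document, extract anything relevant to: "${question}"\n\n${d}`)
    )
  );
  return ask(`Combine these notes into one answer to "${question}":\n\n${partials.join('\n---\n')}`);
}
```

For N retrieved documents this costs N map calls plus one reduce call; refine instead runs N sequential calls, each building on the previous answer.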
```typescript
import { QAChain } from 'orkajs/chains/qa';

// Stuff strategy (default): fast, for small document sets
const stuffQA = new QAChain({
  llm,
  retriever,
  collection: 'docs',
  strategy: 'stuff',
  returnSources: true
});

const result1 = await stuffQA.call('What is the rate limit?');

// Map-Reduce strategy: for many retrieved documents
const mapReduceQA = new QAChain({
  llm,
  retriever,
  collection: 'docs',
  strategy: 'map-reduce',
  returnSources: true,
  systemPrompt: 'Answer precisely based on the documents. Cite sources when possible.'
});

const result2 = await mapReduceQA.call('Compare all authentication methods');
// Map: extracts relevant info from each document
// Reduce: combines into a comprehensive answer

// Refine strategy: best quality for complex questions
const refineQA = new QAChain({
  llm,
  retriever,
  collection: 'docs',
  strategy: 'refine',
  returnSources: true
});

const result3 = await refineQA.call('Explain the complete deployment process step by step');
// Starts with an answer from the first doc, refines with each additional doc
```

Complete Example: Production RAG Pipeline
```typescript
import { createOrka } from 'orkajs/core';
import { OpenAIAdapter } from 'orkajs/adapters/openai';
import { EnsembleRetriever, VectorRetriever, BM25Retriever } from 'orkajs/retrievers';
import { ConversationalRetrievalChain, SummarizationChain } from 'orkajs/chains';
import { MemoryCache } from 'orkajs/cache/memory';
import { CachedLLM } from 'orkajs/cache/llm';

// Setup with caching
const baseLLM = new OpenAIAdapter({ apiKey: process.env.OPENAI_API_KEY! });
const llm = new CachedLLM(baseLLM, new MemoryCache({ maxSize: 500 }));

const orka = createOrka({ llm, vectorDB: myVectorDB });

// Hybrid retriever (BM25 + Vector)
const hybrid = new EnsembleRetriever({
  retrievers: [
    new BM25Retriever({ documents: myDocs, topK: 10 }),
    new VectorRetriever({ llm, vectorDB: myVectorDB, topK: 10 })
  ],
  weights: [0.3, 0.7],
  topK: 5
});

// Conversational chain with hybrid retrieval
const chatChain = new ConversationalRetrievalChain({
  llm,
  retriever: hybrid,
  collection: 'documentation',
  returnSources: true,
  maxHistoryLength: 10
});

// Handle user conversation
async function handleMessage(userMessage: string) {
  const result = await chatChain.call(userMessage);

  return {
    answer: result.answer,
    sources: result.sources?.map(s => ({
      id: s.id,
      score: s.score,
      preview: s.content?.slice(0, 100)
    })),
    steps: result.intermediateSteps
  };
}

// Summarize long documents
const summarizer = new SummarizationChain({
  llm,
  strategy: 'map-reduce',
  maxChunkSize: 3000
});

async function summarizeDocuments(texts: string[]) {
  const result = await summarizer.call(texts);
  return result.answer;
}
```

Comparison
| Chain | Use Case | Memory | Strategies |
|---|---|---|---|
| RetrievalQA | Simple Q&A over docs | ❌ | stuff |
| Conversational | Chat with follow-ups | ✅ | stuff |
| Summarization | Document summarization | ❌ | stuff, map-reduce, refine |
| QAChain | Advanced Q&A with strategies | ❌ | stuff, map-reduce, refine |
Summarization Strategies Comparison
| Strategy | LLM Calls | Quality | Max Doc Size |
|---|---|---|---|
| stuff | 1 | ★★★ | Context window limit |
| map-reduce | N + 1 | ★★ | Unlimited |
| refine | N | ★★★★ | Unlimited |
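The call counts in the table follow directly from each strategy's shape; as a sanity check, they can be expressed as a small function (an illustrative sketch, not part of the library):

```typescript
type Strategy = 'stuff' | 'map-reduce' | 'refine';

// LLM call counts per summarization strategy for n chunks, matching the table above.
function llmCalls(strategy: Strategy, nChunks: number): number {
  switch (strategy) {
    case 'stuff':
      return 1; // everything in one prompt
    case 'map-reduce':
      return nChunks + 1; // one call per chunk + one combine call
    case 'refine':
      return nChunks; // initial summary + (n - 1) refinement calls
  }
}
```

Note that map-reduce's extra call buys parallelism: the map calls can run concurrently, while refine's N calls are strictly sequential.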
💡 Best Practices
1. Start with RetrievalQA
For most use cases, RetrievalQAChain is sufficient. Only upgrade to ConversationalRetrievalChain when you need multi-turn conversations.
2. Use Hybrid Retrieval
Combine BM25 + VectorRetriever in an EnsembleRetriever for the best results. Keyword matching catches exact terms that semantic search might miss.
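To illustrate why the ensemble weights matter, here is a minimal weighted score fusion over two result lists. This is a sketch of the general idea, not EnsembleRetriever's actual algorithm (real ensembles often use rank-based fusion and score normalization instead):

```typescript
type Scored = { id: string; score: number };

// Merge two retriever result lists, weighting each retriever's score,
// and return the topK documents by combined score.
function fuse(a: Scored[], b: Scored[], weights: [number, number], topK: number): Scored[] {
  const merged = new Map<string, number>();
  for (const r of a) merged.set(r.id, (merged.get(r.id) ?? 0) + weights[0] * r.score);
  for (const r of b) merged.set(r.id, (merged.get(r.id) ?? 0) + weights[1] * r.score);
  return [...merged.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}
```

Documents found by both retrievers accumulate score from both, which is what pushes genuinely relevant results to the top.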
3. Monitor Intermediate Steps
Use `intermediateSteps` to debug retrieval quality, identify slow steps, and optimize your pipeline.
4. Choose the Right Summarization Strategy
Use 'stuff' for short documents (<4K tokens), 'map-reduce' for parallel processing of large docs, and 'refine' when summary quality is critical.
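That guidance can be folded into a small chooser. The ~4 characters per token ratio is a rough heuristic assumed here for illustration, not a real tokenizer:

```typescript
// Pick a summarization strategy from rough input size, following the guidance above.
// Assumes ~4 characters per token -- a crude estimate, not a real tokenizer.
function chooseStrategy(
  texts: string[],
  qualityCritical = false
): 'stuff' | 'map-reduce' | 'refine' {
  if (qualityCritical) return 'refine';
  const approxTokens = texts.reduce((n, t) => n + t.length, 0) / 4;
  return approxTokens < 4000 ? 'stuff' : 'map-reduce';
}
```

In production you would replace the character heuristic with your model's actual tokenizer.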
Tree-shaking Imports
```typescript
// ✅ Import only what you need
import { RetrievalQAChain } from 'orkajs/chains/retrieval-qa';
import { ConversationalRetrievalChain } from 'orkajs/chains/conversational-retrieval';
import { SummarizationChain } from 'orkajs/chains/summarization';
import { QAChain } from 'orkajs/chains/qa';

// ✅ Or import from the index
import { RetrievalQAChain, ConversationalRetrievalChain, SummarizationChain, QAChain } from 'orkajs/chains';
```