Resilience
Build fault-tolerant AI applications with retry logic and multi-provider fallback chains.
# Why Resilience?
LLM APIs can fail due to rate limits, network issues, or provider outages. Resilience patterns ensure your application continues working even when individual API calls fail. Orka provides two key patterns: automatic retry with exponential backoff, and multi-provider fallback chains.
# ResilientLLM
ResilientLLM wraps any LLM adapter with automatic retry logic. It implements the LLMAdapter interface, so you can use it as a drop-in replacement.
```typescript
import { createOrka } from 'orkajs/core';
import { ResilientLLM } from 'orkajs/resilience';
import { OpenAIAdapter } from 'orkajs/adapters/openai';
import { MemoryVectorDB } from 'orkajs/adapters/memory';

const llm = new OpenAIAdapter({ apiKey: process.env.OPENAI_API_KEY! });

// Wrap with automatic retry
const resilientLLM = new ResilientLLM(llm, {
  maxRetries: 3,          // Maximum retry attempts
  initialDelayMs: 1000,   // First retry after 1 second
  backoffMultiplier: 2,   // Double delay each retry: 1s, 2s, 4s
  maxDelayMs: 30000,      // Cap delay at 30 seconds
  retryableErrors: [      // Only retry these errors
    'rate limit', '429', '503', 'ECONNRESET', 'timeout'
  ],
  onRetry: (error, attempt, delayMs) => {
    console.log(`Retry ${attempt}/3 in ${delayMs}ms: ${error.message}`);
  },
});

// Use as a normal LLM
const result = await resilientLLM.generate('What is TypeScript?');

// Or use with Orka
const orka = createOrka({
  llm: resilientLLM,
  vectorDB: new MemoryVectorDB(),
});
```

## ResilientLLM Options
- `maxRetries: number` (default: `3`): Maximum number of retry attempts before giving up.
- `initialDelayMs: number` (default: `1000`): Delay before the first retry, in milliseconds.
- `backoffMultiplier: number` (default: `2`): Multiply the delay by this factor after each retry; `2` gives exponential backoff.
- `maxDelayMs: number`: Upper bound on the delay between retries, in milliseconds.
- `retryableErrors: string[]`: Only retry if the error message contains one of these strings. Non-matching errors fail immediately.
- `onRetry?: (error, attempt, delayMs) => void`: Callback fired before each retry. Use it for logging or monitoring.
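Taken together, these options define a simple delay schedule: the delay starts at `initialDelayMs`, is multiplied by `backoffMultiplier` after each retry, and is capped at `maxDelayMs`. A minimal sketch of that arithmetic (the function name is illustrative, not part of Orka's API, and the internals may differ):

```typescript
// Sketch: the retry delay for the n-th retry, derived from the options above
function retryDelayMs(
  attempt: number,            // 1-based retry attempt
  initialDelayMs = 1000,
  backoffMultiplier = 2,
  maxDelayMs = 30000,
): number {
  const delay = initialDelayMs * Math.pow(backoffMultiplier, attempt - 1);
  return Math.min(delay, maxDelayMs);
}

// With the defaults, the schedule is 1s, 2s, 4s, ... capped at 30s:
const schedule = [1, 2, 3, 4, 5, 6].map((n) => retryDelayMs(n));
// → [1000, 2000, 4000, 8000, 16000, 30000]
```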
# withRetry() Helper
For one-off retry logic on any async function, use the withRetry() helper:
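Conceptually, such a helper is a loop that calls the function, checks whether the failure is retryable, sleeps with exponential backoff, and re-throws once the attempts are exhausted. A hypothetical sketch under those assumptions (not Orka's actual source; the real helper is called as shown below):

```typescript
// Illustrative option shape, mirroring the documented options
interface RetryOptions {
  maxRetries?: number;
  initialDelayMs?: number;
  backoffMultiplier?: number;
  maxDelayMs?: number;
  retryableErrors?: string[];
  onRetry?: (error: Error, attempt: number, delayMs: number) => void;
}

const sleep = (ms: number) => new Promise((res) => setTimeout(res, ms));

async function withRetrySketch<T>(fn: () => Promise<T>, opts: RetryOptions = {}): Promise<T> {
  const {
    maxRetries = 3,
    initialDelayMs = 1000,
    backoffMultiplier = 2,
    maxDelayMs = 30000,
    retryableErrors,
    onRetry,
  } = opts;

  let delay = initialDelayMs;
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      const error = err as Error;
      // Non-matching errors fail immediately; exhausted attempts re-throw
      const retryable =
        !retryableErrors || retryableErrors.some((s) => error.message.includes(s));
      if (!retryable || attempt > maxRetries) throw error;
      onRetry?.(error, attempt, delay);
      await sleep(delay);
      delay = Math.min(delay * backoffMultiplier, maxDelayMs);
    }
  }
}
```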
```typescript
import { withRetry } from 'orkajs/resilience';

const result = await withRetry(
  () => orka.ask({ question: 'My question', knowledge: 'docs' }),
  {
    maxRetries: 3,
    initialDelayMs: 1000,
    backoffMultiplier: 2,
    maxDelayMs: 30000,
    retryableErrors: ['rate limit', '429', '503'],
    onRetry: (error, attempt) => {
      console.log(`Attempt ${attempt}: ${error.message}`);
    },
  },
);
```

# FallbackLLM: Multi-Provider Failover
FallbackLLM chains multiple LLM adapters together. If the primary fails, it automatically tries the next one in the chain. This provides redundancy across different providers.
```typescript
import { createOrka } from 'orkajs/core';
import { FallbackLLM } from 'orkajs/resilience';
import { OpenAIAdapter } from 'orkajs/adapters/openai';
import { AnthropicAdapter } from 'orkajs/adapters/anthropic';
import { OllamaAdapter } from 'orkajs/adapters/ollama';
import { MemoryVectorDB } from 'orkajs/adapters/memory';

const llm = new FallbackLLM({
  adapters: [
    // Primary: OpenAI (fastest, most reliable)
    new OpenAIAdapter({ apiKey: process.env.OPENAI_API_KEY! }),
    // Secondary: Anthropic (different provider for redundancy)
    new AnthropicAdapter({ apiKey: process.env.ANTHROPIC_API_KEY! }),
    // Tertiary: Local Ollama (always available, no API limits)
    new OllamaAdapter({ model: 'llama3.2' }),
  ],
  onFallback: (error, failedAdapter, nextAdapter) => {
    console.warn(`⚠️ ${failedAdapter} failed: ${error.message}`);
    console.warn(`   Falling back to ${nextAdapter}`);
    // Send alert to your monitoring system ("alerting" is a placeholder
    // for your own monitoring client)
    alerting.send('llm_fallback', { from: failedAdapter, to: nextAdapter });
  },
});

const orka = createOrka({ llm, vectorDB: new MemoryVectorDB() });

// If OpenAI fails, automatically tries Anthropic, then Ollama
const answer = await orka.ask({ question: 'What is TypeScript?' });
```

## FallbackLLM Options
- `adapters: LLMAdapter[]` (required): Array of LLM adapters in priority order. The first adapter is tried first.
- `onFallback?: (error, failedAdapter, nextAdapter) => void`: Callback fired when falling back to the next adapter. Use it for logging or alerting.
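Internally, a fallback chain is just a loop over the adapters array that re-throws the last error once every provider has failed. A hypothetical sketch of that loop (not the library's actual source; `LLMAdapter` is reduced here to the single `generate()` method the examples use, plus an illustrative `name` field):

```typescript
// Minimal adapter shape assumed for this sketch
interface LLMAdapter {
  name: string;
  generate(prompt: string): Promise<string>;
}

async function generateWithFallback(
  adapters: LLMAdapter[],
  prompt: string,
  onFallback?: (error: Error, failed: string, next: string) => void,
): Promise<string> {
  let lastError: Error | undefined;
  for (let i = 0; i < adapters.length; i++) {
    try {
      // First adapter in the array is tried first
      return await adapters[i].generate(prompt);
    } catch (err) {
      lastError = err as Error;
      const next = adapters[i + 1];
      if (next) onFallback?.(lastError, adapters[i].name, next.name);
    }
  }
  // Every adapter failed: surface the last error to the caller
  throw lastError ?? new Error('No adapters configured');
}
```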
# Combining Retry + Fallback
For maximum resilience, combine ResilientLLM (retry) with FallbackLLM (multi-provider). This gives you retry within each provider AND fallback across providers.
```typescript
import { createOrka } from 'orkajs/core';
import { FallbackLLM, ResilientLLM } from 'orkajs/resilience';
import { OpenAIAdapter } from 'orkajs/adapters/openai';
import { AnthropicAdapter } from 'orkajs/adapters/anthropic';
import { MemoryVectorDB } from 'orkajs/adapters/memory';

// Wrap each adapter with retry logic
const resilientOpenAI = new ResilientLLM(
  new OpenAIAdapter({ apiKey: process.env.OPENAI_API_KEY! }),
  { maxRetries: 2, initialDelayMs: 500 }
);

const resilientAnthropic = new ResilientLLM(
  new AnthropicAdapter({ apiKey: process.env.ANTHROPIC_API_KEY! }),
  { maxRetries: 2, initialDelayMs: 500 }
);

// Chain resilient adapters with fallback
const llm = new FallbackLLM({
  adapters: [resilientOpenAI, resilientAnthropic],
  onFallback: (error, from, to) => console.warn(`Fallback: ${from} -> ${to}`),
});

const orka = createOrka({ llm, vectorDB: new MemoryVectorDB() });
```

This gives you: OpenAI (retry 1) → OpenAI (retry 2) → Anthropic (retry 1) → Anthropic (retry 2) → Success ✓
# Complete Production Example
```typescript
import { createOrka } from 'orkajs/core';
import { FallbackLLM, ResilientLLM } from 'orkajs/resilience';
import { OpenAIAdapter } from 'orkajs/adapters/openai';
import { AnthropicAdapter } from 'orkajs/adapters/anthropic';
import { OllamaAdapter } from 'orkajs/adapters/ollama';
import { MemoryVectorDB } from 'orkajs/adapters/memory';

// Production-grade resilient LLM setup
const llm = new FallbackLLM({
  adapters: [
    // Primary: OpenAI with retry
    new ResilientLLM(
      new OpenAIAdapter({
        apiKey: process.env.OPENAI_API_KEY!,
        model: 'gpt-4o-mini',
        timeoutMs: 30000,
      }),
      { maxRetries: 2, initialDelayMs: 1000, retryableErrors: ['429', '503', 'timeout'] }
    ),
    // Secondary: Anthropic with retry
    new ResilientLLM(
      new AnthropicAdapter({
        apiKey: process.env.ANTHROPIC_API_KEY!,
        model: 'claude-3-5-sonnet-20241022',
      }),
      { maxRetries: 2, initialDelayMs: 1000 }
    ),
    // Tertiary: Local fallback (always available)
    new OllamaAdapter({ model: 'llama3.2' }),
  ],
  onFallback: (error, from, to) => {
    console.error(`[RESILIENCE] Fallback ${from} -> ${to}: ${error.message}`);
    // Alert your monitoring system here
  },
});

const orka = createOrka({ llm, vectorDB: new MemoryVectorDB() });

// Your app is now resilient to:
// - Rate limits (retry with backoff)
// - Temporary outages (retry)
// - Provider outages (fallback to another provider)
// - Complete cloud failure (fallback to local Ollama)
```

# 💡 Production Tips
- Always configure at least 2 providers in FallbackLLM
- Use different cloud providers (OpenAI + Anthropic) for true redundancy
- Include a local Ollama adapter as a last resort (no API limits, always available)
- Monitor fallback events to detect provider issues early
- Set appropriate timeouts to fail fast and try next provider
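The last tip can be applied even to adapters that don't expose a `timeoutMs` option by racing the request against a timer. A minimal sketch (the `withTimeout` helper is illustrative, not part of Orka); rejecting with a `'timeout'` message also matches the `retryableErrors` examples above:

```typescript
// Reject a slow promise so a fallback chain can move on to the next provider
function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error('timeout')), ms);
    promise.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}

// Usage: give a provider call 5 seconds before giving up
// const answer = await withTimeout(llm.generate('Hello'), 5000);
```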
# Tree-shaking Imports
```typescript
// ✅ Import only what you need
import { ResilientLLM, FallbackLLM, withRetry } from 'orkajs/resilience';
```