
# Resilience

Build fault-tolerant AI applications with retry logic and multi-provider fallback chains.

# Why Resilience?

LLM APIs can fail due to rate limits, network issues, or provider outages. Resilience patterns ensure your application continues working even when individual API calls fail. Orka provides two key patterns: automatic retry with exponential backoff, and multi-provider fallback chains.
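Conceptually, retry with exponential backoff means the delay before retry *n* is `initialDelayMs × backoffMultiplier^n`, capped at `maxDelayMs`. A minimal generic sketch of the pattern (illustrative names, not Orka's internal implementation):

```typescript
// Compute the delay before a given retry attempt (0-based), capped at a maximum.
function backoffDelay(
  attempt: number,
  initialDelayMs: number,
  backoffMultiplier: number,
  maxDelayMs: number,
): number {
  return Math.min(initialDelayMs * Math.pow(backoffMultiplier, attempt), maxDelayMs);
}

// Retry an async function, sleeping with exponential backoff between attempts.
async function retryWithBackoff<T>(
  fn: () => Promise<T>,
  opts: { maxRetries: number; initialDelayMs: number; backoffMultiplier: number; maxDelayMs: number },
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= opts.maxRetries; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt === opts.maxRetries) break; // out of retries
      const delay = backoffDelay(attempt, opts.initialDelayMs, opts.backoffMultiplier, opts.maxDelayMs);
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
  throw lastError;
}
```

With `initialDelayMs: 1000` and `backoffMultiplier: 2`, the delays are 1s, 2s, 4s, and so on up to the cap. This is the pattern ResilientLLM packages for you.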

# ResilientLLM

ResilientLLM wraps any LLM adapter with automatic retry logic. It implements the LLMAdapter interface, so you can use it as a drop-in replacement.

```typescript
import { createOrka } from 'orkajs/core';
import { MemoryVectorDB } from 'orkajs/adapters/memory';
import { ResilientLLM } from 'orkajs/resilience';
import { OpenAIAdapter } from 'orkajs/adapters/openai';

const llm = new OpenAIAdapter({ apiKey: process.env.OPENAI_API_KEY! });

// Wrap with automatic retry
const resilientLLM = new ResilientLLM(llm, {
  maxRetries: 3,           // Maximum retry attempts
  initialDelayMs: 1000,    // First retry after 1 second
  backoffMultiplier: 2,    // Double the delay each retry: 1s, 2s, 4s
  maxDelayMs: 30000,       // Cap the delay at 30 seconds
  retryableErrors: [       // Only retry these errors
    'rate limit',
    '429',
    '503',
    'ECONNRESET',
    'timeout',
  ],
  onRetry: (error, attempt, delayMs) => {
    console.log(`Retry ${attempt}/3 in ${delayMs}ms: ${error.message}`);
  },
});

// Use it like any other LLM adapter
const result = await resilientLLM.generate('What is TypeScript?');

// Or pass it to Orka
const orka = createOrka({
  llm: resilientLLM,
  vectorDB: new MemoryVectorDB(),
});
```

## ResilientLLM Options

`maxRetries: number` (default: `3`)

Maximum number of retry attempts before giving up.

`initialDelayMs: number` (default: `1000`)

Delay in milliseconds before the first retry.

`backoffMultiplier: number` (default: `2`)

Multiply the delay by this factor after each retry; `2` gives exponential backoff.

`maxDelayMs: number`

Upper bound on the delay between retries, in milliseconds.

`retryableErrors: string[]`

Retry only if the error message contains one of these strings. Non-matching errors fail immediately.

`onRetry?: (error, attempt, delayMs) => void`

Callback fired before each retry. Use it for logging or monitoring.
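The `retryableErrors` check can be pictured as a simple substring match (an illustrative sketch of the behavior described above, not the library source):

```typescript
// An error is retryable only when its message contains one of the
// configured substrings; anything else should fail immediately.
function isRetryable(error: Error, retryableErrors: string[]): boolean {
  return retryableErrors.some((pattern) => error.message.includes(pattern));
}
```

This is why transient failures like `'429'` or `'timeout'` belong in the list, while permanent failures (e.g. an invalid API key) should be left out so they surface right away.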

# withRetry() Helper

For one-off retry logic on any async function, use the withRetry() helper:

```typescript
import { withRetry } from 'orkajs/resilience';

const result = await withRetry(
  () => orka.ask({ question: 'My question', knowledge: 'docs' }),
  {
    maxRetries: 3,
    initialDelayMs: 1000,
    backoffMultiplier: 2,
    maxDelayMs: 30000,
    retryableErrors: ['rate limit', '429', '503'],
    onRetry: (error, attempt) => {
      console.log(`Attempt ${attempt}: ${error.message}`);
    },
  },
);
```

# FallbackLLM — Multi-Provider Failover

FallbackLLM chains multiple LLM adapters together. If the primary fails, it automatically tries the next one in the chain. This provides redundancy across different providers.

```typescript
import { createOrka } from 'orkajs/core';
import { MemoryVectorDB } from 'orkajs/adapters/memory';
import { FallbackLLM } from 'orkajs/resilience';
import { OpenAIAdapter } from 'orkajs/adapters/openai';
import { AnthropicAdapter } from 'orkajs/adapters/anthropic';
import { OllamaAdapter } from 'orkajs/adapters/ollama';

const llm = new FallbackLLM({
  adapters: [
    // Primary: OpenAI (fastest, most reliable)
    new OpenAIAdapter({ apiKey: process.env.OPENAI_API_KEY! }),
    // Secondary: Anthropic (different provider for redundancy)
    new AnthropicAdapter({ apiKey: process.env.ANTHROPIC_API_KEY! }),
    // Tertiary: local Ollama (always available, no API limits)
    new OllamaAdapter({ model: 'llama3.2' }),
  ],
  onFallback: (error, failedAdapter, nextAdapter) => {
    console.warn(`⚠️ ${failedAdapter} failed: ${error.message}`);
    console.warn(`  Falling back to ${nextAdapter}`);
    // Notify your monitoring system here, e.g.:
    // alerting.send('llm_fallback', { from: failedAdapter, to: nextAdapter });
  },
});

const orka = createOrka({
  llm,
  vectorDB: new MemoryVectorDB(),
});

// If OpenAI fails, Orka automatically tries Anthropic, then Ollama
const answer = await orka.ask({ question: 'What is TypeScript?' });
```

## FallbackLLM Options

`adapters: LLMAdapter[]` (required)

Array of LLM adapters in priority order. The first adapter is tried first.

`onFallback?: (error, failedAdapter, nextAdapter) => void`

Callback fired when falling back to the next adapter. Use it for logging or alerting.
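The failover behavior described above can be sketched as a simple priority loop (illustrative only, with an index-based callback rather than the library's adapter-based signature):

```typescript
type Generate = (prompt: string) => Promise<string>;

// Try each adapter in priority order; move on when one throws,
// and rethrow the last error only if every adapter fails.
async function generateWithFallback(
  adapters: Generate[],
  prompt: string,
  onFallback?: (error: Error, failedIndex: number) => void,
): Promise<string> {
  let lastError: Error | undefined;
  for (let i = 0; i < adapters.length; i++) {
    try {
      return await adapters[i](prompt);
    } catch (err) {
      lastError = err as Error;
      if (i < adapters.length - 1) onFallback?.(lastError, i);
    }
  }
  throw lastError;
}
```

The key property: a request only fails if *every* adapter in the chain fails, and the callback fires once per handoff.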

# Combining Retry + Fallback

For maximum resilience, combine ResilientLLM (retry) with FallbackLLM (multi-provider). This gives you retry within each provider AND fallback across providers.

```typescript
import { createOrka } from 'orkajs/core';
import { MemoryVectorDB } from 'orkajs/adapters/memory';
import { FallbackLLM, ResilientLLM } from 'orkajs/resilience';
import { OpenAIAdapter } from 'orkajs/adapters/openai';
import { AnthropicAdapter } from 'orkajs/adapters/anthropic';

// Wrap each adapter with retry logic
const resilientOpenAI = new ResilientLLM(
  new OpenAIAdapter({ apiKey: process.env.OPENAI_API_KEY! }),
  { maxRetries: 2, initialDelayMs: 500 },
);

const resilientAnthropic = new ResilientLLM(
  new AnthropicAdapter({ apiKey: process.env.ANTHROPIC_API_KEY! }),
  { maxRetries: 2, initialDelayMs: 500 },
);

// Chain the resilient adapters with fallback
const llm = new FallbackLLM({
  adapters: [resilientOpenAI, resilientAnthropic],
  onFallback: (error, from, to) => console.warn(`Fallback: ${from} -> ${to}`),
});

const orka = createOrka({ llm, vectorDB: new MemoryVectorDB() });

// Worst case: OpenAI (retry 1) -> OpenAI (retry 2) -> Anthropic (retry 1) -> Anthropic (retry 2)
```

This gives you, in the worst case: OpenAI (retry 1) → OpenAI (retry 2) → Anthropic (retry 1) → Anthropic (retry 2). The first attempt that succeeds at any step short-circuits the rest of the chain. ✓

# Complete Production Example

resilience-example.ts
```typescript
import { createOrka } from 'orkajs/core';
import { FallbackLLM, ResilientLLM } from 'orkajs/resilience';
import { OpenAIAdapter } from 'orkajs/adapters/openai';
import { AnthropicAdapter } from 'orkajs/adapters/anthropic';
import { OllamaAdapter } from 'orkajs/adapters/ollama';
import { MemoryVectorDB } from 'orkajs/adapters/memory';

// Production-grade resilient LLM setup
const llm = new FallbackLLM({
  adapters: [
    // Primary: OpenAI with retry
    new ResilientLLM(
      new OpenAIAdapter({
        apiKey: process.env.OPENAI_API_KEY!,
        model: 'gpt-4o-mini',
        timeoutMs: 30000,
      }),
      { maxRetries: 2, initialDelayMs: 1000, retryableErrors: ['429', '503', 'timeout'] },
    ),
    // Secondary: Anthropic with retry
    new ResilientLLM(
      new AnthropicAdapter({
        apiKey: process.env.ANTHROPIC_API_KEY!,
        model: 'claude-3-5-sonnet-20241022',
      }),
      { maxRetries: 2, initialDelayMs: 1000 },
    ),
    // Tertiary: local fallback (always available)
    new OllamaAdapter({ model: 'llama3.2' }),
  ],
  onFallback: (error, from, to) => {
    console.error(`[RESILIENCE] Fallback ${from} -> ${to}: ${error.message}`);
    // Alert your monitoring system here
  },
});

const orka = createOrka({ llm, vectorDB: new MemoryVectorDB() });

// Your app is now resilient to:
// - Rate limits (retry with backoff)
// - Temporary outages (retry)
// - Provider outages (fallback to another provider)
// - Complete cloud failure (fallback to local Ollama)
```

💡 Production Tips

  • Always configure at least 2 providers in FallbackLLM
  • Use different cloud providers (OpenAI + Anthropic) for true redundancy
  • Include a local Ollama as last resort (no API limits, always available)
  • Monitor fallback events to detect provider issues early
  • Set appropriate timeouts to fail fast and try next provider
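If an adapter has no built-in timeout option, one way to fail fast is a `Promise.race`-based wrapper. The helper below is hypothetical (not part of Orka's API; `OpenAIAdapter`'s `timeoutMs` option covers the common case):

```typescript
// Reject a slow promise after timeoutMs so a fallback chain can move on
// to the next provider instead of hanging on a stalled request.
function withTimeout<T>(promise: Promise<T>, timeoutMs: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`timeout after ${timeoutMs}ms`)),
      timeoutMs,
    );
  });
  // Whichever settles first wins; always cancel the timer to avoid leaks
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}
```

Pair this with a `'timeout'` entry in `retryableErrors` so slow calls are retried (or handed to the next provider) rather than treated as fatal.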

# Tree-shaking Imports

```typescript
// ✅ Import only what you need
import { ResilientLLM, FallbackLLM, withRetry } from 'orkajs/resilience';
```