Resilience
Build fault-tolerant AI applications with retry logic and multi-provider fallback chains.
# Why Resilience?
LLM APIs can fail due to rate limits, network issues, or provider outages. Resilience patterns ensure your application continues working even when individual API calls fail. Orka provides two key patterns: automatic retry with exponential backoff, and multi-provider fallback chains.
# ResilientLLM
ResilientLLM wraps any LLM adapter with automatic retry logic. It implements the LLMAdapter interface, so you can use it as a drop-in replacement.
```typescript
import { createOrka } from 'orkajs/core';
import { ResilientLLM } from 'orkajs/resilience';
import { OpenAIAdapter } from 'orkajs/adapters/openai';
import { MemoryVectorDB } from 'orkajs/adapters/memory';

const llm = new OpenAIAdapter({ apiKey: process.env.OPENAI_API_KEY! });

// Wrap with automatic retry
const resilientLLM = new ResilientLLM(llm, {
  maxRetries: 3,          // Maximum retry attempts
  initialDelayMs: 1000,   // First retry after 1 second
  backoffMultiplier: 2,   // Double delay each retry: 1s, 2s, 4s
  maxDelayMs: 30000,      // Cap delay at 30 seconds
  retryableErrors: [      // Only retry these errors
    'rate limit', '429', '503', 'ECONNRESET', 'timeout'
  ],
  onRetry: (error, attempt, delayMs) => {
    console.log(`Retry ${attempt}/3 in ${delayMs}ms: ${error.message}`);
  },
});

// Use as a normal LLM
const result = await resilientLLM.generate('What is TypeScript?');

// Or use with Orka
const orka = createOrka({
  llm: resilientLLM,
  vectorDB: new MemoryVectorDB(),
});
```

## ResilientLLM Options
- `maxRetries: number` (default: `3`): Maximum number of retry attempts before giving up.
- `initialDelayMs: number` (default: `1000`): Delay before the first retry, in milliseconds.
- `backoffMultiplier: number` (default: `2`): Multiply the delay by this factor after each retry; `2` gives exponential backoff.
- `maxDelayMs: number`: Upper bound on the delay between retries, in milliseconds.
- `retryableErrors: string[]`: Only retry if the error message contains one of these strings. Non-matching errors fail immediately.
- `onRetry?: (error, attempt, delayMs) => void`: Callback fired before each retry. Use it for logging or monitoring.
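Taken together, these options define a simple delay schedule: the delay starts at `initialDelayMs`, is multiplied by `backoffMultiplier` after each retry, and is capped at `maxDelayMs`. A minimal sketch of that arithmetic (the function name is illustrative, not part of Orka's API, and the internals may differ):

```typescript
// Sketch: the retry delay for the n-th retry, derived from the options above
function retryDelayMs(
  attempt: number,            // 1-based retry attempt
  initialDelayMs = 1000,
  backoffMultiplier = 2,
  maxDelayMs = 30000,
): number {
  const delay = initialDelayMs * Math.pow(backoffMultiplier, attempt - 1);
  return Math.min(delay, maxDelayMs);
}

// With the defaults, the schedule is 1s, 2s, 4s, ... capped at 30s:
const schedule = [1, 2, 3, 4, 5, 6].map((n) => retryDelayMs(n));
// → [1000, 2000, 4000, 8000, 16000, 30000]
```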
# withRetry() Helper
For one-off retry logic on any async function, use the withRetry() helper:
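Conceptually, such a helper is a loop that calls the function, checks whether the failure is retryable, sleeps with exponential backoff, and re-throws once the attempts are exhausted. A hypothetical sketch under those assumptions (not Orka's actual source; the real helper is called as shown below):

```typescript
// Illustrative option shape, mirroring the documented options
interface RetryOptions {
  maxRetries?: number;
  initialDelayMs?: number;
  backoffMultiplier?: number;
  maxDelayMs?: number;
  retryableErrors?: string[];
  onRetry?: (error: Error, attempt: number, delayMs: number) => void;
}

const sleep = (ms: number) => new Promise((res) => setTimeout(res, ms));

async function withRetrySketch<T>(fn: () => Promise<T>, opts: RetryOptions = {}): Promise<T> {
  const {
    maxRetries = 3,
    initialDelayMs = 1000,
    backoffMultiplier = 2,
    maxDelayMs = 30000,
    retryableErrors,
    onRetry,
  } = opts;

  let delay = initialDelayMs;
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      const error = err as Error;
      // Non-matching errors fail immediately; exhausted attempts re-throw
      const retryable =
        !retryableErrors || retryableErrors.some((s) => error.message.includes(s));
      if (!retryable || attempt > maxRetries) throw error;
      onRetry?.(error, attempt, delay);
      await sleep(delay);
      delay = Math.min(delay * backoffMultiplier, maxDelayMs);
    }
  }
}
```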
```typescript
import { withRetry } from 'orkajs/resilience';

const result = await withRetry(
  () => orka.ask({ question: 'My question', knowledge: 'docs' }),
  {
    maxRetries: 3,
    initialDelayMs: 1000,
    backoffMultiplier: 2,
    maxDelayMs: 30000,
    retryableErrors: ['rate limit', '429', '503'],
    onRetry: (error, attempt) => {
      console.log(`Attempt ${attempt}: ${error.message}`);
    },
  },
);
```

# FallbackLLM: Multi-Provider Failover
FallbackLLM chains multiple LLM adapters together. If the primary fails, it automatically tries the next one in the chain. This provides redundancy across different providers.
```typescript
import { createOrka } from 'orkajs/core';
import { FallbackLLM } from 'orkajs/resilience';
import { OpenAIAdapter } from 'orkajs/adapters/openai';
import { AnthropicAdapter } from 'orkajs/adapters/anthropic';
import { OllamaAdapter } from 'orkajs/adapters/ollama';
import { MemoryVectorDB } from 'orkajs/adapters/memory';

const llm = new FallbackLLM({
  adapters: [
    // Primary: OpenAI (fastest, most reliable)
    new OpenAIAdapter({ apiKey: process.env.OPENAI_API_KEY! }),
    // Secondary: Anthropic (different provider for redundancy)
    new AnthropicAdapter({ apiKey: process.env.ANTHROPIC_API_KEY! }),
    // Tertiary: Local Ollama (always available, no API limits)
    new OllamaAdapter({ model: 'llama3.2' }),
  ],
  onFallback: (error, failedAdapter, nextAdapter) => {
    console.warn(`⚠️ ${failedAdapter} failed: ${error.message}`);
    console.warn(`   Falling back to ${nextAdapter}`);
    // Send alert to your monitoring system ("alerting" is a placeholder
    // for your own monitoring client)
    alerting.send('llm_fallback', { from: failedAdapter, to: nextAdapter });
  },
});

const orka = createOrka({ llm, vectorDB: new MemoryVectorDB() });

// If OpenAI fails, automatically tries Anthropic, then Ollama
const answer = await orka.ask({ question: 'What is TypeScript?' });
```

## FallbackLLM Options
- `adapters: LLMAdapter[]` (required): Array of LLM adapters in priority order. The first adapter is tried first.
- `onFallback?: (error, failedAdapter, nextAdapter) => void`: Callback fired when falling back to the next adapter. Use it for logging or alerting.
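Internally, a fallback chain is just a loop over the adapters array that re-throws the last error once every provider has failed. A hypothetical sketch of that loop (not the library's actual source; `LLMAdapter` is reduced here to the single `generate()` method the examples use, plus an illustrative `name` field):

```typescript
// Minimal adapter shape assumed for this sketch
interface LLMAdapter {
  name: string;
  generate(prompt: string): Promise<string>;
}

async function generateWithFallback(
  adapters: LLMAdapter[],
  prompt: string,
  onFallback?: (error: Error, failed: string, next: string) => void,
): Promise<string> {
  let lastError: Error | undefined;
  for (let i = 0; i < adapters.length; i++) {
    try {
      // First adapter in the array is tried first
      return await adapters[i].generate(prompt);
    } catch (err) {
      lastError = err as Error;
      const next = adapters[i + 1];
      if (next) onFallback?.(lastError, adapters[i].name, next.name);
    }
  }
  // Every adapter failed: surface the last error to the caller
  throw lastError ?? new Error('No adapters configured');
}
```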
# Combining Retry + Fallback
For maximum resilience, combine ResilientLLM (retry) with FallbackLLM (multi-provider). This gives you retry within each provider AND fallback across providers.
```typescript
import { createOrka } from 'orkajs/core';
import { FallbackLLM, ResilientLLM } from 'orkajs/resilience';
import { OpenAIAdapter } from 'orkajs/adapters/openai';
import { AnthropicAdapter } from 'orkajs/adapters/anthropic';
import { MemoryVectorDB } from 'orkajs/adapters/memory';

// Wrap each adapter with retry logic
const resilientOpenAI = new ResilientLLM(
  new OpenAIAdapter({ apiKey: process.env.OPENAI_API_KEY! }),
  { maxRetries: 2, initialDelayMs: 500 }
);

const resilientAnthropic = new ResilientLLM(
  new AnthropicAdapter({ apiKey: process.env.ANTHROPIC_API_KEY! }),
  { maxRetries: 2, initialDelayMs: 500 }
);

// Chain resilient adapters with fallback
const llm = new FallbackLLM({
  adapters: [resilientOpenAI, resilientAnthropic],
  onFallback: (error, from, to) => console.warn(`Fallback: ${from} -> ${to}`),
});

const orka = createOrka({ llm, vectorDB: new MemoryVectorDB() });
```

This gives you: OpenAI (retry 1) → OpenAI (retry 2) → Anthropic (retry 1) → Anthropic (retry 2) → Success ✓
# Complete Production Example
```typescript
import { createOrka } from 'orkajs/core';
import { FallbackLLM, ResilientLLM } from 'orkajs/resilience';
import { OpenAIAdapter } from 'orkajs/adapters/openai';
import { AnthropicAdapter } from 'orkajs/adapters/anthropic';
import { OllamaAdapter } from 'orkajs/adapters/ollama';
import { MemoryVectorDB } from 'orkajs/adapters/memory';

// Production-grade resilient LLM setup
const llm = new FallbackLLM({
  adapters: [
    // Primary: OpenAI with retry
    new ResilientLLM(
      new OpenAIAdapter({
        apiKey: process.env.OPENAI_API_KEY!,
        model: 'gpt-4o-mini',
        timeoutMs: 30000,
      }),
      { maxRetries: 2, initialDelayMs: 1000, retryableErrors: ['429', '503', 'timeout'] }
    ),
    // Secondary: Anthropic with retry
    new ResilientLLM(
      new AnthropicAdapter({
        apiKey: process.env.ANTHROPIC_API_KEY!,
        model: 'claude-3-5-sonnet-20241022',
      }),
      { maxRetries: 2, initialDelayMs: 1000 }
    ),
    // Tertiary: Local fallback (always available)
    new OllamaAdapter({ model: 'llama3.2' }),
  ],
  onFallback: (error, from, to) => {
    console.error(`[RESILIENCE] Fallback ${from} -> ${to}: ${error.message}`);
    // Alert your monitoring system here
  },
});

const orka = createOrka({ llm, vectorDB: new MemoryVectorDB() });

// Your app is now resilient to:
// - Rate limits (retry with backoff)
// - Temporary outages (retry)
// - Provider outages (fallback to another provider)
// - Complete cloud failure (fallback to local Ollama)
```

# 💡 Production Tips
- Always configure at least 2 providers in FallbackLLM
- Use different cloud providers (OpenAI + Anthropic) for true redundancy
- Include a local Ollama adapter as a last resort (no API limits, always available)
- Monitor fallback events to detect provider issues early
- Set appropriate timeouts to fail fast and try next provider
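The last tip can be applied even to adapters that don't expose a `timeoutMs` option by racing the request against a timer. A minimal sketch (the `withTimeout` helper is illustrative, not part of Orka); rejecting with a `'timeout'` message also matches the `retryableErrors` examples above:

```typescript
// Reject a slow promise so a fallback chain can move on to the next provider
function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error('timeout')), ms);
    promise.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}

// Usage: give a provider call 5 seconds before giving up
// const answer = await withTimeout(llm.generate('Hello'), 5000);
```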
# Tree-shaking Imports
```typescript
// ✅ Import only what you need
import { ResilientLLM, FallbackLLM, withRetry } from 'orkajs/resilience';
```