Resilience & Fallback
Build fault-tolerant AI systems with automatic retry and multi-provider fallback.
Production AI systems need to handle API failures gracefully. Orka provides built-in retry logic and fallback chains.
ORKA RESILIENCE ARCHITECTURE

```text
orka.generate(prompt)
        │
   withRetry()            maxRetries: 3 | backoff: exponential (1s → 2s → 4s)
        │
   FallbackLLM
        │
   OpenAI (primary)
        │ if fails
   Anthropic (fallback 1)
        │ if fails
   Ollama (local backup)
        │
   ✅ Response

Retryable errors: rate limit, timeout, 429, 500, 503
```
1. FallbackLLM - Multi-Provider Chain
Automatically switch to backup providers when the primary fails:
fallback-llm.ts
```ts
import {
  createOrka,
  OpenAIAdapter,
  AnthropicAdapter,
  OllamaAdapter,
  MemoryVectorAdapter,
  FallbackLLM,
} from 'orkajs';

// Fallback chain: OpenAI → Anthropic → Ollama (local)
const llm = new FallbackLLM({
  adapters: [
    new OpenAIAdapter({ apiKey: process.env.OPENAI_API_KEY! }),
    new AnthropicAdapter({ apiKey: process.env.ANTHROPIC_API_KEY! }),
    new OllamaAdapter({ model: 'llama3.2' }), // Local backup
  ],
  onFallback: (error, failedAdapter, nextAdapter) => {
    console.log(`⚠️ ${failedAdapter} failed (${error.message}), falling back to ${nextAdapter}`);
  },
});

const orka = createOrka({
  llm,
  vectorDB: new MemoryVectorAdapter(),
});

// If OpenAI fails, Anthropic automatically takes over.
const result = await orka.generate('Explain TypeScript in one sentence.');
console.log(`Response: ${result}`);
```

2. withRetry - Exponential Backoff
Automatically retry failed requests with exponential backoff:
with-retry.ts
```ts
import { createOrka, OpenAIAdapter, MemoryVectorAdapter, withRetry } from 'orkajs';

const orka = createOrka({
  llm: new OpenAIAdapter({ apiKey: process.env.OPENAI_API_KEY! }),
  vectorDB: new MemoryVectorAdapter(),
});

// Automatic retry with exponential backoff
const result = await withRetry(
  () => orka.ask({ question: 'Explain TypeScript in one sentence.' }),
  {
    maxRetries: 3,
    initialDelayMs: 1000, // First delay: 1s
    backoffMultiplier: 2, // Multiplier: 1s → 2s → 4s
    maxDelayMs: 30000,    // Max delay: 30s
    retryableErrors: ['rate limit', 'timeout', '429', '500', '503'],
    onRetry: (error, attempt) => {
      console.log(`🔄 Retry ${attempt}: ${error.message}`);
    },
  },
);

console.log(`✅ Response: ${result.answer}`);
console.log(`📊 Tokens: ${result.usage.totalTokens}, Latency: ${result.latencyMs}ms`);
```

3. ResilientLLM Wrapper
Wrap any LLM adapter with automatic retry capabilities:
resilient-llm.ts
```ts
import { createOrka, OpenAIAdapter, MemoryVectorAdapter, ResilientLLM } from 'orkajs';

// Wrapper that adds automatic retry to any LLM adapter
const llm = new ResilientLLM(
  new OpenAIAdapter({ apiKey: process.env.OPENAI_API_KEY! }),
  {
    maxRetries: 3,
    initialDelayMs: 1000,
    backoffMultiplier: 2,
    retryableErrors: ['rate limit', 'timeout', '429', '500', '503'],
    onRetry: (error, attempt) => {
      console.log(`🔄 Retry ${attempt}: ${error.message}`);
    },
  },
);

const orka = createOrka({ llm, vectorDB: new MemoryVectorAdapter() });

// All requests automatically benefit from retry
const result = await orka.generate('Hello world');
```

4. Complete Example
resilience-complete.ts
```ts
import {
  createOrka,
  OpenAIAdapter,
  AnthropicAdapter,
  OllamaAdapter,
  MemoryVectorAdapter,
  FallbackLLM,
  withRetry,
} from 'orkajs';

async function main() {
  // Configuration: multi-provider fallback
  const llm = new FallbackLLM({
    adapters: [
      new OpenAIAdapter({ apiKey: process.env.OPENAI_API_KEY! }),
      new AnthropicAdapter({ apiKey: process.env.ANTHROPIC_API_KEY! }),
      new OllamaAdapter({ model: 'llama3.2' }),
    ],
    onFallback: (error, failed, next) => {
      console.log(`⚠️ ${failed} failed (${error.message}), falling back to ${next}`);
    },
  });

  const orka = createOrka({
    llm,
    vectorDB: new MemoryVectorAdapter(),
  });

  // Combine fallback + retry for maximum resilience
  const result = await withRetry(
    () => orka.ask({ question: 'Explain TypeScript in one sentence.' }),
    {
      maxRetries: 3,
      initialDelayMs: 1000,
      backoffMultiplier: 2,
      retryableErrors: ['rate limit', 'timeout', '429', '500', '503'],
      onRetry: (error, attempt) => {
        console.log(`🔄 Retry ${attempt}: ${error.message}`);
      },
    },
  );

  console.log(`✅ Response: ${result.answer}`);
  console.log(`📊 Tokens: ${result.usage.totalTokens}, Latency: ${result.latencyMs}ms`);
}

main().catch(console.error);
```

Best Practices
✅ Use Exponential Backoff
Avoid hammering APIs with immediate retries. Use increasing delays between attempts.
✅ Include a Local Fallback
Add Ollama as a last resort for complete independence from hosted APIs.
✅ Log Fallback Events
Monitor which providers fail, and how often, for capacity planning.
✅ Set Timeouts
Configure `timeoutMs` on adapters to fail fast and trigger the fallback chain.
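The delay schedule used throughout this page (1s → 2s → 4s, capped at 30s) follows the standard exponential-backoff formula. As a minimal sketch of how those delays are computed (the `backoffDelay` helper is illustrative, not an orkajs export):

```typescript
interface BackoffOptions {
  initialDelayMs: number;    // delay before the first retry
  backoffMultiplier: number; // factor applied on each subsequent retry
  maxDelayMs: number;        // hard cap on any single delay
}

function backoffDelay(attempt: number, opts: BackoffOptions): number {
  // attempt 1 → initialDelayMs, attempt 2 → initialDelayMs × multiplier, ...
  const raw = opts.initialDelayMs * Math.pow(opts.backoffMultiplier, attempt - 1);
  return Math.min(raw, opts.maxDelayMs);
}

const opts = { initialDelayMs: 1000, backoffMultiplier: 2, maxDelayMs: 30000 };
console.log([1, 2, 3].map((a) => backoffDelay(a, opts))); // → [1000, 2000, 4000]
```

The cap matters: without `maxDelayMs`, attempt 6 would already wait 32 seconds; with it, every delay stays within a predictable bound. Many production systems also add random jitter to these delays to avoid synchronized retry storms.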