Resilience & Fallback
Build fault-tolerant AI systems with automatic retry and multi-provider fallback.
Production AI systems need to handle API failures gracefully. Orka provides built-in retry logic and fallback chains.
ORKA RESILIENCE ARCHITECTURE

```text
orka.generate(prompt)
        │
   withRetry()            maxRetries: 3 | backoff: exponential (1s → 2s → 4s)
        │
   FallbackLLM
        │
   OpenAI (primary)
        │ if fails
   Anthropic (fallback 1)
        │ if fails
   Ollama (local backup)
        │
   ✅ Response

Retryable errors: rate limit, timeout, 429, 500, 503
```
1. FallbackLLM - Multi-Provider Chain
Automatically switch to backup providers when the primary fails:
fallback-llm.ts
```ts
import {
  createOrka,
  OpenAIAdapter,
  AnthropicAdapter,
  OllamaAdapter,
  MemoryVectorAdapter,
  FallbackLLM,
} from 'orkajs';

// Fallback chain: OpenAI → Anthropic → Ollama (local)
const llm = new FallbackLLM({
  adapters: [
    new OpenAIAdapter({ apiKey: process.env.OPENAI_API_KEY! }),
    new AnthropicAdapter({ apiKey: process.env.ANTHROPIC_API_KEY! }),
    new OllamaAdapter({ model: 'llama3.2' }), // Local backup
  ],
  onFallback: (error, failedAdapter, nextAdapter) => {
    console.log(`⚠️ ${failedAdapter} failed (${error.message}), falling back to ${nextAdapter}`);
  },
});

const orka = createOrka({
  llm,
  vectorDB: new MemoryVectorAdapter(),
});

// If OpenAI fails, Anthropic automatically takes over.
const result = await orka.generate('Explain TypeScript in one sentence.');
console.log(`Response: ${result}`);
```

2. withRetry - Exponential Backoff
Automatically retry failed requests with exponential backoff:
with-retry.ts
```ts
import { createOrka, OpenAIAdapter, MemoryVectorAdapter, withRetry } from 'orkajs';

const orka = createOrka({
  llm: new OpenAIAdapter({ apiKey: process.env.OPENAI_API_KEY! }),
  vectorDB: new MemoryVectorAdapter(),
});

// Automatic retry with exponential backoff
const result = await withRetry(
  () => orka.ask({ question: 'Explain TypeScript in one sentence.' }),
  {
    maxRetries: 3,
    initialDelayMs: 1000, // First delay: 1s
    backoffMultiplier: 2, // Multiplier: 1s → 2s → 4s
    maxDelayMs: 30000,    // Max delay: 30s
    retryableErrors: ['rate limit', 'timeout', '429', '500', '503'],
    onRetry: (error, attempt) => {
      console.log(`🔄 Retry ${attempt}: ${error.message}`);
    },
  },
);

console.log(`✅ Response: ${result.answer}`);
console.log(`📊 Tokens: ${result.usage.totalTokens}, Latency: ${result.latencyMs}ms`);
```

3. ResilientLLM Wrapper
Wrap any LLM adapter with automatic retry capabilities:
resilient-llm.ts
```ts
import { createOrka, OpenAIAdapter, MemoryVectorAdapter, ResilientLLM } from 'orkajs';

// Wrapper that adds automatic retry to any LLM adapter
const llm = new ResilientLLM(
  new OpenAIAdapter({ apiKey: process.env.OPENAI_API_KEY! }),
  {
    maxRetries: 3,
    initialDelayMs: 1000,
    backoffMultiplier: 2,
    retryableErrors: ['rate limit', 'timeout', '429', '500', '503'],
    onRetry: (error, attempt) => {
      console.log(`🔄 Retry ${attempt}: ${error.message}`);
    },
  },
);

const orka = createOrka({ llm, vectorDB: new MemoryVectorAdapter() });

// All requests automatically benefit from retry
const result = await orka.generate('Hello world');
```

4. Complete Example
resilience-complete.ts
```ts
import {
  createOrka,
  OpenAIAdapter,
  AnthropicAdapter,
  OllamaAdapter,
  MemoryVectorAdapter,
  FallbackLLM,
  withRetry,
} from 'orkajs';

async function main() {
  // Configuration: multi-provider fallback
  const llm = new FallbackLLM({
    adapters: [
      new OpenAIAdapter({ apiKey: process.env.OPENAI_API_KEY! }),
      new AnthropicAdapter({ apiKey: process.env.ANTHROPIC_API_KEY! }),
      new OllamaAdapter({ model: 'llama3.2' }),
    ],
    onFallback: (error, failed, next) => {
      console.log(`⚠️ ${failed} failed (${error.message}), falling back to ${next}`);
    },
  });

  const orka = createOrka({
    llm,
    vectorDB: new MemoryVectorAdapter(),
  });

  // Combine fallback + retry for maximum resilience
  const result = await withRetry(
    () => orka.ask({ question: 'Explain TypeScript in one sentence.' }),
    {
      maxRetries: 3,
      initialDelayMs: 1000,
      backoffMultiplier: 2,
      retryableErrors: ['rate limit', 'timeout', '429', '500', '503'],
      onRetry: (error, attempt) => {
        console.log(`🔄 Retry ${attempt}: ${error.message}`);
      },
    },
  );

  console.log(`✅ Response: ${result.answer}`);
  console.log(`📊 Tokens: ${result.usage.totalTokens}, Latency: ${result.latencyMs}ms`);
}

main().catch(console.error);
```

Best Practices
✅ Use Exponential Backoff
Avoid hammering APIs with immediate retries. Use increasing delays between attempts.
✅ Include a Local Fallback
Add Ollama as a last resort for complete independence from hosted APIs.
✅ Log Fallback Events
Monitor which providers fail, and how often, for capacity planning.
✅ Set Timeouts
Configure `timeoutMs` on adapters to fail fast and trigger the fallback chain.
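The delay schedule used throughout this page (1s → 2s → 4s, capped at 30s) follows the standard exponential-backoff formula. As a minimal sketch of how those delays are computed (the `backoffDelay` helper is illustrative, not an orkajs export):

```typescript
interface BackoffOptions {
  initialDelayMs: number;    // delay before the first retry
  backoffMultiplier: number; // factor applied on each subsequent retry
  maxDelayMs: number;        // hard cap on any single delay
}

function backoffDelay(attempt: number, opts: BackoffOptions): number {
  // attempt 1 → initialDelayMs, attempt 2 → initialDelayMs × multiplier, ...
  const raw = opts.initialDelayMs * Math.pow(opts.backoffMultiplier, attempt - 1);
  return Math.min(raw, opts.maxDelayMs);
}

const opts = { initialDelayMs: 1000, backoffMultiplier: 2, maxDelayMs: 30000 };
console.log([1, 2, 3].map((a) => backoffDelay(a, opts))); // → [1000, 2000, 4000]
```

The cap matters: without `maxDelayMs`, attempt 6 would already wait 32 seconds; with it, every delay stays within a predictable bound. Many production systems also add random jitter to these delays to avoid synchronized retry storms.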