OrkaJS

Multi-Model Orchestration

Route, race, balance, and combine multiple LLM providers intelligently.

Orka's orchestration layer lets you combine multiple LLM providers for optimal cost, latency, and quality.

At a glance:

- RouterLLM — route by condition (simple → claude-sonnet-4.5, complex → claude-opus-4.5)
- ConsensusLLM — best of N models (each adapter answers; a judge picks the best score)
- RaceLLM — fastest wins (e.g. claude-opus-4.5 answers in 120 ms, claude-sonnet-4.5 is cancelled)
- LoadBalancerLLM — distribute load (round_robin | weighted | random, e.g. A: 50% / B: 50%)

1. RouterLLM - Smart Routing

Route requests to different models based on complexity, content, or custom conditions:

router-llm.ts
import { createOrka, AnthropicAdapter, MemoryVectorAdapter, RouterLLM } from 'orkajs';

const fast = new AnthropicAdapter({ apiKey: process.env.ANTHROPIC_API_KEY!, model: 'claude-sonnet-4.5' });
const smart = new AnthropicAdapter({ apiKey: process.env.ANTHROPIC_API_KEY!, model: 'claude-opus-4.5' });

const llm = new RouterLLM({
  routes: [
    {
      name: 'complex',
      condition: (prompt) => {
        const p = prompt.toLowerCase();
        return p.length > 500 || p.includes('analyze') || p.includes('compare');
      },
      adapter: smart,
    },
    {
      name: 'code',
      condition: (prompt) => prompt.includes('code') || prompt.includes('function') || prompt.includes('```'),
      adapter: smart,
    },
  ],
  defaultAdapter: fast, // Fallback for simple queries
});

const orka = createOrka({ llm, vectorDB: new MemoryVectorAdapter() });

// Simple query → claude-sonnet-4.5
const simple = await orka.generate('Say hello in one sentence.');
console.log('Simple: ' + simple);

// Complex query → claude-opus-4.5 (matches the 'analyze' keyword)
const complex = await orka.generate('Analyze the advantages and disadvantages of TypeScript vs JavaScript.');
console.log('Complex: ' + complex);
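Route conditions are plain predicates, so shared matching logic can be factored out. The helper below is purely illustrative (it is not an orkajs export): a factory that builds case-insensitive keyword conditions reusable across routes.

```typescript
// keywordCondition is a hypothetical helper, not part of orkajs.
// It returns a predicate suitable for a RouterLLM route condition.
const keywordCondition =
  (...keywords: string[]) =>
  (prompt: string): boolean => {
    const p = prompt.toLowerCase();
    return keywords.some((k) => p.includes(k.toLowerCase()));
  };

// Example: matches "Please ANALYZE this" and "compare A vs B".
const isComplex = keywordCondition('analyze', 'compare');
```

This keeps each route declaration to one line while making the matching case-insensitive by default.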

2. ConsensusLLM - Best of N

Query multiple models and use a judge to select the best response:

consensus-llm.ts
import { ConsensusLLM, AnthropicAdapter, MemoryVectorAdapter, createOrka } from 'orkajs';

const llm = new ConsensusLLM({
  adapters: [
    new AnthropicAdapter({ apiKey: process.env.ANTHROPIC_API_KEY!, model: 'claude-sonnet-4.5' }),
    new AnthropicAdapter({ apiKey: process.env.ANTHROPIC_API_KEY!, model: 'claude-opus-4.5' }),
  ],
  strategy: 'best_score', // 'best_score' | 'majority_vote' | 'first_valid'
  judge: new AnthropicAdapter({ apiKey: process.env.ANTHROPIC_API_KEY!, model: 'claude-opus-4.5' }),
});

const orka = createOrka({ llm, vectorDB: new MemoryVectorAdapter() });

// Both models generate results, and the judge chooses the best answer.
const result = await orka.generate('Explain recursion in programming.');
console.log('Consensus: ' + result.slice(0, 200) + '...');
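For intuition, the 'majority_vote' strategy reduces to counting normalized answers and keeping the most frequent one. This sketch is conceptual only and is not the actual ConsensusLLM implementation:

```typescript
// Conceptual sketch of majority voting over model responses.
// orkajs's real ConsensusLLM internals may normalize and score differently.
function majorityVote(responses: string[]): string {
  const counts = new Map<string, number>();
  for (const r of responses) {
    const key = r.trim().toLowerCase(); // normalize before comparing
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  let best = responses[0];
  let bestCount = 0;
  for (const r of responses) {
    const c = counts.get(r.trim().toLowerCase()) ?? 0;
    if (c > bestCount) {
      bestCount = c;
      best = r; // return the original, unnormalized response
    }
  }
  return best;
}
```

'best_score' replaces the counting step with a judge model that scores each candidate.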

3. RaceLLM - Fastest Wins

Race multiple models and return the first response:

race-llm.ts
import { RaceLLM, AnthropicAdapter, MemoryVectorAdapter, createOrka } from 'orkajs';

const llm = new RaceLLM({
  adapters: [
    new AnthropicAdapter({ apiKey: process.env.ANTHROPIC_API_KEY!, model: 'claude-sonnet-4.5' }),
    new AnthropicAdapter({ apiKey: process.env.ANTHROPIC_API_KEY!, model: 'claude-opus-4.5' }),
  ],
  timeout: 10000, // Global timeout in ms
});

const orka = createOrka({ llm, vectorDB: new MemoryVectorAdapter() });

// The first model to respond wins; the others are cancelled.
const result = await orka.generate('What is TypeScript?');
console.log('Race winner: ' + result.slice(0, 200) + '...');
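Conceptually, a race with cancellation is Promise.race plus an AbortController that tells the losers to stop. This is a standalone sketch of the pattern, not RaceLLM's actual internals:

```typescript
// Minimal "fastest wins" sketch: race async tasks sharing one abort signal,
// with a global timeout. Illustrative only; not the orkajs implementation.
async function raceTasks<T>(
  tasks: Array<(signal: AbortSignal) => Promise<T>>,
  timeoutMs: number,
): Promise<T> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    // The first promise to settle wins the race.
    return await Promise.race(tasks.map((t) => t(controller.signal)));
  } finally {
    controller.abort(); // signal the losers to cancel their work
    clearTimeout(timer);
  }
}
```

Passing the shared signal into each task is what allows in-flight HTTP requests to be aborted rather than merely ignored.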

4. LoadBalancerLLM - Distribute Load

Distribute requests across multiple model instances:

load-balancer-llm.ts
import { LoadBalancerLLM, AnthropicAdapter, MemoryVectorAdapter, createOrka } from 'orkajs';

const lb = new LoadBalancerLLM({
  adapters: [
    new AnthropicAdapter({ apiKey: process.env.ANTHROPIC_API_KEY!, model: 'claude-sonnet-4.5' }),
    new AnthropicAdapter({ apiKey: process.env.ANTHROPIC_API_KEY!, model: 'claude-sonnet-4.5' }),
  ],
  strategy: 'round_robin', // 'round_robin' | 'weighted' | 'random'
});

const orka = createOrka({ llm: lb, vectorDB: new MemoryVectorAdapter() });

// Requests are distributed round-robin across the adapters.
for (let i = 0; i < 4; i++) {
  await orka.generate('Question ' + (i + 1));
}

// Usage statistics
console.log('Stats:', lb.getStats());
// { adapter_0: { requests: 2, avgLatency: 150 }, adapter_1: { requests: 2, avgLatency: 145 } }
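Round-robin itself is just a cursor that wraps around the adapter list. As a standalone sketch (not orkajs code; LoadBalancerLLM also implements 'weighted' and 'random' internally):

```typescript
// Illustrative round-robin selector: each call returns the next item,
// wrapping back to the start after the last one.
function roundRobin<T>(items: T[]): () => T {
  let i = 0;
  return () => {
    const item = items[i % items.length];
    i += 1;
    return item;
  };
}

const next = roundRobin(['adapter_0', 'adapter_1']);
// next() cycles: adapter_0, adapter_1, adapter_0, ...
```

A 'weighted' strategy would replace the modular cursor with a selection biased by per-adapter weights.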

5. Complete Example

multi-model-complete.ts
import {
  createOrka,
  AnthropicAdapter,
  MemoryVectorAdapter,
  RouterLLM, ConsensusLLM, RaceLLM, LoadBalancerLLM,
} from 'orkajs';

async function routerExample() {
  console.log('🔀 Router: Route by complexity\n');

  const fast = new AnthropicAdapter({ apiKey: process.env.ANTHROPIC_API_KEY!, model: 'claude-sonnet-4.5' });
  const smart = new AnthropicAdapter({ apiKey: process.env.ANTHROPIC_API_KEY!, model: 'claude-opus-4.5' });

  const llm = new RouterLLM({
    routes: [
      { name: 'complex', condition: (p) => p.length > 500 || p.toLowerCase().includes('analyze'), adapter: smart },
      { name: 'code', condition: (p) => p.includes('code') || p.includes('```'), adapter: smart },
    ],
    defaultAdapter: fast,
  });

  const orka = createOrka({ llm, vectorDB: new MemoryVectorAdapter() });
  const simple = await orka.generate('Say hello.');
  console.log('Simple (claude-sonnet-4.5): ' + simple + '\n');
}

async function consensusExample() {
  console.log('🤝 Consensus: Best answer of 2 models\n');

  const llm = new ConsensusLLM({
    adapters: [
      new AnthropicAdapter({ apiKey: process.env.ANTHROPIC_API_KEY!, model: 'claude-sonnet-4.5' }),
      new AnthropicAdapter({ apiKey: process.env.ANTHROPIC_API_KEY!, model: 'claude-opus-4.5' }),
    ],
    strategy: 'best_score',
    judge: new AnthropicAdapter({ apiKey: process.env.ANTHROPIC_API_KEY!, model: 'claude-opus-4.5' }),
  });

  const orka = createOrka({ llm, vectorDB: new MemoryVectorAdapter() });
  const result = await orka.generate('Explain recursion.');
  console.log('Consensus: ' + result.slice(0, 200) + '...\n');
}

async function raceExample() {
  console.log('🏎️ Race: The fastest wins\n');

  const llm = new RaceLLM({
    adapters: [
      new AnthropicAdapter({ apiKey: process.env.ANTHROPIC_API_KEY!, model: 'claude-sonnet-4.5' }),
      new AnthropicAdapter({ apiKey: process.env.ANTHROPIC_API_KEY!, model: 'claude-opus-4.5' }),
    ],
    timeout: 10000,
  });

  const orka = createOrka({ llm, vectorDB: new MemoryVectorAdapter() });
  const result = await orka.generate('What is TypeScript?');
  console.log('Race winner: ' + result.slice(0, 200) + '...\n');
}

async function loadBalancerExample() {
  console.log('⚖️ Load Balancer: Round-robin\n');

  const lb = new LoadBalancerLLM({
    adapters: [
      new AnthropicAdapter({ apiKey: process.env.ANTHROPIC_API_KEY!, model: 'claude-sonnet-4.5' }),
      new AnthropicAdapter({ apiKey: process.env.ANTHROPIC_API_KEY!, model: 'claude-sonnet-4.5' }),
    ],
    strategy: 'round_robin',
  });

  const orka = createOrka({ llm: lb, vectorDB: new MemoryVectorAdapter() });

  for (let i = 0; i < 4; i++) {
    await orka.generate('Question ' + (i + 1));
  }

  console.log('Stats:', lb.getStats());
}

async function main() {
  await routerExample();
  await consensusExample();
  await raceExample();
  await loadBalancerExample();
}

main().catch(console.error);

When to Use Each Strategy

🔀 RouterLLM

Cost optimization: use cheaper models for simple queries and expensive models for complex ones.

🤝 ConsensusLLM

Quality critical: get the best answer by comparing multiple models.

🏎️ RaceLLM

Latency critical: get the fastest response regardless of which model.

⚖️ LoadBalancerLLM

High throughput: distribute load across multiple API keys or instances.
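The guidance above can be condensed into a simple decision rule. The helper below is hypothetical (not an orkajs API), and its priority order — quality over latency over throughput, with cost-driven routing as the default — is an assumption:

```typescript
// Hypothetical decision helper, not part of orkajs.
type Strategy = 'RouterLLM' | 'ConsensusLLM' | 'RaceLLM' | 'LoadBalancerLLM';

interface Requirements {
  qualityCritical?: boolean;
  latencyCritical?: boolean;
  highThroughput?: boolean;
}

// Assumed priority: correctness first, then speed, then throughput;
// otherwise default to cost-optimized routing.
function pickStrategy(req: Requirements): Strategy {
  if (req.qualityCritical) return 'ConsensusLLM';
  if (req.latencyCritical) return 'RaceLLM';
  if (req.highThroughput) return 'LoadBalancerLLM';
  return 'RouterLLM';
}
```

In practice these concerns often combine, and the right answer may be composing strategies rather than choosing exactly one.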