OrkaJS

Ask & Generate

Use orka.ask() for Q&A with RAG and orka.generate() for direct LLM generation.

Import Methods

Standard import:

import { createOrka, OpenAIAdapter, MemoryVectorAdapter } from 'orkajs';

Tree-shakeable import (recommended for production):

import { createOrka } from 'orkajs/core';
import { OpenAIAdapter } from 'orkajs/adapters/openai';
import { MemoryVectorAdapter } from 'orkajs/adapters/memory';

orka.ask()

The primary method for question answering. When a knowledge base is specified, OrkaJS automatically performs RAG (Retrieval-Augmented Generation), grounding the answer in your own documents. This is the method to reach for when building Q&A systems.

// Basic usage with RAG
const result = await orka.ask({
  question: 'How do I reset my password?',
  knowledge: 'support-docs', // Name of your knowledge base
});

console.log(result.answer); // "To reset your password, go to Settings > Security..."

// Advanced usage with all options
const detailed = await orka.ask({
  question: 'How do I reset my password?',
  knowledge: 'support-docs',
  systemPrompt: 'You are a helpful support agent. Be concise and friendly.',
  topK: 5, // Number of context chunks to retrieve
  minScore: 0.7, // Minimum similarity score for chunks
  temperature: 0.7, // LLM creativity (0-1)
  maxTokens: 1024, // Maximum response length
  includeContext: true, // Return retrieved chunks in result
});

console.log(detailed.answer); // The generated answer
console.log(detailed.context); // Array of retrieved chunks
console.log(detailed.latencyMs); // Total execution time
console.log(detailed.usage.totalTokens); // Tokens consumed

Parameters

question: string (required)

The question to answer. Can be any natural language query.

knowledge?: string

Name of the knowledge base to search. If omitted, no RAG is performed (direct LLM call).

topK?: number (default: 5)

Number of context chunks to retrieve from the knowledge base.

minScore?: number

Minimum similarity score (0-1) for retrieved chunks. Filters out low-relevance results.

systemPrompt?: string

Custom system prompt to control the LLM's behavior and tone.

temperature?: number (default: 0.7)

LLM creativity level (0-1). Lower = more deterministic, higher = more creative.
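
maxTokens?: number

Maximum number of tokens in the generated answer. Limits response length (see maxTokens: 1024 in the advanced example above).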

includeContext?: boolean (default: false)

If true, returns the retrieved chunks in result.context for debugging or display.
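
A common tuning step is to raise minScore until off-topic chunks disappear, using includeContext to inspect exactly what was retrieved. A minimal sketch (the threshold and knowledge base name are illustrative):

// Inspect retrieval quality before trusting the answer
const debug = await orka.ask({
  question: 'How do I reset my password?',
  knowledge: 'support-docs',
  minScore: 0.75, // tighten until irrelevant chunks drop out
  includeContext: true, // expose what the LLM actually saw
});

for (const chunk of debug.context ?? []) {
  console.log(chunk.content.slice(0, 60));
}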

Return Value

interface AskResult {
  answer: string; // The generated answer
  context?: ChunkResult[]; // Retrieved chunks (if includeContext: true)
  latencyMs: number; // Total execution time in milliseconds
  usage: {
    promptTokens: number; // Tokens in the prompt
    completionTokens: number; // Tokens in the response
    totalTokens: number; // Total tokens consumed
  };
}
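
The usage field makes per-request cost tracking straightforward. A minimal sketch, assuming a flat per-token price (the rate below is a placeholder, not a real price):

// Hypothetical pricing constant; substitute your model's actual rate
const COST_PER_1K_TOKENS = 0.002;

const result = await orka.ask({
  question: 'How do I reset my password?',
  knowledge: 'support-docs',
});

// usage.totalTokens and latencyMs come straight from AskResult
const cost = (result.usage.totalTokens / 1000) * COST_PER_1K_TOKENS;
console.log(`Answered in ${result.latencyMs}ms for ~$${cost.toFixed(5)}`);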

orka.generate()

Direct LLM generation without RAG. Use for creative tasks, text transformations, summarization, or any task that doesn't require external knowledge.

// Simple generation
const response = await orka.generate('Write a haiku about TypeScript');
console.log(response); // "Types flow like water..." (string)

// With options
const summary = await orka.generate('Summarize this article: [article text]', {
  temperature: 0.3, // Lower for more deterministic output
  maxTokens: 200, // Limit response length
  systemPrompt: 'You are a professional summarizer. Be concise.',
});

// Creative writing
const story = await orka.generate('Write a short story about a robot', {
  temperature: 0.9, // Higher for more creativity
  maxTokens: 1000,
  systemPrompt: 'You are a creative fiction writer.',
});

// Code generation
const code = await orka.generate('Write a TypeScript function to sort an array', {
  temperature: 0.2, // Low for precise code
  systemPrompt: 'You are an expert TypeScript developer. Return only code.',
});

Parameters

prompt: string (required)

The prompt to send to the LLM. Can be any text or instruction.

options.temperature?: number (default: 0.7)

Controls randomness. 0 = deterministic, 1 = maximum creativity.

options.maxTokens?: number

Maximum number of tokens in the response. Limits output length.

options.systemPrompt?: string

System prompt to set the LLM's role and behavior.
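
Because orka.generate() resolves to a plain string, it composes naturally with standard async patterns. A sketch that summarizes several documents concurrently (the input array is illustrative):

const articles = ['First article text...', 'Second article text...'];

// Fan out one generate() call per article and await them together
const summaries = await Promise.all(
  articles.map((text) =>
    orka.generate(`Summarize in one sentence: ${text}`, {
      temperature: 0.3,
      maxTokens: 60,
    })
  )
);

console.log(summaries); // one summary per input article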

orka.embed()

Generate vector embeddings for text. Embeddings are high-dimensional vectors that capture semantic meaning. Use for custom similarity comparisons, clustering, or building your own retrieval systems.

// Single text embedding
const [embedding] = await orka.embed('Hello world');
console.log(embedding.length); // e.g., 1536 (depends on model)
console.log(embedding[0]); // e.g., 0.0234 (first dimension)

// Multiple texts (batched for efficiency)
const embeddings = await orka.embed([
  'How do I reset my password?',
  'I forgot my login credentials',
  'What is the weather today?',
]);

// Compare similarity (cosine similarity)
function cosineSimilarity(a: number[], b: number[]): number {
  const dot = a.reduce((sum, val, i) => sum + val * b[i], 0);
  const magA = Math.sqrt(a.reduce((sum, val) => sum + val * val, 0));
  const magB = Math.sqrt(b.reduce((sum, val) => sum + val * val, 0));
  return dot / (magA * magB);
}

const sim1 = cosineSimilarity(embeddings[0], embeddings[1]); // ~0.85 (similar)
const sim2 = cosineSimilarity(embeddings[0], embeddings[2]); // ~0.45 (different)

When to use orka.embed()

Use orka.embed() when you need raw embeddings for custom logic. For standard RAG, use orka.knowledge.create() and orka.ask() instead — they handle embeddings automatically.
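
As an example of such custom logic, here is a sketch of a tiny in-memory semantic search built on orka.embed() and the cosineSimilarity helper above (the candidate texts are illustrative):

const candidates = [
  'Reset your password from the login page.',
  'Enable dark mode in Settings > Appearance.',
  'Contact support for billing questions.',
];

// Embed the corpus once, then the query
const corpusVectors = await orka.embed(candidates);
const [queryVector] = await orka.embed('I forgot my password');

// Rank candidates by similarity to the query
const ranked = candidates
  .map((text, i) => ({
    text,
    score: cosineSimilarity(queryVector, corpusVectors[i]),
  }))
  .sort((a, b) => b.score - a.score);

console.log(ranked[0].text); // most semantically similar candidate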

Complete Example

ask-generate-example.ts
import { createOrka, OpenAIAdapter, MemoryVectorAdapter } from 'orkajs';

const orka = createOrka({
  llm: new OpenAIAdapter({ apiKey: process.env.OPENAI_API_KEY! }),
  vectorDB: new MemoryVectorAdapter(),
});

// 1. Create a knowledge base
await orka.knowledge.create({
  name: 'product-docs',
  source: [
    'Our product supports dark mode. Go to Settings > Appearance to enable it.',
    'To reset your password, click "Forgot Password" on the login page.',
    'Premium users get unlimited API calls and priority support.',
  ],
});

// 2. Ask questions with RAG
const result = await orka.ask({
  question: 'How do I enable dark mode?',
  knowledge: 'product-docs',
  topK: 3,
  includeContext: true,
});

console.log('Answer:', result.answer);
// "To enable dark mode, go to Settings > Appearance."

console.log('Sources:', result.context?.map(c => c.content.slice(0, 50)));
// ["Our product supports dark mode. Go to Settings > ..."]

// 3. Direct generation (no RAG)
const summary = await orka.generate(
  'Summarize: ' + result.answer,
  { temperature: 0.3, maxTokens: 50 }
);

// 4. Get embeddings for custom logic
const [queryEmbed] = await orka.embed('dark mode settings');
console.log('Embedding dimensions:', queryEmbed.length);

Comparison Table

| Method | Use Case | Returns |
| --- | --- | --- |
| orka.ask() | Q&A with optional RAG, knowledge-grounded answers | { answer, context?, usage } |
| orka.generate() | Creative writing, transformations, code generation | string |
| orka.embed() | Custom similarity, clustering, manual retrieval | number[][] |

Tree-shaking Imports

// ✅ Import the core factory from its subpath
import { createOrka } from 'orkajs/core';

// ✅ Import adapters from their own subpaths
import { OpenAIAdapter } from 'orkajs/adapters/openai';
import { MemoryVectorAdapter } from 'orkajs/adapters/memory';