Ask & Generate
Use orka.ask() for Q&A with RAG and orka.generate() for direct LLM generation.
Import Methods
Standard import:
```typescript
import { createOrka, OpenAIAdapter, MemoryVectorAdapter } from 'orkajs';
```

Tree-shakeable import (recommended for production):

```typescript
import { createOrka } from 'orkajs/core';
import { OpenAIAdapter } from 'orkajs/adapters/openai';
import { MemoryVectorAdapter } from 'orkajs/adapters/memory';
```

orka.ask()
The primary method for question-answering. When a knowledge base is specified, Orka AI automatically performs RAG (Retrieval-Augmented Generation). This is the most powerful method for building Q&A systems.
```typescript
// Basic usage with RAG
const result = await orka.ask({
  question: 'How do I reset my password?',
  knowledge: 'support-docs', // Name of your knowledge base
});

console.log(result.answer);
// "To reset your password, go to Settings > Security..."
```

```typescript
// Advanced usage with all options
const result = await orka.ask({
  question: 'How do I reset my password?',
  knowledge: 'support-docs',
  systemPrompt: 'You are a helpful support agent. Be concise and friendly.',
  topK: 5,              // Number of context chunks to retrieve
  minScore: 0.7,        // Minimum similarity score for chunks
  temperature: 0.7,     // LLM creativity (0-1)
  maxTokens: 1024,      // Maximum response length
  includeContext: true, // Return retrieved chunks in result
});

console.log(result.answer);            // The generated answer
console.log(result.context);           // Array of retrieved chunks
console.log(result.latencyMs);         // Total execution time
console.log(result.usage.totalTokens); // Tokens consumed
```

Parameters
- `question: string` (required): The question to answer. Can be any natural language query.
- `knowledge?: string`: Name of the knowledge base to search. If omitted, no RAG is performed (direct LLM call).
- `topK?: number` (default: `5`): Number of context chunks to retrieve from the knowledge base.
- `minScore?: number`: Minimum similarity score (0-1) for retrieved chunks. Filters out low-relevance results.
- `systemPrompt?: string`: Custom system prompt to control the LLM's behavior and tone.
- `temperature?: number` (default: `0.7`): LLM creativity level (0-1). Lower = more deterministic, higher = more creative.
- `includeContext?: boolean` (default: `false`): If true, returns the retrieved chunks in `result.context` for debugging or display.
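`topK` and `minScore` interact: retrieval first takes the `topK` nearest chunks, then drops any whose similarity falls below `minScore`. A minimal self-contained sketch of that filtering step (the `ScoredChunk` shape and `selectContext` helper are illustrative, not the library's internal types):

```typescript
interface ScoredChunk {
  content: string;
  score: number; // cosine similarity in [0, 1]
}

// Hypothetical helper mirroring how topK + minScore could be applied.
function selectContext(chunks: ScoredChunk[], topK: number, minScore: number): ScoredChunk[] {
  return [...chunks]
    .sort((a, b) => b.score - a.score) // highest similarity first
    .slice(0, topK)                    // keep the topK nearest
    .filter(c => c.score >= minScore); // drop low-relevance chunks
}

const chunks: ScoredChunk[] = [
  { content: 'Reset your password in Settings.', score: 0.91 },
  { content: 'Billing is handled monthly.', score: 0.42 },
  { content: 'Use "Forgot Password" on login.', score: 0.88 },
];

console.log(selectContext(chunks, 5, 0.7).map(c => c.score)); // only the two high-scoring chunks survive
```

Setting `minScore` too high can leave the LLM with no context at all, so start low (or omit it) and tighten once you see retrieval quality.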
Return Value
```typescript
interface AskResult {
  answer: string;             // The generated answer
  context?: ChunkResult[];    // Retrieved chunks (if includeContext: true)
  latencyMs: number;          // Total execution time in milliseconds
  usage: {
    promptTokens: number;     // Tokens in the prompt
    completionTokens: number; // Tokens in the response
    totalTokens: number;      // Total tokens consumed
  };
}
```

orka.generate()
Direct LLM generation without RAG. Use for creative tasks, text transformations, summarization, or any task that doesn't require external knowledge.
```typescript
// Simple generation
const response = await orka.generate('Write a haiku about TypeScript');
console.log(response); // "Types flow like water..." (string)

// With options
const summary = await orka.generate('Summarize this article: [article text]', {
  temperature: 0.3, // Lower for more deterministic output
  maxTokens: 200,   // Limit response length
  systemPrompt: 'You are a professional summarizer. Be concise.',
});

// Creative writing
const story = await orka.generate('Write a short story about a robot', {
  temperature: 0.9, // Higher for more creativity
  maxTokens: 1000,
  systemPrompt: 'You are a creative fiction writer.',
});

// Code generation
const code = await orka.generate('Write a TypeScript function to sort an array', {
  temperature: 0.2, // Low for precise code
  systemPrompt: 'You are an expert TypeScript developer. Return only code.',
});
```

Parameters
- `prompt: string` (required): The prompt to send to the LLM. Can be any text or instruction.
- `options.temperature?: number` (default: `0.7`): Controls randomness. 0 = deterministic, 1 = maximum creativity.
- `options.maxTokens?: number`: Maximum number of tokens in the response. Limits output length.
- `options.systemPrompt?: string`: System prompt to set the LLM's role and behavior.
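Conceptually, `temperature` rescales the model's token probabilities before sampling: dividing the logits by a low temperature sharpens the distribution (near-deterministic), while a high temperature flattens it (more varied output). A self-contained sketch of that rescaling, not Orka's internal implementation:

```typescript
// Softmax with temperature: lower T sharpens the distribution, higher T flattens it.
function softmaxWithTemperature(logits: number[], temperature: number): number[] {
  const t = Math.max(temperature, 1e-6); // guard against division by zero at T = 0
  const scaled = logits.map(l => l / t);
  const max = Math.max(...scaled);       // subtract max for numerical stability
  const exps = scaled.map(s => Math.exp(s - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map(e => e / sum);
}

const logits = [2.0, 1.0, 0.5];
console.log(softmaxWithTemperature(logits, 0.2)); // sharply peaked on the first token
console.log(softmaxWithTemperature(logits, 1.0)); // probability spread more evenly
```

This is why the examples above use 0.2 for code generation (one "right" answer) and 0.9 for creative writing (many acceptable answers).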
orka.embed()
Generate vector embeddings for text. Embeddings are high-dimensional vectors that capture semantic meaning. Use for custom similarity comparisons, clustering, or building your own retrieval systems.
```typescript
// Single text embedding
const [embedding] = await orka.embed('Hello world');
console.log(embedding.length); // e.g., 1536 (depends on model)
console.log(embedding[0]);     // e.g., 0.0234 (first dimension)

// Multiple texts (batched for efficiency)
const embeddings = await orka.embed([
  'How do I reset my password?',
  'I forgot my login credentials',
  'What is the weather today?',
]);

// Compare similarity (cosine similarity)
function cosineSimilarity(a: number[], b: number[]): number {
  const dot = a.reduce((sum, val, i) => sum + val * b[i], 0);
  const magA = Math.sqrt(a.reduce((sum, val) => sum + val * val, 0));
  const magB = Math.sqrt(b.reduce((sum, val) => sum + val * val, 0));
  return dot / (magA * magB);
}

const sim1 = cosineSimilarity(embeddings[0], embeddings[1]); // ~0.85 (similar)
const sim2 = cosineSimilarity(embeddings[0], embeddings[2]); // ~0.45 (different)
```

Use orka.embed() when you need raw embeddings for custom logic. For standard RAG, use orka.knowledge.create() and orka.ask() instead; they handle embeddings automatically.
Complete Example
```typescript
import { createOrka, OpenAIAdapter, MemoryVectorDB } from 'orkajs';

const orka = createOrka({
  llm: new OpenAIAdapter({ apiKey: process.env.OPENAI_API_KEY! }),
  vectorDB: new MemoryVectorDB(),
});

// 1. Create a knowledge base
await orka.knowledge.create({
  name: 'product-docs',
  source: [
    'Our product supports dark mode. Go to Settings > Appearance to enable it.',
    'To reset your password, click "Forgot Password" on the login page.',
    'Premium users get unlimited API calls and priority support.',
  ],
});

// 2. Ask questions with RAG
const result = await orka.ask({
  question: 'How do I enable dark mode?',
  knowledge: 'product-docs',
  topK: 3,
  includeContext: true,
});

console.log('Answer:', result.answer);
// "To enable dark mode, go to Settings > Appearance."

console.log('Sources:', result.context?.map(c => c.content.slice(0, 50)));
// ["Our product supports dark mode. Go to Settings > ..."]

// 3. Direct generation (no RAG)
const summary = await orka.generate(
  'Summarize: ' + result.answer,
  { temperature: 0.3, maxTokens: 50 }
);

// 4. Get embeddings for custom logic
const [queryEmbed] = await orka.embed('dark mode settings');
console.log('Embedding dimensions:', queryEmbed.length);
```

Comparison Table
| Method | Use Case | Returns |
|---|---|---|
| `orka.ask()` | Q&A with optional RAG, knowledge-grounded answers | `{ answer, context?, usage }` |
| `orka.generate()` | Creative writing, transformations, code generation | `string` |
| `orka.embed()` | Custom similarity, clustering, manual retrieval | `number[][]` |
Tree-shaking Imports
```typescript
// ✅ Import the core factory from its subpath
import { createOrka } from 'orkajs/core';

// ✅ Import adapters separately
import { OpenAIAdapter } from 'orkajs/adapters/openai';
import { MemoryVectorDB } from 'orkajs/adapters/memory';
```