Conversation Memory
Manage conversation history with single-session and multi-session memory in Orka AI.
Why Conversation Memory?
LLMs are stateless — they don't remember previous messages. Conversation memory solves this by storing chat history and injecting it into each new request. Orka provides automatic memory management with configurable trimming strategies to stay within token limits.
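To make "injecting history into each new request" concrete, here is a minimal sketch of the idea. It does not use Orka: the `ChatMessage` shape follows the examples on this page, while `buildRequestMessages` and the variable names are hypothetical helpers invented for illustration.

```typescript
// Hypothetical types and helper for illustration; not part of the Orka API.
interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

// Build the message list for one request: stored history first, then the new user turn.
function buildRequestMessages(stored: ChatMessage[], question: string): ChatMessage[] {
  return [...stored, { role: 'user', content: question }];
}

const storedHistory: ChatMessage[] = [
  { role: 'user', content: 'My name is Alice.' },
  { role: 'assistant', content: 'Hello Alice! How can I help you today?' },
];

// The model only "remembers" Alice because the earlier turns are sent again.
const requestMessages = buildRequestMessages(storedHistory, 'What is my name?');
console.log(requestMessages.length); // 3
```

Because the full history is resent on every call, the request grows with each turn, which is why a trimming strategy is needed.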
# Single-Session Memory
Use single-session memory for simple chatbots or when you only need to track one conversation at a time.
```typescript
import { Memory } from 'orkajs/memory';

const memory = new Memory({
  maxMessages: 50,            // Keep at most 50 messages
  strategy: 'sliding_window', // Trimming strategy
});

// Add messages to the conversation
memory.addMessage({ role: 'user', content: 'My name is Alice.' });
memory.addMessage({ role: 'assistant', content: 'Hello Alice! How can I help you today?' });
memory.addMessage({ role: 'user', content: 'What is my name?' });

// Get the full history
const history = memory.getHistory();
/*
[
  { role: 'user', content: 'My name is Alice.', timestamp: 1708000000000 },
  { role: 'assistant', content: 'Hello Alice! How can I help you today?', timestamp: 1708000001000 },
  { role: 'user', content: 'What is my name?', timestamp: 1708000002000 }
]
*/

// Clear the conversation
memory.clear();
```

- Memory Methods
  - `addMessage(message: ChatMessage): void`: Adds a message to the conversation. Automatically applies trimming if `maxMessages` is exceeded.
  - `getHistory(): ChatMessage[]`: Returns the current conversation history as an array of messages.
  - `clear(): void`: Clears all messages from the conversation history.
  - `getMessageCount(): number`: Returns the current number of messages in the conversation.
# Multi-Session Memory
For multi-user applications (APIs, chatbots serving multiple users), use SessionMemory to manage separate conversations per user/session. Each session has its own isolated memory with automatic TTL-based cleanup.
```typescript
import { SessionMemory } from 'orkajs/memory';

const sessions = new SessionMemory({
  maxMessages: 50,            // Per-session message limit
  strategy: 'sliding_window',
  ttlMs: 3600_000,            // Sessions expire after 1 hour of inactivity
});

// User A's conversation
sessions.addMessage('user-alice', { role: 'user', content: 'Hello!' });
sessions.addMessage('user-alice', { role: 'assistant', content: 'Hi Alice!' });

// User B's conversation (completely separate)
sessions.addMessage('user-bob', { role: 'user', content: 'Help me with my order.' });
sessions.addMessage('user-bob', { role: 'assistant', content: 'Of course! What is your order ID?' });

// Retrieve history by session ID
const aliceHistory = sessions.getHistory('user-alice');
const bobHistory = sessions.getHistory('user-bob');

// List all active sessions
console.log(sessions.getActiveSessions()); // ['user-alice', 'user-bob']

// Clear a specific session
sessions.clearSession('user-alice');

// Clear all sessions
sessions.clearAll();
```

- SessionMemory Methods
  - `addMessage(sessionId: string, message: ChatMessage): void`: Adds a message to a specific session. Creates the session if it doesn't exist.
  - `getHistory(sessionId: string): ChatMessage[]`: Returns the conversation history for a specific session.
  - `getActiveSessions(): string[]`: Returns an array of all active session IDs.
  - `clearSession(sessionId: string): void`: Clears and removes a specific session.
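
The TTL-based cleanup described above can be pictured with a small stand-alone sketch. This is not how `SessionMemory` is implemented (its internals are not shown here); `TtlSessionStore`, its lazy eviction on access, and the injectable clock are all assumptions made for illustration.

```typescript
// Hypothetical sketch of inactivity-based session expiry; not Orka's implementation.
interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

interface SessionEntry {
  messages: ChatMessage[];
  lastActiveMs: number;
}

class TtlSessionStore {
  private sessions = new Map<string, SessionEntry>();
  private ttlMs: number;
  private now: () => number;

  // `now` is injectable so expiry can be exercised without waiting in real time.
  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  addMessage(sessionId: string, message: ChatMessage): void {
    this.evictExpired();
    const entry = this.sessions.get(sessionId) ?? { messages: [], lastActiveMs: 0 };
    entry.messages.push(message);
    entry.lastActiveMs = this.now(); // Any write counts as activity
    this.sessions.set(sessionId, entry);
  }

  getHistory(sessionId: string): ChatMessage[] {
    this.evictExpired();
    const entry = this.sessions.get(sessionId);
    return entry ? [...entry.messages] : [];
  }

  getActiveSessions(): string[] {
    this.evictExpired();
    return Array.from(this.sessions.keys());
  }

  // Drop every session whose last activity is older than the TTL.
  private evictExpired(): void {
    const cutoff = this.now() - this.ttlMs;
    this.sessions.forEach((entry, id) => {
      if (entry.lastActiveMs < cutoff) this.sessions.delete(id);
    });
  }
}

// Usage with a fake clock and a 1000 ms TTL
let clockMs = 0;
const store = new TtlSessionStore(1000, () => clockMs);
store.addMessage('user-alice', { role: 'user', content: 'Hello!' });
clockMs = 500;
store.addMessage('user-bob', { role: 'user', content: 'Help me with my order.' });
clockMs = 1200; // Alice has been idle for 1200 ms, Bob for 700 ms
const active = store.getActiveSessions();
console.log(active); // ['user-bob']
```

Evicting lazily on access avoids a background timer; a real implementation might instead sweep on an interval.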
# Trimming Strategies
When the conversation grows too long, Orka automatically trims old messages using one of three strategies:
| Strategy | Description |
|---|---|
| sliding_window | Keeps the N most recent messages, preserving system messages |
| buffer | Keeps messages that fit within the estimated token budget (`maxTokensEstimate`) |
| summary | Compresses old messages into a summary, preserving context while reducing size |
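
As an illustration of the `sliding_window` row, the sketch below keeps every system message plus the N most recent other messages. The function name `slidingWindow` is hypothetical, and whether Orka counts system messages toward the limit is not stated here; this sketch assumes they do not count.

```typescript
// Hypothetical sketch of a sliding-window trim; not Orka's implementation.
interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

// Keep all system messages and the `maxMessages` most recent non-system messages.
function slidingWindow(messages: ChatMessage[], maxMessages: number): ChatMessage[] {
  const nonSystemCount = messages.filter((m) => m.role !== 'system').length;
  const dropCount = Math.max(0, nonSystemCount - maxMessages);
  let dropped = 0;
  // Filter in original order so the kept messages stay in sequence.
  return messages.filter((m) => {
    if (m.role === 'system') return true;
    if (dropped < dropCount) {
      dropped++;
      return false; // Oldest non-system messages are dropped first
    }
    return true;
  });
}

const conversation: ChatMessage[] = [
  { role: 'system', content: 'You are a helpful assistant.' },
  { role: 'user', content: 'My name is Alice.' },
  { role: 'assistant', content: 'Hello Alice!' },
  { role: 'user', content: 'What is my name?' },
  { role: 'assistant', content: 'Your name is Alice.' },
];

const trimmed = slidingWindow(conversation, 2);
console.log(trimmed.map((m) => m.role)); // ['system', 'user', 'assistant']
```

Note the trade-off visible in this example: the message that introduced Alice's name is gone, so only the assistant's later reply still carries that fact.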
💡 Summary Strategy
The summary strategy is ideal for long conversations where you want to preserve context without keeping all messages. When the message count exceeds the threshold, old messages are compressed into a system message summary.
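
The compression step can be sketched without an LLM. In the sketch below, `summarizeOverflow` and `countSummarizer` are hypothetical names, the parameter names mirror the `maxMessages` and `summaryThreshold` options used in Orka's configuration, and the exact overflow semantics are an assumption. A real summarizer would call an LLM and would therefore be asynchronous; a synchronous stub stands in for it here.

```typescript
// Hypothetical sketch of summary-based compression; not Orka's implementation.
interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

type Summarizer = (messages: ChatMessage[]) => string;

// When at least `summaryThreshold` messages overflow `maxMessages`,
// replace the overflow with a single system message holding their summary.
function summarizeOverflow(
  messages: ChatMessage[],
  maxMessages: number,
  summaryThreshold: number,
  summarize: Summarizer,
): ChatMessage[] {
  const overflow = messages.length - maxMessages;
  if (overflow < summaryThreshold) return messages; // Not enough overflow yet
  const old = messages.slice(0, overflow);
  const recent = messages.slice(overflow);
  return [{ role: 'system', content: `Summary: ${summarize(old)}` }, ...recent];
}

// Stub summarizer standing in for an LLM call.
const countSummarizer: Summarizer = (msgs) => `${msgs.length} earlier messages`;

const thirty: ChatMessage[] = [];
for (let i = 0; i < 30; i++) {
  thirty.push({ role: i % 2 === 0 ? 'user' : 'assistant', content: `message ${i + 1}` });
}

const compressed = summarizeOverflow(thirty, 20, 10, countSummarizer);
console.log(compressed.length);     // 21
console.log(compressed[0].content); // 'Summary: 10 earlier messages'

// With only 5 messages of overflow, nothing is summarized yet.
const untouched = summarizeOverflow(thirty.slice(0, 25), 20, 10, countSummarizer);
console.log(untouched.length); // 25
```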
```typescript
const memory = new Memory({
  strategy: 'summary',
  maxMessages: 20,
  summaryThreshold: 10, // Summarize when 10+ messages overflow
  llm: orka.getLLM(),   // LLM used to generate summaries
});

// After 30 messages, the first 10 are compressed into a summary:
// [{ role: 'system', content: 'Summary: User Alice discussed...' }, ...recent 20 messages]
```

# Configuration
Configure memory when creating your Orka instance:
```typescript
import { createOrka, OpenAIAdapter, MemoryVectorAdapter } from 'orkajs';

const orka = createOrka({
  llm: new OpenAIAdapter({ apiKey: process.env.OPENAI_API_KEY! }),
  vectorDB: new MemoryVectorAdapter(),
  memory: {
    maxMessages: 50,            // Maximum messages to keep
    maxTokensEstimate: 4000,    // For 'buffer' strategy
    strategy: 'sliding_window', // 'sliding_window' | 'buffer' | 'summary'
  },
});

// Access memory directly
const memory = orka.memory();
memory.addMessage({ role: 'user', content: 'Hello!' });

// Or use with ask() - memory is automatically managed
const response = await orka.ask({
  question: 'What did I say earlier?',
  useMemory: true, // Injects conversation history into the prompt
});
```

- Configuration Options
  - `maxMessages: number`: Maximum number of messages to keep. Older messages are trimmed based on `strategy`.
  - `maxTokensEstimate: number`: For the `'buffer'` strategy: estimated token budget. Messages are trimmed to fit.
  - `strategy: 'sliding_window' | 'buffer' | 'summary'`: How to trim messages when limits are exceeded. Default: `sliding_window`.
  - `ttlMs: number`: For `SessionMemory`: time-to-live in milliseconds. Sessions expire after this duration of inactivity.
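
To show what trimming to an estimated token budget can look like, here is a stand-alone sketch. It is not Orka's estimator: the four-characters-per-token heuristic, the newest-first walk, and the names `estimateTokens` and `trimToTokenBudget` are assumptions made for illustration.

```typescript
// Hypothetical sketch of a token-budget trim; not Orka's implementation.
interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

// Rough heuristic (an assumption): about 4 characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Walk from newest to oldest, keeping messages until the budget would be exceeded.
function trimToTokenBudget(messages: ChatMessage[], maxTokensEstimate: number): ChatMessage[] {
  const kept: ChatMessage[] = [];
  let used = 0;
  for (let i = messages.length - 1; i >= 0; i--) {
    const cost = estimateTokens(messages[i].content);
    if (used + cost > maxTokensEstimate) break;
    kept.unshift(messages[i]); // Preserve chronological order
    used += cost;
  }
  return kept;
}

const log: ChatMessage[] = [
  { role: 'user', content: 'a'.repeat(40) },      // ~10 tokens
  { role: 'assistant', content: 'b'.repeat(20) }, // ~5 tokens
  { role: 'user', content: 'c'.repeat(8) },       // ~2 tokens
];

const fitted = trimToTokenBudget(log, 8);
console.log(fitted.length); // 2 (the oldest message no longer fits)
```

A character-based estimate is cheap but approximate; leaving headroom below the model's actual context limit is a sensible precaution when using any estimate of this kind.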
Tree-shaking Imports
```typescript
// ✅ Import only what you need
import { Memory } from 'orkajs/memory';
import { SessionMemory } from 'orkajs/memory';

// ✅ Or import from index
import { Memory, SessionMemory } from 'orkajs/memory';
```