OrkaJS

Conversation Memory

Manage conversation history with single-session and multi-session memory in Orka AI.

Why Conversation Memory?

LLMs are stateless: each request is processed on its own, with no memory of earlier messages. Conversation memory solves this by storing the chat history and injecting it into each new request. Orka manages this memory automatically and offers configurable trimming strategies to keep the history within token limits.

# Single-Session Memory

Use single-session memory for simple chatbots or when you only need to track one conversation at a time.

import { Memory } from 'orkajs/memory';
 
const memory = new Memory({
  maxMessages: 50,            // Keep at most 50 messages
  strategy: 'sliding_window'  // Trimming strategy
});
 
// Add messages to the conversation
memory.addMessage({ role: 'user', content: 'My name is Alice.' });
memory.addMessage({ role: 'assistant', content: 'Hello Alice! How can I help you today?' });
memory.addMessage({ role: 'user', content: 'What is my name?' });
 
// Get the full history
const history = memory.getHistory();
/*
[
  { role: 'user', content: 'My name is Alice.', timestamp: 1708000000000 },
  { role: 'assistant', content: 'Hello Alice! How can I help you today?', timestamp: 1708000001000 },
  { role: 'user', content: 'What is my name?', timestamp: 1708000002000 }
]
*/
 
// Clear the conversation
memory.clear();

- Memory Methods

addMessage(message: ChatMessage): void

Add a message to the conversation. Automatically applies trimming if maxMessages is exceeded.

getHistory(): ChatMessage[]

Returns the current conversation history as an array of messages.

clear(): void

Clears all messages from the conversation history.

getMessageCount(): number

Returns the current number of messages in the conversation.

# Multi-Session Memory

For multi-user applications (APIs, chatbots serving multiple users), use SessionMemory to manage separate conversations per user/session. Each session has its own isolated memory with automatic TTL-based cleanup.

import { SessionMemory } from 'orkajs/memory';
 
const sessions = new SessionMemory({
  maxMessages: 50,            // Per-session message limit
  strategy: 'sliding_window',
  ttlMs: 3600_000,            // Sessions expire after 1 hour of inactivity
});
 
// User A's conversation
sessions.addMessage('user-alice', { role: 'user', content: 'Hello!' });
sessions.addMessage('user-alice', { role: 'assistant', content: 'Hi Alice!' });
 
// User B's conversation (completely separate)
sessions.addMessage('user-bob', { role: 'user', content: 'Help me with my order.' });
sessions.addMessage('user-bob', { role: 'assistant', content: 'Of course! What is your order ID?' });
 
// Retrieve history by session ID
const aliceHistory = sessions.getHistory('user-alice');
const bobHistory = sessions.getHistory('user-bob');
 
// List all active sessions
console.log(sessions.getActiveSessions()); // ['user-alice', 'user-bob']
 
// Clear a specific session
sessions.clearSession('user-alice');
 
// Clear all sessions
sessions.clearAll();

- SessionMemory Methods

addMessage(sessionId: string, message: ChatMessage): void

Add a message to a specific session. Creates the session if it doesn't exist.

getHistory(sessionId: string): ChatMessage[]

Returns the conversation history for a specific session.

getActiveSessions(): string[]

Returns an array of all active session IDs.

clearSession(sessionId: string): void

Clears and removes a specific session.
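
To illustrate the idea of isolated sessions with TTL-based cleanup, here is a minimal self-contained sketch. It is not Orka's implementation: the `TtlSessionStore` class, the injectable `now` clock, and the choice to evict lazily on each call (rather than on a timer) are all assumptions made for illustration, as is the rule that only `addMessage` counts as activity.

```typescript
// Hypothetical message shape, modeled on the examples in this page.
interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

interface Session {
  messages: ChatMessage[];
  lastActive: number; // epoch milliseconds of the last addMessage call
}

// Illustrative store: one message list per session ID, evicted after ttlMs of inactivity.
class TtlSessionStore {
  private sessions = new Map<string, Session>();
  private ttlMs: number;
  private now: () => number;

  // `now` is injectable so the expiry logic can be exercised without waiting.
  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  addMessage(sessionId: string, message: ChatMessage): void {
    this.evictExpired();
    // Create the session on first use.
    const session = this.sessions.get(sessionId) ?? { messages: [], lastActive: 0 };
    session.messages.push(message);
    session.lastActive = this.now();
    this.sessions.set(sessionId, session);
  }

  getHistory(sessionId: string): ChatMessage[] {
    this.evictExpired();
    const session = this.sessions.get(sessionId);
    return session ? session.messages.slice() : [];
  }

  getActiveSessions(): string[] {
    this.evictExpired();
    return Array.from(this.sessions.keys());
  }

  // Drop every session whose last activity is older than the TTL.
  private evictExpired(): void {
    const cutoff = this.now() - this.ttlMs;
    this.sessions.forEach((session, id) => {
      if (session.lastActive < cutoff) this.sessions.delete(id);
    });
  }
}
```

Lazy eviction keeps the sketch free of timers; a long-running server might prefer a periodic sweep so that idle sessions are released even when no requests arrive.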

# Trimming Strategies

When the conversation grows too long, Orka automatically trims old messages using one of three strategies:

| Strategy | Description |
| --- | --- |
| sliding_window | Keeps the N most recent messages, preserving system messages |
| buffer | Keeps messages that fit within the estimated token budget |
| summary | Compresses old messages into a summary, preserving context while reducing size |
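
As a rough illustration of the sliding_window behavior, the sketch below keeps every system message plus the most recent non-system messages. It is a standalone approximation, not Orka's code: the function name `trimSlidingWindow` is hypothetical, and counting system messages toward the limit is an assumption.

```typescript
// Hypothetical message shape, modeled on the examples in this page.
type Role = 'system' | 'user' | 'assistant';

interface ChatMessage {
  role: Role;
  content: string;
  timestamp?: number;
}

// Keep all system messages and fill the remaining slots with the newest
// non-system messages. Assumption: system messages count toward maxMessages,
// so the result fits the limit whenever there are fewer system messages than that.
function trimSlidingWindow(messages: ChatMessage[], maxMessages: number): ChatMessage[] {
  if (messages.length <= maxMessages) return messages.slice();

  const systemCount = messages.filter((m) => m.role === 'system').length;
  const keepRecent = Math.max(0, maxMessages - systemCount);

  const nonSystem = messages.filter((m) => m.role !== 'system');
  // slice(-0) would return the whole array, so handle zero explicitly.
  const recent = new Set(keepRecent > 0 ? nonSystem.slice(-keepRecent) : []);

  // Filtering the original array preserves the original message order.
  return messages.filter((m) => m.role === 'system' || recent.has(m));
}
```

For example, with one system message, four chat messages, and `maxMessages` set to 3, the sketch returns the system message followed by the two newest chat messages.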

💡 Summary Strategy

The summary strategy suits long conversations where you want to preserve context without keeping every message. Once the number of messages beyond maxMessages reaches summaryThreshold, the oldest messages are compressed into a single system message containing a summary.

// Assumes `orka` is an instance created with createOrka() (see Configuration)
const memory = new Memory({
  strategy: 'summary',
  maxMessages: 20,
  summaryThreshold: 10, // Summarize when 10+ messages overflow
  llm: orka.getLLM(),   // LLM used to generate summaries
});
 
// After 30 messages, the first 10 are compressed into a summary:
// [{ role: 'system', content: 'Summary: User Alice discussed...' }, ...recent 20 messages]
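
The arithmetic in the comment above (30 messages, a limit of 20, the first 10 summarized) can be sketched as follows. This is an assumption-laden illustration rather than Orka's implementation: `splitForSummary`, `trimWithSummary`, and the `Summarizer` callback are hypothetical names, and the callback stands in for whatever LLM call produces the summary text.

```typescript
// Hypothetical message shape, modeled on the examples in this page.
interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

// Stand-in for an LLM call that turns old messages into summary text.
type Summarizer = (messages: ChatMessage[]) => Promise<string>;

// Decide which messages get summarized and which are kept verbatim.
// Nothing is summarized until the overflow reaches summaryThreshold.
function splitForSummary(
  messages: ChatMessage[],
  maxMessages: number,
  summaryThreshold: number,
): { old: ChatMessage[]; recent: ChatMessage[] } {
  const overflow = messages.length - maxMessages;
  if (overflow <= 0 || overflow < summaryThreshold) {
    return { old: [], recent: messages.slice() };
  }
  return { old: messages.slice(0, overflow), recent: messages.slice(overflow) };
}

// Replace the overflowing messages with one system message holding the summary.
async function trimWithSummary(
  messages: ChatMessage[],
  maxMessages: number,
  summaryThreshold: number,
  summarize: Summarizer,
): Promise<ChatMessage[]> {
  const { old, recent } = splitForSummary(messages, maxMessages, summaryThreshold);
  if (old.length === 0) return recent;
  const summary = await summarize(old);
  const summaryMessage: ChatMessage = { role: 'system', content: `Summary: ${summary}` };
  return [summaryMessage, ...recent];
}
```

Separating the split from the asynchronous summarization keeps the threshold logic easy to check on its own, independent of any model call.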

# Configuration

Configure memory when creating your Orka instance:

import { createOrka, OpenAIAdapter, MemoryVectorAdapter } from 'orkajs';
 
const orka = createOrka({
  llm: new OpenAIAdapter({ apiKey: process.env.OPENAI_API_KEY! }),
  vectorDB: new MemoryVectorAdapter(),
  memory: {
    maxMessages: 50,            // Maximum messages to keep
    maxTokensEstimate: 4000,    // For 'buffer' strategy
    strategy: 'sliding_window', // 'sliding_window' | 'buffer' | 'summary'
  },
});
 
// Access memory directly
const memory = orka.memory();
memory.addMessage({ role: 'user', content: 'Hello!' });
 
// Or use with ask() - memory is automatically managed
const response = await orka.ask({
  question: 'What did I say earlier?',
  useMemory: true, // Injects conversation history into the prompt
});
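
Conceptually, injecting history into the prompt means sending the stored messages ahead of the new question. The sketch below shows one plausible shape for that step; it does not describe what `ask()` does internally, and `buildRequestMessages` is a hypothetical helper.

```typescript
// Hypothetical message shape, modeled on the examples in this page.
type Role = 'system' | 'user' | 'assistant';

interface ChatMessage {
  role: Role;
  content: string;
  timestamp?: number;
}

// Build the message list for one request: stored history first, new question last.
// Timestamps are dropped because only role and content are needed in the request.
function buildRequestMessages(
  history: ChatMessage[],
  question: string,
): Array<{ role: Role; content: string }> {
  const prior = history.map(({ role, content }) => ({ role, content }));
  return [...prior, { role: 'user', content: question }];
}
```

Because the model only sees what is in this list, trimming the history directly controls how much earlier context each request carries.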

- Configuration Options

maxMessages: number

Maximum number of messages to keep. Older messages are trimmed based on strategy.

maxTokensEstimate: number

For 'buffer' strategy: estimated token budget. Messages are trimmed to fit.

strategy: 'sliding_window' | 'buffer' | 'summary'

How to trim messages when limits are exceeded. Default: sliding_window.

ttlMs: number

For SessionMemory: time-to-live in milliseconds. Sessions expire after this duration of inactivity.
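
To make the maxTokensEstimate option more concrete, here is a minimal sketch of trimming to an estimated token budget, as the buffer strategy is described. It is not Orka's implementation: `estimateTokens` and `trimToTokenBudget` are hypothetical names, and the four-characters-per-token heuristic is a rough assumption, not a real tokenizer.

```typescript
// Hypothetical message shape, modeled on the examples in this page.
interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

// Rough heuristic (assumption): about 4 characters per token.
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

// Walk from the newest message to the oldest, keeping messages until the
// next one would push the running estimate past the budget.
function trimToTokenBudget(messages: ChatMessage[], maxTokensEstimate: number): ChatMessage[] {
  const kept: ChatMessage[] = [];
  let used = 0;
  for (const message of messages.slice().reverse()) {
    const cost = estimateTokens(message.content);
    if (used + cost > maxTokensEstimate) break;
    kept.unshift(message); // unshift restores chronological order
    used += cost;
  }
  return kept;
}
```

Stopping at the first message that does not fit keeps the retained history contiguous, so the model never sees a conversation with gaps in the middle.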

Tree-shaking Imports

// ✅ Import only what you need
import { Memory } from 'orkajs/memory';
import { SessionMemory } from 'orkajs/memory';
 
// ✅ Or combine both in a single import statement
import { Memory, SessionMemory } from 'orkajs/memory';