The Episodic Memory module manages the “stream” of interactions for a specific session. It orchestrates the flow of data between the incoming API requests and the underlying vector and relational databases.

The Ingestion Pipeline

When the server receives an AddMemoriesSpec from the SDK, it triggers a transformation pipeline within the service layer.

Data Mapping: SDK to Server

The server maps the high-level MemoryMessage from the client into a persistent EpisodeEntry. This process includes:
  • Producer Mapping: The SDK’s producer field is mapped to the internal producer_id.
  • Role Assignment: The producer_role is validated (e.g., “user”, “assistant”).
  • Timestamp Normalization: Incoming date formats are converted into a standardized AwareDatetime (UTC).
  • Metadata Casting: Raw JSON payloads are cast to dict[str, JsonValue] for storage compatibility.
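The mapping steps above can be sketched roughly as follows. This is a minimal illustration, not the server's actual code: the message is modeled as a plain dict, the keys content, timestamp, and metadata are assumed names, EpisodeEntrySketch is a hypothetical stand-in for EpisodeEntry, and the accepted roles are limited to the two examples given above.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict, Union

# Assumption: only the two roles named in the text are accepted.
ALLOWED_ROLES = {"user", "assistant"}


@dataclass
class EpisodeEntrySketch:
    """Hypothetical stand-in for the server's EpisodeEntry."""
    content: str
    producer_id: str
    producer_role: str
    created_at: datetime
    metadata: Dict[str, Any] = field(default_factory=dict)


def normalize_timestamp(value: Union[str, int, float, datetime]) -> datetime:
    """Return a timezone-aware UTC datetime for a few common input shapes."""
    if isinstance(value, datetime):
        dt = value
    elif isinstance(value, (int, float)):
        # Treat numbers as Unix epoch seconds.
        return datetime.fromtimestamp(value, tz=timezone.utc)
    else:
        # ISO 8601 string; map a trailing "Z" to an explicit offset.
        dt = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if dt.tzinfo is None:
        # Assumption: naive timestamps are already expressed in UTC.
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc)


def to_episode_entry(message: Dict[str, Any]) -> EpisodeEntrySketch:
    """Map an SDK-style message dict onto the internal entry shape."""
    role = message["producer_role"]
    if role not in ALLOWED_ROLES:
        raise ValueError(f"unsupported producer_role: {role!r}")
    return EpisodeEntrySketch(
        content=message["content"],
        producer_id=message["producer"],  # producer -> producer_id
        producer_role=role,
        created_at=normalize_timestamp(message["timestamp"]),
        metadata=dict(message.get("metadata") or {}),
    )
```

Under these assumptions, a message stamped 2024-01-02T03:04:05+02:00 is stored with created_at equal to 01:04:05 UTC on the same day, and an unknown role is rejected before anything is persisted.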

Memory Configuration

The server exposes two sub-modules that control how episodic data is handled:

1. Short-Term Memory (STM)

  • Responsibility: Manages the immediate context window.
  • Server Logic: Controls the retrieval of the most recent N episodes to provide “sliding window” context for LLM prompts.
  • Operations: configure_short_term_memory toggles STM usage or adjusts the window size.
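As an illustration of the sliding-window idea, a bounded deque gives the "most recent N episodes" behavior directly. The class and method names below are hypothetical and are not the server's configure_short_term_memory API; this is only a sketch of the technique.

```python
from collections import deque
from typing import List, Optional


class ShortTermWindow:
    """Hypothetical helper that keeps only the newest `window_size` episodes."""

    def __init__(self, window_size: int = 3, enabled: bool = True) -> None:
        self.enabled = enabled
        self._episodes = deque(maxlen=window_size)

    def add(self, episode: str) -> None:
        # When the deque is full, appending drops the oldest entry.
        self._episodes.append(episode)

    def configure(
        self,
        enabled: Optional[bool] = None,
        window_size: Optional[int] = None,
    ) -> None:
        if enabled is not None:
            self.enabled = enabled
        if window_size is not None:
            # Rebuilding with a smaller maxlen keeps only the newest entries.
            self._episodes = deque(self._episodes, maxlen=window_size)

    def context(self) -> List[str]:
        """Episodes to place in the prompt, oldest first; empty when disabled."""
        return list(self._episodes) if self.enabled else []
```

With a window of 3, adding five episodes leaves only the last three in context(); shrinking the window to 2 afterwards keeps the newest two.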

2. Long-Term Memory (LTM)

  • Responsibility: Manages the persistent, vector-indexed history of a session.
  • Server Logic: Handles the “summarization” loop, where older episodic memories are condensed to save space while retaining semantic meaning.
  • Operations: configure_long_term_memory enables or disables the backend vector-search retrieval for historical context.
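The source does not describe how the summarization loop is implemented. One plausible shape, shown here purely as a sketch, replaces everything except the newest episodes with a single summary entry. The function names, the keep_recent parameter, and the string-based summarizer are all assumptions; a real deployment would more likely delegate the summary to an LLM.

```python
from typing import Callable, List


def condense_history(
    episodes: List[str],
    keep_recent: int,
    summarize: Callable[[List[str]], str],
) -> List[str]:
    """Replace all but the newest `keep_recent` episodes with one summary."""
    if keep_recent < 0:
        raise ValueError("keep_recent must be non-negative")
    if len(episodes) <= keep_recent:
        return list(episodes)  # nothing old enough to condense
    split = len(episodes) - keep_recent
    older, recent = episodes[:split], episodes[split:]
    return [summarize(older)] + recent


def naive_summary(chunk: List[str]) -> str:
    # Placeholder summarizer: truncates each episode instead of calling a model.
    return "summary of %d episodes: " % len(chunk) + " | ".join(
        text[:20] for text in chunk
    )
```

Passing the summarizer in as a callable keeps the condensing step independent of whichever model or heuristic produces the summary text.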

Internal Server Classes

EpisodicMemoryParams

This dataclass defines the environment for a memory instance.
| Attribute | Type | Description |
| --- | --- | --- |
| session_key | str | Unique identifier of the form {org_id}/{project_id}. |
| long_term_memory | LongTermMemory | The backend vector store instance for this session. |
| short_term_memory | ShortTermMemory | The backend relational/cache store for the context window. |
| metrics_factory | MetricsFactory | Telemetry provider for monitoring ingestion rates. |

EpisodeEntry (Internal Model)

Unlike the SDK’s Episode, the server’s EpisodeEntry includes internal tracking fields:
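# Internal server-side structure (field listing).
# Assumption: JsonValue is pydantic's JSON type alias.
# EpisodeType is the server's own enum and is not defined in this excerpt.
from datetime import datetime

from pydantic import JsonValue

class EpisodeEntry:
    content: str
    producer_id: str
    produced_for_id: str
    producer_role: str
    created_at: datetime
    metadata: dict[str, JsonValue]
    episode_type: EpisodeType  # Defaults to MESSAGE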

Lifecycle Management

Episodic memory instances are stateful, so the server caches them and manages their lifetime explicitly:
  1. Creation: When an API request arrives for a new session, the EpisodicMemoryManager instantiates the episodic sub-system.
  2. Reference Counting: The server tracks how many active requests are utilizing a specific session instance.
  3. Cleanup: Once the reference count drops to zero and the max_life_time is exceeded, the server flushes caches and closes database connections to conserve resources.
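The three lifecycle steps above can be sketched as a small reference-counted cache. This is not the EpisodicMemoryManager implementation: the class and method names are hypothetical, and max_life_time is assumed to be measured in seconds from the most recent release, which the source does not specify.

```python
import time
from typing import Any, Callable, Dict, List


class _Slot:
    """Bookkeeping for one cached session instance."""

    def __init__(self, instance: Any, now: float) -> None:
        self.instance = instance
        self.ref_count = 0
        self.last_released = now


class RefCountedCache:
    """Hypothetical cache: create on demand, count users, evict when idle."""

    def __init__(
        self,
        factory: Callable[[str], Any],
        max_life_time: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._factory = factory
        self._max_life_time = max_life_time
        self._clock = clock
        self._slots: Dict[str, _Slot] = {}

    def acquire(self, session_key: str) -> Any:
        slot = self._slots.get(session_key)
        if slot is None:
            # Creation: first request for this session builds the instance.
            slot = _Slot(self._factory(session_key), self._clock())
            self._slots[session_key] = slot
        slot.ref_count += 1
        return slot.instance

    def release(self, session_key: str) -> None:
        slot = self._slots[session_key]
        if slot.ref_count == 0:
            raise RuntimeError("release without matching acquire")
        slot.ref_count -= 1
        slot.last_released = self._clock()

    def evict_idle(self) -> List[str]:
        """Drop unreferenced instances idle for longer than max_life_time."""
        now = self._clock()
        evicted = []
        for key, slot in list(self._slots.items()):
            idle = now - slot.last_released
            if slot.ref_count == 0 and idle > self._max_life_time:
                close = getattr(slot.instance, "close", None)
                if callable(close):
                    close()  # e.g. flush caches, close connections
                del self._slots[key]
                evicted.append(key)
        return evicted
```

An instance that is still referenced is never evicted, however old it is; eviction only happens once both conditions (zero references and the elapsed time limit) hold. Injecting the clock makes the timing behavior easy to exercise without sleeping.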