For the full source code and advanced implementation details, see the official LlamaIndex Integration in our repository.

Overview

Integrating MemMachine with LlamaIndex provides a persistent memory layer for chat engines. This allows agents to:
  • Recall User Profiles: Surface user preferences and facts directly into the prompt context.
  • Maintain Context Across Sessions: Stored episodic and semantic memories persist beyond a single execution.
  • Inject Context Intelligently: MemMachine automatically injects relevant context as a system message during inference.
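Conceptually, the injection step folds recalled facts into a system message that is prepended to the conversation before inference. The sketch below illustrates that idea only; the helper name and memory format are assumptions, not MemMachine's actual internals.

```python
# Illustrative sketch: recalled memories are rendered into a single
# system message for prompt injection. Helper name and memory format
# are assumptions for illustration, not MemMachine's real API.

def build_system_message(memories: list[str]) -> dict:
    """Render recalled memories as one system-role chat message."""
    context = "\n".join(f"- {m}" for m in memories)
    return {
        "role": "system",
        "content": f"Relevant user context:\n{context}",
    }

msg = build_system_message(["Name: Alice", "Likes Python programming"])
```

The chat engine then sees this system message alongside the user's turn, so the LLM can answer with the stored facts in scope.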

Configuration

You can configure the LlamaIndex memory component using environment variables or direct constructor parameters.
| Parameter | Environment Variable | Default | Description |
|---|---|---|---|
| base_url | MEMORY_BACKEND_URL | http://localhost:8080 | MemMachine server URL |
| org_id | LLAMAINDEX_ORG_ID | llamaindex_org | Organization ID |
| project_id | LLAMAINDEX_PROJECT_ID | llamaindex_project | Project ID |
| user_id | LLAMAINDEX_USER_ID | None | User identifier |
| agent_id | LLAMAINDEX_AGENT_ID | None | Agent identifier |
| session_id | LLAMAINDEX_SESSION_ID | None | Session identifier |

1. Install Dependencies

Install the core LlamaIndex framework and the updated MemMachine client:
pip install llama-index memmachine-client
2. Initialize MemMachine Memory

Import the MemMachineMemory class and configure it with your project identifiers.
from mem_machine_memory import MemMachineMemory

memory = MemMachineMemory(
    base_url="http://localhost:8080",
    org_id="my_org",
    project_id="my_project",
    user_id="user_123",
    session_id="session_456"
)
3. Build the Chat Engine

Equip your LlamaIndex SimpleChatEngine with the persistent memory instance.
from llama_index.core.chat_engine import SimpleChatEngine
from llama_index.llms.openai import OpenAI

# Ensure your API key is configured
llm = OpenAI(api_key="your-openai-api-key")

agent = SimpleChatEngine.from_defaults(
    llm=llm, 
    memory=memory
)

# First interaction stores facts
print(agent.chat("I am Alice, I like Python programming."))

# Subsequent interactions recall them
print(agent.chat("What do you know about me?"))
Pro Tip: Tune the search_msg_limit parameter to balance the depth of recall against context window usage and latency.
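The trade-off behind search_msg_limit can be pictured as a cap on how many retrieved memories reach the prompt. The standalone sketch below illustrates that idea; the helper function is hypothetical and not part of MemMachine.

```python
# Standalone illustration of the recall-depth trade-off: keep only the
# most recent N retrieved memories for prompt injection. A larger limit
# recalls more but spends more context tokens; a smaller one is cheaper.
# This helper is hypothetical, not part of the MemMachine client.

def limit_recall(memories: list[str], search_msg_limit: int) -> list[str]:
    """Keep at most the last `search_msg_limit` memories."""
    if search_msg_limit <= 0:
        return []
    return memories[-search_msg_limit:]

recent = limit_recall(["fact 1", "fact 2", "fact 3", "fact 4"],
                      search_msg_limit=2)
# → ["fact 3", "fact 4"]
```

Start with a small limit and raise it only if the agent visibly forgets relevant facts; each extra memory costs context-window space and retrieval latency.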

Requirements

  • MemMachine Server: Must be running (default: http://localhost:8080).
  • Python: 3.12 or higher.
  • LLM: An OpenAI-compatible LLM provider.