
Prerequisites

Before you begin installing and configuring MemMachine, make sure your local environment is ready: Ollama must be installed and the necessary models downloaded.

1. Ollama Service

MemMachine connects directly to Ollama using its local API, which must be running in the background.
  • Installation: If you do not yet have Ollama installed, please follow the official setup guide.
  • Start the Service: Once installed, start the Ollama service to make the local API available. Open your terminal or command prompt and run:
    ollama serve
    
  • Verification: Confirm the service is running by opening http://localhost:11434 in your web browser. If the service is active, you should see the message "Ollama is running". For a programmatic check, see the sketch below.
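If you prefer a scripted check, here is a minimal Python sketch (standard library only) that assumes Ollama is listening on its default port, 11434:

import urllib.request

try:
    # The root endpoint returns a plain-text "Ollama is running" message.
    with urllib.request.urlopen("http://localhost:11434", timeout=5) as resp:
        print(resp.read().decode())
except OSError as exc:
    print(f"Ollama does not appear to be running: {exc}")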

2. Required Ollama Models

MemMachine requires two types of models to function: a Large Language Model (LLM) for generative tasks and an embedding model for converting text into vectors (e.g., for retrieval-augmented generation). You must download these models to your local Ollama repository before starting MemMachine.
  • Download Models: Use the ollama pull command to download the models you want.
| Model Type | Example Model ID | Command to Run |
| --- | --- | --- |
| Large Language Model (LLM) | Llama 3 | ollama pull llama3 |
| Embedding Model | Nomic Embed Text | ollama pull nomic-embed-text |
You can choose any compatible LLM (like mixtral, gemma, etc.) and embedding model available on Ollama, but the examples above are recommended starting points.
  • View Downloaded Models: To see a list of all models currently available in your local Ollama repository, run:
    ollama list
    
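As an alternative to ollama list, the following minimal Python sketch queries Ollama's local /api/tags endpoint to confirm the example models above have been pulled. It assumes the default address http://localhost:11434 and the example model names:

import json
import urllib.request

required = ("llama3", "nomic-embed-text")  # the example models from the table above

# /api/tags lists the models in your local Ollama repository (like `ollama list`).
with urllib.request.urlopen("http://localhost:11434/api/tags", timeout=5) as resp:
    local = [m["name"] for m in json.load(resp)["models"]]

for model in required:
    status = "found" if any(name.startswith(model) for name in local) else "missing"
    print(f"{model}: {status}")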

Installation: QuickStart Configuration

The installation script guides you through setting up your Large Language Model (LLM) provider. When prompted, select Ollama as your provider; your input should match the following example:
[PROMPT] Which provider would you like to use? (OpenAI/Bedrock/Ollama) [OpenAI]: Ollama
[INFO] Selected provider: OLLAMA
Ollama Configuration and Model Choices
You’ll then be prompted to select:
  • Ollama base URL (default: http://host.docker.internal:11434/v1)
  • Choice of LLM (Large Language Model), for example: llama3
  • Choice of Embedding Model, for example: nomic-embed-text
If you are unsure about model selection, simply press Enter at the respective prompts to use the recommended default options.
Congratulations! You have now successfully deployed MemMachine using Ollama!
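If you would like to confirm that the Ollama endpoint MemMachine points at is responding, here is a minimal, optional sketch using the openai Python package (an assumption; any OpenAI-compatible client works). It is written to run directly on the host machine, so it uses localhost rather than host.docker.internal, and it assumes the llama3 model pulled earlier:

from openai import OpenAI

# Point the client at Ollama's OpenAI-compatible API; the key is a placeholder.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="EMPTY")

reply = client.chat.completions.create(
    model="llama3",  # the example LLM pulled earlier
    messages=[{"role": "user", "content": "Reply with one short sentence: are you ready?"}],
)
print(reply.choices[0].message.content)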

Manually Configuring MemMachine to use Ollama

If you already have MemMachine installed and wish to switch to Ollama or change models manually, you can do so by updating the configuration file, cfg.yml.
Within the cfg.yml file, make sure duplicate model configurations are commented out or removed. For example, if you are using ollama_model, ensure that any other model configurations (such as openai_model) are commented out or deleted to avoid conflicts.
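As an optional sanity check, the sketch below loads cfg.yml with PyYAML (an assumed dependency) and warns if the Model block still defines more than one entry. Commented-out entries are invisible to the YAML parser, so a clean file should report a single tag:

import yaml

with open("cfg.yml") as f:  # path assumed; adjust to your MemMachine config location
    cfg = yaml.safe_load(f)

model_tags = list((cfg.get("Model") or {}).keys())
if len(model_tags) > 1:
    print(f"Warning: multiple model entries defined: {model_tags}")
else:
    print(f"Model block OK: {model_tags}")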
We will walk you through the necessary configuration changes for each component: LLM Provider and Embedder. After that, we’ll show you how to use the parameters you’ve already set to fill out your Memory section. Below are example configuration snippets for setting up Ollama as different components of MemMachine:

LLM Provider

To set your LLM model for Ollama, use the following fields:

| Parameter | Required? | Default | Description |
| --- | --- | --- | --- |
| Model: | Yes | N/A | The top-level Model block in the configuration. |
| ollama_model: | Yes | N/A | The tag you define for your Ollama LLM model config block; the memory sections reference it by this name. |
| model_vendor | Yes | N/A | The vendor of the model (e.g., openai-compatible). |
| model | Yes | N/A | The name of the model to use (e.g., llama3). |
| api_key | Yes | N/A | The API key field must exist, but Ollama's local API does not require a key, so set it to a placeholder such as "EMPTY". |
| base_url | Yes | N/A | The base URL of your Ollama service (e.g., http://host.docker.internal:11434/v1). |
Here’s an example of what the Ollama LLM Model configuration would look like in cfg.yml:
Model:
  ollama_model:
    model_vendor: openai-compatible
    model: "llama3"
    api_key: "EMPTY"
    base_url: "http://host.docker.internal:11434/v1"
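As an optional check that the values in this block point at a live endpoint, here is a minimal sketch using the openai Python package (an assumption; any OpenAI-compatible client works). Run it on the host machine, substituting localhost for host.docker.internal:

from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="EMPTY")

# List the models exposed by Ollama's OpenAI-compatible API and confirm the one
# named in cfg.yml (llama3 in the example above) is present.
available = [m.id for m in client.models.list().data]
print("available models:", available)
print("llama3 present:", any(name.startswith("llama3") for name in available))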

Embedder Configuration

To set your embedding model for Ollama, use the following fields:

| Parameter | Required? | Default | Description |
| --- | --- | --- | --- |
| embedder: | Yes | N/A | The top-level embedder block in the configuration. |
| ollama_embedder: | Yes | N/A | The tag you define for your Ollama embedder config block; the memory sections reference it by this name. |
| name: | Yes | N/A | The name of the API client to use. For Ollama's OpenAI-compatible local API, this must be set to openai. |
| config: | Yes | N/A | The config sub-block inside the embedder block. |
| model: | Yes | N/A | The Ollama embedding model ID used to generate vectors. This model must be downloaded and available on your local Ollama service. |
| api_key | Yes | N/A | The API key placeholder. Ollama's local API typically does not require a key, so it is common practice to set this to "EMPTY" or any placeholder string. |
| base_url | Yes | N/A | The URL of the Ollama API endpoint, usually in the form http://<host>:<port>/v1. Use host.docker.internal as the host when MemMachine runs inside a Docker container and needs to reach Ollama on the host machine. |
| dimensions | Yes | N/A | The output dimension of the vector embeddings. This value must match the dimension of the chosen embedding model (e.g., nomic-embed-text produces 768-dimensional vectors). |
Here’s an example of what the Ollama embedder configuration would look like in cfg.yml:
embedder:
    ollama_embedder:
        name: openai
        config:
            model: "nomic-embed-text"
            api_key: "EMPTY"
            base_url: "http://host.docker.internal:11434/v1"
            dimensions: 768
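Because the dimensions value must match the model's actual output size, you may want to verify it before starting MemMachine. The sketch below is a minimal check using the openai Python package (an assumption) against Ollama's OpenAI-compatible embeddings endpoint, run from the host with localhost in place of host.docker.internal:

from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="EMPTY")

# Request one embedding and confirm its length matches the `dimensions` value in
# cfg.yml (768 for nomic-embed-text in the example above).
resp = client.embeddings.create(model="nomic-embed-text", input="dimension check")
print("embedding dimension:", len(resp.data[0].embedding))  # expected: 768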

Memory Configuration

To ensure optimal performance when using Ollama models, adjust the memory settings in your cfg.yml file. The memory sections are long_term_memory, profile_memory, and sessionMemory; they do not need to appear together in the configuration file. The following table maps each memory parameter to the field you have already set up in the previous sections. Memory Parameter uses the naming convention block: parameter, and Field Setting Equivalent shows the corresponding block: field you have already configured. Here is where each existing value goes in the memory sections:
| Memory Parameter | Field Setting Equivalent | Example | Description |
| --- | --- | --- | --- |
| long_term_memory: embedder | embedder: ollama_embedder | ollama_embedder | The tag you defined for your Ollama embedder config block. |
| profile_memory: llm_model | Model: ollama_model | ollama_model | The tag you defined for your Ollama LLM model config block. |
| profile_memory: embedding_model | embedder: ollama_embedder | ollama_embedder | The tag you defined for your Ollama embedder config block. |
| sessionMemory: model_name | Model: ollama_model | ollama_model | The tag you defined for your Ollama LLM model config block. |
Here’s an example of how the memory configuration would look in cfg.yml:
long_term_memory:
  derivative_deriver: sentence
  metadata_prefix: "[$timestamp] $producer_id: "
  embedder: ollama_embedder
  reranker: my_reranker_id
  vector_graph_store: my_storage_id

profile_memory:
  llm_model: ollama_model
  embedding_model: ollama_embedder
  database: profile_storage
  prompt: profile_prompt

sessionMemory:
  model_name: ollama_model
  message_capacity: 500
  max_message_length: 16000
  max_token_num: 8000
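As a final, optional sanity check before restarting MemMachine, the sketch below uses PyYAML (an assumed dependency) to confirm that the tags referenced in the memory sections match the blocks you defined earlier in cfg.yml:

import yaml

with open("cfg.yml") as f:  # path assumed; adjust to your MemMachine config location
    cfg = yaml.safe_load(f)

# Tags defined earlier in the Model and embedder blocks.
models = set((cfg.get("Model") or {}).keys())        # e.g., {"ollama_model"}
embedders = set((cfg.get("embedder") or {}).keys())  # e.g., {"ollama_embedder"}

checks = {
    "long_term_memory.embedder": (cfg["long_term_memory"]["embedder"], embedders),
    "profile_memory.llm_model": (cfg["profile_memory"]["llm_model"], models),
    "profile_memory.embedding_model": (cfg["profile_memory"]["embedding_model"], embedders),
    "sessionMemory.model_name": (cfg["sessionMemory"]["model_name"], models),
}

for path, (value, defined) in checks.items():
    print(f"{path} -> {value}: {'OK' if value in defined else 'NOT DEFINED'}")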
Congratulations! You’ve now changed your MemMachine configuration to use Ollama for its LLM and embedding functionality, along with the appropriate memory settings. Restart MemMachine after making these changes so the new settings take effect.
Need help? Refer to the Install Guide for how to start and test your configuration with MemMachine.