Benchmark Evaluations
A Toolset for Testing Your MemMachine
Welcome to the MemMachine evaluation toolset! We’ve created a simple tool to help you measure the performance, response quality, and benchmark accuracy (LoCoMo, HotpotQA, WikiMultiHop) of your MemMachine instance.
Benchmark Guide: For the latest benchmark commands (retrieval-agent and legacy episodic workflows), see the Evaluation README.

