Release | 2026-05-18 | 6 min read

Mnemosyne 3.0: MEMORIA Fact Engine

From 35.4% to 65.2% on BEAM, leading published SOTA. Multi-hop reasoning went from 16.7% to 87.5%. Same zero-cloud architecture. MEMORIA changed everything.

Tags: v3.0, memoria, beam-benchmark, facts, sota

Eight weeks ago our BEAM benchmark score was 35.4%. We were losing to RAG at scale. The problem was not the retrieval speed or the vector search or the SQLite backend. The problem was simpler than that: we were treating memory as chunks of text and hoping keyword search found the needle.

So we stopped doing that.

What MEMORIA Is

MEMORIA is a structured fact engine that lives inside BeamMemory. At ingestion time, every fact gets extracted into its own SQLite table: temporal triples with version chains, metric values with previous-value tracking, entity relationships with valid-from/to windows. Memory stops being blobs of text and starts being structured facts.
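To make the idea of temporal triples with version chains concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name, column names, and helper functions are assumptions made for illustration; they are not Mnemosyne's actual schema or API.

```python
import sqlite3

# Hypothetical schema: names are illustrative, not Mnemosyne's real tables.
SCHEMA = """
CREATE TABLE IF NOT EXISTS temporal_triples (
    id          INTEGER PRIMARY KEY,
    subject     TEXT NOT NULL,
    predicate   TEXT NOT NULL,
    object      TEXT NOT NULL,
    valid_from  TEXT NOT NULL,   -- ISO 8601 date
    valid_to    TEXT,            -- NULL means still current
    supersedes  INTEGER REFERENCES temporal_triples(id)  -- version chain link
);
"""

def assert_fact(conn, subject, predicate, obj, as_of):
    """Close the current version (if any) and append a new one to the chain."""
    row = conn.execute(
        "SELECT id FROM temporal_triples "
        "WHERE subject = ? AND predicate = ? AND valid_to IS NULL",
        (subject, predicate),
    ).fetchone()
    previous_id = row[0] if row else None
    if previous_id is not None:
        conn.execute(
            "UPDATE temporal_triples SET valid_to = ? WHERE id = ?",
            (as_of, previous_id),
        )
    conn.execute(
        "INSERT INTO temporal_triples "
        "(subject, predicate, object, valid_from, supersedes) "
        "VALUES (?, ?, ?, ?, ?)",
        (subject, predicate, obj, as_of, previous_id),
    )

def fact_at(conn, subject, predicate, when):
    """Return the object that was valid on the given date, or None."""
    row = conn.execute(
        "SELECT object FROM temporal_triples "
        "WHERE subject = ? AND predicate = ? AND valid_from <= ? "
        "AND (valid_to IS NULL OR valid_to > ?)",
        (subject, predicate, when, when),
    ).fetchone()
    return row[0] if row else None

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
assert_fact(conn, "alice", "works_at", "Acme", "2024-01-01")
assert_fact(conn, "alice", "works_at", "Globex", "2025-03-01")
print(fact_at(conn, "alice", "works_at", "2024-06-15"))  # Acme
print(fact_at(conn, "alice", "works_at", "2025-06-15"))  # Globex
```

Storing dates as ISO 8601 strings keeps the sketch dependency-free, because those strings sort in chronological order under plain text comparison.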

When a question comes in, one of 10 retrieval strategies fires depending on what is being asked. Among them:
Information extraction: hits the fact tables directly.
Multi-hop reasoning: runs recursive gap analysis, re-querying until answers are found.
Temporal reasoning: reads version chains with valid-from/to windows.
Contradiction resolution: runs a UNION search across episodic memory and structured facts.
Event ordering: uses strict JSON mode with negative examples.
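The multi-hop idea of re-querying until the gaps are filled can be illustrated with a toy example. This is only a sketch under simplifying assumptions: the fact store is an in-memory dictionary, the question is already decomposed into a chain of predicates, and the function name and data are hypothetical. It does not reproduce MEMORIA's actual gap analysis.

```python
# Toy fact store: (subject, predicate) -> object. Illustrative data only.
FACTS = {
    ("alice", "manager"): "bob",
    ("bob", "office"): "berlin",
    ("berlin", "country"): "germany",
}

def resolve(entity, predicates, max_hops=5):
    """Answer a chained question by filling one gap per recursive call.

    Returns the final entity, or None when a hop has no supporting fact
    (or the hop budget runs out), so the caller can abstain.
    """
    if not predicates:
        return entity            # no gaps left: this is the answer
    if max_hops == 0:
        return None              # hop budget exhausted
    nxt = FACTS.get((entity, predicates[0]))
    if nxt is None:
        return None              # missing fact: abstain rather than guess
    return resolve(nxt, predicates[1:], max_hops - 1)

print(resolve("alice", ["manager", "office", "country"]))  # germany
print(resolve("alice", ["manager", "salary"]))             # None
```

Returning None on a missing hop is one simple way to connect multi-hop retrieval with abstention: the chain either resolves completely or yields no answer.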

The routing happens automatically. You do not configure it. You do not select strategies. You ask a question and MEMORIA figures out which retrieval path to take.
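This post does not describe how the router decides, so the following is purely a hypothetical sketch of what automatic routing could look like: a handful of keyword rules that map a question to a strategy name, with information extraction as the fallback. The rules, strategy names, and the route function are all assumptions for illustration.

```python
import re

# Hypothetical keyword rules, checked in order. Not MEMORIA's real logic.
RULES = [
    ("event_ordering",
     re.compile(r"\b(order|sequence|before|after|first|last)\b")),
    ("temporal_reasoning",
     re.compile(r"\b(when|as of|since|until|how long)\b")),
    ("contradiction_resolution",
     re.compile(r"\b(contradict\w*|conflict\w*|inconsistent)\b")),
    ("multi_hop_reasoning",
     re.compile(r"\b(whose|of the .* of)\b")),
]
DEFAULT = "information_extraction"

def route(question):
    """Return the name of the first strategy whose pattern matches."""
    q = question.lower()
    for name, pattern in RULES:
        if pattern.search(q):
            return name
    return DEFAULT

print(route("When did Alice move to Berlin?"))  # temporal_reasoning
print(route("What is Alice's email?"))          # information_extraction
```

The point of the sketch is the shape of the interface: the caller passes a question and never names a strategy.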

The Numbers

65.2% on BEAM at 100K. Leading published SOTA. Honcho sits at 63.0%. Hindsight at 64.1%. This is not a marginal improvement. This is what happens when you stop treating memory as text search.

Ability | v2.5 | v3.0 | Delta (pp)
IE (Information Extraction) | 80.5% | 91.5% | +11.0
MR (Multi-hop Reasoning) | 16.7% | 87.5% | +70.8
TR (Temporal Reasoning) | 29.2% | 75.0% | +45.8
KU (Knowledge Update) | 16.7% | 50.0% | +33.3
ABS (Abstention) | 50.0% | 100.0% | +50.0

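The largest gain is in multi-hop reasoning (+70.8pp), followed by abstention (+50.0pp), temporal reasoning (+45.8pp), and knowledge update (+33.3pp). Multi-hop reasoning, temporal reasoning, and knowledge update are the exact abilities MEMORIA was designed to improve. On every ability in the table, structured facts beat keyword search.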
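The deltas are plain percentage-point differences between the two versions, so they are easy to recompute from the table. A small sketch, with the scores copied from the table above:

```python
# Scores copied from the table: ability -> (v2.5, v3.0), in percent.
SCORES = {
    "IE": (80.5, 91.5),
    "MR": (16.7, 87.5),
    "TR": (29.2, 75.0),
    "KU": (16.7, 50.0),
    "ABS": (50.0, 100.0),
}

def deltas(scores):
    """Percentage-point change per ability, rounded to one decimal."""
    return {name: round(new - old, 1) for name, (old, new) in scores.items()}

# Print abilities from largest to smallest gain.
for name, delta in sorted(deltas(SCORES).items(), key=lambda kv: -kv[1]):
    print(f"{name}: +{delta:.1f}pp")
```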

What Did Not Change

Same architecture under the hood. No Docker. No PostgreSQL. No API keys at runtime. No cloud dependency. Three lines of Python and you have memory. MEMORIA is an upgrade to the retrieval engine, not a fork of the project.

Your existing Mnemosyne database works. v3.0.0 auto-migrates the schema. New tables appear. Old data stays where it was. The BEAM tiers (working, episodic, scratchpad) are all still there. MEMORIA sits on top as a new retrieval layer, not a replacement.

pip install --upgrade mnemosyne-memory
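For readers curious how an additive auto-migration can leave old data in place, here is a minimal sketch with Python's sqlite3 module. It assumes a version number stored in SQLite's user_version pragma and a hypothetical facts table; the real v3.0.0 migration may work differently.

```python
import sqlite3

TARGET_VERSION = 3  # hypothetical schema version number

def migrate(conn):
    """Additive migration: create new tables, leave existing rows untouched.

    Returns True if a migration ran, False if the schema was already current.
    """
    (version,) = conn.execute("PRAGMA user_version").fetchone()
    if version >= TARGET_VERSION:
        return False
    with conn:  # commit on success, roll back on error
        conn.execute(
            "CREATE TABLE IF NOT EXISTS facts ("
            "id INTEGER PRIMARY KEY, subject TEXT, predicate TEXT, object TEXT)"
        )
        # PRAGMA values cannot be bound as parameters; the value is a constant int.
        conn.execute(f"PRAGMA user_version = {TARGET_VERSION}")
    return True

# Simulate an older database that only has episodic rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE episodic (id INTEGER PRIMARY KEY, text TEXT)")
conn.execute("INSERT INTO episodic (text) VALUES ('old memory')")

print(migrate(conn))  # True
print(migrate(conn))  # False
```

Because the migration only creates tables and never alters or drops existing ones, running it twice is harmless and the episodic rows are untouched.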

Getting Involved

Building an AI memory system means you have to prove it actually works. Every new feature gets benchmarked against the same BEAM dataset with the same protocol. You can run the benchmark on your own hardware:

python tools/evaluate_beam_end_to_end.py --sample 5 --scales 100K

Full benchmark report and methodology are in the repo at docs/beam-benchmark.md. Contributions, issues, and benchmark runs are welcome. If you run BEAM with a different model or hardware, open a PR with the results.

Abdias J

Building Mnemosyne in public. No VC, no cloud lock-in, just code that works.