Memory is Infrastructure, Not Just Files.
Persistent, shared, and governed memory for OpenClaw agent systems.

Shared Across Agents
Hybrid Retrieval
Multi-Tenant
mem9.ai · Built on TiDB
CHAPTER 1
Why Agent Memory Breaks
From single-agent prototype to production fleet — the architectural gap that file-based memory cannot cross.
Every Agent System Hits a Structural Ceiling
File-based memory cannot cross the prototype-to-fleet boundary.
Prototype
One agent · local file · works
Growth
Multi-session · files diverge · context lost
Production Fleet
Multi-tenant · no sharing · architecture collapses

Memory Silos Block Organizational Intelligence

Without shared infrastructure, context stays local — agents repeatedly rebuild what others already learned.
01
Personal
Local notes · no reliable transfer · isolated per agent
02
Team
No shared context across agents
03
Company
No queryable knowledge base
↑ Facts should flow upward — but never do
↓ Org knowledge should flow down — but is inaccessible
No Upward Flow
Facts never reach team or company layer
No Downward Inheritance
Org knowledge inaccessible to agents
No Convergence
Agents rebuild context independently
The OpenClaw Native Baseline — and Its Limits
Fast to set up for a single agent. Not production memory.
Native Memory ✓
  • Markdown files · SQLite index
  • Verbatim snippet retrieval
  • Zero dependencies · works offline
Native Memory ✗
  • No cross-agent sharing
  • No typed schema or provenance
  • No multi-tenant isolation
File-Centric Memory Is Structurally Fragile
Every failure mode is silent — no error, no alert, no recovery.
Forget
Agent omits write → fact lost permanently
Write Wrong
Vague fact persists · no validation
Duplicate
Index bloat · recall quality degrades
Conflict
Two agents contradict · no reconciliation
Failure mode comparison

Compaction Is Context Management, Not Memory Modeling
Compaction manages the context window. mem9 manages what the agent knows — permanently.
Compaction
  • Summarizes old turns · frees context window
  • Session-scoped · agent-triggered
  • Does not persist across restarts
mem9 Memory
  • Extracts atomic facts · reconciles Insights
  • Persistent across sessions and agents
  • Queryable · typed · governed
CHAPTER 2
mem9 Runtime Pipeline
How mem9 captures, processes, types, and stores every agent interaction — server-side, asynchronously, without blocking the agent.
mem9 Architecture: Four-Layer Stack
One plugin change. Four layers. Complete production memory.
① Agent + Plugin
before_prompt_build → inject · agent_end → flush
② mem9 Server
Ingest · Smart Pipeline · CRUD · Hybrid Recall
③ TiDB Storage
Distributed SQL + Vector · BM25 + Embedding index
④ Console
Memories Explorer · Analysis · Webhooks · Operator-only
Agent sees only the plugin · server is invisible
Operator sees only the Console · no direct DB access
Replacing the File with an API Plugin
One config change. No agent retraining. The memory destination changes — the agent does not.
Before mem9
  • ~/memory/*.md · SQLite on disk
  • No server · no sharing · no governance
  • Memory quality depends on agent discipline
After mem9
  • Plugin routes to mem9 server at agent_end
  • TiDB-backed · typed · governed · queryable
  • Smart Pipeline handles extraction + reconciliation
01
Install the mem9 plugin
No agent code changes.
02
Configure the Space or chain_ key
Set it in the plugin config.
03
Run the agent
The local file is no longer used.
Auto Capture: agent_end Message Flush
No agent writing discipline required — the plugin hooks two lifecycle events and handles the rest.
before_prompt_build · Inject typed memory into prompt
Agent runs · Processes turn · No memory writes
agent_end · Batch messages · POST to /ingest · Agent unblocked
Server async · Strip → Extract → Reconcile → Persist
Strip
Remove injected context from batch
Extract
LLM extracts atomic facts + message tags
Reconcile
ADD new or UPDATE+archive existing
Persist
Provenance attached: agent · session · appId
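The two plugin hooks on this slide can be sketched roughly as follows. This is a minimal illustration, not the actual mem9 plugin API: the `Mem9Plugin` class, the `record` helper, the `[memory]` block markers, and the injected `recall` and `post` callables are all hypothetical. Only the hook names `before_prompt_build` and `agent_end` and the `/ingest` path come from the text.

```python
from typing import Callable


class Mem9Plugin:
    """Hypothetical plugin shape; not the actual mem9 plugin API."""

    def __init__(self, recall: Callable[[str], list],
                 post: Callable[[str, dict], int]):
        self.recall = recall      # assumed: returns memory snippets for a query
        self.post = post          # assumed: sends JSON, returns an HTTP status
        self.buffer: list = []

    def before_prompt_build(self, prompt: str) -> str:
        # Inject recalled memory ahead of the prompt.
        memories = self.recall(prompt)
        if not memories:
            return prompt
        block = "\n".join(f"- {m}" for m in memories)
        return f"[memory]\n{block}\n[/memory]\n{prompt}"

    def record(self, role: str, content: str) -> None:
        # Collect turn messages; nothing is written to memory during the turn.
        self.buffer.append({"role": role, "content": content})

    def agent_end(self) -> int:
        # Flush the whole batch in one POST and wait only for the ACK.
        status = self.post("/ingest", {"messages": self.buffer})
        self.buffer = []
        return status
```

Wrapping injected memory in a recognizable block is one way the server-side Strip step could later tell injected context from new conversation content.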
Smart Ingest: Messages → Storage → Extraction → Reconciliation
Server-side pipeline. Every session produces structured, durable memory — regardless of agent behavior.
Flush · agent_end → size-aware batch → POST /ingest → ACK → agent unblocked
Store · Raw messages → Session layer · strip injected context · deduplicate by hash
Extract · LLM over cleaned batch → atomic facts + message tags · query_intent dropped
Reconcile · ADD if no match · UPDATE+archive if superseded · provenance attached
Hot Path (sync) · agent_end → POST → ACK · agent waits only for the ACK
Off Path (async) · Strip → Extract → Reconcile → Persist · no latency impact
Smart Ingest: Extraction Policy and Fact Classification
Only atomic facts become durable Insights; everything else is dropped or kept only as raw Session fallback.
Injected Context Stripped First
Prevents re-extracting known facts.
Size-Aware Batching
Long sessions chunked by token budget.
Schema-Validated Output
Malformed extraction discarded · not stored.
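Size-aware batching can be sketched as a greedy chunker. This is an illustration under stated assumptions: the function name `chunk_by_token_budget` is hypothetical, and token counts are approximated by whitespace splitting, which a real tokenizer would not match.

```python
def chunk_by_token_budget(messages: list, budget: int) -> list:
    """Group messages into batches whose estimated token count stays within budget.

    Token cost is approximated by whitespace splitting (an assumption).
    A single message larger than the budget still gets its own batch.
    """
    batches = []
    current = []
    used = 0
    for msg in messages:
        cost = len(msg.split())
        # Start a new batch when adding this message would exceed the budget.
        if current and used + cost > budget:
            batches.append(current)
            current, used = [], 0
        current.append(msg)
        used += cost
    if current:
        batches.append(current)
    return batches
```

Keeping whole messages together, rather than splitting mid-message, means each extraction call sees complete turns.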
Reconciliation Flow: ADD / UPDATE / archive-old / preserve provenance
Reconciliation never deletes — every superseded Insight is archived, not erased.
Candidate Arrives
Embedding generated · BM25 + vector search
Match Check
No match → ADD · Contradicts → UPDATE · Equivalent → No-op
Write Decision
ADD: new Insight · UPDATE: new + archive old · No-op: discard
History Preserved
Every write carries agent · session · appId · source_turns
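The write decision above can be sketched as a small function. This is a simplified illustration: the `Insight` dataclass, the `relation` labels, and the in-memory list are hypothetical, and the match check itself (embedding plus BM25 and vector search) is assumed to have run upstream.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Insight:
    id: int
    content: str
    state: str = "active"                     # "active" or "archived"
    provenance: dict = field(default_factory=dict)


def reconcile(store: list, candidate: str, match: Optional[Insight],
              relation: str, provenance: dict) -> str:
    """Apply ADD / UPDATE / No-op.

    `relation` is assumed to come from the upstream match check and is one of
    "none", "contradicts", or "equivalent" (hypothetical labels).
    """
    if match is None or relation == "none":
        store.append(Insight(len(store) + 1, candidate, provenance=provenance))
        return "ADD"
    if relation == "contradicts":
        match.state = "archived"              # superseded Insight is archived, never deleted
        store.append(Insight(len(store) + 1, candidate, provenance=provenance))
        return "UPDATE"
    return "NOOP"                             # equivalent fact: discard the candidate
```

Because UPDATE appends a new record and only flips the old one to archived, the full history stays available for audit.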
Conflict Handling: Three Guardrails
mem9 resolves stale facts, isolates project context, and prioritizes personal memory before inherited memory.
① Reconcile
  • Superseded fact → new Insight + archive old
  • Duplicate → no-op
② Isolate
  • appId partitions reads/writes inside one Space
  • Project memories don't leak across partitions
③ Prioritize
  • chain_ key: personal node searched first
  • High-confidence hit → early stop
Pinned facts are AI-immutable — agents cannot overwrite operator-approved ground truth.
A First-Class Typed Memory Model
Three types. Explicit semantics, mutation rules, and lifecycle for each.
Pinned · Permanent ground truth · AI-immutable · Operator-only writes · Never auto-deleted
Insight · Reconciled atomic facts · ADD or UPDATE+archive · Archivable · Promotable to Pinned
Session · Raw interaction data · Append-only · Hash-deduplicated · Expires by retention policy
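One way to picture the three types and their mutation rules is an enum plus a permission check. The names `MemoryType`, `Memory`, and `can_modify`, and the actor labels "agent" and "operator", are illustrative assumptions rather than mem9's schema.

```python
from dataclasses import dataclass
from enum import Enum


class MemoryType(Enum):
    PINNED = "pinned"
    INSIGHT = "insight"
    SESSION = "session"


@dataclass
class Memory:
    type: MemoryType
    content: str


def can_modify(memory: Memory, actor: str) -> bool:
    """Return True if `actor` ("agent" or "operator") may change this memory."""
    if memory.type is MemoryType.PINNED:
        return actor == "operator"     # AI-immutable, operator-only writes
    if memory.type is MemoryType.SESSION:
        return False                   # append-only: existing records never change
    return True                        # Insights change via ADD or UPDATE+archive
```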
Asynchronous Write Path: Fast ACK, Async Pipeline
The hook waits for server acceptance; memory extraction and reconciliation continue in the background.
Hot Path (Sync)
  • agent_end fires · plugin batches messages
  • Lightweight POST to /ingest endpoint
  • Server returns accepted/ACK · agent does not wait for Smart Pipeline completion
Off Path (Async)
  • Smart Pipeline runs after acceptance
  • Strip → Extract → Reconcile → Persist
  • No impact from extraction/reconciliation on agent response latency
01
agent_end fires · Plugin POSTs to server · waits only for ACK
02
Async ingest stores raw messages · deduplicates by hash
03
Pipeline extracts/reconciles Insights in background
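The fast-ACK pattern can be sketched with a queue and a background worker. This is a toy stand-in, not the mem9 server: the `IngestServer` class, the `drain` helper, and the 202 status code are assumptions (the text only says the server returns accepted/ACK).

```python
import queue
import threading


class IngestServer:
    """Toy stand-in for the ingest endpoint: ACK first, process later."""

    def __init__(self, pipeline):
        self.pipeline = pipeline               # assumed callable: batch -> None
        self.jobs = queue.Queue()
        worker = threading.Thread(target=self._run, daemon=True)
        worker.start()

    def ingest(self, batch: list) -> int:
        # Hot path: enqueue and return immediately (202 is an assumed status).
        self.jobs.put(batch)
        return 202

    def _run(self) -> None:
        # Off path: the pipeline runs in the background worker.
        while True:
            batch = self.jobs.get()
            try:
                self.pipeline(batch)
            finally:
                self.jobs.task_done()

    def drain(self) -> None:
        # Helper for tests: block until all queued batches are processed.
        self.jobs.join()
```

The caller's latency is bounded by the enqueue, so extraction and reconciliation cost never shows up in the agent's response time.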
CHAPTER 3
Recall and Trust
How mem9 retrieves the right memory at the right time — and proves it can be trusted.
Recall Engine: Hybrid Retrieval with Confidence Scoring
BM25 + vector in parallel, fused with RRF, scored by the Confidence Engine.
BM25 Keyword Search
Exact + partial term matching · names · dates · IDs · Output: keyword rank → RRF contribution
Vector Semantic Search
Embedding similarity · preferences · context · intent · Output: vector similarity → vecNorm + RRF contribution
— Fusion & Scoring
① RRF Fusion
score += 1/(60+rank) per retrieval leg · normalized as rrfNorm
② Confidence Engine
0.55×rrfNorm + 0.20×vecNorm + agreement/evidence/recency/source-prior bonuses
③ Ranked Results
query-time confidence 0–100 · top-k · provenance / chain_source when applicable
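The RRF step can be sketched directly from the `1/(60+rank)` formula. Two details here are assumptions: ranks start at 1, and `rrfNorm` is produced by dividing by the top fused score. The function names are hypothetical.

```python
def rrf_scores(legs: list, k: int = 60) -> dict:
    """Reciprocal rank fusion: each retrieval leg adds 1/(k + rank) per record."""
    scores = {}
    for ranking in legs:
        for rank, doc_id in enumerate(ranking, start=1):   # rank starts at 1 (assumed)
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return scores


def normalize(scores: dict) -> dict:
    # Assumed normalization: divide by the top score so the best hit is 1.0.
    top = max(scores.values(), default=0.0)
    return {d: (s / top if top else 0.0) for d, s in scores.items()}
```

For example, `rrf_scores([["a", "b"], ["b", "c"]])` treats the first list as the BM25 leg and the second as the vector leg; record `b` appears in both and ends up with the highest fused score.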
The Confidence Engine: Keyword + Vector + Bonuses + Source Priors
Multi-factor fusion — not raw similarity. Agreement, recency, and source shape the final score.
Signal Fusion
  • RRF score → rrfNorm · Vector similarity → vecNorm
  • Base formula: 0.55×rrfNorm + 0.20×vecNorm
  • Agreement bonus: +0.10 when keyword and vector both hit the same record
Structured Bonuses
  • Evidence bonuses: literal/keyword/answer evidence can raise confidence
  • Recency: +0.05 if <7 days · +0.02 if <30 days
  • Source prior: exact/time → Session · general → Insights
  • Final score clamped to the 0–100 range
Agreement Bonus · keyword + vector both rank same record → confidence boost
Recency Bonus · fresh memories score higher · memories older than 30 days get no bonus
Source Prior · query shape determines layer bias · Insights vs Session
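The scoring arithmetic above can be sketched as a single function. This covers only the terms with stated weights: the base formula, the agreement bonus, and the recency bonus. Evidence bonuses and source priors are left out because their values are not given, and multiplying by 100 to reach the 0–100 scale is an assumption.

```python
def confidence(rrf_norm: float, vec_norm: float, both_legs_hit: bool,
               age_days: float) -> float:
    """Base formula plus agreement and recency bonuses, on a 0-100 scale."""
    score = 0.55 * rrf_norm + 0.20 * vec_norm
    if both_legs_hit:
        score += 0.10                 # agreement bonus
    if age_days < 7:
        score += 0.05                 # recency bonus, under 7 days
    elif age_days < 30:
        score += 0.02                 # recency bonus, under 30 days
    # Scale to 0-100 (assumed) and clamp.
    return max(0.0, min(100.0, score * 100.0))
```

With perfect normalized scores on both legs, agreement, and a memory one day old, the terms sum to 0.55 + 0.20 + 0.10 + 0.05 = 0.90, which this sketch reports as roughly 90.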
Dual-Layer Memory: Facts First, Session as Fallback
Insights surface first. Raw Session provides full-fidelity fallback when precision matters.
Layer 1 — Insights (Primary)
Reconciled atomic facts · Deduplicated · Searched first · Best for: preferences · background · current state
Layer 2 — Raw Session (Fallback)
Verbatim records · Append-only · Searched when Insights don't satisfy · Best for: exact recall · time-based · verbatim evidence
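The facts-first, session-as-fallback order can be sketched as below. What counts as Insights "satisfying" a query is not specified in the text, so this sketch assumes a confidence threshold; the function name, the `(text, confidence)` hit shape, and the default of 70 are all hypothetical.

```python
def dual_layer_recall(query, search_insights, search_session, min_confidence=70.0):
    """Return Insight hits if any clears the threshold, else raw Session hits.

    Each search callable is assumed to return a list of (text, confidence) pairs.
    """
    hits = search_insights(query)
    if any(conf >= min_confidence for _, conf in hits):
        return "insights", hits
    # Insights did not satisfy the query: fall back to verbatim Session records.
    return "session", search_session(query)
```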
Provenance: Per-Object Metadata on Every Memory Write
Set by the server at write time — not by the agent. Travels with the object through its lifecycle.
Written at Ingest · server stamps agent_id, session_id, appId, metadata, and timestamps
Travels with the Object · visible in Explorer and preserved on archive/supersede history
chain_source on Chain Recalls · adds chain_id, node_position, tenant_id, external_space_id
CHAPTER 4
Spaces, appId, and Space Chain
The isolation, inheritance, and routing model that makes mem9 safe for multi-tenant, multi-app, and multi-team deployments.
Space: Bounded, Governed Memory Context for a Tenant
A Space is the fundamental memory unit — one tenant, one API key, one governed context.
Tenant Boundary
One Space = one tenant · appId provides finer partitioning within
API Key Access
Each Space accessed via API key · Keys rotatable and revocable
Unified Recall API
BM25 + vector hybrid · filterable by type · state · appId · agent · tags
Governed Lifecycle
Explicit creation/deletion · active/archived states · retention policies per Space
appId: Optional Application Isolation Within a Space
appId scopes memory/session writes and read filters within one Space; it is not a separate tenant.
— Write Rules
appId: "support"
Writes belong to the support sub-space
omitted / null / empty / whitespace appId
Writes belong to the default/global appId
API key ownership
Permissions, quota, and billing are the same regardless of appId
— Query Rules
omitted appId
Searches across all appIds under the API key
non-empty appId
Searches only that exact sub-space
appId=null or appId=""
Searches only the default/global appId
Use non-empty appId for app isolation; omit appId only when broad cross-app recall is intended.
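The write and query rules differ in how they treat a missing appId, which a short sketch makes explicit. The function names, the `OMITTED` sentinel, and the use of an empty string to represent the default/global appId are illustrative assumptions.

```python
from typing import Optional

DEFAULT_APP = ""          # stand-in for the default/global appId (assumed representation)
OMITTED = object()        # sentinel: the caller did not send appId at all


def write_app_id(app_id: Optional[str]) -> str:
    # Omitted, null, empty, and whitespace-only values all map to the default appId.
    if app_id is None or not app_id.strip():
        return DEFAULT_APP
    return app_id


def query_filter(app_id=OMITTED) -> Optional[str]:
    """Return None to search all appIds, or the single appId to match."""
    if app_id is OMITTED:
        return None                      # broad cross-app recall
    if app_id is None or app_id == "":
        return DEFAULT_APP               # only the default/global appId
    return app_id                        # only that exact sub-space
```

The asymmetry is the point: omitting appId on a write lands in the default partition, while omitting it on a query searches every partition under the key.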
Space Chain: Ordered Linear Chain of Spaces
An ordered linear sequence of Spaces accessed via a single chain_ key — not a graph.
1
Node 0 — First Active Node
Default ADD target · Recall starts here · No routing policy
2
Node 1 — Middle Node
Searched after Node 0 · Natural-language routing policy · Write or webhook-only
3
Node 2 — Last Node
Broadest scope · Routing policy · chain_source identifies this node
Ordered Traversal
Recall searches nodes in sequence · high-confidence early stop ends search
scanAll Mode
Searches all nodes · reranks aggregate · chain_source on every result
chain_ Key
Distinct from Space API keys · encodes ordered node list + traversal config
Space Chain Recall: Early Stop and scanAll
Two modes control the precision-vs-coverage tradeoff. Both return chain_source provenance on every result.
chain_ key traversal: early-stop at first high-confidence hit
Node 0 — Personal Space
Searched first · most specific context · Early Stop: high-confidence found → stop · scanAll: collect + continue
Node 1 — Team Space
Searched after Personal · shared team knowledge · Early Stop: high-confidence found → stop · scanAll: collect + continue
Node 2 — Company Space
Searched last · broadest organizational knowledge · Early Stop: returns results if reached · scanAll: all results merged + reranked
Early Stop
Stops at first node with high-confidence result · fastest path · chain_source included
scanAll
Searches all nodes regardless of confidence · aggregate reranked · chain_source on every result
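The two traversal modes can be sketched as one loop with a flag. This is a simplified model: the function name, the threshold of 80, `top_k`, and sorting by confidence as the reranker are assumptions, and `chain_source` is reduced to `node_position` only. Whether early stop also returns lower-confidence hits from nodes visited before the stop is not stated; this sketch keeps them.

```python
def chain_recall(nodes, query, scan_all=False, threshold=80.0, top_k=5):
    """Traverse chain nodes in order.

    `nodes` is an ordered list of search callables, each returning a list of
    (text, confidence) pairs. Threshold and top_k are assumed values.
    """
    collected = []
    for position, search in enumerate(nodes):
        hits = [
            {"text": t, "confidence": c, "chain_source": {"node_position": position}}
            for t, c in search(query)
        ]
        collected.extend(hits)
        # Early stop: a high-confidence hit at this node ends the traversal.
        if not scan_all and any(h["confidence"] >= threshold for h in hits):
            break
    # Rerank the aggregate by confidence (a stand-in for the real reranker).
    collected.sort(key=lambda h: h["confidence"], reverse=True)
    return collected[:top_k]
```

Early stop trades coverage for latency: a strong personal hit hides a possibly stronger team or company hit, which is the case scanAll exists for.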
Space Chain Writes: First Active Node, Locate-by-ID
ADD goes to the first active node. UPDATE and DELETE locate the owning node by ID — no broadcast.
ADD → First Active Node
All new writes via chain_ key go to Node 0 · most specific Space in the chain
UPDATE / DELETE → Locate-by-ID
Server finds the object across chain nodes · writes only to owning node
No Automatic Propagation
Routing policies on non-first nodes control ingest-time routing · not direct writes
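The write rules can be sketched with a small in-memory chain. The `ChainWriter` class and the node shape (`active` flag plus an `objects` dict) are hypothetical; only the behavior, ADD to the first active node and UPDATE/DELETE on the owning node, comes from the text.

```python
class ChainWriter:
    def __init__(self, nodes: list):
        # Each node: {"active": bool, "objects": {id: content}} (hypothetical shape).
        self.nodes = nodes

    def add(self, obj_id: str, content: str) -> int:
        # ADD targets the first active node in chain order.
        for pos, node in enumerate(self.nodes):
            if node["active"]:
                node["objects"][obj_id] = content
                return pos
        raise RuntimeError("no active node in chain")

    def _locate(self, obj_id: str) -> int:
        # Find the single node that owns this object ID.
        for pos, node in enumerate(self.nodes):
            if obj_id in node["objects"]:
                return pos
        raise KeyError(obj_id)

    def update(self, obj_id: str, content: str) -> int:
        # UPDATE writes only to the owning node; no broadcast.
        pos = self._locate(obj_id)
        self.nodes[pos]["objects"][obj_id] = content
        return pos

    def delete(self, obj_id: str) -> int:
        # DELETE removes the object from the owning node only.
        pos = self._locate(obj_id)
        del self.nodes[pos]["objects"][obj_id]
        return pos
```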
Routing Policies: Natural-Language Prompts on Non-First Nodes
After extraction, each fact is evaluated against non-first node policies — write, webhook-only, or no match.
Extraction Complete
Classified facts with confidence scores.
Each is a routing candidate.
First Node Reconcile (always)
Candidate evaluated against Node 0 memory.
ADD / UPDATE+archive / No-op · Node 0 is the default target and has no routing policy.
memory.added fires only when a new object is written.
Non-First Node Policy Eval
Natural-language prompt per node.
Write · Webhook-only · No match
Routing Outcome
Write: fact stored in target Space; memory.added fires when ADD creates a memory.
Webhook-only: space_chain.fact_routed emitted.
No match: not routed.
Space Chain Provenance: chain_source Response Fields
Every chain_ key recall result includes chain_source so callers know which chain node returned the object.
Per-Result Provenance
In scanAll, every result carries its own chain_source block.
Node Position, Not Name
Integer index; meaning is defined by chain configuration.
Object Fields Stay Separate
agent_id, session_id, and metadata.source_turns remain on the memory object, not inside chain_source.
CHAPTER 5
Console and Automation
The operator interface for governing, inspecting, analyzing, and automating mem9 — without writing a single API call.
Console Hierarchy: Organization → Project → Space → Key
Four levels. Each with a distinct scope and responsibility.
01
Organization · Top-level account boundary · all Projects and Spaces belong here
02
Project · Logical grouping of Spaces · organize by product · team · or deployment
03
Space · Fundamental memory unit · API key · appId partitions · retention · recall config
04
Key · API key for Space or chain_ access · create · rotate · revoke from Space detail
Space Detail: Key Metrics, Imports, and Memory Workbench
The operator's primary workspace — live metrics, import tooling, and direct memory management.
Key Metrics
  • Total objects by type: Pinned / Insight / Session
  • Active vs archived counts
  • Recall request count · sessions ingested
  • Average confidence score across Insights
Space Configuration
  • API key assignment and rotation
  • appId partition list with per-appId counts
  • Retention policy: Session expiry · Insight archival
  • Recall default: early stop vs scanAll
Import Tooling
bulk import via CSV · JSON · or API · typed + provenance-stamped
Memory Workbench
browse · search · filter · bulk archive · pin · export
Mutation Metadata
operator identity · timestamp · operation type per object
Memories Explorer: Browse, Filter, and Manage Memory Objects
Full-text search and multi-dimension filtering — all within the Console, no API calls.
Text Search
Full-text across content + metadata · ranked by relevance
Filter by Type
Pinned · Insight · Session · combinable
Filter by State
Active or archived · archived objects remain queryable
Filter by Agent, Tags, or appId
multi-select · scope to a specific partition
Memory Analysis and Deep Analysis
Both run off the hot path. All recommendations require explicit operator action.
Taxonomy
categorizes objects by topic + entity type · filterable in Memories Explorer
Duplicate Cleanup
identifies near-duplicate Insights · operator reviews + archives · no auto-delete
Deep Analysis
operator-initiated · generates insight reports: knowledge gaps · stale facts · summary Insights
Insight Reports
advisory only · operator decides which recommendations to act on
Webhooks: React to Memory Events Without Polling
Signed JSON over HTTPS, pushed to your endpoint. Three event types. No polling required.
Push, Not Pull
events delivered to your endpoint when they occur · no polling · no Recall API quota
Signed JSON over HTTPS
signing_secret provided once at creation or rotation · HMAC verified per delivery
Retry with Backoff
failed deliveries retried with exponential backoff · full delivery history in Console
Webhook Lifecycle: Configure → Sign → Deliver → Retry → Inspect
The signing_secret is provided once — store it securely. All steps visible in the Console.
01
Configure · Register HTTPS endpoint · select event types · set scope: Space or Space Chain
02
Sign · signing_secret generated once · store securely · HMAC verifies every delivery · rotate to invalidate
03
Deliver · Event fires → POST signed JSON to endpoint · payload: event type · object data · provenance · timestamp
04
Retry · Non-2xx or timeout → exponential backoff · failed deliveries logged with response code + latency
05
Inspect · Console shows full delivery history · event type · payload · response code · latency · retry count · manual re-trigger
HTTPS Required in Production · HTTP only in development environments
Scoped Subscriptions · one Space or one Space Chain per webhook
signing_secret Rotation · old secret immediately invalidated · no grace period overlap
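On the receiving side, HMAC verification might look like the sketch below. The text states only that deliveries are HMAC-verified with the signing_secret; the choice of SHA-256, hex encoding, and signing the raw request body are assumptions, and the function names are hypothetical. Check the actual delivery headers and scheme before relying on this.

```python
import hashlib
import hmac


def sign(secret: str, body: bytes) -> str:
    # Assumed scheme: hex-encoded HMAC-SHA256 over the raw request body.
    return hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()


def verify(secret: str, body: bytes, signature: str) -> bool:
    # compare_digest performs a constant-time comparison.
    return hmac.compare_digest(sign(secret, body).encode(), signature.encode())
```

Verify against the raw bytes as received, before any JSON parsing, since re-serializing the payload can change the byte sequence and break the signature. After a rotation, signatures made with the old secret fail this check immediately.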
Webhook Events: memory.added, memory.deleted, space_chain.fact_routed
Three supported event types. Each payload carries the event type, timestamp, and the relevant memory or routing data.
memory.added
Trigger: Direct writes · pinned writes · smart ingest ADD · successful routed target writes
  • memory.id · content · memory_type · agent_id · session_id · appId · tags · metadata · created_at · updated_at
Note: Smart ingest UPDATE-only reconciliation does not emit memory.added
memory.deleted
Trigger: Single-memory or batch delete succeeds
  • memory.id · tenant_id · deleted_by_agent · deleted_at
Note: Hard delete only · archive does not fire
space_chain.fact_routed
Trigger: Non-first-node routing policy matches a fact
  • route_id · chain_id · source_tenant_id · target_tenant_id · target_external_space_id · routing_policy_node_id · source_facts · target_memory · webhook_only · agent_id · appId · session_id
Note: Emitted for routed writes and webhook-only routing; failed/quota-denied target writes do not emit a routed-write event
Only these 3 event types are supported. There is no memory.updated event; UPDATE-only reconciliation archives/supersedes internally and does not emit memory.added or memory.deleted.
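A receiver that keeps a local index in sync could dispatch on the three event types as sketched here. The envelope keys `type` and `memory` are assumptions about the payload shape; the field names `id` and `content` follow the memory.added field list above.

```python
def handle_event(event: dict, index: dict) -> str:
    """Apply one webhook event to a local id -> content index."""
    kind = event.get("type")                       # assumed envelope key
    if kind == "memory.added":
        mem = event["memory"]                      # assumed envelope key
        index[mem["id"]] = mem["content"]
        return "indexed"
    if kind == "memory.deleted":
        index.pop(event["memory"]["id"], None)     # hard delete only
        return "removed"
    if kind == "space_chain.fact_routed":
        return "routed"
    return "ignored"                               # there is no memory.updated event
```

Because UPDATE-only reconciliation emits no event, an index built this way can lag behind superseded Insights; a periodic recall or export is one way to reconcile it.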
CHAPTER 6
Deployment and Differentiation
How mem9 deploys, what makes it different, and why the architecture matters for production agent fleets.
Four Cooperating Planes: Runtime · Control · Visualization · Analysis
Four planes. One complete memory infrastructure. Agent touches only the Runtime.
Runtime Plane — Always On
Go server · TiDB (SQL + vector) · Ingest · Smart Pipeline · Hybrid Recall · Provenance on every write
Control Plane — Operator Interface
API key management · Space config · Space Chain · Webhook config · signing_secret rotation
Visualization Plane — Human Window
Memories Explorer · Space detail · Memory Analysis · Deep Analysis reports
Analysis Plane — Off-Path Intelligence
Taxonomy · Duplicate detection · Deep Analysis jobs · All outputs advisory · operator approval required
Agent touches only the Runtime Plane
Control · Visualization · Analysis are operator-facing only
Analysis Plane is always async
Never on the hot path · never blocks agent recall
Ecosystem Flexibility: OpenClaw-First, API-First, Self-Hosted or Managed
Runs wherever your agents run — self-hosted, managed, or hybrid. No vendor lock-in.
Self-Hosted
  • Full data sovereignty · your infrastructure
  • Custom retention and lifecycle policies
  • Bring your own embedding model
Managed (mem9.ai)
  • mem9 operates the runtime on your behalf
  • Managed embedding and indexing pipeline
  • Console access without operating the runtime
Embedding Model Agnostic
pluggable models · no vendor lock-in on vector layer
OpenClaw Native
drop-in replacement · one config change · no agent retraining
API-First
every capability via REST API · integrate with any framework or stack
Framework Agnostic
REST API works with any agent framework · not OpenClaw-specific
Enterprise Memory Scorecard
OpenClaw Native vs mem9 — production readiness across 12 dimensions.
OpenClaw Native helps an agent keep notes.
mem9 helps a fleet of agents keep memory.
Governed
Typed writes · Pinned facts · Provenance on every object
Isolated
Space-level tenant isolation · appId-level app isolation
Persistent
Space Chains · Cross-session continuity · Cold-start solved