
mind-mem

GitHub
Website

Persistent, auditable, contradiction-safe memory for coding agents. Hybrid BM25 + vector retrieval, 19 MCP tools, co-retrieval graph, MIND-accelerated scoring. Zero external dependencies.

mind-mem

Drop-in memory for Claude Code, OpenClaw, and any MCP-compatible agent. (OpenClaw is an open-source AI assistant platform with multi-channel support: https://github.com/openclaw/openclaw.)

Local-first • Zero-infrastructure • Governance-aware • MIND-accelerated

PyPI: mind-mem (Apache-2.0) • zero core dependencies • MCP-compatible • MIND-accelerated • 3280 tests • 44 MCP tools • 3-LLM joint audit per release • released locally (no GitHub Actions)

Drop-in memory layer for AI coding agents — Claude Code, Claude Desktop, Codex CLI, Gemini CLI, Cursor, Windsurf, Zed, OpenClaw, or any MCP-compatible client. Upgrades your agent from "chat history + notes" to a governed Memory OS with hybrid search, RRF fusion, intent routing, optional MIND kernels, structured persistence, contradiction detection, drift analysis, safe governance, and full audit trail.

If your agent runs for weeks, it will drift. mind-mem prevents silent drift.

mind-mem powers the Memory Plane of the MIND Cognitive Kernel — the deterministic AI runtime architecture.

Shared Memory Across All Your AI Agents

This is the killer feature. When you install mind-mem, all your AI coding agents share the same memory workspace. Claude Code, Codex CLI, Gemini CLI, Cursor, Windsurf, Zed — every MCP-compatible client connects to the same persistent memory through a single workspace.

What this means in practice:

  • A decision made in Claude Code is instantly recalled by Codex CLI and Gemini CLI
  • Entity knowledge (projects, tools, people) accumulates from all sessions across all agents
  • Contradictions detected by one agent are flagged to all others
  • Your memory doesn't fragment across tools — it compounds

One install script, all agents configured in seconds:

git clone https://github.com/star-ga/mind-mem.git
cd mind-mem
./install.sh --all    # Auto-detects and configures every AI coding client on your machine

The installer auto-detects Claude Code, Claude Desktop, Codex CLI, Gemini CLI, Cursor, Windsurf, Zed, and OpenClaw — creates a shared workspace and wires the MCP server into each client's config. SQLite WAL mode ensures safe concurrent access: one writer, many readers, zero corruption.

30-Second Demo

pip install mind-mem
mind-mem-init ~/my-workspace        # Create workspace
mind-mem-recall -q "API decisions" --workspace ~/my-workspace  # Hybrid BM25F search
mind-mem-scan ~/my-workspace        # Detect drift & contradictions
(demo.gif: mind-mem recall and scan demo)

Output:

[1.204] D-20260215-001 (decision) — Use async/await for all API endpoints
        decisions/DECISIONS.md:11
[1.094] D-20260210-003 (decision) — REST over GraphQL for public API
        decisions/DECISIONS.md:20

Trust Signals

| Principle | What it means |
|---|---|
| Deterministic | Same input, same output. No ML in the core, no probabilistic mutations. |
| Auditable | Every apply logged with timestamp, receipt, and DIFF. Full traceability. |
| Local-first | All data stays on disk. No cloud calls, no telemetry, no phoning home. |
| No vendor lock-in | Plain Markdown files. Move to any system, any time. |
| Zero magic | Every check is a grep, every mutation is a file write. Read the source in 30 minutes. |
| No silent mutation | Nothing writes to the source of truth without an explicit /apply. Ever. |
| Zero infrastructure | No Redis, no Postgres, no vector DB, no GPU. Python 3.10+ and stdlib only. |
| 100% NIAH | 250/250 Needle In A Haystack retrieval. Every needle, every depth, every size. |

Table of Contents

  • Why mind-mem
  • Features
  • Benchmark Results
  • Quick Start
  • Health Summary
  • Commands
  • Architecture
  • How It Compares
  • Recall
  • MIND Kernels
  • Auto-Capture
  • Multi-Agent Memory
  • Governance Modes
  • Block Format
  • Configuration
  • MCP Server
  • Security
  • Troubleshooting
  • Contributing
  • License

Why mind-mem

Most memory plugins store and retrieve. That's table stakes.

mind-mem also detects when your memory is wrong — contradictions between decisions, drift from informal choices never formalized, dead decisions nobody references, orphan tasks pointing at nothing — and offers a safe path to fix it.

| Problem | Without mind-mem | With mind-mem |
|---|---|---|
| Contradicting decisions | Follows whichever was seen last | Flags, links both, proposes a fix |
| Informal chat decision | Lost after the session ends | Auto-captured, proposed for formalization |
| Stale decision | Zombie confuses future sessions | Detected as dead, flagged |
| Orphan task reference | Silent breakage | Caught in integrity scan |
| Scattered recall quality | Single-mode search misses context | Hybrid BM25+Vector+RRF fusion finds it |
| Ambiguous query intent | One-size-fits-all retrieval | 9-type intent router optimizes parameters |

Novel Contributions

mind-mem introduces several techniques not found in existing memory systems:

| Technique | What's new | Why it matters |
|---|---|---|
| Co-retrieval graph | PageRank-like score propagation across blocks frequently retrieved together | Surfaces structurally relevant blocks with zero lexical overlap (+2.0pp accuracy) |
| Fact card sub-block indexing | Atomic fact extraction → small-to-big retrieval with parent score blending | Catches fine-grained facts that full-block BM25 misses (+2.6pp accuracy) |
| Adaptive knee cutoff | Score-drop-based truncation instead of fixed top-K (see the sketch below) | Eliminates noise that hurts LLM judges — returns 3-15 results adaptively |
| Hard negative mining | Logs BM25-high / cross-encoder-low blocks as misleading, penalizes them in future queries | Self-improving retrieval: precision increases over time without retraining |
| Deterministic abstention | Pre-LLM confidence gate using 5-signal scoring (entity, BM25, speaker, evidence, negation) | Prevents hallucinated answers to unanswerable questions — no ML required |
| Governance pipeline | Contradiction detection + drift analysis + safe apply with audit trail | The only memory system that detects when stored knowledge is wrong |
| Agent-agnostic shared memory | Single MCP workspace shared across Claude Code, Codex, Gemini, Cursor, Windsurf, Zed | Memory compounds across tools instead of fragmenting |
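
The adaptive knee cutoff is simple to sketch. A minimal Python illustration, assuming a descending score list; min_k, max_k, and the drop threshold here are illustrative, not mind-mem's tuned values:

def knee_cutoff(scores, min_k=3, max_k=15, drop_ratio=0.5):
    """Truncate a descending score list at the first large relative
    drop (the "knee"), returning a cut index in [min_k, max_k]."""
    for i in range(min_k, min(len(scores), max_k)):
        prev, cur = scores[i - 1], scores[i]
        if prev > 0 and (prev - cur) / prev >= drop_ratio:
            return i  # cut before the knee
    return min(len(scores), max_k)

results = [1.204, 1.094, 1.010, 0.980, 0.310, 0.290]
print(results[:knee_cutoff(results)])  # keeps the 4 high scorers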

Features

Hybrid BM25+Vector Search with RRF Fusion

Thread-parallel BM25 and vector search with Reciprocal Rank Fusion (k=60). Configurable weights per signal. Vector is optional — works with just BM25 out of the box.

RM3 Dynamic Query Expansion

Pseudo-relevance feedback using JM-smoothed language models. Expands queries with top terms from initial result set. Falls back to static synonyms for adversarial queries. Zero dependencies.

9-Type Intent Router

Classifies queries into WHY, WHEN, ENTITY, WHAT, HOW, LIST, VERIFY, COMPARE, or TRACE. Each intent type maps to optimized retrieval parameters (limits, expansion settings, graph traversal depth).

A-MEM Metadata Evolution

Auto-maintained per-block metadata: access counts, importance scores (clamped to [0.8, 1.5] reranking boost), keyword evolution, and co-occurrence tracking. Importance decays with exponential recency.
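
The decayed-importance boost can be pictured as follows. A minimal sketch, assuming a log-shaped access-count signal and a 30-day half-life; both are illustrative, and the actual evolution logic lives in block_metadata.py:

import math, time

def importance_boost(access_count, last_access_ts,
                     half_life_days=30.0, now=None):
    """Exponential recency decay on an access-count signal, clamped
    to the [0.8, 1.5] reranking range described above."""
    now = now if now is not None else time.time()
    age_days = (now - last_access_ts) / 86400.0
    decay = 0.5 ** (age_days / half_life_days)
    raw = 1.0 + 0.1 * math.log1p(access_count) * decay
    return max(0.8, min(1.5, raw))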

Deterministic Reranking

Four-signal reranking pipeline: negation awareness (penalizes contradicting results), date proximity (Gaussian decay), 20-category taxonomy matching, and recency boosting. No ML required.

Optional Cross-Encoder

Drop-in ms-marco-MiniLM-L-6-v2 cross-encoder (80MB). Blends 0.6 * CE + 0.4 * original score. Falls back gracefully when unavailable. Enabled via config.
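
A minimal sketch of the blend and its fallback, using the sentence-transformers CrossEncoder API; the error handling here is illustrative:

def ce_rerank(query, docs, scores, weight=0.6):
    """Blend cross-encoder and original scores (0.6 * CE + 0.4 * orig),
    falling back to the originals when the model is unavailable."""
    try:
        from sentence_transformers import CrossEncoder
        model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
        ce = model.predict([(query, d) for d in docs])
    except Exception:
        return list(scores)  # graceful fallback, as described above
    return [weight * c + (1.0 - weight) * s for c, s in zip(ce, scores)]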

MIND Kernels (Optional, Native Speed)

17 compiled MIND scoring kernels (BM25F, RRF fusion, reranking, negation penalty, date proximity, category boost, importance, entity overlap, confidence, top-k, weighted rank, category affinity, query-category relevance, category assignment). Compiles to native .so via the MIND compiler. Pure Python fallback always available — no functionality is lost without compilation.

BM25F Hybrid Recall

BM25F field-weighted scoring (k1=1.2, b=0.75) with per-field weighting (Statement: 3x, Title: 2.5x, Name: 2x, Summary: 1.5x), Porter stemming, bigram phrase matching (25% boost per hit), overlapping sentence chunking (3-sentence windows with 1-sentence overlap), domain-aware query expansion, and optional 2-hop graph-based cross-reference neighbor boosting. Zero dependencies. Fast and deterministic.
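
The BM25F shape (pool field-weighted term frequencies, then saturate once) can be sketched as follows. The weight table and k1/b come from the description above; everything else, including the corpus-statistics inputs, is illustrative:

import math

# Field weights from the description above; Context from the Recall section.
FIELD_WEIGHTS = {"Statement": 3.0, "Title": 2.5, "Name": 2.0,
                 "Summary": 1.5, "Context": 0.5}

def bm25f(query_terms, doc_fields, df, n_docs, avg_len, k1=1.2, b=0.75):
    """BM25F sketch: pool field-weighted term frequencies, normalize by
    per-field length, then saturate once with k1. doc_fields maps field
    name -> token list; df/n_docs/avg_len are corpus statistics."""
    score = 0.0
    for term in query_terms:
        pooled = 0.0
        for field, tokens in doc_fields.items():
            weight = FIELD_WEIGHTS.get(field, 1.0)
            norm = 1.0 - b + b * len(tokens) / avg_len.get(field, 1.0)
            pooled += weight * tokens.count(term) / norm
        d = df.get(term, 0)
        idf = math.log(1.0 + (n_docs - d + 0.5) / (d + 0.5))
        score += idf * pooled / (k1 + pooled)
    return score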

Graph-Based Recall

2-hop cross-reference neighbor boosting — when a keyword match is found, blocks that reference or are referenced by the match get boosted (1-hop: 0.3x decay, 2-hop: 0.1x decay). Surfaces related decisions, tasks, and entities that share no keywords but are structurally connected. Auto-enabled for multi-hop queries.
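
A minimal sketch of the 2-hop boost, assuming a neighbors map built from cross-references (the real traversal also tags boosted results with [graph]):

def graph_boost(scores, neighbors, hop1=0.3, hop2=0.1):
    """Propagate matched-block scores along cross-references:
    1-hop neighbors gain 0.3x of the source score, 2-hop gain 0.1x.
    neighbors maps block_id -> iterable of directly linked block ids."""
    boosted = dict(scores)
    for block_id, score in scores.items():
        for n1 in neighbors.get(block_id, ()):
            boosted[n1] = boosted.get(n1, 0.0) + hop1 * score
            for n2 in neighbors.get(n1, ()):
                if n2 != block_id:
                    boosted[n2] = boosted.get(n2, 0.0) + hop2 * score
    return boosted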

Vector Recall (optional)

Pluggable embedding backend — local ONNX (all-MiniLM-L6-v2, no server needed) or cloud (Pinecone). Falls back to BM25 when unavailable.

Persistent Memory

Structured, validated, append-only decisions / tasks / entities / incidents with provenance and supersede chains. Plain Markdown files — readable by humans, parseable by machines.

Immune System

Continuous integrity checking: contradictions, drift, dead decisions, orphan tasks, coverage scoring, regression detection. 74+ structural validation rules.

Safe Governance

All changes flow through graduated modes: detect_only → propose → enforce. Apply engine with snapshot, receipt, DIFF, and automatic rollback on validation failure.

Adversarial Abstention Classifier

Deterministic pre-LLM confidence gate for adversarial/verification queries. Computes confidence from entity overlap, BM25 score, speaker coverage, evidence density, and negation asymmetry. Below threshold → forces abstention without calling the LLM, preventing hallucinated answers to unanswerable questions.
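
A minimal sketch of the gate, assuming the five signals are normalized to [0, 1]; the weights and threshold here are illustrative, not mind-mem's calibrated values:

def should_abstain(signals, weights=None, threshold=0.35):
    """Deterministic pre-LLM confidence gate over the five signals
    named above. Returns True when the query should be abstained
    from instead of being passed to the LLM."""
    weights = weights or {
        "entity_overlap": 0.30, "bm25": 0.25, "speaker_coverage": 0.15,
        "evidence_density": 0.20, "negation_asymmetry": 0.10,
    }
    confidence = sum(weights[k] * signals.get(k, 0.0) for k in weights)
    return confidence < threshold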

Auto-Capture with Structured Extraction

Session-end hook detects decision/task language (26 patterns with confidence classification), extracts structured metadata (subject, object, tags), and writes to SIGNALS.md only. Never touches source of truth directly. All signals go through /apply.

Concurrency Safety

Cross-platform advisory file locking (fcntl/msvcrt/atomic create) protects all concurrent write paths. Stale lock detection with PID-based cleanup. Zero dependencies.

Compaction & GC

Automated workspace maintenance: archive completed blocks, clean up old snapshots, compact resolved signals, archive daily logs into yearly files. Configurable thresholds with dry-run mode.

Observability

Structured JSON logging (via stdlib), in-process metrics counters, and timing context managers. All scripts emit machine-parseable events. Controlled via MIND_MEM_LOG_LEVEL env var.

Multi-Agent Namespaces & ACL

Workspace-level + per-agent private namespaces with JSON-based ACL. fnmatch pattern matching for agent policies. Shared fact ledger for cross-agent propagation with dedup and review gate.

Automated Conflict Resolution

Graduated resolution pipeline: timestamp priority, confidence priority, scope specificity, manual fallback. Generates supersede proposals with integrity hashes. Human veto loop — never auto-applies without review.

Write-Ahead Log (WAL) + Backup/Restore

Crash-safe writes via journal-based WAL. Full workspace backup (tar.gz), git-friendly JSONL export, selective restore with conflict detection and path traversal protection.

Transcript JSONL Capture

Scans Claude Code transcript files for user corrections, convention discoveries, bug fix insights, and architectural decisions. 16 transcript-specific patterns with role filtering and confidence classification.

MCP Server (32 tools, 8 resources)

Full Model Context Protocol server with 32 tools and 8 read-only resources. Works with Claude Code, Claude Desktop, Cursor, Windsurf, and any MCP-compatible client. HTTP and stdio transports with optional bearer token auth.

74+ Structural Checks + 3024 Unit Tests

validate.sh checks schemas, cross-references, ID formats, status values, supersede chains, ConstraintSignatures, and more. Backed by 3024 pytest unit tests covering all core modules.

Audit Trail

Every applied proposal logged with timestamp, receipt, and DIFF. Full traceability from signal → proposal → decision.

Calibration Feedback Loop

Per-block quality tracking with Bayesian weight computation. When users provide feedback (thumbs up/down) via calibration_feedback, the system maintains a rolling quality score per block over a 30-day window. Bayesian smoothing constrains calibration weights to the 0.5-1.5 range, preventing any single block from dominating or being silenced. Calibration weights integrate directly into the BM25 + FTS5 retrieval pipeline — high-quality blocks rank higher, low-quality blocks are naturally demoted. Use calibration_stats to inspect per-block quality distributions and global calibration health.
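
The Bayesian constraint can be sketched as a Beta-smoothed quality estimate mapped onto the 0.5-1.5 range; the priors below are illustrative, not mind-mem's actual parameters:

def calibration_weight(ups, downs, prior_pos=2.0, prior_neg=2.0):
    """Beta-smoothed quality estimate mapped to the 0.5-1.5 weight
    range described above. With no feedback the weight is neutral
    (1.0); heavy one-sided feedback can never push it past the caps."""
    quality = (ups + prior_pos) / (ups + downs + prior_pos + prior_neg)
    return 0.5 + quality  # quality in (0, 1) -> weight in (0.5, 1.5)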

LLM-Guided Multi-Query Expansion

Generates semantically diverse query reformulations before search — synonym expansion, specificity shifts, temporal rephrasing, and negation variants. Combines all reformulated queries with Reciprocal Rank Fusion for broader recall without sacrificing precision. Runs locally with zero API calls.

4-Layer Search Deduplication

Post-retrieval dedup pipeline: best-chunk-per-source (keeps highest-scoring chunk from each file), cosine similarity dedup (>0.85 threshold), type diversity capping (max 3 results per block type), and per-source chunk limiting. Eliminates redundant results that waste LLM context.
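
A condensed sketch of the pipeline (layers 1-3; layer 4's per-source chunk limit is folded into layer 1 here). sim() is an assumed cosine-similarity helper over result embeddings:

def dedupe(results, sim, max_per_type=3):
    """Dedup sketch: keep the best-scoring chunk per source file,
    drop near-duplicates (similarity > 0.85), and cap results per
    block type. Each result is a dict with score/source/type keys."""
    best = {}
    for r in sorted(results, key=lambda r: r["score"], reverse=True):
        best.setdefault(r["source"], r)          # layer 1: best per source
    kept, per_type = [], {}
    for r in best.values():
        if any(sim(r, k) > 0.85 for k in kept):  # layer 2: near-dup drop
            continue
        t = r["type"]
        if per_type.get(t, 0) >= max_per_type:   # layer 3: type diversity
            continue
        per_type[t] = per_type.get(t, 0) + 1
        kept.append(r)
    return kept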

LLM-Guided Smart Chunking

Content-aware chunking that splits at semantic boundaries (headers, paragraph breaks, list items, code blocks) instead of fixed character counts. Produces variable-size chunks with overlap for continuity. Supports markdown, code, and prose with format-specific splitting rules.

Compiled Truth Pages

Per-entity knowledge compilation: current-best-understanding on top, timestamped evidence trail below. Contradiction detection across evidence entries with automatic flagging. Entities accumulate knowledge from all sessions — each new evidence entry is checked against existing facts.

Dream Cycle (Autonomous Memory Enrichment)

Scheduled background enrichment: scans recent memory for missing cross-references, broken citations, orphan entities, and consolidation opportunities. Generates repair proposals for stale links, detects implicit entities not yet formalized, and compacts redundant entries. Runs during idle periods with configurable depth.

Feature Completeness Matrix

| Capability | mind-mem | Mem0 | Zep | Letta | LangMem |
|---|---|---|---|---|---|
| BM25 lexical search | Y | — | — | — | — |
| Vector semantic search | Y | Y | Y | Y | Y |
| Hybrid BM25+Vector+RRF | Y | — | — | — | — |
| Cross-encoder reranking | Y | — | — | — | — |
| Intent-aware routing (9 types) | Y | — | — | — | — |
| RM3 query expansion | Y | — | — | — | — |
| Co-retrieval graph (PageRank) | Y | — | — | — | — |
| Fact sub-block indexing | Y | — | — | — | — |
| Hard negative mining | Y | — | — | — | — |
| Adaptive knee cutoff | Y | — | — | — | — |
| Contradiction detection | Y | — | — | — | — |
| Drift analysis | Y | — | — | — | — |
| Governance pipeline (propose/apply) | Y | — | — | — | — |
| Multi-agent shared memory (MCP) | Y | — | — | Y | — |
| Zero core dependencies | Y | — | — | — | — |
| Local-only (no cloud required) | Y | — | — | — | — |
| Compiled native kernels (MIND) | Y | — | — | — | — |
| Backup/restore with zip-slip protection | Y | — | — | — | — |
| Multi-query expansion with RRF | Y | — | — | — | — |
| 4-layer search deduplication | Y | — | — | — | — |
| Semantic-aware smart chunking | Y | — | — | — | — |
| Compiled truth pages (per-entity) | Y | — | — | — | — |
| Dream cycle (autonomous enrichment) | Y | — | — | — | — |

Benchmark Results

mind-mem's recall engine was evaluated on standard long-term memory benchmarks using multiple configurations, from pure BM25 to full hybrid retrieval with neural reranking.

Needle In A Haystack (NIAH)

250/250 — 100% retrieval across all haystack sizes, burial depths, and needle types.

A single fact is planted at a controlled depth within a haystack of semantically diverse filler blocks. The system must retrieve the needle in its top-5 results using only a natural-language query.

| Haystack Size | Depths Tested | Needles | Passed | Rate |
|---|---|---|---|---|
| 10 blocks | 0/25/50/75/100% | 10 | 50/50 | 100% |
| 50 blocks | 0/25/50/75/100% | 10 | 50/50 | 100% |
| 100 blocks | 0/25/50/75/100% | 10 | 50/50 | 100% |
| 250 blocks | 0/25/50/75/100% | 10 | 50/50 | 100% |
| 500 blocks | 0/25/50/75/100% | 10 | 50/50 | 100% |

Config: Hybrid BM25 + BAAI/bge-large-en-v1.5 + RRF (k=60) + sqlite-vec. Full details: benchmarks/NIAH.md

LoCoMo LLM-as-Judge

Same pipeline as Mem0 and Letta evaluations: retrieve context, generate answer with LLM, score against gold reference with judge LLM. Directly comparable methodology.

v1.0.7 — Hybrid + top_k=18 (Mistral answerer + judge, conv-0, 199 questions):

| Category | N | Acc (≥50) | Mean Score |
|---|---|---|---|
| Overall | 199 | 92.5% | 76.7 |
| Adversarial | 47 | 97.9% | 89.8 |
| Multi-hop | 37 | 91.9% | 74.3 |
| Open-domain | 70 | 92.9% | 72.7 |
| Temporal | 13 | 92.3% | 76.2 |
| Single-hop | 32 | 84.4% | 68.9 |

Pipeline: BM25 + Qwen3-Embedding-8B (4096d) vector search → RRF fusion (k=60) → top-18 evidence blocks → observation compression → answer → judge. A/B validated: +2.8 mean vs top_k=10 baseline.

v1.1.1 — BM25 + top_k=18 (Mistral Large answerer + judge, 10 conversations, 1986 questions):

| Category | N | Acc (≥50) | Mean Score |
|---|---|---|---|
| Overall | 1986 | 73.8% | 70.5 |
| Adversarial | 446 | 92.4% | 87.2 |
| Single-hop | 282 | 80.9% | 68.7 |
| Open-domain | 841 | 71.2% | 70.3 |
| Temporal | 96 | 66.7% | 65.9 |
| Multi-hop | 321 | 50.5% | 51.1 |

Pipeline: BM25 + RM3 query expansion → top-18 evidence blocks → observation compression → answer → judge. Full 10-conversation benchmark with Mistral Large as both answerer and judge.

v1.0.0 — BM25-only baseline (gpt-4o-mini answerer + judge, 10 conversations):

| Category | N | Acc (≥50) | Mean Score |
|---|---|---|---|
| Overall | 1986 | 67.3% | 61.4 |
| Open-domain | 841 | 86.6% | 78.3 |
| Temporal | 96 | 78.1% | 65.7 |
| Single-hop | 282 | 68.8% | 59.1 |
| Multi-hop | 321 | 55.5% | 48.4 |
| Adversarial | 446 | 36.3% | 39.5 |

Key improvements since v1.0.0: Adversarial accuracy tripled from 36.3% to 92.4% via abstention classifier + hybrid retrieval. Overall Acc≥50 improved from 67.3% to 73.8% (+6.5pp).

Competitive Landscape

| System | Score | Approach |
|---|---|---|
| mind-mem | 76.7% | Hybrid BM25 + Qwen3-8B vector + RRF fusion (local-only) |
| Memobase | 75.8% | Specialized extraction |
| Letta | 74.0% | Files + agent tool use |
| mind-mem | 73.8% | BM25-only, full 10-conv (1986 questions, Mistral Large) |
| Mem0 | 68.5% | Graph + LLM extraction |

mind-mem now surpasses Mem0 and Letta with local-only retrieval — no cloud calls, no graph DB, no LLM in the retrieval loop. mind-mem's unique value is governance (contradiction detection, drift analysis, audit trails) and agent-agnostic shared memory via MCP — areas these benchmarks don't measure.

Benchmark Comparison (2026-02-22)

| System | LoCoMo Acc≥50 | LongMemEval R@10 | Infrastructure | Dependencies |
|---|---|---|---|---|
| mind-mem (hybrid) | 76.7% | 88.1% | Local-only | Zero core (optional: llama.cpp, sentence-transformers) |
| Memobase | 75.8% | — | Cloud + GPU | embeddings + vector DB |
| Letta | 74.0% | — | Cloud | embeddings + vector DB |
| mind-mem (BM25) | 73.8% | 88.1% | Local-only | Zero core |
| full-context | 72.9% | — | N/A | LLM context window |
| Mem0 | 68.5% | — | Cloud (managed) | graph DB + embeddings |

mind-mem surpasses Mem0 (68.5%), Letta (74.0%), and Memobase (75.8%) with zero cloud infrastructure. Full 10-conversation benchmark (1986 questions) validates this at scale.

LongMemEval (ICLR 2025, 470 questions)

| Category | N | R@1 | R@5 | R@10 | MRR |
|---|---|---|---|---|---|
| Overall | 470 | 73.2 | 85.3 | 88.1 | .784 |
| Multi-session | 121 | 83.5 | 95.9 | 95.9 | .885 |
| Temporal | 127 | 76.4 | 91.3 | 92.9 | .826 |
| Knowledge update | 72 | 80.6 | 88.9 | 91.7 | .844 |
| Single-session | 56 | 82.1 | 89.3 | 89.3 | .847 |

Performance (Latency & Throughput)

Measured on a 65-block workspace (typical personal workspace) with SQLite FTS5 backend:

| Operation | Metric | Value |
|---|---|---|
| Query (FTS5 + rerank) | p50 latency | 2.1 ms |
| Query (FTS5 + rerank) | p95 latency | 4.9 ms |
| Query (FTS5 + rerank) | mean latency | 2.6 ms |
| Incremental reindex | elapsed | 32 ms (13 blocks indexed) |
| Full index build | elapsed | 48 ms (65 blocks) |
| MCP tool overhead | stdio round-trip | < 15 ms |
| Memory footprint | RSS (idle MCP server) | ~28 MB |

Query latency scales as O(log N) with SQLite FTS5 (versus linear in corpus size for the scan backend). The co-retrieval graph adds < 1 ms per query. Knee cutoff and fact aggregation add negligible overhead (< 0.5 ms).

Run Benchmarks Yourself

# Retrieval-only (R@K metrics)
python3 benchmarks/locomo_harness.py
python3 benchmarks/longmemeval_harness.py

# LLM-as-judge (accuracy metrics, requires API key)
python3 benchmarks/locomo_judge.py --dry-run
python3 benchmarks/locomo_judge.py --answerer-model gpt-4o-mini --output results.json

# Hybrid retrieval with any model pair (BM25 + vector + cross-encoder)
python3 benchmarks/locomo_judge.py --hybrid --compress --answerer-model mistral-large-latest --judge-model mistral-large-latest --output results.json

# Selective conversations
python3 benchmarks/locomo_harness.py --conv-ids 4,7,8

Quick Start

pip install (quickest)

pip install mind-mem
mind-mem-recall "What decisions were made about the API?"

Universal Installer (Recommended)

git clone https://github.com/star-ga/mind-mem.git
cd mind-mem
./install.sh --all

This auto-detects every AI coding client on your machine and configures mind-mem for all of them. Supported clients:

| Client | Config Location | Format |
|---|---|---|
| Claude Code CLI | ~/.claude/mcp.json | JSON |
| Claude Desktop | ~/.config/Claude/claude_desktop_config.json | JSON |
| Codex CLI (OpenAI) | ~/.codex/config.toml | TOML |
| Gemini CLI (Google) | ~/.gemini/settings.json | JSON |
| Cursor | ~/.cursor/mcp.json | JSON |
| Windsurf | ~/.codeium/windsurf/mcp_config.json | JSON |
| Zed | ~/.config/zed/settings.json | JSON |
| OpenClaw | ~/.openclaw/hooks/mind-mem/ | JS hook |

Selective install:

./install.sh --claude-code --codex --gemini         # Only specific clients
./install.sh --all --workspace ~/my-project/memory  # Custom workspace path

Uninstall:

./uninstall.sh          # Remove from all clients (keeps workspace data)
./uninstall.sh --purge  # Remove everything including workspace data

Manual Setup

For manual or per-project setup:

1. Clone into your project

cd /path/to/your/project
git clone https://github.com/star-ga/mind-mem.git .mind-mem

2. Initialize workspace

python3 .mind-mem/src/mind_mem/init_workspace.py .

Creates 12 directories, 19 template files, and mind-mem.json config. Never overwrites existing files.

3. Validate

bash .mind-mem/src/mind_mem/validate.sh .
# or cross-platform:
python3 .mind-mem/src/mind_mem/validate_py.py .

Expected: 74 checks | 74 passed | 0 issues.

4. First scan

python3 .mind-mem/src/mind_mem/intel_scan.py .

Expected: 0 critical | 0 warnings on a fresh workspace.

5. Verify recall + capture

python3 .mind-mem/src/mind_mem/recall.py --query "test" --workspace .
# → No results found. (empty workspace — correct)

python3 .mind-mem/src/mind_mem/capture.py .
# → capture: no daily log for YYYY-MM-DD, nothing to scan (correct)

6. Add hooks (optional)

Option A: Claude Code hooks (recommended)

Merge into your .claude/hooks.json:

{
  "hooks": [
    {
      "event": "SessionStart",
      "command": "bash .mind-mem/hooks/session-start.sh"
    },
    {
      "event": "Stop",
      "command": "bash .mind-mem/hooks/session-end.sh"
    }
  ]
}

Option B: OpenClaw hooks (for OpenClaw 2026.2+)

cp -r .mind-mem/hooks/openclaw/mind-mem ~/.openclaw/hooks/mind-mem
openclaw hooks enable mind-mem

7. Smoke Test (optional)

bash .mind-mem/src/mind_mem/smoke_test.sh

Creates a temp workspace, runs init → validate → scan → recall → capture → pytest, then cleans up.


Health Summary

After setup, this is what a healthy workspace looks like:

$ python3 -m mind_mem.intel_scan .

mind-mem Intelligence Scan Report v2.0
Mode: detect_only

=== 1. CONTRADICTION DETECTION ===
  OK: No contradictions found among 25 signatures.

=== 2. DRIFT ANALYSIS ===
  OK: All active decisions referenced or exempt.
  INFO: Metrics: active_decisions=17, active_tasks=7, blocked=0,
        dead_decisions=0, incidents=3, decision_coverage=100%

=== 3. DECISION IMPACT GRAPH ===
  OK: Built impact graph: 11 decision(s) with edges.

=== 4. STATE SNAPSHOT ===
  OK: Snapshot saved.

=== 5. WEEKLY BRIEFING ===
  OK: Briefing generated.

TOTAL: 0 critical | 0 warnings | 16 info

Commands

| Command | What it does |
|---|---|
| /scan | Run integrity scan — contradictions, drift, dead decisions, impact graph, snapshot, briefing |
| /apply | Review and apply proposals from scan results (dry-run first, then apply) |
| /recall <query> | Search across all memory files with ranked results (add --graph for cross-reference boosting) |

Architecture

your-workspace/
├── mcp_server.py            # MCP server (FastMCP, 44 tools, 8 resources)
├── mind-mem.json             # Config
├── MEMORY.md                # Protocol rules
│
├── mind/                    # 17 MIND source files (.mind)
│   ├── bm25.mind           # BM25F scoring kernel
│   ├── rrf.mind            # Reciprocal Rank Fusion kernel
│   ├── reranker.mind        # Deterministic reranking
│   ├── abstention.mind      # Confidence gating
│   ├── ranking.mind         # Evidence ranking
│   ├── importance.mind      # A-MEM importance scoring
│   ├── category.mind        # Category relevance scoring
│   ├── recall.mind          # Combined recall scoring
│   ├── hybrid.mind          # BM25 + vector hybrid fusion
│   ├── rm3.mind             # RM3 pseudo-relevance feedback
│   ├── rerank.mind          # Score combination pipeline
│   ├── adversarial.mind     # Adversarial query detection
│   ├── temporal.mind        # Time-aware scoring
│   ├── prefetch.mind        # Context pre-assembly
│   ├── intent.mind          # Intent classification
│   └── cross_encoder.mind   # Cross-encoder blending
│
├── lib/                     # Compiled MIND kernels (optional)
│   └── libmindmem.so       # mindc output — not required for operation
│
├── decisions/
│   └── DECISIONS.md         # Formal decisions [D-YYYYMMDD-###]
├── tasks/
│   └── TASKS.md             # Tasks [T-YYYYMMDD-###]
├── entities/
│   ├── projects.md          # [PRJ-###]
│   ├── people.md            # [PER-###]
│   ├── tools.md             # [TOOL-###]
│   └── incidents.md         # [INC-###]
│
├── memory/
│   ├── YYYY-MM-DD.md        # Daily logs (append-only)
│   ├── intel-state.json     # Scanner state + metrics
│   └── maint-state.json     # Maintenance state
│
├── summaries/
│   ├── weekly/              # Weekly summaries
│   └── daily/               # Daily summaries
│
├── intelligence/
│   ├── CONTRADICTIONS.md    # Detected contradictions
│   ├── DRIFT.md             # Drift detections
│   ├── SIGNALS.md           # Auto-captured signals
│   ├── IMPACT.md            # Decision impact graph
│   ├── BRIEFINGS.md         # Weekly briefings
│   ├── AUDIT.md             # Applied proposal audit trail
│   ├── SCAN_LOG.md          # Scan history
│   ├── proposed/            # Staged proposals + resolution proposals
│   │   ├── DECISIONS_PROPOSED.md
│   │   ├── TASKS_PROPOSED.md
│   │   ├── EDITS_PROPOSED.md
│   │   └── RESOLUTIONS_PROPOSED.md
│   ├── applied/             # Snapshot archives (rollback)
│   └── state/snapshots/     # State snapshots
│
├── shared/                  # Multi-agent shared namespace
│   ├── decisions/
│   ├── tasks/
│   ├── entities/
│   └── intelligence/
│       └── LEDGER.md        # Cross-agent fact ledger
│
├── agents/                  # Per-agent private namespaces
│   └── <agent-id>/
│       ├── decisions/
│       ├── tasks/
│       └── memory/
│
├── mind-mem-acl.json        # Multi-agent access control
├── .mind-mem-wal/           # Write-ahead log (crash recovery)
│
└── src/mind_mem/
    ├── mind_ffi.py          # MIND FFI bridge (ctypes)
    ├── hybrid_recall.py     # Hybrid BM25+Vector+RRF orchestrator
    ├── block_metadata.py    # A-MEM metadata evolution
    ├── cross_encoder_reranker.py  # Optional cross-encoder
    ├── intent_router.py     # 9-type intent classification (adaptive)
    ├── recall.py            # BM25F + RM3 + graph scoring engine
    ├── recall_vector.py     # Vector/embedding backends
    ├── sqlite_index.py      # FTS5 + vector + metadata index
    ├── connection_manager.py # SQLite connection pool (WAL read/write separation)
    ├── block_store.py       # BlockStore protocol + MarkdownBlockStore
    ├── corpus_registry.py   # Central corpus path registry
    ├── abstention_classifier.py  # Adversarial abstention
    ├── evidence_packer.py   # Evidence assembly and ranking
    ├── intel_scan.py        # Integrity scanner
    ├── apply_engine.py      # Proposal apply engine (delta-based snapshots)
    ├── block_parser.py      # Markdown block parser (typed)
    ├── capture.py           # Auto-capture (26 patterns)
    ├── compaction.py        # Compaction/GC/archival
    ├── mind_filelock.py     # Cross-platform advisory file locking
    ├── observability.py     # Structured JSON logging + metrics
    ├── namespaces.py        # Multi-agent namespace & ACL
    ├── conflict_resolver.py # Automated conflict resolution
    ├── backup_restore.py    # WAL + backup/restore + JSONL export
    ├── transcript_capture.py  # Transcript JSONL signal extraction
    ├── validate.sh          # Structural validator (74+ checks)
    └── validate_py.py       # Structural validator (Python, cross-platform)

How It Compares

Quick Comparison

| Feature | mind-mem | Mem0 | Letta | Zep/Graphiti |
|---|---|---|---|---|
| Local-only | Yes | No (cloud API) | No (runtime) | No (Neo4j) |
| Zero infrastructure | Yes | No | No | No |
| Hybrid retrieval | BM25F + vector + RRF | Vector only | Hybrid | Graph + vector |
| Governance (propose/review/apply) | Yes | No | No | No |
| Contradiction detection | Yes | No | No | No |
| Tests | 3,193 | — | — | — |
| LoCoMo benchmark | 67.3% | 68.5% | 74.0% | — |
| MCP tools | 33 | — | — | — |
| Core dependencies | 0 | Many | Many | Many |

At a Glance

| Tool | Strength | Trade-off |
|---|---|---|
| Mem0 | Fast managed service, graph memory, multi-user scoping | Cloud-dependent, no integrity checking |
| Supermemory | Fastest retrieval (ms), auto-ingestion from Drive/Notion | Cloud-dependent, auto-writes without review |
| claude-mem | Purpose-built for Claude Code, ChromaDB vectors | Requires ChromaDB + Express worker, no integrity |
| Letta | Self-editing memory blocks, sleep-time compute, 74% LoCoMo | Full agent runtime (heavy), not just memory |
| Zep | Temporal knowledge graph, bi-temporal model, sub-second at scale | Cloud service, complex architecture |
| LangMem | Native LangChain/LangGraph integration | Tied to LangChain ecosystem |
| Cognee | Advanced chunking, web content bridging | Research-oriented, complex setup |
| Graphlit | Multimodal ingestion, semantic search, managed platform | Cloud-only, managed service |
| ClawMem | Full ML pipeline (cross-encoder + QMD + beam search) | 4.5GB VRAM, 3 GPU processes required |
| MemU | Hierarchical 3-layer memory, multimodal ingestion, LLM-based retrieval | Requires LLM for extraction and retrieval, no hybrid search |
| mind-mem | Integrity + governance + zero core deps + hybrid search + MIND kernels + 44 MCP tools + 3-LLM audit per release | Lexical recall by default (vector/CE optional) |

Full Feature Matrix

Compared against every major memory solution for AI agents (as of 2026):

| Capability | Mem0 | Supermemory | claude-mem | Letta | Zep | LangMem | Cognee | Graphlit | ClawMem | MemU | mind-mem |
|---|---|---|---|---|---|---|---|---|---|---|---|
| **Recall** | | | | | | | | | | | |
| Vector | Cloud | Cloud | Chroma | Yes | Yes | Yes | Yes | Yes | Yes | — | Optional |
| Lexical | Filter | — | — | — | — | — | — | — | BM25 | — | BM25F |
| Graph | Yes | — | — | — | Yes | — | Yes | Yes | Beam | — | 2-hop |
| Hybrid + RRF | Part | — | — | — | Yes | — | Yes | Yes | Yes | — | Yes |
| Cross-encoder | — | — | — | — | — | — | — | — | qwen3 0.6B | — | MiniLM 80MB |
| Intent routing | — | — | — | — | — | — | — | — | Yes | — | 9 types |
| Query expansion | — | — | — | — | — | — | — | — | QMD 1.7B | — | RM3 (zero-dep) |
| **Persistence** | | | | | | | | | | | |
| Structured | JSON | JSON | SQL | Blk | Grph | KV | Grph | Grph | SQL | Markdown | Markdown |
| Entities | Yes | Yes | — | Yes | Yes | Yes | Yes | Yes | — | Yes | Yes |
| Temporal | — | — | — | — | Yes | — | — | — | — | — | Yes |
| Supersede | — | — | — | Yes | Yes | — | — | — | — | — | Yes |
| Append-only | — | — | — | — | — | — | — | — | — | — | Yes |
| A-MEM metadata | — | — | — | — | — | — | — | — | Yes | — | Yes |
| **Integrity** | | | | | | | | | | | |
| Contradictions | — | — | — | — | — | — | — | — | — | — | Yes |
| Drift detection | — | — | — | — | — | — | — | — | — | — | Yes |
| Validation | — | — | — | — | — | — | — | — | — | — | 74+ rules |
| Impact graph | — | — | — | — | — | — | — | — | — | — | Yes |
| Coverage | — | — | — | — | — | — | — | — | — | — | Yes |
| Multi-agent | — | — | — | Yes | — | — | — | — | — | — | ACL-based |
| Conflict res. | — | — | — | — | — | — | — | — | — | — | Automatic |
| WAL/crash | — | — | — | — | — | — | — | — | — | — | Yes |
| Backup/restore | — | — | — | — | — | — | — | — | — | — | Yes |
| Abstention | — | — | — | — | — | — | — | — | — | — | Yes |
| **Governance** | | | | | | | | | | | |
| Auto-capture | Auto | Auto | Auto | Self | Ext | Ext | Ext | Ing | Auto | LLM Ext | Propose |
| Proposal queue | — | — | — | — | — | — | — | — | — | — | Yes |
| Rollback | — | — | — | — | — | — | — | — | — | — | Yes |
| Mode governance | — | — | — | — | — | — | — | — | — | — | 3 modes |
| Audit trail | — | Part | — | — | — | — | — | — | — | — | Full |
| **Operations** | | | | | | | | | | | |
| Local-only | — | — | Yes | — | — | — | — | — | Yes | Yes | Yes |
| Zero core deps | — | — | — | — | — | — | — | — | — | — | Yes |
| No daemon | — | — | — | — | — | Yes | — | — | — | Yes | Yes |
| GPU required | — | — | — | — | — | — | — | — | 4.5GB | No | No |
| Git-friendly | — | — | — | Part | — | — | — | — | — | Yes | Yes |
| MCP server | — | — | — | — | — | — | — | — | — | — | 44 tools |
| MIND kernels | — | — | — | — | — | — | — | — | — | — | 16 source |

The Gap mind-mem Fills

Every tool above does storage + retrieval. None of them answer:

  • "Do any of my decisions contradict each other?"
  • "Which decisions are active but nobody references anymore?"
  • "Did I make a decision in chat that was never formalized?"
  • "What's the downstream impact if I change this decision?"
  • "Is my memory state structurally valid right now?"

mind-mem focuses on memory governance and integrity — the critical layer most memory systems ignore entirely.

Why Plain Files Outperform Fancy Retrieval

Letta's August 2025 analysis showed that a plain-file baseline (full conversations stored as files + agent filesystem tools) scored 74.0% on LoCoMo with gpt-4o-mini — beating Mem0's top graph variant at 68.5%. Key reasons:

  • LLMs excel at tool-based retrieval. Agents can iteratively query/refine file searches better than single-shot vector retrieval that might miss subtle connections.
  • Benchmarks reward recall + reasoning over storage sophistication. Strong judge LLMs handle the rest once relevant chunks are loaded.
  • Overhead hurts. Specialized pipelines introduce failure modes (bad embeddings, chunking errors, stale indexes) that simple file access avoids.
  • For text-heavy agentic use cases, "how well the agent manages context" > "how smart the retrieval index is."

mind-mem's deterministic retrieval pipeline validates these findings: 67.3% on LoCoMo with zero dependencies, no embeddings, and no vector database — within 1.2pp of Mem0's graph-based approach. The key insight: treating retrieval as a reasoning pipeline (wide candidate pool → deterministic rerank → context packing) closes most of the gap without any ML infrastructure. Unlike plain-file baselines, mind-mem adds integrity checking, governance, and agent-agnostic shared memory via MCP that no other system provides.


Recall

Default: BM25 Hybrid

python3 -m mind_mem.recall --query "authentication" --workspace .
python3 -m mind_mem.recall --query "auth" --json --limit 5 --workspace .
python3 -m mind_mem.recall --query "deadline" --active-only --workspace .

BM25F scoring (k1=1.2, b=0.75) with per-field weighting, bigram phrase matching, overlapping sentence chunking, and query-type-aware parameter tuning. Searches across all structured files.

BM25F field weighting: terms in Statement fields carry a 3x weight, versus 0.5x for Context fields. This naturally prioritizes core content over auxiliary metadata.

RM3 query expansion: Pseudo-relevance feedback from top-k initial results. JM-smoothed language model extracts expansion terms, interpolated with the original query at configurable alpha. Falls back to static synonyms for adversarial queries.
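
A compact sketch of the RM3 idea: JM-smoothed term weights from the feedback set, interpolated with the original query. The parameters and the approximation of collection statistics from the feedback docs are illustrative:

from collections import Counter

def rm3_expand(query_terms, top_docs, alpha=0.6, lam=0.4, n_terms=5):
    """RM3 sketch. top_docs is a list of token lists from the initial
    result set; returns term -> interpolated weight for the expanded
    query (original terms at alpha, expansion terms at 1 - alpha)."""
    coll = Counter(t for d in top_docs for t in d)
    coll_len = sum(coll.values())
    weights = Counter()
    for doc in top_docs:
        tf = Counter(doc)
        for t, c in tf.items():
            # Jelinek-Mercer smoothing of the document language model
            p_jm = (1 - lam) * c / len(doc) + lam * coll[t] / coll_len
            weights[t] += p_jm
    expansion = [t for t, _ in weights.most_common(n_terms * 2)
                 if t not in query_terms][:n_terms]
    return {**{t: alpha for t in query_terms},
            **{t: (1 - alpha) / max(len(expansion), 1) for t in expansion}}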

Adversarial abstention: Deterministic pre-LLM confidence gate. Computes confidence from entity overlap, BM25 score, speaker coverage, evidence density, and negation asymmetry. Below threshold → forces abstention.

Stemming: "queries" matches "query", "deployed" matches "deployment". Simplified Porter stemmer with zero dependencies.

Hybrid Search (BM25 + Vector + RRF)

{
  "recall": {
    "backend": "hybrid",
    "vector_enabled": true,
    "rrf_k": 60,
    "bm25_weight": 1.0,
    "vector_weight": 1.0
  }
}

Thread-parallel BM25 and vector retrieval fused via RRF: score(doc) = bm25_w / (k + bm25_rank) + vec_w / (k + vec_rank). Deduplicates by block ID. Falls back to BM25-only when vector backend is unavailable.
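
The fusion step follows directly from that formula. A minimal sketch, taking two rank-ordered block-ID lists (ranks are 1-based here; the real engine works off the indexed result sets):

def rrf_fuse(bm25_ranked, vec_ranked, k=60, bm25_w=1.0, vec_w=1.0):
    """Reciprocal Rank Fusion per the formula above:
    score(doc) = bm25_w / (k + bm25_rank) + vec_w / (k + vec_rank).
    Docs appearing in only one list simply lack the other term."""
    scores = {}
    for rank, doc in enumerate(bm25_ranked, start=1):
        scores[doc] = scores.get(doc, 0.0) + bm25_w / (k + rank)
    for rank, doc in enumerate(vec_ranked, start=1):
        scores[doc] = scores.get(doc, 0.0) + vec_w / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)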

Graph-Based (2-hop cross-reference boost)

python3 -m mind_mem.recall --query "database" --graph --workspace .

2-hop graph traversal: 1-hop neighbors get 0.3x score boost, 2-hop get 0.1x (tagged [graph]). Surfaces structurally connected blocks via AlignsWith, Dependencies, Supersedes, Sources, and ConstraintSignature scopes. Auto-enabled for multi-hop queries.

Vector (pluggable)

{
  "recall": {
    "backend": "vector",
    "vector_enabled": true,
    "vector_model": "all-MiniLM-L6-v2",
    "onnx_backend": true
  }
}

Supports ONNX inference (local, no server) or cloud embeddings. Falls back to BM25 automatically if unavailable.


MIND Kernels

mind-mem includes 17 .mind kernel source files — numerical hot paths written in the MIND programming language. The MIND kernel is optional. mind-mem works identically without it (pure Python fallback). With it, scoring runs at native speed with compile-time tensor shape verification.

Compilation

Requires the MIND compiler (mindc). See mindlang.dev for installation.

# Compile all kernels to a single shared library
mindc mind/*.mind --emit=shared -o lib/libmindmem.so

# Or compile individually for testing
mindc mind/bm25.mind --emit=shared -o lib/libbm25.so

Kernel Index

| File | Functions | Purpose |
|---|---|---|
| bm25.mind | bm25f_doc, bm25f_batch, apply_recency, apply_graph_boost | BM25F scoring with field boosts |
| rrf.mind | rrf_fuse, rrf_fuse_three | Reciprocal Rank Fusion |
| reranker.mind | date_proximity_score, category_boost, negation_penalty, rerank_deterministic | Deterministic reranking |
| rerank.mind | rerank_scores | Score combination pipeline |
| abstention.mind | entity_overlap, confidence_score | Confidence gating |
| ranking.mind | weighted_rank, top_k_mask | Evidence ranking |
| importance.mind | importance_score | A-MEM importance scoring |
| category.mind | category_affinity, query_category_relevance, category_assign | Category distillation scoring |
| prefetch.mind | prefetch_score, prefetch_select | Signal-based context pre-assembly |
| recall.mind | recall_score | Combined recall scoring |
| hybrid.mind | hybrid_fuse | BM25 + vector hybrid fusion |
| rm3.mind | rm3_weight | RM3 pseudo-relevance feedback |
| adversarial.mind | adversarial_gate | Adversarial query detection |
| temporal.mind | temporal_decay | Time-aware scoring |
| intent.mind | intent_params | Intent classification parameters |
| cross_encoder.mind | ce_blend | Cross-encoder blending configuration |

Performance

<details> <summary>Compiled MIND kernels vs pure Python — 9 core scoring functions (200 iterations, <code>perf_counter</code>)</summary>

| Function | N=100 | N=1,000 | N=5,000 |
|---|---|---|---|
| rrf_fuse | 10.8x | 69.0x | 72.5x |
| bm25f_batch | 13.2x | 113.8x | 193.1x |
| negation_penalty | 3.3x | 7.0x | 18.4x |
| date_proximity | 10.7x | 15.3x | 26.9x |
| category_boost | 3.3x | 19.8x | 17.7x |
| importance_batch | 22.3x | 46.2x | 48.6x |
| confidence_score | 0.9x | 0.8x | 0.9x |
| top_k_mask | 3.1x | 8.1x | 11.8x |
| weighted_rank | 5.1x | 26.6x | 121.8x |
| Overall | | | 49.0x |

49x faster end-to-end at production scale (N=5,000). Individual kernels reach up to 193x speedup. The compiled library includes 14 runtime protection layers with near-zero overhead.

</details>

FFI Bridge

The compiled .so exposes a C99-compatible ABI. Python calls via ctypes through src/mind_mem/mind_ffi.py:

from mind_ffi import get_kernel, is_available, is_protected

if is_available():
    kernel = get_kernel()
    scores = kernel.rrf_fuse_py(bm25_ranks, vec_ranks, k=60.0)
    print(f"Protected: {is_protected()}")  # True with FORTRESS build

Without MIND

If lib/libmindmem.so is not present, mind-mem uses pure Python implementations. The Python fallback produces identical results (within f32 epsilon). No functionality is lost — MIND is a performance optimization, not a requirement.


Auto-Capture

Session end
    ↓
capture.py scans daily log (or --scan-all for batch)
    ↓
Detects decision/task language (26 patterns, 3 confidence levels)
    ↓
Extracts structured metadata (subject, object, tags)
    ↓
Classifies confidence (high/medium/low → P1/P2/P3)
    ↓
Writes to intelligence/SIGNALS.md ONLY
    ↓
User reviews signals
    ↓
/apply promotes to DECISIONS.md or TASKS.md

Batch scanning: python3 -m mind_mem.capture . --scan-all scans the last 7 days of daily logs.

Safety guarantee: capture.py never writes to decisions/ or tasks/ directly. All signals must go through the apply engine.


Multi-Agent Memory

Namespace Setup

python3 -m mind_mem.namespaces workspace/ --init coder-1 reviewer-1

Creates shared/ (visible to all) and agents/coder-1/, agents/reviewer-1/ (private) directories with ACL config.

Access Control

{
  "default_policy": "read",
  "agents": {
    "coder-1": {"namespaces": ["shared", "agents/coder-1"], "write": ["agents/coder-1"], "read": ["shared"]},
    "reviewer-*": {"namespaces": ["shared"], "write": [], "read": ["shared"]},
    "*": {"namespaces": ["shared"], "write": [], "read": ["shared"]}
  }
}

Shared Fact Ledger

High-confidence facts proposed to shared/intelligence/LEDGER.md become visible to all agents after review. Append-only with dedup and file locking.

Conflict Resolution

python3 -m mind_mem.conflict_resolver workspace/ --analyze
python3 -m mind_mem.conflict_resolver workspace/ --propose

Graduated resolution: confidence priority > scope specificity > timestamp priority > manual fallback.

Transcript Capture

python3 -m mind_mem.transcript_capture workspace/ --transcript path/to/session.jsonl
python3 -m mind_mem.transcript_capture workspace/ --scan-recent --days 3

Scans Claude Code JSONL transcripts for user corrections, convention discoveries, and architectural decisions. 16 patterns with confidence classification.

Backup & Restore

python3 -m mind_mem.backup_restore backup workspace/ --output backup.tar.gz
python3 -m mind_mem.backup_restore export workspace/ --output export.jsonl
python3 -m mind_mem.backup_restore restore workspace/ --input backup.tar.gz
python3 -m mind_mem.backup_restore wal-replay workspace/

Governance Modes

| Mode | What it does | When to use |
|---|---|---|
| detect_only | Scan + validate + report only | Start here. First week after install. |
| propose | Report + generate fix proposals in proposed/ | After a clean observation week with zero critical issues. |
| enforce | Bounded auto-supersede + self-healing within constraints | Production mode. Requires explicit opt-in. |

Recommended rollout:

  1. Install → run in detect_only for 7 days
  2. Review scan logs → if clean, switch to propose
  3. Triage proposals for 2-3 weeks → if confident, enable enforce

Block Format

All structured data uses a simple, parseable markdown format:

[D-20260213-001]
Date: 2026-02-13
Status: active
Statement: Use PostgreSQL for the user database
Tags: database, infrastructure
Rationale: Better JSON support than MySQL for our use case
ConstraintSignatures:
- id: CS-db-engine
  domain: infrastructure
  subject: database
  predicate: engine
  object: postgresql
  modality: must
  priority: 9
  scope: {projects: [PRJ-myapp]}
  evidence: Benchmarked JSON performance
  axis:
    key: database.engine
  relation: standalone
  enforcement: structural

Blocks are parsed by block_parser.py — a zero-dependency markdown parser that extracts [ID] headers and Key: Value fields into structured dicts.
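
A minimal sketch of that parse, covering the [ID] header and flat Key: Value fields (nested ConstraintSignature entries are skipped here; the real parser handles them):

import re

def parse_blocks(text):
    """Parse the block format above: an [ID] header starts a block,
    and subsequent top-level `Key: Value` lines become fields."""
    blocks, current = [], None
    for line in text.splitlines():
        m = re.match(r"^\[([A-Z]+-[0-9-]+)\]\s*$", line)
        if m:
            current = {"id": m.group(1)}
            blocks.append(current)
        elif (current is not None and ":" in line
              and not line.startswith((" ", "\t", "-"))):
            key, _, value = line.partition(":")
            current[key.strip()] = value.strip()
    return blocks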


Configuration

All settings in mind-mem.json (created by init_workspace.py):

{
  "version": "2.0.0",
  "workspace_path": ".",
  "auto_capture": true,
  "auto_recall": true,
  "governance_mode": "detect_only",
  "recall": {
    "backend": "bm25",
    "rrf_k": 60,
    "bm25_weight": 1.0,
    "vector_weight": 1.0,
    "vector_model": "all-MiniLM-L6-v2",
    "vector_enabled": false,
    "onnx_backend": false
  },
  "proposal_budget": {
    "per_run": 3,
    "per_day": 6,
    "backlog_limit": 30
  },
  "compaction": {
    "archive_days": 90,
    "snapshot_days": 30,
    "log_days": 180,
    "signal_days": 60
  },
  "scan_schedule": "daily"
}

| Key | Default | Description |
|---|---|---|
| version | "2.0.0" | Config file version |
| auto_capture | true | Run capture engine on session end |
| auto_recall | true | Show recall context on session start |
| governance_mode | "detect_only" | Governance mode (detect_only, propose, enforce) |
| recall.backend | "scan" | "scan" (BM25), "hybrid" (BM25+Vector+RRF), or "vector" |
| recall.rrf_k | 60 | RRF fusion parameter k |
| recall.bm25_weight | 1.0 | BM25 weight in RRF fusion |
| recall.vector_weight | 1.0 | Vector weight in RRF fusion |
| recall.vector_model | "all-MiniLM-L6-v2" | Embedding model for vector search |
| recall.vector_enabled | false | Enable vector search backend |
| recall.onnx_backend | false | Use ONNX for local embeddings (no server needed) |
| proposal_budget.per_run | 3 | Max proposals generated per scan |
| proposal_budget.per_day | 6 | Max proposals per day |
| proposal_budget.backlog_limit | 30 | Max pending proposals before pausing |
| compaction.archive_days | 90 | Archive completed blocks older than N days |
| compaction.snapshot_days | 30 | Remove apply snapshots older than N days |
| compaction.log_days | 180 | Archive daily logs older than N days |
| compaction.signal_days | 60 | Remove resolved/rejected signals older than N days |
| scan_schedule | "daily" | "daily" or "manual" |

MCP Server

mind-mem ships with a Model Context Protocol server that exposes memory as resources and tools to any MCP-compatible client.

Install

pip install fastmcp

Automatic Setup (Recommended)

./install.sh --all

Configures all detected clients automatically. See Quick Start.

Manual Setup

For Claude Code, Claude Desktop, Cursor, Windsurf, and Gemini CLI, add to the respective JSON config under mcpServers:

{
  "mcpServers": {
    "mind-mem": {
      "command": "python3",
      "args": ["/path/to/mind-mem/mcp_server.py"],
      "env": {"MIND_MEM_WORKSPACE": "/path/to/your/workspace"}
    }
  }
}

| Client | Config File |
|---|---|
| Claude Code CLI | ~/.claude/mcp.json |
| Claude Desktop | ~/.config/Claude/claude_desktop_config.json |
| Gemini CLI | ~/.gemini/settings.json |
| Cursor | ~/.cursor/mcp.json |
| Windsurf | ~/.codeium/windsurf/mcp_config.json |

For Codex CLI (TOML format), add to ~/.codex/config.toml:

[mcp_servers.mind-mem]
command = "python3"
args = ["/path/to/mind-mem/mcp_server.py"]

[mcp_servers.mind-mem.env]
MIND_MEM_WORKSPACE = "/path/to/your/workspace"

For Zed, add to ~/.config/zed/settings.json under context_servers:

{
  "context_servers": {
    "mind-mem": {
      "command": {
        "path": "python3",
        "args": ["/path/to/mind-mem/mcp_server.py"],
        "env": {"MIND_MEM_WORKSPACE": "/path/to/your/workspace"}
      }
    }
  }
}

Direct (stdio / HTTP)

# stdio transport (default)
MIND_MEM_WORKSPACE=/path/to/workspace python3 mcp_server.py

# HTTP transport (multi-client / remote)
MIND_MEM_WORKSPACE=/path/to/workspace python3 mcp_server.py --transport http --port 8765

Resources (read-only)

| URI | Description |
|---|---|
| mind-mem://decisions | Active decisions |
| mind-mem://tasks | All tasks |
| mind-mem://entities/{type} | Entities (projects, people, tools, incidents) |
| mind-mem://signals | Auto-captured signals pending review |
| mind-mem://contradictions | Detected contradictions |
| mind-mem://health | Workspace health summary |
| mind-mem://recall/{query} | BM25 recall search results |
| mind-mem://ledger | Shared fact ledger (multi-agent) |

Tools (21)

| Tool | Description |
|---|---|
| recall | Search memory with BM25 (query, limit, active_only) |
| propose_update | Propose a decision/task — writes to SIGNALS.md only |
| approve_apply | Apply a staged proposal (dry_run=True by default) |
| rollback_proposal | Roll back an applied proposal by receipt timestamp |
| scan | Run integrity scan (contradictions, drift, signals) |
| list_contradictions | List contradictions with auto-resolution analysis |
| hybrid_search | Hybrid BM25+Vector search with RRF fusion |
| find_similar | Find blocks similar to a given block |
| intent_classify | Classify query intent (9 types with parameter recommendations) |
| index_stats | Index statistics, MIND kernel availability, block counts |
| retrieval_diagnostics | Pipeline rejection rates, intent histogram, hard negatives |
| reindex | Rebuild the FTS5 index (optionally including vectors) |
| memory_evolution | View/trigger A-MEM metadata evolution for a block |
| list_mind_kernels | List available MIND kernel configurations |
| get_mind_kernel | Read a specific MIND kernel configuration as JSON |
| category_summary | Category summaries relevant to a given topic |
| prefetch | Pre-assemble context from recent conversation signals |
| delete_memory_item | Delete a memory block by ID (admin scope) |
| export_memory | Export workspace as JSONL (user scope) |
| calibration_feedback | Submit quality feedback for a retrieved block (thumbs up/down) |
| calibration_stats | View per-block and global calibration statistics |

Token Auth (HTTP)

MIND_MEM_TOKEN=your-secret python3 mcp_server.py --transport http --port 8765

Safety Guarantees

  • propose_update never writes to DECISIONS.md or TASKS.md. All proposals go to SIGNALS.md.
  • approve_apply defaults to dry_run=True. Creates a snapshot before applying for rollback.
  • All resources are read-only. No MCP client can mutate source of truth through resources.
  • Namespace-aware. Multi-agent workspaces scope resources by agent ACL.

Security

Threat Model

| What we protect | How |
|---|---|
| Memory integrity | 74+ structural checks, ConstraintSignature validation |
| Accidental overwrites | Proposal-based mutations only (never direct writes) |
| Rollback safety | Snapshot before every apply, atomic os.replace() |
| Symlink attacks | Symlink detection in restore paths |
| Path traversal | All paths resolved via os.path.realpath(), workspace-relative only |

| What we do NOT protect against | Why |
|---|---|
| Malicious local user | Single-user CLI tool — filesystem access = data access |
| Network attacks | No network calls, no listening ports, no telemetry |
| Encrypted storage | Files are plaintext Markdown — use disk encryption if needed |

No Network Calls

mind-mem makes zero network calls from its core. No telemetry, no phoning home, no cloud dependencies. Optional features (vector embeddings, cross-encoder) may download models on first use.


Requirements

  • Python 3.10+
  • No external packages — stdlib only for core functionality

Optional Dependencies

| Package | Purpose | Install |
|---|---|---|
| fastmcp | MCP server | pip install mind-mem[mcp] |
| onnxruntime + tokenizers | Local vector embeddings | pip install mind-mem[embeddings] |
| sentence-transformers | Cross-encoder reranking | pip install mind-mem[cross-encoder] |
| ollama | LLM extraction (local) | pip install ollama |

Mind-Mem:7B — Purpose-Trained LLM

For best LLM extraction quality, use Mind-Mem:7B — a purpose-trained model fine-tuned on mind-mem's 8 extraction tasks (entity extraction, fact extraction, observation compression, contradiction detection, governance analysis, intent classification, axis-aware retrieval, LLM reranking).

Ollama (recommended):

# Download the GGUF from HuggingFace
wget https://huggingface.co/star-ga/mind-mem-7b/resolve/main/mind-mem-7b-Q4_K_M.gguf

# Create Ollama model
cat > Modelfile << 'EOF'
FROM ./mind-mem-7b-Q4_K_M.gguf
SYSTEM "You are Mind7B, a specialized memory extraction model for mind-mem."
PARAMETER temperature 0.1
PARAMETER num_predict 512
EOF
ollama create mind-mem:7b -f Modelfile

Then set in mind-mem.json:

{
  "extraction": {
    "enabled": true,
    "model": "mind-mem:7b",
    "backend": "ollama"
  }
}

LoRA adapter (transformers + PEFT):

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3.5-9B", device_map="auto")
model = PeftModel.from_pretrained(base, "star-ga/mind-mem-7b")
tokenizer = AutoTokenizer.from_pretrained("star-ga/mind-mem-7b")

| Resource | Link |
|---|---|
| Model (LoRA + GGUF) | star-ga/mind-mem-7b |
| Training data | star-ga/mind7b-training |
| Base model | Qwen/Qwen3.5-9B |
| Training | QLoRA via Unsloth, 2600 examples, RTX 4090 |
TrainingQLoRA via Unsloth, 2600 examples, RTX 4090

Platform Support

| Platform | Status | Notes |
|---|---|---|
| Linux | Full | Primary target |
| macOS | Full | POSIX-compliant shell scripts |
| Windows (WSL/Git Bash) | Full | Use WSL2 or Git Bash for shell hooks |
| Windows (native) | Python only | Use validate_py.py; hooks require WSL |

Troubleshooting

| Problem | Solution |
|---|---|
| validate.sh says "No mind-mem.json found" | Run in a workspace, not the repo root. Run init_workspace.py first. |
| recall returns no results | Workspace is empty. Add decisions/tasks first. |
| capture says "no daily log" | No memory/YYYY-MM-DD.md for today. Write something first. |
| intel_scan finds 0 contradictions | Good — no conflicting decisions. |
| Tests fail on Windows | Use validate_py.py instead of validate.sh. Hooks require WSL. |
| MIND kernel not loading | Compile with mindc mind/*.mind --emit=shared -o lib/libmindmem.so. Or ignore — pure Python works identically. |

FAQ

No results from recall?
Check that the workspace path is correct and points to an initialized workspace
containing decisions, tasks, or entities. If the FTS5 index is stale or missing,
run the reindex MCP tool to rebuild it.

MCP connection failed?
Verify that fastmcp is installed (pip install fastmcp). Check the transport
configuration in your client's MCP config (stdio vs HTTP). Ensure the
MIND_MEM_WORKSPACE environment variable points to a valid workspace directory.

MIND kernels not loading?
Run bash src/mind_mem/build.sh to compile the MIND source files (requires mindc).
If the MIND compiler is not available, mind-mem automatically uses the pure Python
fallback with identical results.

Index corrupt?
Run the reindex MCP tool, or from the command line:
python3 -m mind_mem.sqlite_index --rebuild --workspace /path/to/workspace.
This drops and recreates the FTS5 index from all workspace files.


Specification

For the formal grammar, invariant rules, state machine, and atomicity guarantees, see SPEC.md.


Contributing

Contributions welcome. Please open an issue first to discuss what you'd like to change.

See CONTRIBUTING.md for guidelines.


License

Apache 2.0 — Copyright 2026 STARGA Inc and contributors.

Repository: star-ga/mind-mem • Created: February 18, 2026 • Updated: April 13, 2026 • Language: Python • Category: AI