memory-bank-skill: persistent memory for AI coding teams

Open source · MIT · v5.0.0

AI coding agents should not wake up with amnesia.


Every new session starts with re-explaining the project, re-stating the plan, and watching your rules get forgotten. memory-bank-skill ends that: a .memory-bank/ directory next to your code that the agent reads at session start, updates as it works, and hands to the next session — or the next agent — intact.

8 agents: one memory model
50+ commands: plan, work, verify, recall
$0 by default: local files, no API keys

Session signal

Status, plans, specs, research, lessons, code graph, and past sessions stay navigable between chats — across compaction events and across agents.

$ pipx install memory-bank-skill
$ memory-bank install   # Claude Code · Cursor · Codex · …

.memory-bank/
├── status.md      ← where we are, what's next
├── checklist.md   ← tasks ✅ / ⬜
├── roadmap.md     ← priorities, direction
├── plans/ specs/  ← executable plans & SDD specs
├── notes/ lessons.md
├── session/       ← cross-chat memory (/mb recall)
└── codebase/      ← stack map + code graph

$ /mb start
[context] active plan loaded
[rules] tdd + clean architecture + fsd
[next] resume exactly where we stopped
Claude Code · Cursor · Codex · OpenCode · Windsurf · Cline · Kilo · Pi Code

The problem

The tooling is smart. The memory model is fragile.

LLM agents are stateless by design. Without a durable memory layer you pay the same tax every single day — in tokens, in time, and in architecture drift.

Session cold starts

Yesterday's decisions disappear. Today's agent re-asks the same questions, re-reads half the repo, burns tokens, and repeats mistakes you already fixed once.

Rule drift

TDD, architecture boundaries, review rituals — repeated in chat, forgotten by the next session. Shared constraints decay into wishful thinking.

Tool lock-in

Claude Code, Cursor, Codex, and OpenCode each invent their own glue. Your project memory should outlive the agent you happen to use this week.

Everything inside

Not a notes folder. A full operating system for AI-assisted work.

Three layers in one skill: long-term project memory, an always-on engineering ruleset, and a 50+ command dev toolkit — all plain files in your repo, all working offline, all free by default.

Persistent project memory

.memory-bank/ holds status, checklist, roadmap, plans, research, lessons, and notes. It survives compaction, restarts, and agent switches. Commit it and your teammate's agent catches up without a single question.

Engineering rules, always on

TDD, SOLID, Clean Architecture, Feature-Sliced Design, Testing Trophy, coverage targets — installed as global rules the agent reads at every session start. No more reminding. Works even without a bank (rules-only mode).

50+ workflow commands

25 top-level commands (/plan, /review, /commit, /adr, /security-review…) plus 25+ /mb sub-commands for the full memory lifecycle. /mb help shows them live inside any agent.

Composable work pipeline (new in v5)

/mb work drives plans and specs through implement → verify → done by default. Need more rigor? Flip on --review, --judge, or the full 8-stage chain. You pay for exactly the process you ask for.

Spec-driven development

/mb discuss interviews you into EARS-validated requirements, /mb sdd turns them into a Kiro-style spec triple (requirements / design / tasks), and /mb work executes the tasks one by one — with verification gates between them.
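To make "EARS-validated" concrete, here is a minimal sketch of checking a requirement sentence against the five standard EARS templates. The regular expressions and the `classify_requirement` name are illustrative assumptions, not the skill's actual validator, which may be stricter or structured differently.

```python
import re

# Hypothetical, deliberately loose patterns for the five EARS templates.
EARS_PATTERNS = {
    "ubiquitous": re.compile(r"^The .+ shall .+\.$"),
    "event-driven": re.compile(r"^When .+, the .+ shall .+\.$"),
    "state-driven": re.compile(r"^While .+, the .+ shall .+\.$"),
    "optional": re.compile(r"^Where .+, the .+ shall .+\.$"),
    "unwanted": re.compile(r"^If .+, then the .+ shall .+\.$"),
}


def classify_requirement(sentence):
    """Return the name of the first EARS template the sentence fits, or None."""
    cleaned = sentence.strip()
    for name, pattern in EARS_PATTERNS.items():
        if pattern.match(cleaned):
            return name
    return None
```

A sentence such as "Make login faster" fits no template and returns None, which is the kind of vague requirement an interview step would push back on.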

Code graph & semantic search

/mb graph builds a multi-language code graph (Python, Go, JS/TS, Rust, Java) with god-node detection, git co-change edges, and suggested questions. Semantic search runs on free local BM25, with opt-in embeddings for concept queries.

LLM codebase wiki

/mb wiki writes a per-community wiki of your codebase and hunts for surprising cross-module connections — using your host agent's subagents. No API key, no external service.

Cross-session recall

Lifecycle hooks log every session automatically. /mb recall "why did we pick postgres" searches past chats and notes — so decisions made three weeks ago are one query away, not lost in a closed tab.

Rule profiles & presets

22 built-in presets tune the rules to your world — backend / frontend / mobile, Go to TypeScript, microservices to FSD, TDD to legacy-safe. The safety baseline (no placeholders, protected files, verify before done) can never be switched off.

A roster of subagents

Planner, plan-verifier, doctor, reviewer ensemble, test-runner, rules-enforcer, and 9 dev-role agents (backend, frontend, iOS, Android, QA, DevOps…) — each carrying an evidence-before-claims discipline: no "tests pass" without the actual output.

8 agents, zero lock-in

One install wires Claude Code, Cursor, Codex, OpenCode, Windsurf, Cline, Kilo, and Pi Code to the same memory model. Adapters merge idempotently into your existing AGENTS.md and hooks — uninstalling one client never breaks another.

Private & local by design

Everything is markdown and JSON on your disk. Nothing is sent anywhere. API keys and tokens (sk-…, ghp_…, JWTs…) are auto-redacted from session capture before they ever reach disk; wrap anything else in <private> tags to keep it out of every index and search.
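As an illustration of redacting before anything reaches disk, the sketch below masks token-shaped strings and drops `<private>` blocks from captured text. The exact patterns, the replacement strings, and the `redact` function name are assumptions made for this example; the skill's real rules may differ.

```python
import re

# Hypothetical patterns for illustration only.
SECRET_PATTERNS = [
    re.compile(r"\bsk-[A-Za-z0-9_-]{16,}"),  # sk-... style API keys
    re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),  # ghp_... GitHub tokens
    re.compile(r"\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+"),  # JWTs
]
PRIVATE_BLOCK = re.compile(r"<private>.*?</private>", re.DOTALL)


def redact(text: str) -> str:
    """Replace <private> blocks first, then mask anything shaped like a secret."""
    text = PRIVATE_BLOCK.sub("[private]", text)
    for pattern in SECRET_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text
```

Running the private-block pass first means a secret inside a `<private>` section is removed together with the rest of that section.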

New in v5.0.0

The pipeline is now yours to compose.

v5 makes process a dial, not a tax. The default flow is lean — review and judge are opt-in stages you add per run, per project, or per preset. Canonical order is fixed; invalid chains fail fast before any code is touched.

Legend: default (execution preset) · opt-in via flags, pipeline.yaml, or presets

Lean by default

/mb work my-feature
# implement → verify → done

Review is off by default. Fast iteration stays fast.

Rigor on demand

/mb work payments --review --judge
# + reviewer + independent judge
# severity gate: blocker 0 · major 0

Add a reviewer and an independent judge for the code that matters.

The whole chain

/mb work checkout --workflow full
# discuss → sdd → plan → implement
# → verify → review → judge → done

From requirements interview to judged delivery — one command.

Three precedence layers apply: launch flags > project pipeline.yaml > built-in default. The heavyweight 5-reviewer ensemble stays available as --workflow governed-execution. Upgrading from v4? See the one-page migration guide; no file migration is needed.
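The precedence rule and the fail-fast check on stage order can be sketched as follows. The stage names come from the full chain shown above; the function names are hypothetical, and treating each layer as a complete stage list (rather than merging layers) is an assumption of this sketch.

```python
CANONICAL_ORDER = [
    "discuss", "sdd", "plan", "implement", "verify", "review", "judge", "done",
]
DEFAULT_STAGES = ["implement", "verify", "done"]


def validate(stages):
    """Fail fast on unknown stages, duplicates, or stages out of canonical order."""
    unknown = [s for s in stages if s not in CANONICAL_ORDER]
    if unknown:
        raise ValueError(f"unknown stages: {unknown}")
    positions = [CANONICAL_ORDER.index(s) for s in stages]
    if positions != sorted(set(positions)):
        raise ValueError(f"stages out of canonical order: {stages}")
    return list(stages)


def resolve_stages(flag_stages=None, project_stages=None):
    """Use the highest-precedence layer that is set: flags > pipeline.yaml > default."""
    for layer in (flag_stages, project_stages):
        if layer is not None:
            return validate(layer)
    return validate(DEFAULT_STAGES)
```

Validation happens before any stage runs, which mirrors the idea that an invalid chain is rejected before code is touched.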

Code-graph intelligence

Stop grepping. Query the codebase like a database.

/mb graph parses Python (stdlib ast) and Go, JS, TS, Rust, Java (tree-sitter) into a deterministic JSON-Lines graph next to your code — with god-node analytics, bridge-file detection, Louvain communities, and an incremental cache that rebuilds in seconds. Then it gives you three different ways to ask questions, all local.

1 · Structural queries — $0, <1s

$ mb-graph-query.py impact \
    --symbol WriteFile
# every caller, transitively —
# the blast radius before you edit

"Who calls X?" · "What breaks if I change X?" · "Which tests cover X?" — deterministic answers from graph.json, queryable with plain jq. No false positives from strings and comments.

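The "every caller, transitively" idea behind the impact query can be sketched as a breadth-first walk over reversed call edges. The record shape used here (`kind`, `src`, `dst`) is a hypothetical schema chosen for illustration and is not taken from the tool's actual graph file.

```python
import json
from collections import defaultdict, deque


def load_callers(jsonl_text):
    """Build a callee -> direct callers map from JSON Lines edge records."""
    callers = defaultdict(set)
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue
        record = json.loads(line)
        if record.get("kind") == "calls":  # hypothetical field names
            callers[record["dst"]].add(record["src"])
    return callers


def impact(callers, symbol):
    """Return every transitive caller of `symbol` using breadth-first search."""
    seen, queue = set(), deque([symbol])
    while queue:
        current = queue.popleft()
        for caller in callers.get(current, ()):
            if caller not in seen and caller != symbol:
                seen.add(caller)
                queue.append(caller)
    return seen
```

Because the walk only follows parsed call edges, a mention of the symbol inside a string or comment never shows up in the result.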
2 · Semantic search — local, no API key

$ mb-semantic-search.py \
    "how does auth token refresh work" \
    --source-only

Find code by meaning, not by name. Pure-Python BM25 always works; opt-in local embeddings handle concept queries. A code-aware tokenizer splits camelCase and snake_case for precision.
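A minimal sketch of the two ingredients named above: a tokenizer that splits camelCase and snake_case, and Okapi BM25 ranking on top of it. The function names, the parameter defaults (`k1=1.5`, `b=0.75`), and the exact splitting rules are assumptions for illustration, not the tool's implementation.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split on non-alphanumerics, then on camelCase boundaries; lowercase all parts."""
    tokens = []
    for word in re.findall(r"[A-Za-z0-9]+", text):
        parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|[0-9]+", word)
        tokens.extend(part.lower() for part in parts)
    return tokens


def bm25_rank(query, docs, k1=1.5, b=0.75):
    """Rank document ids by Okapi BM25 score against the query, best first."""
    tokenized = {doc_id: tokenize(text) for doc_id, text in docs.items()}
    avg_len = sum(len(t) for t in tokenized.values()) / max(len(tokenized), 1)
    doc_freq = Counter(term for toks in tokenized.values() for term in set(toks))
    scores = {}
    for doc_id, toks in tokenized.items():
        counts = Counter(toks)
        score = 0.0
        for term in set(tokenize(query)):
            tf = counts[term]
            if tf == 0:
                continue
            df = doc_freq[term]
            idf = math.log(1 + (len(docs) - df + 0.5) / (df + 0.5))
            score += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(toks) / avg_len))
        scores[doc_id] = score
    return sorted(scores, key=scores.get, reverse=True)
```

Splitting identifiers is what lets a plain-language query like "auth token refresh" match a function named `refreshAuthToken`.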

3 · LLM wiki — your agent's own subagents

$ /mb wiki
# one article per module community
# + "surprising connections" edges
#   with confidence & rationale

Haiku writes a wiki article per module cluster; Sonnet hunts non-obvious cross-module links and merges them into the graph idempotently. No extra API key — it reuses the agent you're already running.

Opt-in layers keep the base output byte-identical. --cochange mines git history for files that change together without importing each other, a coupling no AST can see. --questions appends data-ranked "what to look at first" questions. --docs enriches nodes with signatures and docstrings.

Unlike Aider's per-request repo-map or Cursor's server-side index, the graph is a persistent, committable file living next to your plans and ADRs, and the dev-role subagents check the blast radius through it before every edit. Further reading: How it works & honest comparison.
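The co-change idea, files that keep landing in the same commits without importing each other, can be sketched in a few lines. The input shapes are assumptions: `commits` is a list of per-commit file lists (for example, parsed from `git log --name-only`), `imports` is a set of file pairs that already have an import edge, and `min_count` is a hypothetical threshold.

```python
from collections import Counter
from itertools import combinations


def cochange_edges(commits, imports, min_count=2):
    """Count file pairs that share commits but have no import edge between them."""
    pair_counts = Counter()
    for files in commits:
        # Sort so each unordered pair is always counted under the same key.
        for a, b in combinations(sorted(set(files)), 2):
            pair_counts[(a, b)] += 1
    return {
        pair: count
        for pair, count in pair_counts.items()
        if count >= min_count and pair not in imports and pair[::-1] not in imports
    }
```

Pairs that already share an import edge are filtered out, so what remains is coupling that only the commit history reveals.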

How it works

Navigate the project. Don't dump the whole codebase.

Context stays structured, small, and durable. The agent loads only what's needed and writes the memory back as real work lands — so the next session starts warm.

01 · Initialize once

/mb init creates .memory-bank/, detects your stack, and points the agent at it. /mb map and /mb graph add a codebase map and code graph on top.

02 · Resume instantly

/mb start loads status, checklist, roadmap, and the active plan before a single line of code — the agent knows the blockers and the next step in seconds.

03 · Work with discipline

/mb plan or /mb sdd produce executable plans and specs; /mb work drives them stage by stage under TDD, Clean Architecture, and your chosen pipeline.

04 · Close the loop

/mb verify audits the diff against the plan's DoD; /mb done appends progress, updates status, and preserves the knowledge for tomorrow — and for /mb recall.

Cross-agent portability

One repo, one memory bank, eight coding agents.

The project state lives in your repo, not in a vendor's cloud. Adapters teach each host the same memory model — switch tools mid-project and nothing is lost.

Claude Code

Native commands, full lifecycle hooks, subagents, session memory.

Cursor

Global rules, 10 hooks, commands, and a skill alias, all auto-wired.

Codex

Skill discovery plus AGENTS.md guidance and project adapter.

OpenCode

Native commands, TypeScript plugins, global AGENTS surface.

Windsurf

Cascade hooks and project rules.

Cline

.clinerules integration and hooks.

Kilo

Rules plus git-hook fallback where native hooks are absent.

Pi Code

Dual-mode install: native skill or shared AGENTS.md.

Install

Fast to add. Easy to keep. Trivial to remove.

Plain files, marker-based merges, byte-level idempotent installs. No framework lock-in, no hosted dependency, no hidden state. macOS, Linux, and Windows (Git Bash / WSL).

Recommended — pipx

pipx install memory-bank-skill
memory-bank install
# or pick clients explicitly:
memory-bank install \
  --clients claude-code,cursor,codex

First session in a project

/mb init        # create the bank
/mb start       # load context
/mb plan feature user-auth
/mb work user-auth
/mb verify && /mb done

Alt paths

# Homebrew
brew tap fockus/tap && brew install memory-bank

# One-shot skill copy
npx skills add fockus/skill-memory-bank

# Developers
git clone https://github.com/fockus/skill-memory-bank.git \
  ~/.claude/skills/skill-memory-bank

Documentation

Learn the rhythm in ten minutes.

The daily loop is three commands. Everything else is there when you need it — and every doc lives next to the code on GitHub.

Every session

start → work → done

Open the project, run /mb start, do the work, finish with /mb done. The agent keeps checklist.md and progress.md current as tasks complete — you never write a status report again.

Bigger features

discuss → sdd → work

/mb discuss turns a vague idea into validated requirements, /mb sdd produces the spec triple, and /mb work executes its tasks with verification — and review/judge gates when you opt in.

When you forget

search → recall → wiki

/mb search scans the bank, /mb recall digs through past sessions, semantic search finds code by concept, and /mb wiki explains how the modules actually fit together.

Honest answers

Does this replace my agent's built-in memory?

No — it complements it. Native memory is per-user and cross-project (your preferences, your style). .memory-bank/ is per-project and team-shared (status, plans, decisions). Both load simultaneously.

Do I have to commit .memory-bank/?

Your call. Commit it to share state with your team — a colleague clones the repo, runs /mb start, and has full context. Solo? .gitignore it, or use global storage mode and keep the repo untouched.

Is my code sent anywhere?

No. Everything is local markdown and JSON. The code graph, BM25 search, co-change analysis, and recall index all run on your machine for $0. The only LLM calls are the ones your agent already makes.

Will it overwrite my existing AGENTS.md or hooks?

No. Adapters use marker blocks (<!-- memory-bank:start/end -->) and merge idempotently. Your content is preserved; uninstall removes only MB-owned sections.
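A marker-block merge of this kind can be sketched as below. The two marker strings are an assumed expansion of the shorthand above, and `merge_block` / `remove_block` are hypothetical names for illustration rather than the adapters' real functions.

```python
START = "<!-- memory-bank:start -->"  # assumed expansion of the start marker
END = "<!-- memory-bank:end -->"      # assumed expansion of the end marker


def merge_block(document: str, managed: str) -> str:
    """Insert or replace the managed block; text outside the markers is kept."""
    block = f"{START}\n{managed.strip()}\n{END}"
    start, end = document.find(START), document.find(END)
    if start != -1 and end > start:
        return document[:start] + block + document[end + len(END):]
    prefix = document if not document or document.endswith("\n") else document + "\n"
    gap = "\n" if prefix else ""
    return prefix + gap + block + "\n"


def remove_block(document: str) -> str:
    """Uninstall: strip only the managed block, leaving user content intact."""
    start, end = document.find(START), document.find(END)
    if start == -1 or end < start:
        return document
    return document[:start].rstrip("\n") + "\n" + document[end + len(END):].lstrip("\n")
```

Running `merge_block` twice with the same content yields the same file, which is what makes a repeated install safe.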

Is it production-ready?

v5.0.0 is the current stable line, used daily on real projects and backed by a passing suite of 1,900+ tests (pytest + bats) on Python 3.11/3.12 across Ubuntu and macOS. The skill is developed with its own memory bank: dogfooding all the way down.

Ready to keep the context?

Put the project memory next to the code.

Install the skill, run /mb init, and let every session start from the actual state of the project instead of guesswork. Five minutes now, zero cold starts after.