Open source · MIT · v5.0.0
Every new session starts with re-explaining the project, re-stating the plan, and
watching your rules get forgotten. memory-bank-skill ends that: a
.memory-bank/ directory next to your code that the agent reads at session
start, updates as it works, and hands to the next session — or the next agent —
intact.
Session signal
Status, plans, specs, research, lessons, code graph, and past sessions stay navigable between chats — across compaction events and across agents.
$ pipx install memory-bank-skill
$ memory-bank install # Claude Code · Cursor · Codex · …
.memory-bank/
├── status.md ← where we are, what's next
├── checklist.md ← tasks ✅ / ⬜
├── roadmap.md ← priorities, direction
├── plans/ specs/ ← executable plans & SDD specs
├── notes/ lessons.md
├── session/ ← cross-chat memory (/mb recall)
└── codebase/ ← stack map + code graph
$ /mb start
[context] active plan loaded
[rules] tdd + clean architecture + fsd
[next] resume exactly where we stopped
The problem
LLM agents are stateless by design. Without a durable memory layer you pay the same tax every single day — in tokens, in time, and in architecture drift.
Yesterday's decisions disappear. Today's agent re-asks the same questions, re-reads half the repo, burns tokens, and repeats mistakes you already fixed once.
TDD, architecture boundaries, review rituals — repeated in chat, forgotten by the next session. Shared constraints decay into wishful thinking.
Claude Code, Cursor, Codex, and OpenCode each invent their own glue. Your project memory should outlive the agent you happen to use this week.
Everything inside
Three layers in one skill: long-term project memory, an always-on engineering ruleset, and a 50+ command dev toolkit — all plain files in your repo, all working offline, all free by default.
.memory-bank/ holds status, checklist, roadmap, plans, research,
lessons, and notes. It survives compaction, restarts, and agent switches. Commit it
and your teammate's agent catches up without a single question.
TDD, SOLID, Clean Architecture, Feature-Sliced Design, Testing Trophy, coverage targets — installed as global rules the agent reads at every session start. No more reminding. Works even without a bank (rules-only mode).
25 top-level commands (/plan, /review,
/commit, /adr, /security-review…) plus 25+
/mb sub-commands for the full memory lifecycle. /mb help
shows them live inside any agent.
/mb work drives plans and specs through
implement → verify → done by default. Need more rigor? Flip on
--review, --judge, or the full 8-stage chain. You pay for
exactly the process you ask for.
/mb discuss interviews you into EARS-validated requirements,
/mb sdd turns them into a Kiro-style spec triple
(requirements / design / tasks), and /mb work executes the tasks one by
one — with verification gates between them.
/mb graph builds a multi-language code graph (Python, Go, JS/TS, Rust,
Java) with god-node detection, git co-change edges, and suggested questions.
Search runs locally on free BM25 keyword ranking by default, with opt-in embeddings for concept queries.
/mb wiki writes a per-community wiki of your codebase and hunts for
surprising cross-module connections — using your host agent's subagents. No API key,
no external service.
Lifecycle hooks log every session automatically. /mb recall "why did we pick
postgres" searches past chats and notes — so decisions made three weeks ago
are one query away, not lost in a closed tab.
22 built-in presets tune the rules to your world — backend / frontend / mobile, Go to TypeScript, microservices to FSD, TDD to legacy-safe. The safety baseline (no placeholders, protected files, verify before done) can never be switched off.
Planner, plan-verifier, doctor, reviewer ensemble, test-runner, rules-enforcer, and 9 dev-role agents (backend, frontend, iOS, Android, QA, DevOps…) — each carrying an evidence-before-claims discipline: no "tests pass" without the actual output.
One install wires Claude Code, Cursor, Codex, OpenCode, Windsurf, Cline, Kilo, and
Pi Code to the same memory model. Adapters merge idempotently into your existing
AGENTS.md and hooks — uninstalling one client never breaks another.
Everything is markdown and JSON on your disk. Nothing is sent anywhere. API keys and
tokens (sk-…, ghp_…, JWTs…) are auto-redacted from session
capture before they ever reach disk; wrap anything else in
<private> tags to keep it out of every index and search.
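A minimal sketch of how redaction before writing to disk could work, assuming regex-based matching. The patterns, the `[REDACTED]` placeholder, and the `redact` function name are illustrative assumptions, not the skill's actual implementation:

```python
import re

# Hypothetical patterns for illustration; the skill's real pattern list is not shown here.
SECRET_PATTERNS = [
    re.compile(r"\bsk-[A-Za-z0-9_-]{8,}"),       # API keys with an sk- prefix
    re.compile(r"\bghp_[A-Za-z0-9]{8,}"),        # GitHub personal access tokens
    re.compile(r"\beyJ[\w-]+\.[\w-]+\.[\w-]+"),  # JWT-shaped strings
]
PRIVATE_BLOCK = re.compile(r"<private>.*?</private>", re.DOTALL)


def redact(text: str) -> str:
    """Drop <private> blocks, then mask anything that looks like a secret."""
    text = PRIVATE_BLOCK.sub("", text)
    for pattern in SECRET_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text


print(redact("export KEY=sk-abcdef123456 <private>client name</private>done"))
```

Running the filter before anything is persisted is what keeps secrets out of every later index and search, since those only ever see the redacted text.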
New in v5.0.0
v5 makes process a dial, not a tax. The default flow is lean — review and judge are opt-in stages you add per run, per project, or per preset. Canonical order is fixed; invalid chains fail fast before any code is touched.
default (execution preset)
opt-in via flags, pipeline.yaml, or presets
/mb work my-feature
# implement → verify → done
Review is off by default. Fast iteration stays fast.
/mb work payments --review --judge
# + reviewer + independent judge
# severity gate: blocker 0 · major 0
Add a reviewer and an independent judge for the code that matters.
/mb work checkout --workflow full
# discuss → sdd → plan → implement
# → verify → review → judge → done
From requirements interview to judged delivery — one command.
Three precedence layers: launch flags > project pipeline.yaml >
built-in default. The heavyweight 5-reviewer ensemble stays available as
--workflow governed-execution. Upgrading from v4?
One-page migration guide — no file migration needed.
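The precedence and fail-fast rules can be sketched roughly as follows. This is a simplified model, assuming each layer supplies a complete stage chain; the function names and the list-based representation are hypothetical, and the real flag handling (for example, `--review` adding a stage to the default chain) may differ:

```python
# Canonical stage order, taken from the full workflow shown above.
CANONICAL_ORDER = ["discuss", "sdd", "plan", "implement", "verify", "review", "judge", "done"]
DEFAULT_STAGES = ["implement", "verify", "done"]


def validate(stages):
    """Reject unknown, duplicated, or out-of-order stages before any work starts."""
    unknown = [s for s in stages if s not in CANONICAL_ORDER]
    if unknown:
        raise ValueError(f"unknown stages: {unknown}")
    positions = [CANONICAL_ORDER.index(s) for s in stages]
    if positions != sorted(set(positions)):
        raise ValueError(f"stages duplicated or out of canonical order: {stages}")
    return list(stages)


def resolve_stages(flag_stages=None, project_stages=None):
    """Pick the highest-precedence layer: flags > project pipeline.yaml > default."""
    for layer in (flag_stages, project_stages):
        if layer is not None:
            return validate(layer)
    return validate(DEFAULT_STAGES)


print(resolve_stages())
print(resolve_stages(project_stages=["implement", "verify", "review", "done"]))
```

Validating the resolved chain up front means a misordered pipeline raises an error before the first stage runs, rather than halfway through an implementation.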
Code-graph intelligence
/mb graph parses Python (stdlib ast) and Go, JS, TS, Rust,
Java (tree-sitter) into a deterministic JSON-Lines graph next to your code — with
god-node analytics, bridge-file detection, Louvain communities, and an incremental
cache that rebuilds in seconds. Then it gives you three different ways to ask
questions, all local.
$ mb-graph-query.py impact \
--symbol WriteFile
# every caller, transitively —
# the blast radius before you edit
"Who calls X?" · "What breaks if I change X?" · "Which tests cover X?" —
deterministic answers from graph.json, queryable with plain
jq. No false positives from strings and comments.
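An impact query of this kind amounts to a reverse reachability walk over call edges. The sketch below assumes a hypothetical in-memory shape (a dict mapping each caller to its callees) rather than the actual graph file format, and the symbol names are made up:

```python
from collections import deque


def impact(calls, symbol):
    """Return every direct and transitive caller of `symbol`.

    `calls` maps caller -> list of callees (an assumed, simplified shape).
    """
    # Invert the edges so we can walk from a callee back to its callers.
    callers = {}
    for caller, callees in calls.items():
        for callee in callees:
            callers.setdefault(callee, set()).add(caller)

    seen, queue = set(), deque([symbol])
    while queue:
        current = queue.popleft()
        for caller in callers.get(current, ()):
            if caller not in seen and caller != symbol:
                seen.add(caller)
                queue.append(caller)
    return sorted(seen)


calls = {
    "main": ["SaveConfig", "LoadConfig"],
    "SaveConfig": ["WriteFile"],
    "LoadConfig": ["ReadFile"],
}
print(impact(calls, "WriteFile"))  # ['SaveConfig', 'main']
```

Because the walk follows parsed call edges only, a mention of `WriteFile` in a string or comment never shows up in the result.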
$ mb-semantic-search.py \
"how does auth token refresh work" \
--source-only
Find code by meaning, not by name. Pure-Python BM25 always works; opt-in local
embeddings handle concept queries. A code-aware tokenizer splits
camelCase and snake_case for precision.
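To illustrate why a code-aware tokenizer matters for BM25, here is a small pure-Python sketch. The `tokenize` and `bm25_scores` names, the regexes, and the `k1`/`b` defaults are assumptions for illustration; the skill's own tokenizer and scoring may differ:

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split identifiers on snake_case and camelCase boundaries, lowercased."""
    tokens = []
    for word in re.findall(r"[A-Za-z0-9]+", text):  # underscores act as separators
        parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", word)
        tokens.extend(p.lower() for p in parts)
    return tokens


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with Okapi BM25."""
    tokenized = [tokenize(d) for d in docs]
    avg_len = sum(len(t) for t in tokenized) / max(len(tokenized), 1)
    doc_freq = Counter(term for t in tokenized for term in set(t))
    query_terms = set(tokenize(query))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (len(docs) - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = ["def refreshAuthToken(session): ...", "def render_page(template): ..."]
print(tokenize("refreshAuthToken"))  # ['refresh', 'auth', 'token']
print(bm25_scores("auth token refresh", docs))
```

Without the identifier splitting, the query terms `auth`, `token`, and `refresh` would never match the single token `refreshAuthToken`, and the first document would score zero.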
$ /mb wiki
# one article per module community
# + "surprising connections" edges
# with confidence & rationale
Haiku writes a wiki article per module cluster; Sonnet hunts non-obvious cross-module links and merges them into the graph idempotently. No extra API key — it reuses the agent you're already running.
Opt-in layers keep the base output byte-identical: --cochange mines git
history for files that change together without importing each other — coupling no AST
can see; --questions appends data-ranked "what to look at first"
questions; --docs enriches nodes with signatures and docstrings. Unlike
Aider's per-request repo-map or Cursor's server-side index, the graph is a
persistent, committable file living next to your plans and ADRs —
and the dev-role subagents check the blast radius through it before every edit.
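The co-change idea can be sketched as pair counting over commit history. The example assumes the per-commit file lists have already been extracted (for instance from `git log --name-only`); the `cochange_pairs` name, the `min_count` threshold, and the `import_edges` parameter are hypothetical:

```python
from collections import Counter
from itertools import combinations


def cochange_pairs(commits, import_edges=frozenset(), min_count=2):
    """Count file pairs that change in the same commit but do not import each other.

    `commits` is a list of file-path lists, one per commit.
    `import_edges` holds sorted (path_a, path_b) tuples already linked by imports.
    """
    counts = Counter()
    for files in commits:
        for pair in combinations(sorted(set(files)), 2):
            if pair not in import_edges:
                counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_count}


commits = [
    ["api.py", "schema.sql"],
    ["api.py", "schema.sql", "README.md"],
    ["api.py", "util.py"],
]
print(cochange_pairs(commits))  # {('api.py', 'schema.sql'): 2}
```

Here `api.py` and `schema.sql` never import each other, yet they change together twice, which is exactly the kind of coupling an AST-only graph cannot see.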
How it works
Context stays structured, small, and durable. The agent loads only what's needed and writes the memory back as real work lands — so the next session starts warm.
/mb init creates .memory-bank/, detects your stack, and
points the agent at it. /mb map and /mb graph add a
codebase map and code graph on top.
/mb start loads status, checklist, roadmap, and the active plan before
a single line of code — the agent knows the blockers and the next step in seconds.
/mb plan or /mb sdd produce executable plans and specs;
/mb work drives them stage by stage under TDD, Clean Architecture, and
your chosen pipeline.
/mb verify audits the diff against the plan's DoD; /mb done
appends progress, updates status, and preserves the knowledge for tomorrow — and for
/mb recall.
Cross-agent portability
The project state lives in your repo, not in a vendor's cloud. Adapters teach each host the same memory model — switch tools mid-project and nothing is lost.
Claude Code: Native commands, full lifecycle hooks, subagents, session memory.
Cursor: Global rules, 10 hooks, commands, and skill alias, all auto-wired.
Codex: Skill discovery plus AGENTS.md guidance and project adapter.
OpenCode: Native commands, TypeScript plugins, global AGENTS surface.
Windsurf: Cascade hooks and project rules.
Cline: .clinerules integration and hooks.
Kilo: Rules plus git-hook fallback where native hooks are absent.
Pi Code: Dual-mode install: native skill or shared AGENTS.md.
Install
Plain files, marker-based merges, byte-level idempotent installs. No framework lock-in, no hosted dependency, no hidden state. macOS, Linux, and Windows (Git Bash / WSL).
pipx install memory-bank-skill
memory-bank install
# or pick clients explicitly:
memory-bank install \
--clients claude-code,cursor,codex
/mb init # create the bank
/mb start # load context
/mb plan feature user-auth
/mb work user-auth
/mb verify && /mb done
# Homebrew
brew tap fockus/tap && brew install memory-bank
# One-shot skill copy
npx skills add fockus/skill-memory-bank
# Developers
git clone https://github.com/fockus/skill-memory-bank.git \
~/.claude/skills/skill-memory-bank
Documentation
The daily loop is three commands. Everything else is there when you need it — and every doc lives next to the code on GitHub.
Open the project, run /mb start, do the work, finish with
/mb done. The agent keeps checklist.md and
progress.md current as tasks complete — you never write a status
report again.
/mb discuss turns a vague idea into validated requirements,
/mb sdd produces the spec triple, and /mb work executes
its tasks with verification — and review/judge gates when you opt in.
/mb search scans the bank, /mb recall digs through past
sessions, semantic search finds code by concept, and /mb wiki explains
how the modules actually fit together.
No — it complements it. Native memory is per-user and cross-project (your
preferences, your style). .memory-bank/ is per-project and team-shared
(status, plans, decisions). Both load simultaneously.
.memory-bank/?
Your call. Commit it to share state with your team — a colleague clones the repo,
runs /mb start, and has full context. Solo? .gitignore it,
or use global storage mode and keep the repo untouched.
No. Everything is local markdown and JSON. The code graph, BM25 search, co-change analysis, and recall index all run on your machine for $0. The only LLM calls are the ones your agent already makes.
AGENTS.md or hooks?
No. Adapters use marker blocks (<!-- memory-bank:start/end -->)
and merge idempotently. Your content is preserved; uninstall removes only MB-owned
sections.
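A marker-based merge of this kind might look like the sketch below. The exact marker strings are assumed from the shorthand above, and `merge_block` / `remove_block` are hypothetical names, not the installer's API:

```python
import re

# Assumed expansion of the <!-- memory-bank:start/end --> shorthand.
START = "<!-- memory-bank:start -->"
END = "<!-- memory-bank:end -->"
BLOCK = re.compile(re.escape(START) + r".*?" + re.escape(END) + r"\n?", re.DOTALL)


def merge_block(existing: str, managed: str) -> str:
    """Insert or replace the managed block, leaving all other content untouched."""
    block = f"{START}\n{managed.strip()}\n{END}\n"
    if BLOCK.search(existing):
        # A lambda avoids backslash-escape processing in the replacement text.
        return BLOCK.sub(lambda _: block, existing, count=1)
    separator = "" if not existing or existing.endswith("\n") else "\n"
    return f"{existing}{separator}{block}"


def remove_block(existing: str) -> str:
    """Uninstall: delete only the managed block."""
    return BLOCK.sub("", existing, count=1)


once = merge_block("# My rules\n", "Read .memory-bank/ at session start.")
twice = merge_block(once, "Read .memory-bank/ at session start.")
print(once == twice)  # True
```

Running the merge a second time yields the same bytes, and removal restores the file to the user's own content, which is the property that lets several clients share one AGENTS.md safely.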
v5.0.0 is the current stable line, used daily on real projects, with a passing suite of 1,900+ tests (pytest + bats) on Python 3.11/3.12 across Ubuntu and macOS. The skill is developed with its own memory bank: dogfooding all the way down.
Ready to keep the context?
Install the skill, run /mb init, and let every session start from the
actual state of the project instead of guesswork. Five minutes now, zero cold
starts after.