What I learned about AI agent memory
The biggest AI agent projects converged on the same memory strategy. It's not a vector database.
March 8, 2026
I've been thinking through memory for rangoon, a personal assistant framework I'm building. The core problem is straightforward: your agent finishes a task, the session ends, and everything it learned disappears.
My first instinct was to reach for something heavy. Vector databases, embedding pipelines, retrieval frameworks. But when I looked at how the most capable agent projects actually handle memory, I found something I didn't expect.
Four types of memory
Agent memory isn't one thing. There are four types, and they matter for different reasons.
Short-term memory is your current conversation context. It lives for the session and dies when it ends. This is the one everybody gets for free.
Long-term memory is the persistent stuff. Facts the agent should remember across sessions: user preferences, project context, lessons learned from past mistakes. This is where most frameworks focus and where most of them overcomplicate things.
Procedural memory is learned workflows. How to deploy, how to run tests, how to format a commit message. Patterns the agent has internalized through repeated use.
Working memory is scratch space. The agent uses it to reason through a problem, then throws it away. Temporary by design.
Most memory frameworks try to handle all four with a single abstraction. That's usually where things start to fall apart.
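To make the taxonomy concrete, here's a minimal sketch of how the four types might map to storage. This is illustrative only: the field names and file paths are hypothetical, not taken from any particular framework.

```python
from dataclasses import dataclass, field

@dataclass
class AgentMemory:
    # Hypothetical mapping of the four memory types to storage strategies.
    short_term: list[str] = field(default_factory=list)  # conversation turns; dies with the session
    long_term_path: str = "MEMORY.md"                    # persistent facts, read at session start
    procedural_path: str = "WORKFLOWS.md"                # learned workflows, e.g. how to deploy
    working: dict = field(default_factory=dict)          # scratch space for the current task

    def end_session(self) -> None:
        # Short-term and working memory are deliberately thrown away;
        # only the files on disk survive the session.
        self.short_term.clear()
        self.working.clear()

mem = AgentMemory()
mem.short_term.append("user: run the tests")
mem.working["plan"] = ["locate test dir", "run pytest"]
mem.end_session()
```

Note that two of the four types are just paths: the persistence story is entirely "what's on disk when the process exits."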
Three projects, same answer
Manus, OpenClaw, and Claude Code all arrived at markdown files independently. Manus was acquired for $2B. OpenClaw has 145K+ GitHub stars. Claude Code is what I'm using to write this post.
All three store agent memory as plain text files in the project directory. MEMORY.md, CLAUDE.md, daily logs, task plans. No database, no embeddings, no retrieval pipeline. Some people call it "memory as documentation" instead of "memory as database."
When three teams working on very different problems converge on the same answer, I pay attention.
Why files beat databases (for this)
Vector databases sound good on paper. Semantic search, scalable retrieval, fuzzy matching. But for local agents working on codebases and personal data, the actual problems are different.
Debuggability is the big one. When your agent starts acting weird, cat MEMORY.md tells you why. Try getting that kind of visibility from a vector store. No dashboard, no query language, just open the file.
Version control comes free. git diff on your agent's memory shows you exactly what changed, when it changed, and lets you roll it back if something went wrong. When your agent decides to "learn" something incorrect, being able to revert it matters.
Crash resilience is built in. Process dies mid-task? The file is still on disk. No connection pools to manage, no data loss from incomplete transactions, no recovery scripts.
You can swap your agent framework tomorrow and the memory files still work. They're text. There's no migration path because there's nothing to migrate.
And there's no infrastructure to maintain. No database to provision, no embeddings API to pay for, no retrieval service to keep running. For local, single-user agents, that overhead is hard to justify.
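One pattern worth borrowing to get that crash resilience for real is the atomic write: write to a temp file, then rename over the original. A sketch of my own (not from any of the three projects), using only the standard library:

```python
import os
import tempfile

def save_memory(path: str, content: str) -> None:
    """Crash-safe file update: a crash mid-write leaves the old file intact."""
    dir_ = os.path.dirname(os.path.abspath(path))
    # Write the new content to a temp file in the same directory...
    fd, tmp = tempfile.mkstemp(dir=dir_, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            f.write(content)
        # ...then atomically swap it into place. Readers see either the
        # old version or the new one, never a half-written file.
        os.replace(tmp, path)
    except BaseException:
        os.remove(tmp)
        raise
```

That's the whole recovery story: no connection pools, no transaction log, one rename.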
A practical file structure
The pattern I keep seeing (and the one I've adopted for rangoon) is simple:
A curated memory file that the agent reads at the start of every session. This is the actively maintained long-term memory: current state, open threads, things to remember. The agent updates it as things change, removes outdated entries, and keeps it tight enough to fit in context. If you let it grow unchecked it becomes noise.

An append-only journal for session history. This never gets edited or trimmed. Over time it becomes a genuinely useful log of what happened and when. It's the longitudinal record that the curated file draws from.
Optionally, a shared identity file that gives every agent cross-cutting context. Preferences, goals, communication style. One file, readable by all agents.
The important thing is that all of this is readable and editable by you. Open the file, change something, the agent picks it up next session. No admin panel, no config API.
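The whole structure fits in a few lines of code. Here's a sketch of the session lifecycle under this layout; MEMORY.md appears in the projects above, but the journal directory and identity file names are my own placeholders:

```python
import datetime
import pathlib

MEMORY = pathlib.Path("MEMORY.md")     # curated, read every session, edited freely
JOURNAL = pathlib.Path("journal")      # append-only, one file per day
IDENTITY = pathlib.Path("IDENTITY.md") # shared cross-agent context (hypothetical name)

def start_session() -> str:
    """Assemble the context the agent sees at session start."""
    parts = [p.read_text() for p in (IDENTITY, MEMORY) if p.exists()]
    return "\n\n".join(parts)

def log_event(event: str) -> None:
    """Append to today's journal entry. Old entries are never edited or trimmed."""
    JOURNAL.mkdir(exist_ok=True)
    day = JOURNAL / f"{datetime.date.today().isoformat()}.md"
    with day.open("a") as f:
        f.write(f"- {event}\n")
```

The asymmetry is the point: the curated file gets read and rewritten, the journal only ever grows.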
Where it breaks down
File-based memory works up to about 5MB of accumulated text. Past that, you need search. Not necessarily vector search. BM25 full-text indexing handles most cases. Honestly, grep handles a surprising number of them.
If you're building an agent that needs to recall from thousands of documents or weeks of conversation history, a hybrid approach makes more sense. Files for the curated stuff, indexed search for the archive.
Most local agents never hit that threshold though. If yours does, you'll know, and you can add the complexity then. Not before.
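Before reaching for BM25, even a naive keyword scan over the journal covers a lot of ground. A toy sketch of that grep-grade tier (plain term counting; deliberately not BM25, so no length normalization or IDF weighting):

```python
import pathlib
import re

def search_journal(journal_dir: str, query: str, top_k: int = 3) -> list[str]:
    """Rank journal files by how often the query terms appear in them."""
    terms = [t.lower() for t in re.findall(r"\w+", query)]
    scored = []
    for path in pathlib.Path(journal_dir).glob("*.md"):
        text = path.read_text().lower()
        score = sum(text.count(t) for t in terms)
        if score:
            scored.append((score, path.name))
    # Highest term count first; return just the file names.
    return [name for score, name in sorted(scored, reverse=True)[:top_k]]
```

When this stops being good enough, that's the signal to add a real index, and only the journal needs one; the curated file stays small by construction.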