
Knowledge Infrastructure

AI agents forget everything. That's a choice, not a constraint.

Every session with Claude Code starts blank. No memory of last week's refactor, no awareness of which team worked on this module, no continuity between the agent you ran on Tuesday and the one running today.

For a personal productivity tool, that's fine. For an enterprise deploying a fleet of AI agents, it's a fundamental architectural gap.

Two Architectures for Claude Code: What 19,700 Stars Got Right and What They Missed

A repository called claude-code-best-practice hit #1 trending on GitHub this week. 19,700 stars in days. Eighty-four tips from Boris Cherny, who created Claude Code, along with contributions from Thariq, Cat Wu, and the broader Anthropic team. It is the most comprehensive public document on how to get serious results from Claude Code, and it deserves the attention it is getting.

It caught my eye because the ExoCortex -- the eight-layer stack that has been running my Claude Code setup for more than ten weeks -- solves many of the same problems from a fundamentally different direction. Same tool, same class of problem, different architectural assumptions. Comparing the two reveals something neither setup articulates on its own: there are two distinct philosophies for extending Claude Code, and each has blind spots the other has solved.

We Cancelled a 45-Minute Architecture Review. A KCP Query Answered It in 1.2 Seconds.

When the AI Agent Knows Your Architecture — organisational knowledge becomes queryable rather than assembled in meetings

Last week someone asked the question that usually triggers a meeting: "If we change the payment service API contract, what else breaks?" In any enterprise system older than a few years, nobody has the full picture. The payment service team knows their side. The downstream consumers know theirs. The platform team knows the deployment topology. But the blast radius of a contract change lives in the intersection of what three or four people carry in their heads, and the only way to assemble that intersection has always been to put those people in a room.

We didn't put them in a room. We ran a query.
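To make the idea of a "blast radius" query concrete, here is a minimal sketch in Python. It is not the KCP query itself, which this teaser does not show. The service names and the `DEPENDS_ON` map are hypothetical, and the sketch assumes the dependency edges have already been captured somewhere queryable.

```python
from collections import deque

# Hypothetical dependency edges: each consumer maps to the services it calls.
DEPENDS_ON = {
    "checkout-web": ["payment-service", "cart-service"],
    "invoice-worker": ["payment-service"],
    "reporting-job": ["invoice-worker"],
    "cart-service": [],
    "payment-service": [],
}


def blast_radius(changed, depends_on):
    """Return every service that depends on `changed`, directly or transitively."""
    # Invert the edges so we can walk from a provider to its consumers.
    consumers = {}
    for consumer, providers in depends_on.items():
        for provider in providers:
            consumers.setdefault(provider, set()).add(consumer)

    # Breadth-first walk outward from the changed service.
    affected = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for consumer in consumers.get(current, ()):
            if consumer not in affected:
                affected.add(consumer)
                queue.append(consumer)
    return sorted(affected)


print(blast_radius("payment-service", DEPENDS_ON))
# ['checkout-web', 'invoice-worker', 'reporting-job']
```

The traversal itself is trivial. The hard part, and the part a meeting normally substitutes for, is having the edges written down in one place.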

Agent Memory Rots. Here's How We Stopped It.

Five weeks ago I wrote about the three-layer memory architecture for AI agents: working memory (the context window), episodic memory (indexed session transcripts), and semantic memory (a workspace knowledge graph). The prescription was "build these layers." Yesterday I shipped the maintenance system that keeps them from decaying.
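As a rough illustration of how the three layers relate, the sketch below models them as plain Python containers. The class name, method names, and storage shapes are all assumptions made for illustration; the real system indexes transcripts and maintains a knowledge graph, neither of which is reproduced here.

```python
from dataclasses import dataclass, field


@dataclass
class AgentMemory:
    # Working memory: what is in the context window right now.
    working: list = field(default_factory=list)
    # Episodic memory: session id -> transcript lines (hypothetical shape).
    episodic: dict = field(default_factory=dict)
    # Semantic memory: entity -> set of related entities (a toy graph).
    semantic: dict = field(default_factory=dict)

    def end_session(self, session_id):
        """Archive the context window as an episode, then start blank."""
        self.episodic[session_id] = list(self.working)
        self.working.clear()

    def relate(self, a, b):
        """Record an undirected relationship between two entities."""
        self.semantic.setdefault(a, set()).add(b)
        self.semantic.setdefault(b, set()).add(a)

    def recall(self, term):
        """Return ids of past sessions whose transcript mentions `term`."""
        return sorted(
            sid for sid, lines in self.episodic.items()
            if any(term in line for line in lines)
        )


memory = AgentMemory()
memory.working.append("decided: use PostgreSQL for the orders schema")
memory.relate("orders-service", "PostgreSQL")
memory.end_session("2026-01-20")
print(memory.recall("PostgreSQL"))  # ['2026-01-20']
```

Nothing in this sketch ages, conflicts, or gets pruned, which is exactly the gap a maintenance system has to close.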

Building the layers was the easy part.

Agent Memory Rots — diagnostic telemetry and behavioral heuristics for maintaining the ExoCortex. 3,000+ sessions indexed. 65,905 files. Memory degradation imminent.

The Tool I Didn't Plan to Build: Synthesis, Ten Weeks Later

In late January 2026, I had a problem I hadn't anticipated. I had just finished building lib-pcb — a Java library for parsing eight proprietary PCB binary formats. 197,831 lines of code. 7,461 tests. Eleven calendar days. The AI agent (Claude Code) wrote most of it. The methodology worked exactly as designed.

And then I couldn't navigate any of it.

The KCP Ecosystem: How Five Tools Turn Claude Code Into a Persistent Intelligence Platform

The Problem

Every session with Claude Code starts from zero.

Every AI session starts from zero — the Start-From-Zero Loop

You open a new session, and the model has no idea what you were doing yesterday. Which services are running. What you decided about the database schema last Thursday. Why you chose the library you chose. You re-explain it. Claude asks clarifying questions you answered two sessions ago. You paste the same background context you always paste. Then the work begins.

And when the work does begin, there's a different problem: output flooding the context window. Run mvn package and you get 400 lines of Maven lifecycle noise. Run terraform plan and the diff buries the actual changes in scaffolding. Run kubectl get pods cluster-wide and you've spent 8,000 tokens on status rows you didn't need.
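One way to picture noise filtering is a small post-processor that keeps only the lines likely to matter and reports how many it dropped. This is a generic sketch, not how the KCP tools work; the `SIGNAL` pattern and the `condense` function are hypothetical and would need tuning per command.

```python
import re

# Hypothetical pattern for lines worth keeping from build-tool output.
SIGNAL = re.compile(r"ERROR|WARN|FAIL|BUILD (SUCCESS|FAILURE)")


def condense(output, max_lines=20):
    """Keep up to `max_lines` signal lines and note how many lines were omitted."""
    lines = output.splitlines()
    kept = [line for line in lines if SIGNAL.search(line)][:max_lines]
    omitted = len(lines) - len(kept)
    return "\n".join(kept + [f"[{omitted} lines omitted]"])


# Simulated 400-line build log: 398 lifecycle lines, one error, one verdict.
noisy = "\n".join(
    [f"[INFO] lifecycle step {i}" for i in range(398)]
    + ["[ERROR] compilation failed in module core", "[INFO] BUILD FAILURE"]
)
print(condense(noisy))
```

For the simulated log, the agent would see three lines instead of four hundred: the error, the build verdict, and a count of what was left out.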

Context flooding destroys working memory — 33.7% of a 200K context is recovery overhead

The context window is your working memory. Filling it with boilerplate and re-explaining the same setup in every session is waste: not just inconvenient, but structurally limiting. A 200K-token context sounds vast until a third of it is recovery overhead.
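The arithmetic behind that claim is short enough to check directly. The sketch below takes the 33.7% figure quoted above at face value and uses integer math to avoid floating-point rounding; the variable names are illustrative only.

```python
CONTEXT_TOKENS = 200_000
OVERHEAD_PER_MILLE = 337  # 33.7% expressed in tenths of a percent

overhead_tokens = CONTEXT_TOKENS * OVERHEAD_PER_MILLE // 1000
usable_tokens = CONTEXT_TOKENS - overhead_tokens

print(overhead_tokens)  # 67400
print(usable_tokens)    # 132600
```

Under that assumption, roughly 67K tokens go to recovery before any new work starts, leaving about 133K for the task itself.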

What's missing is infrastructure. Not smarter prompting. Not longer context. Infrastructure — a persistent layer that handles memory, filters noise, and gives the model the right knowledge at the right moment without you having to manage it manually.

That infrastructure is KCP.

kcp-dashboard: Observability for the KCP Ecosystem

The KCP toolchain has been running in the background for weeks. kcp-commands injects manifests before Bash calls. kcp-memory indexes sessions and tool events. Events accumulate in ~/.kcp/usage.db and ~/.kcp/memory.db. The machinery works. But until today, the only way to know whether it was working well was to grep through databases and trust the numbers.

Trust is not observability. You cannot improve what you cannot see.

Today we are releasing kcp-dashboard v0.22.0 -- a terminal UI that reads both KCP databases and shows you what the guidance layer is actually doing: which commands are guided, how often manifests leave the agent needing more help, what sessions look like, and where the gaps are.