Most AI productivity discussion asks the wrong question. "How much faster?" assumes the difference is speed. It is not. Two developers using the same model, the same IDE integration, the same subscription tier: one of them starts every session cold and the other does not. The gap between them is not the engine. It is the transmission. But the transmission alone is not the full story either. There is a third variable: how the driver was trained.
Every token Claude Code's context window can hold is an opportunity — a tool call result
that stays in scope, a file that does not need to be re-read, a decision that does not
need to be recapped. Wasting those tokens on noise is a quiet tax on every session.
Today we are releasing kcp-commands: a Claude Code hook that recovers 33.7% of a
200K context window in a typical agentic coding session by intercepting Bash tool calls
at two critical points.
The full number across our benchmark session: 67,352 tokens saved.
Update (March 3, 2026 — v0.9.0): kcp-commands now writes a JSON event to
~/.kcp/events.jsonl on every Phase A Bash hook call. kcp-memory v0.4.0
ingests that stream to provide tool-level episodic memory — kcp-memory events search
"kubectl apply" returns every invocation of that command across all your projects.
Phase A gives Claude vocabulary. Phase B cleans output. Phase C remembers what ran.
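The event schema is not shown above; as a rough sketch (field names here are assumptions, not the actual kcp-commands format), a single line of ~/.kcp/events.jsonl might look like:

```json
{"ts": "2026-03-03T14:02:11Z", "phase": "A", "tool": "Bash", "command": "kubectl apply -f deploy.yaml", "cwd": "/home/dev/my-api", "session": "abc123"}
```

An append-only JSONL stream like this is cheap for a hook to write and trivial for a separate daemon to tail and index.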
Every Claude Code session starts the same way. The context window is empty. The agent
has no recollection of what it did yesterday, which files it touched last week, or how
it solved a similar problem three sessions ago. Each session is day one.
That is not a limitation of the model. It is a missing infrastructure layer.
Today we are releasing kcp-memory v0.1.0: a standalone Java daemon that indexes
~/.claude/projects/**/*.jsonl session transcripts into a local SQLite database with
FTS5 full-text search. Ask it "what was I working on last week?" and it answers in
milliseconds.
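The core mechanism is SQLite's FTS5 extension. A minimal sketch of the approach, with an illustrative schema (the real kcp-memory table layout is not shown in the post):

```python
import sqlite3

# FTS5 virtual table over session transcripts. Column names are
# illustrative assumptions, not the real kcp-memory schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE VIRTUAL TABLE sessions USING fts5(session_id, project, content)"
)
conn.executemany(
    "INSERT INTO sessions VALUES (?, ?, ?)",
    [
        ("abc123", "my-api", "refactored the auth middleware and fixed JWT expiry"),
        ("def456", "my-api", "added pagination to the users endpoint"),
    ],
)

# "What was I working on?" becomes an FTS5 MATCH query, answered in
# milliseconds because the index lives in a local file.
rows = conn.execute(
    "SELECT session_id FROM sessions WHERE sessions MATCH ? ORDER BY rank",
    ("auth middleware",),
).fetchall()
print(rows)  # → [('abc123',)]
```

FTS5 gives tokenized full-text matching and relevance ranking for free, which is why a single local SQLite file is enough for this workload.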
Update (same day — v0.2.0): kcp-memory now ships with tool-level granularity.
kcp-commands v0.9.0 writes every Bash tool call to
~/.kcp/events.jsonl; kcp-memory ingests that stream and makes individual commands
searchable via kcp-memory events search. Session-level and tool-level memory in one
daemon, one database, zero additional dependencies.
Update (same day — v0.3.0): kcp-memory now ships as an MCP server. Run
java -jar ~/.kcp/kcp-memory-daemon.jar mcp — registered in mcpServers in
~/.claude/settings.json — and Claude Code can call kcp_memory_search,
kcp_memory_events_search, kcp_memory_list, and kcp_memory_stats inline during a
session, without leaving the context window.
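The registration entry in ~/.claude/settings.json would follow the usual MCP server shape; a sketch (the "command"/"args" layout is the common MCP convention, and the jar path comes from the post):

```json
{
  "mcpServers": {
    "kcp-memory": {
      "command": "java",
      "args": ["-jar", "/home/you/.kcp/kcp-memory-daemon.jar", "mcp"]
    }
  }
}
```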
Update (same day — v0.4.0): Two new MCP tools close the remaining gaps.
kcp_memory_session_detail(session_id) returns full session content — user messages,
files touched, tools used — completing the search → read flow. kcp_memory_project_context()
reads PWD from the process environment and returns the last 5 sessions and 20 tool events
for the current project, with no query needed. Call it at the start of every session and
Claude immediately knows what it was doing here last time.
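The response shape is not specified in the post; purely as an illustration (every field name below is an assumption), a kcp_memory_project_context() result might look like:

```json
{
  "project": "/home/dev/my-api",
  "sessions": [
    {"id": "abc123", "started": "2026-03-02T09:14:00Z", "summary": "refactored auth middleware"}
  ],
  "events": [
    {"ts": "2026-03-02T09:20:41Z", "command": "kubectl apply -f deploy.yaml"}
  ]
}
```

Because the tool keys off PWD, a single parameterless call at session start is enough to rehydrate recent project history.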
kcp-commands recovers 33% of Claude Code's context window
by intercepting Bash tool calls. Today we are extending that same principle to
OpenCode — the 114K-star TypeScript alternative to
Claude Code.
The result is opencode-kcp-plugin: a plugin
that injects a knowledge.yaml knowledge map into every OpenCode session and annotates file
search results with intent descriptions. The mechanism is different from kcp-commands, and the
target is different, but the underlying idea is identical: give the agent a map so it does not
have to rediscover the territory on every session.
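To make "a map of the territory" concrete, here is a sketch of what such a knowledge.yaml might contain — the keys are illustrative assumptions, not the actual KCP schema:

```yaml
# Illustrative knowledge map; field names are assumptions.
project: my-api
entries:
  - path: src/auth/
    intent: "JWT issuance and verification; middleware lives here"
  - path: src/db/migrations/
    intent: "Ordered SQL migrations; never edit applied files, add new ones"
```

Injecting a map like this at session start is what lets file search results carry intent descriptions instead of bare paths.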
Every session starts from zero. The agent cannot remember the decision it helped you make last Tuesday, the bug it spent three hours debugging last week, or the architectural pattern you established last month. This is not a model capability problem. It is a memory architecture problem — and it has a tractable solution.
KCP (Knowledge Context Protocol) has gone from a draft proposal to a v0.5 spec in one
week. This post walks through what each version added, why those decisions were
made, and where the spec is heading next.
The short version: every release promotes optional fields from community RFCs into the
normative core. The spec is a strict superset at each step — a manifest written for v0.1
is still valid under v0.5.
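The strict-superset guarantee means an early manifest never needs migration. As a sketch (field names are illustrative, since the spec text is not quoted here), a minimal v0.1-era manifest:

```yaml
# Minimal manifest in the earliest shape; field names illustrative.
kcp: "0.1"
project: my-api
entries:
  - path: src/
    intent: "Application code"
```

Under v0.5, later additions (such as pre-built TL;DR summary references) would appear only as optional fields, so this file still validates unchanged.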
We submitted a pull request to CrewAI adding a KCP manifest and TL;DR summary files. The goal was straightforward: contribute the same efficiency improvement that cut agent tool calls by 76% in our benchmark. Open it up, share the result, see if the maintainers want it.
Today we applied KCP to three of the most widely used AI agent frameworks — smolagents (HuggingFace, 25K stars), AutoGen (Microsoft, 55K stars), and CrewAI (44K stars). All three got the same treatment: a knowledge.yaml manifest, pre-built TL;DR summary files for the highest-traffic sections, and a before/after benchmark using the same model and methodology.
The results: 73%, 80%, and 76% reductions in agent tool calls. Open PRs are live on all three repositories.