Knowledge Infrastructure

The Faster Pencil

AI does not remove the hard part of any job. It moves it — and makes it harder to ignore.

Based on a conversation between two software developers, March 2026.


Two developers were talking late one night about what AI had actually changed in their work. They had both been using it for years. They were good at it. And what they kept coming back to was something that surprised them: the more capable the tool got, the more it demanded of them — not less.

This essay is built on that conversation. But the idea they landed on has nothing to do with software. It applies to any job where thinking is the work.

Every Agent That Queries a Knowledge Manifest Reinvents Filtering

Your agent has a task, a token budget, and a manifest with 200 knowledge units. Which units should it actually read? Every team answers this question differently — custom audience filters, ad-hoc staleness checks, bespoke capability gates. The logic works, but none of it interoperates. Swap one tool for another and you rewrite the glue.
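To make the duplication concrete, here is a minimal sketch of the kind of glue each team ends up writing by hand. It is not the KCP query format: the `Unit` fields (`audience`, `last_verified`, `requires`, `tokens`) and the `select_units` function are hypothetical names chosen for illustration, and the greedy freshest-first fill is just one plausible policy.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Unit:
    """Hypothetical knowledge unit; field names are assumptions, not the KCP schema."""
    id: str
    audience: set[str]
    last_verified: date
    tokens: int
    requires: set[str] = field(default_factory=set)


def select_units(units, *, audience, capabilities, today, max_age_days, token_budget):
    """Apply audience, staleness, and capability filters, then fill the token budget."""
    eligible = [
        u for u in units
        if audience in u.audience                             # audience filter
        and (today - u.last_verified).days <= max_age_days    # staleness check
        and u.requires <= capabilities                        # capability gate
    ]
    # Freshest units first, then add greedily while the budget allows.
    eligible.sort(key=lambda u: u.last_verified, reverse=True)
    chosen, used = [], 0
    for u in eligible:
        if used + u.tokens <= token_budget:
            chosen.append(u)
            used += u.tokens
    return chosen
```

Every line of that function is a decision another tool will make slightly differently, which is exactly why the glue does not survive a tool swap.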

KCP v0.14 standardises the query. RFC-0014 standardises composition. Together, they solve the two problems that make knowledge manifests painful at scale.

Peter Naur Was Right in 1985, and AI Just Proved It

In 1985, the Danish computer scientist Peter Naur published a short paper called "Programming as Theory Building." His argument was simple and radical: a program is not its source code. A program is a theory — a coherent mental model of what the system does, how its parts relate to each other, and why it was built the way it was. The source code is a byproduct of that theory. A trace. Not the thing itself.

The Manifest Quality Feedback Loop

kcp-commands ships 291 manifests. Each one is a bet: that the flags we chose are the ones the agent will actually need, that the output filter is tight enough, that the preferred invocations match real usage. Some of those bets pay off. Some do not.

Until now there was no way to know which. A manifest for kubectl apply could be steering the agent into the wrong flags on every invocation, and we would never see it unless we happened to watch the session in real time. At 291 manifests and hundreds of tool calls per day, that does not scale.

Today we are shipping two small releases that close that gap: kcp-commands v0.15.0 and kcp-memory v0.7.0. Together they create a feedback loop from agent behaviour back to manifest quality -- not by guessing, but by measuring what actually happened.

From Instrumentation to Infrastructure

AI agents like Claude Code run dozens of CLI commands per session, orchestrating complex multi-step workflows. Without structured knowledge of each tool, the agent guesses flags, calls --help to discover syntax, or retries when the first attempt fails. Each mistake compounds: a wrong flag in step 3 can invalidate everything that follows.

kcp-commands solves this with manifests -- YAML files that encode exactly what an agent needs to use a CLI tool correctly: key flags, preferred invocations, output patterns to strip. The daemon injects the right manifest before each Bash call, turning an uninformed first attempt into a guided one.
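As a rough illustration of what such a manifest carries, the sketch below mirrors one as a Python dict and applies its output filter. The real manifests are YAML; the key names here (`key_flags`, `preferred_invocations`, `strip_patterns`) and the `strip_output` helper are assumptions made for this example, not the actual kcp-commands schema.

```python
import re

# Hypothetical manifest for `git status`, written as a dict for illustration only.
MANIFEST = {
    "command": "git status",
    "key_flags": ["--short", "--branch"],
    "preferred_invocations": ["git status --short --branch"],
    # Lines matching any of these regexes are dropped from the output.
    "strip_patterns": [r'^\s*\(use "git .*\)$', r"^\s*$"],
}


def strip_output(manifest, output):
    """Remove output lines that match any of the manifest's strip patterns."""
    patterns = [re.compile(p) for p in manifest["strip_patterns"]]
    kept = [
        line for line in output.splitlines()
        if not any(p.search(line) for p in patterns)
    ]
    return "\n".join(kept)
```

In this sketch the hint lines and blank lines that git prints for human readers never reach the context window, while the lines that carry state do.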

kcp-memory adds the second dimension: episodic memory. Every session is indexed. Every tool call is logged. The agent can search what it did last week, recover the reasoning from a delegated subagent, and see which manifests are actually working in practice.

Together they make Claude Code measurably smarter: 33% of the context window recovered, --help calls eliminated, and an agent that learns from its own history instead of starting from zero every session.

The latest addition closes the loop: the tools now learn from their own performance. Every Bash call produces an outcome signal. kcp-memory tracks retry rates, help-followup rates, and error rates per manifest -- surfacing which ones are guiding the agent well and which ones are steering it wrong. The highest-failure manifests have already been rewritten based on the data. The infrastructure measures its own effectiveness and improves.
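The per-manifest measurement can be sketched in a few lines. This is an illustration under stated assumptions rather than the kcp-memory implementation: the `Call` record, its boolean fields, and the ranking by summed rates are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Call:
    """Hypothetical outcome signal for one Bash call guided by a manifest."""
    manifest: str
    retried: bool
    help_followup: bool
    errored: bool


def manifest_rates(calls):
    """Aggregate retry, help-followup, and error rates per manifest."""
    totals = defaultdict(lambda: {"calls": 0, "retry": 0, "help": 0, "error": 0})
    for c in calls:
        t = totals[c.manifest]
        t["calls"] += 1
        t["retry"] += c.retried          # bools count as 0 or 1
        t["help"] += c.help_followup
        t["error"] += c.errored
    return {
        name: {
            "calls": t["calls"],
            "retry_rate": t["retry"] / t["calls"],
            "help_rate": t["help"] / t["calls"],
            "error_rate": t["error"] / t["calls"],
        }
        for name, t in totals.items()
    }


def worst_manifests(rates, min_calls=1):
    """Rank manifests by the sum of their three failure rates, worst first."""
    names = [n for n, r in rates.items() if r["calls"] >= min_calls]
    return sorted(
        names,
        key=lambda n: rates[n]["retry_rate"] + rates[n]["help_rate"] + rates[n]["error_rate"],
        reverse=True,
    )
```

A `min_calls` threshold keeps a manifest with a single unlucky call from topping the rewrite list; where to set it is a judgment call that depends on call volume.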

kcp-commands v0.9.0 and kcp-memory v0.4.0 were passive observers. They watched what Claude did, logged it, made it searchable. Useful, but limited. The tools had no opinions about their own data.

The work since then -- through kcp-commands v0.18.0 and kcp-memory v0.18.0 today -- has been about a different question: what happens when the tools know what to ignore, can measure their own quality, and maintain themselves?

From Capable to Trustworthy: How KCP Evolved from Discovery to Governance

AI agents are getting remarkably good at doing things. They read code, traverse APIs, generate summaries, and execute multi-step plans across sprawling codebases. What they are still bad at is knowing what they should not do.

Today, an agent dropped into a new repository does the equivalent of walking into a library and reading every book on every shelf before deciding which one is relevant. This is expensive, slow, and -- in environments where some shelves contain confidential material -- genuinely dangerous.

The Autonomous Agentic Web Needs a Foundation Layer

The Foundation Layer of the Agentic Web — Capable models are not enough

Something is being assembled right now, mostly without a name for it.

Production pipelines where agents write code, run tests, and open pull requests. Compliance workflows where agents check controls, gather evidence, and escalate to humans when something needs a decision. Developer rigs where an agent calls a tool, the tool delegates to a sub-agent, the sub-agent calls an API, and the result flows back up the chain.

The models are capable. The tooling is solid. The use cases are real.

What we are building — collectively, across hundreds of teams and projects — is an autonomous agentic web. And like the original web, it will only become useful when the pieces can talk to each other reliably. We are not there yet. The reason is interesting.

The Front Door and the Filing Cabinet: A2A Agent Cards Meet KCP

The Front Door and the Filing Cabinet: Composing A2A and KCP in Multi-Agent Systems

Multi-agent systems need two kinds of identity. The first answers "who is this agent and how do I call it." The second answers "what knowledge does this agent have and who is allowed to see each piece of it." Google's A2A protocol handles the first. KCP handles the second. They are not competitors. They are different layers of the same stack.

This post started as that explanation. Then, the same day it was published, we built four simulators and 150 adversarial tests against the spec. The tests found 8 concrete gaps. Those gaps are now driving KCP v0.7. The full story follows.