Skip to content

Beyond RAG: How KCP 0.16--0.17 Give Agents Trustworthy, Self-Describing Knowledge

Overview: KCP v0.16 closes the trust gap with the Trusted Render Pipeline — signing, trust tiers, and sanitization. KCP v0.17 closes the content gap with content structure metadata and subtractive not_for matching. Together they compose a precision retrieval pipeline.

When a compliance agent evaluates a supplier against NIS2 Article 21, it needs two things: the supplier's security documentation to evaluate, and the regulation to evaluate it against. KCP, as described in the previous post, gives the supplier documentation a shape. The evaluation result gets a shape. But the regulation itself -- the specific requirements of Article 21(2), the interpretive guidance from ENISA, the national implementation notes -- lives where it has always lived: as prose in the system prompt.

That worked when agents consumed knowledge from a single, trusted source. It does not work when your agent pulls context from four federated manifests across two organisations, one of which was generated by an automated crawl three weeks ago. The question is no longer "does this knowledge have a shape?" It is: "can I trust this knowledge, and is it even the right content for what I need?"

KCP v0.16 and v0.17 answer those two questions. v0.16 introduces a trust model -- cryptographic signing, trust tiers, a render pipeline that fails closed. v0.17 introduces a content model -- structural metadata that tells you what a unit contains before you fetch it, and subtractive matching that tells you what it is explicitly not about. Together, they close two gaps that have been open since the beginning of the series.

This post walks through both releases. The examples are concrete. The threat model is explicit. If you are building systems where agents ingest knowledge from sources you do not fully control, this is the machinery that makes that safe.


The trust gap: v0.16

You run a KCP bridge that federates manifests from three repositories. A partner signed their manifest last month. An attacker compromises the CDN serving it, strips the signature, and injects a unit whose content contains prompt injection disguised as documentation. Your bridge sees no signature, classifies it as "unsigned," and ingests it because your policy allows unsigned content from known sources.

The context window is the attack surface. If the knowledge arriving in that window has no provenance, no integrity check, and no way to distinguish "never signed" from "signature removed," you have a supply-chain problem wearing a different hat.

The context window is the attack surface. Unverified manifests from federated sources can carry injected instructions, stripped signatures, or poisoned content — and without a trust pipeline, they arrive indistinguishable from legitimate knowledge.

The Trusted Render Pipeline

v0.16 introduces kcp render (SPEC section 16), a deterministic pipeline that produces a derived, sanitized artifact from a raw manifest. Three design principles:

  • LLM-free. No language model at any stage. Deterministic transforms only.
  • Fail-closed. On any failure, exit code 2 and no output. No partial render, no "best effort."
  • Consumer-side tier assignment. Trust tiers are computed by the consumer, not declared by the producer. A manifest cannot claim to be trusted.

The Trusted Render Pipeline: raw manifest in, sanitized artifact out. LLM-free, deterministic, fail-closed. The pipeline enforces a schema whitelist, runs an imperative-mood lint, and computes trust tiers consumer-side.

Four trust tiers, computed after render:

Tier Meaning
trusted Valid signature from an allowlisted key, scope matches
known Valid signature from a recognized but non-allowlisted key
unsigned No signature present, no scope expectation violated
failed Signature present but invalid, or expected signature missing

Plus unrendered, a pseudo-tier for manifests that have not been through the pipeline at all -- strictly worse than failed, because at least failed went through verification.

Consumer-Side Tier Assignment: the diagnostic matrix for all five trust states. Tiers are computed by the consumer over the content_integrity block — producers cannot self-declare a tier.

Content integrity

The manifest gains a trust.content_integrity block:

trust:
  content_integrity:
    manifest_hash: "sha256:a1b2c3d4..."
    signing:
      algorithm: "EdDSA"
      public_key: "ed25519:MCowBQYDK2VwAyEA..."
      signature_location: "knowledge.yaml.jws"

EdDSA with Ed25519 (RFC 8037) is mandatory-to-implement. The signature is a detached JWS over canonical manifest bytes -- it lives in a separate file, so the manifest itself is unchanged. You can still cat knowledge.yaml and read it.

Closing the signature-stripping attack

The mechanism is scope pinning. When you add a public key to your allowlist, you pin a scope: expected origins where manifests signed by that key should appear. A manifest arriving from a pinned origin without a signature renders as failed, not unsigned. The absence of the expected signature is itself the signal.

Without scope pinning, signature stripping degrades a manifest from trusted to unsigned. With scope pinning, it degrades to failed. The difference between "this was never signed" and "this should have been signed but was not" is the difference between a policy decision and a security incident.

Closing the signature-stripping attack with scope pinning. Without it: strip the signature, degrade from trusted to unsigned, pass policy. With it: a missing signature from a pinned origin renders as failed — the absence is the signal.

Sanitization

Two sanitization layers run before output.

A render-schema whitelist strips unknown fields. If someone adds an llm_instructions field, it does not survive rendering. Versioned, explicit, no pass-through of unrecognized keys.

A versioned imperative-mood lint flags prompt injection in manifest content. Fields like intent and description should contain declarative text. Imperative instructions ("ignore previous context and...") get flagged. The design is quarantine-not-reject: flagged content is annotated, not silently dropped.

The lint has a documented limitation. Descriptive-mood injection -- instructions phrased as statements -- passes by design. The load-bearing control for that threat class is C8 data-framing, not the lint. We are explicit about this because security controls that overstate their coverage are worse than controls that honestly describe their gaps.

Heuristic sanitization and quarantine: schema whitelist strips unknown fields; imperative-mood lint flags injected instructions. Quarantine-not-reject — flagged content is annotated, not silently dropped. Documented known gap: descriptive-mood injection passes the lint by design.

Kind-based load eligibility

Not all manifest kinds should be loadable. Manifests with kind: service, kind: executable, or any unknown kind value are never load-eligible at any trust tier. You can render and inspect them, but the bridge will not inject their content into a context window.

Federation semantics

Two rules: no transitive trust (manifest B is not trusted just because manifest A federates to it), and no auto-traversal below trusted (a bridge does not follow federation links from known, unsigned, or failed manifests).

Rendered artifacts are not self-authenticating. A runtime never ingests a render it did not produce or verify in the current session. Render outputs are disposable caches, not portable trust tokens.

Hard gates and federation semantics: kind-based load eligibility blocks service/executable kinds at all trust tiers. No transitive trust, no auto-traversal below trusted. Rendered artifacts are session-scoped, not portable trust tokens.


Observability: seeing what the trust pipeline does

SPEC section 17 promotes the observability schema from an RFC to a core specification. The ~/.kcp/usage.db database now has three tables:

Table Contents
usage_events Tool calls, queries, token consumption
render_events Every render invocation with input hash, output tier, duration
quarantine_events Every lint flag, with the field name, content snippet, and threat category

The quarantine table is the one to watch. Track quarantine-event count between commits -- if it drifts upward, something changed in your knowledge sources. This is a CI-diffable signal. A spike means either your content changed or someone else's did.

Discovery verification status

Section 4.18 adds discovery.verification_status: declared as a new first-party self-description level. The epistemic ordering across the spec is now:

rumored < declared < observed < verified

A declared status with confidence in [0.5, 0.8) signals "the publisher says so." It is evidence, not proof. A consumer can use it as input to a trust decision, but it does not substitute for independent verification. The ordering is normative -- a serving layer must not rank declared above observed.

Validation

The trust model was validated against a 22-case experiment matrix covering threats T1--T8: signature stripping, scope violation, field injection, kind spoofing, transitive trust abuse, lint evasion, render replay, federation poisoning. Fresh Ed25519 keys per run. Mutation-tested expectations -- the suite verifies that specific mutations cause specific failures, not just that valid manifests pass.


The content gap: v0.17

Trust tells you whether to ingest a unit. It does not tell you whether the unit is useful for your current task. That is a different problem, and it is the one most RAG pipelines handle poorly.

Consider a unit titled "API Reference." It might be 3,000 tokens of dense tables listing every endpoint with parameters, response codes, and payload schemas. A code-generation agent looking for "how to authenticate" fetches it, processes it, finds the authentication endpoint buried in row 47, and wastes 2,800 tokens on everything else. Or consider a unit titled "System Architecture" that is primarily a diagram with prose annotations. A text-only pipeline fetches it, tries to extract meaning from the alt-text, and gets noise.

The inverse problem is just as common. A unit titled "Authentication Guide" keeps appearing in search results for "authorization patterns" because the titles share vocabulary. The agent fetches it, reads about session tokens and login flows, and finds nothing about RBAC or permission models. The unit is high-quality, correctly signed, properly tiered -- and completely irrelevant to the query.

Both problems come from the same gap: units do not describe their own content structure or their own negative scope. v0.17 closes both.

The content gap: quality does not equal relevance. A correctly signed, properly tiered unit can still waste tokens (wrong modality) or mislead (vocabulary overlap without conceptual overlap). v0.17 adds the metadata to detect both problems before fetching.

Content structure

Section 4.19 adds content_structure to unit definitions:

units:
  - id: api-reference
    title: "REST API Reference"
    content_structure:
      primary: table
      contains: [table, code, prose]
      density: dense

  - id: architecture-overview
    title: "System Architecture"
    content_structure:
      primary: diagram
      contains: [diagram, prose]
      density: sparse

Three fields:

  • primary: the dominant modality. One of prose, table, code, list, diagram, reference, or mixed.
  • contains: all modalities present in the unit. A unit with primary: prose might also contain code blocks and tables.
  • density: sparse, normal, or dense. A sparse unit has high whitespace ratio and short sections. A dense unit packs information tightly with minimal narrative.

A RAG pipeline that handles code well but struggles with dense tables can prefer primary: prose or primary: code units. A multimodal agent can filter for contains: [diagram]. A token-constrained pipeline can deprioritize density: dense units. Extraction strategy -- chunking, embedding, parsing -- gets decided before fetching. You do not download 3,000 tokens of dense tables to discover your pipeline cannot process them.

Forward-compatible: unknown values pass through with a warning, not a rejection. Existing parsers will not break on future modality values.

Self-describing knowledge structures: the content_structure block declares primary modality, all modalities present, and information density. Retrieval agents and RAG pipelines can route and choose extraction strategies before fetching.

Subtractive matching: not_for and not_for_strict

Section 4.20 introduces the spec's first subtractive matching field:

units:
  - id: auth-guide
    title: "Authentication Guide"
    not_for:
      - "authorization and RBAC"
      - "OAuth client registration"

  - id: deployment-runbook
    title: "Production Deployment Runbook"
    not_for:
      - "local development setup"
      - "CI/CD pipeline configuration"
    not_for_strict: true

Every previous matching field in KCP is additive: triggers, intents, scopes -- they say "this unit is relevant when X." not_for says the opposite: "this unit is explicitly not about Y." This is the negative space that search systems almost never model.

By default, a not_for match triggers soft demotion: the unit still appears with a caution annotation. not_for_strict: true changes that to hard exclusion -- the bridge MUST omit the unit entirely.

Subtractive matching and negative scope: the not_for Venn diagram. Soft demotion surfaces the unit with a caution annotation; not_for_strict forces hard exclusion. The spec's first subtractive matching field.

Evaluation order (section 15.11): scoring first, then negative-space filtering, then the top-N cut. A unit can score highly on positive terms and still get excluded by not_for. This means not_for operates as a correction on the scoring model, not a pre-filter that might miss relevant units.

The deterministic order of operations: score → negative-space filter → top-N cut. not_for filtering runs after scoring and before the cut — a unit can rank high on positive terms and still be excluded by its own declared negative scope.

Manifest-level not_for (section 3.10) adds an advisory root-level scope boundary for federation: "this entire manifest does not cover X." A coarse filter for multi-manifest environments where queries fan out across dozens of sources.


How they fit together

v0.16 answers: can I trust what I am about to ingest?

v0.17 answers: is this the right content for what I need?

They compose. The sequence on a federated query: render the manifest (trust pipeline), assign a tier, check kind eligibility, then score and filter units using content structure and negative-space matching (content pipeline). A unit that fails trust never reaches scoring. A unit that passes trust but fails content matching never reaches the context window.

You are not fetching context from a knowledge base. You are fetching signed, tiered, content-described, negatively-scoped units from a pipeline that enforced trust before they arrived and relevance before they were loaded.

The KCP Precision Intake Funnel: v0.16 trust pipeline narrows the field to verified, sanitized manifests; v0.17 content pipeline filters to relevant, correctly-typed units. Only what passes both reaches the context window.

The arc of the series: KCP started as a manifest format (v0.10 query vocabulary, v0.11 cold discovery, v0.12 governance, v0.14 ranked retrieval, v0.14.3 observability RFC). Now v0.16 adds a trust model and v0.17 adds a content model. Each release closes a gap that was visible from the previous one.


What comes next

Three problems are already visible from where we stand.

Federated trust chains. Today, there is no transitive trust below trusted. That is the safe default, but it limits federation in large organisations where trust is naturally delegated. We want explicit delegation: "I trust this key, and I trust manifests that this key's owner has themselves signed." The machinery for this is mostly a policy language problem, not a cryptography problem.

Modality-aware extraction. content_structure currently drives routing -- which units to fetch. The next step is driving parsing strategy: using primary: table to select a table-aware chunker, using density: dense to adjust embedding window sizes, using contains: [diagram] to trigger image extraction. The metadata is there. The extraction pipelines need to consume it.

Richer claim provenance. The epistemic ordering in verification_status -- rumored, declared, observed, verified -- is the foundation for tracking not just whether a claim is trusted, but how it became trusted. Who verified it? When? Against what evidence? The ordering gives us a ladder. The rungs need more detail.

Observability and what comes next: the quarantine events table as a CI-diffable security signal, the epistemic ladder (rumored → declared → observed → verified), and the three open problems — federated trust chains, modality-aware extraction, richer claim provenance.

KCP v0.16 & v0.17: the full precision intake pipeline — Phase 1 closes the trust gap (Trusted Render Pipeline: signing, trust tiers, sanitization), Phase 2 closes the content gap (content structure metadata and subtractive not_for matching), Phase 3 shows the integrated precision retrieval flow.


Try it

The spec, the CLI, and the bridges are all open source under the Cantara organisation on GitHub:

Run kcp render on your own knowledge.yaml and see what tier it gets. Run kcp validate to check your manifest against the current spec. If something breaks, that is useful feedback -- open an issue or a discussion. We would rather hear about the rough edges now than discover them in production.


Co-authored with Claude. The structure and framing are mine; Claude helped draft and sharpen the argument.