Benchmarks

February 26, 2026
in AI-Augmented Development, Knowledge Infrastructure, Benchmarks
9 min read

The Date the AI Invented

The agent answered the ROI metrics question with zero tool calls. It reported the indexing speed, the search latency, the file count, the retrieval time improvement, the test count. All correct. Every number accurate.

Then it said the metrics were validated on February 19, 2026.

The actual date was February 17.

February 26, 2026
in AI-Augmented Development, Synthesis, Benchmarks
10 min read

We Gave the AI Better Documentation. It Got Slower.

We had 15 skill files documenting every Synthesis CLI command — syntax, options, example invocations, expected output. We wrote them carefully. We loaded them into the agent's context. We assumed the agent would use them.

Then we ran a benchmark.

The CLI condition was the worst-performing integration in the entire test. Worse than no integration at all.