I’ve been experimenting with a runtime pattern for coding agents I call Zero-Context Architecture (ZCA).
Instead of accumulating a long conversation history while exploring a repo, the agent repeatedly re-observes the environment and sends only a minimal slice of relevant files to the model.
The loop becomes:
run tests → project slice → edit → run tests → re-project
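The loop above can be sketched as a small driver. Everything here is hypothetical (the names `run_tests`, `project_slice`, and `apply_edit`, and the toy "BUG marker" failure model are mine, not from the benchmark); the point is only the shape: re-observe each turn, project a minimal slice, edit, repeat.

```python
# Toy sketch of the ZCA loop. A repo is a dict of path -> file text.
# All function names and the BUG-marker failure model are illustrative
# assumptions, not the actual implementation.

def run_tests(repo):
    """Stand-in test runner: a file 'fails' while it contains BUG."""
    return [path for path, text in repo.items() if "BUG" in text]

def project_slice(repo, failures):
    """Projection layer: keep only files implicated by the failures."""
    return {path: repo[path] for path in failures}

def apply_edit(repo, slice_):
    """Stand-in for the model's edit step: patch the buggy marker."""
    for path, text in slice_.items():
        repo[path] = text.replace("BUG", "FIXED")

def zca_loop(repo, max_iters=5):
    """run tests -> project slice -> edit -> run tests -> re-project."""
    for _ in range(max_iters):
        failures = run_tests(repo)            # re-observe the environment
        if not failures:
            return True                        # tests green, done
        slice_ = project_slice(repo, failures) # minimal context, no history
        apply_edit(repo, slice_)               # edit, then re-project next turn
    return False
```

Note that no conversation history accumulates: each iteration's model call would see only the fresh slice, which is what keeps token usage flat as exploration deepens.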
I built a small benchmark to compare this against a traditional long-context exploration agent.
On the current tasks, a ZCA agent using Haiku matches a baseline Opus agent while using ~27× fewer tokens.
The interesting part is the failure boundary: performance depends heavily on the projection layer rather than the model itself.