Why

“I’m done” is the riskiest signal in multi-agent coding. The test gate turns it from a self-report into a verifiable receipt: your change doesn’t propagate to teammates until the tests that depend on the symbols you touched actually pass.

The Flow

1. Discover

Call confirm_ready with touched_files (and optionally touched_symbols). The gate walks the symbol graph backwards from your changes to the test files that transitively depend on them and returns affected_tests.

2. Run

Your agent runs exactly those tests locally. No point running the full suite if only three tests are affected.

3. Submit

Call confirm_ready again with test_results[]. The gate evaluates pass/fail per test, correlates failures against your machine’s flake history (machine_id), and returns a verdict.

4. Receipt

On pass, the gate writes a sync_log row. Pass its sync_log_id to sync_commit when you push; the auto-sync watcher uses gated receipts to decide which commits to fan out to other machines.

Flake Handling

The gate tracks per-machine pass/fail history per test. Flaky tests (intermittent failures on the same machine + commit) are surfaced as failures[].flake=true. Repeated failures across machines are treated as real and block the gate.
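
One way to read the rule above is as a two-question check on the recorded history. The sketch below is an assumption, not the gate's documented logic: the Run record, the classify_failure helper, and the exact thresholds (a single pass on the same machine and commit is enough to suspect a flake, a single failure on another machine at that commit is enough to call it real) are all hypothetical.

```python
from typing import NamedTuple


class Run(NamedTuple):
    """One recorded test execution (hypothetical history row)."""
    machine_id: str
    commit: str
    test: str
    passed: bool


def classify_failure(history, machine_id, commit, test):
    """Classify a failure just observed for `test` on `machine_id` at `commit`."""
    runs = [r for r in history if r.test == test and r.commit == commit]
    # Intermittent: the same machine has also passed this test at this commit.
    passed_here = any(r.passed and r.machine_id == machine_id for r in runs)
    # Reproduced: another machine has failed this test at this commit.
    failed_elsewhere = any(
        not r.passed and r.machine_id != machine_id for r in runs
    )
    return {"test": test, "flake": passed_here and not failed_elsewhere}


history = [
    Run("m1", "abc", "tests/test_cart.py", True),
    Run("m1", "abc", "tests/test_pricing.py", False),
    Run("m2", "abc", "tests/test_pricing.py", False),
]

cart = classify_failure(history, "m1", "abc", "tests/test_cart.py")
pricing = classify_failure(history, "m1", "abc", "tests/test_pricing.py")
print(cart)
print(pricing)
```

Here the cart test has already passed on m1 at the same commit, so its failure is marked flake=true, while the pricing test has also failed on m2 and is treated as a real failure.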

Completing a Task Atomically

If the change-set satisfies a delegated task, pass completes_task_id to confirm_ready. On gate pass the task is marked done in the same transaction, which triggers the unblock cascade for any tasks that depended on it.
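
The value of "same transaction" is that the receipt and the task update either both land or neither does. A minimal sketch with SQLite follows, assuming a hypothetical schema (tasks, task_deps, the status values, and the record_pass helper are all illustrative; only sync_log and completes_task_id come from the text above).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sync_log (id INTEGER PRIMARY KEY, note TEXT);
    CREATE TABLE tasks (id TEXT PRIMARY KEY, status TEXT NOT NULL);
    CREATE TABLE task_deps (task_id TEXT, depends_on TEXT);
    INSERT INTO tasks VALUES ('t1', 'in_progress'), ('t2', 'blocked'), ('t3', 'blocked');
    INSERT INTO task_deps VALUES ('t2', 't1'), ('t3', 't1'), ('t3', 't2');
""")


def record_pass(conn, note, completes_task_id=None):
    """Write the receipt and, optionally, complete a task in one transaction."""
    unblocked = []
    with conn:  # commits on success, rolls back if anything below raises
        cur = conn.execute("INSERT INTO sync_log (note) VALUES (?)", (note,))
        sync_log_id = cur.lastrowid
        if completes_task_id is not None:
            updated = conn.execute(
                "UPDATE tasks SET status = 'done' WHERE id = ?",
                (completes_task_id,),
            )
            if updated.rowcount != 1:
                raise ValueError(f"unknown task: {completes_task_id}")
            # Cascade: a blocked task is released once all its dependencies are done.
            rows = conn.execute("""
                SELECT t.id FROM tasks t
                WHERE t.status = 'blocked'
                  AND NOT EXISTS (
                      SELECT 1 FROM task_deps d
                      JOIN tasks dep ON dep.id = d.depends_on
                      WHERE d.task_id = t.id AND dep.status != 'done')
                ORDER BY t.id
            """).fetchall()
            unblocked = [row[0] for row in rows]
            conn.executemany(
                "UPDATE tasks SET status = 'ready' WHERE id = ?",
                [(task_id,) for task_id in unblocked],
            )
    return sync_log_id, unblocked


sync_log_id, unblocked = record_pass(conn, "gate pass", completes_task_id="t1")

# A bad task id aborts the whole transaction, so no orphan receipt is left behind.
try:
    record_pass(conn, "gate pass", completes_task_id="missing")
except ValueError:
    pass
log_rows = conn.execute("SELECT COUNT(*) FROM sync_log").fetchone()[0]
print(sync_log_id, unblocked, log_rows)
```

In this sketch, completing t1 releases t2 but not t3, which still waits on t2, and the failed second call leaves sync_log with a single row.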

When the Gate Isn’t Enough

The gate checks “the tests I decided to run passed on my machine”. It does not replace CI. Keep CI for:
  • Full-matrix runs (OS × node version × env)
  • Integration tests against real external services
  • Static analysis (linting, typechecks) that gates merge, not handoff
The gate’s job is to prevent broken handoffs inside a workstream. CI’s job is to prevent broken merges to main.