Why
“I’m done” is the riskiest signal in multi-agent coding. The test gate turns it from a self-report into a verifiable receipt: your change doesn’t propagate to teammates until the tests that depend on the symbols you touched actually pass.
The Flow
Discover
Call confirm_ready with touched_files (and optionally touched_symbols). The gate walks the symbol graph backwards from your changes to the test files that transitively depend on them and returns affected_tests.
Run
Your agent runs exactly those tests locally. No point running the full suite if only three tests are affected.
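
The backwards walk in the Discover step can be pictured as a breadth-first search over inverted dependency edges. The following is a minimal sketch, not the gate's actual implementation: the function name affected_tests, the file-level graph, and the "tests/" prefix check are all assumptions made for illustration (the real gate works on a symbol graph whose shape is not described here).

```python
from collections import deque


def affected_tests(depends_on, touched, is_test):
    """Return test nodes that transitively depend on any touched node.

    depends_on maps each node to the nodes it imports or references.
    This structure is hypothetical; the real symbol graph may differ.
    """
    # Invert the edges so we can ask "who depends on this node?"
    dependents = {}
    for node, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(node)

    # Breadth-first walk backwards from the touched nodes.
    seen = set(touched)
    queue = deque(touched)
    while queue:
        current = queue.popleft()
        for parent in dependents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)

    return sorted(node for node in seen if is_test(node))


# Hypothetical project: each file lists the files it depends on.
graph = {
    "src/cart.py": ["src/pricing.py"],
    "src/checkout.py": ["src/cart.py"],
    "tests/test_pricing.py": ["src/pricing.py"],
    "tests/test_checkout.py": ["src/checkout.py"],
    "tests/test_auth.py": ["src/auth.py"],
}

result = affected_tests(
    graph, ["src/pricing.py"], lambda node: node.startswith("tests/")
)
print(result)  # ['tests/test_checkout.py', 'tests/test_pricing.py']
```

In this toy graph, touching src/pricing.py selects two of the three test files; tests/test_auth.py is skipped because nothing on its dependency path changed.
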
Submit
Call confirm_ready again with test_results[]. The gate evaluates pass/fail per test, correlates failures against your machine’s flake history (machine_id), and returns a verdict.
Receipt
On pass, the gate writes a sync_log row. Pass its sync_log_id to sync_commit when you push; the auto-sync watcher uses gated receipts to decide which commits to fan out to other machines.
Flake Handling
The gate tracks per-machine pass/fail history per test. Flaky tests (intermittent failures on the same machine and commit) are surfaced as failures[].flake=true. Repeated failures across machines are treated as real and block the gate.
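
One way to read the flake rule is sketched below. It is an illustration under stated assumptions, not the gate's actual logic: the Run record, the classify_failures helper, and the exact thresholds are hypothetical. In particular, the sketch assumes a failure counts as a flake when the same test has also passed on the same machine and commit and has not failed on any other machine, and that a result set containing only flakes does not block, which the text implies but does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    """One historical test outcome (hypothetical record shape)."""
    test: str
    machine_id: str
    commit: str
    passed: bool


def classify_failures(history, results, machine_id, commit):
    """Label each failing test as flake or real and derive a verdict."""
    failures = []
    for test, passed in results.items():
        if passed:
            continue
        # Intermittent: the same test also passed on this machine + commit.
        passed_here_before = any(
            r.passed
            for r in history
            if r.test == test and r.machine_id == machine_id and r.commit == commit
        )
        # Repeated across machines: someone else saw it fail too.
        failed_elsewhere = any(
            not r.passed
            for r in history
            if r.test == test and r.machine_id != machine_id
        )
        flake = passed_here_before and not failed_elsewhere
        failures.append({"test": test, "flake": flake})

    # Assumption: only non-flake failures block the gate.
    verdict = "pass" if all(f["flake"] for f in failures) else "fail"
    return {"verdict": verdict, "failures": failures}


history = [
    Run("test_a", "machine-1", "abc123", passed=True),   # passed here earlier
    Run("test_b", "machine-2", "abc123", passed=False),  # failed on another machine
]

flaky_only = classify_failures(history, {"test_a": False}, "machine-1", "abc123")
mixed = classify_failures(
    history, {"test_a": False, "test_b": False, "test_c": True}, "machine-1", "abc123"
)
print(flaky_only["verdict"], mixed["verdict"])  # pass fail
```
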
Completing a Task Atomically
If the change-set satisfies a delegated task, pass completes_task_id to confirm_ready. On gate pass the task is marked done in the same transaction, which triggers the unblock cascade for any tasks that depended on it.
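
To make "in the same transaction" concrete, here is a small sketch using an in-memory SQLite database. Only the sync_log name comes from the text above; the tasks and task_deps tables, their columns, the status values, and the record_gate_pass helper are hypothetical stand-ins, since the gate's real storage is not described here.

```python
import sqlite3


def record_gate_pass(conn, machine_id, commit_sha, completes_task_id=None):
    """Write the receipt and optionally complete a task, atomically."""
    with conn:  # commits on success, rolls back if anything below raises
        cur = conn.execute(
            "INSERT INTO sync_log (machine_id, commit_sha) VALUES (?, ?)",
            (machine_id, commit_sha),
        )
        sync_log_id = cur.lastrowid
        if completes_task_id is not None:
            updated = conn.execute(
                "UPDATE tasks SET status = 'done' WHERE id = ?",
                (completes_task_id,),
            )
            if updated.rowcount == 0:
                raise ValueError(f"unknown task: {completes_task_id}")
            # Unblock cascade: a dependent becomes ready once all of its
            # dependencies are done.
            dependents = conn.execute(
                "SELECT task_id FROM task_deps WHERE depends_on = ?",
                (completes_task_id,),
            ).fetchall()
            for (task_id,) in dependents:
                (open_deps,) = conn.execute(
                    "SELECT COUNT(*) FROM task_deps d "
                    "JOIN tasks t ON t.id = d.depends_on "
                    "WHERE d.task_id = ? AND t.status != 'done'",
                    (task_id,),
                ).fetchone()
                if open_deps == 0:
                    conn.execute(
                        "UPDATE tasks SET status = 'ready' "
                        "WHERE id = ? AND status = 'blocked'",
                        (task_id,),
                    )
    return sync_log_id


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE sync_log (id INTEGER PRIMARY KEY, machine_id TEXT, commit_sha TEXT);
    CREATE TABLE tasks (id TEXT PRIMARY KEY, status TEXT);
    CREATE TABLE task_deps (task_id TEXT, depends_on TEXT);
    INSERT INTO tasks VALUES ('t1', 'in_progress'), ('t2', 'blocked'), ('t3', 'blocked');
    INSERT INTO task_deps VALUES ('t2', 't1'), ('t3', 't1'), ('t3', 't2');
    """
)

receipt_id = record_gate_pass(conn, "machine-a", "abc123", completes_task_id="t1")
statuses = dict(conn.execute("SELECT id, status FROM tasks"))
print(receipt_id, statuses)
```

Here t2 becomes ready because its only dependency is done, while t3 stays blocked until t2 also completes. If the task update fails, the receipt row is rolled back with it, so there is never a receipt without the matching task state. The returned id plays the role of the sync_log_id that would be handed to sync_commit.
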
When the Gate Isn’t Enough
The gate checks “the tests I decided to run passed on my machine”. It does not replace CI. Keep CI for:
- Full-matrix runs (OS × node version × env)
- Integration tests against real external services
- Static analysis (linting, typechecks) that gates merge, not handoff
main broken.