
Antigravity Test Orchestration

name: antigravity-testing

description: Orchestrate autonomous test execution in Google Antigravity using agent dispatch, browser sub-agents, artifact review, and AGENTS.md configuration. Use when dispatching test agents in Antigravity, configuring test session workflows, reviewing Walkthrough artifacts, triaging test results, or integrating Antigravity test output into CI/CD pipelines.

Instructions

Use Google Antigravity as the autonomous test execution layer in the ITI three-lane toolchain (Cursor for development, Claude Code for context management, Antigravity for test/debug). Antigravity agents plan, execute, and verify test suites with browser automation, screenshot capture, and structured Walkthrough artifacts.

Prerequisites:

  • Antigravity installed (v1.18.0+ required for browser sub-agent; ITI pins v1.107.0)
  • Gemini 3 Pro (or 3.1 Pro) selected as the primary model — avoid flash-tier models for test orchestration
  • .agents/ directory configured at project root with rules, skills, and workflows
  • Local dev environment running (WordPress dev server, Tauri dev build, Python venv, or Docker stack)
  • Artifact Review Policy set to “Request Review” (never “Always Proceed”)
  • Terminal Command Auto Execution set to “Request Review”

Agent dispatch for test execution:

Open the target project as a dedicated Antigravity workspace (never multi-root across clients). In the Agent Manager, click +Task and set Planning Mode to Plan for all non-trivial test sessions.

Standard test dispatch prompt:


/iti-delivery-framework

Run the existing test suite for [plugin/feature name].
Identify all failures, trace root causes to specific functions or files.

Generate a Walkthrough artifact that includes:
1. Test results summary (pass/fail counts)
2. Root cause analysis for each failure
3. Proposed fixes as diffs — do NOT apply them yet
4. Any [CONTEXT-UPDATE] flags for findings that should update CLAUDE.md

Do not modify source files.

Planning mode vs Fast mode:

| Mode | When to use | Trade-off |
|------|-------------|-----------|
| Plan | Test suites, multi-step QA, regression testing | Agent produces a reviewable plan before executing; slower but safer |
| Fast | Trivial checks (single file lint, quick lookup) | Immediate execution; no plan artifact to review |

Always use Plan mode when the test session could modify files, run terminal commands, or navigate a browser.

Workspace isolation rules:

  • Each client project = a separate Antigravity workspace
  • Never open a multi-root workspace spanning two clients
  • Name workspaces explicitly: [Client] / [Project]
  • Before each session: confirm the workspace name in Agent Manager matches the project
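
The naming rule above can be checked mechanically before a session starts. The following is a minimal sketch, assuming the workspace name is available as a plain string; the function names and the regular expression are hypothetical illustrations, not part of Antigravity.

```python
import re

# Hypothetical pattern for the "[Client] / [Project]" convention:
# two non-empty parts separated by " / ", with no other slashes.
WORKSPACE_NAME = re.compile(r"^(?P<client>[^/]+?) / (?P<project>[^/]+)$")


def parse_workspace_name(name: str) -> tuple[str, str]:
    """Split a workspace name into (client, project) or raise ValueError."""
    match = WORKSPACE_NAME.match(name.strip())
    if match is None:
        raise ValueError(f"workspace name {name!r} is not '[Client] / [Project]'")
    return match["client"].strip(), match["project"].strip()


def workspace_matches(name: str, client: str, project: str) -> bool:
    """Return True only if the name parses and matches the expected project."""
    try:
        return parse_workspace_name(name) == (client, project)
    except ValueError:
        return False
```

For example, `workspace_matches("Acme / Storefront Plugin", "Acme", "Storefront Plugin")` returns `True`, while a name without the ` / ` separator is rejected.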

AGENTS.md configuration for test sessions:

The .agents/ directory at project root provides persistent context:


.agents/
├── rules/
│   ├── iti-context-system.md    # Always-on: project context, protected files, available skills
│   └── test-session-rules.md    # Manual: activated for diagnostic sessions
├── skills/
│   ├── iti-context.md           # /iti-context — master ITI operating context
│   ├── iti-delivery-framework.md # /iti-delivery-framework — delivery phase reference
│   ├── iti-claude-context.md    # /iti-claude-context — CLAUDE.md system context
│   └── iti-audit.md             # /iti-audit — codebase accuracy audit
└── workflows/
    ├── test-session.md          # /test-session — run test suite diagnostics
    ├── browser-test.md          # /browser-test — browser-based UI testing
    └── prompt-library.md        # /prompt-library — quick reference for all workflows

Global rules live at ~/.gemini/GEMINI.md. GEMINI.md takes priority over AGENTS.md when both exist.

Artifact review protocol:

After an agent completes a test session, it produces a Walkthrough artifact containing:

  1. Test results summary (pass/fail/skip counts)
  2. Root cause analysis for each failure with severity classification
  3. [PROPOSED-FIX] blocks with diffs — review before accepting
  4. [CONTEXT-UPDATE] flags for findings that should update CLAUDE.md
  5. Browser session recordings and screenshots (for browser-based tests)

Review checklist for each artifact:

  • [ ] Change is limited to the file/function the agent was asked about
  • [ ] No changes to CLAUDE.md, .cursorrules, or context markdown files
  • [ ] No new dependencies added without explicit approval
  • [ ] No hardcoded credentials or environment-specific values
  • [ ] WordPress security conventions preserved (nonces, prepare(), output escaping)
  • [ ] For Tauri/Rust: no unwrap() on user-facing paths, no hardcoded file paths
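
Part of this checklist can be pre-screened before a human reads the diff. The sketch below assumes a [PROPOSED-FIX] block contains a standard unified diff; the pattern list and function names are hypothetical, and the check covers only the protected-files item, not the security or dependency items.

```python
from fnmatch import fnmatchcase

# Hypothetical patterns for files an agent must not change.
PROTECTED_PATTERNS = (
    "CLAUDE.md",
    "*/CLAUDE.md",
    ".cursorrules",
    ".cursor/rules/*.mdc",
)


def changed_files(diff_text: str) -> list[str]:
    """Collect file paths from the '---' / '+++' headers of a unified diff.

    Caveat: a removed or added content line that itself begins with
    '-- ' or '++ ' would also be picked up; this is only a pre-screen.
    """
    files: list[str] = []
    for line in diff_text.splitlines():
        if line.startswith(("--- ", "+++ ")):
            path = line[4:].split("\t")[0].strip()
            if path == "/dev/null":
                continue
            if path.startswith(("a/", "b/")):
                path = path[2:]
            if path not in files:
                files.append(path)
    return files


def protected_files_touched(diff_text: str) -> list[str]:
    """Return the changed paths that match a protected pattern."""
    return [
        path
        for path in changed_files(diff_text)
        if any(fnmatchcase(path, pattern) for pattern in PROTECTED_PATTERNS)
    ]
```

A non-empty result from `protected_files_touched` is a reason to reject the proposed fix outright; an empty result still leaves the rest of the checklist to manual review.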

Test result triage:

| Status | Action |
|--------|--------|
| Pass | No action; verify count matches expectations |
| Fail (known issue) | Confirm root cause matches known pattern; document if new variant |
| Fail (new issue) | Create [PROPOSED-FIX] diff; classify severity; flag [CONTEXT-UPDATE] if architectural |
| Skip | Verify skip condition is intentional (missing fixture, env constraint) |
| Timeout | For heavy multi-agent workflows (runs over 15s), a timeout indicates the agent is still processing, not that something is broken |
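
The triage table can be expressed as a lookup so that every non-passing result carries its follow-up action. This is a minimal sketch: the `Status` labels and the `(test name, status)` input shape are assumptions for illustration, since the Walkthrough artifact format is not specified here.

```python
from collections import defaultdict
from enum import Enum


class Status(Enum):
    # Hypothetical labels mirroring the rows of the triage table.
    PASS = "pass"
    FAIL_KNOWN = "fail-known"
    FAIL_NEW = "fail-new"
    SKIP = "skip"
    TIMEOUT = "timeout"


ACTIONS = {
    Status.PASS: "No action; verify count matches expectations",
    Status.FAIL_KNOWN: "Confirm root cause matches known pattern",
    Status.FAIL_NEW: "Create [PROPOSED-FIX] diff and classify severity",
    Status.SKIP: "Verify skip condition is intentional",
    Status.TIMEOUT: "Check whether the workflow is still processing",
}


def triage(results: list[tuple[str, Status]]) -> dict[Status, list[str]]:
    """Group test names by status, preserving input order."""
    grouped: dict[Status, list[str]] = defaultdict(list)
    for name, status in results:
        grouped[status].append(name)
    return dict(grouped)


def action_items(results: list[tuple[str, Status]]) -> list[tuple[str, str]]:
    """Return (test name, action) pairs for everything except plain passes."""
    return [
        (name, ACTIONS[status])
        for name, status in results
        if status is not Status.PASS
    ]
```

Passes are left out of `action_items` on purpose: the only follow-up for them is comparing the pass count against expectations, which `len(triage(results).get(Status.PASS, []))` provides.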

Parallel agent dispatch:

The Agent Manager supports multiple concurrent agents across workspaces. Use this for:

  • Running unit tests in one agent while browser QA runs in another
  • Testing multiple product endpoints simultaneously
  • Running regression suites on different feature branches

Each agent operates in its own context — ensure workspace isolation rules are followed.

CI/CD integration patterns:

Antigravity test sessions can feed into CI pipelines:

  1. Agent generates test results as structured Walkthrough artifacts
  2. Export screenshots and recordings from the Artifacts panel
  3. Commit test evidence alongside code changes
  4. Reference artifact IDs in PR descriptions for reviewability

For GitHub Actions integration, configure the browser sub-agent with headless mode and screenshot-on-failure: true in .agents/rules/.

Knowledge sync after test sessions:

Every Antigravity test session ends with knowledge sync (non-negotiable):

  1. Scan all Walkthrough artifacts for [CONTEXT-UPDATE] flags
  2. Classify each flag by tier: GLOBAL, PROJECT, PRODUCT, CLIENT, or MANUAL
  3. Switch to Claude Code to apply approved updates to the appropriate CLAUDE.md
  4. Commit knowledge updates separately from code fixes:

docs: sync knowledge files from Antigravity session YYYY-MM-DD

See operations/documentation/antigravity-runbook.md for the complete step-by-step protocol.
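
Steps 1, 2, and 4 above can be partly scripted. The sketch below assumes each flag appears on one line as `[CONTEXT-UPDATE] TIER: note`; that line format is an assumption for illustration, as is every function name. Flags without a recognizable tier are collected separately for manual classification.

```python
import re
from datetime import date

TIERS = ("GLOBAL", "PROJECT", "PRODUCT", "CLIENT", "MANUAL")

# Assumed flag format: "[CONTEXT-UPDATE] TIER: note" (tier prefix optional).
FLAG = re.compile(
    r"\[CONTEXT-UPDATE\]\s*(?:(GLOBAL|PROJECT|PRODUCT|CLIENT|MANUAL)\s*:)?\s*(.+)"
)


def collect_flags(walkthrough_text: str) -> dict[str, list[str]]:
    """Group [CONTEXT-UPDATE] notes by tier; untiered notes go to UNCLASSIFIED."""
    flags: dict[str, list[str]] = {tier: [] for tier in TIERS}
    flags["UNCLASSIFIED"] = []
    for line in walkthrough_text.splitlines():
        match = FLAG.search(line)
        if match is None:
            continue
        tier = match.group(1) or "UNCLASSIFIED"
        flags[tier].append(match.group(2).strip())
    return flags


def sync_commit_message(session_date: date) -> str:
    """Build the knowledge-sync commit subject for a given session date."""
    return (
        "docs: sync knowledge files from Antigravity session "
        f"{session_date.isoformat()}"
    )
```

The script only gathers and labels flags. Approving them and applying them to CLAUDE.md remains a human step carried out in Claude Code, as described in step 3.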

Hard boundaries — Antigravity must never:

  • Modify CLAUDE.md at any tier
  • Modify .cursorrules or .cursor/rules/*.mdc
  • Run git push, git commit, or git merge
  • Run database migrations without explicit human approval
  • Access external URLs outside the local dev domain during browser testing
  • Install packages without human review
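
Two of these boundaries lend themselves to an automated pre-check when reviewing commands or URLs an agent proposes. The following is a rough sketch under stated assumptions: the function names and the allowed-host set are hypothetical, and the git check is deliberately conservative, so it may flag harmless commands such as `git log --grep merge`.

```python
import shlex
from urllib.parse import urlparse

FORBIDDEN_GIT_SUBCOMMANDS = {"push", "commit", "merge"}


def violates_git_boundary(command: str) -> bool:
    """Return True if a forbidden git subcommand appears after a 'git' token.

    Conservative by design: any 'push', 'commit', or 'merge' token that
    follows 'git' is flagged, even when it is only an argument.
    """
    try:
        tokens = shlex.split(command)
    except ValueError:
        # Unbalanced quotes: cannot be parsed safely, so flag for review.
        return True
    seen_git = False
    for token in tokens:
        if token == "git":
            seen_git = True
        elif seen_git and token in FORBIDDEN_GIT_SUBCOMMANDS:
            return True
    return False


def is_local_url(url: str, allowed_hosts: set[str]) -> bool:
    """Return True if the URL's host is in the allowed local dev hosts."""
    host = urlparse(url).hostname
    return host is not None and host in allowed_hosts
```

For example, `violates_git_boundary("npm test && git commit -m 'wip'")` returns `True`, and `is_local_url("https://example.com/", {"localhost"})` returns `False`. A check like this complements the "Request Review" settings rather than replacing them.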

Cross-references:

  • operations/documentation/antigravity-runbook.md — complete operational runbook
  • operations/documentation/claude-code-workflow.md — three-lane toolchain model
  • test-plan-writing skill — test plan structure and coverage requirements
  • session-context-protocol skill — session opener/closer lifecycle
  • antigravity-debugging skill — parallel debug dispatch patterns
  • antigravity-browser-qa skill — browser sub-agent QA workflows

Outputs: Walkthrough artifacts with test results, root cause analyses, proposed fix diffs, [CONTEXT-UPDATE] flags, browser recordings, screenshots, and test evidence for CI pipelines.
