

name: test-plan-writing

description: Write comprehensive test plans covering functional, non-functional, integration, regression, and exploratory testing. Use when planning testing for a new feature, sprint, or release, defining test coverage for an API or component, writing acceptance tests for user stories, or reviewing whether testing is complete before sign-off.

Test Plan Writing

Instructions

Write test plans that give engineers and QA a complete, executable testing program — no guesswork, no gaps.

Test plan document structure:


# [Feature/Sprint/Release] Test Plan
**Version**: [x.y.z]
**Date**: YYYY-MM-DD
**Scope**: [what is and is not being tested]

## Objectives
[What quality attributes are being verified: correctness, security, performance, reliability]

## Test Types
- Functional: [which user stories / requirements]
- Integration: [which component interactions]
- Regression: [which previously passing tests must still pass]
- Performance: [which operations need timing validation]
- Security: [which attack vectors are being tested]
- Exploratory: [which areas are being probed without a script]

## Test Cases
[See format below]

## Exit Criteria
[What constitutes a complete, passing test run]

## Environment Requirements
[OS, browser, app version, database state, API credentials needed]

## Dependencies
[External services, test data, credentials, third-party sandboxes]

Test case format:


### TC-[product]-[number]: [Test case title]
**Type**: Functional | Integration | Regression | Performance | Security
**Related**: [Story ID or Requirement ID]
**Preconditions**: [State of system before test]
**Steps**:
1. [Action]
2. [Action]
**Expected result**: [Exact expected outcome]
**Pass criteria**: [How to determine pass vs fail]

Minimum test coverage per user story:

  • Happy path (primary success scenario)
  • Primary error state (most likely failure)
  • Boundary condition (empty input, maximum value, edge case)
  • If the story involves AI: malformed AI response handling

Minimum test coverage per API endpoint:

  • 200 (success with expected payload)
  • 400 (invalid input — missing required field, wrong type)
  • 401/403 (unauthenticated / unauthorized)
  • 404 (resource not found)
  • 500 (server error handling — verify error message doesn’t expose internals)
  • Timeout (network latency simulation)
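
A minimal pytest sketch of this coverage matrix, assuming a hypothetical /items resource behind a bearer-token API on localhost:8000 — the paths, port, token, and fault-injection route are placeholders for illustration, not part of any specific service:

```python
# test_items_api.py -- illustrative coverage matrix for a single endpoint.
# Base URL, /items paths, and auth header are assumptions; adapt to your API.
import httpx
import pytest

BASE_URL = "http://localhost:8000"  # assumed local test deployment
AUTH = {"Authorization": "Bearer test-token"}  # placeholder credential


@pytest.fixture
def client():
    with httpx.Client(base_url=BASE_URL, timeout=5.0) as c:
        yield c


def test_get_item_success(client):
    r = client.get("/items/1", headers=AUTH)
    assert r.status_code == 200
    assert "id" in r.json()


def test_create_item_invalid_input(client):
    # Missing required field should be rejected, not accepted silently.
    r = client.post("/items", json={}, headers=AUTH)
    assert r.status_code == 400


def test_get_item_unauthenticated(client):
    r = client.get("/items/1")  # no Authorization header
    assert r.status_code in (401, 403)


def test_get_item_not_found(client):
    r = client.get("/items/999999", headers=AUTH)
    assert r.status_code == 404


def test_server_error_hides_internals(client):
    # Route assumed to be rigged (e.g. via a fault-injection flag) to fail.
    r = client.get("/items/trigger-error", headers=AUTH)
    assert r.status_code == 500
    assert "Traceback" not in r.text


def test_timeout_is_surfaced():
    # Crude latency simulation: a very small client-side timeout.
    with httpx.Client(base_url=BASE_URL, timeout=0.001) as c:
        with pytest.raises(httpx.TimeoutException):
            c.get("/items/1")
```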

AI-specific test scenarios (for Claude API integrations):

  • Valid request → verify response structure and content quality
  • Rate limit (429) → verify retry behavior and user-facing message
  • Token limit exceeded (400) → verify graceful degradation
  • Network timeout → verify fallback behavior
  • Malformed JSON response → verify error handling, no crash
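
A sketch of the 429 and malformed-response scenarios, assuming a hypothetical application wrapper `ask_claude()` in `app.claude_client` that retries and returns a structured result — the module, functions, and `RateLimitExceeded` exception are placeholder names, not part of any SDK:

```python
# test_claude_integration.py -- rate-limit and malformed-response sketches.
# ask_claude, RateLimitExceeded, and app.claude_client are hypothetical
# application names; substitute your own wrapper around the Claude API.
import pytest

from app.claude_client import ask_claude, RateLimitExceeded


def test_rate_limit_retries_then_surfaces_error(monkeypatch):
    calls = {"n": 0}

    def always_429(*args, **kwargs):
        calls["n"] += 1
        raise RateLimitExceeded("429 Too Many Requests")

    # Force every underlying API call to fail with a rate-limit error.
    monkeypatch.setattr("app.claude_client._send_request", always_429)

    with pytest.raises(RateLimitExceeded):
        ask_claude("Summarise this document")

    # Wrapper is expected to retry (e.g. 3 attempts) before giving up.
    assert calls["n"] > 1


def test_malformed_json_response_does_not_crash(monkeypatch):
    monkeypatch.setattr(
        "app.claude_client._send_request",
        lambda *a, **kw: "{not valid json",  # truncated/malformed model output
    )

    result = ask_claude("Return a JSON summary")

    # Expected behaviour: a structured error result, never an unhandled exception.
    assert result["ok"] is False
    assert "parse" in result["error"].lower()
```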

Performance test thresholds (ITI standard):

  • Page load: < 3 seconds
  • API response: < 5 seconds (95th percentile)
  • Claude API call: < 10 seconds (with streaming progress indicator)
  • Database query: < 500ms for simple CRUD, < 2s for complex joins
  • n8n webhook response: < 2 seconds (non-AI workflows), < 15 seconds (AI Agent workflows)
  • Redis operation: < 10ms
  • Dify KB retrieval: < 3 seconds
  • Docker container health check: < 5 seconds
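
These thresholds translate directly into assertions. A sketch for the API and Redis budgets, assuming a locally reachable stack (the health endpoint URL, Redis host/port, and probe key are assumptions):

```python
# test_performance_budgets.py -- spot checks against the thresholds above.
# Endpoint URL and Redis connection details are assumptions for illustration.
import time

import httpx
import pytest
import redis

API_URL = "http://localhost:8000/api/health"  # assumed endpoint


@pytest.mark.slow
def test_api_response_under_5s():
    start = time.perf_counter()
    r = httpx.get(API_URL, timeout=10.0)
    elapsed = time.perf_counter() - start
    assert r.status_code == 200
    assert elapsed < 5.0, f"API responded in {elapsed:.2f}s, budget is 5s"


def test_redis_roundtrip_under_10ms():
    client = redis.Redis(host="localhost", port=6379)
    client.set("perf-probe", "1")  # warm the connection before timing
    start = time.perf_counter()
    client.get("perf-probe")
    elapsed = time.perf_counter() - start
    assert elapsed < 0.010, f"Redis GET took {elapsed * 1000:.1f}ms, budget is 10ms"
```

A single-request check like this is only a smoke test of the budget; the 95th-percentile API target needs a load tool (k6, Locust, or similar) run over many requests.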

Infrastructure/container testing patterns (Docker Compose stack):

When writing test plans for containerized services, include these test types:

  • Container health: verify each container responds to its health check (PostgreSQL pg_isready, Redis PING, n8n /healthz, Dify API /console/api/setup)
  • Service connectivity: verify cross-container DNS resolution and port accessibility
  • Database readiness: verify expected databases exist (n8n, dify, dify_plugin), pgvector extension loaded
  • Nginx proxy routing: verify routes resolve correctly (Dify API on :3001, n8n on :5678, Dify Web on :3000)
  • Volume persistence: verify data survives container restart
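
A smoke-test sketch for the health-check, database-readiness, and connectivity items, assuming the container names, ports, and credentials below match the compose file (they are assumptions):

```python
# test_stack_smoke.py -- container health checks for the Docker Compose stack.
# Container names, ports, and credentials are assumptions; match your compose file.
import subprocess

import httpx
import pytest
import redis

pytestmark = pytest.mark.smoke


def test_postgres_ready():
    # pg_isready exits 0 when the server is accepting connections.
    result = subprocess.run(
        ["docker", "compose", "exec", "-T", "postgres", "pg_isready", "-U", "postgres"],
        capture_output=True,
    )
    assert result.returncode == 0, result.stderr.decode()


def test_redis_ping():
    assert redis.Redis(host="localhost", port=6379).ping() is True


def test_n8n_healthz():
    r = httpx.get("http://localhost:5678/healthz", timeout=5.0)
    assert r.status_code == 200


def test_dify_api_setup_endpoint():
    r = httpx.get("http://localhost:3001/console/api/setup", timeout=5.0)
    assert r.status_code == 200


def test_expected_databases_exist():
    result = subprocess.run(
        ["docker", "compose", "exec", "-T", "postgres",
         "psql", "-U", "postgres", "-tAc", "SELECT datname FROM pg_database"],
        capture_output=True, text=True,
    )
    databases = result.stdout.split()
    for name in ("n8n", "dify", "dify_plugin"):
        assert name in databases, f"expected database {name} not found"
```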

Test markers for infrastructure tests:

  • @pytest.mark.smoke — quick health checks, containers running, ports responding (<30s total)
  • @pytest.mark.integration — multi-service integration, requires full Docker stack
  • @pytest.mark.slow — long-running tests (AI Agent workflows, Dify retrieval)
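
The markers must be registered so pytest does not warn about unknown marks. One way to do this in Python (rather than pytest.ini) is a conftest.py hook — a sketch:

```python
# conftest.py -- register the custom markers so `pytest -m smoke` runs cleanly.
def pytest_configure(config):
    config.addinivalue_line(
        "markers", "smoke: quick health checks, containers and ports (<30s total)")
    config.addinivalue_line(
        "markers", "integration: multi-service tests requiring the full Docker stack")
    config.addinivalue_line(
        "markers", "slow: long-running tests (AI Agent workflows, Dify retrieval)")
```

Suites can then be selected per stage, for example `pytest -m smoke` as a pre-deploy gate and `pytest -m "integration and not slow"` in CI.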

n8n workflow webhook testing patterns:

  • Reachability: GET/POST to /webhook/{path} returns 200 with valid JSON
  • AI Agent workflows: verify chatInput is accepted and response contains output field
  • Router workflows: verify each Switch branch is reachable with an appropriate input discriminator
  • KB-augmented workflows: verify Dify retrieval tool returns context in agent output
  • Empty/invalid payload: verify the workflow handles it gracefully without returning a 5xx
  • Session memory: verify session-based workflows maintain context across turns using $execution.id
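
A sketch of the reachability, AI Agent, and invalid-payload checks, assuming n8n on :5678 and a webhook path of chat-agent (both assumptions about the deployed workflows):

```python
# test_n8n_webhooks.py -- webhook-level checks against a running n8n instance.
# The webhook path "chat-agent" and port 5678 are assumptions for illustration.
import httpx
import pytest

N8N_BASE = "http://localhost:5678"

pytestmark = pytest.mark.integration


def test_webhook_reachable_returns_json():
    r = httpx.post(f"{N8N_BASE}/webhook/chat-agent",
                   json={"chatInput": "ping"}, timeout=15.0)
    assert r.status_code == 200
    r.json()  # raises if the body is not valid JSON


@pytest.mark.slow
def test_ai_agent_returns_output_field():
    r = httpx.post(
        f"{N8N_BASE}/webhook/chat-agent",
        json={"chatInput": "What are the KB retrieval thresholds?"},
        timeout=30.0,
    )
    assert r.status_code == 200
    assert "output" in r.json()


def test_empty_payload_handled_gracefully():
    r = httpx.post(f"{N8N_BASE}/webhook/chat-agent", json={}, timeout=15.0)
    # Workflow may return 200 with an error message or a 4xx, but never a 5xx.
    assert r.status_code < 500
```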

Dify KB retrieval quality testing:

  • Relevant query: at least one result returned with score > 0
  • Irrelevant query: empty results or results with score < threshold
  • Invalid dataset ID: 404 response
  • Missing authentication: 401 response
  • Chunking quality: retrieved segments are coherent and contextually complete
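
A sketch of the retrieval-quality and error cases. The endpoint path, port, and response field names below are assumptions based on Dify's dataset retrieval API and should be verified against the deployed version; the dataset ID, API key, and sample query are placeholders:

```python
# test_dify_retrieval.py -- KB retrieval quality checks against a Dify dataset.
# Endpoint path, port, and response field names are assumptions; verify them
# against your Dify deployment. Dataset ID, API key, and query are placeholders.
import httpx
import pytest

DIFY_API = "http://localhost:3001"
DATASET_ID = "REPLACE_WITH_DATASET_ID"
HEADERS = {"Authorization": "Bearer REPLACE_WITH_DATASET_API_KEY"}

pytestmark = pytest.mark.slow


def retrieve(query, dataset_id=DATASET_ID, headers=HEADERS):
    return httpx.post(
        f"{DIFY_API}/v1/datasets/{dataset_id}/retrieve",
        json={"query": query},
        headers=headers,
        timeout=10.0,
    )


def test_relevant_query_returns_scored_result():
    r = retrieve("onboarding checklist")  # query assumed to match KB content
    assert r.status_code == 200
    records = r.json().get("records", [])
    assert len(records) >= 1
    assert records[0]["score"] > 0


def test_invalid_dataset_returns_404():
    assert retrieve("anything", dataset_id="does-not-exist").status_code == 404


def test_missing_auth_returns_401():
    assert retrieve("anything", headers={}).status_code == 401
```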

FastAPI async endpoint testing patterns:

  • In-process (unit): pytest + httpx AsyncClient with ASGITransport — no external services
  • Live (integration): pytest + httpx against Docker Compose URL — requires stack running
  • Auth matrix: valid token, expired token, missing token, invalid role
  • CORS: allowed origin, blocked origin, preflight OPTIONS request
  • Security headers: X-Content-Type-Options, X-Frame-Options, Strict-Transport-Security
  • File uploads: valid MIME accepted, invalid MIME rejected (415), EXIF stripping verified
  • Error responses: no PII, no stack traces, no internal file paths
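
A sketch of the in-process variant, assuming the FastAPI instance is importable as app.main:app and exposes /health and an auth-protected /protected route (import path and routes are assumptions); it uses the anyio pytest marker, though pytest-asyncio works equally well:

```python
# test_api_inprocess.py -- in-process FastAPI tests: no server, no Docker stack.
# The app import path and the /health and /protected routes are assumptions.
import httpx
import pytest

from app.main import app  # the FastAPI instance under test


@pytest.mark.anyio
async def test_health_and_security_headers():
    transport = httpx.ASGITransport(app=app)
    async with httpx.AsyncClient(transport=transport, base_url="http://test") as client:
        r = await client.get("/health")
    assert r.status_code == 200
    assert r.headers.get("X-Content-Type-Options") == "nosniff"
    assert "X-Frame-Options" in r.headers


@pytest.mark.anyio
async def test_missing_token_rejected():
    transport = httpx.ASGITransport(app=app)
    async with httpx.AsyncClient(transport=transport, base_url="http://test") as client:
        r = await client.get("/protected")  # no Authorization header
    assert r.status_code in (401, 403)
```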

Antigravity test execution integration:

When the test plan will be executed via Google Antigravity (the ITI test/debug lane), include these considerations:

  • Agent dispatch format: Structure test cases so they can be dispatched as Antigravity Agent Manager tasks using Planning mode. Each test suite maps to a /test-session or /browser-test workflow invocation.
  • Browser QA test cases: For UI-facing tests, specify viewport sizes (1440px, 1024px, 768px, 375px) and expected visual states. Antigravity’s browser sub-agent captures screenshots and recordings as Walkthrough artifacts.
  • Visual regression baselines: Include a baseline capture step before code changes. Antigravity compares screenshots against baselines to detect unintended UI changes.
  • Artifact review criteria: Define what constitutes a pass/fail in Walkthrough artifact review — include expected screenshot states, acceptable visual differences, and [TEST-FAILURE] flag thresholds.
  • Knowledge sync: Test plans should include a post-session step to scan Walkthrough artifacts for [CONTEXT-UPDATE] flags and route findings to the appropriate CLAUDE.md tier.

See the antigravity-testing and antigravity-browser-qa skills for detailed dispatch and artifact review protocols.

Outputs: Test plan document (Markdown), test case matrix (story/requirement → test cases), exit criteria checklist, environment setup guide, defect report template, infrastructure smoke test checklist.
