AI Inference Boundary Review

name: ai-inference-boundary-review

description: Review AI-generated artifacts (code diffs, PRs, documents, configs) for silent inferences the AI made that go beyond what was explicitly requested. Catches scope drift, uninvited refactors, assumed requirements, and architectural decisions made by the AI rather than by a human. Use before merging any AI-generated PR, during code review of AI-assisted changes, or when diagnosing why an AI-produced artifact feels “off” relative to the prompt.

Core Principle

Every AI response contains inferences not explicitly requested. Your prompt underdetermines the output — the AI fills gaps with priors (framework conventions, its sense of “good code”, assumed requirements, aesthetic preferences). Some inferences are valuable. Some are scope creep, wrong assumptions, or silent architectural decisions that bypass human judgment.

The purpose of this review is to make inferences visible so a human can ratify or reject them.

Instructions

Run this review on any non-trivial AI-generated artifact before it’s merged or shipped. The goal is to separate what was requested from what the AI decided on its own.

1. Reconstruct the Explicit Ask

Before looking at the artifact, write down what was actually requested:

  • Primary ask: The literal request — “fix the null check in X”, “add a login form”, “refactor Y to use Z”.
  • Stated constraints: Explicit limits in the prompt — “don’t touch Z”, “keep the API the same”, “use the existing pattern”.
  • Implicit constraints from context: What the repo’s conventions, CLAUDE.md, and recent history imply.

Everything not in one of these three buckets is an inference.

2. Classify Each Change

Walk the diff (or the document, section by section). For each change, classify it:

| Class | Definition | Default Action |
|---|---|---|
| Requested | Directly implements the primary ask | Review on merit |
| Necessary consequence | Required by the requested change (imports, type signatures, test updates) | Review on merit |
| Ratified inference | AI made a judgment call, but it’s in-bounds and a reasonable human would make the same call | Accept; note for future prompts |
| Uninvited inference | AI changed something outside the primary ask for reasons of taste, consistency, or assumed improvement | Flag — decide explicitly |
| Assumed requirement | AI acted on a requirement that was never stated (e.g., added rate limiting, added a config option, changed error format) | Flag — reject unless intentional |
| Silent architectural decision | AI chose between plausible architectures without asking (cache strategy, lib choice, file layout) | Flag — require human ratification |
| Refactor-in-passing | AI cleaned up code it touched “while it was there” | Flag — separate from feature change |
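The classification buckets and their default actions can be captured in a small helper for tooling or scripted review. A minimal sketch in Python; the `ChangeClass` enum and `DEFAULT_ACTION` mapping are illustrative names, not part of any existing tool:

```python
from enum import Enum

class ChangeClass(Enum):
    """Classification buckets from the table above."""
    REQUESTED = "requested"
    NECESSARY_CONSEQUENCE = "necessary_consequence"
    RATIFIED_INFERENCE = "ratified_inference"
    UNINVITED_INFERENCE = "uninvited_inference"
    ASSUMED_REQUIREMENT = "assumed_requirement"
    SILENT_ARCH_DECISION = "silent_architectural_decision"
    REFACTOR_IN_PASSING = "refactor_in_passing"

# Default action per class; the last four all require an explicit human decision.
DEFAULT_ACTION = {
    ChangeClass.REQUESTED: "review on merit",
    ChangeClass.NECESSARY_CONSEQUENCE: "review on merit",
    ChangeClass.RATIFIED_INFERENCE: "accept; note for future prompts",
    ChangeClass.UNINVITED_INFERENCE: "flag: decide explicitly",
    ChangeClass.ASSUMED_REQUIREMENT: "flag: reject unless intentional",
    ChangeClass.SILENT_ARCH_DECISION: "flag: require human ratification",
    ChangeClass.REFACTOR_IN_PASSING: "flag: separate from feature change",
}

def needs_flag(cls: ChangeClass) -> bool:
    """True when the change cannot be merged without an explicit human decision."""
    return DEFAULT_ACTION[cls].startswith("flag")
```

The split into "review on merit" versus "flag" is the load-bearing distinction: flagged classes must never pass review on code quality alone.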

3. Red-Flag Patterns

These patterns indicate that the AI went beyond its brief; each warrants specific scrutiny:

Scope indicators:

  • Files changed that weren’t mentioned and aren’t causally linked to the ask
  • Renames or moves not requested
  • Reformatting across an entire file when only a section was touched
  • Import reordering, comment removal, or stylistic sweeps
  • New files introduced without discussion
  • Dependencies added (check via dependency-hygiene skill)
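The first scope indicator, files with no causal link to the ask, can be checked mechanically once you list the files the ask implies. A hedged sketch; the file lists here are hypothetical, and in practice `changed` would come from something like `git diff --name-only`:

```python
def unexpected_files(changed: list[str], in_scope: list[str]) -> list[str]:
    """Return changed files that were neither mentioned in the ask nor
    causally linked to it (the in_scope list must include both kinds)."""
    scope = set(in_scope)
    return sorted(f for f in changed if f not in scope)

# Hypothetical example: the ask was "fix the null check in src/auth.ts".
changed = ["src/auth.ts", "src/auth.test.ts", "src/utils.ts", "package.json"]
in_scope = ["src/auth.ts", "src/auth.test.ts"]  # test update is a necessary consequence
print(unexpected_files(changed, in_scope))  # ['package.json', 'src/utils.ts']
```

Anything this returns is not automatically wrong, but it must be classified and decided explicitly rather than skimmed past.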

Assumption indicators:

  • New config options or environment variables invented
  • Error messages, logging, or user-facing text changed without instruction
  • Default values chosen (“I defaulted it to X”) without being asked to pick a default
  • Security posture changed (CORS, auth, headers) without being in scope
  • API shapes changed — endpoints, parameters, return types
  • Database migrations added or altered

Silent-decision indicators:

  • Between two valid approaches, one was chosen without flagging the tradeoff
  • Performance/memory/concurrency behavior changed
  • Caching introduced or altered
  • Library substitutions
  • Feature flags added or removed

Narrative indicators in the response text:

  • “I also took the liberty of…”
  • “While I was in there, I noticed…”
  • “I improved…”
  • “I assumed you wanted…”
  • “I went ahead and…”
  • “For consistency, I…”

Any of these phrases in the AI’s description of the change is a direct confession of an uninvited inference.
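The narrative indicators are easy to scan for automatically. A minimal sketch that searches an AI's change description for the confession phrases listed above; the pattern list is a starting point, not exhaustive:

```python
import re

# Confession phrases from the list above, matched case-insensitively.
NARRATIVE_FLAGS = [
    r"took the liberty of",
    r"while i was in there",
    r"\bi improved\b",
    r"\bi assumed you wanted\b",
    r"went ahead and",
    r"for consistency, i",
]

def confession_phrases(response_text: str) -> list[str]:
    """Return the narrative-indicator patterns found in the AI's own
    description of the change."""
    lower = response_text.lower()
    return [p for p in NARRATIVE_FLAGS if re.search(p, lower)]
```

A non-empty result does not prove scope creep, but each hit points at a specific change that belongs in the inference inventory.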

4. Decision Matrix

For each Uninvited Inference, Assumed Requirement, Silent Architectural Decision, or Refactor-in-Passing:

| Decision | Action |
|---|---|
| Accept | Keep in the PR; note in the commit message that it was out-of-scope-but-approved; consider adding to project conventions |
| Reject | Revert that change; re-prompt with an explicit “do not modify X” constraint |
| Split | Move the change to a separate PR with its own review |
| Defer | Park in the scope-control backlog for a later decision |

Default bias is toward Reject or Split. Merging uninvited inferences quietly is how AI-assisted projects accumulate silent architectural decisions no human made.

5. Feedback Loop

After the review, update forward-looking guardrails:

  • CLAUDE.md updates: If the AI made an assumption a human wouldn’t want, document the preferred default in CLAUDE.md so future sessions inherit it.
  • Protected file list: If the AI touched files it shouldn’t, add those to the protected list.
  • Prompt hygiene: If the ambiguity was in your prompt, note the pattern and make future prompts more explicit about scope.
  • Repeat offender tracking: If the same kind of inference keeps happening (e.g., AI keeps “improving” error messages), add an explicit constraint to the project CLAUDE.md.
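As a concrete illustration, a repeat-offender constraint recorded in CLAUDE.md might look like this; the wording and paths are hypothetical examples, not a required format:

```markdown
## Scope constraints

- Do not change error messages, log text, or user-facing copy unless the task asks for it.
- Do not add dependencies; propose them in the PR description instead.
- Protected files: `src/billing/**`, `migrations/**`; never modify without an explicit ask.
```

Each entry should trace back to a specific rejected inference, so the file stays short and every rule has a known origin.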

Review Checklist

Use this when reviewing an AI-generated PR:

  • [ ] Reconstructed the explicit ask before reading the diff
  • [ ] Every changed file has a causal link to the requested change
  • [ ] No new dependencies added without verification
  • [ ] No new config options, env vars, or feature flags invented
  • [ ] No API shapes changed that weren’t in scope
  • [ ] No security posture changes outside scope
  • [ ] Commit message describes human decisions, not just AI’s narrative
  • [ ] Uninvited inferences are either accepted-with-note, split, or rejected
  • [ ] CLAUDE.md updated if a new project convention emerged
  • [ ] Parking-lot backlog updated for any deferred items

Output Format


## Inference Boundary Review — [PR / Artifact Name]
**Date**: YYYY-MM-DD
**Reviewer**: [human]

### Explicit Ask
[Primary ask and constraints]

### Inference Inventory
| File:Line | Change | Class | Decision | Note |
|---|---|---|---|---|
| src/auth.ts:42 | Added rate limit | Assumed Requirement | Reject | Not in scope; re-prompt without |
| src/utils.ts:1-200 | Reformatted file | Refactor-in-Passing | Split | Move to separate PR |
| package.json | Added `lodash` | Uninvited Inference | Reject | Existing utils cover this |
...

### Accepted Inferences
[List, with rationale — these become candidates for CLAUDE.md]

### Rejected / Split Changes
[Re-prompt guidance]

### CLAUDE.md Updates Proposed
[Specific text to add to project context]

Standards

  • Default to explicit. When in doubt, reject the inference and re-prompt. The cost of re-prompting is much lower than the cost of silent architectural drift.
  • Separate feature from refactor. AI’s “while I was in there” changes should always be split into their own PR, reviewed on their own merits.
  • Make the feedback loop work. Every rejected inference is signal for prompt improvement or CLAUDE.md update. Don’t just reject and forget.
  • Scale with stakes. For prototype code, a light review. For production, security-sensitive, or payment code, every inference gets explicit ratification.

Related Skills

  • ai-coworker-trust-protocol — broader trust hygiene, of which this is the diff-level component
  • scope-control — handling scope changes at the sprint level
  • code-review — general code review methodology
  • dependency-hygiene — dependency verification protocol

Outputs: Inference inventory with per-change classification and decisions; re-prompt guidance; CLAUDE.md update proposals.
