Chapter 18: Claude & the Anthropic API
Last Updated: 2026-03
18.2 Authentication
The Anthropic API key is stored:
- n8n: In the n8n credential vault (never in workflow code)
- WordPress/PHP: In WordPress options, encrypted using ITI utilities
- Swift: In the macOS Keychain via KeychainService
- Python: In .env as ANTHROPIC_API_KEY, loaded via python-dotenv
- Rust/Tauri: Via the Tauri Keychain plugin, loaded into IPC commands at runtime
Warning: The Anthropic API key must never appear in source code, Git commits, logs, or error messages.
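For Python, the fail-fast side of this policy can be sketched as follows. This is an illustrative helper, not part of the ITI shared library; `load_api_key` is a hypothetical name.

```python
import os

def load_api_key(env_var: str = "ANTHROPIC_API_KEY") -> str:
    """Read the key from the environment and fail fast if it is missing.

    Never log or echo the key itself (see the warning above).
    """
    key = os.environ.get(env_var)
    if not key:
        # Report which variable is missing, never a partial key value
        raise RuntimeError(f"{env_var} is not set; load it from .env or the platform keychain")
    return key
```

Failing at startup with a message that names the variable (but never the value) keeps misconfiguration out of request-time error logs.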
18.3 Models
ITI uses the following Claude models:
| Model | Use Case | Max Tokens (output) |
|---|---|---|
| claude-opus-4-5 | Complex reasoning, long documents, multi-agent tasks | 8,192 |
| claude-sonnet-4-5 | Balanced speed/quality for product features | 8,192 |
| claude-haiku-3-5 | High-volume, latency-sensitive tasks | 4,096 |
Default for products: claude-opus-4-5 unless performance requirements demand a faster model.
Note: Model names and capabilities change with Anthropic releases. Check operations/claude-parity-requirements.md for the current model selection guidance.
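The default-unless-faster rule can be captured in a small helper. This is a sketch: the tier names and `pick_model` function are illustrative, and the model IDs mirror the table above, so they will drift with Anthropic releases.

```python
# Model IDs mirror the table above; update them alongside
# operations/claude-parity-requirements.md when Anthropic ships new models.
MODEL_BY_TIER = {
    "complex": "claude-opus-4-5",     # reasoning, long documents, multi-agent
    "balanced": "claude-sonnet-4-5",  # speed/quality balance for product features
    "fast": "claude-haiku-3-5",       # high-volume, latency-sensitive
}

def pick_model(tier: str = "complex") -> str:
    """Default to Opus unless a faster tier is explicitly requested."""
    return MODEL_BY_TIER.get(tier, MODEL_BY_TIER["complex"])
```

Centralizing the mapping in one table means a model upgrade touches one dictionary rather than every call site.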
18.4 Basic API Call Structure
Python
```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from env

message = client.messages.create(
    model="claude-opus-4-5",
    max_tokens=2048,
    system="You are an expert travel planner.",
    messages=[
        {"role": "user", "content": "Plan a 7-day trip to Japan."}
    ],
)
print(message.content[0].text)
```
TypeScript (Node.js / Tauri)
```typescript
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });

const message = await client.messages.create({
  model: 'claude-opus-4-5',
  max_tokens: 2048,
  system: 'You are an expert travel planner.',
  messages: [{ role: 'user', content: 'Plan a 7-day trip to Japan.' }],
});
console.log(message.content[0].text);
```
18.5 Tool Use (Function Calling)
Tool use allows Claude to request that the calling code execute a function and return the result. This is how multi-step reasoning is implemented without n8n.
```python
import anthropic
import json

client = anthropic.Anthropic()

# Define tools
tools = [
    {
        "name": "search_flights",
        "description": "Search for available flights between two cities.",
        "input_schema": {
            "type": "object",
            "properties": {
                "origin": {"type": "string", "description": "Origin city or airport code"},
                "destination": {"type": "string", "description": "Destination city or airport code"},
                "date": {"type": "string", "description": "Travel date (YYYY-MM-DD)"},
            },
            "required": ["origin", "destination", "date"],
        },
    }
]

# Initial message
response = client.messages.create(
    model="claude-opus-4-5",
    max_tokens=1024,
    tools=tools,
    messages=[{"role": "user", "content": "Find me flights from Atlanta to Tokyo next week."}],
)

# Process tool calls (execute_tool is your own dispatcher that runs the named tool)
if response.stop_reason == "tool_use":
    tool_use_block = next(b for b in response.content if b.type == "tool_use")
    tool_result = execute_tool(tool_use_block.name, tool_use_block.input)

    # Continue conversation with the tool result
    final_response = client.messages.create(
        model="claude-opus-4-5",
        max_tokens=1024,
        tools=tools,
        messages=[
            {"role": "user", "content": "Find me flights from Atlanta to Tokyo next week."},
            {"role": "assistant", "content": response.content},
            {"role": "user", "content": [
                {"type": "tool_result", "tool_use_id": tool_use_block.id, "content": json.dumps(tool_result)}
            ]},
        ],
    )
```
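The single round above generalizes to a loop: keep calling the API until Claude stops requesting tools. A minimal sketch follows; `run_tool_loop` and the `execute_tool` callback are hypothetical names, not ITI library code, and real code would also cap token spend, not just round count.

```python
import json

def run_tool_loop(client, tools, messages, execute_tool,
                  model="claude-opus-4-5", max_rounds=5):
    """Call the Messages API until stop_reason is no longer 'tool_use'."""
    for _ in range(max_rounds):
        response = client.messages.create(
            model=model, max_tokens=1024, tools=tools, messages=messages
        )
        if response.stop_reason != "tool_use":
            return response  # final answer
        # Echo the assistant turn back, then answer every tool_use block
        messages.append({"role": "assistant", "content": response.content})
        results = [
            {
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": json.dumps(execute_tool(block.name, block.input)),
            }
            for block in response.content
            if block.type == "tool_use"
        ]
        messages.append({"role": "user", "content": results})
    raise RuntimeError(f"tool loop did not converge in {max_rounds} rounds")
```

The `max_rounds` guard matters: a model that keeps requesting tools would otherwise loop (and bill) indefinitely.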
18.6 Streaming
For real-time UI updates (typing indicator effect), use streaming:
```python
with client.messages.stream(
    model="claude-opus-4-5",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Write a poem about the ocean."}],
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)
```
18.7 Context Window Management
Claude has a context window (the maximum combined length of system prompt + conversation history + tool definitions + response). Exceeding the limit causes an error.
| Model | Context Window |
|---|---|
| claude-opus-4-5 | 200,000 tokens |
| claude-sonnet-4-5 | 200,000 tokens |
| claude-haiku-3-5 | 200,000 tokens |
Practical limits for product features:
- Keep system prompts under 2,000 tokens.
- Keep RAG-injected context under 8,000 tokens.
- Keep conversation history truncated to the last 20 messages or 10,000 tokens.
The ConversationManager in the ITI shared library handles automatic history truncation.
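The truncation rule above can be sketched as follows. This is illustrative only, not the real ConversationManager; the roughly 4 characters per token estimate is a crude heuristic, and production code should use a real tokenizer.

```python
def truncate_history(messages, max_messages=20, max_tokens=10_000):
    """Keep the most recent messages within both limits.

    Token counts are estimated at ~4 characters per token; swap in a real
    tokenizer for production use.
    """
    estimate = lambda m: max(1, len(str(m.get("content", ""))) // 4)
    kept, total = [], 0
    for msg in reversed(messages[-max_messages:]):  # walk newest first
        cost = estimate(msg)
        if total + cost > max_tokens:
            break
        kept.append(msg)
        total += cost
    return list(reversed(kept))  # restore chronological order
```

Walking newest-first guarantees the most recent turns survive truncation, which is what keeps multi-turn conversations coherent.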
18.8 Error Handling
```python
import time

import anthropic

client = anthropic.Anthropic()

try:
    response = client.messages.create(...)
except anthropic.RateLimitError:
    # Wait and retry — use exponential backoff
    time.sleep(60)
except anthropic.AuthenticationError:
    # Invalid API key — check configuration
    # (must be caught before APIStatusError, its parent class, or it never matches)
    pass
except anthropic.APIConnectionError:
    # Network issue — retry or use fallback
    pass
except anthropic.APIStatusError as e:
    # API returned an error status
    print(f"API error: {e.status_code} {e.message}")
```
18.9 Direct vs n8n-Mediated Calls
| Consideration | Direct API | Via n8n |
|---|---|---|
| Centralized logging | No | Yes |
| RAG integration | Manual | Built-in (Dify retrieval node) |
| Error monitoring | Manual | Via Error Monitor workflow |
| Behavior changes without deployment | No | Yes (edit workflow) |
| Latency | Lower | Higher (webhook round-trip) |
| Suitable for | Fallback, scripts, prototypes | All product integrations |
Previous: Chapter 17 — iOS & macOS with Swift | Next: Chapter 19 — Prompt Engineering
