
# Chapter 18: Claude & the Anthropic API

Last Updated: 2026-03

## 18.1 Overview

Claude (by Anthropic) is the primary large language model used across all ITI products. Claude handles AI-powered features: document analysis, knowledge base retrieval synthesis, conversational assistants, code generation, content analysis, and multi-step reasoning.

Claude is accessed in two ways in ITI products:

| Access Method | When Used | Components |
|---|---|---|
| Via n8n workflow | Standard product integrations | ITI_Workflow_Adapter → n8n webhook → AI Agent node |
| Direct API | Fallback, standalone services, one-off scripts | ITI_Claude_API class, ClaudeService.swift, claude_client.py |

Always prefer the n8n-mediated path for product integrations. It provides centralized logging, error monitoring, RAG integration, and the ability to modify AI behavior without deploying new product code.
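The n8n-mediated path boils down to a webhook POST from the product to n8n. A minimal sketch using only the standard library, assuming a hypothetical webhook URL and payload shape (the real ITI_Workflow_Adapter contract may differ):

```python
import json
import urllib.request

# Placeholder URL: the real endpoint comes from product configuration.
N8N_WEBHOOK_URL = "https://n8n.example.com/webhook/ai-agent"

def build_n8n_payload(product: str, prompt: str) -> dict:
    """Build the JSON body sent to the n8n AI Agent webhook (assumed shape)."""
    return {"product": product, "prompt": prompt}

def call_n8n_agent(product: str, prompt: str) -> dict:
    """POST the payload to the webhook and return the parsed JSON reply."""
    body = json.dumps(build_n8n_payload(product, prompt)).encode()
    req = urllib.request.Request(
        N8N_WEBHOOK_URL,
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)
```

Because the product only sends a prompt and reads a reply, prompt wording, model choice, and RAG behavior can all change on the n8n side without a product deployment.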

## 18.2 Authentication

The Anthropic API key is stored:

  • n8n: In the n8n credential vault (never in workflow code)
  • WordPress/PHP: In WordPress options, encrypted using ITI utilities
  • Swift: In the macOS Keychain via KeychainService
  • Python: In .env as ANTHROPIC_API_KEY, loaded via python-dotenv
  • Rust/Tauri: Via the Tauri Keychain plugin, loaded into IPC commands at runtime

Warning: The Anthropic API key must never appear in source code, Git commits, logs, or error messages.
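For the Python case, the loading step can be sketched as follows. This is a standard-library stand-in for what python-dotenv does (the real parser handles quoting and export syntax); `get_anthropic_key` is an illustrative helper name, not an ITI API:

```python
import os
from pathlib import Path

def load_env_file(path: str = ".env") -> None:
    """Minimal .env loader; python-dotenv provides the same behavior."""
    p = Path(path)
    if not p.exists():
        return
    for line in p.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Existing environment variables take precedence over .env entries
        os.environ.setdefault(key.strip(), value.strip())

def get_anthropic_key() -> str:
    """Return the API key, failing loudly (without printing it) if unset."""
    load_env_file()
    key = os.environ.get("ANTHROPIC_API_KEY")
    if not key:
        raise RuntimeError("ANTHROPIC_API_KEY is not set; add it to .env")
    return key
```

Note that the error message names the variable but never echoes the key itself, consistent with the warning above.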


## 18.3 Models

ITI uses the following Claude models:

| Model | Use Case | Max Tokens (output) |
|---|---|---|
| claude-opus-4-5 | Complex reasoning, long documents, multi-agent tasks | 8,192 |
| claude-sonnet-4-5 | Balanced speed/quality for product features | 8,192 |
| claude-haiku-3-5 | High-volume, latency-sensitive tasks | 4,096 |

Default for products: claude-opus-4-5 unless performance requirements demand a faster model.

Note: Model names and capabilities change with Anthropic releases. Check operations/claude-parity-requirements.md for the current model selection guidance.
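A small helper can make the default explicit in product code. The task-profile names below are illustrative assumptions, not an ITI standard; the mapping mirrors the table above:

```python
# Hypothetical task-profile-to-model mapping; update alongside
# operations/claude-parity-requirements.md when models change.
MODEL_FOR_TASK = {
    "complex": "claude-opus-4-5",
    "balanced": "claude-sonnet-4-5",
    "high_volume": "claude-haiku-3-5",
}

def pick_model(task_profile: str) -> str:
    """Return the model for a task profile, defaulting to claude-opus-4-5."""
    return MODEL_FOR_TASK.get(task_profile, "claude-opus-4-5")
```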


## 18.4 Basic API Call Structure

Python


```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

message = client.messages.create(
    model="claude-opus-4-5",
    max_tokens=2048,
    system="You are an expert travel planner.",
    messages=[
        {"role": "user", "content": "Plan a 7-day trip to Japan."}
    ]
)

print(message.content[0].text)
```

TypeScript (Node.js / Tauri)


```typescript
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });

const message = await client.messages.create({
    model: 'claude-opus-4-5',
    max_tokens: 2048,
    system: 'You are an expert travel planner.',
    messages: [{ role: 'user', content: 'Plan a 7-day trip to Japan.' }],
});

// content is a union of block types; narrow to a text block before reading .text
const block = message.content[0];
if (block.type === 'text') {
    console.log(block.text);
}
```

## 18.5 Tool Use (Function Calling)

Tool use allows Claude to request that the calling code execute a function and return the result. This is how multi-step reasoning is implemented without n8n.


```python
import anthropic
import json

client = anthropic.Anthropic()

# Define tools
tools = [
    {
        "name": "search_flights",
        "description": "Search for available flights between two cities.",
        "input_schema": {
            "type": "object",
            "properties": {
                "origin": {"type": "string", "description": "Origin city or airport code"},
                "destination": {"type": "string", "description": "Destination city or airport code"},
                "date": {"type": "string", "description": "Travel date (YYYY-MM-DD)"},
            },
            "required": ["origin", "destination", "date"],
        },
    }
]

# Initial message
response = client.messages.create(
    model="claude-opus-4-5",
    max_tokens=1024,
    tools=tools,
    messages=[{"role": "user", "content": "Find me flights from Atlanta to Tokyo next week."}],
)

# Process tool calls
if response.stop_reason == "tool_use":
    tool_use_block = next(b for b in response.content if b.type == "tool_use")
    # execute_tool is application code, not part of the SDK: it runs the
    # requested tool and returns a JSON-serializable result
    tool_result = execute_tool(tool_use_block.name, tool_use_block.input)

    # Continue the conversation with the tool result
    final_response = client.messages.create(
        model="claude-opus-4-5",
        max_tokens=1024,
        tools=tools,
        messages=[
            {"role": "user", "content": "Find me flights from Atlanta to Tokyo next week."},
            {"role": "assistant", "content": response.content},
            {"role": "user", "content": [
                {"type": "tool_result", "tool_use_id": tool_use_block.id, "content": json.dumps(tool_result)}
            ]},
        ],
    )
```
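The example above calls execute_tool, which is application code rather than part of the Anthropic SDK. A hypothetical dispatcher might look like this; search_flights_impl is a placeholder returning canned data:

```python
def search_flights_impl(origin: str, destination: str, date: str) -> dict:
    """Placeholder: a real implementation would call a flight-search API."""
    return {"flights": [], "origin": origin, "destination": destination, "date": date}

# Map tool names (as declared in the tools list) to local handlers
TOOL_HANDLERS = {
    "search_flights": search_flights_impl,
}

def execute_tool(name: str, tool_input: dict) -> dict:
    """Route a tool_use request from Claude to the matching local handler."""
    handler = TOOL_HANDLERS.get(name)
    if handler is None:
        # Return the error as data so Claude can recover in conversation
        return {"error": f"unknown tool: {name}"}
    return handler(**tool_input)
```

Returning errors as data (rather than raising) lets the model see the failure in the tool_result block and adjust its next step.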

## 18.6 Streaming

For real-time UI updates (typing indicator effect), use streaming:


```python
# client is the anthropic.Anthropic() instance from 18.4
with client.messages.stream(
    model="claude-opus-4-5",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Write a poem about the ocean."}],
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)
```

## 18.7 Context Window Management

Claude has a context window (the maximum combined length of system prompt + conversation history + tool definitions + response). Exceeding the limit causes an error.

| Model | Context Window |
|---|---|
| claude-opus-4-5 | 200,000 tokens |
| claude-sonnet-4-5 | 200,000 tokens |
| claude-haiku-3-5 | 200,000 tokens |

Practical limits for product features:

  • Keep system prompts under 2,000 tokens.
  • Keep RAG-injected context under 8,000 tokens.
  • Keep conversation history truncated to the last 20 messages or 10,000 tokens.

The ConversationManager in the ITI shared library handles automatic history truncation.
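A minimal sketch of the kind of truncation ConversationManager performs; the real implementation also enforces the token budget, which is omitted here. Note that the Messages API requires the window to start with a user-role message, so the sketch trims any leading assistant turns:

```python
def truncate_history(messages: list, max_messages: int = 20) -> list:
    """Keep the most recent messages, trimmed so the kept window
    begins with a 'user' turn as the Messages API requires."""
    kept = messages[-max_messages:]
    # Drop leading assistant turns left over from cutting mid-exchange
    while kept and kept[0]["role"] != "user":
        kept = kept[1:]
    return kept
```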


## 18.8 Error Handling


```python
import time

import anthropic

client = anthropic.Anthropic()

try:
    response = client.messages.create(...)
except anthropic.RateLimitError:
    # Too many requests: wait and retry with exponential backoff
    time.sleep(60)
except anthropic.AuthenticationError:
    # Invalid API key: check configuration; retrying will not help
    pass
except anthropic.APIStatusError as e:
    # Any other non-success status returned by the API
    print(f"API error: {e.status_code} {e.message}")
except anthropic.APIConnectionError:
    # Network issue: retry or use fallback
    pass
```

Note the ordering: RateLimitError and AuthenticationError are subclasses of APIStatusError, so they must be caught before the general APIStatusError branch.

## 18.9 Direct vs n8n-Mediated Calls

| Consideration | Direct API | Via n8n |
|---|---|---|
| Centralized logging | No | Yes |
| RAG integration | Manual | Built-in (Dify retrieval node) |
| Error monitoring | Manual | Via Error Monitor workflow |
| Behavior changes without deployment | No | Yes (edit workflow) |
| Latency | Lower | Higher (webhook round-trip) |
| Suitable for | Fallback, scripts, prototypes | All product integrations |

Previous: Chapter 17 — iOS & macOS with Swift | Next: Chapter 19 — Prompt Engineering
