
Patriot University Architecture Overview

Patriot University: Knowledge System Architecture

Document type: Technical Architecture Showcase

Product: Patriot University

Audience: Technical professionals and developers

Status: Published

Last updated: June 12, 2026

Related: AI Project Showcase (product overview)

Patriot University is a civic education and civil rights platform that delivers AI-powered rights advising, accountability research, and strategic guidance across five surfaces: a FastAPI backend, iOS and macOS native apps, a WordPress plugin, and a Hugo static portal. This document covers the knowledge system design and agent architecture that make the platform work: the LLM Wiki, the Obsidian authoring environment, the dual knowledge graph, the publishing pipeline, and the rules and skills systems that constrain how AI agents operate within it.

The product overview showcase covers user needs, competitive positioning, and product features. This document does not repeat that material.


1. Architecture Overview

The platform is organized around a single source-of-truth Obsidian vault that feeds all runtime surfaces through a structured publishing and embedding pipeline.


┌─────────────────────────────────────────────────────┐
│              Obsidian Vault (source of truth)        │
│  _sources/ → _drafts/ → publishable wiki pages      │
│  ~1,550 .md files (June 2026)                        │
└──────────────┬──────────────────────────────────────┘
               │
               ▼
┌─────────────────────────────────────────────────────┐
│         Flask Ops GUI (port 8766)                    │
│  Ingest · Promote · Publish · Lint · Chat            │
│  Embedded as Custom Frames inside Obsidian           │
└──────────────┬──────────────────────────────────────┘
               │
       ┌───────┴──────────┐
       ▼                  ▼
┌────────────┐    ┌────────────────────────────────┐
│ pu_publish │    │ Entity/Community Graph          │
│ .py        │    │ extract_entities.py →           │
│            │    │ build_communities.py →          │
│ MD → HTML  │    │ generate_community_summaries.py │
│ [[slug]] → │    │ (6,779 nodes · 16,310 edges)    │
│ permalinks │    └────────────────────────────────┘
└─────┬──────┘
      │
      ▼
┌─────────────────────────────────────────────────────┐
│      WordPress Echo KB (epkb_post_type_1 CPT)        │
│      publish-manifest.json (slug → wp_post_id)       │
└──────────────┬──────────────────────────────────────┘
               │
               ▼
┌─────────────────────────────────────────────────────┐
│  sync_embeddings.py → Pinecone (via AI Engine)       │
└──────────────┬──────────────────────────────────────┘
               │
       ┌───────┴───────────────┐
       ▼                       ▼
┌─────────────┐       ┌────────────────────────┐
│  FastAPI    │       │  Graphify (code graph)  │
│  backend    │       │  32,102 nodes           │
│  Whoosh +   │       │  58,938 edges           │
│  Pinecone   │       │  query at chat time     │
│  RAG        │       └────────────────────────┘
└─────────────┘

Knowledge base scale (June 2026): ~1,550 publishable markdown files, including 986 accountability profiles, 59 US jurisdiction voting guides, 17 truth-reconciliation documents, 105 press-freedom documents, 100 investigative-tools documents, and 70 local specialist skill files.


2. LLM Wiki: Compounding Knowledge from Sources

The platform uses the LLM Wiki pattern, a knowledge-base architecture that treats AI-compiled wiki pages as the primary knowledge artifact rather than real-time generation or raw vector retrieval. The pattern was articulated publicly by Andrej Karpathy in April 2026 and spread widely in the developer community.

The core principle: knowledge compounds across sessions because the wiki is the persistent artifact. When a question is answered, it draws from a wiki page that already contains synthesized cross-references — the LLM is not re-deriving knowledge from scratch on each query.

2.1 Vault Structure

The vault enforces a strict three-layer layout defined in patriot-agent-base/knowledgebase/CLAUDE.md:


_sources/          Immutable raw captures. Never published, never mutated.
_drafts/           LLM-compiled drafts awaiting human review.
                   Each file carries source_ref provenance links.
<wiki folders>/    Publishable, cross-linked knowledge pages.
                   e.g. accountability/, voting/, truth-reconciliation/

Every _-prefixed folder is excluded from the publish loader by scripts/lib/frontmatter.py. This convention is enforced in code, not documentation.
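
A minimal sketch of how an underscore-prefix exclusion like this could look. The function name is_publishable and the vault-relative path handling are assumptions for illustration; the actual logic in scripts/lib/frontmatter.py is not shown in this document.

```python
from pathlib import PurePosixPath


def is_publishable(relative_path: str) -> bool:
    """Return True for a .md file whose parent folders do not start with '_'.

    Hypothetical helper: mirrors the convention that _sources/, _drafts/,
    _control-center/ and similar folders never reach the publish loader.
    """
    parts = PurePosixPath(relative_path).parts
    folders = parts[:-1]  # the last part is the file name, not a folder
    if not relative_path.endswith(".md"):
        return False
    return not any(folder.startswith("_") for folder in folders)
```

Checking every folder in the path, not only the top-level one, means a nested folder such as voting/_archive/ would be excluded as well; whether the real loader does that is not stated here.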

2.2 Three Operations

The vault schema defines exactly three operations an agent may perform:

INGEST — Read from _sources/, compile a wiki draft to _drafts/, inject [[slug]] wikilinks to related pages, log the operation to _control-center/ingest-log.md with a datestamped entry. The draft stays in _drafts/ until a human promotes it.

QUERY — Route a query through scripts/lib/query_router.py. The router selects among five modes:

| Mode | Mechanism | Best for |
|---|---|---|
| local | Pinecone vector similarity | Specific named entities |
| global | LLM-generated community summaries | Broad thematic questions |
| graph | BFS traversal of entity graph | Relationship chains |
| hybrid | local + global combined | General factual queries |
| auto | Router selects based on query analysis | Default |
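
The auto mode can be pictured as a small classifier that maps a query to one of the other four modes. The sketch below is illustrative only: the function name select_mode and the keyword heuristics are assumptions, and the document does not describe how query_router.py actually analyzes a query.

```python
import re

MODES = ("local", "global", "graph", "hybrid", "auto")


def select_mode(query: str, requested: str = "auto") -> str:
    """Pick a query mode. The keyword rules are illustrative assumptions."""
    if requested not in MODES:
        raise ValueError(f"unknown mode: {requested}")
    if requested != "auto":
        return requested  # an explicit mode always wins

    text = query.lower()
    # Relationship chains: questions about how entities connect.
    if re.search(r"\b(connected to|linked to|relationship between|tied to)\b", text):
        return "graph"
    # Broad thematic questions: patterns or trends across many pages.
    if re.search(r"\b(patterns?|trends?|overall|across)\b", text):
        return "global"
    # Specific named entities: a quoted phrase or a capitalized two-word name.
    if '"' in query or re.search(r"\b[A-Z][a-z]+ [A-Z][a-z]+\b", query):
        return "local"
    # Everything else falls through to the combined mode.
    return "hybrid"
```

For example, select_mode("What patterns appear across accountability profiles?") returns "global" under these rules, while a query with no recognizable cue falls back to "hybrid".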

LINT: make lint runs validate_frontmatter.py + guardrails.py + link graph validation. A page cannot be promoted until lint passes.

2.3 Obsidian-Safe Markdown Contract

Because the same markdown files are read by Obsidian, processed by pu_publish.py, and indexed by Whoosh and Pinecone, the vault enforces a conservative Markdown subset. [[slug]] wikilinks are supported (Obsidian reads them; pu_publish.py resolves them to permalinks). The following are banned: ![[embeds]], callouts, block references (^id), %%comments%%, and inline hashtags. These bans exist because they create parse failures in the non-Obsidian consumers of the same files.
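
A lint-time check for the banned constructs could be as simple as a set of regular expressions. This is a sketch under stated assumptions: the pattern names, the exact regexes, and the function find_banned_syntax are hypothetical and are not taken from guardrails.py.

```python
import re

# Hypothetical patterns, one per banned construct named in the contract.
BANNED_PATTERNS = {
    "embed": re.compile(r"!\[\[[^\]]+\]\]"),
    "callout": re.compile(r"^>\s*\[![^\]]+\]", re.MULTILINE),
    "block_reference": re.compile(r"(?<=\s)\^[A-Za-z0-9-]+\s*$", re.MULTILINE),
    "comment": re.compile(r"%%.*?%%", re.DOTALL),
    # '#' directly followed by a letter and not preceded by a word character,
    # so headings ("# Title") and URL fragments are not flagged.
    "inline_hashtag": re.compile(r"(?<![\w#&/])#[A-Za-z][\w/-]*"),
}


def find_banned_syntax(markdown: str) -> list[str]:
    """Return the names of banned constructs found in the text."""
    return [name for name, pattern in BANNED_PATTERNS.items()
            if pattern.search(markdown)]
```

Plain [[slug]] wikilinks pass through untouched because none of the patterns match them; only the embed form with a leading exclamation mark is flagged.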

2.4 Human-Gated Promotion

Most LLM Wiki implementations described publicly are personal tools where the user authors and queries directly. The Patriot University implementation adds a mandatory human review gate between _drafts/ and publishable wiki pages:


make promote SLUG=<slug>   # promotes _drafts/<slug>.md → category folder
                           # sets status: review in frontmatter
make publish               # only after human approval

Accountability profiles additionally require the inclusion eligibility gate (described in Section 6) before make promote will accept them.

2.5 What This Is Not

The LLM Wiki pattern here is not RAG in the conventional sense. Pinecone vector retrieval is one of five query modes — not the primary one. The wiki pages themselves are the knowledge artifact. Pinecone fills gaps; the wiki is the source of truth.


3. Obsidian as the AI Authoring Environment

Most teams using Obsidian for AI-adjacent work treat it as a note-taking layer that feeds into a separate build process. This implementation goes further: Obsidian is the primary content-operations IDE, with the build pipeline running inside it.

3.1 Vault Configuration

The .obsidian/ folder configures eight community plugins that transform Obsidian’s behavior:

| Plugin | Role |
|---|---|
| dataview | Live dashboards over vault metadata (status, publish state, category counts) |
| templater | Scaffolds for new accountability profiles, voting guides, skill files |
| shellcommands | Exposes make lint, make promote, make publish as callable vault commands |
| meta-bind | Renders pipeline buttons inline in _control-center/index.md |
| custom-frames | Embeds the Flask GUI as Obsidian panes (see 3.2) |
| obsidian-importer | Ingests documents from external formats into _sources/ |
| homepage | Opens _control-center/index.md on vault launch |
| todoist-sync | Surfaces content pipeline tasks in Obsidian |

3.2 Custom Frames: The Flask GUI Inside Obsidian

The most architecturally significant configuration is Custom Frames, which embeds the Flask GUI (http://127.0.0.1:8766) directly into Obsidian panes. This creates a unified interface where the author never leaves the writing environment:

| Frame | URL | What it does |
|---|---|---|
| PU AI Chat | /chat/ | Chat with the KB using graph-aware context injection |
| Source Manager | /editor/sources/ | Intake form for new _sources/ documents |
| Publish Status | /ops/ | Shows which vault pages are published, draft, or pending |
| Link Graph | /ops/graph/ | Visualizes the wikilink graph across the vault |
| Usage Dashboard | /ops/usage/ | API usage and embedding sync status |

The Flask GUI (gui/) is a full Python web application with routes across chat.py, sources.py, editor.py, ops.py, skills.py, bulk.py, taxonomy.py, backup.py, and jobs.py. It runs locally as a background process; Obsidian accesses it via iframe.

3.3 Control Center

_control-center/index.md is the vault’s operational home page. It contains:

  • A live Dataview query showing publish status by category
  • Meta Bind buttons wired to shell commands: Promote, Lint, Publish, Sync Embeddings
  • Links to each Custom Frames pane
  • ingest-log.md — the append-only record of all INGEST operations

Because the homepage plugin opens this file on vault launch, every authoring session starts with a current view of the pipeline’s state.

3.4 The _dashboards/ Pattern

Dataview dashboards query vault frontmatter at render time. Because the vault is a git repository, the live Dataview data is gitignored and mirrored into _dashboards/data/ by sync_dashboard_data.py. This preserves the vault’s state as a reproducible artifact while allowing dashboards to display current data without committing transient JSON on every render.


4. Dual Knowledge Graph System

The platform maintains two independent graph systems serving distinct purposes. They share no code and are not queried together.

4.1 Graphify: Codebase Navigation

Graphify extracts a knowledge graph from the codebase itself — Python, Swift, PHP, TypeScript, markdown, YAML — using AST parsing for source files and semantic extraction for documentation.

| Metric | Value (June 10, 2026 snapshot) |
|---|---|
| Files analyzed | 2,858 |
| Nodes | 32,102 |
| Edges | 58,938 |
| Communities | 3,131 |
| Edge provenance | 96% EXTRACTED / 4% INFERRED |

The graph is stored as graphify-out/graph.json and rebuilt with make build-graph (graphify . --update, AST-only, no API cost). The incremental update mode means the graph can be kept current without per-session rebuild overhead.

Runtime integration: When a developer or AI agent chats with the Flask GUI, gui/routes/chat.py calls _graphify_context(), which runs graphify query "" --graph graphify-out/graph.json as a subprocess and injects the returned subgraph into the chat context. This means the AI chat assistant answers codebase questions with live graph data rather than stale training knowledge.
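
The subprocess call described above might look roughly like the following. This is a sketch, not the code in gui/routes/chat.py: the function signature, the injectable run parameter, the timeout, and the character cap are all assumptions. The query text appears as an empty string in the command above, so passing the user's question as that positional argument is also an assumption.

```python
import subprocess
from typing import Callable, Sequence

GRAPH_PATH = "graphify-out/graph.json"


def _run(argv: Sequence[str]) -> str:
    """Run a command and return its stdout, or '' on a non-zero exit."""
    result = subprocess.run(argv, capture_output=True, text=True,
                            timeout=30, check=False)
    return result.stdout if result.returncode == 0 else ""


def graphify_context(question: str,
                     run: Callable[[Sequence[str]], str] = _run,
                     max_chars: int = 4000) -> str:
    """Return a context block for the chat prompt, or '' if nothing is found."""
    # Assumed argument order: the question fills the quoted query slot.
    argv = ["graphify", "query", question, "--graph", GRAPH_PATH]
    try:
        output = run(argv)
    except (OSError, subprocess.TimeoutExpired):
        return ""  # a missing CLI or a hang degrades to no graph context
    output = output.strip()
    if not output:
        return ""
    return "Codebase graph context:\n" + output[:max_chars]
```

Returning an empty string on failure keeps the chat usable when the graph is missing or the command is unavailable; the answer simply loses its graph context.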

4.2 Entity/Community Graph: KB Reasoning

The second graph is built over the knowledge base content, not the codebase. It implements the community-based retrieval pattern used in Microsoft’s GraphRAG work, where documents are organized into hierarchical communities and each community receives an LLM-generated summary.

Build chain:


make rebuild-graph-full
# → extract_entities.py      (entity + relationship extraction per document)
# → build_communities.py     (Leiden algorithm via igraph)
# → generate_community_summaries.py  (LLM summarizes each community)

| Metric | Value (June 2026) |
|---|---|
| Nodes | 6,779 |
| Edges | 16,310 |
| Community levels | L0 + L1 (Leiden hierarchical) |
| LLM summaries | 31 |

The community summaries power the global query mode in query_router.py: when a question is thematic or broad (“What patterns appear across accountability profiles?”), the router retrieves the relevant community summaries rather than individual document chunks.

The ego-graph pipeline (generate_ego_graphs.py → sync_ego_graphs.py) produces per-profile network graphs stored as JSON and surfaced via WordPress shortcodes on individual profile pages.
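
The graph query mode listed in Section 2.2 traverses this entity graph breadth-first to surface relationship chains. A minimal sketch of that kind of traversal follows. The edge-list input, the undirected treatment of edges, the depth limit, and the function name bfs_path are assumptions; the stored graph format is not described here.

```python
from collections import deque


def bfs_path(edges, start, goal, max_depth=4):
    """Return the shortest chain of entities from start to goal, or None.

    edges is an iterable of (entity_a, entity_b) pairs, treated as undirected.
    """
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        if len(path) - 1 >= max_depth:
            continue  # do not expand beyond the hop limit
        for neighbor in sorted(adjacency.get(node, ())):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None
```

Because breadth-first search explores one hop at a time, the first path that reaches the goal is also a shortest one, which is what makes it a natural fit for relationship-chain questions.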

4.3 Why Two Graphs

The codebase graph (Graphify) and the knowledge-base graph (entity/community) have different extraction methods, different query interfaces, and different runtimes. Running them as independent systems means each can be rebuilt, queried, and updated on its own schedule without coupling. The query_router.py selects between Pinecone, graph traversal, and community summaries at query time — the dual-graph architecture is the mechanism that makes the routing decision meaningful.


5. Obsidian → WordPress Publishing Pipeline

5.1 Data Flow


knowledgebase/<category>/<slug>.md
    │
    ▼ scripts/pu_publish.py
    │  • Strips leading H1
    │  • Resolves [[slug]] → WordPress permalinks
    │  • Applies iti_md_to_html.convert_md_to_html()
    │  • Handles embed shortcodes: {{timeline}}, {{infographic}}, {{network_graph}}
    │  • Hash-based change detection: skips unchanged files
    │  • Wraps output in <div class="pu-showcase">
    │
    ▼ WordPress REST API
    │  Custom Post Type: epkb_post_type_1 (Echo KB)
    │  publish-manifest.json: slug → wp_post_id + content_hash + permalink (~1.1 MB)
    │
    ▼ scripts/sync_embeddings.py
       AI Engine plugin: mwai/v1/vectors/sync → Pinecone
       sync-post-ids.json: tracks which posts need embedding updates

5.2 Design Properties

Idempotent by design. Running make publish on an unchanged vault is a no-op. pu_publish.py computes a content hash for each file and checks it against publish-manifest.json before making any API call. A full vault publish with no changes completes in seconds.
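
The change-detection step can be sketched as below. The manifest entry fields (wp_post_id, content_hash, permalink) come from the data flow above; the choice of SHA-256 and the function names content_hash and needs_publish are assumptions.

```python
import hashlib


def content_hash(markdown: str) -> str:
    """Hash the page text. SHA-256 is an assumed choice of algorithm."""
    return hashlib.sha256(markdown.encode("utf-8")).hexdigest()


def needs_publish(slug: str, markdown: str, manifest: dict) -> bool:
    """Return True if the page is new or its content differs from the manifest."""
    entry = manifest.get(slug)
    if entry is None:
        return True  # never published
    return entry.get("content_hash") != content_hash(markdown)
```

With a check like this in front of every REST call, a run over an unchanged vault only reads files and compares hashes, which is consistent with a no-change publish finishing in seconds.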

Multi-surface delivery. The same vault markdown feeds:

  • WordPress Echo KB (via pu_publish.py)
  • Pinecone embeddings (via sync_embeddings.py)
  • Hugo portal (static site from vault exports)
  • FastAPI Whoosh index (in-memory full-text search)
  • Native iOS/macOS apps (via FastAPI; PatriotCore package queries the backend)

No surface gets its own copy of the knowledge base. All surfaces derive from the vault.

Link resolution with cascade. When pu_publish.py encounters a [[slug]] wikilink pointing to a page that has not yet been published (no permalink in the manifest), it queues the target for re-render once it gains a permalink. This prevents broken links from temporarily unpublished pages.
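
One way to picture the cascade: resolve what the manifest already knows, and record which source pages are waiting on which unpublished targets. The sketch below assumes a pending map from target slug to the set of pages that reference it, and renders unresolved links as plain text in the meantime; both choices, and the function name resolve_wikilinks, are illustrative rather than taken from pu_publish.py.

```python
import re

WIKILINK = re.compile(r"\[\[([^\]|#]+)\]\]")


def resolve_wikilinks(source_slug: str, markdown: str,
                      manifest: dict, pending: dict) -> str:
    """Replace [[slug]] with a permalink anchor where one exists.

    Unresolved targets are recorded in pending so the source page can be
    re-rendered once the target gains a permalink.
    """
    def replace(match):
        target = match.group(1).strip()
        entry = manifest.get(target)
        if entry and entry.get("permalink"):
            return f'<a href="{entry["permalink"]}">{target}</a>'
        pending.setdefault(target, set()).add(source_slug)
        return target  # assumed fallback: plain text until the target exists

    return WIKILINK.sub(replace, markdown)
```

After a later publish adds the target to the manifest, the pages listed under that target in pending are the ones to re-render.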

Human gate on every step. No part of the pipeline runs autonomously:


make lint              # validate frontmatter + guardrails + link graph
make promote SLUG=x    # move _drafts/x.md → category folder (requires lint pass)
make publish           # push to WordPress REST API
make sync-embeddings   # update Pinecone

Each step requires an explicit human invocation. There is no cron job, webhook, or background process that publishes content.

5.3 Supporting Scripts

| Script | Purpose |
|---|---|
| pu_cleanup.py | Trash stale WordPress posts + delete corresponding Pinecone vectors |
| fix_broken_links.py | Repair wikilinks after slug renames |
| sync_related.py / populate_related.py | Enforce bidirectional related: frontmatter links |
| normalize_frontmatter.py | Re-canonicalize frontmatter after bulk edits |
| generate_ego_graphs.py | Build per-profile network graphs for WordPress shortcodes |
| timeline_extract.py | Extract timeline events → Cool Timelines Pro format |
| validate_frontmatter.py | Schema validation run by make lint |

The shared library in scripts/lib/ provides: frontmatter.py, manifest.py, wp_client.py, linkresolver.py, embed_check.py, guardrails.py, linkgraph.py, kb_index.py, query_router.py, redirects.py, runlog.py, seo.py, timeline_repo.py.


6. Rules Architecture

Rules are the mechanism by which AI agents that work on the codebase or vault are constrained. The system uses a four-tier hierarchy of context documents plus Cursor .mdc rule files.

6.1 CLAUDE.md Hierarchy

Every AI agent session inherits rules from all four levels, with lower levels inheriting from higher levels and never contradicting them:

| Tier | File | Scope |
|---|---|---|
| Global | ~/CLAUDE.md | Cross-project: tool lanes, security rules, session protocol, protected file list |
| Project | ITI/CLAUDE.md | ITI monorepo: shared library patterns, multi-client isolation, delivery philosophy |
| Product | ITI/products/patriot-university/CLAUDE.md | PU-specific: directory map, publishing pipeline, graph layers, accountability governance |
| Vault | patriot-agent-base/knowledgebase/CLAUDE.md | Vault schema: LLM Wiki operations, Obsidian-safe Markdown contract, frontmatter schema |

Each tier adds specificity. The vault-level CLAUDE.md defines exactly what an AI agent is permitted to do when working inside the knowledge base. It does not need to re-state global security rules (no hardcoded credentials, no eval on user data) because those are inherited.

6.2 Protected Files

The global CLAUDE.md defines a protected file list. Files on this list cannot be modified autonomously by any AI agent — changes must be flagged as [CONTEXT-UPDATE] items for the human to apply:

  • CLAUDE.md (all tiers)
  • .cursorrules
  • **/context/*.md
  • .cursor/rules/*.mdc
  • .agents/rules/*.md
  • .agents/skills/*.md
  • ~/.gemini/GEMINI.md

This protection is enforced by instruction, not by file system permissions. An agent that modified a protected file would be violating its own context rules, which is visible in review.
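
Since the protection lives in instructions, a reviewer may still want a quick way to spot protected paths in a change set. The helper below is a hypothetical review aid, not something this document says exists. It reuses the patterns from the list above, strips the leading **/ and ~/ so each pattern can be matched against the trailing portion of a path, and relies on fnmatch-style wildcards.

```python
from fnmatch import fnmatchcase

# Patterns copied from the protected file list above.
PROTECTED_PATTERNS = [
    "CLAUDE.md",
    ".cursorrules",
    "**/context/*.md",
    ".cursor/rules/*.mdc",
    ".agents/rules/*.md",
    ".agents/skills/*.md",
    "~/.gemini/GEMINI.md",
]


def _normalize(pattern: str) -> str:
    """Drop a leading '**/' or '~/' so the pattern matches a path suffix."""
    for prefix in ("**/", "~/"):
        if pattern.startswith(prefix):
            return pattern[len(prefix):]
    return pattern


def is_protected(path: str) -> bool:
    """Return True if any trailing portion of the path matches a pattern."""
    parts = path.strip("/").split("/")
    suffixes = ["/".join(parts[i:]) for i in range(len(parts))]
    return any(fnmatchcase(suffix, _normalize(pattern))
               for suffix in suffixes
               for pattern in PROTECTED_PATTERNS)
```

Matching on path suffixes is what lets the single CLAUDE.md entry cover all four tiers regardless of where each file sits in the tree.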

6.3 Cursor .mdc Rules

.mdc files in .cursor/rules/ are contextually applied to Cursor AI sessions. Three rules are active for this project:

graphify-navigation.mdc (always-on): Before answering any question about where code lives or how projects connect, the agent must query graphify first. Direct grep or read is only used after graph query. This prevents stale answers based on the agent’s training data rather than the current codebase.

graphify.mdc (always-on when graphify-out/graph.json exists): For project-scoped architecture questions, the agent uses graphify query "" rather than reading files directly. Falls back to GRAPH_REPORT.md only for broad architecture review. Falls back to file reads only when the graph does not surface sufficient context.

marketing-doc-freshness.mdc: Applied when editing marketing documents. Triggers re-verification of claims against current codebase state before making updates.

6.4 How Rules Are Applied at Runtime

Rules are injected into the AI agent’s system context at session start. Always-applied rules load unconditionally on every session. Agent-requestable rules load when the agent determines the current task matches the rule’s stated trigger condition.

The net effect is that an agent working on a codebase architecture question has a mandatory graph-first behavior, an agent working on marketing content has a freshness-check behavior, and an agent working in the vault has the LLM Wiki schema in context. These behaviors are not defaults that can be turned off — they are rules the agent operates under.

6.5 Cross-Agent Rules: AGENTS.md

AGENTS.md at the product root provides authoring rules for all AI agents across tools (Cursor, Claude Code, Codex). It points every agent at the vault schema and defines which paths agents are permitted to write to. Its key constraint: no agent may modify vault pages that are already published (status: published in frontmatter) without a human promote step first.


7. Skills Architecture

Skills are structured context documents that load specialist knowledge and behavioral constraints into an AI agent for a specific task. They are not pre-loaded into every session — they are invoked explicitly when a task requires depth that is not covered by the general context.

7.1 Local Specialist Skills (70 files)

patriot-agent-base/skills/ contains 70 specialist skill files, each covering a specific legal, investigative, or editorial domain. These skills are also published as ai-skills articles on the WordPress site — they serve double duty as agent configuration and as public documentation of what the platform knows how to do.

The skills are organized across four domains:

Constitutional Law (9 skills): One skill per constitutional amendment or structural doctrine — First, Fourth, Fifth, Sixth, Eighth, Tenth, Fourteenth, Twenty-Second Amendment, and Separation of Powers. Each skill file carries: the amendment text, foundational Supreme Court precedents, analysis of current litigation patterns (2025–2026), and a list of recognized constitutional law scholars. When the Rights Advisor receives a query that touches a specific amendment, the corresponding skill is injected into the prompt context alongside the KB content.

Immigration Law (4 skills): Know-your-rights at ICE encounters, detention rights, removal defense, and workplace enforcement. Each skill covers the statutory framework, foundational precedents, and current mass-removal litigation patterns.

Legal Practice (5 skills): Appellate brief writing, citation verification (Bluebook compliance, KeyCite/Shepard’s), legal database research methodology, litigation support and eDiscovery, and policy and regulatory tracking.

Investigative and Civic (52 skills): OSINT methodologies, network analysis, accountability tracking, democratic health monitoring, election law, civil resistance theory, voting rights law, and editorial quality gates.

7.2 Global Skills Library (100+ files)

~/.codex/skills/ contains the shared ITI skill library, available across all ITI products. Skills in this library that are relevant to Patriot University include: graphify (codebase graph navigation), tavily-search and tavily-research (web research), postgresql-administration, docker-compose-management, nginx-reverse-proxy, and wordpress-development.

The product-local skills take precedence over global skills when both cover the same domain. The global library provides operational depth; the local skills provide domain depth specific to civic accountability work.

7.3 Skills as Quality Gates

Two skills function primarily as quality gates rather than authoring tools:

patriot-private-citizen-inclusion-gate — Run before any new accountability profile is created. Applies a Non-Speech Anchor Test: a subject cannot be included solely because of their political speech, party affiliation, rally attendance, or political association. There must be a documented non-speech basis (financial fraud, abuse of office, violence, etc.) to trigger inclusion. The skill requires a mandatory “Basis for Inclusion” disclosure block in every profile.

patriot-sanity-check — Run after every accountability profile and after any skill or knowledge document is updated. Checks for: unsupported factual claims, claims that exceed the evidence, disproportionate severity assessments, temporal staleness, source credibility issues, logical inconsistencies, and inadvertent framing bias. The goal is not political balance — it is evidentiary rigor. A profile that correctly documents verified misconduct should pass; a profile that overstates severity or relies on weak sources should not.

These two skills form a gate-and-verify pattern: the inclusion gate runs before content is created, the sanity check runs after. No profile moves from _drafts/ to publishable wiki status without both passing.

7.4 Skill Composition

Skills are designed to chain. The accountability profile workflow demonstrates the pattern:


1. patriot-private-citizen-inclusion-gate   → confirms subject qualifies
2. accountability-profile-verification      → primary-source research, court citation,
                                              confidence scoring, legal risk assessment
3. patriot-sanity-check                    → evidentiary audit of the completed profile
4. make lint                               → frontmatter validation + link graph check
5. make promote SLUG=<slug>                → human review gate
6. make publish                            → WordPress + Pinecone

Each skill in the chain has a defined input (the current draft or the task description) and a defined output (a verification result or a quality-check report). Skills do not modify files — they produce assessments. The human and the pipeline scripts do the file operations.
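
The input and output contract described above can be sketched as a list of functions that each take the draft text and return an assessment, with the chain stopping at the first failure. Everything here is illustrative: the Assessment type, run_chain, and the two stand-in checks are assumptions. Real skills are context documents applied by an agent, not string tests.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Assessment:
    step: str
    passed: bool
    notes: str = ""


def run_chain(draft: str,
              steps: list[Callable[[str], Assessment]]) -> list[Assessment]:
    """Run each step in order; stop at the first failed assessment.

    Steps only read the draft and report. Nothing here writes to disk.
    """
    results = []
    for step in steps:
        assessment = step(draft)
        results.append(assessment)
        if not assessment.passed:
            break
    return results


# Stand-in checks for illustration only.
def inclusion_gate(draft: str) -> Assessment:
    ok = "Basis for Inclusion" in draft
    return Assessment("patriot-private-citizen-inclusion-gate", ok,
                      "" if ok else "missing Basis for Inclusion block")


def sanity_check(draft: str) -> Assessment:
    ok = "[citation needed]" not in draft
    return Assessment("patriot-sanity-check", ok,
                      "" if ok else "unsupported claim flagged")
```

Keeping the steps read-only matches the division of labor described above: the skills report, and the human and the make targets perform the file operations.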


8. Technical Specifications

| Component | Detail |
|---|---|
| Backend | Python 3.12 · FastAPI · Uvicorn · Docker (PostgreSQL 16, Redis 7, MinIO) |
| Native apps | Swift 5.9 · SwiftUI · PatriotCore shared package · SQLCipher |
| WordPress plugin | PHP 8.1+ · WordPress 6.0+ · Echo KB CPT · Gutenberg blocks |
| Static portal | Hugo |
| Ops GUI | Python · Flask · port 8766 |
| Vault | Obsidian (community plugins: dataview, templater, shellcommands, meta-bind, custom-frames) |
| KB scale | ~1,550 publishable .md files (June 2026) |
| Accountability profiles | 986 |
| Jurisdiction voting guides | 59 (50 states + DC + 8 territories) |
| Local specialist skills | 70 |
| Codebase graph | 32,102 nodes · 58,938 edges · 3,131 communities (Graphify) |
| KB entity graph | 6,779 nodes · 16,310 edges · 31 LLM community summaries |
| Embeddings | Pinecone (via WordPress AI Engine plugin) |
| Auth | Invite-code JWT · 4-hour sessions · zero-PII logging |
| Mobile storage | SQLCipher encrypted SQLite |
| Pipeline scripts | 49+ scripts · 12 shared library modules in scripts/lib/ |
| Publish state | publish-manifest.json (~1.1 MB) · kb-index.json (~1.8 MB) |

Source Files Referenced

| File | Purpose |
|---|---|
| patriot-agent-base/knowledgebase/CLAUDE.md | LLM Wiki schema and vault operations |
| patriot-agent-base/knowledgebase/.obsidian/community-plugins.json | Plugin configuration |
| CLAUDE.md (product tier) | Directory map, pipeline, graph layers |
| scripts/pu_publish.py | Core publish engine |
| scripts/lib/query_router.py | Query mode routing |
| scripts/lib/guardrails.py | Lint-time guardrail checks |
| gui/routes/chat.py | Graphify context injection at chat time |
| graphify-out/GRAPH_REPORT.md | Codebase graph stats (June 10, 2026) |
| docs/entity-graph.json · docs/community-graph.json | KB entity graph |
| docs/community-summaries.json | LLM community summaries |
| Makefile | Full pipeline orchestration |
| patriot-agent-base/skills/ (70 files) | Local specialist skills |

For product features, user needs, and competitive context, see the Patriot University AI Project Showcase.
