Skip to main content
< All Topics
Print

Chapter 6: PostgreSQL & pgvector

Chapter 6: PostgreSQL & pgvector

Last Updated: 2026-03

## 6.1 Overview

PostgreSQL with the pgvector extension is the primary data store for the entire ITI platform. It hosts three databases, supports vector similarity search for AI features, and is the source of truth for all persistent state.

Container: iti-postgres

Image: pgvector/pgvector:pg16

Host port: 5432

Connection: postgresql://postgres:@localhost:5432/

6.2 The Three Databases

Database Owner Purpose
n8n n8n Workflow definitions, execution history, encrypted credentials
dify Dify API Knowledge base documents, chunks, embeddings (vectors), app configs
dify_plugin Plugin daemon Plugin state and configuration

All three databases are created automatically by init-db.sql on the first container start.


6.3 Connecting to PostgreSQL

Via docker exec (no client required)


# Connect to the n8n database
docker exec -it iti-postgres psql -U postgres -d n8n

# Connect to the Dify database
docker exec -it iti-postgres psql -U postgres -d dify

# Run a one-off query
docker exec iti-postgres psql -U postgres -d dify -c "SELECT COUNT(*) FROM documents;"

Common psql commands

Command Action
\l List all databases
\dt List all tables in current database
\d
Describe a table
\q Quit
\timing Toggle query timing

6.4 pgvector

The pgvector extension enables vector similarity search directly in PostgreSQL. Dify uses it to store document chunk embeddings and perform semantic search during RAG retrieval.

Verify pgvector is installed


docker exec iti-postgres psql -U postgres -d dify -c \
  "SELECT extname, extversion FROM pg_extension WHERE extname = 'vector';"

How Dify uses pgvector

When a document is added to a Dify knowledge base:

  1. The document is chunked into segments (configurable chunk size and overlap).
  2. Each segment is embedded using the configured embedding model (text-embedding-3-small by default).
  3. The embedding vector (1536 dimensions for text-embedding-3-small) is stored in the dify database via pgvector.
  4. At query time, the query is embedded and compared against stored vectors using cosine similarity.

Index types

pgvector supports two index types for approximate nearest neighbor (ANN) search:

Index Type Use Case Creation
HNSW High-recall, fast query, higher memory CREATE INDEX ON embeddings USING hnsw (embedding vector_cosine_ops)
IVFFlat Lower memory, slightly lower recall CREATE INDEX ON embeddings USING ivfflat (embedding vector_cosine_ops)

Dify manages index creation automatically. You do not need to create these manually.


6.5 Database Maintenance

Check database sizes


docker exec iti-postgres psql -U postgres -c \
  "SELECT datname, pg_size_pretty(pg_database_size(datname)) FROM pg_database ORDER BY pg_database_size(datname) DESC;"

Check table sizes within Dify database


docker exec iti-postgres psql -U postgres -d dify -c \
  "SELECT tablename, pg_size_pretty(pg_total_relation_size(tablename::text)) AS size
   FROM pg_tables WHERE schemaname = 'public'
   ORDER BY pg_total_relation_size(tablename::text) DESC LIMIT 10;"

Vacuum and analyze (reclaim space)


docker exec iti-postgres psql -U postgres -d dify -c "VACUUM ANALYZE;"
docker exec iti-postgres psql -U postgres -d n8n -c "VACUUM ANALYZE;"

Run VACUUM ANALYZE monthly or after large deletions (e.g., after removing a large knowledge base).

Check for long-running queries


docker exec iti-postgres psql -U postgres -c \
  "SELECT pid, now() - pg_stat_activity.query_start AS duration, query, state
   FROM pg_stat_activity
   WHERE (now() - pg_stat_activity.query_start) > interval '5 minutes';"

6.6 Backup and Restore

Covered in detail in Chapter 4 — Daily Operations, Section 4.4 and 4.5. Summary:

Backup individual database


docker exec iti-postgres pg_dump -U postgres n8n > Archives/n8n-db-$(date +%Y%m%d).sql
docker exec iti-postgres pg_dump -U postgres dify > Archives/dify-db-$(date +%Y%m%d).sql

Restore


docker exec -i iti-postgres psql -U postgres -d n8n < Archives/n8n-db-YYYYMMDD.sql

6.7 Security

  • The PostgreSQL password is stored in .env as POSTGRES_PASSWORD and never hardcoded in any application code.
  • The PostgreSQL port (5432) is exposed to localhost only. It is not accessible from outside the machine.
  • Only the application containers (iti-n8n, iti-dify-api, iti-dify-worker) connect to PostgreSQL — each using the shared postgres superuser account within the Docker network.

Note: For production deployments on a remote server, each database should have a dedicated user with minimal privileges rather than sharing the superuser account.


Previous: Chapter 5 — Infrastructure Upgrades | Next: Chapter 7 — Redis

Table of Contents