What can we help you with?
-
AI Skill
- Access Fingerprinter
- Accessibility Design
- Ad Campaign Optimization
- Advisor Action Framework
- AEO RECOMMENDATION TOOL - SYSTEM PROMPT
- Agentic Task Execution
- AI Candor Probe
- AI Citation Tracking
- AI Content Authenticity Detection
- AI Coworker Trust Protocol
- AI Inference Boundary Review
- AI Journalism
- AI Project Showcase Skill
- AI Self-Report Calibration
- AI Vision Diagnosis
- Antigravity Browser QA
- Antigravity Parallel Debug
- Antigravity Test Orchestration
- Apache HTTPD Configuration
- API Design
- Apple Human Interface Design System
- AppSec Engineer — API Security Specialist
- AppSec Engineer — Cloud & Container Security Specialist
- AppSec Engineer — DevSecOps Specialist
- AppSec Engineer — IAM Security Specialist
- AppSec Engineer — Security Testing & Incident Response Specialist
- Arborist / Tree Care Specialist
- Atlanta Gardening
- Atlanta Guide
- B2B Media Consulting
- Botanical Garden Taxonomist
- Botanist / Plant Scientist
- Brand Voice Development
- Business Proposal Evaluation
- Career Assessment
- Celery Task Management
- Chapter 22: Safety & Guardrails
- Chapter 26: Security Standards
- Chapter 28: Cursor Skills
- Children's Garden Educator
- Civic Tech Privacy Architecture
- Claims Integrity Audit
- CloudKit + Tauri Debugging
- Code Review
- Community Engagement Features
- Community Engagement Manager
- Competitive Analysis
- Conservation Biologist
- Content Gap Analysis
- Content Strategy
- Contract Analysis
- Conversational UI Design
- Cooking Technique Tutorial
- Copywriting
- Culinary Knowledge Lookup
- Curator of Living Collections
- Customer Journey Methodology
- Customer Support
- Data Interpretation
- Democratic Health Monitoring
- Dependency Hygiene
- Design Systems
- Dify Knowledge Base Management
- Director of Education
- Director of Horticulture
- Director of Science & Research
- Dive Conditions Forecasting
- Dive Planning
- Dive Site Data Ingestion
- Diversity, Equity & Inclusion (DEI) Coordinator
- Docker Compose Management
- Education Curator
- Education Program Coordinator
- Email Campaign Automation
- Email Parsing — Travel Bookings
- Estate Accounting
- Estate and Trust Management
- Estate Document Extraction
- Estate Jurisdiction Engine
- Estate Manager — Build Plan
- Estate Manager — Updated Product Roadmap
- Estate Professional — CPA / Accountant
- Estate Professional — Elder Law Attorney
- Estate Professional — Enrolled Agent
- Estate Professional — Estate Planning Attorney
- Estate Professional — Financial Advisor
- Estate Professional — Insurance Agent
- Estate Professional — Probate Attorney
- Estate Professional — Probate Litigation Attorney
- Estate Professional — Real Estate Agent
- Estate Professional — Real Estate Appraiser
- Estate Professional — Real Estate Attorney
- Estate Professional — Tax Attorney
- Estate Professional — Title Company
- Estate Task Automation
- EventKit Calendar Sync
- Executive Advisor Board — Build Plan
- Executive Board Advisor
- Executive CCO (Chief Customer Success Officer)
- Executive CEO
- Executive CFO
- Executive CHRO
- Executive CMO
- Executive COO
- Executive CPO (Chief Product Officer)
- Executive CRO (Chief Revenue Officer)
- Executive CTO
- Executive General Counsel
- Expat Planning
- Expat Tax Compliance
- Fact-Checking
- Family Gamification Design
- FastAPI Development
- Federal Register API Integration
- FIFA 2026 World Cup Travel Advisory
- Financial Analysis
- Flask Application Development
- FLUX Image Generation
- FLUX Operations
- Garden Technician
- Gardener / Groundskeeper
- Generative Engine Optimization
- GIS / Mapping Specialist
- Grateful Dead Historian
- Greenhouse Manager
- Greenhouse Technician
- Guided Content Journeys
- Head Gardener / Garden Manager
- Herbarium Curator
- Horticulturist
- Image Generation Service Operations
- Influencer Marketing
- Infrastructure Operations
- Infrastructure Upgrades
- Integrated Pest Management (IPM) Specialist
- Interaction Design
- Internship Program Coordinator
- Interview Coaching Design
- Irrigation Specialist
- ITI Audience Development
- ITI Consulting Intake
- ITI Content Strategy
- ITI Financial Modeling
- ITI MD to Wordpress HTML Converter
- ITI Quality Assurance
- ITI Report Synthesis
- ITI Strategic Planning
- ITI Technology Strategy
- ITI Token Compression Skill
- Java Development
- Journey Mapping
- Landing Page Optimization
- Lead Qualification
- Local SEO Optimization
- Marine Life Identification
- Market Research
- MCP Client for Tauri
- MCP Server Development
- Meal Planning
- Meeting Management
- Mental Load Equity Design
- Multi-Agent Deliberation Design
- Multilingual Content Management
- Music Discovery
- n8n + Dify Testing
- n8n Debugging
- n8n Workflow Development
- News Credibility Scoring
- Nginx Reverse Proxy
- Objection Handling
- Onboarding Design
- Pinecone Embedding Management
- Podcast Production
- Political Speech Analyzer
- PostgreSQL Administration
- Presentation Design
- Press Release Writing
- PRISM ZIP Code → Zone Lookup
- Privacy Compliance
- Product Design
- Product Roadmap Update Prompt
- Professional Selection
- Project Management
- Prompt Auditor
- Proposal Evaluation
- Public Relations Manager
- RabbitMQ Messaging
- Recipe Formatting
- Redis Operations
- Release Management
- Requirements Writing
- Research Associate / Lab Technician
- Retirement Calculator Engine
- Roadmap Build Planning
- Safety Guardrails
- Salary Negotiation Frameworks
- Schema Markup Generation
- School Programs Specialist
- Scope Control
- Scouting Trip Planning
- Screenshot Capture Guide
- Seed Bank Curator
- SEO & AEO Optimization
- Session Context Protocol
- Skills Index
- Social Media Content Calendar
- Stable Diffusion Image Generation
- Tauri Desktop Development
- Tavily & Pinecone Integrations
- Tavily API Quick Reference - Factchecker Plugin
- Tech Debt Analysis
- Technical Writing
- Test Plan Writing
- Therapeutic Horticulture Program Manager
- Travel Planning
- TSP Route Optimization
- UI Design
- UX Research
- Vibe Coding Guardrails
- Video Scripting
- Visual Brand Design
- Volunteer Coordinator
- Weather-Disease Modeling
- Wildlife Habitat Certification Guide
- WordPress SEO Plugin Integration
- Workflow Adapter Integration
- Show Remaining Articles (211) Collapse Articles
-
Product Showcase
- AEO Optimizer Product Showcase
- AI News Cafe Product Showcase
- AI Project Showcase: Journey Mapper (Customer Journey Mapper)
- AI Project Showcase: SEO Assistant with LLM
- Estate Manager Product Showcase
- Executive Advisor Board Product Showcase
- Expat Advisor Showcase
- Factchecker Product Showcase
- Farmers Bounty Product Showcase
- Gardener's Bounty AI Assistant Product Showcase
- GD Claude Chatbot Product Showcase
- IT Influentials Agent POC Product Showcase
- IT Influentials Agent Product Showcase
- IT Influentials Express Agents Product Showcase
- My TravelPlanner Product Showcase
- Patriot Agent Product Showcase
- Patriot University Showcase
- ScubaGPT — Product Showcase
- ScubaGPT Showcase
- WordPress Plugin Clone Safety Checker Showcase
- Show Remaining Articles (5) Collapse Articles
-
ITI Knowledge System
- Chapter 1: Introduction
- Chapter 10: n8n — Debugging & Operations
- Chapter 11: Dify — Knowledge Bases & RAG
- Chapter 12: The ITI Workflow Adapter
- Chapter 13: The ITI Shared Library
- Chapter 14: WordPress Plugin Development
- Chapter 15: Desktop Apps with Tauri 2
- Chapter 16: Python Services
- Chapter 17: iOS & macOS with Swift
- Chapter 18: Claude & the Anthropic API
- Chapter 19: Prompt Engineering
- Chapter 2: Workspace Overview
- Chapter 20: Agents, Skills & Pipelines
- Chapter 21: Knowledge Bases
- Chapter 22: Safety & Guardrails
- Chapter 23: Build Session Protocol
- Chapter 24: Required Product Artifacts
- Chapter 25: Testing
- Chapter 26: Security Standards
- Chapter 27: Deployment
- Chapter 28: Cursor Skills
- Chapter 29: Cursor Rules
- Chapter 3: The Docker Stack
- Chapter 30: MCP Integrations
- Chapter 31: Builder and Agent Roles
- Chapter 32: Builder's Portfolio
- Chapter 33: Claims Integrity & Content Governance
- Chapter 4: Daily Operations
- Chapter 5: Infrastructure Upgrades
- Chapter 6: PostgreSQL & pgvector
- Chapter 7: Redis
- Chapter 8: Nginx Reverse Proxy
- Chapter 9: n8n — Workflow Development
- Show Remaining Articles (18) Collapse Articles
-
AI Agent
-
User Guide
- ADMIN-SHORTCODES.html Update Summary
- Factchecker Plugin - Installation Guide
- Factchecker Plugin - Troubleshooting Guide
- Farmers Bounty - Quick Start Guide
- Farmers Bounty - Troubleshooting Guide
- Farmers Bounty - User Guide
- Farmers Bounty Chatbot - Complete Documentation
- Farmers Bounty Desktop User Guide
- Farmers Bounty Plugin - Gardener's Review Guide
- Farmers Bounty Plugin v6.6.0 - Release Notes
- Farmers Bounty v2.0 - Complete User Guide
- Farmers Bounty v5.3.0 - Complete User Guide
- SEO Assistant with LLM
- 🌱 Farmers Bounty Homepage Shortcode - Quick Start
- 🌱 Farmers Bounty Shortcodes
- 🌹 Grateful Dead Chatbot - Quickstart Guide ⚡
- Show Remaining Articles (1) Collapse Articles
-
Requirements
-
ScubaGPT
-
Grateful Dead Chatbot
-
Farmers Bounty
- 01 current state analysis
- 02 architecture overview
- 03 data sources
- 05 cost analysis
- 06 database schema
- 08 ui ux changes
- 09 ai context optimization
- 10 testing validation
- 11 risk mitigation
- 12 implementation checklist
- ADMIN-SHORTCODES.html Update Summary
- Atlanta Gardening
- Beneficial Insects Guide for Georgia Gardens
- Botanical Garden Taxonomist
- Children's Garden Educator
- Farmers Bounty - Quick Start Guide
- Farmers Bounty - Troubleshooting Guide
- Farmers Bounty - User Guide
- Farmers Bounty Chatbot - Complete Documentation
- Farmers Bounty Desktop User Guide
- Farmers Bounty Plugin - Gardener's Review Guide
- Farmers Bounty Plugin v6.6.0 - Release Notes
- Farmers Bounty v2.0 - Complete User Guide
- Farmers Bounty v5.3.0 - Complete User Guide
- Glossary
- Integrated Pest Management (IPM) Specialist
- PRISM ZIP Code → Zone Lookup
- Public Relations Manager
- Recipe Formatting
- Research Associate / Lab Technician
- School Programs Specialist
- Seed Bank Curator
- Volunteer Coordinator
- Weather-Disease Modeling
- Wildlife Habitat Certification Guide
- 🌱 Farmers Bounty Homepage Shortcode - Quick Start
- 🌱 Farmers Bounty Shortcodes
- Show Remaining Articles (22) Collapse Articles
-
Technical Document
- Accessibility Design
- Agentic Task Execution
- AI Candor Probe
- AI Coworker Trust Protocol
- AI Inference Boundary Review
- AI Vision Diagnosis
- Antigravity Browser QA
- Antigravity Parallel Debug
- Antigravity Test Orchestration
- AppSec Engineer — IAM Security Specialist
- Chapter 22: Safety & Guardrails
- Chapter 26: Security Standards
- Civic Tech Privacy Architecture
- ClaimReview Schema Integration
- Claims Evidence Registry
- Code Review
- IT Influentials Express Agents Product Showcase
- Java Development
- MCP Client for Tauri
- MCP Server Development
- Nginx Reverse Proxy
- Pinecone Embedding Management
- PostgreSQL Administration
- Product Roadmap Update Prompt
- Prompt Auditor
- RabbitMQ Messaging
- Redis Operations
- Release Management
- Retirement Calculator Engine
- Roadmap Build Planning
- Schema Markup Generation
- ScubaGPT — Architecture
- ScubaGPT Safety Guardrails - Quick Reference
- Session Context Protocol
- Stable Diffusion Image Generation
- Tauri Desktop Development
- Tavily & Pinecone Integrations
- Tavily API Quick Reference - Factchecker Plugin
- Tech Debt Analysis
- Test Plan Writing
- Travel Planner — n8n + Dify Integration Guide
- UI Design
- UX Research
- Vibe Coding Guardrails
- WordPress Plugin Clone Safety Checker Showcase
- Workflow Adapter Integration
- Show Remaining Articles (31) Collapse Articles
-
Answer Engine Optimizer
-
SEO Optimizer
-
Travel Planner
-
Fact Checker
-
Estate Manager
-
ITI Operations
- Access Fingerprinter
- Accessibility Design
- Advisor Action Framework
- Agentic Task Execution
- AI Candor Probe
- AI Content Authenticity Detection
- AI Coworker Trust Protocol
- AI Inference Boundary Review
- AI Project Showcase Skill
- AI Self-Report Calibration
- Antigravity Browser QA
- Antigravity Parallel Debug
- Antigravity Test Orchestration
- Apple Human Interface Design System
- AppSec Engineer — API Security Specialist
- AppSec Engineer — DevSecOps Specialist
- Chapter 32: Builder's Portfolio
- CloudKit + Tauri Debugging
- Code Review
- Content Strategy
- Customer Journey Methodology
- Customer Support
- Data Interpretation
- Dependency Hygiene
- End-User Documentation Requirements Document
- Farmers Bounty Plugin - Gardener's Review Guide
- Generative Engine Optimization
- Guided Content Journeys
- Influencer Marketing
- Infrastructure Upgrades
- Interaction Design
- IT Influentials Agent POC Product Showcase
- IT Influentials Agent Product Showcase
- IT Influentials Express Agents Product Showcase
- ITI Audience Development
- ITI Consulting Intake
- ITI Financial Modeling
- ITI Quality Assurance
- ITI Report Synthesis
- ITI Strategic Planning
- ITI Token Compression Skill
- Market Research
- MCP Server Development
- Multi-Agent Deliberation Design
- Multilingual Content Management
- n8n Debugging
- n8n Workflow Development
- Pinecone Embedding Management
- Privacy Compliance
- Product Roadmap Update Prompt
- Project Management
- Prompt Auditor
- Proposal Evaluation
- Redis Operations
- Release Management
- Requirements Writing
- Roadmap Build Planning
- Safety Guardrails
- Scope Control
- Screenshot Capture Guide
- Stable Diffusion Image Generation
- Tavily & Pinecone Integrations
- Technical Writing
- Test Plan Writing
- UI Design
- UX Research
- Vibe Coding Guardrails
- Wordpress Plugin Install Safety Features
- Show Remaining Articles (53) Collapse Articles
-
ITI Marketing
- Articles coming soon
-
Patriot University
-
Personal Assistant
< All Topics
Print
Dify Knowledge Base Management
Posted
Updated
ByPeter Westerman
name: dify-knowledge-base-management
description: Managing Dify knowledge bases including dataset creation, document ingestion, chunking strategy, embedding configuration, and semantic retrieval. Use when building or maintaining RAG pipelines, uploading documents to Dify, or troubleshooting retrieval quality.
Dify Knowledge Base Management
Instructions
Manage the full Dify knowledge base lifecycle for retrieval-augmented generation.
Dataset creation:
- Create datasets via Console API or UI with descriptive names and clear descriptions
- Supported document formats: md, txt, csv, json, pdf, docx, xlsx, html, xml
- One dataset per logical domain (e.g., “product-docs”, “support-tickets”, “policies”)
Chunking strategy:
- Use automatic mode (recommended) — Dify handles paragraph and section splitting
- For custom: set max chunk size 500-1000 tokens with 50-100 token overlap
- Pre-clean documents before upload: strip headers/footers, fix encoding, remove boilerplate
Embedding model selection:
- Use
text-embedding-3-smallvia pgvector for cost-effective semantic search - Embedding dimensions: 1536 (default) — sufficient for most retrieval tasks
- pgvector stores embeddings in PostgreSQL with HNSW or IVFFlat indexing
Semantic retrieval testing:
- Test queries against the dataset before wiring into applications
- Evaluate top-k results for relevance; adjust chunk size if results are too broad or narrow
- Use similarity score thresholds to filter low-confidence matches
Service API for runtime retrieval:
POST /v1/datasets/{dataset_id}/retrievewith Bearer token authentication- Request body:
{"query": "search text", "retrieval_model": {"search_method": "semantic_search", "top_k": 5}} - Use dataset-scoped API tokens, not user-level tokens
Console API authentication workaround:
- Generate tokens via
docker execinto the dify-api container when Console API is unreliable - Direct DB token creation: insert dataset-scoped API tokens into the
api_tokenstable - Scope tokens to specific datasets to follow least-privilege principles
Maintenance:
- Re-index after bulk document updates or embedding model changes
- Monitor retrieval latency — slow queries may indicate missing pgvector indexes
- Archive stale datasets rather than deleting to preserve audit trail
Table of Contents
