Open-source low-code platform for building and deploying agentic AI workflows and RAG pipelines. 130k+ GitHub stars. Drag-and-drop workflow builder, built-in RAG, model marketplace, and one-click API deployment.
Dify is a low-code platform for building LLM-powered applications. Instead of writing Python to call the Anthropic API and manage prompts, you configure applications through a visual interface. Dify handles the infrastructure: prompt management, model switching, RAG pipeline setup, API endpoints, monitoring, and rate limiting.
With 130k+ GitHub stars, it's one of the most popular open-source LLM platforms. It's particularly strong for teams that need to: iterate on prompts without deploying code, build RAG pipelines without writing vector store logic, give non-engineers the ability to configure AI features, or ship a REST API from an LLM workflow in minutes.
Dify is NOT the right choice when: you need full programmatic control, your workflow logic is too complex for visual representation, you're already deep in LangChain/LangGraph and comfortable there, or you need to avoid cloud dependencies for compliance reasons (though self-hosting addresses this).
Dify has three main app types:
Chatbot: the simplest type. Configure a system prompt, choose a model, add knowledge bases (RAG), and Dify creates a chat interface + REST API. No coding needed.
Workflow: a visual DAG (Directed Acyclic Graph) of LLM calls, tool calls, conditionals, code blocks, and data transformations. Think n8n or Zapier but specifically for LLM pipelines. Each node is a step; edges connect outputs to inputs.
Agent: a ReAct-style agent with access to tools you configure (web search, database query, custom API calls). The agent decides at runtime which tools to call and in what order. Dify handles the tool dispatch loop.
All three types can be published as REST APIs with one click, embedded in a web interface, or accessed via Dify's hosted chat interface.
Dify's RAG setup is entirely visual:
1. Create a Knowledge Base: upload PDFs, connect a web crawler, or sync from Notion/GitHub. Dify chunks, embeds, and indexes documents automatically. You configure chunk size, overlap, embedding model, and index type (keyword, vector, or hybrid).
2. Create a Chatbot app and attach the knowledge base. Configure how many chunks to retrieve (top-K) and the relevance threshold.
3. Customise the citation format: Dify can show inline citations, source previews, or hide sources entirely.
The result is a RAG chatbot accessible via REST API:
```python
import requests

DIFY_API_KEY = "app-..."
DIFY_BASE_URL = "https://api.dify.ai/v1"

def chat_with_dify(message: str, conversation_id: str = "") -> dict:
    response = requests.post(
        f"{DIFY_BASE_URL}/chat-messages",
        headers={"Authorization": f"Bearer {DIFY_API_KEY}"},
        json={
            "inputs": {},
            "query": message,
            "conversation_id": conversation_id,
            "response_mode": "blocking",
            "user": "user-123",
        },
    )
    return response.json()

result = chat_with_dify("What are the key findings in the Q3 report?")
print(result["answer"])
print(result.get("metadata", {}).get("retriever_resources", []))  # Citations
```
Dify's visual workflow builder lets you chain LLM calls with logic:
Available node types: LLM (call any configured model), Knowledge Retrieval (RAG lookup), HTTP Request (call any REST API), Code (run Python/JavaScript snippets), If-Else (conditional branching), Iteration (loop over list items), Variable Aggregator (merge branches), and many more.
A typical document processing workflow:
```text
Start → [HTTP: fetch URL] → [LLM: extract key facts] → [If-Else: has enough facts?]
  YES → [LLM: generate report] → [HTTP: post to Slack] → End
  NO  → [LLM: request clarification] → End
```
The visual representation makes this understandable to non-engineers and easy to iterate on without code deploys. Each workflow node has configurable prompts, variables, and error handling. The workflow is exported as a YAML file for version control.
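To give a sense of what version control sees, here is an illustrative sketch of an exported workflow DSL file. The field names and structure are approximate and vary across Dify versions; treat this as a shape, not a schema.

```yaml
# Illustrative only -- check an actual export from your Dify version
# for the exact field names.
app:
  name: document-processor
  mode: workflow
workflow:
  graph:
    nodes:
      - id: start
        type: start
      - id: fetch
        type: http-request
      - id: extract
        type: llm
        data:
          prompt: "Extract key facts from the fetched document."
    edges:
      - source: start
        target: fetch
      - source: fetch
        target: extract
```

Because the export is plain YAML, ordinary diff tooling shows exactly which prompts or edges changed between two committed versions.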
Every Dify app gets an API key and REST endpoints automatically. The API supports both blocking (synchronous) responses and streaming via server-sent events:
```python
import json
import requests

# Workflow API
def run_workflow(inputs: dict) -> dict:
    response = requests.post(
        "https://api.dify.ai/v1/workflows/run",
        headers={"Authorization": "Bearer app-..."},
        json={
            "inputs": inputs,
            "response_mode": "blocking",
            "user": "user-123",
        },
    )
    return response.json()

result = run_workflow({
    "document_url": "https://example.com/report.pdf",
    "output_format": "bullet_points",
})
print(result["data"]["outputs"])

# Streaming mode for long-running workflows
def run_workflow_streaming(inputs: dict):
    response = requests.post(
        "https://api.dify.ai/v1/workflows/run",
        headers={"Authorization": "Bearer app-..."},
        json={"inputs": inputs, "response_mode": "streaming", "user": "user-123"},
        stream=True,
    )
    for line in response.iter_lines():
        if line.startswith(b"data: "):
            event = json.loads(line[6:])
            print(event.get("data", {}).get("text", ""), end="", flush=True)
```
Dify can be self-hosted with Docker Compose in minutes:
```bash
git clone https://github.com/langgenius/dify.git
cd dify/docker
cp .env.example .env
# Edit .env: set SECRET_KEY, database passwords, S3 config
docker compose up -d
```
The Docker Compose stack includes: Dify API server, Dify Worker (async task queue), Nginx (reverse proxy), PostgreSQL (metadata), Redis (cache/queue), Weaviate (vector store), and MinIO (file storage). Everything runs locally — no data leaves your infrastructure.
For production deployments, use the enterprise version or deploy on Kubernetes with the official Helm chart. Key configuration options: swap Weaviate for Qdrant or Pinecone, use your own PostgreSQL, configure SSO for team access.
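Swapping components is done through `.env` before bringing the stack up. The variable names below are an assumption based on common self-hosted Dify setups; verify them against the `.env.example` shipped with your version.

```bash
# Hypothetical .env overrides -- confirm exact variable names in
# your version's .env.example before relying on them.
VECTOR_STORE=qdrant
QDRANT_URL=http://qdrant:6333
QDRANT_API_KEY=...

# Point at an external PostgreSQL instead of the bundled container
DB_HOST=my-postgres.internal
DB_PORT=5432
DB_PASSWORD=...
```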
Workflow logic has limits. Simple conditional branching works well. Complex nested loops, dynamic tool selection, or multi-agent coordination quickly exceed what the visual builder handles elegantly. When your workflow needs Python for logic, you're probably better off in LangGraph or a custom framework.
Version control is manual. Dify exports workflows as YAML, but it doesn't integrate with git natively. You need to manually export and commit. Larger teams should establish a workflow export discipline — otherwise "what changed in last week's deploy?" is unanswerable.
Self-hosting operational complexity is real. The Docker Compose stack is easy to start but has 8 services to maintain. Upgrades occasionally require manual database migrations. For small teams without DevOps capacity, the Dify cloud offering reduces this burden significantly.
Dify provides a visual workflow builder for constructing LLM applications without requiring deep programming expertise. Its node-based interface lets users chain LLM calls, tool invocations, conditional logic, and data transformations into production-ready pipelines that can be deployed as APIs or embedded in web applications.
| Component | Purpose | Key Config | Use Case |
|---|---|---|---|
| LLM Node | Model inference | Model, prompt, temp | Any generation step |
| Knowledge Retrieval | Semantic search | Dataset, top-K, threshold | RAG pipelines |
| Tool Node | External API call | Tool name, params | Web search, calculators |
| Code Node | Python execution | Script, inputs/outputs | Data transformation |
| If/Else | Conditional routing | Condition expression | Intent routing |
| Iterator | Loop over list | Input array, sub-flow | Batch processing |
Dify's dataset management handles the full RAG document lifecycle: uploading documents, chunking them into segments, generating embeddings, and storing them in a vector database. The platform supports multiple chunking strategies (fixed-length, sentence-based, paragraph-based) and allows configuring overlap between chunks. For most document types, paragraph-based chunking with moderate overlap produces the best retrieval precision because it preserves semantic units rather than splitting sentences mid-thought.
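The paragraph-plus-overlap idea can be sketched in a few lines. This is not Dify's implementation, just a minimal illustration of greedily packing paragraphs into size-bounded chunks while carrying trailing paragraphs forward so context straddles chunk boundaries.

```python
def chunk_by_paragraph(text: str, max_chars: int = 500, overlap: int = 1) -> list[str]:
    """Greedily pack paragraphs into chunks of at most ~max_chars,
    repeating the last `overlap` paragraphs at the start of the next
    chunk so retrieval doesn't lose cross-boundary context."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks: list[str] = []
    current: list[str] = []
    for para in paragraphs:
        if current and len("\n\n".join(current + [para])) > max_chars:
            chunks.append("\n\n".join(current))
            current = current[-overlap:]  # carry overlap paragraphs forward
        current.append(para)
    if current:
        chunks.append("\n\n".join(current))
    return chunks
```

A larger `overlap` improves recall for queries that span paragraph boundaries at the cost of index size, which mirrors the overlap setting exposed in Dify's knowledge base configuration.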
The annotation feature in Dify enables human-in-the-loop quality improvement. When a query produces a low-confidence or incorrect response, operators can annotate the correct answer directly in the Dify interface. These annotations are stored as high-priority examples that override the LLM response for similar future queries using embedding similarity matching, creating a progressively improving knowledge base without requiring model retraining.
Dify's variable system allows dynamic content to flow between nodes using a Jinja-like template syntax. Output from one node is referenced in subsequent nodes using double-brace notation, enabling complex data transformations across the workflow without requiring custom code. System variables like user inputs, session metadata, and timestamp are available globally, while node outputs are scoped to the current workflow execution context and accessible to all downstream nodes.
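The resolution step can be illustrated with a small substitution function. The `{{node.key}}` pattern here is a simplification; Dify's actual reference syntax differs slightly between versions, so treat this as a sketch of the mechanism rather than the exact grammar.

```python
import re

def resolve(template: str, context: dict[str, dict]) -> str:
    """Replace {{node.key}} references in a node's prompt template with
    values from upstream node outputs. Illustrative only -- not Dify's
    actual template engine."""
    def lookup(match: re.Match) -> str:
        node, key = match.group(1), match.group(2)
        return str(context[node][key])
    return re.sub(r"\{\{(\w+)\.(\w+)\}\}", lookup, template)
```

Because each node's outputs are namespaced by node id, two nodes can both emit a `text` field without colliding downstream.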
Multi-turn conversation state management in Dify uses conversation variables that persist across turns within a session. Unlike regular workflow variables that reset per execution, conversation variables accumulate context — previous questions, established facts, user preferences — that subsequent turns can reference. This enables agents that remember what the user said three turns ago without appending the full conversation history to every prompt, reducing token costs while maintaining conversational coherence.
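The two scopes can be sketched as a tiny state holder: one dict that survives the session, one that is rebuilt per execution. The class and field names are hypothetical, purely to illustrate the scoping rule, not Dify's internal API.

```python
class ConversationState:
    """Sketch of conversation-scoped vs execution-scoped variables
    (hypothetical names -- not Dify's internals)."""

    def __init__(self):
        # Persists across every turn in the session
        self.conversation_vars: dict[str, object] = {}

    def run_turn(self, query: str) -> dict:
        # Execution-scoped variables reset on every workflow run
        workflow_vars: dict[str, object] = {"query": query}
        # A node may promote an established fact into conversation scope:
        if "my name is" in query.lower():
            self.conversation_vars["user_name"] = query.rsplit(" ", 1)[-1]
        # Later turns read the accumulated context without replaying history
        workflow_vars["known_user"] = self.conversation_vars.get("user_name")
        return workflow_vars
```

Only the distilled facts are carried forward, which is why this is cheaper in tokens than prepending the full transcript to every prompt.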
Dify supports webhook triggers that allow external events to initiate workflow executions without human interaction. A webhook endpoint configured on a Dify workflow can receive payloads from GitHub, Slack, or any HTTP-capable service, parse the payload using a code node, and trigger the appropriate LLM processing pipeline. This enables event-driven AI automation where document uploads, ticket creations, or form submissions automatically invoke AI analysis without requiring polling or manual intervention.
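The payload-parsing step the code node performs can be sketched as a mapping from a source-specific payload to the flat inputs a workflow expects. The field names below are hypothetical examples of GitHub- and Slack-shaped payloads, not a fixed contract:

```python
def payload_to_workflow_inputs(source: str, payload: dict) -> dict:
    """Map an inbound webhook payload to workflow inputs. Field names
    are illustrative; in practice a Code node inside the workflow does
    this normalization against the real payload schema."""
    if source == "github":
        return {
            "event_type": "issue",
            "title": payload["issue"]["title"],
            "body": payload["issue"].get("body", ""),
        }
    if source == "slack":
        return {
            "event_type": "message",
            "title": "",
            "body": payload["event"]["text"],
        }
    raise ValueError(f"unknown webhook source: {source}")
```

Normalizing to one input shape keeps the downstream LLM nodes source-agnostic: the same analysis pipeline runs whether the trigger was a ticket, a chat message, or a form submission.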