Agents are powerful but expensive and unpredictable. Deterministic pipelines are cheaper and debuggable. A decision framework for choosing the right architecture for your task.
A pipeline is a fixed sequence of LLM calls and code steps that you control completely — retrieve, extract, summarise, format. Every step is predictable, fast, and cheap. An agent uses an LLM to decide which tools to call and in what order, looping until the task is complete. Agents handle open-ended tasks that don't fit a fixed sequence, but they're slower, more expensive, harder to debug, and can fail in unexpected ways.
The practical advice: start with a pipeline and only reach for agents when the task genuinely requires dynamic decision-making that you can't encode in a fixed workflow.
Use a deterministic pipeline when:

- You can write every step as a fixed flowchart or DAG.
- Latency must stay low and predictable (e.g. under 5 seconds).
- Cost per request must be bounded.
- You need a high, measurable success rate (e.g. above 95%).

Use an agent when the task genuinely requires dynamic decision-making:

- The task involves open-ended research or exploration.
- The right tool must be selected at runtime from many (5+) options.
- Task length varies wildly (1 to 50 steps) and can't be predetermined.
Rule of thumb: if you can write a flowchart for the task in 15 minutes, use a pipeline. If the flowchart has too many branches to draw, consider an agent.
Most production systems are hybrids: an agent orchestrates high-level decisions while pipelines handle well-defined sub-tasks. A document analysis flow, for example, might keep ingestion and formatting as fixed pipeline steps while delegating the core analysis to a bounded agent.
```python
def choose_architecture(answers: dict[str, bool]) -> str:
    """Decide pipeline vs agent from yes/no answers to these questions."""
    questions = [
        ("Can you write all steps as a fixed flowchart?", "pipeline"),
        ("Is latency under 5 s required?", "pipeline"),
        ("Is a bounded cost per request critical?", "pipeline"),
        ("Does the success rate need to exceed 95%?", "pipeline"),
        ("Does the task require open-ended web research?", "agent"),
        ("Does the task require tool selection from 5+ options?", "agent"),
        ("Does task length vary wildly (1-50 steps)?", "agent"),
    ]
    # Only reach for an agent when all three agent-side conditions hold;
    # otherwise default to the cheaper, more predictable pipeline.
    agent_questions = [q for q, arch in questions if arch == "agent"]
    if all(answers.get(q, False) for q in agent_questions):
        return "agent"
    return "pipeline"
```

A practical decision tree:

1. Can I write a fixed DAG? Yes -> pipeline.
2. Is this customer-facing, where latency and reliability matter? Yes -> pipeline.
3. Is this an internal or research task with a flexible budget? Maybe an agent.
4. Does the task have genuinely open-ended structure? Yes -> agent.
For a "research and summarise" task, where the steps genuinely depend on what is found along the way, an agent can earn its cost. For a simple extraction task where a pipeline works perfectly, using an agent is ~23× more expensive and ~5× slower. Over 1M requests/month, that's the difference between $3k and $70k in LLM costs. Use agents only where their dynamic capability is actually needed.
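The arithmetic behind those numbers can be sketched as a back-of-envelope model. The per-request cost below is an illustrative assumption, not a measurement; only the ~23× ratio and monthly volume come from the comparison above:

```python
def monthly_llm_cost(cost_per_request: float, requests_per_month: int) -> float:
    """Project monthly LLM spend from per-request cost and volume."""
    return cost_per_request * requests_per_month

# Hypothetical figure: $0.003/request for a two-step pipeline,
# and ~23x that for an agent doing the same extraction.
pipeline_monthly = monthly_llm_cost(0.003, 1_000_000)        # ~$3,000
agent_monthly = monthly_llm_cost(0.003 * 23, 1_000_000)      # ~$69,000
```

Plugging in the numbers makes the pattern obvious: per-request differences that look negligible in a demo compound linearly with volume.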
Pipeline implementations use deterministic control flow — if/else branching, for loops, function composition — to sequence LLM calls and tool invocations. The predictable execution graph makes pipelines easy to test, debug, and optimize: each step can be unit-tested independently, failures are isolated to specific steps, and latency is predictable because the execution path is fixed. Agent implementations use the LLM itself to determine what to do next, creating dynamic execution graphs that adapt to task requirements but are harder to test comprehensively because all execution paths cannot be enumerated.
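That step-level testability can be shown concretely: a single pipeline step can be unit-tested against a faked client, with no network and no cost. The `summarize_step` helper and the mock wiring below are illustrative, not part of the OpenAI SDK:

```python
from unittest.mock import MagicMock

def summarize_step(client, text: str) -> str:
    """One isolated pipeline step: summarize the given text."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user",
                   "content": f"Summarize in 3 sentences:\n{text}"}],
    )
    return response.choices[0].message.content

# Unit test with a mocked client: deterministic, fast, free.
fake = MagicMock()
fake.chat.completions.create.return_value.choices[0].message.content = "short summary"
assert summarize_step(fake, "some long text") == "short summary"
```

Because each step takes its inputs explicitly, the same pattern covers every step in the pipeline; agents resist this kind of isolation because the step sequence itself is not fixed.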
| Dimension | Pipeline | Agent |
|---|---|---|
| Control flow | Deterministic (code) | Dynamic (LLM-decided) |
| Testability | High (unit-testable steps) | Low (non-deterministic paths) |
| Latency | Predictable | Variable (unknown steps) |
| Cost | Fixed per run | Variable (depends on task) |
| Failure modes | Localized, recoverable | Cascading, harder to diagnose |
```python
from openai import OpenAI

client = OpenAI()

# Pipeline approach: two deterministic steps, fixed cost and latency.
def pipeline_summarize_and_translate(text: str, target_lang: str) -> str:
    # Step 1: summarize
    summary = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user",
                   "content": f"Summarize in 3 sentences:\n{text}"}],
    ).choices[0].message.content

    # Step 2: translate (always runs, predictable cost)
    translated = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user",
                   "content": f"Translate to {target_lang}:\n{summary}"}],
    ).choices[0].message.content
    return translated

# vs the agent approach: the LLM decides whether translation is needed
# (more flexible, but with unpredictable cost and step count).
```
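The agent alternative can be sketched as a loop in which a policy picks the next action; in production that policy is an LLM call with tool schemas, but here a toy policy stands in so the control flow is runnable. All names below are illustrative:

```python
from typing import Callable

def run_agent(task: str,
              decide_next_action: Callable[[str, list[str]], str],
              tools: dict[str, Callable[[str], str]],
              max_steps: int = 10) -> str:
    """Agent loop: the policy picks the next tool until it says 'finish'."""
    history: list[str] = []
    for _ in range(max_steps):
        action = decide_next_action(task, history)  # in production: an LLM call
        if action == "finish":
            break
        result = tools[action](task)
        history.append(f"{action}: {result}")
    return "\n".join(history)

# Toy policy and tool so the loop runs without an LLM.
def toy_policy(task: str, history: list[str]) -> str:
    return "search" if not history else "finish"

out = run_agent("find the capital of France",
                toy_policy,
                {"search": lambda t: "Paris"})
# out == "search: Paris"
```

The structural difference is visible in the types: the pipeline's call sequence is fixed in code, while here the sequence only exists at runtime, as a function of whatever the policy decides.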
Monitoring and observability requirements differ substantially between pipelines and agents. Pipeline monitoring logs fixed-step execution traces — step name, input, output, latency, errors — that map naturally to structured logging and distributed tracing systems. Agent monitoring must capture dynamic execution traces of variable length, track tool call sequences that were not predetermined, and identify when agents loop or make unproductive sequences of calls. Specialized agent tracing tools like LangSmith, Langfuse, and Arize Phoenix are designed for this dynamic trace structure, providing visualizations and analysis that general-purpose APM tools don't support.
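As a sketch of what such a dynamic trace might record — assuming nothing about any particular tracing tool's schema; the field names and the crude loop check below are illustrative:

```python
from dataclasses import dataclass, field

@dataclass
class AgentStepTrace:
    """One entry in a variable-length agent trace."""
    step: int
    tool: str
    tool_input: str
    tool_output: str
    latency_ms: float

@dataclass
class AgentRunTrace:
    task: str
    steps: list[AgentStepTrace] = field(default_factory=list)

    def looks_stuck(self, window: int = 3) -> bool:
        """Crude loop detection: same tool repeated over the last few steps."""
        recent = [s.tool for s in self.steps[-window:]]
        return len(recent) == window and len(set(recent)) == 1
```

Even this minimal record supports the questions pipeline-style logging cannot answer: how many steps did this run take, which tools did it cycle through, and did it stall.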
Cost control is fundamentally different between pipelines and agents. Pipeline costs are deterministic and predictable — the cost per request equals the sum of token costs for each fixed step. Agent costs are variable and potentially unbounded — a misbehaving agent can make hundreds of tool calls before giving up, incurring costs orders of magnitude higher than expected. Budget limits (maximum LLM calls, maximum token spend per request) are essential safety mechanisms for agent deployments, as are circuit breakers that terminate agent loops when no progress is detected after a configurable number of steps.
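A minimal sketch of such a budget guard, with hypothetical class names and thresholds — the agent loop would call `charge` after every LLM call and let the exception terminate the run:

```python
class AgentBudgetExceeded(Exception):
    """Raised when an agent run exceeds its hard spend limits."""

class AgentBudget:
    """Hard per-request limits on agent spend (names/limits illustrative)."""

    def __init__(self, max_llm_calls: int = 20, max_tokens: int = 50_000):
        self.max_llm_calls = max_llm_calls
        self.max_tokens = max_tokens
        self.llm_calls = 0
        self.tokens = 0

    def charge(self, tokens_used: int) -> None:
        """Record one LLM call; raise once either limit is exceeded."""
        self.llm_calls += 1
        self.tokens += tokens_used
        if self.llm_calls > self.max_llm_calls:
            raise AgentBudgetExceeded(f"exceeded {self.max_llm_calls} LLM calls")
        if self.tokens > self.max_tokens:
            raise AgentBudgetExceeded(f"exceeded {self.max_tokens} tokens")
```

A progress-based circuit breaker layers on top of this: instead of counting calls, it compares recent tool outputs and aborts when nothing new is being produced after a configurable number of steps.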
The decision between pipeline and agent architecture frequently comes down to the variance in the input task structure. If all tasks have the same structure and require the same steps, a pipeline is always more appropriate — it is cheaper, faster, more reliable, and easier to maintain than an agent performing the same deterministic sequence. Agents add value only when task structure genuinely varies and the correct sequence of steps cannot be determined without reasoning about the specific task. Many applications that are initially built as agents can be refactored as pipelines once the actual distribution of task types is understood from production data.
Pipeline composition using function chaining or dataclass-based message passing provides explicit data flow visibility that agents lack. In a pipeline, the output type of each step is the input type of the next step — type annotations and runtime validation can enforce this contract, catching mismatches at development time rather than in production. This strongly-typed data flow is one of the most undervalued advantages of pipelines over agents: the compiler or type checker can verify the entire pipeline's structure before any LLM calls are made, enabling safe refactoring and early error detection that agentic architectures with dynamic tool selection cannot provide.
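A minimal sketch of that dataclass-based message passing, with stand-in functions in place of real LLM calls — the types, not the logic, are the point; all names are illustrative:

```python
from dataclasses import dataclass

@dataclass
class RawDocument:
    text: str

@dataclass
class Summary:
    sentences: list[str]

@dataclass
class Translation:
    language: str
    text: str

def summarize(doc: RawDocument) -> Summary:
    # Stand-in for an LLM call; only the type contract matters here.
    return Summary(sentences=doc.text.split(". ")[:3])

def translate(summary: Summary, language: str) -> Translation:
    # Stand-in for an LLM call.
    return Translation(language=language, text=" ".join(summary.sentences))

def pipeline(doc: RawDocument) -> Translation:
    # Each step's output type is the next step's input type; a type
    # checker verifies this chain before any LLM call is ever made.
    return translate(summarize(doc), "fr")
```

Swapping two steps or passing a `RawDocument` where a `Summary` is expected becomes a static error under mypy or pyright, rather than a malformed prompt discovered in production.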
Hybrid architectures that use pipelines for the predictable outer structure and agents for bounded sub-tasks within that structure often provide the best practical tradeoff. For example, a document analysis pipeline might have fixed steps for document ingestion, chunking, and output formatting, but delegate the core analysis to an agent with a limited set of domain-specific tools. This structure preserves pipeline predictability for the high-level flow while allowing agent flexibility for the analysis step where the reasoning path genuinely varies. Bounding agent execution (max steps, specific tool set, structured output schema) within a pipeline context is the architectural pattern that most often succeeds in production LLM applications.
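That bounded-agent-inside-a-pipeline shape can be sketched as follows; every function here is a stub standing in for a real step, and the cap and stop condition are illustrative:

```python
def ingest(path: str) -> str:
    # Fixed pipeline step (stubbed): load and normalise the document.
    return f"contents of {path}"

def bounded_analysis_agent(text: str, max_steps: int = 5) -> str:
    # Agent sub-task with a hard step cap and a fixed tool set; the
    # break stands in for the agent deciding it is done.
    findings: list[str] = []
    for step in range(max_steps):
        findings.append(f"finding {step + 1}")
        if len(findings) >= 2:
            break
    return "; ".join(findings)

def format_report(analysis: str) -> str:
    # Fixed pipeline step: deterministic output formatting.
    return f"REPORT: {analysis}"

def document_pipeline(path: str) -> str:
    # Pipeline outer structure; the agent runs only in the bounded
    # middle step, so cost and latency stay roughly predictable.
    return format_report(bounded_analysis_agent(ingest(path)))
```

The outer function is as testable as any pipeline; the variance introduced by the agent is confined to one step with a known worst case.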