Agent Frameworks

CrewAI

A Python framework for building multi-agent systems where autonomous agents with defined roles, goals, and tools collaborate to complete complex tasks.

Role-based agent definition
Sequential or hierarchical process
pip install crewai


SECTION 01

CrewAI's mental model

CrewAI models multi-agent collaboration as a crew of specialists. Each agent has a role (who they are), a goal (what they're optimising for), and a backstory (context that shapes their reasoning). Agents are assigned tasks, and the crew coordinates to produce a final output.

The analogy: you're building a startup team. You hire a Researcher, an Analyst, and a Writer — each with a job description, a personal objective, and relevant experience. You then assign a project and they collaborate.

SECTION 02

Core abstractions: Agent, Task, Crew

from crewai import Agent, Task, Crew, Process
from langchain_anthropic import ChatAnthropic

llm = ChatAnthropic(model="claude-3-5-sonnet-20241022")

# Agents: define role, goal, backstory
researcher = Agent(
    role="Senior Research Analyst",
    goal="Uncover comprehensive, accurate information on the given topic.",
    backstory='''You are an expert researcher with 10 years of experience
synthesising complex technical information into clear, actionable insights.
You verify sources and flag uncertainty.''',
    llm=llm,
    verbose=True,
    allow_delegation=False
)

writer = Agent(
    role="Technical Content Writer",
    goal="Write clear, engaging technical content that developers will love.",
    backstory='''You are a technical writer who translates complex concepts
into accessible prose. You follow the style of technical blogs.''',
    llm=llm,
    verbose=True,
    allow_delegation=False
)

# Tasks: define expected output and assign to an agent
research_task = Task(
    description='''Research the top 3 Python web frameworks (Flask, Django, FastAPI).
Gather: GitHub stars, community size, key use cases, and performance characteristics.''',
    expected_output="A structured summary covering all 3 frameworks with factual data.",
    agent=researcher
)

write_task = Task(
    description='''Using the research provided, write a 400-word blog post comparing
Flask, Django, and FastAPI, with a clear recommendation for new projects.''',
    expected_output="A polished 400-word blog post with a recommendation section.",
    agent=writer,
    context=[research_task]   # write_task receives research_task's output
)

# Crew: wire agents and tasks together
crew = Crew(
    agents=[researcher, writer],
    tasks=[research_task, write_task],
    process=Process.sequential,   # or Process.hierarchical
    verbose=True
)
SECTION 03

Building your first crew

pip install crewai crewai-tools langchain-anthropic

from crewai import Crew, Process

# (Define agents and tasks as above)

# Kick off the crew
result = crew.kickoff()
print("\n=== FINAL OUTPUT ===")
print(result)
# Outputs the writer's blog post, having used the researcher's findings

Under the hood, CrewAI manages: passing outputs from task to task via the context parameter, managing the conversation history for each agent, handling tool calls, and formatting the final result.
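The context mechanism is easy to picture without the framework: each task's output is recorded, and downstream tasks receive the outputs of the tasks listed as their context. A framework-free sketch (names are illustrative, not CrewAI internals):

```python
# Framework-free sketch of sequential context passing between tasks.
# Each "task" is a function receiving the outputs of its dependencies,
# mirroring CrewAI's context=[...] parameter.
def research_task(context: list[str]) -> str:
    return "research: Flask, Django, FastAPI facts"

def write_task(context: list[str]) -> str:
    return f"blog post drawing on [{'; '.join(context)}]"

# (task, context-dependencies) pairs, executed in order
tasks = [(research_task, []), (write_task, [research_task])]

outputs: dict = {}
for task, deps in tasks:
    ctx = [outputs[d] for d in deps]   # gather upstream outputs
    outputs[task] = task(ctx)          # run and record

print(outputs[write_task])
# → blog post drawing on [research: Flask, Django, FastAPI facts]
```

The real framework does the same bookkeeping while also managing each agent's conversation history and tool calls.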

SECTION 04

Sequential vs hierarchical process

Sequential (default): tasks execute one by one in the order defined. Each task can receive the output of previous tasks via context. Simple and predictable.

Hierarchical: a manager agent is automatically created. The manager assigns tasks to worker agents, evaluates outputs, and decides when the task is complete. More powerful for open-ended tasks, but more expensive (extra LLM calls for the manager).

from crewai import Crew, Process
from langchain_anthropic import ChatAnthropic

manager_llm = ChatAnthropic(model="claude-3-5-sonnet-20241022")

crew = Crew(
    agents=[researcher, analyst, writer],
    tasks=[research_task, analysis_task, write_task],
    process=Process.hierarchical,
    manager_llm=manager_llm,    # the manager decides task routing
    verbose=True
)
result = crew.kickoff()
SECTION 05

Custom tools

from crewai_tools import tool

@tool("WebSearch")
def web_search(query: str) -> str:
    '''Search the web for current information. Use for facts, news, and data.'''
    # Replace with real implementation (SerpAPI, Tavily, etc.)
    return f"Web results for: {query}"

@tool("FileReader")
def read_file(filename: str) -> str:
    '''Read the contents of a local file. Provide the full file path.'''
    with open(filename) as f:
        return f.read()

# Assign tools to agents
researcher = Agent(
    role="Senior Research Analyst",
    goal="Research using web and local files.",
    backstory="Expert researcher.",
    llm=llm,
    tools=[web_search, read_file]   # only this agent has these tools
)

Tools are per-agent — the researcher can search the web, but the writer can't (and shouldn't). This is an important design feature: limiting tools reduces error surface area.

SECTION 06

Connecting to Anthropic

from crewai import Agent
from langchain_anthropic import ChatAnthropic

# Standard Claude model
claude_sonnet = ChatAnthropic(
    model="claude-3-5-sonnet-20241022",
    temperature=0.1,       # lower temp for more consistent agent behaviour
    max_tokens=4096
)

# Use Haiku for less critical agents (cheaper)
claude_haiku = ChatAnthropic(
    model="claude-3-5-haiku-20241022",
    temperature=0.0
)

# Assign different models to different agents
planner_agent  = Agent(role="Planner",  llm=claude_sonnet, ...)
executor_agent = Agent(role="Executor", llm=claude_haiku,  ...)

# Set ANTHROPIC_API_KEY in your environment or:
import os
os.environ["ANTHROPIC_API_KEY"] = "your-key"
SECTION 07

Gotchas

Backstory is real prompt space. The agent's role, goal, and backstory are injected into every prompt. A 500-token backstory × 5 agents × 10 tasks = 25,000 tokens of overhead. Keep backstories to 2–3 sentences covering essential context only.

allow_delegation=True can cause unexpected loops. When delegation is enabled, agents can reassign tasks to each other. In complex crews, this can create circular delegations. Disable it unless you specifically need agents to reassign work.

Tool errors propagate loudly. If a tool raises an exception, CrewAI will include the traceback in the agent's context. Wrap tools in try/except and return user-friendly error strings: "Search failed: rate limit. Please try a different query."
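One generic way to do this is a decorator that converts exceptions into short strings the agent can reason about. This is a hypothetical helper, not something CrewAI ships:

```python
# Hypothetical defensive wrapper: turn tool exceptions into short,
# user-friendly strings instead of leaking tracebacks to the agent.
import functools

def safe_tool(fn):
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        try:
            return fn(*args, **kwargs)
        except Exception as exc:
            return f"{fn.__name__} failed: {exc}. Try a different input."
    return wrapper

@safe_tool
def web_search(query: str) -> str:
    # Stand-in for a real search call that hits a rate limit
    raise TimeoutError("rate limit")

print(web_search("python frameworks"))
# → web_search failed: rate limit. Try a different input.
```

Apply the wrapper before registering the function as a tool, so the agent always receives a plain string it can act on.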

Verbose mode is expensive in production. verbose=True prints all intermediate reasoning, which is great for debugging but adds no value in production. Set verbose=False before deploying.

CrewAI Agent Role Design Patterns

CrewAI structures multi-agent systems around the metaphor of a crew of specialized workers, each with a defined role, goal, backstory, and set of tools. This role-based architecture makes agent collaboration intuitive to design and reason about, mapping naturally to human organizational structures like research teams, editorial boards, or engineering squads.

Role Pattern  | Responsibilities                  | Typical Tools               | Position in Flow
Researcher    | Gather and synthesize information | Web search, document reader | First (information gathering)
Analyst       | Process and evaluate data         | Code executor, calculator   | Middle (transformation)
Writer        | Draft and format output           | Template engine             | Late (content generation)
Reviewer      | Quality check and edit            | None (pure reasoning)       | Last (quality gate)
Orchestrator  | Coordinate and delegate           | Agent delegation            | Throughout (coordination)

CrewAI's task system assigns work items to specific agents with explicit expected outputs, context dependencies, and output type specifications. Sequential task execution passes the output of each completed task as context to the next task in the chain, enabling information to accumulate naturally as it flows through the crew. Parallel task execution runs independent tasks simultaneously across multiple agents, reducing total wall-clock time for workflows where tasks do not have dependencies between them.
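The wall-clock argument for parallel execution can be sketched without the framework: independent tasks fan out across a pool, and the dependent task consumes all of their outputs once they complete (illustrative names, not CrewAI APIs):

```python
# Independent "research" tasks run concurrently; the dependent
# "write" step waits for all of them before consuming their outputs.
from concurrent.futures import ThreadPoolExecutor

def research(topic: str) -> str:
    return f"notes on {topic}"

topics = ["Flask", "Django", "FastAPI"]

with ThreadPoolExecutor() as pool:      # parallel, order-preserving map
    notes = list(pool.map(research, topics))

post = "comparison draft from: " + "; ".join(notes)
print(post)
# → comparison draft from: notes on Flask; notes on Django; notes on FastAPI
```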

Agent backstory in CrewAI serves as the system prompt that shapes how the agent approaches its assigned tasks. A well-crafted backstory includes relevant expertise, characteristic reasoning style, and typical quality standards the agent should apply. An agent with the backstory "experienced data scientist who always validates assumptions and quantifies uncertainty" will approach tasks differently than one described as "decisive executive who prioritizes speed and actionable conclusions over exhaustive analysis," even when given identical goals and tools.
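A rough sketch of the idea, composing the three fields into one system prompt. CrewAI's actual prompt template is more elaborate, so treat this as illustrative:

```python
# Illustrative composition of role, goal, and backstory into a system
# prompt; CrewAI's real template differs but works along these lines.
def build_system_prompt(role: str, goal: str, backstory: str) -> str:
    return f"You are {role}. {backstory} Your personal goal is: {goal}"

prompt = build_system_prompt(
    role="Senior Research Analyst",
    goal="Uncover accurate information on the given topic.",
    backstory="You verify sources and flag uncertainty.",
)
print(prompt)
```

Swapping only the backstory string changes the behaviour the model exhibits, which is why the field deserves as much care as the goal.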

CrewAI's memory system provides agents with different types of persistent context across task executions. Short-term memory stores recent agent interactions within the current crew run. Long-term memory persists important findings across crew runs, building a knowledge base that improves over time. Entity memory tracks specific entities mentioned across tasks. These memory types are stored in a vector database, enabling semantic retrieval of relevant past knowledge when agents encounter similar situations in future runs — a key capability for crews that perform repeated similar tasks where learning from previous attempts provides value.
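The recall pattern can be illustrated with a toy keyword-overlap store. CrewAI's real memory uses embeddings and a vector database; this stand-in only shows the store-then-retrieve shape:

```python
# Toy long-term memory: store findings, retrieve by keyword overlap.
# A stand-in for CrewAI's embedding-based semantic retrieval.
long_term: list[str] = []

def remember(finding: str) -> None:
    long_term.append(finding)

def recall(query: str) -> list[str]:
    words = set(query.lower().split())
    return [f for f in long_term if words & set(f.lower().split())]

remember("FastAPI benchmarks show strong async performance")
remember("Django has the largest plugin ecosystem")
print(recall("async performance numbers"))
# → ['FastAPI benchmarks show strong async performance']
```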

Error handling in CrewAI agent crews uses graceful degradation patterns where task failures are reported to the crew orchestrator with the failure reason, and the orchestrator decides whether to retry the task, reassign it to a different agent, or proceed with a reduced result. Configuring max_retries per task and setting allow_delegation to True enables agents to autonomously request help when they are unable to complete a task with their current capabilities. These resilience mechanisms are important for long-running crews that cannot be manually supervised throughout execution.

CrewAI Flow provides a code-first alternative to the role-based crew metaphor for defining LLM pipelines. Rather than defining agents with roles and goals, Flow uses Python decorators to define steps and the routing logic between them, with explicit state management using Pydantic models. Flow is better suited for deterministic pipelines where the execution structure is known in advance, while the crew abstraction is better for exploratory tasks where agents need to dynamically determine the appropriate next action based on intermediate results.
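The contrast can be sketched in plain Python: a Flow-style pipeline has a step order fixed in code and explicit typed state, rather than agents deciding what to do next. This is a framework-free stand-in, not CrewAI Flow's own decorator API:

```python
# Plain-Python stand-in for a Flow-style deterministic pipeline:
# fixed step order, explicit typed state, no agent decision-making.
from dataclasses import dataclass

@dataclass
class PostState:
    topic: str
    notes: str = ""
    draft: str = ""

def research(state: PostState) -> PostState:
    state.notes = f"notes on {state.topic}"
    return state

def write(state: PostState) -> PostState:
    state.draft = f"post built from {state.notes}"
    return state

steps = [research, write]        # routing is known in advance

state = PostState(topic="FastAPI")
for step in steps:
    state = step(state)
print(state.draft)
# → post built from notes on FastAPI
```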

CrewAI's process types — sequential, hierarchical, and consensual — determine how tasks are assigned and executed within a crew. Sequential process executes tasks in defined order with outputs passed between tasks as context. Hierarchical process uses a manager agent to dynamically assign tasks to specialized workers based on capabilities. Consensual process (experimental) requires agent agreement before proceeding. Choosing the right process type for the task structure significantly affects crew reliability and efficiency — sequential processes are predictable but inflexible, while hierarchical processes adapt to task complexity at the cost of additional LLM calls for orchestration.