Architecture · Tooling

LLM Development Frameworks

LangChain, LlamaIndex, Haystack, and Semantic Kernel — when to use each and what they actually give you

6 frameworks
7 sections
Python-first SDKs
Contents
  1. Framework landscape
  2. Framework comparison
  3. LangChain deep-dive
  4. LlamaIndex deep-dive
  5. Haystack deep-dive
  6. When to use what
  7. References
01 — Overview

Framework Landscape

LLM development frameworks abstract away boilerplate. They provide building blocks: chains (sequential operations), prompts (templates), retrievers (vector search), agents (autonomous decision-making), and memory (conversation context). Each framework takes a different architectural approach, leading to different tradeoffs in simplicity, flexibility, and observability.

Why Frameworks Exist

Raw LLM APIs are low-level. Building a chat application requires: prompt template management, context window management, tool invocation loops, error handling, retries, and logging. Frameworks handle this. But they also impose patterns—some more opinionated than others.
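To make that concrete, here is a framework-free sketch of just two of those concerns, prompt templating and retries with backoff. `call_model` is a hypothetical stand-in for a real LLM API call, not any library's function:

```python
import time

PROMPT = "Write a haiku about {topic}"

def call_model(prompt: str) -> str:
    # Stand-in for a real LLM API call; a real one can raise transient errors.
    return f"[completion for: {prompt}]"

def complete(topic: str, retries: int = 3, backoff: float = 1.0) -> str:
    prompt = PROMPT.format(topic=topic)          # prompt template management
    for attempt in range(retries):
        try:
            return call_model(prompt)            # the actual LLM call
        except Exception:
            if attempt == retries - 1:
                raise                            # give up after the final attempt
            time.sleep(backoff * 2 ** attempt)   # exponential backoff between retries

print(complete("machine learning"))
```

Multiply this by tool loops, streaming, and logging, and the appeal of a framework that has already made these decisions becomes clear.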

💡 Key decision: Do you want simplicity (more opinionated framework) or flexibility (less opinionated)? LangChain leans opinionated; LlamaIndex leans flexible.
02 — Comparison Matrix

Framework Comparison

| Framework | Primary Use | RAG Support | Agents | Community |
|---|---|---|---|---|
| LangChain | General-purpose chains | Excellent | Strong (ReAct) | Largest |
| LlamaIndex | RAG-first indexing | Outstanding | Good | Large |
| Haystack | NLP pipelines | Excellent | Emerging | Medium |
| Semantic Kernel | .NET integration | Good | Good | Growing |
| LangGraph | State machines | Good | Excellent | Emerging |
| DSPy | Program optimization | Good | Good | Research |
03 — Most Popular

LangChain Deep-Dive

LangChain is the largest and most opinionated framework. It pioneered the "chain" abstraction: composable components linked together. Core concepts: Chains (sequences), Runnables (modern execution model), Memory (conversation state), Tools (function calling), and Callbacks (observability).

LangChain Core Concepts

⛓️ LCEL Chains

  • Composable units of work
  • Pipe operator for chaining
  • Streaming + batching built-in
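The pipe composition can be illustrated without LangChain at all. A minimal sketch of the idea behind LCEL's `|` operator; the `Runnable` class here is a toy, not LangChain's:

```python
class Runnable:
    """Toy stand-in for LCEL's composable unit of work."""
    def __init__(self, fn):
        self.fn = fn

    def invoke(self, x):
        return self.fn(x)

    def __or__(self, other):
        # `a | b` returns a new Runnable that feeds a's output into b.
        return Runnable(lambda x: other.invoke(self.invoke(x)))

prompt = Runnable(lambda topic: f"Write a haiku about {topic}")
llm = Runnable(lambda p: p.upper())   # fake model: just shouts the prompt back
parse = Runnable(lambda s: s.strip())

chain = prompt | llm | parse          # the pipe operator chains the steps
print(chain.invoke("rust"))           # WRITE A HAIKU ABOUT RUST
```

The real LCEL runtime adds streaming, batching, and async on top of this same composition idea.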

🧠 Memory Management

  • Conversation history tracking
  • Context window awareness
  • Retrieval augmented context
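A toy illustration of buffer-style memory with context-window trimming. Character counts stand in for token counts here; real frameworks budget in tokens:

```python
class BufferMemory:
    """Toy conversation memory with a crude context-window budget."""
    def __init__(self, max_chars: int = 200):
        self.turns: list[str] = []
        self.max_chars = max_chars

    def add(self, role: str, text: str) -> None:
        self.turns.append(f"{role}: {text}")
        # Evict oldest turns once the rendered history exceeds the budget.
        while len(self.render()) > self.max_chars and len(self.turns) > 1:
            self.turns.pop(0)

    def render(self) -> str:
        return "\n".join(self.turns)

mem = BufferMemory(max_chars=70)
mem.add("user", "Hi, I'm building a RAG app.")
mem.add("assistant", "Great! What stack?")
mem.add("user", "Python, with a vector store.")
print(mem.render())  # first turn was evicted to fit the window
```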

🔧 Tools & Agents

  • Function-calling tool definitions
  • ReAct-style agent loops

📊 LangSmith

  • Tracing and debugging
  • Evaluation and monitoring

Example: Simple Chain

from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

llm = OpenAI(temperature=0.7)
prompt = PromptTemplate(
    input_variables=["topic"],
    template="Write a haiku about {topic}",
)
chain = LLMChain(llm=llm, prompt=prompt)

result = chain.run(topic="machine learning")
print(result)

Pros & Cons

  • Pros: largest community, strong agent support, streaming and batching built into LCEL, first-party observability via LangSmith
  • Cons: the most opinionated of the frameworks; its abstractions can add overhead for simple use cases

04 — RAG-First

LlamaIndex Deep-Dive

LlamaIndex is purpose-built for RAG. Core concepts: Data Loaders (ingestion), Indices (organization), Query Engines (retrieval + reranking), and Retrievers (vector search). The framework's focus is data indexing and retrieval quality.

LlamaIndex Architecture

📥 Data Loaders

  • Load from 100+ sources
  • SimpleDirectoryReader
  • Custom loader support

📑 Indices

  • VectorStoreIndex (embedding-based)
  • TreeIndex (hierarchical)
  • SummaryIndex (flat)
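The idea behind an embedding-based index can be sketched in plain Python: embed at index time, score by similarity at query time. Here a bag-of-words `Counter` is a deliberately crude stand-in for real dense embeddings, and `ToyVectorIndex` is illustrative, not LlamaIndex's API:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy "embedding": bag-of-words term counts.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class ToyVectorIndex:
    def __init__(self, docs):
        self.docs = docs
        self.vectors = [embed(d) for d in docs]   # index-time embedding

    def query(self, q: str, k: int = 1):
        qv = embed(q)
        scored = sorted(self.docs, key=lambda d: cosine(embed(d), qv), reverse=True)
        return scored[:k]

index = ToyVectorIndex([
    "RAG retrieves documents before generation",
    "Transformers use attention",
])
print(index.query("what is RAG"))
```

VectorStoreIndex does the same two-phase work with real embedding models and a pluggable vector store.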

🔍 Query Engines

  • Retrieve + rerank over an index
  • as_query_engine() entry point

🤖 Agents

  • Query engines exposed as agent tools

Example: RAG Pipeline

from llama_index import SimpleDirectoryReader, VectorStoreIndex

# Load documents
documents = SimpleDirectoryReader("./data").load_data()

# Create index
index = VectorStoreIndex.from_documents(documents)

# Query
query_engine = index.as_query_engine()
response = query_engine.query("What is RAG?")
print(response)

Pros & Cons

  • Pros: purpose-built for RAG with minimal overhead; rich loader and index options
  • Cons: weaker fit for complex multi-tool agents, where LangChain is stronger

05 — NLP-First

Haystack Deep-Dive

Haystack (by deepset) is built on NLP-first thinking. Concepts: Pipelines (YAML-configurable workflows), Components (pluggable NLP nodes), Document Stores (retrieval), and Evaluators (metrics). Designed for production NLP systems.

When to Use Haystack

Haystack Pipeline Example

from haystack import Pipeline
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack.components.rankers import TransformersSimilarityRanker
from haystack.document_stores.in_memory import InMemoryDocumentStore

# Build pipeline (the BM25 retriever needs a document store)
document_store = InMemoryDocumentStore()
pipeline = Pipeline()
pipeline.add_component("retriever", InMemoryBM25Retriever(document_store=document_store))
pipeline.add_component("ranker", TransformersSimilarityRanker())

# Connect components
pipeline.connect("retriever.documents", "ranker.documents")

# Run (the ranker also needs the query to score similarity)
result = pipeline.run({
    "retriever": {"query": "machine learning"},
    "ranker": {"query": "machine learning"},
})
print(result)
06 — Decision Framework

When to Use What

| Scenario | Best Choice | Why |
|---|---|---|
| Simple RAG (docs → search → LLM) | LlamaIndex | Purpose-built, minimal overhead |
| Complex agents (multi-tool, memory) | LangChain | Strongest agent framework |
| Enterprise NLP pipeline | Haystack | Production-grade, evaluation tools |
| .NET/C# application | Semantic Kernel | Native .NET integration |
| Agentic state machine | LangGraph | Explicit state control |
| Prompt optimization | DSPy | Automated prompt engineering |

Migration Paths

Start simple with LlamaIndex for RAG, graduate to LangChain if you need agents. Both work together—use LlamaIndex's query engine as a tool in LangChain agents.
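Stripped of framework APIs, that interop pattern amounts to: the query engine is one callable in the agent's tool registry. A framework-free sketch, where `docs_query_engine` is a hypothetical stand-in for a query engine and the keyword routing rule stands in for the LLM's tool choice:

```python
def docs_query_engine(question: str) -> str:
    # Stand-in for a RAG query engine's query method.
    return f"[answer from indexed docs for: {question}]"

def calculator(expr: str) -> str:
    # Toy second tool; never eval untrusted input in real code.
    return str(eval(expr, {"__builtins__": {}}))

TOOLS = {"docs": docs_query_engine, "calc": calculator}

def agent(task: str) -> str:
    # A real agent lets the LLM pick the tool; a keyword rule stands in here.
    tool = "calc" if any(c.isdigit() for c in task) else "docs"
    return TOOLS[tool](task)

print(agent("2+2"))            # routed to calc -> "4"
print(agent("what is RAG?"))   # routed to docs
```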

⚠️ Avoid framework hopping: All frameworks are viable. Pick one, go deep, and switch only if you hit real constraints (not theoretical ones).
Tools & SDKs

Framework & Tool Ecosystem

  • LangChain (Framework): general-purpose LLM application framework with chains, agents, and memory
  • LlamaIndex (Framework): RAG-focused indexing and retrieval framework
  • Haystack (Framework): NLP-first pipeline and component framework
  • Semantic Kernel (Framework): Microsoft's .NET LLM framework
  • LangGraph (Agentic): state-machine framework for agents (a LangChain layer)
  • DSPy (Optimization): automatic prompt optimization framework
  • LangSmith (Observability): debugging, evaluation, and monitoring (LangChain)
  • Pinecone (Vector DB): managed vector database (works with all frameworks)