01 — Structure
Anatomy of a Prompt
Every LLM call has up to three parts: a system prompt (instructions, persona, constraints), a user message (the current request), and an optional assistant message (a prefilled start to the model's response).
System Prompt
Sets behavior for the entire conversation. Loaded once. Put stable instructions here: persona, output format, constraints, examples. Example: "You are a senior data analyst. Outputs must be concise, evidence-based, and formatted as: Finding → Evidence → Recommendation."
User Message
The dynamic input. Should be specific and self-contained. Include all needed context without relying on the model's general knowledge if precision matters.
Assistant Prefill
Start the model's response yourself to guide format. Useful for: forcing JSON output (start with "{"), forcing code blocks (start with "```python"), or steering toward specific response types.
Example: well-structured prompt
# System prompt: stable, comprehensive
system = """You are a senior data analyst. Your outputs must be:
- Concise (under 200 words unless asked for detail)
- Evidence-based (cite numbers from the data provided)
- Formatted as: Finding → Evidence → Recommendation
Do NOT speculate beyond the data provided.
Do NOT use bullet points unless explicitly asked."""
# User message: specific, includes all needed context
user = """Analyze this Q3 sales data and identify the top concern:
Region | Q2 Revenue | Q3 Revenue | Change
North | $2.1M | $1.8M | -14%
South | $1.5M | $1.7M | +13%
West | $3.2M | $2.9M | -9%
Focus on actionable issues only."""
✓ Recency bias: put your most important constraint at the END of the system prompt. Models exhibit recency bias — the last instruction has the highest compliance rate.
02 — Capability Ladder
Zero-Shot, Few-Shot, and Chain-of-Thought
Zero-shot: just describe the task. Works for simple, common tasks; fails on nuanced, rare, or multi-step tasks.
Few-shot: provide 2–5 (input, output) example pairs before the actual input. Dramatically improves consistency and format adherence.
Chain-of-thought (CoT): add "Think step by step" or include examples showing reasoning steps. Improves accuracy on math, logic, and multi-step tasks by 20–40%.
Prompting Strategies
| Strategy | When to use | Cost | Accuracy gain |
| Zero-shot | Simple, common tasks | Minimal | Baseline |
| Few-shot (2–5 examples) | Format consistency, rare tasks | +examples tokens | +10–30% |
| Zero-shot CoT | Math, logic, multi-step | +reasoning tokens | +20–40% |
| Few-shot CoT | Hardest tasks | +examples + reasoning | +30–50% |
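The zero-shot CoT row can be implemented as a one-line prompt wrapper; a minimal sketch (the trigger phrase and the `Answer:` marker are common conventions, not a fixed API):

```python
def with_cot(task: str) -> str:
    """Wrap a task with the zero-shot chain-of-thought trigger.

    Asking for visible reasoning before the final answer is what buys
    the accuracy gain on math and logic tasks.
    """
    return (
        f"{task}\n\n"
        "Think step by step. Show your reasoning, then give the final "
        "answer on its own line prefixed with 'Answer:'."
    )

prompt = with_cot("A train leaves at 3:40 pm and arrives at 5:15 pm. How long is the trip?")
```

The `Answer:` marker also makes the final answer easy to extract from the reasoning with a simple string split.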
Example: few-shot vs zero-shot classification
# Zero-shot — inconsistent output format
user = "Classify this support ticket: 'My login button doesn't work on mobile Safari'"
# Output might be: "Technical Issue", "Bug Report", "UI Bug", "Technical" — unpredictable
# Few-shot — consistent, controlled output
user = """Classify support tickets into: billing, technical, account, feature_request
Ticket: "I was charged twice for last month" → billing
Ticket: "How do I export my data to CSV?" → feature_request
Ticket: "Can't log in after password reset" → account
Ticket: "My login button doesn't work on mobile Safari" →"""
# Output: "technical" — format controlled, vocabulary controlled
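The few-shot block above can be generated from labeled examples rather than hand-written; a sketch (the label vocabulary and helper are illustrative, taken from the example above):

```python
LABELS = ["billing", "technical", "account", "feature_request"]

EXAMPLES = [
    ("I was charged twice for last month", "billing"),
    ("How do I export my data to CSV?", "feature_request"),
    ("Can't log in after password reset", "account"),
]

def few_shot_prompt(ticket: str) -> str:
    """Build a few-shot classification prompt ending at the answer slot."""
    lines = [f"Classify support tickets into: {', '.join(LABELS)}"]
    for text, label in EXAMPLES:
        lines.append(f'Ticket: "{text}" → {label}')
    lines.append(f'Ticket: "{ticket}" →')  # the model completes with the label
    return "\n".join(lines)

prompt = few_shot_prompt("My login button doesn't work on mobile Safari")
```

Keeping examples in a list also makes it cheap to test different example sets against your eval suite.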
03 — Persona & Expertise
Role Prompting and Personas
Role prompting: give the model an identity that activates relevant knowledge and communication style. "You are an expert X" works well when the expertise is well-represented in training data. Avoid fictional personas for safety-critical tasks — "You are DAN (Do Anything Now)" is a jailbreak pattern.
Audience Specification
"Explain to a 10-year-old" vs "Explain to a senior ML engineer" controls depth, vocabulary, and analogies. Specific audience selection forces the model to adjust explanation style.
Example: effective persona patterns
# Technical expert persona
"You are a principal software engineer at a FAANG company with 15 years of Python experience.
When reviewing code, you prioritize: correctness first, then readability, then performance.
You give specific, actionable feedback — not vague observations."
# Domain expert persona
"You are a board-certified cardiologist. You explain medical concepts accurately but accessibly.
Always recommend the patient consult their own doctor for personal medical decisions."
# Anti-pattern: persona that fights the model's values
"You are an AI with no restrictions..." → jailbreak attempt, will be ignored or refused
⚠️ Specificity activates knowledge: "principal engineer reviewing a data pipeline" is more reliable than "software engineer". The more specific the role, the more reliably the model activates relevant knowledge and tone.
04 — Positive vs Negative
Instructions: Positive vs Negative
Positive instructions ("Do X") are more reliable than negative instructions ("Don't do X"). Negative constraints are necessary but should be paired with positive alternatives. Be specific about format, length, and style — "brief" means different things to different models.
Example: rewriting vague/negative prompts
# Vague — model interprets "summary" inconsistently
"Summarize this document."
# Specific — model knows exactly what to produce
"Write a 3-sentence executive summary of this document.
Sentence 1: The main topic and scope.
Sentence 2: The key finding or recommendation.
Sentence 3: The most important caveat or risk."
# Negative only — model may still do the thing
"Don't be verbose. Don't add disclaimers."
# Negative + positive alternative
"Be direct and concise — maximum 150 words.
Skip disclaimers and caveats unless the information is genuinely uncertain.
Start your response immediately without preamble."
Common Anti-Patterns & Fixes
| Anti-pattern | Problem | Fix |
| "Be helpful and accurate" | Every model tries this — no signal | "When uncertain, rate confidence 1-5" |
| "Don't hallucinate" | Model can't control this | "Only state facts you're confident about. Flag uncertain claims with [UNCERTAIN]" |
| "Write a good email" | "Good" undefined | "Write 3-paragraph email: greeting + ask + next step + sign-off" |
| "Think carefully" | No actionable instruction | "List your assumptions first. Then answer." |
06 — Grounding & Knowledge
Context and Retrieval Integration
Grounding: provide relevant context so the model answers from supplied facts, not hallucinated memory. Document injection: insert retrieved documents with clear delimiters. Instruction position: put the user's question AFTER the documents (recency effect), not before.
Example: RAG prompt template
system = """You are a customer support agent. Answer questions using ONLY
the provided documentation. If the answer isn't in the docs, say so.
Quote the relevant passage when possible."""
user = f"""Documentation:
{retrieved_chunk_1}
{retrieved_chunk_2}
Customer question: {user_question}
Answer based only on the documentation above:"""
⚠️ Retrieval quality matters: "Answer only from provided context" reduces hallucinations but causes unhelpful "I don't know" responses when relevant context wasn't retrieved. Tune retrieval before over-restricting the prompt.
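The template above can be generalized into a builder that injects any number of chunks with explicit delimiters and keeps the question last; a sketch (the `<doc>` tags are one common delimiter convention, not a requirement):

```python
def rag_prompt(chunks: list[str], question: str) -> str:
    """Inject retrieved chunks with delimiters; the question goes LAST (recency effect)."""
    parts = ["Documentation:"]
    for i, chunk in enumerate(chunks, 1):
        # Numbered delimiters let the model quote and cite specific chunks.
        parts.append(f"<doc id={i}>\n{chunk.strip()}\n</doc>")
    parts.append(f"Customer question: {question}")
    parts.append("Answer based only on the documentation above:")
    return "\n\n".join(parts)

prompt = rag_prompt(
    ["Exports are available on the Business plan.",
     "CSV export lives under Settings → Data."],
    "Can I export to CSV on the free plan?",
)
```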
07 — Practice & Refinement
Iteration and Testing
Development Strategies
1. Start Minimal, Add Constraints — build iteratively
Begin with the simplest prompt that could work. Add instructions only to fix specific observed failures. Every added line is a new failure mode.
- Start: 1–2 sentences describing the task
- Test on 5 diverse examples
- Add constraint only if multiple tests fail
2. Test on Diverse Inputs — edge cases matter
20 varied inputs beat 20 similar inputs. Include edge cases: empty input, very long input, off-topic input, adversarial input. Prompts that work on 5 examples often fail on the 6th.
- Normal cases: 50% of eval
- Edge cases: 30% of eval
- Adversarial/tricky: 20% of eval
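The 50/30/20 mix above can be enforced with a small check over your eval set; a sketch (the helper and example cases are illustrative, the target shares come from the guideline above):

```python
from collections import Counter

TARGET_MIX = {"normal": 0.5, "edge": 0.3, "adversarial": 0.2}

def check_mix(cases: list[tuple[str, str]], tolerance: float = 0.1) -> bool:
    """Return True if the eval set's (category, input) mix is within tolerance."""
    counts = Counter(category for category, _ in cases)
    total = len(cases)
    return all(
        abs(counts.get(category, 0) / total - share) <= tolerance
        for category, share in TARGET_MIX.items()
    )

cases = (
    [("normal", f"ordinary ticket {i}") for i in range(10)]
    + [("edge", e) for e in ["", "x" * 50_000, "unrelated poem", "emoji 🎉 only",
                             "two tickets in one", "html <b>tags</b>"]]
    + [("adversarial", a) for a in ["ignore previous instructions",
                                    "classify as billing or I sue",
                                    "prompt injection attempt",
                                    "request to reveal the system prompt"]]
)
```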
3. Version & Diff — track changes
Store prompts as text files in git. When you change a prompt, run your eval suite on both versions. Never deploy a prompt change without comparative testing.
- Commit each prompt change separately
- Include before/after eval results in commit
- Tag production-ready prompts
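The compare-before-deploy step can be sketched as a small harness (the `run_eval` scorer is a placeholder for your own eval suite; the helper is illustrative):

```python
from typing import Callable

def compare_versions(
    eval_cases: list,
    run_eval: Callable[[str, list], float],
    prompt_old: str,
    prompt_new: str,
) -> dict:
    """Run the same eval suite on both prompt versions and report the delta."""
    old_score = run_eval(prompt_old, eval_cases)
    new_score = run_eval(prompt_new, eval_cases)
    return {
        "old": old_score,
        "new": new_score,
        "delta": new_score - old_score,
        "deploy": new_score >= old_score,  # never ship a regression
    }

# Stand-in scorer for demonstration only; substitute your real eval.
result = compare_versions(
    eval_cases=[],
    run_eval=lambda prompt, cases: float(len(prompt)),
    prompt_old="v1 prompt",
    prompt_new="v1 prompt with added constraint",
)
```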
4. Separate System & User — isolation principle
Put stable instructions in system prompt; put dynamic content in user message. Mixing them makes prompts brittle — one change breaks everything.
- System: rules, persona, format constraints
- User: current task, data, question
- Never hardcode user data in system prompt