AI pair programmers that write, review, and refactor code — inside your IDE and CLI
AI coding assistants use large language models trained on billions of lines of code. They predict what you're about to type, complete functions, generate tests, or refactor existing code. The magic is context: the more context (surrounding code, open files) the model sees, the better it understands intent.
- Autocomplete: type a few characters and the model suggests the rest of the line. Fastest mode, requires the least setup.
- Chat: ask the model questions or describe changes in natural language: "Add error handling to this function."
- CLI-based (agentic): the assistant reads code, generates diffs, runs tests, and commits automatically. Highest velocity, but requires discipline.
Larger context windows (8K, 32K, 200K tokens) let the model see more of your codebase. With 200K context, you can include entire libraries or projects. Smaller context forces you to be explicit and limits the model's understanding of related code.
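A quick way to reason about context limits is the rough rule of thumb that English text and code average about 4 characters per token. The sketch below uses that heuristic to check whether a set of sources fits a given window; real tokenizers (e.g. tiktoken) give exact, model-specific counts, and `estimate_tokens`/`fits_in_context` are illustrative names, not a real API.

```python
def estimate_tokens(text: str) -> int:
    """Rough token count using the ~4 characters-per-token heuristic."""
    return len(text) // 4

def fits_in_context(texts: list[str], window_tokens: int = 8_000) -> bool:
    """Check whether the combined sources fit in a model's context window."""
    return sum(estimate_tokens(t) for t in texts) <= window_tokens
```

With an 8K window this quickly rules out pasting a whole package; with 200K the same check usually passes for an entire medium-sized project.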
Cursor is an IDE built from VS Code specifically for AI. Native LLM integration, slash commands, multi-file editing. Cmd+K for inline editing, Cmd+L for chat. Supports Claude, GPT, o1. High context window (200K+). Best for developers who want a tightly integrated AI-first workflow.
GitHub Copilot is the industry standard. Available as a VS Code extension and in JetBrains IDEs and Neovim. Tab to accept autocomplete. Trained on public GitHub repos. Excellent at predicting short lines and common patterns. Requires a network connection (completions run in the cloud). The most familiar option for most developers.
| Tool | Context | Models | Cost | Setup |
|---|---|---|---|---|
| Cursor | 200K+ tokens | Claude, GPT, o1 | Pro $20/mo or usage | Native IDE |
| Copilot | 8K tokens | GPT-3.5, GPT-4 | $10/mo or team | Extension |
| Copilot X | 8K tokens | GPT-4 Turbo | $20/mo | Extension |
Use @mention to reference files in chat and slash commands (/edit, /review) for common actions. Set rules in a .cursorrules file for codebase-specific conventions.
Aider is an AI pair programmer for the terminal. Install it via pip, run `aider` in your repo, and describe changes in plain English. Aider reads the existing code, generates diffs, applies the changes, and runs your tests, committing automatically with atomic, testable diffs.

A typical session:

1. Add files to the session with `/add filename`.
2. Describe a change: "Add user authentication to the login endpoint."
3. Aider generates code, shows the diff, and applies it to your files.
4. Tests run automatically; commit if they pass. Repeat.

This is the highest-velocity mode for complete features, but it requires good test coverage. Aider's defining traits:

- Atomic commits: each change is a single, testable commit.
- Edit-in-place: Aider modifies files directly, never writes elsewhere.
- Context awareness: reads the entire codebase and understands its structure.
- Test integration: can run tests and debug failures.
Bad: "Make this faster." Good: "Optimize the loop by caching results in a dict and returning cached values on repeated inputs." Bad prompts lead to random refactors; good prompts guide precise changes.
If you want a specific style or pattern, show an example from your codebase. "Follow the pattern used in utils.py for error handling."
Keep related files open (IDE tools) or add them to session (Aider). The more context, the better the suggestions. Don't make the model guess what imports are needed or what functions exist.
```python
from anthropic import Anthropic

client = Anthropic()

def review_code(code: str, language: str = "python") -> str:
    """Ask Claude to review code and suggest improvements."""
    resp = client.messages.create(
        model="claude-opus-4-5",
        max_tokens=1024,
        system=f"""You are a senior {language} engineer doing a code review.
Focus on: correctness, performance, readability, edge cases.
Be specific and actionable.""",
        messages=[{
            "role": "user",
            "content": f"Review this {language} code:\n\n```{language}\n{code}\n```"
        }]
    )
    return resp.content[0].text

def generate_tests(code: str) -> str:
    """Generate pytest unit tests for a function."""
    resp = client.messages.create(
        model="claude-opus-4-5",
        max_tokens=1024,
        system="Generate comprehensive pytest unit tests. Include edge cases and error cases.",
        messages=[{"role": "user", "content": f"Generate tests for:\n\n```python\n{code}\n```"}]
    )
    return resp.content[0].text

sample = """
def calculate_discount(price: float, pct: float) -> float:
    return price * (1 - pct / 100)
"""
print(review_code(sample))
```
AI assistants excel at boilerplate code (repetitive patterns), test generation, documentation, refactoring within a single file, and completing partially written functions. Models excel at pattern completion.

They struggle with novel algorithms without examples, complex architectural changes requiring global context, security-critical code (always review it), and external API integration (models hallucinate API calls). These tasks require human oversight.
````python
from openai import OpenAI
from string import Template

client = OpenAI()

CODE_GEN_TEMPLATE = Template("""You are an expert $language developer. Rules:
- Write idiomatic, production-quality $language
- Include type hints (Python) / TypeScript types (JS/TS)
- Handle edge cases and invalid inputs explicitly
- Add a one-line comment only for non-obvious logic
- Do NOT add print statements, logging, or TODO comments

TASK:
$task

INTEGRATION CONTEXT (code this must work with):
```$language
$context
```

Return ONLY the implementation. No explanation. No markdown code fences.
""")

def generate_code(task: str, language: str = "python",
                  context: str = "# No existing context") -> str:
    prompt = CODE_GEN_TEMPLATE.substitute(
        language=language, task=task, context=context
    )
    return client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
        temperature=0.2,  # low = more deterministic code
    ).choices[0].message.content

# Example
code = generate_code(
    task="Implement `chunk(lst, n)` — split list into batches of size n",
    language="python",
    context="# Used in batch processing pipeline, n is always >= 1",
)
print(code)
````
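For comparison, one reasonable hand-written `chunk` matching the task above might look like this. It is a reference point for reviewing the model's output, not necessarily what the model will return:

```python
def chunk(lst: list, n: int) -> list[list]:
    """Split lst into consecutive batches of size n (last batch may be shorter)."""
    if n < 1:
        raise ValueError("n must be >= 1")  # the prompt promises n >= 1, but validate anyway
    return [lst[i:i + n] for i in range(0, len(lst), n)]
```

Checking the generated code against a known-good reference like this (or, better, against tests) is how you catch the subtle off-by-one and empty-input bugs models sometimes produce.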
```shell
# .cursorrules — project-level AI instructions (place in repo root)
cat > .cursorrules << 'EOF'
You are an expert Python developer working on a FastAPI application.
- Use Python 3.12+ type hints everywhere
- Prefer Pydantic v2 models for all request/response schemas
- Write pytest tests for every public function
- Use async/await for all I/O operations
- Follow the existing pattern: service layer calls repository layer
- Never use global state; inject dependencies via FastAPI Depends()
EOF
# Cursor reads this file and applies it to all AI completions in the project
# Open Cursor: Ctrl+Shift+P → "Cursor: Open AI Chat"

# Similarly for GitHub Copilot — add to .github/copilot-instructions.md
mkdir -p .github
cat > .github/copilot-instructions.md << 'EOF'
Project: FastAPI REST API (Python 3.12)
Always use type hints. Follow existing service/repository pattern.
Write async functions for database calls. Use Pydantic v2.
EOF
```
Modern AI coding tools have moved beyond autocomplete into agentic workflows: multi-step tasks where the AI reads files, writes code, runs tests, inspects output, and iterates until the task is complete. Claude Code, Devin, and similar tools operate in this mode.
Agentic coding is most effective when the task is well-defined and verifiable: "implement this interface so that these tests pass" is a great agentic task; "improve this codebase" is not. The key enablers are tight test feedback loops, version control so every change is reversible, and scoped file access so the agent cannot touch unrelated code. For bounded, well-specified tasks the productivity gains can be substantial.
```python
import subprocess
from openai import OpenAI

client = OpenAI()

def run_tests(test_file: str) -> tuple[bool, str]:
    result = subprocess.run(
        ["python", "-m", "pytest", test_file, "-v", "--tb=short"],
        capture_output=True, text=True, timeout=60
    )
    return result.returncode == 0, result.stdout + result.stderr

def agentic_implement(task: str, impl_file: str, test_file: str,
                      max_attempts: int = 5) -> str:
    history = [{"role": "system", "content":
        "You are an expert Python developer. Write clean, well-typed code. "
        "Return ONLY the complete Python file — no explanation, no markdown."}]
    test_output = ""
    for attempt in range(max_attempts):
        user_msg = task if attempt == 0 else (
            f"{task}\n\nAttempt {attempt} failed. Pytest output:\n{test_output}\n"
            "Fix the implementation so all tests pass."
        )
        history.append({"role": "user", "content": user_msg})
        code = client.chat.completions.create(
            model="gpt-4o", messages=history, temperature=0.2
        ).choices[0].message.content
        history.append({"role": "assistant", "content": code})
        with open(impl_file, "w") as f:
            f.write(code)
        passed, test_output = run_tests(test_file)
        if passed:
            print(f"✓ Passed on attempt {attempt + 1}")
            return code
        print(f"✗ Attempt {attempt + 1} failed")
    raise RuntimeError("Could not produce passing implementation")
```