Coding Agents

Claude Code + Codex. How to scope work, write PRDs that work, and avoid expensive loops.

TL;DR: Install Claude Code (primary engineer) and Codex (specialist). Always write a PRD before spawning. Short sessions beat long marathons — context degrades around 30-40 minutes. Write failing tests first. Never delegate exploratory work.

Install the Agents

```shell
# Claude Code — primary engineering agent
npm install -g @anthropic-ai/claude-code

# Codex — specialist agent for infra and pipelines
npm install -g @openai/codex

# Verify
claude --version
codex --version
```

The Two-Agent Split

| Agent | Primary Use | When to Use It |
|---|---|---|
| Claude Code | Features, business logic, tests, PR reviews | Default for most coding tasks |
| Codex | Infrastructure, data pipelines, quick prototypes | When you need infrastructure or data work |

Claude Code is your main engineer. Codex is the specialist you bring in for infra.

Write a PRD First. Always.

Never spawn a coding agent without a Product Requirements Document (PRD). Without one, the agent guesses at scope, writes code that doesn't match expectations, and burns tokens on revisions.

A PRD that works:

```markdown
# PRD: [Feature Name]

## Goal
One sentence: what does this accomplish?

## Context
What exists already? What's the relevant background?

## Tasks
- [ ] First specific task
- [ ] Second specific task
- [ ] Write tests for all new code
- [ ] Run full test suite — all tests pass
- [ ] No linter errors

## Acceptance Criteria
- All checkboxes ticked
- Existing tests still pass
- Specific behaviour X produces result Y

## Out of Scope
- Things the agent should NOT touch
- Constraints to respect
```

Acceptance criteria are not optional. A non-deterministic worker needs deterministic success conditions. Without them, "done" means whatever the agent thinks it means.
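As a concrete illustration, deterministic criteria for a hypothetical health-check endpoint might read like this (the endpoint and tool names are illustrative, not prescribed):

```markdown
## Acceptance Criteria
- `pytest` exits 0 with no skipped tests
- `GET /health` returns 200 with body `{"status": "ok"}`
- `ruff check .` reports no errors
```

Each line is a command or observation with a binary outcome, so the agent (and you) can check "done" mechanically.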

Run Claude Code

```shell
# Interactive mode (recommended for complex tasks)
cd /path/to/your/project
claude --permission-mode bypassPermissions

# Print mode (for scripted/automated tasks)
claude --permission-mode bypassPermissions --print "Read PRD.md and implement all tasks"
```

Context Degrades at 30-40 Minutes

This is the most important thing to know about working with coding agents: long sessions produce worse results.

Around 30-40 minutes in, the context window fills up. The agent starts losing track of earlier instructions, drifting from the PRD, and retrying approaches that have already failed.

The solution: many short sessions, not one marathon.

```shell
# Session 1: Core implementation
claude --permission-mode bypassPermissions --print "Implement tasks 1 and 2 from PRD.md"

# Session 2: Tests
claude --permission-mode bypassPermissions --print "Read PRD.md. Tasks 1 and 2 are complete. Now write tests for them."

# Session 3: Cleanup
claude --permission-mode bypassPermissions --print "Run linter, fix any issues, verify all acceptance criteria from PRD.md"
```

Each session reads the file system and git history — it picks up where the last one left off without needing to hold everything in context.
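The three sessions above can be chained in one script, committing after each so the next session can pick the work up from git history. A sketch, assuming the PRD lives in `PRD.md` at the project root:

```shell
#!/usr/bin/env bash
# Sketch: run the three PRD sessions back to back. Each claude call is a
# fresh session; the git commit after each one carries state forward.
run_prd_sessions() {
  local prompts=(
    "Implement tasks 1 and 2 from PRD.md"
    "Read PRD.md. Tasks 1 and 2 are complete. Now write tests for them."
    "Run linter, fix any issues, verify all acceptance criteria from PRD.md"
  )
  local prompt
  for prompt in "${prompts[@]}"; do
    claude --permission-mode bypassPermissions --print "$prompt" || return 1
    # Commit so the next session finds this session's work in history.
    git add -A && git commit -m "agent session: $prompt"
  done
}
```

Invoke `run_prd_sessions` from the project root; a failing session stops the chain so you can inspect before continuing.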

Test-Driven Prompts

Always ask the agent to write failing tests first, then implement code to make them pass:

"Write failing tests for the API endpoint described in PRD.md. Then implement the endpoint to make the tests pass. Do not modify the tests after writing them."

Tests are deterministic acceptance criteria. They give the agent a clear, measurable definition of done.
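You can also verify mechanically that the agent honoured "do not modify the tests". A sketch, assuming the tests landed in an earlier commit than the implementation and live under `tests/` (adjust the path to your layout):

```shell
# Sketch: check that files under the test directory have not changed
# since the previous commit (i.e. since the tests were written).
tests_untouched() {
  local test_dir="${1:-tests/}"
  if git diff --quiet HEAD~1 -- "$test_dir"; then
    echo "tests unchanged since the previous commit"
  else
    echo "WARNING: tests were modified, review the diff" >&2
    return 1
  fi
}
```

Run it after the implementation session; a nonzero exit means the agent rewrote its own acceptance criteria.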

What NOT to Delegate

Don't delegate exploratory work to coding agents. If you can't yet write deterministic acceptance criteria, you can't write the PRD, and the agent will guess at scope and burn tokens doing it.

Cost Control

Coding agents are the biggest cost driver in your setup. A poorly scoped PRD can cause the agent to loop — attempting the same failing approach repeatedly, burning tokens on each iteration.
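One blunt safeguard is a hard time cap per session, which also enforces the short-session rhythm. A sketch, assuming GNU coreutils `timeout` is available (it exits with 124 when the limit is hit):

```shell
# Sketch: cap any command (e.g. a claude session) so a looping agent
# cannot burn tokens indefinitely.
capped_session() {
  local limit="$1"; shift
  timeout "$limit" "$@"
  local status=$?
  if [ "$status" -eq 124 ]; then
    echo "session hit the $limit cap, tighten the PRD before retrying" >&2
  fi
  return "$status"
}

# Usage (illustrative):
# capped_session 30m claude --permission-mode bypassPermissions --print "Implement task 1 from PRD.md"
```

A session that hits the cap is a signal the PRD is under-scoped, not that the agent needs more time.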

The Operating Rhythm

  1. You write the PRD — 5-10 minutes, clear acceptance criteria
  2. Claude Code executes — 20-30 minutes, short session
  3. You review the PR — 10-15 minutes, check the output
  4. Claude Code revises if needed — another short session
  5. You merge — done
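The rhythm above can be sketched as a pair of helpers around a feature branch. The branch name and prompt are illustrative; the review between the two helpers is still you reading the diff:

```shell
# Sketch: one pass of the rhythm. Run a session on a branch, leave the
# diff for human review, then merge once you are satisfied.
one_pass() {
  local branch="${1:-feature/prd-task}"
  git checkout -b "$branch" &&
  claude --permission-mode bypassPermissions --print "Read PRD.md and implement all tasks" &&
  git diff main --stat   # you review here before merging
}

merge_pass() {
  local branch="${1:-feature/prd-task}"
  git checkout main && git merge "$branch"
}
```

If the review turns up problems, run another short session on the branch before calling `merge_pass`.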

Your job shifts from writing code to writing specs and reviewing output. That's the leverage.