Coding Agents

Claude Code + Codex. How to scope work, write PRDs that work, and avoid expensive loops.

TL;DR: Install Claude Code (primary engineer) and Codex (specialist). Always write a PRD before spawning. Short sessions beat long marathons — context degrades around 30-40 minutes. Write failing tests first. Never delegate exploratory work.

Install the Agents

```shell
# Claude Code — primary engineering agent
npm install -g @anthropic-ai/claude-code

# Codex — specialist agent for infra and pipelines
npm install -g @openai/codex

# Verify
claude --version
codex --version
```

The Two-Agent Split

| Agent | Primary Use | When to Use It |
|---|---|---|
| Claude Code | Features, business logic, tests, PR reviews | Default for most coding tasks |
| Codex | Infrastructure, data pipelines, quick prototypes | When you need infrastructure or data work |

Claude Code is your main engineer. Codex is the specialist you bring in for infra.

Write a PRD First. Always.

Never spawn a coding agent without a Product Requirements Document (PRD). Without one, the agent guesses at scope, writes code that doesn't match expectations, and burns tokens on revisions.

A PRD that works:

```markdown
# PRD: [Feature Name]

## Goal
One sentence: what does this accomplish?

## Context
What exists already? What's the relevant background?

## Tasks
- [ ] First specific task
- [ ] Second specific task
- [ ] Write tests for all new code
- [ ] Run full test suite — all tests pass
- [ ] No linter errors

## Acceptance Criteria
- All checkboxes ticked
- Existing tests still pass
- Specific behaviour X produces result Y

## Out of Scope
- Things the agent should NOT touch
- Constraints to respect
```

Acceptance criteria are not optional. A non-deterministic worker needs deterministic success conditions. Without them, "done" means whatever the agent thinks it means.
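As a concrete illustration, deterministic criteria for a hypothetical health-check endpoint might read like this (the endpoint and tool names are illustrative, not prescribed):

```markdown
## Acceptance Criteria
- `pytest` exits 0 with no skipped tests
- `GET /health` returns 200 with body `{"status": "ok"}`
- `ruff check .` reports no errors
```

Each line is a command or observation with a binary outcome, so the agent (and you) can check "done" mechanically.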

Run Claude Code

```shell
# Interactive mode (recommended for complex tasks)
cd /path/to/your/project
claude --permission-mode bypassPermissions

# Print mode (for scripted/automated tasks)
claude --permission-mode bypassPermissions --print "Read PRD.md and implement all tasks"
```

Context Degrades at 30-40 Minutes

This is the most important thing to know about working with coding agents: long sessions produce worse results.

Around 30-40 minutes in, the context window fills up. The agent starts losing track of earlier instructions, drifting from the PRD, and retrying approaches that have already failed.

The solution: many short sessions, not one marathon.

```shell
# Session 1: Core implementation
claude --permission-mode bypassPermissions --print "Implement tasks 1 and 2 from PRD.md"

# Session 2: Tests
claude --permission-mode bypassPermissions --print "Read PRD.md. Tasks 1 and 2 are complete. Now write tests for them."

# Session 3: Cleanup
claude --permission-mode bypassPermissions --print "Run linter, fix any issues, verify all acceptance criteria from PRD.md"
```

Each session reads the file system and git history — it picks up where the last one left off without needing to hold everything in context.
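The three sessions above can be chained in one script, committing after each so the next session can pick the work up from git history. A sketch, assuming the PRD lives in `PRD.md` at the project root:

```shell
#!/usr/bin/env bash
# Sketch: run the three PRD sessions back to back. Each claude call is a
# fresh session; the git commit after each one carries state forward.
run_prd_sessions() {
  local prompts=(
    "Implement tasks 1 and 2 from PRD.md"
    "Read PRD.md. Tasks 1 and 2 are complete. Now write tests for them."
    "Run linter, fix any issues, verify all acceptance criteria from PRD.md"
  )
  local prompt
  for prompt in "${prompts[@]}"; do
    claude --permission-mode bypassPermissions --print "$prompt" || return 1
    # Commit so the next session finds this session's work in history.
    git add -A && git commit -m "agent session: $prompt"
  done
}
```

Invoke `run_prd_sessions` from the project root; a failing session stops the chain so you can inspect before continuing.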

Test-Driven Prompts

Always ask the agent to write failing tests first, then implement code to make them pass:

"Write failing tests for the API endpoint described in PRD.md. Then implement the endpoint to make the tests pass. Do not modify the tests after writing them."

Tests are deterministic acceptance criteria. They give the agent a clear, measurable definition of done.
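You can also verify mechanically that the agent honoured "do not modify the tests". A sketch, assuming the tests landed in an earlier commit than the implementation and live under `tests/` (adjust the path to your layout):

```shell
# Sketch: check that files under the test directory have not changed
# since the previous commit (i.e. since the tests were written).
tests_untouched() {
  local test_dir="${1:-tests/}"
  if git diff --quiet HEAD~1 -- "$test_dir"; then
    echo "tests unchanged since the previous commit"
  else
    echo "WARNING: tests were modified, review the diff" >&2
    return 1
  fi
}
```

Run it after the implementation session; a nonzero exit means the agent rewrote its own acceptance criteria.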

What NOT to Delegate

Don't delegate exploratory work to coding agents. If you can't yet write deterministic acceptance criteria, you can't write the PRD, and the agent will guess at scope and burn tokens doing it.

Cost Control

Coding agents are the biggest cost driver in your setup. A poorly scoped PRD can cause the agent to loop — attempting the same failing approach repeatedly, burning tokens on each iteration.
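One blunt safeguard is a hard time cap per session, which also enforces the short-session rhythm. A sketch, assuming GNU coreutils `timeout` is available (it exits with 124 when the limit is hit):

```shell
# Sketch: cap any command (e.g. a claude session) so a looping agent
# cannot burn tokens indefinitely.
capped_session() {
  local limit="$1"; shift
  timeout "$limit" "$@"
  local status=$?
  if [ "$status" -eq 124 ]; then
    echo "session hit the $limit cap, tighten the PRD before retrying" >&2
  fi
  return "$status"
}

# Usage (illustrative):
# capped_session 30m claude --permission-mode bypassPermissions --print "Implement task 1 from PRD.md"
```

A session that hits the cap is a signal the PRD is under-scoped, not that the agent needs more time.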

The Operating Rhythm

  1. You write the PRD — 5-10 minutes, clear acceptance criteria
  2. Claude Code executes — 20-30 minutes, short session
  3. You review the PR — 10-15 minutes, check the output
  4. Claude Code revises if needed — another short session
  5. You merge — done
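The rhythm above can be sketched as a pair of helpers around a feature branch. The branch name and prompt are illustrative; the review between the two helpers is still you reading the diff:

```shell
# Sketch: one pass of the rhythm. Run a session on a branch, leave the
# diff for human review, then merge once you are satisfied.
one_pass() {
  local branch="${1:-feature/prd-task}"
  git checkout -b "$branch" &&
  claude --permission-mode bypassPermissions --print "Read PRD.md and implement all tasks" &&
  git diff main --stat   # you review here before merging
}

merge_pass() {
  local branch="${1:-feature/prd-task}"
  git checkout main && git merge "$branch"
}
```

If the review turns up problems, run another short session on the branch before calling `merge_pass`.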

Your job shifts from writing code to writing specs and reviewing output. That's the leverage.