Agentic Coding Workflows
A coding agent doesn’t just suggest lines — it takes a task and runs the agent loop: explore the repo, plan, edit files, run tests, fix what broke. This unlocks real delegation. It also demands a different workflow — your job becomes specifying and reviewing, not typing.
The agentic loop, applied to code
The two human checkpoints, reviewing the plan and reviewing the diff, are where quality is won or lost. Skip them and you’ve automated the writing of code nobody understands.
Scope the task well
Agents succeed or fail on the task you hand them. A good agentic task is:
- Well-specified — clear definition of done, expected behavior, constraints.
- Verifiable — there’s a concrete way to confirm success, ideally tests.
- Bounded — a coherent unit of work, not “build the whole feature.”
- Low-ambiguity — few unstated decisions; you’ve made the design calls.
Poor: "Improve the checkout flow."

Good: "In services/checkout.py, add a retry (3 attempts, exponential backoff) around the payment-gateway call. Only retry network and 5xx errors, never a declined card. Add unit tests for both paths. Match the retry helper already in lib/http.py."

The good version made the design decisions for the agent. Agents execute well; they decide poorly. Keep the judgment yours.
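The "Good" task is specific enough that you can picture the result before the agent starts. A minimal sketch of what a passing implementation might look like follows. The exception classes, `call_with_retry`, and the 0.5-second base delay are all hypothetical; a real change would reuse the helper in lib/http.py, whose interface is not shown here.

```python
import time


class NetworkError(Exception):
    """Hypothetical: a transient connection failure."""


class GatewayError(Exception):
    """Hypothetical: the gateway answered with an HTTP error status."""

    def __init__(self, status: int):
        super().__init__(f"gateway returned {status}")
        self.status = status


class CardDeclined(Exception):
    """Hypothetical: the card was declined. Must never be retried."""


def is_retryable(exc: Exception) -> bool:
    # Retry only network failures and 5xx responses, per the task text.
    if isinstance(exc, NetworkError):
        return True
    return isinstance(exc, GatewayError) and 500 <= exc.status < 600


def call_with_retry(charge, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call charge() up to `attempts` times with exponential backoff."""
    for attempt in range(attempts):
        try:
            return charge()
        except Exception as exc:
            last_attempt = attempt == attempts - 1
            if last_attempt or not is_retryable(exc):
                raise  # declined cards and 4xx errors surface immediately
            sleep(base_delay * 2 ** attempt)  # waits 0.5s, then 1.0s
```

Passing `sleep` as a parameter is a design choice that keeps the "both paths" unit tests fast: a test can record the delays instead of actually waiting.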
Review the plan before the code
Most agents will outline an approach first. This is the cheapest place to catch a mistake: redirecting a plan costs a sentence; redirecting a finished 500-line diff costs a rewrite. Read the plan. Are these the right files? Is the approach sound? Is an edge case missing? Correct it now.
Review the diff like any other PR
An agent’s output is a pull request, and it gets the same scrutiny as a colleague’s (see Working Effectively). Read every file. Run it. Check the edge cases and security. That the author is an agent does not lower the bar; arguably it raises it, since the agent has no stake in the result.
Keep diffs reviewable: prefer several small, focused agent tasks over one sprawling one. A 200-line diff gets a real review; a 2,000-line diff gets a rubber stamp.
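One way to make "reviewable" concrete is to measure a diff before reviewing it. The sketch below sums changed lines from the output of `git diff --numstat`, which prints added and deleted counts per file, separated by tabs. The function names and the 200-line budget are assumptions for illustration, not rules from any tool.

```python
def diff_size(numstat: str) -> int:
    """Sum added and deleted lines from `git diff --numstat` output."""
    total = 0
    for line in numstat.splitlines():
        parts = line.split("\t")
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        added, deleted = parts[0], parts[1]
        # Binary files are reported as "-" in both columns; skip them.
        if added.isdigit() and deleted.isdigit():
            total += int(added) + int(deleted)
    return total


def is_reviewable(numstat: str, max_lines: int = 200) -> bool:
    # max_lines is an illustrative budget; pick one that suits your team.
    return diff_size(numstat) <= max_lines
```

If a diff comes back over budget, that is usually a sign the task should have been split into smaller ones rather than a reason to skim.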
Let the agent close its own loop
Agents are far more effective when they can verify their own work and iterate. Give them that loop:
- Point them at the test command so they run tests and fix failures.
- Ensure linters and type checks run, so mistakes are caught automatically.
- Give each task a clear pass/fail signal (tests, a build) so the agent can self-correct before you ever see the result.
This is why a strong test suite is now a force multiplier for AI-assisted development: it’s the agent’s feedback loop and your safety net.
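The pass/fail signal described above can be as simple as one script the agent is told to run after every change. Here is a minimal sketch that runs each check as a subprocess and treats exit code 0 as a pass. The commands in `EXAMPLE_CHECKS` (pytest, ruff, mypy) are assumptions; substitute whatever your project actually uses.

```python
import subprocess
import sys


def run_checks(checks):
    """Run each (name, argv) check and return {name: passed}."""
    results = {}
    for name, argv in checks:
        completed = subprocess.run(argv, capture_output=True, text=True)
        results[name] = completed.returncode == 0  # exit code 0 means pass
    return results


def all_green(results):
    """True only if every check passed."""
    return all(results.values())


# Hypothetical commands, defined here but not run at import time.
EXAMPLE_CHECKS = [
    ("tests", [sys.executable, "-m", "pytest", "-q"]),
    ("lint", [sys.executable, "-m", "ruff", "check", "."]),
    ("types", [sys.executable, "-m", "mypy", "."]),
]
```

A single entry point like this gives the agent and the reviewer the same definition of "green", so a result that reaches you has already cleared the checks you would have run first.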
Pitfalls
| Pitfall | Why it happens | Counter |
|---|---|---|
| Over-scoping | Vague, sprawling task | Small, bounded, well-specified tasks |
| Rubber-stamp review | Diff too large to absorb | Keep diffs small; review every line |
| Plausible-but-wrong | Output reads well, logic is off | Run it; test edges; understand it |
| Context gaps | Agent didn’t see a key file | Name the relevant files and conventions |
| Lost-thread looping | Agent flails on a hard task | Stop it; re-specify; or do it yourself |
| Skill erosion | Delegating the thinking, not the typing | Keep owning design and judgment |
Key takeaways
A coding agent runs the explore–plan–edit–test loop, shifting your role to specifying and reviewing. Hand it tasks that are well-specified, verifiable, bounded, and low-ambiguity, and make the design decisions yourself. Review the plan before code (the cheapest fix) and the diff like any PR (the real gate). Keep diffs small. Give the agent a test command so it can self-correct. You remain fully accountable for every line that merges.