Module 8: Agentic Coding Pipelines
How to structure a codebase, a workflow, and a team of agents that ship quality code consistently — without stepping on each other.
This is Module 8 of a 12-part curriculum: Build Software Products with AI — From First Principles to Production Pipeline.
This is the module most people are here for. The agentic coding pipeline is the practical payoff of everything we’ve covered: memory, tools, orchestration, multi-agent architecture — all applied to shipping software.
The goal of this module: give you a production-grade pattern that prevents agents from colliding, keeps your codebase clean, and lets you work at a pace that would otherwise require a team.
The Core Mental Model
Think of yourself as a tech lead. Your job is not to write every line of code — it’s to:
- Define the work clearly
- Assign it to the right agent
- Review the output
- Merge when it’s right
You set the standard. The agents implement. You don’t write first drafts. You direct them.
This isn’t laziness — it’s leverage. The highest-value thing you do in a coding session is define the problem clearly and review the output critically. Both are things humans do better than agents. Let the agents do the implementation; you do the design and review.
Branch-Per-Task: The Foundation Rule
Every coding task gets its own branch. No exceptions.
Why? Because agents make mistakes. An agent working on feature A should never be able to accidentally break feature B. If each task lives on its own branch, a bad run affects only that branch. You review, you reject or fix, you move on. Main stays clean.
The workflow:
```
main (always clean, always deployable)
├── feat/user-auth           ← agent working here
├── fix/broken-pagination    ← different agent, isolated
└── feat/email-templates     ← another task, another branch
```
Before assigning any task to an agent, create the branch. After the agent completes work, review the diff before merging. Main is your quality gate, not a working area.
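The loop above can be sketched end to end with nothing beyond stock git. This runs in a throwaway repo so it is self-contained; the branch name and commit messages are illustrative:

```shell
# Self-contained sketch of the branch-per-task loop, in a throwaway repo.
# Assumes git >= 2.28 (for `init -b`).
set -euo pipefail
repo=$(mktemp -d)
cd "$repo"
git init -q -b main
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "init"

# 1. Create the task branch BEFORE handing the task to an agent
git switch -c feat/user-auth

# 2. The agent's work lands here (simulated with an empty commit)
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "feat: add user auth"

# 3. Review the branch against main before merging
git log main..feat/user-auth --oneline

# 4. Merge only after review passes; main stays clean until then
git switch main
git merge -q --no-ff -m "Merge feat/user-auth" feat/user-auth
```

The `--no-ff` merge keeps a visible merge commit per task, which makes the audit trail on main easy to read.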
Git Worktrees: Parallel Development Without the Chaos
Git worktrees are a feature most developers don’t use. For agentic development, they’re essential.
A worktree lets you check out multiple branches simultaneously in separate directories. Without worktrees, if you have three tasks running in parallel, you’d need to stash/checkout between them constantly. With worktrees, each task has its own directory with its own branch checked out — completely isolated.
```shell
# Create a worktree for a new feature
git worktree add ../myproject-feat-auth feat/user-auth

# Create a worktree for a bug fix
git worktree add ../myproject-fix-pagination fix/broken-pagination
```
Now you have:
```
~/Desktop/myproject/                 ← main branch
~/Desktop/myproject-feat-auth/       ← feat/user-auth
~/Desktop/myproject-fix-pagination/  ← fix/broken-pagination
```
Each agent gets its own worktree. They can’t touch each other’s files. They can’t cause merge conflicts mid-task. You review each worktree’s output independently and merge when ready.
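Worktrees also need tearing down once a branch is merged or rejected, which the commands above don't show. A self-contained sketch of the full lifecycle in a throwaway repo (branch and directory names are illustrative):

```shell
# Worktree lifecycle including cleanup. Assumes git >= 2.28.
set -euo pipefail
repo=$(mktemp -d)
cd "$repo"
git init -q -b main
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "init"

# Create the branch and its worktree in one step (-b creates the branch)
git worktree add -b feat/user-auth ../"$(basename "$repo")"-feat-auth

# See every checkout that currently exists
git worktree list

# After the PR is merged (or rejected), tear the worktree down
git worktree remove ../"$(basename "$repo")"-feat-auth
git branch -q -D feat/user-auth
```

Making cleanup part of the "done" step keeps stale worktrees from accumulating alongside the repo.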
The PR-First Discipline
No code merges to main without a pull request. Even if you’re the only developer.
This sounds like overhead. It’s not. PRs serve three critical functions in an agentic pipeline:
1. Reviewability. The PR diff is your review surface. You see every change the agent made, in context. You can catch mistakes, question decisions, and verify the implementation matches the spec before it lands on main.
2. Documentation. A PR with a clear title and description creates an audit trail. Six months from now, you’ll know why a change was made and what it was for.
3. Quality gates. CI runs on the PR branch. Tests must pass. Lint must pass. Type checks must pass. The agent can’t slip broken code through because the pipeline blocks it.
The agent’s job ends when it opens the PR. Your job begins when you review it. Never let an agent merge its own PR without human review — that’s where the quality gate disappears.
Channel-to-Task Mapping
If you’re running multiple projects with channel-based routing (e.g., Slack channels), map each channel to one codebase and one set of responsibilities.
```
#whorang-dev → ~/Desktop/whorang/ repo
             → Whorang-specific coding agent
             → Worktrees in ~/Desktop/whorang-worktrees/
             → Never commits to main directly
             → Branch → PR → human review

#aim-dev     → ~/Desktop/useaim/ repo
             → AIM-specific coding agent
             → Task registry in ~/Desktop/useaim/.clawdbot/active-tasks.json
             → Branch → PR → human merges
```
This mapping is the routing table of your multi-agent coding system. It should be explicit and documented. When a message arrives in #whorang-dev, there should be zero ambiguity about which agent handles it and what repository it touches.
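If your orchestrator reads this routing table from config, one way to document it is a small JSON file. This is a sketch only; the file name and field names are illustrative, not a standard:

```json
{
  "#whorang-dev": {
    "repo": "~/Desktop/whorang",
    "agent": "whorang-coder",
    "worktrees": "~/Desktop/whorang-worktrees",
    "merge_policy": "pr-human-review"
  },
  "#aim-dev": {
    "repo": "~/Desktop/useaim",
    "agent": "aim-coder",
    "task_registry": "~/Desktop/useaim/.clawdbot/active-tasks.json",
    "merge_policy": "pr-human-merge"
  }
}
```

Keeping the mapping in a file the agents can read (rather than only in prose) removes one more source of ambiguity.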
The Task Lifecycle
A well-defined task lifecycle prevents the most common failure modes: abandoned work, duplicate work, and work that never gets reviewed.
```
backlog → ready → in-progress → review → done
```
Backlog: The task exists but isn’t ready to start. Missing spec, blocked by dependencies, or not prioritised yet.
Ready: Fully specified, unblocked, prioritised. The agent can pick this up immediately.
In-progress: An agent has been assigned and is working. Only one agent per task.
Review: The agent has completed the work and opened a PR. Awaiting human review.
Done: PR reviewed, merged, and verified.
This lifecycle should be tracked somewhere — a JSON file, a backlog board, a database table. The agent updates the task state as it works. You can see, at a glance, exactly where every piece of work is.
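The registry format is up to you. As one sketch, a JSON file like the `active-tasks.json` mentioned above could hold one entry per task; every field name here is illustrative:

```json
{
  "tasks": [
    {
      "id": "TASK-142",
      "title": "Add tag filter to GET /api/search",
      "status": "in-progress",
      "branch": "feat/search-tag-filter",
      "worktree": "../myproject-feat-search-tag-filter",
      "assignee": "coding-agent-1",
      "pr": null,
      "updated": "2025-01-15T10:30:00Z"
    }
  ]
}
```

The agent flips `status` as it moves through the lifecycle and fills in `pr` when it opens one, so a single file answers "where is everything?" at a glance.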
Writing a Good Task Spec
The quality of the agent’s output is directly proportional to the quality of the spec.
A good task spec includes:
What to build. Specific feature, fix, or change. Not “improve the search” — “add a filter parameter to GET /api/search that accepts a comma-separated list of tag values and returns only matching results.”
Context. Where in the codebase does this live? What files are relevant? What decisions have already been made? What constraints apply?
Acceptance criteria. How do you know when it’s done? What should the tests verify? What should the user experience look like?
What NOT to do. Constraints, out-of-scope items, things to avoid. Agents without constraints will sometimes do more than you asked — and “more” isn’t always better.
Time spent on a good spec pays for itself several times over in reduced review cycles.
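Put together, a spec can be as short as a markdown file committed alongside the task. Everything below, the paths, endpoint, and criteria, is illustrative:

```markdown
## Task: Add tag filter to search

**What to build**
Add a `filter` parameter to GET /api/search that accepts a
comma-separated list of tag values and returns only matching results.

**Context**
- Handler lives in src/app/api/search/route.ts (illustrative path)
- Tags are already stored per record; no schema change needed

**Acceptance criteria**
- [ ] `?filter=a,b` returns only results tagged a or b
- [ ] Missing or empty filter behaves exactly as before
- [ ] Unit tests cover both cases

**Out of scope**
- Do not change the response shape
- Do not touch ranking logic
```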
Coding Agent System Prompt Pattern
The system prompt for a coding agent should be tight and specific:
```
You are a senior TypeScript engineer working on [Project Name].

## Repo context
- Location: ~/Desktop/myproject/
- Stack: Next.js 14, TypeScript, Prisma ORM, PostgreSQL
- Tests: Vitest + React Testing Library
- CI: GitHub Actions (runs on every PR)

## Rules
- Never commit to main — always work in the provided worktree
- Write tests for every new function (>3 lines)
- Follow the existing file naming conventions
- Run `npm run typecheck && npm run test` before opening a PR
- If you discover scope you weren't asked for, open a separate issue — don't do it in this PR

## When done
- Open a PR with a clear title and description of what changed and why
- Link the relevant issue or task ID
- Set status to "review" in the task registry

## Communication
- If you're blocked or something is unclear, ask before proceeding
- Don't guess at business logic — verify with the provided tests or ask
```
This prompt is opinionated, specific, and gives the agent clear operating constraints. It knows where it lives, what the rules are, and exactly how to hand off.
Quality Gates in the Pipeline
Automated quality gates are your safety net. They catch mistakes the agent makes that your review might miss.
Minimum quality gate stack:
```
PR opened
  → lint (ESLint/Biome)
  → type check (tsc --noEmit)
  → unit tests (Vitest)
  → build (next build)
  → all pass → reviewable
```
Configure these to block merging if they fail. This isn’t about not trusting the agent — it’s about not trusting any code (human or agent) that hasn’t passed automated checks.
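As a concrete (but minimal) sketch, a GitHub Actions workflow wiring up that stack might look like the following. The npm script names assume a package.json that defines them, as in the system prompt earlier in this module:

```yaml
# .github/workflows/pr-checks.yml — minimal sketch, not a drop-in config
name: PR quality gates
on:
  pull_request:
    branches: [main]
jobs:
  checks:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      - run: npm ci
      - run: npm run lint
      - run: npm run typecheck
      - run: npm test
      - run: npm run build
```

Pair this with a branch protection rule on main that requires the `checks` job to pass, so a failing gate actually blocks the merge rather than just reporting.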
For higher-stakes codebases, add:
- Integration tests
- E2E tests (Playwright/Cypress)
- Security scanning (Snyk, Semgrep)
- Performance budgets
The Review Habit
Code review is your highest-leverage activity in an agentic pipeline. Do it seriously.
What to check in an AI-generated PR:
- Logic correctness. Does this actually do what it’s supposed to do? Read the key functions.
- Edge cases. What happens with empty input, null values, large datasets, concurrent requests?
- Scope creep. Did the agent do more than asked? Sometimes that’s useful; often it’s risky.
- Test quality. Are the tests actually testing the right things, or are they testing implementation details that will break on the next refactor?
- Consistency. Does this follow the patterns established in the rest of the codebase?
Give the agent precise feedback when you reject. “This is wrong” produces a random rewrite. “The pagination is off by one — the offset should be (page - 1) * limit, not page * limit” produces a targeted fix.
What’s Next
You have a complete agentic coding pipeline: branch-per-task, worktrees, PR-first discipline, task lifecycle management, quality gates, and review discipline. In Module 9, we tackle background work — how to build agents that do useful things without being asked. Cron jobs, heartbeats, and the architecture of autonomous operation.
Further Reading
- [Docs] Git Worktrees — Official Git Documentation — The official reference for worktrees. The examples section is the fastest way to get started.
- [Docs] GitHub Actions — Getting Started — Setting up CI for PR quality gates. Start here if you don’t have CI configured yet.
- [Blog] My AI Coding Workflow — Simon Willison — Willison on using LLMs for coding in production. Sceptical, practical, worth reading for the counter-perspective.
- [Docs] Anthropic Claude Code — The official documentation for Claude Code — the agentic coding tool from Anthropic. Shows where the tooling is heading.
- [Course] Berkeley Full Stack Deep Learning — The most production-focused university course available. Covers the full pipeline from model training to deployment. Directly relevant to building real systems, not just demos.
Subscribe to get notified when new modules and courses drop. No drip — just updates when there's something worth reading.