Module 8: Agentic Coding Pipelines
How to structure a codebase, a workflow, and a team of agents that ship quality code consistently — without stepping on each other.
This is Module 8 of a 12-part curriculum: Build Software Products with AI — From First Principles to Production Pipeline.
This is the module most people are here for. The agentic coding pipeline is the practical payoff of everything we’ve covered: memory, tools, orchestration, multi-agent architecture — all applied to shipping software.
The goal of this module: give you a production-grade pattern that prevents agents from colliding, keeps your codebase clean, and lets you work at a pace that would otherwise require a team.
The Core Mental Model
Think of yourself as a tech lead. Your job is not to write every line of code — it’s to:
- Define the work clearly
- Assign it to the right agent
- Review the output
- Merge when it’s right
You set the standard. The agents implement. You don’t write first drafts. You direct them.
This isn’t laziness — it’s leverage. The highest-value thing you do in a coding session is define the problem clearly and review the output critically. Both are things humans do better than agents. Let the agents do the implementation; you do the design and review.
Branch-Per-Task: The Foundation Rule
Every coding task gets its own branch. No exceptions.
Why? Because agents make mistakes. An agent working on feature A should never be able to accidentally break feature B. If each task lives on its own branch, a bad run affects only that branch. You review, you reject or fix, you move on. Main stays clean.
The workflow:
```
main (always clean, always deployable)
├── feat/user-auth           ← agent working here
├── fix/broken-pagination    ← different agent, isolated
└── feat/email-templates     ← another task, another branch
```
Before assigning any task to an agent, create the branch. After the agent completes work, review the diff before merging. Main is your quality gate, not a working area.
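The loop above can be sketched end to end with nothing beyond stock git. This runs in a throwaway repo so it is self-contained; the branch name and commit messages are illustrative:

```shell
# Self-contained sketch of the branch-per-task loop, in a throwaway repo.
# Assumes git >= 2.28 (for `init -b`).
set -euo pipefail
repo=$(mktemp -d)
cd "$repo"
git init -q -b main
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "init"

# 1. Create the task branch BEFORE handing the task to an agent
git switch -c feat/user-auth

# 2. The agent's work lands here (simulated with an empty commit)
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "feat: add user auth"

# 3. Review the branch against main before merging
git log main..feat/user-auth --oneline

# 4. Merge only after review passes; main stays clean until then
git switch main
git merge -q --no-ff -m "Merge feat/user-auth" feat/user-auth
```

The `--no-ff` merge keeps a visible merge commit per task, which makes the audit trail on main easy to read.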
Git Worktrees: Parallel Development Without the Chaos
Git worktrees are a feature most developers don’t use. For agentic development, they’re essential.
A worktree lets you check out multiple branches simultaneously in separate directories. Without worktrees, if you have three tasks running in parallel, you’d need to stash/checkout between them constantly. With worktrees, each task has its own directory with its own branch checked out — completely isolated.
```shell
# Create a worktree for a new feature
git worktree add ../myproject-feat-auth feat/user-auth

# Create a worktree for a bug fix
git worktree add ../myproject-fix-pagination fix/broken-pagination
```
Now you have:
```
~/Desktop/myproject/                 ← main branch
~/Desktop/myproject-feat-auth/       ← feat/user-auth
~/Desktop/myproject-fix-pagination/  ← fix/broken-pagination
```
Each agent gets its own worktree. They can’t touch each other’s files. They can’t cause merge conflicts mid-task. You review each worktree’s output independently and merge when ready.
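Worktrees also need tearing down once a branch is merged or rejected, which the commands above don't show. A self-contained sketch of the full lifecycle in a throwaway repo (branch and directory names are illustrative):

```shell
# Worktree lifecycle including cleanup. Assumes git >= 2.28.
set -euo pipefail
repo=$(mktemp -d)
cd "$repo"
git init -q -b main
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "init"

# Create the branch and its worktree in one step (-b creates the branch)
git worktree add -b feat/user-auth ../"$(basename "$repo")"-feat-auth

# See every checkout that currently exists
git worktree list

# After the PR is merged (or rejected), tear the worktree down
git worktree remove ../"$(basename "$repo")"-feat-auth
git branch -q -D feat/user-auth
```

Making cleanup part of the "done" step keeps stale worktrees from accumulating alongside the repo.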
The PR-First Discipline
No code merges to main without a pull request. Even if you’re the only developer.
This sounds like overhead. It’s not. PRs serve three critical functions in an agentic pipeline:
1. Reviewability. The PR diff is your review surface. You see every change the agent made, in context. You can catch mistakes, question decisions, and verify the implementation matches the spec before it lands on main.
2. Documentation. A PR with a clear title and description creates an audit trail. Six months from now, you’ll know why a change was made and what it was for.
3. Quality gates. CI runs on the PR branch. Tests must pass. Lint must pass. Type checks must pass. The agent can’t slip broken code through because the pipeline blocks it.
The agent’s job ends when it opens the PR. Your job begins when you review it. Never let an agent merge its own PR without human review — that’s where the quality gate disappears.
Channel-to-Task Mapping
If you’re running multiple projects with channel-based routing (e.g., Slack channels), map each channel to one codebase and one set of responsibilities.
```
#whorang-dev → ~/Desktop/whorang/ repo
             → Whorang-specific coding agent
             → Worktrees in ~/Desktop/whorang-worktrees/
             → Never commits to main directly
             → Branch → PR → human review

#aim-dev     → ~/Desktop/useaim/ repo
             → AIM-specific coding agent
             → Task registry in ~/Desktop/useaim/.clawdbot/active-tasks.json
             → Branch → PR → human merges
```
This mapping is the routing table of your multi-agent coding system. It should be explicit and documented. When a message arrives in #whorang-dev, there should be zero ambiguity about which agent handles it and what repository it touches.
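If your orchestrator reads this routing table from config, one way to document it is a small JSON file. This is a sketch only; the file name and field names are illustrative, not a standard:

```json
{
  "#whorang-dev": {
    "repo": "~/Desktop/whorang",
    "agent": "whorang-coder",
    "worktrees": "~/Desktop/whorang-worktrees",
    "merge_policy": "pr-human-review"
  },
  "#aim-dev": {
    "repo": "~/Desktop/useaim",
    "agent": "aim-coder",
    "task_registry": "~/Desktop/useaim/.clawdbot/active-tasks.json",
    "merge_policy": "pr-human-merge"
  }
}
```

Keeping the mapping in a file the agents can read (rather than only in prose) removes one more source of ambiguity.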
The Task Lifecycle
A well-defined task lifecycle prevents the most common failure modes: abandoned work, duplicate work, and work that never gets reviewed.
```
backlog → ready → in-progress → review → done
```
Backlog: The task exists but isn’t ready to start. Missing spec, blocked by dependencies, or not prioritised yet.
Ready: Fully specified, unblocked, prioritised. The agent can pick this up immediately.
In-progress: An agent has been assigned and is working. Only one agent per task.
Review: The agent has completed the work and opened a PR. Awaiting human review.
Done: PR reviewed, merged, and verified.
This lifecycle should be tracked somewhere — a JSON file, a backlog board, a database table. The agent updates the task state as it works. You can see, at a glance, exactly where every piece of work is.
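The registry format is up to you. As one sketch, a JSON file like the `active-tasks.json` mentioned above could hold one entry per task; every field name here is illustrative:

```json
{
  "tasks": [
    {
      "id": "TASK-142",
      "title": "Add tag filter to GET /api/search",
      "status": "in-progress",
      "branch": "feat/search-tag-filter",
      "worktree": "../myproject-feat-search-tag-filter",
      "assignee": "coding-agent-1",
      "pr": null,
      "updated": "2025-01-15T10:30:00Z"
    }
  ]
}
```

The agent flips `status` as it moves through the lifecycle and fills in `pr` when it opens one, so a single file answers "where is everything?" at a glance.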
Writing a Good Task Spec
The quality of the agent’s output is directly proportional to the quality of the spec.
A good task spec includes:
What to build. Specific feature, fix, or change. Not “improve the search” — “add a filter parameter to GET /api/search that accepts a comma-separated list of tag values and returns only matching results.”
Context. Where in the codebase does this live? What files are relevant? What decisions have already been made? What constraints apply?
Acceptance criteria. How do you know when it’s done? What should the tests verify? What should the user experience look like?
What NOT to do. Constraints, out-of-scope items, things to avoid. Agents without constraints will sometimes do more than you asked — and “more” isn’t always better.
Time spent on a good spec pays for itself several times over in reduced review cycles.
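Put together, a spec can be as short as a markdown file committed alongside the task. Everything below, the paths, endpoint, and criteria, is illustrative:

```markdown
## Task: Add tag filter to search

**What to build**
Add a `filter` parameter to GET /api/search that accepts a
comma-separated list of tag values and returns only matching results.

**Context**
- Handler lives in src/app/api/search/route.ts (illustrative path)
- Tags are already stored per record; no schema change needed

**Acceptance criteria**
- [ ] `?filter=a,b` returns only results tagged a or b
- [ ] Missing or empty filter behaves exactly as before
- [ ] Unit tests cover both cases

**Out of scope**
- Do not change the response shape
- Do not touch ranking logic
```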
Coding Agent System Prompt Pattern
The system prompt for a coding agent should be tight and specific:
```
You are a senior TypeScript engineer working on [Project Name].

## Repo context
- Location: ~/Desktop/myproject/
- Stack: Next.js 14, TypeScript, Prisma ORM, PostgreSQL
- Tests: Vitest + React Testing Library
- CI: GitHub Actions (runs on every PR)

## Rules
- Never commit to main — always work in the provided worktree
- Write tests for every new function (>3 lines)
- Follow the existing file naming conventions
- Run `npm run typecheck && npm run test` before opening a PR
- If you discover scope you weren't asked for, open a separate issue — don't do it in this PR

## When done
- Open a PR with a clear title and description of what changed and why
- Link the relevant issue or task ID
- Set status to "review" in the task registry

## Communication
- If you're blocked or something is unclear, ask before proceeding
- Don't guess at business logic — verify with the provided tests or ask
```
This prompt is opinionated, specific, and gives the agent clear operating constraints. It knows where it lives, what the rules are, and exactly how to hand off.
Quality Gates in the Pipeline
Automated quality gates are your safety net. They catch mistakes the agent makes that your review might miss.
Minimum quality gate stack:
```
PR opened
  → lint (ESLint/Biome)
  → type check (tsc --noEmit)
  → unit tests (Vitest)
  → build (next build)
  → all pass → reviewable
```
Configure these to block merging if they fail. This isn’t about not trusting the agent — it’s about not trusting any code (human or agent) that hasn’t passed automated checks.
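As a concrete (but minimal) sketch, a GitHub Actions workflow wiring up that stack might look like the following. The npm script names assume a package.json that defines them, as in the system prompt earlier in this module:

```yaml
# .github/workflows/pr-checks.yml — minimal sketch, not a drop-in config
name: PR quality gates
on:
  pull_request:
    branches: [main]
jobs:
  checks:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      - run: npm ci
      - run: npm run lint
      - run: npm run typecheck
      - run: npm test
      - run: npm run build
```

Pair this with a branch protection rule on main that requires the `checks` job to pass, so a failing gate actually blocks the merge rather than just reporting.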
For higher-stakes codebases, add:
- Integration tests
- E2E tests (Playwright/Cypress)
- Security scanning (Snyk, Semgrep)
- Performance budgets
The Review Habit
Code review is your highest-leverage activity in an agentic pipeline. Do it seriously.
What to check in an AI-generated PR:
- Logic correctness. Does this actually do what it’s supposed to do? Read the key functions.
- Edge cases. What happens with empty input, null values, large datasets, concurrent requests?
- Scope creep. Did the agent do more than asked? Sometimes that’s useful; often it’s risky.
- Test quality. Are the tests actually testing the right things, or are they testing implementation details that will break on the next refactor?
- Consistency. Does this follow the patterns established in the rest of the codebase?
Give the agent precise feedback when you reject. “This is wrong” produces a random rewrite. “The pagination is off by one — the offset should be (page - 1) * limit, not page * limit” produces a targeted fix.
What’s Next
You have a complete agentic coding pipeline: branch-per-task, worktrees, PR-first discipline, task lifecycle management, quality gates, and review discipline. In Module 9, we tackle background work — how to build agents that do useful things without being asked. Cron jobs, heartbeats, and the architecture of autonomous operation.
Further Reading
- [Docs] Git Worktrees — Official Git Documentation — The official reference for worktrees. The examples section is the fastest way to get started.
- [Docs] GitHub Actions — Getting Started — Setting up CI for PR quality gates. Start here if you don’t have CI configured yet.
- [Blog] My AI Coding Workflow — Simon Willison — Willison on using LLMs for coding in production. Sceptical, practical, worth reading for the counter-perspective.
- [Docs] Anthropic Claude Code — The official documentation for Claude Code — the agentic coding tool from Anthropic. Shows where the tooling is heading.
- [Course] Berkeley Full Stack Deep Learning — The most production-focused university course available. Covers the full pipeline from model training to deployment. Directly relevant to building real systems, not just demos.
Subscribe to get notified when new modules and courses drop. No drip — just updates when there's something worth reading.